Create Data Products with Blueprints

Blueprints Introduction

There are several ways to create a data product, depending on the available templates and the needed components to fulfill your specific use case. To simplify this phase, we introduced the concept of Blueprints.

Blueprints are templates for predefined use cases: each one contains a data product template with pre-built relationships to the components that should be created along with it. This mechanism speeds up the creation of well-known use cases. For example, suppose your development team usually creates data products that read from a remote resource using Spark and write output tables to Snowflake: you can create a Blueprint that bundles the data product template with the Spark and Snowflake templates, so that new use cases of this kind can be set up quickly. Let's take a step-by-step look at how to create a data product with Blueprints.
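As a rough mental model, a Blueprint can be thought of as one data product template plus the component templates proposed with it. The Python sketch below is illustrative only: the `Blueprint` class and all template names are hypothetical and do not reflect witboost's actual blueprint definition format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blueprint:
    """Hypothetical model: one data product template plus its component templates."""
    name: str
    data_product_template: str
    component_templates: tuple[str, ...] = ()


# The Spark-to-Snowflake example from the text.
# Every template name here is made up for illustration.
spark_to_snowflake = Blueprint(
    name="spark-to-snowflake",
    data_product_template="data-product-template",
    component_templates=(
        "spark-workload-template",
        "snowflake-output-port-template",
    ),
)

print(spark_to_snowflake.data_product_template)
print(spark_to_snowflake.component_templates)
```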

caution

As with regular templates, to perform actions that interact with the repository (creating a new repository, committing or pushing files, etc.), witboost requires you to have access to the repository manager where the repositories will be created. witboost can be configured in different ways to achieve this; one is to specify a token that you will use to interact with the git repository (if the token is not specified, an error message will prompt you to configure it in your personal settings).

If this is how your installation is configured, please refer to the Configuring Git Credentials page to set up your token.

Creating data products using a Blueprint

  1. To create a data product using a Blueprint, first go to Builder and select Blueprints from the drop-down menu.

Builder -> Blueprints

  2. A page with the available blueprints is displayed. Each blueprint card shows a name, a short description, and some generic details. It also shows the data product template that the Blueprint will use and the component templates that will be proposed when using it. At the bottom of the blueprint card, you can find two buttons: an angle-brackets icon and a Choose button.

Clicking the angle brackets takes you to the repository containing the definition of the blueprint (if you have access to it), while the Choose button selects the blueprint for use. Let's select the Data Product Blueprint template by clicking the Choose button.

Blueprint templates

  3. A page with a table opens. On the left side, you can see the list of all the templates contained in the blueprint; the right panel shows what has been built so far with the selected blueprint, so it is empty at this point. Also, note that there are two buttons: Add, on the right side of the data product row, and Select Existing Data Product, in the upper right corner of the table.

Blueprint table page

  4. Let's create a new data product by clicking the Add button in the data product's row. The creation menu is shown in the right drawer. Enter all the metadata you need (the fields are the same as when you normally create this data product from a template).

Blueprint create data product

  5. At the end of the creation, you will see the three processes defined in the template (usually Fetch, Publish and Register), just as in a normal creation. The creation can be in one of three states:
  • Loading: a Loading label appears in the right pane.
  • Error: an Error label appears in the right pane, together with a Start Over button that lets you restart the process.
  • Finished: the data product you created appears in the right pane with a green check mark. This means that the data product is currently selected and that you can add components to it. At the same time, the Add button next to each component that has not yet been added becomes active.

Blueprint select data product

  6. Click the Add button next to the "Snowflake" output port. As in the previous step for the data product, a menu will be shown below to add the component to the data product. Fill in all the necessary fields again and complete the component creation.
tip

At this stage, while inside the wizard, it is important to link each component to the newly created data product by specifying the data product's name when filling out the component form.
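The creation states listed in the steps above, and the rule for when a component's Add button becomes active, can be summarized in a small Python sketch. This is only a model of the behavior described in this section: `CreationState`, `can_add_component`, and the component names are hypothetical and are not part of witboost.

```python
from enum import Enum


class CreationState(Enum):
    """The three states a data product creation can be in."""
    LOADING = "loading"
    ERROR = "error"
    FINISHED = "finished"


def can_add_component(state: CreationState, component: str, added: set[str]) -> bool:
    """Model of the Add button: active only once the data product creation
    has finished and the component has not been added yet."""
    return state is CreationState.FINISHED and component not in added


# Hypothetical component names, for illustration only.
already_added = {"spark-workload"}
print(can_add_component(CreationState.FINISHED, "snowflake-output-port", already_added))
print(can_add_component(CreationState.FINISHED, "spark-workload", already_added))
print(can_add_component(CreationState.LOADING, "snowflake-output-port", already_added))
```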

Modifying an existing data product with Blueprints

If you already have data products that were created before you defined your blueprint, you can check whether they are compliant with it. In other words, you can check whether a data product contains all the components defined in the blueprint and, if some are missing, simply add them. In this way, you can make sure that all the data products related to one use case are defined in the same way (you can also enforce this at deployment time by adding a policy that checks for the presence of all the needed components).
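Conceptually, the compliance check described above is a comparison between the component templates listed in the blueprint and those already used by the data product. The sketch below shows the idea in Python; the function and the template names are hypothetical, and it does not describe how witboost implements the check internally.

```python
def missing_components(blueprint_templates: list[str],
                       data_product_templates: list[str]) -> list[str]:
    """Return the blueprint's component templates that the data product
    does not contain yet, preserving the blueprint's order."""
    present = set(data_product_templates)
    return [t for t in blueprint_templates if t not in present]


# Hypothetical template names, for illustration only.
blueprint = ["spark-workload-template", "snowflake-output-port-template"]
existing_data_product = ["spark-workload-template"]

to_add = missing_components(blueprint, existing_data_product)
print(to_add)        # the components still to be added
print(not to_add)    # True only when the data product is compliant
```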

When selecting an existing data product to check its compliance:

  1. You can add new components to an existing data product using the Blueprint. To do this, go to Builder -> Blueprints, as you do when creating a new data product with the Blueprint. There you will see the familiar table with the data product and component templates on the left and empty rows on the right. In the upper right corner you will find the Select Existing Data Product button: click it.

  2. A panel with a list of existing data products appears. You can use the search to find the desired data product. Click the desired data product to set the radio button on its left, then click the Select button at the bottom to select it.

Blueprint modify

note

The search panel will show only data products that were created from the template defined as "main data product template" in the blueprint, to ensure that you can't add wrong components to existing data products that are not compliant with the blueprint definition.

  3. You will now see that the name of the selected data product appears in the right pane with a green check mark - this means that you have successfully selected the data product. Now you can see which components are already contained in it and which are not, and add new ones at your discretion. In this way, you can check and modify your data products according to the available Blueprint templates.
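The filtering rule mentioned in the note above (only data products created from the blueprint's main data product template are listed) can be pictured with the following Python sketch. The `DataProduct` class, its `template` field, and all names are assumptions made for illustration; they are not witboost's data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    template: str  # hypothetical: the template the data product was created from


def selectable_data_products(products: list[DataProduct],
                             main_template: str,
                             query: str = "") -> list[DataProduct]:
    """Keep only data products created from the blueprint's main data product
    template, optionally narrowed by a case-insensitive name search."""
    q = query.lower()
    return [
        p for p in products
        if p.template == main_template and q in p.name.lower()
    ]


# Hypothetical data, for illustration only.
products = [
    DataProduct("Sales Orders", "data-product-template"),
    DataProduct("Customer Profiles", "data-product-template"),
    DataProduct("Legacy Reports", "other-template"),
]

for p in selectable_data_products(products, "data-product-template", query="sales"):
    print(p.name)
```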