Anatomy of a Data Product Repository
When you create a new data product from a template, Witboost creates a repository on your configured Git platform and initializes it with some files. Below, an example repository is shown:
To keep the example short, our data product does not expose any output ports, so its descriptor stays brief.
A data product is often composed of several output ports. These can either be stored in the same (mono)repository as the data product, or each be hosted in a completely different location. For this reason, the files and folders in your repository can differ from the ones shown above.
Our repository contains:
docs/ folder: as the name suggests, contains the documentation files
environments/ folder: hosts environment-specific configurations for the data product
releases/ folder: fully managed by Witboost; stores metadata and descriptors of released and ongoing releases
README.md: left to you; as a best practice, always use it to describe the contents of the repository
catalog-info.yaml: general metadata about your data product. This is the main file of the repository and the one Witboost uses to keep track of the whole entity
mkdocs.yml: Witboost's metadata about the documentation under the docs/ folder, such as page listings
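As a quick way to compare a local clone against the layout listed above, the following minimal Python sketch reports which of those entries are absent. The function name missing_entries and the constant EXPECTED_ENTRIES are hypothetical names chosen for illustration and are not part of Witboost. Since layouts can legitimately differ (for example when output ports live in the same repository), a missing entry is a hint to look closer, not necessarily an error.

```python
from pathlib import Path

# Entries listed in this section; a trailing "/" marks a folder.
EXPECTED_ENTRIES = [
    "docs/",
    "environments/",
    "releases/",
    "README.md",
    "catalog-info.yaml",
    "mkdocs.yml",
]


def missing_entries(repo_root):
    """Return the expected entries that are absent from repo_root."""
    root = Path(repo_root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        # Folders must exist as directories, everything else as regular files.
        found = path.is_dir() if entry.endswith("/") else path.is_file()
        if not found:
            missing.append(entry)
    return missing
```

Calling missing_entries(".") from the root of a clone returns an empty list when every entry from the list above is present.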
Later sections will show how each operation in the release lifecycle interacts with a repository.
To avoid inconvenient situations, such as overwriting your data, we encourage you not to modify files and folders under the releases/ folder.
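One way to follow this recommendation is to check a list of changed paths before committing. The sketch below is only an illustration under stated assumptions: the function name protected_changes and the constant PROTECTED_DIR are hypothetical, the folder name is assumed to be releases as in the list above, and paths are assumed to be relative to the repository root with forward slashes.

```python
from pathlib import PurePosixPath

# Assumption: the Witboost-managed folder sits at the repository root.
PROTECTED_DIR = "releases"


def protected_changes(changed_paths):
    """Return the changed paths that live under the protected folder."""
    flagged = []
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        # Only the first path component decides whether the path is protected.
        if parts and parts[0] == PROTECTED_DIR:
            flagged.append(raw)
    return flagged
```

The input could come, for instance, from the output of git diff --cached --name-only split into lines; a non-empty result means the pending commit touches files managed by Witboost.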