Provisioning
Provisioning status
The provisioning status of a deployment unit serves as a proxy for the infrastructural status of the various components outlined in the descriptor.
A deployment unit is categorized as:
- Deployed - when all components in the descriptor are correctly deployed
- Partially deployed - when some components in the descriptor are correctly deployed while others are not deployed
- Not deployed - when none of the components in the descriptor are deployed
- Corrupt - when at least one component is in a corrupt state, typically resulting from a failed provisioning operation leaving the component in an inconsistent deployment status
When the root component is enabled (refer to the Intro section for further details), the provisioning status of the deployment unit also considers the infrastructural status of this component.
When a provisioning operation fails to send updates to the configured Marketplaces or Data Catalogs, the deployment unit status will be set to Corrupt, even if all its components have been successfully deployed.
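The categorization rules above can be sketched as a small function. This is only an illustration, not the Coordinator's implementation: the function name `unit_status`, the lowercase component status strings, and the `catalog_update_failed` flag are all assumptions made for the example. When the root component is enabled, its status would simply be included in the list.

```python
def unit_status(component_statuses, catalog_update_failed=False):
    """Derive a deployment unit status from its components' statuses.

    component_statuses: iterable of "deployed", "not deployed" or "corrupt"
    (hypothetical labels chosen for this sketch).
    """
    statuses = list(component_statuses)
    if not statuses:
        raise ValueError("a deployment unit needs at least one component")
    # A failed Marketplace / Data Catalog update, or any corrupt component,
    # makes the whole unit Corrupt.
    if catalog_update_failed or "corrupt" in statuses:
        return "Corrupt"
    if all(s == "deployed" for s in statuses):
        return "Deployed"
    if all(s == "not deployed" for s in statuses):
        return "Not deployed"
    return "Partially deployed"


print(unit_status(["deployed", "deployed", "not deployed"]))  # Partially deployed
```

Note that the failed-update check comes first, so a unit whose components are all deployed is still reported as Corrupt when the notification step fails.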
The Provisioning Coordinator serves as the sole trusted source of truth in the witoobst ecosystem for the provisioning status of a deployment unit in a given environment. It exposes the necessary APIs to query this information reliably.
Provisioning operation
A provisioning operation consists of a set of steps aimed at updating the provisioning status of a deployment unit.
Provisioning preview
The Coordinator provides an endpoint to compute a preview of a provisioning operation before initiating the actual process. The endpoint accepts a deployment unit descriptor, along with indications of the target provisioning status for each component in the descriptor. It then returns a list of operations that will be executed to achieve the desired status.
Let's consider a deployment unit DU1 with three components C1, C2, C3.
The current provisioning status is Partially deployed with C1 deployed, C2 deployed and C3 not deployed.
You send a provisioning preview request to the Coordinator containing a new version of DU1's descriptor including a new component C4, and the new desired status:
- C1 should be deployed
- C2 should be not deployed
- C3 should be not deployed
- C4 should be deployed
The provisioning preview computed by the Coordinator will include all the operations needed to satisfy your request. Sample preview:
- C1 will stay deployed (no further operations needed)
- C2 will be undeployed (i.e., its tech adapter will be instructed to undeploy the component from the target infrastructure)
- C3 will stay not deployed (no further operations needed)
- C4 will be deployed (i.e., its tech adapter will be instructed to deploy the component on the target infrastructure)
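The DU1 example can be reproduced with a minimal sketch that compares the current and desired status of each component. The function `compute_preview`, its dictionary-based inputs, and the operation labels are hypothetical; the actual endpoint works on a full descriptor, and this sketch ignores dependency constraints and re-deployments of updated components.

```python
def compute_preview(current, desired):
    """Return the operation needed for each component in the new descriptor.

    current / desired: dicts mapping component name -> True if deployed.
    Components missing from `current` (e.g. newly added ones) are treated
    as not deployed.
    """
    preview = {}
    for name, want_deployed in desired.items():
        is_deployed = current.get(name, False)
        if want_deployed and not is_deployed:
            preview[name] = "deploy"
        elif is_deployed and not want_deployed:
            preview[name] = "undeploy"
        else:
            preview[name] = "no operation"
    return preview


current = {"C1": True, "C2": True, "C3": False}
desired = {"C1": True, "C2": False, "C3": False, "C4": True}
print(compute_preview(current, desired))
# {'C1': 'no operation', 'C2': 'undeploy', 'C3': 'no operation', 'C4': 'deploy'}
```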
Provisioning constraints
While computing a provisioning preview and before running a provisioning operation, the Coordinator examines the dependencies among components declared in the descriptor, enforcing the following constraints:
- if you intend to deploy a component, then you must also deploy all its dependencies (recursively on every involved component)
- if you intend to undeploy a component, then you must also undeploy all currently deployed components depending on it
- when the root component is enabled, it is considered as a dependency for all the other components
- if a component is currently deployed, but no longer present in the new provided descriptor, it must be undeployed
- if a component is currently deployed, and the new provided descriptor includes an updated version of the component's descriptor, it must be re-deployed
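As an illustration of the first constraint, the sketch below walks the declared dependencies recursively and reports every component that is requested as deployed while one of its dependencies is not. The names `transitive_dependencies` and `find_violations`, and the input shapes, are assumptions for this example only. Because the check is expressed on the target status, a request that undeploys a component while leaving a dependent one deployed surfaces as the same kind of violation. The root component and the removed/updated descriptor rules are left out for brevity.

```python
def transitive_dependencies(component, dependencies):
    """Collect all direct and indirect dependencies of a component."""
    seen = set()
    stack = list(dependencies.get(component, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(dependencies.get(dep, ()))
    return seen


def find_violations(dependencies, desired):
    """dependencies: component -> set of components it depends on.
    desired: component -> True if it should end up deployed.
    """
    violations = []
    for name, want_deployed in desired.items():
        if not want_deployed:
            continue
        for dep in sorted(transitive_dependencies(name, dependencies)):
            if not desired.get(dep, False):
                violations.append(f"{name} requires {dep} to be deployed")
    return violations


deps = {"C1": {"C2"}, "C2": {"C3"}}
print(find_violations(deps, {"C1": True, "C2": True, "C3": False}))
# ['C1 requires C3 to be deployed', 'C2 requires C3 to be deployed']
```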
Provisioning plan
A provisioning operation is executed by translating a provisioning preview into a set of tasks organized into a directed acyclic graph (DAG), known as the provisioning plan.
Validation gateway task
It runs the validation process (more on this in the Validation section) by verifying the compliance with the computational policies exposed by the Computational Governance Platform (CGP), and collecting the validation results produced by the various tech adapters involved in the provisioning process. When the validation fails, this task halts the provisioning plan and prevents the execution of the provisioning operation.
It serves as both a gateway toward the validation services and a validation and quality gate for the provisioning operation.
Components provisioning tasks
One for each component whose provisioning status must be updated. They interact with the tech adapters to achieve the target provisioning status. Their execution is scheduled such that:
- if components C1 and C2 are going to be deployed, and C1 depends on C2, then the deployment task of C1 will be run only after the deployment task of C2 is successfully completed
- if components C1 and C2 are going to be undeployed, and C1 depends on C2, then the undeployment task of C2 will be run only after the undeployment task of C1 is successfully completed
Dependencies drive the infrastructural provisioning. The question to ask, when defining them, is: "Can component A work without component B?". If so, component A is not dependent on B; otherwise, B is a dependency of A.
In the context of a data product, let's consider components such as an Impala table and a Storage component (e.g., Amazon S3 storage). The Impala table relies on the underlying S3 storage for its operation, as it exposes a structured representation of data stored in S3.
Can the Impala table correctly work without the underlying S3 storage? No. Therefore, the S3 Storage component should be designated as a dependency of the Impala table component.
When deploying the two components, the S3 Storage will be created before the Impala table to ensure that the table can access the required data.
Conversely, when undeploying them, the Impala table will be removed before the S3 Storage.
This ensures a proper cleanup sequence that maintains the integrity of the deployment environment.
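The ordering in the Impala/S3 example amounts to a topological sort of the dependency graph, reversed for undeployment. The following sketch uses Python's standard `graphlib` module (Python 3.9+); the component identifiers and function names are made up for the example, and the real plan is a DAG of tasks rather than a flat list.

```python
from graphlib import TopologicalSorter

# component -> set of components it depends on (hypothetical identifiers)
dependencies = {
    "impala_table": {"s3_storage"},
    "s3_storage": set(),
}


def deployment_order(dependencies):
    """Dependencies come first, so a component is deployed after them."""
    return list(TopologicalSorter(dependencies).static_order())


def undeployment_order(dependencies):
    """Dependents are removed before the components they rely on."""
    return list(reversed(deployment_order(dependencies)))


print(deployment_order(dependencies))    # ['s3_storage', 'impala_table']
print(undeployment_order(dependencies))  # ['impala_table', 's3_storage']
```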
Data Catalog and Marketplace update tasks
Executed only after the components provisioning tasks complete successfully. They send the new overall provisioning status of the deployment unit (along with the updated descriptor) to the configured Data Catalogs and Marketplaces.
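To show how these tasks could fit together in one plan, here is a rough sketch that builds a small task graph for a deploy-only operation: the validation gateway first, one task per component ordered by dependencies, and the catalog/marketplace update last. The task names, the `build_plan` helper, and the assumption that validation precedes every other task are illustrative and not taken from the Coordinator's actual plan format. Persistence tasks are omitted.

```python
from graphlib import TopologicalSorter


def build_plan(deploy_dependencies):
    """Build a task graph: task -> set of tasks that must finish first.

    deploy_dependencies: component -> set of components it depends on,
    where every listed component is going to be deployed.
    """
    plan = {"validation": set()}
    for component, deps in deploy_dependencies.items():
        # Each deployment waits for validation and for its dependencies.
        plan[f"deploy:{component}"] = {"validation"} | {
            f"deploy:{dep}" for dep in deps
        }
    # The update task waits for every task defined so far.
    plan["catalog_update"] = set(plan)
    return plan


plan = build_plan({"impala_table": {"s3_storage"}, "s3_storage": set()})
print(list(TopologicalSorter(plan).static_order()))
# ['validation', 'deploy:s3_storage', 'deploy:impala_table', 'catalog_update']
```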
Provisioning persistence tasks
The primary objective of these tasks is to track and persist the provisioning status of the deployment unit and its components as it evolves during the execution of the provisioning plan.