Resource Preprocessing

As described in the Overview section, a resource can be processed before it is given as input to the Governance Entity's engine.

Resource Preprocessing is a step that sits between CGP retrieving the resource and the Governance Entity's engine evaluating it.

Setting Resource Preprocessing to Previous vs Current enables a specific use case: enforcing rules on how a resource can evolve.

Creating a Policy that checks the Resource's evolution

A policy whose Resource Preprocessing is set to Previous vs Current can compare the descriptor of the last deployed version of the resource against the descriptor of the current version under validation. Whether the operation is a test or a deployment, this kind of policy is extremely useful for detecting breaking changes and deviations between versions.

Since the engine that performs the business logic receives as input not only the current descriptor but also the last one that was successfully deployed, the logic can check whether any disallowed changes have been made (for example, it can check that all the columns that were exposed in a table are still there).
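As a rough illustration of that input, the examples on this page suggest that the two descriptors are exposed under the top-level fields original (last successfully deployed) and current (under validation). The sketch below writes such an input as CUE data; the id values are hypothetical, and real descriptors carry many more fields.

```cue
// Hypothetical input, for illustration only: not an actual descriptor.
original: {
    // Last successfully deployed version of the resource.
    id: "example-resource"
}
current: {
    // Version that is going through validation.
    id: "example-resource"
}
```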

This feature is particularly important to enforce rules that check that developers are not breaking any existing contracts with other development teams: if they deployed a table with 5 columns, and people are already reading data from it, we can prevent them from releasing a new version of the table with only 4 columns, since they could break all the downstream applications that could be reading from it.

With CUE you can create rules that require the same values to be present in both parts of the descriptor (the last deployed one and the current one); with a microservice (in the case of a remote engine) you can implement all kinds of custom checks, such as verifying that the associated business terms did not change the semantics of the data, or that the metadata is still consistent.

The following is a very simple example of a CUE script for a Breaking Change policy:

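original: {
    id: string
}
current: {
    // Unifying with original.id enforces equality;
    // =~ would only perform an unanchored regular expression match.
    id: string & original.id
}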

This policy checks that the ids of both the original and the current descriptor are strings, and that the current id is equal to the original id.

A more complex example could be the following:

import "list"

#Component: {
    kind: string & =~"(?i)^(outputport|workload|storage|observability)$"
    if kind != _|_ {
        if kind =~ "(?i)^(outputport)$" {
            #OutputPort
        }
    }
    ...
}

#OutputPort: {
    id: string
    dataContract: #DataContract
    ...
}

#OM_Column: {
    name: string
    ...
}

#DataContract: {
    schema: [...#OM_Column]
    ...
}

original: {
    components: [...#Component]
    ...
}

current: {
    components: [...#Component]
    ...
}

_checks: {
    // Output ports of the current and of the last deployed descriptor.
    current_outputports: [for n in current.components if n.kind =~ "(?i)^(outputport)$" {n}]
    prev_outputports: [for n in original.components if n.kind =~ "(?i)^(outputport)$" {n}]

    prevOutputPortsNames: [for n in prev_outputports {n.name}]

    // Current output ports that already existed in the deployed version.
    presentInBoth: [for n in current_outputports if list.Contains(prevOutputPortsNames, n.name) {n}]

    // The schema is declared under dataContract (see #OutputPort).
    curr_schema: [for n in presentInBoth {n.dataContract.schema}]
    #prev_schema: [for n in prev_outputports {n.dataContract.schema}]

    // Unification fails if the two lists of schemas differ.
    test: curr_schema & #prev_schema
}

This CUE policy checks that, in the new version of the data product, the previously deployed output ports are still present and their schemas remain identical to the deployed ones. You can create multiple breaking change policies that check different things, even leveraging different engines.
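As one example of such an additional policy, the scenario described earlier (a table deployed with 5 columns must not be released with only 4) might be sketched as follows. This is a minimal sketch under stated assumptions, not a policy taken from the product: the flat schema list directly under original and current, the #Column definition, and the _currentColumns helper are all hypothetical simplifications. It reuses the hidden _checks field pattern from the example above. Unlike that example, it allows new columns to be added and only rejects removals.

```cue
import "list"

// Hypothetical, simplified column shape: only the name is constrained.
#Column: {
    name: string
    ...
}

// Assumed input shape: each descriptor exposes a flat list of columns.
original: schema: [...#Column]
current: schema: [...#Column]

// Names of the columns exposed by the version under validation.
_currentColumns: [for c in current.schema {c.name}]

// One entry per previously deployed column, keyed by column name.
// list.Contains returns false for a removed column, and false & true
// is a unification conflict, so the evaluation fails.
_checks: {
    for c in original.schema {
        "\(c.name)": true & list.Contains(_currentColumns, c.name)
    }
}
```

For instance, if original.schema holds the columns id and amount while current.schema holds only id, the entry _checks.amount becomes true & false and the policy does not validate. Adding a third column to current.schema leaves every entry true.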