Data Discovery
Overview
Data discovery in Witboost helps users find, understand, and evaluate data products across the organization. It combines text search with visual exploration tools, so both technical and business users can navigate the data landscape.
Search
The Witboost Marketplace lets you search for data products and their components (such as output ports). Search supports both simple, text-based queries and advanced filtering to narrow down results based on your needs.
Main Search Page
When you access the Marketplace, the Main Search page gives you a starting point for exploration. Here you can:
- Quickly search by typing a keyword, such as a data product name, domain, or tag.
- See Top Visited, Recently Visited, and Favorites sections for quick access to frequently used data products.
- Choose the data landscape you want to search within (if multiple are available).
This page is designed for quick navigation, making it easy to resume recent activities or explore commonly used products.
Advanced Search
The Advanced Search page provides more control and detail.
From here, you can:
- Refine search results using advanced filters.
- Sort results by relevance or publication date.
- View key details at a glance, such as:
  - Description
  - Tags
  - Published date
  - Version
You can search not only for data products but also their components, giving you full visibility into available outputs and dependencies.
Filters
Filters allow you to narrow down search results to find exactly what you need. These filters are fully configurable by your organization, so they can match business, governance, and operational needs.
Common filter types include:
- Domain: Focus on specific domains.
- Favorites: Show only data products you have marked as favorites.
- Tags: Filter by tag (e.g., GDPR or Confidential).
- Description: Search for keywords in the data product description.
Using multiple filters together helps you progressively refine search results.
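As a rough mental model of how stacked filters narrow a result set, the sketch below treats each filter as a predicate and keeps only the products that satisfy all of them. It is not Witboost's implementation or API: the `DataProduct` record, its field names, and the AND semantics are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    # Hypothetical record; field names are illustrative, not Witboost's schema.
    name: str
    domain: str
    description: str = ""
    tags: set[str] = field(default_factory=set)
    favorite: bool = False


def by_domain(domain):
    return lambda p: p.domain == domain


def by_tag(tag):
    return lambda p: tag in p.tags


def by_description(keyword):
    # Case-insensitive keyword match on the description text.
    return lambda p: keyword.lower() in p.description.lower()


def only_favorites():
    return lambda p: p.favorite


def apply_filters(products, *filters):
    """Keep only the products that satisfy every filter (assumed logical AND)."""
    return [p for p in products if all(f(p) for f in filters)]


catalog = [
    DataProduct("Orders", "Sales", "Customer orders by day", {"GDPR"}, True),
    DataProduct("Invoices", "Finance", "Issued invoices", {"Confidential"}),
    DataProduct("Leads", "Sales", "Marketing leads", {"GDPR"}),
]

# Each additional filter can only shrink the result set.
sales = apply_filters(catalog, by_domain("Sales"))
sales_gdpr_favorites = apply_filters(
    catalog, by_domain("Sales"), by_tag("GDPR"), only_favorites()
)
print([p.name for p in sales])                 # ['Orders', 'Leads']
print([p.name for p in sales_gdpr_favorites])  # ['Orders']
```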
Governance Flags
Governance issues detected by Witboost Computational Governance are displayed directly in search results as flags, so you can quickly identify data products with:
- Missing metadata.
- Failed validations.
- Policy compliance issues.
This makes it easier to assess data quality and readiness before consuming a data product.
Visual Discovery
Beyond text-based search, Witboost offers visual tools to explore the relationships and dependencies between data products.
The Visual Discovery feature provides an interactive, graphical way to explore data within the Witboost Marketplace. It helps you understand relationships, domains, and connections across data products in a clear and intuitive format.
By default, Visual Discovery displays data products grouped by domains, giving you an at-a-glance view of how data products are organized across your organization.
Key Features
Grouping
When you first open Visual Discovery, data products are grouped by domain.
Each domain appears as a cluster, making it easy to identify the data products belonging to it.
You can also customize the grouping logic to analyze data from different perspectives by defining up to three grouping levels, for example:
- Primary grouping (e.g., Domain).
- Secondary grouping (e.g., Country).
- Tertiary grouping (e.g., Legal Entity).
This flexibility allows you to visualize your data landscape by organizational structure, geographical distribution, or any other relevant attribute.
Example: Group first by Domain, then by Country, and finally by Legal Entity to see how data products are distributed across business units and regions.
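To make the idea of multi-level grouping concrete, here is a minimal sketch that nests a flat list of data products by up to three attributes. The attribute names (`domain`, `country`, `legal_entity`) and the sample records are hypothetical, and the function only illustrates the concept; it does not reflect how Witboost computes its clusters.

```python
from collections import defaultdict

# Hypothetical sample data; attribute names are assumptions for illustration.
products = [
    {"name": "Orders", "domain": "Sales", "country": "IT", "legal_entity": "ACME Italia"},
    {"name": "Leads", "domain": "Sales", "country": "IT", "legal_entity": "ACME Italia"},
    {"name": "Returns", "domain": "Sales", "country": "DE", "legal_entity": "ACME GmbH"},
    {"name": "Invoices", "domain": "Finance", "country": "DE", "legal_entity": "ACME GmbH"},
]


def group_by_levels(items, levels):
    """Nest items by each attribute in `levels` (at most three levels)."""
    if len(levels) > 3:
        raise ValueError("at most three grouping levels are supported")
    if not levels:
        # Innermost level: list the product names in this cluster.
        return [item["name"] for item in items]
    first, rest = levels[0], levels[1:]
    buckets = defaultdict(list)
    for item in items:
        buckets[item[first]].append(item)
    return {key: group_by_levels(members, rest) for key, members in buckets.items()}


tree = group_by_levels(products, ["domain", "country", "legal_entity"])
print(tree["Sales"]["IT"])  # {'ACME Italia': ['Orders', 'Leads']}
```

Changing the order of `levels` (for example, country first and domain second) gives a different view of the same products, which is the effect of reordering grouping levels in the visualization.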
Connections Between Data Products
The Visual Discovery graph shows connections between data products, helping you understand data flows:
- Input connections show which data products provide data to the selected product.
- Output connections show where the selected product shares its data.
This makes it easy to:
- Identify dependencies and upstream/downstream relationships.
- Understand the impact of changes to a data product.
- Analyze how data moves through your organization.
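The relationship between connections and impact analysis can be sketched as a small directed graph. In the example below, each edge is a hypothetical `(provider, consumer)` pair; the product names and helper functions are assumptions for illustration and are not part of any Witboost API. Walking the output connections transitively yields every downstream product that a change could affect.

```python
from collections import deque

# Hypothetical connections: (provider, consumer) means the consumer reads
# data from the provider.
edges = [
    ("Orders", "Revenue"),
    ("Invoices", "Revenue"),
    ("Revenue", "Forecast"),
    ("Leads", "Forecast"),
]


def inputs_of(product, edges):
    """Products that provide data to `product` (input connections)."""
    return sorted(src for src, dst in edges if dst == product)


def outputs_of(product, edges):
    """Products that consume data from `product` (output connections)."""
    return sorted(dst for src, dst in edges if src == product)


def downstream(product, edges):
    """All products reachable through output connections (breadth-first)."""
    seen = set()
    queue = deque([product])
    while queue:
        current = queue.popleft()
        for nxt in outputs_of(current, edges):
            if nxt not in seen and nxt != product:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(inputs_of("Revenue", edges))   # ['Invoices', 'Orders']
print(outputs_of("Revenue", edges))  # ['Forecast']
print(downstream("Orders", edges))   # ['Forecast', 'Revenue']
```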
Filters
Use the Filters panel to refine what you see in the visualization.
The exact filters available are configured by your organization, meaning they can match your governance and operational needs.
Heatmaps
Heatmaps overlay visual highlights on the graph, helping you identify patterns or issues at a glance.
The available heatmaps are fully configurable by your organization, so you can track metrics that matter most to your business.
Key use cases include:
- Number of connections: quickly identify highly connected data products or potential bottlenecks.
- Maturity or lifecycle stages: visualize the readiness of data products across the organization.
- Publication date: highlight recently published or updated data products to easily spot new additions or track aging assets that may require review or deprecation.
By applying a heatmap, data products in the visualization are color-coded, giving you instant visibility into key metrics and relationships.
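As an illustration of the "number of connections" use case, the sketch below counts how many connections touch each data product and maps the count to a coarse intensity bucket. The edge list, the bucket names, and the thresholds are all assumptions chosen for the example; the actual heatmaps and their scales are whatever your organization has configured.

```python
from collections import Counter

# Hypothetical connections between data products: (provider, consumer).
edges = [
    ("Orders", "Revenue"),
    ("Invoices", "Revenue"),
    ("Revenue", "Forecast"),
    ("Leads", "Forecast"),
]


def connection_counts(edges):
    """Count input plus output connections for every product."""
    counts = Counter()
    for src, dst in edges:
        counts[src] += 1
        counts[dst] += 1
    return counts


def heat_bucket(value, max_value):
    """Map a metric to one of three illustrative intensity buckets."""
    if max_value == 0:
        return "low"
    ratio = value / max_value
    if ratio >= 0.75:
        return "high"
    if ratio >= 0.4:
        return "medium"
    return "low"


counts = connection_counts(edges)
peak = max(counts.values())
heatmap = {name: heat_bucket(n, peak) for name, n in counts.items()}
print(heatmap["Revenue"], heatmap["Forecast"], heatmap["Orders"])  # high medium low
```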
Governance Flags
Governance Flags focus on policy and compliance issues reported by Witboost Computational Governance.
- They visually indicate where governance problems exist, such as missing metadata, failed validations, or other compliance concerns.
- Flags appear directly on data products, making it easy to identify issues that require action.
Data Product Details Panel
Hover over a data product node to open a popover with quick information such as description, status, owner, tags, and policy compliance.
From this panel, you can:
- Open the details page for full documentation and output ports.
- View lineage to explore input and output connections.
- Access data contracts if the data product exposes or consumes any.
Why Use Visual Discovery
Visual Discovery helps you quickly understand and manage your organization's data by providing a clear, interactive view of relationships and dependencies. It allows you to:
- Explore relationships: see how data products are connected through input and output flows.
- Identify critical nodes: use heatmaps to highlight highly connected or high-traffic data products.
- Understand organization: view how data products are structured across domains, regions, or other business dimensions.
- Simplify impact analysis: quickly assess the potential effects of changes through a visual, intuitive interface.
This makes it easier for both technical and business users to discover, evaluate, and manage data products effectively.