Overview

Power Search lets users search for entities using full-text search and configurable filters. The search runs against the indexed fields of the entity descriptor, which are optimized for search operations.
The platform team can configure which fields are indexed and how relevant each one is, in order to tune the search results; they can also define filters and configure the collators.

How to configure the Power Search

As mentioned earlier, the platform team can customize two main areas: resource indexing and search filters.

Resource Indexing

Resource indexing is a process where specific fields are selected and indexed to make them searchable via full-text search. The platform team can choose which fields to index and assign relevance levels to them. When a user enters a term in the search bar, the search engine will look for that term in the indexed fields and select results based on relevance.

Indexing configuration

The platform team can configure which descriptor fields are indexed and how relevant each one is. For example, if the descriptors have fields like 'name', 'description', and 'tags', and users are likely to search on them, these fields can be indexed with a high relevance.

The platform administrators can write the configuration in the mesh.search.indexFields section of the configuration file. The configuration is a list of objects, each object representing a field that will be indexed. Each object will have the following fields:

  • path: The path to the field in the document. For nested fields, you can use dot notation to access them (e.g. address.city.name)
  • relevance: The relevance of the field in the search, represented by a letter from A to C, where A is the most relevant and C is the least relevant.

If the path references an array, the search engine indexes all the elements of the array, even if they are objects.

Example configuration

This is an example configuration for the following descriptor snippet:

{
  "name": "My Resource",
  "specific": {
    "description": "This is a description of the resource"
  },
  "tags": [
    {
      "source": "Tag",
      "tagFQN": "experimental",
      "labelType": "Manual"
    },
    {
      "source": "Tag",
      "tagFQN": "structured",
      "labelType": "Automatic"
    }
  ]
}

mesh:
  search:
    indexFields:
      - path: name
        relevance: A
      - path: specific.description
        relevance: B
      - path: tags
        relevance: C
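
To make the effect of relevance levels more concrete, here is a toy sketch in Python. It is not the platform's search engine: the numeric weights, the substring matching, and the helper names (`WEIGHTS`, `leaf_strings`, `resolve`, `score`) are all assumptions chosen for illustration. It only shows the idea that a match in an A field counts for more than a match in a C field, and that an array of objects contributes all of its nested values.

```python
from typing import Any, Iterator

# Hypothetical weights: the real scoring is not documented here.
WEIGHTS = {"A": 3, "B": 2, "C": 1}

descriptor = {
    "name": "My Resource",
    "specific": {"description": "This is a description of the resource"},
    "tags": [
        {"source": "Tag", "tagFQN": "experimental", "labelType": "Manual"},
        {"source": "Tag", "tagFQN": "structured", "labelType": "Automatic"},
    ],
}

index_fields = [
    {"path": "name", "relevance": "A"},
    {"path": "specific.description", "relevance": "B"},
    {"path": "tags", "relevance": "C"},
]


def leaf_strings(value: Any) -> Iterator[str]:
    """Yield every scalar under value as a string, descending into lists and objects."""
    if isinstance(value, dict):
        for child in value.values():
            yield from leaf_strings(child)
    elif isinstance(value, list):
        for child in value:
            yield from leaf_strings(child)
    elif value is not None:
        yield str(value)


def resolve(doc: Any, path: str) -> Any:
    """Follow a dot-notation path through nested objects; a missing key gives None."""
    for key in path.split("."):
        if not isinstance(doc, dict):
            return None
        doc = doc.get(key)
    return doc


def score(doc: dict, fields: list, term: str) -> int:
    """Sum the weight of every indexed field whose text contains the term."""
    term = term.lower()
    total = 0
    for field in fields:
        texts = leaf_strings(resolve(doc, field["path"]))
        if any(term in text.lower() for text in texts):
            total += WEIGHTS[field["relevance"]]
    return total
```

Under these assumptions, `score(descriptor, index_fields, "resource")` adds the A and B weights because the term appears in both `name` and `specific.description`, while `"experimental"` only earns the C weight through `tags`.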

Search Filters

Search functionalities provide a set of filters that users can apply to narrow down search results. These filters are customizable by the platform administrators, who can choose which fields are shown to end users as filters, and how each filter is applied when searching.

Filter structure

Each filter is represented as an object that defines a single filter condition. Each object has the following fields:

  • field: The field name in the document. For nested fields, you can use dot notation to access them.
  • label: A human-readable label that will be shown in the UI.
  • type: The type of the filter.
Note:

When defining the field parameter, you can use the _computedInfo object to access computed fields. For example, to access the domain name in the _computedInfo object, you can use _computedInfo.domain.name. Another field worth mentioning is _computedInfo.publishedAt, which can be used to filter by the publication date of the document.

In addition, the field parameter can navigate through arrays to collect all the nested values. For example, if your document has a structure like:

{
  title: 'Example',
  documentId: '1738263',
  tags: [
    { id: '1', value: { name: 'pii', descriptions: [{ value: 'private' }] } },
    {
      id: '2',
      value: {
        name: 'gdpr',
        descriptions: [{ value: 'sensitive' }, { value: 'personal' }],
      },
    },
  ],
}

and you want to filter based on the tag descriptions, you can use the following path: tags.value.descriptions.value. With that configuration, the filter will search for the input string in all the descriptions of all the tags: 'private', 'sensitive', and 'personal'.
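
The array navigation described above can be sketched as follows. This is a minimal Python illustration, not the platform's implementation; the `collect` helper is a hypothetical name. At each step of the path, an array contributes every one of its elements, so a single path can fan out to many values.

```python
from typing import Any, List

document = {
    "title": "Example",
    "documentId": "1738263",
    "tags": [
        {"id": "1", "value": {"name": "pii", "descriptions": [{"value": "private"}]}},
        {
            "id": "2",
            "value": {
                "name": "gdpr",
                "descriptions": [{"value": "sensitive"}, {"value": "personal"}],
            },
        },
    ],
}


def collect(value: Any, path: str) -> List[Any]:
    """Return every value reachable via a dot-notation path, fanning out over arrays."""
    current = [value]
    for key in path.split("."):
        next_level = []
        for item in current:
            # An array contributes each of its elements at this step.
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and key in candidate:
                    next_level.append(candidate[key])
        current = next_level
    # Flatten a trailing array so the caller always sees individual values.
    flat: List[Any] = []
    for item in current:
        flat.extend(item if isinstance(item, list) else [item])
    return flat


print(collect(document, "tags.value.descriptions.value"))
# → ['private', 'sensitive', 'personal']
```

Paths that never cross an array behave like plain lookups: `collect(document, "title")` returns a one-element list.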

Filter types

Filters can be of the following types:

  • text: a free-form text field. Just input the text and the search engine will look for it in the filter field.
  • choice: a multiple-choice box with pre-defined values.
  • boolean: a simple toggle. You can use this filter only on boolean values.
  • date: a date picker. You can use this filter only on date fields. The field value should be either in the Date Time String Format (the components after the day can be omitted) or a Unix epoch timestamp.

The text filter admits an additional configuration field called match. This field tells Witboost how the input string should be matched with the existing values, and can have the following values:

  • exact: The input string must match the existing values exactly.
  • begins: The input string must be a prefix of the existing values. This is the default value.
  • ends: The input string must be a suffix of the existing values.
  • contains: The input string must be contained in one of the existing values.

Since the ends and contains options are more computationally expensive, it is recommended to use them only when necessary.
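
The four match modes can be illustrated with a small Python sketch. It assumes case-sensitive comparison and that a filter passes when at least one existing value satisfies the check; neither detail is stated above, and the `matches` helper is a hypothetical name rather than part of Witboost.

```python
from typing import Iterable


def matches(user_input: str, values: Iterable[str], match: str = "begins") -> bool:
    """Return True if any value satisfies the chosen match mode ("begins" by default)."""
    checks = {
        "exact": lambda value: value == user_input,
        "begins": lambda value: value.startswith(user_input),
        "ends": lambda value: value.endswith(user_input),
        "contains": lambda value: user_input in value,
    }
    if match not in checks:
        raise ValueError(f"unknown match mode: {match}")
    return any(checks[match](value) for value in values)


tag_names = ["experimental", "structured"]

print(matches("exp", tag_names))                # begins   → True
print(matches("exp", tag_names, "exact"))       # exact    → False
print(matches("tured", tag_names, "ends"))      # ends     → True
print(matches("ruct", tag_names, "contains"))   # contains → True
```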

Filter configuration

The platform team can configure filters in the mesh.search.filters section of the configuration. Each functionality that utilizes the power search feature can have its own set of unique filters, allowing the platform team to define different filters tailored to each specific functionality.

The functionalities that leverage the power search are as follows:

| ID | Functionality | Description |
|----|---------------|-------------|
| marketplace-projects | Marketplace Search | The search page in the marketplace, where users can search for published projects. |
| provisioning-consumables | Entity Search Picker | A configurable entity picker designed to facilitate the selection of entities within the platform. |

To configure a filter for a specific functionality, add the filter object under the corresponding mesh.search.filters."functionality-id" section. Remember that the filter field and label must be unique within the same functionality.
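
The uniqueness rule can be checked before deploying a configuration. The following Python sketch assumes the filters section has already been loaded into a dictionary keyed by functionality ID; `find_duplicates` is a hypothetical helper, not a Witboost API.

```python
from typing import Dict, List

filters = {
    "marketplace-projects": [
        {"field": "domain", "label": "Domain", "type": "choice"},
        {"field": "tags.tagFQN", "label": "Tags", "type": "text"},
        # Same field as the first entry: violates the uniqueness rule.
        {"field": "domain", "label": "Business Domain", "type": "choice"},
    ],
    "provisioning-consumables": [
        # Reusing a field in a different functionality is allowed.
        {"field": "domain", "label": "Domain", "type": "choice"},
    ],
}


def find_duplicates(config: Dict[str, List[dict]]) -> List[str]:
    """List every field or label that repeats within a single functionality."""
    problems = []
    for functionality, entries in config.items():
        for key in ("field", "label"):
            seen = set()
            for entry in entries:
                value = entry[key]
                if value in seen:
                    problems.append(f"{functionality}: duplicate {key} '{value}'")
                seen.add(value)
    return problems


print(find_duplicates(filters))
# → ["marketplace-projects: duplicate field 'domain'"]
```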

Example configuration

This is an example configuration for the following descriptor snippet in the Marketplace Search functionality:

{
  "domain": "finance",
  "tags": [
    {
      "source": "Tag",
      "tagFQN": "experimental",
      "labelType": "Manual"
    },
    {
      "source": "Tag",
      "tagFQN": "structured",
      "labelType": "Automatic"
    }
  ],
  "deploymentInfo": {
    "status": "COMPLETED",
    "deploymentDate": "2021-09-01T00:00:00Z",
    "deploymentId": "123456",
    "deploymentType": "MANUAL"
  },
  "consumable": true
}

mesh:
  search:
    filters:
      marketplace-projects:
        - field: domain
          label: Domain
          type: choice
        - field: tags.tagFQN
          label: Tags
          type: text
        - field: deploymentInfo.deploymentDate
          label: Deployment Date
          type: date
        - field: consumable
          label: Consumable
          type: boolean

Default filters

Some collators include default filters that the platform team cannot configure. These filters are:

marketplace-projects:
  - field: '_computedInfo.environment'
    label: 'Environment'
    type: 'choice'
  - field: '_computedInfo.taxonomy.external_id'
    label: 'Data Landscape'
    type: 'choice'
  - field: '_computedInfo.kind'
    label: 'Kind'
    type: 'choice'
provisioning-consumables:
  - field: 'domain'
    label: 'Domain'
    type: 'choice'
  - field: 'deploymentUnitId'
    label: 'Parent'
    type: 'choice'
  - field: 'environment'
    label: 'Environment'
    type: 'choice'
  - field: 'kind'
    label: 'Kind'
    type: 'choice'

Collator scheduling configuration

Configuration

The platform team can configure the collators using the following configuration:

mesh:
  search:
    collators:
      marketplace-projects:
        schedule:
          frequency: { minutes: 10 }
          timeout: { minutes: 15 }
          initialDelay: { seconds: 3 }
        collatorOptions:
          batchSize: 100
      provisioning-consumables:
        schedule:
          frequency: { minutes: 10 }
          timeout: { minutes: 15 }
          initialDelay: { seconds: 3 }
        collatorOptions:
          batchSize: 100

Configuration Parameters:

  • frequency: The interval at which the collator ingests all documents again, specified in minutes.
  • timeout: The maximum time allowed for the collator to ingest all documents, specified in minutes.
  • initialDelay: The time the collator waits before starting the ingestion, specified in seconds.
  • batchSize: The number of documents to be ingested in each batch.
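
To reason about how these values interact, the following Python sketch converts the duration objects into `timedelta` values and lists when the first few ingestion runs would start. It assumes that each run starts one `frequency` after the previous start, which is a simplification and not a documented guarantee; `to_timedelta` and `start_offsets` are hypothetical helpers. Units that `timedelta` does not accept (for example months) are not handled.

```python
from datetime import timedelta
from typing import List

schedule = {
    "frequency": {"minutes": 10},
    "timeout": {"minutes": 15},
    "initialDelay": {"seconds": 3},
}


def to_timedelta(duration: dict) -> timedelta:
    """Convert a duration object such as {"minutes": 10} into a timedelta."""
    return timedelta(**duration)


def start_offsets(config: dict, runs: int) -> List[timedelta]:
    """Offsets from startup at which the first `runs` ingestions would begin."""
    first = to_timedelta(config["initialDelay"])
    step = to_timedelta(config["frequency"])
    return [first + i * step for i in range(runs)]


for offset in start_offsets(schedule, 3):
    print(offset)
# → 0:00:03
# → 0:10:03
# → 0:20:03
```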