URL Whitelisting

The Witboost Catalog uses a URL Provider to fetch entity definitions from external sources. To enhance security and control which URLs can be processed, Witboost implements a URL whitelisting mechanism that restricts catalog imports to only authorized sources.

Overview

The URL whitelisting feature allows administrators to define a list of allowed URL patterns that the Catalog can process. This prevents unauthorized or potentially malicious catalog imports by restricting the sources from which entity definitions can be fetched.

Configuration

URL whitelisting is configured through the locationsWhitelist setting in your app-config.yaml file:

catalog:
  processingInterval: { minutes: 15 }
  locationsWhitelist:
    - ^https://github\.com/my-org/.*
    - ^https://gitlab\.com/my-company/.*
    - ^https://bitbucket\.org/my-team/.*

Pattern Types

The URL whitelist supports two types of patterns:

1. Regular Expression Patterns

Regular expressions are the most flexible way to define URL patterns and let you write precise matching rules for the URLs you allow.

Syntax: Use standard JavaScript regular expression syntax.

Examples:

locationsWhitelist:
  # Match all GitHub repositories in a specific organization
  - ^https://github\.com/my-organization/.*

  # Match specific GitLab projects
  - ^https://gitlab\.com/my-company/(project1|project2)/.*

  # Match Bitbucket repositories with specific naming pattern
  - ^https://bitbucket\.org/my-team/[a-z-]+/.*

  # Match multiple domains
  - ^https://(github\.com|gitlab\.com)/my-org/.*

  # Match specific file types
  - ^https://.*\.com/.*/catalog-info\.yaml$
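To check how these patterns behave before deploying them, they can be tried against sample URLs with the standard JavaScript RegExp class. The following is a minimal sketch, not Witboost's implementation: the helper name matchesAny and the sample URLs are made up for illustration, and the backslashes are doubled only because the patterns are written as TypeScript string literals.

```typescript
// Patterns copied from the examples above. The sample URLs below are hypothetical.
const patterns: string[] = [
  "^https://github\\.com/my-organization/.*",
  "^https://gitlab\\.com/my-company/(project1|project2)/.*",
  "^https://bitbucket\\.org/my-team/[a-z-]+/.*",
  "^https://(github\\.com|gitlab\\.com)/my-org/.*",
  "^https://.*\\.com/.*/catalog-info\\.yaml$",
];

// Hypothetical helper: true when at least one pattern matches the URL.
function matchesAny(url: string, list: string[]): boolean {
  return list.some((p) => new RegExp(p).test(url));
}

// Matched by the GitLab pattern (project1 is in the alternation).
const allowed = matchesAny(
  "https://gitlab.com/my-company/project1/catalog-info.yaml",
  patterns,
);

// Not matched by any pattern (project3 is not listed, and the file is not catalog-info.yaml).
const blocked = matchesAny(
  "https://gitlab.com/my-company/project3/README.md",
  patterns,
);

// A lookalike host is rejected because the patterns anchor with ^ and
// require "github\.com/" to be followed directly by the organization path.
const lookalike = matchesAny(
  "https://github.com.evil.example/my-organization/repo",
  patterns,
);
```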

2. Literal String Patterns (Fallback)

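If a pattern is not a valid regular expression, the system treats it as a literal string and performs a simple substring match. Note that a plain string such as github.com is also a syntactically valid regular expression in which the unescaped dot matches any single character, so escape the dot (github\.com) when an exact match matters.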

Examples:

locationsWhitelist:
  # Simple domain matching
  - github.com
  - gitlab.com

  # Specific path matching
  - /catalog-info.yaml
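The two pattern types described above can be pictured as a compile-then-fall-back step. The sketch below is one possible way to express that behavior, under the assumption that a pattern counts as "not a valid regular expression" exactly when the JavaScript RegExp constructor throws. The names Matcher, toMatcher, and isWhitelisted are hypothetical and not part of any Witboost API.

```typescript
type Matcher = (url: string) => boolean;

// Try to compile the pattern as a regular expression. If the RegExp
// constructor throws, fall back to a literal substring test.
function toMatcher(pattern: string): Matcher {
  try {
    const re = new RegExp(pattern);
    return (url) => re.test(url);
  } catch {
    return (url) => url.includes(pattern);
  }
}

// A URL passes when at least one whitelist entry matches it.
function isWhitelisted(url: string, whitelist: string[]): boolean {
  return whitelist.some((pattern) => toMatcher(pattern)(url));
}

// "c++" is not a valid JavaScript regex (nothing to repeat), so this
// entry is matched as a literal substring.
const literalEntry = ["example.com/teams/c++"];
const viaSubstring = isWhitelisted(
  "https://example.com/teams/c++/catalog-info.yaml",
  literalEntry,
);

// "github.com" compiles as a regex, so its dot matches any character.
const dotIsWildcard = toMatcher("github.com")("https://githubXcom/repo");
```

In this sketch only entries that fail to compile take the substring path, which is why an entry like github.com still behaves as a regular expression.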

Error Handling

When a URL is not whitelisted, the system:

  1. Logs a Warning: Records the blocked URL in the application logs
  2. Skips Processing: Does not attempt to fetch or process the URL
  3. Reports Error: Emits a processing error that can be viewed in the Catalog

Example Log Message:

URL https://unauthorized.com/catalog-info.yaml is not in the whitelist and will be skipped
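The three steps above can be sketched as a single guard in front of the fetch. This is an illustrative outline only: the function processLocation, the Logger and ProcessingResult interfaces, and the injected isWhitelisted and fetchEntity callbacks are all assumptions made for the example, not Witboost's actual code. Only the wording of the log message is taken from the example above.

```typescript
// Hypothetical minimal logger interface.
interface Logger {
  warn(message: string): void;
}

// Hypothetical result shape: either processed, or skipped with an error.
interface ProcessingResult {
  processed: boolean;
  error?: string;
}

function processLocation(
  url: string,
  isWhitelisted: (url: string) => boolean,
  logger: Logger,
  fetchEntity: (url: string) => void,
): ProcessingResult {
  if (!isWhitelisted(url)) {
    const message = `URL ${url} is not in the whitelist and will be skipped`;
    // 1. Log a warning that records the blocked URL.
    logger.warn(message);
    // 2. Skip processing: fetchEntity is never called for this URL.
    // 3. Report the error to the caller so it can be surfaced.
    return { processed: false, error: message };
  }
  fetchEntity(url);
  return { processed: true };
}
```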

Disabling Whitelisting

To disable URL whitelisting entirely (not recommended for production), set an empty whitelist:

catalog:
  locationsWhitelist: []

Troubleshooting

Common Issues

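  1. Pattern Not Matching: Ensure your regex is valid and properly escaped (for example, write literal dots as \.) and that anchors such as ^ and $ line up with the full URL
  2. URLs Being Blocked: Check the application logs for the blocked URL and verify that at least one pattern is broad enough to match it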
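Escaping mistakes are easier to avoid when the pattern is generated from the literal URL prefix rather than typed by hand. The sketch below shows one way to do that. The helper names escapeRegExp and prefixPattern are hypothetical, and the escaping expression is the commonly used JavaScript idiom for escaping regex metacharacters, not something provided by Witboost.

```typescript
// Escape every character that has a special meaning in a JavaScript regex.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build an anchored pattern that allows everything under a URL prefix.
function prefixPattern(urlPrefix: string): string {
  return "^" + escapeRegExp(urlPrefix) + ".*";
}

// Yields the string ^https://github\.com/my-org/.* ready to paste into locationsWhitelist.
const pattern = prefixPattern("https://github.com/my-org/");

// With the dot escaped, a lookalike host no longer matches.
const strict = new RegExp(pattern).test("https://githubXcom/my-org/repo");

// The hand-written, unescaped variant accepts the lookalike host.
const loose = new RegExp("^https://github.com/my-org/.*").test(
  "https://githubXcom/my-org/repo",
);
```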