
Matches Regex

Checks if a string value matches a regular expression pattern.

Overview

The matches_regex condition evaluates whether a field's string value matches a specified regular expression pattern. Use it for string validation and text matching that simpler conditions such as starts_with, ends_with, or contains cannot express.

Use Cases

  • Format Validation: Validate email addresses, phone numbers, IDs
  • Pattern-Based Routing: Route based on complex string patterns
  • Log Parsing: Match specific log formats or patterns
  • Data Classification: Categorize data based on content patterns

Configuration

| Setting | Type | Required | Description |
|---------|------|----------|-------------|
| key | string | Yes | The field path to check. Supports dot notation for nested fields. Use * to check all keys. |
| pattern | string | Yes | The regular expression pattern to match against. |
| not | boolean | No | If true, inverts the condition (matches if pattern does NOT match). |
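
As a rough illustration of how these three settings interact, the Go sketch below resolves a dot-notation key against a decoded record, applies the pattern, and flips the result when not is true. The function names (lookup, matchesRegex) and the treatment of non-string values are assumptions made for this example, not the product's actual implementation, and the * wildcard is left out.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// lookup walks a dot-notation path such as "user.email" through nested maps.
// Hypothetical helper: the real engine's path resolution may differ.
func lookup(record map[string]any, key string) (any, bool) {
	var cur any = record
	for _, part := range strings.Split(key, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = m[part]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

// matchesRegex shows an assumed evaluation order: find the value, match it,
// then apply "not". Missing or non-string values count as "no match" here.
func matchesRegex(record map[string]any, key, pattern string, not bool) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	matched := false
	if v, ok := lookup(record, key); ok {
		if s, isString := v.(string); isString {
			matched = re.MatchString(s)
		}
	}
	return matched != not, nil
}

func main() {
	record := map[string]any{"user": map[string]any{"email": "user@example.com"}}

	ok, _ := matchesRegex(record, "user.email", `@example\.com$`, false)
	fmt.Println(ok) // true

	ok, _ = matchesRegex(record, "user.email", `@example\.com$`, true)
	fmt.Println(ok) // false
}
```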

Regex Syntax

The condition uses Go's RE2 regular expression syntax. Common patterns include:

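| Pattern | Description |
|---------|-------------|
| `.` | Any character (except newline by default) |
| `*` | Zero or more of preceding |
| `+` | One or more of preceding |
| `?` | Zero or one of preceding |
| `^` | Start of string |
| `$` | End of string |
| `[abc]` | Character class |
| `[^abc]` | Negated character class |
| `\d` | Digit (0-9) |
| `\w` | Word character (a-z, A-Z, 0-9, _) |
| `\s` | Whitespace |
| `(ab)` | Group (treats ab as a single unit) |
| `{n}` | Exactly n occurrences |
| `{n,m}` | Between n and m occurrences |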
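
To see a few of these constructs in action, the sketch below runs them through Go's standard regexp package, which implements RE2 syntax. The match helper is a hypothetical name used only in this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// match reports whether pattern matches anywhere in input.
func match(pattern, input string) bool {
	return regexp.MustCompile(pattern).MatchString(input)
}

func main() {
	fmt.Println(match(`^a.c$`, "abc"))            // true: . stands for any single character
	fmt.Println(match(`^ab*c$`, "ac"))            // true: * allows zero occurrences of b
	fmt.Println(match(`^ab+c$`, "ac"))            // false: + requires at least one b
	fmt.Println(match(`^\d{3}$`, "123"))          // true: exactly three digits
	fmt.Println(match(`^\w+\s\w+$`, "two words")) // true: word characters around whitespace
	fmt.Println(match(`^[^abc]$`, "d"))           // true: negated character class
}
```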

Examples

Email Validation

Match valid email format:

{
  "type_id": "matches_regex",
  "config": {
    "key": "email",
    "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
  }
}

Matches:

{"email": "user@example.com"}
{"email": "john.doe@company.co.uk"}

Does not match:

{"email": "invalid-email"}
{"email": "@example.com"}
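
The expected results above can be checked locally before deploying the condition. This is a minimal sketch using Go's regexp package; isEmail is a hypothetical helper name, and the pattern is written as a Go raw string, so the backslash appears once rather than doubled as in JSON.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailPattern is the pattern from the example above, as a Go raw string.
var emailPattern = regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)

// isEmail reports whether s has the shape the pattern accepts.
func isEmail(s string) bool { return emailPattern.MatchString(s) }

func main() {
	samples := []string{
		"user@example.com",       // matches
		"john.doe@company.co.uk", // matches
		"invalid-email",          // no @ sign
		"@example.com",           // empty local part
	}
	for _, s := range samples {
		fmt.Printf("%-24s %v\n", s, isEmail(s))
	}
}
```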

IP Address Matching

Match strings in IPv4 dotted-quad form (the pattern checks the shape only, not that each octet is in the 0-255 range):

{
  "type_id": "matches_regex",
  "config": {
    "key": "ip_address",
    "pattern": "^\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}$"
  }
}

Matches:

{"ip_address": "192.168.1.1"}
{"ip_address": "10.0.0.255"}
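
Because the pattern accepts any group of one to three digits, a value such as 999.999.999.999 also matches. If octet ranges matter, one option is a stricter pattern. The Go sketch below compares the two; the strict pattern and the function names are illustrative additions rather than part of the original example, and in a JSON config each backslash would need to be doubled.

```go
package main

import (
	"fmt"
	"regexp"
)

// loose is the pattern from the example above: it checks shape only.
var loose = regexp.MustCompile(`^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$`)

// octet matches 0-255 without leading zeros; strict requires four of them.
const octet = `(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)`

var strict = regexp.MustCompile(`^` + octet + `(\.` + octet + `){3}$`)

func looseIPv4(s string) bool  { return loose.MatchString(s) }
func strictIPv4(s string) bool { return strict.MatchString(s) }

func main() {
	for _, s := range []string{"192.168.1.1", "10.0.0.255", "999.999.999.999"} {
		fmt.Printf("%-16s loose=%v strict=%v\n", s, looseIPv4(s), strictIPv4(s))
	}
}
```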

UUID Validation

Match UUID format:

{
  "type_id": "matches_regex",
  "config": {
    "key": "id",
    "pattern": "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
  }
}

Matches:

{"id": "550e8400-e29b-41d4-a716-446655440000"}
{"id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"}

Log Level Matching

Match specific log levels:

{
  "type_id": "matches_regex",
  "config": {
    "key": "message",
    "pattern": "^\\[(ERROR|WARN|FATAL)\\]"
  }
}

Matches:

{"message": "[ERROR] Connection failed"}
{"message": "[WARN] Disk space low"}
{"message": "[FATAL] System crash"}

Does not match:

{"message": "[INFO] Process started"}
{"message": "[DEBUG] Variable value: 42"}

Phone Number Format

Match US phone numbers:

{
  "type_id": "matches_regex",
  "config": {
    "key": "phone",
    "pattern": "^(\\+1)?[-.\\s]?\\(?\\d{3}\\)?[-.\\s]?\\d{3}[-.\\s]?\\d{4}$"
  }
}

Matches:

{"phone": "555-123-4567"}
{"phone": "(555) 123-4567"}
{"phone": "+1 555 123 4567"}

Exclusion Pattern

Match messages that do NOT contain a date in YYYY-MM-DD form:

{
  "type_id": "matches_regex",
  "config": {
    "key": "message",
    "pattern": "\\d{4}-\\d{2}-\\d{2}",
    "not": true
  }
}

Matches:

{"message": "Simple message without date"}
{"message": "Error occurred"}

Does not match:

{"message": "Event on 2024-01-15"}
{"message": "2024-03-20: System restart"}

Partial Match

Match records containing specific patterns anywhere:

{
  "type_id": "matches_regex",
  "config": {
    "key": "url",
    "pattern": "/api/v[0-9]+/"
  }
}

Matches:

{"url": "https://example.com/api/v1/users"}
{"url": "/api/v2/orders/123"}
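
The difference between a partial match and an anchored one can be seen in a short Go sketch. The third URL and the anchored variant of the pattern are made up for illustration, and matches is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern is found anywhere in s.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	urls := []string{
		"https://example.com/api/v1/users",
		"/api/v2/orders/123",
		"/docs/api/", // no version segment, so neither pattern matches
	}
	for _, u := range urls {
		// Without ^ the pattern may match in the middle of the value;
		// with ^ it must match at the very start.
		fmt.Printf("%-34s unanchored=%v anchored=%v\n",
			u, matches(`/api/v[0-9]+/`, u), matches(`^/api/v[0-9]+/`, u))
	}
}
```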

Wildcard Key Check

Check if any field matches a pattern:

{
  "type_id": "matches_regex",
  "config": {
    "key": "*",
    "pattern": "password|secret|token"
  }
}

Common Patterns

Sensitive Data Detection

Detect potential sensitive data:

{
  "operator": "or",
  "conditions": [
    {"type_id": "matches_regex", "config": {"key": "*", "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"}},
    {"type_id": "matches_regex", "config": {"key": "*", "pattern": "\\b\\d{16}\\b"}},
    {"type_id": "matches_regex", "config": {"key": "*", "pattern": "(?i)password\\s*[:=]"}}
  ]
}

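This detects SSN-formatted numbers (NNN-NN-NNNN), unseparated 16-digit numbers such as credit card numbers, and password assignments such as "password:" or "password=". These are heuristics: a card number written with spaces or dashes, for example, is not caught by the second pattern.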
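
A minimal Go sketch of the same "or" logic is shown below. It assumes that the * key means every field's value is tested against the pattern, which the configuration table does not spell out; looksSensitive and the sample records are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// The three patterns from the configuration above, as Go raw strings.
var sensitivePatterns = []*regexp.Regexp{
	regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`), // SSN-shaped number
	regexp.MustCompile(`\b\d{16}\b`),            // unseparated 16-digit number
	regexp.MustCompile(`(?i)password\s*[:=]`),   // password assignment
}

// looksSensitive returns true as soon as any value matches any pattern,
// mirroring the "or" operator. Assumption: "*" checks all field values.
func looksSensitive(record map[string]string) bool {
	for _, value := range record {
		for _, re := range sensitivePatterns {
			if re.MatchString(value) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(looksSensitive(map[string]string{"note": "ssn 123-45-6789"}))     // true
	fmt.Println(looksSensitive(map[string]string{"card": "4111111111111111"}))    // true
	fmt.Println(looksSensitive(map[string]string{"card": "4111 1111 1111 1111"})) // false: spaces break \d{16}
	fmt.Println(looksSensitive(map[string]string{"cfg": "Password = hunter2"}))   // true
}
```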

Environment-Specific URLs

Route based on environment in URLs:

{
  "operator": "and",
  "conditions": [
    {"type_id": "matches_regex", "config": {"key": "endpoint", "pattern": "^https://"}},
    {"type_id": "matches_regex", "config": {"key": "endpoint", "pattern": "\\.(prod|production)\\."}}
  ]
}

Version String Matching

Match simple semantic version strings, with an optional v prefix and an optional single alphanumeric pre-release tag:

{
  "type_id": "matches_regex",
  "config": {
    "key": "version",
    "pattern": "^v?\\d+\\.\\d+\\.\\d+(-[a-zA-Z0-9]+)?$"
  }
}

Matches:

{"version": "1.0.0"}
{"version": "v2.3.1"}
{"version": "1.0.0-beta"}

HTTP Method Routing

Route based on HTTP methods:

{
  "type_id": "matches_regex",
  "config": {
    "key": "method",
    "pattern": "^(POST|PUT|PATCH|DELETE)$"
  }
}

Exclude Test/Debug Data

Filter out test data:

{
  "operator": "and",
  "conditions": [
    {"type_id": "matches_regex", "config": {"key": "id", "pattern": "^test_", "not": true}},
    {"type_id": "matches_regex", "config": {"key": "email", "pattern": "@example\\.com$", "not": true}}
  ]
}

Common Regex Patterns Reference

| Use Case | Pattern |
|----------|---------|
| Email | `^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$` |
| IPv4 Address | `^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$` |
| UUID | `^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$` |
| URL | `^https?://[^\s]+$` |
| ISO Date | `^\d{4}-\d{2}-\d{2}$` |
| Semantic Version | `^v?\d+\.\d+\.\d+(-[a-zA-Z0-9]+)?$` |
| Hex Color | `^#[0-9a-fA-F]{6}$` |
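
One way to sanity-check these reference patterns is to compile each with Go's regexp package and run it against a known-good sample, as sketched below. The entry type, the check function, and the sample values are illustrative choices, not part of the product.

```go
package main

import (
	"fmt"
	"regexp"
)

// entry pairs a reference pattern with one illustrative sample value.
type entry struct{ name, pattern, sample string }

var reference = []entry{
	{"Email", `^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`, "user@example.com"},
	{"IPv4 Address", `^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$`, "192.168.1.1"},
	{"UUID", `^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`, "550e8400-e29b-41d4-a716-446655440000"},
	{"URL", `^https?://[^\s]+$`, "https://example.com/path"},
	{"ISO Date", `^\d{4}-\d{2}-\d{2}$`, "2024-01-15"},
	{"Semantic Version", `^v?\d+\.\d+\.\d+(-[a-zA-Z0-9]+)?$`, "v2.3.1"},
	{"Hex Color", `^#[0-9a-fA-F]{6}$`, "#1a2b3c"},
}

// check compiles the pattern and reports whether it matches the sample.
func check(e entry) (bool, error) {
	re, err := regexp.Compile(e.pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(e.sample), nil
}

func main() {
	for _, e := range reference {
		ok, err := check(e)
		fmt.Printf("%-17s compiled=%v matched=%v\n", e.name, err == nil, ok)
	}
}
```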

Best Practices

  1. Escape backslashes in JSON: A backslash in a pattern must be written as two backslashes inside a JSON string (\\d, not \d).

  2. Use anchors: Patterns match anywhere in the value by default; add ^ and $ when the entire string must match.

  3. Keep patterns simple: Complex patterns are harder to maintain and debug.

  4. Test patterns: Use a regex tester to verify patterns before deploying.

  5. Consider performance: Very complex patterns can impact processing speed.

  6. Use simpler conditions when possible: For simple prefix/suffix matching, use starts_with or ends_with.
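
The JSON escaping rule from item 1 can be demonstrated with Go's encoding/json package: the doubled backslash in the JSON text becomes a single backslash in the decoded pattern, while a single backslash before d is rejected as invalid JSON. The condition struct and decodePattern are hypothetical names that simply mirror the config shape used on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition mirrors the JSON shape used in the examples on this page.
type condition struct {
	TypeID string `json:"type_id"`
	Config struct {
		Key     string `json:"key"`
		Pattern string `json:"pattern"`
		Not     bool   `json:"not"`
	} `json:"config"`
}

// decodePattern parses a condition and returns the pattern as the regex
// engine would receive it, after JSON unescaping.
func decodePattern(raw string) (string, error) {
	var c condition
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return "", err
	}
	return c.Config.Pattern, nil
}

func main() {
	good := `{"type_id": "matches_regex", "config": {"key": "code", "pattern": "^\\d{3}$"}}`
	bad := `{"type_id": "matches_regex", "config": {"key": "code", "pattern": "^\d{3}$"}}`

	p, err := decodePattern(good)
	fmt.Println(p, err) // ^\d{3}$ <nil>

	_, err = decodePattern(bad)
	fmt.Println(err != nil) // true: \d is not a valid JSON escape
}
```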

Type Handling

| Data Type | Behavior |
|-----------|----------|
| Strings | Pattern matching applied |
| Numbers | Converted to string first |
| Booleans | Converted to "true" or "false" |
| Arrays | Not supported directly |
| Null | Returns false |
| Missing | Returns false (or true if not is set) |
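
The conversions in this table might look roughly like the Go sketch below. The exact string form the engine uses for numbers is not documented here, so the shortest decimal representation is an assumption, as is the helper name toMatchString.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// toMatchString converts a decoded JSON value to the string that would be
// matched. The second result is false when the type is not matchable.
func toMatchString(v any) (string, bool) {
	switch x := v.(type) {
	case string:
		return x, true
	case float64: // encoding/json decodes JSON numbers to float64 by default
		return strconv.FormatFloat(x, 'f', -1, 64), true // assumed formatting
	case bool:
		return strconv.FormatBool(x), true
	default: // nil (JSON null or missing key), arrays, objects
		return "", false
	}
}

func main() {
	var record map[string]any
	raw := `{"status": 404, "active": true, "tags": ["a"], "note": null}`
	if err := json.Unmarshal([]byte(raw), &record); err != nil {
		panic(err)
	}

	re := regexp.MustCompile(`^4\d\d$`)
	for _, key := range []string{"status", "active", "tags", "note"} {
		s, ok := toMatchString(record[key])
		fmt.Println(key, ok && re.MatchString(s)) // only "status" prints true
	}
}
```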

Limitations

  • Uses RE2 syntax (no backreferences or lookahead/lookbehind)
  • No separate case-insensitive setting (use the (?i) prefix in the pattern instead)
  • Complex patterns may impact performance
  • Cannot match across multiple fields
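
The first limitation can be confirmed with Go's regexp package, which rejects backreferences and lookarounds at compile time. The sketch below uses a hypothetical compiles helper and a handful of made-up patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's RE2-based engine accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`(\w+) \1`))   // false: backreference
	fmt.Println(compiles(`foo(?=bar)`)) // false: lookahead
	fmt.Println(compiles(`foo(?!bar)`)) // false: negative lookahead
	fmt.Println(compiles(`(?i)foo`))    // true: inline flags are supported
}
```

A pattern copied from a PCRE or JavaScript source is worth compiling this way before it goes into a condition.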

Troubleshooting

Pattern not matching:

  • Check JSON escaping (double backslashes for \d, \s, etc.)
  • Verify anchors (^ and $) if matching full strings
  • Test pattern with sample data using an online regex tester

Unexpected matches:

  • Add anchors if you want to match the complete string
  • Check for overly broad patterns

Performance issues:

  • Simplify complex patterns
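  • Avoid large repetition counts and long alternations (RE2 does not backtrack and runs in time linear in the input, but bigger patterns still cost more to compile and run)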
  • Consider using simpler conditions (contains, starts_with) when possible

Case sensitivity:

  • Add (?i) at the start of the pattern for case-insensitive matching
  • Example: (?i)error matches "error", "ERROR", "Error"
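
The behaviour of the (?i) flag can be checked with a short Go sketch. The scoped form (?i:...) shown at the end is standard RE2 syntax but is not mentioned elsewhere on this page; match is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// match reports whether pattern is found anywhere in s.
func match(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	for _, s := range []string{"error", "ERROR", "Error"} {
		// With (?i) all three match; without it only the lowercase one does.
		fmt.Println(s, match(`(?i)error`, s), match(`error`, s))
	}

	// (?i:...) limits case-insensitivity to the group it wraps.
	fmt.Println(match(`^(?i:err)or$`, "ERRor")) // true
	fmt.Println(match(`^(?i:err)or$`, "ERROR")) // false: "OR" is outside the group
}
```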