Volume Anomaly Detection Alert

Description

This alert detects volume anomalies in pipeline metrics using IQR-based (Interquartile Range) statistical methods on historical patterns. It uses the Tukey fence method to identify outliers by comparing current metric values against historically established bounds.

The anomaly detection calculates bounds using: [Q1 - k×IQR, Q3 + k×IQR] where IQR = Q3 - Q1. Values outside these bounds are flagged as anomalies.

The alert includes a bootstrap period where pipelines must accumulate at least 4 weeks of historical data before anomaly detection is enabled. This ensures statistical reliability by preventing false positives on insufficient data.

Compatible with all Monad tiers

Prerequisites

Active pipelines generating metrics in your Monad organization
At least 4 weeks of historical metric data for each pipeline (bootstrap period)
Familiarity with the sensitivity options described in the IQR K-Factor Guidelines below

Setup Instructions

Select the Metric to Monitor (ingress_bytes or egress_bytes)
Set the IQR K-Factor to control sensitivity (common values: 1.5 for outliers, 2.5 for far outliers, 3.0 for extreme outliers)
Select the pipelines to monitor (leave empty to monitor all organization pipelines)

Configuration Options

Settings

Setting	Type	Required	Default	Description
metric	string	Yes	-	The metric to monitor for anomalies. Must be `ingress_bytes` or `egress_bytes`.
iqr_k_factor	float	Yes	-	Sensitivity multiplier for anomaly detection. Values outside the range [Q1 - k×IQR, Q3 + k×IQR] trigger alerts. Use 1.5 for outliers, 2.5 for far outliers, or 3.0 for extreme outliers only.

IQR K-Factor Guidelines

The k-factor determines how sensitive the anomaly detection is:

1.5 - Standard outlier detection (more alerts, catches subtle anomalies)
2.5 - Far outlier detection (balanced sensitivity)
3.0 - Extreme outlier detection (fewer alerts, only major deviations)

Confidence Levels

The alert assigns confidence levels based on the amount of historical data available:

Data Available	Confidence Level	Behavior
< 4 weeks	Bootstrap	No alerting (insufficient data)
4-8 weeks	Low	Alerting enabled with lower confidence
8-20 weeks	Medium	Alerting with moderate confidence
20+ weeks	High	Alerting with high confidence

Alert JSON Format

When an anomaly is detected, the alert generates the following JSON structure:

{
  "rule_id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Volume Anomaly Alert",
  "organization_id": "org-123",
  "severity": "high",
  "description": "Pipeline pipeline-abc-123: ingress_bytes is above normal range (current: 5000000, above bound: 3500000)",
  "metadata": {
    "pipeline_id": "pipeline-abc-123",
    "metric_name": "ingress_bytes",
    "detection_method": "iqr",
    "current_value": 5000000,
    "lower_bound": 500000,
    "upper_bound": 3500000,
    "threshold_value": 2.5,
    "sample_size": 12,
    "confidence_level": "medium"
  },
  "resource": {
    "resource_type": "pipeline",
    "resource_id": "pipeline-abc-123"
  }
}

Alert Metadata Fields

pipeline_id: The ID of the pipeline that triggered the alert
metric_name: The metric being monitored (ingress_bytes or egress_bytes)
detection_method: The algorithm used for detection (always iqr — IQR-based Tukey fence method)
current_value: The current metric value that triggered the anomaly
lower_bound: The lower threshold (Q1 - k×IQR)
upper_bound: The upper threshold (Q3 + k×IQR)
threshold_value: The k-factor used for the Tukey fence calculation
sample_size: Number of weekly samples used for statistical calculation
confidence_level: Reliability of the detection based on historical data (low, medium, or high)

Use Cases

Capacity Planning: Detect unusual spikes in data volume that may indicate capacity issues
Data Pipeline Health: Identify unexpected drops in data flow that might signal upstream problems
Anomaly Investigation: Catch volume changes that deviate significantly from historical patterns
Cost Management: Alert on unusual data volumes that could impact billing
Security Monitoring: Detect abnormal data exfiltration patterns through volume anomalies
SLA Compliance: Ensure data volumes stay within expected operational ranges

Limitations

Requires a minimum of 4 weeks of historical data before alerting begins (bootstrap period)
Only supports ingress_bytes and egress_bytes metrics
IQR k-factor must be greater than 0
Confidence level is determined by data availability and cannot be manually configured
For metrics with very consistent historical values (near-zero IQR), a small absolute tolerance is used instead
Lower bounds are automatically clamped to 0 for byte metrics (cannot be negative)

Example Configurations

Standard Volume Monitoring

{
  "metric": "ingress_bytes",
  "iqr_k_factor": 2.5
}

Detects far outliers in ingress data volume, suitable for most production use cases.

High Sensitivity Monitoring

{
  "metric": "egress_bytes",
  "iqr_k_factor": 1.5
}

Catches subtle anomalies in egress data volume. Use for critical pipelines where early detection is important.

Low Sensitivity Monitoring

{
  "metric": "ingress_bytes",
  "iqr_k_factor": 3.0
}

Only alerts on extreme deviations. Use for pipelines with naturally variable volumes to reduce alert noise.

Description​

Prerequisites​

Setup Instructions​

Configuration Options​

Settings​

IQR K-Factor Guidelines​

Confidence Levels​

Alert JSON Format​

Alert Metadata Fields​

Use Cases​

Limitations​

Example Configurations​

Standard Volume Monitoring​

High Sensitivity Monitoring​

Low Sensitivity Monitoring​