Volume Anomaly Detection Alert
Description
This alert detects volume anomalies in pipeline metrics using IQR-based (Interquartile Range) statistical methods on historical patterns. It uses the Tukey fence method to identify outliers by comparing current metric values against historically established bounds.
The anomaly detection calculates bounds using: [Q1 - k×IQR, Q3 + k×IQR] where IQR = Q3 - Q1. Values outside these bounds are flagged as anomalies.
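The bounds formula above can be sketched in a few lines of Python. This is an illustration only, not the alert's actual implementation: the function names are hypothetical, and the quartile interpolation method (`inclusive`) is an assumption, since the source does not say how Q1 and Q3 are computed.

```python
import statistics

def tukey_fences(samples, k):
    """Return (lower, upper) = (Q1 - k*IQR, Q3 + k*IQR) for a list of numbers."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    # The "inclusive" interpolation method is an assumption made for this sketch.
    q1, _median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_anomaly(value, samples, k):
    """True when value falls outside the Tukey fences built from samples."""
    lower, upper = tukey_fences(samples, k)
    return value < lower or value > upper

# Hypothetical historical samples for one pipeline.
weekly_bytes = [100, 110, 120, 130, 140, 150, 160, 170]
lower, upper = tukey_fences(weekly_bytes, k=1.5)
```

For these samples Q1 is 117.5 and Q3 is 152.5, so the IQR is 35 and the fences at k = 1.5 are 65 and 205. A value of 300 would be flagged, while 150 would not.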
The alert includes a bootstrap period: each pipeline must accumulate at least 4 weeks of historical data before anomaly detection is enabled for it. This avoids false positives from bounds calculated on too few samples.
Compatible with all Monad tiers
Prerequisites
- Active pipelines generating metrics in your Monad organization
- At least 4 weeks of historical metric data for each pipeline (bootstrap period)
- Understanding of IQR-based anomaly detection concepts
Setup Instructions
- Select the Metric to Monitor (ingress_bytes or egress_bytes)
- Set the IQR K-Factor to control sensitivity (common values: 1.5 for outliers, 2.5 for far outliers, 3.0 for extreme outliers)
- Select the pipelines to monitor (leave empty to monitor all organization pipelines)
Configuration Options
Settings
| Setting | Type | Required | Default | Description |
|---|---|---|---|---|
| metric | string | Yes | - | The metric to monitor for anomalies. Must be ingress_bytes or egress_bytes. |
| iqr_k_factor | float | Yes | - | Sensitivity multiplier for anomaly detection. Values outside the range [Q1 - k×IQR, Q3 + k×IQR] trigger alerts. Use 1.5 for outliers, 2.5 for far outliers, or 3.0 for extreme outliers only. |
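A client-side check of these two settings before submitting them might look like the sketch below. The `validate_settings` helper is hypothetical and not part of any Monad API. It only encodes the constraints stated in this document: the metric must be one of the two supported values, and the k-factor must be a number greater than 0.

```python
ALLOWED_METRICS = {"ingress_bytes", "egress_bytes"}

def validate_settings(settings):
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []

    metric = settings.get("metric")
    if metric not in ALLOWED_METRICS:
        problems.append(
            f"metric must be one of {sorted(ALLOWED_METRICS)}, got {metric!r}"
        )

    k = settings.get("iqr_k_factor")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(k, bool) or not isinstance(k, (int, float)):
        problems.append("iqr_k_factor must be a number")
    elif k <= 0:
        problems.append("iqr_k_factor must be greater than 0")

    return problems
```

Calling `validate_settings({"metric": "ingress_bytes", "iqr_k_factor": 2.5})` returns an empty list, while an unsupported metric or a non-positive k-factor each add one message.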
IQR K-Factor Guidelines
The k-factor determines how sensitive the anomaly detection is:
- 1.5: Standard outlier detection (more alerts, catches subtle anomalies)
- 2.5: Far outlier detection (balanced sensitivity)
- 3.0: Extreme outlier detection (fewer alerts, only major deviations)
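To see how the k-factor widens or narrows the accepted range, the sketch below applies the three common values to one pair of quartiles. The quartile values are hypothetical, chosen so that k = 2.5 reproduces the bounds shown in the sample alert JSON (500000 and 3500000). The `fences` helper is illustrative only.

```python
def fences(q1, q3, k):
    """Return (lower, upper) Tukey fences for the given quartiles and k-factor."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical quartiles in bytes; the IQR is 500,000.
q1, q3 = 1_750_000, 2_250_000

for k in (1.5, 2.5, 3.0):
    lower, upper = fences(q1, q3, k)
    print(f"k={k}: [{lower:,.0f}, {upper:,.0f}]")
```

With these inputs the accepted range grows from [1,000,000, 3,000,000] at k = 1.5 to [500,000, 3,500,000] at k = 2.5 and [250,000, 3,750,000] at k = 3.0, so the same 5,000,000-byte reading is an anomaly at every setting, while a 3,200,000-byte reading is flagged only at k = 1.5.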
Confidence Levels
The alert assigns confidence levels based on the amount of historical data available:
| Data Available | Confidence Level | Behavior |
|---|---|---|
| < 4 weeks | Bootstrap | No alerting (insufficient data) |
| 4-8 weeks | Low | Alerting enabled with lower confidence |
| 8-20 weeks | Medium | Alerting with moderate confidence |
| 20+ weeks | High | Alerting with high confidence |
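The table can be read as a simple threshold mapping, sketched below. The function name is hypothetical. The table does not say which level applies at exactly 8 or 20 weeks, so this sketch assumes each range includes its lower bound; the real behavior at those boundaries may differ.

```python
def confidence_level(weeks_of_data):
    """Map weeks of available history to the confidence level in the table."""
    # Assumption: ranges are lower-inclusive, so 8 weeks is "medium"
    # and 20 weeks is "high". The source table leaves this ambiguous.
    if weeks_of_data < 4:
        return "bootstrap"  # no alerting yet
    if weeks_of_data < 8:
        return "low"
    if weeks_of_data < 20:
        return "medium"
    return "high"
```

Under this reading, a pipeline with 12 weekly samples maps to medium, which matches the sample alert JSON (sample_size 12, confidence_level medium).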
Alert JSON Format
When an anomaly is detected, the alert generates the following JSON structure:
{
  "rule_id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Volume Anomaly Alert",
  "organization_id": "org-123",
  "severity": "warning",
  "description": "Pipeline pipeline-abc-123: ingress_bytes is above normal range (current: 5000000, above bound: 3500000)",
  "metadata": {
    "pipeline_id": "pipeline-abc-123",
    "metric_name": "ingress_bytes",
    "detection_method": "iqr",
    "current_value": 5000000,
    "lower_bound": 500000,
    "upper_bound": 3500000,
    "threshold_value": 2.5,
    "sample_size": 12,
    "confidence_level": "medium"
  },
  "resource": {
    "resource_type": "pipeline",
    "resource_id": "pipeline-abc-123"
  }
}
Alert Metadata Fields
- pipeline_id: The ID of the pipeline that triggered the alert
- metric_name: The metric being monitored (ingress_bytes or egress_bytes)
- detection_method: The algorithm used for detection (iqr for IQR-based detection)
- current_value: The current metric value that triggered the anomaly
- lower_bound: The lower threshold (Q1 - k×IQR)
- upper_bound: The upper threshold (Q3 + k×IQR)
- threshold_value: The k-factor used for the Tukey fence calculation
- sample_size: Number of weekly samples used for statistical calculation
- confidence_level: Reliability of the detection based on historical data (low, medium, or high)
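A downstream consumer could use these metadata fields to work out the direction and size of a deviation. The sketch below is one possible approach, using only field names from the alert JSON shown earlier. The `summarize_alert` helper and the shape of its return value are hypothetical.

```python
import json

def summarize_alert(alert_json):
    """Extract direction and size of the deviation from an alert payload."""
    meta = json.loads(alert_json)["metadata"]
    value = meta["current_value"]
    lower, upper = meta["lower_bound"], meta["upper_bound"]

    if value > upper:
        direction, excess = "above", value - upper
    elif value < lower:
        direction, excess = "below", lower - value
    else:
        direction, excess = "within", 0

    return {
        "pipeline_id": meta["pipeline_id"],
        "metric": meta["metric_name"],
        "direction": direction,
        "excess": excess,  # bytes beyond the violated bound
        "confidence": meta["confidence_level"],
    }

# Reduced version of the sample alert, keeping only the fields used here.
sample = json.dumps({
    "metadata": {
        "pipeline_id": "pipeline-abc-123",
        "metric_name": "ingress_bytes",
        "current_value": 5000000,
        "lower_bound": 500000,
        "upper_bound": 3500000,
        "confidence_level": "medium",
    }
})
summary = summarize_alert(sample)
```

For the sample values, the reading is above the upper bound by 1500000 bytes.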
Use Cases
- Capacity Planning: Detect unusual spikes in data volume that may indicate capacity issues
- Data Pipeline Health: Identify unexpected drops in data flow that might signal upstream problems
- Anomaly Investigation: Catch volume changes that deviate significantly from historical patterns
- Cost Management: Alert on unusual data volumes that could impact billing
- Security Monitoring: Detect abnormal data exfiltration patterns through volume anomalies
- SLA Compliance: Ensure data volumes stay within expected operational ranges
Limitations
- Requires a minimum of 4 weeks of historical data before alerting begins (bootstrap period)
- Only supports ingress_bytes and egress_bytes metrics
- IQR k-factor must be greater than 0
- Confidence level is determined by data availability and cannot be manually configured
- For metrics with very consistent historical values (near-zero IQR), a small absolute tolerance is used instead
- Lower bounds are automatically clamped to 0 for byte metrics (cannot be negative)
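The last two limitations can be illustrated with the sketch below. The source does not specify how small an IQR counts as near-zero, how large the absolute tolerance is, or how it is applied, so `min_iqr`, `abs_tolerance`, and the fallback rule here are all assumptions made purely for illustration.

```python
def bounded_fences(q1, q3, k, min_iqr=1.0, abs_tolerance=1024.0):
    """Tukey fences with a near-zero-IQR fallback and a lower bound clamped to 0.

    min_iqr and abs_tolerance are hypothetical values, not documented defaults.
    """
    iqr = q3 - q1
    if iqr < min_iqr:
        # Near-constant history: k * IQR would collapse to almost nothing,
        # so use a fixed absolute margin around the quartiles instead (assumed rule).
        lower, upper = q1 - abs_tolerance, q3 + abs_tolerance
    else:
        lower, upper = q1 - k * iqr, q3 + k * iqr
    # Byte counts cannot be negative, so the lower bound is clamped to 0.
    return max(lower, 0.0), upper
```

For example, quartiles of 1000 and 5000 with k = 1.5 give a raw lower fence of -5000, which is clamped to 0, and an upper fence of 11000.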
Example Configurations
Standard Volume Monitoring
{
  "metric": "ingress_bytes",
  "iqr_k_factor": 2.5
}
Detects far outliers in ingress data volume, suitable for most production use cases.
High Sensitivity Monitoring
{
  "metric": "egress_bytes",
  "iqr_k_factor": 1.5
}
Catches subtle anomalies in egress data volume. Use for critical pipelines where early detection is important.
Low Sensitivity Monitoring
{
  "metric": "ingress_bytes",
  "iqr_k_factor": 3.0
}
Only alerts on extreme deviations. Use for pipelines with naturally variable volumes to reduce alert noise.