Overview
Metoro’s Anomaly Detection feature automatically identifies unusual patterns in your systems without requiring you to configure explicit alert thresholds. When an anomaly is detected, Guardian AI automatically investigates to determine whether there is a real issue and what the root cause might be.
How It Works
- Detection - Metoro continuously monitors your systems for anomalous behavior
- Investigation - When an anomaly is detected, Guardian automatically runs an investigation
- Analysis - Guardian determines whether the anomaly represents a real issue
- Notification - If an issue is confirmed, Guardian posts to Slack with its findings
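
The stages above can be read as a small pipeline in which a notification is sent only when the investigation confirms a real issue. The Python sketch below illustrates that control flow only. It is not Metoro's implementation, and every name in it (`Anomaly`, `Finding`, `handle_anomaly`, and the `investigate` and `notify` callables) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Anomaly:
    """Hypothetical record of a detected anomaly."""
    service: str
    environment: str
    description: str


@dataclass
class Finding:
    """Hypothetical result of an investigation."""
    is_real_issue: bool
    root_cause: str


def handle_anomaly(
    anomaly: Anomaly,
    investigate: Callable[[Anomaly], Finding],
    notify: Callable[[str], None],
) -> Optional[Finding]:
    """Investigate an anomaly and notify only if it is confirmed as a real issue."""
    # Investigation and analysis: decide whether the anomaly is a real issue.
    finding = investigate(anomaly)
    if not finding.is_real_issue:
        # Unconfirmed anomalies produce no notification.
        return None
    # Notification: report the confirmed issue together with the findings.
    notify(
        f"[{anomaly.environment}] {anomaly.service}: {anomaly.description}. "
        f"Likely root cause: {finding.root_cause}"
    )
    return finding
```

Passing `investigate` and `notify` as callables keeps the sketch independent of any real investigation engine or Slack client. In practice Guardian performs these steps for you.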
Types of Anomalies Detected
Currently, Metoro detects the following types of anomalies:

| Anomaly Type | Description |
|---|---|
| Error Rate Spikes | Sudden increases in error rates compared to baseline |
The types of anomalies detected will expand over time. Check back for updates or reach out to our team if you have specific anomaly types you’d like us to support.
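
To make "compared to baseline" concrete, the sketch below flags an error rate that sits far above its recent history. This page does not describe how Metoro computes its baseline, so the z-score approach, the function name `is_error_rate_spike`, and the `z_threshold` and `min_increase` parameters are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def is_error_rate_spike(
    baseline: Sequence[float],
    current: float,
    z_threshold: float = 3.0,
    min_increase: float = 0.01,
) -> bool:
    """Return True if `current` is unusually high relative to `baseline`.

    `baseline` holds recent error rates as fractions (0.002 means 0.2%).
    Illustrative z-score check only; not Metoro's actual detection logic.
    """
    if len(baseline) < 2:
        return False  # not enough history to estimate a baseline
    baseline_mean = mean(baseline)
    baseline_std = stdev(baseline)
    increase = current - baseline_mean
    if increase < min_increase:
        return False  # ignore decreases and negligible increases
    if baseline_std == 0:
        return True  # flat baseline: any meaningful increase is a spike
    # Flag the value when it lies z_threshold standard deviations above the mean.
    return increase / baseline_std >= z_threshold
```

With a baseline of `[0.002, 0.003, 0.002, 0.004, 0.003]`, a current error rate of `0.08` is flagged, while `0.004` is not. The `min_increase` floor is there so that a very quiet service does not trigger on tiny absolute changes.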
Enabling Anomaly Detection
Step 1: Navigate to Settings
Go to Settings → Features → Anomaly Detection.
Step 2: Enable Anomaly Detection
Toggle Enable Anomaly Detection to activate the feature.
Step 3: Configure Detection Scope
Select which services and environments should have anomaly detection enabled:
- Services - Choose specific services or select all
- Environments - Choose specific environments (e.g., prod, staging)
Configuring Notifications
Autonomous investigations use the same flexible notification configuration as other Guardian features.
Setting Up Notification Rules
- Navigate to Settings → Features → Autonomous Investigation
- Click Add Notification Configuration
- Configure:
  - Services - Which services should trigger notifications
  - Environments - Which environments should trigger notifications
  - Destination - Where to send notifications (Slack channel, webhook, etc.)
Example Configurations
Critical Services
Route anomalies for critical services to an incidents channel:
- Services: payment-service, auth-service, checkout-service
- Environments: prod
- Destination: #incidents
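
The matching behavior of a notification configuration like the one above can be sketched as a rule that pairs a service/environment filter with a destination. Notification rules are configured in the Metoro UI, so this Python model is purely illustrative: `NotificationRule`, `destinations_for`, and the convention that `None` means "all" are assumptions, not part of any Metoro API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class NotificationRule:
    """Hypothetical model of one notification configuration."""
    destination: str
    services: Optional[FrozenSet[str]] = None      # None means all services (assumption)
    environments: Optional[FrozenSet[str]] = None  # None means all environments (assumption)

    def matches(self, service: str, environment: str) -> bool:
        """Return True if this rule applies to the given service and environment."""
        service_ok = self.services is None or service in self.services
        environment_ok = self.environments is None or environment in self.environments
        return service_ok and environment_ok


def destinations_for(
    rules: List[NotificationRule], service: str, environment: str
) -> List[str]:
    """Collect the destinations of every rule that matches, in rule order."""
    return [rule.destination for rule in rules if rule.matches(service, environment)]


# The "Critical Services" example expressed as a rule.
CRITICAL_SERVICES = NotificationRule(
    destination="#incidents",
    services=frozenset({"payment-service", "auth-service", "checkout-service"}),
    environments=frozenset({"prod"}),
)
```

Under this model, an anomaly in payment-service in prod routes to #incidents, while the same service in staging, or an unlisted service in prod, matches nothing.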
How Anomaly Detection Differs from Alerts
| Feature | Alerts | Anomaly Detection |
|---|---|---|
| Configuration | You define thresholds | Automatic baseline learning |
| Trigger | Fixed thresholds | Statistical anomalies |
| Investigation | Manual or runbook | Automatic |
| Best for | Known failure modes | Unknown unknowns |
Anomaly Detection and Alerts are complementary. Use alerts for known failure modes with specific thresholds, and anomaly detection to catch unexpected issues.
Best Practices
Start with Production
Focus anomaly detection on production environments first, where issues have the most impact.
Review Investigation Quality
Periodically review Guardian’s investigations to ensure they’re finding real issues:
- Are the anomalies significant?
- Is the root cause analysis accurate?
- Provide feedback to improve detection
Combine with Alerts
Use both anomaly detection and alerts:
- Alerts for critical thresholds you always want to know about
- Anomaly detection for catching unexpected issues
Tune Notification Routing
Route notifications appropriately:
- Critical services → dedicated incident channels
- Non-critical services → general monitoring channels
