What is Guardian?
Guardian is Metoro’s AI SRE. Guardian can verify deployments, investigate production issues, root cause alerts, generate fixes, and more.
Key Features
Issues Catalog
Track recurring issues in one place with the root cause, symptoms, impact and resolution steps.
Deployment Verification
Automatically verify deployments for regressions by comparing pre- and post-deployment telemetry and infrastructure health.
Autonomous Issue Detection
Detect unexpected production issues without preconfigured thresholds and continue investigating them automatically.
Alert Investigations
Investigate firing alerts with AI to separate noisy alerts from real incidents and return root cause with supporting evidence.
AI Runbooks
Define investigation tasks for Guardian to execute when alerts fire so it gathers the context your team needs most.
Assisted Debugging
Ask Guardian questions directly and gather debugging context across telemetry, infrastructure state, and code changes.
Code Fixes
When connected to GitHub, Guardian can inspect code changes, recommend fixes, and create pull requests for review.
Flexible Notifications
Route deployment results, investigations, and issue findings to the right teams and systems.
Custom Credentials
Use your own AWS Bedrock credentials for complete control over AI processing and billing.
Pricing
AI SRE usage can be billed in two different ways depending on whether you provide your own model credentials.
- Bring your own model credentials: If you provide your own LLM credentials, such as your own AWS Bedrock access keys, you only pay Metoro’s normal ingest price. Your model usage is billed directly by your provider.
- Use Metoro-managed model credentials: If you do not provide your own key, Metoro calculates the LLM usage for your account based on the tokens consumed and passes that usage through to you at cost, with no markup added.
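The pass-through arithmetic for the two billing modes above can be sketched as follows. This is only an illustration: the function names, the `TokenPrices` type, and the per-million-token prices are hypothetical placeholders, not Metoro's or any provider's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrices:
    """Hypothetical provider prices in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def managed_llm_cost(input_tokens: int, output_tokens: int, prices: TokenPrices) -> float:
    """Provider cost for the tokens consumed, with no markup added."""
    input_cost = input_tokens / 1_000_000 * prices.input_per_million
    output_cost = output_tokens / 1_000_000 * prices.output_per_million
    return input_cost + output_cost


def llm_charge_from_metoro(
    uses_own_credentials: bool,
    input_tokens: int,
    output_tokens: int,
    prices: TokenPrices,
) -> float:
    """LLM amount passed through by Metoro, separate from normal ingest pricing."""
    if uses_own_credentials:
        # With your own credentials the provider bills you directly.
        return 0.0
    return managed_llm_cost(input_tokens, output_tokens, prices)


# Placeholder prices chosen only to make the arithmetic easy to follow.
example_prices = TokenPrices(input_per_million=3.0, output_per_million=15.0)
print(f"{llm_charge_from_metoro(False, 2_000_000, 500_000, example_prices):.2f}")  # prints 13.50
```

In both modes the normal ingest price still applies. The sketch only covers the LLM portion of the bill.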
Core AI SRE Workflows
Guardian operates through several interconnected workflows that turn telemetry, code, and infrastructure context into action.
Deployment Monitoring & Verification
Metoro automatically tracks all releases across your Kubernetes clusters. When you deploy a new image tag:
- A new_deployment event is created for the service
- If given access to code, Guardian will read code changes and use this information to do a more thorough analysis
- Guardian compares pre- and post-deployment telemetry, log patterns, latency, and infrastructure signals
- If issues are found, Guardian notifies you with evidence
- If not, Guardian marks the deployment as healthy
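To make the pre- and post-deployment comparison concrete, here is a minimal sketch of the idea. It is not how Guardian decides whether a rollout is healthy. The `WindowStats` type, the `verify_deployment` function, and both limits are assumptions chosen for illustration, and Guardian also draws on log patterns, infrastructure signals, and code changes that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    """Hypothetical telemetry summary for one time window of a service."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def verify_deployment(
    before: WindowStats,
    after: WindowStats,
    max_error_rate_increase: float = 0.01,  # illustrative limit, not a Metoro default
    max_latency_ratio: float = 1.5,         # illustrative limit, not a Metoro default
) -> list[str]:
    """Return a list of regression findings; an empty list means healthy."""
    findings: list[str] = []
    increase = after.error_rate - before.error_rate
    if increase > max_error_rate_increase:
        findings.append(
            f"error rate rose from {before.error_rate:.3f} to {after.error_rate:.3f}"
        )
    if before.p95_latency_ms > 0:
        ratio = after.p95_latency_ms / before.p95_latency_ms
        if ratio > max_latency_ratio:
            findings.append(f"p95 latency grew by a factor of {ratio:.2f}")
    return findings


before = WindowStats(requests=1000, errors=5, p95_latency_ms=120.0)
after = WindowStats(requests=1000, errors=50, p95_latency_ms=130.0)
for finding in verify_deployment(before, after):
    print(finding)
```

Returning a list of findings rather than a boolean mirrors the workflow above: a non-empty list is the evidence sent with the notification, and an empty list corresponds to marking the deployment as healthy.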
Autonomous Issue Detection & Root Cause Analysis
Guardian continuously monitors your systems for anomalous behavior:
- Detects anomalies like error rate spikes automatically
- Runs an investigation to determine if the behavior reflects a real production issue
- Identifies the root cause using logs, traces, and metrics
- Posts to Slack with findings if an issue is confirmed
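One generic way to flag an error rate spike without a hand-tuned fixed threshold is to compare the current value against the service's own recent history. The sketch below uses a simple z-score for this. It is a textbook technique shown for illustration only and is not a description of Guardian's detection method. The function name and the cutoff of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], current: float, z_cutoff: float = 3.0) -> bool:
    """Flag `current` if it sits far above the recent baseline in `history`."""
    if len(history) < 2:
        # Not enough data to estimate a baseline and its spread.
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any increase is a deviation.
        return current > baseline
    return (current - baseline) / spread > z_cutoff


recent_error_rates = [0.010, 0.012, 0.011, 0.009, 0.010]
print(is_anomalous(recent_error_rates, 0.080))  # prints True
print(is_anomalous(recent_error_rates, 0.011))  # prints False
```

A flagged value would only be the starting point. In the workflow above, an investigation then decides whether the behavior reflects a real production issue.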
Alert Investigations
Guardian can investigate alerts that your team already configured:
- An alert fires from a known threshold or condition
- Guardian determines whether the alert is noisy or a real incident
- If it is real, Guardian continues investigating to the likely root cause
- Metoro sends the findings with supporting evidence
AI Runbooks
When you configure alerts, you can attach runbooks for Guardian to execute:
- Alert fires based on your configured thresholds
- Guardian executes the runbook you’ve defined
- Creates an investigation document with all gathered data
- Links the investigation from the alert notification
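Conceptually, a runbook is an ordered set of investigation tasks whose results are collected into one document. The sketch below models that shape in plain Python. It does not reflect how runbooks are defined in Metoro. The `Investigation` type, the `run_runbook` function, and the task names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Investigation:
    """Hypothetical investigation document built from runbook task results."""
    alert_name: str
    sections: dict[str, str] = field(default_factory=dict)

    def to_text(self) -> str:
        lines = [f"Investigation for alert: {self.alert_name}"]
        for title, body in self.sections.items():
            lines.append(f"{title}: {body}")
        return "\n".join(lines)


def run_runbook(alert_name: str, tasks: dict[str, Callable[[str], str]]) -> Investigation:
    """Run each task in order and record its output, even if a task fails."""
    investigation = Investigation(alert_name)
    for title, task in tasks.items():
        try:
            investigation.sections[title] = task(alert_name)
        except Exception as exc:
            # Keep going so one failing task does not hide the other findings.
            investigation.sections[title] = f"task failed: {exc}"
    return investigation


# Stand-in tasks; real tasks would query telemetry, deployments, and so on.
example_tasks = {
    "Recent deployments": lambda alert: "no deployments in the last hour",
    "Error logs": lambda alert: "timeouts calling the payments service",
}
print(run_runbook("HighErrorRate", example_tasks).to_text())
```

Recording a failure as a section, instead of aborting, keeps the document useful when only part of the context could be gathered.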
Getting Started
Choose your adoption path
Start with Getting Started to decide whether Metoro will be your full observability and AI SRE platform or run side by side with your current stack.
Install Metoro and confirm telemetry is flowing
Once you know your adoption path and deployment model, install Metoro and wait for telemetry to arrive.
Turn on Guardian AI
Guardian AI is the umbrella setting for Metoro’s AI SRE workflows. During onboarding you will be asked whether you want to enable Guardian AI. You can also enable this later in the Settings → Features → Guardian AI section.
Turn on AI deployment verification
Enable deployment verification so Metoro automatically checks whether new rollouts are healthy or regressed. During onboarding you will be asked whether you want to enable AI Deployment Verification. You can also enable this later in the Settings → Features → Deployment Monitoring section.
Turn on autonomous issue detection and investigations
Enable autonomous issue detection so Guardian can proactively detect and investigate unexpected production issues. During onboarding you will be asked whether you want to enable Autonomous Issue Detection. You can also enable this later in the Settings → Features → Autonomous Issue Detection section.
Connect GitHub and map services
Connect repositories and allow Metoro to automatically map your services to repositories so Guardian can use code context during investigations, deployment analysis, and suggested fixes. During onboarding you will be asked whether you want to connect GitHub and map services to repositories. You can also do this later in the Settings → Integrations → Third-party Integrations section.
Next Steps
Issues
Review recurring Guardian findings and their latest investigations
Deployment Verification
Set up automatic deployment verification
Autonomous Issue Detection
Enable autonomous issue detection and root cause analysis
AI Alert Investigations
Investigate firing alerts with Guardian AI
Assisted Debugging
Ask questions directly in chat and gather debugging context quickly
GitHub Integration
Add repository context and suggested fixes
