# Introduction

> Metoro is a Kubernetes-native AI SRE platform with a built-in telemetry layer that is up and running in 5 minutes

<img className="block dark:hidden" src="https://mintcdn.com/metoro/iqFtfx-__JRaa9yb/images/Metoro_16_9.svg?fit=max&auto=format&n=iqFtfx-__JRaa9yb&q=85&s=5a1b09de1c623188bf5967cfa1bfa918" alt="Metoro logo" width="454" height="255" data-path="images/Metoro_16_9.svg" />

<img className="hidden dark:block" src="https://mintcdn.com/metoro/iqFtfx-__JRaa9yb/images/Metoro_16_9.svg?fit=max&auto=format&n=iqFtfx-__JRaa9yb&q=85&s=5a1b09de1c623188bf5967cfa1bfa918" alt="Metoro logo" width="454" height="255" data-path="images/Metoro_16_9.svg" />

## What is Metoro?

Metoro is a **Kubernetes-native AI SRE platform** that automatically detects, investigates, and identifies root causes of production issues. It can investigate your alerts, follow runbooks, verify deployments, and suggest fixes for the issues it detects.

Out of the box, Metoro collects logs, metrics, traces, profiling, and Kubernetes state with zero manual instrumentation. That gives Metoro's AI agents **full context** to do **accurate root cause analysis** instead of relying on partial telemetry.
Teams that already emit custom OpenTelemetry traces or metrics can send them to Metoro by pointing their OpenTelemetry Collector at Metoro.
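
As a rough illustration of pointing a collector at Metoro, the fragment below is a minimal OpenTelemetry Collector configuration sketch. It assumes Metoro accepts OTLP over HTTP. The exporter name `otlphttp/metoro` and the endpoint `https://otlp.example.com` are placeholders, not values from Metoro's documentation, and any authentication your setup needs is omitted. Check the Metoro OpenTelemetry integration docs for the real endpoint and protocol.

```yaml
# Sketch only: the endpoint below is a placeholder, not a real Metoro URL.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # applications send OTLP/gRPC here
      http:
        endpoint: 0.0.0.0:4318   # applications send OTLP/HTTP here

processors:
  batch: {}                      # batch telemetry before export

exporters:
  otlphttp/metoro:               # hypothetical exporter name
    endpoint: https://otlp.example.com   # replace with your Metoro endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/metoro]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/metoro]
```

With a setup along these lines, applications keep sending OTLP to the collector as before, and only the exporter section changes to route custom traces and metrics to Metoro.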

Metoro is also an **observability platform** that can replace Datadog, Grafana, and other tools for workloads running on Kubernetes, so teams can move from automated detection and investigation to direct debugging without switching tools.

Setup takes **\< 5 minutes** with a single Helm install.

## AI SRE workflows

Metoro supports the following AI SRE workflows:

<CardGroup cols={2}>
  <Card title="Deployment Verification" icon="rocket" href="/ai-sre/deployment-verification">
    Automatically detect deployments, compare pre- and post-deployment telemetry, and flag regressions with evidence.
  </Card>

  <Card title="Autonomous Issue Detection" icon="magnifying-glass" href="/ai-sre/anomaly-detection">
    Detect unusual behavior across telemetry, decide whether it is a real production issue, and continue to root cause automatically.
  </Card>

  <Card title="Alert Investigations" icon="bell" href="/ai-sre/alert-investigations">
    Investigate firing alerts, separate noisy alerts from real incidents, and return a full RCA with supporting evidence. Use Metoro's own alerting, or send alerts from other sources to Metoro for investigation.
  </Card>

  <Card title="Code Fixes" icon="code" href="/ai-sre/github-integration">
    Connect GitHub so Metoro can inspect code changes for deployments, recommend and prepare fixes for production issues, and create pull requests for review.
  </Card>

  <Card title="Advisor" icon="triangle-exclamation" href="/advisor/overview">
    Review right-sizing, OOM, and CPU throttling findings for your services with detailed evidence and recurrence history.
  </Card>

  <Card title="AI Runbooks" icon="book" href="/ai-sre/runbooks">
    Attach investigation instructions to alerts so Metoro gathers the context your team cares about most.
  </Card>

  <Card title="Assisted Debugging with AI" icon="brain" href="/ai-sre/assisted-debugging">
    Ask Metoro to investigate or gather information for you.
  </Card>

  <Card title="Metoro MCP Server" icon="server" href="/integrations/metoro-mcp-server">
    Hook up Metoro's MCP Server to your local agents to get production insights during development.
  </Card>
</CardGroup>

## Observability coverage

Installing Metoro gives you access to the following observability data and features:

<CardGroup cols={2}>
  <Card title="Logs" icon="rectangle-terminal" href="/logs/overview">
    Centralized logs with fast search, structured parsing, transformations, and log metric visualizations.
  </Card>

  <Card title="Metrics" icon="chart-line" href="/metrics/overview">
    Cluster, node, pod, and service metrics out of the box, plus support for custom application metrics.
  </Card>

  <Card title="Traces" icon="route" href="/traces/overview">
    Automatic zero-instrumentation traces for common protocols, with OpenTelemetry support for custom tracing.
  </Card>

  <Card title="Profiling" icon="brain" href="/profiling/overview">
    On-CPU profiling for all containers, so you can see hot code paths and find bottlenecks.
  </Card>

  <Card title="Kubernetes State" icon="cube" href="/kubernetes-resources/kubernetes-resources">
    View your full cluster state over time to correlate runtime issues with cluster changes. It is like having k9s with time travel: you can inspect historical cluster state.
  </Card>

  <Card title="Dashboards" icon="gauge" href="/dashboards/overview">
    Custom dashboards that you can create from templates or build from scratch.
  </Card>

  <Card title="Alerting" icon="bell" href="/alerts-monitoring/alerts-overview">
    Automatically set up alerts for your cluster in 5 minutes with AI-powered alert suggestions.
  </Card>

  <Card title="Resource Optimization" icon="dollar" href="/advisor/overview">
    Optimize your Kubernetes resources with Metoro's Advisor. Find workloads that are underutilized, overutilized, or heavily throttled. Get resource sizing recommendations based on historical usage.
  </Card>

  <Card title="Uptime Monitoring" icon="waveform" href="/uptime-monitoring/overview">
    Set up uptime monitoring for your endpoints and receive alerts when they go down.
  </Card>

  <Card title="Status Pages" icon="page-caret-up" href="/uptime-monitoring/status-pages">
    Create custom status pages for your services from any metric or uptime monitor in your account.
  </Card>
</CardGroup>

## Why teams use Metoro

Teams typically adopt Metoro because:

* Their engineers spend too much time debugging production issues across disconnected dashboards, logs, alerts, traces, and infrastructure tools.
* Correlating different signals and getting from symptom to root cause is often slow, manual, and dependent on whoever knows the system best.
* They want to reduce MTTR by catching regressions earlier, investigating faster, and correlating telemetry, infrastructure state, and code changes in one workflow.
* They want to reduce noisy alert work by letting AI agents filter noise, investigate real issues, and hand teams supporting evidence instead of raw symptoms.
