Documentation Index
Fetch the complete documentation index at: https://metoro.io/docs/llms.txt
Use this file to discover all available pages before exploring further.
Metoro has two main components: the in-cluster agents and the telemetry backend.
Telemetry Data Flow
At a very high level, the flow of data in Metoro is as follows:
- The node agents collect data from the Linux kernel of every node in the Kubernetes cluster and write it to cluster-local storage.
- The cluster exporter then reads the data from local storage, aggregates it across all nodes, and sends it to the Metoro backend ingesters along with the Kubernetes metadata (pods, deployments, ConfigMaps, etc.).
- The ingesters write all observability data to the long-term backend storage: currently ClickHouse.
- The API server reads data from the backend storage and serves it to the frontend and any API clients.
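The four steps above can be sketched as a toy pipeline. This is a simplified, hypothetical model for illustration only: the function and variable names (`node_agent`, `cluster_exporter`, `ingester`, `api_server`, `backend_storage`) are not Metoro's real APIs, and `backend_storage` is just a list standing in for ClickHouse.

```python
# Illustrative model of Metoro's telemetry data flow.
# All names here are hypothetical; they do not come from the Metoro codebase.

cluster_local_storage: dict[str, list[dict]] = {}  # node name -> buffered kernel events
backend_storage: list[dict] = []                   # stands in for ClickHouse

def node_agent(node: str, events: list[dict]) -> None:
    """Step 1: collect kernel-level events on one node, write to local storage."""
    cluster_local_storage.setdefault(node, []).extend(events)

def cluster_exporter(k8s_metadata: dict) -> dict:
    """Step 2: aggregate buffered events across all nodes and attach K8s metadata."""
    batch = {
        "events": [e for evs in cluster_local_storage.values() for e in evs],
        "metadata": k8s_metadata,
    }
    for evs in cluster_local_storage.values():  # drain local buffers after export
        evs.clear()
    return batch

def ingester(batch: dict) -> None:
    """Step 3: write an aggregated batch into long-term backend storage."""
    backend_storage.append(batch)

def api_server(service: str) -> list[dict]:
    """Step 4: read from backend storage and serve events for one service."""
    return [e for b in backend_storage for e in b["events"]
            if e.get("service") == service]

# Walk one batch of telemetry through the whole pipeline:
node_agent("node-1", [{"service": "checkout", "type": "http_request"}])
node_agent("node-2", [{"service": "cart", "type": "tcp_retransmit"}])
ingester(cluster_exporter({"pods": ["checkout-abc", "cart-def"]}))
print(api_server("checkout"))  # -> [{'service': 'checkout', 'type': 'http_request'}]
```

The key property the sketch mirrors is the separation of roles: agents only write locally, the exporter is the single component that talks to the backend, and readers (the API server) never touch the in-cluster buffers.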
The following diagram shows the high-level architecture of Metoro:
Why does an AI SRE need a telemetry layer?
Most teams have a mix of instrumented and non-instrumented services, and Metoro’s telemetry layer bridges that gap. Metoro’s AI agents work best when they can reason over complete, correlated production context. By collecting telemetry directly from the Linux kernel with eBPF and enriching it with Kubernetes metadata, Metoro can investigate across logs, metrics, traces, infrastructure state, pod health, and recent changes without requiring every service to be correctly instrumented first.
That matters for root cause analysis. Instead of asking an agent to infer a cause from partial traces/metrics, Metoro gives it the underlying telemetry and cluster state needed to find the root cause accurately.
Metoro is also a full observability platform in its own right. It collects telemetry directly from the Linux kernel with eBPF, enriches it with Kubernetes metadata, and serves as a central place for monitoring, investigating issues, and running AI SRE workflows. Teams that don’t need AI SRE workflows can use it as a standalone observability platform.