Kubernetes Observability: The Complete Guide
Learn what Kubernetes observability is and how to implement effective observability for your k8s clusters.
Kubernetes has become the de facto standard for container orchestration, but running applications on k8s introduces observability challenges that don't exist in traditional environments. You're no longer dealing with a single application on a single server. You have containers that can be scheduled anywhere, pods that come and go, nodes that might fail, and a control plane that orchestrates it all.
flowchart LR
subgraph Kubernetes Cluster
subgraph Workloads
APP[Applications]
end
subgraph Infrastructure
NODES[Nodes]
CP[Control Plane]
end
end
APP -->|Metrics| OBS[Observability Platform]
APP -->|Logs| OBS
APP -->|Traces| OBS
NODES -->|Node Metrics| OBS
CP -->|Events| OBS
Kubernetes observability is about understanding what's happening across all of these layers. It builds on the traditional pillars of observability (metrics, logs, traces, and profiling) but extends each one with Kubernetes-specific data sources and adds entirely new pillars like resource state tracking and cluster events.
In this guide, you'll learn:
- The six pillars of Kubernetes observability and how they differ from traditional monitoring
- How to collect metrics, logs, traces, and profiles from applications, containers, nodes, and the control plane
- When to build your own stack vs. use a platform (with a quiz to help you decide)
- What to alert on and how to set up effective monitoring for Kubernetes-specific issues
What is Kubernetes Observability?
At its core, Kubernetes observability follows the same principles as traditional application observability. You're collecting telemetry data to understand system behavior, debug issues, and ensure reliability. But in a Kubernetes environment, each pillar has additional data sources that are unique to the platform.
Traditional Pillars (extended for Kubernetes):
- Metrics - Application metrics, plus container metrics, node metrics, and control plane metrics
- Logs - Application logs, plus control plane logs and audit logs
- Traces - Application traces, plus control plane component traces
- Profiles - Application profiling, with eBPF-based profiling for any container
Kubernetes-Specific Pillars:
- Resource State - Tracking Deployments, ConfigMaps, Secrets, and other Kubernetes resources over time
- Kubernetes Events - Pod scheduling, image pulls, container lifecycle, scaling decisions
Let's break down what each pillar looks like in a Kubernetes context.
Metrics
In traditional environments, you collect application metrics to understand performance and health. In Kubernetes, you need metrics at multiple layers:
Application Metrics - The metrics your application exposes about its own behavior: request counts, latencies, error rates, business metrics. This is the same as in any other environment.
Container Metrics - Every application in Kubernetes runs inside a container, and containers expose a wealth of metrics about resource consumption. CPU usage, memory consumption, network I/O, filesystem reads and writes.
These metrics tell you how your application is actually running, not just what it reports about itself. A container hitting its memory limit behaves very differently than one with headroom.
Node Metrics - Containers run on nodes, and node health directly impacts the workloads running on them. CPU pressure, memory pressure, disk pressure, network saturation. When a node struggles, every pod on that node feels it.
Control Plane Metrics - The Kubernetes control plane itself (the API server, scheduler, controller manager, etcd) exposes metrics about cluster operations. How long are scheduling decisions taking? Is etcd healthy? Are API requests being throttled? These metrics help you understand cluster-level health.
Logs
Logging in Kubernetes extends beyond application logs to include the platform itself:
Application Logs - Standard stdout/stderr output from your containers. Kubernetes captures these automatically and stores them on the node filesystem. Structured logging (JSON format) makes these much more useful for querying and correlation.
Control Plane Logs - The components that run your cluster generate their own logs. The kubelet on each node, the API server, etcd, the scheduler, controller managers, and any operators you've installed. When something goes wrong at the platform level, these logs are where you'll find answers.
Audit Logs - The Kubernetes API server can log every request made to the cluster. Who created that deployment? When was that secret modified? Which service account deleted those pods? Audit logs are essential for security, compliance, and debugging permission-related issues.
Tracing
Distributed tracing helps you understand request flow through your services:
Application Tracing - Standard distributed tracing across your microservices. A request enters your system, flows through multiple services, and you need to understand the path it takes, where time is spent, and where failures occur.
Control Plane Tracing - Less commonly implemented, but the Kubernetes control plane components support OpenTelemetry tracing. This can help you understand internal cluster operations, particularly useful if you manage your own control plane or need to debug complex scheduling behavior. Note that managed Kubernetes services (EKS, GKE, AKS) typically don't expose full control plane tracing.
Profiling
Where metrics tell you that something is slow, profiling tells you why:
Application Profiling - CPU profiles, memory profiles, goroutine analysis (for Go applications), and similar data for other languages. Continuous profiling lets you understand resource consumption at the code level without waiting for an incident to attach a profiler.
In Kubernetes, profiling becomes particularly valuable because you can profile any container in your cluster, including third-party applications, using eBPF-based profilers that require no code changes.
Kubernetes Metadata
This is where Kubernetes observability diverges most significantly from traditional observability. Kubernetes maintains a rich set of metadata about your workloads that doesn't exist in other environments:
Resource State - Deployments, StatefulSets, DaemonSets, ReplicaSets, ConfigMaps, Secrets, Services, Ingresses. The current state of these resources, how they change over time, and how they relate to each other. When a deployment rolls out, you want to correlate that event with changes in your metrics and logs.
Kubernetes Events - The cluster generates events for significant occurrences: pods being scheduled, images being pulled, containers starting or being killed, volume mounts succeeding or failing, probes failing, HPA scaling decisions. Events are short-lived by default (often around an hour), so persisting them is critical for incident investigation.
Understanding what changed and when is often the key to debugging issues in Kubernetes. A spike in errors that started exactly when a ConfigMap was updated is a very different problem than one that correlates with increased traffic.
Choosing Your Approach
Before diving into implementation details for each pillar, consider what level of control and coverage you need. There's no single right answer here. The best approach depends on your team's size, expertise, and what you're trying to accomplish. You have 4 main options:
- Specialized Kubernetes Observability Platform
- General Full Observability Platform
- Single Pillar Providers
- DIY - Build Your Own Stack
Specialized Kubernetes Observability Platform
Platforms built specifically for Kubernetes observability. Examples include Metoro, Komodor, and Robusta.
Pros:
- Deep Kubernetes-native understanding
- Purpose-built for k8s workflows and concepts
- Often better correlation of k8s metadata with telemetry
- Pricing models that understand k8s scale patterns
Cons:
- May not cover non-k8s workloads
- Smaller ecosystems
Best for: Teams that are 80-100% Kubernetes and want fast time-to-value with deep Kubernetes-native insights.
General Full Observability Platform
General-purpose observability platforms that support Kubernetes among many other environments. Examples include Datadog, New Relic, Dynatrace, and Splunk.
Pros:
- Mature, feature-rich platforms
- Support for hybrid environments (not just k8s)
- Strong ecosystem and integrations
- Well-documented, large communities
Cons:
- Kubernetes support can feel bolted-on
- May require significant configuration for k8s-specific insights
- Pricing often not optimized for k8s cardinality
- Generic approach may miss k8s-specific context
Best for: Organizations with hybrid environments (Kubernetes + VMs + managed services) needing a single pane of glass.
Single Pillar Providers
Use a managed service for just one type of telemetry, usually logs or metrics. Examples include CloudWatch Logs, Papertrail, or Loggly for logs, and managed Prometheus offerings for metrics.
Pros:
- Lower complexity than full DIY
- Get value quickly for your most pressing need
- Can mix with DIY for other pillars
- Good stepping stone
Cons:
- Still need to solve other pillars separately
- No cross-pillar correlation
- May outgrow single-pillar solution
Best for: Teams with one pressing need (usually logs) who want quick wins before expanding.
DIY - Build Your Own Stack
Collect, store, and query each telemetry type yourself using open source tools. A typical stack might include Prometheus and Grafana for metrics, Loki for logs, and Jaeger for traces.
Pros:
- Full control over your stack
- Can optimize for specific needs
- Often cheaper at scale if you have the expertise
- No vendor lock-in
Cons:
- Significant operational overhead
- Need to integrate multiple tools yourself
- Correlation across pillars is your problem to solve
- Expertise required for each component
Best for: Teams with strong platform engineering bandwidth who need full control and cost optimization.
Find Your Observability Solution
Not sure which approach is right for you? Answer a few questions to get a recommendation.
How Each Pillar Works
The sections below break down each observability pillar into its two core components: collection (how you gather the data) and storage/querying (where it goes and how you access it).
Who should read this:
- DIY builders - You'll find implementation guidance for each pillar, including tool choices and architectural patterns.
- Provider evaluators - Understanding how things work under the hood helps you ask the right questions and compare platforms effectively.
- Platform users - Even if your provider handles everything, knowing what's happening behind the scenes helps you debug issues and optimize your setup.
If you've chosen a specialized Kubernetes observability platform that handles collection and storage for you, feel free to skip ahead to Monitoring & Alerting.
Metrics
Metrics are numerical measurements collected over time that tell you how your system is performing. In Kubernetes, you need metrics at multiple layers: your application, the containers it runs in, the nodes those containers run on, and the control plane that orchestrates everything.
Collection
In Kubernetes, there are two main paradigms for how applications emit metrics:
- Push-based - Applications actively send metrics to a collector (OpenTelemetry approach)
- Pull-based - A scraper periodically fetches metrics from application endpoints (Prometheus approach)
Both approaches work well in Kubernetes, but pull-based (Prometheus) is more common due to its deep integration with the k8s ecosystem.
Application Metrics
Let's look at both approaches in detail.
OpenTelemetry Push-Based Metrics
With OTEL, your application instruments metrics using the OpenTelemetry SDK and pushes them to an OTEL Collector. The collector then forwards metrics to your storage backend.
flowchart LR
subgraph Kubernetes Cluster
subgraph Deployment
A[App Pods]
end
subgraph DaemonSet
B[OTEL Collector]
end
end
A -->|push| B
B -->|export| C[Metrics Backend]
This approach is useful when:
- Your applications already use OpenTelemetry for tracing
- You want a unified instrumentation approach across metrics, traces, and logs
- You're running in environments where pull-based scraping is difficult
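To make the push model concrete, here's a minimal Go sketch that configures the OpenTelemetry SDK to push metrics over OTLP/gRPC. The collector endpoint (otel-collector:4317), meter name, and counter name are placeholders for illustration, not values from any particular setup.

package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// Export over OTLP/gRPC to a collector reachable inside the cluster.
	// A DaemonSet collector is often addressed via the node IP or a Service.
	exporter, err := otlpmetricgrpc.New(ctx,
		otlpmetricgrpc.WithEndpoint("otel-collector:4317"),
		otlpmetricgrpc.WithInsecure(),
	)
	if err != nil {
		panic(err)
	}

	// The periodic reader pushes accumulated metrics on an interval.
	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exporter)),
	)
	defer provider.Shutdown(ctx)
	otel.SetMeterProvider(provider)

	// Record a counter as requests are handled.
	meter := otel.Meter("my-service")
	requests, _ := meter.Int64Counter("http.server.requests")
	requests.Add(ctx, 1)
}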
Prometheus Pull-Based Metrics
With Prometheus, your application exposes metrics on an HTTP endpoint (typically /metrics).
A Prometheus server periodically scrapes these endpoints to collect metrics.
flowchart RL
subgraph Kubernetes Cluster
subgraph StatefulSet
P[Prometheus]
end
subgraph Deployment
A[App Pods]
end
subgraph Deployment2[Deployment]
K[kube-state-metrics]
end
end
P -->|scrape /metrics| A
P -->|scrape| K
This is the most common approach in Kubernetes because:
- Prometheus is the de facto standard for k8s metrics
- Many applications and libraries expose Prometheus metrics out of the box
- Service discovery in Kubernetes makes it easy to find scrape targets
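To make the pull model concrete, here's a minimal Go sketch using the Prometheus client library to expose a /metrics endpoint for scraping. The metric name and port are illustrative choices.

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestsTotal is registered with the default registry via promauto.
var requestsTotal = promauto.NewCounterVec(prometheus.CounterOpts{
	Name: "myapp_http_requests_total",
	Help: "Total HTTP requests handled, by path.",
}, []string{"path"})

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		requestsTotal.WithLabelValues(r.URL.Path).Inc()
		w.Write([]byte("ok"))
	})
	// Prometheus discovers this pod (e.g. via pod annotations or a
	// ServiceMonitor) and scrapes /metrics on its configured interval.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}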
For high-scale environments, consider Thanos or Cortex on top of Prometheus for long-term storage and multi-cluster queries.
Control Plane Metrics
The Kubernetes control plane components expose their own metrics in Prometheus format:
- kube-apiserver - API request latencies, request counts, etcd cache hits
- kube-scheduler - Scheduling latencies, queue depths, scheduling failures
- kube-controller-manager - Work queue depths, reconciliation times
- etcd - Disk sync durations, database size, leader changes
These metrics help you understand cluster-level health and catch issues before they affect workloads.
flowchart RL
subgraph Kubernetes Cluster
subgraph Control Plane
API[kube-apiserver]
SCHED[kube-scheduler]
CM[kube-controller-manager]
ETCD[etcd]
end
subgraph Monitoring
P[Prometheus]
end
end
P -->|scrape /metrics| API
P -->|scrape /metrics| SCHED
P -->|scrape /metrics| CM
P -->|scrape /metrics| ETCD
Container Metrics
Container metrics come from the container runtime and tell you about resource consumption at the container level. cAdvisor (Container Advisor) is typically integrated with the kubelet and exposes metrics like:
- CPU usage (user, system, throttling)
- Memory usage (working set, RSS, cache)
- Network I/O (bytes sent/received, packets, errors)
- Filesystem I/O (reads, writes, usage)
These metrics are critical because they show you what's actually happening at runtime, not just what your application reports. A container approaching its memory limit will behave differently than one with headroom, even if your application metrics look normal.
cAdvisor collects container metrics by reading from Linux cgroups and the /proc filesystem.
Every container on a node runs inside a cgroup, which the kernel uses to track and limit resource consumption.
cAdvisor watches these cgroups and periodically samples CPU cycles, memory pages, network packets, and disk operations.
In most Kubernetes distributions, cAdvisor is integrated with the kubelet and automatically discovers all containers on the node without any configuration.
Prometheus scrapes the kubelet's /metrics/cadvisor endpoint to collect these metrics, which are already labeled with pod name, namespace, and container name for easy correlation.
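If you're curious where those numbers actually come from, this small Go sketch reads the same kernel accounting files, assuming a node using cgroup v2 (the unified hierarchy); paths differ on cgroup v1.

package main

import (
	"fmt"
	"os"
)

func main() {
	// With cgroup v2 and cgroup namespaces, /sys/fs/cgroup inside a
	// container refers to that container's own cgroup subtree.
	mem, err := os.ReadFile("/sys/fs/cgroup/memory.current") // current memory usage in bytes
	if err != nil {
		panic(err)
	}
	cpu, err := os.ReadFile("/sys/fs/cgroup/cpu.stat") // cumulative CPU time in microseconds
	if err != nil {
		panic(err)
	}
	fmt.Printf("memory.current: %s", mem)
	fmt.Printf("cpu.stat:\n%s", cpu)
}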
flowchart RL
subgraph Node
subgraph Pods
C1[Container]
C2[Container]
end
subgraph Kubelet
CA[cAdvisor]
end
end
subgraph Monitoring
P[Prometheus]
end
C1 -.->|resource usage| CA
C2 -.->|resource usage| CA
P -->|scrape /metrics/cadvisor| CA
Node Metrics
Node-level metrics come from the node itself and include:
- CPU utilization across all cores
- Memory pressure and availability
- Disk I/O and capacity
- Network bandwidth and errors
The node-exporter (a Prometheus exporter) is commonly deployed as a DaemonSet to collect these metrics from every node. Node metrics help you understand infrastructure-level constraints that affect all pods on that node.
flowchart RL
subgraph Node
NE[node-exporter]
HW[CPU / Memory / Disk / Network]
end
subgraph Monitoring
P[Prometheus]
end
HW -.->|expose| NE
P -->|scrape /metrics| NE
Third-Party Exporters
Many third-party components don't expose Prometheus metrics natively. Exporters bridge this gap by collecting metrics from these systems and exposing them in Prometheus format.
Common exporters include:
- redis-exporter - Redis server metrics
- mysql-exporter - MySQL database metrics
- postgres-exporter - PostgreSQL metrics
- kafka-exporter - Kafka broker and consumer metrics
- nginx-exporter - NGINX web server metrics
Storage and Querying
You have two main choices for metrics storage: in-cluster or out-of-cluster.
In-Cluster Storage
Running your metrics storage inside the cluster.
Pros:
- Generally cheaper (no external service costs)
- Complete control over infrastructure
- Self-contained, no external dependencies
- Lower latency for queries
Cons:
- Cluster issues affect your observability data
- High-scale storage is complex to maintain
- Persistent storage in k8s can be challenging
- Difficult to aggregate across multiple clusters
Common in-cluster options:
- Prometheus - The standard choice, good for small to medium scale
- VictoriaMetrics - More efficient storage, better for high cardinality
- Mimir - Horizontally scalable, good for large deployments
Out-of-Cluster Storage
Sending metrics to an external service or self-hosted infrastructure outside the cluster.
This approach provides resilience (your metrics survive cluster failures) and makes it easier to aggregate data from multiple clusters. The tradeoff is cost and potential latency.
Options include managed Prometheus services (like Grafana Cloud, Amazon Managed Prometheus) or general observability platforms that accept Prometheus remote write.
Putting It Together
A typical Kubernetes metrics architecture combines multiple collection methods. Your application pods push OTLP metrics to an OpenTelemetry Collector running as a DaemonSet. The collector aggregates and processes these metrics, then exposes them on a Prometheus-compatible endpoint.
Prometheus sits at the center, scraping metrics from multiple sources:
- The OTEL Collector for application metrics
- cAdvisor (via the kubelet) for container resource metrics
- node-exporter for host-level metrics
- kube-state-metrics for Kubernetes object state
- The control plane components for cluster health
Finally, Prometheus uses remote write to send metrics to your storage backend, whether that's in-cluster (VictoriaMetrics, Mimir) or a managed service.
flowchart LR
subgraph Kubernetes Cluster
subgraph Workloads
APP[App Pods]
end
subgraph DaemonSets
OTEL[OTEL Collector]
NE[node-exporter]
end
subgraph Kubelet
CA[cAdvisor]
end
subgraph Control Plane
API[kube-apiserver]
end
subgraph Monitoring
KSM[kube-state-metrics]
P[Prometheus]
end
end
APP -->|push OTLP| OTEL
OTEL -->|expose /metrics| P
P -->|scrape| CA
P -->|scrape| NE
P -->|scrape| API
P -->|scrape| KSM
P -->|remote write| Backend[Metrics Backend]
Tracing
Distributed tracing helps you understand request flow across services. When a request enters your system and touches multiple microservices, tracing shows you the complete path, where time is spent, and where failures occur.
Collection
Application Tracing
OpenTelemetry is the modern standard for distributed tracing. If you encounter references to OpenTracing or the legacy Jaeger client libraries, those are deprecated - new projects should use OpenTelemetry.
There are three main approaches to instrumenting your applications:
Manual Instrumentation
You add tracing code explicitly using the OpenTelemetry SDK. This gives you the most control over what gets traced and what attributes are attached to spans.
import "go.opentelemetry.io/otel"

// Start a span around the unit of work; End records its duration.
tracer := otel.Tracer("my-service")
ctx, span := tracer.Start(ctx, "process-order")
defer span.End()
// your code here
Manual instrumentation is essential for capturing business-specific context, but it requires code changes and developer discipline to maintain coverage.
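The snippet above assumes a global tracer provider has already been configured. Here's a hedged sketch of that setup, exporting spans to a collector over OTLP/gRPC; the otel-collector:4317 endpoint and the service name are assumptions, not fixed values.

package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func initTracing(ctx context.Context) (*sdktrace.TracerProvider, error) {
	// Export spans over OTLP/gRPC to the in-cluster collector.
	exporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("otel-collector:4317"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}

	// Batch spans before export and register the provider globally,
	// so otel.Tracer("my-service") picks it up.
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	otel.SetTracerProvider(tp)
	return tp, nil
}

func main() {
	ctx := context.Background()
	tp, err := initTracing(ctx)
	if err != nil {
		panic(err)
	}
	defer tp.Shutdown(ctx)
	// application code using otel.Tracer(...) goes here
}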
Auto-instrumentation
For dynamic languages like Python, Java, and Node.js, OpenTelemetry provides agents that automatically instrument common libraries and frameworks. You get tracing for HTTP clients, database drivers, and message queues without changing your application code.
This approach gives you good baseline coverage quickly. The trade-off is less control over span names and attributes, and it only works for supported libraries.
eBPF-based Tracing
eBPF-based tracing instruments applications at the kernel level, requiring no code changes or language-specific agents. This is particularly powerful in Kubernetes because you can trace any container, including third-party applications deployed via Helm charts that you don't control.
flowchart LR
subgraph Node
subgraph Pods
APP[Your App]
THIRD[Third-Party App]
end
subgraph DaemonSet
EBPF[eBPF Agent]
end
end
APP -.->|kernel tracing| EBPF
THIRD -.->|kernel tracing| EBPF
EBPF -->|export| Backend[Trace Backend]
The eBPF agent runs as a DaemonSet and observes network calls and function invocations at the kernel level. This means you get traces for Redis, PostgreSQL, HTTP, and gRPC calls regardless of what language your application is written in.
Control Plane Tracing
The Kubernetes control plane components (API server, scheduler, controller manager) support OpenTelemetry tracing via OTLP. This is less commonly needed than application tracing, but can be valuable if you manage your own control plane or need to debug complex scheduling behavior. Note that managed Kubernetes services often don't expose control plane tracing, so this is primarily relevant for self-managed clusters.
Control plane traces help you understand internal cluster operations: how long scheduling decisions take, what's happening during API request processing, and where bottlenecks exist in cluster operations.
Storage and Querying
In-Cluster Storage
Running your trace storage inside the cluster.
Common options:
- Jaeger - CNCF graduated project, widely adopted, good UI for trace exploration
- Tempo - Grafana's tracing backend, pairs well with Grafana for visualization
- Zipkin - One of the original distributed tracing systems, still used but less common for new deployments
Out-of-Cluster Storage
Sending traces to an external service provides the same benefits as out-of-cluster metrics storage: resilience to cluster failures and easier multi-cluster aggregation.
Most observability platforms accept OTLP traces, and there are managed Jaeger and Tempo offerings available.
Putting It Together
A typical Kubernetes tracing architecture combines SDK-based instrumentation for your own applications with eBPF-based tracing for third-party workloads.
Applications instrumented with OpenTelemetry send traces to an OTEL Collector running as a DaemonSet. The collector batches, processes, and exports traces to your backend. Meanwhile, an eBPF agent captures traces from uninstrumented workloads at the kernel level. If you've enabled control plane tracing, those components also send traces to the collector.
flowchart LR
subgraph Kubernetes Cluster
subgraph Workloads
APP[Instrumented Apps]
THIRD[Third-Party]
end
subgraph DaemonSets
OTEL[OTEL Collector]
EBPF[eBPF Agent]
end
subgraph Control Plane
API[kube-apiserver]
end
end
APP -->|OTLP| OTEL
THIRD -.->|kernel| EBPF
API -->|OTLP| OTEL
OTEL -->|export| Backend[Trace Backend]
EBPF -->|export| Backend
Logs
Logging in Kubernetes follows a different model than traditional environments. Instead of writing to files that you configure, containers write to stdout and stderr, and Kubernetes captures these streams to the node filesystem.
Collection
Application Logs
When a container writes to stdout or stderr, Kubernetes captures that output and stores it on the node at /var/log/containers/.
This happens automatically - you don't need to configure anything for basic log capture.
Structured Logging
If you have control over your applications, emit logs in JSON format.
Structured logs are dramatically easier to query, filter, and correlate than plain text.
Instead of parsing regex patterns, you can filter on fields like level, service, request_id, or any business-specific attribute.
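In Go, for example, the standard library's log/slog package gets you JSON logs to stdout in a couple of lines. A minimal sketch, with illustrative field names:

package main

import (
	"log/slog"
	"os"
)

func main() {
	// JSON to stdout: Kubernetes captures it and your log agent
	// can index the fields without regex parsing.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	logger.Info("order processed",
		slog.String("service", "checkout"),
		slog.String("request_id", "abc-123"),
		slog.Int("items", 3),
	)
}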
Node-level Agents (Recommended)
The standard approach is to run a log collection agent as a DaemonSet.
The agent tails log files from /var/log/containers/ on each node and forwards them to your storage backend.
flowchart LR
subgraph Node
subgraph Pod
C[Container]
end
FS[/var/log/containers/]
subgraph DaemonSet
FB[Fluent Bit]
end
end
C -->|stdout/stderr| FS
FB -->|tail| FS
FB -->|export| Backend[Log Backend]
Common options:
- Fluent Bit - Lightweight, low resource footprint, good for most use cases
- Fluentd - More plugins and flexibility, higher resource usage
- Vector - Modern alternative with good performance
Sidecar Pattern
For cases where you need different log handling per pod (different parsing, different destinations), you can run a logging sidecar container alongside your application. This adds resource overhead and complexity, so only use it when node-level collection doesn't meet your needs.
Control Plane Logs
The Kubernetes control plane components generate their own logs:
- kube-apiserver - API request handling, authentication, authorization
- kube-scheduler - Scheduling decisions and failures
- kube-controller-manager - Controller reconciliation loops
- etcd - Cluster state storage operations
- kubelet - Pod lifecycle, container operations on each node
For self-managed clusters, access these via journalctl (for systemd-managed components) or /var/log/ (for static pods).
For managed Kubernetes (EKS, GKE, AKS), control plane logs often need to be explicitly enabled and are delivered to the cloud provider's logging service. Don't assume they're available by default.
Audit Logs
The Kubernetes API server can log every request made to the cluster. Audit logs tell you who did what, when, and to which resources.
Audit policy levels control how much detail is captured:
- None - Don't log this event
- Metadata - Log request metadata (user, timestamp, resource) but not request/response bodies
- Request - Log metadata and request body
- RequestResponse - Log everything including response bodies
Audit logs are essential for:
- Security - Detecting unauthorized access attempts or suspicious activity
- Compliance - Meeting regulatory requirements for access logging
- Debugging - Understanding who modified a resource and when
Storage and Querying
In-Cluster Storage
Common options:
- Loki - Grafana's log aggregation system, uses label-based indexing rather than full-text indexing, lightweight and cost-effective
- Elasticsearch/OpenSearch - Full-text search capabilities, more powerful querying but higher resource requirements
Out-of-Cluster Storage
Cloud provider logging services (CloudWatch Logs, Google Cloud Logging, Azure Monitor) integrate well with managed Kubernetes offerings. Managed Loki and Elasticsearch services are also available if you want the flexibility without the operational overhead.
Putting It Together
A typical Kubernetes logging architecture has a DaemonSet agent on each node collecting application logs from the filesystem. The same agent can collect control plane logs and receive audit logs from the API server. All logs flow to a central backend for storage and querying.
flowchart LR
subgraph Kubernetes Cluster
subgraph Node
subgraph Pods
APP[App Containers]
end
FS[Node Filesystem]
subgraph DaemonSet
FB[Fluent Bit]
end
end
subgraph Control Plane
API[kube-apiserver]
SCHED[kube-scheduler]
end
end
APP -->|stdout/stderr| FS
FB -->|collect| FS
API -->|audit logs| FB
SCHED -->|logs| FS
FB -->|export| Backend[Log Backend]
Profiling
Metrics tell you what is slow. Profiling tells you why.
When your service latency spikes, metrics show you the symptom. Profiling shows you the exact function consuming CPU cycles, the memory allocation causing garbage collection pressure, or the goroutine blocking on a lock.
Modern continuous profiling has low overhead (typically 1-5%), making it safe to run in production. This means you can capture profiles before an incident occurs, not scramble to attach a profiler while the issue is happening.
Collection
Application Profiling
pprof Endpoints
Go applications can expose profiling endpoints at /debug/pprof/ via the standard library's net/http/pprof package.
You can capture:
- CPU profiles - Where CPU time is spent
- Heap profiles - Memory allocation patterns
- Goroutine profiles - What goroutines are doing
- Block profiles - Where goroutines block on synchronization
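Enabling these endpoints is a blank import plus an HTTP listener. A minimal sketch (port 6060 is a common convention, not a requirement):

package main

import (
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
)

func main() {
	// CPU profiles at /debug/pprof/profile, heap at /debug/pprof/heap,
	// goroutines at /debug/pprof/goroutine, and so on.
	http.ListenAndServe(":6060", nil)
}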
Other languages have equivalent tools:
- Python - py-spy for sampling profiler
- Java - async-profiler for low-overhead profiling
- Node.js - Built-in profiler or clinic.js
Continuous Profiling Agents
Rather than manually capturing profiles, continuous profiling agents periodically collect profiles and send them to a backend for storage and analysis.
Common agents:
- Pyroscope agent - Supports multiple languages, integrates with Pyroscope server
- Parca agent - eBPF-based, very low overhead
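For Go services you control, the Pyroscope Go SDK can push profiles directly from the process. A rough sketch, assuming a Pyroscope server reachable at http://pyroscope:4040 and an illustrative application name:

package main

import (
	pyroscope "github.com/grafana/pyroscope-go"
)

func main() {
	// Continuously sample this process and push profiles to the server.
	_, err := pyroscope.Start(pyroscope.Config{
		ApplicationName: "checkout-service",
		ServerAddress:   "http://pyroscope:4040",
	})
	if err != nil {
		panic(err)
	}
	select {} // keep the process alive; a real service runs its workload here
}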
eBPF-based Profiling
eBPF-based profilers work at the kernel level, capturing CPU samples from any process without requiring application changes. This is particularly valuable in Kubernetes because you can profile any container - including third-party applications, databases, and message queues that you deploy via Helm charts but don't have source code access to.
flowchart LR
subgraph Node
subgraph Pods
APP[Your App]
THIRD[Third-Party App]
end
subgraph DaemonSet
EBPF[eBPF Profiler]
end
end
APP -.->|kernel sampling| EBPF
THIRD -.->|kernel sampling| EBPF
EBPF -->|export| Backend[Profile Backend]
The trade-off compared to language-specific profilers is less detail. eBPF profilers typically capture CPU profiles but may not have visibility into language-specific constructs like goroutines or garbage collection. For applications you control, language-native profiling gives richer data. For everything else, eBPF profiling gives you visibility you wouldn't otherwise have.
Storage and Querying
In-Cluster Storage
- Pyroscope - Open source continuous profiling platform with good visualization
- Parca - Cloud-native profiling, pairs well with Parca agent
Out-of-Cluster Storage
Several observability platforms now offer continuous profiling as part of their product. Cloud provider offerings and managed Pyroscope instances are also available.
Putting It Together
A typical Kubernetes profiling setup combines language-native profiling for applications you control with eBPF-based profiling for third-party workloads.
Applications with pprof endpoints or language agents send profiles to a continuous profiling agent. The eBPF profiler running as a DaemonSet captures CPU profiles from all containers on each node. Both streams flow to a profile backend for storage, analysis, and flame graph visualization.
flowchart LR
subgraph Kubernetes Cluster
subgraph Workloads
GO[Go Apps]
OTHER[Other Apps]
THIRD[Third-Party]
end
subgraph DaemonSets
AGENT[Profiling Agent]
EBPF[eBPF Profiler]
end
end
GO -->|pprof| AGENT
OTHER -->|language agent| AGENT
THIRD -.->|kernel| EBPF
AGENT -->|export| Backend[Profile Backend]
EBPF -->|export| Backend
Kubernetes Metadata
This pillar is unique to Kubernetes. Traditional observability doesn't have an equivalent because traditional environments don't have the same concept of declarative resource state managed by an orchestrator.
In Kubernetes, understanding what changed is often the key to debugging issues. A spike in errors that started exactly when a ConfigMap was updated is a very different problem than one that correlates with increased traffic.
Resource State
Kubernetes maintains state for many resource types: Deployments, StatefulSets, DaemonSets, ReplicaSets, ConfigMaps, Secrets, Services, Ingresses, and more. Tracking how these resources change over time lets you correlate application issues with platform changes.
kube-state-metrics
kube-state-metrics is a service that watches the Kubernetes API and exposes cluster state as Prometheus metrics. It gives you metrics like:
- Deployment replica counts (desired vs available)
- Pod phases (Pending, Running, Failed)
- Container states (waiting, running, terminated)
- Resource requests and limits
- Node conditions
flowchart LR
subgraph Kubernetes Cluster
subgraph API
APIS[kube-apiserver]
end
subgraph Deployment
KSM[kube-state-metrics]
end
subgraph Monitoring
P[Prometheus]
end
end
APIS -->|watch resources| KSM
KSM -->|expose /metrics| P
This data is essential for understanding cluster state over time. When you see a latency spike at 2:34 PM, you can check if a deployment rolled out, if pod counts changed, or if resource limits were modified around that time.
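You don't need to build this yourself, since kube-state-metrics already does it, but as an illustration of the underlying idea, here's a hedged Go sketch that lists Deployments with client-go and exposes desired vs. available replicas as Prometheus gauges. The metric names, poll interval, and in-cluster configuration are assumptions.

package main

import (
	"context"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

var (
	desired = promauto.NewGaugeVec(prometheus.GaugeOpts{
		Name: "demo_deployment_replicas_desired",
		Help: "Desired replicas per Deployment.",
	}, []string{"namespace", "deployment"})
	available = promauto.NewGaugeVec(prometheus.GaugeOpts{
		Name: "demo_deployment_replicas_available",
		Help: "Available replicas per Deployment.",
	}, []string{"namespace", "deployment"})
)

func main() {
	// Assumes the pod runs in-cluster with permission to list Deployments.
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	go func() {
		for {
			deps, err := client.AppsV1().Deployments("").List(context.Background(), metav1.ListOptions{})
			if err == nil {
				for _, d := range deps.Items {
					if d.Spec.Replicas != nil {
						desired.WithLabelValues(d.Namespace, d.Name).Set(float64(*d.Spec.Replicas))
					}
					available.WithLabelValues(d.Namespace, d.Name).Set(float64(d.Status.AvailableReplicas))
				}
			}
			time.Sleep(30 * time.Second)
		}
	}()

	// Prometheus scrapes the gauges from here.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}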
Kubernetes Events
Kubernetes generates events for significant occurrences in the cluster:
- Scheduling - Pod assigned to node, or scheduling failed
- Image pulls - Successfully pulled image, or pull failed
- Container lifecycle - Container started, killed, OOMKilled
- Scaling - HPA scaled deployment up or down
- Volume mounting - Volume attached successfully or failed
- Probes - Liveness or readiness probe failed
Events are incredibly valuable for debugging, but there's a critical gotcha: Kubernetes only retains events for a short period by default (often around an hour, though this varies by cluster configuration). If you're investigating an incident that happened yesterday, the events are gone.
To preserve events, deploy an event exporter that watches for events and forwards them to your storage backend.
flowchart LR
subgraph Kubernetes Cluster
subgraph API
APIS[kube-apiserver]
end
subgraph Deployment
EXP[Event Exporter]
end
end
APIS -->|watch events| EXP
EXP -->|export| Backend[Event Storage]Tools like kubernetes-event-exporter can send events to Elasticsearch, Loki, or webhook endpoints.
Change Tracking
Beyond raw events, you want to track higher-level changes: when deployments rolled out, what changed in the configuration, and who made the change.
If you're using GitOps tools like ArgoCD or Flux, you get audit trails automatically - every change is a git commit with a timestamp and author. For imperative changes (kubectl apply), consider recording deployment timestamps in your observability platform so you can overlay them on your dashboards.
The goal is to answer the question: "What changed around the time this incident started?" When you can quickly correlate a latency spike with a ConfigMap update or a new deployment, you've found your debugging starting point.
Monitoring & Alerting
Collecting observability data is only useful if you act on it. Monitoring and alerting turn your telemetry into actionable insights and wake you up when things go wrong.
What to Alert On
The Four Golden Signals
Google's SRE book introduced the four golden signals as the essential metrics for monitoring any system:
- Latency - How long requests take, particularly distinguishing between successful and failed requests
- Traffic - How much demand is hitting your system (requests per second, transactions, etc.)
- Errors - The rate of failed requests, whether explicit (5xx errors) or implicit (wrong content, slow responses)
- Saturation - How full your resources are (CPU, memory, disk, network)
These signals apply to any service, but in Kubernetes you have additional platform-specific concerns.
Kubernetes-Specific Alerts
These alerts don't exist in traditional environments. Some examples:
- Pod restarts / CrashLoopBackOff - Containers repeatedly crashing indicate application bugs or misconfiguration
- OOMKilled containers - Memory limits being hit, need to increase limits or fix memory leaks
- Node NotReady - Nodes failing health checks, potential infrastructure issues
- Node pressure conditions - Memory, disk, or PID pressure on nodes affecting all pods
- Pending pods - Pods that can't be scheduled due to resource constraints or affinity rules
- Failed deployments - Rollouts stuck or rolled back
- PVC binding failures - Persistent volume claims that can't find matching volumes
- Certificate expiration - If using cert-manager, alert before certificates expire
- Control plane health - API server latency, etcd health, scheduler queue depth
Application-Level Alerts
Beyond the golden signals, consider alerts specific to your application's behavior:
- Error rate thresholds - Alert when 5xx errors exceed a percentage (e.g., >1% of requests)
- Latency percentiles - Alert on p99 latency rather than averages to catch tail latency issues
- Queue depth / consumer lag - Message queues backing up indicate processing problems
- Database connection pool exhaustion - Running out of connections causes cascading failures
- Business metrics - Orders per minute dropping, signup conversion rate changing
Alerting Tools
DIY Stack
If you're running your own observability infrastructure:
- Prometheus Alertmanager - Receives alerts from Prometheus, handles deduplication, grouping, routing, and silencing
- Grafana Alerting - Unified alerting across multiple data sources, can alert on Prometheus, Loki, and other backends
Both support routing alerts to different channels based on severity and labels.
flowchart LR
subgraph Kubernetes Cluster
subgraph Monitoring
P[Prometheus]
AM[Alertmanager]
end
end
P -->|alert rules| AM
AM -->|critical| PD[PagerDuty]
AM -->|warning| Slack[Slack]
AM -->|all| Email[Email]
Integration with Incident Management
Alerts need to reach the right people through the right channels:
- Critical alerts → PagerDuty, OpsGenie, or similar for on-call rotation and escalation
- Warning alerts → Slack or Teams channels for awareness without paging
- Runbook links → Include links to documentation in alert annotations so responders know what to do
flowchart LR
subgraph Alerting
AM[Alertmanager]
end
subgraph Incident Management
PD[PagerDuty/OpsGenie]
SLACK[Slack/Teams]
end
subgraph Context
RB[Runbooks]
DASH[Dashboards]
end
AM -->|critical| PD
AM -->|warning| SLACK
PD -.->|links to| RB
PD -.->|links to| DASH
The goal is to reduce mean time to resolution (MTTR) by giving responders immediate context.
Platform Alerting
Most observability platforms include built-in alerting with Kubernetes-specific templates. This can accelerate your setup since you get pre-configured alerts for common issues like CrashLoopBackOff, OOMKilled, and node pressure without writing rules from scratch.
Conclusion
Kubernetes observability isn't a fundamentally different discipline - it's traditional observability adapted for a container orchestration environment. You're still collecting metrics, logs, traces, and profiles. The difference is where that data comes from and the additional context Kubernetes provides.
The platform adds two pillars that don't exist elsewhere: resource state (tracking Deployments, ConfigMaps, and other Kubernetes objects over time) and Kubernetes events (scheduling decisions, container lifecycle, scaling actions). These pillars are often the key to understanding why something went wrong, not just that it went wrong.
Collection looks different too. DaemonSets replace traditional agents. eBPF gives you visibility into any container without code changes. kube-state-metrics exposes cluster state as Prometheus metrics. The kubelet's cAdvisor endpoint provides container resource consumption. These are Kubernetes-specific mechanisms, but they're serving the same goal: getting the telemetry you need to understand system behavior.
Whether you build your own stack or use a platform, the principles remain the same. Collect the right data, store it somewhere you can query it, alert on what matters, and make sure you can correlate across pillars when debugging. Kubernetes just gives you more data sources to work with - and more context to help you find answers faster.