CPU Throttling Detection

The CPU Throttling Detection workflow monitors your Kubernetes services for CPU throttling events and creates issues when services experience significant throttling. This helps you identify when services are being constrained by their CPU limits and take corrective action.

How it Works

The workflow monitors two key metrics:

- `container_resources_cpu_throttled_seconds_total`: Measures the time a container spends throttled due to CPU limits
- `container_resources_cpu_usage_seconds_total`: Measures the total CPU time used by the container

When the ratio of throttled time to CPU usage time over the evaluation window meets or exceeds a configured threshold, the workflow creates an issue to alert you to potential CPU constraints.
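The ratio can be illustrated with a minimal Python sketch. It assumes both metrics are monotonically increasing counters sampled at the start and end of a window; the function names and the counter-reset handling are illustrative assumptions, not the workflow's actual implementation.

```python
from typing import Optional


def counter_delta(start: float, end: float) -> float:
    """Increase of a cumulative counter between two samples."""
    # Assumption: a counter that goes backwards was reset (for example by a
    # container restart), so the end value is taken as the increase.
    return end - start if end >= start else end


def throttle_ratio(
    throttled_start: float,
    throttled_end: float,
    usage_start: float,
    usage_end: float,
) -> Optional[float]:
    """Throttled seconds divided by CPU seconds used over the same window."""
    throttled = counter_delta(throttled_start, throttled_end)
    usage = counter_delta(usage_start, usage_end)
    if usage <= 0:
        # No CPU time was used, so the ratio is undefined.
        return None
    return throttled / usage


# 540 throttled seconds against 3600 CPU seconds used gives a ratio of 0.15 (15%).
ratio = throttle_ratio(100.0, 640.0, 1000.0, 4600.0)
```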

Configuration

The workflow can be configured with the following parameters:

Parameter	Type	Description	Default
`mediumThrottleThreshold`	float	Minimum throttling ratio (throttle time / CPU time) to create a medium severity issue	0.05 (5%)
`highThrottleThreshold`	float	Minimum throttling ratio to create a high severity issue	0.10 (10%)
`minCpuSeconds`	float	Minimum CPU seconds used in the time window before considering throttling issues	3600 (1 hour)

Issue Details

When an issue is created, it includes:

- The service and environment experiencing CPU throttling
- The throttling ratio (percentage of CPU time spent throttled)
- The severity level based on the throttling ratio
- A visualization showing:
  - CPU throttling over time
  - CPU usage patterns

Example Issue

Here’s an example of an issue created by the CPU Throttling Detection workflow:

Title: CPU Throttling Detected: my-service (production)

Service my-service (production environment) is experiencing severe CPU throttling (15.0% of CPU time). 
This indicates that the service is being significantly constrained by CPU limits.

Severity Levels

The workflow assigns severity levels based on the throttling ratio:

- Medium: When the throttling ratio meets or exceeds `mediumThrottleThreshold` (default: 5%) but is below `highThrottleThreshold`
- High: When the throttling ratio meets or exceeds `highThrottleThreshold` (default: 10%)
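The severity rules can be sketched as a small Python function. The defaults mirror the configuration table; the function name, the return values, and the assumption that `minCpuSeconds` is checked before the thresholds are illustrative rather than taken from the workflow itself.

```python
from typing import Optional


def classify_severity(
    ratio: float,
    cpu_seconds: float,
    medium_threshold: float = 0.05,   # mediumThrottleThreshold
    high_threshold: float = 0.10,     # highThrottleThreshold
    min_cpu_seconds: float = 3600.0,  # minCpuSeconds
) -> Optional[str]:
    """Return "high", "medium", or None when no issue would be created."""
    # Assumption: services that used less CPU time than the minimum are skipped.
    if cpu_seconds < min_cpu_seconds:
        return None
    # Check the higher threshold first so a ratio above both maps to "high".
    if ratio >= high_threshold:
        return "high"
    if ratio >= medium_threshold:
        return "medium"
    return None
```

Under these assumptions, a ratio of 0.15 over 7200 CPU seconds is classified as high, while the same ratio over only 100 CPU seconds produces no issue.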

Understanding CPU Throttling

CPU throttling in Kubernetes can be counterintuitive. Even if your average CPU usage is under the limit, you can still experience throttling, because Kubernetes enforces CPU limits as a quota of CPU time per fixed period (the Linux CFS quota mechanism):

- The default quota period is 100ms
- For example, with a 50m (millicores) CPU limit:
  - The container gets a 5ms CPU quota per 100ms period
  - If the container needs more than 5ms of CPU in any 100ms period, it gets throttled
  - This happens even if the average CPU usage over longer periods is below the limit

This is particularly problematic for request-handling services because throttling manifests as increased latency.
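The quota arithmetic and the bursty-workload effect can be reproduced with a simplified Python model. It is a sketch under stated assumptions: demand is given per 100ms period, unfinished work carries over to the next period, and a period counts as throttled when work is left over after the quota is spent. The function names are hypothetical, and the real kernel scheduler is more involved.

```python
from typing import List, Tuple


def quota_ms(limit_millicores: float, period_ms: float = 100.0) -> float:
    """CPU time (ms) the container may use in each quota period."""
    return limit_millicores * period_ms / 1000.0


def simulate(
    demand_ms: List[float],
    limit_millicores: float,
    period_ms: float = 100.0,
) -> Tuple[int, float]:
    """Return (throttled period count, average usage in millicores)."""
    quota = quota_ms(limit_millicores, period_ms)
    backlog = 0.0  # work that is waiting for CPU time
    used = 0.0     # CPU time actually consumed
    throttled_periods = 0
    for demand in demand_ms:
        backlog += demand
        ran = min(backlog, quota)
        used += ran
        backlog -= ran
        if backlog > 0:
            # The quota ran out before the pending work was finished.
            throttled_periods += 1
    average_millicores = used * 1000.0 / (len(demand_ms) * period_ms)
    return throttled_periods, average_millicores


# One 20ms burst per second, idle otherwise, under a 50m limit.
burst = [20.0] + [0.0] * 9
throttled, average = simulate(burst, limit_millicores=50)
```

In this model the burst averages 20m, well under the 50m limit, yet three of the ten periods are throttled and the 20ms of work takes four periods to complete, which is the latency effect described above.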
