Alert Examples

Here are some example alerts you can set up in Metoro to monitor your Kubernetes infrastructure and applications. These examples cover various scenarios, including CPU usage, error rates, and latency.

If you are using Kubernetes ConfigMaps to manage your alerts, you can define these alerts in a ConfigMap and apply it to your cluster. Make sure to include the label metoro.io/alert: "true" in your ConfigMap.

Complete Example ConfigMap

You can use the following ConfigMap to set up all example alerts at once:

kind: ConfigMap
apiVersion: v1
metadata:
  name: alert-config
  labels:
    metoro.io/alert: "true"
data:
  alert.yaml: |
    alerts:
      - metadata:
          id: "cpu-usage-alert-001"
          name: "High CPU Usage"
          description: "Alert when CPU usage exceeds 80% of the CPU limit for 5 minutes"
        type: timeseries
        timeseries:
          expression:
            metoroQLTimeseries:
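              # CPU-seconds used per 60-second bucket / 60 gives the average cores in use;
              # dividing by the core limit gives usage as a fraction of that limit.
              # Assumption: 1.0 here means 100%, so confirm the threshold uses the same unit.
              query: sum(container_resources_cpu_usage_seconds_total{service_name="/k8s/default/myimportantservice"}) / 60 / sum(container_resources_cpu_limit_cores{service_name="/k8s/default/myimportantservice"})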
              bucketSize: 60
          evaluationRules:
            - name: "warning"
              type: static
              static:
                operators:
                  - operator: greaterThan
                    threshold: 80
                persistenceSettings:
                  datapointsToAlarm: 5
                  datapointsInEvaluationWindow: 5
                  missingDatapointBehavior: notBreaching
      
      - metadata:
          id: "error-log-alert-001"
          name: "High Error Rate"
          description: "Alert when error logs exceed 100 per minute for 15 consecutive minutes"
        type: timeseries
        timeseries:
          expression:
            metoroQLTimeseries:
              query: count(logs{log_level="error"})
              bucketSize: 60
          evaluationRules:
            - name: "critical"
              type: static
              static:
                operators:
                  - operator: greaterThan
                    threshold: 100
                persistenceSettings:
                  datapointsToAlarm: 15
                  datapointsInEvaluationWindow: 15
                  missingDatapointBehavior: notBreaching
      
      - metadata:
          id: "high-latency-alert-001"
          name: "High Latency"
          description: "Alert when p99 trace duration exceeds 2 seconds for 5 minutes"
        type: timeseries
        timeseries:
          expression:
            metoroQLTimeseries:
              query: trace_duration_quantile(0.99, traces)
              bucketSize: 60
          evaluationRules:
            - name: "warning"
              type: static
              static:
                operators:
                  - operator: greaterThan
                    threshold: 2
                persistenceSettings:
                  datapointsToAlarm: 5
                  datapointsInEvaluationWindow: 5
                  missingDatapointBehavior: notBreaching
      
      - metadata:
          id: "latency-with-notifications-001"
          name: "High Latency with Notifications"
          description: "Alert when p99 trace duration exceeds 2 seconds for 5 minutes, and notify Slack and email"
        type: timeseries
        timeseries:
          expression:
            metoroQLTimeseries:
              query: trace_duration_quantile(0.99, traces)
              bucketSize: 60
          evaluationRules:
            - name: "warning"
              type: static
              static:
                operators:
                  - operator: greaterThan
                    threshold: 2
                persistenceSettings:
                  datapointsToAlarm: 5
                  datapointsInEvaluationWindow: 5
                  missingDatapointBehavior: notBreaching
              actions:
                - type: slack
                  slackDestination:
                    channel: "alerts-critical"
                - type: email
                  emailDestination:
                    emails:
                      - "oncall@example.com"
                      - "sre-team@example.com"
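
Every rule in the ConfigMap above requires all datapoints in the window to breach (5 of 5, or 15 of 15). As a sketch of a less strict rule, the fragment below is meant to fire when 3 of the last 5 one-minute datapoints breach. It reuses only fields that appear in the example above. The ConfigMap name and alert id are hypothetical, and reading datapointsToAlarm as "at least this many breaching datapoints within the window" is an assumption that should be checked against the alert reference before relying on it.

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: alert-config-error-bursts   # hypothetical name
  labels:
    metoro.io/alert: "true"
data:
  alert.yaml: |
    alerts:
      - metadata:
          id: "error-log-alert-002"   # hypothetical id
          name: "Intermittent Error Bursts"
          description: "Alert when error logs exceed 100 per minute in at least 3 of the last 5 minutes"
        type: timeseries
        timeseries:
          expression:
            metoroQLTimeseries:
              query: count(logs{log_level="error"})
              bucketSize: 60
          evaluationRules:
            - name: "warning"
              type: static
              static:
                operators:
                  - operator: greaterThan
                    threshold: 100
                persistenceSettings:
                  # Assumption: alarm when at least 3 of the 5 most recent datapoints breach.
                  datapointsToAlarm: 3
                  datapointsInEvaluationWindow: 5
                  missingDatapointBehavior: notBreaching
```

Keeping datapointsToAlarm below datapointsInEvaluationWindow would, under that reading, let the alert tolerate a quiet minute between bursts instead of resetting.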
