Common Kubernetes Attributes
All node and container metrics include the following Kubernetes attributes. These attributes provide detailed information about the node’s infrastructure, architecture, and location.
Node Information
- kubernetes.io/hostname: Node hostname
- kubernetes.io/os: Operating system (also available as beta.kubernetes.io/os)
- kubernetes.io/arch: CPU architecture (also available as beta.kubernetes.io/arch)
- kubernetes.io/instance-type: Instance type (also available as node.kubernetes.io/instance-type, beta.kubernetes.io/instance-type)
Cloud Provider Information
- k8s.io/cloud-provider-aws: AWS cloud provider identifier
- eks.amazonaws.com/nodegroup: EKS node group name
- eks.amazonaws.com/nodegroup-image: EKS AMI image
- eks.amazonaws.com/capacityType: EKS capacity type
- eks.amazonaws.com/sourceLaunchTemplateId: EKS launch template ID
- eks.amazonaws.com/sourceLaunchTemplateVersion: EKS launch template version
Location and Topology
- topology.kubernetes.io/region: Cloud provider region
- topology.kubernetes.io/zone: Availability zone
- topology.k8s.aws/zone-id: AWS zone ID
- topology.ebs.csi.aws.com/zone: EBS zone
- failure-domain.beta.kubernetes.io/region: Region (legacy label)
- failure-domain.beta.kubernetes.io/zone: Zone (legacy label)
Karpenter-specific Information
- karpenter.sh/nodepool: Karpenter node pool
- karpenter.sh/capacity-type: Capacity type
- karpenter.k8s.aws/instance-category: AWS instance category
- karpenter.k8s.aws/instance-family: AWS instance family
- karpenter.k8s.aws/instance-size: AWS instance size
- karpenter.k8s.aws/instance-generation: AWS instance generation
- karpenter.k8s.aws/instance-cpu: CPU details
- karpenter.k8s.aws/instance-cpu-manufacturer: CPU manufacturer
- karpenter.k8s.aws/instance-memory: Memory capacity
- karpenter.k8s.aws/instance-network-bandwidth: Network bandwidth
- karpenter.k8s.aws/instance-ebs-bandwidth: EBS bandwidth
- karpenter.k8s.aws/instance-hypervisor: Hypervisor type
- karpenter.k8s.aws/instance-local-nvme: Local NVMe storage
Other
- pool: Generic pool identifier
Container Attributes
- instance: The instance identifier
- environment: The environment name
- container_name: Name of the container
- container_id: Unique container identifier
- service_name: Name of the service the container belongs to
- namespace: Kubernetes namespace
- pod_name: Name of the pod
Node Metrics
Node Information
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| node_info | gauge | - | hostname, kernel_version | Meta information about the node |
| node_cloud_info | gauge | - | provider, account_id, instance_id, instance_type, instance_life_cycle, region, availability_zone, availability_zone_id, local_ipv4, public_ipv4 | Meta information about the cloud instance |
| node_uptime_seconds | gauge | seconds | - | Uptime of the node in seconds |
Node Resource Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| node_resources_cpu_usage_seconds_total | counter | seconds | mode | The amount of CPU time spent in each mode |
| node_resources_cpu_logical_cores | gauge | count | - | The number of logical CPU cores |
| node_resources_memory_total_bytes | gauge | bytes | - | The total amount of physical memory |
| node_resources_memory_free_bytes | gauge | bytes | - | The amount of unassigned memory |
| node_resources_memory_available_bytes | gauge | bytes | - | The total amount of available memory |
| node_resources_memory_cached_bytes | gauge | bytes | - | The amount of memory used as page cache |
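
Because node_resources_cpu_usage_seconds_total is a cumulative counter, utilization has to be derived from the difference between two samples. The sketch below shows one way to do that in Python, alongside a memory utilization figure based on the total and available gauges. The function names and the mode value "idle" are assumptions made for illustration; the table above does not list the actual mode values, and counter resets are not handled here.

```python
def cpu_utilization(prev, curr):
    """Fraction of CPU time spent outside the idle mode between two samples.

    prev and curr map a mode name to the cumulative seconds reported by
    node_resources_cpu_usage_seconds_total at two scrape times.
    The mode name "idle" is an assumption, not taken from the metric table.
    """
    deltas = {mode: curr[mode] - prev.get(mode, 0.0) for mode in curr}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0
    return (total - deltas.get("idle", 0.0)) / total


def memory_utilization(total_bytes, available_bytes):
    """Fraction of physical memory that is not available for new allocations."""
    if total_bytes <= 0:
        return 0.0
    return 1.0 - available_bytes / total_bytes
```

Using node_resources_memory_available_bytes rather than node_resources_memory_free_bytes avoids counting reclaimable page cache as used memory.
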
Node Disk Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| node_resources_disk_reads_total | counter | count | device | The total number of reads completed successfully |
| node_resources_disk_writes_total | counter | count | device | The total number of writes completed successfully |
| node_resources_disk_read_bytes_total | counter | bytes | device | The total number of bytes read from the disk |
| node_resources_disk_written_bytes_total | counter | bytes | device | The total number of bytes written to the disk |
| node_resources_disk_read_time_seconds_total | counter | seconds | device | The total number of seconds spent reading |
| node_resources_disk_write_time_seconds_total | counter | seconds | device | The total number of seconds spent writing |
| node_resources_disk_io_time_seconds_total | counter | seconds | device | The total number of seconds the disk spent doing I/O |
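
The disk counters can be combined into per-device figures such as average read latency and I/O utilization. The following is a minimal sketch under the assumption that two samples of the same device are available as plain dictionaries keyed by shortened metric names; those keys and the function name are hypothetical.

```python
def disk_stats(prev, curr, interval_seconds):
    """Derive average read latency and I/O utilization for one device.

    prev and curr hold cumulative counter values for the same device:
      "reads"     -> node_resources_disk_reads_total
      "read_time" -> node_resources_disk_read_time_seconds_total
      "io_time"   -> node_resources_disk_io_time_seconds_total
    """
    reads = curr["reads"] - prev["reads"]
    read_time = curr["read_time"] - prev["read_time"]
    io_time = curr["io_time"] - prev["io_time"]

    # Average seconds per completed read during the interval.
    avg_read_latency = read_time / reads if reads > 0 else 0.0
    # Share of the interval during which the disk was busy doing I/O.
    utilization = io_time / interval_seconds if interval_seconds > 0 else 0.0
    return avg_read_latency, utilization
```
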
Node Network Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| node_net_received_bytes_total | counter | bytes | interface | The total number of bytes received |
| node_net_transmitted_bytes_total | counter | bytes | interface | The total number of bytes transmitted |
| node_net_received_packets_total | counter | count | interface | The total number of packets received |
| node_net_transmitted_packets_total | counter | count | interface | The total number of packets transmitted |
| node_net_interface_up | gauge | - | interface | Status of the interface (0:down, 1:up) |
| node_net_interface_ip | gauge | - | interface, ip | IP address assigned to the interface |
Container Metrics
Resource Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| container_info | gauge | - | image, systemd_triggered_by | Meta information about the container |
| container_restarts_total | counter | count | - | Number of times the container was restarted |
| container_resources_cpu_limit_cores | gauge | cores | - | CPU limit of the container |
| container_resources_cpu_usage_seconds_total | counter | seconds | - | Total CPU time consumed by the container |
| container_resources_cpu_delay_seconds_total | counter | seconds | - | Total time that processes of the container have spent waiting for a CPU while runnable |
| container_resources_cpu_throttled_seconds_total | counter | seconds | - | Total time the container has been throttled |
| container_resources_memory_limit_bytes | gauge | bytes | - | Memory limit of the container |
| container_resources_memory_rss_bytes | gauge | bytes | - | Amount of physical memory used by the container (doesn’t include page cache) |
| container_resources_memory_cache_bytes | gauge | bytes | - | Amount of page cache memory allocated by the container |
| container_oom_kills_total | counter | count | - | Total number of times the container was terminated by the OOM killer |
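
A common use of these metrics is to compare CPU consumption against the configured limit and to see how much throttling occurred. The sketch below is one possible calculation, not a prescribed one. The dictionary keys and function name are hypothetical, and treating a missing or non-positive limit as "no limit" is an assumption about how an unset limit is reported.

```python
def container_cpu_stats(prev, curr, interval_seconds, cpu_limit_cores=None):
    """Summarize container CPU usage between two samples.

    prev and curr hold cumulative counter values:
      "usage"     -> container_resources_cpu_usage_seconds_total
      "throttled" -> container_resources_cpu_throttled_seconds_total
    cpu_limit_cores is the value of container_resources_cpu_limit_cores.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    cores_used = (curr["usage"] - prev["usage"]) / interval_seconds
    throttled_rate = (curr["throttled"] - prev["throttled"]) / interval_seconds

    # Assumption: a missing or non-positive limit means no limit is set.
    if cpu_limit_cores is None or cpu_limit_cores <= 0:
        limit_fraction = None
    else:
        limit_fraction = cores_used / cpu_limit_cores

    return {
        "cores_used": cores_used,
        "limit_fraction": limit_fraction,
        "throttled_seconds_per_second": throttled_rate,
    }
```
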
Disk Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| container_resources_disk_delay_seconds_total | counter | seconds | - | Total time that processes of the container have spent waiting for I/O to complete |
| container_resources_disk_size_bytes | gauge | bytes | mount_point, device, volume | Total capacity of the volume |
| container_resources_disk_used_bytes | gauge | bytes | mount_point, device, volume | Used capacity of the volume |
| container_resources_disk_reserved_bytes | gauge | bytes | mount_point, device, volume | Reserved capacity of the volume |
| container_resources_disk_reads_total | counter | count | mount_point, device, volume | Total number of reads completed successfully by the container |
| container_resources_disk_read_bytes_total | counter | bytes | mount_point, device, volume | Total number of bytes read from the disk by the container |
| container_resources_disk_writes_total | counter | count | mount_point, device, volume | Total number of writes completed successfully by the container |
| container_resources_disk_written_bytes_total | counter | bytes | mount_point, device, volume | Total number of bytes written to the disk by the container |
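
Since the size and used gauges share the mount_point, device, and volume attributes, they can be joined on those attributes to get a usage fraction per volume. The sketch below assumes samples arrive as (metric name, attributes, value) tuples, which is a hypothetical in-memory shape rather than any particular client's format.

```python
SIZE_METRIC = "container_resources_disk_size_bytes"
USED_METRIC = "container_resources_disk_used_bytes"


def volume_usage(samples):
    """Return {(mount_point, volume): used_fraction} for each volume.

    samples is an iterable of (metric_name, attributes, value) tuples,
    where attributes is a dict containing "mount_point" and "volume".
    """
    sizes, used = {}, {}
    for name, attrs, value in samples:
        key = (attrs.get("mount_point"), attrs.get("volume"))
        if name == SIZE_METRIC:
            sizes[key] = value
        elif name == USED_METRIC:
            used[key] = value

    # Only report volumes that have both gauges and a non-zero capacity.
    return {
        key: used[key] / size
        for key, size in sizes.items()
        if size > 0 and key in used
    }
```
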
Network Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| container_net_tcp_successful_connects_total | counter | count | - | Total number of successful TCP connects |
| container_net_tcp_connection_time_seconds_total | counter | seconds | - | Time spent on TCP connections |
| container_net_tcp_failed_connects_total | counter | count | - | Total number of failed TCP connects |
| container_net_tcp_active_connections | gauge | count | - | Number of active outbound connections used by the container |
| container_net_tcp_retransmits_total | counter | count | - | Total number of retransmitted TCP segments |
| container_net_latency_seconds | gauge | seconds | - | Round-trip time between the container and a remote IP |
| container_net_tcp_bytes_sent_total | counter | bytes | - | Total number of bytes sent to the peer |
| container_net_tcp_bytes_received_total | counter | bytes | - | Total number of bytes received from the peer |
| container_net_bytes_sent_total | counter | bytes | - | Total number of bytes sent by the container |
| container_net_bytes_received_total | counter | bytes | - | Total number of bytes received by the container |
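
The TCP connect counters lend themselves to two derived figures: the share of connection attempts that failed, and the average time per successful connect. The sketch below is illustrative only. Dividing container_net_tcp_connection_time_seconds_total by the number of successful connects is an assumption about what that counter accumulates, and the dictionary keys are hypothetical.

```python
def tcp_connect_stats(prev, curr):
    """Return (failure_ratio, avg_connect_seconds) between two samples.

    prev and curr hold cumulative counter values:
      "successful" -> container_net_tcp_successful_connects_total
      "failed"     -> container_net_tcp_failed_connects_total
      "time"       -> container_net_tcp_connection_time_seconds_total
    """
    ok = curr["successful"] - prev["successful"]
    failed = curr["failed"] - prev["failed"]
    spent = curr["time"] - prev["time"]

    attempts = ok + failed
    failure_ratio = failed / attempts if attempts > 0 else 0.0
    # Assumption: the time counter covers successful connects only.
    avg_connect_seconds = spent / ok if ok > 0 else None
    return failure_ratio, avg_connect_seconds
```
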
Application Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| container_application_type | gauge | - | application_type | Type of the application running in the container (e.g. memcached, postgres, mysql) |
| container_golang_binary_location | gauge | - | binary_location | Location of the Golang binary running in the container |
JVM Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| container_jvm_info | gauge | - | jvm, java_version | Meta information about the JVM |
| container_jvm_heap_size_bytes | gauge | bytes | jvm | Total heap size in bytes |
| container_jvm_heap_used_bytes | gauge | bytes | jvm | Used heap size in bytes |
| container_jvm_gc_time_seconds | gauge | seconds | jvm, gc | Time spent in the given JVM garbage collector in seconds |
| container_jvm_safepoint_time_seconds | gauge | seconds | jvm | Time the application has been stopped for safepoint operations in seconds |
| container_jvm_safepoint_sync_time_seconds | gauge | seconds | jvm | Time spent getting to safepoints in seconds |
Python Metrics
| Metric Name | Type | Units | Attributes | Description |
|---|---|---|---|---|
| container_python_thread_lock_wait_time_seconds | gauge | seconds | - | Time spent waiting to acquire the GIL, in seconds |
Protocol-Specific Request Metrics
| Metric Name | Type | Units | Description |
|---|---|---|---|
| container_http_requests_total | counter | count | Total number of outbound HTTP requests |
| container_postgres_queries_total | counter | count | Total number of outbound Postgres queries |
| container_redis_queries_total | counter | count | Total number of outbound Redis queries |
| container_memcached_queries_total | counter | count | Total number of outbound Memcached queries |
| container_mysql_queries_total | counter | count | Total number of outbound MySQL queries |
| container_mongo_queries_total | counter | count | Total number of outbound MongoDB queries |
| container_kafka_requests_total | counter | count | Total number of outbound Kafka requests |
| container_cassandra_queries_total | counter | count | Total number of outbound Cassandra queries |
| container_rabbitmq_messages_total | counter | count | Total number of RabbitMQ messages produced or consumed by the container |
| container_nats_messages_total | counter | count | Total number of NATS messages produced or consumed by the container |
| container_dubbo_requests_total | counter | count | Total number of outbound Dubbo requests |
| container_dns_requests_total | counter | count | Total number of outbound DNS requests |
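
All of the request counters above only ever increase, except when the reporting process restarts and the value drops back toward zero. A per-second request rate therefore needs to account for resets. The sketch below shows one common approach, treating any drop as a reset; the function names are hypothetical and the input is assumed to be a time-ordered list of (timestamp, value) pairs.

```python
def counter_increase(samples):
    """Total increase of a counter across time-ordered (timestamp, value) samples.

    A drop in value is treated as a counter reset, so the post-reset value
    is counted as the increase for that step.
    """
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += (curr - prev) if curr >= prev else curr
    return increase


def per_second_rate(samples):
    """Average per-second rate over the window covered by the samples."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0
```

For example, applying per_second_rate to samples of container_http_requests_total yields outbound HTTP requests per second for that container.
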
Protocol-Specific Latency Metrics
| Metric Name | Type | Units | Description |
|---|---|---|---|
| container_http_requests_duration_seconds_total | histogram | seconds | Histogram of the response time for each outbound HTTP request |
| container_postgres_queries_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound Postgres query |
| container_redis_queries_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound Redis query |
| container_memcached_queries_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound Memcached query |
| container_mysql_queries_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound MySQL query |
| container_mongo_queries_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound MongoDB query |
| container_kafka_requests_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound Kafka request |
| container_cassandra_queries_duration_seconds_total | histogram | seconds | Histogram of the execution time for each outbound Cassandra query |
| container_dubbo_requests_duration_seconds_total | histogram | seconds | Histogram of the response time for each outbound Dubbo request |
| container_dns_requests_duration_seconds_total | histogram | seconds | Histogram of the response time for each outbound DNS request |
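
Latency histograms are typically read as percentiles. The sketch below estimates a quantile by linear interpolation within a bucket, assuming the histogram is exposed as cumulative buckets of (upper bound, count) with a final infinite bound. That bucket layout is an assumption; the table above only states that these metrics are histograms.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets is a list of (upper_bound_seconds, cumulative_count) sorted by
    upper bound, ending with an upper bound of math.inf.
    Returns None when the histogram holds no observations.
    """
    if not buckets or buckets[-1][1] <= 0:
        return None

    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the overflow bucket; report the
                # highest finite bound as a lower estimate.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

In practice the bucket counts would be deltas over a time window rather than lifetime totals, so that the estimate reflects recent requests.
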
