Use this runbook when bundled ClickHouse server pods need more or less CPU or memory. This applies to ClickHouse created by the metoro-onprem chart and managed by the Altinity ClickHouse Operator. If ClickHouse is externally managed, change resources through the team and tooling that operate that ClickHouse service. The Metoro chart does not manage external ClickHouse pod resources. This page is about ClickHouse server resources. ClickHouse Keeper is a separate lightweight component and normally does not need tuning as part of ClickHouse query or ingest capacity changes.

Check The Dashboard First

Before changing resources, open the Metoro Hub - ClickHouse dashboard in Metoro and review the Global Resource Usage - CHI section.
[Figure: Metoro Hub ClickHouse resource usage dashboard showing CPU, memory, throttling, and network panels]
Use the dashboard to verify that resizing is necessary. Resize based on sustained resource pressure or ClickHouse symptoms, not a single short spike. Start with these panels:
  • CPU Usage (Metoro Reported) shows how many CPU cores each ClickHouse instance is using.
  • CPU Util (Metoro Reported) shows CPU usage as a percentage of the configured CPU limit.
  • CPU Throttled (Metoro Reported) shows whether ClickHouse is being limited by its CPU limit.
  • Memory Usage (Metoro Reported) shows memory RSS for each ClickHouse instance.
  • Memory Util (Metoro Reported) shows memory RSS as a percentage of the configured memory limit.
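To make the two utilization panels concrete, here is a minimal sketch of the percentage they describe: usage divided by the configured limit. The helper name `util_percent` and the sample numbers are invented for illustration, and the dashboard's actual query may differ.

```shell
# Hypothetical helper: percentage of a limit that is currently in use.
# Both arguments must be in the same unit (cores for CPU, GiB for memory).
util_percent() {
  awk -v used="$1" -v limit="$2" 'BEGIN { printf "%.1f\n", (used / limit) * 100 }'
}

util_percent 6 8     # 6 cores used against an 8-core limit, prints 75.0
util_percent 28 32   # 28 GiB RSS against a 32 GiB limit, prints 87.5
```

A value that stays near 100 for long periods is the kind of sustained pressure this runbook treats as a reason to resize.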
CPU pressure usually shows up as sustained high CPU utilization, non-zero throttling, slower inserts, slower queries, or growing ClickHouse merge queues. Memory pressure usually shows up as high memory utilization, OOM restarts, failed queries, or errors from ClickHouse memory limits.
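OOM restarts can also be cross-checked from the cluster. The sketch below wraps a `kubectl` query in a helper. The function name is hypothetical, and it assumes the `metoro-hub` namespace and the `app=metoro-clickhouse` label used elsewhere in this runbook.

```shell
# Hypothetical helper: for each ClickHouse pod, print its name, container
# restart counts, and the reason the previous container instance terminated.
# A reason of OOMKilled means the container was killed for exceeding its
# memory limit. The reason column is empty for containers that never restarted.
clickhouse_restart_reasons() {
  kubectl -n metoro-hub get pods -l app=metoro-clickhouse \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].restartCount}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}
```

Run `clickhouse_restart_reasons` from a shell that has access to the hub cluster.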

What To Change

The clickhouse.bundled.resources Helm values control ClickHouse server CPU and memory requests and limits. The chart renders these values into the ClickHouse pod template on the ClickHouseInstallation named metoro. Change the values in metoro-hub-values.yaml:
clickhouse:
  bundled:
    resources:
      requests:
        cpu: "4"
        memory: 16Gi
      limits:
        cpu: "8"
        memory: 32Gi
Each setting has a different effect:
  • CPU requests affect Kubernetes scheduling and reserved capacity.
  • CPU limits cap burst and query capacity; a CPU limit that is too low can cause throttling.
  • Memory requests affect Kubernetes scheduling.
  • Memory limits cap ClickHouse memory; a memory limit that is too low can cause query failures, OOM restarts, or unstable replicas.
Which values to change depends on the goal:
  • Change requests and limits together when you want both scheduling guarantees and runtime capacity to move.
  • Change only requests when the pod needs better scheduling guarantees but the existing limits are still appropriate.
  • Change only limits when scheduling is already sufficient but ClickHouse needs more burst or query headroom.
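Before editing metoro-hub-values.yaml, it can help to see which values the deployed release is currently using. This is a minimal sketch that assumes the release is named `metoro` in the `metoro-hub` namespace, as in the upgrade command in this runbook. The function name is hypothetical.

```shell
# Hypothetical helper: print the user-supplied values of the deployed release
# as YAML. Assumes release "metoro" in namespace "metoro-hub".
show_release_values() {
  helm get values metoro --namespace metoro-hub --output yaml
}
```

By default `helm get values` prints only user-supplied values. If `clickhouse.bundled.resources` is missing from the output, the release is running with the chart defaults for those settings.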

Increase Or Decrease

Increase CPU or memory when the dashboard shows sustained pressure and ClickHouse symptoms line up with that pressure. Before applying larger requests, confirm the Kubernetes cluster has nodes that can schedule the new ClickHouse pod shape.
Decrease CPU or memory only after observing stable low utilization over a representative period. Downsize gradually and watch the dashboard after each change. Do not reduce memory below the amount needed for normal queries and merges, and do not lower CPU limits if ClickHouse is already showing throttling.
Changing CPU or memory does not resize PVCs. To increase hot PVC capacity, see Upscale Hot PVC. Changing CPU or memory does not add replicas. To change replica count, see Scale Replicas.
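One way to check whether any node could hold a larger pod shape before raising requests is to list each node's allocatable CPU and memory. This is a sketch rather than a full capacity check, and the function name is hypothetical.

```shell
# Hypothetical helper: list allocatable CPU and memory per node.
# Allocatable is the node total minus system reservations. It does not
# subtract what other pods have already requested.
list_node_allocatable() {
  kubectl get nodes \
    -o custom-columns=NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
}
```

Because allocatable ignores existing pod requests, follow up with `kubectl describe node` on a candidate node to see how much is already requested there.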

Apply The Change

Apply the updated values with the same Helm release and chart version used for the hub:
helm upgrade --install metoro oci://quay.io/metoro/charts/metoro-onprem \
  --namespace metoro-hub \
  --version 10.0.1 \
  --values metoro-hub-values.yaml
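To preview the change before applying it, the same command can be run with Helm's `--dry-run` flag, which renders the upgrade without changing the release. The sketch below reuses the release name, chart, namespace, and version shown above. The function name is hypothetical.

```shell
# Hypothetical helper: render the upgrade without applying it, so the
# ClickHouse resources can be inspected in the rendered manifests first.
preview_hub_upgrade() {
  helm upgrade --install metoro oci://quay.io/metoro/charts/metoro-onprem \
    --namespace metoro-hub \
    --version 10.0.1 \
    --values metoro-hub-values.yaml \
    --dry-run
}
```

Search the rendered output for the ClickHouseInstallation and confirm its pod template carries the intended requests and limits.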

What Happens

After the Helm upgrade, the ClickHouseInstallation named metoro updates its ClickHouse pod template. The Altinity ClickHouse Operator reconciles the updated ClickHouseInstallation. Because pod resources are part of the pod template, the operator may restart ClickHouse pods so they pick up the new CPU and memory values. Expect a rolling reconciliation rather than every ClickHouse pod restarting at once. If any pod or node maintenance is involved, follow Taking Pods Offline so Keeper keeps quorum and at least one ClickHouse pod stays ready. If pods become pending after increasing requests, check node capacity, node selectors, taints, topology spread constraints, and recent scheduler events. The requested pod shape must fit on available Kubernetes nodes.
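If pods do become pending, the sketch below shows one way to find them and read their scheduling events. It assumes the `metoro-hub` namespace and the `app=metoro-clickhouse` label used elsewhere in this runbook. Both function names are hypothetical.

```shell
# Hypothetical helper: list ClickHouse pods that are stuck in Pending.
pending_clickhouse_pods() {
  kubectl -n metoro-hub get pods -l app=metoro-clickhouse \
    --field-selector=status.phase=Pending
}

# Hypothetical helper: describe one pod; pass the pod name as the first
# argument. The Events section at the end explains scheduling failures.
describe_clickhouse_pod() {
  kubectl -n metoro-hub describe pod "$1"
}
```

Run `pending_clickhouse_pods`, then pass a pod name from its output to `describe_clickhouse_pod`. The scheduler's events usually state which constraint (for example insufficient CPU or memory, or an untolerated taint) kept the pod off each node.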

Verify The Rollout

Check that the ClickHouseInstallation exists and is being reconciled:
kubectl -n metoro-hub get chi metoro
Watch ClickHouse pods while the operator reconciles:
kubectl -n metoro-hub get pods -l app=metoro-clickhouse -w
Check recent events if pods are pending or slow to restart:
kubectl -n metoro-hub get events --sort-by='.lastTimestamp'
After the rollout, return to the Metoro Hub - ClickHouse dashboard and verify that the original signal improved. CPU throttling should drop after raising an undersized CPU limit. CPU or memory utilization should move to the expected range after increasing limits. If the dashboard does not improve, the bottleneck may be disk IO, object storage latency, query shape, ingest volume, or replica count rather than CPU or memory sizing.
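If the dashboard looks unchanged, first confirm that the running pods actually carry the new values. This minimal sketch assumes the same namespace and label selector as the commands above. The function name is hypothetical.

```shell
# Hypothetical helper: print each ClickHouse pod name followed by the
# resources (requests and limits) configured on its containers.
clickhouse_pod_resources() {
  kubectl -n metoro-hub get pods -l app=metoro-clickhouse \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].resources}{"\n"}{end}'
}
```

If a pod still shows the old requests or limits, the operator may not have restarted it yet. Check the ClickHouseInstallation status and recent events before concluding that the bottleneck is elsewhere.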