ClickHouse is the main capacity driver in a Metoro Hub deployment. It receives telemetry from the ingesters, compresses and stores that telemetry, and serves the analytical queries used by the UI and API. This guide gives starting points for production sizing. Treat the numbers as planning estimates, then tune from observed ingest rate, query latency, CPU saturation, and disk growth after the system is running.
The most important operational choice is to make ClickHouse PVCs resizable and network-attached. As the real telemetry shape becomes clear, you may need to move ClickHouse pods onto smaller or larger hosts, add CPU, or grow storage. Network-attached PVCs let pods move between hosts without losing their retained data, and resizable PVCs let operators increase disk capacity without rebuilding the deployment.

ClickHouse And Keeper Roles

ClickHouse Keeper is used for coordination, replication control, and cluster metadata. It is lightweight compared with ClickHouse itself and is unlikely to use more than 100m CPU and 2 GiB of memory per Keeper node. It should still run on fast, retained PVCs, and Keeper nodes should have low network latency between them: aim for less than 5 ms.
ClickHouse replicas do the actual work: ingesting telemetry, compressing it, writing it to storage, merging data parts, and executing user queries. Size CPU, memory, PVCs, object storage, and local cache around the ClickHouse replicas.

Estimating Data Ingest

The main input for ClickHouse sizing is how much telemetry data is sent to the Metoro Hub. In the formulas below, ingest_MBps means sustained telemetry ingest into the hub in MB/s. For context, a Metoro Hub observing a 5-node cluster with around 20k logs per minute, 20k traces per minute, 100 pods, and 20k active timeseries is typically around 1 MB/s of telemetry ingest. Use this as a rough calibration point, not a fixed conversion rate. Actual ingest varies with log line size, span attributes, metric cardinality, scrape frequency, and workload behavior. If you already know the expected telemetry bandwidth into the hub, use that value directly. If you do not, estimate the expected rate from a representative cluster and scale it by the number and size of clusters that will send data to the hub.
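As a rough illustration of scaling from a representative cluster, the sketch below extrapolates from the 5-node, roughly 1 MB/s calibration point. The function name and the assumption that ingest grows linearly with node count are illustrative only and are not part of Metoro; real ingest depends on log volume, span attributes, and metric cardinality, so prefer measured bandwidth when it is available.

```python
def estimate_ingest_mbps(cluster_node_counts, reference_nodes=5, reference_mbps=1.0):
    """Rough sustained ingest (MB/s) for a set of clusters.

    ASSUMPTION: ingest scales linearly with node count relative to a
    representative cluster. The defaults use the calibration point quoted
    in this guide (5 nodes at about 1 MB/s); replace them with measurements
    from your own representative cluster.
    """
    total_nodes = sum(cluster_node_counts)
    return total_nodes * reference_mbps / reference_nodes


# Three hypothetical clusters of 5, 20, and 50 nodes.
print(estimate_ingest_mbps([5, 20, 50]))  # 15.0
```

The result is only a first planning input for ingest_MBps; replace it with observed ingest once the hub is running.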

CPU And Memory

For Metoro telemetry, a modern ClickHouse core can ingest about 25 MB/s. On current Xeon 6 class processors, that works out to roughly 2 cores for 50 MB/s of ingest. Size each ClickHouse replica with:
clickhouse_cores_per_replica = max(ingest_MBps / 25, production_query_floor)
For production, use a query floor of at least 4 cores per ClickHouse replica. Larger deployments should start at 8 cores or more per replica. Queries use capacity that is not consumed by ingestion, and larger data volumes usually require more query CPU because each query has more data to scan.
Metoro ClickHouse deployments are usually CPU bound rather than memory bound. Use 2 GiB of memory per core as the minimum, and 4 GiB per core as the recommended starting point:
minimum_memory_per_replica = clickhouse_cores_per_replica * 2 GiB
recommended_memory_per_replica = clickhouse_cores_per_replica * 4 GiB
To change bundled ClickHouse CPU or memory resources after installation, see Resource Sizing.
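The CPU and memory formulas above can be sketched in Python as follows. The function names are hypothetical, and rounding the ingest cores up to a whole number is an assumption added here for convenience; the guide's formula itself does not round.

```python
import math

INGEST_MBPS_PER_CORE = 25       # per-core ingest rate quoted in this guide
MIN_GIB_PER_CORE = 2            # minimum memory per core
RECOMMENDED_GIB_PER_CORE = 4    # recommended memory per core


def clickhouse_cores_per_replica(ingest_mbps, production_query_floor=4):
    """Cores per replica: the larger of the ingest need and the query floor."""
    # ASSUMPTION: round ingest cores up to a whole core before comparing.
    ingest_cores = math.ceil(ingest_mbps / INGEST_MBPS_PER_CORE)
    return max(ingest_cores, production_query_floor)


def memory_per_replica_gib(cores):
    """Return (minimum, recommended) memory in GiB for a replica."""
    return cores * MIN_GIB_PER_CORE, cores * RECOMMENDED_GIB_PER_CORE


cores = clickhouse_cores_per_replica(10)
print(cores, memory_per_replica_gib(cores))  # 4 (8, 16)
```

For a larger deployment, pass a higher floor, for example `clickhouse_cores_per_replica(250, production_query_floor=8)`, which yields 10 cores because ingest rather than the floor becomes the binding term.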

Replicas And Shards

Metoro normally stores telemetry in a single ClickHouse shard with three replicas. This gives three copies of the data, and each replica can query the full telemetry dataset. Because each replica stores a full copy, calculate storage per replica first, then multiply by 3 for the default production deployment. Do not add more than three replicas just to increase capacity. Prefer vertical scaling first: larger ClickHouse nodes with more CPU, memory, and disk. Sharding is supported for very large deployments, but it adds operational overhead and should only be considered after vertical scaling is no longer practical. Contact Metoro before planning a sharded production deployment. To change bundled ClickHouse replica count after installation, see Scale Replicas.

Storage Models

Metoro supports two ClickHouse storage models.
| Model | How it works | When to use it |
| --- | --- | --- |
| PVC-only | All telemetry for the full retention period stays on retained Kubernetes PVCs. | Simpler operation, but requires large fast disks. This becomes difficult above roughly 10 TB per replica and is not recommended at petabyte scale. |
| Hot PVC plus object storage | About one day of recent telemetry stays on fast PVCs. Older telemetry is moved to object storage. | Recommended for production because merges happen on fast block storage while longer-term data uses scalable object storage. |
Use fast storage for ClickHouse PVCs. For the hot PVC plus object storage model, the hot PVC should still be fast enough for active writes and merges. For both models, use a retained, network-attached StorageClass that supports volume expansion. Avoid host-local retained storage for production ClickHouse PVCs unless the operational team has explicitly accepted the scheduling and migration constraints. To increase the bundled ClickHouse hot PVC after installation, see Upscale Hot PVC.

Storage Formulas

Metoro typically sees about 15x compression on telemetry data. The compressed data rate is:
compressed_MBps = ingest_MBps / 15
For PVC-only storage, size the PVC for the full retention period:
pvc_only_storage_per_replica_MB = compressed_MBps * retention_seconds * 1.5
For hot PVC plus object storage, size the hot PVC for one day of data:
hot_pvc_per_replica_MB = compressed_MBps * 86400 * 1.5
Then size object storage for the rest of the retention period:
object_storage_per_replica_MB = compressed_MBps * max(retention_seconds - 86400, 0) * 1.5
The 1.5 multiplier provides headroom for merges, operational variance, and growth. These formulas produce decimal MB, GB, and TB estimates, while Kubernetes PVC sizes are usually requested in binary units such as Gi or Ti, so convert and round up to a practical size. For the default three-replica deployment:
total_metoro_storage = storage_per_replica * 3
This total is before any replication, erasure coding, versioning, or backup overhead inside the storage system itself.
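A minimal Python sketch of the storage formulas above, including a conversion from decimal MB to the binary Gi unit used in PVC requests. The function and key names are hypothetical; the constants come from this guide.

```python
import math

COMPRESSION_RATIO = 15     # typical compression quoted in this guide
HEADROOM = 1.5             # multiplier for merges, variance, and growth
SECONDS_PER_DAY = 86_400
REPLICAS = 3               # default production deployment


def storage_per_replica_mb(ingest_mbps, retention_days):
    """Per-replica storage estimates in decimal MB for both storage models."""
    retention_seconds = retention_days * SECONDS_PER_DAY
    cold_seconds = max(retention_seconds - SECONDS_PER_DAY, 0)
    # Multiply before dividing by the compression ratio to limit float error.
    per_second = ingest_mbps * HEADROOM / COMPRESSION_RATIO
    return {
        "pvc_only": per_second * retention_seconds,
        "hot_pvc": per_second * SECONDS_PER_DAY,
        "object_storage": per_second * cold_seconds,
    }


def mb_to_gib(decimal_mb):
    """Convert decimal megabytes (10**6 bytes) to GiB (2**30 bytes)."""
    return decimal_mb * 10**6 / 2**30


sizes = storage_per_replica_mb(ingest_mbps=10, retention_days=30)
for name, mb in sizes.items():
    print(f"{name}: {mb / 1e6:.2f} TB per replica, {mb * REPLICAS / 1e6:.2f} TB total")
print(f"hot PVC request: at least {math.ceil(mb_to_gib(sizes['hot_pvc']))}Gi")
```

The totals printed here are still before any replication, erasure coding, versioning, or backup overhead in the underlying storage system.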

Local Disk Cache For Object Storage

When ClickHouse reads data older than the hot PVC window, it has to read from object storage. Object storage is much slower than local disk, so ClickHouse can use a local disk cache for object-backed data. This cache speeds up repeated reads over the same data and reduces object storage transfer. Observability queries often revisit nearby time ranges, so the cache can materially improve query latency. For example, in the hot PVC plus object storage model, a 7-day query reads the most recent day from the ClickHouse PVC. Older data is served from the local cache when present; otherwise ClickHouse reads it from object storage and fills the cache for later queries. Metoro recommends making the cache as large as practical. Use at least 10Gi of ephemeral local disk or NVMe cache per ClickHouse replica. The cache does not need to be retained: if a ClickHouse pod restarts or moves to another node, the cache is rebuilt as queries read data.
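To make the 7-day example concrete, the sketch below splits a query's lookback window into the part served from the hot PVC and the part that is object-backed (served from the local cache on a hit, otherwise from object storage). The function name is hypothetical, and the model is a simplification that assumes data moves to object storage exactly at the hot window boundary.

```python
def split_query_window(query_days, hot_days=1):
    """Return (hot_days_read, object_backed_days_read) for a lookback query.

    ASSUMPTION: data younger than `hot_days` is on the hot PVC and all
    older data is object-backed, with a sharp boundary between the two.
    """
    hot = min(query_days, hot_days)
    object_backed = max(query_days - hot_days, 0)
    return hot, object_backed


print(split_query_window(7))    # (1, 6)
print(split_query_window(0.5))  # (0.5, 0)
```

Under this model, six of the seven days in a 7-day query depend on cache or object storage, which is why a larger cache tends to help wide-range queries most.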

Examples

PVC-only, 10 MB/s, 30 days

For 10 MB/s of ingest and 30 days of retention:
compressed_MBps = 10 / 15 ≈ 0.67 MB/s
retention_seconds = 60 * 60 * 24 * 30 = 2,592,000
storage_per_replica = (10 / 15) * 2,592,000 = 1,728,000 MB ≈ 1.73 TB
storage_per_replica_with_headroom = 1,728,000 MB * 1.5 = 2,592,000 MB ≈ 2.59 TB
With the default three replicas, plan for about 7.78 TB of ClickHouse PVC capacity across the cluster before storage-system overhead. This can be reasonable for smaller deployments, but PVC-only storage becomes harder to operate as retention or ingest grows.

Hot PVC Plus Object Storage, 10 MB/s, 30 days

For 10 MB/s of ingest, 30 days of retention, and the default three replicas:
| Resource | Per replica sizing |
| --- | --- |
| CPU | 10 / 25 = 0.4 ingest cores, so use the 4 core production query floor. |
| Memory | 4 cores * 4 GiB = 16 GiB recommended. |
| Hot PVC | (10 / 15) * 86400 * 1.5 = 86,400 MB, so round up to a practical size such as 100Gi. |
| Object storage | (10 / 15) * (2,592,000 - 86,400) * 1.5 = 2,505,600 MB, or about 2.5 TB. |
| Local cache | At least 10Gi ephemeral local disk or NVMe cache. Larger is better. |
Across three replicas, the object storage estimate is about 7.5 TB before object-store overhead. Each replica should also have its own hot PVC and local cache.