
Cloud Performance Tuning Guide

ZeptoDB actively leverages hardware optimizations such as SIMD, NUMA-aware allocation, lock-free ring buffers, and LLVM JIT. The configuration below is required to achieve near bare-metal performance in Kubernetes and container environments.


The overhead of containers themselves is nearly zero (Linux cgroups/namespaces). Performance degradation comes from Kubernetes' default scheduling policies:

| Problem | Cause | Impact |
| --- | --- | --- |
| CPU throttling | cgroup CFS quota interrupts busy-spin | Ring buffer ingestion throughput drops sharply |
| NUMA ignored | Pod spans multiple NUMA nodes | Memory latency increases 2–3× |
| Disk I/O | EBS gp3 = 125 MB/s | HDB flush 40× slower (vs NVMe) |
| THP jitter | Transparent Huge Pages compaction | Unpredictable latency spikes |

Setting requests == limits gives the Pod Guaranteed QoS, which lets the kubelet (with the static CPU manager policy) assign a dedicated cpuset. CFS throttling is eliminated and the lock-free ring buffer operates reliably.

values.yaml

```yaml
resources:
  requests:
    cpu: "8"
    memory: "32Gi"
    hugepages-2Mi: "4Gi"
  limits:
    cpu: "8"              # requests == limits → Guaranteed QoS
    memory: "32Gi"
    hugepages-2Mi: "4Gi"
```

Configure kubelet flags in the launch template of EKS managed node groups or in Karpenter userData:

```shell
# Additional kubelet flags
--cpu-manager-policy=static                   # Dedicated CPU core allocation
--topology-manager-policy=single-numa-node    # Enforce single NUMA node
--memory-manager-policy=Static                # NUMA-local memory allocation as well
--reserved-memory='[{"numaNode":0,"limits":{"memory":"1Gi"}}]'
```

How to apply on EKS:

```shell
#!/bin/bash
# EKS managed node group — launch template userdata
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--cpu-manager-policy=static --topology-manager-policy=single-numa-node'
```

The following kernel tunings are applied automatically via Karpenter userData (values.yaml’s karpenter.realtime.userData):

| Tuning Item | Setting | Purpose |
| --- | --- | --- |
| Hugepages | `echo 8192 > /proc/sys/vm/nr_hugepages` | Arena allocator performance |
| CPU governor | `echo performance > scaling_governor` | Consistent clock speed |
| NUMA balancing | `echo 0 > numa_balancing` | ZeptoDB manages NUMA directly |
| Swappiness | `sysctl vm.swappiness=0` | Swap unnecessary for in-memory DB |
| THP | `echo never > transparent_hugepage/enabled` | Prevent latency spikes |
| Network | `busy_poll=50`, `tcp_low_latency=1` | Reduce network latency |
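At node-provisioning time these settings typically land in a boot script. A minimal sketch for userData, assuming typical sysfs paths and a hypothetical `/etc/sysctl.d/99-zeptodb.conf` file (must run as root):

```shell
#!/bin/bash
# Hypothetical boot fragment applying the kernel tunings above (run as root).

# Persistent sysctls
cat > /etc/sysctl.d/99-zeptodb.conf <<'EOF'
vm.nr_hugepages = 8192
vm.swappiness = 0
kernel.numa_balancing = 0
net.core.busy_poll = 50
net.ipv4.tcp_low_latency = 1
EOF
sysctl --system

# THP and the CPU governor are sysfs knobs, not sysctls
echo never > /sys/kernel/mm/transparent_hugepage/enabled
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  echo performance > "$gov"
done
```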

Use instances with instance store when HDB flush performance matters:

| Storage | Throughput | Latency | Use Case |
| --- | --- | --- | --- |
| Instance store (i4i/i4g) | ~4 GB/s | ~100μs | HDB flush, WAL |
| EBS io2 Block Express | 4 GB/s | ~200μs | When persistent storage is needed |
| EBS gp3 (default) | 125 MB/s | ~1ms | Cost priority |
```yaml
# values.yaml — Karpenter realtime pool
karpenter:
  realtime:
    instanceFamilies: ["i4g", "c7g"]
    instanceStorePolicy: "RAID0"    # Automatic RAID0 for NVMe instance store
```

⚠️ Instance store data is lost when the node terminates. WAL replication must be enabled.
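For reference, a RAID0 instance-store policy amounts to striping the local NVMe devices at node setup. A hypothetical sketch of the equivalent manual steps (device names and mount point are assumptions and vary by instance type; run as root):

```shell
# Hypothetical manual equivalent of instanceStorePolicy: "RAID0"
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1
mkfs.xfs /dev/md0
mount /dev/md0 /var/lib/zeptodb/hdb   # mount point is an assumption
```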

When there is heavy RPC communication between data nodes in cluster mode, you can bypass CNI overhead:

values.yaml

```yaml
performanceTuning:
  hostNetwork: false    # Change to true to bypass CNI
```

Considerations when using hostNetwork:

  • Pods share the host network namespace, so watch out for port conflicts
  • dnsPolicy: ClusterFirstWithHostNet is automatically configured
  • RPC port (8223) must be allowed in security groups
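For reference, with hostNetwork enabled the chart presumably renders a Pod spec along these lines (a sketch, not the chart's actual output; the container name is an assumption):

```yaml
# Sketch of the relevant Pod spec fields when hostNetwork is enabled
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet    # keeps cluster DNS working with host networking
  containers:
    - name: zeptodb
      ports:
        - containerPort: 8223           # RPC port; must be allowed in security groups
```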

Standard RDMA is not supported in cloud VPCs. You must use AWS EFA (Elastic Fabric Adapter):

  • EFA-supported instances: hpc7g, p5.48xlarge, trn1.32xlarge, etc.
  • EFA device plugin DaemonSet installation required
  • For most workloads, TCP + busy_poll tuning is sufficient
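If EFA is used, pods request the device through the EFA device plugin's extended resource. A minimal sketch (resource name as exposed by the AWS EFA device plugin):

```yaml
# Sketch: requesting one EFA device on a supported instance
resources:
  limits:
    vpc.amazonaws.com/efa: 1
```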

```shell
helm upgrade zeptodb ./deploy/helm/zeptodb \
  --set karpenter.enabled=true \
  --set performanceTuning.hostNetwork=false \
  --set performanceTuning.hugepages.enabled=true
```

After deployment, verify that the settings are correctly applied with the following commands:

```shell
# Check Pod QoS class (should be Guaranteed)
kubectl get pod -n zeptodb -o jsonpath='{.items[0].status.qosClass}'

# Check cpuset (dedicated cores should be assigned)
kubectl exec -n zeptodb <pod> -- cat /sys/fs/cgroup/cpuset.cpus

# Check hugepages
kubectl exec -n zeptodb <pod> -- grep -i huge /proc/meminfo

# Check NUMA
kubectl exec -n zeptodb <pod> -- numactl --show
```
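It is also worth confirming that CFS throttling is actually gone: with a static cpuset, the throttle counters in the pod's cgroup `cpu.stat` should stay at zero. A minimal sketch of evaluating that counter (sample output is inlined so the snippet runs anywhere; in practice read `/sys/fs/cgroup/cpu.stat` inside the pod via `kubectl exec`):

```shell
# nr_throttled should remain 0 for a Guaranteed pod with a dedicated cpuset
stat='nr_periods 1200
nr_throttled 0
throttled_usec 0'
awk '/^nr_throttled/ { if ($2 == 0) print "no throttling"; else print "throttled " $2 " times" }' <<<"$stat"
# → no throttling
```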

Properly tuned K8s environment vs bare-metal:

| Metric | Bare-metal | K8s (after tuning) | Difference |
| --- | --- | --- | --- |
| Ingestion (ring buffer) | 5.52M evt/s | ~5.4M evt/s | ~2% |
| Filter 1M rows (SIMD) | 272μs | ~275μs | ~1% |
| VWAP 1M rows (JIT) | 532μs | ~540μs | ~1.5% |
| HDB flush (instance store) | 4.8 GB/s | ~4.5 GB/s | ~6% |
| HDB flush (EBS gp3) | 4.8 GB/s | 125 MB/s | 38× slower |

SIMD, JIT, and lock-free structures are userspace operations, so they are barely affected by containers. The differences come from I/O and scheduling, and the tuning above resolves most of them.