Cloud Performance Tuning Guide
ZeptoDB relies heavily on hardware-level optimizations such as SIMD, NUMA-aware allocation, a lock-free ring buffer, and LLVM JIT compilation. The configuration below is required to achieve near bare-metal performance in Kubernetes and container environments.
Why Default K8s Settings Are Not Enough
Containers themselves add almost no overhead, because they are built on Linux cgroups and namespaces. The performance degradation comes from the default Kubernetes scheduling policies:
| Problem | Cause | Impact |
|---|---|---|
| CPU throttling | cgroup CFS quota interrupts busy-spin | ring buffer ingestion throughput drops sharply |
| NUMA ignored | Pod spans multiple NUMA nodes | Memory latency increases 2-3× |
| Disk I/O | EBS gp3 baseline throughput = 125 MB/s | HDB flush ~38× slower (vs. NVMe instance store) |
| THP jitter | Transparent Huge Pages compaction | Unpredictable latency spikes |
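To check whether the CPU throttling row above applies to a running container, the cgroup v2 `cpu.stat` file exposes `nr_periods` and `nr_throttled` counters. The following is a minimal sketch, assuming cgroup v2; `throttle_pct` is a hypothetical helper name, not part of ZeptoDB.

```shell
#!/bin/sh
# throttle_pct FILE: print the percentage of CFS periods that were throttled.
# FILE is a cgroup v2 cpu.stat file (inside a container this is usually
# /sys/fs/cgroup/cpu.stat). The helper name is illustrative only.
throttle_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", 100 * throttled / periods
      else             print "0.0"
    }
  ' "$1"
}
```

A value well above zero on a Pod that runs busy-spinning ingestion threads suggests the CFS quota is interrupting them.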
1. Guaranteed QoS (CPU Pinning)
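Setting requests equal to limits for every resource places the Pod in the Guaranteed QoS class. Combined with the static CPU manager policy (section 2) and an integer CPU count, kubelet then assigns the container a dedicated cpuset. This eliminates CFS throttling, so the lock-free ring buffer operates reliably.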

    resources:
      requests:
        cpu: "8"
        memory: "32Gi"
        hugepages-2Mi: "4Gi"
      limits:
        cpu: "8"            # requests == limits → Guaranteed QoS
        memory: "32Gi"
        hugepages-2Mi: "4Gi"

2. Kubelet Configuration (Node Level)
Configure kubelet flags in the launch template of EKS managed node groups or in Karpenter userData:

    # Additional kubelet flags
    --cpu-manager-policy=static                  # Dedicated CPU core allocation
    --topology-manager-policy=single-numa-node   # Enforce single NUMA node
    --memory-manager-policy=Static               # NUMA-local memory allocation as well
    --reserved-memory=0:memory=1Gi               # Flag syntax: <numaNode>:<type>=<size>

How to apply on EKS:

    #!/bin/bash
    # EKS managed node group: launch template userdata
    /etc/eks/bootstrap.sh my-cluster \
      --kubelet-extra-args '--cpu-manager-policy=static --topology-manager-policy=single-numa-node'

3. Kernel Tuning at Node Boot
The following settings are applied automatically through Karpenter userData (karpenter.realtime.userData in values.yaml):
| Tuning Item | Setting | Purpose |
|---|---|---|
| Hugepages | echo 8192 > /proc/sys/vm/nr_hugepages | Arena allocator performance |
| CPU governor | echo performance > scaling_governor | Consistent clock speed |
| NUMA balancing | echo 0 > numa_balancing | ZeptoDB manages NUMA directly |
| Swappiness | sysctl vm.swappiness=0 | Swap unnecessary for in-memory DB |
| THP | echo never > transparent_hugepage/enabled | Prevent latency spikes |
| Network | busy_poll=50, tcp_low_latency=1 | Reduce network latency |
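The table abbreviates several paths. As a hedged sketch of what such a boot script might look like (this is not the chart's actual userData), the following writes each value to its full procfs or sysfs path. `SYSROOT`, `set_knob`, and `apply_tuning` are hypothetical names introduced here so the script can be dry-run against a scratch directory; on a real node `SYSROOT` stays empty and the script must run as root.

```shell
#!/bin/sh
# Sketch of node-boot kernel tuning. SYSROOT is a hypothetical prefix for dry runs.
SYSROOT="${SYSROOT:-}"

# set_knob VALUE PATH: write VALUE to PATH if the knob exists and is writable.
set_knob() {
  if [ -w "$SYSROOT$2" ]; then
    echo "$1" > "$SYSROOT$2"
  else
    echo "skip: $2 not present or not writable" >&2
  fi
}

apply_tuning() {
  set_knob 8192  /proc/sys/vm/nr_hugepages                    # 8192 x 2 MiB = 16 GiB
  # The governor is set per CPU, so loop over every cpufreq directory.
  for gov in "$SYSROOT"/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    if [ -w "$gov" ]; then echo performance > "$gov"; fi
  done
  set_knob 0     /proc/sys/kernel/numa_balancing
  set_knob 0     /proc/sys/vm/swappiness
  set_knob never /sys/kernel/mm/transparent_hugepage/enabled
  set_knob 50    /proc/sys/net/core/busy_poll
  set_knob 1     /proc/sys/net/ipv4/tcp_low_latency           # legacy knob; no effect on recent kernels
}
```

Calling `apply_tuning` at the end of the userData script applies every setting that the running kernel exposes and logs the ones it skips.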
4. Storage: Instance Store vs EBS
Use instances with instance store when HDB flush performance matters:
| Storage | Throughput | Latency | Use Case |
|---|---|---|---|
| Instance store (i4i/i4g) | ~4 GB/s | ~100μs | HDB flush, WAL |
| EBS io2 Block Express | 4 GB/s | ~200μs | When persistent storage is needed |
| EBS gp3 (default) | 125 MB/s | ~1ms | Cost priority |
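To make the throughput gap in the table concrete, flush time scales inversely with throughput. The sketch below uses the table's nominal figures with decimal units (1 GB = 1000 MB). `flush_seconds` is a hypothetical helper, and the 32 GB flush size is an illustrative assumption rather than a ZeptoDB default.

```shell
#!/bin/sh
# flush_seconds SIZE_GB THROUGHPUT_MB_S: seconds needed to write SIZE_GB at that rate.
flush_seconds() {
  awk -v gb="$1" -v mbps="$2" 'BEGIN { printf "%.0f\n", gb * 1000 / mbps }'
}

flush_seconds 32 4000   # ~4 GB/s (instance store or io2 Block Express) → 8
flush_seconds 32 125    # EBS gp3 at 125 MB/s → 256
```

At these nominal rates the same flush takes 8 seconds on NVMe-class storage and over four minutes on gp3.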

    # values.yaml: Karpenter realtime pool
    karpenter:
      realtime:
        instanceFamilies: ["i4g", "c7g"]
        instanceStorePolicy: "RAID0"   # Automatic RAID0 for NVMe instance store

⚠️ Instance store data is lost when the node terminates. WAL replication must be enabled.
5. Network: hostNetwork
When data nodes exchange heavy RPC traffic in cluster mode, you can bypass CNI overhead by enabling hostNetwork:

    performanceTuning:
      hostNetwork: false   # Change to true to bypass CNI

Considerations when using hostNetwork:
- Pods share the host network namespace, so watch out for port conflicts
- dnsPolicy: ClusterFirstWithHostNet is automatically configured
- RPC port (8223) must be allowed in security groups
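Because a hostNetwork Pod binds directly on the node, it can help to confirm that the RPC port is free before enabling the setting. This is a minimal sketch, assuming the iproute2 `ss` tool is available on the node. `port_in_use` is a hypothetical helper that reads `ss -ltn` output on stdin, so it can be exercised without a live socket.

```shell
#!/bin/sh
# port_in_use PORT: exit 0 if stdin (output of `ss -ltn`) shows a listener on PORT.
port_in_use() {
  awk -v port="$1" '
    NR > 1 {                          # skip the header row
      n = split($4, parts, ":")       # $4 is Local Address:Port
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on a node:
#   ss -ltn | port_in_use 8223 && echo "port 8223 already in use"
```

Taking the last colon-separated field handles both IPv4 (`0.0.0.0:8223`) and IPv6 (`[::]:8223`) listeners.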
6. RDMA / UCX
Standard RDMA is not available in AWS VPCs. Use AWS EFA (Elastic Fabric Adapter) instead:
- EFA-supported instances: hpc7g, p5.48xlarge, trn1.32xlarge, etc.
- EFA device plugin DaemonSet installation required
- For most workloads, TCP + busy_poll tuning is sufficient
How to Apply
Helm Deployment

    helm upgrade zeptodb ./deploy/helm/zeptodb \
      --set karpenter.enabled=true \
      --set performanceTuning.hostNetwork=false \
      --set performanceTuning.hugepages.enabled=true

Performance Verification
After deployment, verify that the settings are correctly applied with the following commands:

    # Check Pod QoS class (should be Guaranteed)
    kubectl get pod -n zeptodb -o jsonpath='{.items[0].status.qosClass}'

    # Check cpuset (dedicated cores should be assigned)
    kubectl exec -n zeptodb <pod> -- cat /sys/fs/cgroup/cpuset.cpus

    # Check hugepages
    kubectl exec -n zeptodb <pod> -- cat /proc/meminfo | grep -i huge

    # Check NUMA
    kubectl exec -n zeptodb <pod> -- numactl --show

Performance Comparison Summary
A properly tuned K8s environment compared with bare metal:
| Metric | Bare-metal | K8s (after tuning) | Difference |
|---|---|---|---|
| Ingestion (ring buffer) | 5.52M evt/s | ~5.4M evt/s | ~2% |
| Filter 1M rows (SIMD) | 272μs | ~275μs | ~1% |
| VWAP 1M rows (JIT) | 532μs | ~540μs | ~1.5% |
| HDB flush (instance store) | 4.8 GB/s | ~4.5 GB/s | ~6% |
| HDB flush (EBS gp3) | 4.8 GB/s | 125 MB/s | 38× slower |
SIMD, JIT, and lock-free structures are userspace operations, so they are barely affected by containers. The differences come from I/O and scheduling, and the tuning above resolves most of them.
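The Difference column can be reproduced from the two measurement columns. The following is a small sketch for checking your own benchmark runs against the bare-metal baseline; `overhead_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# overhead_pct BASELINE TUNED: absolute difference as a percentage of BASELINE.
overhead_pct() {
  awk -v base="$1" -v tuned="$2" 'BEGIN {
    d = (tuned - base) / base * 100
    if (d < 0) d = -d
    printf "%.1f\n", d
  }'
}

overhead_pct 5.52 5.4   # ingestion, M evt/s → 2.2
overhead_pct 272 275    # SIMD filter, μs    → 1.1
```

The helper takes the absolute value so that it works for both throughput metrics (higher is better) and latency metrics (lower is better).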