Skip to content

ZeptoDB Production Deployment Guide

Choose bare metal or cloud-native deployment based on your workload.

Last Updated: 2026-03-22


WorkloadLatency RequirementRecommendedReason
HFT tick processing< 100usBare metalContainer overhead is fatal
Market data feed< 500usBare metalNetwork latency critical
Real-time risk< 1msBare metal recommendedStability critical
Quant backtesting> 10msCloud OKCost-effective
Batch analytics> 100msCloud OKElasticity important
Development/testingAnyCloud recommendedFast provisioning

Option 1: Bare Metal Deployment (HFT Edition)

Section titled “Option 1: Bare Metal Deployment (HFT Edition)”
  • HFT trading firms
  • Market data providers
  • Real-time risk systems

Hardware:

  • CPU: Intel Xeon (Skylake+) or AMD EPYC (Zen 3+)
    • Minimum 16 cores (realtime 4 + analytics 4 + system 8)
    • Recommended: 32 cores+
  • RAM: Minimum 64GB, recommended 256GB+
    • Hugepages support
    • ECC memory recommended
  • Network: 10GbE or higher
    • RDMA support (Mellanox ConnectX-5+) recommended
    • Kernel bypass (DPDK) support
  • Storage: NVMe SSD
    • Minimum 1TB for HDB
    • io_uring support (Kernel 5.10+)

Operating System:

  • RHEL 8.6+ / Rocky Linux 8+
  • Ubuntu 22.04 LTS+
  • Kernel 5.10+ (io_uring, nohz_full)
Terminal window
# Edit /etc/default/grub
GRUB_CMDLINE_LINUX="isolcpus=0-3 nohz_full=0-3 rcu_nocbs=0-3 \
transparent_hugepage=never intel_pstate=disable \
default_hugepagesz=2M hugepagesz=2M hugepages=16384 \
processor.max_cstate=1 intel_idle.max_cstate=0"
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot

/opt/zeptodb/tune_bare_metal.sh:

#!/bin/bash
# ZeptoDB Bare Metal Optimization Script
set -e
echo "=== ZeptoDB Bare Metal Tuning ==="
# 1. CPU Governor: performance mode
echo "Setting CPU governor to performance..."
for cpu in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
[ -f "$cpu" ] && echo performance > $cpu
done
# 2. Disable Turbo Boost (consistent latency)
echo "Disabling Turbo Boost..."
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo 2>/dev/null || true
# 3. Hugepages: 32GB (16,384 x 2MB pages)
echo "Configuring Hugepages (32GB)..."
echo 16384 > /proc/sys/vm/nr_hugepages
# 4. IRQ Affinity: move interrupts to system cores (8-15)
echo "Setting IRQ affinity to cores 8-15..."
for irq in $(grep -E 'eth|mlx' /proc/interrupts | cut -d: -f1 | tr -d ' '); do
echo ff00 > /proc/irq/$irq/smp_affinity 2>/dev/null || true
done
# 5. Network stack tuning (low latency)
echo "Tuning network stack..."
sysctl -w net.core.busy_poll=50
sysctl -w net.core.busy_read=50
sysctl -w net.ipv4.tcp_low_latency=1
sysctl -w net.core.netdev_max_backlog=10000
sysctl -w net.core.rmem_max=134217728
sysctl -w net.core.wmem_max=134217728
# 6. Disable NUMA balancing (explicit control)
echo "Disabling NUMA balancing..."
echo 0 > /proc/sys/kernel/numa_balancing
# 7. Minimize swappiness
echo "Setting swappiness to 0..."
sysctl -w vm.swappiness=0
# 8. Limit C-states (consistent latency)
echo "Limiting C-states..."
for cpu in /sys/devices/system/cpu/cpu*/cpuidle/state*/disable; do
[ -f "$cpu" ] && echo 1 > $cpu 2>/dev/null || true
done
echo "=== Tuning complete ==="
echo ""
echo "Verify with:"
echo " - cat /proc/cmdline"
echo " - numactl --hardware"
echo " - cat /proc/meminfo | grep Huge"
echo " - cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"
Terminal window
# Install dependencies
sudo dnf install -y clang19 clang19-devel llvm19-devel \
highway-devel numactl-devel ucx-devel lz4-devel \
liburing-devel ninja-build cmake
# Download source
git clone https://github.com/your-org/zeptodb.git
cd zeptodb
# Bare metal optimized build
mkdir -p build && cd build
cmake .. \
-G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=clang-19 \
-DCMAKE_CXX_COMPILER=clang++-19 \
-DCMAKE_CXX_FLAGS="-march=native -mtune=native -O3" \
-DAPEX_BARE_METAL=ON \
-DAPEX_USE_HUGEPAGES=ON \
-DAPEX_USE_IO_URING=ON \
-DAPEX_USE_RDMA=ON
ninja -j$(nproc)
# Run tuning script
sudo ../deploy/scripts/tune_bare_metal.sh
Terminal window
# Single NUMA node
sudo numactl --cpunodebind=0 --membind=0 \
taskset -c 0-3 \
./zepto_server \
--port 8123 \
--realtime-cores 0-3 \
--analytics-cores 4-7 \
--hugepages
# Multi-NUMA node (one instance per node)
# Node 0 (realtime)
sudo numactl --cpunodebind=0 --membind=0 \
taskset -c 0-3 \
./zepto_server --port 8123 --node-id 0 --role realtime &
# Node 1 (analytics)
sudo numactl --cpunodebind=1 --membind=1 \
taskset -c 16-19 \
./zepto_server --port 8124 --node-id 1 --role analytics &
Terminal window
# Check CPU affinity
taskset -p $(pidof zepto_server)
# Check NUMA memory usage
numastat -p $(pidof zepto_server)
# Check Hugepages usage
grep Huge /proc/meminfo
# Latency profiling
sudo perf record -F 999 -a -g -- sleep 30
sudo perf script | flamegraph.pl > zepto_latency.svg
# Continuous monitoring
watch -n 1 'cat /proc/$(pidof zepto_server)/status | grep VmHWM'
MetricBare Metal (Optimized)Target
filter 1M250us< 300us
VWAP 1M500us< 600us
Ingestion6.0M ticks/sec> 5.5M
p99 latency550us< 600us
Jitter (p99.9 - p50)+/-5us< 10us

Option 2: Cloud-Native Deployment (Analytics Edition)

Section titled “Option 2: Cloud-Native Deployment (Analytics Edition)”
  • Quant research teams
  • Risk analytics
  • Batch backtesting
  • Development/test environments
+─────────────────────────────────────+
│ Load Balancer (ALB) │
+────────────+────────────────────────+
|
+────────+─────────+
| |
+───v────+ +─────v───+
│ Pod 1 │ │ Pod 2 │ ... (Auto-scaling)
│ APEX │ │ APEX │
+───+────+ +────+────+
| |
+────────+───────+
|
+────────v─────────+
│ EBS (HDB Storage)│
+──────────────────+

Cloud infrastructure:

  • AWS: c6i.4xlarge+ (16 vCPU, 32GB RAM)
  • GCP: c2-standard-16+ (16 vCPU, 64GB RAM)
  • Azure: F16s v2+ (16 vCPU, 32GB RAM)

Kubernetes:

  • K8s 1.26+
  • CNI: Calico or Cilium
  • Storage: EBS gp3, Persistent Disk SSD

deploy/docker/Dockerfile:

# Multi-stage build
FROM clang:19 AS builder
# Install dependencies
RUN apt-get update && apt-get install -y \
cmake ninja-build \
libhighway-dev libnuma-dev liblz4-dev \
liburing-dev && \
rm -rf /var/lib/apt/lists/*
WORKDIR /build
COPY . .
# Cloud-optimized build
RUN mkdir -p build && cd build && \
cmake .. \
-G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_FLAGS="-march=x86-64-v3 -O3" \
-DAPEX_CLOUD_NATIVE=ON \
-DAPEX_USE_HUGEPAGES=OFF \
-DAPEX_USE_RDMA=OFF && \
ninja -j$(nproc)
# Runtime image
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y \
libhighway1 libnuma1 liblz4-1 liburing2 \
curl ca-certificates && \
rm -rf /var/lib/apt/lists/*
WORKDIR /opt/zeptodb
COPY --from=builder /build/build/zepto_server .
COPY --from=builder /build/build/*.so .
# Non-root user
RUN useradd -r -u 1000 zeptodb && \
chown -R zeptodb:zeptodb /opt/zeptodb
USER zeptodb
EXPOSE 8123
# Health check
HEALTHCHECK --interval=10s --timeout=3s --start-period=30s \
CMD curl -f http://localhost:8123/health || exit 1
ENTRYPOINT ["./zepto_server"]
CMD ["--port", "8123"]

Build:

Terminal window
docker build -t zeptodb:latest .
docker tag zeptodb:latest your-registry/zeptodb:v1.0.0
docker push your-registry/zeptodb:v1.0.0

deploy/k8s/deployment.yaml:

apiVersion: v1
kind: Namespace
metadata:
name: zeptodb
---
apiVersion: v1
kind: ConfigMap
metadata:
name: zepto-config
namespace: zeptodb
data:
zeptodb.conf: |
port: 8123
worker_threads: 8
analytics_mode: true
cloud_native: true
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: zeptodb
namespace: zeptodb
spec:
replicas: 3
selector:
matchLabels:
app: zeptodb
template:
metadata:
labels:
app: zeptodb
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- zeptodb
topologyKey: kubernetes.io/hostname
containers:
- name: zeptodb
image: your-registry/zeptodb:v1.0.0
ports:
- containerPort: 8123
name: http
protocol: TCP
resources:
requests:
cpu: "4"
memory: "16Gi"
limits:
cpu: "8"
memory: "32Gi"
env:
- name: APEX_WORKER_THREADS
value: "8"
- name: APEX_ANALYTICS_MODE
value: "true"
volumeMounts:
- name: config
mountPath: /opt/zeptodb/config
- name: data
mountPath: /opt/zeptodb/data
livenessProbe:
httpGet:
path: /health
port: 8123
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 8123
initialDelaySeconds: 10
periodSeconds: 5
volumes:
- name: config
configMap:
name: zepto-config
- name: data
persistentVolumeClaim:
claimName: zeptodb-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: zeptodb-pvc
namespace: zeptodb
spec:
accessModes:
- ReadWriteOnce
storageClassName: gp3 # AWS EBS gp3
resources:
requests:
storage: 500Gi
---
apiVersion: v1
kind: Service
metadata:
name: zeptodb-service
namespace: zeptodb
spec:
type: LoadBalancer
selector:
app: zeptodb
ports:
- port: 8123
targetPort: 8123
protocol: TCP
name: http
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: zeptodb-hpa
namespace: zeptodb
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: zeptodb
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80

Deploy:

Terminal window
kubectl apply -f deploy/k8s/deployment.yaml
# Verify
kubectl get pods -n zeptodb
kubectl get svc -n zeptodb
# Check logs
kubectl logs -f deployment/zeptodb -n zeptodb

helm/zeptodb/values.yaml:

replicaCount: 3
image:
repository: your-registry/zeptodb
tag: v1.0.0
pullPolicy: IfNotPresent
resources:
requests:
cpu: 4
memory: 16Gi
limits:
cpu: 8
memory: 32Gi
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
persistence:
enabled: true
storageClass: gp3
size: 500Gi
config:
workerThreads: 8
analyticsMode: true
cloudNative: true
monitoring:
prometheus:
enabled: true
grafana:
enabled: true

Install:

Terminal window
helm install zeptodb ./deploy/helm/zeptodb \
--namespace zeptodb \
--create-namespace \
--values values-prod.yaml

deploy/k8s/monitoring.yaml:

apiVersion: v1
kind: ServiceMonitor
metadata:
name: zeptodb-metrics
namespace: zeptodb
spec:
selector:
matchLabels:
app: zeptodb
endpoints:
- port: http
path: /metrics
interval: 15s
---
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboard
namespace: zeptodb
data:
zeptodb.json: |
{
"dashboard": {
"title": "ZeptoDB Analytics",
"panels": [
{
"title": "Query Latency (p95)",
"targets": [{"expr": "histogram_quantile(0.95, zepto_query_duration_seconds)"}]
},
{
"title": "Throughput (queries/sec)",
"targets": [{"expr": "rate(zepto_queries_total[1m])"}]
},
{
"title": "Memory Usage",
"targets": [{"expr": "zepto_memory_bytes / 1024 / 1024 / 1024"}]
}
]
}
}
MetricCloud (c6i.4xlarge)Target
filter 1M350us< 500us
VWAP 1M650us< 800us
Ingestion4.5M ticks/sec> 4M
p99 latency800us< 1ms
Throughput5K queries/sec> 3K

ItemBare MetalCloudDifference
Initial costHigh (server purchase)Low (per-hour billing)10x
Operational complexityHigh (manual tuning)Low (automated)3x
Latency p50250us350us+40%
Latency p99550us800us+45%
Jitter+/-5us+/-30us6x worse
Throughput6M/sec4.5M/sec-25%
ScalabilityManual (add servers)Auto (HPA)Auto
Cost (annual)$100K (fixed)$50K-$200K (variable)Usage-dependent
Suitable workloadsHFT, real-timeAnalytics, backtesting

Development -> Docker local test -> K8s staging -> Validation

Option A: Hybrid

  • HFT: Bare metal (on-premises)
  • Backtesting: Cloud (AWS)

Option B: Full Cloud

  • All workloads: K8s
  • Latency critical: Dedicated nodes (c6i.metal)

Option C: Full Bare Metal

  • All workloads: On-premises
  • High operational complexity

  • Hardware specs verified (CPU, RAM, Network)
  • Kernel parameters set (isolcpus, hugepages)
  • Tuning script run (tune_bare_metal.sh)
  • NUMA topology verified
  • Benchmarks run (target latency achieved)
  • Docker image built and pushed
  • K8s cluster ready
  • Persistent Volume configured
  • Monitoring set up (Prometheus, Grafana)
  • Auto-scaling policy configured


Next documents: