Upgrading an in-memory database in production is nerve-wracking. Losing a pod means losing cached state. ZeptoDB’s Helm chart is designed around one principle: never reduce capacity during a rollout.
The chart lives in helm/zeptodb/ and converts the previous monolithic deployment.yaml into parameterized templates:
| Template | Purpose |
|---|---|
| deployment.yaml | Rolling update strategy, pod anti-affinity, config checksum |
| service.yaml | LoadBalancer + headless service |
| configmap.yaml | Server configuration from values |
| pvc.yaml | Persistent storage (conditional) |
| hpa.yaml | Horizontal Pod Autoscaler (conditional) |
| pdb.yaml | PodDisruptionBudget (conditional) |
| servicemonitor.yaml | Prometheus Operator integration (conditional) |
A single values.yaml drives both standalone and distributed mode via the cluster.enabled toggle. Cluster ports (RPC, heartbeat) are only exposed when clustering is on.
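As a rough illustration of that toggle, a values file might look like the sketch below. Only image.tag, replicaCount, and cluster.enabled appear in this post; the port keys and numbers are hypothetical placeholders, not the chart's actual schema.

```yaml
# values.yaml (illustrative sketch; every key other than image.tag,
# replicaCount, and cluster.enabled is an assumption)
replicaCount: 3

image:
  tag: v1.2.0

cluster:
  enabled: false        # set to true for distributed mode
  # Hypothetical port settings, rendered only when cluster.enabled is true
  rpcPort: 9100
  heartbeatPort: 9101
```

Keeping both modes in one file means the same templates are exercised either way, so a standalone test deployment still covers most of what production renders.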
The core of the rolling update configuration:
```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
```

maxUnavailable: 0 is the key. Kubernetes must bring up a new pod and confirm it’s ready before terminating an old one. Combined with preStop sleep, this ensures in-flight queries complete before the old pod shuts down.
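The readiness check and preStop sleep mentioned above live in the pod template. The excerpt below is a minimal sketch of how they might be wired together; the probe path, port, container name, and durations are assumed values for illustration, not taken from the chart.

```yaml
# Pod template excerpt (sketch). Probe endpoint, port, and timings are
# hypothetical; adjust them to the actual server.
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: zeptodb
      readinessProbe:
        httpGet:
          path: /health     # hypothetical health endpoint
          port: 8080        # hypothetical HTTP port
        periodSeconds: 5
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent, giving the Service time to
            # stop routing to this pod while in-flight queries finish.
            command: ["sleep", "10"]
```

The sleep counts against terminationGracePeriodSeconds, so it has to be shorter than the grace period, and the exec form assumes the image ships a sleep binary.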
maxSurge: 1 means at most one extra pod runs during a rollout, which keeps resource usage controlled.

Rolling updates are not the only source of disruption. Node drains are covered by a PodDisruptionBudget:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zeptodb
```

Without a PDB, kubectl drain during node maintenance can kill quorum. With minAvailable: 2, Kubernetes blocks eviction if it would drop below two healthy pods. This is critical for in-memory databases where state loss has real consequences.
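One detail worth checking is the relationship between minAvailable and the replica count. The manifest below is a sketch of a complete PDB, assuming three replicas; metadata.name is a hypothetical name added so the manifest can be applied on its own.

```yaml
# Complete PodDisruptionBudget manifest (sketch).
# metadata.name is hypothetical; the selector must match the pod labels.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zeptodb
spec:
  minAvailable: 2        # with 3 replicas, one pod may be evicted at a time
  selector:
    matchLabels:
      app: zeptodb
```

If the replica count equals minAvailable (for example, two replicas with minAvailable: 2), every eviction is refused and kubectl drain keeps retrying until it times out, so the budget should leave room for at least one pod to move.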
Kubernetes doesn’t restart pods when a ConfigMap changes. This is a common source of “I changed the config but nothing happened” issues.
The Helm chart solves this with a checksum annotation:
```yaml
template:
  metadata:
    annotations:
      checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```

When the ConfigMap content changes, the checksum changes, which triggers a rolling restart. Config-only updates get the same zero-downtime treatment as image upgrades.
```bash
helm upgrade zeptodb helm/zeptodb/ \
  --set image.tag=v1.2.0 \
  --wait --timeout 5m
```

Use this for routine releases. Helm handles the rolling update automatically.

For a config-only change, edit values.yaml and run helm upgrade. The checksum annotation detects the ConfigMap change and triggers a rolling restart, with no image change needed.
For high-risk changes, deploy a separate Helm release:
```bash
# Deploy canary alongside production
helm install zeptodb-canary helm/zeptodb/ \
  --set image.tag=v2.0.0-rc1 \
  --set replicaCount=1

# Test canary, then promote
helm upgrade zeptodb helm/zeptodb/ --set image.tag=v2.0.0
helm uninstall zeptodb-canary
```

For cluster-mode upgrades, extend the grace period to allow for WAL replay and state recovery:
```bash
helm upgrade zeptodb helm/zeptodb/ \
  --set image.tag=v1.2.0 \
  --set cluster.enabled=true \
  --set terminationGracePeriodSeconds=120 \
  --wait --timeout 10m
```

If an upgrade misbehaves, roll back:

```bash
# Check revision history
helm history zeptodb

# Roll back to revision 3 (omit the number to return to the previous revision)
helm rollback zeptodb 3
```

Helm maintains revision history, so rollback is a single command. The same zero-downtime rolling update strategy applies in reverse.
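The --set flags in the cluster-mode upgrade above can also be kept in a values file, which makes the upgrade easier to review and repeat. The sketch below uses only keys that appear in those flags; the file name values-cluster.yaml is a hypothetical choice.

```yaml
# values-cluster.yaml (hypothetical file name)
# Mirrors the --set flags from the cluster-mode upgrade command.
image:
  tag: v1.2.0

cluster:
  enabled: true

# Extra time for WAL replay and state recovery on shutdown
terminationGracePeriodSeconds: 120
```

It would be passed with Helm's -f flag in place of the individual --set arguments, for example helm upgrade zeptodb helm/zeptodb/ -f values-cluster.yaml --wait --timeout 10m.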
Zero downtime
maxUnavailable: 0 ensures a new pod is ready before the old one terminates. No capacity reduction during rollout.
PDB protection
PodDisruptionBudget prevents node drain from killing quorum. minAvailable: 2 guarantees service continuity.
Auto config reload
ConfigMap checksum annotation triggers rolling restart on config changes — no manual pod deletion needed.
Instant rollback
Helm revision history enables single-command rollback with the same zero-downtime guarantees.