ZeptoDB’s HTTP server had zero request-level logging. No way to trace individual requests, identify slow queries, or correlate client-side errors with server-side events. This post covers the observability layer that fixes all of that.
Every HTTP request produces a structured JSON log entry:

    {
      "request_id": "r0001a3",
      "method": "POST",
      "path": "/",
      "status": 200,
      "duration_us": 532,
      "request_bytes": 42,
      "response_bytes": 1024,
      "remote_addr": "10.0.1.5",
      "subject": "algo-service"
    }

Emitted via zeptodb::util::Logger (async JSON, rotating file). Log level is determined by status code:
| Status Range | Log Level |
|---|---|
| 2xx, 3xx | INFO |
| 4xx | WARN |
| 5xx | ERROR |
Component tag: "http". This makes it trivial to filter access logs from other server events in log aggregation tools.
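The mapping in the table is small enough to sketch in a few lines. This is an illustrative Python version, not ZeptoDB's C++ code: the function name `level_for_status` is hypothetical, and sending codes outside the table (for example 1xx) to INFO is an assumption the post does not cover.

```python
def level_for_status(status: int) -> str:
    """Map an HTTP status code to a log level, following the table above."""
    if 500 <= status <= 599:
        return "ERROR"
    if 400 <= status <= 499:
        return "WARN"
    # 2xx and 3xx are INFO per the table. Anything else (e.g. 1xx) also
    # falls through to INFO here, which is an assumption, not documented
    # behavior.
    return "INFO"
```

With a mapping like this, alerting on the access log reduces to filtering on level rather than parsing status codes in every dashboard query.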
Queries exceeding 100ms (or returning errors) get a dedicated log entry:

    {
      "query_id": "q_a1b2c3",
      "subject": "algo-service",
      "duration_us": 150234,
      "rows": 50000,
      "ok": true,
      "sql": "SELECT vwap(price, volume) FROM trades WHERE ..."
    }

SQL is truncated to 200 characters for log safety, so a large query cannot produce a multi-megabyte log entry. Component tag: "query".
This is the fastest way to find performance problems in production. Sort by duration_us, and the worst offenders surface immediately.
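As a rough sketch of both halves of this, the Python below first decides whether a query earns a slow-query entry (over 100 ms, or failed) and truncates its SQL to 200 characters, then ranks parsed entries by `duration_us`. The function names are hypothetical, the log is assumed to hold one JSON object per line, and plain truncation without a marker is an assumption; none of this is ZeptoDB's actual implementation.

```python
import json

SLOW_THRESHOLD_US = 100_000  # 100 ms, the threshold stated in the post
MAX_SQL_CHARS = 200          # truncation limit stated in the post


def slow_query_entry(query_id, subject, duration_us, rows, ok, sql):
    """Return a slow-query log entry, or None if the query is not logged."""
    # Fast, successful queries are skipped; slow or failed ones are kept.
    if ok and duration_us <= SLOW_THRESHOLD_US:
        return None
    return {
        "query_id": query_id,
        "subject": subject,
        "duration_us": duration_us,
        "rows": rows,
        "ok": ok,
        "sql": sql[:MAX_SQL_CHARS],  # plain cut, no marker (assumption)
    }


def worst_offenders(log_lines, limit=10):
    """Parse JSON-lines slow-query entries and return the slowest first."""
    entries = [json.loads(line) for line in log_lines if line.strip()]
    entries.sort(key=lambda e: e["duration_us"], reverse=True)
    return entries[:limit]
```

Feeding `worst_offenders` the lines of the slow-query log gives the same view as sorting by `duration_us` by hand, capped at the top `limit` entries.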
Every HTTP response includes a unique request identifier:

    HTTP/1.1 200 OK
    X-Request-Id: r0001a3
    Content-Type: application/json

The ID uses a monotonic counter (r<hex>), ensuring uniqueness within a process. Clients can log this value and use it to correlate their errors with server-side access log entries.
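A monotonic-counter ID scheme of this shape might look like the following Python sketch. The class name `RequestIdGenerator` is hypothetical, and zero-padding to six hex digits is a guess based on the sample value `r0001a3`; the post only specifies the `r<hex>` form.

```python
import itertools
import threading


class RequestIdGenerator:
    """Process-local request IDs of the form r<hex>, from a monotonic counter."""

    def __init__(self, start: int = 1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_id(self) -> str:
        # The lock keeps increments ordered when many threads ask for IDs.
        with self._lock:
            value = next(self._counter)
        # Six-digit zero padding is inferred from the sample "r0001a3".
        return f"r{value:06x}"
```

Because a counter like this restarts with the process, an ID is only unique within one process lifetime. When searching logs that span restarts or several servers, it helps to narrow by time or host as well.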
Typical debugging workflow:

    Client log:  "Query failed, request_id=r0001a3"
        ↓
    Server log:  grep "r0001a3" /var/log/zeptodb/access.json
        ↓
    Result:      {"request_id":"r0001a3","status":500,"duration_us":30012,...}

Startup and shutdown are logged as structured events:

    {"event": "server_start", "port": 8123, "tls": false, "auth": true, "async": true}
    {"event": "server_stop", "port": 8123}

These are essential for operations: they record exactly when a server started, with what configuration, and when it stopped.
Two metrics are exposed for monitoring dashboards:
| Metric | Type | Description |
|---|---|---|
| zepto_http_requests_total | Counter | Total HTTP requests served |
| zepto_http_active_sessions | Gauge | Current active sessions |
These integrate with the existing Prometheus ServiceMonitor in the Helm chart. Combined with the access log, you get both real-time dashboards and detailed per-request forensics.
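For the per-request forensics side, the grep step from the debugging workflow can also be scripted. The sketch below is a hypothetical Python helper (`find_request` is not part of ZeptoDB) and assumes the access log holds one JSON object per line.

```python
import json


def find_request(log_lines, request_id):
    """Return every access-log entry whose request_id matches."""
    matches = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines instead of failing
        if isinstance(entry, dict) and entry.get("request_id") == request_id:
            matches.append(entry)
    return matches
```

Passing it an open file handle for `/var/log/zeptodb/access.json` and the ID from a client error returns the matching entries as dictionaries. It returns a list rather than a single entry because IDs are only unique within a process, so a log that spans restarts could contain the same ID more than once.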

    HTTP Request
      │
      ├─→ Generate X-Request-Id (monotonic counter)
      │
      ├─→ Execute handler (query, admin, health, etc.)
      │
      ├─→ Measure duration
      │
      ├─→ Access log entry (util::Logger, async JSON)
      │     └─→ Log level based on status code
      │
      ├─→ Slow query log (if duration > 100ms or error)
      │
      ├─→ Prometheus counter increment
      │
      └─→ Response with X-Request-Id header

The logging is async: util::Logger buffers entries and writes them in a background thread, so the request hot path never blocks on log I/O.
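The buffer-and-background-thread pattern can be illustrated with a short Python sketch. This is not how util::Logger is implemented (that code is C++ and not shown in this post). The class `AsyncJsonLogger`, its methods, and the `level` and `component` field names are all hypothetical, and file rotation is left out.

```python
import json
import queue
import threading


class AsyncJsonLogger:
    """Queue entries on the caller's thread; serialize and write on a worker."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, sink):
        self._sink = sink              # any object with a write(str) method
        self._queue = queue.Queue()    # unbounded, so put() does not block
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, level, component, **fields):
        # Only an enqueue happens here; JSON encoding and I/O run on the worker.
        self._queue.put({"level": level, "component": component, **fields})

    def close(self):
        # Entries queued before the sentinel are written before the worker exits.
        self._queue.put(self._STOP)
        self._worker.join()

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                return
            self._sink.write(json.dumps(item) + "\n")
```

A request handler would call something like `logger.log("INFO", "http", request_id="r0001a3", status=200)` and return immediately. The trade-off in this sketch is the unbounded queue: a slow sink shows up as memory growth rather than request latency, so a production logger would typically cap the buffer or drop entries under pressure.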
Structured JSON logs
Every request logged as JSON with request ID, duration, status, and client identity. Machine-parseable, grep-friendly.
Slow query detection
Queries over 100ms automatically logged with SQL, duration, and row count. Sort by duration to find bottlenecks.
Request tracing
X-Request-Id in every response. Clients log it, operators grep for it. End-to-end correlation in seconds.
Prometheus metrics
Request counter and active session gauge for real-time dashboards. Integrates with existing Helm ServiceMonitor.