
Multi-Node Cluster Guide

ZeptoDB multi-node cluster configuration and operations guide.


```
┌─────────────────────┐      TCP RPC       ┌──────────────────┐
│ Coordinator (HTTP)  │◄──────────────────►│ Data Node 1      │
│ zepto_http_server   │                    │ zepto_data_node  │
│ port 8123           │      TCP RPC       │ port 9001        │
│ Web UI + API        │◄──────────────────►├──────────────────┤
│ node_id=0           │                    │ Data Node 2      │
│                     │                    │ zepto_data_node  │
│                     │                    │ port 9002        │
└─────────────────────┘                    └──────────────────┘
```
  • Coordinator (zepto_http_server): Provides HTTP API + Web UI, distributes queries via scatter-gather
  • Data Node (zepto_data_node): Executes queries via TCP RPC, stores data
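The scatter-gather flow can be sketched as follows. This is a hypothetical illustration, not ZeptoDB's actual RPC code; `query_node` and the merge step are stand-ins for the coordinator's TCP RPC call and result aggregation:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the coordinator's TCP RPC call to one data node.
# In a real deployment this would go over the node's RPC port (e.g. 9001, 9002).
def query_node(node, query):
    # Pretend each node returns a partial result set for the query.
    return node["partial_results"]

def scatter_gather(nodes, query):
    """Fan the query out to every data node in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda n: query_node(n, query), nodes))
    merged = []
    for part in partials:
        merged.extend(part)
    # A real coordinator would also apply global sort/limit/aggregation here.
    return sorted(merged)

nodes = [
    {"id": 9001, "partial_results": [3, 1]},
    {"id": 9002, "partial_results": [2, 4]},
]
print(scatter_gather(nodes, "SELECT ..."))  # [1, 2, 3, 4]
```

The key point is that each data node only sees its own shard; the coordinator owns the final merge.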

```shell
./zepto_http_server [options]
```

| Option | Default | Description |
| --- | --- | --- |
| --port, -p | 8123 | HTTP listening port |
| --ticks, -n | 10000 | Number of sample ticks to generate at startup |
| --no-auth | false | Disable authentication |
| --add-node id:host:port | — | Register a remote data node (can be used multiple times) |
| --node-id | 0 | Cluster ID for this node |
| --log-level | info | Log level (debug/info/warn/error) |

The Coordinator is always active. If there are no data nodes, it operates in standalone mode. When nodes are added via --add-node or POST /admin/nodes, it automatically switches to cluster mode.

```shell
./zepto_data_node <port> [num_ticks] [options]
```

| Argument | Default | Description |
| --- | --- | --- |
| port | (required) | TCP RPC listening port |
| num_ticks | 0 | Number of sample ticks to generate at startup |
| --node-id | port number | Node ID within the cluster |
| --symbol | 1 | symbol_id for sample data |
| --coordinator host:port | — | Auto-register with the Coordinator |
| --api-key | — | Admin API key to use for registration |
| --advertise-host | localhost | Host address to advertise to the Coordinator |

1. Build

```shell
cd build
ninja zepto_http_server zepto_data_node
```

2. Method A: Start Coordinator first → Data Nodes auto-register

```shell
# Terminal 1 — Coordinator
./zepto_http_server --port 8123
# The admin API key will be printed → copy it
# Terminal 2 — Data Node (auto-register)
./zepto_data_node 9001 --coordinator localhost:8123 --api-key $ADMIN_KEY
# Terminal 3 — Data Node (auto-register)
./zepto_data_node 9002 --coordinator localhost:8123 --api-key $ADMIN_KEY
```
3. Method B: Start Data Nodes first → connect from the Coordinator

```shell
# Terminals 1~2 — Start Data Nodes first
./zepto_data_node 9001
./zepto_data_node 9002
# Terminal 3 — Coordinator (connect to nodes)
./zepto_http_server --port 8123 \
  --add-node 9001:localhost:9001 \
  --add-node 9002:localhost:9002
```

Open http://localhost:8123/cluster in a browser (the Web UI is served on the coordinator's HTTP port) → 3 nodes should be displayed (coordinator + 2 data nodes).


Even after the cluster has started, you can dynamically add/remove nodes via the REST API.

Add a node:

```shell
# Start a new data node
./zepto_data_node 9003
# Register with the Coordinator
curl -X POST http://localhost:8123/admin/nodes \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -d '{"id":9003,"host":"localhost","port":9003}'
```
Remove a node:

```shell
curl -X DELETE http://localhost:8123/admin/nodes/9003 \
  -H "Authorization: Bearer $ADMIN_KEY"
```
List registered nodes:

```shell
curl http://localhost:8123/admin/nodes \
  -H "Authorization: Bearer $ADMIN_KEY"
```

Example response:

```json
{
  "nodes": [
    {"id": 0, "host": "localhost", "port": 8123, "state": "ACTIVE", "ticks_ingested": 10000, "ticks_stored": 10000, "queries_executed": 5},
    {"id": 9001, "host": "localhost", "port": 9001, "state": "ACTIVE", "ticks_ingested": 0, "ticks_stored": 0, "queries_executed": 0},
    {"id": 9002, "host": "localhost", "port": 9002, "state": "ACTIVE", "ticks_ingested": 0, "ticks_stored": 0, "queries_executed": 0}
  ]
}
```
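The per-node stats above roll up into the cluster-wide totals reported by /admin/cluster. A minimal sketch of that aggregation, using the field names from the example responses (the standalone/cluster rule follows the doc: a lone coordinator means standalone):

```python
import json

# The /admin/nodes example response from the docs.
nodes_response = json.loads("""
{
  "nodes": [
    {"id": 0, "host": "localhost", "port": 8123, "state": "ACTIVE", "ticks_ingested": 10000, "ticks_stored": 10000, "queries_executed": 5},
    {"id": 9001, "host": "localhost", "port": 9001, "state": "ACTIVE", "ticks_ingested": 0, "ticks_stored": 0, "queries_executed": 0},
    {"id": 9002, "host": "localhost", "port": 9002, "state": "ACTIVE", "ticks_ingested": 0, "ticks_stored": 0, "queries_executed": 0}
  ]
}
""")

def summarize(nodes):
    """Aggregate per-node stats into a cluster-wide summary."""
    return {
        # Only the coordinator (node 0) present -> standalone; any data node -> cluster.
        "mode": "cluster" if len(nodes) > 1 else "standalone",
        "node_count": len(nodes),
        "ticks_ingested": sum(n["ticks_ingested"] for n in nodes),
        "ticks_stored": sum(n["ticks_stored"] for n in nodes),
        "queries_executed": sum(n["queries_executed"] for n in nodes),
    }

print(summarize(nodes_response["nodes"]))
```

Running this on the example reproduces the /admin/cluster response shown below (mode cluster, 3 nodes, 10000 ticks, 5 queries).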
Cluster summary:

```shell
curl http://localhost:8123/admin/cluster \
  -H "Authorization: Bearer $ADMIN_KEY"
```

Example response:

```json
{
  "mode": "cluster",
  "node_count": 3,
  "ticks_ingested": 10000,
  "ticks_stored": 10000,
  "queries_executed": 5
}
```

Standalone (coordinator only):

```shell
./zepto_http_server --port 8123
```

The /cluster tab shows 1 node with mode: standalone. When a data node is added, the coordinator automatically switches to mode: cluster.

Coordinator + 1 data node:

```shell
# Terminal 1
./zepto_data_node 9001
# Terminal 2
./zepto_http_server --port 8123 --add-node 9001:localhost:9001
```
Coordinator + 3 data nodes:

```shell
# Terminals 1~3: data nodes
./zepto_data_node 9001
./zepto_data_node 9002
./zepto_data_node 9003
# Terminal 4: coordinator
./zepto_http_server --port 8123 \
  --add-node 9001:localhost:9001 \
  --add-node 9002:localhost:9002 \
  --add-node 9003:localhost:9003
```
Multi-machine deployment:

```shell
# Coordinator (10.0.1.1)
./zepto_http_server --port 8123
# Machine A (10.0.1.2) — auto-register
./zepto_data_node 9001 \
  --coordinator 10.0.1.1:8123 \
  --api-key $ADMIN_KEY \
  --advertise-host 10.0.1.2
# Machine B (10.0.1.3) — auto-register
# --node-id defaults to the port number, so give Machine B a distinct ID
./zepto_data_node 9001 --node-id 9002 \
  --coordinator 10.0.1.1:8123 \
  --api-key $ADMIN_KEY \
  --advertise-host 10.0.1.3
```

Information available on the /cluster page:

| Section | Description |
| --- | --- |
| Summary Cards | Mode (standalone/cluster), number of active nodes, total ticks, total queries |
| Node Status Table | Per-node ID, Host, Port, State, Ticks, Queries |
| Ticks Stored Bar Chart | Bar chart of ticks stored per node |
| Ingestion History | Per-node time-series ingestion trend |
| Queries History | Per-node time-series query execution trend |
| Latency History | Per-node ingest latency trend |

Node state colors:

  • 🟢 ACTIVE — Normal
  • 🟡 SUSPECT — Response delayed
  • 🔴 DEAD — Unreachable
  • 🔵 JOINING — Joining the cluster
  • ⚪ LEAVING — Leaving the cluster
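A state machine like this is typically driven by heartbeat age. The thresholds below (1 s to SUSPECT, 3 s to DEAD) are illustrative assumptions, not ZeptoDB's documented values:

```python
# Illustrative thresholds: assumptions for the sketch, not ZeptoDB's actual values.
SUSPECT_AFTER_S = 1.0   # no heartbeat for this long -> SUSPECT
DEAD_AFTER_S = 3.0      # no heartbeat for this long -> DEAD

def node_state(seconds_since_heartbeat):
    """Classify a node by how stale its last heartbeat is."""
    if seconds_since_heartbeat >= DEAD_AFTER_S:
        return "DEAD"
    if seconds_since_heartbeat >= SUSPECT_AFTER_S:
        return "SUSPECT"
    return "ACTIVE"

print(node_state(0.2))  # ACTIVE
print(node_state(1.5))  # SUSPECT
print(node_state(5.0))  # DEAD
```

JOINING and LEAVING are transitional states set explicitly during registration and removal rather than derived from heartbeats.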

Active-Standby architecture with automatic failover when the coordinator fails.

```
Coordinator (ACTIVE) ◄── ping (500 ms) ── Coordinator (STANDBY)
port 8123, rpc 9100                       port 8124, rpc 9101
        │
        ├── Data Node 1 (9001)
        └── Data Node 2 (9002)
```

If the ACTIVE coordinator dies, the STANDBY is promoted to ACTIVE after 2 seconds and the node list is automatically re-registered.
| Option | Description |
| --- | --- |
| --ha active\|standby | HA role |
| --peer host:port | RPC address of the peer coordinator |
| --rpc-port PORT | RPC port for this node (default: HTTP port + 1000) |
```shell
# Terminal 1 — Active Coordinator
./zepto_http_server --port 8123 --ha active --peer localhost:9101 --rpc-port 9100
# Terminal 2 — Standby Coordinator
./zepto_http_server --port 8124 --ha standby --peer localhost:9100 --rpc-port 9101
# Terminals 3~4 — Data Nodes (register with Active)
./zepto_data_node 9001 --coordinator localhost:8123 --api-key $ADMIN_KEY
./zepto_data_node 9002 --coordinator localhost:8123 --api-key $ADMIN_KEY
```
  1. Standby pings Active every 500ms
  2. If no response for 2 seconds, Standby is automatically promoted to Active
  3. The registered data node list is automatically re-registered with the new Active
  4. Data node processes are unaffected (independent processes)
  5. Clients connect to the new Active’s HTTP port
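With a 500 ms ping interval and a 2 s timeout, promotion fires after 4 consecutive missed pings. A toy state machine for that timing (the counter-based detection is an assumption; only the two intervals come from the docs):

```python
PING_INTERVAL_MS = 500    # from the docs: standby pings active every 500 ms
PROMOTE_AFTER_MS = 2000   # from the docs: promote after 2 s without a response

def should_promote(missed_pings):
    """Standby promotes itself once consecutive missed pings span >= 2 s."""
    return missed_pings * PING_INTERVAL_MS >= PROMOTE_AFTER_MS

class StandbyCoordinator:
    def __init__(self):
        self.role = "standby"
        self.missed = 0

    def on_ping_result(self, ok):
        # Any successful ping resets the failure window.
        self.missed = 0 if ok else self.missed + 1
        if self.role == "standby" and should_promote(self.missed):
            self.role = "active"
            # A real implementation would now re-register the data node list.

s = StandbyCoordinator()
for ok in [True, False, False, False, False]:  # 4 misses = 2 s of silence
    s.on_ping_result(ok)
print(s.role)  # active
```

Note that a single successful ping resets the window, so transient delays (SUSPECT-like blips) do not trigger failover.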

When Active dies, Standby (8124) becomes the new Active:

```shell
# Before: http://localhost:8123
# After:  http://localhost:8124
```

In production, place a load balancer (ALB/NLB) in front and use health checks for automatic switchover.


| Symptom | Cause | Solution |
| --- | --- | --- |
| Not in cluster mode | Running an older version of the server | Update to the latest build (the coordinator is always active) |
| Node shows as DEAD | Data node is down or port mismatch | Check the data node process and verify the port |
| Web UI shows only 1 node | Data node is not registered | Add via --add-node or POST /admin/nodes |
| Connection refused | Data node has not started yet | Start data nodes first, then start the coordinator |
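When chasing "Connection refused" or a DEAD node, a quick TCP reachability check narrows things down. A small sketch; the ports are the ones used in the examples above, and the check says nothing about whether the process behind the port is healthy:

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False

# Check the coordinator's HTTP port and each data node's RPC port.
for name, port in [("coordinator", 8123), ("data node 1", 9001), ("data node 2", 9002)]:
    status = "reachable" if port_open("localhost", port) else "NOT reachable"
    print(f"{name} (port {port}): {status}")
```

If a data node's port is unreachable, check the process first; if it is reachable but the node still shows DEAD, verify the id:host:port used at registration matches what the node advertises.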