Example Dataset Bundle
Preloaded sample data so new users never face an empty database.
Overview
Section titled “Overview”The --demo server flag loads realistic sample datasets on startup. This eliminates the “blank canvas” problem — users can immediately run meaningful queries.
./zepto_server --port 8123 --demo# ordocker run -p 8123:8123 zeptodb/zeptodb:latest --demoDatasets
Section titled “Datasets”trades — Equity tick data
Section titled “trades — Equity tick data”5 symbols × ~20K ticks each = 100K rows
| Column | Type | Example |
|---|---|---|
| symbol | STRING | AAPL |
| price | INT64 | 18750 (= $187.50) |
| volume | INT64 | 100 |
| timestamp | INT64 | 1711234567000000000 |
Symbols: AAPL, GOOG, MSFT, TSLA, AMZN
Price distribution: realistic random walk around each symbol’s base price. Volume follows a U-shaped intraday pattern (high at open/close).
quotes — Bid/ask book
Section titled “quotes — Bid/ask book”5 symbols × ~40K quotes each = 200K rows
| Column | Type | Example |
|---|---|---|
| symbol | STRING | AAPL |
| bid | INT64 | 18748 |
| ask | INT64 | 18752 |
| bid_size | INT64 | 500 |
| ask_size | INT64 | 300 |
| timestamp | INT64 | 1711234567000000000 |
Quotes arrive ~2× the rate of trades. Spread varies by symbol volatility.
sensors — IoT telemetry
Section titled “sensors — IoT telemetry”10 devices × ~5K readings each = 50K rows
| Column | Type | Example |
|---|---|---|
| device_id | STRING | sensor_01 |
| temperature | INT64 | 2350 (= 23.50°C) |
| humidity | INT64 | 6520 (= 65.20%) |
| timestamp | INT64 | 1711234567000000000 |
Devices: sensor_01 through sensor_10. Temperature follows a sinusoidal daily pattern with noise. sensor_07 has injected anomalies for detection demos.
Starter Queries
Section titled “Starter Queries”Printed to stdout on --demo startup:
═══════════════════════════════════════════════════════ ZeptoDB Demo Mode — Sample data loaded trades: 100K rows | quotes: 200K rows | sensors: 50K rows═══════════════════════════════════════════════════════
Try these queries:
1. SELECT symbol, vwap(price, volume) AS vwap, count(*) AS n FROM trades GROUP BY symbol
2. SELECT xbar(timestamp, 300000000000) AS bar, first(price) AS open, max(price) AS high, min(price) AS low, last(price) AS close FROM trades WHERE symbol = 'AAPL' GROUP BY xbar(timestamp, 300000000000) ORDER BY bar
3. SELECT t.price, q.bid, q.ask FROM trades t ASOF JOIN quotes q ON t.symbol = q.symbol AND t.timestamp >= q.timestamp WHERE t.symbol = 'TSLA' LIMIT 10
Web UI: http://localhost:8123═══════════════════════════════════════════════════════Data Generation
Section titled “Data Generation”Demo data is generated deterministically (seeded PRNG) so results are reproducible across runs.
Generation parameters
Section titled “Generation parameters”demo: seed: 42 trading_day: "2024-03-23" market_hours: "09:30-16:00 ET" symbols: AAPL: { base_price: 18750, volatility: 0.001 } GOOG: { base_price: 17800, volatility: 0.0012 } MSFT: { base_price: 42500, volatility: 0.0008 } TSLA: { base_price: 17200, volatility: 0.002 } AMZN: { base_price: 18100, volatility: 0.0011 } sensors: count: 10 interval_sec: 15 anomaly_device: "sensor_07" anomaly_count: 5Implementation location
Section titled “Implementation location”- Data generator:
src/demo/demo_data_generator.cpp - Server flag handling:
tools/zepto_http_server.cpp(--demo) - Config schema:
include/zeptodb/demo/demo_config.h
Memory Footprint
Section titled “Memory Footprint”| Dataset | Rows | Columns | Approx. Memory |
|---|---|---|---|
| trades | 100K | 4 | ~3.2 MB |
| quotes | 200K | 6 | ~9.6 MB |
| sensors | 50K | 4 | ~1.6 MB |
| Total | 350K | ~14.4 MB |
Negligible overhead — safe to enable even on constrained environments.