766/766 tests passing
Every test suite — unit, feed, migration — produces identical results on x86_64 and aarch64.
Shipping a high-performance C++ database on a single architecture is table stakes. Shipping it on two — with SIMD vectorization, JIT compilation, and protocol parsers all working identically — is the real test. This post covers ZeptoDB’s full verification on AWS Graviton (aarch64): 766/766 tests passing, with some benchmarks that surprised us.
AWS Graviton instances offer ~20% cost savings over equivalent x86 instances. For a database that runs 24/7 in production, that’s a significant line item. But cost savings mean nothing if the database doesn’t work correctly, and ZeptoDB relies heavily on architecture-specific features: SIMD vectorization, JIT compilation, and low-level protocol parsers.
All of these need to work correctly on ARM’s NEON instruction set, not just x86’s SSE/AVX.
| | x86 Instance | Graviton Instance |
|---|---|---|
| Architecture | x86_64 | aarch64 |
| CPU | Intel Xeon 6975P (8 vCPU) | Graviton (4 vCPU) |
| RAM | — | 15 GB |
| OS | Amazon Linux 2023 | Amazon Linux 2023 |
| Compiler | Clang 19.1.7 | Clang 19.1.7 |
| Highway SIMD | 1.2.0 | 1.2.0 |
| LLVM JIT | 19.1.7 | 19.1.7 |
Same compiler, same library versions, same OS. Apart from the vCPU count (8 vs 4), the only variable is the CPU architecture.
CMake + Ninja in Release mode. 137/137 targets built successfully on both architectures — identical target count, no conditional compilation needed. The Python binding (zeptodb.cpython-39-aarch64-linux-gnu.so) also generated successfully.
Highway’s architecture abstraction is the key enabler here. Code written against Highway’s portable API compiles to NEON on ARM and AVX2/AVX-512 on x86 without #ifdef blocks.
| Test Suite | x86_64 | aarch64 |
|---|---|---|
| Unit Tests | 619/619 ✅ | 619/619 ✅ |
| Feed Tests | 21/21 ✅ | 21/21 ✅ |
| Migration Tests | 126/126 ✅ | 126/126 ✅ |
| Total | 766/766 | 766/766 |
Zero platform-specific failures. Every test — from basic column operations through JIT-compiled expressions to FIX/ITCH protocol parsing — produces identical results on both architectures.
We ran the standard micro-benchmarks on the Graviton instance:
| Metric | Graviton (aarch64) | x86 (previous) | Ratio |
|---|---|---|---|
| xbar GROUP BY (1M rows) | 7.99 ms | 24 ms | 3x faster |
| ITCH Parser | 17.18 ns/msg (58.2M msg/s) | 23.3 ns/msg (42.9M msg/s) | 1.36x faster |
| FIX Parser | 358.97 ns/msg (2.79M msg/s) | — | — |
The 3x faster xbar GROUP BY on Graviton was unexpected. The likely explanation is a difference in memory access behavior: Graviton’s memory subsystem appears to handle the sequential column scan + hash aggregation pattern particularly well. The x86 figure comes from a previous run, and HugePages were not configured on the Graviton instance (fallback warning present), so the actual gap may change once both instances use identical memory configurations.
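To make the "sequential column scan + hash aggregation" pattern concrete, here is a minimal sketch in plain C++. It assumes xbar follows the common convention of rounding a timestamp down to a multiple of the bucket width; the names `xbar`, `Agg`, and `xbar_group_sum` are hypothetical and this is not ZeptoDB's implementation, which is vectorized and JIT-compiled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper: round a timestamp down to the start of its bucket.
// Assumes non-negative timestamps and a positive bucket width.
inline std::int64_t xbar(std::int64_t ts, std::int64_t width) {
    assert(width > 0 && ts >= 0);
    return ts - (ts % width);
}

// Running aggregate kept per bucket.
struct Agg {
    double sum = 0.0;
    std::size_t count = 0;
};

// One sequential pass over two equal-length columns,
// accumulating per-bucket sums and counts in a hash table.
inline std::unordered_map<std::int64_t, Agg> xbar_group_sum(
    const std::vector<std::int64_t>& ts,
    const std::vector<double>& value,
    std::int64_t width) {
    assert(ts.size() == value.size());
    std::unordered_map<std::int64_t, Agg> groups;
    for (std::size_t i = 0; i < ts.size(); ++i) {
        Agg& g = groups[xbar(ts[i], width)];
        g.sum += value[i];
        ++g.count;
    }
    return groups;
}
```

The column reads are sequential and prefetch-friendly, while the hash table updates are scattered, which is why memory subsystem behavior can dominate a query of this shape.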
The ITCH parser at 58.2 million messages per second on ARM is notable — this is a bit-level binary protocol parser where every cycle counts.
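The throughput and ratio columns in the table follow directly from the per-message latencies. A small sketch of that arithmetic (the helper names `msgs_per_second` and `speedup` are illustrative, not part of ZeptoDB):

```cpp
#include <cassert>

// Convert a per-message latency in nanoseconds to messages per second.
inline double msgs_per_second(double ns_per_msg) {
    assert(ns_per_msg > 0.0);
    return 1e9 / ns_per_msg;
}

// Speedup of the faster run over the slower one,
// with both inputs expressed as time per operation.
inline double speedup(double slow_time, double fast_time) {
    assert(fast_time > 0.0);
    return slow_time / fast_time;
}
```

With the measured values, `msgs_per_second(17.18)` is about 58.2 million, `msgs_per_second(23.3)` is about 42.9 million, and `speedup(23.3, 17.18)` is about 1.36. The same function applied to the GROUP BY timings, `speedup(24.0, 7.99)`, gives roughly 3.0.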
One test failure appeared during verification: FIXMessageBuilderTest.BuildLogon. The test constructed a message with FIXMessageBuilder("ZEPTO", "SERVER") but asserted 49=APEX — a SenderCompID mismatch.
    // Before (wrong):
    // Expected: 49=APEX
    // Actual: 49=ZEPTO

    // Fix: tests/feeds/test_fix_parser.cpp:165
    // Changed expected value to match constructor argument
    // 49=APEX → 49=ZEPTO

This was a pre-existing test typo that failed on both architectures. Not an ARM-specific issue.
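For readers unfamiliar with FIX, tag 49 is SenderCompID and tag 56 is TargetCompID, so a builder constructed with "ZEPTO" as the sender has to emit 49=ZEPTO. The sketch below illustrates that relationship with a hypothetical stand-in, `build_logon`, and a hypothetical lookup helper, `get_field`. It is not ZeptoDB's `FIXMessageBuilder`, it assumes FIX.4.4 framing, and it omits fields a real Logon requires (MsgSeqNum, SendingTime, EncryptMethod, HeartBtInt) for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

constexpr char kSOH = '\x01';  // FIX field delimiter

// Hypothetical stand-in for a logon builder (not ZeptoDB's FIXMessageBuilder).
inline std::string build_logon(const std::string& sender, const std::string& target) {
    assert(!sender.empty() && !target.empty());
    // Body: every field after BodyLength (9) and before CheckSum (10).
    std::string body;
    body += "35=A";          body += kSOH;  // MsgType = Logon
    body += "49=" + sender;  body += kSOH;  // SenderCompID
    body += "56=" + target;  body += kSOH;  // TargetCompID

    std::string msg = "8=FIX.4.4";
    msg += kSOH;
    msg += "9=" + std::to_string(body.size());
    msg += kSOH;
    msg += body;

    // CheckSum: sum of all preceding bytes modulo 256, as three digits.
    unsigned sum = 0;
    for (unsigned char c : msg) sum += c;
    char tail[8];
    std::snprintf(tail, sizeof(tail), "10=%03u", sum % 256);
    msg += tail;
    msg += kSOH;
    return msg;
}

// Return the value of the first field with the given tag, or "" if absent.
inline std::string get_field(const std::string& msg, const std::string& tag) {
    const std::string key = tag + "=";
    std::size_t pos = 0;
    while (pos < msg.size()) {
        std::size_t end = msg.find(kSOH, pos);
        if (end == std::string::npos) end = msg.size();
        if (msg.compare(pos, key.size(), key) == 0)
            return msg.substr(pos + key.size(), end - pos - key.size());
        pos = end + 1;
    }
    return "";
}
```

Calling `get_field(build_logon("ZEPTO", "SERVER"), "49")` returns `"ZEPTO"`, which is the value the corrected test expects. An assertion against `"APEX"` would fail on any architecture, since the mismatch is in the test data rather than in the generated bytes.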
ZeptoDB is fully portable across x86_64 and aarch64 with zero code changes.
For deployment, this means teams can choose Graviton instances for cost savings without any functional risk.
3x faster GROUP BY
xbar aggregation on Graviton completed in 7.99 ms vs 24 ms in an earlier x86 run. The likely cause is a memory subsystem advantage.
58.2M ITCH msg/s
Bit-level protocol parsing at full speed on ARM NEON. No performance penalty for portability.
Zero #ifdef blocks
Highway SIMD + LLVM JIT abstract the architecture. Same source, same behavior, different ISA.