Transparent rewrite
Users query base tables normally. The rewriter silently redirects to MVs when possible.
Materialized views are only useful if queries actually hit them. Requiring users to SELECT FROM mv_name defeats the purpose — they’d need to know which MVs exist and rewrite their queries manually. This post covers the automatic query rewriter that transparently redirects matching queries to pre-computed results.
Without automatic rewrite, users must explicitly query the materialized view:
-- User writes this (hits MV):SELECT * FROM ohlcv_5min WHERE symbol = 1
-- But their original query was this (full scan):SELECT symbol, xbar(timestamp, 300000000000) AS bar, sum(volume) AS volFROM tradesGROUP BY symbol, xbar(timestamp, 300000000000)The rewriter makes the second query automatically use the MV — no query changes needed.
The rewriter runs between the query cache check and exec_select():
execute(sql) → parse() → query cache check → MVRewriter::try_rewrite() ← HERE → exec_select()If the rewriter finds a matching MV, it returns pre-computed results directly. The query never reaches the executor’s partition scan path.
MVRewriter::try_rewrite() checks each registered MV definition against the incoming SELECT:
The rewriter only handles simple GROUP BY aggregations. Any of these cause immediate fallback to full scan:
| Condition | Reason |
|---|---|
| No GROUP BY | Not an aggregation query |
| Has WHERE clause | MV doesn’t store filtered subsets |
| Has JOIN | MV is single-table only |
| Has HAVING | Post-aggregation filter not in MV |
| Has DISTINCT | Dedup logic not in MV |
| Has subquery/CTE | Too complex for this phase |
For each registered MV definition:
1. source_table matches? trades == trades ✓2. GROUP BY columns match? {symbol, xbar} == {symbol, xbar} ✓ (order-insensitive set comparison)3. xbar_bucket matches? 300000000000 == 300000000000 ✓4. ALL agg columns match? sum(volume) → MVAggType::SUM on "volume" ✓ count(*) → MVAggType::COUNT ✓All four conditions must hold. Partial matches (query uses a subset of MV columns) are not supported in this phase — it’s all-or-nothing.
AVG, VWAP, and STDDEV cause immediate fallback. AVG could theoretically be decomposed into SUM/COUNT (both stored in the MV), but that optimization is future work.
AggFunc::SUM → MVAggType::SUM ✓AggFunc::COUNT → MVAggType::COUNT ✓AggFunc::MIN → MVAggType::MIN ✓AggFunc::MAX → MVAggType::MAX ✓AggFunc::FIRST → MVAggType::FIRST ✓AggFunc::LAST → MVAggType::LAST ✓AggFunc::AVG → fallback ✗AggFunc::VWAP → fallback ✗When a match is found, the rewriter calls MaterializedViewManager::query() to get pre-computed buckets. ORDER BY and LIMIT are still applied on the MV result — the rewriter doesn’t skip these clauses.
MV match found → MaterializedViewManager::query() → pre-computed buckets → apply ORDER BY (if present) → apply LIMIT (if present) → return QueryResultSet| Query Path | Cost |
|---|---|
| Full scan (no MV) | O(n) — scan all rows in matching partitions |
| MV query (explicit) | O(buckets) — direct MV lookup |
| MV rewrite (automatic) | O(buckets) + O(MVs) matching overhead |
The matching overhead is O(registered MVs) — typically 1-10 views. Each match check is a handful of string comparisons and set intersections. For a database with 3 MVs, the overhead is ~100ns — negligible compared to the O(n) scan it replaces.
For 1M rows with 3,334 time bars, the rewrite turns a ~10ms scan into a ~0.01ms bucket lookup — a 1000× speedup that’s completely transparent to the user.
-- Step 1: Create MVCREATE MATERIALIZED VIEW ohlcv_5min AS SELECT symbol, xbar(timestamp, 300000000000) AS bar, first(price) AS open, max(price) AS high, min(price) AS low, last(price) AS close, sum(volume) AS vol FROM trades GROUP BY symbol, xbar(timestamp, 300000000000)
-- Step 2: Insert data (MV updated incrementally)INSERT INTO trades VALUES ('AAPL', 15025, 100, 1711900800000000000)
-- Step 3: Query base table — automatically rewritten to MVSELECT symbol, xbar(timestamp, 300000000000) AS bar, sum(volume) AS volFROM tradesGROUP BY symbol, xbar(timestamp, 300000000000)-- ↑ This hits the MV, not the raw partitions| Test | Verifies |
|---|---|
| ExactMatch | MV hit returns pre-aggregated data |
| NoMatchWhere | WHERE present → full scan fallback |
| NoMatchAggMismatch | avg() vs sum() MV → no rewrite |
| NoMVRegistered | No MVs → normal execution |
| MultipleAggColumns | Multi-column MV match |
| NoMatchTableMismatch | Wrong source table → no rewrite |
Transparent rewrite
Users query base tables normally. The rewriter silently redirects to MVs when possible.
Exact matching
All-or-nothing: source table, GROUP BY columns, xbar bucket, and all aggregations must match.
1000× speedup
O(n) partition scan → O(buckets) lookup. 10ms → 0.01ms for typical OHLCV queries.
Safe fallback
Any mismatch falls through to full scan. No incorrect results, no silent data loss.
Related: Storage Tiering and MVs → · Financial Functions → · XBAR Sorted-Scan →