Add a note file about query engine architectures

2026-03-31 15:00:48 +02:00 · 2026-03-31 15:00:48 +02:00 · 584f82bb82
commit 584f82bb82
parent 8ddc235f4d
1 changed files with 392 additions and 0 deletions
--- a/hqew/003-query-engine-architectures.md
+++ b/hqew/003-query-engine-architectures.md
@ -0,0 +1,392 @@
+# Query Engine Architectures
+
+A reference for the main ways query engines are structured internally.
+
+---
+
+## Short answer
+
+There is no single query engine architecture.
+
+Most engines are built by making choices along a few major axes:
+
+- row-oriented vs columnar
+- interpreted vs compiled
+- pipelined vs materializing
+- centralized vs distributed
+- tightly coupled to storage vs separated through a source interface
+
+Different systems pick different combinations depending on whether they optimize for transactional access, analytics, low latency, concurrency,
+distribution, or implementation simplicity.
+
+---
+
+## A useful mental model
+
+When people say "query engine architecture," they usually mean the answer to questions like:
+
+- what is the unit of execution: row, batch, or whole relation?
+- how do operators communicate: pull, push, or staged pipelines?
+- when are intermediates materialized?
+- where does optimization happen?
+- how tightly is execution tied to the underlying storage?
+- does one machine run the whole plan, or is the plan split across many workers?
+
+So architecture is less about one named pattern and more about a set of design choices.
+
+---
+
+## The classic row-at-a-time iterator model
+
+One of the most common traditional designs is the Volcano-style iterator model.
+
+In that architecture:
+
+- each operator exposes a method like `next()`
+- parent operators pull one tuple at a time from child operators
+- data flows upward through a tree of operators
+
+Typical shape:
+
+1. `Projection`
+2. pulls from `Filter`
+3. which pulls from `Scan`
+
+### Strengths
+
+- simple and modular
+- easy to understand and implement
+- operators compose cleanly
+- works well for many relational operators
+
+### Weaknesses
+
+- pays per-row virtual-call or dispatch overhead
+- poor cache locality for analytical scans
+- not ideal for SIMD or vectorized processing
+
+This architecture was historically very influential, but many modern analytical engines move away from pure row-at-a-time execution.
+
+---
+
+## Vectorized and batch-oriented engines
+
+Modern analytical engines often process batches of rows, frequently in columnar form.
+
+Instead of asking for one row at a time, an operator processes a chunk such as a `RecordBatch` containing many values per column.
+
+### Strengths
+
+- lower per-row overhead
+- better cache behavior
+- fits columnar storage and in-memory formats well
+- makes vectorized computation practical
+
+### Weaknesses
+
+- more complex operator implementations
+- more bookkeeping around batch boundaries
+- some control-flow-heavy operations fit less naturally
+
+This style is common in engines optimized for scans, filters, aggregations, and joins over large datasets.
+
+---
+
+## Interpreted vs compiled execution
+
+Another major architecture decision is whether plans are interpreted directly or turned into specialized executable code.
+
+### Interpreted execution
+
+The engine walks the plan at runtime and runs generic operator implementations.
+
+Strengths:
+
+- easier to build
+- easier to debug
+- more flexible for dynamic plans
+
+Weaknesses:
+
+- more dispatch overhead
+- fewer opportunities for low-level specialization
+
+### Compiled or code-generated execution
+
+The engine turns expressions or even whole pipelines into specialized machine code, bytecode, or generated source.
+
+Strengths:
+
+- less runtime interpretation overhead
+- more aggressive specialization
+- can fuse multiple operations together
+
+Weaknesses:
+
+- much more implementation complexity
+- longer compile/startup path
+- harder debugging and observability
+
+Many real systems are hybrid: interpreted at some layers, compiled at others.
+
+---
+
+## Pipelined vs materializing architectures
+
+Query engines also differ in when they store intermediate results.
+
+### Pipelined execution
+
+Operators stream data incrementally from one stage to the next.
+
+Strengths:
+
+- lower latency
+- less memory for some workloads
+- natural for streaming or interactive execution
+
+Weaknesses:
+
+- harder to reuse intermediates
+- blocking operators still force barriers
+
+### Materializing execution
+
+The engine fully computes an intermediate result before the next stage uses it.
+
+Strengths:
+
+- simpler control flow
+- easier fault recovery in some systems
+- easier sharing of intermediates
+
+Weaknesses:
+
+- more memory and I/O overhead
+- higher latency
+
+Real engines mix both. For example, filter and projection may pipeline, while sort and some aggregates necessarily materialize or partially buffer.
+
+---
+
+## Row stores, column stores, and storage coupling
+
+Some engines are tightly integrated with one storage layout. Others sit above many storage systems.
+
+### Storage-coupled engines
+
+In these systems, the executor is deeply aware of the storage engine's pages, indexes, and on-disk layout.
+
+Strengths:
+
+- can exploit detailed storage-level knowledge
+- often strong performance for the target workload
+
+Weaknesses:
+
+- harder to adapt to new sources
+- architecture is less modular
+
+### Storage-decoupled engines
+
+In these systems, execution interacts with storage through a cleaner source boundary such as scan interfaces and pushdowns.
+
+Strengths:
+
+- easier support for many formats and backends
+- cleaner separation of concerns
+
+Weaknesses:
+
+- some storage-specific optimizations become harder
+- abstraction can hide useful low-level details
+
+This is a major difference between database kernels and more modular data-processing engines.
+
+---
+
+## Centralized vs distributed architectures
+
+Some engines run entirely inside one process or one machine. Others split work across nodes.
+
+### Centralized engines
+
+All planning and execution happens locally.
+
+Strengths:
+
+- much simpler design
+- easier debugging
+- lower coordination overhead
+
+Weaknesses:
+
+- limited by one machine's CPU, memory, and I/O
+
+### Distributed engines
+
+The plan is divided into stages or fragments that run on multiple workers.
+
+Common added components include:
+
+- exchange or shuffle operators
+- task schedulers
+- remote data transport
+- fault handling and retries
+
+Strengths:
+
+- can scale to much larger datasets
+- can exploit cluster parallelism
+
+Weaknesses:
+
+- much more complexity
+- network and serialization overhead
+- harder optimization problem
+
+Distributed query engines are not just "single-node engines on more machines." They need an additional architecture for task placement and data
+movement.
+
+---
+
+## Blocking vs non-blocking operators
+
+Operator behavior also shapes architecture.
+
+### Non-blocking operators
+
+These can start producing output before consuming all input.
+
+Examples:
+
+- scan
+- filter
+- projection
+
+### Blocking operators
+
+These require all or much of the input before producing useful output.
+
+Examples:
+
+- sort
+- some aggregates
+- some join strategies
+
+Blocking operators create execution barriers and often determine where memory pressure and latency appear in the engine.
+
+---
+
+## Common architecture patterns
+
+### 1. Classic relational database kernel
+
+Typical traits:
+
+- row-oriented execution
+- iterator model
+- tight coupling to storage and indexes
+- strong support for transactional workloads
+
+### 2. Modern analytical columnar engine
+
+Typical traits:
+
+- columnar in-memory representation
+- batch or vectorized execution
+- pushdown into file formats or source connectors
+- optimized for scans, joins, and aggregates on large datasets
+
+### 3. Distributed SQL or lakehouse engine
+
+Typical traits:
+
+- logical and physical planning on a coordinator
+- distributed execution stages on workers
+- shuffle or exchange boundaries
+- object storage or remote tables as sources
+
+### 4. Streaming query engine
+
+Typical traits:
+
+- push-style or event-driven data flow
+- unbounded inputs
+- windows, watermarks, and incremental state
+- low-latency continuous execution
+
+### 5. Logic or rule engine
+
+Typical traits:
+
+- facts and rules instead of only relational algebra
+- recursion and fixpoint evaluation
+- unification, derivation, or chase-like execution
+- architecture organized around inference as much as scanning
+
+This last category matters when the "query engine" is not mainly about SQL tables at all.
+
+---
+
+## Comparison table
+
+| Architecture style | Typical unit of work | Best for | Main weakness |
+|:-------------------|:---------------------|:---------|:--------------|
+| Row iterator       | one row              | simplicity, OLTP-style access | high per-row overhead |
+| Vectorized columnar| batch / column chunk | analytics | more implementation complexity |
+| Materializing      | full intermediate    | simpler barriers and reuse | memory and latency cost |
+| Pipelined          | incremental stream   | low latency | harder coordination around blocking operators |
+| Distributed        | stage / partition    | scale-out analytics | network and scheduling complexity |
+| Rule/fixpoint      | fact/rule step       | recursion and inference | different optimization model from relational SQL |
+
+---
+
+## How to think about architectural fit
+
+A good architecture depends on what the engine is for.
+
+If the workload is mostly:
+
+- point lookups and updates, row-oriented designs can be reasonable
+- large scans and aggregates, vectorized columnar designs usually win
+- recursive inference or logical closure, rule/fixpoint architectures become central
+- cluster-scale datasets, distributed planning and exchange operators become unavoidable
+
+So the architecture should follow the workload, not the other way around.
+
+---
+
+## A practical summary
+
+If you need the shortest useful summary, this is a good one:
+
+> Query engine architecture is mostly about execution granularity, operator communication, intermediate storage, and the boundary with storage and
+> distribution.
+
+Or more concretely:
+
+- row vs batch
+- interpreted vs compiled
+- pipelined vs materialized
+- local vs distributed
+- relational vs rule-oriented execution
+
+Those choices explain most of the important differences between engines.
+
+---
+
+## Questions to keep in mind
+
+- What is the unit of execution?
+- Which operators are blocking?
+- How much is pushed down to the data source?
+- Is storage tightly coupled or abstracted behind connectors?
+- Does the workload look more transactional, analytical, streaming, or logical?
+- Is distribution a first-order requirement or premature complexity?
+
+---
+
+## Changelog
+
+* **Mar 31, 2026** -- First version created.