useful-notes/hqew/003-query-engine-architectures.md

# Query Engine Architectures

A reference for the main ways query engines are structured internally.

---

## Short answer

There is no single query engine architecture.

Most engines are built by making choices along a few major axes:

- row-oriented vs columnar
- interpreted vs compiled
- pipelined vs materializing
- centralized vs distributed
- tightly coupled to storage vs separated through a source interface

Different systems pick different combinations depending on whether they optimize for transactional access, analytics, low latency, concurrency,
distribution, or implementation simplicity.

---

## A useful mental model

When people say "query engine architecture," they usually mean the answer to questions like:

- what is the unit of execution: row, batch, or whole relation?
- how do operators communicate: pull, push, or staged pipelines?
- when are intermediates materialized?
- where does optimization happen?
- how tightly is execution tied to the underlying storage?
- does one machine run the whole plan, or is the plan split across many workers?

So architecture is less about one named pattern and more about a set of design choices.

---

## The classic row-at-a-time iterator model

One of the most common traditional designs is the Volcano-style iterator model.

In that architecture:

- each operator exposes a method like `next()`
- parent operators pull one tuple at a time from child operators
- data flows upward through a tree of operators

Typical shape:

1. `Projection`
2. pulls from `Filter`
3. which pulls from `Scan`

### Strengths

- simple and modular
- easy to understand and implement
- operators compose cleanly
- works well for many relational operators

### Weaknesses

- pays per-row virtual-call or dispatch overhead
- poor cache locality for analytical scans
- not ideal for SIMD or vectorized processing

This architecture was historically very influential, but many modern analytical engines move away from pure row-at-a-time execution.

---

## Vectorized and batch-oriented engines

Modern analytical engines often process batches of rows, frequently in columnar form.

Instead of asking for one row at a time, an operator processes a chunk such as a `RecordBatch` containing many values per column.

### Strengths

- lower per-row overhead
- better cache behavior
- fits columnar storage and in-memory formats well
- makes vectorized computation practical

### Weaknesses

- more complex operator implementations
- more bookkeeping around batch boundaries
- some control-flow-heavy operations fit less naturally

This style is common in engines optimized for scans, filters, aggregations, and joins over large datasets.

---

## Interpreted vs compiled execution

Another major architecture decision is whether plans are interpreted directly or turned into specialized executable code.

### Interpreted execution

The engine walks the plan at runtime and runs generic operator implementations.

Strengths:

- easier to build
- easier to debug
- more flexible for dynamic plans

Weaknesses:

- more dispatch overhead
- fewer opportunities for low-level specialization

### Compiled or code-generated execution

The engine turns expressions or even whole pipelines into specialized machine code, bytecode, or generated source.

Strengths:

- less runtime interpretation overhead
- more aggressive specialization
- can fuse multiple operations together

Weaknesses:

- much more implementation complexity
- longer compile/startup path
- harder debugging and observability

Many real systems are hybrid: interpreted at some layers, compiled at others.

---

## Pipelined vs materializing architectures

Query engines also differ in when they store intermediate results.

### Pipelined execution

Operators stream data incrementally from one stage to the next.

Strengths:

- lower latency
- less memory for some workloads
- natural for streaming or interactive execution

Weaknesses:

- harder to reuse intermediates
- blocking operators still force barriers

### Materializing execution

The engine fully computes an intermediate result before the next stage uses it.

Strengths:

- simpler control flow
- easier fault recovery in some systems
- easier sharing of intermediates

Weaknesses:

- more memory and I/O overhead
- higher latency

Real engines mix both. For example, filter and projection may pipeline, while sort and some aggregates necessarily materialize or partially buffer.

---

## Row stores, column stores, and storage coupling

Some engines are tightly integrated with one storage layout. Others sit above many storage systems.

### Storage-coupled engines

In these systems, the executor is deeply aware of the storage engine's pages, indexes, and on-disk layout.

Strengths:

- can exploit detailed storage-level knowledge
- often strong performance for the target workload

Weaknesses:

- harder to adapt to new sources
- architecture is less modular

### Storage-decoupled engines

In these systems, execution interacts with storage through a cleaner source boundary such as scan interfaces and pushdowns.

Strengths:

- easier support for many formats and backends
- cleaner separation of concerns

Weaknesses:

- some storage-specific optimizations become harder
- abstraction can hide useful low-level details

This is a major difference between database kernels and more modular data-processing engines.

---

## Centralized vs distributed architectures

Some engines run entirely inside one process or one machine. Others split work across nodes.

### Centralized engines

All planning and execution happens locally.

Strengths:

- much simpler design
- easier debugging
- lower coordination overhead

Weaknesses:

- limited by one machine's CPU, memory, and I/O

### Distributed engines

The plan is divided into stages or fragments that run on multiple workers.

Common added components include:

- exchange or shuffle operators
- task schedulers
- remote data transport
- fault handling and retries

Strengths:

- can scale to much larger datasets
- can exploit cluster parallelism

Weaknesses:

- much more complexity
- network and serialization overhead
- harder optimization problem

Distributed query engines are not just "single-node engines on more machines." They need an additional architecture for task placement and data
movement.

---

## Blocking vs non-blocking operators

Operator behavior also shapes architecture.

### Non-blocking operators

These can start producing output before consuming all input.

Examples:

- scan
- filter
- projection

### Blocking operators

These require all or much of the input before producing useful output.

Examples:

- sort
- some aggregates
- some join strategies

Blocking operators create execution barriers and often determine where memory pressure and latency appear in the engine.

---

## Common architecture patterns

### 1. Classic relational database kernel

Typical traits:

- row-oriented execution
- iterator model
- tight coupling to storage and indexes
- strong support for transactional workloads

### 2. Modern analytical columnar engine

Typical traits:

- columnar in-memory representation
- batch or vectorized execution
- pushdown into file formats or source connectors
- optimized for scans, joins, and aggregates on large datasets

### 3. Distributed SQL or lakehouse engine

Typical traits:

- logical and physical planning on a coordinator
- distributed execution stages on workers
- shuffle or exchange boundaries
- object storage or remote tables as sources

### 4. Streaming query engine

Typical traits:

- push-style or event-driven data flow
- unbounded inputs
- windows, watermarks, and incremental state
- low-latency continuous execution

### 5. Logic or rule engine

Typical traits:

- facts and rules instead of only relational algebra
- recursion and fixpoint evaluation
- unification, derivation, or chase-like execution
- architecture organized around inference as much as scanning

This last category matters when the "query engine" is not mainly about SQL tables at all.

---

## Comparison table

| Architecture style | Typical unit of work | Best for | Main weakness |
|:-------------------|:---------------------|:---------|:--------------|
| Row iterator       | one row              | simplicity, OLTP-style access | high per-row overhead |
| Vectorized columnar| batch / column chunk | analytics | more implementation complexity |
| Materializing      | full intermediate    | simpler barriers and reuse | memory and latency cost |
| Pipelined          | incremental stream   | low latency | harder coordination around blocking operators |
| Distributed        | stage / partition    | scale-out analytics | network and scheduling complexity |
| Rule/fixpoint      | fact/rule step       | recursion and inference | different optimization model from relational SQL |

---

## How to think about architectural fit

A good architecture depends on what the engine is for.

If the workload is mostly:

- point lookups and updates, row-oriented designs can be reasonable
- large scans and aggregates, vectorized columnar designs usually win
- recursive inference or logical closure, rule/fixpoint architectures become central
- cluster-scale datasets, distributed planning and exchange operators become unavoidable

So the architecture should follow the workload, not the other way around.

---

## A practical summary

If you need the shortest useful summary, this is a good one:

> Query engine architecture is mostly about execution granularity, operator communication, intermediate storage, and the boundary with storage and
> distribution.

Or more concretely:

- row vs batch
- interpreted vs compiled
- pipelined vs materialized
- local vs distributed
- relational vs rule-oriented execution

Those choices explain most of the important differences between engines.

---

## Questions to keep in mind

- What is the unit of execution?
- Which operators are blocking?
- How much is pushed down to the data source?
- Is storage tightly coupled or abstracted behind connectors?
- Does the workload look more transactional, analytical, streaming, or logical?
- Is distribution a first-order requirement or premature complexity?

---

## Changelog

* **Mar 31, 2026** -- First version created.
Add a note file about query engine architectures 2026-03-31 15:00:48 +02:00			`# Query Engine Architectures`

			`A reference for the main ways query engines are structured internally.`

			`---`

			`## Short answer`

			`There is no single query engine architecture.`

			`Most engines are built by making choices along a few major axes:`

			`- row-oriented vs columnar`
			`- interpreted vs compiled`
			`- pipelined vs materializing`
			`- centralized vs distributed`
			`- tightly coupled to storage vs separated through a source interface`

			`Different systems pick different combinations depending on whether they optimize for transactional access, analytics, low latency, concurrency,`
			`distribution, or implementation simplicity.`

			`---`

			`## A useful mental model`

			`When people say "query engine architecture," they usually mean the answer to questions like:`

			`- what is the unit of execution: row, batch, or whole relation?`
			`- how do operators communicate: pull, push, or staged pipelines?`
			`- when are intermediates materialized?`
			`- where does optimization happen?`
			`- how tightly is execution tied to the underlying storage?`
			`- does one machine run the whole plan, or is the plan split across many workers?`

			`So architecture is less about one named pattern and more about a set of design choices.`

			`---`

			`## The classic row-at-a-time iterator model`

			`One of the most common traditional designs is the Volcano-style iterator model.`

			`In that architecture:`

			- each operator exposes a method like `next()`
			`- parent operators pull one tuple at a time from child operators`
			`- data flows upward through a tree of operators`

			`Typical shape:`

			1. `Projection`
			2. pulls from `Filter`
			3. which pulls from `Scan`

			`### Strengths`

			`- simple and modular`
			`- easy to understand and implement`
			`- operators compose cleanly`
			`- works well for many relational operators`

			`### Weaknesses`

			`- pays per-row virtual-call or dispatch overhead`
			`- poor cache locality for analytical scans`
			`- not ideal for SIMD or vectorized processing`

			`This architecture was historically very influential, but many modern analytical engines move away from pure row-at-a-time execution.`

			`---`

			`## Vectorized and batch-oriented engines`

			`Modern analytical engines often process batches of rows, frequently in columnar form.`

			Instead of asking for one row at a time, an operator processes a chunk such as a `RecordBatch` containing many values per column.

			`### Strengths`

			`- lower per-row overhead`
			`- better cache behavior`
			`- fits columnar storage and in-memory formats well`
			`- makes vectorized computation practical`

			`### Weaknesses`

			`- more complex operator implementations`
			`- more bookkeeping around batch boundaries`
			`- some control-flow-heavy operations fit less naturally`

			`This style is common in engines optimized for scans, filters, aggregations, and joins over large datasets.`

			`---`

			`## Interpreted vs compiled execution`

			`Another major architecture decision is whether plans are interpreted directly or turned into specialized executable code.`

			`### Interpreted execution`

			`The engine walks the plan at runtime and runs generic operator implementations.`

			`Strengths:`

			`- easier to build`
			`- easier to debug`
			`- more flexible for dynamic plans`

			`Weaknesses:`

			`- more dispatch overhead`
			`- fewer opportunities for low-level specialization`

			`### Compiled or code-generated execution`

			`The engine turns expressions or even whole pipelines into specialized machine code, bytecode, or generated source.`

			`Strengths:`

			`- less runtime interpretation overhead`
			`- more aggressive specialization`
			`- can fuse multiple operations together`

			`Weaknesses:`

			`- much more implementation complexity`
			`- longer compile/startup path`
			`- harder debugging and observability`

			`Many real systems are hybrid: interpreted at some layers, compiled at others.`

			`---`

			`## Pipelined vs materializing architectures`

			`Query engines also differ in when they store intermediate results.`

			`### Pipelined execution`

			`Operators stream data incrementally from one stage to the next.`

			`Strengths:`

			`- lower latency`
			`- less memory for some workloads`
			`- natural for streaming or interactive execution`

			`Weaknesses:`

			`- harder to reuse intermediates`
			`- blocking operators still force barriers`

			`### Materializing execution`

			`The engine fully computes an intermediate result before the next stage uses it.`

			`Strengths:`

			`- simpler control flow`
			`- easier fault recovery in some systems`
			`- easier sharing of intermediates`

			`Weaknesses:`

			`- more memory and I/O overhead`
			`- higher latency`

			`Real engines mix both. For example, filter and projection may pipeline, while sort and some aggregates necessarily materialize or partially buffer.`

			`---`

			`## Row stores, column stores, and storage coupling`

			`Some engines are tightly integrated with one storage layout. Others sit above many storage systems.`

			`### Storage-coupled engines`

			`In these systems, the executor is deeply aware of the storage engine's pages, indexes, and on-disk layout.`

			`Strengths:`

			`- can exploit detailed storage-level knowledge`
			`- often strong performance for the target workload`

			`Weaknesses:`

			`- harder to adapt to new sources`
			`- architecture is less modular`

			`### Storage-decoupled engines`

			`In these systems, execution interacts with storage through a cleaner source boundary such as scan interfaces and pushdowns.`

			`Strengths:`

			`- easier support for many formats and backends`
			`- cleaner separation of concerns`

			`Weaknesses:`

			`- some storage-specific optimizations become harder`
			`- abstraction can hide useful low-level details`

			`This is a major difference between database kernels and more modular data-processing engines.`

			`---`

			`## Centralized vs distributed architectures`

			`Some engines run entirely inside one process or one machine. Others split work across nodes.`

			`### Centralized engines`

			`All planning and execution happens locally.`

			`Strengths:`

			`- much simpler design`
			`- easier debugging`
			`- lower coordination overhead`

			`Weaknesses:`

			`- limited by one machine's CPU, memory, and I/O`

			`### Distributed engines`

			`The plan is divided into stages or fragments that run on multiple workers.`

			`Common added components include:`

			`- exchange or shuffle operators`
			`- task schedulers`
			`- remote data transport`
			`- fault handling and retries`

			`Strengths:`

			`- can scale to much larger datasets`
			`- can exploit cluster parallelism`

			`Weaknesses:`

			`- much more complexity`
			`- network and serialization overhead`
			`- harder optimization problem`

			`Distributed query engines are not just "single-node engines on more machines." They need an additional architecture for task placement and data`
			`movement.`

			`---`

			`## Blocking vs non-blocking operators`

			`Operator behavior also shapes architecture.`

			`### Non-blocking operators`

			`These can start producing output before consuming all input.`

			`Examples:`

			`- scan`
			`- filter`
			`- projection`

			`### Blocking operators`

			`These require all or much of the input before producing useful output.`

			`Examples:`

			`- sort`
			`- some aggregates`
			`- some join strategies`

			`Blocking operators create execution barriers and often determine where memory pressure and latency appear in the engine.`

			`---`

			`## Common architecture patterns`

			`### 1. Classic relational database kernel`

			`Typical traits:`

			`- row-oriented execution`
			`- iterator model`
			`- tight coupling to storage and indexes`
			`- strong support for transactional workloads`

			`### 2. Modern analytical columnar engine`

			`Typical traits:`

			`- columnar in-memory representation`
			`- batch or vectorized execution`
			`- pushdown into file formats or source connectors`
			`- optimized for scans, joins, and aggregates on large datasets`

			`### 3. Distributed SQL or lakehouse engine`

			`Typical traits:`

			`- logical and physical planning on a coordinator`
			`- distributed execution stages on workers`
			`- shuffle or exchange boundaries`
			`- object storage or remote tables as sources`

			`### 4. Streaming query engine`

			`Typical traits:`

			`- push-style or event-driven data flow`
			`- unbounded inputs`
			`- windows, watermarks, and incremental state`
			`- low-latency continuous execution`

			`### 5. Logic or rule engine`

			`Typical traits:`

			`- facts and rules instead of only relational algebra`
			`- recursion and fixpoint evaluation`
			`- unification, derivation, or chase-like execution`
			`- architecture organized around inference as much as scanning`

			`This last category matters when the "query engine" is not mainly about SQL tables at all.`

			`---`

			`## Comparison table`

			`\| Architecture style \| Typical unit of work \| Best for \| Main weakness \|`
			`\|:-------------------\|:---------------------\|:---------\|:--------------\|`
			`\| Row iterator \| one row \| simplicity, OLTP-style access \| high per-row overhead \|`
			`\| Vectorized columnar\| batch / column chunk \| analytics \| more implementation complexity \|`
			`\| Materializing \| full intermediate \| simpler barriers and reuse \| memory and latency cost \|`
			`\| Pipelined \| incremental stream \| low latency \| harder coordination around blocking operators \|`
			`\| Distributed \| stage / partition \| scale-out analytics \| network and scheduling complexity \|`
			`\| Rule/fixpoint \| fact/rule step \| recursion and inference \| different optimization model from relational SQL \|`

			`---`

			`## How to think about architectural fit`

			`A good architecture depends on what the engine is for.`

			`If the workload is mostly:`

			`- point lookups and updates, row-oriented designs can be reasonable`
			`- large scans and aggregates, vectorized columnar designs usually win`
			`- recursive inference or logical closure, rule/fixpoint architectures become central`
			`- cluster-scale datasets, distributed planning and exchange operators become unavoidable`

			`So the architecture should follow the workload, not the other way around.`

			`---`

			`## A practical summary`

			`If you need the shortest useful summary, this is a good one:`

			`> Query engine architecture is mostly about execution granularity, operator communication, intermediate storage, and the boundary with storage and`
			`> distribution.`

			`Or more concretely:`

			`- row vs batch`
			`- interpreted vs compiled`
			`- pipelined vs materialized`
			`- local vs distributed`
			`- relational vs rule-oriented execution`

			`Those choices explain most of the important differences between engines.`

			`---`

			`## Questions to keep in mind`

			`- What is the unit of execution?`
			`- Which operators are blocking?`
			`- How much is pushed down to the data source?`
			`- Is storage tightly coupled or abstracted behind connectors?`
			`- Does the workload look more transactional, analytical, streaming, or logical?`
			`- Is distribution a first-order requirement or premature complexity?`

			`---`

			`## Changelog`

			`* Mar 31, 2026 -- First version created.`