# Query Execution Models

A reference for the main ways query operators run at runtime.

---

## Short answer

An execution model defines how operators consume input, produce output, and pass data through a plan.

The most important questions are:

- one row at a time or many values at once?
- pull-based or push-based?
- pipelined or materialized?

Those choices strongly affect latency, CPU efficiency, and implementation complexity.

---

## Row-at-a-time execution

In a row-at-a-time model, operators process one tuple at a time.

This is often implemented with an iterator interface, commonly called the Volcano or iterator model, where a parent asks a child for the next row.

Strengths:

- simple
- modular
- easy to debug

Weaknesses:

- high per-row overhead, typically at least one function call per operator per row
- worse cache behavior for analytics

This model is historically important and still useful in many systems, particularly where each query touches only a few rows.
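The iterator interface described above can be sketched as follows. This is a minimal illustration rather than any particular engine's API: the class names `Scan` and `Filter`, the `next()` method that returns `None` at end of input, and the `drain` helper are all assumptions made for the example.

```python
class Scan:
    """Leaf operator: hands out rows from an in-memory list, one per call."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def next(self):
        if self._pos >= len(self._rows):
            return None  # end of input
        row = self._rows[self._pos]
        self._pos += 1
        return row


class Filter:
    """Passes through only the child rows that satisfy the predicate."""

    def __init__(self, child, predicate):
        self._child = child
        self._predicate = predicate

    def next(self):
        # Keep pulling from the child until a row qualifies or input ends.
        while (row := self._child.next()) is not None:
            if self._predicate(row):
                return row
        return None


def drain(op):
    """Hypothetical driver: pull from the root operator until it is exhausted."""
    out = []
    while (row := op.next()) is not None:
        out.append(row)
    return out


plan = Filter(Scan([(1, "a"), (2, "b"), (3, "c")]), lambda r: r[0] >= 2)
result = drain(plan)  # every output row costs one next() call per operator
```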

---

## Batch-oriented execution

In a batch model, operators process chunks of rows together.

The batch may be row-based or columnar, but the main idea is to amortize operator overhead across many values.

Strengths:

- better CPU efficiency
- lower dispatch overhead
- easier parallelism inside an operator

Weaknesses:

- more bookkeeping
- more complex control flow
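A rough sketch of the batch idea, assuming row-based batches held in plain Python lists. The function names `scan_batches` and `filter_batches` and the batch size of 4 are hypothetical choices for the example. Each operator call now handles a whole batch, so the per-call overhead is shared by every row in it.

```python
def scan_batches(rows, batch_size):
    """Yield the input as lists of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def filter_batches(batches, predicate):
    """Filter one batch per step instead of one row per step."""
    for batch in batches:
        kept = [row for row in batch if predicate(row)]
        if kept:  # bookkeeping: do not forward empty batches
            yield kept


rows = [(i, i * 10) for i in range(10)]
batches = list(filter_batches(scan_batches(rows, 4), lambda r: r[0] % 2 == 0))
```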

---

## Vectorized execution

Vectorized execution is a batch-oriented style where operators often process column vectors rather than full row objects.

This fits well with columnar memory layouts and analytical workloads.

Strengths:

- excellent cache locality
- better SIMD opportunities
- good fit for scans, filters, joins, and aggregates

Weaknesses:

- some control-flow-heavy logic is less natural
- more careful null and type handling is needed
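The sketch below illustrates only the data layout of a vectorized operator. It assumes columns stored as parallel Python lists, a separate validity list for nulls, and a selection vector of qualifying positions. Plain Python lists give none of the SIMD or cache benefits a real engine would get, and the names `select_greater_than` and `sum_selected` are hypothetical.

```python
def select_greater_than(values, valid, threshold):
    """Return a selection vector: positions that are non-null and > threshold."""
    return [i for i in range(len(values)) if valid[i] and values[i] > threshold]


def sum_selected(values, selection):
    """Aggregate one column, visiting only the selected positions."""
    total = 0
    for i in selection:
        total += values[i]
    return total


# Two columns of the same batch, stored separately.
price = [5, 12, 0, 30, 8]
price_valid = [True, True, False, True, True]  # position 2 is NULL
qty = [1, 2, 3, 4, 5]

# The filter touches only the price column; the aggregate only the qty column.
selection = select_greater_than(price, price_valid, 7)
total_qty = sum_selected(qty, selection)
```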

---

## Pull vs push

### Pull-based execution

Parent operators ask children for data.

Strengths:

- natural operator trees
- straightforward control flow

Weaknesses:

- can introduce repeated dispatch overhead

### Push-based execution

Child operators push data to parents or downstream consumers.

Strengths:

- natural for streaming or event-driven systems
- can work well with pipeline fusion

Weaknesses:

- control flow can be harder to reason about

Many systems combine these ideas rather than choosing only one.
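For contrast with the pull style, here is a minimal push-based sketch. The `push()` and `finish()` methods and the names `Collect`, `PushFilter`, and `run_scan` are assumptions for illustration, not a standard interface. The source drives the loop and each operator hands qualifying rows to its downstream consumer.

```python
class Collect:
    """Sink: stores every row pushed into it."""

    def __init__(self):
        self.rows = []
        self.finished = False

    def push(self, row):
        self.rows.append(row)

    def finish(self):
        self.finished = True


class PushFilter:
    """Forwards a row downstream only if it satisfies the predicate."""

    def __init__(self, predicate, downstream):
        self._predicate = predicate
        self._downstream = downstream

    def push(self, row):
        if self._predicate(row):
            self._downstream.push(row)

    def finish(self):
        self._downstream.finish()


def run_scan(rows, downstream):
    """Source drives the pipeline: push each row, then signal end of input."""
    for row in rows:
        downstream.push(row)
    downstream.finish()


sink = Collect()
run_scan([1, 2, 3, 4], PushFilter(lambda x: x % 2 == 0, sink))
```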

---

## Pipelining vs materialization

### Pipelined execution

Operators pass intermediate results incrementally.

Strengths:

- low latency
- less temporary storage in favorable cases

Weaknesses:

- some operators still create barriers

### Materializing execution

An operator stores its entire output before the next operator consumes it.

Strengths:

- simpler boundaries
- easier reuse of intermediates

Weaknesses:

- more memory and I/O cost
- higher latency
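One way to see the latency difference is to count how many input rows are read before the first output row is available. The sketch below assumes a generator for the pipelined operator and a list for the materializing one. The helper names and the read counter are hypothetical.

```python
def counting_source(n, counter):
    """Yield 0..n-1 and record how many rows have been read so far."""
    for i in range(n):
        counter["read"] += 1
        yield i


def double_pipelined(rows):
    """Pipelined: emits each output row as soon as its input row arrives."""
    for row in rows:
        yield row * 2


def double_materialized(rows):
    """Materializing: builds the entire output before returning any of it."""
    return [row * 2 for row in rows]


pipelined_reads = {"read": 0}
first_pipelined = next(double_pipelined(counting_source(1000, pipelined_reads)))

materialized_reads = {"read": 0}
first_materialized = double_materialized(counting_source(1000, materialized_reads))[0]
```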

---

## Blocking operators

Some operators are naturally blocking.

Examples:

- sort
- some aggregates, such as a hash aggregate that cannot emit a final group until it has seen all input
- some join strategies, such as the build side of a hash join

These operators shape the real execution behavior of the plan because they force buffering or full-input processing before useful output appears.
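The effect of a blocking operator can be made visible with an event log. In this sketch the generator-based operators, the `log` list, and the function names are assumptions for the example. The sort reads every input row before it emits its first output row, even though it sits inside an otherwise pipelined plan.

```python
def traced_source(rows, log):
    """Yield rows one at a time, logging each read."""
    for row in rows:
        log.append(("read", row))
        yield row


def sort_operator(rows, log):
    """Blocking: sorted() must consume all input before anything is emitted."""
    buffered = sorted(rows)
    for row in buffered:
        log.append(("emit", row))
        yield row


log = []
output = list(sort_operator(traced_source([3, 1, 2], log), log))
```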

---

## Practical mental model

Execution models are about runtime granularity and data flow.

If architecture asks "what kind of engine is this?", the execution model asks "how do operators actually run?"

---

## Changelog

* **April 1, 2026** -- First version created.