Query Execution Models
A reference for the main ways query operators run at runtime.
Short answer
An execution model defines how operators consume input, produce output, and pass data through a plan.
The most important questions are:
- one row at a time or many values at once?
- pull-based or push-based?
- pipelined or materialized?
Those choices strongly affect latency, CPU efficiency, and implementation complexity.
Row-at-a-time execution
In a row-at-a-time model, operators process one tuple at a time: each call across an operator boundary hands over a single row.
This is often implemented with an iterator interface where a parent asks a child for the next row.
Strengths:
- simple
- modular
- easy to debug
Weaknesses:
- high per-row overhead
- worse cache behavior for analytics
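This is the classic iterator (Volcano-style) model. It is historically important and still useful in many systems, especially where queries touch few rows and per-row overhead matters less.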
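The iterator interface described above can be sketched as follows. This is a minimal illustration, not any particular engine's API: the class names `Scan`, `Filter`, and `Project`, the `next()` method, and the use of `None` as the end-of-input marker are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional, Tuple

Row = Tuple  # in this sketch a row is just a plain tuple


class Scan:
    """Leaf operator: hands out rows from an in-memory source, one per call."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[Row]:
        # None signals end of input (an assumption of this sketch).
        return next(self._it, None)


class Filter:
    """Returns only the rows for which the predicate is true."""

    def __init__(self, child, predicate: Callable[[Row], bool]) -> None:
        self._child = child
        self._predicate = predicate

    def next(self) -> Optional[Row]:
        # Keep asking the child until a row qualifies or input ends.
        while (row := self._child.next()) is not None:
            if self._predicate(row):
                return row
        return None


class Project:
    """Applies a function to each row pulled from the child."""

    def __init__(self, child, fn: Callable[[Row], Row]) -> None:
        self._child = child
        self._fn = fn

    def next(self) -> Optional[Row]:
        row = self._child.next()
        return None if row is None else self._fn(row)


def run(root) -> list:
    """Drive the plan by pulling from the root until it is exhausted."""
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


source = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
plan = Project(Filter(Scan(source), lambda r: r[0] % 2 == 0), lambda r: (r[1],))
result = run(plan)  # [("b",), ("d",)]
```

Every output row costs one `next()` call per operator in the chain, which is the per-row overhead listed under the weaknesses.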
Batch-oriented execution
In a batch model, operators process chunks of rows together.
The batch may be row-based or columnar, but the main idea is to amortize operator overhead across many values.
Strengths:
- better CPU efficiency
- lower dispatch overhead
- more opportunity for data-level parallelism inside an operator
Weaknesses:
- more bookkeeping
- more complex control flow
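To make the amortization concrete, here is a small sketch of the same pull interface exchanging batches instead of single rows. The names `BatchScan`, `BatchFilter`, `next_batch`, and the `calls` counters are hypothetical and exist only to count how often an operator boundary is crossed.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Row = Tuple
Batch = List[Row]


class BatchScan:
    """Leaf operator: hands out up to batch_size rows per call."""

    def __init__(self, rows: Sequence[Row], batch_size: int) -> None:
        self._rows = list(rows)
        self._batch_size = batch_size
        self._pos = 0
        self.calls = 0  # how many times next_batch() was invoked

    def next_batch(self) -> Optional[Batch]:
        self.calls += 1
        if self._pos >= len(self._rows):
            return None  # end of input
        batch = self._rows[self._pos:self._pos + self._batch_size]
        self._pos += len(batch)
        return batch


class BatchFilter:
    """Filters a whole batch per call instead of one row per call."""

    def __init__(self, child, predicate: Callable[[Row], bool]) -> None:
        self._child = child
        self._predicate = predicate
        self.calls = 0

    def next_batch(self) -> Optional[Batch]:
        self.calls += 1
        while (batch := self._child.next_batch()) is not None:
            kept = [row for row in batch if self._predicate(row)]
            if kept:  # skip batches that were filtered out entirely
                return kept
        return None


rows = [(i,) for i in range(1000)]
scan = BatchScan(rows, batch_size=256)
flt = BatchFilter(scan, lambda r: r[0] % 2 == 0)

total = 0
while (batch := flt.next_batch()) is not None:
    total += len(batch)
```

With a batch size of 256, the 1,000 input rows cross each operator boundary in five calls (four batches plus the end-of-input call) rather than roughly a thousand. The extra bookkeeping shows up too: the filter has to handle partially filled and fully filtered-out batches.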
Vectorized execution
Vectorized execution is a batch-oriented style where operators often process column vectors rather than full row objects.
This fits well with columnar memory layouts and analytical workloads.
Strengths:
- good cache locality, because each operator works through contiguous values of one column
- better SIMD opportunities
- good fit for scans, filters, joins, and aggregates
Weaknesses:
- some control-flow-heavy logic is less natural
- more careful null and type handling is needed
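The sketch below illustrates the column-vector idea, including the null handling mentioned above. It assumes a simple layout of a value list plus a validity mask, and a selection vector of surviving row positions; `IntVector`, `from_optional`, `select_greater_than`, and `sum_selected` are hypothetical names. Plain Python lists are used only to show the data layout; a real engine would run these loops over typed arrays in native code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IntVector:
    """A column vector: values plus a validity mask (False means NULL)."""
    values: List[int]
    valid: List[bool]


def from_optional(items: List[Optional[int]]) -> IntVector:
    """Build a vector from a list where None stands for NULL."""
    return IntVector(
        values=[0 if x is None else x for x in items],
        valid=[x is not None for x in items],
    )


def select_greater_than(col: IntVector, threshold: int) -> List[int]:
    """Return a selection vector: positions where col > threshold and not NULL."""
    return [
        i
        for i, (value, ok) in enumerate(zip(col.values, col.valid))
        if ok and value > threshold
    ]


def sum_selected(col: IntVector, selection: List[int]) -> int:
    """Sum the non-NULL values of col at the selected positions."""
    return sum(col.values[i] for i in selection if col.valid[i])


# Roughly: SELECT sum(amount) WHERE quantity > 2, over one batch of five rows.
quantity = from_optional([1, 5, None, 3, 4])
amount = from_optional([10, 20, 30, None, 50])

selection = select_greater_than(quantity, 2)  # [1, 3, 4]
total = sum_selected(amount, selection)       # 20 + 50 = 70
```

The filter never builds row objects. It reads one column and emits positions, and the aggregate then reads a different column at only those positions.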
Pull vs push
Pull-based execution
Parent operators ask children for data.
Strengths:
- natural operator trees
- straightforward control flow
Weaknesses:
- adds a call (often a virtual function call) at every operator boundary for each row or batch pulled
Push-based execution
Child operators push data to parents or downstream consumers.
Strengths:
- natural for streaming or event-driven systems
- can work well with pipeline fusion
Weaknesses:
- control flow can be harder to reason about
Many systems combine these ideas rather than choosing only one.
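For contrast with the pull style, a push-based pipeline can be sketched like this. The `push`/`finish` method names and the classes `Collect`, `PushFilter`, and `PushMap` are assumptions for illustration, not a standard interface.

```python
from typing import Callable, Iterable, List, Tuple

Row = Tuple


class Collect:
    """Sink: stores every row pushed into it."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.finished = False

    def push(self, row: Row) -> None:
        self.rows.append(row)

    def finish(self) -> None:
        self.finished = True


class PushMap:
    """Transforms each incoming row and forwards it downstream."""

    def __init__(self, downstream, fn: Callable[[Row], Row]) -> None:
        self._downstream = downstream
        self._fn = fn

    def push(self, row: Row) -> None:
        self._downstream.push(self._fn(row))

    def finish(self) -> None:
        self._downstream.finish()


class PushFilter:
    """Forwards only the rows for which the predicate is true."""

    def __init__(self, downstream, predicate: Callable[[Row], bool]) -> None:
        self._downstream = downstream
        self._predicate = predicate

    def push(self, row: Row) -> None:
        if self._predicate(row):
            self._downstream.push(row)

    def finish(self) -> None:
        self._downstream.finish()


def scan(rows: Iterable[Row], downstream) -> None:
    """Source: drives the pipeline by pushing each row downstream."""
    for row in rows:
        downstream.push(row)
    downstream.finish()


sink = Collect()
# Data flows scan -> filter -> map -> sink; each operator knows its consumer.
pipeline = PushFilter(PushMap(sink, lambda r: (r[0] * 10,)), lambda r: r[0] > 1)
scan([(1,), (2,), (3,)], pipeline)
```

Note the inverted wiring: operators hold a reference to their consumer rather than their input, and the source owns the loop. After the scan finishes, `sink.rows` holds `[(20,), (30,)]`.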
Pipelining vs materialization
Pipelined execution
Operators pass intermediate results incrementally.
Strengths:
- low latency
- less temporary storage in favorable cases
Weaknesses:
- blocking operators such as sort still interrupt the pipeline
Materializing execution
An operator stores its entire output before the next operator consumes it.
Strengths:
- simpler boundaries
- easier reuse of intermediates
Weaknesses:
- more memory and I/O cost
- higher latency
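The difference in data flow shows up in the order of events. The following sketch runs the same two-operator plan both ways and logs each step; the function names and the log format are made up for the example.

```python
from typing import Iterable, Iterator, List


def scan(rows: Iterable[int], log: List[str]) -> Iterator[int]:
    """Source operator that records each row it produces."""
    for row in rows:
        log.append(f"scan {row}")
        yield row


def double_pipelined(child: Iterable[int], log: List[str]) -> Iterator[int]:
    """Pipelined: handles each row as soon as the child produces it."""
    for row in child:
        log.append(f"double {row}")
        yield row * 2


def double_materialized(child: Iterable[int], log: List[str]) -> List[int]:
    """Materializing: drains the child into a buffer before doing any work."""
    buffered = list(child)  # the full intermediate result is stored here
    out = []
    for row in buffered:
        log.append(f"double {row}")
        out.append(row * 2)
    return out


pipelined_log: List[str] = []
pipelined = list(double_pipelined(scan([1, 2, 3], pipelined_log), pipelined_log))

materialized_log: List[str] = []
materialized = double_materialized(scan([1, 2, 3], materialized_log), materialized_log)
```

Both variants return `[2, 4, 6]`. The pipelined log interleaves `scan` and `double` entries row by row, while the materialized log shows all three `scan` entries before the first `double`, which is where the extra memory and latency come from.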
Blocking operators
Some operators are naturally blocking.
Examples:
- sort
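- aggregates that need the full input before emitting a result (for example, a global sum or a hash-based GROUP BY)
- join strategies with a build or sort phase (for example, the build side of a hash join, or a sort-merge join on unsorted input)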
These operators shape the real execution behavior of the plan because they force buffering or full-input processing before useful output appears.
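One way to see the blocking effect is to count how many input rows an operator consumes before it can return its first output row. The sketch below compares a streaming filter with a sort; `counting_scan`, `filter_positive`, and `sort_op` are hypothetical helpers, and the one-element list is just a mutable counter.

```python
from typing import Iterable, Iterator, List


def counting_scan(rows: Iterable[int], counter: List[int]) -> Iterator[int]:
    """Source operator that counts how many rows have been read so far."""
    for row in rows:
        counter[0] += 1
        yield row


def filter_positive(child: Iterable[int]) -> Iterator[int]:
    """Streaming operator: can emit a row as soon as it has seen it."""
    for row in child:
        if row > 0:
            yield row


def sort_op(child: Iterable[int]) -> Iterator[int]:
    """Blocking operator: must read the whole input before emitting anything."""
    buffered = sorted(child)  # consumes the entire child here
    yield from buffered


data = [5, -1, 3, 8, 2]

streamed = [0]
first_filtered = next(filter_positive(counting_scan(data, streamed)))

blocked = [0]
first_sorted = next(sort_op(counting_scan(data, blocked)))
```

The filter returns its first row (`5`) after reading one input row, while the sort returns its first row (`-1`) only after reading all five. Everything upstream of the sort runs to completion before anything downstream sees data.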
Practical mental model
Execution models are about runtime granularity and data flow.
If architecture asks "what kind of engine is this?", the execution model asks "how do operators actually run?"
Changelog
- April 1, 2026 -- First version created.