Hassan Abedi 8ed8347380 Add note files about query execution models and indexes

2026-04-01 09:12:33 +02:00

Query Execution Models

A reference for the main ways query operators run at runtime.

Short answer

An execution model defines how operators consume input, produce output, and pass data through a plan.

The most important questions are:

one row at a time or many values at once?
pull-based or push-based?
pipelined or materialized?

Those choices strongly affect latency, CPU efficiency, and implementation complexity.

Row-at-a-time execution

In a row-oriented model, operators process one tuple at a time.

This is often implemented with an iterator interface where a parent asks a child for the next row.

Strengths:

simple
modular
easy to debug

Weaknesses:

high per-row overhead
worse cache behavior for analytics

This is the classic iterator model, often called the Volcano model. It is still a good fit for transactional workloads, where each query touches only a few rows.
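The iterator interface described above can be sketched as follows. This is a minimal illustration, not any particular engine's API: the class names `Scan`, `Filter`, and `Project` and the `next()` convention (returning `None` at end of input) are assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

Row = tuple  # one tuple of column values


class Scan:
    """Hypothetical leaf operator: returns rows of an in-memory table one at a time."""

    def __init__(self, rows: Sequence[Row]) -> None:
        self.rows = rows
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None  # end of input
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    """Hypothetical operator: passes through rows for which the predicate holds."""

    def __init__(self, child, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[Row]:
        # Keep asking the child until a row qualifies or the input ends.
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None


class Project:
    """Hypothetical operator: applies a function to each row."""

    def __init__(self, child, fn: Callable[[Row], Row]) -> None:
        self.child = child
        self.fn = fn

    def next(self) -> Optional[Row]:
        row = self.child.next()
        return None if row is None else self.fn(row)


def run(root) -> list:
    """Drain the plan by calling next() on the root until it is exhausted."""
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


table = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
plan = Project(Filter(Scan(table), lambda r: r[0] % 2 == 0), lambda r: (r[1],))
result = run(plan)
```

Every output row costs one `next()` call per operator in the chain, which is the per-row overhead listed under weaknesses.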

Batch-oriented execution

In a batch model, operators process chunks of rows together.

The batch may be row-based or columnar, but the main idea is to amortize operator overhead across many values.

Strengths:

better CPU efficiency
lower dispatch overhead
easier parallelism inside an operator

Weaknesses:

more bookkeeping
more complex control flow
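A batch pipeline might look like the sketch below, where each call between operators carries a list of rows instead of a single row. The function names and the `batch_size` parameter are hypothetical, chosen only for illustration.

```python
from typing import Callable, Iterator, List, Sequence

Row = tuple
Batch = List[Row]


def scan_batches(rows: Sequence[Row], batch_size: int) -> Iterator[Batch]:
    """Hypothetical scan: yields the table in chunks of up to batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


def filter_batches(child: Iterator[Batch],
                   predicate: Callable[[Row], bool]) -> Iterator[Batch]:
    """Hypothetical filter: one call handles a whole batch of rows."""
    for batch in child:
        kept = [row for row in batch if predicate(row)]
        if kept:  # bookkeeping: do not pass empty batches downstream
            yield kept


table = [(i,) for i in range(10)]
batches = list(
    filter_batches(scan_batches(table, batch_size=4), lambda r: r[0] % 2 == 0)
)
```

With ten rows and a batch size of four, the filter is entered three times instead of ten. The empty-batch check is a small example of the extra bookkeeping this style needs.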

Vectorized execution

Vectorized execution is a batch-oriented style where operators often process column vectors rather than full row objects.

This fits well with columnar memory layouts and analytical workloads.

Strengths:

excellent cache locality
better SIMD opportunities
good fit for scans, filters, joins, and aggregates

Weaknesses:

some control-flow-heavy logic is less natural
more careful null and type handling is needed
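The sketch below shows the shape of vectorized operators: each one works on a whole column and communicates through a selection vector of qualifying row indexes. It assumes plain Python lists with `None` standing in for SQL NULL; a real engine would use typed contiguous arrays and a validity bitmap, and the function names here are hypothetical.

```python
from typing import Dict, List, Optional

Column = List[Optional[int]]  # assumption: None stands in for SQL NULL


def select_greater_than(col: Column, threshold: int) -> List[int]:
    """Return a selection vector: indexes of rows where col > threshold.

    NULL values never qualify, so they must be checked explicitly.
    """
    return [i for i, v in enumerate(col) if v is not None and v > threshold]


def gather(col: Column, selection: List[int]) -> Column:
    """Pick the selected positions out of another column of the same batch."""
    return [col[i] for i in selection]


def sum_ignoring_nulls(col: Column) -> int:
    """Aggregate a column, skipping NULLs."""
    return sum(v for v in col if v is not None)


batch: Dict[str, Column] = {
    "price": [10, 25, None, 40, 5],
    "qty": [1, 2, 3, None, 5],
}

selection = select_greater_than(batch["price"], 8)
selected_qty = gather(batch["qty"], selection)
total_qty = sum_ignoring_nulls(selected_qty)
```

Each function is a tight loop over one column, the kind of loop that a compiled engine can run with SIMD instructions. The explicit `None` checks show where null handling needs extra care.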

Pull vs push

Pull-based execution

Parent operators ask children for data.

Strengths:

natural operator trees
straightforward control flow

Weaknesses:

adds a function call per row (or per batch) at every operator boundary
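In a pull-based plan the consumer drives the work, so a parent that stops asking also stops its children. The sketch below uses Python generators to show this; `scan`, `keep_even`, `limit`, and the `produced` log are hypothetical names for illustration.

```python
from typing import Iterator, List

produced: List[int] = []  # records which values the scan actually generated


def scan(n: int) -> Iterator[int]:
    """Hypothetical source: generates 0..n-1, but only when asked."""
    for i in range(n):
        produced.append(i)
        yield i


def keep_even(child: Iterator[int]) -> Iterator[int]:
    """Pulls from its child until it finds an even value to hand upward."""
    for v in child:
        if v % 2 == 0:
            yield v


def limit(child: Iterator[int], k: int) -> Iterator[int]:
    """Stops pulling from its child once k values have been returned."""
    if k <= 0:
        return
    for count, v in enumerate(child, start=1):
        yield v
        if count >= k:
            return


result = list(limit(keep_even(scan(1_000)), 2))
```

Although the scan could generate a thousand values, only the first three are ever produced, because `limit` stops pulling after two results.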

Push-based execution

Child operators push data to parents or downstream consumers.

Strengths:

natural for streaming or event-driven systems
can work well with pipeline fusion

Weaknesses:

control flow can be harder to reason about
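A push-based pipeline inverts the direction of the calls: the source drives the loop and each operator hands rows to its downstream consumer. The following is a minimal sketch using callbacks; `make_filter`, `make_map`, and `run_source` are hypothetical helpers, not part of any library.

```python
from typing import Callable, Iterable, List

Sink = Callable[[int], None]  # anything that accepts one pushed value


def make_filter(predicate: Callable[[int], bool], downstream: Sink) -> Sink:
    """Build a consumer that forwards only qualifying values."""
    def on_row(value: int) -> None:
        if predicate(value):
            downstream(value)
    return on_row


def make_map(fn: Callable[[int], int], downstream: Sink) -> Sink:
    """Build a consumer that transforms each value before forwarding it."""
    def on_row(value: int) -> None:
        downstream(fn(value))
    return on_row


def run_source(values: Iterable[int], downstream: Sink) -> None:
    """The source owns the loop and pushes each value into the pipeline."""
    for value in values:
        downstream(value)


collected: List[int] = []
# The pipeline is assembled from the sink backwards: map feeds the collector,
# filter feeds map.
pipeline = make_filter(lambda v: v % 2 == 1,
                       make_map(lambda v: v * 10, collected.append))
run_source(range(6), pipeline)
```

Because the whole pipeline runs inside the source's single loop, a compiler can fuse the operators into one piece of code. The cost is that a consumer has no direct way to tell the source to stop early, which is one reason the control flow is harder to reason about.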

Many systems combine these ideas rather than choosing only one.

Pipelining vs materialization

Pipelined execution

Operators pass intermediate results incrementally.

Strengths:

low latency
less temporary storage in favorable cases

Weaknesses:

blocking operators such as sort still break the pipeline

Materializing execution

An operator stores its entire output before the next operator consumes it.

Strengths:

simpler boundaries
easier reuse of intermediates

Weaknesses:

more memory and I/O cost
higher latency
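The latency difference between the two styles can be seen by counting how many input rows are read before the first output row is available. This is a small sketch under simplified assumptions; all function names are hypothetical.

```python
from typing import Iterable, Iterator, List


def counting_source(n: int, log: List[int]) -> Iterator[int]:
    """Hypothetical source that records every row it hands out."""
    for i in range(n):
        log.append(i)
        yield i


def double_pipelined(rows: Iterable[int]) -> Iterator[int]:
    """Emits each result as soon as its input row arrives."""
    for v in rows:
        yield v * 2


def double_materialized(rows: Iterable[int]) -> List[int]:
    """Computes and stores the entire output before returning it."""
    return [v * 2 for v in rows]


pipelined_log: List[int] = []
first_pipelined = next(double_pipelined(counting_source(5, pipelined_log)))
rows_read_pipelined = len(pipelined_log)

materialized_log: List[int] = []
first_materialized = double_materialized(counting_source(5, materialized_log))[0]
rows_read_materialized = len(materialized_log)
```

Both variants give the same first value, but the pipelined one reads a single input row to produce it, while the materialized one reads all five and holds the full result in memory.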

Blocking operators

Some operators are naturally blocking.

Examples:

sort
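aggregates that need the whole input, such as a hash-based GROUP BY
join strategies that build or sort first, such as the build side of a hash join or a sort-merge join on unsorted input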

These operators shape the real execution behavior of the plan because they force buffering or full-input processing before useful output appears.
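A sort operator illustrates the blocking behavior: it cannot emit its first row until it has seen its last input row. The sketch below records the order of events; `source`, `sort_op`, and the `events` log are hypothetical names used only for this example.

```python
from typing import Iterable, Iterator, List, Tuple

events: List[Tuple[str, int]] = []  # ordered log of reads and emits


def source(values: Iterable[int]) -> Iterator[int]:
    """Hypothetical child operator that logs each row it hands out."""
    for v in values:
        events.append(("read", v))
        yield v


def sort_op(child: Iterable[int]) -> Iterator[int]:
    """Blocking operator: buffers all input, then emits rows in sorted order."""
    buffered = sorted(child)  # consumes the entire child before any output
    for v in buffered:
        events.append(("emit", v))
        yield v


result = list(sort_op(source([3, 1, 2])))
```

In the resulting log, all three reads come before the first emit, even though the operator is written as a generator. Anything downstream of the sort has to wait for the whole input.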

Practical mental model

Execution models are about runtime granularity and data flow.

If architecture asks "what kind of engine is this?", the execution model asks "how do operators actually run?"

Changelog

April 1, 2026 -- First version created.
