# Query Execution Models

A reference for the main ways query operators run at runtime.

---

## Short answer

An execution model defines how operators consume input, produce output, and pass data through a plan.

The most important questions are:

- one row at a time or many values at once?
- pull-based or push-based?
- pipelined or materialized?

Those choices strongly affect latency, CPU efficiency, and implementation complexity.

---

## Row-at-a-time execution

In a row-oriented model, operators process one tuple at a time.

This is often implemented with an iterator interface where a parent asks a child for the next row.

Strengths:

- simple
- modular
- easy to debug

Weaknesses:

- high per-row overhead
- worse cache behavior for analytics

This model is historically important and still useful in many systems.

---

## Batch-oriented execution

In a batch model, operators process chunks of rows together.

The batch may be row-based or columnar, but the main idea is to amortize operator overhead across many values.

Strengths:

- better CPU efficiency
- lower dispatch overhead
- easier parallelism inside an operator

Weaknesses:

- more bookkeeping
- more complex control flow

---

## Vectorized execution

Vectorized execution is a batch-oriented style where operators often process column vectors rather than full row objects.

This fits well with columnar memory layouts and analytical workloads.

Strengths:

- excellent cache locality
- better SIMD opportunities
- good fit for scans, filters, joins, and aggregates

Weaknesses:

- some control-flow-heavy logic is less natural
- more careful null and type handling is needed

---

## Pull vs push

### Pull-based execution

Parent operators ask children for data.

Strengths:

- natural operator trees
- straightforward control flow

Weaknesses:

- can introduce repeated dispatch overhead

### Push-based execution

Child operators push data to parents or downstream consumers.

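The contrast between the two directions can be sketched with a scan feeding a filter. This is a minimal illustration, not the design of any particular engine: the class names (`PullScan`, `PullFilter`, `PushFilter`, `PushCollect`), the `next()` and `consume()` methods, and the use of `None` as an end-of-input marker are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, int]  # hypothetical row representation for this sketch


# Pull-based: the parent calls next() on its child to request one row.
class PullScan:
    def __init__(self, rows: Iterable[Row]) -> None:
        self._it = iter(rows)

    def next(self) -> Optional[Row]:
        # None marks end of input (an assumption of this sketch).
        return next(self._it, None)


class PullFilter:
    def __init__(self, child: PullScan, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[Row]:
        # Keep asking the child until a row passes or input is exhausted.
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


def run_pull(root: PullFilter) -> List[Row]:
    # The consumer drives execution by repeatedly pulling from the root.
    out: List[Row] = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


# Push-based: the producer calls consume() on its downstream operator.
class PushCollect:
    def __init__(self) -> None:
        self.rows: List[Row] = []

    def consume(self, row: Row) -> None:
        self.rows.append(row)


class PushFilter:
    def __init__(self, downstream: PushCollect, predicate: Callable[[Row], bool]) -> None:
        self.downstream = downstream
        self.predicate = predicate

    def consume(self, row: Row) -> None:
        # Forward only the rows that pass; nothing is returned upstream.
        if self.predicate(row):
            self.downstream.consume(row)


def run_push(rows: Iterable[Row], first: PushFilter) -> None:
    # The source drives execution by pushing each row downstream.
    for row in rows:
        first.consume(row)


if __name__ == "__main__":
    data = [{"id": 1, "qty": 5}, {"id": 2, "qty": 50}, {"id": 3, "qty": 500}]
    keep = lambda r: r["qty"] >= 50

    print(run_pull(PullFilter(PullScan(data), keep)))

    sink = PushCollect()
    run_push(data, PushFilter(sink, keep))
    print(sink.rows)
```

Both runs select the same rows. What differs is which side owns the loop: the consumer at the root in the pull version, the source in the push version.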

Strengths:

- natural for streaming or event-driven systems
- can work well with pipeline fusion

Weaknesses:

- control flow can be harder to reason about

Many systems combine these ideas rather than choosing only one.

---

## Pipelining vs materialization

### Pipelined execution

Operators pass intermediate results incrementally.

Strengths:

- low latency
- less temporary storage in favorable cases

Weaknesses:

- some operators still create barriers

### Materializing execution

An operator stores its entire output before the next operator consumes it.

Strengths:

- simpler boundaries
- easier reuse of intermediates

Weaknesses:

- more memory and I/O cost
- higher latency

---

## Blocking operators

Some operators are naturally blocking.

Examples:

- sort
- some aggregates
- some join strategies

These operators shape the real execution behavior of the plan because they force buffering or full-input processing before useful output appears.

---

## Practical mental model

Execution models are about runtime granularity and data flow.

If architecture asks "what kind of engine is this?", the execution model asks "how do operators actually run?"

---

## Changelog

* **April 1, 2026** -- First version created.