# Distributed Query Engines

A reference for what changes when query execution moves from one machine to many.

---

## Short answer

A distributed query engine is not just a single-node engine with remote workers.

Once execution is spread across machines, the engine needs extra architecture for:

- partitioning data
- moving data between stages
- scheduling tasks
- handling failures

Those concerns become first-order design problems.

---

## The basic shape

Most distributed engines have:

- a coordinator or planner
- workers or executors
- partitioned input data
- exchange or shuffle steps
- a final result collection stage

The plan is usually broken into fragments or stages that can run in parallel.

---

## What distribution adds

### Partitioning

Data is split across machines, often by file boundaries, shards, or key ranges.

### Exchange / shuffle

Rows or batches are moved across the network so that later operators see the right grouping or join partition.

### Scheduling

The system decides where tasks run and when.

### Fault tolerance

The engine needs some strategy for retries, recomputation, or checkpointing.

### Coordination overhead

Network, serialization, and orchestration can easily dominate runtime if the plan is badly shaped.

---

## Why distributed execution is hard

The main difficulty is that a plan that looks cheap algebraically may be expensive once network movement is included.

For example:

- a join may require shuffling huge datasets
- a group-by may need global repartitioning by key
- skewed keys may overload one worker

So distributed optimization is partly about minimizing data movement, not just local operator cost.

---

## Common patterns

### Map-then-reduce style

Local work happens close to the data, then partial results are shuffled and merged.

### Stage DAGs

Execution is represented as a directed acyclic graph of stages separated by exchange boundaries.

### Partial then final aggregation

Workers compute local aggregates, then a later stage merges them.

---

## Practical mental model

Single-node engines mostly optimize CPU, memory, and local I/O.

Distributed engines must also optimize:

- network traffic
- partition balance
- task scheduling
- failure recovery

That is why distributed query processing is a second architecture layer rather than a small extension.

---

## Changelog

* **April 1, 2026** -- First version created.