# Query Planning

A reference for how a query request becomes an internal plan.

---

## Short answer

Query planning is the stage where a query engine turns a user request into a structured representation of work to be done.

The main point is to separate:

- the syntax the user wrote
- the meaning of the query
- the later execution strategy

Without that separation, optimization and backend-independent execution become much harder.

---

## Typical pipeline

Planning usually sits between parsing and optimization:

1. parse query text or API calls
2. build an AST or similar syntax tree
3. resolve names and types
4. produce a logical plan
5. hand that plan to the optimizer

The exact boundaries differ across systems, but the general idea is stable.

---

## What planning does

### Parse structure into operations

The planner turns syntax such as `SELECT`, `WHERE`, `GROUP BY`, and `JOIN` into relational operators such as:

- scan
- projection
- filter
- join
- aggregate
- limit
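
As a rough sketch, the operators listed above could be modeled as small immutable node types that each wrap their input. The class names, fields, and the `describe` helper are hypothetical and not taken from any particular engine; join and aggregate are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Scan:
    table: str  # leaf operator: reads rows from a named source


@dataclass(frozen=True)
class Filter:
    child: "Plan"
    predicate: str  # kept as text here; an engine would store an expression tree


@dataclass(frozen=True)
class Projection:
    child: "Plan"
    columns: Tuple[str, ...]


@dataclass(frozen=True)
class Limit:
    child: "Plan"
    count: int


Plan = Union[Scan, Filter, Projection, Limit]


def describe(plan: Plan, depth: int = 0) -> str:
    """Render a plan as an indented tree, root operator first."""
    pad = "  " * depth
    if isinstance(plan, Scan):
        return f"{pad}Scan({plan.table})"
    if isinstance(plan, Filter):
        head = f"{pad}Filter({plan.predicate})"
    elif isinstance(plan, Projection):
        head = f"{pad}Projection({', '.join(plan.columns)})"
    else:
        head = f"{pad}Limit({plan.count})"
    return head + "\n" + describe(plan.child, depth + 1)


plan = Limit(Projection(Filter(Scan("employees"), "age > 18"), ("name",)), 10)
print(describe(plan))
```

Printing the plan shows the root operator first, with each input indented beneath it.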

### Resolve names

The planner determines which table or source each name refers to and which columns each expression references.
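
A minimal sketch of column resolution against an in-memory catalog follows. The `CATALOG` contents and the `resolve_column` function are assumptions made for illustration; real engines typically track scopes, aliases, and nested queries as well.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical catalog: table name -> column names.
CATALOG: Dict[str, List[str]] = {
    "employees": ["id", "name", "age", "dept_id"],
    "departments": ["id", "title"],
}


def resolve_column(name: str, tables_in_scope: List[str]) -> Tuple[str, str]:
    """Return (table, column) for a possibly qualified column name."""
    qualifier: Optional[str] = None
    column = name
    if "." in name:
        qualifier, column = name.split(".", 1)

    # Keep every in-scope table that matches the qualifier and has the column.
    candidates = [
        t for t in tables_in_scope
        if (qualifier is None or t == qualifier) and column in CATALOG.get(t, [])
    ]
    if not candidates:
        raise LookupError(f"unknown column: {name}")
    if len(candidates) > 1:
        raise LookupError(f"ambiguous column: {name} (found in {candidates})")
    return candidates[0], column


scope = ["employees", "departments"]
print(resolve_column("age", scope))             # ('employees', 'age')
print(resolve_column("departments.id", scope))  # ('departments', 'id')
```

With both tables in scope, an unqualified `id` raises `LookupError` because the name matches a column in each table.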

### Check types

The planner verifies that expressions are valid, for example that comparisons involve compatible types and that aggregates are used correctly.
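
The comparison case can be sketched as follows. The `SCHEMA` mapping, the type names, and the rule that any two numeric types are comparable are assumptions for this example; aggregate validation is not covered.

```python
from typing import Dict

# Hypothetical column types for one table.
SCHEMA: Dict[str, str] = {"name": "string", "age": "int", "salary": "float"}

NUMERIC = {"int", "float"}


def literal_type(value: object) -> str:
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported literal: {value!r}")


def check_comparison(column: str, value: object) -> str:
    """Type-check `column <op> literal` and return the result type."""
    if column not in SCHEMA:
        raise TypeError(f"unknown column: {column}")
    left, right = SCHEMA[column], literal_type(value)
    comparable = left == right or (left in NUMERIC and right in NUMERIC)
    if not comparable:
        raise TypeError(f"cannot compare {left} with {right}")
    return "bool"


print(check_comparison("age", 18))
```

Here `check_comparison("name", 18)` raises `TypeError`, since a string column cannot be compared with an integer literal under these rules.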

### Build expressions

Predicates and computed columns are turned into internal expression trees.
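
One possible shape for such a tree is sketched below, together with a walk that collects the columns an expression mentions. The node classes and `referenced_columns` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Column, Literal, BinaryOp]


def referenced_columns(expr: Expr) -> Set[str]:
    """Collect the column names an expression mentions."""
    if isinstance(expr, Column):
        return {expr.name}
    if isinstance(expr, BinaryOp):
        return referenced_columns(expr.left) | referenced_columns(expr.right)
    return set()  # literals reference no columns


# Tree for: age > 18 AND salary * 12 > 50000
predicate = BinaryOp(
    "AND",
    BinaryOp(">", Column("age"), Literal(18)),
    BinaryOp(">", BinaryOp("*", Column("salary"), Literal(12)), Literal(50000)),
)
print(sorted(referenced_columns(predicate)))  # ['age', 'salary']
```

Keeping expressions as trees rather than text is what lets later stages inspect and rewrite them.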

### Attach schema information

The planner determines the shape of operator outputs so later stages know what columns and types flow through the plan.
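
The sketch below shows one way output schemas might be derived operator by operator. The `Schema` alias, the `EMPLOYEES` table, and the three helper functions are assumptions for illustration only.

```python
from typing import Dict, List

Schema = Dict[str, str]  # column name -> type name, in output order

# Hypothetical source schema.
EMPLOYEES: Schema = {"id": "int", "name": "string", "age": "int"}


def scan_schema(table_schema: Schema) -> Schema:
    # A scan exposes every column of its source.
    return dict(table_schema)


def filter_schema(child: Schema) -> Schema:
    # A filter drops rows, not columns, so the schema passes through.
    return dict(child)


def projection_schema(child: Schema, columns: List[str]) -> Schema:
    # A projection keeps only the requested columns, in the requested order.
    missing = [c for c in columns if c not in child]
    if missing:
        raise KeyError(f"columns not in input: {missing}")
    return {c: child[c] for c in columns}


out = projection_schema(filter_schema(scan_schema(EMPLOYEES)), ["name"])
print(out)  # {'name': 'string'}
```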

---

## AST vs logical plan

This distinction matters.

- the AST reflects the query language syntax
- the logical plan reflects the data operations implied by that syntax

For example, SQL syntax may contain clauses and aliases that are useful to the parser but irrelevant once the engine understands that the query means "scan, filter, then project."

So planning is partly a translation from language syntax into execution-oriented semantics.

---

## A tiny example

Query:

```sql
SELECT name
FROM employees
WHERE age > 18
```

The parser may produce an AST containing nodes like:

- `SelectStatement`
- `FromClause`
- `WhereClause`

The planner turns that into a logical plan:

1. `Scan(employees)`
2. `Filter(age > 18)`
3. `Projection(name)`

That logical plan is what later stages optimize.
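
The lowering step for this query can be sketched in a few lines. The dictionary layout of the AST and the `plan_select` function are assumptions; only the node names and the resulting steps come from the example above.

```python
from typing import Any, Dict, List

# Hypothetical AST for: SELECT name FROM employees WHERE age > 18
ast: Dict[str, Any] = {
    "type": "SelectStatement",
    "columns": ["name"],
    "from": {"type": "FromClause", "table": "employees"},
    "where": {"type": "WhereClause", "condition": "age > 18"},
}


def plan_select(node: Dict[str, Any]) -> List[str]:
    """Lower a SELECT AST into logical steps, listed in data-flow order."""
    if node["type"] != "SelectStatement":
        raise ValueError(f"unsupported statement: {node['type']}")
    steps = [f"Scan({node['from']['table']})"]
    where = node.get("where")
    if where is not None:  # WHERE is optional
        steps.append(f"Filter({where['condition']})")
    steps.append(f"Projection({', '.join(node['columns'])})")
    return steps


for i, step in enumerate(plan_select(ast), start=1):
    print(f"{i}. {step}")
```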

---

## Why planning matters

Planning is valuable because it creates the first stable representation of meaning inside the engine.

That gives the system a place to:

- validate the query
- reason about schemas
- rewrite plans
- compare equivalent formulations
- target different execution backends

In practice, planning is the bridge between the front-end language and the execution engine.

---

## Common complications

Planning gets harder when the query language includes:

- nested queries
- correlated subqueries
- user-defined functions
- ambiguous names
- multiple source types
- non-relational operators

This is why planning is often a substantial subsystem, not just a parser post-processing step.

---

## Practical mental model

If parsing answers "what syntax did the user write?", planning answers "what data operations does that syntax mean?"

That is the cleanest way to think about it.

---

## Changelog

* **Mar 31, 2026** -- First version created.