Query Engine Design Questions

A checklist note for thinking about what kind of query engine to build.


Short answer

Before building a query engine, it helps to force explicit answers to a small set of design questions.

Most architecture disagreements are really disagreements about workload, execution granularity, storage boundary, or correctness model.


Core questions

Workload

  • Is the workload mostly transactional, analytical, streaming, search, or logical inference?
  • Are queries mostly point lookups, scans, aggregations, recursive rules, or top-k retrieval?

Data model

  • Is the data relational, document-oriented, graph-shaped, vector-based, or rule/fact based?
  • What is the engine's core internal representation?

Unit of execution

  • Does the engine run row-at-a-time, batch-at-a-time, or over fully materialized relations?
  • Are blocking operators common?
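
As a rough illustration of the granularity question, the sketch below contrasts a row-at-a-time filter, a batch-at-a-time filter, and a blocking sort. All names here (`filter_rows`, `batches`, `filter_batches`, `sort_rows`) are hypothetical and chosen for this note; real engines typically use columnar batches rather than lists of dicts.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def filter_rows(rows: Iterable[Row], pred: Callable[[Row], bool]) -> Iterator[Row]:
    """Row-at-a-time: one predicate call and one yield per input row."""
    for row in rows:
        if pred(row):
            yield row


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Group a row stream into lists of at most `size` rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def filter_batches(
    bs: Iterable[List[Row]], pred: Callable[[Row], bool]
) -> Iterator[List[Row]]:
    """Batch-at-a-time: one yield per batch, so per-call overhead is amortized."""
    for batch in bs:
        kept = [row for row in batch if pred(row)]
        if kept:
            yield kept


def sort_rows(rows: Iterable[Row], key: Callable[[Row], Any]) -> Iterator[Row]:
    """Blocking operator: consumes all input before emitting the first row."""
    return iter(sorted(rows, key=key))
```

The two filters can stream, while `sort_rows` cannot produce output until its input is exhausted. If operators like that dominate the workload, the choice between row and batch granularity matters less than how materialization is handled.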

Storage boundary

  • Is execution tightly coupled to one storage engine?
  • Or is there a source interface with pushdown capabilities?
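
One possible shape for a source interface with pushdown is sketched below. It assumes a deliberately narrow contract: the engine offers equality predicates, the source applies the ones it can handle, and it hands back the rest as residual work. `InMemorySource`, `scan`, and `execute` are hypothetical names, not an existing API.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Dict[str, Any]
Predicate = Tuple[str, Any]  # (column, value), equality only in this sketch


class InMemorySource:
    """Hypothetical source that can push down equality on one indexed column."""

    def __init__(self, rows: List[Row], indexed_column: str) -> None:
        self.rows = rows
        self.indexed_column = indexed_column
        self.index: Dict[Any, List[Row]] = {}
        for row in rows:
            self.index.setdefault(row[indexed_column], []).append(row)

    def scan(self, predicates: List[Predicate]) -> Tuple[Iterator[Row], List[Predicate]]:
        """Return matching rows plus the predicates the source could not apply."""
        pushed = [p for p in predicates if p[0] == self.indexed_column]
        residual = [p for p in predicates if p[0] != self.indexed_column]
        if pushed:
            # Use the index for the first pushed predicate, then check the others.
            candidates = self.index.get(pushed[0][1], [])
            candidates = [r for r in candidates if all(r[c] == v for c, v in pushed)]
        else:
            candidates = self.rows
        return iter(candidates), residual


def execute(source: InMemorySource, predicates: List[Predicate]) -> List[Row]:
    """Engine side: apply whatever the source reported as residual."""
    rows, residual = source.scan(predicates)
    return [r for r in rows if all(r[c] == v for c, v in residual)]
```

A tightly coupled engine would skip the residual handshake and call the storage layer directly. The handshake is what lets one engine sit in front of sources with different capabilities.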

Indexes

  • What access patterns deserve dedicated indexes?
  • Are indexes exact, approximate, ordered, inverted, or vector-oriented?
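
To make two of these index kinds concrete, here is a minimal sketch of an ordered index (range lookups over sorted keys) and an inverted index (term to document ids). `OrderedIndex` and `build_inverted_index` are hypothetical names, and whitespace tokenization is an assumption made only to keep the example short.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Set


class OrderedIndex:
    """Sorted keys; supports inclusive range lookups via binary search."""

    def __init__(self, keys: List[int]) -> None:
        self.keys = sorted(keys)

    def range(self, low: int, high: int) -> List[int]:
        lo = bisect.bisect_left(self.keys, low)
        hi = bisect.bisect_right(self.keys, high)
        return self.keys[lo:hi]


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase, whitespace-separated term to the ids of docs containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index
```

Both structures are exact. An approximate or vector-oriented index would trade some recall for speed, which ties this question back to the semantics the engine promises.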

Optimization

  • Is the optimizer mostly rule-based or cost-based?
  • What statistics are available?
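
The difference between the two optimizer styles can be shown on a single decision: which input of a hash join to build the hash table on. The sketch assumes the only available statistic is a row count, and `TableStats` and both functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    name: str
    row_count: int  # assumed to be the only statistic available


def rule_based_build_side(left: TableStats, right: TableStats) -> TableStats:
    """Rule-based: a fixed heuristic that ignores statistics (always the right input)."""
    return right


def cost_based_build_side(left: TableStats, right: TableStats) -> TableStats:
    """Cost-based: use statistics, here by building on the input with fewer rows."""
    return left if left.row_count <= right.row_count else right
```

With a 200-row table on the left and a million-row table on the right, the fixed rule builds on the large input while the cost-based choice picks the small one. The cost-based version is only as good as its statistics, which is why the two questions belong together.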

Distribution

  • Is one machine enough?
  • If not, where are exchange boundaries, partitioning choices, and failure handling defined?
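
An exchange boundary can be pictured as a repartitioning step between operators. The sketch below shows a hash exchange that routes rows to partitions by key, so that rows with equal keys meet on the same worker. `partition_for` and `hash_exchange` are hypothetical names, and failure handling is deliberately left out.

```python
import zlib
from typing import Any, Dict, List

Row = Dict[str, Any]


def partition_for(key: str, num_partitions: int) -> int:
    """Stable partition id; crc32 is used because Python's str hash varies per process."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def hash_exchange(rows: List[Row], key_column: str, num_partitions: int) -> List[List[Row]]:
    """Route every row to the partition chosen by hashing its key column."""
    partitions: List[List[Row]] = [[] for _ in range(num_partitions)]
    for row in rows:
        target = partition_for(str(row[key_column]), num_partitions)
        partitions[target].append(row)
    return partitions
```

In a real distributed engine each partition would live on a different worker and the routing would cross the network. Where those crossings happen is the exchange-boundary question above.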

Semantics

  • Is the system exact, approximate, eventually consistent, ranked, or fixpoint-based?
  • Does it support recursion, inference, or witness generation?
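
"Fixpoint-based" semantics can be made concrete with a small example: computing reachability by applying a recursive rule until no new facts appear. This is a naive evaluation sketch and `transitive_closure` is a hypothetical name. Engines that support recursion seriously usually use semi-naive evaluation, which this does not attempt.

```python
from typing import Set, Tuple

Edge = Tuple[int, int]


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Naive fixpoint: reach(a, d) holds if reach(a, b) and reach(b, d) hold."""
    reach = set(edges)
    while True:
        derived = {(a, d) for (a, b) in reach for (c, d) in reach if b == c}
        new_facts = derived - reach
        if not new_facts:
            return reach  # fixpoint reached: another round would add nothing
        reach |= new_facts
```

The loop has no fixed iteration count; it stops when the result stops changing. That termination condition is what distinguishes this execution model from an ordinary relational plan.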

Why these questions matter

These questions determine most of the major architecture choices:

  • row-oriented vs columnar layout
  • iterator vs vectorized execution
  • local vs distributed execution
  • exact vs approximate search
  • relational vs rule-based planning

If those answers are unclear, architecture discussions tend to stay vague.


Practical use

This note is best used as a design checklist.

If a team can answer these questions cleanly, the likely engine shape becomes much easier to see.

If it cannot, the project is probably still mixing several different engine ideas.


Changelog

  • April 2, 2026 -- First version created.