# Query Engine Design Questions

A checklist note for thinking about what kind of query engine to build.

---

## Short answer

Before building a query engine, it helps to force explicit answers to a small set of design questions. Most architecture disagreements are really disagreements about workload, execution granularity, storage boundary, or correctness model.

---

## Core questions

### Workload

- Is the workload mostly transactional, analytical, streaming, search, or logical inference?
- Are queries mostly point lookups, scans, aggregations, recursive rules, or top-k retrieval?

### Data model

- Is the data relational, document-oriented, graph-shaped, vector-based, or rule/fact based?
- What is the engine's core internal representation?

### Unit of execution

- Does the engine run row-at-a-time, batch-at-a-time, or over fully materialized relations?
- Are blocking operators common?

### Storage boundary

- Is execution tightly coupled to one storage engine?
- Or is there a source interface with pushdown capabilities?

### Indexes

- What access patterns deserve dedicated indexes?
- Are indexes exact, approximate, ordered, inverted, or vector-oriented?

### Optimization

- Is the optimizer mostly rule-based or cost-based?
- What statistics are available?

### Distribution

- Is one machine enough?
- If not, where are exchange boundaries, partitioning choices, and failure handling defined?

### Semantics

- Is the system exact, approximate, eventually consistent, ranked, or fixpoint-based?
- Does it support recursion, inference, or witness generation?

---

## Why these questions matter

These questions determine most of the major architecture choices:

- row vs column
- iterator vs vectorized execution
- local vs distributed execution
- exact vs approximate search
- relational vs rule-based planning

If those answers are unclear, architecture discussions tend to stay vague.

---

## Practical use

This note is best used as a design checklist.
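
One way to make the checklist concrete is to record one answer per heading and list the headings that are still open. The sketch below is only an illustration: `DesignAnswers`, its field names, and `unanswered` are hypothetical names chosen to mirror the headings in this note, not part of any existing tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DesignAnswers:
    """One optional free-text answer per checklist heading (hypothetical names)."""
    workload: Optional[str] = None
    data_model: Optional[str] = None
    unit_of_execution: Optional[str] = None
    storage_boundary: Optional[str] = None
    indexes: Optional[str] = None
    optimization: Optional[str] = None
    distribution: Optional[str] = None
    semantics: Optional[str] = None


def unanswered(answers: DesignAnswers) -> List[str]:
    """Return the checklist headings that have no answer yet, in checklist order."""
    return [f.name for f in fields(answers) if not getattr(answers, f.name)]


# Example draft: only two of the eight headings have been answered so far.
draft = DesignAnswers(
    workload="analytical: scans and aggregations",
    unit_of_execution="batch-at-a-time",
)
print(unanswered(draft))
```

For the draft above, six headings remain open, which suggests the design discussion is not yet ready to settle on an engine shape.
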
If a team can answer these questions cleanly, the likely engine shape becomes much easier to see. If it cannot, the project is probably still mixing together several different engine ideas.

---

## Changelog

* **April 2, 2026** -- First version created.