Add note files about query engines for NoSQL and design questions

2026-04-02 09:15:43 +02:00 · d5bbc4886d
commit d5bbc4886d
parent 40ccf7ae69
2 changed files with 165 additions and 0 deletions
--- a/hqew/012-query-engines-for-non-sql-databases.md
+++ b/hqew/012-query-engines-for-non-sql-databases.md
@@ -0,0 +1,80 @@
+# Query Engines for Non-SQL Databases
+
+A reference for why query engines are broader than SQL.
+
+---
+
+## Short answer
+
+Yes, a query engine can be built for non-SQL databases.
+
+SQL is only one possible query language. The broader pattern is:
+
+- a data model exists
+- users need a declarative or structured way to ask for data
+- the system needs planning and execution machinery to answer those requests
+
+So query engines are not inherently relational.
+
+---
+
+## What changes
+
+What changes is the underlying algebra and operator set.
+
+For example:
+
+- relational engines center on tables, joins, filters, and aggregates
+- graph engines center on nodes, edges, traversals, and pattern matching
+- document engines center on nested objects, arrays, and field-path predicates
+- rule engines center on facts, unification, recursion, and fixpoint evaluation
+
+The architecture may still look familiar, but the internal operators differ.
+
+---
+
+## Examples
+
+You can meaningfully talk about query engines for:
+
+- document databases such as MongoDB
+- search systems such as Lucene or Vespa
+- vector databases such as Qdrant or Weaviate
+- graph databases
+- Datalog or rule engines
+
+What makes them query engines is not SQL syntax. It is that they accept structured requests and execute them efficiently over some data model.
+
+---
+
+## What stays the same
+
+Even outside SQL, many systems still have:
+
+1. a query language or API
+2. an internal representation of the request
+3. optimization or rewrite steps
+4. execution against indexes or stored data
+
+So the concept generalizes cleanly beyond relational databases.
+
+---
+
+## Practical mental model
+
+SQL engines plan and optimize queries expressed in relational algebra.
+
+Non-SQL engines do the same for some other access model:
+
+- graph traversal
+- text retrieval
+- vector similarity
+- logical derivation
+
+That is the main difference.
+
+---
+
+## Changelog
+
+* **April 2, 2026** -- First version created.
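
The four stages listed under "What stays the same" in the note above can be sketched for a document-style data model. This is a minimal illustration, not any real database's API: `DOCS`, `CITY_INDEX`, `Eq`, `parse`, `plan`, `get_path`, and `execute` are hypothetical names, and the single rewrite rule stands in for a real optimizer.

```python
from dataclasses import dataclass

# Hypothetical in-memory collection of nested documents (assumption for the sketch).
DOCS = [
    {"id": 1, "user": {"city": "Oslo"}, "tags": ["a", "b"]},
    {"id": 2, "user": {"city": "Rome"}, "tags": ["b"]},
    {"id": 3, "user": {"city": "Oslo"}, "tags": ["c"]},
]

# A secondary index on one field path: city -> positions in DOCS.
CITY_INDEX = {}
for pos, doc in enumerate(DOCS):
    CITY_INDEX.setdefault(doc["user"]["city"], []).append(pos)


@dataclass(frozen=True)
class Eq:
    """IR node: the value at a dotted field path equals a constant."""
    path: str
    value: object


def parse(request):
    """Stages 1-2: turn the API-level request into a list of IR predicates."""
    return [Eq(path, value) for path, value in request.items()]


def plan(predicates):
    """Stage 3: one rewrite rule that moves an indexable predicate out of
    the residual filter so it can be answered by an index lookup."""
    for p in predicates:
        if p.path == "user.city":
            return p, [q for q in predicates if q is not p]
    return None, predicates


def get_path(doc, path):
    """Follow a dotted path through nested dicts; None if a key is missing."""
    for key in path.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def execute(request):
    """Stage 4: run the plan against the index, or fall back to a full scan."""
    index_pred, residual = plan(parse(request))
    if index_pred is not None:
        candidates = [DOCS[i] for i in CITY_INDEX.get(index_pred.value, [])]
    else:
        candidates = DOCS
    return [d for d in candidates
            if all(get_path(d, p.path) == p.value for p in residual)]
```

Calling `execute({"user.city": "Oslo", "id": 3})` uses the index for the city predicate and applies the `id` predicate as a residual filter. Nothing here is relational: the predicates are field-path comparisons over nested objects, yet the request, IR, rewrite, and execution stages are all present.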
--- a/hqew/013-query-engine-design-questions.md
+++ b/hqew/013-query-engine-design-questions.md
@@ -0,0 +1,85 @@
+# Query Engine Design Questions
+
+A checklist note for thinking about what kind of query engine to build.
+
+---
+
+## Short answer
+
+Before building a query engine, it helps to force explicit answers to a small set of design questions.
+
+Many architecture disagreements turn out to be disagreements about workload, execution granularity, storage boundary, or correctness model.
+
+---
+
+## Core questions
+
+### Workload
+
+- Is the workload mostly transactional, analytical, streaming, search, or logical inference?
+- Are queries mostly point lookups, scans, aggregations, recursive rules, or top-k retrieval?
+
+### Data model
+
+- Is the data relational, document-oriented, graph-shaped, vector-based, or rule/fact based?
+- What is the engine's core internal representation?
+
+### Unit of execution
+
+- Does the engine run row-at-a-time, batch-at-a-time, or over fully materialized relations?
+- Are blocking operators, such as sorts and hash aggregations, common?
+
+### Storage boundary
+
+- Is execution tightly coupled to one storage engine?
+- Or is there a source interface that lets filters and projections be pushed down to the storage layer?
+
+### Indexes
+
+- What access patterns deserve dedicated indexes?
+- Are indexes exact, approximate, ordered, inverted, or vector-oriented?
+
+### Optimization
+
+- Is the optimizer mostly rule-based or cost-based?
+- What statistics are available?
+
+### Distribution
+
+- Is one machine enough?
+- If not, where are exchange boundaries, partitioning choices, and failure handling defined?
+
+### Semantics
+
+- Are results exact, approximate, eventually consistent, ranked, or fixpoint-based?
+- Does it support recursion, inference, or witness generation?
+
+---
+
+## Why these questions matter
+
+These questions determine most of the major architecture choices:
+
+- row-oriented vs columnar layout
+- iterator vs vectorized execution
+- local vs distributed execution
+- exact vs approximate search
+- relational vs rule-based planning
+
+If those answers are unclear, architecture discussions tend to stay vague.
+
+---
+
+## Practical use
+
+This note is best used as a design checklist.
+
+If a team can answer these questions cleanly, the likely engine shape becomes much easier to see.
+
+If it cannot, the project is probably still mixing together several different engine ideas.
+
+---
+
+## Changelog
+
+* **April 2, 2026** -- First version created.
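
The "Unit of execution" question in the note above can be made concrete with a small sketch that runs the same filter-and-aggregate pipeline row-at-a-time and batch-at-a-time. All names here (`ROWS`, `scan_rows`, `filter_rows`, `scan_batches`, `filter_batches`, and the two `total_revenue_*` functions) are hypothetical, and plain Python lists stand in for the columnar batches a real vectorized engine would use.

```python
from itertools import islice

# Hypothetical input table (assumption for the sketch).
ROWS = [{"price": p, "qty": q} for p, q in [(5, 1), (20, 2), (30, 1), (8, 4)]]


def scan_rows(rows):
    """Row-at-a-time source: hands one row to its consumer per step."""
    yield from rows


def filter_rows(child, min_price):
    """Row-at-a-time filter: pulls a single row from its child each step."""
    for row in child:
        if row["price"] >= min_price:
            yield row


def scan_batches(rows, batch_size):
    """Batch-at-a-time source: yields lists of up to batch_size rows."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield batch


def filter_batches(child, min_price):
    """Batch-at-a-time filter: one step processes a whole batch."""
    for batch in child:
        kept = [row for row in batch if row["price"] >= min_price]
        if kept:
            yield kept


def total_revenue_rows(rows, min_price):
    """Blocking aggregate over the row pipeline: no result until input ends."""
    return sum(r["price"] * r["qty"]
               for r in filter_rows(scan_rows(rows), min_price))


def total_revenue_batches(rows, min_price, batch_size=2):
    """The same aggregate over the batch pipeline."""
    return sum(r["price"] * r["qty"]
               for batch in filter_batches(scan_batches(rows, batch_size), min_price)
               for r in batch)
```

Both pipelines return the same total for the same input; what differs is how many operator-to-operator handoffs occur. The aggregate at the end is a blocking operator in either style, since it cannot emit anything until its input is exhausted.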