diff --git a/notes/backend/01-cozo-and-lmdb-findings.md b/notes/backend/01-cozo-and-lmdb-findings.md
new file mode 100644
index 0000000..9f5ad62
--- /dev/null
+++ b/notes/backend/01-cozo-and-lmdb-findings.md
@@ -0,0 +1,135 @@
+## Cozo and LMDB Findings
+
+Sources inspected: the Cozo source tree at `github.com/cozodb/cozo`, the LMDB source tree at `github.com/LMDB/lmdb`, and the `heed` Rust binding at `github.com/meilisearch/heed`.
+File paths in this note are relative to the root of the named project's source tree.
+The aim was to understand how a working Datalog engine (Cozo) implements joins and what a low-level key-value substrate (LMDB) provides that makes those joins cheap.
+This note summarizes the design lessons and the practical implications for the `query-ops` crate in this playground.
+
+### Summary
+
+Cozo is an embedded Datalog database written in Rust.
+It does not have a separate semijoin operator.
+Instead, it has one inner-join operator that picks between two strategies based on how each relation is stored: an index-nested-loop strategy that uses ordered range scans over the substrate, and a fallback that materializes one side into a sorted vector and probes it.
+Semijoin behavior, when needed, emerges from a separate rewrite step called the magic-sets transformation, which converts semijoin-shaped pruning into regular inner joins against derived relations.
+
+LMDB is a memory-mapped, ordered key-value store with a B+ tree on disk.
+It exposes a small set of cursor primitives that support prefix iteration, range iteration, and exact-key lookup.
+These primitives are exactly what an index-nested-loop join needs: seek to a key prefix, then iterate forward while the prefix matches.
+
+The combined lesson is that a good join does not require a clever operator.
+It requires the relation to be stored with the join columns at the front of the key, so that the substrate's ordered iteration can do the join itself.
+
+### Cozo
+
+#### What It Is
+
+Cozo is a Datalog database with multiple swappable storage backends, including an in-memory store, SQLite, RocksDB, sled, and TiKV.
+The execution engine speaks a single narrow storage trait whose surface is essentially `get`, `put`, `range_iter`, and `prefix_iter` over byte keys.
+Each backend implements that trait.
+The trait definition lives at `cozo-core/src/storage/mod.rs` in the Cozo source tree.
+
+#### Join Behavior
+
+The relational algebra at `cozo-core/src/query/ra.rs` in the Cozo source tree defines a single join operator named `InnerJoin`.
+At execution time it chooses between two strategies based on a check called `join_is_prefix`:
+
+- prefix join: for each tuple from the left side, the engine builds a byte prefix from the join columns and calls `prefix_iter` on the right relation.
+  The substrate yields all matching tuples in key order.
+  No hash table is built.
+  This path is taken whenever the right side's join columns are stored as the prefix of its key.
+- materialized join: used when the join columns are not a key prefix.
+  The right side is read fully into a sorted, deduplicated vector, reordered so the join columns come first, then walked with a `starts_with(prefix)` check.
+  This is the build-and-probe family, but with a sorted vector instead of a hash map.
+
+The choice is made entirely on whether the join columns sit at the front of the stored key.
+
+#### No Semijoin Operator
+
+A search of the Cozo source for `semijoin` or `semi_join` returns nothing.
+Semijoin behavior comes from the magic-sets transformation at `cozo-core/src/query/magic.rs` in the Cozo source tree.
+This pass rewrites each rule so that body atoms get joined against an auxiliary "magic" relation whose contents encode the binding patterns supplied by the rule's callers.
+The net effect is the same as semijoining body atoms against caller-supplied filters, but the implementation is a logical rewrite, not a runtime operator.
+
+#### No Auto-Maintained Secondary Indexes
+
+Cozo does not maintain secondary indexes automatically.
+If you want to query a relation by a column order different from how it was declared, you declare a second relation with the columns reordered and keep its contents synchronized at insert time.
+A covering index is just another stored relation.
+The decision of which column order to store comes from how you expect to query the data, not from the engine.
+
+### LMDB
+
+#### What It Is
+
+LMDB is a single-file, memory-mapped, ordered key-value store.
+It uses a B+ tree on disk and exposes reads as zero-copy byte slices that point directly into the mmap.
+It supports a single writer at a time and many concurrent readers, and it uses shadow paging for MVCC, which means commits are atomic without a write-ahead log.
+
+#### Cursor Primitives
+
+A cursor in LMDB is a position inside the B+ tree.
+The full set of cursor operations is defined by the `MDB_cursor_op` enum in `libraries/liblmdb/lmdb.h` in the LMDB source tree.
+The operations relevant to join work are:
+
+- `MDB_SET_RANGE`: position at the first key greater than or equal to a given key.
+  This is the seek primitive that makes prefix scans possible.
+- `MDB_NEXT`: advance one step forward in key order.
+  Combined with `MDB_SET_RANGE` and a per-step prefix check, this gives you ordered range iteration.
+- `MDB_SET` and `MDB_SET_KEY`: exact-key positioning, used for point lookups.
+- `MDB_FIRST` and `MDB_LAST`: positional endpoints.
+
+For databases opened with the `MDB_DUPSORT` flag, one key can carry multiple sorted values, and additional operations apply: `MDB_GET_BOTH`, `MDB_NEXT_DUP`, `MDB_FIRST_DUP`.
+This is useful when a relation is encoded as "key = join columns, duplicate values = remaining columns": LMDB keeps the duplicates of each key sorted, so they act as an ordered index of the remaining columns under that join key.
+
+#### Rust Binding
+
+`heed` is the idiomatic Rust binding for LMDB.
+It wraps the cursor operations as `RoCursor` and `RwCursor` and returns key and value byte slices tied to the transaction lifetime, so reads remain zero-copy.
+Meilisearch uses `heed` in production, so the binding is well exercised.
+
+### LMDB Versus RocksDB
+
+Both LMDB and RocksDB are ordered key-value stores with prefix and range scans, but their internal designs lead to different operational profiles.
+
+LMDB highlights:
+
+- B+ tree on disk, memory mapped
+- Single writer at a time, many concurrent readers
+- Zero-copy reads from the mmap
+- Copy-on-write pages, not append-only; pages freed by deletes and overwrites are tracked and reused by later writes
+- File size grows up to a configured `mapsize`
+- No background compaction
+- Manual compaction with `mdb_copy -c`, which writes a compacted copy of the environment
+
+RocksDB highlights:
+
+- Log-structured merge tree
+- Multiple concurrent writers
+- Background compaction
+- Higher write throughput at the cost of write amplification
+- Reads may traverse multiple levels with bloom-filter checks
+- Engine manages its own disk layout
+
+For a read-heavy prototype with batch inserts, LMDB is the closer fit: predictable read costs, cheap range scans, and zero-copy probes.
+RocksDB earns its overhead when sustained write throughput is the bottleneck.
+
+### Practical Implications
+
+The current `query-ops` crate works on in-memory `Vec<Row>` values and will implement semijoin and natural join with a transient hash on one side.
+The Cozo design suggests a clear upgrade path once a real substrate is added.
+
+Short term: keep the in-memory operator and build a transient hash on the smaller side.
+This is correct, easy to test, and easy to reason about.
+
+Medium term: when relations move into a substrate like LMDB, encode each relation so that the join columns sit at the prefix of the key, or use a `DUPSORT` database where the duplicate values carry the remaining columns.
+At that point the join operator becomes a cursor pattern (`MDB_SET_RANGE` followed by `MDB_NEXT` while the prefix matches), and the separate hash-building step disappears.
+
+Index discipline: if a relation needs to be joined two different ways, store it twice with different prefix orders.
+There is no clever-indexing shortcut in either Cozo or LMDB, and trying to invent one is unlikely to be worth the cost.
+
+The takeaway is that the operator surface in `query-ops` is fine for an in-memory prototype, but the substrate decision is the load-bearing one for performance.
+We do not need to design around it now, but the natural successor to the current operators is a key-encoding discipline rather than a more elaborate operator implementation.
+
+### Changelog
+
+- **June 2, 2026** -- Initial version of this document.