From b38b176e7f33a1660dd204a7441b73e39ee231b0 Mon Sep 17 00:00:00 2001
From: Hassan Abedi <cogitator.tech@gmail.com>
Date: Fri, 5 Jun 2026 13:51:26 +0200
Subject: [PATCH] Add a few emojis to README files

---
 AGENTS.md                    | 108 +++++++++++------------------------
 README.md                    |   6 +-
 crates/plan-runner/README.md |   5 ++
 crates/storage/README.md     |   2 +-
 tools/exporter/README.md     |   7 ++-
 5 files changed, 47 insertions(+), 81 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 26b91bf..ea29a6c 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -4,15 +4,15 @@ This file provides guidance to coding agents collaborating on this repository.
 
 ## Mission
 
-`storage-engine-playground` is an experimental Rust project for testing ideas from the FlowLog, DBSP, CRDT-as-query, and Geomerge notes.
+`storage-engine-playground` is an experimental Rust project for prototyping query engines and storage engines.
 
 The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions:
 
-- how Datalog-like rules should be parsed, cataloged, planned, and optimized
-- how FlowLog-style planning ideas transfer to a DBSP-oriented frontend
-- how CRDT queries behave under naive plans versus planned relational execution
-- how Geomerge-style laws can compile into maintained violation relations
-- how backend behavior changes across snapshot, DBSP-like, and Differential Dataflow-like execution models
+- how a query language should be parsed, cataloged, planned, and optimized
+- how a query planner and a query executor should be separated, and what intermediate representation sits between them
+- how a query executor's operators (scans, joins, antijoins, projections) compose into a working snapshot evaluator
+- how a storage engine should expose a backend-neutral interface (relations, rows, transactions, scans), and how that interface holds up across
+  different backends (in-process, file-backed, CRDT, and so on)
 
 Priorities, in order:
 
@@ -44,19 +44,23 @@ Priorities, in order:
 
 ## Repository Layout
 
-The repository is new and may change. Discover the current layout from the filesystem before editing.
+Discover the current layout from the filesystem before editing.
+The shape today is:
 
-Expected durable areas may include:
+- `crates/`: Rust workspace.
+  See [`crates/README.md`](crates/README.md) for the responsibilities and dependency edges between the four crates (`storage`, `query-ops`,
+  `plan-runner`, `geomerge-demo`).
+  Each crate keeps its own `src/`, `tests/`, and (where relevant) `fixtures/`, `benches/`, and `docs/diagrams/` subdirectories.
+- `tools/exporter/`: Haskell tool that consumes hand-authored `.scenario.json` files in `tools/exporter/examples/` and emits the runner-IR JSON
+  consumed by `crates/plan-runner`.
+  See [`tools/exporter/README.md`](tools/exporter/README.md).
+- `external/`: git submodules.
+  `external/geolog` provides the Haskell query planner used by the exporter; `external/geomerge` is the Rust CRDT crate consumed by
+  `storage::adapters::geomerge`.
+- Top-level configuration: `Makefile`, `flake.nix`, `Cargo.toml` (workspace), `pyproject.toml`, `.pre-commit-config.yaml`, `rust-toolchain.toml`.
 
-- `src/`: Rust source for parser, catalog, planner, execution experiments, and storage prototypes.
-- `tests/`: integration tests for rule planning, evaluation, and storage behavior.
-- `tools/exporter/examples/`: hand-authored scenario JSON consumed by the Haskell exporter to produce runner fixtures.
-- `fixtures/`: committed input facts and expected outputs.
-- `notes/`: local design notes that belong to this project.
-- `flowlog/`: project-local notes or sketches derived from the FlowLog line of work.
-
-Do not assume this list is exhaustive. If the project grows a different structure, follow the actual codebase and update this file when conventions
-stabilize.
+Do not assume this list is exhaustive.
+If the project grows a different structure, follow the actual codebase and update this file when conventions stabilize.
 
 ## Technical Direction
 
@@ -70,15 +74,15 @@ Datalog-like rules or Geolog-shaped laws
 -> relational plan
 -> FlowLog-style optimization
 -> backend lowering
--> maintained or snapshot outputs
+-> snapshot outputs
 ```
 
 Keep these layers explicit:
 
-- **Source Layer**: Datalog-like test programs, CRDT query definitions, and Geomerge-style laws.
+- **Source Layer**: Datalog-like test programs and Geomerge-style laws.
 - **Catalog Layer**: rule heads, body atoms, variables, constants, comparisons, negation, and projections.
 - **Planning Layer**: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice.
-- **Execution Layer**: snapshot evaluator first, then DBSP-like or Differential Dataflow-like experiments.
+- **Execution Layer**: snapshot evaluator.
 - **Storage Layer**: facts, transactions, rollback, preview state, and violation output integration.
 
 ## FlowLog-Inspired Planning
@@ -106,60 +110,6 @@ rule with three positive atoms
 -> expected textual plan
 ```
 
-## DBSP and Incremental Execution
-
-DBSP-related work should preserve a clean boundary:
-
-```text
-planned relational IR
--> DBSP lowering
--> maintained output deltas
-```
-
-Do not make DBSP responsible for source-language semantics. The frontend should check supported syntax, stratification, and rule shape before backend
-lowering.
-
-For each DBSP-like experiment, also provide a snapshot oracle when feasible:
-
-```text
-snapshot result == maintained result after each update
-```
-
-Track these measurements when relevant:
-
-- hydration time
-- warm-update time
-- output delta size
-- maintained state size if available
-- sensitivity to join order
-- sensitivity to causal-history depth
-
-## CRDT Query Experiments
-
-Initial CRDT workloads should stay small and explicit:
-
-- multi-value register
-- causal readiness over `pred`
-- list next-element traversal
-- tombstone skipping
-
-Use operation facts shaped like:
-
-```text
-set(replica_id, counter, key, value)
-pred(from_replica_id, from_counter, to_replica_id, to_counter)
-insert(replica_id, counter, parent_replica_id, parent_counter, value)
-remove(replica_id, counter)
-```
-
-Important questions:
-
-- Does the query require recursion, negation, or both?
-- Can antijoins run earlier?
-- Can causal readiness be maintained from a frontier?
-- Does warm-update cost depend on history depth?
-- Does the output need integration into a current view?
-
 ## Geomerge-Style Validation Experiments
 
 The first Geomerge-style target is maintained violation detection for supported relational laws.
@@ -219,8 +169,7 @@ Recommended test groups:
 - antijoin scheduling
 - SIP-style filtering
 - snapshot evaluation
-- maintained-output equivalence
-- CRDT fixtures
+- storage-backend adapter parity (in-process, file-backed, and CRDT)
 - Geomerge-style violation fixtures
 
 Tests should prefer small facts with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented.
@@ -239,6 +188,13 @@ For Rust changes, prefer:
 These map to `cargo fmt --all --check`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-targets --all-features`.
 If the project does not yet have a `Cargo.toml`, `make check` should still pass by skipping Rust-specific checks.
 
+For changes that touch the cross-language pipeline (Haskell exporter and Rust runner), also run:
+
+1. `make export-fixtures`: rebuilds `crates/plan-runner/fixtures/*.json` from `tools/exporter/examples/*.scenario.json` using the Haskell exporter.
+   Requires the Nix dev shell (`make shell` or `nix develop`) so GHC and Cabal are available.
+2. `make examples`: runs `export-fixtures` and then `cargo test -p plan-runner --test examples`, which walks every regenerated fixture and verifies it
+   against its `expected_bindings` oracle.
+
 For Markdown-only changes, run a manual read-through and check that headings follow the writing style.
 
 ## Change Design Checklist
diff --git a/README.md b/README.md
index 3ff6bb4..1895ac1 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,13 @@
 ## Storage Engine Playground
 
-This repo is a playground for running small experiments related to storage side of things.
+This repo is a playground for running small experiments related to storage and query execution.
 
 ### Development
 
+> ⚠️ Clone with `--recursive`.
+> The repo pulls `external/geolog` and `external/geomerge` as git submodules;
+> a non-recursive clone leaves those directories empty and breaks the build.
+
 ```sh
 # Clone the repo with submodules
 git clone --recursive git@code.obsidian.systems:habedi-work/storage-engine-playground.git
diff --git a/crates/plan-runner/README.md b/crates/plan-runner/README.md
index 5c9c387..f0f4ae3 100644
--- a/crates/plan-runner/README.md
+++ b/crates/plan-runner/README.md
@@ -48,6 +48,11 @@ via `build_tables_via_storage`, then scans tables back out before executing.
 | `sqlite`         | `SqliteStorage`   | fresh tempdir per run |
 | `geomerge`       | `GeomergeStorage` | in-process            |
 
+> ⚠️ `--backend geomerge` requires a typed theory upfront, but the runner IR is untyped.
+> The CLI infers column types (`PrimInt` or `PrimString`) from the first fact row per relation;
+> relations with no facts default to `PrimString`.
+> This works for every current fixture, but future fixtures with mixed-type columns may fail at insert time.
+
 ### Execute a Query Plan
 
 ```sh
diff --git a/crates/storage/README.md b/crates/storage/README.md
index c968acf..41f7092 100644
--- a/crates/storage/README.md
+++ b/crates/storage/README.md
@@ -106,7 +106,7 @@ cargo test -p storage --all-features
 - **Deletion support.**
   Most adapters implement `delete`.
   The `geomerge` adapter does not: its append-only commit log returns `StorageError::Unsupported("row deletion")`.
-- **Geomerge is alpha.**
+- ⚠️ **Geomerge is alpha.**
   The upstream `geomerge` crate is prototype-status and its API can change without notice; treat breakage in `adapters::geomerge` as expected churn
   rather than regression.
 - **Feature gates.**
diff --git a/tools/exporter/README.md b/tools/exporter/README.md
index 1dccef0..a320b1a 100644
--- a/tools/exporter/README.md
+++ b/tools/exporter/README.md
@@ -26,9 +26,10 @@ tools/exporter/
 
 ### Run It
 
-The exporter needs GHC 9.12 and Cabal.
-The repository's Nix dev shell provides both;
-enter it with `make shell` (or `nix develop`) before running the commands below.
+> ⚠️ The exporter needs GHC 9.12 and Cabal.
+> The repository's Nix dev shell provides both;
+> enter it with `make shell` (or `nix develop`) before running the commands below.
+> A system GHC older than 9.12 will fail to compile geolog-lang's `GHC2024` modules.
 
 ```sh
 # Build the executable: