## Plan Runner

This crate is a snapshot executor for conjunctive-query plans. It reads a JSON plan (a DAG of scan and join nodes plus the input facts), walks the DAG using the operators from [`query-ops`](../query-ops), and prints the binding relation produced at the root node.

The wire format mirrors `Geolog.DB.Plan.PlanGraph` from the [`geolog`](../../external/geolog) submodule, but the JSON shape is the contract: any frontend that emits this format can drive the runner. The mapping from `PlanEvalAtom` / `PlanJoin` to `scan_atom` / `semijoin` / `natural_join`, and the full IR spec, are documented as module-level rustdoc in [`src/lib.rs`](src/lib.rs).

### Pipeline

End-to-end, scenarios become runner output through three stages:

```text
tools/exporter/examples/*.scenario.json
  └── (Haskell exporter; runs Geolog.DB.Plan.planConjunction and
       Geolog.DB.InMemory.evalConjunctionPlanned as a self-check)
        └── crates/plan-runner/fixtures/*.json   (JSON IR; checked in)
              └── (plan-runner; this crate)
                    └── stdout JSON, with row-for-row oracle check
```

The exporter (`tools/exporter`) is the only producer of runner IR today; it's where atoms are planned and rejected if they don't fit the supported subset. Fixtures are regenerated with `make export-fixtures`, and the full loop is `make examples`.

### Backends

The CLI takes a `--backend` flag. The `memory` backend is the pure in-memory path; every other backend routes facts through the [`Storage`](../storage) trait via `build_tables_via_storage`, then scans tables back out before executing.


| Backend          | Storage                                        | Location              |
|------------------|------------------------------------------------|-----------------------|
| `memory`         | none (direct from `plan.facts`)                | n/a                   |
| `memory-storage` | `MemoryStorage`                                | in-process            |
| `lmdb`           | `LmdbStorage` (heed-backed mmap B-tree)        | fresh tempdir per run |
| `redb`           | `RedbStorage` (single-file B-tree)             | fresh tempdir per run |
| `fjall`          | `FjallStorage` (LSM tree)                      | fresh tempdir per run |
| `sqlite`         | `SqliteStorage` (rusqlite, bundled libsqlite3) | fresh tempdir per run |
| `geomerge`       | `GeomergeStorage` (CRDT; alpha)                | in-process            |

All seven produce byte-identical output for every checked-in fixture. The point of the abstraction is not performance comparison (the snapshot evaluator is bulk-materialized either way), but to validate that the storage layer is genuinely backend-neutral and that adding a new adapter is a constructor swap.

Note on `geomerge`: the runner's JSON IR is untyped (only arity per relation), but geomerge requires a typed theory upfront. The CLI infers column types from the first fact row per relation and synthesizes a theory of `PrimInt` and `PrimString` columns via [`GeomergeStorage::with_relations`](../storage/src/adapters/geomerge.rs). Columns with no sample facts default to `PrimString`.

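The inference rule in the `geomerge` note can be sketched roughly as follows. This is an illustration under stated assumptions, not the CLI's actual code: `Value`, `PrimType`, and `infer_column_types` are hypothetical names introduced here, and the real types accepted by `GeomergeStorage::with_relations` may look different.

```rust
/// Hypothetical stand-in for one cell of a fact row; the runner's real
/// value type may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

/// Hypothetical mirror of the two primitive column types named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrimType {
    PrimInt,
    PrimString,
}

/// Infer one type per column from the first fact row only.
/// With no sample rows, every column falls back to `PrimString`.
pub fn infer_column_types(arity: usize, rows: &[Vec<Value>]) -> Vec<PrimType> {
    match rows.first() {
        Some(first) => (0..arity)
            .map(|i| match first.get(i) {
                Some(Value::Int(_)) => PrimType::PrimInt,
                // Strings, and any position missing from the sample row,
                // are treated as `PrimString` in this sketch.
                _ => PrimType::PrimString,
            })
            .collect(),
        None => vec![PrimType::PrimString; arity],
    }
}
```

Because only the first row is sampled, a relation whose later rows disagree with the first would be typed by the first row alone in this sketch.
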
### Run It

```sh
# Run one fixture through the default in-memory path:
cargo run -p plan-runner -- crates/plan-runner/fixtures/two_atom_join.json

# Same plan, routed through different backends:
cargo run -p plan-runner -- --backend memory-storage crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend lmdb crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend redb crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend fjall crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend sqlite crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend geomerge crates/plan-runner/fixtures/two_atom_join.json

# Regenerate every fixture from its scenario and run the oracle test:
make examples
```

A sample run:

```sh
$ plan-run crates/plan-runner/fixtures/two_atom_join.json
{"columns":["a","b","_w0_2"],"rows":[["node:1","node:2","edge:1"],["node:2","node:1","edge:2"]]}
```

The `_w_` columns are wildcards the exporter named so the runner can bind them. The scenario's `expected_bindings` block names only the variables the test cares about, and `verify` projects the runner output to that subset before comparing as a multiset.

### Run the Tests

```sh
cargo test -p plan-runner
```

The two integration test files exercise complementary properties:

- `tests/examples.rs` walks every fixture and checks it against its `expected_bindings` oracle.
- `tests/storage_roundtrip.rs` cross-checks the pure path against the storage-backed path, to keep `build_tables` and `build_tables_via_storage` in lockstep.

### Notes

- **IR contract.** The runner is backend-agnostic and frontend-agnostic: it consumes JSON in the shape documented in `src/lib.rs` and produces a binding relation. Anything that emits the same JSON can drive it.
- **No optimizer.** Plans are executed as written.
  Node ordering, join shape, and antijoin scheduling are all the producer's responsibility. This crate's job ends at faithful execution of the IR.
- **Wildcard columns survive.** `scan_atom` keeps every distinct variable that appears in the pattern, including the exporter's synthetic `_w_` names. The runner does not project them out; oracle verification handles that on the comparison side.
- **Bulk, not streaming.** Each node materializes its full output as a `Relation`. This matches `query-ops`' execution model; it's not designed for incremental or maintained-view workloads.
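
The comparison-side handling of wildcard columns (project to the expected variables, then compare as a multiset) might look something like the sketch below. It is a minimal illustration, not the crate's actual `verify`: the `BindingRelation` struct, the `project` and `multiset` helpers, and the signature of `verify` are all assumptions made for this example, and cells are simplified to strings.

```rust
use std::collections::HashMap;

/// Hypothetical simplified output shape: named columns plus string rows.
/// This is not the `Relation` type from `query-ops`.
pub struct BindingRelation {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<String>>,
}

/// Keep only the named columns, in the order given.
/// Returns `None` if a requested column is missing from the output.
pub fn project(rel: &BindingRelation, keep: &[&str]) -> Option<Vec<Vec<String>>> {
    let idx: Vec<usize> = keep
        .iter()
        .map(|name| rel.columns.iter().position(|c| c.as_str() == *name))
        .collect::<Option<Vec<_>>>()?;
    Some(
        rel.rows
            .iter()
            .map(|row| idx.iter().map(|&i| row[i].clone()).collect())
            .collect(),
    )
}

/// Count how many times each row occurs, so that row order is ignored
/// but duplicates still matter.
fn multiset(rows: &[Vec<String>]) -> HashMap<&Vec<String>, usize> {
    let mut counts = HashMap::new();
    for row in rows {
        *counts.entry(row).or_insert(0) += 1;
    }
    counts
}

/// Project the output to the expected columns, then compare as multisets.
pub fn verify(
    output: &BindingRelation,
    expected_columns: &[&str],
    expected_rows: &[Vec<String>],
) -> bool {
    match project(output, expected_columns) {
        Some(projected) => multiset(&projected) == multiset(expected_rows),
        None => false,
    }
}
```

Under this sketch, an oracle that names only `a` and `b` would accept the sample output shown under "Run It" regardless of row order, since the `_w0_2` column is dropped before comparison.
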