Plan Runner
This crate is a snapshot executor for conjunctive-query plans.
It reads a JSON plan (a DAG of scan and join nodes plus the input facts),
walks the DAG using the operators from query-ops,
and prints the binding relation produced at the root node.
The wire format mirrors Geolog.DB.Plan.PlanGraph from the
geolog submodule, but the JSON shape is the contract:
any frontend that emits this format can drive the runner.
The mapping from PlanEvalAtom / PlanJoin to scan_atom / semijoin / natural_join,
and the full IR spec, are documented as module-level rustdoc in
src/lib.rs.
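To make the "walk the DAG" step concrete, here is a minimal sketch of a snapshot evaluator over scan and join nodes. It is not the crate's code: the `Node` enum, the `Relation` struct, and the `run` function are hypothetical stand-ins, values are simplified to strings, scans assume every position carries a distinct variable, and nodes are assumed to arrive in topological order with the root last. The real IR and operator signatures are the ones documented in src/lib.rs and query-ops.

```rust
use std::collections::HashMap;

/// A binding relation: named columns plus rows of string values (simplified).
#[derive(Debug, Clone, PartialEq)]
pub struct Relation {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<String>>,
}

/// Hypothetical plan node; the real IR is specified in src/lib.rs.
pub enum Node {
    /// Scan a stored relation, naming each position with a variable.
    Scan { relation: String, vars: Vec<String> },
    /// Natural join of two earlier nodes, referenced by index.
    Join { left: usize, right: usize },
}

/// Natural join (sketch): match rows on every column name both sides share.
pub fn natural_join(l: &Relation, r: &Relation) -> Relation {
    // (left index, right index) for each shared column.
    let shared: Vec<(usize, usize)> = l.columns.iter().enumerate()
        .filter_map(|(i, c)| r.columns.iter().position(|d| d == c).map(|j| (i, j)))
        .collect();
    // Right-hand columns that are not already present on the left.
    let extra: Vec<usize> = (0..r.columns.len())
        .filter(|j| !shared.iter().any(|(_, sj)| sj == j))
        .collect();
    let mut columns = l.columns.clone();
    columns.extend(extra.iter().map(|&j| r.columns[j].clone()));
    let mut rows = Vec::new();
    for lr in &l.rows {
        for rr in &r.rows {
            if shared.iter().all(|&(i, j)| lr[i] == rr[j]) {
                let mut row = lr.clone();
                row.extend(extra.iter().map(|&j| rr[j].clone()));
                rows.push(row);
            }
        }
    }
    Relation { columns, rows }
}

/// Evaluate nodes in order; each node materializes its full output,
/// and the last node's relation is returned as the root result.
pub fn run(nodes: &[Node], facts: &HashMap<String, Vec<Vec<String>>>) -> Relation {
    let mut out: Vec<Relation> = Vec::with_capacity(nodes.len());
    for node in nodes {
        let rel = match node {
            Node::Scan { relation, vars } => Relation {
                columns: vars.clone(),
                rows: facts.get(relation).cloned().unwrap_or_default(),
            },
            Node::Join { left, right } => natural_join(&out[*left], &out[*right]),
        };
        out.push(rel);
    }
    out.pop().expect("plan has at least one node")
}
```

Joining two scans of the same `edge` relation as `(a, b)` and `(b, c)` yields the two-hop paths with columns `a, b, c`.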
Pipeline
End-to-end, scenarios become runner output through three stages:
    tools/exporter/examples/*.scenario.json
      └── (Haskell exporter; runs Geolog.DB.Plan.planConjunction
           and Geolog.DB.InMemory.evalConjunctionPlanned as a self-check)
          └── crates/plan-runner/fixtures/*.json (JSON IR; checked in)
              └── (plan-runner; this crate)
                  └── stdout JSON, with row-for-row oracle check
The exporter (tools/exporter) is the only producer of runner IR today;
it's where atoms are planned and rejected if they don't fit the supported subset.
Fixtures are regenerated with make export-fixtures, and the full loop is make examples.
Backends
The CLI takes a --backend flag.
The memory backend is the pure in-memory path;
every other backend routes facts through the Storage trait
via build_tables_via_storage, then scans tables back out before executing.
| Backend | Storage | Location |
|---|---|---|
| memory | none (direct from plan.facts) | n/a |
| memory-storage | MemoryStorage | in-process |
| lmdb | LmdbStorage (heed-backed mmap B-tree) | fresh tempdir per run |
| redb | RedbStorage (single-file B-tree) | fresh tempdir per run |
| fjall | FjallStorage (LSM tree) | fresh tempdir per run |
| sqlite | SqliteStorage (rusqlite, bundled libsqlite3) | fresh tempdir per run |
| geomerge | GeomergeStorage (CRDT; alpha) | in-process |
All seven produce byte-identical output for every checked-in fixture. The point of the abstraction is not performance comparison (the snapshot evaluator is bulk-materialized either way), but to validate that the storage layer is genuinely backend-neutral and that adding a new adapter is a constructor swap.
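The "write through the trait, then scan back out" shape can be sketched as follows. This is an illustration under assumptions, not the crate's API: the two-method `Storage` trait, the `MapStorage` adapter, and both function signatures are hypothetical, and only the function names `build_tables` and `build_tables_via_storage` are borrowed from the text above.

```rust
use std::collections::BTreeMap;

/// One fact: a tuple of string values (simplified).
pub type Row = Vec<String>;

/// Hypothetical minimal storage interface; the real `Storage` trait may differ.
pub trait Storage {
    fn insert(&mut self, relation: &str, row: Row);
    fn scan(&self, relation: &str) -> Vec<Row>;
}

/// In-process adapter, standing in for something like `MemoryStorage`.
#[derive(Default)]
pub struct MapStorage {
    tables: BTreeMap<String, Vec<Row>>,
}

impl Storage for MapStorage {
    fn insert(&mut self, relation: &str, row: Row) {
        self.tables.entry(relation.to_string()).or_default().push(row);
    }
    fn scan(&self, relation: &str) -> Vec<Row> {
        self.tables.get(relation).cloned().unwrap_or_default()
    }
}

/// Pure path (assumed signature): tables come straight from the plan's facts.
pub fn build_tables(facts: &BTreeMap<String, Vec<Row>>) -> BTreeMap<String, Vec<Row>> {
    facts.clone()
}

/// Storage path (assumed signature): write every fact through the adapter,
/// then scan each relation back out before execution.
pub fn build_tables_via_storage<S: Storage>(
    storage: &mut S,
    facts: &BTreeMap<String, Vec<Row>>,
) -> BTreeMap<String, Vec<Row>> {
    for (relation, rows) in facts {
        for row in rows {
            storage.insert(relation, row.clone());
        }
    }
    facts.keys().map(|name| (name.clone(), storage.scan(name))).collect()
}
```

In this sketch, supporting another backend means implementing the two trait methods and passing a different value for `storage`; the evaluator never sees which adapter produced the tables.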
Note on geomerge:
the runner's JSON IR is untyped (only arity per relation),
but geomerge requires a typed theory upfront.
The CLI infers column types from the first fact row per relation
and synthesizes a theory of PrimInt and PrimString columns via
GeomergeStorage::with_relations.
Columns with no sample facts default to PrimString.
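The inference rule described above is small enough to sketch directly. The `Value` and `ColType` enums and the `infer_column_types` function are hypothetical names for illustration; only the `PrimInt` / `PrimString` distinction and the first-row, default-to-string behavior come from the description. How the real CLI represents decoded JSON values is not shown here.

```rust
/// A fact value as it might look after JSON decoding (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

/// Column types, named after the primitives mentioned above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColType {
    PrimInt,
    PrimString,
}

/// Infer one type per column by looking at the first row only.
/// Columns with no sample value fall back to `PrimString`.
pub fn infer_column_types(arity: usize, rows: &[Vec<Value>]) -> Vec<ColType> {
    let first = rows.first();
    (0..arity)
        .map(|i| match first.and_then(|row| row.get(i)) {
            Some(Value::Int(_)) => ColType::PrimInt,
            _ => ColType::PrimString,
        })
        .collect()
}
```

One consequence of sampling only the first row: a relation whose first row holds an integer in a column and a later row holds a string would be typed `PrimInt` in this sketch, so mixed-type columns are outside what it handles.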
Run It
# Run one fixture through the default in-memory path:
cargo run -p plan-runner -- crates/plan-runner/fixtures/two_atom_join.json
# Same plan, routed through different backends:
cargo run -p plan-runner -- --backend memory-storage crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend lmdb crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend redb crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend fjall crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend sqlite crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend geomerge crates/plan-runner/fixtures/two_atom_join.json
# Regenerate every fixture from its scenario and run the oracle test:
make examples
A sample run:
$ plan-run crates/plan-runner/fixtures/two_atom_join.json
{"columns":["a","b","_w0_2"],"rows":[["node:1","node:2","edge:1"],["node:2","node:1","edge:2"]]}
The _w<atomIdx>_<pos> columns are wildcards the exporter named so the runner can bind them.
The scenario's expected_bindings block names only the variables the test cares about,
and verify projects the runner output to that subset before comparing as a multiset.
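A rough sketch of that comparison, assuming string-valued rows: project the runner output onto the expected variables, then compare row counts per distinct row so that order is ignored but duplicates still matter. The function names and signatures here are illustrative, not the crate's actual `verify`.

```rust
use std::collections::HashMap;

/// Project `rows` onto the `wanted` columns, in that order.
/// Returns `None` if a wanted column is missing from the output.
pub fn project(
    columns: &[String],
    rows: &[Vec<String>],
    wanted: &[String],
) -> Option<Vec<Vec<String>>> {
    let idx = wanted.iter()
        .map(|w| columns.iter().position(|c| c == w))
        .collect::<Option<Vec<usize>>>()?;
    Some(rows.iter()
        .map(|r| idx.iter().map(|&i| r[i].clone()).collect())
        .collect())
}

/// Count occurrences of each row: order is ignored, duplicates are not.
fn multiset(rows: &[Vec<String>]) -> HashMap<&[String], usize> {
    let mut counts = HashMap::new();
    for r in rows {
        *counts.entry(r.as_slice()).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical oracle check: project, then compare as multisets.
pub fn verify(
    columns: &[String],
    rows: &[Vec<String>],
    expected_columns: &[String],
    expected_rows: &[Vec<String>],
) -> bool {
    match project(columns, rows, expected_columns) {
        Some(projected) => multiset(&projected) == multiset(expected_rows),
        None => false,
    }
}
```

With the sample output above and an oracle over `a` and `b` only, the `_w0_2` column is dropped by the projection before the two row sets are compared.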
Run the Tests
cargo test -p plan-runner
The two integration test files exercise complementary properties:
- tests/examples.rs walks every fixture and checks it against its expected_bindings oracle.
- tests/storage_roundtrip.rs cross-checks the pure path against the storage-backed path, to keep build_tables and build_tables_via_storage in lockstep.
Notes
- IR contract. The runner is backend-agnostic and frontend-agnostic: it consumes JSON in the shape documented in src/lib.rs and produces a binding relation. Anything that emits the same JSON can drive it.
- No optimizer. Plans are executed as written. Node ordering, join shape, and antijoin scheduling are all the producer's responsibility. This crate's job ends at faithful execution of the IR.
- Wildcard columns survive. scan_atom keeps every distinct variable that appears in the pattern, including the exporter's synthetic _w<atomIdx>_<pos> names. The runner does not project them out; oracle verification handles that on the comparison side.
- Bulk, not streaming. Each node materializes its full output as a Relation. This matches query-ops' execution model; it's not designed for incremental or maintained-view workloads.
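To illustrate the "wildcard columns survive" note, here is a toy scan over one atom. It is a sketch, not the query-ops `scan_atom`: the name `scan_atom_sketch`, its signature, and the rule that a repeated variable must take the same value at every position are all assumptions made for the example.

```rust
/// Toy scan of one atom: `pattern[i]` names the variable bound at position `i`.
/// Output has one column per distinct variable, in first-occurrence order,
/// so synthetic wildcard names such as `_w0_2` are kept like any other.
/// Assumed semantics: a repeated variable must agree at all of its positions.
pub fn scan_atom_sketch(
    pattern: &[&str],
    facts: &[Vec<String>],
) -> (Vec<String>, Vec<Vec<String>>) {
    // Distinct variables and the first position at which each occurs.
    let mut columns: Vec<String> = Vec::new();
    let mut first_pos: Vec<usize> = Vec::new();
    for (i, var) in pattern.iter().enumerate() {
        if !columns.iter().any(|c| c == var) {
            columns.push((*var).to_string());
            first_pos.push(i);
        }
    }
    let rows: Vec<Vec<String>> = facts.iter()
        // Skip facts whose arity does not match the pattern.
        .filter(|fact| fact.len() == pattern.len())
        // Every position must agree with the first occurrence of its variable.
        .filter(|fact| {
            pattern.iter().enumerate().all(|(i, var)| {
                let k = columns.iter().position(|c| c == var).unwrap();
                fact[i] == fact[first_pos[k]]
            })
        })
        // Emit one value per distinct variable.
        .map(|fact| first_pos.iter().map(|&p| fact[p].clone()).collect())
        .collect();
    (columns, rows)
}
```

Scanning with the pattern `["a", "b", "_w0_2"]` returns all three columns unchanged, while a pattern such as `["x", "x"]` returns a single `x` column and keeps only facts whose two positions are equal.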