storage-engine-playground/README.md at 6327c0e344e2327ff782a02a46449e72f7415756

Hassan Abedi 6327c0e344 Simplify the project

2026-06-05 13:40:16 +02:00

4.2 KiB

Plan Runner

This crate is a CLI executor for conjunctive query plans. It reads a JSON plan (currently a DAG of scan and join nodes plus the input facts), walks the DAG using the operators from query-ops, and prints the resulting relation as JSON to stdout.

Pipeline

End-to-end, scenarios become runner output through three stages:

tools/exporter/examples/*.scenario.json
  └── (Haskell exporter; runs Geolog.DB.Plan.planConjunction
       and Geolog.DB.InMemory.evalConjunctionPlanned as a self-check)
        └── crates/plan-runner/fixtures/*.json    (JSON IR; checked in)
             └── (plan-runner; this crate)
                  └── stdout JSON, with row-for-row oracle check

The exporter (tools/exporter) is the only producer of runner IR today. It plans atoms and rejects any that do not fit the supported subset. Fixtures are regenerated with make export-fixtures, and the full loop runs with make examples.

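Inside the runner, once a JSON plan arrives, the input facts are loaded into tables (directly for the memory backend, through the Storage trait for every other backend), the DAG is walked with the query-ops operators, and the resulting relation is printed as JSON to stdout.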
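To make the DAG walk concrete, here is a minimal sketch, not the crate's actual code. The `Relation`, `Node`, `natural_join`, and `run_plan` names are hypothetical, and the real IR shape is the one documented in src/lib.rs. It assumes nodes arrive in topological order and that a join refers to earlier nodes by index.

```rust
use std::collections::HashMap;

/// Hypothetical relation shape: named columns plus rows of string values.
#[derive(Debug, Clone, PartialEq)]
pub struct Relation {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<String>>,
}

/// Hypothetical plan node; the real IR may carry more fields.
pub enum Node {
    /// Read a base table by name.
    Scan { table: String },
    /// Natural join of two earlier nodes, referenced by index.
    Join { left: usize, right: usize },
}

/// Natural join on all shared column names (nested loop, for brevity).
pub fn natural_join(l: &Relation, r: &Relation) -> Relation {
    // (left index, right index) for every column name both sides share.
    let shared: Vec<(usize, usize)> = l
        .columns
        .iter()
        .enumerate()
        .filter_map(|(i, c)| r.columns.iter().position(|d| d == c).map(|j| (i, j)))
        .collect();
    // Right-hand columns that the left side does not already provide.
    let extra: Vec<usize> = (0..r.columns.len())
        .filter(|j| !shared.iter().any(|(_, sj)| sj == j))
        .collect();
    let mut columns = l.columns.clone();
    columns.extend(extra.iter().map(|&j| r.columns[j].clone()));
    let mut rows = Vec::new();
    for lr in &l.rows {
        for rr in &r.rows {
            if shared.iter().all(|&(i, j)| lr[i] == rr[j]) {
                let mut row = lr.clone();
                row.extend(extra.iter().map(|&j| rr[j].clone()));
                rows.push(row);
            }
        }
    }
    Relation { columns, rows }
}

/// Evaluate every node in order, materializing each output in full.
/// Returns the last node's relation, or `None` on a dangling reference.
pub fn run_plan(nodes: &[Node], tables: &HashMap<String, Relation>) -> Option<Relation> {
    let mut out: Vec<Relation> = Vec::with_capacity(nodes.len());
    for node in nodes {
        let rel = match node {
            Node::Scan { table } => tables.get(table)?.clone(),
            Node::Join { left, right } => natural_join(out.get(*left)?, out.get(*right)?),
        };
        out.push(rel);
    }
    out.pop()
}
```

Calling `run_plan` with two `Scan` nodes followed by a `Join { left: 0, right: 1 }` returns the joined relation. Each intermediate result is fully materialized before the next node runs.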

Storage Backends

The CLI takes a --backend flag. The memory backend is the pure in-memory path; every other backend routes facts through the Storage trait via build_tables_via_storage, then scans tables back out before executing.

| Backend | Storage | Location |
|---|---|---|
| `memory` | none | n/a |
| `memory-storage` | `MemoryStorage` | in-process |
| `lmdb` | `LmdbStorage` | fresh tempdir per run |
| `redb` | `RedbStorage` | fresh tempdir per run |
| `fjall` | `FjallStorage` | fresh tempdir per run |
| `sqlite` | `SqliteStorage` | fresh tempdir per run |
| `geomerge` | `GeomergeStorage` | in-process |
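The write-then-scan round trip used by the non-memory backends can be pictured roughly as follows. This is a sketch under stated assumptions: the `Storage` trait here is a hypothetical stand-in with invented `put` and `scan` methods, `MemStore` and `round_trip` are illustrative names, and the crate's real trait and build_tables_via_storage may look quite different.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the crate's `Storage` trait.
pub trait Storage {
    /// Append one fact (row) to the named table.
    fn put(&mut self, table: &str, row: Vec<String>);
    /// Return every row of the named table, in insertion order.
    fn scan(&self, table: &str) -> Vec<Vec<String>>;
}

/// Illustrative in-process backend, loosely analogous to `MemoryStorage`.
#[derive(Default)]
pub struct MemStore {
    tables: BTreeMap<String, Vec<Vec<String>>>,
}

impl Storage for MemStore {
    fn put(&mut self, table: &str, row: Vec<String>) {
        self.tables.entry(table.to_string()).or_default().push(row);
    }

    fn scan(&self, table: &str) -> Vec<Vec<String>> {
        self.tables.get(table).cloned().unwrap_or_default()
    }
}

/// Write every fact through the backend, then scan each table back out.
/// The executor would then run against the scanned tables only.
pub fn round_trip<S: Storage>(
    store: &mut S,
    facts: &[(&str, Vec<String>)],
) -> BTreeMap<String, Vec<Vec<String>>> {
    for (table, row) in facts {
        store.put(table, row.clone());
    }
    let mut out = BTreeMap::new();
    for (table, _) in facts {
        // Scan each distinct table once, the first time it is seen.
        out.entry(table.to_string())
            .or_insert_with(|| store.scan(table));
    }
    out
}
```

Because the executor only sees what comes back from `scan`, any backend that implements the trait faithfully should produce the same result relation as the memory path. A disk-backed implementation would open its database in a fresh temporary directory before the first `put`.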

Execute a Query Plan

# Run a plan with the default backend (no storage)
cargo run -p plan-runner -- crates/plan-runner/fixtures/two_atom_join.json

# Run the same plan with every supported backend
cargo run -p plan-runner -- --backend memory-storage crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend lmdb           crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend redb           crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend fjall          crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend sqlite         crates/plan-runner/fixtures/two_atom_join.json
cargo run -p plan-runner -- --backend geomerge       crates/plan-runner/fixtures/two_atom_join.json

A sample run:

$ plan-run crates/plan-runner/fixtures/two_atom_join.json
{"columns":["a","b","_w0_2"],"rows":[["node:1","node:2","edge:1"],["node:2","node:1","edge:2"]]}

The _w<atomIdx>_<pos> columns are wildcards the exporter named so the runner can bind them. The scenario's expected_bindings block names only the variables the test cares about, and verify projects the runner output to that subset before comparing as a multiset.
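The project-then-compare step can be sketched as below. This is an illustration of the idea rather than the actual verify code: `project` and `same_multiset` are hypothetical helper names, and rows are assumed to have one value per column.

```rust
/// Project `rows` onto the `wanted` columns, in the order given.
/// Returns `None` if a wanted column is missing from `columns`.
pub fn project(
    columns: &[&str],
    rows: &[Vec<&str>],
    wanted: &[&str],
) -> Option<Vec<Vec<String>>> {
    // Resolve each wanted name to its position in the runner output.
    let idx: Vec<usize> = wanted
        .iter()
        .map(|w| columns.iter().position(|c| c == w))
        .collect::<Option<Vec<usize>>>()?;
    Some(
        rows.iter()
            .map(|r| idx.iter().map(|&i| r[i].to_string()).collect())
            .collect(),
    )
}

/// Multiset equality: row order is ignored, duplicate counts are not.
/// Sorting both sides makes equal multisets compare equal as vectors.
pub fn same_multiset(mut a: Vec<Vec<String>>, mut b: Vec<Vec<String>>) -> bool {
    a.sort();
    b.sort();
    a == b
}
```

With the sample output above, projecting to `["a", "b"]` drops the `_w0_2` column, and the result matches an expected relation that lists the same two rows in either order. An expected relation with an extra duplicate row would not match.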

Run the Tests

cargo test -p plan-runner

Notes

IR contract. The runner is backend-agnostic and frontend-agnostic. It consumes JSON in the shape documented in src/lib.rs and produces a binding relation. Anything that emits the same JSON can drive it.
No optimizer. Plans are executed as written. Node ordering, join shape, and antijoin scheduling are all the producer's responsibility. This crate's job ends at faithful execution of the IR.
Wildcard columns survive. scan_atom keeps every distinct variable that appears in the pattern, including the exporter's synthetic _w<atomIdx>_<pos> names. The runner does not project them out; oracle verification handles that on the comparison side.
Bulk, not streaming. Each node materializes its full output as a Relation. This matches query-ops' execution model; the runner is not designed for incremental or maintained-view workloads.
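The "wildcard columns survive" note can be illustrated with a small sketch. It is not the real scan_atom: the signature is invented, constants in patterns are left out, and it assumes a variable repeated within one atom must bind the same value at every position.

```rust
/// Illustrative scan: one output column per distinct variable in `pattern`,
/// in first-occurrence order. Synthetic wildcard names such as `_w0_2`
/// are treated like any other variable and are kept in the output.
pub fn scan_atom(pattern: &[&str], facts: &[Vec<String>]) -> (Vec<String>, Vec<Vec<String>>) {
    let mut columns: Vec<String> = Vec::new();
    // first_pos[k] = pattern position that first binds column k.
    let mut first_pos: Vec<usize> = Vec::new();
    // slot[i] = output column that pattern position i maps to.
    let mut slot: Vec<usize> = Vec::with_capacity(pattern.len());
    for (i, var) in pattern.iter().enumerate() {
        match columns.iter().position(|c| c == var) {
            Some(k) => slot.push(k),
            None => {
                columns.push(var.to_string());
                first_pos.push(i);
                slot.push(columns.len() - 1);
            }
        }
    }

    let mut rows: Vec<Vec<String>> = Vec::new();
    for fact in facts {
        // Skip facts whose arity does not match the pattern.
        if fact.len() != pattern.len() {
            continue;
        }
        // A repeated variable must see the same value at every position.
        let consistent = (0..pattern.len()).all(|i| fact[i] == fact[first_pos[slot[i]]]);
        if consistent {
            rows.push(first_pos.iter().map(|&p| fact[p].clone()).collect());
        }
    }
    (columns, rows)
}
```

Scanning with the pattern `["a", "b", "_w0_2"]` yields three columns, wildcard included, so dropping `_w0_2` is left to the comparison side, as described above.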
