# AGENTS.md

This file provides guidance to coding agents collaborating on this repository.

## Mission

`storage-engine-playground` is an experimental Rust project for prototyping query engines and storage engines.

The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions:

- how a query language should be parsed, cataloged, planned, and optimized
- how a query planner and a query executor should be separated, and what intermediate representation sits between them
- how a query executor's operators (scans, joins, antijoins, projections) compose into a working snapshot evaluator
- how a storage engine should expose a backend-neutral interface (relations, rows, transactions, scans), and how that interface holds up across
  different backends (in-process, file-backed, CRDT, and so on)

Priorities, in order:

1. Correctness: prototypes must have clear expected outputs and tests.
2. Clarity: each module and test should answer one research or engineering question.
3. Small scope: prefer narrow experiments over broad engine rewrites.
4. Explainability: planners should emit inspectable plans, not only executable structures.
5. Reproducibility: examples should use committed fixtures, deterministic tests, and documented commands.

## Core Rules

- Use English for code, comments, tests, and prose.
- Treat reference material in ignored local paths as reference only. Do not copy its code into durable modules without an explicit decision.
- Prefer implementing small vertical slices: parse a subset, build a catalog, plan one rule shape, and test it.
- Do not build a full Datalog engine before the planning layer is useful and tested.
- Keep source language, relational planning, and backend execution separated.
- Prefer backend-neutral intermediate structures until a specific backend API requires specialization.
- Add comments only when they explain non-obvious planning, recursion, delta, or storage behavior.
- Treat tests and fixtures as part of the design, not as afterthoughts.

## Writing Style

- Use Oxford commas in inline lists: "a, b, and c" not "a, b, c".
- Do not use em dashes. Restructure the sentence, or use a colon or semicolon instead.
- Avoid colorful adjectives and adverbs. Write "storage engine" not "lightweight storage engine", "planner" not "clever planner".
- Use noun phrases for checklist items, not imperative verbs. Write "rule catalog construction" not "construct rule catalogs".
- Headings in Markdown files must be in title case: "Query Planning" not "Query planning". Minor words (a, an, the, and, but, or, for, in, on, at, to,
  by, of) stay lowercase unless they are the first word.

## Repository Layout

Discover the current layout from the filesystem before editing.
The shape today is:

- `crates/`: Rust workspace.
  See [`crates/README.md`](crates/README.md) for the responsibilities and dependency edges between the four crates (`storage`, `query-ops`,
  `plan-runner`, `geomerge-demo`).
  Each crate keeps its own `src/`, `tests/`, and (where relevant) `fixtures/`, `benches/`, and `docs/diagrams/` subdirectories.
- `tools/exporter/`: Haskell tool that consumes hand-authored `.scenario.json` files in `tools/exporter/examples/` and emits the runner-IR JSON
  consumed by `crates/plan-runner`.
  See [`tools/exporter/README.md`](tools/exporter/README.md).
- `tools/plan-viewer/`: static HTML viewer for `plan-runner` fixtures.
  It evaluates a fixture in the browser and renders the plan DAG, per-node relations, input facts, and oracle comparison.
  See [`tools/plan-viewer/README.md`](tools/plan-viewer/README.md).
- `external/`: git submodules.
  `external/geolog` provides the Haskell query planner used by the exporter; `external/geomerge` is the Rust CRDT crate consumed by
  `storage::adapters::geomerge`.
- Top-level configuration: `Makefile`, `flake.nix`, `Cargo.toml` (workspace), `pyproject.toml`, `.pre-commit-config.yaml`, `rust-toolchain.toml`.

Do not assume this list is exhaustive.
If the project grows a different structure, follow the actual codebase and update this file when conventions stabilize.

## Technical Direction

The main experimental architecture is:

```text
Datalog-like rules or Geolog-shaped laws
-> dependency analysis and strata
-> rule catalog
-> join graph
-> relational plan
-> FlowLog-style optimization
-> backend lowering
-> snapshot outputs
```

Keep these layers explicit:

- **Source Layer**: Datalog-like test programs and Geomerge-style laws.
- **Catalog Layer**: rule heads, body atoms, variables, constants, comparisons, negation, and projections.
- **Planning Layer**: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice.
- **Execution Layer**: snapshot evaluator.
- **Storage Layer**: facts, transactions, rollback, preview state, and violation output integration.
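
The boundary between the Execution Layer and the Storage Layer can be sketched as a trait.
This is a minimal sketch under assumptions: `Backend`, `MemoryBackend`, and `Row` are hypothetical names for illustration and are not taken from the `storage` crate, and a single staging area stands in for real transaction handling.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A row is an ordered tuple of values; strings keep the sketch small.
pub type Row = Vec<String>;

/// Hypothetical backend-neutral interface: relations, rows, transactions, and scans.
pub trait Backend {
    /// Starts a transaction; writes stay staged until `commit`.
    fn begin(&mut self);
    /// Stages a row for insertion into `relation`.
    fn insert(&mut self, relation: &str, row: Row);
    /// Makes staged rows visible to scans.
    fn commit(&mut self);
    /// Discards staged rows.
    fn rollback(&mut self);
    /// Returns committed rows of `relation` in sorted order.
    fn scan(&self, relation: &str) -> Vec<Row>;
}

/// Hypothetical in-process backend: committed facts plus a staging area.
#[derive(Default)]
pub struct MemoryBackend {
    committed: BTreeMap<String, BTreeSet<Row>>,
    staged: Vec<(String, Row)>,
}

impl Backend for MemoryBackend {
    fn begin(&mut self) {
        self.staged.clear();
    }

    fn insert(&mut self, relation: &str, row: Row) {
        self.staged.push((relation.to_string(), row));
    }

    fn commit(&mut self) {
        for (relation, row) in self.staged.drain(..) {
            self.committed.entry(relation).or_default().insert(row);
        }
    }

    fn rollback(&mut self) {
        self.staged.clear();
    }

    fn scan(&self, relation: &str) -> Vec<Row> {
        self.committed
            .get(relation)
            .map(|rows| rows.iter().cloned().collect())
            .unwrap_or_default()
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn rollback_discards_staged_rows() {
        let mut backend = MemoryBackend::default();
        backend.begin();
        backend.insert("edge", vec!["a".into(), "b".into()]);
        backend.rollback();
        assert!(backend.scan("edge").is_empty());

        backend.begin();
        backend.insert("edge", vec!["a".into(), "b".into()]);
        backend.commit();
        assert_eq!(backend.scan("edge"), vec![vec!["a".to_string(), "b".to_string()]]);
    }
}
```

An evaluator written against such a trait would not need to know whether facts live in process, in a file, or in a CRDT document.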

## FlowLog-Inspired Planning

Treat FlowLog as a planning reference, not as a dependency to add by default.

Reusable ideas:

- rule catalog construction
- dependency graph and stratification
- per-rule join graph extraction
- width-oriented structural planning
- sideways information passing
- antijoin scheduling
- physical key and payload selection
- shared subplan detection
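
As an illustration of the dependency graph and stratification item, the sketch below assigns a stratum to each predicate with the textbook fixpoint iteration.
It is not FlowLog's implementation; `Dep` and `stratify` are hypothetical names, and the input is assumed to be a flat list of head-to-body dependencies.

```rust
use std::collections::BTreeMap;

/// One dependency edge from a rule head to a body predicate (hypothetical shape).
/// `negated` marks a body literal of the form `not p(...)`.
pub struct Dep<'a> {
    pub head: &'a str,
    pub body: &'a str,
    pub negated: bool,
}

/// Assigns a stratum to each predicate.
/// Returns `None` when negation occurs inside a recursive cycle.
pub fn stratify<'a>(deps: &[Dep<'a>]) -> Option<BTreeMap<&'a str, usize>> {
    let mut strata: BTreeMap<&'a str, usize> = BTreeMap::new();
    for dep in deps {
        strata.insert(dep.head, 0);
        strata.insert(dep.body, 0);
    }
    // A stratifiable program never needs more strata than it has predicates.
    let limit = strata.len();
    loop {
        let mut changed = false;
        for dep in deps {
            let body = strata[dep.body];
            // A negated body predicate must be fully computed in a lower stratum.
            let required = if dep.negated { body + 1 } else { body };
            if strata[dep.head] < required {
                if required >= limit {
                    return None;
                }
                strata.insert(dep.head, required);
                changed = true;
            }
        }
        if !changed {
            return Some(strata);
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn negated_dependency_raises_the_head_stratum() {
        let deps = [
            Dep { head: "required_consequent", body: "antecedent", negated: false },
            Dep { head: "violation", body: "required_consequent", negated: false },
            Dep { head: "violation", body: "consequent", negated: true },
        ];
        let strata = stratify(&deps).expect("program is stratifiable");
        assert_eq!(strata["violation"], 1);
        assert_eq!(strata["consequent"], 0);
    }

    #[test]
    fn negation_through_recursion_is_rejected() {
        let deps = [Dep { head: "p", body: "p", negated: true }];
        assert!(stratify(&deps).is_none());
    }
}
```

The `BTreeMap` keeps the result in predicate-name order, so a textual dump of the strata is stable across runs.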

When adapting an idea, write the smallest test that demonstrates the behavior. For example:

```text
rule with three positive atoms
-> catalog variables
-> join graph
-> planned join tree
-> expected textual plan
```
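
The pipeline above could be exercised with a test like the following.
This is a sketch under assumptions: `Atom`, `catalog_variables`, `join_graph`, and `textual_plan` are hypothetical names, the plan syntax is invented for illustration, and the planner joins atoms in body order instead of choosing an order from the join graph.

```rust
use std::collections::BTreeSet;

/// A positive body atom: relation name plus variable names (hypothetical shape).
pub struct Atom {
    pub relation: &'static str,
    pub vars: Vec<&'static str>,
}

/// Catalog step: all variables of the rule body, in sorted order.
pub fn catalog_variables(atoms: &[Atom]) -> BTreeSet<&'static str> {
    atoms.iter().flat_map(|atom| atom.vars.iter().copied()).collect()
}

/// Join graph step: one edge per pair of atoms that share at least one variable.
pub fn join_graph(atoms: &[Atom]) -> Vec<(usize, usize, Vec<&'static str>)> {
    let mut edges = Vec::new();
    for i in 0..atoms.len() {
        for j in (i + 1)..atoms.len() {
            let shared: Vec<&'static str> = atoms[i]
                .vars
                .iter()
                .copied()
                .filter(|v| atoms[j].vars.contains(v))
                .collect();
            if !shared.is_empty() {
                edges.push((i, j, shared));
            }
        }
    }
    edges
}

/// Planning step: left-deep join tree in body order, rendered as text.
pub fn textual_plan(atoms: &[Atom]) -> String {
    let mut bound: BTreeSet<&str> = BTreeSet::new();
    let mut plan = String::new();
    for atom in atoms {
        // Join keys are the variables already bound by earlier atoms.
        let keys: Vec<&str> = atom.vars.iter().copied().filter(|v| bound.contains(v)).collect();
        let scan = format!("scan {}({})", atom.relation, atom.vars.join(", "));
        plan = if plan.is_empty() {
            scan
        } else {
            format!("join[{}]({}, {})", keys.join(", "), plan, scan)
        };
        bound.extend(atom.vars.iter().copied());
    }
    plan
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn three_atom_chain_has_expected_plan() {
        // path(x, w) :- edge(x, y), edge(y, z), edge(z, w).
        let atoms = [
            Atom { relation: "edge", vars: vec!["x", "y"] },
            Atom { relation: "edge", vars: vec!["y", "z"] },
            Atom { relation: "edge", vars: vec!["z", "w"] },
        ];
        let vars: Vec<&str> = catalog_variables(&atoms).into_iter().collect();
        assert_eq!(vars, ["w", "x", "y", "z"]);
        assert_eq!(join_graph(&atoms), vec![(0, 1, vec!["y"]), (1, 2, vec!["z"])]);
        assert_eq!(
            textual_plan(&atoms),
            "join[z](join[y](scan edge(x, y), scan edge(y, z)), scan edge(z, w))"
        );
    }
}
```

Comparing against a textual plan keeps a failing test readable: the diff shows which join key or join order changed.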

## Geomerge-Style Validation Experiments

The first Geomerge-style target is maintained violation detection for supported relational laws.

A useful lowering shape is:

```text
required_consequent(x) :- antecedent(x).
violation(x) :- required_consequent(x), not consequent(x).
```
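
For a foreign-key-style law, the lowering amounts to an antijoin.
The sketch below evaluates it over a snapshot, assuming a hypothetical law "every order references a known customer"; `fk_violations` and the relation shapes are illustrative and are not part of the Geomerge API.

```rust
use std::collections::BTreeSet;

/// Snapshot evaluation of the lowering for one foreign-key-style law.
/// `orders` holds (order_id, customer_id) facts; `customers` holds customer ids.
pub fn fk_violations<'a>(
    orders: &[(&'a str, &'a str)],
    customers: &BTreeSet<&'a str>,
) -> Vec<(&'a str, &'a str)> {
    // required_consequent(order, customer) :- order(order, customer).
    let required_consequent: BTreeSet<(&'a str, &'a str)> = orders.iter().copied().collect();
    // violation(order, customer) :- required_consequent(order, customer), not customer(customer).
    required_consequent
        .into_iter()
        .filter(|(_, customer)| !customers.contains(customer))
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn order_with_unknown_customer_is_a_violation() {
        let orders = [("o1", "c1"), ("o2", "c9"), ("o3", "c2")];
        let customers = BTreeSet::from(["c1", "c2"]);
        assert_eq!(fk_violations(&orders, &customers), vec![("o2", "c9")]);
    }
}
```

The intermediate `BTreeSet` deduplicates facts and fixes the output order, so the expected output in the test does not depend on input order.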

Start with:

- foreign-key-style laws
- totality-as-validation laws
- equality-as-violation laws
- multi-atom antecedents without existential witnesses

Exclude at first:

- existential witness generation
- disjunctive consequents
- equality saturation
- model branching
- full chase behavior

Violation rows should carry enough context for diagnostics:

```text
law_id
violation_kind
relation_or_consequent
bound_variable_values
```
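
One way to model such a row in Rust is shown below.
The field names follow the list above; the `ViolationKind` variants and the `Display` format are assumptions made for illustration.

```rust
use std::fmt;

/// Hypothetical violation kinds; the real set depends on the supported laws.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum ViolationKind {
    MissingConsequent,
    UnexpectedEquality,
}

/// A violation row with the diagnostic context listed above.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct ViolationRow {
    pub law_id: String,
    pub violation_kind: ViolationKind,
    pub relation_or_consequent: String,
    /// (variable, value) pairs in rule order.
    pub bound_variable_values: Vec<(String, String)>,
}

impl fmt::Display for ViolationRow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let bindings: Vec<String> = self
            .bound_variable_values
            .iter()
            .map(|(variable, value)| format!("{}={}", variable, value))
            .collect();
        write!(
            f,
            "{} {:?} {} [{}]",
            self.law_id,
            self.violation_kind,
            self.relation_or_consequent,
            bindings.join(", ")
        )
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn violation_row_renders_as_one_line() {
        let row = ViolationRow {
            law_id: "fk_order_customer".to_string(),
            violation_kind: ViolationKind::MissingConsequent,
            relation_or_consequent: "customer".to_string(),
            bound_variable_values: vec![
                ("order".to_string(), "o2".to_string()),
                ("customer".to_string(), "c9".to_string()),
            ],
        };
        assert_eq!(
            row.to_string(),
            "fk_order_customer MissingConsequent customer [order=o2, customer=c9]"
        );
    }
}
```

Deriving `Ord` lets a test sort violation rows before comparing them with an expected list.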

## Rust Conventions

- Prefer small modules with explicit data structures over large generic abstractions.
- Use enums and structs to model rule syntax, catalog entries, plan nodes, and execution results.
- Prefer typed identifiers for relation names, variable names, rule ids, and field positions when it improves clarity.
- Keep parser errors and unsupported-feature errors explicit.
- Avoid panics in library code except for internal invariants that tests already cover.
- Use deterministic ordering for plans and diagnostics so tests are stable.
- Prefer simple snapshot evaluators as correctness oracles before optimizing.
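
Several of these conventions fit in one small sketch: a typed identifier, explicit error variants, and deterministic ordering.
All names here (`RelationName`, `CatalogError`, `Catalog`, `check_rule_text`) are hypothetical, and the `|` check is a stand-in for a parser's unsupported-feature detection.

```rust
use std::collections::BTreeMap;

/// Typed identifier: a relation name cannot be confused with a variable name.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct RelationName(pub String);

/// Explicit error cases instead of panics or silent acceptance.
#[derive(Debug, PartialEq, Eq)]
pub enum CatalogError {
    ArityMismatch { relation: RelationName, expected: usize, found: usize },
    UnsupportedFeature { feature: &'static str },
}

#[derive(Default)]
pub struct Catalog {
    // BTreeMap iterates in key order, so listings do not depend on insertion order.
    arities: BTreeMap<RelationName, usize>,
}

impl Catalog {
    /// Records a relation's arity, or reports a conflict with an earlier use.
    pub fn register(&mut self, relation: RelationName, arity: usize) -> Result<(), CatalogError> {
        if let Some(&expected) = self.arities.get(&relation) {
            if expected != arity {
                return Err(CatalogError::ArityMismatch { relation, expected, found: arity });
            }
        }
        self.arities.insert(relation, arity);
        Ok(())
    }

    /// Lists relations in name order.
    pub fn relations(&self) -> Vec<&RelationName> {
        self.arities.keys().collect()
    }
}

/// Rejects rule text outside the supported subset and names the feature.
pub fn check_rule_text(text: &str) -> Result<(), CatalogError> {
    if text.contains('|') {
        return Err(CatalogError::UnsupportedFeature { feature: "disjunction" });
    }
    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn relations_are_listed_in_name_order() {
        let mut catalog = Catalog::default();
        catalog.register(RelationName("node".into()), 1).unwrap();
        catalog.register(RelationName("edge".into()), 2).unwrap();
        let names: Vec<&str> = catalog.relations().iter().map(|r| r.0.as_str()).collect();
        assert_eq!(names, ["edge", "node"]);
    }

    #[test]
    fn conflicting_arity_is_an_explicit_error() {
        let mut catalog = Catalog::default();
        catalog.register(RelationName("edge".into()), 2).unwrap();
        let error = catalog.register(RelationName("edge".into()), 3).unwrap_err();
        assert_eq!(
            error,
            CatalogError::ArityMismatch { relation: RelationName("edge".into()), expected: 2, found: 3 }
        );
    }
}
```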

## Testing Expectations

Add tests for every non-trivial behavior.

Recommended test groups:

- parser acceptance and rejection
- rule catalog construction
- dependency graph and strata
- join graph construction
- structural planning
- antijoin scheduling
- SIP-style filtering
- snapshot evaluation
- storage-backend adapter parity (in-process, file-backed, and CRDT)
- Geomerge-style violation fixtures

Tests should prefer small fact sets with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented.

## Required Validation

Use the repository's actual tooling.
The `Makefile` wraps the standard Rust commands and skips Rust checks with a clear message if no `Cargo.toml` exists yet.

For Rust changes, prefer:

1. `make format-check`
2. `make lint`
3. `make test`

These map to `cargo fmt --all --check`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-targets --all-features`.
If the project does not yet have a `Cargo.toml`, `make check` should still pass by skipping Rust-specific checks.

For changes that touch the cross-language pipeline (Haskell exporter and Rust runner), also run:

1. `make export-fixtures`: rebuilds `crates/plan-runner/fixtures/*.json` from `tools/exporter/examples/*.scenario.json` using the Haskell exporter.
   Requires the Nix dev shell (`make shell` or `nix develop`) so GHC and Cabal are available.
2. `make examples`: runs `export-fixtures` and then `cargo test -p plan-runner --test examples`, which walks every regenerated fixture and verifies it
   against its `expected_bindings` oracle.

For Markdown-only changes, run a manual read-through and check that headings follow the writing style.

## Change Design Checklist

Before coding:

1. Problem statement and target question
2. Existing module or new module decision
3. Snapshot oracle or expected output
4. Supported and unsupported feature boundary
5. Small fixture or example shape

Before submitting:

1. Formatting status
2. Test status
3. Unsupported cases documented
4. No durable references to ignored local paths
5. Notes or examples updated when behavior changes

## Review Guidelines

Review output should prioritize correctness and experiment quality.

- `P0`: must-fix defects, such as incorrect query results, invalid rollback behavior, unsupported syntax accepted silently, or tests that cannot run.
- `P1`: high-priority defects, such as nondeterministic plans, unclear unsupported-feature errors, a missing snapshot oracle for a planner change, or
  misleading notes.
- `P2`: useful follow-up, such as additional fixtures, clearer diagnostics, or broader benchmark coverage.

Use this review format:

1. `Severity` (`P0`/`P1`/`P2`)
2. `File:line`
3. `Issue`
4. `Why it matters`
5. `Minimal fix direction`

## Practical Notes for Agents

- Read the relevant durable project notes before changing architecture.
- Treat copied papers, cloned repositories, and generated files in ignored local paths as reference material only.
- Prefer a planning-only prototype before backend integration.
- Prefer textual plan explanations in early tests. They make the planner easier to debug.
- Keep backend comparison fair: same rule, same input facts, same expected output.
- Keep transaction and rollback behavior explicit for validation experiments.
- Keep project tooling aligned with this file when new commands or checks are added.

## Commit and PR Hygiene

- Keep commits scoped to one logical change: parser, catalog, planner, evaluator, fixture, note, or tooling.
- Do not mix broad formatting churn with semantic changes.
- PR descriptions should include:
    1. the experiment or feature being tested,
    2. the source rules or fixtures affected,
    3. the expected behavior,
    4. validation commands and results,
    5. known unsupported cases.

Suggested PR checklist:

- [ ] `make format-check` passes, if applicable
- [ ] `make lint` passes, if applicable
- [ ] `make test` passes, if applicable
- [ ] Snapshot oracle or expected output included for planner behavior
- [ ] Unsupported cases documented