# AGENTS.md

This file provides guidance to coding agents collaborating on this repository.

## Mission

`storage-engine-playground` is an experimental Rust project for testing ideas from the FlowLog, DBSP, CRDT-as-query, and Geomerge notes.

The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions:

- how Datalog-like rules should be parsed, cataloged, planned, and optimized
- how FlowLog-style planning ideas transfer to a DBSP-oriented frontend
- how CRDT queries behave under naive plans versus planned relational execution
- how Geomerge-style laws can compile into maintained violation relations
- how backend behavior changes across snapshot, DBSP-like, and Differential Dataflow-like execution models

Priorities, in order:

1. Correctness: prototypes must have clear expected outputs and tests.
2. Clarity: each module and test should answer one research or engineering question.
3. Small scope: prefer narrow experiments over broad engine rewrites.
4. Explainability: planners should emit inspectable plans, not only executable structures.
5. Reproducibility: examples should use committed fixtures, deterministic tests, and documented commands.

## Core Rules

- Use English for code, comments, tests, and prose.
- Treat material in ignored local paths as reference material only. Do not copy its code into durable modules without an explicit decision.
- Prefer implementing small vertical slices: parse a subset, build a catalog, plan one rule shape, and test it.
- Do not build a full Datalog engine before the planning layer is useful and tested.
- Keep source language, relational planning, and backend execution separated.
- Prefer backend-neutral intermediate structures until a specific backend API requires specialization.
- Add comments only when they explain non-obvious planning, recursion, delta, or storage behavior.
- Treat tests and fixtures as part of the design, not as afterthoughts.

## Writing Style

- Use Oxford commas in inline lists: "a, b, and c" not "a, b, c".
- Do not use em dashes. Restructure the sentence, or use a colon or semicolon instead.
- Avoid colorful adjectives and adverbs. Write "storage engine" not "lightweight storage engine", "planner" not "clever planner".
- Use noun phrases for checklist items, not imperative verbs. Write "rule catalog construction" not "construct rule catalogs".
- Headings in Markdown files must be in title case: "Query Planning" not "Query planning". Minor words (a, an, the, and, but, or, for, in, on, at, to,
  by, of) stay lowercase unless they are the first word.

## Repository Layout

The repository is new and may change. Discover the current layout from the filesystem before editing.

Expected durable areas may include:

- `src/`: Rust source for parser, catalog, planner, execution experiments, and storage prototypes.
- `tests/`: integration tests for rule planning, evaluation, and storage behavior.
- `examples/`: small runnable Datalog-like programs or storage scenarios.
- `fixtures/`: committed input facts and expected outputs.
- `notes/`: local design notes that belong to this project.
- `flowlog/`: project-local notes or sketches derived from the FlowLog line of work.

Do not assume this list is exhaustive. If the project grows a different structure, follow the actual codebase and update this file when conventions
stabilize.

## Technical Direction

The main experimental architecture is:

```text
Datalog-like rules or Geolog-shaped laws
-> dependency analysis and strata
-> rule catalog
-> join graph
-> relational plan
-> FlowLog-style optimization
-> backend lowering
-> maintained or snapshot outputs
```

Keep these layers explicit:

- **Source Layer**: Datalog-like test programs, CRDT query definitions, and Geomerge-style laws.
- **Catalog Layer**: rule heads, body atoms, variables, constants, comparisons, negation, and projections.
- **Planning Layer**: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice.
- **Execution Layer**: snapshot evaluator first, then DBSP-like or Differential Dataflow-like experiments.
- **Storage Layer**: facts, transactions, rollback, preview state, and violation output integration.
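
The "dependency analysis and strata" step of the pipeline can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Dep` struct and the `stratify` function are hypothetical names, and relation names are plain string slices to keep the sketch short. It assigns each relation the smallest stratum that places every negated dependency in a strictly lower stratum, and it reports an error when negation occurs inside a cycle.

```rust
use std::collections::BTreeMap;

/// One dependency edge: `head` depends on `body`; `negated` marks `not body`.
/// Hypothetical type for this sketch only.
pub struct Dep {
    pub head: &'static str,
    pub body: &'static str,
    pub negated: bool,
}

/// Assign each relation a stratum, or report a cycle through negation.
pub fn stratify(deps: &[Dep]) -> Result<BTreeMap<&'static str, usize>, String> {
    let mut strata: BTreeMap<&'static str, usize> = BTreeMap::new();
    for d in deps {
        strata.insert(d.head, 0);
        strata.insert(d.body, 0);
    }
    // A stratifiable program never needs a stratum >= the relation count.
    let limit = strata.len();
    loop {
        let mut changed = false;
        for d in deps {
            // Positive edges require stratum >= body; negated edges require > body.
            let need = strata[d.body] + usize::from(d.negated);
            if strata[d.head] < need {
                if need >= limit {
                    return Err(format!("cycle through negation at `{}`", d.head));
                }
                strata.insert(d.head, need);
                changed = true;
            }
        }
        if !changed {
            return Ok(strata);
        }
    }
}
```

The result is a `BTreeMap`, so iteration order is deterministic and the strata can be compared against a committed expected output.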

## FlowLog-Inspired Planning

Treat FlowLog as a planning reference, not as an automatic dependency.

Reusable ideas:

- rule catalog construction
- dependency graph and stratification
- per-rule join graph extraction
- width-oriented structural planning
- sideways information passing
- antijoin scheduling
- physical key and payload selection
- shared subplan detection

When adapting an idea, write the smallest test that demonstrates the behavior. For example:

```text
rule with three positive atoms
-> catalog variables
-> join graph
-> planned join tree
-> expected textual plan
```
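
The flow above could be tested with a sketch like the following. All names here (`Atom`, `atom`, `catalog_variables`, `join_graph`, `explain_left_deep`) are assumptions made for illustration. The planner is a greedy left-deep ordering chosen for brevity; it is not FlowLog's width-oriented algorithm. The point is the shape of the test: a rule body goes in, and a deterministic textual plan comes out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A positive body atom: relation name plus variable names.
/// Constants and comparisons are omitted in this sketch.
#[derive(Debug, Clone)]
pub struct Atom {
    pub relation: String,
    pub vars: Vec<String>,
}

pub fn atom(relation: &str, vars: &[&str]) -> Atom {
    Atom {
        relation: relation.to_string(),
        vars: vars.iter().map(|v| v.to_string()).collect(),
    }
}

/// Catalog step: map each variable to the atom indexes that mention it.
pub fn catalog_variables(body: &[Atom]) -> BTreeMap<String, BTreeSet<usize>> {
    let mut out: BTreeMap<String, BTreeSet<usize>> = BTreeMap::new();
    for (i, a) in body.iter().enumerate() {
        for v in &a.vars {
            out.entry(v.clone()).or_default().insert(i);
        }
    }
    out
}

/// Join graph step: one edge (i, j, shared variables) per pair of atoms
/// that share at least one variable.
pub fn join_graph(body: &[Atom]) -> Vec<(usize, usize, Vec<String>)> {
    let mut edges = Vec::new();
    for i in 0..body.len() {
        let left: BTreeSet<&String> = body[i].vars.iter().collect();
        for j in (i + 1)..body.len() {
            let right: BTreeSet<&String> = body[j].vars.iter().collect();
            let shared: Vec<String> =
                left.intersection(&right).map(|v| (*v).clone()).collect();
            if !shared.is_empty() {
                edges.push((i, j, shared));
            }
        }
    }
    edges
}

/// Planning step: greedy left-deep order. Pick the first remaining atom that
/// shares a bound variable; fall back to a cross product otherwise.
pub fn explain_left_deep(body: &[Atom]) -> String {
    let mut remaining: Vec<usize> = (0..body.len()).collect();
    let mut bound: BTreeSet<String> = BTreeSet::new();
    let mut lines: Vec<String> = Vec::new();
    while !remaining.is_empty() {
        let pos = remaining
            .iter()
            .position(|&i| body[i].vars.iter().any(|v| bound.contains(v)))
            .unwrap_or(0);
        let a = &body[remaining.remove(pos)];
        let keys: BTreeSet<&String> =
            a.vars.iter().filter(|v| bound.contains(*v)).collect();
        let keys: Vec<&str> = keys.into_iter().map(|s| s.as_str()).collect();
        let shape = format!("{}({})", a.relation, a.vars.join(", "));
        if lines.is_empty() {
            lines.push(format!("scan {shape}"));
        } else if keys.is_empty() {
            lines.push(format!("cross {shape}"));
        } else {
            lines.push(format!("join {shape} on [{}]", keys.join(", ")));
        }
        bound.extend(a.vars.iter().cloned());
    }
    lines.join("\n")
}
```

For the body `edge(x, y), edge(y, z), label(z, c)`, this sketch yields a scan of the first atom followed by two keyed joins. A test can compare that text against a committed expected plan.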

## DBSP and Incremental Execution

DBSP-related work should preserve a clean boundary:

```text
planned relational IR
-> DBSP lowering
-> maintained output deltas
```

Do not make DBSP responsible for source-language semantics. The frontend should check supported syntax, stratification, and rule shape before backend
lowering.

For each DBSP-like experiment, also provide a snapshot oracle when feasible:

```text
snapshot result == maintained result after each update
```
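
One way to exercise this equality is sketched below for a single join, `out(x, z) :- r(x, y), s(y, z)`. The sketch does not use the DBSP crate. It models relations as weighted sets with signed multiplicities and applies the standard bilinear delta rule for joins. The names `ZSet`, `apply`, `join`, `MaintainedJoin`, and `matches_snapshot_after_each_update` are hypothetical.

```rust
use std::collections::BTreeMap;

/// A weighted set over binary tuples: tuple -> signed multiplicity.
pub type ZSet = BTreeMap<(u32, u32), i64>;

/// Add `delta` into `target`, dropping entries whose weight reaches zero.
pub fn apply(target: &mut ZSet, delta: &ZSet) {
    for (tuple, weight) in delta {
        let updated = target.get(tuple).copied().unwrap_or(0) + weight;
        if updated == 0 {
            target.remove(tuple);
        } else {
            target.insert(*tuple, updated);
        }
    }
}

/// Snapshot oracle: out(x, z) :- r(x, y), s(y, z), computed from scratch.
pub fn join(r: &ZSet, s: &ZSet) -> ZSet {
    let mut out = ZSet::new();
    for (&(x, y), &wr) in r {
        for (&(y2, z), &ws) in s {
            if y == y2 {
                *out.entry((x, z)).or_insert(0) += wr * ws;
            }
        }
    }
    out.retain(|_, w| *w != 0);
    out
}

/// Maintained version of the same join.
#[derive(Debug, Default)]
pub struct MaintainedJoin {
    pub r: ZSet,
    pub s: ZSet,
    pub out: ZSet,
}

impl MaintainedJoin {
    /// Apply input deltas and return the output delta:
    /// d(out) = dr ⋈ s + r ⋈ ds + dr ⋈ ds, using the pre-update r and s.
    pub fn step(&mut self, dr: &ZSet, ds: &ZSet) -> ZSet {
        let mut delta = join(dr, &self.s);
        apply(&mut delta, &join(&self.r, ds));
        apply(&mut delta, &join(dr, ds));
        apply(&mut self.r, dr);
        apply(&mut self.s, ds);
        apply(&mut self.out, &delta);
        delta
    }
}

/// Check `snapshot result == maintained result` after each update.
pub fn matches_snapshot_after_each_update(updates: &[(ZSet, ZSet)]) -> bool {
    let mut m = MaintainedJoin::default();
    for (dr, ds) in updates {
        m.step(dr, ds);
        if m.out != join(&m.r, &m.s) {
            return false;
        }
    }
    true
}
```

The returned delta from `step` is also the natural place to record output delta size for the measurements listed in this section.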

Track these measurements when relevant:

- hydration time
- warm-update time
- output delta size
- maintained state size if available
- sensitivity to join order
- sensitivity to causal-history depth

## CRDT Query Experiments

Initial CRDT workloads should stay small and explicit:

- multi-value register
- causal readiness over `pred`
- list next-element traversal
- tombstone skipping

Use operation facts shaped like:

```text
set(replica_id, counter, key, value)
pred(from_replica_id, from_counter, to_replica_id, to_counter)
insert(replica_id, counter, parent_replica_id, parent_counter, value)
remove(replica_id, counter)
```
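
As an illustration, a snapshot evaluator for the multi-value register could look like the sketch below. It rests on an assumption that this file does not state: `pred(from, to)` is read as "operation `to` overwrites operation `from`". Under that reading, the current values are the `set` operations that never appear on the `from` side of `pred`, which is a single antijoin. The type and function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Operation id: (replica_id, counter).
pub type OpId = (u32, u64);

/// set(replica_id, counter, key, value)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SetOp {
    pub id: OpId,
    pub key: String,
    pub value: String,
}

/// pred(from_replica_id, from_counter, to_replica_id, to_counter)
/// ASSUMPTION: `to` overwrites `from`. Adjust if the project defines it otherwise.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pred {
    pub from: OpId,
    pub to: OpId,
}

/// Snapshot evaluation of:
///   overwritten(id)  :- pred(id, _).
///   mvr(key, value)  :- set(id, key, value), not overwritten(id).
pub fn mvr_snapshot(sets: &[SetOp], preds: &[Pred]) -> BTreeSet<(String, String)> {
    let overwritten: BTreeSet<OpId> = preds.iter().map(|p| p.from).collect();
    sets.iter()
        .filter(|s| !overwritten.contains(&s.id))
        .map(|s| (s.key.clone(), s.value.clone()))
        .collect()
}
```

With one initial write overwritten by two concurrent writes from different replicas, the sketch returns both concurrent values for the key. A fixture of that shape is small enough to read in a test.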

Important questions:

- Does the query require recursion, negation, or both?
- Can antijoins run earlier?
- Can causal readiness be maintained from a frontier?
- Does warm-update cost depend on history depth?
- Does the output need integration into a current view?

## Geomerge-Style Validation Experiments

The first Geomerge-style target is maintained violation detection for supported relational laws.

A useful lowering shape is:

```text
required_consequent(x) :- antecedent(x).
violation(x) :- required_consequent(x), not consequent(x).
```
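
A snapshot evaluation of this shape for a foreign-key-style law is sketched below. The `order` and `customer` relations and the `fk_violations` function are hypothetical examples, not relations defined by this project.

```rust
use std::collections::BTreeSet;

/// Hypothetical law: every order(order_id, customer_id) needs a matching
/// customer(customer_id). Returns the customer ids that violate it.
pub fn fk_violations(
    orders: &BTreeSet<(u32, u32)>,
    customers: &BTreeSet<u32>,
) -> BTreeSet<u32> {
    // required_consequent(c) :- order(_, c).
    let required: BTreeSet<u32> = orders.iter().map(|&(_, c)| c).collect();
    // violation(c) :- required_consequent(c), not customer(c).
    required.difference(customers).copied().collect()
}
```

The set difference plays the role of the antijoin. Inserting the missing `customer` fact removes the violation, which is the behavior a maintained version would have to reproduce as a negative output delta.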

Start with:

- foreign-key-style laws
- totality-as-validation laws
- equality-as-violation laws
- multi-atom antecedents without existential witnesses

Exclude at first:

- existential witness generation
- disjunctive consequents
- equality saturation
- model branching
- full chase behavior

Violation rows should carry enough context for diagnostics:

```text
law_id
violation_kind
relation_or_consequent
bound_variable_values
```
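
A possible Rust shape for such a row is sketched below. The field names follow the list above; the `ViolationKind` variants and the `Display` format are assumptions made for illustration. Deriving `Ord` and storing bindings in a `BTreeMap` keeps diagnostics in a deterministic order.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical kinds; the project may define a different set.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum ViolationKind {
    MissingConsequent,
    EqualityConflict,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Violation {
    pub law_id: String,
    pub violation_kind: ViolationKind,
    pub relation_or_consequent: String,
    /// Variable name -> rendered value, sorted by variable name.
    pub bound_variable_values: BTreeMap<String, String>,
}

impl fmt::Display for Violation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let bindings: Vec<String> = self
            .bound_variable_values
            .iter()
            .map(|(name, value)| format!("{name}={value}"))
            .collect();
        write!(
            f,
            "{} {:?} {} [{}]",
            self.law_id,
            self.violation_kind,
            self.relation_or_consequent,
            bindings.join(", ")
        )
    }
}
```

Rendering rows through `Display` gives tests a readable expected output to compare against.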

## Rust Conventions

- Prefer small modules with explicit data structures over large generic abstractions.
- Use enums and structs to model rule syntax, catalog entries, plan nodes, and execution results.
- Prefer typed identifiers for relation names, variable names, rule ids, and field positions when it improves clarity.
- Keep parser errors and unsupported-feature errors explicit.
- Avoid panics in library code except for internal invariants that tests already cover.
- Use deterministic ordering for plans and diagnostics so tests are stable.
- Prefer simple snapshot evaluators as correctness oracles before optimizing.
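
The sketch below illustrates typed identifiers and explicit unsupported-feature errors. Every name in it (`RelationName`, `RuleId`, `Feature`, `FrontendError`, `check_supported`) is hypothetical, and the supported set (negation only) is an arbitrary choice for the example.

```rust
use std::fmt;

/// Typed identifiers instead of bare strings and integers.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct RelationName(pub String);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct RuleId(pub u32);

/// Language features a rule may use. Hypothetical list for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Feature {
    Negation,
    Aggregation,
    Disjunction,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FrontendError {
    Parse { line: usize, message: String },
    Unsupported { rule: RuleId, head: RelationName, feature: Feature },
}

impl fmt::Display for FrontendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FrontendError::Parse { line, message } => {
                write!(f, "parse error at line {line}: {message}")
            }
            FrontendError::Unsupported { rule, head, feature } => write!(
                f,
                "rule {} ({}) uses unsupported feature {:?}",
                rule.0, head.0, feature
            ),
        }
    }
}

impl std::error::Error for FrontendError {}

/// Reject a rule that uses any feature outside the supported set,
/// instead of accepting it silently. Only negation is supported here.
pub fn check_supported(
    rule: RuleId,
    head: &RelationName,
    used: &[Feature],
) -> Result<(), FrontendError> {
    for &feature in used {
        if feature != Feature::Negation {
            return Err(FrontendError::Unsupported {
                rule,
                head: head.clone(),
                feature,
            });
        }
    }
    Ok(())
}
```

Returning a `Result` keeps library code free of panics and gives rejection tests a concrete error value to match.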

## Testing Expectations

Add tests for every non-trivial behavior.

Recommended test groups:

- parser acceptance and rejection
- rule catalog construction
- dependency graph and strata
- join graph construction
- structural planning
- antijoin scheduling
- SIP-style filtering
- snapshot evaluation
- maintained-output equivalence
- CRDT fixtures
- Geomerge-style violation fixtures

Tests should prefer small facts with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented.

## Required Validation

Use the repository's actual tooling. At the time of writing, the copied `Makefile` was still oriented toward a Nix playground and may not match this
project. Do not assume `make check` is meaningful until the `Makefile` is updated for this repository.

For Rust changes, prefer:

1. `cargo fmt`
2. `cargo clippy --all-targets --all-features`
3. `cargo test --all-targets --all-features`

If the project does not yet have a `Cargo.toml`, record that validation was not available.

For Markdown-only changes, run a manual read-through and check that headings follow the writing style.

## Change Design Checklist

Before coding:

1. Problem statement and target question
2. Existing module or new module decision
3. Snapshot oracle or expected output
4. Supported and unsupported feature boundary
5. Small fixture or example shape

Before submitting:

1. Formatting status
2. Test status
3. Unsupported cases documented
4. No durable references to ignored local paths
5. Notes or examples updated when behavior changes

## Review Guidelines

Review output should prioritize correctness and experiment quality.

- `P0`: must-fix defects, such as incorrect query results, invalid rollback behavior, unsupported syntax accepted silently, or tests that cannot run.
- `P1`: high-priority defects, such as nondeterministic plans, unclear unsupported-feature errors, missing snapshot oracle for a planner change, or
  misleading notes.
- `P2`: useful follow-up, such as additional fixtures, clearer diagnostics, or broader benchmark coverage.

Use this review format:

1. `Severity` (`P0`/`P1`/`P2`)
2. `File:line`
3. `Issue`
4. `Why it matters`
5. `Minimal fix direction`

## Practical Notes for Agents

- Read the relevant durable project notes before changing architecture.
- Treat copied papers, cloned repositories, and generated files in ignored local paths as reference material only.
- Prefer a planning-only prototype before backend integration.
- Prefer textual plan explanations in early tests. They make the planner easier to debug.
- Keep backend comparison fair: same rule, same input facts, same expected output.
- Keep transaction and rollback behavior explicit for validation experiments.
- When the Makefile becomes project-specific, update this file's validation section.

## Commit and PR Hygiene

- Keep commits scoped to one logical change: parser, catalog, planner, evaluator, fixture, note, or tooling.
- Do not mix broad formatting churn with semantic changes.
- PR descriptions should include:
    1. the experiment or feature being tested,
    2. the source rules or fixtures affected,
    3. the expected behavior,
    4. validation commands and results,
    5. known unsupported cases.

Suggested PR checklist:

- [ ] `cargo fmt --check` passes, if applicable
- [ ] `cargo clippy --all-targets --all-features` passes, if applicable
- [ ] `cargo test --all-targets --all-features` passes, if applicable
- [ ] Snapshot oracle or expected output included for planner behavior
- [ ] Unsupported cases documented