habedi-work/storage-engine-playground


Hassan Abedi 115b3ff6f9 WIP

2026-06-12 12:42:46 +02:00


AGENTS.md

This file provides guidance to coding agents collaborating on this repository.

Mission

storage-engine-playground is an experimental Rust project for prototyping query engines and storage engines.

The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions:

how a query language should be parsed, cataloged, planned, and optimized
how a query planner and a query executor should be separated, and what intermediate representation sits between them
how a query executor's operators (scans, joins, antijoins, projections) compose into a working snapshot evaluator
how a storage engine should expose a backend-neutral interface (relations, rows, transactions, scans), and how that interface holds up across different backends (in-process, file-backed, CRDT, and so on)

Priorities, in order:

Correctness: prototypes must have clear expected outputs and tests.
Clarity: each module and test should answer one research or engineering question.
Small scope: prefer narrow experiments over broad engine rewrites.
Explainability: planners should emit inspectable plans, not only executable structures.
Reproducibility: examples should use committed fixtures, deterministic tests, and documented commands.

Core Rules

Use English for code, comments, tests, and prose.
Treat ignored local reference material as source material only. Do not import copied code into durable modules without an explicit decision.
Prefer implementing small vertical slices: parse a subset, build a catalog, plan one rule shape, and test it.
Do not build a full Datalog engine before the planning layer is useful and tested.
Keep source language, relational planning, and backend execution separated.
Prefer backend-neutral intermediate structures until a specific backend API requires specialization.
Add comments only when they explain non-obvious planning, recursion, delta, or storage behavior.
Treat tests and fixtures as part of the design, not as afterthoughts.

Writing Style

Use Oxford commas in inline lists: "a, b, and c" not "a, b, c".
Do not use em dashes. Restructure the sentence, or use a colon or semicolon instead.
Avoid colorful adjectives and adverbs. Write "storage engine" not "lightweight storage engine", "planner" not "clever planner".
Use noun phrases for checklist items, not imperative verbs. Write "rule catalog construction" not "construct rule catalogs".
Headings in Markdown files must be in title case: "Query Planning" not "Query planning". Minor words (a, an, the, and, but, or, for, in, on, at, to, by, of) stay lowercase unless they are the first word.

Repository Layout

Discover the current layout from the filesystem before editing. The shape today is:

crates/: Rust workspace. See crates/README.md for the responsibilities and dependency edges between the four crates (storage, query-ops, plan-runner, geomerge-demo). Each crate keeps its own src/, tests/, and (where relevant) fixtures/, benches/, and docs/diagrams/ subdirectories.
tools/exporter/: Haskell tool that consumes hand-authored .scenario.json files in tools/exporter/examples/ and emits the runner-IR JSON consumed by crates/plan-runner. See tools/exporter/README.md.
tools/plan-viewer/: static HTML viewer for plan-runner fixtures. It evaluates a fixture in the browser and renders the plan DAG, per-node relations, input facts, and oracle comparison. See tools/plan-viewer/README.md.
external/: git submodules. external/geolog provides the Haskell query planner used by the exporter; external/geomerge is the Rust CRDT crate consumed by storage::adapters::geomerge.
Top-level configuration: Makefile, flake.nix, Cargo.toml (workspace), pyproject.toml, .pre-commit-config.yaml, rust-toolchain.toml.

Do not assume this list is exhaustive. If the project grows a different structure, follow the actual codebase and update this file when conventions stabilize.

Technical Direction

The main experimental architecture is:

Datalog-like rules or Geolog-shaped laws
-> dependency analysis and strata
-> rule catalog
-> join graph
-> relational plan
-> FlowLog-style optimization
-> backend lowering
-> snapshot outputs

Keep these layers explicit:

Source Layer: Datalog-like test programs and Geomerge-style laws.
Catalog Layer: rule heads, body atoms, variables, constants, comparisons, negation, and projections.
Planning Layer: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice.
Execution Layer: snapshot evaluator.
Storage Layer: facts, transactions, rollback, preview state, and violation output integration.
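
The Storage Layer responsibilities above (facts, transactions, rollback, and preview state) can be sketched as a small trait with one in-process implementation. This is a sketch under stated assumptions, not the API of the storage crate: the names `Backend`, `MemoryBackend`, `Row`, and every method name are hypothetical, and rows are plain string vectors for brevity.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical row type: a vector of string values.
pub type Row = Vec<String>;

/// Hypothetical backend-neutral interface: relations, rows, transactions, and scans.
pub trait Backend {
    fn begin(&mut self);
    fn insert(&mut self, relation: &str, row: Row);
    fn commit(&mut self);
    fn rollback(&mut self);
    /// Committed rows only, in sorted order.
    fn scan(&self, relation: &str) -> Vec<Row>;
    /// Committed rows plus rows pending in the open transaction, in sorted order.
    fn preview(&self, relation: &str) -> Vec<Row>;
}

/// In-process backend. BTree collections keep scan order deterministic.
#[derive(Default)]
pub struct MemoryBackend {
    committed: BTreeMap<String, BTreeSet<Row>>,
    pending: Vec<(String, Row)>,
}

impl Backend for MemoryBackend {
    fn begin(&mut self) {
        self.pending.clear();
    }

    fn insert(&mut self, relation: &str, row: Row) {
        self.pending.push((relation.to_string(), row));
    }

    fn commit(&mut self) {
        // Move every pending row into the committed state.
        for (relation, row) in self.pending.drain(..) {
            self.committed.entry(relation).or_default().insert(row);
        }
    }

    fn rollback(&mut self) {
        // Discard pending rows; committed state is untouched.
        self.pending.clear();
    }

    fn scan(&self, relation: &str) -> Vec<Row> {
        self.committed
            .get(relation)
            .map(|rows| rows.iter().cloned().collect())
            .unwrap_or_default()
    }

    fn preview(&self, relation: &str) -> Vec<Row> {
        let mut rows: BTreeSet<Row> = self.committed.get(relation).cloned().unwrap_or_default();
        for (rel, row) in &self.pending {
            if rel.as_str() == relation {
                rows.insert(row.clone());
            }
        }
        rows.into_iter().collect()
    }
}
```

A file-backed or CRDT adapter would implement the same trait, so one parity test could run the same inserts and scans against each backend and compare the outputs.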

FlowLog-Inspired Planning

FlowLog should be treated as a planning reference, not as an automatic dependency.

Reusable ideas:

rule catalog construction
dependency graph and stratification
per-rule join graph extraction
width-oriented structural planning
sideways information passing
antijoin scheduling
physical key and payload selection
shared subplan detection
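
As one illustration of the "dependency graph and stratification" item, the sketch below assigns strata by fixpoint iteration over dependency edges. It is a minimal sketch, not FlowLog's algorithm or this repository's code: the `Dep` type and the `stratify` function are hypothetical names, and the error is a plain string for brevity.

```rust
use std::collections::BTreeMap;

/// One dependency edge: `head` depends on `body`.
/// `negated` marks a dependency that appears under `not`.
pub struct Dep<'a> {
    pub head: &'a str,
    pub body: &'a str,
    pub negated: bool,
}

/// Assigns a stratum to every relation, or reports recursion through negation.
pub fn stratify(deps: &[Dep<'_>]) -> Result<BTreeMap<String, usize>, String> {
    let mut strata: BTreeMap<String, usize> = BTreeMap::new();
    for d in deps {
        strata.entry(d.head.to_string()).or_insert(0);
        strata.entry(d.body.to_string()).or_insert(0);
    }
    // A stratifiable program never needs more strata than it has relations.
    let limit = strata.len();
    let mut changed = true;
    while changed {
        changed = false;
        for d in deps {
            // A positive dependency needs the same stratum or higher;
            // a negated dependency needs a strictly higher stratum.
            let needed = strata[d.body] + usize::from(d.negated);
            if strata[d.head] < needed {
                if needed >= limit {
                    return Err(format!("negation cycle through {}", d.head));
                }
                strata.insert(d.head.to_string(), needed);
                changed = true;
            }
        }
    }
    Ok(strata)
}
```

Returning a `BTreeMap` keeps the reported strata in a deterministic order, which keeps textual expectations in tests stable.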

When adapting an idea, write the smallest test that demonstrates the behavior. For example:

rule with three positive atoms
-> catalog variables
-> join graph
-> planned join tree
-> expected textual plan
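
The example shape above can be sketched end to end for a rule body with three positive atoms. This is a sketch under assumptions, not the planner in this repository: `Atom`, `atom`, `catalog_variables`, `join_graph`, and `plan_text` are hypothetical names, the join order is a greedy left-deep order rather than a width-oriented plan, and the textual plan syntax is invented for the test.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Atom {
    pub relation: String,
    pub vars: Vec<String>,
}

/// Convenience constructor for tests.
pub fn atom(relation: &str, vars: &[&str]) -> Atom {
    Atom {
        relation: relation.to_string(),
        vars: vars.iter().map(|v| v.to_string()).collect(),
    }
}

/// Catalog step: map each variable to the indices of the atoms that mention it.
pub fn catalog_variables(body: &[Atom]) -> BTreeMap<String, BTreeSet<usize>> {
    let mut out: BTreeMap<String, BTreeSet<usize>> = BTreeMap::new();
    for (i, a) in body.iter().enumerate() {
        for v in &a.vars {
            out.entry(v.clone()).or_default().insert(i);
        }
    }
    out
}

/// Join graph step: one edge (i, j, shared variables) per pair of atoms that share a variable.
pub fn join_graph(body: &[Atom]) -> Vec<(usize, usize, Vec<String>)> {
    let mut edges = Vec::new();
    for i in 0..body.len() {
        for j in (i + 1)..body.len() {
            let shared: Vec<String> = body[i]
                .vars
                .iter()
                .filter(|v| body[j].vars.contains(*v))
                .cloned()
                .collect();
            if !shared.is_empty() {
                edges.push((i, j, shared));
            }
        }
    }
    edges
}

/// Planning step: a left-deep join tree rendered as text. Starts from atom 0 and
/// repeatedly picks the lowest-index remaining atom that shares a bound variable.
/// If no remaining atom shares one, the lowest-index atom is joined with no keys.
pub fn plan_text(body: &[Atom]) -> String {
    if body.is_empty() {
        return String::new();
    }
    let mut remaining: Vec<usize> = (1..body.len()).collect();
    let mut bound: BTreeSet<String> = body[0].vars.iter().cloned().collect();
    let mut text = format!("scan {}({})", body[0].relation, body[0].vars.join(", "));
    while !remaining.is_empty() {
        let pos = remaining
            .iter()
            .position(|&i| body[i].vars.iter().any(|v| bound.contains(v)))
            .unwrap_or(0);
        let i = remaining.remove(pos);
        let keys: Vec<String> = body[i]
            .vars
            .iter()
            .filter(|v| bound.contains(*v))
            .cloned()
            .collect();
        text = format!(
            "join[{}]({}, scan {}({}))",
            keys.join(", "),
            text,
            body[i].relation,
            body[i].vars.join(", ")
        );
        bound.extend(body[i].vars.iter().cloned());
    }
    text
}
```

For the body `a(x, y), b(y, z), c(z, w)`, a test can compare the output of `plan_text` against one expected string, which makes a change in join order visible as a plain text diff.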

Geomerge-Style Validation Experiments

The first Geomerge-style target is maintained violation detection for supported relational laws.

A useful lowering shape is:

required_consequent(x) :- antecedent(x).
violation(x) :- required_consequent(x), not consequent(x).
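
Over a single snapshot, the two rules above reduce to a copy followed by an antijoin. The sketch below evaluates them with set difference; it is a minimal snapshot-oracle sketch, and the `Tuple` alias and both function names are assumptions chosen to mirror the rule heads.

```rust
use std::collections::BTreeSet;

/// Hypothetical tuple type: bound values for the variables of the law.
pub type Tuple = Vec<String>;

/// required_consequent(x) :- antecedent(x).
pub fn required_consequent(antecedent: &BTreeSet<Tuple>) -> BTreeSet<Tuple> {
    antecedent.clone()
}

/// violation(x) :- required_consequent(x), not consequent(x).
/// Both sides range over the same variables here, so the antijoin is a set difference.
pub fn violation(required: &BTreeSet<Tuple>, consequent: &BTreeSet<Tuple>) -> BTreeSet<Tuple> {
    required.difference(consequent).cloned().collect()
}
```

For a foreign-key-style law, `antecedent` would hold the referencing keys and `consequent` the keys that exist in the referenced relation; every key left in the result is a violation.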

Start with:

foreign-key-style laws
totality-as-validation laws
equality-as-violation laws
multi-atom antecedents without existential witnesses

Exclude at first:

existential witness generation
disjunctive consequents
equality saturation
model branching
full chase behavior

Violation rows should carry enough context for diagnostics:

law_id
violation_kind
relation_or_consequent
bound_variable_values
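
One possible Rust shape for such a row is sketched below. The four field names come from the list above; the `ViolationKind` variants, the `render` method, and its output format are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical violation kinds; the actual set depends on the supported laws.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum ViolationKind {
    MissingConsequent,
    EqualityViolated,
}

/// A violation row carrying the context needed for diagnostics.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct ViolationRow {
    pub law_id: String,
    pub violation_kind: ViolationKind,
    pub relation_or_consequent: String,
    /// A BTreeMap keeps variable bindings in a deterministic order.
    pub bound_variable_values: BTreeMap<String, String>,
}

impl ViolationRow {
    /// Renders the row as one line of diagnostic text.
    pub fn render(&self) -> String {
        let bindings: Vec<String> = self
            .bound_variable_values
            .iter()
            .map(|(k, v)| format!("{}={}", k, v))
            .collect();
        format!(
            "[{}] {:?} on {}: {}",
            self.law_id,
            self.violation_kind,
            self.relation_or_consequent,
            bindings.join(", ")
        )
    }
}
```

Deriving `Ord` lets a test sort violation rows before comparing them with an expected list.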

Rust Conventions

Prefer small modules with explicit data structures over large generic abstractions.
Use enums and structs to model rule syntax, catalog entries, plan nodes, and execution results.
Prefer typed identifiers for relation names, variable names, rule ids, and field positions when it improves clarity.
Keep parser errors and unsupported-feature errors explicit.
Avoid panics in library code except for internal invariants that tests already cover.
Use deterministic ordering for plans and diagnostics so tests are stable.
Prefer simple snapshot evaluators as correctness oracles before optimizing.
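
Several of these conventions can be shown together: typed identifiers, enums for rule syntax, an explicit unsupported-feature error, and deterministic ordering. The sketch below is illustrative only; every type and function name is hypothetical, and aggregates stand in for any feature a prototype does not support.

```rust
use std::collections::BTreeSet;
use std::fmt;

/// Typed identifiers instead of bare strings.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct RelationName(pub String);

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct VarName(pub String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Term {
    Var(VarName),
    Const(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Literal {
    Positive { relation: RelationName, terms: Vec<Term> },
    Negative { relation: RelationName, terms: Vec<Term> },
    /// Parsed but not supported by this sketch.
    Aggregate { function: String },
}

/// Explicit error type: unsupported input is rejected, not accepted silently.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CatalogError {
    UnsupportedFeature { feature: String },
}

impl fmt::Display for CatalogError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CatalogError::UnsupportedFeature { feature } => {
                write!(f, "unsupported feature: {}", feature)
            }
        }
    }
}

fn collect_vars(terms: &[Term], out: &mut BTreeSet<VarName>) {
    for term in terms {
        if let Term::Var(v) = term {
            out.insert(v.clone());
        }
    }
}

/// Returns the variables of a rule body in sorted order, or an explicit error.
pub fn body_variables(body: &[Literal]) -> Result<Vec<VarName>, CatalogError> {
    let mut vars = BTreeSet::new();
    for literal in body {
        match literal {
            Literal::Positive { terms, .. } | Literal::Negative { terms, .. } => {
                collect_vars(terms, &mut vars)
            }
            Literal::Aggregate { function } => {
                return Err(CatalogError::UnsupportedFeature {
                    feature: format!("aggregate {}", function),
                })
            }
        }
    }
    Ok(vars.into_iter().collect())
}
```

The function returns a `Result` and never panics, and the `BTreeSet` fixes the output order regardless of the order in which the body mentions its variables.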

Testing Expectations

Add tests for every non-trivial behavior.

Recommended test groups:

parser acceptance and rejection
rule catalog construction
dependency graph and strata
join graph construction
structural planning
antijoin scheduling
SIP-style filtering
snapshot evaluation
storage-backend adapter parity (in-process, file-backed, and CRDT)
Geomerge-style violation fixtures

Tests should prefer small fact sets with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented.

Required Validation

Use the repository's actual tooling. The Makefile wraps the standard Rust commands and skips Rust checks with a clear message if no Cargo.toml exists yet.

For Rust changes, prefer:

make format-check
make lint
make test

These map to cargo fmt --all --check, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-targets --all-features. If the project does not yet have a Cargo.toml, make check should still pass by skipping Rust-specific checks.

For changes that touch the cross-language pipeline (Haskell exporter and Rust runner), also run:

make export-fixtures: rebuilds crates/plan-runner/fixtures/*.json from tools/exporter/examples/*.scenario.json using the Haskell exporter. Requires the Nix dev shell (make shell or nix develop) so GHC and Cabal are available.
make examples: runs export-fixtures and then cargo test -p plan-runner --test examples, which walks every regenerated fixture and verifies it against its expected_bindings oracle.

For changes that touch tools/plan-viewer, run make viewer-test. It checks the viewer's JavaScript engine against every fixture oracle under Node; the Rust crates remain the correctness oracle.

For Markdown-only changes, read the changed files through manually and check that headings follow the title-case rule in Writing Style.

Change Design Checklist

Before coding:

Problem statement and target question
Existing module or new module decision
Snapshot oracle or expected output
Supported and unsupported feature boundary
Small fixture or example shape

Before submitting:

Formatting status
Test status
Unsupported cases documented
No durable references to ignored local paths
Notes or examples updated when behavior changes

Review Guidelines

Review output should prioritize correctness and experiment quality.

P0: must-fix defects, such as incorrect query results, invalid rollback behavior, unsupported syntax accepted silently, or tests that cannot run.
P1: high-priority defects, such as nondeterministic plans, unclear unsupported-feature errors, a missing snapshot oracle for a planner change, or misleading notes.
P2: useful follow-up, such as additional fixtures, clearer diagnostics, or broader benchmark coverage.

Use this review format:

Severity (P0/P1/P2)
File:line
Issue
Why it matters
Minimal fix direction

Practical Notes for Agents

Read the relevant durable project notes before changing architecture.
Treat copied papers, cloned repositories, and generated files in ignored local paths as reference material only.
Prefer a planning-only prototype before backend integration.
Prefer textual plan explanations in early tests. They make the planner easier to debug.
Keep backend comparison fair: same rule, same input facts, same expected output.
Keep transaction and rollback behavior explicit for validation experiments.
Keep project tooling aligned with this file when new commands or checks are added.

Commit and PR Hygiene

Keep commits scoped to one logical change: parser, catalog, planner, evaluator, fixture, note, or tooling.
Do not mix broad formatting churn with semantic changes.
PR descriptions should include:
1. the experiment or feature being tested,
2. the source rules or fixtures affected,
3. the expected behavior,
4. validation commands and results,
5. known unsupported cases.

Suggested PR checklist:

make format-check passes, if applicable
make lint passes, if applicable
make test passes, if applicable
Snapshot oracle or expected output included for planner behavior
Unsupported cases documented
