WIP
parent 6327c0e344
commit 8a2021ca4b
AGENTS.md | 108
@@ -4,15 +4,15 @@ This file provides guidance to coding agents collaborating on this repository.
 
 ## Mission
 
-`storage-engine-playground` is an experimental Rust project for testing ideas from the FlowLog, DBSP, CRDT-as-query, and Geomerge notes.
+`storage-engine-playground` is an experimental Rust project for prototyping query engines and storage engines.
 
 The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions:
 
-- how Datalog-like rules should be parsed, cataloged, planned, and optimized
-- how FlowLog-style planning ideas transfer to a DBSP-oriented frontend
-- how CRDT queries behave under naive plans versus planned relational execution
-- how Geomerge-style laws can compile into maintained violation relations
-- how backend behavior changes across snapshot, DBSP-like, and Differential Dataflow-like execution models
+- how a query language should be parsed, cataloged, planned, and optimized
+- how a query planner and a query executor should be separated, and what intermediate representation sits between them
+- how a query executor's operators (scans, joins, antijoins, projections) compose into a working snapshot evaluator
+- how a storage engine should expose a backend-neutral interface (relations, rows, transactions, scans), and how that interface holds up across
+different backends (in-process, file-backed, CRDT, and so on)
 
 Priorities, in order:
 
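Note on the Mission hunk: the new bullet about a backend-neutral storage interface (relations, rows, transactions, scans) may be easier to discuss with a concrete shape in hand. The following is a minimal sketch and not the repository's actual API. The names `Backend`, `Txn`, `MemBackend`, and `count_rows` are hypothetical, rows are assumed to be vectors of `i64`, and transactions are assumed to be insert-only.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A row is a tuple of integer columns (an assumption that keeps the sketch small).
pub type Row = Vec<i64>;

/// Hypothetical backend-neutral interface: named relations, scans, transactions.
pub trait Backend {
    /// Returns every committed row of `relation` in sorted order.
    fn scan(&self, relation: &str) -> Vec<Row>;
    /// Applies all inserts staged in `txn`.
    fn commit(&mut self, txn: Txn);
}

/// A transaction only stages inserts; dropping it without `commit` is a rollback.
#[derive(Default)]
pub struct Txn {
    staged: Vec<(String, Row)>,
}

impl Txn {
    /// Stages one row for insertion into `relation`.
    pub fn insert(&mut self, relation: &str, row: Row) {
        self.staged.push((relation.to_string(), row));
    }
}

/// In-process backend: each relation is a sorted, de-duplicated set of rows.
#[derive(Default)]
pub struct MemBackend {
    relations: BTreeMap<String, BTreeSet<Row>>,
}

impl Backend for MemBackend {
    fn scan(&self, relation: &str) -> Vec<Row> {
        self.relations
            .get(relation)
            .map(|rows| rows.iter().cloned().collect())
            .unwrap_or_default()
    }

    fn commit(&mut self, txn: Txn) {
        for (relation, row) in txn.staged {
            self.relations.entry(relation).or_default().insert(row);
        }
    }
}

/// Code written against the trait works with any backend that implements it.
pub fn count_rows(backend: &dyn Backend, relation: &str) -> usize {
    backend.scan(relation).len()
}
```

A file-backed or CRDT adapter would implement the same trait, and an adapter-parity test could then run one scenario against each implementation and compare the `scan` results.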
@@ -44,19 +44,23 @@ Priorities, in order:
 
 ## Repository Layout
 
-The repository is new and may change. Discover the current layout from the filesystem before editing.
+Discover the current layout from the filesystem before editing.
+The shape today is:
 
-Expected durable areas may include:
+- `crates/`: Rust workspace.
+See [`crates/README.md`](crates/README.md) for the responsibilities and dependency edges between the four crates (`storage`, `query-ops`,
+`plan-runner`, `geomerge-demo`).
+Each crate keeps its own `src/`, `tests/`, and (where relevant) `fixtures/`, `benches/`, and `docs/diagrams/` subdirectories.
+- `tools/exporter/`: Haskell tool that consumes hand-authored `.scenario.json` files in `tools/exporter/examples/` and emits the runner-IR JSON
+consumed by `crates/plan-runner`.
+See [`tools/exporter/README.md`](tools/exporter/README.md).
+- `external/`: git submodules.
+`external/geolog` provides the Haskell query planner used by the exporter; `external/geomerge` is the Rust CRDT crate consumed by
+`storage::adapters::geomerge`.
+- Top-level configuration: `Makefile`, `flake.nix`, `Cargo.toml` (workspace), `pyproject.toml`, `.pre-commit-config.yaml`, `rust-toolchain.toml`.
 
-- `src/`: Rust source for parser, catalog, planner, execution experiments, and storage prototypes.
-- `tests/`: integration tests for rule planning, evaluation, and storage behavior.
-- `tools/exporter/examples/`: hand-authored scenario JSON consumed by the Haskell exporter to produce runner fixtures.
-- `fixtures/`: committed input facts and expected outputs.
-- `notes/`: local design notes that belong to this project.
-- `flowlog/`: project-local notes or sketches derived from the FlowLog line of work.
-
-Do not assume this list is exhaustive. If the project grows a different structure, follow the actual codebase and update this file when conventions
-stabilize.
+Do not assume this list is exhaustive.
+If the project grows a different structure, follow the actual codebase and update this file when conventions stabilize.
 
 ## Technical Direction
 
@@ -70,15 +74,15 @@ Datalog-like rules or Geolog-shaped laws
 -> relational plan
 -> FlowLog-style optimization
 -> backend lowering
--> maintained or snapshot outputs
+-> snapshot outputs
 ```
 
 Keep these layers explicit:
 
-- **Source Layer**: Datalog-like test programs, CRDT query definitions, and Geomerge-style laws.
+- **Source Layer**: Datalog-like test programs and Geomerge-style laws.
 - **Catalog Layer**: rule heads, body atoms, variables, constants, comparisons, negation, and projections.
 - **Planning Layer**: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice.
-- **Execution Layer**: snapshot evaluator first, then DBSP-like or Differential Dataflow-like experiments.
+- **Execution Layer**: snapshot evaluator.
 - **Storage Layer**: facts, transactions, rollback, preview state, and violation output integration.
 
 ## FlowLog-Inspired Planning
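Note on the layer list: with the Execution Layer narrowed to a snapshot evaluator, the Planning Layer's join and antijoin choices can be shown over plain in-memory rows. The sketch below is illustrative only. The rule, the relation names `r`, `s`, and `blocked`, and the helpers `join`, `antijoin`, `project`, and `eval_out` are all hypothetical, and rows are assumed to be vectors of `i64`.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

type Row = Vec<i64>;

/// Hash join: pairs each left row with the right rows where left[lk] == right[rk]
/// and emits the concatenated row.
fn join(left: &[Row], lk: usize, right: &[Row], rk: usize) -> Vec<Row> {
    let mut index: HashMap<i64, Vec<&Row>> = HashMap::new();
    for r in right {
        index.entry(r[rk]).or_default().push(r);
    }
    let mut out = Vec::new();
    for l in left {
        if let Some(matches) = index.get(&l[lk]) {
            for r in matches {
                let mut row = l.clone();
                row.extend(r.iter().copied());
                out.push(row);
            }
        }
    }
    out
}

/// Antijoin: keeps left rows whose column `lk` has no match in column `rk` of `right`.
fn antijoin(left: &[Row], lk: usize, right: &[Row], rk: usize) -> Vec<Row> {
    let keys: HashSet<i64> = right.iter().map(|r| r[rk]).collect();
    left.iter()
        .filter(|l| !keys.contains(&l[lk]))
        .cloned()
        .collect()
}

/// Projection: keeps the listed columns and de-duplicates (set semantics).
fn project(rows: &[Row], cols: &[usize]) -> BTreeSet<Row> {
    rows.iter()
        .map(|r| cols.iter().map(|&c| r[c]).collect::<Row>())
        .collect()
}

/// Snapshot evaluation of the hypothetical rule
///   out(x, z) :- r(x, y), s(y, z), not blocked(x).
/// The antijoin is placed before the join, so blocked rows never reach the join.
fn eval_out(r: &[Row], s: &[Row], blocked: &[Row]) -> BTreeSet<Row> {
    let unblocked = antijoin(r, 0, blocked, 0);
    let joined = join(&unblocked, 1, s, 0); // columns: x, y, y, z
    project(&joined, &[0, 3])
}
```

Swapping the order of `antijoin` and `join` inside `eval_out` yields the same output set for this rule, which is the kind of plan-equivalence property a small planning test can pin down.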
@@ -106,60 +110,6 @@ rule with three positive atoms
 -> expected textual plan
 ```
 
-## DBSP and Incremental Execution
-
-DBSP-related work should preserve a clean boundary:
-
-```text
-planned relational IR
--> DBSP lowering
--> maintained output deltas
-```
-
-Do not make DBSP responsible for source-language semantics. The frontend should check supported syntax, stratification, and rule shape before backend
-lowering.
-
-For each DBSP-like experiment, also provide a snapshot oracle when feasible:
-
-```text
-snapshot result == maintained result after each update
-```
-
-Track these measurements when relevant:
-
-- hydration time
-- warm-update time
-- output delta size
-- maintained state size if available
-- sensitivity to join order
-- sensitivity to causal-history depth
-
-## CRDT Query Experiments
-
-Initial CRDT workloads should stay small and explicit:
-
-- multi-value register
-- causal readiness over `pred`
-- list next-element traversal
-- tombstone skipping
-
-Use operation facts shaped like:
-
-```text
-set(replica_id, counter, key, value)
-pred(from_replica_id, from_counter, to_replica_id, to_counter)
-insert(replica_id, counter, parent_replica_id, parent_counter, value)
-remove(replica_id, counter)
-```
-
-Important questions:
-
-- Does the query require recursion, negation, or both?
-- Can antijoins run earlier?
-- Can causal readiness be maintained from a frontier?
-- Does warm-update cost depend on history depth?
-- Does the output need integration into a current view?
-
 ## Geomerge-Style Validation Experiments
 
 The first Geomerge-style target is maintained violation detection for supported relational laws.
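Note on the removed DBSP section: the oracle it described, `snapshot result == maintained result after each update`, is a pattern that can be kept in a test helper even without a DBSP backend. The sketch below does not use DBSP. It hand-maintains one insert-only join and checks it against a from-scratch recomputation. The rule and the names `snapshot`, `Update`, `Maintained`, and `check_equivalence` are hypothetical.

```rust
use std::collections::BTreeSet;

type Pair = (i64, i64);

/// Snapshot oracle: recomputes out(a, c) :- r(a, b), s(b, c) from scratch.
fn snapshot(r: &BTreeSet<Pair>, s: &BTreeSet<Pair>) -> BTreeSet<Pair> {
    let mut out = BTreeSet::new();
    for &(a, b) in r {
        for &(b2, c) in s {
            if b == b2 {
                out.insert((a, c));
            }
        }
    }
    out
}

/// One insert-only update to an input relation (hypothetical update shape).
enum Update {
    R(Pair),
    S(Pair),
}

/// Maintained state: both inputs plus the current output.
#[derive(Default)]
struct Maintained {
    r: BTreeSet<Pair>,
    s: BTreeSet<Pair>,
    out: BTreeSet<Pair>,
}

impl Maintained {
    /// Applies one update and returns the output delta (newly derived rows only).
    fn apply(&mut self, update: Update) -> Vec<Pair> {
        let mut candidates: Vec<Pair> = Vec::new();
        match update {
            Update::R((a, b)) => {
                if self.r.insert((a, b)) {
                    // Join only the new r row against the existing s rows.
                    for &(b2, c) in &self.s {
                        if b2 == b {
                            candidates.push((a, c));
                        }
                    }
                }
            }
            Update::S((b, c)) => {
                if self.s.insert((b, c)) {
                    // Join only the new s row against the existing r rows.
                    for &(a, b2) in &self.r {
                        if b2 == b {
                            candidates.push((a, c));
                        }
                    }
                }
            }
        }
        // Keep only rows that were not already in the maintained output.
        candidates
            .into_iter()
            .filter(|row| self.out.insert(*row))
            .collect()
    }
}

/// Replays the updates, asserts the oracle after each one,
/// and returns the size of each output delta.
fn check_equivalence(updates: Vec<Update>) -> Vec<usize> {
    let mut state = Maintained::default();
    let mut delta_sizes = Vec::new();
    for update in updates {
        let delta = state.apply(update);
        assert_eq!(snapshot(&state.r, &state.s), state.out);
        delta_sizes.push(delta.len());
    }
    delta_sizes
}
```

The returned delta sizes correspond to the "output delta size" measurement from the removed list. Deletions are deliberately out of scope here, since they would need derivation counts rather than a plain set.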
@@ -219,8 +169,7 @@ Recommended test groups:
 - antijoin scheduling
 - SIP-style filtering
 - snapshot evaluation
-- maintained-output equivalence
-- CRDT fixtures
+- storage-backend adapter parity (in-process, file-backed, and CRDT)
 - Geomerge-style violation fixtures
 
 Tests should prefer small facts with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented.
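Note on "small facts with readable expected outputs": the `set` and `pred` fact shapes from the removed CRDT section make a compact fixture of that kind. The sketch below evaluates a multi-value register over them. The rule it uses is an assumption: a `set` operation is treated as visible unless some `pred` fact names it as the `from` side. The name `mvr` and the tuple encodings are hypothetical.

```rust
use std::collections::{BTreeSet, HashSet};

/// Operation id: (replica_id, counter).
type OpId = (u32, u32);

/// set(replica_id, counter, key, value), with the id folded into one tuple.
type SetFact = (OpId, &'static str, &'static str);

/// pred(from_replica_id, from_counter, to_replica_id, to_counter) as (from, to).
type PredFact = (OpId, OpId);

/// Assumed rule: mvr(key, value) :- set(id, key, value), not pred(id, _).
/// This is an antijoin of `set` against the `from` column of `pred`.
fn mvr(sets: &[SetFact], preds: &[PredFact]) -> BTreeSet<(&'static str, &'static str)> {
    let overwritten: HashSet<OpId> = preds.iter().map(|&(from, _to)| from).collect();
    sets.iter()
        .filter(|(id, _, _)| !overwritten.contains(id))
        .map(|&(_, key, value)| (key, value))
        .collect()
}
```

With three facts, where replica 1 writes `a` and then `c` and replica 2 concurrently writes `b`, the register for `x` holds `b` and `c`. Adding a fourth write that lists both survivors as predecessors collapses it to a single value.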
@@ -239,6 +188,13 @@ For Rust changes, prefer:
 These map to `cargo fmt --all --check`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-targets --all-features`.
 If the project does not yet have a `Cargo.toml`, `make check` should still pass by skipping Rust-specific checks.
 
+For changes that touch the cross-language pipeline (Haskell exporter and Rust runner), also run:
+
+1. `make export-fixtures`: rebuilds `crates/plan-runner/fixtures/*.json` from `tools/exporter/examples/*.scenario.json` using the Haskell exporter.
+Requires the Nix dev shell (`make shell` or `nix develop`) so GHC and Cabal are available.
+2. `make examples`: runs `export-fixtures` and then `cargo test -p plan-runner --test examples`, which walks every regenerated fixture and verifies it
+against its `expected_bindings` oracle.
+
 For Markdown-only changes, run a manual read-through and check that headings follow the writing style.
 
 ## Change Design Checklist