Compare commits

...

3 Commits

Author SHA1 Message Date
Hassan Abedi
362d5d1917 Add a Makefile target for plan viewer 2026-06-11 17:00:58 +02:00
Hassan Abedi
6327920bf7 Add a simple UI for visualizing the plans 2026-06-11 17:00:58 +02:00
Hassan Abedi
b38b176e7f Add a few emojies to README files 2026-06-11 17:00:58 +02:00
9 changed files with 84 additions and 82 deletions


@ -8,7 +8,7 @@ indent_size = 4
insert_final_newline = true insert_final_newline = true
trim_trailing_whitespace = true trim_trailing_whitespace = true
[*.{rs,hs,py}] [*.{rs,hs,py,js}]
max_line_length = 100 max_line_length = 100
[*.md] [*.md]
@ -20,3 +20,6 @@ indent_size = 2
[*.{yaml,yml,json}] [*.{yaml,yml,json}]
indent_size = 2 indent_size = 2
[*.{css,html}]
indent_size = 2

111
AGENTS.md

@ -4,15 +4,15 @@ This file provides guidance to coding agents collaborating on this repository.
## Mission ## Mission
`storage-engine-playground` is an experimental Rust project for testing ideas from the FlowLog, DBSP, CRDT-as-query, and Geomerge notes. `storage-engine-playground` is an experimental Rust project for prototyping query engines and storage engines.
The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions: The goal is not production software. The goal is a clear, runnable playground for small prototypes that help answer concrete architecture questions:
- how Datalog-like rules should be parsed, cataloged, planned, and optimized - how a query language should be parsed, cataloged, planned, and optimized
- how FlowLog-style planning ideas transfer to a DBSP-oriented frontend - how a query planner and a query executor should be separated, and what intermediate representation sits between them
- how CRDT queries behave under naive plans versus planned relational execution - how a query executor's operators (scans, joins, antijoins, projections) compose into a working snapshot evaluator
- how Geomerge-style laws can compile into maintained violation relations - how a storage engine should expose a backend-neutral interface (relations, rows, transactions, scans), and how that interface holds up across
- how backend behavior changes across snapshot, DBSP-like, and Differential Dataflow-like execution models different backends (in-process, file-backed, CRDT, and so on)
Priorities, in order: Priorities, in order:
@ -44,19 +44,26 @@ Priorities, in order:
## Repository Layout ## Repository Layout
The repository is new and may change. Discover the current layout from the filesystem before editing. Discover the current layout from the filesystem before editing.
The shape today is:
Expected durable areas may include: - `crates/`: Rust workspace.
See [`crates/README.md`](crates/README.md) for the responsibilities and dependency edges between the four crates (`storage`, `query-ops`,
`plan-runner`, `geomerge-demo`).
Each crate keeps its own `src/`, `tests/`, and (where relevant) `fixtures/`, `benches/`, and `docs/diagrams/` subdirectories.
- `tools/exporter/`: Haskell tool that consumes hand-authored `.scenario.json` files in `tools/exporter/examples/` and emits the runner-IR JSON
consumed by `crates/plan-runner`.
See [`tools/exporter/README.md`](tools/exporter/README.md).
- `tools/plan-viewer/`: static HTML viewer for `plan-runner` fixtures.
It evaluates a fixture in the browser and renders the plan DAG, per-node relations, input facts, and oracle comparison.
See [`tools/plan-viewer/README.md`](tools/plan-viewer/README.md).
- `external/`: git submodules.
`external/geolog` provides the Haskell query planner used by the exporter; `external/geomerge` is the Rust CRDT crate consumed by
`storage::adapters::geomerge`.
- Top-level configuration: `Makefile`, `flake.nix`, `Cargo.toml` (workspace), `pyproject.toml`, `.pre-commit-config.yaml`, `rust-toolchain.toml`.
- `src/`: Rust source for parser, catalog, planner, execution experiments, and storage prototypes. Do not assume this list is exhaustive.
- `tests/`: integration tests for rule planning, evaluation, and storage behavior. If the project grows a different structure, follow the actual codebase and update this file when conventions stabilize.
- `tools/exporter/examples/`: hand-authored scenario JSON consumed by the Haskell exporter to produce runner fixtures.
- `fixtures/`: committed input facts and expected outputs.
- `notes/`: local design notes that belong to this project.
- `flowlog/`: project-local notes or sketches derived from the FlowLog line of work.
Do not assume this list is exhaustive. If the project grows a different structure, follow the actual codebase and update this file when conventions
stabilize.
## Technical Direction ## Technical Direction
@ -70,15 +77,15 @@ Datalog-like rules or Geolog-shaped laws
-> relational plan -> relational plan
-> FlowLog-style optimization -> FlowLog-style optimization
-> backend lowering -> backend lowering
-> maintained or snapshot outputs -> snapshot outputs
``` ```
Keep these layers explicit: Keep these layers explicit:
- **Source Layer**: Datalog-like test programs, CRDT query definitions, and Geomerge-style laws. - **Source Layer**: Datalog-like test programs and Geomerge-style laws.
- **Catalog Layer**: rule heads, body atoms, variables, constants, comparisons, negation, and projections. - **Catalog Layer**: rule heads, body atoms, variables, constants, comparisons, negation, and projections.
- **Planning Layer**: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice. - **Planning Layer**: join graphs, join order, antijoin placement, SIP-style filtering, subplan sharing, and physical key choice.
- **Execution Layer**: snapshot evaluator first, then DBSP-like or Differential Dataflow-like experiments. - **Execution Layer**: snapshot evaluator.
- **Storage Layer**: facts, transactions, rollback, preview state, and violation output integration. - **Storage Layer**: facts, transactions, rollback, preview state, and violation output integration.
## FlowLog-Inspired Planning ## FlowLog-Inspired Planning
@ -106,60 +113,6 @@ rule with three positive atoms
-> expected textual plan -> expected textual plan
``` ```
## DBSP and Incremental Execution
DBSP-related work should preserve a clean boundary:
```text
planned relational IR
-> DBSP lowering
-> maintained output deltas
```
Do not make DBSP responsible for source-language semantics. The frontend should check supported syntax, stratification, and rule shape before backend
lowering.
For each DBSP-like experiment, also provide a snapshot oracle when feasible:
```text
snapshot result == maintained result after each update
```
Track these measurements when relevant:
- hydration time
- warm-update time
- output delta size
- maintained state size if available
- sensitivity to join order
- sensitivity to causal-history depth
## CRDT Query Experiments
Initial CRDT workloads should stay small and explicit:
- multi-value register
- causal readiness over `pred`
- list next-element traversal
- tombstone skipping
Use operation facts shaped like:
```text
set(replica_id, counter, key, value)
pred(from_replica_id, from_counter, to_replica_id, to_counter)
insert(replica_id, counter, parent_replica_id, parent_counter, value)
remove(replica_id, counter)
```
Important questions:
- Does the query require recursion, negation, or both?
- Can antijoins run earlier?
- Can causal readiness be maintained from a frontier?
- Does warm-update cost depend on history depth?
- Does the output need integration into a current view?
## Geomerge-Style Validation Experiments ## Geomerge-Style Validation Experiments
The first Geomerge-style target is maintained violation detection for supported relational laws. The first Geomerge-style target is maintained violation detection for supported relational laws.
@ -219,8 +172,7 @@ Recommended test groups:
- antijoin scheduling - antijoin scheduling
- SIP-style filtering - SIP-style filtering
- snapshot evaluation - snapshot evaluation
- maintained-output equivalence - storage-backend adapter parity (in-process, file-backed, and CRDT)
- CRDT fixtures
- Geomerge-style violation fixtures - Geomerge-style violation fixtures
Tests should prefer small facts with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented. Tests should prefer small facts with readable expected outputs. Avoid large benchmark fixtures unless the test is explicitly performance-oriented.
@ -239,6 +191,13 @@ For Rust changes, prefer:
These map to `cargo fmt --all --check`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-targets --all-features`. These map to `cargo fmt --all --check`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-targets --all-features`.
If the project does not yet have a `Cargo.toml`, `make check` should still pass by skipping Rust-specific checks. If the project does not yet have a `Cargo.toml`, `make check` should still pass by skipping Rust-specific checks.
For changes that touch the cross-language pipeline (Haskell exporter and Rust runner), also run:
1. `make export-fixtures`: rebuilds `crates/plan-runner/fixtures/*.json` from `tools/exporter/examples/*.scenario.json` using the Haskell exporter.
Requires the Nix dev shell (`make shell` or `nix develop`) so GHC and Cabal are available.
2. `make examples`: runs `export-fixtures` and then `cargo test -p plan-runner --test examples`, which walks every regenerated fixture and verifies it
against its `expected_bindings` oracle.
For Markdown-only changes, run a manual read-through and check that headings follow the writing style. For Markdown-only changes, run a manual read-through and check that headings follow the writing style.
## Change Design Checklist ## Change Design Checklist

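The Mission hunk above asks how a query executor's operators (joins, antijoins, projections) compose into a working snapshot evaluator. The sketch below illustrates that composition over in-memory rows. It is not the `query-ops` crate's API: `Row`, `join`, `antijoin`, `project`, and `two_hop_unblocked` are hypothetical names, and integer-only rows are an assumption made for brevity.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical row type: a fixed-width tuple of integers.
type Row = Vec<i64>;

/// Hash join on one column per side; each output row is the left row followed by the right row.
fn join(left: &[Row], right: &[Row], left_col: usize, right_col: usize) -> Vec<Row> {
    // Build an index over the right side, keyed by the join column.
    let mut index: HashMap<i64, Vec<&Row>> = HashMap::new();
    for row in right {
        index.entry(row[right_col]).or_default().push(row);
    }
    let mut out = Vec::new();
    for l in left {
        if let Some(matches) = index.get(&l[left_col]) {
            for r in matches {
                let mut joined = l.clone();
                joined.extend(r.iter().copied());
                out.push(joined);
            }
        }
    }
    out
}

/// Antijoin: keep the left rows whose key does not appear on the right.
fn antijoin(left: &[Row], right: &[Row], left_col: usize, right_col: usize) -> Vec<Row> {
    let keys: HashSet<i64> = right.iter().map(|r| r[right_col]).collect();
    left.iter()
        .filter(|l| !keys.contains(&l[left_col]))
        .cloned()
        .collect()
}

/// Projection: keep the listed columns, in the listed order.
fn project(rows: &[Row], cols: &[usize]) -> Vec<Row> {
    rows.iter()
        .map(|r| cols.iter().map(|&c| r[c]).collect())
        .collect()
}

/// Snapshot evaluation of the illustrative rule
/// `out(x, z) :- edge(x, y), edge(y, z), not blocked(z)`.
fn two_hop_unblocked(edge: &[Row], blocked: &[Row]) -> Vec<Row> {
    let hops = join(edge, edge, 1, 0); // columns: x, y, y, z
    let allowed = antijoin(&hops, blocked, 3, 0); // drop rows whose z is blocked
    project(&allowed, &[0, 3]) // keep (x, z)
}
```

With `edge = {(1,2), (2,3), (2,4)}` and `blocked = {(4)}`, the only surviving pair is `(1, 3)`. Placing the antijoin before the projection keeps the `z` column available for filtering, which is one small instance of the antijoin-placement question the Planning Layer lists.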

@ -99,6 +99,18 @@ export-fixtures: ## Regenerate plan JSON for every tools/exporter/examples/*.sce
examples: export-fixtures ## Regenerate fixtures from scenarios and run them through plan-runner against their oracles. examples: export-fixtures ## Regenerate fixtures from scenarios and run them through plan-runner against their oracles.
@cargo test -p plan-runner --test examples @cargo test -p plan-runner --test examples
VIEWER_PORT ?= 8000
.PHONY: viewer
viewer: ## Serve the repository over HTTP for the plan viewer (override the port with VIEWER_PORT=...)
@if ! command -v python3 >/dev/null 2>&1; then \
echo "python3 not found. Enter the dev shell with 'make shell' (or 'nix develop') first."; \
exit 1; \
fi
@echo "Plan viewer: http://localhost:$(VIEWER_PORT)/tools/plan-viewer/index.html"
@echo "Example: http://localhost:$(VIEWER_PORT)/tools/plan-viewer/index.html?fixture=../../crates/plan-runner/fixtures/two_atom_join.json"
@python3 -m http.server $(VIEWER_PORT)
.PHONY: shell .PHONY: shell
shell: ## Enter the Nix dev shell defined in flake.nix shell: ## Enter the Nix dev shell defined in flake.nix
@nix develop @nix develop


@ -1,9 +1,13 @@
## Storage Engine Playground ## Storage Engine Playground
This repo is a playground for running small experiments related to storage side of things. This repo is a playground for running small experiments related to storage and query execution.
### Development ### Development
> ⚠️ Clone with `--recursive`.
> The repo pulls `external/geolog` and `external/geomerge` as git submodules;
> a non-recursive clone leaves those directories empty and breaks the build.
```sh ```sh
# Clone the repo with submodules # Clone the repo with submodules
git clone --recursive git@code.obsidian.systems:habedi-work/storage-engine-playground.git git clone --recursive git@code.obsidian.systems:habedi-work/storage-engine-playground.git


@ -48,6 +48,11 @@ via `build_tables_via_storage`, then scans tables back out before executing.
| `sqlite` | `SqliteStorage` | fresh tempdir per run | | `sqlite` | `SqliteStorage` | fresh tempdir per run |
| `geomerge` | `GeomergeStorage` | in-process | | `geomerge` | `GeomergeStorage` | in-process |
> ⚠️ `--backend geomerge` requires a typed theory upfront, but the runner IR is untyped.
> The CLI infers column types (`PrimInt` or `PrimString`) from the first fact row per relation;
> relations with no facts default to `PrimString`.
> This works for every current fixture; future fixtures with mixed-type columns may fail at insert time.
### Execute a Query Plan ### Execute a Query Plan
```sh ```sh

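The warning added in this hunk describes first-row type inference for `--backend geomerge`. A minimal sketch of that rule follows, assuming a two-variant value type. `Value`, `ColumnType`, `infer_column_types`, and `first_mismatch` are hypothetical names for illustration; only the `PrimInt` and `PrimString` labels and the default-to-`PrimString` rule come from the text above.

```rust
/// Hypothetical untyped fact value, as a runner IR might carry it.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

/// Column types named in the warning above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    PrimInt,
    PrimString,
}

fn type_of(value: &Value) -> ColumnType {
    match value {
        Value::Int(_) => ColumnType::PrimInt,
        Value::Str(_) => ColumnType::PrimString,
    }
}

/// Infer one type per column from the first fact row.
/// A relation with no facts defaults every column to `PrimString`.
fn infer_column_types(first_row: Option<&[Value]>, arity: usize) -> Vec<ColumnType> {
    match first_row {
        Some(row) => row.iter().map(type_of).collect(),
        None => vec![ColumnType::PrimString; arity],
    }
}

/// Return the (row, column) position of the first value that disagrees with
/// the inferred types. This is the mixed-type case the warning says may fail.
fn first_mismatch(types: &[ColumnType], rows: &[Vec<Value>]) -> Option<(usize, usize)> {
    for (i, row) in rows.iter().enumerate() {
        for (j, value) in row.iter().enumerate() {
            if type_of(value) != types[j] {
                return Some((i, j));
            }
        }
    }
    None
}
```

A pre-insert check along the lines of `first_mismatch` would let the CLI report the offending row and column instead of failing inside the backend, though whether the runner does this is not stated in the diff.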

@ -106,7 +106,7 @@ cargo test -p storage --all-features
- **Deletion support.** - **Deletion support.**
Most adapters implement `delete`. Most adapters implement `delete`.
The `geomerge` adapter does not: its append-only commit log returns `StorageError::Unsupported("row deletion")`. The `geomerge` adapter does not: its append-only commit log returns `StorageError::Unsupported("row deletion")`.
- **Geomerge is alpha.** - ⚠️ **Geomerge is alpha.**
The upstream `geomerge` crate is prototype-status and its API can change without notice; treat breakage in `adapters::geomerge` as expected churn The upstream `geomerge` crate is prototype-status and its API can change without notice; treat breakage in `adapters::geomerge` as expected churn
rather than regression. rather than regression.
- **Feature gates.** - **Feature gates.**

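The deletion note in this hunk says most adapters implement `delete` while the `geomerge` adapter returns `StorageError::Unsupported("row deletion")`. The sketch below shows how a caller might see that difference through a shared interface. The `Storage` trait, both adapter structs, and the error's payload type are assumptions for illustration and are not taken from the `storage` crate.

```rust
/// Hypothetical error type; only the `Unsupported("row deletion")` shape comes from the README.
#[derive(Debug, PartialEq)]
enum StorageError {
    Unsupported(&'static str),
}

/// Hypothetical backend-neutral interface with integer-only rows.
trait Storage {
    fn insert(&mut self, row: Vec<i64>);
    /// Returns `Ok(true)` if a row was removed and `Ok(false)` if it was absent.
    fn delete(&mut self, row: &[i64]) -> Result<bool, StorageError>;
    fn scan(&self) -> Vec<Vec<i64>>;
}

/// Stand-in for an adapter that supports deletion.
#[derive(Default)]
struct InMemory {
    rows: Vec<Vec<i64>>,
}

impl Storage for InMemory {
    fn insert(&mut self, row: Vec<i64>) {
        self.rows.push(row);
    }
    fn delete(&mut self, row: &[i64]) -> Result<bool, StorageError> {
        match self.rows.iter().position(|r| r.as_slice() == row) {
            Some(i) => {
                self.rows.remove(i);
                Ok(true)
            }
            None => Ok(false),
        }
    }
    fn scan(&self) -> Vec<Vec<i64>> {
        self.rows.clone()
    }
}

/// Stand-in for an append-only adapter: rows can be added but never removed.
#[derive(Default)]
struct AppendOnly {
    log: Vec<Vec<i64>>,
}

impl Storage for AppendOnly {
    fn insert(&mut self, row: Vec<i64>) {
        self.log.push(row);
    }
    fn delete(&mut self, _row: &[i64]) -> Result<bool, StorageError> {
        Err(StorageError::Unsupported("row deletion"))
    }
    fn scan(&self) -> Vec<Vec<i64>> {
        self.log.clone()
    }
}
```

Returning an explicit error rather than silently ignoring the call means an adapter-parity test can tell "row was absent" apart from "backend cannot delete".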

@ -26,9 +26,10 @@ tools/exporter/
### Run It ### Run It
The exporter needs GHC 9.12 and Cabal. > ⚠️ The exporter needs GHC 9.12 and Cabal.
The repository's Nix dev shell provides both; > The repository's Nix dev shell provides both;
enter it with `make shell` (or `nix develop`) before running the commands below. > enter it with `make shell` (or `nix develop`) before running the commands below.
> A system GHC older than 9.12 will fail to compile geolog-lang's `GHC2024` modules.
```sh ```sh
# Build the executable: # Build the executable:


@ -0,0 +1,18 @@
## Query Plan Viewer
A static HTML viewer for `plan-runner` JSON files (the fixtures).
### Usage
Open [`index.html`](index.html) in a browser, then drop a JSON file from [`crates/plan-runner/fixtures/`](../../crates/plan-runner/fixtures) onto the page.
Alternatively, you can run the commands below to serve the viewer locally:
```sh
make shell # Enter the Nix dev shell
make viewer
```
Then open the following URL in your browser (replace `two_atom_join.json` with the name of the plan you want to view):
http://localhost:8000/tools/plan-viewer/index.html?fixture=../../crates/plan-runner/fixtures/two_atom_join.json

Binary file not shown.