# Notes

This file records working notes from the recent Geolog / backend design discussion.

## Current State of `chase-rs`

- `chase-rs` currently runs the minimal `.chase` frontend language, not the
  richer `.geolog` example language.
- The current CLI supports `repl`, `gui`, and `script` over the minimal command
  language.
- The project is best described as a chase engine for TGDs / existential rules,
  not a narrow classical Datalog implementation.
- It can execute Datalog-like programs, but it also supports existential head
  variables via labeled null generation.

## Geolog and `geolog-lite`

- The `.geolog` files in `examples/geolog/` appear to define a richer DSL that
  is not currently wired into the executable frontend.
- A practical direction is to extract a smaller, well-defined core named
  `geolog-lite`.
- `geolog-lite` should focus on the positive relational fragment:
  - theories
  - instances
  - predicates
  - conjunctive rule bodies
  - conjunctive rule heads
  - existential variables in heads
  - conjunctive queries
- Surface features such as record arguments, qualified names, field projection,
  and function-like syntax should be desugared into a flat relational IR.

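As a rough illustration of what a flat relational IR for this fragment might look like, here is a minimal sketch. All type and function names (`IrTerm`, `IrAtom`, `IrRule`, `var_atom`) are hypothetical and are not taken from `chase-rs`. The sketch assumes that head variables absent from the body are the existential ones.

```rust
use std::collections::BTreeSet;

/// A term in a flat atom: a rule variable or a constant. (Hypothetical type.)
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum IrTerm {
    Var(String),
    Const(String),
}

/// A flat atom such as `edge(x, y)`; no records, projections, or functions.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct IrAtom {
    pub predicate: String,
    pub args: Vec<IrTerm>,
}

/// A TGD-style rule: a conjunctive body implies a conjunctive head.
#[derive(Debug, Clone)]
pub struct IrRule {
    pub body: Vec<IrAtom>,
    pub head: Vec<IrAtom>,
}

/// Collect the variable names that occur in a list of atoms.
fn vars_of(atoms: &[IrAtom]) -> BTreeSet<String> {
    atoms
        .iter()
        .flat_map(|a| a.args.iter())
        .filter_map(|t| match t {
            IrTerm::Var(v) => Some(v.clone()),
            IrTerm::Const(_) => None,
        })
        .collect()
}

impl IrRule {
    /// Head variables that do not occur in the body are existential.
    pub fn existential_vars(&self) -> BTreeSet<String> {
        let body = vars_of(&self.body);
        let head = vars_of(&self.head);
        head.difference(&body).cloned().collect()
    }

    /// Frontier: variables shared between body and head.
    pub fn frontier(&self) -> BTreeSet<String> {
        let body = vars_of(&self.body);
        let head = vars_of(&self.head);
        head.intersection(&body).cloned().collect()
    }
}

/// Convenience constructor for an atom whose arguments are all variables.
pub fn var_atom(predicate: &str, vars: &[&str]) -> IrAtom {
    IrAtom {
        predicate: predicate.to_string(),
        args: vars.iter().map(|v| IrTerm::Var(v.to_string())).collect(),
    }
}
```

For a rule such as `person(x) -> parent(x, y)`, `existential_vars()` yields `y` and `frontier()` yields `x`. Instances and queries could reuse `IrAtom` in the same style.
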
## Using `chase-rs` as a Processor

- `chase-rs` is a good fit as a backend processor for `geolog-lite` once the
  language is lowered to flat predicates, facts, and TGD-style rules.
- A compilation pipeline could be:
  - parse `geolog-lite`
  - elaborate names and parameters
  - lower to relational IR
  - compile to `Instance` + `Rule`
  - run chase
  - answer conjunctive queries over the materialized instance
- This works well for the positive existential fragment.
- It does **not** fully cover richer Geolog features such as equality reasoning,
  solver-oriented unsatisfiability, disjunction, or other advanced semantics.

## Most Critical Missing Capability in `chase-rs`

- The most semantically important missing feature is equality support:
  - EGDs
  - congruence closure / equality saturation
- A close second is full query-answering support that goes beyond the current
  behavior of matching against the materialized instance.

## Relational Database as Backend

- A relational database can be used as a backend for `geolog-lite`.
- However, a plain relational database is **not** a drop-in replacement for a
  chase engine.
- If the language includes existential rules and repeated rule application to
  fixpoint, the chase logic still needs to exist somewhere.
- Therefore, the preferred model is:
  - database = storage + joins + dedup + persistence
  - Rust engine = chase coordination + witness generation + trigger tracking

## Preferred Architecture: DB-Backed Chase Engine

- Strongest direction discussed:
  - store facts in a relational database
  - let SQL perform joins and candidate generation
  - implement the chase loop in Rust
  - let Rust handle restricted-chase triggers and existential witnesses
- This keeps semantics explicit while benefiting from database execution.

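To make "let SQL perform joins and candidate generation" concrete, the following is a minimal sketch of compiling a conjunctive rule body into a `SELECT` statement from Rust. It assumes one table per predicate with positional columns named `c0`, `c1`, and so on, that all atom arguments are variables, and that predicate names are already SQL-safe. The function name `body_to_sql` and the column naming are illustrative assumptions, not part of `chase-rs`.

```rust
use std::collections::HashMap;

/// Compile a conjunctive body into a SQL query that returns bindings for the
/// variables listed in `select` (for example, the rule's frontier).
/// Each atom is `(predicate, argument variables)`.
/// Returns `None` if the body or selection is empty, or if a selected
/// variable does not occur in the body.
pub fn body_to_sql(body: &[(&str, Vec<&str>)], select: &[&str]) -> Option<String> {
    if body.is_empty() || select.is_empty() {
        return None;
    }
    // First column in which each variable was seen, e.g. "t0.c1".
    let mut first_seen: HashMap<&str, String> = HashMap::new();
    let mut from = Vec::new();
    let mut conds = Vec::new();

    for (i, (pred, args)) in body.iter().enumerate() {
        // Assumption: table name == predicate name, aliased per atom.
        from.push(format!("{pred} AS t{i}"));
        for (j, var) in args.iter().enumerate() {
            let col = format!("t{i}.c{j}");
            match first_seen.get(var).cloned() {
                // A repeated variable becomes an equality join condition.
                Some(prev) => conds.push(format!("{prev} = {col}")),
                None => {
                    first_seen.insert(*var, col);
                }
            }
        }
    }

    let mut cols = Vec::new();
    for var in select {
        cols.push(format!("{} AS {}", first_seen.get(var)?, var));
    }

    let mut sql = format!("SELECT DISTINCT {} FROM {}", cols.join(", "), from.join(", "));
    if !conds.is_empty() {
        sql.push_str(&format!(" WHERE {}", conds.join(" AND ")));
    }
    Some(sql)
}
```

For the body `edge(x, y), path(y, z)` with selection `x, z`, this produces `SELECT DISTINCT t0.c0 AS x, t1.c1 AS z FROM edge AS t0, path AS t1 WHERE t0.c1 = t1.c0`. Constants in atoms and parameter binding are left out of the sketch.
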
## Recommended MVP Scope

Start with a deliberately small fragment:

- positive rules only
- flat predicates only (after lowering)
- conjunctive bodies and heads
- existential variables in heads
- conjunctive queries
- no equality
- no negation
- no disjunction
- no solver-style unsat features

This should be enough for a useful first vertical slice.

## Suggested Data Model

- Use one database table per predicate.
- Do not use SQL `NULL` to represent labeled nulls: SQL `NULL` carries no
  identity, and `NULL = NULL` does not evaluate to true, so joins on it fail.
- Generate labeled null identities in Rust.
- Qualified Geolog names should lower to stable SQL-safe predicate names.

Example lowering ideas:

- `Edge : [src: V, tgt: V] -> Prop` lowers to the predicate table `edge(src, tgt)`
- `src : E -> V` lowers to the relation `src(e, v)` after desugaring
- `R/data : R -> [x: A, y: B]` lowers to the relation `r_data(r, x, y)`

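The two data-model points above (labeled nulls generated in Rust, and SQL-safe predicate names) can be sketched as follows. The `c:` / `n:` text encoding, the `Value` and `NullGen` types, and the `sql_safe_name` mangling scheme are all assumptions made for illustration. A real lowering would also need to guard against two qualified names mapping to the same table name.

```rust
/// A stored value: an input constant or a labeled null invented by the chase.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Value {
    Const(String),
    Null(u64),
}

impl Value {
    /// Hypothetical column encoding: a tag prefix keeps nulls and constants
    /// apart, so the constant "7" never collides with null number 7.
    pub fn encode(&self) -> String {
        match self {
            Value::Const(s) => format!("c:{s}"),
            Value::Null(id) => format!("n:{id}"),
        }
    }

    /// Inverse of `encode`; returns `None` for text without a known tag.
    pub fn decode(text: &str) -> Option<Value> {
        if let Some(rest) = text.strip_prefix("c:") {
            Some(Value::Const(rest.to_string()))
        } else if let Some(rest) = text.strip_prefix("n:") {
            rest.parse::<u64>().ok().map(Value::Null)
        } else {
            None
        }
    }
}

/// Generates fresh labeled nulls on the Rust side.
pub struct NullGen {
    next: u64,
}

impl NullGen {
    pub fn new() -> Self {
        NullGen { next: 0 }
    }

    /// Each call returns a null with a new identity.
    pub fn fresh(&mut self) -> Value {
        let id = self.next;
        self.next += 1;
        Value::Null(id)
    }
}

/// Lower a qualified name such as `R/data` to a lowercase identifier.
/// Runs of non-alphanumeric characters collapse to a single underscore.
pub fn sql_safe_name(qualified: &str) -> String {
    let mut out = String::new();
    for ch in qualified.chars() {
        if ch.is_ascii_alphanumeric() {
            out.push(ch.to_ascii_lowercase());
        } else if !out.ends_with('_') {
            out.push('_');
        }
    }
    // Avoid empty identifiers and identifiers that start with a digit.
    if out.chars().next().map_or(true, |c| c.is_ascii_digit()) {
        out.insert_str(0, "p_");
    }
    out
}
```

With this scheme `sql_safe_name("R/data")` gives `r_data` and `sql_safe_name("Edge")` gives `edge`, matching the lowering examples above.
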
## Restricted Chase in a DB-Backed Engine

- The key mechanism is trigger tracking.
- For each rule application, compute a canonical frontier binding.
- Store applied triggers in a dedicated table.
- Skip any body match whose frontier binding has already been applied.
- For existential heads, generate fresh labeled nulls in Rust and insert the
  corresponding derived facts.
- Note that skipping on the frontier binding alone corresponds to the
  semi-oblivious (Skolem) chase; a restricted chase in the strict sense also
  skips a trigger whose head is already satisfied in the current instance.

High-level loop:

1. Find body matches using SQL.
2. Project frontier bindings.
3. Filter out already-applied triggers.
4. Generate existential witnesses in Rust when needed.
5. Insert derived facts with dedup.
6. Record applied triggers.
7. Repeat until fixpoint.

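The loop above can be sketched in memory as follows, with plain Rust collections standing in for the database tables. Everything here is a hypothetical illustration: the `Rule` struct with function pointers, the `Chase` struct, and the `_:n0` null naming are assumptions, and in a DB-backed engine the `matches` step would be a SQL query. The sketch tracks triggers by frontier binding only and does not check whether a head is already satisfied.

```rust
use std::collections::{BTreeSet, HashSet};

/// A fact is a predicate name plus its argument values.
pub type Fact = (String, Vec<String>);

/// A compiled rule (hypothetical shape).
pub struct Rule {
    /// Steps 1-2: find body matches and project them to frontier bindings.
    pub matches: fn(&BTreeSet<Fact>) -> Vec<Vec<String>>,
    /// Number of existential variables in the head.
    pub existentials: usize,
    /// Build head facts from a frontier binding and fresh nulls.
    pub head: fn(&[String], &[String]) -> Vec<Fact>,
}

#[derive(Default)]
pub struct Chase {
    pub facts: BTreeSet<Fact>,
    /// Applied triggers, keyed by (rule index, frontier binding).
    pub applied: HashSet<(usize, Vec<String>)>,
    pub next_null: u64,
}

impl Chase {
    /// Run rounds until no new fact is derived; returns the number of rounds.
    pub fn run(&mut self, rules: &[Rule]) -> usize {
        let mut rounds = 0;
        loop {
            rounds += 1;
            let mut changed = false;
            for (id, rule) in rules.iter().enumerate() {
                for binding in (rule.matches)(&self.facts) {
                    // Steps 3 and 6: skip seen triggers, record new ones.
                    if !self.applied.insert((id, binding.clone())) {
                        continue;
                    }
                    // Step 4: one fresh labeled null per existential variable.
                    let mut nulls = Vec::new();
                    for _ in 0..rule.existentials {
                        nulls.push(format!("_:n{}", self.next_null));
                        self.next_null += 1;
                    }
                    // Step 5: the set gives dedup on insert.
                    for fact in (rule.head)(&binding, &nulls) {
                        changed |= self.facts.insert(fact);
                    }
                }
            }
            // Step 7: stop at fixpoint.
            if !changed {
                return rounds;
            }
        }
    }
}

/// Example rule body: person(x), with frontier [x].
pub fn person_matches(facts: &BTreeSet<Fact>) -> Vec<Vec<String>> {
    facts
        .iter()
        .filter(|(p, _)| p.as_str() == "person")
        .map(|(_, args)| vec![args[0].clone()])
        .collect()
}

/// Example rule head: parent(x, y) with y existential.
pub fn parent_head(binding: &[String], nulls: &[String]) -> Vec<Fact> {
    vec![("parent".to_string(), vec![binding[0].clone(), nulls[0].clone()])]
}
```

Starting from `person(alice)` and `person(bob)`, running the single rule `person(x) -> parent(x, y)` adds one `parent` fact per person, each with its own null, and a second call to `run` derives nothing new because both triggers are already recorded.
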
## Clean Backend Interface

- A clean backend trait is preferable to embedding raw SQL everywhere.
- The backend abstraction should be chase-shaped rather than a generic
  `execute_sql` wrapper.

Suggested responsibilities:

- ensure predicate storage exists
- load / append base facts
- evaluate a compiled rule body
- filter unseen triggers
- insert derived facts with dedup
- record applied triggers
- evaluate conjunctive queries over materialized facts

Also keep an in-memory backend as the semantic reference implementation.

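A chase-shaped trait along these lines might look like the sketch below, together with a small in-memory implementation. The trait name `ChaseBackend`, its method names, and `MemoryBackend` are hypothetical. Rule-body and query evaluation are reduced to a plain `scan` here to keep the sketch short; a fuller version would take a compiled body instead.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

pub type Row = Vec<String>;

/// Hypothetical chase-shaped backend interface (a subset of the
/// responsibilities listed above).
pub trait ChaseBackend {
    /// Ensure storage for a predicate of the given arity exists.
    fn ensure_predicate(&mut self, name: &str, arity: usize);
    /// Insert facts with dedup; returns how many rows were actually new.
    fn insert_facts(&mut self, predicate: &str, rows: &[Row]) -> usize;
    /// Keep only the triggers not yet recorded for this rule.
    fn filter_unseen(&self, rule_id: usize, triggers: Vec<Row>) -> Vec<Row>;
    /// Record triggers as applied.
    fn record_triggers(&mut self, rule_id: usize, triggers: &[Row]);
    /// Return all rows of a predicate (stand-in for body/query evaluation).
    fn scan(&self, predicate: &str) -> Vec<Row>;
}

/// In-memory reference backend.
#[derive(Default)]
pub struct MemoryBackend {
    tables: HashMap<String, (usize, BTreeSet<Row>)>,
    applied: HashSet<(usize, Row)>,
}

impl ChaseBackend for MemoryBackend {
    fn ensure_predicate(&mut self, name: &str, arity: usize) {
        self.tables
            .entry(name.to_string())
            .or_insert_with(|| (arity, BTreeSet::new()));
    }

    fn insert_facts(&mut self, predicate: &str, rows: &[Row]) -> usize {
        let (arity, table) = match self.tables.get_mut(predicate) {
            Some(entry) => entry,
            None => return 0, // unknown predicate: nothing inserted
        };
        let mut added = 0;
        for row in rows {
            // Rows of the wrong arity are ignored in this sketch.
            if row.len() == *arity && table.insert(row.clone()) {
                added += 1;
            }
        }
        added
    }

    fn filter_unseen(&self, rule_id: usize, triggers: Vec<Row>) -> Vec<Row> {
        triggers
            .into_iter()
            .filter(|t| !self.applied.contains(&(rule_id, t.clone())))
            .collect()
    }

    fn record_triggers(&mut self, rule_id: usize, triggers: &[Row]) {
        for t in triggers {
            self.applied.insert((rule_id, t.clone()));
        }
    }

    fn scan(&self, predicate: &str) -> Vec<Row> {
        self.tables
            .get(predicate)
            .map(|(_, rows)| rows.iter().cloned().collect::<Vec<Row>>())
            .unwrap_or_default()
    }
}
```

Keeping the chase loop generic over such a trait would let the same loop run against the in-memory reference and a database backend, which makes it possible to compare their results on the same inputs.
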
## DuckDB as the First Database Backend

- DuckDB is a strong candidate for the first backend:
  - embedded / in-process
  - good analytical join performance
  - simple deployment model
  - good fit for batched chase rounds
- Recommended usage model:
  - one engine process owns the database connection
  - work in batches rather than many tiny transactions
  - keep chase semantics in Rust
- This makes DuckDB a good first backend behind a clean database trait.

## Suggested Implementation Order

1. Define a relational IR for predicates, rules, and queries.
2. Define a clean backend trait for fact storage and rule evaluation.
3. Keep an in-memory backend as the reference implementation.
4. Implement a `DuckDbBackend`.
5. Build a minimal `geolog-lite` parser and lowering pipeline.
6. Run a first end-to-end example such as transitive closure.

## Good First Example

- `examples/geolog/transitive_closure.geolog` is an ideal first target.
- It exercises:
  - theory parsing
  - predicate lowering
  - basic chase materialization
  - recursive closure
  - simple conjunctive querying