Compare commits

..

2 Commits

Author SHA1 Message Date
Hassan Abedi
73394d7e9e Add summary of disscusion about DB baclend 2026-03-11 13:08:26 +01:00
Hassan Abedi
9946c4206f WIP 2026-03-11 10:37:22 +01:00
2 changed files with 231 additions and 0 deletions

172
NOTES.md Normal file
View File

@ -0,0 +1,172 @@
# Notes
This file records working notes from the recent Geolog / backend design discussion.
## Current State of `chase-rs`
- `chase-rs` currently runs the minimal `.chase` frontend language, not the
richer `.geolog` example language.
- The current CLI supports `repl`, `gui`, and `script` over the minimal command
language.
- The project is best described as a chase engine for TGDs / existential rules,
not a narrow classical Datalog implementation.
- It can execute Datalog-like programs, but it also supports existential head
variables via labeled null generation.
## Geolog and `geolog-lite`
- The `.geolog` files in `examples/geolog/` appear to define a richer DSL that
is not currently wired into the executable frontend.
- A practical direction is to extract a smaller, well-defined core named
`geolog-lite`.
- `geolog-lite` should focus on the positive relational fragment:
- theories
- instances
- predicates
- conjunctive rule bodies
- conjunctive rule heads
- existential variables in heads
- conjunctive queries
- Surface features such as record arguments, qualified names, field projection,
and function-like syntax should be desugared into a flat relational IR.
## Using `chase-rs` as a Processor
- `chase-rs` is a good fit as a backend processor for `geolog-lite` once the
language is lowered to flat predicates, facts, and TGD-style rules.
- A compilation pipeline could be:
- parse `geolog-lite`
- elaborate names and parameters
- lower to relational IR
- compile to `Instance` + `Rule`
- run chase
- answer conjunctive queries over the materialized instance
- This works well for the positive existential fragment.
- It does **not** fully cover richer Geolog features such as equality reasoning,
solver-oriented unsatisfiability, disjunction, or other advanced semantics.
## Most Critical Missing Capability in `chase-rs`
- The most semantically important missing feature is equality support:
- EGDs
- congruence closure / equality saturation
- A close second is full query-answering support beyond the current materialized
instance matching behavior.
## Relational Database as Backend
- A relational database can be used as a backend for `geolog-lite`.
- However, a plain relational database is **not** a drop-in replacement for a
chase engine.
- If the language includes existential rules and repeated rule application to
fixpoint, the chase logic still needs to exist somewhere.
- Therefore, the preferred model is:
- database = storage + joins + dedup + persistence
- Rust engine = chase coordination + witness generation + trigger tracking
## Preferred Architecture: DB-Backed Chase Engine
- Strongest direction discussed:
- store facts in a relational database
- let SQL perform joins and candidate generation
- implement the chase loop in Rust
- let Rust handle restricted-chase triggers and existential witnesses
- This keeps semantics explicit while benefiting from database execution.
## Recommended MVP Scope
Start with a deliberately small fragment:
- positive rules only
- flat predicates only after lowering
- conjunctive bodies and heads
- existential variables in heads
- conjunctive queries
- no equality
- no negation
- no disjunction
- no solver-style unsat features
This should be enough for a useful first vertical slice.
## Suggested Data Model
- Use one database table per predicate.
- Do not use SQL `NULL` as labeled nulls.
- Generate labeled null identities in Rust.
- Qualified Geolog names should lower to stable SQL-safe predicate names.
Example lowering ideas:
- `Edge : [src: V, tgt: V] -> Prop` -> predicate table `edge(src, tgt)`
- `src : E -> V` -> relation `src(e, v)` after desugaring
- `R/data : R -> [x: A, y: B]` -> relation `r_data(r, x, y)`
## Restricted Chase in a DB-Backed Engine
- The key mechanism is trigger tracking.
- For each rule application, compute a canonical frontier binding.
- Store applied triggers in a dedicated table.
- Skip any body match whose frontier binding has already been applied.
- For existential heads, generate fresh labeled nulls in Rust and insert the
corresponding derived facts.
High-level loop:
1. Find body matches using SQL.
2. Project frontier bindings.
3. Filter out already-applied triggers.
4. Generate existential witnesses in Rust when needed.
5. Insert derived facts with dedup.
6. Record applied triggers.
7. Repeat until fixpoint.
## Clean Backend Interface
- A clean backend trait is preferable to embedding raw SQL everywhere.
- The backend abstraction should be chase-shaped rather than a generic
`execute_sql` wrapper.
Suggested responsibilities:
- ensure predicate storage exists
- load / append base facts
- evaluate a compiled rule body
- filter unseen triggers
- insert derived facts with dedup
- record applied triggers
- evaluate conjunctive queries over materialized facts
Also keep an in-memory backend as the semantic reference implementation.
## DuckDB as the First Database Backend
- DuckDB is a strong candidate for the first backend:
- embedded / in-process
- good analytical join performance
- simple deployment model
- good fit for batched chase rounds
- Recommended usage model:
- one engine process owns the database connection
- work in batches rather than many tiny transactions
- keep chase semantics in Rust
- This makes DuckDB a good first backend behind a clean database trait.
## Suggested Implementation Order
1. Define a relational IR for predicates, rules, and queries.
2. Define a clean backend trait for fact storage and rule evaluation.
3. Keep an in-memory backend as the reference implementation.
4. Implement a `DuckDbBackend`.
5. Build a minimal `geolog-lite` parser and lowering pipeline.
6. Run a first end-to-end example such as transitive closure.
## Good First Example
- `examples/geolog/transitive_closure.geolog` is an ideal first target.
- It exercises:
- theory parsing
- predicate lowering
- basic chase materialization
- recursive closure
- simple conjunctive querying

View File

@ -428,6 +428,65 @@ Some example comments refer to interactive commands such as:
Those commands appear to belong to an external REPL or tool environment rather
than to the `.geolog` file grammar itself.
## Querying
The examples do **not** show a stable, implemented query form inside `.geolog`
files in the same way they show `theory` and `instance` declarations.
What they do show is:
### 1. External interactive commands
Some files suggest that querying or inspection happens through an external tool
or REPL:
```text
:source examples/geolog/transitive_closure.geolog
:inspect Chain
:chase Chain
:solve EmptyModel
```
Based on the example comments, these commands appear to mean roughly:
- `:source` or `:load`: load Geolog definitions from files
- `:inspect`: inspect a declared instance
- `:chase`: materialize or display the closure of a `= chase` instance
- `:solve`: ask a solver to construct a satisfying instance for a theory
These are tool commands, not part of the confirmed `.geolog` declaration syntax.
### 2. A sketched future query form
One file contains a commented-out example of a possible query block:
```text
query can_reach_B_from_A {
? : ExampleNet problem0 Solution instance;
}
```
This suggests an aspirational style where querying is expressed as asking for a
witness inhabiting some instance type. However, in the current examples this is
only a comment, not an observed live construct.
### 3. Current implemented querying elsewhere in the repo
The currently implemented query syntax in this repository belongs to the minimal
frontend language, not to Geolog. That language supports forms such as:
```text
query Ancestor(?X, ?Y)?
explain Ancestor(alice, carol)?
```
So the safest summary is:
- Geolog examples define theories, instances, and chase-oriented structure.
- Querying appears to be external, REPL-driven, or still in design.
- The only clearly implemented query syntax in this repo today is the minimal
`.chase` frontend query language.
## Best Example Files by Feature
- `transitive_closure.geolog`: basic theory, axiom, and `= chase` usage