# Notes This file records working notes from the recent Geolog / backend design discussion. ## Current State of `chase-rs` - `chase-rs` currently runs the minimal `.chase` frontend language, not the richer `.geolog` example language. - The current CLI supports `repl`, `gui`, and `script` over the minimal command language. - The project is best described as a chase engine for TGDs / existential rules, not a narrow classical Datalog implementation. - It can execute Datalog-like programs, but it also supports existential head variables via labeled null generation. ## Geolog and `geolog-lite` - The `.geolog` files in `examples/geolog/` appear to define a richer DSL that is not currently wired into the executable frontend. - A practical direction is to extract a smaller, well-defined core named `geolog-lite`. - `geolog-lite` should focus on the positive relational fragment: - theories - instances - predicates - conjunctive rule bodies - conjunctive rule heads - existential variables in heads - conjunctive queries - Surface features such as record arguments, qualified names, field projection, and function-like syntax should be desugared into a flat relational IR. ## Using `chase-rs` as a Processor - `chase-rs` is a good fit as a backend processor for `geolog-lite` once the language is lowered to flat predicates, facts, and TGD-style rules. - A compilation pipeline could be: - parse `geolog-lite` - elaborate names and parameters - lower to relational IR - compile to `Instance` + `Rule` - run chase - answer conjunctive queries over the materialized instance - This works well for the positive existential fragment. - It does **not** fully cover richer Geolog features such as equality reasoning, solver-oriented unsatisfiability, disjunction, or other advanced semantics. ## Most Critical Missing Capability in `chase-rs` - The most semantically important missing feature is equality support: - EGDs - congruence closure / equality saturation - A close second is full query-answering support beyond the current materialized instance matching behavior. ## Relational Database as Backend - A relational database can be used as a backend for `geolog-lite`. - However, a plain relational database is **not** a drop-in replacement for a chase engine. - If the language includes existential rules and repeated rule application to fixpoint, the chase logic still needs to exist somewhere. - Therefore, the preferred model is: - database = storage + joins + dedup + persistence - Rust engine = chase coordination + witness generation + trigger tracking ## Preferred Architecture: DB-Backed Chase Engine - Strongest direction discussed: - store facts in a relational database - let SQL perform joins and candidate generation - implement the chase loop in Rust - let Rust handle restricted-chase triggers and existential witnesses - This keeps semantics explicit while benefiting from database execution. ## Recommended MVP Scope Start with a deliberately small fragment: - positive rules only - flat predicates only after lowering - conjunctive bodies and heads - existential variables in heads - conjunctive queries - no equality - no negation - no disjunction - no solver-style unsat features This should be enough for a useful first vertical slice. ## Suggested Data Model - Use one database table per predicate. - Do not use SQL `NULL` as labeled nulls. - Generate labeled null identities in Rust. - Qualified Geolog names should lower to stable SQL-safe predicate names. Example lowering ideas: - `Edge : [src: V, tgt: V] -> Prop` -> predicate table `edge(src, tgt)` - `src : E -> V` -> relation `src(e, v)` after desugaring - `R/data : R -> [x: A, y: B]` -> relation `r_data(r, x, y)` ## Restricted Chase in a DB-Backed Engine - The key mechanism is trigger tracking. - For each rule application, compute a canonical frontier binding. - Store applied triggers in a dedicated table. - Skip any body match whose frontier binding has already been applied. - For existential heads, generate fresh labeled nulls in Rust and insert the corresponding derived facts. High-level loop: 1. Find body matches using SQL. 2. Project frontier bindings. 3. Filter out already-applied triggers. 4. Generate existential witnesses in Rust when needed. 5. Insert derived facts with dedup. 6. Record applied triggers. 7. Repeat until fixpoint. ## Clean Backend Interface - A clean backend trait is preferable to embedding raw SQL everywhere. - The backend abstraction should be chase-shaped rather than a generic `execute_sql` wrapper. Suggested responsibilities: - ensure predicate storage exists - load / append base facts - evaluate a compiled rule body - filter unseen triggers - insert derived facts with dedup - record applied triggers - evaluate conjunctive queries over materialized facts Also keep an in-memory backend as the semantic reference implementation. ## DuckDB as the First Database Backend - DuckDB is a strong candidate for the first backend: - embedded / in-process - good analytical join performance - simple deployment model - good fit for batched chase rounds - Recommended usage model: - one engine process owns the database connection - work in batches rather than many tiny transactions - keep chase semantics in Rust - This makes DuckDB a good first backend behind a clean database trait. ## Suggested Implementation Order 1. Define a relational IR for predicates, rules, and queries. 2. Define a clean backend trait for fact storage and rule evaluation. 3. Keep an in-memory backend as the reference implementation. 4. Implement a `DuckDbBackend`. 5. Build a minimal `geolog-lite` parser and lowering pipeline. 6. Run a first end-to-end example such as transitive closure. ## Good First Example - `examples/geolog/transitive_closure.geolog` is an ideal first target. - It exercises: - theory parsing - predicate lowering - basic chase materialization - recursive closure - simple conjunctive querying