Notes
This file records working notes from the recent Geolog / backend design discussion.
Current State of chase-rs
- `chase-rs` currently runs the minimal `.chase` frontend language, not the richer `.geolog` example language.
- The current CLI supports `repl`, `gui`, and `script` over the minimal command language.
- The project is best described as a chase engine for TGDs / existential rules, not a narrow classical Datalog implementation.
- It can execute Datalog-like programs, but it also supports existential head variables via labeled null generation.
Geolog and geolog-lite
- The `.geolog` files in `examples/geolog/` appear to define a richer DSL that is not currently wired into the executable frontend.
- A practical direction is to extract a smaller, well-defined core named `geolog-lite`.
- `geolog-lite` should focus on the positive relational fragment:
- theories
- instances
- predicates
- conjunctive rule bodies
- conjunctive rule heads
- existential variables in heads
- conjunctive queries
- Surface features such as record arguments, qualified names, field projection, and function-like syntax should be desugared into a flat relational IR.
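As a rough illustration of the flat relational IR that surface features could desugar into, here is a minimal Rust sketch. The names `IrTerm`, `IrAtom`, `IrRule`, `IrQuery`, and `var_atom` are hypothetical; they are not taken from chase-rs and only show one possible shape.

```rust
use std::collections::BTreeSet;

/// A term is either a variable or a constant (hypothetical representation).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum IrTerm {
    Var(String),
    Const(String),
}

/// A flat relational atom such as `edge(x, y)`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IrAtom {
    pub predicate: String,
    pub args: Vec<IrTerm>,
}

/// A TGD-style rule: a conjunctive body implies a conjunctive head.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IrRule {
    pub body: Vec<IrAtom>,
    pub head: Vec<IrAtom>,
}

/// A conjunctive query: a body plus the variables to return.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IrQuery {
    pub body: Vec<IrAtom>,
    pub answer_vars: Vec<String>,
}

/// Collect the variable names occurring in a list of atoms.
fn vars_of(atoms: &[IrAtom]) -> BTreeSet<String> {
    atoms
        .iter()
        .flat_map(|a| a.args.iter())
        .filter_map(|t| match t {
            IrTerm::Var(v) => Some(v.clone()),
            IrTerm::Const(_) => None,
        })
        .collect()
}

impl IrRule {
    /// Frontier variables: head variables that also occur in the body.
    pub fn frontier(&self) -> BTreeSet<String> {
        let body = vars_of(&self.body);
        vars_of(&self.head).into_iter().filter(|v| body.contains(v)).collect()
    }

    /// Existential variables: head variables that do not occur in the body.
    pub fn existentials(&self) -> BTreeSet<String> {
        let body = vars_of(&self.body);
        vars_of(&self.head).into_iter().filter(|v| !body.contains(v)).collect()
    }
}

/// Convenience constructor for an atom whose arguments are all variables.
pub fn var_atom(predicate: &str, names: &[&str]) -> IrAtom {
    IrAtom {
        predicate: predicate.to_string(),
        args: names.iter().map(|n| IrTerm::Var(n.to_string())).collect(),
    }
}
```

For a rule such as `person(x) -> parent(x, y), person(y)`, `frontier()` would contain `x` and `existentials()` would contain `y`.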
Using chase-rs as a Processor
- `chase-rs` is a good fit as a backend processor for `geolog-lite` once the language is lowered to flat predicates, facts, and TGD-style rules.
- A compilation pipeline could be:
- parse `geolog-lite`
- elaborate names and parameters
- lower to relational IR
- compile to `Instance` + `Rule`
- run chase
- answer conjunctive queries over the materialized instance
- This works well for the positive existential fragment.
- It does not fully cover richer Geolog features such as equality reasoning, solver-oriented unsatisfiability, disjunction, or other advanced semantics.
Most Critical Missing Capability in chase-rs
- The most semantically important missing feature is equality support:
- EGDs
- congruence closure / equality saturation
- A close second is full query-answering support, beyond the current behavior of matching queries against the materialized instance.
Relational Database as Backend
- A relational database can be used as a backend for `geolog-lite`.
- However, a plain relational database is not a drop-in replacement for a chase engine.
- If the language includes existential rules and repeated rule application to fixpoint, the chase logic still needs to exist somewhere.
- Therefore, the preferred model is:
- database = storage + joins + dedup + persistence
- Rust engine = chase coordination + witness generation + trigger tracking
Preferred Architecture: DB-Backed Chase Engine
- Strongest direction discussed:
- store facts in a relational database
- let SQL perform joins and candidate generation
- implement the chase loop in Rust
- let Rust handle restricted-chase triggers and existential witnesses
- This keeps semantics explicit while benefiting from database execution.
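To make "let SQL perform joins and candidate generation" more concrete, the following sketch compiles a conjunctive rule body into a `SELECT` that returns frontier bindings. It assumes one table per predicate with positional column names `c0`, `c1`, ... (an assumption made for brevity), and the names `BodyAtom` and `body_to_sql` are hypothetical.

```rust
use std::collections::BTreeMap;

/// A body atom: predicate (table) name plus one variable name per column.
pub struct BodyAtom<'a> {
    pub predicate: &'a str,
    pub vars: Vec<&'a str>,
}

/// Build a SELECT that enumerates distinct bindings of the `frontier` variables.
/// Panics if a frontier variable does not occur in the body.
pub fn body_to_sql(body: &[BodyAtom<'_>], frontier: &[&str]) -> String {
    // First column reference seen for each variable, e.g. "x" -> "t0.c0".
    let mut first: BTreeMap<String, String> = BTreeMap::new();
    let mut from = Vec::new();
    let mut conds = Vec::new();

    for (i, atom) in body.iter().enumerate() {
        from.push(format!("{} AS t{}", atom.predicate, i));
        for (j, var) in atom.vars.iter().enumerate() {
            let col = format!("t{}.c{}", i, j);
            let prev = first.get(*var).cloned();
            match prev {
                // A repeated variable becomes an equality (join) condition.
                Some(p) => conds.push(format!("{} = {}", p, col)),
                None => {
                    first.insert(var.to_string(), col);
                }
            }
        }
    }

    let select: Vec<String> = frontier
        .iter()
        .map(|v| format!("{} AS {}", first[*v], v))
        .collect();

    let mut sql = format!("SELECT DISTINCT {} FROM {}", select.join(", "), from.join(", "));
    if !conds.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&conds.join(" AND "));
    }
    sql
}
```

For the body `edge(x, y), path(y, z)` with frontier `x, z`, this produces `SELECT DISTINCT t0.c0 AS x, t1.c1 AS z FROM edge AS t0, path AS t1 WHERE t0.c1 = t1.c0`. The Rust side would then take these candidate bindings and decide which triggers to fire.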
Recommended MVP Scope
Start with a deliberately small fragment:
- positive rules only
- flat predicates only (after lowering)
- conjunctive bodies and heads
- existential variables in heads
- conjunctive queries
- no equality
- no negation
- no disjunction
- no solver-style unsat features
This should be enough for a useful first vertical slice.
Suggested Data Model
- Use one database table per predicate.
- Do not use SQL `NULL` as labeled nulls.
- Generate labeled null identities in Rust.
- Qualified Geolog names should lower to stable SQL-safe predicate names.
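One possible way to realize these points is sketched below: labeled nulls get identities from a Rust-side counter and are stored as ordinary non-`NULL` strings. The `c:` / `n:` prefix encoding, the `VARCHAR` column type, and the names `Value`, `NullGen`, and `create_table_sql` are all assumptions for illustration.

```rust
/// A stored value: a constant or a labeled null with an engine-assigned id.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Value {
    Const(String),
    Null(u64),
}

impl Value {
    /// Encode as a plain string so that SQL NULL is never needed.
    /// The "c:" / "n:" prefixes are an assumed convention.
    pub fn encode(&self) -> String {
        match self {
            Value::Const(s) => format!("c:{}", s),
            Value::Null(id) => format!("n:{}", id),
        }
    }

    /// Decode a stored string; returns None for unrecognized input.
    pub fn decode(s: &str) -> Option<Value> {
        if let Some(rest) = s.strip_prefix("c:") {
            Some(Value::Const(rest.to_string()))
        } else if let Some(rest) = s.strip_prefix("n:") {
            rest.parse::<u64>().ok().map(Value::Null)
        } else {
            None
        }
    }
}

/// Generates fresh labeled null identities on the Rust side.
pub struct NullGen {
    next: u64,
}

impl NullGen {
    pub fn new() -> Self {
        NullGen { next: 0 }
    }

    pub fn fresh(&mut self) -> Value {
        let id = self.next;
        self.next += 1;
        Value::Null(id)
    }
}

/// DDL for one predicate table; the primary key over all columns gives dedup.
pub fn create_table_sql(predicate: &str, columns: &[&str]) -> String {
    let cols: Vec<String> = columns
        .iter()
        .map(|c| format!("{} VARCHAR NOT NULL", c))
        .collect();
    format!(
        "CREATE TABLE IF NOT EXISTS {} ({}, PRIMARY KEY ({}))",
        predicate,
        cols.join(", "),
        columns.join(", ")
    )
}
```

Declaring every column `NOT NULL` keeps SQL's three-valued `NULL` comparison out of joins, so two labeled nulls compare equal exactly when their ids match.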
Example lowering ideas:
- `Edge : [src: V, tgt: V] -> Prop` -> predicate table `edge(src, tgt)`
- `src : E -> V` -> relation `src(e, v)` after desugaring
- `R/data : R -> [x: A, y: B]` -> relation `r_data(r, x, y)`
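The name part of these lowerings could be handled by a small function like the sketch below. The exact rules (lowercasing, replacing non-alphanumeric runs with `_`, prefixing `p_` when the result would start with a digit) are assumptions chosen to reproduce the examples, and `lower_name` is a hypothetical helper.

```rust
/// Lower a qualified name such as `R/data` to a SQL-safe identifier such as `r_data`.
/// Note: this mapping is not injective (`Edge` and `edge` collide), and it does
/// not avoid SQL reserved words; a real lowering would need to check both.
pub fn lower_name(qualified: &str) -> String {
    let mut out = String::new();
    for ch in qualified.chars() {
        if ch.is_ascii_alphanumeric() {
            out.push(ch.to_ascii_lowercase());
        } else if !out.ends_with('_') {
            // Collapse any run of other characters into a single underscore.
            out.push('_');
        }
    }
    // Avoid empty identifiers and identifiers that start with a digit.
    if out.chars().next().map_or(true, |c| c.is_ascii_digit()) {
        out.insert_str(0, "p_");
    }
    out
}
```

Because the mapping can collide, "stable" here would probably also require recording the chosen SQL name per qualified name and rejecting duplicates at lowering time.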
Restricted Chase in a DB-Backed Engine
- The key mechanism is trigger tracking.
- For each rule application, compute a canonical frontier binding.
- Store applied triggers in a dedicated table.
- Skip any body match whose frontier binding has already been applied.
- For existential heads, generate fresh labeled nulls in Rust and insert the corresponding derived facts.
High-level loop:
- Find body matches using SQL.
- Project frontier bindings.
- Filter out already-applied triggers.
- Generate existential witnesses in Rust when needed.
- Insert derived facts with dedup.
- Record applied triggers.
- Repeat until fixpoint.
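The loop above can be sketched in memory, with a nested-loop join standing in for SQL. All names here (`Value`, `Atom`, `Rule`, `body_matches`, `chase`) are hypothetical, atoms carry variables only, and `max_rounds` is an assumed safety bound because existential rules may not terminate. The sketch keys triggers on the frontier binding exactly as described; it does not additionally test whether the head is already satisfied, which a restricted chase would normally also do.

```rust
use std::collections::{BTreeMap, HashSet};

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Value {
    Const(String),
    Null(u64),
}

pub type Fact = (String, Vec<Value>);
pub type Binding = BTreeMap<String, Value>;

/// Atom with variable-only arguments (constants omitted to keep the sketch short).
pub struct Atom {
    pub pred: String,
    pub vars: Vec<String>,
}

pub struct Rule {
    pub body: Vec<Atom>,
    pub head: Vec<Atom>,
}

pub fn atom(pred: &str, vars: &[&str]) -> Atom {
    Atom { pred: pred.to_string(), vars: vars.iter().map(|v| v.to_string()).collect() }
}

/// Find body matches by extending partial bindings one atom at a time.
fn body_matches(body: &[Atom], facts: &HashSet<Fact>) -> Vec<Binding> {
    let mut partial = vec![Binding::new()];
    for a in body {
        let mut next = Vec::new();
        for b in &partial {
            for (pred, args) in facts {
                if *pred != a.pred || args.len() != a.vars.len() {
                    continue;
                }
                let mut ext = b.clone();
                // Bind unbound variables; reject on conflict with existing bindings.
                let ok = a.vars.iter().zip(args).all(|(v, val)| {
                    *ext.entry(v.clone()).or_insert_with(|| val.clone()) == *val
                });
                if ok {
                    next.push(ext);
                }
            }
        }
        partial = next;
    }
    partial
}

/// Run rounds until no new facts appear or `max_rounds` is hit; returns rounds used.
pub fn chase(rules: &[Rule], facts: &mut HashSet<Fact>, max_rounds: usize) -> usize {
    let mut applied: HashSet<(usize, Vec<Value>)> = HashSet::new();
    let mut next_null = 0u64;
    for round in 0..max_rounds {
        let mut derived: Vec<Fact> = Vec::new();
        for (i, rule) in rules.iter().enumerate() {
            let body_vars: HashSet<&String> =
                rule.body.iter().flat_map(|a| a.vars.iter()).collect();
            // Frontier variables in a canonical (sorted, deduplicated) order.
            let mut frontier: Vec<&String> = rule
                .head
                .iter()
                .flat_map(|a| a.vars.iter())
                .filter(|v| body_vars.contains(*v))
                .collect();
            frontier.sort();
            frontier.dedup();
            for binding in body_matches(&rule.body, facts) {
                // Project the frontier binding; skip and record in one step.
                let key: Vec<Value> = frontier.iter().map(|v| binding[*v].clone()).collect();
                if !applied.insert((i, key)) {
                    continue;
                }
                // Unbound head variables are existential: give each a fresh null.
                let mut full = binding.clone();
                for a in &rule.head {
                    let args: Vec<Value> = a
                        .vars
                        .iter()
                        .map(|v| {
                            full.entry(v.clone())
                                .or_insert_with(|| {
                                    next_null += 1;
                                    Value::Null(next_null - 1)
                                })
                                .clone()
                        })
                        .collect();
                    derived.push((a.pred.clone(), args));
                }
            }
        }
        // Insert derived facts; the set performs dedup.
        let before = facts.len();
        facts.extend(derived);
        if facts.len() == before {
            return round + 1;
        }
    }
    max_rounds
}
```

In a DB-backed engine, `body_matches` would be replaced by a SQL query, `applied` by the trigger table, and `facts.extend` by a deduplicating insert, while the null counter and the loop itself stay in Rust.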
Clean Backend Interface
- A clean backend trait is preferable to embedding raw SQL everywhere.
- The backend abstraction should be chase-shaped rather than a generic `execute_sql` wrapper.
Suggested responsibilities:
- ensure predicate storage exists
- load / append base facts
- evaluate a compiled rule body
- filter unseen triggers
- insert derived facts with dedup
- record applied triggers
- evaluate conjunctive queries over materialized facts
Also keep an in-memory backend as the semantic reference implementation.
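A chase-shaped trait along these lines might look like the sketch below, shown together with a tiny in-memory implementation. The trait name `ChaseBackend`, its method names, and the `Row` type are hypothetical. Rule-body evaluation and conjunctive queries are left out here because their signatures would depend on the relational IR; `scan` is included only so the example can be inspected.

```rust
use std::collections::{HashMap, HashSet};

/// One tuple of encoded values.
pub type Row = Vec<String>;

/// Hypothetical chase-shaped backend interface (partial).
pub trait ChaseBackend {
    /// Ensure storage for a predicate exists.
    fn ensure_predicate(&mut self, predicate: &str);
    /// Insert base or derived facts with dedup; returns how many were new.
    fn insert_facts(&mut self, predicate: &str, rows: Vec<Row>) -> usize;
    /// Keep only the frontier bindings not yet recorded for `rule_id`.
    fn filter_unseen(&self, rule_id: usize, bindings: Vec<Row>) -> Vec<Row>;
    /// Record frontier bindings as applied triggers for `rule_id`.
    fn record_triggers(&mut self, rule_id: usize, bindings: &[Row]);
    /// Return all rows of a predicate in sorted order.
    fn scan(&self, predicate: &str) -> Vec<Row>;
}

/// In-memory reference backend.
#[derive(Default)]
pub struct MemoryBackend {
    tables: HashMap<String, HashSet<Row>>,
    triggers: HashSet<(usize, Row)>,
}

impl ChaseBackend for MemoryBackend {
    fn ensure_predicate(&mut self, predicate: &str) {
        self.tables.entry(predicate.to_string()).or_default();
    }

    fn insert_facts(&mut self, predicate: &str, rows: Vec<Row>) -> usize {
        let table = self.tables.entry(predicate.to_string()).or_default();
        let mut added = 0;
        for row in rows {
            // HashSet::insert returns true only for rows not already present.
            if table.insert(row) {
                added += 1;
            }
        }
        added
    }

    fn filter_unseen(&self, rule_id: usize, bindings: Vec<Row>) -> Vec<Row> {
        bindings
            .into_iter()
            .filter(|b| !self.triggers.contains(&(rule_id, b.clone())))
            .collect()
    }

    fn record_triggers(&mut self, rule_id: usize, bindings: &[Row]) {
        for b in bindings {
            self.triggers.insert((rule_id, b.clone()));
        }
    }

    fn scan(&self, predicate: &str) -> Vec<Row> {
        let mut rows: Vec<Row> = self
            .tables
            .get(predicate)
            .map(|t| t.iter().cloned().collect())
            .unwrap_or_default();
        rows.sort();
        rows
    }
}
```

Keeping the trait this narrow means a database backend and the in-memory reference can be run against the same test programs and compared fact for fact.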
DuckDB as the First Database Backend
- DuckDB is a strong candidate for the first backend:
- embedded / in-process
- good analytical join performance
- simple deployment model
- good fit for batched chase rounds
- Recommended usage model:
- one engine process owns the database connection
- work in batches rather than many tiny transactions
- keep chase semantics in Rust
- This makes DuckDB a good first backend behind a clean database trait.
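To illustrate "work in batches", the sketch below builds a single multi-row `INSERT` statement with `?` placeholders for a whole round of derived facts. It only produces SQL text and does not call any DuckDB API. The `ON CONFLICT DO NOTHING` clause assumes the target table has a primary key or unique constraint over all columns and that the DuckDB version in use supports that clause; both should be checked. `batched_insert_sql` is a hypothetical helper.

```rust
/// Build one multi-row INSERT for `rows` facts of the given arity.
/// Values are bound separately through `?` placeholders, row by row.
pub fn batched_insert_sql(table: &str, arity: usize, rows: usize) -> String {
    // One placeholder tuple, e.g. "(?, ?)" for arity 2.
    let tuple = format!("({})", vec!["?"; arity].join(", "));
    // Repeat the tuple once per row in the batch.
    let values = vec![tuple; rows].join(", ");
    format!("INSERT INTO {} VALUES {} ON CONFLICT DO NOTHING", table, values)
}
```

For example, `batched_insert_sql("edge", 2, 2)` yields `INSERT INTO edge VALUES (?, ?), (?, ?) ON CONFLICT DO NOTHING`. Issuing one such statement per predicate per chase round, inside one transaction, is one way to avoid many tiny transactions.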
Suggested Implementation Order
- Define a relational IR for predicates, rules, and queries.
- Define a clean backend trait for fact storage and rule evaluation.
- Keep an in-memory backend as the reference implementation.
- Implement a `DuckDbBackend`.
- Build a minimal `geolog-lite` parser and lowering pipeline.
- Run a first end-to-end example such as transitive closure.
Good First Example
- `examples/geolog/transitive_closure.geolog` is an ideal first target.
- It exercises:
- theory parsing
- predicate lowering
- basic chase materialization
- recursive closure
- simple conjunctive querying
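For checking an end-to-end run, a small independent oracle can compute the expected closure directly. The sketch below assumes the example lowers to the usual two rules, `edge(x, y) -> path(x, y)` and `path(x, y), edge(y, z) -> path(x, z)`; the actual contents of the `.geolog` file and the predicate names are not confirmed here, and `transitive_closure` is a hypothetical helper.

```rust
use std::collections::BTreeSet;

/// Naive fixpoint for the assumed rules:
///   edge(x, y) -> path(x, y)
///   path(x, y), edge(y, z) -> path(x, z)
pub fn transitive_closure(edges: &BTreeSet<(u32, u32)>) -> BTreeSet<(u32, u32)> {
    // First rule: every edge is a path.
    let mut path: BTreeSet<(u32, u32)> = edges.clone();
    loop {
        // Second rule: extend each known path by one edge.
        let mut new = Vec::new();
        for &(x, y) in &path {
            for &(y2, z) in edges {
                if y == y2 && !path.contains(&(x, z)) {
                    new.push((x, z));
                }
            }
        }
        // Fixpoint: stop once a full pass derives nothing new.
        if new.is_empty() {
            return path;
        }
        path.extend(new);
    }
}
```

On the chain `1 -> 2 -> 3 -> 4` this yields six `path` tuples, which is the set a correct chase materialization followed by a query for all `path(x, y)` should return.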