geolog-zeta-fork/docs/ARCHITECTURE.md

256 lines
18 KiB
Markdown
Raw Permalink Normal View History

2026-02-26 11:50:51 +01:00
# Geolog Architecture
Geolog is a language for geometric logic with semantics in topoi. This document describes the module structure and data flow.
## Module Overview
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ USER INTERFACE │
├─────────────────────────────────────────────────────────────────────────────┤
│ repl.rs Interactive REPL with commands (:help, :inspect, etc.) │
│ bin/geolog.rs CLI entry point │
└───────────────────────────────┬─────────────────────────────────────────────┘
┌───────────────────────────────▼─────────────────────────────────────────────┐
│ PARSING / SURFACE LAYER │
├─────────────────────────────────────────────────────────────────────────────┤
│ lexer.rs Tokenization (chumsky-based) │
│ parser.rs Token stream → AST (chumsky-based) │
│ ast.rs Surface syntax AST types │
│ pretty.rs Core → geolog source (inverse of parsing) │
│ error.rs Error formatting with source spans │
└───────────────────────────────┬─────────────────────────────────────────────┘
┌───────────────────────────────▼─────────────────────────────────────────────┐
│ ELABORATION LAYER │
├─────────────────────────────────────────────────────────────────────────────┤
│ elaborate/ │
│ ├── mod.rs Re-exports │
│ ├── env.rs Elaboration environment (theory registry) │
│ ├── theory.rs AST theory → Core theory elaboration │
│ ├── instance.rs AST instance → Core structure elaboration │
│ └── error.rs Elaboration error types │
│ │
│ Transforms surface AST into typed core representation │
└───────────────────────────────┬─────────────────────────────────────────────┘
┌───────────────────────────────▼─────────────────────────────────────────────┐
│ CORE LAYER │
├─────────────────────────────────────────────────────────────────────────────┤
│ core.rs Core IR: Signature, Term, Formula, Structure │
│ - Signature: sorts + functions + relations │
│ - Term: Var | App | Record | Project │
│ - Formula: True | False | Eq | Rel | Conj | Disj | Exists │
│ - Structure: carriers + function maps + relation storage │
│ │
│ id.rs Identity system (Luid = global, Slid = structure-local) │
│ universe.rs Global element registry (Luid allocation) │
│ naming.rs Bidirectional name ↔ Luid mapping │
└───────────────────────────────┬─────────────────────────────────────────────┘
┌───────────────────────────────▼─────────────────────────────────────────────┐
│ STORAGE LAYER │
├─────────────────────────────────────────────────────────────────────────────┤
│ store/ │
│ ├── mod.rs Store struct: unified GeologMeta persistence │
│ ├── schema.rs Schema ID caches (sort_ids, func_ids, etc.) │
│ ├── append.rs Append-only element/function/relation creation │
│ ├── theory.rs Theory → Store integration │
│ ├── instance.rs Instance → Store integration │
│ ├── commit.rs Git-like commit/version control │
│ └── bootstrap_queries.rs Hardcoded query patterns (being replaced) │
│ │
│ workspace.rs Legacy session management (deprecated, use Store) │
│ patch.rs Patch-based structure modifications │
│ version.rs Git-like version control for structures │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ QUERY LAYER │
├─────────────────────────────────────────────────────────────────────────────┤
│ query/ │
│ ├── mod.rs Re-exports and overview │
│ ├── chase.rs Chase algorithm for existential/equality conclusions │
│ │ - chase_fixpoint_with_cc(): main entry point │
│ │ - Integrates CongruenceClosure for equality saturation│
│ ├── compile.rs Query → QueryOp plan compilation │
│ ├── backend.rs Naive QueryOp executor (reference impl) │
│ ├── optimize.rs Algebraic law rewriting (filter fusion, etc.) │
│ ├── pattern.rs Legacy Pattern API (deprecated) │
│ └── store_queries.rs Store-level compiled query methods │
│ │
│ Relational query engine for GeologMeta and instance queries. │
│ Query API: Query::scan(sort).filter_eq(func, col, val).compile() │
│ Optimizer applies RelAlgIR laws: Filter(p, Filter(q, x)) → Filter(p∧q, x) │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ SOLVING LAYER (frontier) │
├─────────────────────────────────────────────────────────────────────────────┤
│ cc.rs Congruence closure (shared by solver + chase) │
│ - Element equivalence tracking with union-find │
│ - Used for equality conclusion axioms │
│ │
│ solver/ │
│ ├── mod.rs Unified model enumeration API + re-exports │
│ │ - enumerate_models(): core unified function │
│ │ - solve(): find models from scratch │
│ │ - query(): extend existing models │
│ ├── types.rs SearchNode, Obligation, NodeStatus (re-exports cc::*) │
│ ├── tree.rs Explicit search tree with from_base() for extensions │
│ └── tactics.rs Automated search tactics: │
│ - CheckTactic: axiom checking, obligation reporting │
│ - ForwardChainingTactic: Datalog-style forward chaining │
│ - PropagateEquationsTactic: congruence closure propagation│
│ - AutoTactic: composite fixpoint solver │
│ │
│ REPL commands: `:solve <theory>`, `:extend <instance> <theory>`
│ See examples/geolog/solver_demo.geolog for annotated examples. │
│ │
│ tensor/ │
│ ├── mod.rs Re-exports │
│ ├── expr.rs Lazy tensor expression trees │
│ ├── sparse.rs Sparse tensor storage (RoaringTreemap) │
│ ├── builder.rs Expression builders (conjunction, disjunction, exists) │
│ ├── compile.rs Formula → TensorExpr compilation │
│ └── check.rs Axiom checking via tensor evaluation │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ META LAYER (self-description) │
├─────────────────────────────────────────────────────────────────────────────┤
│ meta.rs Rust codegen for GeologMeta theory │
│ theories/GeologMeta.geolog Homoiconic theory representation │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Data Flow
### Parsing / Pretty-Printing Flow
```
Source text → lexer.rs → Token stream → parser.rs → ast::File
core::Structure ← elaborate ←──────────────────────── ast::*
pretty.rs → Source text (roundtrip!)
```
### Elaboration Flow
```
ast::TheoryDecl → elaborate/theory.rs → core::Theory (Signature + Axioms)
ast::InstanceDecl → elaborate/instance.rs → core::Structure
```
### REPL Flow
```
User input → ReplState::process_line → MetaCommand | GeologInput
GeologInput → parse → elaborate → workspace.add_*
```
## Key Types
### Identity System
```rust
Luid // "Local Universe ID" - globally unique across all structures
Slid // "Structure-Local ID" - index within a single structure
// A Structure maps Slid → Luid for global identity
structure.get_luid(slid) -> Luid
```
### Core Representation
```rust
// Signatures define the vocabulary
Signature {
sorts: Vec<String>, // Sort names by SortId
functions: Vec<FunctionSymbol>, // f : A → B
relations: Vec<RelationSymbol>, // R : A → Prop
}
// Structures interpret signatures
Structure {
carriers: Vec<RoaringBitmap>, // Elements per sort (as Slid)
functions: Vec<HashMap<...>>, // Function value maps
relations: Vec<VecRelation>, // Relation extents
local_to_global: Vec<Luid>, // Slid → Luid
}
```
### Axioms (Sequents)
```rust
Sequent {
context: Context, // Universally quantified variables
premise: Formula, // Antecedent (conjunction of atomics)
conclusion: Formula, // Consequent (positive geometric formula)
}
```
## Design Principles
1. **Postfix application**: `x f` not `f(x)` — matches categorical composition
2. **Child pointers**: Parent → Child, not Child → Parent (no products in domains)
3. **Upward binding**: Variables point to their binders (scoping is explicit)
4. **Sparse storage**: Relations use RoaringBitmap for efficient membership
5. **Patch-based updates**: Structures evolve via patches, enabling versioning
6. **Explicit search tree**: Solver maintains tree in memory, not call stack
## Testing Strategy
- **proptest**: Property-based tests for core operations (naming, patches, structure)
- **unit tests**: Specific behaviors in `tests/unit_*.rs`
- **integration tests**: Example .geolog files in `tests/examples_integration.rs`
- **REPL testing**: Interactive exploration via `cargo run`
## Future Directions
See `bd ready` for current work items. Key frontiers:
- **Query engine** (`geolog-7tt`, `geolog-32x`): Chase algorithm and RelAlgIR compiler
- **Nested instance elaboration** (`geolog-1d4`): Inline instance definitions
- **Monotonic Submodel proofs** (`geolog-rgg`): Lean4 formalization
- **Disjunction variable alignment** (`geolog-69b`): Extend tensor builder for heterogeneous disjuncts
## Recent Milestones
- **Unified model enumeration API** (`2026-01-19`): Consolidated `solve()`, `extend()`, and `query()`
into single `enumerate_models()` function. REPL commands `:solve` and `:extend` now share underlying implementation.
- **Tensor compiler improvements** (`2026-01-20`):
- Function application equalities: `f(x) = y`, `y = f(x)`, `f(x) = g(y)` now compile correctly
- Empty-domain existential fix: `∃x. φ` on empty domain correctly returns false
- Closed `geolog-dxr` (tensor compilation panics on function terms)
- **Bootstrap query migration** (`2026-01-20`): All 6 bootstrap_queries functions now delegate
to compiled query engine (`store_queries.rs`). Net reduction of ~144 lines of handcoded iteration.
- **Proptest coverage** (`2026-01-20`): Added 6 solver proptests covering trivial theories,
inconsistent theories, existential theories, and Horn clause propagation.
- **Theory extends fix** (`2026-01-20`): Fixed bug where function names like `Func/dom` (using `/`
as naming convention) were incorrectly treated as grandparent-qualified names. RelAlgIR.geolog
now loads correctly, unblocking homoiconic query plan work (`geolog-32x`).
- **:explain REPL command** (`2026-01-20`): Added `:explain <instance> <sort>` to show query
execution plans, with Display impl for QueryOp using math notation (∫, δ, z⁻¹, ×, ∧, ).
- **Geometric logic solver complete** (`geolog-xj2`): Forward chaining, equation propagation,
existential body processing, derivation search for False. Interactive via `:solve`.
- **Chase with equality saturation** (`2026-01-21`): Chase algorithm now integrates congruence
closure (CC) for handling equality conclusion axioms like `R(x,y) |- x = y`. CC tracks
element equivalences and canonicalizes structures after chase converges. This enables
Category theory to terminate correctly: unit law axioms collapse infinite `id;id;...`
compositions. Added `src/cc.rs` as shared module for both solver and chase.
- **Chase proptests** (`2026-01-21`): Added property-based tests for reflexivity, transitivity,
existential conclusions, and equality conclusions. Multi-session persistence tests verify
chase results survive REPL restart.
- **Fuzzing infrastructure** (`2026-01-21`): Added `fuzz/` directory with `fuzz_parser` and
`fuzz_repl` targets for finding edge cases. Requires nightly Rust.