geolog-zeta-fork/docs/ARCHITECTURE.md
2026-02-26 11:50:51 +01:00

18 KiB
Raw Blame History

Geolog Architecture

Geolog is a language for geometric logic with semantics in topoi. This document describes the module structure and data flow.

Module Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                              USER INTERFACE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│  repl.rs          Interactive REPL with commands (:help, :inspect, etc.)   │
│  bin/geolog.rs    CLI entry point                                           │
└───────────────────────────────┬─────────────────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────────────────┐
│                           PARSING / SURFACE LAYER                            │
├─────────────────────────────────────────────────────────────────────────────┤
│  lexer.rs         Tokenization (chumsky-based)                              │
│  parser.rs        Token stream → AST (chumsky-based)                        │
│  ast.rs           Surface syntax AST types                                  │
│  pretty.rs        Core → geolog source (inverse of parsing)                 │
│  error.rs         Error formatting with source spans                        │
└───────────────────────────────┬─────────────────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────────────────┐
│                            ELABORATION LAYER                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│  elaborate/                                                                 │
│    ├── mod.rs     Re-exports                                                │
│    ├── env.rs     Elaboration environment (theory registry)                 │
│    ├── theory.rs  AST theory → Core theory elaboration                      │
│    ├── instance.rs AST instance → Core structure elaboration                │
│    └── error.rs   Elaboration error types                                   │
│                                                                             │
│  Transforms surface AST into typed core representation                      │
└───────────────────────────────┬─────────────────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────────────────┐
│                              CORE LAYER                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│  core.rs          Core IR: Signature, Term, Formula, Structure              │
│                   - Signature: sorts + functions + relations                │
│                   - Term: Var | App | Record | Project                      │
│                   - Formula: True | False | Eq | Rel | Conj | Disj | Exists │
│                   - Structure: carriers + function maps + relation storage  │
│                                                                             │
│  id.rs            Identity system (Luid = global, Slid = structure-local)   │
│  universe.rs      Global element registry (Luid allocation)                 │
│  naming.rs        Bidirectional name ↔ Luid mapping                         │
└───────────────────────────────┬─────────────────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────────────────┐
│                            STORAGE LAYER                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│  store/                                                                     │
│    ├── mod.rs           Store struct: unified GeologMeta persistence        │
│    ├── schema.rs        Schema ID caches (sort_ids, func_ids, etc.)         │
│    ├── append.rs        Append-only element/function/relation creation      │
│    ├── theory.rs        Theory → Store integration                          │
│    ├── instance.rs      Instance → Store integration                        │
│    ├── commit.rs        Git-like commit/version control                     │
│    └── bootstrap_queries.rs  Hardcoded query patterns (being replaced)      │
│                                                                             │
│  workspace.rs     Legacy session management (deprecated, use Store)         │
│  patch.rs         Patch-based structure modifications                       │
│  version.rs       Git-like version control for structures                   │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                            QUERY LAYER                                       │
├─────────────────────────────────────────────────────────────────────────────┤
│  query/                                                                     │
│    ├── mod.rs         Re-exports and overview                               │
│    ├── chase.rs       Chase algorithm for existential/equality conclusions  │
│    │                  - chase_fixpoint_with_cc(): main entry point          │
│    │                  - Integrates CongruenceClosure for equality saturation│
│    ├── compile.rs     Query → QueryOp plan compilation                      │
│    ├── backend.rs     Naive QueryOp executor (reference impl)               │
│    ├── optimize.rs    Algebraic law rewriting (filter fusion, etc.)         │
│    ├── pattern.rs     Legacy Pattern API (deprecated)                       │
│    └── store_queries.rs  Store-level compiled query methods                 │
│                                                                             │
│  Relational query engine for GeologMeta and instance queries.               │
│  Query API: Query::scan(sort).filter_eq(func, col, val).compile()           │
│  Optimizer applies RelAlgIR laws: Filter(p, Filter(q, x)) → Filter(p∧q, x)  │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                         SOLVING LAYER (frontier)                             │
├─────────────────────────────────────────────────────────────────────────────┤
│  cc.rs            Congruence closure (shared by solver + chase)             │
│                   - Element equivalence tracking with union-find            │
│                   - Used for equality conclusion axioms                     │
│                                                                             │
│  solver/                                                                    │
│    ├── mod.rs     Unified model enumeration API + re-exports                │
│    │              - enumerate_models(): core unified function               │
│    │              - solve(): find models from scratch                       │
│    │              - query(): extend existing models                         │
│    ├── types.rs   SearchNode, Obligation, NodeStatus (re-exports cc::*)     │
│    ├── tree.rs    Explicit search tree with from_base() for extensions      │
│    └── tactics.rs Automated search tactics:                                 │
│                   - CheckTactic: axiom checking, obligation reporting       │
│                   - ForwardChainingTactic: Datalog-style forward chaining   │
│                   - PropagateEquationsTactic: congruence closure propagation│
│                   - AutoTactic: composite fixpoint solver                   │
│                                                                             │
│  REPL commands: `:solve <theory>`, `:extend <instance> <theory>`            │
│  See examples/geolog/solver_demo.geolog for annotated examples.             │
│                                                                             │
│  tensor/                                                                    │
│    ├── mod.rs     Re-exports                                                │
│    ├── expr.rs    Lazy tensor expression trees                              │
│    ├── sparse.rs  Sparse tensor storage (RoaringTreemap)                    │
│    ├── builder.rs Expression builders (conjunction, disjunction, exists)   │
│    ├── compile.rs Formula → TensorExpr compilation                          │
│    └── check.rs   Axiom checking via tensor evaluation                      │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                         META LAYER (self-description)                        │
├─────────────────────────────────────────────────────────────────────────────┤
│  meta.rs          Rust codegen for GeologMeta theory                        │
│  theories/GeologMeta.geolog   Homoiconic theory representation              │
└─────────────────────────────────────────────────────────────────────────────┘

Data Flow

Parsing / Pretty-Printing Flow

Source text → lexer.rs → Token stream → parser.rs → ast::File
                                                         ↓
core::Structure ← elaborate ←──────────────────────── ast::*
       ↓
pretty.rs → Source text   (roundtrip!)

Elaboration Flow

ast::TheoryDecl  →  elaborate/theory.rs  →  core::Theory (Signature + Axioms)
ast::InstanceDecl →  elaborate/instance.rs → core::Structure

REPL Flow

User input → ReplState::process_line → MetaCommand | GeologInput
                                              ↓
                    GeologInput → parse → elaborate → workspace.add_*

Key Types

Identity System

Luid  // "Local Universe ID" - globally unique across all structures
Slid  // "Structure-Local ID" - index within a single structure

// A Structure maps Slid → Luid for global identity
structure.get_luid(slid) -> Luid

Core Representation

// Signatures define the vocabulary
Signature {
    sorts: Vec<String>,           // Sort names by SortId
    functions: Vec<FunctionSymbol>,  // f : A → B
    relations: Vec<RelationSymbol>,  // R : A → Prop
}

// Structures interpret signatures
Structure {
    carriers: Vec<RoaringBitmap>,  // Elements per sort (as Slid)
    functions: Vec<HashMap<...>>,  // Function value maps
    relations: Vec<VecRelation>,   // Relation extents
    local_to_global: Vec<Luid>,    // Slid → Luid
}

Axioms (Sequents)

Sequent {
    context: Context,      // Universally quantified variables
    premise: Formula,      // Antecedent (conjunction of atomics)
    conclusion: Formula,   // Consequent (positive geometric formula)
}

Design Principles

  1. Postfix application: x f not f(x) — matches categorical composition
  2. Child pointers: Parent → Child, not Child → Parent (no products in domains)
  3. Upward binding: Variables point to their binders (scoping is explicit)
  4. Sparse storage: Relations use RoaringBitmap for efficient membership
  5. Patch-based updates: Structures evolve via patches, enabling versioning
  6. Explicit search tree: Solver maintains tree in memory, not call stack

Testing Strategy

  • proptest: Property-based tests for core operations (naming, patches, structure)
  • unit tests: Specific behaviors in tests/unit_*.rs
  • integration tests: Example .geolog files in tests/examples_integration.rs
  • REPL testing: Interactive exploration via cargo run

Future Directions

See bd ready for current work items. Key frontiers:

  • Query engine (geolog-7tt, geolog-32x): Chase algorithm and RelAlgIR compiler
  • Nested instance elaboration (geolog-1d4): Inline instance definitions
  • Monotonic Submodel proofs (geolog-rgg): Lean4 formalization
  • Disjunction variable alignment (geolog-69b): Extend tensor builder for heterogeneous disjuncts

Recent Milestones

  • Unified model enumeration API (2026-01-19): Consolidated solve(), extend(), and query() into single enumerate_models() function. REPL commands :solve and :extend now share underlying implementation.

  • Tensor compiler improvements (2026-01-20):

    • Function application equalities: f(x) = y, y = f(x), f(x) = g(y) now compile correctly
    • Empty-domain existential fix: ∃x. φ on empty domain correctly returns false
    • Closed geolog-dxr (tensor compilation panics on function terms)
  • Bootstrap query migration (2026-01-20): All 6 bootstrap_queries functions now delegate to compiled query engine (store_queries.rs). Net reduction of ~144 lines of handcoded iteration.

  • Proptest coverage (2026-01-20): Added 6 solver proptests covering trivial theories, inconsistent theories, existential theories, and Horn clause propagation.

  • Theory extends fix (2026-01-20): Fixed bug where function names like Func/dom (using / as naming convention) were incorrectly treated as grandparent-qualified names. RelAlgIR.geolog now loads correctly, unblocking homoiconic query plan work (geolog-32x).

  • :explain REPL command (2026-01-20): Added :explain <instance> <sort> to show query execution plans, with Display impl for QueryOp using math notation (∫, δ, z⁻¹, ×, ∧, ).

  • Geometric logic solver complete (geolog-xj2): Forward chaining, equation propagation, existential body processing, derivation search for False. Interactive via :solve.

  • Chase with equality saturation (2026-01-21): Chase algorithm now integrates congruence closure (CC) for handling equality conclusion axioms like R(x,y) |- x = y. CC tracks element equivalences and canonicalizes structures after chase converges. This enables Category theory to terminate correctly: unit law axioms collapse infinite id;id;... compositions. Added src/cc.rs as shared module for both solver and chase.

  • Chase proptests (2026-01-21): Added property-based tests for reflexivity, transitivity, existential conclusions, and equality conclusions. Multi-session persistence tests verify chase results survive REPL restart.

  • Fuzzing infrastructure (2026-01-21): Added fuzz/ directory with fuzz_parser and fuzz_repl targets for finding edge cases. Requires nightly Rust.