Geolog Project Notes
Overview
Geolog is a REPL for geometric logic: a type theory with semantics in topoi, designed for writing formal specifications.
Core Capabilities
- Geometric logic programming — encode mathematical structures, relationships, and constraints
- Database schema definition — define sorts, functions, relations, and axioms
- Model/instance creation — create concrete finite models satisfying theory axioms
- Automated inference — chase algorithm for automatic fact derivation
- Version control — git-like commits and tracking for instances
- Persistence — append-only storage with optional disk persistence
Use Cases
- Business process workflow orchestration
- Formal verification via diagrammatic rewriting
- Database query design
- Petri net reachability and process modeling
Tech Stack
Primary Language: Rust (2021 edition, Cargo-based)
Key Dependencies
| Crate | Version | Purpose |
|---|---|---|
| `chumsky` | 0.9 | Parser combinator library |
| `ariadne` | 0.4 | Error reporting with source spans |
| `rkyv` | 0.7 | Zero-copy serialization |
| `rustyline` | 15 | REPL readline interface |
| `egglog-union-find` | 1.0 | Union-find for congruence closure |
| `roaring` | 0.10 | Bitmap library for sparse relations |
| `indexmap` | 2.0 | Order-preserving hash maps |
| `uuid` | 1 | UUID generation |
| `memmap2` | 0.9 | Memory-mapped file I/O |
Testing Frameworks
- `insta`: snapshot testing
- `proptest`: property-based testing
- `tempfile`: temporary directory management
Architecture
┌─────────────────────────────────────────────────────┐
│ USER INTERFACE │
│ REPL (interactive CLI) | Batch file loading │
├─────────────────────────────────────────────────────┤
│ PARSING LAYER (Lexer → Parser → AST) │
│ chumsky-based lexer & parser, source error reporting│
├─────────────────────────────────────────────────────┤
│ ELABORATION LAYER (AST → Core IR) │
│ Type checking, name resolution, theory/instance │
├─────────────────────────────────────────────────────┤
│ CORE LAYER (Typed Representation) │
│ Signature, Term, Formula, Structure, ElaboratedTheory│
├─────────────────────────────────────────────────────┤
│ STORAGE LAYER (Persistence) │
│ Append-only GeologMeta store with version control │
├─────────────────────────────────────────────────────┤
│ QUERY & SOLVER LAYER (Execution) │
│ Chase algorithm, congruence closure, relational │
│ algebra compiler, SMT-style model enumeration │
├─────────────────────────────────────────────────────┤
│ TENSOR ALGEBRA (Axiom Checking) │
│ Sparse tensor evaluation for axiom validation │
└─────────────────────────────────────────────────────┘
Directory Structure
| Path | Purpose |
|---|---|
| `src/bin/geolog.rs` | CLI entry point |
| `src/lib.rs` | Library root, exports `parse()` |
| `src/repl.rs` | Interactive REPL state machine |
| `src/lexer.rs` | Tokenization using chumsky |
| `src/parser.rs` | Token stream → AST |
| `src/ast.rs` | Abstract syntax tree types |
| `src/core.rs` | Core IR: Signature, Term, Formula, Structure |
| `src/elaborate/` | AST → Core elaboration |
| `src/store/` | Persistence layer (append-only) |
| `src/query/` | Chase algorithm, relational algebra |
| `src/solver/` | SMT-style model enumeration |
| `src/tensor/` | Sparse tensor algebra for axiom checking |
| `src/cc.rs` | Congruence closure (union-find) |
| `src/id.rs` | Luid/Slid identity system |
| `src/universe.rs` | Global element registry |
| `examples/geolog/` | 30+ example `.geolog` files |
| `tests/` | 25+ test files |
| `docs/` | ARCHITECTURE.md, SYNTAX.md |
| `proofs/` | Lean4 formalization |
| `fuzz/` | Fuzzing targets |
Main Components
Parsing & Syntax (~1,200 lines)
- `lexer.rs`: tokenization
- `parser.rs`: token stream → AST
- `ast.rs`: AST types (Theory, Instance, Axiom, etc.)
- `error.rs`: error formatting with source spans
- `pretty.rs`: Core → Geolog source roundtrip printing
Elaboration (~2,200 lines)
- `elaborate/mod.rs`: coordination
- `elaborate/theory.rs`: AST Theory → Core ElaboratedTheory
- `elaborate/instance.rs`: AST Instance → Core Structure
- `elaborate/env.rs`: environment with theory registry
- `elaborate/types.rs`: type expression evaluation
- `elaborate/error.rs`: type error reporting
Core Representation
- `core.rs`: DerivedSort, Signature, Structure, Formula, Term, Sequent
- `id.rs`: Luid (global unique ID) and Slid (structure-local ID)
- `universe.rs`: global element registry with UUID ↔ Luid mapping
- `naming.rs`: bidirectional name ↔ Luid mapping
Storage Layer (~1,500 lines)
- `store/mod.rs`: main Store struct
- `store/schema.rs`: cached sort/function/relation IDs
- `store/append.rs`: low-level element append operations
- `store/theory.rs`: theory CRUD
- `store/instance.rs`: instance CRUD
- `store/commit.rs`: git-like version control
- `store/materialize.rs`: indexed views for fast lookups
Query & Compilation (~3,500 lines)
- `query/compile.rs`: Query → RelAlgIR plan compilation
- `query/to_relalg.rs`: Query → Relational Algebra IR
- `query/from_relalg.rs`: RelAlgIR → Executable QueryOp
- `query/chase.rs`: chase algorithm for fixpoint computation
- `query/backend.rs`: naive QueryOp executor
- `query/optimize.rs`: algebraic law rewriting
Solver & Model Enumeration (~1,300 lines)
- `solver/mod.rs`: unified model enumeration API
- `solver/tree.rs`: explicit search tree for partial models
- `solver/tactics.rs`: automated search strategies:
  - CheckTactic: axiom validation
  - ForwardChainingTactic: Datalog-style inference
  - PropagateEquationsTactic: congruence closure
  - AutoTactic: composite fixpoint solver
- `solver/types.rs`: SearchNode, Obligation, NodeStatus types
Tensor Algebra (~2,600 lines)
- `tensor/expr.rs`: lazy tensor expression trees
- `tensor/sparse.rs`: sparse tensor storage (RoaringBitmap-based)
- `tensor/builder.rs`: expression builders
- `tensor/compile.rs`: Formula → TensorExpr compilation
- `tensor/check.rs`: axiom checking via tensor evaluation
Key Entry Points
- CLI: `src/bin/geolog.rs`. Usage: `geolog [-d <workspace>] [source_files...]`
- Parse Entry: `src/lib.rs` exports `parse(input: &str) -> Result<File, String>`
- REPL State: `src/repl.rs`, via `ReplState::process_line()`
- Theory Elaboration: `elaborate/theory.rs::elaborate_theory()`
- Instance Elaboration: `elaborate/instance.rs::elaborate_instance_ctx()`
- Chase Algorithm: `query/chase.rs::chase_fixpoint_with_cc()`
- Model Enumeration: `solver/mod.rs::enumerate_models()`
Design Decisions
Geometric Logic Foundation
- Axioms as Sequents: `forall vars. premises |- conclusion`
- Positive Conclusions: Can have existentials, disjunctions, but never negations
- Geometric Morphisms: Preserved by design, enabling category-theoretic semantics
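One way to picture the "positive conclusions" restriction is a formula type that simply has no negation variant, so a non-geometric axiom cannot be constructed at all. The sketch below is an illustration only: the `Term`, `Formula`, and `Sequent` shapes, the `le` relation, and the `has_existential` helper are hypothetical and are not taken from `src/core.rs`.

```rust
// Hypothetical shapes for illustration; the real types in src/core.rs may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Term {
    Var(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Formula {
    True,
    False,
    Rel(String, Vec<Term>),
    Eq(Term, Term),
    And(Vec<Formula>),
    Or(Vec<Formula>),
    Exists(String, Box<Formula>),
    // Deliberately no Not, Implies, or Forall variant.
}

/// A sequent `forall vars. premise |- conclusion`.
#[derive(Debug, Clone, PartialEq)]
pub struct Sequent {
    pub vars: Vec<String>,
    pub premise: Formula,
    pub conclusion: Formula,
}

fn var(name: &str) -> Term {
    Term::Var(name.to_string())
}

/// forall x y z. le(x, y), le(y, z) |- le(x, z)
pub fn transitivity() -> Sequent {
    let le = |a: &str, b: &str| Formula::Rel("le".to_string(), vec![var(a), var(b)]);
    Sequent {
        vars: vec!["x".into(), "y".into(), "z".into()],
        premise: Formula::And(vec![le("x", "y"), le("y", "z")]),
        conclusion: le("x", "z"),
    }
}

/// True when satisfying the formula may require a fresh witness element.
pub fn has_existential(f: &Formula) -> bool {
    match f {
        Formula::Exists(..) => true,
        Formula::And(fs) | Formula::Or(fs) => fs.iter().any(has_existential),
        _ => false,
    }
}
```

Encoding the restriction in the type, rather than checking it after parsing, means every later pass can assume it holds.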
Identity System
- Luid ("Local Universe ID"): Globally unique across all structures
- Slid ("Structure-Local ID"): Index within a single structure
- Bidirectional mapping enables persistent identity despite structure changes
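The two-level identity scheme can be sketched as a pair of newtypes plus a per-structure table that maps in both directions. This is a minimal sketch under assumptions: the `Carrier` type and its methods are hypothetical, and the real `Luid`/`Slid` definitions in `src/id.rs` may use different representations.

```rust
use std::collections::HashMap;

/// Globally unique element ID (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Luid(pub u64);

/// Index of an element within one structure (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Slid(pub usize);

/// Per-structure table holding both directions of the mapping.
#[derive(Debug, Default)]
pub struct Carrier {
    luids: Vec<Luid>,           // Slid -> Luid
    slids: HashMap<Luid, Slid>, // Luid -> Slid
}

impl Carrier {
    /// Adds an element, or returns its existing local index.
    pub fn insert(&mut self, luid: Luid) -> Slid {
        if let Some(&slid) = self.slids.get(&luid) {
            return slid;
        }
        let slid = Slid(self.luids.len());
        self.luids.push(luid);
        self.slids.insert(luid, slid);
        slid
    }

    pub fn luid_of(&self, slid: Slid) -> Option<Luid> {
        self.luids.get(slid.0).copied()
    }

    pub fn slid_of(&self, luid: Luid) -> Option<Slid> {
        self.slids.get(&luid).copied()
    }
}
```

The same `Luid` can sit at different local indices in different structures, which is what lets an element keep its identity while the structures around it change.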
Append-Only Storage
- GeologMeta: Single homoiconic theory instance storing all data
- Patch-based Versioning: Each commit is a delta from parent
- Never Delete: Elements only tombstoned for perfect audit trails
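The combination of tombstones and delta commits can be illustrated with a small in-memory log. This is a sketch under stated assumptions, not the GeologMeta layout: `Entry`, `Commit`, and `Log` are hypothetical names, history is linear, and nothing is written to disk.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    Add { id: u64, payload: String },
    Tombstone { id: u64 }, // marks an element as retracted; nothing is erased
}

/// A commit is a delta: the slice of log entries appended since its parent.
#[derive(Debug)]
pub struct Commit {
    pub parent: Option<usize>,
    pub start: usize,
    pub end: usize,
}

#[derive(Debug, Default)]
pub struct Log {
    entries: Vec<Entry>,
    commits: Vec<Commit>,
}

impl Log {
    pub fn add(&mut self, id: u64, payload: &str) {
        self.entries.push(Entry::Add { id, payload: payload.to_string() });
    }

    pub fn retract(&mut self, id: u64) {
        self.entries.push(Entry::Tombstone { id });
    }

    /// Seals everything appended since the previous commit (linear history).
    pub fn commit(&mut self) -> usize {
        let start = self.commits.last().map_or(0, |c| c.end);
        let parent = self.commits.len().checked_sub(1);
        self.commits.push(Commit { parent, start, end: self.entries.len() });
        self.commits.len() - 1
    }

    /// The entries introduced by one commit.
    pub fn delta(&self, commit: usize) -> &[Entry] {
        let c = &self.commits[commit];
        &self.entries[c.start..c.end]
    }

    /// Replays the log up to a commit and returns the IDs still live there.
    pub fn live_at(&self, commit: usize) -> BTreeSet<u64> {
        let mut live = BTreeSet::new();
        for entry in &self.entries[..self.commits[commit].end] {
            match entry {
                Entry::Add { id, .. } => {
                    live.insert(*id);
                }
                Entry::Tombstone { id } => {
                    live.remove(id);
                }
            }
        }
        live
    }
}
```

Because a retraction is just another appended entry, any earlier commit can still be replayed exactly as it was.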
Type System
- Postfix Application: `x f` not `f(x)` (categorical style)
- Derived Sorts: Products of base sorts for record domains
- Product Domains: Functions can take record arguments: `[x: M, y: M] -> M`
- Relations → Prop: Relations are functions to `Prop` (boolean predicates)
Chase Algorithm
- Fixpoint Iteration: Derives all consequences until closure
- Congruence Closure Integration: Merges elements when axioms conclude `x = y`
- Termination for Unit Laws: Categories with unit laws no longer loop forever
- Uses tensor algebra for efficient axiom checking
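The interplay between fixpoint iteration and element merging can be shown on a toy theory with two hard-coded axioms over one binary relation `le`: transitivity, and antisymmetry concluding `x = y`. This is a sketch under assumptions and not the logic of `chase_fixpoint_with_cc()`: the `UnionFind` type is hand-written here rather than taken from `egglog-union-find`, the axioms are fixed in code, and no existential conclusions (which would create new elements) are handled.

```rust
use std::collections::BTreeSet;

/// Minimal union-find with path halving (hand-written for this sketch).
pub struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    pub fn new(n: usize) -> Self {
        Self { parent: (0..n).collect() }
    }

    pub fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            self.parent[x] = self.parent[self.parent[x]];
            x = self.parent[x];
        }
        x
    }

    /// Merges two classes; returns true if they were distinct.
    pub fn union(&mut self, a: usize, b: usize) -> bool {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra == rb {
            return false;
        }
        // Keep the smaller index as representative for deterministic output.
        let (lo, hi) = if ra < rb { (ra, rb) } else { (rb, ra) };
        self.parent[hi] = lo;
        true
    }
}

/// Chases two axioms over `le` on elements 0..n until nothing changes:
///   le(x, y), le(y, z) |- le(x, z)
///   le(x, y), le(y, x) |- x = y
pub fn chase(n: usize, facts: &[(usize, usize)]) -> (BTreeSet<(usize, usize)>, UnionFind) {
    let mut uf = UnionFind::new(n);
    let mut le: BTreeSet<(usize, usize)> = facts.iter().copied().collect();
    loop {
        // Rewrite every fact to its class representatives.
        le = le.iter().map(|&(a, b)| (uf.find(a), uf.find(b))).collect();
        let snapshot: Vec<(usize, usize)> = le.iter().copied().collect();
        let mut changed = false;
        for &(x, y) in &snapshot {
            for &(y2, z) in &snapshot {
                if y == y2 && le.insert((x, z)) {
                    changed = true; // new fact from transitivity
                }
            }
        }
        for &(x, y) in &snapshot {
            if x != y && snapshot.contains(&(y, x)) && uf.union(x, y) {
                changed = true; // merge from antisymmetry
            }
        }
        if !changed {
            return (le, uf);
        }
    }
}
```

For the facts `le(0,1)`, `le(1,2)`, `le(2,0)`, `le(2,3)`, the cycle collapses into one class and the result is `le(0,0)` and `le(0,3)`. Termination here relies on the element set being finite and fixed; axioms with existential conclusions need more care.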
Solver Architecture
- Explicit Search Tree: Not implicit in call stack (AI-friendly for agent control)
- Refinement Preorder: Structures can grow (carriers, functions, relations)
- Obligations vs Unsat: Axiom obligation = need to witness conclusion (NOT failure)
- True Unsat: Only when deriving `⊢ False` from instantiated axioms
- Tactics-based: AutoTactic composes multiple tactics
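An explicit search tree can be held in an arena of nodes, so that an external controller can inspect the frontier and choose which node to work on next, instead of the search order being buried in recursion. The sketch below is illustrative only: `NodeStatus` and `SearchNode` borrow names from `solver/types.rs`, but their fields, the `SearchTree` type, and the use of strings as a stand-in for partial models are all assumptions.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum NodeStatus {
    Open,   // obligations may remain; not a failure
    Solved, // all axioms satisfied
    Unsat,  // False was derived
}

#[derive(Debug)]
pub struct SearchNode {
    pub parent: Option<usize>,
    pub children: Vec<usize>,
    pub facts: Vec<String>, // stand-in for a partial model
    pub status: NodeStatus,
}

pub struct SearchTree {
    nodes: Vec<SearchNode>,
}

impl SearchTree {
    /// Creates a tree whose root (index 0) is the empty partial model.
    pub fn new() -> Self {
        let root = SearchNode {
            parent: None,
            children: vec![],
            facts: vec![],
            status: NodeStatus::Open,
        };
        Self { nodes: vec![root] }
    }

    /// Adds a child that refines `parent` with one more fact.
    pub fn branch(&mut self, parent: usize, fact: &str) -> usize {
        let mut facts = self.nodes[parent].facts.clone();
        facts.push(fact.to_string());
        let id = self.nodes.len();
        self.nodes.push(SearchNode {
            parent: Some(parent),
            children: vec![],
            facts,
            status: NodeStatus::Open,
        });
        self.nodes[parent].children.push(id);
        id
    }

    pub fn mark(&mut self, node: usize, status: NodeStatus) {
        self.nodes[node].status = status;
    }

    pub fn node(&self, id: usize) -> &SearchNode {
        &self.nodes[id]
    }

    /// Open leaves: the nodes a tactic or an agent could pick up next.
    pub fn frontier(&self) -> Vec<usize> {
        (0..self.nodes.len())
            .filter(|&i| {
                self.nodes[i].children.is_empty() && self.nodes[i].status == NodeStatus::Open
            })
            .collect()
    }
}
```

A disjunctive conclusion `A ∨ B` would produce two children of the current node, one per disjunct; marking one `Unsat` leaves the other on the frontier.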
Relational Algebra Compilation
- QueryOp Intermediate: SQL-like operators (Scan, Filter, Join, Project, etc.)
- Optimization Passes: Filter fusion, projection pushdown
- Store-aware: Compiled directly to GeologMeta queries with indexing
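Filter fusion, one of the passes named above, rewrites a filter applied to another filter into a single filter over the conjunction of both predicates. The sketch below shows the idea on a cut-down operator set. The `QueryOp` name comes from the notes, but the variants shown, the `Pred` type, and the in-memory `Scan` are assumptions, and `Join` is omitted for brevity.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Pred {
    ColEq(usize, u32), // column index == constant
    And(Box<Pred>, Box<Pred>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum QueryOp {
    Scan(Vec<Vec<u32>>), // inline rows stand in for a stored relation
    Filter(Pred, Box<QueryOp>),
    Project(Vec<usize>, Box<QueryOp>),
}

/// Rewrites Filter(p, Filter(q, x)) into Filter(q AND p, x), bottom-up.
pub fn fuse_filters(op: QueryOp) -> QueryOp {
    match op {
        QueryOp::Filter(p, inner) => match fuse_filters(*inner) {
            QueryOp::Filter(q, base) => {
                QueryOp::Filter(Pred::And(Box::new(q), Box::new(p)), base)
            }
            other => QueryOp::Filter(p, Box::new(other)),
        },
        QueryOp::Project(cols, inner) => {
            QueryOp::Project(cols, Box::new(fuse_filters(*inner)))
        }
        scan => scan,
    }
}

fn holds(p: &Pred, row: &[u32]) -> bool {
    match p {
        Pred::ColEq(c, v) => row[*c] == *v,
        Pred::And(a, b) => holds(a, row) && holds(b, row),
    }
}

/// Naive executor that materializes every intermediate result.
pub fn execute(op: &QueryOp) -> Vec<Vec<u32>> {
    match op {
        QueryOp::Scan(rows) => rows.clone(),
        QueryOp::Filter(p, inner) => {
            execute(inner).into_iter().filter(|r| holds(p, r)).collect()
        }
        QueryOp::Project(cols, inner) => execute(inner)
            .into_iter()
            .map(|r| cols.iter().map(|&c| r[c]).collect())
            .collect(),
    }
}
```

The rewrite preserves results because a row passes both nested filters exactly when it satisfies the conjunction; the gain is one pass over the input instead of two.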
Tensor Algebra for Axiom Checking
- Sparse Representation: Roaring Bitmaps for efficient membership
- Lazy Expression Trees: Tensor products fused with contractions
- Boolean Semiring: AND for product, OR for sum
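Over the boolean semiring, a binary relation is a sparse 0/1 matrix, and the premise `r(x, y), r(y, z)` becomes a product contracted over the shared index `y`. An axiom is violated exactly at the entries where the premise tensor is set and the conclusion tensor is not. The sketch below assumes a `BTreeSet` of index pairs in place of Roaring bitmaps and evaluates eagerly rather than through lazy expression trees; `Sparse2`, `contract`, and `violations` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Sparse rank-2 boolean tensor: the set of index pairs holding `true`.
pub type Sparse2 = BTreeSet<(u32, u32)>;

/// Contraction over the shared index:
/// T[x, z] = OR over y of (A[x, y] AND B[y, z]).
pub fn contract(a: &Sparse2, b: &Sparse2) -> Sparse2 {
    let mut out = Sparse2::new();
    for &(x, y) in a {
        // Visit only the entries of `b` whose first index equals y.
        for &(_, z) in b.range((y, u32::MIN)..=(y, u32::MAX)) {
            out.insert((x, z));
        }
    }
    out
}

/// Index pairs where the premise holds but the conclusion does not.
pub fn violations(premise: &Sparse2, conclusion: &Sparse2) -> Vec<(u32, u32)> {
    premise.difference(conclusion).copied().collect()
}
```

For `r = {(0,1), (1,2)}`, checking transitivity means comparing `contract(&r, &r)` with `r`; the single violation `(0,2)` is the fact a chase step would then add.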
REPL Commands
:list, :inspect <name> - Introspection
:add, :assert, :retract - Mutations
:query, :explain, :compile - Query analysis
:chase, :solve, :extend - Inference
:commit, :history - Version control
:source <file> - Load programs
:help - Show help
Parameterized Theories
Theories can be parameterized by other instances:
theory (N : PetriNet instance) Marking {
token : Sort;
token/of : token -> N/P;
}
This enables rich type-theoretic modeling (e.g., Petri net reachability as dependent types).
Testing Infrastructure
- Property-based tests (`proptest`): naming, overlay, patches, queries, structure, tensor, universe, solver
- Unit tests: parsing, elaboration, meta, pretty-printing, relations, version control, workspace
- Integration tests: 30+ `.geolog` example files
- Fuzzing: `fuzz/` directory with parser and REPL fuzzing targets
Project Status
Version: 0.1.0 (Early production)
Completed
- Core geometric logic implementation
- Parser, elaborator, and core IR
- Chase algorithm with equality saturation
- Solver with SMT-like model enumeration
- Persistence and version control
- Comprehensive test coverage
Active Development
- Nested instance elaboration
- Homoiconic query plan representation
- Disjunction variable alignment for tensor builder
- Lean4 formalization of monotonic submodel proofs
Key Files Reference
| File | Line Count (approx) | Description |
|---|---|---|
| `src/core.rs` | ~800 | Core type definitions |
| `src/parser.rs` | ~600 | Parser implementation |
| `src/repl.rs` | ~1000 | REPL state machine |
| `src/query/chase.rs` | ~500 | Chase algorithm |
| `src/solver/mod.rs` | ~400 | Model enumeration API |
| `src/tensor/sparse.rs` | ~600 | Sparse tensor storage |
| `src/store/mod.rs` | ~400 | Storage coordination |