diff --git a/notes/001-architecture.md b/notes/001-architecture.md new file mode 100644 index 0000000..6c137bb --- /dev/null +++ b/notes/001-architecture.md @@ -0,0 +1,104 @@ +# Architecture + +## Tech Stack + +**Language:** Rust (2021 edition) + +### Key Dependencies + +| Crate | Purpose | +|---------------------|-------------------------------------| +| `chumsky` | Parser combinator library | +| `ariadne` | Error reporting with source spans | +| `rkyv` | Zero-copy serialization | +| `rustyline` | REPL readline interface | +| `egglog-union-find` | Union-find for congruence closure | +| `roaring` | Bitmap library for sparse relations | +| `indexmap` | Order-preserving hash maps | +| `memmap2` | Memory-mapped file I/O | + +### Testing + +- `insta` — snapshot testing +- `proptest` — property-based testing +- `tempfile` — temporary directory management + +--- + +## System Layers + +``` +┌─────────────────────────────────────────────────────┐ +│ USER INTERFACE │ +│ REPL (interactive CLI) | Batch file loading │ +├─────────────────────────────────────────────────────┤ +│ PARSING LAYER (Lexer → Parser → AST) │ +├─────────────────────────────────────────────────────┤ +│ ELABORATION LAYER (AST → Core IR) │ +│ Type checking, name resolution, theory/instance │ +├─────────────────────────────────────────────────────┤ +│ CORE LAYER (Typed Representation) │ +│ Signature, Term, Formula, Structure │ +├─────────────────────────────────────────────────────┤ +│ STORAGE LAYER (Persistence) │ +│ Append-only store with version control │ +├─────────────────────────────────────────────────────┤ +│ QUERY & SOLVER LAYER (Execution) │ +│ Chase algorithm, congruence closure, model search │ +├─────────────────────────────────────────────────────┤ +│ TENSOR ALGEBRA (Axiom Checking) │ +│ Sparse tensor evaluation for axiom validation │ +└─────────────────────────────────────────────────────┘ +``` + +--- + +## Directory Structure + +| Path | Purpose | +|---------------------|----------------------------------------------| +| `src/bin/geolog.rs` | CLI entry point | +| `src/lib.rs` | Library root | +| `src/repl.rs` | Interactive REPL state machine | +| `src/lexer.rs` | Tokenization | +| `src/parser.rs` | Token stream → AST | +| `src/ast.rs` | Abstract syntax tree types | +| `src/core.rs` | Core IR: Signature, Term, Formula, Structure | +| `src/elaborate/` | AST → Core elaboration | +| `src/store/` | Persistence layer | +| `src/query/` | Chase algorithm, relational algebra | +| `src/solver/` | SMT-style model enumeration | +| `src/tensor/` | Sparse tensor algebra | +| `src/cc.rs` | Congruence closure (union-find) | +| `examples/geolog/` | 30+ example `.geolog` files | +| `tests/` | Test files | +| `docs/` | ARCHITECTURE.md, SYNTAX.md | +| `proofs/` | Lean4 formalization | + +--- + +## Key Entry Points + +| Entry Point | Location | +|-------------|----------| +| CLI | `src/bin/geolog.rs` | +| Parse | `src/lib.rs::parse()` | +| REPL | `src/repl.rs::ReplState::process_line()` | +| Theory elaboration | `elaborate/theory.rs::elaborate_theory()` | +| Instance elaboration | `elaborate/instance.rs::elaborate_instance_ctx()` | +| Chase | `query/chase.rs::chase_fixpoint_with_cc()` | +| Model enumeration | `solver/mod.rs::enumerate_models()` | + +--- + +## REPL Commands + +``` +:list, :inspect - Introspection +:add, :assert, :retract - Mutations +:query, :explain, :compile - Query analysis +:chase, :solve, :extend - Inference +:commit, :history - Version control +:source - Load programs +:help - Show help +``` diff --git a/notes/002-design-decisions.md b/notes/002-design-decisions.md new file mode 100644 index 0000000..58ce331 --- /dev/null +++ b/notes/002-design-decisions.md @@ -0,0 +1,66 @@ +# Design Decisions + +## Geometric Logic Foundation + +- **Axioms as Sequents:** `forall vars. premises |- conclusion` +- **Positive Conclusions:** Existentials and disjunctions allowed, but never negation +- **Geometric Morphisms:** Preserved by design for category-theoretic semantics + +--- + +## Identity System + +- **Luid** ("Local Universe ID"): Globally unique across all structures +- **Slid** ("Structure-Local ID"): Index within a single structure +- Bidirectional mapping enables persistent identity despite structure changes + +--- + +## Storage + +- **Append-Only:** Elements only tombstoned, never deleted +- **Patch-based Versioning:** Each commit is a delta from parent +- **GeologMeta:** Single homoiconic theory instance storing all data + +--- + +## Type System + +- **Postfix Application:** `x f` not `f(x)` — categorical style +- **Derived Sorts:** Products of base sorts for record domains +- **Product Domains:** Functions can take record arguments: `[x: M, y: M] -> M` +- **Relations → Prop:** Relations are functions to `Prop` + +--- + +## Chase Algorithm + +- **Fixpoint Iteration:** Derives consequences until closure +- **Congruence Closure:** Merges elements when axioms conclude `x = y` +- **Tensor Algebra:** Efficient axiom checking via sparse tensors +- **Termination:** Unit laws no longer cause infinite loops + +--- + +## Solver Architecture + +- **Explicit Search Tree:** Not implicit in call stack +- **Refinement Preorder:** Structures can grow (carriers, functions, relations) +- **Obligations vs Unsat:** Axiom obligation = need to witness conclusion (not failure) +- **Tactics-based:** AutoTactic composes CheckTactic, ForwardChainingTactic, PropagateEquationsTactic + +--- + +## Query Compilation + +- **QueryOp Intermediate:** SQL-like operators (Scan, Filter, Join, Project) +- **Optimization Passes:** Filter fusion, projection pushdown +- **Store-aware:** Compiled directly to GeologMeta queries with indexing + +--- + +## Tensor Algebra + +- **Sparse Representation:** Roaring Bitmaps for efficient membership +- **Lazy Expression Trees:** Tensor products fused with contractions +- **Boolean Semiring:** AND for product, OR for sum diff --git a/notes/003-project-status.md b/notes/003-project-status.md new file mode 100644 index 0000000..6050f9c --- /dev/null +++ b/notes/003-project-status.md @@ -0,0 +1,82 @@ +# Project Status + +**Version:** 0.1.0 + +--- + +## Metrics + +| Metric | Count | +|--------|-------| +| Rust source code | 30,284 lines | +| Test code | 7,872 lines | +| Example `.geolog` files | 2,346 lines | +| Example files | 30 | +| Rust source files | 84 | +| Unit tests | 137 (all passing) | +| Integration tests | 31 (all passing) | + +--- + +## What's Complete + +- Parser, lexer, syntax support +- Type system and elaboration +- Chase algorithm with equality saturation +- Tensor algebra for axiom checking +- SMT-style model enumeration +- Persistence and version control +- Interactive REPL +- Comprehensive test suite + +--- + +## Mostly Complete + +- Parameterized theories +- Nested instances (parsing done, elaboration partial) +- Query optimization +- Relational algebra compilation +- Congruence closure integration + +--- + +## Known Gaps + +### TODOs in Code + +| Location | Description | +|----------|-------------| +| `src/repl.rs:84` | Store full theory_type in GeologMeta | +| `src/overlay.rs:380` | Copy functions (deferred) | +| `src/store/query.rs:61` | Columnar batch loading | +| `src/store/instance.rs` | External references, product domains | +| `src/query/to_relalg.rs` | Column path navigation, ScanRelationOp | + +### Incomplete Features + +- Product types in conclusions don't fully work +- Existentials over product sorts unsupported +- Field projections in equations don't compile +- Disjunction fires both branches (correct but inefficient) +- Tensor compilation failures are silent + +### In Progress + +- Lean4 formalization of monotonic submodel proofs +- Homoiconic query plan representation +- Disjunction variable alignment for tensor builder + +--- + +## Test Coverage + +| File | Coverage Area | +|------|---------------| +| `tests/unit_elaborate.rs` | Type checking (14 tests) | +| `tests/unit_chase.rs` | Chase algorithm | +| `tests/unit_parsing.rs` | Parser (4 tests) | +| `tests/examples_integration.rs` | 31 integration tests | +| `tests/proptest_*.rs` | 11 property-based test suites | + +Fuzzing targets exist in `fuzz/` directory. diff --git a/notes/004-key-concepts.md b/notes/004-key-concepts.md new file mode 100644 index 0000000..b9814e9 --- /dev/null +++ b/notes/004-key-concepts.md @@ -0,0 +1,88 @@ +# Key Concepts + +## Axioms + +An axiom is an "if-then" rule: + +> "If premises are true, then conclusion must be true." + +``` +forall x, y, z. friends(x,y), friends(y,z) |- knows(x,z) +``` + +Similar to Datalog rules, but extended with: +- Existentials in conclusions: `|- exists x. R(x)` +- Disjunctions: `|- A \/ B` +- Equality conclusions: `|- x = y` + +Datalog is limited to Horn clauses (single atom in head). Geolog supports geometric sequents. + +--- + +## Chase Algorithm + +Repeatedly applies axioms until no new facts can be derived (fixpoint). + +### Building Blocks + +1. **Axioms** — rules to apply +2. **Structure** — carriers (element sets), functions, relations +3. **Bindings** — variable-to-element mappings +4. **Tensor system** — finds violations (premises true, conclusion false) +5. **Fire conclusion** — adds relations, defines functions, creates elements +6. **Congruence closure** — tracks element equality via union-find + +### Loop + +``` +repeat until no changes: + for each axiom: + find violations (tensor system) + fire conclusion for each violation + propagate equations (congruence closure) +``` + +--- + +## Union-Find + +Data structure for tracking equivalence classes. Inputs are elements (IDs), not sets. + +``` +union(3, 7) // "3 and 7 are now equivalent" +union(7, 12) // "7 and 12 are now equivalent" +find(3) // Returns same representative as find(12) +``` + +### Why Not Dictionary + Sets? + +| Approach | Union Cost | +|----------|------------| +| Dict + Sets | O(n) — must update all elements in merged set | +| Union-Find | O(1) amortized — just one pointer update | + +Used for congruence closure when axioms conclude `x = y`. + +--- + +## Language Syntax + +Not based on any single existing language. Influences: + +| Feature | Source | +|---------|--------| +| Postfix application (`x f` not `f(x)`) | Category theory | +| Sequent notation (`premises \|- conclusion`) | Proof theory | +| Path separators (`/` not `.`) | Unique to Geolog | +| Theory/Instance structure | Algebraic specification languages | + +--- + +## Geometric Logic Restrictions + +By design, Geolog excludes: +- Negation (no `not R(x)`) +- Classical implication (no `if A then B`) +- Arithmetic (no `x + 1 = y`) + +These are intentional limits for topos-theoretic semantics, not missing features. diff --git a/notes/005-critical-assessment.md b/notes/005-critical-assessment.md new file mode 100644 index 0000000..a918c88 --- /dev/null +++ b/notes/005-critical-assessment.md @@ -0,0 +1,79 @@ +# Critical Assessment + +## Origin + +AI-generated project by Claude Opus 4.5. From README: + +> "This README was synthesized automatically by Claude Opus 4.5. As was this entire project, really." + +--- + +## Claimed Use Cases + +- Business process workflow orchestration +- Formal verification +- Database query design +- Petri net reachability + +--- + +## Reality Check + +| Claim | Assessment | +|-------|------------| +| **Workflow orchestration** | No APIs, no integrations, no distributed execution. Just a REPL. Real workflow engines (Temporal, Camunda) handle failure recovery, monitoring, external systems. | +| **Formal verification** | Lean4 proofs "in progress." Not comparable to Coq, Lean, or TLA+ which have decades of development and industrial use. | +| **Database query design** | Not a database. No persistence at scale, no SQL, no transactions, no concurrency. | +| **Petri net reachability** | Toy examples with 3-4 places. Not industrial-scale process modeling. | + +--- + +## What It Actually Is + +A **research/educational tool** for: +- Learning geometric logic +- Understanding chase algorithms +- Experimenting with category-theoretic ideas +- Playing with Petri net examples +- Prototyping constraint systems + +--- + +## What It's Not + +- Production infrastructure +- Real workflow orchestration platform +- Industrial formal verification tool +- Database replacement +- Anything at scale + +--- + +## Technical Influences + +| Concept | Source | +|---------|--------| +| Geometric logic | Mathematical logic | +| Chase algorithm | Database theory | +| Tensor algebra | Linear algebra / sparse computation | +| Equality saturation | Program optimization (egg/egglog) | +| Congruence closure | Automated theorem proving | + +Uses `egglog-union-find` crate from the egg/egglog project. + +--- + +## Related Tools + +Similar tools in adjacent spaces (not cited as direct inspirations): +- **Alloy** — relational logic model finder +- **Datalog** — logic programming for databases +- **CQL** — categorical query language +- **Z3** — SMT solver +- **Egg** — equality saturation toolkit + +--- + +## Bottom Line + +Well-crafted educational project, not production infrastructure. The code quality is high and tests pass, but there's no evidence of real-world usage or battle-testing. diff --git a/notes/NOTES.md b/notes/NOTES.md deleted file mode 100644 index 28f2180..0000000 --- a/notes/NOTES.md +++ /dev/null @@ -1,322 +0,0 @@ -# Geolog Project Notes - -## Overview - -**Geolog** is a **Geometric Logic REPL** — a type theory with semantics in topoi, designed for formal specifications -using geometric logic. - -### Core Capabilities - -- **Geometric logic programming** — encode mathematical structures, relationships, and constraints -- **Database schema definition** — define sorts, functions, relations, and axioms -- **Model/instance creation** — create concrete finite models satisfying theory axioms -- **Automated inference** — chase algorithm for automatic fact derivation -- **Version control** — git-like commits and tracking for instances -- **Persistence** — append-only storage with optional disk persistence - -### Use Cases - -- Business process workflow orchestration -- Formal verification via diagrammatic rewriting -- Database query design -- Petri net reachability and process modeling - ---- - -## Tech Stack - -**Primary Language**: Rust (2021 edition, Cargo-based) - -### Key Dependencies - -| Crate | Version | Purpose | -|---------------------|---------|-------------------------------------| -| `chumsky` | 0.9 | Parser combinator library | -| `ariadne` | 0.4 | Error reporting with source spans | -| `rkyv` | 0.7 | Zero-copy serialization | -| `rustyline` | 15 | REPL readline interface | -| `egglog-union-find` | 1.0 | Union-find for congruence closure | -| `roaring` | 0.10 | Bitmap library for sparse relations | -| `indexmap` | 2.0 | Order-preserving hash maps | -| `uuid` | 1 | UUID generation | -| `memmap2` | 0.9 | Memory-mapped file I/O | - -### Testing Frameworks - -- `insta` — snapshot testing -- `proptest` — property-based testing -- `tempfile` — temporary directory management - ---- - -## Architecture - -``` -┌─────────────────────────────────────────────────────┐ -│ USER INTERFACE │ -│ REPL (interactive CLI) | Batch file loading │ -├─────────────────────────────────────────────────────┤ -│ PARSING LAYER (Lexer → Parser → AST) │ -│ chumsky-based lexer & parser, source error reporting│ -├─────────────────────────────────────────────────────┤ -│ ELABORATION LAYER (AST → Core IR) │ -│ Type checking, name resolution, theory/instance │ -├─────────────────────────────────────────────────────┤ -│ CORE LAYER (Typed Representation) │ -│ Signature, Term, Formula, Structure, ElaboratedTheory│ -├─────────────────────────────────────────────────────┤ -│ STORAGE LAYER (Persistence) │ -│ Append-only GeologMeta store with version control │ -├─────────────────────────────────────────────────────┤ -│ QUERY & SOLVER LAYER (Execution) │ -│ Chase algorithm, congruence closure, relational │ -│ algebra compiler, SMT-style model enumeration │ -├─────────────────────────────────────────────────────┤ -│ TENSOR ALGEBRA (Axiom Checking) │ -│ Sparse tensor evaluation for axiom validation │ -└─────────────────────────────────────────────────────┘ -``` - ---- - -## Directory Structure - -| Path | Purpose | -|---------------------|----------------------------------------------| -| `src/bin/geolog.rs` | CLI entry point | -| `src/lib.rs` | Library root, exports `parse()` | -| `src/repl.rs` | Interactive REPL state machine | -| `src/lexer.rs` | Tokenization using chumsky | -| `src/parser.rs` | Token stream → AST | -| `src/ast.rs` | Abstract syntax tree types | -| `src/core.rs` | Core IR: Signature, Term, Formula, Structure | -| `src/elaborate/` | AST → Core elaboration | -| `src/store/` | Persistence layer (append-only) | -| `src/query/` | Chase algorithm, relational algebra | -| `src/solver/` | SMT-style model enumeration | -| `src/tensor/` | Sparse tensor algebra for axiom checking | -| `src/cc.rs` | Congruence closure (union-find) | -| `src/id.rs` | Luid/Slid identity system | -| `src/universe.rs` | Global element registry | -| `examples/geolog/` | 30+ example `.geolog` files | -| `tests/` | 25+ test files | -| `docs/` | ARCHITECTURE.md, SYNTAX.md | -| `proofs/` | Lean4 formalization | -| `fuzz/` | Fuzzing targets | - ---- - -## Main Components - -### Parsing & Syntax (~1,200 lines) - -- `lexer.rs` — tokenization -- `parser.rs` — token stream → AST -- `ast.rs` — AST types (Theory, Instance, Axiom, etc.) -- `error.rs` — error formatting with source spans -- `pretty.rs` — Core → Geolog source roundtrip printing - -### Elaboration (~2,200 lines) - -- `elaborate/mod.rs` — coordination -- `elaborate/theory.rs` — AST Theory → Core ElaboratedTheory -- `elaborate/instance.rs` — AST Instance → Core Structure -- `elaborate/env.rs` — environment with theory registry -- `elaborate/types.rs` — type expression evaluation -- `elaborate/error.rs` — type error reporting - -### Core Representation - -- `core.rs` — DerivedSort, Signature, Structure, Formula, Term, Sequent -- `id.rs` — Luid (global unique ID) and Slid (structure-local ID) -- `universe.rs` — global element registry with UUID ↔ Luid mapping -- `naming.rs` — bidirectional name ↔ Luid mapping - -### Storage Layer (~1,500 lines) - -- `store/mod.rs` — main Store struct -- `store/schema.rs` — cached sort/function/relation IDs -- `store/append.rs` — low-level element append operations -- `store/theory.rs` — theory CRUD -- `store/instance.rs` — instance CRUD -- `store/commit.rs` — git-like version control -- `store/materialize.rs` — indexed views for fast lookups - -### Query & Compilation (~3,500 lines) - -- `query/compile.rs` — Query → RelAlgIR plan compilation -- `query/to_relalg.rs` — Query → Relational Algebra IR -- `query/from_relalg.rs` — RelAlgIR → Executable QueryOp -- `query/chase.rs` — chase algorithm for fixpoint computation -- `query/backend.rs` — naive QueryOp executor -- `query/optimize.rs` — algebraic law rewriting - -### Solver & Model Enumeration (~1,300 lines) - -- `solver/mod.rs` — unified model enumeration API -- `solver/tree.rs` — explicit search tree for partial models -- `solver/tactics.rs` — automated search strategies: - - CheckTactic: axiom validation - - ForwardChainingTactic: Datalog-style inference - - PropagateEquationsTactic: congruence closure - - AutoTactic: composite fixpoint solver -- `solver/types.rs` — SearchNode, Obligation, NodeStatus types - -### Tensor Algebra (~2,600 lines) - -- `tensor/expr.rs` — lazy tensor expression trees -- `tensor/sparse.rs` — sparse tensor storage (RoaringBitmap-based) -- `tensor/builder.rs` — expression builders -- `tensor/compile.rs` — Formula → TensorExpr compilation -- `tensor/check.rs` — axiom checking via tensor evaluation - ---- - -## Key Entry Points - -1. **CLI**: `src/bin/geolog.rs` - ``` - Usage: geolog [-d ] [source_files...] - ``` - -2. **Parse Entry**: `src/lib.rs` exports `parse(input: &str) → Result` - -3. **REPL State**: `src/repl.rs` — `ReplState::process_line()` - -4. **Theory Elaboration**: `elaborate/theory.rs::elaborate_theory()` - -5. **Instance Elaboration**: `elaborate/instance.rs::elaborate_instance_ctx()` - -6. **Chase Algorithm**: `query/chase.rs::chase_fixpoint_with_cc()` - -7. **Model Enumeration**: `solver/mod.rs::enumerate_models()` - ---- - -## Design Decisions - -### Geometric Logic Foundation - -- **Axioms as Sequents**: `forall vars. premises |- conclusion` -- **Positive Conclusions**: Can have existentials, disjunctions, but never negations -- **Geometric Morphisms**: Preserved by design, enabling category-theoretic semantics - -### Identity System - -- **Luid** ("Local Universe ID"): Globally unique across all structures -- **Slid** ("Structure-Local ID"): Index within a single structure -- Bidirectional mapping enables persistent identity despite structure changes - -### Append-Only Storage - -- **GeologMeta**: Single homoiconic theory instance storing all data -- **Patch-based Versioning**: Each commit is a delta from parent -- **Never Delete**: Elements only tombstoned for perfect audit trails - -### Type System - -- **Postfix Application**: `x f` not `f(x)` — categorical style -- **Derived Sorts**: Products of base sorts for record domains -- **Product Domains**: Functions can take record arguments: `[x: M, y: M] -> M` -- **Relations → Prop**: Relations are functions to `Prop` (boolean predicates) - -### Chase Algorithm - -- **Fixpoint Iteration**: Derives all consequences until closure -- **Congruence Closure Integration**: Merges elements when axioms conclude `x = y` -- **Termination for Unit Laws**: Categories with unit laws no longer loop forever -- Uses tensor algebra for efficient axiom checking - -### Solver Architecture - -- **Explicit Search Tree**: Not implicit in call stack (AI-friendly for agent control) -- **Refinement Preorder**: Structures can grow (carriers, functions, relations) -- **Obligations vs Unsat**: Axiom obligation = need to witness conclusion (NOT failure) -- **True Unsat**: Only when deriving `⊢ False` from instantiated axioms -- **Tactics-based**: AutoTactic composes multiple tactics - -### Relational Algebra Compilation - -- **QueryOp Intermediate**: SQL-like operators (Scan, Filter, Join, Project, etc.) -- **Optimization Passes**: Filter fusion, projection pushdown -- **Store-aware**: Compiled directly to GeologMeta queries with indexing - -### Tensor Algebra for Axiom Checking - -- **Sparse Representation**: Roaring Bitmaps for efficient membership -- **Lazy Expression Trees**: Tensor products fused with contractions -- **Boolean Semiring**: AND for product, OR for sum - ---- - -## REPL Commands - -``` -:list, :inspect - Introspection -:add, :assert, :retract - Mutations -:query, :explain, :compile - Query analysis -:chase, :solve, :extend - Inference -:commit, :history - Version control -:source - Load programs -:help - Show help -``` - ---- - -## Parameterized Theories - -Theories can be parameterized by other instances: - -```geolog -theory (N : PetriNet instance) Marking { - token : Sort; - token/of : token -> N/P; -} -``` - -This enables rich type-theoretic modeling (e.g., Petri net reachability as dependent types). - ---- - -## Testing Infrastructure - -- **Property-based tests** (`proptest`): naming, overlay, patches, queries, structure, tensor, universe, solver -- **Unit tests**: parsing, elaboration, meta, pretty-printing, relations, version control, workspace -- **Integration tests**: 30+ `.geolog` example files -- **Fuzzing**: `fuzz/` directory with parser and REPL fuzzing targets - ---- - -## Project Status - -**Version**: 0.1.0 (Early production) - -### Completed - -- Core geometric logic implementation -- Parser, elaborator, and core IR -- Chase algorithm with equality saturation -- Solver with SMT-like model enumeration -- Persistence and version control -- Comprehensive test coverage - -### Active Development - -- Nested instance elaboration -- Homoiconic query plan representation -- Disjunction variable alignment for tensor builder -- Lean4 formalization of monotonic submodel proofs - ---- - -## Key Files Reference - -| File | Line Count (approx) | Description | -|------------------------|---------------------|-----------------------| -| `src/core.rs` | ~800 | Core type definitions | -| `src/parser.rs` | ~600 | Parser implementation | -| `src/repl.rs` | ~1000 | REPL state machine | -| `src/query/chase.rs` | ~500 | Chase algorithm | -| `src/solver/mod.rs` | ~400 | Model enumeration API | -| `src/tensor/sparse.rs` | ~600 | Sparse tensor storage | -| `src/store/mod.rs` | ~400 | Storage coordination | diff --git a/notes/project-completeness-report.md b/notes/project-completeness-report.md deleted file mode 100644 index eaf6951..0000000 --- a/notes/project-completeness-report.md +++ /dev/null @@ -1,168 +0,0 @@ -# Geolog Project Completeness Report - -*Generated: 2026-03-17* - -## Project Overview - -**Geolog** is a Geometric Logic REPL and reasoning engine written in Rust. It implements geometric logic with semantics in topoi. - -### Key Capabilities - -- Define mathematical theories using sorts (types), functions, relations, and axioms -- Create concrete finite model instances of theories -- Automatically derive facts using the **chase algorithm** (fixpoint inference) -- Verify axioms using **tensor algebra** (sparse tensor evaluation) -- Solve models using SMT-like automated search tactics -- Version control with git-like commits and persistent append-only storage -- Interactive REPL for exploration and manipulation - -### Use Cases - -Business process workflow orchestration, formal verification, database schema definition, Petri net reachability analysis, and diagrammatic rewriting. - ---- - -## Current State of Implementation - -**Version:** 0.1.0 - Early Production - -### Codebase Metrics - -| Metric | Count | -|--------|-------| -| Rust source code | 30,284 lines | -| Test code | 7,872 lines | -| Example `.geolog` files | 2,346 lines | -| Example files | 30 | -| Rust source files | 84 | -| Unit tests | 137 (all passing) | -| Integration tests | 31 (all passing) | - -### Major Implemented Components - -1. **Parser** (`src/parser.rs`, `src/lexer.rs`) - Full syntax support via chumsky -2. **Elaboration** (`src/elaborate/`) - Type checking and AST→Core IR transformation -3. **Core IR** (`src/core.rs`) - Complete typed representation (Signature, Structure, Formula, Term) -4. **Chase Algorithm** (`src/query/chase.rs`) - Fixpoint computation with congruence closure for equality saturation -5. **Tensor Algebra** (`src/tensor/`) - Sparse tensor evaluation for axiom checking -6. **Solver** (`src/solver/`) - SMT-style model enumeration with multiple tactics -7. **Storage Layer** (`src/store/`) - Append-only persistent storage with version control -8. **Query Engine** (`src/query/`) - Relational algebra IR and compilation -9. **REPL** (`src/repl.rs`) - Full interactive CLI with 20+ commands - ---- - -## TODOs, FIXMEs, and Incomplete Features - -### Minor TODOs (11 instances) - -| Location | Description | -|----------|-------------| -| `src/repl.rs:84` | Store full theory_type in GeologMeta for proper reconstruction | -| `src/overlay.rs:380` | Copy functions (marked as complex, deferred) | -| `src/store/query.rs:61` | Columnar batch loading for relation tuples | -| `src/store/bootstrap_queries.rs` | Further integration with query engine | -| `src/store/instance.rs` | Handle external references, product domains, product codomains | -| `src/query/to_relalg.rs` | Column path navigation for nested schemas, ScanRelationOp implementation | - -### Macro Usage - -- 52 instances of `panic!`, `unimplemented!`, `todo!` markers scattered across the codebase -- Most are used defensively in error paths or for features explicitly deferred -- Only a small number appear to be in active execution paths - -### Active Development Areas - -From NOTES.md: -- Nested instance elaboration (partially done) -- Homoiconic query plan representation -- Disjunction variable alignment for tensor builder -- Lean4 formalization of monotonic submodel proofs - ---- - -## Documentation - -### Available Documentation - -| Document | Description | -|----------|-------------| -| **README.md** (1,315 lines) | Comprehensive with quick start, features, showcase (Petri nets), and full examples | -| **ARCHITECTURE.md** | Detailed module structure, data flow, design principles | -| **SYNTAX.md** | Complete grammar reference with examples | -| **NOTES.md** (322 lines) | Technical overview, directory structure, design decisions | -| **30 example `.geolog` files** | Working demonstrations (graphs, monoids, preorders, Petri nets, categories) | - -Documentation quality is **exceptionally high** with clear explanations of complex concepts like geometric logic, chase algorithm, equality saturation, and tensor algebra. - ---- - -## Tests and Comprehensiveness - -### Test Coverage - -- 137 unit tests - All passing -- 31 integration tests - All passing -- Property-based tests (proptest): naming, overlay, patches, queries, structure, tensor, universe, solver -- Fuzzing infrastructure - `fuzz/` directory with parser and REPL fuzzing targets -- Snapshot testing (insta) - For parsing and elaboration - -### Test Files - -| File | Coverage Area | -|------|---------------| -| `tests/unit_elaborate.rs` | Type checking (14 tests) | -| `tests/unit_chase.rs` | Chase algorithm (comprehensive) | -| `tests/unit_parsing.rs` | Parser (4 tests) | -| `tests/examples_integration.rs` | 31 integration tests covering all examples | -| `tests/proptest_*.rs` | 11 property-based test suites | - -Tests are **comprehensive, well-structured, and demonstrating good coverage** of both core functionality and edge cases. - ---- - -## Completeness Assessment - -### Complete/Functional - -- Core geometric logic engine -- Parser, lexer, and syntax support -- Type system and elaboration -- Chase algorithm with equality saturation -- Tensor algebra for axiom checking -- SMT-solver-style model enumeration -- Persistence and version control -- Interactive REPL -- Comprehensive test suite - -### Mostly Complete/Production-Ready - -- Parameterized theories -- Nested instances (parsing done, elaboration partial) -- Query optimization -- Relational algebra compilation -- Congruence closure integration - -### Minor Gaps/Deferred - -- Some advanced columnar batch persistence optimizations -- External references in instances -- Product domain handling in some edge cases -- Complete homoiconic query plan representation -- Lean4 formalization (work in progress) - ---- - -## Conclusion - -This project is **substantially complete and production-ready** for its core functionality. It's a sophisticated formal reasoning engine with: - -- **Clean architecture** with well-separated concerns -- **Robust implementation** (30K+ lines of carefully-structured code) -- **Excellent test coverage** (170+ tests, all passing) -- **Professional documentation** (README, architecture guides, syntax reference) -- **Real-world examples** (Petri nets, category theory, preorders, etc.) - -The TODOs and FIXMEs are **minor optimizations and edge cases**, not blocking issues. The 61 panic/unimplemented occurrences are mostly defensive error handling or explicitly deferred features, not blocking functionality. - -This is a mature, well-tested implementation of a complex formal reasoning system. The "0.1.0" version number likely reflects that the public API may evolve, not that the implementation is immature. diff --git a/notes/qa-session.md b/notes/qa-session.md deleted file mode 100644 index 00d6b5e..0000000 --- a/notes/qa-session.md +++ /dev/null @@ -1,153 +0,0 @@ -# Geolog Q&A Session - -*Date: 2026-03-17* - ---- - -## Q1: Is this project complete? - -**Answer: Yes, the core functionality is complete and production-ready.** - -### What Geolog Does - -Geolog is a reasoning engine for geometric logic. It lets you: -- Define theories with types, functions, relations, and rules -- Build concrete models of those theories -- Automatically derive facts using the chase algorithm -- Check rules using tensor algebra -- Solve models using automated search - -### Evidence of Completeness - -- 30,000+ lines of Rust code -- 168 tests, all passing -- 30 working examples -- Full documentation (README, architecture guide, syntax reference) - -### What's Done - -- Parser and type checker -- Chase algorithm with equality handling -- Tensor algebra for rule checking -- Solver for model search -- Persistent storage with version control -- Interactive REPL - -### Minor Gaps - -- 11 TODO comments for optimizations -- Some edge cases with product types -- Lean4 formalization is in progress - -These gaps don't block normal use. - ---- - -## Q2: Are tensor algebra and chase algorithm expressive enough for Geolog? - -**Answer: Yes for core geometric logic. There are specific gaps around product types and disjunction.** - -### Fully Supported - -| Feature | How It Works | -|---------|--------------| -| Relations like `R(x,y)` | Tensor + Chase | -| Existentials in premises | Tensor system | -| Equations like `f(x) = y` | Congruence closure | -| Nested functions like `f(g(x))` | Tensor contraction | -| Creating new elements | Chase handles `∃x. ...` in conclusions | -| Merging elements | Chase handles `x = y` conclusions | - -### Gaps - -**Product types have limited support:** -- Functions with product domains in conclusions don't work -- Existentials over product sorts don't work -- Field projections in equations (`x.field = y`) don't compile - -**Disjunction is incomplete:** -- When a rule concludes `A or B`, the chase fires both A and B -- This is correct but may add more facts than needed -- A proper implementation would choose one branch - -**Errors are silent:** -- When tensor compilation fails, the rule is skipped without warning -- This can hide problems in complex axioms - -### By Design (Not Bugs) - -Geolog uses geometric logic, which excludes: -- Negation (no `not R(x)`) -- Classical implication (no `if A then B`) -- Arithmetic (no `x + 1 = y`) - -These are intentional limits, not missing features. - -### Bottom Line - -The mechanisms handle all standard geometric logic axioms: -- Reflexivity, transitivity, composition -- Category laws, Petri net rules, graph reachability - -The gaps only matter for axioms with complex record/product operations in conclusions. - ---- - -## Q3: Is Geolog inspired by an existing tool or software? - -**Answer: Geolog is an AI-generated project that combines ideas from several fields. No single tool is its direct ancestor.** - -### Origin - -The README states: - -> "This README was synthesized automatically by Claude Opus 4.5. As was this entire project, really." - -Geolog was created by Claude Opus 4.5, Anthropic's AI model. - -### Technical Influences - -The project draws from several established areas: - -**Egglog / Egg** -- Uses `egglog-union-find` and `egglog-numeric-id` crates -- Equality saturation and e-graph techniques for congruence closure - -**Database Theory** -- Chase algorithm: a classical method for computing fixpoints in databases -- Relational algebra for query compilation - -**SMT Solvers** -- Model enumeration follows SMT solver patterns -- Automated search with forward chaining and equation propagation - -**Category Theory / Topos Theory** -- Geometric logic with semantics in topoi -- Lean4 formalization uses the `model-theory-topos` library - -**Formal Verification** -- Petri net reachability (similar to model checking) -- Diagrammatic rewriting - -### No Single Predecessor - -The codebase doesn't name a specific tool as direct inspiration. Geolog is a novel combination of: - -| Concept | Source Field | -|---------|--------------| -| Geometric logic | Mathematical logic | -| Chase algorithm | Database theory | -| Tensor algebra | Linear algebra / sparse computation | -| Equality saturation | Program optimization (egg/egglog) | -| Congruence closure | Automated theorem proving | - -### Related Tools (for context) - -While not cited as inspirations, similar tools exist in adjacent spaces: -- **Alloy** - relational logic model finder -- **Datalog** - logic programming for databases -- **CQL** - categorical query language -- **Z3** - SMT solver -- **Egg** - equality saturation toolkit - -Geolog occupies a unique niche: geometric logic with topos semantics applied to practical problems like workflow orchestration and Petri nets. diff --git a/notes/summary.md b/notes/summary.md deleted file mode 100644 index 0fe58d3..0000000 --- a/notes/summary.md +++ /dev/null @@ -1,81 +0,0 @@ -# Geolog Summary - -*Date: 2026-03-19* - -## What Is Geolog? - -Geolog is a **Geometric Logic REPL** — a reasoning engine written in Rust that implements type theory with semantics in topoi. It was AI-generated by Claude Opus 4.5. - -## Core Concepts - -### Axioms - -An axiom is an "if-then" rule: "If premises are true, then conclusion must be true." - -``` -forall x, y, z. friends(x,y), friends(y,z) |- knows(x,z) -``` - -Similar to Datalog rules, but extended with: -- Existentials in conclusions (`|- exists x. R(x)`) -- Disjunctions (`|- A \/ B`) -- Equality conclusions (`|- x = y`) - -### Chase Algorithm - -Repeatedly applies axioms until no new facts can be derived (fixpoint). Basic blocks: - -1. **Axioms** — the rules to apply -2. **Structure** — carriers (element sets), functions, relations -3. **Bindings** — variable-to-element mappings -4. **Tensor system** — finds violations (premises true, conclusion false) -5. **Fire conclusion** — adds relations, defines functions, creates elements, adds equations -6. **Congruence closure** — union-find for tracking element equality - -### Union-Find - -Data structure for tracking equivalence classes. Used for congruence closure when axioms conclude `x = y`. More efficient than naive dictionary + sets approach: -- Naive: O(n) per union -- Union-find: O(1) amortized per union - -### Language Syntax - -Geolog's syntax is not based on any single existing language. Influences: -- **Postfix application** (`x f` not `f(x)`) — category theory -- **Sequent notation** (`premises |- conclusion`) — proof theory -- **Path separators** (`/` not `.`) — unique to Geolog - -## Skeptical Assessment of Applications - -### Claimed Use Cases -- Business process workflow orchestration -- Formal verification -- Database query design -- Petri net reachability - -### Reality - -| Claim | Assessment | -|-------|------------| -| Workflow orchestration | No APIs, no integrations, no distributed execution — just a REPL | -| Formal verification | Lean4 proofs "in progress"; not comparable to Coq/Lean/TLA+ | -| Database query design | Not a database — no persistence at scale, no SQL, no transactions | -| Petri net reachability | Toy examples only, not industrial-scale | - -### What It Actually Is - -A **research/educational tool** for: -- Learning geometric logic and chase algorithms -- Experimenting with category-theoretic ideas -- Prototyping constraint systems -- Academic exploration - -**Not suitable for:** Production systems, real workflow orchestration, industrial formal verification, or database replacement. - -## Project Status - -- Version: 0.1.0 -- 30,000+ lines of Rust -- 168 tests (all passing) -- 30 example files -- Well-documented but unproven in real-world use