Expand notes with examples and more explanations

This commit is contained in:
Hassan Abedi 2026-03-19 14:40:44 +01:00
parent 966cd3ad1b
commit 15821e53ce
5 changed files with 789 additions and 238 deletions

View File

@ -1,53 +1,71 @@
# Architecture # Architecture
## What Geolog Is
Geolog is a command-line tool for defining rules and automatically deriving facts from them. Think of it like a spreadsheet that can figure out new cells based on formulas you define, except the "formulas" are logical rules and the "cells" are facts about things.
**Example:** You tell Geolog "if A is friends with B, and B is friends with C, then A knows C." Then you say "Alice is friends with Bob" and "Bob is friends with Charlie." Geolog automatically figures out "Alice knows Charlie."
---
## Tech Stack ## Tech Stack
**Language:** Rust (2021 edition) **Language:** Rust (2021 edition)
### Key Dependencies ### Key Dependencies
| Crate | Purpose | | Crate | What It Does |
|---------------------|-------------------------------------| |---------------------|-----------------------------------------------------------------|
| `chumsky` | Parser combinator library | | `chumsky` | Reads `.geolog` files and understands their structure |
| `ariadne` | Error reporting with source spans | | `ariadne` | Shows helpful error messages with line numbers |
| `rkyv` | Zero-copy serialization | | `rkyv` | Saves data to disk efficiently |
| `rustyline` | REPL readline interface | | `rustyline` | Makes the interactive prompt work (arrow keys, history, etc.) |
| `egglog-union-find` | Union-find for congruence closure | | `egglog-union-find` | Tracks when two things are actually the same thing |
| `roaring` | Bitmap library for sparse relations | | `roaring` | Stores large sets of numbers efficiently |
| `indexmap` | Order-preserving hash maps | | `indexmap` | Hash maps that remember insertion order |
| `memmap2` | Memory-mapped file I/O | | `memmap2` | Reads files by mapping them directly into memory |
### Testing ### Testing Tools
- `insta` — snapshot testing | Tool | Purpose |
- `proptest` — property-based testing |------|---------|
- `tempfile` — temporary directory management | `insta` | Compares output against saved "snapshots" to catch regressions |
| `proptest` | Generates random inputs to find edge cases |
| `tempfile` | Creates temporary directories for tests |
--- ---
## System Layers ## How Data Flows Through the System
``` ```
┌─────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────┐
│ USER INTERFACE │ │ USER INTERFACE │
REPL (interactive CLI) | Batch file loading You type commands or load .geolog files here
├─────────────────────────────────────────────────────┤ ├─────────────────────────────────────────────────────┤
│ PARSING LAYER (Lexer → Parser → AST) │ │ PARSING LAYER │
│ Turns text into a structured tree (like HTML DOM) │
│ Example: "P : Sort;" becomes {name: "P", kind: Sort}│
├─────────────────────────────────────────────────────┤ ├─────────────────────────────────────────────────────┤
│ ELABORATION LAYER (AST → Core IR) │ │ ELABORATION LAYER │
│ Type checking, name resolution, theory/instance │ │ Checks types and resolves names │
│ Example: Verifies "src : E -> V" refers to real │
│ sorts E and V that were declared │
├─────────────────────────────────────────────────────┤ ├─────────────────────────────────────────────────────┤
│ CORE LAYER (Typed Representation) │ │ CORE LAYER │
│ Signature, Term, Formula, Structure │ │ The verified, type-checked representation │
│ Everything here is guaranteed to be well-formed │
├─────────────────────────────────────────────────────┤ ├─────────────────────────────────────────────────────┤
│ STORAGE LAYER (Persistence) │ │ STORAGE LAYER │
│ Append-only store with version control │ │ Saves everything to disk, tracks history │
│ Like git for your data │
├─────────────────────────────────────────────────────┤ ├─────────────────────────────────────────────────────┤
│ QUERY & SOLVER LAYER (Execution) │ │ QUERY & SOLVER LAYER │
│ Chase algorithm, congruence closure, model search │ │ Runs the chase algorithm to derive new facts │
│ Searches for models that satisfy your rules │
├─────────────────────────────────────────────────────┤ ├─────────────────────────────────────────────────────┤
│ TENSOR ALGEBRA (Axiom Checking) │ │ TENSOR ALGEBRA │
│ Sparse tensor evaluation for axiom validation │ │ Efficiently checks which rules need to fire │
│ Uses sparse matrices for performance │
└─────────────────────────────────────────────────────┘ └─────────────────────────────────────────────────────┘
``` ```
@ -55,50 +73,79 @@
## Directory Structure ## Directory Structure
| Path | Purpose | | Path | What's There |
|---------------------|----------------------------------------------| |---------------------|--------------|
| `src/bin/geolog.rs` | CLI entry point | | `src/bin/geolog.rs` | The main program entry point — run `cargo run` and this executes |
| `src/lib.rs` | Library root | | `src/lib.rs` | Library code that other Rust programs could use |
| `src/repl.rs` | Interactive REPL state machine | | `src/repl.rs` | The interactive prompt (handles `:chase`, `:inspect`, etc.) |
| `src/lexer.rs` | Tokenization | | `src/lexer.rs` | Breaks text into tokens like `theory`, `{`, `Sort`, `;` |
| `src/parser.rs` | Token stream → AST | | `src/parser.rs` | Builds a tree structure from tokens |
| `src/ast.rs` | Abstract syntax tree types | | `src/ast.rs` | Defines what the tree structure looks like |
| `src/core.rs` | Core IR: Signature, Term, Formula, Structure | | `src/core.rs` | The "verified" internal representation |
| `src/elaborate/` | AST → Core elaboration | | `src/elaborate/` | Type checking and name resolution |
| `src/store/` | Persistence layer | | `src/store/` | Saving/loading data, version control |
| `src/query/` | Chase algorithm, relational algebra | | `src/query/` | The chase algorithm lives here |
| `src/solver/` | SMT-style model enumeration | | `src/solver/` | Model search (finding valid instances) |
| `src/tensor/` | Sparse tensor algebra | | `src/tensor/` | Efficient rule checking with sparse matrices |
| `src/cc.rs` | Congruence closure (union-find) | | `src/cc.rs` | Congruence closure — tracks equalities |
| `examples/geolog/` | 30+ example `.geolog` files | | `examples/geolog/` | 30+ example files you can learn from |
| `tests/` | Test files | | `tests/` | Automated tests |
| `docs/` | ARCHITECTURE.md, SYNTAX.md | | `docs/` | Documentation (ARCHITECTURE.md, SYNTAX.md) |
| `proofs/` | Lean4 formalization |
--- ---
## Key Entry Points ## Key Entry Points
| Entry Point | Location | If you're reading the code, start here:
|-------------|----------|
| CLI | `src/bin/geolog.rs` | | What You Want | Where To Look |
| Parse | `src/lib.rs::parse()` | |---------------|---------------|
| REPL | `src/repl.rs::ReplState::process_line()` | | How the CLI starts | `src/bin/geolog.rs` |
| Theory elaboration | `elaborate/theory.rs::elaborate_theory()` | | How files are parsed | `src/lib.rs::parse()` |
| Instance elaboration | `elaborate/instance.rs::elaborate_instance_ctx()` | | How REPL commands work | `src/repl.rs::ReplState::process_line()` |
| Chase | `query/chase.rs::chase_fixpoint_with_cc()` | | How theories are checked | `src/elaborate/theory.rs::elaborate_theory()` |
| Model enumeration | `solver/mod.rs::enumerate_models()` | | How instances are built | `src/elaborate/instance.rs::elaborate_instance_ctx()` |
| How facts are derived | `src/query/chase.rs::chase_fixpoint_with_cc()` |
| How models are searched | `src/solver/mod.rs::enumerate_models()` |
--- ---
## REPL Commands ## REPL Commands
When you run `geolog` interactively, these commands are available:
``` ```
:list, :inspect <name> - Introspection :list - Show all theories and instances
:add, :assert, :retract - Mutations :inspect Triangle - Show details of "Triangle" instance
:query, :explain, :compile - Query analysis
:chase, :solve, :extend - Inference :add Triangle v : V; - Add element v to Triangle
:commit, :history - Version control :assert [x:a, y:b] R; - Assert a relation holds
:source <file> - Load programs :retract [x:a, y:b] R; - Remove a relation
:help - Show help
:chase Triangle - Derive all consequences from axioms
:solve MyInstance - Search for a valid model
:commit "message" - Save current state (like git commit)
:history - Show past commits
:source file.geolog - Load and run a file
:help - Show help
```
### Example Session
```
$ cargo run -- examples/geolog/graph.geolog
> :list
Theories: Graph
Instances: Triangle
> :inspect Triangle
Instance Triangle : Graph
Carriers:
V: {A, B, C}
E: {ab, bc, ca}
Functions:
src: {ab -> A, bc -> B, ca -> C}
tgt: {ab -> B, bc -> C, ca -> A}
``` ```

View File

@ -1,66 +1,215 @@
# Design Decisions # Design Decisions
## Geometric Logic Foundation This document explains *why* Geolog works the way it does.
- **Axioms as Sequents:** `forall vars. premises |- conclusion`
- **Positive Conclusions:** Existentials and disjunctions allowed, but never negation
- **Geometric Morphisms:** Preserved by design for category-theoretic semantics
--- ---
## Identity System ## Geometric Logic: What It Means
- **Luid** ("Local Universe ID"): Globally unique across all structures Geolog uses "geometric logic" — a restricted form of logic that can express:
- **Slid** ("Structure-Local ID"): Index within a single structure - "If X then Y" rules
- Bidirectional mapping enables persistent identity despite structure changes - "There exists something such that..."
- "A or B" (but you can't say "not A")
**What you CAN write:**
```
// If x and y are both connected to z, they know each other
forall x, y, z. connected(x,z), connected(y,z) |- knows(x,y)
// Every person has a best friend
forall p : Person. |- exists f : Person. best_friend(p, f)
// Every task is either done or pending
forall t : Task. |- done(t) \/ pending(t)
```
**What you CANNOT write:**
```
// No negation allowed
forall x. |- not blocked(x) // INVALID
// No "if not" patterns
forall x. not done(x) |- pending(x) // INVALID
```
**Why this restriction?** Geometric logic has nice mathematical properties — it's preserved by "geometric morphisms" between topoi. This isn't just theory; it means the reasoning is compositional and predictable.
--- ---
## Storage ## Postfix Function Application
- **Append-Only:** Elements only tombstoned, never deleted Most languages write `f(x)` to apply function `f` to `x`. Geolog writes `x f`.
- **Patch-based Versioning:** Each commit is a delta from parent
- **GeologMeta:** Single homoiconic theory instance storing all data ```
// Traditional notation
src(ab) // "apply src to ab"
// Geolog notation
ab src // "ab, then src"
```
**Why?** It matches how you read pipelines left-to-right:
```
// Geolog: read left to right
ab src tgt // "take ab, get its src, get that's tgt"
// Traditional: read inside out
tgt(src(ab)) // "tgt of src of ab"
```
This is called "categorical" style because it matches how mathematicians compose morphisms in category theory.
---
## Identity System: Luids and Slids
Geolog needs to track elements across different operations. It uses two kinds of IDs:
| ID Type | Scope | Example |
|---------|-------|---------|
| **Luid** (Local Universe ID) | Global — unique across everything | "Element #4827" |
| **Slid** (Structure-Local ID) | Local — index within one structure | "The 3rd vertex" |
**Why both?**
- Slids are small integers (0, 1, 2...) — fast for computation
- Luids persist across saves/loads — you can reference "that specific vertex" forever
```
Structure "Triangle":
Vertex A has Slid=0, Luid=1001
Vertex B has Slid=1, Luid=1002
Vertex C has Slid=2, Luid=1003
After merging A and B (because an axiom said A = B):
Vertex A has Slid=0, Luid=1001 (canonical)
Vertex B → now points to A
```
---
## Append-Only Storage
Geolog never deletes data. When you "remove" something, it's marked as removed but still stored.
**Why?**
- Perfect audit trail — you can see exactly what changed and when
- Safe undo — nothing is ever lost
- Git-like history — you can go back to any previous state
```
Commit 1: Added vertices A, B, C
Commit 2: Added edge ab
Commit 3: Marked edge ab as "removed" ← still stored, just flagged
```
--- ---
## Type System ## Type System
- **Postfix Application:** `x f` not `f(x)` — categorical style ### Sorts (Types)
- **Derived Sorts:** Products of base sorts for record domains
- **Product Domains:** Functions can take record arguments: `[x: M, y: M] -> M` A **sort** is a type of thing. Vertices and Edges are different sorts:
- **Relations → Prop:** Relations are functions to `Prop`
```geolog
theory Graph {
V : Sort; // V is a type (the type of vertices)
E : Sort; // E is a type (the type of edges)
}
```
### Functions
Functions map elements to elements:
```geolog
src : E -> V; // Every edge has a source vertex
tgt : E -> V; // Every edge has a target vertex
mul : [x: M, y: M] -> M; // Binary function taking a pair
```
The `[x: M, y: M]` is a **product domain** — the function takes a record with two fields.
### Relations
Relations are yes/no questions about tuples:
```geolog
leq : [x: X, y: X] -> Prop; // "Is x ≤ y?"
connected : [a: V, b: V] -> Prop;
```
`Prop` means "proposition" — it's either true or false for each input.
--- ---
## Chase Algorithm ## Chase Algorithm Design
- **Fixpoint Iteration:** Derives consequences until closure The chase repeatedly applies rules until nothing new can be derived:
- **Congruence Closure:** Merges elements when axioms conclude `x = y`
- **Tensor Algebra:** Efficient axiom checking via sparse tensors ```
- **Termination:** Unit laws no longer cause infinite loops Initial facts: friends(alice, bob), friends(bob, charlie)
Rule: friends(x,y), friends(y,z) |- knows(x,z)
Iteration 1:
- Rule matches with x=alice, y=bob, z=charlie
- Conclusion knows(alice, charlie) is missing
- Add: knows(alice, charlie)
Iteration 2:
- No rule has missing conclusions
- Stop (fixpoint reached)
Final facts: friends(alice, bob), friends(bob, charlie), knows(alice, charlie)
```
**Key features:**
- Uses **tensor algebra** to efficiently find which rules need to fire
- Integrates **congruence closure** so `x = y` conclusions actually merge elements
- **Terminates** for theories with identity elements (earlier versions could loop forever)
--- ---
## Solver Architecture ## Solver Architecture
- **Explicit Search Tree:** Not implicit in call stack When you need to *find* a model (not just derive facts), the solver searches:
- **Refinement Preorder:** Structures can grow (carriers, functions, relations)
- **Obligations vs Unsat:** Axiom obligation = need to witness conclusion (not failure) ```
- **Tactics-based:** AutoTactic composes CheckTactic, ForwardChainingTactic, PropagateEquationsTactic Goal: Find an instance of Graph with a cycle
Search tree:
Node 1: Empty graph
→ Obligation: need at least one edge (from some axiom)
→ Branch: try adding edge e1
Node 2: Graph with e1
→ Obligation: e1 needs src and tgt
→ Branch: try src(e1) = v1, tgt(e1) = v1 (self-loop!)
Node 3: Graph with e1, v1, self-loop
→ All axioms satisfied
→ SUCCESS: return this model
```
**Key features:**
- **Explicit search tree** — the solver builds a tree structure, not recursive function calls
- **Tactics** — different strategies (forward chaining, equation propagation) can be combined
- **Obligations** — "this axiom requires a witness" is tracked explicitly
--- ---
## Query Compilation ## Query Compilation
- **QueryOp Intermediate:** SQL-like operators (Scan, Filter, Join, Project) Queries are compiled to relational algebra (like SQL internally):
- **Optimization Passes:** Filter fusion, projection pushdown
- **Store-aware:** Compiled directly to GeologMeta queries with indexing
--- ```
Query: Find all x,y where connected(x,y) and connected(y,x)
## Tensor Algebra Compiled plan:
1. Scan relation "connected" → get all (x,y) pairs
2. Scan relation "connected" → get all (a,b) pairs
3. Join where y=a and x=b
4. Project out x,y
```
- **Sparse Representation:** Roaring Bitmaps for efficient membership This is optimized with standard database techniques (filter pushdown, etc.).
- **Lazy Expression Trees:** Tensor products fused with contractions
- **Boolean Semiring:** AND for product, OR for sum

View File

@ -1,82 +1,152 @@
# Project Status # Project Status
**Version:** 0.1.0 **Version:** 0.1.0 (Early stage, but functional)
--- ---
## Metrics ## Codebase Size
| Metric | Count | | What | Count |
|--------|-------| |------|-------|
| Rust source code | 30,284 lines | | Rust source code | ~30,000 lines |
| Test code | 7,872 lines | | Test code | ~8,000 lines |
| Example `.geolog` files | 2,346 lines | | Example `.geolog` files | ~2,300 lines across 30 files |
| Example files | 30 | | Source files | 84 |
| Rust source files | 84 |
| Unit tests | 137 (all passing) | | Unit tests | 137 (all passing) |
| Integration tests | 31 (all passing) | | Integration tests | 31 (all passing) |
--- ---
## What's Complete ## What Works Well
- Parser, lexer, syntax support These features are complete and reliable:
- Type system and elaboration
- Chase algorithm with equality saturation ### Core Engine
- Tensor algebra for axiom checking - **Parser** — reads `.geolog` files, gives helpful error messages
- SMT-style model enumeration - **Type checker** — catches mistakes like using undefined sorts
- Persistence and version control - **Chase algorithm** — derives facts from rules correctly
- Interactive REPL - **Equality handling** — when rules say `x = y`, elements are properly merged
- Comprehensive test suite
### Examples That Run
```bash
cargo run -- examples/geolog/graph.geolog # Simple directed graph
cargo run -- examples/geolog/preorder.geolog # Reflexive, transitive relation
cargo run -- examples/geolog/transitive_closure.geolog # Shows chase in action
cargo run -- examples/geolog/category.geolog # Category theory basics
```
### Interactive Features
- REPL with command history
- Inspect any theory or instance
- Run chase interactively
- Save/load with version history
--- ---
## Mostly Complete ## What Mostly Works
- Parameterized theories These features work but have limitations:
- Nested instances (parsing done, elaboration partial)
- Query optimization ### Parameterized Theories
- Relational algebra compilation You can define theories that depend on other instances:
- Congruence closure integration
```geolog
theory (N : PetriNet instance) Marking {
token : Sort;
token/of : token -> N/P; // References N's places
}
```
**Limitation:** Nested instances (instances inside instances) only partially work.
### Model Search
The solver can find models satisfying axioms, but:
- Works best for small models
- Some search strategies are incomplete
- No timeout handling
--- ---
## Known Gaps ## Known Gaps
### TODOs in Code ### Things That Don't Work Yet
| Location | Description | | Feature | Status |
|----------|-------------| |---------|--------|
| `src/repl.rs:84` | Store full theory_type in GeologMeta | | Product types in conclusions | Functions returning records don't fully work |
| `src/overlay.rs:380` | Copy functions (deferred) | | Existentials over products | `exists p : [x:M, y:M]. ...` fails |
| `src/store/query.rs:61` | Columnar batch loading | | Field projections in equations | `x.field = y` doesn't compile |
| `src/store/instance.rs` | External references, product domains |
| `src/query/to_relalg.rs` | Column path navigation, ScanRelationOp |
### Incomplete Features ### Example of What Fails
- Product types in conclusions don't fully work ```geolog
- Existentials over product sorts unsupported // This works:
- Field projections in equations don't compile forall x : M. |- exists y : M. f(x) = y
- Disjunction fires both branches (correct but inefficient)
- Tensor compilation failures are silent
### In Progress // This doesn't work (product sort in existential):
forall x : M. |- exists p : [a:M, b:M]. g(x) = p
```
- Lean4 formalization of monotonic submodel proofs ### Silent Failures
- Homoiconic query plan representation
- Disjunction variable alignment for tensor builder When the tensor compiler can't handle an axiom, it **silently skips it**. This can be confusing — your axiom looks fine but nothing happens.
```geolog
// This axiom might be silently skipped if it has unsupported patterns:
ax/complex : forall x. premise |- [a: x, b: x] some_relation;
// No error, but also no effect
```
---
## TODOs in the Code
These are marked in the source with `// TODO`:
| Location | What's Missing |
|----------|----------------|
| `src/repl.rs:84` | Theory type not fully stored in metadata |
| `src/overlay.rs:380` | Copying functions between structures |
| `src/store/query.rs:61` | Batch loading for better performance |
| `src/store/instance.rs` | External references between instances |
| `src/query/to_relalg.rs` | Some relational operators |
--- ---
## Test Coverage ## Test Coverage
| File | Coverage Area | ### What's Tested
|------|---------------|
| `tests/unit_elaborate.rs` | Type checking (14 tests) |
| `tests/unit_chase.rs` | Chase algorithm |
| `tests/unit_parsing.rs` | Parser (4 tests) |
| `tests/examples_integration.rs` | 31 integration tests |
| `tests/proptest_*.rs` | 11 property-based test suites |
Fuzzing targets exist in `fuzz/` directory. | Area | Tests |
|------|-------|
| Type checking | 14 unit tests |
| Chase algorithm | Comprehensive unit tests |
| Parser | 4 unit tests + snapshot tests |
| Examples | 31 integration tests (one per example file) |
| Random inputs | 11 property-based test suites |
### Running Tests
```bash
cargo test # Run all tests
cargo test unit_chase # Run only chase tests
cargo test --release # Run with optimizations (faster)
```
### Fuzzing
The `fuzz/` directory has fuzzing targets that generate random inputs to find crashes:
```bash
cargo +nightly fuzz run fuzz_parser
```
---
## What's Being Worked On
According to code comments and structure:
1. **Lean4 formalization** — Mathematical proofs about correctness (in `proofs/`)
2. **Better disjunction handling** — Currently fires both branches, should choose one
3. **Query plan improvements** — More efficient execution

View File

@ -1,88 +1,273 @@
# Key Concepts # Key Concepts
## Axioms This document explains the main ideas in Geolog without assuming math background.
An axiom is an "if-then" rule: ---
> "If premises are true, then conclusion must be true." ## Theories and Instances
A **theory** is a template — it defines what *kinds* of things exist and what rules they follow.
An **instance** is a concrete example — actual things that follow the theory's rules.
### Example: Graphs
```geolog
// The theory: "A graph has vertices and edges"
theory Graph {
V : Sort; // There's a kind of thing called "vertex"
E : Sort; // There's a kind of thing called "edge"
src : E -> V; // Every edge has a source vertex
tgt : E -> V; // Every edge has a target vertex
}
// An instance: A specific triangle graph
instance Triangle : Graph = {
// Three vertices
A : V;
B : V;
C : V;
// Three edges forming a triangle
ab : E;
ab src = A; // Edge ab goes from A...
ab tgt = B; // ...to B
bc : E;
bc src = B;
bc tgt = C;
ca : E;
ca src = C;
ca tgt = A;
}
```
Think of it like:
- **Theory** = "A spreadsheet template with certain columns"
- **Instance** = "A filled-in spreadsheet"
---
## Axioms (Rules)
An **axiom** tells Geolog: "Whenever the left side is true, make the right side true too."
``` ```
forall x, y, z. friends(x,y), friends(y,z) |- knows(x,z) forall x, y, z. friends(x,y), friends(y,z) |- knows(x,z)
↑ ↑ ↑
"for any x,y,z" "if these are true" "then this must be true"
``` ```
Similar to Datalog rules, but extended with: ### Simple Example: Transitivity
- Existentials in conclusions: `|- exists x. R(x)`
- Disjunctions: `|- A \/ B`
- Equality conclusions: `|- x = y`
Datalog is limited to Horn clauses (single atom in head). Geolog supports geometric sequents. "If A ≤ B and B ≤ C, then A ≤ C"
```geolog
theory Preorder {
X : Sort;
leq : [x: X, y: X] -> Prop; // "x ≤ y" relation
// Transitivity rule
ax/trans : forall x: X, y: X, z: X.
[x: x, y: y] leq, [x: y, y: z] leq |- [x: x, y: z] leq;
}
```
### What `[x: x, y: y] leq` Means
This is Geolog's way of writing `leq(x, y)` or "x ≤ y":
```
[x: x, y: y] leq
↑ ↑
a record applied to the "leq" relation
with fields
x and y
```
It's like calling a function with named parameters: `leq(x=x, y=y)`.
--- ---
## Chase Algorithm ## The Chase Algorithm
Repeatedly applies axioms until no new facts can be derived (fixpoint). The **chase** is what makes Geolog useful. It automatically applies your rules to derive new facts.
### Building Blocks ### Step-by-Step Example
1. **Axioms** — rules to apply **Setup:**
2. **Structure** — carriers (element sets), functions, relations ```geolog
3. **Bindings** — variable-to-element mappings theory Friends {
4. **Tensor system** — finds violations (premises true, conclusion false) Person : Sort;
5. **Fire conclusion** — adds relations, defines functions, creates elements friends : [a: Person, b: Person] -> Prop;
6. **Congruence closure** — tracks element equality via union-find knows : [a: Person, b: Person] -> Prop;
### Loop // Rule: friends-of-friends know each other
ax/fof : forall x, y, z : Person.
[a: x, b: y] friends, [a: y, b: z] friends |- [a: x, b: z] knows;
}
instance Group : Friends = {
alice : Person;
bob : Person;
charlie : Person;
[a: alice, b: bob] friends; // Alice and Bob are friends
[a: bob, b: charlie] friends; // Bob and Charlie are friends
}
```
**Running the chase:**
``` ```
repeat until no changes: Initial state:
for each axiom: friends = {(alice, bob), (bob, charlie)}
find violations (tensor system) knows = {}
fire conclusion for each violation
propagate equations (congruence closure) Chase iteration 1:
Looking at axiom ax/fof...
Found match: x=alice, y=bob, z=charlie
- [a:alice, b:bob] friends? YES
- [a:bob, b:charlie] friends? YES
- [a:alice, b:charlie] knows? NO ← violation!
Fire conclusion: add [a:alice, b:charlie] knows
knows = {(alice, charlie)} ← NEW FACT
Chase iteration 2:
Looking at axiom ax/fof...
No new violations found.
DONE (fixpoint reached)
Final state:
friends = {(alice, bob), (bob, charlie)}
knows = {(alice, charlie)}
``` ```
### The Key Insight
You don't write a program to derive facts. You write *rules*, and Geolog figures out everything that follows from them.
--- ---
## Union-Find ## Union-Find (Equality Tracking)
Data structure for tracking equivalence classes. Inputs are elements (IDs), not sets. When a rule concludes that two things are equal (`x = y`), Geolog needs to merge them. It uses a data structure called **union-find**.
``` ### The Problem
union(3, 7) // "3 and 7 are now equivalent"
union(7, 12) // "7 and 12 are now equivalent" ```geolog
find(3) // Returns same representative as find(12) // Rule: If two people have the same ID, they're the same person
ax/same_id : forall p, q : Person, i : ID.
[p: p] id = i, [p: q] id = i |- p = q;
``` ```
### Why Not Dictionary + Sets? If this rule fires with `p = alice` and `q = alicia`, Geolog learns that `alice = alicia`. But now every reference to `alicia` should really mean `alice`.
| Approach | Union Cost | ### How Union-Find Works
|----------|------------|
| Dict + Sets | O(n) — must update all elements in merged set |
| Union-Find | O(1) amortized — just one pointer update |
Used for congruence closure when axioms conclude `x = y`. ```
Before:
alice → alice (points to itself)
alicia → alicia (points to itself)
After union(alice, alicia):
alice → alice (the "representative")
alicia → alice (now points to alice)
Later, find(alicia) returns alice
```
**Why not just use a dictionary?**
| Approach | Cost of merging N elements |
|----------|---------------------------|
| Dictionary + Sets | O(N) — must update every element |
| Union-Find | O(1) — just change one pointer |
When rules derive thousands of equalities, this matters.
--- ---
## Language Syntax ## Existentials (Creating New Things)
Not based on any single existing language. Influences: Rules can say "there must exist something":
| Feature | Source | ```geolog
|---------|--------| // Every person has a best friend
| Postfix application (`x f` not `f(x)`) | Category theory | ax/has_bf : forall p : Person. |- exists f : Person. [p: p, f: f] best_friend;
| Sequent notation (`premises \|- conclusion`) | Proof theory | ```
| Path separators (`/` not `.`) | Unique to Geolog |
| Theory/Instance structure | Algebraic specification languages | When the chase processes this:
1. **Check:** Does person `p` already have a best friend?
2. **If yes:** Do nothing
3. **If no:** Create a new person and make them `p`'s best friend
```
Before chase:
Persons = {alice, bob}
best_friend = {}
After chase:
Persons = {alice, bob, fresh_1, fresh_2}
best_friend = {(alice, fresh_1), (bob, fresh_2)}
```
The chase created `fresh_1` and `fresh_2` because the axiom demanded witnesses.
--- ---
## Geometric Logic Restrictions ## Disjunctions (Or)
By design, Geolog excludes: Rules can have "or" in conclusions:
- Negation (no `not R(x)`)
- Classical implication (no `if A then B`)
- Arithmetic (no `x + 1 = y`)
These are intentional limits for topos-theoretic semantics, not missing features. ```geolog
// Every task is either done or pending
ax/status : forall t : Task. |- [t: t] done \/ [t: t] pending;
```
**Current behavior:** Geolog fires *both* branches. So every task becomes both done AND pending.
**Why?** Proper disjunction handling requires backtracking search, which is complex. The current approach is "sound" (never wrong) but "incomplete" (may add more facts than necessary).
---
## Tensor Algebra (How Chase Is Fast)
Checking "which rules have violations" naively requires checking every possible variable assignment. For 100 people and 3 variables, that's 100³ = 1,000,000 checks.
Geolog uses **sparse tensors** (think: sparse matrices, but multi-dimensional) to do this efficiently:
```
Relations are stored as sparse tensors:
friends[i][j] = 1 if person i and person j are friends
Finding violations becomes tensor operations:
violations = friends ⊗ friends ⊗ (1 - knows)
↑ ↑ ↑
"x,y friends" "y,z friends" "x,z not known"
```
This is much faster than nested loops because:
- Sparse storage skips empty cells
- Tensor operations are highly optimized
- Results are computed in bulk, not one-by-one
---
## What Geolog Can't Express
Geometric logic intentionally excludes some things:
| Can't Write | Why Not |
|-------------|---------|
| `not R(x)` | No negation — can't say "x is NOT red" |
| `if not A then B` | No negation in premises |
| `x + 1 = y` | No arithmetic — this isn't a calculator |
| `forall x. R(x)` with no conclusion | Axioms must have conclusions |
**This is by design.** These restrictions make the logic "geometric," which has nice theoretical properties (preserved by geometric morphisms, has models in any topos, etc.).
For practical purposes: Geolog is good for "if-then" rules and "there exists" statements, not for arithmetic or negation.

View File

@ -1,79 +1,179 @@
# Critical Assessment # Critical Assessment
This document takes a skeptical look at what Geolog actually is versus what the documentation suggests.
---
## Origin ## Origin
AI-generated project by Claude Opus 4.5. From README: Geolog was generated by an AI (Claude Opus 4.5). From the README:
> "This README was synthesized automatically by Claude Opus 4.5. As was this entire project, really." > "This README was synthesized automatically by Claude Opus 4.5. As was this entire project, really."
--- This is unusual — most software is written by humans over time. Geolog was generated in a relatively short period, which has implications:
## Claimed Use Cases **Positives:**
- Consistent code style throughout
- Good test coverage from the start
- Clean architecture
- Business process workflow orchestration **Concerns:**
- Formal verification - No organic evolution based on real user feedback
- Database query design - No battle-testing in production environments
- Petri net reachability - Design decisions may be theoretically elegant but practically awkward
--- ---
## Reality Check ## Claimed vs. Actual Use Cases
| Claim | Assessment | ### Claim: "Business Process Workflow Orchestration"
|-------|------------|
| **Workflow orchestration** | No APIs, no integrations, no distributed execution. Just a REPL. Real workflow engines (Temporal, Camunda) handle failure recovery, monitoring, external systems. | **What this suggests:** You could use Geolog to manage business workflows — approve orders, route documents, handle exceptions.
| **Formal verification** | Lean4 proofs "in progress." Not comparable to Coq, Lean, or TLA+ which have decades of development and industrial use. |
| **Database query design** | Not a database. No persistence at scale, no SQL, no transactions, no concurrency. | **Reality:** Geolog is a command-line REPL. It has:
| **Petri net reachability** | Toy examples with 3-4 places. Not industrial-scale process modeling. | - No REST API
- No integration with external systems
- No notification system
- No user interface
- No distributed execution
- No failure recovery
**What real workflow tools have:** Temporal, Camunda, and Airflow handle retries, timeouts, external service calls, monitoring dashboards, and run across multiple servers.
**Verdict:** Geolog can *model* a workflow on paper. It cannot *run* a workflow in production.
--- ---
## What It Actually Is ### Claim: "Formal Verification"
A **research/educational tool** for: **What this suggests:** You could use Geolog to prove your software is correct.
- Learning geometric logic
- Understanding chase algorithms **Reality:**
- Experimenting with category-theoretic ideas - The Lean4 proofs in `proofs/` are "in progress"
- Playing with Petri net examples - There's no connection between Geolog models and actual code
- Prototyping constraint systems - No proof certificates are generated
**What real verification tools have:** Coq, Lean, and TLA+ have decades of development, extensive libraries, extraction to executable code, and industrial use (CompCert, seL4, Amazon's use of TLA+).
**Verdict:** Geolog has some formal foundations but is not a verification tool.
--- ---
## What It's Not ### Claim: "Database Query Design"
- Production infrastructure **What this suggests:** You could use Geolog for database work.
- Real workflow orchestration platform
- Industrial formal verification tool **Reality:**
- Database replacement - Data lives in memory (with optional file persistence)
- Anything at scale - No SQL interface
- No transactions
- No concurrent access
- No indexing for large datasets
- No connection to actual databases
**What real database tools have:** PostgreSQL, SQLite, and even embedded databases handle concurrent writes, ACID transactions, and gigabytes of data.
**Verdict:** Geolog has relational algebra internally for a different purpose. It's not a database.
---
### Claim: "Petri Net Reachability"
**What this suggests:** You could analyze Petri nets with Geolog.
**Reality:** This is the most honest claim. The examples show Petri net modeling:
```geolog
theory PetriNet {
P : Sort; // Places
T : Sort; // Transitions
pre : [t: T, p: P] -> Prop; // Input arcs
post : [t: T, p: P] -> Prop; // Output arcs
}
```
**But:** The examples have 3-4 places. Industrial Petri net tools handle thousands of places with specialized algorithms (state space reduction, symbolic model checking).
**Verdict:** Geolog can model small Petri nets. It's educational, not industrial.
---
## What Geolog Actually Is
### Good For
| Use | Why |
|-----|-----|
| **Learning geometric logic** | Clean implementation with good examples |
| **Understanding chase algorithms** | The code is readable and well-commented |
| **Experimenting with rule systems** | Quick to try ideas in the REPL |
| **Teaching category theory concepts** | Postfix notation matches categorical thinking |
| **Prototyping constraint systems** | Faster than writing from scratch |
### Not Good For
| Use | Why Not |
|-----|---------|
| **Production systems** | No APIs, no monitoring, no error handling |
| **Large datasets** | Everything in memory, no optimization for scale |
| **Real workflows** | No external integrations |
| **Serious verification** | Incomplete proofs, no connection to real code |
| **Anything user-facing** | It's a developer REPL, not an application |
--- ---
## Technical Influences ## Technical Influences
| Concept | Source | Geolog combines ideas from several fields, which is interesting but also means it's not the best tool for any single field:
|---------|--------|
| Geometric logic | Mathematical logic |
| Chase algorithm | Database theory |
| Tensor algebra | Linear algebra / sparse computation |
| Equality saturation | Program optimization (egg/egglog) |
| Congruence closure | Automated theorem proving |
Uses `egglog-union-find` crate from the egg/egglog project. | Concept | From | Better Specialized Tool |
|---------|------|------------------------|
| Geometric logic | Mathematical logic | Lean, Coq (for proofs) |
| Chase algorithm | Database theory | Datalog engines (for scale) |
| Equality saturation | Program optimization | egg/egglog (more mature) |
| Tensor operations | Linear algebra | NumPy, sparse matrix libraries |
| Model search | SMT solving | Z3, CVC5 (much more powerful) |
--- ---
## Related Tools ## Code Quality Assessment
Similar tools in adjacent spaces (not cited as direct inspirations): Despite the skepticism above, the code itself is well-done:
- **Alloy** — relational logic model finder
- **Datalog** — logic programming for databases **Positives:**
- **CQL** — categorical query language - Comprehensive tests (168 passing)
- **Z3** — SMT solver - Good error messages with source locations
- **Egg** — equality saturation toolkit - Clean module separation
- Documented design decisions
**Negatives:**
- Some silent failures (tensor compilation)
- Incomplete features (product types)
- No performance benchmarks
- Limited real-world testing
---
## Who Should Use This?
**Yes:**
- Students learning about logic and type systems
- Researchers exploring geometric logic
- Developers curious about chase algorithms
- Anyone wanting to understand equality saturation
**No:**
- Teams building production systems
- Anyone needing a database
- Anyone needing workflow automation
- Anyone needing formal guarantees
--- ---
## Bottom Line ## Bottom Line
Well-crafted educational project, not production infrastructure. The code quality is high and tests pass, but there's no evidence of real-world usage or battle-testing. Geolog is a **well-crafted educational project**, not production infrastructure.
The code is clean, tests pass, and the examples work. But the claimed "use cases" are aspirational — describing what geometric logic *could theoretically* be used for, not what this specific tool is ready to do.
If you want to learn about geometric logic and chase algorithms, Geolog is excellent. If you want to solve real problems, use established tools in each domain.