geolog-zeta-fork/notes/002-design-decisions.md

# Design Decisions

This document explains *why* Geolog works the way it does.

---

## Geometric Logic: What It Means

Geolog uses "geometric logic" — a restricted form of logic that can express:
- "If X then Y" rules
- "There exists something such that..."
- "A or B" (but you can't say "not A")

**What you CAN write:**
```
// If x and y are both connected to z, they know each other
forall x, y, z. connected(x,z), connected(y,z) |- knows(x,y)

// Every person has a best friend
forall p : Person. |- exists f : Person. best_friend(p, f)

// Every task is either done or pending
forall t : Task. |- done(t) \/ pending(t)
```

**What you CANNOT write:**
```
// No negation allowed
forall x. |- not blocked(x)           // INVALID

// No "if not" patterns
forall x. not done(x) |- pending(x)   // INVALID
```

**Why this restriction?** Geometric logic has nice mathematical properties — it's preserved by "geometric morphisms" between topoi. This isn't just theory; it means the reasoning is compositional and predictable.

---

## Postfix Function Application

Most languages write `f(x)` to apply function `f` to `x`. Geolog writes `x f`.

```
// Traditional notation
src(ab)           // "apply src to ab"

// Geolog notation
ab src            // "ab, then src"
```

**Why?** It matches how you read pipelines left-to-right:

```
// Geolog: read left to right
ab src tgt        // "take ab, get its src, get that's tgt"

// Traditional: read inside out
tgt(src(ab))      // "tgt of src of ab"
```

This is called "categorical" style because it matches how mathematicians compose morphisms in category theory.

---

## Identity System: Luids and Slids

Geolog needs to track elements across different operations. It uses two kinds of IDs:

| ID Type | Scope | Example |
|---------|-------|---------|
| **Luid** (Local Universe ID) | Global — unique across everything | "Element #4827" |
| **Slid** (Structure-Local ID) | Local — index within one structure | "The 3rd vertex" |

**Why both?**
- Slids are small integers (0, 1, 2...) — fast for computation
- Luids persist across saves/loads — you can reference "that specific vertex" forever

```
Structure "Triangle":
  Vertex A has Slid=0, Luid=1001
  Vertex B has Slid=1, Luid=1002
  Vertex C has Slid=2, Luid=1003

After merging A and B (because an axiom said A = B):
  Vertex A has Slid=0, Luid=1001 (canonical)
  Vertex B → now points to A
```

---

## Append-Only Storage

Geolog never deletes data. When you "remove" something, it's marked as removed but still stored.

**Why?**
- Perfect audit trail — you can see exactly what changed and when
- Safe undo — nothing is ever lost
- Git-like history — you can go back to any previous state

```
Commit 1: Added vertices A, B, C
Commit 2: Added edge ab
Commit 3: Marked edge ab as "removed"  ← still stored, just flagged
```

---

## Type System

### Sorts (Types)

A **sort** is a type of thing. Vertices and Edges are different sorts:

```geolog
theory Graph {
  V : Sort;    // V is a type (the type of vertices)
  E : Sort;    // E is a type (the type of edges)
}
```

### Functions

Functions map elements to elements:

```geolog
src : E -> V;              // Every edge has a source vertex
tgt : E -> V;              // Every edge has a target vertex
mul : [x: M, y: M] -> M;   // Binary function taking a pair
```

The `[x: M, y: M]` is a **product domain** — the function takes a record with two fields.

### Relations

Relations are yes/no questions about tuples:

```geolog
leq : [x: X, y: X] -> Prop;   // "Is x ≤ y?"
connected : [a: V, b: V] -> Prop;
```

`Prop` means "proposition" — it's either true or false for each input.

---

## Chase Algorithm Design

The chase repeatedly applies rules until nothing new can be derived:

```
Initial facts: friends(alice, bob), friends(bob, charlie)
Rule: friends(x,y), friends(y,z) |- knows(x,z)

Iteration 1:
  - Rule matches with x=alice, y=bob, z=charlie
  - Conclusion knows(alice, charlie) is missing
  - Add: knows(alice, charlie)

Iteration 2:
  - No rule has missing conclusions
  - Stop (fixpoint reached)

Final facts: friends(alice, bob), friends(bob, charlie), knows(alice, charlie)
```

**Key features:**
- Uses **tensor algebra** to efficiently find which rules need to fire
- Integrates **congruence closure** so `x = y` conclusions actually merge elements
- **Terminates** for theories with identity elements (earlier versions could loop forever)

---

## Solver Architecture

When you need to *find* a model (not just derive facts), the solver searches:

```
Goal: Find an instance of Graph with a cycle

Search tree:
  Node 1: Empty graph
    → Obligation: need at least one edge (from some axiom)
    → Branch: try adding edge e1

  Node 2: Graph with e1
    → Obligation: e1 needs src and tgt
    → Branch: try src(e1) = v1, tgt(e1) = v1 (self-loop!)

  Node 3: Graph with e1, v1, self-loop
    → All axioms satisfied
    → SUCCESS: return this model
```

**Key features:**
- **Explicit search tree** — the solver builds a tree structure, not recursive function calls
- **Tactics** — different strategies (forward chaining, equation propagation) can be combined
- **Obligations** — "this axiom requires a witness" is tracked explicitly

---

## Query Compilation

Queries are compiled to relational algebra (like SQL internally):

```
Query: Find all x,y where connected(x,y) and connected(y,x)

Compiled plan:
  1. Scan relation "connected" → get all (x,y) pairs
  2. Scan relation "connected" → get all (a,b) pairs
  3. Join where y=a and x=b
  4. Project out x,y
```

This is optimized with standard database techniques (filter pushdown, etc.).