geolog-zeta-fork/notes/002-design-decisions.md
2026-03-20 11:30:19 +01:00

5.5 KiB

Design Decisions

This document explains why Geolog works the way it does.


Geometric Logic: What It Means

Geolog uses "geometric logic" — a restricted form of logic that can express:

  • "If X then Y" rules
  • "There exists something such that..."
  • "A or B" (but you can't say "not A")

What you CAN write:

// If x and y are both connected to z, they know each other
forall x, y, z. connected(x,z), connected(y,z) |- knows(x,y)

// Every person has a best friend
forall p : Person. |- exists f : Person. best_friend(p, f)

// Every task is either done or pending
forall t : Task. |- done(t) \/ pending(t)

What you CANNOT write:

// No negation allowed
forall x. |- not blocked(x)           // INVALID

// No "if not" patterns
forall x. not done(x) |- pending(x)   // INVALID

Why this restriction? Geometric logic has nice mathematical properties — it's preserved by "geometric morphisms" between topoi. This isn't just theory; it means the reasoning is compositional and predictable.


Postfix Function Application

Most languages write f(x) to apply function f to x. Geolog writes x f.

// Traditional notation
src(ab)           // "apply src to ab"

// Geolog notation
ab src            // "ab, then src"

Why? It matches how you read pipelines left-to-right:

// Geolog: read left to right
ab src tgt        // "take ab, get its src, get that's tgt"

// Traditional: read inside out
tgt(src(ab))      // "tgt of src of ab"

This is called "categorical" style because it matches how mathematicians compose morphisms in category theory.


Identity System: Luids and Slids

Geolog needs to track elements across different operations. It uses two kinds of IDs:

ID Type Scope Example
Luid (Local Universe ID) Global — unique across everything "Element #4827"
Slid (Structure-Local ID) Local — index within one structure "The 3rd vertex"

Why both?

  • Slids are small integers (0, 1, 2...) — fast for computation
  • Luids persist across saves/loads — you can reference "that specific vertex" forever
Structure "Triangle":
  Vertex A has Slid=0, Luid=1001
  Vertex B has Slid=1, Luid=1002
  Vertex C has Slid=2, Luid=1003

After merging A and B (because an axiom said A = B):
  Vertex A has Slid=0, Luid=1001 (canonical)
  Vertex B → now points to A

Append-Only Storage

Geolog never deletes data. When you "remove" something, it's marked as removed but still stored.

Why?

  • Perfect audit trail — you can see exactly what changed and when
  • Safe undo — nothing is ever lost
  • Git-like history — you can go back to any previous state
Commit 1: Added vertices A, B, C
Commit 2: Added edge ab
Commit 3: Marked edge ab as "removed"  ← still stored, just flagged

Type System

Sorts (Types)

A sort is a type of thing. Vertices and Edges are different sorts:

theory Graph {
  V : Sort;    // V is a type (the type of vertices)
  E : Sort;    // E is a type (the type of edges)
}

Functions

Functions map elements to elements:

src : E -> V;              // Every edge has a source vertex
tgt : E -> V;              // Every edge has a target vertex
mul : [x: M, y: M] -> M;   // Binary function taking a pair

The [x: M, y: M] is a product domain — the function takes a record with two fields.

Relations

Relations are yes/no questions about tuples:

leq : [x: X, y: X] -> Prop;   // "Is x ≤ y?"
connected : [a: V, b: V] -> Prop;

Prop means "proposition" — it's either true or false for each input.


Chase Algorithm Design

The chase repeatedly applies rules until nothing new can be derived:

Initial facts: friends(alice, bob), friends(bob, charlie)
Rule: friends(x,y), friends(y,z) |- knows(x,z)

Iteration 1:
  - Rule matches with x=alice, y=bob, z=charlie
  - Conclusion knows(alice, charlie) is missing
  - Add: knows(alice, charlie)

Iteration 2:
  - No rule has missing conclusions
  - Stop (fixpoint reached)

Final facts: friends(alice, bob), friends(bob, charlie), knows(alice, charlie)

Key features:

  • Uses tensor algebra to efficiently find which rules need to fire
  • Integrates congruence closure so x = y conclusions actually merge elements
  • Terminates for theories with identity elements (earlier versions could loop forever)

Solver Architecture

When you need to find a model (not just derive facts), the solver searches:

Goal: Find an instance of Graph with a cycle

Search tree:
  Node 1: Empty graph
    → Obligation: need at least one edge (from some axiom)
    → Branch: try adding edge e1

  Node 2: Graph with e1
    → Obligation: e1 needs src and tgt
    → Branch: try src(e1) = v1, tgt(e1) = v1 (self-loop!)

  Node 3: Graph with e1, v1, self-loop
    → All axioms satisfied
    → SUCCESS: return this model

Key features:

  • Explicit search tree — the solver builds a tree structure, not recursive function calls
  • Tactics — different strategies (forward chaining, equation propagation) can be combined
  • Obligations — "this axiom requires a witness" is tracked explicitly

Query Compilation

Queries are compiled to relational algebra (like SQL internally):

Query: Find all x,y where connected(x,y) and connected(y,x)

Compiled plan:
  1. Scan relation "connected" → get all (x,y) pairs
  2. Scan relation "connected" → get all (a,b) pairs
  3. Join where y=a and x=b
  4. Project out x,y

This is optimized with standard database techniques (filter pushdown, etc.).