# Key Concepts

This document explains the main ideas in Geolog without assuming a math background.

## Related Notes

If you want to compare Geolog with Datalog, or think about automatic translation, see:

- `notes/007-geolog-and-datalog.md` — a simple explanation of what maps well and what does not.
- `notes/008-geolog-to-datalog-subset.md` — a practical subset of Geolog that could be translated automatically into plain Datalog.

---

## Theories and Instances

A **theory** is a template — it defines what *kinds* of things exist and what rules they follow.

An **instance** is a concrete example — actual things that follow the theory's rules.

### Example: Graphs

```geolog
// The theory: "A graph has vertices and edges"
theory Graph {
  V : Sort;           // There's a kind of thing called "vertex"
  E : Sort;           // There's a kind of thing called "edge"
  src : E -> V;       // Every edge has a source vertex
  tgt : E -> V;       // Every edge has a target vertex
}

// An instance: A specific triangle graph
instance Triangle : Graph = {
  // Three vertices
  A : V;
  B : V;
  C : V;

  // Three edges forming a triangle
  ab : E;
  ab src = A;    // Edge ab goes from A...
  ab tgt = B;    // ...to B

  bc : E;
  bc src = B;
  bc tgt = C;

  ca : E;
  ca src = C;
  ca tgt = A;
}
```

Think of it like this:
- **Theory** = "A spreadsheet template with certain columns"
- **Instance** = "A filled-in spreadsheet"
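
The same split can be sketched in Python. This is a hypothetical encoding for illustration, not Geolog's internal representation: the names `GRAPH_THEORY`, `TRIANGLE`, and `conforms` are invented here, and the check assumes that functions such as `src` and `tgt` must be defined for every edge, as the comments in the theory suggest.

```python
# Hypothetical encoding, not Geolog's internal representation.
# The "theory" lists the sorts and the functions with their signatures.
GRAPH_THEORY = {
    "sorts": {"V", "E"},
    "functions": {"src": ("E", "V"), "tgt": ("E", "V")},
}

# The "instance" supplies concrete elements and function values.
TRIANGLE = {
    "elements": {"V": {"A", "B", "C"}, "E": {"ab", "bc", "ca"}},
    "functions": {
        "src": {"ab": "A", "bc": "B", "ca": "C"},
        "tgt": {"ab": "B", "bc": "C", "ca": "A"},
    },
}

def conforms(theory, instance):
    """Return True if every function is total and lands in the right sort."""
    for name, (dom, cod) in theory["functions"].items():
        table = instance["functions"].get(name, {})
        # Assumption: every element of the domain sort needs a value...
        if set(table) != instance["elements"][dom]:
            return False
        # ...and that value must belong to the codomain sort.
        if not set(table.values()) <= instance["elements"][cod]:
            return False
    return True

print(conforms(GRAPH_THEORY, TRIANGLE))  # True
```

The template (theory) never changes; only the filled-in data (instance) is checked against it.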

---

## Axioms (Rules)

An **axiom** tells Geolog: "Whenever the left side is true, make the right side true too."

```
forall x, y, z. friends(x,y), friends(y,z) |- knows(x,z)
        ↑              ↑                         ↑
   "for any x,y,z"  "if these are true"    "then this must be true"
```

### Simple Example: Transitivity

"If A ≤ B and B ≤ C, then A ≤ C"

```geolog
theory Preorder {
  X : Sort;
  leq : [x: X, y: X] -> Prop;  // "x ≤ y" relation

  // Transitivity rule
  ax/trans : forall x: X, y: X, z: X.
    [x: x, y: y] leq, [x: y, y: z] leq |- [x: x, y: z] leq;
}
```
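
To make the transitivity rule concrete, here is a minimal Python sketch that lists the facts the rule demands but the relation does not yet contain. It assumes `leq` is stored as a set of pairs; the function name `transitivity_violations` is invented for this illustration.

```python
def transitivity_violations(leq):
    """Return the pairs (x, z) that transitivity demands but leq lacks."""
    missing = set()
    for (x, y1) in leq:
        for (y2, z) in leq:
            # Premises: x <= y and y <= z. Conclusion: x <= z.
            if y1 == y2 and (x, z) not in leq:
                missing.add((x, z))
    return missing

leq = {("a", "b"), ("b", "c")}
print(transitivity_violations(leq))  # {('a', 'c')}
```

Once `("a", "c")` is added to `leq`, the function returns an empty set: the relation satisfies the axiom.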

### What `[x: x, y: y] leq` Means

This is Geolog's way of writing `leq(x, y)` or "x ≤ y":

```
[x: x, y: y] leq
    ↑           ↑
 a record     applied to the "leq" relation
 with fields
 x and y
```

It's like calling a function with named parameters: `leq(x=x, y=y)`.

---

## The Chase Algorithm

The **chase** is what makes Geolog useful. It repeatedly applies your rules to derive new facts until no rule is violated any more (a fixpoint).

### Step-by-Step Example

**Setup:**
```geolog
theory Friends {
  Person : Sort;
  friends : [a: Person, b: Person] -> Prop;
  knows : [a: Person, b: Person] -> Prop;

  // Rule: a person knows the friends of their friends
  ax/fof : forall x, y, z : Person.
    [a: x, b: y] friends, [a: y, b: z] friends |- [a: x, b: z] knows;
}

instance Group : Friends = {
  alice : Person;
  bob : Person;
  charlie : Person;

  [a: alice, b: bob] friends;      // Alice and Bob are friends
  [a: bob, b: charlie] friends;    // Bob and Charlie are friends
}
```

**Running the chase:**

```
Initial state:
  friends = {(alice, bob), (bob, charlie)}
  knows = {}

Chase iteration 1:
  Looking at axiom ax/fof...
  Found match: x=alice, y=bob, z=charlie
    - [a:alice, b:bob] friends? YES
    - [a:bob, b:charlie] friends? YES
    - [a:alice, b:charlie] knows? NO ← violation!
  Fire conclusion: add [a:alice, b:charlie] knows

  knows = {(alice, charlie)}  ← NEW FACT

Chase iteration 2:
  Looking at axiom ax/fof...
  No new violations found.
  DONE (fixpoint reached)

Final state:
  friends = {(alice, bob), (bob, charlie)}
  knows = {(alice, charlie)}
```

### The Key Insight

You don't write a program to derive facts. You write *rules*, and Geolog figures out everything that follows from them.
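
The loop in the trace above can be sketched as a naive fixpoint computation in Python. This is an illustration of the idea only, hard-coded to the `ax/fof` rule; `chase_fof` is an invented name and Geolog's actual engine is not assumed to work this way.

```python
def chase_fof(friends):
    """Apply the friends-of-friends rule until no new fact appears."""
    knows = set()
    iterations = 0
    while True:
        iterations += 1
        # Collect every conclusion whose premises hold but which is missing.
        new_facts = {
            (x, z)
            for (x, y1) in friends
            for (y2, z) in friends
            if y1 == y2 and (x, z) not in knows
        }
        if not new_facts:  # fixpoint: no violations left
            return knows, iterations
        knows |= new_facts

friends = {("alice", "bob"), ("bob", "charlie")}
knows, iterations = chase_fof(friends)
print(knows)       # {('alice', 'charlie')}
print(iterations)  # 2
```

The second iteration finds nothing new, which is exactly the "fixpoint reached" line in the trace.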

---

## Union-Find (Equality Tracking)

When a rule concludes that two things are equal (`x = y`), Geolog needs to merge them. It uses a data structure called **union-find**.

### The Problem

```geolog
// Rule: If two people have the same ID, they're the same person
ax/same_id : forall p, q : Person, i : ID.
  [p: p] id = i, [p: q] id = i |- p = q;
```

If this rule fires with `p = alice` and `q = alicia`, Geolog learns that `alice = alicia`. But now every reference to `alicia` should really mean `alice`.

### How Union-Find Works

```
Before:
  alice → alice (points to itself)
  alicia → alicia (points to itself)

After union(alice, alicia):
  alice → alice (the "representative")
  alicia → alice (now points to alice)

Later, find(alicia) returns alice
```

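**Why not just use a dictionary?**

| Approach | Cost of one merge among N elements |
|----------|---------------------------|
| Dictionary + Sets | O(N) in the worst case: every element of the merged class must be updated |
| Union-Find | Near-constant amortized (with path compression and union by size or rank): find the two representatives, then change one pointer |

When rules derive thousands of equalities, this matters.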
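
For readers who want to see the data structure, here is a minimal union-find sketch in Python with path compression and union by size. It is a textbook version written for this document, not Geolog's own code; on a tie it keeps the first argument as the representative so that the result matches the `alice`/`alicia` picture above.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}  # element -> parent (roots point to themselves)
        self.size = {}    # root -> number of elements in its class

    def add(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x):
        # Walk up to the representative.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: repoint everything on the path to the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        # Union by size: hang the smaller class under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra

uf = UnionFind()
uf.add("alice")
uf.add("alicia")
uf.union("alice", "alicia")
print(uf.find("alicia"))  # alice
```

The merge itself touches a single parent pointer; the cost of finding the representatives is what path compression keeps small.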

---

## Existentials (Creating New Things)

Rules can say "there must exist something":

```geolog
// Every person has a best friend
ax/has_bf : forall p : Person. |- exists f : Person. [p: p, f: f] best_friend;
```

When the chase processes this:

1. **Check:** Does person `p` already have a best friend?
2. **If yes:** Do nothing
3. **If no:** Create a new person and make them `p`'s best friend

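```
Before chase:
  Persons = {alice, bob}
  best_friend = {}

After one chase iteration:
  Persons = {alice, bob, fresh_1, fresh_2}
  best_friend = {(alice, fresh_1), (bob, fresh_2)}
```

The chase created `fresh_1` and `fresh_2` because the axiom demanded witnesses. Note that `fresh_1` and `fresh_2` are persons too, so the axiom now applies to them as well: following the three steps above, this rule on its own keeps demanding new witnesses and never reaches a fixpoint.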
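
A single pass of the check-then-create steps can be sketched in Python as follows. The function `existential_round` and the `fresh_N` naming scheme are assumptions made for this illustration. The sketch deliberately performs one pass only: the fresh persons have no best friend themselves, so a second pass creates further witnesses.

```python
import itertools

def existential_round(persons, best_friend, fresh_ids):
    """One pass of 'every person has a best friend'."""
    persons = set(persons)
    best_friend = set(best_friend)
    # Step 1: who already has a best friend?
    have_one = {p for (p, _f) in best_friend}
    # Steps 2-3: skip those; create a fresh witness for everyone else.
    for p in sorted(persons - have_one):
        witness = f"fresh_{next(fresh_ids)}"
        persons.add(witness)
        best_friend.add((p, witness))
    return persons, best_friend

fresh_ids = itertools.count(1)
persons, best_friend = existential_round({"alice", "bob"}, set(), fresh_ids)
print(sorted(persons))      # ['alice', 'bob', 'fresh_1', 'fresh_2']
print(sorted(best_friend))  # [('alice', 'fresh_1'), ('bob', 'fresh_2')]

# A second pass finds that fresh_1 and fresh_2 still need best friends.
persons2, best_friend2 = existential_round(persons, best_friend, fresh_ids)
print(sorted(persons2 - persons))  # ['fresh_3', 'fresh_4']
```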

---

## Disjunctions (Or)

Rules can have "or" in conclusions:

```geolog
// Every task is either done or pending
ax/status : forall t : Task. |- [t: t] done \/ [t: t] pending;
```

**Current behavior:** Geolog fires *both* branches. So every task becomes both done AND pending.

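**Why?** Proper disjunction handling requires backtracking search, which is complex. Firing both branches always satisfies the disjunction itself, so the axiom is never left violated, but the result is an over-approximation: it may contain facts that the rules do not actually force (no task is required to be both done and pending).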
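
The trade-off can be illustrated with a small Python sketch. Both functions are invented for this document and assume that no task has a status yet. `fire_both_branches` mirrors the behavior described above; `branch_choices` enumerates the alternatives a backtracking search would have to consider, one branch per task.

```python
from itertools import product

def fire_both_branches(tasks):
    """Add every task to both relations (the behavior described above)."""
    done, pending = set(), set()
    for t in tasks:
        done.add(t)
        pending.add(t)
    return done, pending

def branch_choices(tasks):
    """List the one-branch-per-task alternatives a search would explore."""
    return [
        dict(zip(tasks, choice))
        for choice in product(("done", "pending"), repeat=len(tasks))
    ]

tasks = ["t1", "t2", "t3"]
done, pending = fire_both_branches(tasks)
print(done == pending == set(tasks))  # True
print(len(branch_choices(tasks)))     # 8
```

With n tasks there are 2^n ways to pick one branch per task, which is why a full search is considerably more involved than firing both branches once.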

---

## Tensor Algebra (How Chase Is Fast)

Checking "which rules have violations" naively requires checking every possible variable assignment. For 100 people and 3 variables, that's 100³ = 1,000,000 checks.

Geolog uses **sparse tensors** (think: sparse matrices, but multi-dimensional) to do this efficiently:

```
Relations are stored as sparse tensors:
  friends[i][j] = 1 if person i and person j are friends

Finding violations becomes tensor operations:
  violations = friends ⊗ friends ⊗ (1 - knows)
                  ↑          ↑           ↑
              "x,y friends" "y,z friends" "x,z not known"
```

This is much faster than nested loops because:
- Sparse storage skips empty cells
- Tensor operations are highly optimized
- Results are computed in bulk, not one-by-one
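
The benefit of sparsity can be sketched in plain Python. The code below is an index-based join that stands in for the sparse tensor contraction; it is not Geolog's tensor implementation, and `find_violations` is an invented name. The point is that it only visits pairs that are actually stored, instead of every (x, y, z) combination.

```python
from collections import defaultdict

def find_violations(friends, knows):
    """Find (x, z) with friends(x, y), friends(y, z) and no knows(x, z)."""
    # Index the stored pairs by their first component (the sparse "rows").
    by_first = defaultdict(set)
    for a, b in friends:
        by_first[a].add(b)

    violations = set()
    for x, y in friends:
        # Only follow y's stored friends; empty cells are never visited.
        for z in by_first.get(y, ()):
            if (x, z) not in knows:
                violations.add((x, z))
    return violations

friends = {("alice", "bob"), ("bob", "charlie")}
print(find_violations(friends, set()))                   # {('alice', 'charlie')}
print(find_violations(friends, {("alice", "charlie")}))  # set()
```

The work done is proportional to the number of stored `friends` pairs and their matches, not to the cube of the number of people.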

---

## What Geolog Can't Express

Geometric logic intentionally excludes some things:

| Can't Write | Why Not |
|-------------|---------|
| `not R(x)` | No negation — can't say "x is NOT red" |
| `if not A then B` | No negation in premises |
| `x + 1 = y` | No arithmetic — this isn't a calculator |
| `forall x. R(x)` with no conclusion | Axioms must have conclusions |

**This is by design.** These restrictions make the logic "geometric," which has nice theoretical properties (its formulas are preserved by geometric morphisms, its theories can be interpreted in any topos, etc.).

For practical purposes: Geolog is good for "if-then" rules and "there exists" statements, not for arithmetic or negation.