281 lines
7.2 KiB
Markdown
281 lines
7.2 KiB
Markdown
# Key Concepts
|
|
|
|
This document explains the main ideas in Geolog without assuming math background.
|
|
|
|
## Related Notes
|
|
|
|
If you want to compare Geolog with Datalog, or think about automatic translation, see:
|
|
|
|
- `notes/007-geolog-and-datalog.md` — a simple explanation of what maps well and what does not.
|
|
- `notes/008-geolog-to-datalog-subset.md` — a practical subset of Geolog that could be translated automatically into plain Datalog.
|
|
|
|
---
|
|
|
|
## Theories and Instances
|
|
|
|
A **theory** is a template — it defines what *kinds* of things exist and what rules they follow.
|
|
|
|
An **instance** is a concrete example — actual things that follow the theory's rules.
|
|
|
|
### Example: Graphs
|
|
|
|
```geolog
|
|
// The theory: "A graph has vertices and edges"
|
|
theory Graph {
|
|
V : Sort; // There's a kind of thing called "vertex"
|
|
E : Sort; // There's a kind of thing called "edge"
|
|
src : E -> V; // Every edge has a source vertex
|
|
tgt : E -> V; // Every edge has a target vertex
|
|
}
|
|
|
|
// An instance: A specific triangle graph
|
|
instance Triangle : Graph = {
|
|
// Three vertices
|
|
A : V;
|
|
B : V;
|
|
C : V;
|
|
|
|
// Three edges forming a triangle
|
|
ab : E;
|
|
ab src = A; // Edge ab goes from A...
|
|
ab tgt = B; // ...to B
|
|
|
|
bc : E;
|
|
bc src = B;
|
|
bc tgt = C;
|
|
|
|
ca : E;
|
|
ca src = C;
|
|
ca tgt = A;
|
|
}
|
|
```
|
|
|
|
Think of it like:
|
|
- **Theory** = "A spreadsheet template with certain columns"
|
|
- **Instance** = "A filled-in spreadsheet"
|
|
|
|
---
|
|
|
|
## Axioms (Rules)
|
|
|
|
An **axiom** tells Geolog: "Whenever the left side is true, make the right side true too."
|
|
|
|
```
|
|
forall x, y, z. friends(x,y), friends(y,z) |- knows(x,z)
|
|
↑ ↑ ↑
|
|
"for any x,y,z" "if these are true" "then this must be true"
|
|
```
|
|
|
|
### Simple Example: Transitivity
|
|
|
|
"If A ≤ B and B ≤ C, then A ≤ C"
|
|
|
|
```geolog
|
|
theory Preorder {
|
|
X : Sort;
|
|
leq : [x: X, y: X] -> Prop; // "x ≤ y" relation
|
|
|
|
// Transitivity rule
|
|
ax/trans : forall x: X, y: X, z: X.
|
|
[x: x, y: y] leq, [x: y, y: z] leq |- [x: x, y: z] leq;
|
|
}
|
|
```
|
|
|
|
### What `[x: x, y: y] leq` Means
|
|
|
|
This is Geolog's way of writing `leq(x, y)` or "x ≤ y":
|
|
|
|
```
|
|
[x: x, y: y] leq
|
|
↑ ↑
|
|
a record applied to the "leq" relation
|
|
with fields
|
|
x and y
|
|
```
|
|
|
|
It's like calling a function with named parameters: `leq(x=x, y=y)`.
|
|
|
|
---
|
|
|
|
## The Chase Algorithm
|
|
|
|
The **chase** is what makes Geolog useful. It automatically applies your rules to derive new facts.
|
|
|
|
### Step-by-Step Example
|
|
|
|
**Setup:**
|
|
```geolog
|
|
theory Friends {
|
|
Person : Sort;
|
|
friends : [a: Person, b: Person] -> Prop;
|
|
knows : [a: Person, b: Person] -> Prop;
|
|
|
|
// Rule: friends-of-friends know each other
|
|
ax/fof : forall x, y, z : Person.
|
|
[a: x, b: y] friends, [a: y, b: z] friends |- [a: x, b: z] knows;
|
|
}
|
|
|
|
instance Group : Friends = {
|
|
alice : Person;
|
|
bob : Person;
|
|
charlie : Person;
|
|
|
|
[a: alice, b: bob] friends; // Alice and Bob are friends
|
|
[a: bob, b: charlie] friends; // Bob and Charlie are friends
|
|
}
|
|
```
|
|
|
|
**Running the chase:**
|
|
|
|
```
|
|
Initial state:
|
|
friends = {(alice, bob), (bob, charlie)}
|
|
knows = {}
|
|
|
|
Chase iteration 1:
|
|
Looking at axiom ax/fof...
|
|
Found match: x=alice, y=bob, z=charlie
|
|
- [a:alice, b:bob] friends? YES
|
|
- [a:bob, b:charlie] friends? YES
|
|
- [a:alice, b:charlie] knows? NO ← violation!
|
|
Fire conclusion: add [a:alice, b:charlie] knows
|
|
|
|
knows = {(alice, charlie)} ← NEW FACT
|
|
|
|
Chase iteration 2:
|
|
Looking at axiom ax/fof...
|
|
No new violations found.
|
|
DONE (fixpoint reached)
|
|
|
|
Final state:
|
|
friends = {(alice, bob), (bob, charlie)}
|
|
knows = {(alice, charlie)}
|
|
```
|
|
|
|
### The Key Insight
|
|
|
|
You don't write a program to derive facts. You write *rules*, and Geolog figures out everything that follows from them.
|
|
|
|
---
|
|
|
|
## Union-Find (Equality Tracking)
|
|
|
|
When a rule concludes that two things are equal (`x = y`), Geolog needs to merge them. It uses a data structure called **union-find**.
|
|
|
|
### The Problem
|
|
|
|
```geolog
|
|
// Rule: If two people have the same ID, they're the same person
|
|
ax/same_id : forall p, q : Person, i : ID.
|
|
[p: p] id = i, [p: q] id = i |- p = q;
|
|
```
|
|
|
|
If this rule fires with `p = alice` and `q = alicia`, Geolog learns that `alice = alicia`. But now every reference to `alicia` should really mean `alice`.
|
|
|
|
### How Union-Find Works
|
|
|
|
```
|
|
Before:
|
|
alice → alice (points to itself)
|
|
alicia → alicia (points to itself)
|
|
|
|
After union(alice, alicia):
|
|
alice → alice (the "representative")
|
|
alicia → alice (now points to alice)
|
|
|
|
Later, find(alicia) returns alice
|
|
```
|
|
|
|
**Why not just use a dictionary?**
|
|
|
|
| Approach | Cost of merging N elements |
|
|
|----------|---------------------------|
|
|
| Dictionary + Sets | O(N) — must update every element |
|
|
| Union-Find | O(1) — just change one pointer |
|
|
|
|
When rules derive thousands of equalities, this matters.
|
|
|
|
---
|
|
|
|
## Existentials (Creating New Things)
|
|
|
|
Rules can say "there must exist something":
|
|
|
|
```geolog
|
|
// Every person has a best friend
|
|
ax/has_bf : forall p : Person. |- exists f : Person. [p: p, f: f] best_friend;
|
|
```
|
|
|
|
When the chase processes this:
|
|
|
|
1. **Check:** Does person `p` already have a best friend?
|
|
2. **If yes:** Do nothing
|
|
3. **If no:** Create a new person and make them `p`'s best friend
|
|
|
|
```
|
|
Before chase:
|
|
Persons = {alice, bob}
|
|
best_friend = {}
|
|
|
|
After chase:
|
|
Persons = {alice, bob, fresh_1, fresh_2}
|
|
best_friend = {(alice, fresh_1), (bob, fresh_2)}
|
|
```
|
|
|
|
The chase created `fresh_1` and `fresh_2` because the axiom demanded witnesses.
|
|
|
|
---
|
|
|
|
## Disjunctions (Or)
|
|
|
|
Rules can have "or" in conclusions:
|
|
|
|
```geolog
|
|
// Every task is either done or pending
|
|
ax/status : forall t : Task. |- [t: t] done \/ [t: t] pending;
|
|
```
|
|
|
|
**Current behavior:** Geolog fires *both* branches. So every task becomes both done AND pending.
|
|
|
|
**Why?** Proper disjunction handling requires backtracking search, which is complex. The current approach is "sound" (never wrong) but "incomplete" (may add more facts than necessary).
|
|
|
|
---
|
|
|
|
## Tensor Algebra (How Chase Is Fast)
|
|
|
|
Checking "which rules have violations" naively requires checking every possible variable assignment. For 100 people and 3 variables, that's 100³ = 1,000,000 checks.
|
|
|
|
Geolog uses **sparse tensors** (think: sparse matrices, but multi-dimensional) to do this efficiently:
|
|
|
|
```
|
|
Relations are stored as sparse tensors:
|
|
friends[i][j] = 1 if person i and person j are friends
|
|
|
|
Finding violations becomes tensor operations:
|
|
violations = friends ⊗ friends ⊗ (1 - knows)
|
|
↑ ↑ ↑
|
|
"x,y friends" "y,z friends" "x,z not known"
|
|
```
|
|
|
|
This is much faster than nested loops because:
|
|
- Sparse storage skips empty cells
|
|
- Tensor operations are highly optimized
|
|
- Results are computed in bulk, not one-by-one
|
|
|
|
---
|
|
|
|
## What Geolog Can't Express
|
|
|
|
Geometric logic intentionally excludes some things:
|
|
|
|
| Can't Write | Why Not |
|
|
|-------------|---------|
|
|
| `not R(x)` | No negation — can't say "x is NOT red" |
|
|
| `if not A then B` | No negation in premises |
|
|
| `x + 1 = y` | No arithmetic — this isn't a calculator |
|
|
| `forall x. R(x)` with no conclusion | Axioms must have conclusions |
|
|
|
|
**This is by design.** These restrictions make the logic "geometric," which has nice theoretical properties (preserved by geometric morphisms, has models in any topos, etc.).
|
|
|
|
For practical purposes: Geolog is good for "if-then" rules and "there exists" statements, not for arithmetic or negation.
|