geolog-zeta-fork/notes/007-geolog-and-datalog.md

311 lines
6.2 KiB
Markdown
Raw Normal View History

# Geolog And Datalog
*How to map Geolog concepts onto Datalog, and where the analogy breaks down.*
---
## Summary
Geolog is not just Datalog with different syntax, but a large and useful fragment of Geolog can be read that way.
The best short summary is:
> **Geolog is like typed Datalog with native functions, existential rule heads, and stronger logical structure.**
The mapping works well for:
- relations,
- Horn-style axioms,
- instance facts,
- and fixpoint/chase-style derivation.
The mapping breaks down when Geolog uses:
- existential conclusions,
- equality reasoning,
- and native function symbols with total/single-valued semantics.
---
## Side-By-Side Mapping
| Concept | Geolog | Datalog |
|---------|--------|---------|
| Domain/type | `V : Sort;` | usually implicit, or unary predicate like `V(x)` |
| Relation | `edge : [from: V, to: V] -> Prop;` | `edge(X, Y)` |
| Function | `src : E -> V;` | usually encoded as relation `src(E, V)` |
| Fact | `[from: a, to: b] edge;` | `edge(a, b).` |
| Function fact | `e1 src = a;` | `src(e1, a).` |
| Axiom/rule | `A, B |- C` | `C :- A, B.` |
| Derived predicate | relation + axioms | IDB predicate + rules |
| Instance | concrete elements/facts | EDB facts |
| Chase/fixpoint | repeated rule firing to saturation | bottom-up least fixpoint |
| Existential conclusion | `|- exists y : B. ...` | not plain Datalog |
| Equality reasoning | native/logical | usually extension or encoding |
---
## 1. Sorts And Types
Geolog has first-class sorts:
```geolog
theory Graph {
V : Sort;
}
```
Plain Datalog often treats types as implicit, or simulates them with unary predicates:
```prolog
V(a).
V(b).
V(c).
```
So the closest Datalog mental model is:
- **Geolog sort** ≈ **type discipline or unary predicate**.
---
## 2. Relations Map Directly
Geolog relation declaration:
```geolog
edge : [from: V, to: V] -> Prop;
```
Fact in a Geolog instance:
```geolog
[from: a, to: b] edge;
```
Datalog equivalent:
```prolog
edge(a, b).
```
This is the cleanest and most direct part of the mapping.
---
## 3. Functions Become Functional Relations
Geolog has native function symbols:
```geolog
E : Sort;
V : Sort;
src : E -> V;
e1 : E;
a : V;
e1 src = a;
```
In Datalog, the closest encoding is relational:
```prolog
src(e1, a).
```
But there is an important semantic gap:
- in **Geolog**, `src` is a genuine function,
- in **Datalog**, `src(e1, a)` is just a relation unless extra constraints are added.
So the right translation rule is:
- **Geolog function** ≈ **relation that is intended to be total and single-valued**.
To fully mimic a function, you also need functional constraints such as:
```text
if src(e, v1) and src(e, v2), then v1 = v2
```
Plain Datalog does not natively enforce that on its own.
---
## 4. Axioms Map To Rules
Geolog Horn-style axiom:
```geolog
ax/base : forall x,y : V.
[from: x, to: y] edge |- [from: x, to: y] reachable;
```
Datalog:
```prolog
reachable(X, Y) :- edge(X, Y).
```
Another Geolog axiom:
```geolog
ax/trans : forall x,y,z : V.
[from: x, to: y] reachable, [from: y, to: z] reachable
|- [from: x, to: z] reachable;
```
Datalog:
```prolog
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```
This is the main sense in which Geolog resembles Datalog.
---
## 5. Full Example
Geolog:
```geolog
theory Graph {
V : Sort;
edge : [from: V, to: V] -> Prop;
reachable : [from: V, to: V] -> Prop;
ax/base : forall x,y : V.
[from: x, to: y] edge |- [from: x, to: y] reachable;
ax/trans : forall x,y,z : V.
[from: x, to: y] reachable, [from: y, to: z] reachable
|- [from: x, to: z] reachable;
}
instance G : Graph = {
a, b, c : V;
[from: a, to: b] edge;
[from: b, to: c] edge;
}
```
Datalog:
```prolog
edge(a, b).
edge(b, c).
reachable(X, Y) :- edge(X, Y).
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```
After saturation/fixpoint, both derive:
```prolog
reachable(a, b).
reachable(b, c).
reachable(a, c).
```
---
## 6. Instances Are Like EDB Facts
Geolog theory + instance splits naturally into:
- **theory** = vocabulary + laws
- **instance** = starting facts
That matches the Datalog split between:
- **rules/schema**, and
- **EDB facts**.
So:
- **Geolog instance** ≈ **extensional database**.
---
## 7. Chase Is Like Bottom-Up Evaluation
Geolog chase:
- repeatedly apply axioms,
- add newly implied facts,
- stop when nothing new appears.
Datalog bottom-up evaluation:
- repeatedly apply rules,
- add newly derived facts,
- stop at the least fixpoint.
So in the Horn fragment:
- **Geolog chase** ≈ **Datalog least-fixpoint evaluation**.
---
## 8. Where The Mapping Breaks: Existentials
Geolog can say things like:
```geolog
forall p : Person.
|- exists c : Person. [parent: p, kid: c] child;
```
Meaning:
- for every person `p`, some child must exist,
- and chase may create a fresh witness if needed.
Plain Datalog cannot do this, because plain Datalog does not invent fresh elements.
This is not standard Datalog anymore. It is closer to:
- existential rules,
- tuple-generating dependencies,
- or Datalog±.
So:
- **Geolog with existential conclusions** is strictly more expressive than plain Datalog.
---
## 9. Where The Mapping Breaks: Equality
Geolog also supports equality-oriented reasoning in ways that go beyond plain Datalog.
If equalities are inferred or propagated as part of semantic saturation, then the system is doing more than ordinary Datalog rule evaluation.
So:
- **Geolog equality reasoning** is not captured by plain Datalog without extensions or encodings.
---
## 10. Best Practical Translation Rules
When reading Geolog in Datalog terms, use this mental model:
- `Sort` → type or unary predicate
- function `f : A -> B` → relation `f(a, b)` plus intended functionality
- relation `R : [...] -> Prop` → predicate `R(...)`
- axiom `premises |- conclusion` → rule `conclusion :- premises`
- instance facts → EDB facts
- chase → bottom-up fixpoint
- existential conclusion → no longer plain Datalog
---
## 11. Bottom Line
The safest final statement is:
> **A Datalog-like fragment of Geolog maps very naturally to typed Datalog, but full Geolog is richer because it includes native functions, existential rule heads, and stronger logical machinery.**