Add a comparison notes between Geolog and Datalog
This commit is contained in:
parent
33e826b6e5
commit
50ae1473e4
@ -1,53 +1,95 @@
|
|||||||
# Geolog And Datalog
|
# Geolog And Datalog
|
||||||
|
|
||||||
*How to map Geolog concepts onto Datalog, and where the analogy breaks down.*
|
*A simple explanation of how Geolog is similar to Datalog, and how it is different.*
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Summary
|
## What This Note Is About
|
||||||
|
|
||||||
Geolog is not just Datalog with different syntax, but a large and useful fragment of Geolog can be read that way.
|
Sometimes Geolog looks a lot like Datalog.
|
||||||
|
|
||||||
The best short summary is:
|
That is not an accident.
|
||||||
|
|
||||||
> **Geolog is like typed Datalog with native functions, existential rule heads, and stronger logical structure.**
|
Both systems let you:
|
||||||
|
|
||||||
The mapping works well for:
|
- describe facts,
|
||||||
|
- describe rules,
|
||||||
|
- and repeatedly apply those rules until no new facts appear.
|
||||||
|
|
||||||
- relations,
|
But Geolog is not just Datalog with different syntax.
|
||||||
- Horn-style axioms,
|
|
||||||
- instance facts,
|
|
||||||
- and fixpoint/chase-style derivation.
|
|
||||||
|
|
||||||
The mapping breaks down when Geolog uses:
|
Geolog also has features that go beyond basic Datalog, especially:
|
||||||
|
|
||||||
- existential conclusions,
|
- built-in types, called **sorts**,
|
||||||
- equality reasoning,
|
- built-in **functions**,
|
||||||
- and native function symbols with total/single-valued semantics.
|
- rules that can say “something exists”,
|
||||||
|
- and stronger support for equality.
|
||||||
|
|
||||||
|
This note explains the overlap in simple terms.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## First: What Is Datalog?
|
||||||
|
|
||||||
|
Datalog is a rule language.
|
||||||
|
|
||||||
|
It is often written with facts like this:
|
||||||
|
|
||||||
|
```prolog
|
||||||
|
edge(a, b).
|
||||||
|
edge(b, c).
|
||||||
|
```
|
||||||
|
|
||||||
|
And rules like this:
|
||||||
|
|
||||||
|
```prolog
|
||||||
|
reachable(X, Y) :- edge(X, Y).
|
||||||
|
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||||
|
```
|
||||||
|
|
||||||
|
You can read those rules as:
|
||||||
|
|
||||||
|
- if `edge(X, Y)` is true, then `reachable(X, Y)` is true,
|
||||||
|
- if `reachable(X, Y)` and `reachable(Y, Z)` are true, then `reachable(X, Z)` is true.
|
||||||
|
|
||||||
|
Then the system keeps applying rules until it cannot derive anything new.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Very Short Summary
|
||||||
|
|
||||||
|
The best simple summary is:
|
||||||
|
|
||||||
|
> Geolog contains a part that behaves a lot like typed Datalog, but full Geolog can do more.
|
||||||
|
|
||||||
|
Another good summary is:
|
||||||
|
|
||||||
|
> If you ignore some of the more advanced features, you can often read Geolog as “Datalog with types and functions”.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Side-By-Side Mapping
|
## Side-By-Side Mapping
|
||||||
|
|
||||||
| Concept | Geolog | Datalog |
|
| Idea | In Geolog | In Datalog |
|
||||||
|---------|--------|---------|
|
|------|-----------|------------|
|
||||||
| Domain/type | `V : Sort;` | usually implicit, or unary predicate like `V(x)` |
|
| Type of thing | `V : Sort;` | often left implicit, or written as a one-argument fact like `V(x)` |
|
||||||
| Relation | `edge : [from: V, to: V] -> Prop;` | `edge(X, Y)` |
|
| Relation | `edge : [from: V, to: V] -> Prop;` | `edge(X, Y)` |
|
||||||
| Function | `src : E -> V;` | usually encoded as relation `src(E, V)` |
|
| Function | `src : E -> V;` | usually encoded as a relation like `src(E, V)` |
|
||||||
| Fact | `[from: a, to: b] edge;` | `edge(a, b).` |
|
| Fact | `[from: a, to: b] edge;` | `edge(a, b).` |
|
||||||
| Function fact | `e1 src = a;` | `src(e1, a).` |
|
| Function value | `e1 src = a;` | `src(e1, a).` |
|
||||||
| Axiom/rule | `A, B |- C` | `C :- A, B.` |
|
| Rule | `A, B |- C` | `C :- A, B.` |
|
||||||
| Derived predicate | relation + axioms | IDB predicate + rules |
|
| Starting data | an instance | a set of starting facts |
|
||||||
| Instance | concrete elements/facts | EDB facts |
|
| Repeated rule application | chase | repeated rule evaluation until no new facts appear |
|
||||||
| Chase/fixpoint | repeated rule firing to saturation | bottom-up least fixpoint |
|
|
||||||
| Existential conclusion | `|- exists y : B. ...` | not plain Datalog |
|
|
||||||
| Equality reasoning | native/logical | usually extension or encoding |
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 1. Sorts And Types
|
## 1. Sorts In Geolog Are Like Types
|
||||||
|
|
||||||
Geolog has first-class sorts:
|
Geolog has **sorts**.
|
||||||
|
|
||||||
|
A sort is just a kind of thing.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
theory Graph {
|
theory Graph {
|
||||||
@ -55,7 +97,11 @@ theory Graph {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Plain Datalog often treats types as implicit, or simulates them with unary predicates:
|
This says that `V` is a sort. You can think of it as the type “vertex”.
|
||||||
|
|
||||||
|
Basic Datalog usually does not have this built in.
|
||||||
|
|
||||||
|
If someone wants types in Datalog, they often simulate them with ordinary facts such as:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
V(a).
|
V(a).
|
||||||
@ -63,13 +109,16 @@ V(b).
|
|||||||
V(c).
|
V(c).
|
||||||
```
|
```
|
||||||
|
|
||||||
So the closest Datalog mental model is:
|
So the rough translation is:
|
||||||
|
|
||||||
- **Geolog sort** ≈ **type discipline or unary predicate**.
|
- **Geolog sort** = **type**,
|
||||||
|
- and in Datalog that type is often represented using an ordinary relation.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 2. Relations Map Directly
|
## 2. Relations Match Very Well
|
||||||
|
|
||||||
|
This part is simple.
|
||||||
|
|
||||||
Geolog relation declaration:
|
Geolog relation declaration:
|
||||||
|
|
||||||
@ -77,25 +126,30 @@ Geolog relation declaration:
|
|||||||
edge : [from: V, to: V] -> Prop;
|
edge : [from: V, to: V] -> Prop;
|
||||||
```
|
```
|
||||||
|
|
||||||
Fact in a Geolog instance:
|
Geolog fact:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
[from: a, to: b] edge;
|
[from: a, to: b] edge;
|
||||||
```
|
```
|
||||||
|
|
||||||
Datalog equivalent:
|
Datalog version:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
edge(a, b).
|
edge(a, b).
|
||||||
```
|
```
|
||||||
|
|
||||||
This is the cleanest and most direct part of the mapping.
|
So:
|
||||||
|
|
||||||
|
- a Geolog relation is very much like a Datalog predicate,
|
||||||
|
- and a Geolog relation fact is very much like a Datalog fact.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 3. Functions Become Functional Relations
|
## 3. Functions Need More Care
|
||||||
|
|
||||||
Geolog has native function symbols:
|
Geolog has real functions.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
E : Sort;
|
E : Sort;
|
||||||
@ -107,47 +161,46 @@ a : V;
|
|||||||
e1 src = a;
|
e1 src = a;
|
||||||
```
|
```
|
||||||
|
|
||||||
In Datalog, the closest encoding is relational:
|
This means:
|
||||||
|
|
||||||
|
- `src` takes an edge and gives back one vertex,
|
||||||
|
- `e1` has source `a`.
|
||||||
|
|
||||||
|
In Datalog, people usually write the same information like this:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
src(e1, a).
|
src(e1, a).
|
||||||
```
|
```
|
||||||
|
|
||||||
But there is an important semantic gap:
|
But there is a difference.
|
||||||
|
|
||||||
- in **Geolog**, `src` is a genuine function,
|
In Geolog, `src` is a true function, so each edge should have one source.
|
||||||
- in **Datalog**, `src(e1, a)` is just a relation unless extra constraints are added.
|
|
||||||
|
|
||||||
So the right translation rule is:
|
In plain Datalog, `src(e1, a)` is just a normal fact in a normal relation. Datalog does not automatically know that `src` is supposed to behave like a function.
|
||||||
|
|
||||||
- **Geolog function** ≈ **relation that is intended to be total and single-valued**.
|
So the simple rule is:
|
||||||
|
|
||||||
To fully mimic a function, you also need functional constraints such as:
|
- **Geolog function** becomes **a relation in Datalog**,
|
||||||
|
- but some meaning is lost unless you also add extra rules or constraints.
|
||||||
```text
|
|
||||||
if src(e, v1) and src(e, v2), then v1 = v2
|
|
||||||
```
|
|
||||||
|
|
||||||
Plain Datalog does not natively enforce that on its own.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 4. Axioms Map To Rules
|
## 4. Axioms In Geolog Look Like Rules In Datalog
|
||||||
|
|
||||||
Geolog Horn-style axiom:
|
Geolog rule:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
ax/base : forall x,y : V.
|
ax/base : forall x,y : V.
|
||||||
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
||||||
```
|
```
|
||||||
|
|
||||||
Datalog:
|
Datalog rule:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
reachable(X, Y) :- edge(X, Y).
|
reachable(X, Y) :- edge(X, Y).
|
||||||
```
|
```
|
||||||
|
|
||||||
Another Geolog axiom:
|
Geolog rule:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
ax/trans : forall x,y,z : V.
|
ax/trans : forall x,y,z : V.
|
||||||
@ -155,19 +208,21 @@ ax/trans : forall x,y,z : V.
|
|||||||
|- [from: x, to: z] reachable;
|
|- [from: x, to: z] reachable;
|
||||||
```
|
```
|
||||||
|
|
||||||
Datalog:
|
Datalog rule:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||||
```
|
```
|
||||||
|
|
||||||
This is the main sense in which Geolog resembles Datalog.
|
So in the simplest cases:
|
||||||
|
|
||||||
|
- **Geolog axiom** behaves like a **Datalog rule**.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 5. Full Example
|
## 5. A Full Example
|
||||||
|
|
||||||
Geolog:
|
Here is a Geolog theory and instance:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
theory Graph {
|
theory Graph {
|
||||||
@ -190,7 +245,7 @@ instance G : Graph = {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Datalog:
|
Here is the matching Datalog program:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
edge(a, b).
|
edge(a, b).
|
||||||
@ -200,7 +255,7 @@ reachable(X, Y) :- edge(X, Y).
|
|||||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||||
```
|
```
|
||||||
|
|
||||||
After saturation/fixpoint, both derive:
|
After rule application, both systems derive:
|
||||||
|
|
||||||
```prolog
|
```prolog
|
||||||
reachable(a, b).
|
reachable(a, b).
|
||||||
@ -208,103 +263,109 @@ reachable(b, c).
|
|||||||
reachable(a, c).
|
reachable(a, c).
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
This is the part of Geolog that feels the most like Datalog.
|
||||||
|
|
||||||
## 6. Instances Are Like EDB Facts
|
|
||||||
|
|
||||||
Geolog theory + instance splits naturally into:
|
|
||||||
|
|
||||||
- **theory** = vocabulary + laws
|
|
||||||
- **instance** = starting facts
|
|
||||||
|
|
||||||
That matches the Datalog split between:
|
|
||||||
|
|
||||||
- **rules/schema**, and
|
|
||||||
- **EDB facts**.
|
|
||||||
|
|
||||||
So:
|
|
||||||
|
|
||||||
- **Geolog instance** ≈ **extensional database**.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 7. Chase Is Like Bottom-Up Evaluation
|
## 6. An Instance Is Just The Starting Data
|
||||||
|
|
||||||
Geolog chase:
|
In Geolog, a theory gives:
|
||||||
|
|
||||||
- repeatedly apply axioms,
|
- the names of sorts,
|
||||||
- add newly implied facts,
|
- the names of functions,
|
||||||
- stop when nothing new appears.
|
- the names of relations,
|
||||||
|
- and the axioms.
|
||||||
|
|
||||||
Datalog bottom-up evaluation:
|
Then an instance gives:
|
||||||
|
|
||||||
- repeatedly apply rules,
|
- the actual elements,
|
||||||
- add newly derived facts,
|
- function values,
|
||||||
- stop at the least fixpoint.
|
- and relation facts.
|
||||||
|
|
||||||
So in the Horn fragment:
|
So a Geolog instance is just the starting data for rule application.
|
||||||
|
|
||||||
- **Geolog chase** ≈ **Datalog least-fixpoint evaluation**.
|
That is exactly the role that a set of starting facts plays in Datalog.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 8. Where The Mapping Breaks: Existentials
|
## 7. Chase In Geolog Is Like Repeated Rule Evaluation In Datalog
|
||||||
|
|
||||||
Geolog can say things like:
|
In Geolog, the **chase** means:
|
||||||
|
|
||||||
|
1. look at the current facts,
|
||||||
|
2. find an axiom that applies,
|
||||||
|
3. add the fact that the axiom says must follow,
|
||||||
|
4. repeat until nothing new appears.
|
||||||
|
|
||||||
|
In Datalog, the standard evaluation idea is very similar:
|
||||||
|
|
||||||
|
1. look at the current facts,
|
||||||
|
2. find a rule whose body is true,
|
||||||
|
3. add the fact in the head of the rule,
|
||||||
|
4. repeat until nothing new appears.
|
||||||
|
|
||||||
|
So for simple rule systems:
|
||||||
|
|
||||||
|
- **Geolog chase** and **Datalog rule evaluation** are very close.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8. Where The Analogy Stops Working: “There Exists ...”
|
||||||
|
|
||||||
|
Geolog can write rules like this:
|
||||||
|
|
||||||
```geolog
|
```geolog
|
||||||
forall p : Person.
|
forall p : Person.
|
||||||
|- exists c : Person. [parent: p, kid: c] child;
|
|- exists c : Person. [parent: p, kid: c] child;
|
||||||
```
|
```
|
||||||
|
|
||||||
Meaning:
|
In plain language, this says:
|
||||||
|
|
||||||
- for every person `p`, some child must exist,
|
- every person has some child.
|
||||||
- and chase may create a fresh witness if needed.
|
|
||||||
|
|
||||||
Plain Datalog cannot do this, because plain Datalog does not invent fresh elements.
|
If the system does not already know such a child, Geolog may create a fresh witness during chase.
|
||||||
|
|
||||||
This is not standard Datalog anymore. It is closer to:
|
Basic Datalog does not do that.
|
||||||
|
|
||||||
- existential rules,
|
Basic Datalog works with facts and constants that already exist. It does not normally invent brand new objects just because a rule says something must exist.
|
||||||
- tuple-generating dependencies,
|
|
||||||
- or Datalog±.
|
|
||||||
|
|
||||||
So:
|
So this is one major way in which Geolog is more expressive than ordinary Datalog.
|
||||||
|
|
||||||
- **Geolog with existential conclusions** is strictly more expressive than plain Datalog.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 9. Where The Mapping Breaks: Equality
|
## 9. Where The Analogy Stops Working: Equality
|
||||||
|
|
||||||
Geolog also supports equality-oriented reasoning in ways that go beyond plain Datalog.
|
Geolog can also reason about equality in richer ways than plain Datalog normally does.
|
||||||
|
|
||||||
If equalities are inferred or propagated as part of semantic saturation, then the system is doing more than ordinary Datalog rule evaluation.
|
That means Geolog can sometimes merge or identify things based on logical consequences, while ordinary Datalog is usually just adding new facts.
|
||||||
|
|
||||||
So:
|
So equality is another place where Geolog is doing more than basic Datalog.
|
||||||
|
|
||||||
- **Geolog equality reasoning** is not captured by plain Datalog without extensions or encodings.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 10. Best Practical Translation Rules
|
## 10. A Safe Mental Model
|
||||||
|
|
||||||
When reading Geolog in Datalog terms, use this mental model:
|
If you want a simple way to think about the connection, use this:
|
||||||
|
|
||||||
- `Sort` → type or unary predicate
|
- a **sort** in Geolog is like a type,
|
||||||
- function `f : A -> B` → relation `f(a, b)` plus intended functionality
|
- a **relation** in Geolog is like a predicate in Datalog,
|
||||||
- relation `R : [...] -> Prop` → predicate `R(...)`
|
- a **fact** in a Geolog instance is like a Datalog fact,
|
||||||
- axiom `premises |- conclusion` → rule `conclusion :- premises`
|
- an **axiom** in Geolog is often like a Datalog rule,
|
||||||
- instance facts → EDB facts
|
- and **chase** is like repeatedly applying rules until no new facts appear.
|
||||||
- chase → bottom-up fixpoint
|
|
||||||
- existential conclusion → no longer plain Datalog
|
That mental model is very useful.
|
||||||
|
|
||||||
|
Just remember that full Geolog is richer than plain Datalog.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 11. Bottom Line
|
## Final Summary
|
||||||
|
|
||||||
The safest final statement is:
|
Here is the simplest correct conclusion:
|
||||||
|
|
||||||
> **A Datalog-like fragment of Geolog maps very naturally to typed Datalog, but full Geolog is richer because it includes native functions, existential rule heads, and stronger logical machinery.**
|
> A large part of Geolog can be understood as Datalog-like rule evaluation over typed data.
|
||||||
|
|
||||||
|
But also:
|
||||||
|
|
||||||
|
> Full Geolog goes beyond basic Datalog because it has real functions, richer equality reasoning, and rules that can say that new things must exist.
|
||||||
|
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user