372 lines
7.7 KiB
Markdown
372 lines
7.7 KiB
Markdown
# Geolog And Datalog
|
|
|
|
*A simple explanation of how Geolog is similar to Datalog, and how it is different.*
|
|
|
|
---
|
|
|
|
## What This Note Is About
|
|
|
|
Sometimes Geolog looks a lot like Datalog.
|
|
|
|
That is not an accident.
|
|
|
|
Both systems let you:
|
|
|
|
- describe facts,
|
|
- describe rules,
|
|
- and repeatedly apply those rules until no new facts appear.
|
|
|
|
But Geolog is not just Datalog with different syntax.
|
|
|
|
Geolog also has features that go beyond basic Datalog, especially:
|
|
|
|
- built-in types, called **sorts**,
|
|
- built-in **functions**,
|
|
- rules that can say “something exists”,
|
|
- and stronger support for equality.
|
|
|
|
This note explains the overlap in simple terms.
|
|
|
|
---
|
|
|
|
## First: What Is Datalog?
|
|
|
|
Datalog is a rule language.
|
|
|
|
It is often written with facts like this:
|
|
|
|
```prolog
|
|
edge(a, b).
|
|
edge(b, c).
|
|
```
|
|
|
|
And rules like this:
|
|
|
|
```prolog
|
|
reachable(X, Y) :- edge(X, Y).
|
|
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
|
```
|
|
|
|
You can read those rules as:
|
|
|
|
- if `edge(X, Y)` is true, then `reachable(X, Y)` is true,
|
|
- if `reachable(X, Y)` and `reachable(Y, Z)` are true, then `reachable(X, Z)` is true.
|
|
|
|
Then the system keeps applying rules until it cannot derive anything new.
|
|
|
|
---
|
|
|
|
## Very Short Summary
|
|
|
|
The best simple summary is:
|
|
|
|
> Geolog contains a part that behaves a lot like typed Datalog, but full Geolog can do more.
|
|
|
|
Another good summary is:
|
|
|
|
> If you ignore some of the more advanced features, you can often read Geolog as “Datalog with types and functions”.
|
|
|
|
---
|
|
|
|
## Side-By-Side Mapping
|
|
|
|
| Idea | In Geolog | In Datalog |
|
|
|------|-----------|------------|
|
|
| Type of thing | `V : Sort;` | often left implicit, or written as a one-argument fact like `V(x)` |
|
|
| Relation | `edge : [from: V, to: V] -> Prop;` | `edge(X, Y)` |
|
|
| Function | `src : E -> V;` | usually encoded as a relation like `src(E, V)` |
|
|
| Fact | `[from: a, to: b] edge;` | `edge(a, b).` |
|
|
| Function value | `e1 src = a;` | `src(e1, a).` |
|
|
| Rule | `A, B |- C` | `C :- A, B.` |
|
|
| Starting data | an instance | a set of starting facts |
|
|
| Repeated rule application | chase | repeated rule evaluation until no new facts appear |
|
|
|
|
---
|
|
|
|
## 1. Sorts In Geolog Are Like Types
|
|
|
|
Geolog has **sorts**.
|
|
|
|
A sort is just a kind of thing.
|
|
|
|
Example:
|
|
|
|
```geolog
|
|
theory Graph {
|
|
V : Sort;
|
|
}
|
|
```
|
|
|
|
This says that `V` is a sort. You can think of it as the type “vertex”.
|
|
|
|
Basic Datalog usually does not have this built in.
|
|
|
|
If someone wants types in Datalog, they often simulate them with ordinary facts such as:
|
|
|
|
```prolog
|
|
V(a).
|
|
V(b).
|
|
V(c).
|
|
```
|
|
|
|
So the rough translation is:
|
|
|
|
- **Geolog sort** = **type**,
|
|
- and in Datalog that type is often represented using an ordinary relation.
|
|
|
|
---
|
|
|
|
## 2. Relations Match Very Well
|
|
|
|
This part is simple.
|
|
|
|
Geolog relation declaration:
|
|
|
|
```geolog
|
|
edge : [from: V, to: V] -> Prop;
|
|
```
|
|
|
|
Geolog fact:
|
|
|
|
```geolog
|
|
[from: a, to: b] edge;
|
|
```
|
|
|
|
Datalog version:
|
|
|
|
```prolog
|
|
edge(a, b).
|
|
```
|
|
|
|
So:
|
|
|
|
- a Geolog relation is very much like a Datalog predicate,
|
|
- and a Geolog relation fact is very much like a Datalog fact.
|
|
|
|
---
|
|
|
|
## 3. Functions Need More Care
|
|
|
|
Geolog has real functions.
|
|
|
|
Example:
|
|
|
|
```geolog
|
|
E : Sort;
|
|
V : Sort;
|
|
src : E -> V;
|
|
|
|
e1 : E;
|
|
a : V;
|
|
e1 src = a;
|
|
```
|
|
|
|
This means:
|
|
|
|
- `src` takes an edge and gives back one vertex,
|
|
- `e1` has source `a`.
|
|
|
|
In Datalog, people usually write the same information like this:
|
|
|
|
```prolog
|
|
src(e1, a).
|
|
```
|
|
|
|
But there is a difference.
|
|
|
|
In Geolog, `src` is a true function, so each edge should have one source.
|
|
|
|
In plain Datalog, `src(e1, a)` is just a normal fact in a normal relation. Datalog does not automatically know that `src` is supposed to behave like a function.
|
|
|
|
So the simple rule is:
|
|
|
|
- **Geolog function** becomes **a relation in Datalog**,
|
|
- but some meaning is lost unless you also add extra rules or constraints.
|
|
|
|
---
|
|
|
|
## 4. Axioms In Geolog Look Like Rules In Datalog
|
|
|
|
Geolog rule:
|
|
|
|
```geolog
|
|
ax/base : forall x,y : V.
|
|
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
|
```
|
|
|
|
Datalog rule:
|
|
|
|
```prolog
|
|
reachable(X, Y) :- edge(X, Y).
|
|
```
|
|
|
|
Geolog rule:
|
|
|
|
```geolog
|
|
ax/trans : forall x,y,z : V.
|
|
[from: x, to: y] reachable, [from: y, to: z] reachable
|
|
|- [from: x, to: z] reachable;
|
|
```
|
|
|
|
Datalog rule:
|
|
|
|
```prolog
|
|
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
|
```
|
|
|
|
So in the simplest cases:
|
|
|
|
- **Geolog axiom** behaves like a **Datalog rule**.
|
|
|
|
---
|
|
|
|
## 5. A Full Example
|
|
|
|
Here is a Geolog theory and instance:
|
|
|
|
```geolog
|
|
theory Graph {
|
|
V : Sort;
|
|
edge : [from: V, to: V] -> Prop;
|
|
reachable : [from: V, to: V] -> Prop;
|
|
|
|
ax/base : forall x,y : V.
|
|
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
|
|
|
ax/trans : forall x,y,z : V.
|
|
[from: x, to: y] reachable, [from: y, to: z] reachable
|
|
|- [from: x, to: z] reachable;
|
|
}
|
|
|
|
instance G : Graph = {
|
|
a, b, c : V;
|
|
[from: a, to: b] edge;
|
|
[from: b, to: c] edge;
|
|
}
|
|
```
|
|
|
|
Here is the matching Datalog program:
|
|
|
|
```prolog
|
|
edge(a, b).
|
|
edge(b, c).
|
|
|
|
reachable(X, Y) :- edge(X, Y).
|
|
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
|
```
|
|
|
|
After rule application, both systems derive:
|
|
|
|
```prolog
|
|
reachable(a, b).
|
|
reachable(b, c).
|
|
reachable(a, c).
|
|
```
|
|
|
|
This is the part of Geolog that feels the most like Datalog.
|
|
|
|
---
|
|
|
|
## 6. An Instance Is Just The Starting Data
|
|
|
|
In Geolog, a theory gives:
|
|
|
|
- the names of sorts,
|
|
- the names of functions,
|
|
- the names of relations,
|
|
- and the axioms.
|
|
|
|
Then an instance gives:
|
|
|
|
- the actual elements,
|
|
- function values,
|
|
- and relation facts.
|
|
|
|
So a Geolog instance is just the starting data for rule application.
|
|
|
|
That is exactly the role that a set of starting facts plays in Datalog.
|
|
|
|
---
|
|
|
|
## 7. Chase In Geolog Is Like Repeated Rule Evaluation In Datalog
|
|
|
|
In Geolog, the **chase** means:
|
|
|
|
1. look at the current facts,
|
|
2. find an axiom that applies,
|
|
3. add the fact that the axiom says must follow,
|
|
4. repeat until nothing new appears.
|
|
|
|
In Datalog, the standard evaluation idea is very similar:
|
|
|
|
1. look at the current facts,
|
|
2. find a rule whose body is true,
|
|
3. add the fact in the head of the rule,
|
|
4. repeat until nothing new appears.
|
|
|
|
So for simple rule systems:
|
|
|
|
- **Geolog chase** and **Datalog rule evaluation** are very close.
|
|
|
|
---
|
|
|
|
## 8. Where The Analogy Stops Working: “There Exists ...”
|
|
|
|
Geolog can write rules like this:
|
|
|
|
```geolog
|
|
forall p : Person.
|
|
|- exists c : Person. [parent: p, kid: c] child;
|
|
```
|
|
|
|
In plain language, this says:
|
|
|
|
- every person has some child.
|
|
|
|
If the system does not already know such a child, Geolog may create a fresh witness during chase.
|
|
|
|
Basic Datalog does not do that.
|
|
|
|
Basic Datalog works with facts and constants that already exist. It does not normally invent brand new objects just because a rule says something must exist.
|
|
|
|
So this is one major way in which Geolog is more expressive than ordinary Datalog.
|
|
|
|
---
|
|
|
|
## 9. Where The Analogy Stops Working: Equality
|
|
|
|
Geolog can also reason about equality in richer ways than plain Datalog normally does.
|
|
|
|
That means Geolog can sometimes merge or identify things based on logical consequences, while ordinary Datalog is usually just adding new facts.
|
|
|
|
So equality is another place where Geolog is doing more than basic Datalog.
|
|
|
|
---
|
|
|
|
## 10. A Safe Mental Model
|
|
|
|
If you want a simple way to think about the connection, use this:
|
|
|
|
- a **sort** in Geolog is like a type,
|
|
- a **relation** in Geolog is like a predicate in Datalog,
|
|
- a **fact** in a Geolog instance is like a Datalog fact,
|
|
- an **axiom** in Geolog is often like a Datalog rule,
|
|
- and **chase** is like repeatedly applying rules until no new facts appear.
|
|
|
|
That mental model is very useful.
|
|
|
|
Just remember that full Geolog is richer than plain Datalog.
|
|
|
|
---
|
|
|
|
## Final Summary
|
|
|
|
Here is the simplest correct conclusion:
|
|
|
|
> A large part of Geolog can be understood as Datalog-like rule evaluation over typed data.
|
|
|
|
But also:
|
|
|
|
> Full Geolog goes beyond basic Datalog because it has real functions, richer equality reasoning, and rules that can say that new things must exist.
|
|
|