Add a comparison notes between Geolog and Datalog
This commit is contained in:
parent
33e826b6e5
commit
50ae1473e4
@ -1,53 +1,95 @@
|
||||
# Geolog And Datalog
|
||||
|
||||
*How to map Geolog concepts onto Datalog, and where the analogy breaks down.*
|
||||
*A simple explanation of how Geolog is similar to Datalog, and how it is different.*
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
## What This Note Is About
|
||||
|
||||
Geolog is not just Datalog with different syntax, but a large and useful fragment of Geolog can be read that way.
|
||||
Sometimes Geolog looks a lot like Datalog.
|
||||
|
||||
The best short summary is:
|
||||
That is not an accident.
|
||||
|
||||
> **Geolog is like typed Datalog with native functions, existential rule heads, and stronger logical structure.**
|
||||
Both systems let you:
|
||||
|
||||
The mapping works well for:
|
||||
- describe facts,
|
||||
- describe rules,
|
||||
- and repeatedly apply those rules until no new facts appear.
|
||||
|
||||
- relations,
|
||||
- Horn-style axioms,
|
||||
- instance facts,
|
||||
- and fixpoint/chase-style derivation.
|
||||
But Geolog is not just Datalog with different syntax.
|
||||
|
||||
The mapping breaks down when Geolog uses:
|
||||
Geolog also has features that go beyond basic Datalog, especially:
|
||||
|
||||
- existential conclusions,
|
||||
- equality reasoning,
|
||||
- and native function symbols with total/single-valued semantics.
|
||||
- built-in types, called **sorts**,
|
||||
- built-in **functions**,
|
||||
- rules that can say “something exists”,
|
||||
- and stronger support for equality.
|
||||
|
||||
This note explains the overlap in simple terms.
|
||||
|
||||
---
|
||||
|
||||
## First: What Is Datalog?
|
||||
|
||||
Datalog is a rule language.
|
||||
|
||||
It is often written with facts like this:
|
||||
|
||||
```prolog
|
||||
edge(a, b).
|
||||
edge(b, c).
|
||||
```
|
||||
|
||||
And rules like this:
|
||||
|
||||
```prolog
|
||||
reachable(X, Y) :- edge(X, Y).
|
||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||
```
|
||||
|
||||
You can read those rules as:
|
||||
|
||||
- if `edge(X, Y)` is true, then `reachable(X, Y)` is true,
|
||||
- if `reachable(X, Y)` and `reachable(Y, Z)` are true, then `reachable(X, Z)` is true.
|
||||
|
||||
Then the system keeps applying rules until it cannot derive anything new.
|
||||
|
||||
---
|
||||
|
||||
## Very Short Summary
|
||||
|
||||
The best simple summary is:
|
||||
|
||||
> Geolog contains a part that behaves a lot like typed Datalog, but full Geolog can do more.
|
||||
|
||||
Another good summary is:
|
||||
|
||||
> If you ignore some of the more advanced features, you can often read Geolog as “Datalog with types and functions”.
|
||||
|
||||
---
|
||||
|
||||
## Side-By-Side Mapping
|
||||
|
||||
| Concept | Geolog | Datalog |
|
||||
|---------|--------|---------|
|
||||
| Domain/type | `V : Sort;` | usually implicit, or unary predicate like `V(x)` |
|
||||
| Idea | In Geolog | In Datalog |
|
||||
|------|-----------|------------|
|
||||
| Type of thing | `V : Sort;` | often left implicit, or written as a one-argument fact like `V(x)` |
|
||||
| Relation | `edge : [from: V, to: V] -> Prop;` | `edge(X, Y)` |
|
||||
| Function | `src : E -> V;` | usually encoded as relation `src(E, V)` |
|
||||
| Function | `src : E -> V;` | usually encoded as a relation like `src(E, V)` |
|
||||
| Fact | `[from: a, to: b] edge;` | `edge(a, b).` |
|
||||
| Function fact | `e1 src = a;` | `src(e1, a).` |
|
||||
| Axiom/rule | `A, B |- C` | `C :- A, B.` |
|
||||
| Derived predicate | relation + axioms | IDB predicate + rules |
|
||||
| Instance | concrete elements/facts | EDB facts |
|
||||
| Chase/fixpoint | repeated rule firing to saturation | bottom-up least fixpoint |
|
||||
| Existential conclusion | `|- exists y : B. ...` | not plain Datalog |
|
||||
| Equality reasoning | native/logical | usually extension or encoding |
|
||||
| Function value | `e1 src = a;` | `src(e1, a).` |
|
||||
| Rule | `A, B |- C` | `C :- A, B.` |
|
||||
| Starting data | an instance | a set of starting facts |
|
||||
| Repeated rule application | chase | repeated rule evaluation until no new facts appear |
|
||||
|
||||
---
|
||||
|
||||
## 1. Sorts And Types
|
||||
## 1. Sorts In Geolog Are Like Types
|
||||
|
||||
Geolog has first-class sorts:
|
||||
Geolog has **sorts**.
|
||||
|
||||
A sort is just a kind of thing.
|
||||
|
||||
Example:
|
||||
|
||||
```geolog
|
||||
theory Graph {
|
||||
@ -55,7 +97,11 @@ theory Graph {
|
||||
}
|
||||
```
|
||||
|
||||
Plain Datalog often treats types as implicit, or simulates them with unary predicates:
|
||||
This says that `V` is a sort. You can think of it as the type “vertex”.
|
||||
|
||||
Basic Datalog usually does not have this built in.
|
||||
|
||||
If someone wants types in Datalog, they often simulate them with ordinary facts such as:
|
||||
|
||||
```prolog
|
||||
V(a).
|
||||
@ -63,13 +109,16 @@ V(b).
|
||||
V(c).
|
||||
```
|
||||
|
||||
So the closest Datalog mental model is:
|
||||
So the rough translation is:
|
||||
|
||||
- **Geolog sort** ≈ **type discipline or unary predicate**.
|
||||
- **Geolog sort** = **type**,
|
||||
- and in Datalog that type is often represented using an ordinary relation.
|
||||
|
||||
---
|
||||
|
||||
## 2. Relations Map Directly
|
||||
## 2. Relations Match Very Well
|
||||
|
||||
This part is simple.
|
||||
|
||||
Geolog relation declaration:
|
||||
|
||||
@ -77,25 +126,30 @@ Geolog relation declaration:
|
||||
edge : [from: V, to: V] -> Prop;
|
||||
```
|
||||
|
||||
Fact in a Geolog instance:
|
||||
Geolog fact:
|
||||
|
||||
```geolog
|
||||
[from: a, to: b] edge;
|
||||
```
|
||||
|
||||
Datalog equivalent:
|
||||
Datalog version:
|
||||
|
||||
```prolog
|
||||
edge(a, b).
|
||||
```
|
||||
|
||||
This is the cleanest and most direct part of the mapping.
|
||||
So:
|
||||
|
||||
- a Geolog relation is very much like a Datalog predicate,
|
||||
- and a Geolog relation fact is very much like a Datalog fact.
|
||||
|
||||
---
|
||||
|
||||
## 3. Functions Become Functional Relations
|
||||
## 3. Functions Need More Care
|
||||
|
||||
Geolog has native function symbols:
|
||||
Geolog has real functions.
|
||||
|
||||
Example:
|
||||
|
||||
```geolog
|
||||
E : Sort;
|
||||
@ -107,47 +161,46 @@ a : V;
|
||||
e1 src = a;
|
||||
```
|
||||
|
||||
In Datalog, the closest encoding is relational:
|
||||
This means:
|
||||
|
||||
- `src` takes an edge and gives back one vertex,
|
||||
- `e1` has source `a`.
|
||||
|
||||
In Datalog, people usually write the same information like this:
|
||||
|
||||
```prolog
|
||||
src(e1, a).
|
||||
```
|
||||
|
||||
But there is an important semantic gap:
|
||||
But there is a difference.
|
||||
|
||||
- in **Geolog**, `src` is a genuine function,
|
||||
- in **Datalog**, `src(e1, a)` is just a relation unless extra constraints are added.
|
||||
In Geolog, `src` is a true function, so each edge should have one source.
|
||||
|
||||
So the right translation rule is:
|
||||
In plain Datalog, `src(e1, a)` is just a normal fact in a normal relation. Datalog does not automatically know that `src` is supposed to behave like a function.
|
||||
|
||||
- **Geolog function** ≈ **relation that is intended to be total and single-valued**.
|
||||
So the simple rule is:
|
||||
|
||||
To fully mimic a function, you also need functional constraints such as:
|
||||
|
||||
```text
|
||||
if src(e, v1) and src(e, v2), then v1 = v2
|
||||
```
|
||||
|
||||
Plain Datalog does not natively enforce that on its own.
|
||||
- **Geolog function** becomes **a relation in Datalog**,
|
||||
- but some meaning is lost unless you also add extra rules or constraints.
|
||||
|
||||
---
|
||||
|
||||
## 4. Axioms Map To Rules
|
||||
## 4. Axioms In Geolog Look Like Rules In Datalog
|
||||
|
||||
Geolog Horn-style axiom:
|
||||
Geolog rule:
|
||||
|
||||
```geolog
|
||||
ax/base : forall x,y : V.
|
||||
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
Datalog rule:
|
||||
|
||||
```prolog
|
||||
reachable(X, Y) :- edge(X, Y).
|
||||
```
|
||||
|
||||
Another Geolog axiom:
|
||||
Geolog rule:
|
||||
|
||||
```geolog
|
||||
ax/trans : forall x,y,z : V.
|
||||
@ -155,19 +208,21 @@ ax/trans : forall x,y,z : V.
|
||||
|- [from: x, to: z] reachable;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
Datalog rule:
|
||||
|
||||
```prolog
|
||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||
```
|
||||
|
||||
This is the main sense in which Geolog resembles Datalog.
|
||||
So in the simplest cases:
|
||||
|
||||
- **Geolog axiom** behaves like a **Datalog rule**.
|
||||
|
||||
---
|
||||
|
||||
## 5. Full Example
|
||||
## 5. A Full Example
|
||||
|
||||
Geolog:
|
||||
Here is a Geolog theory and instance:
|
||||
|
||||
```geolog
|
||||
theory Graph {
|
||||
@ -190,7 +245,7 @@ instance G : Graph = {
|
||||
}
|
||||
```
|
||||
|
||||
Datalog:
|
||||
Here is the matching Datalog program:
|
||||
|
||||
```prolog
|
||||
edge(a, b).
|
||||
@ -200,7 +255,7 @@ reachable(X, Y) :- edge(X, Y).
|
||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||
```
|
||||
|
||||
After saturation/fixpoint, both derive:
|
||||
After rule application, both systems derive:
|
||||
|
||||
```prolog
|
||||
reachable(a, b).
|
||||
@ -208,103 +263,109 @@ reachable(b, c).
|
||||
reachable(a, c).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Instances Are Like EDB Facts
|
||||
|
||||
Geolog theory + instance splits naturally into:
|
||||
|
||||
- **theory** = vocabulary + laws
|
||||
- **instance** = starting facts
|
||||
|
||||
That matches the Datalog split between:
|
||||
|
||||
- **rules/schema**, and
|
||||
- **EDB facts**.
|
||||
|
||||
So:
|
||||
|
||||
- **Geolog instance** ≈ **extensional database**.
|
||||
This is the part of Geolog that feels the most like Datalog.
|
||||
|
||||
---
|
||||
|
||||
## 7. Chase Is Like Bottom-Up Evaluation
|
||||
## 6. An Instance Is Just The Starting Data
|
||||
|
||||
Geolog chase:
|
||||
In Geolog, a theory gives:
|
||||
|
||||
- repeatedly apply axioms,
|
||||
- add newly implied facts,
|
||||
- stop when nothing new appears.
|
||||
- the names of sorts,
|
||||
- the names of functions,
|
||||
- the names of relations,
|
||||
- and the axioms.
|
||||
|
||||
Datalog bottom-up evaluation:
|
||||
Then an instance gives:
|
||||
|
||||
- repeatedly apply rules,
|
||||
- add newly derived facts,
|
||||
- stop at the least fixpoint.
|
||||
- the actual elements,
|
||||
- function values,
|
||||
- and relation facts.
|
||||
|
||||
So in the Horn fragment:
|
||||
So a Geolog instance is just the starting data for rule application.
|
||||
|
||||
- **Geolog chase** ≈ **Datalog least-fixpoint evaluation**.
|
||||
That is exactly the role that a set of starting facts plays in Datalog.
|
||||
|
||||
---
|
||||
|
||||
## 8. Where The Mapping Breaks: Existentials
|
||||
## 7. Chase In Geolog Is Like Repeated Rule Evaluation In Datalog
|
||||
|
||||
Geolog can say things like:
|
||||
In Geolog, the **chase** means:
|
||||
|
||||
1. look at the current facts,
|
||||
2. find an axiom that applies,
|
||||
3. add the fact that the axiom says must follow,
|
||||
4. repeat until nothing new appears.
|
||||
|
||||
In Datalog, the standard evaluation idea is very similar:
|
||||
|
||||
1. look at the current facts,
|
||||
2. find a rule whose body is true,
|
||||
3. add the fact in the head of the rule,
|
||||
4. repeat until nothing new appears.
|
||||
|
||||
So for simple rule systems:
|
||||
|
||||
- **Geolog chase** and **Datalog rule evaluation** are very close.
|
||||
|
||||
---
|
||||
|
||||
## 8. Where The Analogy Stops Working: “There Exists ...”
|
||||
|
||||
Geolog can write rules like this:
|
||||
|
||||
```geolog
|
||||
forall p : Person.
|
||||
|- exists c : Person. [parent: p, kid: c] child;
|
||||
```
|
||||
|
||||
Meaning:
|
||||
In plain language, this says:
|
||||
|
||||
- for every person `p`, some child must exist,
|
||||
- and chase may create a fresh witness if needed.
|
||||
- every person has some child.
|
||||
|
||||
Plain Datalog cannot do this, because plain Datalog does not invent fresh elements.
|
||||
If the system does not already know such a child, Geolog may create a fresh witness during chase.
|
||||
|
||||
This is not standard Datalog anymore. It is closer to:
|
||||
Basic Datalog does not do that.
|
||||
|
||||
- existential rules,
|
||||
- tuple-generating dependencies,
|
||||
- or Datalog±.
|
||||
Basic Datalog works with facts and constants that already exist. It does not normally invent brand new objects just because a rule says something must exist.
|
||||
|
||||
So:
|
||||
|
||||
- **Geolog with existential conclusions** is strictly more expressive than plain Datalog.
|
||||
So this is one major way in which Geolog is more expressive than ordinary Datalog.
|
||||
|
||||
---
|
||||
|
||||
## 9. Where The Mapping Breaks: Equality
|
||||
## 9. Where The Analogy Stops Working: Equality
|
||||
|
||||
Geolog also supports equality-oriented reasoning in ways that go beyond plain Datalog.
|
||||
Geolog can also reason about equality in richer ways than plain Datalog normally does.
|
||||
|
||||
If equalities are inferred or propagated as part of semantic saturation, then the system is doing more than ordinary Datalog rule evaluation.
|
||||
That means Geolog can sometimes merge or identify things based on logical consequences, while ordinary Datalog is usually just adding new facts.
|
||||
|
||||
So:
|
||||
|
||||
- **Geolog equality reasoning** is not captured by plain Datalog without extensions or encodings.
|
||||
So equality is another place where Geolog is doing more than basic Datalog.
|
||||
|
||||
---
|
||||
|
||||
## 10. Best Practical Translation Rules
|
||||
## 10. A Safe Mental Model
|
||||
|
||||
When reading Geolog in Datalog terms, use this mental model:
|
||||
If you want a simple way to think about the connection, use this:
|
||||
|
||||
- `Sort` → type or unary predicate
|
||||
- function `f : A -> B` → relation `f(a, b)` plus intended functionality
|
||||
- relation `R : [...] -> Prop` → predicate `R(...)`
|
||||
- axiom `premises |- conclusion` → rule `conclusion :- premises`
|
||||
- instance facts → EDB facts
|
||||
- chase → bottom-up fixpoint
|
||||
- existential conclusion → no longer plain Datalog
|
||||
- a **sort** in Geolog is like a type,
|
||||
- a **relation** in Geolog is like a predicate in Datalog,
|
||||
- a **fact** in a Geolog instance is like a Datalog fact,
|
||||
- an **axiom** in Geolog is often like a Datalog rule,
|
||||
- and **chase** is like repeatedly applying rules until no new facts appear.
|
||||
|
||||
That mental model is very useful.
|
||||
|
||||
Just remember that full Geolog is richer than plain Datalog.
|
||||
|
||||
---
|
||||
|
||||
## 11. Bottom Line
|
||||
## Final Summary
|
||||
|
||||
The safest final statement is:
|
||||
Here is the simplest correct conclusion:
|
||||
|
||||
> **A Datalog-like fragment of Geolog maps very naturally to typed Datalog, but full Geolog is richer because it includes native functions, existential rule heads, and stronger logical machinery.**
|
||||
> A large part of Geolog can be understood as Datalog-like rule evaluation over typed data.
|
||||
|
||||
But also:
|
||||
|
||||
> Full Geolog goes beyond basic Datalog because it has real functions, richer equality reasoning, and rules that can say that new things must exist.
|
||||
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user