Add a comparison notes between Geolog and Datalog

This commit is contained in:
Hassan Abedi 2026-03-23 13:53:18 +01:00
parent 33e826b6e5
commit 50ae1473e4

View File

@ -1,53 +1,95 @@
# Geolog And Datalog
*How to map Geolog concepts onto Datalog, and where the analogy breaks down.*
*A simple explanation of how Geolog is similar to Datalog, and how it is different.*
---
## Summary
## What This Note Is About
Geolog is not just Datalog with different syntax, but a large and useful fragment of Geolog can be read that way.
Sometimes Geolog looks a lot like Datalog.
The best short summary is:
That is not an accident.
> **Geolog is like typed Datalog with native functions, existential rule heads, and stronger logical structure.**
Both systems let you:
The mapping works well for:
- describe facts,
- describe rules,
- and repeatedly apply those rules until no new facts appear.
- relations,
- Horn-style axioms,
- instance facts,
- and fixpoint/chase-style derivation.
But Geolog is not just Datalog with different syntax.
The mapping breaks down when Geolog uses:
Geolog also has features that go beyond basic Datalog, especially:
- existential conclusions,
- equality reasoning,
- and native function symbols with total/single-valued semantics.
- built-in types, called **sorts**,
- built-in **functions**,
- rules that can say “something exists”,
- and stronger support for equality.
This note explains the overlap in simple terms.
---
## First: What Is Datalog?
Datalog is a rule language.
It is often written with facts like this:
```prolog
edge(a, b).
edge(b, c).
```
And rules like this:
```prolog
reachable(X, Y) :- edge(X, Y).
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```
You can read those rules as:
- if `edge(X, Y)` is true, then `reachable(X, Y)` is true,
- if `reachable(X, Y)` and `reachable(Y, Z)` are true, then `reachable(X, Z)` is true.
Then the system keeps applying rules until it cannot derive anything new.
---
## Very Short Summary
The best simple summary is:
> Geolog contains a part that behaves a lot like typed Datalog, but full Geolog can do more.
Another good summary is:
> If you ignore some of the more advanced features, you can often read Geolog as “Datalog with types and functions”.
---
## Side-By-Side Mapping
| Concept | Geolog | Datalog |
|---------|--------|---------|
| Domain/type | `V : Sort;` | usually implicit, or unary predicate like `V(x)` |
| Idea | In Geolog | In Datalog |
|------|-----------|------------|
| Type of thing | `V : Sort;` | often left implicit, or written as a one-argument fact like `V(x)` |
| Relation | `edge : [from: V, to: V] -> Prop;` | `edge(X, Y)` |
| Function | `src : E -> V;` | usually encoded as relation `src(E, V)` |
| Function | `src : E -> V;` | usually encoded as a relation like `src(E, V)` |
| Fact | `[from: a, to: b] edge;` | `edge(a, b).` |
| Function fact | `e1 src = a;` | `src(e1, a).` |
| Axiom/rule | `A, B |- C` | `C :- A, B.` |
| Derived predicate | relation + axioms | IDB predicate + rules |
| Instance | concrete elements/facts | EDB facts |
| Chase/fixpoint | repeated rule firing to saturation | bottom-up least fixpoint |
| Existential conclusion | `|- exists y : B. ...` | not plain Datalog |
| Equality reasoning | native/logical | usually extension or encoding |
| Function value | `e1 src = a;` | `src(e1, a).` |
| Rule | `A, B |- C` | `C :- A, B.` |
| Starting data | an instance | a set of starting facts |
| Repeated rule application | chase | repeated rule evaluation until no new facts appear |
---
## 1. Sorts And Types
## 1. Sorts In Geolog Are Like Types
Geolog has first-class sorts:
Geolog has **sorts**.
A sort is just a kind of thing.
Example:
```geolog
theory Graph {
@ -55,7 +97,11 @@ theory Graph {
}
```
Plain Datalog often treats types as implicit, or simulates them with unary predicates:
This says that `V` is a sort. You can think of it as the type “vertex”.
Basic Datalog usually does not have this built in.
If someone wants types in Datalog, they often simulate them with ordinary facts such as:
```prolog
V(a).
@ -63,13 +109,16 @@ V(b).
V(c).
```
So the closest Datalog mental model is:
So the rough translation is:
- **Geolog sort****type discipline or unary predicate**.
- **Geolog sort** = **type**,
- and in Datalog that type is often represented using an ordinary relation.
---
## 2. Relations Map Directly
## 2. Relations Match Very Well
This part is simple.
Geolog relation declaration:
@ -77,25 +126,30 @@ Geolog relation declaration:
edge : [from: V, to: V] -> Prop;
```
Fact in a Geolog instance:
Geolog fact:
```geolog
[from: a, to: b] edge;
```
Datalog equivalent:
Datalog version:
```prolog
edge(a, b).
```
This is the cleanest and most direct part of the mapping.
So:
- a Geolog relation is very much like a Datalog predicate,
- and a Geolog relation fact is very much like a Datalog fact.
---
## 3. Functions Become Functional Relations
## 3. Functions Need More Care
Geolog has native function symbols:
Geolog has real functions.
Example:
```geolog
E : Sort;
@ -107,47 +161,46 @@ a : V;
e1 src = a;
```
In Datalog, the closest encoding is relational:
This means:
- `src` takes an edge and gives back one vertex,
- `e1` has source `a`.
In Datalog, people usually write the same information like this:
```prolog
src(e1, a).
```
But there is an important semantic gap:
But there is a difference.
- in **Geolog**, `src` is a genuine function,
- in **Datalog**, `src(e1, a)` is just a relation unless extra constraints are added.
In Geolog, `src` is a true function, so each edge should have one source.
So the right translation rule is:
In plain Datalog, `src(e1, a)` is just a normal fact in a normal relation. Datalog does not automatically know that `src` is supposed to behave like a function.
- **Geolog function****relation that is intended to be total and single-valued**.
So the simple rule is:
To fully mimic a function, you also need functional constraints such as:
```text
if src(e, v1) and src(e, v2), then v1 = v2
```
Plain Datalog does not natively enforce that on its own.
- **Geolog function** becomes **a relation in Datalog**,
- but some meaning is lost unless you also add extra rules or constraints.
---
## 4. Axioms Map To Rules
## 4. Axioms In Geolog Look Like Rules In Datalog
Geolog Horn-style axiom:
Geolog rule:
```geolog
ax/base : forall x,y : V.
[from: x, to: y] edge |- [from: x, to: y] reachable;
```
Datalog:
Datalog rule:
```prolog
reachable(X, Y) :- edge(X, Y).
```
Another Geolog axiom:
Geolog rule:
```geolog
ax/trans : forall x,y,z : V.
@ -155,19 +208,21 @@ ax/trans : forall x,y,z : V.
|- [from: x, to: z] reachable;
```
Datalog:
Datalog rule:
```prolog
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```
This is the main sense in which Geolog resembles Datalog.
So in the simplest cases:
- **Geolog axiom** behaves like a **Datalog rule**.
---
## 5. Full Example
## 5. A Full Example
Geolog:
Here is a Geolog theory and instance:
```geolog
theory Graph {
@ -190,7 +245,7 @@ instance G : Graph = {
}
```
Datalog:
Here is the matching Datalog program:
```prolog
edge(a, b).
@ -200,7 +255,7 @@ reachable(X, Y) :- edge(X, Y).
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```
After saturation/fixpoint, both derive:
After rule application, both systems derive:
```prolog
reachable(a, b).
@ -208,103 +263,109 @@ reachable(b, c).
reachable(a, c).
```
---
## 6. Instances Are Like EDB Facts
Geolog theory + instance splits naturally into:
- **theory** = vocabulary + laws
- **instance** = starting facts
That matches the Datalog split between:
- **rules/schema**, and
- **EDB facts**.
So:
- **Geolog instance****extensional database**.
This is the part of Geolog that feels the most like Datalog.
---
## 7. Chase Is Like Bottom-Up Evaluation
## 6. An Instance Is Just The Starting Data
Geolog chase:
In Geolog, a theory gives:
- repeatedly apply axioms,
- add newly implied facts,
- stop when nothing new appears.
- the names of sorts,
- the names of functions,
- the names of relations,
- and the axioms.
Datalog bottom-up evaluation:
Then an instance gives:
- repeatedly apply rules,
- add newly derived facts,
- stop at the least fixpoint.
- the actual elements,
- function values,
- and relation facts.
So in the Horn fragment:
So a Geolog instance is just the starting data for rule application.
- **Geolog chase****Datalog least-fixpoint evaluation**.
That is exactly the role that a set of starting facts plays in Datalog.
---
## 8. Where The Mapping Breaks: Existentials
## 7. Chase In Geolog Is Like Repeated Rule Evaluation In Datalog
Geolog can say things like:
In Geolog, the **chase** means:
1. look at the current facts,
2. find an axiom that applies,
3. add the fact that the axiom says must follow,
4. repeat until nothing new appears.
In Datalog, the standard evaluation idea is very similar:
1. look at the current facts,
2. find a rule whose body is true,
3. add the fact in the head of the rule,
4. repeat until nothing new appears.
So for simple rule systems:
- **Geolog chase** and **Datalog rule evaluation** are very close.
---
## 8. Where The Analogy Stops Working: “There Exists ...”
Geolog can write rules like this:
```geolog
forall p : Person.
|- exists c : Person. [parent: p, kid: c] child;
```
Meaning:
In plain language, this says:
- for every person `p`, some child must exist,
- and chase may create a fresh witness if needed.
- every person has some child.
Plain Datalog cannot do this, because plain Datalog does not invent fresh elements.
If the system does not already know such a child, Geolog may create a fresh witness during chase.
This is not standard Datalog anymore. It is closer to:
Basic Datalog does not do that.
- existential rules,
- tuple-generating dependencies,
- or Datalog±.
Basic Datalog works with facts and constants that already exist. It does not normally invent brand new objects just because a rule says something must exist.
So:
- **Geolog with existential conclusions** is strictly more expressive than plain Datalog.
So this is one major way in which Geolog is more expressive than ordinary Datalog.
---
## 9. Where The Mapping Breaks: Equality
## 9. Where The Analogy Stops Working: Equality
Geolog also supports equality-oriented reasoning in ways that go beyond plain Datalog.
Geolog can also reason about equality in richer ways than plain Datalog normally does.
If equalities are inferred or propagated as part of semantic saturation, then the system is doing more than ordinary Datalog rule evaluation.
That means Geolog can sometimes merge or identify things based on logical consequences, while ordinary Datalog is usually just adding new facts.
So:
- **Geolog equality reasoning** is not captured by plain Datalog without extensions or encodings.
So equality is another place where Geolog is doing more than basic Datalog.
---
## 10. Best Practical Translation Rules
## 10. A Safe Mental Model
When reading Geolog in Datalog terms, use this mental model:
If you want a simple way to think about the connection, use this:
- `Sort` → type or unary predicate
- function `f : A -> B` → relation `f(a, b)` plus intended functionality
- relation `R : [...] -> Prop` → predicate `R(...)`
- axiom `premises |- conclusion` → rule `conclusion :- premises`
- instance facts → EDB facts
- chase → bottom-up fixpoint
- existential conclusion → no longer plain Datalog
- a **sort** in Geolog is like a type,
- a **relation** in Geolog is like a predicate in Datalog,
- a **fact** in a Geolog instance is like a Datalog fact,
- an **axiom** in Geolog is often like a Datalog rule,
- and **chase** is like repeatedly applying rules until no new facts appear.
That mental model is very useful.
Just remember that full Geolog is richer than plain Datalog.
---
## 11. Bottom Line
## Final Summary
The safest final statement is:
Here is the simplest correct conclusion:
> **A Datalog-like fragment of Geolog maps very naturally to typed Datalog, but full Geolog is richer because it includes native functions, existential rule heads, and stronger logical machinery.**
> A large part of Geolog can be understood as Datalog-like rule evaluation over typed data.
But also:
> Full Geolog goes beyond basic Datalog because it has real functions, richer equality reasoning, and rules that can say that new things must exist.