504 lines
8.4 KiB
Markdown
504 lines
8.4 KiB
Markdown
# Geolog To Datalog Subset
|
||
|
||
*A practical subset of Geolog that can be translated automatically into plain Datalog.*
|
||
|
||
---
|
||
|
||
## Goal
|
||
|
||
The goal of this note is to describe a part of Geolog that can be translated into plain Datalog without too much trouble.
|
||
|
||
This note is about a **safe, useful subset**.
|
||
|
||
It is **not** about translating all of Geolog.
|
||
|
||
That matters because full Geolog has features that go beyond what plain Datalog can express.
|
||
|
||
---
|
||
|
||
## Short Answer
|
||
|
||
Automatic translation is quite reasonable if we only allow:
|
||
|
||
- sorts,
|
||
- relations,
|
||
- functions,
|
||
- instances with facts,
|
||
- and simple Horn-style axioms.
|
||
|
||
Automatic translation becomes much harder if we also allow:
|
||
|
||
- existential conclusions,
|
||
- equality-generating behavior,
|
||
- or any rule that needs the system to create fresh objects.
|
||
|
||
---
|
||
|
||
## What Subset We Want
|
||
|
||
We want a subset where Geolog rules behave like ordinary Datalog rules.
|
||
|
||
That means we allow the parts of Geolog that fit this pattern:
|
||
|
||
- some facts are already true,
|
||
- a rule sees those facts,
|
||
- the rule derives one new fact,
|
||
- repeat until nothing new can be derived.
|
||
|
||
---
|
||
|
||
## Allowed Features
|
||
|
||
### 1. Sort declarations
|
||
|
||
Allowed:
|
||
|
||
```geolog
|
||
V : Sort;
|
||
E : Sort;
|
||
```
|
||
|
||
These are Geolog types.
|
||
|
||
In Datalog, they can be handled in either of two ways:
|
||
|
||
- keep them only as compiler information,
|
||
- or emit unary predicates such as `V(x)` and `E(x)`.
|
||
|
||
---
|
||
|
||
### 2. Relation declarations
|
||
|
||
Allowed:
|
||
|
||
```geolog
|
||
edge : [from: V, to: V] -> Prop;
|
||
reachable : [from: V, to: V] -> Prop;
|
||
```
|
||
|
||
Translation idea:
|
||
|
||
```prolog
|
||
edge(X, Y)
|
||
reachable(X, Y)
|
||
```
|
||
|
||
The field names such as `from` and `to` are useful in Geolog, but plain Datalog usually uses only argument position.
|
||
|
||
---
|
||
|
||
### 3. Function declarations
|
||
|
||
Allowed:
|
||
|
||
```geolog
|
||
src : E -> V;
|
||
tgt : E -> V;
|
||
```
|
||
|
||
Translation idea:
|
||
|
||
```prolog
|
||
src(E, V)
|
||
tgt(E, V)
|
||
```
|
||
|
||
This means we turn Geolog functions into ordinary Datalog relations.
|
||
|
||
That is a practical encoding, but it is important to remember that a Datalog relation does not automatically behave like a true function.
|
||
|
||
---
|
||
|
||
### 4. Instance facts
|
||
|
||
Allowed element declarations:
|
||
|
||
```geolog
|
||
a, b, c : V;
|
||
e1, e2 : E;
|
||
```
|
||
|
||
Possible Datalog output:
|
||
|
||
```prolog
|
||
V(a).
|
||
V(b).
|
||
V(c).
|
||
E(e1).
|
||
E(e2).
|
||
```
|
||
|
||
Allowed relation facts:
|
||
|
||
```geolog
|
||
[from: a, to: b] edge;
|
||
[from: b, to: c] edge;
|
||
```
|
||
|
||
Translation:
|
||
|
||
```prolog
|
||
edge(a, b).
|
||
edge(b, c).
|
||
```
|
||
|
||
Allowed function facts:
|
||
|
||
```geolog
|
||
e1 src = a;
|
||
e1 tgt = b;
|
||
```
|
||
|
||
Translation:
|
||
|
||
```prolog
|
||
src(e1, a).
|
||
tgt(e1, b).
|
||
```
|
||
|
||
---
|
||
|
||
### 5. Simple axioms
|
||
|
||
Allowed form:
|
||
|
||
```geolog
|
||
name : forall vars.
|
||
premise1, premise2, ..., premisen |- conclusion;
|
||
```
|
||
|
||
Where:
|
||
|
||
- all variables are universal variables,
|
||
- each premise is a simple atomic fact,
|
||
- the conclusion is exactly one simple atomic fact,
|
||
- there is no `exists` in the conclusion,
|
||
- there is no disjunction,
|
||
- there is no equality-generating step.
|
||
|
||
This is the part of Geolog that lines up best with plain Datalog.
|
||
|
||
---
|
||
|
||
## What Is Not Allowed
|
||
|
||
This subset should reject programs that use:
|
||
|
||
- existential conclusions,
|
||
- equality-generating rules,
|
||
- rules that require creation of fresh objects,
|
||
- disjunction,
|
||
- nested instances in places where they affect rule translation,
|
||
- or advanced features that do not flatten cleanly into ordinary Datalog facts and rules.
|
||
|
||
---
|
||
|
||
## Translation Rules
|
||
|
||
This section gives the basic mechanical mapping.
|
||
|
||
### Sort declaration
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
V : Sort;
|
||
```
|
||
|
||
Possible Datalog representation:
|
||
|
||
```prolog
|
||
V(x)
|
||
```
|
||
|
||
In practice, you usually emit only the facts like `V(a).`, not a schema declaration.
|
||
|
||
---
|
||
|
||
### Relation declaration
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
reachable : [from: V, to: V] -> Prop;
|
||
```
|
||
|
||
Datalog predicate shape:
|
||
|
||
```prolog
|
||
reachable(X, Y)
|
||
```
|
||
|
||
---
|
||
|
||
### Function declaration
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
src : E -> V;
|
||
```
|
||
|
||
Datalog predicate shape:
|
||
|
||
```prolog
|
||
src(E, V)
|
||
```
|
||
|
||
This is called **relationalizing** the function: we represent the function as a relation.
|
||
|
||
---
|
||
|
||
### Element declaration
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
a : V;
|
||
```
|
||
|
||
Datalog:
|
||
|
||
```prolog
|
||
V(a).
|
||
```
|
||
|
||
---
|
||
|
||
### Relation fact
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
[from: a, to: b] edge;
|
||
```
|
||
|
||
Datalog:
|
||
|
||
```prolog
|
||
edge(a, b).
|
||
```
|
||
|
||
---
|
||
|
||
### Function fact
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
e1 src = a;
|
||
```
|
||
|
||
Datalog:
|
||
|
||
```prolog
|
||
src(e1, a).
|
||
```
|
||
|
||
---
|
||
|
||
### Axiom
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
ax/base : forall x,y : V.
|
||
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
||
```
|
||
|
||
Datalog:
|
||
|
||
```prolog
|
||
reachable(X, Y) :- edge(X, Y).
|
||
```
|
||
|
||
Geolog:
|
||
|
||
```geolog
|
||
ax/trans : forall x,y,z : V.
|
||
[from: x, to: y] reachable, [from: y, to: z] reachable
|
||
|- [from: x, to: z] reachable;
|
||
```
|
||
|
||
Datalog:
|
||
|
||
```prolog
|
||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||
```
|
||
|
||
---
|
||
|
||
## Full Example
|
||
|
||
### Geolog input
|
||
|
||
```geolog
|
||
theory Graph {
|
||
V : Sort;
|
||
E : Sort;
|
||
|
||
src : E -> V;
|
||
tgt : E -> V;
|
||
|
||
edge : [from: V, to: V] -> Prop;
|
||
reachable : [from: V, to: V] -> Prop;
|
||
|
||
ax/base : forall x,y : V.
|
||
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
||
|
||
ax/trans : forall x,y,z : V.
|
||
[from: x, to: y] reachable, [from: y, to: z] reachable
|
||
|- [from: x, to: z] reachable;
|
||
}
|
||
|
||
instance G : Graph = {
|
||
a, b, c : V;
|
||
e1, e2 : E;
|
||
|
||
e1 src = a;
|
||
e1 tgt = b;
|
||
e2 src = b;
|
||
e2 tgt = c;
|
||
|
||
[from: a, to: b] edge;
|
||
[from: b, to: c] edge;
|
||
}
|
||
```
|
||
|
||
### Datalog output
|
||
|
||
```prolog
|
||
V(a).
|
||
V(b).
|
||
V(c).
|
||
E(e1).
|
||
E(e2).
|
||
|
||
src(e1, a).
|
||
tgt(e1, b).
|
||
src(e2, b).
|
||
tgt(e2, c).
|
||
|
||
edge(a, b).
|
||
edge(b, c).
|
||
|
||
reachable(X, Y) :- edge(X, Y).
|
||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||
```
|
||
|
||
---
|
||
|
||
## What This Translation Loses
|
||
|
||
Even in this safe subset, some Geolog meaning may be weakened.
|
||
|
||
### Function meaning is weaker
|
||
|
||
In Geolog, `src : E -> V` really means “each edge has one source”.
|
||
|
||
In plain Datalog, once we translate it to a relation, this bad situation is possible unless something else prevents it:
|
||
|
||
```prolog
|
||
src(e1, a).
|
||
src(e1, b).
|
||
```
|
||
|
||
So the translator is practical, but not fully faithful to all of Geolog’s function meaning.
|
||
|
||
### Type checking may become weaker
|
||
|
||
If the translator does not emit unary sort predicates like `V(a).`, then the output Datalog program may lose explicit type information.
|
||
|
||
### Richer Geolog behavior is excluded
|
||
|
||
This subset does not include the most powerful parts of Geolog.
|
||
|
||
That is intentional.
|
||
|
||
---
|
||
|
||
## Best Implementation Strategy
|
||
|
||
If someone wanted to build this translator, the safest pipeline would be:
|
||
|
||
1. parse the Geolog source,
|
||
2. elaborate or type-check it,
|
||
3. reject anything outside the supported subset,
|
||
4. turn functions into relations,
|
||
5. flatten named relation fields into positional arguments,
|
||
6. print Datalog facts and rules.
|
||
|
||
That is much safer than trying to translate directly from raw surface syntax.
|
||
|
||
---
|
||
|
||
## A Good Simple Definition Of The Subset
|
||
|
||
Here is a useful one-sentence definition:
|
||
|
||
> The translatable subset is the part of Geolog where theories can be expressed as universally quantified Horn-style rules over relations and function-as-relation facts, with no existential conclusions and no equality-generating behavior.
|
||
|
||
---
|
||
|
||
## How Useful This Subset Would Be
|
||
|
||
This subset would be genuinely useful, but only for the right jobs.
|
||
|
||
It would be useful for:
|
||
|
||
- understanding a Geolog theory in simpler rule form,
|
||
- debugging the rule part of a theory,
|
||
- exporting some Geolog programs to Datalog tools,
|
||
- comparing Geolog evaluation with Datalog evaluation,
|
||
- and teaching the parts of Geolog that are close to rule-based database logic.
|
||
|
||
It would be especially useful for theories that are mostly:
|
||
|
||
- relation-based,
|
||
- rule-based,
|
||
- and not dependent on creating fresh objects during chase.
|
||
|
||
Good examples include:
|
||
|
||
- graph reachability,
|
||
- transitive closure,
|
||
- propagation rules,
|
||
- dependency rules,
|
||
- and many database-style logic programs.
|
||
|
||
But this subset would not cover the full point of Geolog.
|
||
|
||
It becomes less useful when a theory depends on:
|
||
|
||
- existential conclusions,
|
||
- equality-heavy reasoning,
|
||
- or the full meaning of native functions.
|
||
|
||
So the best way to think about it is this:
|
||
|
||
- it is a very useful **bridge**,
|
||
- a useful **debugging view**,
|
||
- and a useful **restricted compilation target**,
|
||
- but it is not a replacement for full Geolog.
|
||
|
||
If built well, the ideal behavior would be:
|
||
|
||
- if a Geolog program fits inside the subset, export it cleanly,
|
||
- if it does not fit, explain clearly why not.
|
||
|
||
That alone would make the subset valuable.
|
||
|
||
---
|
||
|
||
## Final Summary
|
||
|
||
If you only want a practical translator, the right plan is:
|
||
|
||
- support sorts,
|
||
- support relations,
|
||
- support functions by turning them into relations,
|
||
- support simple axioms with one conclusion fact,
|
||
- reject existential and equality-heavy features.
|
||
|
||
That gives a subset that is useful, understandable, and realistic to translate automatically.
|