# Geolog To Datalog Subset

*A practical subset of Geolog that can be translated automatically into plain Datalog.*

---

## Goal

This note describes a part of Geolog that can be translated mechanically into plain Datalog.

It covers a **safe, useful subset**.

It is **not** about translating all of Geolog.

That matters because full Geolog has features that go beyond what plain Datalog can express.

---

## Short Answer

Automatic translation is straightforward if we allow only:

- sorts,
- relations,
- functions,
- instances with facts,
- and simple Horn-style axioms.

Automatic translation becomes much harder if we also allow:

- existential conclusions,
- equality-generating behavior,
- or any rule that needs the system to create fresh objects.

---

## What Subset We Want

We want a subset where Geolog rules behave like ordinary Datalog rules.

That means we allow the parts of Geolog that fit this pattern:

- some facts are already true,
- a rule sees those facts,
- the rule derives one new fact,
- repeat until nothing new can be derived.

---

## Allowed Features

### 1. Sort declarations

Allowed:

```geolog
V : Sort;
E : Sort;
```

These are Geolog types.

In Datalog, they can be handled in either of two ways:

- keep them only as compiler information,
- or emit unary predicates such as `V(x)` and `E(x)`.

---

### 2. Relation declarations

Allowed:

```geolog
edge : [from: V, to: V] -> Prop;
reachable : [from: V, to: V] -> Prop;
```

Translation idea:

```prolog
edge(X, Y)
reachable(X, Y)
```

The field names such as `from` and `to` are useful in Geolog, but plain Datalog usually uses only argument position.

---

### 3. Function declarations

Allowed:

```geolog
src : E -> V;
tgt : E -> V;
```

Translation idea:

```prolog
src(E, V)
tgt(E, V)
```

This means we turn Geolog functions into ordinary Datalog relations.

That is a practical encoding, but it is important to remember that a Datalog relation does not automatically behave like a true function.

---

### 4. Instance facts

Allowed element declarations:

```geolog
a, b, c : V;
e1, e2 : E;
```

Possible Datalog output:

```prolog
V(a).
V(b).
V(c).
E(e1).
E(e2).
```

Allowed relation facts:

```geolog
[from: a, to: b] edge;
[from: b, to: c] edge;
```

Translation:

```prolog
edge(a, b).
edge(b, c).
```

Allowed function facts:

```geolog
e1 src = a;
e1 tgt = b;
```

Translation:

```prolog
src(e1, a).
tgt(e1, b).
```

---

### 5. Simple axioms

Allowed form:

```geolog
name : forall vars.
  premise1, premise2, ..., premisen |- conclusion;
```

Where:

- all variables are universally quantified,
- each premise is a simple atomic fact,
- the conclusion is exactly one simple atomic fact,
- there is no `exists` in the conclusion,
- there is no disjunction,
- there is no equality-generating step.

This is the part of Geolog that lines up best with plain Datalog.

---

## What Is Not Allowed

This subset should reject programs that use:

- existential conclusions,
- equality-generating rules,
- rules that require creation of fresh objects,
- disjunction,
- nested instances in places where they affect rule translation,
- or advanced features that do not flatten cleanly into ordinary Datalog facts and rules.

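A translator front end could enforce part of this list with a small check per axiom. The sketch below is only an illustration in Python and does not reflect Geolog's actual AST: the `Atom` and `Axiom` classes, the `exists_vars` field, the "list of alternatives" encoding of conclusions, and the use of `"="` as the predicate name for equality atoms are all assumptions made for this example. It covers existential conclusions, equality conclusions, and disjunction; nested instances and other advanced features would need checks of their own.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Atom:
    predicate: str          # relation name; "=" marks an equality atom (assumption)
    args: Tuple[str, ...]   # variable or constant names


@dataclass
class Axiom:
    name: str
    premises: List[Atom]
    # Disjunction of conjunctions: one inner list per alternative (assumption).
    conclusions: List[List[Atom]]
    exists_vars: List[str] = field(default_factory=list)


def unsupported_reasons(axiom: Axiom) -> List[str]:
    """Return the reasons an axiom falls outside the subset (empty if it fits)."""
    reasons = []
    if axiom.exists_vars:
        reasons.append("existential conclusion")
    if len(axiom.conclusions) != 1:
        reasons.append("disjunctive or empty conclusion")
    elif len(axiom.conclusions[0]) != 1:
        reasons.append("conclusion is not a single atomic fact")
    elif axiom.conclusions[0][0].predicate == "=":
        reasons.append("equality-generating conclusion")
    return reasons


# Hypothetical axioms used only to exercise the check.
base = Axiom("ax/base",
             [Atom("edge", ("x", "y"))],
             [[Atom("reachable", ("x", "y"))]])
fresh = Axiom("ax/fresh",
              [Atom("V", ("x",))],
              [[Atom("edge", ("x", "y"))]],
              exists_vars=["y"])

print(unsupported_reasons(base))   # []
print(unsupported_reasons(fresh))  # ['existential conclusion']
```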
---

## Translation Rules

This section gives the basic mechanical mapping.

### Sort declaration

Geolog:

```geolog
V : Sort;
```

Possible Datalog representation:

```prolog
V(x)
```

In practice, you usually emit only facts such as `V(a).`, not a schema declaration.

---

### Relation declaration

Geolog:

```geolog
reachable : [from: V, to: V] -> Prop;
```

Datalog predicate shape:

```prolog
reachable(X, Y)
```

---

### Function declaration

Geolog:

```geolog
src : E -> V;
```

Datalog predicate shape:

```prolog
src(E, V)
```

This is called **relationalizing** the function: we represent the function as a relation.

---

### Element declaration

Geolog:

```geolog
a : V;
```

Datalog:

```prolog
V(a).
```

---

### Relation fact

Geolog:

```geolog
[from: a, to: b] edge;
```

Datalog:

```prolog
edge(a, b).
```

---

### Function fact

Geolog:

```geolog
e1 src = a;
```

Datalog:

```prolog
src(e1, a).
```

---

### Axiom

Geolog:

```geolog
ax/base : forall x,y : V.
  [from: x, to: y] edge |- [from: x, to: y] reachable;
```

Datalog:

```prolog
reachable(X, Y) :- edge(X, Y).
```

Geolog:

```geolog
ax/trans : forall x,y,z : V.
  [from: x, to: y] reachable, [from: y, to: z] reachable
    |- [from: x, to: z] reachable;
```

Datalog:

```prolog
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```

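The axiom mapping can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the translator itself: an atom is represented as a hypothetical `(relation, {field: variable})` pair, `FIELD_ORDER` is a hand-written table standing in for the declared field order of each relation, and variables are upper-cased to follow the convention used in the Datalog examples here.

```python
# Declared field order per relation (hypothetical table; a real translator
# would build it from the relation declarations).
FIELD_ORDER = {
    "edge": ["from", "to"],
    "reachable": ["from", "to"],
}


def atom_to_datalog(relation, fields):
    """Flatten named fields into positional arguments."""
    args = [fields[name].upper() for name in FIELD_ORDER[relation]]
    return f"{relation}({', '.join(args)})"


def axiom_to_rule(premises, conclusion):
    """Turn `premises |- conclusion` into `head :- body.`"""
    head = atom_to_datalog(*conclusion)
    if not premises:
        return f"{head}."
    body = ", ".join(atom_to_datalog(*premise) for premise in premises)
    return f"{head} :- {body}."


# ax/trans from the example above.
rule = axiom_to_rule(
    premises=[
        ("reachable", {"from": "x", "to": "y"}),
        ("reachable", {"from": "y", "to": "z"}),
    ],
    conclusion=("reachable", {"from": "x", "to": "z"}),
)
print(rule)  # reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```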
---

## Full Example

### Geolog input

```geolog
theory Graph {
  V : Sort;
  E : Sort;

  src : E -> V;
  tgt : E -> V;

  edge : [from: V, to: V] -> Prop;
  reachable : [from: V, to: V] -> Prop;

  ax/base : forall x,y : V.
    [from: x, to: y] edge |- [from: x, to: y] reachable;

  ax/trans : forall x,y,z : V.
    [from: x, to: y] reachable, [from: y, to: z] reachable
      |- [from: x, to: z] reachable;
}

instance G : Graph = {
  a, b, c : V;
  e1, e2 : E;

  e1 src = a;
  e1 tgt = b;
  e2 src = b;
  e2 tgt = c;

  [from: a, to: b] edge;
  [from: b, to: c] edge;
}
```

### Datalog output

```prolog
V(a).
V(b).
V(c).
E(e1).
E(e2).

src(e1, a).
tgt(e1, b).
src(e2, b).
tgt(e2, c).

edge(a, b).
edge(b, c).

reachable(X, Y) :- edge(X, Y).
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```

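To see what this program derives, the rules can be run to a fixpoint. The following is a small naive bottom-up evaluator written in Python purely for illustration; it is not how any particular Datalog engine works internally, and the tuple encoding of facts and rules is an assumption of this sketch. Only the two `edge` facts are loaded, because the sort and function facts do not feed the two `reachable` rules.

```python
def is_variable(term):
    # Convention from the examples above: variables start with an uppercase letter.
    return term[:1].isupper()


def unify(atom, fact, binding):
    """Extend `binding` so that `atom` matches the ground `fact`, or return None."""
    (pred, args), (fact_pred, fact_args) = atom, fact
    if pred != fact_pred or len(args) != len(fact_args):
        return None
    binding = dict(binding)
    for term, value in zip(args, fact_args):
        if is_variable(term):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def evaluate(facts, rules):
    """Naive bottom-up evaluation: apply every rule until no new fact appears."""
    facts = set(facts)
    while True:
        derived = set()
        for (head_pred, head_args), body in rules:
            bindings = [{}]
            for atom in body:
                bindings = [extended
                            for b in bindings
                            for fact in facts
                            if (extended := unify(atom, fact, b)) is not None]
            for b in bindings:
                derived.add((head_pred, tuple(b.get(t, t) for t in head_args)))
        if derived <= facts:
            return facts
        facts |= derived


facts = {("edge", ("a", "b")), ("edge", ("b", "c"))}
rules = [
    (("reachable", ("X", "Y")), [("edge", ("X", "Y"))]),
    (("reachable", ("X", "Z")), [("reachable", ("X", "Y")), ("reachable", ("Y", "Z"))]),
]

reachable = sorted(args for pred, args in evaluate(facts, rules) if pred == "reachable")
print(reachable)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```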
---

## What This Translation Loses

Even in this safe subset, some Geolog meaning may be weakened.

### Function meaning is weaker

In Geolog, `src : E -> V` really means “each edge has one source”.

In plain Datalog, once we translate it to a relation, this bad situation is possible unless something else prevents it:

```prolog
src(e1, a).
src(e1, b).
```

So the translator is practical, but not fully faithful to all of Geolog’s function meaning.

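One way to recover part of the lost guarantee is to check the emitted facts outside Datalog. The sketch below is an assumption about how such a check might look, not something the translation described here provides: `function_violations` is a hypothetical helper that takes the `(input, output)` pairs of a relationalized function and reports every input that maps to more than one output.

```python
from collections import defaultdict


def function_violations(pairs):
    """Return {input: sorted outputs} for each input with more than one output."""
    outputs = defaultdict(set)
    for source, target in pairs:
        outputs[source].add(target)
    return {x: sorted(ys) for x, ys in outputs.items() if len(ys) > 1}


# The bad situation from the text, plus one well-behaved edge.
src_facts = [("e1", "a"), ("e1", "b"), ("e2", "b")]
print(function_violations(src_facts))  # {'e1': ['a', 'b']}
```

A translator could run a check like this on the instance facts before printing them and refuse to emit output when the result is non-empty.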
### Type checking may become weaker

If the translator does not emit unary sort predicates like `V(a).`, then the output Datalog program may lose explicit type information.

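When the sort facts are emitted, they can be used to re-check the other facts. The following Python sketch assumes a hypothetical `signature` table that records the expected sort of each argument position; neither that table nor the `ill_sorted_facts` helper is part of Geolog or Datalog, and both are named here only for illustration.

```python
def ill_sorted_facts(sorts, signature, facts):
    """Return the facts whose arguments are not declared in the expected sorts."""
    bad = []
    for pred, args in facts:
        expected = signature[pred]
        if len(args) != len(expected) or any(
            arg not in sorts[sort] for arg, sort in zip(args, expected)
        ):
            bad.append((pred, args))
    return bad


sorts = {"V": {"a", "b", "c"}, "E": {"e1", "e2"}}          # from V(a)., E(e1)., ...
signature = {"edge": ("V", "V"), "src": ("E", "V")}         # hypothetical table
facts = [
    ("edge", ("a", "b")),
    ("edge", ("a", "e1")),   # e1 is an E, not a V
    ("src", ("e1", "a")),
]
print(ill_sorted_facts(sorts, signature, facts))  # [('edge', ('a', 'e1'))]
```

Without the sort facts, the `sorts` table above has nothing to be built from, which is the loss this section describes.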
### Richer Geolog behavior is excluded

This subset does not include the most powerful parts of Geolog.

That is intentional.

---

## Best Implementation Strategy

If someone wanted to build this translator, the safest pipeline would be:

1. parse the Geolog source,
2. elaborate or type-check it,
3. reject anything outside the supported subset,
4. turn functions into relations,
5. flatten named relation fields into positional arguments,
6. print Datalog facts and rules.

That is much safer than trying to translate directly from raw surface syntax.

---

## A Good Simple Definition Of The Subset

Here is a useful one-sentence definition:

> The translatable subset is the part of Geolog where theories can be expressed as universally quantified Horn-style rules over relations and function-as-relation facts, with no existential conclusions and no equality-generating behavior.

---

## Final Summary

If you only want a practical translator, the right plan is:

- support sorts,
- support relations,
- support functions by turning them into relations,
- support simple axioms with one conclusion fact,
- reject existential and equality-heavy features.

That gives a subset that is useful, understandable, and realistic to translate automatically.