Add a note file about Geolog to Datalog translation
parent 50ae1473e4
commit 63e28fc7a9
24
.editorconfig
Normal file
@ -0,0 +1,24 @@
# https://EditorConfig.org

root = true

[*]
charset = utf-8
end_of_line = lf
indent_style = space
indent_size = 4
insert_final_newline = true
trim_trailing_whitespace = true

[*.rs]
max_line_length = 100

[*.md]
max_line_length = 150
trim_trailing_whitespace = false

[*.sh]
indent_size = 2

[*.{yml,yaml,json}]
indent_size = 2
@ -2,6 +2,13 @@

This document explains the main ideas in Geolog without assuming math background.

## Related Notes

If you want to compare Geolog with Datalog, or think about automatic translation, see:

- `notes/007-geolog-and-datalog.md` — a simple explanation of what maps well and what does not.
- `notes/008-geolog-to-datalog-subset.md` — a practical subset of Geolog that could be translated automatically into plain Datalog.

---

## Theories and Instances
454
notes/008-geolog-to-datalog-subset.md
Normal file
@ -0,0 +1,454 @@
# Geolog To Datalog Subset

*A practical subset of Geolog that can be translated automatically into plain Datalog.*

---

## Goal

The goal of this note is to describe a part of Geolog that can be translated into plain Datalog without too much trouble.

This note is about a **safe, useful subset**.

It is **not** about translating all of Geolog.

That matters because full Geolog has features that go beyond what plain Datalog can express.

---

## Short Answer

Automatic translation is quite reasonable if we only allow:

- sorts,
- relations,
- functions,
- instances with facts,
- and simple Horn-style axioms.

Automatic translation becomes much harder if we also allow:

- existential conclusions,
- equality-generating behavior,
- or any rule that needs the system to create fresh objects.

---

## What Subset We Want

We want a subset where Geolog rules behave like ordinary Datalog rules.

That means we allow the parts of Geolog that fit this pattern:

- some facts are already true,
- a rule sees those facts,
- the rule derives one new fact,
- repeat until nothing new can be derived.

---

## Allowed Features

### 1. Sort declarations

Allowed:

```geolog
V : Sort;
E : Sort;
```

These are Geolog types.

In Datalog, they can be handled in either of two ways:

- keep them only as compiler information,
- or emit unary predicates such as `V(x)` and `E(x)`.

---

### 2. Relation declarations

Allowed:

```geolog
edge : [from: V, to: V] -> Prop;
reachable : [from: V, to: V] -> Prop;
```

Translation idea:

```prolog
edge(X, Y)
reachable(X, Y)
```

The field names such as `from` and `to` are useful in Geolog, but plain Datalog usually uses only argument position.

---

### 3. Function declarations

Allowed:

```geolog
src : E -> V;
tgt : E -> V;
```

Translation idea:

```prolog
src(E, V)
tgt(E, V)
```

This means we turn Geolog functions into ordinary Datalog relations.

That is a practical encoding, but it is important to remember that a Datalog relation does not automatically behave like a true function.

---

### 4. Instance facts

Allowed element declarations:

```geolog
a, b, c : V;
e1, e2 : E;
```

Possible Datalog output:

```prolog
V(a).
V(b).
V(c).
E(e1).
E(e2).
```

Allowed relation facts:

```geolog
[from: a, to: b] edge;
[from: b, to: c] edge;
```

Translation:

```prolog
edge(a, b).
edge(b, c).
```

Allowed function facts:

```geolog
e1 src = a;
e1 tgt = b;
```

Translation:

```prolog
src(e1, a).
tgt(e1, b).
```
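
As a rough illustration of the element and function cases above, the sketch below turns already-parsed instance items into Datalog fact strings. The tuple shapes (`"elements"`, `"function"`) are hypothetical and chosen only for this example; they are not Geolog's real AST. Relation facts are left out here because they also need the declared field order.

```python
def instance_facts(items):
    """Turn already-parsed instance items into Datalog fact strings."""
    out = []
    for item in items:
        kind = item[0]
        if kind == "elements":        # e.g.  a, b, c : V;
            _, names, sort = item
            out.extend(f"{sort}({name})." for name in names)
        elif kind == "function":      # e.g.  e1 src = a;
            _, arg, func, value = item
            out.append(f"{func}({arg}, {value}).")
        else:
            raise ValueError(f"unsupported instance item: {kind}")
    return out

items = [
    ("elements", ["a", "b", "c"], "V"),
    ("elements", ["e1", "e2"], "E"),
    ("function", "e1", "src", "a"),
    ("function", "e1", "tgt", "b"),
]
for line in instance_facts(items):
    print(line)
```

A declaration with several names, such as `a, b, c : V;`, expands to one unary fact per name.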

---

### 5. Simple axioms

Allowed form:

```geolog
name : forall vars.
    premise1, premise2, ..., premisen |- conclusion;
```

Where:

- all variables are universal variables,
- each premise is a simple atomic fact,
- the conclusion is exactly one simple atomic fact,
- there is no `exists` in the conclusion,
- there is no disjunction,
- there is no equality-generating step.

This is the part of Geolog that lines up best with plain Datalog.

---

## What Is Not Allowed

This subset should reject programs that use:

- existential conclusions,
- equality-generating rules,
- rules that require creation of fresh objects,
- disjunction,
- nested instances in places where they affect rule translation,
- or advanced features that do not flatten cleanly into ordinary Datalog facts and rules.
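
A rejection pass for axioms could look roughly like the sketch below. It assumes a hypothetical, already-elaborated axiom shape: a dict with `premises` and `conclusion`, where each formula is a tuple tagged `"atom"`, `"exists"`, `"or"`, or `"eq"`. These tags are invented for illustration. The sketch covers only the axiom-level items in the list above, not nested instances.

```python
def rejection_reasons(axiom):
    """Return the reasons why `axiom` falls outside the subset (empty if fine)."""
    reasons = []
    for premise in axiom["premises"]:
        if premise[0] != "atom":
            reasons.append(f"premise is not an atomic fact: {premise[0]}")
    kind = axiom["conclusion"][0]
    if kind == "exists":
        reasons.append("existential conclusion")
    elif kind == "or":
        reasons.append("disjunction in conclusion")
    elif kind == "eq":
        reasons.append("equality-generating conclusion")
    elif kind != "atom":
        reasons.append(f"conclusion is not a single atomic fact: {kind}")
    return reasons

# Transitivity: plain Horn rule, accepted.
trans = {
    "premises": [("atom", "reachable", ("x", "y")),
                 ("atom", "reachable", ("y", "z"))],
    "conclusion": ("atom", "reachable", ("x", "z")),
}
# "Every self-reachable vertex is the source of some edge": needs a fresh e.
needs_fresh = {
    "premises": [("atom", "reachable", ("x", "x"))],
    "conclusion": ("exists", ["e"], ("atom", "src", ("e", "x"))),
}
print(rejection_reasons(trans))        # []
print(rejection_reasons(needs_fresh))  # ['existential conclusion']
```

Returning a list of reasons instead of a boolean lets the translator report everything that is unsupported in one pass.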

---

## Translation Rules

This section gives the basic mechanical mapping.

### Sort declaration

Geolog:

```geolog
V : Sort;
```

Possible Datalog representation:

```prolog
V(x)
```

In practice, you usually emit only the facts like `V(a).`, not a schema declaration.

Note that many Datalog dialects follow the Prolog convention that an identifier starting with an uppercase letter is a variable. A translator targeting such a dialect would need to rename sort predicates (for example, `V` to `v`).

---

### Relation declaration

Geolog:

```geolog
reachable : [from: V, to: V] -> Prop;
```

Datalog predicate shape:

```prolog
reachable(X, Y)
```

---

### Function declaration

Geolog:

```geolog
src : E -> V;
```

Datalog predicate shape:

```prolog
src(E, V)
```

This is called **relationalizing** the function: we represent the function as a relation.

---

### Element declaration

Geolog:

```geolog
a : V;
```

Datalog:

```prolog
V(a).
```

---

### Relation fact

Geolog:

```geolog
[from: a, to: b] edge;
```

Datalog:

```prolog
edge(a, b).
```
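
Going from named fields to argument positions needs one fixed field order per relation. A minimal sketch, assuming the order is taken from the relation's declaration; the `FIELD_ORDER` table and the `flatten` helper are hypothetical names used only for this example.

```python
# Field order as declared, e.g.  edge : [from: V, to: V] -> Prop;
FIELD_ORDER = {
    "edge": ["from", "to"],
    "reachable": ["from", "to"],
}

def flatten(relation, fields):
    """Print a named-field relation fact as a positional Datalog fact."""
    declared = FIELD_ORDER[relation]
    if set(fields) != set(declared):
        raise ValueError(
            f"{relation} expects fields {declared}, got {sorted(fields)}")
    args = ", ".join(fields[name] for name in declared)
    return f"{relation}({args})."

print(flatten("edge", {"from": "a", "to": "b"}))  # edge(a, b).
print(flatten("edge", {"to": "b", "from": "a"}))  # edge(a, b).
```

Because the order comes from the declaration and not from the fact, writing the fields in a different order in the source does not change the output.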

---

### Function fact

Geolog:

```geolog
e1 src = a;
```

Datalog:

```prolog
src(e1, a).
```

---

### Axiom

Geolog:

```geolog
ax/base : forall x,y : V.
    [from: x, to: y] edge |- [from: x, to: y] reachable;
```

Datalog:

```prolog
reachable(X, Y) :- edge(X, Y).
```

Geolog:

```geolog
ax/trans : forall x,y,z : V.
    [from: x, to: y] reachable, [from: y, to: z] reachable
    |- [from: x, to: z] reachable;
```

Datalog:

```prolog
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```
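
Printing a rule is mostly string assembly once the atoms are positional. The sketch below assumes atoms are `(predicate, args)` pairs and that the set of quantified variables is known. It upper-cases variables to match the examples above. The function names are hypothetical.

```python
def term_text(term, variables):
    # Quantified variables are upper-cased (x -> X), as in the examples
    # in this note; constants are printed unchanged.
    return term.upper() if term in variables else term

def atom_text(atom, variables):
    pred, args = atom
    return f"{pred}({', '.join(term_text(a, variables) for a in args)})"

def rule_text(variables, premises, conclusion):
    """Print `conclusion :- premises.` in Prolog-style Datalog syntax."""
    head = atom_text(conclusion, variables)
    if not premises:
        return f"{head}."
    body = ", ".join(atom_text(p, variables) for p in premises)
    return f"{head} :- {body}."

print(rule_text(
    {"x", "y", "z"},
    [("reachable", ("x", "y")), ("reachable", ("y", "z"))],
    ("reachable", ("x", "z")),
))
# reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```

Upper-casing is the simplest renaming, but it can make two variables collide if the source uses both `x` and `X`. A real translator would probably generate fresh names.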

---

## Full Example

### Geolog input

```geolog
theory Graph {
    V : Sort;
    E : Sort;

    src : E -> V;
    tgt : E -> V;

    edge : [from: V, to: V] -> Prop;
    reachable : [from: V, to: V] -> Prop;

    ax/base : forall x,y : V.
        [from: x, to: y] edge |- [from: x, to: y] reachable;

    ax/trans : forall x,y,z : V.
        [from: x, to: y] reachable, [from: y, to: z] reachable
        |- [from: x, to: z] reachable;
}

instance G : Graph = {
    a, b, c : V;
    e1, e2 : E;

    e1 src = a;
    e1 tgt = b;
    e2 src = b;
    e2 tgt = c;

    [from: a, to: b] edge;
    [from: b, to: c] edge;
}
```

### Datalog output

```prolog
V(a).
V(b).
V(c).
E(e1).
E(e2).

src(e1, a).
tgt(e1, b).
src(e2, b).
tgt(e2, c).

edge(a, b).
edge(b, c).

reachable(X, Y) :- edge(X, Y).
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
```

---

## What This Translation Loses

Even in this safe subset, some Geolog meaning may be weakened.

### Function meaning is weaker

In Geolog, `src : E -> V` really means “each edge has one source”.

In plain Datalog, once we translate it to a relation, this bad situation is possible unless something else prevents it:

```prolog
src(e1, a).
src(e1, b).
```

So the translator is practical, but not fully faithful to all of Geolog’s function meaning.
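
One way to get part of that meaning back is to check the relationalized facts outside of Datalog, before or after evaluation. The sketch below is an assumption about how such a check could look, not something Geolog provides. It reads "each edge has one source" as two conditions: no input has two outputs, and no input is missing an output.

```python
from collections import defaultdict

def function_violations(pairs):
    """Return {input: sorted outputs} for inputs with more than one output."""
    outputs = defaultdict(set)
    for arg, value in pairs:
        outputs[arg].add(value)
    return {arg: sorted(vals) for arg, vals in outputs.items() if len(vals) > 1}

def missing_inputs(domain, pairs):
    """Return the domain elements that have no output at all."""
    defined = {arg for arg, _ in pairs}
    return sorted(set(domain) - defined)

# src facts as (edge, vertex) pairs; e1 has two sources, e3 has none.
src = [("e1", "a"), ("e1", "b"), ("e2", "b")]
print(function_violations(src))                   # {'e1': ['a', 'b']}
print(missing_inputs(["e1", "e2", "e3"], src))    # ['e3']
```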

### Type checking may become weaker

If the translator does not emit unary sort predicates like `V(a).`, then the output Datalog program may lose explicit type information.
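
If the sort predicates are emitted, the translator could also keep the types visible in rules by adding one sort atom per quantified variable to each rule body. This is a sketch of that idea under the assumption that heads and body atoms are already printed as strings; `add_sort_guards` is a hypothetical helper, not part of any existing tool.

```python
def add_sort_guards(head, body, var_sorts):
    """Prefix the rule body with one unary sort atom per typed variable."""
    guards = [f"{sort}({var})" for var, sort in var_sorts.items()]
    return f"{head} :- {', '.join(guards + body)}."

print(add_sort_guards(
    "reachable(X, Y)",
    ["edge(X, Y)"],
    {"X": "V", "Y": "V"},
))
# reachable(X, Y) :- V(X), V(Y), edge(X, Y).
```

With guards in place, a rule only fires for values that were declared as elements of the right sort, at the cost of longer rule bodies.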

### Richer Geolog behavior is excluded

This subset does not include the most powerful parts of Geolog.

That is intentional.

---

## Best Implementation Strategy

If someone wanted to build this translator, the safest pipeline would be:

1. parse the Geolog source,
2. elaborate or type-check it,
3. reject anything outside the supported subset,
4. turn functions into relations,
5. flatten named relation fields into positional arguments,
6. print Datalog facts and rules.

That is much safer than trying to translate directly from raw surface syntax.

---

## A Good Simple Definition Of The Subset

Here is a useful one-sentence definition:

> The translatable subset is the part of Geolog where theories can be expressed as universally quantified Horn-style rules over relations and function-as-relation facts, with no existential conclusions and no equality-generating behavior.

---

## Final Summary

If you only want a practical translator, the right plan is:

- support sorts,
- support relations,
- support functions by turning them into relations,
- support simple axioms with one conclusion fact,
- reject existential and equality-heavy features.

That gives a subset that is useful, understandable, and realistic to translate automatically.