Add a note file about Geolog to Datalog translation
This commit is contained in:
parent
50ae1473e4
commit
63e28fc7a9
24
.editorconfig
Normal file
24
.editorconfig
Normal file
@ -0,0 +1,24 @@
|
||||
# https://EditorConfig.org
|
||||
|
||||
root = true
|
||||
|
||||
[*]
|
||||
charset = utf-8
|
||||
end_of_line = lf
|
||||
indent_style = space
|
||||
indent_size = 4
|
||||
insert_final_newline = true
|
||||
trim_trailing_whitespace = true
|
||||
|
||||
[*.rs]
|
||||
max_line_length = 100
|
||||
|
||||
[*.md]
|
||||
max_line_length = 150
|
||||
trim_trailing_whitespace = false
|
||||
|
||||
[*.sh]
|
||||
indent_size = 2
|
||||
|
||||
[*.{yml,yaml,json}]
|
||||
indent_size = 2
|
||||
@ -2,6 +2,13 @@
|
||||
|
||||
This document explains the main ideas in Geolog without assuming math background.
|
||||
|
||||
## Related Notes
|
||||
|
||||
If you want to compare Geolog with Datalog, or think about automatic translation, see:
|
||||
|
||||
- `notes/007-geolog-and-datalog.md` — a simple explanation of what maps well and what does not.
|
||||
- `notes/008-geolog-to-datalog-subset.md` — a practical subset of Geolog that could be translated automatically into plain Datalog.
|
||||
|
||||
---
|
||||
|
||||
## Theories and Instances
|
||||
|
||||
454
notes/008-geolog-to-datalog-subset.md
Normal file
454
notes/008-geolog-to-datalog-subset.md
Normal file
@ -0,0 +1,454 @@
|
||||
# Geolog To Datalog Subset
|
||||
|
||||
*A practical subset of Geolog that can be translated automatically into plain Datalog.*
|
||||
|
||||
---
|
||||
|
||||
## Goal
|
||||
|
||||
The goal of this note is to describe a part of Geolog that can be translated into plain Datalog without too much trouble.
|
||||
|
||||
This note is about a **safe, useful subset**.
|
||||
|
||||
It is **not** about translating all of Geolog.
|
||||
|
||||
That matters because full Geolog has features that go beyond what plain Datalog can express.
|
||||
|
||||
---
|
||||
|
||||
## Short Answer
|
||||
|
||||
Automatic translation is quite reasonable if we only allow:
|
||||
|
||||
- sorts,
|
||||
- relations,
|
||||
- functions,
|
||||
- instances with facts,
|
||||
- and simple Horn-style axioms.
|
||||
|
||||
Automatic translation becomes much harder if we also allow:
|
||||
|
||||
- existential conclusions,
|
||||
- equality-generating behavior,
|
||||
- or any rule that needs the system to create fresh objects.
|
||||
|
||||
---
|
||||
|
||||
## What Subset We Want
|
||||
|
||||
We want a subset where Geolog rules behave like ordinary Datalog rules.
|
||||
|
||||
That means we allow the parts of Geolog that fit this pattern:
|
||||
|
||||
- some facts are already true,
|
||||
- a rule sees those facts,
|
||||
- the rule derives one new fact,
|
||||
- repeat until nothing new can be derived.
|
||||
|
||||
---
|
||||
|
||||
## Allowed Features
|
||||
|
||||
### 1. Sort declarations
|
||||
|
||||
Allowed:
|
||||
|
||||
```geolog
|
||||
V : Sort;
|
||||
E : Sort;
|
||||
```
|
||||
|
||||
These are Geolog types.
|
||||
|
||||
In Datalog, they can be handled in either of two ways:
|
||||
|
||||
- keep them only as compiler information,
|
||||
- or emit unary predicates such as `V(x)` and `E(x)`.
|
||||
|
||||
---
|
||||
|
||||
### 2. Relation declarations
|
||||
|
||||
Allowed:
|
||||
|
||||
```geolog
|
||||
edge : [from: V, to: V] -> Prop;
|
||||
reachable : [from: V, to: V] -> Prop;
|
||||
```
|
||||
|
||||
Translation idea:
|
||||
|
||||
```prolog
|
||||
edge(X, Y)
|
||||
reachable(X, Y)
|
||||
```
|
||||
|
||||
The field names such as `from` and `to` are useful in Geolog, but plain Datalog usually uses only argument position.
|
||||
|
||||
---
|
||||
|
||||
### 3. Function declarations
|
||||
|
||||
Allowed:
|
||||
|
||||
```geolog
|
||||
src : E -> V;
|
||||
tgt : E -> V;
|
||||
```
|
||||
|
||||
Translation idea:
|
||||
|
||||
```prolog
|
||||
src(E, V)
|
||||
tgt(E, V)
|
||||
```
|
||||
|
||||
This means we turn Geolog functions into ordinary Datalog relations.
|
||||
|
||||
That is a practical encoding, but it is important to remember that a Datalog relation does not automatically behave like a true function.
|
||||
|
||||
---
|
||||
|
||||
### 4. Instance facts
|
||||
|
||||
Allowed element declarations:
|
||||
|
||||
```geolog
|
||||
a, b, c : V;
|
||||
e1, e2 : E;
|
||||
```
|
||||
|
||||
Possible Datalog output:
|
||||
|
||||
```prolog
|
||||
V(a).
|
||||
V(b).
|
||||
V(c).
|
||||
E(e1).
|
||||
E(e2).
|
||||
```
|
||||
|
||||
Allowed relation facts:
|
||||
|
||||
```geolog
|
||||
[from: a, to: b] edge;
|
||||
[from: b, to: c] edge;
|
||||
```
|
||||
|
||||
Translation:
|
||||
|
||||
```prolog
|
||||
edge(a, b).
|
||||
edge(b, c).
|
||||
```
|
||||
|
||||
Allowed function facts:
|
||||
|
||||
```geolog
|
||||
e1 src = a;
|
||||
e1 tgt = b;
|
||||
```
|
||||
|
||||
Translation:
|
||||
|
||||
```prolog
|
||||
src(e1, a).
|
||||
tgt(e1, b).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 5. Simple axioms
|
||||
|
||||
Allowed form:
|
||||
|
||||
```geolog
|
||||
name : forall vars.
|
||||
premise1, premise2, ..., premisen |- conclusion;
|
||||
```
|
||||
|
||||
Where:
|
||||
|
||||
- all variables are universal variables,
|
||||
- each premise is a simple atomic fact,
|
||||
- the conclusion is exactly one simple atomic fact,
|
||||
- there is no `exists` in the conclusion,
|
||||
- there is no disjunction,
|
||||
- there is no equality-generating step.
|
||||
|
||||
This is the part of Geolog that lines up best with plain Datalog.
|
||||
|
||||
---
|
||||
|
||||
## What Is Not Allowed
|
||||
|
||||
This subset should reject programs that use:
|
||||
|
||||
- existential conclusions,
|
||||
- equality-generating rules,
|
||||
- rules that require creation of fresh objects,
|
||||
- disjunction,
|
||||
- nested instances in places where they affect rule translation,
|
||||
- or advanced features that do not flatten cleanly into ordinary Datalog facts and rules.
|
||||
|
||||
---
|
||||
|
||||
## Translation Rules
|
||||
|
||||
This section gives the basic mechanical mapping.
|
||||
|
||||
### Sort declaration
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
V : Sort;
|
||||
```
|
||||
|
||||
Possible Datalog representation:
|
||||
|
||||
```prolog
|
||||
V(x)
|
||||
```
|
||||
|
||||
In practice, you usually emit only the facts like `V(a).`, not a schema declaration.
|
||||
|
||||
---
|
||||
|
||||
### Relation declaration
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
reachable : [from: V, to: V] -> Prop;
|
||||
```
|
||||
|
||||
Datalog predicate shape:
|
||||
|
||||
```prolog
|
||||
reachable(X, Y)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Function declaration
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
src : E -> V;
|
||||
```
|
||||
|
||||
Datalog predicate shape:
|
||||
|
||||
```prolog
|
||||
src(E, V)
|
||||
```
|
||||
|
||||
This is called **relationalizing** the function: we represent the function as a relation.
|
||||
|
||||
---
|
||||
|
||||
### Element declaration
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
a : V;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
|
||||
```prolog
|
||||
V(a).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Relation fact
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
[from: a, to: b] edge;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
|
||||
```prolog
|
||||
edge(a, b).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Function fact
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
e1 src = a;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
|
||||
```prolog
|
||||
src(e1, a).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Axiom
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
ax/base : forall x,y : V.
|
||||
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
|
||||
```prolog
|
||||
reachable(X, Y) :- edge(X, Y).
|
||||
```
|
||||
|
||||
Geolog:
|
||||
|
||||
```geolog
|
||||
ax/trans : forall x,y,z : V.
|
||||
[from: x, to: y] reachable, [from: y, to: z] reachable
|
||||
|- [from: x, to: z] reachable;
|
||||
```
|
||||
|
||||
Datalog:
|
||||
|
||||
```prolog
|
||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Full Example
|
||||
|
||||
### Geolog input
|
||||
|
||||
```geolog
|
||||
theory Graph {
|
||||
V : Sort;
|
||||
E : Sort;
|
||||
|
||||
src : E -> V;
|
||||
tgt : E -> V;
|
||||
|
||||
edge : [from: V, to: V] -> Prop;
|
||||
reachable : [from: V, to: V] -> Prop;
|
||||
|
||||
ax/base : forall x,y : V.
|
||||
[from: x, to: y] edge |- [from: x, to: y] reachable;
|
||||
|
||||
ax/trans : forall x,y,z : V.
|
||||
[from: x, to: y] reachable, [from: y, to: z] reachable
|
||||
|- [from: x, to: z] reachable;
|
||||
}
|
||||
|
||||
instance G : Graph = {
|
||||
a, b, c : V;
|
||||
e1, e2 : E;
|
||||
|
||||
e1 src = a;
|
||||
e1 tgt = b;
|
||||
e2 src = b;
|
||||
e2 tgt = c;
|
||||
|
||||
[from: a, to: b] edge;
|
||||
[from: b, to: c] edge;
|
||||
}
|
||||
```
|
||||
|
||||
### Datalog output
|
||||
|
||||
```prolog
|
||||
V(a).
|
||||
V(b).
|
||||
V(c).
|
||||
E(e1).
|
||||
E(e2).
|
||||
|
||||
src(e1, a).
|
||||
tgt(e1, b).
|
||||
src(e2, b).
|
||||
tgt(e2, c).
|
||||
|
||||
edge(a, b).
|
||||
edge(b, c).
|
||||
|
||||
reachable(X, Y) :- edge(X, Y).
|
||||
reachable(X, Z) :- reachable(X, Y), reachable(Y, Z).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## What This Translation Loses
|
||||
|
||||
Even in this safe subset, some Geolog meaning may be weakened.
|
||||
|
||||
### Function meaning is weaker
|
||||
|
||||
In Geolog, `src : E -> V` really means “each edge has one source”.
|
||||
|
||||
In plain Datalog, once we translate it to a relation, this bad situation is possible unless something else prevents it:
|
||||
|
||||
```prolog
|
||||
src(e1, a).
|
||||
src(e1, b).
|
||||
```
|
||||
|
||||
So the translator is practical, but not fully faithful to all of Geolog’s function meaning.
|
||||
|
||||
### Type checking may become weaker
|
||||
|
||||
If the translator does not emit unary sort predicates like `V(a).`, then the output Datalog program may lose explicit type information.
|
||||
|
||||
### Richer Geolog behavior is excluded
|
||||
|
||||
This subset does not include the most powerful parts of Geolog.
|
||||
|
||||
That is intentional.
|
||||
|
||||
---
|
||||
|
||||
## Best Implementation Strategy
|
||||
|
||||
If someone wanted to build this translator, the safest pipeline would be:
|
||||
|
||||
1. parse the Geolog source,
|
||||
2. elaborate or type-check it,
|
||||
3. reject anything outside the supported subset,
|
||||
4. turn functions into relations,
|
||||
5. flatten named relation fields into positional arguments,
|
||||
6. print Datalog facts and rules.
|
||||
|
||||
That is much safer than trying to translate directly from raw surface syntax.
|
||||
|
||||
---
|
||||
|
||||
## A Good Simple Definition Of The Subset
|
||||
|
||||
Here is a useful one-sentence definition:
|
||||
|
||||
> The translatable subset is the part of Geolog where theories can be expressed as universally quantified Horn-style rules over relations and function-as-relation facts, with no existential conclusions and no equality-generating behavior.
|
||||
|
||||
---
|
||||
|
||||
## Final Summary
|
||||
|
||||
If you only want a practical translator, the right plan is:
|
||||
|
||||
- support sorts,
|
||||
- support relations,
|
||||
- support functions by turning them into relations,
|
||||
- support simple axioms with one conclusion fact,
|
||||
- reject existential and equality-heavy features.
|
||||
|
||||
That gives a subset that is useful, understandable, and realistic to translate automatically.
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user