Compare commits
No commits in common. "900f5bab95a4313dfa644311670ed216a4f2b7d7" and "a8fc42943dc25b32380f3861ad7c07daa2829f0e" have entirely different histories.
900f5bab95
...
a8fc42943d
1
.gitignore
vendored
1
.gitignore
vendored
@ -42,4 +42,3 @@ target
|
||||
*.log
|
||||
tmp/
|
||||
temp/
|
||||
.claude/
|
||||
|
||||
@ -1,146 +0,0 @@
|
||||
# Geolog Intermediate Representation
|
||||
|
||||
This note summarizes the intermediate representation currently implemented in the Geolog repo on branch `ir-draft1`:
|
||||
https://git.sgai.uk/creators/geolog/-/tree/ir-draft1?ref_type=heads
|
||||
|
||||
All file paths below are repo-relative, not local workspace paths. The main sources are `geolog-lang/src/Geolog/IR.hs` and `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
## Where the IR sits
|
||||
|
||||
The current pipeline is:
|
||||
|
||||
1. Source text is lexed and parsed.
|
||||
2. The elaborator builds a typed core representation.
|
||||
3. `lowerTop` lowers that elaborated theory into `I.FlatTheory`.
|
||||
4. Tests pretty-print the lowered form as `table ...` and `law ...` declarations.
|
||||
|
||||
The lowering entry point is `geolog-lang/src/Geolog/Lower.hs`, via `lowerTop`, which calls `lower`, then `lowerEntry`, then `finishTheory`.
|
||||
|
||||
## Core IR data types
|
||||
|
||||
The IR is intentionally relational.
|
||||
|
||||
- `Path = [QName]`: a hierarchical name like `["M","mul"]`, pretty-printed as `M.mul`.
|
||||
- `ColType` is the type of a column. `EntityType Path` means the column refers to rows from another entity/table, `PrimType PrimInt | PrimString` covers builtin scalars, and `Tuple [(QName, ColType)]` supports structured values.
|
||||
- `Table` contains `columns :: [ColType]` and `primaryKey :: Maybe [Int]`.
|
||||
- `Term` supports literals, variables, projections like `x.field`, and record construction.
|
||||
- `Atom` contains `table :: Path`, `rowId :: Maybe Term`, and `values :: Map Int Term`.
|
||||
- `Prop` and `Law` provide a small logic language for atoms, equality, conjunction, disjunction, and universally quantified Horn-style laws.
|
||||
- `FlatTheory` is the final lowered package with `tables :: Map Path Table` and `laws :: Map Path Law`.
|
||||
|
||||
See `geolog-lang/src/Geolog/IR.hs` for the concrete definitions.
|
||||
|
||||
## What lowering does
|
||||
|
||||
Lowering walks an elaborated theory and converts type-theoretic structure into tables plus integrity laws.
|
||||
|
||||
### 1. Records become path prefixes
|
||||
|
||||
Nested theory fields extend the current path root. If a record field `mul` is inside `M`, the lowered path becomes `M.mul`.
|
||||
|
||||
This happens in `lowerTheoryFields` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
### 2. Relations become `RelTable`
|
||||
|
||||
When lowering sees `C.VU C.QueryU`, it creates a relational table with the currently accumulated `PiColumn`s and no primary key:
|
||||
|
||||
- `kind = RelTable`
|
||||
- `columns = piColumnsOf ?columns`
|
||||
- `total = Nothing`
|
||||
|
||||
This is the representation for proposition-like relations.
|
||||
|
||||
See `lowerTheoryTy` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
### 3. Functions/entities become `FunTable`
|
||||
|
||||
When lowering encounters an abstract decoded type that does not reduce to a more concrete theory shape, it emits a `FunTable`.
|
||||
|
||||
At the end of lowering, function tables get a primary key consisting of all columns except the final result column. That matches the examples:
|
||||
|
||||
- `M.mul = [M.car, M.car, M.car, primary key(0,1)]`
|
||||
- `D.mul = [..., primary key(0,1,2,3)]`
|
||||
|
||||
This logic is in `finishTheory` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
### 4. Dependent structure becomes extra columns
|
||||
|
||||
The interesting part is that lowering is not just flattening syntax. It also lambda-lifts shared context into explicit columns.
|
||||
|
||||
`liftOnShared` copies shared outer variables into the current row shape so that dependencies become relationally explicit. This is why tables like `D.mul` contain both `M.car` arguments and displayed-car columns.
|
||||
|
||||
See `liftOnShared` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
### 5. References generate join information
|
||||
|
||||
When a local path is used as:
|
||||
|
||||
- an attribute, lowering creates a `PiColumn`
|
||||
- a value, lowering creates a `JoinColumn`
|
||||
|
||||
In both cases it also accumulates foreign-key-style atoms in `?fkeys`. Those atoms are later assembled into explicit laws.
|
||||
|
||||
See `lowerAttrNeutral` and `lowerArgNeutral` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
## Why there are `foreignKeys` and `total` laws
|
||||
|
||||
Each lowered table may carry generated laws.
|
||||
|
||||
### `foreignKeys`
|
||||
|
||||
`makeFkeys` says: if a row exists in table `p`, then the rows it depends on must also exist.
|
||||
|
||||
Operationally, this is how dependent references are reified as relational constraints. A row in `D.mul` implies the existence of corresponding rows in `D.car` and `M.mul`.
|
||||
|
||||
See `makeFkeys` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
### `total`
|
||||
|
||||
`makeTotal` is only generated for function tables. It says: if the prerequisite input rows exist, then a row in the function table must exist.
|
||||
|
||||
So `M.mul.total` expresses totality of multiplication, while `D.mul.total` expresses totality relative to the displayed structure.
|
||||
|
||||
See `makeTotal` in `geolog-lang/src/Geolog/Lower.hs`.
|
||||
|
||||
## Reading the printed IR
|
||||
|
||||
The pretty-printer renders `FlatTheory` as a sequence of `table` and `law` declarations. For example, the golden output for `magma` contains:
|
||||
|
||||
```text
|
||||
table M.mul = [M.car, M.car, M.car, primary key(0,1)]
|
||||
law M.mul.total = forall (a : M.car) (b : M.car). T |- M.mul(a, b)
|
||||
```
|
||||
|
||||
and:
|
||||
|
||||
```text
|
||||
table D.mul = [M.car, M.car, D.car, D.car, D.car, primary key(0,1,2,3)]
|
||||
```
|
||||
|
||||
This means:
|
||||
|
||||
- the first columns are inputs or lifted context
|
||||
- the last column is the produced value/row
|
||||
- the primary key marks the determining inputs
|
||||
- `foreignKeys` laws enforce referenced rows
|
||||
- `total` laws enforce existence of output rows for function tables
|
||||
|
||||
Concrete examples live in:
|
||||
|
||||
- `geolog-lang/test/magma.output`
|
||||
- `geolog-lang/test/paths.output`
|
||||
|
||||
## Practical takeaway
|
||||
|
||||
The implemented IR is best understood as a relationalized, flattened theory:
|
||||
|
||||
- names become paths
|
||||
- dependencies become explicit columns
|
||||
- references become foreign-key-style laws
|
||||
- total functions become tables plus totality laws
|
||||
|
||||
So the IR is not just an AST for queries. It is the bridge from elaborated dependent syntax to a database-shaped semantics that downstream query/evaluation machinery can consume.
|
||||
|
||||
## Note
|
||||
|
||||
`docs/lecture3.tex` has a `Query IR` section header but no content yet, so the code above is currently the most concrete description of the actual intermediate representation.
|
||||
Loading…
x
Reference in New Issue
Block a user