# Geolog Intermediate Representation This note summarizes the intermediate representation currently implemented in the Geolog repo on branch `ir-draft1`: https://git.sgai.uk/creators/geolog/-/tree/ir-draft1?ref_type=heads All file paths below are repo-relative, not local workspace paths. The main sources are `geolog-lang/src/Geolog/IR.hs` and `geolog-lang/src/Geolog/Lower.hs`. ## Where the IR sits The current pipeline is: 1. Source text is lexed and parsed. 2. The elaborator builds a typed core representation. 3. `lowerTop` lowers that elaborated theory into `I.FlatTheory`. 4. Tests pretty-print the lowered form as `table ...` and `law ...` declarations. The lowering entry point is `geolog-lang/src/Geolog/Lower.hs`, via `lowerTop`, which calls `lower`, then `lowerEntry`, then `finishTheory`. ## Glossary - `IR`: Intermediate representation. A simpler form used between elaboration and later evaluation. - `lowering`: The step that converts elaborated Geolog code into the IR. - `Path`: A qualified name like `M.mul`. - `ColType`: The type of one table column. - `Table`: A lowered relational object with columns and maybe a primary key. - `RelTable`: A table used for a relation or predicate. - `FunTable`: A table used for something function-like, where input columns determine an output. - `Term`: A small IR expression, such as a variable, literal, or projection. - `Atom`: One table occurrence with concrete arguments. - `Prop`: A logical formula built from atoms and connectives like equality and conjunction. - `Law`: A quantified rule over IR propositions. - `FlatTheory`: The final lowered result: a set of tables plus laws. - `foreignKeys`: Generated laws saying a row depends on other rows existing. - `total`: Generated laws saying a function row must exist when its inputs exist. - `PiColumn`: A column introduced as part of a table's explicit shape. - `JoinColumn`: A column introduced while following a reference or join. ## Core IR data types The IR is intentionally relational. It turns theory structure into tables and rules. - `Path = [QName]`: a hierarchical name like `["M","mul"]`, pretty-printed as `M.mul`. - `ColType` is the type of a column. `EntityType Path` means the column refers to rows from another entity/table, `PrimType PrimInt | PrimString` covers builtin scalars, and `Tuple [(QName, ColType)]` supports structured values. - `Table` contains `columns :: [ColType]` and `primaryKey :: Maybe [Int]`. - `Term` supports literals, variables, projections like `x.field`, and record construction. - `Atom` contains `table :: Path`, `rowId :: Maybe Term`, and `values :: Map Int Term`. - `Prop` and `Law` provide a small logic language for atoms, equality, conjunction, disjunction, and universally quantified Horn-style laws. - `FlatTheory` is the final lowered package with `tables :: Map Path Table` and `laws :: Map Path Law`. See `geolog-lang/src/Geolog/IR.hs` for the concrete definitions. ## What lowering does Lowering walks the elaborated theory and turns it into tables plus rules. ### 1. Records become path prefixes Nested theory fields extend the current path root. If a record field `mul` is inside `M`, the lowered path becomes `M.mul`. This happens in `lowerTheoryFields` in `geolog-lang/src/Geolog/Lower.hs`. ### 2. Relations become `RelTable` When lowering sees `C.VU C.QueryU`, it creates a relational table with the currently accumulated `PiColumn`s and no primary key: - `kind = RelTable` - `columns = piColumnsOf ?columns` - `total = Nothing` This is the representation for proposition-like relations. See `lowerTheoryTy` in `geolog-lang/src/Geolog/Lower.hs`. ### 3. Functions/entities become `FunTable` When lowering finds something function-like that does not reduce to a simpler theory shape, it emits a `FunTable`. At the end of lowering, function tables get a primary key consisting of all columns except the final result column. That matches the examples: - `M.mul = [M.car, M.car, M.car, primary key(0,1)]` - `D.mul = [..., primary key(0,1,2,3)]` This logic is in `finishTheory` in `geolog-lang/src/Geolog/Lower.hs`. ### 4. Dependent structure becomes extra columns Lowering is not just flattening syntax. It also turns shared context into explicit columns. `liftOnShared` copies shared outer variables into the current row shape so that dependencies become relationally explicit. This is why tables like `D.mul` contain both `M.car` arguments and displayed-car columns. See `liftOnShared` in `geolog-lang/src/Geolog/Lower.hs`. ### 5. References generate join information When a local path is used as: - an attribute, lowering creates a `PiColumn` - a value, lowering creates a `JoinColumn` In both cases it also accumulates foreign-key-style atoms in `?fkeys`. Those atoms are later assembled into explicit laws. See `lowerAttrNeutral` and `lowerArgNeutral` in `geolog-lang/src/Geolog/Lower.hs`. ## Why there are `foreignKeys` and `total` laws Each lowered table may carry generated laws. ### `foreignKeys` `makeFkeys` says: if a row exists in table `p`, then the rows it depends on must also exist. This is how dependent references become relational constraints. A row in `D.mul` implies the existence of corresponding rows in `D.car` and `M.mul`. See `makeFkeys` in `geolog-lang/src/Geolog/Lower.hs`. ### `total` `makeTotal` is only generated for function tables. It says: if the prerequisite input rows exist, then a row in the function table must exist. So `M.mul.total` expresses totality of multiplication, while `D.mul.total` expresses totality relative to the displayed structure. See `makeTotal` in `geolog-lang/src/Geolog/Lower.hs`. ## Reading the printed IR The pretty-printer renders `FlatTheory` as a sequence of `table` and `law` declarations. For example, the golden output for `magma` contains: ```text table M.mul = [M.car, M.car, M.car, primary key(0,1)] law M.mul.total = forall (a : M.car) (b : M.car). T |- M.mul(a, b) ``` and: ```text table D.mul = [M.car, M.car, D.car, D.car, D.car, primary key(0,1,2,3)] ``` This means: - the first columns are inputs or lifted context - the last column is the produced value/row - the primary key marks the determining inputs - `foreignKeys` laws enforce referenced rows - `total` laws enforce existence of output rows for function tables Concrete examples live in: - `geolog-lang/test/magma.output` - `geolog-lang/test/paths.output` ## Practical takeaway The implemented IR is best understood as a relationalized, flattened theory: - names become paths - dependencies become explicit columns - references become foreign-key-style laws - total functions become tables plus totality laws So the IR is not just an AST for queries. It is the bridge from elaborated dependent syntax to a database-shaped form that later evaluation code can consume. ## Note `docs/lecture3.tex` has a `Query IR` section header but no content yet, so the code above is currently the most concrete description of the actual intermediate representation.