From dc8b1911826f431c06dfa009e59ac7fc55c0379b Mon Sep 17 00:00:00 2001
From: Hassan Abedi
Date: Mon, 30 Mar 2026 13:34:09 +0200
Subject: [PATCH] Add a note file about decoupling the IR design

---
 scratches/ir-status-and-decoupling.md | 107 ++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)
 create mode 100644 scratches/ir-status-and-decoupling.md

diff --git a/scratches/ir-status-and-decoupling.md b/scratches/ir-status-and-decoupling.md
new file mode 100644
index 0000000..f6fff9c
--- /dev/null
+++ b/scratches/ir-status-and-decoupling.md
@@ -0,0 +1,107 @@
+# IR Status and Decoupling Notes
+
+This is a scratch note about the current Geolog IR design.
+
+## Short answer
+
+The IR does not look badly designed, but it does look weakly decoupled.
+
+That means:
+
+- as a prototype lowering target, it looks reasonable
+- as a long-term architecture boundary, it looks too entangled
+
+## Why it does not look bad
+
+The current IR already does useful work:
+
+- it lowers elaborated Geolog theories into a concrete form
+- it makes tables and laws explicit
+- it captures important constraints like `foreignKeys` and `total`
+- it gives the project a real intermediate form to build on
+
+So this does not look like a failed design. It looks like an early design that already has a clear purpose.
+
+## Why it looks weakly decoupled
+
+The current IR already commits to one specific execution story: a relational one.
+
+That shows up in a few ways:
+
+- the core data structures are already database-shaped: `Table`, `primaryKey`, `Atom`, `Law`
+- lowering generates synthetic laws like `foreignKeys` and `total`
+- those laws are named through path conventions rather than through a more explicit semantic kind
+- some lowering behavior depends on already-built table state
+- schema, constraints, and part of the execution model are all mixed together in one layer
+
+None of these choices are necessarily wrong. But together they make the IR less clean as a neutral interface.
+
+## Practical interpretation
+
+The current IR is best understood as:
+
+- a good first lowering target
+- a relationalized theory representation
+- not yet a stable, backend-neutral execution contract
+
+In particular, it still seems to be missing the runtime pieces that a full Geolog engine would need:
+
+- chase state
+- fresh witness generation
+- branch tracking
+- equality merging
+- provenance
+- scheduling metadata
+
+## Main architectural concern
+
+Right now, the IR seems to combine too many concerns in one place:
+
+1. lowered theory structure
+2. relational schema shape
+3. logical constraints
+4. hints of runtime semantics
+
+That makes it harder to swap backends or introduce a cleaner runtime layer later.
+
+## Likely direction
+
+A better factored design would probably split the current role into two layers.
+
+### 1. Lowered theory IR
+
+This layer would describe:
+
+- declarations
+- dependencies
+- logical constraints
+- enough type information to preserve meaning
+
+It should stay as backend-neutral as possible.
+
+### 2. Runtime or backend IR
+
+This layer would describe:
+
+- query plans
+- execution strategy
+- chase state
+- branch identity
+- witness allocation
+- equality classes
+- provenance and scheduling
+
+That would let the current IR remain useful without forcing it to carry every concern.
+
+## Bottom line
+
+The current IR is not badly designed for an early-stage compiler pipeline.
+
+But if the project wants a strong separation between:
+
+- source semantics
+- lowered theory
+- runtime execution
+- backend-specific storage and planning
+
+then the current IR is too tightly coupled to be the final architecture boundary.
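
One way to picture the two-layer split described under "Likely direction" is the minimal Python sketch below. It is not Geolog's actual IR, and Python is used only for brevity. Every name in it (`ConstraintKind`, `Declaration`, `Constraint`, `LoweredTheory`, `RuntimeState`, and the `Vertex`/`Edge` example theory) is hypothetical and chosen for illustration. The sketch assumes that synthetic laws such as `foreignKeys` and `total` could carry an explicit kind tag instead of being identified by a path convention, and that witness allocation, equality classes, and provenance could live in a separate mutable runtime object.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum, auto


class ConstraintKind(Enum):
    """Explicit semantic kind, rather than a kind implied by a law's path."""
    FOREIGN_KEY = auto()
    TOTAL = auto()
    USER_LAW = auto()


@dataclass(frozen=True)
class Declaration:
    name: str
    depends_on: tuple[str, ...] = ()


@dataclass(frozen=True)
class Constraint:
    kind: ConstraintKind
    subject: str                    # declaration the constraint is about
    targets: tuple[str, ...] = ()   # other declarations it refers to


@dataclass(frozen=True)
class LoweredTheory:
    """Layer 1 (hypothetical): immutable, with no tables, keys, or run state."""
    declarations: tuple[Declaration, ...]
    constraints: tuple[Constraint, ...]

    def constraints_of(self, kind: ConstraintKind) -> list[Constraint]:
        return [c for c in self.constraints if c.kind is kind]


@dataclass
class RuntimeState:
    """Layer 2 (hypothetical): mutable state owned by a single backend."""
    theory: LoweredTheory
    next_witness: int = 0
    parent: dict[int, int] = field(default_factory=dict)     # equality classes
    provenance: dict[int, str] = field(default_factory=dict)

    def fresh_witness(self, reason: str) -> int:
        """Allocate a new witness id and record why it was created."""
        w = self.next_witness
        self.next_witness += 1
        self.provenance[w] = reason
        return w

    def find(self, w: int) -> int:
        """Follow parent links to the representative of w's equality class."""
        while w in self.parent:
            w = self.parent[w]
        return w

    def merge(self, a: int, b: int) -> None:
        """Merge two equality classes, keeping the smaller id as representative."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[max(ra, rb)] = min(ra, rb)


# Usage: one theory value; the runtime layer only reads it.
theory = LoweredTheory(
    declarations=(
        Declaration("Vertex"),
        Declaration("Edge", depends_on=("Vertex",)),
    ),
    constraints=(
        Constraint(ConstraintKind.FOREIGN_KEY, "Edge", ("Vertex",)),
        Constraint(ConstraintKind.TOTAL, "Edge"),
    ),
)
state = RuntimeState(theory)
a = state.fresh_witness("Edge.src")
b = state.fresh_witness("Edge.tgt")
state.merge(a, b)
```

Because `LoweredTheory` is frozen and `RuntimeState` holds only a reference to it, a second backend could reuse the same theory value with an entirely different runtime representation. That is the kind of separation the note argues the current IR does not yet provide.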