useful-notes/scratches/ir-status-and-decoupling.md
2026-04-01 09:12:27 +02:00

IR Status and Decoupling Notes

This is a scratch note about the current Geolog IR design.

Short answer

The IR does not look badly designed, but it does look weakly decoupled from one particular (relational) execution model.

That means:

  • as a prototype lowering target, it looks reasonable
  • as a long-term architecture boundary, it looks too entangled

Why it does not look bad

The current IR already does useful work:

  • it lowers elaborated Geolog theories into a concrete form
  • it makes tables and laws explicit
  • it captures important constraints like foreignKeys and total
  • it gives the project a real intermediate form to build on

So this does not look like a failed design. It looks like an early design that already has a clear purpose.

Why it looks weakly decoupled

The current IR already commits to one specific execution story: a relational one.

That shows up in a few ways:

  • the core data structures are already database-shaped: Table, primaryKey, Atom, Law
  • lowering generates synthetic laws like foreignKeys and total
  • those laws are named through path conventions rather than through a more explicit semantic kind
  • some lowering behavior depends on already-built table state, which suggests lowering is order-sensitive rather than a pure function of the elaborated theory
  • schema, constraints, and part of the execution model are all mixed together in one layer

None of these choices are necessarily wrong. But together they make the IR less clean as a neutral interface.
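
To make the naming concern concrete, here is a minimal Python sketch. It is not the actual Geolog IR: only the names Law, foreignKeys and total come from this note, while the field layout, the path strings, and LawKind are hypothetical. It contrasts recovering a law's meaning from its path with carrying an explicit semantic kind.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Law:
    # Hypothetical shape: a law identified only by a path-like name.
    path: tuple[str, ...]


def is_foreign_key_by_path(law: Law) -> bool:
    # Convention-based check: the meaning is recovered from the last path segment.
    return len(law.path) > 0 and law.path[-1] == "foreignKeys"


class LawKind(Enum):
    # Explicit semantic kinds, as an alternative to path conventions (assumed set).
    FOREIGN_KEY = "foreign_key"
    TOTAL = "total"
    USER = "user"


@dataclass(frozen=True)
class KindedLaw:
    path: tuple[str, ...]
    kind: LawKind


# A user-written law whose path happens to end in "foreignKeys"
# is indistinguishable from a synthetic one under the path check.
user_law = Law(path=("Graph", "foreignKeys"))

# With an explicit kind, the same two cases stay distinct.
synthetic = KindedLaw(path=("Graph", "src", "foreignKeys"), kind=LawKind.FOREIGN_KEY)
user_kinded = KindedLaw(path=("Graph", "foreignKeys"), kind=LawKind.USER)
```

Whether such a collision can actually occur depends on how the real lowering builds paths. The sketch only shows why a backend reading the IR would probably prefer a kind field to a naming rule.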

Practical interpretation

The current IR is best understood as:

  • a good first lowering target
  • a relationalized theory representation
  • not yet a stable, backend-neutral execution contract

In particular, it still seems to be missing the runtime pieces that a full Geolog engine would need:

  • chase state
  • fresh witness generation
  • branch tracking
  • equality merging
  • provenance
  • scheduling metadata
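
As a rough illustration of what some of these pieces could look like, the sketch below models fresh witness generation, equality merging (via a small union-find), and provenance in one hypothetical ChaseState class. None of these names exist in the current IR, and branch tracking and scheduling are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class ChaseState:
    # Hypothetical runtime state; none of these names come from the Geolog IR.
    parent: dict[int, int] = field(default_factory=dict)      # union-find for equality merging
    provenance: dict[int, str] = field(default_factory=dict)  # which law created each witness
    next_id: int = 0

    def fresh_witness(self, law_name: str) -> int:
        """Allocate a new element and record the law that demanded it."""
        elem = self.next_id
        self.next_id += 1
        self.parent[elem] = elem
        self.provenance[elem] = law_name
        return elem

    def find(self, elem: int) -> int:
        """Return the representative of elem's equality class."""
        while self.parent[elem] != elem:
            self.parent[elem] = self.parent[self.parent[elem]]  # path halving
            elem = self.parent[elem]
        return elem

    def merge(self, a: int, b: int) -> None:
        """Identify two elements, keeping the smaller id as representative."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[max(ra, rb)] = min(ra, rb)


state = ChaseState()
x = state.fresh_witness("total")
y = state.fresh_witness("total")
state.merge(x, y)  # x and y now share one equality class
```

None of this state has an obvious home in a structure built from tables and laws, which is the gap the list above points at.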

Main architectural concern

Right now, the IR seems to combine too many concerns in one place:

  1. lowered theory structure
  2. relational schema shape
  3. logical constraints
  4. hints of runtime semantics

That makes it harder to swap backends or introduce a cleaner runtime layer later.

Likely direction

A better factored design would probably split the current role into two layers.

1. Lowered theory IR

This layer would describe:

  • declarations
  • dependencies
  • logical constraints
  • enough type information to preserve meaning

It should stay as backend-neutral as possible.

2. Runtime or backend IR

This layer would describe:

  • query plans
  • execution strategy
  • chase state
  • branch identity
  • witness allocation
  • equality classes
  • provenance and scheduling

That would let the current IR remain useful without forcing it to carry every concern.
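
A minimal sketch of that split, under stated assumptions: Declaration, Theory, TablePlan and plan_relational are hypothetical names, and the "id" primary key is an arbitrary placeholder. The point is only that the relational shape is derived from the theory layer by a separate function instead of being stored inside it.

```python
from dataclasses import dataclass


# Layer 1: lowered theory IR (hypothetical, backend-neutral).
@dataclass(frozen=True)
class Declaration:
    name: str
    depends_on: tuple[str, ...] = ()


@dataclass(frozen=True)
class Theory:
    declarations: tuple[Declaration, ...]


# Layer 2: one possible backend IR (hypothetical, relational).
@dataclass(frozen=True)
class TablePlan:
    table: str
    primary_key: str
    foreign_keys: tuple[str, ...]


def plan_relational(theory: Theory) -> list[TablePlan]:
    """Derive relational plans from the theory; the theory itself is not modified."""
    return [
        TablePlan(table=d.name, primary_key="id", foreign_keys=d.depends_on)
        for d in theory.declarations
    ]


theory = Theory((Declaration("V"), Declaration("E", depends_on=("V",))))
plans = plan_relational(theory)
```

A different backend could consume the same Theory value with its own planning function, without the theory layer knowing about tables at all.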

Bottom line

The current IR is not badly designed for an early-stage compiler pipeline.

But if the project wants a strong separation between:

  • source semantics
  • lowered theory
  • runtime execution
  • backend-specific storage and planning

then the current IR is too tightly coupled to be the final architecture boundary.