useful-notes/scratches/ir-status-and-decoupling.md

# IR Status and Decoupling Notes

This is a scratch note about the current Geolog IR design.

## Short answer

The IR does not look badly designed, but it does look weakly decoupled.

That means:

- as a prototype lowering target, it looks reasonable
- as a long-term architecture boundary, it looks too entangled

## Why it does not look bad

The current IR already does useful work:

- it lowers elaborated Geolog theories into a concrete form
- it makes tables and laws explicit
- it captures important constraints like `foreignKeys` and `total`
- it gives the project a real intermediate form to build on

So this does not look like a failed design. It looks like an early design that already has a clear purpose.

## Why it looks weakly decoupled

The current IR already commits to one specific execution story: a relational one.

That shows up in a few ways:

- the core data structures are already database-shaped: `Table`, `primaryKey`, `Atom`, `Law`
- lowering generates synthetic laws like `foreignKeys` and `total`
- those laws are identified by path-based naming conventions rather than by an explicit semantic kind
- some lowering behavior depends on already-built table state
- schema, constraints, and part of the execution model are all mixed together in one layer

None of these choices is necessarily wrong, but together they make the IR less suitable as a backend-neutral interface.
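
To make the point about naming concrete, here is a minimal Python sketch contrasting the two approaches. The path format (`Person/owner/foreignKeys`), the `LawKind` enum, and both law classes are hypothetical and are not taken from the Geolog codebase; only the names `foreignKeys` and `total` come from this note.

```python
from dataclasses import dataclass
from enum import Enum


# Convention-based: the kind of a synthetic law is implied by its path suffix.
# The path format used here is an assumption made for illustration.
@dataclass(frozen=True)
class PathNamedLaw:
    path: str  # e.g. "Person/owner/foreignKeys"


def kind_from_path(law: PathNamedLaw) -> str:
    # Consumers must parse the name to learn what the law means.
    suffix = law.path.rsplit("/", 1)[-1]
    return suffix if suffix in ("foreignKeys", "total") else "user"


# Explicit: the kind is a field, so the name carries no hidden meaning.
class LawKind(Enum):
    USER = "user"
    FOREIGN_KEYS = "foreignKeys"
    TOTAL = "total"


@dataclass(frozen=True)
class KindedLaw:
    path: str
    kind: LawKind


def synthetic_laws(laws: list[KindedLaw]) -> list[KindedLaw]:
    # No string parsing: filter on the semantic kind directly.
    return [law for law in laws if law.kind is not LawKind.USER]
```

With the convention-based form, a user-written law whose path happens to end in `total` cannot be told apart from the synthetic one. The explicit field avoids that kind of coupling.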

## Practical interpretation

The current IR is best understood as:

- a good first lowering target
- a relationalized theory representation
- not yet a stable, backend-neutral execution contract

In particular, it still seems to be missing the runtime pieces that a full Geolog engine would need:

- chase state
- fresh witness generation
- branch tracking
- equality merging
- provenance
- scheduling metadata
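
As a rough illustration of what some of these pieces involve, the sketch below models fresh witness generation, equality merging, and a simple provenance record using a union-find structure. It is a generic Python sketch of standard techniques, not Geolog's runtime; the `ChaseState` class and its methods are hypothetical names, and branch tracking and scheduling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ChaseState:
    """Hypothetical runtime state: elements, equality classes, provenance."""
    parent: list[int] = field(default_factory=list)           # union-find forest
    provenance: dict[int, str] = field(default_factory=dict)  # witness -> creating law

    def fresh_witness(self, law: str) -> int:
        # Allocate a new element and record which law required it.
        elem = len(self.parent)
        self.parent.append(elem)
        self.provenance[elem] = law
        return elem

    def find(self, elem: int) -> int:
        # Follow parent links to the class representative, halving paths.
        while self.parent[elem] != elem:
            self.parent[elem] = self.parent[self.parent[elem]]
            elem = self.parent[elem]
        return elem

    def merge(self, a: int, b: int) -> bool:
        # Equate two elements; returns True if the state changed.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Keep the older (smaller) element as the representative.
        self.parent[max(ra, rb)] = min(ra, rb)
        return True
```

None of this state has a natural home in a structure that only describes tables and laws, which is why it points toward a separate runtime layer.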

## Main architectural concern

Right now, the IR seems to combine too many concerns in one place:

1. lowered theory structure
2. relational schema shape
3. logical constraints
4. hints of runtime semantics

That makes it harder to swap backends or introduce a cleaner runtime layer later.

## Likely direction

A better factored design would probably split the current role into two layers.

### 1. Lowered theory IR

This layer would describe:

- declarations
- dependencies
- logical constraints
- enough type information to preserve meaning

It should stay as backend-neutral as possible.
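
One possible shape for such a layer, sketched in Python under assumed names: `Sort`, `Function`, `Constraint`, and `LoweredTheory` are illustrative and do not come from the Geolog codebase. The point is only that nothing here mentions tables, keys, or plans.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sort:
    name: str


@dataclass(frozen=True)
class Function:
    name: str
    domain: tuple[str, ...]   # names of the argument sorts
    codomain: str             # name of the result sort


@dataclass(frozen=True)
class Constraint:
    name: str
    kind: str                 # semantic tag such as "total" or "user", not a path
    mentions: frozenset[str]  # declarations this constraint depends on


@dataclass
class LoweredTheory:
    sorts: dict[str, Sort] = field(default_factory=dict)
    functions: dict[str, Function] = field(default_factory=dict)
    constraints: list[Constraint] = field(default_factory=list)

    def dependencies(self, function_name: str) -> set[str]:
        # A function depends on every sort in its signature.
        fn = self.functions[function_name]
        return set(fn.domain) | {fn.codomain}

    def undeclared(self) -> set[str]:
        # Names used in signatures or constraints that have no declaration.
        declared = set(self.sorts) | set(self.functions)
        used: set[str] = set()
        for fn in self.functions.values():
            used |= set(fn.domain) | {fn.codomain}
        for constraint in self.constraints:
            used |= constraint.mentions
        return used - declared
```

A relational backend could still derive tables and key constraints from this structure, but that derivation would live in the backend rather than in the IR itself.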

### 2. Runtime or backend IR

This layer would describe:

- query plans
- execution strategy
- chase state
- branch identity
- witness allocation
- equality classes
- provenance and scheduling

That would let the current IR remain useful without forcing it to carry every concern.
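
The boundary between the two layers could be expressed as a small interface that backends implement. The sketch below is one hypothetical arrangement in Python; `TheoryIR`, `Backend`, `plan`, and the two example backends are invented for illustration, and the theory is reduced to a tuple of constraint names to keep the example short.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TheoryIR:
    """Stand-in for the backend-neutral layer (assumed shape)."""
    constraints: tuple[str, ...]


class Backend(Protocol):
    def plan(self, theory: TheoryIR) -> list[str]:
        """Turn neutral constraints into backend-specific steps."""
        ...


class RelationalBackend:
    def plan(self, theory: TheoryIR) -> list[str]:
        # Relational choices (tables, key checks) live here, not in TheoryIR.
        return [f"check table constraint {c}" for c in theory.constraints]


class GraphBackend:
    def plan(self, theory: TheoryIR) -> list[str]:
        return [f"check edge constraint {c}" for c in theory.constraints]


def compile_with(backend: Backend, theory: TheoryIR) -> list[str]:
    # The caller depends only on the Protocol, so backends are swappable.
    return backend.plan(theory)
```

Under this arrangement, the existing relational lowering would become one `Backend` implementation among several rather than the definition of the IR.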

## Bottom line

The current IR is not badly designed for an early-stage compiler pipeline.

But if the project wants a strong separation between:

- source semantics
- lowered theory
- runtime execution
- backend-specific storage and planning

then the current IR is too tightly coupled to be the final architecture boundary.

Add a note file about decoupling the IR design 2026-03-30 13:34:09 +02:00
