# AGENTS.md

This file provides guidance to coding agents collaborating on this repository.

## Mission

Query Engine is an experimental Rust project for building query-engine
components. The current implementation is centered on a chase-based reasoning
core, lightweight interactive frontends, and an early relational/SQL scaffold.

Priorities, in order:

1. Correctness of reasoning and query semantics.
2. Clear architectural boundaries between frontend, planning, and execution layers.
3. Termination guarantees for chase-based rule evaluation.
4. Performance and scalability.
5. Clear, maintainable, idiomatic Rust code.

## Core Rules

- Use English for code, comments, docs, and tests.
- Keep mutable state inside well-defined structs; avoid global mutable state.
- Prefer small, focused changes over large refactoring.
- Add comments only when they clarify non-obvious behavior.
- Follow Rust idioms: use `Result` for errors, iterators over manual loops, etc.
- Do not describe unimplemented subsystems as if they already exist.

Quick examples:

- Good: add a planning data type behind a focused module boundary.
- Good: add a new chase variant by extending the existing strategy/config model.
- Bad: mix parsing, planning, and execution concerns in one module.
- Bad: add global configuration that affects unrelated engine components.
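
As an illustration of the rules above on `Result`, iterators, and struct-owned state, here is a minimal sketch. `ArityRegistry` and `ArityError` are hypothetical names invented for this example; they are not types from this repository.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type; the repository's own error types may differ.
#[derive(Debug, PartialEq, Eq)]
struct ArityError {
    predicate: String,
    expected: usize,
    found: usize,
}

impl fmt::Display for ArityError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "predicate `{}` expects {} arguments, found {}",
            self.predicate, self.expected, self.found
        )
    }
}

impl std::error::Error for ArityError {}

/// Mutable state lives in a struct owned by the caller, not in a global.
#[derive(Debug, Default)]
struct ArityRegistry {
    arities: HashMap<String, usize>,
}

impl ArityRegistry {
    /// Records the arity on first use and reports later mismatches as `Err`.
    fn check(&mut self, predicate: &str, found: usize) -> Result<(), ArityError> {
        let expected = *self.arities.entry(predicate.to_string()).or_insert(found);
        if expected == found {
            Ok(())
        } else {
            Err(ArityError {
                predicate: predicate.to_string(),
                expected,
                found,
            })
        }
    }

    /// Validates a batch with an iterator adapter; stops at the first error.
    fn check_all<'a, I>(&mut self, facts: I) -> Result<(), ArityError>
    where
        I: IntoIterator<Item = (&'a str, usize)>,
    {
        facts
            .into_iter()
            .try_for_each(|(predicate, found)| self.check(predicate, found))
    }
}
```

Because the registry is a plain value, two engine components can each own one without affecting the other.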


## Writing Style

- Use Oxford commas in inline lists: "a, b, and c" not "a, b, c".
- Do not use em dashes. Restructure the sentence, or use a colon or semicolon instead.
- Avoid colorful adjectives and adverbs. Write "TCP proxy" not "lightweight TCP proxy", "scoring components" not "transparent scoring components".
- Use noun phrases for checklist items, not imperative verbs. Write "redundant index detection" not "detect redundant indexes".
- Use title case for headings in Markdown files: "Build from Source" not "Build from source". Minor words (a, an, the, and, but, or, for, from, in, on, at, to, by, of) stay lowercase unless they are the first word.
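
The heading rule can be checked mechanically. The sketch below is one possible reading of it, not a tool that exists in this repository. The function names are hypothetical, and the word list is the one given above plus `from`, which the "Build from Source" example implies.

```rust
/// Minor words that stay lowercase unless they start the heading.
/// Assumption: the listed words plus "from", as implied by "Build from Source".
const MINOR_WORDS: &[&str] = &[
    "a", "an", "the", "and", "but", "or", "for", "from", "in", "on", "at", "to", "by", "of",
];

/// Uppercases the first character and leaves the rest untouched,
/// so acronyms such as "PR" or "SQL" survive.
fn capitalize(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}

/// Rewrites a heading into title case under the rule sketched above.
fn to_title_case(heading: &str) -> String {
    heading
        .split_whitespace()
        .enumerate()
        .map(|(index, word)| {
            let lower = word.to_lowercase();
            if index > 0 && MINOR_WORDS.contains(&lower.as_str()) {
                lower
            } else {
                capitalize(word)
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Returns true when the heading already follows the rule.
fn is_title_case(heading: &str) -> bool {
    heading == to_title_case(heading)
}
```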

## Repository Layout

- `src/`: core implementation.
- `src/chase/`: chase and rule-evaluation modules.
    - `term.rs`: terms (constants, nulls, variables).
    - `atom.rs`: atoms (predicate applied to terms).
    - `instance.rs`: fact storage and validation.
    - `rule.rs`: TGDs, EGDs, equalities, and builders.
    - `substitution.rs`: variable bindings and unification.
    - `engine.rs`: chase execution and configuration.
    - `inference.rs`: shared matching and provenance-aware materialization helpers.
    - `union_find.rs`: equality merging support.
- `src/frontend/`: lightweight interactive surface for scripts, REPL, and local web UI.
- `src/relational/`: schemas, values, rows, and result sets for relational execution.
- `src/catalog/`: predicate-to-table schema inference and catalog access.
- `src/sql/`: narrow SQL AST and parser support.
- `src/planner/`: logical plan structures and SQL-to-plan translation.
- `src/execution/`: execution of the current logical plan subset, including the `DataSource` trait and the `TableStore` in-memory source.
- `examples/scripts/`: runnable script examples for supported workflows.
- `tests/`: integration, regression, property-based, and SQL pipeline tests.
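
As a rough mental model for the `term.rs`, `atom.rs`, and `substitution.rs` entries above, the sketch below shows terms, atoms, and the application of variable bindings. All definitions here are assumptions made for illustration; the real types in `src/chase/` may be shaped differently. Unbound variables are treated as existential and receive one fresh labeled null each.

```rust
use std::collections::HashMap;

/// Hypothetical term model: constants, labeled nulls, and variables.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Term {
    Constant(String),
    Null(u64),
    Var(String),
}

/// A predicate applied to terms.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Atom {
    predicate: String,
    terms: Vec<Term>,
}

impl Atom {
    /// In this sketch, an atom is ground when it contains no variables.
    fn is_ground(&self) -> bool {
        !self.terms.iter().any(|term| matches!(term, Term::Var(_)))
    }
}

/// Issues fresh labeled nulls; the counter is explicit state, not a global.
#[derive(Debug, Default)]
struct NullSource {
    next: u64,
}

impl NullSource {
    fn fresh(&mut self) -> Term {
        let id = self.next;
        self.next += 1;
        Term::Null(id)
    }
}

/// Applies `bindings` to `atom`. A variable without a binding is treated as
/// existential and mapped to one fresh null, reused for repeated occurrences.
fn instantiate(atom: &Atom, bindings: &HashMap<String, Term>, nulls: &mut NullSource) -> Atom {
    let mut existential: HashMap<String, Term> = HashMap::new();
    let terms = atom
        .terms
        .iter()
        .map(|term| match term {
            Term::Var(name) => match bindings.get(name) {
                Some(bound) => bound.clone(),
                None => existential
                    .entry(name.clone())
                    .or_insert_with(|| nulls.fresh())
                    .clone(),
            },
            other => other.clone(),
        })
        .collect();
    Atom {
        predicate: atom.predicate.clone(),
        terms,
    }
}
```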

## Architecture Constraints

- Treat the current chase subsystem as one engine component, not the entire long-term architecture.
- `Instance` holds the fact state as ground atoms.
- `Rule` and `Egd` represent declarative constraints used by the chase subsystem.
- The chase engine should remain largely stateless; pass execution state explicitly.
- New chase variants should be composable with existing infrastructure.
- Existential variables generate labeled nulls (`Term::Null`).
- The current SQL support is intentionally narrow: `SELECT-FROM-WHERE-ORDER BY-LIMIT` over predicate-backed tables; equality and inequality predicates combined with `AND` and `OR`; comma-join style multi-table queries; table aliases; ordering by output-column names; integer and string literals.
- Stable SQL column names come from explicit catalog registration or the frontend `schema ...` command, including for empty tables; otherwise the default names are positional such as `c0` and `c1`.
- Single-table SQL queries may use the table name as a qualifier when no alias is present.
- Do not describe unsupported SQL features such as aggregates, grouping, or arbitrary expressions as implemented.
- The executor operates on the `DataSource` trait, not on `Instance` directly. `Instance` and `TableStore` are the two built-in implementations.
- Relational and SQL modules should build on explicit schemas and logical plans, not call frontend helpers directly.
- If you add parser, planner, or executor layers, keep their responsibilities separate.
- Public docs and interfaces should reflect the implemented state of the repository accurately.
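
The executor boundary described above can be pictured with a sketch like the following. The trait methods, `NamedTables`, `FactTables`, and `project` are hypothetical stand-ins chosen for this example; the signatures of the real `DataSource`, `Instance`, and `TableStore` are not shown in this file and may differ.

```rust
use std::collections::HashMap;

/// Hypothetical row type: one string value per column.
type Row = Vec<String>;

/// Stand-in for the `DataSource` trait; method names are assumptions.
trait DataSource {
    /// Column names for `table`, or `None` when they cannot be determined.
    fn columns(&self, table: &str) -> Option<Vec<String>>;
    /// All rows of `table`; empty when the table is unknown.
    fn scan(&self, table: &str) -> Vec<Row>;
}

/// Positional default names: `c0`, `c1`, and so on.
fn positional_columns(arity: usize) -> Vec<String> {
    (0..arity).map(|index| format!("c{index}")).collect()
}

/// In-memory source with explicitly registered column names.
#[derive(Default)]
struct NamedTables {
    tables: HashMap<String, (Vec<String>, Vec<Row>)>,
}

impl DataSource for NamedTables {
    fn columns(&self, table: &str) -> Option<Vec<String>> {
        self.tables.get(table).map(|(names, _)| names.clone())
    }

    fn scan(&self, table: &str) -> Vec<Row> {
        self.tables
            .get(table)
            .map(|(_, rows)| rows.clone())
            .unwrap_or_default()
    }
}

/// Fact-style source without registered names; it falls back to positional
/// names inferred from the first row.
#[derive(Default)]
struct FactTables {
    facts: HashMap<String, Vec<Row>>,
}

impl DataSource for FactTables {
    fn columns(&self, table: &str) -> Option<Vec<String>> {
        self.facts
            .get(table)
            .and_then(|rows| rows.first())
            .map(|row| positional_columns(row.len()))
    }

    fn scan(&self, table: &str) -> Vec<Row> {
        self.facts.get(table).cloned().unwrap_or_default()
    }
}

/// Executor-side helper that depends only on the trait, not on a concrete source.
fn project(source: &dyn DataSource, table: &str, column: &str) -> Option<Vec<String>> {
    let index = source.columns(table)?.iter().position(|name| name == column)?;
    Some(
        source
            .scan(table)
            .into_iter()
            .filter_map(|row| row.get(index).cloned())
            .collect(),
    )
}
```

Since `project` accepts `&dyn DataSource`, adding a third source in this sketch would not require changes to executor code.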

## Rust Conventions

- Target stable Rust (edition 2024, rust-version 1.92).
- Use `#[derive(...)]` for common traits where appropriate.
- Prefer `&str` over `String` in function parameters when ownership is not needed.
- Use `impl Trait` for return types when the concrete type is an implementation detail.
- Run `cargo clippy` and address warnings before committing.
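
The conventions above on derives, `&str` parameters, and `impl Trait` return types can be combined as in this minimal sketch. `Fact` and `facts_for` are hypothetical names used only for illustration.

```rust
/// Common traits are derived instead of written by hand.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Fact {
    predicate: String,
    args: Vec<String>,
}

/// Takes `&str` because the lookup does not need ownership of the name.
/// Returns `impl Iterator` so the adapter chain stays an implementation detail.
fn facts_for<'a>(facts: &'a [Fact], predicate: &'a str) -> impl Iterator<Item = &'a Fact> + 'a {
    facts.iter().filter(move |fact| fact.predicate == predicate)
}
```

Callers can pass a string literal or a borrowed `String`, and the filter implementation can change later without touching the signature.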

## Required Validation

Run these checks for any non-trivial change:

1. `cargo test`
2. `cargo clippy --all-targets --all-features -- -D warnings`
3. `cargo fmt --check`

For performance-sensitive changes:

1. Add benchmarks if they do not exist.
2. Compare before/after performance.

## First Contribution Flow

Use this sequence for your first change:

1. Read `src/lib.rs` plus the relevant module files.
2. Implement the smallest possible code change.
3. Add or update tests that fail before and pass after.
4. Run `cargo test`.
5. Run `cargo clippy --all-targets --all-features -- -D warnings`.
6. Run `cargo fmt --check`.
7. Update docs if public API behavior changed.

Example scopes that are good first tasks:

- Add tests for an edge case in unification.
- Implement a new utility method on `Instance` or `Atom`.
- Tighten frontend wording so it matches actual behavior.
- Introduce a small planning-oriented type without changing execution semantics.
- Extend the SQL slice with a narrow, well-tested feature such as aliases or named columns.
- Add a runnable example script that demonstrates a supported workflow.

## Testing Expectations

- No semantics-changing logic update is complete without tests.
- Unit tests go in `#[cfg(test)] mod tests` within each module.
- Integration tests go in `tests/integration_tests.rs`.
- Regression tests for bug fixes go in `tests/regression_tests.rs`.
- Property-based tests go in `tests/property_tests.rs`.
- SQL/planner/execution flow tests go in `tests/sql_pipeline_tests.rs`.
- Runnable documentation examples belong in `examples/scripts/` when they clarify supported behavior.
- Do not merge code that breaks existing tests.
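
For the unit-test placement rule above, a module might look like the sketch below. The `arity` helper is a hypothetical function that stands in for whatever the module defines; only the `#[cfg(test)] mod tests` layout is the point.

```rust
/// Hypothetical helper: counts the arguments in text such as `Edge(a, b)`.
fn arity(fact: &str) -> Option<usize> {
    let open = fact.find('(')?;
    let close = fact.rfind(')')?;
    let args = fact.get(open + 1..close)?.trim();
    if args.is_empty() {
        Some(0)
    } else {
        Some(args.split(',').count())
    }
}

// Unit tests live next to the code and are compiled only for `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn arity_counts_arguments() {
        assert_eq!(arity("Edge(a, b)"), Some(2));
    }

    #[test]
    fn arity_rejects_missing_parentheses() {
        assert_eq!(arity("Edge"), None);
    }
}
```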

Minimal unit-test checklist for chase-related behavior:

1. Create an `Instance` with relevant facts.
2. Define rules using `RuleBuilder`.
3. Run `chase(instance, &rules)`.
4. Assert on `result.terminated`, `result.instance`, and derived facts.

Example test skeleton:

```rust
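// Assumes `Instance`, `Atom`, `Term`, `RuleBuilder`, and `chase` are already
// in scope; the import path depends on where the test lives.
#[test]
fn test_example() {
    // 1. Create an instance with the relevant facts.
    let instance: Instance = vec![
        Atom::new("Pred", vec![Term::constant("a")]),
    ].into_iter().collect();

    // 2. Define the rule: Pred(X) implies Derived(X).
    let rule = RuleBuilder::new()
        .when("Pred", vec![Term::var("X")])
        .then("Derived", vec![Term::var("X")])
        .build();

    // 3. Run the chase.
    let result = chase(instance, &[rule]);

    // 4. Assert on termination and the derived facts.
    assert!(result.terminated);
    assert_eq!(result.instance.facts_for_predicate("Derived").len(), 1);
}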
```

## Change Design Checklist

Before coding:

1. Confirm whether the change affects reasoning semantics, planning boundaries, or termination.
2. Identify affected tests.
3. Consider impact on API stability.
4. Avoid overstating roadmap progress in code comments or docs.
5. Keep the supported SQL subset explicit when touching `sql`, `planner`, or `execution`.

Before submitting:

1. Verify `cargo test` passes.
2. Verify `cargo clippy --all-targets --all-features -- -D warnings` passes.
3. Verify `cargo fmt --check` passes.
4. Ensure tests were added or updated where relevant.
5. Verify docs still match the implemented feature set.

## Review Guidelines (P0/P1 Focus)

Review output should be concise and only include critical issues.

- `P0`: must-fix defects (incorrect reasoning, non-termination, unsound semantics).
- `P1`: high-priority defects (likely functional bug, performance regression, API breakage, misleading public behavior/docs).

Do not include:

- style-only nitpicks,
- praise/summary of what is already good,
- exhaustive restatement of the patch.

Use this review format:

1. `Severity` (`P0`/`P1`)
2. `File:line`
3. `Issue`
4. `Why it matters`
5. `Minimal fix direction`

## Practical Notes for Agents

- Prefer targeted edits over broad mechanical rewrites.
- If you detect contradictory repository conventions, follow existing code and update docs accordingly.
- When uncertain about correctness, add or extend tests first, then optimize.
- When adding non-chase engine pieces, define clean interfaces before broadening functionality.
- Keep `frontend` presentation-only when possible; shared reasoning logic belongs in `chase`, relational logic in `relational`/`planner`/`execution`.
- Keep user-facing naming consistent with the repository name: `query-engine` / `query_engine`.
- If you change the SQL subset, update `README.md`, `ROADMAP.md`, and relevant example scripts in the same change.

## Commit and PR Hygiene

- Keep commits scoped to one logical change.
- PR descriptions should include:
    1. behavioral change summary,
    2. tests added/updated,
    3. performance impact (if applicable),
    4. API changes (if any),
    5. roadmap or architecture impact (if applicable).

Suggested PR checklist:

- [ ] Tests added/updated for behavior changes
- [ ] `cargo test` passes
- [ ] `cargo clippy --all-targets --all-features -- -D warnings` passes
- [ ] `cargo fmt --check` passes