query-engine/README.md at d7b2eb414413e4c63a15a265f924be058c0d08da

Hassan Abedi d7b2eb4144 Add TUI frontend and syntax highlighting (REPL and web UI)

2026-04-14 10:16:41 +02:00

7.5 KiB


Query Engine

An experimental Rust project for building query-engine components.

The repository currently centers on a chase-based reasoning core, a small interactive frontend, and an early relational/SQL scaffold. The longer-term goal is a query engine with clear boundaries between the front end, planning, optimization, and execution.

Current scope

Chase-based rule evaluation over facts, rules, and substitutions
Restricted, standard, oblivious, and Skolem chase variants
Optional semi-naive evaluation across all chase variants
Provenance-oriented explanations for derived answers
Script, REPL, local web UI, and optional TUI for experimentation (all with syntax highlighting)
Relational schema, catalog, logical-plan, and execution scaffolding
Physical operator scaffolding with a small rule-based rewrite layer
A minimal SQL slice for SELECT-FROM-WHERE-GROUP BY-ORDER BY-LIMIT queries over predicate-backed tables, including COUNT, SUM, MIN, MAX, and AVG aggregates
Filter push-down across joins in the physical rewrite pass
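
To illustrate what semi-naive evaluation means for the rule-evaluation bullets above, here is a minimal, self-contained sketch that does not use this crate's API. The names `Pair`, `pair`, and `ancestors_semi_naive` are hypothetical, and the two rules are hard-coded; the crate's actual chase implementation may be organized quite differently.

```rust
use std::collections::BTreeSet;

/// A binary fact such as Parent(alice, bob), stored as (first, second).
type Pair = (String, String);

/// Hypothetical helper that builds a fact from two string slices.
fn pair(a: &str, b: &str) -> Pair {
    (a.to_string(), b.to_string())
}

/// Computes Ancestor from Parent with semi-naive evaluation: each round
/// joins only the facts derived in the previous round (the delta) against
/// Parent, instead of re-joining the whole Ancestor relation.
fn ancestors_semi_naive(parent: &BTreeSet<Pair>) -> BTreeSet<Pair> {
    // Rule 1: Parent(X, Y) -> Ancestor(X, Y).
    let mut ancestor: BTreeSet<Pair> = parent.clone();
    let mut delta: BTreeSet<Pair> = parent.clone();

    while !delta.is_empty() {
        let mut next_delta = BTreeSet::new();
        // Rule 2: Ancestor(X, Y), Parent(Y, Z) -> Ancestor(X, Z),
        // with the Ancestor atom restricted to the delta.
        for (x, y) in &delta {
            for (y2, z) in parent {
                if y == y2 {
                    let fact = (x.clone(), z.clone());
                    // Only genuinely new facts feed the next round.
                    if !ancestor.contains(&fact) {
                        next_delta.insert(fact);
                    }
                }
            }
        }
        ancestor.extend(next_delta.iter().cloned());
        delta = next_delta;
    }
    ancestor
}

fn main() {
    let parent: BTreeSet<Pair> =
        [pair("alice", "bob"), pair("bob", "carol")].into_iter().collect();
    for (x, y) in &ancestors_semi_naive(&parent) {
        println!("Ancestor({x}, {y})");
    }
}
```

The loop stops as soon as a round derives nothing new, which is the fixpoint condition for this pair of rules.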

Architecture

The repository is currently organized around a few clear subsystems:

src/chase/: rule-engine data structures, chase execution, and stratification
src/io/: CSV-based fact import/export
src/frontend/: REPL, script, GUI, and explanation rendering
src/relational/: schemas, values, rows, and result sets
src/catalog/: predicate-backed table metadata
src/sql/: minimal SQL AST and parser
src/planner/: logical plan structures and SQL-to-plan translation
src/execution/: execution for the current logical-plan subset, the DataSource trait, the TableStore, and a physical operator layer with rule-based rewrites
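
As a rough illustration of what a rule-based rewrite such as filter push-down does, the following self-contained sketch uses a toy plan type. The `Plan` enum, its variants, and the `push_down` function are assumptions made for this example and are not the crate's physical operator types.

```rust
/// Toy plan tree (hypothetical; not the crate's operator types).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    /// Keeps rows where `table.column = value`.
    Filter { table: String, column: String, value: String, input: Box<Plan> },
    /// Cross product of both inputs, as in a comma-separated FROM list.
    Join { left: Box<Plan>, right: Box<Plan> },
}

fn scan(table: &str) -> Plan {
    Plan::Scan { table: table.to_string() }
}

fn filter(table: &str, column: &str, value: &str, input: Plan) -> Plan {
    Plan::Filter {
        table: table.to_string(),
        column: column.to_string(),
        value: value.to_string(),
        input: Box::new(input),
    }
}

fn join(left: Plan, right: Plan) -> Plan {
    Plan::Join { left: Box::new(left), right: Box::new(right) }
}

/// Returns true if `plan` produces columns of `table`.
fn produces(plan: &Plan, table: &str) -> bool {
    match plan {
        Plan::Scan { table: t } => t == table,
        Plan::Filter { input, .. } => produces(input, table),
        Plan::Join { left, right } => produces(left, table) || produces(right, table),
    }
}

/// Rewrite rule: a single-table filter above a join moves to the join
/// side that produces that table, so fewer rows reach the join.
fn push_down(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { table, column, value, input } => match *input {
            Plan::Join { left, right } => {
                if produces(&left, &table) {
                    let l = push_down(Plan::Filter { table, column, value, input: left });
                    join(l, push_down(*right))
                } else if produces(&right, &table) {
                    let r = push_down(Plan::Filter { table, column, value, input: right });
                    join(push_down(*left), r)
                } else {
                    // Neither side owns the table: leave the filter on top.
                    let j = join(push_down(*left), push_down(*right));
                    Plan::Filter { table, column, value, input: Box::new(j) }
                }
            }
            other => Plan::Filter { table, column, value, input: Box::new(push_down(other)) },
        },
        Plan::Join { left, right } => join(push_down(*left), push_down(*right)),
        leaf => leaf,
    }
}

fn main() {
    let plan = filter("Parent", "child", "bob", join(scan("Parent"), scan("Ancestor")));
    println!("{:#?}", push_down(plan));
}
```

This sketch identifies tables by name only, so a self-join would need alias-aware matching; that detail is deliberately left out.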

Today, the chase subsystem is still the most mature part of the codebase. The relational and SQL modules are present to create clean extension points for a broader query-engine architecture.

The executor operates on the DataSource trait rather than on the chase Instance directly. This allows non-chase data sources to plug into the SQL pipeline. The crate ships two implementations: Instance (chase-backed) and TableStore (in-memory rows). Implementing DataSource for a new backend requires a single method:

fn scan(&self, table: &str, schema: &Schema) -> Result<ResultSet, ExecutionError>;
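
A minimal sketch of what plugging in a new backend might look like is shown below. Only the `scan` signature comes from this README; the `Schema`, `ResultSet`, and `ExecutionError` definitions are simplified stand-ins, and `MapSource` and `sample_source` are hypothetical names, so the real crate types will differ.

```rust
use std::collections::HashMap;

// Stand-in types: the crate's real Schema, ResultSet, and ExecutionError
// are not shown in this README, so these shapes are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    columns: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct ResultSet {
    schema: Schema,
    rows: Vec<Vec<String>>,
}

#[derive(Debug, PartialEq)]
enum ExecutionError {
    UnknownTable(String),
}

// Trait with the single method quoted in the README.
trait DataSource {
    fn scan(&self, table: &str, schema: &Schema) -> Result<ResultSet, ExecutionError>;
}

/// Hypothetical backend: named tables held in a HashMap.
struct MapSource {
    tables: HashMap<String, Vec<Vec<String>>>,
}

impl DataSource for MapSource {
    fn scan(&self, table: &str, schema: &Schema) -> Result<ResultSet, ExecutionError> {
        // Look up the table and fail with a typed error if it is missing.
        let rows = self
            .tables
            .get(table)
            .ok_or_else(|| ExecutionError::UnknownTable(table.to_string()))?;
        Ok(ResultSet { schema: schema.clone(), rows: rows.clone() })
    }
}

/// Builds a source holding one illustrative Parent table.
fn sample_source() -> MapSource {
    let mut tables = HashMap::new();
    tables.insert(
        "Parent".to_string(),
        vec![
            vec!["alice".to_string(), "bob".to_string()],
            vec!["bob".to_string(), "carol".to_string()],
        ],
    );
    MapSource { tables }
}

fn main() {
    let schema = Schema { columns: vec!["parent".to_string(), "child".to_string()] };
    let source = sample_source();
    println!("{:?}", source.scan("Parent", &schema));
    println!("{:?}", source.scan("Missing", &schema));
}
```

Because the executor depends only on the trait, a backend like this could in principle be swapped in without touching the planner.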

Intended Direction

The medium-term direction is to evolve this project into a more general query-engine playground with:

explicit front-end and parsing layers
internal planning representations
clearer separation between logical meaning and execution strategy
support for multiple query-engine experiments instead of only chase logic

The current code now includes an initial SQL front end, logical plan, and execution path. It is still intentionally narrow and should not be read as full SQL support.

Quickstart

Rust API

use query_engine::{Atom, Instance, Term, chase};
use query_engine::chase::rule::RuleBuilder;

// Seed the instance with two Parent facts.
let instance: Instance = vec![
    Atom::new("Parent", vec![Term::constant("alice"), Term::constant("bob")]),
    Atom::new("Parent", vec![Term::constant("bob"), Term::constant("carol")]),
]
.into_iter()
.collect();

// Base case: every parent is an ancestor.
let rule1 = RuleBuilder::new()
    .when("Parent", vec![Term::var("X"), Term::var("Y")])
    .then("Ancestor", vec![Term::var("X"), Term::var("Y")])
    .build();

// Recursive case: extend an ancestor chain by one Parent step.
let rule2 = RuleBuilder::new()
    .when("Ancestor", vec![Term::var("X"), Term::var("Y")])
    .when("Parent", vec![Term::var("Y"), Term::var("Z")])
    .then("Ancestor", vec![Term::var("X"), Term::var("Z")])
    .build();

// Run the chase over the instance with both rules.
let result = chase(instance, &[rule1, rule2]);

// Expect (alice, bob), (bob, carol), and the derived (alice, carol).
assert!(result.terminated);
assert_eq!(result.instance.facts_for_predicate("Ancestor").len(), 3);

CLI

cargo run -- repl
cargo run -- gui
cargo run -- script examples/scripts/ancestor.ech
cargo run -- script examples/scripts/sql_join.ech
cargo run --features tui -- tui

REPL language

fact Parent(alice, bob).
rule Parent(?X, ?Y) -> Ancestor(?X, ?Y).
schema Parent(parent, child).
sql SELECT * FROM Parent;
run.
query Ancestor(?X, ?Y)?
explain Ancestor(alice, carol)?
show facts
show rules
reset
help

Current SQL Slice

The repository now has a narrow SQL pipeline with:

predicate-backed catalog inference
relational schemas, rows, and values
SQL parsing for a small subset
logical planning
execution for filtering, grouping and aggregation, ordering, limiting, and basic multi-table joins

Currently supported examples:

SELECT * FROM Parent
SELECT c0 FROM Parent
SELECT c0 FROM Parent WHERE c1 = 'bob'
SELECT c0 FROM Parent WHERE c1 != 'bob'
SELECT c0 FROM Parent WHERE c1 = 'bob' AND c0 = 'alice'
SELECT c0 FROM Parent WHERE c1 = 'bob' OR c1 = 'carol'
SELECT c0 FROM Parent ORDER BY c0 DESC
SELECT c0 FROM Parent ORDER BY c0 ASC LIMIT 1
SELECT c0 AS parent_name, 'seed' AS label, 42 AS answer FROM Parent
SELECT Parent.parent, Ancestor.child
FROM Parent, Ancestor
WHERE Parent.child = Ancestor.parent
SELECT p.parent, q.child
FROM Parent AS p, Parent AS q
WHERE p.child = q.parent
SELECT COUNT(*) FROM Parent
SELECT dept, COUNT(*), SUM(salary) FROM Emp GROUP BY dept

In the REPL or script runner, use the sql command and end the statement with ;:

sql SELECT c0 FROM Parent WHERE c1 = 'bob';

In .ech scripts, fact, rule, schema, sql, query, and explain commands may span multiple lines, as long as the final line ends with the command's usual terminator (. for fact, rule, and schema; ; for sql; ? for query and explain).

You can also register stable column names for a predicate-backed table in the frontend before running SQL, including tables that currently have no facts:

schema Parent(parent, child).
sql SELECT parent FROM Parent WHERE child = 'bob';

For multi-table queries, qualify column names with the table name:

schema Parent(parent, child).
schema Ancestor(parent, child).
sql SELECT Parent.parent, Ancestor.child FROM Parent, Ancestor WHERE Parent.child = Ancestor.parent;

For self-joins or shorter qualification, use table aliases:

schema Parent(parent, child).
sql SELECT p.parent, q.child FROM Parent AS p, Parent AS q WHERE p.child = q.parent;
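
Conceptually, a comma-separated FROM list plus a WHERE filter behaves like a cross product followed by a selection. The sketch below spells that out for the self-join above using plain tuples; the function name `grandparents` and the tuple representation are hypothetical, and the engine's actual join execution may differ.

```rust
/// Evaluates
///   SELECT p.parent, q.child FROM Parent AS p, Parent AS q
///   WHERE p.child = q.parent
/// as a nested-loop cross product with a filter.
fn grandparents(parent: &[(&str, &str)]) -> Vec<(String, String)> {
    let mut out = Vec::new();
    // Outer loop plays the role of alias p.
    for (p_parent, p_child) in parent {
        // Inner loop plays the role of alias q over the same table.
        for (q_parent, q_child) in parent {
            // WHERE p.child = q.parent
            if p_child == q_parent {
                // SELECT p.parent, q.child
                out.push((p_parent.to_string(), q_child.to_string()));
            }
        }
    }
    out
}

fn main() {
    let parent = [("alice", "bob"), ("bob", "carol")];
    for (grandparent, grandchild) in grandparents(&parent) {
        println!("{grandparent} -> {grandchild}");
    }
}
```

The two aliases are what let the same table appear on both sides of the comparison.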

Current limits:

default column names are positional such as c0, c1
stable names require explicit catalog registration or schema ... in the frontend
single-table queries may also use the table name as a qualifier when no alias is present
joins currently use comma-separated tables plus WHERE filtering
multi-table queries require qualified column names such as Parent.child
table aliases are supported via FROM Parent AS p
WHERE supports =, !=/<>, AND, and OR (with standard precedence)
ORDER BY supports output-column ordering with ASC/DESC
LIMIT restricts the number of output rows
literals include strings, integers, and NULL
aggregates: COUNT(*), COUNT(col), SUM, MIN, MAX, AVG, with optional GROUP BY
projection aliases require the AS keyword
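
To make the aggregate semantics concrete, the following sketch mimics `SELECT dept, COUNT(*), SUM(salary) FROM Emp GROUP BY dept` over an in-memory slice. The function name `count_and_sum_by_dept` and the sample rows are made up for illustration, and the sorted output order comes from the `BTreeMap` used here, not from anything the engine guarantees.

```rust
use std::collections::BTreeMap;

/// Groups (dept, salary) rows by dept and returns (dept, COUNT(*), SUM(salary)).
fn count_and_sum_by_dept(emp: &[(&str, i64)]) -> Vec<(String, usize, i64)> {
    // One accumulator per group key: (row count, running salary sum).
    let mut groups: BTreeMap<&str, (usize, i64)> = BTreeMap::new();
    for &(dept, salary) in emp {
        let acc = groups.entry(dept).or_insert((0, 0));
        acc.0 += 1;
        acc.1 += salary;
    }
    // Emit one output row per group, in key order.
    groups
        .into_iter()
        .map(|(dept, (count, sum))| (dept.to_string(), count, sum))
        .collect()
}

fn main() {
    // Illustrative rows only; not data shipped with the repository.
    let emp = [("eng", 100), ("ops", 50), ("eng", 120)];
    for (dept, count, sum) in count_and_sum_by_dept(&emp) {
        println!("{dept}: count={count}, sum={sum}");
    }
}
```

MIN, MAX, and AVG would follow the same pattern with a different accumulator per group.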

Runnable SQL examples:

examples/scripts/sql_basic.ech
examples/scripts/sql_join.ech
examples/scripts/sql_self_join.ech
examples/scripts/sql_order_by.ech
examples/scripts/sql_filter_ops.ech

Development

For non-trivial changes, run:

cargo test
cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --check

Benchmarks live under benches/ and can be run with:

cargo bench

Notes

This repository is still centered on a rule-engine core. The new SQL-related modules are scaffolding for a broader query-engine direction, not a claim of feature-complete SQL support.

License

This project is licensed under the BSD 3-Clause License.
