## Query Engine

An experimental Rust project for building query-engine components.

Right now the repository is centered on a chase-based reasoning core, an
interactive frontend, and an early relational/SQL scaffold. The broader target
shape is a query engine with clearer front-end, planning, optimization, and
execution boundaries.

### Current scope

- Chase-based rule evaluation over facts, rules, and substitutions
- Restricted, standard, oblivious, and Skolem chase variants
- Optional semi-naive evaluation across all chase variants
- Provenance-oriented explanations for derived answers
- Script, REPL, local web UI, and optional TUI for experimentation (all with syntax highlighting)
- Relational schema, catalog, logical-plan, and execution scaffolding
- Physical operator scaffolding with a rule-based rewrite layer
- A SQL slice for `SELECT-FROM-WHERE-GROUP BY-ORDER BY-LIMIT` queries over predicate-backed tables, including `COUNT`, `SUM`, `MIN`, `MAX`, and `AVG` aggregates
- Filter push-down across joins in the physical rewrite pass

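
As a rough illustration of the semi-naive idea listed above, the sketch below
computes an ancestor closure over plain string pairs. Each round joins only the
facts derived in the previous round (the delta) against `Parent`, instead of
re-joining every known fact. It is standalone and does not use this crate's
types; `ancestors_semi_naive` and `Pair` are hypothetical names chosen for the
example, and the crate's own implementation may differ.

```rust
use std::collections::HashSet;

/// A binary fact such as Parent(alice, bob), stored as a pair of names.
type Pair = (String, String);

/// Computes the Ancestor relation from Parent facts with semi-naive iteration.
/// Hypothetical helper for illustration; not part of this crate's API.
fn ancestors_semi_naive(parents: &[Pair]) -> HashSet<Pair> {
    // Round 0: Parent(X, Y) -> Ancestor(X, Y).
    let mut all: HashSet<Pair> = parents.iter().cloned().collect();
    let mut delta: HashSet<Pair> = all.clone();

    while !delta.is_empty() {
        let mut next = HashSet::new();
        // Ancestor(X, Y), Parent(Y, Z) -> Ancestor(X, Z),
        // using only the newest Ancestor facts on the left side of the join.
        for (x, y) in &delta {
            for (p, z) in parents {
                if y == p {
                    let fact = (x.clone(), z.clone());
                    // Keep only facts that were not known before this round.
                    if !all.contains(&fact) {
                        next.insert(fact);
                    }
                }
            }
        }
        all.extend(next.iter().cloned());
        delta = next;
    }
    all
}
```

The loop stops once a round produces no new facts, so it also terminates on
cyclic `Parent` data.
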
### Architecture

The repository is currently organized around a few clear subsystems:

- `src/chase/`: rule-engine data structures, chase execution, and stratification
- `src/io/`: CSV-based fact import/export
- `src/frontend/`: REPL, script, GUI, and explanation rendering
- `src/relational/`: schemas, values, rows, and result sets
- `src/catalog/`: predicate-backed table metadata
- `src/sql/`: SQL AST and parser
- `src/planner/`: logical plan structures and SQL-to-plan translation
- `src/execution/`: execution for the current logical-plan subset, the `DataSource` trait, the `TableStore`, and a physical operator layer with rule-based rewrites

Today, the chase subsystem is still the most mature part of the codebase. The
relational and SQL modules are present to create clean extension points for a
broader query-engine architecture.

The executor operates on the `DataSource` trait rather than on the chase
`Instance` directly. This allows non-chase data sources to plug into the SQL
pipeline. The crate ships two implementations: `Instance` (chase-backed) and
`TableStore` (in-memory rows). Implementing `DataSource` for a new backend
requires a single method:

```rust
fn scan(&self, table: &str, schema: &Schema) -> Result<ResultSet, ExecutionError>;
```

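
To make the shape of such an implementation concrete, here is a minimal
standalone sketch. It assumes simplified stand-ins for `Schema`, `ResultSet`,
and `ExecutionError` (the crate's real definitions are likely richer), and
`MapSource` is a hypothetical backend invented for the example. Only the
`scan` signature is taken from the trait shown above.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the crate's types, defined here only so the
// sketch compiles on its own. The real definitions may differ.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    columns: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct ResultSet {
    schema: Schema,
    rows: Vec<Vec<String>>,
}

#[derive(Debug, PartialEq)]
enum ExecutionError {
    UnknownTable(String),
}

// Same single-method shape as the trait described above.
trait DataSource {
    fn scan(&self, table: &str, schema: &Schema) -> Result<ResultSet, ExecutionError>;
}

/// Hypothetical backend: table name -> rows held in memory.
struct MapSource {
    tables: HashMap<String, Vec<Vec<String>>>,
}

impl DataSource for MapSource {
    fn scan(&self, table: &str, schema: &Schema) -> Result<ResultSet, ExecutionError> {
        // Look the table up by name and report a missing table as an error.
        let rows = self
            .tables
            .get(table)
            .ok_or_else(|| ExecutionError::UnknownTable(table.to_string()))?;
        // Return a copy of the stored rows under the schema the caller asked for.
        Ok(ResultSet {
            schema: schema.clone(),
            rows: rows.clone(),
        })
    }
}
```

With this in place, a call such as `source.scan("Parent", &schema)` returns the
stored rows, and scanning an unregistered table yields
`ExecutionError::UnknownTable`.
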
### Intended Direction

The medium-term direction is to evolve this project into a more general
query-engine playground with:

- explicit front-end and parsing layers
- internal planning representations
- clearer separation between logical meaning and execution strategy
- support for multiple query-engine experiments instead of only chase logic

The current code now includes an initial SQL front end, logical plan, and
execution path. It is still intentionally narrow and should not be read as full
SQL support.

### Quickstart

#### Rust API

```rust
use query_engine::{Atom, Instance, Term, chase};
use query_engine::chase::rule::RuleBuilder;

// Seed the instance with two Parent facts.
let instance: Instance = vec![
    Atom::new("Parent", vec![Term::constant("alice"), Term::constant("bob")]),
    Atom::new("Parent", vec![Term::constant("bob"), Term::constant("carol")]),
]
.into_iter()
.collect();

// Base case: every parent is an ancestor.
let rule1 = RuleBuilder::new()
    .when("Parent", vec![Term::var("X"), Term::var("Y")])
    .then("Ancestor", vec![Term::var("X"), Term::var("Y")])
    .build();

// Recursive case: an ancestor of a parent is also an ancestor of the child.
let rule2 = RuleBuilder::new()
    .when("Ancestor", vec![Term::var("X"), Term::var("Y")])
    .when("Parent", vec![Term::var("Y"), Term::var("Z")])
    .then("Ancestor", vec![Term::var("X"), Term::var("Z")])
    .build();

let result = chase(instance, &[rule1, rule2]);

// Expected Ancestor facts: (alice, bob), (bob, carol), (alice, carol).
assert!(result.terminated);
assert_eq!(result.instance.facts_for_predicate("Ancestor").len(), 3);
```

#### CLI

```bash
cargo run -- repl
cargo run -- gui
cargo run -- script examples/scripts/ancestor.ech
cargo run -- script examples/scripts/sql_join.ech
cargo run --features tui -- tui
```

#### REPL language

```text
fact Parent(alice, bob).
rule Parent(?X, ?Y) -> Ancestor(?X, ?Y).
schema Parent(parent, child).
sql SELECT * FROM Parent;
run.
query Ancestor(?X, ?Y)?
explain Ancestor(alice, carol)?
show facts
show rules
reset
help
```

#### Current SQL Slice

The repository now has a narrow SQL pipeline with:

- predicate-backed catalog inference
- relational schemas, rows, and values
- SQL parsing for the supported subset
- logical planning
- execution for filtering, ordering, limiting, and basic multi-table joins

Currently supported examples:

```sql
SELECT * FROM Parent
SELECT c0 FROM Parent
SELECT c0 FROM Parent WHERE c1 = 'bob'
SELECT c0 FROM Parent WHERE c1 != 'bob'
SELECT c0 FROM Parent WHERE c1 = 'bob' AND c0 = 'alice'
SELECT c0 FROM Parent WHERE c1 = 'bob' OR c1 = 'carol'
SELECT c0 FROM Parent ORDER BY c0 DESC
SELECT c0 FROM Parent ORDER BY c0 ASC LIMIT 1
SELECT c0 AS parent_name, 'seed' AS label, 42 AS answer FROM Parent
SELECT Parent.parent, Ancestor.child
  FROM Parent, Ancestor
  WHERE Parent.child = Ancestor.parent
SELECT p.parent, q.child
  FROM Parent AS p, Parent AS q
  WHERE p.child = q.parent
SELECT COUNT(*) FROM Parent
SELECT dept, COUNT(*), SUM(salary) FROM Emp GROUP BY dept
```

In the REPL or script runner, use the `sql` command and end the statement with
`;`:

```text
sql SELECT c0 FROM Parent WHERE c1 = 'bob';
```

`fact`, `rule`, `schema`, `sql`, `query`, and `explain` commands may also span
multiple lines in `.ech` scripts as long as the final line ends with the normal
terminator.

You can also register stable column names for a predicate-backed table in the
frontend before running SQL, including tables that currently have no facts:

```text
schema Parent(parent, child).
sql SELECT parent FROM Parent WHERE child = 'bob';
```

For multi-table queries, qualify column names with the table name:

```text
schema Parent(parent, child).
schema Ancestor(parent, child).
sql SELECT Parent.parent, Ancestor.child FROM Parent, Ancestor WHERE Parent.child = Ancestor.parent;
```

For self-joins or shorter qualification, use table aliases:

```text
schema Parent(parent, child).
sql SELECT p.parent, q.child FROM Parent AS p, Parent AS q WHERE p.child = q.parent;
```

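
To show what the self-join above computes, the following standalone sketch
evaluates the same query as a nested loop over plain `(parent, child)` tuples.
It is only a model of the query's meaning: `grandparents` and `Row` are
hypothetical names, and the crate's executor may evaluate the join differently.

```rust
/// One Parent row: (parent, child).
type Row = (String, String);

/// Nested-loop evaluation of
/// `SELECT p.parent, q.child FROM Parent AS p, Parent AS q WHERE p.child = q.parent`.
/// Hypothetical helper for illustration; not part of this crate's API.
fn grandparents(parent: &[Row]) -> Vec<Row> {
    let mut out = Vec::new();
    // `p` and `q` range over the same table, mirroring the two aliases.
    for p in parent {
        for q in parent {
            // Join condition: p.child = q.parent
            if p.1 == q.0 {
                // Projection: p.parent, q.child
                out.push((p.0.clone(), q.1.clone()));
            }
        }
    }
    out
}
```

For the facts `Parent(alice, bob)` and `Parent(bob, carol)`, the only pair that
satisfies the join condition produces the row `(alice, carol)`.
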
Current limits:

- default column names are positional, such as `c0` and `c1`
- stable names require explicit catalog registration or `schema ...` in the frontend
- single-table queries may also use the table name as a qualifier when no alias is present
- joins currently use comma-separated tables plus `WHERE` filtering
- multi-table queries require qualified column names such as `Parent.child`
- table aliases are supported via `FROM Parent AS p`
- `WHERE` supports `=`, `!=`/`<>`, `AND`, and `OR` (with standard precedence)
- `ORDER BY` supports output-column ordering with `ASC`/`DESC`
- `LIMIT` restricts the number of output rows
- literals include strings, integers, and `NULL`
- aggregates: `COUNT(*)`, `COUNT(col)`, `SUM`, `MIN`, `MAX`, `AVG`, with optional `GROUP BY`
- projection aliases only via `AS`

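
"Standard precedence" for `WHERE` means that `AND` binds more tightly than
`OR`, so `a OR b AND c` groups as `a OR (b AND c)`. The sketch below shows one
common way to get that grouping, a two-level recursive-descent evaluator over
already-evaluated boolean operands. It is an assumption-laden illustration:
`Tok`, `eval`, and the `parse_*` helpers are hypothetical and are not taken
from the crate's SQL parser.

```rust
/// Tokens of a tiny boolean expression; operands are pre-evaluated comparisons.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Val(bool),
    And,
    Or,
}

/// or_expr := and_expr (OR and_expr)*
fn parse_or(toks: &[Tok], pos: &mut usize) -> Option<bool> {
    let mut acc = parse_and(toks, pos)?;
    while toks.get(*pos) == Some(&Tok::Or) {
        *pos += 1;
        // Parse the right-hand side first so tokens are always consumed.
        let rhs = parse_and(toks, pos)?;
        acc = acc || rhs;
    }
    Some(acc)
}

/// and_expr := value (AND value)*
/// Sitting below `parse_or` is what makes AND bind tighter than OR.
fn parse_and(toks: &[Tok], pos: &mut usize) -> Option<bool> {
    let mut acc = parse_val(toks, pos)?;
    while toks.get(*pos) == Some(&Tok::And) {
        *pos += 1;
        let rhs = parse_val(toks, pos)?;
        acc = acc && rhs;
    }
    Some(acc)
}

/// value := a single boolean operand
fn parse_val(toks: &[Tok], pos: &mut usize) -> Option<bool> {
    match toks.get(*pos) {
        Some(Tok::Val(b)) => {
            *pos += 1;
            Some(*b)
        }
        _ => None,
    }
}

/// Evaluates a whole token sequence; returns `None` on malformed input.
fn eval(toks: &[Tok]) -> Option<bool> {
    let mut pos = 0;
    let value = parse_or(toks, &mut pos)?;
    // Reject input with leftover tokens.
    if pos == toks.len() { Some(value) } else { None }
}
```

Evaluating `true OR false AND false` this way gives `true`, whereas a strict
left-to-right reading, `(true OR false) AND false`, would give `false`.
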
Runnable SQL examples:

- `examples/scripts/sql_basic.ech`
- `examples/scripts/sql_join.ech`
- `examples/scripts/sql_self_join.ech`
- `examples/scripts/sql_order_by.ech`
- `examples/scripts/sql_filter_ops.ech`

### Development

For non-trivial changes, run:

```bash
cargo test
cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --check
```

Benchmarks live under `benches/` and can be run with:

```bash
cargo bench
```

### Notes

This repository is still centered on a rule-engine core. The new SQL-related
modules are scaffolding for a broader query-engine direction, not a claim of
feature-complete SQL support.

### License

This project is licensed under [BSD-3](LICENSE).