## Query Ops
This crate provides a small set of query operators that can be used to implement a simple query-plan executor.
The operators are: **atom scan**, **semijoin**, and **natural join**.
### Public API
| Item | Kind | Description |
|------|------|-------------|
| `scan_atom(&Table, &AtomPattern) -> Relation` | function | Scans the table under the pattern and returns a binding relation with one column per distinct variable, in first-occurrence order. Literal positions and repeated variables filter rows during the scan. |
| `semijoin(&Relation, &Relation) -> Relation` | function | Returns the rows of `left` whose values on the columns shared with `right` also appear in `right`. The output column list is the same as `left.columns`. |
| `natural_join(&Relation, &Relation) -> Relation` | function | Returns every pair of `left` and `right` rows that agree on the shared columns. Each output row holds the columns of `left` followed by the non-shared columns of `right`. |
| `AtomPattern` | struct | Specifies, for each table column, either a variable to bind or a literal value to match. The pattern is a `Vec<Term>` whose length must equal the table's arity. |
| `Term` | enum | Represents one position of an `AtomPattern`. A term is either `Var(String)`, which binds the cell to a named variable, or `Lit(Value)`, which requires the cell to equal a given value. |
| `Relation` | struct | Holds rows over named columns and is the type produced by every operator. Construct it with `Relation::new(columns)` or `Relation::from_rows(columns, rows)`. Column names within a single relation must be unique. |
The foundational types `Table` (positional input rows of fixed arity) and `Value` (`Int(i64)`, `Str(String)`, or `Id(RowId)`) live in the [`storage`](../storage) crate; query-ops imports them.
Data types and their relationships:
<div align="center">
<picture>
<img alt="Types" src="docs/diagrams/types.svg" height="60%" width="60%">
</picture>
</div>
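To make the scan rules from the table concrete, here is a minimal, standalone sketch of how literal positions and repeated variables can filter rows. It is not the crate's implementation: `scan_sketch` is a hypothetical name, the `Value` and `Term` enums are simplified local stand-ins (no `Id` variant), and the function returns a plain `(columns, rows)` pair instead of a `Relation`.

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration only; the real types live in the
// `storage` and `query-ops` crates.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

enum Term {
    Var(String),
    Lit(Value),
}

/// Hypothetical scan over positional rows. Returns (columns, rows).
fn scan_sketch(rows: &[Vec<Value>], pattern: &[Term]) -> (Vec<String>, Vec<Vec<Value>>) {
    // Output columns: distinct variables in first-occurrence order.
    let mut columns: Vec<String> = Vec::new();
    for term in pattern {
        if let Term::Var(name) = term {
            if !columns.contains(name) {
                columns.push(name.clone());
            }
        }
    }

    let mut out = Vec::new();
    'next_row: for row in rows {
        assert_eq!(row.len(), pattern.len(), "pattern length must equal arity");
        let mut bound: HashMap<&str, &Value> = HashMap::new();
        for (cell, term) in row.iter().zip(pattern) {
            match term {
                // A literal position rejects rows whose cell differs.
                Term::Lit(expected) => {
                    if cell != expected {
                        continue 'next_row;
                    }
                }
                // A repeated variable must see the same value every time.
                Term::Var(name) => {
                    let slot = bound.entry(name.as_str()).or_insert(cell);
                    if *slot != cell {
                        continue 'next_row;
                    }
                }
            }
        }
        out.push(columns.iter().map(|c| bound[c.as_str()].clone()).collect());
    }
    (columns, out)
}

fn main() {
    let rows = vec![
        vec![Value::Str("a".to_string()), Value::Int(1), Value::Int(1)],
        vec![Value::Str("a".to_string()), Value::Int(1), Value::Int(2)],
        vec![Value::Str("b".to_string()), Value::Int(3), Value::Int(3)],
    ];
    // Pattern: first cell must be "a"; second and third cells must be equal.
    let pattern = vec![
        Term::Lit(Value::Str("a".to_string())),
        Term::Var("x".to_string()),
        Term::Var("x".to_string()),
    ];
    let (columns, out) = scan_sketch(&rows, &pattern);
    println!("{:?} {:?}", columns, out);
}
```

Only the first row survives here: the second fails the repeated-variable check and the third fails the literal check. The output has a single column `x`, because a variable contributes one column no matter how often it appears.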
### Example
The rule below returns the authors of every bestseller along with the book's price.
It uses all three operators:
- `scan_atom` for the three input tables,
- `semijoin` to keep only authors of bestsellers,
- and `natural_join` to attach each book's price.
```prolog
% Datalog rule/query
Q(name, book, dollars) :- author(name, book), bestseller(book), price(book, dollars).
```
The code below implements the rule (also available [here](tests/hand_plan.rs)):
```rust
use query_ops::atom::{AtomPattern, Term, scan_atom};
use query_ops::join::{natural_join, semijoin};
use storage::table::Table;
use storage::value::Value;

// Shorthand constructors for string and integer values.
fn s(x: &str) -> Value {
    Value::Str(x.to_string())
}

fn i(x: i64) -> Value {
    Value::Int(x)
}

fn main() {
    // Positional input tables: author(name, book), bestseller(book), price(book, dollars).
    let author = Table::from_rows(
        2,
        vec![
            vec![s("Ursula K. Le Guin"), s("A Wizard of Earthsea")],
            vec![s("Toni Morrison"), s("Beloved")],
            vec![s("Ursula K. Le Guin"), s("The Left Hand of Darkness")],
            vec![s("Terry Pratchett"), s("Mort")],
        ],
    );
    let bestseller = Table::from_rows(
        1,
        vec![
            vec![s("A Wizard of Earthsea")],
            vec![s("The Left Hand of Darkness")],
        ],
    );
    let price = Table::from_rows(
        2,
        vec![
            vec![s("A Wizard of Earthsea"), i(14)],
            vec![s("Beloved"), i(17)],
            vec![s("The Left Hand of Darkness"), i(15)],
            vec![s("Mort"), i(12)],
        ],
    );

    // Scan each table into a relation keyed by variable names.
    let author_rel = scan_atom(
        &author,
        &AtomPattern {
            columns: vec![Term::Var("name".to_string()), Term::Var("book".to_string())],
        },
    );
    let bestseller_rel = scan_atom(
        &bestseller,
        &AtomPattern {
            columns: vec![Term::Var("book".to_string())],
        },
    );
    let price_rel = scan_atom(
        &price,
        &AtomPattern {
            columns: vec![Term::Var("book".to_string()), Term::Var("dollars".to_string())],
        },
    );

    // Keep only authors of bestsellers, then attach each book's price.
    let authors_of_bestsellers = semijoin(&author_rel, &bestseller_rel);
    let result = natural_join(&authors_of_bestsellers, &price_rel);

    assert_eq!(
        result.columns,
        vec!["name".to_string(), "book".to_string(), "dollars".to_string()],
    );
    assert_eq!(
        result.rows,
        vec![
            vec![s("Ursula K. Le Guin"), s("A Wizard of Earthsea"), i(14)],
            vec![s("Ursula K. Le Guin"), s("The Left Hand of Darkness"), i(15)],
        ],
    );
}
```
How it works (logically):
<div align="center">
<picture>
<img alt="Workflow" src="docs/diagrams/workflow.svg" height="90%" width="90%">
</picture>
</div>
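The semijoin step of the example can be sketched in isolation. The code below is a standalone illustration under stated assumptions, not the crate's code: `Rel` is a hypothetical stand-in for `Relation` that stores plain strings, and `semijoin_sketch` is a hypothetical name. It shows the property the example relies on: rows of the left input are filtered by the right input, and the output keeps only the left input's columns.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for `Relation`, holding string cells only.
#[derive(Clone, Debug, PartialEq)]
struct Rel {
    columns: Vec<String>,
    rows: Vec<Vec<String>>,
}

/// Keeps the rows of `left` whose shared-column values also occur in `right`.
fn semijoin_sketch(left: &Rel, right: &Rel) -> Rel {
    // (left index, right index) for every column name present in both inputs.
    let shared: Vec<(usize, usize)> = left
        .columns
        .iter()
        .enumerate()
        .filter_map(|(li, name)| {
            right.columns.iter().position(|r| r == name).map(|ri| (li, ri))
        })
        .collect();

    // Every combination of shared-column values that appears in `right`.
    let keys: HashSet<Vec<String>> = right
        .rows
        .iter()
        .map(|r| shared.iter().map(|&(_, ri)| r[ri].clone()).collect())
        .collect();

    let rows = left
        .rows
        .iter()
        .filter(|l| {
            let key: Vec<String> = shared.iter().map(|&(li, _)| l[li].clone()).collect();
            keys.contains(&key)
        })
        .cloned()
        .collect();

    // The output columns are exactly the left input's columns.
    Rel { columns: left.columns.clone(), rows }
}

fn main() {
    let author = Rel {
        columns: vec!["name".to_string(), "book".to_string()],
        rows: vec![
            vec!["Ursula K. Le Guin".to_string(), "A Wizard of Earthsea".to_string()],
            vec!["Toni Morrison".to_string(), "Beloved".to_string()],
        ],
    };
    let bestseller = Rel {
        columns: vec!["book".to_string()],
        rows: vec![vec!["A Wizard of Earthsea".to_string()]],
    };
    let kept = semijoin_sketch(&author, &bestseller);
    println!("{:?}", kept);
}
```

In this sketch the "Beloved" row is dropped because its `book` value does not occur in `bestseller`, and `kept.columns` is still `["name", "book"]`.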
### Run the Tests
```sh
cargo test -p query-ops
```
### Notes
- **Tables versus relations:** A `Table` is positional (fixed arity with no column names), while a `Relation` is keyed by variable names. The atom
scan is the bridge that turns one into the other (see the example above), and every join after that operates on relations.
- **Joining is by column name:** `semijoin` and `natural_join` find shared columns by matching the strings in `Relation.columns`. Whether two
relations join on a column therefore depends on the variable name you chose in each `AtomPattern`. Picking the same `Term::Var(name)` in two
patterns is what makes them join on that column.
- **No projection operator yet:** `natural_join` always carries forward every column from both inputs, and `scan_atom` keeps every distinct variable
that appears in the pattern. There is currently no way to drop columns from a relation, so a result may include more columns than the Datalog rule
head implies.
- **Bulk, not streaming:** Each operator materializes its full output as a new `Relation` and returns it. Operators compose by passing the result of
one as input to the next: `natural_join(&semijoin(&a, &b), &scan_atom(&t, &p))`.
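The notes on name-based joining and the missing projection can be illustrated with a small standalone sketch. Everything in it is an assumption for illustration: `Rel` is a hypothetical stand-in for `Relation` with integer cells, `natural_join_sketch` is a naive nested-loop join written for this example, and `project_sketch` is a hypothetical helper showing what a projection could look like; the crate does not provide one.

```rust
// Hypothetical stand-in for `Relation`, holding integer cells only.
#[derive(Clone, Debug, PartialEq)]
struct Rel {
    columns: Vec<String>,
    rows: Vec<Vec<i64>>,
}

/// (left index, right index) for every column name present in both inputs.
fn shared_columns(left: &Rel, right: &Rel) -> Vec<(usize, usize)> {
    left.columns
        .iter()
        .enumerate()
        .filter_map(|(li, name)| {
            right.columns.iter().position(|r| r == name).map(|ri| (li, ri))
        })
        .collect()
}

/// Naive nested-loop natural join: left columns, then non-shared right columns.
fn natural_join_sketch(left: &Rel, right: &Rel) -> Rel {
    let shared = shared_columns(left, right);
    // Right-hand columns that are not shared, in their original order.
    let extra: Vec<usize> = (0..right.columns.len())
        .filter(|ri| !shared.iter().any(|(_, s)| s == ri))
        .collect();

    let mut columns = left.columns.clone();
    columns.extend(extra.iter().map(|&ri| right.columns[ri].clone()));

    let mut rows = Vec::new();
    for l in &left.rows {
        for r in &right.rows {
            if shared.iter().all(|&(li, ri)| l[li] == r[ri]) {
                let mut row = l.clone();
                row.extend(extra.iter().map(|&ri| r[ri]));
                rows.push(row);
            }
        }
    }
    Rel { columns, rows }
}

/// Hypothetical projection (not part of the crate). Does not deduplicate rows.
fn project_sketch(rel: &Rel, keep: &[&str]) -> Rel {
    let idx: Vec<usize> = keep
        .iter()
        .map(|k| {
            rel.columns
                .iter()
                .position(|c| c.as_str() == *k)
                .expect("unknown column")
        })
        .collect();
    Rel {
        columns: keep.iter().map(|k| k.to_string()).collect(),
        rows: rel.rows.iter().map(|r| idx.iter().map(|&i| r[i]).collect()).collect(),
    }
}

fn main() {
    let a = Rel {
        columns: vec!["x".to_string(), "y".to_string()],
        rows: vec![vec![1, 10], vec![2, 20]],
    };
    let b = Rel {
        columns: vec!["y".to_string(), "z".to_string()],
        rows: vec![vec![10, 100], vec![30, 300]],
    };
    // Both relations name a column "y", so the join matches on it.
    let joined = natural_join_sketch(&a, &b);
    println!("{:?}", joined);
    // Dropping "y" afterwards requires the hypothetical projection helper.
    println!("{:?}", project_sketch(&joined, &["x", "z"]));
}
```

If the first column of `b` were named `w` instead of `y`, the two relations would share no column name, and this sketch would return every pairing of their rows. That is why the variable names chosen in each pattern decide what a join matches on.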