Hassan Abedi 2026-06-03 12:24:10 +02:00
parent a1d46d37d1
commit 779f07789b

@ -1,12 +1,12 @@
## Query Ops
This crate provides a small set of query operators that can be used to implement a simple query-plan executor.
-The operators are: atom scan, semijoin, and natural join.
+The operators are: **atom scan**, **semijoin**, and **natural join**.
### Public API
| Item | Kind | Description |
-|--------------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+|--------------------------------------------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `scan_atom(&Table, &AtomPattern) -> Relation` | function | Scans the table under the pattern and returns a binding relation with one column per distinct variable in first-occurrence order. Literal positions and repeated variables filter rows during the scan. |
| `semijoin(&Relation, &Relation) -> Relation` | function | Returns the rows of `left` whose values on the columns shared with `right` also appear in `right`. The output column list is the same as `left.columns`. |
| `natural_join(&Relation, &Relation) -> Relation` | function | Returns every pair of `left` and `right` rows that agree on shared columns. Each output row holds the columns of `left` followed by the non-shared columns of `right`. |
@ -28,6 +28,7 @@ Data types and their relationships:
The rule below returns the authors of every bestseller along with the book's price.
It uses all three operators:
- `scan_atom` for the three input tables,
- `semijoin` to keep only authors of bestsellers,
- and `natural_join` to attach each book's price.
@ -119,3 +120,16 @@ How it works:
```sh
cargo test -p query-ops
```
### Notes
- **Tables versus relations:** A `Table` is positional (fixed arity with no column names), while a `Relation` is keyed by variable names. The atom
scan is the bridge that turns one into the other, and every join after that operates on relations.
- **Joining is by column name:** `semijoin` and `natural_join` find shared columns by matching the strings in `Relation.columns`. Whether two
relations join on a column therefore depends on the variable name you chose in each `AtomPattern`. Picking the same `Term::Var(name)` in two
patterns is what makes them join on that column.
- **No projection operator yet:** `natural_join` always carries forward every column from both inputs, and `scan_atom` keeps every distinct variable
that appears in the pattern. There is no way to drop columns from a relation today, so a result may include more columns than the Datalog rule head
implies.
- **Bulk, not streaming:** Each operator materializes its full output as a new `Relation` and returns it. Operators compose by passing the result of
one as input to the next: `natural_join(&semijoin(&a, &b), &scan_atom(&t, &p))`.
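
The name-based matching and bulk composition described in these notes can be sketched as follows. This is a minimal, self-contained illustration and not the crate's implementation: the `Relation` struct here is a hypothetical stand-in (the `rows` field and the `i64` value type are assumptions), and `rel`, `shared_columns`, and `demo` are helper names invented for the example. Only the function names `semijoin` and `natural_join` and the `columns` field come from the README.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `Relation`.
/// The `rows` field and the `i64` value type are assumptions of this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Relation {
    columns: Vec<String>,
    rows: Vec<Vec<i64>>,
}

/// Hypothetical helper that builds a relation from column names and rows.
fn rel(columns: &[&str], rows: Vec<Vec<i64>>) -> Relation {
    Relation { columns: columns.iter().map(|c| c.to_string()).collect(), rows }
}

/// Index pairs `(left_idx, right_idx)` for every column name found in both inputs.
fn shared_columns(left: &Relation, right: &Relation) -> Vec<(usize, usize)> {
    left.columns
        .iter()
        .enumerate()
        .filter_map(|(li, name)| {
            right.columns.iter().position(|c| c == name).map(|ri| (li, ri))
        })
        .collect()
}

/// Keeps the rows of `left` whose shared-column values also occur in `right`.
fn semijoin(left: &Relation, right: &Relation) -> Relation {
    let shared = shared_columns(left, right);
    // Collect the shared-column values of every right-hand row.
    let keys: HashSet<Vec<i64>> = right
        .rows
        .iter()
        .map(|r| shared.iter().map(|&(_, ri)| r[ri]).collect())
        .collect();
    let rows = left
        .rows
        .iter()
        .filter(|l| {
            let key: Vec<i64> = shared.iter().map(|&(li, _)| l[li]).collect();
            keys.contains(&key)
        })
        .cloned()
        .collect();
    Relation { columns: left.columns.clone(), rows }
}

/// Pairs rows that agree on shared columns; output is the columns of `left`
/// followed by the non-shared columns of `right`.
fn natural_join(left: &Relation, right: &Relation) -> Relation {
    let shared = shared_columns(left, right);
    // Right-hand columns that are not shared, in their original order.
    let extra: Vec<usize> = (0..right.columns.len())
        .filter(|ri| !shared.iter().any(|&(_, s)| s == *ri))
        .collect();
    let mut columns = left.columns.clone();
    columns.extend(extra.iter().map(|&ri| right.columns[ri].clone()));
    let mut rows = Vec::new();
    // Nested loop for clarity; every output row is materialized before returning.
    for l in &left.rows {
        for r in &right.rows {
            if shared.iter().all(|&(li, ri)| l[li] == r[ri]) {
                let mut row = l.clone();
                row.extend(extra.iter().map(|&ri| r[ri]));
                rows.push(row);
            }
        }
    }
    Relation { columns, rows }
}

/// Composes the operators by passing each materialized result to the next.
fn demo() -> Relation {
    // All three relations use the name "book", so they join on that column.
    let author = rel(&["book", "author"], vec![vec![1, 10], vec![2, 20], vec![3, 30]]);
    let bestseller = rel(&["book"], vec![vec![1], vec![3]]);
    let price = rel(&["book", "price"], vec![vec![1, 15], vec![2, 9], vec![3, 25]]);
    natural_join(&semijoin(&author, &bestseller), &price)
}
```

In this sketch, `demo()` yields the columns `book`, `author`, `price`, with `book` still present because nothing projects it away. If `price` had named its first column differently, the two inputs would share no column and the join would pair every left row with every right row.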