WIP
parent
a1d46d37d1
commit
779f07789b
@@ -1,20 +1,20 @@
|
|||||||
## Query Ops
|
## Query Ops
|
||||||
|
|
||||||
This crate provides a small set of query operators that can be used to implement a simple query-plan executor.
|
This crate provides a small set of query operators that can be used to implement a simple query-plan executor.
|
||||||
The operators are: atom scan, semijoin, and natural join.
|
The operators are: **atom scan**, **semijoin**, and **natural join**.
|
||||||
|
|
||||||
### Public API
|
### Public API
|
||||||
|
|
||||||
| Item | Kind | Description |
|
| Item | Kind | Description |
|
||||||
|--------------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|--------------------------------------------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||||
| `scan_atom(&Table, &AtomPattern) -> Relation` | function | Scans the table under the pattern and returns a binding relation with one column per distinct variable in first-occurrence order. Literal positions and repeated variables filter rows during the scan. |
|
| `scan_atom(&Table, &AtomPattern) -> Relation` | function | Scans the table under the pattern and returns a binding relation with one column per distinct variable in first-occurrence order. Literal positions and repeated variables filter rows during the scan. |
|
||||||
| `semijoin(&Relation, &Relation) -> Relation` | function | Returns the rows of `left` whose values on the columns shared with `right` also appear in `right`. The output column list is the same as `left.columns`. |
|
| `semijoin(&Relation, &Relation) -> Relation` | function | Returns the rows of `left` whose values on the columns shared with `right` also appear in `right`. The output column list is the same as `left.columns`. |
|
||||||
| `natural_join(&Relation, &Relation) -> Relation` | function | Returns every pair of `left` and `right` rows that agree on shared columns. Each output row holds the columns of `left` followed by the non-shared columns of `right`. |
|
| `natural_join(&Relation, &Relation) -> Relation` | function | Returns every pair of `left` and `right` rows that agree on shared columns. Each output row holds the columns of `left` followed by the non-shared columns of `right`. |
|
||||||
| `Table` | struct | Holds positional input rows of fixed arity and carries no column names. Construct it with `Table::new(arity)` or `Table::from_rows(arity, rows)`. |
|
| `Table` | struct | Holds positional input rows of fixed arity and carries no column names. Construct it with `Table::new(arity)` or `Table::from_rows(arity, rows)`. |
|
||||||
| `AtomPattern` | struct | Specifies, for each table column, either a variable to bind or a literal value to match. The pattern is a `Vec<Term>` whose length must equal the table's arity. |
|
| `AtomPattern` | struct | Specifies, for each table column, either a variable to bind or a literal value to match. The pattern is a `Vec<Term>` whose length must equal the table's arity. |
|
||||||
| `Term` | enum | Represents one position of an `AtomPattern`. A term is either `Var(String)` to bind the cell to a named variable, or `Lit(Value)` to require the cell to equal a given value. |
|
| `Term` | enum | Represents one position of an `AtomPattern`. A term is either `Var(String)` to bind the cell to a named variable, or `Lit(Value)` to require the cell to equal a given value. |
|
||||||
| `Relation` | struct | Holds rows over named columns and is the type produced by every operator. Construct it with `Relation::new(columns)` or `Relation::from_rows(columns, rows)`. Column names within a single relation must be unique. |
|
| `Relation` | struct | Holds rows over named columns and is the type produced by every operator. Construct it with `Relation::new(columns)` or `Relation::from_rows(columns, rows)`. Column names within a single relation must be unique. |
|
||||||
| `Value` | enum | Represents a single cell value stored in a `Table` or `Relation`. A value is either `Int(i64)` or `Str(String)`. |
|
| `Value` | enum | Represents a single cell value stored in a `Table` or `Relation`. A value is either `Int(i64)` or `Str(String)`. |
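The `scan_atom` row above packs several rules into one sentence, so a minimal sketch may help. It is not the crate's implementation: `scan_atom_sketch` is a hypothetical name, plain slices stand in for `Table` and `AtomPattern`, and the `rows` field on `Relation` is an assumption (only `columns` is named in this README).

```rust
// Hypothetical stand-ins for the crate's types; the real definitions may differ.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

#[derive(Clone, Debug, PartialEq)]
pub enum Term {
    Var(String),
    Lit(Value),
}

#[derive(Debug, PartialEq)]
pub struct Relation {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<Value>>, // assumed field name
}

/// Sketch of an atom scan: `table_rows` stands in for `Table`,
/// `pattern` for `AtomPattern`.
pub fn scan_atom_sketch(table_rows: &[Vec<Value>], pattern: &[Term]) -> Relation {
    // One output column per distinct variable, in first-occurrence order.
    let mut columns: Vec<String> = Vec::new();
    for term in pattern {
        if let Term::Var(name) = term {
            if !columns.contains(name) {
                columns.push(name.clone());
            }
        }
    }

    let mut rows: Vec<Vec<Value>> = Vec::new();
    'rows: for row in table_rows {
        assert_eq!(row.len(), pattern.len(), "pattern length must equal table arity");
        // Value bound to each output column so far for this input row.
        let mut bound: Vec<Option<Value>> = vec![None; columns.len()];
        for (cell, term) in row.iter().zip(pattern) {
            match term {
                // A literal position filters out rows whose cell differs.
                Term::Lit(expected) => {
                    if cell != expected {
                        continue 'rows;
                    }
                }
                Term::Var(name) => {
                    let idx = columns.iter().position(|c| c == name).unwrap();
                    match bound[idx].clone() {
                        // A repeated variable must see the same value again.
                        Some(prev) => {
                            if prev != *cell {
                                continue 'rows;
                            }
                        }
                        None => bound[idx] = Some(cell.clone()),
                    }
                }
            }
        }
        // Every column came from a variable in the pattern, so all are bound here.
        rows.push(bound.into_iter().map(|v| v.unwrap()).collect());
    }
    Relation { columns, rows }
}
```

Under these assumptions, a pattern such as `[Var("x"), Var("x"), Lit(Str("a"))]` yields a single column `x` and keeps only rows whose first two cells are equal and whose third cell is `"a"`.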
|
||||||
|
|
||||||
Data types and their relationships:
|
Data types and their relationships:
|
||||||
|
|
||||||
@@ -28,6 +28,7 @@ Data types and their relationships:
|
|||||||
|
|
||||||
The rule below returns the authors of every bestseller along with the book's price.
|
The rule below returns the authors of every bestseller along with the book's price.
|
||||||
It uses all three operators:
|
It uses all three operators:
|
||||||
|
|
||||||
- `scan_atom` for the three input tables,
|
- `scan_atom` for the three input tables,
|
||||||
- `semijoin` to keep only authors of bestsellers,
|
- `semijoin` to keep only authors of bestsellers,
|
||||||
- and `natural_join` to attach each book's price.
|
- and `natural_join` to attach each book's price.
|
||||||
@@ -119,3 +120,16 @@ How it works:
|
|||||||
```sh
|
```sh
|
||||||
cargo test -p query-ops
|
cargo test -p query-ops
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Notes
|
||||||
|
|
||||||
|
- **Tables versus relations:** A `Table` is positional (fixed arity with no column names), while a `Relation` is keyed by variable names. The atom
|
||||||
|
scan is the bridge that turns one into the other, and every join after that operates on relations.
|
||||||
|
- **Joining is by column name:** `semijoin` and `natural_join` find shared columns by matching the strings in `Relation.columns`. Whether two
|
||||||
|
relations join on a column therefore depends on the variable name you chose in each `AtomPattern`. Picking the same `Term::Var(name)` in two
|
||||||
|
patterns is what makes them join on that column.
|
||||||
|
- **No projection operator yet:** `natural_join` always carries forward every column from both inputs, and `scan_atom` keeps every distinct variable
|
||||||
|
that appears in the pattern. There is no way to drop columns from a relation today, so a result may include more columns than the Datalog rule head
|
||||||
|
implies.
|
||||||
|
- **Bulk, not streaming:** Each operator materializes its full output as a new `Relation` and returns it. Operators compose by passing the result of
|
||||||
|
one as input to the next: `natural_join(&semijoin(&a, &b), &scan_atom(&t, &p))`.
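The notes on name-based joining and bulk composition can be illustrated with a small sketch. It is an assumption-laden stand-in rather than the crate's code: `semijoin_sketch`, `natural_join_sketch`, and the helpers are hypothetical names, the `rows` field on `Relation` is assumed, and the nested-loop strategy was chosen for brevity, not because the crate is known to use it.

```rust
// Hypothetical stand-ins for the crate's types; the real definitions may differ.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

#[derive(Clone, Debug, PartialEq)]
pub struct Relation {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<Value>>, // assumed field name
}

/// For each column of `left` that also appears in `right`, returns the pair
/// (index in left, index in right). Matching is purely by column name.
fn shared_columns(left: &Relation, right: &Relation) -> Vec<(usize, usize)> {
    left.columns
        .iter()
        .enumerate()
        .filter_map(|(li, name)| {
            right.columns.iter().position(|c| c == name).map(|ri| (li, ri))
        })
        .collect()
}

/// True when the two rows hold equal values on every shared column.
fn agree(l: &[Value], r: &[Value], shared: &[(usize, usize)]) -> bool {
    shared.iter().all(|&(li, ri)| l[li] == r[ri])
}

/// Keeps the rows of `left` that match at least one row of `right`.
/// The output columns are exactly `left.columns`.
pub fn semijoin_sketch(left: &Relation, right: &Relation) -> Relation {
    let shared = shared_columns(left, right);
    let rows = left
        .rows
        .iter()
        .filter(|l| right.rows.iter().any(|r| agree(l, r, &shared)))
        .cloned()
        .collect();
    Relation { columns: left.columns.clone(), rows }
}

/// Pairs every `left` row with every agreeing `right` row. Output columns are
/// those of `left` followed by the non-shared columns of `right`.
pub fn natural_join_sketch(left: &Relation, right: &Relation) -> Relation {
    let shared = shared_columns(left, right);
    // Indices of the columns of `right` that `left` does not already have.
    let extra: Vec<usize> = (0..right.columns.len())
        .filter(|ri| !shared.iter().any(|&(_, s)| s == *ri))
        .collect();

    let mut columns = left.columns.clone();
    columns.extend(extra.iter().map(|&ri| right.columns[ri].clone()));

    // The full output is materialized before returning (bulk, not streaming).
    let mut rows = Vec::new();
    for l in &left.rows {
        for r in &right.rows {
            if agree(l, r, &shared) {
                let mut row = l.clone();
                row.extend(extra.iter().map(|&ri| r[ri].clone()));
                rows.push(row);
            }
        }
    }
    Relation { columns, rows }
}
```

Because each function returns an owned `Relation`, calls nest directly, for example `natural_join_sketch(&semijoin_sketch(&a, &b), &c)`. Renaming a column in one input changes which columns count as shared, which is why the variable names chosen in each pattern matter.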
|
||||||
|
|||||||