Query Ops
This crate provides a small set of query operators that can be used to implement a simple query-plan executor. The operators are: atom scan, semijoin, and natural join.
Public API
| Item | Kind | Description |
|---|---|---|
| `scan_atom(&Table, &AtomPattern) -> Relation` | function | Scans the table under the pattern and returns a binding relation with one column per distinct variable in first-occurrence order. Literal positions and repeated variables filter rows during the scan. |
| `semijoin(&Relation, &Relation) -> Relation` | function | Returns the rows of `left` whose values on the columns shared with `right` also appear in `right`. The output column list is the same as `left.columns`. |
| `natural_join(&Relation, &Relation) -> Relation` | function | Returns every pair of `left` and `right` rows that agree on shared columns. Each output row holds the columns of `left` followed by the non-shared columns of `right`. |
| `Table` | struct | Holds positional input rows of fixed arity and carries no column names. Construct it with `Table::new(arity)` or `Table::from_rows(arity, rows)`. |
| `AtomPattern` | struct | Specifies, for each table column, either a variable to bind or a literal value to match. The pattern is a `Vec<Term>` whose length must equal the table's arity. |
| `Term` | enum | Represents one position of an `AtomPattern`. A term is either `Var(String)` to bind the cell to a named variable, or `Lit(Value)` to require the cell to equal a given value. |
| `Relation` | struct | Holds rows over named columns and is the type produced by every operator. Construct it with `Relation::new(columns)` or `Relation::from_rows(columns, rows)`. Column names within a single relation must be unique. |
| `Value` | enum | Represents a single cell value stored in a `Table` or `Relation`. A value is either `Int(i64)` or `Str(String)`. |
Data types and their relationships:
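The type descriptions in the table can be sketched in Rust roughly as follows. This is an illustration inferred from the table, not the crate's source: the field layout is an assumption, `scan_sketch` is a hypothetical stand-in for `scan_atom` that takes plain slices instead of `Table` and `AtomPattern`, and it assumes every row has the same length as the pattern.

```rust
// Stand-in types inferred from the API table; the crate's real definitions may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
}

// One position of a pattern: bind a variable or match a literal.
pub enum Term {
    Var(String),
    Lit(Value),
}

// Rows over named columns.
pub struct Relation {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<Value>>,
}

// Hypothetical re-implementation of the scan, for illustration only.
pub fn scan_sketch(rows: &[Vec<Value>], pattern: &[Term]) -> Relation {
    // One output column per distinct variable, in first-occurrence order.
    let mut columns: Vec<String> = Vec::new();
    for term in pattern {
        if let Term::Var(name) = term {
            if !columns.contains(name) {
                columns.push(name.clone());
            }
        }
    }

    let mut out: Vec<Vec<Value>> = Vec::new();
    'rows: for row in rows {
        // binding[i] holds the value bound so far to columns[i].
        let mut binding: Vec<Option<&Value>> = vec![None; columns.len()];
        for (cell, term) in row.iter().zip(pattern) {
            match term {
                // A literal position rejects rows whose cell differs.
                Term::Lit(v) => {
                    if cell != v {
                        continue 'rows;
                    }
                }
                // A repeated variable rejects rows whose cells disagree.
                Term::Var(name) => {
                    let idx = columns.iter().position(|c| c == name).unwrap();
                    if binding[idx].map_or(false, |prev| prev != cell) {
                        continue 'rows;
                    }
                    binding[idx] = Some(cell);
                }
            }
        }
        out.push(binding.into_iter().map(|v| v.unwrap().clone()).collect());
    }
    Relation { columns, rows: out }
}
```

For the rows `(1, 1)`, `(1, 2)`, `(2, 2)`, the pattern `[Var("x"), Var("x")]` keeps only the first and third rows and yields a single column `x`, while `[Lit(Int(1)), Var("y")]` keeps the first two rows and yields a single column `y`.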
Example
The rule below returns the authors of every bestseller along with the book's price. It uses all three operators:
- `scan_atom` for the three input tables,
- `semijoin` to keep only authors of bestsellers, and
- `natural_join` to attach each book's price.
```
Q(name, book, dollars) :- author(name, book), bestseller(book), price(book, dollars).
```

```rust
use query_ops::atom::{AtomPattern, Term, scan_atom};
use query_ops::join::{natural_join, semijoin};
use query_ops::table::Table;
use query_ops::value::Value;

// Shorthand constructors for cell values.
fn s(x: &str) -> Value {
    Value::Str(x.to_string())
}

fn i(x: i64) -> Value {
    Value::Int(x)
}

fn main() {
    // Positional input tables: author(name, book), bestseller(book), price(book, dollars).
    let author = Table::from_rows(
        2,
        vec![
            vec![s("Alice"), s("Foo")],
            vec![s("Bob"), s("Bar")],
            vec![s("Alice"), s("Baz")],
            vec![s("Carol"), s("Qux")],
        ],
    );
    let bestseller = Table::from_rows(1, vec![vec![s("Foo")], vec![s("Baz")]]);
    let price = Table::from_rows(
        2,
        vec![
            vec![s("Foo"), i(25)],
            vec![s("Bar"), i(15)],
            vec![s("Baz"), i(30)],
            vec![s("Qux"), i(20)],
        ],
    );

    // Scan each table into a relation; the shared variable name "book" is what the joins match on.
    let author_rel = scan_atom(
        &author,
        &AtomPattern {
            columns: vec![Term::Var("name".to_string()), Term::Var("book".to_string())],
        },
    );
    let bestseller_rel = scan_atom(
        &bestseller,
        &AtomPattern {
            columns: vec![Term::Var("book".to_string())],
        },
    );
    let price_rel = scan_atom(
        &price,
        &AtomPattern {
            columns: vec![Term::Var("book".to_string()), Term::Var("dollars".to_string())],
        },
    );

    // Keep only authors whose book is a bestseller, then attach each book's price.
    let authors_of_bestsellers = semijoin(&author_rel, &bestseller_rel);
    let result = natural_join(&authors_of_bestsellers, &price_rel);

    assert_eq!(
        result.columns,
        vec!["name".to_string(), "book".to_string(), "dollars".to_string()],
    );
    assert_eq!(
        result.rows,
        vec![
            vec![s("Alice"), s("Foo"), i(25)],
            vec![s("Alice"), s("Baz"), i(30)],
        ],
    );
}
```
How it works:
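As a rough illustration of the semantics the API table gives for `semijoin` and `natural_join`, the sketch below re-implements both over a simplified stand-in type. `Rel`, `rel`, `shared`, `semijoin_sketch`, and `natural_join_sketch` are hypothetical names, cells are plain strings rather than `Value`, and the nested-loop strategy is an assumption chosen for brevity; the crate's actual implementation may differ.

```rust
// Simplified stand-in for a relation: named columns over string cells.
#[derive(Debug, Clone, PartialEq)]
pub struct Rel {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<String>>,
}

// Convenience constructor for the sketch.
pub fn rel(columns: &[&str], rows: Vec<Vec<&str>>) -> Rel {
    Rel {
        columns: columns.iter().map(|c| c.to_string()).collect(),
        rows: rows
            .into_iter()
            .map(|r| r.into_iter().map(String::from).collect())
            .collect(),
    }
}

// Index pairs (left, right) of the columns that carry the same name.
fn shared(left: &Rel, right: &Rel) -> Vec<(usize, usize)> {
    left.columns
        .iter()
        .enumerate()
        .filter_map(|(li, name)| {
            right.columns.iter().position(|c| c == name).map(|ri| (li, ri))
        })
        .collect()
}

// Keeps the left rows that have at least one matching right row.
pub fn semijoin_sketch(left: &Rel, right: &Rel) -> Rel {
    let keys = shared(left, right);
    let rows: Vec<Vec<String>> = left
        .rows
        .iter()
        .filter(|l| {
            right
                .rows
                .iter()
                .any(|r| keys.iter().all(|&(li, ri)| l[li] == r[ri]))
        })
        .cloned()
        .collect();
    Rel { columns: left.columns.clone(), rows }
}

// Pairs every left row with every right row that agrees on the shared columns.
pub fn natural_join_sketch(left: &Rel, right: &Rel) -> Rel {
    let keys = shared(left, right);
    // Right-hand column indexes that are not shared with left.
    let extra: Vec<usize> = (0..right.columns.len())
        .filter(|ri| !keys.iter().any(|&(_, k)| k == *ri))
        .collect();
    let mut columns = left.columns.clone();
    columns.extend(extra.iter().map(|&ri| right.columns[ri].clone()));

    let mut rows: Vec<Vec<String>> = Vec::new();
    for l in &left.rows {
        for r in &right.rows {
            if keys.iter().all(|&(li, ri)| l[li] == r[ri]) {
                // Output row: all left cells, then the non-shared right cells.
                let mut row = l.clone();
                row.extend(extra.iter().map(|&ri| r[ri].clone()));
                rows.push(row);
            }
        }
    }
    Rel { columns, rows }
}
```

Applied to the example data, `semijoin_sketch` over the author and bestseller relations keeps the two Alice rows, and `natural_join_sketch` with the price relation then appends the `dollars` column to each of them.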
Run the Tests
```sh
cargo test -p query-ops
```
Notes
- Tables versus relations: A `Table` is positional (fixed arity with no column names), while a `Relation` is keyed by variable names. The atom scan is the bridge that turns one into the other (see the example above), and every join after that operates on relations.
- Joining is by column name: `semijoin` and `natural_join` find shared columns by matching the strings in `Relation.columns`. Whether two relations join on a column therefore depends on the variable name you chose in each `AtomPattern`. Picking the same `Term::Var(name)` in two patterns is what makes them join on that column.
- No projection operator yet: `natural_join` always carries forward every column from both inputs, and `scan_atom` keeps every distinct variable that appears in the pattern. There is no way to drop columns from a relation today, so a result may include more columns than the Datalog rule head implies.
- Bulk, not streaming: Each operator materializes its full output as a new `Relation` and returns it. Operators compose by passing the result of one as input to the next: `natural_join(&semijoin(&a, &b), &scan_atom(&t, &p))`.
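Because the crate has no projection operator, a caller who needs fewer columns could trim them after the last join. The helper below is a hypothetical sketch, not part of the crate: `project_rows` is an invented name, and it works on plain column and row slices (such as the `columns` and `rows` fields read in the example) rather than on `Relation` itself. It does not remove duplicate rows that projection may produce.

```rust
/// Hypothetical caller-side helper: keeps only the columns named in `keep`,
/// in the order given. Returns `None` if any requested name is missing.
pub fn project_rows<T: Clone>(
    columns: &[String],
    rows: &[Vec<T>],
    keep: &[&str],
) -> Option<(Vec<String>, Vec<Vec<T>>)> {
    // Resolve each requested name to its column index.
    let idx: Option<Vec<usize>> = keep
        .iter()
        .map(|k| columns.iter().position(|c| c.as_str() == *k))
        .collect();
    let idx = idx?;

    let out_cols: Vec<String> = idx.iter().map(|&i| columns[i].clone()).collect();
    // Copy the selected cells of every row, preserving row order.
    let out_rows: Vec<Vec<T>> = rows
        .iter()
        .map(|r| idx.iter().map(|&i| r[i].clone()).collect::<Vec<T>>())
        .collect();
    Some((out_cols, out_rows))
}
```

With the example's result, keeping `["name", "dollars"]` would drop the `book` column and leave one `(name, dollars)` pair per original row.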