Hassan Abedi fa9f2ce50e WIP
2026-06-05 13:01:18 +02:00

Query Ops

This crate provides a small set of query operators that can be used to implement a simple query-plan executor. The operators are: atom scan, semijoin, and natural join.

Public API

| Item | Kind | Description |
|------|------|-------------|
| scan_atom(&Table, &AtomPattern) -> Relation | function | Scans the table under the pattern and returns a binding relation with one column per distinct variable in first-occurrence order. Literal positions and repeated variables filter rows during the scan. |
| semijoin(&Relation, &Relation) -> Relation | function | Returns the rows of left whose values on the columns shared with right also appear in right. The output column list is the same as left.columns. |
| natural_join(&Relation, &Relation) -> Relation | function | Returns every pair of left and right rows that agree on shared columns. Each output row holds the columns of left followed by the non-shared columns of right. |
| AtomPattern | struct | Specifies, for each table column, either a variable to bind or a literal value to match. The pattern is a Vec<Term> whose length must equal the table's arity. |
| Term | enum | Represents one position of an AtomPattern. A term is either Var(String) to bind the cell to a named variable, or Lit(Value) to require the cell to equal a given value. |
| Relation | struct | Holds rows over named columns and is the type produced by every operator. Construct it with Relation::new(columns) or Relation::from_rows(columns, rows). Column names within a single relation must be unique. |

The foundational types Table (positional input rows of fixed arity) and Value (Int(i64), Str(String), or Id(RowId)) live in the storage crate; query-ops imports them.
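
The filtering behaviour described for scan_atom can be modelled in a few lines. The sketch below is a standalone illustration, not the crate's implementation: it redefines simplified Value and Term types locally (the Id variant is omitted), and the function name scan_sketch and its tuple return type are assumptions made so the snippet compiles on its own.

```rust
use std::collections::HashMap;

// Stand-in types for this sketch only; the real `Value` lives in the
// `storage` crate and the real `Term` in `query_ops`.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

#[derive(Clone, Debug)]
enum Term {
    Var(String),
    Lit(Value),
}

/// Hypothetical model of an atom scan. Returns (columns, rows) with one
/// column per distinct variable, in first-occurrence order.
fn scan_sketch(rows: &[Vec<Value>], pattern: &[Term]) -> (Vec<String>, Vec<Vec<Value>>) {
    // Collect distinct variable names in the order they first appear.
    let mut columns: Vec<String> = Vec::new();
    for term in pattern {
        if let Term::Var(name) = term {
            if !columns.contains(name) {
                columns.push(name.clone());
            }
        }
    }

    let mut out = Vec::new();
    'rows: for row in rows {
        assert_eq!(row.len(), pattern.len(), "pattern length must equal arity");
        let mut binding: HashMap<&str, &Value> = HashMap::new();
        for (cell, term) in row.iter().zip(pattern) {
            match term {
                // A literal position keeps the row only if the cell matches.
                Term::Lit(v) => {
                    if cell != v {
                        continue 'rows;
                    }
                }
                // A repeated variable must see the same value each time.
                Term::Var(name) => match binding.get(name.as_str()).copied() {
                    Some(bound) if bound != cell => continue 'rows,
                    Some(_) => {}
                    None => {
                        binding.insert(name.as_str(), cell);
                    }
                },
            }
        }
        // Every variable is bound once the row has passed all checks.
        out.push(columns.iter().map(|c| binding[c.as_str()].clone()).collect());
    }
    (columns, out)
}
```

With the pattern [Var("x"), Var("x"), Lit(Int(1))], only rows whose first two cells are equal and whose third cell is Int(1) survive, and the output has the single column x.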

Data types and their relationships:

[Figure: Types]

Example

The rule below returns the authors of every bestseller along with the book's price. It uses all three operators:

  • scan_atom for the three input tables,
  • semijoin to keep only authors of bestsellers,
  • and natural_join to attach each book's price.

% Datalog rule/query
Q(name, book, dollars) :- author(name, book), bestseller(book), price(book, dollars).

The code below implements the rule:

use query_ops::atom::{AtomPattern, Term, scan_atom};
use query_ops::join::{natural_join, semijoin};
use storage::table::Table;
use storage::value::Value;

fn s(x: &str) -> Value {
    Value::Str(x.to_string())
}
fn i(x: i64) -> Value {
    Value::Int(x)
}

fn main() {
    let author = Table::from_rows(
        2,
        vec![
            vec![s("Ursula K. Le Guin"), s("A Wizard of Earthsea")],
            vec![s("Toni Morrison"), s("Beloved")],
            vec![s("Ursula K. Le Guin"), s("The Left Hand of Darkness")],
            vec![s("Terry Pratchett"), s("Mort")],
        ],
    );
    let bestseller = Table::from_rows(
        1,
        vec![
            vec![s("A Wizard of Earthsea")],
            vec![s("The Left Hand of Darkness")],
        ],
    );
    let price = Table::from_rows(
        2,
        vec![
            vec![s("A Wizard of Earthsea"), i(14)],
            vec![s("Beloved"), i(17)],
            vec![s("The Left Hand of Darkness"), i(15)],
            vec![s("Mort"), i(12)],
        ],
    );

    let author_rel = scan_atom(
        &author,
        &AtomPattern {
            columns: vec![Term::Var("name".to_string()), Term::Var("book".to_string())],
        },
    );
    let bestseller_rel = scan_atom(
        &bestseller,
        &AtomPattern {
            columns: vec![Term::Var("book".to_string())],
        },
    );
    let price_rel = scan_atom(
        &price,
        &AtomPattern {
            columns: vec![Term::Var("book".to_string()), Term::Var("dollars".to_string())],
        },
    );

    let authors_of_bestsellers = semijoin(&author_rel, &bestseller_rel);
    let result = natural_join(&authors_of_bestsellers, &price_rel);

    assert_eq!(
        result.columns,
        vec!["name".to_string(), "book".to_string(), "dollars".to_string()],
    );
    assert_eq!(
        result.rows,
        vec![
            vec![s("Ursula K. Le Guin"), s("A Wizard of Earthsea"), i(14)],
            vec![s("Ursula K. Le Guin"), s("The Left Hand of Darkness"), i(15)],
        ],
    );
}
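
To make the two join steps in the example concrete, here is a minimal standalone model of semijoin and natural_join that follows the descriptions in the Public API table. It is a sketch under stated assumptions: Rel is a stand-in for Relation with string cells, and shared, agree, semijoin_sketch, and natural_join_sketch are hypothetical names. The nested loops are chosen for readability; the crate's own implementation may use a different strategy.

```rust
// Stand-in for the crate's `Relation`; cells are plain strings to keep
// the sketch short.
#[derive(Clone, Debug, PartialEq)]
struct Rel {
    columns: Vec<String>,
    rows: Vec<Vec<String>>,
}

/// (left index, right index) for every column name present in both inputs.
fn shared(left: &Rel, right: &Rel) -> Vec<(usize, usize)> {
    left.columns
        .iter()
        .enumerate()
        .filter_map(|(li, name)| {
            right.columns.iter().position(|c| c == name).map(|ri| (li, ri))
        })
        .collect()
}

/// True when two rows hold equal values on every shared column.
fn agree(l: &[String], r: &[String], keys: &[(usize, usize)]) -> bool {
    keys.iter().all(|&(li, ri)| l[li] == r[ri])
}

/// Keeps the left rows that have at least one matching right row.
/// The output columns are exactly the left columns.
fn semijoin_sketch(left: &Rel, right: &Rel) -> Rel {
    let keys = shared(left, right);
    let rows = left
        .rows
        .iter()
        .filter(|l| right.rows.iter().any(|r| agree(l, r, &keys)))
        .cloned()
        .collect();
    Rel { columns: left.columns.clone(), rows }
}

/// Emits one row per agreeing (left, right) pair: the left columns
/// followed by the right columns that are not shared.
fn natural_join_sketch(left: &Rel, right: &Rel) -> Rel {
    let keys = shared(left, right);
    let extra: Vec<usize> = (0..right.columns.len())
        .filter(|ri| !keys.iter().any(|&(_, k)| k == *ri))
        .collect();
    let mut columns = left.columns.clone();
    columns.extend(extra.iter().map(|&ri| right.columns[ri].clone()));
    let mut rows = Vec::new();
    for l in &left.rows {
        for r in &right.rows {
            if agree(l, r, &keys) {
                let mut row = l.clone();
                row.extend(extra.iter().map(|&ri| r[ri].clone()));
                rows.push(row);
            }
        }
    }
    Rel { columns, rows }
}
```

In this model, column matching is purely by name, which is why the example binds the variable book in all three patterns.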

How it works (logically):

[Figure: Workflow]

Run the Tests

cargo test -p query-ops

Notes

  • Tables versus relations: A Table is positional (fixed arity with no column names), while a Relation is keyed by variable names. The atom scan is the bridge that turns one into the other (see the example above), and every join after that operates on relations.
  • Joining is by column name: semijoin and natural_join find shared columns by matching the strings in Relation.columns. Whether two relations join on a column therefore depends on the variable name you chose in each AtomPattern. Picking the same Term::Var(name) in two patterns is what makes them join on that column.
  • No projection operator yet: natural_join always carries forward every column from both inputs, and scan_atom keeps every distinct variable that appears in the pattern. There is no way to drop columns from a relation today, so a result may include more columns than the Datalog rule head implies.
  • Bulk, not streaming: Each operator materializes its full output as a new Relation and returns it. Operators compose by passing the result of one as input to the next: natural_join(&semijoin(&a, &b), &scan_atom(&t, &p)).
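
Because the crate has no projection operator, a caller who needs exactly the columns of a rule head has to trim the result by hand. The sketch below shows one way this could look. It is not part of the crate: Rel is a local stand-in for Relation (assuming the columns and rows fields used in the example), and project_sketch is a hypothetical helper name.

```rust
// Stand-in for the crate's `Relation`, assuming `columns` and `rows`
// fields like those the example asserts on.
#[derive(Clone, Debug, PartialEq)]
struct Rel {
    columns: Vec<String>,
    rows: Vec<Vec<String>>,
}

/// Hypothetical helper: keeps only the `keep` columns, in the given order.
/// Returns None if a requested column does not exist.
/// Duplicate rows are kept as they are; no deduplication is attempted.
fn project_sketch(rel: &Rel, keep: &[&str]) -> Option<Rel> {
    // Resolve each requested name to its position in the input relation.
    let idx: Vec<usize> = keep
        .iter()
        .map(|name| rel.columns.iter().position(|c| c.as_str() == *name))
        .collect::<Option<Vec<usize>>>()?;
    // Copy the selected cells of every row in the requested order.
    let rows = rel
        .rows
        .iter()
        .map(|row| idx.iter().map(|&i| row[i].clone()).collect())
        .collect();
    Some(Rel {
        columns: keep.iter().map(|s| s.to_string()).collect(),
        rows,
    })
}
```

Calling project_sketch(&rel, &["dollars", "name"]) on a relation with columns name, book, and dollars would return a two-column relation with the cells reordered to match.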