useful-notes/hassan/008-chase-algorithm.md

44 lines
3.9 KiB
Markdown

## Chase Algorithm Resources
### Papers
| Paper | Authors | Link |
|:--------------------------------------------|:-------------------------|:-------------------------------------------------------------------------------------|
| The Theory of Joins in Relational Databases | Aho, Beeri & Ullman | [dl.acm.org/doi/10.1145/320083.320091](https://dl.acm.org/doi/10.1145/320083.320091) |
| Testing Implications of Data Dependencies | Maier, Mendelzon & Sagiv | [dl.acm.org/doi/10.1145/320107.320115](https://dl.acm.org/doi/10.1145/320107.320115) |
### Algorithm Variants
| Variant | Output Size | Speed per Step | Satisfaction Check | Used By | Best For |
|:---------------|:------------|:---------------|:-------------------|:------------------|:----------------------------------------------|
| Oblivious | Largest | Fastest | None | — | Prototyping; when output size doesn't matter |
| Skolem | Medium | Fast | Implicit (Skolem terms) | VLog, Graal | Parallelism; data exchange; deterministic runs |
| Restricted | Smallest | Slower | Explicit (homomorphism check) | RDFox, VLog | Production use; query performance over large KGs |
| Core | Most compact | Slowest | Full core computation | — (theoretical) | Theoretical analysis; minimum representations |
#### Variant Notes
**Oblivious Chase** fires every applicable rule without checking whether the conclusion already holds. Simple to implement but produces redundant facts and is the hardest to guarantee termination for.
**Skolem Chase** uses deterministic Skolem functions to generate nulls — the same rule trigger always produces the same null. This makes it idempotent and easy to parallelize, with no need to track which rules have already fired.
**Restricted Chase** checks before firing whether the rule head is already satisfied via a homomorphism into the current database. Never adds redundant facts, producing the smallest universal solution. The homomorphism check is NP-hard in general but fast in practice with good indexing.
**Core Chase** goes further than the restricted chase, computing the logical core — the most compressed possible universal solution. Theoretically important but computationally expensive; not used in production engines.
### Chase Query Engines
| Engine | Organization | Chase Variant | Home Page |
|:---------|:-------------------------------|:-------------------|:-------------------------------------------------------------------------------------|
| RDFox | Oxford Semantic Technologies | Restricted | [oxfordsemantic.tech/rdfox](https://oxfordsemantic.tech/rdfox) |
| VLog | TU Dresden / VU Amsterdam | Restricted + Skolem | [github.com/karmaresearch/vlog](https://github.com/karmaresearch/vlog) |
| Graal | LIRMM, Montpellier | Skolem + Oblivious | [graphik-team.github.io/graal](https://graphik-team.github.io/graal) |
| Vadalog | University of Oxford / TU Wien | Warded (Restricted variant) | [vadalog-system.com](https://vadalog-system.com) |
| PDQ | University of Oxford | Skolem | [github.com/michaelbenedict/pdq](https://github.com/michaelbenedict/pdq) |
| Llunatic | University of Genova | Skolem | [github.com/donatellosantoro/Llunatic](https://github.com/donatellosantoro/Llunatic) |
## Changelog
* **Mar 6, 2026** -- Added algorithm variants section; added chase variant column to engines table.
* **Mar 5, 2026** -- Re-formatted the table for the papers section.
* **Mar 5, 2026** -- The first version was created.