285 lines
8.8 KiB
Markdown
285 lines
8.8 KiB
Markdown
|
|
# Logic Primer
|
|||
|
|
|
|||
|
|
A reference for the core concepts of first-order logic and database theory.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Building Blocks
|
|||
|
|
|
|||
|
|
### Constants
|
|||
|
|
Specific known values — real things in your database.
|
|||
|
|
```
|
|||
|
|
Alice, Bob, Engineering, 42
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Variables
|
|||
|
|
Placeholders for unknown values, usually written in uppercase or with `?`.
|
|||
|
|
```
|
|||
|
|
X, Y, ?person, ?dept
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Terms
|
|||
|
|
A term is either a constant or a variable. Anything that can fill an argument slot in a predicate.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Predicates and Atoms
|
|||
|
|
|
|||
|
|
A **predicate** is a named relation applied to a list of terms:
|
|||
|
|
```
|
|||
|
|
Employee(X, Y) -- predicate "Employee", arity 2
|
|||
|
|
Likes(X, Y, Z) -- predicate "Likes", arity 3
|
|||
|
|
Person(Alice) -- predicate "Person", arity 1
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
The number of arguments a predicate takes is its **arity**.
|
|||
|
|
|
|||
|
|
An **atom** is a predicate applied to specific terms. When all terms are constants it is called a **ground atom** — a concrete fact in the database:
|
|||
|
|
```
|
|||
|
|
Employee(Alice, Engineering) -- ground atom
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Rules
|
|||
|
|
|
|||
|
|
A **rule** is an if-then statement built from atoms, split into two parts:
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
body → head
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
| Part | Also Called | Meaning |
|
|||
|
|
|:-----|:------------|:--------|
|
|||
|
|
| Body | Antecedent, LHS | Conditions that must hold |
|
|||
|
|
| Head | Consequent, RHS | What must be true if body holds |
|
|||
|
|
|
|||
|
|
Example:
|
|||
|
|
```
|
|||
|
|
Employee(X, Y), Department(Y) → ∃Z. ManagedBy(Y, Z)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
Variables in the head that do not appear in the body are **existential variables** — they represent unknown values that must exist, but whose identity is not known.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Quantifiers
|
|||
|
|
|
|||
|
|
| Symbol | Name | Meaning | Example |
|
|||
|
|
|:-------|:-----|:--------|:--------|
|
|||
|
|
| `∀` | Universal | "For all..." | `∀X. Person(X) → Mortal(X)` |
|
|||
|
|
| `∃` | Existential | "There exists some..." | `∀X. Person(X) → ∃Y. MotherOf(X, Y)` |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Connectives
|
|||
|
|
|
|||
|
|
| Symbol | Name | Meaning |
|
|||
|
|
|:-------|:-----|:--------|
|
|||
|
|
| `∧` | Conjunction | AND |
|
|||
|
|
| `∨` | Disjunction | OR |
|
|||
|
|
| `¬` | Negation | NOT |
|
|||
|
|
| `→` | Implication | IF...THEN |
|
|||
|
|
| `↔` | Biconditional | IF AND ONLY IF |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Key Concepts
|
|||
|
|
|
|||
|
|
### Substitution
|
|||
|
|
A mapping from variables to terms. Written as `σ`:
|
|||
|
|
```
|
|||
|
|
σ = { X → Alice, Y → Engineering }
|
|||
|
|
|
|||
|
|
Employee(X, Y) under σ = Employee(Alice, Engineering)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Homomorphism
|
|||
|
|
A substitution that maps one set of atoms into another, preserving structure. Used in the chase to check whether a rule head is already satisfied in the database.
|
|||
|
|
|
|||
|
|
### Ground Instance
|
|||
|
|
An atom or rule with all variables replaced by constants — no variables remaining.
|
|||
|
|
|
|||
|
|
### Herbrand Universe
|
|||
|
|
The set of all ground terms constructable from the constants and function symbols in a formula. Represents the "world" of all possible values.
|
|||
|
|
|
|||
|
|
### Skolem Term / Skolem Function
|
|||
|
|
A named placeholder for an existentially quantified variable, constructed deterministically from the values that triggered it:
|
|||
|
|
```
|
|||
|
|
∀X. Person(X) → ∃Y. MotherOf(X, Y)
|
|||
|
|
|
|||
|
|
Skolemized: MotherOf(X, mother_of(X))
|
|||
|
|
```
|
|||
|
|
The Skolem term `mother_of(X)` means "the mother of X, whoever that is." Same input always produces the same term.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Types of Dependencies (Rules)
|
|||
|
|
|
|||
|
|
| Type | Abbreviation | Form | Meaning |
|
|||
|
|
|:-----|:-------------|:-----|:--------|
|
|||
|
|
| Tuple-generating dependency | TGD | `body → ∃z. head` | If body holds, some new tuple must exist |
|
|||
|
|
| Equality-generating dependency | EGD | `body → x = y` | If body holds, two values must be equal |
|
|||
|
|
| Functional dependency | FD | Special EGD | A set of attributes uniquely determines another |
|
|||
|
|
| Full dependency | — | TGD with no existentials | Head variables all appear in body |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Rule Classes and Termination
|
|||
|
|
|
|||
|
|
| Rule Class | Existentials | Termination Guarantee | Notes |
|
|||
|
|
|:-----------|:-------------|:----------------------|:------|
|
|||
|
|
| Datalog | No | Always | Core of logic programming |
|
|||
|
|
| Weakly acyclic | Yes | Yes | No cyclic value propagation through existentials |
|
|||
|
|
| Guarded | Yes | No (but decidable) | Existential vars "guarded" by a body atom |
|
|||
|
|
| Frontier-one | Yes | No (but decidable) | At most one frontier variable per rule |
|
|||
|
|
| General TGDs | Yes | Undecidable | No restrictions |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Datalog
|
|||
|
|
|
|||
|
|
Datalog is a logic programming language and the most important rule class for the chase. It is a restriction of first-order logic with no function symbols and no existential variables in rule heads.
|
|||
|
|
|
|||
|
|
### Syntax
|
|||
|
|
|
|||
|
|
A Datalog program consists of:
|
|||
|
|
- **Facts** — ground atoms representing known data
|
|||
|
|
- **Rules** — if-then statements deriving new facts from existing ones
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
% Facts
|
|||
|
|
Employee(alice, engineering).
|
|||
|
|
Employee(bob, marketing).
|
|||
|
|
Department(engineering).
|
|||
|
|
|
|||
|
|
% Rules
|
|||
|
|
WorksIn(X, D) :- Employee(X, D), Department(D).
|
|||
|
|
Colleague(X, Y) :- WorksIn(X, D), WorksIn(Y, D), X ≠ Y.
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Key Properties
|
|||
|
|
|
|||
|
|
| Property | Value |
|
|||
|
|
|:---------|:------|
|
|||
|
|
| Existentials in head | No |
|
|||
|
|
| Function symbols | No |
|
|||
|
|
| Negation | Stratified only |
|
|||
|
|
| Termination | Always |
|
|||
|
|
| Data complexity | PTIME |
|
|||
|
|
| Evaluation strategy | Bottom-up (forward chaining) or top-down (backward chaining) |
|
|||
|
|
|
|||
|
|
### Evaluation
|
|||
|
|
|
|||
|
|
Datalog is evaluated by computing the **least fixed point** — repeatedly applying all rules until no new facts are derived. This is exactly the chase with no existentials:
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
Iteration 0: { Employee(alice, engineering), Employee(bob, marketing), Department(engineering) }
|
|||
|
|
Iteration 1: + WorksIn(alice, engineering)
|
|||
|
|
Iteration 2: + Colleague(alice, bob), Colleague(bob, alice)
|
|||
|
|
Iteration 3: nothing new → stop
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Datalog vs SQL
|
|||
|
|
|
|||
|
|
Datalog can express **recursive queries** naturally, which SQL cannot without special extensions (like recursive CTEs). For example, computing transitive closure (all ancestors of a person) is trivial in Datalog:
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
Ancestor(X, Y) :- Parent(X, Y).
|
|||
|
|
Ancestor(X, Y) :- Parent(X, Z), Ancestor(Z, Y).
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Datalog and the Chase
|
|||
|
|
|
|||
|
|
Pure Datalog evaluation is a restricted chase with no existential variables — all chase variants behave identically. The fixed point is always reached in at most O(n^k) steps.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Geometric Logic
|
|||
|
|
|
|||
|
|
Geometric logic is a fragment of first-order logic that is particularly well-suited to the chase. It is the theoretical foundation for most modern chase-based reasoning engines.
|
|||
|
|
|
|||
|
|
### Definition
|
|||
|
|
|
|||
|
|
A formula is **geometric** if it uses only:
|
|||
|
|
|
|||
|
|
| Allowed | Not Allowed |
|
|||
|
|
|:--------|:-----------|
|
|||
|
|
| Conjunction `∧` | Universal quantifier `∀` in head |
|
|||
|
|
| Disjunction `∨` | Negation `¬` |
|
|||
|
|
| Existential quantifier `∃` | Implication `→` in head |
|
|||
|
|
| Equality `=` | Infinite conjunctions in head |
|
|||
|
|
|
|||
|
|
A **geometric sequent** (rule) has the form:
|
|||
|
|
```
|
|||
|
|
φ ⊢ ψ
|
|||
|
|
|
|||
|
|
where φ and ψ are geometric formulas
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
Which is shorthand for: `∀x. φ(x) → ψ(x)`
|
|||
|
|
|
|||
|
|
### Examples
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
-- Every employee works in some department
|
|||
|
|
Employee(X) ⊢ ∃Y. WorksIn(X, Y)
|
|||
|
|
|
|||
|
|
-- Every department has a manager or is a sub-department
|
|||
|
|
Department(X) ⊢ (∃Y. ManagedBy(X, Y)) ∨ (∃Z. SubDeptOf(X, Z))
|
|||
|
|
|
|||
|
|
-- If two things are equal and one is an employee, so is the other
|
|||
|
|
Employee(X) ∧ X = Y ⊢ Employee(Y)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Why Geometric Logic Matters
|
|||
|
|
|
|||
|
|
Geometric logic has a special property: if a geometric formula is true in a model, it remains true in any **extension** of that model (adding more facts never makes it false). This is called **monotonicity** and is exactly what makes the chase work — the chase only adds facts, never removes them.
|
|||
|
|
|
|||
|
|
This property means:
|
|||
|
|
- The chase directly computes models of geometric theories
|
|||
|
|
- Every geometric theory has a **canonical model** — the one the chase builds
|
|||
|
|
- Query answers over the chase result are guaranteed correct for any model of the theory
|
|||
|
|
|
|||
|
|
### Geometric Logic vs Other Fragments
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
First-Order Logic
|
|||
|
|
└── Geometric Logic (∧, ∨, ∃ only — monotone)
|
|||
|
|
└── Coherent Logic (same but finitary — finite disjunctions only)
|
|||
|
|
└── Datalog (no existentials in head, no disjunction)
|
|||
|
|
└── Horn Clauses (at most one atom in head)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Connection to the Chase
|
|||
|
|
|
|||
|
|
Geometric sequents map directly onto TGDs (tuple-generating dependencies):
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
Geometric sequent: Employee(X) ⊢ ∃Y. WorksIn(X, Y)
|
|||
|
|
TGD: Employee(X) → ∃Y. WorksIn(X, Y)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
The chase is the standard procedure for building models of geometric theories. Given a set of geometric axioms and an initial database, the chase constructs the minimal model satisfying all axioms — which is exactly the universal solution used for query answering.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Logic Hierarchy
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
Propositional Logic -- no variables, just true/false statements
|
|||
|
|
↓ (add variables and quantifiers)
|
|||
|
|
First-Order Logic (FOL) -- variables, predicates, ∀, ∃
|
|||
|
|
↓ (restrict to certain rule forms)
|
|||
|
|
Geometric Logic -- only ∧, ∨, ∃ — no ∀ in head, no ¬
|
|||
|
|
↓ (remove existentials from head)
|
|||
|
|
Datalog -- function-free, no existentials in head
|
|||
|
|
↓ (remove disjunction)
|
|||
|
|
Horn Clauses -- at most one atom in head
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Changelog
|
|||
|
|
|
|||
|
|
* **Mar 6, 2026** -- First version created.
|