From 9f4e471223dfe910ea31bbb5b671ebbe44f8ee3a Mon Sep 17 00:00:00 2001 From: Hassan Abedi Date: Fri, 6 Mar 2026 12:24:51 +0000 Subject: [PATCH] Add document about relevant logical concepts --- hassan/009-logic-primer.md | 285 +++++++++++++++++++++++++++++++++++++ 1 file changed, 285 insertions(+) create mode 100644 hassan/009-logic-primer.md diff --git a/hassan/009-logic-primer.md b/hassan/009-logic-primer.md new file mode 100644 index 0000000..05bc13c --- /dev/null +++ b/hassan/009-logic-primer.md @@ -0,0 +1,285 @@ +# Logic Primer + +A reference for the core concepts of first-order logic and database theory. + +--- + +## Building Blocks + +### Constants +Specific known values — real things in your database. +``` +Alice, Bob, Engineering, 42 +``` + +### Variables +Placeholders for unknown values, usually written in uppercase or with `?`. +``` +X, Y, ?person, ?dept +``` + +### Terms +A term is either a constant or a variable. Anything that can fill an argument slot in a predicate. + +--- + +## Predicates and Atoms + +A **predicate** is a named relation applied to a list of terms: +``` +Employee(X, Y) -- predicate "Employee", arity 2 +Likes(X, Y, Z) -- predicate "Likes", arity 3 +Person(Alice) -- predicate "Person", arity 1 +``` + +The number of arguments a predicate takes is its **arity**. + +An **atom** is a predicate applied to specific terms. When all terms are constants it is called a **ground atom** — a concrete fact in the database: +``` +Employee(Alice, Engineering) -- ground atom +``` + +--- + +## Rules + +A **rule** is an if-then statement built from atoms, split into two parts: + +``` +body → head +``` + +| Part | Also Called | Meaning | +|:-----|:------------|:--------| +| Body | Antecedent, LHS | Conditions that must hold | +| Head | Consequent, RHS | What must be true if body holds | + +Example: +``` +Employee(X, Y), Department(Y) → ∃Z. ManagedBy(Y, Z) +``` + +Variables in the head that do not appear in the body are **existential variables** — they represent unknown values that must exist, but whose identity is not known. + +--- + +## Quantifiers + +| Symbol | Name | Meaning | Example | +|:-------|:-----|:--------|:--------| +| `∀` | Universal | "For all..." | `∀X. Person(X) → Mortal(X)` | +| `∃` | Existential | "There exists some..." | `∀X. Person(X) → ∃Y. MotherOf(X, Y)` | + +--- + +## Connectives + +| Symbol | Name | Meaning | +|:-------|:-----|:--------| +| `∧` | Conjunction | AND | +| `∨` | Disjunction | OR | +| `¬` | Negation | NOT | +| `→` | Implication | IF...THEN | +| `↔` | Biconditional | IF AND ONLY IF | + +--- + +## Key Concepts + +### Substitution +A mapping from variables to terms. Written as `σ`: +``` +σ = { X → Alice, Y → Engineering } + +Employee(X, Y) under σ = Employee(Alice, Engineering) +``` + +### Homomorphism +A substitution that maps one set of atoms into another, preserving structure. Used in the chase to check whether a rule head is already satisfied in the database. + +### Ground Instance +An atom or rule with all variables replaced by constants — no variables remaining. + +### Herbrand Universe +The set of all ground terms constructable from the constants and function symbols in a formula. Represents the "world" of all possible values. + +### Skolem Term / Skolem Function +A named placeholder for an existentially quantified variable, constructed deterministically from the values that triggered it: +``` +∀X. Person(X) → ∃Y. MotherOf(X, Y) + +Skolemized: MotherOf(X, mother_of(X)) +``` +The Skolem term `mother_of(X)` means "the mother of X, whoever that is." Same input always produces the same term. + +--- + +## Types of Dependencies (Rules) + +| Type | Abbreviation | Form | Meaning | +|:-----|:-------------|:-----|:--------| +| Tuple-generating dependency | TGD | `body → ∃z. head` | If body holds, some new tuple must exist | +| Equality-generating dependency | EGD | `body → x = y` | If body holds, two values must be equal | +| Functional dependency | FD | Special EGD | A set of attributes uniquely determines another | +| Full dependency | — | TGD with no existentials | Head variables all appear in body | + +--- + +## Rule Classes and Termination + +| Rule Class | Existentials | Termination Guarantee | Notes | +|:-----------|:-------------|:----------------------|:------| +| Datalog | No | Always | Core of logic programming | +| Weakly acyclic | Yes | Yes | No cyclic value propagation through existentials | +| Guarded | Yes | No (but decidable) | Existential vars "guarded" by a body atom | +| Frontier-one | Yes | No (but decidable) | At most one frontier variable per rule | +| General TGDs | Yes | Undecidable | No restrictions | + +--- + +## Datalog + +Datalog is a logic programming language and the most important rule class for the chase. It is a restriction of first-order logic with no function symbols and no existential variables in rule heads. + +### Syntax + +A Datalog program consists of: +- **Facts** — ground atoms representing known data +- **Rules** — if-then statements deriving new facts from existing ones + +``` +% Facts +Employee(alice, engineering). +Employee(bob, marketing). +Department(engineering). + +% Rules +WorksIn(X, D) :- Employee(X, D), Department(D). +Colleague(X, Y) :- WorksIn(X, D), WorksIn(Y, D), X ≠ Y. +``` + +### Key Properties + +| Property | Value | +|:---------|:------| +| Existentials in head | No | +| Function symbols | No | +| Negation | Stratified only | +| Termination | Always | +| Data complexity | PTIME | +| Evaluation strategy | Bottom-up (forward chaining) or top-down (backward chaining) | + +### Evaluation + +Datalog is evaluated by computing the **least fixed point** — repeatedly applying all rules until no new facts are derived. This is exactly the chase with no existentials: + +``` +Iteration 0: { Employee(alice, engineering), Employee(bob, marketing), Department(engineering) } +Iteration 1: + WorksIn(alice, engineering) +Iteration 2: + Colleague(alice, bob), Colleague(bob, alice) +Iteration 3: nothing new → stop +``` + +### Datalog vs SQL + +Datalog can express **recursive queries** naturally, which SQL cannot without special extensions (like recursive CTEs). For example, computing transitive closure (all ancestors of a person) is trivial in Datalog: + +``` +Ancestor(X, Y) :- Parent(X, Y). +Ancestor(X, Y) :- Parent(X, Z), Ancestor(Z, Y). +``` + +### Datalog and the Chase + +Pure Datalog evaluation is a restricted chase with no existential variables — all chase variants behave identically. The fixed point is always reached in at most O(n^k) steps. + +--- + +## Geometric Logic + +Geometric logic is a fragment of first-order logic that is particularly well-suited to the chase. It is the theoretical foundation for most modern chase-based reasoning engines. + +### Definition + +A formula is **geometric** if it uses only: + +| Allowed | Not Allowed | +|:--------|:-----------| +| Conjunction `∧` | Universal quantifier `∀` in head | +| Disjunction `∨` | Negation `¬` | +| Existential quantifier `∃` | Implication `→` in head | +| Equality `=` | Infinite conjunctions in head | + +A **geometric sequent** (rule) has the form: +``` +φ ⊢ ψ + +where φ and ψ are geometric formulas +``` + +Which is shorthand for: `∀x. φ(x) → ψ(x)` + +### Examples + +``` +-- Every employee works in some department +Employee(X) ⊢ ∃Y. WorksIn(X, Y) + +-- Every department has a manager or is a sub-department +Department(X) ⊢ (∃Y. ManagedBy(X, Y)) ∨ (∃Z. SubDeptOf(X, Z)) + +-- If two things are equal and one is an employee, so is the other +Employee(X) ∧ X = Y ⊢ Employee(Y) +``` + +### Why Geometric Logic Matters + +Geometric logic has a special property: if a geometric formula is true in a model, it remains true in any **extension** of that model (adding more facts never makes it false). This is called **monotonicity** and is exactly what makes the chase work — the chase only adds facts, never removes them. + +This property means: +- The chase directly computes models of geometric theories +- Every geometric theory has a **canonical model** — the one the chase builds +- Query answers over the chase result are guaranteed correct for any model of the theory + +### Geometric Logic vs Other Fragments + +``` +First-Order Logic + └── Geometric Logic (∧, ∨, ∃ only — monotone) + └── Coherent Logic (same but finitary — finite disjunctions only) + └── Datalog (no existentials in head, no disjunction) + └── Horn Clauses (at most one atom in head) +``` + +### Connection to the Chase + +Geometric sequents map directly onto TGDs (tuple-generating dependencies): + +``` +Geometric sequent: Employee(X) ⊢ ∃Y. WorksIn(X, Y) +TGD: Employee(X) → ∃Y. WorksIn(X, Y) +``` + +The chase is the standard procedure for building models of geometric theories. Given a set of geometric axioms and an initial database, the chase constructs the minimal model satisfying all axioms — which is exactly the universal solution used for query answering. + +--- + +## Logic Hierarchy + +``` +Propositional Logic -- no variables, just true/false statements + ↓ (add variables and quantifiers) +First-Order Logic (FOL) -- variables, predicates, ∀, ∃ + ↓ (restrict to certain rule forms) +Geometric Logic -- only ∧, ∨, ∃ — no ∀ in head, no ¬ + ↓ (remove existentials from head) +Datalog -- function-free, no existentials in head + ↓ (remove disjunction) +Horn Clauses -- at most one atom in head +``` + +--- + +## Changelog + +* **Mar 6, 2026** -- First version created. \ No newline at end of file