useful-notes/hassan/009-logic-primer.md

8.8 KiB
Raw Permalink Blame History

Logic Primer

A reference for the core concepts of first-order logic and database theory.


Building Blocks

Constants

Specific known values — real things in your database.

Alice, Bob, Engineering, 42

Variables

Placeholders for unknown values, usually written in uppercase or with ?.

X, Y, ?person, ?dept

Terms

A term is either a constant or a variable. Anything that can fill an argument slot in a predicate.


Predicates and Atoms

A predicate is a named relation applied to a list of terms:

Employee(X, Y)       -- predicate "Employee", arity 2
Likes(X, Y, Z)       -- predicate "Likes", arity 3
Person(Alice)        -- predicate "Person", arity 1

The number of arguments a predicate takes is its arity.

An atom is a predicate applied to specific terms. When all terms are constants it is called a ground atom — a concrete fact in the database:

Employee(Alice, Engineering)   -- ground atom

Rules

A rule is an if-then statement built from atoms, split into two parts:

body  →  head
Part Also Called Meaning
Body Antecedent, LHS Conditions that must hold
Head Consequent, RHS What must be true if body holds

Example:

Employee(X, Y), Department(Y)  →  ∃Z. ManagedBy(Y, Z)

Variables in the head that do not appear in the body are existential variables — they represent unknown values that must exist, but whose identity is not known.


Quantifiers

Symbol Name Meaning Example
Universal "For all..." ∀X. Person(X) → Mortal(X)
Existential "There exists some..." ∀X. Person(X) → ∃Y. MotherOf(X, Y)

Connectives

Symbol Name Meaning
Conjunction AND
Disjunction OR
¬ Negation NOT
Implication IF...THEN
Biconditional IF AND ONLY IF

Key Concepts

Substitution

A mapping from variables to terms. Written as σ:

σ = { X → Alice, Y → Engineering }

Employee(X, Y) under σ  =  Employee(Alice, Engineering)

Homomorphism

A substitution that maps one set of atoms into another, preserving structure. Used in the chase to check whether a rule head is already satisfied in the database.

Ground Instance

An atom or rule with all variables replaced by constants — no variables remaining.

Herbrand Universe

The set of all ground terms constructable from the constants and function symbols in a formula. Represents the "world" of all possible values.

Skolem Term / Skolem Function

A named placeholder for an existentially quantified variable, constructed deterministically from the values that triggered it:

∀X. Person(X) → ∃Y. MotherOf(X, Y)

Skolemized:  MotherOf(X, mother_of(X))

The Skolem term mother_of(X) means "the mother of X, whoever that is." Same input always produces the same term.


Types of Dependencies (Rules)

Type Abbreviation Form Meaning
Tuple-generating dependency TGD body → ∃z. head If body holds, some new tuple must exist
Equality-generating dependency EGD body → x = y If body holds, two values must be equal
Functional dependency FD Special EGD A set of attributes uniquely determines another
Full dependency TGD with no existentials Head variables all appear in body

Rule Classes and Termination

Rule Class Existentials Termination Guarantee Notes
Datalog No Always Core of logic programming
Weakly acyclic Yes Yes No cyclic value propagation through existentials
Guarded Yes No (but decidable) Existential vars "guarded" by a body atom
Frontier-one Yes No (but decidable) At most one frontier variable per rule
General TGDs Yes Undecidable No restrictions

Datalog

Datalog is a logic programming language and the most important rule class for the chase. It is a restriction of first-order logic with no function symbols and no existential variables in rule heads.

Syntax

A Datalog program consists of:

  • Facts — ground atoms representing known data
  • Rules — if-then statements deriving new facts from existing ones
% Facts
Employee(alice, engineering).
Employee(bob, marketing).
Department(engineering).

% Rules
WorksIn(X, D) :- Employee(X, D), Department(D).
Colleague(X, Y) :- WorksIn(X, D), WorksIn(Y, D), X ≠ Y.

Key Properties

Property Value
Existentials in head No
Function symbols No
Negation Stratified only
Termination Always
Data complexity PTIME
Evaluation strategy Bottom-up (forward chaining) or top-down (backward chaining)

Evaluation

Datalog is evaluated by computing the least fixed point — repeatedly applying all rules until no new facts are derived. This is exactly the chase with no existentials:

Iteration 0:  { Employee(alice, engineering), Employee(bob, marketing), Department(engineering) }
Iteration 1:  + WorksIn(alice, engineering)
Iteration 2:  + Colleague(alice, bob), Colleague(bob, alice)
Iteration 3:  nothing new → stop

Datalog vs SQL

Datalog can express recursive queries naturally, which SQL cannot without special extensions (like recursive CTEs). For example, computing transitive closure (all ancestors of a person) is trivial in Datalog:

Ancestor(X, Y) :- Parent(X, Y).
Ancestor(X, Y) :- Parent(X, Z), Ancestor(Z, Y).

Datalog and the Chase

Pure Datalog evaluation is a restricted chase with no existential variables — all chase variants behave identically. The fixed point is always reached in at most O(n^k) steps.


Geometric Logic

Geometric logic is a fragment of first-order logic that is particularly well-suited to the chase. It is the theoretical foundation for most modern chase-based reasoning engines.

Definition

A formula is geometric if it uses only:

Allowed Not Allowed
Conjunction Universal quantifier in head
Disjunction Negation ¬
Existential quantifier Implication in head
Equality = Infinite conjunctions in head

A geometric sequent (rule) has the form:

φ ⊢ ψ

where φ and ψ are geometric formulas

Which is shorthand for: ∀x. φ(x) → ψ(x)

Examples

-- Every employee works in some department
Employee(X)  ⊢  ∃Y. WorksIn(X, Y)

-- Every department has a manager or is a sub-department
Department(X)  ⊢  (∃Y. ManagedBy(X, Y))  (∃Z. SubDeptOf(X, Z))

-- If two things are equal and one is an employee, so is the other
Employee(X) ∧ X = Y  ⊢  Employee(Y)

Why Geometric Logic Matters

Geometric logic has a special property: if a geometric formula is true in a model, it remains true in any extension of that model (adding more facts never makes it false). This is called monotonicity and is exactly what makes the chase work — the chase only adds facts, never removes them.

This property means:

  • The chase directly computes models of geometric theories
  • Every geometric theory has a canonical model — the one the chase builds
  • Query answers over the chase result are guaranteed correct for any model of the theory

Geometric Logic vs Other Fragments

First-Order Logic
  └── Geometric Logic          (∧, , ∃ only — monotone)
        └── Coherent Logic     (same but finitary — finite disjunctions only)
              └── Datalog      (no existentials in head, no disjunction)
                    └── Horn Clauses  (at most one atom in head)

Connection to the Chase

Geometric sequents map directly onto TGDs (tuple-generating dependencies):

Geometric sequent:    Employee(X)  ⊢  ∃Y. WorksIn(X, Y)
TGD:                  Employee(X)  →  ∃Y. WorksIn(X, Y)

The chase is the standard procedure for building models of geometric theories. Given a set of geometric axioms and an initial database, the chase constructs the minimal model satisfying all axioms — which is exactly the universal solution used for query answering.


Logic Hierarchy

Propositional Logic          -- no variables, just true/false statements
      ↓  (add variables and quantifiers)
First-Order Logic (FOL)      -- variables, predicates, ∀, ∃
      ↓  (restrict to certain rule forms)
Geometric Logic              -- only ∧, , ∃ — no ∀ in head, no ¬
      ↓  (remove existentials from head)
Datalog                      -- function-free, no existentials in head
      ↓  (remove disjunction)
Horn Clauses                 -- at most one atom in head

Changelog

  • Mar 6, 2026 -- First version created.