Add a note file for CRDTs and incremental queries

Hassan Abedi 2026-05-06 16:12:19 +02:00
parent 405a609eb8
commit 35b6e8f43f


@ -0,0 +1,271 @@
# CRDTs, Datalog, and Incremental Queries
A primer and glossary for thinking about CRDTs as query-defined data structures.
---
## Short Answer
CRDTs are replicated data structures that let different replicas accept local writes and later converge to the same state.
One way to define a CRDT is to store every update as an immutable operation and derive the visible state with a deterministic query. If every replica
eventually receives the same set of operations, and every replica runs the same deterministic query, then every replica computes the same state.
Datalog is a good fit for this idea because it is declarative, rule-based, set-oriented, and naturally supports recursion. Incremental query execution
matters because the operation set only grows, and recomputing the full query after every new operation would become expensive.
---
## Core Idea
The usual implementation style for a CRDT is an algorithm written in a general-purpose language. The programmer must ensure that concurrent operations
commute, or otherwise prove that replicas converge.
The query-based style changes the burden:
- operations are stored as append-only facts
- CRDT state is defined as a query over those facts
- convergence follows from deterministic evaluation over the same input set
- performance depends on incremental maintenance of the query result
The conceptual move is from "merge these mutable states correctly" to "derive this state from the immutable operation history."
---
## Operation Log as Database
In this model, the database does not store only the current value. It stores facts such as:
```text
set(replica_id, counter, key, value)
pred(from_replica_id, from_counter, to_replica_id, to_counter)
insert(replica_id, counter, parent_replica_id, parent_counter, value)
remove(replica_id, counter)
```
The current application state is a derived view over these base facts.
This is similar to event sourcing, but with an important difference: the query is designed so the result does not depend on delivery order. Replicas
can receive operations in different orders and still converge once they have the same operation set.
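The difference from a plain event fold can be sketched in a few lines. This is a minimal illustration, not part of any specific system: the function names and the sample facts are hypothetical, and the "query" simply collects every value written per key.

```python
def last_delivered_wins(ops):
    """Order-dependent fold: each op overwrites whatever arrived before it."""
    state = {}
    for _replica_id, _counter, key, value in ops:
        state[key] = value
    return state


def values_per_key(ops):
    """Order-independent query over the *set* of received facts."""
    facts = set(ops)  # delivery order and duplicates are discarded here
    state = {}
    for _replica_id, _counter, key, value in facts:
        state.setdefault(key, set()).add(value)
    return {key: sorted(values) for key, values in state.items()}


# Two replicas receive the same two hypothetical `set` facts in different orders.
delivery_a = [(1, 1, "x", "a"), (2, 1, "x", "b")]
delivery_b = list(reversed(delivery_a))
```

With these inputs, `last_delivered_wins` gives the two replicas different states, while `values_per_key` gives both the same result because it only looks at the set of facts.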
---
## Datalog Role
Datalog represents derived facts with rules:
```text
overwritten(RepId, Ctr) :-
pred(RepId, Ctr, _, _).
mvrStore(Key, Value) :-
set(RepId, Ctr, Key, Value),
not overwritten(RepId, Ctr).
```
Read this as:
- `overwritten` contains operations that appear as predecessors of later operations
- `mvrStore` contains key-value pairs whose set operation has not been overwritten
That rule set describes a multi-value register key-value store. Concurrent writes are preserved as multiple visible values instead of being collapsed
by a last-writer-wins policy.
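To make the two rules concrete, here is a minimal Python sketch that evaluates them directly over sets of tuples. The function name `mvr_store` and the sample facts are assumptions for illustration; a real engine would compile the rules rather than hand-code them.

```python
def mvr_store(set_facts, pred_facts):
    """Evaluate the two rules above over in-memory sets of tuples.

    set_facts:  {(replica_id, counter, key, value), ...}
    pred_facts: {(from_replica_id, from_counter, to_replica_id, to_counter), ...}
    """
    # overwritten(RepId, Ctr) :- pred(RepId, Ctr, _, _).
    overwritten = {(rep, ctr) for rep, ctr, _to_rep, _to_ctr in pred_facts}

    # mvrStore(Key, Value) :- set(RepId, Ctr, Key, Value), not overwritten(RepId, Ctr).
    return {
        (key, value)
        for rep, ctr, key, value in set_facts
        if (rep, ctr) not in overwritten
    }


# Hypothetical history: replica 1 writes "a" and then "c"; replica 2 writes "b" concurrently.
example_sets = {(1, 1, "x", "a"), (1, 2, "x", "c"), (2, 1, "x", "b")}
example_preds = {(1, 1, 1, 2)}  # operation (1, 1) is a predecessor of (1, 2)
```

Here `mvr_store(example_sets, example_preds)` keeps `("x", "b")` and `("x", "c")`: the write of `"a"` has a successor, and the other two writes are concurrent.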
---
## Incremental Query Role
Incremental view maintenance means the engine consumes changes to inputs and produces changes to outputs.
Instead of:
```text
all operations -> full query recomputation -> full current state
```
the engine aims for:
```text
new operations -> query delta computation -> state delta
```
This matters for CRDTs because the source relation is a growing history. Without incremental execution, the cost of each update grows with the
length of that history, so a long-lived document or key-value store gets slower over time.
DBSP is one framework for expressing this kind of incremental computation. It models relations as changing streams and maintains query results through
operators such as joins, projections, differences, antijoins, and fixed-point iterations.
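The input-delta to output-delta idea can be sketched by hand for the multi-value register view. This is not DBSP and does not use any DBSP API; the class `IncrementalMvrStore` and its methods are hypothetical names. It keeps a support count per visible pair, loosely in the spirit of weighted tuples, so a pair disappears only when its last supporting operation is overwritten.

```python
from collections import Counter


class IncrementalMvrStore:
    """Hand-written incremental maintenance of the mvrStore view (a sketch)."""

    def __init__(self):
        self.sets = {}            # (rep, ctr) -> (key, value)
        self.overwritten = set()  # operation ids seen as the source of a pred edge
        self.support = Counter()  # (key, value) -> number of visible set operations

    def apply(self, new_sets=(), new_preds=()):
        """Consume new facts and return the view delta as (inserted, removed)."""
        before = {}  # pair -> whether it was visible before this batch

        def bump(pair, amount):
            before.setdefault(pair, self.support[pair] > 0)
            self.support[pair] += amount

        for rep, ctr, key, value in new_sets:
            op_id = (rep, ctr)
            if op_id in self.sets:
                continue  # duplicate delivery of a known fact
            self.sets[op_id] = (key, value)
            if op_id not in self.overwritten:
                bump((key, value), +1)

        for from_rep, from_ctr, _to_rep, _to_ctr in new_preds:
            op_id = (from_rep, from_ctr)
            if op_id in self.overwritten:
                continue  # already known to be overwritten
            self.overwritten.add(op_id)
            if op_id in self.sets:
                bump(self.sets[op_id], -1)

        inserted = {p for p, was in before.items() if not was and self.support[p] > 0}
        removed = {p for p, was in before.items() if was and self.support[p] == 0}
        return inserted, removed

    def view(self):
        """Current materialized view, for comparison with a full recomputation."""
        return {pair for pair, count in self.support.items() if count > 0}
```

Each call to `apply` does work proportional to the size of the batch, not to the size of the history, which is the property the section describes.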
---
## Example: Multi-Value Register Key-Value Store
A multi-value register keeps all concurrent values for a key. If one write causally overwrites another, the older value disappears. If two writes are
concurrent, both remain visible.
The operation facts are:
- `set`: an assignment of a value to a key
- `pred`: a causal dependency edge from one operation to another
The query computes visible key-value pairs by selecting `set` facts that are not known to be overwritten.
This is useful because conflict handling is explicit. The application can see concurrent values and decide how to resolve them.
---
## Example: Causal Readiness
The simple key-value query assumes operations are processed in causal order.
If operations can arrive out of order, the query needs an additional causal-readiness check. A causal-readiness rule derives which operations can be
safely exposed because their causal predecessors are present.
This usually involves a recursive traversal of the causal dependency graph. It handles out-of-order delivery correctly, but it can be more expensive
because the number of fixed-point iterations can grow with the depth of the causal history.
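A causal-readiness check can be sketched as a naive fixed point. The sketch assumes that the `pred` edges pointing at an operation are known as soon as that operation is received; the function name `ready_operations` and the edge representation as pairs of operation ids are illustrative choices.

```python
def ready_operations(present, pred_edges):
    """Return the operations whose causal predecessors are all present and ready.

    present:    set of operation ids, e.g. {(replica_id, counter), ...}
    pred_edges: set of (from_id, to_id) pairs; from_id is a predecessor of to_id
    """
    preds_of = {}
    for from_id, to_id in pred_edges:
        preds_of.setdefault(to_id, set()).add(from_id)

    ready = set()
    changed = True
    while changed:  # repeat until no new operation becomes ready
        changed = False
        for op in present:
            if op not in ready and preds_of.get(op, set()) <= ready:
                ready.add(op)
                changed = True
    return ready


# Hypothetical chain (1,1) -> (1,2) -> (1,3), where (1,2) has not arrived yet.
chain = {((1, 1), (1, 2)), ((1, 2), (1, 3))}
partial = ready_operations({(1, 1), (1, 3)}, chain)
complete = ready_operations({(1, 1), (1, 2), (1, 3)}, chain)
```

In the partial case only `(1, 1)` is ready, because `(1, 3)` is held back until `(1, 2)` arrives. A missing operation can never become ready, so nothing that depends on it is exposed.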
---
## Example: List CRDT
A list CRDT must converge not only on membership, but also on element order.
The query-based formulation uses insert operations shaped like:
```text
insert(replica_id, counter, parent_replica_id, parent_counter, value)
```
Each inserted element points to the element after which it was inserted. This forms a tree:
- the root is a sentinel element
- children of a node are the elements inserted directly after it; concurrent insertions after the same parent become siblings
- siblings are ordered deterministically by operation identifiers
- the visible list comes from a depth-first traversal
Deletes use tombstones. The deleted element remains as a structural anchor, but the final visible list skips it.
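The traversal can be sketched as follows. Several details are assumptions rather than something the text fixes: the sentinel id `ROOT = (0, 0)`, the sibling order (descending by counter, then replica id), and the function name `visible_list`.

```python
ROOT = (0, 0)  # hypothetical sentinel id used as the parent of top-level inserts


def visible_list(inserts, removes):
    """Derive the visible list from insert and remove facts.

    inserts: {(replica_id, counter, parent_replica_id, parent_counter, value), ...}
    removes: {(replica_id, counter), ...}  ids of deleted elements (tombstones)
    """
    children, values = {}, {}
    for rep, ctr, parent_rep, parent_ctr, value in set(inserts):
        values[(rep, ctr)] = value
        children.setdefault((parent_rep, parent_ctr), []).append((rep, ctr))

    removed = set(removes)
    result = []
    stack = [ROOT]
    while stack:  # iterative depth-first traversal
        node = stack.pop()
        if node != ROOT and node not in removed:
            result.append(values[node])
        # Assumed sibling order: higher (counter, replica_id) is visited first.
        # Pushing in ascending order means the highest id is popped next.
        for child in sorted(children.get(node, []), key=lambda op: (op[1], op[0])):
            stack.append(child)
    return result


# Hypothetical history: "H" at the root, then two concurrent inserts after "H".
example_inserts = {(1, 1, 0, 0, "H"), (1, 2, 1, 1, "i"), (2, 2, 1, 1, "!")}
```

A removed element is still traversed, so its children keep their position. Only its value is skipped in the output.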
---
## Query Engine Shape
A prototype engine for this approach usually needs these pieces:
- a Datalog parser
- dependency analysis between predicates
- stratification checks for negation
- translation from Datalog rules to relational algebra
- relational operators such as projection, selection, join, antijoin, union, difference, and distinct
- fixed-point execution for recursion
- incremental maintenance for input and output changes
The implementation path is:
```text
Datalog program
-> abstract syntax tree
-> predicate dependency graph
-> execution order
-> relational intermediate representation
-> incremental circuit or runtime plan
```
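As one example of these pieces, the stratification check can be sketched over a simplified rule representation. The input format (a head name plus a list of body predicates with a negation flag) and the function name `stratify` are assumptions for illustration. The sketch uses the standard iterative stratum assignment: a head must sit at least as high as each positive body predicate and strictly higher than each negated one.

```python
def stratify(rules):
    """Assign a stratum to every predicate, or raise if negation is cyclic.

    rules: list of (head, [(body_predicate, is_negated), ...])
    """
    predicates = {head for head, _ in rules}
    predicates |= {pred for _, body in rules for pred, _ in body}
    stratum = {pred: 0 for pred in predicates}

    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, negated in body:
                required = stratum[pred] + (1 if negated else 0)
                if stratum[head] < required:
                    # With n predicates, a valid stratum never exceeds n - 1.
                    if required >= len(predicates):
                        raise ValueError("negation through recursion: not stratifiable")
                    stratum[head] = required
                    changed = True
    return stratum


# The multi-value register rules, written in the simplified format.
mvr_rules = [
    ("overwritten", [("pred", False)]),
    ("mvrStore", [("set", False), ("overwritten", True)]),
]
```

For `mvr_rules`, `mvrStore` lands one stratum above `overwritten`, so `overwritten` is fully computed before the negation is evaluated. Sorting predicates by stratum gives one valid execution order.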
---
## Glossary
**CRDT**: A replicated data structure whose replicas converge after they have received the same updates, even if writes happened concurrently.
**Replica**: One copy of the data structure. A replica can accept local writes and later exchange operations with other replicas.
**Convergence**: The property that replicas eventually compute the same state after receiving the same information.
**Strong Eventual Consistency**: Eventual delivery of every update to every replica, combined with the guarantee that replicas that have received the same set of updates are in the same state.
**Operation-Based CRDT**: A CRDT represented by operations that are generated at replicas and disseminated to other replicas.
**Immutable Operation**: An update fact that is never changed after creation. The operation set grows monotonically.
**Operation Identifier**: A unique identifier for an operation, often a pair of replica id and counter.
**Lamport Clock**: A logical counter used to order events consistently with causality, without relying on wall-clock time.
**Causal Dependency**: A relationship saying one operation was known before another operation was created.
**Causal History**: The graph or set of causal dependencies among operations.
**Causal Broadcast**: A delivery discipline in which an operation is delivered only after its causal predecessors.
**Causal Readiness**: The condition that an operation has all required causal predecessors available.
**Concurrent Operations**: Operations where neither causally depends on the other.
**Multi-Value Register**: A register that exposes concurrent values instead of choosing one automatically.
**Last-Writer-Wins Register**: A register that picks a single winner, usually by timestamp or operation id.
**Tombstone**: A retained marker for a deleted element. It preserves references needed by later or concurrent operations.
**Datalog**: A declarative logic programming language based on facts and rules.
**Fact**: A ground tuple in a predicate, such as `set(1, 2, 10, 99)`.
**Rule**: A derivation statement that says when new facts are true.
**Predicate**: A named relation in Datalog.
**Extensional Database Predicate**: An input predicate whose facts are provided directly.
**Intensional Database Predicate**: A derived predicate whose facts come from rules.
**Stratified Negation**: A restriction on negation that avoids circular negative dependencies.
**Fixed Point**: The stable result reached when repeated rule application produces no new facts.
**Relational Algebra**: A set of operators for transforming relations, such as projection, selection, join, union, and difference.
**Antijoin**: An operator that keeps rows from the left input only when they have no matching row in the right input.
**Incremental View Maintenance**: Maintaining a derived result by applying input changes instead of recomputing from scratch.
**Delta**: A change to a relation or query result, often represented as inserted and removed tuples.
**DBSP**: A framework for incremental computation on changing relations and streams.
**Hydration**: Rebuilding the internal operator state from an existing operation history, commonly during application startup.
**Near-Real-Time Processing**: Applying small new batches after the query plan is already initialized.
---
## Design Questions
- Can the CRDT be expressed using monotonic inputs and deterministic rules?
- Does the query need negation, recursion, or both?
- Is negation stratified?
- Does causal readiness require graph traversal?
- Can the expensive parts be maintained incrementally?
- Does the system need the full operation history forever?
- Can old operations be compacted without changing future query results?
- Are tombstones required for structural references?
- Are operation identifiers enough to define a deterministic order?
- Does the application need conflicts exposed or automatically resolved?
---
## Practical Mental Model
The CRDT operation log is the base table.
The CRDT state is a materialized view.
Datalog defines the view.
DBSP or another incremental engine maintains the view as operations arrive.
The convergence argument is simple: same facts plus same deterministic query equals same derived state.
---
## Changelog
* **May 6, 2026** -- First version created.