CRDTs in Databases: Conflict-Free Replicated Data Types

Distributed databases face a fundamental tension between availability and consistency.
When network partitions occur, systems must choose between rejecting writes (sacrificing availability) or accepting potentially conflicting writes on multiple replicas (sacrificing consistency).
Conflict-Free Replicated Data Types (CRDTs) offer a mathematically grounded resolution: data structures that can be updated independently and concurrently on different replicas, then merged deterministically without conflicts.
The merge always converges to a consistent state, regardless of the order in which updates are received.

This property makes CRDTs a compelling primitive for databases operating under eventual consistency.
Rather than relying on coordination protocols or last-writer-wins heuristics that discard data, CRDTs guarantee convergence by construction.

Theoretical Foundations

CRDTs rest on a simple algebraic insight.
If you can define a merge function that is commutative, associative, and idempotent, then replicas applying updates in any order will converge to the same state.
These three properties ensure that:

Commutativity: merge(a, b) = merge(b, a). The order in which two states are merged does not matter.
Associativity: merge(a, merge(b, c)) = merge(merge(a, b), c). Grouping does not matter.
Idempotency: merge(a, a) = a. Redelivery of the same state is harmless.

Any data type satisfying these constraints forms a join-semilattice, where the merge function computes the least upper bound (LUB) of two states.
Monotonic growth through the semilattice guarantees convergence: replicas can only move "upward" in the lattice, and they all converge to the same supremum.

There are two equivalent formulations of CRDTs:

State-based CRDTs (CvRDTs) replicate by shipping their full state.
Each replica periodically sends its state to other replicas, which merge it with their local state using the LUB function.
The semilattice properties guarantee convergence regardless of message reordering, duplication, or delay.
The cost is bandwidth: full state transmission can be expensive for large objects.

Operation-based CRDTs (CmRDTs) replicate by shipping individual operations.
The delivery layer must guarantee exactly-once, causally ordered delivery, and concurrent operations must commute with one another.
CmRDTs are more bandwidth-efficient but require stronger delivery guarantees from the messaging layer.

Common CRDT Data Types

Several CRDT designs appear repeatedly in database systems.

G-Counter (Grow-only Counter)

Each replica maintains its own local counter in a vector.
The value of the counter is the sum of all entries.
Merge takes the element-wise maximum.
Since each replica only increments its own entry, concurrent updates never conflict.

PN-Counter (Positive-Negative Counter)

Two G-Counters: one for increments, and one for decrements.
The value is the difference.
This supports both increment and decrement while preserving convergence.

G-Set and 2P-Set

A G-Set (grow-only set) supports only additions; merge is set union.
A 2P-Set (two-phase set) maintains an add-set and a remove-set.
An element is in the set if it is in the add-set but not in the remove-set (remove takes precedence on conflict).
The limitation is that once removed, an element cannot be re-added.

OR-Set (Observed-Remove Set)

The OR-Set solves the re-addition problem by tagging each addition with a unique identifier.
A remove operation records the set of tags it has observed for a given element; those tags are excluded on lookup.
If a concurrent add on another replica introduces a new tag that was not observed by the remove, that new tag survives the merge.
This provides intuitive add-wins semantics under concurrency.

LWW-Register (Last-Writer-Wins Register)

Each write is tagged with a timestamp.
Merge picks the value with the highest timestamp.
LWW is a formally valid CRDT — the max-timestamp function satisfies the semilattice properties — but its semantics discard concurrent updates rather than preserving them.
It is useful when the application can tolerate data loss on concurrent writes, but it should not be chosen when all updates must be reflected in the final state.

Walkthrough

The following walkthrough illustrates a G-Counter operating across three replicas.

G-Counter: Step-by-Step

Initial state on all replicas:
  Replica A: [A:0, B:0, C:0]
  Replica B: [A:0, B:0, C:0]
  Replica C: [A:0, B:0, C:0]

Step 1: Replica A increments locally.
  Replica A: [A:1, B:0, C:0]   value = 1

Step 2: Replica B increments twice locally.
  Replica B: [A:0, B:2, C:0]   value = 2

Step 3: Replica C increments locally.
  Replica C: [A:0, B:0, C:1]   value = 1

Step 4: Replica A receives state from Replica B.
  merge([A:1, B:0, C:0], [A:0, B:2, C:0])
  = [max(1,0), max(0,2), max(0,0)]
  = [A:1, B:2, C:0]            value = 3

Step 5: Replica A receives state from Replica C.
  merge([A:1, B:2, C:0], [A:0, B:0, C:1])
  = [A:1, B:2, C:1]            value = 4

Step 6: Replica B and C eventually receive all states.
  All replicas converge to: [A:1, B:2, C:1]   value = 4

The merge function (element-wise max) is commutative, associative, and idempotent.
No matter the order or frequency of state exchanges, all replicas converge.

OR-Set Merge Pseudocode

The key invariant of a state-based OR-Set is that each element is associated with a set of unique tags representing distinct add operations.
A remove records which tags it has observed, and the merge union of tag sets determines the final presence of an element.

struct ORSet:
    # Maps each value to the set of unique tags currently associated with it.
    # An element is present if its tag set is non-empty after merge.
    elements: Map<Value, Set<UniqueTag>>

function add(set, value):
    tag = generate_unique_tag()  # e.g., (replica_id, logical_clock)
    set.elements[value].add(tag)

function remove(set, value):
    # Remove all currently observed tags for this value.
    # Tags added concurrently on other replicas (not yet received) are not
    # affected; they will survive the merge and keep the element present.
    set.elements[value] = {}

function merge(set_a, set_b):
    result = new ORSet()
    for each value in union(keys(set_a.elements), keys(set_b.elements)):
        tags_a = set_a.elements.get(value, {})
        tags_b = set_b.elements.get(value, {})
        result.elements[value] = union(tags_a, tags_b)
    return result

function lookup(set, value):
    return set.elements.get(value, {}) is not empty

A concurrent add on one replica and remove on another preserves the add: the new tag introduced by the add was not in the remove replica's observed set and therefore survives the union during merge.

Note on remove semantics: In a state-based OR-Set, the remove operation clears the local tag set for a value.
This is correct only when the local state is later propagated and merged using the full merge function above.
A replica that removes an element and then merges with a replica that independently added the same element (with a new tag) will see the element re-appear — which is the intended add-wins behavior.

CRDTs in Production Database Systems

Several databases have integrated CRDTs into their storage and replication layers.

Riak (originally developed by Basho; Basho ceased operations in 2017 and the project is now maintained as open source) was among the first production databases to offer CRDTs as first-class data types.
Riak's "data types" include counters, sets, maps, registers, and flags, all implemented as CRDTs with state-based replication via the Riak ring.

Redis Enterprise (CRDBs) provides active-active geo-replication using CRDTs.
Each Redis data structure (strings, sets, sorted sets, hashes) has a CRDT-aware merge strategy.

AntidoteDB, a research database from the SyncFree European project, was purpose-built around CRDTs.
It provides a transactional layer on top of CRDTs, combining causal consistency with convergent data types.

Automerge and Yjs are CRDT-based libraries for collaborative editing.
While not databases in the traditional sense, they demonstrate CRDT principles applied to document-level conflict resolution, and they increasingly integrate with persistence layers.

Practical Challenges

CRDTs are not a universal solution.
Several engineering challenges arise in practice.

Metadata overhead. OR-Sets and similar structures accumulate tombstones and unique tags.
Without garbage collection, state size grows unboundedly.
Effective compaction requires coordination (knowing that all replicas have observed a given state), which partially undermines the coordination-free appeal of CRDTs.

Semantic mismatch. The merge semantics of a CRDT may not match application intent.
A counter that allows concurrent increments and decrements cannot enforce a "balance must not go negative" invariant.
CRDTs guarantee convergence, not arbitrary application-level invariants.

Causal consistency. Pure state-based CRDTs do not require causal delivery, but many practical applications do.
Without causal ordering, a user might observe a reply to a message before the message itself.
Layering causal consistency on top of CRDTs adds complexity and some coordination cost.

Composability. Building complex data models from CRDT primitives is nontrivial.
A CRDT map of CRDT sets of CRDT registers can be composed, but reasoning about the convergence semantics of nested structures requires care.
The delta-state CRDT framework helps with efficient propagation of updates through composed structures.

Expressiveness limits. Some operations are inherently non-commutative.
Moving an element from one set to another, for example, involves a remove and an add that must be atomic.
General-purpose transactions across multiple CRDTs remain an open research area, with approaches like ECROs (Explicitly Consistent Replicated Objects) attempting to bridge this gap.

Delta-State CRDTs

A significant practical improvement over full-state CRDTs is the delta-state approach.
Instead of shipping the entire state, a replica ships only the "delta" (the recent mutations, encoded as a join-semilattice fragment).
The receiving replica merges the delta into its local state.
This preserves the convergence guarantees of state-based CRDTs while approaching the bandwidth efficiency of operation-based CRDTs.
Delta-state CRDTs can also fall back to full-state synchronization when deltas are lost, providing robustness without requiring reliable delivery.

Key Points

CRDTs achieve conflict-free convergence by requiring merge functions that are commutative, associative, and idempotent, and forming a join-semilattice.
State-based CRDTs (CvRDTs) ship full state and merge via least upper bound; operation-based CRDTs (CmRDTs) ship operations that must commute under concurrent, causally ordered, exactly-once delivery.
Common CRDT types include G-Counters, PN-Counters, OR-Sets, and LWW-Registers, each with distinct tradeoffs in expressiveness and overhead.
LWW-Register is a formally valid CRDT but discards concurrent updates; it is appropriate only when last-write semantics match application intent.
Production systems like Riak, Redis Enterprise, and AntidoteDB demonstrate that CRDTs can serve as practical building blocks for eventually consistent databases.
Metadata growth, garbage collection, and semantic mismatch between merge semantics and application intent are the primary engineering challenges.
Delta-state CRDTs reduce bandwidth overhead while preserving the convergence guarantees of full state-based replication.
CRDTs guarantee convergence, not arbitrary application invariants; enforcing constraints like non-negative balances requires additional mechanisms beyond the CRDT model.

References

Shapiro, M., Preguiça, N., Baquero, C., and Zawirski, M. "Conflict-Free Replicated Data Types." Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS), 2011.

Shapiro, M., Preguiça, N., Baquero, C., and Zawirski, M. "A Comprehensive Study of Convergent and Commutative Replicated Data Types." INRIA Technical Report RR-7506, 2011.

Almeida, P. S., Shoker, A., and Baquero, C. "Delta State Replicated Data Types." Journal of Parallel and Distributed Computing, vol. 111, 2018.

Kleppmann, M. and Beresford, A. R. "A Conflict-Free Replicated JSON Datatype." IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 10, 2017.