hkucuk

CRDTs (Conflict-free Replicated Data Types): Data Consistency in Distributed Systems

October 15, 2024 • ☕️☕️ 8 min read • 🏷 computer, software, algorithm

Translated by author into: English


Distributed systems have become increasingly popular today. Microservices architectures, replications between data centers, geographically distributed databases, applications with multiple access points, and similar scenarios make ensuring data consistency quite complex. Conflicts can arise when data is simultaneously located, updated, or read from multiple locations. This is where the concept of CRDT (Conflict-free Replicated Data Types) comes into play.

CRDTs are special data structures that aim to eventually provide strong eventual consistency between data replicas in a distributed environment. Using such data structures, it is possible to merge data updates deterministically in some way without resorting to a continuous synchronization or expensive global lock mechanism to ensure consistency between replicas.

CRDT


1. What is CRDT and Its Features

CRDT (Conflict-free Replicated Data Type) is a type designed to replicate a data type in distributed environments without experiencing inconsistencies. In traditional methods, bringing together changes in different copies requires complex algorithms or additional coordination during conflict resolution. CRDTs are designed to minimize (or completely eliminate) these conflicts.

Basic features

  • Idempotent Operations: Merge functions in CRDTs are often designed to be idempotent. That is, if the same update is applied more than once, the same result is achieved.
  • Commutativity: The final state remains the same even if operations are applied in different orders.
  • Monotonic Growth: In some types of CRDTs (especially state-based CRDTs), the data structure is seen as always growing. Updates are made in the form of insertions or markings, but this also allows for “removals” with a different formulation later (e.g. OR-Set).
  • Strong Eventual Consistency: In a given time period, when all copies receive updates from each other and merge them, they all reach the same data in the final state.

These features provided by CRDTs aim to alleviate the consistency problem, which is one of the biggest challenges of distributed systems. In particular, if we want to design a system with both high availability and partition tolerance, as required by the CAP Theorem, we need to pay special attention to consistency issues. CRDTs provide a partial but effective solution to this problem by simplifying conflict resolution within the eventual consistency approach.

2. CRDT Types

CRDTs are generally examined in two basic categories: Operation-based and State-based. There is also an approach called Delta-based.

2.1. Operation-based (Op-based) CRDTs

In Op-based CRDTs, each change that occurs (for example, increasing a number, adding an element) is sent to other replicas via an “operation log”-like system. These replicas apply the incoming update and keep their own states up to date.

Example: GCounter (Grow-only Counter), PN-Counter (Positive-Negative Counter).

2.2. State-based (Convergent) CRDTs

In state-based CRDTs, a “state” is kept in each replica. Different replicas occasionally copy each other’s state (like the gossip mechanism) and merge the state. Since this merging process is idempotent, commutative, and associative, the same final state can be reached regardless of the order.

Example: OR-Set (Observed-Removed Set), MV-Register (Multi-Value Register).

2.3. Delta-based CRDTs

Delta-based CRDTs are a more efficient version of the state-based approach. Instead of sending the full state, only a summary (a small piece) called the “delta” is shared. This reduces bandwidth and synchronization costs. It is very useful for large data sets.

  • G-Counter: A counter that supports only increment operations.
  • PN-Counter: Supports both increment and decrement operations.
  • OR-Set (Observed-Removed Set): A set structure that supports element addition and removal operations. It contains additional metadata to keep track of when the same element was added/removed at different times.
  • LWW-Register (Last-Write-Wins Register): A register structure where the last written value prevails. It uses a kind of “last-writer-wins” logic.
  • MV-Register (Multi-Value Register): Holds multiple values ​​when different values ​​are written to different replicas at the same time and allows the application to choose between these values.
  • RGA (Replicated Growable Array): Used in scenarios such as text editing or document sharing.

3. Areas of Use and Advantages of CRDTs

  • Databases and Data Stores: Some databases such as Riak have adopted or partially used CRDT concepts. Technology stacks that use in-memory data structures such as Redis also have CRDT libraries.
  • Real-Time Collaboration Applications: In applications such as Google Docs, multiple people edit the same document simultaneously. CRDTs can facilitate conflict resolution in these scenarios.
  • Game Servers and Multiplayer Scenarios: CRDTs can be used in environments where real-time updates are made, such as player movements, scoreboards, and inventory information.
  • Distributed Cache Systems: CRDTs offer high performance and data integrity for cache keeping and synchronizing in geographically dispersed regions.

3.1. Advantages

  1. Simple Conflict Resolution: CRDTs minimize or eliminate conflicts by design.
  2. Eventual Consistency: CRDTs provide consistency at an eventual level in systems that want to emphasize availability and partition tolerance in the context of the CAP Theorem.
  3. Easy Extensibility: Once CRDT structures are understood, there is no need to write complex conflict resolution algorithms in the application.
  4. Idempotent and Commutative Operations: In distributed systems, CRDTs are not affected by processing the same data repeatedly or in different orders when there are situations such as message loss, retransmission or different ordering.

3.2. Disadvantages

  1. Additional Metadata: CRDTs usually hold additional metadata. This can create additional overhead in terms of memory usage.
  2. Does Not Cover All Operations: CRDT models may not be the best fit for all types of data structures or all application needs.
  3. The Meaning of the “Last State” of the Operation May Vary: Due to the nature of eventual consistency, the “current” state of the data state may not be consistent across the system. It becomes consistent over time.

4. Example CRDT Implementation

In the following example, we will demonstrate the logic of G-Counter (counter that only increments) in a distributed environment in a simple way with the Go language. This counter keeps the incremental values ​​in each replica and the values ​​between the replicas are merged to reach the total result.

Note: This code example is not a completely “production-grade” level CRDT, but is intended to explain the concept.

package main

import (
    "fmt"
    "sync"
)

// GCounter represents a simple grow-only counter CRDT.
type GCounter struct {
    // internal slice holds the counters for each replica
    values []int
    mu     sync.Mutex
}

// NewGCounter initializes a GCounter with a given number of replicas.
func NewGCounter(numReplicas int) *GCounter {
    return &GCounter{
        values: make([]int, numReplicas),
    }
}

// Increment increments the counter for the given replica ID.
func (c *GCounter) Increment(replicaID int) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.values[replicaID]++
}

// Value returns the total count by summing all replica counters.
func (c *GCounter) Value() int {
    c.mu.Lock()
    defer c.mu.Unlock()

    total := 0
    for _, v := range c.values {
        total += v
    }
    return total
}

// Merge merges another GCounter into the current one. 
// It takes the maximum value for each replica index.
func (c *GCounter) Merge(other *GCounter) {
    c.mu.Lock()
    defer c.mu.Unlock()

    // We assume both counters have the same length.
    for i := 0; i < len(c.values); i++ {
        if other.values[i] > c.values[i] {
            c.values[i] = other.values[i]
        }
    }
}

func main() {
    // Suppose we have 3 replicas in the system
    gc1 := NewGCounter(3)
    gc2 := NewGCounter(3)
    gc3 := NewGCounter(3)

    // Replica 1 increments its local counter
    gc1.Increment(0) // increment by replicaID = 0
    gc1.Increment(0)
    gc1.Increment(0)

    // Replica 2 increments its local counter
    gc2.Increment(1) // increment by replicaID = 1
    gc2.Increment(1)

    // Replica 3 increments its local counter
    gc3.Increment(2) // increment by replicaID = 2
    gc3.Increment(2)
    gc3.Increment(2)
    gc3.Increment(2)

    // Let's merge them to simulate synchronization
    gc1.Merge(gc2)
    gc1.Merge(gc3)

    // Now all replicas would eventually do the same merges:
    gc2.Merge(gc1)
    gc3.Merge(gc1)

    // The total value of any merged replica should be:
    // gc1 replica: 3 increments (by replica 0)
    // gc2 replica: 2 increments (by replica 1)
    // gc3 replica: 4 increments (by replica 2)
    // => total = 3 + 2 + 4 = 9
    fmt.Println("Final merged value at gc1:", gc1.Value())
    fmt.Println("Final merged value at gc2:", gc2.Value())
    fmt.Println("Final merged value at gc3:", gc3.Value())
}

When the program is run, the output will be as follows.

Final merged value at gc1: 9
Final merged value at gc2: 9
Final merged value at gc3: 9

The working version of the program can be accessed from here.

  1. GCounter Structure (Struct): The values ​​array holds each replica’s own local counter.
  2. Increment: Increments the counter of the specified replicaID.
  3. Value: Returns the current total value by adding the counters of all replicas.
  4. Merge: Combines two GCounters. Gets the maximum value for the same index (replicaID). Thus, no one’s increment is lost. This is based on the “grow-only” logic.

In a real system, these values ​​can be transferred to other replicas over the network using gossip protocols or a messaging system. In addition, updates coming at certain intervals in replicas can be merged.


CRDTs (Conflict-free Replicated Data Types) provide effective solutions to data consistency and conflict problems frequently encountered in today’s distributed systems. Although they require more metadata than traditional distributed data structures, they are an important concept in providing strong eventual consistency.

The CRDT approach is frequently encountered in microservices, real-time collaboration (collaborative editing) environments, and many scenarios where data consistency is critical. By choosing the CRDT structure appropriate for your application requirements, you can reduce data consistency problems in the distributed environment to a largely manageable level. The convenience provided by CRDTs is extremely valuable, especially in environments where complex problems such as increasing user numbers, number of data centers, and geographical distribution are experienced.


Resources