PAXOS Algorithm: Achieving Consensus in Distributed Systems
January 5, 2025 • ☕️ 7 min read • 🏷 computer, software, algorithm
Translated by author into: English
In distributed systems, reaching consensus among multiple nodes is a difficult problem. The PAXOS algorithm is a well-accepted protocol in the academic world that was developed to provide reliable consensus in distributed systems. Introduced by Leslie Lamport in the 1990s, this algorithm is especially used in systems where fault tolerance is critical.
1. The Importance of Consensus in Distributed Systems
Distributed systems are systems consisting of components spread across different geographical locations or different physical machines.
Example: Distributed approaches are used in many scenarios such as microservice architecture, large data processing clusters, database replication. Consensus refers to the process by which nodes in a distributed system reach a common decision on a certain value or situation.
Example: If there are many copies of the same data (replicas), a decision must be reached for them to be consistent. Problem: Network errors, delays, and node failures complicate this process. In summary, one of the most critical problems in distributed systems is to ensure that different nodes reach a single truth (ground truth) at the same time.
2. What is Paxos?
Paxos is a consensus algorithm proposed by Leslie Lamport to reach a common decision in a fault-tolerant manner in distributed systems. This idea, introduced in the late 1990s, is still valid in today’s modern distributed systems and forms the basis of most similar algorithms.
Purpose: To ensure that the majority of nodes in the system accept the same value. Important Note: Paxos is designed to work in the asynchronous network model (the delay of messages in the network is unknown) and is based on the logic of “If a value is accepted, guarantee it”.
3. Basic Components and Roles
There are three basic roles in Paxos:
- Proposer
- Responsible for proposing a new value.
- There can be one or more Proposers in the system.
- Acceptor
- Nodes that decide to accept or reject proposals.
- If they accept a proposal, they pass the value of that proposal to other nodes.
- Learner
- Responsible for learning the accepted value.
- Typically responsible for informing the rest of the system what the final decision is.
In real life, some nodes can take on the roles of Proposer, Acceptor, and Learner at the same time. However, these roles are conceptually separated for the sake of the algorithm’s logic.
4. Steps of the Paxos Algorithm
The Paxos process consists of roughly two (in some models three) basic phases. In the literature, these phases are generally referred to as Prepare and Accept. The phase where the accepted value is notified to all Learners is called Learn.
4.1 Phase 1: Prepare
-
Proposal Number:
- Each Proposer chooses a unique and monotonically increasing proposal number (example: 1, 2, 3…).
- It is important that the proposal number is unique; it is usually generated with strategies such as node_id + timestamp.
-
Prepare Message:
- The Proposer sends the message “prepare(n)” to the Acceptors in a majority (majority) set. Here n is the proposal number.
-
Promise:
- If an Acceptor has not previously made a promise with a proposal number m ≥ n, it sends a “promise” message to the Proposer. This means “I will not accept any other proposals with a lower number than this”.
- Also, the Acceptor informs the Proposer of the value of the highest proposal number it has previously accepted (if any).
Objective: The Proposer receives a “promise” response from a sufficient number of Acceptors to ensure that the value it will present in the next step is considered by the majority.
4.2 Phase 2: Accept
-
Value Determination:
- The Proposer examines the responses it receives from the Acceptors in Phase 1.
- If the Acceptors have previously accepted a proposal containing a certain value, the Proposer must re-propose this value by choosing the value with the highest proposal number.
- If no Acceptor has previously received an accepted value, the Proposer can submit its own proposed value.
-
Accept Message:
- The Proposer sends a message in the form of “accept(n, v)” (proposal number n, value v) to the same majority group (Acceptors).
-
Accept:
- If the Acceptor has not “promised” a proposal with a number higher than the proposal number n it sent, it accepts this value and notifies the Proposer (and/or Learners) that it has accepted it.
Purpose: At the end of Phase 2, the majority has accepted the same value at least once. When a value is accepted by the majority, that value is considered chosen by Paxos.
4.3 Phase 3: Learn
- The chosen value is distributed to the Learners.
- The Learners learn what the result is based on the “accepted” message from the Acceptors.
- The rest of the system agrees on this value.
5. Paxos Features
Paxos targets two critical features:
-
Safety
- Prevents two different values from being chosen at the same time, whether in the same round or in different rounds.
- If a value is chosen, it is not possible for the system to choose another value to replace that value.
-
Liveness
- When the system is sufficiently stable, it ensures that the suggested values are chosen after a certain period of time.
- However, liveness may be delayed if there are permanent network partitions or very frequent errors in the network.
Paxos’s safety feature provides a guarantee of “not choosing two different values at the same time”; The liveness property ensures that “a value will eventually be chosen”.
6. Multi-Paxos and Advanced Versions
Instead of reaching consensus on a single value, a set of values or a set of commands may be agreed upon. For example; in a distributed database, you may want to determine the order of a series of commands (insert, update, delete).
-
Multi-Paxos:
- Instead of the same nodes running Paxos over and over again, a fixed Proposer called the leader is assigned and tries to force a set of values to be accepted in order.
- In this way, a new leader is not elected in each round, and performance increases.
-
Raft, Zab, etc.:
- Other consensus algorithms inspired by Paxos also try to offer high performance and simplicity with similar steps such as “leader election”, “log replication”.
7. Advantages and Disadvantages of Paxos
Advantages:
- Basic Theoretical Basis: It is one of the most studied and proven consensus algorithms in the literature.
- Security Guarantee: Provides consistent results even in asynchronous network environments.
- Flexibility: Adaptable to many scenarios and different application architectures.
Disadvantages:
- Implementation Complexity: Although Paxos is clear in academic articles, it is difficult to code and manage in practice.
- Message Traffic Between Phases: The cost of Phase 1 and Phase 2 messaging is high. It requires careful design, especially in systems where high performance is required.
- Leadership Election May Be Required: In Multi-Paxos, it is necessary to design additional management and fault tolerance layers such as leader election and leader failure.
8. Real-Life Applications
- Google Chubby: Google’s internal distributed lock system uses a mechanism based on Paxos.
- Microsoft Azure: Similar algorithms are used in distributed database and blob storage infrastructures.
- Basic Database Replication: Large-scale, high-availability critical systems often use Paxos or similar consensus algorithms.
Although simplified variants of Paxos that are easier to manage in production environments (e.g. Raft) have become more popular today, the theoretical infrastructure of Paxos still forms the basis of many consensus algorithms.
The Paxos algorithm is one of the most prominent classical solutions for achieving consensus in distributed systems. It ensures that the majority accepts a common value despite difficulties such as network delays, node failures, and asynchronous communication.
- Provides security and liveness guarantees.
- Improved versions such as Multi-Paxos are widely used in practical use to accept multiple values sequentially.
- Since it is complex in terms of design and implementation, more understandable alternatives such as Raft are also preferred in practice.
Nevertheless, Paxos is a very important paradigm that comes to mind first when it comes to distributed consensus and is considered the ancestor of different algorithms. Especially in systems that require high availability and consistency, building Paxos-like mechanisms has become an inevitable necessity.
Resources
- https://en.wikipedia.org/wiki/Paxos_(computer_science)
- https://www.sciencedirect.com/topics/computer-science/paxos-algorithm
- https://lamport.azurewebsites.net/pubs/lamport-paxos.pdf
- https://dl.acm.org/doi/10.1145/226643.226647
- https://raft.github.io/raft.pdf
- https://gyuho.dev/consensus-systems/paxos-etcd-vs-nakamoto-bitcoin/