Introduction
The Two-Phase Commit (2PC) protocol is the standard approach for achieving atomic commitment in distributed transactions, but it has a well-known vulnerability: if the coordinator crashes after sending PREPARE messages but before sending COMMIT or ABORT, participants that voted YES are left in an uncertain state.
They cannot safely commit or abort without hearing from the coordinator, and they must block, holding locks and resources indefinitely until recovery completes.
Three-Phase Commit (3PC) was introduced by Dale Skeen in 1981 to address this blocking problem.
The core insight is to insert an additional phase between the voting phase and the final commit, creating a protocol that is non-blocking under certain failure assumptions.
Specifically, 3PC guarantees progress (no indefinite blocking) as long as there are no network partitions and the failure detector is reliable.
These assumptions are important, and understanding where they break down is essential for evaluating 3PC in practice.
Why 2PC Blocks
To understand 3PC, you first need to understand why 2PC blocks.
Consider a 2PC execution with coordinator C and participants P1, P2, P3.
Suppose all participants vote YES, and then C crashes before sending any COMMIT or ABORT messages.
Participants P1, P2, and P3 are all alive but uncertain — each has voted YES and is waiting for the coordinator's decision.
The participants cannot abort, because for all they know, C sent a COMMIT to some participant just before crashing and that participant has already committed.
They cannot commit, because C might have decided to abort (perhaps it observed a timeout or received a NO vote before crashing).
They are stuck.
The fundamental issue is that in 2PC, a participant can transition directly from the UNCERTAIN state to COMMITTED, meaning that the remaining live participants cannot distinguish whether a commit or abort decision was made by examining only their own states.
Skeen formalized this by observing that a non-blocking commit protocol must satisfy the following property: no participant should be able to reach the COMMITTED state while any other participant is still in the UNCERTAIN state.
If this holds, then when a participant fails, the surviving participants can always determine the outcome by examining their own states. 3PC achieves this by introducing a PRE-COMMIT state that acts as a buffer between UNCERTAIN and COMMITTED.
Protocol Design
3PC splits the commit process into three distinct phases.
Phase 1: Voting (CanCommit)
The coordinator sends a VOTE-REQUEST to all participants.
Each participant responds with either YES (it is prepared to commit and has written a prepare record to stable storage) or NO (it wants to abort).
This phase is identical to the first phase of 2PC.
Phase 2: Pre-Commit
If all participants voted YES, the coordinator sends a PRE-COMMIT message to all participants.
Upon receiving PRE-COMMIT, each participant acknowledges with an ACK and enters the PRE-COMMIT state.
This state means: "I know that everyone voted YES, and a commit decision is forthcoming, but I have not yet committed."
If any participant voted NO (or if the coordinator times out waiting for votes), the coordinator sends ABORT to all participants and the protocol terminates.
Phase 3: Commit (DoCommit)
Once the coordinator has received ACK from all participants (confirming every participant has entered PRE-COMMIT), it sends a COMMIT message.
Participants commit the transaction and release locks.
The critical invariant is: no participant can be in the COMMITTED state while another is in the UNCERTAIN state.
Any participant that has committed must have first passed through PRE-COMMIT, and a participant in PRE-COMMIT knows that all participants voted YES.
This separation allows a recovery protocol to safely resolve the transaction without blocking.
Walkthrough
Below is a step-by-step walkthrough of the 3PC protocol, including the coordinator's logic and the recovery procedure.
Normal Execution (Coordinator)
1. Coordinator writes BEGIN to log.
2. Coordinator sends VOTE-REQUEST to all participants.
3. Coordinator waits for votes (with timeout).
3a. If any vote is NO, or timeout expires:
- Coordinator sends ABORT to all participants.
- Coordinator writes ABORT to log.
- Terminate.
3b. If all votes are YES:
- Coordinator writes PRE-COMMIT to log.
- Coordinator sends PRE-COMMIT to all participants.
4. Coordinator waits for ACKs (with timeout).
4a. If timeout expires (some participants unresponsive):
- Coordinator sends ABORT to all participants.
- Coordinator writes ABORT to log.
- Terminate.
(Aborting here is safe: since not all ACKs were received,
not all participants have entered PRE-COMMIT, so no one
can have committed yet.)
4b. If all ACKs received:
- Coordinator writes COMMIT to log.
- Coordinator sends COMMIT to all participants.
- Terminate.
Normal Execution (Participant)
1. Participant receives VOTE-REQUEST.
2. If participant can commit:
- Write PREPARE to log.
- Send YES to coordinator.
- Set timeout for Phase 2 message.
Else:
- Write ABORT to log.
- Send NO to coordinator.
- Terminate.
3. Participant waits for Phase 2 message (with timeout).
3a. If ABORT received or timeout expires:
- Write ABORT to log.
- Abort transaction.
- Terminate.
3b. If PRE-COMMIT received:
- Write PRE-COMMIT to log.
- Send ACK to coordinator.
- Set timeout for Phase 3 message.
4. Participant waits for Phase 3 message (with timeout).
4a. If COMMIT received:
- Write COMMIT to log.
- Commit transaction.
- Terminate.
4b. If timeout expires:
- Enter recovery protocol (see below).
Note on Phase 2 timeout (step 3a): A participant that times out waiting for the Phase 2 message is still in the UNCERTAIN state — it has not seen PRE-COMMIT.
It can safely abort because, by the critical invariant, no participant can have committed without first passing through PRE-COMMIT.
Since this participant never received PRE-COMMIT, no participant could yet have committed.
Recovery Protocol
When a participant times out waiting for the COMMIT message in Phase 3, it initiates a termination protocol with the surviving participants.
The termination protocol requires that all surviving participants be reachable by the new coordinator; if a partition prevents this, the protocol cannot proceed safely (see Limitations below).
1. Elect a new coordinator among live participants.
2. New coordinator queries all live participants for their state.
3. Decision rules (applied in priority order):
3a. If any participant is in COMMITTED state:
- Commit all. (A commit decision was already made and acted upon.)
3b. If any participant is in ABORTED state:
- Abort all. (A decision to abort was already made and acted upon.)
3c. If any participant is in UNCERTAIN state (voted YES but never
received PRE-COMMIT):
- Abort all. (Safe because no one has committed: reaching
COMMITTED requires passing through PRE-COMMIT first, and
at least one participant never received PRE-COMMIT.)
3d. If all live participants are in PRE-COMMIT state (and no
participant is in UNCERTAIN, ABORTED, or COMMITTED state):
- Commit all. (Everyone voted YES and acknowledged PRE-COMMIT;
the original coordinator had decided to commit.)
The reason the recovery protocol terminates safely is the buffer state.
In case 3c, we know no one has committed because reaching COMMITTED requires passing through PRE-COMMIT first and at least one participant has not even reached PRE-COMMIT.
In case 3d, we know everyone voted YES and acknowledged the pre-commit, so completing the commit is consistent with the original coordinator's decision.
Importantly, rule 3d is only safe when the termination protocol can confirm that no participant is in UNCERTAIN or ABORTED state.
If any participant is unreachable, the new coordinator cannot safely apply rule 3d, because the unreachable participant might be in UNCERTAIN (triggering rule 3c) or ABORTED (triggering rule 3b).
Failure Model and Limitations
3PC is non-blocking under the assumption of a synchronous network with reliable failure detection.
Specifically, it assumes:
- Fail-stop failures only. Processes crash cleanly and do not recover with inconsistent state. Byzantine failures are not tolerated.
- No network partitions. The recovery protocol requires that the new coordinator can communicate with all surviving participants. If a network partition splits participants into two groups, each group might independently reach different decisions.
- Reliable failure detection. Timeouts must accurately distinguish between a crashed process and a slow one. In an asynchronous network, this distinction is impossible to make perfectly (per the FLP impossibility result).
The network partition vulnerability is particularly serious.
Consider participants {P1, P2, P3} where P1 and P2 are in the PRE-COMMIT state and P3 is still UNCERTAIN (it never received the PRE-COMMIT message).
If a partition separates {P1, P2} from {P3}, then {P1, P2} will elect a new coordinator, observe that all reachable participants are in PRE-COMMIT, and commit (rule 3d).
Meanwhile, P3 will time out, observe that it is itself in UNCERTAIN, and abort (rule 3c).
The result is an inconsistency: one partition commits and the other aborts the same transaction.
This scenario illustrates why 3PC has seen limited adoption in practice.
Real distributed systems operate in asynchronous or partially synchronous networks where partitions are a fact of life.
Protocols like Paxos and Raft, which tolerate partitions at the cost of requiring a majority quorum, have become the preferred foundation for fault-tolerant consensus and atomic commitment.
It is worth noting the asymmetry between 2PC and 3PC under partitions: 2PC blocks (it stops making progress but does not make incorrect decisions), while 3PC can make inconsistent decisions (it makes progress but may produce incorrect outcomes).
In many systems, blocking — which can be resolved by coordinator recovery — is preferable to inconsistency, which may be unrecoverable without external intervention.
Comparison with 2PC
| Property | 2PC | 3PC |
|---|---|---|
| Number of phases | 2 | 3 |
| Message complexity (approx.) | 3n | 5n |
| Blocking on coordinator failure | Yes | No (assuming no partitions) |
| Behavior under partitions | Blocks (safe, no progress) | May produce inconsistent decisions |
| Practical adoption | Widespread | Rare |
Message complexity counts approximately: 2PC uses n VOTE-REQUESTs + n votes + n COMMIT/ABORTs = 3n; 3PC adds n PRE-COMMITs + n ACKs = 5n.
Exact counts vary depending on whether coordinator-to-participant and participant-to-coordinator messages are counted separately.
The tradeoff is clear. 2PC blocks but never makes an incorrect decision under any failure model. 3PC avoids blocking but can make inconsistent decisions under partitions.
In practice, blocking (which can be resolved by coordinator recovery) is often preferable to inconsistency (which may be unrecoverable).
Key Points
- 3PC adds a
PRE-COMMITphase between voting and commit to eliminate the window where a participant can transition directly from uncertain to committed. - The protocol is non-blocking under fail-stop failures with no network partitions, meaning surviving participants can always reach a decision without waiting for crashed nodes to recover.
- The critical invariant is that no participant can be in the
COMMITTEDstate while any other participant remainsUNCERTAIN. - The coordinator must abort — not proceed — if it times out waiting for ACKs in Phase 2. Proceeding without all ACKs would violate the safety invariant because at least one participant might still be in
UNCERTAIN. - Network partitions can cause 3PC to produce inconsistent outcomes: one partition may commit while another aborts the same transaction.
- The termination protocol is only safe when the new coordinator can reach all surviving participants. If any participant is unreachable, the coordinator cannot safely distinguish rule 3c (abort) from rule 3d (commit).
- 3PC requires approximately 5n messages per transaction compared to 3n for 2PC, adding latency through an extra round trip.
- In practice, Paxos-based commit protocols (such as Paxos Commit) provide non-blocking atomic commitment that also tolerates partitions, making 3PC largely obsolete for new system designs.
- The protocol's design illustrates a fundamental principle: in asynchronous networks, you cannot simultaneously guarantee safety (consistency) and liveness (non-blocking progress) without majority-based consensus.
References
Dale Skeen, "Nonblocking Commit Protocols," Proceedings of the 1981 ACM SIGMOD International Conference on Management of Data, 1981.
Dale Skeen and Michael Stonebraker, "A Formal Model of Crash Recovery in a Distributed System," IEEE Transactions on Software Engineering, SE-9(3), 1983.
Jim Gray and Leslie Lamport, "Consensus on Transaction Commit," ACM Transactions on Database Systems, 31(1), 2006.
Philip A. Bernstein, Vassos Hadzilacos, and Nathan Goodman, "Concurrency Control and Recovery in Database Systems," Addison-Wesley, 1987.
Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson, "Impossibility of Distributed Consensus with One Faulty Process," Journal of the ACM, 32(2), 1985.