Multi-Shot Distributed Transaction Commit (Extended Version)
Atomic Commit Problem (ACP) is a single-shot agreement problem similar to consensus, meant to model the properties of transaction commit protocols in fault-prone distributed systems. We argue that ACP is too restrictive to capture the complexities of modern transactional data stores, where commit protocols are integrated with concurrency control, and their executions for different transactions are interdependent. As an alternative, we introduce Transaction Certification Service (TCS), a new formal problem that captures safety guarantees of multi-shot transaction commit protocols with integrated concurrency control. TCS is parameterized by a certification function that can be instantiated to support common isolation levels, such as serializability and snapshot isolation. We then derive a provably correct crash-resilient protocol for implementing TCS through successive refinement. Our protocol achieves a better time complexity than mainstream approaches that layer two-phase commit on top of Paxos-style replication.
💡 Research Summary
The paper tackles a fundamental limitation of the classic Atomic Commit Problem (ACP), which models transaction commit as a single‑shot agreement akin to consensus. Modern distributed databases, however, execute many transactions concurrently and intertwine their commit protocols with optimistic concurrency control. To capture this, the authors introduce the Transaction Certification Service (TCS), a formal abstraction that accepts certify(t) requests and returns decide(t, d), where d ∈ {commit, abort}. TCS is parameterized by a certification function f : 2^T × T → {commit, abort}, where T is the set of all transactions; f(S, t) is the decision for t given the set S of previously committed transactions, and it encodes the desired isolation level (e.g., serializability, snapshot isolation). Crucially, f must be distributive: for any two sets of committed transactions T₁ and T₂, f(T₁ ∪ T₂, t) = f(T₁, t) ⊓ f(T₂, t), where ⊓ is a conjunction‑like operator that yields commit only if both operands are commit and abort otherwise. This property reflects the fact that most conflict checks can be evaluated one committed transaction at a time.
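The distributivity requirement can be illustrated with a small Python sketch. Everything here is an assumption made for illustration rather than the paper's definitions: transactions are modeled as frozensets of written keys, `f` is a toy certification function that aborts on any overlapping write, and `meet` stands in for ⊓.

```python
from itertools import combinations

COMMIT, ABORT = "commit", "abort"

def meet(a, b):
    """Stand-in for the ⊓ operator: commit only if both operands are commit."""
    return COMMIT if a == COMMIT and b == COMMIT else ABORT

def f(committed, t):
    """Toy certification function (hypothetical): abort if t writes a key
    that some committed transaction also wrote."""
    return ABORT if any(t & c for c in committed) else COMMIT

def is_distributive_on(committed, t):
    """Check f(T1 ∪ T2, t) == f(T1, t) ⊓ f(T2, t) for every way of
    splitting `committed` into two disjoint parts T1 and T2."""
    items = list(committed)
    for r in range(len(items) + 1):
        for part in combinations(items, r):
            t1 = set(part)
            t2 = set(items) - t1
            if f(t1 | t2, t) != meet(f(t1, t), f(t2, t)):
                return False
    return True

committed = {frozenset({"x"}), frozenset({"y"})}
t = frozenset({"y", "z"})
print(f(committed, t), is_distributive_on(committed, t))
```

Because the toy `f` inspects each committed transaction independently, splitting the committed set and combining the partial results with `meet` gives the same answer as certifying against the whole set.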
Building on TCS, the authors design a multi‑shot two‑phase commit (2PC) protocol. Each shard maintains an ordered array of received transactions, a “next” pointer, and two sets: already committed transactions (T_commit) and prepared but not yet decided transactions (T_prepare). Upon receiving a PREPARE, a shard stores the transaction, computes two local votes using the shard‑specific functions f_s (against T_commit) and g_s (against T_prepare), combines them into a single vote, and sends the transaction’s coordinator a PREPARE_ACK containing that vote and the transaction’s position in the array. The coordinator aggregates the votes from all involved shards using the ⊓ operator, decides commit or abort, and asynchronously notifies the client and all involved shards. The local functions f_s and g_s are shard‑restricted versions of the global certification function: they check conflicts only on objects owned by the shard.
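The prepare and decide steps described above might be sketched as follows. This is a single-process illustration under several assumptions, not the paper's protocol: messages are replaced by direct method calls, the names `Shard`, `Txn`, and `certify` are hypothetical, conflicts are plain write-write overlaps, and the two local votes are combined with the same `meet` stand-in used for ⊓.

```python
from dataclasses import dataclass, field
from functools import reduce

COMMIT, ABORT = "commit", "abort"

def meet(a, b):
    """Stand-in for ⊓: commit only if both operands are commit."""
    return COMMIT if a == COMMIT and b == COMMIT else ABORT

@dataclass(frozen=True)
class Txn:
    name: str
    writes: frozenset  # keys written by the transaction

@dataclass
class Shard:
    owned: frozenset                              # keys this shard owns
    log: list = field(default_factory=list)       # ordered array of received transactions
    committed: set = field(default_factory=set)   # log positions decided commit
    prepared: set = field(default_factory=set)    # log positions prepared, undecided

    def _conflicts(self, positions, t):
        mine = t.writes & self.owned              # only look at objects owned here
        return any(mine & self.log[p].writes for p in positions)

    def prepare(self, t):
        """Handle PREPARE: compute both local votes, store t, return (position, vote)."""
        f_vote = ABORT if self._conflicts(self.committed, t) else COMMIT
        g_vote = ABORT if self._conflicts(self.prepared, t) else COMMIT
        self.log.append(t)
        pos = len(self.log) - 1
        self.prepared.add(pos)
        return pos, meet(f_vote, g_vote)

    def decide(self, pos, decision):
        """Handle the coordinator's decision for the transaction at `pos`."""
        self.prepared.discard(pos)
        if decision == COMMIT:
            self.committed.add(pos)

def certify(t, shards):
    """Coordinator: gather votes from the involved shards and combine them."""
    acks = [(s, *s.prepare(t)) for s in shards if t.writes & s.owned]
    decision = reduce(meet, (vote for _, _, vote in acks), COMMIT)
    for s, pos, _ in acks:
        s.decide(pos, decision)
    return decision

shards = [Shard(frozenset({"x"})), Shard(frozenset({"y"}))]
print(certify(Txn("t1", frozenset({"x", "y"})), shards))
print(certify(Txn("t2", frozenset({"y"})), shards))
```

Checking against the prepared set is what makes the protocol multi-shot: a shard can vote on a new transaction while earlier ones are still undecided, at the cost of conservatively aborting when it overlaps with a pending one.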
To make the service crash‑resilient, the paper weaves the multi‑shot 2PC with a Paxos‑style replication layer. Traditional fault‑tolerant 2PC would run a full Paxos instance for each shard and also replicate the coordinator, incurring two consensus rounds per transaction. The authors observe that only the prepare phase needs a strong leader; the final decision can be propagated asynchronously. Thus, a single Paxos round suffices to persist the decision at the shard leader, after which the decision is gossiped to replicas. If the leader crashes, any replica can read the current state and assume the coordinator role without violating safety, because all coordinators compute the same ⊓ of votes. This “weaving” reduces the worst‑case latency to the lower bound for consensus and matches known lower bounds for non‑blocking atomic commit.
The paper provides rigorous proofs. First, it shows that the multi‑shot 2PC protocol implements a correct TCS: every complete history’s committed sub‑history admits a legal linearization respecting the certification function. Second, it proves that the Paxos‑woven implementation simulates the multi‑shot 2PC, thereby inheriting its correctness. Consequently, the system guarantees safety (no two conflicting transactions commit) and liveness (a decision is eventually reached when failures do not persist).
Beyond the core protocol, the authors discuss how different certification functions instantiate common isolation levels. For serializability, f checks that no version read by t has been overwritten by an already committed transaction. For snapshot isolation, f checks only for write‑write conflicts. Both functions satisfy the distributivity requirement, so the same protocol supports multiple isolation guarantees without redesign.
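The two checks can be contrasted in a short Python sketch. The transaction model here is an assumption for illustration: each transaction records the (key, version) pairs it read, the keys it wrote, and a commit version, and the snapshot isolation variant assumes a transaction reads every key it writes so that the read version marks its snapshot. The names `Txn`, `f_serializable`, and `f_snapshot` are hypothetical.

```python
from dataclasses import dataclass

COMMIT, ABORT = "commit", "abort"

@dataclass(frozen=True)
class Txn:
    reads: tuple          # (key, version_read) pairs
    writes: frozenset     # keys written
    commit_version: int   # version assigned to the writes if the transaction commits

def overwritten(committed, key, version):
    """True if some committed transaction wrote `key` at a newer version."""
    return any(key in c.writes and c.commit_version > version for c in committed)

def f_serializable(committed, t):
    """Abort if any version t read has since been overwritten."""
    stale = any(overwritten(committed, k, v) for k, v in t.reads)
    return ABORT if stale else COMMIT

def f_snapshot(committed, t):
    """Abort only on write-write conflicts: a key t writes was overwritten
    after t's snapshot (assumes t read each key it writes)."""
    clash = any(k in t.writes and overwritten(committed, k, v) for k, v in t.reads)
    return ABORT if clash else COMMIT

committed = [Txn(reads=(), writes=frozenset({"x"}), commit_version=5)]

# Reads a stale x but writes only y: rejected under serializability,
# accepted under snapshot isolation.
t = Txn(reads=(("x", 3), ("y", 3)), writes=frozenset({"y"}), commit_version=6)
print(f_serializable(committed, t), f_snapshot(committed, t))
```

Both functions examine each committed transaction independently, which is why certification functions of this shape can be split across sets of committed transactions and recombined.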
In summary, the work makes three major contributions: (1) a clean formal model (TCS) that unifies commit and concurrency control for multi‑shot workloads; (2) a generic multi‑shot 2PC algorithm that works with any distributive certification function; and (3) a fault‑tolerant implementation that tightly integrates 2PC with Paxos, achieving lower latency than the naïve 2PC‑over‑Paxos approach while retaining provable correctness. The work advances the theory of distributed transaction processing and offers a practical blueprint for building high‑performance, strongly consistent data stores.