Performance Comparison of Various STM Concurrency Control Protocols Using Synchrobench
Writing concurrent programs for shared-memory multiprocessor systems is notoriously difficult, which keeps users from exploiting the full potential of multiprocessors. Software Transactional Memory (STM) is a promising concurrent programming paradigm that addresses these difficulties. In this paper, we implement the BTO (Basic Timestamp Ordering), SGT (Serialization Graph Testing) and MVTO (Multi-Version Timestamp Ordering) concurrency control protocols and build an STM library to evaluate their performance. The STM follows the deferred-write approach. A SET data structure is implemented using the transactions of our STM library, and this transactional SET serves as the test application for evaluating the STM. The performance of the protocols is rigorously compared against the linked-list module of the Synchrobench benchmark, which implements the SET data structure using a lazy list, a lock-free list, a lock-coupling list and ESTM (Elastic Software Transactional Memory). Our analysis shows that, for more than 60 threads and an update rate of 70%, BTO takes 17% to 29% and 6% to 24% less CPU time per thread than the lazy list and the lock-coupling list, respectively. MVTO takes 13% to 24% and 3% to 24% less CPU time per thread than the lazy list and the lock-coupling list, respectively. BTO and MVTO have similar per-thread CPU time, and both outperform SGT by 9% to 36%.
💡 Research Summary
The paper addresses the difficulty of writing correct and scalable parallel programs for shared‑memory multiprocessors by exploring Software Transactional Memory (STM) as a higher‑level abstraction. Three classic database concurrency control protocols—Basic Timestamp Ordering (BTO), Serialization Graph Testing (SGT), and Multi‑Version Timestamp Ordering (MVTO)—are implemented within an STM middleware that follows a deferred‑write (lazy‑write) model. The middleware exposes four primitive operations (tm_begin, tm_read, tm_write, tm_commit) and stores shared objects in a thread‑safe map provided by Intel’s Threading Building Blocks (TBB).
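The deferred-write model and the four primitives can be illustrated with a minimal Python sketch. This is not the authors' library: the class name SketchSTM, the TxAborted exception, the dictionary-based store, and the version-check validation are all assumptions made for illustration, and the validation shown is a plain version comparison rather than BTO, SGT, or MVTO.

```python
import threading


class TxAborted(Exception):
    """Raised when a transaction fails validation at commit time."""


class SketchSTM:
    """Hypothetical deferred-write STM: writes stay private until commit."""

    def __init__(self):
        self._store = {}                      # key -> (value, version)
        self._commit_lock = threading.Lock()  # serializes validation + write-back

    def tm_begin(self):
        # A transaction is just a private read set and write set.
        return {"reads": {}, "writes": {}}

    def tm_read(self, tx, key):
        # A transaction sees its own buffered writes first.
        if key in tx["writes"]:
            return tx["writes"][key]
        value, version = self._store.get(key, (None, 0))
        tx["reads"].setdefault(key, version)  # remember the version first seen
        return value

    def tm_write(self, tx, key, value):
        # Deferred write: only the private buffer changes here.
        tx["writes"][key] = value

    def tm_commit(self, tx):
        with self._commit_lock:
            # Validate: every object read must still be at the version seen.
            for key, seen in tx["reads"].items():
                if self._store.get(key, (None, 0))[1] != seen:
                    raise TxAborted(key)
            # Write back the buffered updates, bumping each version.
            for key, value in tx["writes"].items():
                version = self._store.get(key, (None, 0))[1]
                self._store[key] = (value, version + 1)
```

Because nothing reaches the shared store before tm_commit, an aborted transaction needs no undo step; it simply discards its buffers.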
To evaluate the protocols, the authors build a transactional SET data structure implemented as a sorted linked list. The SET provides three operations (add, remove, contains), each wrapped in its own transaction that uses the STM primitives. This STM‑based SET serves as the test application. Its performance is compared against four implementations from the Synchrobench benchmark: lazy‑list, lock‑coupling list (both with spin‑locks and mutexes), lock‑free list, and Elastic STM (ESTM).
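The shape of such a SET can be sketched as follows. The sorted-list logic is standard, but the transaction() context manager is a hypothetical stand-in: it uses a single global lock where the real application would call tm_begin and tm_commit, so it shows only where the transaction boundaries fall, not how the STM resolves conflicts.

```python
import threading
from contextlib import contextmanager

_global_lock = threading.RLock()


@contextmanager
def transaction():
    # Stand-in (assumption) for tm_begin ... tm_commit: one global lock.
    with _global_lock:
        yield


class Node:
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class SortedListSet:
    """Set of comparable keys stored as a sorted singly linked list."""

    def __init__(self):
        self.head = Node(float("-inf"))  # sentinel smaller than any key

    def _locate(self, key):
        # Return (pred, curr) where curr is the first node with key >= target.
        pred, curr = self.head, self.head.next
        while curr is not None and curr.key < key:
            pred, curr = curr, curr.next
        return pred, curr

    def add(self, key):
        with transaction():
            pred, curr = self._locate(key)
            if curr is not None and curr.key == key:
                return False             # already present
            pred.next = Node(key, curr)
            return True

    def remove(self, key):
        with transaction():
            pred, curr = self._locate(key)
            if curr is None or curr.key != key:
                return False             # nothing to remove
            pred.next = curr.next
            return True

    def contains(self, key):
        with transaction():
            _, curr = self._locate(key)
            return curr is not None and curr.key == key
```

Each operation traverses from the head, which is why long lists produce large read sets and make the list a demanding test case for an STM.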
Experiments are conducted on an Intel Core i3 dual‑core processor (3.20 GHz) with 3 GB of RAM, running Ubuntu 16.04 and g++ 5.4. The workload consists of 70 % update operations (adds or removes) and the number of concurrent threads varies from 10 to 100. Three timing metrics are recorded: wall‑clock (real) time, total CPU time, and per‑thread CPU time (using Linux’s CLOCK_THREAD_CPUTIME_ID).
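A rough Python sketch of this measurement setup is shown below. The key range, the even split between adds and removes, the lock-protected built-in set used as the shared structure, and the averaging of per-thread CPU time across threads are all assumptions; only the 70 % update ratio and the use of CLOCK_THREAD_CPUTIME_ID come from the summary. Because CPython threads do not execute bytecode in parallel, the numbers illustrate how the three metrics are taken, not the paper's results.

```python
import random
import threading
import time


def worker(shared, lock, ops, update_ratio, seed, per_thread, idx):
    rng = random.Random(seed)
    # CPU time consumed by the calling thread only (Linux/Unix clock).
    start = time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID)
    for _ in range(ops):
        key = rng.randrange(100)          # assumed key range
        with lock:
            if rng.random() < update_ratio:
                # Update: add or remove with equal probability (assumption).
                if rng.random() < 0.5:
                    shared.add(key)
                else:
                    shared.discard(key)
            else:
                _ = key in shared         # read-only membership test
    per_thread[idx] = time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID) - start


def measure(num_threads=10, ops=1000, update_ratio=0.7):
    shared, lock = set(), threading.Lock()
    per_thread = [0.0] * num_threads
    threads = [
        threading.Thread(
            target=worker,
            args=(shared, lock, ops, update_ratio, i, per_thread, i),
        )
        for i in range(num_threads)
    ]
    real_start = time.perf_counter()      # wall-clock time
    cpu_start = time.process_time()       # CPU time of the whole process
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {
        "real": time.perf_counter() - real_start,
        "cpu_total": time.process_time() - cpu_start,
        "cpu_per_thread": sum(per_thread) / num_threads,
    }
```

Calling measure(num_threads=100) corresponds to the largest thread count in the experiments.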
Results show that BTO consistently outperforms SGT and, in many cases, MVTO when per‑thread CPU time is considered. Specifically, for 100 threads BTO reduces per‑thread CPU time by 24 %–33 % compared with lazy‑list and lock‑coupling list using spin‑locks. MVTO achieves similar reductions (22 %–36 %) against a broader set of Synchrobench implementations, but its advantage diminishes for total CPU time and real time where ESTM and lock‑free structures are faster. SGT generally lags behind the other two protocols, suffering from the overhead of maintaining a conflict graph and the associated locking required for graph updates.
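The cost attributed to SGT can be seen in a minimal sketch of a lock-protected conflict graph. This follows the textbook SGT idea (reject an operation whose conflict edge would close a cycle), not the authors' code; the ConflictGraph class and its method names are hypothetical.

```python
import threading


class ConflictGraph:
    """Hypothetical serialization graph: an edge a -> b means a precedes b."""

    def __init__(self):
        self._edges = {}                # tx id -> set of successor tx ids
        self._lock = threading.Lock()   # every graph update takes this lock

    def _reaches(self, src, dst):
        # Iterative depth-first search: is there a path from src to dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self._edges.get(node, ()))
        return False

    def add_conflict(self, earlier, later):
        """Record earlier -> later; return False if that would form a cycle."""
        with self._lock:
            if earlier == later:
                return True             # a transaction never conflicts with itself
            if self._reaches(later, earlier):
                return False            # caller should abort the transaction
            self._edges.setdefault(earlier, set()).add(later)
            self._edges.setdefault(later, set())
            return True

    def remove(self, tx):
        # Drop a transaction (for example after it aborts) and edges into it.
        with self._lock:
            self._edges.pop(tx, None)
            for successors in self._edges.values():
                successors.discard(tx)
```

Every conflicting access pays for a graph search while holding the single graph lock, which is consistent with the synchronization bottleneck described for SGT.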
The authors discuss why the observed trends occur. BTO’s simplicity—maintaining only two timestamps per object—keeps metadata overhead low and scales well with thread count. MVTO’s multi‑version approach improves read‑heavy scenarios but incurs memory pressure and garbage‑collection costs as versions accumulate. SGT’s graph‑based cycle detection introduces significant synchronization bottlenecks, especially under high contention. All three protocols share the same deferred‑write model, meaning that validation at commit time dominates the critical path; thus, any reduction in validation complexity directly benefits performance.
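The difference in per-object metadata can be sketched with the textbook rules for the two timestamp protocols. The classes below are illustrative assumptions, not the paper's code: they are not thread-safe, they apply each check immediately instead of deferring write checks to commit, they assume transaction timestamps start at 1, and they omit the garbage collection of old versions.

```python
import bisect


class TxAborted(Exception):
    """Raised when a timestamp rule forces the transaction to abort."""


class BTOObject:
    """BTO keeps one value plus two timestamps per object."""

    def __init__(self, value=None):
        self.value = value
        self.max_read_ts = 0
        self.max_write_ts = 0

    def read(self, ts):
        if ts < self.max_write_ts:          # a younger transaction already wrote
            raise TxAborted
        self.max_read_ts = max(self.max_read_ts, ts)
        return self.value

    def write(self, ts, value):
        if ts < self.max_read_ts or ts < self.max_write_ts:
            raise TxAborted                 # would invalidate a younger access
        self.value = value
        self.max_write_ts = ts


class MVTOObject:
    """MVTO keeps a list of versions: [write_ts, max_read_ts, value]."""

    def __init__(self, initial=None):
        self.versions = [[0, 0, initial]]   # sorted by write_ts

    def _visible(self, ts):
        # Index of the version with the largest write_ts <= ts.
        keys = [v[0] for v in self.versions]
        return bisect.bisect_right(keys, ts) - 1

    def read(self, ts):
        version = self.versions[self._visible(ts)]
        version[1] = max(version[1], ts)    # reads never abort
        return version[2]

    def write(self, ts, value):
        i = self._visible(ts)
        version = self.versions[i]
        if version[1] > ts:                 # a younger reader saw the old version
            raise TxAborted
        if version[0] == ts:
            version[2] = value              # same transaction overwrites itself
        else:
            self.versions.insert(i + 1, [ts, ts, value])
```

In this sketch a late reader aborts under BTO but is served an older version under MVTO, at the price of a version list that grows until something reclaims it.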
Limitations of the study include the modest hardware platform (dual‑core) which may not reflect behavior on many‑core servers, and the exclusive focus on a single data structure (SET implemented as a linked list). Consequently, the results may not generalize to other structures such as trees, hash tables, or skip lists. The paper also notes that the presented figures have been scaled for readability, which obscures absolute latency values.
In the related-work discussion, the authors position their approach among a spectrum of STM designs: Ennals's STM (inline object metadata), Harris's lock-free STM (eliminating per-transaction logs), DATM (dependency tracking), TL2 (lazy locking with a global version clock), and TinySTM (eager locking). They argue that adapting well-studied database protocols to STM provides a clear baseline for performance comparison, while also highlighting the need for hybrid designs that combine low-overhead metadata with efficient contention management.
Future work outlined includes extending the benchmark suite to include STAMP workloads, testing additional data structures (trees, hash tables, skip lists), and evaluating the protocols on many‑core machines. The authors also plan to explore advanced garbage‑collection strategies, contention managers, and hybrid concurrency controls that might combine the strengths of BTO’s timestamp simplicity with MVTO’s versioning benefits, while mitigating SGT’s graph overhead. The source code of their STM middleware is made publicly available for reproducibility.