Middleware-based Database Replication: The Gaps between Theory and Practice


📝 Original Info

  • Title: Middleware-based Database Replication: The Gaps between Theory and Practice
  • ArXiv ID: 0712.2773
  • Date: 2008-11-05
  • Authors: Emmanuel Cecchet, George Candea, Anastasia Ailamaki

📝 Abstract

The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.


📄 Full Content

Appears in Proceedings of the ACM SIGMOD Conference, Vancouver, Canada (June 2008)

Middleware-based Database Replication: The Gaps Between Theory and Practice

Emmanuel Cecchet
EPFL, Lausanne, Switzerland
emmanuel.cecchet@epfl.ch

George Candea
EPFL & Aster Data Systems, Lausanne, Switzerland
george.candea@epfl.ch

Anastasia Ailamaki
EPFL & Carnegie Mellon University, Lausanne, Switzerland
anastasia.ailamaki@epfl.ch

ABSTRACT

The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.

Categories and Subject Descriptors: C.2.4 [Distributed Systems]: Distributed databases; H.2.4 [Systems]: Distributed databases

General Terms: Performance, Design, Reliability.

Keywords: Middleware, database replication, practice and experience.

  1. INTRODUCTION

Despite Gray’s warning on the dangers of replication [18] over a decade ago, industry and academia have continued building replication systems for databases. The reason is simply that replication is the only tried-and-true mechanism for scaling performance and availability of databases across a wide range of requirements.

There exist replication “solutions” for every major DBMS, from Oracle RAC™, Streams™ and DataGuard™ to Slony-I for Postgres, MySQL replication and cluster, and everything in-between. The naïve observer may conclude that such variety of replication systems indicates a solved problem; the reality, however, is the exact opposite. Replication still falls short of customer expectations, which explains the continued interest in developing new approaches, resulting in a dazzling variety of offerings.

Even the “simple” cases are challenging at large scale. We deployed a replication system for a large travel ticket brokering system at a Fortune 500 company faced with a workload where 95% of transactions were read-only. Still, the 5% write workload resulted in thousands of update requests per second, which implied that a system using 2-phase-commit, or any other form of synchronous replication, would fail to meet customer performance requirements (thus confirming Gray’s prediction [18]). This tradeoff between availability and performance has long been a hurdle to developing efficient replication techniques.

In practice, the performance/availability tradeoff can be highly discontinuous. In the same ticket broker system mentioned above, the difference between a 30-second and a one-minute outage determines whether travel agents retry their requests or decide to switch to another broker for the rest of the day (“the competition is one click away”). Compounded across the hundreds of travel agencies that connect to the broker system daily for hotel bookings, airline tickets, car rentals, etc., the impact of one minute of downtime comes close to that of a day-long outage. The replication system needs to be mindful of the implied failover requirements, and obtaining predictable behavior is no mean feat.
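
The read/write split in the ticket broker example can be made concrete with a back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes a simple read-one/write-all scheme in which reads are spread evenly over the replicas while every replica applies every write, and the total rate of 40,000 requests per second is a hypothetical figure chosen only so that a 5% write share yields "thousands" of updates per second. The function names are likewise illustrative.

```python
def per_replica_load(total_rps: int, write_percent: int, replicas: int) -> float:
    """Requests/s handled by each replica under read-one/write-all.

    Assumption (not from the paper): reads are balanced evenly across
    replicas, and every replica must apply every write.
    """
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    if not 0 <= write_percent <= 100:
        raise ValueError("write_percent must be between 0 and 100")
    writes = total_rps * write_percent / 100
    reads = total_rps - writes
    return reads / replicas + writes


def max_speedup(write_percent: int) -> float:
    """Upper bound on throughput scaling as the replica count grows."""
    if write_percent == 0:
        return float("inf")
    return 100 / write_percent


if __name__ == "__main__":
    TOTAL_RPS = 40_000      # hypothetical aggregate request rate
    WRITE_PERCENT = 5       # write share quoted in the text
    for n in (1, 2, 4, 8, 16):
        load = per_replica_load(TOTAL_RPS, WRITE_PERCENT, n)
        print(f"{n:2d} replicas: {load:8.0f} requests/s per replica")
    print("speedup bound:", max_speedup(WRITE_PERCENT))
```

Under these assumptions, adding replicas shrinks only the read share of each replica's load, so the achievable speedup is bounded by 100/5 = 20 no matter how large the cluster grows, and that bound ignores any commit coordination cost such as 2-phase commit.
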
Our premise is that, by carefully observing real users’ needs and transforming them into research goals, the community can bridge the mismatch between existing replication systems and customers’ expectations within the coming decade. We sift through the last decade of database replication in academic, industrial, and open-source projects. Combining this analysis with 45 person-years of experience building and deploying replicated database systems, we identify the unanswered challenges of practical replication. We find that a few “hot topics” (e.g., reliable multicast and lazy replication [21]) attract the lion’s share of academic interest, while other equally important aspects (e.g., availability and management) are often forgotten; this limits the imp

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
