DBTuneSuite: An Extendible Experimental Suite to Test the Time Performance of Multi-layer Tuning Options on Database Management Systems
DBTuneSuite is a suite of experiments on four widely deployed free database systems to test their performance under various query/upsert loads and under various tuning options. The suite provides: (i) scripts to generate data and to install and run tests, making it expandable to other tests and systems; (ii) suggestions of which systems work best for which query types; and (iii) quantitative evidence that tuning options widely used in practice can behave very differently across systems. This paper is most useful for database system engineers, advanced database users and troubleshooters, and students.
💡 Research Summary
DBTuneSuite is an extensible experimental framework designed to evaluate the time performance of multiple tuning options across four widely used open‑source database management systems: MySQL, MariaDB, PostgreSQL, and DuckDB. The authors motivate their work by posing two practical questions: (i) which tuning choices should be made to accelerate a given workload on a specific DBMS, and (ii) once the workload profile is known, which DBMS is best suited for the application. To answer these, the paper presents a comprehensive suite of scripts for data generation, system installation, and benchmark execution, enabling researchers and practitioners to reproduce the experiments or extend them to other systems and workloads.
The related‑work section surveys prior benchmarking efforts that compare DBMSs (e.g., Andjelić et al. 2008, Salunké & Ouda 2024, Tongkaw & Tongkaw 2016) and prior studies on database tuning (physical design, indexing, denormalization, automatic configuration). The authors note that most existing work focuses on a single layer—either system‑wide performance or a narrow set of physical parameters—whereas real‑world tuning spans schema design, index selection, query formulation, data loading strategies, and configuration parameters. DBTuneSuite therefore adopts a “five‑layer” view of tuning: (1) schema level (column ordering, partitioning, fill‑factor), (2) index level (clustered, non‑clustered, hash, adaptive radix tree), (3) query level (hints, sub‑query rewriting, join ordering), (4) data‑loading level (bulk‑load settings, batch size), and (5) system‑level configuration (buffer pool size, connection pool size, log settings, thread counts).
All experiments run on a homogeneous hardware platform: a NYU CIMS server equipped with four 16‑core AMD Opteron 6272 CPUs (64 physical cores total), 256 GB RAM, and Red Hat Enterprise Linux 9. Each benchmark is executed ten times and averaged. The authors use two data families. The first is a synthetic “employee” table with uniformly distributed columns and controlled duplication patterns, allowing precise measurement of index behavior (including fill‑factor effects). The second is the industry‑standard TPC‑H benchmark at scale factors 0.01667 (≈10⁵ rows) and 1.667 (≈10⁷ rows), providing realistic read‑heavy and mixed read/write workloads.
Key findings include:
- Storage‑engine and fill‑factor effects – In MySQL and MariaDB (both using InnoDB), setting a fill‑factor of 100 degrades INSERT/UPDATE mixed workloads, while PostgreSQL’s MVCC implementation maintains stable sub‑2 s latencies even under high concurrency.
- Index structure impact – DuckDB’s adaptive radix tree (ART) index, combined with its vectorized columnar execution engine, outperforms B‑tree based indexes in PostgreSQL and MySQL on large analytical queries (2–3× higher throughput on TPC‑H).
- Connection‑pool sizing – Small pools (≤5 connections) cause severe latency spikes in PostgreSQL and MariaDB under load, whereas MySQL achieves optimal performance with pools of 10–15 connections, reflecting differing internal thread‑management strategies.
- Query‑level tuning dominance – Reordering joins, applying sub‑query rewrites, and using optimizer hints yields the largest performance gains across all systems (30 %–70 % improvement), confirming the authors’ industry consulting experience that query‑level adjustments often provide the highest ROI.
- Automatic vs. manual tuning – While frameworks such as OtterTune and UDO can automatically discover good configuration settings, the study shows that manual, workload‑specific tuning—especially at the query and index layers—still outperforms generic automated solutions for many realistic scenarios.
The paper concludes that DBTuneSuite fills a gap in the literature by offering a reproducible, extensible platform that evaluates tuning knobs at multiple abstraction levels rather than only system‑wide throughput. The authors argue that the suite is valuable for DBMS developers (to identify performance bottlenecks in their engines), advanced application developers (to decide on schema, indexing, and query strategies), and educators or students (to design controlled experiments and understand variance). Future work includes extending the suite to cloud and container environments, adding newer DBMSs such as TiDB or CockroachDB, and integrating machine‑learning‑driven automatic tuning with the manual, multi‑layer approach presented here.
Comments & Academic Discussion
Loading comments...
Leave a Comment