On Energy Efficiency and Performance Evaluation of SBC based Clusters: A Hadoop case study


Energy efficiency in a data center is a challenge that has garnered researchers' interest. In this paper we address the energy efficiency of a small-scale data center by utilizing Single Board Computer (SBC) based clusters. A compact design layout is presented for building two clusters of 20 nodes each. Extensive testing was carried out to analyze the performance of these clusters using popular benchmarks for task execution time, memory/storage utilization, network throughput, and energy consumption. Further, we investigate the cost of operating SBC-based clusters by correlating energy utilization with the execution time of various benchmarks using workloads of different sizes. Results show that, although the low cost of a cluster built with ARM-based SBCs is desirable, these clusters yield comparatively low performance and energy efficiency due to limited onboard capabilities. It is possible to tweak Hadoop configuration parameters for an ARM-based SBC cluster so that it utilizes resources efficiently. We present a discussion on the effectiveness of SBC-based clusters as a testbed for inexpensive and green cloud computing research.

💡 Research Summary

The paper investigates whether clusters built from low‑cost, low‑power single‑board computers (SBCs) can serve as energy‑efficient alternatives to conventional server farms for Hadoop workloads. Two 20‑node clusters were assembled: one using Raspberry Pi 4 Model B (ARM Cortex‑A72, 4 GB RAM) and the other using Odroid XU4 (ARM Cortex‑A15, 2 GB RAM). Both clusters were wired through a gigabit Ethernet switch, powered by standard adapters, and ran Ubuntu 20.04 with Hadoop 3.2.2. The authors measured execution time, CPU utilization, memory and storage usage, network throughput, and power consumption for a set of representative Hadoop benchmarks—TeraSort, WordCount, Sort, and the HDFS I/O test TestDFSIO—using data sizes of 10 GB, 20 GB, and 40 GB. Power was sampled at one‑second intervals with an inline power meter, allowing the calculation of average power draw and total energy per job.
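The step from one-second power samples to average power and per-job energy can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `energy_from_samples` and the sample readings are hypothetical, and it assumes evenly spaced samples summed with a simple rectangle rule.

```python
from typing import Sequence, Tuple


def energy_from_samples(power_w: Sequence[float],
                        interval_s: float = 1.0) -> Tuple[float, float]:
    """Return (average power in W, total energy in Wh) for evenly spaced samples."""
    if not power_w:
        raise ValueError("need at least one power sample")
    avg_w = sum(power_w) / len(power_w)
    # Rectangle rule: each sample is taken to hold for one full interval.
    energy_j = sum(power_w) * interval_s
    return avg_w, energy_j / 3600.0  # 1 Wh = 3600 J


# Hypothetical meter readings (watts) for a four-second job.
samples = [44.0, 46.0, 45.0, 45.0]
avg_w, energy_wh = energy_from_samples(samples)
print(f"average power: {avg_w:.1f} W, energy: {energy_wh:.4f} Wh")
```

Dividing by 3600 converts joules (watt-seconds) to watt-hours, the unit that electricity pricing per kWh builds on.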

Results show that, despite the attractive hardware cost (≈ USD 55 per Raspberry Pi node and USD 85 per Odroid node) and lower instantaneous power (≈ 45 W for the Raspberry cluster and 52 W for the Odroid cluster, roughly 30‑40 % less than a comparable x86 server), the SBC clusters suffer from markedly higher execution times. On average, the Raspberry cluster took 3.8 × longer and the Odroid cluster 3.2 × longer than a traditional server for the same workloads. The performance penalty is especially severe for I/O‑intensive jobs such as Sort and TestDFSIO, where the runtime can be more than five times longer. Because energy consumption is the product of power and time, the total energy per job for the SBC clusters is 1.6‑2.1 × higher than that of a conventional server, negating much of the power‑saving advantage.
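The "power times time" argument can be written out as a short worked relation. The symbols $r$ (power ratio) and $s$ (slowdown) are introduced here for illustration and do not appear in the source. Reading "30‑40 % less" power as $r \in [0.6, 0.7]$ is likewise an assumption.

```latex
\frac{E_{\mathrm{SBC}}}{E_{\mathrm{srv}}}
  = \frac{\bar{P}_{\mathrm{SBC}}\, t_{\mathrm{SBC}}}{\bar{P}_{\mathrm{srv}}\, t_{\mathrm{srv}}}
  = r \cdot s,
\qquad
r = \frac{\bar{P}_{\mathrm{SBC}}}{\bar{P}_{\mathrm{srv}}},\quad
s = \frac{t_{\mathrm{SBC}}}{t_{\mathrm{srv}}}.

% Energy parity requires r * s = 1, i.e. a break-even slowdown s* = 1/r:
s^{*} = \frac{1}{r} \approx 1.43 \text{ to } 1.67
\quad \text{for } r \in [0.6,\, 0.7].
```

Under these assumptions, the SBC cluster uses less energy per job only when its slowdown stays below roughly 1.4 to 1.7. The reported average slowdowns of 3.2 and 3.8 are well above that threshold, which is consistent with the higher per-job energy. Because the reported figures are averages over different jobs, this is a rough plausibility check rather than a reconstruction of the measured 1.6‑2.1 × range.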

A key finding is that Hadoop’s default configuration is poorly matched to the limited memory and cache hierarchy of SBCs. The default HDFS block size (128 MB) caused excessive memory pressure and swapping, while the default map/reduce memory settings allocated more RAM than each node could comfortably provide. By tuning parameters—reducing the block size to 64 MB, capping map memory to 512 MB and reduce memory to 768 MB, and limiting the number of concurrent map/reduce tasks per node to two—the authors achieved a 12‑18 % reduction in runtime and a modest 5‑7 % drop in power draw. Nevertheless, even after tuning, the SBC clusters remain significantly slower and less energy‑efficient than server‑grade hardware.
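One way these tuned values might be expressed is sketched below, as a small script that renders Hadoop-style `<property>` entries. The property names `dfs.blocksize`, `mapreduce.map.memory.mb`, and `mapreduce.reduce.memory.mb` are standard Hadoop settings, but the summary does not say which properties the authors actually changed, so the mapping is an assumption. The per-node task limit is left out because the source does not state how it was enforced. The helper `render_site_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET

MB = 1024 * 1024

# Assumed mapping of the tuned values onto standard Hadoop property names.
tuned = {
    "hdfs-site.xml": {
        "dfs.blocksize": str(64 * MB),        # 64 MB block size, in bytes
    },
    "mapred-site.xml": {
        "mapreduce.map.memory.mb": "512",     # map container memory cap
        "mapreduce.reduce.memory.mb": "768",  # reduce container memory cap
    },
}


def render_site_xml(props):
    """Render a dict of properties as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


for filename, props in tuned.items():
    print(filename)
    print(render_site_xml(props))
```

Keeping the values in one dictionary makes it straightforward to compare a tuned run against the defaults by swapping in a second set of values.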

Cost analysis demonstrates that the initial capital expenditure for an SBC cluster is roughly 70 % lower than that of a comparable x86 cluster. However, when electricity costs are projected over a year (assuming $0.12 per kWh), the longer runtimes cause the operational cost of the SBC clusters to exceed that of the server cluster by about 30 %. Consequently, the total cost of ownership advantage is eroded unless the workload is lightweight or the clusters are used only intermittently.
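The electricity projection reduces to simple arithmetic, sketched below. The function name `annual_electricity_cost` is hypothetical, and the continuous-operation default of 8760 hours per year is an assumption; the summary gives only the $0.12 per kWh rate and the approximate cluster power figures.

```python
def annual_electricity_cost(avg_power_w: float,
                            hours_per_year: float = 8760.0,
                            price_per_kwh: float = 0.12) -> float:
    """Electricity cost in dollars for a load drawing avg_power_w over the given hours."""
    energy_kwh = avg_power_w / 1000.0 * hours_per_year
    return energy_kwh * price_per_kwh


# Example: the roughly 45 W Raspberry Pi cluster, assumed to run all year.
cost = annual_electricity_cost(45.0)
print(f"annual electricity cost: ${cost:.2f}")
```

For a fixed batch of jobs rather than continuous operation, `hours_per_year` would be the hours the cluster spends busy. A slower cluster needs more of those hours for the same work, which is how longer runtimes raise the operating cost despite the lower power draw.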

The authors argue that SBC clusters are still valuable as inexpensive testbeds for cloud‑computing research, education, and prototyping. They point out that newer ARM “Neoverse” SBCs, which offer higher core counts, better memory bandwidth, and more efficient power delivery, could narrow the performance gap. Additionally, integrating low‑power SSD or NVMe storage, employing high‑efficiency power supplies, and adopting container‑oriented lightweight orchestration (e.g., K3s, MicroK8s) could further improve both performance and energy metrics.

In conclusion, the study confirms that while SBC‑based clusters provide a low‑cost, low‑power platform, their current hardware limitations lead to inferior performance and overall energy efficiency for typical Hadoop workloads. Fine‑tuning Hadoop parameters can mitigate some inefficiencies, but cannot overcome the fundamental architectural constraints of low‑end ARM SBCs. Future work should explore next‑generation ARM SBCs, high‑speed networking, and more sophisticated energy‑aware scheduling to determine whether truly “green” and cost‑effective cloud clusters can be built from commodity single‑board computers.
