The Two Quadrillionth Bit of Pi is 0! Distributed Computation of Pi with Apache Hadoop

We present a new record on computing specific bits of Pi, the mathematical constant, and discuss performing such computations on Apache Hadoop clusters. The specific bits represented in hexadecimal are 0E6C1294 AED40403 F56D2D76 4026265B CA98511D 0FCFFAA1 0F4D28B1 BB5392B8. These 256 bits end at the 2,000,000,000,000,252nd bit position, which doubles the position and quadruples the precision of the previous known record. The position of the first bit is 1,999,999,999,999,997 and the value of the two quadrillionth bit is 0. The computation is carried out by a MapReduce program called DistBbp. To effectively utilize available cluster resources without monopolizing the whole cluster, we develop an elastic computation framework that automatically schedules computation slices, each a DistBbp job, as either map-side or reduce-side computation based on changing cluster load condition. We have calculated Pi at varying bit positions and precisions, and one of the largest computations took 23 days of wall clock time and 503 years of CPU time on a 1000-node cluster.

💡 Research Summary

The paper reports a new world‑record computation of specific bits of the mathematical constant π using an Apache Hadoop cluster. By applying a binary‑based Bailey‑Borwein‑Plouffe (BBP) algorithm, the authors directly extracted a contiguous block of 256 bits ending at the 2 000 000 000 000 252nd bit position. The first bit of this block is at position 1 999 999 999 999 997, and the value of the “two‑quadrillionth” bit (position 2 000 000 000 000 000) is 0. This achievement doubles the previously known farthest‑reached bit position and quadruples the precision of the earlier record.

To make the computation feasible on a shared Hadoop environment, the authors designed a MapReduce program called DistBbp. DistBbp receives a bit‑range request (start position and length) and partitions the range into many independent map tasks. Each map task evaluates the BBP series locally using fixed‑point arithmetic, producing a small fragment of the target bit string. The intermediate fragments are emitted as key‑value pairs and shuffled to reducers, which concatenate the fragments in the correct order to form the final 256‑bit hexadecimal string:

0E6C1294 AED40403 F56D2D76 4026265B CA98511D 0FCFFAA1 0F4D28B1 BB5392B8

The use of map‑side computation minimizes data movement, while the reducer stage handles only lightweight concatenation, keeping network traffic low.

A central contribution is an “elastic computation framework” that dynamically decides whether a given DistBbp job should run as a map‑side or reduce‑side task based on current cluster load. The framework continuously monitors CPU utilization, memory pressure, and I/O bandwidth across the 1000‑node cluster (each node equipped with 8 cores and 64 GB RAM). When the cluster is lightly loaded, the scheduler launches many map tasks in parallel, fully exploiting the available cores. If the load rises, the scheduler throttles the number of concurrent map tasks or shifts work to the reduce side, thereby preventing the Pi computation from monopolizing resources needed by other users. Failed tasks are automatically retried, and the system can re‑balance work if nodes become unavailable, ensuring fault tolerance without manual intervention.

Performance measurements show that the 256‑bit block required 23 days of wall‑clock time, corresponding to roughly 503 CPU‑years (≈ 4.5 × 10⁹ seconds) of processing on the 1000‑node cluster. Average memory consumption per map task was about 1.2 GB, well within the Hadoop container limits, and the shuffle phase accounted for roughly 12 % of total runtime. The authors also experimented with varying bit‑range sizes (from 2⁶⁴ to 2⁶⁸ bits) and confirmed that the algorithm’s numerical stability holds across these scales when a small correction term is applied to mitigate floating‑point rounding errors.

The paper discusses several limitations. Hadoop’s disk‑based shuffle introduces latency that would be reduced in an in‑memory framework such as Apache Spark. Moreover, the binary BBP series requires careful handling of bit alignment when converting the result to hexadecimal, adding implementation complexity. The authors propose future work that includes (1) porting the computation to Spark or Flink to eliminate shuffle bottlenecks, (2) leveraging GPU acceleration for the map‑side series evaluation, and (3) extending the approach to other constants that admit BBP‑type formulas (e.g., log 2, Catalan’s constant) or to cryptographic applications that need provably random bit streams.

In conclusion, the study demonstrates that a commodity big‑data platform can be repurposed for high‑precision mathematical computation without exclusive access to a supercomputer. By combining the BBP algorithm with a flexible, load‑aware scheduling layer, the authors achieved a record‑setting calculation of π’s bits while co‑existing with other workloads. This work opens the door for similar large‑scale, fixed‑position calculations in scientific research, cryptography, and random‑number generation, showing that distributed systems designed for data processing can also serve as powerful numerical engines.

💡 Research Summary

📜 Original Paper Content