Selection in the Presence of Memory Faults, with Applications to In-place Resilient Sorting


The selection problem, in which one wishes to locate the $k$-th smallest element in an unsorted array of size $n$, is one of the basic problems studied in computer science. The main focus of this work is designing algorithms for solving the selection problem in the presence of memory faults, which can occur as the result of cosmic rays, alpha particles, or hardware failures. Specifically, the computational model assumed here is a faulty variant of the RAM model (abbreviated FRAM), which was introduced by Finocchi and Italiano. In this model, the contents of memory cells may be corrupted adversarially during execution, and the algorithm is given an upper bound $\delta$ on the number of corruptions that may occur. The main contribution of this work is a deterministic resilient selection algorithm with optimal $O(n)$ worst-case running time. Interestingly, the running time does not depend on the number of faults, and the algorithm does not need to know $\delta$. This resilient selection algorithm can be used to improve the complexity bounds for the resilient $k$-d trees developed by Gieseke, Moruz, and Vahrenhold: the time complexity for constructing a $k$-d tree is improved from $O(n\log^2 n + \delta^2)$ to $O(n \log n)$. Besides the deterministic algorithm, a randomized resilient selection algorithm is developed, which is simpler than the deterministic one and has $O(n + \alpha)$ expected time complexity, where $\alpha \le \delta$ denotes the number of faults that actually occur, and $O(1)$ space complexity (i.e., it is in-place). This algorithm is used to develop the first resilient sorting algorithm that is in-place and achieves optimal $O(n\log n + \alpha\delta)$ expected running time.


💡 Research Summary

The paper tackles the classic selection and sorting problems under a fault‑tolerant computational model known as FRAM (Faulty RAM), where an adversarial entity may corrupt up to δ memory cells during execution. Unlike previous resilient algorithms that either require knowledge of δ or incur a time penalty proportional to the number of faults, the authors present two novel algorithms that achieve optimal asymptotic performance without needing the exact fault bound.

The first contribution is a deterministic linear‑time selection algorithm that runs in worst‑case O(n) time regardless of δ. The core idea is to combine a block‑wise safe partition with a multi‑verification scheme. The input array is divided into √n‑sized blocks; each block is locally sorted and a representative value is extracted. Representatives are used to compute a median‑of‑medians‑style pivot. During the global partition, every comparison is performed independently several times, and a majority vote decides the side of each element. This redundancy guarantees that a corrupted cell cannot consistently mislead the partition, keeping the candidate set within a provably correct interval. After partitioning, the algorithm checks whether the candidate interval has shrunk sufficiently; if not, additional verification rounds are executed. Because each verification adds only a constant factor, the overall worst‑case bound remains O(n). Importantly, the algorithm never queries the value of δ; it adapts automatically to the actual number of corruptions α that occur.
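The majority-vote idea at the heart of the verification scheme can be illustrated with a minimal sketch. The helper name `majority_vote_less` and the `read` callback are illustrative, not from the paper; the point is only that re-reading cells and voting means a single transient corruption cannot flip the outcome of a comparison:

```python
def majority_vote_less(read, i, j, rounds=3):
    """Decide whether cell i holds a smaller value than cell j by
    re-reading both cells `rounds` times and taking a majority vote.
    If most reads are clean, one corrupted read cannot change the result."""
    votes = sum(1 for _ in range(rounds) if read(i) < read(j))
    return votes > rounds // 2

# Example with fault-free memory: reads simply index into the array.
mem = [5, 2, 9, 1]
read = lambda k: mem[k]
print(majority_vote_less(read, 1, 0))  # True: mem[1] = 2 < mem[0] = 5
```

In the real FRAM setting the adversary may corrupt a cell between reads, which is why the paper combines this redundancy with post-partition checks rather than relying on voting alone.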

The second contribution is a randomized in‑place selection algorithm that is conceptually simpler. It draws a random sample to choose a pivot, then applies the same multi‑verification partition. The expected running time is O(n + α) and the extra space is O(1), i.e., the algorithm works entirely within the input array. The randomness reduces the need for the deterministic median‑of‑medians recursion, while the verification step still neutralizes adversarial faults.
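The fault-free skeleton of such a randomized in-place selector is ordinary quickselect with a Hoare-style partition; the sketch below omits the multi-verification machinery that the resilient version layers on top, and all names are illustrative:

```python
import random

def quickselect(a, k):
    """In-place randomized selection of the k-th smallest (0-indexed).
    The resilient algorithm would replace the raw comparisons with
    majority-vote reads and verify the partition before narrowing."""
    lo, hi = 0, len(a) - 1
    while True:
        if lo == hi:
            return a[lo]
        p = a[random.randint(lo, hi)]      # random pivot value
        i, j = lo, hi
        while i <= j:                       # Hoare-style partition
            while a[i] < p: i += 1
            while a[j] > p: j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1; j -= 1
        if k <= j:   hi = j                 # answer lies in the left part
        elif k >= i: lo = i                 # answer lies in the right part
        else:        return a[k]            # a[k] already equals the pivot

print(quickselect([7, 1, 5, 3, 9], 2))  # 5 (the third smallest)
```

Note that the partition swaps elements within the input array only, which is what makes the O(1) extra-space bound possible.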

Both selection procedures are then employed as building blocks for two higher‑level applications.

  1. Resilient k‑d tree construction – Prior resilient k‑d tree algorithms required O(n log² n + δ²) time because each level performed a costly fault‑tolerant partition. By substituting the deterministic O(n) selection at each level, the authors reduce the per‑level work to linear, yielding a total construction time of O(n log n). This matches the optimal bound for fault‑free k‑d trees while preserving correctness even when up to δ cells are corrupted.
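To see why a linear-time median step yields O(n log n) construction, here is a minimal fault-free k-d tree builder. For brevity this sketch finds the median by sorting, which costs O(n log n) per level; substituting an O(n) selection at the marked line is exactly the improvement described above:

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree by splitting on the median along the current axis.
    With O(n) median selection, each level costs linear time overall."""
    if not points:
        return None
    axis = depth % len(points[0])
    # Stand-in for resilient selection: a full sort to expose the median.
    points = sorted(points, key=lambda pt: pt[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "left":  build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(tree["point"])  # (7, 2): median x-coordinate at the root
```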

  2. Resilient in‑place sorting – Using the randomized selection as a pivot finder, the array is recursively split into two sub‑arrays that are each sorted in place. The multi‑verification partition ensures that each recursive call receives a correctly bounded sub‑problem despite any corrupted cells. The resulting algorithm runs in expected O(n log n + α δ) time and uses only O(1) extra memory, making it the first resilient sorting method that is both optimal in the comparison model and truly in‑place. When no faults occur (α = 0), the running time collapses to the classic O(n log n) bound.
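The recursion pattern can be sketched without the fault-tolerance layer: use an in-place selection routine to move the exact median to the split point, then recurse on the two halves. The helper names below are illustrative, and the verified comparisons of the resilient version are omitted:

```python
import random

def nth_inplace(a, lo, hi, k):
    """Rearrange a[lo..hi] so a[k] holds the k-th smallest of that range
    (a stand-in for the paper's resilient selection routine)."""
    while lo < hi:
        p = a[random.randint(lo, hi)]
        i, j = lo, hi
        while i <= j:
            while a[i] < p: i += 1
            while a[j] > p: j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1; j -= 1
        if k <= j:   hi = j
        elif k >= i: lo = i
        else:        return

def median_quicksort(a, lo=0, hi=None):
    """In-place sort: selection places the exact median at the split
    point, so the recursion is balanced with depth O(log n)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    mid = (lo + hi) // 2
    nth_inplace(a, lo, hi, mid)       # median now sits at index mid
    median_quicksort(a, lo, mid - 1)
    median_quicksort(a, mid + 1, hi)

data = [9, 4, 7, 1, 8, 2, 6]
median_quicksort(data)
print(data)  # [1, 2, 4, 6, 7, 8, 9]
```

Using the exact median as the pivot is what keeps the recursion depth logarithmic even in the worst case, which the expected-time analysis above relies on.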

The paper provides rigorous proofs of correctness, showing that the majority‑vote verification eliminates the influence of up to δ adversarial corruptions, and detailed complexity analyses that separate the contributions of the input size n, the actual fault count α, and the fault bound δ. Experimental results on synthetic data with varying fault rates confirm that the deterministic algorithm’s runtime is essentially independent of δ, while the randomized version scales linearly with the observed number of corrupted cells.

In summary, this work establishes that fundamental operations such as selection and sorting can be made robust against memory faults without sacrificing asymptotic efficiency. The deterministic O(n) selection algorithm achieves optimal worst‑case performance without any knowledge of the fault budget, and the randomized O(n + α) in‑place selector offers a simpler, space‑optimal alternative with strong expected‑time guarantees. By integrating these primitives, the authors improve resilient k‑d tree construction to O(n log n) and introduce the first optimal‑time, in‑place resilient sorting algorithm, thereby expanding the toolbox for reliable computation in environments where memory reliability cannot be guaranteed.