Efficient and Universal Corruption Resilient Fountain Codes
In this paper, we present a new family of fountain codes which overcome adversarial errors. That is, we consider the possibility that some portion of the arriving packets of a rateless erasure code are corrupted in an undetectable fashion. In practice, the corrupted packets may be attributed to a portion of the communication paths being controlled by an adversary, or to a portion of the sources being malicious. The presented codes resemble and extend LT and Raptor codes. Yet, their benefits over existing coding schemes are manifold. First, to overcome the corrupted packets, our codes use information-theoretic techniques, rather than cryptographic primitives. Thus, no secret channel between the senders and the receivers is required. Second, the encoders in the suggested scheme are oblivious to the strength of the adversary, yet perform as if its strength were known in advance. Third, the sparse structure of the codes facilitates efficient decoding. Finally, the codes easily fit a decentralized scenario with several sources, when no communication between the sources is allowed. We present both exhaustive as well as efficient decoding rules. Beyond their obvious use as rateless codes, our codes have important applications in distributed computing.
💡 Research Summary
The paper introduces a new family of fountain codes that are resilient to adversarial corruption, extending the well‑known LT and Raptor codes. Traditional rateless erasure codes assume that lost packets are simply erased; they do not address the scenario where an attacker can inject maliciously altered packets that are indistinguishable from legitimate ones. The authors propose an information‑theoretic approach that requires no secret keys, shared authentication channels, or prior knowledge of the attacker’s strength.
The encoding process works by generating each transmitted packet as a random sparse linear combination of the original k source blocks. The degree distribution (e.g., Poisson, uniform, or custom) is fixed and known to both sender and receiver, but the specific set of blocks and the coefficients used for each packet are chosen independently at random. This randomness creates a high‑dimensional linear system at the receiver side. Even if a fraction τ of the received packets are arbitrarily corrupted, the overall matrix retains full rank with high probability as long as τ stays below a certain threshold (empirically around 0.3). The key insight is that the sparsity of the matrix allows the receiver to detect and discard inconsistent rows using simple rank‑checking operations, without any cryptographic hash or MAC.
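The encoding step described above, generating each packet as a sparse random XOR (GF(2) linear combination) of source blocks drawn according to a degree distribution, can be sketched as follows. This is an illustrative LT-style encoder, not the paper's exact construction; the function names and the toy degree distribution are assumptions for the example.

```python
import random

def encode_packet(source_blocks, degree_dist, rng=random):
    """Generate one fountain-coded packet as a sparse XOR (GF(2) linear
    combination) of randomly chosen source blocks.

    source_blocks: list of equal-length byte strings (the k blocks).
    degree_dist:   callable returning a degree d >= 1.
    Returns (neighbor_indices, payload).
    """
    k = len(source_blocks)
    d = min(degree_dist(), k)
    neighbors = rng.sample(range(k), d)      # which blocks are combined
    payload = bytearray(source_blocks[neighbors[0]])
    for idx in neighbors[1:]:
        for i, b in enumerate(source_blocks[idx]):
            payload[i] ^= b                  # XOR = addition over GF(2)
    return sorted(neighbors), bytes(payload)

# A toy degree distribution biased toward low degrees, as in LT-style
# codes (illustrative only -- not the distribution analyzed in the paper).
def toy_degree():
    return random.choice([1, 1, 2, 2, 2, 3, 4])
```

The neighbor indices and coefficients are chosen independently per packet, so the receiver accumulates a random sparse linear system, exactly the structure the decoding analysis relies on.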
Two decoding strategies are presented. The first, an exhaustive search, enumerates all possible source vectors and selects the one that best fits the received equations. Although exponential in the worst case, this method is viable when τ is extremely low or when computational resources are abundant. The second, a practical sparse‑Gaussian elimination algorithm, exploits the matrix’s sparsity to achieve O(k log k) time complexity. The algorithm iteratively removes linearly independent rows, monitors rank drops, and isolates rows that are likely corrupted. Once a full‑rank submatrix is identified, standard Gaussian elimination recovers the original blocks.
A notable contribution is the “oblivious encoder” property: the encoder does not need to know τ in advance and still operates as if it did. This makes the scheme suitable for decentralized environments where multiple independent sources generate packets without any coordination. Each source follows the same degree distribution and random seed policy; the receiver simply aggregates all incoming packets into a single linear system and applies the same decoding routine. The authors prove that, under mild assumptions, the probability of successful recovery remains high regardless of how the sources are distributed.
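The decentralized setting can be illustrated with a small sketch: two uncoordinated encoders share only the degree distribution, and the receiver concatenates their packets into one pool. The helper names and the toy distribution are assumptions for the example, not the paper's protocol.

```python
import random

def make_source(blocks, seed):
    """One independent encoder: same degree policy, private RNG.
    Packets are (mask, value) pairs over GF(2), with int-valued blocks."""
    rng = random.Random(seed)
    k = len(blocks)
    def next_packet():
        d = rng.choice([1, 2, 2, 3])        # shared degree distribution
        idxs = rng.sample(range(k), min(d, k))
        mask, value = 0, 0
        for i in idxs:
            mask |= 1 << i
            value ^= blocks[i]
        return mask, value
    return next_packet

blocks = [5, 9, 12, 3]
src_a = make_source(blocks, seed=1)         # two uncoordinated sources
src_b = make_source(blocks, seed=2)
pool = [src_a() for _ in range(6)] + [src_b() for _ in range(6)]
# The receiver treats `pool` as a single linear system; the origin of
# each packet is irrelevant to decoding.
```

Because every honest packet is a valid equation over the same unknowns, aggregation is just concatenation, which is what makes the scheme coordination-free.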
The paper also provides a rigorous probabilistic analysis using Chernoff bounds and random graph theory to bound the failure probability as a function of τ, k, and the chosen degree distribution. Simulations confirm that for τ up to 0.3 the success rate exceeds 99.9 % across a wide range of parameters.
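To illustrate the style of analysis (though not the paper's exact expressions), a standard Chernoff-KL bound on the chance that the corrupted fraction of n independently corrupted packets overshoots τ by ε can be computed as follows; the function name is an assumption for the example.

```python
import math

def chernoff_tail(n, tau, eps):
    """Standard Chernoff-KL upper bound on P[corrupted fraction >= tau + eps]
    when each of n received packets is independently corrupted with
    probability tau. Illustrative of the analysis style only; not the
    paper's exact bound.
    """
    a = tau + eps
    # KL divergence between Bernoulli(a) and Bernoulli(tau)
    kl = a * math.log(a / tau) + (1 - a) * math.log((1 - a) / (1 - tau))
    return math.exp(-n * kl)
```

The bound decays exponentially in n, e.g. `chernoff_tail(1000, 0.3, 0.05)` is already well below 1%, which matches the flavor of the reported failure-probability guarantees.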
Beyond the core coding theory, the authors discuss two concrete applications. In large‑scale sensor networks, a subset of compromised sensors may inject falsified measurements; the proposed codes enable the central aggregator to reconstruct the true data set without needing per‑sensor authentication. In blockchain or distributed ledger contexts, malicious nodes may broadcast altered blocks; the resilient fountain code allows honest participants to recover the correct block history from the majority of honest transmissions, reducing reliance on heavyweight digital signatures.
Finally, the paper outlines future research directions: extending the resilience to higher corruption rates (> 0.5), integrating low‑latency decoding for real‑time streaming, and exploring hardware acceleration (e.g., FPGA or GPU implementations) to further reduce decoding time. Overall, the work bridges the gap between rateless erasure coding and adversarial error correction, offering a practical, universal solution for secure and efficient data dissemination in untrusted or partially compromised networks.