Arıkan meets Shannon: Polar codes with near-optimal convergence to channel capacity
Let $W$ be a binary-input memoryless symmetric (BMS) channel with Shannon capacity $I(W)$ and fix any $\alpha > 0$. We construct, for any sufficiently small $\delta > 0$, binary linear codes of block length $O(1/\delta^{2+\alpha})$ and rate $I(W)-\delta$ that enable reliable communication on $W$ with quasi-linear time encoding and decoding. Shannon’s noisy coding theorem established the \emph{existence} of such codes (without efficient constructions or decoding) with block length $O(1/\delta^2)$. This quadratic dependence on the gap $\delta$ to capacity is known to be best possible. Our result thus yields a constructive version of Shannon’s theorem with near-optimal convergence to capacity as a function of the block length. This resolves a central theoretical challenge associated with the attainment of Shannon capacity. Previously such a result was only known for the erasure channel. Our codes are a variant of Arıkan’s polar codes based on multiple carefully constructed local kernels, one for each intermediate channel that arises in the decoding. A crucial ingredient in the analysis is a strong converse of the noisy coding theorem when communicating using random linear codes on arbitrary BMS channels. Our converse theorem shows extreme unpredictability of even a single message bit for random coding at rates slightly above capacity.
💡 Research Summary
The paper “Arıkan meets Shannon: Polar codes with near-optimal convergence to channel capacity” addresses a long-standing gap between Shannon’s existential noisy-coding theorem and constructive, efficiently decodable codes. Shannon’s theorem guarantees the existence of codes of block length $O(1/\delta^{2})$ achieving rate $I(W)-\delta$ on any binary-input memoryless symmetric (BMS) channel $W$, but provides no explicit construction or low-complexity decoding. Existing polar codes, based on the original 2×2 kernel, achieve capacity with a finite scaling exponent $\mu$ (the exponent in the relation $N=O(1/\delta^{\mu})$), but the best known upper bound is $\mu\le 4.714$, far from the optimal $\mu=2$ that random linear codes attain.
The authors close this gap by introducing a family of polar codes built from large $\ell\times\ell$ binary mixing matrices (kernels). For any desired small constant $\alpha>0$, they select a sufficiently large power-of-two $\ell$ (roughly $\exp(\Omega(\alpha^{-1.01}))$) and construct the code by recursively applying the kernel’s Kronecker power, yielding a depth-$t$ $\ell$-ary tree of synthetic channels. Each synthetic “bit-channel” $W_i$ inherits an entropy that satisfies the conservation identity $\ell\cdot H(W)=\sum_i H(W_i)$. By carefully designing a local kernel for every intermediate channel, they ensure that the entropy distribution sharpens dramatically at each recursion level, a phenomenon they formalize via a recursive potential-function analysis. This analysis shows that the fraction of “non-polarized” channels shrinks sub-exponentially in the tree depth, leading to a block length of $O(1/\delta^{2+\alpha})$, as claimed in the abstract.
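The polarization phenomenon and the conservation identity above can be illustrated concretely on the binary erasure channel (BEC), where everything is computable in closed form. The sketch below uses the classic 2×2 Arıkan kernel (not the paper’s large multi-kernel construction): a BEC with erasure probability $e$, whose conditional entropy is exactly $e$, splits into BEC($2e-e^2$) and BEC($e^2$), so entropy is conserved at every split while the channels drift toward the extremes 0 and 1.

```python
# Illustration of channel polarization on the BEC with the classic
# 2x2 Arikan kernel. This is NOT the paper's construction (which uses
# large, per-channel kernels); it only demonstrates the conservation
# identity and the polarization effect described in the summary.

def polarize(eps, depth):
    """Return erasure probabilities of all 2**depth synthetic channels."""
    channels = [eps]
    for _ in range(depth):
        nxt = []
        for e in channels:
            nxt.append(2 * e - e * e)  # "minus" channel W^- (worse)
            nxt.append(e * e)          # "plus" channel W^+ (better)
        channels = nxt
    return channels

eps, depth = 0.5, 10
chs = polarize(eps, depth)

# Conservation: for BEC, H(BEC(e)) = e, so the entropies of the
# 2**depth synthetic channels must sum to 2**depth * H(W).
total = sum(chs)
print(f"sum of entropies = {total}, expected = {2**depth * eps}")

# Polarization: some channels are already near-perfect (e ~ 0)
# and some near-useless (e ~ 1); the middle fraction shrinks with depth.
unpolarized = sum(1 for e in chs if 1e-3 < e < 1 - 1e-3)
print(f"fraction unpolarized: {unpolarized / len(chs):.3f}")
```

The rate at which the unpolarized fraction decays with depth is precisely what the scaling exponent $\mu$ measures; the paper’s multi-kernel design drives that decay fast enough to reach block length $O(1/\delta^{2+\alpha})$.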