Information Theoretic Bounds for Low-Rank Matrix Completion

This paper studies the low-rank matrix completion problem from an information theoretic perspective. The completion problem is rephrased as a communication problem of an (uncoded) low-rank matrix source over an erasure channel. The paper then uses achievability and converse arguments to present order-wise optimal bounds for the completion problem.


💡 Research Summary

The paper reinterprets the low‑rank matrix completion problem through the lens of information theory, casting it as an uncoded source transmission over an erasure channel. A rank‑r matrix M∈ℝ^{n×n} is treated as the source, while each entry is observed independently with probability p, which is equivalent to an erasure channel that deletes entries with probability 1‑p. This abstraction allows the authors to apply classic achievability and converse techniques from channel coding to derive fundamental limits on the number of observed entries required for reliable recovery.
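The source/channel model described above can be simulated in a few lines. The dimensions, rank, and observation probability below are illustrative choices, not values from the paper; the Gaussian factor model is one common way to generate an incoherent rank-r source:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, p = 100, 5, 0.3  # illustrative size, rank, and observation probability

# Rank-r source: M = A B^T with Gaussian factors (an assumed incoherent model).
A = rng.standard_normal((n, r))
B = rng.standard_normal((n, r))
M = A @ B.T

# Erasure channel: each entry survives independently with probability p;
# erased entries are marked NaN.
mask = rng.random((n, n)) < p
Y = np.where(mask, M, np.nan)

print(f"rank(M) = {np.linalg.matrix_rank(M)}, observed fraction = {mask.mean():.3f}")
```

The observed index set Ω corresponds to `mask`, and its size |Ω| concentrates sharply around pn².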

For the achievability part, the authors consider random uniform sampling of entries (the “measurement matrix” Φ) and show that with high probability Φ satisfies a Restricted Isometry Property (RIP) for rank‑r matrices when the number of samples |Ω| = p n² exceeds a constant times r n log n. Under this condition, nuclear‑norm minimization (the convex surrogate for rank) exactly recovers M with probability 1‑o(1). The proof proceeds by first establishing that random sampling yields RIP with the stated sample complexity, then invoking known results that RIP guarantees exact recovery via convex optimization. The constant depends on the incoherence parameter μ, which quantifies how spread out the singular vectors are relative to the canonical basis; smaller μ yields a smaller constant.
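The exact nuclear-norm program requires a convex solver; as a lightweight runnable stand-in, the singular value thresholding (SVT) iteration of Cai, Candès, and Shen approximates its solution. The threshold τ and step size below follow common heuristics from the SVT literature and are not constants from this paper:

```python
import numpy as np

def svt_complete(Y, mask, tau=None, step=1.2, iters=500):
    """Singular value thresholding for matrix completion (Cai-Candes-Shen).

    A proximal sketch standing in for the exact nuclear-norm program
    analyzed in the paper; tau and step are heuristic choices.
    """
    n1, n2 = Y.shape
    if tau is None:
        tau = 5 * np.sqrt(n1 * n2)        # heuristic threshold
    delta = step / mask.mean()            # step size scaled by sampling rate
    Z = np.zeros_like(Y)                  # dual variable on observed entries
    X = np.zeros_like(Y)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt   # shrink singular values
        Z = Z + delta * mask * (Y - X)            # gradient step on residual
    return X

# Illustrative run on a small rank-2 instance sampled well above threshold.
rng = np.random.default_rng(1)
n, r, p = 60, 2, 0.5
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < p
X_hat = svt_complete(np.where(mask, M, 0.0), mask)
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
print(f"relative error: {rel_err:.2e}")
```

With the sampling rate comfortably above the Θ((r log n)/n) threshold, the recovered matrix should agree with M up to a small relative error.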

The converse argument uses Fano’s inequality to prove that if the sampling rate falls below a comparable threshold, no algorithm, regardless of computational power, can recover M with probability greater than ½. The authors bound the mutual information I(M;Y) between the matrix and the observed entries Y by the entropy of Y, which is at most p n² times the number of bits needed to describe one (suitably quantized) entry. By constructing a packing set of low‑rank matrices of size 2^{Ω(r n log n)} and applying Fano’s bound, they show that the error probability remains bounded away from zero unless p n² = Ω(r n log n). Consequently, the optimal sample complexity is Θ(r n log n) observed entries, or equivalently p = Θ((r log n)/n). This matches the achievability bound up to constant factors, establishing order‑wise optimality.
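Schematically, the converse chains Fano’s inequality with the entropy bound on the observations. Here c denotes the packing-set constant and b the per-entry description length, both placeholders standing in for the paper’s explicit constants:

```latex
% M drawn uniformly from a packing set \mathcal{X} of rank-r matrices,
% with \log_2 |\mathcal{X}| \ge c\, r n \log n.
\begin{aligned}
P_e \;&\ge\; 1 - \frac{I(M;Y) + 1}{\log_2 |\mathcal{X}|}
      && \text{(Fano's inequality)} \\
I(M;Y) \;&\le\; H(Y) \;\le\; p n^2\, b
      && \text{($b$ bits per quantized observed entry)} \\
\Longrightarrow\quad P_e \;&\ge\; 1 - \frac{p n^2 b + 1}{c\, r n \log n}.
\end{aligned}
```

The last line stays bounded away from zero unless p n² = Ω(r n log n), which is exactly the stated converse.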

The paper improves upon earlier results that required O(r n log² n) or O(r n polylog n) samples, eliminating an extra logarithmic factor. It also clarifies the role of incoherence: when μ is large, the constant in the sample bound grows, reflecting the intuition that highly coherent matrices are harder to recover from uniform random samples.

Extensive simulations corroborate the theory. The authors generate matrices of varying sizes (n = 500–2000), ranks (r = 5–50), and incoherence levels (μ = 1, 2, 5), vary the sampling probability p, and evaluate both convex nuclear‑norm minimization and a non‑convex Alternating Least Squares (ALS) algorithm. The empirical phase transition, where the probability of exact recovery jumps from near zero to near one, occurs close to the predicted threshold p ≈ C·(r log n)/n. This alignment shows that the information‑theoretic limits are not merely abstract bounds but accurately predict practical algorithmic performance.
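To give a concrete sense of the experiment, a minimal ALS baseline can be run on an instance sampled above the predicted threshold. The initialization, regularization, and iteration count here are illustrative choices, not the paper’s settings:

```python
import numpy as np

def als_complete(Y, mask, r, iters=50, reg=1e-6, seed=0):
    """Alternating least squares for matrix completion.

    A minimal sketch of the non-convex baseline described in the text:
    fit M ~ A @ B.T by alternately solving ridge-regularized least squares
    for each row of A and of B over the observed entries only.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = Y.shape
    A = rng.standard_normal((n1, r))
    B = rng.standard_normal((n2, r))
    ridge = reg * np.eye(r)
    for _ in range(iters):
        for i in range(n1):                    # update row i of A
            cols = mask[i]
            Bi = B[cols]
            A[i] = np.linalg.solve(Bi.T @ Bi + ridge, Bi.T @ Y[i, cols])
        for j in range(n2):                    # update row j of B
            rows = mask[:, j]
            Aj = A[rows]
            B[j] = np.linalg.solve(Aj.T @ Aj + ridge, Aj.T @ Y[rows, j])
    return A @ B.T

# Illustrative run: p = 0.4 is well above (r log n)/n ~ 0.14 for these sizes.
rng = np.random.default_rng(1)
n, r, p = 60, 2, 0.4
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < p
X_hat = als_complete(np.where(mask, M, 0.0), mask, r)
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
print(f"relative error: {rel_err:.2e}")
```

Sweeping p below and above (r log n)/n and recording the fraction of successful runs reproduces the kind of phase-transition curve the paper reports.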

Finally, the authors discuss extensions. They suggest analyzing non‑uniform sampling schemes (e.g., importance sampling based on row/column leverage scores), incorporating additive noise alongside erasures, and generalizing the framework to higher‑order tensors. Such extensions would broaden the applicability of the theory to real‑world problems in recommender systems, computer vision, and signal processing where low‑rank structure is exploited under incomplete and noisy observations. In summary, the paper provides a clean, rigorous information‑theoretic characterization of low‑rank matrix completion, establishing tight order‑optimal bounds and linking them directly to both algorithmic design and empirical behavior.

