High Rank Matrix Completion via Grassmannian Proxy Fusion

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

This paper approaches high-rank matrix completion (HRMC): filling missing entries in a data matrix whose columns lie near a union of subspaces, clustering those columns, and identifying the underlying subspaces. Current methods often lack theoretical support, produce uninterpretable results, and require more samples than theoretically necessary. We propose clustering incomplete vectors by grouping proxy subspaces and minimizing two criteria over the Grassmannian: (a) the chordal distance between each point and its corresponding subspace and (b) the geodesic distances between the subspaces of all data points. Experiments on synthetic and real datasets demonstrate that our method performs comparably to leading methods at high sampling rates and significantly better at low sampling rates, thus narrowing the gap to the theoretical sampling limit of HRMC.


💡 Research Summary

The paper tackles the challenging problem of High‑Rank Matrix Completion (HRMC), where the columns of a data matrix lie near a union of several low‑dimensional subspaces and many entries are missing. Existing HRMC approaches rely on naive completion followed by subspace clustering, on neighborhood overlap, alternating EM‑style updates, lifting techniques, or integer programming. These methods suffer from a lack of theoretical guarantees and difficulty interpreting results, and, most critically, they require far more observed entries than the information‑theoretic limit suggested by low‑rank matrix completion (LRMC).

To overcome these issues, the authors introduce a novel “Grassmannian Proxy Fusion” (GrassFusion) framework. For each partially observed column $x_i$, they construct a proxy subspace $U_i$ of dimension $r$ (an upper bound on the true subspace dimensions) that contains the observed entries. The key idea is to optimize all proxy subspaces jointly on the Grassmann manifold $\mathcal G(m,r)$ by minimizing a composite objective:

  1. Chordal term $d_c(x_i,U_i)=1-\sigma_1^2(X_i^{0\top}U_i)$. Here $X_i^0$ spans all possible completions of the observed vector; the term penalizes proxy subspaces that cannot accommodate a feasible completion of $x_i$.

  2. Geodesic term $d_g(U_i,U_j)=\sum_{\ell=1}^r\arccos\sigma_\ell(U_i^\top U_j)$. This encourages proxies belonging to the same underlying subspace to become close on the manifold, thereby inducing consensus among them.
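Both terms reduce to singular values of small cross-products of orthonormal bases. The NumPy sketch below is our own illustration of these formulas; the helper names (`completion_basis`, `chordal_term`, `geodesic_dist`) are hypothetical and not from the paper.

```python
import numpy as np

def completion_basis(x_obs, omega, m):
    # Orthonormal basis X_i^0 whose span contains every feasible completion
    # of a length-m column observed only on the index set omega.
    v = np.zeros(m)
    v[omega] = x_obs
    v /= np.linalg.norm(v)                      # normalized zero-filled observation
    missing = [j for j in range(m) if j not in set(omega)]
    return np.column_stack([v, np.eye(m)[:, missing]])

def chordal_term(X0, U):
    # d_c(x_i, U_i) = 1 - sigma_1^2(X_i^{0T} U_i); it is zero iff some
    # feasible completion of x_i lies in span(U_i).
    s = np.linalg.svd(X0.T @ U, compute_uv=False)
    return 1.0 - s[0] ** 2

def geodesic_dist(U, V):
    # d_g(U_i, U_j) = sum_l arccos sigma_l(U_i^T U_j): sum of principal angles.
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.sum(np.arccos(np.clip(s, -1.0, 1.0))))
```

For example, any proxy whose span includes the zero-filled observed direction attains a chordal term of zero, since a feasible completion of $x_i$ then lies inside it.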

The full objective is

$$\min_{\{U_i\}\subset\mathcal G(m,r)}\;\sum_i d_c(x_i,U_i)\;+\;\lambda\sum_{i<j} d_g(U_i,U_j),$$

where $\lambda>0$ is a trade-off parameter balancing per-column feasibility against consensus among the proxies.
