A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound
The CUR matrix decomposition is an important extension of the Nyström approximation to a general matrix. It approximates any data matrix in terms of a small number of its columns and rows. In this paper we propose a novel randomized CUR algorithm with an expected relative-error bound. The proposed algorithm has two advantages over the existing relative-error CUR algorithms: it possesses a tighter theoretical bound and lower time complexity, and it avoids maintaining the whole data matrix in main memory. Finally, experiments on several real-world datasets demonstrate significant improvement over the existing relative-error algorithms.
💡 Research Summary
The paper addresses the practical limitations of existing relative‑error CUR matrix decomposition methods by introducing a new randomized algorithm that achieves both a tighter theoretical error bound and a substantially lower computational cost. CUR decomposition approximates a data matrix A∈ℝ^{m×n} as A≈C U R, where C consists of a small subset of columns of A, R consists of a small subset of rows, and U is a low‑dimensional linking matrix. This formulation preserves interpretability because the factors are actual columns and rows of the original data, but prior relative‑error algorithms suffer from three major drawbacks: (1) they require the entire matrix to be stored in memory, (2) their runtime scales at least linearly with the product mn (often O(mnk) when computing leverage scores), and (3) the error guarantee ‖A−CUR‖_F ≤ (1+ε)‖A−A_k‖_F contains relatively large constants and logarithmic factors, forcing the sampling size to be O(k log k/ε²).
The proposed method consists of two sequential randomized sampling phases. In the first phase the algorithm selects columns. Instead of computing exact leverage scores (which would need a full SVD), it obtains approximate leverage scores via a fast random‑projection technique: A is projected onto a subspace of dimension O(k log k), a QR factorization is performed on the reduced matrix, and the squared row norms of the resulting orthogonal matrix serve as proxy leverage scores. These scores are used to define sampling probabilities p_j ∝ ℓ_j for each column j. The algorithm draws O(k log k/ε²) columns (with replacement) and rescales them, forming C. The analysis shows that, in expectation, the column space spanned by C is a (1+ε)‑approximation of the top‑k column space of A, and the constant hidden in the bound is reduced compared with previous work.
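The first phase can be sketched as follows. This is a minimal NumPy illustration of the general idea (random projection, then QR, then squared row norms as proxy leverage scores); the function names, the Gaussian sketch, and the oversampling constant are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def approx_column_leverage_scores(A, k, oversample=10):
    """Approximate column leverage scores of A without a full SVD:
    sketch A with a Gaussian random projection, take a thin QR of the
    sketch, and use squared row norms of the orthonormal factor as
    proxy scores. (Sketch size/distribution are illustrative.)"""
    m, n = A.shape
    d = min(m, k + oversample)                 # sketch dimension, ~O(k log k)
    S = np.random.default_rng(0).standard_normal((d, m)) / np.sqrt(d)
    B = S @ A                                  # d x n sketched matrix
    Q, _ = np.linalg.qr(B.T)                   # n x d orthonormal basis
    scores = (Q ** 2).sum(axis=1)              # squared row norms as proxies
    return scores / scores.sum()               # sampling probabilities p_j

def sample_columns(A, probs, c, seed=0):
    """Draw c columns with replacement and rescale them to form C."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(A.shape[1], size=c, p=probs)
    return A[:, idx] / np.sqrt(c * probs[idx]), idx
```

The rescaling by 1/sqrt(c · p_j) is the standard correction that keeps the sampled-and-rescaled matrix an unbiased estimator in expectation.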
In the second phase, rows are sampled using the already selected columns. The rows of C are orthogonalized (QR) to obtain a basis Q for the column space of C. For each original row i, the algorithm computes the squared norm of its projection onto Q, which acts as an approximate row leverage score. Rows are then sampled with probabilities proportional to these scores, again drawing O(k log k/ε²) rows to form R.
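The second phase, sketched under the same caveats (illustrative names and constants, not the paper's exact procedure): orthogonalize C, read off approximate row leverage scores from the basis, and sample rows.

```python
import numpy as np

def sample_rows(A, C, r, seed=0):
    """Sample r rows of A using the already-selected columns C:
    Q is an orthonormal basis for range(C), and the squared row norms
    of Q serve as approximate row leverage scores."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(C)                     # m x c orthonormal basis
    scores = (Q ** 2).sum(axis=1)              # squared projection norms per row
    probs = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=r, p=probs)
    R = A[idx, :] / np.sqrt(r * probs[idx])[:, None]
    return R, idx
```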
The linking matrix U is defined as the least‑squares solution U = C^{†} A R^{†}, where † denotes the Moore‑Penrose pseudoinverse. This choice guarantees that CUR is the best approximation of A within the subspace spanned by the selected columns and rows.
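A quick sanity check of this choice of U: when the selected columns span A's column space and the selected rows span its row space (as happens for an exactly low-rank A), the pseudoinverse-based link matrix reproduces A exactly. The slices below are hypothetical stand-ins for the sampled factors.

```python
import numpy as np

# Toy check of U = pinv(C) @ A @ pinv(R) on an exactly rank-8 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8)) @ rng.standard_normal((8, 40))  # rank 8
C = A[:, :12]   # stand-in for 12 "sampled" columns (spans range(A))
R = A[:12, :]   # stand-in for 12 "sampled" rows (spans rowspace(A))
U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
# rel_err is ~0: C @ pinv(C) and pinv(R) @ R are orthogonal projectors
# onto range(A) and rowspace(A), so C U R = A here.
```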
The authors prove that the expected Frobenius error satisfies E‖A−CUR‖_F ≤ (1+ε)‖A−A_k‖_F, i.e., the relative-error guarantee stated above holds in expectation with the stated sampling sizes.