Matrix Completion from Noisy Entries

Reading time: 5 minutes
...

📝 Original Info

  • Title: Matrix Completion from Noisy Entries
  • ArXiv ID: 0906.2027
  • Date: 2010-06-01
  • Authors: Raghunandan H. Keshavan, Andrea Montanari, Sewoong Oh

📝 Abstract

Given a matrix M of low-rank, we consider the problem of reconstructing it from noisy observations of a small, random subset of its entries. The problem arises in a variety of applications, from collaborative filtering (the 'Netflix problem') to structure-from-motion and positioning. We study a low complexity algorithm introduced by Keshavan et al. (2009), based on a combination of spectral techniques and manifold optimization, that we call here OptSpace. We prove performance guarantees that are order-optimal in a number of circumstances.

💡 Deep Analysis

[Figure 1]

📄 Full Content

Spectral techniques are an authentic workhorse in machine learning, statistics, numerical analysis, and signal processing. Given a matrix M, its largest singular values (and the associated singular vectors) 'explain' the most significant correlations in the underlying data source. A low-rank approximation of M can further be used for low-complexity implementations of a number of linear algebra algorithms (Frieze et al., 2004).

In many practical circumstances we have access only to a sparse subset of the entries of an m × n matrix M. It has recently been discovered that, if the matrix M has rank r, and unless it is too 'structured', a small random subset of its entries allows one to reconstruct it exactly. This result was first proved by Candès and Recht (2008) by analyzing a convex relaxation introduced by Fazel (2002). A tighter analysis of the same convex relaxation was carried out by Candès and Tao (2009). A number of iterative schemes to solve the convex optimization problem appeared soon thereafter (Cai et al., 2008; Ma et al., 2009; Toh and Yun, 2009).

In an alternative line of work, Keshavan, Montanari, and Oh (2010) attacked the same problem using a combination of spectral techniques and manifold optimization: We will refer to their algorithm as OptSpace. OptSpace is intrinsically of low complexity, the most complex operation being computing r singular values (and the corresponding singular vectors) of a sparse m × n matrix. The performance guarantees proved by Keshavan et al. (2010) are comparable with the information theoretic lower bound: roughly nr max{r, log n} random entries are needed to reconstruct M exactly (here we assume m of order n). A related approach was also developed by Lee and Bresler (2009), although without performance guarantees for matrix completion.

The above results crucially rely on the assumption that M is exactly a rank-r matrix. For many applications of interest, this assumption is unrealistic and it is therefore important to investigate the robustness of these methods. Can the above approaches be generalized when the underlying data is only 'well approximated' by a rank-r matrix? This question was addressed by Candès and Plan (2009) within the convex relaxation approach of Candès and Recht (2008). The present paper proves a similar robustness result for OptSpace. Remarkably, the guarantees we obtain are order-optimal in a variety of circumstances, and improve over the analogous results of Candès and Plan (2009).

Let M be an m × n matrix of rank r, that is

M = U Σ V^T,

where U has dimensions m × r, V has dimensions n × r, and Σ is a diagonal r × r matrix. We assume that each entry of M is perturbed, thus producing an 'approximately' low-rank matrix N, with

N = M + Z,

where the matrix Z will be assumed to be 'small' in an appropriate sense.

Out of the m × n entries of N, a subset E ⊆ [m] × [n] is revealed. We let N^E be the m × n matrix that contains the revealed entries of N and is filled with 0's in the other positions:

N^E_ij = N_ij if (i, j) ∈ E, and N^E_ij = 0 otherwise.

Analogously, we let M^E and Z^E be the m × n matrices that contain the entries of M and Z, respectively, in the revealed positions and are filled with 0's elsewhere. The set E will be uniformly random given its size |E|.
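
To make this setup concrete, here is a minimal sketch (in Python with NumPy) of the observation model: a rank-r matrix M built from random factors, an additive perturbation Z, and a uniformly random set E of revealed entries encoded as a boolean mask. All names, dimensions, and noise levels are illustrative assumptions, not values from the paper.

```python
import numpy as np

def make_noisy_observation(m=200, n=150, r=5, num_revealed=4000, noise_std=0.1, seed=0):
    """Build a rank-r matrix M, perturb it to N = M + Z, and reveal a random subset E."""
    rng = np.random.default_rng(seed)

    # M = U Sigma V^T with U (m x r), V (n x r), Sigma diagonal (r x r).
    # (Random factors, not an exact SVD; rank r is what matters here.)
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    Sigma = np.diag(rng.uniform(1.0, 10.0, size=r))
    M = U @ Sigma @ V.T

    # N = M + Z, with Z a 'small' perturbation (i.i.d. Gaussian, for illustration only).
    Z = noise_std * rng.standard_normal((m, n))
    N = M + Z

    # Reveal a uniformly random subset E of the entries; N^E keeps the revealed
    # entries of N and is 0 elsewhere.
    flat = rng.choice(m * n, size=num_revealed, replace=False)
    mask = np.zeros((m, n), dtype=bool)
    mask[np.unravel_index(flat, (m, n))] = True
    N_E = np.where(mask, N, 0.0)
    return M, N_E, mask

M, N_E, mask = make_noisy_observation()
```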

For the reader's convenience, we recall the algorithm introduced by Keshavan et al. (2010), which we will analyze here. The basic idea is to minimize the cost function F(X, Y), defined by

F(X, Y) ≡ min_{S ∈ R^{r×r}} F(X, Y, S),

F(X, Y, S) ≡ (1/2) Σ_{(i,j) ∈ E} (N_ij − (X S Y^T)_ij)².

Here X ∈ R^{m×r} and Y ∈ R^{n×r} are orthogonal matrices, normalized by X^T X = m·I and Y^T Y = n·I.
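
As a rough illustration, the sketch below evaluates F(X, Y, S) over the revealed entries and carries out the inner minimization over S, which for fixed X and Y is a linear least-squares problem in the r² entries of S. The helper names and the dense least-squares solver are choices of this sketch, not the paper's implementation.

```python
import numpy as np

def cost_F(X, Y, S, N_E, mask):
    """F(X, Y, S) = 1/2 * sum over revealed (i, j) of (N_ij - (X S Y^T)_ij)^2."""
    residual = (N_E - X @ S @ Y.T)[mask]
    return 0.5 * np.sum(residual ** 2)

def best_S(X, Y, N_E, mask):
    """Inner minimization over S: a least-squares problem in the r*r entries of S."""
    r = X.shape[1]
    rows, cols = np.nonzero(mask)
    # Each revealed entry (i, j) gives one linear equation in vec(S):
    # (X S Y^T)_ij = sum_{a,b} X_ia S_ab Y_jb.
    A = np.einsum('ka,kb->kab', X[rows], Y[cols]).reshape(len(rows), r * r)
    vec_s, *_ = np.linalg.lstsq(A, N_E[rows, cols], rcond=None)
    return vec_s.reshape(r, r)
```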

Minimizing F(X, Y) is an a priori difficult task, since F is a non-convex function. The key insight is that the singular value decomposition (SVD) of N^E provides an excellent initial guess, and that the minimum can be found with high probability by standard gradient descent after this initialization. Two caveats must be added to this description: (1) in general, the matrix N^E must be 'trimmed' to eliminate over-represented rows and columns; (2) for technical reasons, we consider a slightly modified cost function, denoted by F̃(X, Y).

OptSpace( matrix N^E )
1: Trim N^E, and let Ñ^E be the output;
2: Compute the rank-r projection of Ñ^E, P_r(Ñ^E) = X_0 S_0 Y_0^T;
3: Minimize F̃(X, Y) through gradient descent, with initial condition (X_0, Y_0).

We may note here that the rank of the matrix M, if not known, can be reliably estimated from Ñ^E (Keshavan and Oh, 2009).

The various steps of the above algorithm are defined as follows.

Trimming. We say that a row is 'over-represented' if it contains more than 2|E|/m revealed entries (i.e., more than twice the average number of revealed entries per row). Analogously, a column is over-represented if it contains more than 2|E|/n revealed entries. The trimmed matrix Ñ^E is obtained from N^E by setting the over-represented rows and columns to 0.
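
A possible implementation of the trimming rule, assuming the revealed positions are available as a boolean mask; the function name and the mask bookkeeping are choices of this sketch.

```python
import numpy as np

def trim(N_E, mask):
    """Zero out rows with more than 2|E|/m revealed entries and columns with more than 2|E|/n."""
    m, n = N_E.shape
    num_revealed = mask.sum()
    over_rows = mask.sum(axis=1) > 2 * num_revealed / m   # over-represented rows
    over_cols = mask.sum(axis=0) > 2 * num_revealed / n   # over-represented columns

    trimmed = N_E.copy()
    trimmed[over_rows, :] = 0.0
    trimmed[:, over_cols] = 0.0
    trimmed_mask = mask & ~over_rows[:, None] & ~over_cols[None, :]
    return trimmed, trimmed_mask
```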

Rank-r projection. Let

Ñ^E = Σ_{i=1}^{min(m,n)} σ_i x_i y_i^T

be the singular value decomposition of Ñ^E, with singular values σ_1 ≥ σ_2 ≥ ⋯. We then define

P_r(Ñ^E) = (mn/|E|) Σ_{i=1}^{r} σ_i x_i y_i^T.

Apart from an overall normalization, P_r(Ñ^E) is the best rank-r approximation to Ñ^E in Frobenius norm.
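
The rank-r projection can be sketched with a truncated SVD, rescaled by mn/|E| as in the formula above; the sketch also returns normalized factors X_0, Y_0 (with X_0^T X_0 = m·I and Y_0^T Y_0 = n·I) to initialize the descent. Names and structure are illustrative.

```python
import numpy as np

def rank_r_projection(N_E_trimmed, num_revealed, r):
    """P_r(trimmed N^E): rescaled best rank-r approximation in Frobenius norm."""
    m, n = N_E_trimmed.shape
    U, sigma, Vt = np.linalg.svd(N_E_trimmed, full_matrices=False)
    scale = (m * n) / num_revealed
    P_r = scale * (U[:, :r] * sigma[:r]) @ Vt[:r, :]
    # Initial factors, normalized so that X0^T X0 = m I and Y0^T Y0 = n I.
    X0 = np.sqrt(m) * U[:, :r]
    Y0 = np.sqrt(n) * Vt[:r, :].T
    return P_r, X0, Y0
```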

Minimization. The modified cost function F̃ is a slightly altered version of F (see caveat (2) above); it is minimized by gradient descent starting from the initial condition (X_0, Y_0) provided by the rank-r projection.
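
Since the definition of the modified cost F̃ is not reproduced above, the following is only a simplified end-to-end sketch: it runs plain gradient descent on F(X, Y) = min_S F(X, Y, S), re-solving S by least squares at each step, rather than the paper's manifold gradient descent on F̃. Step size and iteration count are arbitrary and may need tuning; the helper functions are the ones sketched earlier.

```python
import numpy as np

def optspace_sketch(N_E, mask, r, steps=300, lr=0.5):
    """Simplified pipeline: trim, rank-r projection, then gradient descent on F."""
    trimmed, trimmed_mask = trim(N_E, mask)
    _, X, Y = rank_r_projection(trimmed, trimmed_mask.sum(), r)
    num_revealed = mask.sum()
    for _ in range(steps):
        S = best_S(X, Y, N_E, mask)
        # Residual on the revealed entries only.
        R = np.where(mask, N_E - X @ S @ Y.T, 0.0)
        # Gradients of the per-entry-averaged cost F/|E| with respect to X and Y.
        grad_X = -(R @ Y @ S.T) / num_revealed
        grad_Y = -(R.T @ X @ S) / num_revealed
        X = X - lr * grad_X
        Y = Y - lr * grad_Y
    S = best_S(X, Y, N_E, mask)
    return X @ S @ Y.T  # estimate of M

# Example (using N_E, mask, M from the observation-model sketch above):
# M_hat = optspace_sketch(N_E, mask, r=5)
# rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```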

Reference

This content is AI-processed based on open access ArXiv data.
