Closed-Form Solutions to A Category of Nuclear Norm Minimization Problems

Guangcan Liu†, Ju Sun‡ and Shuicheng Yan§
Department of Electrical & Computer Engineering, National University of Singapore
†roth@sjtu.edu.cn, ‡idmsj@nus.edu.sg, §eleyans@nus.edu.sg

Abstract

It is an efficient and effective strategy to utilize the nuclear norm approximation to learn low-rank matrices, which arise frequently in machine learning and computer vision, so the exploration of nuclear norm minimization problems has been gaining much attention recently. In this paper we prove that the following Low-Rank Representation (LRR) [2, 1] problem:
$$\min_Z \|Z\|_*, \quad \text{s.t.} \ X = AZ,$$
has a unique, closed-form solution, where $X$ and $A$ are given matrices. The proof is based on a lemma that allows us to obtain closed-form solutions to a category of nuclear norm minimization problems. (The content of this paper is a part of [1].)

1 Introduction

In real applications, our observations are often noisy, or even grossly corrupted, and some observations may be missing. This fact naturally leads to the problem of recovering a low-rank matrix $X_0$ from a corrupted observation matrix $X = X_0 + E_0$ (each column of $X$ is an observation vector), with $E_0$ being the unknown noise. Due to the low-rank property of $X_0$, it is straightforward to consider the following regularized rank minimization problem:
$$\min_{D,E} \ \operatorname{rank}(D) + \lambda \|E\|_\ell, \quad \text{s.t.} \ X = D + E,$$
where $\lambda > 0$ is a parameter and $\|\cdot\|_\ell$ is some kind of regularization strategy, such as the $\ell_1$-norm adopted by [3, 4], for characterizing the noise $E_0$. As is common practice in rank minimization problems, one could replace the rank function with the nuclear norm, resulting in the following convex optimization problem:
$$\min_{D,E} \ \|D\|_* + \lambda \|E\|_\ell, \quad \text{s.t.} \ X = D + E. \qquad (1)$$
The minimizer $D^*$ (with respect to the variable $D$) gives a low-rank recovery of the original data $X_0$. The above formulation is adopted by the recently established Robust PCA (RPCA) method [3, 4], which uses the $\ell_1$-norm to characterize the noise. However, such a formulation implicitly assumes that the underlying data structure is a single low-rank subspace. When the data is drawn from a union of multiple subspaces, denoted as $\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_k$, the RPCA method actually treats the data as being sampled from a single subspace defined by $\mathcal{S} = \sum_{i=1}^{k} \mathcal{S}_i$. The specifics of the individual subspaces are not well considered, so the recovery may be inaccurate. To better handle such mixed data, in [2, 1] we suggest a more general rank minimization problem defined as follows:
$$\min_{Z,E} \ \operatorname{rank}(Z) + \lambda \|E\|_\ell, \quad \text{s.t.} \ X = AZ + E,$$
where $A$ is a "dictionary" that linearly spans the data space. By replacing the rank function with the nuclear norm, we have the following convex optimization problem:
$$\min_{Z,E} \ \|Z\|_* + \lambda \|E\|_\ell, \quad \text{s.t.} \ X = AZ + E. \qquad (2)$$
After obtaining an optimal solution $(Z^*, E^*)$, we can recover the original data by using $AZ^*$ (or $X - E^*$). Since $\operatorname{rank}(AZ^*) \leq \operatorname{rank}(Z^*)$, $AZ^*$ is also a low-rank recovery of the original data $X_0$. By setting $A = I$, formulation (2) falls back to (1), so our LRR method can be regarded as a generalization of RPCA that essentially uses the standard basis as the dictionary.
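(As an aside from the original text: problem (2) is convex and can be prototyped directly with a generic conic solver. The sketch below is ours, not the paper's; it assumes the CVXPY package, uses the elementwise $\ell_1$-norm as $\|\cdot\|_\ell$, and picks $\lambda = 0.5$ arbitrarily for illustration.)

    import cvxpy as cp
    import numpy as np

    # Synthetic mixed data: X = A Z0 + E0 with low-rank Z0 and sparse E0.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 30))                                     # dictionary
    Z0 = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 100))     # rank-3 coefficients
    E0 = (rng.random((50, 100)) < 0.05) * rng.standard_normal((50, 100))  # sparse corruption
    X = A @ Z0 + E0

    # Problem (2) with the elementwise l1-norm as the noise regularizer.
    Z = cp.Variable((30, 100))
    E = cp.Variable((50, 100))
    prob = cp.Problem(cp.Minimize(cp.normNuc(Z) + 0.5 * cp.sum(cp.abs(E))),
                      [X == A @ Z + E])
    prob.solve()

    # Inspect the spectrum of the recovery A Z* (leading singular values).
    print(np.linalg.svd(A @ Z.value, compute_uv=False)[:5])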
By choosing an appropriate dictionary $A$, as shown in [2, 1], the lowest-rank representation also reveals the segmentation of the data, so LRR can handle well data drawn from a mixture of multiple subspaces. For ease of understanding the LRR method, in this work we consider the "ideal" case in which the data is noiseless. That is, we consider the following optimization problem:
$$\min_Z \|Z\|_*, \quad \text{s.t.} \ X = AZ. \qquad (3)$$
We will show that this optimization problem always has a unique, closed-form minimizer, provided that $X = AZ$ has feasible solutions.

2 A Closed-Form Solution to Problem (3)

The nuclear norm is convex, but not strongly convex, so in principle problem (3) could have multiple optimal solutions. Fortunately, it can be proven that the minimizer of problem (3) is always uniquely defined by a closed form. This is summarized in the following theorem.

Theorem 2.1 (Uniqueness) Assume $A \neq 0$ and $X = AZ$ has feasible solutions, i.e., $X \in \operatorname{span}(A)$. Then
$$Z^* = V_A (V_A^T V_A)^{-1} V_X^T \qquad (4)$$
is the unique minimizer of problem (3), where $V_X$ and $V_A$ are calculated as follows: compute the skinny Singular Value Decomposition (SVD) of $[X, A]$, denoted as $[X, A] = U \Sigma V^T$, and partition $V$ as $V = [V_X; V_A]$ such that $X = U \Sigma V_X^T$ and $A = U \Sigma V_A^T$.

From the above theorem we have the following two corollaries. First, when the matrix $A$ is of full row rank, the closed-form solution defined by (4) can be represented in a simpler form.

Corollary 2.1 Suppose the matrix $A$ has full row rank. Then
$$Z^* = A^T (A A^T)^{-1} X$$
is the unique minimizer of problem (3), where $A^T (A A^T)^{-1}$ is the generalized inverse of $A$.

Second, when the data matrix itself is used as the dictionary, i.e., $A = X$, the solution to problem (3) coincides with the output of a factorization-based method.

Corollary 2.2 Assume $X \neq 0$. Then the following optimization problem
$$\min_Z \|Z\|_*, \quad \text{s.t.} \ X = XZ,$$
has a unique minimizer
$$Z^* = \operatorname{SIM}(X), \qquad (5)$$
where $\operatorname{SIM}(X) = V_X V_X^T$ is called the Shape Interaction Matrix (SIM) [5] in computer vision and $X = U_X \Sigma_X V_X^T$ is the skinny SVD of $X$.
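(A numerical sanity check, ours rather than the paper's: with NumPy one can evaluate the closed form (4) and compare it against Corollary 2.1, taking $A$ with full row rank, and against Corollary 2.2.)

    import numpy as np

    rng = np.random.default_rng(1)
    n, d, m = 20, 40, 60                        # A (n x d) has full row rank almost surely
    A = rng.standard_normal((n, d))
    X = A @ rng.standard_normal((d, 5)) @ rng.standard_normal((5, m))  # X in span(A)

    # Skinny SVD of [X, A]; partition V = [V_X; V_A] (Theorem 2.1).
    U, s, Vt = np.linalg.svd(np.hstack([X, A]), full_matrices=False)
    V = Vt[: np.sum(s > 1e-10 * s[0])].T        # keep only the nonzero singular directions
    VX, VA = V[:m], V[m:]

    Z_star = VA @ np.linalg.solve(VA.T @ VA, VX.T)   # equation (4)
    print(np.allclose(A @ Z_star, X))                # feasibility: X = A Z*

    # Corollary 2.1: with full row rank A, the minimizer is the generalized inverse times X.
    print(np.allclose(Z_star, A.T @ np.linalg.solve(A @ A.T, X)))

    # Corollary 2.2: with A = X, the minimizer is the Shape Interaction Matrix V_X V_X^T.
    Ux, sx, Vxt = np.linalg.svd(X, full_matrices=False)
    Vx = Vxt[: np.sum(sx > 1e-10 * sx[0])].T
    print(np.allclose(X @ (Vx @ Vx.T), X))           # SIM(X) is feasible for X = X Z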
The proof of Theorem 2.1 is based on the following three lemmas.

Lemma 2.1 Let $U$, $V$ and $M$ be matrices of compatible dimensions. Suppose both $U$ and $V$ have orthonormal columns, i.e., $U^T U = I$ and $V^T V = I$ (note that $U$ and $V$ need not be square, so in general $U U^T \neq I$ and $V V^T \neq I$). Then $\|M\|_* = \|U M V^T\|_*$.

Proof. Let the full SVD of $M$ be $M = U_M \Sigma_M V_M^T$; then $U M V^T = (U U_M) \Sigma_M (V V_M)^T$. As $(U U_M)^T (U U_M) = I$ and $(V V_M)^T (V V_M) = I$, $(U U_M) \Sigma_M (V V_M)^T$ is actually the SVD of $U M V^T$. By the definition of the nuclear norm, we have $\|M\|_* = \operatorname{tr}(\Sigma_M) = \|U M V^T\|_*$.

Lemma 2.2 For any four matrices $B$, $C$, $D$ and $F$ of compatible dimensions, we have
$$\left\| \begin{bmatrix} B & C \\ D & F \end{bmatrix} \right\|_* \geq \|B\|_*,$$
where equality holds if and only if $C = 0$, $D = 0$ and $F = 0$.

Proof. Lemma 10 of [6] directly leads to the above conclusion.

Lemma 2.3 Let $U$, $V$ and $M$ be given matrices of compatible dimensions. Suppose both $U$ and $V$ have orthonormal columns, i.e., $U^T U = I$ and $V^T V = I$. Then the following optimization problem
$$\min_Z \|Z\|_*, \quad \text{s.t.} \ U^T Z V = M, \qquad (6)$$
has a unique minimizer $Z^* = U M V^T$.

Proof. First, we prove that $\|M\|_*$ is the minimum objective function value and that $Z^* = U M V^T$ is a minimizer. For any feasible solution $Z$, let $Z = U_Z \Sigma_Z V_Z^T$ be its full SVD. Let $B = U^T U_Z$ and $C = V_Z^T V$. Then the constraint $U^T Z V = M$ is equivalent to
$$B \Sigma_Z C = M. \qquad (7)$$
Since $B B^T = I$ and $C^T C = I$, we can find orthogonal complements $B_\perp$ and $C_\perp$ such that $[B; B_\perp]$ and $[C, C_\perp]$ are orthogonal matrices (when $B$ and/or $C$ are already orthogonal, $B_\perp$ and/or $C_\perp$ are empty and the argument is unchanged). According to the unitary invariance of the nuclear norm, Lemma 2.2 and (7), we have
$$\|Z\|_* = \|\Sigma_Z\|_* = \left\| \begin{bmatrix} B \\ B_\perp \end{bmatrix} \Sigma_Z \, [C, C_\perp] \right\|_* = \left\| \begin{bmatrix} B \Sigma_Z C & B \Sigma_Z C_\perp \\ B_\perp \Sigma_Z C & B_\perp \Sigma_Z C_\perp \end{bmatrix} \right\|_* \geq \|B \Sigma_Z C\|_* = \|M\|_*.$$
Hence $\|M\|_*$ is the minimum objective function value of problem (6). At the same time, Lemma 2.1 shows that $\|Z^*\|_* = \|U M V^T\|_* = \|M\|_*$, so $Z^* = U M V^T$ is a minimizer of problem (6).

Second, we prove that $Z^* = U M V^T$ is the unique minimizer. Assume that $Z_1 = U M V^T + H$ is another optimal solution. From $U^T Z_1 V = M$, we have
$$U^T H V = 0. \qquad (8)$$
Since $U^T U = I$ and $V^T V = I$, similarly to the above we can construct two orthogonal matrices $[U, U_\perp]$ and $[V, V_\perp]$. By the optimality of $Z_1$, we have
$$\|M\|_* = \|Z_1\|_* = \|U M V^T + H\|_* = \left\| \begin{bmatrix} U^T \\ U_\perp^T \end{bmatrix} (U M V^T + H) \, [V, V_\perp] \right\|_* = \left\| \begin{bmatrix} M & U^T H V_\perp \\ U_\perp^T H V & U_\perp^T H V_\perp \end{bmatrix} \right\|_* \geq \|M\|_*.$$
According to Lemma 2.2, equality can hold only if $U^T H V_\perp = U_\perp^T H V = U_\perp^T H V_\perp = 0$. Together with (8), we conclude that $H = 0$, so the optimal solution is unique.

The above lemma allows us to obtain closed-form solutions to a class of nuclear norm minimization problems, and it leads to a simple proof of Theorem 2.1.

Proof (of Theorem 2.1). Since $X \in \operatorname{span}(A)$, we have $\operatorname{rank}([X, A]) = \operatorname{rank}(A)$. By the definitions of $V_X$ and $V_A$ (see Theorem 2.1), it can be concluded that the matrix $V_A^T$ has full row rank. That is, if the skinny SVD of $V_A^T$ is $U_1 \Sigma_1 V_1^T$, then $U_1$ is an orthogonal matrix. Through some simple computations, we have
$$V_A (V_A^T V_A)^{-1} = V_1 \Sigma_1^{-1} U_1^T. \qquad (9)$$
Also, it can be calculated that the constraint $X = AZ$ is equivalent to $V_X^T = V_A^T Z$, which is in turn equivalent to $\Sigma_1^{-1} U_1^T V_X^T = V_1^T Z$. So problem (3) is equivalent to the following optimization problem:
$$\min_Z \|Z\|_*, \quad \text{s.t.} \ V_1^T Z = \Sigma_1^{-1} U_1^T V_X^T.$$
By Lemma 2.3 and (9), problem (3) has a unique minimizer
$$Z^* = V_1 \Sigma_1^{-1} U_1^T V_X^T = V_A (V_A^T V_A)^{-1} V_X^T.$$

3 Conclusion

In this paper, we prove that problem (3) has a unique, closed-form solution. The heart of the proof is Lemma 2.3, which allows us to obtain closed-form solutions to a category of nuclear norm minimization problems. For example, by following the clues presented in this paper, it is simple to obtain the closed-form solution to the following optimization problem:
$$\min_Z \|Z\|_*, \quad \text{s.t.} \ X = AZB,$$
where $X$, $A$ and $B$ are given matrices. Our theorems are useful for studying LRR problems. For example, based on Lemma 2.3, we have devised a method to recover the effects of the unobserved, hidden data in LRR [7].
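(The paper leaves the $X = AZB$ case as an exercise; working through Lemma 2.3 in the same way suggests the candidate $Z^* = A^\dagger X B^\dagger$, with $A^\dagger$, $B^\dagger$ the Moore-Penrose pseudoinverses. This is our own derivation, not a result stated above, but it is easy to cross-check numerically against a generic solver, again assuming CVXPY.)

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((8, 12))    # full row rank, nontrivial null space in Z
    B = rng.standard_normal((10, 6))    # full column rank, nontrivial null space in Z
    X = A @ rng.standard_normal((12, 10)) @ B   # feasible by construction

    # Solve min ||Z||_* s.t. X = A Z B with a generic conic solver.
    Z = cp.Variable((12, 10))
    cp.Problem(cp.Minimize(cp.normNuc(Z)), [A @ Z @ B == X]).solve()

    # Candidate closed form: pseudoinverse on both sides.
    Z_closed = np.linalg.pinv(A) @ X @ np.linalg.pinv(B)
    print(np.abs(Z.value - Z_closed).max())     # small, up to solver accuracy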
References

[1] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, "Robust recovery of subspace structures by low-rank representation," in submission (http://arxiv.org/PS_cache/arxiv/pdf/1010/1010.2955v1.pdf).

[2] G. Liu, Z. Lin, and Y. Yu, "Robust subspace segmentation by low-rank representation," in International Conference on Machine Learning, 2010, pp. 663-670.

[3] J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma, "Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization," in Advances in Neural Information Processing Systems, 2009, pp. 2080-2088.

[4] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" submitted to Journal of the ACM, 2009.

[5] J. P. Costeira and T. Kanade, "A multibody factorization method for independently moving objects," International Journal of Computer Vision, vol. 29, no. 3, pp. 159-179, 1998.

[6] J. Sun, Y. Ni, X. Yuan, S. Yan, and L.-F. Cheong, "Robust low-rank subspace segmentation with semidefinite guarantees," in ICDM Workshop on Optimization Based Methods for Emerging Data Mining Problems, 2010.

[7] G. Liu and S. Yan, "Latent low-rank representation for subspace segmentation and feature extraction," in submission.