On the non-detectability of spiked large random tensors


Authors: Antoine Chevreuil, Philippe Loubaton

Laboratoire d'Informatique Gaspard Monge (CNRS, Université Paris-Est/MLV), 5 Bd. Descartes, 77454 Marne-la-Vallée (France)

Abstract

This paper addresses the detection of a low-rank high-dimensional tensor corrupted by additive complex Gaussian noise. In the asymptotic regime where all the dimensions of the tensor converge towards $+\infty$ at the same rate, existing results devoted to rank-1 tensors are extended. It is proved that if a certain parameter depending on the low-rank tensor is below a threshold, then the null hypothesis and the presence of the low-rank tensor are indistinguishable hypotheses, in the sense that no test performs better than a random choice.

1 Introduction

The problem of testing whether an observed $n_1 \times n_2$ matrix $Y$ is either a zero-mean independent identically distributed Gaussian random matrix $Z$ with variance $\frac{1}{n_2}$, or $X_0 + Z$ (where $X_0$ is a low-rank matrix: a useful signal, also called a spike) is a fundamental problem arising in numerous applications, such as the detection of low-rank multivariate signals or the Gaussian hidden clique problem. When the two dimensions $n_1, n_2$ converge towards $\infty$ at the same rate, the rank of $X_0$ remaining fixed, the context is that of the so-called additive spiked large random matrix models. Various results on the singular values of $X_0 + Z$ have been established; in particular, it is possible to show that the Generalized Likelihood Ratio Test (GLRT) is consistent (i.e. both the probability of false alarm and the probability of missed detection converge towards 0 when $n_1, n_2$ converge towards $+\infty$ in such a way that $n_1/n_2 \to c > 0$) if and only if the largest singular value of $X_0$ is above the threshold $c^{1/4}$ (see e.g. [1], [2], [3]).
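As a numerical illustration (ours, not from the paper), the $c^{1/4}$ threshold behaviour can be observed with a short simulation. The quantitative check relies on the asymptotic top-singular-value formulas from the spiked-matrix literature (e.g. [3]); the dimensions, seed, and tolerances are our choices.

```python
import numpy as np

# Hedged sketch: a rank-1 spike in an n1 x n2 complex Gaussian matrix with
# entry variance 1/n2. Below the threshold c^(1/4), the top singular value of
# X0 + Z sticks to the noise bulk edge 1 + sqrt(c); above it, it separates.
rng = np.random.default_rng(0)
n1 = n2 = 400                      # c = n1/n2 = 1, so the threshold c^(1/4) = 1
c = n1 / n2

def top_sv(lam):
    """Largest singular value of lam * u v^* + Z for a random unit-norm spike."""
    u = rng.normal(size=n1) + 1j * rng.normal(size=n1)
    v = rng.normal(size=n2) + 1j * rng.normal(size=n2)
    u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)
    Z = (rng.normal(size=(n1, n2)) + 1j * rng.normal(size=(n1, n2))) / np.sqrt(2 * n2)
    return np.linalg.svd(lam * np.outer(u, v.conj()) + Z, compute_uv=False)[0]

bulk_edge = 1 + np.sqrt(c)         # asymptotic top singular value of pure noise
s_below = top_sv(0.5)              # spike below threshold: buried in the bulk
s_above = top_sv(2.0)              # spike above threshold: detached from the bulk
```

For $\lambda = 2$ the detached singular value is close to the asymptotic prediction $\sqrt{(1+\lambda^2)(c+\lambda^2)}/\lambda = 2.5$, well above the bulk edge $2$.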
In a number of real-life problems, the observation is not a matrix but a tensor $\mathcal{Y}$ of order $d \ge 3$, i.e. a $d$-dimensional array $\mathcal{Y} = (Y_{i_1,i_2,\ldots,i_d})$ where, for each $k = 1,\ldots,d$, $i_k \in \{1,\ldots,n_k\}$. In this context, the generalization of the above matrix hypothesis testing problem becomes: test whether the observed order-$d$ ($d \ge 3$) tensor is either a zero-mean independent identically distributed Gaussian random tensor $\mathcal{Z}$, or the sum of $\mathcal{Z}$ and a low-rank deterministic tensor $\mathcal{X}_0$, i.e.

$$\mathcal{X}_0 = \sum_{i=1}^{r} \lambda_i \, x_0^{(1,i)} \otimes x_0^{(2,i)} \otimes \cdots \otimes x_0^{(d,i)} \qquad (1.1)$$

where $r$ is called the rank of $\mathcal{X}_0$. Here $(\lambda_i)_{i=1,\ldots,r}$ are strictly positive real numbers, and for each $i = 1,\ldots,r$ and $k = 1,\ldots,d$, $x_0^{(k,i)}$ is an $n_k \times 1$ unit-norm vector. Recent works (see e.g. [4, 5, 6, 7]) addressed the detection/estimation of $\mathcal{X}_0$ when $r$ is reduced to 1 and when the dimensions $n_1,\ldots,n_d$ converge towards $\infty$ at the same rate. We also mention that [4] and [7] only considered the case where the rank-1 tensor $\mathcal{X}_0$ is symmetric, i.e. $n_1 = n_2 = \cdots = n_d$ and all the vectors $(x_0^{(k,1)})_{k=1,\ldots,d}$ coincide. As the concept of singular value decomposition cannot be extended to tensors, ad hoc statistical strategies have been considered to prove the (non-)existence of consistent tests: [5] and [7] established that if $\lambda_1$ is larger than a certain upper bound, then consistent detection of $\mathcal{X}_0$ is possible. In the other direction, [6] and [7] proved that if $\lambda_1$ is less than a certain lower bound (which is strictly less than the above upper bound), then $\mathcal{X}_0$ is non-detectable, in the sense that any test behaves as a random choice between the two hypotheses. This is a remarkable phenomenon, because such a behaviour is not observed in the matrix case $d = 2$.
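For concreteness, a rank-$r$ spike of the form (1.1) can be assembled numerically; the snippet below is an illustrative sketch of the model, not code from the paper.

```python
import numpy as np

def spike_tensor(lams, components):
    """X0 = sum_i lam_i x0^(1,i) ⊗ ... ⊗ x0^(d,i)   (model (1.1)).

    components[i] is the list (x0^(1,i), ..., x0^(d,i)) of unit-norm vectors."""
    X0 = None
    for lam, vecs in zip(lams, components):
        T = vecs[0]
        for x in vecs[1:]:
            T = np.multiply.outer(T, x)          # tensor (outer) product
        X0 = lam * T if X0 is None else X0 + lam * T
    return X0

rng = np.random.default_rng(0)
d, n, r = 3, 5, 2
lams = [2.0, 1.0]
comps = []
for i in range(r):
    vs = [rng.normal(size=n) + 1j * rng.normal(size=n) for _ in range(d)]
    comps.append([v / np.linalg.norm(v) for v in vs])
X0 = spike_tensor(lams, comps)                   # an n x n x n rank-2 spike
```

Since each component is an outer product of unit vectors, a rank-1 spike has Frobenius norm exactly $\lambda_1$.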
In effect, if the largest singular value of $X_0$ is below $c^{1/4}$, then [8] proved, when $r = 1$, that there exist statistical tests having a better performance than a random choice, a result that [6] and [7] obtained in a different way in the symmetric case. In [4], [5], [6], [7], a main assumption is that $\mathcal{X}_0$ is a rank-1 tensor. The purpose of the present paper is to consider the case where $r \ge 1$: we find a sufficient condition on the parameters of $\mathcal{X}_0$ under which $\mathcal{X}_0$ is non-detectable. The problem of finding conditions under which the existence of a consistent detection test is guaranteed is not addressed here.

2 Model, notation, and background

The order-$d$ tensors are complex-valued, and it is assumed that $n_1 = n_2 = \cdots = n_d = n$ in order to simplify the notation. The set $\bigotimes^d \mathbb{C}^n$ is a complex vector space endowed with the standard scalar product

$$\forall \mathcal{X}, \mathcal{Y} \in \bigotimes^d \mathbb{C}^n, \quad \langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1,\ldots,i_d} \overline{X_{i_1,\ldots,i_d}} \, Y_{i_1,\ldots,i_d}$$

and the Frobenius norm $\|\mathcal{X}\|_F = \sqrt{\langle \mathcal{X}, \mathcal{X} \rangle}$. The spike ("the signal") is assumed to be a tensor of fixed rank $r$ following (1.1). Throughout this contribution, $n$ is large or, mathematically, $n \to \infty$. We hence have, for each $n$, a set of $n \times 1$ vectors $(x_0^{(k,i)})_{k=1,\ldots,d,\; i=1,\ldots,r}$. For each $k = 1,\ldots,d$, we denote by $\chi_0^{(k)}$ the $n \times r$ matrix $\chi_0^{(k)} = (x_0^{(k,1)}, \ldots, x_0^{(k,r)})$. We impose a non-erratic asymptotic behaviour of the spike; specifically, as all the vectors $x_0^{(k,i)} \in \mathbb{C}^{n \times 1}$ have unit norm, we suppose that for all $i, j$, $\langle x_0^{(k,i)}, x_0^{(k,j)} \rangle = (\chi_0^{(k)*} \chi_0^{(k)})_{i,j}$ converges as $n \to \infty$. The rate of convergence is a technical aspect that is out of the scope of this contribution: we will simply assume that the matrices $(\chi_0^{(k)*} \chi_0^{(k)})_{k=1,\ldots,d}$ do not depend on $n$.
We define the SVD of $\chi_0^{(k)}$ as $U_k \begin{pmatrix} \Sigma_k \\ 0 \end{pmatrix} V_k^*$, for $U_k$ and $V_k$ unitary matrices respectively of size $n \times n$ and $r \times r$, and $\Sigma_k$ a diagonal matrix with non-negative diagonal entries. $V_k$ and $\Sigma_k$ do not depend on $n$ because $\chi_0^{(k)*} \chi_0^{(k)} = V_k \Sigma_k^2 V_k^*$. We denote by $\mathcal{Z}$ the noise tensor, and assume that its entries are $\mathcal{N}_{\mathbb{C}}(0, 1/n)$ independent identically distributed complex circular Gaussian random variables. In the following, we consider the alternative $H_0: \mathcal{Y} = \mathcal{Z}$ versus $H_1: \mathcal{Y} = \mathcal{X}_0 + \mathcal{Z}$. We denote by $p_{1,n}(y)$ the probability density of $\mathcal{Y}$ under $H_1$ and by $p_{0,n}(y)$ its density under $H_0$. $\Lambda(\mathcal{Y}) = \frac{p_{1,n}(\mathcal{Y})}{p_{0,n}(\mathcal{Y})}$ is the likelihood ratio, and we denote by $\mathbb{E}_0$ the expectation under $H_0$. We now recall the fundamental information-geometry results used in [6] in order to address the detection problem. The following properties are well known (see also [9], Section 3):

• (i) If $\mathbb{E}_0[\Lambda(\mathcal{Y})^2]$ is bounded, then no consistent detection test exists.

• (ii) If moreover $\mathbb{E}_0[\Lambda(\mathcal{Y})^2] = 1 + o(1)$, then the total variation distance between $p_{0,n}$ and $p_{1,n}$ converges towards 0, and no test performs better than a decision at random.

Therefore, the computation of the second-order moment of $\Lambda(\mathcal{Y})$ under $p_{0,n}$ may provide insights on the detection. We notice, however, that conditions (i) and (ii) are only sufficient. In particular, if $\limsup_n \mathbb{E}_0[\Lambda(\mathcal{Y})^2] = +\infty$, nothing can be inferred on the behaviour of the detection problem when $n \to +\infty$.

3 Prior on the spike. Expression of the second-order moment

The density of $\mathcal{Z}$, seen as a collection of $n^d$ complex-valued random variables, is obviously $p_{0,n}(z) = \kappa_n \exp\left(-n \|z\|_F^2\right)$ where $\kappa_n = \left(\frac{n}{\pi}\right)^{n^d}$. On the one hand, we notice that the second-order moment approach is not suited to the deterministic model of the spike as presented previously.
Indeed, in this case $\mathbb{E}_0[\Lambda(\mathcal{Y})^2]$ has the simple expression $\exp\left(2n \|\mathcal{X}_0\|_F^2\right)$ and always diverges. On the other hand, the noise tensor shows an invariance property: if $\Theta_1, \ldots, \Theta_d$ are unitary $n \times n$ matrices, then the density of the mode products $(\Theta_1 \otimes \Theta_2 \otimes \cdots \otimes \Theta_d)\mathcal{Z}$ equals that of $\mathcal{Z}$. For $d = 2$, the notation $(\Theta_1 \otimes \Theta_2)Z$ simply means $\Theta_1 Z \Theta_2^T$ and, for a general $d$,

$$\left((\Theta_1 \otimes \Theta_2 \otimes \cdots \otimes \Theta_d)\mathcal{Z}\right)_{i_1,\ldots,i_d} = \sum_{\ell_1,\ldots,\ell_d} (\Theta_1)_{i_1,\ell_1} (\Theta_2)_{i_2,\ell_2} \cdots (\Theta_d)_{i_d,\ell_d} Z_{\ell_1,\ldots,\ell_d}.$$

We hence modify the data according to the following procedure: we pick i.i.d. complex Haar samples $\Theta_1, \ldots, \Theta_d$ and change the data tensor $\mathcal{Y}$ into $(\Theta_1 \otimes \Theta_2 \otimes \cdots \otimes \Theta_d)\mathcal{Y}$. This does not affect the distribution of the noise, but it amounts to assuming a prior on the spike. Indeed, the vectors $x_0^{(k,i)}$ are replaced by $\Theta_k x_0^{(k,i)}$. They are all uniformly distributed on the unit sphere of $\mathbb{C}^n$ and, for $k \ne l$, the vectors $\Theta_k x_0^{(k,i)}$ and $\Theta_l x_0^{(l,j)}$ are independent for each $i, j$. However, the vectors $(\Theta_k x_0^{(k,i)})_{i=1,\ldots,r}$ are not independent. In the following, the data and noise tensors after this procedure are still denoted respectively by $\mathcal{Y}$ and $\mathcal{Z}$. We are now in a position to give a closed-form expression of the second-order moment of $\Lambda(\mathcal{Y})$. We have $p_{1,n}(\mathcal{Y}) = \mathbb{E}_X[p_{0,n}(\mathcal{Y} - \mathcal{X})]$ where $\mathbb{E}_X$ is the mathematical expectation over the distribution of the spike, or equivalently over the Haar matrices $(\Theta_k)_{k=1,\ldots,d}$. It holds that

$$\mathbb{E}_0\left[\Lambda(\mathcal{Y})^2\right] = \mathbb{E}_{X,X'}\left[\exp\left(2n \, \Re \langle \mathcal{X}, \mathcal{X}' \rangle\right)\right] = \mathbb{E}_{X,X'}\left[\exp\left(2n \, \Re \sum_{i,j=1}^{r} \lambda_i \lambda_j \prod_{k=1}^{d} \left\langle (\Theta'_k)^* \Theta_k x_0^{(k,i)}, \, x_0^{(k,j)} \right\rangle\right)\right]$$

where $\mathbb{E}_{X,X'}$ is over independent copies $\mathcal{X}, \mathcal{X}'$ of the spike associated respectively with $(\Theta_k)_{k=1,\ldots,d}$ and $(\Theta'_k)_{k=1,\ldots,d}$, and $\Re$ stands for the real part.
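The mode product above can be sketched as follows (an illustration under our notational assumptions; note that for $d = 2$ the general formula yields $\Theta_1 Z \Theta_2^T$):

```python
import numpy as np

def mode_product(thetas, Z):
    """((Theta_1 ⊗ ... ⊗ Theta_d) Z)_{i1..id}
       = sum_{l1..ld} Theta_1[i1,l1] ... Theta_d[id,ld] Z[l1..ld]."""
    out = Z
    for k, Th in enumerate(thetas):
        # contract Theta_k's second index with mode k, then restore axis order
        out = np.moveaxis(np.tensordot(Th, out, axes=(1, k)), 0, k)
    return out

def haar_unitary(n, rng):
    """Sample an n x n Haar unitary via QR of a complex Gaussian matrix."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, R = np.linalg.qr(A)
    return Q * (np.diag(R) / np.abs(np.diag(R)))  # phase fix for Haar measure

rng = np.random.default_rng(0)
n, d = 4, 3
Z = rng.normal(size=(n,) * d) + 1j * rng.normal(size=(n,) * d)
thetas = [haar_unitary(n, rng) for _ in range(d)]
W = mode_product(thetas, Z)   # same Frobenius norm as Z (unitary invariance)
```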
As $\Theta_k$ and $\Theta'_k$ are Haar-distributed and independent, $(\Theta'_k)^* \Theta_k$ is also Haar-distributed, and $\mathbb{E}_0[\Lambda(\mathcal{Y})^2] = \mathbb{E}[\exp(2n\eta)]$, where the expectation is over the i.i.d. Haar matrices $\Theta_1, \Theta_2, \ldots, \Theta_d$ and

$$\eta = \Re \sum_{i,j=1}^{r} \lambda_i \lambda_j \prod_{k=1}^{d} \underbrace{\left\langle \Theta_k x_0^{(k,i)}, x_0^{(k,j)} \right\rangle}_{\xi_k^{(i,j)}}. \qquad (3.1)$$

$\eta$ may be factored as $\eta = \Re\left[\lambda^T \left(\odot_{k=1}^{d} \left(\chi_0^{(k)*} \Theta_k \chi_0^{(k)}\right)\right) \lambda\right]$. In the latter equation, $\odot$ stands for the Hadamard product of matrices. The ultimate simplification comes from the SVD of $\chi_0^{(k)}$:

$$\chi_0^{(k)*} \Theta_k \chi_0^{(k)} = V_k \begin{pmatrix} \Sigma_k & 0 \end{pmatrix} U_k^* \Theta_k U_k \begin{pmatrix} \Sigma_k \\ 0 \end{pmatrix} V_k^*.$$

Firstly, $U_k^* \Theta_k U_k$ has the same distribution as $\Theta_k$; secondly, we may associate with any $\Theta_k$ its upper-left $r \times r$ block, which we denote $\Psi_k$. As a conclusion, we may express $\eta$ as

$$\eta = \Re\left[\lambda^T \left(\odot_{k=1}^{d} \left(V_k \Sigma_k \Psi_k \Sigma_k V_k^*\right)\right) \lambda\right]. \qquad (3.2)$$

4 Extending known results

When $r = 1$, Montanari et al. [6] found a bound on the parameter $\lambda_1$ ensuring that $\mathbb{E}_0[\Lambda(\mathcal{Y})^2]$ is bounded. In this simple case, $\eta$ has a simple expression, since $\eta = \lambda_1^2 \, \Re \prod_{k=1}^{d} \xi_k$ where the $(\xi_k)_{k=1,\ldots,d}$ are i.i.d., distributed as the first component of a uniform vector on the unit sphere of $\mathbb{C}^n$. As in [6], we introduce

$$\beta_d^{\mathrm{2nd}} = \sqrt{\min_{u \in [0,1]} -\frac{1}{u^d} \log(1 - u^2)}. \qquad (4.1)$$

Adapting the result of the mentioned article to the complex-circular context is straightforward:

Theorem 1 (case $r = 1$, Montanari et al.). Let $\xi_1, \ldots, \xi_d$ be i.i.d., distributed as the first component of a vector uniformly distributed on the unit sphere of $\mathbb{C}^n$. If $\lambda_1 < \sqrt{\frac{d}{2}} \, \beta_d^{\mathrm{2nd}}$, then $\mathbb{E}_0\left[\exp\left(2n\lambda_1^2 \, \Re \prod_{k=1}^{d} \xi_k\right)\right]$ is bounded; moreover, if $d > 2$, the above expectation is $1 + o(1)$.

This non-obvious result may be used in order to derive a condition ensuring that hypotheses $H_0$ and $H_1$ are indistinguishable when $r > 1$. In this respect, recall the expansion (3.1).
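The constant (4.1) is easy to evaluate numerically; the grid-search sketch below is ours, and the spot values are what we obtain (with $\beta_2^{\mathrm{2nd}} = 1$, consistent with Remark 3).

```python
import numpy as np

def beta_2nd(d, num=200_000):
    """Numerical evaluation of (4.1):
       beta_d^2nd = sqrt( min_{0<u<1} -log(1 - u^2) / u^d )."""
    u = np.linspace(1e-4, 1 - 1e-9, num)   # avoid the endpoints u = 0 and u = 1
    return np.sqrt(np.min(-np.log(1 - u**2) / u**d))
```

For $d = 2$ the infimum is attained as $u \to 0$ and equals 1; for $d = 3$ the grid search gives $\beta_3^{\mathrm{2nd}} \approx 1.40$, and the constant increases with $d$.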
Thanks to the Hölder inequality, $\mathbb{E}_0[\Lambda(\mathcal{Y})^2]$ is upper-bounded by (see (3.1) for the definition of $\xi_k^{(i,j)}$)

$$\prod_{i,j=1}^{r} \mathbb{E}^{1/p_{i,j}}\left[\exp\left(2n \, p_{i,j} \lambda_i \lambda_j \, \Re \prod_{k=1}^{d} \xi_k^{(i,j)}\right)\right] \qquad (4.2)$$

for any non-negative numbers $p_{i,j}$ such that $\sum_{i,j} \frac{1}{p_{i,j}} = 1$. For fixed $i, j$, we notice that the random variables $(\xi_k^{(i,j)})_{k=1,\ldots,d}$ verify the conditions of Theorem 1. Each of the expectations in (4.2) is hence upper-bounded when $n \to \infty$, provided that, for all $i, j$:

$$p_{i,j} \lambda_i \lambda_j < \frac{d}{2} \left(\beta_d^{\mathrm{2nd}}\right)^2.$$

Choosing eventually $p_{i,j} = \frac{\left(\sum_p \lambda_p\right)^2}{\lambda_i \lambda_j}$, we deduce

Theorem 2 (case $r \ge 1$, extension of Theorem 1). If $\sum_{i=1}^{r} \lambda_i < \sqrt{\frac{d}{2}} \, \beta_d^{\mathrm{2nd}}$, then $\mathbb{E}_0[\Lambda(\mathcal{Y})^2]$ is bounded. If moreover $d > 2$, we have $\mathbb{E}_0[\Lambda(\mathcal{Y})^2] = 1 + o(1)$ and the hypotheses $H_0$ and $H_1$ are indistinguishable.

Remark 1. Due to the use of the Hölder inequality, Theorem 2 is suboptimal in general. The inequality is patently an equality when, for all $k, i, j$, $x_0^{(k,i)} = x_0^{(k,j)}$, i.e. when the spike has rank 1 and amplitude $\sum_{i=1}^{r} \lambda_i$.

5 A tighter bound

The main result of our contribution is the following.

Theorem 3 (case $r \ge 1$). We define $\eta_{\max}$ as

$$\eta_{\max} = \lambda^T \left(\odot_{k=1}^{d} \chi_0^{(k)*} \chi_0^{(k)}\right) \lambda. \qquad (5.1)$$

If $\sqrt{\eta_{\max}} < \sqrt{\frac{d}{2}} \, \beta_d^{\mathrm{2nd}}$, then, for $d > 2$, $\mathbb{E}_0[\Lambda(\mathcal{Y})^2] = 1 + o(1)$.

Before providing elements of the proof of the above result, we briefly justify why the bound in Theorem 3 is tighter than that of Theorem 2, whatever the choice of $\lambda$. On the one hand, $\left(\sum_i \lambda_i\right)^2 = \lambda^T J \lambda$, where $J$ is the $r \times r$ matrix having all its entries equal to 1. On the other hand, all the vectors $x_0^{(k,i)}$ are normalized; consequently, every diagonal entry of $\chi_0^{(k)*} \chi_0^{(k)}$ equals 1 and, for any $i \ne j$, $\left|(\chi_0^{(k)*} \chi_0^{(k)})_{i,j}\right| \le 1$. This proves that

$$\left(\sum_i \lambda_i\right)^2 - \eta_{\max} = \lambda^T \left(J - \odot_{k=1}^{d} \chi_0^{(k)*} \chi_0^{(k)}\right) \lambda \ge 0.$$

We now provide the key elements of the proof of Theorem 3.
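The quantity (5.1) and its comparison with Theorem 2's threshold can be checked numerically; this toy computation is our own sketch (dimensions and amplitudes are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, d = 50, 3, 3
lam = np.array([1.2, 0.8, 0.5])

# random unit-norm columns for each chi_0^(k)
chis = []
for _ in range(d):
    M = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))
    chis.append(M / np.linalg.norm(M, axis=0))

# eta_max = lam^T ( Hadamard_k chi^(k)* chi^(k) ) lam      (5.1)
had = np.ones((r, r), dtype=complex)
for chi in chis:
    had = had * (chi.conj().T @ chi)
eta_max = (lam @ had @ lam).real

sum_bound = lam.sum() ** 2   # Theorem 2 works with (sum_i lam_i)^2 instead
```

As shown in the text, $\eta_{\max} \le \left(\sum_i \lambda_i\right)^2$ always, with equality when all columns of each $\chi_0^{(k)}$ coincide (rank-1 spike).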
Recall that we are looking for a condition on the spike under which $\mathbb{E}[\exp(2n\eta)]$ is bounded. Evidently, the divergence may occur only when $\eta > 0$. We hence consider $E_1 = \mathbb{E}\left[\exp(2n\eta)\, \mathbb{1}_{\eta > \epsilon}\right]$ and $E_2 = \mathbb{E}\left[\exp(2n\eta)\, \mathbb{1}_{\eta \le \epsilon}\right]$, and prove that, under the condition $\sqrt{\eta_{\max}} < \sqrt{\frac{d}{2}} \, \beta_d^{\mathrm{2nd}}$, for a certain small enough $\epsilon$, $E_1 = o(1)$ (for $d \ge 2$) and $E_2 = 1 + o(1)$ (for $d > 2$).

The $E_1$ term. It is clear that the boundedness of the integral $E_1$ is achieved when $\eta$ rarely deviates from 0. As remarked in [6], the natural machinery for understanding $E_1$ is that of the Large Deviation Principle (LDP). In essence, if $\eta$ follows the LDP with rate $n$, there exists a certain non-negative function, called the Good Rate Function (GRF) $I_\eta$, such that, for any Borel set $A$ of $\mathbb{R}$, $\frac{1}{n} \log P(\eta \in A)$ converges towards $\sup_{x \in A} -I_\eta(x)$. The existence of a GRF allows one to analyze the asymptotic behaviour of the integral $E_1$. Indeed, Varadhan's lemma (see Theorem 4.3.1 in [10]) states that $\frac{1}{n} \log \mathbb{E}\left[\exp(2n\eta)\, \mathbb{1}_{\eta > \epsilon}\right] \to \sup_{x > \epsilon} \left(2x - I_\eta(x)\right)$, and hence the $E_1$ term converges towards 0 when $\sup_{x > \epsilon} \left(2x - I_\eta(x)\right) < 0$. We thus justify that $\eta$ follows a Large Deviation Principle with rate $n$, and we compute a lower bound of its GRF. For this, we use the fact that, for each $k$, the random matrix $\Psi_k$ defined in (3.2) follows an LDP with rate $n$, and that its GRF at the parameter $\psi \in \mathbb{C}^{r \times r}$ (we may evidently take $\|\psi\|_2 \le 1$) is $-\log \det(I_r - \psi^* \psi)$ (see Theorem 3-6 in [11]). $\eta$ is a function of the i.i.d. matrices $(\Psi_k)_{k=1,\ldots,d}$. Therefore, the contraction principle (see Theorem 4.2.1 in [10]) ensures that $\eta$ follows an LDP with rate $n$ and GRF $I_\eta$ given, for each real $x$ in the range of $\eta$, by the optimization problem

$$-I_\eta(x) = \max_{0 \le \alpha_k \le 1} \; \max_{\substack{\|\psi_k\|_2 = \alpha_k \\ \eta(\psi_1, \ldots, \psi_d) = x}} \; \sum_{k=1}^{d} \log \det\left(I_r - \psi_k^* \psi_k\right). \qquad (5.2)$$
When $d \ge 3$, the solution of this optimization problem apparently cannot be expressed in closed form. We thus just provide a lower bound of $I_\eta(x)$. When $d = 2$, it is possible to evaluate $I_\eta(x)$ but, due to the lack of space, we do not report the corresponding result in the present paper.

Proposition 4. For each $x \in \mathbb{R}$, it holds that

$$I_\eta(x) \ge -d \log\left(1 - \left(\frac{|x|}{\eta_{\max}}\right)^{2/d}\right), \qquad (5.3)$$

where the right-hand side should be understood as $+\infty$ if $|x| \ge \eta_{\max}$.

In order to establish Proposition 4, we use the following algebraic result, whose proof is omitted.

Lemma 5. For any matrices $(A_k)_{k=1,\ldots,d} \in \mathbb{C}^{r \times r}$ and any vector $\lambda \in \mathbb{R}^r$, the supremum of $\left|\lambda^T \left(\odot_{k=1}^{d} \left(A_k \psi_k A_k^*\right)\right) \lambda\right|$ over $r \times r$ matrices $\psi_k$ such that, for all $k$, $\|\psi_k\|_2 = \alpha_k$, is

$$\left(\prod_{k=1}^{d} \alpha_k\right) \lambda^T \left(\odot_{k=1}^{d} \left(A_k A_k^*\right)\right) \lambda.$$

The immediate consequence of this lemma is that the random variable $\eta$ is bounded, with $|\eta| \le \eta_{\max}$, where $\eta_{\max}$ is given by (5.1). Moreover, take a set of matrices $\psi_k$ such that $\|\psi_k\|_2 = \alpha_k \in [0,1]$; then, by Lemma 5, $|\eta(\psi_1, \ldots, \psi_d)| \le \left(\prod_k \alpha_k\right) \eta_{\max}$, hence the optimization (5.2) is to be carried out only on the set of matrices $\psi_k$ such that $\prod_k \alpha_k \ge \frac{|x|}{\eta_{\max}}$. On the other hand, one may use the generous bound $\log \det\left(I_r - \psi_k^* \psi_k\right) \le \log\left(1 - \|\psi_k\|_2^2\right)$ and finally prove that

$$-I_\eta(x) \le \max_{\prod_k \alpha_k \ge \frac{|x|}{\eta_{\max}}} \sum_{k=1}^{d} \log\left(1 - \alpha_k^2\right).$$

The supremum of the r.h.s. of this equation is achieved for balanced $\alpha_k$, and we immediately obtain (5.3). This completes the proof of Proposition 4.

The $E_1$ term. We are now in a position to conclude that $E_1 = o(1)$. Varadhan's lemma implies that $\frac{1}{n} \log E_1 \to \sup_{x \ge \epsilon} \left[2x - I_\eta(x)\right]$. Using Proposition 4 and setting $u = \left(\frac{|x|}{\eta_{\max}}\right)^{1/d}$, we obtain immediately that, for each $\delta > 0$ and for $n$ large enough,

$$\frac{1}{n} \log E_1 < \sup_{u \ge \tilde{\epsilon}} \; 2 u^d \left[\eta_{\max} + \frac{d}{2} \frac{1}{u^d} \log\left(1 - u^2\right)\right] + \delta,$$

where $\tilde{\epsilon} = (\epsilon/\eta_{\max})^{1/d}$.
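Lemma 5's consequence $|\eta(\psi_1,\ldots,\psi_d)| \le \left(\prod_k \alpha_k\right)\eta_{\max}$ can be probed by random sampling. The parametrization below (matrices $A_k$ with $A_k A_k^* = \chi_0^{(k)*}\chi_0^{(k)}$, playing the role of $V_k \Sigma_k$ in (3.2)) follows the text; the script is an illustrative check, not a proof, and the sizes are our choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, d = 8, 2, 3
lam = np.array([1.5, 0.7])

chis = []
for _ in range(d):
    M = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))
    chis.append(M / np.linalg.norm(M, axis=0))   # unit-norm columns

def hadamard(mats):
    P = np.ones((r, r), dtype=complex)
    for M in mats:
        P = P * M
    return P

grams = [c.conj().T @ c for c in chis]
eta_max = (lam @ hadamard(grams) @ lam).real          # (5.1)

# A_k with A_k A_k^* = chi^(k)* chi^(k)  (the role of V_k Sigma_k in (3.2))
As = []
for G in grams:
    w, V = np.linalg.eigh(G)
    As.append(V @ np.diag(np.sqrt(np.clip(w, 0.0, None))))

worst = 0.0
for _ in range(500):
    psis, prod_alpha = [], 1.0
    for _k in range(d):
        P = rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r))
        alpha = rng.uniform(0.05, 1.0)
        psis.append(alpha * P / np.linalg.norm(P, 2))  # spectral norm = alpha
        prod_alpha *= alpha
    eta = (lam @ hadamard([A @ p @ A.conj().T for A, p in zip(As, psis)]) @ lam).real
    worst = max(worst, abs(eta) / (prod_alpha * eta_max))

# psi_k = I_r (alpha_k = 1) attains the bound exactly: eta = eta_max
eta_id = (lam @ hadamard([A @ A.conj().T for A in As]) @ lam).real
```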
Recalling (4.1) and choosing $\delta$ small enough, we deduce that the condition $\eta_{\max} < \frac{d}{2} \left(\beta_d^{\mathrm{2nd}}\right)^2$ implies that $E_1 \to 0$. This holds for any order $d \ge 2$.

The $E_2$ term. Varadhan's lemma may be invoked, but its conclusion, namely $\frac{1}{n} \log E_2 \to 0$, says nothing about the boundedness of $E_2$. We have, however (substituting $t = \exp(2nu)$),

$$E_2 = \int_0^\infty P\left(\exp(2n\eta) \ge t \text{ and } \eta \le \epsilon\right) \mathrm{d}t = \int_{-\infty}^{0} P\left(\eta \ge u \text{ and } \eta \le \epsilon\right) 2n \exp(2nu)\, \mathrm{d}u + \int_0^\epsilon P\left(\eta \ge u \text{ and } \eta \le \epsilon\right) 2n \exp(2nu)\, \mathrm{d}u \le P(\eta \le \epsilon) + \int_0^\epsilon P(\eta \ge u)\, 2n \exp(2nu)\, \mathrm{d}u.$$

A weak consequence of the LDP on $\eta$ is the concentration of $\eta$ around 0, namely $P(\eta \le \epsilon) = 1 - P(\eta > \epsilon) = 1 - o(1)$. We recall the expanded expression (3.1) for $\eta$. Notice that $\eta \ge u$ implies that at least one of the $r^2$ terms of this expansion is at least equal to $\frac{u}{r^2}$. By the union bound, and the fact that $\Re \prod_{k=1}^{d} \xi_k^{(i,j)} \le \prod_{k=1}^{d} \left|\xi_k^{(i,j)}\right|$, we deduce that

$$P(\eta \ge u) \le \sum_{i,j=1}^{r} P\left(\prod_{k=1}^{d} \left|\xi_k^{(i,j)}\right| \ge \frac{u}{r^2 \lambda_i \lambda_j}\right).$$

Invoking again the union bound and noticing that, for fixed $i, j$, the $\left(\xi_k^{(i,j)}\right)_{k=1,\ldots,d}$ have the same distribution, we deduce that

$$P(\eta \ge u) \le d \sum_{i,j=1}^{r} P\left(\left|\xi_k^{(i,j)}\right| \ge \left(\frac{u}{r^2 \lambda_i \lambda_j}\right)^{1/d}\right).$$

Now, the density of $\xi_k^{(i,j)}$ in polar coordinates is $\frac{n-1}{\pi}\left(1 - \rho^2\right)^{n-2}$; hence, choosing $\epsilon$ such that $\epsilon \le r^2 \max_{i,j} \lambda_i \lambda_j$:

$$P\left(\left|\xi_k^{(i,j)}\right| \ge \left(\frac{u}{r^2 \lambda_i \lambda_j}\right)^{1/d}\right) = \left(1 - \left(\frac{u}{r^2 \lambda_i \lambda_j}\right)^{2/d}\right)^{n-1}.$$

For any $0 \le x < 1$, $\log(1-x) \le -x$, hence

$$E_2 \le P(\eta \le \epsilon) + d \sum_{i,j} 2n \int_0^\epsilon \exp\left(-(n-1)\left(\frac{u}{r^2 \lambda_i \lambda_j}\right)^{2/d} + 2nu\right) \mathrm{d}u.$$

When $d > 2$, it is always possible to determine $\epsilon$ sufficiently small such that

$$-(n-1)\left(\frac{u}{r^2 \lambda_i \lambda_j}\right)^{2/d} + 2nu \le -\frac{n-1}{2}\left(\frac{u}{r^2 \lambda_i \lambda_j}\right)^{2/d}.$$

This implies that, for such an $\epsilon$,

$$E_2 \le P(\eta \le \epsilon) + d^2 r^2 n \left(\frac{2}{n-1}\right)^{d/2} \sum_{i,j} \lambda_i \lambda_j \int_0^\infty v^{d/2-1} \exp(-v)\, \mathrm{d}v.$$

The second term of the right-hand side is of course $o(1)$ since $d > 2$, which, combined with $P(\eta \le \epsilon) = 1 - o(1)$, gives $E_2 = 1 + o(1)$.

Remark 2.
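The tail formula for $|\xi_k^{(i,j)}|$ used above can be verified by simulation (the sample sizes and tolerances below are our choices):

```python
import numpy as np

# xi = first component of a uniform vector on the unit sphere of C^n; the text
# uses P(|xi| >= t) = (1 - t^2)^(n-1), i.e. |xi|^2 ~ Beta(1, n-1), E|xi|^2 = 1/n.
rng = np.random.default_rng(1)
n, trials = 10, 200_000
g = rng.normal(size=(trials, n)) + 1j * rng.normal(size=(trials, n))
xi = g[:, 0] / np.linalg.norm(g, axis=1)   # normalized complex Gaussian vector

t = 0.3
emp_tail = np.mean(np.abs(xi) >= t)        # Monte Carlo tail probability
theo_tail = (1 - t**2) ** (n - 1)          # closed form used in the proof
mean_sq = np.mean(np.abs(xi) ** 2)         # should be close to 1/n
```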
The bound $\sqrt{\eta_{\max}} < \sqrt{\frac{d}{2}} \, \beta_d^{\mathrm{2nd}}$ guarantees non-detectability, but it is not tight in general because, in order to study the asymptotics of $E_1$, we replaced the true GRF $I_\eta$ by the lower bound (5.3). Based on the loose inequality $\log \det\left(I_r - \psi_k^* \psi_k\right) \le \log\left(1 - \|\psi_k\|_2^2\right)$, (5.3) may not be very accurate. It is easy to check that equality is reached in (5.3) when all the matrices $(\chi_0^{(k)})_{k=1,\ldots,d}$ are rank 1, i.e. if the rank of $\mathcal{X}_0$ is equal to 1. Therefore, the lower bound (5.3) of $I_\eta$ is all the better as the matrices $\chi_0^{(k)}$ are close to being rank-1 matrices. This suggests that, conversely, the bound (5.3) is likely to be loose when the matrices $(\chi_0^{(k)})_{k=1,\ldots,d}$ are close to being orthogonal.

As an illustration, we consider experimental results. For a given configuration of the spike, we have chosen at random the matrices $\psi_k$ with $\|\psi_k\|_2 \le 1$. For each trial, we plot the point of coordinates $x = \eta(\psi_1, \ldots, \psi_d)$ and $y = \sum_{k=1}^{d} \log \det\left(I_r - \psi_k^* \psi_k\right)$, and we obtain a cloud whose upper envelope is a representation of $-I_\eta$; for comparison, we have plotted the graph of the function defined by the lower bound (5.3). We have chosen $r = 2$, $d = 3$, and two configurations of the spike: in the first one, all the matrices $\chi_0^{(k)}$ have orthogonal columns (top graph of Figure 5.1); in the second one, the eigenvalues of $\chi_0^{(k)*} \chi_0^{(k)}$ are the same for each $k$, equal to 1.8 and 0.2 (bottom graph of Figure 5.1).

[Figure 5.1: two scatter plots of $y = \sum_k \log \det(I_r - \psi_k^* \psi_k)$ versus $x = \eta(\psi_1, \ldots, \psi_d)$, together with the bound (5.3). Caption: $-I_\eta$ and our upper bound.]

Remark 3. In the specific case $d = 2$, it is possible to compute in closed form the exact GRF $I_\eta$ of $\eta$, and to establish the following result: if $\mu_{\max}(X_0 X_0^*) < \beta_2^{\mathrm{2nd}} = 1$ (here, $\mu_{\max}$ denotes the largest eigenvalue), then $E_1$ converges towards 0.
The approach we used in this paper to upper-bound $E_2$ for $d > 2$ is unsuccessful for $d = 2$. However, it is possible to adapt the technique used in [6]: if $\mu_{\max}(X_0 X_0^*) < 1$, then $E_2$ is bounded. From both results, it may be concluded that, under the condition $\mu_{\max}(X_0 X_0^*) < 1$, no consistent detection test can be found.

6 Conclusion

In this paper, we have addressed the detection problem of a rank-$r$ high-dimensional tensor $\mathcal{X}_0$. We have generalized the results of [6] to the case where $r > 1$, and established that if the parameter $\eta_{\max}$ defined by (5.1) is less than $\frac{d}{2}\left(\beta_d^{\mathrm{2nd}}\right)^2$, where $\beta_d^{\mathrm{2nd}}$ is the parameter introduced in [6], the low-rank tensor is undetectable. This condition is based on the lower bound (5.3) of the GRF $I_\eta$, which is however not tight in general. It is thus relevant to try to improve this bound in future work.

References

[1] R. Nadakuditi and A. Edelman, "Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples," IEEE Transactions on Signal Processing, vol. 56, no. 7, pp. 2625–2637, 2008.

[2] P. Bianchi, M. Debbah, M. Maïda, and M. Najim, "Performance of statistical tests for single source detection using random matrix theory," IEEE Transactions on Information Theory, vol. 57, no. 4, pp. 2400–2419, 2011.

[3] F. Benaych-Georges and R. R. Nadakuditi, "The singular values and vectors of low rank perturbations of large rectangular random matrices," Journal of Multivariate Analysis, vol. 111, pp. 120–135, 2012.

[4] S. Hopkins, J. Shi, and D. Steurer, "Tensor principal component analysis via sum-of-squares proofs," in JMLR: Workshop and Conference Proc., vol. 40, pp. 1–51, 2015.

[5] A. Montanari and E. Richard, "A statistical model for tensor PCA," in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS'14, (Cambridge, MA, USA), pp. 2897–2905, MIT Press, 2014.

[6] A.
Montanari, D. Reichman, and O. Zeitouni, "On the limitation of spectral methods: from the Gaussian hidden clique problem to rank one perturbations of Gaussian tensors," IEEE Trans. Inf. Theor., vol. 63, pp. 1572–1579, Mar. 2017.

[7] A. Perry, A. Wein, and A. Bandeira, "Statistical limits of spiked tensor models," arXiv:1612.07728v2 [math.PR], Dec. 2016.

[8] A. Onatski, M. Moreira, and M. Hallin, "Asymptotic power of sphericity tests for high-dimensional data," Ann. Statistics, vol. 41, no. 3, pp. 1204–1231, 2013.

[9] J. Banks, C. Moore, R. Vershynin, N. Verzelen, and J. Xu, "Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization," in 2017 IEEE International Symposium on Information Theory (ISIT), pp. 1137–1141, June 2017.

[10] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications. Springer-Verlag Berlin Heidelberg, 2009.

[11] F. Gamboa and A. Rouault, "Operator-valued spectral measures and large deviations," J. of Stat. Planning and Inference, vol. 154, no. 3, pp. 72–86, 2014.
