An Improvement on Ranks of Explicit Tensors

We give constructions of n^k × n^k × n tensors of rank at least 2n^k − O(n^(k−1)). As a corollary we obtain an [n]^r shaped tensor with rank at least 2n^⌊r/2⌋ − O(n^(⌊r/2⌋−1)) when r is odd. The tensors are constructed from a simple recursive pattern, and the lower bounds are proven using a partitioning theorem developed by Brockett and Dobkin.

Authors: Benjamin Weitz

An Improvement on Rank of Explicit Tensors

Benjamin Weitz

October 30, 2018

Abstract

We give constructions of $n^k \times n^k \times n$ tensors of rank at least $2n^k - O(n^{k-1})$. As a corollary we obtain an $[n]^r$ shaped tensor with rank at least $2n^{\lfloor r/2 \rfloor} - O(n^{\lfloor r/2 \rfloor - 1})$ when $r$ is odd. The tensors are constructed from a simple recursive pattern, and the lower bounds are proven using a partitioning theorem developed by Brockett and Dobkin. These two bounds are improvements over the previous best-known explicit tensors, which had ranks $n^k$ and $n^{\lfloor r/2 \rfloor}$ respectively.

1 Introduction

An important and well-studied property of linear operators, equivalently matrices, is their rank. Much is understood about the rank of matrices over fields, and an efficient algorithm exists for calculating the rank of an explicit matrix. However, a closely related problem, calculating the rank of a tensor (a generalized version of a matrix), has been shown to be NP-complete [4], and so is unlikely to admit an efficient algorithm. Due to the intractability of the problem, very few results have been shown on this subject.

1.1 Importance of Tensor Rank

The rank of a tensor is relevant and important in several different settings. Fast matrix multiplication, a problem of incredible importance, can be improved by improving the upper bound on the rank of a related tensor. A recent paper by Ran Raz proved two theorems relating lower bounds on the rank of tensors to lower bounds on the size of arithmetic formulas:

• Theorem: Let $A : [n]^r \to \mathbb{F}$ be a tensor such that $r \leq O(\log n / \log\log n)$. If there exists a formula of size $n^c$ for the polynomial
$$f_A(x_{1,1}, \ldots, x_{r,n}) = \sum_{i_1, \ldots, i_r \in [n]} A(i_1, \ldots, i_r) \cdot \prod_{j=1}^{r} x_{j,i_j}$$
then the tensor rank of $A$ is at most $n^r \cdot (1 - 2^{-O(c)})$ [9].
• Corollary: Let $A : [n]^r \to \mathbb{F}$ be a tensor such that $r \leq O(\log n / \log\log n)$. If the tensor rank of $A$ is $\geq n^r \cdot (1 - o(1))$, then there is no polynomial-size formula for the polynomial $f_A$ [9].

These two theorems give a strong motivation for finding explicit $[n]^r$ tensors of high rank. In this paper we give an explicit hypercube tensor with rank approaching $2n^{\lfloor r/2 \rfloor}$, an improvement over the previous best-known example by a constant factor of 2.

1.2 Methodology

For each integer $k$, we will give an $n^k \times n^k \times n$ tensor with rank at least $2n^k - O(n^{k-1})$, an improvement over the previous best-known $n^k$. To do so, we will use a partitioning theorem developed by Brockett and Dobkin in [3]. This theorem allows us to lower bound the rank of a tensor that is formed by concatenating, or gluing together, other tensors, provided they are sufficiently different. We will construct a tensor recursively by continually gluing together three copies of a smaller tensor. The partitioning theorem will allow us to lower bound the rank of the tensor at each step, and thus the final tensor as well. As a corollary we will construct an $\underbrace{n \times \cdots \times n}_{r \text{ times}}$ tensor of rank $2n^{\lfloor r/2 \rfloor} - O(n^{\lfloor r/2 \rfloor - 1})$ when $r$ is odd by viewing the first construction under an isomorphism. This is an improvement over the previous best-known $n^{\lfloor r/2 \rfloor}$ by a constant factor.

1.3 Definitions

Throughout this paper, $\mathbb{F}$ will denote a field. Let $A \in \mathbb{F}^{n_1 \times n_2 \times n_3}$. If $A = x_1 \otimes x_2 \otimes x_3$ for $x_i \in \mathbb{F}^{n_i}$ and $A$ is nonzero, then $A$ is called a simple or rank-1 tensor. The rank of a general tensor $A$ is defined as the minimal number $r$ such that we can write
$$A = \sum_{i=1}^{r} B_i \tag{1}$$
where each $B_i$ is a simple tensor. This is a natural extension of matrix rank, because if $n_3 = 1$ then the rank of $A$ agrees with the matrix rank. Throughout this paper, $R[A]$ will denote the rank of $A$.
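As a concrete illustration of definition (1), the following numpy sketch (not from the paper; the library and the specific vectors are illustrative choices) builds a simple tensor as an outer product and a sum of simple tensors. When $n_3 = 1$ the tensor is just a matrix, so ordinary matrix rank certifies an upper bound on its tensor rank.

```python
import numpy as np

# A simple (rank-1) tensor is an outer product x1 (x) x2 (x) x3.
x1, x2, x3 = np.array([1.0, 2.0]), np.array([3.0, 4.0, 5.0]), np.array([6.0, 7.0])
simple = np.einsum('i,j,k->ijk', x1, x2, x3)   # shape (2, 3, 2)

# A tensor of rank at most r is a sum of r simple tensors.
rng = np.random.default_rng(0)
parts = [np.einsum('i,j,k->ijk',
                   rng.standard_normal(4),
                   rng.standard_normal(4),
                   rng.standard_normal(1))
         for _ in range(3)]
A = sum(parts)                                  # shape (4, 4, 1): effectively a matrix

# Every slice of a simple tensor is a rank-1 matrix.
print(np.linalg.matrix_rank(simple[:, :, 0]))   # 1
# With n3 = 1, tensor rank equals matrix rank, so the slice has rank <= 3.
print(np.linalg.matrix_rank(A[:, :, 0]) <= 3)   # True
```

Computing the exact tensor rank of a general three-dimensional array is NP-complete [4], which is why the code above only certifies the easy (upper-bound and matrix) directions.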
1.3.1 Slices, Concatenations, and the Characteristic Matrix

Let $A \in \mathbb{F}^{n_1 \times n_2 \times n_3}$, fix a positive integer $1 \leq k \leq n_3$, and let $B \in \mathbb{F}^{n_1 \times n_2 \times 1}$ satisfy
$$B_{ij} = A_{ijk}.$$
Then $B$ is called the $k$th slice of $A$. We will denote the $k$th slice of a tensor $A$ as $A_k$. The concatenation of tensors $A \in \mathbb{F}^{n_1 \times n_2 \times m}$ and $B \in \mathbb{F}^{n_1 \times n_2 \times m'}$, denoted $AB \in \mathbb{F}^{n_1 \times n_2 \times (m + m')}$, is the tensor such that
$$AB_{ijk} = \begin{cases} A_{ijk} & \text{if } 1 \leq k \leq m \\ B_{ij(k-m)} & \text{if } m < k \leq m + m' \end{cases}$$
The $m + m'$ slices of the concatenation are the $m$ slices of $A$ followed by the $m'$ slices of $B$. Also, $AB$ and $BA$ differ only by a permutation of the indices in the third dimension, so $R[AB] = R[BA]$.

The characteristic matrix of $A \in \mathbb{F}^{n_1 \times n_2 \times n_3}$ is a matrix $A(s)$ with indeterminates, i.e. $A(s) \in E^{n_1 \times n_2}$, where $E = \mathbb{F} \cup S$ and $S = \{s_i\}_{i=1}^{n_3}$ is a set of indeterminates, with
$$A(s) = \sum_{i=1}^{n_3} s_i A_i$$
so each indeterminate represents the values on a different slice. Define $\dim s = |S| = n_3$. Define the column (resp. row) rank to be the maximal number of linearly independent columns (resp. rows), as in [3]. Note that the row and column rank are not necessarily equal; for example, $A(s) = [s_1 \; s_2]$ has column rank 2 and row rank 1. We also sometimes write $R[A(s)]$ for $R[A]$. To avoid trivialities, we usually work with nondegenerate tensors; a tensor $A$ is nondegenerate if no nontrivial linear combination of its slices vanishes and its characteristic matrix $A(s)$ has full row and column rank.

An analogue of concatenation can be defined for characteristic matrices as well. Let $A(s)$ and $B(t)$ be two characteristic matrices of the same dimensions, and let $\dim s = n$ and $\dim t = m$. We define $C(u) = A(s) + B(t)$ as
$$C(u) = \sum_{i=1}^{n} A_i s_i + \sum_{j=1}^{m} B_j t_j$$
where $u = s \cup t$. Note that $C(u)$ is the characteristic matrix of $AB$, so this addition can be considered a concatenation.
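The slice and concatenation operations are straightforward to mirror in numpy (an illustrative sketch, not part of the paper): the $k$th slice is the matrix `A[:, :, k]`, and concatenation stacks slices along the third axis, so the slices of $AB$ are the slices of $A$ followed by those of $B$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2, 3))   # an n1 x n2 x m  tensor (m  = 3 slices)
B = rng.standard_normal((2, 2, 2))   # an n1 x n2 x m' tensor (m' = 2 slices)

# The k-th slice A_k is the n1 x n2 matrix A[:, :, k].
A0 = A[:, :, 0]

# The concatenation AB stacks the m slices of A followed by the m' slices of B.
AB = np.concatenate([A, B], axis=2)  # shape (2, 2, 5)

assert AB.shape == (2, 2, 5)
assert np.array_equal(AB[:, :, 0], A0)          # first m slices come from A
assert np.array_equal(AB[:, :, 3], B[:, :, 0])  # remaining slices come from B
```

Since $BA$ is the same stack in the other order, it differs from $AB$ only by a permutation of third-axis indices, which is why $R[AB] = R[BA]$.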
1.4 The Partitioning Theorem

The main tool we use in our construction is the partitioning theorem developed by Brockett and Dobkin in [3], and we restate it here for easy referral:

Theorem 1. Let $G(s)$ be a nondegenerate characteristic matrix, and let one of the following cases hold:

(i) $G(s) = \begin{bmatrix} G_1(s) \\ G_2(s) \end{bmatrix}$

(ii) $G(s) = \begin{bmatrix} G_1(s) & G_2(s) \end{bmatrix}$

(iii) $G(s) = G_1(u) + G_2(v)$

Then for each case we have

(i) $R[G(s)] \geq \min_N R[G_1(s) + N G_2(s)] + \text{row rank } G_2(s)$

(ii) $R[G(s)] \geq \min_M R[G_1(s) + G_2(s) M] + \text{column rank } G_2(s)$

(iii) $R[G(s)] \geq \min_T R[G_1(u) + G_2(Tu)] + \dim v$

for matrices $M$, $N$, and $T$ sized so that the two summands are the same shape and the addition is well-defined, where juxtaposition means ordinary matrix multiplication.

This theorem essentially states that if two halves of a tensor "don't overlap too much", then each slice of the second half must add at least one to the rank. A special case of "don't overlap too much" is given in the following theorem:

Theorem 2. Let $G_1(s)$, $G_2(s)$, and $G_3(s)$ all be nondegenerate characteristic matrices. Then we have

(i) $R\begin{bmatrix} G_1(s) & 0 \\ G_2(s) & G_3(s) \end{bmatrix} \geq \max\{R[G_1(s)] + \text{column rank } G_3(s),\; R[G_3(s)] + \text{row rank } G_1(s)\}$

(ii) $R\begin{bmatrix} G_1(s) + G_2(t) & G_3(t) \end{bmatrix} \geq \max\{R[G_1(s)] + \text{column rank } G_3(t),\; R[G_3(t)] + \dim s\}$

(iii) $R\begin{bmatrix} G_1(s) + G_2(t) \\ G_3(t) \end{bmatrix} \geq \max\{R[G_1(s)] + \text{row rank } G_3(t),\; R[G_3(t)] + \dim s\}$

2 The Main Result

In this section we give a construction that yields $n^k \times n^k \times n$ tensors of rank approaching $2n^k$. These numbers are, to the best of our knowledge, the largest known ranks of any explicit tensors of these shapes.
As a corollary, for $r$ odd, these constructions allow us to give an $[n]^r$-shaped tensor of rank approaching $2n^{\lfloor r/2 \rfloor}$, another improvement to the best of our knowledge. The first step is to prove a lower bound for a block tensor:

Theorem 3. Let $A \in \mathbb{F}^{m \times n \times p}$, $B \in \mathbb{F}^{m \times n' \times p'}$, and $C \in \mathbb{F}^{m' \times n \times p'}$ be nondegenerate, let $E \in \mathbb{F}^{m \times n \times p'}$, let $0$ denote the tensor of zeroes of appropriate dimensions to be concatenated, and let
$$M = \begin{bmatrix} AE & 0B \\ 0C & 00 \end{bmatrix}$$
(each block entry being a concatenation, so that $M \in \mathbb{F}^{(m+m') \times (n+n') \times (p+p')}$). Then
$$R[M] \geq R[A] + \text{column rank } B(t) + \text{row rank } C(t).$$

Proof. First, transforming into characteristic matrices,
$$M(u) = \begin{bmatrix} A(s) + E(t) & B(t) \\ C(t) & 0 \end{bmatrix}$$
with $u = s \cup t$. We partition
$$M(u) = \begin{bmatrix} G_1(u) \\ G_2(u) \end{bmatrix}, \quad G_1(u) = \begin{bmatrix} A(s) + E(t) & B(t) \end{bmatrix}, \quad G_2(u) = \begin{bmatrix} C(t) & 0 \end{bmatrix}.$$
By Theorem 1,
$$R[M(u)] \geq \min_N R[G_1(u) + N G_2(u)] + \text{row rank } G_2(u).$$
By Theorem 2,
$$R[G_1(u) + N G_2(u)] = R\begin{bmatrix} A(s) + (E + NC)(t) & B(t) \end{bmatrix} \geq \max\{R[A(s)] + \text{column rank } B(t),\; R[B(t)] + \dim s\}.$$
Since row rank $G_2(u)$ = row rank $C(t)$, we have
$$R[M(u)] \geq \text{row rank } C(t) + R[A(s)] + \text{column rank } B(t). \qquad \square$$

This theorem is the key to our construction. We recursively build a tensor as follows: pick a positive integer $k$, let $A^{(0)} = I_{n^{k-1}}$, and define
$$A^{(i+1)} = \begin{bmatrix} A^{(i)}0 & 0A^{(i)} \\ 0A^{(i)} & 00 \end{bmatrix}.$$
The main result is

Theorem 4. Let $l = \log_2 n$. Then the tensor $A^{(l)}$ above has dimensions $n^k \times n^k \times n$ and satisfies $R[A^{(l)}] \geq 2n^k - O(n^{k-1})$.

Proof. For any $i$, it is clear that $A^{(i)}$ is a $2^i n^{k-1} \times 2^i n^{k-1} \times 2^i$ tensor. Furthermore, an easy induction shows that $A^{(i)}(s)$ is nondegenerate, by noting that $A^{(i)}$ always has at least one slice with full row and column rank, and that a nontrivial linear combination of slices of $A^{(i)}$ that vanishes would yield such a combination of slices of $A^{(i-1)}$ as well.
Thus row rank $A^{(i)}(s)$ = column rank $A^{(i)}(s)$ = $2^i n^{k-1}$ and $A^{(i)}$ is nondegenerate. By Theorem 3,
$$R[A^{(i+1)}] \geq R[A^{(i)}] + \text{row rank } A^{(i)}(s) + \text{column rank } A^{(i)}(s) = R[A^{(i)}] + 2^{i+1} n^{k-1}.$$
Then a straightforward induction shows
$$R[A^{(i)}] \geq R[A^{(0)}] + \sum_{j=0}^{i-1} 2^{j+1} n^{k-1} = n^{k-1} + 2(2^i - 1) n^{k-1}.$$
Setting $l = \log_2 n$, we have $R[A^{(l)}] \geq 2n^k - n^{k-1}$ and $A^{(l)} \in \mathbb{F}^{n^k \times n^k \times n}$. $\square$

This construction allows us to improve on the previous best-known explicit hypercube tensor by taking the preimage of these tensors under the canonical isomorphism.

Corollary 5. Let $r$ be odd, $k = \lfloor r/2 \rfloor$, $A^{(l)}$ as above, and let $\varphi$ be the canonical isomorphism
$$\varphi : \mathbb{F}^{\overbrace{n \times \cdots \times n}^{r \text{ times}}} \to \mathbb{F}^{n^k \times n^k \times n}, \qquad x_1 \otimes \cdots \otimes x_r \mapsto (x_1 \otimes \cdots \otimes x_k) \otimes (x_{k+1} \otimes \cdots \otimes x_{2k}) \otimes x_r.$$
Then $\varphi^{-1}(A^{(l)})$ is an $\underbrace{n \times \cdots \times n}_{r \text{ times}}$ tensor with rank at least $2n^{\lfloor r/2 \rfloor} - O(n^{\lfloor r/2 \rfloor - 1})$.

Proof. We show that for any tensor $B \in \mathbb{F}^{\overbrace{n \times \cdots \times n}^{r \text{ times}}}$, $R[\varphi(B)] \leq R[B]$. If
$$B = \sum_{i=1}^{R[B]} D_i$$
for simple tensors $D_i$, then
$$\varphi(B) = \sum_{i=1}^{R[B]} \varphi(D_i),$$
and as each $D_i$ is simple, so is $\varphi(D_i)$. By minimality this gives $R[\varphi(B)] \leq R[B]$, and so in particular $R[A^{(l)}] \leq R[\varphi^{-1}(A^{(l)})]$. $\square$

To our knowledge, these are the best-known ranks for explicit $[n]^r$ and $n^k \times n^k \times n$ tensors for any $k$, including the important cube $n \times n \times n$ tensors.

3 Conclusion

In this paper we have presented an improvement, to about $2n^k$, over the previous highest-rank explicit tensors for the $n^k \times n^k \times n$ shape. This extends to an improvement for the shape $[n]^r$ when $r$ is odd. These tensors were constructed by using Brockett and Dobkin's partitioning theorem in a recursive manner.
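As a sanity check on the dimension bookkeeping in the recursive construction of Section 2, the gluing step can be sketched in numpy (an illustrative sketch only, assuming $n$ is a power of two; it reproduces the block layout of $A^{(i+1)}$ and the final $n^k \times n^k \times n$ shape, but does not compute the rank bound itself):

```python
import numpy as np

def glue(A):
    """One recursion step: A^(i+1) = [[A 0, 0 A], [0 A, 0 0]], where
    juxtaposition inside a block means concatenation along the third axis."""
    Z = np.zeros_like(A)
    top = np.concatenate([np.concatenate([A, Z], axis=2),   # block A 0
                          np.concatenate([Z, A], axis=2)],  # block 0 A
                         axis=1)
    bot = np.concatenate([np.concatenate([Z, A], axis=2),   # block 0 A
                          np.concatenate([Z, Z], axis=2)],  # block 0 0
                         axis=1)
    return np.concatenate([top, bot], axis=0)

n, k = 4, 2                               # n a power of two for this sketch
# A^(0) = I_{n^(k-1)} viewed as an n^(k-1) x n^(k-1) x 1 tensor.
A = np.eye(n ** (k - 1))[:, :, None]
for _ in range(int(np.log2(n))):          # l = log2(n) gluing steps
    A = glue(A)                           # doubles every dimension

print(A.shape)                            # (16, 16, 4), i.e. (n^k, n^k, n)
```

Each call doubles all three dimensions, matching the claim that $A^{(i)}$ is a $2^i n^{k-1} \times 2^i n^{k-1} \times 2^i$ tensor, so after $l = \log_2 n$ steps the shape is $n^k \times n^k \times n$.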
However, using this theorem imposes a restriction on the quality of the lower bounds. In order to improve further, we need to either improve the partitioning theorem or develop a different method.

3.1 Open Problems

• The most important open problem is the one presented as the motivation for this paper. The improvements in this paper do not come anywhere close to the $n^r(1 - o(1))$ threshold for hypercube tensors. Any explicit tensor with this rank would imply super-polynomial lower bounds on certain functions, as per Ran Raz's recent theorem [9]. Any attempt to develop examples of high-rank tensors should keep this goal in mind.

• An improvement to Brockett and Dobkin's partitioning theorem would be extremely useful. The same techniques presented here would become more powerful, and perhaps improve the bound by an increase in the exponent rather than a constant factor.

3.2 Additional Notes

This paper is the result of research done at Caltech from June 2010 to August 2010 as part of the SURF program. I worked under Chris Umans, Professor of Computer Science, and I'd like to thank him for all his help and advice while working on this project. Additionally, in between the writing and the submission of this article, an independent article was published by Boris Alexeev, Michael Forbes, and Jacob Tsimerman [1] that gives, among other things, an explicit $n^k \times n^k \times n$ $\{0,1\}$-tensor with rank at least $2n^k + n - \Theta(k \log n)$. The techniques in that paper are similar to those described here, so the two bounds are very close, but the one given by Alexeev, Forbes, and Tsimerman has better lower-order terms. Interested parties can read the paper at http://arxiv.org/abs/1102.0072.

References

[1] Boris Alexeev, Michael Forbes, and Jacob Tsimerman, Tensor Rank: Some Lower and Upper Bounds, submitted to arxiv.org on February 1st, 2011.

[2] M.D. Atkinson and N.M. Stephens, On the Maximal Multiplicative Complexity of a Family of Bilinear Forms, Linear Algebra and Its Applications, Volume 27, October 1979.

[3] Roger W. Brockett and David Dobkin, On the Optimal Evaluation of a Set of Bilinear Forms, Linear Algebra and Its Applications, Volume 19, Issue 3, 1978.

[4] Johan Håstad, Tensor Rank is NP-complete, Journal of Algorithms, December 1990.

[5] Thomas D. Howell, Global Properties of Tensor Rank, Linear Algebra and Its Applications, Volume 22, December 1978.

[6] Joseph Ja'Ja', Optimal Evaluation of Pairs of Bilinear Forms, SIAM Journal on Computing, Volume 8, Issue 3, 1979.

[7] Joseph Ja'Ja' and Jean Takche, On the Validity of the Direct Sum Conjecture, SIAM Journal on Computing, Volume 15, Issue 4, 1986.

[8] Joseph Kruskal, Three-Way Arrays: Rank and Uniqueness of Trilinear Decompositions, with Application to Arithmetic Complexity and Statistics, Linear Algebra and Its Applications, Volume 18, Issue 2, 1977.

[9] Ran Raz, Tensor Rank and Lower Bounds for Arithmetic Formulas, published on the ECCC January 4th, 2010, available at http://www.wisdom.weizmann.ac.il/~ranraz/publications/

[10] Jos M.F. Ten Berge, Kruskal's Polynomial for 2 × 2 × 2 Arrays and a Generalization to n × n × 2 Arrays, Psychometrika, Volume 56, December 1991.
