Information and Dimensionality of Anisotropic Random Geometric Graphs

Ronen Eldan (Weizmann Institute of Science) and Dan Mikulincer (Weizmann Institute of Science)

Abstract. This paper deals with the problem of detecting non-isotropic high-dimensional geometric structure in random graphs. Namely, we study a model of a random geometric graph in which vertices correspond to points generated randomly and independently from a non-isotropic $d$-dimensional Gaussian distribution, and two vertices are connected if the distance between them is smaller than some pre-specified threshold. We derive new notions of dimensionality which depend on the eigenvalues of the covariance of the Gaussian distribution. If $\alpha$ denotes the vector of eigenvalues and $n$ is the number of vertices, then the quantities $\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6/n^3$ and $\left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^4/n^3$ determine upper and lower bounds for the possibility of detection. This generalizes a recent result of Bubeck, Ding, Rácz and the first named author [2], which shows that the quantity $d/n^3$ determines the boundary of detection for isotropic geometry. Our methods involve Fourier analysis and the theory of characteristic functions to investigate the underlying probabilities of the model. The proof of the lower bound uses information-theoretic tools, based on the method presented in [3].

Keywords: geometric graphs, random graphs, Erdős–Rényi model, dimensionality of embeddings

1 Introduction

This study continues a line of work initiated by Bubeck, Ding, Rácz and the first named author [2], in which the problem of detecting geometric structure in large graphs was studied. In other words, given a large graph, one is interested in determining whether or not it was generated using a latent geometric structure. The main contribution of this study is a generalization of those results to the anisotropic case.
Extracting information from large graphs is an extensively studied statistical task. In many cases a given network, or graph, reflects some underlying structure; for example, a biological neuronal network is likely to reflect certain characteristics of its functionality, such as physical location and cell structure. The objective of this paper is thus the detection of such an underlying geometric structure.

As a motivating example, consider the graph representing a large social network. It may be assumed that each node (or user) is described by a set of numerical parameters representing its properties (such as geographical location, age, political association, interests, etc.). It is plausible to assume that two nodes are more likely to be connected when their two respective points in parameter space are more correlated. Adopting this assumption, the nodes of such a graph may be thought of as points in a Euclidean space, with links appearing between two nodes when their distance is small enough. A natural question in this context would be: what can be said about the geometric structure by inspection of the graph itself? Specifically, can one distinguish between such a graph and a graph with no underlying geometric structure?

In statistical terms, given a graph $G$ on $n$ vertices, our null hypothesis is that $G$ is an instance of the standard Erdős–Rényi random graph $G(n,p)$ [8], where the presence of each edge is determined independently with probability $p$:
\[ H_0 : G \sim G(n,p). \]
For the alternative, we consider the so-called random geometric graph. In this model each vertex is a point in some metric space, and an edge is present between two points if the distance between them is smaller than some predefined threshold.
Perhaps the most well-studied setting of this model is the isotropic Euclidean model, where the vertices are generated uniformly on the $d$-dimensional sphere or simply from the standard $d$-dimensional normal distribution. However, this model seems too simplistic to reflect real-world social networks. One particular problem, which we intend to tackle in this study, is the isotropicity assumption, which amounts to the fact that all of the properties associated with a node have the same significance in determining the network structure. It is clear that some parameters, such as geographic location, can be more significant than others.

We therefore propose to extend this model to a non-isotropic setting. Roughly speaking, we replace the sphere with an ellipsoid: instead of generating vertices from $N(0,\mathrm{I}_d)$, they will be generated from $N(0,D_\alpha)$ for some diagonal matrix $D_\alpha$ with non-negative entries. We denote the model by $G(n,p,\alpha)$, where $p$ is the probability of an edge appearing and the diagonal of $D_\alpha$ is given by a vector $\alpha \in \mathbb{R}^d$. Formally, let $X_1,\ldots,X_n$ be i.i.d. points generated from $N(0,D_\alpha)$. In $G(n,p,\alpha)$, vertices correspond to $X_1,\ldots,X_n$, and two distinct vertices are joined by an edge if and only if $\langle X_i, X_j\rangle \ge t_{p,\alpha}$, where $t_{p,\alpha}$ is the unique number satisfying $P(\langle X_1,X_2\rangle \ge t_{p,\alpha}) = p$. Our alternative hypothesis is thus
\[ H_1 : G \sim G(n,p,\alpha). \]
In this paper we focus on the high-dimensional regime of the problem. Namely, we assume that the dimension and the covariance matrix can depend on $n$. This point of view becomes highly relevant in light of recent developments in data science, where big data and high-dimensional feature spaces are becoming more prevalent. We will focus on the dense regime, where $p$ is a constant independent of $n$ and $\alpha$.
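As an illustration (ours, not part of the paper's argument), the model $G(n,p,\alpha)$ can be simulated directly. The threshold $t_{p,\alpha}$ has no simple closed form in general, so this sketch calibrates it as an empirical quantile of $\langle X_1, X_2\rangle$ over independent pairs; the function name is our own.

```python
import numpy as np

def sample_g_npalpha(n, p, alpha, rng, n_cal=100_000):
    """Sample one graph from G(n, p, alpha) (illustrative sketch).

    Vertices are i.i.d. N(0, D_alpha); {i, j} is an edge iff
    <X_i, X_j> >= t_{p,alpha}.  The threshold is calibrated here as an
    empirical (1-p)-quantile of <X_1, X_2> over independent pairs.
    """
    std = np.sqrt(np.asarray(alpha, dtype=float))  # coordinate std devs
    d = std.size
    # Calibrate t_{p,alpha} from independent sample pairs.
    U = rng.standard_normal((n_cal, d)) * std
    V = rng.standard_normal((n_cal, d)) * std
    t = np.quantile((U * V).sum(axis=1), 1.0 - p)
    # Sample the vertices and threshold the Gram matrix.
    X = rng.standard_normal((n, d)) * std
    A = (X @ X.T >= t).astype(int)
    np.fill_diagonal(A, 0)
    return A

rng = np.random.default_rng(0)
n, p = 300, 0.5
A = sample_g_npalpha(n, p, np.ones(30), rng)  # isotropic alpha: G(n, p, d)
density = A.sum() / (n * (n - 1))
```

With the isotropic choice $\alpha = (1,\ldots,1)$ this reduces to the model $G(n,p,d)$ of [2], and the empirical edge density comes out close to $p$.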
1.1 Previous work

This paper can be seen as a direct follow-up to [2], which, as noted above, deals with the isotropic model $G(n,p,d)$ in which $D_\alpha = \mathrm{I}_d$. In the dense regime, it was shown that the total variation between the models depends asymptotically on the ratio $\frac{d}{n^3}$: if $d \gg n^3$, then $G(n,p,d)$ converges in total variation to $G(n,p)$, while if $d \ll n^3$ the total variation converges to 1. Our starting point is thus the result of [2], stated as follows:

Theorem 1. (a) Let $p \in (0,1)$ be fixed and assume that $d/n^3 \to 0$. Then $\mathrm{TV}(G(n,p), G(n,p,d)) \to 1$.
(b) Furthermore, if $d/n^3 \to \infty$, then $\mathrm{TV}(G(n,p), G(n,p,d)) \to 0$.

One of the fundamental differences between $G(n,p)$ and $G(n,p,d)$ is a consequence of the triangle inequality. That is, if two points $u$ and $v$ are both close to a point $w$, then $u$ and $v$ cannot be too far apart. This roughly means that if both $u$ and $v$ are connected to $w$, then there is an increased probability of $u$ being connected to $v$, unlike the case of the Erdős–Rényi graph, where there is no dependence between the edges. Thus, counting the number of triangles in a graph seems to be a natural test to uncover geometric structure. The idea of using triangles was extended in [2], where a variant was proposed: the signed triangle. This statistic was successfully used to completely characterize the asymptotics of $\mathrm{TV}(G(n,p), G(n,p,d))$ in the isotropic case. To understand the idea behind signed triangles, we first note that if $A$ is the adjacency matrix of $G$, then the number of triangles in $G$ is, up to a constant factor, $\mathrm{Tr}(A^3)$. The "number" of signed triangles is represented by $\mathrm{Tr}((A - p\mathbf{1})^3)$, where $\mathbf{1}$ is the matrix whose entries are all equal to 1. It turns out that the variance of signed triangles is significantly smaller than the corresponding quantity for regular triangles.
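Both counts can be evaluated as trace formulas, and the variance gap can be seen empirically. The following sketch is ours (including the helper `er_graph`); note that we zero the diagonal of $A - p$, a variant that makes the trace count the sum over distinct triples exactly.

```python
import numpy as np
from itertools import combinations

def triangles(A):
    """Number of triangles: Tr(A^3) counts each one 6 times."""
    return np.trace(A @ A @ A) / 6.0

def signed_triangles(A, p):
    """Sum over triples {i,j,k} of (A_ij - p)(A_ik - p)(A_jk - p).
    With the diagonal of B = A - p zeroed, Tr(B^3) = 6 * tau exactly,
    since closed walks i -> j -> k -> i with a repeated index vanish."""
    B = A - p
    np.fill_diagonal(B, 0.0)
    return np.trace(B @ B @ B) / 6.0

def er_graph(n, p, rng):
    U = np.triu((rng.random((n, n)) < p).astype(float), 1)
    return U + U.T

rng = np.random.default_rng(7)
n, p = 25, 0.5
A = er_graph(n, p, rng)

# Brute-force checks of both trace formulas on one sample.
brute_T = sum(A[i, j] * A[i, k] * A[j, k]
              for i, j, k in combinations(range(n), 3))
brute_tau = sum((A[i, j] - p) * (A[i, k] - p) * (A[j, k] - p)
                for i, j, k in combinations(range(n), 3))

# Empirically, the signed count has much smaller variance than the raw count.
samples = [er_graph(n, p, rng) for _ in range(300)]
var_T = np.var([triangles(M) for M in samples])
var_tau = np.var([signed_triangles(M, p) for M in samples])
```

On Erdős–Rényi samples this reproduces the variance orders quoted below ($n^4$ for triangles versus $n^3$ for signed triangles).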
The methods used in [2] relied heavily on the symmetries of the sphere. As mentioned, our goal is to generalize this to the non-isotropic case, which requires us to apply different methods.

The dimension $d$ of the isotropic space arises as a natural parameter when discussing the underlying probabilities of Theorem 1. Clearly, however, when different coordinates of the space have different scales, the dimension by itself has little meaning. For example, consider a $d$-dimensional ellipsoid with one axis being large and the rest being much smaller. This ellipsoid behaves more like a $1$-dimensional sphere than a $d$-dimensional one, in the sense mentioned above. It stands to reason that the more anisotropic the ellipsoid is, the smaller its effective dimension should be.

1.2 Main results and ideas

In accordance with the above, our first task is to find a suitable notion of dimensionality for our model. For $q \ge 1$, let $\|\cdot\|_q$ stand for the $q$-norm. We derive the quantities
\[ \left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6 \quad \text{and} \quad \left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^4 \]
as the new notions of dimension, where $\alpha$ parametrizes the eigenvalues of $D_\alpha$, the covariance matrix of the normal distribution, and is considered as a $d$-dimensional vector. We note that, in the isotropic case, those quantities reduce to $d$, which also maximizes the expressions.

This notion of dimension allows us to tackle the main objective of this paper: studying the total variation distance between $G(n,p)$ and $G(n,p,\alpha)$. Considering what we know about the isotropic case, our question becomes: what conditions are required of $\alpha$ so that the total variation remains bounded away from 0? The following theorem provides a sufficient condition on $\alpha$ as well as a necessary one:

Theorem 2. (a) Let $p \in (0,1)$ be fixed and assume that $\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6/n^3 \to 0$. Then $\mathrm{TV}(G(n,p), G(n,p,\alpha)) \to 1$.
(b) Furthermore, if $\left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^4/n^3 \to \infty$, then $\mathrm{TV}(G(n,p), G(n,p,\alpha)) \to 0$.

Note that there is a gap between the bounds 2(a) and 2(b): for example, if $\alpha_i \sim \frac{1}{\sqrt[3]{i}}$, then $\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6$ is of order $\frac{d}{\ln^2(d)}$, while $\left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^4$ is about $d^{\frac{2}{3}}$. We conjecture that the bound 2(a) is tight:

Conjecture 1. Let $p \in (0,1)$ be fixed and assume that $\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6/n^3 \to \infty$. Then $\mathrm{TV}(G(n,p), G(n,p,\alpha)) \to 0$.

In the following we describe some of the ideas used to prove Theorem 2. As discussed, the main idea underlying this work has to do with counting triangles. Given a graph $G$, we denote by $T(G)$ the number of triangles in the graph. It is easy to verify that $\mathbb{E}\,T(G(n,p)) = \binom{n}{3}p^3$ and that $\mathrm{Var}(T(G(n,p)))$ is of order $n^4$. In the isotropic case, standard calculations show that the expected number of triangles in $G(n,p,d)$ is boosted by a factor proportional to $1 + \frac{1}{\sqrt{d}}$.

The first difficulty that arises is to find a precise estimate for the probability increment in the non-isotropic case. We show that there is a constant $\delta_p$, depending only on $p$, such that
\[ \mathbb{E}\,T(G(n,p,\alpha)) \ge \binom{n}{3}p^3\left(1 + \delta_p\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3\right). \]
This would imply a non-negligible total variation distance as long as $\binom{n}{3}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3$ is bigger than the standard deviation of $T(G(n,p))$. We incorporate the idea of using signed triangles, which attain a similar difference between expected values but have a smaller variance. The number of signed triangles is defined as
\[ \tau(G) = \sum_{\{i,j,k\}\in\binom{[n]}{3}} (A_{i,j}-p)(A_{i,k}-p)(A_{j,k}-p), \]
where $A$ is the adjacency matrix of $G$; this is proportional to $\mathrm{Tr}((A-p\mathbf{1})^3)$. It is known that $\mathrm{Var}(\tau(G(n,p)))$ is only of order $n^3$. Resolving the value of $\mathrm{Var}(\tau(G(n,p,\alpha)))$ leads to the following result (which implies Theorem 2(a)):

Theorem 3.
Let $p \in (0,1)$ be fixed and assume that $\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6/n^3 \to 0$. Then $\mathrm{TV}(\tau(G(n,p)), \tau(G(n,p,\alpha))) \to 1$.

To prove Theorem 2(b), we may view the random graph $G(n,p,\alpha)$ as a measurable function of a random $n \times n$ matrix $W(n,\alpha)$ with entries proportional to $\langle\gamma_i,\gamma_j\rangle$, where the $\gamma_i$ are drawn i.i.d. from $N(0,D_\alpha)$ and $D_\alpha = \mathrm{diag}(\alpha)$. Similarly, $G(n,p)$ can be viewed as a function of an $n \times n$ GOE random matrix, denoted by $M(n)$. In [2], Theorem 1(b) was proven using direct calculations on the densities of the involved distributions. However, in our case no simple formula exists, which makes their method inapplicable. The premise is instead proven using information-theoretic tools, adopting ideas from [3]. The main idea is to use Pinsker's inequality to bound the total variation distance by the respective relative entropy. Thus we are interested in $\mathrm{Ent}[W(n,\alpha)\,\|\,M(n)]$. Theorem 2(b) will then follow from the next result:

Theorem 4. Let $p \in (0,1)$ be fixed and assume that $\left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^4/n^3 \to \infty$. Then $\mathrm{Ent}[W(n,\alpha)\,\|\,M(n)] \to 0$.

We suspect, as stated in Conjecture 1, that Theorem 2(b) does not give a tight characterization of the lower bound. Indeed, in the dense regime of the isotropic case, signed triangles act as an optimal statistic. It stands to reason that deforming the sphere shouldn't affect the utility of such a local tool.

Acknowledgments: We would like to thank the anonymous referee for carefully reading this paper and for the thoughtful comments which helped improve the overall presentation.

2 Preliminaries

We work in $\mathbb{R}^n$, equipped with the standard Euclidean structure $\langle\cdot,\cdot\rangle$. For $q \ge 1$, we denote by $\|\cdot\|_q$ the corresponding $q$-norm. That is, for $v = (v_1,\ldots,v_n) \in \mathbb{R}^n$,
\[ \|v\|_q = \left(\sum_{i=1}^n |v_i|^q\right)^{\frac{1}{q}}. \]
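These $q$-norms are exactly what the effective-dimension quantities of Section 1.2 are built from. As a small numerical illustration (ours): in the isotropic case both quantities equal $d$, a single dominant axis collapses them towards 1, and the power-law profile $\alpha_i = i^{-1/3}$ exhibits the gap between the two notions.

```python
import numpy as np

def dim_detect(alpha):
    """(||alpha||_2 / ||alpha||_3)^6, the dimension notion in Theorem 2(a)."""
    a = np.asarray(alpha, dtype=float)
    return (np.linalg.norm(a, 2) / np.linalg.norm(a, 3)) ** 6

def dim_indist(alpha):
    """(||alpha||_2 / ||alpha||_4)^4, the dimension notion in Theorem 2(b)."""
    a = np.asarray(alpha, dtype=float)
    return (np.linalg.norm(a, 2) / np.linalg.norm(a, 4)) ** 4

d = 1000
iso = np.ones(d)                              # sphere: both notions give d
spike = np.array([1.0] + [1e-3] * (d - 1))    # one long axis: roughly 1-dimensional
powerlaw = np.arange(1, d + 1) ** (-1.0 / 3)  # the profile exhibiting the gap
```

The function names are ours; the `powerlaw` vector is the example from Section 1.2, for which the Theorem 2(b) quantity is strictly smaller than the Theorem 2(a) one.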
If $\alpha = \{\alpha_i\}_{i=1}^d$ is a multi-set with elements from $\mathbb{R}$, we adopt the same notation for $\|\alpha\|_q$. We abbreviate $\|\cdot\| := \|\cdot\|_2$, the usual Euclidean norm, and denote by $S^{n-1}$ the unit sphere under this norm.

In our proofs, we will allow ourselves to use the letters $c, C, c', C', c_1, C_1$, etc. to denote absolute positive constants whose values may change between appearances. The letters $x, y, z$ will usually denote spatial variables, while $a, b, c$ will denote the corresponding frequencies in the Fourier domain. The letters $X, Y, Z$ will usually be used for random variables and vectors. The imaginary unit will be denoted by $i$.

Let $X$ be a real-valued random variable. The characteristic function of $X$ is a function $\varphi_X : \mathbb{R} \to \mathbb{C}$, given by $\varphi_X(t) = \mathbb{E}[e^{itX}]$. More generally, if $X$ is an $n$-dimensional random vector, then the characteristic function of $X$ is a function $\varphi_X : \mathbb{R}^n \to \mathbb{C}$ given by $\varphi_X(t) = \mathbb{E}[e^{i\langle t,X\rangle}]$. By elementary Fourier analysis, one can use the characteristic function to recover the distribution whenever the random vector is integrable.

We will be interested in the specific case where the dimension of $X$ is 3. Assume $X = (X_1,X_2,X_3)$ has a density, denoted by $f$, a characteristic function, denoted by $\varphi$, and (reflected) cumulative distribution function
\[ F(t_1,t_2,t_3) = P(X_1 > t_1, X_2 > t_2, X_3 > t_3), \]
with marginals onto the first 1 or 2 coordinates denoted by $F(t_1)$ and $F(t_1,t_2)$, respectively (remark that this definition differs slightly from the usual notion of the cumulative distribution function, which results in a change of sign in the next identity).
Then, e.g., [13, Theorem 5] states that
\[ \frac{i}{\pi^3} \times\!\!\!\int_{\mathbb{R}^3} \varphi(a,b,c)\,\frac{e^{-i(at_1+bt_2+ct_3)}}{abc}\, da\, db\, dc = 8F(t_1,t_2,t_3) - 4(F(t_1,t_2)+F(t_2,t_3)+F(t_1,t_3)) + 2(F(t_1)+F(t_2)+F(t_3)) - 1, \quad (1) \]
where the integral is taken as a Cauchy principal value. In $\mathbb{R}^3$, the Cauchy principal value of a function $g$, which we henceforth denote by $\times\!\!\int_{\mathbb{R}^3} g$, is defined as
\[ \int_0^\infty\!\int_0^\infty\!\int_0^\infty \Delta_c\Delta_b\Delta_a\, g(a,b,c)\, da\, db\, dc, \]
where $\Delta_a g(a,b,c) := g(a,b,c) + g(-a,b,c)$, and likewise for $b$ and $c$. In the following, for multivariate functions, we interpret the definition of an odd (resp. even) function in the following sense: $g$ is odd (resp. even) if it is antisymmetric (resp. symmetric) under a change of sign of any coordinate, while keeping the values of the rest of the coordinates intact. We note that the principal value of an odd function vanishes, and that if $g$ is integrable then $\times\!\!\int_{\mathbb{R}^3} g = \int_{\mathbb{R}^3} g$.

Furthermore, denoting $\mathrm{sgn}_{(t_1,t_2,t_3)}(x,y,z) = \mathrm{sgn}(x-t_1)\,\mathrm{sgn}(y-t_2)\,\mathrm{sgn}(z-t_3)$, a simple calculation shows the following equality:
\[ \int_{\mathbb{R}^3} f(x,y,z)\cdot\mathrm{sgn}_{(t_1,t_2,t_3)}(x,y,z)\, dx\, dy\, dz = 8F(t_1,t_2,t_3) - 4(F(t_1,t_2)+F(t_2,t_3)+F(t_1,t_3)) + 2(F(t_1)+F(t_2)+F(t_3)) - 1. \]
Since the Fourier transform is an isometry, we have
\[ \int_{\mathbb{R}^3} f\cdot\mathrm{sgn}_{(t_1,t_2,t_3)} = \frac{1}{\pi^3} \times\!\!\!\int_{\mathbb{R}^3} \varphi\cdot\widehat{\mathrm{sgn}}_{(t_1,t_2,t_3)}, \quad (2) \]
where $\widehat{\mathrm{sgn}}_{(t_1,t_2,t_3)}$ is the Fourier transform of $\mathrm{sgn}_{(t_1,t_2,t_3)}$, considered as a tempered distribution (for more information on the topic, see [9]). Putting all of the above together yields
\[ \widehat{\mathrm{sgn}}_{(t_1,t_2,t_3)}(a,b,c) = i\,\frac{e^{-i(at_1+bt_2+ct_3)}}{abc}. \quad (3) \]

For a positive semi-definite $n \times n$ matrix $\Sigma$, we denote by $N(0,\Sigma)$ the law of the centered Gaussian distribution with covariance $\Sigma$.
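Before specializing further, here is a minimal numerical illustration (ours) of the characteristic-function definition above: for a standard Gaussian $Z$, $\varphi_Z(t) = e^{-t^2/2}$, which a discretized expectation against the Gaussian density recovers.

```python
import numpy as np

def char_fn_std_gaussian(t, half_width=10.0, n_grid=20_001):
    """Approximate phi_Z(t) = E[exp(i t Z)] for Z ~ N(0, 1) by quadrature."""
    x = np.linspace(-half_width, half_width, n_grid)
    dx = x[1] - x[0]
    density = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    vals = np.exp(1j * t * x) * density
    return (vals[:-1] + vals[1:]).sum() * dx / 2  # trapezoidal rule

phi_1 = char_fn_std_gaussian(1.0)  # should be close to exp(-1/2)
```

The imaginary part vanishes by symmetry, as it must for a symmetric distribution.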
If $X \sim N(0,\Sigma)$, then $X^T X$ has the law $W_n(\Sigma,1)$ of the Wishart distribution with 1 degree of freedom. The characteristic function of $X^T X$ is known (see [7]) and given by
\[ \Theta \mapsto \det(\mathrm{I} - 2i\Theta\Sigma)^{-\frac{1}{2}}. \quad (4) \]

If $Z$ is distributed as a standard Gaussian random variable, then $Z^2$ has the $\chi^2$ distribution with 1 degree of freedom. For this distribution, we have $\mathbb{E}[\chi^2] = 1$ and $\mathrm{Var}(\chi^2) = 2$. The $\chi^2$ distribution has a sub-exponential tail, which may be bounded using a Bernstein-type inequality ([15]) in the following way: if $\{\chi_i^2\}_{i=1}^n$ are independent $\chi^2$ random variables, then for every $v = (v_1,\ldots,v_n) \in \mathbb{R}^n$ and every $t > 0$,
\[ P\left(\left|\sum_i v_i\chi_i^2 - \sum_i v_i\right| \ge t\right) \le 2\exp\left(-\min\left(\frac{t}{2\|v\|_\infty},\ \frac{t^2}{4\|v\|_2^2}\right)\right). \quad (5) \]

Let $X_1,\ldots,X_n$ be independent random variables with mean 0 and variances $\mathbb{E}[X_i^2] = \sigma_i^2$. Define $s_n = \sqrt{\sum_{i=1}^n \sigma_i^2}$ and $S_n = \sum_{i=1}^n \frac{X_i}{s_n}$. Under appropriate regularity conditions, the central limit theorem states that $S_n$ converges in distribution to $N(0,1)$. The Berry–Esseen inequality [12] quantifies this convergence. Suppose that the absolute third moments of the $X_i$ exist, with $\mathbb{E}[|X_i|^3] = \rho_i$. If we denote by $Z$ a standard Gaussian and define $S_n$ as above, then, for every $x \in \mathbb{R}$,
\[ |P(S_n < x) - P(Z < x)| \le \frac{\sum_{i=1}^n \rho_i}{s_n^3}. \quad (6) \]
This can be generalized to higher dimensions, as found in [1, Theorem 1.1]. In that case, assume $X_1,\ldots,X_n$ are independent random vectors in $\mathbb{R}^d$ and that $S_n = \sum_{i=1}^n X_i$ has covariance $\Sigma^2$. Assume that $\Sigma$ is invertible and denote $\mathbb{E}[|\Sigma^{-1}X_i|^3] = \rho_i$. If $Z_d$ is a $d$-dimensional standard Gaussian vector, then there exists a universal constant $C_{be} > 0$ such that for any convex set $A$,
\[ |P(\Sigma^{-1}S_n \in A) - P(Z_d \in A)| \le C_{be}\, d^{\frac{1}{4}} \sum_i \rho_i. \quad (7) \]

For a random vector $X$ on $\mathbb{R}^n$ with density $f$, the differential entropy of $X$ is defined as
\[ \mathrm{Ent}[X] = -\int_{\mathbb{R}^n} f(x)\ln(f(x))\, dx. \]
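The Wishart characteristic function (4) is the algebraic workhorse of Section 3, where, for $\Sigma = \sigma^2\mathrm{I}_3$ and an off-diagonal $\Theta$ with entries $a/2, b/2, c/2$, the determinant reduces to a closed form. The following check is ours (function names included) and verifies that reduction numerically.

```python
import numpy as np

def wishart_cf_det(a, b, c, sigma2):
    """det(I - 2i * Theta * Sigma)^(-1/2) for the 3x3 off-diagonal Theta
    with entries a/2, b/2, c/2 and Sigma = sigma2 * I."""
    theta = 0.5 * np.array([[0, a, b],
                            [a, 0, c],
                            [b, c, 0]], dtype=complex)
    m = np.eye(3) - 2j * theta * sigma2
    return np.linalg.det(m) ** (-0.5)

def wishart_cf_closed(a, b, c, sigma2):
    """The closed form: (1 + sigma^4 (a^2+b^2+c^2) + 2 i sigma^6 abc)^(-1/2)."""
    return (1 + sigma2**2 * (a**2 + b**2 + c**2)
              + 2j * sigma2**3 * a * b * c) ** (-0.5)

det_val = wishart_cf_det(0.7, -1.3, 2.1, 0.5)
closed_val = wishart_cf_closed(0.7, -1.3, 2.1, 0.5)
```

Since the two bases agree exactly, the principal complex square roots agree as well.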
If $Y$ is another random vector, with density $g$, the relative entropy of $X$ with respect to $Y$ is
\[ \mathrm{Ent}[X\,\|\,Y] = \int_{\mathbb{R}^n} f(x)\ln\left(\frac{f(x)}{g(x)}\right) dx. \]
Pinsker's inequality connects the relative entropy and the total variation distance:
\[ \mathrm{TV}(X,Y) \le \sqrt{\tfrac{1}{2}\mathrm{Ent}[X\,\|\,Y]}. \quad (8) \]
The chain rule for relative entropy states that for any random vectors $X_1, X_2, Y_1, Y_2$,
\[ \mathrm{Ent}[(X_1,X_2)\,\|\,(Y_1,Y_2)] = \mathrm{Ent}[X_1\,\|\,Y_1] + \mathbb{E}_{x\sim\lambda_1}\,\mathrm{Ent}[X_2|X_1=x\,\|\,Y_2|Y_1=x], \quad (9) \]
where $\lambda_1$ is the marginal of $X_1$, and $X_2|X_1=x$ is the distribution of $X_2$ conditioned on the event $X_1 = x$ (similarly for $Y_2|Y_1=x$).

3 Estimates for a triangle in a random geometric graph

In this section we derive a lower bound for the probability that an induced subgraph of size 3 of a random geometric graph forms a triangle. This calculation is instrumental for the derivation of Theorem 2(a). Using the notation of the introduction, let $X_1, X_2, X_3 \sim N(0,D_\alpha)$ be independent normal random vectors, with coordinates $X_1^i, X_2^i, X_3^i$ for $1 \le i \le d$. We denote by $f$ the joint density of $(\langle X_1,X_2\rangle, \langle X_1,X_3\rangle, \langle X_2,X_3\rangle)$. Consider the event
\[ E_p = \{\langle X_1,X_2\rangle \ge t_{p,\alpha},\ \langle X_1,X_3\rangle \ge t_{p,\alpha},\ \langle X_2,X_3\rangle \ge t_{p,\alpha}\} \]
that the corresponding vertices form a triangle in $G(n,p,\alpha)$. The main result of this section is the following theorem.

Theorem 5. Let $p \in (0,1)$ and assume $\|\alpha\|_\infty = 1$. One has
\[ p^3 + \Delta\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3 \ge P(E_p) \ge p^3 + \delta_p\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3 \]
whenever $\|\alpha\|_2 > c_p$, for constants $\Delta, \delta_p, c_p > 0$ which may depend only on $p$.

3.1 Lower bound; the case $p = \frac{1}{2}$

It will be instructive to begin the discussion with the (easier) case $p = \frac{1}{2}$, in which $t_{p,\alpha} = 0$. We are thus interested in the probability that $\langle X_1,X_2\rangle, \langle X_1,X_3\rangle, \langle X_2,X_3\rangle > 0$.
Note that the triplet $(\langle X_1,X_2\rangle, \langle X_1,X_3\rangle, \langle X_2,X_3\rangle)$ can be realized as a linear combination of upper off-diagonal elements taken from $d$ independent 3-dimensional Wishart random matrices (see below for an elaborated explanation). Unfortunately, there is no known closed expression for the density of such a distribution. The following lemma utilizes the characteristic function of the joint distribution to derive a closed expression for the desired probability.

Lemma 1.
\[ P(E_{1/2}) = \frac{1}{8} + \times\!\!\!\int_{\mathbb{R}^3} \frac{i}{8abc\pi^3}\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}} da\, db\, dc. \quad (10) \]

Proof. Consider the event $\{\langle X_1,X_2\rangle > 0, \langle X_1,X_3\rangle < 0, \langle X_2,X_3\rangle < 0\}$. The map $(x,y,z) \mapsto (x,y,-z)$ preserves the law of $(X_1,X_2,X_3)$. Thus,
\[ P(\{\langle X_1,X_2\rangle > 0, \langle X_1,X_3\rangle < 0, \langle X_2,X_3\rangle < 0\}) = P(\{\langle X_1,X_2\rangle > 0, \langle X_1,X_3\rangle > 0, \langle X_2,X_3\rangle > 0\}). \]
By the same argument,
\[ P(\{\langle X_1,X_2\rangle > 0, \langle X_1,X_3\rangle > 0, \langle X_2,X_3\rangle < 0\}) = P(\{\langle X_1,X_2\rangle < 0, \langle X_1,X_3\rangle < 0, \langle X_2,X_3\rangle < 0\}). \]
We denote the event on the right-hand side by $I_{1/2}$, the event of an induced independent set on 3 vertices, and its probability by $P(I_{1/2})$. From the above observations it is clear that
\[ 4\left(P(E_{1/2}) + P(I_{1/2})\right) = 1. \]
Also, we may note that
\[ \int_{\mathbb{R}^3} \mathrm{sgn}(xyz)\cdot f(x,y,z)\, dx\, dy\, dz = 4\left(P(E_{1/2}) - P(I_{1/2})\right). \]
Combining the two equalities yields
\[ P(E_{1/2}) = \frac{1}{8} + \frac{1}{8}\int_{\mathbb{R}^3} \mathrm{sgn}(xyz)\cdot f(x,y,z)\, dx\, dy\, dz. \]
As noted, no closed expression for $f$ is known, so the calculation of the above integral cannot be carried out in a straightforward manner. Instead, (2) allows us to rewrite the integral as
\[ \int_{\mathbb{R}^3} f(x,y,z)\cdot\mathrm{sgn}(xyz)\, dx\, dy\, dz = \frac{1}{\pi^3}\times\!\!\!\int_{\mathbb{R}^3} \varphi(a,b,c)\cdot\widehat{\mathrm{sgn}}(a,b,c)\, da\, db\, dc, \]
where $\varphi$ is the characteristic function of $f$, and $\widehat{\mathrm{sgn}}$ is the Fourier transform of $\mathrm{sgn}_{(0,0,0)}$, as in (3). Thus, we are required to calculate $\varphi(a,b,c)$.
Consider three independent normal random variables $X, Y, Z$ with mean 0 and variance $\sigma^2$. The characteristic function of $(XY, XZ, YZ)$ is defined by
\[ (a,b,c) \mapsto \mathbb{E}[\exp(i(a\cdot XY + b\cdot XZ + c\cdot YZ))]. \]
We have that
\[ a\cdot XY + b\cdot XZ + c\cdot YZ = \mathrm{Tr}\left(\begin{pmatrix} 0 & \frac{a}{2} & \frac{b}{2} \\ \frac{a}{2} & 0 & \frac{c}{2} \\ \frac{b}{2} & \frac{c}{2} & 0 \end{pmatrix}\cdot\begin{pmatrix} X^2 & XY & XZ \\ XY & Y^2 & YZ \\ XZ & YZ & Z^2 \end{pmatrix}\right). \]
If we consider the Wishart distribution $W_3(\Sigma_\sigma,1)$, where $\Sigma_\sigma$ is a $\sigma^2$ scalar matrix, we note that the above function equals the characteristic function of $W_3(\Sigma_\sigma,1)$ evaluated at the matrix
\[ \begin{pmatrix} 0 & \frac{a}{2} & \frac{b}{2} \\ \frac{a}{2} & 0 & \frac{c}{2} \\ \frac{b}{2} & \frac{c}{2} & 0 \end{pmatrix}. \]
Using formula (4), this equals
\[ \det\begin{pmatrix} 1 & -i\sigma^2 a & -i\sigma^2 b \\ -i\sigma^2 a & 1 & -i\sigma^2 c \\ -i\sigma^2 b & -i\sigma^2 c & 1 \end{pmatrix}^{-\frac{1}{2}}, \]
which may otherwise be written as
\[ \left(1 + (\sigma^2)^2(a^2+b^2+c^2) + 2(\sigma^2)^3 abc\, i\right)^{-\frac{1}{2}}. \]
By the convolution–multiplication theorem [6, Theorem 3.3.2], the characteristic function of a sum of independent variables is the product of their characteristic functions. It then follows that
\[ \varphi(a,b,c) = \prod_{i=1}^d \left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}}, \quad (11) \]
which results in
\[ \times\!\!\!\int_{\mathbb{R}^3} \varphi(a,b,c)\cdot\widehat{\mathrm{sgn}}(a,b,c)\, da\, db\, dc = \times\!\!\!\int_{\mathbb{R}^3} \frac{i}{abc}\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}} da\, db\, dc. \]
This concludes the proof.

In view of the above, it suffices to estimate the integral in (10). We will show that the integral of the expression
\[ \mathrm{Re}\left(\frac{i}{abc}\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}}\right) \]
is concentrated in a ball of radius $\frac{1}{\|\alpha\|_2}$, and that inside this ball the above expression is very close in value to $\|\alpha\|_3^3$. From this it will follow that
\[ P(E_{1/2}) \simeq \frac{1}{8} + \left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3. \]
The next result will be used to control the integral outside of the aforementioned ball.

Lemma 2. Let $n > 3$ and let $\gamma = \{\gamma_i\}_{i=1}^d$ with $\gamma_i \in [0,1]$ for $1 \le i \le d$.
Define
\[ I(T) = \int_T^\infty \frac{r^2\, dr}{\sqrt{\prod_i(1+\gamma_i^2 r^2)}}, \quad \forall\, T \ge 1, \]
and denote $\|\gamma\|_2^2 = \sum_i \gamma_i^2$. Then there exist constants $c_n, C_n > 0$, depending only on $n$, such that whenever $\|\gamma\|_2^2 > c_n$ we have
\[ I(T) \le C_n\left(\frac{1}{\|\gamma\|_2^2}\right)^{\frac{n}{2}}\frac{1}{T^{n-3}}. \]

Proof. Assume $\|\gamma\|_2^2 > n$; note that necessarily $d \ge n$ in this case. Thus we can give a nontrivial lower bound on $\prod_i(1+\gamma_i^2 r^2)$ by considering the sum of all products of $n$ different elements of $\gamma$. That is,
\[ \prod_i\left(1+\gamma_i^2 r^2\right) \ge \left(\sum_{\substack{S\subset\gamma \\ |S|=n}}\prod_{\gamma_j\in S}\gamma_j^2\right) r^{2n}. \]
We now claim that
\[ \sum_{\substack{S\subset\gamma \\ |S|=n}}\prod_{\gamma_j\in S}\gamma_j^2 \ge \frac{1}{n!}\prod_{k=0}^{n-1}\left(\|\gamma\|_2^2 - k\right). \quad (12) \]
To see this, we may rewrite
\[ \sum_{\substack{S\subset\gamma \\ |S|=n}}\prod_{\gamma_i\in S}\gamma_i^2 = \frac{1}{n}\sum_i \gamma_i^2 \sum_{\substack{S\subset\gamma\setminus\{\gamma_i\} \\ |S|=n-1}}\prod_{\gamma_j\in S}\gamma_j^2, \]
where we have counted each $S \subset \gamma$ exactly $n$ times. But $\gamma_i \le 1$ for every $1 \le i \le d$, and so $\|\gamma\setminus\{\gamma_i\}\|_2^2 > \|\gamma\|_2^2 - 1$. Equation (12) now follows by induction, since
\[ \frac{1}{n}\sum_i\gamma_i^2\sum_{\substack{S\subset\gamma\setminus\{\gamma_i\} \\ |S|=n-1}}\prod_{\gamma_j\in S}\gamma_j^2 \ge \frac{1}{n}\sum_i\gamma_i^2\,\frac{1}{(n-1)!}\prod_{k=0}^{n-2}\left(\|\gamma\|_2^2 - 1 - k\right) = \frac{1}{n!}\prod_{k=0}^{n-1}\left(\|\gamma\|_2^2 - k\right). \]
If we further assume that $\|\gamma\|_2^2 \ge 2n$, then $\|\gamma\|_2^2 - k > \frac{1}{2}\|\gamma\|_2^2$ for every $0 \le k \le n-1$. Plugging this into (12) produces
\[ \prod_i\left(1+\gamma_i^2 r^2\right) \ge \left(\frac{\|\gamma\|_2^2}{2\cdot n!}\right)^n r^{2n}, \]
which implies
\[ I(T) \le \left(\frac{2\cdot n!}{\|\gamma\|_2^2}\right)^{\frac{n}{2}}\int_T^\infty \frac{dr}{r^{n-2}} = \frac{(2\cdot n!)^{\frac{n}{2}}}{n-3}\left(\frac{1}{\|\gamma\|_2^2}\right)^{\frac{n}{2}}\frac{1}{T^{n-3}}, \]
as desired.

Remark: the constants obtained in the above proof are far from optimal, but they will suffice for our needs.

We will use the above result in order to bound from below the integral in formula (10). For this, we will assume, without loss of generality, that $\alpha$ is normalized in the following way:
\[ \alpha_1 = 1 \quad \text{and} \quad \alpha_i \in [0,1] \ \text{for } 1 \le i \le d. \quad (13) \]
We note that this normalization yields the following properties for $n, m \in \mathbb{N}$, which we shall use freely:

• For every $k > 0$, $\|\alpha\|_k^k \ge 1$, and thus $\left(\|\alpha\|_k^k\right)^n \le \left(\|\alpha\|_k^k\right)^m$ when $n \le m$.
• $\alpha_i^n \ge \alpha_i^m$ and $\|\alpha\|_n^n \ge \|\alpha\|_m^m$ when $n \le m$.

• For any $n > 2$ and $\varepsilon > 0$ there exists $c > 0$ such that whenever $\|\alpha\|_2^2 > c$ we have $\left(\frac{\|\alpha\|_n}{\|\alpha\|_2}\right)^n < \varepsilon$.

Lemma 3. There exists a constant $c_{1/2} > 0$ such that whenever $\|\alpha\|_2^2 > c_{1/2}$, then
\[ \times\!\!\!\int_{\mathbb{R}^3} \frac{i}{abc}\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}} da\, db\, dc \ge \frac{1}{100}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3. \]

Proof. First, we have the privilege of knowing that the integral evaluates to some probability. Therefore, the principal value of its imaginary part must vanish. This also becomes evident by noting that the imaginary part is an odd function. Thus, we are interested in:
\[ \mathrm{Re}\left(\times\!\!\!\int_{\mathbb{R}^3} \frac{i}{abc}\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}} da\, db\, dc\right) = \times\!\!\!\int_{\mathbb{R}^3} \frac{-1}{abc}\,\mathrm{Im}\left(\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)^{-\frac{1}{2}}\right) da\, db\, dc \]
\[ = \times\!\!\!\int_{\mathbb{R}^3} \frac{-\sin\left(\arg\left(\prod_i(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i)^{-\frac{1}{2}}\right)\right)}{abc\,\left|\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)\right|^{\frac{1}{2}}}\, da\, db\, dc = \times\!\!\!\int_{\mathbb{R}^3} \frac{\sin\left(\frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right)\right)}{abc\,\prod_i\left((1+\alpha_i^2(a^2+b^2+c^2))^2+4\alpha_i^6 a^2b^2c^2\right)^{\frac{1}{4}}}\, da\, db\, dc \]
\[ = \times\!\!\!\int_{\mathbb{R}^3} \frac{-\mathrm{Im}(\varphi(a,b,c))}{abc}\, da\, db\, dc = \int_{\mathbb{R}^3} \frac{-\mathrm{Im}(\varphi(a,b,c))}{abc}\, da\, db\, dc, \quad (14) \]
where $\varphi$ is as in (11). It is straightforward to verify that $\mathrm{Im}(\varphi(a,b,c)) = O(abc)$, which implies that the above integrand is actually integrable, and thus justifies the last equality. We will estimate the above integral in several steps.

Step 1 — The integral is bounded from below on $B_1 = \left\{x \in \mathbb{R}^3 : \|x\|_2 \le \frac{1}{\|\alpha\|_2}\right\}$, the ball of radius $\frac{1}{\|\alpha\|_2}$.

First, we will prove that the following holds:
\[ \sin\left(\frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right)\right) \ge \sum_i \frac{\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)} - 3\|\alpha\|_3^6(abc)^2. \quad (15) \]
Indeed, since $\sin(x) \ge x - x^2$, we have that
\[ \sin\left(\frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right)\right) \ge \frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right) - \frac{1}{4}\left(\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right)\right)^2 \]
\[ \ge \frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right) - \left(\sum_i\alpha_i^3\right)^2(abc)^2, \]
with the last inequality following from the fact that $\arctan^2(x) \le x^2$. Now, using the inequality $\arctan(x) \ge x - x^2$ yields
\[ \frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right) - \left(\sum_i\alpha_i^3\right)^2(abc)^2 \ge \sum_i\frac{\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)} - 2\left(\sum_i\alpha_i^6\right)(abc)^2 - \left(\sum_i\alpha_i^3\right)^2(abc)^2 \]
\[ \ge \sum_i\frac{\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)} - 3\|\alpha\|_3^6(abc)^2. \]
When $(a,b,c) \in B_1$, we have $\alpha_i^2(a^2+b^2+c^2) \le \frac{\alpha_i^2}{\|\alpha\|_2^2} \le 1$, and so
\[ \sum_i\frac{\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)} - 3\|\alpha\|_3^6(abc)^2 \ge \frac{1}{2}\|\alpha\|_3^3\, abc - 3\|\alpha\|_3^6(abc)^2. \quad (16) \]
Next, we note that for $(a,b,c) \in B_1$,
\[ 1 \ge \frac{1}{\prod_i\left[(1+\alpha_i^2(a^2+b^2+c^2))^2+4\alpha_i^6 a^2b^2c^2\right]^{\frac{1}{4}}} \ge \frac{1}{\prod_i\left[\left(1+\frac{\alpha_i^2}{\|\alpha\|_2^2}\right)^2+\frac{4\alpha_i^6}{\|\alpha\|_2^6}\right]^{\frac{1}{4}}}. \]
Since, in (13), we assumed that $\alpha_i \le 1$ for each $i$ while $\sum_i\alpha_i^2 \ge 1$, we may lower bound the above by $\prod_i\left(1+\frac{7\alpha_i^2}{\|\alpha\|_2^2}\right)^{-\frac{1}{4}}$, and since
\[ \ln\left(\prod_i\left(1+\frac{7\alpha_i^2}{\|\alpha\|_2^2}\right)\right) \le \frac{7}{\|\alpha\|_2^2}\sum_i\alpha_i^2 = 7, \]
we have
\[ \frac{1}{\prod_i\left(1+\frac{7\alpha_i^2}{\|\alpha\|_2^2}\right)^{\frac{1}{4}}} \ge e^{-2}. \quad (17) \]
By combining (16) and (17) with (11), we see that for $(a,b,c) \in B_1$ the following holds:
\[ -\mathrm{Im}(\varphi(a,b,c)) \ge \left(\frac{1}{2}\|\alpha\|_3^3\, abc - 3\|\alpha\|_3^6(abc)^2\right)e^{-2} \quad \text{when } abc > 0. \]
Also, it is not hard to see that $\mathrm{Im}(\varphi)$ is an odd function, which makes $\frac{\mathrm{Im}(\varphi(a,b,c))}{abc}$ even. Hence, if $H = \{(a,b,c) \in \mathbb{R}^3 \mid abc > 0\}$, then
\[ \int_{B_1}\frac{\mathrm{Im}(\varphi(a,b,c))}{abc}\, da\, db\, dc = 2\int_{B_1\cap H}\frac{\mathrm{Im}(\varphi(a,b,c))}{abc}\, da\, db\, dc. \]
Finally, since the volume of $B_1$ is $\frac{4\pi}{3\|\alpha\|_2^3}$, as long as $\|\alpha\|_2^2$ is large enough:
\[ \int_{B_1\cap H}\frac{-\mathrm{Im}(\varphi(a,b,c))}{abc}\, da\, db\, dc \ge \frac{1}{e^2}\int_{B_1\cap H}\left(\frac{1}{2}\|\alpha\|_3^3 - 3\|\alpha\|_3^6\, abc\right) da\, db\, dc \ge \frac{\pi}{3e^2}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3 - \frac{3\|\alpha\|_3^6}{e^2}\int_{B_1}|abc|\, da\, db\, dc \ge \frac{\pi}{3e^2}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3 - \frac{3}{e^2}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^6, \]
where the last inequality uses the fact that $\int_{B_1}|abc|\, da\, db\, dc \le \frac{1}{\|\alpha\|_2^6}$. Now, by the properties of the normalization (13), $\|\alpha\|_3^3 \le \|\alpha\|_2^2$. Thus
\[ \left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^6 \le \frac{1}{\|\alpha\|_2}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3, \]
and there exists a constant $c_1 > 0$ such that whenever $\|\alpha\|_2^2 > c_1$, then
\[ \int_{B_1}\frac{-\mathrm{Im}(\varphi(a,b,c))}{abc}\, da\, db\, dc > \frac{\pi}{4e^2}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3 > \frac{1}{10}\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3. \]

Step 2 — The integrand is positive on $B_2 = \left\{x \in \mathbb{R}^3 : \|x\|_2 \le \frac{1}{\|\alpha\|_2^{11/12}}\right\}$, the ball of radius $\frac{1}{\|\alpha\|_2^{11/12}}$.

We first note that whenever $\left|\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right)\right| < \pi$, then
\[ \sin\left(\arg\left(\prod_i\left(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3 abc\, i\right)\right)\right) \]
has the same sign as $abc$, which in turn implies that $-\frac{\mathrm{Im}(\varphi(a,b,c))}{abc} > 0$. Thus it will suffice to show that whenever $(a,b,c) \in B_2$ and $abc > 0$, we have $\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right) < \pi$. Indeed, for $(a,b,c) \in B_2$,
\[ abc < \left(\|\alpha\|_2^{-11/12}\right)^3 \le \frac{1}{\|\alpha\|_2^2}, \]
which, under the assumption $abc > 0$, results in
\[ \sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right) \le \sum_i\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)} \le \frac{2\|\alpha\|_3^3}{\|\alpha\|_2^2} < 2 < \pi, \]
as desired.

Step 3 — The absolute value of the integrand is negligible on the spherical shell $B \setminus B_2$, where $B$ is the unit ball in $\mathbb{R}^3$. Observe that
\[ \left|\frac{\sin\left(\frac{1}{2}\sum_i\arctan\left(\frac{2\alpha_i^3 abc}{1+\alpha_i^2(a^2+b^2+c^2)}\right)\right)}{abc}\right| \le \frac{1}{2}\sum_i\frac{2\alpha_i^3|abc|}{1+\alpha_i^2(a^2+b^2+c^2)}\cdot\frac{1}{|abc|} \le \|\alpha\|_3^3. \quad (18) \]
(18)

On the other hand, for $(a,b,c)\notin B_2$ we have
\[
\frac{1}{\prod_i\big[(1+\alpha_i^2(a^2+b^2+c^2))^2+4\alpha_i^6a^2b^2c^2\big]^{1/4}} \le \frac{1}{\prod_i(1+\alpha_i^2(a^2+b^2+c^2))^{1/2}} \le \Big(\prod_i\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^{22/12}}\Big)\Big)^{-1/2}.
\]
Using the elementary inequality $\ln(1+x)\ge x-\frac{x^2}{2}$ for $x>0$ yields
\[
\ln\Big(\prod_i\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^{22/12}}\Big)\Big) = \sum_i \ln\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^{22/12}}\Big) \ge \|\alpha\|_2^{2/12} - \frac{\|\alpha\|_4^4}{2\|\alpha\|_2^{44/12}} \ge \|\alpha\|_2^{2/12} - 1,
\]
where the last inequality follows from the fact that $\|\alpha\|_4^4 \le \|\alpha\|_2^2$. In turn, this implies
\[
\Big(\prod_i\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^{22/12}}\Big)\Big)^{-1/2} \le e^{-\frac{\|\alpha\|_2^{2/12}-1}{2}}.
\]
Finally, since the volume of the unit ball is $\frac{4\pi}{3}$, this gives
\[
\int_{B\setminus B_2}\Big|\frac{\operatorname{Im}(\varphi(a,b,c))}{abc}\Big|\,da\,db\,dc < \frac{4\pi}{3}\|\alpha\|_3^3\, e^{-\frac{\|\alpha\|_2^{2/12}-1}{2}}. \tag{19}
\]
Consequently, there is a constant $c_2$ such that whenever $\|\alpha\|_2^2 > c_2$,
\[
\int_{B\setminus B_2}\Big|\frac{\operatorname{Im}(\varphi(a,b,c))}{abc}\Big|\,da\,db\,dc \le \frac{1}{100}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]

Step 4 - The integral is negligible outside of $B$. For $(a,b,c)\notin B$ we use (18) to obtain
\[
\frac{\Big|\sin\Big(\frac12\sum_i\arctan\big(\frac{2\alpha_i^3abc}{1+\alpha_i^2(a^2+b^2+c^2)}\big)\Big)\Big|}{|abc|\,\prod_i\big[(1+\alpha_i^2(a^2+b^2+c^2))^2+4\alpha_i^6a^2b^2c^2\big]^{1/4}} < \frac{\|\alpha\|_3^3}{\prod_i(1+\alpha_i^2(a^2+b^2+c^2))^{1/2}}.
\]
By passing to spherical coordinates we obtain
\[
\int_{\mathbb{R}^3\setminus B} \frac{da\,db\,dc}{\prod_i(1+\alpha_i^2(a^2+b^2+c^2))^{1/2}} = 4\pi\int_1^\infty \frac{r^2\,dr}{\prod_i(1+\alpha_i^2 r^2)^{1/2}}.
\]
Applying Lemma 2 with $n=4$ and $T=1$ shows the existence of constants $C, c_3' > 0$ such that whenever $\|\alpha\|_2^2 > c_3'$,
\[
\int_1^\infty \frac{r^2\,dr}{\prod_i(1+\alpha_i^2 r^2)^{1/2}} \le C\Big(\frac{1}{\|\alpha\|_2^2}\Big)^2 = \frac{C}{\|\alpha\|_2^4}.
\]
Thus, there exists a constant $c_3 = \max(c_3', (16C)^2)$ such that whenever $\|\alpha\|_2^2 > c_3$,
\[
\int_{\mathbb{R}^3\setminus B}\Big|\frac{\operatorname{Im}(\varphi(a,b,c))}{abc}\Big|\,da\,db\,dc \le \frac{1}{100}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
Final Step - $\int_{\mathbb{R}^3} \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc \ge \frac{1}{100}\big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\big)^3$. We may now decompose the integral:
\[
\int_{\mathbb{R}^3} \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc = \int_{B_2} \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc + \int_{\mathbb{R}^3\setminus B_2} \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc.
\]
Letting $\|\alpha\|_2^2 > \max(c_1,c_2,c_3)$, Steps 1 and 2 show that
\[
\int_{B_2} \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc \ge \frac{1}{10}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3, \tag{20}
\]
while Steps 3 and 4 show
\[
\int_{\mathbb{R}^3\setminus B_2}\Big|\frac{\operatorname{Im}(\varphi(a,b,c))}{abc}\Big|\,da\,db\,dc \le \frac{2}{100}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
The required bound then follows by combining the above two estimates.

3.2 Arbitrary $0<p<1$

We now consider the case of arbitrary $p$. First, we would like to derive bounds on the behavior of $t_{p,\alpha}$, which constitute the following lemma.

Lemma 4. Let $p\in(0,1)$ and denote by $\Phi$ the cumulative distribution function of the standard Gaussian. If $t_p = \Phi^{-1}(p)$, then
\[
\|\alpha\|_2 t_p - k_p \le t_{p,\alpha} \le \|\alpha\|_2 t_p + k_p,
\]
for a constant $k_p$ depending only on $p$. Furthermore, if $p' := \Phi\big(\frac{t_{p,\alpha}}{\|\alpha\|_2}\big)$, then
\[
|p-p'| \le 3\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]

Proof. Let $W = \frac{\langle X_1,X_2\rangle}{\|\alpha\|_2}$, where $X_1, X_2$ are defined as in the beginning of the section. We may consider $\langle X_1,X_2\rangle$ as a sum of independent random variables $X_1^i\cdot X_2^i$, where for each $1\le i\le d$, $X_1^i$ and $X_2^i$ are independently distributed as $N(0,\alpha_i)$. It then holds that
\[
\mathbb{E}[X_1^i\cdot X_2^i] = 0,\qquad \mathbb{E}[(X_1^i\cdot X_2^i)^2] = \alpha_i^2.
\]
The absolute third moments are given as a product of absolute third moments of Gaussians; that is,
\[
\mathbb{E}[|X_1^i\cdot X_2^i|^3] = \frac{8\alpha_i^3}{\pi} < 3\alpha_i^3.
\]
Let $t$ be such that $p = P(W\ge t)$, in which case we also have $t_{p,\alpha} = t\|\alpha\|_2$. Note that
\[
\frac{\sum_i \mathbb{E}[|X_1^i\cdot X_2^i|^3]}{\big(\sum_i \mathbb{E}[(X_1^i\cdot X_2^i)^2]\big)^{3/2}} \le \frac{3\|\alpha\|_3^3}{\|\alpha\|_2^3}.
\]
Thus, if $Z$ is a standard normal random variable, the Berry-Esseen inequality, (6), yields for every $s\in\mathbb{R}$:
\[
|P(W>s) - P(Z>s)| \le \frac{3\|\alpha\|_3^3}{\|\alpha\|_2^3}.
\]
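The moment computations used here are easy to sanity-check numerically. The sketch below (our own Monte Carlo check, not part of the proof) estimates $\mathbb{E}|N_1 N_2|^3$ for independent standard Gaussians and compares it with the closed form $8/\pi$, from which the $\frac{8\alpha_i^3}{\pi}$ value above follows by scaling:

```python
import numpy as np

# Monte Carlo sanity check (not part of the proof): for independent standard
# Gaussians N1, N2, the absolute third moment of the product is
# E|N1*N2|^3 = (E|N|^3)^2 = (2*sqrt(2/pi))^2 = 8/pi < 3.
rng = np.random.default_rng(0)
n1 = rng.standard_normal(2_000_000)
n2 = rng.standard_normal(2_000_000)
estimate = np.mean(np.abs(n1 * n2) ** 3)
exact = 8 / np.pi
print(estimate, exact)  # both close to 2.546
```

The same scaling argument gives the Berry-Esseen ratio $3\|\alpha\|_3^3/\|\alpha\|_2^3$ used in the display above.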
If $t_p = \Phi^{-1}(p)$ then $P(Z>t_p) = p$ and
\[
|\Phi(t_p)-\Phi(t)| = |P(Z>t_p)-P(Z>t)| = |P(W>t)-P(Z>t)| \le \frac{3\|\alpha\|_3^3}{\|\alpha\|_2^3}.
\]
Since $|p-p'| = |\Phi(t_p)-\Phi(t)|$, this shows the second part of the statement. To finish the proof, denote $m = \inf_{s\in[t_p,t]} \Phi'(s)$. By Lagrange's mean value theorem,
\[
m|t_p-t| \le |\Phi(t_p)-\Phi(t)| \le \frac{3\|\alpha\|_3^3}{\|\alpha\|_2^3} \le \frac{3}{\|\alpha\|_2},
\]
which shows $t_{p,\alpha} \in \|\alpha\|_2 t_p \pm \frac{3}{m}$.

Before proceeding, we need some further definitions. Let $X_1', X_2', X_3'$ be independent copies of $X_1, X_2, X_3$ and consider the joint distribution $(\langle X_1,X_2\rangle, \langle X_1',X_3\rangle, \langle X_2',X_3'\rangle)$. This distribution has independent coordinates. Denote its density by $g$ and the corresponding characteristic function by $\psi$. If $N_1, N_2$ are two independent standard Gaussians then the characteristic function of their product can be derived from (4) as $\mathbb{E}\,e^{itN_1N_2} = (1+t^2)^{-1/2}$. From this, it follows that the characteristic function of $\langle X_1,X_2\rangle$ is $\mathbb{E}\,e^{it\langle X_1,X_2\rangle} = \prod_i(1+\alpha_i^2t^2)^{-1/2}$, and we have, by independence,
\[
\psi(a,b,c) = \prod_i\big((1+\alpha_i^2a^2)(1+\alpha_i^2b^2)(1+\alpha_i^2c^2)\big)^{-1/2}. \tag{21}
\]
We denote $\psi_1(a',b',c') = \psi\big(\frac{a'}{\|\alpha\|_2},\frac{b'}{\|\alpha\|_2},\frac{c'}{\|\alpha\|_2}\big)$ and $\varphi_1(a',b',c') = \varphi\big(\frac{a'}{\|\alpha\|_2},\frac{b'}{\|\alpha\|_2},\frac{c'}{\|\alpha\|_2}\big)$ for the characteristic function $\varphi$, (11). The following result will help us relate the independent version of the distribution to the original one.

Lemma 5. There exist absolute constants $c, C, \varepsilon > 0$ such that whenever $\|\alpha\|_2^2 > c$,
\[
\int_{\mathbb{R}^3} |\operatorname{Re}(\varphi_1)-\psi_1|\,da'\,db'\,dc' \le C\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{3+\varepsilon}.
\]

Proof. Note that since $\psi_1$ and $\varphi_1$ are characteristic functions, $|\psi_1|, |\operatorname{Re}(\varphi_1)| \le 1$. So
\[
|\psi_1 - \operatorname{Re}(\varphi_1)| \le |\ln(\psi_1) - \ln(\operatorname{Re}(\varphi_1))|.
\]
Now, let
\[
B_{0.01} = \Big\{x\in\mathbb{R}^3 : \|x\|_2^2 \le \Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.01}\Big\}.
\]
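The product identity $\mathbb{E}\,e^{itN_1N_2} = (1+t^2)^{-1/2}$ underlying (21) can be verified numerically. The sketch below (our own check, not from the paper) uses the conditional identity $\mathbb{E}[e^{itN_1N_2}\mid N_2] = e^{-t^2N_2^2/2}$ to reduce variance:

```python
import numpy as np

# Numerical check of E[exp(i t N1 N2)] = (1 + t^2)^(-1/2) for independent
# standard Gaussians.  Conditioning on N2 gives
# E[exp(i t N1 N2) | N2] = exp(-t^2 N2^2 / 2), so averaging that quantity
# over samples of N2 estimates the characteristic function with low variance.
rng = np.random.default_rng(1)
n2 = rng.standard_normal(1_000_000)
for t in (0.5, 1.0, 2.0):
    estimate = np.mean(np.exp(-(t ** 2) * n2 ** 2 / 2))
    exact = (1 + t ** 2) ** -0.5
    assert abs(estimate - exact) < 1e-2
```

Taking products over independent coordinates then reproduces the formula for $\psi$ in (21).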
Clearly,
\[
|\operatorname{Re}(\varphi_1)| \le |\varphi_1| = \prod_i\Big[\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^2}(a'^2+b'^2+c'^2)\Big)^2 + \frac{4\alpha_i^6}{\|\alpha\|_2^6}a'^2b'^2c'^2\Big]^{-1/4},
\]
and since $|a'b'c'| \le (a'^2+b'^2+c'^2)^{3/2} \le \big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\big)^{0.015}$ for $(a',b',c')\in B_{0.01}$, we have
\[
\Big|\arg\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^2}(a'^2+b'^2+c'^2) + \frac{2\alpha_i^3}{\|\alpha\|_2^3}a'b'c'\,i\Big)\Big| \le \frac{2\alpha_i^3}{\|\alpha\|_2^3}|a'b'c'| \le \frac{2\alpha_i^3}{\|\alpha\|_2^3}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.015}.
\]
By using the inequality $\cos(x)\ge 1-x^2$, we obtain
\[
\operatorname{Re}(\varphi_1) \ge \cos\Big(\frac{2\|\alpha\|_3^3}{\|\alpha\|_2^3}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.015}\Big)|\varphi_1| \ge \Big(1 - \frac{4\|\alpha\|_3^6}{\|\alpha\|_2^6}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.03}\Big)|\varphi_1|.
\]
Using the above, together with the triangle inequality, gives
\[
|\ln(\psi_1) - \ln(\operatorname{Re}(\varphi_1))| \le |\ln(\psi_1) - \ln(|\varphi_1|)| + \Big|\ln\Big(1 - \frac{4\|\alpha\|_3^6}{\|\alpha\|_2^6}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.03}\Big)\Big|. \tag{22}
\]
For $x\in(0,\frac12)$ we have the inequality $|\ln(1-x)|\le 2x$; thus, as long as $\|\alpha\|_2^2$ is large enough,
\[
\Big|\ln\Big(1-\frac{4\|\alpha\|_3^6}{\|\alpha\|_2^6}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.03}\Big)\Big| \le \frac{8\|\alpha\|_3^6}{\|\alpha\|_2^6}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.03},
\]
and
\[
8\int_{B_{0.01}} \frac{\|\alpha\|_3^6}{\|\alpha\|_2^6}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.03}\,da'\,db'\,dc' \le 32\pi\frac{\|\alpha\|_3^6}{\|\alpha\|_2^6}\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.045} = 32\pi\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{5.955}. \tag{23}
\]
By using the inequality $|\ln(1+x)-x|\le x^2$ for $x>0$ we bound $\ln(\psi_1)$:
\[
\ln(\psi_1(a',b',c')) = -\frac12\sum_i\Big[\ln\Big(1+\frac{\alpha_i^2 a'^2}{\|\alpha\|_2^2}\Big)+\ln\Big(1+\frac{\alpha_i^2 b'^2}{\|\alpha\|_2^2}\Big)+\ln\Big(1+\frac{\alpha_i^2 c'^2}{\|\alpha\|_2^2}\Big)\Big] = -\frac12(a'^2+b'^2+c'^2) + O\Big(\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4}\Big)(a'^4+b'^4+c'^4).
\]
Similar considerations show
\[
\begin{aligned}
\ln(|\varphi_1|) &= -\frac14\sum_i \ln\Big(1+\frac{2\alpha_i^2}{\|\alpha\|_2^2}(a'^2+b'^2+c'^2) + \frac{\alpha_i^4}{\|\alpha\|_2^4}(a'^2+b'^2+c'^2)^2 + \frac{4\alpha_i^6}{\|\alpha\|_2^6}a'^2b'^2c'^2\Big) \\
&= -\frac12(a'^2+b'^2+c'^2) - \frac{\|\alpha\|_4^4}{4\|\alpha\|_2^4}(a'^2+b'^2+c'^2)^2 - \frac{\|\alpha\|_6^6}{\|\alpha\|_2^6}a'^2b'^2c'^2 + O\Big(\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4}\Big)\times
\end{aligned}
\]
\[
\big((a'^2+b'^2+c'^2)^2 + (a'^2+b'^2+c'^2)^4 + a'^4b'^4c'^4\big) = -\frac12(a'^2+b'^2+c'^2) + O\Big(\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4}\Big)\Big(1+(a'^2+b'^2+c'^2)^6\Big). \tag{24}
\]
The above shows the existence of a constant $C>0$ such that
\[
\int_{B_{0.01}}|\ln(\psi_1)-\ln(|\varphi_1|)| \le C\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4 \int_{B_{0.01}} (a'^2+b'^2+c'^2)^6\,da'\,db'\,dc' = 4\pi C\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.075} \le 4\pi C\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^4\Big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\Big)^{0.075} = 4\pi C\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{3.925}. \tag{25}
\]
By combining (23), (25) and (22), we obtain
\[
\int_{B_{0.01}} |\psi_1 - \operatorname{Re}(\varphi_1)|\,da'\,db'\,dc' \le \pi(4C+32)\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{3.925}.
\]
To bound the integral on $\mathbb{R}^3\setminus B_{0.01}$ we proceed in a fashion similar to Step 3 of Lemma 3. First, note that
\[
|\varphi_1|, |\psi_1| \le \prod_i\Big(1+\frac{\alpha_i^2}{\|\alpha\|_2^2}(a'^2+b'^2+c'^2)\Big)^{-1/2}.
\]
Denoting $r = \sqrt{a'^2+b'^2+c'^2}$, $T = \big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\big)^{0.005}$ and passing to spherical coordinates yields
\[
\int_{\mathbb{R}^3\setminus B_{0.01}} |\operatorname{Re}(\varphi_1)-\psi_1|\,da'\,db'\,dc' \le \int_{\mathbb{R}^3\setminus B_{0.01}} \big(|\operatorname{Re}(\varphi_1)|+|\psi_1|\big)\,da'\,db'\,dc' \le 8\pi\int_T^\infty \frac{r^2\,dr}{\prod_i\big(1+\frac{\alpha_i^2}{\|\alpha\|_2^2}r^2\big)^{1/2}}.
\]
Invoking Lemma 2 with $n>606$ shows the existence of constants $C, c>0$ such that
\[
\int_T^\infty \frac{r^2\,dr}{\prod_i\big(1+\frac{\alpha_i^2}{\|\alpha\|_2^2}r^2\big)^{1/2}} \le C\,T^{-603} = C\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{3.015},
\]
whenever $\|\alpha\|_2^2 > c$. This concludes the proof when we take $\varepsilon = 0.015$.

We are now ready to bound from below the probability of an induced triangle occurring in the general setting. Set $p\in(0,1)$ and $t := t_{p,\alpha}$. We are interested in the event
\[
\big\{\min\big(\langle X_1,X_2\rangle, \langle X_1,X_3\rangle, \langle X_2,X_3\rangle\big) > t\big\}.
\]
As before, let $f$ be the joint density of $(\langle X_1,X_2\rangle, \langle X_1,X_3\rangle, \langle X_2,X_3\rangle)$ and consider the integral
\[
I_p := \int_{\mathbb{R}^3} f(x,y,z)\,\mathrm{sgn}(x-t)\,\mathrm{sgn}(y-t)\,\mathrm{sgn}(z-t)\,dx\,dy\,dz.
\]
Note that, in the above formula, replacing $f$ with $g$, the density of the coordinate-independent version defined above, would yield
\[
I_p = p^3 + 3(1-p)^2p - 3(1-p)p^2 - (1-p)^3 = (2p-1)^3.
\]
For the rest of this section, our goal will be to show that $I_p$ is large compared to $(2p-1)^3$; that is, the dependency between the coordinates induces an increased probability for triangles and induced edges. As in (1), we may write the Fourier transform of $\mathrm{sgn}(x-t)\mathrm{sgn}(y-t)\mathrm{sgn}(z-t)$ as $\widehat{\mathrm{sgn}}(a,b,c)\,e^{-2\pi i t(a+b+c)}$. Thus, by (2), we have the equality
\[
I_p = \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int_{\mathbb{R}^3} \varphi(a,b,c)\,\widehat{\mathrm{sgn}}(a,b,c)\,e^{-2\pi i t(a+b+c)}\,da\,db\,dc,
\]
where $\varphi$, as in (11), is the characteristic function of $f$. Since $I_p$ represents a real number, we only need to consider the real part of the integral:
\[
\begin{aligned}
I_p &= \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int \operatorname{Re}\big(\varphi(a,b,c)\,\widehat{\mathrm{sgn}}(a,b,c)\big)\cos(2\pi t(a+b+c))\,da\,db\,dc + \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int \operatorname{Im}\big(\varphi(a,b,c)\,\widehat{\mathrm{sgn}}(a,b,c)\big)\sin(2\pi t(a+b+c))\,da\,db\,dc \\
&= \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\cos(2\pi t(a+b+c))\,da\,db\,dc + \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int \frac{\operatorname{Re}(\varphi(a,b,c))}{abc}\sin(2\pi t(a+b+c))\,da\,db\,dc.
\end{aligned}
\]
We denote
\[
I_p' = \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int \frac{\operatorname{Re}(\varphi(a,b,c))}{abc}\sin(2\pi t(a+b+c))\,da\,db\,dc \quad\text{and}\quad I_p'' = \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\cos(2\pi t(a+b+c))\,da\,db\,dc.
\]
In the proof of Lemma 3 we have seen that $\mathrm{p.v.}\!\int \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc$ is mostly concentrated near the origin. Thus, we should expect
\[
\mathrm{p.v.}\!\int \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\cos(2\pi t(a+b+c))\,da\,db\,dc \simeq \mathrm{p.v.}\!\int \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\,da\,db\,dc,
\]
so that $I_p''$ represents the increase in probability. From Lemma 5, we know that $\operatorname{Re}(\varphi)$ is close to $\psi$, the characteristic function of the coordinate-independent version, which means that $I_p'$ should be close to $(2p-1)^3$. The next two claims formalize this intuition. We begin by showing that $I_p''$ is large.

Claim 6. Fix $p\in(0,1)$.
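The value $(2p-1)^3$ for the independent version comes from summing the eight sign patterns of the triple product; a quick deterministic check of that expansion (our own illustration):

```python
# For independent coordinates with P(X > t) = p, each factor contributes
# E[sgn(X - t)] = p - (1 - p) = 2p - 1, so the triple product has expectation
# (2p - 1)^3.  Expanding over the eight sign patterns reproduces the formula
# from the text: p^3 + 3(1-p)^2 p - 3(1-p) p^2 - (1-p)^3.
for p in (0.1, 0.3, 0.5, 0.9):
    q = 1 - p
    expansion = p ** 3 + 3 * q ** 2 * p - 3 * q * p ** 2 - q ** 3
    assert abs(expansion - (2 * p - 1) ** 3) < 1e-12
```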
There exist constants $\delta_p', c_p > 0$ depending only on $p$ such that whenever $\|\alpha\|_2^2 > c_p$,
\[
I_p'' \ge 2\delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]

Proof. First, it is not hard to see that the integrand in $I_p''$ is continuous, up to a removable discontinuity, and we may pass to standard integration. Let $R$ be an arbitrary orthogonal transformation which takes $(1,0,0)$ to $\frac{1}{\sqrt3}(1,1,1)$. Consider the set
\[
K = R\Big(\big[-\|\alpha\|_2^{-11/12},\ \|\alpha\|_2^{-11/12}\big]\times\big[-\|\alpha\|_2^{-11/12},\ \|\alpha\|_2^{-11/12}\big]\times\big[-\|\alpha\|_2^{-11/12},\ \|\alpha\|_2^{-11/12}\big]\Big).
\]
Note that if $B_2$ is the ball of radius $\|\alpha\|_2^{-11/12}$, as above, and $B_2'$ is the ball of radius $2\|\alpha\|_2^{-11/12}$, then $B_2\subset K\subset B_2'$. Now, recall from (14) that
\[
\frac{-\operatorname{Im}(\varphi(a,b,c))}{abc} = \frac{\sin\Big(\frac12\sum_i\arctan\big(\frac{2\alpha_i^3abc}{1+\alpha_i^2(a^2+b^2+c^2)}\big)\Big)}{abc\,\prod_i\big[(1+\alpha_i^2(a^2+b^2+c^2))^2+4\alpha_i^6a^2b^2c^2\big]^{1/4}}.
\]
From (18) and (15), we have
\[
\|\alpha\|_3^3 \ge \frac{\sin\Big(\frac12\sum_i\arctan\big(\frac{2\alpha_i^3abc}{1+\alpha_i^2(a^2+b^2+c^2)}\big)\Big)}{abc} \ge \sum_i \frac{\alpha_i^3}{1+\alpha_i^2(a^2+b^2+c^2)} - 3\|\alpha\|_3^6|abc|.
\]
Along with the inequality $\frac{\alpha_i^3}{1+\alpha_i^2(a^2+b^2+c^2)} \ge \alpha_i^3\big(1-\alpha_i^2(a^2+b^2+c^2)\big)$, the above yields
\[
\Bigg|\frac{\sin\Big(\frac12\sum_i\arctan\big(\frac{2\alpha_i^3abc}{1+\alpha_i^2(a^2+b^2+c^2)}\big)\Big)}{abc} - \|\alpha\|_3^3\Bigg| \le \|\alpha\|_5^5(a^2+b^2+c^2) + 3\|\alpha\|_3^6|abc|.
\]
Therefore
\[
\begin{aligned}
\int_K \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\cos(2\pi t(a+b+c))\,da\,db\,dc
&\ge \|\alpha\|_3^3 \int_K \frac{\cos(2\pi t(a+b+c))\,da\,db\,dc}{\prod_i\big[(1+\alpha_i^2(a^2+b^2+c^2))^2+4\alpha_i^6a^2b^2c^2\big]^{1/4}} \\
&\quad - 3\|\alpha\|_3^6\int_K |abc|\,da\,db\,dc - \|\alpha\|_5^5\int_K(a^2+b^2+c^2)\,da\,db\,dc, \tag{26}
\end{aligned}
\]
with
\[
3\|\alpha\|_3^6\int_K|abc|\,da\,db\,dc \le C_1\frac{\|\alpha\|_3^6}{\|\alpha\|_2^{5.5}} = C_1\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3 \frac{\|\alpha\|_3^3}{\|\alpha\|_2^2}\frac{1}{\|\alpha\|_2^{0.5}} \le C_1\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3\frac{1}{\|\alpha\|_2^{0.5}},
\]
\[
\|\alpha\|_5^5\int_K(a^2+b^2+c^2)\,da\,db\,dc \le C_1\frac{\|\alpha\|_5^5}{\|\alpha\|_2^{55/12}} \le C_1\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3\frac{1}{\|\alpha\|_2},
\]
for an absolute constant $C_1 > 0$.
Recalling that
\[
|\varphi(a,b,c)| = \prod_i\big[\big(1+\alpha_i^2(a^2+b^2+c^2)\big)^2 + 4\alpha_i^6a^2b^2c^2\big]^{-1/4},
\]
we would like to approximate $|\varphi(a,b,c)|$ by $e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}$. For that, we note that
\[
\Big||\varphi(a,b,c)| - e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\Big| \le \Big|\ln\big(|\varphi(a,b,c)|\big) - \ln\Big(e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\Big)\Big|.
\]
Since $|\ln(x+1)-x| \le x^2$, considerations similar to (24) show, for $(a,b,c)\in K$:
\[
\begin{aligned}
\ln(|\varphi|) &= -\frac14\sum_i\ln\big(1+2\alpha_i^2(a^2+b^2+c^2)+\alpha_i^4(a^2+b^2+c^2)^2+4\alpha_i^6a^2b^2c^2\big) \\
&= -\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2) - \frac{\|\alpha\|_4^4}{4}(a^2+b^2+c^2)^2 - \|\alpha\|_6^6\, a^2b^2c^2 + O(\|\alpha\|_4^4)\big((a^2+b^2+c^2)^2+(a^2+b^2+c^2)^4+a^4b^4c^4\big) \\
&= -\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2) + O(\|\alpha\|_4^4)(a^2+b^2+c^2)^2.
\end{aligned}
\]
This shows the existence of an absolute constant $C_2>0$ such that for $(a,b,c)\in K$,
\[
\Big||\varphi(a,b,c)| - e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\Big| \le C_2\|\alpha\|_4^4(a^2+b^2+c^2)^2.
\]
Hence
\[
\int_K |\varphi(a,b,c)|\cos(2\pi t(a+b+c))\,da\,db\,dc \ge \int_K e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\cos(2\pi t(a+b+c))\,da\,db\,dc - C_2\|\alpha\|_4^4\int_K (a^2+b^2+c^2)^2\,da\,db\,dc, \tag{27}
\]
and
\[
C_2\|\alpha\|_4^4\int_K(a^2+b^2+c^2)^2\,da\,db\,dc \le C_3\frac{\|\alpha\|_4^4}{\|\alpha\|_2^{77/12}} \le C_3\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3\frac{1}{\|\alpha\|_2^3},
\]
for an absolute constant $C_3 > 0$.
By rotational invariance of $e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}$, we may apply $R$ as an orthogonal change of coordinates, which shows
\[
\begin{aligned}
\int_K e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\cos(2\pi t(a+b+c))\,da\,db\,dc
&= \int_{R^{-1}K} e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\cos(2\sqrt3\,\pi t a)\,da\,db\,dc \\
&= \int_{-\|\alpha\|_2^{-11/12}}^{\|\alpha\|_2^{-11/12}} e^{-\frac{\|\alpha\|_2^2}{2}c^2}dc \int_{-\|\alpha\|_2^{-11/12}}^{\|\alpha\|_2^{-11/12}} e^{-\frac{\|\alpha\|_2^2}{2}b^2}db \int_{-\|\alpha\|_2^{-11/12}}^{\|\alpha\|_2^{-11/12}} e^{-\frac{\|\alpha\|_2^2}{2}a^2}\cos(\sqrt{12}\,\pi t a)\,da \\
&= \frac{1}{\|\alpha\|_2^3}\int_{-\|\alpha\|_2^{1/12}}^{\|\alpha\|_2^{1/12}} e^{-\frac{c^2}{2}}dc \int_{-\|\alpha\|_2^{1/12}}^{\|\alpha\|_2^{1/12}} e^{-\frac{b^2}{2}}db \int_{-\|\alpha\|_2^{1/12}}^{\|\alpha\|_2^{1/12}} e^{-\frac{a^2}{2}}\cos\Big(\frac{\sqrt{12}\,\pi t}{\|\alpha\|_2}a\Big)da, \tag{28}
\end{aligned}
\]
where the last equality is the result of a second change of coordinates. By Lemma 4, we know that
\[
|t_p| - \frac{k_p}{\|\alpha\|_2} \le \Big|\frac{t}{\|\alpha\|_2}\Big| \le |t_p| + \frac{k_p}{\|\alpha\|_2}
\]
for constants $k_p, t_p$ depending on $p$. Also, a calculation shows that
\[
\int_{-\infty}^{\infty} e^{-\frac{a^2}{2}}\cos\Big(\frac{\sqrt{12}\,\pi t}{\|\alpha\|_2}a\Big)da = \sqrt{2\pi}\,e^{-\frac{6\pi^2 t^2}{\|\alpha\|_2^2}}.
\]
Note that if $\|\alpha\|_2^{2/12} \ge 12\pi^2(|t_p|+k_p)^2 + 2$, then
\[
\int_{|a|>\|\alpha\|_2^{1/12}} e^{-\frac{a^2}{2}}\,da \le 2e^{-\frac{\|\alpha\|_2^{2/12}}{2}} \le \frac2e\,e^{-6\pi^2(|t_p|+k_p)^2} \le \frac12\sqrt{2\pi}\,e^{-\frac{6\pi^2t^2}{\|\alpha\|_2^2}}.
\]
That is, if $\|\alpha\|_2^{2/12}$ is larger than some constant which depends only on $t$, we have
\[
\int_{-\|\alpha\|_2^{1/12}}^{\|\alpha\|_2^{1/12}} e^{-\frac{a^2}{2}}\cos\Big(\frac{\sqrt{12}\,\pi t}{\|\alpha\|_2}a\Big)da \ge \frac12\sqrt{2\pi}\,e^{-\frac{6\pi^2t^2}{\|\alpha\|_2^2}}.
\]
Together with the observation $\int_{-1}^1 e^{-\frac{x^2}{2}}dx > 1$, this shows that the expression (28) is bounded from below by $\frac{\sqrt{2\pi}}{2\|\alpha\|_2^3}e^{-\frac{6\pi^2t^2}{\|\alpha\|_2^2}}$. Combining the above, along with (26) and (27), shows
\[
\begin{aligned}
\int_K \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\cos(2\pi t(a+b+c))\,da\,db\,dc
&\ge \|\alpha\|_3^3\int_K \cos(2\pi t(a+b+c))|\varphi(a,b,c)|\,da\,db\,dc - 2C_1\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3\frac{1}{\|\alpha\|_2^{0.5}} \\
&\ge \|\alpha\|_3^3\int_K e^{-\frac{\|\alpha\|_2^2}{2}(a^2+b^2+c^2)}\cos(2\pi t(a+b+c))\,da\,db\,dc - C_3\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^6 - 2C_1\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3\frac{1}{\|\alpha\|_2^{0.5}}
\end{aligned}
\]
\[
\ge \frac12\sqrt{2\pi}\,e^{-\frac{6\pi^2t^2}{\|\alpha\|_2^2}}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3 - C_3\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^6 - 2C_1\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3\frac{1}{\|\alpha\|_2^{0.5}} \ge 4\delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3,
\]
whenever $\|\alpha\|_2^2 > c_p''$, for constants $c_p'', \delta_p'$ depending only on $p$. From (19), we can choose a constant $c_p' > c_p'' > 0$ such that
\[
\int_{\mathbb{R}^3\setminus B_2} \big|\operatorname{Re}\big(\varphi(a,b,c)\,\widehat{\mathrm{sgn}}(a,b,c)\big)\big|\,da\,db\,dc < 2\delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3,
\]
whenever $\|\alpha\|_2^2 > c_p'$. Thus
\[
I_p'' > \int_K \frac{-\operatorname{Im}(\varphi(a,b,c))}{abc}\cos(2\pi t(a+b+c))\,da\,db\,dc - \int_{\mathbb{R}^3\setminus B_2}\Big|\frac{\operatorname{Im}(\varphi(a,b,c))}{abc}\Big|\,da\,db\,dc \ge 2\delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
It now remains to show that the difference between $I_p'$ and $(2p-1)^3$ is small compared to $I_p''$.

Claim 7. Fix $p\in(0,1)$. There exists a constant $c_p>0$ depending only on $p$ such that whenever $\|\alpha\|_2^2 > c_p$,
\[
|I_p' - (2p-1)^3| \le \delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3,
\]
where $\delta_p'$ is the same as in Claim 6.

Proof. Let $g$ be the density of the coordinate-independent version of $f$, as in Lemma 5, and let $\psi$ be its characteristic function, (21). Evidently, we have the equality
\[
\frac{1}{\pi^3}\,\mathrm{p.v.}\!\int_{\mathbb{R}^3} \psi(a,b,c)\,\widehat{\mathrm{sgn}}(a,b,c)\,e^{-2\pi i t(a+b+c)}\,da\,db\,dc = (2p-1)^3.
\]
Thus, by rewriting $I_p'$ as
\[
\frac{1}{\pi^3}\,\mathrm{p.v.}\!\int_{\mathbb{R}^3} \frac{\operatorname{Re}(\varphi(a,b,c)) + \psi(a,b,c) - \psi(a,b,c)}{abc}\sin(2\pi t(a+b+c))\,da\,db\,dc,
\]
we obtain
\[
I_p' = (2p-1)^3 + \frac{1}{\pi^3}\,\mathrm{p.v.}\!\int_{\mathbb{R}^3} \frac{\operatorname{Re}(\varphi(a,b,c)) - \psi(a,b,c)}{abc}\sin(2\pi t(a+b+c))\,da\,db\,dc.
\]
Next, we rewrite $\sin(2\pi t(a+b+c))$ as
\[
\sin(2\pi ta)\sin(2\pi tb)\sin(2\pi tc) + \cos(2\pi ta)\cos(2\pi tb)\sin(2\pi tc) + \cos(2\pi ta)\sin(2\pi tb)\cos(2\pi tc) + \sin(2\pi ta)\cos(2\pi tb)\cos(2\pi tc).
\]
Recall
\[
\varphi(a,b,c) = \prod_{i=1}^d\big(1+\alpha_i^2(a^2+b^2+c^2)+2\alpha_i^3abc\,i\big)^{-1/2},\qquad
\psi(a,b,c) = \prod_i\big((1+\alpha_i^2a^2)(1+\alpha_i^2b^2)(1+\alpha_i^2c^2)\big)^{-1/2}.
\]
One may now verify that $\operatorname{Re}\big(\varphi(a,b,c)-\psi(a,b,c)\big)\frac{1}{abc}$ is an odd function. For a function $h$, we've defined $\Delta_c h(a,b,c) = h(a,b,c) + h(a,b,-c)$.
Thus,
\[
\Delta_c\Big(\frac{\operatorname{Re}(\varphi(a,b,c))-\psi(a,b,c)}{abc}\sin(2\pi ta)\cos(2\pi tb)\cos(2\pi tc)\Big) = 0.
\]
Looking at the principal value, we see that
\[
\mathrm{p.v.}\!\int_{\mathbb{R}^3} \frac{\operatorname{Re}(\varphi(a,b,c))-\psi(a,b,c)}{abc}\sin(2\pi ta)\cos(2\pi tb)\cos(2\pi tc)\,da\,db\,dc = 0,
\]
and the same can be said for the other similar terms. We are then left to consider an integrable function:
\[
I_p' - (2p-1)^3 = \int_{\mathbb{R}^3} \frac{\sin(2\pi ta)\sin(2\pi tb)\sin(2\pi tc)}{abc}\big(\operatorname{Re}(\varphi(a,b,c)) - \psi(a,b,c)\big)\,da\,db\,dc.
\]
By making the substitution $a' = \|\alpha\|_2 a$, $b' = \|\alpha\|_2 b$, $c' = \|\alpha\|_2 c$, and denoting $t' = \frac{t}{\|\alpha\|_2}$, the above equals
\[
\int_{\mathbb{R}^3} \frac{\sin(2\pi t'a')\sin(2\pi t'b')\sin(2\pi t'c')}{a'b'c'}\big(\operatorname{Re}(\varphi_1(a',b',c')) - \psi_1(a',b',c')\big)\,da'\,db'\,dc',
\]
where $\varphi_1$ and $\psi_1$ are as in Lemma 5. By Lemma 4, we know that $|t'| < |t_p| + \frac{k_p}{\|\alpha\|_2}$. Thus
\[
\sup_{(a',b',c')\in\mathbb{R}^3}\Big|\frac{\sin(2\pi t'a')\sin(2\pi t'b')\sin(2\pi t'c')}{a'b'c'}\Big| \le \Big(2\pi\Big(|t_p|+\frac{k_p}{\|\alpha\|_2}\Big)\Big)^3.
\]
And so
\[
|I_p' - (2p-1)^3| \le \Big(2\pi\Big(|t_p|+\frac{k_p}{\|\alpha\|_2}\Big)\Big)^3 \int_{\mathbb{R}^3} |\operatorname{Re}(\varphi_1(a',b',c')) - \psi_1(a',b',c')|\,da'\,db'\,dc'.
\]
Lemma 5 asserts that $\int_{\mathbb{R}^3}|\operatorname{Re}(\varphi_1)-\psi_1| \le C\big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\big)^{3+\varepsilon}$ for large enough $\|\alpha\|_2^2$. Thus,
\[
|I_p' - (2p-1)^3| \le \Big(2\pi\Big(|t_p|+\frac{k_p}{\|\alpha\|_2}\Big)\Big)^3 C\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{3+\varepsilon}.
\]
Since we've assumed $\alpha$ to be normalized as in (13), $\frac{\|\alpha\|_3}{\|\alpha\|_2}$ can be made as small as needed. The proof concludes by choosing $c_p > c_p'$ to be such that
\[
\Big(2\pi\Big(|t_p|+\frac{k_p}{\|\alpha\|_2}\Big)\Big)^3 C\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^{3+\varepsilon} < \delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3
\]
whenever $\|\alpha\|_2^2 > c_p$.

Combining Claims 6 and 7, we have thus established:

Lemma 8. Fix $p\in(0,1)$. There exist constants $\delta_p', c_p > 0$ depending only on $p$ such that whenever $\|\alpha\|_2^2 > c_p$,
\[
I_p \ge (2p-1)^3 + \delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
Now, by definition,
\[
P(\langle X_1,X_2\rangle > t_{p,\alpha}) = p \quad\text{and}\quad P(\langle X_1,X_2\rangle > t_{p,\alpha},\ \langle X_1,X_3\rangle > t_{p,\alpha}) = p^2.
\]
We note that Lemma 8, along with (1), produces
\[
(2p-1)^3 + \delta_p'\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3 \le 8P(E_p) - 12p^2 + 6p - 1.
\]
This proves the lower bound of Theorem 5:
\[
p^3 + \frac{\delta_p'}{8}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3 \le P(E_p).
\]

3.3 Upper bound

To finish the proof of Theorem 5 it remains to prove the upper bound. This is done in the following lemma.

Lemma 9. Let $p\in(0,1)$. Then
\[
P(E_p) - p^3 \le \Delta\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3,
\]
for a universal constant $\Delta > 0$.

Proof. The proof of this lemma will use the higher-dimensional analogue of the Berry-Esseen inequality. Define the random vector $V = (\langle X_1,X_2\rangle, \langle X_1,X_3\rangle, \langle X_2,X_3\rangle)$. It is straightforward to check that the covariance matrix of $V$ is $\|\alpha\|_2^2 I_3$, where $I_3$ is the identity matrix. We decompose $V$ into $V^i = (X_1^iX_2^i, X_1^iX_3^i, X_2^iX_3^i)$. Clearly $V = \sum_{i=1}^d V^i$ and, since $X_1^i, X_2^i, X_3^i$ are i.i.d. Gaussians,
\[
\mathbb{E}\|V^i\|^3 \le \sqrt{\mathbb{E}\big[\big((X_1^iX_2^i)^2+(X_1^iX_3^i)^2+(X_2^iX_3^i)^2\big)^3\big]} = \sqrt{3\mathbb{E}[(X_1^iX_2^i)^6] + 18\mathbb{E}[(X_1^i)^6(X_2^i)^4(X_3^i)^2] + 6\mathbb{E}[(X_1^i)^4(X_2^i)^4(X_3^i)^4]} \le 50\sqrt{\alpha_i^6} = 50\alpha_i^3.
\]
Thus, if $Z_3$ is a 3-dimensional standard Gaussian random vector, by (7) there is a constant $C_{be}$ such that for any convex set $K\subset\mathbb{R}^3$ we have
\[
|P(V/\|\alpha\|_2 \in K) - P(Z_3\in K)| \le 100\,C_{be}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
In particular, this holds for the convex set
\[
E_p = \Big\{(x,y,z)\in\mathbb{R}^3 \,\Big|\, x > \frac{t_{p,\alpha}}{\|\alpha\|_2},\ y > \frac{t_{p,\alpha}}{\|\alpha\|_2},\ z > \frac{t_{p,\alpha}}{\|\alpha\|_2}\Big\}.
\]
If we denote $p' = \Phi\big(\frac{t_{p,\alpha}}{\|\alpha\|_2}\big)$, as in Lemma 4, the above shows
\[
|P(V/\|\alpha\|_2 \in E_p) - p'^3| \le 100\,C_{be}\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
By Lemma 4, $|p-p'| \le 3\big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\big)^3$. Also,
\[
|p^3 - p'^3| = |p-p'|(p^2+pp'+p'^2) \le 9\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3.
\]
We then have
\[
|P(E_p) - p^3| \le |P(E_p) - p'^3| + |p^3 - p'^3| \le (9+100\,C_{be})\Big(\frac{\|\alpha\|_3}{\|\alpha\|_2}\Big)^3,
\]
as desired.
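The opening claim of the proof, that $\mathrm{Cov}(V) = \|\alpha\|_2^2 I_3$, is easy to confirm by simulation. A sketch under the convention of the proof of Lemma 4 (coordinate $i$ of each $X_j$ has variance $\alpha_i$; the spectrum below is a made-up example):

```python
import numpy as np

# Monte Carlo check that V = (<X1,X2>, <X1,X3>, <X2,X3>) has covariance
# ||alpha||_2^2 * I_3, where X1, X2, X3 are independent N(0, D_alpha)
# vectors and alpha_i is the variance of the i-th coordinate.
rng = np.random.default_rng(2)
alpha = np.array([1.0, 0.7, 0.4, 0.2])   # hypothetical eigenvalue vector
m = 400_000
x1, x2, x3 = (rng.standard_normal((m, 4)) * np.sqrt(alpha) for _ in range(3))
v = np.stack([(x1 * x2).sum(1), (x1 * x3).sum(1), (x2 * x3).sum(1)], axis=1)
cov = np.cov(v, rowvar=False)
assert np.allclose(cov, np.sum(alpha ** 2) * np.eye(3), atol=0.05)
```

The off-diagonal entries vanish because, e.g., $\mathrm{Cov}(\langle X_1,X_2\rangle, \langle X_1,X_3\rangle)$ averages out over the independent, mean-zero $X_2$ and $X_3$.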
4 Proof of Theorem 3

Recall from the introduction that $\tau(G)$ denotes the number of signed triangles of a graph $G$. If $A$ is the adjacency matrix of $G$ with entries $A_{i,j}$, we denote the centered adjacency matrix of $G$ by $\bar A$ with entries $\bar A_{i,j} := A_{i,j} - \mathbb{E}[A_{i,j}]$. Given three distinct vertices $i$, $j$ and $k$, the signed triangle induced by those 3 vertices is
\[
\tau_G(i,j,k) := \bar A_{i,j}\bar A_{i,k}\bar A_{j,k}.
\]
It then holds that for a graph $G = (V,E)$ the number of signed triangles is given by
\[
\tau(G) := \sum_{\{i,j,k\}\in\binom{V}{3}} \tau_G(i,j,k).
\]
Analysis of $\tau(G(n,p))$ was done in [2], where it was shown that $\mathbb{E}\tau(G(n,p)) = 0$ while $\operatorname{Var}(\tau(G(n,p))) \le n^3$. To prove Theorem 3 it will suffice to show that $\mathbb{E}\tau(G(n,p,\alpha))$ is asymptotically bigger than both the standard deviation of $\tau(G(n,p))$ and that of $\tau(G(n,p,\alpha))$, provided that $\big(\frac{\|\alpha\|_2}{\|\alpha\|_3}\big)^6 \ll n^3$. To this end we first prove some technical lemmas.

Lemma 10. Let $p\in(0,1)$. Then
\[
\mathbb{E}A_{1,2}A_{2,3} \le p^2 + 8\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4.
\]

Proof. Let $X, Y, Z$ be i.i.d. random vectors generated from $N(0,D_\alpha)$. Then, conditioning on $Y$ yields the expression
\[
\mathbb{E}A_{1,2}A_{2,3} = \mathbb{E}\big[P(\langle X,Y\rangle \ge t_{p,\alpha} \mid Y)\,P(\langle Z,Y\rangle \ge t_{p,\alpha} \mid Y)\big] = \mathbb{E}\Bigg[\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum \alpha_i Y_i^2}}\Bigg)^2\Bigg],
\]
where $\Phi$ is the standard Gaussian cumulative distribution function. By the same argument, we also have
\[
\mathbb{E}\Bigg[\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum\alpha_iY_i^2}}\Bigg)\Bigg]^2 = \mathbb{E}A_{1,2}\,\mathbb{E}A_{2,3} = p^2.
\]
Thus, it will be enough to show
\[
\operatorname{Var}\Bigg(\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum\alpha_iY_i^2}}\Bigg)\Bigg) \le 8\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4. \tag{29}
\]
For $\sigma^2 > 0$ denote by $G_{\sigma^2}$ a random variable with law $N(0,\sigma^2)$. We then have
\[
\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum\alpha_iY_i^2}}\Bigg) = P\big(G_{\sum\alpha_iY_i^2} \le t_{p,\alpha}\big),
\]
and
\[
\operatorname{Var}\Bigg(\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum\alpha_iY_i^2}}\Bigg)\Bigg) \le \mathbb{E}\Bigg[\Bigg(\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum\alpha_iY_i^2}}\Bigg) - \Phi\Big(\frac{t_{p,\alpha}}{\|\alpha\|_2}\Big)\Bigg)^2\Bigg] = \mathbb{E}\Big[\big(P(G_{\sum\alpha_iY_i^2}\le t_{p,\alpha}) - P(G_{\|\alpha\|_2^2}\le t_{p,\alpha})\big)^2\Big] \le \mathbb{E}\Big[\mathrm{TV}\big(G_{\sum\alpha_iY_i^2},\ G_{\|\alpha\|_2^2}\big)^2\Big].
\]
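The signed-triangle statistic defined above is simple to compute in practice: since $\bar A$ is symmetric with a zero diagonal, the sum over unordered triples equals $\operatorname{tr}(\bar A^3)/6$ (our own reformulation; the trace identity is standard, not taken from the paper):

```python
import numpy as np
from itertools import combinations

# Signed triangle count tau(G): sum over unordered vertex triples of
# Abar_ij * Abar_ik * Abar_jk, where Abar = A - p off the diagonal and 0
# on it.  Since Abar is symmetric with zero diagonal, every trace term with
# a repeated index vanishes, so tau(G) = trace(Abar^3) / 6.
def signed_triangles(adj, p):
    abar = adj - p
    np.fill_diagonal(abar, 0.0)
    return np.trace(abar @ abar @ abar) / 6.0

# Brute-force cross-check on a small random graph.
rng = np.random.default_rng(3)
a = (rng.random((7, 7)) < 0.4).astype(float)
a = np.triu(a, 1)
a = a + a.T                               # symmetric 0/1 adjacency matrix
p = 0.4
abar = a - p
np.fill_diagonal(abar, 0.0)
brute = sum(abar[i, j] * abar[i, k] * abar[j, k]
            for i, j, k in combinations(range(7), 3))
assert abs(signed_triangles(a, p) - brute) < 1e-9
```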
For the total variation distance between two Gaussian random variables we have the following bound (see Proposition 3.6.1 in [11], for example):
\[
\mathrm{TV}\big(G_{\sigma_1^2}, G_{\sigma_2^2}\big) \le \frac{2|\sigma_1^2-\sigma_2^2|}{\max(\sigma_1^2,\sigma_2^2)}.
\]
This implies
\[
\operatorname{Var}\Bigg(\Phi\Bigg(\frac{t_{p,\alpha}}{\sqrt{\sum\alpha_iY_i^2}}\Bigg)\Bigg) \le \frac{4}{\|\alpha\|_2^4}\,\mathbb{E}\Bigg[\Big(\sum\alpha_iY_i^2 - \|\alpha\|_2^2\Big)^2\Bigg].
\]
As $Y\sim N(0,D_\alpha)$, it is immediate to check
\[
\mathbb{E}\Big[\sum\alpha_iY_i^2\Big] = \sum\alpha_i^2 = \|\alpha\|_2^2.
\]
Hence
\[
\mathbb{E}\Big[\Big(\sum\alpha_iY_i^2 - \|\alpha\|_2^2\Big)^2\Big] = \sum\alpha_i^2\operatorname{Var}(Y_i^2) = 2\|\alpha\|_4^4.
\]
This establishes (29) and finishes the proof.

Lemma 11. Let $p\in(0,1)$. Then
\[
\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,2,4)] \le 80\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4.
\]

Proof. The proof is similar to Lemma 10 and uses the observation that if $V_1$ and $V_2$ are the random vectors corresponding to two vertices, then, conditioned on their values, the random variables $\tau_{G(n,p,\alpha)}(1,2,3)$ and $\tau_{G(n,p,\alpha)}(1,2,4)$ are independent. Thus
\[
\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,2,4)] = \mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,2,4) \mid V_1,V_2]\big] = \mathbb{E}\big[\mathbb{E}[\bar A_{1,3}\bar A_{2,3}\mid V_1,V_2]^2\,\bar A_{1,2}^2\big] \le \mathbb{E}\big[\mathbb{E}[\bar A_{1,3}\bar A_{2,3}\mid V_1,V_2]^2\big].
\]
Lemma 10 implies $\mathbb{E}[\bar A_{1,3}\bar A_{2,3}] \le 8\big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\big)^4$, so that
\[
\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,2,4)] \le \mathbb{E}\big[\mathbb{E}[\bar A_{1,3}\bar A_{2,3}\mid V_1,V_2]^2\big] \le \operatorname{Var}\big(\mathbb{E}[\bar A_{1,3}\bar A_{2,3}\mid V_1,V_2]\big) + 64\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^8. \tag{30}
\]
Note that if $X\sim N(0,D_\alpha)$,
\[
\begin{aligned}
\mathbb{E}[\bar A_{1,3}\bar A_{2,3}\mid V_1,V_2] &= (1-p)^2 P(\langle X,V_1\rangle\ge t_{p,\alpha},\ \langle X,V_2\rangle\ge t_{p,\alpha}) + p^2 P(\langle X,V_1\rangle< t_{p,\alpha},\ \langle X,V_2\rangle< t_{p,\alpha}) \\
&\quad - p(1-p)\big(P(\langle X,V_1\rangle\ge t_{p,\alpha},\ \langle X,V_2\rangle< t_{p,\alpha}) + P(\langle X,V_1\rangle< t_{p,\alpha},\ \langle X,V_2\rangle\ge t_{p,\alpha})\big).
\end{aligned}
\]
For any $v,u\in\mathbb{R}^d$, denote by $\Sigma_{v,u}$ the matrix given by
\[
\begin{pmatrix}\sum\alpha_iv_i^2 & \sum\alpha_iv_iu_i \\ \sum\alpha_iv_iu_i & \sum\alpha_iu_i^2\end{pmatrix};
\]
then the joint law of $(\langle X,v\rangle, \langle X,u\rangle)$ is $G_{v,u} := N(0,\Sigma_{v,u})$.
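The quoted Gaussian total-variation bound is easy to sanity-check by numerical integration; a rough check of our own (not part of the proof), computing $\mathrm{TV} = \frac12\int |f_1 - f_2|$ on a grid:

```python
import numpy as np

# Numerically verify TV(N(0, s1), N(0, s2)) <= 2|s1 - s2| / max(s1, s2)
# for a few variance pairs, approximating the integral with a Riemann sum
# on a wide uniform grid.
def gauss_pdf(x, var):
    return np.exp(-x * x / (2 * var)) / np.sqrt(2 * np.pi * var)

x = np.linspace(-30, 30, 600_001)
dx = x[1] - x[0]
for s1, s2 in ((1.0, 1.5), (1.0, 3.0), (2.0, 2.2)):
    tv = 0.5 * np.sum(np.abs(gauss_pdf(x, s1) - gauss_pdf(x, s2))) * dx
    bound = 2 * abs(s1 - s2) / max(s1, s2)
    assert 0 < tv <= bound
```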
The above can now be rewritten as
\[
\begin{aligned}
&(1-p)^2 P\big(G_{V_1,V_2}\in(t_{p,\alpha},\infty)\times(t_{p,\alpha},\infty)\big) + p^2 P\big(G_{V_1,V_2}\in(-\infty,t_{p,\alpha})\times(-\infty,t_{p,\alpha})\big) \\
&\quad - p(1-p)\big(P\big(G_{V_1,V_2}\in(-\infty,t_{p,\alpha})\times(t_{p,\alpha},\infty)\big) + P\big(G_{V_1,V_2}\in(t_{p,\alpha},\infty)\times(-\infty,t_{p,\alpha})\big)\big). \tag{31}
\end{aligned}
\]
In particular, if $M_i$ are independent random Wishart matrices with law $W_2(\alpha_i^2 I_2, 1)$ and $M = \sum M_i$, then the matrix $\Sigma_{V_1,V_2}$ has the same law as $M$. In this case we regard (31) as a function $h$ of the covariance $M$. Using (30), we get
\[
\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,2,4)] \le \operatorname{Var}(h(M)) + 64\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^8.
\]
It is thus enough to establish an upper bound for $\operatorname{Var}(h(M))$. For a positive semi-definite matrix $\Sigma$, we denote $G_\Sigma \sim N(0,\Sigma)$. As $|h(M)| \le 1$, we have the following inequality:
\[
\operatorname{Var}(h(M)) \le \mathbb{E}\big[(h(M) - h(\|\alpha\|_2^2 I_2))^2\big] \le \mathbb{E}\big[\mathrm{TV}\big(G_M, G_{\|\alpha\|_2^2 I_2}\big)^2\big] \le \mathbb{E}\big[\min\big(1,\ \mathrm{Ent}\big(G_M \,\|\, G_{\|\alpha\|_2^2 I_2}\big)\big)\big],
\]
where we have used Pinsker's inequality (8) to bound the total variation. The relative entropy between the Gaussians (see [5]) is given by
\[
\mathrm{Ent}\big(G_M \,\|\, G_{\|\alpha\|_2^2 I_2}\big) = \mathrm{Tr}\Big(\frac{M}{\|\alpha\|_2^2}\Big) - \ln\Big(\frac{\det M}{\det(\|\alpha\|_2^2 I_2)}\Big) - 2 = \mathrm{Tr}\Big(\frac{M}{\|\alpha\|_2^2} - I_2\Big) - \ln\Big(\det\Big(\frac{M}{\|\alpha\|_2^2}\Big)\Big).
\]
For any $x\ge\frac12$ we have the inequality $x - 1 - \ln(x) \le (x-1)^2$. So, if both eigenvalues of $\frac{M}{\|\alpha\|_2^2}$ are bigger than $\frac12$,
\[
\mathrm{Ent}\big(G_M \,\|\, G_{\|\alpha\|_2^2 I_2}\big) \le \Big\|\frac{M}{\|\alpha\|_2^2} - I_2\Big\|_{HS}^2.
\]
Otherwise, it is clear that
\[
1 \le 2\Big\|\frac{M}{\|\alpha\|_2^2} - I_2\Big\|_{HS}^2.
\]
Combining the above, we have established
\[
\operatorname{Var}(h(M)) \le 2\,\mathbb{E}\Big[\Big\|\frac{M}{\|\alpha\|_2^2} - I_2\Big\|_{HS}^2\Big]. \tag{32}
\]
Recall that the diagonal elements of $M$ are given by $\sum\alpha_i(V)_i^2$ where $V\sim N(0,D_\alpha)$; thus
\[
\mathbb{E}[M_{1,1}] = \mathbb{E}\Big[\sum\alpha_i(V_1)_i^2\Big] = \sum\alpha_i^2 = \|\alpha\|_2^2,
\]
and
\[
\operatorname{Var}(M_{1,1}) = \sum\operatorname{Var}\big(\alpha_i(V_1)_i^2\big) = 2\|\alpha\|_4^4.
\]
The off-diagonal element is given by $\sum\alpha_i(V_1)_i(V_2)_i$, for which we have $\mathbb{E}[M_{1,2}] = 0$ and
\[
\operatorname{Var}(M_{1,2}) = \mathbb{E}\Big[\Big(\sum\alpha_i(V_1)_i(V_2)_i\Big)^2\Big] = \sum\alpha_i^2\,\mathbb{E}[(V_1)_i^2]\,\mathbb{E}[(V_2)_i^2] = \|\alpha\|_4^4.
\]
Using these estimates in (32) we obtain
\[
\operatorname{Var}(h(M)) \le 16\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4.
\]
Plugging this into (30) gives the desired result.

Lemma 12. Let $p\in(0,1)$. Then
\[
\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,4,5)] \le 80\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4.
\]

Proof. Conditioned on the location of the vertex $V_1$, the random variables $\tau_{G(n,p,\alpha)}(1,2,3)$ and $\tau_{G(n,p,\alpha)}(1,4,5)$ are independent and identically distributed; thus
\[
\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,4,5)] = \mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\,\tau_{G(n,p,\alpha)}(1,4,5)\mid V_1]\big] = \mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\mid V_1]^2\big].
\]
Using the tower property of conditional expectation,
\[
\mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\mid V_1]^2\big] = \mathbb{E}\big[\mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\mid V_1,V_2]\mid V_1\big]^2\big] \le \mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\mid V_1,V_2]^2\big].
\]
But in Lemma 11, using (30), we have essentially shown
\[
\mathbb{E}\big[\mathbb{E}[\tau_{G(n,p,\alpha)}(1,2,3)\mid V_1,V_2]^2\big] \le 80\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4,
\]
and thus the claim is proven.

Towards the proof of Theorem 3 we now estimate $\mathbb{E}\tau(G(n,p,\alpha))$. Note that since
\[
\mathbb{E}\tau(G(n,p,\alpha)) = \binom{n}{3}\mathbb{E}\tau_{G(n,p,\alpha)}(1,2,3),
\]
it is enough to estimate $\mathbb{E}\tau_{G(n,p,\alpha)}(1,2,3)$:
\[
\begin{aligned}
\mathbb{E}\tau_{G(n,p,\alpha)}(1,2,3) &= \mathbb{E}\bar A_{1,2}\bar A_{1,3}\bar A_{2,3} = \mathbb{E}(A_{1,2}-p)(A_{1,3}-p)(A_{2,3}-p) \\
&= \mathbb{E}A_{1,2}A_{1,3}A_{2,3} - p\big(\mathbb{E}A_{1,2}A_{2,3}+\mathbb{E}A_{1,2}A_{1,3}+\mathbb{E}A_{1,3}A_{2,3}\big) + p^2\big(\mathbb{E}A_{1,2}+\mathbb{E}A_{1,3}+\mathbb{E}A_{2,3}\big) - p^3 \\
&\ge \mathbb{E}A_{1,2}A_{1,3}A_{2,3} - p^3 - 24p\Big(\frac{\|\alpha\|_4}{\|\alpha\|_2}\Big)^4, \tag{33}
\end{aligned}
\]
where the inequality follows from the fact that $\mathbb{E}A_{i,j} = p$ and Lemma 10.
As  k α k 4 k α k 2  4 ≤  k α k 3 k α k 2  3 1 k α k 2 , as l ong as k α k 2 is lar ge enough, the l ower bound of Theorem 5 yields E τ G ( n,p,α ) (1 , 2 , 3) ≥ δ p  k α k 3 k α k 2  3 for a constant δ p > 0 , depending only on p . This shows E τ ( G ( n, p, α )) ≥ δ p  n 3   k α k 3 k α k 2  3 . T o b o und from above t h e variance of τ ( G ( n, p, α )) we obs erve that τ G ( i, j, k ) is independent from τ G ( i ′ , j ′ , k ′ ) whene ver |{ i, j, k } ∩ { i ′ , j ′ , k ′ }| = 0 , thus V ar ( τ ( G ( n, p, α ))) = X { i,j,k } X { i ′ ,j ′ ,k ′ } E  τ G ( n,p,α ) ( i, j, k ) τ G ( n,p,α ) ( i ′ , j ′ , k ′ )  − E  τ G ( n,p,α ) ( i, j, k )  E  τ G ( n,p,α ) ( i ′ , j ′ , k ′ )  ≤ X { i,j,k } E  τ G ( n,p,α ) ( i, j, k ) τ G ( n,p,α ) ( i, j, k )  + X { i,j,k ,l } E  τ G ( n,p,α ) ( i, j, k ) τ G ( n,p,α ) ( i, j, l )  + X { i,j,k ,l ,m } E  τ G ( n,p,α ) ( i, j, k ) τ G ( n,p,α ) ( k , l, m )  =  n 3  E [ τ G ( n,p,α ) (1 , 2 , 3) τ G ( n,p,α ) (1 , 2 , 3)] +  n 4  4 2  E [ τ G ( n,p,α ) (1 , 2 , 3) τ G ( n,p,α ) (1 , 2 , 4)] + 5  n 5  E [ τ G ( n,p,α ) (1 , 2 , 3) τ G ( n,p,α ) (1 , 4 , 5)] . Noting that E [ τ G ( n,p,α ) (1 , 2 , 3) τ G ( n,p,α ) (1 , 2 , 3)] ≤ 1 , in conjunction wit h Lemmas 11 and 12 yields V ar( τ ( G ( n, p, α ))) ≤ n 3 + 80 n 5  k α k 4 k α k 2  4 . Combining all of the above E [ τ ( G ( n, p ))] = 0 , E [ τ ( G ( n, p, α ))] ≥ δ p  n 3   k α k 3 k α k 2  3 , and max { V ar( τ ( G ( n, p, α ))) , V ar( G ( n, p )) } ≤ n 3 + 80 n 5  k α k 4 k α k 2  4 . Chebyshev’ s inequali t y implies th at P  τ ( G ( n, p, α )) ≤ 1 2 E [ τ ( G ( n, p, α ))]  ≤ 200  k α k 2 k α k 3  6 n 3 + 80 n 5 k α k 2 2 k α k 4 4 k α k 6 3 δ 2 p n 6 , 30 and also P  τ ( G ( n, p )) ≥ 1 2 E [ τ ( G ( n, p, α ))]  ≤ 200  k α k 2 k α k 3  6 n 3 + 80 n 5 k α k 2 2 k α k 4 4 k α k 6 3 δ 2 p n 6 . Note that due to the normali zati o n ( 13 ) k α k 2 2 k α k 4 4 k α k 6 3 ≤ k α k 2 2 k α k 2 3 . 
Putting the above expressions together we thus have

$$\mathrm{TV}\left(\tau(G(n,p,\alpha)),\, \tau(G(n,p))\right) \ge 1 - C\,\frac{\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^6}{n^3} - C\,\frac{\left(\frac{\|\alpha\|_2}{\|\alpha\|_3}\right)^2}{n},$$

for a constant $C$ depending only on $p$. This concludes the proof of Theorem 3. $\square$

5 Proof of the lower bound

As stated in the introduction, we can view $G(n,p,\alpha)$ as a function of an appropriate random matrix, as follows. Let $Y$ be a random $n \times d$ matrix with rows sampled i.i.d. from $N(0, D_\alpha)$. Define

$$W = W(n,\alpha) = YY^T/\|\alpha\|_2 - \mathrm{diag}\left(YY^T/\|\alpha\|_2\right).$$

Note that for $i \ne j$, $W_{ij} = \langle\gamma_i, \gamma_j\rangle/\|\alpha\|_2$, where $\gamma_i, \gamma_j$ are the rows of $Y$. Thus the $n \times n$ matrix $A$ defined as

$$A_{i,j} = \begin{cases} 1 & \text{if } W_{ij} \ge t_{p,\alpha}/\|\alpha\|_2 \text{ and } i \ne j,\\ 0 & \text{otherwise}\end{cases}$$

has the same law as the adjacency matrix of $G(n,p,\alpha)$. Denote the map that takes $W$ to $A$ by $H_{p,\alpha}$, i.e., $A = H_{p,\alpha}(W)$.

Similarly, we may view $G(n,p)$ as a function of an $n \times n$ matrix with independent Gaussian entries. Let $M(n)$ be a symmetric $n \times n$ random matrix with $0$ entries on the diagonal, and whose entries above the diagonal are i.i.d. standard normal random variables. If $\Phi$ is the cumulative distribution function of the standard Gaussian, then the $n \times n$ matrix $B$, defined as

$$B_{i,j} = \begin{cases} 1 & \text{if } M(n)_{ij} \ge \Phi^{-1}(p) \text{ and } i \ne j,\\ 0 & \text{otherwise}\end{cases}$$

has the same law as the adjacency matrix of $G(n,p)$. Denote the map that takes $M(n)$ to $B$ by $K_p$, i.e., $B = K_p(M(n))$.

Using the triangle inequality and the previous two paragraphs, we have that for any $p \in (0,1)$,

$$\begin{aligned}
\mathrm{TV}\left(G(n,p),\, G(n,p,\alpha)\right) &= \mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(W(n,\alpha))\right)\\
&\le \mathrm{TV}\left(H_{p,\alpha}(M(n)),\, H_{p,\alpha}(W(n,\alpha))\right) + \mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(M(n))\right)\\
&\le \mathrm{TV}\left(M(n),\, W(n,\alpha)\right) + \mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(M(n))\right).
\end{aligned}$$

The second term is of lower order and will be dealt with later. The first term is bounded using Pinsker's inequality (8), yielding

$$\mathrm{TV}\left(M(n),\, W(n,\alpha)\right) \le \sqrt{\tfrac12\,\mathrm{Ent}\left[W(n,\alpha)\,\big\|\, M(n)\right]}.$$
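The reduction above realizes $G(n,p,\alpha)$ as a deterministic threshold function $H_{p,\alpha}$ of the Wishart-type matrix $W$. A minimal simulation sketch follows (our own code; the threshold argument `t` is a hypothetical stand-in for $t_{p,\alpha}/\|\alpha\|_2$, whose exact value depends on $p$ and $\alpha$):

```python
import numpy as np

def sample_anisotropic_rgg(n, alpha, t, rng):
    """Sketch of the construction above: rows of Y are i.i.d. N(0, D_alpha),
    W = YY^T/||alpha||_2 off the diagonal, and vertices i ~ j when W_ij >= t.
    Here t stands in for t_{p,alpha}/||alpha||_2."""
    Y = rng.standard_normal((n, len(alpha))) * np.sqrt(alpha)  # rows ~ N(0, D_alpha)
    W = Y @ Y.T / np.linalg.norm(alpha)
    A = (W >= t).astype(int)
    np.fill_diagonal(A, 0)  # no self-loops
    return A
```

The output is, by construction, a symmetric 0/1 matrix with zero diagonal, i.e. an adjacency matrix.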
We'll use a similar argument to the one presented in [3], which follows an inductive proof using the chain rule for relative entropy. We observe that a sample of $W(n+1,\alpha)$ may be constructed from $W(n,\alpha)$ by adjoining the column vector (and symmetrically the row vector) $Y\tilde Y/\|\alpha\|_2$, where $\tilde Y \sim N(0, D_\alpha)$ is independent of $Y$. Thus, using the notation $Z_n$ for a standard Gaussian in $\mathbb{R}^n$, by (9) we obtain

$$\mathrm{Ent}\left[W(n+1,\alpha)\,\big\|\, M(n+1)\right] = \mathrm{Ent}\left[W(n,\alpha)\,\big\|\, M(n)\right] + \mathbb{E}_Y\,\mathrm{Ent}\left[Y\tilde Y/\|\alpha\|_2 \,\big|\, W(n,\alpha)\,\big\|\, Z_n\right].$$

Since $W(n,\alpha)$ is a function of $Y$, standard properties of relative entropy (see [4], Chapter 2) show

$$\mathbb{E}_Y\,\mathrm{Ent}\left[Y\tilde Y/\|\alpha\|_2 \,\big|\, W(n,\alpha)\,\big\|\, Z_n\right] = \mathbb{E}_Y\,\mathrm{Ent}\left[Y\tilde Y/\|\alpha\|_2 \,\big|\, YY^T/\|\alpha\|_2\,\big\|\, Z_n\right] \le \mathbb{E}_Y\,\mathrm{Ent}\left[Y\tilde Y/\|\alpha\|_2 \,\big|\, Y\,\big\|\, Z_n\right].$$

Note that $Y\tilde Y/\|\alpha\|_2 \mid Y$ is distributed as $N\!\left(0, \frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)$. The relative entropy between two $n$-dimensional Gaussians $N_1 \sim N(0,\Sigma_1)$, $N_2 \sim N(0,\Sigma_2)$ is given by (see [5])

$$\mathrm{Ent}\left[N_1\,\|\,N_2\right] = \frac12\left(\mathrm{tr}\left(\Sigma_2^{-1}\Sigma_1\right) + \ln\left(\frac{\det\Sigma_2}{\det\Sigma_1}\right) - n\right).$$

In our case $\Sigma_2 = I_n$ and $\mathbb{E}_Y\,\mathrm{tr}\left(Y D_\alpha Y^T\right) = n\|\alpha\|_2^2$. Thus the following holds:

$$\mathbb{E}_Y\,\mathrm{Ent}\left[\tfrac{1}{\|\alpha\|_2}\, Y\tilde Y \,\big|\, Y\,\big\|\, Z_n\right] = -\frac12\,\mathbb{E}_Y\left[\ln\det\left(\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)\right].$$

Theorem 4 is then implied by the following lemma:

Lemma 13.

$$-\mathbb{E}_Y\,\ln\det\left(\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right) \le C\left(n^2\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4 + \sqrt{n\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4}\right)$$

for a universal constant $C > 0$.

The proof will follow similar lines as Lemma 2 in [3]. Namely, we will decompose the expectation on the event that the smallest eigenvalue of $\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T$, denoted by $\lambda_{\min}$, is larger than $\frac12$. Lemma 13 will then follow from the following two claims:

Claim 14.

$$-\mathbb{E}_Y\left[\ln\det\left(\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)\mathbf{1}_{\{\lambda_{\min}\ge\frac12\}}\right] \le C\left(n^2\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4 + \sqrt{n\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4}\right),$$

for a universal constant $C > 0$.

Proof.
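The closed form for the relative entropy between centered Gaussians quoted above is straightforward to check numerically. The function below is our own sketch of that formula; for diagonal $\Sigma_1 = \mathrm{diag}(v)$ and $\Sigma_2 = I_n$ it reduces to $\frac12\sum_i (v_i - \ln v_i - 1)$.

```python
import numpy as np

def gaussian_relative_entropy(S1, S2):
    """Ent[N(0,S1) || N(0,S2)] = (tr(S2^{-1} S1) + ln(det S2 / det S1) - n) / 2,
    the closed form quoted above (cf. [5])."""
    n = S1.shape[0]
    return 0.5 * (np.trace(np.linalg.inv(S2) @ S1)
                  + np.log(np.linalg.det(S2) / np.linalg.det(S1)) - n)
```

With $\Sigma_2 = I_n$, the trace term has expectation $n$ precisely because $\mathbb{E}_Y\,\mathrm{tr}(Y D_\alpha Y^T) = n\|\alpha\|_2^2$, so only the $-\frac12\,\mathbb{E}\ln\det$ term survives, which is what Lemma 13 bounds.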
We first use the inequality $-\ln(x) \le 1 - x + (1-x)^2$, valid for $x \ge \frac12$:

$$-\mathbb{E}_Y\left[\ln\det\left(\frac{Y D_\alpha Y^T}{\|\alpha\|_2^2}\right)\mathbf{1}_{\{\lambda_{\min}\ge\frac12\}}\right] \le \mathbb{E}_Y\left[\left|\mathrm{tr}\left(I_n - \frac{Y D_\alpha Y^T}{\|\alpha\|_2^2}\right)\right| + \left\|I_n - \frac{Y D_\alpha Y^T}{\|\alpha\|_2^2}\right\|_{HS}^2\right], \qquad (34)$$

where $\|\cdot\|_{HS}$ denotes the Hilbert-Schmidt norm.

Before proceeding, we first calculate several quantities. For $1 \le j \le n$ denote by $A_j$ the $j$th row of $Y\sqrt{D_\alpha}$, with entries $\{\sqrt{\alpha_i}\, y_{j,i}\}_{i=1}^d$.

1. The expected squared norm of $A_j$ is given by $\mathbb{E}\|A_j\|^2 = \sum_i \mathbb{E}\left[\alpha_i y_{j,i}^2\right] = \sum_i \alpha_i^2 = \|\alpha\|_2^2$, since $y_{j,i}$ is a centred Gaussian with variance $\alpha_i$.

2. When $j \ne k$, $A_j$ and $A_k$ are independent, and so $\mathbb{E}\|A_j\|^2\|A_k\|^2 = \left(\sum_i \alpha_i^2\right)^2 = \|\alpha\|_2^4$.

3. When $j \ne k$, the expected squared inner product between two rows is given by

$$\mathbb{E}\langle A_j, A_k\rangle^2 = \mathbb{E}\left(\sum_{i=1}^d \alpha_i y_{j,i} y_{k,i}\right)^2 = \sum_{i=1}^d \alpha_i^2\,\mathbb{E}\left[y_{j,i}^2 y_{k,i}^2\right] + \sum_{i_1 \ne i_2} \alpha_{i_1}\alpha_{i_2}\,\mathbb{E}\left[y_{j,i_1} y_{k,i_1} y_{j,i_2} y_{k,i_2}\right] = \sum_{i=1}^d \alpha_i^4 = \|\alpha\|_4^4.$$

4. The expected 4th power of the norm is given by

$$\mathbb{E}\|A_j\|^4 = \mathbb{E}\left(\sum_i \alpha_i y_{j,i}^2\right)^2 = \sum_i \alpha_i^2\,\mathbb{E}\left[y_{j,i}^4\right] + \sum_{i \ne k} \alpha_i\alpha_k\,\mathbb{E}\left[y_{j,i}^2 y_{j,k}^2\right] \le 3\sum_i \alpha_i^4 + \left(\sum_i \alpha_i^2\right)^2 = 3\|\alpha\|_4^4 + \|\alpha\|_2^4,$$

recalling that the 4th moment of a centred Gaussian with variance $\alpha_i$ is $3\alpha_i^2$.

We turn to bound each term of the sum (34):

$$\begin{aligned}
\mathbb{E}_Y\left|\mathrm{tr}\left(I_n - \frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)\right| &\le \sqrt{\mathbb{E}_Y\,\mathrm{tr}^2\left(I_n - \frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)} = \sqrt{\mathbb{E}_Y\left(\sum_{j=1}^n\left(1 - \frac{\|A_j\|^2}{\|\alpha\|_2^2}\right)\right)^2}\\
&= \sqrt{\mathbb{E}_Y\left(n^2 - \frac{2n}{\|\alpha\|_2^2}\sum_{j=1}^n\|A_j\|^2 + \frac{1}{\|\alpha\|_2^4}\sum_{j\ne k}\|A_j\|^2\|A_k\|^2 + \frac{1}{\|\alpha\|_2^4}\sum_{j=1}^n\|A_j\|^4\right)}\\
&\le \sqrt{n^2 - 2n^2 + 2\binom{n}{2} + \frac{n}{\|\alpha\|_2^4}\left(3\|\alpha\|_4^4 + \|\alpha\|_2^4\right)} = \sqrt{3n\,\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4}}.
\end{aligned}$$

Similarly, we may deal with the second term:

$$\begin{aligned}
\mathbb{E}_Y\left\|I_n - \frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right\|_{HS}^2 &= \left(\sum_{k,j}\frac{1}{\|\alpha\|_2^4}\,\mathbb{E}_Y\langle A_j, A_k\rangle^2\right) - n = \frac{1}{\|\alpha\|_2^4}\sum_{j=1}^n \mathbb{E}_Y\|A_j\|^4 + \frac{1}{\|\alpha\|_2^4}\sum_{j\ne k}\mathbb{E}_Y\langle A_j, A_k\rangle^2 - n\\
&\le \frac{n}{\|\alpha\|_2^4}\left(3\|\alpha\|_4^4 + \|\alpha\|_2^4\right) + \frac{2}{\|\alpha\|_2^4}\binom{n}{2}\|\alpha\|_4^4 - n = 3n\,\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4} + (n^2 - n)\,\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4} \le 3n^2\,\frac{\|\alpha\|_4^4}{\|\alpha\|_2^4}.
\end{aligned}$$

Combining (34) with the last two displays gives

$$-\mathbb{E}_Y\left[\ln\det\left(\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)\mathbf{1}_{\{\lambda_{\min}\ge\frac12\}}\right] \le 3\left(n^2\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4 + \sqrt{n\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4}\right). \qquad \square$$

Claim 15.

$$-\mathbb{E}_Y\left[\ln\det\left(\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)\mathbf{1}_{\{\lambda_{\min} < 1/2\}}\right] < n\exp\left(-C\|\alpha\|_2^2\right),$$

for a universal constant $C > 0$.

Proof. Observe that for any $\xi \in (0, \frac12)$:

$$-\mathbb{E}_Y\left[\ln\det\left(\frac{Y D_\alpha Y^T}{\|\alpha\|_2^2}\right)\mathbf{1}_{\{\lambda_{\min} < 1/2\}}\right] \le n\,\mathbb{E}\left[-\ln(\lambda_{\min})\,\mathbf{1}_{\{\lambda_{\min} < 1/2\}}\right] = n\int_{\ln(2)}^\infty \mathbb{P}\left(-\ln(\lambda_{\min}) > t\right)dt = n\int_0^{1/2}\frac1s\,\mathbb{P}\left(\lambda_{\min} < s\right)ds \le \frac{n}{\xi}\,\mathbb{P}\left(\lambda_{\min} < 1/2\right) + n\int_0^\xi \frac1s\,\mathbb{P}\left(\lambda_{\min} < s\right)ds. \qquad (35)$$

By allowing $\xi$ to be some small constant, we'll need to bound $\mathbb{P}(\lambda_{\min} < 1/2)$ and $\mathbb{P}(\lambda_{\min} < s)$ for small $s$. Recall that for any $s$, $\lambda_{\min} < s$ implies the existence of $\theta \in S^{n-1}$ such that $\theta^T\frac{Y D_\alpha Y^T}{\|\alpha\|_2^2}\theta < s$, or equivalently $\left\|\sqrt{D_\alpha}\, Y^T\theta\right\|^2 < s\|\alpha\|_2^2$. Also, if $\theta$ is such that $\left\|\frac{\sqrt{D_\alpha}\, Y^T\theta}{\|\alpha\|_2}\right\| < \sqrt s$, then for any $\theta' \in S^{n-1}$,

$$\left\|\frac{\sqrt{D_\alpha}\, Y^T\theta'}{\|\alpha\|_2}\right\| < \sqrt s + \sqrt{\lambda_{\max}}\,\|\theta - \theta'\|,$$

where $\lambda_{\max}$ is the largest eigenvalue of $\frac{Y D_\alpha Y^T}{\|\alpha\|_2^2}$.

We will first bound $\mathbb{P}(\lambda_{\min} < 1/2)$, using an $\varepsilon$-net argument. Note that for each $\theta$, $\sqrt{D_\alpha}\, Y^T\theta$ is distributed as $N(0, D_\alpha^2)$. Consider the Euclidean metric on $S^{n-1}$ and let $0 < \varepsilon < 1$. We may cover $S^{n-1}$ with $\left(\frac{3}{\varepsilon}\right)^n$ balls of radius $\varepsilon$ (see Lemma 2.3.4 in [14], for example) to achieve

$$\mathbb{P}\left(\lambda_{\min} < 1/2\right) \le \left(\frac{3}{\varepsilon}\right)^n \mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| < \sqrt{\tfrac{1.1}{2}\|\alpha\|_2^2}\right) + \mathbb{P}\left(\sqrt{\lambda_{\max}} > \frac{0.1}{\sqrt2\,\varepsilon}\right). \qquad (36)$$

To bound $\mathbb{P}\left(\sqrt{\lambda_{\max}} > \frac{0.1}{\sqrt2\,\varepsilon}\right)$ we will use another $\varepsilon$-net, with $\varepsilon = \frac12$.
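Stepping back, the moment identities computed in items 1-4 above can be checked by simulation. In the sketch below (our own code), a row $A_j$ of $Y\sqrt{D_\alpha}$ is represented coordinate-wise as $\alpha_i g_i$ with $g_i$ standard normal, which has the same law $N(0, D_\alpha^2)$; note the exact value of $\mathbb{E}\|A_j\|^4$ is $2\|\alpha\|_4^4 + \|\alpha\|_2^4$, consistent with the upper bound $3\|\alpha\|_4^4 + \|\alpha\|_2^4$ used in the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.linspace(0.1, 1.0, 30)  # illustrative eigenvalue vector
N = 200_000                        # Monte Carlo sample size

# N independent copies of two rows A_j, A_k (coordinate i ~ N(0, alpha_i^2))
A = alpha * rng.standard_normal((N, alpha.size))
B = alpha * rng.standard_normal((N, alpha.size))

sq = (alpha ** 2).sum()  # ||alpha||_2^2
qu = (alpha ** 4).sum()  # ||alpha||_4^4

est_norm2 = (A ** 2).sum(axis=1).mean()         # should approach ||alpha||_2^2
est_inner2 = ((A * B).sum(axis=1) ** 2).mean()  # should approach ||alpha||_4^4
est_norm4 = ((A ** 2).sum(axis=1) ** 2).mean()  # should approach 2||alpha||_4^4 + ||alpha||_2^4
```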
Along with the fact that $\|\theta - \theta'\| \le \frac12$ implies $\left\|\frac{\sqrt{D_\alpha}\, Y^T(\theta - \theta')}{\|\alpha\|_2}\right\| \le \frac{\sqrt{\lambda_{\max}}}{2}$, we may see that

$$\mathbb{P}\left(\sqrt{\lambda_{\max}} > \frac{0.1}{\sqrt2\,\varepsilon}\right) \le 6^n\,\mathbb{P}\left(\left\|\sqrt{D_\alpha}\, Y^T\theta\right\|^2 > \frac{0.01\,\|\alpha\|_2^2}{4\varepsilon^2}\right) = 6^n\,\mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| > \sqrt{\frac{0.01}{4\varepsilon^2}\|\alpha\|_2^2}\right). \qquad (37)$$

But, for any $x > 0$:

$$\mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| > \sqrt{x\|\alpha\|_2^2}\right) = \mathbb{P}\left(\sum_i \alpha_i^2\chi_i^2 > x\|\alpha\|_2^2\right),$$

where the $\chi_i^2$ are i.i.d. chi-squared random variables with 1 degree of freedom. Observe that $\mathbb{E}\left[\alpha_i^2\chi_i^2\right] = \alpha_i^2$. We may now utilize the sub-exponential tail of the $\chi^2$ distribution and apply (5) with $v_i = \alpha_i^2$, noting that, by the normalization (13), $\|\alpha\|_\infty = 1$. Thus, provided that $x > 3$,

$$\mathbb{P}\left(\sum_i \alpha_i^2\chi_i^2 > x\|\alpha\|_2^2\right) \le \mathbb{P}\left(\left|\sum_i \alpha_i^2\chi_i^2 - \|\alpha\|_2^2\right| > (x-1)\|\alpha\|_2^2\right) \le 2\exp\left(-\min\left(\frac{x-1}{2}\|\alpha\|_2^2,\ \frac{(x-1)^2}{4}\|\alpha\|_2^2\right)\right) \le 2\exp\left(-\|\alpha\|_2^2\right). \qquad (38)$$

Substituting $x = \frac{0.01}{4\varepsilon^2}$ in (37) shows that when $\frac{0.01}{4\varepsilon^2} > 3$,

$$\mathbb{P}\left(\sqrt{\lambda_{\max}} > \frac{0.1}{\sqrt2\,\varepsilon}\right) \le 6^n\exp\left(-\|\alpha\|_2^2\right).$$

The exact same considerations as in (38) also show that

$$\mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| < \sqrt{\tfrac{1.1}{2}\|\alpha\|_2^2}\right) \le \mathbb{P}\left(\left|\sum_i \alpha_i^2\chi_i^2 - \|\alpha\|_2^2\right| > \frac{0.9}{2}\|\alpha\|_2^2\right) \le 2\exp\left(-\frac{0.9^2}{16}\|\alpha\|_2^2\right) \le 2\exp\left(-\frac{\|\alpha\|_2^2}{20}\right).$$

Plugging the above two displays into (36), when $\varepsilon$ is small enough, yields

$$\mathbb{P}\left(\lambda_{\min} < 1/2\right) \le 2\left(\frac{3}{\varepsilon}\right)^n e^{-\frac{\|\alpha\|_2^2}{20}} + 2\cdot 6^n e^{-\|\alpha\|_2^2} \le 4\exp\left(\frac{3n}{\varepsilon} - \frac{\|\alpha\|_2^2}{20}\right). \qquad (39)$$

For general $0 < s < 1/2$, in a similar fashion to (36), using an $s$-net gives the bound

$$\mathbb{P}\left(\lambda_{\min} < s\right) \le \left(\frac{3}{s}\right)^n \mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| < \sqrt{1.1\, s\,\|\alpha\|_2^2}\right) + \mathbb{P}\left(\sqrt{\lambda_{\max}} > 0.1/\sqrt s\right). \qquad (40)$$

Now, $N(0, D_\alpha^2)$ can be written as $D_\alpha Z_d$, where $Z_d$ is a standard Gaussian $d$-dimensional vector. In [10, Proposition 2.6] it was shown that there exist universal constants $C_L, C' > 0$ such that for any $t < C'$:

$$\mathbb{P}\left(\|D_\alpha Z_d\| < t\|D_\alpha\|_{HS}\right) \le \exp\left(C_L\ln(t)\left(\frac{\|D_\alpha\|_{HS}}{\|D_\alpha\|_{op}}\right)^2\right) = \exp\left(C_L\ln(t)\|\alpha\|_2^2\right) = t^{C_L\|\alpha\|_2^2},$$

with the equality stemming from the facts that $\|D_\alpha\|_{HS} = \|\alpha\|_2$ and $\|D_\alpha\|_{op} = \|\alpha\|_\infty = 1$. Thus

$$\mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| < \sqrt{1.1\, s\,\|\alpha\|_2^2}\right) \le 2\, s^{\frac{C_L}{2}\|\alpha\|_2^2}. \qquad (41)$$

By revisiting (37) and replacing $\sqrt2\,\varepsilon$ with $\sqrt s$, we note that for small $s$,

$$\mathbb{P}\left(\sqrt{\lambda_{\max}} > 0.1/\sqrt s\right) \le 6^n\,\mathbb{P}\left(\left\|N(0, D_\alpha^2)\right\| > \sqrt{\frac{0.01}{2s}\|\alpha\|_2^2}\right) \le 6^n\,\mathbb{P}\left(\left|\sum_i \alpha_i^2\chi_i^2 - \|\alpha\|_2^2\right| > \left(\frac{0.01}{2s} - 1\right)\|\alpha\|_2^2\right).$$

And, provided that $s \le \frac{0.01}{4}$, (38) shows

$$\mathbb{P}\left(\sqrt{\lambda_{\max}} > 0.1/\sqrt s\right) \le 6^n\exp\left(-\frac{1}{2s}\left(\frac{0.01}{2} - s\right)\|\alpha\|_2^2\right) \le 6^n e^{-\frac{0.01\|\alpha\|_2^2}{4s}}. \qquad (42)$$

By using (42) and (41) to bound (40), we obtain

$$\mathbb{P}\left(\lambda_{\min} < s\right) \le 2\left(\frac{3}{s}\right)^n s^{\frac{C_L}{2}\|\alpha\|_2^2} + \exp\left(2n - \frac{0.01\|\alpha\|_2^2}{4s}\right), \qquad \forall\, s \le \frac{0.01}{4}.$$

We have thus shown, by combining (39) together with the last inequality into (35) and choosing $\xi$ to be a small enough constant:

$$\frac{n}{\xi}\,\mathbb{P}\left(\lambda_{\min} < 1/2\right) + n\int_0^\xi \frac1s\,\mathbb{P}\left(\lambda_{\min} < s\right)ds \le \frac{12n}{\xi}\exp\left(\frac{3n}{\varepsilon} - \frac{\|\alpha\|_2^2}{20}\right) + n\int_0^\xi\left(3^n s^{\frac{C_L}{2}(\|\alpha\|_2^2 - n - 1)} + \frac1s\, e^{2n - \frac{0.01\|\alpha\|_2^2}{4s}}\right)ds.$$

Assuming that $\xi \le \frac1e$ and that $\|\alpha\|_2^2 > n + 1$,

$$n\int_0^\xi 3^n s^{\frac{C_L}{2}(\|\alpha\|_2^2 - n - 1)}\,ds \le n\, 3^n\, \xi^{\frac{C_L}{2}(\|\alpha\|_2^2 - n)} \le n\, e^{\frac{C_L}{2}(n - \|\alpha\|_2^2) + 2n},$$

$$n\int_0^\xi \frac1s\, e^{2n - \frac{0.01\|\alpha\|_2^2}{4s}}\,ds \le n\, e^{2n}\int_0^\xi e^{-\frac{0.01\|\alpha\|_2^2}{8s}}\,ds \le n\, e^{2n}\,\xi\, e^{-\frac{0.01\|\alpha\|_2^2}{8\xi}}.$$

To obtain the desired result we observe that if $n^3\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4 \to 0$ then $\left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^4 \gg n^3$. The inequality $\|\alpha\|_2^2 \ge \left(\frac{\|\alpha\|_2}{\|\alpha\|_4}\right)^{4/3}$ then implies $\|\alpha\|_2^2 \gg n$, which shows the existence of a constant $C > 0$ for which

$$-\mathbb{E}_Y\left[\ln\det\left(\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T\right)\mathbf{1}_{\{\lambda_{\min} < 1/2\}}\right] < n\exp\left(-C\|\alpha\|_2^2\right). \qquad \square$$

To finish the proof of Theorem 2 (b) we must now deal with $\mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(M(n))\right)$.

Lemma 16. Assume $n^3\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4 \to 0$. Then $\mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(M(n))\right) \to 0$.

Proof.
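The restriction to the event $\{\lambda_{\min} \ge \frac12\}$ in Claim 14 covers the typical case when $\|\alpha\|_2^2 \gg n$: the spectrum of $\frac{1}{\|\alpha\|_2^2}\, Y D_\alpha Y^T$ then concentrates around $1$ and $-\ln\det$ is small. A quick simulation sketch (our own code; the sizes $n = 10$, $d = 2000$ and the isotropic choice $\alpha = (1,\dots,1)$ are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 2000
alpha = np.ones(d)  # ||alpha||_inf = 1 and ||alpha||_2^2 = d >> n

Y = rng.standard_normal((n, d)) * np.sqrt(alpha)   # rows ~ N(0, D_alpha)
M = (Y * alpha) @ Y.T / (alpha ** 2).sum()         # Y D_alpha Y^T / ||alpha||_2^2

eigs = np.linalg.eigvalsh(M)
lam_min = eigs.min()             # typically close to 1, far above 1/2
neg_logdet = -np.log(eigs).sum() # the quantity bounded by Lemma 13
```

In this regime the eigenvalues fall near $(1 \pm \sqrt{n/d})^2$, so the event $\{\lambda_{\min} < \frac12\}$ handled by Claim 15 is a large deviation.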
First, we again pass to relative entropy using Pinsker's inequality (8):

$$\mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(M(n))\right) \le \sqrt{\mathrm{Ent}\left[K_p(M(n))\,\big\|\, H_{p,\alpha}(M(n))\right]}.$$

We note that both $K_p(M(n))$ and $H_{p,\alpha}(M(n))$ are simply Bernoulli matrices. The entries of $K_p(M(n))$ are i.i.d. $\mathrm{Bernoulli}(p)$, while the entries of $H_{p,\alpha}(M(n))$ are i.i.d. $\mathrm{Bernoulli}(p')$ where $p' = \Phi^{-1}\left(\frac{t_{p,\alpha}}{\|\alpha\|_2}\right)$. Defining $\mathrm{Ent}[p\|p'] := \mathrm{Ent}\left[\mathrm{Bernoulli}(p)\,\|\,\mathrm{Bernoulli}(p')\right]$ and using the chain rule (9) for relative entropy yields

$$\mathrm{Ent}\left[K_p(M(n))\,\big\|\, H_{p,\alpha}(M(n))\right] \le n^2\,\mathrm{Ent}[p\|p'].$$

One may verify that

$$\lim_{p'\to p}\frac{\mathrm{Ent}[p\|p']}{(p-p')^2} = \lim_{p'\to p}\frac{p\ln\left(\frac{p}{p'}\right) + (1-p)\ln\left(\frac{1-p}{1-p'}\right)}{(p-p')^2} = \frac{1}{2p - 2p^2}.$$

So $\frac{\mathrm{Ent}[p\|p']}{(p-p')^2}$ is a continuous function on $(0,1)\times(0,1)$ and is bounded on every compact subset of its domain. Thus, there exists a constant $C_p$, depending on $p$, such that $\mathrm{Ent}[p\|p'] \le C_p(p-p')^2$. By Lemma 4,

$$|p - p'| \le 3\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^3,$$

which affords the bound

$$\mathrm{Ent}[p\|p'] \le 9\, C_p\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^6.$$

But now, by the Cauchy-Schwarz inequality, $\|\alpha\|_3^3 = \sum_i \alpha_i\,\alpha_i^2 \le \sqrt{\|\alpha\|_2^2\,\|\alpha\|_4^4}$. Combining all of the above,

$$\mathrm{TV}\left(K_p(M(n)),\, H_{p,\alpha}(M(n))\right)^2 \le \mathrm{Ent}\left[K_p(M(n))\,\big\|\, H_{p,\alpha}(M(n))\right] \le n^2\,\mathrm{Ent}[p\|p'] \le 9\, C_p\, n^2\left(\frac{\|\alpha\|_3}{\|\alpha\|_2}\right)^6 \le 9\, C_p\, n^2\,\frac{\|\alpha\|_4^4\,\|\alpha\|_2^2}{\|\alpha\|_2^6} < n^3\left(\frac{\|\alpha\|_4}{\|\alpha\|_2}\right)^4$$

for $n$ large enough, which tends to $0$ by assumption. $\square$

References

[1] Bentkus, V. A Lyapunov-type bound in $\mathbb{R}^d$. Theory of Probability & Its Applications 49, 2 (2005), 311-323.

[2] Bubeck, S., Ding, J., Eldan, R., and Rácz, M. Z. Testing for high-dimensional geometry in random graphs. Random Structures & Algorithms 49, 3 (2016), 503-532.

[3] Bubeck, S., and Ganguly, S. Entropic CLT and phase transition in high-dimensional Wishart matrices.
International Mathematics Research Notices 2018, 2 (2016), 588-606.

[4] Cover, T. M., and Thomas, J. A. Elements of Information Theory. John Wiley & Sons, 2012.

[5] Duchi, J. Derivations for linear algebra and optimization. Berkeley, California (2007). http://www.cs.berkeley.edu/jduchi/projects/general_notes.pdf.

[6] Durrett, R. Probability: Theory and Examples. Cambridge University Press, 2010.

[7] Eaton, M. L. Chapter 8: The Wishart distribution. In Multivariate Statistics, vol. 53 of Lecture Notes-Monograph Series. Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2007, pp. 302-333.

[8] Erdős, P., and Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci 5 (1960), 17-61.

[9] Hörmander, L. The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis. Springer, 2015.

[10] Latała, R., Mankiewicz, P., Oleszkiewicz, K., and Tomczak-Jaegermann, N. Banach-Mazur distances and projections on random subgaussian polytopes. Discrete & Computational Geometry 38, 1 (2007), 29-50.

[11] Nourdin, I., and Peccati, G. Normal Approximations with Malliavin Calculus: From Stein's Method to Universality, vol. 192. Cambridge University Press, 2012.

[12] Petrov, V. V. Limit Theorems of Probability Theory. Oxford University Press, 1995.

[13] Shephard, N. G. From characteristic function to distribution function: a simple framework for the theory. Econometric Theory 7, 04 (1991), 519-529.

[14] Tao, T. Topics in Random Matrix Theory, vol. 132. American Mathematical Society, Providence, RI, 2012.

[15] Vershynin, R. Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing. Cambridge University Press, Cambridge, 2012, pp. 210-268.
