Phase transitions in the condition number distribution of Gaussian random matrices

Phase transitions in the condition number distribution of Gaussian random matrices Isaac P ´ erez Castillo 1 , Eytan Katzav 2 and Pierpaolo V iv o 3 1. Departamento de Sistemas Complejos, Instituto de F ´ ısica, UNAM, P .O. Box 20-364, 01000 M ´ exico D.F ., M ´ exico 2. Department of Mathematics, King’ s Colle ge London, Strand, London WC2R 2LS, United Kingdom 3 . Laboratoir e de Physique Th ´ eorique et Mod ` eles Statistiques (UMR 8626 du CNRS), Universit ´ e P aris-Sud, B ˆ atiment 100, 91405 Orsay Cedex, F rance (Dated: August 4, 2018) W e study the statistics of the condition number κ = λ max /λ min (the ratio between largest and smallest squared singular values) of N × M Gaussian random matrices. Using a Coulomb ﬂuid technique, we deriv e analytically and for large N the cumulative P [ κ < x ] and tail-cumulativ e P [ κ > x ] distributions of κ . W e ﬁnd that these distributions decay as P [ κ < x ] ≈ exp  − β N 2 Φ − ( x )  and P [ κ > x ] ≈ exp ( − β N Φ + ( x )) , where β is the Dyson index of the ensemble. The left and right rate functions Φ ± ( x ) are independent of β and calculated exactly for any choice of the rectangularity parameter α = M / N − 1 > 0 . Interestingly , they sho w a weak non-analytic behavior at their minimum h κ i (corresponding to the a verage condition number), a direct consequence of a phase transition in the associated Coulomb ﬂuid problem. Matching the behavior of the rate functions around h κ i , we determine exactly the scale of typical ﬂuctuations ∼ O ( N − 2 / 3 ) and the tails of the limiting distribution of κ . The analytical results are in excellent agreement with numerical simulations. P A CS numbers: 5.40.-a, 02.10.Yn, 02.50.Sk, 24.60.-k Intr oduction - A classical task in numerical analysis is to ﬁnd the solution x of a linear system A x = y , where in the simplest setting A is a square N × N matrix and y is a giv en column vector . The solution can be formally written as x = A − 1 y provided that A is in vertible. A very important issue for numerical stability is: ho w does a small change in the entries of y or of A propagate on the solution x ? The system of equations abov e is said to be well - (or ill -) con- ditioned if a small change in the coef ﬁcient matrix A or in the right hand side y results in a small (or lar ge) change in the so- lution v ector x . An ill-conditioned system produces a solution that cannot be trusted, as numerical inaccuracies in the inputs are ampliﬁed and propagated to the output [1]. A standard indicator of the reliability of numerical solutions is the condition number (CN) κ = λ max /λ min ≥ 1 , where λ min and λ max are the smallest and largest squar ed singular values of A , i.e. the positiv e eigenv alues of AA T (the square root √ κ of CN is alternati vely used v ery frequently). The quantity ln b κ is essentially a worst-case estimate of how many base- b digits are lost in solving numerically that linear system, which is singular if κ is inﬁnite, ill-conditioned if κ is “too lar ge”, and well-conditioned if κ is close to its minimum v alue 1 . Computing κ for a large coefﬁcient matrix A in a fast and ef- ﬁcient way can be, howe ver , as difﬁcult a task as solving the original system in the ﬁrst place [2]. T o overcome this prob- lem, Goldstine and von Neumann [3, 4] proposed instead to study the generic features of κ associated to a random matrix A with normally distributed elements [5]. What is the typical (expected) CN for a system of size N ? And what is a sensible estimate for the size of its ﬂuctuations? Modern applications of a random condition number of more general (rectangular) matrices N × M include wireless com- munication systems [6 – 10], spectrum sensing algorithms [11– 13], conv ergence rate of iterati ve schemes [14], compressed sensing [15], ﬁnance [16], meteorology [17] and performance assessment of principal component analysis [18] among oth- ers. The statistics of √ κ was ﬁrst computed by Edelman [19] for 2 × M random Gaussian matrices, as well as the limiting dis- tribution of √ κ/ N for lar ge N × N matrices. The rectangular case w as recently considered in [20]. Dif ferent bounds for the tails were given in [21 – 23]. Exact formulae for the distrib u- tion of κ for ﬁnite N , M also exist in terms of cumbersome series of zonal polynomials [24, 25] or an integral of a de- terminant [26], whose ev aluation becomes impractical even for moderate matrix sizes. Approximate results for correlated non-central Gaussian matrices can be found e.g. in [27]. Other deﬁnitions for the CN also exist [5, 28]. Unfortunately , almost nothing is known about the most dreaded (or welcomed) scenarios for applications, namely the occurrence of atypical instances, where the CN is much larger (or smaller) than its e xpected value. In this Letter , by suitably adapting the Coulomb ﬂuid method of statistical mechanics, we provide an analytical solution to this outstanding problem for large rectangular instances. W e show that the large devi- ation statistics of the CN of Gaussian matrices, expressed in terms of elementary functions, has a rich and elegant structure. As a bonus, we also deriv e the scale of typical ﬂuctuation of the CN around h κ i , and the tails of its limiting distribution. Let us ﬁrst summarize our setting and main results. Summary of results - W e consider rectangular N × M ( M > N ) matrices A with Gaussian distributed entries (real, com- plex or quaternions, labelled by the Dyson index β = 1 , 2 , 4 respectiv ely , or actually for general β > 0 as discussed in [29]). Forming the corresponding N × N cov ariance matrix W = AA T [43], which deﬁnes the W ishart ensemble [30], we deﬁne its rectangularity parameter α = M / N − 1 > 0 and the CN κ = λ max /λ min > 1 . Here λ max and λ min are the largest and smallest eigen values of W . W e consider the cumulativ e P [ κ < x ] and tail-cumulative (also kno wn as exceedance or 2 surviv al function) P [ κ > x ] distributions of κ , when N and M are large and α is kept ﬁnite. Using a Coulomb ﬂuid tech- nique we ﬁnd that for large N both distrib utions obey large deviation la ws, namely they decay for large N as [44] P [ κ < x ] ≈ exp  − β N 2 Φ − ( x )  , (1) P [ κ > x ] ≈ exp ( − β N Φ + ( x )) . (2) The left and right rate functions Φ ± ( x ) (depending paramet- rically on α , b ut not on β ) are gi ven in (10) and (12) and plot- ted in Fig. 1. Both functions are supported on x ∈ (1 , ∞ ) and have a minimum (zero) at h κ i = [(1 + √ 1 + α ) / (1 − √ 1 + α )] 2 > 1 . Therefore the corresponding density of κ is peaked around h κ i , which is precisely its mean v alue for lar ge N [45]. Crossing h κ i , both functions freeze to the zero value, and around h κ i they hav e an interesting non-analytic behavior , characterized by a third-order (for Φ − ( x ) ) [31] and second- order (for Φ + ( x ) ) discontinuity . Both these non-analytic be- haviors and the dif ferent scaling with N between (1) and (2) are direct consequences of freezing phase transitions [32] in an associated Coulomb ﬂuid problem. The physics of the two branches is howe ver entirely different (see below for details). Matching the behavior of the rate functions around h κ i , we also determine exactly the size ( ∼ O ( N − 2 / 3 ) ) of typical ﬂuc- tuations of κ and the tails of its limiting distribution. W e no w begin by recalling some well-known facts about W ishart ma- trices. Generalities - The joint probability density (jpd) of the N (real and positiv e) eigenv alues is giv en by [33, 34] P β ( λ ) = 1 Z 0 e − 1 2 P N i =1 λ i N Y i =1 λ β 2 ( αN +1) − 1 i Y i L U L U L U - ln P @ k < x D - ln P @ k > x D FIG. 1: Plot of − ln P [ κ < x ] (dashed green, Eq. (1)) and − ln P [ κ > x ] (solid red, Eq. (2)), together with numerical simu- lations for the left branch [42]. The two rate functions β N 2 Φ − ( x ) and β N Φ + ( x ) freeze to the zero value upon crossing h κ i . The in- sets describe the corresponding phases of the Coulomb ﬂuid (acti ve vs. inactiv e barriers for Φ − ( x ) and the pulling of indi vidual extreme charges for Φ + ( x ) ). An arbitrary v alue of N has been chosen to pro- duce a reasonably looking plot in which the two branches are visible. respect to the jpd (3)) is expected for lar ge N to hav e the scaling form ρ ( λ ) = N − 1 ρ mp ( λ/ N ) , where the func- tion ρ mp ( x ) = 1 2 π x p ( x − z − )( z + − x ) is the celebrated Mar ˇ cenko-Pastur (MP) law on the compact support (for α > 0 ) x ∈ [ z − , z + ] with z ± = (1 ± √ α + 1) 2 . This MP law is a particular case of the general solution (9) of the inte gral equa- tion (8) below (see Fig. 2, top) when the two barriers L, U are ineffecti ve ( L ≤ z − and U ≥ z + ). W e start now by consider- ing the cumulativ e distribution of the CN ﬁrst, and get to the tail-cumulativ e afterwards. Cumulative distribution - The cumulativ e distribution P [ κ < x ] of the CN κ (depending parametrically on β and α = M / N − 1 > 0 ), can be written as [26, 42] P [ κ < x ] = 1 ( N − 1)! Z ∞ 0 d λ 1   Z · · Z xλ 1 λ 1 N Y j =2 d λ j P β ( λ )   . (5) The goal is to e valuate this multiple integral for lar ge N by the Coulomb ﬂuid method. The ﬁrst step is to rewrite the jpd (3) in the Gibbs-Boltzmann form described abov e. Here, N − 1 ﬂuid particles are howe ver not free to spread on the whole positiv e line, but instead constrained to live within the box [ λ 1 , xλ 1 ] , where λ 1 is the (free) position of the leftmost parti- cle. The second step consists of a coarse-graining procedure, where one introduces a normalized density of particles ρ ( λ ) = ( N − 1) − 1 P N i =2 δ ( λ − λ i ) for the N − 1 particles λ i ( i 6 = 1) living inside the box. Using the replacement rule P i> 1 g ( λ i ) = ( N − 1) R d λρ ( λ ) g ( λ ) , we can con vert the energy function E [ { λ } ] into a continuous action S (depend- ing on ρ , and parametrically on the location of the leftmost particle λ 1 and x ). The multiple inte gration (5) is therefore 3 interpreted as the canonical partition function of the associ- ated Coulomb ﬂuid, where the sum over all microscopic con- ﬁgurations of { λ } compatible with the normalized density ρ amounts to a functional inte gration ov er ρ and a standard inte- gration ov er λ 1 . Eventually , these resulting integrals are e v al- uated using the saddle point method. Performing these steps, we get P [ κ < x ] ∝ Z ∞ 0 d ξ Z D [ ρ, C ]e − β N 2 S [ ρ,C ; ξ,xξ ] , (6) where we renamed λ 1 → ξ for later con venience and the ac- tion of this ﬂuid (conﬁned between the lo wer L and upper U barriers of the box) is S [ ρ, C ; L, U ] = Z U L d λρ ( λ ) V ( λ ) + C − 1 2 Z Z U L d λ d λ 0 ρ ( λ ) ρ ( λ 0 ) ln | λ − λ 0 | . (7) Here V ( λ ) = ( λ − α ln λ ) / 2 − C and C is a Lagrange multi- plier enforcing normalization of ρ . Eq. (7) is easily identiﬁed as the continuous version of the energy Eq. (4), where we hav e neglected subleading O ( N ) contrib utions [42]. Evaluating the functional integral in (6) by the saddle point method, δ S δ ρ    ρ = ρ ? = 0 , we get V ( λ ) = Z U L d λ 0 ρ ? ( λ 0 ) ln | λ − λ 0 | , (8) where the solution ρ ? ( λ ) is just the equilibrium density of the Coulomb ﬂuid constrained to liv e within the box [ L, U ] . Clearly , if we release the barriers L, U we expect to reco ver the unconstrained MP law , ρ ? ( λ ) → ρ mp ( λ ) . Solving this in- tegral equation for a normalized ρ ? between two barriers at L and U is one of the main technical challenges that we managed to overcome. Skipping details [42], we ﬁnd that the general solution of (8) is [41] ρ ? ( λ ) = ( x + ( L, U ) − λ ) ( λ − x − ( L, U )) 2 π λ p ( U − λ )( λ − L ) 1 [ L,U ] ( λ ) , (9) where x + ( L, U ) ≥ U > L ≥ x − ( L, U ) , x ± ( L, U ) are the roots of x 2 − x  L + U 2 + α + 2  + α √ LU = 0 , and 1 [ a,b ] ( x ) = 1 if x ∈ [ a, b ] and 0 otherwise. How does this density look like for giv en values of L and U ? Four dif ferent shapes (phases of the ﬂuid) are possible [42] for α > 0 that are plotted in Fig. 2 (top). For example, set- ting ( L, U ) = ( z − , z + ) the corresponding density (9) is the MP law ρ ? ( x ) = ρ mp ( x ) (phase I I ). This critical MP point, which is marked in Fig. 2 (bottom), separates re gion I I , where the barriers are ineffective ( L < z − and U > z + ) and the equi- librium density is again just ρ mp ( x ) , from re gion I (where the barriers are instead effective in compressing the MP sea). Once we hav e e valuated the functional inte gral by the saddle- point method (which implies inserting the density (9) in II L U III L U IV L U I L U FIG. 2: T op: The four phases of the ﬂuid. Re gion I: the two barriers compress ef fectively the ﬂuid. Region II: MP law , where the barriers do not affect the ﬂuid. Region III and IV : only the lo wer or the upper barrier is active, respectiv ely (this scenario is not realized in our CN setting). The analytical expressions in re gions I II and IV where ﬁrst deriv ed in [38, 40], respectively . Down: Regions in the ( L, U ) plane where the density Eq. (9) has different shapes, according to top panels, for α = 3 . W e plot lev el curves of the action (7) S [ ρ ? , C ; L, U ] ( I ) and energy dif ference ∆ e ( L, U ) [Eq. (0.15) in [42]] ( IV ). On the dashed green and solid red extremal lines, the action and the ener gy difference are minimal, respectiv ely . The abscissas ξ ? (solution of the saddle point equations in the two cases) are giv en by the intersection of the straight line U = xL (solid black, with the left arrow pointing in the direction of increasing slope x ) with such extremal lines. Besides, the solid or- ange line corresponds to the condition x − ( L, U ) = L along which the lower barrier is inef fective (and the upper is effecti ve), while the solid blue line correponds to the condition x + ( L, U ) = U along which the upper barrier is ineffecti ve (and the lower is ef fective). the action (7)) we set L = ξ and U = xξ and ev al- uate the remaining ξ -integral again by the saddle-point method. This yields an optimal v alue ξ ? as the solution of d d ξ S [ ρ ? , C ; ξ , xξ ]    ξ = ξ ? = 0 . This value ξ ? ( x ) is marked in Fig. 2 (bottom) as the intersection between the straight line U = xL of varying slope x > 1 and the dashed green line on which the action S [ ρ ? , C ; L, U ] is minimal. 4 The ﬁnal result reads P [ κ < x ] ≈ e − β N 2 Φ − ( x ) , where the O ( N 2 ) decay is traced back to the high energy cost in com- pressing the whole sea of strongly correlated particles. Here Φ − ( x ) = S [ ρ ? , C ; ξ ? , xξ ? ] − S [ ρ mp , C ; z − , z + ] , where the second term comes from the normalization factor and needs to be subtracted. W e ev entually obtain Φ − ( x ) = 1 8 h f ( α ) 1 (1 + √ x ) + ln f ( α ) 2 (1 + √ x ) i 1 (1 , h κ i ) ( x ) , (10) where f ( α ) 1 , 2 ( ω ) are elementary functions listed in [42]. The rate function Φ − ( x ) thus freezes to the value 0 as x increases up to the critical value h κ i (implying ξ ? ( h κ i ) → z − ). Beyond this limit, the barriers are no longer effecti ve, and new phys- ical insights are needed to tackle the tail-cumulativ e regime (see next subsection). The limits are Φ − ( x → h κ i − ) ∼ K ( α )( h κ i− x ) 3 and Φ − ( x → 1 + ) ∼ ( − 1 / 2) ln( x − 1) , where K ( α ) = − ( − 1 + √ 1 + α ) 8 / 96 √ 1 + α (1 + √ 1 + α ) 4 . This implies a third order discontinuity across h κ i as anticipated. Also, close to 1 , the density of κ has the follo wing power- law tail, P [ κ = 1 +  ] ∼  β N 2 / 2 to leading order in N for  → 0 + . Although formally valid only for α > 0 , it turns out that in the limit α → 0 (square Gaussian matrices, where the scaling with N is dif ferent) the rate function (10) is still well- behav ed and we recover Edelman’ s result [19] to leading order in N [42]. W e now turn to the tail-cumulativ e distrib ution (the right branch in Fig. 1). T ail-cumulative distribution - Contrary to the previous case, the tail-cumulati ve distribution P [ κ > x ] does not admit a multiple-integral representation of the type (5), which could be mapped to the physics of a ﬂuid “trapped” between two hard barriers. The starting point of the calculation is again the energy func- tion (4), though. The Coulomb ﬂuid physics suggests that atypically large values of the CN κ = λ max /λ min are obtained when the right-and-leftmost particles are pulled aw ay from the MP sea in opposite directions ( λ max − λ min ∼ O ( N )) , a pro- cedure that is energetically not able to generate macroscopic rearrangements within the MP sea. This elegant energetic ar- gument was ﬁrst introduced in [39]. Following this physi- cal picture, the right rate function Φ + ( x ) is determined by the O ( N ) ener gy cost ∆ E ( L, U ) in pinning the leftmost and rightmost char ges at L and U , well outside the unperturbed MP sea in between (see red inset in Fig. 1). The le vel curves of the O (1) ∆ e ( L, U ) := ∆ E ( L, U ) /β N are depicted in re- gion I I of Fig. 2, together with the extremal line (solid red) where it attains its minimum v alue. Setting now L = ξ and U = ξ x , the energetically most fa vored position ξ ? for the leftmost outlier will be determined again by the intersection point of that extremal line and the straight line U = xL . Skipping details [42], this change in ener gy can be written for large N as ∆ e ( ξ , ξ x ) = ( ξ − z − ) + ( ξ x − z + ) 2 − α 2 ln ξ 2 x z − z + − Z z + z − d η ρ mp ( η ) ln     ( ξ − η )( η − ξx ) ( z − − η )( η − z + )     . (11) The probability of this “pinned” conﬁguration of eigen- values (yielding a CN κ exactly equal to x ) is ∝ exp( − β N ∆ e ( ξ , ξ x )) . Finding the optimal position ξ ? for the leftmost particle by minimizing (11) with respect to ξ , we eventually obtain P [ κ > x ] ≈ exp ( − β N Φ + ( x )) , where Φ + ( x ) = ∆ e ( ξ ? ( x ) , ξ ? ( x ) x ) is gi ven by [46] Φ + ( x ) = ln   g ( α ) ( x )  α/ 2  h ( α ) ( x )  2(2+ α )  1 ( h κ i , ∞ ) ( x ) . (12) The functions g ( α ) ( x ) and h ( α ) ( x ) ha ve lengthy but ex- plicit expressions in terms of elementary functions [42]. The rate function Φ + ( x ) again freezes to the v alue 0 as x decreases do wn to the critical value h κ i (implying ξ ? ( h κ i ) → z − ), where the pinned outliers reconnect with the MP sea. The limits are Φ + ( x → h κ i + ) ∼ J ( α )( x − h κ i ) 3 / 2 and Φ + ( x → ∞ ) ∼ ( α/ 2) ln x , where J ( α ) = √ 2 4 √ α + 1  √ α + 1 − 1  4 / 3 √ α + 2  √ α + 1 + 1  2 . This implies a second order discontinuity across h κ i as anticipated. Also, at inﬁnity the density of κ therefore decays as a power- law , P [ κ = x ] ∼ x − αβ N/ 2 to leading order in N for x → ∞ . Conclusions - In summary , we hav e computed analytically for large N the cumulative and tail-cumulati ve distributions of the CN κ of rectangular N × M Gaussian random matrices. Mapping the problem to the calculation of the free energy of an associated Coulomb ﬂuid, we found that the random vari- able κ satisﬁes large deviation laws governed by rate functions Φ ± ( x ) to the left and to the right of the expected value h κ i , al- beit with different scalings with N . These rate functions (see (10) and (12)) essentially describe the probability of sampling a random Gaussian matrix with an atypically large (or small) CN. They are monotonic and con ve x, with a non-analytic be- havior at their zero h κ i . This non-analytic behavior is a direct consequence of a very rich thermodynamics of the associated Coulomb ﬂuid. More precisely , across h κ i a compr essed (left branch) or str etched (right branch) ﬂuid undergoes fr eezing phase transitions of different orders (third and second respec- tiv ely). Matching the beha vior of the rate functions close to their min- imum h κ i , we deduce that typical ﬂuctuations of κ around h κ i should occur on a scale of O ( N − 2 / 3 ) , and setting χ = f ( α ) N 2 / 3 ( κ − h κ i ) , with f ( α ) = 2 1 / 3 (1 + α ) 1 / 6 ( − 1 + √ 1 + α ) 8 / 3 / (1 + √ 1 + α ) 4 / 3 , the scaled random variable χ has a N - and α -independent distribution P [ χ < x ] = F β ( x ) with tails ∼ exp  − β | x | 3  for x → −∞ and ∼ exp  − β x 3 / 2  for x → ∞ . The scaling is in agreement with a recent result [20], v alid only for M  N 3 . Our analytical re- sults have been numerically checked with excellent agreement [42]. 5 Acknowledgments - PV ackno wledges ﬁnancial support from Labex-P ALM (project RandMat). W e thank G. Schehr and F . D. Cunden for helpful discussions. IPC acknowledges hospitality by the LPTMS (Univ . Paris Sud) where this work was completed. [1] S . Smale, On the ef ﬁciency of algorithms of analysis , Bull. (N. S.) Amer . Math. Soc. 13 , 87 (1985). [2] H. A vron, A. Druinsky , and S. T oledo, Iterative spectral condition-number estimation , [arXiv: 1301.1107] (2013). [3] H. H. Goldstine and J. v on Neumann, Numerical inverting of matrices of high or der , Amer. Math. Soc. Bull. 53 , 1021 (1947). [4] H. H. Goldstine and J. v on Neumann, Numerical inverting of matrices of high or der , II , Amer . Math. Soc. Proc. 2 , 188 (1951). [5] J. W . Demmel, The probability that a numerical analysis prob- lem is difﬁcult , Mathematics of Computation 50 , 449 (1988). [6] V . Erceg, P . Soma, D. S. Baum, and A. J. Paulraj, Capacity obtained fr om multiple-input multiple-output channel measur e- ments in ﬁxed wireless en vir onments at 2.5 GHz , in Proc. IEEE Int. Conf. Commun. (ICC), vol. 1, Ne w Y ork, pp. 396-400 (2002). [7] M. D. Batariere et al., W ideband MIMO mobile impulse r e- sponse measurements at 3.7 GHz , in Proc. IEEE V eh. T echnol. Conf. (VTC), Birmingham, AL, pp. 26-30 (2002). [8] H. Artes, D. Seethaler , and F . Hlawatsch, Ef ﬁcient detection al- gorithms for MIMO channels: a geometrical appr oach to ap- pr oximate ML detection , IEEE T rans. Signal Process. 51 , 2808 (2003). [9] D. W ubben, R. Bohnke, V . Kuhn, and K. D. Kammeyer , MMSE- based lattice-reduction for near-ML detection of MIMO sys- tems , in Proc. ITG W ork. Smart Antennas (WSA), Munich, Germany , pp. 106-113 (2004). [10] J. Maurer, G. Matz, and D. Seethaler , Low-complexity and full- diversity MIMO detection based on condition number thr esh- olding , in Proc. IEEE Int. Conf. Acoustics Speech Signal Pro- cess. (ICASSP), vol. 3, Honolulu, HI, pp. 61-64 (2007). [11] Y . Zeng and Y .-C. Liang, Eigen value based spectrum sensing algorithms for cognitive radio , IEEE Trans. Commun. 57 , 1784 (2009). [12] F . Penna, R. Garello, and M. Spirito, Cooper ative spectrum sensing based on the limiting eigen value ratio distribution in W ishart matrices , IEEE Commun. Lett. 13 , 507 (2009). [13] L. S. Cardoso, M. Debbah, P . Bianchi, and J. Najim, Cooper- ative spectrum sensing using random matrix theory , in Proc. IEEE Int. Symp. W ireless Perv asi ve Comput. (ISWPC), San- torini, Greece, pp. 334-338 (2008). [14] G. Strang, Linear Alg ebra and its Applications , 3rd edition. San Diego: Harcourt Brace Jov anovich Inc. (1988). [15] H. Rauhut, Compressive Sensing and Structured Random Ma- trices , in Theoretical F oundations and Numerical Methods for Sparse Recovery , Berlin, Boston: DE GR UYTER, 1-92 (2010). [16] A. Binder and M. Aichinger , A W orkout in Computational F i- nance , The W iley Finance Series, John W iley & Sons (2013). [17] M. Piringer , M. Hrad, S. Stenzel, and M. Huber-Humer , Re- construction of CH4 emissions fr om a biogas plant - mete- or ological aspects , Proceedings of 15th conference on ”Har- monisation within Atmospheric Dispersion Modelling for Reg- ulatory Purposes”, Madrid, Spain, May 6-9, 2013 [online at http://www .har mo .org/conferences/Madrid/15harmo.asp ]. [18] A. Kelemen, A. Abraham, and Y . Liang, Computational Intel- ligence in Medical Informatics , V ol. 85 of Studies in Computa- tional Intelligence, Springer (2008). [19] A. Edelman, Eig en values and condition numbers of random matrices , SIAM J. Matrix Anal. Appl. 9 , 543 (1988). [20] T . Jiang and D. Li, Approximation of rectangular Beta- Laguerr e ensembles and lar ge de viations , J. Theor . Prob . 1-44 (2013). [21] A. Edelman and B. D. Sutton, T ails of condition number distri- butions , SIAM J. Matrix Anal. Appl. 27 , 547 (2005). [22] Z. Chen and J. J. Dongarra, Condition numbers of Gaussian random matrices , SIAM J. Matrix Anal. Appl. 27 , 603 (2005). [23] J.-M. Aza ¨ ıs and M. Wschebor , Upper and lower bounds for the tails of the distrib ution of the condition number of a Gaussian matrix , SIAM J. Matrix Anal. Appl. 26 , 426 (2005). [24] W . Anderson and M. T . W ells, The exact distribution of the condition number of a Gaussian matrix , SIAM J. Matrix Anal. Appl. 31 , 1125 (2009). [25] T . Ratnarajah, R. V aillancourt, and M. Alvo, Eigen values and condition numbers of complex random matrices , SIAM J. Ma- trix Anal. Appl. 26 , 441 (2005). [26] M. Matthaiou, M. R. McKay , P . J. Smith, and J. A. Nossek, On the condition number distrib ution of complex W ishart matrices , IEEE T ransactions on Communications 58 , 1705 (2010). [27] L. W ei and O. Tirkkonen, Appr oximate condition number dis- tribution of complex non-central correlated W ishart matri- ces , IEEE International Conference on Communications (ICC), pp.1-5, 5-9 June 2011. [28] A. Edelman, On the distribution of a scaled condition number , Math. of Comp. 58 , 185 (1992). [29] I. Dumitriu and A. Edelman, Matrix models for Beta ensembles , J. Math. Phys. 43 , 5830 (2002). [30] J. W ishart, The generalised pr oduct moment distribution in samples fr om a normal multivariate population , Biometrika 20A , 32 (1928). [31] S. N. Majumdar and G. Schehr , T op eigen value of a random matrix: lar ge deviations and third or der phase transition , J. Stat. Mech.: Theory and Experiment P01012 (2014). [32] C. T exier and S. N. Majumdar , W igner time-delay distribution in c haotic cavities and fr eezing tr ansition , Phys. Rev . Lett. 110 , 250602 (2013). [33] R. A. Fisher, The sampling distribution of some statistics ob- tained from non-linear equations , Annals of Eugenics 9 , 238 (1939). [34] P . L. Hsu, On the distribution of the roots of certain determi- nantal equations , Annals of Eugenics 9 , 250 (1939). [35] F . J. Dyson, Statistical Theory of the Ener gy Le vels of Com- plex Systems. I , J. Math. Phys. 3 , 140 (1962); Statistical Theory of the Ener gy Levels of Comple x Systems. II , J. Math. Phys. 3 , 157 (1962); Statistical Theory of the Ener gy Levels of Complex Systems. III , J. Math. Phys. 3 , 166 (1962). [36] D. S. Dean and S. N. Majumdar , Extr eme value statistics of eigen values of Gaussian random matrices , Phys. Rev . E 77 , 041108 (2008). [37] D. S. Dean and S. N. Majumdar, Lar ge deviations of extr eme eigen values of random matrices , Phys. Re v . Lett. 97 , 160201 (2006). [38] P . V iv o, S. N. Majumdar, and O. Bohigas, Larg e deviations of the maximum eigenvalue in W ishart random matrices , J. Phys. A: Math. Theor . 40 , 4317 (2007). 6 [39] S. N. Majumdar and M. V er gassola, Larg e deviations of the maximum eigen value for W ishart and Gaussian random matri- ces , Phys. Re v . Lett. 102 , 060601 (2009). [40] E. Katza v and I. P ´ erez Castillo, Lar ge de viations of the smallest eigen value of the W ishart-Laguerr e ensemble , Phys. Rev . E 82 , 041004 (2010). [41] F . G. Tricomi, Inte gral Equations (Pure Appl. Math V , Inter- science, London, 1957). [42] See Supplementary Material. [43] Here T stands for transpose ( β = 1 ), hermitian conjugate ( β = 2 ) and symplectic conjugate ( β = 4 ). [44] Here ≈ stands for the logarithmic equiv alence lim N →∞ − ln P [ κ < x ] /β N 2 = Φ − ( x ) and similarly for the tail-cumulativ e branch. [45] The typical v alue h κ i for lar ge N is just the ratio of the average h λ max i = (1 + √ 1 + α ) 2 and the average h λ min i = (1 − √ 1 + α ) 2 [46] Note that this energetic argument giv es (strictly speaking) the density of κ , P [ κ = x ] and not its tail-cumulative distribution P [ κ > x ] . Howe ver , the lar ge N decay of the two is the same to leading order . SUPPLEMENT AR Y MA TERIAL Contents Supplementary Material 6 Deriv ation of the expression for the action in Eq. (7) 6 Solution of integral equation Eq. (8) 7 Left rate function 8 Square case: Edelman’ s result 9 Right rate function 10 Order of phase transitions 11 Numerical simulations for the left rate function 11 Derivation of the expr ession for the action in Eq. (7) Let us start by noticing that P [ κ < x ] = Prob[ λ max < xλ min ] , which can be worked out as follo ws P [ κ < x ] = Prob( λ 1 ≤ λ 2 ≤ · · · ≤ λ N ≤ κλ 1 ) = Z ∞ 0 d x Prob( x ≤ λ 2 ≤ λ 3 ≤ · · · ≤ λ N − 1 ≤ λ N ≤ κx , λ 1 = x ) = 1 ( N − 1)! Z ∞ 0 d x Prob( λ 1 = x , { λ 1 ≤ λ i ≤ κλ 1 , i = 2 , . . . , N } ) or in terms of the jpd of eigen values of the W ishart ensemble we write P [ κ < x ] = 1 ( N − 1)! Z ∞ 0 d λ 1 " Z xλ 1 λ 1 d λ 2 · · · Z xλ 1 λ 1 d λ N P β ( λ ) # T o go to a continuous theory we start from Eq. (3) of this Letter , which deﬁnes the jpd of eigen values of the W ishart ensemble, we ﬁrst rewrite it in e xponential form and rescale all the eigenv alues as λ i → β N λ i . This results in P β ( λ ) ∝ exp( − β N 2 E [ { λ } ]) , with E [ { λ } ] = 1 2 N N X j =1 λ j − α 2 N N X j =1 ln λ j − 1 2 N 2 X j 6 = k ln | λ j − λ k | , where we have kept only the relev ant terms. Next, we rename λ 1 as ξ and separate it from the rest of eigen values which, abusing notation, are denoted again as λ = ( λ 2 , . . . , λ N ) , viz. E [ { ξ , λ } ] = 1 2 N   ξ + N X j =2 λ j   − α 2 N   ln ξ + N X j =2 ln λ j   − 1 2 N 2   X j 6 = k ln | λ j − λ k | + 2 N X j =2 ln | ξ − λ j |   7 This expression for E [ { ξ , λ } ] can no w be expressed as a continuous theory in the follo wing form: S [ ρ ( λ ; λ )] ≡ E [ { ξ , ρ ( λ ; λ ) } ] = 1 2 Z d λρ ( λ ; λ ) − α 2 Z d λρ ( λ ; λ ) ln λ − 1 2 Z Z d λ d λ 0 ρ ( λ ; λ ) ρ ( λ ; λ ) ln | λ − λ 0 | + O ( N − 1 ) (0.1) where hav e introduced the normalized one-point function ρ ( λ ; λ ) = 1 N − 1 N X i =2 δ ( λ − λ i ) . It is important to note that the dependence on ξ does not appear in the leading terms of Eq. (0.1) and therefore will be irrele v ant for lar ge N . That is why we write the ener gy as S [ ρ ( λ ; λ )] , that is, only dependent on the density ρ ( λ ) . Going back to our expression for the cumulati ve function we write P [ κ < x ] ∝ Z D [ ρ ] Z ∞ 0 d ξ " Z κλ 1 λ 1 d λ 2 · · · Z κλ 1 λ 1 d λ N # e − β N 2 S [ ρ ( λ )] δ ( F ) ρ ( λ ) − 1 N − 1 N X i =2 δ ( λ − λ i ) ! Finally , using a Fourier representation for the functional Dirac delta, integrating its corresponding Lagrange multiplier , and ignoring subleading terms, we end up with the following e xpression: P [ κ < x ] ∝ Z ∞ 0 d ξ Z D [ ρ, C ] e − β N 2 S [ ρ,C ; ξ,xξ ] with S [ ρ, C ; L, U ] = 1 2 Z U L d λλρ ( λ ) − α 2 Z U L d λρ ( λ ) ln λ − 1 2 Z Z U L d λ d λ 0 ρ ( λ ) ρ ( λ 0 ) ln | λ − λ 0 | − C Z U L d λρ ( λ ) − 1 ! + O ( N − 1 ) . (0.2) as in Eq. (7) of the Letter . Solution of integral equation Eq. (8) The integral equation Eq. (8) of the Letter can be solved resorting to the theory of so-called Carleman equation, which is an integral equation of type Z b a ln | x − t | y ( t )d t = f ( x ) . (0.3) In our case we have a = L , b = U , y ( t ) = ρ ? ( y ) , and f ( x ) = x 2 − α 2 ln x − C . For our purposes, the general solution of (0.3) reads y ( x ) = 1 π 2 p ( x − a )( b − x ) " Z b a p ( t − a )( b − t ) f 0 ( t )d t t − x + B # . Recalling the following inte gral results: Z b a p ( t − a )( b − t ) t − x d t = π  b + a 2 − x  , Z b a p ( t − a )( b − t ) t ( t − x ) d t = − π x  x − √ ab  , we obtain the following e xpression for ρ ? ( x ) : ρ ? ( x ) = 1 2 π p ( x − a )( b − x ) " b + a 2 − x + α x  x − √ ab  + B # , 8 where B is determined by looking for the physical solution. Imposing normalization and using the following results Z b a d x 1 2 π p ( x − a )( b − x ) = 1 2 , Z b a d x 1 2 π x p ( x − a )( b − x ) = 1 2 √ ab , Z b a d x x 2 π p ( x − a )( b − x ) = a + b 4 , we obtain that B = 2 . Finally , setting a = L and b = U we write ρ ? ( λ ) = ( x + ( L, U ) − λ ) ( λ − x − ( L, U )) 2 π λ p ( U − λ )( λ − L ) 1 [ L,U ] ( λ ) , (0.4) where x ± ( L, U ) are the roots of the polynomial λ 2 − λ  L + U 2 + α + 2  + α √ LU = 0 Writing (0.4) in this w ay is most con venient as we can see that the physical normalized solution is the one such that x − ( L, U ) ≤ L < U ≤ x + ( L, U ) . This implies that the Coulomb ﬂuid has four phases: • x + ( L, U ) = U and x − ( L, U ) = U . In this case we notice that the spectral density becomes the MP law (Re gion II in Fig. 2): ρ ? ( λ ) = p ( U − λ )( λ − L ) 2 π λ , L = z − , U = z + . • x + ( L, U ) = U and x − ( L, U ) < L . In this case the spectral density becomes (Region III of Fig. 2): ρ ? ( λ ) = 1 2 π r U − λ λ − L " λ − α p L/U λ # , L ( U ) = 1 U h α + p ( α − U ) 2 − 4 U i 2 , U ≥  1 + 1 √ c  2 . Note that in this case we take as free parameter the position of the upper barrier U , instead of the lower barrier , as the latter will result into a third degree polynomial to solv e. • x − ( L, U ) = L and x + ( L, U ) > U . In this case the spectral density becomes (Re gion IV of Fig. 2): ρ ? ( λ ) = 1 2 π r λ − L U − λ " α p U /L − λ λ # , U ( L ) = 1 L h α − p ( α − L ) 2 − 4 L i 2 , L ≤  1 − 1 √ c  2 , where in this case we take the lo wer barrier L as our free parameter . • In this case we have x + ( L, U ) > U and x − ( L, U ) < L and (Region I of Fig. 2): ρ ? ( λ ) = [ x + ( L, U ) − λ ][ λ − x − ( L, U )] 2 π λ p ( U − λ )( λ − L ) with x ± ( L, U ) are the solutions x ± ( L, U ) = 1 2    L + U 2 + α + 2  ± s  L + U 2 + α + 2  2 − 4 α √ LU   Left rate function T o deri ve the left rate function we need to e valuate the action at the solution of the saddle point, that is S [ ρ ? , C ; L, U ] . It is possible to get rid of the double integral appearing (0.2) by taking the saddle-point equation, multiplying by ρ ? ( λ ) , and 9 integrating λ over the interv al [ L, U ] . The constant C is then determined taking the value of λ = U at the saddle-point equation. All in all, the action at the saddle point takes the follo wing form S [ ρ ? , C ; L, U ] = 1 4 Z U L d λρ ? ( λ ) ( λ − α ln λ ) + 1 2 C ( L, U ) with C ( L, U ) = U 2 − α 2 ln U − Z U L d λ ρ ? ( λ ) ln | U − λ | Now we report the result for the follo wing integrals: Z U L d λρ ? ( λ ) λ = 1 4 ( L + U )  α + L + U 2 + 2  − 1 2 α √ LU − 1 16  3 L 2 + 2 LU + 3 U 2  Z U L d λρ ? ( λ ) ln λ = 1 4  8( α + 1) ln  √ L + √ U  − 2 α ln( L ) − 2 α ln( U ) − 8( α + 1) ln(2) + 2 √ LU − L − U  Z U L d λρ ? ( λ ) ln | U − λ | = 1 4   − 2 α ln   U  − 2 √ LU + L + U  U − L   + 2( α + 2) ln( U − L ) − 4( α + 2) ln(2) − L + U   Gathering all these results and after simplifying we obtain the ﬁnal expression for the action S [ ρ ? , C ; L, U ] = − 1 64 − 2 L (4 α + U + 8) + 16 α √ LU − 8 α  − 4( α + 2) ln  √ L + √ U  + α ln( L ) + α ln( U )  − 8( α + 2) U − 32( α ( α + 2) + 2) ln(2) + L 2 + 32 ln( U − L ) + U 2 ! T wo more steps need to be done to get the left rate function: the ﬁrst one is to take L = ξ and U = xξ and minimize S [ ρ ? , C ; ξ , xξ ] with respect to ξ . This gi ves d d ξ S [ ρ ? , C ; ξ , xξ ]    ξ = ξ ? = 0 ⇒ ξ ? ( x ) = 4 1 + α (1 + √ x ) 2 . The second step is to obtain the normalization f actor by taking the limit x → ∞ in the action. This, in the Coulomb ﬂuid picture, corresponds to x → h κ i . By noting that ξ ? ( h κ i ) = z − and that ρ ? → ρ mp , the left rate function Φ − ( x ) = S [ ρ ? , C ; ξ ? , xξ ? ] − S [ ρ mp , C ; z − , z + ] gi ves Φ − ( x ) = 1 8 ( √ x + 1) 2 " 4 α 2  x ln(2) + √ x ln(4) − ln  √ x + 1  + α 2  √ x + 1  2 ln( x ) − 2  x + 2 √ x   2 α 2 ln  √ x + 1  + ln( α + 1)  + 4 α 2  √ x + 1  2 arccoth(2 α + 1) + α 2 ln(16) + 2( α + 1)  αx − 2( α + 2) √ x + α  − 2 ln( α + 1) + 8  √ x + 1  2 arccoth  √ x  # . Finally , after some algebra, Φ − ( x ) can be re written as reported in Eq. (10), and in terms of the functions f ( α ) 1 ( ω ) and f ( α ) 2 ( ω ) f ( α ) 1 ( ω ) = 2( α + 1)  α ( ω − 2) 2 − 4 ω + 4  ω 2 , f ( α ) 2 ( ω ) =  4( α + 1)( ω − 1) αω 2  2 α 2  ω √ 1 + α ( ω − 2)  4 . Squar e case: Edelman’ s r esult In this section, we show how one recovers Edelman’ s results for the condition number in the square matrix case ( N = M ) . T wo of the results contained in Edelman’ s paper [19] are as follows. For β = 1 , lim N →∞ P [ ˆ κ/ N < x ] = e − 2 /x − 2 /x 2 , (0.5) 10 and for β = 2 lim N →∞ P [ ˆ κ/ N < x ] = e − 4 /x 2 , (0.6) where ˆ κ = √ κ (as deﬁned in our Letter). Thus, P [ ˆ κ < N x ] = P [ κ < N 2 x 2 ] . Our result reads P [ κ < x ] ≈ exp  − β N 2 Φ − ( x )  . So all we have to do is to compute Φ − ( N 2 x 2 ) for α = 0 to leading order in N . W e get N 2 Φ − ( N 2 x 2 ) ∼ 2 x 2 , for N → ∞ , (0.7) implying P [ ˆ κ < N x ] ≈ exp  − 2 β x 2  , (0.8) which correctly reproduce Edelman’ s result for β = 2 and his leading term in N for β = 1 . The missing contrib ution in the latter case corresponds to a O ( N ) correction in the Coulomb gas approach not captured by our theory . Right rate function First, we giv e the deﬁnition of the functions g ( α ) ( x ) and h ( α ) ( x ) appearing in Eq. (13) of the Letter, as g ( α ) ( x ) = Ξ  p $ ( x, t ) + 1 , t  , (0.9) h ( α ) ( x ) = p $ ( x, t ) 1 + p $ ( x, t ) + 1 , (0.10) $ ( y, τ ) = 4 τ (1 + y ) ( τ − 1) 2 ( y − h κ i ) , (0.11) Ξ( r , s ) = s 2 ( r + 1) 2 − ( r − 1) 2 s 2 ( r − 1) 2 − ( r + 1) 2 , (0.12) where t = √ 1 + α . Next, we derive the difference in Coulomb ﬂuid free energy when the minimum eigen v alue is pinned below the left edge z − of the MP law and the maximum eigen v alue above the right edge z + (at the positions L and U respecti vely). W e want to estimate ∆ E = E ( L, λ 2 , . . . , λ N − 1 , U ) − E mp ( λ 1 , . . . , λ N ) , where the initial energy E mp is the one corresponding to the unperturbed MP conﬁguration. Recall that the Coulomb energy is E ( λ 1 , . . . , λ N ) = 1 2 N X i =1 λ i −  β 2 ( αN + 1) − 1  N X i =1 ln λ i − β 2 X i 6 = j ln | λ i − λ j | . (0.13) Then we hav e (after pinning the extreme particles) E ( L, λ 2 , · · · , λ N − 1 , U ) = L + U 2 + 1 2 N − 1 X i =2 λ i −  β 2 ( αN + 1) − 1  (ln L + ln U ) −  β 2 ( αN + 1) − 1  N − 1 X i =2 ln λ i − β X 2 ≤ i x ] ≈ exp [ − β N Φ + ( x )] , where Φ + ( x ) = ∆ e ( ξ ? ( x ) , ξ ? ( x ) x ) so all is left to do is to minimize ∆ e with respect to ξ . One can convince oneself that this is gi ven by d d ξ ∆ e ( ξ , ξ x )    ξ = ξ ? = 0 ⇒ ξ ? ( x ) = 2(2 + α ) x + 1 (0.18) Plugging this result back to ∆ e ( ξ ? ( x ) , ξ ? ( x ) x ) and after some algebra we get to the result (12) of this Letter . Order of phase transitions An analysis of the deriv ativ es of the rate functions will re veal the type of transition as the y approach h κ i . W e obtain Φ − ( x ) = − ( − 1 + √ 1 + α ) 8 96 √ 1 + α (1 + √ 1 + α ) 4 ( h κ i − x ) 3 + · · · Φ + ( x ) = √ 2 4 √ α + 1  √ α + 1 − 1  4 3 √ α + 2  √ α + 1 + 1  2 ( x − h κ i ) 3 / 2 + · · · (0.19) Numerical simulations for the left rate function Let us recall the expression for the cumulati ve distribution P [ κ < x ] = 1 ( N − 1)! Z ∞ 0 d λ 1 " Z xλ 1 λ 1 d λ 2 · · · Z xλ 1 λ 1 d λ N P β ( λ ) # (0.20) Suppose we were to implement a standard Monte Carlo simulation of the Coulomb ﬂuid. After relaxation has been achieved, we would simply use the Monte Carlo Markov Chain to estimate P [ κ < x ] by keeping a record of the number of states such that λ i ∈ [ λ 1 , xλ 1 ] for i = 2 , . . . , N divided by the total number of recorded states, that is P [ κ < x ] ∝ #( { λ 1 , . . . , λ N }| λ i ∈ [ λ 1 , xλ 1 ] , i = 2 , . . . , N ) #( { λ 1 , . . . , λ N } ) (0.21) 12 There is nothing wrong with this procedure except that it is e xtremely inefﬁcient: for atypical v alues of x it will be very unlikely to ﬁnd the Coulomb ﬂuid in that conﬁguration, and a huge number of samples will be needed to hav e reliable statistics. There is a smart way around this, though. One notices that it is very easy to obtain the samples #( { λ 1 , . . . , λ N }| λ i ∈ [ λ 1 , xλ 1 ] , i = 2 , . . . , N ) by simulating the ﬂuid directly with such constraints. The tricky part is to realize how to get the cumulativ e by using this subset of samples. W e write P [ κ < y ] = P [ κ < y | y ≤ x ] K x (0.22) where K x ≡ P [ y ≤ x ] depends only on x . The idea is then to simulate the Coulomb ﬂuid such that λ i ∈ [ λ 1 , xλ 1 ] and to calculate the cumulati ve P [ κ < y | y ≤ x ] . Obviously the cumulati ve will be concentrate mainly in the interv al y ∈ [ x − δ, x ] for some small δ . W e can then use this to estimate the deriv ativ e of the left rate function, getting rid of the constant K x , that is Φ 0 − ( y ) = − 1 β N 2 d d y ln P [ κ < y | y ≤ x ] (0.23) Therefore, the algorithmic procedure is the following: 1. Choose a value of x . 2. Simulate the Coulomb ﬂuid with barriers such that λ i ∈ [ λ 1 , xλ 1 ] and let it thermalize. 3. Construct the histogram of the cumulative P [ κ < y | y ≤ x ] of this conﬁned ﬂuid. 4. Estimate the deriv ati ve Φ 0 − ( y ) using the constructed histogram Then by scanning the values of x we obtain the deriv ativ e of the left rate function in the interval x ∈ [1 , h κ i ] .

Phase transitions in the condition number distribution of Gaussian random matrices

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment