Generalized Rybicki Press algorithm
Authors: Sivaram Ambikasaran
GENERALIZED RYBICKI PRESS ALGORITHM

SIVARAM AMBIKASARAN

O(N) direct solver and determinant computation of exponential covariance and general semi-separable matrices

Abstract. This article discusses a more general and numerically stable Rybicki Press algorithm, which enables inverting and computing determinants of covariance matrices whose elements are sums of exponentials. The algorithm is exact in exact arithmetic and relies on introducing new variables and corresponding equations, thereby converting the matrix into a banded matrix of larger size. Linear complexity banded algorithms for solving linear systems and computing determinants on the larger matrix enable linear complexity algorithms for the initial semi-separable matrix as well. Benchmarks provided illustrate the linear scaling of the algorithm.

Key words. Semi-separable matrices, Rybicki Press algorithm, fast direct solver, fast determinant computation, exponential covariance, CARMA processes

AMS subject classifications. 15A23, 15A15, 15A09

1. Introduction. Large dense covariance matrices arise in a wide range of applications in computational statistics and data analysis. Storing and performing numerical computations on such large dense matrices is computationally intractable. However, most of these large dense matrices are structured (either in exact arithmetic or in finite arithmetic), which can be exploited to construct fast algorithms. One such class of data-sparse matrices is the class of semi-separable matrices, which have attracted a lot of interest and have been studied in detail across a wide range of applications, including integral equations [1-3] and computational statistics [4-8]. For a detailed bibliography on semi-separable matrices, the reader is referred to Vandebril et al. [9]. Throughout the literature, there are slightly different definitions of semi-separable matrices. In this article, we will be working with the following definition:

Definition 1.1.
A ∈ R^{N × N} is termed a semi-separable matrix with semi-separable rank p, if it can be written as

    A = D + triu(B_p) + tril(C_p)                                              (1.1)

where D is a diagonal matrix, B_p and C_p are rank p matrices, triu(B_p) denotes the upper triangular part of B_p and tril(C_p) denotes the lower triangular part of C_p.

Fast algorithms for solving semi-separable linear systems exist, and the reader is referred to [10-15] and the references therein. In this article, we propose a new O(N) direct solver and determinant computation for semi-separable matrices. The main contributions of this article include:

• A new O(N) direct solver for semi-separable matrices, obtained by embedding the semi-separable matrix into a larger banded matrix.

• The determinant of the semi-separable matrix is shown to equal the determinant of the larger banded matrix up to a sign, thereby enabling the computation of determinants of these semi-separable matrices at a computational cost of O(N). This is the first algorithm for computing the determinant of a general semi-separable matrix.

• A numerically stable generalized Rybicki Press algorithm is derived using these ideas. To be specific, fast, stable, direct algorithms are derived for solving and computing determinants (both scaling as O(N)) for covariance matrices of the form

    A_ij = sum_{l=1}^{p} α_l exp(-β_l |t_i - t_j|)                             (1.2)

where i, j ∈ {1, 2, ..., n}, and the points t_i are distinct and distributed on an interval. The covariance matrix in Equation (1.2) is frequently encountered in computational statistics in the context of Continuous-time AutoRegressive-Moving-Average (abbreviated as CARMA) models [16-18].
• Another advantage of this algorithm, from a practical viewpoint, is that it relies only on sparse linear algebra and can therefore easily use existing mature sparse linear algebra libraries.

The algorithm discussed in this article has been implemented in C++ and the implementation is made available at https://github.com/sivaramambikasaran/ESS [19] under the license provided by New York University.

Acknowledgements: The author would like to thank Christopher S. Kochanek for initiating the conversation on the generalized Rybicki Press algorithm, and David W. Hogg for putting him in touch with Christopher S. Kochanek. The author would also like to thank the anonymous referee for his careful, detailed review and insightful comments. The research was supported in part by the NYU-AIG Partnership on Innovation for Global Resilience under grant number A2014-005. The author was also supported in part by the Applied Mathematical Sciences Program of the U.S. Department of Energy under Contract DEFGO288ER25053, and by the Office of the Assistant Secretary of Defense for Research and Engineering and AFOSR under NSSEFF Program Award FA9550-10-1-0180.

2. Sparse embedding of a semi-separable matrix with semi-separable rank 1. To motivate the general idea, we will first look at the sparse embedding for a 4 × 4 semi-separable matrix whose semi-separable rank is 1. The matrix A is as shown in Equation (2.1).
        [ a_11     u_1 v_2  u_1 v_3  u_1 v_4 ]
    A = [ u_1 v_2  a_22     u_2 v_3  u_2 v_4 ]                                 (2.1)
        [ u_1 v_3  u_2 v_3  a_33     u_3 v_4 ]
        [ u_1 v_4  u_2 v_4  u_3 v_4  a_44    ]

The corresponding linear system is Ax = b, where b = [b_1, b_2, b_3, b_4]^T. Introduce the following variables:

    r_4 = v_4 x_4                                                              (2.2)
    r_3 = v_3 x_3 + r_4                                                        (2.3)
    r_2 = v_2 x_2 + r_3                                                        (2.4)
    l_1 = u_1 x_1                                                              (2.5)
    l_2 = u_2 x_2 + l_1                                                        (2.6)
    l_3 = u_3 x_3 + l_2                                                        (2.7)

After introducing these variables, the linear system Ax = b takes the form

    a_11 x_1 + u_1 r_2 = b_1                                                   (2.8)
    v_2 l_1 + a_22 x_2 + u_2 r_3 = b_2                                         (2.9)
    v_3 l_2 + a_33 x_3 + u_3 r_4 = b_3                                         (2.10)
    v_4 l_3 + a_44 x_4 = b_4                                                   (2.11)

The extended linear system (after appropriate ordering of equations and unknowns) is then of the form

    [ a_11  u_1   0     0     0     0     0     0     0     0    ] [x_1]   [b_1]
    [ u_1   0    -1     0     0     0     0     0     0     0    ] [r_2]   [ 0 ]
    [ 0    -1     0     v_2   1     0     0     0     0     0    ] [l_1]   [ 0 ]
    [ 0     0     v_2   a_22  u_2   0     0     0     0     0    ] [x_2]   [b_2]
    [ 0     0     1     u_2   0    -1     0     0     0     0    ] [r_3] = [ 0 ]   (2.12)
    [ 0     0     0     0    -1     0     v_3   1     0     0    ] [l_2]   [ 0 ]
    [ 0     0     0     0     0     v_3   a_33  u_3   0     0    ] [x_3]   [b_3]
    [ 0     0     0     0     0     1     u_3   0    -1     0    ] [r_4]   [ 0 ]
    [ 0     0     0     0     0     0     0    -1     0     v_4  ] [l_3]   [ 0 ]
    [ 0     0     0     0     0     0     0     0     v_4   a_44 ] [x_4]   [b_4]

Note that the matrix in Equation (2.12) is a banded matrix of bandwidth 2, and it has a sparsity structure even within the band. In general, let A be an N × N semi-separable matrix with semi-separable rank 1, as written in Equation (2.13):

    A(i,j) = a_ii     if i = j
             u_j v_i  if i > j                                                 (2.13)
             u_i v_j  if i < j

where i, j ∈ {1, 2, ..., N}. One would then need to add the variables r_2, r_3, ..., r_N and l_1, l_2, ..., l_{N-1}, where r_N = v_N x_N, l_1 = u_1 x_1 and

    r_k = v_k x_k + r_{k+1}                                                    (2.14)
    l_k = u_k x_k + l_{k-1}                                                    (2.15)

where k ∈ {2, ..., N-1}. Hence, we have a total of 3N - 2 variables and 3N - 2 equations. Therefore, the extended matrix will be a (3N - 2) × (3N - 2) banded matrix whose bandwidth is 2. This is illustrated pictorially for an 8 × 8 matrix in Figure 2.1.

3. Sparse embedding of a general semi-separable matrix.
Let A be an N × N matrix whose semi-separable rank is p, i.e., we have

    A(i,j) = a_ii                              if i = j
             sum_{l=1}^{p} u_j^(l) v_i^(l)     if i > j                        (3.1)
             sum_{l=1}^{p} u_i^(l) v_j^(l)     if i < j

where i, j ∈ {1, 2, ..., N}. We then add the variables r_2^(p), r_3^(p), ..., r_N^(p) and l_1^(p), l_2^(p), ..., l_{N-1}^(p) as before. However, not surprisingly, these new variables r_i^(p) and l_j^(p) will now be vectors of length p. Let

    U_k^(p) = [u_k^(1)  u_k^(2)  u_k^(3)  ...  u_k^(p)]  and  V_k^(p) = [v_k^(1)  v_k^(2)  v_k^(3)  ...  v_k^(p)].

We then have the following relations for the additional vector variables:

    r_N^(p) = V_N^T x_N                                                        (3.2)
    l_1^(p) = U_1^T x_1                                                        (3.3)

[Fig. 2.1. Pictorial description of the extended sparse matrix obtained from a rank-1 semi-separable matrix, where N = 8. The figure shows the color-coded sparsity pattern, with entries a_kk, u_i, v_j, -1 and 1.]

    r_k^(p) = V_k^T x_k + r_{k+1}^(p)                                          (3.4)
    l_k^(p) = U_k^T x_k + l_{k-1}^(p)                                          (3.5)

where k ∈ {2, ..., N-1}. Hence, we now have (2p + 1)N - 2p variables (this includes the N scalars x_k, and the N - 1 vector variables r_k^(p) and the N - 1 vector variables l_k^(p), each of length p) and (2p + 1)N - 2p equations relating them. Therefore, we end up with a ((2p + 1)N - 2p) × ((2p + 1)N - 2p) extended sparse matrix, whose bandwidth is 2p + 1. This is illustrated in Figure 3.1 for a 10 × 10 semi-separable matrix whose semi-separable rank is 4. The computational complexity of the algorithm clearly scales as O(N), since the extended sparse matrix has a bandwidth of O(p) and is of size O(pN) × O(pN). It is also possible to analyze the scaling with respect to the semi-separable rank p, though this is of little practical relevance, since p = O(1) for most interesting semi-separable matrices.
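To make the construction concrete, the following self-contained sketch builds the (3N - 2) × (3N - 2) rank-1 embedding of Equation (2.12) and checks that it reproduces the solution of the dense semi-separable system. It is purely illustrative: the helper names (`extended_system`, `gauss_solve`) are mine, and the paper's released implementation is in C++ with Eigen, not Python.

```python
# Illustrative sketch of the rank-1 sparse embedding (Section 2); not the paper's code.

def gauss_solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting (for verification only)."""
    n = len(rhs)
    A = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x

def extended_system(a, u, v, b):
    """Rank-1 embedding of Eq. (2.12): unknowns ordered [x_1, r_2, l_1, x_2, r_3, l_2, ...]."""
    N = len(a)
    m = 3 * N - 2
    X = lambda k: 3 * k        # column of x_{k+1}
    R = lambda k: 3 * k + 1    # column of r_{k+2}
    L = lambda k: 3 * k + 2    # column of l_{k+1}
    M = [[0.0] * m for _ in range(m)]
    rhs = [0.0] * m
    row = 0
    for k in range(N):
        # original equation: v_{k+1} l_k + a_{k+1,k+1} x_{k+1} + u_{k+1} r_{k+2} = b_{k+1}
        if k > 0:
            M[row][L(k - 1)] = v[k]
        M[row][X(k)] = a[k]
        if k < N - 1:
            M[row][R(k)] = u[k]
        rhs[row] = b[k]
        row += 1
        if k < N - 1:
            # definition (2.15): l_{k+1} = u_{k+1} x_{k+1} + l_k
            M[row][X(k)] = u[k]
            if k > 0:
                M[row][L(k - 1)] = 1.0
            M[row][L(k)] = -1.0
            row += 1
            # definition (2.14): r_{k+2} = v_{k+2} x_{k+2} + r_{k+3}
            M[row][R(k)] = -1.0
            M[row][X(k + 1)] = v[k + 1]
            if k < N - 2:
                M[row][R(k + 1)] = 1.0
            row += 1
    return M, rhs

a = [2.0, 3.0, 4.0, 5.0]
u = [0.5, 0.4, 0.3, 0.2]
v = [0.1, 0.2, 0.3, 0.4]
b = [1.0, 2.0, 3.0, 4.0]
N = len(a)
# Dense semi-separable matrix, Eq. (2.13).
A = [[a[i] if i == j else (u[j] * v[i] if i > j else u[i] * v[j])
     for j in range(N)] for i in range(N)]
x_dense = gauss_solve(A, b)
M_ex, rhs_ex = extended_system(a, u, v, b)
x_full = gauss_solve(M_ex, rhs_ex)
x_fast = [x_full[3 * k] for k in range(N)]   # x_k occupies every third slot
err = max(abs(s - t) for s, t in zip(x_dense, x_fast))
assert err < 1e-10
```

The same bookkeeping extends to rank p by replacing the scalar auxiliary variables with length-p vectors, as in Equations (3.2)-(3.5); a production solver would of course store the extended matrix sparsely and use a banded factorization rather than dense elimination.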
A detailed analysis shows that the computational complexity of the algorithm is O(p^2 N). Numerical benchmarks presented in Section 7 validate the scaling of the algorithm.

4. Determinant of the extended sparse matrix.

Claim 4.1. The determinant of the extended sparse matrix is the same as the determinant of the original dense matrix, up to a sign.

[Fig. 3.1. Pictorial description of the extended sparse matrix where N = 10 and p = 4. The figure shows the color-coded sparsity pattern, with entries a_kk, u_i^(p), v_j^(p), -1 and 1.]

The extended system, denoted by A_ex, on appropriate reordering of rows and columns can be written as

                  [l^(p)]   [ L_Δ   0    U_a ] [l^(p)]
    P_1 A_ex P_2  [r^(p)] = [ 0     U_Δ  V_a ] [r^(p)]                         (4.1)
                  [  x  ]   [ V_b   U_b  D   ] [  x  ]

where l^(p) = [l_1^(p); ...; l_{N-1}^(p)], r^(p) = [r_2^(p); ...; r_N^(p)], x = [x_1; ...; x_N], and P_1, P_2 are permutation matrices. The matrix L_Δ is a highly sparse lower-triangular matrix with 1's on the diagonal and -1's at a few places in the lower-triangular part (the precise locations are unimportant for determinant computations, as we will see later), the matrix U_Δ is a highly sparse upper-triangular matrix with 1's on the diagonal and -1's at a few places in the upper-triangular part, and D is a diagonal matrix with D_ii = a_ii. The first set of rows, i.e., [L_Δ 0 U_a], corresponds to adding the variables l_k^(p), i.e., l_k^(p) = U_k^T x_k + l_{k-1}^(p). The next set of rows, i.e., [0 U_Δ V_a], corresponds to adding the variables r_k^(p), i.e., r_k^(p) = V_k^T x_k + r_{k+1}^(p). The last set of rows, i.e., [V_b U_b D], corresponds to the initial set of equations with the l_k^(p)'s and r_k^(p)'s introduced. We then have

    det(P_1 A_ex P_2) = det [ L_Δ   0   ] det( D - [V_b  U_b] [ L_Δ   0   ]^{-1} [U_a] )   (4.2)
                            [ 0     U_Δ ]                     [ 0     U_Δ ]      [V_a]
Here we used the block determinant formula for a 2 × 2 block matrix. Now note that det(L_Δ) = 1 = det(U_Δ), due to the fact that L_Δ and U_Δ are triangular matrices with 1's on the diagonal. Hence,

    det [ L_Δ   0   ] = det(L_Δ) det(U_Δ) = 1 × 1 = 1                          (4.3)
        [ 0     U_Δ ]

Further, note that the matrix D - [V_b U_b] [L_Δ 0; 0 U_Δ]^{-1} [U_a; V_a] is the Schur complement obtained by eliminating the variables l_i^(p), r_i^(p), and hence is the initial dense matrix A we began with, i.e.,

    D - [V_b  U_b] [ L_Δ   0   ]^{-1} [U_a] = A                                (4.4)
                   [ 0     U_Δ ]      [V_a]

Hence, we have

    det(P_1 A_ex P_2) = det( D - [V_b  U_b] [ L_Δ   0   ]^{-1} [U_a] ) = det(A)   (4.5)
                                            [ 0     U_Δ ]      [V_a]

which gives us

    det(A_ex) = ± det(A)                                                       (4.6)

where the ambiguity in the sign arises from the determinants of the permutation matrices.

5. Reinterpretation of the Rybicki Press algorithm in terms of sparse embedding. We will first naively reinterpret the Rybicki Press algorithm in terms of the extended sparse matrix algebra. Recall that the Rybicki Press algorithm [20] inverts a correlation matrix A given by Equation (5.1):

    A(i,j) = exp(-β |t_i - t_j|)                                               (5.1)

where the t_i's lie on an interval and are monotone. The original Rybicki Press algorithm relies on the fact that the inverse of A happens to be a tridiagonal matrix. The key ingredient of their algorithm is the following property of exponentials:

    exp(β(t_i - t_j)) exp(β(t_j - t_k)) = exp(β(t_i - t_k))                    (5.2)

In our sparse interpretation as well, we will use this property, to recognize that the matrix A is a semi-separable matrix whose semi-separable rank is 1. This can be seen by setting u_k = exp(β t_k) and v_k = exp(-β t_k). This then gives us, for i < j, that A(i,j) = u_i v_j = exp(β t_i) exp(-β t_j) = exp(β(t_i - t_j)), and similarly for i > j. This shows that the matrix A is semi-separable with semi-separable rank 1. Hence, we can mimic the same approach as in the earlier sections to obtain an O(N) algorithm.
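Claim 4.1 can also be checked numerically on a small instance. The sketch below is illustrative only (not from the paper's released code): it writes out the 7 × 7 extended matrix for a 3 × 3 rank-1 semi-separable matrix, with the same unknown ordering [x_1, r_2, l_1, x_2, r_3, l_2, x_3] as in Equation (2.12), and verifies that |det(A_ex)| = |det(A)| to machine precision.

```python
# Numerical check of Claim 4.1 on a 3x3 rank-1 semi-separable matrix (illustrative).

def det(M):
    """Determinant via Gaussian elimination with partial pivoting, tracking the sign."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d                     # each row swap flips the sign
        if A[c][c] == 0.0:
            return 0.0
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

a1, a2, a3 = 2.0, 3.0, 4.0
u1, u2, u3 = 0.5, 0.3, 0.2
v2, v3 = 0.4, 0.6

# Dense semi-separable matrix, Eq. (2.13).
A = [[a1,      u1 * v2, u1 * v3],
     [u1 * v2, a2,      u2 * v3],
     [u1 * v3, u2 * v3, a3]]

# Extended sparse matrix, unknowns ordered [x1, r2, l1, x2, r3, l2, x3].
A_ex = [[a1, u1,  0,  0,  0,  0,  0],   # a11 x1 + u1 r2 = b1
        [u1,  0, -1,  0,  0,  0,  0],   # l1 = u1 x1
        [0,  -1,  0, v2,  1,  0,  0],   # r2 = v2 x2 + r3
        [0,   0, v2, a2, u2,  0,  0],   # v2 l1 + a22 x2 + u2 r3 = b2
        [0,   0,  1, u2,  0, -1,  0],   # l2 = u2 x2 + l1
        [0,   0,  0,  0, -1,  0, v3],   # r3 = v3 x3
        [0,   0,  0,  0,  0, v3, a3]]   # v3 l2 + a33 x3 = b3

# The determinants agree up to the sign, as asserted by Claim 4.1.
assert abs(abs(det(A_ex)) - abs(det(A))) < 1e-9
```

The sign ambiguity of Equation (4.6) is harmless in practice, since applications such as Gaussian likelihoods only need log|det(A)|.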
However, there is an issue that needs to be addressed from a numerical perspective. If the t_i's are spread over a large interval, then u_i is exponentially large while v_i is exponentially small, and hence embedding into a sparse matrix as such could prove catastrophic, leading to underflow and overflow of the relevant entries. This issue, though, can be circumvented by a suitable analytic preconditioning, i.e., by an appropriate change of variables. This is illustrated for a 4 × 4 linear system. We will use the notation t_ij to denote |t_i - t_j|. The linear system is

    [ 1             exp(-β t_12)  exp(-β t_13)  exp(-β t_14) ] [x_1]   [b_1]
    [ exp(-β t_12)  1             exp(-β t_23)  exp(-β t_24) ] [x_2] = [b_2]   (5.3)
    [ exp(-β t_13)  exp(-β t_23)  1             exp(-β t_34) ] [x_3]   [b_3]
    [ exp(-β t_14)  exp(-β t_24)  exp(-β t_34)  1            ] [x_4]   [b_4]

Now let us introduce the additional variables as follows:

    r_4 = x_4                                                                  (5.4)
    r_3 = x_3 + exp(-β t_34) r_4                                               (5.5)
    r_2 = x_2 + exp(-β t_23) r_3                                               (5.6)
    l_2 = x_1 exp(-β t_12)                                                     (5.7)
    l_3 = (x_2 + l_2) exp(-β t_23)                                             (5.8)
    l_4 = (x_3 + l_3) exp(-β t_34)                                             (5.9)

The equations then become

    x_1 + exp(-β t_12) r_2 = b_1                                               (5.10)
    l_2 + x_2 + exp(-β t_23) r_3 = b_2                                         (5.11)
    l_3 + x_3 + exp(-β t_34) r_4 = b_3                                         (5.12)
    l_4 + x_4 = b_4                                                            (5.13)

Embedding this in an extended sparse matrix (writing e_ij := exp(-β t_ij) for brevity), we obtain

    [ 1     e_12  0     0     0     0     0     0     0     0  ] [x_1]   [b_1]
    [ e_12  0    -1     0     0     0     0     0     0     0  ] [r_2]   [ 0 ]
    [ 0    -1     0     1     e_23  0     0     0     0     0  ] [l_2]   [ 0 ]
    [ 0     0     1     1     e_23  0     0     0     0     0  ] [x_2]   [b_2]
    [ 0     0     e_23  e_23  0    -1     0     0     0     0  ] [r_3] = [ 0 ]   (5.14)
    [ 0     0     0     0    -1     0     1     e_34  0     0  ] [l_3]   [ 0 ]
    [ 0     0     0     0     0     1     1     e_34  0     0  ] [x_3]   [b_3]
    [ 0     0     0     0     0     e_34  e_34  0    -1     0  ] [r_4]   [ 0 ]
    [ 0     0     0     0     0     0     0    -1     0     1  ] [l_4]   [ 0 ]
    [ 0     0     0     0     0     0     0     0     1     1  ] [x_4]   [b_4]

Note that all the entries of this matrix now lie in [0, 1]. The sparsity pattern of the matrix is the same as before, which is to be expected, since all we have done, essentially, is
to scale the elements appropriately, and hence the zero fill-ins remain the same.

6. Numerically stable generalized Rybicki Press. The same idea carries over to the generalized Rybicki Press algorithm, i.e., if we consider a CARMA(p, q) process which has the covariance matrix given by

    K(r) = d                                        if r = 0
           sum_{l=1}^{p} α_l(p, q) exp(-β_l r)      if r > 0                   (6.1)

then it immediately follows that the matrix is semi-separable with semi-separable rank p. To avoid numerical overflow and underflow, as shown in the previous section, appropriate sets of variables need to be introduced. Let

    α = [α_1  α_2  ...  α_p]^T                                                 (6.2)

and

    γ_k = [exp(-β_1 t_{k,k+1})  exp(-β_2 t_{k,k+1})  ...  exp(-β_p t_{k,k+1})]^T   (6.3)

Now introduce the variables

    r_k = α x_k + D^(k,k+1) r_{k+1}                                            (6.4)
    l_k = γ_{k-1} x_{k-1} + D^(k-1,k) l_{k-1}                                  (6.5)

where k ∈ {2, 3, ..., N}, with r_{N+1} = l_1 = 0, and D^(k,k+1) is a p × p diagonal matrix whose diagonal is γ_k. The initial equations become

    α^T l_k + d x_k + γ_k^T r_{k+1} = b_k                                      (6.6)

where k ∈ {1, 2, ..., N}. Now form the extended sparse matrix using the variables x_k, l_k and r_k, with the equations being Equations (6.4), (6.5) and (6.6). The sparsity pattern of the extended sparse matrix is the same as before, and hence the computational complexity scales as O(N).

7. Numerical benchmarks. We present a few numerical benchmarks illustrating the scaling of the algorithm and the error. In all these benchmarks, the semi-separable matrix is of the form

    A(i,j) = d                                         if i = j
             sum_{l=1}^{p} α_l exp(-β_l |t_i - t_j|)   if i ≠ j                (7.1)

where the t_i's lie on a one-dimensional manifold and are sorted in increasing order. Apart from the time taken for the assembly, factorization and solve, the infinity norm of the residual, i.e., ‖Ax - b‖_∞, and the relative error in the log determinant are also presented.
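The rescaled embedding of Sections 5 and 6 can be sketched in code for the rank-1 exponential kernel: every off-diagonal entry of the extended matrix is a ratio exp(-β(t_{k+1} - t_k)) ∈ (0, 1], so no overflow or underflow can occur. The following pure-Python version of Equation (5.14), generalized to arbitrary N, is my own illustrative sketch; the helper names are not from the paper's released code.

```python
# Illustrative sketch of the numerically stable rank-1 embedding (Section 5).
import math

def gauss_solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting (for verification only)."""
    n = len(rhs)
    A = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x

def stable_extended_system(t, beta, b):
    """Rescaled embedding of Eq. (5.14): only the local decay factors
    g_k = exp(-beta (t_{k+1} - t_k)) in (0, 1] ever appear, never exp(beta t_k)."""
    N = len(t)
    g = [math.exp(-beta * (t[k + 1] - t[k])) for k in range(N - 1)]
    m = 3 * N - 2
    X = lambda k: 3 * k        # column of x_{k+1}
    R = lambda k: 3 * k + 1    # column of r_{k+2}
    L = lambda k: 3 * k + 2    # column of l_{k+2}
    M = [[0.0] * m for _ in range(m)]
    rhs = [0.0] * m
    row = 0
    for k in range(N):
        # original equation: l_{k+1} + x_{k+1} + g_{k+1} r_{k+2} = b_{k+1}  (l_1 = 0)
        if k > 0:
            M[row][L(k - 1)] = 1.0
        M[row][X(k)] = 1.0
        if k < N - 1:
            M[row][R(k)] = g[k]
        rhs[row] = b[k]
        row += 1
        if k < N - 1:
            # definition: l_{k+2} = g_{k+1} (x_{k+1} + l_{k+1})
            M[row][X(k)] = g[k]
            if k > 0:
                M[row][L(k - 1)] = g[k]
            M[row][L(k)] = -1.0
            row += 1
            # definition: r_{k+2} = x_{k+2} + g_{k+2} r_{k+3}  (r_{N+1} absent)
            M[row][R(k)] = -1.0
            M[row][X(k + 1)] = 1.0
            if k < N - 2:
                M[row][R(k + 1)] = g[k + 1]
            row += 1
    return M, rhs

beta = 1.3
t = [0.0, 1.5, 4.0, 9.0, 15.0]   # a naive u_k = exp(beta t_k) grows exponentially with the spread
b = [1.0, 2.0, 3.0, 4.0, 5.0]
N = len(t)
A = [[math.exp(-beta * abs(t[i] - t[j])) for j in range(N)] for i in range(N)]
x_dense = gauss_solve(A, b)
M_ex, rhs_ex = stable_extended_system(t, beta, b)
x_full = gauss_solve(M_ex, rhs_ex)
x_fast = [x_full[3 * k] for k in range(N)]
err = max(abs(s - q) for s, q in zip(x_dense, x_fast))
assert err < 1e-10
```

For the rank-p kernel of Equation (7.1), the scalars g_k become the diagonal matrices D^(k,k+1) of Equation (6.4), but the construction is otherwise identical.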
For the purposes of benchmarking, the t_i's are chosen at random from the interval [0, 20] and then sorted; the α_l's and β_l's are chosen at random from the interval [0, 2]; and d is set equal to 1 + sum_{l=1}^{p} α_l. Throughout the benchmarks, the original dense matrix will be referred to as A, while the corresponding extended sparse matrix will be referred to as A_ex. The extended sparse linear system, i.e., A_ex x_ex = b_ex, is solved using the sparse LU factorization (SparseLU) in Eigen [21]. This relies on the sequential SuperLU package [22-24], which performs sparse LU decomposition with partial pivoting. The preordering of the unknowns is performed using the COLAMD method [25]. It is to be noted that despite the preordering and partial pivoting, which in turn affect the banded structure, the computational cost for the extended sparse system, as shown in Figures 7.1 and 7.2, scales linearly in the number of unknowns. The extended sparse matrix is stored using a triplet list in Eigen [21], which internally converts it into compressed column/row storage format. The exact implementation can be found at https://github.com/sivaramambikasaran/ESS [19].

7.1. Benchmark 1. In this benchmark, we illustrate the linear scaling of the algorithm with the number of unknowns N for different choices of p. The solution obtained using the sparse LU factorization is compared with the partial pivoted LU algorithm (PartialPivLU) in Eigen [21], which is used to solve the initial dense linear system Ax = b. Table 7.1 shows the scaling of the algorithm and the maximum error in the residual for a fixed semi-separable rank of p = 5.

Table 7.1. Scaling of the algorithm with system size N for a fixed semi-separable rank p = 5. The time taken is reported in milliseconds.
    N        Assembly (ms)      Factorize (ms)     Solve (ms)         Residual ‖Ax - b‖_∞           log(|A_ex|/|A|)/log(|A|)
             Usual     Fast     Usual     Fast     Usual     Fast     Usual        Fast
    500      15.5      1.15     12        8        0.233     1.36     2 x 10^-14   2.2 x 10^-15     1.08 x 10^-15
    1000     49.2      1.73     91.6      15.5     0.862     2.02     4 x 10^-14   3.8 x 10^-15     1.46 x 10^-15
    2000     188       3.28     643       30.8     2.80      4.41     9 x 10^-14   5.6 x 10^-15     1.67 x 10^-15
    5000     1150      9.11     9360      83.1     14.4      10.4     2 x 10^-13   6.4 x 10^-15     5.44 x 10^-16
    10000    4760      20.5     71900     167      58.1      20.5     3 x 10^-13   8.0 x 10^-15     3.74 x 10^-15
    20000    -         49.1     -         333      -         42.9     -            1.0 x 10^-14     -
    50000    -         116      -         838      -         108      -            1.5 x 10^-14     -
    100000   -         216      -         1680     -         213      -            1.8 x 10^-14     -
    200000   -         441      -         3380     -         425      -            2.6 x 10^-14     -
    500000   -         1330     -         8500     -         1070     -            3.4 x 10^-14     -
    1000000  -         2700     -         17600    -         2330     -            3.9 x 10^-14     -

(Entries marked "-" correspond to system sizes for which the dense ("Usual") solver was not run.)

• Assembly time - Time taken to assemble the dense matrix versus the extended sparse matrix.
• Factorization time - Time taken to factorize the dense matrix versus the extended sparse matrix.
• Solve time - Time taken to solve the dense linear system versus the extended sparse linear system (once the factorization has been obtained).
• Error in residual - Comparison of ‖Ax - b‖_∞ and ‖A_ex x_ex - b_ex‖_∞.
• Error in log-det - Relative error between the log of the absolute value of the determinant of the dense matrix and that of the extended sparse matrix.

Figure 7.1 illustrates the scaling of the assembly, factorization and solve time with system size. The different components of the algorithm, i.e., assembly, factorization and solve, scale linearly in the number of unknowns. Also, as expected, the pre-factor in front of the linear scaling increases with the semi-separable rank p, i.e., in our case the number of exponentials.
[Fig. 7.1. Scaling of the algorithm with system size. Panels: (a) assembly time, (b) factor time, (c) solve time, and (d) maximum residual ‖Ax - b‖_∞, each versus system size, for p = 1, 2, 5, 10, with an O(N) reference line. From the benchmarks, it is clear that the computational cost of the fast algorithm scales as O(N) for the assembly, factorization and solve stages, where N is the number of unknowns. The maximum residual is less than 10^-13 even for a system with a million unknowns.]

7.2. Benchmark 2. In this benchmark, we illustrate the scaling of the time taken (assembly, factorization and solve) by the algorithm with p, the number of exponentials added (equivalently, the semi-separable rank).
[Fig. 7.2. Scaling of the fast algorithm with the number of exponentials added. Panels: (a) assembly time, (b) factor time, (c) solve time, and (d) maximum residual ‖Ax - b‖_∞, each versus the number of exponentials, for N = 50000, 100000, 200000, with O(p) and O(p^2) reference lines. From the benchmarks, it is clear that the computational cost of the fast algorithm scales as O(p) for the assembly and solve stages, while it scales as O(p^2) for the factorization stage, where p is the number of exponentials (equivalently, the semi-separable rank). The maximum residual is almost always less than 10^-13.]

Figure 7.2 illustrates the scaling of the different parts of the algorithm with the semi-separable rank p. Note that the assembly time scales linearly with the semi-separable rank, while the factorization time scales quadratically with the semi-separable rank, as expected. The error in the solution seems to be more or less independent of the semi-separable rank.

8. Conclusion. The article discusses a numerically stable, generalized Rybicki Press algorithm, which relies on the fact that a semi-separable matrix can be embedded into a larger banded matrix. This enables O(N) inversion and determinant computation of covariance matrices whose entries are sums of exponentials. This also immediately provides a fast matrix-vector product for semi-separable matrices.
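The fast matrix-vector product mentioned above follows directly from the recurrences (6.4)-(6.6): y = Ax can be evaluated in O(pN) operations without ever forming a matrix, using one backward and one forward sweep. The sketch below is my own illustration of this idea for the sum-of-exponentials covariance (7.1), not code from the released ESS library.

```python
# Illustrative O(pN) matrix-vector product via the recurrences (6.4)-(6.6).
import math

def fast_matvec(t, alpha, beta, d, x):
    """y = A x for A(i,i) = d and A(i,j) = sum_l alpha_l exp(-beta_l |t_i - t_j|), i != j.
    Only the well-scaled local factors exp(-beta_l (t_{k+1} - t_k)) are used."""
    N, p = len(t), len(alpha)
    # gam[k] is gamma_{k+1} in the paper's 1-based indexing (Eq. (6.3)).
    gam = [[math.exp(-beta[l] * (t[k + 1] - t[k])) for l in range(p)]
           for k in range(N - 1)]
    # Backward sweep (6.4): r_k = alpha x_k + D^(k,k+1) r_{k+1}, with r_{N+1} = 0.
    r = [[0.0] * p for _ in range(N + 2)]      # r[k] holds r_k (1-based)
    for k in range(N, 1, -1):
        for l in range(p):
            decay = gam[k - 1][l] if k <= N - 1 else 0.0
            r[k][l] = alpha[l] * x[k - 1] + decay * r[k + 1][l]
    # Forward sweep (6.5): l_k = gamma_{k-1} x_{k-1} + D^(k-1,k) l_{k-1}, with l_1 = 0.
    lv = [[0.0] * p for _ in range(N + 1)]     # lv[k] holds l_k (1-based)
    for k in range(2, N + 1):
        for l in range(p):
            lv[k][l] = gam[k - 2][l] * (x[k - 2] + lv[k - 1][l])
    # Main equations (6.6): y_k = alpha^T l_k + d x_k + gamma_k^T r_{k+1}.
    y = []
    for k in range(1, N + 1):
        s = d * x[k - 1] + sum(alpha[l] * lv[k][l] for l in range(p))
        if k <= N - 1:
            s += sum(gam[k - 1][l] * r[k + 1][l] for l in range(p))
        y.append(s)
    return y

t = [0.0, 0.7, 1.1, 2.5, 3.0]
alpha, beta, d = [0.6, 0.4], [1.0, 2.0], 1.7
x = [1.0, -1.0, 2.0, 0.5, -0.3]
N = len(t)
# Dense O(N^2) reference, Eq. (7.1).
A = [[d if i == j else sum(a * math.exp(-bl * abs(t[i] - t[j]))
                           for a, bl in zip(alpha, beta))
      for j in range(N)] for i in range(N)]
y_dense = [sum(A[i][j] * x[j] for j in range(N)) for i in range(N)]
y_fast = fast_matvec(t, alpha, beta, d, x)
err = max(abs(s - q) for s, q in zip(y_dense, y_fast))
assert err < 1e-12
```

Each sweep touches every point once with O(p) work per point, so the total cost is O(pN), compared to O(N^2) for the dense product.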
This publication also serves to formally announce the release of the implementation of the extended sparse semi-separable factorization and the generalized Rybicki Press algorithm. The implementation is in C++ and is made available at https://github.com/sivaramambikasaran/ESS [19] under the license provided by New York University.

References.
[1] S. O. Asplund. Finite boundary value problems solved by Green's matrix. Mathematica Scandinavica, 7:49-56, 1959.
[2] I. Gohberg, M. A. Kaashoek, and F. van Schagen. Non-compact integral operators with semi-separable kernels and their discrete analogues: Inversion and Fredholm properties. Integral Equations and Operator Theory, 7(5):642-703, 1984.
[3] Fritz Gesztesy and Konstantin A. Makarov. Modified Fredholm determinants for operators with matrix-valued semi-separable integral kernels revisited. Integral Equations and Operator Theory, 47(4):457-497, 2003.
[4] S. N. Roy and A. E. Sarhan. On inverting a class of patterned matrices. Biometrika, pages 227-231, 1956.
[5] S. N. Roy, B. G. Greenberg, and A. E. Sarhan. Evaluation of determinants, characteristic equations and their roots for a class of patterned matrices. Journal of the Royal Statistical Society, Series B (Methodological), pages 348-359, 1960.
[6] Chandan K. Mustafi. The inverse of a certain matrix, with an application. The Annals of Mathematical Statistics, pages 1289-1292, 1967.
[7] V. R. Rao Uppuluri and J. A. Carpenter. The inverse of a matrix occurring in first-order moving-average models. Sankhya: The Indian Journal of Statistics, Series A, pages 79-82, 1969.
[8] B. G. Greenberg and Ahmed E. Sarhan. Matrix inversion, its interest and application in analysis of data. Journal of the American Statistical Association, pages 755-766, 1959.
[9] Raf Vandebril, Marc Van Barel, Gene Golub, and Nicola Mastronardi. A bibliography on semi-separable matrices.
Calcolo, 42(3-4):249-270, 2005.
[10] Y. Eidelman and I. Gohberg. A modification of the Dewilde-van der Veen method for inversion of finite structured matrices. Linear Algebra and its Applications, 343:419-450, 2002.
[11] Ellen Van Camp, Nicola Mastronardi, and Marc Van Barel. Two fast algorithms for solving diagonal-plus-semi-separable linear systems. Journal of Computational and Applied Mathematics, 164:731-747, 2004.
[12] Y. Eidelman and I. Gohberg. Inversion formulas and linear complexity algorithm for diagonal plus semi-separable matrices. Computers & Mathematics with Applications, 33(4):69-79, 1997.
[13] I. Gohberg, T. Kailath, and I. Koltracht. Linear complexity algorithms for semi-separable matrices. Integral Equations and Operator Theory, 8(6):780-804, 1985.
[14] Jitesh Jain, Hong Li, Cheng-Kok Koh, and Venkataramanan Balakrishnan. O(n) algorithms for banded plus semi-separable matrices. In Numerical Methods for Structured Matrices and Applications, pages 347-358. Springer, 2010.
[15] Shiv Chandrasekaran and Ming Gu. Fast and stable algorithms for banded plus semi-separable systems of linear equations. SIAM Journal on Matrix Analysis and Applications, 25(2):373-384, 2003.
[16] Peter J. Brockwell and Richard A. Davis. Introduction to Time Series and Forecasting, volume 1. Taylor & Francis, 2002.
[17] Peter J. Brockwell. Levy-driven CARMA processes. Annals of the Institute of Statistical Mathematics, 53(1):113-124, 2001.
[18] Peter J. Brockwell. On continuous-time threshold ARMA processes. Journal of Statistical Planning and Inference, 39(2):291-303, 1994.
[19] Sivaram Ambikasaran. ESS. https://github.com/sivaramambikasaran/ESS, 2014.
[20] George B. Rybicki and William H. Press. Class of fast methods for processing irregularly sampled or otherwise inhomogeneous one-dimensional data. Physical Review Letters, 74(7):1060, 1995.
[21] Gael Guennebaud, Benoit Jacob, et al. Eigen v3.
http://eigen.tuxfamily.org, 2010.
[22] James W. Demmel, Stanley C. Eisenstat, John R. Gilbert, Xiaoye S. Li, and Joseph W. H. Liu. A supernodal approach to sparse partial pivoting. SIAM Journal on Matrix Analysis and Applications, 20(3):720-755, 1999.
[23] James W. Demmel. SuperLU Users' Guide. Lawrence Berkeley National Laboratory, 2011.
[24] Xiaoye S. Li. An overview of SuperLU: Algorithms, implementation, and user interface. ACM Transactions on Mathematical Software (TOMS), 31(3):302-325, 2005.
[25] Timothy A. Davis, John R. Gilbert, Stefan I. Larimore, and Esmond G. Ng. Algorithm 836: COLAMD, a column approximate minimum degree ordering algorithm. ACM Transactions on Mathematical Software (TOMS), 30(3):377-380, 2004.