Identifying Relevant Eigenimages - a Random Matrix Approach

> < 1 Identifying Relevant Eigenimages – a Random Matrix Approach Yu Ding, Yiu-Cho Chung, Kun Huang, and Orlando P. Simonetti Abstract —Dimensional reduction of high dimen sional data can be achieved by keeping only the relevant eigenmodes after principal component analysis . However, differentiating relevant eigenmodes from the ran dom noise eigenmodes is problematic. A new method based on the random matrix theory and a statistical goodness-of-fit test is proposed in this pap er. It is validated by numerica l simulations and app lied to real-time magnetic resonance cardiac cin e images. I. I NTRODUCTION N the fields of pattern recogni tion, machine l earning, and comput er vision, eigen-decom position based m ethods such as principal component analy sis (PCA) 1 , multidimensional scaling (MDS) 2 , and spectral clustering 3 have been widely adopted for the purpose of dimensionality reduction, data representation, and clus tering. A com mon fundam ental issu e in these approaches is to determine how ma ny eigenvectors are needed for further analysi s (e.g., data representation and cl ustering). Traditionall y, the choice of the num ber of pri ncipal eigenvectors is m ade in an em pirical or ad hoc m anner (e.g., the knee point of the eigenvalue plot). In PCA, one of the goals is t o find a low-dimensi onal representation of the dat a so that the residuals in the ambient sp ace are randomly di stributed. I Recent development of the random m atrix theory (RMT) 4 brings a new tool to this ol d problem. RM T has been successfully applied to study nucl ear energy level 4 , quantum chaos 5 , EEG data 6 , comm unication system s 7 , and the financial m arket 8, 9 . Usually, a data m atrix is the sum of a low-rank signal m atrix and a full-rank noise m atrix. Sengupta et al 10 studied the p robability Manuscript received Dec 09, 2008. Y. Ding is with The Ohio Sate Univ ersity, Columbus, OH 43210 USA (phone: 614-293-2710; fax: 614-293-2561; e-mail: Yu.Ding@ osum c.edu). Y.-C Chung, is with Siemens Healthcar e, Inc., Colum bus, OH 43210 USA. (e-mail: yiucho.chung@siem ens.com ). > < 2 distribution function (PDF) of singul ar values of such data ma tr ices, and found that part of t he small singul ar values follows the distributi on of Marcenko-Pastur (MP) law 11 . One of the im portant result s from RMT i s that the empi rical PDF of eigenvalues of the random variable co-variance matri x follows the MP-law. In other words, by removing a proper num ber of eigenmodes from a data matrix, t he residual matrix behaves li ke a random mat rix. We propose a new method to identify the number of relevant ei genmodes of a data m atrix by maxim izing the similarities between the residual data m atrix and a random matrix. The kern el of thi s method is a param eter estim ation to maxim ize the statistical goodness-of-fit (GOF) 12 between the MP-law and a subset of the eige nvalues. We refer to th is approach as the RMT-fitting method. When a data matrix c ontains independent a nd identically-distributed (IID) noise, the RMT-fitting m ethod has only one parameter to estim ate, the numbe r of relevant eigenvalues. When non-IID noise appears, satisfactory GOF can be achieved by varying anot her parameter. In this study, we validate the RMT-fitting method using num erical simula tion. This technique is then dem onstrated in the analysis of real-ti me m agnetic resonance (MR) cardiac i mages acquired by parall el imagi ng techniques 13, 14 which introduce non-IID noise. We find that the RMT-fitting me thod successfully identifies the number of relevant eigenim ages in this medical imaging appl ication. II. T HEORY We focus on a data set that can be represented by a 2-D matrix, the fi rst dimensi on representing the tem poral sampling, and t he second dime nsion representing the spati al sampling, such as a m ovie taken from a CCD cam era. For an M × N data matrix A = S o + n (with no loss of g enerality, assuming M < N ), where matrix S o has a low-rank structure and entries of m atrix n are IID noise, the M × M temporal covariance m atrix (TCM ) is given by A T A / N . S o represents the underlying signal such as a noiseless dynam ic cardiac image series with M fram es, each frame containing N pi xels. If S o has r ( r < M ) independent signal m odes with large variances when comparing to the noise variance 15 , then the PDF of the smallest M - r TCM eigenvalues follows the MP-law 10 : )) )( ( , 0 max( 2 ) ( min max 2 λ λ λ λ λ πσ α λ − − = p (1) where N r M / ) ( − = α , ) 1 ( 2 min max, α σ λ ± = , and is the mean noise variance. Wh en no signal appears in matrix A , r = 0 in Eq.(1). Throughout the paper, all eigenval ues menti oned are of TC M, and sorted in a descending order. 2 σ When non-IID noise appears, the TCM eigenvalue PDF is unknown. However, we observed that the MP-law is still K. Huang is with The Ohio Sa te University, Columbus, OH 43210 USA (e-m ail: khuang@bmi.osu.edu). O. P. Simonetti is with The Ohio Sate University, Colum bus, OH 43210 USA (e-mail: Orlando.Simonetti@ osumc.edu). > < 3 applicable when a data matrix is corrupt ed by two types of non-IID noise, spatially correlated Gaussian noise and spatially var iant Gaussian noise. W idely used signal p ost-processing algorithms such as interpo lation or low-pass filterin g may cause spatial correlation of noise. The recently developed MRI fast imaging me thod called ‘parallel im aging’ can introduce spatially variant noise. We found that by optim izing two parameters, r and N of Eq. (1), the MP-law can fit reasonably well to th e PDF of the smallest ei genvalues of the TCM of a dy nami c MRI cardiac cine im age series acquired using paralle l imaging on a com mercial MRI scanner. The RMT-fitting method searches the maxim a of the sta tistical goodness-of-fit (GOF) in a 2-D parameter space (the r , N -plane). Parameter is the m ean of last M-r eigenvalues, which is a function of r . We use the Kolmogorov-Sm irnov (KS) test or the Cramer-von-Mises (C VM) test 2 σ 16 as the GOF function. We test the effectiv eness of the RMT-fitting method using num erical simulations, and then app ly it to in-vivo real-time MR cardiac cine images to identify the relev ant eigenimages. III. M ETHOD All the images for simulation and analysis were acquired in healthy volun teers. This study is approv ed by our institu tion’s Human Subjects Committee. All subjects gave written informed cons ent to participate in these stu dies. A. Numerical Simulation A segmented cardiac MR cine 17, 18 series consisting of 20 fram es, evenly distri buted over a single cardiac cycle, was acquired in a healthy volunteer. The in-pl ane spatial resolut ion, temporal resolut ion, and image m atrix were 1.4 mm, 47.7 m s, and 256 × 208 respectively ( M = 20, N = 256 × 208). λ max =2.65 × 10 5 and λ min = 72 are the largest and the smallest eigenvalues. A dynamic cardiac cine series over more than one cardiac cycle can be simulated by repeating the original series β times ( β >1). Regardless of β , the data matrix ( M = 20 × β , N = 256 × 208) has a fixed rank = 20. The first simulation was designed to explore the perform ance of the RMT-fitting met hod as a function of the additive noise variance. An image series with 4 repetition s ( β =4) was constructed. Three different types o f noise were added to it: (i) zero -mean Gaussian IID noise; (ii) spatial correlated Gaussian noise, e.g. the noise in (i) filtered by a 2D sp atial low-pass filtering (LPF), with a 2-D square box-car function keeping 25% of the Fourier space and (iii) spatially varying Ga ussian noise generated by multiplying uniform Gaussian noise by a spatially variant template formed by uncorrelated random num be rs following a uniform distribution between zero and one (the same tem plate was used in all 80 frames) . In each case, the noise variance was varied fr om λ min to λ max . We defined the critical noise variance V c as the noise variance at which th e RMT-fitting method find 19 relevant eigenimages. We also varied the LPF cuto ff to check the accuracy of the fitted independent pixel number. One thousand tests we re > < 4 performed to reduce the effect of random fluctuations. The intent of the second sim ulation was t o explore the effect of the num ber of cardiac cycles included in the im age series. We constructed 12 im age series with different β , corrupt ed with Gaussian IID noise, and then evaluat ed corresponding V c . The quantitative relation b etween β and V c was studied by linear regression, a nd com p ared to the theoretical result 15 . In order to check the consistency of the results, another four cardiac MR cine series acquired in four volunteers usi ng the sam e imaging param eters were processed through the same simul ation and compared t o the previous results. B. In-vivo real-time Cardiac MRI cine Images 16 free-breathing, dynamic cardiac MR cine series were acquired in six health y volunteers using a real-time SSFP sequence 17, 18 using TSENSE 19 parallel imaging technique with accel eration factor four. Each cine se ries consisted of 256 frames spanning multiple cardiac and respiratory cycles, w ith each fram e containing 192 × 144 pixels. Th e temporal resolution for each fram e wa s approximate ly 60 msec. Since bot h the cardiac moti on and the respiratory m otion are quasi-periodic, the im age data matrix shoul d be represented by the sum of a low-rank signal ma trix and a noi se matri x. The number of relevant eigeni mages was calcula ted using the RMT-fitting method for each im ages series. IV. R ESULTS Both numerical simulat ions and in-vivo experim ents demonstrat ed t hat the number of relevant eigenima ges of a real-time cardiac cine can be determined by the RM T-fitting m ethod. If we assume eigenvalues are uncorrelated random numbers, and use a significance level of 5% i n the KS-test, no statist ical significant devi ation from t he MP-law was observed throughout this st ud y. A. Numerical Simulation When the Gaussian IID noise was added, the critical noise variance V c was 66.0 × λ min , see Fig. 1a. The mean pixel number from fitting was systematically smaller than the actual value, but the difference was less than 5% on average, see Fig. 1b. When spatially co rrelated noise was added, th e V c was 168.4 × λ min (measured before applyi ng LPF). As expected, the number of independent pixels is close to 25% of the total , see Figure 2a. In the simulat ion of variable noise LPF cutoff, the RMT m ethod resulted in the accurate number of independent pixels (difference < 5%), see Figure 2b. 10 2 10 4 10 6 0 10 20 30 Noise Variance (a.u.) Relevant Eigenimage ab 10 2 10 4 10 6 94% 96% 98% 100% Noise Variance (a.u.) Pixel Number Ratio Fig. 1. The simulation results with Ga ussian additive noise. (a) The number o f relevant eigenimage vs. additive noise variance; (b) the fitted pixel number / actual vs. additive noise variance (the measurem ent SD < 5% for all data p oints). > < 5 In the spatially variant no ise scenario, the V c is 60× λ min (measured after multiplying the template). The number of relevant eigenimages and t he KS-test value are shown in Figure 3. Figure 4 shows the relation between V c and β . The theoretical relation is ( V c / λ min ) 2 = ( N / M )× β 15 . From our sim ulation, the relation is ( V c / λ min ) 2 = 0.42×( N / M )× β , i . e . the RMT-fitting method can tol erate a slightly lower noise level. 10 2 10 4 10 6 23% 24% 25% 26% Noise Variance (a.u.) Pixel Number Ratio a b 0 2 4 0 2 4 Theoretical ( × 10 4 ) Experimental ( × 10 4 ) Fig. 2. The fitted independe nt pixel number. (a) The independent pixel ratio vs. additive noise variance, (b) the fitted vs. the actual independent pixel numbe r at different LPF cutoffs. The dashed line indicates the function of Y = X. Additionally, the mean noise variance can be assessed from the mean of the eigenvalues corresponding to the noise-onl y eigenimages as m entioned in the Theory section. The maximum noise variance measurement error in any single simulation test was small: 0.30% for the Gaussian noise case, 3.0% fo r the spatially variant noise case, and 1.2% for the spatially co rrelated noise case. Even when the number of rele vant eigenimages is not accurate, the noise variance measurement is still accurate. Numerical si mulati ons using the additional four im age series showed nearly iden tical results. We evaluated the relativ e fluctuations of critical noise variances (scaled by λ min ) for accurate number of relevant ei genimages m easurements, pi xel number m eas urements (scaled by the pixel num ber of an image) in the spatially correlated noise scenario, and noi se variance measurements. They were le ss than 5%, 10%, and 4%, respectively. B. In-vivo real-time Cardiac MRI cine Images The average number of relevant ei genimages selected by the RMT-fitting method was 90 ± 11. No clear spatially coherence structures were visible in t he eigenimages dete rmined to contain only noise. An example of the eig envalue distrib ution and a fitted line according to Eq. (1) are shown in Figure 5a. Figure 5b shows the eigenvalue “scree” plot. There is no clear jump around the number of relevan t eigenimages selected by the RMT-fitting method. Som e unknown degree of spatial filtering is applied by the image reconstruction software to all images acquired on a co mm ercial MRI system . Therefore, the pixel num ber generated by RMT-fitting method is sm aller than image size. We may truncate all the noise-only eigenimage s and reconstruct eigenfiltered 10 2 10 4 10 6 0 10 20 30 Noise Variance (a.u.) Relevant Eigenimag e a b 10 2 10 4 10 6 0.024 0.026 0.028 0.03 Noise Variance (a.u.) KS−test Fig. 3. (a) The number of relevant eigenim ages vs. spatially correlated additive noise variance, (b) the KS-test value vs. spatially correlated additive noise i 5 10 15 20 0 0.5 1 1.5 2 2.5 x 10 4 β (V c / λ min ) 2 Fig. 4. The relation between critical noise variance V c and the repetition β . The linear regression (Y = cX) is re presented by the dashed line. > < 6 images using rel evant eigenim ages. The mean noise variance in the filtered images is 90/256 ≈ 35% of the corresponding original images. All the results shown were acquired using t he KS-test to determine GOF. When the sam e experiments were repeated using the CVM-test, the maximum relative difference between all results was within 5%. The CVM-test based RMT-fitting method has a smaller random fluctuation (~ 30% smaller in standard deviation). 50 100 150 200 250 0.00 0.02 0.04 0.06 0.08 0.10 Eigenvalue (a. u.) Probality Densit y a 0 64 128 192 25 6 10 2 10 4 10 6 10 8 Ordinal Numbe r Eigenvalue b Cutoff Fig. 5. The eigenvalue distribution and th e cutoff of re levant eigenimage of an in-vivo real-time cardiac MRI cine imag e series. Two parameters determ in ed by the KS-test are the following: the cutoff = 74 and the effective number o f p ixels = 10152, about 36.7% of the original. (a) The distr ibution of the last 238 eigenvalues and the prediction of the RM T (solid line). (b) The “scree” plot o f all 256 eigenvalues. V. D ISCUSSION The RMT-fitting method successfully identifies the num ber of relevant eigenima ges in numerical sim ulations with three different noise distributions. Even with an additive noise varian ce over 60 times larger than λ min , correct numbers of relevant eigenimages were found. The quantitative relation between critical no ise variance V c and repetition β is also verified by our numerical si mulation. Thi s relation reveals the minim um number of repe titions needed for any signal mode to be distinguished from random noise using PCA. It indicates t hat all inform ation modes i n a noisy cardiac imag e series can be recovered using PCA as long as enough cardi ac cy cles are included, which guarantees that we can trade the scan time for SNR in real-time cardiac MR cine im aging. Comparing to the theoretical prediction 15 , our method shows a lower noi se tolera nce (the noise standard devi ation ≈ 80% of the theory). Two reasons may cause this result: first, ther e exist sm all deviations from MP-law, because th e MP-law is the asymptotic eigenvalue distrib ution when M , N ; second, the sensitivity of the GOF we selected may be relatively low. ∞ → The in-vivo experiment shows that the MR T-fitting m ethod can indentify the relevant eigenimages, and reveal the intrinsic low-rank structure of the corresponding data m atrix corrupted by non-IID noise. As a by-product, a series of noise-only eigenimages can be identified. Th erefore, the noise variance of th e original image series can be accurately assessed. This is a new method to m easure the temporal noise in dynami c images, which may be important to objectively evaluate the quality of dynamic images with non-IID noise. Both the numeri cal simula tion and the in-vivo im aging experi m ents demonstrate that the MP -law can fit the eigenvalue distribution reasonably well in the ( r , N )-plane even when non-IID noise appears. Therefore, the MP-law is still a good > < 7 approximati on of the eigenvalue PDF when non-IID noise appears. When the noise spatial correlation is induced by the low-pass filtering, the number of independe nt variables no longer co rresponds to the number of pixels. Therefore, it is understandable t hat the pixel number N should be al lowed to vary when applying the MP-law i n this scenario. A strict m athemat ical proof is warranted in the future. VI. C ONCLUSION The RMT-fitting method proposed in this paper provides a new solu tion to identify relevant eigeni mages, and also a m ethod to distinguish signal from noise. We believe it can be applied as a ge neric m ethod to select relevant eigenmodes in PCA. Therefore , it is a very principled ap proach and will h ave impact on many important problems in related areas. A CKNOWLEDGMENT Yu Ding thanks Prof. Heping He and Dr . Yan Su for helpful di scussions. R EFERENCES 1 I. T. Jolliffe, Principal component analysis (Springer New York, 2002). 2 J. B. Kruskal and M. Wish, Multidimensional Scaling (Sage Publications, 1978). 3 A. Ng, M. Jordan, and Y. W eiss, (MIT Press, 2002) . 4 M. L. Mehta, Random Matrices (Academic Press, 2004). 5 A. V. Andreev, O. Agam, B. D. Simons, et al., Physical Review Letters 76 , 3947 (1996). 6 P. Šeba, Physical Review Letters 91 , 198104 (2003). 7 A. M. Tulino and S. Verdu, R andom Matrix Theory And Wireless Communications (Now Publishers Inc, 2004) . 8 V. Plerou, P. Gopikrishnan, B. Rosenow, et al., Physical Review Letters 83 , 1471 (1999) . 9 L. Laloux, P. Cizeau, J. P. Bouchaud, et al., Physical Review Letters 83 , 1467 (1999). 10 A. M. Sengupta and P. P. Mitra, Phys Rev E Stat Phys Plasmas Fluids Relat Inter discip Topics 60 , 3389 (1999). 11 A. Edelman, SI AM Journal on Matrix Analysis and Applications 9 , 543 ( 1988). 12 R. G. Easterling, Technom etrics 18 , 1 (1976) . 13 M. A. Griswold, P. M. Jakob, R. M . Heidemann, et al., Magnetic Resonance in Medicine 47 , 1202 (2002) . 14 K. P. Pruessmann, M. Weiger, M. B. Scheidegger, et al., Magn Reson Med 42 , 952 (1999). 15 D. C. Hoyle and M. Rattray, Phys Rev E Stat Nonlin Soft Matter Phys 69 , 026124 (2004). 16 F. James, Statistical Methods in Experimental Physics (W orld Scientific Pub Co Inc, 2006). 17 V. S. Lee, D. Resnick, J. M. Bundy , et al., Radiology 222 , 835 (2002). 18 J. C. Carr, O. Simonetti, J. Bundy, et al., Radiology 219 , 828 (2001) . 19 P. Kellman, F. H. Epstein, and E. R. McVeigh, M agn Reson Med 45 , 846 (2001). > < 8

Identifying Relevant Eigenimages - a Random Matrix Approach

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment