Iterative Aggregation Method for Solving Principal Component Analysis Problems
📝 Abstract
Motivated by the previously developed multilevel aggregation method for solving structural analysis problems, a novel two-level aggregation approach for the efficient iterative solution of Principal Component Analysis (PCA) problems is proposed. A coarse aggregation model of the original covariance matrix is used in the iterative solution of the eigenvalue problem by the power iteration method. The method is tested on several data sets consisting of a large number of text documents.
📄 Content
Vitaly Bulgakov (BULGAKOVV@YAHOO.COM)

Keywords: principal component analysis, clustering method, power iteration method, aggregation method, eigenvalue problem
- Introduction

This work was envisioned as an application of the multilevel aggregation method [1], developed by the author back in the 1990s, to PCA problems. The multilevel aggregation method was an extension of the well-known multigrid methods [2] from boundary value problems to general structural analysis problems, which brought it into the class of algebraic multigrid methods. The idea of the aggregation method was to use a naturally constructed coarse model of the original finite element approximation of a structure, which provides fast convergence of iterative methods for solving large algebraic systems of equations. One application of this method was the iterative solution of large eigenvalue problems arising in structural natural vibration and buckling analyses [3]. In these problems the sought set of lowest vibration modes can be thought of as the principal components of the structure's behavior. The obvious similarity with PCA was the turning point for seeking a proper way to create an aggregation model for data matrix approximation and to use it for the efficient solution of PCA problems.

In this study PCA [4] is applied to, and the method is tested on, text analysis problems. A tested data set consists of documents, each of which produces an N-dimensional vector, stored as a column of a data matrix, whose values are term frequencies. Our raw data comes in the form of text files from data sets such as medical abstracts and news groups. The purpose of PCA is to iteratively compute a set of the highest eigenvalues and corresponding eigenvectors of the covariance matrix. The covariance matrix is never formed explicitly; the main operation is multiplication of the large sparse data matrix, or its transpose, by a vector. A coarse aggregation model of the original covariance matrix is used in the iterative solution of the eigenvalue problem. The small-size approximation is expected to have leading eigenvalues and eigenvectors similar to those of the original covariance matrix. This allows fast convergence of subspace iterations at minimal additional computational cost. For numerical experiments we use the R language, which is rich in linear algebra, statistical and graphical packages.
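The matrix-free iteration described above — applying the covariance operator only through products with the sparse data matrix and its transpose — can be sketched as follows. This is a minimal NumPy version in Python rather than the paper's R; the function name, tolerance, and iteration cap are illustrative assumptions:

```python
import numpy as np

def power_iteration(X, num_iters=1000, tol=1e-12):
    """Leading eigenpair of A = X @ X.T without forming A explicitly.

    X is the (m, n) centered data matrix. Each product A @ v is computed
    as X @ (X.T @ v), which stays cheap when X is large and sparse.
    """
    rng = np.random.default_rng(0)
    v = rng.standard_normal(X.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(num_iters):
        w = X @ (X.T @ v)       # A @ v via two matrix-vector products
        lam_new = v @ w         # Rayleigh quotient estimate of lambda
        v = w / np.linalg.norm(w)
        if abs(lam_new - lam) < tol * max(abs(lam_new), 1.0):
            lam = lam_new
            break
        lam = lam_new
    return lam, v
```

The same factored product is what makes the aggregated operator cheap later on: only the storage format of the data matrix changes, not the iteration itself.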
- PCA problem formulation

PCA in multivariate statistics is widely used as an effective way to perform unsupervised dimension reduction. The essence of the method lies in using the Singular Value Decomposition (SVD), which provides the best low-rank approximation to the original data according to the Eckart-Young theorem [5].

(arXiv:1602.08800v1 [cs.NA], 29 Feb 2016)

Let n data points in m-dimensional space be contained in the data matrix, which is assumed already centered around the origin for computational stability:

$$X = (x_1, x_2, \ldots, x_n) \tag{1}$$

Then the covariance matrix is

$$A = XX^T \tag{2}$$

Let $(\lambda_k, \varphi_k)$ be an eigenpair of $A$, where the eigenvectors $\varphi_k$ define the principal directions.
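The relation between this eigenvalue formulation and the SVD can be checked numerically: for a centered $X$, the eigenvectors of $A = XX^T$ coincide with the left singular vectors of $X$, with $\sigma_k^2 = \lambda_k$. A small Python/NumPy sketch (the random data is purely illustrative):

```python
import numpy as np

# Columns of X are data points; each row (dimension) is centered, as the
# formulation assumes. Principal directions are eigenvectors of A = X X^T.
rng = np.random.default_rng(0)
raw = rng.standard_normal((4, 10))           # m = 4 dimensions, n = 10 points
X = raw - raw.mean(axis=1, keepdims=True)    # center around the origin

A = X @ X.T                                  # covariance matrix (2)
lam, phi = np.linalg.eigh(A)                 # eigenpairs, ascending order

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
assert np.allclose(sigma[::-1] ** 2, lam)    # sigma_k^2 equals lambda_k
```

This is why SVD of the data matrix and eigendecomposition of the covariance matrix are interchangeable routes to the principal components.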
- Aggregation model

In order to create an aggregation model we divide the entire set of data vectors $x_i$ into $n_0$ clusters using some similarity criterion, where $n_0 \ll n$. We will explain later how we do the clustering. We assume that all vectors within a cluster are similar, so a single representative of cluster $k$ is the average of all its vectors:

$$x^0_k = \frac{1}{\dim_k} \sum_{i \in \mathrm{cluster}_k} x_i \tag{3}$$

The transformation of matrix $X$ to $X_0$ is done using the matrix $R$, which we call the aggregator:

$$X_0 = XR \tag{4}$$

where $R[i,k] = 1/\dim_k$ if $i \in \mathrm{cluster}_k$ and $0$ otherwise. $X_0$ is of size $(m, n_0)$. The approximation $A_0$ of the covariance matrix $A$ is

$$A_0 = X_0 X_0^T = XRR^T X^T \tag{5}$$

Formally, matrix $A_0$ is of the same size as $A$ but has a much lower rank. We do not need to use form (5) for computations. For matrix-vector multiplication we instead use the sparse matrix $X_0$, which according to (3) is constructed by simple averaging of the vectors inside each cluster, and

$$A_0 v = X_0 X_0^T v \tag{6}$$

Therefore $A_0 v$ requires $O(mn_0)$ operations, far fewer than the $O(mn)$ operations required for $Av$. We also expect, and this is confirmed by numerical experiments, that iterative methods for the partial eigenvalue problem converge faster for $A_0$ than for $A$. There are quite a few clustering techniques known to be computationally efficient. Besides, since we need clustering only as an auxiliary procedure, we do not need highly accurate clustering results. In this study we use
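The construction of the aggregator $R$ and the factored product (6) can be sketched as follows. This is a Python/NumPy illustration; the cluster labels here are arbitrary stand-ins for whatever clustering procedure supplies them:

```python
import numpy as np

def build_aggregator(labels, n_clusters):
    """Aggregator R from (4): R[i, k] = 1/dim_k if point i is in cluster k."""
    R = np.zeros((len(labels), n_clusters))
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k)
        R[members, k] = 1.0 / len(members)   # average over cluster k
    return R

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 12))             # m = 6 dimensions, n = 12 points
labels = np.array([i % 3 for i in range(12)])  # 3 assumed clusters (n0 = 3)

R = build_aggregator(labels, 3)
X0 = X @ R                                   # aggregated data matrix, (m, n0)

# A0 @ v via the factored form (6), never forming A0 = X0 @ X0.T explicitly
v = rng.standard_normal(6)
assert np.allclose(X0 @ (X0.T @ v), (X0 @ X0.T) @ v)
```

Each column of `X0` is the cluster average from (3), so applying $A_0$ costs two thin matrix-vector products instead of one with the full data matrix.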