A geometric approach to maximum likelihood estimation of the functional principal components from sparse longitudinal data

Reading time: 5 minutes

📝 Original Info

  • Title: A geometric approach to maximum likelihood estimation of the functional principal components from sparse longitudinal data
  • ArXiv ID: 0710.5343
  • Date: 2007-10-30
  • Authors: **Not provided (author information was not included in the source text.)**

📝 Abstract

In this paper, we consider the problem of estimating the eigenvalues and eigenfunctions of the covariance kernel (i.e., the functional principal components) from sparse and irregularly observed longitudinal data. We approach this problem through a maximum likelihood method assuming that the covariance kernel is smooth and finite dimensional. We exploit the smoothness of the eigenfunctions to reduce dimensionality by restricting them to a lower dimensional space of smooth functions. The estimation scheme is developed based on a Newton-Raphson procedure using the fact that the basis coefficients representing the eigenfunctions lie on a Stiefel manifold. We also address the selection of the right number of basis functions, as well as that of the dimension of the covariance kernel by a second order approximation to the leave-one-curve-out cross-validation score that is computationally very efficient. The effectiveness of our procedure is demonstrated by simulation studies and an application to a CD4 counts data set. In the simulation studies, our method performs well on both estimation and model selection. It also outperforms two existing approaches: one based on a local polynomial smoothing of the empirical covariances, and another using an EM algorithm.

💡 Deep Analysis

📄 Full Content


In recent years there have been numerous works on data that may be considered as noisy curves. When the individual observations can be regarded as measurements on an interval, the data thus obtained can be classified as functional data. The functional data analysis viewpoint is becoming increasingly popular for data arising in various fields, such as longitudinal data analysis, chemometrics, and econometrics [Ferraty and Vieu (2006)]. Depending on how the individual curves are measured, one can think of two different scenarios: (i) when the individual curves are measured on a dense grid; and (ii) when the measurements are observed on an irregular, and typically sparse, set of points on an interval. The first situation usually arises when the data are recorded by some automated instrument, e.g. in chemometrics, where the curves represent the spectra of certain chemical substances. The second scenario is more typical in longitudinal studies, where the individual curves could represent the level of concentration of some substance, and the measurements on the subjects may be taken only at irregular time points.

In these settings, when the goal of analysis is data compression, model building, or studying covariate effects, one may want to extract information about the mean, variability, correlation structure, etc. In the first scenario, i.e., data on a regular grid, as long as the individual curves are smooth, the measurement noise level is low, and the grid is dense enough, one can essentially treat the data as lying on a continuum and employ techniques similar to those used in classical multivariate analysis.
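For concreteness, this dense-grid route can be sketched as follows: simulate smooth curves on a regular grid (the two-sinusoid model and all constants here are illustrative assumptions, not from the paper), then estimate the functional principal components by eigendecomposing the empirical covariance matrix, exactly as in classical multivariate PCA, up to a grid-spacing rescaling that turns matrix eigenvalues and eigenvectors into approximations of the kernel's eigenvalues and L²-orthonormal eigenfunctions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 101                      # n curves, observed on a dense grid of m points
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]

# Hypothetical smooth curves: random combinations of two sinusoids plus small noise
scores = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
Y = scores @ basis + 0.05 * rng.normal(size=(n, m))

# Classical multivariate treatment: eigendecompose the empirical covariance matrix
Yc = Y - Y.mean(axis=0)
C = (Yc.T @ Yc) / n                  # m x m empirical covariance on the grid
evals, evecs = np.linalg.eigh(C)
evals, evecs = evals[::-1], evecs[:, ::-1]    # sort in descending order

eigenvalues = evals * dt             # rescale: matrix eigenvalues -> kernel eigenvalues
eigenfunctions = evecs / np.sqrt(dt) # rescale: unit vectors -> L2[0,1]-orthonormal functions

print(np.round(eigenvalues[:3], 3))
```

With these illustrative constants the leading kernel eigenvalues are near 2 and 0.5 (score variances times the L² norms of the sinusoids), and everything past the second component is dominated by the small measurement noise.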

However, the irregular nature of data in the second scenario, and the associated measurement noise require a different treatment.

The main goal of this paper is the estimation of the functional principal components from sparse, irregularly observed functional data (scenario (ii)). The eigenfunctions provide a convenient basis for representing functional data, and hence are very useful in problems related to model building and prediction for functional data [see e.g. Cardot, Ferraty and Sarda (1999), Hall and Horowitz (2007), Cai and Hall (2006)]. Ramsay and Silverman (2005) and Ferraty and Vieu (2006) give extensive surveys of the applications of functional principal components analysis (FPCA).

The focus throughout this paper is thus on the estimation of the covariance kernel of the underlying process. A covariance is a positive semidefinite operator, and the space of covariance operators is a nonlinear manifold. Thus, from a statistical as well as an aesthetic point of view, it is important that any estimator of the covariance also be positive semidefinite. Moreover, Smith (2005) gives a compelling argument in favor of utilizing the intrinsic geometry of the parameter space in the context of estimating a covariance matrix in a multivariate Gaussian setting. He obtains Cramér-Rao bounds for the risk that are described in terms of the intrinsic gradient and Hessian of the loglikelihood function. This work brings out important features of the estimators that are not obtained through the usual Euclidean viewpoint. It also provides a strong motivation for a likelihood-based approach that respects the intrinsic geometry of the parameter space. In this paper, we adopt a restricted maximum likelihood approach and explicitly utilize the intrinsic geometry of the parameter space when fitting the maximum likelihood estimator.

We now give an outline of the model for the sparse functional data. Suppose that we observe n independent realizations of an L²-stochastic process {X(t) : t ∈ [0, 1]} at a sequence of points Tᵢⱼ on the interval [0, 1] (or, more generally, on an interval [a, b]), with additive measurement noise. That is, the observed data {Yᵢⱼ : 1 ≤ j ≤ mᵢ; 1 ≤ i ≤ n} can be modeled as:

Yᵢⱼ = Xᵢ(Tᵢⱼ) + σ εᵢⱼ,  1 ≤ j ≤ mᵢ, 1 ≤ i ≤ n,

where {εᵢⱼ} are i.i.d. with mean 0 and variance 1. Since X(t) is an L²-stochastic process, by Mercer's theorem [cf. Ash (1972)] there exists a positive semi-definite kernel C(•, •) such that Cov(X(s), X(t)) = C(s, t), and each Xᵢ(t) has the following a.s. representation in terms of the eigenfunctions of the kernel C(•, •):

Xᵢ(t) = µ(t) + Σ_{ν≥1} √λ_ν ξ_{iν} ψ_ν(t),

where µ(•) = E(X(•)) is the mean function; λ₁ ≥ λ₂ ≥ … ≥ 0 are the eigenvalues of C(•, •); and ψ_ν(•) are the corresponding orthonormal eigenfunctions. The coefficients ξ_{iν} in this representation are uncorrelated random variables with mean 0 and variance 1.
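A small simulation makes the sparse-observation model concrete: each subject contributes only a handful of irregularly spaced measurements of its own random curve, corrupted by additive noise. The rank-2 kernel, the per-subject grid sizes, and the noise scale below are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 50, 0.2                                # subjects and noise scale (illustrative)

data = []
for i in range(n):
    m_i = rng.integers(2, 6)                      # sparse: 2 to 5 observations per subject
    T = np.sort(rng.uniform(0.0, 1.0, size=m_i))  # irregular observation times T_ij
    xi = rng.normal(size=2)                       # uncorrelated scores, mean 0, variance 1
    # X_i(t) = mu(t) + sum_nu sqrt(lambda_nu) xi_nu psi_nu(t), here with mu = 0,
    # eigenvalues (2, 0.5), and orthonormal eigenfunctions sqrt(2) sin / sqrt(2) cos
    X = (np.sqrt(2.0) * xi[0] * np.sqrt(2.0) * np.sin(2 * np.pi * T)
         + np.sqrt(0.5) * xi[1] * np.sqrt(2.0) * np.cos(2 * np.pi * T))
    Y = X + sigma * rng.normal(size=m_i)          # Y_ij = X_i(T_ij) + sigma * eps_ij
    data.append((T, Y))

sizes = [len(T) for T, _ in data]
print(min(sizes), max(sizes))                     # every subject has between 2 and 5 points
```

Data of exactly this shape (subject-specific time vectors of varying length, with no common grid) is what rules out the classical multivariate treatment and motivates the restricted maximum likelihood approach of the paper.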
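The Stiefel-manifold constraint mentioned earlier — the matrix B of basis coefficients representing the eigenfunctions must have orthonormal columns, BᵀB = I — can be illustrated with a QR-based retraction, one standard way of mapping an unconstrained update back onto the manifold. The random step below is a stand-in for an optimization step, not the paper's actual Newton-Raphson update.

```python
import numpy as np

rng = np.random.default_rng(2)
M, r = 10, 3          # M basis functions, r retained principal components

def retract(B):
    """Map an arbitrary full-rank M x r matrix onto the Stiefel manifold via QR."""
    Q, R = np.linalg.qr(B)
    # Fix column signs (diagonal of R positive) so the factorization is unique
    return Q * np.sign(np.diag(R))

# Start from a point on the manifold, take a small unconstrained step,
# then retract back so the orthonormality constraint B^T B = I still holds.
B = retract(rng.normal(size=(M, r)))
step = 0.1 * rng.normal(size=(M, r))   # stand-in for a Newton-Raphson step
B_new = retract(B + step)

print(np.allclose(B_new.T @ B_new, np.eye(r)))   # True
```

Working directly on the manifold in this way is what guarantees that the fitted eigenfunctions stay orthonormal, and hence that the estimated covariance kernel stays positive semidefinite, at every iteration.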

Reference

This content is AI-processed based on open access ArXiv data.
