Adaptive inference for the mean of a stochastic process in functional data


This paper proposes and analyzes fully data-driven methods for inference about the mean function of a stochastic process from a sample of independent trajectories of the process, observed at discrete time points and corrupted by additive random error. The proposed method uses thresholded least squares estimators relative to an approximating function basis. The variable threshold levels are estimated from the data, and the basis is chosen via cross-validation from a library of bases. The resulting estimates adapt to the unknown sparsity of the mean function relative to the selected approximating basis, in terms of both the mean squared error and the supremum norm. These results are based on novel oracle inequalities. In addition, uniform confidence bands for the mean function of the process are constructed. The bands also adapt to the unknown regularity of the mean function, are easy to compute, and do not require explicit estimation of the covariance operator of the process. The simulation study that complements the theoretical results shows that the new method performs very well in practice and is robust against large variations introduced by the random error terms.


💡 Research Summary

The paper addresses the fundamental problem of estimating the mean function of a stochastic process when only a finite number of independent sample paths are observed at discrete time points and each observation is contaminated by additive random noise. Traditional functional data methods such as smoothing splines or kernel regression either ignore the discrete nature of the data or fail to exploit possible sparsity of the mean function when expressed in a suitable basis. To overcome these limitations, the authors propose a fully data‑driven procedure that combines (i) a basis expansion of the unknown mean, (ii) a thresholded least‑squares estimator for the expansion coefficients, (iii) data‑based selection of the threshold levels, and (iv) cross‑validation for choosing the most appropriate basis from a library (e.g., Fourier, wavelet, B‑splines).
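Steps (i) through (iii) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The cosine basis, the function names `cosine_basis` and `thresholded_ls_mean`, the tuning constant `c`, and the threshold rule (a multiple of each coefficient's estimated standard error) are all assumptions made for illustration; the paper's own threshold levels and the cross-validated choice of basis in step (iv) are not reproduced here.

```python
import numpy as np


def cosine_basis(t, K):
    """Evaluate the first K cosine basis functions on [0, 1] at the points t."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    cols += [np.sqrt(2.0) * np.cos(np.pi * k * t) for k in range(1, K)]
    return np.column_stack(cols)  # shape (m, K)


def thresholded_ls_mean(Y, t, K, c=1.0):
    """Hard-thresholded least-squares estimate of the mean function.

    Y : (n, m) array, n noisy curves observed at the m design points t.
    K : number of basis functions (assumed fixed here, not cross-validated).
    c : hypothetical tuning constant scaling the thresholds.
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    Phi = cosine_basis(t, K)

    # Least-squares coefficients of every curve at once; B has shape (K, n).
    B, *_ = np.linalg.lstsq(Phi, Y.T, rcond=None)

    # Averaging the per-curve coefficients gives the least-squares fit
    # to the pointwise average curve, because least squares is linear in Y.
    beta_hat = B.mean(axis=1)

    # Coefficient-specific thresholds from the spread across curves
    # (an illustrative rule, not the one derived in the paper).
    se = B.std(axis=1, ddof=1) / np.sqrt(n)
    thresholds = c * se * np.sqrt(2.0 * np.log(K))

    # Hard thresholding: keep a coefficient only if it exceeds its threshold.
    beta_thr = np.where(np.abs(beta_hat) > thresholds, beta_hat, 0.0)
    return beta_thr, Phi @ beta_thr
```

Because each threshold scales with the variability of its own coefficient, coefficients that are small relative to their sampling error are set to zero, which is the mechanism by which a thresholded estimator can adapt to sparsity in the chosen basis.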

The model assumes observations \(Y_{ij}=X_i(t_j)+\varepsilon_{ij}\), where the \(X_i(t)\) are i.i.d. copies of the underlying process, the \(t_j\) are fixed design points, and the \(\varepsilon_{ij}\) are i.i.d. mean-zero noise terms with variance \(\sigma^2\). The mean function is \(\mu(t)=\mathbb{E}[X_i(t)]\).
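To make the observation model concrete, the following Python sketch simulates noisy, discretely observed trajectories and computes the pointwise sample average as a naive baseline estimate of the mean. The specific mean function, the random-amplitude process, the equally spaced design, and the function name `simulate_curves` are hypothetical choices for illustration only; none of them comes from the paper.

```python
import numpy as np


def simulate_curves(n, m, sigma, rng):
    """Simulate n trajectories observed with additive noise at m fixed points."""
    t = (np.arange(m) + 0.5) / m            # fixed design points in (0, 1)
    mu = np.sin(2.0 * np.pi * t)            # hypothetical mean function
    # Hypothetical process: the mean plus a random-amplitude cosine component.
    amplitude = rng.normal(size=(n, 1))
    X = mu + amplitude * np.cos(2.0 * np.pi * t)
    # Additive i.i.d. mean-zero measurement noise with variance sigma**2.
    eps = rng.normal(scale=sigma, size=(n, m))
    return t, mu, X + eps


rng = np.random.default_rng(1)
t, mu, Y = simulate_curves(n=2000, m=40, sigma=0.5, rng=rng)

# Naive baseline: average the n observations at each design point.
mu_bar = Y.mean(axis=0)
```

At each design point the baseline average is unbiased for the mean, but it does not borrow strength across neighbouring time points, which is the gap that a basis expansion with thresholding is meant to close.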

