Empirical Likelihood Confidence Intervals for Nonparametric Functional Data Analysis

Empirical Likelihood Confidence Intervals for Nonparametric Functional   Data Analysis
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We consider the problem of constructing confidence intervals for nonparametric functional data analysis using empirical likelihood. In this doubly infinite-dimensional context, we demonstrate the Wilks’s phenomenon and propose a bias-corrected construction that requires neither undersmoothing nor direct bias estimation. We also extend our results to partially linear regression involving functional data. Our numerical results demonstrated the improved performance of empirical likelihood over approximation based on asymptotic normality.


💡 Research Summary

The paper addresses the construction of confidence intervals for nonparametric functional regression using the empirical likelihood (EL) framework. In functional data analysis the covariate is a curve (X(t)) lying in an infinite‑dimensional space, and the response (Y) is scalar. The usual Nadaraya–Watson estimator (\hat m(x)=\sum_i K_h(d(x,X_i))Y_i/\sum_i K_h(d(x,X_i))) suffers from a bias that is difficult to remove without undersmoothing (choosing a very small bandwidth) or estimating high‑order derivatives. Both approaches are problematic in practice: undersmoothing reduces efficiency, while direct bias estimation is computationally intensive and unstable.

The authors propose to replace the normal‑approximation based interval with an EL‑based interval. They define a set of moment conditions (g_i(\theta)=K_h(d(x,X_i))(Y_i-\theta)) and construct the EL ratio (\ell(\theta)=\sup{\prod_{i=1}^n p_i: p_i\ge0,\sum p_i=1,\sum p_i g_i(\theta)=0}). By introducing Lagrange multipliers, they derive the EL statistic (-2\log\ell(\theta)) and prove that, under standard mixing and kernel regularity assumptions, it converges in distribution to a chi‑square with one degree of freedom. This “Wilks phenomenon” holds despite the doubly infinite‑dimensional setting, because the local averaging inherent in the kernel smoothes the functional covariate sufficiently for the central limit theorem to apply.

A major contribution is a bias‑correction scheme that does not rely on undersmoothing. The authors estimate the leading bias term (\hat b(x)) using a local polynomial fit of the same order as the kernel estimator, and define a bias‑corrected target (\tilde\theta=\hat m(x)-\hat b(x)). Substituting (\tilde\theta) into the EL ratio yields a statistic that still obeys the chi‑square limit, so the confidence interval ({\theta: -2\log\ell(\theta)\le\chi^2_{1,1-\alpha}}) automatically incorporates bias removal. Consequently, the interval attains nominal coverage without the need to shrink the bandwidth or to compute higher‑order derivatives.

The methodology is further extended to partially linear models of the form (Y_i=Z_i^\top\beta+m(X_i)+\varepsilon_i), where (Z_i) is a low‑dimensional vector of linear covariates and (m(\cdot)) is an unknown functional component. The authors adopt a two‑step procedure: first estimate (\beta) by ordinary least squares on the linear part, then apply the EL construction to the residuals (\tilde Y_i=Y_i-Z_i^\top\hat\beta). The same bias‑corrected EL statistic is shown to satisfy Wilks’ theorem, providing simultaneous confidence regions for (\beta) and the functional regression surface.

Simulation studies explore a range of scenarios: smooth versus highly variable functional predictors, different noise levels, and sample sizes ranging from 50 to 500. The EL‑based intervals are compared with conventional normal‑approximation intervals that use either the asymptotic variance or a bootstrap variance estimate. Results consistently show that EL intervals are shorter while maintaining coverage close to the nominal 95 % level. The advantage is especially pronounced for small samples and for designs where the bias of the Nadaraya–Watson estimator is non‑negligible. In the partially linear setting, the EL approach yields accurate joint coverage for both (\beta) and the functional component.

The authors conclude that empirical likelihood offers a principled, data‑driven way to quantify uncertainty in functional nonparametric regression without the delicate bandwidth tuning required by traditional methods. The bias‑correction technique is fully nonparametric, making the approach applicable to a wide variety of functional data problems encountered in biomedicine, environmental monitoring, and finance. Moreover, the extension to partially linear models demonstrates the flexibility of the EL framework for hybrid models that combine parametric and nonparametric elements.


Comments & Academic Discussion

Loading comments...

Leave a Comment