Doubly robust estimation with functional outcomes missing at random

We present and study semi-parametric estimators for the mean of functional outcomes in situations where some of these outcomes are missing and covariate information is available on all units. Assuming that the missingness mechanism depends only on the covariates (missing at random assumption), we present two estimators for the functional mean parameter, using working models for the functional outcome given the covariates, and the probability of missingness given the covariates. We contribute by establishing that both these estimators have Gaussian processes as limiting distributions and explicitly give their covariance functions. One of the estimators is double robust in the sense that the limiting distribution holds whenever at least one of the nuisance models is correctly specified. These results allow us to present simultaneous confidence bands for the mean function with asymptotically guaranteed coverage. A Monte Carlo study shows the finite sample properties of the proposed functional estimators and their associated simultaneous inference. The use of the method is illustrated in an application where the mean of counterfactual outcomes is targeted.


💡 Research Summary

This paper addresses the problem of estimating the mean function μ(t) of a functional outcome Y(t) when the entire trajectory of Y is missing for a subset of subjects, while a set of covariates X is fully observed for everyone. Under the standard Missing‑At‑Random (MAR) assumption—i.e., the missingness indicator Z depends only on X—the authors develop two semiparametric estimators: an outcome‑regression (OR) estimator and a doubly‑robust (DR) estimator.

The functional outcome is modeled as a linear functional regression
 Y_i(t) = X_iᵀβ(t) + ε_i(t),
with β(t) a vector of unknown coefficient functions and ε_i(t) a zero‑mean stochastic process with covariance σ_ε(s,t). The missingness probability is modeled by a logistic regression π_i = τ(X_iᵀγ), τ(s) = (1+e^{‑s})^{‑1}.
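
To make the two working models concrete, here is a minimal simulation sketch on a discretized grid. Everything numeric in it is an assumption for illustration: the grid `t`, the coefficient functions in `beta`, the error structure, and the logistic coefficients `gamma` do not come from the paper. It also assumes that Z_i = 1 marks an observed curve, so that π_i is the probability of observing Y_i; the summary does not state which coding the authors use.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, T = 500, 3, 50            # units, covariates (incl. intercept), grid points
t = np.linspace(0.0, 1.0, T)    # common evaluation grid (hypothetical)

# Covariates with an intercept column; fully observed for every unit.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Hypothetical coefficient functions beta(t), one row per covariate.
beta = np.vstack([np.sin(2 * np.pi * t), t, 1.0 - t])       # shape (p, T)

# Zero-mean error process: a random level shift plus white noise on the grid.
eps = 0.5 * rng.normal(size=(n, 1)) + 0.2 * rng.normal(size=(n, T))

# Linear functional regression: Y_i(t) = X_i^T beta(t) + eps_i(t).
Y = X @ beta + eps                                           # shape (n, T)


def tau(s):
    """Logistic link tau(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + np.exp(-s))


gamma = np.array([0.5, 1.0, -1.0])   # hypothetical logistic coefficients
pi = tau(X @ gamma)                  # assumed: probability that curve i is observed
Z = rng.binomial(1, pi)              # assumed coding: Z_i = 1 means observed

# Entire trajectories are missing when Z_i = 0; missingness depends on X only.
Y_obs = np.where(Z[:, None] == 1, Y, np.nan)
```

Because `Z` is drawn from `pi`, which is a function of `X` alone, the simulated data satisfy the MAR assumption by construction.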

The OR estimator simply fits β̂(t) by least squares using only the observed pairs (X_i, Y_i) and then averages the fitted values across all subjects:
 μ̂_OR(t) = n⁻¹∑_{i=1}ⁿ X_iᵀβ̂(t).
If the outcome model is correctly specified, μ̂_OR is √n‑consistent and √n(μ̂_OR − μ) converges to a Gaussian process.
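
The OR estimator can be sketched on a common evaluation grid as below. This is an illustrative sketch rather than the authors' implementation: the function name `or_estimator`, the boolean mask `observed`, and the toy data are hypothetical, and curves are assumed to be sampled at the same T grid points so that least squares can be solved for all grid points in one call.

```python
import numpy as np


def or_estimator(X, Y, observed):
    """Outcome-regression estimate of the mean function on a grid.

    X        : (n, p) covariates, available for all units
    Y        : (n, T) curves on a common grid; rows where observed is False are ignored
    observed : (n,) boolean mask, True where the curve is available
    """
    X_obs, Y_obs = X[observed], Y[observed]
    # Pointwise least squares: a single solve covers all T grid points.
    beta_hat, *_ = np.linalg.lstsq(X_obs, Y_obs, rcond=None)   # shape (p, T)
    # Average the fitted values X_i^T beta_hat(t) over ALL n units,
    # including those whose curves are missing.
    return (X @ beta_hat).mean(axis=0)


# Toy check with noiseless curves, so beta_hat recovers beta up to rounding.
rng = np.random.default_rng(1)
n, T = 200, 20
t = np.linspace(0.0, 1.0, T)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.vstack([np.cos(t), t ** 2])        # hypothetical coefficient functions
Y = X @ beta
observed = rng.random(n) < 0.6               # hypothetical observation pattern
Y[~observed] = np.nan                        # whole curves missing

mu_hat = or_estimator(X, Y, observed)        # shape (T,)
```

The fit uses only complete curves, while the final average runs over every unit's covariates; that second step is what lets the estimator target the mean of the full population rather than of the observed subset.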

The DR estimator augments the OR estimator with an inverse‑probability‑weighting correction:
 μ̂_DR(t) = n⁻¹∑_{i=1}ⁿ