Imputation Estimators Partially Correct for Model Misspecification

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv source.

Inference problems with incomplete observations often aim at estimating population properties of unobserved quantities. One simple way to accomplish this estimation is to impute the unobserved quantities of interest at the individual level and then take an empirical average of the imputed values. We show that this simple imputation estimator can provide partial protection against model misspecification. We illustrate imputation estimators’ robustness to model specification on three examples: mixture model-based clustering, estimation of genotype frequencies in population genetics, and estimation of Markovian evolutionary distances. In the final example, using a representative model misspecification, we demonstrate that in non-degenerate cases, the imputation estimator dominates the plug-in estimate asymptotically. We conclude by outlining a Bayesian implementation of the imputation-based estimation.


💡 Research Summary

The paper investigates a simple yet powerful approach for estimating population quantities when observations are incomplete: impute the missing values at the individual level, compute their conditional expectations under a fitted model, and then average these imputed values. This “imputation estimator” is shown to enjoy partial robustness against model misspecification, meaning that even when the assumed statistical model deviates from the true data‑generating process, the estimator’s bias and mean‑squared error (MSE) are often smaller than those of the conventional plug‑in estimator that simply substitutes point estimates of the model parameters.
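The contrast between the two estimators can be sketched on a toy problem. The example below is an illustration constructed for this summary, not one of the paper's three examples: it assumes a two-component Gaussian mixture with known component means, a target quantity equal to the proportion of individuals in class 1, and a deliberately misspecified working model that fixes the class-1 weight at 0.5 when the true weight is 0.3. All names (`posterior_prob_class1`, `assumed_weight1`, and so on) are hypothetical.

```python
import numpy as np

def normal_pdf(y, mean, sd=1.0):
    """Density of a normal distribution with the given mean and sd."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def posterior_prob_class1(y, weight1, mean0, mean1):
    """P(X = 1 | Y = y) under a two-component, unit-variance Gaussian mixture."""
    num = weight1 * normal_pdf(y, mean1)
    den = num + (1.0 - weight1) * normal_pdf(y, mean0)
    return num / den

# Simulate from the "true" process (an assumption of this sketch):
# class 1 has probability 0.3; Y | X=0 ~ N(0, 1), Y | X=1 ~ N(4, 1).
rng = np.random.default_rng(0)
n = 5000
true_weight1 = 0.3
labels = rng.random(n) < true_weight1
y = np.where(labels, rng.normal(4.0, 1.0, n), rng.normal(0.0, 1.0, n))

# Misspecified working model: component means are right,
# but the class-1 weight is wrongly fixed at 0.5.
assumed_weight1 = 0.5

# Plug-in estimator: read the target directly off the model parameters.
plug_in = assumed_weight1

# Imputation estimator: impute E[g(X_i) | Y_i] for each individual
# under the working model, then average over individuals.
imputation = posterior_prob_class1(y, assumed_weight1, 0.0, 4.0).mean()

print(f"truth={true_weight1}, plug-in={plug_in}, imputation={imputation:.3f}")
```

Because each individual's imputed value is conditioned on that individual's observed data, the average is pulled toward the true proportion even though the working model's weight is wrong. The correction is only partial: the imputed probabilities are still computed under the misspecified weight, so some bias remains.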

The authors first formalize the problem. Let \(Y\) denote the observed data, \(X\) the unobserved quantities of interest, and \(g(X)\) a function whose population mean \(\mu = E[g(X)]\) is the target of estimation.

