Sample size and power calculations for causal inference of observational studies


This paper investigates the theoretical foundation and develops analytical formulas for sample size and power calculations for causal inference with observational data. By analyzing the variance of an inverse probability weighting estimator of the average treatment effect, we decompose the power calculation into three components: propensity score distribution, potential outcome distribution, and their correlation. We show that to determine the minimal sample size of an observational study, in addition to the standard inputs in the power calculation of randomized trials, it is sufficient to have two parameters, which quantify the strength of the confounder-treatment and the confounder-outcome association, respectively. For the former, we propose using the Bhattacharyya coefficient, which measures the covariate overlap and, together with the treatment proportion, leads to a uniquely identifiable and easily computable propensity score distribution. For the latter, we propose a sensitivity parameter bounded by the R-squared statistic of the regression of the outcome on covariates. Our procedure relies on a parametric propensity score model and a semiparametric restricted mean outcome model, but does not require distributional assumptions on the multivariate covariates. We develop an associated R package PSpower.
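To give a feel for the overlap measure mentioned above, here is a minimal sketch of the textbook Bhattacharyya coefficient, the integral of sqrt(p0 * p1), between the distributions of a single covariate in the control and treated groups. The equal-variance normal setup and the function names are illustrative assumptions; this is not the PSpower interface, and the paper's own parameterization in terms of the propensity score is not reproduced here.

```python
import math

def bhattacharyya_normal(mu0, mu1, sigma):
    """Closed-form Bhattacharyya coefficient of N(mu0, sigma^2) and N(mu1, sigma^2).

    Assumption for illustration: one covariate, normal in both groups, equal variance.
    """
    return math.exp(-((mu1 - mu0) ** 2) / (8.0 * sigma ** 2))

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bhattacharyya_numeric(mu0, mu1, sigma, lo=-20.0, hi=20.0, n=20001):
    """Trapezoid-rule approximation of the integral of sqrt(p0(x) * p1(x)) over [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # half weight at the endpoints
        total += weight * math.sqrt(normal_pdf(x, mu0, sigma) * normal_pdf(x, mu1, sigma))
    return total * h

if __name__ == "__main__":
    # Identical groups overlap perfectly; the coefficient shrinks as the means separate.
    for shift in (0.0, 0.5, 1.0, 2.0):
        print(shift, bhattacharyya_normal(0.0, shift, 1.0), bhattacharyya_numeric(0.0, shift, 1.0))
```

A coefficient of 1 corresponds to identical covariate distributions in the two groups, and values near 0 indicate almost no overlap.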


💡 Research Summary

This paper provides a rigorous theoretical foundation and practical formulas for sample‑size and power calculations in observational studies that aim to estimate causal effects. The authors focus on the Hájek inverse‑probability‑weighting (IPW) estimator of the average treatment effect (ATE) because its asymptotic variance has a relatively simple form that does not depend on a fully specified outcome model. The variance V can be written as

 V = E
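For readers unfamiliar with the estimator, the following is a minimal sketch of the Hájek (normalized) IPW estimator of the ATE on simulated data. The data-generating model, the coefficient 0.8, the use of the true propensity score instead of an estimated one, and all function names are assumptions made for illustration; none of them come from the paper or from the PSpower package.

```python
import numpy as np

def hajek_ipw_ate(y, z, e):
    """Hajek IPW estimate of the ATE.

    y: outcomes, z: binary treatment indicators, e: propensity scores P(Z=1 | X).
    Each arm's weighted outcome sum is divided by the sum of its weights.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    e = np.asarray(e, dtype=float)
    w1 = z / e                    # weights for treated units
    w0 = (1.0 - z) / (1.0 - e)    # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def simulate(n=200_000, tau=1.0, seed=0):
    """Hypothetical confounded data: X drives both treatment and outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = 1.0 / (1.0 + np.exp(-0.8 * x))       # logistic propensity score (assumed)
    z = rng.binomial(1, e)                   # treatment assignment
    y = tau * z + x + rng.normal(size=n)     # constant treatment effect tau
    return y, z, e

if __name__ == "__main__":
    y, z, e = simulate()
    naive = y[z == 1].mean() - y[z == 0].mean()   # ignores confounding by X
    print("naive difference in means:", naive)
    print("Hajek IPW estimate:", hajek_ipw_ate(y, z, e))
```

In this simulation the unweighted difference in means is pulled away from the true effect by the confounder, while the weighted estimate should land close to `tau`.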

