Causal Inference with MNAR Self-Masking Confounders: A Stratified Delta-Imputed Propensity Estimation Method


In observational studies, causal inference becomes difficult when confounders are missing-not-at-random (MNAR), particularly when the missingness depends on the confounder’s own unreported value (self-masking). Existing methods for handling MNAR confounders often rely on strong, unverifiable assumptions, leading to biased estimates. We propose a simple approach, the Stratified Delta-Imputed Propensity Estimator (SDIPE), for settings with self-masking confounders. SDIPE first stratifies the data into observed and missing groups and imputes the missing confounder values via delta-adjusted multiple imputation. Then, within each group, the average treatment effect (ATE) is estimated using stabilized inverse probability weights. The final ATE is obtained by combining the group-specific estimates, weighted by the groups' respective proportions in the sample. A simulation study shows that SDIPE achieves low bias and near-nominal coverage (94-96%) across varying missingness rates, sample sizes, and treatment prevalence. In contrast, conventional sensitivity-based multiple imputation exhibits substantial bias and poor coverage (18-89%). Additionally, SDIPE is robust to the choice of the delta parameter. Applied to NHANES 2017-2018 data, SDIPE estimates that married individuals have a 1.19-point lower depression score than unmarried individuals (95% CI: -1.76, -0.64), adjusting for MNAR income data. SDIPE provides a practical and robust approach for causal inference with self-masking MNAR confounders, offering improved performance over existing methods without requiring restrictive assumptions about the missingness mechanism.


💡 Research Summary

The paper tackles a pervasive problem in observational causal inference: missing‑not‑at‑random (MNAR) confounders whose missingness depends on the unobserved value of the confounder itself, a situation termed “self‑masking.” Traditional causal methods (g‑formula, propensity‑score weighting, doubly robust estimators) assume all confounders are fully observed; when this assumption fails, bias can be severe. Existing MNAR approaches either require strong, unverifiable assumptions (e.g., missingness independent of outcome or treatment) or are limited to discrete covariates and outcomes, leaving self‑masking scenarios largely unaddressed.

The authors propose the Stratified Delta‑Imputed Propensity Estimator (SDIPE). The method proceeds in four steps: (1) split the data into two strata, one in which the partially observed confounder Z is observed and one in which Z is missing; (2) impute the missing Z values using multiple imputation (MI) with a location‑shift (Δ) adjustment to relax the MAR assumption; (3) within each stratum, estimate the average treatment effect (ATE) via stabilized inverse‑probability weighting (IPW), which mitigates extreme weights; (4) combine the stratum‑specific ATEs, weighted by the empirical proportions of the two strata, to obtain a single overall estimate.
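The four steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`fit_logistic`, `stabilized_ipw_ate`, `sdipe_sketch`), the linear imputation model that regresses Z on the covariate, treatment, and outcome, the use of fixed imputation coefficients, the weight clipping, and the simulated data are all hypothetical choices. Confidence intervals (bootstrap in the paper) are omitted.

```python
import numpy as np

def _design(X):
    """Prepend an intercept column to a covariate matrix."""
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, t, n_iter=25):
    """Logistic regression by Newton-Raphson (tiny ridge for numerical stability)."""
    Xd = _design(X)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None]) + 1e-8 * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, Xd.T @ (t - p))
    return beta

def stabilized_ipw_ate(X, t, y):
    """ATE within one stratum from stabilized inverse-probability weights."""
    ps = 1.0 / (1.0 + np.exp(-_design(X) @ fit_logistic(X, t)))
    ps = np.clip(ps, 1e-3, 1.0 - 1e-3)        # assumption: clip to avoid division by ~0
    p_treat = t.mean()                         # marginal treatment probability (numerator)
    w = np.where(t == 1, p_treat / ps, (1.0 - p_treat) / (1.0 - ps))
    mu1 = np.sum(w * t * y) / np.sum(w * t)              # weighted mean outcome, treated
    mu0 = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))  # weighted mean outcome, untreated
    return mu1 - mu0

def sdipe_sketch(z, x, t, y, delta, n_imp=10, seed=0):
    """Stratify, delta-impute, estimate per-stratum ATEs, and combine (point estimate only)."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(z)                         # Step 1: strata defined by missingness of Z
    obs = ~miss
    # Step 2: imputation model Z ~ 1 + x + t + y fit on observed cases (assumed form)
    A_obs = _design(np.column_stack([x[obs], t[obs], y[obs]]))
    coef, *_ = np.linalg.lstsq(A_obs, z[obs], rcond=None)
    resid_sd = np.std(z[obs] - A_obs @ coef, ddof=A_obs.shape[1])
    A_mis = _design(np.column_stack([x[miss], t[miss], y[miss]]))
    # Step 3a: ATE in the observed stratum
    ate_obs = stabilized_ipw_ate(np.column_stack([z[obs], x[obs]]), t[obs], y[obs])
    # Step 3b: ATE in the missing stratum, averaged over delta-shifted imputations
    ates = []
    for _ in range(n_imp):
        z_imp = A_mis @ coef + rng.normal(0.0, resid_sd, miss.sum()) + delta
        ates.append(stabilized_ipw_ate(np.column_stack([z_imp, x[miss]]), t[miss], y[miss]))
    ate_mis = float(np.mean(ates))
    # Step 4: combine, weighting each stratum by its sample proportion
    p_mis = miss.mean()
    return (1.0 - p_mis) * ate_obs + p_mis * ate_mis

# Simulated example (made-up data): true ATE is 2.0, Z is self-masked
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
z_full = x + rng.normal(size=n)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * z_full)))
y = 2.0 * t + z_full + x + rng.normal(size=n)
masked = rng.random(n) < 1.0 / (1.0 + np.exp(-(z_full - 1.0)))  # larger Z, more missing
z = np.where(masked, np.nan, z_full)
est = sdipe_sketch(z, x, t, y, delta=0.5)
print(f"combined ATE estimate: {est:.2f}")
```

In this particular sketch the shift is constant within the missing stratum, so the propensity model's intercept absorbs it and the point estimate does not change with `delta`. Whether the paper's implementation behaves identically is not something this summary states.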

The Δ parameter is a sensitivity‑analysis device: it adds a constant shift to the predicted mean of Z for the missing cases, allowing the analyst to explore how results change under plausible departures from MAR. The authors demonstrate that the estimator is robust to a wide range of Δ values; the bias remains negligible even when Δ varies substantially.
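The location shift itself is simple to illustrate. The sketch below assumes a linear-regression imputation model with normal residual noise, which the summary does not specify; `delta_adjusted_impute` and the toy data are hypothetical. It fits the model on observed cases, draws imputations for the missing cases, and adds Δ to each draw.

```python
import numpy as np

def delta_adjusted_impute(z, predictors, delta, rng):
    """Fill NaNs in z from a linear model fit on observed cases, then shift by delta."""
    miss = np.isnan(z)
    A = np.column_stack([np.ones(len(z)), predictors])
    # Fit the MAR-style imputation model on the observed cases only
    coef, *_ = np.linalg.lstsq(A[~miss], z[~miss], rcond=None)
    resid = z[~miss] - A[~miss] @ coef
    sd = np.sqrt(np.sum(resid ** 2) / (np.sum(~miss) - A.shape[1]))
    z_out = z.copy()
    # Predicted mean + random draw + constant shift delta (the departure from MAR)
    z_out[miss] = A[miss] @ coef + rng.normal(0.0, sd, miss.sum()) + delta
    return z_out

# Toy data (made up): one predictor, roughly 30% of z set to missing
rng = np.random.default_rng(0)
pred = rng.normal(size=200)
z = pred + rng.normal(size=200)
miss = rng.random(200) < 0.3
z[miss] = np.nan

# Sensitivity sweep: the same random draws, shifted by different deltas
for delta in (-1.0, 0.0, 1.0):
    z_imp = delta_adjusted_impute(z, pred, delta, np.random.default_rng(42))
    print(delta, round(float(z_imp[miss].mean()), 3))
```

Reusing the same seed across the sweep isolates the effect of Δ: the imputed values differ only by the constant shift, while observed values are left untouched.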

Simulation studies vary three design factors: treatment prevalence (20 % vs 40 %), sample size (500 vs 1000), and missingness rate (10 %, 30 %, 50 %). Each configuration uses 500 Monte‑Carlo repetitions, 10 imputations, and bootstrap confidence intervals. Results show that SDIPE consistently yields relative bias below 0.5 % and coverage of the nominal 95 % confidence interval between 94 % and 96 %, regardless of missingness level or sample size. In contrast, a conventional sensitivity‑based MI approach exhibits relative bias ranging from 5 % to 16 % and coverage dropping as low as 18 % under high missingness. Varying Δ does not materially affect the bias of SDIPE, confirming the method’s stability.
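The two performance metrics reported here can be computed from Monte Carlo output as follows. This is a minimal sketch: the function names are hypothetical, relative bias is taken as the signed percentage deviation of the mean estimate from the truth (the summary does not say whether the paper reports signed or absolute values), and the numbers in the example are made up rather than taken from the paper.

```python
import numpy as np

def relative_bias_pct(estimates, truth):
    """Signed relative bias in percent: 100 * (mean estimate - truth) / truth."""
    return 100.0 * (np.mean(estimates) - truth) / truth

def coverage_pct(ci_lower, ci_upper, truth):
    """Percentage of repetitions whose interval contains the true value."""
    lo, hi = np.asarray(ci_lower), np.asarray(ci_upper)
    return 100.0 * np.mean((lo <= truth) & (truth <= hi))

# Four made-up repetitions with a true ATE of 2.0
estimates = [1.9, 2.1, 2.0, 2.2]
ci_lower = [1.5, 2.1, 1.8, 1.0]
ci_upper = [2.5, 2.6, 1.9, 3.0]

bias = relative_bias_pct(estimates, 2.0)
cover = coverage_pct(ci_lower, ci_upper, 2.0)
print(f"relative bias: {bias:.1f}%, coverage: {cover:.0f}%")
```

In a full simulation, each repetition would contribute one point estimate and one bootstrap interval, and these functions would be applied across all repetitions of a given configuration.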

The method is applied to the 2017‑2018 NHANES dataset to estimate the causal effect of marital status on depressive symptoms (PHQ‑9 score) among adults aged 30‑55. Income‑to‑poverty ratio, age, education, and gender are used as covariates; the income ratio suffers from self‑masking missingness. Using SDIPE, the authors estimate that being married reduces the PHQ‑9 score by 1.19 points (95 % CI: –1.76 to –0.64). This estimate is more credible than those obtained from complete‑case analysis or standard MAR‑based MI, which would be biased by the MNAR income data.

The paper’s contributions are threefold: (1) it introduces a simple, stratified framework that isolates the MNAR problem into a missing‑group where Δ‑adjusted MI can be applied; (2) it demonstrates superior statistical performance (bias, variance, coverage) over existing sensitivity‑based methods across a range of realistic conditions; (3) it provides a practical sensitivity‑analysis tool (Δ) that is easy to interpret and implement. Limitations include the need for user‑specified Δ values (subjectivity) and the current focus on continuous confounders; extensions to categorical variables and more complex missingness mechanisms are suggested for future work, possibly via Bayesian priors on Δ or machine‑learning based imputation models.

In summary, SDIPE offers a pragmatic, robust solution for causal inference when self‑masking MNAR confounders are present, delivering near‑unbiased ATE estimates and reliable confidence intervals without imposing unrealistic assumptions about the missingness mechanism. This makes it a valuable addition to the toolbox of epidemiologists, health economists, and social scientists dealing with incomplete covariate data.

