PrivATE: Differentially Private Average Treatment Effect Estimation for Observational Data
Causal inference plays a crucial role in scientific research across many disciplines. Estimating causal effects, particularly the average treatment effect (ATE), from observational data has garnered significant attention. However, computing the ATE from real-world observational data poses substantial privacy risks to the individuals whose records are analyzed. Differential privacy, which offers strict theoretical guarantees, has emerged as a standard approach for privacy-preserving data analysis, yet existing differentially private ATE estimation methods rely on restrictive assumptions, provide limited privacy protection, or fail to protect all of the information involved. To this end, we introduce PrivATE, a practical ATE estimation framework that guarantees differential privacy. Because privacy requirements differ across scenarios (for example, only test scores are typically sensitive in educational evaluation, whereas every field of a medical record is usually private), we design two levels of privacy protection in PrivATE: label-level and sample-level. By deriving an adaptive matching limit, PrivATE effectively balances noise-induced error against matching error, yielding a more accurate estimate of the ATE. Our evaluation validates the effectiveness of PrivATE, which outperforms the baselines on all datasets and privacy budgets.
💡 Research Summary
The paper introduces PrivATE, a novel framework for estimating the average treatment effect (ATE) from observational data while providing rigorous differential privacy (DP) guarantees. Recognizing that different application domains have varying privacy requirements, the authors propose two distinct privacy protection levels: label‑level DP, which adds noise only to the outcome variable, and sample‑level DP, which perturbs treatment, covariates, and outcomes alike. This dual‑level design enables higher utility in settings where only the outcome is sensitive (e.g., educational assessments) and stronger protection when all attributes are confidential (e.g., medical records).
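The sample‑level protection of binary treatment indicators can be illustrated with the standard randomized‑response mechanism. This is a generic sketch of that textbook mechanism, not the paper's exact construction; the function names are illustrative:

```python
import math
import random

def randomized_response(bit: int, epsilon: float) -> int:
    """Report a binary treatment indicator under epsilon-local DP.

    With probability e^eps / (e^eps + 1) the true bit is reported;
    otherwise it is flipped (standard randomized response).
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p_truth else 1 - bit

def debias_mean(noisy_bits, epsilon: float) -> float:
    """Unbiased estimate of the true bit mean from randomized responses."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    noisy_mean = sum(noisy_bits) / len(noisy_bits)
    return (noisy_mean - (1.0 - p)) / (2.0 * p - 1.0)
```

A smaller ε pushes the flip probability toward 1/2, so each reported indicator reveals less, at the cost of higher variance in the debiased estimate.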
A central technical contribution is the adaptive matching limit mechanism. Traditional matching‑based ATE estimators suffer from a trade‑off between matching error (bias from poor matches) and noise error (variance from DP noise). The sensitivity of the sum of matched outcomes grows with the number of matches per unit, inflating the Laplace noise required for a given privacy budget ε. PrivATE analytically derives the total expected error as a function of the matching cap k, the global sensitivity, and ε, then selects the k that minimizes this combined error. This adaptive cap replaces fixed or heuristic caps used in prior work, allowing the method to automatically adjust to dataset characteristics and privacy budgets.
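The selection logic of the adaptive cap can be sketched as a one‑dimensional minimization over k. The paper derives both error terms analytically; in this simplified sketch the matching‑error values are supplied by the caller, and only the linear growth of the Laplace noise term with k (via the sensitivity) is modeled:

```python
import numpy as np

def adaptive_matching_limit(match_err, outcome_range, epsilon, n):
    """Pick the matching cap k minimizing noise error + matching error.

    match_err[k-1]: caller-supplied estimate of the matching error when
    each unit may be matched at most k times (the paper derives this
    analytically; here it is an input). The Laplace noise on the summed
    outcomes has sensitivity proportional to k, so its contribution to
    the error of the averaged estimate grows linearly in k.
    """
    ks = np.arange(1, len(match_err) + 1)
    noise_err = ks * outcome_range / (epsilon * n)  # noise-error proxy per k
    total = noise_err + np.asarray(match_err, dtype=float)
    return int(ks[np.argmin(total)])
```

With a generous budget (large ε) the noise term is negligible and the cap drifts toward whichever k minimizes matching error alone; as ε shrinks, the linear noise term dominates and a smaller cap is chosen, which is exactly the adaptivity the paper argues fixed‑k baselines lack.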
The framework builds on propensity‑score matching (PSM). Logistic regression estimates propensity scores, and each treated (or control) unit is matched to the k most similar units from the opposite group based on absolute score differences. The sum of observed outcomes for the matched set is computed, and Laplace noise calibrated to the global sensitivity (which is bounded by the adaptive k) is added. The noisy sums for treated and control groups are then combined to produce the DP‑protected ATE estimate. For label‑DP, only the outcome sums receive Laplace noise; for sample‑DP, additional random‑response or Laplace perturbations are applied to treatment indicators and covariates, satisfying ε‑sample DP.
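The label‑DP branch of the pipeline above can be sketched end to end, assuming propensity scores have already been fitted (e.g., by logistic regression). The sensitivity bounds and the even budget split below are simplifications for illustration; the paper derives tighter bounds tied to the adaptive cap:

```python
import numpy as np

def label_dp_ate(scores, treat, y, k, y_range, epsilon, rng=None):
    """Label-DP ATE via propensity-score matching (illustrative sketch).

    Each treated unit is matched to its k nearest controls by absolute
    propensity-score difference; the matched control outcomes are averaged
    as its counterfactual. Laplace noise is added only to the two outcome
    sums (label-level DP). The control-side scale uses the simplified
    bound k * y_range, reflecting that one control outcome can enter up
    to k matched sums; the budget epsilon is split evenly across the sums.
    """
    rng = np.random.default_rng() if rng is None else rng
    t_idx = np.flatnonzero(treat == 1)
    c_idx = np.flatnonzero(treat == 0)
    treated_sum = y[t_idx].sum()
    matched_sum = 0.0
    for i in t_idx:
        nearest = c_idx[np.argsort(np.abs(scores[c_idx] - scores[i]))[:k]]
        matched_sum += y[nearest].mean()  # counterfactual outcome for unit i
    scale_t = y_range / (epsilon / 2)      # each treated outcome appears once
    scale_c = k * y_range / (epsilon / 2)  # simplified control-side bound
    noisy_t = treated_sum + rng.laplace(0.0, scale_t)
    noisy_c = matched_sum + rng.laplace(0.0, scale_c)
    return (noisy_t - noisy_c) / len(t_idx)
```

Note that noise touches only the aggregated outcome sums, never the individual matches, which is what preserves the statistical efficiency of the matching step.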
Empirical evaluation spans three data regimes: real medical data (continuous outcomes, multiple covariates), semi‑synthetic data (real covariates with simulated treatments/outcomes), and fully synthetic data with varied treatment prevalence and covariate distributions. Experiments vary ε from 0.5 to 2.0 and compare PrivATE against state‑of‑the‑art DP ATE methods, including DP re‑weighting and DP‑PSM with fixed truncation thresholds. Metrics include relative error and mean‑squared error (MSE). Results show that under label‑DP, PrivATE achieves relative errors below 0.2 even at ε = 0.5, outperforming baselines by 20‑35 %. Under sample‑DP, PrivATE consistently yields lower MSE across all datasets and privacy budgets, with the adaptive matching limit reducing total error by 15‑30 % compared to fixed‑k approaches. Additional ablations confirm the robustness of the adaptive cap, the benefit of allocating more privacy budget to labels when possible, and the superiority of Laplace over Gaussian noise in this setting.
The paper’s contributions are fourfold: (1) a flexible two‑level DP protection scheme tailored to diverse real‑world scenarios, (2) an analytically grounded adaptive matching limit that balances sensitivity and matching accuracy, (3) a practical implementation that injects noise only into aggregated outcomes, thereby preserving statistical efficiency, and (4) extensive experiments demonstrating superior utility‑privacy trade‑offs and an open‑source release for reproducibility. The authors acknowledge limitations such as computational cost for high‑dimensional covariates and the reliance on propensity‑score models, suggesting future work on scalable matching algorithms, non‑linear treatment effect models, and extensions to federated or distributed settings.