Modern Causal Inference Approaches to Improve Power for Subgroup Analysis in Randomized Controlled Trials

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Randomized controlled trials (RCTs) often include subgroup analyses to assess whether treatment effects vary across pre-specified patient populations. However, these analyses frequently suffer from small sample sizes which limit the power to detect heterogeneous effects. Power can be improved by leveraging predictors of the outcome – i.e., through covariate adjustment – as well as by borrowing external data from similar RCTs or observational studies. The benefits of covariate adjustment may be limited when the trial sample is small. Borrowing external data can increase the effective sample size and improve power, but it introduces two key challenges: (i) integrating data across sources can lead to model misspecification, and (ii) practical violations of the positivity assumption – where the probability of receiving the target treatment is near-zero for some covariate profiles in the external data – can lead to extreme inverse-probability weights and unstable inferences, ultimately negating potential power gains. To account for these shortcomings, we present an approach to improving power in pre-planned subgroup analyses of small RCTs that leverages both baseline predictors and external data. We propose debiased estimators that accommodate parametric, machine learning, and nonparametric Bayesian methods. To address practical positivity violations, we introduce three estimators: a covariate-balancing approach, an automated debiased machine learning (DML) estimator, and a calibrated DML estimator. We show improved power in various simulations and offer practical recommendations for the application of the proposed methods. Finally, we apply them to evaluate the effectiveness of citalopram for negative symptoms in first-episode schizophrenia patients across subgroups defined by duration of untreated psychosis, using data from two small RCTs.


💡 Research Summary

This paper addresses the persistent challenge of low statistical power in pre‑planned subgroup analyses of small randomized controlled trials (RCTs). While covariate adjustment and borrowing external data are recognized strategies for improving precision, each suffers from distinct drawbacks: covariate adjustment alone may be insufficient when the trial sample is tiny, and external data integration can introduce model misspecification and, more critically, practical violations of the positivity assumption (i.e., near‑zero probability of receiving the target treatment for some covariate profiles). To simultaneously mitigate these issues, the authors develop a suite of debiased estimators that combine baseline predictors with external information while explicitly handling positivity violations.
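To see why practical positivity violations matter, the following minimal sketch (an illustration under assumed values, not the authors' code) computes inverse-probability weights and the Kish effective sample size for five hypothetical treated units. The function names and propensity values are invented for illustration.

```python
import numpy as np


def inverse_probability_weights(propensity, treated):
    """Weight 1/e(x) for treated units and 1/(1 - e(x)) for untreated units."""
    propensity = np.asarray(propensity, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    return np.where(treated, 1.0 / propensity, 1.0 / (1.0 - propensity))


def effective_sample_size(weights):
    """Kish effective sample size: (sum of weights)^2 / (sum of squared weights)."""
    weights = np.asarray(weights, dtype=float)
    return weights.sum() ** 2 / np.sum(weights ** 2)


# Hypothetical data: five treated units.
treated = np.ones(5, dtype=bool)

# Case 1: every unit has treatment probability 0.5 (well-behaved weights).
balanced = inverse_probability_weights([0.5, 0.5, 0.5, 0.5, 0.5], treated)

# Case 2: one unit has a near-zero treatment probability (practical
# positivity violation), so its weight dominates all the others.
extreme = inverse_probability_weights([0.5, 0.5, 0.5, 0.5, 0.001], treated)

print("balanced ESS:", effective_sample_size(balanced))
print("extreme ESS:", effective_sample_size(extreme))
```

In the second case a single unit carries a weight of about 1000, and the effective sample size falls from 5 to about 1. This is the kind of instability that can cancel the power gained from borrowing external data.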

The methodological framework begins with formal causal notation: each subject’s data consist of baseline covariates X, a treatment indicator A, a source indicator S (S=1 for the target trial, S=0 for external data), and an outcome Y. The target estimand is the subgroup‑specific average treatment effect (ATE) within the trial population, that is, the conditional average treatment effect (CATE) averaged over the covariate distribution of trial participants who belong to the subgroup. Identification relies on five standard causal assumptions: weak ignorability, weak exchangeability over source, consistency, treatment positivity, and participation positivity. The authors note that while randomization guarantees many of these assumptions in the target trial, they must be assumed (and are often violated) for the external data.
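The estimand described above can be written out as follows. This is a sketch in assumed notation rather than the paper's own: Y^a denotes the potential outcome under treatment a, and G denotes the set of covariate values that defines the subgroup. Neither symbol appears in the summary.

```latex
% Assumed notation: Y^{a} = potential outcome under treatment a,
% G = set of covariate values defining the subgroup.

% CATE in the target trial population (S = 1):
\tau(x) = E\left[ Y^{1} - Y^{0} \mid X = x,\; S = 1 \right]

% Subgroup-specific ATE: the CATE averaged over the covariates of
% trial participants in the subgroup (law of iterated expectations):
\psi_{G} = E\left[ Y^{1} - Y^{0} \mid X \in G,\; S = 1 \right]
         = E\left[ \tau(X) \mid X \in G,\; S = 1 \right]
```

The second equality is the law of iterated expectations, which is why the subgroup effect can be read as an average of the CATE over the subgroup.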

Building on this identification, three families of debiased estimators are proposed:

  1. Covariate‑adjusted debiased estimator – a doubly robust estimator that uses outcome and treatment nuisance models (μ_a(x)=E
