Proximal Learning for Trials With External Controls: A Case Study in HIV Prevention


With the advent of effective pre-exposure prophylaxis agents, active-controlled HIV prevention trials have become a common study design. Nevertheless, estimating absolute efficacy relative to a placebo remains important. In this paper, we introduce a novel application of proximal causal inference methods to estimate the counterfactual cumulative HIV incidence under placebo for participants in an active-controlled trial of cabotegravir, using external control data from a placebo-controlled trial with similar eligibility criteria. We leverage baseline sexually transmitted infection status and geographic region as negative control outcome and exposure variables, respectively. We address two key challenges: unmeasured differences in HIV risk between trials and statistical difficulties arising from low HIV incidence rates in both studies. To overcome these challenges, we develop two proximal inference approaches: (1) a semiparametric inverse probability of censoring weighting estimator, and (2) a two-stage regression-based strategy tailored to low-event-rate settings. Our theoretical and numerical investigations demonstrate these methods yield reliable estimates of the counterfactual one-year cumulative HIV incidence under placebo, and provide robust evidence of the superior efficacy of cabotegravir compared with placebo. These findings highlight the potential of proximal inference methods to estimate placebo-controlled effects in both single-arm and active-controlled trials by leveraging external controls.

💡 Research Summary

This paper addresses a pressing methodological challenge in modern HIV‑prevention research: how to estimate the absolute efficacy of a new intervention relative to a true placebo when contemporary trials are ethically required to use active comparators and when HIV incidence is very low. The authors propose to combine data from an active‑controlled trial (HPTN 083, which compared long‑acting injectable cabotegravir with daily oral TDF/FTC) with external placebo data from the AMP (HVTN 704/HPTN 085) study, which evaluated the VRC01 broadly neutralizing antibody against placebo. Because the two studies differ in geography, demographics, and baseline STI prevalence, unmeasured confounding is a serious concern. To overcome this, the authors employ proximal causal inference, leveraging two negative controls: baseline sexually transmitted infection (STI) status as a negative‑control outcome (NCO) and trial geographic region as a negative‑control exposure (NCE). Both variables are associated with the unmeasured factors that drive HIV risk. The NCO is assumed not to be causally affected by the treatment (cabotegravir), and the NCE is assumed not to affect HIV infection directly.

Two novel estimators are developed. The first is a semiparametric inverse‑probability‑of‑censoring‑weighting (IPCW) estimator that adjusts for loss‑to‑follow‑up and other forms of censoring while remaining robust to model misspecification. The second is a two‑stage regression procedure tailored for low‑event‑rate settings. In the first stage, bridge functions linking the NCO and NCE to the latent unmeasured confounder are estimated; in the second stage, these bridge functions are used to reconstruct the counterfactual cumulative HIV incidence that would have been observed under a true placebo for participants in HPTN 083. Both estimators enjoy a doubly‑robust property: consistency is retained if either the outcome model or the bridge‑function model is correctly specified.
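The two‑stage idea can be illustrated with a generic linear proximal two‑stage least squares sketch on simulated data. This is not the authors' estimator: it assumes a continuous outcome, linear models, and a single scalar unmeasured confounder, and it ignores censoring and the rare‑event adjustments the paper targets. The names `proximal_2sls`, `simulate`, `z` (NCE), and `w` (NCO) are hypothetical and chosen only for illustration.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def proximal_2sls(y, a, z, w):
    """Generic linear proximal two-stage least squares (illustrative assumption,
    not the paper's method).
    Stage 1: regress the negative-control outcome w on (1, a, z).
    Stage 2: regress y on (1, a, w_hat); return the coefficient on a."""
    ones = np.ones_like(y)
    X1 = np.column_stack([ones, a, z])
    w_hat = X1 @ ols(X1, w)                  # fitted NCO, a proxy for the confounder
    X2 = np.column_stack([ones, a, w_hat])
    return ols(X2, y)[1]

def simulate(n=200_000, beta=-1.0, seed=0):
    """Toy data: u is unmeasured and drives exposure, both proxies, and outcome."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                            # unmeasured confounder
    a = (u + rng.normal(size=n) > 0).astype(float)    # exposure depends on u
    z = u + rng.normal(size=n)    # NCE: related to u, no direct effect on y
    w = u + rng.normal(size=n)    # NCO: related to u, not affected by a
    y = beta * a + u + rng.normal(size=n)             # true effect of a is beta
    return y, a, z, w

y, a, z, w = simulate()
naive = ols(np.column_stack([np.ones_like(y), a]), y)[1]   # ignores u entirely
proximal = proximal_2sls(y, a, z, w)
print(f"true effect -1.00 | naive {naive:.2f} | proximal {proximal:.2f}")
```

In this toy setup the naive regression absorbs the confounder's contribution into the exposure coefficient, while the two‑stage fit uses the pair of proxies to remove it. A binary, rare outcome such as HIV acquisition would need a different second‑stage model than the linear one assumed here.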

Theoretical work establishes consistency, asymptotic normality, and semiparametric efficiency of the proposed estimators. Simulation studies, calibrated to realistic HIV‑incidence levels (≈0.5 % per year), demonstrate that the new methods dramatically reduce mean‑squared error and produce confidence intervals with coverage close to the nominal 95 % level, outperforming standard propensity‑score weighting and direct standardization approaches that assume no unmeasured confounding.

Applying the methods to the real data, the authors first estimate the one‑year cumulative HIV incidence in the AMP placebo arm (2.98 %). Using the proximal estimators, they then infer the counterfactual placebo incidence for the HPTN 083 cohort, obtaining an estimate of roughly 2.8 % (95 % CI ≈ 2.1–3.5 %). By contrast, the observed incidence in the cabotegravir arm of HPTN 083 was 0.41 % (95 % CI 0.20–0.70 %). The point estimates correspond to roughly a 7‑fold lower incidence with cabotegravir (2.8/0.41 ≈ 6.8), or a relative reduction of about 85 % versus placebo, supporting the superior protective effect of cabotegravir.
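The comparison between the two point estimates quoted above can be checked with simple arithmetic. The sketch below uses only the rounded values from this summary (2.8 % and 0.41 %) and does not propagate the confidence intervals, so it is a rough check rather than an efficacy analysis. The function name `compare_incidence` is hypothetical.

```python
def compare_incidence(placebo_pct, active_pct):
    """Summarise two cumulative incidences given in percent.
    Returns (fold difference, relative reduction, absolute reduction in percentage points)."""
    ratio = placebo_pct / active_pct
    rel_reduction = 1.0 - active_pct / placebo_pct
    abs_reduction = placebo_pct - active_pct
    return ratio, rel_reduction, abs_reduction

# Rounded point estimates taken from the summary text
ratio, rel, abs_pp = compare_incidence(2.8, 0.41)
print(f"fold {ratio:.1f}, relative reduction {rel:.0%}, absolute {abs_pp:.2f} pp")
# prints: fold 6.8, relative reduction 85%, absolute 2.39 pp
```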

The study makes several important contributions. First, it demonstrates that external placebo data can be rigorously incorporated into active‑controlled trials using proximal causal inference, thereby providing absolute efficacy estimates without a concurrent placebo arm. Second, it shows that appropriate negative‑control variables can mitigate bias from unmeasured confounders that differ across studies. Third, it offers practical solutions for the low‑event‑rate regime that plagues modern HIV‑prevention trials, where traditional methods often produce unstable estimates or confidence intervals that extend beyond the logical 0–100 % range. Finally, the methodological framework is broadly applicable to other therapeutic areas where rare outcomes and ethical constraints limit the use of placebo controls. Future work may extend the approach to multiple external datasets, incorporate Bayesian borrowing schemes, and develop systematic guidance for selecting and validating negative‑control variables.
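The remark about intervals leaving the 0–100 % range can be seen in a much simpler setting than the paper's: a plain binomial proportion with few events and no censoring. The sketch below compares the textbook Wald interval with the Wilson score interval, which is swapped in here purely as a familiar bounded alternative and is not the paper's approach. The event count and sample size are made up for illustration.

```python
import math

def wald_interval(events, n, z=1.96):
    """Textbook normal-approximation interval; can fall below 0 when events are rare."""
    p = events / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_interval(events, n, z=1.96):
    """Wilson score interval; always stays inside [0, 1]."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical rare-event data: 2 events among 500 participants
wald = wald_interval(2, 500)
wilson = wilson_interval(2, 500)
print(f"Wald   lower {wald[0]:+.4f}, upper {wald[1]:.4f}")
print(f"Wilson lower {wilson[0]:+.4f}, upper {wilson[1]:.4f}")
```

With these made‑up counts the Wald lower limit is negative, while the Wilson limits remain between 0 and 1.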
