Incorporating Contact Network Structure in Cluster Randomized Trials
Whenever possible, the efficacy of a new treatment, such as a drug or behavioral intervention, is investigated by randomly assigning some individuals to a treatment condition and others to a control condition, and comparing the outcomes between the two groups. Often, when the treatment aims to slow an infectious disease, groups or clusters of individuals are assigned en masse to each treatment arm. The structure of interactions within and between clusters can reduce the power of the trial, i.e. the probability of correctly detecting a real treatment effect. We investigate the relationships among power, within-cluster structure, between-cluster mixing, and infectivity by simulating an infectious process on a collection of clusters. We demonstrate that current power calculations may be conservative for low levels of between-cluster mixing, but failing to account for moderate or high amounts can result in severely underpowered studies. Power also depends on within-cluster network structure for certain kinds of infectious spreading. Infections that spread opportunistically through very highly connected individuals have unpredictable infectious breakouts, which makes it harder to distinguish between random variation and real treatment effects. Our approach can be used before conducting a trial to assess power using network information if it is available, and we demonstrate how empirical data can inform the extent of between-cluster mixing.
💡 Research Summary
This paper investigates how the structure of contact networks within and between clusters influences the statistical power of cluster randomized trials (CRTs) that aim to evaluate interventions against infectious diseases. Traditional CRT design relies on the intracluster correlation coefficient (ICC) to adjust for within‑cluster outcome correlation, assuming homogeneous correlation across all pairs of individuals in a cluster. The authors argue that this simplification neglects important heterogeneity in real‑world contact patterns and that ignoring between‑cluster mixing can lead to substantial mis‑estimation of power.
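For reference, the ICC adjustment described above is usually expressed through the standard design effect for equal-sized clusters in an unmatched design. The block below is the textbook form, not a formula quoted from the paper. Here ρ denotes the ICC, n the cluster size, and C the number of clusters per arm; the value ρ = 0.01 in the worked comment is purely illustrative.

```latex
\mathrm{DE} = 1 + (n - 1)\rho, \qquad
N_{\mathrm{eff}} = \frac{C\,n}{\mathrm{DE}}
% Illustration with n = 300 and an assumed \rho = 0.01:
% DE = 1 + 299 \times 0.01 = 3.99, so an arm of C = 20 clusters (6000 people)
% carries about 6000 / 3.99 \approx 1504 effectively independent observations.
```

The formula treats every pair of individuals in a cluster as equally correlated and has no term for contact between clusters, which is the simplification the authors question.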
To explore these issues, the authors generate synthetic clusters using three well‑known random graph models: Erdős‑Rényi (ER), Barabási‑Albert (BA), and the stochastic block model (SBM). Each cluster contains n = 300 nodes with an average degree of 4, and clusters are paired (C = 20 pairs) to mimic a matched‑pair CRT design. Between‑cluster mixing is quantified by a single parameter γ, defined as the proportion of edges within a pair that cross from the treatment cluster to the control cluster. γ = 0 corresponds to completely isolated clusters, γ = 0.5 means as many cross‑cluster edges as within‑cluster edges, and γ = 1 yields a bipartite graph in which all edges run between clusters. The authors achieve a target γ by degree‑preserving rewiring: edges are swapped between the two clusters while keeping each node’s degree unchanged.
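The degree‑preserving rewiring described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`random_cluster`, `mix_pair`, `degrees`) are hypothetical, clusters are drawn as simple random graphs with a fixed number of edges, and one swap replaces a within‑cluster edge from each cluster with two cross‑cluster edges.

```python
import random
from collections import Counter

def random_cluster(nodes, n_edges, rng):
    """Random graph with exactly n_edges distinct edges among `nodes`."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(nodes, 2)
        edges.add((min(u, v), max(u, v)))
    return edges

def degrees(edges):
    """Degree of every node that appears in the edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def mix_pair(within_a, within_b, gamma, rng, max_tries=100_000):
    """Turn within-cluster edges into cross-cluster edges until a fraction
    `gamma` of all edges in the pair crosses clusters.

    Each swap takes (a1, a2) from cluster A and (b1, b2) from cluster B and
    replaces them with (a1, b1) and (a2, b2), so no node's degree changes.
    """
    within_a, within_b = list(within_a), list(within_b)
    cross = set()
    total = len(within_a) + len(within_b)
    tries = 0
    while (len(cross) < gamma * total and within_a and within_b
           and tries < max_tries):
        tries += 1
        i = rng.randrange(len(within_a))
        j = rng.randrange(len(within_b))
        a1, a2 = within_a[i]
        b1, b2 = within_b[j]
        if (a1, b1) in cross or (a2, b2) in cross:
            continue  # would create a duplicate edge; try another pair
        # Remove the two chosen within-cluster edges (swap-with-last, then pop).
        within_a[i] = within_a[-1]
        within_a.pop()
        within_b[j] = within_b[-1]
        within_b.pop()
        cross.update([(a1, b1), (a2, b2)])
    return within_a, within_b, cross
```

With n = 300 and average degree 4, each cluster would hold 600 edges, so a call such as `mix_pair(a, b, 0.2, rng)` would aim for 240 cross‑cluster edges out of 1,200. Values of γ close to 1 may not be reachable with this simple scheme because duplicate‑edge checks reject more swaps as the cross‑edge set fills up.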
The infection dynamics are modeled with a simple susceptible‑to‑infected (S→I) compartmental process. Two infectivity regimes are considered: (i) unit infectivity, where each infected node contacts a single randomly chosen neighbor per time step, and (ii) degree infectivity, where an infected node contacts all of its neighbors simultaneously. Transmission probability depends only on treatment assignment, with p0 for control and p1 for treatment: under the null hypothesis p0 = p1 = 0.30, while under the alternative the treatment reduces the probability to p1 = 0.25. Simulations run until cumulative incidence reaches 10% of the total population, at which point the trial is considered complete.
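A minimal sketch of the two infectivity regimes is given below. It is an illustration rather than the authors' implementation, and it makes assumptions the summary does not settle: the function name `simulate_si` and its parameters are hypothetical, time advances in synchronous steps, and the treatment status of the susceptible contact (not of the infected node) selects the transmission probability.

```python
import random

def simulate_si(adj, treated, p_control, p_treated, regime, rng,
                n_seeds=1, stop_frac=0.10, max_steps=10_000):
    """Run a discrete-time S->I process and return the set of infected nodes.

    adj     : dict mapping node -> list of neighbours
    treated : set of nodes that belong to treatment clusters
    regime  : "unit" (one random neighbour per step) or
              "degree" (all neighbours per step)
    """
    infected = set(rng.sample(sorted(adj), n_seeds))
    target = stop_frac * len(adj)
    for _ in range(max_steps):  # guard against graphs where spread stalls
        if len(infected) >= target:
            break
        newly = set()
        for u in infected:
            neighbours = adj[u]
            if not neighbours:
                continue
            if regime == "unit":
                contacts = [rng.choice(neighbours)]
            else:
                contacts = neighbours
            for v in contacts:
                if v in infected:
                    continue
                # Assumption: the susceptible contact's arm sets the probability.
                p = p_treated if v in treated else p_control
                if rng.random() < p:
                    newly.add(v)
        infected |= newly
    return infected
```

Under unit infectivity a hub makes no more contacts per step than any other node, whereas under degree infectivity its contact count grows with its degree, which is why the second regime is the one that interacts most strongly with network topology.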
Two analysis scenarios are examined. In the first, only baseline and final infection prevalences are observed, reflecting trials where individual infection times are unavailable. In the second, full time‑to‑infection data are assumed, allowing more sophisticated survival‑type analyses. For each scenario the authors apply appropriate statistical tests (e.g., difference in proportions, log‑rank‑type tests) and estimate empirical power by repeating the simulation 1,000 times.
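Estimating power by repeated simulation, as described above, follows a generic pattern: simulate a trial, test it, and record the share of rejections. The sketch below illustrates that loop under stated assumptions. The test used here is a sign‑flip randomization test on paired cluster differences, swapped in because it needs only the standard library; it is not claimed to be the authors' test. `toy_pair_diffs` is a hypothetical stand‑in that draws independent outcomes with arbitrary prevalences and would be replaced by a network simulation in practice.

```python
import random
from statistics import mean

def sign_flip_pvalue(diffs, rng, n_perm=500):
    """Two-sided Monte Carlo p-value for 'mean paired difference is zero'.

    Under the null, the sign of each pair's difference is arbitrary, so the
    observed mean is compared with means obtained after random sign flips.
    """
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def empirical_power(simulate_pair_diffs, n_reps, rng, alpha=0.05):
    """Fraction of simulated trials in which the null is rejected."""
    rejections = sum(
        sign_flip_pvalue(simulate_pair_diffs(rng), rng) < alpha
        for _ in range(n_reps)
    )
    return rejections / n_reps

def toy_pair_diffs(rng, n_pairs=20, n=100, p_control=0.12, p_treated=0.08):
    """Placeholder trial: independent Bernoulli outcomes, no network.

    Returns control-minus-treated prevalence for each cluster pair.
    The prevalences are arbitrary illustrative values.
    """
    diffs = []
    for _ in range(n_pairs):
        control = sum(rng.random() < p_control for _ in range(n)) / n
        treated = sum(rng.random() < p_treated for _ in range(n)) / n
        diffs.append(control - treated)
    return diffs
```

A call such as `empirical_power(toy_pair_diffs, 1000, random.Random(1))` mirrors the 1,000 repetitions mentioned above. Running the same function with equal probabilities in both arms gives the empirical type I error rate, which is a useful check that the chosen test is valid before its power is interpreted.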
Results show that when γ is low (near 0), traditional ICC‑based power calculations are slightly conservative; with the chosen parameters (C = 20, n = 300) the empirical power lies in the typical 0.8–0.9 range. However, once γ reaches even modest values (γ ≥ 0.2), power declines sharply. The effect is especially pronounced for BA and SBM networks, where highly connected “hub” nodes or community structure facilitate rapid spread. In these settings, even modest between‑cluster mixing allows the infection to spread between treated and control clusters, diluting the observable treatment effect. Degree infectivity amplifies this phenomenon because an infected hub can expose all of its contacts at once, including those reached through cross‑cluster edges, leading to explosive outbreaks that mask treatment differences. Unit infectivity shows a milder dependence on network topology but still suffers power loss as γ grows.
The authors also demonstrate how real‑world data can inform γ. Using inter‑regional cell‑phone call volumes, they estimate the proportion of cross‑cluster contacts and incorporate this estimate into a pre‑trial power analysis. This example illustrates that, when network information is available, researchers can adjust the number of clusters or the cluster size to compensate for anticipated mixing, thereby avoiding underpowered studies.
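One simple way to turn contact volumes into a mixing estimate is to take the share of volume that crosses cluster boundaries. The sketch below assumes that reading of γ. The function name `estimate_gamma`, the data layout, and the region names and call counts are all hypothetical and are not taken from the paper's data.

```python
def estimate_gamma(call_volume, cluster_of):
    """Share of total call volume that links regions in different clusters.

    call_volume : dict mapping (source_region, target_region) -> number of calls
    cluster_of  : dict mapping region -> cluster label
    """
    cross = 0
    total = 0
    for (src, dst), calls in call_volume.items():
        total += calls
        if cluster_of[src] != cluster_of[dst]:
            cross += calls
    return cross / total if total else 0.0

# Hypothetical example: three regions assigned to two clusters.
volumes = {
    ("r1", "r2"): 400,  # both regions in the treatment cluster
    ("r1", "r3"): 100,  # crosses clusters
    ("r2", "r3"): 100,  # crosses clusters
    ("r3", "r3"): 400,  # within the control cluster
}
clusters = {"r1": "treatment", "r2": "treatment", "r3": "control"}
print(estimate_gamma(volumes, clusters))  # 0.2
```

Call volume is only a proxy for infectious contact, so an estimate like this is best treated as one input to a sensitivity analysis over a range of γ values rather than as a fixed design parameter.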
In conclusion, the study provides strong evidence that CRT power is a joint function of (1) within‑cluster network topology, (2) the extent of between‑cluster mixing (γ), and (3) disease‑specific infectivity characteristics. Relying solely on the ICC may be adequate only when clusters are essentially isolated; otherwise, ignoring network structure can lead to severely underpowered trials. The proposed framework (parameterizing mixing with γ, simulating disease spread on realistic networks, and comparing empirical power to standard formulas) offers a practical tool for trial designers. Future work could extend the approach to more complex epidemiological models (e.g., SEIR, waning immunity) and to other network families (small‑world, exponential random graph models), further refining power calculations for a broad range of public‑health interventions.