Model-assisted design of experiments in the presence of network correlated outcomes
We consider the problem of how to assign treatment in a randomized experiment in which the correlation among the outcomes is informed by a network available pre-intervention. Working within the potential outcome causal framework, we develop a class of models that posit such a correlation structure among the outcomes. We then leverage these models to develop restricted randomization strategies that allocate treatment optimally by minimizing the mean square error of the estimated average treatment effect. Analytical decompositions of the mean square error, due both to the model and to the randomization distribution, provide insights into aspects of the optimal designs. In particular, the analysis suggests new notions of balance based on specific network quantities, in addition to classical covariate balance. The resulting balanced, optimal restricted randomization strategies remain design unbiased even when the model used to derive them does not hold. Extensive simulations illustrate how the proposed treatment allocation strategies improve on allocations that ignore the network structure.
💡 Research Summary
This paper addresses the design of randomized experiments when pre‑intervention network information suggests that unit outcomes are correlated. Working within the potential‑outcome framework, the authors introduce a class of statistical models that encode network‑induced correlation among potential outcomes, and then use these models to construct restricted randomization schemes that minimize the mean‑square error (MSE) of the difference‑in‑means estimator for the average treatment effect (ATE).
The core illustration is the “normal‑sum” model. Each unit i has a latent covariate X_i ~ N(μ,σ²). The control potential outcome Y_i(0) is generated as a normal random variable with mean equal to the sum of the X_j of i’s neighbors (including i itself) and variance γ². The treatment potential outcome is simply Y_i(1)=Y_i(0)+τ, where τ is the constant additive treatment effect. This construction captures the intuition that a unit’s outcome is influenced by the characteristics of its network neighbors, a situation common in social‑media usage, peer effects, or contagion‑like phenomena.
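The generative steps of the normal-sum model can be sketched in a few lines of Python. This is only an illustrative simulation under the description above, not the authors' code: the function name `simulate_normal_sum`, the dictionary-of-sets network format, and the four-node path graph are assumptions made for the example.

```python
import random


def simulate_normal_sum(neighbors, mu, sigma, gamma, tau, seed=0):
    """Draw potential outcomes under the normal-sum model as summarized above.

    `neighbors` maps each unit to the set of its network neighbors
    (a hypothetical input format chosen for this sketch).
    Returns two dicts: control outcomes y0 and treated outcomes y1.
    """
    rng = random.Random(seed)
    # Latent covariate X_i ~ N(mu, sigma^2) for every unit.
    x = {i: rng.gauss(mu, sigma) for i in neighbors}
    y0 = {}
    for i, nbrs in neighbors.items():
        closed = set(nbrs) | {i}              # neighbors of i, including i itself
        mean_i = sum(x[j] for j in closed)    # mean of Y_i(0) is the sum of X_j
        y0[i] = rng.gauss(mean_i, gamma)      # Y_i(0) ~ N(mean_i, gamma^2)
    # Constant additive treatment effect: Y_i(1) = Y_i(0) + tau.
    y1 = {i: y0[i] + tau for i in y0}
    return y0, y1


# Hypothetical example network: a path 0 - 1 - 2 - 3.
neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
y0, y1 = simulate_normal_sum(neighbors, mu=1.0, sigma=0.5, gamma=0.2, tau=2.0)
for i in sorted(y0):
    print(i, round(y0[i], 3), round(y1[i], 3))
```

Because the mean of Y_i(0) sums over the closed neighborhood of i, units with more neighbors have larger expected outcomes whenever μ is nonzero, which is why degree matters for balance.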
Using this model, the authors derive an explicit expression for the conditional MSE of the estimator (\hat τ) given a treatment assignment vector Z. The MSE decomposes into three components: (i) a bias term proportional to μ²·δ_N(Z)², where δ_N(Z) measures the difference in average neighbor‑set size (i.e., average degree) between treated and control units; (ii) a variance term γ²·(1/N₁+1/N₀) that penalizes imbalance in the numbers of treated (N₁) and control (N₀) units; and (iii) an additional variance term σ²·ωᵀAω that depends on the adjacency matrix A and the allocation vector ω. This decomposition reveals two natural balance criteria: (a) the average degree (or more generally, the average network exposure) should be balanced across treatment arms to eliminate bias, and (b) the treatment arms should be of equal size to minimize the γ‑driven variance.
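Two of the quantities in this decomposition are easy to compute for a given assignment. The sketch below assumes that δ_N(Z) is the treated-minus-control difference in average degree, as the summary describes, and computes the size factor 1/N₁+1/N₀. The function names and the example graph are hypothetical, and the ωᵀAω term is left out because the summary does not define ω.

```python
def degree_imbalance(neighbors, z):
    """Treated-minus-control difference in average degree.

    `z` maps each unit to 1 (treated) or 0 (control). Counting each unit in
    its own neighbor set would add 1 to both averages and leave the
    difference unchanged.
    """
    treated = [i for i in neighbors if z[i] == 1]
    control = [i for i in neighbors if z[i] == 0]
    avg_t = sum(len(neighbors[i]) for i in treated) / len(treated)
    avg_c = sum(len(neighbors[i]) for i in control) / len(control)
    return avg_t - avg_c


def size_term(z):
    """The factor 1/N1 + 1/N0 that multiplies gamma^2 in the variance term."""
    n1 = sum(z.values())
    n0 = len(z) - n1
    return 1.0 / n1 + 1.0 / n0


# Hypothetical example network: a path 0 - 1 - 2 - 3 (degrees 1, 2, 2, 1).
neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
balanced = {0: 1, 1: 1, 2: 0, 3: 0}      # one endpoint and one interior unit per arm
unbalanced = {0: 1, 1: 0, 2: 0, 3: 1}    # both endpoints treated

print(degree_imbalance(neighbors, balanced), size_term(balanced))
print(degree_imbalance(neighbors, unbalanced), size_term(unbalanced))
```

Both assignments have equal arm sizes, so the size term is the same for both. Only the degree imbalance separates them, which shows that the two balance criteria are distinct.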
Guided by these insights, the paper proposes several restricted randomization strategies. The simplest “degree‑balance” restriction admits only those assignments for which the absolute difference in average degree between groups falls below a pre‑specified threshold. A “size‑balance” restriction forces N₁=N₀, and a combined restriction enforces both simultaneously. Practically, these restrictions are implemented via rerandomization: repeatedly draw a completely random assignment until it satisfies the constraints, then use that assignment for the experiment.
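The rerandomization loop for the combined restriction might look like the following. This is a minimal sketch under stated assumptions, not the paper's procedure: the names `avg_degree_gap` and `rerandomize`, the `max_draws` cap, and the example graph are all hypothetical. As a shortcut, the size restriction N₁=N₀ is enforced by sampling equal-sized arms directly rather than by rejecting unequal draws.

```python
import random


def avg_degree_gap(neighbors, treated):
    """Absolute difference in average degree between treated and control units."""
    treated = set(treated)
    control = [i for i in neighbors if i not in treated]
    avg_t = sum(len(neighbors[i]) for i in treated) / len(treated)
    avg_c = sum(len(neighbors[i]) for i in control) / len(control)
    return abs(avg_t - avg_c)


def rerandomize(neighbors, threshold, max_draws=10_000, seed=0):
    """Redraw equal-sized random assignments until the degree gap is below threshold.

    Returns a dict mapping each unit to 1 (treated) or 0 (control).
    `max_draws` is a safety cap added for this sketch.
    """
    rng = random.Random(seed)
    units = sorted(neighbors)
    n1 = len(units) // 2                      # size balance by construction
    for _ in range(max_draws):
        treated = set(rng.sample(units, n1))  # one completely random draw
        if avg_degree_gap(neighbors, treated) < threshold:
            return {i: int(i in treated) for i in units}
    raise RuntimeError("no acceptable assignment found; consider a looser threshold")


# Hypothetical example network: a path 0 - 1 - 2 - 3 (degrees 1, 2, 2, 1).
neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
z = rerandomize(neighbors, threshold=0.5)
print(z)
```

A tighter threshold rejects more draws, so the choice of threshold trades degree balance against the number of assignments that remain admissible.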
A key theoretical contribution is the proof that these model‑assisted restricted designs retain design‑unbiasedness of the difference‑in‑means estimator: even if the normal‑sum model is misspecified, E