Operational Dosage: Implications of Capacity Constraints for the Design and Interpretation of Experiments
We study RCTs that evaluate the impact of service interventions, for example, teachers or advisors conducting proactive outreach to at-risk students, medical providers giving medication adherence support by calling or texting, or social workers who conduct home visits. A defining feature of service interventions is that they are delivered by a capacity-constrained resource – teachers, healthcare providers, or social workers – whose limited availability creates causal inference complications. Because participants share a finite service capacity, adding more participants can reduce the timeliness or intensity of the service that others receive, introducing interference across participants. This generates hidden variation in the treatment itself, which we term operational dosage. We provide a mathematical model of service interventions using techniques from queueing theory and study the impact of capacity constraints on experimental outcomes. Our main insight is that treatment effects are both capacity- and sample-size-dependent, and decrease in sample size once a critical threshold is exceeded. An interesting implication is that the statistical power of service intervention RCTs peaks at intermediate sample sizes – directly contradicting conventional power calculations that assume monotonically increasing power with sample size. We instantiate our insights using simulations calibrated to a real-world trial evaluating a behavioral health intervention for tuberculosis patients in Kenya. Our simulation results suggest that a trial with high service capacity but limited sample size can obtain the same statistical power as a trial with lower service capacity but large sample size. Taken together, our results highlight the importance of capacity selection in experiment design and provide a mechanism for why experiments may fail to replicate or perform at scale.
💡 Research Summary
The paper tackles a largely overlooked source of bias in randomized controlled trials (RCTs) of “service interventions” – interventions delivered by capacity‑constrained human resources such as teachers, clinicians, or social workers. The authors coin the term “operational dosage” to describe the hidden variation in treatment intensity that arises when the number of participants exceeds the available service capacity. The model is a continuous‑time Markov chain that embeds a classic queueing system: each participant alternates between a desirable state (e.g., medication adherence) and an undesirable state (non‑adherence). While in the undesirable state, a participant becomes eligible for a service that, if delivered, raises the rate of returning to the desirable state from a baseline rate τ to a higher rate determined by the service completion rate μ and the service success probability p (effective service rate μp). Each of a finite number M of servers can attend to only one participant at a time, creating a queue whose length depends on the participant‑to‑server ratio N/M.
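A discrete-event (Gillespie-style) simulation makes the model concrete. The sketch below fills in details the summary does not confirm: it assumes λ is the per-participant rate of lapsing into the undesirable state, that baseline recovery at rate τ applies only while waiting (not while in service), and that the queue is FIFO. Parameter names (λ, τ, μ, p, M, N) follow the text.

```python
import random

def simulate(N, M, lam, tau, mu, p, horizon, seed=0):
    """Gillespie-style simulation of the adherence queueing model.

    Participants are 'good' (adherent) or 'bad' (non-adherent); bad
    participants wait in a FIFO queue for one of M servers. Returns the
    fraction of person-time spent non-adherent over the horizon.
    """
    rng = random.Random(seed)
    good = set(range(N))   # adherent participants
    queue = []             # non-adherent, waiting for a server
    in_service = set()     # non-adherent, currently being served
    t, bad_time = 0.0, 0.0
    while t < horizon:
        rates = [lam * len(good),        # lapse: good -> bad
                 tau * len(queue),       # baseline recovery while waiting
                 mu * len(in_service)]   # service completion
        total = sum(rates)               # > 0: every participant is somewhere
        dt = rng.expovariate(total)
        if t + dt > horizon:             # clamp the final interval
            bad_time += (len(queue) + len(in_service)) * (horizon - t)
            break
        bad_time += (len(queue) + len(in_service)) * dt
        t += dt
        u = rng.random() * total
        if u < rates[0]:                           # a participant lapses
            i = rng.choice(sorted(good)); good.remove(i); queue.append(i)
        elif u < rates[0] + rates[1]:              # recovers without service
            good.add(queue.pop(rng.randrange(len(queue))))
        else:                                      # a service completes
            i = rng.choice(sorted(in_service)); in_service.remove(i)
            if rng.random() < p:                   # success: back to adherent
                good.add(i)
            else:                                  # failure: rejoin the queue
                queue.append(i)
        while queue and len(in_service) < M:       # idle servers pull from queue
            in_service.add(queue.pop(0))
    return bad_time / (N * horizon)
```

Holding N fixed and shrinking M lengthens the queue and raises the non-adherent fraction, which is exactly the operational-dosage effect the paper describes.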
The theoretical analysis yields four main insights. First, treatment effects are not fixed; they depend on the operational dosage, which itself is a function of capacity and sample size. Second, there exists a critical threshold r for the ratio N/M. Below r the system operates in a “quality‑driven” (QD) regime where adding participants has little impact on effect size. Above r the system enters an “efficiency‑driven” (ED) regime; queue lengths grow non‑linearly, service delays increase, and the effective success probability declines sharply, causing treatment effects to fall. Third, because statistical power follows the same pattern, it is not monotonically increasing with sample size. Power rises as N grows while the system remains in QD, peaks near the threshold, and then declines once the ED regime is entered. Fourth, the threshold provides a practical design rule: setting the participant‑to‑server ratio close to r (for example, using a square‑root staffing heuristic M≈√N·(μ/λ)) maximizes power for a given budget.
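The non-monotone power curve can be sketched with a stylized two-sample power calculation in which the detectable effect stays flat in the QD regime and decays once N/M exceeds r. The threshold r = 10, the baseline effect δ₀, and the 1/(N/M) decay are illustrative assumptions, not the paper's expressions:

```python
from statistics import NormalDist

def power(N, M, delta0=0.15, sigma=1.0, r=10.0, alpha=0.05):
    """Two-sided two-sample power with a capacity-dependent effect size.

    Effect size is flat while N/M <= r (quality-driven regime) and decays
    proportionally to r/(N/M) beyond it (efficiency-driven regime) --
    an illustrative decay, not the paper's exact expression.
    """
    ratio = N / M
    delta = delta0 if ratio <= r else delta0 * r / ratio
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * (4 / N) ** 0.5   # SE of a difference in means, N/2 per arm
    return 1 - NormalDist().cdf(z - delta / se)

# With M = 20 servers, power peaks near N = r * M = 200 and then declines.
curve = [(N, round(power(N, M=20), 3)) for N in range(50, 1001, 50)]
```

Below the threshold the signal-to-noise ratio grows like √N; above it the effect shrinks like 1/N, so δ/se falls like 1/√N. Any decay in effect size faster than 1/√N past the threshold produces the same peaked shape.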
To illustrate these concepts, the authors calibrate their model to a real‑world RCT of the “Keheala” digital adherence support program for tuberculosis patients in Kenya. Empirical estimates (λ≈0.05/day, τ≈0.1/day, μ≈0.2/day, p≈0.7) are fed into a simulation that varies M and N. The simulation shows that a high‑capacity, modest‑size trial (e.g., M=30, N=200) achieves the same power as a low‑capacity, large‑size trial (M=15, N=400). Conversely, keeping capacity fixed while inflating N (e.g., N=800, M=15) pushes the system into the ED regime, lengthening queues, reducing effective dosage, and dropping power below 0.6. These results confirm that capacity, not just sample size, is a decisive determinant of detectable effects.
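The capacity-versus-sample-size trade-off can be checked with a back-of-the-envelope fluid (mean-field) balance rather than the paper's full CTMC: in steady state, the inflow of lapses λ(N−B) equals the outflow via successful service μ·p·min(B, M) plus baseline recovery τ·max(B−M, 0) of those still waiting. This approximation, and reading λ as the per-participant lapse rate, are my assumptions; the parameter values are the calibrated estimates quoted above.

```python
def fluid_bad_count(N, M, lam, tau, mu, p):
    """Solve lam*(N-B) = mu*p*min(B,M) + tau*max(B-M,0) for B by bisection."""
    def net(B):
        return lam * (N - B) - mu * p * min(B, M) - tau * max(B - M, 0.0)
    lo, hi = 0.0, float(N)   # net() is strictly decreasing in B
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if net(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

def effective_dosage(N, M, lam=0.05, tau=0.1, mu=0.2, p=0.7):
    """Share of recoveries driven by the service rather than baseline recovery."""
    B = fluid_bad_count(N, M, lam, tau, mu, p)
    via_service = mu * p * min(B, M)
    via_baseline = tau * max(B - M, 0.0)
    return via_service / (via_service + via_baseline)

# High capacity with modest N keeps most recoveries service-driven; inflating
# N at fixed capacity pushes recoveries onto the untreated baseline rate.
scenarios = {(M, N): round(effective_dosage(N, M), 2)
             for (M, N) in [(30, 200), (15, 400), (15, 800)]}
```

Under these assumptions the (M=30, N=200) scenario keeps a far larger share of recoveries service-driven than (M=15, N=400), and (M=15, N=800) lower still, mirroring the simulated power ordering in the summary.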
The discussion situates operational dosage as a concrete mechanism of SUTVA violation, combining hidden variation (different participants receive different “doses” of the intervention) with interference (one participant’s service consumes capacity needed by another). The authors argue that failure to account for capacity explains why some interventions succeed in small trials but falter when scaled, and why replication attempts sometimes produce weaker effects. They provide actionable guidance: researchers should pre‑specify the required number of servers based on the anticipated participant‑to‑server ratio, possibly using the square‑root staffing rule; power calculations must incorporate the queueing dynamics; and post‑trial interpretation should consider whether capacity constraints may have attenuated observed effects.
Finally, the paper suggests extensions such as multi‑service settings, priority queuing, and stochastic arrival processes, indicating that the queueing‑based framework can be adapted to a broad class of policy evaluations where human capacity is the bottleneck. Overall, the work bridges queueing theory and causal inference, offering a novel lens on experimental design for capacity‑constrained service interventions.