Dynamic Welfare-Maximizing Pooled Testing
Pooled testing is a common strategy for public health disease screening under limited testing resources, allowing multiple biological samples to be tested together with the resources of a single test, at the cost of reduced individual resolution. While dynamic and adaptive strategies have been extensively studied in the classical pooled testing literature, where the goal is to minimize the number of tests required for full diagnosis of a given population, much of the existing work on welfare-maximizing pooled testing adopts static formulations in which all tests are assigned in advance. In this paper, we study dynamic welfare-maximizing pooled testing strategies in which a limited number of tests are performed sequentially to maximize social welfare, defined as the aggregate utility of individuals who are confirmed to be healthy. We formally define the dynamic problem and study algorithmic approaches for sequential test assignment. Because exact dynamic optimization is computationally infeasible beyond small instances, we evaluate a range of strategies (including exact optimization baselines, greedy heuristics, mixed-integer programming relaxations, and learning-based policies) and empirically characterize their performance and tradeoffs using synthetic experiments. Our results show that dynamic testing can yield substantial welfare improvements over static baselines in low-budget regimes. We find that much of the benefit of dynamic testing is captured by simple greedy policies, which substantially outperform static approaches while remaining computationally efficient. Learning-based methods are included as flexible baselines, but in our experiments they do not reliably improve upon these heuristics. Overall, this work provides a principled computational perspective on dynamic pooled testing and clarifies when dynamic assignment meaningfully improves welfare in public health screening.
💡 Research Summary
The paper addresses the problem of allocating a limited number of pooled diagnostic tests in order to maximize social welfare, defined as the total utility obtained by individuals who are confirmed healthy. While classical pooled‑testing literature focuses on minimizing the number of tests required for full diagnosis, recent work on welfare‑maximizing pooled testing has largely been static: all pools are fixed before any test outcomes are observed. This work introduces a dynamic formulation in which tests are performed sequentially, and each test’s result can influence the composition of subsequent pools.
The authors formalize the setting with N agents, each characterized by a utility u_i (the societal value of confirming that agent i is healthy) and a prior health probability p_i. A testing budget B limits the total number of pools, and each pool may contain at most G samples. The expected welfare of a testing plan T is U(T) = ∑_{i=1}^{N} u_i · P_{T,i}, where P_{T,i} is the probability that agent i appears in at least one negatively‑tested pool. In a static plan all pools are predetermined; in a dynamic plan a decision rule τ_b maps the history of previous test outcomes H_{b−1} to the next pool t_b.
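The welfare objective above can be computed exactly for small instances by enumerating all joint health configurations. The following is a minimal sketch (function and variable names are our own, not from the paper); it is exponential in N and intended only to make the definition of U(T) concrete:

```python
from itertools import product

def expected_welfare(utilities, health_probs, pools):
    """Exact expected welfare U(T) of a static plan by enumerating all
    2^N joint health configurations (tractable only for small N).

    utilities[i]    -- u_i, the value of confirming agent i healthy
    health_probs[i] -- p_i, prior probability that agent i is healthy
    pools           -- list of pools, each a set of agent indices
    """
    n = len(utilities)
    welfare = 0.0
    for config in product([0, 1], repeat=n):  # 1 = healthy
        # Prior probability of this joint configuration (independent agents).
        prob = 1.0
        for i, healthy in enumerate(config):
            prob *= health_probs[i] if healthy else 1.0 - health_probs[i]
        # A pool tests negative iff every member is healthy; an agent is
        # confirmed healthy iff it appears in at least one negative pool.
        negative = [t for t in pools if all(config[j] for j in t)]
        welfare += prob * sum(
            utilities[i] for i in range(n)
            if any(i in t for t in negative)
        )
    return welfare

# Two disjoint pools of two agents each: each pool is negative with
# probability 0.9^2 = 0.81 and then confirms 2 units of utility.
u = [1.0, 1.0, 1.0, 1.0]
p = [0.9, 0.9, 0.9, 0.9]
print(expected_welfare(u, p, [{0, 1}, {2, 3}]))  # 2 * 2 * 0.81 = 3.24
```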
Exact dynamic optimization is shown to be computationally infeasible beyond very small instances because the decision tree grows exponentially with B. To explore the trade‑off between optimality and tractability, the paper evaluates four families of algorithms:
- Exact Optimization (small scale) – exhaustive search or mixed‑integer linear programming (MILP) on tiny instances (N ≤ 15) to obtain the optimal dynamic policy, used as a benchmark.
- Greedy Dynamic Assignment – at each step the algorithm computes posterior marginal health probabilities (using a Gibbs‑sampling approximation for overlapping pools) and selects the single pool that maximizes immediate expected welfare. The pool‑selection subproblem is identical to the single‑test optimization used in prior static work. After observing the test result, posterior probabilities are updated, confirmed agents are removed, and the process repeats. This method runs in O(B·N⁵) time and, despite its simplicity, achieves welfare within a few percent of the optimal dynamic benchmark across a wide range of synthetic scenarios.
- MIP Relaxations – static MILP formulations are relaxed to incorporate a limited look‑ahead, using Lagrangian multipliers to estimate future welfare. These relaxations scale to moderate problem sizes (N ≈ 50) but provide only marginal improvements over the greedy approach while incurring higher solver overhead.
- Learning‑Based Policies – reinforcement‑learning agents (e.g., DQN, policy gradient) are trained to map histories to pool selections. The state space (the full history of pool compositions and outcomes) is large, leading to unstable training and performance that does not consistently surpass the greedy heuristic in the experiments conducted.
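The greedy loop described above can be sketched as follows. This is a simplified illustration with hypothetical names, not the paper's implementation: it enumerates candidate pools exhaustively (so it only works for small N), and it replaces the Gibbs posterior update with a per‑pool Bayes update on marginals, which is exact only while pools do not overlap. It also assumes priors are strictly below 1 so a positive result is never contradictory.

```python
from itertools import combinations

def greedy_dynamic_plan(utilities, probs, budget, max_pool, true_state):
    """One run of a simplified greedy dynamic policy.

    At each step, pick the pool t maximizing the immediate expected
    welfare gain (prod_{j in t} p_j) * (sum_{j in t} u_j), observe the
    outcome against the hidden true_state (True = healthy), then update
    marginal health probabilities with a per-pool Bayes rule.
    """
    p = list(probs)
    active = set(range(len(utilities)))
    welfare = 0.0
    for _ in range(budget):
        best, best_gain = None, 0.0
        for size in range(1, max_pool + 1):
            for t in combinations(sorted(active), size):
                prod = 1.0
                for j in t:
                    prod *= p[j]
                gain = prod * sum(utilities[j] for j in t)
                if gain > best_gain:
                    best, best_gain = t, gain
        if best is None:
            break
        if all(true_state[j] for j in best):  # negative iff all healthy
            welfare += sum(utilities[j] for j in best)
            active -= set(best)  # confirmed healthy: remove and collect utility
        else:
            # Positive pool: condition each member's marginal on the event
            # "not all members healthy" (approximate once pools overlap).
            prod_all = 1.0
            for j in best:
                prod_all *= p[j]
            for i in best:
                rest = prod_all / p[i] if p[i] > 0 else 0.0
                p[i] = p[i] * (1.0 - rest) / (1.0 - prod_all)
    return welfare

# Three agents, budget 2, pools of size <= 2; agent 2 is secretly infected.
print(greedy_dynamic_plan([1, 1, 1], [0.9, 0.9, 0.5], 2, 2,
                          [True, True, False]))  # 2.0
```

The first step picks the pool {0, 1} (gain 0.81 · 2 = 1.62), which tests negative and confirms two units of welfare; the second step tests agent 2 alone, observes a positive, and drives that agent's posterior health probability to zero.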
A key technical contribution is the use of Gibbs sampling to approximate posterior marginal probabilities after each test. Because exact Bayesian updating would require enumerating all joint infection configurations (exponential in the number of overlapping agents), the authors instead treat confirmed‑healthy agents as removed and sample the remaining uncertain agents conditioned on observed pool outcomes. Convergence is monitored with a rolling window, and the method scales linearly with N and G, making it practical for all dynamic algorithms evaluated.
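A minimal Gibbs sampler for this posterior can be sketched as follows (a hypothetical implementation for illustration, without the paper's rolling‑window convergence monitoring; it assumes the observed outcomes are jointly satisfiable). A pool is positive iff at least one member is infected, so an agent in any negative pool is forced healthy, and the conditional for each remaining agent only has to check whether its positive pools are explained by other currently infected members:

```python
import random

def gibbs_marginals(infect_priors, pools, outcomes,
                    n_iter=6000, burn=1000, seed=0):
    """Approximate posterior P(agent i infected | pool outcomes).

    infect_priors[i] -- prior probability q_i that agent i is infected
    pools            -- list of pools (sets of agent indices)
    outcomes[t]      -- True if pool t tested positive
    """
    rng = random.Random(seed)
    n = len(infect_priors)
    neg_pools = [t for t, pos in zip(pools, outcomes) if not pos]
    pos_pools = [t for t, pos in zip(pools, outcomes) if pos]
    in_negative = [any(i in t for t in neg_pools) for i in range(n)]
    # Initial state consistent with the data: agents in a negative pool
    # start healthy, everyone else starts infected.
    s = [0 if in_negative[i] else 1 for i in range(n)]
    counts, kept = [0] * n, 0
    for it in range(n_iter):
        for i in range(n):
            if in_negative[i]:
                s[i] = 0  # forced healthy by a negative pool
                continue
            # s_i = 0 is feasible only if every positive pool containing i
            # is explained by some *other* currently infected member.
            ok0 = all(any(s[j] for j in t if j != i)
                      for t in pos_pools if i in t)
            w1 = infect_priors[i]
            w0 = (1.0 - infect_priors[i]) if ok0 else 0.0
            s[i] = 1 if rng.random() < w1 / (w1 + w0) else 0
        if it >= burn:
            kept += 1
            for i in range(n):
                counts[i] += s[i]
    return [c / kept for c in counts]

# Pool {0,1} negative, pool {1,2} positive, agent 3 untested.
m = gibbs_marginals([0.3] * 4, [{0, 1}, {1, 2}], [False, True])
# m[0] == m[1] == 0.0 (forced healthy), m[2] == 1.0 (the only possible
# explanation of the positive pool), m[3] stays near its 0.3 prior.
```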
Extensive synthetic experiments vary the budget fraction (B/N), pool size G, and the distribution of utilities and health probabilities. Results demonstrate that dynamic testing yields substantial welfare gains—often 12–18 % higher than the best static MILP baseline—in low‑budget regimes where B is a small fraction of N. The greedy dynamic policy captures most of this gain while remaining computationally lightweight, suggesting that sophisticated long‑horizon planning offers diminishing returns under the studied conditions. Learning‑based methods, while flexible, suffer from sample inefficiency and high variance, limiting their practical advantage at present.
The paper discusses several limitations. The independence assumption for infection states may not hold in real-world settings with clustered transmission; extending the model to incorporate correlation is left for future work. The Gibbs‑sampling approximation, though effective, can converge slowly when many pools test positive, motivating exploration of variational or message‑passing alternatives. Finally, the reinforcement‑learning approach would benefit from richer reward shaping and larger simulated datasets to better capture multi‑step dependencies.
In conclusion, the study provides a principled computational framework for dynamic welfare‑maximizing pooled testing, showing that simple adaptive heuristics can substantially improve public‑health outcomes when testing resources are scarce. It clarifies the conditions under which dynamic assignment is worthwhile and sets a foundation for future research on more expressive probabilistic models and advanced learning‑based policies.