abx_amr_simulator: A simulation environment for antibiotic prescribing policy optimization under antimicrobial resistance
Antimicrobial resistance (AMR) poses a global health threat, reducing the effectiveness of antibiotics and complicating clinical decision-making. To address this challenge, we introduce abx_amr_simulator, a Python-based simulation package designed to model antibiotic prescribing and AMR dynamics within a controlled, reinforcement learning (RL)-compatible environment. The simulator allows users to specify patient populations, antibiotic-specific AMR response curves, and reward functions that balance immediate clinical benefit against long-term resistance management. Key features include a modular design for configuring patient attributes, antibiotic resistance dynamics modeled via a leaky-balloon abstraction, and tools to explore partial observability through noise, bias, and delay in observations. The package is compatible with the Gymnasium RL API, enabling users to train and test RL agents under diverse clinical scenarios. From an ML perspective, the package provides a configurable benchmark environment for sequential decision-making under uncertainty, including partial observability induced by noisy, biased, and delayed observations. By providing a customizable and extensible framework, abx_amr_simulator offers a valuable tool for studying AMR dynamics and optimizing antibiotic stewardship strategies under realistic uncertainty.
💡 Research Summary
The paper introduces abx_amr_simulator, a Python‑based, Gymnasium‑compatible environment designed to model antibiotic prescribing decisions and antimicrobial‑resistance (AMR) dynamics for reinforcement‑learning (RL) research. Framed as an MDP/POMDP, the simulator explicitly supports partial observability through configurable noise, bias, and delay in patient and resistance observations, reflecting real‑world clinical data limitations.
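As a rough illustration of what "noise, bias, and delay" can mean for an observation, here is a minimal Python sketch. The class name `NoisyDelayedObserver`, its parameters, and the clipping to [0, 1] are all hypothetical choices made for this example; they are not taken from the package's API.

```python
import random
from collections import deque


class NoisyDelayedObserver:
    """Hypothetical observation wrapper; not part of abx_amr_simulator."""

    def __init__(self, noise_std=0.05, bias=0.0, delay=2, initial=0.0, seed=0):
        self.noise_std = noise_std  # std. dev. of additive Gaussian noise
        self.bias = bias            # constant offset added to every reading
        self.rng = random.Random(seed)
        # Pre-fill the buffer so the first `delay` readings report `initial`.
        self.buffer = deque([initial] * delay)

    def observe(self, true_value):
        # Queue the current true value and release the one from `delay` steps ago.
        self.buffer.append(true_value)
        stale = self.buffer.popleft()
        noisy = stale + self.bias + self.rng.gauss(0.0, self.noise_std)
        # Assumption: the observed quantity is a fraction, so clip to [0, 1].
        return min(1.0, max(0.0, noisy))


if __name__ == "__main__":
    obs = NoisyDelayedObserver(noise_std=0.02, bias=0.05, delay=2)
    for t, true_resistance in enumerate([0.10, 0.20, 0.30, 0.40]):
        print(t, true_resistance, round(obs.observe(true_resistance), 3))
```

With `delay=2`, the agent only sees the resistance level from two timesteps earlier, shifted by the bias and perturbed by noise, which is one simple way an MDP becomes a POMDP.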
Key components are:
- PatientGenerator – creates a synthetic cohort each timestep, assigning each patient six attributes (infection probability, benefit/failure multipliers, spontaneous recovery probability, etc.). These can be homogeneous or heterogeneous, and observation noise can be added to simulate uncertain diagnostics.
- AMR_LeakyBalloon – implements a “leaky balloon” model where resistance pressure for each antibiotic accumulates with prescribing frequency and decays exponentially when the drug is not used. Two tunable parameters (flatness and leak) control the steepness of resistance emergence and the rate of decay, respectively. Cross‑resistance can be specified, allowing one drug’s use to affect the resistance trajectory of others.
- RewardCalculator – defines a scalar reward that balances immediate clinical success with long‑term community‑level resistance control. The overall reward is a convex combination:
R_overall(t) = (1‑λ) * mean(R_individual(t)) + λ * mean(R_community(t)),
where λ∈
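
The leaky-balloon dynamics and the reward combination described in the list above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the package's implementation: the class `LeakyBalloonSketch`, the geometric leak, the saturating map `1 - exp(-pressure / flatness)`, and the function `overall_reward` are all hypothetical, and the sketch assumes λ lies in [0, 1], as a convex combination requires.

```python
import math
from dataclasses import dataclass


@dataclass
class LeakyBalloonSketch:
    """Toy stand-in for AMR_LeakyBalloon; NOT the package's implementation."""

    leak: float = 0.1        # assumed: fraction of pressure lost each timestep
    flatness: float = 10.0   # assumed: larger values flatten the response curve
    pressure: float = 0.0    # accumulated prescribing pressure

    def step(self, n_prescriptions: int) -> float:
        # Existing pressure leaks away geometrically, then new use adds to it.
        self.pressure = self.pressure * (1.0 - self.leak) + n_prescriptions
        return self.resistance()

    def resistance(self) -> float:
        # Assumed saturating map from pressure to a resistance level in [0, 1).
        return 1.0 - math.exp(-self.pressure / self.flatness)


def overall_reward(individual, community, lam):
    """(1 - lam) * mean(individual) + lam * mean(community), as in the summary."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    mean_ind = sum(individual) / len(individual)
    mean_com = sum(community) / len(community)
    return (1.0 - lam) * mean_ind + lam * mean_com


if __name__ == "__main__":
    balloon = LeakyBalloonSketch(leak=0.2, flatness=5.0)
    # Prescribe for three timesteps, then stop and watch resistance decay.
    for n in [3, 3, 3, 0, 0, 0]:
        print(n, round(balloon.step(n), 3))
    print(overall_reward([1.0, 0.5], [-0.2, -0.4], lam=0.3))
```

Raising `lam` shifts weight from per-patient outcomes toward community-level resistance, which is the trade-off the RewardCalculator is described as encoding.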