Gradient-Based Approximate Bayesian Inference with Entropy-Optimized Summary Statistics for Compartmental Models
Recent pandemics have highlighted the critical role of infectious disease models in guiding public health decision-making, driving demand for realistic models that can provide timely answers under uncertainty. Compartmental models are widely used to capture disease dynamics, and advances in data availability, computational resources, and epidemiological understanding have allowed the development of models that incorporate detailed representations of population structure, disease progression, and intervention effects. While these improvements improve model fidelity, they also increase model complexity, leading to high-dimensional parameter spaces, intractable likelihoods, and computational challenges for fitting models to limited surveillance data in real time. Existing likelihood-free methods, such as Approximate Bayesian Computation (ABC) and Bayesian Synthetic Likelihood (BSL), have developed largely independently, each with distinct strengths and limitations. We propose an integrated three-stage framework that synthesizes advances from both likelihood-based and likelihood-free method: (1) ABC-based entropy minimization to identify low-dimensional, approximately orthogonal summary statistics; (2) BSL inference using these optimized summaries to construct tractable Gaussian approximations; and (3) Hamiltonian Monte Carlo sampling for efficient posterior exploration. Through SEIR simulation study and application to the 1978 British boarding school influenza outbreak, we demonstrate that our framework achieves reliable parameter estimation and uncertainty quantification while maintaining computational efficiency.
💡 Research Summary
The paper addresses the longstanding challenge of performing Bayesian inference for compartmental epidemic models whose likelihood functions are analytically intractable and computationally expensive to evaluate. To overcome this, the authors propose a three‑stage hybrid framework that combines recent advances in Approximate Bayesian Computation (ABC), Bayesian Synthetic Likelihood (BSL), and Hamiltonian Monte Carlo (HMC).
Stage 1 – Entropy‑based summary‑statistic selection via ABC
A rich pool of candidate summary statistics is first constructed, guided by public‑health policy questions such as peak timing, peak magnitude, early growth rate, and epidemic phase. Using an ABC simulation scheme, the authors evaluate the information content of each statistic with respect to the model parameters. They then minimise an entropy‑based objective, which penalises redundancy and favours statistics that are nearly conditionally independent given the parameters. The result is a low‑dimensional set of orthogonal summaries that retain maximal information while dramatically reducing dimensionality.
Stage 2 – Synthetic likelihood construction with the selected summaries
The chosen statistics are assumed to follow a multivariate normal distribution conditional on the parameters, (s^*|θ \sim N(μ(θ), Σ(θ))). For any parameter vector, Monte‑Carlo simulations generate estimates of the mean and covariance. Because the entropy‑selected statistics are approximately independent, the authors adopt a diagonal covariance approximation, reducing the number of covariance parameters from O(p²) to O(p). This simplification improves numerical stability and computational speed, especially as the number of summaries grows. The synthetic likelihood (L_{SL}(θ; s_{obs})) together with a prior yields a tractable “synthetic posterior”.
Stage 3 – Efficient posterior sampling with HMC
Since the synthetic likelihood is differentiable, the authors employ Hamiltonian Monte Carlo, specifically the No‑U‑Turn Sampler (NUTS) implementation in Stan, to explore the posterior distribution. Gradient‑based sampling dramatically accelerates convergence compared with random‑walk Metropolis, and the diagonal covariance further speeds up gradient calculations.
Empirical evaluation
The framework is first benchmarked on a deterministic SEIR model with known parameters. Compared against ABC‑Rejection, ABC‑SMC, and a conventional BSL implementation using random‑walk Metropolis, the proposed method achieves lower mean absolute error, tighter 95 % credible intervals, and a 30–50 % reduction in total runtime.
A real‑world case study uses data from the 1978 British boarding‑school influenza outbreak. The authors map daily case counts to the policy‑driven summaries, apply the three‑stage pipeline, and obtain posterior distributions for peak date, peak size, and early growth rate that are both narrower and more consistent with epidemiological knowledge than those produced by existing methods.
Key contributions
- An integrated pipeline that leverages ABC for information‑theoretic summary selection and BSL for likelihood approximation, thereby solving the “curse of dimensionality” in ABC and the covariance‑estimation bottleneck in BSL.
- Demonstration that entropy‑based selection yields approximately independent summaries, justifying a diagonal covariance in the synthetic likelihood and enabling fast gradient calculations for HMC.
- Empirical evidence of improved statistical efficiency (bias, coverage) and computational efficiency (runtime, number of HMC iterations) on both simulated and real data.
Limitations and future directions
The normality assumption for the synthetic likelihood may be violated for highly skewed or heavy‑tailed summaries; extensions to Gaussian‑copula or non‑parametric synthetic likelihoods are suggested. The entropy‑minimisation step still requires ABC simulations, which can be costly for very complex models; reinforcement‑learning‑based statistic discovery or variational approximations could alleviate this. Finally, while a diagonal covariance works well for the selected orthogonal statistics, modest correlations may remain; adaptive partial‑covariance structures could further enhance accuracy.
In summary, the paper delivers a practical, theoretically grounded framework that makes real‑time Bayesian inference feasible for sophisticated compartmental epidemic models, providing public‑health decision makers with rapid, reliable parameter estimates and uncertainty quantification.
Comments & Academic Discussion
Loading comments...
Leave a Comment