Non-centred Bayesian inference for discrete-valued state-transition models: the Rippler algorithm

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Stochastic state-transition models of infectious disease transmission can be used to deduce relevant drivers of transmission when fitted to data using statistically principled methods. Fitting such models to individual-level data requires inference on individuals’ unobserved disease statuses over time, which form a high-dimensional and highly correlated state space. We introduce a novel Bayesian (data-augmentation Markov chain Monte Carlo) algorithm for jointly estimating the model parameters and unobserved disease statuses, which we call the Rippler algorithm. This is a non-centred method that can be applied to any individual-based state-transition model. We compare the Rippler algorithm to state-of-the-art inference methods for individual-based stochastic epidemic models and find that it performs better than these methods as the number of disease states in the model increases.


💡 Research Summary

The paper introduces the Rippler algorithm, a novel non‑centred data‑augmentation Markov chain Monte Carlo (MCMC) method for Bayesian inference in individual‑based stochastic epidemic models that can be expressed as Coupled Hidden Markov Models (CHMMs). In a CHMM each of N individuals occupies one of S discrete disease states at each of T time points, and the transition probability for an individual depends on its current state, the states of all other individuals, and a vector of epidemiological parameters θ. Observations Y are noisy or incomplete measurements of the hidden states X.

Traditional centred approaches sample X directly, which forces the Metropolis–Hastings proposal to recompute complex, highly correlated transition probabilities, leading to low acceptance rates and poor mixing, especially as the number of states grows. The authors therefore re‑parameterise the latent trajectory using a matrix U of independent Uniform(0,1) random numbers. For each (t, j) pair they compute lower and upper bounds U_low_{t,j} and U_upp_{t,j} that define the interval of U_{t,j} values which would generate the current hidden state x_{t,j} through a deterministic mapping g(·) (the inverse CDF of the categorical transition distribution). Consequently, given θ and U, the hidden trajectory X is uniquely determined, and the prior on U is simply i.i.d. Uniform, eliminating the need to evaluate the prior density of X in the acceptance ratio.
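The deterministic mapping g(·) and the bounds it induces can be sketched in a few lines of Python (a minimal illustration of the idea, not the paper's implementation; the function names `g` and `bounds_for_state` and the example probabilities are ours):

```python
import numpy as np

def g(u, probs):
    """Map a Uniform(0,1) draw u to a discrete state via the inverse CDF
    of a categorical distribution with probabilities `probs`."""
    cum = np.cumsum(probs)
    return int(np.searchsorted(cum, u))  # smallest k with u <= cum[k]

def bounds_for_state(state, probs):
    """Interval (u_low, u_upp) of u values that g maps to `state`."""
    cum = np.concatenate(([0.0], np.cumsum(probs)))
    return cum[state], cum[state + 1]

# e.g. probabilities of staying put, progressing, or recovering
probs = np.array([0.7, 0.2, 0.1])
u_low, u_upp = bounds_for_state(1, probs)
assert g(0.5 * (u_low + u_upp), probs) == 1  # any u in the interval gives state 1
```

Given bounds for every (t, j), the current trajectory X fixes U only up to these intervals, which is exactly what the non-centred parameterisation exploits.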

The core of the Rippler proposal is to perturb a small subset of the U matrix. In the default setting a single element (t, j) is selected with probability proportional to the length of the “outside” region 1 − U_upp + U_low (i.e., the region where a change would actually alter the state). The selected element is then resampled uniformly from the complementary region (0, U_low) ∪ (U_upp, 1), guaranteeing a non‑null move (X* ≠ X). This local change propagates forward in time because the altered U_{t,j} modifies the transition probabilities at t + 1, which in turn may change subsequent states—a ripple effect that efficiently explores the high‑dimensional latent space without having to redraw the entire trajectory.
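The single-element proposal can be sketched as follows (our illustration, assuming NumPy and that `U`, `U_low`, `U_upp` are T×N arrays; the function name is hypothetical):

```python
import numpy as np

def propose_rippler_element(U, U_low, U_upp, rng):
    """Pick one (t, j) with probability proportional to the 'outside'
    length U_low + (1 - U_upp), then resample U[t, j] uniformly from
    (0, U_low) ∪ (U_upp, 1), guaranteeing the state at (t, j) changes."""
    outside = U_low + (1.0 - U_upp)
    flat = rng.choice(outside.size, p=(outside / outside.sum()).ravel())
    t, j = np.unravel_index(flat, U.shape)
    lo, up = U_low[t, j], U_upp[t, j]
    v = rng.uniform(0.0, lo + (1.0 - up))   # point in the outside region
    u_new = v if v < lo else up + (v - lo)  # map into (0,lo) ∪ (up,1)
    U_star = U.copy()
    U_star[t, j] = u_new
    return (t, j), U_star
```

Reconstructing X* from U* then propagates the change forward from time t, producing the ripple.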

The Metropolis–Hastings acceptance probability simplifies dramatically after applying Bayes’ theorem and exploiting the deterministic relationship between U and X. The terms involving the prior of X and the proposal density of U given X cancel (q₁ = 1, q₃ = 1). The remaining ratio consists of the likelihoods π(Y|θ, X*) and π(Y|θ, X) and the proposal densities for the selected U elements (q₂). Thus
α = min{1, [π(Y|θ, X*) / π(Y|θ, X)] · [q₂(U|θ, U*, X*) / q₂(U*|θ, U, X)]}.
Because q₂ is simply the reciprocal of the length of the outside region for the chosen element(s), the acceptance step is cheap to compute.
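In code, the log-acceptance computation for a single-element move reduces to a one-liner (our sketch, under the simplifying assumption that the element-selection step is symmetric between current and proposed states):

```python
import numpy as np

def log_accept_ratio(loglik_star, loglik, outside_len_star, outside_len):
    """Log MH acceptance ratio for a single-element Rippler move.
    Since q2 is the reciprocal of the outside-region length, the proposal
    ratio q2(U|...) / q2(U*|...) reduces to outside_len / outside_len_star."""
    return (loglik_star - loglik) + np.log(outside_len) - np.log(outside_len_star)

# accept the move with probability min(1, exp(log_accept_ratio(...)))
```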

To cope with the unknown optimal number κ of perturbed elements, the authors embed an adaptive tuning scheme. With probability ε they explore by drawing κ uniformly from {1,…,κ_max}; otherwise they exploit by choosing the κ whose empirical acceptance rate a_κ is closest to a target a′ = 0.234 (the optimal acceptance for high‑dimensional random‑walk proposals). This ε‑greedy strategy automatically balances exploration and exploitation as the chain progresses.
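The ε-greedy scheme can be sketched as (our illustration; the counters and function name are assumptions, not the paper's code):

```python
import numpy as np

def choose_kappa(accept_counts, propose_counts, kappa_max, eps, rng,
                 target=0.234):
    """ε-greedy selection of κ, the number of U elements to perturb:
    with probability eps explore uniformly over {1, ..., kappa_max};
    otherwise exploit the κ whose empirical acceptance rate is closest
    to the target rate (0.234 by default)."""
    if rng.random() < eps:
        return int(rng.integers(1, kappa_max + 1))
    rates = accept_counts / np.maximum(propose_counts, 1)
    return int(np.argmin(np.abs(rates - target))) + 1
```

After each proposal with a given κ, the corresponding counters are incremented, so the exploit branch tracks the empirically best-performing κ.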

A further extension, the “data‑informed Rippler,” incorporates the observation model directly into the proposal. The initial-state probabilities p̃(j) and transition probabilities p(t, j) are re‑weighted by the observation likelihood f(y|x, θ) and renormalised, yielding modified bounds ρ̃ and ρ. This makes the proposal distribution more aligned with the posterior, especially valuable when observations are sparse or partially missing.
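The reweighting step itself is a simple likelihood-tilted renormalisation (our sketch; the function name and the max-subtraction for numerical stability are ours):

```python
import numpy as np

def data_informed_probs(trans_probs, obs_loglik):
    """Reweight categorical transition probabilities by the observation
    likelihood f(y | x, θ) of each candidate next state, then renormalise.
    obs_loglik[k] is the observation log-likelihood if the individual
    were in state k; subtracting the max keeps the exponentials stable."""
    w = trans_probs * np.exp(obs_loglik - np.max(obs_loglik))
    return w / w.sum()
```

For example, a state that the prior transition probabilities treat as a coin flip but that the observation strongly supports ends up dominating the tilted proposal.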

Algorithmically, each outer MCMC iteration updates θ using any suitable sampler (e.g., random‑walk Metropolis, Hamiltonian Monte Carlo). Then a user‑specified number of inner “latent updates” are performed: compute U_low and U_upp for the current trajectory, redraw U uniformly within those bounds, propose U* with adaptively chosen κ, reconstruct X* via the deterministic mapping, compute the acceptance probability, and accept or reject. The process repeats for K iterations, yielding posterior samples of both θ and X.
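To make the inner loop concrete, here is a self-contained toy in Python (entirely our construction, not the paper's code): two states, independent individuals, a fixed transition matrix standing in for the coupled θ-dependent probabilities, uniform element selection, and latent updates only (a full sampler would interleave these with θ updates).

```python
import numpy as np

rng = np.random.default_rng(1)

N, T = 5, 20                       # individuals, time points
P = np.array([[0.9, 0.1],          # toy 2-state transition matrix
              [0.2, 0.8]])         # (everyone starts in state 0)

def reconstruct(U):
    """Deterministic map g: rebuild the trajectory X from the uniforms U."""
    X = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        for j in range(N):
            X[t, j] = np.searchsorted(np.cumsum(P[X[t - 1, j]]), U[t, j])
    return X

def bounds(X):
    """Per-element intervals [U_low, U_upp) consistent with X."""
    U_low, U_upp = np.zeros((T, N)), np.ones((T, N))
    for t in range(1, T):
        for j in range(N):
            cum = np.concatenate(([0.0], np.cumsum(P[X[t - 1, j]])))
            U_low[t, j], U_upp[t, j] = cum[X[t, j]], cum[X[t, j] + 1]
    return U_low, U_upp

def loglik(X, Y):
    """Noisy observation model: Y equals X with probability 0.95."""
    return np.where(X == Y, np.log(0.95), np.log(0.05)).sum()

U = rng.random((T, N))
Y = reconstruct(U)                 # simulated 'observations'
U = rng.random((T, N))             # forget the truth; start elsewhere
X = reconstruct(U)
for _ in range(200):
    U_low, U_upp = bounds(X)
    U = U_low + rng.random((T, N)) * (U_upp - U_low)  # redraw U given X
    t, j = rng.integers(1, T), rng.integers(0, N)     # uniform selection
    lo, up = U_low[t, j], U_upp[t, j]
    v = rng.uniform(0.0, lo + (1.0 - up))
    U_star = U.copy()
    U_star[t, j] = v if v < lo else up + (v - lo)     # forced state change
    X_star = reconstruct(U_star)                      # ripple forward
    cum = np.concatenate(([0.0], np.cumsum(P[X[t - 1, j]])))
    lo_s, up_s = cum[X_star[t, j]], cum[X_star[t, j] + 1]
    log_a = (loglik(X_star, Y) - loglik(X, Y)
             + np.log(lo + (1.0 - up)) - np.log(lo_s + (1.0 - up_s)))
    if np.log(rng.random()) < log_a:
        U, X = U_star, X_star
```

Note how a single perturbed element changes only one uniform, yet `reconstruct` can flip every subsequent state of that individual, which is the ripple effect the proposal relies on.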

The authors evaluate the method on simulated epidemics with 2–5 disease states, including simple SIR, SEIR, and multi‑compartment models. They compare Rippler against reversible‑jump MCMC (RJ‑MCMC) and individual forward‑filtering backward‑sampling (iFFBS). Performance metrics include effective sample size (ESS), integrated autocorrelation time (IACT), and wall‑clock time. Results show that as the number of states grows, Rippler’s ESS increases dramatically relative to RJ‑MCMC, whose samples become highly autocorrelated, and iFFBS, whose computational cost explodes due to the need to sample entire forward‑backward trajectories for each individual. In the 5‑state scenario, Rippler achieves an ESS roughly three to five times larger than RJ‑MCMC for comparable runtime, while iFFBS fails to finish within reasonable time. The data‑informed variant further improves ESS by 1.5–2× when observation missingness exceeds 30 %.

In summary, the Rippler algorithm leverages a non‑centred re‑parameterisation and a minimal‑change proposal that propagates efficiently through the latent trajectory, yielding superior mixing and scalability for high‑dimensional CHMMs. The adaptive tuning of the number of perturbed elements and the optional data‑informed proposal make the method robust across a range of epidemic model complexities. The authors suggest future work on continuous‑time extensions, spatial interaction structures, and hybrid schemes combining variational approximations with Rippler updates.

