Asymptotically Independent Markov Sampling: a new MCMC scheme for Bayesian Inference
In Bayesian statistics, many problems can be expressed as the evaluation of the expectation of a quantity of interest with respect to the posterior distribution. The standard Monte Carlo method is often not applicable because the posterior distributions encountered cannot be sampled from directly. In this case, the most popular strategies are importance sampling, Markov chain Monte Carlo, and annealing. In this paper, we introduce a new scheme for Bayesian inference, called Asymptotically Independent Markov Sampling (AIMS), which combines these three methods. We derive important ergodic properties of AIMS. In particular, it is shown that, under certain conditions, the AIMS algorithm produces a uniformly ergodic Markov chain. The choice of the algorithm's free parameters is discussed, and both theoretical and heuristic recommendations are provided for this choice. The efficiency of AIMS is demonstrated with three numerical examples, which include both multimodal and high-dimensional target posterior distributions.
💡 Research Summary
The paper introduces a novel Bayesian inference algorithm called Asymptotically Independent Markov Sampling (AIMS), which synergistically combines importance sampling, Markov chain Monte Carlo (MCMC), and annealing. The authors begin by reviewing the three foundational techniques: importance sampling provides unbiased estimators when a suitable proposal distribution is available, but its efficiency deteriorates if the proposal poorly matches the target; MCMC generates dependent samples that converge to the posterior but can suffer from high autocorrelation, especially in multimodal settings; annealing (or tempering) bridges an easy-to-sample distribution and the target by a sequence of intermediate tempered distributions, facilitating mode exploration.
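As a concrete illustration of the importance-sampling building block, the following minimal Python sketch (a toy example, not the paper's code) estimates E[θ²] under a standard-normal target using draws from a wider Gaussian proposal, and reports the effective sample size that quantifies how well the proposal matches the target:

```python
import math
import random

random.seed(0)

def log_target(x):
    # unnormalized standard normal, standing in for the posterior π(θ)
    return -0.5 * x * x

def log_proposal(x, s=2.0):
    # N(0, s²) proposal density (log), deliberately wider than the target
    return -0.5 * (x / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))

# draw from the proposal and form self-normalized importance weights
xs = [random.gauss(0.0, 2.0) for _ in range(50_000)]
logw = [log_target(x) - log_proposal(x) for x in xs]
m = max(logw)
w = [math.exp(lw - m) for lw in logw]          # subtract max for stability
W = sum(w)
est = sum(wi * xi * xi for wi, xi in zip(w, xs)) / W   # E[θ²] under π is 1

# effective sample size: how many i.i.d. samples the weighted set is worth
ess = W * W / sum(wi * wi for wi in w)
print(round(est, 2), int(ess))
```

When the proposal matches the target poorly, the weights become highly uneven and the ESS collapses; this is exactly the degradation the summary refers to.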
AIMS constructs a temperature schedule β₀ = 0 < β₁ < … < β_m = 1 and defines intermediate distributions π_j(θ) ∝ π₀(θ) L(θ)^{β_j}, where π₀ is the prior and L the likelihood. At each level j, the algorithm reuses the N_{j−1} samples drawn from the previous level π_{j−1}. Importance weights w_i^{(j−1)} = π_j(θ_i^{(j−1)}) / π_{j−1}(θ_i^{(j−1)}) ∝ L(θ_i^{(j−1)})^{β_j − β_{j−1}} are computed and normalized. A transition kernel K_j (typically a symmetric Gaussian) is then mixed with these weighted particles to form a global proposal density
π̂_j(dθ) = Σ_{i=1}^{N_{j−1}} w̄_i^{(j−1)} K_j(dθ | θ_i^{(j−1)}).
This mixture serves as the proposal for an Independent Metropolis–Hastings (IMH) step, with the standard Metropolis–Hastings acceptance ratio using π_j as the target and π̂_j as the proposal. Because π̂_j converges to π_j as N_{j−1} → ∞, the IMH step becomes asymptotically independent: in the limit, the generated samples are independent draws from the target distribution.
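The per-level mechanics described above can be sketched in a few lines of Python. This toy 1-D example (the model, temperatures, and bandwidth are illustrative choices, not the paper's settings) runs one AIMS annealing level: it reweights the previous level's particles, builds the Gaussian-mixture proposal π̂_j, and runs an Independent Metropolis–Hastings chain targeting π_j:

```python
import math
import random

random.seed(1)

# Toy model: prior π0 = N(0,1), likelihood L(θ) ∝ exp(-(θ-2)²/2).
# The tempered target π_β ∝ π0 · L^β is then N(2β/(1+β), 1/(1+β)),
# so the sampler can be checked against the exact answer.
def log_L(x):
    return -0.5 * (x - 2.0) ** 2

def log_pi(x, beta):                 # log of the unnormalized tempered target
    return -0.5 * x * x + beta * log_L(x)

b_prev, b = 0.5, 1.0                 # previous and current temperatures
# Stand-in for the previous level's output: exact draws from π_{b_prev}.
mu_p, var_p = 2 * b_prev / (1 + b_prev), 1 / (1 + b_prev)
particles = [random.gauss(mu_p, math.sqrt(var_p)) for _ in range(500)]

# Importance weights ∝ L(θ)^{b - b_prev}, normalized.
logw = [(b - b_prev) * log_L(x) for x in particles]
mx = max(logw)
w = [math.exp(lw - mx) for lw in logw]
W = sum(w)
w = [wi / W for wi in w]

sigma = 0.3                          # kernel bandwidth (tuning parameter)

def log_hat(x):                      # log density of the mixture proposal π̂
    s = sum(wi * math.exp(-0.5 * ((x - xi) / sigma) ** 2)
            for wi, xi in zip(w, particles))
    return math.log(s)               # the kernel's constant cancels in the ratio

def draw_hat():                      # sample π̂: pick a particle, jitter it
    u, acc = random.random(), 0.0
    for wi, xi in zip(w, particles):
        acc += wi
        if u <= acc:
            return random.gauss(xi, sigma)
    return random.gauss(particles[-1], sigma)

# Independent Metropolis–Hastings targeting π_b with proposal π̂.
x, chain = draw_hat(), []
for _ in range(5000):
    y = draw_hat()
    log_a = (log_pi(y, b) - log_pi(x, b)) + (log_hat(x) - log_hat(y))
    if random.random() < math.exp(min(0.0, log_a)):
        x = y
    chain.append(x)

mean = sum(chain) / len(chain)
print(round(mean, 2))                # exact target mean is 2b/(1+b) = 1.0
```

Note that each acceptance test evaluates π̂ at both the current and proposed points, which costs O(N_{j−1}) per step; this is the price paid for the asymptotically independent proposal.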
The authors prove two key theoretical results. First, under mild conditions (the proposal’s support covers the target’s support and the kernel K_j is reversible with respect to π_j), the Markov chain induced by AIMS is uniformly ergodic. This is shown by establishing a Doeblin minorization condition, guaranteeing geometric convergence regardless of the starting point. Second, they demonstrate that the overall algorithm preserves the stationary distribution at each annealing level, and the sequence of chains converges to the final posterior π_m = π.
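For reference, the Doeblin-type minorization behind uniform ergodicity has the standard form below (a textbook statement of the condition, not the paper's exact constants):

```latex
% If the level-j transition kernel P_j dominates a multiple of the target,
%   P_j(θ, A) ≥ δ π_j(A)  for some δ ∈ (0,1], all θ and all measurable A,
% then the chain is uniformly ergodic with geometric rate (1 - δ):
\exists\,\delta \in (0,1]\ \forall\,\theta,\,A:\quad
P_j(\theta, A) \,\ge\, \delta\,\pi_j(A)
\;\Longrightarrow\;
\bigl\| P_j^n(\theta,\cdot) - \pi_j \bigr\|_{\mathrm{TV}} \,\le\, (1-\delta)^n .
```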
Practical implementation issues are addressed through adaptive parameter selection. The temperature schedule β_j is not fixed a priori; instead, the algorithm monitors the Effective Sample Size (ESS) of the weighted particles at each level. When ESS falls below a predefined fraction (e.g., 0.5 N_{j−1}), the temperature is increased, ensuring that successive intermediate distributions remain sufficiently close for efficient reuse of samples. The kernel bandwidth σ_j is tuned to achieve a target acceptance rate (typically 20–40%). These heuristics balance exploration (large σ_j) against weight stability (small σ_j).
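The ESS-driven temperature selection can be sketched as a bisection over β. In this illustrative Python fragment (the 0.5 threshold, toy model, and sample values are assumptions for the example, not the paper's settings), the next temperature is chosen as the largest β whose importance weights still keep the ESS above the target fraction:

```python
import math
import random

random.seed(2)

def ess(logw):
    """Effective sample size of (self-normalized) importance weights."""
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    s = sum(w)
    return s * s / sum(wi * wi for wi in w)

def next_beta(log_L_vals, b_prev, target_frac=0.5, tol=1e-6):
    """Bisect for the largest β ≤ 1 whose weights w ∝ L^{β-β_prev}
    keep ESS ≥ target_frac · N."""
    n = len(log_L_vals)
    target = target_frac * n
    # If even β = 1 keeps the ESS above target, jump straight to the posterior.
    if ess([(1.0 - b_prev) * ll for ll in log_L_vals]) >= target:
        return 1.0
    lo, hi = b_prev, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess([(mid - b_prev) * ll for ll in log_L_vals]) >= target:
            lo = mid          # ESS still high enough: can anneal further
        else:
            hi = mid
    return lo

# Hypothetical worked example: log-likelihoods at 1000 prior samples
# from a wide prior N(0, 3²) with likelihood centered at 2.
samples = [random.gauss(0.0, 3.0) for _ in range(1000)]
logL = [-0.5 * (x - 2.0) ** 2 for x in samples]
b1 = next_beta(logL, 0.0)
print(round(b1, 3))
```

Repeating this from each accepted level yields the full adaptive schedule β₀ = 0 < β₁ < … < β_m = 1.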
Three numerical experiments illustrate AIMS’s performance. (1) A two‑dimensional bimodal mixture demonstrates that AIMS explores both modes with comparable frequency, whereas a Random‑Walk Metropolis–Hastings (RWMH) chain remains trapped in one mode for long periods. (2) A ten‑dimensional mixture of several Gaussian components shows that AIMS attains an ESS roughly two to three times larger than RWMH, Adaptive Metropolis (AM), and Sequential Importance Sampling (SIS), while maintaining higher acceptance rates and lower autocorrelation times. (3) A twenty‑dimensional Bayesian logistic regression problem (using real data) confirms that AIMS converges rapidly, captures the posterior’s main modes, and yields stable posterior summaries with far fewer effective samples than competing methods.
In summary, AIMS offers a principled framework that reuses samples from previous annealing levels to construct increasingly accurate global proposals, thereby mitigating the high correlation typical of conventional MCMC and the weight‑variance problem of importance sampling. The algorithm is particularly well‑suited for complex, high‑dimensional, and multimodal posterior distributions. The paper concludes with suggestions for future work, including fully automated temperature schedules, parallel implementations, and hybridization with variational inference techniques.