A note on convergence of the equi-energy sampler


In a recent paper ‘The equi-energy sampler with applications in statistical inference and statistical mechanics’ [Ann. Stat. 34 (2006) 1581–1619], Kou, Zhou & Wong have presented a new stochastic simulation method called the equi-energy (EE) sampler. This technique is designed to simulate from a probability measure $\pi$, perhaps known only up to a normalizing constant. The authors demonstrate that the sampler performs well in quite challenging problems, but their convergence results (Theorem 2) appear incomplete. This was pointed out, in the discussion of the paper, by Atchadé & Liu (2006), who proposed an alternative convergence proof. However, this alternative proof, whilst theoretically correct, does not correspond to the algorithm that is implemented. In this note we provide a new proof of convergence of the equi-energy sampler based on the Poisson equation and on the theory developed in Andrieu et al. (2007) for \emph{non-linear} Markov chain Monte Carlo (MCMC). The objective of this note is to provide a proof of correctness of the EE sampler when there is only one feeding chain; the general case requires a much more technical approach than is suitable for a short note. In addition, we seek to highlight the difficulties associated with the analysis of this type of algorithm and to present the main techniques that may be adopted to prove its convergence.


💡 Research Summary

The paper revisits the convergence theory of the Equi‑Energy (EE) sampler, a sophisticated Monte‑Carlo algorithm introduced by Kou, Zhou and Wong (2006) for sampling from a target distribution π that may be known only up to a normalising constant. The original work claimed convergence in Theorem 2, but Atchadé and Liu (2006) pointed out that the proof was incomplete because it ignored the algorithm’s intrinsic non‑linear dependence on the empirical distribution of a “feeding” chain. Atchadé and Liu later supplied an alternative proof; however, their argument corresponds to a modified version of the algorithm rather than the one actually implemented.

In response, the authors provide a rigorous convergence proof that matches the true EE sampler, at least in the case where there is a single feeding chain (i.e., two temperature levels). The proof rests on two modern theoretical tools: (1) the Poisson equation for Markov chains and (2) the non‑linear Markov chain Monte Carlo (MCMC) framework developed by Andrieu, Moulines and Priouret (2007). By casting the EE sampler as a non‑linear Markov process—where the transition kernel P_μ depends on the current empirical measure μ of the feeding chain—the authors can treat the algorithm within a well‑established ergodic theory for such processes.
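The single-feeding-chain setup can be sketched in code. The following is a minimal illustrative Python implementation, not the authors' code: a feeding chain targets a tempered version π^{1/T} of a bimodal toy target, and the target chain mixes local Metropolis moves with equi-energy jumps drawn from the feeding chain's accumulated history within the current energy ring. The temperature `T`, jump probability `eps`, and ring boundaries are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_pi(x):
    # bimodal toy target: 0.5*N(-3,1) + 0.5*N(3,1), up to the mixture constant
    return np.logaddexp(-0.5 * (x + 3.0) ** 2, -0.5 * (x - 3.0) ** 2)

T = 10.0                            # feeding-chain temperature (example choice)
eps = 0.1                           # probability of attempting an equi-energy jump
rings = np.array([2.0, 4.0, 6.0])   # energy-ring boundaries (example choice)

def ring(x):
    # index of the energy ring of x, with energy E(x) = -log pi(x)
    return int(np.searchsorted(rings, -log_pi(x)))

def mh_step(x, log_target, scale):
    # one random-walk Metropolis step for the given log-density
    prop = x + scale * rng.standard_normal()
    if np.log(rng.random()) < log_target(prop) - log_target(x):
        return prop
    return x

x, y = 3.0, -3.0                                # target chain and feeding chain
history = [[] for _ in range(len(rings) + 1)]   # feeding-chain states, by ring
samples = []
for _ in range(20000):
    # advance the feeding chain at temperature T and record its state:
    # this is the empirical measure mu_n that the kernel P_mu depends on
    y = mh_step(y, lambda z: log_pi(z) / T, scale=3.0)
    history[ring(y)].append(y)
    # target chain: equi-energy jump with prob eps, else a local move
    pool = history[ring(x)]
    if rng.random() < eps and pool:
        z = pool[rng.integers(len(pool))]
        # EE acceptance ratio: min(1, pi(z) pi^{1/T}(x) / (pi(x) pi^{1/T}(z)))
        log_a = (log_pi(z) - log_pi(x)) * (1.0 - 1.0 / T)
        if np.log(rng.random()) < log_a:
            x = z
    else:
        x = mh_step(x, log_pi, scale=1.0)
    samples.append(x)
samples = np.array(samples)
```

Because the jump proposal is drawn from the growing history, the target chain's transition kernel genuinely depends on the evolving empirical measure, which is exactly the non-linear feature the proof must handle.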

The analysis proceeds as follows. First, for a fixed empirical distribution μ, the conditional transition kernel P_μ satisfies detailed balance with respect to π, guaranteeing that π is invariant for the kernel. This step mirrors the standard Metropolis–Hastings argument and is straightforward. The difficulty lies in the fact that μ itself evolves over time as the feeding chain accumulates samples. To handle this, the authors invoke uniform ergodicity assumptions and a law of large numbers for the empirical measures, showing that μ_n converges to π at a rate of O(1/√n).
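The O(1/√n) rate can be illustrated in a deliberately simplified setting. The sketch below replaces the ergodic feeding chain with i.i.d. draws from π = N(0, 1) (a stand-in, not the paper's chain-based result) and checks that the root-mean-square error of the empirical mean μ_n(h), for h(x) = x, shrinks roughly like 1/√n:

```python
import numpy as np

rng = np.random.default_rng(1)

def rms_error(n, reps=200):
    # RMS error of the empirical mean mu_n(h) for h(x) = x under pi = N(0, 1),
    # for which pi(h) = 0; averaged over `reps` independent runs
    draws = rng.standard_normal((reps, n))
    return np.sqrt(np.mean(draws.mean(axis=1) ** 2))

# theory predicts roughly 0.1 at n = 100 and 0.01 at n = 10000
errs = {n: rms_error(n) for n in (100, 10000)}
```

A hundredfold increase in n cuts the RMS error by about a factor of ten, consistent with the 1/√n scaling invoked for μ_n in the proof.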

The core of the convergence argument is the solution of the Poisson equation
  f_μ − P_μ f_μ = h − π(h)
for any bounded test function h. Under the uniform ergodicity condition, a bounded Lipschitz solution f exists for each μ, and the authors prove that the family {f_μ} is uniformly controlled as μ varies. This uniform control allows them to apply a martingale decomposition to the empirical averages of h(X_n) and to show that the bias term vanishes as n→∞. Consequently, the time‑averaged estimator converges almost surely to the true expectation π(h), establishing strong ergodicity of the EE sampler.
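In this notation, the decomposition behind the argument can be sketched as the standard Poisson-equation telescoping identity (indices simplified; the paper's exact remainder terms may differ):

```latex
% For each fixed \mu, the Poisson equation  f_\mu - P_\mu f_\mu = h - \pi(h)
% gives, after centring each h(X_k) and telescoping,
\frac{1}{n}\sum_{k=1}^{n}\bigl\{h(X_k)-\pi(h)\bigr\}
 = \underbrace{\frac{1}{n}\sum_{k=1}^{n}
     \bigl\{f_{\mu_{k-1}}(X_k)-P_{\mu_{k-1}}f_{\mu_{k-1}}(X_{k-1})\bigr\}}_{\text{martingale term}}
 + \underbrace{\frac{1}{n}\bigl\{P_{\mu_0}f_{\mu_0}(X_0)-P_{\mu_n}f_{\mu_n}(X_n)\bigr\}}_{\text{telescoping, }O(1/n)}
 + \underbrace{\frac{1}{n}\sum_{k=1}^{n}
     \bigl\{P_{\mu_k}f_{\mu_k}(X_k)-P_{\mu_{k-1}}f_{\mu_{k-1}}(X_k)\bigr\}}_{\text{perturbation from the evolving }\mu}
```

The martingale term vanishes by a law of large numbers for martingale arrays, the telescoping term is O(1/n) by boundedness of the solutions, and the perturbation term is where the uniform control of {f_μ} and the convergence of μ_n enter.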

Importantly, this proof aligns with the actual implementation: the feeding chain’s samples are used to update the empirical distribution, which in turn influences the proposal mechanism of the target chain. The earlier Atchadé–Liu proof treated the feeding distribution as static, thereby disconnecting the theory from practice. By embracing the dynamic, non‑linear nature of the algorithm, the present work closes that gap.

The paper also discusses the challenges of extending the result to the full multi‑level EE sampler, where several feeding chains interact across a hierarchy of temperatures. In that setting, the state space becomes a product of chains and the transition kernel depends on a vector of empirical measures, leading to a high‑dimensional non‑linear Markov system. The authors acknowledge that a complete convergence proof for the general case would require substantially more sophisticated measure‑theoretic and functional‑analytic machinery, and they leave this as an open problem for future research.

In summary, the authors deliver a concise yet technically robust proof that the EE sampler converges to its intended target distribution when a single feeding chain is used. Their approach showcases how the Poisson equation and the modern theory of non‑linear MCMC can be combined to analyze algorithms whose transition dynamics depend on the evolving empirical distribution, thereby providing a solid theoretical foundation for a class of adaptive sampling methods that were previously understood only heuristically.

