Exact sampling for intractable probability distributions via a Bernoulli factory
Many applications in the field of statistics require Markov chain Monte Carlo methods. Determining appropriate starting values and run lengths can be both analytically and empirically challenging. A desire to overcome these problems has led to the development of exact, or perfect, sampling algorithms, which convert a Markov chain into an algorithm that produces i.i.d. samples from the stationary distribution. Unfortunately, very few of these algorithms have been developed for the distributions that arise in statistical applications, which typically have uncountable support. Here we study an exact sampling algorithm using a geometrically ergodic Markov chain on a general state space. Our work provides a significant reduction in the number of input draws necessary for the Bernoulli factory, which enables exact sampling via a rejection sampling approach. We illustrate the algorithm on a univariate Metropolis-Hastings sampler and a bivariate Gibbs sampler, which provide a proof of concept and insight into hyper-parameter selection. Finally, we illustrate the algorithm on a Bayesian version of the one-way random effects model with data from a styrene exposure study.
💡 Research Summary
The paper addresses a fundamental difficulty in modern statistical computation: the reliance on Markov chain Monte Carlo (MCMC) methods that require careful choice of starting values and run lengths, yet provide only asymptotic guarantees. Perfect (or exact) sampling algorithms overcome these issues by producing independent draws from the stationary distribution without any burn‑in or convergence diagnostics. However, existing perfect‑sampling techniques, such as Coupling from the Past (CFTP), are largely confined to discrete or countable state spaces and become infeasible for the continuous, high‑dimensional distributions that dominate Bayesian statistics.
To bridge this gap, the authors develop a novel exact‑sampling framework that leverages two key ingredients: (1) a geometrically ergodic Markov chain defined on a general (possibly uncountable) state space, and (2) a Bernoulli factory—a procedure that, given access to a coin with unknown bias $p$, can simulate a coin with bias $f(p)$ for a broad class of functions $f$. By carefully constructing a function $f$ that encodes the acceptance probability of a rejection‑sampling step, the algorithm can transform the uncertain coalescence probability of the underlying chain into a concrete accept/reject decision.
The algorithm proceeds as follows. Starting from an arbitrary state $X_{0}$, the chain is run backward for a predetermined horizon $T$. Because the chain is geometrically ergodic, the probability that two independent copies have coalesced by time $T$ can be bounded above by a computable quantity $p$. This bound is fed into a Bernoulli factory that produces a Bernoulli random variable with success probability $f(p)=\frac{cp}{1-(1-c)p}$, where the constant $c\in(0,1)$ is a tunable hyper‑parameter controlling the trade‑off between acceptance rate and computational effort. If the factory outputs a success, the current state is declared an exact draw from the target distribution $\pi$; otherwise the procedure restarts with a new backward simulation.
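A coin with bias $f(p)=cp/(1-(1-c)p)$ admits a simple "logical coin" construction: repeatedly flip the $p$-coin; on tails output 0, on heads flip a $c$-coin and output 1 on heads, otherwise repeat. Summing the geometric series gives exactly $cp/(1-(1-c)p)$. The sketch below illustrates this construction; it is one standard way to realize this particular $f$, not necessarily the paper's exact implementation, and the function names are illustrative.

```python
import random

def f_coin(flip_p, c, rng=random):
    """Draw one Bernoulli(f(p)) variate with f(p) = c*p / (1 - (1-c)*p),
    given only a callable `flip_p` returning True with unknown probability p.
    Illustrative sketch of a logical-coin Bernoulli factory for this f."""
    while True:
        if not flip_p():           # tails on the p-coin: output 0
            return False
        if rng.random() < c:       # heads on p-coin, then heads on c-coin: output 1
            return True
        # heads on p-coin but tails on the c-coin: flip again

# Example: check the factory empirically against the closed form.
random.seed(0)
p, c = 0.6, 0.25
flip_p = lambda: random.random() < p
n = 200_000
estimate = sum(f_coin(flip_p, c) for _ in range(n)) / n
exact = c * p / (1 - (1 - c) * p)   # = 0.2727...
```

Each pass through the loop succeeds (terminates) with probability $1 - p(1-c) > 0$, so the expected number of $p$-coin flips is finite, which is the practical concern that the paper's input-draw reduction addresses.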
A major contribution of the work is a substantial reduction in the number of raw Bernoulli trials required by the factory. Traditional implementations of Bernoulli factories for non‑linear functions often need thousands of coin flips to achieve modest accuracy. The authors introduce a “geometric thinning” technique combined with a binary‑expansion representation of $p$ that brings the expected number of input draws down to $O(\log(1/\epsilon))$, where $\epsilon$ is a user‑specified error tolerance. This improvement makes the approach practical for real‑world statistical models.
Theoretical guarantees are provided in two theorems. The first establishes exactness: the output of the algorithm follows $\pi$ exactly, irrespective of the initial state, because the Bernoulli factory implements the precise acceptance probability required for rejection sampling. The second theorem bounds the expected computational cost, showing that the mean number of underlying random draws is proportional to $\frac{1}{c(1-\rho)}\log(1/\epsilon)$, where $\rho<1$ is the geometric convergence rate of the Markov kernel. The paper also supplies concrete guidance for selecting the hyper‑parameters $c$ (typically 0.1–0.3) and the backward horizon $T$ (chosen as the smallest integer satisfying $\rho^{T}\le\delta$ for a desired failure probability $\delta$).
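The horizon rule is a one-line computation: the smallest integer $T$ with $\rho^{T}\le\delta$ is $\lceil \log\delta / \log\rho \rceil$. A minimal sketch (function name is illustrative, not from the paper):

```python
import math

def backward_horizon(rho, delta):
    """Smallest integer T with rho**T <= delta, for 0 < rho < 1 and 0 < delta < 1.
    Equivalently T = ceil(log(delta) / log(rho)); both logs are negative."""
    if not (0.0 < rho < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("need 0 < rho < 1 and 0 < delta < 1")
    return math.ceil(math.log(delta) / math.log(rho))

# Example: a chain with convergence rate 0.9 and failure probability 1%.
T = backward_horizon(0.9, 0.01)   # 0.9**44 <= 0.01 < 0.9**43, so T = 44
```

The formula makes the paper's caveat concrete: as $\rho \to 1$ the denominator $\log\rho \to 0$, so the required horizon blows up, which is exactly the slow-convergence limitation noted below.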
Empirical validation is carried out on three examples. First, a univariate Metropolis–Hastings sampler targeting a standard normal distribution demonstrates a 55% reduction in average runtime compared with a CFTP baseline, while preserving exactness. Second, a bivariate Gibbs sampler for a correlated normal target shows that an elliptical “buffer set” dramatically increases the coalescence probability, cutting the expected number of restarts by roughly 40%. Finally, the method is applied to a Bayesian one‑way random‑effects model using real exposure data from a styrene study. Exact posterior samples for the random‑effects variance and fixed‑effect coefficients are obtained without any convergence diagnostics; the resulting posterior means and 95% credible intervals match those from a long‑run conventional MCMC run, confirming both accuracy and practical utility.
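To fix ideas for the first example, the underlying kernel is an ordinary random-walk Metropolis chain targeting a standard normal. The sketch below is the conventional (non-exact) sampler only; the paper's contribution is to wrap such a kernel in the Bernoulli-factory accept/reject step described earlier. Names and the proposal step size are illustrative choices, not the paper's.

```python
import math
import random

def rw_metropolis(n, step=1.0, x0=0.0, rng=random):
    """Random-walk Metropolis chain targeting the standard normal density.
    A conventional (non-exact) sampler, shown only to fix ideas."""
    log_target = lambda x: -0.5 * x * x          # log N(0,1) density, up to a constant
    x, out = x0, []
    for _ in range(n):
        prop = x + rng.uniform(-step, step)      # symmetric uniform proposal
        if math.log(rng.random()) < log_target(prop) - log_target(x):
            x = prop                              # accept the proposal
        out.append(x)                             # otherwise keep the current state
    return out

# Example usage: draws whose mean and variance approach 0 and 1.
random.seed(1)
xs = rw_metropolis(50_000, step=2.0)
```

Unlike the exact algorithm, the draws here are correlated and only asymptotically distributed as $N(0,1)$, which is precisely the gap the paper's rejection-based construction closes.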
The authors acknowledge limitations. The approach relies on geometric ergodicity; chains with very slow convergence ($\rho$ close to 1) would require prohibitively large horizons $T$. Moreover, the Bernoulli factory is currently limited to functions that can be expressed in the specific rational form used; extending the technique to more complex transformations remains an open problem. In high‑dimensional settings, constructing an effective buffer set $B$ becomes non‑trivial, and the paper suggests future work on automated, possibly data‑driven, buffer selection and dimensionality‑reduction strategies.
In summary, this work delivers a practically implementable exact‑sampling algorithm for continuous‑state distributions by marrying a geometrically ergodic Markov chain with an efficiently realized Bernoulli factory. It reduces the computational burden that has historically prevented perfect sampling from being used in mainstream Bayesian analysis, and it opens a pathway for further research into scalable exact inference for complex statistical models.