Piecewise deterministic Markov process samplers are attractive alternatives to Metropolis-Hastings algorithms. A central design question is how to incorporate partial velocity refreshment to ensure ergodicity without injecting excessive noise. Forward Event-Chain Monte Carlo (FECMC) is a generalization of the Bouncy Particle Sampler (BPS) that addresses this issue through a stochastic reflection mechanism, thereby reducing reliance on global refreshment moves. Despite promising empirical performance, its theoretical efficiency remains largely unexplored. We develop a high-dimensional scaling analysis for standard Gaussian targets and prove that the negative log-density (or potential) process of FECMC converges to an Ornstein-Uhlenbeck diffusion, under the same scaling as BPS. We derive closed-form expressions for the limiting diffusion coefficients of both methods by analyzing their associated radial momentum processes and solving the corresponding Poisson equations. These expressions yield a sharp efficiency comparison: the diffusion coefficient of FECMC is strictly larger than that of optimally tuned BPS, and the optimum for FECMC is attained at zero global refreshment. Specifically, they imply an approximately eightfold increase in effective sample size per event over optimal BPS. Numerical experiments confirm the predicted diffusion coefficients and show that the resulting efficiency gains remain substantial for a range of non-Gaussian targets. Finally, as an application of these results, we propose an asymptotic variance estimator for piecewise deterministic Markov processes that becomes increasingly efficient in high dimensions by extracting information from the velocity variable.
1. Introduction. Since the introduction of the random walk algorithm by Metropolis et al. (1953), Markov chain Monte Carlo (MCMC) methods have found widespread applications in physics and statistics, particularly in Bayesian statistics following the advocacy of Gelfand and Smith (1990). While the Metropolis-Hastings (MH) paradigm introduced by Hastings (1970) has underpinned the majority of applications, a fundamentally different class of algorithms based on piecewise deterministic Markov processes (PDMPs) (Davis, 1984) has emerged as a viable alternative. These methods originated in computational physics and molecular simulation as rejection-free, event-driven algorithms designed to overcome the diffusive behavior inherent in MH-type dynamics (Bernard, Krauth and Wilson, 2009; Peters and de With, 2012; Michel, Kapfer and Krauth, 2014; Faulkner et al., 2018; Krauth, 2021) and were later adopted in the statistical literature (Vanetti et al., 2018; Bouchard-Côté, Vollmer and Doucet, 2018; Bierkens, Fearnhead and Roberts, 2019; Fearnhead et al., 2018; Faulkner and Livingstone, 2024). Algorithms designed within the PDMP framework enjoy two key structural advantages that are typically lost in MH-based methods. First, PDMPs are inherently irreversible, and this lack of detailed balance has been shown to accelerate convergence relative to reversible counterparts, a well-documented fact in both computational physics (Nishikawa et al., 2015; Lei, Krauth and Maggs, 2019) and statistics (Andrieu and Livingstone, 2021; Eberle and Lörler, 2024). Second, and crucially for modern large-scale inference, the PDMP framework naturally accommodates unbiased subsampling and stochastic gradients while preserving the exactness of the target distribution (Bierkens, Fearnhead and Roberts, 2019; Sen et al., 2020; Fearnhead et al., 2024). This stands in contrast to most stochastic gradient MH methods, which typically introduce asymptotic bias in exchange for scalability (Welling and Teh, 2011; Chen, Fox and Guestrin, 2014; Nemeth and Fearnhead, 2021). Other structural benefits of PDMP-based samplers, including robustness in multiscale targets and bespoke implementation for special targets, are also under active investigation (Bierkens et al., 2023; Chevallier, Fearnhead and Sutton, 2023).
Despite these advantages, a precise understanding of the high-dimensional scaling behavior of PDMP-based samplers remains incomplete. In particular, although several PDMP algorithms have been proposed, there is still no fully unified theoretical framework for comparing their efficiency or guiding practical algorithmic choices in high dimensions, even though scaling analysis is well established for MH-type methods; see Section 2.1. In this work, we address this gap by deriving high-dimensional diffusive scaling limits, which enable a theoretical comparison of the samplers' computational complexity and asymptotic efficiency. We focus on two representative and widely used methods: the Bouncy Particle Sampler (BPS) (Bouchard-Côté, Vollmer and Doucet, 2018) and the Forward Event-Chain Monte Carlo (FECMC) (Michel, Durmus and Sénécal, 2020). Both methods employ piecewise linear dynamics perturbed by directional changes at random event times. A key difference is how randomness is injected via their directional changes: BPS employs deterministic reflections and thus relies on global refreshment moves for randomization, whereas FECMC achieves it through a stochastic reflection mechanism. In BPS, there is a tension in selecting the Poisson rate ρ for the global refreshment events. While global refreshment is essential for ergodicity, it may also inject additional noise that can hinder efficient exploration of the target distribution. Therefore, the primary design goal of FECMC was to discard the global refreshment while preserving ergodicity. This is achieved by stochastic reflection that combines (1) partial refreshment of the velocity component parallel to the gradient and (2) minimal randomization of the orthogonal component. The effectiveness of this strategy has been demonstrated through numerical experiments in Michel, Durmus and Sénécal (2020). However, a theoretical explanation for this observed improvement is lacking. In particular, it remains unclear whether additional global refreshment (ρ > 0) would accelerate exploration.
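To make the role of the refreshment rate ρ concrete, the following is a minimal sketch of BPS for a standard Gaussian target, where the potential is U(x) = |x|²/2 and the gradient is x. It is an illustration under stated assumptions, not the implementation used in this paper: the function names are hypothetical, the velocity is assumed to be drawn from N(0, I) at each global refreshment, and the bounce time is obtained by inverting the integrated rate max(0, ⟨x + sv, v⟩) in closed form. The stochastic reflection of FECMC is not sketched, since its precise form is not given in this section.

```python
import numpy as np


def next_bounce_time(x, v, e):
    """Solve int_0^tau max(0, <x + s v, v>) ds = e (standard Gaussian target)."""
    a = float(x @ v)  # directional derivative of U(x) = |x|^2 / 2 along v
    b = float(v @ v)  # slope of that derivative along the straight-line path
    return (-a + np.sqrt(max(a, 0.0) ** 2 + 2.0 * b * e)) / b


def reflect(x, v):
    """Deterministic BPS reflection of v against the gradient, here grad U(x) = x."""
    g = x
    return v - 2.0 * (v @ g) / (g @ g) * g


def bps_standard_gaussian(dim, n_events, rho, seed=0):
    """Run BPS for n_events events; return positions and times at the events."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    v = rng.standard_normal(dim)  # assumption: velocity law is N(0, I)
    xs, ts, t = [x.copy()], [0.0], 0.0
    for _ in range(n_events):
        tau_bounce = next_bounce_time(x, v, rng.exponential())
        # Global refreshment events arrive at Poisson rate rho (none if rho = 0).
        tau_refresh = rng.exponential(1.0 / rho) if rho > 0 else np.inf
        tau = min(tau_bounce, tau_refresh)
        x = x + tau * v  # piecewise linear motion between events
        if tau_refresh < tau_bounce:
            v = rng.standard_normal(dim)  # global refreshment of the velocity
        else:
            v = reflect(x, v)  # deterministic reflection at a bounce event
        t += tau
        xs.append(x.copy())
        ts.append(t)
    return np.array(xs), np.array(ts)
```

Calling `bps_standard_gaussian(dim=100, n_events=1000, rho=1.0)` returns the event positions and event times. Varying `rho` in such a sketch exposes the tension described above: with `rho = 0` the only source of randomness is the event timing, while large `rho` replaces the velocity so often that the motion becomes noisier.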
We provide a theoretical explanation by directly comparing the high-dimensional diffusion scaling limits of FECMC and BPS. In Section 3, we establish weak convergence theorems for the FECMC radial momentum process and the negative log-density (or potential) process. We start with the case ρ = 0, which requires new techniques. In this case, the FECMC process loses the regeneration structure on which the martingale arguments of Bierkens, Kamatani and Roberts (2022) crucially depend. By considering the dual predictable projection instead of the generator, we derive the limit theorem for the case of ρ = 0 through a semimartingale techn