A Non-asymptotic Analysis for Learning and Applying a Preconditioner in MCMC


Preconditioning is a common method applied to modify Markov chain Monte Carlo algorithms with the goal of making them more efficient. In practice it is often extremely effective, even when the preconditioner is learned from the chain. We analyse and compare the finite-time computational costs of schemes which learn a preconditioner based on the target covariance or the expected Hessian of the target potential with that of a corresponding scheme that does not use preconditioning. We apply our results to the Unadjusted Langevin Algorithm (ULA) for an appropriately regular target, establishing non-asymptotic guarantees for preconditioned ULA which learns its preconditioner. Our results are also applied to the unadjusted underdamped Langevin algorithm in the supplementary material. To do so, we establish non-asymptotic guarantees on the time taken to collect $N$ approximately independent samples from the target for schemes that learn their preconditioners under the assumption that the underlying Markov chain satisfies a contraction condition in the Wasserstein-2 distance. This approximate independence condition, that we formalize, allows us to bridge the non-asymptotic bounds of modern MCMC theory and classical heuristics of effective sample size and mixing time, and is needed to amortise the costs of learning a preconditioner across the many samples it will be used to produce.
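For concreteness, writing the target density as proportional to exp(−U(x)), a standard form of the preconditioned ULA recursion with step size h > 0 and positive-definite preconditioning matrix P is

    x_{k+1} = x_k − h P ∇U(x_k) + √(2h) P^{1/2} ξ_k,   ξ_k ~ N(0, I_d);

plain ULA corresponds to P = I, and the learned schemes replace P by an estimate built from the target covariance or from the expected Hessian of the target potential (the paper's exact parameterisation of the preconditioner may differ).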


💡 Research Summary

The paper investigates the finite‑time (non‑asymptotic) computational cost of Markov chain Monte Carlo (MCMC) algorithms that learn and apply a matrix‑valued preconditioner. The authors focus on target distributions π on ℝ^d with density proportional to exp(−U(x)), where the potential U is twice differentiable, m‑strongly convex (m I ≼ ∇²U(x)) and L‑smooth (∇²U(x) ≼ L I). The condition number κ = L/m quantifies anisotropy; large κ makes standard (unpreconditioned) samplers inefficient.
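As a concrete example, for a Gaussian target π = N(0, Σ) the potential is U(x) = ½ xᵀΣ⁻¹x, so ∇²U(x) = Σ⁻¹ for every x, giving m = 1/λ_max(Σ), L = 1/λ_min(Σ) and κ = λ_max(Σ)/λ_min(Σ). An elongated Gaussian therefore has large κ, and the usual step-size restriction h ≲ 1/L forces an unpreconditioned sampler to take many small steps to traverse the long directions of the target.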

Two linear preconditioners are considered: (i) the inverse covariance Σ_π⁻¹, introduced by Haario et al. (2001), and (ii) the inverse Fisher matrix F⁻¹, where F = E_π[∇²U(X)] is the expected Hessian of the potential (which, under standard regularity conditions, coincides with the Fisher information E_π[∇U(X)∇U(X)ᵀ] of π).
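As an illustration only (not the authors' implementation, and using the common convention in which the matrix applied to the gradient is the empirical covariance or the inverse average Hessian; the paper may parameterise the preconditioner via the corresponding inverses), the sketch below estimates both quantities from warm-up samples of a chain targeting an anisotropic Gaussian, then applies one of them in a preconditioned ULA step. The target, step size h, warm-up length and ridge term eps are placeholder choices.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 10

    # Illustrative anisotropic Gaussian target: U(x) = 0.5 * x^T A x,
    # so grad U(x) = A x and Hess U(x) = A, with A = Sigma_pi^{-1}.
    scales = np.linspace(1.0, 10.0, d)
    A = np.diag(1.0 / scales**2)

    def grad_U(x):
        return A @ x

    def hess_U(x):
        return A  # constant for a Gaussian target; x-dependent in general

    # Warm-up: plain (unpreconditioned) ULA to collect samples for learning.
    h = 1e-2
    x = np.zeros(d)
    warmup = []
    for _ in range(5000):
        x = x - h * grad_U(x) + np.sqrt(2 * h) * rng.standard_normal(d)
        warmup.append(x.copy())
    warmup = np.array(warmup)

    eps = 1e-6  # small ridge for numerical stability

    # (i) Covariance-based scheme: apply the empirical covariance (≈ Sigma_pi)
    #     to the gradient.
    P_cov = np.cov(warmup.T) + eps * np.eye(d)

    # (ii) Hessian/Fisher-based scheme: apply the inverse of the average Hessian
    #      (≈ F^{-1}, with F = E_pi[Hess U(X)]) to the gradient.
    F_hat = np.mean([hess_U(xi) for xi in warmup], axis=0)
    P_fisher = np.linalg.inv(F_hat + eps * np.eye(d))

    # One preconditioned ULA step with the learned matrix P:
    # x_{k+1} = x_k - h * P * grad U(x_k) + sqrt(2h) * P^{1/2} * xi_k.
    P = P_fisher
    P_sqrt = np.linalg.cholesky(P)  # any square root of P works for the noise
    x = x - h * P @ grad_U(x) + np.sqrt(2 * h) * P_sqrt @ rng.standard_normal(d)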

