Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Model comparison and calibrated uncertainty quantification often require integrating over parameters, but scalable inference is challenging for complex, multimodal targets. Nested Sampling is a robust alternative to standard MCMC, yet its typically sequential structure and hard constraints make efficient accelerator implementations difficult. This paper introduces Nested Slice Sampling (NSS), a GPU-friendly, vectorized formulation of Nested Sampling that uses Hit-and-Run Slice Sampling for constrained updates. A tuning analysis yields a simple near-optimal rule for setting the slice width, improving high-dimensional behavior and making per-step compute more predictable for parallel execution. Experiments on challenging synthetic targets, high-dimensional Bayesian inference, and Gaussian-process hyperparameter marginalization show that NSS maintains accurate evidence estimates and high-quality posterior samples, and is particularly robust on difficult multimodal problems where state-of-the-art methods such as tempered SMC baselines can struggle. An open-source implementation is released to facilitate adoption and reproducibility.


💡 Research Summary

The paper introduces Nested Slice Sampling (NSS), a GPU‑friendly, fully vectorized formulation of Nested Sampling (NS) that leverages Hit‑and‑Run Slice Sampling (HRSS) as its constrained inner kernel. The authors first decompose NS into an outer loop that turns evidence estimation into a one‑dimensional quadrature over prior volumes, and an inner loop that must draw approximate samples from the constrained prior Π_E(x) ∝ Π(x)·1{E(x) < E_min}. In the outer loop, a set of m live particles is maintained; each iteration removes the k worst‑energy particles, records them as dead samples, resamples k parents from the survivors (multinomial with uniform weights), mutates each parent with the constrained kernel, and reinserts the new points. The compression factor k/m controls both the number of NS steps and the exposed parallelism.
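The outer-loop recipe described above can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the paper's implementation (which is batched with JAX); all function and variable names here are hypothetical:

```python
import math
import random

def nested_sampling_step(live, energies, k, mutate):
    """One NS outer iteration (illustrative sketch).

    live:     list of live particles
    energies: matching energies E(x) for each particle
    k:        number of worst particles removed per step
    mutate:   constrained kernel drawing a new (x, E(x)) with E(x) < e_min

    Returns the dead samples and the log prior-volume compression
    log((m - k) / m) contributed by this step.
    """
    m = len(live)
    # Indices sorted by energy, highest (worst) first.
    order = sorted(range(m), key=lambda i: energies[i], reverse=True)
    worst = order[:k]
    survivors = [i for i in range(m) if i not in worst]
    # Record the removed particles as dead samples for the quadrature.
    dead = [(live[i], energies[i]) for i in worst]
    # New points must lie below the lowest discarded energy.
    e_min = min(energies[i] for i in worst)
    for slot in worst:
        # Resample a parent uniformly from the survivors
        # (multinomial with uniform weights), then mutate it.
        parent = random.choice(survivors)
        live[slot], energies[slot] = mutate(live[parent], e_min)
    return dead, math.log(1.0 - k / m)
```

The returned log compression factor is what makes the compression ratio k/m trade off step count against parallelism: larger k shrinks the prior volume faster per step but exposes more independent mutations to batch.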

The key technical contribution is the use of HRSS for constrained updates. HRSS selects a random direction (hit‑and‑run) and performs an exact one‑dimensional slice update along the chord defined by the constraint. The authors derive a near‑optimal rule for the slice width w: w* ≈ c·d⁻¹/², where d is the dimensionality and c is a modest constant (empirically 1–2). At this width the number of stepping‑out and shrinkage evaluations concentrates with a standard deviation of order one across a wide range of dimensions, which is crucial for SIMD‑friendly batching. They also prove that for ellipsoidal level sets the expected chord length scales as Θ(d⁻¹/²), matching the width scaling.
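A single HRSS update, with the width rule w = c·d^(-1/2) baked in, can be sketched as follows. This is a simplified stand-in under the assumption that the constrained target reduces to a membership test (uniform on the feasible region); the names and the `max_steps` bound are illustrative, not the paper's API:

```python
import math
import random

def hrss_step(x, in_region, c=1.5, max_steps=20):
    """One Hit-and-Run Slice Sampling update (illustrative sketch).

    x:         current point (list of floats), assumed to satisfy the constraint
    in_region: membership test for the constrained region
               (Pi(x) > 0 and E(x) < E_min)
    c:         width constant; the paper finds c in roughly 1-2 works well
    max_steps: fixed bound on stepping-out/shrinkage iterations,
               giving SIMD-friendly, predictable control flow
    """
    d = len(x)
    w = c / math.sqrt(d)  # near-optimal slice width w* ~ c * d**-0.5
    # Random direction on the unit sphere (hit-and-run).
    u = [random.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    point = lambda t: [xi + t * ui for xi, ui in zip(x, u)]
    # Stepping out: expand the bracket until both ends leave the region.
    lo = -w * random.random()
    hi = lo + w
    for _ in range(max_steps):
        if not in_region(point(lo)):
            break
        lo -= w
    for _ in range(max_steps):
        if not in_region(point(hi)):
            break
        hi += w
    # Shrinkage: sample on the bracket, shrinking toward the current point.
    for _ in range(max_steps):
        t = random.uniform(lo, hi)
        y = point(t)
        if in_region(y):
            return y
        if t < 0:
            lo = t
        else:
            hi = t
    return x  # bounded fallback keeps the chain valid
```

With w matched to the Θ(d^(-1/2)) chord length, the bracket rarely needs more than a handful of expansions or shrinkages, which is why the evaluation count concentrates across dimensions.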

Implementation is built on JAX/BlackJAX, exploiting vmap and JIT compilation to batch energy evaluations and all outer‑loop operations. The inner HRSS loop is bounded by a fixed maximum number of stepping‑out/shrinkage steps, ensuring predictable control flow on GPUs. This vectorized design eliminates the data‑dependent branching that hampers traditional NS implementations on accelerators.
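The fixed-trip-count pattern that makes this GPU-friendly can be illustrated with a plain-Python stand-in for `jax.vmap` over particles (the names below are illustrative; the real implementation batches with vmap and jit rather than a list comprehension):

```python
def batched_mutate(parents, step, n_inner=5):
    """Apply n_inner kernel updates to every parent in lockstep.

    Illustrative stand-in for vmapping a bounded inner loop:
    the trip count is fixed up front, so there is no data-dependent
    exit to break SIMD execution on an accelerator.
    """
    batch = list(parents)
    for _ in range(n_inner):  # fixed trip count, never data-dependent
        batch = [step(x) for x in batch]
    return batch
```

In the JAX version the list comprehension becomes a single batched energy evaluation, which is where the accelerator speedup comes from.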

Empirical evaluation covers three challenging domains: (1) synthetic multimodal mixtures (Gaussian/Bernoulli) up to 50 dimensions, (2) high‑dimensional Bayesian regression and neural‑network hyper‑parameter marginalisation (100–200 dimensions), and (3) Gaussian‑process kernel marginalisation (≈150 dimensions). Across all benchmarks NSS achieves evidence errors typically below 1–2 % and effective sample sizes that exceed those of strong adaptive tempered SMC baselines by 30 % or more. In multimodal settings, SMC suffers from particle degeneracy and biased evidence, whereas NSS’s constrained updates remain robust because they directly explore the shrinking feasible region.

The paper also discusses related work, contrasting rejection‑based constrained samplers (ellipsoidal, normalising‑flow proposals) and reflection‑based constrained dynamics, both of which degrade with dimension. HRSS maintains mixing rates essentially independent of dimension, making it well‑suited for NS’s ever‑shrinking contours.

Finally, the authors release an open‑source implementation, integrate it with modern probabilistic programming ecosystems, and outline future extensions such as dynamic live‑set sizes, more expressive constraint geometries, and support for TPUs. In summary, NSS provides a principled, scalable, and accelerator‑ready alternative to existing NS and SMC methods, delivering accurate evidence estimation and high‑quality posterior samples even for very high‑dimensional, multimodal Bayesian problems.

