Efficient variational inference in large-scale Bayesian compressed sensing

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We study linear models under heavy-tailed priors from a probabilistic viewpoint. Instead of computing a single sparse maximum a posteriori (MAP) solution as in standard deterministic approaches, the focus in the Bayesian compressed sensing framework shifts towards capturing the full posterior distribution on the latent variables, which allows quantifying the estimation uncertainty and learning model parameters using maximum likelihood. The exact posterior distribution under the sparse linear model is intractable and we concentrate on variational Bayesian techniques to approximate it. Repeatedly computing Gaussian variances turns out to be a key prerequisite and constitutes the main computational bottleneck in applying variational techniques in large-scale problems. We leverage the recently proposed Perturb-and-MAP algorithm for drawing exact samples from Gaussian Markov random fields (GMRF). The main technical contribution of our paper is to show that estimating Gaussian variances using a relatively small number of such efficiently drawn random samples is much more effective than alternative general-purpose variance estimation techniques. By reducing the problem of variance estimation to standard optimization primitives, the resulting variational algorithms are fully scalable and parallelizable, allowing Bayesian computations in extremely large-scale problems with the same memory and time complexity requirements as conventional point estimation techniques. We illustrate these ideas with experiments in image deblurring.


💡 Research Summary

The paper addresses Bayesian compressed sensing for linear models equipped with heavy‑tailed priors, moving beyond the traditional deterministic MAP approach that yields only a single sparse estimate. By adopting a variational Bayesian framework, the authors aim to approximate the full posterior distribution over the latent signal vector, thereby enabling uncertainty quantification and principled hyper‑parameter learning.

The model assumes a latent vector x ∈ ℝᴺ with a super‑Gaussian prior (e.g., Laplacian or Student‑t) and Gaussian linear measurements y = Hx + noise. The exact posterior P(x|y) is intractable, so a Gaussian variational distribution Q(x|y) ∝ P(y|x) exp(βᵀs − ½ sᵀΓ⁻¹s) is introduced, where s = Gx, Γ = diag(γ), and the variational parameters ξ = (β, γ) must be optimized.

A double‑loop algorithm is required: the outer loop updates the marginal variances zₖ = Var_Q(sₖ|y) that appear in the log‑determinant term log|A| (with A = σ⁻² HᵀH + GᵀΓ⁻¹G), while the inner loop solves a smoothed MAP problem to update the mean x̂ and γ. Computing the variances zₖ is the principal bottleneck because Σ = A⁻¹ is dense and N can be on the order of 10⁶, making direct storage or Cholesky factorization infeasible.
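Although A is written as an explicit matrix, all the computations described here touch it only through products with H, G and their transposes. A minimal matrix-free sketch in NumPy/SciPy (the function name is an assumption, not from the paper):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

def make_precision_operator(H, G, sigma, gamma):
    """Matrix-free A = sigma^-2 * H^T H + G^T diag(gamma)^-1 G.
    Only products with H, G and their transposes are required, so
    H and G may be dense arrays, sparse matrices, or fast operators
    (e.g. FFT-based convolutions in the deblurring setting)."""
    n = H.shape[1]

    def matvec(v):
        return H.T @ (H @ v) / sigma**2 + G.T @ ((G @ v) / gamma)

    return LinearOperator((n, n), matvec=matvec, dtype=float)
```

Because A is never formed, the memory footprint stays linear in N, matching conventional point-estimation solvers.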

The authors’ key contribution is to replace traditional Lanczos‑based variance estimation with a Monte‑Carlo estimator that leverages the Perturb‑and‑MAP algorithm for exact sampling from Gaussian Markov random fields (GMRFs). Perturb‑and‑MAP injects independent Gaussian noise into each factor of the GMRF, then solves a perturbed quadratic problem via preconditioned conjugate gradients to obtain an exact sample of x. By drawing a modest number S of such samples, the marginal variances are estimated as empirical averages of (gₖᵀx)². This estimator is unbiased, avoids the systematic under‑estimation characteristic of Lanczos, and requires only matrix‑vector products with H, G and their transposes—operations that are already available in large‑scale sparse implementations.
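A compact sketch of this estimator, assuming the two Gaussian factors are the σ⁻²-weighted likelihood and the Γ⁻¹-weighted prior on s = Gx (function names are illustrative; the paper's large-scale implementation uses preconditioned conjugate gradients):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def perturb_and_map_sample(H, G, sigma, gamma, rng):
    """One exact zero-mean sample from N(0, A^{-1}), where
    A = sigma^-2 H^T H + G^T diag(gamma)^-1 G.
    Each Gaussian factor is perturbed by its own standard-normal
    noise; solving the resulting quadratic problem with CG yields
    a sample whose covariance is exactly A^{-1}."""
    (m, n), k = H.shape, G.shape[0]
    b = H.T @ rng.standard_normal(m) / sigma \
        + G.T @ (rng.standard_normal(k) / np.sqrt(gamma))
    A = LinearOperator((n, n), dtype=float, matvec=lambda v:
                       H.T @ (H @ v) / sigma**2 + G.T @ ((G @ v) / gamma))
    x, info = cg(A, b)  # only matvecs with H, G and transposes
    return x

def mc_variances(H, G, sigma, gamma, num_samples, rng):
    """Unbiased Monte-Carlo estimate of z_k = Var(s_k), s = G x,
    as the empirical average of (g_k^T x)^2 over exact samples."""
    z = np.zeros(G.shape[0])
    for _ in range(num_samples):
        z += (G @ perturb_and_map_sample(H, G, sigma, gamma, rng)) ** 2
    return z / num_samples
```

Each sample costs one CG solve, so the variance estimate has the same per-iteration complexity as a MAP computation.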

Because each sample is independent, the procedure is embarrassingly parallel; the authors demonstrate near‑linear speed‑up on multi‑core CPUs and GPUs. Moreover, the same samples can be reused to approximate the log‑determinant term (via stochastic trace estimators) and to monitor the variational free energy, incurring no extra computational cost.
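The stochastic-trace idea can be illustrated with the classical Hutchinson estimator, which approximates tr(A⁻¹B) from random probe vectors using only solves with A and products with B. This is a generic sketch of the principle (not the paper's exact estimator, which reuses the Perturb-and-MAP samples themselves):

```python
import numpy as np

def hutchinson_trace(solve_A, matvec_B, n, num_probes, rng):
    """Hutchinson estimator: E[v^T A^{-1} B v] = tr(A^{-1} B)
    for Rademacher probes v with i.i.d. +/-1 entries."""
    acc = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)
        acc += v @ solve_A(matvec_B(v))
    return acc / num_probes
```

Since the probes are independent, this estimator parallelizes in exactly the same way as the variance computation.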

In the inner loop, the smoothed MAP objective reduces to
 σ⁻²‖y − Hx̂‖² − 2 ∑ₖ log tₖ(√(ŝₖ² + zₖ)),
which is convex for log‑concave potentials tₖ. Standard quasi‑Newton methods solve this sub‑problem efficiently, yielding the updated mean x̂. The γ‑updates have a closed form derived from the dual representation of the super‑Gaussian potentials.
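As a concrete instance, for Laplacian potentials tₖ(s) = exp(−τ|s|) the −2 ∑ₖ log tₖ term becomes 2τ ∑ₖ √(ŝₖ² + zₖ), which is smooth whenever zₖ > 0. A hedged sketch of the inner solve with L-BFGS (the potential choice and function names are illustrative, not the paper's exact setup):

```python
import numpy as np
from scipy.optimize import minimize

def inner_map_solve(H, G, y, sigma, tau, z, x0):
    """Minimize sigma^-2 ||y - H x||^2 + 2 tau sum_k sqrt((Gx)_k^2 + z_k),
    the smoothed MAP objective for Laplacian potentials exp(-tau|s|).
    Smooth and convex for z > 0, so quasi-Newton methods converge reliably."""
    def f_and_grad(x):
        r = y - H @ x
        s = G @ x
        w = np.sqrt(s**2 + z)           # smoothed |s_k|
        f = (r @ r) / sigma**2 + 2.0 * tau * w.sum()
        g = -2.0 * (H.T @ r) / sigma**2 + 2.0 * tau * (G.T @ (s / w))
        return f, g

    res = minimize(f_and_grad, x0, jac=True, method="L-BFGS-B")
    return res.x
```

Setting z = 0 recovers the non-smooth MAP objective, so this inner step has the same cost as a conventional sparse point estimate.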

Experimental validation focuses on image deblurring with a 256 × 256 test image. Using only a few hundred Perturb‑and‑MAP samples per outer iteration, the proposed method achieves reconstruction PSNR comparable to state‑of‑the‑art MAP solvers while providing posterior variance maps that highlight uncertain regions. Compared to Lanczos‑based variance estimation, the new approach reduces total runtime by a factor of five and eliminates the need for costly orthogonalization steps.

Overall, the paper demonstrates that accurate Gaussian variance estimation—once the dominant obstacle to scalable variational Bayesian inference—can be achieved with simple, parallelizable Monte‑Carlo sampling. This makes fully Bayesian compressed sensing viable for problems with millions of variables, matching the memory and time complexity of conventional point‑estimate methods while delivering richer probabilistic information.

