Generalized Guarantees for Variational Inference in the Presence of Even and Elliptical Symmetry
We extend several recent results providing symmetry-based guarantees for variational inference (VI) with location-scale families. VI approximates a target density $p$ by the best match $q^*$ in a family $Q$ of tractable distributions that in general does not contain $p$. It is known that VI can recover key properties of $p$, such as its mean and correlation matrix, when $p$ and $Q$ exhibit certain symmetries and $q^*$ is found by minimizing the reverse Kullback-Leibler divergence. We extend these guarantees in two important directions. First, we provide symmetry-based guarantees for $f$-divergences, a broad class that includes the reverse and forward Kullback-Leibler divergences and the $\alpha$-divergences. We highlight properties specific to the reverse Kullback-Leibler divergence under which we obtain our strongest guarantees. Second, we obtain further guarantees for VI when the target density $p$ exhibits even and elliptical symmetries in some but not all of its coordinates. These partial symmetries arise naturally in Bayesian hierarchical models, where the prior induces a challenging geometry but still possesses axes of symmetry. We illustrate these theoretical results in a number of experimental settings.
💡 Research Summary
This paper extends recent symmetry‑based guarantees for variational inference (VI) beyond the previously studied setting of the reverse Kullback‑Leibler (KL) divergence and full global symmetries. The authors consider a target density p and a tractable variational family Q of location‑scale distributions whose base density is spherically symmetric. They first show that if p is even‑symmetric about a point μ, then for any f‑divergence D_f(p‖q_ν) the gradient with respect to the location parameter ν vanishes at ν = μ, so μ is a stationary point of the objective. When the associated φ‑function, φ(v) = f(e^v), is convex and strictly decreasing (a condition the paper verifies for the reverse KL), the objective is strictly convex in ν and μ is the unique global minimizer. The stationarity result thus extends from the reverse KL to the whole class of f‑divergences, while the uniqueness guarantee requires the convex‑decreasing condition.
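To fix notation, the following is a schematic rendering of the objective described above; the f‑divergence convention is reconstructed from this summary and may differ in inessential details from the paper's:

$$
D_f(p \,\|\, q_\nu) \;=\; \int q_\nu(z)\, f\!\left(\frac{p(z)}{q_\nu(z)}\right) dz \;=\; \int q_\nu(z)\, \varphi\!\left(\log \frac{p(z)}{q_\nu(z)}\right) dz, \qquad \varphi(v) = f(e^v).
$$

Under this convention the reverse KL corresponds to f(t) = −log t, giving φ(v) = −v, which is linear, hence convex, and strictly decreasing; this is the special case in which the strict‑convexity argument applies.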
The second major contribution introduces the notion of partial symmetry: p may be even or elliptically symmetric only along a subset of coordinates σ, while the remaining coordinates σ̄ may lack any symmetry. Under this “symmetry along σ” definition, the conditional distribution p(z_σ | z_σ̄) possesses a point of even symmetry m_σ(z_σ̄) and a normalized covariance matrix M_σ(z_σ̄). The authors prove that, for the reverse KL (where φ is linear), a stationary point of the VI objective aligns the variational location ν_σ with the conditional mean m_σ and the variational scale S_σ with M_σ, thereby exactly recovering the mean and correlation structure on the symmetric subspace. This partial‑symmetry guarantee is especially relevant for hierarchical Bayesian models whose priors induce challenging global geometry (e.g., funnels) but retain symmetry in certain latent dimensions.
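Concretely, even symmetry along σ can be written as follows; this is an informal paraphrase of the definition summarized above, not the paper's exact statement. Writing z = (z_σ, z_σ̄),

$$
p\big(m_\sigma(z_{\bar\sigma}) + u,\; z_{\bar\sigma}\big) \;=\; p\big(m_\sigma(z_{\bar\sigma}) - u,\; z_{\bar\sigma}\big) \qquad \text{for all } u \text{ and all } z_{\bar\sigma},
$$

so that each conditional slice p(z_σ | z_σ̄) is symmetric about its own center m_σ(z_σ̄), even though the joint density need not exhibit any symmetry in z_σ̄.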
Technical assumptions include differentiability of p and q, interchangeability of differentiation and integration (via dominated convergence), and, for the stronger convexity results, a somewhere‑strict log‑concavity of p (or of the conditional densities). The paper notes that the convex‑decreasing requirement on φ is currently verified only for reverse KL, and that extending the uniqueness guarantee to other f‑divergences remains an open problem.
Empirical validation is provided on two fronts. First, a 2‑dimensional Student‑t distribution (5 degrees of freedom, zero correlation) is approximated by a factorized Gaussian family. Grid‑search minimization of five divergences (reverse KL, forward KL, α‑divergences, the Hellinger distance, and total variation) recovers the exact mean in every case, illustrating the theorem's stationary‑point claim and suggesting that the convexity condition may be conservative. Second, the authors study an “elliptical funnel” model in which τ ∼ N(0, 1) and θ | τ ∼ N(0, e^{2τ} C). The conditional distribution of θ given τ is elliptically symmetric, with covariance proportional to a fixed matrix C, while τ itself is asymmetric. VI with a Gaussian location‑scale family, minimizing the reverse KL, accurately recovers the mean and correlation of θ despite mis‑estimating τ, confirming the partial‑symmetry theorem.
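The first experiment is straightforward to reproduce numerically. Below is a minimal sketch of the reverse‑KL case using grid quadrature; the target mean, the fixed scale of the variational Gaussian, and the grid resolution are illustrative assumptions of this sketch, not values taken from the paper:

```python
import numpy as np
from scipy import stats

# Target from the first experiment: a 2-D Student-t with 5 degrees of
# freedom and zero correlation. The nonzero mean is an illustrative
# choice so that mean recovery is visible; the guarantee holds for any mean.
mu = np.array([1.0, -0.5])
target = stats.multivariate_t(loc=mu, shape=np.eye(2), df=5)

# Quadrature grid for the divergence integral.
axis = np.linspace(-8.0, 8.0, 257)
dx = axis[1] - axis[0]
X, Y = np.meshgrid(axis, axis, indexing="ij")
pts = np.column_stack([X.ravel(), Y.ravel()])
log_p = target.logpdf(pts)

def reverse_kl(nu, sigma=1.2):
    """Approximate KL(q_nu || p) for an isotropic Gaussian q_nu by quadrature."""
    q = stats.multivariate_normal(mean=nu, cov=sigma**2 * np.eye(2))
    log_q = q.logpdf(pts)
    return np.sum(np.exp(log_q) * (log_q - log_p)) * dx * dx

# Grid search over candidate locations, in the spirit of the paper's
# experiment (this candidate grid and the fixed sigma are assumptions).
candidates = [np.array([a, b])
              for a in np.linspace(0.0, 2.0, 21)
              for b in np.linspace(-1.5, 0.5, 21)]
best = min(candidates, key=reverse_kl)
print("recovered location:", best, "  true mean:", mu)
```

Replacing reverse_kl with a quadrature estimate of the forward KL, an α‑divergence, the Hellinger distance, or total variation on the same grid should, per the experiment reported above, recover the same location.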
In conclusion, the paper demonstrates that symmetry—both global and partial—can be leveraged to obtain strong, divergence‑agnostic guarantees for VI. It broadens the theoretical foundation beyond reverse KL, highlights the special role of convex‑decreasing φ functions, and provides practical insights for hierarchical models where only subsets of variables enjoy symmetry. Future work is suggested on weakening the φ‑convexity requirement, extending uniqueness results to a wider class of f‑divergences, and exploring higher‑dimensional or non‑elliptical symmetry structures.