Expectations in Expectation Propagation
Expectation Propagation (EP) is a widely used message-passing algorithm that decomposes a global inference problem into multiple local ones. It approximates marginal distributions (beliefs) using intermediate functions (messages). While beliefs must be proper probability distributions that integrate to one, messages may have infinite integral values. In Gaussian-projected EP, such messages take a Gaussian form and appear as if they have “negative” variances. Although allowed within the EP framework, these negative-variance messages can impede algorithmic progress. In this paper, we investigate EP in linear models and analyze the relationship between the corresponding beliefs. Based on the analysis, we propose both non-persistent and persistent approaches that prevent the algorithm from being blocked by messages with infinite integral values. Furthermore, by examining the relationship between the EP messages in linear models, we develop an additional approach that avoids the occurrence of messages with infinite integral values.
💡 Research Summary
Expectation Propagation (EP) is a powerful message‑passing framework that decomposes a global Bayesian inference problem into a set of local factor‑wise computations. While the beliefs (approximated marginals) must be proper probability distributions that integrate to one, the intermediate messages are allowed to have infinite integrals. In Gaussian‑projected EP, such messages often appear as Gaussians with “negative” variances. Although mathematically admissible, negative‑variance messages can cause the beliefs to become improper and may halt the iterative algorithm.
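How a "negative variance" arises can be seen in one dimension: in Gaussian EP a message is typically formed as a ratio of two Gaussians (projected belief divided by cavity), and in natural parameters the precisions subtract. A minimal sketch, with illustrative numbers not taken from the paper:

```python
# A message formed as projected_belief / cavity subtracts precisions.
# If the projected belief is WIDER than the cavity, the resulting
# precision is negative -- a Gaussian "with negative variance".
# Numbers below are purely illustrative.

belief_var = 2.0   # variance of the KL-projected belief
cavity_var = 1.0   # variance of the cavity distribution

msg_precision = 1.0 / belief_var - 1.0 / cavity_var
msg_var = 1.0 / msg_precision

print(msg_precision)  # -0.5 -> negative precision
print(msg_var)        # -2.0 -> "negative variance"
```

Both inputs are proper Gaussians; only the ratio is improper, which is exactly why EP admits such messages while still requiring proper beliefs.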
The paper focuses on linear measurement models of the form y = A x + v, where A is a known matrix, v is Gaussian noise, and the signal components xₙ are independent with arbitrary priors. The joint density factorizes into a likelihood factor f_y(x) = N(y | A x, C_v) and a set of prior factors f_{xₙ}(xₙ) = p(xₙ). The authors study the reV‑AMP algorithm, an EP instance applied to this factorization, and write down the explicit Gaussian messages and beliefs for both the likelihood and prior nodes.
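The model and its factorization can be set up directly. The sketch below uses a Gaussian prior so that the exact posterior is available in closed form as a sanity check; dimensions, noise level, and function names are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 20, 10
A = rng.standard_normal((M, N))        # known measurement matrix
sigma_v = 0.1                          # noise std; C_v = sigma_v**2 * I
x_true = rng.standard_normal(N)        # signal (here: standard-normal prior)
y = A @ x_true + sigma_v * rng.standard_normal(M)

# Likelihood factor f_y(x) = N(y | A x, C_v), up to an additive constant:
def log_f_y(x):
    r = y - A @ x
    return -0.5 * (r @ r) / sigma_v**2

# Prior factors f_{x_n}(x_n) = p(x_n); here each is standard normal:
def log_f_xn(xn):
    return -0.5 * xn**2

# With a Gaussian prior the posterior is closed-form, a useful baseline
# for checking any EP implementation on this factorization:
C_post = np.linalg.inv(A.T @ A / sigma_v**2 + np.eye(N))
m_post = C_post @ (A.T @ y) / sigma_v**2
```

For non-Gaussian priors (e.g. sparsity-inducing ones) only `log_f_xn` changes, and the posterior no longer has this closed form, which is where EP comes in.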
A key observation is that the belief at the likelihood node remains Gaussian regardless of the messages, so it never causes problems. In contrast, the belief at each prior node involves a one‑dimensional integral over the possibly non‑Gaussian prior. When the message from the likelihood to a prior node has a negative variance, the resulting belief can become non‑normalizable, and the subsequent Kullback‑Leibler (KL) projection step may be undefined.
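A small numerical example makes the non-normalizability concrete. Take a Laplace prior p(x) ∝ exp(−|x|) and an incoming Gaussian message with "variance" −4 (precision −0.25): the log-belief −|x| + x²/8 grows without bound, so the belief integral diverges and moment matching is undefined. The specific prior and numbers are illustrative, not from the paper:

```python
import numpy as np

# Belief at a prior node: b(x) ∝ p(x) * m(x). With a Laplace prior
# p(x) ∝ exp(-|x|) and a Gaussian message of variance -4 (precision -0.25),
# log m(x) = -x**2 / (2 * (-4)) = +x**2 / 8, so the log-belief is
# -|x| + x**2 / 8, which increases without bound as |x| grows.
def log_belief(x):
    return -np.abs(x) + x**2 / 8.0

xs = np.array([1.0, 10.0, 100.0])
vals = log_belief(xs)
print(vals)  # grows without bound -> integral of exp(log_belief) diverges
```

Since the belief has no finite normalizer (let alone finite moments), the KL projection onto a Gaussian cannot be carried out at this node.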
To prevent this blockage, the authors propose two practical strategies:
- Non-persistent correction – Whenever a negative variance is detected, the offending message is either reverted to its previous value or clipped to a small positive variance. This allows the algorithm to continue without altering the overall EP fixed-point equations.
- Persistent correction – The update of a variable whose message would lead to a negative variance is temporarily suspended. The algorithm proceeds with updates of all other variables; once the covariance matrix has been sufficiently refreshed, the suspended update is retried. This "lazy" approach guarantees that the covariance C_{x|y} stays positive-definite throughout the iteration.
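The non-persistent strategy reduces to a small guard in the message update. The sketch below is a plausible reading of that fix; the function name, parameterization (precision and precision-times-mean), and the clip floor are all assumptions, not the paper's code:

```python
# Sketch of the non-persistent correction, in natural parameters
# (precision, precision * mean). EPS is an assumed clip floor.
EPS = 1e-8

def repair_message(new_prec, new_pm, prev_prec, prev_pm, mode="clip"):
    """Guard an EP message update against non-positive precision."""
    if new_prec > 0:
        return new_prec, new_pm            # healthy update: keep it
    if mode == "revert":
        return prev_prec, prev_pm          # fall back to previous message
    return EPS, new_pm                     # clip to a tiny positive precision

prec, pm = repair_message(-0.5, 0.3, 1.2, 0.1, mode="clip")
print(prec)  # 1e-08
```

The persistent variant would instead skip this variable's update entirely and revisit it after other variables have refreshed the covariance.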
Both strategies rely on the analytical result that if the belief at the likelihood factor is proper (which it always is), then the messages sent to the prior factors must have finite integrals. Consequently, the only way a negative variance can arise is through the internal KL projection step at the prior node.
Beyond these heuristic fixes, the paper introduces Analytic Continuation reV‑AMP (ACreV‑AMP). By extending the KL projection onto the complex plane, the Gaussian projection can be performed even when the natural variance would be negative. This analytic continuation effectively bypasses the need to modify the variance explicitly, while still producing a valid Gaussian approximation for the message.
The theoretical contributions are supported by rigorous matrix identities. Lemma 1 shows that, under sequential updates, the determinant of the posterior covariance C_{x|y} after an update can be expressed as a product involving the updated variance and the prior‑to‑posterior variance ratio. Theorem 1 proves, by induction and Sylvester’s criterion, that if all initial variances τ_{pₙ} are positive, then C_{x|y} remains positive‑definite for all subsequent iterations, guaranteeing that beliefs never become improper. The proofs heavily use the matrix inversion lemma and the matrix determinant lemma to track how a single diagonal entry change propagates through the covariance matrix.
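The determinant bookkeeping behind Lemma 1 can be verified numerically: changing one diagonal entry of a precision-like matrix by δ is a rank-one perturbation, and the matrix determinant lemma gives the new determinant in closed form. A minimal check (matrix sizes and values are arbitrary):

```python
import numpy as np

# Matrix determinant lemma for a single diagonal-entry change:
# det(P + delta * e_n e_n^T) = (1 + delta * [P^{-1}]_{nn}) * det(P).
rng = np.random.default_rng(1)
N = 5
B = rng.standard_normal((N, N))
P = B @ B.T + N * np.eye(N)      # positive-definite "precision" matrix
n, delta = 2, 0.7                # perturb entry (n, n) by delta
e = np.zeros(N); e[n] = 1.0

P_new = P + delta * np.outer(e, e)
Pinv = np.linalg.inv(P)

lhs = np.linalg.det(P_new)
rhs = (1.0 + delta * Pinv[n, n]) * np.linalg.det(P)
print(np.isclose(lhs, rhs))      # True
```

This is exactly the kind of identity that lets the proofs track how one variable's update changes det C_{x|y}, and hence positive-definiteness, one step at a time.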
From a computational standpoint, the authors exploit a low‑rank update for C_{x|y}, which reduces the per‑iteration cost to O(N²) when only one diagonal entry changes, while the full set of mean‑variance updates for all variables still scales as O(N). This makes the proposed methods suitable for high‑dimensional sparse recovery problems.
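The O(N²) refresh is the matrix inversion (Sherman-Morrison) lemma applied to the same rank-one perturbation: when only the n-th diagonal precision entry changes by δ, the covariance can be updated without a fresh O(N³) inversion. A sketch with illustrative names:

```python
import numpy as np

def rank_one_refresh(C, n, delta):
    """Return inv(inv(C) + delta * e_n e_n^T) in O(N^2) via Sherman-Morrison."""
    c_n = C[:, n]                                   # n-th column of C
    return C - delta * np.outer(c_n, c_n) / (1.0 + delta * C[n, n])

rng = np.random.default_rng(2)
N = 6
B = rng.standard_normal((N, N))
P = B @ B.T + N * np.eye(N)      # precision matrix
C = np.linalg.inv(P)             # covariance

n, delta = 3, 0.4
C_fast = rank_one_refresh(C, n, delta)               # O(N^2)
E = np.zeros((N, N)); E[n, n] = delta
C_slow = np.linalg.inv(P + E)                        # O(N^3) reference
print(np.allclose(C_fast, C_slow))                   # True
```

Only the outer product and a scalar divide are needed per update, which is what keeps the per-iteration cost quadratic rather than cubic in N.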
In summary, the paper identifies a subtle but critical failure mode of EP in linear models—negative‑variance Gaussian messages—and offers three complementary remedies: a simple clipping scheme, a lazy‑update scheme, and an analytic‑continuation‑based projection. Theoretical analysis guarantees that, with any of these fixes, the belief covariances stay positive‑definite, and empirical results (as reported) indicate faster convergence and greater robustness compared with the vanilla reV‑AMP, especially in regimes with strong sparsity or ill‑conditioned measurement matrices.