Zero-Temperature Limit of a Convergent Algorithm to Minimize the Bethe Free Energy


📝 Abstract

After the discovery that fixed points of loopy belief propagation coincide with stationary points of the Bethe free energy, several researchers proposed provably convergent algorithms to directly minimize the Bethe free energy. These algorithms were formulated only for non-zero temperature (thus finding fixed points of the sum-product algorithm), and their possible extension to zero temperature is not obvious. We present the zero-temperature limit of the double-loop algorithm by Heskes, which converges to a max-product fixed point. The inner loop of this algorithm is max-sum diffusion. Under certain conditions, the algorithm combines the complementary advantages of max-product belief propagation and max-sum diffusion (LP relaxation): it yields a good approximation of both ground states and max-marginals.

📄 Content

arXiv:1112.5298v1 [cs.CV] 22 Dec 2011

Center for Machine Perception, Czech Technical University in Prague
Research Report, ISSN 1213-2365

Zero-Temperature Limit of a Convergent Algorithm to Minimize the Bethe Free Energy
Tomáš Werner
CTU–CMP–2011–14, December 2011

Available at ftp://cmp.felk.cvut.cz/pub/cmp/articles/werner/Werner-TR-2011-14.pdf

The work has been supported by the European Commission project FP7-ICT-270138 and the Czech Grant Agency project P103/10/0783.

Research Reports of CMP, Czech Technical University in Prague, No. 14, 2011. Published by Center for Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University, Technická 2, 166 27 Prague 6, Czech Republic. Fax +420 2 2435 7385, phone +420 2 2435 7637, www: http://cmp.felk.cvut.cz

1 Introduction

Loopy belief propagation [17] is a well-known algorithm to approximate marginals of the Gibbs distribution defined by an undirected graphical model. For acyclic graphs, BP always converges and yields the exact marginals.
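The claim about acyclic graphs can be checked numerically on a toy case. The following is a minimal sketch, not code from the report: it assumes a three-variable chain with binary states, made-up log-potentials, and the standard Gibbs form in which the probability of an assignment is proportional to the exponential of the summed potentials. All names (`unary`, `pair`, `chain_sum_product`) are hypothetical. It compares sum-product messages against marginals obtained by brute-force enumeration.

```python
import itertools
import math

# Hypothetical toy model: a chain 0 - 1 - 2 of binary variables.
# The log-potentials below are made-up numbers for illustration only.
unary = [[0.2, -0.1], [0.0, 0.5], [-0.3, 0.3]]
pair = {
    (0, 1): [[0.8, -0.4], [-0.4, 0.6]],
    (1, 2): [[-0.5, 0.9], [0.7, -0.2]],
}
K = 2  # number of states per variable


def normalize(vec):
    """Scale a non-negative vector so that it sums to one."""
    total = sum(vec)
    return [p / total for p in vec]


def brute_force_marginals():
    """Exact marginals by enumerating every joint assignment."""
    n = len(unary)
    marg = [[0.0] * K for _ in range(n)]
    for x in itertools.product(range(K), repeat=n):
        score = sum(unary[v][x[v]] for v in range(n))
        score += sum(t[x[u]][x[v]] for (u, v), t in pair.items())
        weight = math.exp(score)  # unnormalized Gibbs weight
        for v, xv in enumerate(x):
            marg[v][xv] += weight
    return [normalize(m) for m in marg]


def chain_sum_product():
    """Sum-product BP on the chain: one forward and one backward sweep."""
    n = len(unary)
    fwd = [[1.0] * K for _ in range(n)]  # message reaching v from the left
    bwd = [[1.0] * K for _ in range(n)]  # message reaching v from the right
    for v in range(1, n):
        t = pair[(v - 1, v)]
        fwd[v] = normalize([
            sum(math.exp(unary[v - 1][a] + t[a][b]) * fwd[v - 1][a]
                for a in range(K))
            for b in range(K)
        ])
    for v in range(n - 2, -1, -1):
        t = pair[(v, v + 1)]
        bwd[v] = normalize([
            sum(math.exp(unary[v + 1][b] + t[a][b]) * bwd[v + 1][b]
                for b in range(K))
            for a in range(K)
        ])
    # Belief at v: local potential times both incoming messages.
    return [
        normalize([math.exp(unary[v][a]) * fwd[v][a] * bwd[v][a]
                   for a in range(K)])
        for v in range(n)
    ]


exact = brute_force_marginals()
bp = chain_sum_product()
max_gap = max(abs(p - q) for pe, pb in zip(exact, bp) for p, q in zip(pe, pb))
print(max_gap)  # largest absolute difference between BP and exact marginals
```

On a chain the two sweeps visit every message once, so the printed gap should be at the level of floating-point rounding. The same comparison on a graph with a cycle would generally show a nonzero gap.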
For graphs with cycles, BP is not guaranteed to converge, but when it does, it often yields surprisingly good approximations of the true marginals. One informal argument for this is that at a BP fixed point, marginals are exact in every sub-tree of the factor graph [23, 24]. Attempts to understand loopy BP have generated a large body of literature; see e.g. the survey [25].

BP has a modification, known as max-product BP, where summations are replaced with maximizations. In statistical mechanics terminology, this can be understood as the zero-temperature limit of ordinary BP. Max-product BP computes (or approximates) max-marginals rather than ordinary marginals.

After the discovery [34, 33] that BP fixed points coincide with stationary points of the Bethe free energy, several researchers proposed provably convergent algorithms to find a local minimum of the Bethe free energy [35, 28, 22, 5, 6]. These algorithms have been proposed only for sum-product BP, and their possible extension to max-product BP is not obvious.

We reformulate the double-loop algorithm [5] by Heskes such that taking its zero-temperature limit becomes straightforward, which results in an algorithm that always converges to a max-product BP fixed point. The inner loop of the algorithm is max-sum diffusion [13, 29, 31, 2]. We empirically observed that with a uniform initialization, the algorithm always yielded the same approximation of ground states that would be obtained by max-sum diffusion (or other algorithms for MAP inference based on LP relaxation, such as TRW-S [12]). Thus, it combines the complementary advantages of max-sum belief propagation and LP relaxation: unlike the former, it yields a good approximation of ground states and, unlike the latter, it yields a good approximation of max-marginals.

The text is organized as follows. We first (§2) review the basics of inference in graphical models.
We thoroughly discuss the zero-temperature limit of the Gibbs distribution and related quantities and how to obtain their approximation by variational inference. Then we review two basic cases of variational inference, with a convex free energy (§3) and with the Bethe free energy (§4). Then (§3.1, §4.1) we discuss their zero-temperature limits in detail. Finally (§5) we reformulate the double-loop algorithm [5] and modify it for zero temperature.

2 Gibbs distribution

Let $V$ be a set of variables, each variable $v \in V$ taking states $x_v$ from a finite domain $X_v$. An assignment to a variable subset $a \subseteq V$ is $x_a \in X_a$, where $X_a$ is the Cartesian product of the domains $X_v$ for $v \in a$. In particular, $x_V \in X_V$ is an assignment to all the variables. Let $E \subseteq 2^V$, thus $(V, E)$ is a hypergraph. Each variable $v \in V$ and hyperedge $a \in E$ is assigned a potential function $\theta_v\colon X_v \to \bar{\mathbb{R}}$ and $\theta_a\colon X_a \to \bar{\mathbb{R}}$, respectively, where $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty\}$. All numbers $\theta_v(x_v)$ and $\theta_a(x_a)$ are understood as a single vector $\theta \in \bar{\mathbb{R}}^I$ (or mapping $\theta\colon I \to \bar{\mathbb{R}}$) with $I = \{ (v, x_v) \mid v \in V,\ x_v \in X_v \} \cup \{ (a, x_a) \mid a \in E,\ x_a \in X_a \}$. The Gibbs probability distribution over the hypergraph (V, E
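As a small illustration of the notation in this section, the sketch below builds the index set of variable-state and hyperedge-assignment pairs for a toy hypergraph and stores the potentials as one mapping over that set. The variable names, domains, hyperedges, and numbers are hypothetical and chosen only for the example. The comment on the infinite value reflects the usual reading of such potentials and is an assumption here, since the excerpt stops before the distribution is defined.

```python
import itertools

# Hypothetical toy instance; names, domains and values are illustrative only.
domains = {"u": (0, 1), "v": (0, 1, 2), "w": (0, 1)}  # domain X_v per variable
hyperedges = [("u", "v"), ("v", "w")]                 # E, subsets of V

NEG_INF = float("-inf")  # potentials may take the value minus infinity


def assignments(a):
    """All assignments x_a: the Cartesian product of the domains in a."""
    return itertools.product(*(domains[v] for v in a))


# Index set I: pairs (v, x_v) for variables, then (a, x_a) for hyperedges.
index_set = [(v, x) for v in domains for x in domains[v]]
index_set += [(a, xa) for a in hyperedges for xa in assignments(a)]

# The vector theta as a mapping from I to the extended reals.
theta = {i: 0.0 for i in index_set}
# Assumed reading: minus infinity marks a joint state that is ruled out.
theta[(("u", "v"), (1, 2))] = NEG_INF

print(len(index_set))  # number of components of theta
```

Storing all potentials in one dictionary keyed by the index set mirrors the single-vector view in the text; an array-based layout would work equally well once each index is assigned a position.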
