Alternating Directions Dual Decomposition

We propose AD3, a new algorithm for approximate maximum a posteriori (MAP) inference on factor graphs based on the alternating directions method of multipliers. Like dual decomposition algorithms, AD3 uses worker nodes to iteratively solve local subproblems and a controller node to combine these local solutions into a global update. The key characteristic of AD3 is that each local subproblem has a quadratic regularizer, leading to faster consensus than subgradient-based dual decomposition, both theoretically and in practice. We provide closed-form solutions for these AD3 subproblems for binary pairwise factors and factors imposing first-order logic constraints. For arbitrary factors (large or combinatorial), we introduce an active set method which requires only an oracle for computing a local MAP configuration, making AD3 applicable to a wide range of problems. Experiments on synthetic and real-world problems show that AD3 compares favorably with the state of the art.


💡 Research Summary

The paper introduces AD3 (Alternating Directions Dual Decomposition), a novel algorithm for approximate MAP inference on factor graphs that leverages the alternating directions method of multipliers (ADMM). Traditional dual‑decomposition approaches optimize a Lagrangian dual by iteratively solving local sub‑problems and taking sub‑gradient steps on the dual variables, with a global “controller” enforcing consensus among the local solutions. While conceptually simple and widely applicable, sub‑gradient updates converge slowly and require careful step‑size tuning, especially when the underlying factors are high‑dimensional or combinatorial.

AD3 replaces the sub‑gradient update with an ADMM‑style alternating scheme. The global consensus vector z and each factor’s local copies x_i are linked by a quadratic penalty (ρ/2)‖x_i − z‖², subtracted from each local objective. At each iteration, the controller solves a simple averaging problem using the current local solutions and the dual variables, while each worker solves a regularized local MAP sub‑problem that includes the quadratic penalty. This regularization keeps each local solution close to the current global estimate, dramatically accelerating consensus formation.
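The alternating scheme can be illustrated with a minimal sketch of the outer loop. Everything below is illustrative, not the paper's implementation: each hypothetical worker exposes a solver for its quadratic‑regularized subproblem, and `memberships` records which global variables each factor touches.

```python
import numpy as np

def ad3_consensus(workers, n_vars, memberships, rho=1.0, iters=100):
    """Sketch of an AD3/ADMM-style consensus loop (illustrative names).

    workers[a](eta, rho) must return the maximizer of the a-th factor's
    regularized subproblem: score_a(x) - (rho/2)*||x - eta||^2.
    memberships[a] lists the global variable indices of factor a.
    """
    z = np.full(n_vars, 0.5)                       # global consensus vector
    lam = [np.zeros(len(m)) for m in memberships]  # dual variables per factor
    for _ in range(iters):
        # Worker step: each factor solves its quadratic-regularized MAP,
        # pulled toward the shifted target eta = z - lam/rho.
        xs = []
        for a, m in enumerate(memberships):
            eta = z[m] - lam[a] / rho
            xs.append(np.asarray(workers[a](eta, rho), dtype=float))
        # Controller step: average the local copies of each variable.
        num = np.zeros(n_vars)
        den = np.zeros(n_vars)
        for a, m in enumerate(memberships):
            num[m] += xs[a] + lam[a] / rho
            den[m] += 1.0
        z = num / den
        # Dual update: penalize disagreement between local copies and z.
        for a, m in enumerate(memberships):
            lam[a] += rho * (xs[a] - z[m])
    return z
```

At a fixed point all local copies agree with z, and the dual variables absorb the disagreement between each factor's preferred solution and the consensus.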

A major contribution of the work is the derivation of closed‑form solutions for two important families of factors: (1) binary pairwise factors and (2) factors that encode first‑order logical constraints (e.g., at‑most‑one, exactly‑one). For binary pairwise factors, the regularized sub‑problem reduces to a simple thresholding (clipping) operation. For logical constraints, the KKT conditions yield a piecewise‑linear solution that can be computed in near‑linear time (linear up to a sort) in the number of variables involved. These closed forms make AD3 extremely fast on many vision and NLP tasks where such factors are abundant.
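For the exactly‑one (XOR) constraint, the regularized sub‑problem amounts to a Euclidean projection onto the probability simplex. A standard sort‑based implementation (O(n log n); linear‑time selection‑based variants exist) might look like this sketch:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {z : z >= 0, sum(z) = 1} --
    the shape of the regularized subproblem for an exactly-one factor.
    Sort-based variant: find the largest k whose top-k entries stay
    positive after a uniform shift, then clip."""
    u = np.sort(v)[::-1]                 # entries in decreasing order
    css = np.cumsum(u)
    k = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[k] - 1.0) / (k + 1)     # uniform shift
    return np.maximum(v - theta, 0.0)
```

For example, projecting [0.2, 0.9, 0.5] shifts every coordinate by 0.2 and clips, giving [0.0, 0.7, 0.3], which sums to one.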

To handle arbitrary, potentially large or combinatorial factors, the authors propose an active‑set method. The method maintains a working set of candidate configurations for a factor and solves the regularized sub‑problem restricted to this set. If the solution violates optimality conditions, a “local MAP oracle” is called to retrieve the highest‑scoring configuration under the original (unregularized) factor potential. The new configuration is added to the working set, and the process repeats. Crucially, the algorithm never needs to enumerate the full exponential configuration space; it only requires the ability to compute a MAP assignment for the factor, which can be delegated to any existing exact or approximate MAP solver.

Theoretical analysis builds on standard ADMM convergence results. The authors prove that, under mild assumptions (convexity of the penalty term and boundedness of the factor potentials), AD3 attains an ε‑accurate solution in O(1/ε) iterations, which is asymptotically faster than the O(1/ε²) rate typical of sub‑gradient dual decomposition. They also discuss adaptive strategies for the penalty parameter ρ, showing that modest increases in ρ accelerate convergence without sacrificing solution quality.
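One widely used adaptive schedule for ρ is the residual‑balancing heuristic from the general ADMM literature (the paper discusses schedules in this spirit; the constants below are conventional defaults, not the paper's):

```python
def update_rho(rho, primal_res, dual_res, mu=10.0, tau=2.0):
    """Residual-balancing heuristic for ADMM's penalty parameter:
    grow rho when the primal residual (consensus violation) dominates,
    shrink it when the dual residual dominates."""
    if primal_res > mu * dual_res:
        return rho * tau
    if dual_res > mu * primal_res:
        return rho / tau
    return rho
```

Keeping the two residuals within a factor of `mu` of each other tends to avoid both sluggish consensus (ρ too small) and an over‑damped dual (ρ too large).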

Empirical evaluation spans synthetic benchmarks, image segmentation, dependency parsing, and relational inference. On synthetic data, AD3 reaches the same dual objective value as state‑of‑the‑art LP‑based dual decomposition but in roughly half the number of iterations. In image segmentation (pairwise Potts models with additional logical constraints), AD3 outperforms TRW‑S and a recent ADMM‑based method, achieving comparable energy while being 2–3× faster on average. In NLP parsing experiments, the logical‑constraint closed‑form solver enables AD3 to enforce tree‑structure constraints efficiently, yielding higher F‑scores than a baseline sub‑gradient method. Finally, for large combinatorial factors (e.g., higher‑order potentials in relational models), the active‑set approach reduces memory consumption dramatically and still converges to a solution within a few percent of the optimum.

In summary, AD3 unifies the scalability of dual decomposition with the rapid consensus properties of ADMM. By introducing a quadratic regularizer, providing closed‑form solvers for common factor types, and offering an active‑set framework for arbitrary factors, the algorithm delivers both theoretical guarantees and practical speedups across a wide range of MAP inference problems. Future directions suggested include extensions to non‑convex penalties, distributed implementations, and integration with learning pipelines where MAP inference is a sub‑routine.