Products of Weighted Logic Programs

Weighted logic programming, a generalization of bottom-up logic programming, is a well-suited framework for specifying dynamic programming algorithms. In this setting, proofs correspond to the algorithm’s output space, such as a path through a graph or a grammatical derivation, and are given a real-valued score (often interpreted as a probability) that depends on the real weights of the base axioms used in the proof. The desired output is a function over all possible proofs, such as a sum of scores or an optimal score. We describe the PRODUCT transformation, which can merge two weighted logic programs into a new one. The resulting program optimizes a product of proof scores from the original programs, constituting a scoring function known in machine learning as a “product of experts.” Through the addition of intuitive constraining side conditions, we show that several important dynamic programming algorithms can be derived by applying PRODUCT to weighted logic programs corresponding to simpler algorithms. In addition, we show how the computation of Kullback-Leibler divergence, an information-theoretic measure, can be interpreted using PRODUCT.


💡 Research Summary

The paper introduces a novel transformation, called PRODUCT, that merges two weighted logic programs (WLPs) into a single program whose proof scores are the product of the original scores. Weighted logic programming extends traditional bottom‑up logic programming by assigning real‑valued weights to axioms, allowing each derivation (proof) to be scored either additively or multiplicatively. In many applications—such as finding paths in a graph, parsing sentences, or evaluating probabilistic models—the desired output is a function over all possible proofs, for example the sum of scores (total probability) or the maximum score (optimal solution).
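The idea of scoring proofs "additively or multiplicatively" and then aggregating over all proofs can be made concrete with a semiring-parameterized solver. The following minimal sketch (not from the paper; the graph and weights are invented) evaluates a single-source reachability program, `path(Y) ⊕= path(X) ⊗ edge(X, Y)`, by naive bottom-up iteration to a fixpoint; instantiating it with the min-plus semiring recovers shortest-path costs:

```python
# Sketch of evaluating a weighted logic program over a semiring
# (⊕ = plus, ⊗ = times). The graph below is a made-up example.

def solve(edges, source, target, plus, times, zero, one, iters=50):
    """Naive bottom-up fixpoint for: path(source) = one;
       path(Y) ⊕= path(X) ⊗ edge(X, Y)."""
    nodes = {source, target} | {n for e in edges for n in e}
    val = {n: zero for n in nodes}
    val[source] = one
    for _ in range(iters):              # iterate toward the fixpoint
        new = dict(val)
        for (x, y), w in edges.items():
            new[y] = plus(new[y], times(val[x], w))
        if new == val:                  # converged
            break
        val = new
    return val[target]

edges = {("s", "a"): 1.0, ("a", "t"): 2.0, ("s", "t"): 5.0}

# Min-plus semiring: ⊕ = min, ⊗ = +, zero = ∞, one = 0
best = solve(edges, "s", "t", min, lambda a, b: a + b, float("inf"), 0.0)
print(best)  # 3.0, via s -> a -> t
```

Swapping in `plus=lambda a, b: a + b` and `times=lambda a, b: a * b` (the sum-product semiring, on an acyclic graph) would instead accumulate the total score over all proofs.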

The authors formalize PRODUCT as follows. First, the axiom sets of the two source WLPs are combined via a Cartesian product, producing a new set of composite axioms. The weight of each composite axiom is defined as the product of the weights of its components. Second, inference rules are also combined in a product fashion, so that a composite proof consists of a pair of proofs, one from each original program. Crucially, the transformation allows the insertion of side‑conditions—additional constraints that filter out proof pairs that are semantically inconsistent (e.g., mismatched variable bindings, incompatible structural patterns). These side‑conditions are what make the approach practical, because without them the combined proof space would explode exponentially.
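The axiom-combination step described above can be illustrated in a few lines. This hypothetical sketch (names and weights are illustrative, not drawn from the paper) forms the Cartesian product of two axiom sets, multiplies component weights, and uses a side condition to prune inconsistent pairs:

```python
# Illustration of PRODUCT's axiom combination: Cartesian product of the
# two axiom sets, composite weight = product of component weights, with
# a side condition filtering semantically inconsistent pairs.

def product_axioms(axioms1, axioms2, side_condition):
    out = {}
    for t1, w1 in axioms1.items():
        for t2, w2 in axioms2.items():
            if side_condition(t1, t2):   # keep only compatible pairs
                out[(t1, t2)] = w1 * w2  # multiply component weights
    return out

# Two toy axiom sets scoring (state, symbol) pairs; the side condition
# keeps only pairs that agree on the observed symbol.
a1 = {("q0", "the"): 0.5, ("q1", "cat"): 0.4}
a2 = {("r0", "the"): 0.8, ("r1", "dog"): 0.2}
joint = product_axioms(a1, a2, lambda t1, t2: t1[1] == t2[1])
print(joint)  # {(('q0', 'the'), ('r0', 'the')): 0.4}
```

Without the side condition, every one of the four pairs would survive, which is exactly the exponential blow-up the constraints are meant to prevent.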

From a machine‑learning perspective, the resulting scoring function is exactly a “product of experts” (PoE): each original WLP acts as an expert that assigns high scores to a subset of structures, and the combined program assigns high scores only to those structures that satisfy all experts simultaneously. This connection provides a logical‑programming foundation for PoE models, which have traditionally been expressed only in probabilistic graphical‑model terms.
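The product-of-experts intuition can be seen in a toy numeric example (assumed for illustration, not from the paper): a candidate must score well under every expert for its product score to survive.

```python
# Toy product of experts: combined score = product of expert scores,
# so only candidates that satisfy all experts score highly.

candidates = ["abc", "abd", "xyz"]
expert_a = {"abc": 0.9, "abd": 0.9, "xyz": 0.1}  # favors 'ab' prefixes
expert_b = {"abc": 0.8, "abd": 0.1, "xyz": 0.9}  # favors certain endings

poe = {c: expert_a[c] * expert_b[c] for c in candidates}
best = max(poe, key=poe.get)
print(best)  # 'abc' -- the only candidate rated highly by both experts
```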

The paper demonstrates the expressive power of PRODUCT through several concrete dynamic‑programming algorithms:

  1. Joint Viterbi for Two HMMs – Each hidden Markov model is encoded as a WLP. PRODUCT yields a program whose proofs correspond to pairs of state sequences, and the side‑condition forces the two sequences to be synchronized in time. Running Viterbi on the product program finds the most probable joint path, which is exactly the product‑of‑experts optimum.

  2. Grammar‑Constrained Shortest Path – A CYK parser for a context‑free grammar and a classic shortest‑path algorithm are each expressed as WLPs. By applying PRODUCT and a side‑condition that aligns grammar symbols with graph vertices, the authors obtain a program that simultaneously respects grammatical structure and minimizes path cost. This reproduces known algorithms for “constrained shortest path” problems while offering a uniform logical description.

  3. Combining Language Models – An n‑gram model and a neural language model are separately compiled into WLPs. PRODUCT creates a joint model whose score for a candidate sentence is the product of the two model probabilities, a principled PoE formulation that can improve translation or speech‑recognition rescoring.
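The joint-Viterbi construction in item 1 can be sketched as ordinary Viterbi run over the product state space of two HMMs, synchronized on a shared observation sequence. This is a hedged illustration of the idea, not the paper's derivation; the tiny HMM parameters below are invented:

```python
# Viterbi over the product state space of two HMMs: the joint score of
# a paired state sequence is the product of the two models' scores,
# i.e., the product-of-experts optimum for synchronized sequences.

import itertools

def viterbi_product(obs, hmm1, hmm2):
    """Most probable joint state-sequence pair under the product score."""
    states = list(itertools.product(hmm1["states"], hmm2["states"]))

    def emit(s, o):   # joint emission = product of component emissions
        return hmm1["emit"][s[0]][o] * hmm2["emit"][s[1]][o]

    def trans(s, t):  # joint transition = product of component transitions
        return hmm1["trans"][s[0]][t[0]] * hmm2["trans"][s[1]][t[1]]

    delta = {s: hmm1["init"][s[0]] * hmm2["init"][s[1]] * emit(s, obs[0])
             for s in states}
    back = []
    for o in obs[1:]:
        prev, ptr = delta, {}
        delta = {}
        for t in states:
            s_best = max(states, key=lambda s: prev[s] * trans(s, t))
            delta[t] = prev[s_best] * trans(s_best, t) * emit(t, o)
            ptr[t] = s_best
        back.append(ptr)
    path = [max(delta, key=delta.get)]   # backtrace from best final state
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

hmm = {"states": ["A", "B"],
       "init":  {"A": 0.6, "B": 0.4},
       "trans": {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}},
       "emit":  {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}}
print(viterbi_product(["x", "y"], hmm, hmm))
```

Here the time-synchronization side condition is built in implicitly: both component paths advance on the same observation at every step.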

Beyond algorithmic synthesis, the authors show that the Kullback‑Leibler (KL) divergence between two probability distributions can be expressed as a PRODUCT computation. By representing the two distributions as WLPs and adding a side‑condition that encodes the log‑ratio term, the objective D(P ‖ Q) = ∑_π P(π) log (P(π) / Q(π)) becomes a sum over paired proofs π, giving this information‑theoretic measure a uniform weighted‑logic‑programming reading.
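When the proof space is small enough to enumerate, the KL objective can be computed directly. The sketch below (an assumed toy example, not the paper's construction) sums the weighted log-ratio over a shared proof space, echoing how PRODUCT pairs the two programs' proofs:

```python
# KL divergence D(P || Q) = sum_pi P(pi) * log(P(pi) / Q(pi))
# over a shared, enumerable proof space given as dicts.

import math

def kl_divergence(p, q):
    return sum(pw * math.log(pw / q[proof])
               for proof, pw in p.items() if pw > 0.0)

p = {"path1": 0.7, "path2": 0.3}
q = {"path1": 0.5, "path2": 0.5}
print(kl_divergence(p, q))  # ~0.0823
```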

