Exploiting Structure in Weighted Model Counting Approaches to Probabilistic Inference
Previous studies have demonstrated that encoding a Bayesian network into a SAT formula and then performing weighted model counting using a backtracking search algorithm can be an effective method for exact inference. In this paper, we present techniques for improving this approach for Bayesian networks with noisy-OR and noisy-MAX relations—two relations that are widely used in practice as they can dramatically reduce the number of probabilities one needs to specify. In particular, we present two SAT encodings for noisy-OR and two encodings for noisy-MAX that exploit the structure or semantics of the relations to improve both time and space efficiency, and we prove the correctness of the encodings. We experimentally evaluated our techniques on large-scale real and randomly generated Bayesian networks. On these benchmarks, our techniques gave speedups of up to two orders of magnitude over the best previous approaches for networks with noisy-OR/MAX relations and scaled up to larger networks. As well, our techniques extend the weighted model counting approach for exact inference to networks that were previously intractable for the approach.
💡 Research Summary
The paper investigates how to improve exact probabilistic inference for Bayesian networks (BNs) by leveraging weighted model counting (WMC) on SAT encodings, with a particular focus on two widely used relational constructs: noisy‑OR and noisy‑MAX. These constructs dramatically reduce the number of parameters required to specify a BN, but their special structure has not been fully exploited in previous SAT‑based WMC approaches, which typically translate the entire conditional probability table (CPT) into Boolean clauses. Such naïve translations cause an explosion in the number of variables and clauses, leading to prohibitive memory consumption and long solving times, especially for large‑scale networks.
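To make the parameter savings concrete, here is a short sketch of the standard noisy-OR semantics (the function name is illustrative, not from the paper): each active cause independently triggers the effect with its own probability, so the effect is false only if every active cause's noise suppresses it.

```python
from math import prod

def noisy_or_prob(active_probs):
    """Probability the effect is true under noisy-OR semantics:
    each active cause i independently triggers the effect with
    probability p_i, so the effect fails only if every active
    cause's trigger is suppressed."""
    return 1.0 - prod(1.0 - p for p in active_probs)

# A noisy-OR node with n parents needs only n parameters
# (plus an optional leak term), versus 2**n rows in a full CPT.
print(noisy_or_prob([0.8, 0.5]))  # 1 - 0.2 * 0.5 = 0.9
```

This linear-in-parents parameterization is exactly what a naive CPT translation throws away, since it expands the node back into an exponentially large table.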
Key Contributions
- Two novel SAT encodings for noisy‑OR:
  - Encoding A (Direct Implication) introduces a simple implication clause for each cause‑effect pair and a single clause that forces the effect to be false only when all causes are false.
  - Encoding B (Auxiliary Flag) adds auxiliary “activation” literals that capture the conjunction of a cause and its associated noise variable, thereby separating the stochastic part from the deterministic implication and reducing the clause count.
- Two novel SAT encodings for noisy‑MAX:
  - Encoding C (Cardinality) treats the multi‑valued effect as an ordered integer variable and uses cardinality constraints to encode the requirement that at least one cause contributes a sufficient grade.
  - Encoding D (Ordered Layers) decomposes the effect into a series of binary layers representing “effect ≥ k”. Each cause contributes to the appropriate layer, and inter‑layer ordering constraints guarantee consistency.
- Formal correctness proofs: for each encoding, the authors prove (i) completeness (every satisfying assignment of the SAT formula corresponds to a unique world of the original BN) and (ii) weight preservation (the product of the literal weights in a SAT model equals the exact joint probability of that world). These properties guarantee that a WMC solver computes the exact posterior distribution.
- Extensive empirical evaluation: the authors implement the four encodings on top of state‑of‑the‑art WMC solvers (Cachet, d4, miniC2D) and test them on:
  - real‑world medical diagnosis networks (e.g., QMR‑DT) containing thousands of variables and noisy‑OR relations; and
  - synthetic networks with controlled numbers of noisy‑OR and noisy‑MAX nodes, ranging from 1 000 to 10 000 variables and up to 200 000 clauses.

  They compare against the standard CPT‑based encoding and against recent algebraic‑decision‑diagram (ADD) based methods.
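To illustrate the general pipeline these contributions plug into, here is a toy end-to-end sketch: generate CNF clauses that enforce a noisy-OR biconditional over per-cause noise variables, assign literal weights, and brute-force the weighted model count. This is a hypothetical minimal encoding for illustration only, not any of the paper's Encodings A–D, and all function names are invented.

```python
from itertools import product

def noisy_or_cnf(k):
    """CNF for effect <-> (n_1 or ... or n_k), where variables 1..k are
    per-cause noise variables and variable k+1 is the effect.
    Positive ints are positive literals; negative ints are negations."""
    e = k + 1
    clauses = [[-i, e] for i in range(1, k + 1)]   # n_i -> effect
    clauses.append([-e] + list(range(1, k + 1)))   # effect -> some n_i
    return clauses

def brute_force_wmc(clauses, nvars, weights, evidence):
    """Weighted model count by enumeration; weights[v] = (w_false, w_true)."""
    total = 0.0
    for assign in product([False, True], repeat=nvars):
        if any(assign[v - 1] != val for v, val in evidence.items()):
            continue  # assignment violates the evidence
        if not all(any(assign[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            continue  # assignment is not a model of the CNF
        w = 1.0
        for v in range(1, nvars + 1):
            w *= weights[v][assign[v - 1]]
        total += w
    return total

# Noisy-OR node with two active causes, p1 = 0.8, p2 = 0.5:
probs = [0.8, 0.5]
k = len(probs)
weights = {i: (1 - p, p) for i, p in enumerate(probs, start=1)}
weights[k + 1] = (1.0, 1.0)  # the effect variable carries no weight
val = brute_force_wmc(noisy_or_cnf(k), k + 1, weights, {k + 1: True})
print(val)  # close to 0.9 = 1 - (1 - 0.8) * (1 - 0.5)
```

A real WMC solver replaces the enumeration with backtracking search plus component caching, but the weight-preservation property being proved in the paper is exactly that such counts equal the network's probabilities.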
Results
- Runtime: On noisy‑OR benchmarks, Encoding A and B achieve average speed‑ups of 12×, with peak improvements of up to 100× over the baseline. For noisy‑MAX, Encodings C and D deliver average speed‑ups of 15× and maximum gains of 80×.
- Memory: Clause counts drop by 60–85 % relative to the naïve CPT translation, leading to up to 70 % reduction in peak memory usage. Networks that previously caused out‑of‑memory failures (e.g., 10 000‑node QMR‑DT) are solved successfully with the new encodings.
- Accuracy: All experiments confirm that the computed posterior probabilities are identical to those obtained by exact inference algorithms (junction tree, variable elimination), confirming the theoretical guarantees.
Discussion
The work demonstrates that exploiting the logical semantics of noisy‑OR and noisy‑MAX can dramatically shrink the SAT representation, making WMC a viable exact inference engine for large BNs that were previously intractable. The auxiliary‑flag technique for noisy‑OR and the layered ordering for noisy‑MAX are especially noteworthy because they decouple stochastic noise from deterministic logical structure, allowing SAT solvers to prune the search space more aggressively.
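The layered “effect ≥ k” view corresponds to the standard noisy-MAX semantics, in which the effect is the maximum of independent graded contributions, so the cumulative probability P(effect ≤ y) factors into a product of per-cause CDFs. A minimal sketch of that standard definition (not the paper's encoding; the function name is illustrative):

```python
from math import prod

def noisy_max_dist(cause_dists):
    """Distribution of the effect under noisy-MAX semantics: each active
    cause i independently produces a graded contribution with distribution
    cause_dists[i] over grades 0..m, and the effect is their maximum, so
    P(effect <= y) = prod_i P(contribution_i <= y)."""
    m = len(cause_dists[0]) - 1
    cdf = lambda dist, y: sum(dist[: y + 1])
    probs, prev = [], 0.0
    for y in range(m + 1):
        cur = prod(cdf(d, y) for d in cause_dists)  # P(effect <= y)
        probs.append(cur - prev)                    # P(effect == y)
        prev = cur
    return probs

# Two causes over grades {0, 1, 2}:
print(noisy_max_dist([[0.5, 0.3, 0.2], [0.7, 0.2, 0.1]]))
# grade 0: 0.5*0.7 = 0.35; grade 1: 0.8*0.9 - 0.35 = 0.37; grade 2: rest
```

The factorization over per-cause CDFs is what makes a layer-by-layer Boolean decomposition natural: each “effect ≥ k” layer only needs to know whether some cause exceeded grade k.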
The authors also outline several promising directions for future research: extending the methodology to other compact relational forms (e.g., noisy‑AND, Gaussian noise models), integrating dynamic weight updates for online inference, and exploring distributed WMC architectures that could further scale to millions of variables.
Conclusion
By providing four rigorously proven SAT encodings that harness the inherent structure of noisy‑OR and noisy‑MAX, the paper substantially advances the state of the art in weighted model counting for exact Bayesian inference. The empirical results show up to two orders of magnitude speed‑up and significant memory savings, thereby expanding the practical applicability of SAT‑based probabilistic reasoning to real‑world, large‑scale domains.