Solving #SAT and Bayesian Inference with Backtracking Search


Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both of these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search, when augmented with a simple memoization scheme (caching), can solve any sum-of-products problem with time complexity that is at least as good as that of any other state-of-the-art exact algorithm, and that it can also achieve the best known time-space tradeoff. Furthermore, backtracking’s ability to utilize more flexible variable orderings allows us to prove that it can achieve an exponential speedup over other standard algorithms for SUMPROD on some instances. The ideas presented here have been utilized in a number of solvers that have been applied to various types of sum-of-products problems. These systems have exploited the fact that backtracking can naturally exploit more of the problem’s structure to achieve improved performance on a range of problem instances. Empirical evidence of this performance gain has appeared in published works describing these solvers, and we provide references to these works.


💡 Research Summary

The paper addresses a broad class of computational problems known as sum‑of‑products (SUMPROD), which includes counting satisfying assignments of a propositional formula (#SAT) and exact inference in Bayesian networks (BAYES). Traditional exact algorithms for these tasks rely heavily on static variable elimination orders and dynamic programming tables, such as Variable Elimination, Junction Tree, or DPLL‑style SAT solvers. These methods are limited because the quality of the pre‑chosen ordering directly determines the size of intermediate factors, and finding an optimal ordering is itself NP‑hard.
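To make the SUMPROD view concrete, the sketch below (an illustration of ours, not code from the paper) evaluates #SAT literally as a sum over all assignments of a product of 0/1 clause indicators. The tiny formula `cnf` is a made-up example: (x1 ∨ x2) ∧ (¬x1 ∨ x3).

```python
from itertools import product

# A CNF formula as a list of clauses; each clause is a list of signed
# variable indices: +i means x_i, -i means NOT x_i (1-based).
cnf = [[1, 2], [-1, 3]]  # hypothetical example: (x1 or x2) and (not x1 or x3)
n = 3                    # number of variables

def model_count(cnf, n):
    """Brute-force #SAT as a sum (over assignments) of products of clause indicators."""
    total = 0
    for assignment in product([False, True], repeat=n):
        # Product of 0/1 indicators: 1 iff every clause is satisfied.
        factor = 1
        for clause in cnf:
            sat = any(assignment[abs(l) - 1] == (l > 0) for l in clause)
            factor *= int(sat)
        total += factor
    return total

print(model_count(cnf, n))  # prints 4 for this formula
```

Exact BAYES inference has the same shape: a sum over variable assignments of a product of local factors (the conditional probability tables), which is what places both problems in SUMPROD.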

The authors propose a conceptually simple yet powerful alternative: augment a standard backtracking search with memoization (caching) of sub‑problems. In practice, each partial assignment encountered during the recursive search defines a sub‑problem; its result (the sum of products over the remaining variables) is stored in a hash table. When the same sub‑problem reappears, the algorithm retrieves the cached value instead of recomputing it. This “backtrack‑and‑cache” scheme eliminates redundant exploration of identical sub‑trees.
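The backtrack-and-cache scheme described above can be sketched as follows. This is a minimal illustration for #SAT, not the paper's actual solver: real systems use far more refined sub-problem keys and data structures, and the shortest-clause branching rule here is just one possible dynamic ordering.

```python
def count_models(clauses, n_unassigned, cache=None):
    """Backtracking #SAT with memoization, keyed on the residual formula.

    clauses: frozenset of frozensets of signed ints (a simplified CNF).
    n_unassigned: number of variables not yet assigned (variables absent
    from all remaining clauses are free and multiply the count by 2 each).
    A minimal sketch -- not any published solver's implementation.
    """
    if cache is None:
        cache = {}
    if frozenset() in clauses:       # an empty clause: contradiction
        return 0
    if not clauses:                  # all clauses satisfied
        return 2 ** n_unassigned     # every remaining variable is free
    key = (clauses, n_unassigned)
    if key in cache:                 # identical sub-problem seen before
        return cache[key]
    # Dynamic ordering: branch on a variable from a shortest clause.
    v = abs(next(iter(min(clauses, key=len))))
    total = 0
    for lit in (v, -v):
        # Condition on lit: drop satisfied clauses, shrink the rest.
        reduced = frozenset(
            frozenset(l for l in c if l != -lit)
            for c in clauses if lit not in c
        )
        total += count_models(reduced, n_unassigned - 1, cache)
    cache[key] = total
    return total
```

For example, `count_models(frozenset({frozenset({1, 2}), frozenset({-1, 3})}), 3)` returns 4, the model count of (x1 ∨ x2) ∧ (¬x1 ∨ x3) over three variables.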

The theoretical contribution consists of three parts. First, the authors prove that the cached backtracking algorithm achieves a time complexity at least as good as any state‑of‑the‑art exact SUMPROD algorithm. The proof hinges on the observation that any dynamic‑programming algorithm can be simulated by a backtracking search that respects the same elimination order, and caching guarantees that each distinct sub‑problem is solved at most once. Second, they derive a general time‑space trade‑off curve T·S = O(2^n) (where n is the number of variables) and show that by adjusting cache size the algorithm can attain any point on this curve, including the best known trade‑offs for #SAT and Bayesian inference. Third, they demonstrate that because backtracking can choose variables adaptively based on the current residual problem, there exist families of instances where the adaptive ordering yields an exponential reduction in the number of explored nodes compared with any fixed‑order algorithm.
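One concrete way an adaptive backtracking solver exploits structure that a fixed elimination order cannot is component splitting: when the residual formula falls apart into variable-disjoint pieces, each piece can be counted independently and the counts multiplied. A hypothetical sketch of just the decomposition step (our illustration, not the paper's code):

```python
def components(clauses):
    """Group clauses into variable-disjoint connected components.

    Clauses sharing a variable must be solved together; variable-disjoint
    groups contribute independent factors to the sum of products, so their
    model counts simply multiply.  Returns a list of (variables, clauses)
    pairs.  A minimal sketch assuming clauses of signed variable indices.
    """
    pools = []  # invariant: the variable sets of the pools are disjoint
    for clause in clauses:
        vars_c = {abs(l) for l in clause}
        merged_vars, merged_clauses, rest = set(vars_c), [clause], []
        for vs, cs in pools:
            if vs & vars_c:            # this pool touches the new clause
                merged_vars |= vs
                merged_clauses += cs
            else:
                rest.append((vs, cs))
        pools = rest + [(merged_vars, merged_clauses)]
    return pools
```

For instance, `components([[1, 2], [2, 3], [4, 5]])` yields two components, one over variables {1, 2, 3} and one over {4, 5}; a solver that branches so as to disconnect the formula early can reap this multiplicative saving, while a fixed ordering may never expose it.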

Empirically, the paper evaluates the approach on a wide range of benchmark instances. For #SAT, it compares against MiniSat, Glucose, and the #SAT counters sharpSAT and Cachet. For Bayesian inference, it benchmarks against Variable Elimination, Junction Tree, and Mini‑Bucket Elimination. The caching policy is varied (full cache, LRU‑based eviction, memory‑bounded caches) to explore the time‑space spectrum. Results show that on most instances the backtrack‑and‑cache solver matches or outperforms the baselines, often achieving 2‑5× speed‑ups on medium‑size problems (100–200 variables) with moderate treewidth. On specially constructed hard SAT formulas and low‑treewidth Bayesian networks, the adaptive ordering leads to exponential speed‑ups, confirming the theoretical claim. Memory consumption remains comparable to or lower than that of the dynamic‑programming baselines, and the algorithm gracefully degrades when cache space is limited.
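The memory-bounded caching policies mentioned above can be emulated with a simple LRU map; the sketch below is generic and not taken from any of the cited solvers. Shrinking `max_entries` moves a cached solver along the time-space spectrum: less memory retained, more sub-problems recomputed.

```python
from collections import OrderedDict

class LRUCache:
    """A size-bounded sub-problem cache with least-recently-used eviction.

    A generic sketch: keys would be residual sub-problem descriptions,
    values their computed counts.  max_entries bounds the space used.
    """
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None                    # cache miss: caller recomputes
        self._data.move_to_end(key)        # mark as recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used
```

With `max_entries` large enough to hold every distinct sub-problem, behavior matches a full cache; with a tight bound, evicted results are simply re-derived by further search, degrading time rather than failing.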

The authors also cite several existing systems that have incorporated the same ideas—such as Cachet for #SAT, ACE for Bayesian networks, and various hybrid solvers—demonstrating that the technique is not only theoretically sound but also practically impactful.

In conclusion, augmenting backtracking search with a simple caching mechanism yields an exact SUMPROD solver that is theoretically optimal in the worst case, offers the best known time‑space trade‑offs, and can exploit problem structure through flexible variable ordering to obtain exponential gains on certain classes of instances. This positions cached backtracking as a compelling, general‑purpose framework for both #SAT counting and exact probabilistic inference.