Trading inference effort versus size in CNF Knowledge Compilation

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Knowledge Compilation (KC) studies the compilation of boolean functions f into some formalism F that allows all queries of a certain kind to be answered in polynomial time. Due to its relevance for SAT solving, we concentrate on the query type “clausal entailment” (CE), i.e., whether a clause C follows from f or not, and we consider subclasses of CNF, i.e., clause-sets F with special properties. In this report we do not allow auxiliary variables (except in the Outlook), and thus F needs to be equivalent to f. We consider the hierarchies UC_k <= WC_k, which were introduced by the authors in 2012. Each level allows CE queries. The first two levels are well-known classes for KC. Namely, UC_0 = WC_0 is the same as PI as studied in KC, that is, f is represented by the set of all prime implicates, while UC_1 = WC_1 is the same as UC, the class of unit-refutation complete clause-sets introduced by del Val in 1994. We show that for each k there are (sequences of) boolean functions with polysize representations in UC_{k+1}, but with an exponential lower bound on representations in WC_k. Such a separation was previously only known for k=0. We also consider PC < UC, the class of propagation-complete clause-sets. We show that there are (sequences of) boolean functions with polysize representations in UC, while there is an exponential lower bound for representations in PC. These separations are steps towards a general conjecture determining the representation power of the hierarchies PC_k < UC_k <= WC_k. The strong form of this conjecture also allows auxiliary variables, as discussed in depth in the Outlook.

💡 Research Summary

The paper investigates the trade‑off between inference effort and representation size in CNF knowledge compilation. It focuses on the query “clausal entailment” (CE), i.e., deciding whether a clause follows from a Boolean function f, and studies subclasses of CNF clause‑sets that allow polynomial‑time CE answering. The authors work without auxiliary variables (except in the Outlook) and require the compiled clause‑set F to be logically equivalent to f.

Three hierarchies of clause‑sets are central: UCₖ (unit‑refutation completeness at level k), PCₖ (propagation‑completeness at level k), and WCₖ (weak‑resolution completeness at level k). UC₀ = WC₀ coincides with the prime‑implicate (PI) representation, while UC₁ = WC₁ is the classic unit‑refutation complete class UC. The hierarchies are related by PCₖ ⊂ UCₖ ⊆ WCₖ, and each UCₖ is contained in UCₖ₊₁. The paper’s main conjecture (originally from earlier work) predicts that each step up the hierarchy yields strictly more succinct representations.
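For the level UC₁ = UC, clausal entailment reduces to unit propagation: F entails C exactly when applying the assignment that falsifies C and then propagating units produces the empty clause. The following is a minimal sketch of that test, assuming clauses are encoded as sets of nonzero integers in DIMACS style; the function names `assign`, `unit_propagate`, and `entails_by_up` are illustrative and not taken from the paper.

```python
from typing import FrozenSet, Iterable, Optional, Set

Clause = FrozenSet[int]  # assumption: DIMACS-style literals, e.g. 3 and -3


def assign(clauses: Iterable[Clause], lit: int) -> Set[Clause]:
    """Make `lit` true: drop satisfied clauses, delete -lit from the others."""
    return {c - {-lit} for c in clauses if lit not in c}


def unit_propagate(clauses: Iterable[Clause]) -> Optional[Set[Clause]]:
    """Return None if the empty clause is derived, else the reduced clause-set."""
    current = set(clauses)
    while True:
        if frozenset() in current:
            return None
        unit = next((c for c in current if len(c) == 1), None)
        if unit is None:
            return current
        (lit,) = unit
        current = assign(current, lit)


def entails_by_up(clauses: Iterable[Iterable[int]], clause: Iterable[int]) -> bool:
    """Sound for every clause-set; complete only if the clause-set is in UC."""
    reduced = {frozenset(c) for c in clauses}
    for lit in clause:
        reduced = assign(reduced, -lit)  # falsify every literal of the query clause
    return unit_propagate(reduced) is None


F = [{1, 2}, {-1, 2}]
print(entails_by_up(F, {2}))  # True: F entails the clause {2}
print(entails_by_up(F, {1}))  # False: F does not entail {1}
```

The test stays sound outside UC but loses completeness. For example, the unsatisfiable clause-set with all four clauses over two variables contains no unit clause, so unit propagation alone does not refute it.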

To prove these separations the authors introduce two technical tools: “doping” and “trigger hypergraphs”. Doping adds a fresh variable to each clause, turning every minimal premise set (MPS) of a clause‑set F into a unique prime implicate of the doped set D(F). This creates a tight correspondence between the combinatorial structure of F and the set of prime implicates of D(F). The authors focus on clause‑sets from the class SMU₍δ=1₎, i.e., minimally unsatisfiable hitting clause‑sets of deficiency 1, which have a tree‑like structure. By doping such tree‑clause‑sets they obtain families with an exponential number of prime implicates while preserving a simple description.
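The doping step can be illustrated on a tiny example. This sketch assumes that doping gives each clause its own new variable, numbered after the largest existing one; that numbering, the integer encoding of literals, and the helper names `dope` and `implies` are assumptions made for illustration. The brute-force `implies` check is exponential and only serves to confirm the example.

```python
from itertools import product
from typing import FrozenSet, Iterable, List


def dope(clauses: Iterable[Iterable[int]]) -> List[FrozenSet[int]]:
    """Add one fresh positive literal to every clause (assumed reading of D(F))."""
    original = [frozenset(c) for c in clauses]
    first_new = max((abs(l) for c in original for l in c), default=0) + 1
    return [c | {first_new + i} for i, c in enumerate(original)]


def implies(clauses: Iterable[Iterable[int]], clause: Iterable[int]) -> bool:
    """Brute-force entailment check over all total assignments."""
    cls = [frozenset(c) for c in clauses]
    target = frozenset(clause)
    variables = sorted({abs(l) for c in cls for l in c} | {abs(l) for l in target})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))

        def satisfied(c: FrozenSet[int]) -> bool:
            return any(value[abs(l)] == (l > 0) for l in c)

        if all(satisfied(c) for c in cls) and not satisfied(target):
            return False
    return True


# F = {{1}, {-1}} is minimally unsatisfiable; doping yields {{1, 2}, {-1, 3}}.
D = dope([{1}, {-1}])
print(D)
print(implies(D, {2, 3}))  # the clause made of the two doping variables follows
print(implies(D, {2}), implies(D, {3}))  # neither proper subclause follows
```

In this small case the minimally unsatisfiable set F corresponds to the single prime implicate {2, 3} of the doped set, which consists of exactly the doping variables of the clauses in F.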

The second tool, trigger hypergraphs, captures why short clauses are necessary at a given hardness level. For a clause C that is a prime implicate of F, the restricted clause‑set ϕ_C * F (where ϕ_C is the assignment falsifying C) must contain a clause of length ≤ k for k‑resolution (or rₖ propagation) to derive the empty clause. The matching number of the hypergraph then gives a lower bound on the number of clauses that any representation of hardness k must contain.
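The role of the matching number can be made concrete with a generic hypergraph routine. Any set of pairwise disjoint hyperedges is a matching, so its size is a lower bound on the matching number and hence, by the argument above, on the clause count. The sketch below computes such a bound greedily; it is not the paper's construction, and the encoding of hyperedges as sets of integer vertex labels is an assumption.

```python
from typing import FrozenSet, Iterable, List


def greedy_matching(hyperedges: Iterable[Iterable[int]]) -> List[FrozenSet[int]]:
    """Pick pairwise disjoint hyperedges greedily, smallest edges first.

    The result is a maximal matching, so its size never exceeds the
    matching number; it is a valid, possibly non-optimal, lower bound.
    """
    edges = [frozenset(e) for e in hyperedges if e]  # ignore empty hyperedges
    edges.sort(key=lambda e: (len(e), sorted(e)))  # deterministic order
    used = set()
    matching = []
    for edge in edges:
        if used.isdisjoint(edge):
            matching.append(edge)
            used |= edge
    return matching


example = [{1, 2}, {2, 3}, {3, 4}, {5}]
m = greedy_matching(example)
print(len(m), m)  # three pairwise disjoint hyperedges are found here
```

Computing the exact matching number of a hypergraph is hard in general, which is why a greedy bound is used in this illustration.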

Using these tools the authors prove:

Theorem 6.14: For every k ≥ 0 there exists a family of Boolean functions that has polynomial‑size representations in UCₖ₊₁ but for which every equivalent representation in WCₖ requires exponential size. This extends the previously known separation for k = 0 to all levels.
Theorem 8.1: There are Boolean functions with polynomial‑size representations in UC (= UC₁) for which every equivalent representation in PC has exponential size. Hence PC ⊂ UC is strict not only as an inclusion but also in representation power.

The proofs rely on constructing “extremal” SMU₍δ=1₎ tree‑clause‑sets, doping them, and analyzing the resulting trigger hypergraphs. The matching number grows linearly with the depth of the tree, which translates into an exponential lower bound for the weaker hierarchies.

Beyond the theoretical separations, the paper discusses practical implications for SAT solving. Clause‑sets in UCₖ allow efficient CE queries using the generalized unit‑propagation operator rₖ, but the runtime of rₖ grows as O(ℓ(F)·n(F)^{2k‑2}), reflecting the increased inference effort at higher levels. Conversely, PCₖ offers stronger propagation (no forced assignments beyond those detected by rₖ) at the cost of larger representations. The authors argue that the hierarchy provides a systematic way to balance representation compactness against query‑time complexity, which is crucial for designing SAT encodings and preprocessing techniques.
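The operator rₖ can be sketched recursively. This follows the usual definition of generalized unit propagation as recalled here, which should be checked against the paper: r₀ only detects the empty clause, and rₖ₊₁ sets a literal x to true whenever rₖ refutes the clause-set under x set to false. The integer encoding and the names `r` and `hardness_unsat` are assumptions, and this naive version makes no attempt to meet the runtime bound quoted above.

```python
from typing import FrozenSet, Iterable, Optional, Set

Clause = FrozenSet[int]
BOTTOM: Set[Clause] = {frozenset()}  # clause-set holding only the empty clause


def assign(clauses: Iterable[Clause], lit: int) -> Set[Clause]:
    """Make `lit` true: drop satisfied clauses, delete -lit from the others."""
    return {c - {-lit} for c in clauses if lit not in c}


def r(k: int, clauses: Iterable[Iterable[int]]) -> Set[Clause]:
    """Naive generalized unit propagation at level k (r(1, .) is unit propagation)."""
    current = {frozenset(c) for c in clauses}
    while True:
        if frozenset() in current:
            return set(BOTTOM)
        if k == 0:
            return current
        variables = sorted({abs(l) for c in current for l in c})
        # Look for a literal x whose negation is refuted one level down.
        forced = next(
            (x for v in variables for x in (v, -v)
             if r(k - 1, assign(current, -x)) == BOTTOM),
            None,
        )
        if forced is None:
            return current
        current = assign(current, forced)


def hardness_unsat(clauses: Iterable[Iterable[int]], max_k: int = 10) -> Optional[int]:
    """Smallest k <= max_k such that r(k, .) refutes the clause-set, else None."""
    cls = [frozenset(c) for c in clauses]
    return next((k for k in range(max_k + 1) if r(k, cls) == BOTTOM), None)


unsat = [{1, 2}, {-1, 2}, {1, -2}, {-1, -2}]
print(r(1, unsat) == BOTTOM)  # False: no unit clauses, level 1 is stuck
print(r(2, unsat) == BOTTOM)  # True: level 2 refutes the clause-set
print(hardness_unsat(unsat))  # 2
```

The example shows the trade-off in miniature: the four-clause set is out of reach for plain unit propagation, while one extra level of lookahead refutes it at a higher inference cost.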

The Outlook explores the impact of allowing auxiliary variables. Two notions are distinguished: the “relative” condition (auxiliary variables may appear only in the representation) and the “absolute” condition (auxiliary variables must be eliminated after queries). The authors conjecture that the same strict separations hold under the absolute condition, and they outline how the current constructions could be adapted.

Finally, the paper lists open problems: proving strictness of PCₖ ⊂ UCₖ for k ≥ 2, extending the separations to the auxiliary‑variable setting, and developing concrete compilation algorithms that target specific levels of the hierarchy.

In summary, the work provides a deep combinatorial analysis of CNF knowledge compilation hierarchies, establishes exponential separations between successive levels, and offers a nuanced view of the trade‑off between inference power and representation size—insights that are directly relevant to both theoretical research and practical SAT solver engineering.
