Unsatisfiable Linear k-CNFs Exist, for every k
We call a CNF formula linear if any two clauses have at most one variable in common. Let Linear k-SAT be the problem of deciding whether a given linear k-CNF formula is satisfiable. Here, a k-CNF formula is a CNF formula in which every clause has size exactly k. It was known that for k ≥ 3, Linear k-SAT is NP-complete if and only if an unsatisfiable linear k-CNF formula exists, and that they do exist for k ≥ 4. We prove that unsatisfiable linear k-CNF formulas exist for every k. Let f(k) be the minimum number of clauses in an unsatisfiable linear k-CNF formula. We show that f(k) is Ω(k·2^k) and O(4^k·k^4), i.e., minimum size unsatisfiable linear k-CNF formulas are significantly larger than minimum size unsatisfiable k-CNF formulas. Finally, we prove that, surprisingly, linear k-CNF formulas do not allow for a larger fraction of clauses to be satisfied than general k-CNF formulas.
💡 Research Summary
The paper investigates the existence and size of unsatisfiable linear k‑CNF formulas, where “linear” means any two clauses share at most one variable. The motivation stems from the known equivalence: for k ≥ 3, the decision problem Linear k‑SAT is NP‑complete if and only if there exists an unsatisfiable linear k‑CNF. Prior work had established such formulas for k ≥ 4, leaving the case k = 3 unresolved. The authors close this gap by proving that unsatisfiable linear k‑CNF formulas exist for every positive integer k.
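To make the definition concrete, here is a minimal Python sketch that checks linearity and, by brute force, satisfiability of a tiny formula. The encoding (signed integers for literals), the function names `is_linear` and `is_satisfiable`, and the six-clause 2‑CNF example are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations, product

def is_linear(formula):
    """True if every two clauses share at most one variable."""
    var_sets = [frozenset(abs(lit) for lit in clause) for clause in formula]
    return all(len(a & b) <= 1 for a, b in combinations(var_sets, 2))

def is_satisfiable(formula):
    """Brute-force check over all assignments; only sensible for tiny formulas."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        # A literal +v is true when v is True; a literal -v is true when v is False.
        if all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in formula):
            return True
    return False

# Hypothetical example with variables x=1, a=2, b=3, c=4, d=5.
# The first three clauses force x to be true, the last three force x to be false.
EXAMPLE_2CNF = [(1, 2), (-2, 3), (-3, 1), (-1, 4), (-4, 5), (-5, -1)]

print(is_linear(EXAMPLE_2CNF))       # True
print(is_satisfiable(EXAMPLE_2CNF))  # False
```

For k = 2 such a formula is easy to write down by hand; the difficulty the paper addresses is that the linearity constraint becomes much harder to reconcile with unsatisfiability as k grows.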
The existence proof is described as proceeding in two complementary ways. First, a probabilistic argument considers a random collection of m clauses over n variables, each clause built from k distinct variables with independently chosen random signs, and shows that with positive probability the formula is both linear and unsatisfiable. Unsatisfiability follows from a first‑moment calculation: a fixed assignment satisfies all m clauses with probability (1 − 2^{−k})^m, so the expected number of satisfying assignments is 2^n·(1 − 2^{−k})^m, which drops below one once m exceeds roughly ln 2 · n · 2^k. The clause count therefore has to grow with n. Linearity pulls in the opposite direction: two random k‑sets over n variables share two or more variables with probability O(k^4/n^2), which is small only when n is large relative to k, and the number of clause pairs that must avoid such overlaps grows quadratically in m. Balancing the two requirements determines the size of the resulting formula. Because an argument of this kind exhibits an unsatisfiable formula, it can only bound f(k) from above; the lower bound f(k) = Ω(k·2^k) requires a separate argument showing that every linear k‑CNF with too few clauses is satisfiable.
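The first‑moment calculation can be checked numerically. The sketch below assumes the standard random model in which each of the m clauses is drawn independently with uniformly random signs, so that a fixed assignment satisfies every clause with probability (1 − 2^{−k})^m. The function names are hypothetical, and the computation works in log space to avoid overflow for large n.

```python
import math

def log2_expected_solutions(n, k, m):
    """log2 of E[#satisfying assignments] = 2^n * (1 - 2^-k)^m."""
    return n + m * math.log2(1.0 - 2.0 ** -k)

def first_moment_threshold(n, k):
    """Smallest clause count m for which the expectation is below one."""
    # Solve n + m * log2(1 - 2^-k) < 0 for m.
    ratio = n / -math.log2(1.0 - 2.0 ** -k)
    return math.floor(ratio) + 1

n, k = 10, 3
m = first_moment_threshold(n, k)
print(m)                                     # 52
print(log2_expected_solutions(n, k, m) < 0)  # True
print(math.log(2) * n * 2 ** k)              # the approximation ln 2 * n * 2^k
```

The threshold grows linearly in n, which is why a fixed clause count cannot make the expectation small when the number of variables is increased.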
Second, the authors give an explicit constructive family using a recursive “lifting” technique. Starting from an unsatisfiable linear (k − 1)‑CNF, they introduce fresh auxiliary variables and extend each clause by a positive or a negative literal of a new variable, arranging the choices so that the linearity invariant (at most one shared variable per pair of clauses) is preserved. Each lifting step roughly quadruples the clause count while adding only a polynomial factor in k to the number of variables. Iterating the step up to clause size k yields an unsatisfiable linear k‑CNF with O(4^k·k^4) clauses, establishing the upper bound f(k) = O(4^k·k^4).
Having settled existence, the paper turns to whether linearity increases the fraction of clauses that can always be satisfied simultaneously. For a general k‑CNF, a uniformly random truth assignment satisfies each clause with probability 1 − 2^{−k}. Clauses that share variables are not independent, but linearity of expectation does not need independence: the expected satisfied fraction is exactly 1 − 2^{−k}, so some assignment satisfies at least that fraction of the clauses. The authors prove that linear k‑CNFs do not allow a better guarantee. Their argument models the set of clauses as a hypergraph with edges of size k and pairwise intersections of size at most one, and shows that linear formulas F exist in which no assignment satisfies substantially more than (1 − 2^{−k})·|F| clauses, where |F| is the total number of clauses. Consequently, the fraction of clauses that is guaranteed to be satisfiable is the same for linear formulas as for unrestricted ones; linearity affords no advantage in approximation.
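The 1 − 2^{−k} benchmark can be verified exhaustively on a small instance. The following sketch enumerates all assignments of a hypothetical four‑clause linear 3‑CNF (an illustrative example, not one from the paper) and reports both the average and the best satisfied fraction, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def satisfied_count(formula, value):
    """Number of clauses with at least one true literal under `value`."""
    return sum(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in formula)

def average_and_best_fraction(formula):
    """Average and maximum satisfied fraction over all assignments."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    total, best = 0, 0
    for bits in product([False, True], repeat=len(variables)):
        count = satisfied_count(formula, dict(zip(variables, bits)))
        total += count
        best = max(best, count)
    m = len(formula)
    return Fraction(total, m * 2 ** len(variables)), Fraction(best, m)

# Hypothetical linear 3-CNF: any two clauses share exactly one variable.
LINEAR_3CNF = [(1, 2, 3), (-1, 4, 5), (-2, -4, 6), (3, -5, -6)]

average, best = average_and_best_fraction(LINEAR_3CNF)
print(average)  # 7/8
print(best)     # 1
```

The average equals 1 − 2^{−3} = 7/8 regardless of how the clauses overlap, which is the linearity‑of‑expectation point. This small formula happens to be fully satisfiable; the paper's claim concerns carefully built linear formulas whose best fraction stays close to the average.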
The results have several implications. First, the existence of unsatisfiable linear k‑CNFs for all k confirms that Linear k‑SAT is NP‑complete for every k ≥ 3, completing the complexity classification of this restricted SAT variant. Second, the gap between the lower bound Ω(k·2^k) and the upper bound O(4^k·k^4) leaves ample room for refinement; narrowing this gap or determining the exact value of f(k) remains an open problem. Third, the negative result on satisfiable fraction indicates that structural restrictions like linearity, while simplifying certain algorithmic analyses, do not improve the worst‑case approximability of SAT. Future work may explore hybrid restrictions (e.g., bounded variable occurrence combined with linearity) to see whether they can yield better approximation guarantees or lead to new tractable subclasses.