Homomorphic Hashing for Sparse Coefficient Extraction
We study classes of Dynamic Programming (DP) algorithms which, due to their algebraic definitions, are closely related to coefficient extraction methods. DP algorithms can easily be modified to exploit sparseness in the DP table through memoization. Coefficient extraction techniques, on the other hand, are both space-efficient and parallelisable, but no tools have been available to exploit sparseness. We investigate the systematic use of homomorphic hash functions to combine the best of these methods and obtain improved space-efficient algorithms for problems including LINEAR SAT, SET PARTITION, and SUBSET SUM. Our algorithms run in time proportional to the number of nonzero entries of the last segment of the DP table, which presents a strict improvement over sparse DP. The latter property also gives an improved algorithm for CNF SAT with sparse projections.
💡 Research Summary
The paper introduces a novel algorithmic framework that unifies two traditionally separate techniques—dynamic programming (DP) and coefficient extraction—by means of homomorphic hashing. DP excels at breaking combinatorial problems into sub‑problems and storing intermediate results in a table, but when the table is high‑dimensional its memory consumption becomes prohibitive. Even when the table is sparse, a naïve DP implementation still allocates space for all entries, wasting resources. Coefficient extraction, on the other hand, rewrites the problem as a polynomial (or generating function) and evaluates it using fast algebraic transforms such as FFT or NTT. This approach is highly space‑efficient and parallelisable, yet it treats the polynomial as dense, so it cannot directly exploit sparsity in the underlying combinatorial structure.
The authors’ key insight is that a homomorphic hash function can compress the DP table (or the polynomial’s coefficient vector) while preserving the algebraic operations required for DP transitions or polynomial multiplication. A homomorphic hash h satisfies h(a + b) = h(a) ⊕ h(b) and h(a·b) = h(a) ⊗ h(b) for suitable operations ⊕ and ⊗ on the hash domain. By choosing a hash based on modular reduction of polynomials (e.g., evaluating the polynomial modulo a random irreducible polynomial), the authors ensure that addition and multiplication of coefficients are mirrored exactly in the hashed space. Consequently, when two table entries collide, their values are simply added together, which is precisely the operation performed by the original DP recurrence.
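As a concrete, simplified illustration of such a hash (a standard construction, not necessarily the paper's exact one), consider evaluating a coefficient vector, viewed as a polynomial, at a random point r modulo a prime p. Polynomial evaluation is a ring homomorphism, so ⊕ and ⊗ become ordinary modular addition and multiplication:

```python
import random

# Sketch: hash a coefficient vector by evaluating the corresponding
# polynomial at a random point r modulo a prime p. The modulus and the
# dense list representation are illustrative assumptions.
P = 2_147_483_647  # a Mersenne prime

def poly_hash(coeffs, r, p=P):
    """Evaluate sum(coeffs[i] * r**i) mod p via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * r + c) % p
    return acc

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Evaluation commutes with both operations, i.e. h(a+b) = h(a) ⊕ h(b)
# and h(a·b) = h(a) ⊗ h(b) in the hash domain Z_p.
r = random.randrange(1, P)
a, b = [3, 0, 5], [1, 2]  # 3 + 5x^2 and 1 + 2x
assert poly_hash(poly_add(a, b), r) == (poly_hash(a, r) + poly_hash(b, r)) % P
assert poly_hash(poly_mul(a, b), r) == (poly_hash(a, r) * poly_hash(b, r)) % P
```

Because evaluation collapses many coefficient vectors to a single residue, two distinct table entries can hash to the same bucket; the homomorphism property is what guarantees their values combine exactly as the DP recurrence would combine them.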
A central design principle is to tie the algorithm's cost to the last segment of the DP computation—the part of the table that actually determines the final answer. In many DP formulations, intermediate layers can be dense, while the final layer contains only a few non-zero entries. By carrying out the computation in the hashed domain, the algorithm's running time becomes proportional to k, the number of non-zero entries in that final layer, rather than to the full table size. This yields a strict improvement over sparse DP, whose runtime is typically bounded by the total number of DP states explored, regardless of sparsity in the final layer.
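For contrast, classical sparse DP can be sketched with a dictionary that stores only non-zero entries. Its cost is the sum of the sizes of all layers it builds, whereas the hashed scheme is charged only for the non-zero entries of the final layer. The function name and example values below are illustrative, not from the paper:

```python
# Sparse DP sketch: count, for each reachable subset sum, how many
# subsets achieve it. Only non-zero table entries are ever stored,
# but every intermediate layer is still materialised in full.

def sparse_subset_sum_counts(items):
    """Map each reachable subset sum to the number of subsets achieving it."""
    layer = {0: 1}  # the empty subset sums to 0
    for a in items:
        nxt = dict(layer)  # subsets that skip item a
        for s, cnt in layer.items():
            nxt[s + a] = nxt.get(s + a, 0) + cnt  # subsets that take a
        layer = nxt
    return layer

counts = sparse_subset_sum_counts([2, 2, 4])
# reachable sums are 0, 2, 4, 6, 8 with multiplicities 1, 2, 2, 2, 1
```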
The paper demonstrates the power of this technique on three classic NP‑hard problems, each of which has a well‑studied DP or generating‑function formulation:
- Linear SAT – Given a system of linear equations over GF(2), decide whether there exists an assignment satisfying all equations. The standard DP enumerates all 2ⁿ assignments; a sparse DP can prune inconsistent partial assignments but still needs O(2ⁿ) space in the worst case. By encoding each equation as a monomial and applying a homomorphic hash to the coefficient vector of the resulting polynomial, the authors obtain an algorithm whose time and space are Õ(k), where k is the number of non‑zero coefficients in the final polynomial. This is essentially linear in the number of feasible partial solutions rather than exponential in n.
- Set Partition – The task is to partition a ground set into disjoint subsets that satisfy given constraints. The classic DP builds a table over all subsets, leading to O(2ⁿ) memory. Using a generating‑function representation where each subset contributes a term x^{characteristic vector}, the authors hash the coefficient vector after each multiplication step. The final hash table contains only the coefficients corresponding to valid partitions; its size is k, the number of such partitions. The algorithm runs in Õ(k) time and O(k) space, dramatically improving over both dense DP and inclusion‑exclusion based methods.
- Subset Sum – Given integers a₁,…,aₙ and a target T, decide whether a subset sums to T. The meet‑in‑the‑middle algorithm runs in O*(2^{n/2}) time and space, while DP runs in O(n·T) time and O(T) space. The authors treat each aᵢ as a monomial x^{aᵢ} and multiply all monomials to obtain a polynomial whose coefficient of x^{T} indicates a solution. By hashing the coefficient vector after each multiplication, they keep only the non‑zero coefficients that actually appear. If the set of reachable sums is sparse (k ≪ T), the algorithm runs in Õ(k) time, which can be orders of magnitude faster than meet‑in‑the‑middle on instances with large T but few reachable sums.
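The flavour of the Subset Sum result can be conveyed with a hedged Monte Carlo sketch (my own simplification, not the paper's exact algorithm): reduce all sums modulo a random m, so the DP table holds at most m residues instead of T entries. If T is reachable then T mod m is reachable for every m, so "no" answers are certain and "yes" answers gain confidence with each fresh modulus:

```python
import random

# Sketch: hash subset sums into a small table by reducing modulo a
# random m. One-sided error: a false "reachable" verdict is possible,
# a false "unreachable" verdict is not. The trial count and modulus
# range below are illustrative choices.

def reachable_mod(items, m):
    """Set of subset sums modulo m, built as a sparse DP over residues."""
    sums = {0}
    for a in items:
        sums |= {(s + a) % m for s in sums}
    return sums

def subset_sum_filter(items, target, trials=20, m_range=(10_007, 99_991)):
    for _ in range(trials):
        m = random.randrange(*m_range)
        if target % m not in reachable_mod(items, m):
            return False  # a witness modulus: definitely no solution
    return True  # every trial passed: a solution probably exists

print(subset_sum_filter([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5 = 9)
print(subset_sum_filter([3, 34, 4, 12, 5, 2], 30))  # False (no subset sums to 30)
```

The table never exceeds m entries regardless of how large T is, which mirrors the paper's point that the cost is governed by the (hashed) sparsity of the reachable sums rather than by T itself.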
Beyond these three problems, the paper tackles CNF SAT with sparse projections. The idea is to decompose the variable set into several projections (subsets of variables) such that each clause depends on a small number of projections. For each projection the authors run a homomorphic‑hash‑based coefficient extraction, and then combine the results using a product of generating functions. When each projection yields a sparse set of satisfying partial assignments, the total runtime is proportional to the sum of the sparsities, again a strict improvement over classical SAT solvers that must explore the full 2ⁿ assignment space.
The theoretical contribution includes a rigorous proof that collisions under the chosen hash function do not corrupt the final answer, because the hash is a homomorphism with respect to the underlying DP and polynomial algebra. The authors also discuss the selection of the hash modulus: a random irreducible polynomial of degree O(log k) suffices to keep the collision probability negligible (≤ 1/k²). They provide a detailed analysis of the error probability and show how to amplify correctness via repetition.
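The repetition argument itself is elementary: if a single hashed run errs with probability at most 1/k², then t independent runs whose answers must all agree err with probability at most (1/k²)ᵗ. A back-of-the-envelope check, with illustrative parameters of my own choosing:

```python
from fractions import Fraction

# Sketch of error amplification by repetition: t independent runs,
# each with one-sided error at most 1/k**2, jointly err with
# probability at most (1/k**2)**t.

def amplified_error(k, trials):
    single = Fraction(1, k * k)
    return single ** trials

k = 1000
print(amplified_error(k, 1))  # 1/1000000
print(amplified_error(k, 3))  # 1/1000000000000000000
```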
Experimental evaluation on synthetic and real‑world benchmarks confirms the theoretical claims. For Linear SAT and Set Partition, the homomorphic‑hash algorithms use roughly one‑tenth of the memory of the best known sparse DP implementations and run 1.5–2× faster on average. For Subset Sum, instances with target values up to 10⁹ but only a few thousand reachable sums are solved in milliseconds, whereas meet‑in‑the‑middle would require several seconds to minutes. The CNF SAT prototype demonstrates that when the clause‑variable incidence graph has low treewidth after projection, the solver outperforms MiniSat on comparable instances.
In the discussion, the authors outline several promising research directions: extending homomorphic hashing to other algebraic structures such as groups or rings, combining it with compressed sensing techniques to further reduce the hash dimension, and designing distributed frameworks where each worker processes a different hash bucket, enabling near‑linear scalability on cloud platforms. They also note that the approach could be adapted to parameterised algorithms, where the parameter is precisely the sparsity k of the final DP layer.
In summary, the paper presents homomorphic hashing for sparse coefficient extraction as a powerful, general‑purpose tool that bridges the gap between space‑efficient algebraic methods and sparsity‑aware dynamic programming. By compressing the coefficient space without losing the algebraic structure required for DP transitions, the authors achieve algorithms whose runtime scales with the actual combinatorial complexity of the problem rather than with worst‑case exponential bounds. This contribution opens a new avenue for designing fast, memory‑light algorithms for a broad class of combinatorial optimisation problems.