Taming the Curse of Dimensionality: Discrete Integration by Hashing and Optimization
Integration is affected by the curse of dimensionality and quickly becomes intractable as the dimensionality of the problem grows. We propose a randomized algorithm that, with high probability, gives a constant-factor approximation of a general discrete integral defined over an exponentially large set. This algorithm relies on solving only a small number of instances of a discrete combinatorial optimization problem subject to randomly generated parity constraints used as a hash function. As an application, we demonstrate that with a small number of MAP queries we can efficiently approximate the partition function of discrete graphical models, which can in turn be used, for instance, for marginal computation or model selection.
💡 Research Summary
The paper tackles the notoriously hard problem of computing discrete integrals—sums of a non‑negative weight function w(σ) over an exponentially large set Σ (|Σ|=2ⁿ). Such sums appear as partition functions in graphical models, marginal calculations, and model‑selection criteria, and are #P‑complete to evaluate exactly. Traditional approaches either rely on sampling (which suffers from exponential sample complexity due to the curse of dimensionality) or on variational approximations (which lack provable guarantees).
The authors propose a randomized algorithm called WISH (Weighted‑Integrals‑and‑Sums‑By‑Hashing) that approximates the total weight W = Σ_{σ∈Σ} w(σ) within a constant factor with high probability, using only a polynomial number of calls to a MAP (maximum‑a‑posteriori) oracle. The key insight is to replace exhaustive counting with a small number of constrained optimization problems that are solved on randomly “hashed” sub‑spaces of Σ.
Hashing and parity constraints
A family of pairwise‑independent hash functions h_{A,b}(x)=Ax+b (mod 2) maps each binary configuration x∈{0,1}ⁿ to an i‑bit bucket, where A∈{0,1}^{i×n} and b∈{0,1}^i are chosen uniformly at random. The equation Ax = b (mod 2) defines a set of linear parity constraints, which can be expressed as XOR clauses. For a fixed i, the hash partitions Σ into 2^i buckets, each expected to contain 2^{n−i} configurations.
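This construction can be sketched in a few lines of Python. The sketch below (with illustrative names such as `sample_hash` and `h`, not taken from any released code) draws a random hash from the family and buckets all of {0,1}ⁿ, so the expected bucket size of 2^{n−i} can be checked directly on a small instance:

```python
import itertools
import random

def sample_hash(i, n, rng):
    """Draw A in {0,1}^(i x n) and b in {0,1}^i uniformly at random."""
    A = [[rng.randrange(2) for _ in range(n)] for _ in range(i)]
    b = [rng.randrange(2) for _ in range(i)]
    return A, b

def h(A, b, x):
    """h_{A,b}(x) = Ax + b (mod 2): an i-bit bucket label."""
    return tuple((sum(a * xk for a, xk in zip(row, x)) + bj) % 2
                 for row, bj in zip(A, b))

n, i = 10, 4
rng = random.Random(0)
A, b = sample_hash(i, n, rng)

# Hash every configuration; the bucket labelled all-zeros is exactly
# the solution set of the parity constraints Ax = b (mod 2).
buckets = {}
for x in itertools.product((0, 1), repeat=n):
    buckets.setdefault(h(A, b, x), []).append(x)

# There are at most 2^i buckets, 2^n configurations in total, and each
# bucket has expected size 2^(n-i) over the random draw of (A, b).
```
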
Optimization oracle
Given a particular hash (i.e., a specific parity constraint system), the algorithm asks the MAP oracle to find the configuration of maximum weight that satisfies the constraint:
σ_i^* = argmax_{σ: Aσ = b (mod 2)} w(σ).
Modern SAT/ILP solvers can handle such constrained maximization efficiently in many practical instances, even though the underlying problem is NP‑hard in the worst case.
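As a concrete (if naive) stand-in for such an oracle call, the constrained maximization can be mimicked by exhaustive search over the feasible set; a real implementation would instead hand the XOR constraints to a MaxSAT/ILP solver. All names below are illustrative:

```python
import itertools

def constrained_map(w, A, b, n):
    """argmax of w(sigma) over sigma in {0,1}^n with A sigma = b (mod 2).
    Brute force; a real WISH run delegates this query to a MAP solver."""
    best, best_val = None, float("-inf")
    for sigma in itertools.product((0, 1), repeat=n):
        # Check every parity (XOR) constraint: row . sigma = b_j (mod 2)
        if all(sum(a * x for a, x in zip(row, sigma)) % 2 == bj
               for row, bj in zip(A, b)):
            val = w(sigma)
            if val > best_val:
                best, best_val = sigma, val
    return best, best_val

# Toy weight function and one parity constraint: sigma_0 XOR sigma_2 = 1
w = lambda s: 1.0 + 2.0 * s[0] + 4.0 * s[1] + s[2]
sigma_star, val = constrained_map(w, A=[[1, 0, 1]], b=[1], n=3)
# sigma_star == (1, 1, 0), val == 7.0
```
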
Estimating quantiles
Define the decreasing step function G(u)=|{σ | w(σ) ≥ u}|. The total weight can be written as an integral W = ∫₀^∞ G(u) du, i.e., the area under the G‑curve. Let b_i denote the weight of the 2^i‑th heaviest configuration and slice the area at these quantiles: on the slice u ∈ (b_{i+1}, b_i], the curve satisfies 2^i ≤ G(u) ≤ 2^{i+1}, so the slice area lies between 2^i(b_i − b_{i+1}) and 2^{i+1}(b_i − b_{i+1}). Since the lower and upper bounds differ only by a factor of 2, if each quantile b_i can be approximated within a factor α, the whole sum W is approximated within a factor 2α.
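The 2-approximation afforded by this slicing can be sanity-checked numerically by brute force on toy weights (this check is not from the paper; the data is synthetic):

```python
import random

n = 6
rng = random.Random(1)
# 2^n toy weights, sorted so that weights[r] is the (r+1)-th heaviest
weights = sorted((rng.random() ** 3 for _ in range(2 ** n)), reverse=True)

W = sum(weights)                                  # the exact discrete integral
b = [weights[2 ** i - 1] for i in range(n + 1)]   # b_i = 2^i-th heaviest weight

# Summing the slice lower bounds (plus the single top configuration b_0)
# gives a provable 2-approximation of W:
LB = b[0] + sum(2 ** i * b[i + 1] for i in range(n))
assert LB <= W <= 2 * LB
```
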
The algorithm proceeds level by level (i = 0, …, n). At each level it draws T = Θ(log(n/δ)) independent random hash functions, solves the corresponding constrained MAP problem for each draw, and records the resulting maximum weight w_i^{(t)}. The median M_i of these T values serves as the estimator of b_i. Pairwise independence of the hash functions guarantees that each individual trial falls within a constant factor of the true quantile with probability strictly greater than 1/2; a Chernoff‑type bound on the median of the T trials then ensures that, with probability at least 1−δ, every M_i lies within a constant factor of the true b_i.
Final estimator
The algorithm returns
\hat{W} = M_0 + Σ_{i=0}^{n−1} M_{i+1} 2^i,
which is algebraically identical to 2^n M_n + Σ_{i=0}^{n−1} 2^i (M_i − M_{i+1}): the sum of the slice lower bounds computed from the estimated quantiles, plus the final slab of height M_n spanning all 2^n configurations. The analysis shows that \hat{W} is a constant‑factor approximation of the true partition function with probability at least 1−δ.
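Putting the hashing, optimization, and median steps together, a brute-force toy implementation of the WISH loop might look as follows. The exhaustive inner maximization stands in for the MAP solver call, and all names are illustrative; the toy weight function factorizes over independent variables so the exact partition function is known in closed form:

```python
import itertools
import math
import random
import statistics

def wish(w, n, T=7, seed=0):
    """Estimate W = sum over sigma in {0,1}^n of w(sigma).
    Brute-force sketch: the inner max would be a MAP solver call."""
    rng = random.Random(seed)
    configs = list(itertools.product((0, 1), repeat=n))
    M = []
    for i in range(n + 1):          # level i: i random parity constraints
        vals = []
        for _ in range(T):
            A = [[rng.randrange(2) for _ in range(n)] for _ in range(i)]
            b = [rng.randrange(2) for _ in range(i)]
            vals.append(max((w(s) for s in configs
                             if all(sum(a * x for a, x in zip(row, s)) % 2 == bj
                                    for row, bj in zip(A, b))), default=0.0))
        M.append(statistics.median(vals))
    # \hat{W} = M_0 + sum_i M_{i+1} 2^i
    return M[0] + sum(M[i + 1] * 2 ** i for i in range(n))

# Independent-variable toy model: the true partition function is
# prod_k (1 + theta_k), here 1.5 * 3 * 2 * 4 * 1.25 = 45.
theta = [0.5, 2.0, 1.0, 3.0, 0.25]
w = lambda s: math.prod(t if x else 1.0 for t, x in zip(theta, s))
true_W = math.prod(1 + t for t in theta)
est = wish(w, n=len(theta))
# est is typically within a small constant factor of true_W
```
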
Complexity
The total number of MAP calls is Θ(n log(n/δ)): n + 1 levels, each with T = Θ(log(n/δ)) repetitions. The calls are independent, enabling massive parallelism. Given the MAP oracle, the overall runtime is polynomial in n, placing the algorithm in the class BPP^NP, consistent with Stockmeyer's classical result that #P problems can be approximated by a randomized polynomial‑time algorithm equipped with an NP oracle.
Experimental validation
The authors evaluate WISH on three families of problems:
- Random clique‑structured Ising models, where the exact partition function can still be computed for small instances. WISH achieves an average relative error below 20 % while issuing a number of MAP queries that is orders of magnitude smaller than the number of configurations exhaustive enumeration would have to visit.
- Grid Ising models with known ground truth, demonstrating that WISH remains accurate even when standard MCMC fails to mix.
- Sudoku puzzles modeled as a binary CSP; here the partition function counts the number of valid completions. WISH successfully estimates counts ranging from 10³ to 10⁵, a regime where belief propagation and mean‑field methods break down.
The experiments also illustrate the “any‑time” nature of the method: early termination after a few levels already yields a coarse estimate, and additional levels progressively refine the result. Parallel implementations on multi‑core CPUs and GPUs show near‑linear speed‑up.
Implications and conclusions
WISH demonstrates that random parity‑based hashing, combined with powerful MAP solvers, provides a practical and theoretically sound route to approximate high‑dimensional discrete sums. It bridges the gap between counting (#P) and optimization (NP) by showing that a polynomial number of optimization queries suffices for a constant‑factor approximation. This opens new possibilities for tasks that require partition‑function estimates—such as marginal inference, learning of graphical model parameters, and model selection—especially in domains where MAP inference is already well‑supported (e.g., computer vision, natural language processing, combinatorial design). The method’s embarrassingly parallel structure and anytime behavior make it attractive for large‑scale deployments.