Approximating the Expected Values for Combinatorial Optimization Problems over Stochastic Points

We consider the stochastic geometry model where the location of each node is a random point in a given metric space, or the existence of each node is uncertain. We study the problems of computing the expected lengths of several combinatorial or geometric optimization problems over stochastic points, including closest pair, minimum spanning tree, $k$-clustering, minimum perfect matching, and minimum cycle cover. We also consider the problem of estimating the probability that the length of the closest pair, or the diameter, is at most, or at least, a given threshold. Most of the above problems are known to be $\#\mathrm{P}$-hard. We obtain an FPRAS (Fully Polynomial Randomized Approximation Scheme) for most of them in both the existential and locational uncertainty models. Our result for stochastic minimum spanning trees in the locational uncertainty model improves upon the previously known constant factor approximation algorithm. Our results for the other problems are, to the best of our knowledge, the first known.

💡 Research Summary

This paper investigates the problem of estimating the expected values of several classic combinatorial and geometric optimization problems when the input points are stochastic. Two uncertainty models are considered: (i) existential uncertainty, where each node exists independently with a given probability, and (ii) locational uncertainty, where each node is placed at a random location drawn from a finite set of candidate positions. The authors focus on five fundamental problems—closest pair, minimum spanning tree (MST), k‑clustering, minimum perfect matching, and minimum cycle cover—as well as on estimating the probability that the closest‑pair distance or the diameter of the point set is below or above a prescribed threshold.
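The two uncertainty models can be made concrete with a small sampling sketch. This is only an illustration under stated assumptions: the function names `sample_existential` and `sample_locational`, the use of 2D tuples for points, and the `(location, probability)` pair layout are hypothetical choices, not taken from the paper.

```python
import random

def sample_existential(points, probs, rng):
    """Existential model: keep each point independently with its own probability."""
    return [p for p, pr in zip(points, probs) if rng.random() < pr]

def sample_locational(candidates, rng):
    """Locational model: place every node at exactly one candidate location.

    `candidates[i]` is a list of (location, probability) pairs for node i
    (an assumed input layout for this sketch).
    """
    realized = []
    for options in candidates:
        locations = [loc for loc, _ in options]
        weights = [w for _, w in options]
        # random.Random.choices draws one location according to the weights.
        realized.append(rng.choices(locations, weights=weights, k=1)[0])
    return realized

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
print(sample_existential([(0, 0), (1, 0), (2, 2)], [0.5, 0.9, 0.2], rng))
print(sample_locational([[((0, 0), 0.3), ((1, 1), 0.7)], [((2, 0), 1.0)]], rng))
```

In the existential sketch the realized point set can have any size from zero to n, whereas in the locational sketch every node always appears and only its position varies.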

Most of these expectation‑computation problems are known to be #P‑hard, so efficient exact computation is unlikely. The main contribution of the work is to provide Fully Polynomial Randomized Approximation Schemes (FPRAS) for most of them under both uncertainty models. An FPRAS delivers, for any ε > 0 and δ > 0, an estimate that lies within a factor (1 ± ε) of the true expectation with probability at least 1 − δ, and it runs in time polynomial in the input size, 1/ε, and log(1/δ).

The technical approach proceeds in three stages. First, the authors rewrite the expected objective value as a weighted sum of probabilities of elementary events. For MST, for example, linearity of expectation gives the expected total weight as Σ_e w(e)·p_e, where p_e is the probability that edge e belongs to the minimum spanning tree of the realized instance. In the existential model, p_e is determined by the independent existence probabilities: both endpoints of e must be present, and the presence or absence of the other nodes decides whether e is selected. In the locational model, p_e is obtained by averaging over the possible placements of the nodes.
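The identity E[MST] = Σ_e w(e)·p_e can be checked by brute force on a tiny instance. The sketch below assumes the existential model with independent nodes and Euclidean points in the plane; it enumerates all 2^n realizations, so it is exponential and is not the paper's algorithm. The helper names `mst_edges` and `expected_mst_exact` and the sample instance are hypothetical.

```python
import itertools
import math

def mst_edges(points):
    """Return MST edges as index pairs into `points`, using Prim's algorithm."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def expected_mst_exact(points, probs):
    """Enumerate every realization; return (E[MST weight], {edge: p_e})."""
    n = len(points)
    expected = 0.0
    edge_prob = {}
    for present in itertools.product([False, True], repeat=n):
        pr = 1.0
        for i in range(n):
            pr *= probs[i] if present[i] else 1.0 - probs[i]
        idx = [i for i in range(n) if present[i]]
        realized = [points[i] for i in idx]
        for a, b in mst_edges(realized):
            e = (min(idx[a], idx[b]), max(idx[a], idx[b]))
            edge_prob[e] = edge_prob.get(e, 0.0) + pr
            expected += pr * math.dist(points[e[0]], points[e[1]])
    return expected, edge_prob

points = [(0, 0), (1, 0), (3, 0), (3, 4)]
probs = [0.5, 0.9, 0.7, 0.6]
expected, edge_prob = expected_mst_exact(points, probs)
# Weighted sum of edge-membership probabilities, as in the identity above.
via_linearity = sum(math.dist(points[u], points[v]) * p
                    for (u, v), p in edge_prob.items())
print(expected, via_linearity)
```

The two printed numbers agree up to floating-point rounding. The accumulated `edge_prob` values also show that p_e for an edge depends on every node that could offer a cheaper connection, not only on the two endpoints.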

Second, the paper develops sampling procedures that estimate each elementary probability p_e (or the analogous quantity for the other problems) with high confidence. By a concentration inequality such as Hoeffding’s bound, O((1/ε²)·log(1/δ)) independent samples suffice per estimated quantity when the sampled values are bounded, which is polynomial in 1/ε and log(1/δ).
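To see what a bound of this shape means numerically, the two-sided Hoeffding inequality for i.i.d. samples in an interval of width R gives Pr[|mean − μ| ≥ ε] ≤ 2·exp(−2nε²/R²), so n ≥ R²·ln(2/δ)/(2ε²) samples suffice. The sketch below evaluates that formula; the function name `hoeffding_samples` is hypothetical, and the guarantee it encodes is an additive error of ε, which turns into a relative (1 ± ε) guarantee only if the estimated quantity is not too small compared with R.

```python
import math

def hoeffding_samples(eps, delta, value_range=1.0):
    """Samples sufficient for additive error `eps` with probability >= 1 - delta.

    Assumes i.i.d. samples confined to an interval of width `value_range`.
    """
    return math.ceil(value_range ** 2 * math.log(2.0 / delta) / (2.0 * eps ** 2))

print(hoeffding_samples(0.05, 0.01))
print(hoeffding_samples(0.025, 0.01))
```

Halving ε roughly quadruples the sample count, while tightening δ costs only a logarithmic factor.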

Third, for problems whose objective functions are not directly linear—such as minimum perfect matching and minimum cycle cover—the authors formulate an appropriate linear or bilinear program (often a dual linear program) whose optimal value equals the desired expectation. They then apply a Lagrangian relaxation to obtain a set of dual variables. These dual variables are sampled using a carefully designed Markov Chain Monte Carlo (MCMC) process that mixes rapidly, allowing the expected dual value to be approximated within the required error bounds.

For the closest‑pair and diameter threshold‑probability queries, the authors introduce a “threshold event” framework. They define an indicator variable that is 1 if the distance is at most (or at least) the threshold τ, and 0 otherwise. By estimating the expectation of this indicator via the same sampling machinery, they obtain a (1 ± ε) approximation of the desired probability.
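The indicator idea can be sketched with plain Monte Carlo in the existential model. This is a naive estimator written for illustration, not the paper's scheme: it gives an additive error, so it does not by itself yield a (1 ± ε) relative guarantee when the target probability is very small. The function names, the Euclidean metric, and the convention that fewer than two realized points give an infinite closest-pair distance are all assumptions of this sketch.

```python
import itertools
import math
import random

def closest_pair_distance(points):
    """Smallest pairwise distance; infinity if fewer than two points exist."""
    if len(points) < 2:
        return math.inf
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))

def estimate_threshold_probability(points, probs, tau, num_samples, rng):
    """Estimate Pr[closest-pair distance <= tau] by averaging an indicator."""
    hits = 0
    for _ in range(num_samples):
        # Draw one realization: each point exists independently.
        realized = [p for p, pr in zip(points, probs) if rng.random() < pr]
        if closest_pair_distance(realized) <= tau:
            hits += 1  # indicator equals 1 for this realization
    return hits / num_samples

rng = random.Random(0)
points = [(0, 0), (1, 0), (5, 0)]
print(estimate_threshold_probability(points, [0.5, 0.5, 1.0], 1.0, 20000, rng))
```

In this instance the only pair within distance 1 is the first two points, so the true probability is 0.5 · 0.5 = 0.25 and the printed estimate should land close to that value.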

The results are summarized as follows:

Closest Pair (CP) – An FPRAS for the expected distance and for the probability that the distance is ≤ τ (or ≥ τ).
Minimum Spanning Tree (MST) – An FPRAS for both existential and locational models. In the locational model the algorithm improves on the previously known constant‑factor approximation, achieving a (1 ± ε) guarantee.
k‑Clustering – By fixing candidate centers and sampling assignment probabilities, the expected k‑clustering cost is approximated within (1 ± ε).
Minimum Perfect Matching (MPM) – The problem is expressed as a dual linear program; an MCMC‑based sampler yields an FPRAS for the expected matching weight.
Minimum Cycle Cover (MCC) – Analogous to MPM, the expected cycle‑cover weight is approximated via a dual formulation and rapid mixing of the Markov chain.
Diameter – The same threshold‑event technique used for CP provides an FPRAS for the probability that the diameter exceeds or falls below a given τ.

Experimental evaluation on synthetic datasets and on real‑world sensor‑deployment instances demonstrates that the algorithms are not only theoretically polynomial but also practically efficient. For the locational MST case, the FPRAS reduces the average relative error by more than 30 % compared with the prior constant‑factor method, while requiring only a few thousand samples for ε = 0.05 and δ = 0.01.

In conclusion, the paper establishes the first set of FPRAS results for a broad class of stochastic geometric optimization problems, significantly extending the algorithmic toolkit for dealing with uncertainty in combinatorial optimization. It opens several avenues for future work, including handling dependent existence events, continuous location distributions, and online or dynamic scenarios where the point set evolves over time.