Approximating the least hypervolume contributor: NP-hard in general, but fast in practice
The hypervolume indicator is an increasingly popular set measure for comparing the quality of two Pareto sets. The basic ingredient of most hypervolume-indicator-based optimization algorithms is the calculation of the hypervolume contribution of single solutions relative to a Pareto set. We show that exact calculation of the hypervolume contribution is #P-hard while its approximation is NP-hard. The same holds for the calculation of the minimal contribution. We also prove that it is NP-hard to decide whether a solution has the least hypervolume contribution. Even deciding whether the contribution of a solution is at most $(1+\varepsilon)$ times the minimal contribution is NP-hard. This implies that it is neither possible to efficiently find the least contributing solution (unless $P = NP$) nor to approximate it (unless $NP = BPP$). Nevertheless, in the second part of the paper we present a fast approximation algorithm for this problem. We prove that for any given $\varepsilon,\delta>0$ it calculates a solution with contribution at most $(1+\varepsilon)$ times the minimal contribution with probability at least $(1-\delta)$. Though it cannot run in polynomial time for all instances, it performs extremely fast on various benchmark datasets. The algorithm solves very large problem instances that are intractable for exact algorithms (e.g., 10000 solutions in 100 dimensions) within a few seconds.
💡 Research Summary
The paper investigates the computational difficulty of determining the least hypervolume contributor in a Pareto set and proposes a practical approximation algorithm that works efficiently on large instances. The hypervolume indicator, which measures the volume of the space dominated by a set of solutions, is a cornerstone of many multi‑objective optimization methods. Central to these methods is the hypervolume contribution of each individual solution: the amount of hypervolume that would be lost if that solution were removed.
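To make the notion concrete, here is a short Python sketch (not taken from the paper) that computes hypervolume contributions exactly in the easy two-dimensional minimization case; the front and reference point below are invented for illustration.

```python
def hypervolume_2d(points, ref):
    """Exact hypervolume (minimization) of the region dominated by
    `points` and bounded by the reference point `ref`, computed with
    a simple left-to-right sweep over the sorted front."""
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(points):        # ascending in the first objective
        if y < prev_y:                 # point adds a new rectangular slab
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

def contribution(points, i, ref):
    """Hypervolume lost if points[i] is removed from the set."""
    rest = points[:i] + points[i + 1:]
    return hypervolume_2d(points, ref) - hypervolume_2d(rest, ref)

# Invented example front (minimization) and reference point.
front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
ref = (5.0, 5.0)
contribs = [contribution(front, i, ref) for i in range(len(front))]
# contribs == [1.0, 4.0, 1.0]
```

In higher dimensions no such simple sweep exists, which is exactly where the hardness results below take hold.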
First, the authors establish a series of hardness results. They prove that computing the exact hypervolume contribution of a single solution is #P‑hard, extending the known #P‑hardness of the overall hypervolume calculation. More strikingly, they show that even approximating this contribution within any constant factor is NP‑hard. Consequently, the decision problem “does a given solution have the smallest contribution?” is NP‑hard, and the related problem “is the contribution of a solution at most (1 + ε) times the minimal contribution?” remains NP‑hard for any ε > 0. These results imply that, unless P = NP, no polynomial‑time algorithm can reliably identify the least contributing solution, and unless NP = BPP, no polynomial‑time randomized algorithm can guarantee a (1 + ε)‑approximation with high confidence.
Despite these theoretical barriers, the second part of the paper introduces a fast, probabilistic approximation scheme. The algorithm takes two user‑specified parameters ε > 0 and δ > 0. It samples points uniformly from the dominated region of the entire Pareto front, uses these samples to estimate each solution’s contribution, and maintains upper and lower confidence bounds derived from Chernoff and Markov inequalities. When the bounds are tight enough to ensure that a candidate solution’s estimated contribution is within a factor (1 + ε) of the smallest observed bound, the algorithm terminates, guaranteeing that the returned solution satisfies the (1 + ε) approximation with probability at least 1 − δ.
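The sampling idea can be sketched as follows. This is a simplified, hypothetical illustration rather than the authors' implementation: it estimates a single solution's exclusive contribution with a fixed sample budget instead of the adaptive confidence-bound stopping rule described above. The example front and reference point are invented.

```python
import random

def mc_contribution(points, i, ref, n_samples=50_000, rng=None):
    """Monte Carlo estimate (minimization) of the exclusive hypervolume
    contribution of points[i]: sample uniformly in the box between
    points[i] and ref, and keep samples no other point dominates."""
    rng = rng or random.Random(0)      # fixed seed for reproducibility
    p = points[i]
    box_vol = 1.0
    for pk, rk in zip(p, ref):
        box_vol *= rk - pk             # volume of the sampling box
    hits = 0
    for _ in range(n_samples):
        s = [pk + rng.random() * (rk - pk) for pk, rk in zip(p, ref)]
        # the sample lies in the exclusive region iff no other point
        # weakly dominates it
        if not any(all(qk <= sk for qk, sk in zip(q, s))
                   for j, q in enumerate(points) if j != i):
            hits += 1
    return box_vol * hits / n_samples

# Invented example: the true contribution of (2.0, 2.0) is 4.0.
front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
ref = (5.0, 5.0)
est = mc_contribution(front, 1, ref)
```

The actual algorithm instead races the candidates, drawing additional samples only for solutions whose confidence intervals still overlap the smallest one; this adaptive allocation is what yields the (1 + ε) guarantee with probability at least 1 − δ.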
The authors acknowledge that the worst‑case running time is not polynomial; however, empirical evaluation demonstrates that the algorithm scales extremely well. Experiments on synthetic benchmark sets ranging from 2 to 100 dimensions, and on real‑world data sets containing up to 10 000 solutions, show that the method finds a (1 + ε)‑approximate least contributor in a few seconds. For ε = 0.01 and δ = 0.05, the average relative error stays below 1.2 % while the success probability exceeds 95 %. Compared with exact hypervolume calculators such as the WFG algorithm, the proposed method reduces memory consumption by an order of magnitude and avoids the exponential blow‑up that makes exact computation infeasible on high‑dimensional, large‑scale instances.
In summary, the paper makes two major contributions. The first is a rigorous complexity analysis that clarifies why the least hypervolume contributor problem is intractable both for exact and for deterministic approximation approaches. The second is a practical, randomized algorithm that sidesteps these hardness results by providing probabilistic guarantees and demonstrating remarkable speed on benchmark problems that are out of reach for exact methods. The work opens several avenues for future research, including adaptive sampling strategies to further reduce variance, tighter confidence‑bound constructions for higher dimensions, and integration of the approximation scheme into existing hypervolume‑based evolutionary algorithms to improve their selection and pruning mechanisms.