Self-improving Algorithms for Coordinate-Wise Maxima and Convex Hulls
Finding the coordinate-wise maxima and the convex hull of a planar point set are probably the most classic problems in computational geometry. We consider these problems in the self-improving setting. Here, we have $n$ distributions $\mathcal{D}_1, \ldots, \mathcal{D}_n$ of planar points. An input point set $(p_1, \ldots, p_n)$ is generated by taking an independent sample $p_i$ from each $\mathcal{D}_i$, so the input is distributed according to the product $\mathcal{D} = \prod_i \mathcal{D}_i$. A self-improving algorithm repeatedly gets inputs from the distribution $\mathcal{D}$ (which is a priori unknown), and it tries to optimize its running time for $\mathcal{D}$. The algorithm uses the first few inputs to learn salient features of the distribution $\mathcal{D}$, before it becomes fine-tuned to $\mathcal{D}$. Let $\text{OPTMAX}_{\mathcal{D}}$ (resp. $\text{OPTCH}_{\mathcal{D}}$) be the expected depth of an \emph{optimal} linear comparison tree computing the maxima (resp. convex hull) for $\mathcal{D}$. Our maxima algorithm eventually achieves expected running time $O(\text{OPTMAX}_{\mathcal{D}} + n)$. Furthermore, we give a self-improving algorithm for convex hulls with expected running time $O(\text{OPTCH}_{\mathcal{D}} + n\log\log n)$. Our results require new tools for understanding linear comparison trees. In particular, we convert a general linear comparison tree to a restricted version that can then be related to the running time of our algorithms. Another interesting feature is an interleaved search procedure to determine the likeliest point to be extremal with minimal computation. This allows our algorithms to be competitive with the optimal algorithm for $\mathcal{D}$.
💡 Research Summary
The paper introduces self‑improving algorithms for two fundamental planar geometric problems: computing the coordinate‑wise maxima and constructing the convex hull of a set of points. In the self‑improving model, each of the n input points is drawn independently from its own unknown distribution 𝔇₁,…,𝔇ₙ, so the overall input distribution is the product 𝔇 = ∏ᵢ𝔇ᵢ. The algorithm is allowed to observe a sequence of inputs drawn from 𝔇, using the first few samples to learn salient statistical properties of the distributions, and then to adapt its behavior so that the expected running time on future inputs approaches the best possible for that distribution.
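The product-distribution input model can be made concrete with a small sketch. Everything here is illustrative (the samplers and names are hypothetical, not from the paper): each 𝔇ᵢ is represented as a zero-argument function returning one planar point, and one input is a tuple of independent draws.

```python
import random

def draw_input(distributions):
    """Draw one input (p_1, ..., p_n) from the product D = prod_i D_i:
    each point is sampled independently from its own distribution."""
    return [d() for d in distributions]

# Example: n = 3 samplers, each a Gaussian centered at a different
# location (a hypothetical choice, just to have concrete distributions).
dists = [lambda i=i: (random.gauss(i, 1.0), random.gauss(-i, 1.0))
         for i in range(3)]
one_input = draw_input(dists)  # a list of 3 (x, y) pairs
```

A self-improving algorithm would repeatedly call such a sampler, spending its first Õ(n) draws on learning before switching to the tuned execution phase.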
The authors formalize the notion of an optimal linear comparison tree for a given distribution. A linear comparison tree is a decision tree whose internal nodes perform linear comparisons of the form a·x + b·y ≤ c, where (x, y) are coordinates of a particular input point. For the maxima problem they define OPT_MAX_𝔇 as the expected depth of an optimal tree that correctly identifies all maximal points; analogously, OPT_CH_𝔇 is defined for the convex hull problem. These quantities are lower bounds on the expected number of comparisons that any algorithm in this comparison model must perform on distribution 𝔇, so they are the benchmarks the self-improving algorithms aim to match.
A major technical contribution is a method to convert an arbitrary linear comparison tree into a restricted version without increasing its expected depth by more than a constant factor. In the restricted tree each comparison is limited to one of three simple forms: (i) compare a single point’s x‑coordinate to a constant, (ii) compare a single point’s y‑coordinate to a constant, or (iii) compare a linear combination of two points’ coordinates to a constant. This transformation makes the tree amenable to algorithmic implementation while preserving optimality up to constant factors.
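The three restricted comparison forms can be written out directly. This is a minimal illustrative sketch (function names are my own, not the paper's); `points` is a list of (x, y) pairs indexed like the input.

```python
# The three restricted comparison forms, each answering a yes/no question.

def cmp_x(points, i, c):
    # Form (i): compare point i's x-coordinate to a constant c.
    return points[i][0] <= c

def cmp_y(points, i, c):
    # Form (ii): compare point i's y-coordinate to a constant c.
    return points[i][1] <= c

def cmp_linear(points, i, j, a, b, c):
    # Form (iii): compare a linear combination of the coordinates of
    # two input points i and j to a constant c.
    xi, yi = points[i]
    xj, yj = points[j]
    return a * (xi - xj) + b * (yi - yj) <= c
```

The point of the restriction is that each node touches at most two input points in a fixed, simple way, which is what lets the transformed tree be simulated by an actual algorithm.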
Maxima algorithm.
During a learning phase the algorithm draws Õ(n) samples and estimates, for each distribution 𝔇ᵢ, the probability πᵢ that a point drawn from 𝔇ᵢ is maximal. These probabilities are used to build a priority structure that schedules points in decreasing order of πᵢ. The execution phase employs an “interleaved search” strategy: rather than fully sorting all points, the algorithm incrementally performs the cheapest comparison that can potentially eliminate a point from being maximal. When a point is proven to be dominated, further comparisons on that point are halted. This adaptive probing ensures that the total expected number of comparisons is O(OPT_MAX_𝔇 + n). The additive linear term accounts for the unavoidable cost of reading the n input points.
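The learning-phase estimation of the πᵢ can be sketched in a few lines. This is a simplified illustration under my own assumptions (brute-force maximality test, plain empirical frequencies), not the paper's actual data structures:

```python
def is_maximal(p, pts):
    """A point is coordinate-wise maximal if no other point dominates it
    (i.e., is at least as large in both coordinates)."""
    return not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in pts)

def estimate_maximality_probs(sample_inputs):
    """Learning phase (sketch): given a batch of sampled inputs, estimate
    pi_i = Pr[the i-th point is maximal] by its empirical frequency."""
    n = len(sample_inputs[0])
    counts = [0] * n
    for pts in sample_inputs:
        for i, p in enumerate(pts):
            if is_maximal(p, pts):
                counts[i] += 1
    return [c / len(sample_inputs) for c in counts]

# The execution phase would then probe points in decreasing order of pi_i:
# order = sorted(range(n), key=lambda i: -pi[i])
```

Scheduling likely-maximal points first is what lets the interleaved search dominate cheaply: once a point is proven dominated, no further comparisons are spent on it.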
Convex hull algorithm.
The hull problem is more intricate because the output size can be Θ(n) and the geometric relationships are global. The authors first partition the plane into a hierarchy of vertical slabs (a “double‑slab” decomposition). Within each slab the points are sorted by y‑coordinate, and a restricted comparison tree is used to identify candidate extreme points. Between slabs the algorithm maintains a small set of “extremal candidates” that could appear on the hull boundary. By exploiting point‑line duality, the problem of determining whether a point belongs to the hull is reduced to a series of linear comparisons that fit the restricted tree model. The slab hierarchy is constructed so that the number of slabs grows only as O(log log n), which yields an overall expected running time of O(OPT_CH_𝔇 + n log log n). The extra log log n factor arises from the need to locate the appropriate slab for each point; since log log n grows extremely slowly, this overhead over the linear term is small for realistic input sizes.
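For reference, the non-adaptive baseline that the self-improving algorithm competes against is an ordinary O(n log n) planar hull computation. The sketch below uses Andrew's monotone chain to build the upper hull; it is a standard textbook routine, not the paper's slab-based algorithm.

```python
def upper_hull(points):
    """Upper convex hull via Andrew's monotone chain, left to right.
    Standard O(n log n) baseline; the paper's algorithm beats the sort
    bottleneck by exploiting the learned distribution."""
    pts = sorted(set(points))
    hull = []
    for p in pts:
        # Pop while the last two hull points and p do not make a
        # clockwise (right) turn, i.e. the cross product is >= 0.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull
```

The lower hull is symmetric, and concatenating the two gives the full hull; any correctness check of a self-improving execution phase could compare against such a baseline.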
Both algorithms consist of a short learning phase (Õ(n) time) and a long execution phase that is essentially distribution‑specific. The analysis shows that after the learning phase the expected number of comparisons matches the lower bound given by the optimal comparison tree up to an additive linear (or n log log n) term. Consequently, the algorithms are “self‑optimizing”: they automatically adapt to the unknown input distribution and achieve near‑optimal performance without any prior knowledge of the distributions.
The paper situates its contributions within the broader literature on self‑improving algorithms, which previously addressed sorting, searching, and Delaunay triangulation. By extending the framework to maxima and convex hulls, the authors demonstrate that the self‑improving paradigm can handle problems whose optimal decision trees involve complex geometric reasoning. The techniques introduced—tree restriction, interleaved search, and slab‑based candidate management—are likely to be useful for other geometric problems, such as higher‑dimensional hulls, nearest‑neighbor structures, or problems with dependent input distributions.
In conclusion, the work provides the first self‑improving algorithms for planar maxima and convex hulls, achieving expected running times of O(OPT_MAX_𝔇 + n) and O(OPT_CH_𝔇 + n log log n) respectively. The results bridge the gap between information‑theoretic lower bounds (optimal comparison trees) and practical algorithm design, showing that an algorithm can learn from its own execution history and asymptotically match the performance of an omniscient optimal algorithm on the same distribution. Future directions include extending the methodology to higher dimensions, handling non‑product input distributions, and empirical evaluation on real‑world data sets.