On the Furthest Hyperplane Problem and Maximal Margin Clustering

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

This paper introduces the Furthest Hyperplane Problem (FHP), which is an unsupervised counterpart of Support Vector Machines. Given a set of n points in ℝ^d, the objective is to produce the hyperplane (passing through the origin) which maximizes the separation margin, that is, the minimal distance between the hyperplane and any input point. To the best of our knowledge, this is the first paper achieving provable results regarding FHP. We provide both lower and upper bounds to this NP-hard problem. First, we give a simple randomized algorithm whose running time is n^{O(1/θ²)} where θ is the optimal separation margin. We show that its exponential dependency on 1/θ² is tight, up to sub-polynomial factors, assuming SAT cannot be solved in sub-exponential time. Next, we give an efficient approximation algorithm. For any α ∈ [0, 1], the algorithm produces a hyperplane whose distance from at least a 1 − 5α fraction of the points is at least α times the optimal separation margin. Finally, we show that FHP does not admit a PTAS by presenting a gap-preserving reduction from a particular version of the PCP theorem.


💡 Research Summary

The paper introduces the Furthest Hyperplane Problem (FHP), an unsupervised analogue of the classic Support Vector Machine (SVM). Given n points {x^{(i)}}_{i=1}^{n} in ℝ^{d}, the goal is to find a unit normal vector w (the hyperplane passes through the origin) that maximizes the minimal absolute projection |⟨w, x^{(i)}⟩| over all points. This minimal projection is called the margin θ. The induced labeling of the points is simply y_i = sign(⟨w, x^{(i)}⟩). The problem is motivated by Maximal Margin Clustering (MMC), where the hyperplane may have an offset b; the authors observe that any optimal MMC solution can be reduced to solving at most O(n²) instances of FHP, because the optimal hyperplane must pass through the midpoint of two opposite‑side points.
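The MMC-to-FHP reduction above can be sketched in a few lines: for each pair of points, re-center the instance at their midpoint and solve the resulting through-the-origin FHP instance. This is an illustrative sketch only; the function names (`fhp_margin`, `mmc_via_fhp`) and the pluggable `solve_fhp` callback are my own, not from the paper.

```python
import numpy as np

def fhp_margin(points, w):
    """Margin of a homogeneous (through-the-origin) hyperplane with normal w."""
    w = w / np.linalg.norm(w)
    return np.min(np.abs(points @ w))

def mmc_via_fhp(points, solve_fhp):
    """Reduce Maximal Margin Clustering (hyperplane with offset b) to FHP.

    For every pair of points, assume they are the two closest points on
    opposite sides of the optimal hyperplane; that hyperplane then passes
    through their midpoint, so re-centering at the midpoint yields an FHP
    instance.  `solve_fhp` is any FHP solver returning a normal vector.
    """
    best = (None, None, -np.inf)            # (w, b, margin)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            mid = (points[i] + points[j]) / 2.0
            shifted = points - mid          # hyperplane through origin here
            w = solve_fhp(shifted)
            m = fhp_margin(shifted, w)
            if m > best[2]:
                best = (w, -w @ mid, m)     # offset b recovers original frame
    return best
```

This runs the FHP solver O(n²) times, exactly the overhead the reduction predicts.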

Complexity and Hardness.
The authors prove that FHP is NP‑hard. More importantly, they establish a fine‑grained lower bound based on the Exponential Time Hypothesis (ETH). Via a gap‑preserving reduction from MAX‑3SAT, they construct FHP instances with n points in dimension d = Θ(n) and optimal margin θ = Θ(1/√d). If an algorithm could solve FHP in time n^{o(1/θ²)} (that is, n^{O(1/θ^{2−ε})} for some ε > 0), then SAT could be solved in sub‑exponential time, contradicting ETH. Consequently, any algorithm’s exponential dependence on 1/θ² is essentially optimal.

Exact Algorithms.
Three exact approaches are described:

  1. Enumeration of Feasible Labelings.
    Using Sauer’s Lemma, the number of linearly separable labelings is bounded by O(n^{d}). A breadth‑first search over this labeling graph yields an O(n^{d})‑time algorithm that evaluates the margin of each labeling via a quadratic program (or the ellipsoid method).

  2. ε‑Net on the Unit Sphere.
    It suffices to consider normal vectors w drawn from an ε‑net of S^{d−1} with ε < θ: some net point lies within distance ε of the optimal w* and therefore still attains a positive margin. Deterministic constructions give a net of size (1/θ)^{O(d)}; enumerating it yields a running time of (1/θ)^{O(d)}·poly(n, d), which is exponential in the dimension d but polynomial in n.

  3. Random Hyperplane Algorithm.
    The most striking result is a simple randomized algorithm: sample unit vectors uniformly from the sphere, compute the induced labeling of each, and return the vector achieving the largest margin. Lemma 3.1 shows that a single random vector reproduces the optimal labeling with probability at least n^{−O(1/θ²)}, so n^{O(1/θ²)} samples succeed with high probability, and the algorithm runs in time n^{O(1/θ²)}. The analysis hinges on the observation that even a weak correlation between a random vector and the optimal w* already fixes the sign pattern of all points, despite the tiny volume of the corresponding spherical cap.
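A minimal sketch of the random-hyperplane idea (the function name is mine; the paper's guarantee needs n^{O(1/θ²)} samples, which is left to the caller here):

```python
import numpy as np

def random_hyperplane_fhp(points, num_samples, rng=None):
    """Random-hyperplane heuristic for FHP: draw random unit normals and
    keep the one whose induced margin min_i |<w, x_i>| is largest."""
    rng = np.random.default_rng() if rng is None else rng
    d = points.shape[1]
    best_w, best_margin = None, -np.inf
    for _ in range(num_samples):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)            # normalized Gaussian: uniform on the sphere
        margin = np.min(np.abs(points @ w))
        if margin > best_margin:
            best_w, best_margin = w, margin
    return best_w, best_margin
```

The induced clustering is then recovered as `np.sign(points @ best_w)`, matching the labeling rule y_i = sign(⟨w, x^{(i)}⟩) above.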

The authors also discuss a dimensionality‑reduction shortcut via the Johnson‑Lindenstrauss lemma, which reduces the problem to O(log n /θ²) dimensions, but the direct random‑hyperplane method is even simpler and achieves the same asymptotic bound.
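The dimensionality-reduction shortcut amounts to a Gaussian random projection in the spirit of Johnson–Lindenstrauss; a sketch (function name and scaling convention are my own):

```python
import numpy as np

def jl_project(points, target_dim, rng=None):
    """Johnson-Lindenstrauss-style projection: a random Gaussian map into
    `target_dim` dimensions preserves norms and inner products (and hence
    margins) up to 1 +/- eps when target_dim = O(log n / eps^2)."""
    rng = np.random.default_rng() if rng is None else rng
    d = points.shape[1]
    # Scaling by 1/sqrt(target_dim) makes the map norm-preserving in expectation.
    G = rng.standard_normal((d, target_dim)) / np.sqrt(target_dim)
    return points @ G
```

With target_dim = O(log n / θ²), solving FHP on the projected points loses only a constant factor in the margin.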

Approximation Algorithm with Outliers.
Recognizing that in many learning scenarios one may tolerate discarding a small fraction of points, the paper presents an efficient approximation algorithm. For any α ∈ [0, 1], it produces a hyperplane whose distance from at least a 1 − 5α fraction of the points is at least α times the optimal separation margin; the remaining at most 5α fraction of points are treated as outliers.
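The approximation algorithm itself is not reproduced in this summary; the sketch below (names mine) only verifies the stated guarantee for a candidate hyperplane, given the optimal margin.

```python
import numpy as np

def check_outlier_margin_guarantee(points, w, alpha, theta_opt):
    """Check the stated guarantee for a candidate normal w: at least a
    (1 - 5*alpha) fraction of the points should lie at distance
    >= alpha * theta_opt from the hyperplane through the origin."""
    w = w / np.linalg.norm(w)
    dists = np.abs(points @ w)                    # distances to the hyperplane
    frac_far = np.mean(dists >= alpha * theta_opt)
    return frac_far >= 1.0 - 5.0 * alpha
```

Such a check is useful when comparing heuristic solutions against the paper's bound on synthetic instances with a known planted margin.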

