Learning Half-Spaces from Perturbed Contrastive Examples
We study learning under a two-step contrastive example oracle, as introduced by Mansouri et al. (2025), where each queried (or sampled) labeled example is paired with an additional contrastive example of opposite label. While Mansouri et al. assume an idealized setting, where the contrastive example is at minimum distance from the originally queried/sampled point, we introduce and analyze a mechanism, parameterized by a non-decreasing noise function $f$, under which this ideal contrastive example is perturbed. The amount of perturbation is controlled by $f(d)$, where $d$ is the distance of the queried/sampled point to the decision boundary. Intuitively, this results in higher-quality contrastive examples for points closer to the decision boundary. We study this model in two settings: (i) when the maximum perturbation magnitude is fixed, and (ii) when it is stochastic. For one-dimensional thresholds and for half-spaces under the uniform distribution on a bounded domain, we characterize active and passive contrastive sample complexity as a function of $f$. We show that, under certain conditions on $f$, the presence of contrastive examples speeds up learning in terms of asymptotic query complexity and asymptotic expected query complexity.
💡 Research Summary
This paper revisits the contrastive‑example learning framework introduced by Mansouri et al. (2025) and addresses a key limitation of that model: the oracle is assumed to return the exact nearest point of opposite label (the “ideal” contrastive example). In practice such an oracle is unrealistic, and the dramatic reductions in query complexity reported by the original work may not be attainable.
To bridge this gap, the authors propose a perturbed contrastive‑example model parameterized by a non‑decreasing noise function f : ℝ₊ → ℝ₊. For a primary example x, let d be the Euclidean distance from x to the target decision boundary. The oracle is allowed to return a contrastive point x′ that may be displaced from the ideal point xₘᵢₙ by at most f(d). Consequently, points that lie close to the boundary receive higher‑quality contrastive examples, while points far from the boundary may receive noisier ones.
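To make the perturbation mechanism concrete, here is a minimal simulation of such an oracle for a one-dimensional threshold concept. All names and modeling choices (treating the boundary point t as the ideal contrastive point x_min, and simulating the displacement as uniform on [0, f(d)]) are illustrative assumptions, not the paper's formal definitions:

```python
import random

def label(x, t):
    """1D threshold concept h_t: +1 if x >= t, else -1."""
    return 1 if x >= t else -1

def perturbed_contrastive_oracle(x, t, f, rng, eps=1e-9):
    """Hypothetical sketch of a perturbed contrastive oracle for a 1D threshold.

    The ideal contrastive point x_min is (up to the tie at the boundary)
    the threshold t itself; the perturbation is simulated as a uniform
    displacement of magnitude at most f(d) into the opposite-label region,
    where d = |x - t| is x's distance to the decision boundary.
    """
    d = abs(x - t)
    delta = rng.uniform(0.0, f(d))
    side = label(x, t)
    # eps nudges the returned point strictly across the boundary,
    # so its label is guaranteed to be opposite to that of x
    return t - side * (delta + eps)
```

Note how the guarantee degrades gracefully: for x close to the boundary, f(d) is small and the returned point is nearly ideal; far from the boundary, the displacement can be as large as f(d).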
Two concrete mechanisms are studied:

- Deterministic Approximate Minimum‑Distance Model (Deterministic AMDM). The oracle returns any point in the set

  CE_{d,f}^{app}(x) = { x′ | ‖x′ − xₘᵢₙ‖ ≤ f(d) and ℓ(x′) ≠ ℓ(x) }.

  This is an adversarial setting: the learner must succeed against the worst possible choice within the allowed region.

- Probabilistic Approximate Minimum‑Distance Model (Probabilistic AMDM). For each x, a distribution Dₓ over points of opposite label is defined such that the expected displacement satisfies E_{x′∼Dₓ}[‖x′ − xₘᵢₙ‖] ≤ f(d).
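The probabilistic mechanism can be sketched analogously for the 1D threshold case. The concrete choice of Dₓ below (an exponential displacement with mean f(d)/2 from the boundary, which keeps the expected displacement below f(d)) is an illustrative assumption of this sketch, not a distribution specified by the paper:

```python
import random

def probabilistic_amdm_oracle(x, t, f, rng, eps=1e-9):
    """Hypothetical sketch of the probabilistic AMDM oracle for a 1D threshold.

    D_x is modeled here as an exponential displacement from the ideal point
    t with mean f(d)/2, so E[|x' - x_min|] <= f(d) holds by construction.
    """
    d = abs(x - t)
    mean = f(d) / 2.0
    # exponential displacement; degenerate (no noise) when f(d) == 0
    delta = rng.expovariate(1.0 / mean) if mean > 0 else 0.0
    side = 1 if x >= t else -1
    # eps pushes the point strictly onto the opposite-label side of t
    return t - side * (delta + eps)
```

Unlike the deterministic model, individual displacements here are unbounded; only their expectation is controlled by f(d), which is exactly what distinguishes the stochastic setting (ii) from the fixed-magnitude setting (i).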