Simultaneous Blackwell Approachability and Applications to Multiclass Omniprediction

Reading time: 5 minutes

📝 Original Info

  • Title: Simultaneous Blackwell Approachability and Applications to Multiclass Omniprediction
  • ArXiv ID: 2602.17577
  • Date: 2026-02-19
  • Authors: (Author information was not provided in the source; please refer to the original paper for the author list.)

📝 Abstract

Omniprediction is a learning problem that requires suboptimality bounds for each of a family of losses $\mathcal{L}$ against a family of comparator predictors $\mathcal{C}$. We initiate the study of omniprediction in a multiclass setting, where the comparator family $\mathcal{C}$ may be infinite. Our main result is an extension of the recent binary omniprediction algorithm of [OKK25] to the multiclass setting, with sample complexity (in statistical settings) or regret horizon (in online settings) $\approx \varepsilon^{-(k+1)}$, for $\varepsilon$-omniprediction in a $k$-class prediction problem. En route to proving this result, we design a framework of potential broader interest for solving Blackwell approachability problems where multiple sets must simultaneously be approached via coupled actions.

💡 Deep Analysis

📄 Full Content

Omniprediction is a powerful definition of learning introduced recently by [GKR+22]. Consider a standard supervised learning task: we receive i.i.d. samples $(x, y) \sim \mathcal{D}$, where $x \in \mathbb{R}^d$ are the features and $y \in \partial\Delta_k := \{e_i\}_{i \in [k]}$ is the label (see Section 2.1 for notation), and we wish to build a predictor $p(x) \approx \mathbb{E}[y \mid x]$. In omniprediction, a family of loss functions $\mathcal{L}$ is fixed, as well as a family of comparator predictors $\mathcal{C}$. The goal is then to satisfy the following simultaneous loss minimization guarantee, for some $\varepsilon > 0$ and predictor $p : \mathbb{R}^d \to \Delta_k$:

$$\mathbb{E}_{(x,y)\sim\mathcal{D}}\left[\ell\left(k^\star_\ell(p(x)), y\right)\right] \le \min_{c \in \mathcal{C}} \mathbb{E}_{(x,y)\sim\mathcal{D}}\left[\ell(c(x), y)\right] + \varepsilon, \quad \text{for all } \ell \in \mathcal{L}. \tag{1}$$

Here, $k^\star_\ell$ is the ex ante optimum mapping for a particular loss $\ell \in \mathcal{L}$, defined in (3). This function maps each $p \in \Delta_k$ to the loss-minimizing action, on average over $y = e_i$ where $i \sim p$.
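To make the mapping concrete, here is a minimal Python sketch of the ex ante optimum (the function name, squared loss, and candidate-action set are our own illustration, not from the paper): given a distribution $p$ over $k$ classes, it returns the action minimizing the expected loss when the label is drawn as $y = e_i$, $i \sim p$.

```python
import numpy as np

def ex_ante_optimum(p, loss, actions):
    """Return the action minimizing the expected loss E_{i~p}[loss(a, e_i)],
    where e_i ranges over the one-hot labels of a k-class problem."""
    k = len(p)
    one_hots = np.eye(k)  # the labels e_1, ..., e_k as rows

    def expected_loss(a):
        return sum(p[i] * loss(a, one_hots[i]) for i in range(k))

    return min(actions, key=expected_loss)

# Illustration with squared loss: the expected loss decomposes as
# ||a - p||^2 + const, so if p itself is among the candidates it wins.
sq_loss = lambda a, y: float(np.sum((a - y) ** 2))
p = np.array([0.7, 0.2, 0.1])
actions = [np.eye(3)[i] for i in range(3)] + [p]
best = ex_ante_optimum(p, sq_loss, actions)  # selects p
```

For other losses (e.g., 0-1 loss over the vertices), the minimizer would instead be the mode of $p$; the point is only that the action is chosen ex ante, using $p$ alone.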

The formulation (1) effectively decouples the tasks of prediction and action: once the learner has decided on a predictor $p$, a decision maker who wishes to minimize a particular loss $\ell \in \mathcal{L}$ then takes the action $k^\star_\ell \circ p$. This property is particularly useful when, e.g., losses depend on parameters unknown at training time (such as a market price), or robustness to a range of loss hyperparameters is desirable. Because (1) applies to a family of losses, the predictor $p$ can be viewed as a “supervised sufficient statistic” that goes beyond single loss minimization. This perspective builds upon earlier work in algorithmic fairness [HKRR18], and has intimate connections to indistinguishability arguments from pseudorandomness [GHK+23, GH25].

By now, there is a rich body of work on omniprediction in statistical and online learning settings [GKR+22, GHK+23, HNRY23, GKR23, GJRR24, HTY25, DHI+25, OKK25]. However, essentially all prior works focused on binary classification, where labels live in the set $\{0, 1\}$. This is a rather stringent restriction in the context of real-world supervised learning, which is often used for multiclass tasks, e.g., [DDS+09, MDP+11, Den12]. Even the ability to handle labels $y \in \partial\Delta_k \equiv [k]$, for $k$ a constant number of classes, would substantially extend the applicability of omnipredictors.

To our knowledge, the problem of multiclass omniprediction has only been studied in recent works by [NRRX25, LRS25]. These papers focused on a setting motivated by the economics literature, where the family of comparators $\mathcal{C}$ (viewed as an action space) is finite. The former’s main multiclass omniprediction result ([NRRX25], Theorem 6.5) is restricted to losses $\ell$ that decompose independently coordinatewise. On the other hand, [LRS25], Corollary 6 gives a more general statement for multiclass omniprediction, but again the result is stated for finite $\mathcal{C}$, and incurs an $\approx \varepsilon^{-4k-2}$ overhead in the sample complexity for achieving (1) (without the consideration of runtime).

The main motivation of our work is to bridge this gap, by developing multiclass omnipredictors with guarantees more closely resembling the state-of-the-art in binary omniprediction. Indeed, there has been substantial recent progress on improving the sample complexity and runtime of binary omniprediction for concrete pairs $(\mathcal{C}, \mathcal{L})$. For example, in the generalized linear model (GLM) setting, where $\mathcal{C}$ is bounded linear predictors and $\mathcal{L}$ is appropriate convex losses (cf. (5)), [HTY25, OKK25] developed end-to-end efficient algorithms with $\approx \varepsilon^{-2}$ sample complexities. In fact, [OKK25] gave a substantial generalization, showing how to reduce binary omniprediction for arbitrary pairs $(\mathcal{C}, \mathcal{L})$ to online learning tasks against appropriate function classes.

Our approach to multiclass omniprediction is based on the framework of [OKK25]. Both [HTY25, OKK25], as well as many prior results on binary omniprediction, leverage a reduction from [GHK+23]. This reduction (Proposition 1) shows that (1) is satisfied for predictors $p$ satisfying appropriate notions of multiaccuracy (Definition 2) and calibration (Definition 3), concepts we review in Section 2.2. Intuitively, these properties guarantee that our predictor $p(x)$ passes certain statistical tests against the ground truth $p^\star(x) := \mathbb{E}[y \mid x]$, induced by the particular pair $(\mathcal{C}, \mathcal{L})$ of interest.
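For intuition on the calibration side, here is a simplified empirical notion in Python (a sketch for illustration only; the paper's formal Definition 3 may differ): bucket predictions by value and compare each bucket's average prediction to its average one-hot label, weighting buckets by their mass.

```python
import numpy as np

def calibration_error(preds, labels, decimals=1):
    """Mass-weighted L1 gap between the average prediction and the
    average one-hot label within each (rounded-prediction) bucket.
    A perfectly calibrated predictor has error near zero."""
    buckets = {}
    for p, y in zip(preds, labels):
        key = tuple(np.round(p, decimals))
        buckets.setdefault(key, []).append((p, y))
    n, err = len(preds), 0.0
    for items in buckets.values():
        ps = np.array([p for p, _ in items])
        ys = np.array([y for _, y in items])
        err += (len(items) / n) * np.linalg.norm(ps.mean(0) - ys.mean(0), 1)
    return err

# A calibrated predictor: labels are drawn from p_star, and the
# predictor always outputs p_star, so each bucket's labels average
# out to (approximately) the bucket's prediction.
rng = np.random.default_rng(0)
p_star = np.array([0.5, 0.3, 0.2])
labels = np.eye(3)[rng.choice(3, size=5000, p=p_star)]
preds = np.tile(p_star, (5000, 1))
err = calibration_error(preds, labels)  # small, up to sampling noise
```

Multiaccuracy imposes analogous tests, but correlated against a function class derived from $\mathcal{C}$ rather than against the predictor's own level sets.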

As in the binary case, learning multiclass predictors that satisfy multiaccuracy and calibration individually is well-studied. We discuss the former in Section 5.4, and the latter is possible in $\approx \varepsilon^{-(k+1)}$ timesteps (in the online setting) and samples (in the statistical setting), as shown by seminal work of [FV98] (see also [MS10]). However, it is less clear how to achieve both simultaneously.

In the binary setting, [OKK25] leveraged an existing calibration algorithm from [ABH11] based on Blackwell approachability, and augmented it to also guarantee multiaccuracy. Their analysis used several important facts about binary losses, e.g., the existence of an approximate basis for proper losses (Lemma 9), and a custom “halfspace satisfiability oracle” specialized to their application (Algorithm 3). Unfortunately, …

Reference

This content is AI-processed based on open access ArXiv data.
