Online Learning: Stochastic and Constrained Adversaries
Learning theory has largely focused on two main learning scenarios. The first is the classical statistical setting, where instances are drawn i.i.d. from a fixed distribution; the second is the online, completely adversarial scenario, where at every time step the adversary picks the worst instance to present to the learner. It can be argued that in the real world neither of these assumptions is reasonable, and it is therefore important to study problems under a range of assumptions on the data. Unfortunately, theoretical results in this area are scarce, possibly due to the absence of general tools for analysis. Focusing on the regret formulation, we define the minimax value of a game where the adversary is restricted in his moves. The framework captures both stochastic and non-stochastic assumptions on the data. Building on the sequential symmetrization approach, we define a notion of distribution-dependent Rademacher complexity for the spectrum of problems ranging from i.i.d. to worst-case. These complexity bounds immediately yield variation-type regret bounds. We then consider the i.i.d. adversary and show the equivalence of online and batch learnability. In the supervised setting, we consider various hybrid assumptions on the way the x and y variables are chosen. Finally, we consider smoothed learning problems and show that half-spaces are online learnable in the smoothed model; in fact, exponentially small noise added to the adversary's decisions turns this problem, which has infinite Littlestone dimension, into a learnable one.
💡 Research Summary
The paper addresses a gap in online learning theory between the two extreme settings that have traditionally been studied: the i.i.d. statistical scenario and the fully adversarial worst‑case scenario. The authors propose a unified game‑theoretic framework in which the adversary’s moves are constrained by a sequence of sets (P_t) that restrict the probability distributions from which the adversary may draw its instance at each round. These constraints can model a wide variety of situations, including (i) the classic worst‑case adversary (no restriction), (ii) deterministic constraints (e.g., budget limits), (iii) smoothed adversaries where each worst‑case choice is corrupted by i.i.d. noise, (iv) hybrid supervised settings where either the features (x) or the labels (y) are drawn i.i.d. while the other side is adversarial, and (v) a pure i.i.d. adversary.
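To make the role of the constraint sets concrete, here is a schematic (the notation below is ours, not taken verbatim from the paper) of how the extreme cases correspond to particular choices of P_t:

\[
\begin{aligned}
\text{worst-case adversary:}\quad & \mathcal{P}_t = \Delta(\mathcal{X}) && \text{(all distributions on } \mathcal{X}\text{, including point masses)}\\
\text{i.i.d. adversary:}\quad & \mathcal{P}_t = \{p\} && \text{(one fixed distribution } p \text{ at every round)}\\
\text{smoothed adversary:}\quad & \mathcal{P}_t = \{\,\delta_{x}\ast\nu : x\in\mathcal{X}\,\} && \text{(any point mass convolved with the noise law } \nu)
\end{aligned}
\]

Deterministic constraints such as budgets correspond to letting P_t depend on the history x_1, ..., x_{t-1}, while the hybrid supervised settings constrain only the marginal over x (or only the conditional over y).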
The central object of study is the minimax value
\[
\mathcal{V}_T(\mathcal{F}) \;=\; \inf_{q_1}\,\sup_{p_1\in\mathcal{P}_1}\; \mathop{\mathbb{E}}_{f_1\sim q_1,\, x_1\sim p_1} \;\cdots\; \inf_{q_T}\,\sup_{p_T\in\mathcal{P}_T}\; \mathop{\mathbb{E}}_{f_T\sim q_T,\, x_T\sim p_T} \left[\, \sum_{t=1}^{T} f_t(x_t) \;-\; \inf_{f\in\mathcal{F}} \sum_{t=1}^{T} f(x_t) \right],
\]
where at round t the learner commits to a mixed strategy q_t over the class F, the adversary responds with a distribution p_t chosen from the constraint set P_t (which may depend on the realized history x_1, ..., x_{t-1}), and the bracketed quantity is the regret against the best fixed element of F in hindsight.
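As a small, concrete illustration of the protocol behind this value (and of the smoothed setting mentioned in the abstract), the sketch below is not the paper's construction: it runs the standard exponential-weights algorithm over a finite class of one-dimensional threshold functions against an adversary whose worst-case choice is perturbed by i.i.d. Gaussian noise before it is revealed, i.e., one particular smoothed constraint set P_t. The class, the horizon T, and the noise level sigma are illustrative choices, not values from the paper.

```python
import numpy as np

# Minimal sketch (assumptions: finite class of 1-D thresholds, Gaussian smoothing,
# exponential-weights learner). The loss of f_theta on x is f_theta(x) = 1[x >= theta],
# so the regret computed below matches the form sum_t f_t(x_t) - inf_f sum_t f(x_t).

rng = np.random.default_rng(0)

T = 1000                                     # horizon (illustrative)
thetas = np.linspace(-1.0, 1.0, 21)          # finite class F of thresholds (illustrative)
sigma = 0.1                                  # smoothing noise level (illustrative)
eta = np.sqrt(8 * np.log(len(thetas)) / T)   # standard exponential-weights step size

def loss(theta, x):
    """Loss of the threshold function f_theta on instance x."""
    return float(x >= theta)

candidates = np.linspace(-1.0, 1.0, 41)      # grid the adversary optimizes over
log_weights = np.zeros(len(thetas))          # log of the learner's unnormalized weights
cum_losses = np.zeros(len(thetas))           # cumulative loss of each fixed f in F
learner_loss = 0.0

for t in range(T):
    # Learner's mixed strategy q_t over F (exponential weights on past losses).
    q = np.exp(log_weights - log_weights.max())
    q /= q.sum()

    # Adversary picks the point maximizing the learner's expected loss ...
    expected = [np.dot(q, [loss(th, c) for th in thetas]) for c in candidates]
    x_worst = candidates[int(np.argmax(expected))]

    # ... but the smoothed constraint set P_t forces it to play that point plus noise.
    x_t = x_worst + sigma * rng.standard_normal()

    # Learner samples f_t ~ q_t, suffers f_t(x_t), and updates its weights.
    i_t = rng.choice(len(thetas), p=q)
    learner_loss += loss(thetas[i_t], x_t)
    round_losses = np.array([loss(th, x_t) for th in thetas])
    cum_losses += round_losses
    log_weights -= eta * round_losses

regret = learner_loss - cum_losses.min()
print(f"regret after {T} rounds: {regret:.1f} "
      f"(exp-weights bound ~ sqrt(T log|F|) = {np.sqrt(T * np.log(len(thetas))):.1f})")
```

Because the class here is finite, exponential weights already gives O(sqrt(T log |F|)) regret even against an unconstrained adversary; the substance of the paper's smoothed-learning result is that this kind of noise makes classes with infinite Littlestone dimension, such as half-spaces (or thresholds over the reals), online learnable, which a finite-class toy can only hint at.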