Stochastic convex optimization with bandit feedback


This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\mathcal X$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \mathcal X$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\tilde O(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.


💡 Research Summary

The paper studies stochastic bandit convex optimization: a learner may query a convex, Lipschitz function $f$ defined on a compact convex set $\mathcal X\subset\mathbb R^{d}$ at any point $x\in\mathcal X$ and receives a noisy observation $y=f(x)+\varepsilon$, where $\varepsilon$ is $\sigma$‑sub‑Gaussian. The goal is to minimize the cumulative regret $R_T=\sum_{t=1}^{T}\bigl(f(x_t)-f(x^\star)\bigr)$ over $T$ queries, where $x^\star$ is a global minimizer of $f$.
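A minimal simulation may help fix this feedback model and the regret quantity. The objective, the Gaussian noise level, and the uniform-querying baseline below are all illustrative choices, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.1  # noise scale; Gaussian noise is sub-Gaussian with this sigma

def f(x):
    """Illustrative convex, Lipschitz objective on [0, 1]; minimizer at 0.3."""
    return abs(x - 0.3)

def query(x):
    """Bandit feedback: the learner sees only y = f(x) + noise."""
    return f(x) + sigma * rng.normal()

# Cumulative regret of a naive baseline that queries uniformly at random.
T = 1000
f_star = f(0.3)
regret = 0.0
for _ in range(T):
    x = rng.uniform(0.0, 1.0)
    _ = query(x)             # the noisy observation the learner would act on
    regret += f(x) - f_star  # regret is measured on the true function values
print(f"regret of uniform querying over T={T} rounds: {regret:.1f}")
```

Note that regret is charged on the true values $f(x_t)$, even though the learner only ever sees the noisy observations; a non-adaptive baseline like this one incurs regret linear in $T$, which is what the $\sqrt{T}$-regret algorithm must beat.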

Previous work on bandit optimization assumes linear structure, strong convexity, or mere Lipschitz continuity in low dimension, leading to regret bounds that either depend heavily on the dimension or achieve sub-optimal rates in $T$ (e.g., $O(T^{3/4})$). The authors ask whether convexity alone is sufficient to obtain the optimal $\sqrt{T}$ dependence on the horizon.

The core contribution is a new algorithm that combines a "center‑point device" with a generalized ellipsoid method. In one dimension, the algorithm maintains a working interval known to contain the minimizer, queries noisy function values at three well‑separated interior points, and uses confidence intervals around the averaged observations to certify, via convexity, that one end portion of the interval cannot contain the minimizer; that portion is discarded and the procedure repeats on the shrunken interval. In higher dimensions, the same device supplies the cuts for an ellipsoid‑style update of the feasible region.
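The one-dimensional interval-shrinking idea can be sketched roughly as follows. This is a simplified illustration, not the paper's exact procedure: the quadratic objective, the sample-doubling schedule, and the margin constant are all assumptions made for the sketch, and the paper's careful scheduling of confidence widths is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.1  # illustrative noise scale

def f(x):
    """Illustrative convex objective; minimizer at 0.3 (not from the paper)."""
    return (x - 0.3) ** 2

def query(x, n):
    """Average n noisy bandit observations of f at x."""
    return f(x) + rng.normal(scale=sigma, size=n).mean()

# Maintain a working interval [l, r] and query three equispaced interior
# points.  Once the averaged values separate beyond the noise margin,
# convexity guarantees the minimizer cannot lie in the outer quarter
# adjacent to the highest point, so that quarter is discarded.
l, r, n = 0.0, 1.0, 64
for _ in range(20):
    w = r - l
    xl, xc, xr = l + w / 4, l + w / 2, l + 3 * w / 4
    yl, yc, yr = query(xl, n), query(xc, n), query(xr, n)
    margin = 4 * sigma / np.sqrt(n)  # ~4 std devs of each averaged value
    if yl >= min(yc, yr) + margin:
        l = xl  # w.h.p. the minimizer lies to the right of xl
    elif yr >= min(yl, yc) + margin:
        r = xr  # w.h.p. the minimizer lies to the left of xr
    else:
        n = min(2 * n, 1 << 20)  # not separated yet: tighten the intervals
print(f"final interval: [{l:.3f}, {r:.3f}]")
```

The convexity argument behind the cut: if $x_l < x_c$ and $f(x_l) > \min(f(x_c), f(x_r))$, then no point left of $x_l$ can be a minimizer, since $f$ at a point between a minimizer and a larger value cannot exceed both. Re-sampling when the three values are statistically indistinguishable is what keeps the per-round regret small near the optimum.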

