Is Online Linear Optimization Sufficient for Strategic Robustness?
We consider bidding in repeated Bayesian first-price auctions. Bidding algorithms that achieve optimal regret have been extensively studied, but their strategic robustness to the seller’s manipulation remains relatively underexplored. Bidding algorithms based on no-swap-regret algorithms achieve both desirable properties, but are suboptimal in terms of statistical and computational efficiency. In contrast, online gradient ascent is the only algorithm that achieves $O(\sqrt{TK})$ regret and strategic robustness [KSS24], where $T$ denotes the number of auctions and $K$ the number of bids. In this paper, we explore whether simple online linear optimization (OLO) algorithms suffice for bidding algorithms with both desirable properties. Our main result shows that sublinear linearized regret is sufficient for strategic robustness. Specifically, we construct simple black-box reductions that convert any OLO algorithm into a strategically robust no-regret bidding algorithm, in both known and unknown value distribution settings. For the known value distribution case, our reduction yields a bidding algorithm that achieves $O(\sqrt{T \log K})$ regret and strategic robustness (an exponential improvement in the $K$-dependence compared to [KSS24]). For the unknown value distribution case, our reduction gives a bidding algorithm with high-probability $O(\sqrt{T (\log K+\log(T/\delta))})$ regret and strategic robustness, while removing the bounded density assumption made in [KSS24].
💡 Research Summary
The paper studies repeated Bayesian first‑price auctions with a single buyer who repeatedly draws a private value from a distribution F and must decide a bid from a discrete grid of K possible bids. Two desiderata are required of any bidding algorithm: (i) sublinear external regret with respect to the best fixed bidding policy in hindsight, and (ii) strategic robustness, meaning that a strategic seller cannot extract more revenue than the optimal single‑shot Myerson reserve price by exploiting the buyer’s learning algorithm.
Previous work (Kumar, Schneider, Sivakumar 2024, “KSS24”) showed that a specific online convex‑optimization formulation based on quantile strategies reduces the problem to an online linear‑optimization (OLO) problem, and that applying Online Gradient Ascent (OGA) yields O(√(TK)) regret and strategic robustness. However, OGA was the only known algorithm achieving both properties, and its O(√(TK)) regret bound is suboptimal when the bid grid is fine (large K).
The authors ask whether any OLO algorithm can be used, and whether the dependence on K can be improved. Their answer is affirmative. The key technical contributions are:
- Linearized Regret as the Right Objective – For each round t, the buyer’s utility uₜ(p) (where p is a point in the strategy space) is concave. By linearizing around the current point, the regret over the linearized losses ⟨∇uₜ(pₜ), pₜ − p⟩ upper‑bounds the original concave regret. The authors prove that minimizing this linearized regret against the all‑zero strategy (always bid 0) is sufficient to guarantee strategic robustness: the seller’s total revenue is at most Myerson(F) plus the linearized regret. Hence, sublinear linearized regret ⇒ sublinear strategic robustness.
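The linearization step is the standard concavity argument; a sketch in the summary’s notation (p a comparator policy, pₜ the algorithm’s play at round t):

```latex
% Concavity of u_t at p_t gives, for every comparator p:
%   u_t(p) - u_t(p_t) \le \langle \nabla u_t(p_t),\, p - p_t \rangle.
% Summing over t = 1, \dots, T:
\underbrace{\sum_{t=1}^{T} \bigl( u_t(p) - u_t(p_t) \bigr)}_{\text{concave regret vs.\ } p}
\;\le\;
\underbrace{\sum_{t=1}^{T} \bigl\langle \nabla u_t(p_t),\; p - p_t \bigr\rangle}_{\text{linearized regret vs.\ } p}
```

So any OLO algorithm that controls the right-hand side automatically controls the buyer’s original regret, which is why linearized regret is the quantity the reductions target.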
- Re‑parameterizing the Strategy Space – Instead of the quantile polytope used in KSS24, the paper represents a bidding policy by a probability vector q ∈ Δ_K, where q_k is the probability of playing bid b_k when the value is drawn from F. This maps the problem onto the K‑dimensional probability simplex, a canonical OLO domain with well‑understood geometry. The reduction from a generic OLO algorithm to a bidding algorithm is black‑box: feed the linearized gradients to the OLO algorithm, interpret its output as a probability distribution over bids, and sample accordingly.
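The black‑box reduction can be sketched as a short loop. This is an illustrative skeleton, not the paper’s code: the `olo` interface (`predict`/`update`), and the helpers `sample_value` and `grad_utility`, are hypothetical names for the pieces the summary describes.

```python
import numpy as np

def run_reduction(olo, bid_grid, sample_value, grad_utility, T, rng=None):
    """Black-box reduction sketch: any OLO algorithm over the simplex
    becomes a bidding algorithm. `olo` exposes .predict() -> q in Delta_K
    and .update(g), which consumes a linear gradient; `grad_utility(q, v)`
    returns the per-bid gradient of the round's (linearized) utility.
    All names are illustrative, not the paper's API."""
    rng = rng or np.random.default_rng(0)
    bids = []
    for t in range(T):
        q = olo.predict()                    # point in the simplex Delta_K
        k = rng.choice(len(bid_grid), p=q)   # sample a bid index from q
        v = sample_value()                   # buyer's private value this round
        bids.append(bid_grid[k])
        olo.update(grad_utility(q, v))       # feed the linearized gradient back
    return bids
```

The key point is that the loop never inspects the OLO algorithm's internals, so any simplex OLO method with sublinear linear regret can be plugged in.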
- Known Distribution Results – Plugging in the Multiplicative Weights Update (MWU) algorithm yields regret and strategic robustness both bounded by O(√(T log K)). This improves the K‑dependence from linear (in OGA) to logarithmic, which is crucial when K is large due to fine discretization of the continuous bid interval. The result matches the optimal √T dependence on T while achieving an exponential improvement in K.
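MWU over the simplex is a few lines; a minimal sketch (the step size and normalization details here are textbook choices, not necessarily the paper’s exact tuning):

```python
import numpy as np

class MWU:
    """Multiplicative Weights Update over the K-simplex, the OLO algorithm
    plugged into the reduction above. With eta ~ sqrt(log K / T) and bounded
    gains, its linear regret is O(sqrt(T log K)). Minimal sketch."""
    def __init__(self, K, eta):
        self.w = np.ones(K)
        self.eta = eta
    def predict(self):
        # current point in the simplex: normalized weights
        return self.w / self.w.sum()
    def update(self, gain):
        # exponential reweighting by the linearized utility gradient
        self.w *= np.exp(self.eta * np.asarray(gain, dtype=float))
```

Because MWU’s regret depends on K only through log K, instantiating the reduction with it gives the O(√(T log K)) bound quoted above.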
- Negative Result for Plain Regret – The authors also construct an example where a bidding algorithm enjoys sublinear regret on the original concave utilities but fails to be strategically robust; the seller can still extract Ω(T) extra revenue. This demonstrates that ordinary external regret is insufficient, and the linearized regret condition is essentially tight.
- Unknown Distribution Setting – When the buyer does not know F, the gradient ∇uₜ(pₜ) cannot be computed directly. KSS24 handled this by pretending the values are uniform, incurring a factor depending on the maximum density \bar f. The new paper proposes a more refined approach: maintain an empirical distribution Fₜ from observed values, linearly interpolate it to obtain a continuous distribution \tilde Fₜ, then shift a small mass to zero to create a dominated continuous empirical distribution \hat Fₜ. This distribution satisfies two crucial properties: (a) it is (almost) absolutely continuous, eliminating the K‑dependent translation error between probability vectors and bidding policies; (b) it is stochastically dominated by the true F, guaranteeing Myerson(\hat Fₜ) ≤ Myerson(F). Using \hat Fₜ in place of F allows the buyer to compute approximate gradients and run any OLO algorithm.
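A schematic version of this estimator: interpolate the empirical CDF, then shift a small mass ε to zero, which raises the CDF pointwise and (for ε chosen via a uniform concentration bound on the empirical CDF) makes it stochastically dominated by the true F with high probability. The function below is our illustration of the idea, not the paper’s exact construction.

```python
import numpy as np

def dominated_empirical_cdf(samples, eps):
    """Sketch of the dominated continuous empirical distribution:
    linearly interpolate the empirical CDF of observed values, then
    shift eps mass to zero. Raising the CDF everywhere corresponds to
    a distribution of (stochastically) lower values, hence one that is
    dominated by the truth when eps exceeds the estimation error."""
    xs = np.sort(np.asarray(samples, dtype=float))
    n = len(xs)
    def cdf(x):
        if x < 0:
            return 0.0
        # linearly interpolated empirical CDF in [0, 1]
        base = np.interp(x, xs, np.arange(1, n + 1) / n, left=0.0, right=1.0)
        # the eps mass moved to zero lifts the CDF, capped at 1
        return min(1.0, base + eps)
    return cdf
```

The buyer would then differentiate the utility under this continuous surrogate to obtain the approximate gradients fed to the OLO algorithm.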
- High‑Probability Guarantees – With the above estimator, the authors prove that for any OLO algorithm with regret g(T) over the simplex, the resulting bidding algorithm achieves, with probability at least 1 − δ, total regret and strategic robustness bounded by O(g(T) + √(T log(T/δ))), up to problem‑dependent constants (essentially the Lipschitz constant of the utility). Using MWU again yields O(√(T(log K + log(T/δ)))) high‑probability bounds, removing the dependence on \bar f and matching the known‑distribution logarithmic K‑dependence.
Overall, the paper establishes that online linear optimization is sufficient for strategically robust no‑regret bidding in first‑price auctions, both when the value distribution is known and when it must be learned online. The reductions are simple, black‑box, and improve upon prior work by (i) achieving logarithmic dependence on the number of bids, (ii) eliminating the need for density‑boundedness assumptions, and (iii) providing a unified analysis that works for any OLO algorithm, not just OGA. This broadens the toolbox for practitioners designing automated bidding agents in large‑scale ad‑exchange markets, where computational efficiency and robustness to seller manipulation are both critical.