Revisiting Multi-Agent Asynchronous Online Optimization with Delays: the Strongly Convex Case
We revisit multi-agent asynchronous online optimization with delays, where only one of the agents becomes active to make the decision at each round, and the corresponding feedback is received by all the agents after unknown delays. Although previous studies have established an $O(\sqrt{dT})$ regret bound for this problem, they assume that the maximum delay $d$ is knowable or that the arrival order of feedback satisfies a special property, which may not hold in practice. In this paper, we show, perhaps surprisingly, that when the loss functions are strongly convex, these assumptions can be eliminated, and the existing regret bound can simultaneously be improved to $O(d\log T)$. Specifically, to exploit the strong convexity of functions, we first propose a delayed variant of the classical follow-the-leader algorithm, namely FTDL, which is very simple but requires the full information of functions as feedback. Moreover, to handle the more general case with only gradient feedback, we develop an approximate variant of FTDL by combining it with surrogate loss functions. Experimental results show that the approximate FTDL outperforms the existing algorithm in the strongly convex case.
💡 Research Summary
The paper tackles the problem of multi‑agent asynchronous online convex optimization (OCO) with unknown and possibly heterogeneous delays. In each round only one agent is active, makes a decision, and the loss incurred is observed by all agents after an arbitrary delay. Prior work (notably Hsieh et al., 2022) introduced the Delayed Dual Averaging (DDA) algorithm, which achieves the optimal O(√d T) regret for convex losses but requires either knowledge of the maximum delay d or a special ordering assumption on the arrival of feedback—conditions that are rarely satisfied in practice. Moreover, it remained unclear whether the strong convexity of loss functions could be exploited to improve regret, as is possible in the single‑agent setting (where O(d log T) regret is known to be optimal).
The authors answer both questions affirmatively. Assuming each loss function f_t is λ‑strongly convex over a compact convex set K (and that gradients are uniformly bounded by G), they propose a delayed variant of the classic Follow‑the‑Leader (FTL) algorithm, called Follow‑the‑Delayed‑Leader (FTDL). At round t, the active agent gathers all feedback that has arrived so far (the set F_t) and selects
x_t = arg min_{x∈K} ∑_{s∈F_t} f_s(x).
If no feedback has arrived yet, any point in K may be chosen. This algorithm is parameter‑free and requires the full loss functions as feedback. The key theoretical contribution is Theorem 1, which shows that under the bounded‑gradient and strong‑convexity assumptions the cumulative regret satisfies
R_T ≤ (2 d G²/λ)·(1 + ln T).
The proof builds on the classic analysis of FTL: an “ideal” decision that uses all past losses up to time t (including those not yet observed) enjoys non‑positive regret; the regret of FTDL is then bounded by the sum of distances between the actual and ideal decisions. Strong convexity yields a quadratic relationship between this distance and the loss gap, and the fact that at most d losses are missing at any round introduces the factor d. Consequently, the regret scales linearly with the maximum delay and only logarithmically with the horizon, matching the best known bound for the single‑agent case without any knowledge of d.
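The FTDL rule described above can be sketched in code. For concreteness, consider quadratic strongly convex losses f_s(x) = (λ/2)‖x − c_s‖² over a box K = [lo, hi]ⁿ; in that case the FTL minimizer of the arrived losses is simply the mean of the arrived centers, projected onto K. The function names and the quadratic loss family here are illustrative assumptions, not part of the paper, which states the rule for general strongly convex losses.

```python
import numpy as np

def ftdl_decision(arrived_centers, lo, hi, default):
    """One FTDL step, sketched for illustrative quadratic losses
    f_s(x) = (lam/2) * ||x - c_s||^2 over the box K = [lo, hi]^n.

    arrived_centers: list of centers c_s for rounds s in F_t
                     (the feedback that has arrived so far).
    The sum of these quadratics is minimized by the mean of the
    centers; since the objective is separable per coordinate,
    clipping onto the box gives the constrained argmin over K.
    """
    if not arrived_centers:
        # No feedback yet: any point in K is allowed.
        return np.clip(np.asarray(default, dtype=float), lo, hi)
    mean = np.mean(arrived_centers, axis=0)
    return np.clip(mean, lo, hi)

# Simulated asynchronous run: feedback from round s arrives after delay_s.
centers = [np.array([0.0]), np.array([2.0]), np.array([4.0])]
delays = [2, 0, 1]  # round-s feedback arrives at the start of round s + delay_s
decisions = []
for t in range(3):
    arrived = [centers[s] for s in range(t) if s + delays[s] <= t]
    decisions.append(ftdl_decision(arrived, lo=-5.0, hi=5.0, default=np.array([0.0])))
```

Note how the delay pattern, not the round index, determines which losses enter each argmin; at most d terms are ever missing, which is exactly where the factor d in the regret bound comes from.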
Recognizing that in many applications only gradient information is available, the authors develop Approximate‑FTDL. Each loss is replaced by its first‑order surrogate
\tilde f_t(x) = f_t(x_t) + ⟨∇f_t(x_t), x − x_t⟩ + (λ/2)‖x − x_t‖²,
which is a strongly convex approximation. The decision rule becomes
x_t = arg min_{x∈K} ∑_{s∈F_t} \tilde f_s(x),
so that only the decision points x_s and the gradients ∇f_s(x_s), rather than the full loss functions, are required as feedback.
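Because each surrogate is a quadratic with curvature λ·I, the sum of arrived surrogates has a closed-form unconstrained minimizer: setting ∑_s ∇f_s(x_s) + λ ∑_s (x − x_s) = 0 gives x = mean(x_s) − (1/(λ|F_t|)) ∑_s ∇f_s(x_s), and since the objective is separable per coordinate, clipping onto a box K recovers the constrained argmin. A minimal sketch under these assumptions (the function name and box-shaped K are illustrative, not from the paper):

```python
import numpy as np

def approx_ftdl_decision(feedback, lam, lo, hi, default):
    """One Approximate-FTDL step over the box K = [lo, hi]^n.

    feedback: list of (x_s, g_s) pairs that have arrived so far,
              where g_s = grad f_s(x_s).
    Each surrogate is
        f_s(x_s) + <g_s, x - x_s> + (lam/2) * ||x - x_s||^2,
    and the sum of surrogates is minimized in closed form:
        x* = mean(x_s) - (1 / (lam * |F_t|)) * sum(g_s),
    then projected (coordinate-wise clipped) onto K.
    """
    if not feedback:
        # No feedback yet: any point in K is allowed.
        return np.clip(np.asarray(default, dtype=float), lo, hi)
    xs = np.array([x for x, _ in feedback], dtype=float)
    gs = np.array([g for _, g in feedback], dtype=float)
    x_star = xs.mean(axis=0) - gs.sum(axis=0) / (lam * len(feedback))
    return np.clip(x_star, lo, hi)
```

As a sanity check, for a quadratic loss f(x) = (λ/2)(x − 1)² queried at x_s = 0 with λ = 2, the gradient is g_s = −2, and the surrogate coincides with f itself, so the rule should return the true minimizer 1.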