Efficient Tracking of Large Classes of Experts
In the framework of prediction of individual sequences, sequential prediction methods are to be constructed that perform nearly as well as the best expert from a given class. We consider prediction strategies that compete with the class of switching strategies that can segment a given sequence into several blocks, and follow the advice of a different “base” expert in each block. As usual, the performance of the algorithm is measured by the regret defined as the excess loss relative to the best switching strategy selected in hindsight for the particular sequence to be predicted. In this paper we construct prediction strategies of low computational cost for the case where the set of base experts is large. In particular we provide a method that can transform any prediction algorithm $\mathcal{A}$ that is designed for the base class into a tracking algorithm. The resulting tracking algorithm can take advantage of the prediction performance and potential computational efficiency of $\mathcal{A}$ in the sense that it can be implemented with time and space complexity only $O(n^{\gamma} \ln n)$ times larger than that of $\mathcal{A}$, where $n$ is the time horizon and $\gamma \ge 0$ is a parameter of the algorithm. With $\mathcal{A}$ properly chosen, our algorithm achieves a regret bound of optimal order for $\gamma>0$, and only $O(\ln n)$ times larger than the optimal order for $\gamma=0$ for all typical regret bound types we examined. For example, for predicting binary sequences with switching parameters under the logarithmic loss, our method achieves the optimal $O(\ln n)$ regret rate with time complexity $O(n^{1+\gamma}\ln n)$ for any $\gamma\in (0,1)$.
💡 Research Summary
The paper addresses the problem of online prediction with expert advice when the learner is allowed to switch among experts a limited number of times. This “tracking” setting is formalized by meta‑experts that follow a base expert on each segment of a partition of the time horizon, with the number of switches C controlling the complexity of the meta‑expert class. Classical approaches such as full transition‑diagram methods achieve optimal regret but require O(n²) time because they must maintain weights for all possible segment start points at each round. Linear‑time methods exist for special cases, yet they either need a priori bounds on C or suffer a noticeable increase in regret.
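To make the comparator in this setting concrete, the following is a minimal sketch of how the loss of the best meta-expert with at most C switches could be computed in hindsight by dynamic programming, given a table of base-expert losses. It is an illustration of the regret definition only, not the paper's tracking algorithm; the names `best_switching_loss`, `tracking_regret`, `losses`, and `max_switches` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def best_switching_loss(losses: Sequence[Sequence[float]], max_switches: int) -> float:
    """Minimal cumulative loss of any meta-expert using at most `max_switches` switches.

    losses[t][i] is the loss of base expert i at round t (assumed layout).
    """
    if not losses:
        return 0.0
    n_experts = len(losses[0])
    inf = float("inf")
    # dp[c][i]: best loss so far when following expert i after exactly c switches.
    dp = [[inf] * n_experts for _ in range(max_switches + 1)]
    dp[0] = [float(x) for x in losses[0]]
    for round_losses in losses[1:]:
        row_min = [min(row) for row in dp]
        new_dp = []
        for c in range(max_switches + 1):
            # Cheapest way to arrive by switching now (spends one switch).
            switch_in = row_min[c - 1] if c > 0 else inf
            new_dp.append(
                [min(dp[c][i], switch_in) + round_losses[i] for i in range(n_experts)]
            )
        dp = new_dp
    return min(min(row) for row in dp)


def tracking_regret(
    learner_losses: Sequence[float],
    losses: Sequence[Sequence[float]],
    max_switches: int,
) -> float:
    """Excess loss of the learner over the best switching strategy in hindsight."""
    return sum(learner_losses) - best_switching_loss(losses, max_switches)
```

For example, with two experts whose losses over four rounds are `[[0, 1], [0, 1], [1, 0], [1, 0]]`, the best single expert incurs loss 2, while allowing one switch brings the comparator's loss down to 0. This hindsight computation costs O(n · C · N) for N base experts; the online problem the paper addresses is harder because the switch points are not known in advance.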
The authors propose a general reduction that converts any base prediction algorithm 𝔄 (e.g., exponentially weighted average or online gradient descent) into a tracking algorithm. The reduction is black‑box: it only requires that 𝔄 produce predictions and cumulative losses for the base experts. The key technical device is a reduced transition diagram whose states encode only a limited amount of information, typically the time of the last switch, the current segment start, and the number of switches used so far. By controlling the granularity of this state space with a parameter γ ∈