Adapting to Non-stationarity with Growing Expert Ensembles
When dealing with time series with complex non-stationarities, low retrospective regret on individual realizations is a more appropriate goal than low prospective risk in expectation. Online learning algorithms provide powerful guarantees of this form, and have often been proposed for use with non-stationary processes because of their ability to switch between different forecasters or "experts". However, existing methods assume that all the experts whose forecasts are to be combined are given at the start, which is not plausible when dealing with a genuinely historical or evolutionary system. We show how to modify the "fixed shares" algorithm for tracking the best expert to cope with a steadily growing set of experts, obtained by fitting new models to new data as it becomes available, and we obtain regret bounds for the growing ensemble.
💡 Research Summary
The paper tackles the challenge of forecasting highly non‑stationary time‑series by shifting the performance goal from minimizing expected risk to minimizing retrospective regret on each realized sequence. This perspective aligns naturally with online learning, where a learner combines predictions from a set of “experts” and updates its strategy based on observed losses. Traditional expert‑combination methods, such as the classic Fixed‑Shares algorithm for tracking the best expert, assume that the entire pool of experts is known and fixed at the outset. In many real‑world scenarios—financial markets reacting to regulatory changes, climate monitoring systems incorporating new sensors, or any evolving historical process—new models are continuously trained on newly available data, causing the expert set to grow over time. The authors argue that this “growing ensemble” situation is not covered by existing theory and propose a principled extension of Fixed‑Shares that can handle a steadily expanding pool of experts while preserving strong regret guarantees.
The proposed algorithm retains the core Fixed‑Shares mechanism: at each time step the learner redistributes a fraction α of the total weight uniformly among all experts, enabling rapid adaptation when the best expert changes. The extension adds two operations. First, at predetermined or performance‑driven trigger points a new expert is trained on the most recent data and inserted into the pool. Second, a portion β of the total weight is allocated to this newcomer, while the existing weights are scaled by (1−β), so the weight vector again sums to one. After this redistribution, the standard exponential‑weight update w_{t+1,i} ∝ w_{t,i}·exp(−η·ℓ_{t,i}) is applied, where ℓ_{t,i} is the loss of expert i at time t and η is the learning rate. The algorithm’s computational complexity remains linear in the current number of experts N_t, which grows at most logarithmically with the horizon under typical settings.
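The mechanics above can be sketched in a few lines of NumPy. This is a minimal illustration under the stated assumptions, not the authors' exact algorithm: `alpha` is the fixed-share mixing rate, `beta` the fraction of weight handed to each newcomer, and `eta` the learning rate; the class name and interface are hypothetical.

```python
import numpy as np

class GrowingFixedShare:
    """Sketch of Fixed-Shares tracking with a growing expert pool."""

    def __init__(self, n_init, alpha=0.05, beta=0.05, eta=1.0):
        self.alpha, self.beta, self.eta = alpha, beta, eta
        # Start with uniform weight over the initial experts.
        self.w = np.full(n_init, 1.0 / n_init)

    def add_expert(self):
        # Newcomer receives a beta share of the total weight;
        # existing weights are scaled by (1 - beta), so sum stays 1.
        self.w = np.append((1.0 - self.beta) * self.w, self.beta)

    def update(self, losses):
        # Exponential-weight update on the observed losses.
        v = self.w * np.exp(-self.eta * np.asarray(losses, dtype=float))
        v /= v.sum()
        # Fixed-share step: spread an alpha fraction uniformly,
        # so no expert's weight ever collapses to zero.
        n = len(v)
        self.w = (1.0 - self.alpha) * v + self.alpha / n

    def predict(self, forecasts):
        # Combined forecast: weighted average over the current pool.
        return float(np.dot(self.w, np.asarray(forecasts, dtype=float)))
```

Both `update` and `add_expert` are O(N_t), matching the linear per-step cost noted above; the `alpha` floor is what lets the learner "track" a changing best expert rather than lock onto an early leader.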
The theoretical contribution is a regret bound that matches the classic Fixed‑Shares result up to an additive term accounting for the introduction of new experts. Assuming bounded losses in
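For context, the classic Fixed‑Shares guarantee that the paper's bound extends takes roughly the following form for losses in [0, 1], against any comparator sequence of experts with at most m switches over T rounds (constants vary across presentations, so treat this as a sketch rather than the paper's statement):

$$
R_T \;\le\; \frac{\eta T}{8} \;+\; \frac{1}{\eta}\left[(m+1)\ln N \;+\; m\ln\frac{1}{\alpha} \;+\; (T-1-m)\ln\frac{1}{1-\alpha}\right],
$$

which, with η and α tuned to m and T, yields regret of order $\sqrt{T\,m\ln(NT)}$. The growing-ensemble bound adds a term reflecting the weight β diverted to newcomers.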