Continuous Evolution Pool: Taming Recurring Concept Drift in Online Time Series Forecasting
Recurring concept drift poses a dual challenge in online time series forecasting: mitigating catastrophic forgetting while adhering to strict privacy constraints that prevent retaining historical data. Existing approaches predominantly rely on parameter updates or experience replay, which suffer from knowledge overwriting and privacy risks, respectively. To address this, we propose the Continuous Evolution Pool (CEP), a privacy-preserving framework that maintains a dynamic pool of specialized forecasters. Instead of storing raw samples, CEP utilizes lightweight statistical genes to decouple concept identification from forecasting. Specifically, it employs a Retrieval mechanism to identify the nearest concept based on gene similarity, an Evolution strategy to spawn new forecasters upon detecting distribution shifts, and an Elimination policy to prune obsolete models under memory constraints. Experiments on real-world datasets demonstrate that CEP significantly outperforms state-of-the-art baselines, reducing forecasting error by over 20% without accessing historical ground truth.
💡 Research Summary
The paper tackles a pressing problem in online time series forecasting: recurring concept drift (RCD) under strict privacy constraints that forbid the storage of raw historical samples. Conventional solutions either continuously update model parameters, which inevitably leads to catastrophic forgetting of previously learned concepts, or rely on experience replay buffers that must retain raw data and thus violate privacy regulations such as GDPR and CCPA. To overcome these limitations, the authors propose the Continuous Evolution Pool (CEP), a novel framework that maintains a dynamic collection of specialized forecasters while storing only lightweight statistical “genes” (mean and variance) for each concept.
Core Mechanism
CEP separates concept identification from the forecasting task. Each forecaster in the pool is paired with a gene vector composed of a global component (capturing long‑term distributional statistics) and a local component (capturing short‑term fluctuations). When a new input window arrives, CEP computes the similarity between the input’s gene and the genes of all forecasters. The forecaster with the smallest distance is selected to generate the prediction and is updated via a single‑step stochastic gradient descent on the current batch. If even the smallest distance exceeds a predefined evolution threshold, CEP interprets this as the emergence of a new concept: it spawns a new forecaster by cloning the most similar existing model as a warm start for rapid adaptation, and adds the new forecaster, together with its gene, to the pool.
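The retrieval-or-evolve logic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gene layout, the threshold value, and the use of a plain dict as a stand-in for the neural forecaster are all assumptions.

```python
import numpy as np

EVOLUTION_THRESHOLD = 1.0  # illustrative value; in practice this would be tuned per dataset

def make_gene(window, global_mean, global_var):
    # Gene = global distributional statistics + local window statistics.
    # The exact composition is an assumption for this sketch.
    return np.array([global_mean, global_var, window.mean(), window.var()])

def retrieve_or_evolve(pool, gene, threshold=EVOLUTION_THRESHOLD):
    """pool: list of {"gene": np.ndarray, "model": dict} entries.
    Returns the index of the forecaster to use. If no stored gene is
    close enough, a new entry is spawned by cloning the nearest one."""
    dists = [float(np.linalg.norm(gene - entry["gene"])) for entry in pool]
    nearest = int(np.argmin(dists))
    if dists[nearest] > threshold:
        # Evolution: clone the closest model as a warm start for the new concept.
        pool.append({"gene": gene.copy(), "model": dict(pool[nearest]["model"])})
        nearest = len(pool) - 1
    return nearest

# Toy usage: a similar window reuses forecaster 0; a shifted one triggers evolution.
pool = [{"gene": np.array([0.0, 1.0, 0.0, 1.0]), "model": {"id": 0}}]
near = retrieve_or_evolve(pool, np.array([0.0, 1.0, 0.1, 0.9]))  # small distance -> reuse
far = retrieve_or_evolve(pool, np.array([5.0, 2.0, 5.0, 2.0]))   # large distance -> spawn
```

The single-step SGD update of the selected forecaster is omitted here; only the pool bookkeeping is shown.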
Memory Management
Because the pool size must remain bounded, CEP implements an elimination policy. Each forecaster accumulates an activity score based on usage frequency and recency. Forecasters that remain inactive beyond a configurable horizon are pruned, freeing memory for future concepts and preventing noisy or obsolete models from contaminating predictions.
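A simple sketch of such an elimination policy is shown below. The two-stage rule (prune by inactivity horizon, then evict by activity score until the pool fits) and the entry fields are illustrative assumptions, not the paper's exact scheme.

```python
def prune_pool(pool, step, max_size, horizon):
    """Elimination sketch: drop forecasters inactive for more than `horizon`
    steps, then evict the lowest-activity entries until the pool fits
    `max_size`. Field names ("last_used", "score") are illustrative."""
    pool[:] = [e for e in pool if step - e["last_used"] <= horizon]
    while len(pool) > max_size:
        pool.remove(min(pool, key=lambda e: e["score"]))
    return pool

# Toy pool at time step 100 with a memory budget of one forecaster.
pool = [
    {"id": 0, "last_used": 0,  "score": 5.0},  # inactive for 100 steps -> pruned
    {"id": 1, "last_used": 90, "score": 1.0},  # recent but lowest score -> evicted
    {"id": 2, "last_used": 95, "score": 3.0},  # recent and active -> kept
]
prune_pool(pool, step=100, max_size=1, horizon=50)
```

In practice `max_size` and `horizon` would be set by the device's memory budget and the expected recurrence interval of concepts.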
Privacy Preservation
Only aggregated statistics (mean, variance) are stored; raw time‑series values never appear in memory. These low‑order moments are insufficient for reconstructing individual records, thereby providing a strong privacy guarantee while still being sensitive enough to detect macro‑level distribution shifts.
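To make the privacy argument concrete, the reduction of a raw window to its gene can be sketched as below. The function name and dict layout are illustrative; the point is that only two aggregates per window persist, and the raw values are discarded.

```python
import numpy as np

def extract_gene(window):
    """Reduce a raw window to low-order moments; raw values are never stored."""
    return {"mean": float(np.mean(window)), "var": float(np.var(window))}

window = np.array([10.2, 11.5, 9.8, 10.9])  # raw readings, e.g. electricity load
gene = extract_gene(window)
# Only `gene` is retained; the two moments cannot reconstruct the four readings.
```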
Experimental Evaluation
The authors evaluate CEP on four datasets (electricity consumption, traffic flow, financial series, and a synthetic benchmark) under a delayed‑feedback protocol in which ground‑truth labels become available only after a lag. CEP is instantiated with several state‑of‑the‑art backbone forecasters (DLinear, Informer, Autoformer) and compared against leading drift‑handling baselines: DER++ (experience replay), FSNet (complementary learning systems), OneNet (reinforcement‑learning‑based model expansion), and classic drift detectors such as ADWIN. Results show that CEP reduces Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) by 20–28% relative to the strongest baseline while using less than 30% of its memory footprint. The advantage is especially pronounced when the same concept reappears multiple times, confirming that CEP’s “expert model reuse” is effective.
Theoretical and Practical Contributions
- Decoupled Adaptation – By delegating macro‑level drift detection to statistical genes, CEP allows the neural forecasters to focus on learning complex temporal patterns without being distracted by distributional noise.
- Dynamic Evolution – The nearest‑evolution strategy creates new specialized models only when necessary, avoiding unnecessary parameter churn.
- Resource‑Efficient Knowledge Retention – The elimination mechanism ensures that the pool remains compact, making CEP suitable for edge devices with limited storage and compute.
- Privacy‑First Design – No raw data is ever persisted, satisfying stringent data‑protection regulations.
Limitations and Future Work
The current implementation handles univariate series; extending CEP to multivariate streams will require richer gene representations (e.g., covariance matrices). Moreover, the similarity metric is Euclidean on mean‑variance vectors; more sophisticated distributional distances (e.g., Wasserstein) could improve concept discrimination. Finally, integrating reinforcement learning to adapt the evolution threshold and elimination schedule dynamically is an interesting direction.
In summary, Continuous Evolution Pool offers a compelling solution to recurring concept drift in online time series forecasting, combining privacy preservation, memory efficiency, and superior predictive performance through a principled, gene‑based model pool architecture.