Early warning of Mpox outbreaks in U.S. jurisdictions using Lasso Vector Autoregression models with cross-jurisdictional lags


Mpox is an orthopoxvirus that infects humans and animals and is transmitted primarily through close physical contact. The episodic and spatially heterogeneous dynamics of Mpox transmission underscore the need for timely, area-specific forecasts to support targeted public health responses in the U.S. We develop a Vector Autoregression model with Lasso regularization (VAR-Lasso) to generate rolling two-week-ahead forecasts of weekly Mpox cases for eight high-incidence U.S. jurisdictions using national surveillance data from the Centers for Disease Control and Prevention (CDC). The VAR-Lasso model identifies significant long-lag, cross-jurisdictional predictors. For a case study in San Diego County (SDC), these statistical predictors align with phylogenetic analysis that traces a 2023 cluster in SDC to an outbreak in Illinois six months earlier. As the need for public health action is often greatest when incidence is increasing, our performance evaluation focuses on positive-slope weighted error metrics. Forecast performance of the VAR-Lasso model is compared to a univariate autoregressive (AR) Lasso model and a naive moving-average estimate. The models are compared using slope-weighted Root Mean Squared Error (RMSE), slope-weighted Mean Absolute Error (MAE), and slope-weighted bias. Across all observations, the VAR-Lasso model reduces slope-weighted RMSE, MAE, and bias by 12%, 7%, and 66% relative to the AR model, and by 16%, 13%, and 76% relative to the naive benchmark. Our findings highlight the value of sparse multivariate time-series models that leverage cross-jurisdictional case data for early forecasting of Mpox outbreaks. Such forecasting can aid health departments in proactively providing timely resources and messaging to mitigate the risks of a future outbreak.


💡 Research Summary

This paper addresses the need for timely, jurisdiction‑specific early warnings of Mpox (monkeypox) outbreaks in the United States, where transmission has become episodic and spatially heterogeneous since the large 2022 global wave. The authors develop a two‑week‑ahead forecasting framework based on a Vector Autoregression model regularized with the Least Absolute Shrinkage and Selection Operator (VAR‑Lasso). By incorporating lagged case counts from multiple high‑incidence jurisdictions, the model captures cross‑jurisdictional dynamics that single‑variable autoregressive (AR) models cannot.
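
For readers who want the model in symbols, a generic VAR-Lasso can be written as below. This is the standard textbook form and not necessarily the authors' exact specification. The notation is assumed here: y_t is the vector of weekly case counts for the eight jurisdictions, p is the maximum lag, A_ℓ are the lag coefficient matrices, c is an intercept, and λ is the penalty weight.

```latex
\begin{aligned}
y_t &= c + \sum_{\ell=1}^{p} A_\ell\, y_{t-\ell} + \varepsilon_t,
      \qquad y_t \in \mathbb{R}^{8}, \\
(\hat c, \hat A_1, \dots, \hat A_p)
    &= \arg\min_{c,\, A_1, \dots, A_p}
       \sum_{t=p+1}^{T} \Bigl\| y_t - c - \sum_{\ell=1}^{p} A_\ell\, y_{t-\ell} \Bigr\|_2^2
       + \lambda \sum_{\ell=1}^{p} \sum_{i,j} \bigl| (A_\ell)_{ij} \bigr| .
\end{aligned}
```

An off-diagonal entry (A_ℓ)_{ij} with i ≠ j is a cross-jurisdictional term: the effect of jurisdiction j's cases ℓ weeks earlier on jurisdiction i. The ℓ1 penalty drives most of these entries to exactly zero, which is what allows long lags to be included without overfitting. A univariate AR-Lasso corresponds to forcing every A_ℓ to be diagonal.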

Data were drawn from the CDC national surveillance system for weekly Mpox cases between January 2023 and November 2024. Eight jurisdictions accounting for 55 % of U.S. cases (New York City, Texas, Los Angeles County, Florida, Illinois, Georgia, San Diego County, Washington) were selected. The VAR‑Lasso was trained on the full 2023 series and then used in a rolling fashion to generate 42 two‑week‑ahead forecasts for 2024, updating the training set each week.
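
The rolling scheme described above can be sketched as an expanding-window loop. This is a minimal illustration, not the authors' code: the function names, the toy series, and the four-week averaging window are assumptions, and the naive moving average stands in for whichever forecaster (VAR-Lasso, AR-Lasso) is being evaluated.

```python
from typing import Callable, List, Sequence, Tuple


def moving_average_forecast(history: Sequence[float], window: int = 4) -> float:
    """Naive benchmark: mean of the most recent `window` weeks.

    The window length of 4 is an assumption for illustration only.
    """
    recent = history[-window:]
    return sum(recent) / len(recent)


def rolling_forecasts(
    series: Sequence[float],
    first_origin: int,
    horizon: int = 2,
    forecaster: Callable[[Sequence[float]], float] = moving_average_forecast,
) -> List[Tuple[int, float, float]]:
    """Expanding-window evaluation.

    At each forecast origin t the forecaster sees only series[0..t] and
    predicts week t + horizon. Returns (target_week, forecast, observed).
    """
    results = []
    for origin in range(first_origin, len(series) - horizon):
        history = series[: origin + 1]  # data available at the origin
        target = origin + horizon       # week being forecast
        results.append((target, forecaster(history), series[target]))
    return results


# Toy weekly counts, invented for illustration (not CDC data).
toy_cases = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
forecasts = rolling_forecasts(toy_cases, first_origin=3)
for week, predicted, observed in forecasts:
    print(f"week {week}: forecast {predicted:.1f}, observed {observed}")
```

Because each forecast uses only data up to its origin, the loop avoids look-ahead leakage; swapping in a different `forecaster` callable leaves the evaluation logic unchanged.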

The authors introduce a novel evaluation metric, the positive-slope-weighted error. Errors (RMSE, MAE, bias) are weighted by the observed week-to-week increase in cases, so that periods of rising incidence, which matter most for public-health action, count more heavily. Relative to the univariate AR-Lasso, the VAR-Lasso reduced weighted RMSE by 12 % (1.75 vs. 2.00), weighted MAE by 7 % (1.42 vs. 1.52), and weighted bias by 66 % (–0.19 vs. –0.56). Relative to the naïve moving-average benchmark, the reductions were 16 % (vs. 2.11), 13 % (vs. 1.64), and 76 % (vs. –0.80). These improvements indicate that the multivariate, sparsely regularized model better anticipates surges, deviating by less than two cases per week on average when weighted by the magnitude of the surge.
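
One plausible reading of the slope-weighted metrics is sketched below. The exact formula is not given in this summary, so the weighting scheme is an assumption: each week's weight is max(observed increase from the previous week, 0), weights are normalized to sum to one, and error is defined as forecast minus observed (so negative bias means under-prediction). Function names and the toy numbers are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def positive_slope_weights(observed: Sequence[float]) -> List[float]:
    """Weight for week t is max(observed[t] - observed[t-1], 0).

    The first week has no predecessor and gets weight 0 (assumption).
    """
    increases = [max(b - a, 0.0) for a, b in zip(observed, observed[1:])]
    return [0.0] + increases


def slope_weighted_metrics(
    observed: Sequence[float], predicted: Sequence[float]
) -> Dict[str, float]:
    """Weighted RMSE, MAE and bias; weeks with flat or falling counts are ignored."""
    weights = positive_slope_weights(observed)
    total = sum(weights)
    if total == 0:
        raise ValueError("no week-to-week increases; metrics are undefined")
    errors = [p - o for p, o in zip(predicted, observed)]
    return {
        "rmse": math.sqrt(sum(w * e * e for w, e in zip(weights, errors)) / total),
        "mae": sum(w * abs(e) for w, e in zip(weights, errors)) / total,
        "bias": sum(w * e for w, e in zip(weights, errors)) / total,
    }


# Toy example, invented for illustration: the forecast misses both rises.
toy_observed = [2, 4, 3, 6]
toy_predicted = [2, 3, 3, 4]
metrics = slope_weighted_metrics(toy_observed, toy_predicted)
print(metrics)
```

In the toy example only the second and fourth weeks carry weight (increases of 2 and 3), and the forecast undershoots both, which yields a negative bias of the same sign as the values reported above.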

A detailed case study for San Diego County (SDC) illustrates the model's practical value. SDC experienced a modest but recurrent transmission pattern, with peaks of 12 cases in October 2023 and 7 cases in July 2024. While AR-Lasso and the naïve estimator under-predicted these peaks, VAR-Lasso leveraged cross-jurisdictional information to forecast them accurately. The model identified Illinois as the most influential external predictor for SDC, with the strongest coefficients attached to the 23- and 24-week lags, implying a lead time of roughly six months. Visual inspection of the raw case series confirmed that Illinois case spikes in April 2023 and February 2024 preceded SDC spikes by roughly 23 weeks.
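
A simple programmatic counterpart to the visual check of lead times is a lagged cross-correlation scan. This is a different and much cruder technique than the VAR-Lasso coefficients, and it is not taken from the paper; it is shown only as a sanity check one might run on two raw series. The function names and the toy series are assumptions.

```python
import math
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lead_lag(
    source: Sequence[float], target: Sequence[float], max_lag: int
) -> Tuple[int, Dict[int, float]]:
    """Correlate source[t - lag] with target[t] for lag = 1..max_lag.

    Returns the lag with the highest correlation and all scores.
    """
    scores = {}
    for lag in range(1, max_lag + 1):
        # Align source weeks with target weeks `lag` weeks later.
        scores[lag] = pearson(source[:-lag], target[lag:])
    best = max(scores, key=scores.get)
    return best, scores


# Toy series, invented for illustration: the target spike trails the
# source spike by 3 weeks.
toy_source = [0, 0, 5, 1, 0, 0, 0, 0, 0, 0]
toy_target = [0, 0, 0, 0, 0, 4, 1, 0, 0, 0]
best_lag, lag_scores = best_lead_lag(toy_source, toy_target, max_lag=5)
print(best_lag, round(lag_scores[best_lag], 3))
```

With real surveillance data, a scan like this is sensitive to shared seasonality and short series, which is one reason a regularized multivariate model is a more reliable way to attribute long-lag effects.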

To validate this statistical finding, the authors performed phylogeographic analysis on 29 SDC MPXV genomes. Nineteen genomes formed a dominant cluster (Cluster B) whose most recent common ancestor dated to September 2023. Phylogenetic inference assigned a posterior probability of 0.72 to Illinois as the immediate source, and a probability of 1.0 to Illinois as the ultimate source, corresponding to a transmission lag of 22 to 28 weeks. This independent genomic evidence aligns closely with the lag identified by the VAR-Lasso, providing strong external validation of the inferred transmission pathway.

The study acknowledges limitations: reliance on CDC reporting may introduce delays or under-reporting; the analysis is confined to eight jurisdictions, limiting broader generalizability; and model performance is sensitive to the choice of Lasso penalty parameter. Future work could explore alternative regularization approaches (e.g., Bayesian shrinkage priors, Elastic Net) and expand the jurisdictional set.

In summary, the VAR‑Lasso framework successfully integrates cross‑jurisdictional lagged case data, isolates long‑range transmission signals, and delivers superior early‑warning forecasts during periods of increasing Mpox incidence. By providing health departments with more accurate, timely predictions, this approach can inform proactive vaccination campaigns, targeted messaging, and resource allocation, ultimately mitigating the impact of future Mpox outbreaks and offering a template for early‑warning systems for other emerging infectious diseases.

