Individualized and Interpretable Sleep Forecasting via a Two-Stage Adaptive Spatial-Temporal Model
Sleep quality impacts well-being, so healthcare providers and individuals need accessible, reliable forecasting tools to support preventive interventions. This paper introduces an interpretable, individualized adaptive spatial-temporal model for predicting sleep quality. We designed a hierarchical architecture consisting of parallel 1D convolutions with varying kernel sizes and dilated convolutions, which extract multi-resolution temporal patterns: short kernels capture rapid physiological changes, while larger kernels and dilation model slower trends. The extracted features are then refined through channel attention, which learns to emphasize the most predictive variables for each individual, followed by a bidirectional LSTM and self-attention that jointly model local sequential dynamics and global temporal dependencies. Finally, a two-stage adaptation strategy ensures the learned representations transfer effectively to new users. We conducted experiments with five input window sizes (3, 5, 7, 9, and 11 days) and five prediction window sizes (1, 3, 5, 7, and 9 days). Our model consistently outperformed time series forecasting baselines, including LSTM, Informer, PatchTST, and TimesNet. The best performance was achieved with a three-day input window and a one-day prediction window, yielding an RMSE of 0.216. The model also maintained good predictive performance at longer forecasting horizons (e.g., an RMSE of 0.257 for a three-day prediction window), highlighting its practical utility for real-world applications. We further conducted an explainability analysis to examine how different features influence sleep quality. These findings demonstrate that the proposed framework offers a robust, adaptive, and explainable solution for personalized sleep forecasting using sparse data from commercial wearable devices.
💡 Research Summary
The paper presents a novel, individualized, and interpretable framework for forecasting sleep quality using only sparse daily summaries from a commercial smartwatch (Garmin Vivosmart 5). Recognizing that high‑resolution polysomnography is costly and that many deep‑learning approaches act as black boxes, the authors design a two‑stage adaptive spatial‑temporal model that balances predictive performance, personalization, and explainability.
The dataset comprises 16 middle‑aged female participants recruited from two community‑based programs in the Netherlands. Over 10–15 weeks each participant wore the smartwatch, generating 24 raw features (e.g., total kilocalories, steps, distance, activity seconds, heart‑rate statistics, respiration metrics, sleep stage durations, stress, and a composite sleep‑score derived from Firstbeat analytics). After discarding the “Hydration” variable (excessive missingness) and the first/last days of recordings, 23 features remain. A systematic feature‑selection pipeline—Pearson correlation filtering followed by Recursive Feature Elimination with a Random‑Forest regressor—reduces the set to the 15 most informative variables, achieving roughly a 32 % dimensionality reduction while preserving predictive power.
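The correlation-filtering stage of this pipeline can be sketched as a greedy pass that keeps a feature only if it is not strongly correlated with any feature already kept. This is a minimal pure-Python sketch, not the authors' code: the 0.9 threshold and the feature names are illustrative assumptions, and the subsequent RFE stage is omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def correlation_filter(features, threshold=0.9):
    """Greedy pass: keep a feature only if |r| with every kept feature
    stays at or below the threshold (threshold value is an assumption)."""
    kept = []
    for name, series in features.items():
        if all(abs(pearson(series, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical daily summaries: 'distance' is redundant with 'steps'.
daily = {
    "steps":      [1.0, 2.0, 3.0, 4.0, 5.0],
    "distance":   [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with steps
    "resting_hr": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(correlation_filter(daily))  # → ['steps', 'resting_hr']
```

After this redundancy filter, the paper applies Recursive Feature Elimination with a Random-Forest regressor to rank and retain the 15 most informative variables.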
The model architecture consists of four key components:
- Multi‑scale 1‑D Convolutional Front‑End – Parallel 1‑D convolutions with varying kernel sizes (short kernels capture rapid physiological fluctuations; larger kernels and dilated convolutions model slower trends).
- Channel‑wise Attention – Learns per‑feature importance weights that are individualized, allowing the network to emphasize the most predictive signals for each user.
- Bidirectional LSTM + Self‑Attention – The Bi‑LSTM captures local sequential dynamics in both forward and backward directions, while a subsequent self‑attention layer aggregates global temporal dependencies, enabling accurate multi‑day forecasts.
- Two‑Stage Domain Adaptation – (a) A shared pre‑training phase on all participants’ data learns a robust representation; (b) At deployment, a lightweight fine‑tuning or test‑time adaptation (TTA) step updates the model using a small amount of labeled data from a new user, mitigating inter‑subject distribution shifts.
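The first two components can be illustrated in miniature: a dilated 1-D convolution widens the receptive field without adding parameters, and channel attention rescales feature channels by softmax-normalized scores. The sketch below uses hand-picked kernels and scores purely for illustration; it is not the trained network.

```python
import math

def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D convolution whose taps are spaced `dilation` steps apart,
    so a length-k kernel spans (k - 1) * dilation + 1 input positions."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

def channel_attention(channels, scores):
    """Scale each feature channel by its softmax-normalized score
    (in the model these scores are learned per individual)."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [[e / total * v for v in ch] for e, ch in zip(exps, channels)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dilated_conv1d(x, [1.0, 1.0], dilation=1))  # → [3.0, 5.0, 7.0, 9.0]
print(dilated_conv1d(x, [1.0, 1.0], dilation=2))  # → [4.0, 6.0, 8.0]
```

The same two-tap kernel sees a three-day span once dilation is 2, which is how the front-end models slower trends without longer kernels.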
The authors evaluate the system across 25 scenarios formed by five input window lengths (3, 5, 7, 9, 11 days) and five prediction horizons (1, 3, 5, 7, 9 days). Baselines include classic LSTM, Informer, PatchTST, and TimesNet. Across all settings, the proposed model consistently outperforms these baselines. The best result—using a three‑day input window to predict the next day’s sleep score—achieves a Root Mean Squared Error (RMSE) of 0.216. Even for a three‑day forecast, the RMSE remains low at 0.257, demonstrating practical utility for longer horizons.
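Each of the 25 scenarios amounts to slicing a participant's daily series into (input window, prediction window) pairs and scoring forecasts with RMSE. A minimal sketch of that bookkeeping, assuming one value per day (illustrative, not the authors' evaluation code):

```python
import math

def make_windows(series, input_len, horizon):
    """Pair each input window with the `horizon` values that follow it,
    e.g. input_len=3, horizon=1 for the paper's best-performing scenario."""
    pairs = []
    for i in range(len(series) - input_len - horizon + 1):
        x = series[i : i + input_len]
        y = series[i + input_len : i + input_len + horizon]
        pairs.append((x, y))
    return pairs

def rmse(pred, true):
    """Root Mean Squared Error between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))

days = [float(d) for d in range(10)]
print(make_windows(days, 3, 1)[0])  # → ([0.0, 1.0, 2.0], [3.0])
print(rmse([0.0], [2.0]))           # → 2.0
```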
Interpretability is addressed through SHAP (SHapley Additive exPlanations) analysis. The authors visualize feature contributions to individual predictions, revealing that heart‑rate variability (resting HR, min/avg/max HR), deep‑sleep duration, average respiration rate, and stress metrics are the dominant drivers of sleep‑score fluctuations. These insights align with clinical knowledge, offering actionable feedback for users (e.g., increasing deep‑sleep time or managing stress) and fostering trust among clinicians.
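Reproducing the SHAP analysis requires the `shap` library and the fitted model. As a lightweight stand-in that conveys the same idea of attributing predictive power to individual features, the sketch below computes permutation importance: how much the error grows when one feature column is shuffled. This is a substitute technique for illustration, not the paper's SHAP procedure.

```python
import random

def permutation_importance(model, X, y, feature_idx, metric, seed=0):
    """Error increase after shuffling one feature column; larger values
    mean the model relies more heavily on that feature."""
    base = metric([model(row) for row in X], y)
    col = [row[feature_idx] for row in X]
    random.Random(seed).shuffle(col)
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, col)]
    return metric([model(row) for row in X_perm], y) - base

# Toy setup: the 'model' uses only feature 0, so feature 1 has zero importance.
model = lambda row: row[0]
X = [[float(i), 0.0] for i in range(6)]
y = [float(i) for i in range(6)]
mae = lambda p, t: sum(abs(a - b) for a, b in zip(p, t)) / len(p)
print(permutation_importance(model, X, y, 1, mae))  # → 0.0
```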
In summary, the paper delivers a comprehensive solution that (1) operates on low‑dimensional, easily stored smartwatch data, (2) adapts efficiently to new users via a two‑stage domain‑adaptation scheme, and (3) provides transparent explanations of its forecasts. The combination of multi‑scale convolutional feature extraction, channel attention, recurrent‑transformer hybrid modeling, and rigorous adaptation makes the approach both technically robust and ready for real‑world deployment in personalized sleep‑health applications.
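The two-stage adaptation scheme can be caricatured with a deliberately trivial model: stage one pools all participants' data to learn a shared baseline, and stage two shifts that baseline toward a new user using only a few labeled days. This is a toy mean-model analogy of the idea, not the paper's network or fine-tuning procedure.

```python
def pretrain(all_users_scores):
    """Stage 1 (shared pre-training): a pooled-mean baseline over all users."""
    pooled = [s for user in all_users_scores for s in user]
    return sum(pooled) / len(pooled)

def fine_tune(shared, few_labels):
    """Stage 2 (lightweight adaptation): shift the shared baseline by the
    new user's mean residual, estimated from a handful of labeled days."""
    bias = sum(s - shared for s in few_labels) / len(few_labels)
    return shared + bias

shared = pretrain([[1.0, 2.0], [3.0, 4.0]])
print(shared)                        # → 2.5
print(fine_tune(shared, [5.0, 5.0]))  # → 5.0
```

The real model adapts high-dimensional representations rather than a scalar mean, but the division of labor is the same: a robust shared prior, then a cheap user-specific correction.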