Variance component score test for multivariate change point detection with applications to mobile health
Multivariate change point detection is the process of identifying distributional shifts in time-ordered data across multiple features. This task is particularly challenging when the number of features is large relative to the number of observations. The problem arises frequently in mobile health, where behavioral changes in at-risk patients must be detected in real time in order to prompt timely interventions. We propose a variance component score test (VC*) for detecting changes in feature means and/or variances using only pre-change point data to estimate distributional parameters. Through simulation studies, we show that VC* has higher power than existing methods. Moreover, we demonstrate that reducing bias by using only pre-change point days to estimate parameters outweighs the resulting increase in estimator variance in most scenarios. Lastly, we apply VC* and competing methods to passively collected smartphone data in adolescents and young adults with affective instability.
💡 Research Summary
The paper addresses the problem of detecting multivariate change points in high‑dimensional, time‑ordered data streams typical of mobile health (mHealth) applications. In such settings, the number of sensor features (p) can be comparable to or exceed the number of observed days (T), making traditional change‑point methods under‑powered. The authors propose a novel variance‑component score test, denoted VC*, which leverages only pre‑change‑point observations to estimate the mean vector (μ) and covariance matrix (Σ). By restricting estimation to the pre‑change segment, VC* eliminates the bias that arises when the full data set (including post‑change observations) is used under the alternative hypothesis.
The statistical model assumes that, before the change point (day k), the data follow a multivariate normal distribution with mean μ and covariance Σ; after day k the mean shifts by a vector δ and the covariance is inflated, with the inflation indexed by a parameter τ so that τ = 0 corresponds to no variance change. The null hypothesis H0: δ = 0 and τ = 0 (no change) is tested against the alternative that at least one of these parameters is non-zero. The authors adapt the classic variance-component score test, originally developed for testing random effects in mixed models, to this change-point context. The score vector U and information matrix I are computed using the pre-change estimates μ̂_pre and Σ̂_pre; the test statistic Q̃ = Uᵀ I⁻¹ U follows a χ² distribution with two degrees of freedom under H0.
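The final step of a score test of this form is a simple quadratic-form computation. The sketch below assumes the score vector U and information matrix I have already been evaluated at the null using the pre-change estimates; the function name and the two-component layout of U (one entry for the mean shift, one for the variance inflation) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy import stats

def score_test_statistic(U, I):
    """Quadratic-form score test Q = U' I^{-1} U.

    U : score vector evaluated under H0 (assumed here to have one
        component for the mean-shift parameter and one for the
        variance-inflation parameter, hence length 2).
    I : matching 2x2 information matrix.
    Returns (Q, p), where p is the upper-tail chi-square probability
    with df equal to the length of U (two, in this setup).
    """
    U = np.asarray(U, dtype=float)
    I = np.asarray(I, dtype=float)
    # Solve I x = U rather than inverting I explicitly (more stable).
    Q = float(U @ np.linalg.solve(I, U))
    p = stats.chi2.sf(Q, df=len(U))
    return Q, p
```

Large Q̃ (equivalently, small p) indicates that day k is a plausible change point under this model.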
A key theoretical contribution is Theorem 2.1, which proves that using all observations to estimate μ and Σ yields biased estimators whenever a change point exists, even if only the mean or only the variance changes. Consequently, the authors recommend unbiased pre‑change estimators and introduce a regularization scheme for Σ that interpolates between the full unbiased estimate and a diagonal matrix of variances, controlled by a tuning parameter λ. This regularization mitigates the increase in estimator variance while preserving bias reduction.
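The regularization step can be written as a one-line shrinkage of the pre-change sample covariance toward its diagonal. The sketch below assumes the linear interpolation Σ_λ = (1 − λ)·Σ̂ + λ·diag(Σ̂); the function name is hypothetical, and the paper's exact estimator and choice of λ may differ:

```python
import numpy as np

def regularized_cov(X_pre, lam):
    """Shrink the pre-change sample covariance toward its diagonal.

    X_pre : (n_pre, p) array of pre-change observations.
    lam   : tuning parameter in [0, 1]; lam = 0 keeps the full
            unbiased covariance, lam = 1 keeps only the
            per-feature variances.
    """
    S = np.cov(X_pre, rowvar=False)   # unbiased (n-1 denominator) estimate
    D = np.diag(np.diag(S))           # diagonal matrix of variances
    return (1.0 - lam) * S + lam * D
```

Shrinking toward the diagonal trades a small bias in the off-diagonal entries for a substantial reduction in estimator variance when p is large relative to the number of pre-change days, which is exactly the mHealth regime the paper targets.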
The authors evaluate VC* through extensive simulations. They consider three change‑point scenarios (mean only, variance only, both), vary the proportion of affected features (π), the number of post‑change days, and the correlation structure among features. Competing methods include Hotelling’s T² max test, multivariate CUSUM, and a non‑parametric sample‑divergence test. Results show that VC* consistently attains high power, closely matching specialized versions that target only mean or variance changes, and outperforming the competing methods, especially when many features are correlated or when only a subset of features experiences a shift. A second set of simulations compares VC* to a traditional variance‑component test (VC) that uses all data for estimation. VC* outperforms VC in settings with moderate to strong feature correlation and small sample sizes, confirming that bias reduction outweighs the modest increase in variance.
The methodology is applied to a real-world dataset comprising passive smartphone sensor streams (e.g., GPS distance, call counts, app usage) collected from 120 adolescents and young adults with affective instability. The data span at least 30 days per participant, and ecological momentary assessments (EMA) provide ground-truth timestamps of mood deterioration. Implemented in an online, sequential fashion, VC* identifies change points that align within one to two days of the EMA-marked events, whereas Hotelling's T² and CUSUM generate many false alarms or miss critical shifts, and sample divergence is computationally prohibitive for real-time use.
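An online monitor of this kind can be mimicked by a toy loop that maintains running pre-change estimates and tests each new day as it arrives. In the sketch below, a plain Mahalanobis-type statistic stands in for the VC* statistic, and the function name, minimum history length, and threshold are all illustrative assumptions:

```python
import numpy as np
from scipy import stats

def sequential_monitor(days, min_history=14, alpha=0.05):
    """Toy sequential change-point monitor (illustrative only).

    days : iterable of length-p daily feature vectors.
    Once `min_history` days accumulate, each new day is compared
    against the running pre-change mean and covariance via a
    Mahalanobis-type distance, tested against a chi-square(p)
    reference. Returns the index of the first flagged day, or
    None if no change is flagged. This replaces VC* with a much
    cruder check purely to show the online estimation pattern.
    """
    history = []
    for t, x in enumerate(days):
        if len(history) >= min_history:
            pre = np.asarray(history)            # pre-change data only
            mu = pre.mean(axis=0)
            Sigma = np.cov(pre, rowvar=False)
            d2 = float((x - mu) @ np.linalg.solve(Sigma, x - mu))
            if stats.chi2.sf(d2, df=len(x)) < alpha:
                return t                         # flagged change day
        history.append(x)                        # grow the pre-change window
    return None
```

Note that, as in the paper's design, only days before the candidate change point contribute to the running estimates, so post-change observations never contaminate μ̂ and Σ̂.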
In conclusion, VC* offers a statistically rigorous, computationally feasible solution for multivariate change‑point detection in high‑dimensional, low‑sample‑size contexts typical of mHealth. By exploiting unbiased pre‑change estimators and a variance‑component score framework, it achieves superior power and robustness compared with existing techniques. The authors suggest extensions to non‑Gaussian data, multiple change points, and Bayesian formulations, indicating broad applicability beyond mental‑health monitoring to domains such as finance and industrial quality control.