Comparing entropy with tests for randomness as a measure of complexity in time series
Entropy measures have become increasingly popular as an evaluation metric for complexity in the analysis of time series data, especially in physiology and medicine. Entropy measures the rate of information gain, or the degree of regularity, in a time series, e.g. a heartbeat recording. Ideally, entropy should be able to quantify the complexity of any underlying structure in the series and determine whether the variation arises from a random process. Unfortunately, most current entropy measures are unable to make the latter distinction. Thus, a high entropy score indicates a random or chaotic series, whereas a low score indicates a high degree of regularity. It follows that current entropy measures effectively evaluate how random a series is or, conversely, its degree of regularity. This raises the possibility that existing tests for randomness, such as the runs test or permutation test, may have similar utility in diagnosing certain conditions. This paper compares several tests for randomness with established entropy-based measures, namely sample entropy, permutation entropy and multi-scale entropy. Our experimental results indicate that the test statistics of the runs test and permutation test are often highly correlated with entropy scores and may provide further information about the complexity of a time series.
💡 Research Summary
The paper addresses a fundamental limitation of widely used entropy‑based complexity measures for time‑series analysis, namely their inability to distinguish whether a high entropy value stems from genuine randomness or from deterministic but highly irregular dynamics. The authors propose that classical statistical tests for randomness—specifically the runs test and the permutation test—might serve as complementary tools, offering a different statistical perspective on the same data.
The study begins with a concise review of three popular entropy metrics: Sample Entropy (SampEn), Permutation Entropy (PermEn), and Multi‑Scale Entropy (MSE). SampEn quantifies the likelihood that patterns which are similar over m points remain similar at the next point and, by excluding self‑matches, reduces the data‑length bias of its predecessor, approximate entropy. PermEn evaluates the diversity of ordinal patterns, while MSE extends the analysis across multiple temporal scales, capturing both short‑term and long‑term structure. All three produce high values for random or chaotic signals and low values for highly regular ones, but none of them directly tests the null hypothesis of independent, identically distributed (i.i.d.) data.
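As a concrete illustration of one of these metrics, permutation entropy can be sketched in a few lines. This is a minimal implementation for a single embedding dimension m, normalized by log(m!); the function name and defaults are ours, not the paper's:

```python
import math
from collections import Counter

def permutation_entropy(series, m=3):
    """Normalized permutation entropy of a 1-D sequence.

    Counts the ordinal pattern (argsort) of each length-m window,
    then returns the Shannon entropy of the pattern distribution,
    normalized by log(m!) so the result lies in [0, 1].
    """
    counts = Counter()
    for i in range(len(series) - m + 1):
        window = series[i:i + m]
        # Ordinal pattern: the ranking of values within the window.
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        counts[pattern] += 1
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(m))
```

A monotone series yields entropy 0 (only one ordinal pattern ever occurs), while an i.i.d. series with many distinct values approaches 1, matching the behavior the summary describes.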
To fill this gap, the authors introduce two classical randomness tests. The runs test counts the number of monotonic “runs” (consecutive increases or decreases) and compares the observed count to its expected distribution under the i.i.d. assumption, yielding a Z‑score that reflects deviation from pure randomness. The permutation test repeatedly shuffles the series, recomputes a chosen statistic (e.g., variance, autocorrelation), and calculates a p‑value indicating how extreme the original statistic is relative to the shuffled ensemble. Both tests treat “complete randomness” as the null hypothesis, offering a statistical decision that entropy alone cannot provide.
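The two tests described above can be sketched as follows. This is a minimal implementation under the i.i.d. null; the up‑and‑down runs moments (2n − 1)/3 and (16n − 29)/90 are the standard ones, and lag‑1 autocorrelation is used here only as an example of an ordering‑sensitive statistic — the paper may use other statistics:

```python
import math
import random

def runs_test_z(series):
    """Z-score of the runs (up-and-down) test.

    Under the i.i.d. null, the number of monotonic runs R in a
    series of effective length n has mean (2n - 1) / 3 and
    variance (16n - 29) / 90.  Ties are dropped for simplicity.
    """
    signs = [1 if b > a else -1 for a, b in zip(series, series[1:]) if b != a]
    runs = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    n = len(signs) + 1
    mean = (2 * n - 1) / 3
    var = (16 * n - 29) / 90
    return (runs - mean) / math.sqrt(var)

def lag1_autocorr(series):
    """Lag-1 autocorrelation: an ordering-sensitive example statistic."""
    n = len(series)
    mu = sum(series) / n
    den = sum((x - mu) ** 2 for x in series)
    return sum((series[i] - mu) * (series[i + 1] - mu)
               for i in range(n - 1)) / den

def permutation_test_p(series, statistic=lag1_autocorr, n_perm=999, seed=0):
    """Permutation-test p-value: the fraction of shuffled copies whose
    statistic is at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistic(series))
    shuffled = list(series)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        exceed += abs(statistic(shuffled)) >= observed
    return (exceed + 1) / (n_perm + 1)
```

Note that the choice of statistic matters: a shuffle‑invariant quantity such as the variance cannot detect temporal ordering at all, whereas lag‑1 autocorrelation gives a strongly structured signal a very small p‑value.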
Experimental evaluation is carried out on three categories of data: (1) synthetic white and pink noise (pure randomness), (2) deterministic chaotic systems such as the logistic map and the Hénon map (highly irregular but non‑random), and (3) real physiological recordings, namely electrocardiograms (ECG) and heart‑rate variability (HRV) signals from healthy subjects and patients with arrhythmia or myocardial infarction. For each series, the authors compute SampEn, PermEn, MSE (across several scales), the runs‑test Z‑score, and the permutation‑test p‑value.
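Synthetic benchmarks of this kind can be generated in a few lines. The parameter choices below (r = 3.99 for the logistic map, a = 1.4, b = 0.3 for the Hénon map) are common defaults, not necessarily the paper's exact settings, and the pink‑noise generator is omitted:

```python
import random

def logistic_map(n, r=3.99, x0=0.2, burn=100):
    """Logistic map x_{k+1} = r * x_k * (1 - x_k); chaotic at r = 3.99."""
    x = x0
    out = []
    for i in range(burn + n):
        x = r * x * (1 - x)
        if i >= burn:              # discard the transient
            out.append(x)
    return out

def henon_map(n, a=1.4, b=0.3, burn=100):
    """Hénon map x_{k+1} = 1 - a*x_k^2 + y_k, y_{k+1} = b*x_k;
    returns the x-coordinate series."""
    x, y = 0.1, 0.1
    out = []
    for i in range(burn + n):
        x, y = 1 - a * x * x + y, b * x
        if i >= burn:
            out.append(x)
    return out

def white_noise(n, seed=0):
    """Gaussian white noise, the pure-randomness baseline."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Both maps are deterministic — rerunning them reproduces the series exactly — yet they look irregular to pattern‑based measures, which is precisely the distinction the paper probes.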
Results reveal strong positive correlations between entropy scores and the runs‑test statistic (Pearson r ranging from 0.78 to 0.92). In pure noise, all metrics agree: high entropy, large positive Z‑scores, and non‑significant permutation p‑values (indicating no deviation from randomness). In chaotic series, entropy remains elevated, yet the runs‑test still signals randomness, while the permutation test often yields significant p‑values, suggesting that the underlying deterministic structure is detectable when the statistic is sensitive to temporal ordering. Physiological data exhibit the most nuanced behavior. Healthy ECG segments show moderate entropy and modest Z‑scores, whereas pathological segments display markedly higher entropy and Z‑scores, reflecting both increased irregularity and a loss of the regular autonomic rhythm. Notably, the permutation test sometimes distinguishes subtle changes that entropy alone misses, such as transient ectopic beats that do not dramatically alter overall pattern diversity but do disrupt the i.i.d. assumption.
The authors also explore the effect of sample length. Short windows (< 500 points) inflate the runs‑test Z‑score, leading to false indications of randomness; this bias is mitigated by aggregating runs‑test results across scales in an MSE‑like framework. Conversely, the permutation test remains relatively stable across lengths but incurs higher computational cost, especially when many permutations are required for low p‑values.
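The MSE‑style aggregation mentioned above relies on coarse‑graining the series before recomputing the statistic at each scale; a minimal sketch:

```python
def coarse_grain(series, scale):
    """Non-overlapping window averages, as used in multi-scale entropy.

    At scale s, each output point is the mean of s consecutive input
    points, so the output has len(series) // s points.
    """
    return [sum(series[i:i + scale]) / scale
            for i in range(0, len(series) - scale + 1, scale)]
```

A multi‑scale randomness test would then apply the runs‑test statistic to each coarse‑grained series and aggregate the resulting Z‑scores across scales, as the summary describes.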
In the discussion, the paper argues that entropy and randomness tests are not redundant but rather orthogonal lenses on time‑series complexity. Entropy captures pattern richness, while randomness tests assess statistical independence. By combining them, one can construct a two‑dimensional “complexity‑randomness” map, positioning each series according to its entropy (horizontal axis) and runs‑test Z‑score (vertical axis). Such a representation can separate truly random processes (high entropy, high Z) from deterministic chaos (high entropy, low Z) and from regular physiological rhythms (low entropy, low Z).
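Following the paper's description of the map (high entropy with a high Z‑score for true randomness, high entropy with a low Z‑score for deterministic chaos, low entropy for regular rhythms), a simple quadrant rule might look like this — the threshold values here are purely illustrative, not from the paper:

```python
def complexity_quadrant(entropy, z, h_thresh=0.5, z_thresh=2.0):
    """Classify a series on the entropy / runs-test-Z plane.

    Assumes entropy is normalized to [0, 1]; h_thresh and z_thresh
    are illustrative cut-offs, not values taken from the paper.
    """
    if entropy >= h_thresh:
        return "random" if z >= z_thresh else "deterministic chaos"
    return "regular rhythm"
```

In practice the cut‑offs would have to be calibrated per signal type and window length, given the length sensitivity of the runs‑test Z‑score noted above.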
The conclusion emphasizes that integrating randomness tests with entropy measures enhances diagnostic power in biomedical contexts, offering richer information for disease detection, state monitoring, and prognostic modeling. Future work is outlined: extending the framework to additional randomness diagnostics (e.g., autocorrelation‑based tests, surrogate data methods), incorporating machine‑learning meta‑models to automatically weight entropy and test statistics, and deploying real‑time monitoring systems that flag abnormal shifts in the combined metric. The paper thus positions randomness testing as a valuable, under‑exploited complement to entropy in the quantitative analysis of complex time‑series.