Symbolic Sequences and Tsallis Entropy

In this work, we investigate symbolic sequences with long-range correlations by means of computational simulation. We analyze sequences of two, three and four symbols in which each symbol may be repeated $l$ times, with probability distribution $p(l)\propto 1/l^{\mu}$. For these sequences, we verify that the usual entropy increases more slowly when the symbols are correlated, and that the Tsallis entropy exhibits, for a suitable choice of $q$, a linear behavior. We also study the chain as a random-walk-like process and observe nonusual diffusive behavior depending on the value of the parameter $\mu$.


💡 Research Summary

The paper investigates symbolic sequences that exhibit long-range correlations by means of numerical simulations. The authors construct sequences composed of a small alphabet (two, three or four symbols). Each symbol is repeated l times, forming a block whose length l is drawn from a power-law distribution p(l)∝l^{-μ} with exponent μ>1. This procedure generates a series of "clusters" of identical symbols; the smaller the value of μ, the heavier the tail of the distribution and the more pronounced the long-range correlations.
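The construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code; the function names (`powerlaw_length`, `generate_sequence`) and the cutoff `lmax` are assumptions introduced here for clarity.

```python
import random

def powerlaw_length(mu, lmax=1000, rng=random):
    """Draw a block length l with probability p(l) proportional to 1/l**mu,
    by inverse-CDF sampling over l = 1..lmax (a finite cutoff is assumed)."""
    weights = [l ** -mu for l in range(1, lmax + 1)]
    u = rng.random() * sum(weights)
    acc = 0.0
    for l, w in enumerate(weights, start=1):
        acc += w
        if acc >= u:
            return l
    return lmax

def generate_sequence(n_symbols, mu, length, seed=0):
    """Build a correlated sequence: repeatedly pick a symbol uniformly at
    random and append a block of it with power-law-distributed length."""
    rng = random.Random(seed)
    seq = []
    while len(seq) < length:
        symbol = rng.randrange(n_symbols)
        seq.extend([symbol] * powerlaw_length(mu, rng=rng))
    return seq[:length]
```

Smaller μ makes long blocks more likely, so a plot of the resulting sequence shows larger clusters of identical symbols.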

First, the statistical properties of the generated sequences are characterized. The autocorrelation function C(r)=⟨σ_iσ_{i+r}⟩ is shown to decay algebraically as C(r)∝r^{-(μ-1)}, confirming the presence of scale-free memory. The conventional Shannon entropy S₁(N)=−∑_{i=1}^{N}p_i log p_i is computed for sequences of length N. When μ is large (short blocks dominate) the entropy grows essentially linearly, S₁∝N, as for an uncorrelated random string. As μ approaches 1, long blocks become frequent, the effective information per symbol drops, and the growth becomes sub-linear: S₁∝N^{α} with α<1. This demonstrates that long-range correlations reduce the rate at which new information is generated.

Next, the authors turn to the non‑extensive Tsallis entropy, defined as
S_q(N)=\frac{1}{q-1}\Bigl(1-\sum_{i=1}^{N}p_i^{\,q}\Bigr).
For q=1 the expression reduces to the Shannon entropy. By scanning a range of q values for each μ, they identify an "optimal" q that restores a linear dependence S_q∝N. Empirically they find a simple relation between the two parameters, q(μ)≈1+(2−μ)/μ. For example, μ=1.5 yields q≈1.33, while μ=1.2 gives q≈1.67. In this regime the Tsallis entropy compensates for the redundancy introduced by the correlated blocks, and the information increase per added symbol becomes constant, as in an extensive system.
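The Tsallis functional and the empirical q(μ) relation quoted above are straightforward to code. This is a hedged sketch: the function names are illustrative, and the q→1 limit is handled explicitly so the Shannon case is recovered.

```python
import math

def tsallis_entropy(probs, q):
    """S_q = (1 - sum_i p_i^q) / (q - 1); in the limit q -> 1 this
    reduces to the Shannon entropy -sum_i p_i log p_i."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

def q_from_mu(mu):
    """Empirical relation reported in the summary: q(mu) = 1 + (2 - mu)/mu."""
    return 1.0 + (2.0 - mu) / mu
```

In practice one would compute S_q(N) for growing N across a grid of q values and select the q whose curve is closest to a straight line, then compare it with `q_from_mu`.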

The symbolic sequences are then mapped onto a one‑dimensional random walk. Each symbol is assigned a step of +1 or –1; a block of length l produces a run of identical steps of that length. The mean‑square displacement ⟨x²(t)⟩ is measured as a function of time (or number of steps) and found to obey a power law ⟨x²(t)⟩∝t^{β}. The diffusion exponent β depends strongly on μ: for μ>2 the walk is normal (β≈1); for 1<μ<2 the walk is super‑diffusive with 1<β<2; and as μ→1 the exponent approaches β≈2, indicating ballistic‑like motion. This behavior mirrors the Lévy‑flight phenomenon, where heavy‑tailed step‑length distributions generate anomalous diffusion.
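The random-walk mapping and the mean-square-displacement measurement can be sketched as follows. The helper name `msd` and the time-averaging scheme over a single trajectory are assumptions made here; the summary does not specify whether the authors average over time, over realizations, or both.

```python
def msd(seq, lags):
    """Map binary symbols to steps of +1/-1, cumulate them into a
    trajectory x(t), and estimate <x^2(t)> at the given lags by a
    time average over the single trajectory."""
    steps = [1 if s else -1 for s in seq]
    x = [0]
    for step in steps:
        x.append(x[-1] + step)
    result = {}
    for t in lags:
        disp = [(x[i + t] - x[i]) ** 2 for i in range(len(x) - t)]
        result[t] = sum(disp) / len(disp)
    return result
```

Fitting log⟨x²(t)⟩ against log t over the measured lags then yields the diffusion exponent β; a block-free random string gives β≈1, while a single long run is ballistic (β=2), bracketing the regimes described above.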

Finally, the paper highlights a conceptual link between the entropy analysis and the diffusion results. The same exponent μ that controls the decay of correlations also determines both the optimal Tsallis q and the diffusion exponent β. In other words, the degree of non‑extensivity required to recover linear entropy growth is directly related to the strength of the anomalous transport observed in the random‑walk representation.

In summary, the study provides three major contributions: (1) a clear demonstration that long‑range correlations suppress the growth of Shannon entropy; (2) evidence that Tsallis entropy with a suitably chosen q restores extensive, linear scaling, thereby offering a useful tool for quantifying information in correlated symbolic data; and (3) a systematic exploration of how the same correlated structure leads to a spectrum of diffusion regimes, from normal to ballistic, when the sequence is interpreted as a stochastic trajectory. These findings bridge information theory, non‑extensive statistical mechanics, and anomalous transport, and they suggest practical methodologies for analyzing complex time series in fields ranging from genomics to finance.

