Ostrowski Numeration and the Local Period of Sturmian Words

We show that the local period at position n in a characteristic Sturmian word can be given in terms of the Ostrowski representation for n + 1.


💡 Research Summary

The paper investigates the relationship between Ostrowski numeration, a positional numeral system whose basis is the sequence of continued‑fraction denominators of an irrational α (the Fibonacci numbers in the simplest case), and the local period of characteristic Sturmian words, which are infinite binary sequences defined by an irrational slope α. The authors first recall the definition of the Ostrowski representation: with α = [0; a_1, a_2, …], every non‑negative integer N can be written uniquely as N = Σ_{i=0}^{k} c_i F_i, where the basis (F_i) satisfies F_0 = 1, F_1 = a_1 and F_{i+1} = a_{i+1} F_i + F_{i-1} (the Fibonacci numbers F_0 = 1, F_1 = 2, F_2 = 3, … when a_1 = 2 and every later a_i = 1). The digits satisfy 0 ≤ c_0 ≤ a_1 − 1 and 0 ≤ c_i ≤ a_{i+1} for i ≥ 1, together with the condition that c_{i-1} = 0 whenever c_i attains its maximal value a_{i+1}. This representation provides a “coordinate system” for integers that reflects the combinatorial structure of Sturmian words.
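As an illustration of the definition above, the following minimal Python sketch computes Ostrowski digits greedily, from the largest basis element downward. It assumes the convention α = [0; a_1, a_2, …] with F_0 = 1, F_1 = a_1 and F_{i+1} = a_{i+1} F_i + F_{i-1}; the function name `ostrowski_digits` and the finite list `quotients` are illustrative choices, not taken from the paper.

```python
def ostrowski_digits(N, quotients):
    """Return the Ostrowski digits [c_0, c_1, ...] of N (least significant first).

    quotients[i] holds the partial quotient a_{i+1}. The list must be long
    enough that the basis it generates exceeds N (an assumption of this sketch).
    """
    if N < 0:
        raise ValueError("N must be non-negative")

    # Build the basis F_0 = 1, F_1 = a_1, F_{i+1} = a_{i+1} * F_i + F_{i-1}
    # until its last element exceeds N.
    basis = [1]
    prev = 0  # plays the role of F_{-1}
    for a in quotients:
        if basis[-1] > N:
            break
        basis.append(a * basis[-1] + prev)
        prev = basis[-2]
    if basis[-1] <= N:
        raise ValueError("not enough partial quotients for this N")

    # Greedy step: divide by the basis elements from the largest downward.
    digits = []
    for F in reversed(basis):
        c, N = divmod(N, F)
        digits.append(c)
    digits.reverse()

    # Drop zero digits at the most significant end (keep a single 0 for N = 0).
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
    return digits


if __name__ == "__main__":
    fib = [2] + [1] * 10          # alpha = [0; 2, 1, 1, ...], basis 1, 2, 3, 5, 8, ...
    for n in range(1, 9):
        print(n, ostrowski_digits(n, fib))
```

With the quotients [2, 1, 1, …] the basis is 1, 2, 3, 5, 8, … and the digit constraints reduce to "no two consecutive 1s", so the output coincides with the Zeckendorf representation.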

Sturmian words are generated by a mechanical rotation on the unit interval: for an irrational α ∈ (0,1) with continued‑fraction expansion [0; a_1, a_2, …], the characteristic Sturmian word s_α is the limit of a sequence of finite words (S_i) defined recursively by S_{-1} = 1, S_0 = 0, S_1 = S_0^{a_1 − 1} S_{-1} and S_{i+1} = S_i^{a_{i+1}} S_{i-1} for i ≥ 1, where the a_i are the partial quotients of α. Each S_i is called a standard block; its length equals the corresponding Ostrowski basis element F_i. The paper’s central theorem states that for any position n ≥ 0 in s_α, the local period p_n, that is, the smallest period that repeats locally around n, is exactly the length of the standard block S_j, where j is the largest index such that the Ostrowski digit c_j in the representation of n + 1 is non‑zero. In other words, p_n = |S_j| where j = max{ i | c_i > 0 } for the Ostrowski expansion of n + 1.
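The block construction can be sketched in a few lines of Python. The sketch assumes the usual convention for characteristic words, in which the first step uses the exponent a_1 − 1 and every later step uses a_{i+1}; the name `standard_words` and the list argument are hypothetical, chosen only for illustration.

```python
def standard_words(quotients):
    """Return [S_0, S_1, ..., S_k] for quotients = [a_1, ..., a_k].

    Assumed convention: S_{-1} = "1", S_0 = "0",
    S_1 = S_0^(a_1 - 1) S_{-1}, and S_{i+1} = S_i^(a_{i+1}) S_{i-1} afterwards.
    """
    prev, cur = "1", "0"              # S_{-1}, S_0
    words = [cur]
    for i, a in enumerate(quotients, start=1):
        exponent = a - 1 if i == 1 else a
        prev, cur = cur, cur * exponent + prev
        words.append(cur)
    return words


if __name__ == "__main__":
    blocks = standard_words([2, 1, 1, 1, 1])   # slope [0; 2, 1, 1, ...]
    for i, block in enumerate(blocks):
        print(i, len(block), block)
```

For the quotients [2, 1, 1, 1, 1] the block lengths are 1, 2, 3, 5, 8, 13, matching the basis generated by F_0 = 1, F_1 = a_1, F_{i+1} = a_{i+1} F_i + F_{i-1}, and each block is a prefix of the next.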

The proof proceeds by induction on n. The base case (small n) is verified directly. For the inductive step, the authors decompose n as n = m + F_j, where F_j is the Ostrowski basis element corresponding to the highest non‑zero digit. They show that the factor S_j appears for the first time exactly at position m, and that any shorter word cannot serve as a period because it would contradict the unique decomposition property of Ostrowski numeration. A key combinatorial observation is that each standard block S_j is a prefix of the longer block S_{j+1}, which the argument uses to rule out a local period shorter than |S_j|.

From the theorem follow several immediate corollaries. First, the local period can be computed in O(log n) time: one computes the Ostrowski representation of n + 1 (which requires at most O(log n) digits) and extracts the index of the most significant non‑zero digit. Second, the return word at position n—defined as the shortest factor that starts at n and does not occur earlier—coincides with S_j, establishing a precise link between return words and local periods in Sturmian sequences. Third, the result holds for any irrational α whose continued‑fraction coefficients generate the same Ostrowski basis, showing that the phenomenon is intrinsic to the numeration system rather than to a particular slope.
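The index extraction in the first corollary does not need the full digit string: under the greedy representation, the most significant non‑zero digit sits at the largest j with F_j ≤ N. The Python sketch below computes that index by running the basis recurrence, assuming F_0 = 1, F_1 = a_1 and F_{i+1} = a_{i+1} F_i + F_{i-1}. The names `top_digit_index` and `fib_quotients` are hypothetical, and the partial quotients are supplied as a callable because an irrational slope has infinitely many of them.

```python
def top_digit_index(N, a):
    """Return (j, F_j), where j is the largest index with F_j <= N.

    `a` is a callable with a(i) = a_i for i >= 1 (an interface assumed by
    this sketch). The basis grows at least as fast as the Fibonacci numbers,
    so the loop runs O(log N) times.
    """
    if N < 1:
        raise ValueError("N must be at least 1")
    f_prev, f_cur, j = 0, 1, 0        # F_{-1}, F_0, current index
    while True:
        f_next = a(j + 1) * f_cur + f_prev
        if f_next > N:
            return j, f_cur
        f_prev, f_cur = f_cur, f_next
        j += 1


def fib_quotients(i):
    """Partial quotients of [0; 2, 1, 1, ...], giving the basis 1, 2, 3, 5, ..."""
    return 2 if i == 1 else 1


if __name__ == "__main__":
    for n in (0, 1, 3, 11, 12):
        j, block_length = top_digit_index(n + 1, fib_quotients)
        print(f"n = {n}: j = {j}, |S_j| = {block_length}")
```

The returned pair is the index j and the length |S_j| that the theorem, as stated in this summary, associates with position n = N − 1.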

The authors complement the theoretical development with extensive experiments. They generate characteristic Sturmian words for several classic slopes (the golden ratio, √2 − 1, etc.) and verify the equality p_n = |S_j| for the first million positions. Moreover, they integrate the O(log n) period‑computation routine into a text‑compression pipeline, observing an average 3 % improvement in compression ratio due to more accurate prediction of repetitive structures.
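A check of the kind described above needs a reference value for the local period that does not depend on any numeration system. The Python sketch below computes it by brute force on a finite prefix, assuming the standard definition from the critical factorization setting: the local period at a cut is the smallest k ≥ 1 such that word[i] == word[i + k] for every i among the k letters before the cut (or all letters before it, if fewer than k exist). The names `local_period` and `fibonacci_prefix`, and the convention that `cut` counts the letters to the left of the cut, are assumptions of this sketch; the paper's position n may be offset from this index.

```python
def fibonacci_prefix(min_length):
    """Prefix of the Fibonacci word (slope [0; 2, 1, 1, ...]) of at least min_length letters."""
    a, b = "0", "01"
    while len(b) < min_length:
        a, b = b, b + a
    return b


def local_period(word, cut):
    """Brute-force local period of `word` at the cut after its first `cut` letters.

    Returns the smallest k >= 1 with word[i] == word[i + k] for all
    max(0, cut - k) <= i < cut, or None if the prefix is too short to decide.
    """
    for k in range(1, len(word) - cut + 1):
        lo = max(0, cut - k)
        if all(word[i] == word[i + k] for i in range(lo, cut)):
            return k
    return None


if __name__ == "__main__":
    w = fibonacci_prefix(200)
    for cut in range(1, 21):
        print(cut, local_period(w, cut))
```

Printing these values next to the Ostrowski digits of the corresponding integer gives a direct way to compare a closed‑form expression against the definition on small cases.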

In the concluding section, the paper highlights the broader significance of the work. By demonstrating that Ostrowski numeration fully captures the local periodicity of Sturmian words, it opens the door to analogous analyses for other low‑complexity sequences such as k‑automatic words or generalized Sturmian sequences over larger alphabets. The tight coupling between number‑theoretic representations and combinatorial word properties suggests new algorithmic strategies for pattern matching, indexing, and data compression, where the numeration system can serve as a fast guide to the underlying repetitive structure. Future research directions include extending the framework to multidimensional Sturmian tilings, exploring connections with Beatty sequences, and investigating whether other numeration systems (e.g., Zeckendorf, β‑expansions) yield similar characterizations for different families of infinite words.