On the Structure of Bispecial Sturmian Words

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

A balanced word is one in which any two factors of the same length contain the same number of each letter of the alphabet up to one. Finite binary balanced words are called Sturmian words. A Sturmian word is bispecial if it can be extended to the left and to the right with both letters while remaining a Sturmian word. There is a deep relation between bispecial Sturmian words and Christoffel words, which are the digital approximations of Euclidean segments in the plane. In 1997, J. Berstel and A. de Luca proved that palindromic bispecial Sturmian words are precisely the maximal internal factors of primitive Christoffel words. We extend this result by showing that bispecial Sturmian words are precisely the maximal internal factors of all Christoffel words. Our characterization allows us to give an enumerative formula for bispecial Sturmian words. We also investigate the minimal forbidden words for the language of Sturmian words.


💡 Research Summary

The paper investigates the combinatorial structure of bispecial Sturmian words and establishes a precise correspondence with Christoffel words, thereby extending earlier results that linked only palindromic (strictly bispecial) Sturmian words to primitive Christoffel words. A Sturmian word over the binary alphabet {a, b} is defined as a finite balanced word, i.e., one in which any two factors of equal length differ in the number of a’s (or b’s) by at most one. A Sturmian word w is left-special if both aw and bw are Sturmian, right-special if both wa and wb are Sturmian, and bispecial if it is both left- and right-special. Bispecial words split into strictly bispecial words (all four extensions awa, awb, bwa, bwb are Sturmian) and non-strictly bispecial words; the former are exactly the palindromic bispecial words and coincide with the central words.
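
The definitions above can be checked by brute force on short words. The following is a minimal Python sketch; the function names are illustrative and not taken from the paper, and the quadratic scan over factors is chosen for clarity rather than speed.

```python
from itertools import product


def is_balanced(w: str) -> bool:
    """Return True if any two equal-length factors of w differ by at most one 'a'.

    Over {a, b} the number of b's in a factor is its length minus its number
    of a's, so comparing the a-counts is enough.
    """
    n = len(w)
    for k in range(1, n + 1):
        counts = [w[i:i + k].count("a") for i in range(n - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def is_left_special(w: str) -> bool:
    # Both left extensions must stay balanced (Sturmian).
    return is_balanced("a" + w) and is_balanced("b" + w)


def is_right_special(w: str) -> bool:
    # Both right extensions must stay balanced (Sturmian).
    return is_balanced(w + "a") and is_balanced(w + "b")


def is_bispecial(w: str) -> bool:
    return is_left_special(w) and is_right_special(w)


def is_strictly_bispecial(w: str) -> bool:
    # All four two-sided extensions xwy must be balanced.
    return all(is_balanced(x + w + y) for x in "ab" for y in "ab")


words = ["".join(t) for t in product("ab", repeat=2)]
print([w for w in words if is_bispecial(w)])           # ['aa', 'ab', 'ba', 'bb']
print([w for w in words if is_strictly_bispecial(w)])  # ['aa', 'bb']
```

At length 2 all four words are bispecial, but only the palindromes aa and bb are strictly bispecial: the extensions aabb (of ab) and bbaa (of ba) are unbalanced.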

Christoffel words encode the best lattice approximation of a Euclidean segment from (0,0) to (p,q) with p,q>0; the resulting word has length p+q. The lower Christoffel word w_{p,q} has |w|_a = p and |w|_b = q, and its reversal is the upper Christoffel word w'_{p,q}. When gcd(p,q)=1 the word is primitive; otherwise it is a proper power of a primitive Christoffel word.
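
As an illustration, lower Christoffel words can be generated with the standard modular-arithmetic rule. This is a sketch under the assumption that the letter a stands for a horizontal step and b for a vertical step; the function names are hypothetical and the rule is a common textbook construction, not necessarily the one used in the paper.

```python
from math import gcd


def lower_christoffel(p: int, q: int) -> str:
    """Lower Christoffel word with p letters 'a' and q letters 'b' (p, q > 0).

    Position i carries 'a' when i*q mod (p+q) increases at the next step,
    and 'b' when it wraps around.
    """
    n = p + q
    return "".join(
        "a" if (i * q) % n < ((i + 1) * q) % n else "b" for i in range(n)
    )


def upper_christoffel(p: int, q: int) -> str:
    # The upper Christoffel word is the reversal of the lower one.
    return lower_christoffel(p, q)[::-1]


def is_primitive_pair(p: int, q: int) -> bool:
    # The word is primitive exactly when p and q are coprime.
    return gcd(p, q) == 1


print(lower_christoffel(3, 2))  # aabab
print(upper_christoffel(3, 2))  # babaa
print(lower_christoffel(2, 2))  # abab, the square of the primitive word ab
```

For the non-coprime pair (2,2) the rule returns abab = (ab)^2, matching the statement that a non-primitive Christoffel word is a power of a primitive one.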

The classical theorem of Berstel and de Luca states that a word u is a strictly bispecial Sturmian word if and only if xuy is a primitive Christoffel word for distinct letters x,y∈{a,b}. The main contribution of this work is Theorem 3.11, which removes the “primitive” restriction: a word u is a bispecial Sturmian word (strict or non-strict) if and only if there exist distinct letters x,y such that xuy is a (possibly non-primitive) Christoffel word. The proof proceeds by (i) showing that any bispecial word can be sandwiched by suitable letters to obtain a Sturmian word, (ii) invoking the known Sturmian–Christoffel correspondence to deduce the Christoffel nature of the sandwich, and (iii) conversely demonstrating that any Christoffel word xuy yields a bispecial interior factor u.
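
The characterization can be sanity-checked on short words by comparing two independently computed sets: the bispecial words of length n found by exhaustive search, and the internal factors of all lower and upper Christoffel words of length n+2. The sketch below is an illustrative check, not the paper's proof; the helper names and the modular construction of Christoffel words are assumptions made for this example.

```python
from itertools import product


def is_balanced(w: str) -> bool:
    """True if equal-length factors of w differ by at most one in their a-count."""
    n = len(w)
    for k in range(1, n + 1):
        counts = [w[i:i + k].count("a") for i in range(n - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def is_bispecial(w: str) -> bool:
    # All four one-sided extensions must be balanced.
    return all(is_balanced(e) for e in ("a" + w, "b" + w, w + "a", w + "b"))


def lower_christoffel(p: int, q: int) -> str:
    """Lower Christoffel word with p a's and q b's (standard modular rule)."""
    n = p + q
    return "".join(
        "a" if (i * q) % n < ((i + 1) * q) % n else "b" for i in range(n)
    )


def bispecial_words(n: int) -> set:
    """Bispecial Sturmian words of length n, by exhaustive search."""
    candidates = ("".join(t) for t in product("ab", repeat=n))
    return {w for w in candidates if is_bispecial(w)}


def christoffel_interiors(n: int) -> set:
    """Internal factors u of all Christoffel words xuy of length n + 2."""
    out = set()
    for p in range(1, n + 2):
        lower = lower_christoffel(p, n + 2 - p)
        upper = lower[::-1]
        out.add(lower[1:-1])
        out.add(upper[1:-1])
    return out


for n in range(7):
    assert bispecial_words(n) == christoffel_interiors(n)
print(sorted(christoffel_interiors(4)))
```

For n = 4 the primitive pairs (1,5) and (5,1) give the palindromes bbbb and aaaa, while the non-primitive pairs (2,4), (3,3), (4,2) each give two distinct non-palindromic words, one from the lower word and one from the upper word.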

From this characterization the authors derive an exact enumeration formula. There are n+1 lower Christoffel words of length n+2, one for each pair (p,q) with p,q>0 and p+q=n+2, and as many upper ones; φ(n+2) of the pairs are coprime and give primitive words (φ denotes Euler’s totient). A primitive lower Christoffel word and its reversal share the same palindromic internal factor, whereas each of the n+1−φ(n+2) non-primitive pairs contributes two distinct internal factors. Hence the number of bispecial Sturmian words of length n equals
  2(n+1) − φ(n+2).
This result generalizes previously known counts for left‑special, right‑special, and strictly bispecial words, and provides the first closed formula for the total number of bispecial Sturmian words, including the non‑strict ones.
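
A quick numerical check of the count 2(n+1) − φ(n+2) for bispecial Sturmian words of length n can be written as follows. This is a brute-force sketch with hypothetical helper names, feasible only for small n since it inspects all 2^n binary words.

```python
from itertools import product
from math import gcd


def is_balanced(w: str) -> bool:
    """True if equal-length factors of w differ by at most one in their a-count."""
    n = len(w)
    for k in range(1, n + 1):
        counts = [w[i:i + k].count("a") for i in range(n - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def is_bispecial(w: str) -> bool:
    return all(is_balanced(e) for e in ("a" + w, "b" + w, w + "a", w + "b"))


def totient(m: int) -> int:
    """Euler's totient, computed naively."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def count_bispecial(n: int) -> int:
    """Number of bispecial Sturmian words of length n, by exhaustive search."""
    return sum(1 for t in product("ab", repeat=n) if is_bispecial("".join(t)))


def formula(n: int) -> int:
    return 2 * (n + 1) - totient(n + 2)


for n in range(9):
    print(n, count_bispecial(n), formula(n))
```

For n = 0, ..., 6 the formula evaluates to 1, 2, 4, 4, 8, 6, 10. The dips at n = 3 and n = 5 occur because n+2 is prime there, so every Christoffel word of length n+2 is primitive and every bispecial word of length n is strictly bispecial.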

The paper also studies minimal forbidden words for the factorial language St of Sturmian words. A minimal forbidden word is a word that is not in St but all of whose proper factors belong to St. Theorem 5.1 characterizes these as exactly the words ywx where xwy is a non‑primitive Christoffel word (i.e., a proper power of a primitive Christoffel word). Consequently, the number of minimal forbidden words of length n (for n>1) is
  2(n − 1 − φ(n)).
This connects the combinatorial notion of forbidden patterns directly to the arithmetic property of the underlying Christoffel word’s primitivity.
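
The description of minimal forbidden words can be tested on short lengths by comparing an exhaustive search with the words obtained by swapping the first and last letters of non-primitive Christoffel words, and by comparing the resulting count with 2(n − 1 − φ(n)). The sketch below uses hypothetical helper names and the standard modular construction of Christoffel words, both assumptions of this example.

```python
from itertools import product
from math import gcd


def is_balanced(w: str) -> bool:
    """True if equal-length factors of w differ by at most one in their a-count."""
    n = len(w)
    for k in range(1, n + 1):
        counts = [w[i:i + k].count("a") for i in range(n - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def lower_christoffel(p: int, q: int) -> str:
    n = p + q
    return "".join(
        "a" if (i * q) % n < ((i + 1) * q) % n else "b" for i in range(n)
    )


def minimal_forbidden_brute(n: int) -> set:
    """Unbalanced words whose longest proper prefix and suffix are balanced."""
    out = set()
    for t in product("ab", repeat=n):
        w = "".join(t)
        if not is_balanced(w) and is_balanced(w[1:]) and is_balanced(w[:-1]):
            out.add(w)
    return out


def minimal_forbidden_from_christoffel(n: int) -> set:
    """Words ywx obtained from non-primitive Christoffel words xwy of length n."""
    out = set()
    for p in range(1, n):
        q = n - p
        if gcd(p, q) == 1:
            continue  # primitive Christoffel words are skipped
        lower = lower_christoffel(p, q)
        for c in (lower, lower[::-1]):
            out.add(c[-1] + c[1:-1] + c[0])
    return out


def totient(m: int) -> int:
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


for n in range(2, 10):
    brute = minimal_forbidden_brute(n)
    assert brute == minimal_forbidden_from_christoffel(n)
    assert len(brute) == 2 * (n - 1 - totient(n))
print(sorted(minimal_forbidden_brute(4)))  # ['aabb', 'bbaa']
```

Because the Sturmian language is factorial, it suffices to test the longest proper prefix and suffix. The shortest minimal forbidden words are aabb and bbaa, obtained from the non-primitive Christoffel words baba and abab.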

Overall, the paper unifies the combinatorial theory of Sturmian words with discrete geometry via Christoffel words, delivering both structural insight and precise enumerative formulas. The results have potential applications in areas where balanced sequences are crucial, such as coding theory, data compression (antidictionaries), and bio‑informatics (minimal absent words). By clarifying the role of non‑strict bispecial words, the work fills a gap in the literature and opens avenues for further exploration of Sturmian-related languages and their forbidden structures.

