Characterizations of finite and infinite episturmian words via lexicographic orderings
In this paper, we characterize by lexicographic order all finite Sturmian and episturmian words, i.e., all (finite) factors of such infinite words. Consequently, we obtain a characterization of infinite episturmian words in a “wide sense” (episturmian and episkew infinite words). That is, we characterize the set of all infinite words whose factors are (finite) episturmian. Similarly, we characterize by lexicographic order all balanced infinite words over a 2-letter alphabet; in other words, all Sturmian and skew infinite words, the factors of which are (finite) Sturmian.
💡 Research Summary
The paper presents a complete characterization of finite and infinite Sturmian and episturmian words using only the lexicographic ordering of strings. The authors begin by recalling that Sturmian words (over a binary alphabet) are the aperiodic infinite words of minimal complexity (p(n)=n+1) and that episturmian words generalize this notion to arbitrary finite alphabets: an infinite word is episturmian if its set of factors is closed under reversal and it has at most one right special factor of each length. Traditionally, characterizations of these families rely on geometric concepts (rotation numbers, continued fractions) or on combinatorial invariants such as balance and return words. The novelty of this work is that it replaces such machinery with the elementary comparison “≤_lex”.
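The complexity condition p(n)=n+1 can be checked empirically on a concrete Sturmian word. The sketch below is an illustration only and is not taken from the paper: it assumes the Fibonacci word (generated by the morphism a → ab, b → a) as the example, and the function names `fibonacci_word` and `factor_complexity` are hypothetical. Counting factors in a finite prefix can only underestimate the true complexity, so the prefix is chosen much longer than the factor lengths examined.

```python
def fibonacci_word(length):
    """Return a prefix of the Fibonacci word over {a, b}.

    Built by iterating the morphism a -> ab, b -> a starting from "a";
    every iterate is a prefix of the infinite Fibonacci word.
    """
    word = "a"
    while len(word) < length:
        word = "".join("ab" if c == "a" else "a" for c in word)
    return word[:length]


def factor_complexity(word, n):
    """Count the distinct factors (contiguous substrings) of length n in word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


prefix = fibonacci_word(2000)
complexities = [factor_complexity(prefix, n) for n in range(1, 11)]
print(complexities)  # one value per factor length n = 1..10
```

For this prefix, each value equals n+1, matching the minimal complexity that defines Sturmian words.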
For finite Sturmian words the authors prove that a word w of length n is Sturmian if and only if its lexicographically smallest prefix L(w) and its lexicographically largest prefix R(w) have the canonical forms
L(w)=a^{k}b a^{k-1}b … a^{0}b,
R(w)=b^{k}a b^{k-1}a … b^{0}a (with a<b).
Consequently, the simple double inequality L(w) ≤_lex w ≤_lex R(w) characterizes the whole class. This result yields a linear‑time decision algorithm that checks Sturmianity by computing the two extremal prefixes.
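For comparison with the lexicographic criterion, the classical balance property can be tested directly: a finite binary word is a factor of some Sturmian word exactly when it is balanced, meaning that any two factors of the same length differ by at most one in their number of occurrences of a given letter. The sketch below implements that balance test, not the authors' lexicographic algorithm, and it runs in quadratic rather than linear time; the function name `is_balanced` is hypothetical.

```python
def is_balanced(word, letter="a"):
    """Check the balance property of a word over a 2-letter alphabet.

    For every length, slide a window across the word and track the minimum
    and maximum number of occurrences of `letter`. The word is balanced if
    these never differ by more than 1. On a binary alphabet, counting one
    letter suffices because all windows of a given length have equal size.
    """
    for length in range(1, len(word) + 1):
        count = word[:length].count(letter)
        lo = hi = count
        for i in range(length, len(word)):
            # Update the count as the window moves one position right.
            count += (word[i] == letter) - (word[i - length] == letter)
            lo, hi = min(lo, count), max(hi, count)
        if hi - lo > 1:
            return False
    return True


print(is_balanced("abaababaaba"))  # prefix of the Fibonacci word
print(is_balanced("aabb"))         # contains both "aa" and "bb"
```

The first word is balanced and the second is not, since its factors "aa" and "bb" differ by two in their count of the letter a.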
The paper then extends the approach to finite episturmian words over an arbitrary alphabet Σ. By introducing the notion of a “standard directive sequence” and analysing how it interacts with the underlying total order on Σ, the authors construct, for any finite episturmian word w, a pair of extremal words L(w) and R(w) that are themselves prefixes of the unique standard episturmian word determined by the same directive sequence. The main theorem states that w is episturmian exactly when L(w) ≤_lex w ≤_lex R(w). The proof hinges on the fact that episturmian words are closed under reversal and under the operation of “palindromic closure”, both of which preserve the lexicographic bounds.
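The link between a directive sequence and palindromic closure can be made concrete. In the standard construction, the palindromic right-closure of a word w is the shortest palindrome having w as a prefix, and the palindromic prefixes of a standard episturmian word are obtained by appending each letter of the directive sequence in turn and closing. The sketch below illustrates this construction under those definitions; the function names are hypothetical and the code is not taken from the paper.

```python
def palindromic_closure(word):
    """Return the shortest palindrome that has `word` as a prefix.

    Find the longest palindromic suffix of `word`, then append the
    reversal of the part that precedes it.
    """
    for start in range(len(word) + 1):
        suffix = word[start:]
        if suffix == suffix[::-1]:
            # The first match is the longest palindromic suffix.
            return word + word[:start][::-1]


def iterated_palindromic_closure(directive):
    """Return the palindromic prefixes generated by a finite directive word."""
    result = ""
    prefixes = []
    for letter in directive:
        result = palindromic_closure(result + letter)
        prefixes.append(result)
    return prefixes


print(iterated_palindromic_closure("abab"))
print(iterated_palindromic_closure("abc"))
```

With the directive word "abab" the construction yields a, aba, abaaba, abaababaaba, which are palindromic prefixes of the Fibonacci word; with "abc" it yields a, aba, abacaba, the start of the Tribonacci word over three letters.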
Having settled the finite case, the authors turn to infinite words. They define “episkew” infinite words: infinite sequences whose every finite factor is episturmian but which are not themselves episturmian (they lack a global directive sequence). Together with genuine episturmian words they form the “wide‑sense episturmian” class. The central infinite‑word theorem asserts that an infinite word x belongs to this wide‑sense class if and only if there exists a sequence of finite words (L_n,R_n) such that for each n, the prefix x