Avoiding Squares and Overlaps Over the Natural Numbers
We consider avoiding squares and overlaps over the natural numbers, using a greedy algorithm that chooses the least possible integer at each step; the word generated is lexicographically least among all such infinite words. In the case of avoiding squares, the word is 01020103…, the familiar ruler function, and is generated by iterating a uniform morphism. The case of overlaps is more challenging. We give an explicitly defined morphism φ : ℕ* → ℕ* that generates the lexicographically least infinite overlap-free word by iteration. Furthermore, we show that for all h, k ∈ ℕ with h ≤ k, the word φ^{k−h}(h) is the lexicographically least overlap-free word starting with the letter h and ending with the letter k, and give some of its symmetry properties.
💡 Research Summary
The paper investigates two classic avoidance problems, square‑free and overlap‑free words, over the infinite alphabet of natural numbers. A square is a factor of the form xx, where x is a non‑empty word. An overlap is a factor of the form axaxa, where a is a single letter and x is a possibly empty word; for example, 000 and 1001001 are overlaps. The authors adopt a greedy construction: at each step they append the smallest integer that does not create a forbidden factor. Because a letter that has not yet been used can never complete a square or an overlap, the construction never gets stuck, and it yields the lexicographically least infinite word satisfying the given avoidance condition.
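The greedy construction can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `ends_with_square`, `ends_with_overlap`, and `greedy_word` are hypothetical, and the sketch assumes that it is enough to test suffixes of the extended word, since the word built so far is already free of forbidden factors.

```python
def ends_with_square(w):
    """Return True if some suffix of w has the form xx with x non-empty."""
    n = len(w)
    for p in range(1, n // 2 + 1):
        if w[n - p:] == w[n - 2 * p:n - p]:
            return True
    return False


def ends_with_overlap(w):
    """Return True if some suffix of w has the form axaxa (a a letter)."""
    n = len(w)
    # An overlap axaxa has length 2p + 1 and period p = len(ax).
    for p in range(1, (n - 1) // 2 + 1):
        suffix = w[n - (2 * p + 1):]
        if all(suffix[i] == suffix[i + p] for i in range(p + 1)):
            return True
    return False


def greedy_word(length, forbidden_suffix):
    """Build a prefix of the greedy word by appending the least safe integer."""
    w = []
    while len(w) < length:
        c = 0
        # A letter larger than every letter used so far is always safe,
        # so this inner loop terminates.
        while forbidden_suffix(w + [c]):
            c += 1
        w.append(c)
    return w


print(greedy_word(8, ends_with_square))   # [0, 1, 0, 2, 0, 1, 0, 3]
print(greedy_word(7, ends_with_overlap))  # [0, 0, 1, 0, 0, 1, 1]
```

The suffix checks are quadratic in the word length, which is fine for experimenting with short prefixes but not meant as an efficient implementation.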
For square‑free words the greedy process reproduces the well‑known ruler sequence 0, 1, 0, 2, 0, 1, 0, 3, … . The authors show that this sequence is exactly the fixed point of a 2‑uniform morphism μ defined by μ(n) = 0 (n+1) for every n∈ℕ, so that μ(0) = 0 1, μ(0 1) = 0 1 0 2, and so on. Consequently the ruler function is not only square‑free but also the unique lexicographically minimal infinite square‑free word over ℕ.
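The morphic description of the ruler sequence is easy to check numerically. The sketch below assumes the 2‑uniform morphism n ↦ 0 (n+1), the standard way to generate the ruler sequence, and compares its iterates with the ruler function computed directly as the exponent of the largest power of 2 dividing m. The names `ruler_morphism`, `iterate`, and `ruler` are illustrative, not taken from the paper.

```python
def ruler_morphism(n):
    """Image of the letter n under the assumed morphism n -> 0 (n+1)."""
    return [0, n + 1]


def iterate(morphism, word, times):
    """Apply a letter-to-word morphism to every letter of word, `times` times."""
    for _ in range(times):
        word = [c for letter in word for c in morphism(letter)]
    return word


def ruler(m):
    """Exponent of the largest power of 2 dividing m, for m >= 1."""
    return (m & -m).bit_length() - 1


prefix = iterate(ruler_morphism, [0], 3)
print(prefix)                            # [0, 1, 0, 2, 0, 1, 0, 3]
print([ruler(m) for m in range(1, 9)])   # [0, 1, 0, 2, 0, 1, 0, 3]
```

Each iteration doubles the length of the word, so t iterations starting from the letter 0 give the first 2^t terms of the sequence.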
The overlap‑free case is substantially more intricate. The authors introduce an explicitly defined, non‑uniform morphism φ:ℕ*→ℕ* that acts as a “minimal extension” operator. For a letter n, φ(n) is a block consisting of n, a separator 0, the next integer n+1, another separator, and so on, ending with the current maximal letter. By iterating φ, they generate a family of finite words φ^{t}(h) that are overlap‑free for every t. The central theorem states that for any pair of natural numbers h≤k, the word φ^{k−h}(h) is the lexicographically smallest overlap‑free word that starts with h and ends with k. The proof rests on three pillars: (1) each application of φ introduces the new maximal letter at the very end, (2) the insertion never creates an axaxa pattern, and (3) any other overlap‑free word with the same endpoints must be lexicographically larger because it would have to use a larger letter earlier.
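The minimality statement can be explored by brute force without knowing φ. The sketch below does not implement the paper's morphism. It assumes that the lexicographically least overlap‑free word from h to k is obtained by starting from the letter h, extending greedily, and stopping at the first occurrence of k (every greedy prefix is the least overlap‑free word of its length with that first letter). It also assumes that k eventually appears, which holds for the small cases tried here. The function names are hypothetical.

```python
def ends_with_overlap(w):
    """Return True if some suffix of w has the form axaxa (a a letter)."""
    n = len(w)
    for p in range(1, (n - 1) // 2 + 1):
        suffix = w[n - (2 * p + 1):]
        if all(suffix[i] == suffix[i + p] for i in range(p + 1)):
            return True
    return False


def least_overlap_free(h, k):
    """Greedy candidate for the least overlap-free word from h to k (h <= k)."""
    if h > k:
        raise ValueError("expected h <= k")
    w = [h]
    while w[-1] != k:
        c = 0
        # Append the least letter that does not complete an overlap.
        while ends_with_overlap(w + [c]):
            c += 1
        w.append(c)
    return w


print(least_overlap_free(0, 1))  # [0, 0, 1]
print(least_overlap_free(1, 2))  # [1, 0, 0, 1, 0, 0, 2]
print(least_overlap_free(0, 2))  # [0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 2]
```

In these small cases the word from 0 to 2 is the concatenation of the words obtained for the letters of 001 (001, 001, and 1001002), which is the behaviour one would expect if a morphism maps each letter h to the least word from h to h+1.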
Beyond minimality, the paper explores symmetry properties of the φ‑generated words. Reversing a word or swapping each letter n with k−n (where k is the maximal letter in the word) yields another φ‑image, demonstrating a form of involutive symmetry. The infinite limit w=lim_{t→∞}φ^{t}(0) is shown to be overlap‑free, to contain every natural number infinitely often, and to have a growth rate of gaps between successive occurrences that is roughly logarithmic. The authors provide a computer‑assisted verification that no overlap appears in any finite prefix of w, leveraging the explicit recursive definition of φ.
Finally, the authors discuss extensions. Because φ is defined by a simple recursive rule, it can be adapted to avoid higher powers (e.g., cubes) or to work over large but finite alphabets. They also suggest studying the combinatorial complexity and entropy of w, and investigating whether similar morphisms exist for other avoidance patterns. In summary, the paper delivers a complete characterization of the lexicographically least infinite square‑free and overlap‑free words over ℕ, introduces a novel morphism φ that solves the overlap problem, and establishes optimality and symmetry results that deepen our understanding of pattern avoidance on infinite alphabets.