Automata and Differentiable Words

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

We exhibit the construction of a deterministic automaton that, given k > 0, recognizes the (regular) language of k-differentiable words. Our approach follows a scheme of Crochemore et al. based on minimal forbidden words. We extend this construction to the case of C∞-words, i.e., words differentiable arbitrarily many times. We thus obtain an infinite automaton for representing the set of C∞-words. We derive a classification of C∞-words induced by the structure of the automaton. Then, we introduce a new framework for dealing with C∞-words, based on a three-letter alphabet. This allows us to define a compacted version of the automaton, which we use to prove that every C∞-word admits a repetition in C∞ whose length is polynomially bounded.

💡 Research Summary

The paper investigates the class of differentiable words over the binary alphabet Σ = {1, 2}. A word is called differentiable if its run‑length encoding Δ(w) is again a word over Σ; equivalently, the word contains no factor 111 or 222. By iterating the derivative operator D (which discards a leading or trailing run of length 1 before applying Δ), one obtains the notion of k‑differentiable words (those for which Dⁱ(w) is differentiable for all 0 ≤ i < k, so that Dᵏ(w) is defined) and the limit class C∞ of words that are differentiable arbitrarily many times.
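The definitions above can be made concrete with a minimal Python sketch. The function names (`run_lengths`, `derivative`, `is_k_differentiable`) are illustrative choices, not taken from the paper, and words are represented as strings over the characters "1" and "2".

```python
from itertools import groupby

def run_lengths(w):
    """Lengths of the maximal runs of equal letters in w."""
    return [len(list(group)) for _, group in groupby(w)]

def derivative(w):
    """Return D(w) as a string, or None if w is not differentiable.

    A word over {1, 2} is differentiable when every run has length 1 or 2,
    i.e. when it contains neither 111 nor 222.
    """
    runs = run_lengths(w)
    if any(r > 2 for r in runs):
        return None
    # Discard a leading and/or trailing run of length 1 before encoding.
    if runs and runs[0] == 1:
        runs = runs[1:]
    if runs and runs[-1] == 1:
        runs = runs[:-1]
    return "".join(str(r) for r in runs)

def is_k_differentiable(w, k):
    """True if the derivative can be applied k times in a row."""
    for _ in range(k):
        w = derivative(w)
        if w is None:
            return False
    return True

print(derivative("2211212"))              # 2211
print(is_k_differentiable("12121", 1))    # True
print(is_k_differentiable("12121", 2))    # False, since D(12121) = 111
```

The word 12121 illustrates why the hierarchy is strict: it avoids 111 and 222, yet its derivative is 111, so it is 1‑differentiable but not 2‑differentiable.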

The authors first describe how to compute the set of minimal forbidden words MF(Cₖ) for each k. A minimal forbidden word is a word that does not belong to Cₖ although all of its proper factors do; the set of these words uniquely characterizes the factorial language Cₖ. Using the recursive relationship between MF(Cₖ) and MF(Cₖ₊₁) (derived from the definition of the derivative), they obtain an effective construction of the trie that recognises MF(Cₖ) for any finite k.
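For small k, the minimal forbidden words can be listed by exhaustive search. The sketch below is a brute-force check, not the recursive construction used in the paper; the helper names are hypothetical. It relies on the standard fact that, for a factorial language, a word is minimal forbidden exactly when it lies outside the language while its longest proper prefix and longest proper suffix lie inside.

```python
from itertools import groupby, product

def derivative(w):
    """D(w) as a string, or None if w contains a run longer than 2."""
    runs = [len(list(g)) for _, g in groupby(w)]
    if any(r > 2 for r in runs):
        return None
    if runs and runs[0] == 1:
        runs = runs[1:]
    if runs and runs[-1] == 1:
        runs = runs[:-1]
    return "".join(str(r) for r in runs)

def in_C(w, k):
    """Membership in C_k: w can be differentiated k times."""
    for _ in range(k):
        w = derivative(w)
        if w is None:
            return False
    return True

def minimal_forbidden_words(k, max_len):
    """Minimal forbidden words of C_k of length at most max_len (brute force)."""
    found = []
    for n in range(1, max_len + 1):
        for letters in product("12", repeat=n):
            w = "".join(letters)
            # Outside C_k, but longest proper prefix and suffix are inside.
            if not in_C(w, k) and in_C(w[1:], k) and in_C(w[:-1], k):
                found.append(w)
    return sorted(found)

print(minimal_forbidden_words(1, 4))
# ['111', '222']
print(minimal_forbidden_words(2, 6))
# ['111', '112211', '12121', '21212', '221122', '222']
```

The search is exponential in `max_len`, so it is only useful for inspecting the first few levels by hand.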

With this trie as input, they apply the L‑automaton construction of Crochemore et al. (2009). The algorithm builds a deterministic finite‑state automaton Aₖ whose states correspond to the prefixes of the forbidden words, and whose transition function consists of solid edges (the original trie edges) and weak edges (generated via a failure function s that points to the longest proper suffix that is also a state). The set of final states is Q \ M, where M = MF(Cₖ) is the set of sink states, so every non‑sink state is final. The resulting automaton accepts exactly the language Cₖ.
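The construction just described can be sketched in Python as follows. This is an illustrative reading of the L‑automaton scheme under stated assumptions, not the authors' code: states are represented directly as prefix strings, the input is assumed to be a finite set of minimal forbidden words (so no forbidden word is a proper factor of another), and the names `build_l_automaton`, `accepts`, and `accepted_words` are hypothetical.

```python
from itertools import product

def build_l_automaton(forbidden, alphabet="12"):
    """Build the transition table from a finite set of minimal forbidden words."""
    states = {w[:i] for w in forbidden for i in range(len(w) + 1)}
    sinks = set(forbidden)
    delta = {}           # (state, letter) -> state
    fail = {"": ""}      # failure function s
    for a in alphabet:
        if a in states:
            delta["", a] = a          # solid edge from the root
            fail[a] = ""
        else:
            delta["", a] = ""         # weak edge looping on the root
    # Breadth-first order: shorter prefixes are processed first.
    for p in sorted(states - {""}, key=len):
        for a in alphabet:
            if p in sinks:
                delta[p, a] = p                       # sinks are absorbing
            elif p + a in states:
                delta[p, a] = p + a                   # solid (trie) edge
                fail[p + a] = delta[fail[p], a]
            else:
                delta[p, a] = delta[fail[p], a]       # weak edge
    return delta, sinks

def accepts(delta, sinks, w):
    """True if reading w from the initial state ends in a non-sink state."""
    state = ""
    for a in w:
        state = delta[state, a]
    return state not in sinks

def accepted_words(delta, sinks, n, alphabet="12"):
    """All accepted words of length n, in lexicographic order."""
    words = ("".join(t) for t in product(alphabet, repeat=n))
    return [w for w in words if accepts(delta, sinks, w)]

delta, sinks = build_l_automaton(["111", "222"])
print(accepted_words(delta, sinks, 3))
# ['112', '121', '122', '211', '212', '221']
```

Because sink states are absorbing, a word is rejected as soon as it contains any forbidden word as a factor, which is what makes the automaton recognise the factorial language avoiding the given set.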

When k is taken to infinity, the set MF(C∞) becomes infinite, and the construction yields an infinite deterministic automaton A∞. Each finite C∞‑word labels a unique path from the initial state, and two words end in the same state precisely when they share the same left simple extendability: among the two possible left extensions 1w and 2w, exactly one remains in C∞. This induces a natural equivalence relation on C∞‑words.

The paper then introduces a standard automaton‑compaction step, merging states that have identical outgoing transitions, producing a compacted automaton C A∞. This new automaton reflects a finer equivalence that also accounts for right simple extendability.

A major conceptual innovation is the vertical representation of C∞‑words on a three‑letter alphabet Σ′ = {0, 1, 2}. Each C∞‑word w is uniquely encoded by a pair of sequences over Σ′ that record, at each level of differentiation, whether the left and right extensions are possible (0 = neither, 1 = only 1‑extension, 2 = only 2‑extension). Using this representation the authors rewrite C A∞ as the vertical automaton V C A∞, and after another round of compaction obtain the vertical ultra‑compacted automaton V U C A∞.

The structural properties of V U C A∞ enable the authors to prove a new repetition theorem. For any C∞‑word u there exists a word z such that the concatenation u z u is again a C∞‑word, and its length satisfies

|u z u| ≤ C · |u|^{2.72}

for some constant C independent of u. Consequently every C∞‑word contains a gap‑repetition whose length is bounded by a sub‑cubic polynomial in |u|. This result complements earlier work by Carpi, who showed that the repetitivity index of C∞‑words is bounded below by a linear function. Moreover, the theorem resolves Problem 1 (whether for any two C∞‑words u, v there exists a C∞‑word of the form u z v) in the special case u = v.
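For short words, a gap word z with u z u in C∞ can be found by exhaustive search. The sketch below is only a brute-force illustration of the statement, not the automaton-based argument of the paper; `is_c_infinity` and `shortest_gap` are hypothetical helper names, and `max_gap` is an arbitrary search cutoff introduced here.

```python
from itertools import groupby, product

def derivative(w):
    """D(w) as a string, or None if w contains a run longer than 2."""
    runs = [len(list(g)) for _, g in groupby(w)]
    if any(r > 2 for r in runs):
        return None
    if runs and runs[0] == 1:
        runs = runs[1:]
    if runs and runs[-1] == 1:
        runs = runs[:-1]
    return "".join(str(r) for r in runs)

def is_c_infinity(w):
    """True if w can be differentiated down to the empty word.

    Each derivative of a non-empty word is strictly shorter, so the loop ends.
    """
    while w:
        w = derivative(w)
        if w is None:
            return False
    return True

def shortest_gap(u, max_gap=12):
    """Shortest z (up to max_gap letters) with u + z + u in C-infinity, else None."""
    for n in range(max_gap + 1):
        for letters in product("12", repeat=n):
            z = "".join(letters)
            if is_c_infinity(u + z + u):
                return z
    return None

u = "2211"
z = shortest_gap(u)
print(z, is_c_infinity(u + z + u))
```

For u = 2211, one valid gap is z = 21121, since 2211211212211 differentiates successively to 22121122, 21122, 22, 2, and the empty word; the search therefore returns a gap of length at most 5.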

The paper concludes with a discussion of how these automaton‑based techniques provide new tools for studying the Kolakoski word and related open problems, such as whether the set of factors of the Kolakoski word coincides with C∞, and whether a smooth right‑infinite word exists whose factor set equals C∞. The authors suggest that the vertical representation and the compacted automata may be adapted to other combinatorial word families (e.g., Sturmian or Thue‑Morse) and could lead to further insights into the combinatorial structure of differentiable words.

In summary, the work delivers (1) an effective method to construct deterministic automata for k‑differentiable languages via minimal forbidden words, (2) an infinite automaton for the full C∞ class, (3) a novel three‑letter vertical encoding that yields highly compacted automata, and (4) a polynomial‑bound repetition theorem for C∞‑words, thereby advancing both the theoretical understanding and algorithmic handling of differentiable words.
