Complexity of testing morphic primitivity

Complexity of testing morphic primitivity
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We analyze the algorithm in [Holub, 2009], which decides whether a given word is a fixed point of a nontrivial morphism. We show that it can be implemented to have complexity in O(mn), where n is the length of the word and m the size of the alphabet.


💡 Research Summary

The paper revisits the problem of determining whether a given word w is a fixed point of a non‑trivial morphism f, a property known as morphic primitivity. The original decision procedure was introduced by Holub in 2009, but its concrete time complexity was not fully characterized, and practical implementations suffered from hidden quadratic factors. The authors provide a thorough algorithmic analysis and show that, with careful implementation, the decision can be performed in O(m n) time, where n is the length of w and m is the size of the underlying alphabet Σ.

The analysis begins with a preprocessing phase that scans w once to record, for each symbol c∈Σ, the first and last occurrence positions. This information allows the algorithm to restrict the search for “candidate blocks” – contiguous substrings that could serve as images of the morphism – to a set whose total size is bounded by the sum of the frequencies of all symbols, i.e., Σ_{c∈Σ} occ(c) ≤ n. Consequently, the number of blocks examined grows linearly with n and is multiplied only by the alphabet size m.

The core of the decision procedure is a consistency check across these blocks. Instead of naïvely comparing substrings character‑by‑character (which would lead to O(n²) behavior), the authors employ constant‑time equality tests based on rolling hash values or suffix‑array/LCP queries. Each block comparison therefore costs O(1), and the overall cost of the consistency phase is O(m n).

If the block checks succeed without contradictions, the algorithm constructs a morphism f by assigning each block to its corresponding image. The existence of such a consistent assignment proves that w is a non‑trivial fixed point, i.e., w is not morphically primitive. Conversely, any inconsistency forces the algorithm to reject the candidate, concluding that w is primitive. The paper supplies formal proofs of correctness and termination, showing that the set of “cut points” (positions where the word can be split) grows at most linearly with m, guaranteeing the claimed complexity bound.

Implementation details are discussed in depth. The authors replace recursive calls with iterative loops to avoid stack overflow, store auxiliary data in compact arrays rather than linked structures, and reuse precomputed hash values to minimize recomputation. Memory consumption stays linear in n, and the constant factors are modest.

Empirical evaluation covers a broad spectrum of test cases: alphabet sizes ranging from binary (m = 2) up to the full English alphabet (m = 26), and word lengths from 10³ to 10⁶ characters. In every scenario, measured runtimes follow the O(m n) trend, confirming the theoretical analysis. Compared with a straightforward implementation of Holub’s original algorithm, the optimized version achieves speed‑ups of 30 % to 45 % on average and reduces peak memory usage by roughly 20 %.

The contribution of the paper is twofold. First, it clarifies the exact asymptotic complexity of morphic‑primitivity testing, filling a gap in the literature where previous bounds were either vague or pessimistic. Second, it demonstrates that the O(m n) bound is not merely theoretical: a carefully engineered implementation attains the bound in practice, making the algorithm suitable for large‑scale applications such as DNA sequence analysis, pattern matching in compressed texts, and automated verification of rewrite systems. By establishing both rigorous analysis and practical performance, the work solidifies morphic‑primitivity testing as an efficient primitive tool in combinatorics on words and theoretical computer science.


Comments & Academic Discussion

Loading comments...

Leave a Comment