A pseudo-primitive word with respect to an antimorphic involution \theta is a word which cannot be written as a catenation of occurrences of a strictly shorter word t and \theta(t). Properties of pseudo-primitive words are investigated in this paper. These properties link pseudo-primitive words with essential notions in combinatorics on words such as primitive words, (pseudo)-palindromes, and (pseudo)-commutativity. Their applications include an improved solution to the extended Lyndon-Sch\"utzenberger equation u_1 u_2 ... u_l = v_1 ... v_n w_1 ... w_m, where u_1, ..., u_l \in {u, \theta(u)}, v_1, ..., v_n \in {v, \theta(v)}, and w_1, ..., w_m \in {w, \theta(w)} for some words u, v, w, integers l, n, m \ge 2, and an antimorphic involution \theta. We prove that for l \ge 4, n,m \ge 3, this equation implies that u, v, w can be expressed in terms of a common word t and its image \theta(t). Moreover, several cases of this equation where l = 3 are examined.
For elements u, v, w in a free group, the equation of the form u^ℓ = v^n w^m (ℓ, n, m ≥ 2) is known as the Lyndon-Schützenberger equation (LS equation for short). Lyndon and Schützenberger [13] investigated the question of finding all possible solutions for this equation in a free group, and proved that if the equation holds, then u, v, and w are all powers of a common element. This equation can also be considered on the semigroup of all finite words over a fixed alphabet Σ, and an analogous result holds.
Theorem 1 (see, e.g., [7,13,14]). For words u, v, w ∈ Σ^+ and integers ℓ, n, m ≥ 2, the equation u^ℓ = v^n w^m implies that u, v, w are powers of a common word.
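Theorem 1 can be sanity-checked on small cases (this is not a proof). The following Python sketch enumerates short words v, w over a two-letter alphabet, forms v^n w^m, and tests whether it is an ℓ-th power u^ℓ whose base does not share a primitive root with v and w. The helper names `primitive_root`, `words`, and `ls_counterexamples`, as well as the search bounds, are illustrative assumptions and do not come from the paper.

```python
from itertools import product

def primitive_root(w):
    """Return the shortest word p such that w is a power of p."""
    n = len(w)
    for k in range(1, n + 1):
        if n % k == 0 and w[:k] * (n // k) == w:
            return w[:k]

def words(alphabet, max_len):
    """Yield all non-empty words over `alphabet` of length at most `max_len`."""
    for k in range(1, max_len + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)

def ls_counterexamples(alphabet="ab", max_len=3, max_exp=3):
    """Search for u^l = v^n w^m (l, n, m >= 2) where u, v, w lack a common root."""
    found = []
    for v in words(alphabet, max_len):
        for w in words(alphabet, max_len):
            for n, m in product(range(2, max_exp + 1), repeat=2):
                z = v * n + w * m
                for l in range(2, len(z) + 1):
                    if len(z) % l:
                        continue
                    u = z[: len(z) // l]
                    if u * l != z:
                        continue  # z is not an l-th power
                    roots = {primitive_root(x) for x in (u, v, w)}
                    if len(roots) > 1:
                        found.append((u, v, w, l, n, m))
    return found

# Theorem 1 predicts that no counterexample exists within the search bounds.
print(ls_counterexamples())
```

Two words are powers of a common word exactly when they have the same primitive root, which is why the sketch compares roots rather than searching for the common word directly.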
The Lyndon-Schützenberger equation has been generalized in several ways; e.g., the equation of the form x^k = z_1^{k_1} z_2^{k_2} ... z_n^{k_n} was investigated by Harju and Nowotka [8], and its special cases in [1,11]. Czeizler et al. [3] have recently proposed another extension, which was originally motivated by the information encoded as DNA strands for DNA computing. In this framework, a DNA strand is modeled by a word w and encodes the same information as its Watson-Crick complement. In formal language theory, the Watson-Crick complementarity of DNA strands is modeled by an antimorphic involution θ [9,15], i.e., a function θ on Σ^* that is (a) antimorphic, θ(xy) = θ(y)θ(x) for all x, y ∈ Σ^*, and (b) an involution, θ^2 = id, the identity. Thus, we can model the property whereby a DNA single strand binds to and is completely equivalent to its Watson-Crick complement, by considering a word u and its image θ(u) equivalent, for a given antimorphic involution θ.
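As a concrete illustration of an antimorphic involution, the following Python sketch models the Watson-Crick complement on the DNA alphabet {A, C, G, T} and checks both defining properties on sample words. The names `COMPLEMENT` and `theta` are chosen here for illustration only.

```python
# Letter-wise Watson-Crick pairing: A <-> T and C <-> G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def theta(word):
    """Reverse the word and complement each letter."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

x, y = "ACG", "TTC"

# (a) Antimorphic: theta(xy) = theta(y) theta(x).
assert theta(x + y) == theta(y) + theta(x)

# (b) Involution: applying theta twice gives back the original word.
assert theta(theta(x)) == x

print(theta(x))  # CGT
```

The reversal is what makes θ antimorphic rather than morphic; the letter-wise complement alone would satisfy θ(xy) = θ(x)θ(y) instead.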
For words u, v, w, integers ℓ, n, m ≥ 2, and an antimorphic involution θ, an extended Lyndon-Schützenberger equation (ExLS equation) is of the form
u_1 u_2 ... u_ℓ = v_1 ... v_n w_1 ... w_m, (1)
with u_1, ..., u_ℓ ∈ {u, θ(u)}, v_1, ..., v_n ∈ {v, θ(v)}, and w_1, ..., w_m ∈ {w, θ(w)}.
The question arises as to whether an equation of this form implies the existence of a word t such that u, v, w ∈ {t, θ(t)}^+. A given triple (ℓ, n, m) of integers is said to impose pseudo-periodicity, with respect to θ, on u, v, w, or simply, to impose θ-periodicity on u, v, w, if (1) implies u, v, w ∈ {t, θ(t)}^+ for some word t. Furthermore, we say that the triple (ℓ, n, m) imposes θ-periodicity if it imposes θ-periodicity on all u, v, w. The known results on ExLS equations [3] are summarized in Table 1.
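The membership condition u, v, w ∈ {t, θ(t)}^+ can be tested mechanically, because t and θ(t) have the same length. The Python sketch below assumes that θ is the mirror image (word reversal); the function names `in_theta_plus` and `common_theta_root` are hypothetical helpers, not notation from the paper. Since {t, θ(t)} is symmetric in t and θ(t), it suffices to try prefixes of u as candidates for t.

```python
def theta(word):
    """Assumed involution for this sketch: the mirror image."""
    return word[::-1]

def in_theta_plus(word, t):
    """Check whether `word` is a non-empty catenation of copies of t and theta(t)."""
    k = len(t)
    if k == 0 or len(word) == 0 or len(word) % k:
        return False
    blocks = {t, theta(t)}
    return all(word[i:i + k] in blocks for i in range(0, len(word), k))

def common_theta_root(u, v, w):
    """Return the shortest prefix t of u with u, v, w in {t, theta(t)}+, or None."""
    for k in range(1, len(u) + 1):
        t = u[:k]
        if all(in_theta_plus(x, t) for x in (u, v, w)):
            return t
    return None

# One solution of an ExLS equation with (l, n, m) = (3, 3, 3).
u, v, w = "abba", "ab", "ba"
lhs = u + theta(u) + u
rhs = v + theta(v) + v + w + theta(w) + w
assert lhs == rhs

print(common_theta_root(u, v, w))  # ab
```

In this example all three words are built from t = ab and θ(t) = ba, so this particular solution is θ-periodic.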
Table 1. Summary of the known results regarding the extended Lyndon-Schützenberger equation.
ℓ n m θ-periodicity
This paper is a step towards solving the unsettled cases of Table 1, using the following strategy. Concise proofs of Theorem 1 exist in the literature that make use of fundamental properties such as:
(i) The periodicity theorem of Fine and Wilf (FW theorem),
(ii) The fact that a primitive word cannot be a proper infix of its square, and
(iii) The fact that the class of primitive words is closed under cyclic permutation.
(For details of each, see [2].) In contrast, the proof given in [3] for the result about ExLS equations, stating that (≥ 5, ≥ 3, ≥ 3) imposes θ-periodicity, involves techniques designed for this specific purpose only. Should Properties (i), (ii), (iii) be generalized so as to take into account the informational equivalence between a word u and θ(u), they could possibly form a basis for a concise proof of the solutions to the ExLS equation. The approach we use in this paper is thus to seek analogous properties for this extended case, and use the results we obtain to approach the unsettled cases in Table 1.
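Properties (ii) and (iii) for ordinary primitive words can be checked exhaustively on short words. The Python sketch below does so over a two-letter alphabet up to length 6; the function names and the bounds are assumptions made for illustration, and the check is a finite test, not a proof.

```python
from itertools import product

def is_primitive(w):
    """A non-empty word is primitive if it is not a power of a strictly shorter word."""
    n = len(w)
    return n > 0 and all(w[:k] * (n // k) != w for k in range(1, n) if n % k == 0)

def proper_infix_of_square(w):
    """True if w occurs in ww at a position other than 0 and len(w)."""
    return (w + w).find(w, 1) < len(w)

def check_properties(alphabet="ab", max_len=6):
    """Test Properties (ii) and (iii) on every primitive word up to max_len."""
    for k in range(1, max_len + 1):
        for letters in product(alphabet, repeat=k):
            w = "".join(letters)
            if not is_primitive(w):
                continue
            # (ii) A primitive word is not a proper infix of its square.
            if proper_infix_of_square(w):
                return False
            # (iii) Every cyclic permutation of a primitive word is primitive.
            if not all(is_primitive(w[i:] + w[:i]) for i in range(k)):
                return False
    return True

print(check_properties())  # True
```

For contrast, the non-primitive word abab does occur inside its square at position 2, so `proper_infix_of_square("abab")` is true.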
Czeizler, Kari, and Seki generalized Property (i) in [4]. There, first the notion of primitive words was extended to that of pseudo-primitive words with respect to a given antimorphic involution θ (or simply, θ-primitive words). A word u is said to be θ-primitive if there does not exist another word t such that u ∈ t{t, θ(t)}^+. For example, if θ is the mirror image over {a, b}^* (the identity function on {a, b} extended to an antimorphism on {a, b}^*), aabb is θ-primitive, while abba is not because it can be written as abθ(ab). Based on the θ-primitivity of words, Property (i) was generalized as follows: “For words u, v, if a word in u{u, θ(u)}^* and a word in v{v, θ(v)}^* share a long enough prefix (for details, see Theorems 5 and 6), then u, v ∈ t{t, θ(t)}^* for some θ-primitive word t.”
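The definition of θ-primitivity translates directly into a short test. The following Python sketch assumes, as in the example above, that θ is the mirror image; the name `is_theta_primitive` is introduced here for illustration. A candidate t must be a proper prefix of u whose length divides |u|, and every later block of that length must equal t or θ(t).

```python
def theta(word):
    """Assumed involution for this sketch: the mirror image."""
    return word[::-1]

def is_theta_primitive(u):
    """True if u is non-empty and not in t{t, theta(t)}+ for any word t."""
    n = len(u)
    for k in range(1, n):
        if n % k:
            continue
        t = u[:k]
        blocks = {t, theta(t)}
        # u starts with t by construction; check the remaining blocks.
        if all(u[i:i + k] in blocks for i in range(k, n, k)):
            return False
    return n > 0

print(is_theta_primitive("aabb"))  # True
print(is_theta_primitive("abba"))  # False, since abba = ab theta(ab)
```

Every θ-primitive word is primitive, because a power t^k with k ≥ 2 already lies in t{t, θ(t)}^+; the converse fails, as abba shows.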
In contrast, little is known about Properties (ii) and (iii) except that they cannot be generalized as suggested in the previous example: non-trivial overlaps between two words in {t, θ(t)}^+ are possible, and cyclic permutations do not in general preserve the θ-primitivity of words. As a preliminary step towards an extension of Property (ii), Czeizler et al. examined the non-trivial overlap of the form
for some θ-primitive word v (1 ≤ i ≤ 2m), and both x and y are properly shorter than v [3]. Some of the results obtained there will be employed in this paper.
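The two obstacles mentioned above can be observed on very small words. The Python sketch below again assumes that θ is the mirror image over {a, b}; the helper names are illustrative, and the overlap shown is only one simple instance, not the specific form studied in [3].

```python
def theta(word):
    """Assumed involution for this sketch: the mirror image."""
    return word[::-1]

def is_theta_primitive(u):
    """True if u is non-empty and not in t{t, theta(t)}+ for any word t."""
    n = len(u)
    for k in range(1, n):
        if n % k == 0:
            blocks = {u[:k], theta(u[:k])}
            if all(u[i:i + k] in blocks for i in range(k, n, k)):
                return False
    return n > 0

def rotations(w):
    """All cyclic permutations of w."""
    return [w[i:] + w[:i] for i in range(len(w))]

# Cyclic permutation does not preserve theta-primitivity:
# aabb is theta-primitive, but its rotation abba = ab theta(ab) is not.
w = "aabb"
print([(r, is_theta_primitive(r)) for r in rotations(w)])

# Non-trivial overlap: for the theta-primitive word t = ab,
# theta(t) = ba occurs strictly inside tt = abab.
t = "ab"
print((t + t).find(theta(t)))  # 1
```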
One purpose of this paper is to explore further the extend