Inverse Star, Borders, and Palstars

Reading time: 7 minutes

📝 Original Info

  • Title: Inverse Star, Borders, and Palstars
  • ArXiv ID: 1008.2440
  • Date: 2010-08-14
  • Authors: Narad Rampersad, Jeffrey Shallit, Ming-wei Wang

📝 Abstract

A language $L$ is closed if $L = L^*$. We consider an operation on closed languages, $L^{-*}$, that is an inverse to Kleene closure. It is known that if $L$ is closed and regular, then $L^{-*}$ is also regular. We show that the analogous result fails to hold for the context-free languages. Along the way we find a new relationship between the unbordered words and the prime palstars of Knuth, Morris, and Pratt. We use this relationship to enumerate the prime palstars, and we prove that neither the language of all unbordered words nor the language of all prime palstars is context-free.

📄 Full Content

Let $L$ be a language such that $L = L^*$. Then, following [3], we say that $L$ is closed. Brzozowski [2] studied the "smallest" language $M$ such that $L = M^*$.

For closed languages L, define

Theorem 2. If $L$ is closed then $(L^{-*})^* = L$. Furthermore $L^{-*} = L - L^2$. If $L$ is regular and closed, then so is $L^{-*}$.
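The inverse star operation can be explored on a finite truncation of a closed language. The sketch below is an illustration under stated assumptions, not the paper's construction: it reads $L^{-*}$ as the set of nonempty words of $L$ that are not a product of two nonempty words of $L$ (the handling of the empty word is an assumption made here), and the names `words_up_to`, `inverse_star`, and `even_as` are hypothetical.

```python
from itertools import product

def words_up_to(alphabet, max_len):
    """Yield all words over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def inverse_star(in_L, alphabet, max_len):
    """Words of length <= max_len in the inverse star of a closed language.

    `in_L` is a membership predicate for L. Assumption: a word belongs to
    the inverse star when it is a nonempty word of L that cannot be split
    into two nonempty words of L.
    """
    members = {w for w in words_up_to(alphabet, max_len) if w and in_L(w)}
    result = set()
    for w in members:
        splits = ((w[:i], w[i:]) for i in range(1, len(w)))
        if not any(u in members and v in members for u, v in splits):
            result.add(w)
    return result

def even_as(w):
    """Example closed language: words over {a, b} with an even number of a's."""
    return w.count("a") % 2 == 0

gens = inverse_star(even_as, "ab", 5)
print(sorted(gens, key=lambda w: (len(w), w)))
```

For this example language the generators up to length 5 are b, aa, aba, abba, and abbba, which matches the intuition that every word with an even number of a's is a product of words of the form b or a b⋯b a.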

In this note we show that the class of context-free languages is not closed under the operation $^{-*}$. First, though, we digress to discuss products of palindromes.

2 Palstars, prime palstars, and unbordered words

In this section we find a new connection between the prime palstars (as introduced in Knuth, Morris, and Pratt [4]) and the unbordered words.

We start with some definitions. By $w^R$ we mean the reverse of the word $w$. A palindrome is a word $w$ such that $w = w^R$. In this paper we will only be concerned with the nonempty palindromes of even length:

$$\mathrm{PAL} := \{ w \in \Sigma^+ : w = w^R \text{ and } |w| \text{ is even} \}.$$

A palstar is an element of the language $\mathrm{PALSTAR} := \mathrm{PAL}^*$. A word $x$ is a prime palstar if it is a nonempty palstar that cannot be written as the product of two nonempty palstars. Evidently a prime palstar must itself be a palindrome. The first few prime palstars over $\{0, 1\}$ are 00, 0110, 010010, 011110, 01000010, 01011010, 01111110, and their complements, obtained by mapping 0 to 1 and vice versa. The language of all prime palstars is denoted PRIMEPALSTAR.
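The list of small prime palstars can be reproduced by brute force directly from the definitions. The following is a minimal sketch, assuming that "product of two palstars" means a product of two nonempty palstars; the function names are illustrative and not taken from the paper.

```python
from functools import lru_cache
from itertools import product

def is_even_palindrome(w):
    """True for nonempty palindromes of even length (the language PAL)."""
    return len(w) > 0 and len(w) % 2 == 0 and w == w[::-1]

@lru_cache(maxsize=None)
def is_palstar(w):
    """True if w is a product of zero or more words of PAL."""
    if w == "":
        return True
    return any(is_even_palindrome(w[:i]) and is_palstar(w[i:])
               for i in range(2, len(w) + 1, 2))

def is_prime_palstar(w):
    """True if w is a nonempty palstar with no split into two nonempty palstars."""
    if w == "" or not is_palstar(w):
        return False
    return not any(is_palstar(w[:i]) and is_palstar(w[i:])
                   for i in range(1, len(w)))

def prime_palstars(length, alphabet="01"):
    """All prime palstars of the given length, in lexicographic order."""
    return [w for w in map("".join, product(alphabet, repeat=length))
            if is_prime_palstar(w)]

for length in (2, 4, 6, 8):
    # Keep only the words starting with 0; the rest are their complements.
    print(length, [w for w in prime_palstars(length) if w.startswith("0")])
```

Restricting to words that begin with 0 recovers exactly the seven words listed above for lengths 2 through 8.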

Theorem 3 (Knuth-Morris-Pratt [4]). Every palstar has a unique factorization into prime palstars.

The proof of this theorem depends on the following lemma:

Lemma 4 (Knuth-Morris-Pratt [4]). No prime palstar is a proper prefix of another prime palstar.
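Lemma 4 suggests a greedy way to compute the factorization of Theorem 3. The shortest nonempty even-length palindromic prefix of a word is a prime palstar (any split into two nonempty palstars would expose a shorter such prefix), and by Lemma 4 it must coincide with the first prime factor. The sketch below relies on that reading; it is an illustration rather than the algorithm of [4], and the function names are hypothetical.

```python
def shortest_even_palindrome_prefix(w):
    """Return the shortest nonempty even-length palindromic prefix of w, or None."""
    for i in range(2, len(w) + 1, 2):
        prefix = w[:i]
        if prefix == prefix[::-1]:
            return prefix
    return None

def prime_factorization(w):
    """Factor w into prime palstars greedily; return None if w is not a palstar."""
    factors = []
    rest = w
    while rest:
        p = shortest_even_palindrome_prefix(rest)
        if p is None:
            return None  # no factor can start here, so w is not a palstar
        factors.append(p)
        rest = rest[len(p):]
    return factors

print(prime_factorization("001001100100"))
print(prime_factorization("0100"))
```

The first call factors the even-length palindrome 001001100100 as 00, 1001, 1001, 00; the second returns `None` because 0100 has no even-length palindromic prefix.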

Corollary 5. If $w$ is a palindrome of even length, then its factorization into prime palstars must be of the form $w = x_1 x_2 \cdots x_n$, where $x_i = x_{n+1-i}$ for $1 \le i \le n$.

Proof. Write the factorization as $w = x_1 x_2 \cdots x_n$. If $n = 1$ there is nothing to prove. Otherwise, since $w$ ends with $x_n$, it must begin with $x_n^R = x_n$. Hence either $x_1$ is a prefix of $x_n$, or vice versa. By Lemma 4 we must have $x_1 = x_n$. Using the same argument on the shorter palindrome $x_1^{-1} w x_1^{-1}$, we derive the remaining equalities.

We now turn to borders. A word is said to be bordered if it has some nonempty proper prefix that is also a suffix. Otherwise, it is unbordered. Unbordered words are also called bifix-free in the literature [5].

Equivalently, a word $w$ is bordered if it can be written in the form $xyx$ for some nonempty word $x$. For example, the English word entanglement begins and ends with the string ent.
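The definition of a border translates directly into a prefix-versus-suffix comparison. The helper names below are illustrative; the sketch treats a border as a nonempty proper prefix that is also a suffix.

```python
def borders(w):
    """List every nonempty proper prefix of w that is also a suffix of w."""
    return [w[:i] for i in range(1, len(w)) if w[:i] == w[-i:]]

def is_unbordered(w):
    """True if w is nonempty and has no border."""
    return len(w) > 0 and not borders(w)

print(borders("entanglement"))
print(is_unbordered("no"), is_unbordered("0101"))
```

This quadratic check is enough for short words; the failure function of Knuth, Morris, and Pratt [4] would find the longest border in linear time.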

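Given two words of the same length, $x = a_1 a_2 \cdots a_n$ and $y = b_1 b_2 \cdots b_n$, we define their perfect shuffle $x\,ш\,y := a_1 b_1 a_2 b_2 \cdots a_n b_n$.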

Theorem 6. A word $w$ is a prime palstar if and only if there exists an unbordered word $z$ such that $w = z\,ш\,z^R$.

Proof. Suppose $w$ is not a prime palstar. If $w$ is not an even-length palindrome then it is certainly not of the form $z\,ш\,z^R$. Suppose then that $w$ is an even-length palindrome and hence is of the form $z\,ш\,z^R$. We will show that $z$ is bordered. Since $w$ is not a prime palstar, we can factor $w$ into a product of two or more prime palstars. Then by Corollary 5 such a factorization must look like $x \cdots x$ for some palindrome $x$. Then when we "unshuffle" $w$ into $z$ and $z^R$, we get that $z$ starts with the odd-indexed letters of $x$ and ends with the odd-indexed letters of $x^R$. But $x = x^R$, so $z$ starts and ends with the same word.

On the other hand, suppose $w = x\,ш\,y$. By comparing the symbols of $x$ to those of $y$ we see that if $y \neq x^R$, then $w$ is not a palindrome. So assume $y = x^R$. Now if $x$ is bordered, then we can write it as $x = zuz$ for some nonempty string $z$. Then $w = (zuz)\,ш\,(zuz)^R = (z\,ш\,z^R)(u\,ш\,u^R)(z\,ш\,z^R)$. This gives a factorization of $w$ as a product of two or three nonempty palstars (according to whether $u$ is empty or nonempty), so $w$ is not a prime palstar.

An example of this theorem in English is noon, which is a prime palstar, and is the shuffle of the unbordered word no with its reversal.
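The correspondence between unbordered words and prime palstars can be checked exhaustively for short binary words. The sketch below is a sanity check, not a proof: it builds every shuffle of an unbordered word of length n with its reversal and compares the result with the prime palstars of length 2n found from the definition. All function names are illustrative.

```python
from itertools import product

def perfect_shuffle(x, y):
    """Interleave two equal-length words: a1 b1 a2 b2 ... an bn."""
    if len(x) != len(y):
        raise ValueError("perfect shuffle needs words of equal length")
    return "".join(a + b for a, b in zip(x, y))

def is_unbordered(z):
    """True if z is nonempty and no proper nonempty prefix is also a suffix."""
    return len(z) > 0 and all(z[:i] != z[-i:] for i in range(1, len(z)))

def is_palstar(w):
    """True if w is a product of zero or more nonempty even-length palindromes."""
    if w == "":
        return True
    return any(w[:i] == w[:i][::-1] and is_palstar(w[i:])
               for i in range(2, len(w) + 1, 2))

def is_prime_palstar(w):
    """True if w is a nonempty palstar with no split into two nonempty palstars."""
    if w == "" or not is_palstar(w):
        return False
    return not any(is_palstar(w[:i]) and is_palstar(w[i:])
                   for i in range(1, len(w)))

def words(n, alphabet="01"):
    """All words of length n over the alphabet."""
    return ("".join(t) for t in product(alphabet, repeat=n))

def theorem6_holds(n, alphabet="01"):
    """Compare both descriptions of the prime palstars of length 2n."""
    shuffled = {perfect_shuffle(z, z[::-1])
                for z in words(n, alphabet) if is_unbordered(z)}
    primes = {w for w in words(2 * n, alphabet) if is_prime_palstar(w)}
    return shuffled == primes

print(perfect_shuffle("no", "on"))
print(all(theorem6_holds(n) for n in range(1, 6)))
```

The first line reproduces the noon example; the second confirms that the two sets agree for binary words with n from 1 to 5.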

As far as we know, up to now no one has enumerated the palstars. However, our argument above allows us to do so, based on enumeration of the unbordered words.

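Nielsen [5] has shown that if $a_n$ denotes the number of unbordered words of length $n$ over an alphabet of size $k$, then

$$a_n = \begin{cases} k, & \text{if } n = 1; \\ k a_{n-1} - a_{n/2}, & \text{if } n \text{ is even}; \\ k a_{n-1}, & \text{if } n \text{ is odd and } n > 1. \end{cases}$$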

(Also see [1].) Furthermore, he showed that $a_n \sim c_k k^n$, where $c_k$ is a constant that tends to 1 as $k \to \infty$, and $c_2 \doteq 0.2677868$. By Theorem 6, if $b_n$ is the number of prime palstars of length $2n$, then $b_n = a_n$. In particular, about 27% of all binary palindromes of even length are prime palstars.
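The count of unbordered words, and hence of prime palstars, can be tabulated with the standard recurrence $a_1 = k$, $a_n = k a_{n-1} - a_{n/2}$ for even $n$, and $a_n = k a_{n-1}$ for odd $n > 1$. The sketch below is illustrative: it cross-checks the recurrence against a brute-force count for small $n$ and prints the ratio $a_n / 2^n$, which should approach the constant $c_2$ quoted above. The function names are hypothetical.

```python
from itertools import product

def unbordered_counts(k, max_n):
    """Return [a_1, ..., a_max_n] for an alphabet of size k via the recurrence."""
    a = [0] * (max_n + 1)  # a[0] is unused padding
    for n in range(1, max_n + 1):
        if n == 1:
            a[n] = k
        elif n % 2 == 0:
            a[n] = k * a[n - 1] - a[n // 2]
        else:
            a[n] = k * a[n - 1]
    return a[1:]

def brute_force_count(k, n):
    """Count unbordered words of length n over {0, ..., k-1} by direct search."""
    count = 0
    for w in product(range(k), repeat=n):
        if all(w[:i] != w[-i:] for i in range(1, n)):
            count += 1
    return count

counts = unbordered_counts(2, 60)
print(counts[:8])
print(all(counts[n - 1] == brute_force_count(2, n) for n in range(1, 11)))
print(counts[59] / 2 ** 60)  # ratio a_60 / 2^60, close to c_2
```

Since $b_n = a_n$, the same table gives the number of binary prime palstars of length $2n$.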

We now apply the results in Section 2 to prove that the class of context-free languages is not closed under inverse star.

Clearly $\mathrm{PALSTAR} = \mathrm{PAL}^*$ is context-free. We have $\mathrm{PRIMEPALSTAR} = \mathrm{PALSTAR}^{-*}$. So it suffices to show that PRIMEPALSTAR is not context-free. Suppose it were. First, we need the following result.

Theorem 7. The language U of unbordered words over an alphabet of size at least 2 is not context-free.

Proof. Assume it is. Without loss of generality the alphabet is $\Sigma = \{0, 1, \ldots\}$. Consider $U' := U \cap 1\,0^+\,1\,0^+\,1\,0^+\,1\,0^+$, the intersection of $U$ with a regular language. Then

Since the context-free languages are closed under intersection with a regular language, it suffices to prove $U'$ is not context-free.

To do this, we use Ogden’s lemma [6]. Choose

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
