Diverse Palindromic Factorization is NP-Complete
We prove that it is NP-complete to decide whether a given string can be factored into palindromes that are each unique in the factorization.
Research Summary
The paper investigates the computational complexity of a string factorization problem that combines two natural constraints: every factor must be a palindrome, and all factors must be distinct (i.e., the factorization must be “diverse”). The authors prove that deciding whether a given string admits such a factorization is NP-complete.
The authors begin by formalizing the problem. A string S of length n over an alphabet Σ is to be partitioned into substrings F₁,…,F_t such that (i) each F_i is a palindrome, and (ii) no two factors are identical. This “diverse palindromic factorization” is a natural extension of the well-studied palindromic length problem (minimum number of palindromes needed to cover S) and of diverse factorizations such as LZ77/LZ78 parses. The decision version (does a diverse palindromic factorization exist?) is trivially in NP because a factorization serves as a polynomial-size certificate.
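The definition and the NP-membership argument can be illustrated with a minimal Python sketch. The function names are hypothetical and do not come from the paper. The first function checks a proposed certificate in polynomial time; the second is a plain exponential-time backtracking search, included only to make the decision problem concrete, not as an efficient algorithm.

```python
from typing import List, Optional


def is_palindrome(s: str) -> bool:
    return s == s[::-1]


def is_diverse_palindromic_factorization(s: str, factors: List[str]) -> bool:
    """Verify a certificate: the factors must concatenate to s, each must be
    a non-empty palindrome, and no factor may occur twice."""
    return (
        "".join(factors) == s
        and all(f and is_palindrome(f) for f in factors)
        and len(set(factors)) == len(factors)
    )


def find_factorization(s: str) -> Optional[List[str]]:
    """Exhaustive backtracking search (exponential time in the worst case).
    Returns one diverse palindromic factorization, or None if none exists."""
    used = set()   # factors already chosen on the current search path
    chosen = []    # the factorization built so far

    def extend(start: int) -> bool:
        if start == len(s):
            return True
        for end in range(start + 1, len(s) + 1):
            piece = s[start:end]
            if piece not in used and is_palindrome(piece):
                used.add(piece)
                chosen.append(piece)
                if extend(end):
                    return True
                used.remove(piece)
                chosen.pop()
        return False

    return chosen if extend(0) else None
```

For example, "aab" can be factored as "aa", "b", whereas "abca" has no diverse palindromic factorization: its only palindromic substrings are single characters, and "a" would have to be used twice.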
To establish NP-hardness, the authors reduce the classic Boolean circuit satisfiability problem (Circuit-SAT) to the diverse palindromic factorization problem. They first transform any Boolean circuit into an equivalent circuit that uses only NAND gates (each with two inputs) and “splitter” gates that duplicate a wire. This transformation incurs at most a logarithmic blow-up in size and depth, preserving polynomial-time reducibility.
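The restriction to NAND gates loses no generality because NAND is functionally complete. The sketch below checks the standard textbook identities for NOT, AND and OR over all inputs; it is not the authors' specific transformation, and the function names are illustrative. Note that feeding one wire into both inputs of a NAND gate is exactly where a splitter is needed in a circuit without implicit fan-out.

```python
from itertools import product


def nand(x: bool, y: bool) -> bool:
    return not (x and y)


def not_gate(x: bool) -> bool:
    # x feeds both inputs, so a fan-out-free circuit needs one splitter here.
    return nand(x, x)


def and_gate(x: bool, y: bool) -> bool:
    return not_gate(nand(x, y))


def or_gate(x: bool, y: bool) -> bool:
    # De Morgan: x OR y == NOT(NOT x AND NOT y)
    return nand(not_gate(x), not_gate(y))


def check_identities() -> bool:
    """Compare each NAND-only gate with Python's built-in Boolean operators
    on every combination of inputs."""
    return all(
        not_gate(x) == (not x)
        and and_gate(x, y) == (x and y)
        and or_gate(x, y) == (x or y)
        for x, y in product([False, True], repeat=2)
    )
```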
Each wire of the circuit is assigned a unique symbol a. For each such symbol they introduce three related characters: a, its complement ā, and a marker x_a. Multiple copies of x_a are denoted x_a^j. The complement symbols play opposite roles in the reduction, while the x_a markers are used to enforce palindromic structure. Two additional separator symbols $ and # are employed as unique delimiters that never appear elsewhere in the construction.
The reduction proceeds inductively. Starting from an empty circuit C₀, the authors build a sequence of sub-circuits C₁,…,C_t, each obtained from the previous one by one of three elementary operations: (1) adding a new wire, (2) splitting an existing wire into two, or (3) connecting two wires to a new NAND gate. Correspondingly, they construct strings S₁,…,S_t such that S_i “represents” C_i. A string S_i represents a circuit if every truth assignment to the circuit's inputs is encoded by some diverse palindromic factorization of S_i, and conversely every diverse palindromic factorization of S_i encodes some assignment.
The key technical device is the notion of a “complete factor”: a factor that appears exactly once in the factorization and is not a proper substring of any other factor. For a wire a, the presence of the complete factors (a, x_a, x_a ā) signals that the wire's logical value is true, whereas (ā, x_a, x_a a) signals false. The construction guarantees that in any diverse palindromic factorization exactly one of these two triples can appear as complete factors, thereby faithfully representing the Boolean value of the wire.
For each elementary circuit operation the authors describe how to extend the current string:
- Adding a wire: they append the block “$ # x_a a x_a ā” (or its complement) to encode the two possible truth values of the new wire.
- Splitting a wire: they insert a more elaborate pattern involving new symbols b′, c′ (the two split wires) together with several copies of x_a, ensuring that a factorization must choose either the “b true, c false” or the opposite configuration, mirroring the logical split.
- Adding a NAND gate: they embed a pattern that forces the output wire's complete factor to be consistent with the NAND of its two input wires' values.
Each step preserves the property that any diverse palindromic factorization of the resulting string corresponds bijectively to a truth assignment of the extended circuit. Lemma 3 formalizes the ability to perform each step in constant time relative to the current string length, yielding an overall linear-time reduction from Circuit-SAT to the string problem.
Finally, to obtain a string whose factorization exists iff the original circuit's output is true, the authors append “$ # x_o a_o x_o ā_o” where a_o is the output wire's label. Since $ and # are unique separators, they must appear as complete factors in any factorization, forcing the final block to be interpreted exactly as described. Consequently, the constructed string has a diverse palindromic factorization if and only if the circuit is satisfiable.
The paper also extends the result in two directions:
- k-diverse factorization: for any fixed integer k ≥ 1, deciding whether a string admits a factorization where each palindrome appears at most k times remains NP-complete. The reduction is adapted by limiting the multiplicity of the x_a markers.
- Binary alphabet restriction: the authors show that the problem stays NP-complete even when the input alphabet is limited to two symbols (0/1). This is achieved by encoding each of the previously used symbols with short binary strings while preserving the palindrome and uniqueness constraints.
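For the k-diverse variant in the list above, the certificate check changes only in the multiplicity condition. The following is a minimal sketch under that reading of the definition; the function name is hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import List


def is_k_diverse_palindromic_factorization(s: str, factors: List[str], k: int) -> bool:
    """Verify a certificate for the k-diverse variant: the factors must
    concatenate to s, each must be a non-empty palindrome, and each distinct
    factor may occur at most k times. With k = 1 this is the diverse case."""
    if k < 1:
        raise ValueError("k must be at least 1")
    counts = Counter(factors)
    return (
        "".join(factors) == s
        and all(f and f == f[::-1] for f in factors)
        and all(c <= k for c in counts.values())
    )
```

For instance, the factorization "a", "b", "c", "a" of "abca" is rejected for k = 1 but accepted for k = 2, since "a" occurs twice.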
In summary, the authors provide a clean, linear-time many-one reduction from Boolean circuit satisfiability to the diverse palindromic factorization problem, establishing its NP-completeness. The work highlights that seemingly simple string-decomposition constraints can encode full logical reasoning, and it opens avenues for further research on approximation algorithms, parameterized complexity (e.g., bounded alphabet size, bounded k), and practical implications for string compression and data indexing where uniqueness and palindromicity might be desirable.