Super-d-complexity of finite words
In this paper we introduce and study a new complexity measure for finite words. For a positive integer $d$, special scattered subwords, called super-$d$-subwords, in which the gaps are of length at least $(d-1)$, are defined. We give methods to compute the super-$d$-complexity (the total number of different super-$d$-subwords) in the case of rainbow words (with pairwise different letters) by recursive algorithms, by mathematical formulas, and by graph algorithms. In the case of general words, with letters from a given alphabet without any restriction, the problem of the maximum value of the super-$d$-complexity of all words of length $n$ is presented.
💡 Research Summary
The paper introduces a novel complexity measure for finite strings called super‑d‑complexity. A super‑d‑subword of a word u = x₁x₂…xₙ is a scattered subsequence v = x_{i₁}x_{i₂}…x_{i_s} such that consecutive indices satisfy i_{j+1} − i_j ≥ d (d ≥ 1), i.e. at least d−1 letters of u are skipped between two chosen letters. For d = 1 every scattered subword qualifies, so the measure reduces to the ordinary scattered subword complexity; for d = 2 no two chosen letters may be adjacent in u. The super‑d‑complexity of a word is the total number of distinct super‑d‑subwords it contains.
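The definition can be checked directly on small words. The following is a minimal brute-force sketch, not the authors' method; the function names `super_d_subwords` and `super_d_complexity` are illustrative assumptions. It tests every index set, so it is exponential in the word length and only suitable for short inputs.

```python
from itertools import combinations


def super_d_subwords(word: str, d: int) -> set[str]:
    """Return all distinct super-d-subwords of `word` by brute force.

    An index tuple is accepted when every pair of consecutive
    indices differs by at least d.
    """
    n = len(word)
    found = set()
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            if all(b - a >= d for a, b in zip(idx, idx[1:])):
                found.add("".join(word[i] for i in idx))
    return found


def super_d_complexity(word: str, d: int) -> int:
    """Number of distinct super-d-subwords of `word`."""
    return len(super_d_subwords(word, d))


if __name__ == "__main__":
    print(sorted(super_d_subwords("abcd", 2)))
    print(super_d_complexity("abcd", 2))
```

For the rainbow word `abcd` and d = 2, the accepted index sets are the four single positions plus the pairs (1,3), (1,4), and (2,4).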
The authors first focus on rainbow words (all letters distinct). In this case the complexity depends only on the length n and the parameter d, and is denoted S(n,d). Let b_{n,d}(i) be the number of super‑d‑subwords that start at position i. By a simple combinatorial argument they obtain the recurrence
b_{n,d}(i) = 1 + Σ_{k=i+d}^{n} b_{n,d}(k) (for n > d, 1 ≤ i ≤ n−d), with b_{n,d}(i) = 1 for n−d < i ≤ n,
and the total complexity
S(n,d) = Σ_{i=1}^{n} b_{n,d}(i).
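The recurrence can be evaluated from right to left with a running suffix sum. This is a minimal sketch under the assumption that positions are 1-based as in the text; `starts_at` and `rainbow_complexity` are hypothetical helper names.

```python
def starts_at(n: int, d: int) -> list[int]:
    """Return [b(1), ..., b(n)] for a rainbow word of length n.

    b(i) = 1 + sum of b(k) for k = i+d .. n, where the empty sum
    (i + d > n) leaves b(i) = 1.
    """
    b = [0] * (n + 2)        # 1-based; b[n+1] is an unused sentinel
    suffix = [0] * (n + 2)   # suffix[i] = b[i] + b[i+1] + ... + b[n]
    for i in range(n, 0, -1):
        tail = suffix[i + d] if i + d <= n else 0
        b[i] = 1 + tail
        suffix[i] = suffix[i + 1] + b[i]
    return b[1:n + 1]


def rainbow_complexity(n: int, d: int) -> int:
    """S(n, d): sum of b(i) over all starting positions."""
    return sum(starts_at(n, d))


if __name__ == "__main__":
    print(starts_at(5, 2))
    print(rainbow_complexity(5, 2))
```

The suffix sum keeps the whole computation linear in n, whereas evaluating each sum separately would be quadratic.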
Setting M_{n,d}=b_{n,d}(1) yields the d‑middle sequence defined by
M_{n,d}=M_{n−1,d}+M_{n−d,d} (n ≥ d ≥ 2),
with initial values M_{0,d}=0, M_{1,d}=…=M_{d−1,d}=1. For d = 2 this sequence coincides with the Fibonacci numbers, giving the elegant closed form
S(n,2)=F_{n+2}−1.
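As a sanity check on the Fibonacci identity, the d‑middle sequence can be tabulated and summed. The sketch below assumes d ≥ 2, as in the stated recurrence, and uses the fact that b_{n,d}(i) = M_{n−i+1,d}, so that S(n,d) = M_{1,d} + … + M_{n,d}. The function names are illustrative.

```python
def middle_sequence(n_max: int, d: int) -> list[int]:
    """Return [M_0, ..., M_{n_max}] for d >= 2.

    M_0 = 0, M_1 = ... = M_{d-1} = 1, and M_n = M_{n-1} + M_{n-d}
    for n >= d.
    """
    if d < 2:
        raise ValueError("this sketch assumes d >= 2")
    M = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        M[n] = 1 if n < d else M[n - 1] + M[n - d]
    return M


def rainbow_complexity_from_middle(n: int, d: int) -> int:
    """S(n, d) as the sum M_1 + ... + M_n."""
    return sum(middle_sequence(n, d)[1:])


if __name__ == "__main__":
    fib = middle_sequence(12, 2)   # Fibonacci numbers F_0 .. F_12
    for n in range(1, 11):
        # S(n, 2) should equal F_{n+2} - 1
        assert rainbow_complexity_from_middle(n, 2) == fib[n + 2] - 1
    print(middle_sequence(9, 3))
```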
For arbitrary d the authors derive a generating function
M_d(z)=z/(1−z−z^{d})
and from it obtain a combinatorial expression
S(n,d)= Σ_{k≥0} \binom{n-(d−1)k}{k+1}.
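Thus the complexity is the sum, over all k ≥ 0, of the number of ways to choose k+1 positions whose consecutive indices differ by at least d; the sum is finite because the binomial coefficient vanishes once n < dk + 1.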
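The binomial sum is easy to evaluate directly. The following sketch uses Python's `math.comb` and stops as soon as the terms become zero; the function name `rainbow_complexity_closed` is an assumption made for illustration.

```python
from math import comb


def rainbow_complexity_closed(n: int, d: int) -> int:
    """S(n, d) as the sum over k >= 0 of C(n - (d-1)k, k+1).

    The k-th term counts the ways to pick k+1 positions out of n
    with consecutive indices at least d apart; it is zero once
    n - (d-1)k < k+1, so the loop stops there.
    """
    total = 0
    k = 0
    while n - (d - 1) * k >= k + 1:
        total += comb(n - (d - 1) * k, k + 1)
        k += 1
    return total


if __name__ == "__main__":
    # n = 5, d = 2: C(5,1) + C(4,2) + C(3,3) = 5 + 6 + 1
    print(rainbow_complexity_closed(5, 2))
```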
A second, graph‑theoretic approach models the word as a directed acyclic graph G=(V,E) where V={1,…,n} and an edge (i→j) exists iff j−i ≥ d. The adjacency matrix A has a_{ij}=1 when the edge exists, 0 otherwise. The (i,j) entry of A^{k} counts the directed walks of length k from i to j; because G has no cycles, every walk is a path and A^{k+1}=0 for some k. Summing A, A², …, A^{k} therefore yields a matrix R whose entries count all paths of length at least 1, each of which corresponds to a super‑d‑subword with two or more letters. Adding the n single‑letter subwords (equivalently, including the identity matrix I in the sum) gives the super‑d‑complexity
S(n,d)= n + Σ_{i,j} r_{ij}.
The authors adapt the classic Warshall algorithm to compute R in O(n³) time. Moreover, by storing sets of actual subwords in the matrix entries (instead of just counts) the same framework can enumerate all distinct super‑d‑subwords.
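A Warshall-style triple loop can count paths in this acyclic graph in O(n³) time. The sketch below is written in the spirit of the adapted algorithm, not as the authors' exact code, and the function names are hypothetical. It counts paths of length at least 1 and then adds the n single-letter subwords explicitly, since those correspond to paths of length 0.

```python
def count_paths_warshall(n: int, d: int) -> list[list[int]]:
    """Return W where W[i][j] is the number of directed paths i -> j.

    Vertices are 0-based positions; an edge i -> j exists iff
    j - i >= d. After processing intermediate vertex k, W[i][j]
    counts the paths whose inner vertices all lie in {0, ..., k}.
    In-place updates are safe because W[k][k] stays 0 in an
    acyclic graph.
    """
    W = [[1 if j - i >= d else 0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if W[i][k] == 0:
                continue
            for j in range(n):
                W[i][j] += W[i][k] * W[k][j]
    return W


def rainbow_complexity_graph(n: int, d: int) -> int:
    """S(n, d): n single letters plus all paths of length >= 1."""
    W = count_paths_warshall(n, d)
    return n + sum(map(sum, W))


if __name__ == "__main__":
    print(rainbow_complexity_graph(5, 2))
```

For n = 5 and d = 2 the graph has six edges and one path of length 2 (positions 1→3→5), which together with the five single letters gives 12.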
The paper then extends the discussion to general words over an alphabet Σ of size m. For any word w, the super‑d‑complexity satisfies
|w|_d ≤ S_w(d) ≤ S(|w|,d),
where |w|_d denotes the super‑d‑complexity of the trivial word a…a of length |w|, which has exactly one super‑d‑subword of each attainable length, ⌈|w|/d⌉ in total. The lower bound is attained by a constant word, the upper bound by a rainbow word. Defining
f(m,n,d)=max_{w∈Σ^{n}} S_w(d),
the authors give exact values for binary alphabets in several regimes:
- f(2,n,n−1)=3 for n≥3,
- f(2,n,n−2)=5 for n≥4,
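- if n/2 ≤ d ≤ n−3 then f(2,n,d)=6 for n≥6 (no three positions can be pairwise at least d apart, so only the 2 single letters and the 4 two-letter words can occur),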
and provide a table of values for small n and d. These results illustrate how the alphabet size and the distance parameter d constrain the attainable complexity.
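Small values of f(m,n,d) can be reproduced by exhaustive search. The sketch below is a brute-force check under the assumption that the alphabet is {0, …, m−1}; it is not the authors' method, and `max_complexity` is an illustrative name. Its cost grows like m^n · 2^n, so it is only practical for short words.

```python
from itertools import combinations, product


def super_d_complexity(word: tuple[int, ...], d: int) -> int:
    """Count distinct super-d-subwords of `word` by brute force."""
    n = len(word)
    seen = set()
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            if all(b - a >= d for a, b in zip(idx, idx[1:])):
                seen.add(tuple(word[i] for i in idx))
    return len(seen)


def max_complexity(m: int, n: int, d: int) -> int:
    """f(m, n, d): maximum super-d-complexity over all words of
    length n on an m-letter alphabet."""
    return max(
        super_d_complexity(w, d) for w in product(range(m), repeat=n)
    )


if __name__ == "__main__":
    print(max_complexity(2, 4, 3))  # case d = n - 1
    print(max_complexity(2, 5, 3))  # case d = n - 2
    print(max_complexity(2, 6, 3))  # case d = n - 3
```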
In the conclusion, the authors emphasize that super‑d‑complexity can be computed efficiently by three independent methods (recursive, closed‑form, graph‑based), each offering different insights. The graph method is particularly versatile because it can be extended to list all subwords, not merely count them. They also note that determining the maximal super‑d‑complexity for arbitrary alphabets remains an open problem, inviting further combinatorial and algorithmic investigation.