Digraph Complexity Measures and Applications in Formal Language Theory


We investigate structural complexity measures on digraphs, in particular the cycle rank. This concept is intimately related to a classical topic in formal language theory, namely the star height of regular languages. We explore this connection, and obtain several new algorithmic insights regarding both cycle rank and star height. Among other results, we show that computing the cycle rank is NP-complete, even for sparse digraphs of maximum outdegree 2. Notwithstanding, we provide both a polynomial-time approximation algorithm and an exponential-time exact algorithm for this problem. The former algorithm yields an O((log n)^(3/2))-approximation in polynomial time, whereas the latter yields the optimum solution, and runs in time and space O*(1.9129^n) on digraphs of maximum outdegree at most two. Regarding the star height problem, we identify a subclass of the regular languages for which we can precisely determine the computational complexity of the star height problem. Namely, the star height problem for bideterministic languages is NP-complete, and this holds already for binary alphabets. Then we translate the algorithmic results concerning cycle rank to the bideterministic star height problem, thus giving a polynomial-time approximation as well as a reasonably fast exact exponential algorithm for bideterministic star height.

💡 Research Summary

This paper investigates structural complexity measures on directed graphs (digraphs), focusing on the cycle rank, and establishes a deep connection between this graph-theoretic notion and the star height of regular languages—a fundamental concept in formal language theory. The authors begin by recalling basic definitions of digraphs, strong connectivity, and the cycle rank, which is defined recursively: an acyclic digraph has rank 0; a strongly connected digraph with at least one edge has rank 1 + min r(G − v) over all vertices v; and a non‑strongly‑connected digraph’s rank is the maximum rank among its strongly connected components. They show that this definition is equivalent to the height of a directed elimination forest (or tree), a concept previously studied in sparse matrix factorization.
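The recursive definition above can be turned into code almost literally. The following is a minimal sketch for small digraphs, not the paper's algorithm: the adjacency-set representation and the helper names `reachable`, `sccs`, `induced`, and `cycle_rank` are assumptions made for illustration, and the running time is exponential in the worst case.

```python
from typing import Dict, Hashable, List, Set

# Assumed representation: every vertex is a key, mapped to its set of successors.
Graph = Dict[Hashable, Set[Hashable]]


def reachable(g: Graph, start: Hashable) -> Set[Hashable]:
    """Vertices reachable from `start` (including `start`) by iterative DFS."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in g[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def sccs(g: Graph) -> List[Set[Hashable]]:
    """Strongly connected components via mutual reachability (simple, not linear time)."""
    reach = {v: reachable(g, v) for v in g}
    comps: List[Set[Hashable]] = []
    assigned: Set[Hashable] = set()
    for v in g:
        if v not in assigned:
            comp = {w for w in reach[v] if v in reach[w]}
            comps.append(comp)
            assigned |= comp
    return comps


def induced(g: Graph, keep: Set[Hashable]) -> Graph:
    """Subgraph induced by the vertex set `keep`."""
    return {v: g[v] & keep for v in keep}


def cycle_rank(g: Graph) -> int:
    """Cycle rank following the recursive definition directly."""
    best = 0  # an acyclic digraph (or the empty digraph) has rank 0
    for comp in sccs(g):
        sub = induced(g, comp)
        if not any(sub.values()):
            continue  # single vertex without a loop contributes rank 0
        # Strongly connected with at least one edge: 1 + min over removed vertices.
        rank = 1 + min(cycle_rank(induced(sub, comp - {v})) for v in comp)
        best = max(best, rank)  # rank of the whole digraph is the max over components
    return best


if __name__ == "__main__":
    print(cycle_rank({1: {2}, 2: {3}, 3: {1}}))  # directed 3-cycle
```

A directed cycle has rank 1 because deleting any single vertex leaves an acyclic path, while the complete symmetric digraph on three vertices has rank 2 because one deletion still leaves a 2-cycle.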

The paper then compares cycle rank with two other digraph complexity measures: the weak separator number and directed pathwidth. By constructing weak balanced separators and applying a recurrence relation R_k(n) = k + R_k(⌈(n − k)/2⌉), they prove that for a loop‑free digraph with weak separator number k, the cycle rank satisfies r(G) ≤ k·log₂(n/k) − 1. Moreover, they establish the chain of inequalities k ≤ dpw(G) ≤ r(G) ≤ k·log₂(n/k) − 1, showing that directed pathwidth sits between weak separator number and cycle rank, and that cycle rank exceeds the weak separator number by at most a logarithmic factor. Consequently, cycle rank also bounds other measures such as DAG‑width and Kelly‑width.

The computational complexity of determining the cycle rank is addressed next. The decision problem CYCLE RANK (given a digraph G and integer k, decide whether r(G) ≤ k) is proved NP‑complete, even when the input is restricted to sparse digraphs of maximum outdegree 2. This extends earlier NP‑completeness results that applied only to symmetric digraphs (that is, undirected graphs) of unbounded degree.

On the algorithmic side, two main contributions are presented. First, a polynomial‑time approximation algorithm achieves an approximation factor of O((log n)^(3/2)). The algorithm recursively finds weak balanced separators, solves the sub‑instances, and combines the results, yielding a polylogarithmic approximation guarantee. Second, an exact exponential‑time algorithm based on dynamic programming over vertex subsets is given. For general digraphs the running time is O*(2^n), but for digraphs with maximum outdegree 2 the authors exploit structural properties to obtain a faster bound of O*(1.9129^n) in both time and space. The same exponential algorithm also solves the directed feedback vertex set problem on outdegree‑2 digraphs within the same bound.
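The idea of dynamic programming over vertex subsets can be sketched as follows. This is a plain memoized recursion over bitmasks that touches each of the 2^n vertex subsets at most once. It is an illustrative assumption about how such a table could be organised, not the authors' algorithm, and it does not include the refinements behind the O*(1.9129^n) bound. The names `cycle_rank_subset_dp`, `bits`, and `closure` are hypothetical.

```python
from functools import lru_cache
from typing import Iterable, Iterator, List, Tuple


def bits(mask: int) -> Iterator[int]:
    """Yield the indices of the set bits of `mask`."""
    while mask:
        low = mask & -mask
        yield low.bit_length() - 1
        mask ^= low


def cycle_rank_subset_dp(n: int, edges: Iterable[Tuple[int, int]]) -> int:
    """Cycle rank of a digraph on vertices 0..n-1, memoized over vertex subsets."""
    succ: List[int] = [0] * n  # succ[u] = bitmask of successors of u
    pred: List[int] = [0] * n  # pred[v] = bitmask of predecessors of v
    for u, v in edges:
        succ[u] |= 1 << v
        pred[v] |= 1 << u

    def closure(adj: List[int], start: int, mask: int) -> int:
        """Vertices of `mask` reachable from `start` along `adj`, staying inside `mask`."""
        seen = frontier = 1 << start
        while frontier:
            u = frontier.bit_length() - 1
            frontier &= ~(1 << u)
            new = adj[u] & mask & ~seen
            seen |= new
            frontier |= new
        return seen

    @lru_cache(maxsize=None)
    def rank(mask: int) -> int:
        best = 0
        rest = mask
        while rest:
            v = rest.bit_length() - 1
            # Strongly connected component of v inside the induced subgraph on `mask`.
            comp = closure(succ, v, mask) & closure(pred, v, mask)
            rest &= ~comp
            if any(succ[u] & comp for u in bits(comp)):
                # Component contains an edge: 1 + best vertex to delete.
                sub = 1 + min(rank(comp & ~(1 << u)) for u in bits(comp))
                best = max(best, sub)
        return best

    return rank((1 << n) - 1)


if __name__ == "__main__":
    print(cycle_rank_subset_dp(3, [(0, 1), (1, 2), (2, 0)]))
```

Memoizing on the bitmask is what keeps the number of distinct subproblems at 2^n, which matches the O*(2^n) bound quoted for general digraphs; the table itself is also why the space requirement is exponential.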

The paper then turns to formal language theory. The star height h(L) of a regular language L is the minimum nesting depth of Kleene stars needed in any regular expression describing L. Eggan raised the decidability of star height in the 1960s, and Hashiguchi later proved the problem decidable; the best known general algorithm, due to Kirsten, still requires doubly exponential space, so practical computation remains elusive. The authors focus on the subclass of bideterministic regular languages, that is, languages recognized by finite automata that are deterministic both forward and backward and have a single final state. They prove that the star‑height problem for bideterministic languages is NP‑complete, even over binary alphabets. Crucially, by a classical result of McNaughton, the star height of a bideterministic language equals the cycle rank of the transition graph of its minimal bideterministic automaton. Consequently, the approximation and exact algorithms developed for cycle rank translate directly to the bideterministic star‑height problem, yielding an O((log n)^(3/2))-approximation in general and an O*(1.9129^n) exact algorithm for automata over binary alphabets, whose transition graphs have outdegree at most two.
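To make the notion of bideterminism concrete, the sketch below checks the three conditions named above on an automaton given as a collection of labelled transitions and extracts its unlabelled transition graph, the digraph whose cycle rank is of interest. The representation and the names `is_bideterministic` and `transition_graph` are assumptions made for this illustration, and the check does not test whether the automaton is trim or minimal.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Transition = Tuple[Hashable, str, Hashable]  # (source state, letter, target state)


def is_bideterministic(transitions: Iterable[Transition],
                       initial: Hashable,
                       finals: Set[Hashable]) -> bool:
    """Forward deterministic, backward deterministic, and exactly one final state.

    A single initial state is implied by passing `initial` as one value.
    """
    if len(finals) != 1:
        return False
    forward: Set[Tuple[Hashable, str]] = set()   # (source, letter) pairs already used
    backward: Set[Tuple[Hashable, str]] = set()  # (target, letter) pairs already used
    for p, a, q in set(transitions):  # ignore exact duplicates
        if (p, a) in forward or (q, a) in backward:
            return False
        forward.add((p, a))
        backward.add((q, a))
    return True


def transition_graph(transitions: Iterable[Transition]) -> Dict[Hashable, Set[Hashable]]:
    """Drop the labels: map each state to the set of its successor states."""
    graph: Dict[Hashable, Set[Hashable]] = {}
    for p, _, q in transitions:
        graph.setdefault(p, set()).add(q)
        graph.setdefault(q, set())
    return graph


if __name__ == "__main__":
    # Automaton for (ab)*: state 0 is both initial and final.
    ab_star = [(0, "a", 1), (1, "b", 0)]
    print(is_bideterministic(ab_star, initial=0, finals={0}))
    print(transition_graph(ab_star))
```

For the automaton of (ab)*, the transition graph is a directed 2-cycle, which has cycle rank 1, in agreement with the single star needed in the expression.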

The final section outlines open problems and future research directions. The authors note that while their exponential algorithm is optimal up to the base of the exponent for outdegree‑2 graphs, improving the approximation factor or extending the exact algorithm to higher outdegrees remains open. They also suggest investigating tighter relationships between cycle rank and other digraph width measures (e.g., DAG‑width, Kelly‑width) and exploring whether similar graph‑language correspondences exist for broader subclasses of regular languages.

In summary, the paper makes three major contributions: (1) it establishes structural bounds linking cycle rank, weak separator number, and directed pathwidth; (2) it settles the computational complexity of cycle rank, providing both a polynomial‑time O((log n)^(3/2))-approximation and a fast exact exponential algorithm for bounded outdegree graphs; and (3) it leverages these graph‑theoretic results to resolve the complexity of the star‑height problem for bideterministic regular languages, delivering analogous approximation and exact algorithms. These results deepen the interplay between graph theory and formal language theory and open avenues for further algorithmic advances in both domains.
