Preservation of normality by unambiguous transducers
We consider finite-state transducers that are non-deterministic but unambiguous, with infinite inputs and infinite outputs, and we consider the property of Borel normality of sequences of symbols. When these transducers are strongly connected, and when the input is a Borel normal sequence, the output is a sequence in which every block has a frequency given by a weighted automaton over the rationals. We provide an algorithm that decides in cubic time whether an unambiguous transducer preserves normality.
💡 Research Summary
The paper investigates the preservation of Borel normality under transformations realized by finite‑state transducers that are nondeterministic yet unambiguous. A sequence is Borel normal if every block of symbols of a given length occurs with the same limiting frequency; this notion captures the weakest form of randomness. While previous work showed that normality preservation can be decided in polynomial time for input‑deterministic transducers, the present study extends the analysis to the broader class of unambiguous transducers, i.e., machines where each input word labels at most one accepting run.
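To make the definition of normality concrete, the following minimal sketch measures block frequencies on a finite prefix. It is an illustration only, not part of the paper: the helper names `block_frequencies` and `champernowne_prefix` are hypothetical, and a finite prefix can only approximate the limiting frequencies that the definition refers to. The binary Champernowne sequence (the concatenation of the binary expansions of 1, 2, 3, ...) is used as a standard example of a normal sequence.

```python
from collections import Counter
from fractions import Fraction


def block_frequencies(prefix: str, length: int) -> dict:
    """Empirical frequency of each block of the given length, counted over
    all (overlapping) positions of a finite prefix."""
    positions = len(prefix) - length + 1
    if positions <= 0:
        raise ValueError("prefix is shorter than the block length")
    counts = Counter(prefix[i:i + length] for i in range(positions))
    return {block: Fraction(c, positions) for block, c in counts.items()}


def champernowne_prefix(n_terms: int) -> str:
    """Concatenate the binary expansions of 1..n_terms."""
    return "".join(format(k, "b") for k in range(1, n_terms + 1))


if __name__ == "__main__":
    prefix = champernowne_prefix(5000)
    for block, freq in sorted(block_frequencies(prefix, 2).items()):
        # For a normal binary sequence each value tends to 1/4 in the limit;
        # on a finite prefix the values are only approximately equal.
        print(block, float(freq))
```

Convergence for this sequence is slow, so the printed values should be read as a rough indication rather than a test of normality.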
The authors first formalize the relevant concepts. An unambiguous automaton is a finite‑state machine whose transition relation may be nondeterministic but guarantees a unique accepting run for any accepted word. A transducer augments such an automaton with output labels (words over a possibly different alphabet) on each transition. The input automaton of a transducer inherits the unambiguity property, and the transducer is called strongly connected when its underlying state graph consists of a single strongly connected component (SCC).
The core technical contribution is Theorem 1. It states that for any strongly connected, unambiguous transducer \(T\), there exists a weighted automaton \(A\) over the rationals such that, for every normal input sequence \(x\) in the domain of \(T\) and for every finite output word \(w\), the limiting frequency \(\mathrm{freq}(T(x),w)\) equals the weight \(\mathrm{weight}_A(w)\) computed by \(A\). The weighted automaton assigns a rational weight to each transition; the weight of a run is the product of the weights of its transitions, and the weight of a word is the sum of the weights of all runs that read that word. The construction of \(A\) proceeds by analysing the transition structure of \(T\), building an adjacency matrix that records the normalized number of transitions between states, and solving a linear system that captures the stationary distribution of the underlying Markov chain induced by the unambiguous input automaton. The authors prove that this construction can be carried out in cubic time with respect to the size of \(T\).
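The way a weighted automaton assigns a weight to a word can be sketched as follows. This is a generic illustration of the sum-over-runs definition, not the automaton constructed in the paper: the representation (an initial weight vector, one transition matrix per symbol, a final weight vector) and the two-state example are assumptions chosen for the sketch. Multiplying the matrices along the word computes the sum over all runs of the product of their transition weights.

```python
from fractions import Fraction as F


def word_weight(initial, matrices, final, word):
    """Weight of `word`: initial * M[w1] * ... * M[wn] * final.

    Equivalent to summing, over every run reading `word`, the product of
    the initial weight, the transition weights, and the final weight."""
    vec = list(initial)
    for symbol in word:
        m = matrices[symbol]
        vec = [sum(vec[p] * m[p][q] for p in range(len(vec)))
               for q in range(len(m[0]))]
    return sum(v * f for v, f in zip(vec, final))


# Hypothetical two-state example: each state loops on itself and emits
# '0' and '1' with different weights; the start state is chosen with 1/2.
initial = [F(1, 2), F(1, 2)]
final = [F(1), F(1)]
matrices = {
    "0": [[F(1, 2), F(0)], [F(0), F(1, 4)]],
    "1": [[F(1, 2), F(0)], [F(0), F(3, 4)]],
}

if __name__ == "__main__":
    print(word_weight(initial, matrices, final, "0"))   # 3/8
    print(word_weight(initial, matrices, final, "01"))  # 7/32
```

Exact rational arithmetic via `fractions.Fraction` mirrors the fact that the weights are rationals and avoids floating-point comparisons.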
Theorem 2 builds on Theorem 1 to give an algorithmic decision procedure for normality preservation. Because the weighted automaton \(A\) exactly describes the output frequencies, \(T\) preserves normality if and only if, for every word \(w\) of length \(\ell\), \(\mathrm{weight}_A(w)=|B|^{-\ell}\) (where \(B\) is the output alphabet). In other words, the output distribution must be uniform over all words of the same length. Checking this condition reduces to computing the stationary vector of the adjacency matrix and verifying that the induced word weights match the uniform distribution. This verification also runs in \(O(n^3)\) time, where \(n\) is the number of states of \(T\).
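The uniformity condition can be illustrated with a brute-force check over all words up to a fixed length. This is explicitly not the cubic-time procedure of the paper: it is exponential in the length bound and can only refute uniformity or confirm it up to that bound. The function names and the one-state example automata are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def word_weight(initial, matrices, final, word):
    """Weight of `word` as initial * M[w1] * ... * M[wn] * final."""
    vec = list(initial)
    for symbol in word:
        m = matrices[symbol]
        vec = [sum(vec[p] * m[p][q] for p in range(len(vec)))
               for q in range(len(m[0]))]
    return sum(v * f for v, f in zip(vec, final))


def is_uniform_up_to(initial, matrices, final, alphabet, max_len):
    """True if every word w with 1 <= |w| <= max_len has weight
    |alphabet| ** (-|w|). Brute force, exponential in max_len."""
    for length in range(1, max_len + 1):
        target = F(1, len(alphabet) ** length)
        for letters in product(alphabet, repeat=length):
            if word_weight(initial, matrices, final, letters) != target:
                return False
    return True


# Hypothetical one-state automata over the output alphabet {0, 1}.
uniform = {"0": [[F(1, 2)]], "1": [[F(1, 2)]]}
biased = {"0": [[F(3, 5)]], "1": [[F(2, 5)]]}

if __name__ == "__main__":
    print(is_uniform_up_to([F(1)], uniform, [F(1)], "01", 6))  # True
    print(is_uniform_up_to([F(1)], biased, [F(1)], "01", 6))   # False
```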
A substantial part of the paper is devoted to the underlying linear‑algebraic machinery. The adjacency matrix \(M\) of an unambiguous automaton is defined as \(M_{p,q}= \#\{a \mid p\xrightarrow{a} q\}/|A|\), where \(|A|\) is the size of the input alphabet. For a strongly connected unambiguous automaton, the Perron–Frobenius theorem guarantees that the spectral radius of \(M\) is 1 and that there exists a non‑zero right eigenvector \(\alpha\) with \(M\alpha=\alpha\). The components of \(\alpha\) correspond to the uniform measure of the set of infinite words accepted when each state is taken as the sole initial state. This eigenvector is used to assign rational weights to transitions of the output automaton, ensuring that the weighted sum over all runs reproduces the true limiting frequencies of output blocks.
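As a small worked instance of this machinery, the sketch below builds the adjacency matrix of a toy unambiguous automaton and solves for a right fixed vector by exact Gaussian elimination. The three-state automaton is an assumption made for illustration (it is not taken from the paper): state `A` expects the next input symbol to be 0, state `B` expects the next two symbols to be 11, and state `C` expects 10, so every infinite binary word has exactly one run. Normalising the vector to sum to 1 is a convention of this sketch.

```python
from fractions import Fraction as F


def adjacency_matrix(states, transitions, alphabet_size):
    """M[p][q] = (number of symbols a with a transition p -a-> q) / |alphabet|."""
    index = {s: i for i, s in enumerate(states)}
    m = [[F(0)] * len(states) for _ in states]
    for p, _symbol, q in transitions:
        m[index[p]][index[q]] += F(1, alphabet_size)
    return m


def fixed_right_vector(m):
    """Solve (M - I) x = 0 exactly and return the solution scaled to sum 1.

    Assumes the solution space is one-dimensional, as expected for a
    strongly connected automaton whose matrix has spectral radius 1."""
    n = len(m)
    a = [[m[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    pivots, row = [], 0
    for col in range(n):
        pr = next((r for r in range(row, n) if a[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column: it is a free variable
        a[row], a[pr] = a[pr], a[row]
        piv = a[row][col]
        a[row] = [x / piv for x in a[row]]
        for r in range(n):
            if r != row and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[row])]
        pivots.append(col)
        row += 1
    free = [c for c in range(n) if c not in pivots]
    if len(free) != 1:
        raise ValueError("expected a one-dimensional solution space")
    alpha = [F(0)] * n
    alpha[free[0]] = F(1)
    for r, pc in enumerate(pivots):
        alpha[pc] = -a[r][free[0]]
    total = sum(alpha)
    return [x / total for x in alpha]


states = ["A", "B", "C"]
transitions = [("A", "0", "A"), ("A", "0", "B"), ("A", "0", "C"),
               ("B", "1", "B"), ("B", "1", "C"), ("C", "1", "A")]
M = adjacency_matrix(states, transitions, alphabet_size=2)
alpha = fixed_right_vector(M)  # [1/2, 1/4, 1/4]
```

In this toy case the components 1/2, 1/4, 1/4 agree with the measures of the words starting with 0, with 11, and with 10 respectively, which matches the interpretation of the eigenvector described above.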
The authors illustrate their results with two concrete transducers. The first (Figure 2) is unambiguous but does not preserve normality; the associated weighted automaton yields output frequencies 9/15 for symbol ‘0’ and 6/15 for ‘1’, violating uniformity. The second (Figure 5) is a selector that either copies the current input symbol or outputs the empty word, depending on a simple parity condition on the number of preceding zeros before a ‘1’. Its weighted automaton has weights \(2^{-|w|}\) for any word \(w\), which matches the uniform distribution, confirming that the transducer preserves normality. This example also recovers Agafonov’s classical theorem: any oblivious finite‑state selector applied to a normal sequence yields a normal subsequence.
The paper concludes by emphasizing that the combination of automata theory, rational weighted automata, and spectral analysis yields a robust framework for reasoning about normality under finite‑state transformations. It opens several avenues for future work, such as extending the approach to genuinely nondeterministic (but not unambiguous) transducers, handling infinite alphabets or real‑valued weights, and designing synthesis algorithms that automatically construct normality‑preserving transducers for given specifications.
Overall, the work provides both a deep theoretical insight—showing that output block frequencies are governed by a rational weighted automaton—and a practical algorithmic tool—deciding normality preservation in cubic time—for a significant class of finite‑state transformations.