Single use register automata for data words
Our starting point are register automata for data words, in the style of Kaminski and Francez. We study the effects of the single-use restriction, which says that a register is emptied immediately after being used. We show that under the single-use restriction, the theory of automata for data words becomes much more robust. The main results are: (a) five different machine models are equivalent as language acceptors, including one-way and two-way single-use register automata; (b) one can recover some of the algebraic theory of languages over finite alphabets, including a version of the Krohn-Rhodes Theorem; (c) there is also a robust theory of transducers, with four equivalent models, including two-way single use transducers and a variant of streaming string transducers for data words. These results are in contrast with automata for data words without the single-use restriction, where essentially all models are pairwise non-equivalent.
Research Summary
The paper revisits the theory of automata and transducers over infinite alphabets—so‑called data words—by imposing a “single‑use” restriction on registers. In the classic Kaminski‑Francez model, registers may be read many times, which leads to a proliferation of incomparable models (one‑way vs. two‑way, deterministic vs. nondeterministic, alternating, etc.). By requiring that a register be cleared (set to the undefined value ⊥) immediately after it is used, the authors recover a robust, regular‑language‑like theory for data words.
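The single‑use discipline described above can be pictured with a small sketch. This is only an illustration, not the paper's formal definition: the class name `SingleUseRegister`, its method names, and the use of `None` for the undefined value ⊥ are assumptions made here.

```python
class SingleUseRegister:
    """A register holding one atom, or nothing (None stands for the undefined value)."""

    def __init__(self):
        self._value = None

    def store(self, atom):
        # Storing overwrites the old content without reading it,
        # so it does not count as a use.
        self._value = atom

    def use(self):
        # Reading returns the content and empties the register at once.
        value, self._value = self._value, None
        return value


register = SingleUseRegister()
register.store("a")
first_read = register.use()    # the stored atom
second_read = register.use()   # the register is already empty
```

A second read yields the undefined value, so a machine that wants to compare one atom against many later positions cannot do so with a single stored copy.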
Main contributions for language recognition
- Equivalence of five acceptance models (Theorem 5). The following are shown to recognize exactly the same class of languages over any polynomial orbit‑finite alphabet Σ:
  - Deterministic one‑way single‑use register automata.
  - Deterministic two‑way single‑use register automata.
  - Orbit‑finite monoids equipped with an equivariant morphism and an equivariant accepting set.
  - Rigidly guarded MSO (a fragment of monadic second‑order logic tailored to data words).
  - Orbit‑finite regular list functions (functions from Σ* to {yes,no} that can be expressed by a finite set of equivariant operations).
  The equivalence of the monoid, logic, and list‑function views follows from earlier work (Bojańczyk & Klin, 2016). The new contributions are the automata‑to‑monoid and automata‑to‑list‑function translations, which are effective.
- Algebraic foundations. The authors develop the theory of orbit‑finite monoids, extending classic results such as Green’s relations and Simon’s factorisation forest theorem to the orbit‑finite setting. This provides a structural handle on the languages recognized by single‑use automata.
- Decidability. Emptiness is decidable for both one‑way and two‑way single‑use automata, contrasting sharply with the undecidable emptiness of unrestricted two‑way register automata. The paper leaves open the problem of deciding whether a given multiple‑use one‑way automaton can be simulated by a single‑use automaton.
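To make the deterministic one‑way single‑use model more concrete, here is a minimal sketch of a recognizer for the language "the first letter equals the last letter". The choice of language, the function name, the use of strings as atoms, and the convention that the empty word is rejected are assumptions made for this illustration; they are not taken from the paper.

```python
BOTTOM = None  # stands for the undefined register value


def accepts_first_equals_last(word):
    """One left-to-right pass with two registers, each read exactly once."""
    r1 = BOTTOM  # will hold the first atom
    r2 = BOTTOM  # will hold the most recently seen atom
    for position, atom in enumerate(word):
        if position == 0:   # "am I at the first position" is finite-state control
            r1 = atom       # store: writing is not a use
        r2 = atom           # overwrite: the old content is discarded unread
    if not word:
        return False        # assumed convention for the empty word
    # At the end of the input a single equality test reads both registers,
    # which empties them.
    left, r1 = r1, BOTTOM
    right, r2 = r2, BOTTOM
    return left == right


print(accepts_first_equals_last(["a", "b", "a"]))  # True
print(accepts_first_equals_last(["a", "b"]))       # False
```

The registers are written many times but read only once, at the end, which is what the single‑use restriction permits.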
Main contributions for transducers (string‑to‑string functions)
- Four equivalent transducer models. The authors identify a robust class of functions over data words and prove that it can be described by:
  - Deterministic two‑way single‑use transducers.
  - An atom‑extended version of streaming string transducers (SSTs).
  - Regular list functions with atoms (the functional analogue of the list functions used for languages).
  - Compositions of “prime” two‑way machines, where a prime machine is either a classical Mealy machine (the building block of the Krohn‑Rhodes theorem) or a register machine that merely moves an atom to a later position.
  One‑way single‑use transducers form a strict subclass (for example, they cannot reverse the input). The one‑way and two‑way single‑use models therefore coincide as language acceptors but not as transducers.
- Krohn‑Rhodes theorem for data words. The classic Krohn‑Rhodes decomposition (every finite‑state Mealy machine is a cascade of prime machines) is lifted to the infinite‑alphabet setting. The authors introduce a new prime machine that carries a single atom forward; together with the original prime Mealy machines, these suffice to decompose any deterministic two‑way single‑use transducer.
- Closure and decidability. The function class is closed under composition (via the prime‑machine cascade) and equivalence of two transducers is decidable. This mirrors the well‑behaved algebraic properties of regular string‑to‑string functions over finite alphabets.
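The idea of a cascade that combines a finite‑state stage with a machine that "moves an atom to a later position" can be sketched as follows. This is a loose illustration rather than the paper's construction: the tags `"store"` and `"emit"`, the function names, and the particular alternating tagger are hypothetical choices made here.

```python
BOTTOM = None  # stands for the undefined value


def mealy_tagger(word):
    """Finite-state, letter-to-letter stage with two control states.
    State 0 tags the current atom for storing; state 1 requests the stored atom."""
    state = 0
    output = []
    for atom in word:
        if state == 0:
            output.append(("store", atom))
        else:
            output.append(("emit", None))
        state = 1 - state
    return output


def atom_propagation(tagged_word):
    """One single-use register: 'store' fills it, 'emit' outputs and empties it."""
    register = BOTTOM
    output = []
    for tag, atom in tagged_word:
        if tag == "store":
            register = atom
            output.append(BOTTOM)
        else:
            output.append(register)
            register = BOTTOM  # single use: the content is gone after one output
    return output


def cascade(word):
    # The output of the first stage is the input of the second.
    return atom_propagation(mealy_tagger(word))


print(cascade(["a", "b", "c", "d"]))  # [None, 'a', None, 'c']
```

Each stage is simple on its own; the expressive power comes from composing them, which is the shape of statement a Krohn‑Rhodes style theorem makes.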
Technical approach
- The paper works with polynomial orbit‑finite sets, i.e., sets built from the atom set A using finite products and disjoint unions, and equipped with the natural action of atom automorphisms. Functions must be equivariant, meaning they respect atom renamings and can only test equality of atoms.
- Single‑use transducers are defined by a deterministic transition system that can ask equivariant yes/no questions about the current input symbol or about equality of register contents, perform equivariant actions (store an atom, output a value derived from registers, move the head, accept/reject), and automatically clear any register that participates in a question or an output action.
- The one‑way restriction forbids “move left” actions; automata (language acceptors) are the special case that uses no output actions. The single‑use restriction is enforced by automatically resetting registers after they are read.
- For the automata‑to‑monoid direction, the authors construct a transition monoid whose elements are equivalence classes of configurations modulo atom renaming. The single‑use property guarantees finiteness of the support, yielding an orbit‑finite monoid.
- For the monoid‑to‑list‑function direction, they use the known correspondence between orbit‑finite monoids and regular list functions (via equivariant rational expressions).
- The Krohn‑Rhodes decomposition is achieved by first applying the classical decomposition to the underlying control part (ignoring registers) and then handling the atom‑moving register separately, showing that the register can be simulated by a cascade built from the simple “move‑atom” prime machine.
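The equivariance requirement from the first point of this list can be tested on examples. The sketch below assumes atoms are represented as strings and renamings as dictionaries describing a bijection (identity outside their keys); the helper names are hypothetical. Passing a finite test like this is evidence, not a proof, of equivariance.

```python
def rename(pi, value):
    """Apply an atom renaming to a value built from atoms, tuples, and booleans."""
    if isinstance(value, tuple):
        return tuple(rename(pi, component) for component in value)
    if isinstance(value, str):        # atoms are strings in this sketch
        return pi.get(value, value)
    return value                      # booleans carry no atoms


def copy_and_compare(pair):
    """Uses only equality of atoms, so it commutes with every renaming."""
    x, y = pair
    return (x, y, x == y)


def mentions_constant(pair):
    """Singles out the specific atom "a", so it is not equivariant."""
    x, _ = pair
    return x == "a"


def is_equivariant_on(f, inputs, renamings):
    # Checks f(pi . x) == pi . f(x) for the given samples only.
    return all(f(rename(pi, x)) == rename(pi, f(x))
               for x in inputs for pi in renamings)


samples = [("a", "b"), ("a", "a"), ("b", "c")]
swaps = [{"a": "b", "b": "a"}, {"b": "c", "c": "b"}]
print(is_equivariant_on(copy_and_compare, samples, swaps))   # True
print(is_equivariant_on(mentions_constant, samples, swaps))  # False
```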
Implications and future work
The results demonstrate that the lack of robustness for data‑word automata is not inherent but rather a consequence of allowing unrestricted reuse of registers. By imposing a natural memory discipline (single‑use), the authors recover a theory that mirrors the classical regular language theory: multiple equivalent automaton models, an algebraic characterisation, a logical characterisation, and a well‑behaved transducer theory with closure properties and decidable equivalence.
Potential future directions include:
- Investigating the expressive power of nondeterministic single‑use models (the paper focuses on deterministic ones).
- Exploring complexity bounds for decision problems (emptiness, equivalence) in terms of the number of registers and the size of the orbit‑finite alphabet.
- Extending the framework to data words with additional structure (e.g., ordered data, timestamps) while preserving single‑use robustness.
- Applying the theory to practical stream‑processing systems, where the single‑use discipline aligns with “consume‑once” processing pipelines.
In summary, the paper establishes a solid, algebraically rich foundation for regular‑like languages and functions over infinite alphabets by leveraging a simple yet powerful single‑use restriction on registers. This bridges the gap between finite‑alphabet regular theory and the more challenging world of data words.