Feasible Automata for Two-Variable Logic with Successor on Data Words


We introduce an automata model for data words, that is, words that carry at each position a symbol from a finite alphabet and a value from an unbounded data domain. The model is (semantically) a restriction of data automata, introduced by Bojańczyk et al. in 2006, and is therefore called weak data automata. It is strictly less expressive than data automata, and its expressive power is incomparable with that of register automata. The expressive power of weak data automata corresponds exactly to existential monadic second-order logic with successor +1 and data value equality ∼, EMSO₂(+1,∼). It follows from previous work by David et al. (2010) that the nonemptiness problem for weak data automata can be decided in 2-NEXPTIME. Furthermore, we study weak Büchi automata on data ω-strings. They can be characterized by the extension of EMSO₂(+1,∼) with existential quantifiers for infinite sets. Finally, the same complexity bound for their nonemptiness problem is established by a nondeterministic polynomial-time reduction to the nonemptiness problem of weak data automata.

💡 Research Summary

The paper introduces a new automata model for data words—finite or infinite sequences where each position carries a symbol from a finite alphabet and a value from an infinite data domain. The model, called Weak Data Automata (WDA), is a semantic restriction of the well‑known Data Automata (DA) introduced by Bojańczyk et al. (2006). While a DA consists of a letter‑to‑letter transducer A that rewrites the input profile and a second automaton B that checks properties of each data class, a WDA limits the second stage to three very simple, class‑local constraints: (i) key constraints (key(γ)) forbidding two positions with the same output symbol γ to share a data value, (ii) inclusion constraints (V(γ) ⊆ ⋃_{γ’∈R} V(γ’)) requiring that every data value appearing with γ also appears with some symbol from a set R, and (iii) denial constraints (V(γ) ∩ V(γ’) = ∅) ensuring that two different symbols never share a data value. No inter‑class constraints are allowed. This restriction makes WDA strictly weaker than DA but incomparable with register automata.
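The three constraint types can be made concrete with a small sketch. The Python below is an illustration only, not the paper's construction: it assumes the transducer's output is already given as a list of `(output_symbol, data_value)` pairs, and the function names are hypothetical.

```python
from collections import defaultdict

def values_by_symbol(word):
    """Map each output symbol g to V(g), the set of data values it occurs with."""
    values = defaultdict(set)
    for symbol, datum in word:
        values[symbol].add(datum)
    return values

def satisfies_key(word, gamma):
    """key(gamma): no two positions labelled gamma share a data value."""
    seen = set()
    for symbol, datum in word:
        if symbol == gamma:
            if datum in seen:
                return False
            seen.add(datum)
    return True

def satisfies_inclusion(word, gamma, targets):
    """V(gamma) is contained in the union of V(t) for t in targets."""
    values = values_by_symbol(word)
    covered = set().union(*(values.get(t, set()) for t in targets))
    return values.get(gamma, set()) <= covered

def satisfies_denial(word, gamma, gamma_prime):
    """V(gamma) and V(gamma_prime) are disjoint."""
    values = values_by_symbol(word)
    return values.get(gamma, set()).isdisjoint(values.get(gamma_prime, set()))

# Hypothetical output of the transducer on a four-position data word.
word = [("a", 1), ("b", 1), ("a", 2), ("c", 3)]
print(satisfies_key(word, "a"))               # a occurs with the distinct values 1 and 2
print(satisfies_inclusion(word, "b", {"a"}))  # V(b) = {1} is inside V(a) = {1, 2}
print(satisfies_denial(word, "a", "b"))       # a and b share the value 1
```

Each check only inspects which data values occur with which output symbols, which is the sense in which the second stage of a WDA is much simpler than the class automaton of a DA.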

The authors prove that the expressive power of WDA coincides exactly with existential monadic second‑order logic with successor and data equality, denoted EMSO₂(+1,∼). EMSO₂(+1,∼) is obtained from FO₂(+1,∼) by allowing existential quantification over sets of positions; it does not permit universal set quantifiers or inter‑class relations. By leveraging earlier results (David et al., 2010) showing that the satisfiability of EMSO₂(+1,∼) can be decided in 2‑NEXPTIME, the paper immediately obtains the same upper bound for the non‑emptiness problem of WDA.
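To see why such a correspondence is plausible in one direction, the three WDA constraints can each be written with two variables and the data equality predicate. The formulas below are an illustrative sketch rather than the paper's own encoding; they assume that every output symbol γ is available as a unary predicate (guessed by an existential set quantifier) and that equality of positions may be used.

```latex
\begin{align*}
\mathrm{key}(\gamma) &:\quad
  \forall x\,\forall y\,\bigl((\gamma(x)\wedge\gamma(y)\wedge x\sim y)\rightarrow x=y\bigr)\\
V(\gamma)\subseteq\textstyle\bigcup_{\gamma'\in R}V(\gamma') &:\quad
  \forall x\,\bigl(\gamma(x)\rightarrow\exists y\,(x\sim y\wedge\textstyle\bigvee_{\gamma'\in R}\gamma'(y))\bigr)\\
V(\gamma)\cap V(\gamma')=\emptyset &:\quad
  \forall x\,\forall y\,\bigl((\gamma(x)\wedge\gamma'(y))\rightarrow\neg(x\sim y)\bigr)
\end{align*}
```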

To address infinite behaviours, the paper extends WDA to Weak Büchi Data Automata (WBDA) for ω‑data words. The transducer A is equipped with a Büchi acceptance condition, and the class‑local constraints are interpreted over infinite runs. Logically, WBDA corresponds to EMSO₂(+1,∼) enriched with existential quantifiers that range only over infinite sets (often written EMSO₂⁺∞(+1,∼)). The authors show that the non‑emptiness of WBDA can be reduced nondeterministically in polynomial time to the non‑emptiness of WDA; consequently, WBDA also enjoys a 2‑NEXPTIME upper bound.

A substantial part of the paper is devoted to comparing WDA with other models. Two simple data languages, L_{a<b} (every a‑position is followed later by a b‑position with the same data value) and L_{a* b} (the matching b‑position must be exactly two steps after the a‑position), are shown not to be recognizable by any WDA, illustrating that the three constraints are insufficient for certain ordering requirements. DA, by contrast, can express properties such as “the next position with the same data value exists”, which are not expressible in WDA, so WDA are strictly less expressive than DA; it is with register automata that WDA are incomparable.
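For concreteness, membership in L_{a<b} as described above is easy to test directly on a given data word, even though the summary reports that no WDA recognizes the language. The sketch below assumes a data word is a list of `(symbol, data_value)` pairs; the function name is hypothetical.

```python
def in_L_a_before_b(word):
    """True iff every a-position is followed later by a b-position with the same data value."""
    pending = set()  # data values of a-positions still waiting for a later matching b
    for symbol, datum in word:
        if symbol == "a":
            pending.add(datum)
        elif symbol == "b":
            pending.discard(datum)
    return not pending

print(in_L_a_before_b([("a", 1), ("a", 2), ("b", 2), ("b", 1)]))  # every a is matched later
print(in_L_a_before_b([("b", 1), ("a", 1)]))                      # the b comes too early
```

The check relies on the order of positions within a data class, which is exactly the kind of information the key, inclusion, and denial constraints do not see.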

The technical development includes: (1) formal definitions of data words, profiles, class strings, and letter‑to‑letter transducers; (2) the precise syntax and semantics of key, inclusion, and denial constraints; (3) a proof that disjunctive extensions of these constraints (e.g., key(K) for a set K) can be compiled away into ordinary constraints with only polynomial blow‑up; (4) the logical characterization theorem linking WDA to EMSO₂(+1,∼); (5) the construction of WBDA and the reduction from its non‑emptiness to that of WDA; and (6) a discussion of related work, including connections to linear temporal logic with data, safety fragments, and previous complexity bounds for FO₂(+1,∼).

The paper concludes with open problems: determining exact lower bounds for WDA and WBDA non‑emptiness, exploring determinisation and minimisation procedures, and investigating richer constraint families (e.g., inter‑class inclusions) while preserving decidability. The authors suggest that WDA could serve as a practical foundation for verification tasks in XML processing and infinite‑state model checking, where the trade‑off between expressive power and algorithmic tractability is crucial. Overall, the work provides a clean, well‑motivated automata model that exactly captures a natural fragment of two‑variable logic on data words, and it establishes 2‑NEXPTIME upper bounds for non‑emptiness in both the finite and infinite settings; matching lower bounds remain open.
