Alternating Automata on Data Trees and XPath Satisfiability

A data tree is an unranked ordered tree whose every node is labelled by a letter from a finite alphabet and an element ("datum") from an infinite set, where the latter can only be compared for equality. The article considers alternating automata on d…

Authors: - Marcin Jurdziński (Department of Computer Science, University of Warwick, UK) - Ranko Lazić (Department of Computer Science

MARCIN JURDZIŃSKI and RANKO LAZIĆ
Department of Computer Science, University of Warwick, UK

A data tree is an unranked ordered tree whose every node is labelled by a letter from a finite alphabet and an element ("datum") from an infinite set, where the latter can only be compared for equality. The article considers alternating automata on data trees that can move downward and rightward, and have one register for storing data. The main results are that nonemptiness over finite data trees is decidable but not primitive recursive, and that nonemptiness of safety automata is decidable but not elementary. The proofs use nondeterministic tree automata with faulty counters. Allowing upward moves, leftward moves, or two registers, each causes undecidability. As corollaries, decidability is obtained for two data-sensitive fragments of the XPath query language.

Categories and Subject Descriptors: F.4.1 [Mathematical Logic and Formal Languages]: Formal Languages - Decision problems; F.1.1 [Computation by Abstract Devices]: Models of Computation - Automata; H.2.3 [Database Management]: Languages - Query languages

General Terms: Algorithms, Verification

1. INTRODUCTION

Context. Logics and automata for words and trees over finite alphabets are relatively well-understood. Motivated partly by the search for automated reasoning techniques for XML and the need for formal verification and synthesis of infinite-state systems, there is an active and broad research programme on logics and automata for words and trees which have richer structure.

Initial progress made on reasoning about data words and data trees is summarised in the survey by Segoufin [2006]. A data word is a word over $\Sigma \times D$, where $\Sigma$ is a finite alphabet, and $D$ is an infinite set ("domain") whose elements ("data") can only be compared for equality.
Similarly, a data tree is a tree (countable, unranked and ordered) whose every node is labelled by a pair in $\Sigma \times D$.

First-order logic for data words was considered by Bojańczyk et al. [2006], and related automata were studied further by Björklund and Schwentick [2007]. The logic has variables which range over word positions ($\{0, \ldots, l-1\}$ or $\mathbb{N}$), a unary predicate for each letter from the finite alphabet, and a binary predicate $x \sim y$ which denotes equality of data labels. $\mathrm{FO}^2(+1, <, \sim)$ denotes such a logic with two variables and binary predicates $x + 1 = y$ and $x < y$. Over finite and over infinite data words, satisfiability for $\mathrm{FO}^2(+1, <, \sim)$ was shown decidable and at least as hard as nonemptiness of vector addition automata. Whether the latter problem is elementary has been open for many years. Extending the logic by one more variable causes undecidability.

This article is a revised and extended version of [Jurdziński and Lazić 2007]. The second author was supported by a grant from the Intel Corporation. ACM Transactions on Computational Logic, Vol. V, No. N, Month 20YY, Pages 1-23.

Over data trees, $\mathrm{FO}^2(+1, <, \sim)$ denotes a similar first-order logic with two variables.
The variables range over tree nodes, $+1$ stands for two predicates "child" and "next sibling", and $<$ stands for two predicates "descendant" and "younger sibling". Complexity of satisfiability over finite data trees was studied by Bojańczyk et al. [2009]. For $\mathrm{FO}^2(+1, \sim)$, it was shown to be in 3NExpTime, but for $\mathrm{FO}^2(+1, <, \sim)$, to be at least as hard as nonemptiness of vector addition tree automata. Decidability of the latter is an open question, and it is equivalent to decidability of multiplicative exponential linear logic [de Groote et al. 2004]. However, Björklund and Bojańczyk [2007] showed that $\mathrm{FO}^2(+1, <, \sim)$ over finite data trees of bounded depth is decidable.

XPath [Clark and DeRose 1999] is a prominent query language for XML documents [Bray et al. 1998]. The most basic static analysis problem for XPath, with a variety of applications, is satisfiability in the presence of DTDs. In the two extensive articles on its complexity [Benedikt et al. 2008; Geerts and Fan 2005], the only decidability result that allows negation and data (i.e., equality comparisons between attribute values) does not allow axes which are recursive (such as "self or descendant") or between siblings. By representing XML documents as data trees and translating from XPath to $\mathrm{FO}^2(+1, \sim)$, Bojańczyk et al. [2009] obtained a decidable fragment with negation, data and all nonrecursive axes. Another fragment of XPath was considered by Hallé et al. [2006], but it lacks concatenation, recursive axes and sibling axes. A recent advance of Figueira [2009] shows ExpTime-completeness for full downward XPath, but with restricted DTDs.

An alternative approach to reasoning about data words is based on automata with registers [Kaminski and Francez 1994]. A register is used for storing a datum for later equality comparisons.
Nonemptiness of one-way nondeterministic register automata over finite data words has relatively low complexity: NP-complete [Sakamoto and Ikeda 2000] or PSpace-complete [Demri and Lazić 2009], depending on technical details of their definition. Unfortunately, such automata fail to provide a satisfactory notion of regular language of finite data words, as they are not closed under complement [Kaminski and Francez 1994] and their nonuniversality is undecidable [Neven et al. 2004]. To overcome those limitations, one-way alternating automata with 1 register were proposed by Demri and Lazić [2009]: they are closed under Boolean operations, their nonemptiness over finite data words is decidable, and future-time fragments of temporal logics such as LTL or the modal $\mu$-calculus extended by 1 register are easily translatable to such automata. However, the nonemptiness problem over finite data words turned out to be not primitive recursive. Moreover, already with weak acceptance [Muller et al. 1986] and thus also with Büchi or co-Büchi acceptance, nonemptiness over infinite data words is undecidable (more precisely, co-r.e.-hard). When the automata are restricted to those which recognise safety properties [Alpern and Schneider 1987] over infinite data words, nonemptiness was shown to be ExpSpace-complete, and inclusion to be decidable but not primitive recursive [Lazić 2006].

Contribution. This article addresses one of the research directions proposed by Segoufin [2006]: investigating modal logics and automata with registers on data trees.
Nondeterministic automata with registers which can be nondeterministically reassigned on finite binary data trees were recently studied by Kaminski and Tan [2008]: top-down and bottom-up variants recognise the same languages, and nonemptiness is decidable. However, they inherit the drawbacks of one-way nondeterministic register automata on data words: lack of closure under complement and undecidability of nonuniversality.

We consider alternating automata that have 1 register and are forward, i.e., can move downward and rightward over tree nodes: for short, ATRA$_1$. They are closed under Boolean operations, and we show that their nonemptiness over finite data trees is decidable. Moreover, forward fragments of CTL and the modal $\mu$-calculus with 1 register are easily translatable to ATRA$_1$ [Jurdziński and Lazić 2007]. The expressiveness of ATRA$_1$ is incomparable to those of $\mathrm{FO}^2(+1, \sim)$ and the automata of Kaminski and Tan [2008]: for example, the latter two formalisms but not ATRA$_1$ can check whether some two leaves have equal data, and the opposite is true of checking whether each node's datum is fresh, i.e., does not appear at any ancestor node. By lower-bound results for register automata on data words in [Neven et al. 2004; David 2004; Demri and Lazić 2009], we have that ATRA$_1$ nonemptiness is not primitive recursive, and that it becomes undecidable (more precisely, r.e.-hard) if any of the following is added: upward moves, leftward moves, or one more register.

Motivated partly by applications to XML streams (cf., e.g., [Olteanu et al. 2004]), we consider both finite and countably infinite data trees, where horizontal as well as vertical infinity is allowed. For ATRA$_1$ with the weak acceptance mechanism, the undecidability result over infinite data words [Demri and Lazić 2009] carries over.
However, we show that, for safety ATRA$_1$, which are closed under intersection and union but not complement, inclusion is decidable and not primitive recursive. When a data tree is rejected by an automaton with the safety acceptance mechanism, there exists an initial segment whose every extension is rejected. We also obtain that nonemptiness of safety ATRA$_1$ is not elementary. The latter is the most surprising result in the article: it means that the techniques in the proof that nonemptiness over infinite data words of safety one-way alternating automata with 1 register is in ExpSpace cannot be lifted to trees to obtain a 2ExpTime upper bound.

The proofs of decidability involve translating from ATRA$_1$ to forward nondeterministic tree automata with faulty counters. The counters are faulty in the sense that they are subject to incrementing errors, i.e., can spontaneously increase at any time. That makes the transition relations downwards compatible with a well-quasi-ordering (cf. [Finkel and Schnoebelen 2001]), which leads to lower complexities of some verification problems than with error-free counters.

We define forward XPath to be the largest downward and rightward fragment in which, whenever two attribute values are compared for equality, one of them must be at the current node. By translating from forward XPath to ATRA$_1$, we obtain decidability of satisfiability over finite documents and decidability of satisfiability for a safety subfragment, both in the presence of DTDs. In contrast to the decidable fragments of XPath mentioned previously, forward XPath has sibling axes, recursive axes, concatenation, negation, and data comparisons.

2. PRELIMINARIES

After fixing notations for trees and data trees, we define two kinds of forward automata and look at some of their basic properties: alternating automata with 1 register on data trees, and nondeterministic automata with counters with incrementing errors on trees.

2.1 Trees and Data Trees

For technical simplicity, we shall work with binary trees instead of unranked ordered trees. Firstly, as e.g. Björklund and Bojańczyk [2007], we adopt the insignificant generalisation of considering unranked ordered forests, in which the roots are regarded as siblings with no parent. Secondly, the following is a standard and trivial one-to-one correspondence between unranked ordered forests $t$ and binary trees $\mathrm{bt}(t)$: the nodes of $\mathrm{bt}(t)$ are the same as the nodes of $t$, and the children of each node $n$ in $\mathrm{bt}(t)$ are the first child and next sibling of $n$ in $t$. The correspondence works for finite as well as infinite unranked ordered forests. In the latter, there may be infinite (of type $\omega$) branches or siblinghoods or both.

Without loss of generality, each node will either have both children or be a leaf, only nonleaf nodes will be labelled, and the root node will be nonleaf. Formally, a tree is a tuple $\langle N, \Sigma, \Lambda \rangle$, where:

- $N$ is a prefix-closed subset of $\{0, 1\}^*$ such that $|N| > 1$ and, for each $n \in N$, either $n \cdot 0 \in N$ and $n \cdot 1 \in N$, or $n \cdot 0 \notin N$ and $n \cdot 1 \notin N$;
- $\Sigma$ is a finite alphabet;
- $\Lambda$ is a mapping from the nonleaf elements of $N$ to $\Sigma$.

A data tree is a tree as above together with a mapping $\Delta$ from the nonleaf nodes to a fixed infinite set $D$. For a data tree $\tau$, let $\mathrm{tree}(\tau)$ denote the underlying tree. For a data tree $\tau$ and $l > 0$, let the $l$-prefix of $\tau$ be the data tree obtained by restricting $\tau$ to nodes of length at most $l$.
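As an illustration of the correspondence $\mathrm{bt}(t)$, the following minimal Python sketch encodes a finite, nonempty unranked ordered forest as a binary data tree. The representation is an assumption made for this example only and does not come from the article: a forest node is a triple (letter, datum, children), binary-tree nodes are addressed by strings over 0 and 1 with the empty string as the root, and the hypothetical functions `bt` and `all_nodes` return the labelling of the nonleaf nodes and the full node set $N$ with its unlabelled leaves.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Assumed representation (hypothetical, for illustration only):
# a forest node is (letter, datum, children) and a forest is a list of nodes.
ForestNode = Tuple[str, Hashable, list]


def bt(forest: List[ForestNode]) -> Dict[str, Tuple[str, Hashable]]:
    """Encode a finite forest as a binary data tree.

    Returns a dict from nonleaf addresses (strings over '0'/'1', root = '')
    to (letter, datum). Child 0 is the first child, child 1 the next sibling.
    """
    labels: Dict[str, Tuple[str, Hashable]] = {}
    # Explicit stack instead of recursion, so long sibling lists are fine.
    stack = [(forest, 0, "")]
    while stack:
        siblings, index, address = stack.pop()
        if index >= len(siblings):
            continue  # no such forest node: the address is an unlabelled leaf
        letter, datum, children = siblings[index]
        labels[address] = (letter, datum)
        stack.append((children, 0, address + "0"))          # first child
        stack.append((siblings, index + 1, address + "1"))  # next sibling
    return labels


def all_nodes(labels: Dict[str, Tuple[str, Hashable]]) -> Set[str]:
    """Nonleaf nodes together with the unlabelled leaves below them."""
    nodes = set(labels)
    for address in labels:
        nodes.add(address + "0")
        nodes.add(address + "1")
    return nodes


# A forest with two roots; the first root has two children.
forest = [("a", 1, [("b", 2, []), ("c", 1, [])]), ("d", 3, [])]
encoded = bt(forest)
```

In this sketch the second root `d` ends up at address `1`, i.e. as the next sibling of the first root, which matches the convention that roots are regarded as siblings with no parent.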
For each $\Sigma$, the set of all data trees with alphabet $\Sigma$ is a complete metric space with the following notion of distance: for distinct $\tau$ and $\tau'$, let $d(\tau, \tau') = 1/l$ where $l$ is least such that $\tau$ and $\tau'$ have distinct $l$-prefixes.

2.2 Alternating Tree Register Automata

Automata. A run of a forward alternating automaton with 1 register on a data tree will consist of a configuration for each tree node. Each configuration will be a finite set of threads, which are pairs of an automaton state and a register value, where the latter is a datum from $D$.

Following Brzozowski and Leiss [1980], transitions will be specified by positive Boolean formulae. For a set of states $Q$, let $\mathcal{B}^+(Q)$ consist of all formulae given by the following grammar, where $q \in Q$:

$$\varphi ::= q(0, \downarrow) \mid q(0, \not\downarrow) \mid q(1, \downarrow) \mid q(1, \not\downarrow) \mid \top \mid \bot \mid \varphi \wedge \varphi \mid \varphi \vee \varphi$$

Given a configuration $G$ at a nonleaf tree node $n$, for each thread $\langle q, D \rangle$ in $G$, the automaton transition function provides a formula $\varphi$ in $\mathcal{B}^+(Q)$, which depends on $q$, on the letter labelling $n$, and on whether $D = E$, where $E$ is the datum labelling $n$. In $\varphi$, an atom $r(d, \downarrow)$ requires that thread $\langle r, E \rangle$ be in the configuration for node $n \cdot d$ (i.e., the register value is replaced by the datum at $n$), and an atom $r(d, \not\downarrow)$ requires the same for thread $\langle r, D \rangle$ (i.e., the register value is not replaced).

Formally, a forward alternating tree 1-register automaton (shortly, ATRA$_1$) $\mathcal{A}$ is a tuple $\langle \Sigma, Q, q_I, F, \delta \rangle$ such that:

- $\Sigma$ is a finite alphabet and $Q$ is a finite set of states;
- $q_I \in Q$ is the initial state and $F \subseteq Q$ are the final states;
- $\delta : Q \times \Sigma \times \{\mathit{tt}, \mathit{ff}\} \to \mathcal{B}^+(Q)$ is a transition function.

Runs and Languages.
The semantics of the positive Boolean formulae can be given by defining when a quadruple $R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1$ of subsets of $Q$ satisfies a formula $\varphi$ in $\mathcal{B}^+(Q)$, by structural recursion. The cases for the Boolean atoms and operators are standard, and for the remaining atoms we have:

$$R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1 \models r(d, ?) \;\stackrel{\mathrm{def}}{\Leftrightarrow}\; r \in R^{?}_d$$

We can now define the transition relation of $\mathcal{A}$, which is between configurations and pairs of configurations, and relative to a letter and a datum. We write $G \to^{E}_{a} H_0, H_1$ iff, for each thread $\langle q, D \rangle \in G$, there exist $R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1 \models \delta(q, a, D = E)$ such that, for both $d \in \{0, 1\}$:

$$\{\langle r, E \rangle : r \in R^{\downarrow}_d\} \cup \{\langle r, D \rangle : r \in R^{\not\downarrow}_d\} \subseteq H_d$$

A run of $\mathcal{A}$ on a data tree $\langle N, \Sigma, \Lambda, \Delta \rangle$ is a mapping $n \mapsto G_n$ from the nodes to configurations such that:

- the initial thread is in the configuration at the root, i.e. $\langle q_I, \Delta(\varepsilon) \rangle \in G_\varepsilon$;
- for each nonleaf $n$, the transition relation is observed, i.e. $G_n \to^{\Delta(n)}_{\Lambda(n)} G_{n \cdot 0}, G_{n \cdot 1}$.

We say that the run is:

- final iff, for each leaf $n$, only final states occur in $G_n$;
- finite iff there exists $l$ such that, for each $n$ of length at least $l$, $G_n$ is empty.

We may regard $\mathcal{A}$ as an automaton on finite data trees, a safety automaton, or a co-safety automaton. We say that:

- $\mathcal{A}$ accepts a finite data tree $\tau$ iff $\mathcal{A}$ has a final run on $\tau$;
- $\mathcal{A}$ safety-accepts a data tree $\tau$ iff $\mathcal{A}$ has a final run on $\tau$;
- $\mathcal{A}$ co-safety-accepts a data tree $\tau$ iff $\mathcal{A}$ has a final finite run on $\tau$.

Observe that, for finite data trees, the three modes of $\mathcal{A}$ coincide. Let $\mathrm{L}^{\mathrm{fin}}(\mathcal{A})$ denote the set of all finite data trees with alphabet $\Sigma$ that $\mathcal{A}$ accepts, and $\mathrm{L}^{\mathrm{saf}}(\mathcal{A})$ (resp., $\mathrm{L}^{\mathrm{cos}}(\mathcal{A})$) denote the set of all data trees with alphabet $\Sigma$ that $\mathcal{A}$ safety-accepts (resp., co-safety-accepts).

Remark 2.1.
The valid initial and successor configurations in runs were defined in terms of lower bounds on sets. In other words, while running on any data tree, at each node the automaton is free to introduce arbitrary "junk" threads. However, final and finite runs were defined in terms of upper bounds on sets, so junk threads can only make it harder to complete a partial run into an accepting one. This will play an important role in the proof of decidability in Theorem 3.1.

Boolean Operations. Given an ATRA$_1$ $\mathcal{A}$, let $\overline{\mathcal{A}}$ denote its dual: the automaton obtained by replacing the set of final states with its complement and replacing, in each transition formula $\delta(q, a, p)$, every $\top$ with $\bot$, every $\wedge$ with $\vee$, and vice versa. Observe that $\overline{\overline{\mathcal{A}}} = \mathcal{A}$. Considering $\mathcal{A}$ (resp., $\overline{\mathcal{A}}$) as a weak alternating automaton whose every state is of even (resp., odd) parity, we have by [Löding and Thomas 2000, Theorem 1] that $\mathrm{L}^{\mathrm{cos}}(\overline{\mathcal{A}})$ is the complement of $\mathrm{L}^{\mathrm{saf}}(\mathcal{A})$. Hence, we also have that $\mathrm{L}^{\mathrm{saf}}(\overline{\mathcal{A}})$ is the complement of $\mathrm{L}^{\mathrm{cos}}(\mathcal{A})$, and that $\mathrm{L}^{\mathrm{fin}}(\overline{\mathcal{A}})$ is the complement of $\mathrm{L}^{\mathrm{fin}}(\mathcal{A})$.

For each $m$ of fin, saf, cos, given ATRA$_1$ $\mathcal{A}_1$ and $\mathcal{A}_2$ with alphabet $\Sigma$, an automaton whose language in mode $m$ is $\mathrm{L}^{m}(\mathcal{A}_1) \cap \mathrm{L}^{m}(\mathcal{A}_2)$ (resp., $\mathrm{L}^{m}(\mathcal{A}_1) \cup \mathrm{L}^{m}(\mathcal{A}_2)$) is constructible easily. It suffices to form a disjoint union of $\mathcal{A}_1$ and $\mathcal{A}_2$, and add a new initial state $q_I$ such that $\delta(q_I, a, \mathit{tt}) = \delta(q^1_I, a, \mathit{tt}) \wedge \delta(q^2_I, a, \mathit{tt})$ (resp., $\delta(q_I, a, \mathit{tt}) = \delta(q^1_I, a, \mathit{tt}) \vee \delta(q^2_I, a, \mathit{tt})$) for each $a \in \Sigma$, where $q^1_I$ and $q^2_I$ are the initial states of $\mathcal{A}_1$ and $\mathcal{A}_2$. (Since the initial thread's register value is always the root node's datum, the formulae $\delta(q_I, a, \mathit{ff})$ are irrelevant.) We therefore obtain:

Proposition 2.2.
(a) ATRA$_1$ on finite data trees are closed under complement, intersection and union.
(b) Safety ATRA$_1$ and co-safety ATRA$_1$ are dual, and each is closed under intersection and union.
In each case, a required automaton is computable in logarithmic space.

Safety Languages. A set $L$ of data trees with alphabet $\Sigma$ is called safety [Alpern and Schneider 1987] iff it is closed with respect to the metric defined in Section 2.1, i.e. for each data tree $\tau$, if for all $l > 0$ there exists $\tau'_l \in L$ such that the $l$-prefixes of $\tau$ and $\tau'_l$ are equal, then $\tau \in L$. The complements of safety languages, i.e. the open sets of data trees, are called co-safety.

Proposition 2.3. For each ATRA$_1$ $\mathcal{A}$, we have that $\mathrm{L}^{\mathrm{saf}}(\mathcal{A})$ is safety and $\mathrm{L}^{\mathrm{cos}}(\mathcal{A})$ is co-safety.

Proof. By Proposition 2.2(b), it suffices to show that $\mathrm{L}^{\mathrm{saf}}(\mathcal{A})$ is safety. Suppose for all $l > 0$ there exists $\tau'_l \in \mathrm{L}^{\mathrm{saf}}(\mathcal{A})$ such that the $l$-prefixes of $\tau$ and $\tau'_l$ are equal. For each $l > 0$, let us fix a final run $n \mapsto G'_{l,n}$ of $\mathcal{A}$ on $\tau'_l$. For each $0 \le k \le l$, let $G_{l,k}$ denote the restriction of the run $n \mapsto G'_{l,n}$ to nodes $n$ of length $k$. Consider the tree consisting of the empty sequence and all sequences $G_{l,0} \cdot G_{l,1} \cdots G_{l,k}$ for $l > 0$ and $0 \le k \le l$. Without loss of generality, each register value in each $G'_{l,n}$ labels some node of $\tau'_l$ on the path from the root to $n$, so the tree is finitely branching. By König's Lemma, it has an infinite path $H_0 \cdot H_1 \cdots$. For each $0 \le k$, $H_k$ is a mapping from the nodes of $\tau$ of length $k$ to configurations of $\mathcal{A}$. It remains to observe that $n \mapsto H_{|n|}(n)$ is a final run of $\mathcal{A}$ on $\tau$.

Example 2.4. By recursion on $k \ge 1$, we shall define ATRA$_1$ $\mathcal{B}_k$ with alphabet $\{b_1, \ldots, b_k, *\}$. As well as being interesting examples of ATRA$_1$, the $\mathcal{B}_k$ will be used in the nonelementarity part of the proof of Theorem 4.1. Let $\mathcal{B}_1$ be the automaton depicted in Figure 1.
It has three states, where $q$ is initial, and $q''$ is final. We have $\delta(q, b_1, p) = q'(0, \not\downarrow) \wedge q''(1, \not\downarrow)$ and $\delta(q', b_1, p) = q''(0, \not\downarrow) \wedge q''(1, \not\downarrow)$ for both $p \in \{\mathit{tt}, \mathit{ff}\}$, and the transition function gives $\bot$ in all other cases. (Recalling that the initial thread's register value is the root node's datum, the formula $\delta(q, b_1, \mathit{ff})$ is in fact irrelevant.) Observe that $\mathcal{B}_1$ safety-accepts exactly data trees that have two nonleaf nodes, the root and its left-hand child, and both are labelled by letter $b_1$.

Fig. 1. Defining $\mathcal{B}_1$.

For each $k \ge 1$, $\mathcal{B}_{k+1}$ is defined so that it safety-accepts a data tree over $\{b_1, \ldots, b_{k+1}, *\}$ iff:

(i) the root node is labelled by $b_{k+1}$, its left-hand child is labelled by $b_{k+1}$, and its right-hand child is a leaf;
(ii) for each node $n$ labelled by $b_{k+1}$, which is not the root, the left-hand child of $n$ is labelled by $*$ and its both children are labelled by $b_{k+1}$, and the right-hand subtree at $n$ is safety-accepted by $\mathcal{B}_k$;
(iii) whenever a node $n$, which is not the root, and a descendant $n'$ of $n$ are labelled by $b_{k+1}$, we have that their data labels are distinct, and that the datum at $n$ equals the datum at some node which is labelled by $b_k$ and which is in the right-hand subtree at $n'$.

By Proposition 2.2(b), it suffices to define automata for (i)-(iii) separately. Expressing (i) and (ii) is straightforward, and an automaton for (iii) is depicted in Figure 2. It has four states, where $q_0$ is initial, and $q_1$ and $q_2$ are final. For all letters $a$ and Booleans $p$, we have $\delta(q_0, a, p) = q_1(0, \not\downarrow)$, so initially the automaton moves to the left-hand child of the root and changes the state to $q_1$.
From $q_1$, if the current node is labelled by $*$, the automaton moves to both children: $\delta(q_1, *, p) = q_1(0, \not\downarrow) \wedge q_1(1, \not\downarrow)$ for both $p$. Also from $q_1$, if the current node $n$ is labelled by $b_{k+1}$, the automaton both moves to the left-hand child without changing the state, and moves to the left-hand child with storing the datum at $n$ in the register and changing the state to $q_2$: $\delta(q_1, b_{k+1}, p) = q_1(0, \not\downarrow) \wedge q_2(0, \downarrow)$ for both $p$. From $q_2$, the behaviour for $*$ is analogous to that from $q_1$, but if the current node's letter is $b_{k+1}$ and its datum is distinct from the datum in the register, the automaton both moves to the left-hand child without changing the state and moves to the right-hand child with changing the state to $q_3$: $\delta(q_2, b_{k+1}, \mathit{ff}) = q_2(0, \not\downarrow) \wedge q_3(1, \not\downarrow)$. The remainder of Figure 2 is interpreted similarly, and in cases not depicted, the transition function gives $\bot$.

Fig. 2. Defining $\mathcal{B}_{k+1}$.

Since the mode of acceptance is safety, the automaton in fact expresses:

(iii') whenever a node $n$, which is not the root, and a descendant $n'$ of $n$ are labelled by $b_{k+1}$, we have that their data labels are distinct, and that either the datum at $n$ equals the datum at some node which is labelled by $b_k$ and which is in the right-hand subtree at $n'$, or that subtree is infinite.

Let $2 \Uparrow 0 = 1$, and $2 \Uparrow k = 2^{2 \Uparrow (k-1)}$ for $k \ge 1$. By induction on $k \ge 1$, the safety language of $\mathcal{B}_k$ has the following two properties. In particular, in the presence of (i) and (ii), we have that (iii) and (iii') are equivalent.

- for every $\tau$ safety-accepted by $\mathcal{B}_k$, every downward sequence which is from the left-hand child of the root and which consists of nodes labelled by $b_k$ is of length at most $2 \Uparrow (k-1)$, so $\tau$ is finite and has at most $2 \Uparrow k$ nodes labelled by $b_k$;
- for some $\tau$ safety-accepted by $\mathcal{B}_k$, the nodes labelled by $b_k$ other than the root form a full binary tree of height $2 \Uparrow (k-1)$ (after removing the nodes labelled by $*$), so there are $2 \Uparrow k$ nodes labelled by $b_k$, and moreover the data at those nodes are mutually distinct.

Finally, we observe that for computing $\mathcal{B}_k$, space logarithmic in $k$ suffices.

2.3 Faulty Tree Counter Automata

In Section 3, we shall establish decidability of nonemptiness of forward alternating tree 1-register automata over finite data trees, by translating them to automata which have natural-valued counters with increments, decrements and zero-tests. The translation will eliminate conjunctive branchings, by having configurations of the former automata (which are finite sets of threads) correspond to pairs of states and counter valuations, so the latter automata will be only nondeterministic. Also, data will be abstracted in the translation, so the counter automata will run on finite trees (without data).

The feature that will make nonemptiness of the counter automata decidable (on finite trees) is that they will be faulty, in the sense that one or more counters can erroneously increase at any time. The key insight is that such faults do not affect the translation's preservation of nonemptiness: they in fact correspond to introductions of "junk" threads in runs of ATRA$_1$ (cf. Remark 2.1).

For clarity of the correspondence between the finitary languages of ATRA$_1$ and the languages of their translations, the counter automata will have $\varepsilon$-transitions. We now define the counter automata, and show their nonemptiness decidable.
Automata. An incrementing tree counter automaton (shortly, ITCA) $\mathcal{C}$, which is forward and with $\varepsilon$-transitions, is a tuple $\langle \Sigma, Q, q_I, F, k, \delta \rangle$ such that:

- $\Sigma$ is a finite alphabet and $Q$ is a finite set of states;
- $q_I \in Q$ is the initial state and $F \subseteq Q$ are the final states;
- $k \in \mathbb{N}$ is the number of counters;
- $\delta \subseteq (Q \times \Sigma \times L \times Q \times Q) \cup (Q \times \{\varepsilon\} \times L \times Q)$ is a transition relation, where $L = \{\mathrm{inc}, \mathrm{dec}, \mathrm{ifz}\} \times \{1, \ldots, k\}$ is the instruction set.

Runs and Languages. A counter valuation is a mapping from $\{1, \ldots, k\}$ to $\mathbb{N}$. For counter valuations $v$ and $v'$, we write:

$$\begin{array}{lcl}
v \le v' & \text{iff} & v(c) \le v'(c) \text{ for all } c \\
v \xrightarrow{\langle \mathrm{inc}, c \rangle}_{\surd} v' & \text{iff} & v' = v[c \mapsto v(c) + 1] \\
v \xrightarrow{\langle \mathrm{dec}, c \rangle}_{\surd} v' & \text{iff} & v' = v[c \mapsto v(c) - 1] \\
v \xrightarrow{\langle \mathrm{ifz}, c \rangle}_{\surd} v' & \text{iff} & v(c) = 0 \text{ and } v' = v \\
v \xrightarrow{l} v' & \text{iff} & v \le v_{\surd} \xrightarrow{l}_{\surd} v'_{\surd} \le v' \text{ for some } v_{\surd}, v'_{\surd}
\end{array}$$

A configuration of $\mathcal{C}$ is a pair $\langle q, v \rangle$, where $q$ is a state and $v$ is a counter valuation.

To define runs, we first specify that a block is a nonempty finite sequence of configurations obtainable by performing $\varepsilon$-transitions, i.e. for every two adjacent configurations $\langle q_i, v_i \rangle$ and $\langle q_{i+1}, v_{i+1} \rangle$ in a block, there exists $l$ with $\langle q_i, \varepsilon, l, q_{i+1} \rangle \in \delta$ and $v_i \xrightarrow{l} v_{i+1}$. Now, a run of $\mathcal{C}$ on a finite tree $\langle N, \Sigma, \Lambda \rangle$ is a mapping $n \mapsto B_n$ from the nodes to blocks such that:

- $\langle q_I, \mathbf{0} \rangle$ is the first configuration in $B_\varepsilon$;
- for each nonleaf $n$, there exists $l$ with $\langle q, \Lambda(n), l, r_0, r_1 \rangle \in \delta$, $v \xrightarrow{l} w_0$ and $v \xrightarrow{l} w_1$, where $\langle q, v \rangle$ is the last configuration in $B_n$, and $\langle r_0, w_0 \rangle$ and $\langle r_1, w_1 \rangle$ are the first configurations in $B_{n \cdot 0}$ and $B_{n \cdot 1}$ (respectively).

We regard such a run accepting iff, for each leaf $n$, the state of the last configuration in $B_n$ is final.
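The step relation with incrementing errors defined above can be checked mechanically. The following is a minimal Python sketch, not taken from the article: valuations are assumed to be dictionaries from counter indices to natural numbers, instructions are pairs such as `("dec", 1)`, and the names `leq`, `exact_step` and `faulty_step` are hypothetical. It relies on the observation that it suffices to start from the least valuation above $v$ from which the instruction can fire without error, and to compare the exact outcome with the target.

```python
from typing import Dict, Optional, Tuple

Valuation = Dict[int, int]      # counter index -> natural number (assumed encoding)
Instruction = Tuple[str, int]   # ("inc" | "dec" | "ifz", counter index)


def leq(v: Valuation, w: Valuation) -> bool:
    """Pointwise order on counter valuations over the same counters."""
    return all(v[c] <= w[c] for c in v)


def exact_step(v: Valuation, instr: Instruction) -> Optional[Valuation]:
    """Error-free successor of v under instr, or None if instr cannot fire."""
    op, c = instr
    if op == "inc":
        return {**v, c: v[c] + 1}
    if op == "dec":
        # Values are natural numbers, so 0 cannot be decremented exactly.
        return {**v, c: v[c] - 1} if v[c] > 0 else None
    if op == "ifz":
        return dict(v) if v[c] == 0 else None
    raise ValueError(f"unknown instruction {op!r}")


def faulty_step(v: Valuation, instr: Instruction, w: Valuation) -> bool:
    """Decide whether v steps to w under instr when counters may
    spontaneously increase before and after the exact step."""
    op, c = instr
    start = dict(v)
    if op == "dec" and start[c] == 0:
        start[c] = 1  # an incrementing error must precede the decrement
    least = exact_step(start, instr)
    # Exact steps are monotone, so `least` is the smallest possible outcome;
    # any larger target is reachable by further incrementing errors.
    return least is not None and leq(least, w)
```

Note that a zero-test can never be enabled by an error, since errors only increase counters, whereas a decrement of a counter at 0 can fire after an error and then leaves the counter at 0 or above.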
The language $\mathrm{L}(\mathcal{C})$ is the set of all finite trees with alphabet $\Sigma$ on which $\mathcal{C}$ has an accepting run.

Decidability of Nonemptiness. We remark that, since nonemptiness of incrementing counter automata over words is not primitive recursive [Demri and Lazić 2009, Theorem 2.9(b)], the same is true of nonemptiness of ITCA.

Theorem 2.5. Nonemptiness of ITCA is decidable.

Proof. Consider an ITCA $\mathcal{C} = \langle \Sigma, Q, q_I, F, k, \delta \rangle$. For counter valuations $v$ and $v'$, and an instruction $l$, we say that $v$ under $l$ yields $v'$ lazily and write $v \xrightarrow{l}_{\flat} v'$ iff either $v \xrightarrow{l}_{\surd} v'$ (i.e., there are no incrementing errors), or $l$ is of the form $\langle \mathrm{dec}, c \rangle$, $v(c) = 0$ and $v' = v$ (i.e., 0 is erroneously decremented to 0). Observe that:

(*) Whenever $v \le w$ and $w \xrightarrow{l} w'$, there exists $v'$ such that $v \xrightarrow{l}_{\flat} v'$ and $v' \le w'$.

To reduce the nonemptiness problem for $\mathcal{C}$ to a reachability problem, let a level of $\mathcal{C}$ be a finite set of configurations. For levels $G$ and $G'$ of $\mathcal{C}$, let us write $G \to G'$ iff $G'$ can be obtained from $G$ as follows:

- each $\langle q, v \rangle \in G$ with $q \notin F$ is replaced either by the two configurations that some firable transition $\langle q, a, l, r_0, r_1 \rangle$ yields lazily, or by the one configuration that some firable transition $\langle q, \varepsilon, l, r \rangle$ yields lazily;
- each $\langle q, v \rangle \in G$ with $q \in F$ is removed.

Performing transitions of $\mathcal{C}$ lazily ensures that, for every level $G$, the set $\{G' : G \to G'\}$ of all its successors is finite. The latter set is also computable. By the definition of accepting runs and (*), we have that $\mathcal{C}$ is nonempty iff the empty level is reachable from the initial level $\{\langle q_I, \mathbf{0} \rangle\}$.

For configurations $\langle q, v \rangle$ and $\langle r, w \rangle$, let $\langle q, v \rangle \le \langle r, w \rangle$ iff $q = r$ and $v \le w$.
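Before the ordering on levels is introduced, the finiteness and computability of successor sets claimed above can be made concrete. The sketch below is an illustration under stated assumptions rather than the construction of the article: valuations are tuples with 0-indexed counters, binary transitions are 5-tuples `(q, a, l, r0, r1)`, $\varepsilon$-transitions are written as triples `(q, l, r)` with the $\varepsilon$ component dropped, and the names `lazy_step` and `successors` are hypothetical.

```python
from itertools import product
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

Valuation = Tuple[int, ...]                      # counter values, 0-indexed (assumed)
Instruction = Tuple[str, int]                    # ("inc" | "dec" | "ifz", counter)
Config = Tuple[str, Valuation]                   # (state, valuation)
Level = FrozenSet[Config]
Binary = Tuple[str, str, Instruction, str, str]  # (q, a, l, r0, r1)
Epsilon = Tuple[str, Instruction, str]           # (q, l, r), epsilon left implicit


def lazy_step(v: Valuation, instr: Instruction) -> Optional[Valuation]:
    """The valuation that v yields lazily under instr, or None if not firable."""
    op, c = instr
    if op == "inc":
        return v[:c] + (v[c] + 1,) + v[c + 1:]
    if op == "dec":
        # Exact decrement, except that 0 is erroneously decremented to 0.
        return v[:c] + (max(v[c] - 1, 0),) + v[c + 1:]
    if op == "ifz":
        return v if v[c] == 0 else None
    raise ValueError(f"unknown instruction {op!r}")


def successors(level: Level, final: Set[str],
               binary: Iterable[Binary], epsilon: Iterable[Epsilon]) -> Set[Level]:
    """All levels obtainable from `level` in one step of the relation on levels."""
    binary, epsilon = list(binary), list(epsilon)
    options_per_config: List[List[Level]] = []
    for q, v in level:
        if q in final:
            continue  # configurations in final states are removed
        options: List[Level] = []
        for p, _letter, instr, r0, r1 in binary:
            if p == q and (w := lazy_step(v, instr)) is not None:
                options.append(frozenset({(r0, w), (r1, w)}))
        for p, instr, r in epsilon:
            if p == q and (w := lazy_step(v, instr)) is not None:
                options.append(frozenset({(r, w)}))
        options_per_config.append(options)
    # One choice per non-final configuration; the union of the choices is a successor.
    return {frozenset().union(*choice) for choice in product(*options_per_config)}


final = {"f"}
binary = [("q", "a", ("inc", 0), "q", "f")]
epsilon = [("q", ("dec", 0), "f")]
initial = frozenset({("q", (0,))})
```

The sketch only enumerates one-step successors. A naive search over them need not terminate, which is why the proof goes on to use a well-quasi-ordering on levels.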
Now, let $\preceq$ be the quasi-ordering obtained by lifting $\le$ to levels: $G \preceq H$ iff, for each $\langle q, v \rangle \in G$, there exists $\langle r, w \rangle \in H$ such that $\langle q, v \rangle \le \langle r, w \rangle$. By Higman's Lemma [Higman 1952], $\preceq$ is a well-quasi-ordering, i.e., for every infinite sequence $G_0, G_1, \ldots$, there exist $i < j$ such that $G_i \preceq G_j$. Observe that, in the terminology of Finkel and Schnoebelen [2001], $\preceq$ is strongly downward-compatible with $\to$: whenever $G \preceq H$ and $H \to H'$, there exists $G'$ such that $G \to G'$ and $G' \preceq H'$. Also, $\preceq$ is decidable.

Since $G \preceq \emptyset$ iff $G = \emptyset$, we have reduced nonemptiness of $\mathcal{C}$ to the subcovering problem for downward well-structured transition systems with reflexive (which is weaker than strong) compatibility, computable successor sets and decidable ordering. The latter is decidable by [Finkel and Schnoebelen 2001, Theorem 5.5].

3. DECIDABILITY OVER FINITE DATA TREES

Theorem 3.1. Nonemptiness of ATRA$_1$ over finite data trees is decidable and not primitive recursive.

Proof. By considering data words as data trees (e.g., by using only left-hand children starting from the root), the lower bound follows from non-primitive recursiveness of nonemptiness of one-way co-nondeterministic (i.e., with only conjunctive branching) automata with 1 register over finite data words [Demri and Lazić 2009, Theorem 5.2].

We shall establish decidability by reducing to nonemptiness of ITCA, which is decidable by Theorem 2.5. More specifically, by extending to trees the translation in the proof of [Demri and Lazić 2009, Theorem 4.4], which is from one-way alternating automata with 1 register on finite data words to incrementing counter automata on finite words, we shall show that, for each ATRA$_1$ $\mathcal{A}$, an ITCA $\mathcal{C}_{\mathcal{A}}$ with the same alphabet and such that $\mathrm{L}(\mathcal{C}_{\mathcal{A}}) = \{\mathrm{tree}(\tau) : \tau \in \mathrm{L}^{\mathrm{fin}}(\mathcal{A})\}$, is computable (in polynomial space).

Let $\mathcal{A} = \langle \Sigma, Q, q_I, F, \delta \rangle$.
For a configuration $G$ of $\mathcal{A}$ and a datum $D$, let the bundle of $D$ in $G$ be the set of all states that are paired with $D$, i.e. $\{q : \langle q, D \rangle \in G\}$. The computation of $\mathcal{C}_{\mathcal{A}}$ with the properties above is based on the following abstraction of configurations of $\mathcal{A}$ by mappings from $\mathcal{P}(Q) \setminus \{\emptyset\}$ to $\mathbb{N}$. The abstract configuration $\overline{G}$ counts, for each nonempty $S \subseteq Q$, the number of data whose bundles equal $S$:

$$\overline{G}(S) = |\{D : \{q : \langle q, D \rangle \in G\} = S\}|$$

Thus, two configurations have the same abstraction iff they are equal up to a bijective renaming of data. For $1 \le i \le \overline{G}(S)$ and $q \in S$, we shall call pairs $\langle S, i \rangle$ abstract data and triples $\langle q, S, i \rangle$ abstract threads.

For abstract configurations $v$, $w_0$ and $w_1$, letters $a$, and sets of states $Q_=$ with either $v(Q_=) > 0$ or $Q_= = \emptyset$, we shall define transitions $v \to^{Q_=}_{a} w_0, w_1$, and show that they are bisimilar to transitions $G \to^{E}_{a} H_0, H_1$ such that $v = \overline{G}$, $w_0 = \overline{H_0}$, $w_1 = \overline{H_1}$ and $Q_= = \{q : \langle q, E \rangle \in G\}$. The sets $Q_=$ can hence be thought of as abstractions of the data $E$. The abstract transitions will then give us a notion of abstract run of $\mathcal{A}$ on a finite tree (without data), where the sets $Q_=$ are guessed at every step. By the bisimilarity, we shall have that:

(I) $\mathcal{A}$ has an accepting abstract run on a finite tree $t$ with alphabet $\Sigma$ iff it has an accepting run on some data tree $\tau$ such that $t = \mathrm{tree}(\tau)$.

In other words, we shall have reduced the question of whether $\mathrm{L}^{\mathit{fin}}(\mathcal{A})$ is nonempty, i.e. whether there exists a finite tree with alphabet $\Sigma$, a data labelling of its nonleaf nodes, and an accepting run of $\mathcal{A}$ on the resulting data tree, to whether there exists a finite tree and an accepting abstract run of $\mathcal{A}$ on it.
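The abstraction of configurations by bundle counts can be sketched as follows. The sketch assumes a configuration is given as a set of (state, datum) pairs; the function names `bundles` and `abstract` are hypothetical.

```python
from collections import Counter


def bundles(config):
    """Map each datum occurring in config to its bundle,
    i.e. the set of states paired with that datum."""
    by_datum = {}
    for state, datum in config:
        by_datum.setdefault(datum, set()).add(state)
    return {datum: frozenset(states) for datum, states in by_datum.items()}


def abstract(config):
    """Abstract configuration: for each nonempty set of states S,
    the number of data whose bundle equals S."""
    return Counter(bundles(config).values())


# Two configurations that differ only by a bijective renaming of data.
g1 = {("q", 1), ("r", 1), ("q", 2), ("p", 3)}
g2 = {("q", "x"), ("r", "x"), ("q", "y"), ("p", "z")}
# A configuration with different bundles.
g3 = {("q", 1), ("r", 2)}
```

Here `abstract(g1)` and `abstract(g2)` coincide, whereas `abstract(g3)` differs from both, in line with the remark that abstractions identify configurations exactly up to renaming of data.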
It will then remain to show how to compute (in polynomial space) an ITCA $\mathcal{C}_{\mathcal{A}}$ which guesses and checks accepting abstract runs of $\mathcal{A}$, so that:

(II) $\mathcal{C}_{\mathcal{A}}$ has an accepting run on a finite tree $t$ with alphabet $\Sigma$ iff $\mathcal{A}$ has an accepting abstract run on $t$.

To begin delivering our promises, we now define transitions from abstract configurations $v$ for letters $a$ and sets of states $Q_=$ with either $v(Q_=) > 0$ or $Q_= = \emptyset$ to abstract configurations $w_0$ and $w_1$, essentially by reformulating the definition of concrete transitions (cf. Section 2.2) in terms of abstract threads. For each abstract datum $\langle S, i \rangle$ of $v$ and both $d \in \{0, 1\}$, the abstract threads whose abstract datum is $\langle S, i \rangle$ will contribute two sets of states to such a transition: $R'(S, i)^{\downarrow}_d$, for which the automaton's register is updated, and $R'(S, i)^{\not\downarrow}_d$, for which the automaton's register is not updated. If $Q_=$ is nonempty, we take $\langle Q_=, 1 \rangle$ to represent the datum abstracted by $Q_=$, i.e. with which the register is updated, so states in the union of the set $R'(Q_=, 1)^{\not\downarrow}_d$ and all the sets $R'(S, i)^{\downarrow}_d$ will be associated to the same abstract datum of $w_d$.

Formally, let $v \to^{Q_=}_{a} w_0, w_1$ mean that, for each abstract datum $\langle S, i \rangle$ of $v$, there exist sets of states $R'(S, i)^{\downarrow}_0$, $R'(S, i)^{\not\downarrow}_0$, $R'(S, i)^{\downarrow}_1$, $R'(S, i)^{\not\downarrow}_1$ such that:

(i) for each abstract thread $\langle q, S, i \rangle$ of $v$, there exist $R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1 \models \delta(q, a, \langle S, i \rangle = \langle Q_=, 1 \rangle)$ which satisfy $R^{?}_d \subseteq R'(S, i)^{?}_d$ for both $d \in \{0, 1\}$ and $? \in \{\downarrow, \not\downarrow\}$;

(ii) for both $d \in \{0, 1\}$ and each nonempty $S' \subseteq Q$, we have

$$\bigl|\{\langle S, i \rangle : \langle S, i \rangle \neq \langle Q_=, 1 \rangle \wedge R'(S, i)^{\not\downarrow}_d = S'\}\bigr| + \begin{cases} 1, & \text{if } R^{=}_d = S' \\ 0, & \text{otherwise} \end{cases} \;\le\; w_d(S')$$

for some $R^{=}_d \supseteq R'(Q_=, 1)^{\not\downarrow}_d \cup \bigcup_{1 \le i \le v(S)} R'(S, i)^{\downarrow}_d$.
It is straightforward to check the following two-part correspondence between the abstract transitions just defined and concrete transitions:

(IIIa) Whenever $G \to^{E}_{a} H_0, H_1$, we have $v \to^{Q_=}_{a} w_0, w_1$, where $v = \overline{G}$, $w_0 = \overline{H_0}$, $w_1 = \overline{H_1}$ and $Q_= = \{q : \langle q, E \rangle \in G\}$.

(IIIb) Whenever $\overline{G} = v$ and $v \to^{Q_=}_{a} w_0, w_1$, there exist $E$, $H_0$ and $H_1$ such that $G \to^{E}_{a} H_0, H_1$, $w_0 = \overline{H_0}$, $w_1 = \overline{H_1}$ and $Q_= = \{q : \langle q, E \rangle \in G\}$.

Let $\alpha$ be a bijection between the abstract data of $v$ and the data that occur in $G$, which is bundle preserving (i.e., whenever $\alpha\langle S, i \rangle = D$, we have that $S$ is the bundle of $D$ in $G$), and if $Q_=$ is nonempty then $\alpha\langle Q_=, 1 \rangle = E$.

—To show (IIIa), for each abstract datum $\langle S, i \rangle$ of $v$ and both $d \in \{0, 1\}$, take $R'(S, i)^{\downarrow}_d$ and $R^{=}_d$ to be the bundle of $E$ in $H_d$, and take $R'(S, i)^{\not\downarrow}_d$ to be the bundle of $\alpha\langle S, i \rangle$ in $H_d$.
—For (IIIb), if $Q_=$ is empty then take $E$ to be an arbitrary datum which does not occur in $G$, pick the same quadruples for the threads in $G$ as for the corresponding (via $\alpha$) abstract threads of $v$, and for both $d \in \{0, 1\}$, obtain $H_d$ from $w_d$ by replacing each set of abstract data $\langle S', 1 \rangle, \ldots, \langle S', w_d(S') \rangle$ with: the data $\alpha\langle S, i \rangle$ such that $\langle S, i \rangle \neq \langle Q_=, 1 \rangle$ and $R'(S, i)^{\not\downarrow}_d = S'$, the datum $E$ if $R^{=}_d = S'$, and fresh further data if the inequality in (ii) is strict.

Composing abstract transitions gives us abstract runs of $\mathcal{A}$. Such a run on a finite tree $\langle N, \Sigma, \Lambda \rangle$ is a mapping $n \mapsto v_n$ from the nodes to abstract configurations such that, for each nonleaf $n$, there exists $Q_=$ with $v_n \to^{Q_=}_{\Lambda(n)} v_{n \cdot 0}, v_{n \cdot 1}$, and if $n$ is the root then $q_I \in Q_=$. Defining the run to be accepting iff $v_n(S) = 0$ for all leaves $n$ and all $S \not\subseteq F$, we have (I) above by (IIIa) and (IIIb).
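The acceptance condition on the leaves of an abstract run is simple enough to state as code. This is a minimal sketch which assumes that an abstract configuration is represented as a dict from frozensets of states to counts; the name `leaf_accepts` is hypothetical.

```python
def leaf_accepts(v, final_states):
    """Check the leaf condition of accepting abstract runs:
    v(S) = 0 for every set of states S that is not a subset of F."""
    final_states = frozenset(final_states)
    return all(count == 0 or bundle <= final_states
               for bundle, count in v.items())
```

An abstract run is then accepting iff `leaf_accepts` holds for the abstract configuration at every leaf.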
We are now ready to define $\mathcal{C}_{\mathcal{A}}$, as an ITCA that performs the steps (1)–(9) below. States of $\mathcal{C}_{\mathcal{A}}$ are used for control and for storing $a$, $Q_=$, $\mathit{root}$ (initially $\mathit{tt}$), $S$, ${R'}^{\downarrow}_0$, ${R'}^{\not\downarrow}_0$, ${R'}^{\downarrow}_1$, ${R'}^{\not\downarrow}_1$, $q$, $R^{\downarrow}_0$, $R^{\not\downarrow}_0$, $R^{\downarrow}_1$, $R^{\not\downarrow}_1$, $d$, $?$ and $R^{=}_d$. There are $2^{|Q|} - 1$ counters in the array $c$, and $2^{|Q| \cdot 4}$ counters in the array $c'$. The steps are implemented by $\varepsilon$-transitions, except for the $a$-transition in (4). The choices are nondeterministic. If a choice in (3.2) is impossible, or a check in (2), (3.2) or (5) fails, then $\mathcal{C}_{\mathcal{A}}$ blocks.

The steps (1)–(9) guess and check an accepting abstract run of $\mathcal{A}$ on a finite tree. The counter array $c$ is used to store abstract configurations, and the counter array $c'$ is auxiliary. The initial condition in the definition of abstract runs is checked in (2), the final condition in (8), and steps (3)–(7) are essentially a reformulation of the definition of abstract transitions. This particular reformulation is tailored for a development in the proof of Theorem 4.1, and is based on observing that the quadruples of sets $R'(S, i)^{\downarrow}_0$, $R'(S, i)^{\not\downarrow}_0$, $R'(S, i)^{\downarrow}_1$, $R'(S, i)^{\not\downarrow}_1$ for abstract data $\langle S, i \rangle \neq \langle Q_=, 1 \rangle$ do not need to be stored simultaneously, i.e. that it suffices to store numbers of such identical quadruples, which is done using the counter array $c'$.

(1) Choose $a \in \Sigma$, and $Q_=$ with either $c[Q_=] > 0$ or $Q_= = \emptyset$.
(2) If $\mathit{root} = \mathit{tt}$, then check that $q_I \in Q_=$ and set $\mathit{root} := \mathit{ff}$.
(3) For each nonempty $S \subseteq Q$, while $c[S] > 0$ do:
  (3.1) choose ${R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1 \subseteq Q$;
  (3.2) for each $q \in S$, choose $R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1 \models \delta(q, a, \langle S, c[S] \rangle = \langle Q_=, 1 \rangle)$, and check that $R^{?}_d \subseteq {R'}^{?}_d$ for both $d \in \{0, 1\}$ and $? \in \{\downarrow, \not\downarrow\}$;
  (3.3) decrement $c[S]$, and if $\langle S, c[S] \rangle = \langle Q_=, 0 \rangle$, then choose $R^{=}_d \supseteq {R'}^{\downarrow}_d \cup {R'}^{\not\downarrow}_d$ for both $d \in \{0, 1\}$, else increment $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$.
(4) Perform an $a$-transition, forking with $d := 0$ and $d := 1$.
(5) Check that $R^{=}_d \supseteq \bigcup \{{R'}^{\downarrow}_d : c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1] > 0\}$, and increment $c[R^{=}_d]$.
(6) Transfer each $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$ with nonempty ${R'}^{\not\downarrow}_d$ to $c[{R'}^{\not\downarrow}_d]$.
(7) Reset (i.e. decrement until 0) each $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$ with empty ${R'}^{\not\downarrow}_d$.
(8) If $c[S] = 0$ whenever $S \not\subseteq F$, then pass through a final state.
(9) Repeat from (1).

Since $\mathcal{C}_{\mathcal{A}}$ is an ITCA, its runs may contain arbitrary errors that increase one or more counters. Nevertheless, between executions of steps (3)–(7) by $\mathcal{C}_{\mathcal{A}}$ and abstract transitions of $\mathcal{A}$, we have the following two-part correspondence. It shows that the possibly erroneous executions of (3)–(7) match the abstract transitions with the slack allowed by their definition, which in turn match the concrete transitions with their possible introductions of junk threads (cf. (IIIa), (IIIb) and Remark 2.1).

(IVa) Whenever $v \to^{Q_=}_{a} w_0, w_1$, we have that $\mathcal{C}_{\mathcal{A}}$ can perform steps (3)–(7) beginning with any configuration such that each $c[S]$ has value $v(S)$ and each $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$ has value 0, so that for both forks $d \in \{0, 1\}$ in (4), the ending configuration is such that each $c[S]$ has value $w_d(S)$ and each $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$ has value 0.
(IVb) Whenever $\mathcal{C}_{\mathcal{A}}$ can perform steps (3)–(7) beginning with a configuration such that $a$ and $Q_=$ are as in (1) and each $c[S]$ has value $v(S)$, so that for both forks $d \in \{0, 1\}$ in (4), the ending configuration is such that each $c[S]$ has value $w_d(S)$, we have $v \to^{Q_=}_{a} w_0, w_1$.

—In proving (IVa), we can choose where incrementing errors occur. For each iteration of (3.1)–(3.3), let the quadruple chosen in (3.1) be $R'(S, c[S])^{\downarrow}_0$, $R'(S, c[S])^{\not\downarrow}_0$, $R'(S, c[S])^{\downarrow}_1$, $R'(S, c[S])^{\not\downarrow}_1$ so that (3.2) can succeed by (i) in the definition of abstract transitions. It remains to match by incrementing errors, say at the end of (7), any differences between the two sides of the inequalities in (ii).
—To obtain (IVb), let $R'(S, i)^{\downarrow}_0$, $R'(S, i)^{\not\downarrow}_0$, $R'(S, i)^{\downarrow}_1$, $R'(S, i)^{\not\downarrow}_1$ for each abstract datum $\langle S, i \rangle$ of $v$ be the quadruple chosen in the last performance of (3.1) with $i = c[S]$ (due to incrementing errors, there may be more than one). Step (3.2) ensures that (i) is satisfied. Since at the end of (3), each $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$ has value at least

$$\bigl|\{\langle S, i \rangle : \langle S, i \rangle \neq \langle Q_=, 1 \rangle \wedge \forall d, ?\,(R'(S, i)^{?}_d = {R'}^{?}_d)\}\bigr|$$

steps (5) and (6) ensure that (ii) is satisfied.

Now, we have (II) above. The 'if' direction follows by (IVa), and the 'only if' direction by (IVb) once we observe that, without loss of generality, we can consider only runs of $\mathcal{C}_{\mathcal{A}}$ that do not contain incrementing errors on the array $c$ outside of steps (3)–(7) except before the first performance of (1).
To conclude that polynomial space suffices for computing $\mathcal{C}_{\mathcal{A}}$, we observe that each of its state variables is either from a fixed finite set, or an element of $\Sigma$, or an element or subset of $Q$, and that deciding satisfaction of transition formulae $\delta(q, a, \langle S, c[S] \rangle = \langle Q_=, 1 \rangle)$ in step (3.2) amounts to evaluating Boolean formulae.

We remark that, in the opposite direction to the translation in the proof of Theorem 3.1, by extending the proof of [Demri and Lazić 2009, Theorem 5.2] to trees, for each ITCA $\mathcal{C}$, an ATRA$_1$ $\mathcal{A}_{\mathcal{C}}$ is computable in logarithmic space such that $\mathrm{L}^{\mathit{fin}}(\mathcal{A}_{\mathcal{C}})$ consists of encodings of accepting runs of $\mathcal{C}$. Moreover, similarly as on words, the two translations can be extended to infinite trees, where ATRA$_1$ are equipped with weak acceptance and ITCA with Büchi acceptance. Instead of decidable and not primitive recursive as on finite trees, nonemptiness for those two classes of automata can then be shown co-r.e.-complete.

4. SAFETY AUTOMATA

We now show decidability of nonemptiness of forward alternating tree 1-register automata with safety acceptance over finite or infinite data trees. More precisely, since the class of safety ATRA$_1$ is not closed under complement, but is closed under intersection and union (cf. Proposition 2.2(b)), we show decidability of the inclusion problem, which implies decidability of nonemptiness of Boolean combinations of safety ATRA$_1$. However, already for the subproblems of nonemptiness and nonuniversality, we obtain non-elementary and non-primitive recursive lower bounds (respectively).

Theorem 4.1. For safety ATRA$_1$, inclusion is decidable, nonemptiness is not elementary, and nonuniversality is not primitive recursive.

Proof.
Showing that the inclusion problem is decidable will involve extending:

—the proof of Proposition 2.2 to obtain an intersection of a safety and a co-safety ATRA$_1$, which can be seen as a weak parity ATRA$_1$ with 2 priorities;
—the proof of Theorem 3.1 to obtain an ITCA with a more powerful set of instructions and no cycles of $\varepsilon$-transitions, which can also be seen as having weak parity acceptance with 2 priorities;
—the proof of Theorem 2.5 to obtain decidability of nonemptiness of such ITCA.

To maintain focus, we shall avoid introducing the extended notions in general, but concentrate on what is necessary for this part of the proof.

Suppose $\mathcal{A}_1 = \langle \Sigma, Q_1, q^1_I, F_1, \delta_1 \rangle$ and $\mathcal{A}_2 = \langle \Sigma, Q_2, q^2_I, F_2, \delta_2 \rangle$ are ATRA$_1$, where we need to determine whether $\mathrm{L}^{\mathit{saf}}(\mathcal{A}_1)$ is a subset of $\mathrm{L}^{\mathit{saf}}(\mathcal{A}_2)$. By the proof of Proposition 2.2(b), that amounts to emptiness of the intersection of $\mathrm{L}^{\mathit{saf}}(\mathcal{A}_1)$ and $\mathrm{L}^{\mathit{cos}}(\overline{\mathcal{A}_2})$, where $\overline{\mathcal{A}_2} = \langle \Sigma, Q_2, q^2_I, \overline{F_2}, \overline{\delta_2} \rangle$ is the dual automaton to $\mathcal{A}_2$. Assuming that $Q_1$ and $Q_2$ are disjoint, and do not contain $q^{\cap}_I$, let $\mathcal{A}_{\cap} = \langle \Sigma, \{q^{\cap}_I\} \cup Q_1 \cup Q_2, q^{\cap}_I, F_1 \cup \overline{F_2}, \delta_{\cap} \rangle$ be the automaton for the intersection of $\mathcal{A}_1$ and $\overline{\mathcal{A}_2}$:

$$\delta_{\cap}(q, a, p) = \begin{cases} \delta_1(q^1_I, a, p) \wedge \overline{\delta_2}(q^2_I, a, p), & \text{if } q = q^{\cap}_I \\ \delta_1(q, a, p), & \text{if } q \in Q_1 \\ \overline{\delta_2}(q, a, p), & \text{if } q \in Q_2 \end{cases}$$

We then have:

(*) A data tree $\tau$ with alphabet $\Sigma$ is safety-accepted by $\mathcal{A}_1$ and co-safety-accepted by $\overline{\mathcal{A}_2}$ iff $\mathcal{A}_{\cap}$ has a run on $\tau$ which is final and $Q_2$-finite, i.e. there exists $l$ such that the configuration at each node of length at least $l$ contains no threads with states from $Q_2$.

Before proceeding, let incrementing tree counter automata with nondeterministic transfers (shortly, ITCANT) be defined as ITCA (cf. Section 2.3), except that $\langle \mathit{ifz}, c \rangle$ instructions are replaced by $\langle \mathit{transf}, c, C \rangle$ for counters $c$ and sets of counters $C$. Such an instruction is equivalent to a loop which executes while $c$ is nonzero, and in each iteration, decrements $c$ and increments some counter in $C$. However, in presence of incrementing errors, the loop may not terminate, whereas $\langle \mathit{transf}, c, C \rangle$ instructions are considered atomic. The effect of $\langle \mathit{transf}, c, C \rangle$ is therefore to transfer the value of $c$ to the counters in $C$, among which it is split nondeterministically. In particular, $\langle \mathit{ifz}, c \rangle$ instructions can be reintroduced as $\langle \mathit{transf}, c, \emptyset \rangle$.

Now, steps (1)–(9) in the proof of Theorem 3.1 can be implemented by an ITCANT which uses nondeterministic transfers instead of the loops in (3), (6) and (7), and whose transition relation therefore contains no cycles of $\varepsilon$-transitions. More specifically, each reset in (7) can be implemented as a transfer to a new auxiliary counter $c''$, (6) already consists of transfers to single counters, and (3) can be replaced by the following two steps:

(3a) If $Q_= \neq \emptyset$, then decrement $c[Q_=]$ and choose $R^{=}_0, R^{=}_1 \subseteq Q$ such that, for each $q \in Q_=$, there exist $R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1 \models \delta(q, a, \mathit{tt})$ with $R^{=}_d \supseteq R^{\downarrow}_d \cup R^{\not\downarrow}_d$ for both $d \in \{0, 1\}$.
(3b) Transfer each $c[S]$ nondeterministically to the set of all $c'[{R'}^{\downarrow}_0, {R'}^{\not\downarrow}_0, {R'}^{\downarrow}_1, {R'}^{\not\downarrow}_1]$ such that, for each $q \in S$, there exist $R^{\downarrow}_0, R^{\not\downarrow}_0, R^{\downarrow}_1, R^{\not\downarrow}_1 \models \delta(q, a, \mathit{ff})$ with $R^{?}_d \subseteq {R'}^{?}_d$ for both $d \in \{0, 1\}$ and $? \in \{\downarrow, \not\downarrow\}$.

Let $\mathcal{C}_{\cap}$ be such an ITCANT for $\mathcal{A}_{\cap}$, which in addition performs the following step between (7) and (8), where $\mathit{prop}$ is a state variable, initially $\mathit{ff}$:

(7½) If $c[S] = 0$ whenever $S \cap Q_2 \neq \emptyset$, then set $\mathit{prop} := \mathit{tt}$.
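The atomic transfer instruction $\langle \mathit{transf}, c, C \rangle$ introduced above can be illustrated by enumerating its possible outcomes. This is a minimal sketch that ignores incrementing errors, represents valuations as dicts, and assumes $c \notin C$; the helper names `compositions` and `transfer` are hypothetical.

```python
def compositions(total, parts):
    """Yield all tuples of `parts` naturals that sum to `total` (parts >= 1)."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def transfer(v, c, targets):
    """Yield every valuation reachable from v by <transf, c, targets>
    without errors: c is emptied and its value is split among targets."""
    targets = sorted(targets)
    if not targets:
        # With no target counters the instruction acts as a zero test on c.
        if v[c] == 0:
            yield dict(v)
        return
    for split in compositions(v[c], len(targets)):
        w = dict(v)
        w[c] = 0
        for target, amount in zip(targets, split):
            w[target] += amount
        yield w
```

For example, transferring a counter with value 2 to two targets has exactly three error-free outcomes, and a transfer to the empty set is firable only when the source counter is zero, matching the remark that zero tests are recovered as $\langle \mathit{transf}, c, \emptyset \rangle$.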
As in the proof of Theorem 3.1, we have that $\mathcal{C}_{\cap}$ is computable from $\mathcal{A}_{\cap}$, and therefore from $\mathcal{A}_1$ and $\mathcal{A}_2$, in polynomial space. Also, $\mathcal{A}_{\cap}$ satisfies (IIIa) and (IIIb), and $\mathcal{A}_{\cap}$ and $\mathcal{C}_{\cap}$ satisfy (IVa) and (IVb). Recalling that $\mathcal{C}_{\cap}$ contains no cycles of $\varepsilon$-transitions, we infer the following from (*) above, where the notion of transitions between levels of $\mathcal{C}_{\cap}$ is as in the proof of Theorem 2.5, and $P$ denotes the set of all states of $\mathcal{C}_{\cap}$ in which $\mathit{prop}$ has value $\mathit{tt}$:

(**) $\mathrm{L}^{\mathit{saf}}(\mathcal{A}_1)$ is a subset of $\mathrm{L}^{\mathit{saf}}(\mathcal{A}_2)$ iff there does not exist an infinite sequence of transitions $\mathcal{G}_0 \to \mathcal{G}_1 \to \cdots$ which is from the initial level of $\mathcal{C}_{\cap}$ and such that some $\mathcal{G}_i$ contains only states from $P$.

To conclude decidability of inclusion, we show that, given an ITCANT $\mathcal{C}_{\cap}$ and a set $P$ of its states, existence of an infinite sequence of transitions as in (**) is decidable. For a set $\mathbb{G}$ of levels of $\mathcal{C}_{\cap}$, we write ${\uparrow}\mathbb{G}$ to denote its upward closure with respect to $\preceq$: the set of all $\mathcal{H}$ for which there exists $\mathcal{G} \in \mathbb{G}$ with $\mathcal{G} \preceq \mathcal{H}$. We say that $\mathbb{G}$ is upwards closed iff $\mathbb{G} = {\uparrow}\mathbb{G}$, and we say that $\mathbb{H}$ is a basis for $\mathbb{G}$ iff $\mathbb{G} = {\uparrow}\mathbb{H}$. As in the proof of Theorem 2.5, we have that successor sets with respect to $\to$ are computable, $\preceq$ is a well-quasi-ordering, $\preceq$ is strongly (in particular, reflexively) downward-compatible with $\to$, and $\preceq$ is decidable. Hence, by [Finkel and Schnoebelen 2001, Proposition 5.4], a finite basis $\mathbb{G}_R$ of the upward closure of the set of all levels reachable from the initial level is computable. By the strong downward compatibility, the set of all levels from which there exists an infinite sequence of transitions is downwards closed, so its complement is upwards closed. We claim that a finite basis $\mathbb{G}_T$ of the latter set is computable. With that assumption, since a finite basis $\mathbb{G}_N$ of the set of all levels that contain some state not from $P$ is certainly computable, we are done because there does not exist an infinite sequence of transitions as in (**) iff ${\uparrow}\mathbb{G}_R$ is a subset of the union of ${\uparrow}\mathbb{G}_T$ and ${\uparrow}\mathbb{G}_N$.

It remains to establish the claim. For a finite set $\mathbb{G}'$ of levels of $\mathcal{C}_{\cap}$, let

$$K(\mathbb{G}') = 1 + \max_{\mathcal{G}' \in \mathbb{G}'} \; \max_{\langle q, v \rangle \in \mathcal{G}'} \; \sum_{c \in \{1, \ldots, k\}} v(c)$$

where $k$ is the number of counters of $\mathcal{C}_{\cap}$. Let also $\mathrm{Pred}_{\forall}(\mathbb{G}')$ be the upwards-closed set consisting of all $\mathcal{G}$ such that, whenever $\mathcal{G} \to \mathcal{G}'$, we have $\mathcal{G}' \in {\uparrow}\mathbb{G}'$. Observe that, whenever $\mathcal{G} \in \mathrm{Pred}_{\forall}(\mathbb{G}')$, there exists $\mathcal{G}^{\dagger} \in \mathrm{Pred}_{\forall}(\mathbb{G}')$ such that $\mathcal{G}^{\dagger} \preceq \mathcal{G}$ and, for each $\langle q, v \rangle \in \mathcal{G}^{\dagger}$ and $c \in \{1, \ldots, k\}$, $v(c) \le K(\mathbb{G}')$. Hence, a finite basis of $\mathrm{Pred}_{\forall}(\mathbb{G}')$ is computable, so the following is an effective procedure:

(i) Begin with $\mathbb{G}_T := \emptyset$.
(ii) Let $\mathbb{H}$ be a finite basis of $\mathrm{Pred}_{\forall}(\mathbb{G}_T)$.
(iii) If $\mathbb{H} \not\subseteq {\uparrow}\mathbb{G}_T$, then set $\mathbb{G}_T := \mathbb{G}_T \cup \mathbb{H}$ and repeat from (ii), else terminate.

Since $\preceq$ is a well-quasi-ordering, the procedure terminates and computes a basis of the set of all levels from which every sequence of transitions is finite, as required.

We shall establish that nonemptiness of safety ATRA$_1$ is not elementary by a two-stage reduction, which separates dealing with the inability of one-way alternating 1-register automata to detect incrementing errors in encodings of computations of counter machines, from ensuring acceptance only of encodings of computations in which counters are bounded by a tower of exponentiations. More precisely, we shall use the following problem as intermediary. The notation $2 \Uparrow m$ is as in Example 2.4.

(***) Given a deterministic counter machine $\mathcal{C}$ and $m \ge 1$ in unary, does $\mathcal{C}$ have a computation which possibly contains incrementing errors, in which every counter value is at most $2 \Uparrow m$, and which is either halting or infinite?
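To make the bound in (***) concrete, the following sketch computes the tower of exponentials with counters that are only incremented, decremented and tested for zero, as a counter machine would. It assumes the reading $2 \Uparrow 0 = 1$ and $2 \Uparrow (m+1) = 2^{2 \Uparrow m}$, since the definition from Example 2.4 is not repeated in this section; the function and counter names are hypothetical.

```python
def tower(m):
    """Compute 2⇑m (assumed: 2⇑0 = 1, 2⇑(m+1) = 2 ** (2⇑m))
    using only increments, decrements and zero tests on counters."""
    out, rounds, exponent, half = 0, m, 0, 0
    out += 1                          # out = 2⇑0
    while rounds > 0:                 # each round replaces out by 2 ** out
        rounds -= 1
        while out > 0:                # move out into exponent
            out -= 1
            exponent += 1
        out += 1                      # out = 2 ** 0
        while exponent > 0:           # double out, exponent many times
            exponent -= 1
            while out > 0:            # move out into half
                out -= 1
                half += 1
            while half > 0:           # out = 2 * half
                half -= 1
                out += 1
                out += 1
    return out
```

The values grow very quickly: `tower(3)` is 16 and `tower(4)` is 65536, which is why bounding counters by $2 \Uparrow m$ with $m$ given in unary leads to a non-elementary lower bound.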
A deterministic counter machine as in (***) is a tuple $\langle Q, q_I, q_H, k, \delta \rangle$ where: $Q$ is a finite set of states, $q_I$ is the initial state, $q_H$ is the halting state, $k \in \mathbb{N}$ is the number of counters, and $\delta : Q \setminus \{q_H\} \to \{1, \ldots, k\} \times (Q \cup Q^2)$ is a transition function. Thus, from a state $q \neq q_H$, either $\delta(q)$ is of the form $\langle c, q' \rangle$, which means that the machine increments $c$ and goes to $q'$, or $\delta(q)$ is of the form $\langle c, q', q'' \rangle$, which means that, if $c$ is zero, then the machine goes to $q'$, else it decrements $c$ and goes to $q''$. More precisely, a configuration is a state together with a counter valuation, and we write $\langle q, v \rangle \to \langle q', v' \rangle$ iff, for some $v_{\surd} \ge v$ and $v'_{\surd} \le v'$,

—either $\delta(q) = \langle c, q' \rangle$ and $v'_{\surd} = v_{\surd}[c \mapsto v_{\surd}(c) + 1]$,
—or $\delta(q) = \langle c, q', q'' \rangle$, $v_{\surd}(c) = 0$ and $v'_{\surd} = v_{\surd}$,
—or $\delta(q) = \langle c, q'', q' \rangle$ and $v'_{\surd} = v_{\surd}[c \mapsto v_{\surd}(c) - 1]$.

We say that the transition is error-free iff the above holds with $v_{\surd} = v$ and $v'_{\surd} = v'$. A computation is a sequence $\langle q_0, v_0 \rangle \to \langle q_1, v_1 \rangle \to \cdots$ such that $q_0 = q_I$ and $v_0 = \mathbf{0}$.

To show that (***) is not elementary, we reduce from the problem of whether a deterministic 2-counter machine of size $m$ has an error-free halting computation of length at most $2 \Uparrow m$. Given such a machine $\mathcal{C}$ whose counters are $c_1$ and $c_2$, let $\widehat{\mathcal{C}}$ be a deterministic machine with counters $c_1$, $c_2$, $\bar{c}_1$, $\bar{c}_2$, $c^{\dagger}$, $c'$, $c''$ and $c'''$, which performs the stages (I)–(III) below and then halts.

    $c' := m$; inc($\bar{c}_i$);
    while $c' > 0$ {
      dec($c'$);
      while $\bar{c}_i > 0$ { dec($\bar{c}_i$); inc($c''$) };
      inc($\bar{c}_i$);
      while $c'' > 0$ {
        dec($c''$);
        while $\bar{c}_i > 0$ { dec($\bar{c}_i$); inc($c'''$) };
        while $c''' > 0$ { dec($c'''$); inc($\bar{c}_i$); inc($\bar{c}_i$) }
      }
    }

Fig. 3. Computing $2 \Uparrow m$

(I) For both $i \in \{1, 2\}$, set $\bar{c}_i$ to $2 \Uparrow m$ by executing the pseudo-code in Figure 3. The loops over $c'$, $c''$ and $c'''$ implement $\bar{c}_i := 2 \Uparrow c'$, $\bar{c}_i := 2^{c''}$ and $\bar{c}_i := 2 \times c'''$ (respectively).
(II) Simulate $\mathcal{C}$ using $c_1$ and $c_2$, and after each step:
  —increment $c^{\dagger}$;
  —if $c_i$ has been incremented, then decrement $\bar{c}_i$;
  —if $c_i$ has been decremented, then increment $\bar{c}_i$;
  —if $\mathcal{C}$ has halted, then go to (III).
(III) For both $i \in \{1, 2\}$, transfer $c_i$ to $\bar{c}_i$.

Observe that $\widehat{\mathcal{C}}$ is computable in space logarithmic in $m$. If $\mathcal{C}$ has an error-free halting computation of length at most $2 \Uparrow m$, running $\widehat{\mathcal{C}}$ without errors indeed halts and does not involve counter values greater than $2 \Uparrow m$. For the converse, suppose $\widehat{\mathcal{C}}$ has a computation which possibly contains incrementing errors, in which every counter value is at most $2 \Uparrow m$, and which is either halting or infinite. By the construction of $\widehat{\mathcal{C}}$ and the boundedness of counter values, the computation cannot be infinite, so it is halting. Since $\bar{c}_1$ and $\bar{c}_2$ were set to $2 \Uparrow m$ by stage (I), and since stage (III) terminated, the halting computation of $\mathcal{C}$ in stage (II) must have been error-free and it is certainly of length at most $2 \Uparrow m$.

To reduce from (***) to nonemptiness of safety ATRA$_1$, consider a deterministic counter machine $\mathcal{C} = \langle Q, q_I, q_H, k, \delta \rangle$ and $m \ge 1$. We can assume that $q' \neq q''$ whenever $\delta(q) = \langle c, q', q'' \rangle$. By the proof of [Demri and Lazić 2009, Theorem 5.2], which uses essentially the same encoding of computations of counter machines into data words as in the proof of [Bojańczyk et al.
2006, Theorem 14], we have that an ATRA$_1$ $\mathcal{A}_{\mathcal{C}}$ with alphabet $Q$ is computable in space logarithmic in $|\mathcal{C}|$, such that it safety-accepts a data tree $\tau$ iff the left-most path in $\tau$ (i.e., the sequence of nodes obtained by starting from the root and repeatedly taking the left-hand child) satisfies the following:

—the letter of the first node is $q_I$, and either the letter of the last nonleaf node is $q_H$ or the sequence is infinite;
—for all letters $q$ and $q'$ of two consecutive nodes $n$ and $n'$ (respectively),
  —either $\delta(q)$ is of the form $\langle c, q' \rangle$ and we say that $n$ is $c$-incrementing,
  —or $\delta(q)$ is of the form $\langle c, q', q'' \rangle$ and we say that $n$ is $c$-zero-testing,
  —or $\delta(q)$ is of the form $\langle c, q'', q' \rangle$ and we say that $n$ is $c$-decrementing;
—for each counter $c$, no two $c$-incrementing nodes are labelled by the same datum, no two $c$-decrementing nodes are labelled by the same datum, and whenever a $c$-incrementing node $n$ is followed by a $c$-zero-testing node $n'$, then a $c$-decrementing node with the same datum as $n$ must occur between $n$ and $n'$.

Hence, by taking the left-most paths in data trees that are safety-accepted by $\mathcal{A}_{\mathcal{C}}$ and erasing data, we obtain exactly the sequences of states of halting or infinite computations of $\mathcal{C}$ which possibly contain incrementing errors. Assuming that $b_1, \ldots, b_m, *$ are not in $Q$, to restrict further to computations of $\mathcal{C}$ in which every counter value is at most $2 \Uparrow m$, it suffices to strengthen $\mathcal{A}_{\mathcal{C}}$ to obtain a safety ATRA$_1$ $\mathcal{A}^{2 \Uparrow m}_{\mathcal{C}}$ with alphabet $Q \cup \{b_1, \ldots, b_m, *\}$ which requires that:

—whenever a node $n$ in the left-most path is $c$-incrementing, then the automaton $\mathcal{B}_m$ from Example 2.4 safety-accepts the right-hand subtree at $n$;
—whenever a node $n$ in the left-most path is $c$-incrementing, $n'$ is either $n$ or a subsequent $c$-incrementing node, and no $c$-decrementing node with the same datum as $n$ occurs between $n$ and $n'$, then the right-hand subtree at $n'$ contains a node with letter $b_m$ and the same datum as $n$.

Finally, that nonuniversality of safety ATRA$_1$ is not primitive recursive follows from the same lower bound for nonuniversality of safety one-way alternating automata with 1 register over data words [Lazić 2006].

5. XPATH SATISFIABILITY

In this section, we first describe how XML documents and DTDs can be represented by data trees and tree automata. We then introduce a forward fragment of XPath, and a safety subfragment. By translating XPath queries to forward alternating tree 1-register automata, and applying results from Sections 3 and 4, we obtain decidability of satisfiability for forward XPath on finite documents and for safety forward XPath on finite or infinite documents.

XML Trees. Suppose $\Sigma$ is a finite set of element types, $\Sigma'$ is a finite set of attribute names, and $\Sigma$ and $\Sigma'$ are disjoint. An XML document [Bray et al. 1998] is an unranked ordered tree whose every node $n$ is labelled by some $\mathit{type}(n) \in \Sigma$ and by a datum for each element of some $\mathit{atts}(n) \subseteq \Sigma'$. Motivated by processing of XML streams (cf., e.g., [Olteanu et al. 2004]), we do not restrict our attention to finite XML documents.

Concerning the data in XML documents, we shall consider only the equality predicate between data labels. Equality comparisons with constants are straightforward to encode using additional attribute names. Therefore, similarly as Bojańczyk et al.
[2009], we represent an XML document by a data tree with alphabet $\Sigma \cup \Sigma'$, where each node $n$ is represented by a sequence of $1 + |\mathit{atts}(n)|$ nodes: the first node is labelled by $\mathit{type}(n)$, the labels of the following nodes enumerate $\mathit{atts}(n)$, the children of the last node represent the first child and the next sibling of $n$ (if any), and for each preceding node in the sequence, its left-hand child is the next node and its right-hand child is a leaf. We say that such a data tree is an XML tree.

Following Benedikt et al. [2008] and Bojańczyk et al. [2009], we assume without loss of generality that document type definitions (DTDs) [Bray et al. 1998] are given as regular tree languages. More precisely, we consider a DTD to be a forward nondeterministic tree automaton $\mathcal{T}$ with alphabet $\Sigma \cup \Sigma'$ and without $\varepsilon$-transitions. Such automata can be defined by omitting counters and $\varepsilon$-transitions from ITCA (cf. Section 2.3). Infinite trees are processed in safety mode, i.e. the condition that an infinite run of $\mathcal{T}$ has to satisfy to be accepting is the same as for finite runs: for each leaf $n$, the state of the configuration at $n$ is final. An XML tree $\tau$ as above is regarded to satisfy $\mathcal{T}$ iff $\mathcal{T}$ accepts $\mathrm{tree}(\tau)$.

Fragments of XPath. The fragment of XPath [Clark and DeRose 1999] below contains all operators commonly found in practice and was considered in [Benedikt et al. 2008; Geerts and Fan 2005]. The grammars of queries $p$ and qualifiers $u$ are mutually recursive. The element types $a$ and attribute names $a'$ range over $\Sigma$ and $\Sigma'$, respectively.

$$p ::= \varepsilon \mid \triangledown \mid \triangle \mid \triangleleft \mid \triangleright \mid \triangledown^* \mid \triangle^* \mid \triangleleft^* \mid \triangleright^* \mid p/p \mid p \cup p \mid p[u]$$
$$u ::= \neg u \mid u \wedge u \mid p? \mid a \mid p/@a' = p/@a' \mid p/@a' \neq p/@a'$$

We say that a query or qualifier is forward iff:

—it does not contain $\triangle$, $\triangleright$, $\triangle^*$ or $\triangleright^*$;
—for every subqualifier of the form $p_1/@a'_1 \bowtie p_2/@a'_2$, we have that $p_1 = \varepsilon$ and that $p_2$ is of the form $\varepsilon$ or $\triangledown/p'_2$ or $\triangleleft/p'_2$.

A safety (resp., co-safety) query or qualifier is one in which each occurrence of $\triangledown$, $\triangledown^*$ or $\triangleleft^*$ is under an odd (resp., even) number of negations. Since infinite XML documents may contain infinite siblinghoods, $\triangledown$, $\triangledown^*$ and $\triangleleft^*$ are exactly the queries that may require existence of a node which can be unboundedly far.

The semantics of queries and qualifiers is standard (cf., e.g., [Geerts and Fan 2005]). We write the satisfaction relations as $\tau, n, n' \models p$ and $\tau, n \models u$, where $\tau$ is an XML tree $\langle N, \Sigma \cup \Sigma', \Lambda, \Delta \rangle$, and $n$ and $n'$ are $\Sigma$-labelled nodes. The definition is recursive over the grammars of queries and qualifiers, and can be found in Figure 4.

$$\begin{array}{lcl}
\tau, n, n' \models \varepsilon & \stackrel{\mathrm{def}}{\Leftrightarrow} & n = n' \\
\tau, n, n' \models \{\triangledown, \triangle, \triangleleft, \triangleright\} & \stackrel{\mathrm{def}}{\Leftrightarrow} & n \; \{\widehat{\triangledown}, \widehat{\triangle}, \widehat{\triangleleft}, \widehat{\triangleright}\} \; n' \\
\tau, n, n' \models \{\triangledown^*, \triangle^*, \triangleleft^*, \triangleright^*\} & \stackrel{\mathrm{def}}{\Leftrightarrow} & n \; \{\widehat{\triangledown}^*, \widehat{\triangle}^*, \widehat{\triangleleft}^*, \widehat{\triangleright}^*\} \; n' \\
\tau, n, n' \models p_1/p_2 & \stackrel{\mathrm{def}}{\Leftrightarrow} & \text{there exists } n'' \text{ such that } \tau, n, n'' \models p_1 \text{ and } \tau, n'', n' \models p_2 \\
\tau, n, n' \models p_1 \cup p_2 & \stackrel{\mathrm{def}}{\Leftrightarrow} & \tau, n, n' \models p_1 \text{ or } \tau, n, n' \models p_2 \\
\tau, n, n' \models p[u] & \stackrel{\mathrm{def}}{\Leftrightarrow} & \tau, n, n' \models p \text{ and } \tau, n' \models u \\
\tau, n \models p? & \stackrel{\mathrm{def}}{\Leftrightarrow} & \text{there exists } n' \text{ such that } \tau, n, n' \models p \\
\tau, n \models a & \stackrel{\mathrm{def}}{\Leftrightarrow} & \Lambda(n) = a \\
\tau, n \models p_1/@a'_1 \bowtie p_2/@a'_2 & \stackrel{\mathrm{def}}{\Leftrightarrow} & \text{there exist } n_1, k_1, n_2, k_2 \text{ such that } \\
 & & \tau, n, n_1 \models p_1, \; k_1 \le |\mathit{atts}(n_1)|, \; \Lambda(n_1 \cdot 0^{k_1}) = a'_1, \\
 & & \tau, n, n_2 \models p_2, \; k_2 \le |\mathit{atts}(n_2)|, \; \Lambda(n_2 \cdot 0^{k_2}) = a'_2, \\
 & & \Delta(n_1 \cdot 0^{k_1}) \bowtie \Delta(n_2 \cdot 0^{k_2})
\end{array}$$

Fig. 4. Semantics of Queries and Qualifiers
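The syntactic condition defining forward queries and qualifiers lends itself to a short recursive check. The sketch below uses a hypothetical encoding that is not part of the article: axes are the strings `"eps"`, `"child"`, `"parent"`, `"next"`, `"prev"` and their starred variants, composite queries are tuples tagged `"seq"`, `"union"` or `"filter"`, and qualifiers are tuples tagged `"not"`, `"and"`, `"exists"`, `"type"`, `"eq"` or `"neq"`.

```python
BACKWARD = {"parent", "prev", "parent*", "prev*"}
AXES = {"eps", "child", "next", "child*", "next*"} | BACKWARD


def query_forward(p):
    """True iff the query p satisfies both forwardness conditions."""
    if isinstance(p, str):
        if p not in AXES:
            raise ValueError(f"unknown axis {p!r}")
        return p not in BACKWARD
    tag = p[0]
    if tag in ("seq", "union"):          # p1/p2 and p1 ∪ p2
        return query_forward(p[1]) and query_forward(p[2])
    if tag == "filter":                  # p[u]
        return query_forward(p[1]) and qualifier_forward(p[2])
    raise ValueError(f"unknown query constructor {tag!r}")


def qualifier_forward(u):
    """True iff the qualifier u satisfies both forwardness conditions."""
    tag = u[0]
    if tag == "not":
        return qualifier_forward(u[1])
    if tag == "and":
        return qualifier_forward(u[1]) and qualifier_forward(u[2])
    if tag == "exists":                  # p?
        return query_forward(u[1])
    if tag == "type":                    # element type test a
        return True
    if tag in ("eq", "neq"):             # p1/@a1 = p2/@a2 or p1/@a1 != p2/@a2
        _, p1, _a1, p2, _a2 = u
        starts_downward_or_right = (
            isinstance(p2, tuple) and p2[0] == "seq" and p2[1] in ("child", "next")
        )
        shape_ok = p1 == "eps" and (p2 == "eps" or starts_downward_or_right)
        return shape_ok and query_forward(p2)
    raise ValueError(f"unknown qualifier constructor {tag!r}")


# Encoding of a query of the shape  next*/child*[eps/@a1 = (child/child*)/@a2].
P = ("seq", "next*",
     ("filter", "child*",
      ("eq", "eps", "a1", ("seq", "child", "child*"), "a2")))
```

With this encoding, `query_forward(P)` holds, whereas a data comparison whose left-hand query is not `"eps"`, or whose right-hand query starts with a starred axis, is rejected.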
We omit the Boolean cases, and we write $\widehat{\triangledown}$, $\widehat{\triangle}$, $\widehat{\triangleleft}$ and $\widehat{\triangleright}$ for the relations between $\Sigma$-labelled nodes that correspond to the child, parent, next-sibling and previous-sibling relations (respectively) in the document that $\tau$ represents. We say that $\tau$ satisfies $p$ iff $\tau, \varepsilon, n' \models p$ for some $n'$.

Example 5.1. Suppose $a'_1, a'_2 \in \Sigma'$. The forward query

$$p_{a'_1, a'_2} = \triangleleft^* / \triangledown^* [\varepsilon/@a'_1 = (\triangledown/\triangledown^*)/@a'_2]$$

is satisfied by $\Sigma$-labelled nodes $n_0$ and $n_1$ iff $n_0 \; \widehat{\triangleleft}^* \widehat{\triangledown}^* \; n_1$ and there exists $n_2$ such that $n_1 \; \widehat{\triangledown}^+ \; n_2$ and the value of attribute $a'_1$ at $n_1$ is equal to the value of attribute $a'_2$ at $n_2$. Hence, the safety forward query $\varepsilon[\neg(p_{a'_1, a'_2}?)]$ is satisfied by an XML tree over $\Sigma$ and $\Sigma'$ (whose root may have younger siblings) iff the value of $a'_1$ at a node is never equal to the value of $a'_2$ at a descendant.

Suppose a query $p$ and a DTD $\mathcal{T}$ are over the same element types and attribute names. We say that $p$ is satisfiable relative to $\mathcal{T}$ iff there exists an XML tree which satisfies $p$ and $\mathcal{T}$. Finitary satisfiability restricts to finite XML trees.

Complexity of Satisfiability. Let us regard a forward qualifier $u$ over element types $\Sigma$ and attribute names $\Sigma'$ as finitely equivalent to an ATRA$_1$ $\mathcal{A}$ with alphabet $\Sigma \cup \Sigma'$ iff, for every finite XML tree $\tau$ over $\Sigma$ and $\Sigma'$, and $\Sigma$-labelled node $n$, we have $\tau, n \models u$ iff $\mathcal{A}$ accepts the subtree of $\tau$ rooted at $n$. For safety (resp., co-safety) $u$, safety (resp., co-safety) equivalence is defined by also considering infinite XML trees and safety (resp., co-safety) acceptance by $\mathcal{A}$.

To formalise the corresponding notions for queries, we introduce the following kind of automata "with holes". Query automata are defined in the same way as ATRA$_1$ (cf.
Section 2 .2), except that: —transition formulae may contain a new ato mic formula H ; —no pa th in the succes sor gra ph from the initial state to a state q such that H o ccurs in some transition formula at q may co n tain a n up date edg e. The vertices of the successor g raph are all sta tes, there is a n edge fr om q to r iff r (0 , ↓ ), r (0 , 6 ↓ ), r (1 , ↓ ) or r (1 , 6 ↓ ) o c curs in some transition for m ula at q , and such an edge is ca lle d up date iff r (0 , ↓ ) or r (1 , ↓ ) o ccurs in some transition for m ula at q . T o define a run of a query automaton on a data tree τ with the same alphab et and with r esp ect to a set o f no des N ′ , w e augment the definition of runs o f A TRA 1 so that whenever a tra nsition for mu la is ev aluated at a no de n , each o ccur ence of H is tre a ted as ⊤ if n ∈ N ′ , and as ⊥ if n / ∈ N ′ . Acceptance of a finite data tr e e , safety acceptanc e , and co -safety acceptance, all with r esp ect to a set of no des for int erpreting H , are then defined as for A TRA 1 . F or a query a uto maton A and an A TRA 1 or query automaton A ′ with the s ame alphab et and initial states q I and q ′ I (resp ectively), we define the substitution of A ′ for the hole in A by forming a disjoint union of A and A ′ , taking q I as the initial state, and substituting each occur ence of H in each transition formula δ ( q , a, b ) o f A by δ ( q ′ I , a, b ). Obser ve that the unreachability in A of H fro m q I by a path with a n upda te edge means that the c omp o site automaton transmits initial re g ister v alues to A ′ without changes. Now, we say that a fo rward query p over element types Σ and attribute names Σ ′ is finitely equiv alent to a quer y automa to n B with alphab et Σ ∪ Σ ′ iff, for every finite XML tree τ ov er Σ and Σ ′ , Σ-lab elled no de n , and set N ′ of Σ-lab elled no des, we hav e τ , n, n ′ | = p for some n ′ ∈ N ′ iff B a ccepts the subtree of τ ro o ted at n w ith resp ect to N ′ . 
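The side condition on query automata and the substitution operation defined above can be sketched in Python as follows. This is an illustrative encoding only, not the authors' construction: transition formulae are nested tuples, a move atom $r(d, \cdot)$ is written `("move", r, d, update)` with `update=True` standing for $\downarrow$, the third argument `b` of the transition function is treated as an opaque key, and all function names are hypothetical.

```python
TOP, BOT, HOLE = ("top",), ("bot",), ("hole",)

def leaves(f):
    """Yield the atomic subformulae of a positive Boolean formula."""
    if f[0] in ("and", "or"):
        yield from leaves(f[1])
        yield from leaves(f[2])
    else:
        yield f

def replace_hole(f, g):
    """Replace every occurrence of the hole H in f by the formula g."""
    if f == HOLE:
        return g
    if f[0] in ("and", "or"):
        return (f[0], replace_hole(f[1], g), replace_hole(f[2], g))
    return f

def hole_condition_holds(aut):
    """No path from the initial state to a state with H contains an update edge."""
    edges = {}           # edges[q][r] is True iff the edge q -> r is an update edge
    hole_states = set()  # states with H in some transition formula
    for (q, _a, _b), f in aut["delta"].items():
        for leaf in leaves(f):
            if leaf == HOLE:
                hole_states.add(q)
            elif leaf[0] == "move":
                _, r, _direction, update = leaf
                out = edges.setdefault(q, {})
                out[r] = out.get(r, False) or update
    # Search over pairs (state, whether an update edge has been traversed).
    start = (aut["init"], False)
    seen, stack = {start}, [start]
    while stack:
        q, updated = stack.pop()
        if updated and q in hole_states:
            return False
        for r, update in edges.get(q, {}).items():
            nxt = (r, updated or update)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

def substitute(aut, sub):
    """Substitute sub for the hole in aut (state names assumed disjoint)."""
    def states(m):
        return {q for (q, _a, _b) in m["delta"]}
    assert states(aut).isdisjoint(states(sub)), "rename states first"
    delta = dict(sub["delta"])
    for (q, a, b), f in aut["delta"].items():
        # H at (q, a, b) becomes the transition formula of sub's initial state.
        delta[(q, a, b)] = replace_hole(f, sub["delta"][(sub["init"], a, b)])
    return {"init": aut["init"], "delta": delta}

# Toy instance over a one-letter alphabet; "=" and "!=" are placeholder keys.
B = {"init": "q0", "delta": {}}
for b in ("=", "!="):
    B["delta"][("q0", "a", b)] = ("move", "q1", 0, False)
    B["delta"][("q1", "a", b)] = ("or", HOLE, ("move", "q1", 1, False))
A_sub = {"init": "s0", "delta": {("s0", "a", "="): TOP, ("s0", "a", "!="): BOT}}
composite = substitute(B, A_sub)
```

In this toy instance `B` moves without updating the register, so `hole_condition_holds(B)` is true, and `composite` contains no remaining occurrence of `HOLE`. Changing the first move of `B` to an update move would violate the side condition.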
For safety (resp., co-safety) $p$, safety (resp., co-safety) equivalence is defined by also considering infinite XML trees and safety (resp., co-safety) acceptance by $B$.

[Figure 5: state diagrams of two automata, the first with states $q_0$, $q_1$, $q_2$, $q_3$ and the hole $H$, the second with states $q'_0$, $q'_1$.]
Fig. 5. Defining $A^{\Sigma,\Sigma'}_{\varepsilon/@a'_1 = (\triangledown/p)/@a'_2}$

Theorem 5.2. For each forward query $p$ (resp., forward qualifier $u$) over $\Sigma$ and $\Sigma'$, a finitely equivalent query automaton $B^{\Sigma,\Sigma'}_p$ (resp., ATRA$_1$ $A^{\Sigma,\Sigma'}_u$) is computable in logarithmic space. If $p$ (resp., $u$) is safety, then it is safety equivalent to $B^{\Sigma,\Sigma'}_p$ (resp., $A^{\Sigma,\Sigma'}_u$).

Proof. The translations are defined recursively over the grammars of queries and qualifiers:
—$B^{\Sigma,\Sigma'}_{\varepsilon}$, $B^{\Sigma,\Sigma'}_{\triangledown}$, $B^{\Sigma,\Sigma'}_{\triangleright}$, $B^{\Sigma,\Sigma'}_{\triangledown^*}$, $B^{\Sigma,\Sigma'}_{\triangleright^*}$ and $A^{\Sigma,\Sigma'}_{a}$ are straightforward to define;
—$B^{\Sigma,\Sigma'}_{p \cup p'}$ is formed from $B^{\Sigma,\Sigma'}_{p}$ and $B^{\Sigma,\Sigma'}_{p'}$ by disjunctive disjoint union, $A^{\Sigma,\Sigma'}_{\neg u}$ is formed from $A^{\Sigma,\Sigma'}_{u}$ by dualisation, and $A^{\Sigma,\Sigma'}_{u \wedge u'}$ is formed from $A^{\Sigma,\Sigma'}_{u}$ and $A^{\Sigma,\Sigma'}_{u'}$ by conjunctive disjoint union (cf. the proof of Proposition 2.2);
—to obtain $B^{\Sigma,\Sigma'}_{p/p'}$, we substitute $B^{\Sigma,\Sigma'}_{p'}$ for the hole in $B^{\Sigma,\Sigma'}_{p}$;
—to obtain $B^{\Sigma,\Sigma'}_{p[u]}$, we substitute a conjunctive disjoint union of $B^{\Sigma,\Sigma'}_{\varepsilon}$ and $A^{\Sigma,\Sigma'}_{u}$ for the hole in $B^{\Sigma,\Sigma'}_{p}$;
—$A^{\Sigma,\Sigma'}_{p?}$ is formed from $B^{\Sigma,\Sigma'}_{p}$ by substituting $\top$ for $H$;
—an automaton for $\varepsilon/@a'_1 = (\triangledown/p)/@a'_2$ is formed by substituting the second automaton depicted in Figure 5 (cf. Example 2.4 for depicting conventions) for the hole in $B^{\Sigma,\Sigma'}_{p}$, and substituting the result for the hole in the first automaton depicted in Figure 5;
—the remaining cases in the grammar of qualifiers are handled similarly.

The required equivalences, as well as that if $p$ (resp., $u$) is co-safety then it is co-safety equivalent to $B^{\Sigma,\Sigma'}_p$ (resp., $A^{\Sigma,\Sigma'}_u$), are shown simultaneously by induction.

Theorem 5.3.
(a) For forward XPath and arbitrary DTDs, satisfiability over finite XML trees is decidable.
(b) For safety forward XPath and arbitrary DTDs, satisfiability over finite or infinite XML trees is decidable.

Proof. Given a forward query $p$ and a DTD $T$ over element types $\Sigma$ and attribute names $\Sigma'$, by Theorem 5.2, an ATRA$_1$ $A^{\Sigma,\Sigma'}_{p?}$ is computable, which is finitely equivalent to the qualifier $p?$. We can then compute an ITCA $C(A^{\Sigma,\Sigma'}_{p?})$ as in the proof of Theorem 3.1, which recognises exactly trees obtained by erasing data from finite XML trees that satisfy $p$. To conclude (a), we observe that ITCA are closed (in logarithmic space) under intersections with forward nondeterministic tree automata, and apply Theorem 2.5.

For (b), supposing that $p$ is safety, by Theorem 5.2 again, an ATRA$_1$ $A^{\Sigma,\Sigma'}_{p?}$ is computable, which is safety equivalent to the qualifier $p?$. Applying the proof of Theorem 4.1 to $A^{\Sigma,\Sigma'}_{p?}$ and an ATRA$_1$ whose safety language is empty, we can compute an ITCANT $C'(A^{\Sigma,\Sigma'}_{p?})$, which contains no cycles of $\varepsilon$-transitions and recognises exactly trees obtained by erasing data from finite or infinite XML trees that satisfy $p$.
It remains to observe that ITCANT with no cycles of $\varepsilon$-transitions are closed (in logarithmic space) under intersections with forward nondeterministic tree automata, and to recall that their nonemptiness was shown decidable also in the proof of Theorem 4.1.

We remark that, by the proof of [Demri and Lazić 2009, Theorem 5.2], finitary satisfiability for forward XPath with DTDs is not primitive recursive, even without sibling axes (i.e., $\triangleright$ and $\triangleright^*$).

6. CONCLUDING REMARKS

It would be interesting to know more about the complexities of nonemptiness for safety ATRA$_1$ and satisfiability for safety forward XPath with DTDs. By Theorem 4.1, the former is decidable and not elementary, and by Theorem 5.3 (b), the latter is decidable.

ACKNOWLEDGMENTS

We are grateful to the referees for helping us improve the presentation.

REFERENCES

Alpern, B. and Schneider, F. B. 1987. Recognizing safety and liveness. Distr. Comput. 2, 3, 117–126.
Benedikt, M., Fan, W., and Geerts, F. 2008. XPath satisfiability in the presence of DTDs. J. ACM 55, 2.
Björklund, H. and Bojańczyk, M. 2007. Bounded depth data trees. In Automata, Lang. and Program., 34th Int. Coll. (ICALP). Lect. Notes Comput. Sci., vol. 4596. Springer, 862–874.
Björklund, H. and Schwentick, T. 2007. On notions of regularity for data languages. In Fundamentals of Comput. Theory (FCT), 16th Int. Symp. Lect. Notes Comput. Sci., vol. 4639. Springer, 88–99.
Bojańczyk, M., Muscholl, A., Schwentick, T., and Segoufin, L. 2009. Two-variable logic on data trees and XML reasoning. J. ACM 56, 3.
Bojańczyk, M., Muscholl, A., Schwentick, T., Segoufin, L., and David, C. 2006. Two-variable logic on words with data. In 21st IEEE Symp. on Logic in Comput. Sci. (LICS). IEEE Comput. Soc., 7–16.
Bray, T., Paoli, J., and Sperberg-McQueen, C. 1998. Extensible markup language (XML) 1.0. W3C Recommendation.
Brzozowski, J. A. and Leiss, E. L. 1980. On equations for regular languages, finite automata, and sequential networks. Theor. Comput. Sci. 10, 1, 19–35.
Clark, J. and DeRose, S. 1999. XML path language (XPath). W3C Recommendation.
David, C. 2004. Mots et données infinies. M.S. thesis, Laboratoire d'Informatique Algorithmique: Fondements et Applications, Paris.
deGroote, P., Guillaume, B., and Salvati, S. 2004. Vector addition tree automata. In 19th IEEE Symp. on Logic in Comput. Sci. (LICS). IEEE Comput. Soc., 64–73.
Demri, S. and Lazić, R. 2009. LTL with the freeze quantifier and register automata. ACM Trans. On Comp. Logic 10, 3, 30 pp.
Figueira, D. 2009. Satisfiability of downward XPath with data equality tests. In 28th ACM SIGACT-SIGMOD-SIGART Symp. on Princ. of Database Syst. (PODS). ACM, 197–206.
Finkel, A. and Schnoebelen, P. 2001. Well-structured transition systems everywhere! Theor. Comput. Sci. 256, 1–2, 63–92.
Geerts, F. and Fan, W. 2005. Satisfiability of XPath queries with sibling axes. In Database Program. Lang., 10th Int. Symp. (DBPL). Lect. Notes Comput. Sci., vol. 3774. Springer, 122–137.
Hallé, S., Villemaire, R., and Cherkaoui, O. 2006. CTL model checking for labelled tree queries. In 13th Int. Symp. on Temporal Representation and Reasoning (TIME). IEEE Comput. Soc., 27–35.
Higman, G. 1952. Ordering by divisibility in abstract algebras. Proc. London Math. Soc. (3) 2, 7, 326–336.
Jurdziński, M. and Lazić, R. 2007. Alternation-free modal mu-calculus for data trees. In 22nd IEEE Symp. on Logic in Comput. Sci. (LICS). IEEE Comput. Soc., 131–140.
Kaminski, M. and Francez, N. 1994. Finite-memory automata. Theor. Comput. Sci. 134, 2, 329–363.
Kaminski, M. and Tan, T. 2008. Tree automata over infinite alphabets. In Pillars of Comput. Sci., Essays Ded. to Boris (Boaz) Trakhtenbrot on the Occ. of His 85th Birthday. Lect. Notes Comput. Sci., vol. 4800. Springer, 386–423.
Lazić, R. 2006. Safely freezing LTL. In FSTTCS: Found. of Softw. Technology and Theor. Comput. Sci., 26th Int. Conf. Lect. Notes Comput. Sci., vol. 4337. Springer, 381–392. A revised and extended version is available at http://arxiv.org/abs/0802.4237.
Löding, C. and Thomas, W. 2000. Alternating automata and logics over infinite words. In IFIP TCS. Lect. Notes Comput. Sci., vol. 1878. Springer, 521–535.
Muller, D. E., Saoudi, A., and Schupp, P. E. 1986. Alternating automata, the weak monadic theory of the tree, and its complexity. In Automata, Lang. and Program., 13th Int. Coll. (ICALP). Lect. Notes Comput. Sci., vol. 226. Springer, 275–283.
Neven, F., Schwentick, T., and Vianu, V. 2004. Finite state machines for strings over infinite alphabets. ACM Trans. On Comp. Logic 5, 3, 403–435.
Olteanu, D., Furche, T., and Bry, F. 2004. An efficient single-pass query evaluator for XML data streams. In ACM Symp. on Applied Comput. (SAC). ACM, 627–631.
Sakamoto, H. and Ikeda, D. 2000. Intractability of decision problems for finite-memory automata. Theor. Comput. Sci. 231, 2, 297–308.
Segoufin, L. 2006. Automata and logics for words and trees over an infinite alphabet. In Comput. Sci. Logic (CSL), 20th Int. Works. Lect. Notes Comput. Sci., vol. 4207. Springer, 41–57.

Received May 2008; revised March 2010; accepted June 2010

ACM Transactions on Computational Logic, Vol. V, No. N, Month 20YY.
