Visibly Tree Automata with Memory and Constraints

Logical Methods in Computer Science, Vol. 4 (2:8) 2008, pp. 1–36, www.lmcs-online.org. Submitted Sep. 20, 2007; published Jun. 18, 2008.

HUBERT COMON-LUNDH (a), FLORENT JACQUEMARD (b), AND NICOLAS PERRIN (c)

(a) LSV, CNRS/ENS Cachan, e-mail address: h.comon-lundh@aist.go.jp
(b) INRIA Saclay & LSV (CNRS/ENS Cachan), e-mail address: florent.jacquemard@inria.fr
(c) ENS Lyon, e-mail address: nicolas.perrin@ens-lyon.fr

Abstract. Tree automata with one memory were introduced in 2001. They generalize both pushdown (word) automata and the tree automata with constraints of equality between brothers of Bogaert and Tison. Though it has a decidable emptiness problem, the main weakness of this model is its lack of good closure properties.

We propose a generalization of the visibly pushdown automata of Alur and Madhusudan to a family of tree recognizers which carry along their (bottom-up) computation an auxiliary unbounded memory with a tree structure (instead of a symbol stack). In other words, these recognizers, called Visibly Tree Automata with Memory (VTAM), define a subclass of tree automata with one memory enjoying Boolean closure properties. We show in particular that they can be determinized and that problems like emptiness, membership, inclusion and universality are decidable for VTAM. Moreover, we propose several extensions of VTAM whose transitions may be constrained by different kinds of tests between memories and also constraints a la Bogaert and Tison. We show that some of these classes of constrained VTAM keep the good closure and decidability properties, and we demonstrate their expressiveness with relevant examples of tree languages.

Introduction

The control flow of programs with calls to functions can be abstracted as pushdown systems. This makes it possible to reduce some program verification problems to problems (e.g.
model-checking) on pushdown automata. When it comes to functional languages with continuation passing style, the stack must contain information on continuations and has the structure of a dag (for jumps). Similarly, in the context of asynchronous concurrent programming languages, for two concurrent threads the ordering of return is not determined (synchronized) and these threads cannot be stacked. In these cases, the control flow is better modeled as a tree structure rather than a stack. That is why we are interested in tree automata with one memory, which generalize the pushdown (tree) automata, replacing the stack with a tree. Here, a "memory" has to be understood as a storage device, whose structure is a tree. For instance, two memories would correspond to two storage devices whose access would be independent.

1998 ACM Subject Classification: F.1.1; F.1.2; I.2.2; I.2.3.
Key words and phrases: Tree automata, Pushdown automata, Alternating automata, Symbolic constraints, First-order theorem proving.
An extended abstract containing some of the results presented in this paper has appeared in the proceedings of FOSSACS'07.
Logical Methods in Computer Science, DOI:10.2168/LMCS-4(2:8)2008. © H. Comon-Lundh, F. Jacquemard, and N. Perrin. Creative Commons.

The tree automata with one memory introduced in [7] compute bottom-up on a tree, with an auxiliary memory carrying a tree, as in former works such as [14]. Along a computation, at any node of the tree, the memory is updated incrementally from the memory reached at the sons of the node. This update may consist in building a new tree from the memories at the sons (this generalizes a push) or retrieving a subtree of one of the memories at the sons (this generalizes a pop).
In addition, such automata may perform equality tests: a transition may be constrained to be performed only when the memories reached at some of the sons are identical. In this way, tree automata with one memory also generalize certain cases of tree automata with equality and disequality tests between brothers [4].

Automata with one memory have been introduced in the context of the verification of security protocols, where the messages exchanged are represented as trees. In the context of (functional or concurrent) programs, the creation of a thread, or a callcc, corresponds to a push, and the termination of a thread or a callcc corresponds to a pop. The emptiness problem for such automata is in EXPTIME (note that for the extension with a second memory the emptiness problem becomes undecidable). However, the class of tree languages defined by such automata is neither closed by intersection nor by complement. This is not surprising as they are strictly more general than context-free languages.

On the other hand, Alur and Madhusudan have introduced the notion of visibility for pushdown automata [2], which is a relevant restriction in the context of control flow analysis. With this restriction, determinization is possible and actually the class of languages is closed under Boolean operations.

In this paper, we propose the new formalism of Visibly Tree Automata with Memory (VTAM). On one hand, it extends visibly pushdown languages to the recognition of trees, and with a tree structure instead of a stack, following former approaches [14, 21, 10]. On the other hand, VTAM restrict tree automata with one memory, imposing a visibility condition on the transitions: each symbol is assigned a given type of action. When reading a symbol, the automaton can only perform the assigned type of action: push or pop.
We first show in Section 2 that VTAM can be determinized, using a proof similar to the proof of [2], and do have the good closure properties. The main difficulty here is to understand what is a good notion of visibility for trees, with memories instead of stacks. We also show that the problems of membership and emptiness are decidable in deterministic polynomial time for VTAM.

In a second part of the paper (Section 3), we extend VTAM with constraints. Our constraints here are recognizable relations; a transition can be fired only if the memory contents of the sons of the current node satisfy such a relation. We then give a general theorem, expressing conditions on the relations which ensure the decidability of emptiness. Such conditions are shown to be necessary on one hand, and, on the other hand, we prove that they are satisfied by some examples, including syntactic equality and disequality tests and structural equality and disequality tests.

The case of VTAM with structural equality and disequality tests (this class is denoted $\mathrm{VTAM}_{\equiv}^{\not\equiv}$) is particularly interesting, since the determinization and closure properties of Section 2 carry over to this generalization, which we show in Section 3.4.2. The automata of $\mathrm{VTAM}_{\equiv}^{\not\equiv}$ also enjoy a good expressive power, as we show in Section 3.7 by presenting some non-trivial examples of languages in this class: well-balanced binary trees, red-black trees, powerlists...

As an intermediate result, we show that, in case of equality tests or structural equality tests, the language of memories that can be reached in a given state is always a regular language. This is a generalization of the well-known result that the set of stack contents in a pushdown automaton is always regular.
To prove this, we observe that the memory contents are recognized by a two-way alternating tree automaton with constraints. Then we show, using a saturation strategy, that two-way alternating tree automata with (structural) equality constraints are not more expressive than standard tree automata.

Finally, in Section 4 we propose a class of visibly tree automata which combines the structural constraints of $\mathrm{VTAM}_{\equiv}^{\not\equiv}$, testing memory contents, with the Bogaert-Tison constraints of [4] (equality and disequality tests between brother subterms), which operate on the input term. We show that the tree automata of this class can be determinized, are closed under Boolean operations and have a decidable emptiness problem.

Related Work. Generalizations of pushdown automata to trees (both for input and stack) are proposed in [14, 21, 10]. Our contributions are the generalization of the visibility condition of [2] to such tree automata (our VTAM without constraints strictly generalize the VP Languages of [2]), and the addition of constraints on the stack contents. The visibly tree automata of [1] use a word stack, which is less general than a tree-structured memory, but the comparison with VTAM is not easy as they are alternating and compute top-down on infinite trees.

Independently, Chabin and Rety have proposed [5] a formalism combining the pushdown tree automata of [14] with the concept of visibly pushdown languages. Their automata recognize finite trees using a word stack. They have a decidable emptiness problem and the corresponding tree languages (Visibly Pushdown Tree Languages, VPTL) are closed under Boolean operations. Following remarks of one of these two authors, it appeared that VTAM and VPTL are incomparable, see Section 2.2.

1. Preliminaries

1.1. Term algebra. A signature $\Sigma$ is a finite set of function symbols with arity, denoted by $f, g, \ldots$
We write $\Sigma_n$ for the subset of function symbols of $\Sigma$ of arity $n$. Given an infinite set $X$ of variables, the set of terms built over $\Sigma$ and $X$ is denoted $T(\Sigma, X)$, and the subset of ground terms is denoted $T(\Sigma)$. The set of variables occurring in a term $t \in T(\Sigma, X)$ is denoted $vars(t)$. A substitution $\sigma$ is a mapping from $X$ to $T(\Sigma, X)$ such that $\{x \mid \sigma(x) \neq x\}$, the support of $\sigma$, is finite. The application of a substitution $\sigma$ to a term $t$ is written $t\sigma$. It is the homomorphic extension of $\sigma$ to $T(\Sigma, X)$. The positions $Pos(t)$ in a term $t$ are sequences of positive integers ($\Lambda$, the empty sequence, is the root position). A subterm of $t$ at position $p$ is written $t|_p$, and the replacement in $t$ of the subterm at position $p$ by $u$ is denoted $t[u]_p$.

1.2. Rewriting. We assume standard definitions and notations for term rewriting [11]. A term rewriting system (TRS) over a signature $\Sigma$ is a finite set of rewrite rules $\ell \to r$, where $\ell \in T(\Sigma, X)$ and $r \in T(\Sigma, vars(\ell))$. A term $t \in T(\Sigma, X)$ rewrites to $s$ by a TRS $R$ (denoted $t \to_R s$) if there is a rewrite rule $\ell \to r \in R$, a position $p$ of $t$ and a substitution $\sigma$ such that $t|_p = \ell\sigma$ and $s = t[r\sigma]_p$. The transitive and reflexive closure of $\to_R$ is denoted $\to^*_R$.

1.3. Tree Automata. Following the definitions and notation of [8], we consider tree automata which compute bottom-up (from leaves to root) on (finite) ground terms in $T(\Sigma)$. At each stage of computation on a tree $t$, a tree automaton reads the function symbol $f$ at the current position $p$ in $t$ and updates its current state, according to $f$ and to the respective states reached at the positions immediately under $p$ in $t$.
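The bottom-up computation just described can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from the paper: terms are encoded as nested tuples `(symbol, child, ...)`, transitions as a dictionary from `(symbol, tuple_of_child_states)` to a set of target states, and the example automaton (evaluating Boolean terms) is hypothetical.

```python
from itertools import product

def reachable_states(term, rules):
    """Return the set of states reachable at the root of `term`, bottom-up."""
    symbol, children = term[0], term[1:]
    child_states = [reachable_states(child, rules) for child in children]
    reached = set()
    # Try every combination of states reached at the sons (nondeterminism).
    for combo in product(*child_states):
        reached |= rules.get((symbol, combo), set())
    return reached

def accepts(term, rules, final_states):
    """A term is accepted when some final state is reachable at its root."""
    return bool(reachable_states(term, rules) & final_states)

# Hypothetical automaton: state q1 means "evaluates to true", q0 "to false".
RULES = {("true", ()): {"q1"}, ("false", ()): {"q0"}}
for x, y in product(("q0", "q1"), repeat=2):
    RULES[("and", (x, y))] = {"q1" if (x, y) == ("q1", "q1") else "q0"}
    RULES[("or", (x, y))] = {"q0" if (x, y) == ("q0", "q0") else "q1"}
FINAL = {"q1"}
```

For instance, `accepts(("and", ("true",), ("or", ("false",), ("true",))), RULES, FINAL)` holds under this encoding, since the root is reached in state `q1`.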
Formally, a bottom-up tree automaton (TA) $A$ on a signature $\Sigma$ is a tuple $(Q, Q_f, \Delta)$ where $\Sigma$ is the computation signature, $Q$ is a finite set of nullary state symbols, disjoint from $\Sigma$, $Q_f \subseteq Q$ is the subset of final states and $\Delta$ is a set of rewrite rules of the form $f(q_1, \ldots, q_n) \to q$, where $f \in \Sigma$ and $q_1, \ldots, q_n \in Q$. A term $t$ is accepted (we may also write recognized) by $A$ in state $q$ iff $t \to^*_\Delta q$, and the language $L(A, q)$ of $A$ in state $q$ is the set of ground terms accepted in $q$. The language $L(A)$ of $A$ is $\bigcup_{q \in Q_f} L(A, q)$ and a set of ground terms is called regular if it is the language of a TA.

2. Visibly Tree Automata with Memory

We propose in this section a subclass of the tree automata with one memory [7] which is stable under Boolean operations and has decidable emptiness and membership problems.

2.1. Definition of VTAM. Tree automata have been extended [14, 21, 10, 7] to carry an unbounded information along the states in computations. In [7], this information is stored in a tree structure and is called memory. We keep this terminology here, and call our recognizers tree automata with memory (TAM). For consistency with the above formalisms, the memory contents will be ground terms over a memory signature $\Gamma$.

Like for TA, we consider bottom-up computations of TAM in trees; at each stage of computation on a tree $t$, a TAM, like a TA, reads the function symbol at the current position $p$ in $t$ and updates its current state, according to the states reached immediately under $p$. Moreover, a configuration of TAM contains not only a state but also a memory, which is a tree. The current memory is updated according to the respective contents of memories reached in the nodes immediately under $p$ in $t$.

As above, we use term rewrite systems in order to define the transitions allowed in a TAM.
For this purpose, we add an argument to state symbols, which will contain the memory. Hence, a configuration of TAM in state $q$ and whose memory content is the ground term $m \in T(\Gamma)$ is represented by the term $q(m)$. We propose below a very general definition of TAM. It is similar to the one of [7], except that we have here general patterns $m_1, \ldots, m_n, m$, while these patterns are restricted in [7], for instance avoiding memory duplications. Since we aim at providing closure and decision properties, we will also impose (other) restrictions later on.

Definition 2.1. A bottom-up tree automaton with memory (TAM) on a signature $\Sigma$ is a tuple $(\Gamma, Q, Q_f, \Delta)$ where $\Gamma$ is a memory signature, $Q$ is a finite set of unary state symbols, disjoint from $\Sigma \cup \Gamma$, $Q_f \subseteq Q$ is the subset of final states and $\Delta$ is a set of rewrite rules of the form
$$f(q_1(m_1), \ldots, q_n(m_n)) \to q(m)$$
where $f \in \Sigma_n$, $q_1, \ldots, q_n, q \in Q$ and $m_1, \ldots, m_n, m \in T(\Gamma, X)$.

The rules of $\Delta$ are also called transition rules. A term $t$ is accepted by $A$ in state $q \in Q$ and with memory $m \in T(\Gamma)$ iff $t \to^*_\Delta q(m)$, and the language $L(A, q)$ and memory language $M(A, q)$ of $A$ in state $q$ are respectively defined by:
$$L(A, q) = \{ t \mid \exists m \in T(\Gamma),\ t \to^*_\Delta q(m) \}$$
$$M(A, q) = \{ m \mid \exists t \in T(\Sigma),\ t \to^*_\Delta q(m) \}.$$
The language of $A$ is the union of the languages of $A$ in its final states, denoted:
$$L(A) = \bigcup_{q \in Q_f} L(A, q).$$

Visibility Condition. The above formalism is of course far too expressive. As there are no restrictions on the operation performed on memory by the rewrite rules, one can easily encode a Turing machine as a TAM. We shall now define a decidable restriction called visibly tree automata with memory (VTAM).
First, we consider only three main families (later divided into the subcategories defined in Figure 1) of operations on memory. We assume below a computation step at some position $p$ of a term, where memories $m_1, \ldots, m_n$ have been reached at the positions immediately below $p$:

PUSH: the new current memory $m$ is built with a symbol $h \in \Gamma_n$ pushed on the top of memories $m_1, \ldots, m_n$: $f(q_1(m_1), \ldots, q_n(m_n)) \to q(h(m_1, \ldots, m_n))$. According to the terminology of [2], this corresponds to a call move in a program represented by an automaton.

POP: the new current memory is a subterm of one of the memories reached so far: $f(\ldots, q_i(h(m'_1, \ldots, m'_k)), \ldots) \to q(m'_j)$. The top symbol $h$ of $m_i$ is also read. This corresponds to a function's return in a program. We have here to split POP operations into four categories, depending on whether we pop on the memory at the left son or on the memory at the right son and on whether we get the left son of that memory or its right son.

INT (internal): the new current memory is one of the memories reached: $f(q_1(m_1), \ldots, q_n(m_n)) \to q(m_i)$. This corresponds to an internal operation (neither call nor return) in a function of a program. Again, we need to split INT operations into three categories: one for constant symbols and two for binary symbols, depending on which of the two sons' memories we keep.

Next, we adhere to the visibility condition of [2]. The idea behind this restriction, which was already in [16], is that the symbol read by an automaton (in a term in our case and in [1], in a word in the case of [2]) corresponds to an instruction of a program, and hence belongs to one of the three above families (call, return or internal).
Indeed, the effect of the execution of a given instruction on the current program state (a stack for [2] or a tree in our case) will always be in the same family. In other words, in this context, the family of the memory operations performed by a transition is completely determined by the function symbol read.

Let us assume from now on, for the sake of simplicity, the following restriction on the arity of symbols: all the symbols of $\Sigma$ and $\Gamma$ have either arity 0 or 2. This is not a real restriction, and the results of this paper can be extended straightforwardly to the case of function symbols with other arities.

$$
\begin{array}{lll}
\mathrm{PUSH} & a \to q(c) & a \in \Sigma_{\mathrm{PUSH}} \\
\mathrm{PUSH} & f(q_1(y_1), q_2(y_2)) \to q(h(y_1, y_2)) & f \in \Sigma_{\mathrm{PUSH}} \\
\mathrm{POP}_{11} & f(q_1(h(y_{11}, y_{12})), q_2(y_2)) \to q(y_{11}) & f \in \Sigma_{\mathrm{POP}_{11}} \\
 & f(q_1(\bot), q_2(y_2)) \to q(\bot) & \\
\mathrm{POP}_{12} & f(q_1(h(y_{11}, y_{12})), q_2(y_2)) \to q(y_{12}) & f \in \Sigma_{\mathrm{POP}_{12}} \\
 & f(q_1(\bot), q_2(y_2)) \to q(\bot) & \\
\mathrm{POP}_{21} & f(q_1(y_1), q_2(h(y_{21}, y_{22}))) \to q(y_{21}) & f \in \Sigma_{\mathrm{POP}_{21}} \\
 & f(q_1(y_1), q_2(\bot)) \to q(\bot) & \\
\mathrm{POP}_{22} & f(q_1(y_1), q_2(h(y_{21}, y_{22}))) \to q(y_{22}) & f \in \Sigma_{\mathrm{POP}_{22}} \\
 & f(q_1(y_1), q_2(\bot)) \to q(\bot) & \\
\mathrm{INT}_0 & a \to q(\bot) & a \in \Sigma_{\mathrm{INT}_0} \\
\mathrm{INT}_1 & f(q_1(y_1), q_2(y_2)) \to q(y_1) & f \in \Sigma_{\mathrm{INT}_1} \\
\mathrm{INT}_2 & f(q_1(y_1), q_2(y_2)) \to q(y_2) & f \in \Sigma_{\mathrm{INT}_2}
\end{array}
$$
where $q_1, q_2, q \in Q$, $y_1, y_2$ (and likewise $y_{11}, y_{12}, y_{21}, y_{22}$) are distinct variables of $X$, $c \in \Gamma_0$, $h \in \Gamma_2$.

Figure 1: VTAM transition categories.

The signature $\Sigma$ is partitioned in eight subsets:
$$\Sigma = \Sigma_{\mathrm{PUSH}} \uplus \Sigma_{\mathrm{POP}_{11}} \uplus \Sigma_{\mathrm{POP}_{12}} \uplus \Sigma_{\mathrm{POP}_{21}} \uplus \Sigma_{\mathrm{POP}_{22}} \uplus \Sigma_{\mathrm{INT}_0} \uplus \Sigma_{\mathrm{INT}_1} \uplus \Sigma_{\mathrm{INT}_2}$$
The eight corresponding categories of transitions (transitions of the same category perform the same kind of operation on the memory) are defined formally in Figure 1.
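To make the categories of Figure 1 concrete, here is a minimal sketch of how one might compute the configurations $q(m)$ reachable on a ground term. The encoding is an assumption made for illustration and is not taken from the paper: a rule is a tuple `(category, symbol, source_states, target_state, memory_symbol)`, memories are nested tuples, the empty memory is `("bot",)`, and a POP rule with memory symbol `"bot"` stands for the variant reading an empty memory. The helper `is_visibly` checks that each symbol is used in a single category, as the partition of the signature requires.

```python
from itertools import product

BOT = ("bot",)  # the empty memory

def step(rule, memories):
    """Memory produced by `rule` from the sons' memories, or None if stuck."""
    category, _symbol, _sources, _target, mem_symbol = rule
    if category == "PUSH":
        return (mem_symbol, *memories)         # push a symbol on top
    if category == "INT0":
        return BOT                             # constants start empty
    if category in ("INT1", "INT2"):
        return memories[int(category[3]) - 1]  # keep one son's memory
    # POPij: pop the memory of son i and keep its j-th subterm.
    son, child = int(category[3]) - 1, int(category[4])
    popped = memories[son]
    if mem_symbol == "bot":                    # variant reading an empty memory
        return BOT if popped == BOT else None
    if len(popped) == 3 and popped[0] == mem_symbol:
        return popped[child]
    return None

def configurations(term, rules):
    """Set of pairs (state, memory) reachable at the root of `term`."""
    symbol, children = term[0], term[1:]
    child_configs = [configurations(child, rules) for child in children]
    reached = set()
    for combo in product(*child_configs):
        states = tuple(q for q, _ in combo)
        memories = tuple(m for _, m in combo)
        for rule in rules:
            if rule[1] == symbol and rule[2] == states:
                memory = step(rule, memories)
                if memory is not None:
                    reached.add((rule[3], memory))
    return reached

def is_visibly(rules):
    """True iff every symbol is used in exactly one transition category."""
    category_of = {}
    for category, symbol, *_ in rules:
        if category_of.setdefault(symbol, category) != category:
            return False
    return True

# Hypothetical one-state automaton: a and f push, g pops, e is internal.
RULES = [
    ("PUSH", "a", (), "q", "c"),
    ("PUSH", "f", ("q", "q"), "q", "h"),
    ("POP11", "g", ("q", "q"), "q", "h"),
    ("INT0", "e", (), "q", None),
]
```

With these hypothetical rules, the term `g(f(a, e), e)` first builds the memory `h(c, bot)` at `f`, and the POP at `g` then returns its left subterm `c`. This direct evaluation is only meant to illustrate the rule shapes; it makes no attempt at efficiency.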
In this figure, one constant symbol has a particular role: $\bot$ is a special constant symbol in $\Gamma$, used to represent an empty memory. Note that there are three categories for INT: $\mathrm{INT}_0$ is for constant symbols and $\mathrm{INT}_1$, $\mathrm{INT}_2$ are for binary symbols and differ according to the memory which is kept. Similarly, there are four variants of POP transitions, $\mathrm{POP}_{11}, \ldots, \mathrm{POP}_{22}$. Moreover, each POP rule has a variant which reads an empty memory (i.e. the symbol $\bot$).

Definition 2.2. A visibly tree automaton with memory (or VTAM for short) on $\Sigma$ is a TAM $(\Gamma, Q, Q_f, \Delta)$ such that every rule of $\Delta$ belongs to one of the above categories PUSH, $\mathrm{POP}_{11}$, $\mathrm{POP}_{12}$, $\mathrm{POP}_{21}$, $\mathrm{POP}_{22}$, $\mathrm{INT}_0$, $\mathrm{INT}_1$, $\mathrm{INT}_2$.

2.2. Expressiveness, Comparison. Standard bottom-up tree automata are particular cases of VTAM (simply assume all the symbols of the signature in $\mathrm{INT}_0$ or $\mathrm{INT}_1$).

Now, let us try to explain more precisely the relation with the visibly pushdown languages of [2], when considering finite word languages. If the stack is empty in any accepting configuration of some finite word pushdown automaton $A$, then it is easy to compute a pushdown automaton $\widetilde{A}$ which accepts the reverses (mirror images) of the words accepted by $A$. Moreover, if $A$ is a visibly pushdown automaton, then $\widetilde{A}$ is also a visibly pushdown automaton: it suffices to exchange the push and pop symbols.

For pushdown word languages, there is a well-known lemma showing that the recognition by final state is equivalent to the recognition by empty stack. This equivalence however requires $\epsilon$-transitions to empty the stack when a final state is reached. There are however no $\epsilon$-transitions in visibly pushdown automata.
So, if we consider for instance the language of words $w \in \{a, b\}^*$ such that any prefix of $w$ contains more $a$'s than $b$'s, it is recognized by a visibly pushdown automaton. In contrast, its mirror image (all suffixes contain more $a$'s than $b$'s) is not recognized by a visibly pushdown automaton.

In conclusion, as long as visibility is relevant, the way the automaton is moving is also relevant. This applies of course to trees as well: there is a difference between top-down and bottom-up recognition.

Now, if we encode a word as a tree on a unary alphabet, starting from right to left, VTAM generalize visibly pushdown automata: moving bottom-up in the tree corresponds to moving left-right in the word.

VPTA transitions and VPTL are defined in [5] in the same formalism (rewrite rules) as in Figure 1, except that the rules are oriented in the other direction (top-down computations) and the memory contains a word, i.e. terms built with unary function symbols and one constant (empty stack).

As sketched above, since the automata of [5] work top-down, a language can be recognized by a VTAM (which works bottom-up) and not by a VPTA. As a typical example, consider the trees containing only unary symbols $a$, $b$ and a constant $0$ and such that all subterms contain more $a$'s than $b$'s. But the converse is also true: there are similarly languages that are recognized by VPTA and not by VTAM (and there, constraints cannot help!)
Now, if we consider a slight modification of VPTA, in which the automata work bottom-up (simply change the direction of transition rules), it is not clear that good properties (closure and decision) are preserved since, now, we get equality tests between memory contents, increasing the original expressive power; when going top-down we always duplicate the memory content and send one copy to each son, while going bottom-up we may have different memory contents at two brother positions.

2.3. Determinism. A VTAM $A$ is said to be complete if every term of $T(\Sigma)$ belongs to $L(A, q)$ for at least one state $q \in Q$. Every VTAM can be completed (with a polynomial overhead) by the addition of a trash state. Hence, we shall consider from now on only complete VTAM.

A VTAM $A = (\Gamma, Q, Q_f, \Delta)$ is said to be deterministic iff:
• for all $a \in \Sigma_{\mathrm{INT}_0}$ there is at most one rule in $\Delta$ with left member $a$,
• for all $f \in \Sigma_{\mathrm{PUSH}} \cup \Sigma_{\mathrm{INT}_1} \cup \Sigma_{\mathrm{INT}_2}$, for all $q_1, q_2 \in Q$, there is at most one rule in $\Delta$ with left member $f(q_1(y_1), q_2(y_2))$,
• for all $f \in \Sigma_{\mathrm{POP}_{11}} \cup \Sigma_{\mathrm{POP}_{12}}$ (respectively $\Sigma_{\mathrm{POP}_{21}} \cup \Sigma_{\mathrm{POP}_{22}}$), for all $q_1, q_2 \in Q$ and all $h \in \Gamma$, there is at most one rule in $\Delta$ with left member $f(q_1(h(y_{11}, y_{12})), q_2(y_2))$ (respectively $f(q_1(y_1), q_2(h(y_{21}, y_{22})))$).

Theorem 2.3. For every VTAM $A = (\Gamma, Q, Q_f, \Delta)$ there exists a deterministic VTAM $A^{det} = (\Gamma^{det}, Q^{det}, Q^{det}_f, \Delta^{det})$ such that $L(A) = L(A^{det})$, where $|Q^{det}|$ and $|\Gamma^{det}|$ both are $O(2^{|Q|^2})$.

Proof. We follow the technique of [2] for the determinization of visibly pushdown automata: we do a subset construction and postpone the application (to the memory) of PUSH rules, until a matching POP is met. The construction of [2] is extended in order to handle the branching structure of the term read and of the memory.
With the visibility condition, for each symbol read, only one kind of memory operation is possible. This permits a uniform construction of the rules of $A^{det}$ for each symbol of $\Sigma$. As we shall see below, $A^{det}$ does not need to keep track of the contents of memory (of $A$) during its computation; it only needs to memorize information on the reachability of states of $A$, following the path (in the term read) from the position of the PUSH symbol which has pushed the top symbol of the current memory (let us call it the last-memory-push-position) to the current position in the term. We let:
$$Q^{det} := \{0, 1\} \times \mathcal{P}(Q) \times \mathcal{P}(Q^2)$$
$Q^{det}_f$ is the subset of states whose second component contains a final state of $Q_f$. The first component is a flag indicating whether the memory is currently empty (value 0) or not (value 1). The second component is the subset of states of $Q$ that $A$ can reach at the current position, and the third component is a binary relation on $Q$ which contains $(q, q')$ iff starting from a state $q$ and memory $m$ at the last-memory-push-position, $A$ can reach the current position in state $q'$, and with the same memory $m$.

We consider memory symbols made of pairs of states and PUSH symbols:
$$\Gamma^{det} := (Q^{det})^2 \times \Sigma_{\mathrm{PUSH}}$$
The components of a symbol $p \in \Gamma^{det}$ refer to the transition which pushed $p$: the first and second components of $p$ are respectively the left and right initial states of the transition and the third component is the symbol read.

The transition rules of $\Delta^{det}$ are given below, according to the symbol read.

INT.
For every $f \in \Sigma_{\mathrm{INT}_1}$, we have the following rules in $\Delta^{det}$:
$$f(\langle b_1, R_1, S_1 \rangle(y_1), \langle b_2, R_2, S_2 \rangle(y_2)) \to \langle b_1, R, S \rangle(y_1)$$
where
$$R := \{ q \mid \exists q_1 \in R_1, q_2 \in R_2,\ f(q_1(y_1), q_2(y_2)) \to q(y_1) \in \Delta \},$$
and $S$ is the update of $S_1$ according to the $\mathrm{INT}_1$-transitions of $\Delta$, when $b_1 = 1$ (the case $b_1 = 0$ is similar):
$$S := \{ (q, q') \mid \exists q_1 \in Q, q_2 \in R_2,\ (q, q_1) \in S_1 \text{ and } f(q_1(y_1), q_2(y_2)) \to q'(y_1) \in \Delta \}.$$
The case $f \in \Sigma_{\mathrm{INT}_2}$ is similar.

PUSH. For every $f \in \Sigma_{\mathrm{PUSH}}$, we have the following rules in $\Delta^{det}$:
$$f(\langle b_1, R_1, S_1 \rangle(y_1), \langle b_2, R_2, S_2 \rangle(y_2)) \to \langle 1, R, Id_Q \rangle(p(y_1, y_2))$$
where
$$R := \{ q \mid \exists q_1 \in R_1, q_2 \in R_2, h \in \Gamma,\ f(q_1(y_1), q_2(y_2)) \to q(h(y_1, y_2)) \in \Delta \},$$
$Id_Q := \{ (q, q) \mid q \in Q \}$ is used to initialize the memorization of state reachability from the position of the symbol $f$, and $p := \langle \langle b_1, R_1, S_1 \rangle, \langle b_2, R_2, S_2 \rangle, f \rangle$. Note that the two states reached just below the position of application of this rule are pushed on the top of the memory. They will be used later in order to update $R$ and $S$ when a matching POP symbol is read.

POP. For every $f \in \Sigma_{\mathrm{POP}_{11}}$, we have the following rules in $\Delta^{det}$:
$$f(\langle b_1, R_1, S_1 \rangle(H(y_{11}, y_{12})), \langle b_2, R_2, S_2 \rangle(y_2)) \to \langle b, R, S \rangle(y_{11})$$
where $H = \langle Q_1, Q_2, g \rangle$, with $Q_1 = \langle b'_1, R'_1, S'_1 \rangle \in Q^{det}$, $Q_2 = \langle b'_2, R'_2, S'_2 \rangle \in Q^{det}$.
$$b = b'_1$$
$$R = \{ q \mid \exists q'_1 \in R'_1, q'_2 \in R'_2, (q_0, q_1) \in S_1, q_2 \in R_2, h \in \Gamma,\ g(q'_1(y_1), q'_2(y_2)) \to q_0(h(y_1, y_2)) \in \Delta,\ f(q_1(h(y_{11}, y_{12})), q_2(y_2)) \to q(y_{11}) \in \Delta \}$$
$$S = \{ (q, q') \mid \exists q'_1 \in S'_1(q), q'_2 \in R'_2, (q_0, q_1) \in S_1, q_2 \in R_2, h \in \Gamma,\ g(q'_1(y_1), q'_2(y_2)) \to q_0(h(y_1, y_2)) \in \Delta,\ f(q_1(h(y_{11}, y_{12})), q_2(y_2)) \to q'(y_{11}) \in \Delta \}$$
When a POP symbol is read, the top symbol of the memory, which is popped, contains the states reached just before the application of the matching PUSH. We use this information in order to update $\langle b_1, R_1, S_1 \rangle$ and $\langle b_2, R_2, S_2 \rangle$ to $\langle b, R, S \rangle$. The cases $f \in \Sigma_{\mathrm{POP}_{12}}$, $f \in \Sigma_{\mathrm{POP}_{21}}$, $f \in \Sigma_{\mathrm{POP}_{22}}$ are similar.

The above constructions ensure the three invariants stated above, after the definition of $Q^{det}$ and corresponding to the three components of these states. It follows that $L(A) = L(A^{det})$.

2.4. Closure Properties. The tree automata with one memory of [7] are closed under union but not closed under intersection and complement (even their version without constraints). The visibility condition makes these closures possible for VTAM.

Theorem 2.4. The class of tree languages of VTAM is closed under Boolean operations. One can construct VTAM for union, intersection and complement of given VTAM languages whose sizes are respectively linear, quadratic and exponential in the size of the initial VTAM.

Proof. Let $A_1 = (\Gamma_1, Q_1, Q_{f,1}, \Delta_1)$ and $A_2 = (\Gamma_2, Q_2, Q_{f,2}, \Delta_2)$ be two VTAM on $\Sigma$. We assume wlog that $Q_1$ and $Q_2$ are disjoint.

For the union of the languages of $A_1$ and $A_2$, we construct a VTAM $A_\cup$ whose memory signature, state set, final state set and rules set are the union of the respective memory signatures, state sets, final state sets and rules sets of the two given VTAM.
We have $L(A_\cup) = L(A_1) \cup L(A_2)$.
$$A_\cup = (\Gamma_1 \cup \Gamma_2,\ Q_1 \cup Q_2,\ Q_{f,1} \cup Q_{f,2},\ \Delta_1 \cup \Delta_2)$$
For the intersection of the languages of $A_1$ and $A_2$, we construct a VTAM $A_\cap$ whose memory signature, state set and final state set are the Cartesian product of the respective memory signatures, state sets and final state sets of the two given VTAM.
$$A_\cap = (\Gamma_1 \times \Gamma_2,\ Q_1 \times Q_2,\ Q_{f,1} \times Q_{f,2},\ \Delta_\cap)$$
The rule set $\Delta_\cap$ of the intersection VTAM $A_\cap$ is obtained by "product" of rules of the two given VTAM with the same function symbols. The product of rules means Cartesian products of the respective states and memory symbols pushed or popped. More precisely, $\Delta_\cap$ is the smallest set of rules such that:
• if $\Delta_1$ contains $f(q_{11}(y_1), q_{12}(y_2)) \to q_1(h_1(y_1, y_2))$ and $\Delta_2$ contains $f(q_{21}(y_1), q_{22}(y_2)) \to q_2(h_2(y_1, y_2))$, for some $f \in \Sigma_{\mathrm{PUSH}}$, then $\Delta_\cap$ contains $f(\langle q_{11}, q_{21} \rangle(y_1), \langle q_{12}, q_{22} \rangle(y_2)) \to \langle q_1, q_2 \rangle(\langle h_1, h_2 \rangle(y_1, y_2))$,
• if $\Delta_1$ contains $f(q_{11}(h_1(y_{11}, y_{12})), q_{12}(y_2)) \to q_1(y_{11})$ and $\Delta_2$ contains $f(q_{21}(h_2(y_{11}, y_{12})), q_{22}(y_2)) \to q_2(y_{11})$ for some $f \in \Sigma_{\mathrm{POP}_{11}}$, then $\Delta_\cap$ contains $f(\langle q_{11}, q_{21} \rangle(\langle h_1, h_2 \rangle(y_{11}, y_{12})), \langle q_{12}, q_{22} \rangle(y_2)) \to \langle q_1, q_2 \rangle(y_{11})$,
• similarly for $\mathrm{POP}_{12}$, $\mathrm{POP}_{21}$ and $\mathrm{POP}_{22}$,
• if $\Delta_1$ contains $f(q_{11}(y_1), q_{12}(y_2)) \to q_1(y_1)$ and $\Delta_2$ contains $f(q_{21}(y_1), q_{22}(y_2)) \to q_2(y_1)$ for some $f \in \Sigma_{\mathrm{INT}_1}$, then $\Delta_\cap$ contains $f(\langle q_{11}, q_{21} \rangle(y_1), \langle q_{12}, q_{22} \rangle(y_2)) \to \langle q_1, q_2 \rangle(y_1)$,
• and similarly for $\mathrm{INT}_2$, $\mathrm{INT}_0$.
We have then $L(A_\cap) = L(A_1) \cap L(A_2)$.
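The product of rules can be sketched in a few lines. The encoding is an assumption chosen for illustration, not the paper's notation: a rule is a tuple `(category, symbol, source_states, target_state, memory_symbol)`, where the memory symbol is `None` for INT rules and `"bot"` for the POP variant reading an empty memory. Two rules are paired only when they have the same category and read the same symbol, which is exactly what the visibility condition guarantees to be meaningful.

```python
def product_rules(rules1, rules2):
    """Pair up rules of two VTAM that read the same symbol in the same category."""
    paired = []
    for cat1, sym1, src1, tgt1, mem1 in rules1:
        for cat2, sym2, src2, tgt2, mem2 in rules2:
            if (cat1, sym1) != (cat2, sym2) or len(src1) != len(src2):
                continue
            if mem1 in (None, "bot") or mem2 in (None, "bot"):
                # INT rules carry no memory symbol, and an empty memory on one
                # side can only be matched by an empty memory on the other.
                if mem1 != mem2:
                    continue
                memory = mem1
            else:
                memory = (mem1, mem2)          # pushed or popped symbols are paired
            sources = tuple(zip(src1, src2))   # pairwise product of source states
            paired.append((cat1, sym1, sources, (tgt1, tgt2), memory))
    return paired
```

Under this encoding, the final states of the product would be the pairs made of one final state from each automaton, as in the definition of $A_\cap$.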
Note that the above product construction for $A_\cap$ is possible only because the visibility condition ensures that two rules with the same function symbol in the left-hand side will have the same form. Hence we can synchronize memory operations on the same symbols.

For the complement, we use the construction of Theorem 2.3 and a completion (this operation preserves determinism), and take the complement of the final state set of the VTAM obtained.

2.5. Decision Problems. Every VTAM is a particular case of tree automaton with one memory of [7]. Since the emptiness problem (whether the language accepted is empty or not) is decidable for this latter class, it is also decidable for VTAM. However, whereas this problem is EXPTIME-complete for the automata of [7], it is only PTIME for VTAM.

Theorem 2.5. The emptiness problem is PTIME-complete for VTAM.

Proof. Assume given a VTAM $A = (\Gamma, Q, Q_f, \Delta)$. By definition, for each state $q \in Q$, the language $L(A, q)$ is empty iff the memory language $M(A, q)$ is empty. For each state $q$, we introduce a predicate symbol $P_q$ and we construct Horn clauses in such a way that $P_q(m)$ belongs to the least Herbrand model of this set of clauses iff the configuration with state $q$ and memory $m$ is reachable by the automaton (i.e. $m \in M(A, q)$).

For such a construction (already given in [7]), we simply forget the function symbol, associating to a transition rule $f(q_1(m_1), q_2(m_2)) \to q(m)$ the Horn clause $P_{q_1}(m_1), P_{q_2}(m_2) \Rightarrow P_q(m)$.
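The translation just described is purely syntactic, as the following sketch suggests. The representation is hypothetical: a transition rule is a triple `(symbol, sources, target)` where `sources` is a list of `(state, memory_pattern)` pairs, `target` is a single such pair, and memory patterns are kept as plain strings.

```python
def horn_clause(rule):
    """Render the Horn clause associated with a transition rule."""
    _symbol, sources, (target_state, target_memory) = rule  # the symbol is forgotten
    body = ", ".join(f"P_{state}({memory})" for state, memory in sources)
    head = f"P_{target_state}({target_memory})"
    return f"{body} => {head}" if body else f"=> {head}"
```

For example, a hypothetical $\mathrm{POP}_{11}$ rule `("f", [("q1", "h(y11,y12)"), ("q2", "y2")], ("q", "y11"))` is rendered as `P_q1(h(y11,y12)), P_q2(y2) => P_q(y11)`, and a rule for a constant yields a clause with an empty body.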
Then, according to the restrictions in Definition 2.2, we get only Horn clauses of one of the following forms:
\[
\begin{array}{rcl}
 & \Rightarrow & P_q(c) \\
P_{q_1}(y_1),\ P_{q_2}(y_2) & \Rightarrow & P_q\bigl(h(y_1, y_2)\bigr) \\
P_{q_1}\bigl(h(y_{11}, y_{12})\bigr),\ P_{q_2}(y_2) & \Rightarrow & P_q(y_{11}) \\
P_{q_1}\bigl(h(y_{11}, y_{12})\bigr),\ P_{q_2}(y_2) & \Rightarrow & P_q(y_{12}) \\
P_{q_1}(\bot),\ P_{q_2}(y_2) & \Rightarrow & P_q(\bot) \\
P_{q_1}(y_1),\ P_{q_2}(y_2) & \Rightarrow & P_q(y_1)
\end{array}
\]
where all the variables are distinct. Such clauses belong to the class $\mathcal{H}_3$ of [19], for which it is proved in [19] that emptiness is decidable in cubic time. It follows that emptiness of VTAM is decidable in cubic time.

Hardness for PTIME follows from the PTIME-hardness of emptiness of finite tree automata [8].

Another proof relying on similar techniques, but for a more general result, will be stated in Lemma 3.7 and can be found in Appendix 5.

Universality is the problem of deciding whether a given automaton recognizes all ground terms. Inclusion refers to the problem of deciding the inclusion between the respective languages of two given automata.

Corollary 2.6. The universality and inclusion problems are EXPTIME-complete for VTAM.

Proof. A VTAM $\mathcal{A}$ is universal iff the language of its complement automaton $\overline{\mathcal{A}}$ is empty, and $L(\mathcal{A}_1) \subseteq L(\mathcal{A}_2)$ iff $L(\mathcal{A}_1) \cap L(\overline{\mathcal{A}_2}) = \emptyset$. With the bounds given in Theorem 2.4, these problems can be decided in EXPTIME for VTAM (these operations require a determinization of a given VTAM first).

The EXPTIME-hardness follows from the corresponding property of finite tree automata (see [8] for instance).

The membership problem is, given a term $t$ and an automaton $\mathcal{A}$, to know whether $t$ is accepted by $\mathcal{A}$.

Corollary 2.7. The membership problem is decidable in PTIME for VTAM.

Proof. Given a term $t$ we can build a VTAM $\mathcal{A}_t$ which recognizes exactly the language $\{t\}$.
The intersection of $\mathcal{A}_t$ with the given VTAM $\mathcal{A}$ recognizes a non-empty language iff $t$ belongs to the language of $\mathcal{A}$.

3. Visibly Tree Automata with Memory and Constraints

In the late eighties, some models of tree recognizers were obtained by adding equality and disequality constraints in transitions of tree automata. They have been proposed in order to solve problems with term rewrite systems or constraint systems with non-linear patterns (terms with multiple occurrences of the same variable). The tree automata of [4], for instance, can perform equality and disequality tests between subterms located at brother positions of the input term.

In the case of tree automata with memory, constraints are applied to the memory contents. Indeed, each bottom-up computation step starts with two states and two memories (and ends with one state and one memory), and therefore it is possible to compare the contents of these two memories with respect to some binary relation.

We first state the general definition of visibly tree automata with constraints on memories (Section 3.1), then give sufficient conditions on the binary relation for the decidability of emptiness (Section 3.2) and show that, if in general regular binary relations do not satisfy these conditions (and indeed, the corresponding class of constrained VTAM has an undecidable emptiness problem, Section 3.3), some relevant examples do satisfy them. In particular, we study in Section 3.4.2 the case of VTAM with structural equality constraints. They enjoy not only decision properties but also good closure properties. Some relevant examples of tree languages recognized by constrained VTAM of this class are presented at the end of the section.

\[
\begin{array}{lll}
\mathrm{INT}^R_1 & f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q(y_1) & f_9 \in \Sigma_{\mathrm{INT}^R_1} \\
\mathrm{INT}^R_2 & f_{10}\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q(y_2) & f_{10} \in \Sigma_{\mathrm{INT}^R_2} \\
\mathrm{INT}^R_1 & f_{11}\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q(y_1) & f_{11} \in \Sigma_{\mathrm{INT}^R_1} \\
\mathrm{INT}^R_2 & f_{12}\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q(y_2) & f_{12} \in \Sigma_{\mathrm{INT}^R_2}
\end{array}
\]
Figure 2: New transition categories for $\mathrm{VTAM}^R_{\neg R}$.

3.1. Definitions. Assume given a fixed equivalence relation $R$ on $T(\Gamma)$. We now consider two new categories for the symbols of $\Sigma$: $\mathrm{INT}^R_1$ and $\mathrm{INT}^R_2$, in addition to the eight previous categories of page 6. The new categories correspond to the constrained versions of the transition rules $\mathrm{INT}_1$ and $\mathrm{INT}_2$ presented in Figure 2. The constraint $y_1 \mathrel{R} y_2$ in the first two rules of Figure 2 is called positive and the constraint $y_1 \mathrel{\neg R} y_2$ in the last two rules is called negative. We shall not extend the rules PUSH and POP with constraints, for reasons explained in Section 3.5.

A ground term $t$ rewrites to $s$ by a constrained rule $\ell \xrightarrow{\,y_1 \mathrel{c} y_2\,} r$ with left-hand side $\ell = f\bigl(q_1(y_1), q_2(y_2)\bigr)$ (where $c$ is either $R$ or $\neg R$) if there exist a position $p$ of $t$ and a substitution $\sigma$ such that $t|_p = \ell\sigma$, $y_1\sigma \mathrel{c} y_2\sigma$ and $s = t[r\sigma]_p$. For example, if $R$ is term equality, the transition is performed only when the memory contents are identical.

Definition 3.1. A visibly tree automaton with memory and constraints ($\mathrm{VTAM}^R_{\neg R}$) on a signature $\Sigma$ is a tuple $(\Gamma, R, Q, Q_f, \Delta)$ where $\Gamma$, $Q$, $Q_f$ are defined as for TAM, $R$ is an equivalence relation on $T(\Gamma)$ and $\Delta$ is a set of rewrite rules in one of the above categories: $\mathrm{PUSH}$, $\mathrm{POP}_{11}$, $\mathrm{POP}_{12}$, $\mathrm{POP}_{21}$, $\mathrm{POP}_{22}$, $\mathrm{INT}_0$, $\mathrm{INT}_1$, $\mathrm{INT}_2$, $\mathrm{INT}^R_1$, $\mathrm{INT}^R_2$.

We let $\mathrm{VTAM}^R$ be the subclass of $\mathrm{VTAM}^R_{\neg R}$ with positive constraints only. The acceptance of terms of $T(\Sigma)$ and the languages of terms and memories are defined and denoted as in Section 2.1.
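To make the effect of a constrained rule concrete, here is a small Python sketch of one application of an $\mathrm{INT}^R_1$ rule at the root, with the relation $R$ passed as a predicate. The encoding is an assumption made for illustration: a rule is a tuple `(f, q1, q2, positive, q)`, where `positive` is `True` for the constraint $y_1 \mathrel{R} y_2$ and `False` for $y_1 \mathrel{\neg R} y_2$, and a configuration is a `(state, memory)` pair. The names `apply_int1_constrained` and `related` are hypothetical.

```python
def apply_int1_constrained(rule, left, right, related):
    """Apply f(q1(y1), q2(y2)) --[y1 c y2]--> q(y1) to two configurations.

    Assumed encoding (not from the paper):
      rule     = (f, q1, q2, positive, q); positive selects R or not-R
      left     = (state, memory) reached below the first argument
      right    = (state, memory) reached below the second argument
      related  = predicate implementing the relation R on memories
    Returns the new configuration, or None when the rule cannot fire.
    """
    _f, q1, q2, positive, target = rule
    (state1, memory1), (state2, memory2) = left, right
    if (state1, state2) != (q1, q2):
        return None  # the states do not match the left-hand side
    if bool(related(memory1, memory2)) != positive:
        return None  # the constraint on the two memories is not satisfied
    return (target, memory1)  # INT_1 keeps the first memory unchanged
```

With `related` set to plain equality of memories, a positive rule fires exactly when both memories are identical, and the matching negative rule fires in all other cases.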
The definition of complete $\mathrm{VTAM}^R_{\neg R}$ is the same as for VTAM. As for VTAM, every $\mathrm{VTAM}^R_{\neg R}$ can be completed (with a polynomial overhead) by the addition of a trash state $q_\bot$. The only subtle difference concerns the constrained rules: for every $f_9 \in \mathrm{INT}^R_1$ and every pair of states $q_1, q_2$,
• if there is a rule $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q(y_1)$ and no rule of the form $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q'(y_1)$, then we add $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q_\bot(y_1)$,
• if there is a rule $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q(y_1)$ and no rule of the form $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q'(y_1)$, then we add $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q_\bot(y_1)$,
• if there is no rule of the form $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q(y_1)$ or $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q'(y_1)$, then we add both $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q_\bot(y_1)$ and $f_9\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q_\bot(y_1)$.

The definition of deterministic $\mathrm{VTAM}^R_{\neg R}$ is based on the same conditions as for VTAM for the function symbols in the categories $\mathrm{PUSH}_0$, $\mathrm{PUSH}$, $\mathrm{POP}_{11}$, ..., $\mathrm{POP}_{22}$, $\mathrm{INT}_1$, $\mathrm{INT}_2$. For the function symbols of $\mathrm{INT}^R_1$ and $\mathrm{INT}^R_2$, we have the following condition: for all $f \in \Sigma_{\mathrm{INT}^R_1} \cup \Sigma_{\mathrm{INT}^R_2}$ and all $q_1, q_2 \in Q$, there are at most two rules in $\Delta$ with left member $f\bigl(q_1(y_1), q_2(y_2)\bigr)$, and if there are two, one has a positive constraint and the other has a negative constraint.

We will see in Section 3.4 a subclass of $\mathrm{VTAM}^R_{\neg R}$ that can be determinized (when $R$ is structural equality) and another one that cannot (when $R$ is syntactic equality).

3.2. Sufficient Conditions for Emptiness Decision.
We propose here a generic theorem ensuring emptiness decision for $\mathrm{VTAM}^R_{\neg R}$. The idea of this theorem is that, under some conditions on $R$, the transition rules with negative constraints can be eliminated.

Theorem 3.2. Let $R$ be an equivalence relation satisfying these two properties:
i. for every automaton $\mathcal{A}$ of $\mathrm{VTAM}^R$ and for every state $q$ of $\mathcal{A}$, the memory language $M(\mathcal{A}, q)$ is effectively a regular tree language,
ii. for every term $m \in T(\Gamma)$, the cardinality of the equivalence class of $m$ for $R$ is finite and its elements can be enumerated.
Then the emptiness problem is decidable for $\mathrm{VTAM}^R_{\neg R}$.

Proof. The proof relies on the following Lemma 3.3, which states that the negative constraints in $\mathrm{VTAM}^R_{\neg R}$ can be eliminated while preserving the memory languages. The elimination can be done thanks to condition ii, by replacement of the rules of $\mathrm{INT}^{\neg R}_1$ and $\mathrm{INT}^{\neg R}_2$ by rules of $\mathrm{INT}^R_1$ and $\mathrm{INT}^R_2$. Next, we can use i in order to decide emptiness for the $\mathrm{VTAM}^R$ obtained by elimination of negative constraints. Indeed, for all states $q$ of $\mathcal{A}$, by definition, $L(\mathcal{A}, q)$ is empty iff $M(\mathcal{A}, q)$ is empty.

Lemma 3.3. Let $R$ satisfy the hypotheses i and ii of Theorem 3.2, and let $\mathcal{A} = (\Gamma, R, Q, Q_f, \Delta)$ be a $\mathrm{VTAM}^R_{\neg R}$. There exists a $\mathrm{VTAM}^R$ $\mathcal{A}^+ = (\Gamma, R, Q^+, Q_f, \Delta^+)$ such that $Q \subseteq Q^+$ and, for each $q \in Q$, $M(\mathcal{A}^+, q) = M(\mathcal{A}, q)$.

Proof. The construction of $\mathcal{A}^+$ is by induction on the number $n$ of rules with negative constraints in $\Delta$ and uses the bound on the size of equivalence classes, condition ii of the theorem.

The result is immediate if $n = 0$.

We assume that the result is true for $n - 1$ rules, and show that we can get rid of a rule of $\Delta$ with negative constraints (and replace it with rules that are unconstrained or have positive constraints).
Let us consider one such rule:
\[ f\bigl(q_1(y_1), q_2(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{\neg R} y_2\,} q(y_1) \tag{3.1} \]
We show that, under the induction hypothesis, we have the following lemma, which will be used below in order to get rid of the rule (3.1).

Lemma 3.4. Given $m_1, \ldots, m_k \in M(\mathcal{A}, q_2)$, it is effectively decidable whether $M(\mathcal{A}, q_2) \setminus \{m_1, \ldots, m_k\}$ is empty or not and, in case it is not empty, we can effectively build an $m_{k+1}$ in this set.

Proof. Let $[m_i]_R$ denote the equivalence class of $m_i$. By condition ii, every $[m_i]_R$ is finite, hence for each $i \le k$ we can build a VTAM $\mathcal{A}_i$ with a state $p_i$ such that $M(\mathcal{A}_i, p_i)$ is the complement of $[m_i]_R$. We add all the rules of $\mathcal{A}_i$ to $\mathcal{A}$, obtaining $\mathcal{A}'$ (we assume that the state sets of $\mathcal{A}_1, \ldots, \mathcal{A}_k, \mathcal{A}$ are disjoint, and that the states of $\mathcal{A}_1, \ldots, \mathcal{A}_k$ are not final in $\mathcal{A}'$). Since $R$ is an equivalence relation, we have:
\[ y_1 \mathrel{\neg R} m_i \quad\text{iff}\quad y_1 \notin [m_i]_R \quad\text{iff}\quad \exists y_2 \notin [m_i]_R,\ y_1 \mathrel{R} y_2 . \]
Hence, if $y_2 = m_i$ is a witness for the rule (3.1), then we can apply instead a rule:
\[ f\bigl(q_1(y_1), p_i(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q(y_1) \tag{3.2} \]
Then we add to $\mathcal{A}'$ the rules (3.2) as above and obtain $\mathcal{A}''$. It can be shown that $M(\mathcal{A}'', q_2) = M(\mathcal{A}, q_2)$.

Let $m_{k+1}$ be a term of $M(\mathcal{A}'', q_2) \setminus \{m_1, \ldots, m_k\}$ of minimal size (if one exists). This term $m_{k+1}$ can be created in a run of $\mathcal{A}''$ which does not use the rule (3.1). Otherwise, the witness for $y_2$ in the application of this rule would be a term of $M(\mathcal{A}'', q_2) \setminus \{m_1, \ldots, m_k\}$ smaller than $m_{k+1}$ (it cannot be one of $\{m_1, \ldots, m_k\}$ because for these particular values of $y_2$ we assume the application of (3.2)). It follows that $m_{k+1} \in M(\mathcal{A}'' \setminus (3.1), q_2)$. This automaton $\mathcal{A}_1 = \mathcal{A}'' \setminus (3.1)$ has $n - 1$ rules with negative constraints.
Hence, by induction hypothesis, there is a $\mathrm{VTAM}^R$ $\mathcal{A}_1^+$ with $m_{k+1}$ in its memory language $M(\mathcal{A}_1^+, q_2)$. By condition i, this language is regular and we can build $m_{k+1}$ from a TA for this language.

Now, let us come back to the proof that we can replace rule (3.1) while preserving the memory languages.

If $M(\mathcal{A}, q_2) = \emptyset$ (which can be effectively decided according to Lemma 3.4), then the rule (3.1) is useless and can be removed from $\mathcal{A}$ without changing its memory language. Note that the condition $M(\mathcal{A}, q_2) = \emptyset$ is decidable because, by hypothesis i, $M(\mathcal{A}, q_2)$ is regular.

Otherwise, let $m_1 \in M(\mathcal{A}, q_2)$ be built with Lemma 3.4 and let $N_1$ be the cardinal of the equivalence class $[m_1]_R$. We apply the construction of Lemma 3.4 $N_1$ times. There are three cases:
(1) if we find more than $N_1$ terms in $M(\mathcal{A}, q_2)$, then one of them, say $m_k$, is not in $[m_1]_R$. Then the constraint of (3.1) is useless from the point of view of memory languages: whatever the value of $y_1$, we know a $y_2 \in M(\mathcal{A}, q_2)$ which permits to fire the rule. Indeed, if $y_1 \in [m_1]_R$, then we can choose $y_2 = m_k$, and otherwise we choose $y_2 = m_1$. Hence (3.1) can be replaced without changing the memory language by:
\[ f\bigl(q_1(y_1), q_0(y_2)\bigr) \to q(y_1) \tag{3.3} \]
where $q_0$ is any state of $\mathcal{A}$ such that $M(\mathcal{A}, q_0) \neq \emptyset$. We can then apply the induction hypothesis to the $\mathrm{VTAM}^R_{\neg R}$ obtained.
(2) if we find fewer than $N_1$ terms in $M(\mathcal{A}, q_2)$, but one of them is not in $[m_1]_R$, the case is the same as above.
(3) if we find fewer than $N_1$ terms in $M(\mathcal{A}, q_2)$, all in $[m_1]_R$, it means that one of the applications of Lemma 3.4 was not successful, and hence that we have found all the terms of $M(\mathcal{A}, q_2)$. It follows that the rule (3.1) can be fired iff $y_1 \notin [m_1]_R$, i.e. iff there exists $y_2 \notin [m_1]_R$ such that $y_1 \mathrel{R} y_2$. Hence, we can replace (3.1) by
\[ f\bigl(q_1(y_1), p_1(y_2)\bigr) \xrightarrow{\,y_1 \mathrel{R} y_2\,} q(y_1). \]
Then we can apply the induction hypothesis.

We present in Section 3.4 two examples of relations satisfying i and ii.

3.3. Regular Tree Relations. We first consider the general case of $\mathrm{VTAM}^R_{\neg R}$ where the equivalence $R$ is based on an arbitrary regular binary relation on $T(\Gamma)$. By regular binary relation, we mean a set of pairs of ground terms accepted by a tree automaton computing simultaneously in both terms of the pair. More formally, we use a coding of a pair of terms of $T(\Sigma)$ into a term of $T\bigl((\Sigma \cup \{\bot\})^2\bigr)$, where $\bot$ is a new constant symbol (not in $\Sigma$). This coding is defined recursively by:
• $\otimes : \bigl(T(\Sigma) \cup \{\bot\}\bigr) \times \bigl(T(\Sigma) \cup \{\bot\}\bigr) \to T\bigl((\Sigma \cup \{\bot\})^2\bigr)$,
• for all $a, b \in \Sigma_0 \cup \{\bot\}$, $a \otimes b := \langle a, b\rangle$,
• for all $a \in \Sigma_0 \cup \{\bot\}$, $f \in \Sigma_2$, $t_1, t_2 \in T(\Sigma)$, $f(t_1, t_2) \otimes a := \langle f, a\rangle(t_1 \otimes \bot, t_2 \otimes \bot)$ and $a \otimes f(t_1, t_2) := \langle a, f\rangle(\bot \otimes t_1, \bot \otimes t_2)$,
• for all $f, g \in \Sigma_2$, $s_1, s_2, t_1, t_2 \in T(\Sigma)$, $f(s_1, s_2) \otimes g(t_1, t_2) := \langle f, g\rangle(s_1 \otimes t_1, s_2 \otimes t_2)$.
Then, a binary relation $R \subseteq T(\Sigma) \times T(\Sigma)$ is called regular iff the set $\{ s \otimes t \mid (s, t) \in R \}$ is regular. The above coding of pairs is unrelated to the product used in Theorem 2.4.

Theorem 3.5. The membership problem for $\mathrm{VTAM}^R_{\neg R}$ is NP-complete when $R$ is a regular binary relation.

Proof. Assume given a ground term $t \in T(\Sigma)$ and a $\mathrm{VTAM}^R_{\neg R}$ $\mathcal{A} = (\Gamma, R, Q, Q_f, \Delta)$. Because of the visibility condition, for every subterm $s$ of $t$, we can compute in polynomial time in the size of $s$ the shape denoted $\mathit{struct}(s)$, which is an abstraction of the memory reached when $\mathcal{A}$ runs on $s$. More precisely, $\mathit{struct}(s)$ is an unlabeled tree, and every possible content of memory $m$ reachable by $\mathcal{A}$ in a computation $s \xrightarrow{\;*\;}_{\Delta} q(m)$ is obtained by a labeling of the nodes of $\mathit{struct}(s)$ with symbols of $\Gamma$.
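The shape $\mathit{struct}(s)$ can be computed bottom-up from the categories of the symbols alone, without choosing any state. The Python sketch below illustrates this under several assumptions that are not spelled out in this section: terms are nested tuples `(symbol, child, child)` or `(symbol,)` for constants; a constant in PUSH pushes a single leaf while any other constant yields the empty memory $\bot$; popping the empty memory leaves it empty; popping a memory reduced to a single leaf blocks the run (the sketch returns `None`); and constrained categories are mapped to `"INT1"` or `"INT2"` by the caller. The names `struct`, `BOT` and `LEAF` are hypothetical.

```python
BOT = "bot"  # shape of the empty memory
LEAF = ()    # shape of a memory holding a single constant symbol


def struct(term, category):
    """Shape of the memory after reading `term`, or None if every run blocks.

    `category` maps each symbol to one of "PUSH", "POP11", "POP12",
    "POP21", "POP22", "INT0", "INT1", "INT2" (assumed string encoding).
    A binary memory shape is the pair (left_shape, right_shape).
    """
    symbol, *children = term
    cat = category[symbol]
    if not children:  # constant symbol
        return LEAF if cat == "PUSH" else BOT
    shapes = [struct(child, category) for child in children]
    if None in shapes:
        return None  # a subterm already blocks
    left, right = shapes
    if cat == "PUSH":
        return (left, right)  # new root above both memories
    if cat == "INT0":
        return BOT
    if cat == "INT1":
        return left
    if cat == "INT2":
        return right
    # POPij: inspect the memory of argument i and keep its j-th child.
    i, j = int(cat[3]), int(cat[4])
    popped = shapes[i - 1]
    if popped == BOT:
        return BOT   # popping the empty memory leaves it empty
    if popped == LEAF:
        return None  # assumption: no rule pops a single constant
    return popped[j - 1]
```

Since the result depends only on the input term and on the partition of the signature, all runs of the automaton on the same term reach memories of the same shape, which is what the guess-and-check argument relies on.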
Note that for every subterm $s$, the size of $\mathit{struct}(s)$ is smaller than the size of $t$. Let us guess a decoration of every node of $t$ with a state of $Q$ and a labeling of $\mathit{struct}(s)$ (where $s$ is the subterm of $t$ at the given node), such that the root of $t$ is decorated with a final state of $Q_f$. We can check in polynomial time whether this decoration represents a run of $\mathcal{A}$ on $t$ or not.

The NP-hardness is a consequence of Theorem 3.9, which applies to the particular case where $R$ is the syntactic equality between terms.

Note that the NP algorithm works with every equivalence $R$ based on a regular relation, but the NP-hardness concerns only some cases of such relations. For instance, in Section 3.4, we will see one example of relation for which membership is NP-hard and another example for which it is in PTIME.

The class of $\mathrm{VTAM}^R_{\neg R}$ when $R$ is a binary regular tree relation constitutes a nice and uniform framework. Note however that condition ii of Theorem 3.2 is not always true in this case. Actually, this class is too expressive.

Theorem 3.6. Given a regular binary relation $R$ and an automaton $\mathcal{A}$ in $\mathrm{VTAM}^R$, the emptiness of $L(\mathcal{A})$ is undecidable.

Proof. We reduce the blank accepting problem for a deterministic Turing machine $M$. We encode configurations of $M$ as "right-combs" (binary trees) built with the tape and state symbols of $M$, in $\Sigma_{\mathrm{PUSH}}$ (hence binary), and a constant symbol $\varepsilon$ in $\Sigma_{\mathrm{INT}_0}$. Let $R$ be the regular relation which accepts all the pairs of configurations $c \otimes c'$ such that $c'$ is a successor of $c$ by $M$. A sequence of configurations $c_0 c_1 \ldots c_n$ (with $n \ge 1$) is encoded as a tree $t = f(c_0, f(c_1, \ldots f(c_{n-1}, c_n)))$, where $f$ is a binary symbol of $\Sigma_{\mathrm{INT}^R_1}$.

We construct a $\mathrm{VTAM}^R$ $\mathcal{A}$ which accepts exactly the term representations $t$ of computation sequences of $M$ starting with the initial configuration $c_0$ of $M$ and ending with a final configuration $c_n$ with blank tape. Following the type of the function symbols, the rules of $\mathcal{A}$ will
• push all the symbols read in subterms of $t$ corresponding to configurations,
• compare, with $R$, $c_i$ and $c_{i+1}$ (the memory contents in the left and right branches, respectively) and store $c_i$ in the memory, with a transition applied at the top of a subterm $f(c_i, f(c_{i+1}, \ldots))$.
This way, $\mathcal{A}$ checks that successive configurations in $t$ correspond to transitions of $M$, hence the language of $\mathcal{A}$ is not empty iff $M$ accepts the initial configuration $c_0$.

3.4. Syntactic and Structural Equality and Disequality Constraints. We now present two examples of relations satisfying the conditions of Theorem 3.2: syntactic and structural term equality. The satisfaction of condition i will be proved with the help of the following crux lemma.

Lemma 3.7. Let $R$ be a regular binary relation defined by a TA whose state set is $\{ R_i \mid i \in \{1..n\} \}$ and such that $\forall i, j\ \exists k, l\ \forall x, y, z.\ x \mathrel{R_i} y \wedge y \mathrel{R_j} z \Leftrightarrow x \mathrel{R_k} y \wedge x \mathrel{R_l} z$. Let $\mathcal{A} = (\Gamma, R, Q, Q_f, \Delta)$ be a tree automaton with memory and constraints (not necessarily visibly). Then it is possible to compute in exponential time a finite tree automaton $\mathcal{A}'$ such that, for every state $q \in Q$, the language $M(\mathcal{A}, q)$ is the language accepted in some state of $\mathcal{A}'$.

Proof. (Sketch) To prove this lemma, we first observe that the $M(\mathcal{A}, q)$ (for $q \in Q$) are actually the least sets that satisfy the following conditions (we assume here for simplicity that the non-constant symbols are binary and display only some of the implications; the others can be easily guessed): for all $x, y, z$,
\[
\begin{array}{ll}
x \in M(\mathcal{A}, q_1),\ y \in M(\mathcal{A}, q_2) \Rightarrow g(x, y) \in M(\mathcal{A}, q) & \text{if there is a rule } f\bigl(q_1(x_1), q_2(x_2)\bigr) \to q\bigl(g(x_1, x_2)\bigr) \\
g(x, y) \in M(\mathcal{A}, q_1),\ z \in M(\mathcal{A}, q_2) \Rightarrow x \in M(\mathcal{A}, q) & \text{if there is a rule } f\bigl(q_1(g(x, y)), q_2(z)\bigr) \to q(x) \\
x \in M(\mathcal{A}, q_1),\ y \in M(\mathcal{A}, q_2),\ R(x, y) \Rightarrow x \in M(\mathcal{A}, q) & \text{if there is a rule } f\bigl(q_1(x), q_2(y)\bigr) \xrightarrow{\,x \mathrel{R} y\,} q(x) \\
\cdots &
\end{array}
\]
In terms of automata, this means that $M(\mathcal{A}, q)$ is a language recognized by a two-way alternating tree automaton with regular binary constraints. In other words, such languages are the least Herbrand model of a set of clauses of the form
\[
\begin{array}{rcll}
Q_1(y_1),\ Q_2(y_2),\ R(y_1, y_2) & \Rightarrow & Q_3(y_1) & \mathrm{INT}_1,\ \mathrm{INT}_2 \\
Q_1(y_1),\ Q_2(y_2) & \Rightarrow & Q_3\bigl(f(y_1, y_2)\bigr) & \mathrm{PUSH} \\
 & \Rightarrow & Q_1(a) & \mathrm{INT}_0 \\
Q_1\bigl(f(y_1, y_2)\bigr),\ Q_2(y_3) & \Rightarrow & Q_3(y_1) & \mathrm{POP}_{11},\ \mathrm{POP}_{21} \\
Q_1\bigl(f(y_1, y_2)\bigr),\ Q_2(y_3) & \Rightarrow & Q_3(y_2) & \mathrm{POP}_{12},\ \mathrm{POP}_{22}
\end{array}
\]
The lemma then shows that languages that are recognized by two-way alternating tree automata with some particular regular constraints are also recognized by a finite tree automaton. This corresponds to classical reductions of two-way automata to one-way automata (see e.g. [8], chapter 7, [13], or [12, 6] for the first relevant references). The idea of the reduction is to find shortcuts: moving up and down yields a move at the same level. Add such shortcuts as new rules, until getting a "complete set". Then only keep the non-redundant rules: this yields a finite tree automaton.

Such a procedure relies on the definitions of ordered strategies, redundancy and saturation (aka complete sets), which are classical notions in automated first-order theorem proving [13, 3, 20]. Indeed, formally, a "shortcut" must be a formula which allows for smaller proofs than the proof using the two original rules.
A saturated set corresponds to a set of formulas all of whose shortcuts are already in the set.

The advantage of the clausal formalism is to enable an easy representation of the above shortcuts, as intermediary steps. Such shortcuts are clauses, but are not automata rules. Second, we may rely on completeness results for Horn clauses. That is why, only for the proof of this lemma, which follows and extends the classical proofs by adding some regular constraints, we switch to a first-order logic formalization.

The complete proof can be found in Appendix 5. As in the classical proofs, we saturate the set of clauses by resolution with selection and eager splitting. This saturation terminates, and the set of clauses corresponding to finite tree automata transitions in the saturated set recognizes the language $M(\mathcal{A}, q)$, which is therefore regular.

The condition on $R$ in the lemma allows to break chains such as $\exists x_1, \ldots, x_n.\ x \mathrel{R} x_1 \wedge x_1 \mathrel{R} x_2 \wedge \cdots \wedge x_n \mathrel{R} y \wedge P(x, y)$, which would be a source of non-termination in the saturation procedure. We may indeed replace such chains by $\exists x_1, \ldots, x_n.\ x \mathrel{R_1} x_1 \wedge x \mathrel{R_2} x_2 \wedge \ldots \wedge x \mathrel{R_n} x_n \wedge x \mathrel{R_0} y \wedge P(x, y)$, which can again be simplified into $\exists x_1.\ x \mathrel{S} x_1 \wedge x \mathrel{R_0} y \wedge P(x, y)$, where $S$ is the intersection of $R_1, \ldots, R_n$. Possible such intersections range in a finite set as the relation $R$ is regular and the $R_i$ are states of the automaton accepting $R$. Finally, note that finding $k, l$ in the lemma's assumption can always be performed in an effective way since $R$ is regular.

3.4.1. Syntactic Constraints. We first apply Lemma 3.7 to the class $\mathrm{VTAM}^{=}_{\neq}$, where $=$ denotes the equality between ground terms made of memory symbols. Note that it is a particular case of the constrained $\mathrm{VTAM}^R_{\neg R}$ of Section 3.3 above, since term equality is a regular relation.
The automata of the subclass with positive constraints only, $\mathrm{VTAM}^{=}$, are particular cases of tree automata with one memory of [7], and therefore have a decidable emptiness problem. We show below that $\mathrm{VTAM}^{=}_{\neq}$ fulfills the hypotheses of Theorem 3.2, and hence that emptiness is also decidable for the whole class.

We can first verify that the relation $=$ satisfies the hypothesis of Lemma 3.7, hence condition i of Theorem 3.2. Moreover, the relation $=$ obviously also satisfies condition ii of Theorem 3.2.

Corollary 3.8. The emptiness problem is decidable for $\mathrm{VTAM}^{=}_{\neq}$.

A careful analysis of the proof of Theorem 3.2 leads to an EXPTIME complexity for this problem with $\mathrm{VTAM}^{=}_{\neq}$.

Theorem 3.9. The membership problem is NP-complete for $\mathrm{VTAM}^{=}_{\neq}$.

Proof. An NP algorithm is given in the proof of Theorem 3.5. For the NP-hardness, we use a logspace reduction of 3-SAT. Let us consider an instance of 3-SAT with $n$ propositional variables $X_1, \ldots, X_n$ and a conjunction of $m$ clauses:
\[ \bigwedge_{i=1}^{m} (\alpha_{i,1} \vee \alpha_{i,2} \vee \alpha_{i,3}) \]
where every $\alpha_{i,j}$ is either a variable $X_k$ ($k \le n$) or a negation of variable $\neg X_k$. We assume wlog that every variable occurs at most once in a clause.

We consider an encoding $t$ of the given instance as a term over the signature $\Sigma$ containing the symbols $X_1, \ldots, X_n$ (constants), $\mathit{id}$, $\mathit{false}$, $\neg$ (unary) and $\wedge$ and $\vee$ (binary). The encoding is:
\[ t := C_\wedge\bigl[\, C_\vee[\delta_{1,1}(X_1), \ldots, \delta_{1,n}(X_n)],\ \ldots,\ C_\vee[\delta_{m,1}(X_1), \ldots, \delta_{m,n}(X_n)] \,\bigr] \]
where $C_\wedge$ (resp. $C_\vee$) is a context built solely with $\wedge$ (resp. $\vee$) and where every $\delta_{i,j}$ is either:
• $\delta_{i,j} = \mathit{id}$ (interpreted as the identity) if one of $\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3}$ is $X_j$,
• $\delta_{i,j} = \neg$ if one of $\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3}$ is $\neg X_j$,
• $\delta_{i,j} = \mathit{false}$ (interpreted as the constant function returning false) if $X_j$ does not occur in $\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3}$.

Now, let us partition the signature $\Sigma$ with: $X_1, \ldots, X_n, \vee \in \mathrm{PUSH}$; $\mathit{id}, \mathit{false}, \neg \in \mathrm{INT}_1$; and $\wedge \in \mathrm{INT}^{=}_1$; and let us consider the memory signature $\Gamma = \{0, 1, \vee\}$. We now construct a $\mathrm{VTAM}^{=}$ $\mathcal{A} = (\Gamma, =, \{q_0, q_1\}, \{q_1\}, \Delta)$ whose transitions will, intuitively:
• guess an assignment for each constant symbol $X_k$ of $t$, by means of a non-deterministic choice of one state $q_0$ or $q_1$,
• compute the value of $t$ with these assignments,
• push each tuple of assignments for each clause, in the contexts $C_\vee$,
• check the coherence of assignments by means of equality tests between the tuples pushed, in the context $C_\wedge$.
More formally, we have the following transitions in $\Delta$:
\[
\begin{array}{ll}
X_i \to q_0(0) \qquad X_i \to q_1(1) & i \le n \\
\mathit{id}\bigl(q_\varepsilon(y_1)\bigr) \to q_\varepsilon(y_1) \qquad \mathit{false}\bigl(q_\varepsilon(y_1)\bigr) \to q_0(y_1) \qquad \neg\bigl(q_\varepsilon(y_1)\bigr) \to q_{1-\varepsilon}(y_1) & \text{with } \varepsilon \in \{0, 1\} \\
\vee\bigl(q_{\varepsilon_1}(y_1), q_{\varepsilon_2}(y_2)\bigr) \to q_{\varepsilon_1 \vee \varepsilon_2}\bigl(\vee(y_1, y_2)\bigr) & \\
\wedge\bigl(q_{\varepsilon_1}(y_1), q_{\varepsilon_2}(y_2)\bigr) \xrightarrow{\,y_1 = y_2\,} q_{\varepsilon_1 \wedge \varepsilon_2}(y_1) & \text{with } \varepsilon_1, \varepsilon_2 \in \{0, 1\}
\end{array}
\]
We can verify that the above $\mathrm{VTAM}^{=}$ $\mathcal{A}$ recognizes $t$ iff the instance of 3-SAT has a solution.

$\mathrm{VTAM}^{=}_{\neq}$ is closed under union (using the same construction as before) but not under complementation. This is a consequence of the following theorem.

Theorem 3.10. The universality problem is undecidable for $\mathrm{VTAM}^{=}_{\neq}$.

Proof. We reduce the blank accepting problem for a deterministic Turing machine $M$.
Like in the proof of Theorem 3.6, we encode configurations of $M$ as right-combs on a signature $\Sigma$ containing the tape and state symbols of $M$, considered as binary symbols of $\Sigma_{\mathrm{PUSH}}$, and a constant symbol $\varepsilon$ in $\Sigma_{\mathrm{PUSH}}$. A sequence of configurations $c_0, c_1, \ldots, c_n$ (with $n \ge 1$) is encoded as a tree $t = f(c_n, f(c_{n-1}, \ldots f(c_0, \varepsilon)))$, where $f$ is a binary symbol of $\Sigma_{\mathrm{INT}^{=}_1}$. Such a tree is called a computation of $M$ if $c_0$ is the initial configuration, $c_n$ is a final configuration and, for all $0 \le i < n$, $c_{i+1}$ is the successor of $c_i$ with $M$. Moreover, we assume that all the $c_i$ have the same length (for this purpose we complete the representations of configurations with blank symbols).

\[
\begin{array}{l}
\varepsilon \to q_\varepsilon(\varepsilon) \\
f\bigl(q_B(y_1), q_\varepsilon(y_2)\bigr) \xrightarrow{\,y_1 \neq y_2\,} q(y_1) \\
f\bigl(q_B(y_1), q(y_2)\bigr) \xrightarrow{\,y_1 = y_2\,} q(y_1) \\
f\bigl(q_B(y_1), q(y_2)\bigr) \xrightarrow{\,y_1 \neq y_2\,} q_f(y_1) \\
f\bigl(q_B(y_1), q_f(y_2)\bigr) \xrightarrow{\,y_1 = y_2\,} q_f(y_1) \\
f\bigl(q_B(y_1), q_f(y_2)\bigr) \xrightarrow{\,y_1 \neq y_2\,} q_f(y_1)
\end{array}
\]
Figure 3: The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}_3$ in the proof of Theorem 3.10.

\[
\begin{array}{l}
\varepsilon \to q_\varepsilon(\varepsilon) \\
f\bigl(q_\forall(y_1), q_\varepsilon(y_2)\bigr) \xrightarrow{\,y_1 \neq y_2\,} q_\forall(y_1) \\
f\bigl(q_\forall(y_1), q_\forall(y_2)\bigr) \xrightarrow{\,y_1 = y_2\,} q_\forall(y_1) \\
f\bigl(q_\Box(y_1), q_\forall(y_2)\bigr) \xrightarrow{\,y_1 = y_2\,} q_\Box(y_1) \\
f\bigl(q_=(y_1), q_\Box(y_2)\bigr) \xrightarrow{\,y_1 \neq y_2\,} q_f(y_1) \\
f\bigl(q_\forall(y_1), q_f(y_2)\bigr) \xrightarrow{\,y_1 = y_2\,} q_f(y_1)
\end{array}
\]
Figure 4: The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}_4$ in the proof of Theorem 3.10.

We want to construct a $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}$ which recognizes exactly the terms which are not computations of $M$. Hence, $\mathcal{A}$ recognizes all the terms of $T(\Sigma)$ iff $M$ does not accept the initial blank configuration.

For the construction of $\mathcal{A}$, let us first observe that we can associate to $M$ a VTAM $\mathcal{A}_\Box$ which, while reading a configuration $c_i$, will push on the memory its successor $c_{i+1}$.
The existence of such an automaton is guaranteed by two facts: first, for each regular binary relation $R$, as defined in Section 3.3, there exists a VTAM which, for each $(s, t) \in R$, will push $t$ while reading $s$; second, the language of the $c_i \otimes c_{i+1}$, hence the relation of successor configuration, is regular. Moreover, since only push operations are performed, we can ensure that $\mathcal{A}_\Box$ satisfies the visibility condition. Let us denote by $q_\Box$ the final state (which is assumed unique wlog) of the VTAM $\mathcal{A}_\Box$.

We also use the following VTAMs:
$\mathcal{A}_\forall$: a VTAM with (unique) final state $q_\forall$ which, while reading a configuration $c_i$, will push on the memory any configuration with the same length as $c_i$,
$\mathcal{A}_=$: a VTAM with final state $q_=$ which, while reading a configuration $c_i$, will push $c_i$ on the memory,
$\mathcal{A}_B$: a VTAM with final state $q_B$ which, while reading a configuration $c_i$, will push on the memory a configuration with the same length as $c_i$ and containing only blank symbols.

The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}$ is the union of the following automata:
$\mathcal{A}_1$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the terms of $T(\Sigma)$ which are not representations of sequences of configurations (malformed terms). Its language is actually a regular tree language.
$\mathcal{A}_2$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the sequences of configurations $f(c_n, f(c_{n-1}, \ldots f(c_0, \varepsilon)))$ such that $c_0$ is not initial or $c_n$ is not final. Again, this is a regular tree language.
$\mathcal{A}_3$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the sequences of configurations with two configurations of different lengths. It contains the transition rules of $\mathcal{A}_B$ and the additional transitions described in Figure 3, which perform this test.
$\mathcal{A}_4$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the sequences of configurations $f(c_n, f(c_{n-1}, \ldots f(c_0, \varepsilon)))$ such that all the $c_i$ have the same length but there exists $0 \le i < n$ such that $c_{i+1}$ is not the successor of $c_i$ by $M$.
This last $\mathrm{VTAM}^{=}_{\neq}$ contains the transitions of $\mathcal{A}_\Box$, $\mathcal{A}_\forall$, $\mathcal{A}_=$, and the additional transitions described in Figure 4.

With the transition rules in Figure 4, the automaton $\mathcal{A}_4$ guesses an $i < n$ and, while reading each of the configurations $c_j$ with $j \le i$, it pushes the successor configuration of $c_j$, say $c'_j$ (second column of Figure 4). Then, while reading $c_{i+1}$, $\mathcal{A}_4$ pushes $c_{i+1}$, and it checks that $c'_i$ and $c_{i+1}$ differ. After that, when reading each of the remaining configurations, $\mathcal{A}_4$ pushes $c_{i+1}$ (third column of Figure 4).

The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}_1$ to $\mathcal{A}_4$ cover all the cases of a term of $T(\Sigma)$ not being an accepting computation of $M$ starting with the initial blank configuration. Hence the language of their union $\mathcal{A}$ is $T(\Sigma)$ iff $M$ does not accept the initial blank configuration.

Corollary 3.11. $\mathrm{VTAM}^{=}_{\neq}$ is not effectively closed under complementation.

Proof. It is a consequence of Corollary 3.8 (emptiness decision) and Theorem 3.10.

3.4.2. Structural Constraints. Lemma 3.7 applies also to another class $\mathrm{VTAM}^{\equiv}_{\not\equiv}$, where $\equiv$ denotes structural equality of terms, defined recursively as the smallest equivalence relation on ground terms such that:
• $a \equiv b$ for all $a$, $b$ of arity 0,
• $f(s_1, s_2) \equiv g(t_1, t_2)$ if $s_1 \equiv t_1$ and $s_2 \equiv t_2$, for all $f$, $g$ of arity 2.
Note that it is a regular relation, and that it satisfies the hypothesis of Lemma 3.7 and condition ii of Theorem 3.2.

Corollary 3.12. The emptiness problem is decidable for $\mathrm{VTAM}^{\equiv}_{\not\equiv}$.

Following the procedure in the proof of Theorem 3.2, we obtain a 2-EXPTIME complexity for this problem and this class.

The crucial property of the relations $\equiv$ and $\not\equiv$ is that, unlike the above class $\mathrm{VTAM}^{=}_{\neq}$ or the general $\mathrm{VTAM}^R_{\neg R}$, they ignore the labels of the contents of the memory. They only depend on the structure of these memory terms.
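As an illustration of the definition of $\equiv$, the following Python sketch tests structural equality on terms represented as nested tuples `(symbol, child, ...)`, and counts the size of an equivalence class when the memory signature has only constants and binary symbols. The representation and the names `structurally_equal` and `class_size` are assumptions made for this example.

```python
def structurally_equal(s, t):
    """Structural equality: same tree shape, labels ignored.

    Assumed encoding: a term is a tuple (symbol, child_1, ..., child_k),
    so a constant is the one-element tuple (symbol,).
    """
    if len(s) != len(t):
        return False  # different arities at this node
    return all(structurally_equal(a, b) for a, b in zip(s[1:], t[1:]))


def class_size(t, n_constants, n_binary):
    """Number of terms structurally equal to t, for a signature with
    n_constants constants and n_binary binary symbols (and nothing else)."""
    if len(t) == 1:
        return n_constants
    return n_binary * class_size(t[1], n_constants, n_binary) \
                    * class_size(t[2], n_constants, n_binary)
```

The count is finite for every term, which is the content of condition ii of Theorem 3.2 for this relation: the class of a term is the set of all relabelings of its shape.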
A benefit of this property of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ is that the decision of the membership problem drops to PTIME for this class.

Theorem 3.13. The membership problem is decidable in PTIME for $\mathrm{VTAM}^{\equiv}_{\not\equiv}$.

Proof. Let $A = (\Gamma, \equiv, Q, Q_f, \Delta)$ be a $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ on $\Sigma$ and let $t$ be a term in $T(\Sigma)$. Let $sub(t)$ be the set of subterms of $t$ and let us construct a VTAM $A' = (\Gamma, sub(t) \times Q, \{t\} \times Q_f, \Delta')$ on $\Sigma'$ where the symbols of $\Sigma'$ and $\Sigma$ are the same, and we assume that the symbols in category $\mathrm{INT}^{\equiv}_1$ (resp. $\mathrm{INT}^{\equiv}_2$) in the partition of $\Sigma$ are in $\mathrm{INT}_1$ (resp. $\mathrm{INT}_2$) in the partition of $\Sigma'$. The transitions of $\Delta'$ are obtained by the following transformation of the transitions of $\Delta$. We only describe the construction for the cases $\mathrm{INT}_1$ and $\mathrm{INT}^{\equiv}_1$ with positive constraints. The other cases are similar.
• for every $f_7(q_1(y_1), q_2(y_2)) \to q(y_1) \in \Delta$, we add to $\Delta'$ all the transitions $f_7(\langle q_1, t_1 \rangle(y_1), \langle q_2, t_2 \rangle(y_2)) \to \langle q, f(t_1, t_2) \rangle(y_1)$ such that $f(t_1, t_2) \in sub(t)$,
• for every $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{y_1 \equiv y_2} q(y_1) \in \Delta$, we add to $\Delta'$ all the transitions as above (in this case, $f_9$ is assumed to be a symbol of category $\mathrm{INT}_1$ in $\Sigma'$) such that moreover $struct(t_1) = struct(t_2)$, where $struct(s)$ is defined, as in the proof of Theorem 3.5, as the shape (unlabeled tree) that the memory of $A$ will have after $A$ has processed $s$.

The VTAM $A'$ can be computed in time $O(\|t\|^2 \times \|A\|)$. It recognizes at most one term, $t$, and it recognizes $t$ iff $A$ recognizes $t$. Therefore, $t$ is recognized by $A$ iff the language of $A'$ is not empty. This can be decided in PTIME according to Theorem 2.5.

Even more interesting, the construction for determinization of Section 2.3 still works for $\mathrm{VTAM}^{\equiv}_{\not\equiv}$.

Theorem 3.14.
For every $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ $A = (\Gamma, \equiv, Q, Q_f, \Delta)$ there exists a deterministic $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ $A^{det} = (\Gamma^{det}, \equiv, Q^{det}, Q^{det}_f, \Delta^{det})$ such that $L(A) = L(A^{det})$, where $|Q^{det}|$ and $|\Gamma^{det}|$ both are $O(2^{|Q|^2})$.

Proof. We use the same construction as in the proof of Theorem 2.3, with a direct extension of the construction for INT to $\mathrm{INT}^{\equiv}$. The key property for handling constraints is that the structure of the memory (hence the result of the structural tests) is independent of the non-deterministic choices of the automaton. With the visibility condition it only depends on the term read.

Theorem 3.15. The class of tree languages of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ is closed under Boolean operations. One can construct $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ for union, intersection and complement of given $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ languages whose sizes are respectively linear, quadratic and exponential in the size of the initial $\mathrm{VTAM}^{\equiv}_{\not\equiv}$.

Proof. We use the same constructions as in Theorem 2.4 (VTAM) for union and intersection. For the intersection, in the case of constrained rules we can safely keep the constraints in product rules, thanks to the visibility condition (as the structure of the memory only depends on the term read, see the proof of Theorem 3.14). For instance, the product of the $\mathrm{INT}^{\equiv}_1$ rules $f_9(q_{11}(y_1), q_{12}(y_2)) \xrightarrow{y_1 \equiv y_2} q_1(y_1)$ and $f_9(q_{21}(y_1), q_{22}(y_2)) \xrightarrow{y_1 \equiv y_2} q_2(y_1)$ is $f_9(\langle q_{11}, q_{21} \rangle(y_1), \langle q_{12}, q_{22} \rangle(y_2)) \xrightarrow{y_1 \equiv y_2} \langle q_1, q_2 \rangle(y_1)$. The product of two $\mathrm{INT}^{\not\equiv}_1$ rules is constructed similarly. We do not need to consider the product of a rule $\mathrm{INT}^{\equiv}_1$ with a rule $\mathrm{INT}^{\not\equiv}_1$, and vice-versa, because in this case the product is empty (no rule is added to the $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ for intersection). For the complementation, we use Theorem 3.14 and completion.

Corollary 3.16.
The universality and inclusion problems are decidable for $\mathrm{VTAM}^{\equiv}_{\not\equiv}$.

Proof. This is a consequence of Corollary 3.12 and Theorem 3.15.

3.5. Constrained PUSH Transitions. Above, we always considered constraints in transitions with INT symbols only. We did not consider a constrained extension of the rules PUSH. The main reason is that symbols of a new category $\mathrm{PUSH}^{\equiv}$, which test two memories for structural equality and then push a symbol on the top of them, permit us to construct a constrained VTAM $A$ whose memory language $M(A, q)$ is the set of well-balanced binary trees. This language is not regular, whereas the base of our emptiness decision procedure is the result (Theorem 3.2, Lemma 3.7) of regularity of these languages for the classes considered.

3.6. Contexts as Symbols and Signature Translations. Before looking for some examples of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ languages, we show a "trick" that (seemingly) adds expressiveness to $\mathrm{VTAM}^{\equiv}_{\not\equiv}$. One symbol can perform either a PUSH or a POP operation, or make an INT transition (constrained or not), but it cannot combine several of these operations. Here, we propose a way to combine several operations in one symbol, and thus increase the expressiveness of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$, without losing the good properties of this class.

The trick is to replace symbols by contexts. For instance a context $g_2(g_1(\cdot, \cdot), g_0)$ can replace a symbol of arity 2. Assume that $g_2$ is a PUSH symbol, $g_1$ is an $\mathrm{INT}_1$ symbol with test, and $g_0$ is an $\mathrm{INT}_0$ symbol. This context first performs a test on the memories of the sons, and then a PUSH operation on the memory kept by $g_1$ (and on the $\bot$ leaf created by $g_0$). Such a combination is normally not possible, and replacing symbols by contexts brings a lot of additional expressiveness.
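The replacement of a symbol by a context can be sketched in Python as follows. This is only an illustration under stated assumptions: terms are nested tuples, the symbol to be encoded is called `"g"`, the fresh symbols are named `"g0"`, `"g1"`, `"g2"`, and the helper names `translate` and `well_formed` are hypothetical. The second function checks that the fresh symbols occur only inside the allowed context.

```python
# Assumed encoding: ("a",) is a constant, ("f", left, right) a binary term.
G0, G1, G2 = "g0", "g1", "g2"  # hypothetical fresh symbols encoding "g"


def translate(t):
    """Replace every binary g(x, y) by the context g2(g1(x, y), g0)."""
    if len(t) == 1:
        return t
    head, left, right = t
    left, right = translate(left), translate(right)
    if head == "g":
        return (G2, (G1, left, right), (G0,))
    return (head, left, right)


def well_formed(t, slot=None):
    """Check that g0, g1, g2 only occur as g2(g1(., .), g0).

    slot is G1 or G0 when t sits in that position directly under a g2,
    and None everywhere else.
    """
    head = t[0]
    if slot is not None and head != slot:
        return False          # a g2 node needs g1 on the left and g0 on the right
    if slot is None and head in (G0, G1):
        return False          # g0 and g1 are not allowed outside a g2 context
    if head == G0:
        return len(t) == 1
    if len(t) != 3:
        return head not in (G1, G2)
    if head == G2:
        return well_formed(t[1], G1) and well_formed(t[2], G0)
    return well_formed(t[1]) and well_formed(t[2])


if __name__ == "__main__":
    print(translate(("g", ("a",), ("a",))))
```

The check in `well_formed` only needs to know the symbol of the parent node, so it corresponds to the kind of local information that an automaton can keep in its state.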
Here is how we precisely proceed: we want to recognize a language (on a signature $\Sigma$) with a VTAM, and we then have to choose the categories for each symbol of the signature (PUSH, $\mathrm{POP}_{ij}$, $\mathrm{INT}^{\equiv}_1$, ...). As we will see in the examples below, it might be useful in practice to have some extra categories combining the powers of two or more categories of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$. We can do that still with $\mathrm{VTAM}^{\equiv}_{\not\equiv}$, by means of an encoding of the terms of $T(\Sigma)$. More precisely, we replace some symbols of the initial signature $\Sigma$ by contexts built with new symbols. For instance, we replace a $g \in \Sigma$, which will perform the complex operation described above, by the context $g_2(g_1(\cdot, \cdot), g_0)$. Then, we will have to ensure that the new symbols (in our example $g_0$, $g_1$ and $g_2$) are only used to form the contexts encoding the symbols of $\Sigma$. This can easily be done with local information maintained in the state of the automaton. The set of well-formed terms, built with new symbols organized in allowed contexts, is a regular tree language. We will call the $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ signature obtained a translation of the initial signature. If $L$ is a tree language on $\Sigma$, then $c(L)$ is the translation of $L$.

In summary, we have shown here a general method for adding new categories of symbols corresponding to (relevant) combinations of operations of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$, and hence defining extensions of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ with the same good properties as $\mathrm{VTAM}^{\equiv}_{\not\equiv}$. By relevant, we mean that some combinations are excluded, like for instance PUSH + constraint $\equiv$ at the same time (see Section 3.5). Such forbidden combinations cannot be handled by our method.

With similar encodings, we can deal with symbols of arity bigger than 2, e.g. $g(\cdot, \cdot, \cdot)$ can be replaced by $g_2(\cdot, g_1(\cdot, \cdot))$.

Note however first that this encoding concerns the recognized tree, not the memories.
For instance, it is not possible to systematically encode the syntactic equality as structural equality (on memories) in this way. And indeed, the decision results are drastically different in the two cases.

Also note that, even if $c(L)$ is accepted by a VTAM, which implies that $\neg c(L)$ is also accepted by a VTAM, it may well be the case that $c(\neg L)$ is not recognized by a VTAM. So, the above trick does not show that we can extend our results to a wider class of tree languages.

3.7. Some $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ Languages. The regular tree languages and VPL are particular cases of VTAM languages. We present in this section some other examples of relevant tree languages translatable, using the method of Section 3.6, into $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ languages.

Well balanced binary trees. The $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ with memory signature $\{f, \bot\}$, state set $\{q, q_0, q_f\}$, unique final state $q_f$, and whose rules follow accepts the (non-regular) language of well balanced binary trees built with $g$ and $a$. Here $a$ is a constant in $\Sigma_{\mathrm{INT}_0}$, and $g$ is in a new category, and is translated into the context $g_2(g_1(\cdot, \cdot), g_0)$, where $g_2 \in \Sigma_{\mathrm{PUSH}}$, $g_1 \in \Sigma_{\mathrm{INT}^{\equiv}_1}$, and $g_0 \in \Sigma_{\mathrm{INT}_0}$.

$a \to q_f(\bot)$
$g_0 \to q_0(\bot)$
$g_1(q_f(y_1), q_f(y_2)) \xrightarrow{y_1 \equiv y_2} q(y_1)$
$g_2(q(y_1), q_0(y_2)) \to q_f(f(y_1, y_2))$

Powerlists. A powerlist [18] is roughly a list of length $2^n$ (for $n \ge 0$) whose elements are stored in the leaves of a balanced binary tree. For instance, the elements may be integers represented in unary notation with the unary successor symbol $s$ and the constant 0, and the balanced binary tree on the top of them can be built with a binary symbol $g$. This data structure has been used in [18] to specify data-parallel algorithms based on divide-and-conquer strategy and recursion (e.g. Batcher's merge sort and fast Fourier transform).
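To make the four rules of the well-balanced-tree automaton above concrete (the same rules govern the balanced upper part of a powerlist), here is a small Python sketch of its bottom-up run on translated terms. The tuple encoding of terms and memories, the state names as strings, and the helpers `run`, `accepts` and `g` are assumptions made for this illustration; the rules themselves are the ones listed in the text.

```python
# Assumed encoding: ("a",) is a constant, (symbol, left, right) a binary term.
BOT = ("bot",)  # the empty memory, written as bottom in the rules


def struct_equiv(m1, m2):
    """Structural equality of two memory terms (labels are ignored)."""
    if len(m1) == 1 and len(m2) == 1:
        return True
    if len(m1) == 3 and len(m2) == 3:
        return struct_equiv(m1[1], m2[1]) and struct_equiv(m1[2], m2[2])
    return False


def run(t):
    """Return (state, memory) reached on t, or None if no rule applies."""
    head = t[0]
    if head == "a":
        return ("qf", BOT)                 # a -> qf(bot)
    if head == "g0":
        return ("q0", BOT)                 # g0 -> q0(bot)
    if len(t) != 3:
        return None
    left, right = run(t[1]), run(t[2])
    if left is None or right is None:
        return None
    (s1, m1), (s2, m2) = left, right
    if head == "g1" and s1 == "qf" and s2 == "qf" and struct_equiv(m1, m2):
        return ("q", m1)                   # INT rule with test y1 = y2 (structurally)
    if head == "g2" and s1 == "q" and s2 == "q0":
        return ("qf", ("f", m1, m2))       # PUSH rule
    return None


def accepts(t):
    result = run(t)
    return result is not None and result[0] == "qf"


def g(x, y):
    """Hypothetical helper: the context g2(g1(x, y), g0) encoding g(x, y)."""
    return ("g2", ("g1", x, y), ("g0",))


if __name__ == "__main__":
    a = ("a",)
    print(accepts(g(g(a, a), g(a, a))), accepts(g(g(a, a), a)))
```

In this sketch the memory reached on a balanced tree of height $h$ is a left comb of depth $h$, so the structural test at a `g1` node succeeds exactly when both sons have the same height.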
Following the above construction, it is easy to characterize translations of powerlists with a $\mathrm{VTAM}^{\equiv}_{\not\equiv}$. We do not push on the "leaves", i.e. on the elements of the powerlist, and compute in the higher part (the complete binary tree) as above.

Some equational properties of algebraic specifications of powerlists have been studied in the context of automatic induction theorem proving and sufficient completeness [17]. Tree automata with constraints have been acknowledged as a very powerful formalism in this context (see e.g. [9]). We therefore believe that a characterization of powerlists (and their complement language) with $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ is useful for the automated verification of algorithms on this data structure.

Red-black trees. A red-black tree is a binary search tree with the following properties:
(1) every node is either red or black,
(2) the root node is black,
(3) all the leaves are black,
(4) if a node is red, then both its sons are black,
(5) every path from the root to a leaf contains the same number of black nodes.

The first four properties are local and can be checked with standard TA rules. The fifth property makes the language of red-black trees non-regular, and we need $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ rules to recognize it. It can be checked by pushing all the black nodes read. We use for this purpose a symbol $black \in \Sigma_{\mathrm{PUSH}}$. When a red node is read, the numbers of black nodes in both its sons are checked to be equal (by a test $\equiv$ on the corresponding memories) and only one corresponding memory is kept. This is done with a symbol $red \in \Sigma_{\mathrm{INT}^{\equiv}_1}$. When a black node is read, the equality of the numbers of black nodes in its sons must also be tested, and a $black$ must moreover be pushed on the top of the memory kept. It means that two operations must be combined.
We can do that by defining an appropriate context with the method of Section 3.6.

In [15] a special class of tree automata is introduced and used in a procedure for the verification of C programs which handle balanced tree data structures, like red-black trees. Based on the above example, we think that, following the same approach, $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ can also be used for similar purposes.

$\mathrm{BTINT}_1$: $f_{13}(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_1)$, with $f_{13} \in \Sigma_{\mathrm{BTINT}_1}$
$\mathrm{BTINT}_2$: $f_{14}(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_2)$, with $f_{14} \in \Sigma_{\mathrm{BTINT}_2}$
$\mathrm{BTINT}_1$: $f_{15}(q_1(y_1), q_2(y_2)) \xrightarrow{1 \neq 2} q(y_1)$, with $f_{15} \in \Sigma_{\mathrm{BTINT}_1}$
$\mathrm{BTINT}_2$: $f_{16}(q_1(y_1), q_2(y_2)) \xrightarrow{1 \neq 2} q(y_2)$, with $f_{16} \in \Sigma_{\mathrm{BTINT}_2}$

Figure 5: New transition categories for $\mathrm{BTVTAM}^{R}_{\neg R}$.

4. Visibly Tree Automata with Memory and Structural Constraints and Bogaert-Tison Constraints

In Section 3, we have only considered VTAM with constraints testing the memory contents. In this section, we go a bit further and add to $\mathrm{VTAM}^{R}_{\neg R}$ some Bogaert-Tison constraints [4], i.e. equality and disequality tests between brother subterms in the term read by the automaton.

We consider two new categories for the symbols, which we call $\mathrm{BTINT}_1$ and $\mathrm{BTINT}_2$, for "Bogaert-Tison Internal". A transition with a symbol in one of these categories will make no test on the memory contents, but rather an equality or disequality test between the brother subterms directly under the current position of computation. In Figure 5, we describe the new transition categories. We use the same notation as in [4] for the constraints. Note that again, we only allow Bogaert-Tison constraints in internal rules.
For instance, if $f_{13}(t_1, t_2)$ is a subterm of the input tree, and if $t_1$ leads to $q_1(m_1)$, and $t_2$ to $q_2(m_2)$, then the transition rule $f_{13}(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_1)$, of type $\mathrm{BTINT}_1$, can be applied at this position iff $t_1 = t_2$.

Definition 4.1. A visibly tree automaton with memory and constraints and Bogaert-Tison tests ($\mathrm{BTVTAM}^{R}_{\neg R}$) on a signature $\Sigma$ is a tuple $(\Gamma, R, Q, Q_f, \Delta)$ where $\Gamma$, $Q$, $Q_f$ are defined as for TAM, $R$ is an equivalence relation on $T(\Gamma)$ and $\Delta$ is a set of rewrite rules in one of the above categories: PUSH, $\mathrm{POP}_{11}$, $\mathrm{POP}_{12}$, $\mathrm{POP}_{21}$, $\mathrm{POP}_{22}$, $\mathrm{INT}_0$, $\mathrm{INT}_1$, $\mathrm{INT}_2$, $\mathrm{INT}^{R}_1$, $\mathrm{INT}^{R}_2$, $\mathrm{BTINT}_1$, $\mathrm{BTINT}_2$.

The acceptance of terms of $T(\Sigma)$ and the languages of terms and memories are defined and denoted as in Section 2.1.

The definition of complete $\mathrm{BTVTAM}^{R}_{\neg R}$ is the same as before. Every $\mathrm{BTVTAM}^{R}_{\neg R}$ can be completed (with a polynomial overhead) by the addition of a trash state $q_\bot$ (the construction is similar to the one for $\mathrm{VTAM}^{R}_{\neg R}$ in Section 3.1).

The definition of deterministic $\mathrm{BTVTAM}^{R}_{\neg R}$ is based on the same conditions as for $\mathrm{VTAM}^{R}_{\neg R}$ for the function symbols in categories $\mathrm{PUSH}_0$, PUSH, $\mathrm{POP}_{11}$, ..., $\mathrm{POP}_{22}$, $\mathrm{INT}_1$, $\mathrm{INT}_2$, $\mathrm{INT}^{R}_1$, $\mathrm{INT}^{R}_2$, and for the function symbols of $\mathrm{BTINT}_1$, $\mathrm{BTINT}_2$, we use the same kind of conditions as for $\mathrm{INT}^{R}_1$, $\mathrm{INT}^{R}_2$: for all $f \in \Sigma_{\mathrm{BTINT}_1} \cup \Sigma_{\mathrm{BTINT}_2}$ and for all $q_1, q_2 \in Q$, there are at most two rules in $\Delta$ with left member $f(q_1(y_1), q_2(y_2))$, and if there are two, then their constraints have different signs.

Theorem 4.2. For every $\mathrm{BTVTAM}^{\equiv}_{\not\equiv}$ $A = (\Gamma, \equiv, Q, Q_f, \Delta)$ there exists a deterministic $\mathrm{BTVTAM}^{\equiv}_{\not\equiv}$ $A^{det} = (\Gamma^{det}, \equiv, Q^{det}, Q^{det}_f, \Delta^{det})$ such that $L(A) = L(A^{det})$, where $|Q^{det}|$ and $|\Gamma^{det}|$ both are $O(2^{|Q|^2})$.

Proof.
We use, again, the same construction as in the proof of Theorem 2.3, with a direct extension of the construction for INT to $\mathrm{INT}^{\equiv}$ and BTINT. As mentioned in Theorem 3.14, the extension works for $\mathrm{INT}^{\equiv}$ because the results of the tests are independent of the non-deterministic choices of the automaton. For BTINT it is exactly the same (the brother terms are not changed by the automaton!).

Theorem 4.3. The class of tree languages of $\mathrm{BTVTAM}^{\equiv}_{\not\equiv}$ is closed under Boolean operations.

Proof. We use the same constructions as in Theorem 2.4 for union and intersection. For the intersection, as in Theorem 3.15, the constraints (even Bogaert-Tison tests) can be safely kept in product rules, thanks to the visibility condition. For the complementation, we use Theorem 4.2 and complementation.

The proof of the following theorem follows the same idea as the proof for Bogaert-Tison automata [4], but we need here to take care of the structural constraints on the memory contents. A consequence is that the complexity of emptiness decision is much higher.

Theorem 4.4. The emptiness problem is decidable for $\mathrm{BTVTAM}^{\equiv}_{\not\equiv}$.

Proof. Let $A$ be a $\mathrm{BTVTAM}^{\equiv}_{\not\equiv}$. First we determinize it into $A^{det}$ and assume that $A^{det}$ is also complete. Then, we delete the rules $\mathrm{BTINT}_1$ of the form $f(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_1)$ with $q_1$ distinct from $q_2$ (idem for $\mathrm{BTINT}_2$ rules) because they cannot be used (the automaton is deterministic so one term cannot lead to two different states). For the same reason, we change each rule $\mathrm{BTINT}_1$ of the form $f(q_1(y_1), q_2(y_2)) \xrightarrow{1 \neq 2} q(y_1)$ with $q_1$ distinct from $q_2$ (idem for $\mathrm{BTINT}_2$ rules) into the same rule but without the disequality test: $f(q_1(y_1), q_2(y_2)) \to q(y_1)$. We call the newly obtained automaton $A^{new}$. It is still deterministic and recognizes the same language as $A^{det}$.
Actually, the careful reader may notice that $A^{new}$ is not a true $\mathrm{BTVTAM}^{\equiv}_{\not\equiv}$, because some unconstrained rules may involve symbols in BTINT in this automaton. However, it is just an intermediate step in the construction of another automaton $A'$ below.

Now, we consider the remaining $\mathrm{BTINT}_1$ or $\mathrm{BTINT}_2$ rules with negative Bogaert-Tison constraints, which are of the form $f(q_1(y_1), q_1(y_2)) \xrightarrow{1 \neq 2} q(y_1)$ (or $q(y_2)$). We denote them by $R_1, \ldots, R_i, \ldots, R_N$, and denote by $q_i$ the state in the left member of $R_i$, for each $i \le N$. We also denote the corresponding $\mathrm{BTINT}_1$ or $\mathrm{BTINT}_2$ rules with positive constraints by $S_1, \ldots, S_i, \ldots, S_N$. Note that, since $A^{det}$ is deterministic and complete, we can associate to each rule of $\mathrm{BTINT}_i$ whose constraint is negative a unique rule of $\mathrm{BTINT}_i$ with a positive constraint and the same states in its left member. So, the state in the left member of $S_i$ is the same $q_i$ as for $R_i$.

It is important to notice that if a rule $R_i$ can effectively be used, then there must exist two distinct terms leading to the state $q_i$ (we will call them witnesses). If not, the rule can be removed. So, our purpose is now to find, for each rule $R_i$, whether two witnesses exist or not. We let $\mathcal{R}$ be initially $\{R_1, \ldots, R_N\}$.

Suppose that at least one $R_i$ rule can be used, and consider a run on a term $t$ that uses such a rule. We consider an innermost application of a rule $R_i$ in this run on a subterm $f(t_1, t_2)$. The run on $t_1$ and the run on $t_2$ both lead to the state $q_i$, without any use of an $R_j$ rule.

Let us remove all the $R_i$ rules from $A^{new}$, and remove all the equality tests in the $S_i$ rules. Let $A'$ be the resulting automaton. It is a deterministic $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ (considering the
symbols in BTINT as INT symbols in this new automaton), and each term in $L(A', q_i)$ can be transformed (we will call it BT-transformation) into a term in $L(A^{new}, q_i)$: each time we use a modified $S_i$ rule, for instance of type $\mathrm{BTINT}_1$, on a subtree $f(t_1, t_2)$, we replace $t_2$ with $t_1$ so that the equality test is satisfied (and the resulting memory is unchanged). Important: all the replacements must be performed bottom-up.

The proof of the emptiness decidability of $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ (Corollary 3.12) is constructive, hence if we choose a reachable state $q_j$, we can find a term in $L(A', q_j)$ leading to this state, and then convert it into a witness. So, we can find a first witness $t_A \in L(A^{new}, q_j)$. If no witness can be found, then all the $R_i$ rules are useless and we can definitely remove them all. Otherwise, we still need to find another witness, and if there is at least one such other witness, then one of them can be recognized without using an $R_i$ rule.

We can construct a $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ recognizing all the terms whose BT-transformation leads to $t_A$. To design it, we read $t_A$ top-down (knowing the state of $A'$ at each node), and each time we see a subterm $f(t_1, t_2)$ to which a modified $S_i$ rule has to be applied, for instance a modified $\mathrm{BTINT}_1$ (resp. $\mathrm{BTINT}_2$) rule, the right (resp. left) son of $f$ only needs to be a term in $L(A', q_i)$, and the left (resp. right) son of $f$ only needs to be BT-transformed into $t_1$ (resp. $t_2$). Once this $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ is constructed, we can combine it with $A'$ in order to obtain a $\mathrm{VTAM}^{\equiv}_{\not\equiv}$ recognizing all the terms leading $A'$ to $q_j$ (the state reached by $A'$ on $t_A$) except the terms whose BT-transformation is $t_A$. Then we find another term in $L(A', q_j)$ (if it exists) and its BT-transformation is not $t_A$: it is actually another witness $t_B$.
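The first form of the BT-transformation used in this argument (restoring the equality tests of the modified $S_i$ rules) can be sketched in Python. This is a simplified illustration, not the construction of the proof: it ignores states and assumes that every occurrence of a listed symbol is processed by a modified $S_i$ rule. The dictionary `kind`, mapping a symbol to 1 ($\mathrm{BTINT}_1$, left memory kept) or 2 ($\mathrm{BTINT}_2$, right memory kept), and the function name `bt_transform` are hypothetical; the treatment of type 2 is the symmetric case, which the text does not spell out.

```python
# Assumed encoding: ("a",) is a constant, (symbol, left, right) a binary term.

def bt_transform(t, kind):
    """Copy one son over its brother wherever an equality test was dropped.

    kind maps a symbol to 1 (the left son is copied over the right one)
    or 2 (the right son is copied over the left one).
    """
    if len(t) == 1:
        return t
    head, left, right = t
    # Bottom-up: the sons are transformed before they are compared or copied.
    left, right = bt_transform(left, kind), bt_transform(right, kind)
    if kind.get(head) == 1:
        right = left      # replace t2 with t1, so that the test 1=2 holds
    elif kind.get(head) == 2:
        left = right      # symmetric case (assumption)
    return (head, left, right)


if __name__ == "__main__":
    a, b, c = ("a",), ("b",), ("c",)
    print(bt_transform(("f", ("f", a, b), c), {"f": 1}))
```

Performing the replacement after the recursive calls is what the remark "all the replacements must be performed bottom-up" asks for: the copied son already satisfies all the equality tests below it.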
When we have two witnesses for a rule $R_j$, we remove it from $\mathcal{R}$, and we add this rule $R_j$ to $A'$, but without the disequality test. The automaton $A'$ keeps its good property: a term $t$ leading $A'$ to some state $q$ can be BT-transformed into a term leading $A^{new}$ to state $q$: when we "meet" the use of a rule formerly in the set $\mathcal{R}$ on $f(t_1, t_1)$ during the bottom-up exploration of $t$, we replace the right son (for a rule that was of type $\mathrm{BTINT}_1$ with a negative constraint) or the left son (otherwise) by a witness different from $t_1$, so that the disequality test is satisfied. Note that even if $t_1$ is a witness, we can do so because we have found two witnesses.

With the new rule in $A'$ we look for two witnesses for some remaining $R_i$ rule. Again, we can show that if a couple of witnesses exists, then at least one couple can be found without any use of the remaining $R_i$ rules. When we find a first witness $t_A$ for a remaining rule $R_j$, we can find another one (if it exists) using approximately the same technique as previously: we read $t_A$ top-down, and when we see a rule formerly in $\mathcal{R}$, used on $f(t_1, t_2)$ (e.g. a rule formerly of type $\mathrm{BTINT}_1$ with a negative constraint), we just go on recursively, saying that the left son must be a term whose BT-transformation is $t_1$, and the right son must be either:
• a term whose BT-transformation is $t_2$,
• or, if our BT-transformation would change $f(t_1, t_1)$ into $f(t_1, t_2)$, a term whose BT-transformation is $t_1$.

As previously, we construct a $\mathrm{VTAM}^{\equiv}_{\not\equiv}$, fully using the Boolean closure of this class, that recognizes the terms in $L(A', q_j)$ (where $q_j$ is the state reached by $A'$ on $t_A$), except those whose BT-transformation is $t_A$, and therefore we can find another witness $t_B$ (if it exists).
We continue to use this method, finding couples of witnesses, until there is no rule in the set $\mathcal{R}$ anymore, or until we are not able to find a new couple of witnesses anymore: in that latter case, we remove the remaining $R_i$ rules because they are useless.

So, now we use the final version of $A'$ obtained in order to find a term leading to a final state, and since we have a couple of witnesses for each rule formerly in the set $\mathcal{R}$, we can BT-transform it into a term accepted by $A^{new}$ (hence by $A$). If such a term does not exist, the language recognized by $A^{new}$ (i.e. the language recognized by $A$) is empty.

5. Conclusion

Having a tree memory structure instead of a stack is sometimes more relevant (even when the input function symbols are only of arities 1 and 0). We have shown how to extend the visibly pushdown languages to such memory structures, keeping the determinization and closure properties of VPL. Our second contribution is then to extend this automaton model, constraining the transition rules with some regular conditions on memory contents. The structural equality and disequality tests appear to be a good class of constraints since we then have both decidability of emptiness and Boolean closure properties. Moreover, they can be combined (while keeping decidability and closure results) with equality and disequality tests à la [4], operating on brother subterms of the term read.

Several further studies can be done on the automata of this paper. For instance, the problem of the closure of the corresponding tree languages under certain classes of term rewriting systems is particularly interesting, as it can be applied to the verification of infinite state systems with regular model checking techniques.
It could be interesting as well to study how the definition of VTAM can be extended to deal with unranked trees, with the perspective of applications to problems related to semi-structured documents processing.

Acknowledgments. The authors wish to thank Pierre Réty for having noted some mistakes in the examples in the extended abstract, and for having sent us a basis of comparison of VTAM with (top down) Visibly Pushdown Tree Automata, and Jean Goubault-Larrecq for his suggestion to refer to H3 [19] in the proof of Theorem 2.5, and the reviewers for their useful and numerous remarks and suggestions.

References

[1] R. Alur, S. Chaudhuri, and P. Madhusudan. Visibly pushdown tree languages. Available on: http://www.cis.upenn.edu/~swarat/pubs/vptl.ps, 2006.
[2] R. Alur and P. Madhusudan. Visibly pushdown languages. In L. Babai, editor, Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC 2004), pages 202–211. ACM, 2004.
[3] L. Bachmair and H. Ganzinger. Resolution theorem proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, chapter 2. North Holland, 2001.
[4] B. Bogaert and S. Tison. Equality and Disequality Constraints on Direct Subterms in Tree Automata. In 9th Symp. on Theoretical Aspects of Computer Science, STACS, volume 577 of LNCS, pages 161–171. Springer, 1992.
[5] J. Chabin and P. Réty. Visibly pushdown languages and term rewriting. In Proc. 6th International Symposium Frontiers of Combining Systems (FroCoS), volume 4720 of Lecture Notes in Computer Science, pages 252–266. Springer, 2007.
[6] W. Charatonik and A. Podelski. Set constraints with intersection. In Proc. IEEE Symposium on Logic in Computer Science, Warsaw, 1997.
[7] H. Comon and V. Cortier. Tree automata with one memory, set constraints and cryptographic protocols. Theoretical Computer Science, 331(1):143–214, Feb.
2005.
[8] H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree Automata Techniques and Applications. http://www.grappa.univ-lille3.fr/tata, 1997.
[9] H. Comon and F. Jacquemard. Ground reducibility is EXPTIME-complete. Information and Computation, 187(1):123–153, 2003.
[10] J.-L. Coquidé, M. Dauchet, R. Gilleron, and S. Vágvölgyi. Bottom-up tree pushdown automata: classification and connection with rewrite systems. Theoretical Computer Science, 127(1):69–98, 1994.
[11] N. Dershowitz and J.-P. Jouannaud. Rewrite systems, chapter Handbook of Theoretical Computer Science, Volume B, pages 243–320. Elsevier, 1990.
[12] T. Frühwirth, E. Shapiro, M. Vardi, and E. Yardeni. Logic programs as types for logic programs. In Proc. of the 6th IEEE Symposium on Logic in Computer Science, pages 300–309, 1991.
[13] J. Goubault-Larrecq. Résolution ordonnée avec sélection et classes décidables en logique du premier ordre. Lecture Notes, 2006. Available at http://www.lsv.ens-cachan.fr/~goubault/SOresol.pdf.
[14] I. Guessarian. Pushdown tree automata. Theory of Computing Systems, 16(1):237–263, 1983.
[15] P. Habermehl, R. Iosif, and T. Vojnar. Automata-based verification of programs with tree updates. In Proc. 12th Intern. Conf. on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'06), volume 3920 of LNCS, April 2006.
[16] T. Jensen, D. L. Métayer, and T. Thorn. Verification of control flow based security policies. In Proceedings of the IEEE Symposium on Research in Security and Privacy, pages 89–103. IEEE Computer Society Press, 1999.
[17] D. Kapur. Essays in Honor of Larry Wos, chapter Constructors can be Partial Too. MIT Press, 1997.
[18] J. Misra. Powerlist: A structure for parallel recursion.
ACM Transactions on Programming Languages and Systems, 16(6):1737–1767, November 1994.
[19] F. Nielson, H. R. Nielson, and H. Seidl. Normalizable Horn clauses, strongly recognizable relations and Spi. In Proc. 9th Static Analysis Symposium (SAS), volume 2477 of LNCS, pages 20–35, 2002.
[20] R. Nieuwenhuis and A. Rubio. Paramodulation-based theorem proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, chapter 7. North Holland, 2001.
[21] K. M. Schimpf and J. Gallier. Tree pushdown automata. Journal of Computer and System Sciences, 30(1):25–40, 1985.

Appendix: Two-way tree automata with structural equality constraints are as expressive as standard tree automata.

In this section, we complete the proof of Lemma 3.7. We actually show a more general result: we consider two-way alternating tree automata with some regular constraints and show that the language they recognize is also accepted by a standard tree automaton. This generalizes the proof for two-way alternating tree automata (see e.g. [8] chapter 7) and the proof for two-way automata with equality tests [7], which itself relies on a transformation from two-way automata to one-way automata [6].

Two-way automata are, as usual, automata that can move up and down, and alternation consists (as usual) in spawning two copies of the tree in different states, requiring acceptance of both copies. In the logical formalism, alternation simply corresponds to clauses $q_1(x), q_2(x) \to q(x)$, requiring to accept $x$ both in state $q_1$ and in state $q_2$ if one wants to accept $x$ in state $q$.

For simplicity, we assume that all function symbols have arity 0 or 2. Lexical conventions:
• $f, g, h, \ldots$ range over symbols of arity 2. Unless explicitly stated, they may denote identical symbols.
• $a, b, c, \ldots$
range over constants,
• $x, x_1, \ldots, x_i, \ldots, y, \ldots, y_i, z, \ldots, z_i, \ldots$ are (universally quantified) first-order variables,
• $S, S_1, S_2, \ldots, S_i, \ldots$ range over state symbols for a fixed given tree automaton,
• $Q, Q_1, Q_2, \ldots$ range over state symbols of the tree automaton with memory,
• $R, R_1, R_2, \ldots$ range over state symbols of the binary recognizable relations.

We assume that the $R_i$ are recognizable relations defined by clauses of the form:

(A) $\Rightarrow R(a, b)$
(B) $S_1(x), S_2(y) \Rightarrow R(f(x, y), a)$
(C) $S_1(x), S_2(y) \Rightarrow R(a, f(x, y))$
(D) $R_1(x_1, x_2), R_2(y_1, y_2) \Rightarrow R_3(f(x_1, y_1), g(x_2, y_2))$
(E) $S_1(x), S_2(y) \Rightarrow S(f(x, y))$
(F) $\Rightarrow S(a)$

We assume wlog that there is a state $S_\top$ in which all trees are accepted (a "trash state"). Moreover, we will need in what follows an additional property of the $R_i$'s:

$\forall i, j, \exists k, l, \quad R_i(x, y) \wedge R_j(y, z) \mathrel{|{=}|} R_k(x, y) \wedge R_l(x, z)$

This property is satisfied by the structural equivalence, for which there is only one index $i$: $R_i = {\equiv}$, and we have indeed

$x \equiv y \wedge y \equiv z \mathrel{|{=}|} x \equiv y \wedge x \equiv z$

It is also satisfied by the universal binary relation and by the equality relation. That is why this generalizes corresponding results of [8, 7].

Our automata are defined by a finite set of clauses of the form:

(1) $Q_1(y_1), Q_2(y_2), R(y_1, y_2) \Rightarrow Q_3(y_1)$
(2) $Q_1(y_1), Q_2(y_2) \Rightarrow Q_3(f(y_1, y_2))$
(2b) $\Rightarrow Q_1(a)$
(3) $Q_1(f(y_1, y_2)), Q_2(y_3) \Rightarrow Q_3(y_1)$
(4) $Q_1(f(y_1, y_2)), Q_2(y_3) \Rightarrow Q_3(y_2)$

These clauses have a least Herbrand model. We write $[\![Q]\!]$ for the interpretation of $Q$ in this model. This is the language recognized by the automaton in state $Q$.
The goal is to prove that, for every $Q$, $[\![Q]\!]$ is recognized by a finite tree automaton.

We use a selection strategy with splitting, and complete the rules (1)-(4) above. We show that the completion terminates and that we get out of it a tree automaton which accepts exactly the memory contents. Splitting will introduce nullary predicate symbols (propositional variables).

We consider the following selection strategy. Let $E_1$ be the set of literals which contain at least one function symbol and $E_2$ be the set of negative literals.

(1) If the clause contains a negative literal $\neg R(u, v)$ or a negative literal $\neg S(u)$ where one of $u, v$ is not a variable, then select such literals only. This case is ruled out in what follows.
(2) If the clause contains at least one negated propositional variable, select the negated propositional variables only. This case is ruled out in what follows.
(3) If $E_1 \cap E_2 \neq \emptyset$, then select $E_1 \cap E_2$.
(4) If $E_1 \neq \emptyset$ and $E_1 \cap E_2 = \emptyset$, then select $E_1$.
(5) If $E_1 = \emptyset$ and $E_2 \neq \emptyset$, then select the negative literals $\neg R(x, y)$ and $\neg S(x)$ if any, otherwise select $E_2$.
(6) Otherwise, select the only literal of the clause.

In what follows (and precedes), selected literals are underlined.

We introduce the procedure by starting to run the completion with the selection strategy, before showing the general form of the clauses we get.

First, clauses of the form (3), (4) are replaced (using splitting) with clauses of the form

(3) $Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_1)$
(4) $Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_2)$
(s1) $Q_2(x) \Rightarrow \mathit{NE}_{Q_2}$

Overlapping (s1) and (2, 2b) may yield clauses of the form

(s2) $\mathit{NE}_{Q_1}, \mathit{NE}_{Q_2} \Rightarrow \mathit{NE}_{Q_3}$
(s3) $\Rightarrow \mathit{NE}_{Q}$

together with new clauses of the form (s1).
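The splitting step that turns clause (3) into its split form plus (s1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: atoms are unary and encoded as `(predicate, argument)` pairs, variables are strings starting with `?`, propositional atoms carry the argument `None`, and the function name `split` is hypothetical. A body variable is treated as local when it occurs neither in the head nor inside a compound argument; the atoms on a local variable are replaced by one propositional variable $\mathit{NE}$ indexed by their predicates.

```python
def is_var(t):
    """Variables are strings starting with '?' (an encoding chosen for this sketch)."""
    return isinstance(t, str) and t.startswith("?")


def vars_of(t):
    """Set of variables occurring in a term (None stands for 'no argument')."""
    if t is None:
        return set()
    if isinstance(t, str):
        return {t} if is_var(t) else set()
    return set().union(*(vars_of(s) for s in t[1:]))


def split(body, head):
    """Replace atoms on local variables by propositional NE atoms.

    Returns the split clause and the list of defining clauses for the new NE atoms.
    """
    head_vars = vars_of(head[1])
    nested = set()
    for _, arg in body:
        if not is_var(arg):
            nested |= vars_of(arg)
    kept, local = [], {}
    for pred, arg in body:
        if is_var(arg) and arg not in head_vars and arg not in nested:
            local.setdefault(arg, set()).add(pred)
        else:
            kept.append((pred, arg))
    definitions = []
    for preds in local.values():
        ne = "NE{" + ",".join(sorted(preds)) + "}"
        kept.append((ne, None))
        definitions.append(([(p, "?x") for p in sorted(preds)], (ne, None)))
    return (kept, head), definitions


# Clause (3): Q1(f(y1, y2)), Q2(y3) => Q3(y1)
clause_body = [("Q1", ("f", "?y1", "?y2")), ("Q2", "?y3")]
clause_head = ("Q3", "?y1")
main, defs = split(clause_body, clause_head)
```

Here `main` encodes $Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_1)$ and `defs` holds the single clause $Q_2(x) \Rightarrow \mathit{NE}_{Q_2}$, matching the split form of (3) and (s1). Binary atoms $R(x, y)$ are deliberately left out of this sketch.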
Eventually, we may reach, using (s3) and the clauses (3)-(4):

(3b) $Q_1(f(y_1, y_2)) \Rightarrow Q_3(y_1)$
(4b) $Q_1(f(y_1, y_2)) \Rightarrow Q_3(y_2)$

(1) + (2) yields clauses of the form

(5.1) $Q_1(y_1), Q_2(y_2), Q_3(g(y_3, y_4)), R_1(y_1, y_3), R_2(y_2, y_4) \Rightarrow Q_4(f(y_1, y_2))$
(5.2) $Q_1(y_1), Q_2(y_2), Q_3(a), S_1(y_1), S_2(y_2) \Rightarrow Q_4(f(y_1, y_2))$
(5.3) $Q_1(a) \Rightarrow Q_2(b)$
(5.4) $S_1(y_1), S_2(y_2), Q_1(f(y_1, y_2)) \Rightarrow Q_2(a)$

(2) + (3b) and (2) + (4b) yield clauses of the form (after splitting):

(6) $\mathit{NE}_{Q_3}, Q_1(y_1) \Rightarrow Q_2(y_1)$

and eventually

(6b) $Q_1(y_1) \Rightarrow Q_2(y_1)$

(5.1) + (2) yields

(7.1) $Q_1(y_1), Q_2(y_2), Q_3(y_3), Q_4(y_4), R_1(y_1, y_3), R_2(y_2, y_4) \Rightarrow Q_5(f(y_1, y_2))$

We split (7.1): we introduce new predicate symbols $Q_i^{R_j}$ defined by

$Q_i(y), R_j(x, y) \Rightarrow Q_i^{R_j}(x)$

Then clauses (7.1) become:

(7.1) $Q_1(y_1), Q_2(y_2), Q_3^{R_1}(y_1), Q_4^{R_2}(y_2) \Rightarrow Q_5(f(y_1, y_2))$

(5.2) + (2b) yields clauses of the form

(7.2) $Q_1(y_1), Q_2(y_2), S_1(y_1), S_2(y_2) \Rightarrow Q_3(f(y_1, y_2))$

(6b) + (2) yields new clauses of the form (2). (7.1) + (5.1) yields clauses of the form:

(8.1) $Q_1(y_1), Q_2(y_2), Q_3^{R_3}(y_1), Q_4^{R_4}(y_2), Q_5(y_3), Q_6(y_4), R_1(y_3, y_1), R_2(y_4, y_2) \Rightarrow Q_7(f(y_3, y_4))$

At this point, we use the property of $R$ and split the clause:

\[ \exists y_1.\ Q_1(y_1) \wedge Q_3^{R_3}(y_1) \wedge R_1(y_3, y_1) \mathrel{|\!\!=\!\!|} Q_1^{R_4}(y_1) \wedge Q_3^{R_5}(y_1) \]

Hence clauses (8.1) can be rewritten into clauses of the form:

(8.1) $Q_1^{R_1}(y_1), Q_3^{R_3}(y_1), Q_5(y_1), Q_2^{R_2}(y_2), Q_4^{R_4}(y_2), Q_6(y_2) \Rightarrow Q_7(f(y_1, y_2))$

Finally, we let $\mathcal{Q}$ be the set of predicate symbols consisting of
• symbols $S_i$,
• symbols $Q_i$,
• symbols $Q_i^{R_j}$.

For every subset $S$ of $\mathcal{Q}$, we introduce a propositional variable $\mathit{NE}_S$. Clauses are split, introducing new propositional variables (or predicate symbols $Q_i^{R_j}$), in such a way that in all clauses except split clauses, the variables occurring on the left also occur on the right of the clause. In split clauses, there is only one variable occurring on the left and not on the right.

We let $\mathcal{C}$ be the set of clauses obtained by repeated applications of resolution with splitting, with the above selection strategy (a priori, $\mathcal{C}$ could be infinite). We claim that all generated clauses are of one of the following forms (where the $P_i$'s and the $P'_i$'s belong to $\mathcal{Q}$; the $Q$ states might actually be $Q_i^{R_j}$).

1. Pop clauses (the original clauses, which are not subsumed by the new clauses):

(3) $Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_1)$
(4) $Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_2)$
(3b) $Q_1(f(y_1, y_2)) \Rightarrow Q_2(y_1)$
(4b) $Q_1(f(y_1, y_2)) \Rightarrow Q_2(y_2)$

Note that clause (1) is a particular case of the alternating clauses below, since it can be written $Q_1(y_1), Q_2^{R}(y_1) \Rightarrow Q_3(y_1)$.

2. Push clauses.

(P1) $P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y) \Rightarrow Q(f(x, y))$
(P2) $\Rightarrow P(a)$
(P3) $\mathit{NE}_S, P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y) \Rightarrow Q(f(x, y))$
(P4) $\mathit{NE}_S \Rightarrow Q(a)$

3. Intermediate clauses.

(I1) $P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y), P''_1(f(x, y)), \ldots, P''_k(f(x, y)) \Rightarrow Q(f(x, y))$
(I2) $P_1(a), \ldots, P_n(a) \Rightarrow Q(a)$
(I3) $S_1(x_1), S_2(x_2), Q_1(a) \Rightarrow Q_2(g(x_1, x_2))$
(I4) $Q_1(a) \Rightarrow Q_2(b)$

4. Alternating clauses.

(A1) $\mathit{NE}_S, P_1(x), \ldots, P_n(x) \Rightarrow Q(x)$
(A2) $P_1(x), \ldots, P_n(x) \Rightarrow Q(x)$

In addition, we have clauses obtained by splitting:

5. Split clauses.

(S1) $R_j(x, y), Q_i(y) \Rightarrow Q_i^{R_j}(x)$
(S1b) $R_j(y, x), Q_i(y) \Rightarrow Q_i^{-R_j}(x)$
(S2) $R_1(x_1, y_1), R_2(x_2, y_2), Q_i(f(y_1, y_2)) \Rightarrow Q_i^{\pm R_j}(g(x_1, x_2))$
(S3) $S_1(x), S_2(y), Q_i(f(x, y)) \Rightarrow Q_i^{\pm R_j}(a)$
(S4) $P_1(x), \ldots, P_n(x) \Rightarrow \mathit{NE}_{\{P_1, \ldots, P_n\}}$
(S5) $P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y), P''_1(f(x, y)), \ldots, P''_k(f(x, y)) \Rightarrow \mathit{NE}_S$

6. Propositional clauses.

(E1) $\mathit{NE}_{S_1}, \ldots, \mathit{NE}_{S_n} \Rightarrow \mathit{NE}_S$
(E2) $\Rightarrow \mathit{NE}_S$
(E3) $P_1(a), \ldots, P_n(a) \Rightarrow \mathit{NE}_S$

Every resolution step, using the selection strategy, between two of the above clauses yields a clause in the above set:

POP + PUSH: yields an alternating clause (A1) and a split clause (S4).
INT + PUSH: yields a push clause or an intermediate clause.
alternating + PUSH: yields an intermediate clause (I1) or (I2).
split + R: yields a split clause (S2) or (S3), or an intermediate clause (I3) or (I4).
(S2) + PUSH: yields clauses (S1) and push clauses. Note that here, we use the property of the relation $R$ to split clauses, which may involve predicates $Q_i^{R_j}$.
(S3) + PUSH: yields a push clause and split clauses (S4).
(S4) + PUSH: yields split clauses (S5) or a propositional clause (E3).
(S5) + PUSH: yields split clauses (S5) or a propositional clause (E1).

It follows that all clauses of $\mathcal{C}$ are of the above form. Since there are only finitely many such clauses, $\mathcal{C}$ is finite and computed in finite (exponential) time.
Now, we let $\mathcal{A}$ be the alternating tree automaton defined by clauses (P1) and (P2) (and the automata clauses defining the $S$ states). Let, for any state $Q$, $[\![Q]\!]_{\mathcal{A}}$ be the language accepted in state $Q$ by $\mathcal{A}$. We claim that $[\![Q]\!] = [\![Q]\!]_{\mathcal{A}}$.

To prove this, we first show (the proof is omitted here) that $\mathit{NE}_{\{P_1, \ldots, P_n\}}$ is in $\mathcal{C}$ iff $[\![P_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P_n]\!]_{\mathcal{A}} \neq \emptyset$. Then observe that $[\![Q]\!]$ is also the interpretation of $Q$ in the least Herbrand model of $\mathcal{C}$: indeed, all computations yielding $\mathcal{C}$ are correct.

Since $[\![Q]\!]_{\mathcal{A}} \subseteq [\![Q]\!]$ is trivial, we only have to prove the converse inclusion. For every $t \in [\![Q]\!]$ there is a proof of $Q(t)$ using the clauses in $\mathcal{C}$. Assume, by contradiction, that there is a term $t$ and a predicate symbol $Q$ such that all proofs of $Q(t)$ using the clauses in $\mathcal{C}$ involve at least one clause which is not an automaton clause. Then, considering an appropriate sub-proof, there is a term $u$ and a predicate symbol $P$ such that all proofs of $P(u)$ involve at least one non-automaton clause and there is a proof of $P(u)$ which uses exactly one non-automaton clause, at the last step of the proof. We investigate all possible cases for the last clause used in the proof of $P(u)$ and derive a contradiction in each case.

Clause (I1): The last step of the proof is

\[ \frac{P_1(u_1), \ldots, P_n(u_1), \quad P'_1(u_2), \ldots, P'_m(u_2), \quad P''_1(f(u_1, u_2)), \ldots, P''_k(f(u_1, u_2))}{P(f(u_1, u_2))} \]

and we assume $u = f(u_1, u_2)$. Assume also that, among the proofs we consider, $k$ is minimal. (If $k = 0$ then we have a push clause, which is supposed not to be the case.) By hypothesis, for all $i$, $u_1 \in [\![P_i]\!]_{\mathcal{A}}$, $u_2 \in [\![P'_i]\!]_{\mathcal{A}}$ and $f(u_1, u_2) \in [\![P''_i]\!]_{\mathcal{A}}$. In particular, the last clause used in the proof of $P''_k(u)$,

$Q_1(x), \ldots, Q_r(x), Q'_1(y), \ldots, Q'_s(y) \Rightarrow P''_k(f(x, y))$

belongs to $\mathcal{C}$. Then, overlapping this clause with the above clause (I1), the following clause also belongs to $\mathcal{C}$:

$P_1(x), \ldots, P_n(x), Q_1(x), \ldots, Q_r(x), P'_1(y), \ldots, P'_m(y), Q'_1(y), \ldots, Q'_s(y), P''_1(f(x, y)), \ldots, P''_{k-1}(f(x, y)) \Rightarrow P(f(x, y))$

and therefore we have another proof of $P(u)$:

\[ \frac{P_1(u_1), \ldots, P_n(u_1), Q_1(u_1), \ldots, Q_r(u_1), \quad P'_1(u_2), \ldots, P'_m(u_2), Q'_1(u_2), \ldots, Q'_s(u_2), \quad P''_1(f(u_1, u_2)), \ldots, P''_{k-1}(f(u_1, u_2))}{P(f(u_1, u_2))} \]

which contradicts the minimality of $k$.

Clause (A1): The last step of the proof is

\[ \frac{P_1(u), \ldots, P_n(u)}{P(u)} \]

By hypothesis, the proofs of $P_i(u)$ only use automata clauses: $\forall i.\ u \in [\![P_i]\!]_{\mathcal{A}}$. Let the push rule

$Q_1(x), \ldots, Q_m(x), Q'_1(y), \ldots, Q'_p(y) \Rightarrow P_n(f(x, y))$

be the last clause used in the proof of $P_n(u)$. Overlapping this clause and the clause (A1) above, there is another clause in $\mathcal{C}$ yielding a proof of $P(u)$:

$Q_1(x), \ldots, Q_m(x), Q'_1(y), \ldots, Q'_p(y), P_1(f(x, y)), \ldots, P_{n-1}(f(x, y)) \Rightarrow P(f(x, y))$

And we are back to the case of (I1).

Clause (3b): The last step of the proof is

\[ \frac{Q_1(f(u, t))}{P(u)} \]

By hypothesis, $f(u, t) \in [\![Q_1]\!]_{\mathcal{A}}$. Hence there is a push clause

$P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y) \Rightarrow Q_1(f(x, y))$

such that $u \in [\![P_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P_n]\!]_{\mathcal{A}}$ and $t \in [\![P'_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P'_m]\!]_{\mathcal{A}}$. By resolution on the clause (3b), there is also in $\mathcal{C}$ a clause

$P_1(x), \ldots, P_n(x), \mathit{NE}_{\{P'_1, \ldots, P'_m\}} \Rightarrow P(x)$

However, since $[\![P'_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P'_m]\!]_{\mathcal{A}} \neq \emptyset$, $\mathit{NE}_{\{P'_1, \ldots, P'_m\}}$ is also in $\mathcal{C}$ and, by resolution again,

$P_1(x), \ldots, P_n(x) \Rightarrow P(x)$

is a clause of $\mathcal{C}$.
Then we are back to the case of (A1).

Clause (3): The last step of the proof is

\[ \frac{Q_1(f(u, t)) \quad \mathit{NE}_{Q_2}}{P(u)} \]

Since $\mathit{NE}_{Q_2} \in \mathcal{C}$ in this case, by saturation of $\mathcal{C}$, there is a clause $Q_1(f(x, y)) \Rightarrow P(x)$ in $\mathcal{C}$, and we are back to the case of (3b).

Other cases: they are quite similar to the previous ones. Let us only consider the case of clause (S2), which is slightly more complicated.

\[ \frac{R_1(u_1, v_1) \quad R_2(u_2, v_2) \quad Q_i(f(v_1, v_2))}{Q_i^{R_j}(g(u_1, u_2))} \]

Assume moreover that $u = g(u_1, u_2)$ is a minimal size term such that, for some $Q_i$, $R_j$, $Q_i^{R_j}(u)$ is provable using as a last step an inference (S2), and is not provable by automata clauses only. As before, we consider the overlap between (S2) and a push clause. We get

$R_1(x_1, y_1), R_2(x_2, y_2), P_1(y_1), \ldots, P_n(y_1), P'_1(y_2), \ldots, P'_m(y_2) \Rightarrow Q_i^{R_j}(g(x_1, x_2))$

Hence, the following clauses belong to $\mathcal{C}$ (when $P_i$, $P'_i$ are not themselves predicates $Q^R$; otherwise, we have to use the property of the $R$ relations and split in another way, using the $S_\top$ predicate, as shown later):

$R_1(x_1, y_1), P_i(y_1) \Rightarrow P_i^{R_1}(x_1)$
$R_2(x_2, y_2), P'_i(y_2) \Rightarrow {P'_i}^{R_2}(x_2)$
$P_1^{R_1}(x_1), \ldots, P_n^{R_1}(x_1), {P'_1}^{R_2}(x_2), \ldots, {P'_m}^{R_2}(x_2) \Rightarrow Q_i^{R_j}(g(x_1, x_2))$

and we have the following proof of $Q_i^{R_j}(g(u_1, u_2))$:

\[ \frac{\dfrac{R_1(u_1, v_1) \quad P_1(v_1)}{P_1^{R_1}(u_1)} \;\cdots\; \dfrac{R_1(u_1, v_n) \quad P_n(v_n)}{P_n^{R_1}(u_1)} \qquad \dfrac{R_2(u_2, w_1) \quad P'_1(w_1)}{{P'_1}^{R_2}(u_2)} \;\cdots\; \dfrac{R_2(u_2, w_m) \quad P'_m(w_m)}{{P'_m}^{R_2}(u_2)}}{Q_i^{R_j}(g(u_1, u_2))} \]

Now, by overlapping again $R_1(x_1, y_1)$ and $R_2(x_2, y_2)$ with their defining clauses, we compute "shortcut clauses" belonging to $\mathcal{C}$ and get another proof (for instance, assuming $v_1 = f(v_{11}, v_{12})$ and $u_1 = h(u_{11}, u_{12})$):

\[ \frac{\dfrac{R_{11}(u_{11}, v_{11}) \quad R_{12}(u_{12}, v_{12}) \quad P_1(f(v_{11}, v_{12}))}{P_1^{R_1}(u_1)} \;\cdots\; \dfrac{R_2(u_2, w_1) \quad P'_1(w_1)}{{P'_1}^{R_2}(u_2)} \;\cdots\; \dfrac{R_2(u_2, w_m) \quad P'_m(w_m)}{{P'_m}^{R_2}(u_2)}}{Q_i^{R_j}(g(u_1, u_2))} \]

By minimality of $u$, $u_1 \in [\![P_1^{R_1}]\!]_{\mathcal{A}}$. Similarly, for every $i$, $u_1 \in [\![P_i^{R_1}]\!]_{\mathcal{A}}$ and $u_2 \in [\![{P'_i}^{R_2}]\!]_{\mathcal{A}}$, and it follows that $g(u_1, u_2) \in [\![Q_i^{R_j}]\!]_{\mathcal{A}}$.

Finally, let us consider the case where some $P_i$ is itself a predicate symbol $Q^R$, in which case we do not have a predicate $(Q^R)^{R_1}$. We then use the assumed property of the predicates $R_i$: $R_1(x, y) \wedge R(y, z) \mathrel{|\!\!=\!\!|} R'_1(x, y) \wedge R'(x, z)$, hence

\[ (\exists u, \exists v.\ R_1(x, u) \wedge R(u, v) \wedge Q(v)) \mathrel{|\!\!=\!\!|} (\exists u.\ R'_1(x, u) \wedge S_\top(u)) \wedge (\exists v.\ R'(x, v) \wedge Q(v)) \]

Hence we need two split clauses instead of one:

$R'_1(x, y) \Rightarrow S_\top^{R'_1}(x)$
$R'(x, y), Q(y) \Rightarrow Q^{R'}(x)$

And $R_1(x_1, y_1), Q^R(y_1)$ is replaced with $S_\top^{R'_1}(x_1), Q^{R'}(x_1)$. Note that such a transformation is not necessary when there is a single transitive binary relation, as in our application: then $R(x, y) \wedge Q^R(y)$ is simply replaced with $Q^R(x)$.

To sum up: if there is a proof of $P(u)$ using clauses of $\mathcal{C}$, then, by saturation of the clauses of $\mathcal{C}$ w.r.t. overlaps with push clauses, we can rewrite the proof into a proof using push clauses only: $u \in [\![P]\!]_{\mathcal{A}}$. This proves that $[\![P]\!] = [\![P]\!]_{\mathcal{A}}$.

Finally, it is easy (and well known) to compute a standard bottom-up automaton accepting the same language as an alternating automaton; this only requires a subset construction. That is why the language accepted by our two-way automata with structural equality constraints is actually a recognizable language. The overall size of the resulting automaton (and its computation time) is simply exponential, but we know that, already for alternating automata, we cannot do better.

This work is licensed under the Creative Commons Attribution-NoDerivs License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
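The subset construction invoked in the last step of the appendix can be sketched as follows. This is a minimal sketch, assuming the alternating automaton is given only by push clauses of the forms (P1) and (P2); the encoding (constant clauses as `(a, P)` pairs, push clauses as `(left, right, f, Q)` tuples with `left` and `right` the sets of states required of the two subterms) and the names `determinize` and `run` are hypothetical. Each state of the resulting deterministic bottom-up automaton is the set of all alternating states that accept the current subterm.

```python
def determinize(const_clauses, push_clauses):
    """Build the reachable part of the subset automaton.

    const_clauses: iterable of (a, P), meaning  => P(a).
    push_clauses:  iterable of (left, right, f, Q), meaning
                   P(x) for P in left, P'(y) for P' in right  => Q(f(x, y)).
    """
    delta0 = {}
    for a, p in const_clauses:
        delta0.setdefault(a, set()).add(p)
    delta0 = {a: frozenset(ps) for a, ps in delta0.items()}
    states = set(delta0.values())
    funs = {f for _, _, f, _ in push_clauses}
    delta = {}
    changed = True
    while changed:
        changed = False
        for f in funs:
            for s1 in list(states):
                for s2 in list(states):
                    if (f, s1, s2) in delta:
                        continue
                    # A clause fires when all its premises hold in the subset states.
                    delta[(f, s1, s2)] = frozenset(
                        q for left, right, g, q in push_clauses
                        if g == f and left <= s1 and right <= s2)
                    states.add(delta[(f, s1, s2)])
                    changed = True
    return delta0, delta


def run(term, delta0, delta):
    """Set of alternating states accepting `term` (a str constant or a tuple (f, l, r))."""
    if isinstance(term, str):
        return delta0[term]
    f, left, right = term
    return delta[(f, run(left, delta0, delta), run(right, delta0, delta))]


# Hypothetical example:  => P(a),  => P2(a),  => P(b),
# and the alternating push clause  P(x), P2(x), P(y) => Q(f(x, y)).
const_clauses = [("a", "P"), ("a", "P2"), ("b", "P")]
push_clauses = [(frozenset({"P", "P2"}), frozenset({"P"}), "f", "Q")]
delta0, delta = determinize(const_clauses, push_clauses)
```

In this example `f(a, a)` and `f(a, b)` reach a subset containing `Q`, while `f(b, a)` does not, because `b` is not accepted in `P2`. The number of subset states can be exponential in the number of alternating states, in line with the exponential bound stated above.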
