Visibly Tree Automata with Memory and Constraints

Tree automata with one memory were introduced in 2001. They generalize both pushdown (word) automata and the tree automata with constraints of equality between brothers of Bogaert and Tison. Though this model has a decidable emptiness problem, the main …

Authors: Hubert Comon-Lundh, Florent Jacquemard, Nicolas Perrin

Logical Methods in Computer Science, Vol. 4 (2:8) 2008, pp. 1–36, www.lmcs-online.org
Submitted Sep. 20, 2007. Published Jun. 18, 2008.

HUBERT COMON-LUNDH (a), FLORENT JACQUEMARD (b), AND NICOLAS PERRIN (c)

(a) LSV, CNRS/ENS Cachan. E-mail address: h.comon-lundh@aist.go.jp
(b) INRIA Saclay & LSV (CNRS/ENS Cachan). E-mail address: florent.jacquemard@inria.fr
(c) ENS Lyon. E-mail address: nicolas.perrin@ens-lyon.fr

Abstract. Tree automata with one memory were introduced in 2001. They generalize both pushdown (word) automata and the tree automata with constraints of equality between brothers of Bogaert and Tison. Though it has a decidable emptiness problem, the main weakness of this model is its lack of good closure properties. We propose a generalization of the visibly pushdown automata of Alur and Madhusudan to a family of tree recognizers which carry along their (bottom-up) computation an auxiliary unbounded memory with a tree structure (instead of a symbol stack). In other words, these recognizers, called Visibly Tree Automata with Memory (VTAM), define a subclass of tree automata with one memory enjoying Boolean closure properties. We show in particular that they can be determinized and that problems like emptiness, membership, inclusion and universality are decidable for VTAM. Moreover, we propose several extensions of VTAM whose transitions may be constrained by different kinds of tests between memories and also by constraints à la Bogaert and Tison. We show that some of these classes of constrained VTAM keep the good closure and decidability properties, and we demonstrate their expressiveness with relevant examples of tree languages.

Introduction

The control flow of programs with calls to functions can be abstracted as pushdown systems. This allows one to reduce some program verification problems to problems (e.g.
model-checking) on pushdown automata. When it comes to functional languages with continuation passing style, the stack must contain information on continuations and has the structure of a dag (for jumps). Similarly, in the context of asynchronous concurrent programming languages, for two concurrent threads the ordering of return is not determined (synchronized) and these threads cannot be stacked. In these cases, the control flow is better modeled as a tree structure rather than a stack.

That is why we are interested in tree automata with one memory, which generalize the pushdown (tree) automata, replacing the stack with a tree. Here, a "memory" has to be understood as a storage device whose structure is a tree. For instance, two memories would correspond to two storage devices whose access would be independent.

1998 ACM Subject Classification: F.1.1; F.1.2; I.2.2; I.2.3.
Key words and phrases: Tree automata, Pushdown automata, Alternating automata, Symbolic constraints, First-order theorem proving.
An extended abstract containing some of the results presented in this paper has appeared in the proceedings of FOSSACS'07.
Logical Methods in Computer Science, DOI: 10.2168/LMCS-4(2:8)2008. © H. Comon-Lundh, F. Jacquemard, and N. Perrin. Creative Commons.

The tree automata with one memory introduced in [7] compute bottom-up on a tree, with an auxiliary memory carrying a tree, as in former works such as [14]. Along a computation, at any node of the tree, the memory is updated incrementally from the memory reached at the sons of the node. This update may consist in building a new tree from the memories at the sons (this generalizes a push) or retrieving a subtree of one of the memories at the sons (this generalizes a pop).
In addition, such automata may perform equality tests: a transition may be constrained to be performed only when the memories reached at some of the sons are identical. In this way, tree automata with one memory also generalize certain cases of tree automata with equality and disequality tests between brothers [4].

Automata with one memory have been introduced in the context of the verification of security protocols, where the messages exchanged are represented as trees. In the context of (functional or concurrent) programs, the creation of a thread, or a callcc, corresponds to a push, and the termination of a thread or a callcc corresponds to a pop. The emptiness problem for such automata is in EXPTIME (note that for the extension with a second memory the emptiness problem becomes undecidable). However, the class of tree languages defined by such automata is closed neither by intersection nor by complement. This is not surprising, as they are strictly more general than context-free languages.

On the other hand, Alur and Madhusudan have introduced the notion of visibility for pushdown automata [2], which is a relevant restriction in the context of control flow analysis. With this restriction, determinization is possible and the class of languages is actually closed under Boolean operations.

In this paper, we propose the new formalism of Visibly Tree Automata with Memory (VTAM). On one hand, it extends visibly pushdown languages to the recognition of trees, with a tree structure instead of a stack, following former approaches [14, 21, 10]. On the other hand, VTAM restrict tree automata with one memory, imposing a visibility condition on the transitions: each symbol is assigned a given type of action. When reading a symbol, the automaton can only perform the assigned type of action: push or pop.
We first show in Section 2 that VTAM can be determinized, using a proof similar to the proof of [2], and do have the good closure properties. The main difficulty here is to understand what is a good notion of visibility for trees, with memories instead of stacks. We also show that the problems of membership and emptiness are decidable in deterministic polynomial time for VTAM.

In a second part of the paper (Section 3), we extend VTAM with constraints. Our constraints here are recognizable relations; a transition can be fired only if the memory contents of the sons of the current node satisfy such a relation. We then give a general theorem, expressing conditions on the relations which ensure the decidability of emptiness. Such conditions are shown to be necessary on one hand and, on the other hand, we prove that they are satisfied by some examples, including syntactic equality and disequality tests and structural equality and disequality tests.

The case of VTAM with structural equality and disequality tests (this class is denoted $\mathrm{VTAM}_{\equiv}^{\not\equiv}$) is particularly interesting, since the determinization and closure properties of Section 2 carry over to this generalization, which we show in Section 3.4.2. The automata of $\mathrm{VTAM}_{\equiv}^{\not\equiv}$ also enjoy a good expressive power, as we show in Section 3.7 by presenting some non-trivial examples of languages in this class: well-balanced binary trees, red-black trees, powerlists...

As an intermediate result, we show that, in case of equality tests or structural equality tests, the language of memories that can be reached in a given state is always a regular language. This is a generalization of the well-known result that the set of stack contents in a pushdown automaton is always regular.
To prove this, we observe that the memory contents are recognized by a two-way alternating tree automaton with constraints. Then we show, using a saturation strategy, that two-way alternating tree automata with (structural) equality constraints are not more expressive than standard tree automata.

Finally, in Section 4 we propose a class of visibly tree automata which combines the structural constraints of $\mathrm{VTAM}_{\equiv}^{\not\equiv}$, testing memory contents, with the Bogaert-Tison constraints of [4] (equality and disequality tests between brother subterms), which operate on the input term. We show that the tree automata of this class can be determinized, are closed under Boolean operations and have a decidable emptiness problem.

Related Work. Generalizations of pushdown automata to trees (both for input and stack) are proposed in [14, 21, 10]. Our contributions are the generalization of the visibility condition of [2] to such tree automata (our VTAM without constraints strictly generalize the VP languages of [2]) and the addition of constraints on the stack contents. The visibly tree automata of [1] use a word stack, which is less general than a tree-structured memory, but the comparison with VTAM is not easy as they are alternating and compute top-down on infinite trees.

Independently, Chabin and Rety have proposed [5] a formalism combining the pushdown tree automata of [14] with the concept of visibly pushdown languages. Their automata recognize finite trees using a word stack. They have a decidable emptiness problem and the corresponding tree languages (Visibly Pushdown Tree Languages, VPTL) are closed under Boolean operations. Following remarks of one of these two authors, it appeared that VTAM and VPTL are incomparable, see Section 2.2.

1. Preliminaries

1.1. Term algebra. A signature $\Sigma$ is a finite set of function symbols with arity, denoted by $f, g, \ldots$
We write $\Sigma_n$ for the subset of function symbols of $\Sigma$ of arity $n$. Given an infinite set $\mathcal{X}$ of variables, the set of terms built over $\Sigma$ and $\mathcal{X}$ is denoted $\mathcal{T}(\Sigma, \mathcal{X})$, and the subset of ground terms is denoted $\mathcal{T}(\Sigma)$. The set of variables occurring in a term $t \in \mathcal{T}(\Sigma, \mathcal{X})$ is denoted $\mathit{vars}(t)$. A substitution $\sigma$ is a mapping from $\mathcal{X}$ to $\mathcal{T}(\Sigma, \mathcal{X})$ such that $\{x \mid \sigma(x) \neq x\}$, the support of $\sigma$, is finite. The application of a substitution $\sigma$ to a term $t$ is written $t\sigma$. It is the homomorphic extension of $\sigma$ to $\mathcal{T}(\Sigma, \mathcal{X})$. The positions $\mathit{Pos}(t)$ in a term $t$ are sequences of positive integers ($\Lambda$, the empty sequence, is the root position). A subterm of $t$ at position $p$ is written $t|_p$, and the replacement in $t$ of the subterm at position $p$ by $u$ is denoted $t[u]_p$.

1.2. Rewriting. We assume standard definitions and notations for term rewriting [11]. A term rewriting system (TRS) over a signature $\Sigma$ is a finite set of rewrite rules $\ell \to r$, where $\ell \in \mathcal{T}(\Sigma, \mathcal{X})$ and $r \in \mathcal{T}(\Sigma, \mathit{vars}(\ell))$. A term $t \in \mathcal{T}(\Sigma, \mathcal{X})$ rewrites to $s$ by a TRS $\mathcal{R}$ (denoted $t \to_{\mathcal{R}} s$) if there is a rewrite rule $\ell \to r \in \mathcal{R}$, a position $p$ of $t$ and a substitution $\sigma$ such that $t|_p = \ell\sigma$ and $s = t[r\sigma]_p$. The transitive and reflexive closure of $\to_{\mathcal{R}}$ is denoted $\xrightarrow[\mathcal{R}]{*}$.

1.3. Tree Automata. Following the definitions and notation of [8], we consider tree automata which compute bottom-up (from leaves to root) on (finite) ground terms in $\mathcal{T}(\Sigma)$. At each stage of computation on a tree $t$, a tree automaton reads the function symbol $f$ at the current position $p$ in $t$ and updates its current state, according to $f$ and to the respective states reached at the positions immediately under $p$ in $t$.
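The bottom-up evaluation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not taken from the paper: terms are encoded as nested tuples, the transition rules as a dictionary, and the names `DELTA`, `reachable_states` and `accepts`, as well as the example automaton (which evaluates Boolean terms), are hypothetical.

```python
from itertools import product

# A term is a tuple (symbol, child_1, ..., child_n); a constant is (symbol,).
# DELTA maps (symbol, (q_1, ..., q_n)) to the set of states q such that
# symbol(q_1, ..., q_n) -> q is a transition rule (hypothetical encoding).
DELTA = {("true", ()): {"q1"}, ("false", ()): {"q0"}}
for a in (0, 1):
    for b in (0, 1):
        DELTA[("and", (f"q{a}", f"q{b}"))] = {f"q{a & b}"}
        DELTA[("or", (f"q{a}", f"q{b}"))] = {f"q{a | b}"}


def reachable_states(term, delta):
    """Return the set of states q such that term rewrites to q."""
    symbol, children = term[0], term[1:]
    # States reached at the positions immediately under the current one.
    child_states = [reachable_states(child, delta) for child in children]
    result = set()
    # Try every combination of states reached at the sons.
    for combo in product(*child_states):
        result |= delta.get((symbol, combo), set())
    return result


def accepts(term, delta, final_states):
    """A term is accepted iff it reaches at least one final state."""
    return bool(reachable_states(term, delta) & final_states)


t = ("and", ("true",), ("or", ("false",), ("true",)))
print(accepts(t, DELTA, {"q1"}))  # True
```

The function works with sets of states, so it also covers nondeterministic rule sets; for a deterministic automaton every returned set has at most one element.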
Formally, a bottom-up tree automaton (TA) $\mathcal{A}$ on a signature $\Sigma$ is a tuple $(Q, Q_f, \Delta)$ where $\Sigma$ is the computation signature, $Q$ is a finite set of nullary state symbols, disjoint from $\Sigma$, $Q_f \subseteq Q$ is the subset of final states and $\Delta$ is a set of rewrite rules of the form $f(q_1, \ldots, q_n) \to q$, where $f \in \Sigma_n$ and $q_1, \ldots, q_n, q \in Q$. A term $t$ is accepted (we may also write recognized) by $\mathcal{A}$ in state $q$ iff $t \xrightarrow[\Delta]{*} q$, and the language $L(\mathcal{A}, q)$ of $\mathcal{A}$ in state $q$ is the set of ground terms accepted in $q$. The language $L(\mathcal{A})$ of $\mathcal{A}$ is $\bigcup_{q \in Q_f} L(\mathcal{A}, q)$, and a set of ground terms is called regular if it is the language of a TA.

2. Visibly Tree Automata with Memory

We propose in this section a subclass of the tree automata with one memory [7] which is stable under Boolean operations and has decidable emptiness and membership problems.

2.1. Definition of VTAM. Tree automata have been extended [14, 21, 10, 7] to carry an unbounded information along the states in computations. In [7], this information is stored in a tree structure and is called memory. We keep this terminology here, and call our recognizers tree automata with memory (TAM). For consistency with the above formalisms, the memory contents will be ground terms over a memory signature $\Gamma$.

As for TA, we consider bottom-up computations of TAM in trees; at each stage of computation on a tree $t$, a TAM, like a TA, reads the function symbol at the current position $p$ in $t$ and updates its current state, according to the states reached immediately under $p$. Moreover, a configuration of a TAM contains not only a state but also a memory, which is a tree. The current memory is updated according to the respective contents of the memories reached in the nodes immediately under $p$ in $t$.

As above, we use term rewrite systems in order to define the transitions allowed in a TAM.
For this purpose, we add an argument to state symbols, which will contain the memory. Hence, a configuration of a TAM in state $q$ and whose memory content is the ground term $m \in \mathcal{T}(\Gamma)$ is represented by the term $q(m)$. We propose below a very general definition of TAM. It is similar to the one of [7], except that we have here general patterns $m_1, \ldots, m_n, m$, while these patterns are restricted in [7], for instance avoiding memory duplications. Since we aim at providing closure and decision properties, we will also impose (other) restrictions later on.

Definition 2.1. A bottom-up tree automaton with memory (TAM) on a signature $\Sigma$ is a tuple $(\Gamma, Q, Q_f, \Delta)$ where $\Gamma$ is a memory signature, $Q$ is a finite set of unary state symbols, disjoint from $\Sigma \cup \Gamma$, $Q_f \subseteq Q$ is the subset of final states and $\Delta$ is a set of rewrite rules of the form
\[ f\bigl(q_1(m_1), \ldots, q_n(m_n)\bigr) \to q(m) \]
where $f \in \Sigma_n$, $q_1, \ldots, q_n, q \in Q$ and $m_1, \ldots, m_n, m \in \mathcal{T}(\Gamma, \mathcal{X})$.

The rules of $\Delta$ are also called transition rules. A term $t$ is accepted by $\mathcal{A}$ in state $q \in Q$ and with memory $m \in \mathcal{T}(\Gamma)$ iff $t \xrightarrow[\Delta]{*} q(m)$, and the language $L(\mathcal{A}, q)$ and memory language $M(\mathcal{A}, q)$ of $\mathcal{A}$ in state $q$ are respectively defined by:
\[ L(\mathcal{A}, q) = \bigl\{ t \bigm| \exists m \in \mathcal{T}(\Gamma),\ t \xrightarrow[\Delta]{*} q(m) \bigr\} \]
\[ M(\mathcal{A}, q) = \bigl\{ m \bigm| \exists t \in \mathcal{T}(\Sigma),\ t \xrightarrow[\Delta]{*} q(m) \bigr\}. \]
The language of $\mathcal{A}$ is the union of the languages of $\mathcal{A}$ in its final states, denoted:
\[ L(\mathcal{A}) = \bigcup_{q \in Q_f} L(\mathcal{A}, q). \]

Visibility Condition. The above formalism is of course far too expressive. As there are no restrictions on the operation performed on memory by the rewrite rules, one can easily encode a Turing machine as a TAM. We shall now define a decidable restriction called visibly tree automata with memory (VTAM).
First, we consider only three main families (later divided into the subcategories defined in Figure 1) of operations on memory. We assume below a computation step at some position $p$ of a term, where memories $m_1, \ldots, m_n$ have been reached at the positions immediately below $p$:

PUSH: the new current memory $m$ is built with a symbol $h \in \Gamma_n$ pushed on the top of the memories $m_1, \ldots, m_n$: $f\bigl(q_1(m_1), \ldots, q_n(m_n)\bigr) \to q\bigl(h(m_1, \ldots, m_n)\bigr)$. According to the terminology of [2], this corresponds to a call move in a program represented by an automaton.

POP: the new current memory is a subterm of one of the memories reached so far: $f\bigl(\ldots, q_i(h(m'_1, \ldots, m'_k)), \ldots\bigr) \to q(m'_j)$. The top symbol $h$ of $m_i$ is also read. This corresponds to a function's return in a program. We have here to split POP operations into four categories, depending on whether we pop on the memory at the left son or on the memory at the right son, and on whether we get the left son of that memory or its right son.

INT (internal): the new current memory is one of the memories reached: $f\bigl(q_1(m_1), \ldots, q_n(m_n)\bigr) \to q(m_i)$. This corresponds to an internal operation (neither call nor return) in a function of a program. Again, we need to split INT operations into three categories: one for constant symbols and two for binary symbols, depending on which of the two sons' memories we keep.

Next, we adhere to the visibility condition of [2]. The idea behind this restriction, which was already in [16], is that the symbol read by an automaton (in a term in our case and [1], in a word in the case of [2]) corresponds to an instruction of a program, and hence belongs to one of the three above families (call, return or internal).
Indeed, the effect of the execution of a given instruction on the current program state (a stack for [2] or a tree in our case) will always be in the same family. In other words, in this context, the family of the memory operations performed by a transition is completely determined by the function symbol read.

Let us assume from now on, for the sake of simplicity, the following restriction on the arity of symbols:

All the symbols of $\Sigma$ and $\Gamma$ have either arity 0 or 2.

This is not a real restriction, and the results of this paper can be extended straightforwardly to the case of function symbols with other arities. The signature $\Sigma$ is partitioned into eight subsets:
\[ \Sigma = \Sigma_{\mathrm{PUSH}} \uplus \Sigma_{\mathrm{POP}_{11}} \uplus \Sigma_{\mathrm{POP}_{12}} \uplus \Sigma_{\mathrm{POP}_{21}} \uplus \Sigma_{\mathrm{POP}_{22}} \uplus \Sigma_{\mathrm{INT}_0} \uplus \Sigma_{\mathrm{INT}_1} \uplus \Sigma_{\mathrm{INT}_2} \]
The eight corresponding categories of transitions (transitions of the same category perform the same kind of operation on the memory) are defined formally in Figure 1.

\[
\begin{array}{lll}
\mathrm{PUSH} & a \to q(c) & a \in \Sigma_{\mathrm{PUSH}} \\
\mathrm{PUSH} & f\bigl(q_1(y_1), q_2(y_2)\bigr) \to q\bigl(h(y_1, y_2)\bigr) & f \in \Sigma_{\mathrm{PUSH}} \\
\mathrm{POP}_{11} & f\bigl(q_1(h(y_{11}, y_{12})), q_2(y_2)\bigr) \to q(y_{11}) & f \in \Sigma_{\mathrm{POP}_{11}} \\
 & f\bigl(q_1(\bot), q_2(y_2)\bigr) \to q(\bot) & \\
\mathrm{POP}_{12} & f\bigl(q_1(h(y_{11}, y_{12})), q_2(y_2)\bigr) \to q(y_{12}) & f \in \Sigma_{\mathrm{POP}_{12}} \\
 & f\bigl(q_1(\bot), q_2(y_2)\bigr) \to q(\bot) & \\
\mathrm{POP}_{21} & f\bigl(q_1(y_1), q_2(h(y_{21}, y_{22}))\bigr) \to q(y_{21}) & f \in \Sigma_{\mathrm{POP}_{21}} \\
 & f\bigl(q_1(y_1), q_2(\bot)\bigr) \to q(\bot) & \\
\mathrm{POP}_{22} & f\bigl(q_1(y_1), q_2(h(y_{21}, y_{22}))\bigr) \to q(y_{22}) & f \in \Sigma_{\mathrm{POP}_{22}} \\
 & f\bigl(q_1(y_1), q_2(\bot)\bigr) \to q(\bot) & \\
\mathrm{INT}_0 & a \to q(\bot) & a \in \Sigma_{\mathrm{INT}_0} \\
\mathrm{INT}_1 & f\bigl(q_1(y_1), q_2(y_2)\bigr) \to q(y_1) & f \in \Sigma_{\mathrm{INT}_1} \\
\mathrm{INT}_2 & f\bigl(q_1(y_1), q_2(y_2)\bigr) \to q(y_2) & f \in \Sigma_{\mathrm{INT}_2}
\end{array}
\]
where $q_1, q_2, q \in Q$, $y_1, y_2$ are distinct variables of $\mathcal{X}$, $c \in \Gamma_0$, $h \in \Gamma_2$.

Figure 1: VTAM transition categories.
In this figure, one constant symbol has a particular role: $\bot$ is a special constant symbol in $\Gamma$, used to represent an empty memory. Note that there are three categories for INT: $\mathrm{INT}_0$ is for constant symbols, and $\mathrm{INT}_1$, $\mathrm{INT}_2$ are for binary symbols and differ according to the memory which is kept. Similarly, there are four variants of POP transitions, $\mathrm{POP}_{11}, \ldots, \mathrm{POP}_{22}$. Moreover, each POP rule has a variant which reads an empty memory (i.e. the symbol $\bot$).

Definition 2.2. A visibly tree automaton with memory (or VTAM for short) on $\Sigma$ is a TAM $(\Gamma, Q, Q_f, \Delta)$ such that every rule of $\Delta$ belongs to one of the above categories PUSH, $\mathrm{POP}_{11}$, $\mathrm{POP}_{12}$, $\mathrm{POP}_{21}$, $\mathrm{POP}_{22}$, $\mathrm{INT}_0$, $\mathrm{INT}_1$, $\mathrm{INT}_2$.

2.2. Expressiveness, Comparison. Standard bottom-up tree automata are particular cases of VTAM (simply assume all the symbols of the signature in $\mathrm{INT}_0$ or $\mathrm{INT}_1$).

Now, let us try to explain more precisely the relation with the visibly pushdown languages of [2], when considering finite word languages. If the stack is empty in any accepting configuration of some finite word pushdown automaton $\mathcal{A}$, then it is easy to compute a pushdown automaton $\widetilde{\mathcal{A}}$ which accepts the reverses (mirror images) of the words accepted by $\mathcal{A}$. Moreover, if $\mathcal{A}$ is a visibly pushdown automaton, then $\widetilde{\mathcal{A}}$ is also a visibly pushdown automaton: it suffices to exchange the push and pop symbols.

For pushdown word languages, there is a well-known lemma showing that the recognition by final state is equivalent to the recognition by empty stack. This equivalence however requires $\epsilon$-transitions to empty the stack when a final state is reached. There are however no $\epsilon$-transitions in visibly pushdown automata.
So, if we consider for instance the language of words $w \in \{a, b\}^*$ such that any prefix of $w$ contains more $a$'s than $b$'s, it is recognized by a visibly pushdown automaton, whereas its mirror image (all suffixes contain more $a$'s than $b$'s) is not recognized by a visibly pushdown automaton.

In conclusion, as long as visibility is relevant, the way the automaton is moving is also relevant. This applies of course to trees as well: there is a difference between top-down and bottom-up recognition. Now, if we encode a word as a tree on a unary alphabet, starting from right to left, VTAM generalize visibly pushdown automata: moving bottom-up in the tree corresponds to moving left-right in the word.

VPTA transitions and VPTL are defined in [5] in the same formalism (rewrite rules) as in Figure 1, except that the rules are oriented in the other direction (top-down computations) and the memory contains a word, i.e. terms built with unary function symbols and one constant (empty stack).

As sketched above, since the automata of [5] work top-down, a language can be recognized by a VTAM (which works bottom-up) without being a VPTL. As a typical example, consider the trees containing only unary symbols $a$, $b$ and a constant $0$ and such that all subterms contain more $a$'s than $b$'s. But the converse is also true: there are similarly languages that are recognized by VPTA and not by VTAM (and there, constraints cannot help!).
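To make the bottom-up reading of the transition categories of Figure 1 concrete, here is a minimal Python sketch of a VTAM run. It is an illustration under stated assumptions rather than the authors' construction: the rule encoding, the names `BOT`, `KIND`, `RULES` and `configurations`, and the tiny example automaton (one state `q`, one binary memory symbol `h`) are all hypothetical.

```python
BOT = ("bot",)  # stands for the empty-memory constant (written ⊥ in Figure 1)

# Hypothetical example: 'e' is an INT_0 constant, 'c' a PUSH symbol and
# 'r' a POP_11 symbol.  Terms and memories are nested tuples.
KIND = {"e": "INT0", "c": "PUSH", "r": "POP11"}
RULES = {
    ("e",): {("q", BOT)},             # e -> q(⊥)
    ("c", "q", "q"): {("q", "h")},    # c(q(y1), q(y2)) -> q(h(y1, y2))
    ("r", "q", "q", "h"): {"q"},      # r(q(h(y11, y12)), q(y2)) -> q(y11)
}


def configurations(term, kind, rules):
    """Return the set of pairs (state, memory) reachable at the root of term."""
    symbol, children = term[0], term[1:]
    if not children:  # constant symbol: the rule gives the configuration directly
        return set(rules.get((symbol,), ()))
    left, right = (configurations(child, kind, rules) for child in children)
    k = kind[symbol]  # visibility: the symbol alone fixes the memory operation
    out = set()
    for q1, m1 in left:
        for q2, m2 in right:
            if k == "PUSH":
                for q, h in rules.get((symbol, q1, q2), ()):
                    out.add((q, (h, m1, m2)))
            elif k in ("INT1", "INT2"):
                kept = m1 if k == "INT1" else m2
                for q in rules.get((symbol, q1, q2), ()):
                    out.add((q, kept))
            elif k.startswith("POP"):
                i, j = int(k[3]), int(k[4])  # pop the memory of son i, keep its son j
                popped = m1 if i == 1 else m2
                if popped == BOT:
                    new_memory = BOT         # variant reading an empty memory
                elif len(popped) == 3:
                    new_memory = popped[j]
                else:
                    continue                 # no POP rule reads another constant
                for q in rules.get((symbol, q1, q2, popped[0]), ()):
                    out.add((q, new_memory))
    return out


print(configurations(("r", ("c", ("e",), ("e",)), ("e",)), KIND, RULES))
```

In this example the term `r(c(e, e), e)` first pushes `h` above two empty memories and then pops it again, whereas `r(e, e)` reaches no configuration because the example automaton has no rule for popping an empty memory.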
Now, if we consider a slight modification of VPTA, in which the automata work bottom-up (simply change the direction of transition rules), it is not clear that good properties (closure and decision) are preserved since, now, we get equality tests between memory contents, increasing the original expressive power; when going top-down we always duplicate the memory content and send one copy to each son, while going bottom-up we may have different memory contents at two brother positions.

2.3. Determinism. A VTAM $\mathcal{A}$ is said to be complete if every term of $\mathcal{T}(\Sigma)$ belongs to $L(\mathcal{A}, q)$ for at least one state $q \in Q$. Every VTAM can be completed (with a polynomial overhead) by the addition of a trash state. Hence, we shall consider from now on only complete VTAM.

A VTAM $\mathcal{A} = (\Gamma, Q, Q_f, \Delta)$ is said to be deterministic iff:
• for all $a \in \Sigma_{\mathrm{INT}_0}$ there is at most one rule in $\Delta$ with left member $a$,
• for all $f \in \Sigma_{\mathrm{PUSH}} \cup \Sigma_{\mathrm{INT}_1} \cup \Sigma_{\mathrm{INT}_2}$, for all $q_1, q_2 \in Q$, there is at most one rule in $\Delta$ with left member $f\bigl(q_1(y_1), q_2(y_2)\bigr)$,
• for all $f \in \Sigma_{\mathrm{POP}_{11}} \cup \Sigma_{\mathrm{POP}_{12}}$ (respectively $\Sigma_{\mathrm{POP}_{21}} \cup \Sigma_{\mathrm{POP}_{22}}$), for all $q_1, q_2 \in Q$ and all $h \in \Gamma$, there is at most one rule in $\Delta$ with left member $f\bigl(q_1(h(y_{11}, y_{12})), q_2(y_2)\bigr)$ (respectively $f\bigl(q_1(y_1), q_2(h(y_{21}, y_{22}))\bigr)$).

Theorem 2.3. For every VTAM $\mathcal{A} = (\Gamma, Q, Q_f, \Delta)$ there exists a deterministic VTAM $\mathcal{A}^{det} = (\Gamma^{det}, Q^{det}, Q^{det}_f, \Delta^{det})$ such that $L(\mathcal{A}) = L(\mathcal{A}^{det})$, where $|Q^{det}|$ and $|\Gamma^{det}|$ both are $O\bigl(2^{|Q|^2}\bigr)$.

Proof. We follow the technique of [2] for the determinization of visibly pushdown automata: we do a subset construction and postpone the application (to the memory) of PUSH rules until a matching POP is met. The construction of [2] is extended in order to handle the branching structure of the term read and of the memory.
With the visibility condition, for each symbol read, only one kind of memory operation is possible. This permits a uniform construction of the rules of $\mathcal{A}^{det}$ for each symbol of $\Sigma$. As we shall see below, $\mathcal{A}^{det}$ does not need to keep track of the contents of the memory (of $\mathcal{A}$) during its computation; it only needs to memorize information on the reachability of states of $\mathcal{A}$, following the path (in the term read) from the position of the PUSH symbol which has pushed the top symbol of the current memory (let us call it the last-memory-push-position) to the current position in the term. We let:
\[ Q^{det} := \{0, 1\} \times \mathcal{P}(Q) \times \mathcal{P}(Q^2) \]
$Q^{det}_f$ is the subset of states whose second component contains a final state of $Q_f$. The first component is a flag indicating whether the memory is currently empty (value 0) or not (value 1). The second component is the subset of states of $Q$ that $\mathcal{A}$ can reach at the current position, and the third component is a binary relation on $Q$ which contains $(q, q')$ iff, starting from a state $q$ and memory $m$ at the last-memory-push-position, $\mathcal{A}$ can reach the current position in state $q'$ and with the same memory $m$.

We consider memory symbols made of pairs of states and PUSH symbols:
\[ \Gamma^{det} := \bigl(Q^{det}\bigr)^2 \times \Sigma_{\mathrm{PUSH}} \]
The components of a symbol $p \in \Gamma^{det}$ refer to the transition which pushed $p$: the first and second components of $p$ are respectively the left and right initial states of the transition, and the third component is the symbol read.

The transition rules of $\Delta^{det}$ are given below, according to the symbol read.

INT.
For every $f \in \Sigma_{\mathrm{INT}_1}$, we have the following rules in $\Delta^{det}$:
\[ f\bigl(\langle b_1, R_1, S_1 \rangle(y_1), \langle b_2, R_2, S_2 \rangle(y_2)\bigr) \to \langle b_1, R, S \rangle(y_1) \]
where
\[ R := \bigl\{ q \bigm| \exists q_1 \in R_1,\ q_2 \in R_2,\ f\bigl(q_1(y_1), q_2(y_2)\bigr) \to q(y_1) \in \Delta \bigr\}, \]
and $S$ is the update of $S_1$ according to the $\mathrm{INT}_1$-transitions of $\Delta$, when $b_1 = 1$ (the case $b_1 = 0$ is similar):
\[ S := \bigl\{ (q, q') \bigm| \exists q_1 \in Q,\ q_2 \in R_2,\ (q, q_1) \in S_1 \text{ and } f\bigl(q_1(y_1), q_2(y_2)\bigr) \to q'(y_1) \in \Delta \bigr\}. \]
The case $f \in \Sigma_{\mathrm{INT}_2}$ is similar.

PUSH. For every $f \in \Sigma_{\mathrm{PUSH}}$, we have the following rules in $\Delta^{det}$:
\[ f\bigl(\langle b_1, R_1, S_1 \rangle(y_1), \langle b_2, R_2, S_2 \rangle(y_2)\bigr) \to \langle 1, R, \mathit{Id}_Q \rangle\bigl(p(y_1, y_2)\bigr) \]
where
\[ R := \bigl\{ q \bigm| \exists q_1 \in R_1,\ q_2 \in R_2,\ h \in \Gamma,\ f\bigl(q_1(y_1), q_2(y_2)\bigr) \to q\bigl(h(y_1, y_2)\bigr) \in \Delta \bigr\}, \]
$\mathit{Id}_Q := \{ (q, q) \mid q \in Q \}$ is used to initialize the memorization of state reachability from the position of the symbol $f$, and $p := \bigl\langle \langle b_1, R_1, S_1 \rangle, \langle b_2, R_2, S_2 \rangle, f \bigr\rangle$. Note that the two states reached just below the position of application of this rule are pushed on the top of the memory. They will be used later in order to update $R$ and $S$ when a matching POP symbol is read.

POP. For every $f \in \Sigma_{\mathrm{POP}_{11}}$, we have the following rules in $\Delta^{det}$:
\[ f\bigl(\langle b_1, R_1, S_1 \rangle(H(y_{11}, y_{12})), \langle b_2, R_2, S_2 \rangle(y_2)\bigr) \to \langle b, R, S \rangle(y_{11}) \]
where $H = \langle Q_1, Q_2, g \rangle$, with $Q_1 = \langle b'_1, R'_1, S'_1 \rangle \in Q^{det}$, $Q_2 = \langle b'_2, R'_2, S'_2 \rangle \in Q^{det}$.
\[ b = b'_1 \]
\[ R = \left\{ q \;\middle|\; \begin{array}{l} \exists q'_1 \in R'_1,\ q'_2 \in R'_2,\ (q_0, q_1) \in S_1,\ q_2 \in R_2,\ h \in \Gamma, \\ g\bigl(q'_1(y_1), q'_2(y_2)\bigr) \to q_0\bigl(h(y_1, y_2)\bigr) \in \Delta, \\ f\bigl(q_1(h(y_{11}, y_{12})), q_2(y_2)\bigr) \to q(y_{11}) \in \Delta \end{array} \right\} \]
\[ S = \left\{ (q, q') \;\middle|\; \begin{array}{l} \exists q'_1 \in S'_1(q),\ q'_2 \in R'_2,\ (q_0, q_1) \in S_1,\ q_2 \in R_2,\ h \in \Gamma, \\ g\bigl(q'_1(y_1), q'_2(y_2)\bigr) \to q_0\bigl(h(y_1, y_2)\bigr) \in \Delta, \\ f\bigl(q_1(h(y_{11}, y_{12})), q_2(y_2)\bigr) \to q'(y_{11}) \in \Delta \end{array} \right\} \]
When a POP symbol is read, the top symbol of the memory, which is popped, contains the states reached just before the application of the matching PUSH. We use this information in order to update $\langle b_1, R_1, S_1 \rangle$ and $\langle b_2, R_2, S_2 \rangle$ to $\langle b, R, S \rangle$. The cases $f \in \Sigma_{\mathrm{POP}_{12}}$, $f \in \Sigma_{\mathrm{POP}_{21}}$, $f \in \Sigma_{\mathrm{POP}_{22}}$ are similar.

The above constructions ensure the three invariants stated above, after the definition of $Q^{det}$, and corresponding to the three components of these states. It follows that $L(\mathcal{A}) = L(\mathcal{A}^{det})$.

2.4. Closure Properties. The tree automata with one memory of [7] are closed under union but not closed under intersection and complement (even their version without constraints). The visibility condition makes these closures possible for VTAM.

Theorem 2.4. The class of tree languages of VTAM is closed under Boolean operations. One can construct VTAM for union, intersection and complement of given VTAM languages whose sizes are respectively linear, quadratic and exponential in the size of the initial VTAM.

Proof. Let $\mathcal{A}_1 = (\Gamma_1, Q_1, Q_{f,1}, \Delta_1)$ and $\mathcal{A}_2 = (\Gamma_2, Q_2, Q_{f,2}, \Delta_2)$ be two VTAM on $\Sigma$. We assume without loss of generality that $Q_1$ and $Q_2$ are disjoint.

For the union of the languages of $\mathcal{A}_1$ and $\mathcal{A}_2$, we construct a VTAM $\mathcal{A}_\cup$ whose memory signature, state set, final state set and rule set are the unions of the respective memory signatures, state sets, final state sets and rule sets of the two given VTAM.
\[ \mathcal{A}_\cup = (\Gamma_1 \cup \Gamma_2,\ Q_1 \cup Q_2,\ Q_{f,1} \cup Q_{f,2},\ \Delta_1 \cup \Delta_2) \]
We have $L(\mathcal{A}_\cup) = L(\mathcal{A}_1) \cup L(\mathcal{A}_2)$.

For the intersection of the languages of $\mathcal{A}_1$ and $\mathcal{A}_2$, we construct a VTAM $\mathcal{A}_\cap$ whose memory signature, state set and final state set are the Cartesian products of the respective memory signatures, state sets and final state sets of the two given VTAM.
\[ \mathcal{A}_\cap = (\Gamma_1 \times \Gamma_2,\ Q_1 \times Q_2,\ Q_{f,1} \times Q_{f,2},\ \Delta_\cap) \]
The rule set $\Delta_\cap$ of the intersection VTAM $\mathcal{A}_\cap$ is obtained by "product" of rules of the two given VTAM with the same function symbols. The product of rules means the Cartesian product of the respective states and of the memory symbols pushed or popped. More precisely, $\Delta_\cap$ is the smallest set of rules such that:
• if $\Delta_1$ contains $f\bigl(q_{11}(y_1), q_{12}(y_2)\bigr) \to q_1\bigl(h_1(y_1, y_2)\bigr)$ and $\Delta_2$ contains $f\bigl(q_{21}(y_1), q_{22}(y_2)\bigr) \to q_2\bigl(h_2(y_1, y_2)\bigr)$, for some $f \in \Sigma_{\mathrm{PUSH}}$, then $\Delta_\cap$ contains $f\bigl(\langle q_{11}, q_{21} \rangle(y_1), \langle q_{12}, q_{22} \rangle(y_2)\bigr) \to \langle q_1, q_2 \rangle\bigl(\langle h_1, h_2 \rangle(y_1, y_2)\bigr)$,
• if $\Delta_1$ contains $f\bigl(q_{11}(h_1(y_{11}, y_{12})), q_{12}(y_2)\bigr) \to q_1(y_{11})$ and $\Delta_2$ contains $f\bigl(q_{21}(h_2(y_{11}, y_{12})), q_{22}(y_2)\bigr) \to q_2(y_{11})$, for some $f \in \Sigma_{\mathrm{POP}_{11}}$, then $\Delta_\cap$ contains $f\bigl(\langle q_{11}, q_{21} \rangle(\langle h_1, h_2 \rangle(y_{11}, y_{12})), \langle q_{12}, q_{22} \rangle(y_2)\bigr) \to \langle q_1, q_2 \rangle(y_{11})$,
• similarly for $\mathrm{POP}_{12}$, $\mathrm{POP}_{21}$ and $\mathrm{POP}_{22}$,
• if $\Delta_1$ contains $f\bigl(q_{11}(y_1), q_{12}(y_2)\bigr) \to q_1(y_1)$ and $\Delta_2$ contains $f\bigl(q_{21}(y_1), q_{22}(y_2)\bigr) \to q_2(y_1)$, for some $f \in \Sigma_{\mathrm{INT}_1}$, then $\Delta_\cap$ contains $f\bigl(\langle q_{11}, q_{21} \rangle(y_1), \langle q_{12}, q_{22} \rangle(y_2)\bigr) \to \langle q_1, q_2 \rangle(y_1)$,
• and similarly for $\mathrm{INT}_2$, $\mathrm{INT}_0$.
We have then $L(\mathcal{A}_\cap) = L(\mathcal{A}_1) \cap L(\mathcal{A}_2)$.
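The product of rules can be sketched as follows for the PUSH case on binary symbols. This is a hedged illustration, not the paper's implementation: the tuple encoding of a rule and the function name `product_push_rules` are assumptions made for the example, and the other categories would be handled in the same pairwise fashion.

```python
def product_push_rules(push1, push2):
    """Pair up the PUSH rules of two automata that read the same symbol.

    Each rule f(ql(y1), qr(y2)) -> q(h(y1, y2)) is encoded (hypothetically)
    as the tuple (f, ql, qr, q, h).  States and pushed memory symbols of the
    product are pairs, as in the definition of the intersection automaton.
    """
    result = set()
    for f, ql1, qr1, q1, h1 in push1:
        for g, ql2, qr2, q2, h2 in push2:
            if f == g:  # only rules with the same function symbol are combined
                result.add((f, (ql1, ql2), (qr1, qr2), (q1, q2), (h1, h2)))
    return result


rules1 = {("f", "a1", "a2", "a", "h1")}
rules2 = {("f", "b1", "b2", "b", "h2"), ("g", "b1", "b2", "b", "h2")}
print(product_push_rules(rules1, rules2))
```

The number of product rules is at most the product of the sizes of the two rule sets, which matches the quadratic bound stated in Theorem 2.4 for intersection.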
Note that the above product construction for $\mathcal{A}_\cap$ is possible only because the visibility condition ensures that two rules with the same function symbol in the left-hand side will have the same form. Hence we can synchronize memory operations on the same symbols.

For the complement, we use the construction of Theorem 2.3 and a completion (this operation preserves determinism), and take the complement of the final state set of the VTAM obtained.

2.5. Decision Problems. Every VTAM is a particular case of tree automaton with one memory of [7]. Since the emptiness problem (whether the language accepted is empty or not) is decidable for this latter class, it is also decidable for VTAM. However, whereas this problem is EXPTIME-complete for the automata of [7], it is only PTIME for VTAM.

Theorem 2.5. The emptiness problem is PTIME-complete for VTAM.

Proof. Assume given a VTAM $\mathcal{A} = (\Gamma, Q, Q_f, \Delta)$. By definition, for each state $q \in Q$, the language $L(\mathcal{A}, q)$ is empty iff the memory language $M(\mathcal{A}, q)$ is empty. For each state $q$, we introduce a predicate symbol $P_q$ and we construct Horn clauses in such a way that $P_q(m)$ belongs to the least Herbrand model of this set of clauses iff the configuration with state $q$ and memory $m$ is reachable by the automaton (i.e. $m \in M(\mathcal{A}, q)$).

For such a construction (already given in [7]), we simply forget the function symbol, associating to a transition rule $f\bigl(q_1(m_1), q_2(m_2)\bigr) \to q(m)$ the Horn clause $P_{q_1}(m_1), P_{q_2}(m_2) \Rightarrow P_q(m)$.
Then, according to the restrictions in Definition 2.2, we get only Horn clauses of one of the following forms:

\[
\begin{array}{rcl}
 & \Rightarrow & P_q(c) \\
P_{q_1}(y_1),\ P_{q_2}(y_2) & \Rightarrow & P_q(h(y_1, y_2)) \\
P_{q_1}(h(y_{11}, y_{12})),\ P_{q_2}(y_2) & \Rightarrow & P_q(y_{11}) \\
P_{q_1}(h(y_{11}, y_{12})),\ P_{q_2}(y_2) & \Rightarrow & P_q(y_{12}) \\
P_{q_1}(\bot),\ P_{q_2}(y_2) & \Rightarrow & P_q(\bot) \\
P_{q_1}(y_1),\ P_{q_2}(y_2) & \Rightarrow & P_q(y_1)
\end{array}
\]

where all the variables are distinct. Such clauses belong to the class $H_3$ of [19], for which it is proved in [19] that emptiness is decidable in cubic time. It follows that emptiness of VTAM is decidable in cubic time.

Hardness for PTIME follows from the PTIME-hardness of emptiness of finite tree automata [8].

Another proof relying on similar techniques, but for a more general result, will be stated in Lemma 3.7 and can be found in Appendix 5.

Universality is the problem of deciding whether a given automaton recognizes all ground terms. Inclusion refers to the problem of deciding the inclusion between the respective languages of two given automata.

Corollary 2.6. The universality and inclusion problems are EXPTIME-complete for VTAM.

Proof. A VTAM $\mathcal{A}$ is universal iff the language of its complement automaton $\overline{\mathcal{A}}$ is empty, and $L(\mathcal{A}_1) \subseteq L(\mathcal{A}_2)$ iff $L(\mathcal{A}_1) \cap L(\overline{\mathcal{A}_2}) = \emptyset$. With the bounds given in Theorem 2.4, these problems can be decided in EXPTIME for VTAM (these operations require a determinization of a given VTAM first).

The EXPTIME-hardness follows from the corresponding property of finite tree automata (see [8] for instance).

The membership problem is, given a term $t$ and an automaton $\mathcal{A}$, to know whether $t$ is accepted by $\mathcal{A}$.

Corollary 2.7. The membership problem is decidable in PTIME for VTAM.

Proof. Given a term $t$ we can build a VTAM $\mathcal{A}_t$ which recognizes exactly the language $\{t\}$.
The intersection of $\mathcal{A}_t$ with the given VTAM $\mathcal{A}$ recognizes a non-empty language iff $t$ belongs to the language of $\mathcal{A}$.

3. Visibly Tree Automata with Memory and Constraints

In the late eighties, some models of tree recognizers were obtained by adding equality and disequality constraints in transitions of tree automata. They have been proposed in order to solve problems with term rewrite systems or constraint systems with non-linear patterns (terms with multiple occurrences of the same variable). The tree automata of [4], for instance, can perform equality and disequality tests between subterms located at brother positions of the input term.

In the case of tree automata with memory, constraints are applied to the memory contents. Indeed, each bottom-up computation step starts with two states and two memories (and ends with one state and one memory), and therefore it is possible to compare the contents of these two memories with respect to some binary relation.

We first state the general definition of visibly tree automata with constraints on memories (Section 3.1), then give sufficient conditions on the binary relation for the decidability of emptiness (Section 3.2), and show that, although regular binary relations do not satisfy these conditions in general (and indeed, the corresponding class of constrained VTAM has an undecidable emptiness problem, Section 3.3), some relevant examples do satisfy them. In particular, we study in Section 3.4.2 the case of VTAM with structural equality constraints. They enjoy not only decision properties but also good closure properties. Some relevant examples of tree languages recognized by constrained VTAM of this class are presented at the end of the section.

\[
\begin{array}{lll}
\mathsf{INT}^R_1 & f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q(y_1) & f_9 \in \Sigma_{\mathsf{INT}^R_1} \\
\mathsf{INT}^R_2 & f_{10}(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q(y_2) & f_{10} \in \Sigma_{\mathsf{INT}^R_2} \\
\mathsf{INT}^R_1 & f_{11}(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q(y_1) & f_{11} \in \Sigma_{\mathsf{INT}^R_1} \\
\mathsf{INT}^R_2 & f_{12}(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q(y_2) & f_{12} \in \Sigma_{\mathsf{INT}^R_2}
\end{array}
\]
Figure 2: New transition categories for $\mathrm{VTAM}^{R}_{\neg R}$.

3.1. Definitions.

Assume given a fixed equivalence relation $R$ on $T(\Gamma)$. We now consider two new categories for the symbols of $\Sigma$: $\mathsf{INT}^R_1$ and $\mathsf{INT}^R_2$, in addition to the eight previous categories of page 6. The new categories correspond to the constrained versions of the transition rules $\mathsf{INT}_1$ and $\mathsf{INT}_2$ presented in Figure 2. The constraint $y_1\, R\, y_2$ in the first two rules of Figure 2 is called positive and the constraint $y_1\, \neg R\, y_2$ in the last two rules is called negative. We shall not extend the rules PUSH and POP with constraints, for reasons explained in Section 3.5.

A ground term $t$ rewrites to $s$ by a constrained rule $\ell \xrightarrow{\;y_1\, c\, y_2\;} r$ with left-hand side $\ell = f(q_1(y_1), q_2(y_2))$ (where $c$ is either $R$ or $\neg R$) if there exist a position $p$ of $t$ and a substitution $\sigma$ such that $t|_p = \ell\sigma$, $y_1\sigma\; c\; y_2\sigma$ and $s = t[r\sigma]_p$. For example, if $R$ is term equality, the transition is performed only when the memory contents are identical.

Definition 3.1. A visibly tree automaton with memory and constraints ($\mathrm{VTAM}^{R}_{\neg R}$) on a signature $\Sigma$ is a tuple $(\Gamma, R, Q, Q_f, \Delta)$ where $\Gamma$, $Q$, $Q_f$ are defined as for TAM, $R$ is an equivalence relation on $T(\Gamma)$ and $\Delta$ is a set of rewrite rules in one of the above categories: $\mathsf{PUSH}$, $\mathsf{POP}_{11}$, $\mathsf{POP}_{12}$, $\mathsf{POP}_{21}$, $\mathsf{POP}_{22}$, $\mathsf{INT}_0$, $\mathsf{INT}_1$, $\mathsf{INT}_2$, $\mathsf{INT}^R_1$, $\mathsf{INT}^R_2$.

We let $\mathrm{VTAM}^{R}$ be the subclass of $\mathrm{VTAM}^{R}_{\neg R}$ with positive constraints only.
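To make the effect of a constraint concrete, here is a minimal Python sketch of one bottom-up step with a constrained INT_1 rule. It is an illustration under assumed conventions: the rule and configuration encodings and the function name `fire_int1` are hypothetical, and the relation $R$ is passed as an arbitrary Python predicate.

```python
import operator


def fire_int1(rule, left, right, related):
    """Try to fire a constrained INT_1 rule on two child configurations.

    Assumed encoding:
      rule  = (q1, q2, positive, q) for f(q1(y1), q2(y2)) --[y1 c y2]--> q(y1),
              with c = R if `positive` is True and c = not R otherwise;
      left  = (state, memory) reached below the first argument of f;
      right = (state, memory) reached below the second argument of f;
      related(m1, m2) decides whether m1 R m2 and returns a bool.
    Returns the new configuration (q, memory), or None if the rule does not apply.
    """
    q1, q2, positive, q = rule
    (s1, m1), (s2, m2) = left, right
    if (s1, s2) != (q1, q2):
        return None  # the states do not match the left-hand side
    if related(m1, m2) != positive:
        return None  # the constraint is violated
    return (q, m1)  # INT_1 keeps the first memory unchanged


# With R = term equality, the positive rule fires only on identical memories.
memory = ("h", "bot", "bot")
print(fire_int1(("q1", "q2", True, "q"), ("q1", memory), ("q2", memory), operator.eq))
```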
The acceptance of terms of $T(\Sigma)$ and the languages of terms and memories are defined and denoted as in Section 2.1.

The definition of complete $\mathrm{VTAM}^{R}_{\neg R}$ is the same as for VTAM. As for VTAM, every $\mathrm{VTAM}^{R}_{\neg R}$ can be completed (with a polynomial overhead) by the addition of a trash state $q_\bot$. The only subtle difference concerns the constrained rules: for every $f_9 \in \mathsf{INT}^R_1$ and every pair of states $q_1, q_2$,
• if there is a rule $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q(y_1)$ and no rule of the form $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q'(y_1)$, then we add $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q_\bot(y_1)$,
• if there is a rule $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q(y_1)$ and no rule of the form $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q'(y_1)$, then we add $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q_\bot(y_1)$,
• if there is no rule of the form $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q(y_1)$ or $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q'(y_1)$, then we add both $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q_\bot(y_1)$ and $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q_\bot(y_1)$.

The definition of deterministic $\mathrm{VTAM}^{R}_{\neg R}$ is based on the same conditions as for VTAM for the function symbols in the categories $\mathsf{PUSH}_0$, $\mathsf{PUSH}$, $\mathsf{POP}_{11}, \ldots, \mathsf{POP}_{22}$, $\mathsf{INT}_1$, $\mathsf{INT}_2$. For the function symbols of $\mathsf{INT}^R_1$ and $\mathsf{INT}^R_2$, we have the following condition: for all $f \in \Sigma_{\mathsf{INT}^R_1} \cup \Sigma_{\mathsf{INT}^R_2}$ and all $q_1, q_2 \in Q$, there are at most two rules in $\Delta$ with left member $f(q_1(y_1), q_2(y_2))$, and if there are two, one has a positive constraint and the other has a negative constraint.
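The three completion cases above amount to adding a trash-state rule for every missing combination of left-hand side and constraint sign. A minimal Python sketch, under assumed conventions (tuple encoding of rules, hypothetical names `TRASH` and `complete_constrained`, and the assumption that the trash state itself also ranges over the left-hand sides), could look as follows.

```python
from itertools import product

TRASH = "q_trash"  # hypothetical name for the trash state


def complete_constrained(rules, symbols, states):
    """Complete the constrained INT^R_1 rules with a trash state.

    Assumed encoding: a tuple (f, q1, q2, positive, q) stands for
    f(q1(y1), q2(y2)) --[y1 c y2]--> q(y1), where c is R when `positive`
    is True and not R otherwise.  `symbols` is the set of INT^R_1 symbols.
    """
    all_states = set(states) | {TRASH}
    # Left-hand sides (with their constraint sign) that already have a rule.
    covered = {(f, q1, q2, pos) for (f, q1, q2, pos, _target) in rules}
    completed = set(rules)
    for f, q1, q2, pos in product(symbols, all_states, all_states, (True, False)):
        if (f, q1, q2, pos) not in covered:
            # Missing positive and/or negative case: send it to the trash state.
            completed.add((f, q1, q2, pos, TRASH))
    return completed
```

In this sketch, if the input has at most one rule per left-hand side and constraint sign, the output still has exactly one, so the determinism condition stated above is preserved.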
We will see in Section 3.4 a subclass of $\mathrm{VTAM}^{R}_{\neg R}$ that can be determinized (when $R$ is structural equality) and another one that cannot (when $R$ is syntactic equality).

3.2. Sufficient Conditions for Emptiness Decision.

We propose here a generic theorem ensuring emptiness decision for $\mathrm{VTAM}^{R}_{\neg R}$. The idea of this theorem is that, under some condition on $R$, the transition rules with negative constraints can be eliminated.

Theorem 3.2. Let $R$ be an equivalence relation satisfying these two properties:
i. for every automaton $\mathcal{A}$ of $\mathrm{VTAM}^{R}$ and for every state $q$ of $\mathcal{A}$, the memory language $M(\mathcal{A}, q)$ is effectively a regular tree language,
ii. for every term $m \in T(\Gamma)$, the cardinality of the equivalence class of $m$ for $R$ is finite and its elements can be enumerated.
Then the emptiness problem is decidable for $\mathrm{VTAM}^{R}_{\neg R}$.

Proof. The proof relies on the following Lemma 3.3, which states that the negative constraints in $\mathrm{VTAM}^{R}_{\neg R}$ can be eliminated while preserving the memory languages. The elimination can be done thanks to condition ii, by replacement of the rules of $\mathsf{INT}^{\neg R}_1$ and $\mathsf{INT}^{\neg R}_2$ by rules of $\mathsf{INT}^{R}_1$ and $\mathsf{INT}^{R}_2$.

Next, we can use i in order to decide emptiness for the $\mathrm{VTAM}^{R}$ obtained by elimination of negative constraints. Indeed, for all states $q$ of $\mathcal{A}$, by definition, $L(\mathcal{A}, q)$ is empty iff $M(\mathcal{A}, q)$ is empty.

Lemma 3.3. Let $R$ satisfy the hypotheses i and ii of Theorem 3.2, and let $\mathcal{A} = (\Gamma, R, Q, Q_f, \Delta)$ be a $\mathrm{VTAM}^{R}_{\neg R}$. There exists a $\mathrm{VTAM}^{R}$ $\mathcal{A}^+ = (\Gamma, R, Q^+, Q_f, \Delta^+)$ such that $Q \subseteq Q^+$ and, for each $q \in Q$, $M(\mathcal{A}^+, q) = M(\mathcal{A}, q)$.

Proof. The construction of $\mathcal{A}^+$ is by induction on the number $n$ of rules with negative constraints in $\Delta$ and uses the bound on the size of equivalence classes, condition ii of the theorem. The result is immediate if $n = 0$.
We assume that the result is true for $n - 1$ rules, and show that we can get rid of a rule of $\Delta$ with negative constraints (and replace it with rules that are unconstrained or have positive constraints). Let us consider one such rule:

\[ f(q_1(y_1), q_2(y_2)) \xrightarrow{\;y_1\, \neg R\, y_2\;} q(y_1) \qquad (3.1) \]

We show that, under the induction hypothesis, we have the following lemma, which will be used below in order to get rid of the rule (3.1).

Lemma 3.4. Given $m_1, \ldots, m_k \in M(\mathcal{A}, q_2)$, it is effectively decidable whether $M(\mathcal{A}, q_2) \setminus \{m_1, \ldots, m_k\}$ is empty or not and, in case it is not empty, we can effectively build an $m_{k+1}$ in this set.

Proof. Let $[m_i]_R$ denote the equivalence class of $m_i$. By condition ii, every $[m_i]_R$ is finite, hence for each $i \le k$ we can build a VTAM $\mathcal{A}_i$ with a state $p_i$ such that $M(\mathcal{A}_i, p_i)$ is the complement of $[m_i]_R$. We add all the rules of $\mathcal{A}_i$ to $\mathcal{A}$, obtaining $\mathcal{A}'$ (we assume that the state sets of $\mathcal{A}_1, \ldots, \mathcal{A}_k, \mathcal{A}$ are disjoint, and that the states of $\mathcal{A}_1, \ldots, \mathcal{A}_k$ are not final in $\mathcal{A}'$). Since $R$ is an equivalence relation, we have:

\[ y_1\, \neg R\, m_i \quad\text{iff}\quad y_1 \notin [m_i]_R \quad\text{iff}\quad \exists y_2 \notin [m_i]_R,\ y_1\, R\, y_2 \]

Hence, if $y_2 = m_i$ is a witness for the rule (3.1), then we can apply instead a rule:

\[ f(q_1(y_1), p_i(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q(y_1) \qquad (3.2) \]

Then we add to $\mathcal{A}'$ the rules (3.2) as above and obtain $\mathcal{A}''$. It can be shown that $M(\mathcal{A}'', q_2) = M(\mathcal{A}, q_2)$.

Let $m_{k+1}$ be a term of $M(\mathcal{A}'', q_2) \setminus \{m_1, \ldots, m_k\}$ of minimal size (if one exists). This term $m_{k+1}$ can be created in a run of $\mathcal{A}''$ which does not use the rule (3.1). Otherwise, the witness for $y_2$ in the application of this rule would be a term of $M(\mathcal{A}'', q_2) \setminus \{m_1, \ldots, m_k\}$ smaller than $m_{k+1}$ (it cannot be one of $\{m_1, \ldots, m_k\}$ because for these particular values of $y_2$, we assume the application of (3.2)). It follows that $m_{k+1} \in M(\mathcal{A}'' \setminus (3.1), q_2)$. This automaton $\mathcal{A}_1 = \mathcal{A}'' \setminus (3.1)$ has $n - 1$ rules with negative constraints. Hence, by induction hypothesis, there is a $\mathrm{VTAM}^{R}$ $\mathcal{A}_1^+$ with $m_{k+1}$ in its memory language $M(\mathcal{A}_1^+, q_2)$. By condition i, this language is regular and we can build $m_{k+1}$ from a TA for this language.

Now, let us come back to the proof that we can replace rule (3.1) while preserving the memory languages.

If $M(\mathcal{A}, q_2) = \emptyset$ (which can be effectively decided according to Lemma 3.4), then the rule (3.1) is useless and can be removed from $\mathcal{A}$ without changing its memory language. Note that the condition $M(\mathcal{A}, q_2) = \emptyset$ is decidable because, by hypothesis i, $M(\mathcal{A}, q_2)$ is regular.

Otherwise, let $m_1 \in M(\mathcal{A}, q_2)$ be built with Lemma 3.4 and let $N_1$ be the cardinality of the equivalence class $[m_1]_R$. We apply $N_1$ times the construction of Lemma 3.4. There are three cases:
(1) if we find more than $N_1$ terms in $M(\mathcal{A}, q_2)$, then one of them, say $m_k$, is not in $[m_1]_R$. Then (3.1) is useless from the point of view of memory languages: whatever the value of $y_1$, we know a $y_2 \in M(\mathcal{A}, q_2)$ which allows the rule to fire. Indeed, if $y_1 \in [m_1]_R$, then we can choose $y_2 = m_k$, and otherwise we choose $y_2 = m_1$. Hence (3.1) can be replaced, without changing the memory language, by:

\[ f(q_1(y_1), q_0(y_2)) \to q(y_1) \qquad (3.3) \]

where $q_0$ is any state of $\mathcal{A}$ such that $M(\mathcal{A}, q_0) \ne \emptyset$. We can then apply the induction hypothesis to the $\mathrm{VTAM}^{R}_{\neg R}$ obtained.
(2) if we find fewer than $N_1$ terms in $M(\mathcal{A}, q_2)$, but one of them is not in $[m_1]_R$, the case is the same as above.
(3) if we find fewer than $N_1$ terms in $M(\mathcal{A}, q_2)$, all in $[m_1]_R$, it means that one of the applications of Lemma 3.4 was not successful, and hence that we have found all the terms of $M(\mathcal{A}, q_2)$. It follows that the rule (3.1) can be fired iff $y_1 \notin [m_1]_R$, i.e. there exists $y_2 \notin [m_1]_R$ such that $y_1\, R\, y_2$. Hence, we can replace (3.1) by

\[ f(q_1(y_1), p_1(y_2)) \xrightarrow{\;y_1\, R\, y_2\;} q(y_1). \]

Then we can apply the induction hypothesis.

We present in Section 3.4 two examples of relations satisfying i and ii.

3.3. Regular Tree Relations.

We first consider the general case of $\mathrm{VTAM}^{R}_{\neg R}$ where the equivalence $R$ is based on an arbitrary regular binary relation on $T(\Gamma)$. By regular binary relation, we mean a set of pairs of ground terms accepted by a tree automaton computing simultaneously in both terms of the pair. More formally, we use a coding of a pair of terms of $T(\Sigma)$ into a term of $T((\Sigma \cup \{\bot\})^2)$, where $\bot$ is a new constant symbol (not in $\Sigma$). This coding is defined recursively by:
• $\otimes : (T(\Sigma) \cup \{\bot\}) \times (T(\Sigma) \cup \{\bot\}) \to T((\Sigma \cup \{\bot\})^2)$,
• for all $a, b \in \Sigma_0 \cup \{\bot\}$, $a \otimes b := \langle a, b\rangle$,
• for all $a \in \Sigma_0 \cup \{\bot\}$, $f \in \Sigma_2$, $t_1, t_2 \in T(\Sigma)$, $f(t_1, t_2) \otimes a := \langle f, a\rangle(t_1 \otimes \bot, t_2 \otimes \bot)$ and $a \otimes f(t_1, t_2) := \langle a, f\rangle(\bot \otimes t_1, \bot \otimes t_2)$,
• for all $f, g \in \Sigma_2$, $s_1, s_2, t_1, t_2 \in T(\Sigma)$, $f(s_1, s_2) \otimes g(t_1, t_2) := \langle f, g\rangle(s_1 \otimes t_1, s_2 \otimes t_2)$.

Then, a binary relation $R \subseteq T(\Sigma) \times T(\Sigma)$ is called regular iff the set $\{ s \otimes t \mid (s, t) \in R \}$ is regular. The above coding of pairs is unrelated to the product used in Theorem 2.4.

Theorem 3.5. The membership problem for $\mathrm{VTAM}^{R}_{\neg R}$ is NP-complete when $R$ is a regular binary relation.

Proof. Assume given a ground term $t \in T(\Sigma)$ and a $\mathrm{VTAM}^{R}_{\neg R}$ $\mathcal{A} = (\Gamma, R, Q, Q_f, \Delta)$.
Because of the visibility condition, for every subterm $s$ of $t$, we can compute in polynomial time in the size of $s$ the shape denoted $\mathit{struct}(s)$, which is an abstraction of the memory reached when $\mathcal{A}$ runs on $s$. More precisely, $\mathit{struct}(s)$ is an unlabeled tree, and every possible memory content $m$ reachable by $\mathcal{A}$ in a computation $s \to^*_\Delta q(m)$ is obtained by a labeling of the nodes of $\mathit{struct}(s)$ with symbols of $\Gamma$. Note that for every subterm $s$, the size of $\mathit{struct}(s)$ is smaller than the size of $t$.

Let us guess a decoration of every node of $t$ with a state of $Q$ and a labeling of $\mathit{struct}(s)$ (where $s$ is the subterm of $t$ at the given node), such that the root of $t$ is decorated with a final state of $Q_f$. We can check in polynomial time whether this decoration represents a run of $\mathcal{A}$ on $t$ or not.

The NP-hardness is a consequence of Theorem 3.9, which applies to the particular case where $R$ is the syntactic equality between terms.

Note that the NP algorithm works with every equivalence $R$ based on a regular relation, but the NP-hardness concerns only some cases of such relations. For instance, in Section 3.4, we will see one example of a relation for which membership is NP-hard and another example for which it is in PTIME.

The class of $\mathrm{VTAM}^{R}_{\neg R}$ when $R$ is a binary regular tree relation constitutes a nice and uniform framework. Note however that condition ii of Theorem 3.2 is not always true in this case. Actually, this class is too expressive.

Theorem 3.6. Given a regular binary relation $R$ and an automaton $\mathcal{A}$ in $\mathrm{VTAM}^{R}$, the emptiness of $L(\mathcal{A})$ is undecidable.

Proof. We reduce the blank accepting problem for a deterministic Turing machine $M$. We encode configurations of $M$ as "right-combs" (binary trees) built with the tape and state symbols of $M$, in $\Sigma_{\mathsf{PUSH}}$ (hence binary), and a constant symbol $\varepsilon$ in $\Sigma_{\mathsf{INT}_0}$.
Let $R$ be the regular relation which accepts all the pairs of configurations $c \otimes c'$ such that $c'$ is a successor of $c$ by $M$. A sequence of configurations $c_0 c_1 \ldots c_n$ (with $n \ge 1$) is encoded as a tree $t = f(c_0, f(c_1, \ldots f(c_{n-1}, c_n)))$, where $f$ is a binary symbol of $\Sigma_{\mathsf{INT}^R_1}$.

We construct a $\mathrm{VTAM}^{R}$ $\mathcal{A}$ which accepts exactly the term representations $t$ of computation sequences of $M$ starting with the initial configuration $c_0$ of $M$ and ending with a final configuration $c_n$ with blank tape. Following the type of the function symbols, the rules of $\mathcal{A}$ will
• push all the symbols read in subterms of $t$ corresponding to configurations,
• compare, with $R$, $c_i$ and $c_{i+1}$ (the memory contents in respectively the left and right branches) and store $c_i$ in the memory, with a transition applied at the top of a subterm $f(c_i, f(c_{i+1}, \ldots))$.

This way, $\mathcal{A}$ checks that successive configurations in $t$ correspond to transitions of $M$, hence the language of $\mathcal{A}$ is not empty iff $M$ accepts the initial configuration $c_0$.

3.4. Syntactic and Structural Equality and Disequality Constraints.

We now present two examples of relations satisfying the conditions of Theorem 3.2: syntactic and structural term equality. The satisfaction of condition i will be proved with the help of the following key lemma.

Lemma 3.7. Let $R$ be a regular binary relation defined by a TA whose state set is $\{ R_i \mid i \in \{1, \ldots, n\} \}$ and such that $\forall i, j\ \exists k, l\ \forall x, y, z.\ x R_i y \wedge y R_j z \Leftrightarrow x R_k y \wedge x R_l z$. Let $\mathcal{A} = (\Gamma, R, Q, Q_f, \Delta)$ be a tree automaton with memory and constraints (not necessarily visibly). Then it is possible to compute in exponential time a finite tree automaton $\mathcal{A}'$ such that, for every state $q \in Q$, the language $M(\mathcal{A}, q)$ is the language accepted in some state of $\mathcal{A}'$.

Proof.
(Sketch) To prove this lemma, we first observe that the $M(\mathcal{A}, q)$ (for $q \in Q$) are actually the least sets that satisfy the following conditions, for all $x, y, z$ (we assume here for simplicity that the non-constant symbols are binary and display only some of the implications; the others can be easily guessed):
• $x \in M(\mathcal{A}, q_1),\ y \in M(\mathcal{A}, q_2) \Rightarrow g(x, y) \in M(\mathcal{A}, q)$ if there is a rule $f(q_1(x_1), q_2(x_2)) \to q(g(x_1, x_2))$,
• $g(x, y) \in M(\mathcal{A}, q_1),\ z \in M(\mathcal{A}, q_2) \Rightarrow x \in M(\mathcal{A}, q)$ if there is a rule $f(q_1(g(x, y)), q_2(z)) \to q(x)$,
• $x \in M(\mathcal{A}, q_1),\ y \in M(\mathcal{A}, q_2),\ R(x, y) \Rightarrow x \in M(\mathcal{A}, q)$ if there is a rule $f(q_1(x), q_2(y)) \xrightarrow{\;x R y\;} q(x)$,
• $\cdots$

In terms of automata, this means that $M(\mathcal{A}, q)$ is a language recognized by a two-way alternating tree automaton with regular binary constraints. In other words, such languages are the least Herbrand model of a set of clauses of the form

\[
\begin{array}{rcll}
Q_1(y_1),\ Q_2(y_2),\ R(y_1, y_2) & \Rightarrow & Q_3(y_1) & \mathsf{INT}_1, \mathsf{INT}_2 \\
Q_1(y_1),\ Q_2(y_2) & \Rightarrow & Q_3(f(y_1, y_2)) & \mathsf{PUSH} \\
 & \Rightarrow & Q_1(a) & \mathsf{INT}_0 \\
Q_1(f(y_1, y_2)),\ Q_2(y_3) & \Rightarrow & Q_3(y_1) & \mathsf{POP}_{11}, \mathsf{POP}_{21} \\
Q_1(f(y_1, y_2)),\ Q_2(y_3) & \Rightarrow & Q_3(y_2) & \mathsf{POP}_{12}, \mathsf{POP}_{22}
\end{array}
\]

The lemma then shows that languages that are recognized by two-way alternating tree automata with some particular regular constraints are also recognized by a finite tree automaton. This corresponds to classical reductions of two-way automata to one-way automata (see e.g. [8], chapter 7, [13], or [12, 6] for the first relevant references). The idea of the reduction is to find shortcuts: moving up and down yields a move at the same level. Such shortcuts are added as new rules, until a "complete set" is obtained. Then only the non-redundant rules are kept: this yields a finite tree automaton.
Such a procedure relies on the definitions of ordered strategies, redundancy and saturation (also known as complete sets), which are classical notions in automated first-order theorem proving [13, 3, 20]. Indeed, formally, a "shortcut" must be a formula which allows for smaller proofs than the proof using the two original rules. A saturated set corresponds to a set of formulas all of whose shortcuts are already in the set.

The advantage of the clausal formalism is, first, to enable an easy representation of the above shortcuts as intermediary steps. Such shortcuts are clauses, but are not automata rules. Second, we may rely on completeness results for Horn clauses. That is why, only for the proof of this lemma, which follows and extends the classical proofs by adding some regular constraints, we switch to a first-order logic formalization.

The complete proof can be found in Appendix 5. As in the classical proofs, we saturate the set of clauses by resolution with selection and eager splitting. This saturation terminates, and the set of clauses corresponding to finite tree automata transitions in the saturated set recognizes the language $M(\mathcal{A}, q)$, which is therefore regular.

The condition on $R$ in the lemma allows us to break chains such as $\exists x_1, \ldots, x_n.\ x R x_1 \wedge x_1 R x_2 \wedge \cdots \wedge x_n R y \wedge P(x, y)$, which would be a source of non-termination in the saturation procedure. We may indeed replace such chains by $\exists x_1, \ldots, x_n.\ x R_1 x_1 \wedge x R_2 x_2 \wedge \ldots \wedge x R_n x_n \wedge x R_0 y \wedge P(x, y)$, which can again be simplified into $\exists x_1.\ x S x_1 \wedge x R_0 y \wedge P(x, y)$ where $S$ is the intersection of $R_1, \ldots, R_n$. The possible intersections of this kind range over a finite set, as the relation $R$ is regular and the $R_i$ are states of the automaton accepting $R$.

Finally, note that finding $k, l$ in the lemma's assumption can always be performed in an effective way since $R$ is regular.

3.4.1. Syntactic Constraints.
We first apply Lemma 3.7 to the class $\mathrm{VTAM}^{=}_{\neq}$, where $=$ denotes the equality between ground terms made of memory symbols. Note that it is a particular case of the constrained $\mathrm{VTAM}^{R}_{\neg R}$ of Section 3.3 above, since the term equality is a regular relation.

The automata of the subclass with positive constraints only, $\mathrm{VTAM}^{=}$, are particular cases of tree automata with one memory of [7], and therefore have a decidable emptiness problem. We show below that $\mathrm{VTAM}^{=}_{\neq}$ fulfills the hypotheses of Theorem 3.2, and hence that emptiness is also decidable for the whole class.

We can first verify that the relation $=$ satisfies the hypothesis of Lemma 3.7, hence condition i of Theorem 3.2. Moreover, the relation $=$ obviously also satisfies condition ii of Theorem 3.2.

Corollary 3.8. The emptiness problem is decidable for $\mathrm{VTAM}^{=}_{\neq}$.

A careful analysis of the proof of Theorem 3.2 yields an EXPTIME complexity for this problem with $\mathrm{VTAM}^{=}_{\neq}$.

Theorem 3.9. The membership problem is NP-complete for $\mathrm{VTAM}^{=}_{\neq}$.

Proof. An NP algorithm is given in the proof of Theorem 3.5. For the NP-hardness, we use a logspace reduction of 3-SAT. Let us consider an instance of 3-SAT with $n$ propositional variables $X_1, \ldots, X_n$ and a conjunction of $m$ clauses:

\[ \bigwedge_{i=1}^{m} (\alpha_{i,1} \vee \alpha_{i,2} \vee \alpha_{i,3}) \]

where every $\alpha_{i,j}$ is either a variable $X_k$ ($k \le n$) or a negation of a variable $\neg X_k$. We assume wlog that every variable occurs at most once in a clause. We consider an encoding $t$ of the given instance as a term over the signature $\Sigma$ containing the symbols: $X_1, \ldots, X_n$ (constants), $\mathit{id}$, $\mathit{false}$, $\neg$ (unary) and $\wedge$ and $\vee$ (binary). The encoding is:

\[ t := C_\wedge\big[\, C_\vee[\delta_{1,1}(X_1), \ldots, \delta_{1,n}(X_n)],\ \ldots,\ C_\vee[\delta_{m,1}(X_1), \ldots, \delta_{m,n}(X_n)] \,\big] \]

where $C_\wedge$ (resp. $C_\vee$) is a context built solely with $\wedge$ (resp. $\vee$) and where every $\delta_{i,j}$ is either:
• $\delta_{i,j} = \mathit{id}$ (interpreted as the identity) if one of $\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3}$ is $X_j$,
• $\delta_{i,j} = \neg$ if one of $\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3}$ is $\neg X_j$,
• $\delta_{i,j} = \mathit{false}$ (interpreted as the constant function returning false) if $X_j$ does not occur in $\alpha_{i,1}, \alpha_{i,2}, \alpha_{i,3}$.

Now, let us partition the signature $\Sigma$ with: $X_1, \ldots, X_n, \vee \in \mathsf{PUSH}$, $\mathit{id}, \mathit{false}, \neg \in \mathsf{INT}_1$ and $\wedge \in \mathsf{INT}^{=}_1$; and let us consider the memory signature $\Gamma = \{0, 1, \vee\}$. We now construct a $\mathrm{VTAM}^{=}$ $\mathcal{A} = (\Gamma, =, \{q_0, q_1\}, \{q_1\}, \Delta)$ whose transitions will, intuitively:
• guess an assignment for each constant symbol $X_k$ of $t$, by means of a non-deterministic choice of one state $q_0$ or $q_1$,
• compute the value of $t$ with these assignments,
• push each tuple of assignments for each clause, in the contexts $C_\vee$,
• check the coherence of assignments by means of equality tests between the tuples pushed, in the context $C_\wedge$.

More formally, we have the following transitions in $\Delta$:

\[
\begin{array}{ll}
X_i \to q_0(0) \qquad X_i \to q_1(1) & i \le n \\
\mathit{id}(q_\varepsilon(y_1)) \to q_\varepsilon(y_1) \qquad \mathit{false}(q_\varepsilon(y_1)) \to q_0(y_1) \qquad \neg(q_\varepsilon(y_1)) \to q_{1-\varepsilon}(y_1) & \text{with } \varepsilon \in \{0, 1\} \\
\vee(q_{\varepsilon_1}(y_1), q_{\varepsilon_2}(y_2)) \to q_{\varepsilon_1 \vee \varepsilon_2}(\vee(y_1, y_2)) & \\
\wedge(q_{\varepsilon_1}(y_1), q_{\varepsilon_2}(y_2)) \xrightarrow{\;y_1 = y_2\;} q_{\varepsilon_1 \wedge \varepsilon_2}(y_1) & \text{with } \varepsilon_1, \varepsilon_2 \in \{0, 1\}
\end{array}
\]

We can verify that the above $\mathrm{VTAM}^{=}$ $\mathcal{A}$ recognizes $t$ iff the instance of 3-SAT has a solution.

$\mathrm{VTAM}^{=}_{\neq}$ is closed under union (using the same construction as before) but not under complementation. This is a consequence of the following theorem.

Theorem 3.10. The universality problem is undecidable for $\mathrm{VTAM}^{=}_{\neq}$.

Proof. We reduce the blank accepting problem for a deterministic Turing machine $M$.
As in the proof of Theorem 3.6, we encode configurations of $M$ as right-combs on a signature $\Sigma$ containing the tape and state symbols of $M$, considered as binary symbols of $\Sigma_{\mathsf{PUSH}}$, and a constant symbol $\varepsilon$ in $\Sigma_{\mathsf{PUSH}}$. A sequence of configurations $c_0, c_1, \ldots, c_n$ (with $n \ge 1$) is encoded as a tree $t = f(c_n, f(c_{n-1}, \ldots f(c_0, \varepsilon)))$, where $f$ is a binary symbol of $\Sigma_{\mathsf{INT}^{=}_1}$. Such a tree is called a computation of $M$ if $c_0$ is the initial configuration, $c_n$ is a final configuration and for all $0 \le i < n$, $c_{i+1}$ is the successor of $c_i$ with $M$. Moreover, we assume that all the $c_i$ have the same length (for this purpose we complete the representations of configurations with blank symbols).

\[
\begin{array}{l}
\varepsilon \to q_\varepsilon(\varepsilon) \\
f(q_B(y_1), q_\varepsilon(y_2)) \xrightarrow{\;y_1 \ne y_2\;} q(y_1) \\
f(q_B(y_1), q(y_2)) \xrightarrow{\;y_1 = y_2\;} q(y_1) \\
f(q_B(y_1), q(y_2)) \xrightarrow{\;y_1 \ne y_2\;} q_f(y_1) \\
f(q_B(y_1), q_f(y_2)) \xrightarrow{\;y_1 = y_2\;} q_f(y_1) \\
f(q_B(y_1), q_f(y_2)) \xrightarrow{\;y_1 \ne y_2\;} q_f(y_1)
\end{array}
\]
Figure 3: The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}_3$ in the proof of Theorem 3.10.

\[
\begin{array}{l}
\varepsilon \to q_\varepsilon(\varepsilon) \\
f(q_\forall(y_1), q_\varepsilon(y_2)) \xrightarrow{\;y_1 \ne y_2\;} q_\forall(y_1) \\
f(q_\forall(y_1), q_\forall(y_2)) \xrightarrow{\;y_1 = y_2\;} q_\forall(y_1) \\
f(q_\Box(y_1), q_\forall(y_2)) \xrightarrow{\;y_1 = y_2\;} q_\Box(y_1) \\
f(q_{=}(y_1), q_\Box(y_2)) \xrightarrow{\;y_1 \ne y_2\;} q_f(y_1) \\
f(q_\forall(y_1), q_f(y_2)) \xrightarrow{\;y_1 = y_2\;} q_f(y_1)
\end{array}
\]
Figure 4: The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}_4$ in the proof of Theorem 3.10.

We want to construct a $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}$ which recognizes exactly the terms which are not computations of $M$. Hence, $\mathcal{A}$ recognizes all the terms of $T(\Sigma)$ iff $M$ does not accept the initial blank configuration.

For the construction of $\mathcal{A}$, let us first observe that we can associate to $M$ a VTAM $\mathcal{A}_\Box$ which, while reading a configuration $c_i$, will push on the memory its successor $c_{i+1}$.
The existence of such an automaton is guaranteed, first, by the fact that for each regular binary relation $R$, as defined in Section 3.3, there exists a VTAM which, for each $(s, t) \in R$, will push $t$ while reading $s$, and second, by the fact that the language of the $c_i \otimes c_{i+1}$, hence the relation of successor configuration, is regular. Moreover, since only push operations are performed, we can ensure that $\mathcal{A}_\Box$ satisfies the visibility condition. Let us denote by $q_\Box$ the final state (which is assumed unique wlog) of the VTAM $\mathcal{A}_\Box$.

We also use the following VTAMs:
$\mathcal{A}_\forall$: a VTAM with (unique) final state $q_\forall$ which, while reading a configuration $c_i$, will push on the memory any configuration with the same length as $c_i$,
$\mathcal{A}_{=}$: a VTAM with final state $q_{=}$ which, while reading a configuration $c_i$, will push $c_i$ on the memory,
$\mathcal{A}_B$: a VTAM with final state $q_B$ which, while reading a configuration $c_i$, will push on the memory a configuration with the same length as $c_i$ and containing only blank symbols.

The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}$ is the union of the following automata:
$\mathcal{A}_1$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the terms of $T(\Sigma)$ which are not representations of sequences of configurations (malformed terms). Its language is actually a regular tree language.
$\mathcal{A}_2$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the sequences of configurations $f(c_n, f(c_{n-1}, \ldots f(c_0, \varepsilon)))$ such that $c_0$ is not initial or $c_n$ is not final. Again, this is a regular tree language.
$\mathcal{A}_3$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the sequences of configurations with two configurations of different lengths. It contains the transition rules of $\mathcal{A}_B$ and the additional transitions described in Figure 3, which perform this test.
$\mathcal{A}_4$: a $\mathrm{VTAM}^{=}_{\neq}$ recognizing the sequences of configurations $f(c_n, f(c_{n-1}, \ldots f(c_0, \varepsilon)))$ such that all the $c_i$ have the same length but there exists $0 \le i < n$ such that $c_{i+1}$ is not the successor of $c_i$ by $M$.
This last $\mathrm{VTAM}^{=}_{\neq}$ contains the transitions of $\mathcal{A}_\Box$, $\mathcal{A}_\forall$, $\mathcal{A}_{=}$, and the additional transitions described in Figure 4.

With the transition rules in Figure 4, the automaton $\mathcal{A}_4$ guesses an $i < n$ and, while reading each of the configurations $c_j$ with $j \le i$, it pushes the successor configuration of $c_j$, say $c'_j$ (second column of Figure 4). Then, while reading $c_{i+1}$, $\mathcal{A}_4$ pushes $c_{i+1}$, and it checks that $c'_i$ and $c_{i+1}$ differ. After that, when reading each of the remaining configurations, $\mathcal{A}_4$ pushes $c_{i+1}$ (third column of Figure 4).

The $\mathrm{VTAM}^{=}_{\neq}$ $\mathcal{A}_1$ to $\mathcal{A}_4$ cover all the cases of a term of $T(\Sigma)$ not being an accepting computation of $M$ starting with the initial blank configuration. Hence the language of their union $\mathcal{A}$ is $T(\Sigma)$ iff $M$ does not accept the initial blank configuration.

Corollary 3.11. $\mathrm{VTAM}^{=}_{\neq}$ is not effectively closed under complementation.

Proof. It is a consequence of Corollary 3.8 (emptiness decision) and Theorem 3.10.

3.4.2. Structural Constraints.

Lemma 3.7 also applies to another class $\mathrm{VTAM}^{\equiv}_{\not\equiv}$, where $\equiv$ denotes structural equality of terms, defined recursively as the smallest equivalence relation on ground terms such that:
• $a \equiv b$ for all $a$, $b$ of arity 0,
• $f(s_1, s_2) \equiv g(t_1, t_2)$ if $s_1 \equiv t_1$ and $s_2 \equiv t_2$, for all $f$, $g$ of arity 2.

Note that it is a regular relation, and that it satisfies the hypothesis of Lemma 3.7 and condition ii of Theorem 3.2.

Corollary 3.12. The emptiness problem is decidable for $\mathrm{VTAM}^{\equiv}_{\not\equiv}$.

Following the procedure in the proof of Theorem 3.2, we obtain a 2-EXPTIME complexity for this problem and this class.

The crucial property of the relations $\equiv$ and $\not\equiv$ is that, unlike the relations of the class $\mathrm{VTAM}^{=}_{\neq}$ above or of the general $\mathrm{VTAM}^{R}_{\neg R}$, they ignore the labels of the contents of the memory. They only depend on the structure of these memory terms.
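The fact that structural equality only looks at shapes can be illustrated with a short Python sketch. The term encoding (a constant is a 1-tuple, a binary term a 3-tuple) and the names `shape` and `struct_eq` are assumptions made for this example.

```python
def shape(term):
    """Erase all labels of a term over constants and binary symbols.

    Assumed encoding: a constant is a 1-tuple (symbol,), a binary term is a
    3-tuple (symbol, left, right).  A constant becomes (), and a binary term
    becomes the pair of the shapes of its two arguments.
    """
    if len(term) == 1:
        return ()
    _symbol, left, right = term
    return (shape(left), shape(right))


def struct_eq(s, t):
    """Structural equality: s and t agree once all labels are forgotten."""
    return shape(s) == shape(t)


a, b = ("a",), ("b",)
print(struct_eq(("f", a, b), ("g", b, b)))   # same shape, different labels
print(struct_eq(a, ("f", a, a)))             # different shapes
```

Two terms are related exactly when both clauses of the recursive definition above apply all the way down, which is what comparing the label-free shapes checks.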
A benefit of this property of VTAM$^{\equiv}_{\not\equiv}$ is that the decision of the membership problem drops to PTIME for this class.

Theorem 3.13. The membership problem is decidable in PTIME for VTAM$^{\equiv}_{\not\equiv}$.

Proof. Let $\mathcal{A} = (\Gamma, \equiv, Q, Q_f, \Delta)$ be a VTAM$^{\equiv}_{\not\equiv}$ on $\Sigma$ and let $t$ be a term in $T(\Sigma)$. Let $sub(t)$ be the set of subterms of $t$ and let us construct a VTAM $\mathcal{A}' = (\Gamma, sub(t) \times Q, \{t\} \times Q_f, \Delta')$ on $\Sigma'$, where the symbols of $\Sigma'$ and $\Sigma$ are the same, and we assume that the symbols in category INT$^{\equiv}_1$ (resp. INT$^{\equiv}_2$) in the partition of $\Sigma$ are in INT$_1$ (resp. INT$_2$) in the partition of $\Sigma'$. The transitions of $\Delta'$ are obtained by the following transformation of the transitions of $\Delta$. We only describe the construction for the cases INT$_1$ and INT$^{\equiv}_1$ with positive constraints. The other cases are similar.
• for every $f_7(q_1(y_1), q_2(y_2)) \to q(y_1) \in \Delta$, we add to $\Delta'$ all the transitions $f_7(\langle q_1, t_1\rangle(y_1), \langle q_2, t_2\rangle(y_2)) \to \langle q, f(t_1, t_2)\rangle(y_1)$ such that $f(t_1, t_2) \in sub(t)$,
• for every $f_9(q_1(y_1), q_2(y_2)) \xrightarrow{y_1 \equiv y_2} q(y_1) \in \Delta$, we add to $\Delta'$ all the transitions as above (in this case, $f_9$ is assumed to be a symbol of category INT$_1$ in $\Sigma'$) such that moreover $struct(t_1) = struct(t_2)$, where $struct(s)$ is defined, as in the proof of Theorem 3.5, as the shape (unlabeled tree) of the memory of $\mathcal{A}$ after $\mathcal{A}$ has processed $s$.
The VTAM $\mathcal{A}'$ can be computed in time $O(\|t\|^2 \times \|\mathcal{A}\|)$. It recognizes at most one term, $t$, and it recognizes $t$ iff $\mathcal{A}$ recognizes $t$. Therefore, $t$ is recognized by $\mathcal{A}$ iff the language of $\mathcal{A}'$ is not empty. This can be decided in PTIME according to Theorem 2.5.

Even more interesting, the construction for determinization of Section 2.3 still works for VTAM$^{\equiv}_{\not\equiv}$.

Theorem 3.14.
For every VTAM$^{\equiv}_{\not\equiv}$ $\mathcal{A} = (\Gamma, \equiv, Q, Q_f, \Delta)$ there exists a deterministic VTAM$^{\equiv}_{\not\equiv}$ $\mathcal{A}^{det} = (\Gamma^{det}, \equiv, Q^{det}, Q^{det}_f, \Delta^{det})$ such that $L(\mathcal{A}) = L(\mathcal{A}^{det})$, where $|Q^{det}|$ and $|\Gamma^{det}|$ both are $O(2^{|Q|^2})$.

Proof. We use the same construction as in the proof of Theorem 2.3, with a direct extension of the construction for INT to INT$^{\equiv}$. The key property for handling constraints is that the structure of the memory (hence the result of the structural tests) is independent of the non-deterministic choices of the automaton. With the visibility condition it only depends on the term read.

Theorem 3.15. The class of tree languages of VTAM$^{\equiv}_{\not\equiv}$ is closed under Boolean operations. One can construct VTAM$^{\equiv}_{\not\equiv}$ for union, intersection and complement of given VTAM$^{\equiv}_{\not\equiv}$ languages whose sizes are respectively linear, quadratic and exponential in the size of the initial VTAM$^{\equiv}_{\not\equiv}$.

Proof. We use the same constructions as in Theorem 2.4 (VTAM) for union and intersection. For the intersection, in the case of constrained rules we can safely keep the constraints in product rules, thanks to the visibility condition (as the structure of the memory only depends on the term read, see the proof of Theorem 3.14). For instance, the product of the INT$^{\equiv}_1$ rules $f_9(q_{11}(y_1), q_{12}(y_2)) \xrightarrow{y_1 \equiv y_2} q_1(y_1)$ and $f_9(q_{21}(y_1), q_{22}(y_2)) \xrightarrow{y_1 \equiv y_2} q_2(y_1)$ is $f_9(\langle q_{11}, q_{21}\rangle(y_1), \langle q_{12}, q_{22}\rangle(y_2)) \xrightarrow{y_1 \equiv y_2} \langle q_1, q_2\rangle(y_1)$. The product of two INT$^{\not\equiv}_1$ rules is constructed similarly. We do not need to consider the product of a rule INT$^{\equiv}_1$ with a rule INT$^{\not\equiv}_1$, and vice versa, because in this case the product is empty (no rule is added to the VTAM$^{\equiv}_{\not\equiv}$ for intersection). For the complementation, we use Theorem 3.14 and completion.

Corollary 3.16.
The universality and inclusion problems are decidable for VTAM$^{\equiv}_{\not\equiv}$.

Proof. This is a consequence of Corollary 3.12 and Theorem 3.15.

3.5. Constrained PUSH Transitions. Above, we always considered constraints in transitions with INT symbols only. We did not consider a constrained extension of the rules PUSH. The main reason is that symbols of a new category PUSH$^{\equiv}$, which test two memories for structural equality and then push a symbol on the top of them, permit us to construct a constrained VTAM $\mathcal{A}$ whose memory language $M(\mathcal{A}, q)$ is the set of well-balanced binary trees. This language is not regular, whereas the basis of our emptiness decision procedure is the result (Theorem 3.2, Lemma 3.7) of regularity of these languages for the classes considered.

3.6. Contexts as Symbols and Signature Translations. Before looking for some examples of VTAM$^{\equiv}_{\not\equiv}$ languages, we show a "trick" that (seemingly) adds expressiveness to VTAM$^{\equiv}_{\not\equiv}$. One symbol can perform either a PUSH or a POP operation, or make an INT transition (constrained or not), but it cannot combine several of these operations. Here, we propose a way to combine several operations in one symbol, and thus increase the expressiveness of VTAM$^{\equiv}_{\not\equiv}$, without losing the good properties of this class.

The trick is to replace symbols by contexts. For instance a context $g_2(g_1(\cdot, \cdot), g_0)$ can replace a symbol of arity 2. Assume that $g_2$ is a PUSH symbol, $g_1$ is an INT$_1$ symbol with test, and $g_0$ is an INT$_0$ symbol. This context first performs a test on the memories of the sons, and then a PUSH operation on the memory kept by $g_1$ (and on the $\bot$ leaf created by $g_0$). Such a combination is normally not possible, and replacing symbols by contexts brings a lot of additional expressiveness.
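As a rough illustration of this replacement of symbols by contexts, the sketch below rewrites a term so that every occurrence of a chosen binary symbol $g$ becomes the context $g_2(g_1(\cdot, \cdot), g_0)$. The tuple encoding of terms, the function name `translate`, and the dictionary `contexts` are hypothetical names introduced only for this example.

```python
# Hypothetical term representation (an assumption of this sketch):
# a constant is ("a",), a binary term is ("g", left, right).

def translate(term, contexts):
    """Replace each binary symbol g listed in `contexts` by g2(g1(., .), g0).

    `contexts` maps a symbol name to a triple (g2, g1, g0) of fresh symbol names.
    Symbols that are not listed are left unchanged.
    """
    if len(term) == 1:
        return term
    sym, left, right = term
    left = translate(left, contexts)
    right = translate(right, contexts)
    if sym in contexts:
        g2, g1, g0 = contexts[sym]
        # g1 carries the two original sons, g0 is the extra constant leaf
        return (g2, (g1, left, right), (g0,))
    return (sym, left, right)


a = ("a",)
print(translate(("g", a, a), {"g": ("g2", "g1", "g0")}))
# ('g2', ('g1', ('a',), ('a',)), ('g0',))
```

The translation only changes the input tree read by the automaton; the memory terms are not affected by it.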
Here is how we precisely proceed: we want to recognize a language (on a signature $\Sigma$) with a VTAM, and we then have to choose the categories for each symbol of the signature (PUSH, POP$_{ij}$, INT$^{\equiv}_1$, ...). As we will see in the examples below, it might be useful in practice to have some extra categories combining the powers of two or more categories of VTAM$^{\equiv}_{\not\equiv}$. We can do that still with VTAM$^{\equiv}_{\not\equiv}$, by means of an encoding of the terms of $T(\Sigma)$. More precisely, we replace some symbols of the initial signature $\Sigma$ by contexts built with new symbols. For instance, we replace a $g \in \Sigma$, which will perform the complex operation described above, by the context $g_2(g_1(\cdot, \cdot), g_0)$. Then, we have to ensure that the new symbols (in our example $g_0$, $g_1$ and $g_2$) are only used to form the contexts encoding the symbols of $\Sigma$. This can easily be done with local information maintained in the state of the automaton. The set of well formed terms, built with new symbols organized in allowed contexts, is a regular tree language. We call the VTAM$^{\equiv}_{\not\equiv}$ signature obtained a translation of the initial signature. If $L$ is a tree language on $\Sigma$, then $c(L)$ is the translation of $L$.

In summary, we have shown here a general method for adding new categories of symbols corresponding to (relevant) combinations of operations of VTAM$^{\equiv}_{\not\equiv}$, and hence defining extensions of VTAM$^{\equiv}_{\not\equiv}$ with the same good properties as VTAM$^{\equiv}_{\not\equiv}$. By relevant, we mean that some combinations are excluded, like for instance PUSH + constraint $\equiv$ at the same time (see paragraph above). Such a forbidden combination cannot be handled by our method.

With similar encodings, we can deal with symbols of arity bigger than 2, e.g. $g(\cdot, \cdot, \cdot)$ can be replaced by $g_2(\cdot, g_1(\cdot, \cdot))$.

Note however first that this encoding concerns the recognized tree, not the memories.
For instance, it is not possible to systematically encode the syntactic equality as structural equality (on memories) in this way. And indeed, the decision results are drastically different in the two cases.

Also note that, even if $c(L)$ is accepted by a VTAM, which implies that $\neg c(L)$ is also accepted by a VTAM, it may well be the case that $c(\neg L)$ is not recognized by a VTAM. So, the above trick does not show that we can extend our results to a wider class of tree languages.

3.7. Some VTAM$^{\equiv}_{\not\equiv}$ Languages. The regular tree languages and VPL are particular cases of VTAM languages. We present in this section some other examples of relevant tree languages translatable, using the method of Section 3.6, into VTAM$^{\equiv}_{\not\equiv}$ languages.

Well balanced binary trees. The VTAM$^{\equiv}_{\not\equiv}$ with memory signature $\{f, \bot\}$, state set $\{q, q_f\}$, unique final state $q_f$, and whose rules follow accepts the (non-regular) language of well balanced binary trees built with $g$ and $a$. Here $a$ is a constant in $\Sigma_{\mathrm{INT}_0}$, and $g$ is in a new category, and is translated into the context $g_2(g_1(\cdot, \cdot), g_0)$, where $g_2 \in \Sigma_{\mathrm{PUSH}}$, $g_1 \in \Sigma_{\mathrm{INT}^{\equiv}_1}$, and $g_0 \in \Sigma_{\mathrm{INT}_0}$.

$a \to q_f(\bot)$
$g_0 \to q_0(\bot)$
$g_1(q_f(y_1), q_f(y_2)) \xrightarrow{y_1 \equiv y_2} q(y_1)$
$g_2(q(y_1), q_0(y_2)) \to q_f(f(y_1, y_2))$

Powerlists. A powerlist [18] is roughly a list of length $2^n$ (for $n \geq 0$) whose elements are stored in the leaves of a balanced binary tree. For instance, the elements may be integers represented in unary notation with the unary successor symbol $s$ and the constant 0, and the balanced binary tree on the top of them can be built with a binary symbol $g$. This data structure has been used in [18] to specify data-parallel algorithms based on divide-and-conquer strategy and recursion (e.g. Batcher's merge sort and fast Fourier transform).
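The four rules of the well-balanced tree automaton given earlier in this section can be simulated directly, since at most one rule applies at each node. The following is only a sketch under stated assumptions: terms and memories are encoded as nested tuples, the symbol `"bot"` stands for $\bot$, and the names `run`, `accepts` and `g` are hypothetical helpers that do not appear in the paper.

```python
# Hypothetical encoding (an assumption of this sketch): a constant is ("a",),
# a binary term is ("g1", left, right); the memory leaf ⊥ is written ("bot",).
BOT = ("bot",)

def struct_equiv(s, t):
    """Structural equality: same shape, labels ignored."""
    if len(s) == 1 and len(t) == 1:
        return True
    if len(s) == 3 and len(t) == 3:
        return struct_equiv(s[1], t[1]) and struct_equiv(s[2], t[2])
    return False

def run(term):
    """Bottom-up run: return (state, memory), or None if no rule applies."""
    sym = term[0]
    if len(term) == 1:
        if sym == "a":                      # a -> qf(⊥)
            return ("qf", BOT)
        if sym == "g0":                     # g0 -> q0(⊥)
            return ("q0", BOT)
        return None
    left, right = run(term[1]), run(term[2])
    if left is None or right is None:
        return None
    (s1, m1), (s2, m2) = left, right
    # g1(qf(y1), qf(y2)) -> q(y1), only if y1 ≡ y2
    if sym == "g1" and s1 == "qf" and s2 == "qf" and struct_equiv(m1, m2):
        return ("q", m1)
    # g2(q(y1), q0(y2)) -> qf(f(y1, y2))
    if sym == "g2" and s1 == "q" and s2 == "q0":
        return ("qf", ("f", m1, m2))
    return None

def accepts(term):
    result = run(term)
    return result is not None and result[0] == "qf"

def g(left, right):
    """Translation of the original symbol g into the context g2(g1(., .), g0)."""
    return ("g2", ("g1", left, right), ("g0",))

a = ("a",)
print(accepts(g(g(a, a), g(a, a))))  # True: complete tree of height 2
print(accepts(g(g(a, a), a)))        # False: the two sons have different heights
```

In this run the memory grows by one `f` node per level of `g`, so the structural test at `g1` compares the heights of the two sons.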
Following the above construction, it is easy to characterize translations of powerlists with a VTAM$^{\equiv}_{\not\equiv}$. We do not push on the "leaves", i.e. on the elements of the powerlist, and compute in the higher part (the complete binary tree) as above.

Some equational properties of algebraic specifications of powerlists have been studied in the context of automatic induction theorem proving and sufficient completeness [17]. Tree automata with constraints have been acknowledged as a very powerful formalism in this context (see e.g. [9]). We therefore believe that a characterization of powerlists (and their complement language) with VTAM$^{\equiv}_{\not\equiv}$ is useful for the automated verification of algorithms on this data structure.

Red-black trees. A red-black tree is a binary search tree satisfying these properties:
(1) every node is either red or black,
(2) the root node is black,
(3) all the leaves are black,
(4) if a node is red, then both its sons are black,
(5) every path from the root to a leaf contains the same number of black nodes.
The first four properties are local and can be checked with standard TA rules. The fifth property makes the language of red-black trees non-regular, and we need VTAM$^{\equiv}_{\not\equiv}$ rules to recognize it. It can be checked by pushing all the black nodes read. We use for this purpose a symbol $black \in \Sigma_{\mathrm{PUSH}}$. When a red node is read, the numbers of black nodes in both its sons are checked to be equal (by a test $\equiv$ on the corresponding memories) and only one corresponding memory is kept. This is done with a symbol $red \in \Sigma_{\mathrm{INT}^{\equiv}_1}$. When a black node is read, the equality of the numbers of black nodes in its sons must also be tested, and a $black$ must moreover be pushed on the top of the memory kept. It means that two operations must be combined.
We can do that by defining an appropriate context with the method of Section 3.6.

In [15] a special class of tree automata is introduced and used in a procedure for the verification of C programs which handle balanced tree data structures, like red-black trees. Based on the above example, we think that, following the same approach, VTAM$^{\equiv}_{\not\equiv}$ can also be used for similar purposes.

BTINT$_1$   $f_{13}(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_1)$   $f_{13} \in \Sigma_{\mathrm{BTINT}_1}$
BTINT$_2$   $f_{14}(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_2)$   $f_{14} \in \Sigma_{\mathrm{BTINT}_2}$
BTINT$_1$   $f_{15}(q_1(y_1), q_2(y_2)) \xrightarrow{1 \neq 2} q(y_1)$   $f_{15} \in \Sigma_{\mathrm{BTINT}_1}$
BTINT$_2$   $f_{16}(q_1(y_1), q_2(y_2)) \xrightarrow{1 \neq 2} q(y_2)$   $f_{16} \in \Sigma_{\mathrm{BTINT}_2}$
Figure 5: New transition categories for BTVTAM$^{R}_{\neg R}$.

4. Visibly Tree Automata with Memory and Structural Constraints and Bogaert-Tison Constraints

In Section 3, we have only considered VTAM with constraints testing the memory contents. In this section, we go a bit further and add to VTAM$^{R}_{\neg R}$ some Bogaert-Tison constraints [4], i.e. equality and disequality tests between brother subterms in the term read by the automaton.

We consider two new categories for the symbols, which we call BTINT$_1$ and BTINT$_2$, for "Bogaert-Tison Internal". A transition with a symbol in one of these categories makes no test on the memory contents, but rather an equality or disequality test between the brother subterms directly under the current position of computation. In Figure 5, we describe the new transition categories. We use the same notation as in [4] for the constraints. Note that again, we only allow Bogaert-Tison constraints in internal rules.
For instance, if $f_{13}(t_1, t_2)$ is a subterm of the input tree, and if $t_1$ leads to $q_1(m_1)$, and $t_2$ to $q_2(m_2)$, then the transition rule $f_{13}(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_1)$, of type BTINT$_1$, can be applied at this position iff $t_1 = t_2$.

Definition 4.1. A visibly tree automaton with memory and constraints and Bogaert-Tison tests (BTVTAM$^{R}_{\neg R}$) on a signature $\Sigma$ is a tuple $(\Gamma, R, Q, Q_f, \Delta)$ where $\Gamma$, $Q$, $Q_f$ are defined as for TAM, $R$ is an equivalence relation on $T(\Gamma)$ and $\Delta$ is a set of rewrite rules in one of the above categories: PUSH, POP$_{11}$, POP$_{12}$, POP$_{21}$, POP$_{22}$, INT$_0$, INT$_1$, INT$_2$, INT$^{R}_1$, INT$^{R}_2$, BTINT$_1$, BTINT$_2$.

The acceptance of terms of $T(\Sigma)$ and languages of terms and memories are defined and denoted as in Section 2.1.

The definition of complete BTVTAM$^{R}_{\neg R}$ is the same as before. Every BTVTAM$^{R}_{\neg R}$ can be completed (with a polynomial overhead) by the addition of a trash state $q_\bot$ (the construction is similar to the one for VTAM$^{R}_{\neg R}$ in Section 3.1).

The definition of deterministic BTVTAM$^{R}_{\neg R}$ is based on the same conditions as for VTAM$^{R}_{\neg R}$ for the function symbols in categories PUSH$_0$, PUSH, POP$_{11}$, ..., POP$_{22}$, INT$_1$, INT$_2$, INT$^{R}_1$, INT$^{R}_2$, and for the function symbols of BTINT$_1$, BTINT$_2$, we use the same kind of conditions as for INT$^{R}_1$, INT$^{R}_2$: for all $f \in \Sigma_{\mathrm{BTINT}_1} \cup \Sigma_{\mathrm{BTINT}_2}$, for all $q_1, q_2 \in Q$, there are at most two rules in $\Delta$ with left member $f(q_1(y_1), q_2(y_2))$, and if there are two, then their constraints have different signs.

Theorem 4.2. For every BTVTAM$^{\equiv}_{\not\equiv}$ $\mathcal{A} = (\Gamma, \equiv, Q, Q_f, \Delta)$ there exists a deterministic BTVTAM$^{\equiv}_{\not\equiv}$ $\mathcal{A}^{det} = (\Gamma^{det}, \equiv, Q^{det}, Q^{det}_f, \Delta^{det})$ such that $L(\mathcal{A}) = L(\mathcal{A}^{det})$, where $|Q^{det}|$ and $|\Gamma^{det}|$ both are $O(2^{|Q|^2})$.

Proof.
We use, again, the same construction as in the proof of Theorem 2.3, with a direct extension of the construction for INT to INT$^{\equiv}$ and BTINT. As mentioned in Theorem 3.14, the extension works for INT$^{\equiv}$ because the results of the tests are independent of the non-deterministic choices of the automaton. For BTINT it is exactly the same (the brother terms are not changed by the automaton!).

Theorem 4.3. The class of tree languages of BTVTAM$^{\equiv}_{\not\equiv}$ is closed under Boolean operations.

Proof. We use the same constructions as in Theorem 2.4 for union and intersection. For the intersection, as in Theorem 3.15, the constraints (even Bogaert-Tison tests) can be safely kept in product rules, thanks to the visibility condition. For the complementation, we use Theorem 4.2 and completion.

The proof of the following theorem follows the same idea as the proof for Bogaert-Tison automata [4], but we need here to take care of the structural constraints on the memory contents. A consequence is that the complexity of emptiness decision is much higher.

Theorem 4.4. The emptiness problem is decidable for BTVTAM$^{\equiv}_{\not\equiv}$.

Proof. Let $\mathcal{A}$ be a BTVTAM$^{\equiv}_{\not\equiv}$. First we determinize it into $\mathcal{A}^{det}$ and assume that $\mathcal{A}^{det}$ is also complete. Then, we delete the rules BTINT$_1$ of the form $f(q_1(y_1), q_2(y_2)) \xrightarrow{1=2} q(y_1)$ with $q_1$ distinct from $q_2$ (idem for BTINT$_2$ rules) because they cannot be used (the automaton is deterministic, so one term cannot lead to two different states). For the same reason, we change each rule BTINT$_1$ of the form $f(q_1(y_1), q_2(y_2)) \xrightarrow{1 \neq 2} q(y_1)$ with $q_1$ distinct from $q_2$ (idem for BTINT$_2$ rules with a disequality test) into the same rule but without the disequality test: $f(q_1(y_1), q_2(y_2)) \to q(y_1)$. We call the newly obtained automaton $\mathcal{A}_{new}$. It is still deterministic and recognizes the same language as $\mathcal{A}^{det}$.
Actually, the careful reader may notice that $\mathcal{A}_{new}$ is not a true BTVTAM$^{\equiv}_{\not\equiv}$, because some unconstrained rules may involve symbols in BTINT in this automaton. However, it is just an intermediate step in the construction of another automaton $\mathcal{A}'$ below.

Now, we consider the remaining BTINT$_1$ or BTINT$_2$ rules with negative Bogaert-Tison constraints, which are of the form $f(q_1(y_1), q_1(y_2)) \xrightarrow{1 \neq 2} q(y_1)$ (or $q(y_2)$). We denote them by $R_1, \ldots, R_i, \ldots, R_N$, and denote by $q_i$ the state in the left member of $R_i$, for each $i \leq N$. We also denote the corresponding BTINT$_1$ or BTINT$_2$ rules by $S_1, \ldots, S_i, \ldots, S_N$. Note that, since $\mathcal{A}^{det}$ is deterministic and complete, we can associate to each rule of BTINT$_i$ whose constraint is negative a unique rule of BTINT$_i$ with a positive constraint and the same states in its left member. So, the state in the left member of $S_i$ is the same $q_i$ as for $R_i$.

It is important to notice that if a rule $R_i$ can effectively be used, then there must exist two distinct terms leading to the state $q_i$ (we will call them witnesses). If not, the rule can be removed. So, our purpose is now to find, for each rule $R_i$, whether two witnesses exist or not. We let $\mathcal{R}$ be initially $\{R_1, \ldots, R_N\}$.

Suppose that at least one $R_i$ rule can be used, and consider a run on a term $t$ that uses such a rule. We consider an innermost application of a rule $R_i$ in this run on a subterm $f(t_1, t_2)$. The run on $t_1$ and the run on $t_2$ both lead to the state $q_i$, without any use of an $R_j$ rule.

Let us remove all the $R_i$ rules from $\mathcal{A}_{new}$, and remove all the equality tests in the $S_i$ rules. Let $\mathcal{A}'$ be the resulting automaton. It is a deterministic VTAM$^{\equiv}_{\not\equiv}$ (considering the symbols in BTINT as INT symbols in this new automaton), and each term in $L(\mathcal{A}', q_i)$ can be transformed (we will call it BT-transformation) into a term in $L(\mathcal{A}_{new}, q_i)$: each time we use a modified $S_i$ rule, for instance of type BTINT$_1$, on a subtree $f(t_1, t_2)$, we replace $t_2$ with $t_1$ so that the equality test is satisfied (and the resulting memory is unchanged). Important: all the replacements must be performed bottom-up.

The proof of the emptiness decidability of VTAM$^{\equiv}_{\not\equiv}$ (Corollary 3.12) is constructive, hence if we choose a reachable state $q_j$, we can find a term in $L(\mathcal{A}', q_j)$ leading to this state, and then convert it into a witness. So, we can find a first witness $t_A \in L(\mathcal{A}_{new}, q_j)$. If no witness can be found, then all the $R_i$ rules are useless and we can definitely remove them all. Otherwise, we still need to find another witness, and if there is at least one such other witness, then one of them can be recognized without using an $R_i$ rule.

We can construct a VTAM$^{\equiv}_{\not\equiv}$ recognizing all the terms whose BT-transformation leads to $t_A$. To design it, we read $t_A$ top-down (knowing the state of $\mathcal{A}'$ at each node), and each time we see a subterm $f(t_1, t_2)$ to which a modified $S_i$ rule has to be applied, for instance a modified BTINT$_1$ (resp. BTINT$_2$) rule, the right (resp. left) son of $f$ only needs to be a term in $L(\mathcal{A}', q_i)$, and the left (resp. right) son of $f$ only needs to be BT-transformed into $t_1$ (resp. $t_2$).

Once this VTAM$^{\equiv}_{\not\equiv}$ is constructed, we can combine it with $\mathcal{A}'$ in order to obtain a VTAM$^{\equiv}_{\not\equiv}$ recognizing all the terms leading $\mathcal{A}'$ to $q_j$ (the state reached by $\mathcal{A}'$ on $t_A$) except the terms whose BT-transformation is $t_A$. Then we find another term in $L(\mathcal{A}', q_j)$ (if it exists) and its BT-transformation is not $t_A$: it is actually another witness $t_B$.
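The bottom-up replacement used in the BT-transformation can be sketched as follows. This is a deliberately simplified illustration, not the construction of the proof: it assumes that the use of a modified $S_i$ rule is determined by the symbol alone (in the proof it depends on the run of $\mathcal{A}'$), and it assumes that for a BTINT$_2$ rule the left son is replaced by the right one, by symmetry with the BTINT$_1$ case described above. The tuple encoding and the names `bt_transform` and `eq_kind` are hypothetical.

```python
# Hypothetical term representation (an assumption of this sketch):
# a constant is ("a",), a binary term is ("f", left, right).

def bt_transform(term, eq_kind):
    """Enforce the dropped equality tests by copying one son over the other.

    `eq_kind` maps a symbol to 1 (BTINT1: the left memory is kept, so the
    right son is replaced by the left one) or 2 (BTINT2: symmetric case,
    an assumption of this sketch). Other symbols are left untouched.
    """
    if len(term) == 1:
        return term
    sym, left, right = term
    # Transform the sons first: replacements must be performed bottom-up.
    left = bt_transform(left, eq_kind)
    right = bt_transform(right, eq_kind)
    kind = eq_kind.get(sym)
    if kind == 1:
        right = left
    elif kind == 2:
        left = right
    return (sym, left, right)


a, b, c = ("a",), ("b",), ("c",)
print(bt_transform(("f", ("f", a, b), c), {"f": 1}))
# ('f', ('f', ('a',), ('a',)), ('f', ('a',), ('a',)))
```

Processing the sons before the node itself matters: the copy made at the root must duplicate the already transformed left son, otherwise the inner equality test would be violated in the copied subterm.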
When we have two witnesses for a rule $R_j$, we remove it from $\mathcal{R}$, and we add this rule $R_j$ to $\mathcal{A}'$, but without the disequality test. The automaton $\mathcal{A}'$ keeps its good property: a term $t$ leading $\mathcal{A}'$ to some state $q$ can be BT-transformed into a term leading $\mathcal{A}_{new}$ to state $q$: when we "meet" the use of a rule formerly in the set $\mathcal{R}$ on $f(t_1, t_1)$ during the bottom-up exploration of $t$, we replace the right son (for a rule that was of type BTINT$_1$ with negative constraints) or the left son (otherwise) by a witness different from $t_1$, so that the disequality test is satisfied. Note that even if $t_1$ is a witness, we can do so because we have found two witnesses.

With the new rule in $\mathcal{A}'$ we look for two witnesses for some remaining $R_i$ rule. Again, we can show that if a couple of witnesses exists, then at least one couple can be found without any use of the remaining $R_i$ rules. When we find a first witness $t_A$ for a remaining rule $R_j$, we can find another one (if it exists) using approximately the same technique as previously: we read $t_A$ top-down, and when we see a rule formerly in $\mathcal{R}$, used on $f(t_1, t_2)$ (e.g. a rule formerly of type BTINT$_1$ with a negative constraint), we just go on recursively, saying that the left son must be a term whose BT-transformation is $t_1$, and the right son must be either:
• a term whose BT-transformation is $t_2$,
• or, if our BT-transformation would change $f(t_1, t_1)$ into $f(t_1, t_2)$, a term whose BT-transformation is $t_1$.

As previously, we construct a VTAM$^{\equiv}_{\not\equiv}$, fully using the Boolean closure of this class, that recognizes the terms in $L(\mathcal{A}', q_j)$ (where $q_j$ is the state reached by $\mathcal{A}'$ on $t_A$), except those whose BT-transformation is $t_A$, and therefore we can find another witness $t_B$ (if it exists).
We continue to use this method, finding couples of witnesses, until there is no rule left in the set $\mathcal{R}$, or until we are no longer able to find a new couple of witnesses: in that latter case, we remove the remaining $R_i$ rules because they are useless.

So, now we use the final version of $\mathcal{A}'$ obtained in order to find a term leading to a final state, and since we have a couple of witnesses for each rule formerly in the set $\mathcal{R}$, we can BT-transform it into a term accepted by $\mathcal{A}_{new}$ (hence by $\mathcal{A}$). If such a term does not exist, the language recognized by $\mathcal{A}_{new}$ (i.e. the language recognized by $\mathcal{A}$) is empty.

5. Conclusion

Having a tree memory structure instead of a stack is sometimes more relevant (even when the input function symbols are only of arities 1 and 0). We have shown how to extend the visibly pushdown languages to such memory structures, keeping the determinization and closure properties of VPL. Our second contribution is then to extend this automaton model, constraining the transition rules with some regular conditions on memory contents. The structural equality and disequality tests appear to be a good class of constraints, since we have then both decidability of emptiness and Boolean closure properties. Moreover, they can be combined (while keeping decidability and closure results) with equality and disequality tests a la [4], operating on brother subterms of the term read.

Several further studies can be done on the automata of this paper. For instance, the problem of the closure of the corresponding tree languages under certain classes of term rewriting systems is particularly interesting, as it can be applied to the verification of infinite state systems with regular model checking techniques.
It could be interesting as well to study how the definition of VTAM can be extended to deal with unranked trees, with the perspective of applications to problems related to semi-structured document processing.

Acknowledgments. The authors wish to thank Pierre Réty for having noted some mistakes in the examples in the extended abstract, and for having sent us a basis of comparison of VTAM with (top down) Visibly Pushdown Tree Automata, Jean Goubault-Larrecq for his suggestion to refer to H3 [19] in the proof of Theorem 2.5, and the reviewers for their useful and numerous remarks and suggestions.

References
[1] R. Alur, S. Chaudhuri, and P. Madhusudan. Visibly pushdown tree languages. Available on: http://www.cis.upenn.edu/~swarat/pubs/vptl.ps, 2006.
[2] R. Alur and P. Madhusudan. Visibly pushdown languages. In L. Babai, editor, Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC 2004), pages 202–211. ACM, 2004.
[3] L. Bachmair and H. Ganzinger. Resolution theorem proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, chapter 2. North Holland, 2001.
[4] B. Bogaert and S. Tison. Equality and Disequality Constraints on Direct Subterms in Tree Automata. In 9th Symp. on Theoretical Aspects of Computer Science, STACS, volume 577 of LNCS, pages 161–171. Springer, 1992.
[5] J. Chabin and P. Réty. Visibly pushdown languages and term rewriting. In Proc. 6th International Symposium Frontiers of Combining Systems (FroCoS), volume 4720 of Lecture Notes in Computer Science, pages 252–266. Springer, 2007.
[6] W. Charatonik and A. Podelski. Set constraints with intersection. In Proc. IEEE Symposium on Logic in Computer Science, Warsaw, 1997.
[7] H. Comon and V. Cortier. Tree automata with one memory, set constraints and cryptographic protocols. Theoretical Computer Science, 331(1):143–214, Feb.
2005.
[8] H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree Automata Techniques and Applications. http://www.grappa.univ-lille3.fr/tata, 1997.
[9] H. Comon and F. Jacquemard. Ground reducibility is EXPTIME-complete. Information and Computation, 187(1):123–153, 2003.
[10] J.-L. Coquidé, M. Dauchet, R. Gilleron, and S. Vágvölgyi. Bottom-up tree pushdown automata: classification and connection with rewrite systems. Theoretical Computer Science, 127(1):69–98, 1994.
[11] N. Dershowitz and J.-P. Jouannaud. Rewrite systems, chapter Handbook of Theoretical Computer Science, Volume B, pages 243–320. Elsevier, 1990.
[12] T. Frühwirth, E. Shapiro, M. Vardi, and E. Yardeni. Logic programs as types for logic programs. In Proc. of the 6th IEEE Symposium on Logic in Computer Science, pages 300–309, 1991.
[13] J. Goubault-Larrecq. Résolution ordonnée avec sélection et classes décidables en logique du premier ordre. Lecture Notes, 2006. Available at http://www.lsv.ens-cachan.fr/~goubault/SOresol.pdf.
[14] I. Guessarian. Pushdown tree automata. Theory of Computing Systems, 16(1):237–263, 1983.
[15] P. Habermehl, R. Iosif, and T. Vojnar. Automata-based verification of programs with tree updates. In Proc. 12th Intern. Conf. on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'06), volume 3920 of LNCS, April 2006.
[16] T. Jensen, D. L. Métayer, and T. Thorn. Verification of control flow based security policies. In Proceedings of the IEEE Symposium on Research in Security and Privacy, pages 89–103. IEEE Computer Society Press, 1999.
[17] D. Kapur. Essays in Honor of Larry Wos, chapter Constructors can be Partial Too. MIT Press, 1997.
[18] J. Misra. Powerlist: A structure for parallel recursion.
ACM Transactions on Programming Languages and Systems, 16(6):1737–1767, November 1994.
[19] F. Nielson, H. R. Nielson, and H. Seidl. Normalizable Horn clauses, strongly recognizable relations and Spi. In Proc. 9th Static Analysis Symposium (SAS), volume 2477 of LNCS, pages 20–35, 2002.
[20] R. Nieuwenhuis and A. Rubio. Paramodulation-based theorem proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, chapter 7. North Holland, 2001.
[21] K. M. Schimpf and J. Gallier. Tree pushdown automata. Journal of Computer and System Sciences, 30(1):25–40, 1985.

Appendix: Two-way tree automata with structural equality constraints are as expressive as standard tree automata.

In this section, we complete the proof of Lemma 3.7. We show actually a more general result: we consider two-way alternating tree automata with some regular constraints and show that the language they recognize is also accepted by a standard tree automaton. This generalizes the proof for two-way alternating tree automata (see e.g. [8] chapter 7) and the proof for two-way automata with equality tests [7], which itself relies on a transformation from two-way automata to one-way automata [6].

Two-way automata are, as usual, automata that can move up and down, and alternation consists (as usual) in spawning two copies of the tree in different states, requiring acceptance of both copies. In the logical formalism, alternation simply corresponds to clauses $q_1(x), q_2(x) \to q(x)$, requiring to accept $x$ both in state $q_1$ and in state $q_2$ if one wants to accept $x$ in state $q$.

For simplicity, we assume that all function symbols have arity 0 or 2. Lexical conventions:
• $f, g, h, \ldots$ range over symbols of arity 2. Unless explicitly stated, they may denote identical symbols.
• $a, b, c, \ldots$
range over constants,
• $x, x_1, \ldots, x_i, \ldots, y, \ldots, y_i, z, \ldots, z_i, \ldots$ are (universally quantified) first-order variables,
• $S, S_1, S_2, \ldots, S_i, \ldots$ range over state symbols for a fixed given tree automaton,
• $Q, Q_1, Q_2, \ldots$ range over state symbols of the tree automaton with memory,
• $R, R_1, R_2, \ldots$ range over state symbols of the binary recognizable relations.

We assume that the $R_i$ are recognizable relations defined by clauses of the form:
(A) $\Rightarrow R(a, b)$
(B) $S_1(x), S_2(y) \Rightarrow R(f(x, y), a)$
(C) $S_1(x), S_2(y) \Rightarrow R(a, f(x, y))$
(D) $R_1(x_1, x_2), R_2(y_1, y_2) \Rightarrow R_3(f(x_1, y_1), g(x_2, y_2))$
(E) $S_1(x), S_2(y) \Rightarrow S(f(x, y))$
(F) $\Rightarrow S(a)$

We assume wlog that there is a state $S_\top$ in which all trees are accepted (a "trash state"). Moreover, we will need in what follows an additional property of the $R_i$'s:
$\forall i, j, \exists k, l, \quad R_i(x, y) \wedge R_j(y, z) \mathrel{|\!\!=\!\!|} R_k(x, y) \wedge R_l(x, z)$
This property is satisfied by the structural equivalence, for which there is only one index $i$: $R_i = {\equiv}$, and we have indeed
$x \equiv y \wedge y \equiv z \mathrel{|\!\!=\!\!|} x \equiv y \wedge x \equiv z$
It is also satisfied by the universal binary relation and by the equality relation. That is why this generalizes corresponding results of [8, 7].

Our automata are defined by a finite set of clauses of the form:
(1) $Q_1(y_1), Q_2(y_2), R(y_1, y_2) \Rightarrow Q_3(y_1)$
(2) $Q_1(y_1), Q_2(y_2) \Rightarrow Q_3(f(y_1, y_2))$
(2b) $\Rightarrow Q_1(a)$
(3) $Q_1(f(y_1, y_2)), Q_2(y_3) \Rightarrow Q_3(y_1)$
(4) $Q_1(f(y_1, y_2)), Q_2(y_3) \Rightarrow Q_3(y_2)$

These clauses have a least Herbrand model. We write $[\![Q]\!]$ for the interpretation of $Q$ in this model. This is the language recognized by the automaton in state $Q$.
The goal is to prove that, for every $Q$, $[\![Q]\!]$ is recognized by a finite tree automaton. We use a selection strategy, with splitting, and complete the rules (1)-(4) above. We show that the completion terminates and that we get out of it a tree automaton which accepts exactly the memory contents. Splitting will introduce nullary predicate symbols (propositional variables).

We consider the following selection strategy. Let $E_1$ be the set of literals which contain at least one function symbol and $E_2$ be the set of negative literals.
(1) If the clause contains a negative literal $\neg R(u, v)$ or a negative literal $\neg S(u)$ where either $u$ or $v$ is not a variable, then select such literals only. This case is ruled out in what follows.
(2) If the clause contains at least one negated propositional variable, select the negated propositional variables only. This case is ruled out in what follows.
(3) If $E_1 \cap E_2 \neq \emptyset$, then select $E_1 \cap E_2$.
(4) If $E_1 \neq \emptyset$ and $E_1 \cap E_2 = \emptyset$, then select $E_1$.
(5) If $E_1 = \emptyset$ and $E_2 \neq \emptyset$, then select the negative literals $\neg R(x, y)$ and $\neg S(x)$ if any, otherwise select $E_2$.
(6) Otherwise, select the only literal of the clause.
In what follows (and precedes), selected literals are underlined.

We introduce the procedure by starting to run the completion with the selection strategy, before showing the general form of the clauses we get. First, clauses of the form (3), (4) are replaced (using splitting) with clauses of the form

$$\begin{array}{ll}
(3) & Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_1) \\
(4) & Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_2) \\
(s_1) & Q_2(x) \Rightarrow \mathit{NE}_{Q_2}
\end{array}$$

Overlapping $(s_1)$ and (2, 2b) may yield clauses of the form

$$\begin{array}{ll}
(s_2) & \mathit{NE}_{Q_1}, \mathit{NE}_{Q_2} \Rightarrow \mathit{NE}_{Q_3} \\
(s_3) & \Rightarrow \mathit{NE}_{Q}
\end{array}$$

together with new clauses of the form $(s_1)$.
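The six cases of the selection strategy can be sketched in Python as a function from a clause to its selected literals. The encoding is an assumption made for this example: a literal is a `Literal` tuple, variables are strings, every other term (including a constant, written `("a",)`) is a tuple and counts as containing a function symbol, and the kind of a predicate is read off the first letter of its name (`R`, `S`, `Q`, or `NE...` for propositional variables).

```python
from typing import NamedTuple

class Literal(NamedTuple):
    positive: bool    # False for a literal on the left of "⇒"
    pred: str         # e.g. "Q1", "S2", "R1", "NE_Q2"
    args: tuple = ()  # variables are strings, other terms are tuples

def has_function_symbol(lit):
    """True if some argument is not a variable (assumed reading of E1)."""
    return any(not isinstance(t, str) for t in lit.args)

def select(clause):
    """Return the selected literals of a clause, following cases (1)-(6)."""
    negs = [l for l in clause if not l.positive]
    # (1) negative R- or S-literals with a non-variable argument
    case1 = [l for l in negs if l.pred[0] in "RS" and has_function_symbol(l)]
    if case1:
        return case1
    # (2) negated propositional variables (literals without arguments)
    case2 = [l for l in negs if not l.args]
    if case2:
        return case2
    e1 = [l for l in clause if has_function_symbol(l)]
    e1_and_e2 = [l for l in e1 if not l.positive]
    if e1_and_e2:   # (3)
        return e1_and_e2
    if e1:          # (4)
        return e1
    if negs:        # (5)
        rs = [l for l in negs if l.pred[0] in "RS"]
        return rs or negs
    return list(clause)  # (6)

# Clause (2): Q1(y1), Q2(y2) ⇒ Q3(f(y1, y2))
clause2 = [Literal(False, "Q1", ("y1",)), Literal(False, "Q2", ("y2",)),
           Literal(True, "Q3", (("f", "y1", "y2"),))]
# Clause (3): Q1(f(y1, y2)), Q2(y3) ⇒ Q3(y1)
clause3 = [Literal(False, "Q1", (("f", "y1", "y2"),)),
           Literal(False, "Q2", ("y3",)), Literal(True, "Q3", ("y1",))]
print(select(clause2))  # the head Q3(f(y1, y2)), by case (4)
print(select(clause3))  # the negative literal Q1(f(y1, y2)), by case (3)
```

Under this encoding the head of a push clause and the popped literal of a pop clause are selected, which is what makes the overlaps between clauses (2) and (3)-(4) possible.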
Eventually, we may reach, using $(s_3)$ and (3-4), clauses:

$$\begin{array}{ll}
(3b) & Q_1(f(y_1, y_2)) \Rightarrow Q_3(y_1) \\
(4b) & Q_1(f(y_1, y_2)) \Rightarrow Q_3(y_2)
\end{array}$$

(1) + (2) yields clauses of the form

$$\begin{array}{ll}
(5.1) & Q_1(y_1), Q_2(y_2), Q_3(g(y_3, y_4)), R_1(y_1, y_3), R_2(y_2, y_4) \Rightarrow Q_4(f(y_1, y_2)) \\
(5.2) & Q_1(y_1), Q_2(y_2), Q_3(a), S_1(y_1), S_2(y_2) \Rightarrow Q_4(f(y_1, y_2)) \\
(5.3) & Q_1(a) \Rightarrow Q_2(b) \\
(5.4) & S_1(y_1), S_2(y_2), Q_1(f(y_1, y_2)) \Rightarrow Q_2(a)
\end{array}$$

(2) + (3b) and (2) + (4b) yield clauses of the form (after splitting):

$$(6)\quad \mathit{NE}_{Q_3}, Q_1(y_1) \Rightarrow Q_2(y_1)$$

and eventually

$$(6b)\quad Q_1(y_1) \Rightarrow Q_2(y_1)$$

(5.1) + (2) yields

$$(7.1)\quad Q_1(y_1), Q_2(y_2), Q_3(y_3), Q_4(y_4), R_1(y_1, y_3), R_2(y_2, y_4) \Rightarrow Q_5(f(y_1, y_2))$$

We split (7.1): we introduce new predicate symbols $Q_i^{R_j}$ defined by

$$Q_i(y), R_j(x, y) \Rightarrow Q_i^{R_j}(x)$$

Then clauses (7.1) become:

$$(7.1)\quad Q_1(y_1), Q_2(y_2), Q_3^{R_1}(y_1), Q_4^{R_2}(y_2) \Rightarrow Q_5(f(y_1, y_2))$$

(5.2) + (2b) yields clauses of the form

$$(7.2)\quad Q_1(y_1), Q_2(y_2), S_1(y_1), S_2(y_2) \Rightarrow Q_3(f(y_1, y_2))$$

(6b) + (2) yields new clauses of the form (2). (7.1) + (5.1) yields clauses of the form:

$$(8.1)\quad Q_1(y_1), Q_2(y_2), Q_3^{R_3}(y_1), Q_4^{R_4}(y_2), Q_5(y_3), Q_6(y_4), R_1(y_3, y_1), R_2(y_4, y_2) \Rightarrow Q_7(f(y_3, y_4))$$

At this point, we use the property of $R$ and split the clause:

$$\exists y_1.\ Q_1(y_1) \wedge Q_3^{R_3}(y_1) \wedge R_1(y_3, y_1) \mathrel{|\!=\!|} Q_1^{R_4}(y_1) \wedge Q_3^{R_5}(y_1)$$

Hence clauses (8.1) can be rewritten into clauses of the form:

$$(8.1)\quad Q_1^{R_1}(y_1), Q_3^{R_3}(y_1), Q_5(y_1), Q_2^{R_2}(y_2), Q_4^{R_4}(y_2), Q_6(y_2) \Rightarrow Q_7(f(y_1, y_2))$$

Finally, we let $\mathcal{Q}$ be the set of predicate symbols consisting of
• Symbols $S_i$
• Symbols $Q_i$
• Symbols $Q_i^{R_j}$

For every subset $S$ of $\mathcal{Q}$, we introduce a propositional variable $\mathit{NE}_S$. Clauses are split, introducing new propositional variables (or predicate symbols $Q_i^{R_j}$), in such a way that in all clauses except split clauses, the variables occurring on the left also occur on the right of the clause. In split clauses, there is only one variable occurring on the left and not on the right.

We let $\mathcal{C}$ be the set of clauses obtained by repeated applications of resolution with splitting, with the above selection strategy (a priori $\mathcal{C}$ could be infinite). We claim that all generated clauses are of one of the following forms (where the $P_i$'s and the $P'_i$'s belong to $\mathcal{Q}$; the $Q$ states might actually be of the form $Q_i^{R_j}$).

1. Pop clauses (the original clauses, which are not subsumed by the new clauses):

$$\begin{array}{ll}
(3) & Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_1) \\
(4) & Q_1(f(y_1, y_2)), \mathit{NE}_{Q_2} \Rightarrow Q_3(y_2) \\
(3b) & Q_1(f(y_1, y_2)) \Rightarrow Q_2(y_1) \\
(4b) & Q_1(f(y_1, y_2)) \Rightarrow Q_2(y_2)
\end{array}$$

Note that clause (1) is a particular case of the alternating clauses below, since it can be written $Q_1(y_1), Q_2^{R}(y_1) \Rightarrow Q_3(y_1)$.

2. Push clauses.

$$\begin{array}{ll}
(P_1) & P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y) \Rightarrow Q(f(x, y)) \\
(P_2) & \Rightarrow P(a) \\
(P_3) & \mathit{NE}_S, P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y) \Rightarrow Q(f(x, y)) \\
(P_4) & \mathit{NE}_S \Rightarrow Q(a)
\end{array}$$

3. Intermediate clauses.

$$\begin{array}{ll}
(I_1) & P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y), P''_1(f(x, y)), \ldots, P''_k(f(x, y)) \Rightarrow Q(f(x, y)) \\
(I_2) & P_1(a), \ldots, P_n(a) \Rightarrow Q(a) \\
(I_3) & S_1(x_1), S_2(x_2), Q_1(a) \Rightarrow Q_2(g(x_1, x_2)) \\
(I_4) & Q_1(a) \Rightarrow Q_2(b)
\end{array}$$

4. Alternating clauses.

$$\begin{array}{ll}
(A_1) & \mathit{NE}_S, P_1(x), \ldots, P_n(x) \Rightarrow Q(x) \\
(A_2) & P_1(x), \ldots, P_n(x) \Rightarrow Q(x)
\end{array}$$

In addition, we have clauses obtained by splitting:

5. Split clauses.

$$\begin{array}{ll}
(S_1) & R_j(x, y), Q_i(y) \Rightarrow Q_i^{R_j}(x) \\
(S_1b) & R_j(y, x), Q_i(y) \Rightarrow Q_i^{-R_j}(x) \\
(S_2) & R_1(x_1, y_1), R_2(x_2, y_2), Q_i(f(y_1, y_2)) \Rightarrow Q_i^{\pm R_j}(g(x_1, x_2)) \\
(S_3) & S_1(x), S_2(y), Q_i(f(x, y)) \Rightarrow Q_i^{\pm R_j}(a) \\
(S_4) & P_1(x), \ldots, P_n(x) \Rightarrow \mathit{NE}_{\{P_1, \ldots, P_n\}} \\
(S_5) & P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y), P''_1(f(x, y)), \ldots, P''_k(f(x, y)) \Rightarrow \mathit{NE}_S
\end{array}$$

6. Propositional clauses.

$$\begin{array}{ll}
(E_1) & \mathit{NE}_{S_1}, \ldots, \mathit{NE}_{S_n} \Rightarrow \mathit{NE}_S \\
(E_2) & \Rightarrow \mathit{NE}_S \\
(E_3) & P_1(a), \ldots, P_n(a) \Rightarrow \mathit{NE}_S
\end{array}$$

Every resolution step, using the selection strategy, between two of the above clauses yields a clause in the above set:
POP + PUSH: yields an alternating clause $(A_1)$ and a split clause $(S_4)$.
INT + PUSH: yields a push clause or an intermediate clause.
ALTERNATING + PUSH: yields an intermediate clause $(I_1)$ or $(I_2)$.
SPLIT + R: yields a split clause $(S_2)$ or $(S_3)$ or an intermediate clause $(I_3)$ or $(I_4)$.
$(S_2)$ + PUSH: yields clauses $(S_1)$ and push clauses. Note that here, we use the property of the relation $R$ to split clauses, which may involve predicates $Q_i^{R_j}$.
$(S_3)$ + PUSH: yields push clauses and split clauses $(S_4)$.
$(S_4)$ + PUSH: yields split clauses $(S_5)$ or propositional clauses $(E_3)$.
$(S_5)$ + PUSH: yields split clauses $(S_5)$ or propositional clauses $(E_1)$.

It follows that all clauses of $\mathcal{C}$ are of the above form. Since there are only finitely many such clauses, $\mathcal{C}$ is finite and computed in finite (exponential) time.
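To make one of the overlaps listed above concrete, the following Python sketch performs the POP + PUSH step for pop clauses of the forms (3b) and (4b): resolving the pop clause with a push clause of form $(P_1)$ leaves one variable that occurs only on the left, and splitting on it produces an alternating clause $(A_1)$ together with a split clause $(S_4)$. The record types `Push`, `Pop`, `Alternating`, `Split` and the function `pop_push` are hypothetical names introduced for this illustration; bodies are represented as sets of predicate names.

```python
from typing import FrozenSet, NamedTuple

class Push(NamedTuple):
    """P1(x),...,Pn(x), P'1(y),...,P'm(y) ⇒ head(symbol(x, y))  (form (P1))."""
    left: FrozenSet[str]
    right: FrozenSet[str]
    symbol: str
    head: str

class Pop(NamedTuple):
    """source(f(y1, y2)) ⇒ head(y_side), side in {1, 2}  (forms (3b), (4b))."""
    source: str
    side: int
    head: str

class Alternating(NamedTuple):
    """NE_guard, P1(x),...,Pn(x) ⇒ head(x)  (form (A1))."""
    guard: FrozenSet[str]
    body: FrozenSet[str]
    head: str

class Split(NamedTuple):
    """P1(x),...,Pn(x) ⇒ NE_{P1,...,Pn}  (form (S4))."""
    body: FrozenSet[str]

def pop_push(pop, push):
    """Resolve a pop clause with a push clause, then split the resolvent."""
    if pop.source != push.head:
        return None  # the selected literals do not unify
    if pop.side == 1:
        kept, dropped = push.left, push.right
    else:
        kept, dropped = push.right, push.left
    # The variable of the dropped side no longer occurs on the right of "⇒",
    # so its literals are replaced by the propositional variable NE_dropped.
    return Alternating(dropped, kept, pop.head), Split(dropped)

push = Push(frozenset({"P1", "P2"}), frozenset({"P3"}), "f", "Q1")
alt, split = pop_push(Pop("Q1", 1, "Q3"), push)
print(alt)    # NE_{P3}, P1(x), P2(x) ⇒ Q3(x)
print(split)  # P3(y) ⇒ NE_{P3}
```

The sketch ignores the function symbol of the pop clause (it assumes the symbols match) and does not cover pop clauses (3) and (4), whose $\mathit{NE}$ literal is selected first by the strategy.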
Now, we let $\mathcal{A}$ be the alternating tree automaton defined by clauses $(P_1)$ and $(P_2)$ (and the automata clauses defining the $S$ states). For any state $Q$, let $[\![Q]\!]_{\mathcal{A}}$ be the language accepted in state $Q$ by $\mathcal{A}$. We claim that $[\![Q]\!] = [\![Q]\!]_{\mathcal{A}}$.

To prove this, we first show (the proof is omitted here) that $\mathit{NE}_{\{P_1, \ldots, P_n\}}$ is in $\mathcal{C}$ iff $[\![P_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P_n]\!]_{\mathcal{A}} \neq \emptyset$. Then observe that $[\![Q]\!]$ is also the interpretation of $Q$ in the least Herbrand model of $\mathcal{C}$: indeed, all computations yielding $\mathcal{C}$ are correct. Since $[\![Q]\!]_{\mathcal{A}} \subseteq [\![Q]\!]$ is trivial, we only have to prove the converse inclusion.

For every $t \in [\![Q]\!]$ there is a proof of $Q(t)$ using the clauses in $\mathcal{C}$. Assume, by contradiction, that there is a term $t$ and a predicate symbol $Q$ such that all proofs of $Q(t)$ using the clauses in $\mathcal{C}$ involve at least one clause which is not an automaton clause. Then, considering an appropriate sub-proof, there is a term $u$ and a predicate symbol $P$ such that all proofs of $P(u)$ involve at least one non-automaton clause and there is a proof of $P(u)$ which uses exactly one non-automaton clause, at the last step of the proof. We investigate all possible cases for the last clause used in the proof of $P(u)$ and derive a contradiction in each case.

Clause $(I_1)$: The last step of the proof is

$$\frac{P_1(u_1), \ldots, P_n(u_1), P'_1(u_2), \ldots, P'_m(u_2), P''_1(f(u_1, u_2)), \ldots, P''_k(f(u_1, u_2))}{P(f(u_1, u_2))}$$

and we assume $u = f(u_1, u_2)$. Assume also that, among the proofs we consider, $k$ is minimal. (If $k = 0$ then we have a push clause, which is supposed not to be the case.) By hypothesis, for all $i$, $u_1 \in [\![P_i]\!]_{\mathcal{A}}$, $u_2 \in [\![P'_i]\!]_{\mathcal{A}}$ and $f(u_1, u_2) \in [\![P''_i]\!]_{\mathcal{A}}$. In particular, the last clause used in the proof of $P''_k(u)$,

$$Q_1(x), \ldots, Q_r(x), Q'_1(y), \ldots, Q'_s(y) \Rightarrow P''_k(f(x, y))$$

belongs to $\mathcal{C}$. Then, overlapping this clause with the above clause $(I_1)$, the following clause also belongs to $\mathcal{C}$:

$$P_1(x), \ldots, P_n(x), Q_1(x), \ldots, Q_r(x), P'_1(y), \ldots, P'_m(y), Q'_1(y), \ldots, Q'_s(y), P''_1(f(x, y)), \ldots, P''_{k-1}(f(x, y)) \Rightarrow P(f(x, y))$$

and therefore we have another proof of $P(u)$:

$$\frac{\begin{array}{c} P_1(u_1), \ldots, P_n(u_1), Q_1(u_1), \ldots, Q_r(u_1) \\ P'_1(u_2), \ldots, P'_m(u_2), Q'_1(u_2), \ldots, Q'_s(u_2), P''_1(f(u_1, u_2)), \ldots, P''_{k-1}(f(u_1, u_2)) \end{array}}{P(f(u_1, u_2))}$$

which contradicts the minimality of $k$.

Clause $(A_1)$: The last step of the proof is

$$\frac{P_1(u), \ldots, P_n(u)}{P(u)}$$

By hypothesis, the proofs of $P_i(u)$ only use automata clauses: $\forall i.\ u \in [\![P_i]\!]_{\mathcal{A}}$. Let the push rule

$$Q_1(x), \ldots, Q_m(x), Q'_1(y), \ldots, Q'_p(y) \Rightarrow P_n(f(x, y))$$

be the last clause used in the proof of $P_n(u)$. Overlapping this clause and the clause $(A_1)$ above, there is another clause in $\mathcal{C}$ yielding a proof of $P(u)$:

$$Q_1(x), \ldots, Q_m(x), Q'_1(y), \ldots, Q'_p(y), P_1(f(x, y)), \ldots, P_{n-1}(f(x, y)) \Rightarrow P(f(x, y))$$

And we are back to the case of $(I_1)$.

Clause (3b): The last step of the proof is

$$\frac{Q_1(f(u, t))}{P(u)}$$

By hypothesis $f(u, t) \in [\![Q_1]\!]_{\mathcal{A}}$. Hence there is a push clause

$$P_1(x), \ldots, P_n(x), P'_1(y), \ldots, P'_m(y) \Rightarrow Q_1(f(x, y))$$

such that $u \in [\![P_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P_n]\!]_{\mathcal{A}}$ and $t \in [\![P'_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P'_m]\!]_{\mathcal{A}}$. By resolution on the clause (3b), there is also in $\mathcal{C}$ a clause

$$P_1(x), \ldots, P_n(x), \mathit{NE}_{\{P'_1, \ldots, P'_m\}} \Rightarrow P(x)$$

However, since $[\![P'_1]\!]_{\mathcal{A}} \cap \ldots \cap [\![P'_m]\!]_{\mathcal{A}} \neq \emptyset$, $\mathit{NE}_{\{P'_1, \ldots, P'_m\}}$ is also in $\mathcal{C}$ and, by resolution again, $P_1(x), \ldots, P_n(x) \Rightarrow P(x)$ is a clause of $\mathcal{C}$.
Then we are back to the case of $(A_1)$.

Clause (3): The last step of the proof is

$$\frac{Q_1(f(u, t)) \quad \mathit{NE}_{Q_2}}{P(u)}$$

Since $\mathit{NE}_{Q_2} \in \mathcal{C}$ in this case, by saturation of $\mathcal{C}$, there is a clause $Q_1(f(x, y)) \Rightarrow P(x)$ in $\mathcal{C}$, and we are back to the case of (3b).

Other cases: they are quite similar to the previous ones. Let us only consider the case of clause $(S_2)$, which is slightly more complicated. The last step of the proof is

$$\frac{R_1(u_1, v_1) \quad R_2(u_2, v_2) \quad Q_i(f(v_1, v_2))}{Q_i^{R_j}(g(u_1, u_2))}$$

Assume moreover that $u = g(u_1, u_2)$ is a term of minimal size such that, for some $Q_i$, $R_j$, $Q_i^{R_j}(u)$ is provable using as a last step an inference $(S_2)$, and is not provable by automata clauses only. As before, we consider the overlap between $(S_2)$ and a push clause. We get

$$R_1(x_1, y_1), R_2(x_2, y_2), P_1(y_1), \ldots, P_n(y_1), P'_1(y_2), \ldots, P'_m(y_2) \Rightarrow Q_i^{R_j}(g(x_1, x_2))$$

Hence, the following clauses belong to $\mathcal{C}$ (when $P_i$, $P'_i$ are not themselves predicates $Q^R$; otherwise, we have to use the property on the $R$ relations and split in another way, using the $S_\top$ predicate, as shown later):

$$\begin{array}{l}
R_1(x_1, y_1), P_i(y_1) \Rightarrow P_i^{R_1}(x_1) \\
R_2(x_2, y_2), P'_i(y_2) \Rightarrow {P'_i}^{R_2}(x_2) \\
P_1^{R_1}(x_1), \ldots, P_n^{R_1}(x_1), {P'_1}^{R_2}(x_2), \ldots, {P'_m}^{R_2}(x_2) \Rightarrow Q_i^{R_j}(g(x_1, x_2))
\end{array}$$

and we have the following proof of $Q_i^{R_j}(g(u_1, u_2))$:

$$\dfrac{\dfrac{R_1(u_1, v_1) \quad P_1(v_1)}{P_1^{R_1}(u_1)} \;\cdots\; \dfrac{R_1(u_1, v_n) \quad P_n(v_n)}{P_n^{R_1}(u_1)} \quad \dfrac{R_2(u_2, w_1) \quad P'_1(w_1)}{{P'_1}^{R_2}(u_2)} \;\cdots\; \dfrac{R_2(u_2, w_m) \quad P'_m(w_m)}{{P'_m}^{R_2}(u_2)}}{Q_i^{R_j}(g(u_1, u_2))}$$

Now, by overlapping again $R_1(x_1, y_1)$ and $R_2(x_2, y_2)$ with their defining clauses, we compute "shortcut clauses" belonging to $\mathcal{C}$ and get another proof (for instance assuming $v_1 = f(v_{11}, v_{12})$ and $u_1 = h(u_{11}, u_{12})$):

$$\dfrac{\dfrac{R_{11}(u_{11}, v_{11}) \quad R_{12}(u_{12}, v_{12}) \quad P_1(f(v_{11}, v_{12}))}{P_1^{R_1}(u_1)} \;\cdots\; \dfrac{R_2(u_2, w_1) \quad P'_1(w_1)}{{P'_1}^{R_2}(u_2)} \;\cdots\; \dfrac{R_2(u_2, w_m) \quad P'_m(w_m)}{{P'_m}^{R_2}(u_2)}}{Q_i^{R_j}(g(u_1, u_2))}$$

By minimality of $u$, $u_1 \in [\![P_1^{R_1}]\!]_{\mathcal{A}}$. Similarly, for every $i$, $u_1 \in [\![P_i^{R_1}]\!]_{\mathcal{A}}$ and $u_2 \in [\![{P'_i}^{R_2}]\!]_{\mathcal{A}}$, and it follows that $g(u_1, u_2) \in [\![Q_i^{R_j}]\!]_{\mathcal{A}}$.

Finally, let us consider the case where some $P_i$ is itself a predicate symbol $Q^R$, in which case we do not have a predicate $(Q^R)^{R_1}$. We then use the assumed property of the predicates $R_i$: $R_1(x, y) \wedge R(y, z) \mathrel{|\!=\!|} R'_1(x, y) \wedge R'(x, z)$, hence

$$(\exists u, \exists v.\ R_1(x, u) \wedge R(u, v) \wedge Q(v)) \mathrel{|\!=\!|} (\exists u.\ R_1(x, u) \wedge S_\top(u)) \wedge (\exists v.\ R(x, v) \wedge Q(v))$$

Hence we need two split clauses instead of one:

$$\begin{array}{l}
R'_1(x, y) \Rightarrow S_\top^{R'_1}(x) \\
R'(x, y), Q(y) \Rightarrow Q^{R'}(x)
\end{array}$$

And $R_1(x_1, y_1), Q^R(y_1)$ is replaced with $S_\top^{R'_1}(x_1), Q^{R'}(x_1)$. Note that such a transformation is not necessary when there is a single transitive binary relation, as in our application: then $R(x, y) \wedge Q^R(y)$ is simply replaced with $Q^R(x)$.

To sum up: if there is a proof of $P(u)$ using clauses of $\mathcal{C}$, then, by saturation of the clauses of $\mathcal{C}$ w.r.t. overlaps with push clauses, we can rewrite the proof into a proof using push clauses only: $u \in [\![P]\!]_{\mathcal{A}}$. This proves that $[\![P]\!] = [\![P]\!]_{\mathcal{A}}$.

Finally, it is easy (and well known) to compute a standard bottom-up automaton accepting the same language as an alternating automaton; this only requires a subset construction. That is why the language accepted by our two-way automata with structural equality constraints is actually a recognizable language. The overall size of the resulting automaton (and its computation time) is simply exponential, but we know that, already for alternating automata, we cannot do better.

This work is licensed under the Creative Commons Attribution-NoDerivs License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
