Ensuring Query Compatibility with Evolving XML Schemas

Research report (Rapport de recherche) No. 6711, November 2008, 24 pages
ISSN 0249-6399, ISRN INRIA/RR--6711--FR+ENG
Institut National de Recherche en Informatique et en Automatique (INRIA)
Centre de recherche INRIA Grenoble – Rhône-Alpes, 655, avenue de l'Europe, 38334 Montbonnot Saint Ismier
Telephone: +33 4 76 61 52 00, Fax: +33 4 76 61 52 52

Pierre Genevès (CNRS), Nabil Layaïda, Vincent Quint
Theme SYM (Symbolic Systems), Project-Team Wam

Abstract: During the life cycle of an XML application, both schemas and queries may change from one version to another. Schema evolutions may affect query results and potentially the validity of produced data. Nowadays, a challenge is to assess and accommodate the impact of these changes in rapidly evolving XML applications.

This article proposes a logical framework and tool for verifying forward/backward compatibility issues involving schemas and queries. First, it allows analyzing relations between schemas. Second, it allows XML designers to identify queries that must be reformulated in order to produce the expected results across successive schema versions. Third, it allows examining more precisely the impact of schema changes over queries, therefore facilitating their reformulation.

Key-words: XML, Schema, Queries, XPath, Evolution, Compatibility, Analysis

Résumé (translated from the French): During the life cycle of an XML application, both schemas and queries are led to evolve from one version to another. Schema evolutions may affect query results and potentially the validity of the produced data.
Nowadays, a real challenge is to assess and take into account the impact of these changes in rapidly evolving XML applications. This article proposes a logical framework and a tool for verifying the forward and backward compatibility of schemas and queries. First, it allows analyzing the relations between schemas. Second, it allows the XML designer to identify the queries that must be reformulated in order to produce the expected results across successive schema versions. Finally, it allows examining more precisely the impact of schema changes on queries, thereby facilitating their formulation.

Key-words (Mots-clés): XML, Schema, Queries, XPath, Evolution, Compatibility, Analysis

1 Introduction

XML is now commonplace on the web and in many information systems where it is used for representing all kinds of information resources, ranging from simple text documents such as RSS or Atom feeds to highly structured databases. In these dynamic environments, not only are data changing steadily, but their schemas also get modified to cope with the evolution of the real-world entities they describe.

Schema changes raise the issue of data consistency. Existing documents and data that were valid with a certain version of a schema may become invalid on a new version of the schema (forward incompatibility). Conversely, new documents created with the latest version of a schema may be invalid on some previous versions (backward incompatibility).

In addition, schemas may be written in different languages, such as DTD, XML Schema, or Relax-NG, to name only the most popular ones. It is also common practice to describe the same structure, or new versions of a structure, in different schema languages.
Document formats developed by W3C provide a variety of examples: XHTML 1.0 has both DTDs and XML Schemas, while XHTML 2.0 has a Relax-NG definition; the schema for SVG Tiny 1.1 is a DTD, while version 1.2 is written in Relax-NG; MathML 1.01 has a DTD, MathML 2.0 has both a DTD and an XML Schema, and MathML 3.0 is developed with a Relax-NG schema and is expected to also have a DTD and an XML Schema.

An issue then is to make sure that schemas written in different languages are equivalent, i.e. they describe the same structure, possibly with some differences due to the expressivity of the language [14]. Another issue is to clearly identify the differences between two versions of the same schema expressed in different languages. Moreover, the issues of forward and backward compatibility of instances obviously remain when schema languages change from one version to another.

Validation, and then compatibility, is not the only purpose of a schema. Validation is usually the first step for safe processing of documents and data. It makes sure that documents and data are structured as expected and can then be processed safely. The next step is to actually access and select the various parts to be handled in each phase of an application. For this, query languages play a key role. As an example, when transforming a document with XSL, XPath queries are paramount to locate in the original document the data to be produced in the transformed document.

Queries are affected by schema evolutions. The structures they return may change depending on the version of the schema used by a document. When the schema changes, a query may return nothing, or something different from what was expected, and obviously further processing based on this query is at risk.

These observations highlight the need for evaluating precisely and safely the impact of schema evolutions on existing and future instances of documents and data.
They also show that it is important for software engineers to know precisely what parts of a processing chain have to be updated when schemas change. In this paper we focus on the XPath query language, which is used in many situations while processing XML documents and data. The XSL transformation language was already mentioned, but XPath is also present in XLink and XQuery, for instance.

Related Work

Schema evolution is an important topic and has been extensively explored in the context of relational, object-oriented, and XML databases. Most of the previous work on XML query reformulation is approached through reductions to relational problems [4]. This is because schema evolution was considered as a storage problem where the priority consists in ensuring data consistency across multiple relational schema versions. In such settings, two distinct schemas and an explicit description of the mapping between them are assumed as input. The problem then consists in reformulating a query expressed in terms of one schema into a semantically equivalent query in terms of the other schema: see [6, 18] and more recently [12] and the references therein.

In addition to the fundamental differences between XML and the relational data model, in the more general case of XML processing, schemas constantly evolve in a distributed, independent, and unpredictable environment. The relations between different schemas are not only unknown but hard to track. In this context, one priority is to help maintain query consistency during these evolutions, which is still considered a challenging problem [16].

The work found in [13] discusses the impact of evolving XML schemas on query reformulation.
Based on a taxonomy of XML schema changes during their evolution, the authors provide informal (neither exact nor systematic) guidelines for writing queries which are less sensitive to schema evolution. In fact, studying query reformulation requires at least the ability to analyze the relationship between queries. For this reason, a closely related work is the problem of determining query containment and satisfiability under type constraints [1, 9]. The work found in [1] studies the complexity of XPath emptiness and containment for various fragments (see [2] and the references therein for a survey).

The main distinctive idea pursued in this paper is to develop a logical approach for guiding schema and query evolution. In contrast to the classical use of logics for proving properties such as query emptiness or equivalence [1, 9], the goal here is different in that we seek to provide the necessary tools to produce relevant knowledge when such relations do not hold.

Outline

The rest of this paper is organized as follows: the next section introduces our framework, Section 3 presents its underlying logic, and Section 4 presents predicates for characterizing the impact of schema changes. We report on experiments on realistic scenarios in Section 5 before we conclude in Section 6.

2 Analysis Framework

Our framework allows the automatic verification of properties related to XML schema and query evolution. In particular, it offers the possibility of checking fine-grained properties on the behavior of queries with respect to successive versions of a given schema. The system can be used for checking whether schema evolutions require a particular query to be updated. Whenever schema evolutions may induce query malfunctions, the system is able to generate annotated XML documents that exemplify bugs, with the goal of helping the programmer to understand and properly overcome undesired effects of schema evolutions.
[Figure 1: Framework Overview. An XML problem description (text file), for example select("a//b[ancestor::e]", type("XHTML1-strict.dtd", "html")), goes through parsing and compilation, which yields a logical formula over binary trees with attributes (for example let $X=e & <1>$X...). The satisfiability test then answers either "unsatisfiable" (property proved) or "satisfiable". In the latter case, a synthesis step produces a satisfying binary tree with attributes, which is converted from binary to n-ary form to obtain a sample XML document inducing a bug.]

For these purposes, our framework relies on the combination and joint use of several contributions:

• an extension of the logic introduced in [9] to deal with XML attributes (Sections 2 and 3);
• a set of logical features and high-level predicates specifically designed for studying and characterizing schema and query compatibility issues when schemas evolve (Section 4);
• a range of applications and procedures to cope with schema and query evolution (Section 5);
• a full implementation of the whole system, including:
  – a parser for reading the problem description (text file), which in turn uses specific parsers for schemas (Section 2.2), queries (Section 2.3), logical formulas (Section 3.2), and predicates (Section 4);
  – compilers for translating schemas and queries into their logical representations (Sections 3.3 and 3.4);
  – an optimized solver, first described in [9, 10], for checking satisfiability of logical formulas in time \(2^{O(n)}\) where n is the formula size;
  – and a counter-example XML tree generator (described in [10]).

Figure 1 illustrates how the previous software components are combined and used together, in a simplified overview of the global framework. We next introduce the data model we consider for XML documents, schemas and queries.

2.1 XML Trees with Attributes

An XML document is considered as a finite tree of unbounded depth and arity, with two kinds of nodes respectively named elements and attributes.
In such a tree, an element may have any number of children elements, and may carry zero, one or more attributes. Attributes are leaves. Elements are ordered whereas attributes are not, as illustrated on Figure 4. In this paper, we focus on the nested structure of elements and attributes, and ignore XML data values.

2.2 Type Constraints

As an internal representation for tree grammars, we consider regular tree type expressions (in the manner of [11]), extended with constraints over attributes. Assuming a set of variables ranged over by x, we define a tree type expression as follows:

    τ ::=                   tree type expression
          ∅                 empty set
          ()                empty sequence
          τ | τ             disjunction
          τ , τ             concatenation
          l(a)[τ]           element definition
          x                 variable
          let x.τ in τ      binder

We impose a usual restriction on the recursive use of variables: we allow unguarded (i.e. not enclosed by a label) recursive uses of variables, but restrict them to tail positions (see footnote 1). With that restriction, tree type expressions define regular tree languages. In addition, an element definition may involve simple attribute expressions that describe which attributes the defined element may (or may not) carry:

    a ::=                   attribute expression
          ()                empty list
          list | a          disjunction
    list ::=                attribute list
          list , list       commutative concatenation
          l?                optional attribute
          l                 required attribute
          ¬l                prohibited attribute

Our tree type expressions capture most of the schemas in use today [14, 3]. In practice, our system provides parsers that convert DTDs, XML Schemas, and Relax NG schemas to this internal tree type representation. Users may thus define constraints over XML documents with the language of their choice, and, more importantly, they may refer to most existing schemas for use with the system.

2.3 Queries

The set of XPath expressions we consider is given by the syntax shown on Figure 2.
The semantics of XPath expressions is described in [5], and more formally in [17]. We observed that, in practice, many XPath expressions contain syntactic sugars that can also fit into this fragment. Figure 3 presents how our XPath parser rewrites some commonly found XPath patterns into the fragment of Figure 2, where the notation (axis::nt)^k stands for the composition of k successive path steps of the same form: axis::nt/.../axis::nt (k steps).

    query ::=     /path                       absolute path
                  path                        relative path
                  query | query               union
                  query ∩ query               intersection
    path ::=      path/path                   path composition
                  path[qualifier]             qualified path
                  axis::nt                    step
    qualifier ::= qualifier and qualifier     conjunction
                  qualifier or qualifier      disjunction
                  not(qualifier)              negation
                  path                        path
                  path/@nt                    attribute path
                  @nt                         attribute step
    nt ::=                                    node test
                  σ                           node label
                  ∗                           any node label
    axis ::=                                  tree navigation axis
                  self | child | parent | descendant | ancestor
                  | descendant-or-self | ancestor-or-self
                  | following-sibling | preceding-sibling
                  | following | preceding

Figure 2: XPath Expressions.

    Pattern                                   Rewriting
    nt[position() = 1]                        nt[not(preceding-sibling::nt)]
    nt[position() = last()]                   nt[not(following-sibling::nt)]
    nt[position() = k]   (k > 1)              nt[(preceding-sibling::nt)^(k−1)]
    count(path) = 0                           not(path)
    count(path) > 0                           path
    count(nt) > k   (k > 0)                   nt/(following-sibling::nt)^k
    preceding-sibling::∗[position() = last() and qualifier]
                                              preceding-sibling::∗[not(preceding-sibling::∗) and qualifier]

Figure 3: Syntactic Sugars and their Rewritings.

Footnote 1: For instance, "let x.l(a)[τ], x | () in x" is allowed.

3 Logical Setting

3.1 Logical Data Model

It is well-known that there exist bijective encodings between unranked trees (trees of unbounded arity) and binary trees. Owing to these encodings, binary
trees may be used instead of unranked trees without loss of generality. In the sequel, we rely on a simple "first-child & next-sibling" encoding of unranked trees. In this encoding, the first child of an element node is preserved in the binary tree representation, whereas siblings of this node are appended as right successors in the binary representation. Attributes are left unchanged by this encoding. For instance, Figure 5 presents how the sample tree of Figure 4 is mapped.

[Figure 4: Sample XML Tree with Attributes, shown together with its XML notation; node labels: a, b, c, d, e, r, s, t, u, v, w, x.]

[Figure 5: Binary Encoding of the Tree of Figure 4; same node labels.]

The logic we introduce below, used as the core of our framework, operates on such binary trees with attributes.

3.2 Logical Formulas

The concrete syntax of logical formulas is shown on Figure 6, where the meta-syntax ⟨X⟩⁺ means one or more occurrences of X separated by commas. The reader can directly use this syntax for encoding formulas as text files to be used with the system described in Section 2 [8]. This concrete syntax is used as a single unifying notation throughout the paper.

The semantics of logical formulas corresponds to the classical semantics of a µ-calculus interpreted over finite tree structures. A formula is satisfiable iff there exists a finite binary tree with attributes for which the formula holds at some node. This is formally defined in [9], and we review it informally below through a series of examples.
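The "first-child & next-sibling" encoding of Section 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Element` and `BinNode`, the functions `encode` and `decode`, and the small sample document are hypothetical and are not part of the system described in this report (in particular, the sample document is not the tree of Figure 4).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Unranked XML element: a name, unordered attributes, ordered children."""
    name: str
    attrs: frozenset = frozenset()
    children: List["Element"] = field(default_factory=list)


@dataclass
class BinNode:
    """Binary node: `first` is the first child, `next` is the next sibling."""
    name: str
    attrs: frozenset = frozenset()
    first: Optional["BinNode"] = None
    next: Optional["BinNode"] = None


def encode(siblings: List[Element]) -> Optional[BinNode]:
    """Encode an ordered list of sibling elements as a single binary tree."""
    if not siblings:
        return None
    head, rest = siblings[0], siblings[1:]
    # Attributes are attached unchanged; children go left, siblings go right.
    return BinNode(head.name, head.attrs, encode(head.children), encode(rest))


def decode(node: Optional[BinNode]) -> List[Element]:
    """Inverse of encode: rebuild the ordered list of sibling elements."""
    if node is None:
        return []
    return [Element(node.name, node.attrs, decode(node.first))] + decode(node.next)


# Hypothetical sample: <r a="..."><s/><t><u/></t><v/></r>
doc = Element("r", frozenset({"a"}),
              [Element("s"), Element("t", children=[Element("u")]), Element("v")])
binary = encode([doc])
```

Decoding the result gives back the original document, which illustrates on one example why the encoding loses no information.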
    ϕ ::=                       formula
          T                     true
          F                     false
          l                     element name
          p                     atomic proposition
          #                     start context
          ϕ | ϕ                 disjunction
          ϕ & ϕ                 conjunction
          ϕ => ϕ                implication
          ϕ <=> ϕ               equivalence
          (ϕ)                   parenthesized formula
          ~ϕ                    negation
          <p>ϕ                  existential modality
          <l>T                  attribute named l
          $X                    variable
          let ⟨$X = ϕ⟩⁺ in ϕ    binder for recursion
          predicate             predicate (see Section 4)
    p ::=                       program inside modalities
          1                     first child
          2                     next sibling
          -1                    parent
          -2                    previous sibling

Figure 6: Syntax of Logical Formulas.

There is a difference between an element name and an atomic proposition (see footnote 2): an element has one and only one element name, whereas it can satisfy multiple atomic propositions. We use atomic propositions to attach specific information to tree nodes, not related to their XML labeling. For example, the start context (a reserved atomic proposition) is used to mark the starting context nodes for evaluating XPath expressions.

The logic uses programs for navigating in binary trees: the program 1 allows navigating from a node down to its first successor, and the program 2 from a node down to its second successor. The logic also features converse programs -1 and -2 for navigating upward in binary trees, respectively from the first successor to its parent and from the second successor to its previous sibling. Table 1 gives some simple formulas using modalities for navigating in binary trees, together with sample satisfying trees, in binary and unranked tree representations.

The logic allows expressing recursion in trees through the recursive binder. For example, the recursive formula:

    let $X = b | <2>$X in $X

means that either the current node is named b or there is a following sibling of the current node which is named b.
For this purpose, the variable $X is bound to the subformula b | <2>$X, which contains an occurrence of $X (therefore defining the recursion). The scope of this binding is the subformula that follows the "in" symbol of the formula, that is $X. The entire formula can thus be seen as a compact recursive notation for an infinitely nested formula of the form:

    b | <2>(b | <2>(b | <2>(...)))

Footnote 2: In practice, an atomic proposition must start with a "_".

    Sample formula          Binary tree                                   XML
    a & <1>b                a, with first child b                         <a><b/></a>
    a & <1>(b & <2>c)       a, with first child b; b has next sibling c   <a><b/><c/></a>
    e & <-1>(d & <2>g)      e, first child of d; d has next sibling g     <d><e/></d><g/>
    f & <-2>(g & ~<2>T)     none                                          none

Table 1: Sample Formulas and Satisfying Trees.

Recursion allows expressing global properties. For instance, the recursive formula:

    ~let $X = a | <1>$X | <2>$X in $X

expresses the absence of nodes named a in the whole subtree of the current node (including the current node). Furthermore, the fixpoint operator makes it possible to bind several variables at a time, which is specifically useful for expressing mutual recursion. For example, the mutually recursive formula:

    let $X = (a & <2>$Y) | <1>$X | <2>$X, $Y = b | <2>$Y in $X

asserts that there is a node somewhere in the subtree such that this node is named a and it has at least one following sibling which is named b. Binding several variables at a time provides a very expressive yet succinct notation for expressing mutually recursive structural patterns (that are common in XML Schemas, for instance).

From a theoretical perspective, the recursive binder let $X = ϕ in ϕ corresponds to the fixpoint operators of the µ-calculus. It is shown in [9] that the least fixpoint and the greatest fixpoint operators of the µ-calculus coincide over finite tree structures, for a restricted class of formulas called cycle-free formulas.
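To make the informal reading of modalities and recursive binders concrete, here is a minimal Python sketch that evaluates such formulas on one given binary tree. It is a plain model checker, not the satisfiability solver of [9, 10]. Everything in it is an assumption made for illustration: the tuple encoding of formulas, the names `BinaryTree` and `sat`, and the naive fixpoint iteration, which is only meaningful when variables do not occur under negation.

```python
from typing import Dict, Optional, Set, Tuple


class BinaryTree:
    """Finite binary tree stored as labelled nodes plus program edges."""

    def __init__(self) -> None:
        self.label: Dict[int, str] = {}
        self.edge: Dict[Tuple[int, int], int] = {}  # (node, program) -> node

    def add(self, label: str, source: Optional[int] = None, program: int = 1) -> int:
        """Add a node reached from `source` by program 1 (first child) or 2 (next sibling)."""
        node = len(self.label)
        self.label[node] = label
        if source is not None:
            self.edge[(source, program)] = node
            self.edge[(node, -program)] = source  # converse programs -1 and -2
        return node


def sat(tree: BinaryTree, f: tuple, env: Optional[Dict[str, Set[int]]] = None) -> Set[int]:
    """Return the set of nodes of `tree` at which formula `f` holds."""
    env = env or {}
    nodes = set(tree.label)
    op = f[0]
    if op == "T":
        return nodes
    if op == "name":
        return {n for n in nodes if tree.label[n] == f[1]}
    if op == "not":
        return nodes - sat(tree, f[1], env)
    if op == "or":
        return sat(tree, f[1], env) | sat(tree, f[2], env)
    if op == "and":
        return sat(tree, f[1], env) & sat(tree, f[2], env)
    if op == "dia":  # <p>f: the p-successor exists and satisfies f
        target = sat(tree, f[2], env)
        return {n for n in nodes if tree.edge.get((n, f[1])) in target}
    if op == "var":
        return env[f[1]]
    if op == "let":  # least fixpoint, computed by iterating from empty sets
        bindings, body = f[1], f[2]
        current = {**env, **{x: set() for x in bindings}}
        while True:
            updated = {**env, **{x: sat(tree, g, current) for x, g in bindings.items()}}
            if updated == current:
                return sat(tree, body, current)
            current = updated
    raise ValueError(f"unknown operator: {op}")


# Binary encoding of <r><a/><b/></r>: r -1-> a -2-> b.
t = BinaryTree()
r = t.add("r")
a = t.add("a", r, 1)
b = t.add("b", a, 2)

X, Y = ("var", "X"), ("var", "Y")
# let $X = b | <2>$X in $X
has_b_sibling = ("let", {"X": ("or", ("name", "b"), ("dia", 2, X))}, X)
# let $X = (a & <2>$Y) | <1>$X | <2>$X, $Y = b | <2>$Y in $X
a_then_b = ("let", {
    "X": ("or", ("and", ("name", "a"), ("dia", 2, Y)),
          ("or", ("dia", 1, X), ("dia", 2, X))),
    "Y": ("or", ("name", "b"), ("dia", 2, Y)),
}, X)
# Last row of Table 1 with f, g renamed to b, a: b & <-2>(a & ~<2>T)
impossible = ("and", ("name", "b"),
              ("dia", -2, ("and", ("name", "a"), ("not", ("dia", 2, ("T",))))))
```

On this tree, the first formula holds at the nodes a and b, the second at r and a, and the third nowhere, since a node reached through the converse program -2 necessarily has a next sibling.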
Translations of XPath expressions and schemas presented in this paper always yield cycle-free formulas (see [10] for more details).

3.3 Compilation of Queries

The logic is expressive enough to capture the set of XPath expressions presented in Section 2.3. For example, consider the sample XPath expression:

    child::r[child::w/@att]

From a given context in an XML document, this expression selects all r child nodes which have at least one w child with an attribute att. Figure 7 shows how it is expressed in the logic, on the binary tree representation. The formula holds for r nodes which are selected by the expression. The first part of the formula, ϕ, corresponds to the step child::r, which selects candidate r nodes. The second part, ψ, navigates downward in the subtrees of these candidate nodes to verify that they have at least one immediate w child with an attribute att.

[Figure 7: XPath Translation Example. On a binary tree with context mark #, nodes r, s, r, v, w and attribute att, ϕ holds at the candidate r nodes and ϕ ∧ ψ holds at the selected r node.]

    Translated query:  child::r[child::w/@att]
    Translation:       r & (let $X = <-1># | <-2>$X in $X) & <1>let $Y = w & <att>T | <2>$Y in $Y
                       where ϕ is r & (let $X = <-1># | <-2>$X in $X)
                       and ψ is <1>let $Y = w & <att>T | <2>$Y in $Y

This example illustrates the need for converse programs inside modalities. The translated XPath expression only uses forward axes (child and attribute); nevertheless, both forward and backward modalities are required for its logical translation. Without converse programs we would have been unable to differentiate selected nodes from nodes whose existence is simply tested. More generally, properties must often be stated on both the ancestors and the descendants of the selected node. Equipping the logic with both forward and converse programs is therefore crucial. Logics without converse programs may only be used for solving XPath emptiness but cannot be used for efficiently solving other decision problems such as containment.
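The distinction between selected nodes and nodes whose existence is merely tested can be seen by evaluating the sample query directly. The sketch below uses Python's standard `xml.etree.ElementTree` module and spells out the semantics of child::r[child::w/@att] by hand rather than through an XPath engine. The sample document and its `id` attributes are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the id attributes only serve to name the r elements.
doc = ET.fromstring(
    '<root>'
    '<r id="1"><s/><w att="x"/></r>'      # selected: has a w child carrying att
    '<r id="2"><w/></r>'                  # rejected: its w child has no att
    '<r id="3"><v><w att="y"/></v></r>'   # rejected: w is not an immediate child
    '<s><w att="z"/></s>'                 # rejected: not named r
    '</root>'
)


def select_r_with_w_att(context):
    """Hand-written evaluation of child::r[child::w/@att] from `context`."""
    return [
        r for r in context.findall("r")                        # step child::r
        if any("att" in w.attrib for w in r.findall("w"))      # qualifier child::w/@att
    ]


selected = [r.get("id") for r in select_r_with_w_att(doc)]
```

The query returns r elements only: the w children and their att attributes are tested but never returned, which is the role played by ψ in the translation above.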
A systematic translation of XPath expressions into the logic is given in [9]. In this paper, we extended it to deal with attributes. We implemented a compiler that takes any expression of the fragment of Figure 2 and computes its logical translation. With the help of this compiler, we extend the syntax of logical formulas with a logical predicate select("query", ϕ). This predicate compiles the XPath expression query given as parameter into the logic, starting from a context that satisfies ϕ. The XPath expression to be given as parameter must match the syntax of the XPath fragment shown on Figure 2 (or Figure 3). In a similar manner, we introduce the predicate exists("query", ϕ), which tests the existence of query from a context satisfying ϕ, in a qualifier-like manner (without moving to its result). Additionally, the predicate select("query") is introduced as a shortcut for select("query", #), where # simply marks the initial context node of the XPath expression (see footnote 3). The predicate exists("query") is a shortcut for exists("query", T).

These syntactic extensions of the logic allow the user to easily embed XPath expressions and formulate decision problems out of them (e.g. containment or any other boolean combination). In the next sections we explain how the framework allows combining queries with schema information for formulating problems.

3.4 Compilation of Tree Types

Tree type expressions are compiled into the logic in two steps: the first step translates them into binary tree type expressions, and the second step actually compiles this intermediate representation into the logic. The translation procedure from tree type expressions to binary tree type expressions is well-known and detailed in [7].
The syntax of output expressions follows:

    τ ::=                   binary tree type expression
          ∅                 empty set
          ()                empty tree
          τ | τ             disjunction
          l(a)[x, x]        element definition
          let x.τ in τ      binder

Attribute expressions are not concerned by this transformation to binary form: they are simply attached, unchanged, to the new (binary) element definitions. Finally, binary tree type expressions are compiled into the logic. The logical translation of an expression τ is given by the function \(\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathtt{T}}\) defined below:

\[
\begin{aligned}
\mathrm{tr}(\tau)^{\psi}_{\varphi} &\overset{\mathrm{def}}{=} \mathtt{F} \qquad \text{for } \tau = \emptyset,\ ()\\
\mathrm{tr}(\tau_1 \mid \tau_2)^{\psi}_{\varphi} &\overset{\mathrm{def}}{=} \mathrm{tr}(\tau_1)^{\psi}_{\varphi} \mid \mathrm{tr}(\tau_2)^{\psi}_{\varphi}\\
\mathrm{tr}(l(a)[x_1, x_2])^{\psi}_{\varphi} &\overset{\mathrm{def}}{=} \bigl(l \;\&\; \varphi \;\&\; \mathrm{tra}(a) \;\&\; s_1(x_1) \;\&\; s_2(x_2)\bigr) \mid \psi\\
\mathrm{tr}(\mathtt{let}\ x_i.\tau_i\ \mathtt{in}\ \tau)^{\psi}_{\varphi} &\overset{\mathrm{def}}{=} \mathtt{let}\ \$X_i = \mathrm{tr}(\tau_i)^{\psi}_{\varphi}\ \mathtt{in}\ \mathrm{tr}(\tau)^{\psi}_{\varphi}
\end{aligned}
\]

where the function \(s_{\cdot}(\cdot)\) sets the type frontier:

\[
s_p(x) =
\begin{cases}
{\sim}{<}p{>}\mathtt{T} & \text{if } x \text{ is bound to } ()\\
{\sim}{<}p{>}\mathtt{T} \mid {<}p{>}\$X & \text{if } \mathit{nullable}(x)\\
{<}p{>}\$X & \text{if not } \mathit{nullable}(x)
\end{cases}
\]

according to the predicate nullable(x), which indicates whether the type T ≠ () bound to x contains the empty tree.

Footnote 3: This mark is especially useful for comparing two or more XPath expressions from the same context.

The function tra(a) compiles attribute expressions associated with element definitions as follows:

\[
\begin{aligned}
\mathrm{tra}(()) &\overset{\mathrm{def}}{=} \mathrm{notothers}(())\\
\mathrm{tra}(\mathit{list} \mid a) &\overset{\mathrm{def}}{=} \mathrm{tra}(\mathit{list}) \;\&\; \mathrm{notothers}(\mathit{list})\\
\mathrm{tra}(\mathit{list}, \mathit{list}') &\overset{\mathrm{def}}{=} \mathrm{tra}(\mathit{list}) \;\&\; \mathrm{tra}(\mathit{list}')\\
\mathrm{tra}(l?) &\overset{\mathrm{def}}{=} l \mid {\sim}l\\
\mathrm{tra}(l) &\overset{\mathrm{def}}{=} l\\
\mathrm{tra}(\neg l) &\overset{\mathrm{def}}{=} {\sim}l
\end{aligned}
\]

In usual schemas (e.g. DTDs, XML Schemas), when no attribute is specified for a given element, it simply means that no attribute is allowed for the defined element. This convention must be explicitly stated in the logic. This is the role of the function notothers(list), which returns the negated disjunction of all attributes not present in list. As a result, taking attributes into account comes at an extra cost.
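The behaviour of tra and notothers on a plain attribute list can be sketched as a small string generator. This is a hedged illustration, not the system's compiler: the function names `tra_item` and `tra_list`, the dictionary encoding of attribute lists, and the sample attribute universe are assumptions. Disjunctions of attribute expressions are deliberately left out, and the output follows the notation of the tra definition (bare attribute names).

```python
from typing import Dict, Iterable


def notothers(listed: Iterable[str], universe: Iterable[str]) -> str:
    """Negated disjunction of every known attribute that is absent from `listed`."""
    others = sorted(set(universe) - set(listed))
    if not others:
        return "T"  # no other attribute is known, so nothing has to be forbidden
    return "~(" + " | ".join(others) + ")"


def tra_item(name: str, kind: str) -> str:
    """Translate one attribute declaration: required, optional or prohibited."""
    if kind == "required":
        return name
    if kind == "optional":
        return f"({name} | ~{name})"
    if kind == "prohibited":
        return f"~{name}"
    raise ValueError(f"unknown attribute kind: {kind}")


def tra_list(attrs: Dict[str, str], universe: Iterable[str]) -> str:
    """Translate a whole attribute list and forbid all attributes it does not mention."""
    parts = [tra_item(name, kind) for name, kind in attrs.items()]
    parts.append(notothers(attrs, universe))
    return " & ".join(parts)


# Hypothetical set of all attribute names occurring in the problem.
universe = ["id", "href", "class", "lang"]
example = tra_list({"href": "required", "id": "optional"}, universe)
no_attributes = tra_list({}, universe)
```

Here `example` is "href & (id | ~id) & ~(class | lang)". The last conjunct grows with the number of attribute names in the whole problem, which is the extra cost mentioned above.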
The translation of attribute expressions appends a (potentially very large) formula in which all attributes occur, for each element definition. In practice, a placeholder atomic proposition is inserted until the full set of attributes involved in the problem formulation is known. When the whole formula has been parsed, placeholders are replaced by the conjunction of negated attributes they denote. This extra cost can be observed in practice, and the system allows two modes of operation: with or without attributes (see footnote 4). Nevertheless, the system is still capable of handling real-world DTDs (such as the DTD of XHTML 1.0 Strict) with attributes. This is due to (1) the limited expressive power of languages such as DTD, which do not allow for disjunction over attribute expressions (like "list | a"); and, more importantly, (2) the satisfiability-testing algorithm, which is implemented using symbolic techniques [10].

Tree type expressions form the common internal representation for a variety of XML schema definition languages. In practice, the logical translation of a tree type expression τ is obtained directly from a variety of formalisms for defining schemas, including DTD, XML Schema, and Relax NG. For this purpose, the syntax of logical formulas is extended with a predicate type("·", ·). The logical translation of an existing schema is returned by type("f", l), where f is a file path to the schema file and l is the element name to be considered as the entry point (root) of the given schema. Any occurrence of this predicate will parse the given schema, extract its internal tree type representation τ, compile it into the logic and return the logical formula \(\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathtt{T}}\).

3.5 Type Tagging

A tag (or "color") is introduced in the compilation of schemas with the purpose of marking all node types of a specific schema. A tag is simply a fresh atomic proposition passed as a parameter to the translation of a tree type expression.
For example, \(\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathit{xhtml}}\) is the logical translation of τ where each element definition is annotated with the atomic proposition "xhtml". With the help of tags, it becomes possible to refer to the element types in any context. For instance, one may formulate

\[
\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathit{xhtml}} \mid \mathrm{tr}(\tau')^{\mathtt{F}}_{\mathit{smil}}
\]

for denoting the union of all τ and τ′ documents, while keeping a way to distinguish element types, even if some element names are shared by the two type expressions.

Tagging becomes even more useful for characterizing evolutions between successive versions of a single schema. In this setting, we need a way to distinguish nodes allowed by a newer schema version from nodes allowed by an older version. This distinction must not be based only on element names, but also on content models. Assume for instance that τ′ is a newer version of schema τ. If we are interested in the set of trees allowed by τ′ but not allowed by τ, then we may formulate:

\[
\mathrm{tr}(\tau')^{\mathtt{F}}_{\mathtt{T}} \;\&\; {\sim}\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathtt{T}}
\]

If we now want to check more fine-grained properties, we may rather be interested in the following (tagged) formulation:

\[
\mathrm{tr}(\tau')^{\mathtt{F}}_{\mathit{all}} \;\&\; {\sim}\mathrm{tr}(\tau)^{{\sim}\mathit{old\_complement}}_{\mathtt{T}}
\]

In this manner, we can distinguish elements that were added in τ′ and whose names did not occur in τ, from elements whose names already occurred in τ but whose content model changed in τ′, for instance. In practice, a type is tagged using the predicate type("f", l, ϕ, ϕ′), which parses the specified schema, converts it into its logical representation τ and returns the formula \(\mathrm{tr}(\tau)^{\varphi'}_{\varphi}\). This kind of type tagging is useful for studying the consequences of schema updates over queries, as presented in the next sections.

Footnote 4: The optional argument "-attributes" must be supplied for attributes to be considered.
4 Analysis Predicates

This section introduces the basic analysis tasks offered to XML application designers for assessing the impact of schema evolutions. In particular, we propose a means for identifying the precise reasons for type mismatches or changes in query results under type constraints.

For this purpose, we build on our query and type expression compilers, and define additional predicates that facilitate the formulation of decision problems at a higher level of abstraction. Specifically, these predicates are introduced as logical macros with the goal of allowing system usage while focusing (only) on the XML-side properties, and keeping underlying logical issues transparent for the user. Ultimately, we regard the set of basic logical formulas (such as modalities and recursive binders) as an assembly language, to which predicates are translated.

We illustrate this principle with two simple predicates designed for checking backward-compatibility of schemas, and query satisfiability in the presence of a schema.

• The predicate backward_incompatible(τ, τ′) takes two type expressions as parameters, and assumes τ′ is an altered version of τ. This predicate is unsatisfiable iff all instances of τ′ are also valid against τ. Any occurrence of this predicate in the input formula will automatically be compiled as \(\mathrm{tr}(\tau')^{\mathtt{F}}_{\mathtt{T}} \;\&\; {\sim}\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathtt{T}}\).

• The predicate non_empty("query", τ) takes an XPath expression (with the syntax defined on Figure 2) and a type expression as parameters, and is unsatisfiable iff the query always returns an empty set of nodes when evaluated on an XML document valid against τ. This predicate compiles into select("query", \(\mathrm{tr}(\tau)^{\mathtt{F}}_{\mathtt{T}}\) & #), where the predicate select("query", ϕ) compiles the XPath expression query into the logic, starting from a context that satisfies ϕ, as explained in Section 3.3.
This can be used to check that the modification of the schema does not contradict any part of the query. Notice that the predicate non_empty("query", τ) can be used for checking whether a query that is valid (see footnote 5) against a schema remains valid with an updated version of the schema. In other terms, this predicate allows determining whether a query that must always return a non-empty result (whatever the tree on which it is evaluated) keeps verifying the same property with a new version of a schema.

A second, more elaborate class of predicates allows formulating problems that combine both a query query and two type expressions τ, τ′ (where τ′ is assumed to be an evolved version of τ):

• new_element_name("query", τ, τ′) is satisfied iff the query query selects elements whose names did not occur at all in τ. This is especially useful for queries whose last navigation step contains a "*" node test and may thus select unexpected elements. This predicate is compiled into:

    ~element(τ) & select("query", tr(τ′)^{F}_{T})

where element(τ) is another predicate that builds the disjunction of all element names occurring in τ. In a similar manner, the predicate attribute(ϕ) builds the logical disjunction of all attribute names used in ϕ.

• new_region("query", τ, τ′) is satisfied iff the query query selects elements whose names already occurred in τ, but such that these nodes now occur in a new context in τ′. In this setting, the path from the root of the document to a node selected by the XPath expression query contains a node whose type is defined in τ′ but not in τ, as illustrated below:

[Figure: an XML document valid against τ′ but not against τ; the path from the root to the node selected by the query contains a node in τ′ \ τ.]

Footnote 5: We say that a query is valid iff its negation is unsatisfiable.
The predicate new_region("query", τ, τ′) is logically defined as follows:

    new_region("query", τ, τ′) def=
        select("query", tr(τ′)^{F}_{all} & ~tr(τ)^{~old_complement}_{T})
        & ~added_element(τ, τ′)
        & ancestor(old_complement)
        & ~descendant(old_complement)
        & ~following(old_complement)
        & ~preceding(old_complement)

The previous definition heavily relies on the partition of tree nodes defined by XPath axes, as illustrated by Figure 8. The definition of new_region("query", τ, τ′) uses an auxiliary predicate added_element(τ, τ′) that builds the disjunction of all element names defined in τ′ but not in τ (in other terms, elements that were added in τ′). In a similar manner, the predicate added_attribute(ϕ, ϕ′) builds the disjunction of all attribute names defined in ϕ′ but not in ϕ.

Figure 8: XPath axes: partition of tree nodes. [Axes shown: self, ancestor, descendant, preceding, following, following-sibling, preceding-sibling, child, parent.]

The predicate new_region("query", τ, τ′) is useful for checking whether a query selects a different set of nodes with τ′ than with τ because selected elements may occur in new regions of the document due to changes brought by τ′.
• new_content("query", τ, τ′) is satisfied iff the query query selects elements whose names were already defined in τ, but whose content model has changed due to evolutions brought by τ′, as illustrated below:

[Figure: an XML document valid against τ′ but not against τ; the subtree of the node selected by the query has changed (new content model).]

The definition of new_content("query", τ, τ′) follows:

    new_content("query", τ, τ′) def=
        select("query", tr(τ′)^{F}_{all} & ~tr(τ)^{~old_complement}_{T})
        & ~added_element(τ, τ′)
        & ~ancestor(added_element(τ, τ′))
        & descendant(old_complement)
        & ~following(old_complement)
        & ~preceding(old_complement)

The predicate new_content("query", τ, τ′) can be used for ensuring that XPath expressions will not return nodes with a possibly new content model that may cause problems. For instance, this allows checking whether an XPath expression whose resulting node set is converted to a string value (as in, e.g., XPath expressions used in XSLT "value-of" instructions) is affected by the changes from τ to τ′.

The previously defined predicates can be used to help the programmer identify precisely how type constraint evolutions affect queries. They can even be combined with usual logical connectives to formulate more sophisticated problems. For example, let us define the predicate exclude(ϕ), which is satisfiable iff there is no node that satisfies ϕ in the whole tree. This predicate can be used for excluding specific element names or even nodes selected by a given XPath expression. It is defined as follows:

    exclude(ϕ) def= ~ancestor-or-self(descendant-or-self(ϕ))

This predicate can also be used for checking properties in an iterative manner, refining the property to be tested at each step. It can also be used for verifying fine-grained properties.
For instance, one may check whether τ′ defines the same set of trees as τ modulo new element names that were added in τ′ with the following formulation:

    ~(τ <=> τ′) & exclude(added_element(τ, τ′))

This allows identifying that, during the type evolution from τ to τ′, the change in query results has not been caused by the type extension but by new compositions of nodes from the older type.

In practice, instead of taking internal tree type representations (as defined in Section 2.2) as parameters, most predicates actually take any logical formula, or even schema paths, as parameters. We believe this facilitates the use of predicates and, most notably, the way they can be composed together. Figure 9 gives the syntax of built-in predicates as they are implemented in the system, where f is a file path to a DTD (.dtd), XML Schema (.xsd), or Relax NG (.rng). In addition to the aforementioned predicates, the predicate descendant(ϕ) forces the existence of a node satisfying ϕ in the subtree, and predicate-name(⟨ϕ⟩⁺) is a call to a custom predicate, as explained in the next section.

    predicate ::= select("query")
                | select("query", ϕ)
                | exists("query")
                | exists("query", ϕ)
                | type("f", l)
                | type("f", l, ϕ, ϕ′)
                | forward_incompatible(ϕ, ϕ′)
                | backward_incompatible(ϕ, ϕ′)
                | element(ϕ)
                | attribute(ϕ)
                | descendant(ϕ)
                | exclude(ϕ)
                | added_element(ϕ, ϕ′)
                | added_attribute(ϕ, ϕ′)
                | non_empty("query", ϕ)
                | new_element_name("query", "f", "f′", l)
                | new_region("query", "f", "f′", l)
                | new_content("query", "f", "f′", l)
                | predicate-name(⟨ϕ⟩⁺)

Figure 9: Syntax of Predicates for XML Reasoning.

4.1 Custom Predicates

Following the spirit of predicates presented in the previous section, users may also define their own custom predicates.
The full syntax of XML logical specifications to be used with the system is defined on Figure 10, where the meta-syntax ⟨X⟩⁺ means one or more occurrences of X separated by commas. A global problem specification can be any formula (as defined on Figure 6), or a list of custom predicate definitions separated by semicolons and followed by a formula. A custom predicate may have parameters that are instantiated with actual formulas when the custom predicate is called (as shown on Figure 9). A formula bound to a custom predicate may include calls to other predicates, but not to the currently defined predicate (recursive definitions must be made through the let binder shown on Figure 6).

    spec ::= ϕ                              formula (see Fig. 6)
           | def ; ϕ
    def  ::= predicate-name(⟨l⟩⁺) = ϕ′      custom definition
           | def ; def                      list of definitions

Figure 10: Global Syntax for Specifying Problems.

5 Framework in Action

We have implemented the whole software architecture described in Section 2 and illustrated on Figure 1 [8]. We have carried out extensive experiments with the system using real-world schemas such as XHTML, MathML, SVG, and SMIL (Table 2 gives details related to their respective sizes) and queries found in transformations such as the MathML content-to-presentation transformation [15]. We present two of these experiments, which show how the tool can be used to analyze different situations where schemas and queries evolve.

    Schema                 Variables   Elements   Attributes
    XHTML 1.0 basic DTD           71         52           57
    XHTML 1.1 basic DTD           89         67           83
    MathML 1.01 DTD              137        127           72
    MathML 2.0 DTD               194        181           97

Table 2: Sizes of (Some) Considered Schemas.

Evolution of XHTML Basic

The first test consists in analyzing the relationship (forward and backward compatibility) between the XHTML basic 1.0 and XHTML basic 1.1 schemas.
In particular, backward compatibility can be checked by the following command:

    backward_incompatible("xhtml-basic10.dtd", "xhtml-basic11.dtd", "html")

The test immediately yields a counter example, as the new schema contains new element names. The counter example (shown below) contains a style element occurring as a child of head, which is not permitted in XHTML basic 1.0:

    <html>
      <head>
        <style type="_otherV"/>
      </head>
      <body/>
    </html>

The next step consists in focusing on the relationship between both schemas excluding these new elements. This can be formulated by the following command:

    backward_incompatible("xhtml-basic10.dtd", "xhtml-basic11.dtd", "html")
    & exclude(added_element(type("xhtml-basic10.dtd", "html"),
                            type("xhtml-basic11.dtd", "html")))

The result of the test shows a counter example document that proves that XHTML basic 1.1 is not backward compatible with XHTML basic 1.0 even if new elements are not considered. In particular, the label element cannot contain an a element in XHTML basic 1.0, while it can in XHTML basic 1.1. The counter example produced by the solver is shown below:

    <html>
      <head>
        <object>
          <label>
            <a>
              <img/>
            </a>
            <img/>
          </label>
          <param/>
        </object>
        <meta/>
        <title/>
        <base/>
      </head>
      <body/>
    </html>

    XHTML basic 1.0 validity error: element "a" is not declared in "label" list of possible children

Notice that we observed similar forward and backward compatibility issues with several other W3C normative schemas (in particular for the different versions of SMIL and SVG). Such backward incompatibilities suggest that applications cannot simply ignore new elements from newer schemas, as the combination of older elements may evolve significantly from one version to another.
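The role of such a counter example can be illustrated with a minimal Python sketch: a single document accepted by the newer schema and rejected by the older one witnesses backward incompatibility, even when no element name is new. The sketch rests on loud assumptions. The content models below are hypothetical and heavily simplified (they are not the XHTML Basic DTDs), schemas are plain dictionaries of regular expressions over child names, and the function is_valid is an illustrative helper. Unlike the solver, which searches for a witness by deciding satisfiability, this code only checks a document that is already given.

```python
import re
import xml.etree.ElementTree as ET


def is_valid(element, schema):
    """Return True if `element` and all its descendants match `schema`.

    `schema` maps an element name to a regular expression that the
    sequence of child names (each followed by a space) must match entirely.
    """
    model = schema.get(element.tag)
    if model is None:  # element name not declared at all
        return False
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(is_valid(child, schema) for child in element)


# Hypothetical, heavily simplified content models (not the real DTDs).
old_schema = {"label": r"(img )*", "a": r"(img )*", "img": r""}
new_schema = {"label": r"((a|img) )*", "a": r"(img )*", "img": r""}

witness = ET.fromstring("<label><a><img/></a><img/></label>")

# A document accepted by the new schema but rejected by the old one
# witnesses backward incompatibility, although no element name is new.
backward_incompatible = is_valid(witness, new_schema) and not is_valid(
    witness, old_schema
)
print(backward_incompatible)
```

Both toy schemas declare exactly the same element names, so excluding added elements would not remove this witness; only the content model of label differs.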
MathML Content to Presentation Conversion

MathML is an XML format for describing mathematical notations and capturing both their structure and their graphical presentation, known as Content MathML and Presentation MathML respectively. The structure of a given equation is kept separate from the presentation, and the rendering part can be generated from the structure description. This operation is usually carried out using an XSLT transformation that achieves the conversion. In this test series, we focus on the analysis of the queries contained in such a transformation sheet and evaluate the impact of the schema change from MathML 1.0 to MathML 2.0 on these queries.

Most of the queries contained in the transformation follow only a few patterns that are very similar up to element names. The following three patterns are the most frequently used:

    Q1: //apply[*[1][self::eq]]
    Q2: //apply[*[1][self::apply]/inverse]
    Q3: //sin[preceding-sibling::*[position()=last() and (self::compose or self::inverse)]]

The first test is formulated by the following command:

    new_region("Q1","mathml.dtd","mathml2.dtd","math")

The result of the test shows a counter example document that proves that the query may select nodes in new contexts in MathML 2.0 compared to MathML 1.0. In particular, the query Q1 selects apply elements whose ancestors can be declare elements, as indicated on the document produced by the solver:

    <math xmlns:solver="http://wam.inrialpes.fr/xml" solver:context="true">
      <declare>
        <apply solver:target="true">
          <eq/>
        </apply>
        <condition/>
      </declare>
    </math>

Notice that the solver automatically annotates a pair of nodes related by the query: when the query is evaluated from a node marked with the attribute solver:context, the node marked with solver:target is selected.
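The relation between the two annotations can be checked by hand on the document above. The following Python sketch, which is not part of the tool, evaluates an emulation of Q1 from the root and lists the ancestors of each selected node. Since the standard library's xml.etree.ElementTree supports only a subset of XPath and has no self:: axis, the predicate [*[1][self::eq]] is emulated directly as "the first child element is eq". The helper names select_q1 and ancestors are illustrative.

```python
import xml.etree.ElementTree as ET

SOLVER_NS = "http://wam.inrialpes.fr/xml"

DOCUMENT = """
<math xmlns:solver="http://wam.inrialpes.fr/xml" solver:context="true">
  <declare>
    <apply solver:target="true">
      <eq/>
    </apply>
    <condition/>
  </declare>
</math>
"""


def select_q1(root):
    """Emulate //apply[*[1][self::eq]]: apply elements whose first child is eq."""
    return [
        node
        for node in root.iter("apply")
        if len(node) > 0 and node[0].tag == "eq"
    ]


def ancestors(root, node):
    """Return the names of the ancestors of `node`, outermost first."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    names = []
    while node in parent_of:
        node = parent_of[node]
        names.append(node.tag)
    return list(reversed(names))


root = ET.fromstring(DOCUMENT)
selected = select_q1(root)
for node in selected:
    # Namespaced attributes are stored as "{namespace-uri}local-name".
    marked = node.get("{%s}target" % SOLVER_NS) == "true"
    print(node.tag, ancestors(root, node), marked)
```

On this document the only selected node is the apply element carrying solver:target, and declare appears among its ancestors, which is the new context reported by the test.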
To evaluate the effect of this change, the counter example is filled with content and passed as an input parameter to the transformation. This immediately reveals a bug in the transformation, as the resulting document is not a MathML 2.0 presentation document. Based on this analysis, we know that the XSLT template associated with the match pattern Q1 must be updated to cope with MathML evolution from version 1.0 to version 2.0.

The next test consists in evaluating the impact of the MathML type evolution for the query Q2 while excluding all new elements added in MathML 2.0 from the test. This identifies whether old elements of MathML 1.0 can be composed in MathML 2.0 in a different manner. This can be performed with the following command:

    new_content("Q2","mathml.dtd","mathml2.dtd","math")
    & exclude(added_element(type("mathml.dtd","math"),
                            type("mathml2.dtd","math")))

The test result shows an example document that effectively combines MathML 1.0 elements in a way that was not allowed in MathML 1.0 but is permitted in MathML 2.0.

    <math xmlns:solver="http://wam.inrialpes.fr/xml" solver:context="true">
      <apply solver:target="true">
        <apply>
          <inverse/>
        </apply>
        <annotation-xml>
          <math/>
        </annotation-xml>
        <condition/>
      </apply>
    </math>

Similarly, the last test consists in evaluating the impact of the MathML type evolution for the query Q3, excluding all new elements added in MathML 2.0 and counter example documents containing declare elements (to avoid trivial counter examples):

    new_region("Q3","mathml.dtd","mathml2.dtd","math")
    & exclude(added_element(type("mathml.dtd","math"),
                            type("mathml2.dtd","math")))
    & exclude(declare)

The counter example document shown below illustrates a case where the sin element occurs in a new context.
    <math xmlns:solver="http://wam.inrialpes.fr/xml" solver:context="true">
      <apply>
        <annotation-xml>
          <math>
            <apply>
              <inverse/>
              <sin solver:target="true"/>
            </apply>
          </math>
        </annotation-xml>
      </apply>
    </math>

Applying the transformation on the previous examples yields documents that are neither MathML 1.0 nor MathML 2.0 valid. As a result, the stylesheet cannot be used safely over documents of the new type without modifications. In addition, the required changes to the stylesheet are not limited to the addition of new templates for MathML 2.0 elements. The templates that deal with the composition of MathML 1.0 elements should be revised as well.

All the previous tests were processed in less than 30 seconds on an ordinary laptop computer running Java under Mac OS X.

6 Conclusion

In this article, we present a logical framework and a tool for verifying forward/backward compatibility issues caused by the evolution of schemas and queries. The tool allows XML designers to identify queries that must be reformulated in order to produce the expected results across successive schema versions. With this tool, designers can examine precisely the impact of schema changes over queries, therefore facilitating their reformulation. We gave illustrations of how to use the tool for both schema and query evolution on realistic examples. In particular, we considered typical situations in applications involving the evolution of W3C schemas such as XHTML and MathML. The tool can be very useful for standard schema writers and maintainers, helping them enforce some level of quality assurance on compatibility between versions.

There are a number of interesting extensions to the proposed system. First, the set of predicates can be easily enriched to detect more precisely the impact on queries.
For example, one can extend the tagging to identify separately every navigation step and qualifier in a query expression. This will help greatly in the identification and reformulation of the navigation steps or qualifiers affected by schema evolution.

References

[1] M. Benedikt, W. Fan, and F. Geerts. XPath satisfiability in the presence of DTDs. In PODS '05, pages 25–36. ACM Press, 2005.
[2] M. Benedikt and C. Koch. XPath leashed. Submitted, 2006.
[3] G. J. Bex, W. Martens, F. Neven, and T. Schwentick. Expressiveness of XSDs: from practice to theory, there and back again. In WWW '05, pages 712–721, 2005.
[4] K. Beyer, F. Özcan, S. Saiprasad, and B. V. der Linden. DB2/XML: designing for evolution. In SIGMOD '05, pages 948–952. ACM, 2005.
[5] J. Clark and S. DeRose. XML path language (XPath) version 1.0, W3C recommendation, November 1999. http://www.w3.org/TR/1999/REC-xpath-19991116.
[6] A. Deutsch and V. Tannen. XML queries and constraints, containment and reformulation. Theor. Comput. Sci., 336(1):57–87, 2005.
[7] P. Genevès. Logics for XML. PhD thesis, Institut National Polytechnique de Grenoble, December 2006. http://www.pierresoft.com/pierre.geneves/phd.htm.
[8] P. Genevès and N. Layaïda. A satisfiability solver for XML and XPath, June 2006. http://wam.inrialpes.fr/xml.
[9] P. Genevès, N. Layaïda, and A. Schmitt. Efficient static analysis of XML paths and types. In PLDI '07, pages 342–351. ACM Press, 2007.
[10] P. Genevès, N. Layaïda, and A. Schmitt. Efficient static analysis of XML paths and types. Long version of [9], Research Report 6590, INRIA, July 2008.
[11] H. Hosoya, J. Vouillon, and B. C. Pierce. Regular expression types for XML. ACM TOPLAS, 27(1):46–90, 2005.
[12] H. J. Moon, C. A. Curino, A. Deutsch, and C.-Y. Hou. Managing and querying transaction-time databases under schema evolution. In VLDB '08, pages 882–895. VLDB Endowment, 2008.
[13] M. M. Moro, S. Malaika, and L. Lim. Preserving XML queries during schema evolution. In WWW '07, pages 1341–1342. ACM, 2007.
[14] M. Murata, D. Lee, M. Mani, and K. Kawaguchi. Taxonomy of XML schema languages using formal language theory. ACM TOIT, 5(4):660–704, 2005.
[15] E. Pietriga. MathML content2presentation transformation, May 2005. http://www.lri.fr/~pietriga/mathmlc2p/mathmlc2p.html.
[16] E. Sedlar. Managing structure in bits & pieces: the killer use case for XML. In SIGMOD '05, pages 818–821. ACM, 2005.
[17] P. Wadler. Two semantics for XPath. Internal Technical Note of the W3C XSL Working Group, http://homepages.inf.ed.ac.uk/wadler/papers/xpath-semantics/xpath-semantics.pdf, January 2000.
[18] C. Yu and L. Popa. Semantic adaptation of schema mappings when schemas evolve. In VLDB '05, pages 1006–1017. VLDB Endowment, 2005.

INRIA
Centre de recherche INRIA Grenoble – Rhône-Alpes: 655, avenue de l'Europe - 38334 Montbonnot Saint-Ismier (France)
Centre de recherche INRIA Bordeaux – Sud Ouest: Domaine Universitaire - 351, cours de la Libération - 33405 Talence Cedex
Centre de recherche INRIA Lille – Nord Europe: Parc Scientifique de la Haute Borne - 40, avenue Halley - 59650 Villeneuve d'Ascq
Centre de recherche INRIA Nancy – Grand Est: LORIA, Technopôle de Nancy-Brabois - Campus scientifique 615, rue du Jardin Botanique - BP 101 - 54602 Villers-lès-Nancy Cedex
Centre de recherche INRIA Paris – Rocquencourt: Domaine de Voluceau - Rocquencourt - BP 105 - 78153 Le Chesnay Cedex
Centre de recherche INRIA Rennes – Bretagne Atlantique: IRISA, Campus universitaire de Beaulieu - 35042 Rennes Cedex
Centre de recherche INRIA Saclay – Île-de-France: Parc Orsay Université - ZAC des Vignes: 4, rue Jacques Monod - 91893 Orsay Cedex
Centre de recherche INRIA Sophia Antipolis – Méditerranée: 2004, route des Lucioles - BP 93 - 06902 Sophia Antipolis Cedex
Éditeur INRIA - Domaine de Voluceau - Rocquencourt, BP
105 - 78153 Le Chesnay Cedex (France) http://www.inria.fr ISSN 0249-6399 </div>
<div class="post-share" style="margin-top: 50px;"> <a href="https://twitter.com/intent/tweet?url=http://koineu.com/en/posts/2008/11/2008-11-27-0811_4324&text=Ensuring%20Query%20Compatibility%20with%20Evolving%20XML%20Schemas" class="share-btn" target="_blank" rel="noopener">Twitter</a> <a href="https://www.facebook.com/sharer/sharer.php?u=http://koineu.com/en/posts/2008/11/2008-11-27-0811_4324" 
class="share-btn" target="_blank" rel="noopener">Facebook</a> </div> </div> </div> </div> </main> <footer class="site-footer"> <div class="container"> <div class="footer-grid"> <div> <div class="footer-logo">KOINEU</div> <p class="footer-desc">Global Academic Research Archive powered by AI.</p> <div class="footer-biz-sidebar" style="margin:20px 0; font-size:0.85rem; color:var(--color-muted); line-height:1.7;"> <div style="display:flex; gap:8px;"> <span style="white-space:nowrap;"><b>Company:</b></span> <span>미스미스터크레이지 (MissMrCrazy)</span> </div> <div style="display:flex; gap:8px;"> <span style="white-space:nowrap;"><b>CEO:</b></span> <span>송호성 (Song Ho-seong)</span> </div> <div style="display:flex; gap:8px;"> <span style="white-space:nowrap;"><b>Biz Reg:</b></span> <span>731-64-00881</span> </div> </div> </div> <div> <h3 class="footer-title">Recent posts</h3> <div class="footer-posts-list"> <a href="/en/posts/2026/04/2026-04-01-2604_00465/" class="footer-post"> <div class="footer-post__img"> <img src="/koineu_html/2026/04/2604.00465/bg5.webp" onerror="this.src='/images/placeholder.jpg'" alt="thumb" loading="lazy"> </div> <div> <div class="footer-post__title" style="color:white;"> Gravitational wave spectrum from first-order QCD phase transitions based on a parity doublet model </div> <div class="footer-post__meta" style="font-size:0.75rem; color:#aaa; margin-top:3px;">2026-04-01</div> </div> </a> <a href="/en/posts/2026/04/2026-04-01-2604_00618/" class="footer-post"> <div class="footer-post__img"> <img src="/koineu_html/2026/04/2604.00618/bgd.webp" onerror="this.src='/images/placeholder.jpg'" alt="thumb" loading="lazy"> </div> <div> <div class="footer-post__title" style="color:white;"> Absorption of 1$P$-wave heavy charmonium $χ_{c1}(1P)$ in nuclei </div> <div class="footer-post__meta" style="font-size:0.75rem; color:#aaa; margin-top:3px;">2026-04-01</div> </div> </a> </div> </div> <div> <h3 class="newsletter-title">STAY INFORMED</h3> <p class="newsletter-desc"> 
Get the latest research breakthroughs delivered to your inbox. </p> <form class="newsletter-form" action="#" method="post"> <input type="email" name="email" class="newsletter-input" placeholder="Email Address" required /> <button type="submit" class="newsletter-btn">SUBSCRIBE</button> </form> </div> </div> <div class="footer-bottom" style="margin-top:50px; padding-top:30px; border-top:1px solid var(--color-border);"> <div style="display:flex; flex-direction:column; gap:10px;"> <div style="display:flex; flex-wrap:wrap; gap:20px; font-size:0.75rem; color:var(--color-muted);"> <span>2026 © KOINEU. All Rights Reserved.</span> <a href="/en/about/" style="color:var(--color-muted); text-decoration:none;">About Us</a> <a href="/en/terms/" style="color:var(--color-muted); text-decoration:none;">Terms</a> <a href="/en/privacy/" style="color:var(--color-muted); text-decoration:none;">Privacy Policy</a> </div> <div style="font-size:0.7rem; color:#aaa; line-height:1.5;"> Address: 31, Hyangsoseojeong-gil, Danwol-myeon, Yangpyeong-gun, Gyeonggi-do, KR | Industry: Information & Communication </div> </div> <a href="#" class="back-to-top" style="display:flex; align-items:center; gap:6px; color:var(--color-muted); font-size:0.8125rem; margin-top:auto; text-decoration:none;"> <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><path d="M18 15l-6-6-6 6"/></svg> Back to top </a> </div> </div> </footer> <script> // ── 모바일 메뉴 토글 ── const navToggle = document.querySelector('.nav-toggle'); const navMenu = document.querySelector('.nav-menu'); if (navToggle && navMenu) { navToggle.addEventListener('click', () => { const expanded = navToggle.getAttribute('aria-expanded') === 'true'; navToggle.setAttribute('aria-expanded', String(!expanded)); navMenu.classList.toggle('is-open'); }); } // ── 검색 패널 ── const searchBtn = document.getElementById('navSearchToggle'); const searchPanel = document.getElementById('searchPanel'); const searchClose = 
document.getElementById('searchClose'); if (searchBtn && searchPanel) { searchBtn.addEventListener('click', () => { searchPanel.style.display = 'flex'; searchPanel.querySelector('input[name="q"]')?.focus(); }); searchClose?.addEventListener('click', () => { searchPanel.style.display = 'none'; }); searchPanel.addEventListener('click', (e) => { if (e.target === searchPanel) searchPanel.style.display = 'none'; }); document.addEventListener('keydown', (e) => { if (e.key === 'Escape') searchPanel.style.display = 'none'; }); } // ── Back to top ── document.querySelector('.back-to-top')?.addEventListener('click', (e) => { e.preventDefault(); window.scrollTo({ top: 0, behavior: 'smooth' }); }); </script> </body> </html>