Marginal AMP Chain Graphs

JOSE M. PEÑA
ADIT, IDA, LINKÖPING UNIVERSITY, SE-58183 LINKÖPING, SWEDEN
JOSE.M.PENA@LIU.SE

Abstract. We present a new family of models that is based on graphs that may have undirected, directed and bidirected edges. We name these new models marginal AMP (MAMP) chain graphs because each of them is Markov equivalent to some AMP chain graph under marginalization of some of its nodes. However, MAMP chain graphs do not only subsume AMP chain graphs but also multivariate regression chain graphs. We describe global and pairwise Markov properties for MAMP chain graphs and prove their equivalence for compositional graphoids. We also characterize when two MAMP chain graphs are Markov equivalent. For Gaussian probability distributions, we also show that every MAMP chain graph is Markov equivalent to some directed and acyclic graph with deterministic nodes under marginalization and conditioning on some of its nodes. This is important because it implies that the independence model represented by a MAMP chain graph can be accounted for by some data generating process that is partially observed and has selection bias. Finally, we modify MAMP chain graphs so that they are closed under marginalization for Gaussian probability distributions. This is a desirable feature because it guarantees parsimonious models under marginalization.

1. Introduction

Chain graphs (CGs) are graphs with possibly directed and undirected edges, and no semidirected cycle. They have been extensively studied as a formalism to represent independence models, because they can model symmetric and asymmetric relationships between the random variables of interest. However, there are four different interpretations of CGs as independence models (Cox and Wermuth, 1993, 1996; Drton, 2009; Sonntag and Peña, 2013).
In this paper, we are interested in the AMP interpretation (Andersson et al., 2001; Levitz et al., 2001) and in the multivariate regression (MVR) interpretation (Cox and Wermuth, 1993, 1996). Although MVR CGs were originally represented using dashed directed and undirected edges, we prefer to represent them using solid directed and bidirected edges.

In this paper, we unify and generalize the AMP and MVR interpretations of CGs. We do so by introducing a new family of models that is based on graphs that may have undirected, directed and bidirected edges. We call this new family marginal AMP (MAMP) CGs.

The rest of the paper is organized as follows. We start with some preliminaries and notation in Section 2. We continue by proving in Section 3 that, for Gaussian probability distributions, every AMP CG is Markov equivalent to some directed and acyclic graph with deterministic nodes under marginalization and conditioning on some of its nodes. We extend this result to MAMP CGs in Section 4, which implies that the independence model represented by a MAMP chain graph can be accounted for by some data generating process that is partially observed and has selection bias. Therefore, the independence models represented by MAMP CGs are not arbitrary and, thus, MAMP CGs are worth studying. We also describe in Section 4 global and pairwise Markov properties for MAMP CGs and prove their equivalence for compositional graphoids. Moreover, we also characterize in that section when two MAMP CGs are Markov equivalent. We show in Section 5 that MAMP CGs are not closed under marginalization and modify them so that they become closed under marginalization for Gaussian probability distributions. This is important because it guarantees parsimonious models under marginalization.

Date: mampcgs32.tex, 12:04, 30/06/18.
Finally, we discuss in Section 6 how MAMP CGs relate to other existing models based on graphs such as regression CGs, maximal ancestral graphs, summary graphs and MC graphs.

2. Preliminaries

In this section, we introduce some concepts of models based on graphs, i.e. graphical models. Most of these concepts have a unique definition in the literature. However, a few concepts have more than one definition in the literature and, thus, we opt for the most suitable in this work. All the graphs and probability distributions in this paper are defined over a finite set V. All the graphs in this paper are simple, i.e. they contain at most one edge between any pair of nodes. The elements of V are not distinguished from singletons. The operators set union and set difference are given equal precedence in the expressions. The term maximal is always wrt set inclusion.

If a graph G contains an undirected, directed or bidirected edge between two nodes V_1 and V_2, then we write that V_1 − V_2, V_1 → V_2 or V_1 ↔ V_2 is in G. We represent with a circle, such as in ←⊸ or ⊸⊸, that the end of an edge is unspecified, i.e. it may be an arrow tip or nothing. The parents of a set of nodes X of G is the set pa_G(X) = {V_1 ∣ V_1 → V_2 is in G, V_1 ∉ X and V_2 ∈ X}. The children of X is the set ch_G(X) = {V_1 ∣ V_1 ← V_2 is in G, V_1 ∉ X and V_2 ∈ X}. The neighbors of X is the set ne_G(X) = {V_1 ∣ V_1 − V_2 is in G, V_1 ∉ X and V_2 ∈ X}. The spouses of X is the set sp_G(X) = {V_1 ∣ V_1 ↔ V_2 is in G, V_1 ∉ X and V_2 ∈ X}. The adjacents of X is the set ad_G(X) = ne_G(X) ∪ pa_G(X) ∪ ch_G(X) ∪ sp_G(X). A route between a node V_1 and a node V_n in G is a sequence of (not necessarily distinct) nodes V_1, ..., V_n st V_i ∈ ad_G(V_{i+1}) for all 1 ≤ i < n. If the nodes in the route are all distinct, then the route is called a path.
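The set-valued operators pa_G, ch_G, ne_G, sp_G and ad_G defined above can be illustrated with a small Python sketch. It assumes, purely for illustration, that a graph is stored as a set of triples (a, kind, b), where kind is "->" for a → b, "-" for a − b and "<->" for a ↔ b; this representation, the function names and the example graph are hypothetical choices and do not come from the paper.

```python
def pa(edges, X):
    """Parents of X: nodes outside X with a directed edge into X."""
    return {a for a, kind, b in edges if kind == "->" and b in X and a not in X}


def ch(edges, X):
    """Children of X: nodes outside X reached by a directed edge from X."""
    return {b for a, kind, b in edges if kind == "->" and a in X and b not in X}


def _across(edges, X, wanted):
    """Nodes outside X joined to a node in X by an edge of type `wanted`."""
    out = set()
    for a, kind, b in edges:
        if kind == wanted:
            if a in X and b not in X:
                out.add(b)
            if b in X and a not in X:
                out.add(a)
    return out


def ne(edges, X):
    """Neighbors of X (undirected edges)."""
    return _across(edges, X, "-")


def sp(edges, X):
    """Spouses of X (bidirected edges)."""
    return _across(edges, X, "<->")


def ad(edges, X):
    """Adjacents of X: union of neighbors, parents, children and spouses."""
    return ne(edges, X) | pa(edges, X) | ch(edges, X) | sp(edges, X)


# Hypothetical example graph: A -> B, B - C, C <-> D.
edges = {("A", "->", "B"), ("B", "-", "C"), ("C", "<->", "D")}
print(sorted(ad(edges, {"C"})))
```

Since the paper does not distinguish a node from the singleton containing it, a single node A is passed here as the set {"A"}.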
The length of a route is the number of (not necessarily distinct) edges in the route, e.g. the length of the route V_1, ..., V_n is n − 1. A route is called undirected if V_i − V_{i+1} is in G for all 1 ≤ i < n. A route is called descending if V_i → V_{i+1}, V_i − V_{i+1} or V_i ↔ V_{i+1} is in G for all 1 ≤ i < n. A route is called strictly descending if V_i → V_{i+1} is in G for all 1 ≤ i < n. The descendants of a set of nodes X of G is the set de_G(X) = {V_n ∣ there is a descending route from V_1 to V_n in G, V_1 ∈ X and V_n ∉ X}. The non-descendants of X is the set nde_G(X) = V ∖ X ∖ de_G(X). The strict ascendants of X is the set san_G(X) = {V_1 ∣ there is a strictly descending route from V_1 to V_n in G, V_1 ∉ X and V_n ∈ X}. A route V_1, ..., V_n in G is called a cycle if V_n = V_1. Moreover, it is called a semidirected cycle if V_n = V_1, V_1 → V_2 is in G and V_i → V_{i+1}, V_i ↔ V_{i+1} or V_i − V_{i+1} is in G for all 1 < i < n. An AMP chain graph (AMP CG) is a graph whose every edge is directed or undirected st it has no semidirected cycles. A MVR chain graph (MVR CG) is a graph whose every edge is directed or bidirected st it has no semidirected cycles. A set of nodes of a graph is connected if there exists a path in the graph between every pair of nodes in the set st all the edges in the path are undirected or bidirected. A connectivity component of a graph is a maximal connected set. The subgraph of G induced by a set of its nodes X, denoted as G_X, is the graph over X that has all and only the edges in G whose both ends are in X.

Let X, Y, Z and W denote four disjoint subsets of V. An independence model M is a set of statements X ⊥_M Y ∣ Z.
Moreover, M is called graphoid if it satisfies the following properties: symmetry X ⊥_M Y ∣ Z ⇒ Y ⊥_M X ∣ Z, decomposition X ⊥_M Y ∪ W ∣ Z ⇒ X ⊥_M Y ∣ Z, weak union X ⊥_M Y ∪ W ∣ Z ⇒ X ⊥_M Y ∣ Z ∪ W, contraction X ⊥_M Y ∣ Z ∪ W ∧ X ⊥_M W ∣ Z ⇒ X ⊥_M Y ∪ W ∣ Z, and intersection X ⊥_M Y ∣ Z ∪ W ∧ X ⊥_M W ∣ Z ∪ Y ⇒ X ⊥_M Y ∪ W ∣ Z. Moreover, M is called compositional graphoid if it is a graphoid that also satisfies the composition property X ⊥_M Y ∣ Z ∧ X ⊥_M W ∣ Z ⇒ X ⊥_M Y ∪ W ∣ Z. Another property that M may satisfy is weak transitivity X ⊥_M Y ∣ Z ∧ X ⊥_M Y ∣ Z ∪ K ⇒ X ⊥_M K ∣ Z ∨ K ⊥_M Y ∣ Z with K ∈ V ∖ X ∖ Y ∖ Z.

We now recall the semantics of AMP, MVR and LWF CGs. A node B in a path ρ in an AMP CG G is called a triplex node in ρ if A → B ← C, A → B − C, or A − B ← C is a subpath of ρ. Moreover, ρ is said to be Z-open with Z ⊆ V when
● every triplex node in ρ is in Z ∪ san_G(Z), and
● every non-triplex node B in ρ is outside Z, unless A − B − C is a subpath of ρ and pa_G(B) ∖ Z ≠ ∅.

A node B in a path ρ in a MVR CG G is called a triplex node in ρ if A ⊸→ B ←⊸ C is a subpath of ρ. Moreover, ρ is said to be Z-open with Z ⊆ V when
● every triplex node in ρ is in Z ∪ san_G(Z), and
● every non-triplex node B in ρ is outside Z.

A section of a route ρ in a CG is a maximal undirected subroute of ρ. A section V_2 − ... − V_{n−1} of ρ is a collider section of ρ if V_1 → V_2 − ... − V_{n−1} ← V_n is a subroute of ρ. A route ρ in a CG is said to be Z-open when
● every collider section of ρ has a node in Z, and
● no non-collider section of ρ has a node in Z.

Let X, Y and Z denote three disjoint subsets of V. When there is no Z-open path/path/route in an AMP/MVR/LWF CG G between a node in X and a node in Y, we say that X is separated from Y given Z in G and denote it as X ⊥_G Y ∣ Z. The independence model represented by G is the set of separations X ⊥_G Y ∣ Z.
We denote it as I_AMP(G), I_MVR(G) or I_LWF(G). In general, these three independence models are different. However, if G is a directed and acyclic graph (DAG), then they are the same. Given an AMP, MVR or LWF CG G and two disjoint subsets L and S of V, we denote by [I(G)]^S_L the independence model represented by G under marginalization of the nodes in L and conditioning on the nodes in S. Specifically, X ⊥_G Y ∣ Z is in [I(G)]^S_L iff X ⊥_G Y ∣ Z ∪ S is in I(G) and X, Y, Z ⊆ V ∖ L ∖ S.

Finally, we denote by X ⊥_p Y ∣ Z that X is independent of Y given Z in a probability distribution p. We say that p is Markovian wrt an AMP, MVR or LWF CG G when X ⊥_p Y ∣ Z if X ⊥_G Y ∣ Z for all X, Y and Z disjoint subsets of V. We say that p is faithful to G when X ⊥_p Y ∣ Z iff X ⊥_G Y ∣ Z for all X, Y and Z disjoint subsets of V.

3. Error AMP CGs

Any regular Gaussian probability distribution that can be represented by an AMP CG can be expressed as a system of linear equations with correlated errors whose structure depends on the CG (Andersson et al., 2001, Section 5). However, the CG represents the errors implicitly, as no nodes in the CG correspond to the errors. We propose in this section to add some deterministic nodes to the CG in order to represent the errors explicitly. We call the result an EAMP CG. We will show that, as desired, every AMP CG is Markov equivalent to its corresponding EAMP CG under marginalization of the error nodes, i.e. the independence model represented by the former coincides with the independence model represented by the latter. We will also show that every EAMP CG under marginalization of the error nodes is Markov equivalent to some LWF CG under marginalization of the error nodes, and that the latter is Markov equivalent to some DAG under marginalization of the error nodes and conditioning on some selection nodes.
The relevance of this result can be best explained by extending to AMP CGs what Koster (2002, p. 838) stated for summary graphs and Richardson and Spirtes (2002, p. 981) stated for ancestral graphs: The fact that an AMP CG has a DAG as departure point implies that the independence model associated with the former can be accounted for by some data generating process that is partially observed (corresponding to marginalization) and has selection bias (corresponding to conditioning). We extend this result to MAMP CGs in the next section.

It is worth mentioning that Andersson et al. (2001, Theorem 6) have identified the conditions under which an AMP CG is Markov equivalent to some LWF CG.¹ It is clear from these conditions that there are AMP CGs that are not Markov equivalent to any LWF CG. The results in this section differ from those by Andersson et al. (2001, Theorem 6), because we show that every AMP CG is Markov equivalent to some LWF CG with error nodes under marginalization of the error nodes.

¹ To be exact, Andersson et al. (2001, Theorem 6) have identified the conditions under which all and only the probability distributions that can be represented by an AMP CG can also be represented by some LWF CG. However, for any AMP or LWF CG G, there are Gaussian probability distributions that have all and only the independencies in the independence model represented by G, as shown by Levitz et al. (2001, Theorem 6.1) and Peña (2011, Theorems 1 and 2). Then, our formulation is equivalent to the original formulation of the result by Andersson et al. (2001, Theorem 6).

It is also worth mentioning that Richardson and Spirtes (2002, p. 1025) show that there are AMP CGs that are not Markov equivalent to any DAG under marginalization and conditioning. However, the results in this section show that every AMP CG is Markov equivalent to some DAG with error and selection nodes under marginalization of the error nodes and conditioning on the selection nodes. Therefore, the independence model represented by any AMP CG has indeed some DAG as departure point and, thus, it can be accounted for by some data generating process. The results in this section do not contradict those by Richardson and Spirtes (2002, p. 1025), because they did not consider deterministic nodes while we do (recall that the error nodes are deterministic).

Finally, it is also worth mentioning that EAMP CGs are not the first graphical models to have DAGs as departure point. Specifically, summary graphs (Cox and Wermuth, 1996), MC graphs (Koster, 2002), ancestral graphs (Richardson and Spirtes, 2002), and ribbonless graphs (Sadeghi, 2013) predate EAMP CGs and have the mentioned property. However, none of these other classes of graphical models subsumes AMP CGs, i.e. there are independence models that can be represented by an AMP CG but not by any member of the other class (Sadeghi and Lauritzen, 2012, Section 4). Therefore, none of these other classes of graphical models subsumes EAMP CGs under marginalization of the error nodes.

3.1. AMP and LWF CGs with Deterministic Nodes.

We say that a node A of an AMP or LWF CG is determined by some Z ⊆ V when A ∈ Z or A is a function of Z. In that case, we also say that A is a deterministic node. We use D(Z) to denote all the nodes that are determined by Z. From the point of view of the separations in an AMP or LWF CG, that a node is determined by but is not in the conditioning set of a separation has the same effect as if the node were actually in the conditioning set. We extend the definitions of separation for AMP and LWF CGs to the case where deterministic nodes may exist.

Given an AMP CG G, a path ρ in G is said to be Z-open when
● every triplex node in ρ is in D(Z) ∪ san_G(D(Z)), and
● no non-triplex node B in ρ is in D(Z), unless A − B − C is a subpath of ρ and pa_G(B) ∖ D(Z) ≠ ∅.

Given an LWF CG G, a route ρ in G is said to be Z-open when
● every collider section of ρ has a node in D(Z), and
● no non-collider section of ρ has a node in D(Z).

It should be noted that we are not the first to consider models based on graphs with deterministic nodes. For instance, Geiger et al. (1990, Section 4) consider DAGs with deterministic nodes. However, our definition of deterministic node is more general than theirs.

3.2. From AMP CGs to DAGs Via EAMP CGs.

Andersson et al. (2001, Section 5) show that any regular Gaussian probability distribution p that is Markovian wrt an AMP CG G can be expressed as a system of linear equations with correlated errors whose structure depends on G. Specifically, assume without loss of generality that p has mean 0. Let K_i denote any connectivity component of G. Let Ω^i_{K_i,K_i} and Ω^i_{K_i,pa_G(K_i)} denote submatrices of the precision matrix Ω^i of p(K_i, pa_G(K_i)). Then, as shown by Bishop (2006, Section 2.3.1),

    K_i ∣ pa_G(K_i) ∼ N(β^i pa_G(K_i), Λ^i)

where

    β^i = −(Ω^i_{K_i,K_i})^{−1} Ω^i_{K_i,pa_G(K_i)}

and (Λ^i)^{−1} = Ω^i_{K_i,K_i}. Then, p can be expressed as a system of linear equations with normally distributed errors whose structure depends on G as follows:

    K_i = β^i pa_G(K_i) + ε^i

where ε^i ∼ N(0, Λ^i).

Note that for all A, B ∈ K_i st A − B is not in G, A ⊥_G B ∣ pa_G(K_i) ∪ K_i ∖ A ∖ B and thus ((Λ^i)^{−1})_{A,B} = 0 (Lauritzen, 1996, Proposition 5.2). Note also that for all A ∈ K_i and B ∈ pa_G(K_i) st A ← B is not in G, A ⊥_G B ∣ pa_G(A) and thus (β^i)_{A,B} = 0.
Let β_A contain the nonzero elements of the vector (β^i)_{A,●}. Then, p can be expressed as a system of linear equations with correlated errors whose structure depends on G as follows. For any A ∈ K_i,

    A = β_A pa_G(A) + ε_A

and for any other B ∈ K_i,

    covariance(ε_A, ε_B) = Λ^i_{A,B}.

It is worth mentioning that the mapping above between probability distributions and systems of linear equations is bijective (Andersson et al., 2001, Section 5). Note that no nodes in G correspond to the errors ε_A. Therefore, G represents the errors implicitly. We propose to represent them explicitly. This can easily be done by transforming G into what we call an EAMP CG G′ as follows:

    Let G′ = G
    For each node A in G
        Add the node ε_A to G′
        Add the edge ε_A → A to G′
    For each edge A − B in G
        Add the edge ε_A − ε_B to G′
        Remove the edge A − B from G′

The transformation above basically consists in adding the error nodes ε_A to G and connecting them appropriately. Figure 1 shows an example. Note that every node A ∈ V is determined by pa_{G′}(A) and, what will be more important, that ε_A is determined by pa_{G′}(A) ∖ ε_A ∪ A. Thus, the existence of deterministic nodes imposes independencies which do not correspond to separations in G. Note also that, given Z ⊆ V, a node A ∈ V is determined by Z iff A ∈ Z. The if part is trivial. To see the only if part, note that ε_A ∉ Z and thus A cannot be determined by Z unless A ∈ Z. Therefore, a node ε_A in G′ is determined by Z iff pa_{G′}(A) ∖ ε_A ∪ A ⊆ Z because, as shown, there is no other way for Z to determine pa_{G′}(A) ∖ ε_A ∪ A which, in turn, determine ε_A. Let ε denote all the error nodes in G′. Note that we have not yet given a formal definition of EAMP CGs. We define them as all the graphs resulting from applying the pseudocode above to an AMP CG.
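The transformation from an AMP CG G to its EAMP CG G′ can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a graph is stored as a set of node names plus a set of triples (a, kind, b) with kind "->" for a → b and "-" for a − b, and the error node of A is named "eps_A". The function name to_eamp, the naming scheme and the example graph are hypothetical and are not taken from the paper.

```python
def eps(node):
    """Name of the error node attached to `node` (illustrative convention)."""
    return f"eps_{node}"


def to_eamp(nodes, edges):
    """Return (nodes, edges) of the EAMP CG built from an AMP CG.

    Every node A gets an error node eps_A with the edge eps_A -> A, and
    every undirected edge A - B is replaced by eps_A - eps_B.
    """
    new_nodes = set(nodes) | {eps(a) for a in nodes}
    new_edges = {(eps(a), "->", a) for a in nodes}
    for a, kind, b in edges:
        if kind == "-":
            # The undirected edge moves from the nodes to their error nodes.
            new_edges.add((eps(a), "-", eps(b)))
        else:
            # Directed edges are kept unchanged.
            new_edges.add((a, kind, b))
    return new_nodes, new_edges


# Hypothetical AMP CG: A -> B, B - C.
g_nodes = {"A", "B", "C"}
g_edges = {("A", "->", "B"), ("B", "-", "C")}
e_nodes, e_edges = to_eamp(g_nodes, g_edges)
for edge in sorted(e_edges):
    print(edge)
```

The sketch builds G′ in a fresh pair of sets instead of modifying G in place, so the original graph stays available for comparison.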
It is easy to see that every EAMP CG is an AMP CG over V ∪ ε and, thus, its semantics are defined. The following theorem confirms that these semantics are as desired. The formal proofs of our results appear in the appendix at the end of the paper.

Theorem 1. I_AMP(G) = [I_AMP(G′)]^∅_ε.

Theorem 2. Assume that G′ has the same deterministic relationships no matter whether it is interpreted as an AMP or LWF CG. Then, I_AMP(G′) = I_LWF(G′).

[Figure 1: three panels, G, G′ and G′′, over the nodes A, B, C, D, E and F. G′ adds the error nodes ε_A, ..., ε_F, and G′′ adds the selection nodes S_{ε_C ε_D}, S_{ε_C ε_E}, S_{ε_D ε_F} and S_{ε_E ε_F}.]
Figure 1. Example of the different transformations for AMP CGs.

The following corollary links the two most popular interpretations of CGs. Specifically, it shows that every AMP CG is Markov equivalent to some LWF CG with deterministic nodes under marginalization. The corollary follows from Theorems 1 and 2.

Corollary 1. I_AMP(G) = [I_LWF(G′)]^∅_ε.

Now, let G′′ denote the DAG obtained from G′ by replacing every edge ε_A − ε_B in G′ with ε_A → S_{ε_A ε_B} ← ε_B. Figure 1 shows an example. The nodes S_{ε_A ε_B} are called selection nodes. Let S denote all the selection nodes in G′′. The following theorem relates the semantics of G′ and G′′.

Theorem 3. Assume that G′ and G′′ have the same deterministic relationships. Then, I_LWF(G′) = [I(G′′)]^S_∅.

The main result of this section is the following corollary, which shows that every AMP CG is Markov equivalent to some DAG with deterministic nodes under marginalization and conditioning. The corollary follows from Corollary 1 and Theorem 3.

Corollary 2. I_AMP(G) = [I(G′′)]^S_ε.

4. Marginal AMP CGs

In this section, we present the main contribution of this paper, namely a new family of graphical models that unify and generalize AMP and MVR CGs.
Specifically, a graph G containing possibly directed, bidirected and undirected edges is a marginal AMP (MAMP) CG if
C1. G has no semidirected cycle,
C2. G has no cycle V_1, ..., V_n = V_1 st V_1 ↔ V_2 is in G and V_i − V_{i+1} is in G for all 1 < i < n, and
C3. if V_1 − V_2 − V_3 is in G and sp_G(V_2) ≠ ∅, then V_1 − V_3 is in G too.

A set of nodes of a MAMP CG G is undirectly connected if there exists a path in G between every pair of nodes in the set st all the edges in the path are undirected. An undirected connectivity component of G is a maximal undirectly connected set. We denote by uc_G(A) the undirected connectivity component a node A of G belongs to.

The semantics of MAMP CGs is as follows. A node B in a path ρ in a MAMP CG G is called a triplex node in ρ if A ⊸→ B ←⊸ C, A ⊸→ B − C, or A − B ←⊸ C is a subpath of ρ. Moreover, ρ is said to be Z-open with Z ⊆ V when
● every triplex node in ρ is in Z ∪ san_G(Z), and
● every non-triplex node B in ρ is outside Z, unless A − B − C is a subpath of ρ and sp_G(B) ≠ ∅ or pa_G(B) ∖ Z ≠ ∅.

Let X, Y and Z denote three disjoint subsets of V. When there is no Z-open path in G between a node in X and a node in Y, we say that X is separated from Y given Z in G and denote it as X ⊥_G Y ∣ Z. We denote by X ⊥̸_G Y ∣ Z that X ⊥_G Y ∣ Z does not hold. Likewise, we denote by X ⊥_p Y ∣ Z (respectively X ⊥̸_p Y ∣ Z) that X is independent (respectively dependent) of Y given Z in a probability distribution p. The independence model represented by G, denoted as I(G), is the set of separation statements X ⊥_G Y ∣ Z. We say that p is Markovian wrt G when X ⊥_p Y ∣ Z if X ⊥_G Y ∣ Z for all X, Y and Z disjoint subsets of V. Moreover, we say that p is faithful to G when X ⊥_p Y ∣ Z iff X ⊥_G Y ∣ Z for all X, Y and Z disjoint subsets of V.

Note that if a MAMP CG G has a path V_1 − V_2 − ... − V_n st sp_G(V_i) ≠ ∅ for all 1 < i < n, then V_1 − V_n must be in G. Therefore, the independence model represented by a MAMP CG is the same whether we use the definition of Z-open path above or the following simpler one. A path ρ in a MAMP CG G is said to be Z-open when
● every triplex node in ρ is in Z ∪ san_G(Z), and
● every non-triplex node B in ρ is outside Z, unless A − B − C is a subpath of ρ and pa_G(B) ∖ Z ≠ ∅.

The motivation behind the three constraints in the definition of MAMP CGs is as follows. The constraint C1 follows from the semidirected acyclicity constraint of AMP and MVR CGs. For the constraints C2 and C3, note that typically every missing edge in the graph of a graphical model corresponds to a separation. However, this may not be true for graphs that do not satisfy the constraints C2 and C3. For instance, the graph G below does not contain any edge between B and D but B ⊥̸_G D ∣ Z for all Z ⊆ V ∖ {B, D}. Likewise, G does not contain any edge between A and E but A ⊥̸_G E ∣ Z for all Z ⊆ V ∖ {A, E}.

[Figure: graph G over the nodes A, B, C, D, E and F.]

Since the situation above is counterintuitive, we enforce the constraints C2 and C3. Theorem 5 below shows that every missing edge in a MAMP CG corresponds to a separation.

Note that AMP and MVR CGs are special cases of MAMP CGs. However, MAMP CGs are a proper generalization of AMP and MVR CGs, as there are independence models that can be represented by the former but not by the two latter. An example follows (we postpone the proof that it cannot be represented by any AMP or MVR CG until after Theorem 7).

[Figure: MAMP CG over the nodes A, B, C, D and E.]

Given a MAMP CG G, let Ĝ denote the AMP CG obtained by replacing every bidirected edge A ↔ B in G with A ← L_AB → B. Note that G and Ĝ represent the same separations over V.
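The replacement of every bidirected edge A ↔ B with A ← L_AB → B can be sketched in Python as follows. The sketch assumes that a graph is stored as a set of node names plus a set of triples (a, kind, b), with kind "->", "-" or "<->"; the function name to_amp, the latent node naming "L_AB" (endpoints in sorted order) and the example graph are illustrative assumptions, not part of the paper.

```python
def to_amp(nodes, edges):
    """Return (nodes, edges) of the AMP CG obtained from a MAMP CG.

    Every bidirected edge a <-> b is replaced by a <- L_ab -> b, where
    L_ab is a new node; all other edges are kept unchanged.
    """
    new_nodes, new_edges = set(nodes), set()
    for a, kind, b in edges:
        if kind == "<->":
            first, second = sorted((a, b))
            latent = f"L_{first}{second}"  # one new node per bidirected edge
            new_nodes.add(latent)
            new_edges.add((latent, "->", first))
            new_edges.add((latent, "->", second))
        else:
            new_edges.add((a, kind, b))
    return new_nodes, new_edges


# Hypothetical MAMP CG: A <-> B, B - C.
m_nodes = {"A", "B", "C"}
m_edges = {("A", "<->", "B"), ("B", "-", "C")}
a_nodes, a_edges = to_amp(m_nodes, m_edges)
for edge in sorted(a_edges):
    print(edge)
```

Marginalizing out the added L nodes from the result is what recovers the separations of the original graph over V, which is the sense in which a MAMP CG is a marginal of an AMP CG.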
Therefore, every MAMP CG can be seen as the result of marginalizing out some nodes in an AMP CG, hence the name. Furthermore, Corollary 2 shows that every AMP CG can be seen as the result of marginalizing out and conditioning on some nodes in a DAG. Consequently, every MAMP CG can also be seen as the result of marginalizing out and conditioning on some nodes in a DAG. Therefore, the independence model represented by a MAMP CG can be accounted for by some data generating process that is partially observed and has selection bias. This implies that the independence models represented by MAMP CGs are not arbitrary and, thus, MAMP CGs are worth studying. The theorem below provides another way to see that the independence models represented by MAMP CGs are not arbitrary. Specifically, it shows that each of them coincides with the independence model of some probability distribution.

Theorem 4. For any MAMP CG G, there exists a regular Gaussian probability distribution p that is faithful to G.

Corollary 3. Any independence model represented by a MAMP CG is a compositional graphoid that satisfies weak transitivity.

Finally, we show below that the independence model represented by a MAMP CG coincides with certain closure of certain separations. This is interesting because it implies that a few separations and rules to combine them characterize all the separations represented by a MAMP CG. Moreover, it also implies that we have a simple graphical criterion to decide whether a given separation is or is not in the closure without having to find a derivation of it, which is usually a tedious task.
Specifically, we define the pairwise separation base of a MAMP CG G as the separations
● A ⊥ B ∣ pa_G(A) for all A, B ∈ V st A ∉ ad_G(B) and B ∉ de_G(A),
● A ⊥ B ∣ ne_G(A) ∪ pa_G(A ∪ ne_G(A)) for all A, B ∈ V st A ∉ ad_G(B), A ∈ de_G(B), B ∈ de_G(A) and uc_G(A) = uc_G(B), and
● A ⊥ B ∣ pa_G(A) for all A, B ∈ V st A ∉ ad_G(B), A ∈ de_G(B), B ∈ de_G(A) and uc_G(A) ≠ uc_G(B).

We define the compositional graphoid closure of the pairwise separation base of G, denoted as cl(G), as the set of separations that are in the base plus those that can be derived from it by applying the compositional graphoid properties. We denote the separations in cl(G) as X ⊥_cl(G) Y ∣ Z.

Theorem 5. For any MAMP CG G, if X ⊥_cl(G) Y ∣ Z then X ⊥_G Y ∣ Z.

Theorem 6. For any MAMP CG G, if X ⊥_G Y ∣ Z then X ⊥_cl(G) Y ∣ Z.

4.1. Markov Equivalence.

We say that two MAMP CGs are Markov equivalent if they represent the same independence model. In a MAMP CG, a triplex ({A, C}, B) is an induced subgraph of the form A ⊸→ B ←⊸ C, A ⊸→ B − C, or A − B ←⊸ C. We say that two MAMP CGs are triplex equivalent if they have the same adjacencies and the same triplexes.

Theorem 7. Two MAMP CGs are Markov equivalent iff they are triplex equivalent.

We mentioned in the previous section that MAMP CGs are a proper generalization of AMP and MVR CGs, as there are independence models that can be represented by the former but not by the two latter. Moreover, we gave an example and postponed the proof. With the help of Theorem 7, we can now give the proof.

Example 1. The independence model represented by the MAMP CG G below cannot be represented by any AMP or MVR CG.

[Figure: MAMP CG G over the nodes A, B, C, D and E.]

To see it, assume to the contrary that it can be represented by an AMP CG H. Note that H is a MAMP CG too. Then, G and H must have the same triplexes by Theorem 7.
Then, H must have triplexes ({A, D}, B) and ({A, C}, B) but no triplex ({C, D}, B). So, C − B − D must be in H. Moreover, H must have a triplex ({B, E}, C). So, C ← E must be in H. However, this implies that H does not have a triplex ({C, D}, E), which is a contradiction because G has such a triplex. To see that no MVR CG can represent the independence model represented by G, simply note that no MVR CG can have triplexes ({A, D}, B) and ({A, C}, B) but no triplex ({C, D}, B).

We end this section with two lemmas that identify some interesting distinguished members of a triplex equivalence class of MAMP CGs. We say that two nodes form a directed node pair if there is a directed edge between them.

Lemma 1. For every triplex equivalence class of MAMP CGs, there is a unique maximal set of directed node pairs st some CG in the class has exactly those directed node pairs.

A MAMP CG is a maximally directed CG (MDCG) if it has exactly the maximal set of directed node pairs corresponding to its triplex equivalence class. Note that there may be several MDCGs in the class. For instance, the triplex equivalence class that contains the MAMP CG A → B has two MDCGs (i.e. A → B and A ← B).

Lemma 2. For every triplex equivalence class of MDCGs, there is a unique maximal set of bidirected edges st some MDCG in the class has exactly those bidirected edges.

A MDCG is a maximally bidirected MDCG (MBMDCG) if it has exactly the maximal set of bidirected edges corresponding to its triplex equivalence class. Note that there may be several MBMDCGs in the class. For instance, the triplex equivalence class that contains the MAMP CG A → B has two MBMDCGs (i.e. A → B and A ← B). Note however that all the MBMDCGs in a triplex equivalence class have the same triplex edges, i.e. the edges in a triplex.

5. Error MAMP CGs

Unfortunately, MAMP CGs are not closed under marginalization, meaning that the independence model resulting from marginalizing out some nodes in a MAMP CG may not be representable by any MAMP CG. An example follows.

Example 2. The independence model resulting from marginalizing out E and I in the MAMP CG G below cannot be represented by any MAMP CG.

[Figure: MAMP CG G over the nodes A, B, C, D, E, F, I, J and K.]

To see it, assume to the contrary that it can be represented by a MAMP CG H. Note that C and D must be adjacent in H, because C ⊥̸_G D ∣ Z for all Z ⊆ {A, B, F, J, K}. Similarly, D and F must be adjacent in H. However, H cannot have a triplex ({C, F}, D) because C ⊥_G F ∣ A ∪ D. Moreover, C ← D cannot be in H because A ⊥_G C, and D → F cannot be in H because A ⊥_G F. Then, C − D − F must be in H. Following an analogous reasoning, we can conclude that F − J − K must be in H. However, this contradicts that D ⊥_G J.

A solution to the problem above is to represent the marginal model by a MAMP CG with extra edges so as to avoid representing false independencies. This, of course, has two undesirable consequences: Some true independencies may not be represented, and the complexity of the CG increases. See (Richardson and Spirtes, 2002, p. 965) for a discussion on the importance of the class of models considered being closed under marginalization. In this section, we propose an alternative solution to this problem: Much like we did in Section 3 with AMP CGs, we modify MAMP CGs into what we call EMAMP CGs, and show that the latter are closed under marginalization.²

² The reader may think that parts of this section are repetition of Section 3 and, thus, that both sections should be unified. However, we think that this would harm readability.

5.1. MAMP CGs with Deterministic Nodes.
We say that a node $A$ of a MAMP CG is determined by some $Z \subseteq V$ when $A \in Z$ or $A$ is a function of $Z$. In that case, we also say that $A$ is a deterministic node. We use $D(Z)$ to denote all the nodes that are determined by $Z$. From the point of view of the separations in a MAMP CG, that a node is determined by but is not in the conditioning set of a separation has the same effect as if the node were actually in the conditioning set. We extend the definition of separation for MAMP CGs to the case where deterministic nodes may exist. Given a MAMP CG $G$, a path $\rho$ in $G$ is said to be $Z$-open when
● every triplex node in $\rho$ is in $D(Z) \cup san_G(D(Z))$, and
● no non-triplex node $B$ in $\rho$ is in $D(Z)$, unless $A - B - C$ is a subpath of $\rho$ and $pa_G(B) \setminus D(Z) \neq \emptyset$.

5.2. From MAMP CGs to EMAMP CGs.

Andersson et al. (2001, Section 5) and Kang and Tian (2009, Section 2) show that any regular Gaussian probability distribution that is Markovian wrt an AMP or MVR CG $G$ can be expressed as a system of linear equations with correlated errors whose structure depends on $G$. As we show below, these two works can easily be combined to obtain a similar result for MAMP CGs.

Let $p$ denote any regular Gaussian distribution that is Markovian wrt a MAMP CG $G$. Assume without loss of generality that $p$ has mean 0. Let $K_i$ denote any connectivity component of $G$. Let $\Omega^i_{K_i,K_i}$ and $\Omega^i_{K_i,pa_G(K_i)}$ denote submatrices of the precision matrix $\Omega^i$ of $p(K_i, pa_G(K_i))$. Then, as shown by Bishop (2006, Section 2.3.1),
$$K_i \mid pa_G(K_i) \sim \mathcal{N}(\beta^i pa_G(K_i), \Lambda^i)$$
where
$$\beta^i = -(\Omega^i_{K_i,K_i})^{-1} \Omega^i_{K_i,pa_G(K_i)}$$
and
$$(\Lambda^i)^{-1} = \Omega^i_{K_i,K_i}.$$
Then, $p$ can be expressed as a system of linear equations with normally distributed errors whose structure depends on $G$ as follows:
$$K_i = \beta^i pa_G(K_i) + \epsilon^i$$
where
$$\epsilon^i \sim \mathcal{N}(0, \Lambda^i).$$
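As an illustrative aside, the two identities above can be checked on a small example. The sketch below is not part of the original development: the helper names (`inverse`, `matmul`, `sub`, `regression_parameters`) and the $3 \times 3$ covariance matrix are assumptions made only for illustration. It computes $\beta^i$ and $\Lambda^i$ from the precision matrix, using exact rational arithmetic, and compares them with the usual covariance-based expressions for a conditional Gaussian.

```python
from fractions import Fraction

def inverse(a):
    """Gauss-Jordan inverse of a nonsingular square matrix, exact with Fractions."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sub(a, rows, cols):
    """Submatrix of a with the given row and column indices."""
    return [[a[r][c] for c in cols] for r in rows]

def regression_parameters(omega, k_idx, pa_idx):
    """Return (beta, lam) with beta = -inv(Omega_KK) Omega_K,pa and lam = inv(Omega_KK)."""
    lam = inverse(sub(omega, k_idx, k_idx))
    beta = [[-x for x in row] for row in matmul(lam, sub(omega, k_idx, pa_idx))]
    return beta, lam

# Hypothetical covariance of (K_1, K_2, P): a component K = {K_1, K_2} with one parent P.
sigma = [[4, 1, 2], [1, 3, 1], [2, 1, 2]]
k_idx, pa_idx = [0, 1], [2]

omega = inverse(sigma)  # precision matrix of p(K, pa(K))
beta, lam = regression_parameters(omega, k_idx, pa_idx)

# Covariance-based expressions for the same conditional distribution.
beta_cov = matmul(sub(sigma, k_idx, pa_idx), inverse(sub(sigma, pa_idx, pa_idx)))
correction = matmul(beta_cov, sub(sigma, pa_idx, k_idx))
lam_cov = [[s - c for s, c in zip(rs, rc)]
           for rs, rc in zip(sub(sigma, k_idx, k_idx), correction)]

print(beta == beta_cov, lam == lam_cov)
```

In this toy example the off-diagonal entry of $\Lambda^i$ happens to be zero, which is the kind of structural zero that the missing edges of $G$ impose on the system.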
Note that for all $A, B \in K_i$ st $uc_G(A) = uc_G(B)$ and $A - B$ is not in $G$, $A \perp_G B \mid pa_G(K_i) \cup uc_G(A) \setminus A \setminus B$ and thus $((\Lambda^i_{uc_G(A),uc_G(A)})^{-1})_{A,B} = 0$ (Lauritzen, 1996, Proposition 5.2). Note also that for all $A, B \in K_i$ st $uc_G(A) \neq uc_G(B)$ and $A \leftrightarrow B$ is not in $G$, $A \perp_G B \mid pa_G(K_i)$ and thus $\Lambda^i_{A,B} = 0$. Finally, note also that for all $A \in K_i$ and $B \in pa_G(K_i)$ st $A \leftarrow B$ is not in $G$, $A \perp_G B \mid pa_G(A)$ and thus $(\beta^i)_{A,B} = 0$. Let $\beta_A$ contain the nonzero elements of the vector $(\beta^i)_{A,\bullet}$. Then, $p$ can be expressed as a system of linear equations with correlated errors whose structure depends on $G$ as follows. For any $A \in K_i$,
$$A = \beta_A pa_G(A) + \epsilon_A$$
and for any other $B \in K_i$,
$$covariance(\epsilon_A, \epsilon_B) = \Lambda^i_{A,B}.$$

It is worth mentioning that the mapping above between probability distributions and systems of linear equations is bijective. We omit the proof of this fact because it is unimportant in this work, but it can be proven much in the same way as Lemma 1 in Peña (2011).

[Figure 2. Example of the different transformations for MAMP CGs. Panels: $G$, $G'$ and $[G']_{\{A,B,F\}}$.]

Note that each equation in the system of linear equations above is a univariate recursive regression, i.e. a random variable can be a regressor in an equation only if it has been the regressand in a previous equation. This has two main advantages, as Cox and Wermuth (1993, p. 207) explain: "First, and most importantly, it describes a stepwise process by which the observations could have been generated and in this sense may prove the basis for developing potential causal explanations.
Second, each parameter in the system [of linear equations] has a well-understood meaning since it is a regression coefficient: That is, it gives for unstandardized variables the amount by which the response is expected to change if the explanatory variable is increased by one unit and all other variables in the equation are kept constant." Therefore, a MAMP CG can be seen as a data generating process and, thus, it gives us insight into the system under study.

Note that no nodes in $G$ correspond to the errors $\epsilon_A$. Therefore, $G$ represents the errors implicitly. We propose to represent them explicitly. This can easily be done by transforming $G$ into what we call an EMAMP CG $G'$ as follows, where $A \leftrightharpoons B$ means $A \leftrightarrow B$ or $A - B$:

1 Let $G' = G$
2 For each node $A$ in $G$
3   Add the node $\epsilon_A$ to $G'$
4   Add the edge $\epsilon_A \to A$ to $G'$
5 For each edge $A \leftrightharpoons B$ in $G$
6   Add the edge $\epsilon_A \leftrightharpoons \epsilon_B$ to $G'$
7   Remove the edge $A \leftrightharpoons B$ from $G'$

The transformation above basically consists in adding the error nodes $\epsilon_A$ to $G$ and connecting them appropriately. Figure 2 shows an example. Note that every node $A \in V$ is determined by $pa_{G'}(A)$ and, what will be more important, that $\epsilon_A$ is determined by $pa_{G'}(A) \setminus \epsilon_A \cup A$. Thus, the existence of deterministic nodes imposes independencies which do not correspond to separations in $G$. Note also that, given $Z \subseteq V$, a node $A \in V$ is determined by $Z$ iff $A \in Z$. The if part is trivial. To see the only if part, note that $\epsilon_A \notin Z$ and thus $A$ cannot be determined by $Z$ unless $A \in Z$. Therefore, a node $\epsilon_A$ in $G'$ is determined by $Z$ iff $pa_{G'}(A) \setminus \epsilon_A \cup A \subseteq Z$ because, as shown, there is no other way for $Z$ to determine $pa_{G'}(A) \setminus \epsilon_A \cup A$ which, in turn, determine $\epsilon_A$. Let $\epsilon$ denote all the error nodes in $G'$. It is easy to see that $G'$ is a MAMP CG over $V \cup \epsilon$ and, thus, its semantics are defined. The following theorem confirms that these semantics are as desired.

Theorem 8. $I(G) = [I(G')]^{\emptyset}_{\epsilon}$.

5.3. EMAMP CGs Are Closed under Marginalization.

Finally, we show that EMAMP CGs are closed under marginalization, meaning that for any EMAMP CG $G'$ and $L \subseteq V$ there is an EMAMP CG $[G']_L$ st $[I(G')]_{L \cup \epsilon} = [I([G']_L)]_{\epsilon}$. We actually show how to transform $G'$ into $[G']_L$. Note that our definition of closed under marginalization is an adaptation of the standard one to the fact that we only care about independence models under marginalization of the error nodes.

[Figure 3. Subfamilies of MAMP CGs. Labels: MAMP CGs, RCGs, AMP CGs, MVR CGs, Markov networks, covariance graphs, Bayesian networks.]

To gain some intuition into the problem and our solution to it, assume that $L$ contains a single node $B$. Then, marginalizing out $B$ from the system of linear equations associated with $G$ implies the following: For every $C$ st $B \in pa_G(C)$, modify the equation $C = \beta_C pa_G(C) + \epsilon_C$ by replacing $B$ with the right-hand side of its corresponding equation, i.e. $\beta_B pa_G(B) + \epsilon_B$ and, then, remove the equation $B = \beta_B pa_G(B) + \epsilon_B$ from the system. In graphical terms, this corresponds to $C$ inheriting the parents of $B$ in $G'$ and, then, removing $B$ from $G'$. The following pseudocode formalizes this idea for any $L \subseteq V$.

1 Let $[G']_L = G'$
2 Repeat until all the nodes in $L$ have been considered
3   Let $B$ denote any node in $L$ that has not been considered before
4   For each pair of edges $A \to B$ and $B \to C$ in $[G']_L$ with $A, C \in V \cup \epsilon$
5     Add the edge $A \to C$ to $[G']_L$
6   Remove $B$ and all the edges it participates in from $[G']_L$

Note that the result of the pseudocode above is the same no matter the ordering in which the nodes in $L$ are selected in line 3. Note also that we have not yet given a formal definition of EMAMP CGs.
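The two pseudocodes of this section can be sketched in Python as follows. This is only a minimal illustration under assumptions of our own: the dictionary-of-edge-sets graph representation, the function names `to_emamp` and `marginalize`, the encoding of $\epsilon_A$ as the tuple `("eps", A)`, and the small example graph are all hypothetical and do not come from the paper.

```python
def to_emamp(graph):
    """First pseudocode: add an error node ("eps", A) with edge eps_A -> A for every
    node A, and move every undirected or bidirected edge onto the error nodes."""
    eps = {a: ("eps", a) for a in graph["nodes"]}
    return {
        "nodes": set(graph["nodes"]) | set(eps.values()),
        "directed": set(graph["directed"]) | {(eps[a], a) for a in graph["nodes"]},
        "undirected": {frozenset(eps[a] for a in e) for e in graph["undirected"]},
        "bidirected": {frozenset(eps[a] for a in e) for e in graph["bidirected"]},
    }

def marginalize(graph, nodes_out):
    """Second pseudocode: for each B in nodes_out, let the children of B inherit
    the parents of B, then remove B and all the edges it participates in."""
    g = {key: set(val) for key, val in graph.items()}  # work on a copy
    for b in nodes_out:
        parents = {a for (a, c) in g["directed"] if c == b}
        children = {c for (a, c) in g["directed"] if a == b}
        g["directed"] |= {(a, c) for a in parents for c in children}
        g["directed"] = {(a, c) for (a, c) in g["directed"] if b not in (a, c)}
        g["undirected"] = {e for e in g["undirected"] if b not in e}
        g["bidirected"] = {e for e in g["bidirected"] if b not in e}
        g["nodes"].discard(b)
    return g

# Hypothetical MAMP CG: A -> B -> C -> D together with B - E.
g = {
    "nodes": {"A", "B", "C", "D", "E"},
    "directed": {("A", "B"), ("B", "C"), ("C", "D")},  # (X, Y) encodes X -> Y
    "undirected": {frozenset({"B", "E"})},
    "bidirected": set(),
}
g_prime = to_emamp(g)                 # the EMAMP CG G'
marg_b = marginalize(g_prime, ["B"])  # [G']_{B}
print(sorted(marg_b["directed"], key=str))
```

After marginalizing out $B$, the node $C$ has inherited the parents $A$ and $\epsilon_B$ of $B$, while the edge $\epsilon_B - \epsilon_E$ is untouched. In this sketch, applying `marginalize` to $B$ and then $C$ gives the same graph as applying it to $C$ and then $B$, in line with the remark on line 3 of the pseudocode.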
We define them recursively as all the graphs resulting from applying the first pseudocode in this section to a MAMP CG, plus all the graphs resulting from applying the second pseudocode in this section to an EMAMP CG. It is easy to see that every EMAMP CG is a MAMP CG over $W \cup \epsilon$ with $W \subseteq V$ and, thus, its semantics are defined. Theorem 8 together with the following theorem confirm that these semantics are as desired.

Theorem 9. $[I(G')]_{L \cup \epsilon} = [I([G']_L)]_{\epsilon}$.

6. Discussion

In this paper we have introduced MAMP CGs, a new family of graphical models that unifies and generalizes AMP and MVR CGs. We have described global and pairwise Markov properties for them and proved their equivalence for compositional graphoids. We have shown that every MAMP CG is Markov equivalent to some DAG with deterministic nodes under marginalization and conditioning on some of its nodes. Therefore, the independence model represented by a MAMP CG can be accounted for by some data generating process that is partially observed and has selection bias. We have also characterized when two MAMP CGs are Markov equivalent. We conjecture that every Markov equivalence class of MAMP CGs has a distinguished member. We are currently working on this question. It is worth mentioning that such a result has been proven for AMP CGs (Roverato and Studený, 2006). Finally, we have modified MAMP CGs so that they are closed under marginalization. This is a desirable feature because it guarantees parsimonious models under marginalization. We are currently studying how to modify MAMP CGs so that they are closed under conditioning too. We are also working on a constraint based algorithm for learning a MAMP CG a given probability distribution is faithful to.
The idea is to combine the learning algorithms that we have recently proposed for AMP CGs (Peña, 2012) and MVR CGs (Sonntag and Peña, 2012).

We believe that the most natural way to generalize AMP and MVR CGs is by allowing undirected, directed and bidirected edges. However, we are not the first to introduce a family of models that is based on graphs that may contain these three types of edges. In the rest of this section, we review some works that have done it before us, and explain how our work differs from them.

Cox and Wermuth (1993, 1996) introduced regression CGs (RCGs) to generalize MVR CGs by allowing them to have also undirected edges. The separation criterion for RCGs is identical to that of MVR CGs. Then, there are independence models that can be represented by MAMP CGs but that cannot be represented by RCGs, because RCGs generalize MVR CGs but not AMP CGs. An example follows.

Example 3. The independence model represented by the AMP CG $G$ below cannot be represented by any RCG.

[Figure: the AMP CG $G$ over the nodes $A, B, C, D$.]

To see it, assume to the contrary that it can be represented by a RCG $H$. Note that $H$ is a MAMP CG too. Then, $G$ and $H$ must have the same triplexes by Theorem 7. Then, $H$ must have triplexes $(\{A,B\},C)$ and $(\{A,D\},C)$ but no triplex $(\{B,D\},C)$. So, $B \mathrel{\circ\!\!-\!\!\circ} C \to D$, $B \mathrel{\circ\!\!-\!\!\circ} C - D$, $B \leftarrow C \mathrel{\circ\!\!-\!\!\circ} D$ or $B - C \mathrel{\circ\!\!-\!\!\circ} D$ must be in $H$. However, this implies that $H$ does not have the triplex $(\{A,B\},C)$ or $(\{A,D\},C)$, which is a contradiction.

It is worth mentioning that, although RCGs can have undirected edges, they cannot have a subgraph of the form $A \mathrel{\circ\!\!\rightarrow} B - C$. Therefore, RCGs are a subfamily of MAMP CGs. Figure 3 depicts this and other subfamilies of MAMP CGs.

Another family of models that is based on graphs that may contain undirected, directed and bidirected edges is maximal ancestral graphs (MAGs) (Richardson and Spirtes, 2002).
Although MAGs can have undirected edges, they must comply with certain topological constraints. The separation criterion for MAGs is identical to that of MVR CGs. Therefore, the example above also serves to illustrate that MAGs generalize MVR CGs but not AMP CGs, as MAMP CGs do. See also (Richardson and Spirtes, 2002, p. 1025). Therefore, MAMP CGs are not a subfamily of MAGs. The following example shows that MAGs are not a subfamily of MAMP CGs either.

Example 4. The independence model represented by the MAG $G$ below cannot be represented by any MAMP CG.

[Figure: the MAG $G$ over the nodes $A, B, C, D$.]

To see it, assume to the contrary that it can be represented by a MAMP CG $H$. Obviously, $G$ and $H$ must have the same adjacencies. Then, $H$ must have a triplex $(\{A,C\},B)$ because $A \perp_G C$, but it cannot have a triplex $(\{A,D\},B)$ because $A \perp_G D \mid B$. This is possible only if the edge $A \leftarrow B$ is not in $H$. Then, $H$ must have one of the following induced subgraphs:

[Figure: five candidate induced subgraphs over the nodes $A, B, C, D$.]

However, the first and second cases are impossible because $A \perp_H D \mid B \cup C$ whereas $A \not\perp_G D \mid B \cup C$. The third case is impossible because it does not satisfy the constraint C1. In the fourth case, note that $C \leftrightarrow B - D$ cannot be in $H$ because, otherwise, it does not satisfy the constraint C1. Then, the fourth case is impossible because $A \perp_H D \mid B \cup C$ whereas $A \not\perp_G D \mid B \cup C$. Finally, the fifth case is also impossible because it does not satisfy the constraint C1 or C2.

It is worth mentioning that the models represented by AMP and MVR CGs are smooth, i.e. they are curved exponential families, for Gaussian probability distributions. However, only the models represented by MVR CGs are smooth for discrete probability distributions. The models represented by MAGs are smooth in the Gaussian and discrete cases.
See Drton (2009) and Evans and Richardson (2013).

Finally, three other families of models that are based on graphs that may contain undirected, directed and bidirected edges are summary graphs after replacing the dashed undirected edges with bidirected edges (Cox and Wermuth, 1996), MC graphs (Koster, 2002), and loopless mixed graphs (Sadeghi and Lauritzen, 2012). As shown in (Sadeghi and Lauritzen, 2012, Sections 4.2 and 4.3), every independence model that can be represented by summary graphs and MC graphs can also be represented by loopless mixed graphs. The separation criterion for loopless mixed graphs is identical to that of MVR CGs. Therefore, the example above also serves to illustrate that loopless mixed graphs generalize MVR CGs but not AMP CGs, as MAMP CGs do. See also (Sadeghi and Lauritzen, 2012, Section 4.1). Moreover, summary graphs and MC graphs have a rather counterintuitive and undesirable feature: Not every missing edge corresponds to a separation (Richardson and Spirtes, 2002, p. 1023). MAMP CGs, on the other hand, do not have this disadvantage (recall Theorem 5).

In summary, MAMP CGs are the only graphical models we are aware of that generalize both AMP and MVR CGs.

Acknowledgments

We would like to thank the anonymous Reviewers and especially Reviewer 3 for suggesting Example 4. This work is funded by the Center for Industrial Information Technology (CENIIT) and a so-called career contract at Linköping University, by the Swedish Research Council (ref. 2010-4808), and by FEDER funds and the Spanish Government (MICINN) through the project TIN2010-20900-C04-03.

Appendix: Proofs

Proof of Theorem 1. It suffices to show that every $Z$-open path between $\alpha$ and $\beta$ in $G$ can be transformed into a $Z$-open path between $\alpha$ and $\beta$ in $G'$ and vice versa, with $\alpha, \beta \in V$ and $Z \subseteq V \setminus \alpha \setminus \beta$.

Let $\rho$ denote a $Z$-open path between $\alpha$ and $\beta$ in $G$.
We can easily transform $\rho$ into a path $\rho'$ between $\alpha$ and $\beta$ in $G'$: Simply, replace every maximal subpath of $\rho$ of the form $V_1 - V_2 - \ldots - V_{n-1} - V_n$ ($n \geq 2$) with $V_1 \leftarrow \epsilon_{V_1} - \epsilon_{V_2} - \ldots - \epsilon_{V_{n-1}} - \epsilon_{V_n} \to V_n$. We now show that $\rho'$ is $Z$-open.

First, if $B \in V$ is a triplex node in $\rho'$, then $\rho'$ must have one of the following subpaths:

$A \to B \leftarrow C$, $\quad A \to B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B \leftarrow C$

with $A, C \in V$. Therefore, $\rho$ must have one of the following subpaths (specifically, if $\rho'$ has the $i$-th subpath above, then $\rho$ has the $i$-th subpath below):

$A \to B \leftarrow C$, $\quad A \to B - C$, $\quad A - B \leftarrow C$

In either case, $B$ is a triplex node in $\rho$ and, thus, $B \in Z \cup san_G(Z)$ for $\rho$ to be $Z$-open. Then, $B \in Z \cup san_{G'}(Z)$ by construction of $G'$ and, thus, $B \in D(Z) \cup san_{G'}(D(Z))$.

Second, if $B \in V$ is a non-triplex node in $\rho'$, then $\rho'$ must have one of the following subpaths:

$A \to B \to C$, $\quad A \leftarrow B \leftarrow C$, $\quad A \leftarrow B \to C$, $\quad A \leftarrow B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B \to C$

with $A, C \in V$. Therefore, $\rho$ must have one of the following subpaths (specifically, if $\rho'$ has the $i$-th subpath above, then $\rho$ has the $i$-th subpath below):

$A \to B \to C$, $\quad A \leftarrow B \leftarrow C$, $\quad A \leftarrow B \to C$, $\quad A \leftarrow B - C$, $\quad A - B \to C$

In either case, $B$ is a non-triplex node in $\rho$ and, thus, $B \notin Z$ for $\rho$ to be $Z$-open. Since $Z$ contains no error node, $Z$ cannot determine any node in $V$ that is not already in $Z$. Then, $B \notin D(Z)$.

Third, if $\epsilon_B$ is a non-triplex node in $\rho'$ (note that $\epsilon_B$ cannot be a triplex node in $\rho'$), then $\rho'$ must have one of the following subpaths:

$A \to B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B \leftarrow C$, $\quad \alpha = B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B = \beta$, $\quad A \leftarrow B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B \to C$, $\quad \epsilon_A - \epsilon_B - \epsilon_C$

with $A, C \in V$. Recall that $\epsilon_B \notin Z$ because $Z \subseteq V \setminus \alpha \setminus \beta$. In the first case, if $\alpha = A$ then $A \notin Z$, else $A \notin Z$ for $\rho$ to be $Z$-open. Then, $\epsilon_B \notin D(Z)$. In the second case, if $\beta = C$ then $C \notin Z$, else $C \notin Z$ for $\rho$ to be $Z$-open. Then, $\epsilon_B \notin D(Z)$. In the third and fourth cases, $B \notin Z$ because $\alpha = B$ or $\beta = B$. Then, $\epsilon_B \notin D(Z)$. In the fifth and sixth cases, $B \notin Z$ for $\rho$ to be $Z$-open. Then, $\epsilon_B \notin D(Z)$.
The last case implies that $\rho$ has the following subpath:

$A - B - C$

Thus, $B$ is a non-triplex node in $\rho$, which implies that $B \notin Z$ or $pa_G(B) \setminus Z \neq \emptyset$ for $\rho$ to be $Z$-open. In either case, $\epsilon_B \notin D(Z)$ (recall that $pa_{G'}(B) = pa_G(B) \cup \epsilon_B$ by construction of $G'$).

Finally, let $\rho'$ denote a $Z$-open path between $\alpha$ and $\beta$ in $G'$. We can easily transform $\rho'$ into a path $\rho$ between $\alpha$ and $\beta$ in $G$: Simply, replace every maximal subpath of $\rho'$ of the form $V_1 \leftarrow \epsilon_{V_1} - \epsilon_{V_2} - \ldots - \epsilon_{V_{n-1}} - \epsilon_{V_n} \to V_n$ ($n \geq 2$) with $V_1 - V_2 - \ldots - V_{n-1} - V_n$. We now show that $\rho$ is $Z$-open.

First, note that all the nodes in $\rho$ are in $V$. Moreover, if $B$ is a triplex node in $\rho$, then $\rho$ must have one of the following subpaths:

$A \to B \leftarrow C$, $\quad A \to B - C$, $\quad A - B \leftarrow C$

with $A, C \in V$. Therefore, $\rho'$ must have one of the following subpaths (specifically, if $\rho$ has the $i$-th subpath above, then $\rho'$ has the $i$-th subpath below):

$A \to B \leftarrow C$, $\quad A \to B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B \leftarrow C$

In either case, $B$ is a triplex node in $\rho'$ and, thus, $B \in D(Z) \cup san_{G'}(D(Z))$ for $\rho'$ to be $Z$-open. Since $Z$ contains no error node, $Z$ cannot determine any node in $V$ that is not already in $Z$. Then, $B \in D(Z)$ iff $B \in Z$. Since there is no strictly descending route from $B$ to any error node, then any strictly descending route from $B$ to a node $D \in D(Z)$ implies that $D \in V$ which, as seen, implies that $D \in Z$. Then, $B \in san_{G'}(D(Z))$ iff $B \in san_{G'}(Z)$. Moreover, $B \in san_{G'}(Z)$ iff $B \in san_G(Z)$ by construction of $G'$. These results together imply that $B \in Z \cup san_G(Z)$.

Second, if $B$ is a non-triplex node in $\rho$, then $\rho$ must have one of the following subpaths:

$A \to B \to C$, $\quad A \leftarrow B \leftarrow C$, $\quad A \leftarrow B \to C$, $\quad A \leftarrow B - C$, $\quad A - B \to C$, $\quad A - B - C$

with $A, C \in V$.
Therefore, $\rho'$ must have one of the following subpaths (specifically, if $\rho$ has the $i$-th subpath above, then $\rho'$ has the $i$-th subpath below):

$A \to B \to C$, $\quad A \leftarrow B \leftarrow C$, $\quad A \leftarrow B \to C$, $\quad A \leftarrow B \leftarrow \epsilon_B - \epsilon_C$, $\quad \epsilon_A - \epsilon_B \to B \to C$, $\quad \epsilon_A - \epsilon_B - \epsilon_C$

In the first five cases, $B$ is a non-triplex node in $\rho'$ and, thus, $B \notin D(Z)$ for $\rho'$ to be $Z$-open. Since $Z$ contains no error node, $Z$ cannot determine any node in $V$ that is not already in $Z$. Then, $B \notin Z$. In the last case, $\epsilon_B$ is a non-triplex node in $\rho'$ and, thus, $\epsilon_B \notin D(Z)$ for $\rho'$ to be $Z$-open. Then, $B \notin Z$ or $pa_{G'}(B) \setminus \epsilon_B \setminus Z \neq \emptyset$. Then, $B \notin Z$ or $pa_G(B) \setminus Z \neq \emptyset$ (recall that $pa_{G'}(B) = pa_G(B) \cup \epsilon_B$ by construction of $G'$). □

Proof of Theorem 2. Assume for a moment that $G'$ has no deterministic node. Note that $G'$ has no induced subgraph of the form $A \to B - C$ with $A, B, C \in V \cup \epsilon$. Such an induced subgraph is called a flag by Andersson et al. (2001, pp. 40-41). They also introduce the term biflag, whose definition is irrelevant here. What is relevant here is the observation that a CG cannot have a biflag unless it has some flag. Therefore, $G'$ has no biflags. Consequently, every probability distribution that is Markovian wrt $G'$ when interpreted as an AMP CG is also Markovian wrt $G'$ when interpreted as a LWF CG and vice versa (Andersson et al., 2001, Corollary 1). Now, note that there are Gaussian probability distributions that are faithful to $G'$ when interpreted as an AMP CG (Levitz et al., 2001, Theorem 6.1) as well as when interpreted as a LWF CG (Peña, 2011, Theorems 1 and 2). Therefore, $I_{AMP}(G') = I_{LWF}(G')$. We denote this independence model by $I_{NDN}(G')$.

Now, forget the momentary assumption made above that $G'$ has no deterministic node. Recall that we assumed that $D(Z)$ is the same under the AMP and the LWF interpretations of $G'$ for all $Z \subseteq V \cup \epsilon$.
Recall also that, from the point of view of the separations in an AMP or LWF CG, that a node is determined by the conditioning set has the same effect as if the node were in the conditioning set. Then, $X \perp_{G'} Y \mid Z$ is in $I_{AMP}(G')$ iff $X \perp_{G'} Y \mid D(Z)$ is in $I_{NDN}(G')$ iff $X \perp_{G'} Y \mid Z$ is in $I_{LWF}(G')$. Then, $I_{AMP}(G') = I_{LWF}(G')$. □

Proof of Theorem 3. Assume for a moment that $G'$ has no deterministic node. Then, $G''$ has no deterministic node either. We show below that every $Z$-open route between $\alpha$ and $\beta$ in $G'$ can be transformed into a $(Z \cup S)$-open route between $\alpha$ and $\beta$ in $G''$ and vice versa, with $\alpha, \beta \in V \cup \epsilon$. This implies that $I_{LWF}(G') = [I(G'')]^{S}_{\emptyset}$. We denote this independence model by $I_{NDN}(G')$.

First, let $\rho'$ denote a $Z$-open route between $\alpha$ and $\beta$ in $G'$. Then, we can easily transform $\rho'$ into a $(Z \cup S)$-open route $\rho''$ between $\alpha$ and $\beta$ in $G''$: Simply, replace every edge $\epsilon_A - \epsilon_B$ in $\rho'$ with $\epsilon_A \to S_{\epsilon_A \epsilon_B} \leftarrow \epsilon_B$. To see that $\rho''$ is actually $(Z \cup S)$-open, note that every collider section in $\rho'$ is due to a subroute of the form $A \to B \leftarrow C$ with $A, B \in V$ and $C \in V \cup \epsilon$. Then, any node that is in a collider (respectively non-collider) section of $\rho'$ is also in a collider (respectively non-collider) section of $\rho''$.

Second, let $\rho''$ denote a $(Z \cup S)$-open route between $\alpha$ and $\beta$ in $G''$. Then, we can easily transform $\rho''$ into a $Z$-open route $\rho'$ between $\alpha$ and $\beta$ in $G'$: First, replace every subroute $\epsilon_A \to S_{\epsilon_A \epsilon_B} \leftarrow \epsilon_A$ of $\rho''$ with $\epsilon_A$ and, then, replace every subroute $\epsilon_A \to S_{\epsilon_A \epsilon_B} \leftarrow \epsilon_B$ of $\rho''$ with $\epsilon_A - \epsilon_B$. To see that $\rho'$ is actually $Z$-open, note that every undirected edge in $\rho'$ is between two noise nodes and recall that no noise node has incoming directed edges in $G'$. Then, again every collider section in $\rho'$ is due to a subroute of the form $A \to B \leftarrow C$ with $A, B \in V$ and $C \in V \cup \epsilon$.
Then, again any node that is in a collider (respectively non-collider) section of $\rho'$ is also in a collider (respectively non-collider) section of $\rho''$.

Now, forget the momentary assumption made above that $G'$ has no deterministic node. Recall that we assumed that $D(Z)$ is the same no matter whether we are considering $G'$ or $G''$ for all $Z \subseteq V \cup \epsilon$. Recall also that, from the point of view of the separations in a LWF CG, that a node is determined by the conditioning set has the same effect as if the node were in the conditioning set. Then, $X \perp_{G''} Y \mid Z$ is in $[I(G'')]^{S}_{\emptyset}$ iff $X \perp_{G'} Y \mid D(Z)$ is in $I_{NDN}(G')$ iff $X \perp_{G'} Y \mid Z$ is in $I_{LWF}(G')$. Then, $I_{LWF}(G') = [I(G'')]^{S}_{\emptyset}$. □

Proof of Theorem 4. It suffices to replace every bidirected edge $A \leftrightarrow B$ in $G$ with $A \leftarrow L_{AB} \to B$ to create an AMP CG $\widetilde{G}$, apply Theorem 6.1 by Levitz et al. (2001) to conclude that there exists a regular Gaussian probability distribution $q$ that is faithful to $\widetilde{G}$, and then let $p$ be the marginal probability distribution of $q$ over $V$. □

Proof of Corollary 3. It follows from Theorem 4 by just noting that the set of independencies in any regular Gaussian probability distribution satisfies the compositional graphoid properties (Studený, 2005, Sections 2.2.2, 2.3.5 and 2.3.6). □

Proof of Theorem 5. Since the independence model represented by $G$ satisfies the compositional graphoid properties by Corollary 3, it suffices to prove that the pairwise separation base of $G$ is a subset of the independence model represented by $G$. We prove this next. Let $A, B \in V$ st $A \notin ad_G(B)$. Consider the following cases.

Case 1: Assume that $B \notin de_G(A)$. Then, every path between $A$ and $B$ in $G$ falls within one of the following cases.

Case 1.1: $A = V_1 \leftarrow V_2 \ldots V_n = B$. Then, this path is not $pa_G(A)$-open.

Case 1.2: $A = V_1 \mathrel{\circ\!\!\rightarrow} V_2 \ldots V_n = B$. Note that $V_2 \neq V_n$ because $A \notin ad_G(B)$.
Note also that $V_2 \notin pa_G(A)$ due to the constraint C1. Then, $V_2 \to V_3$ must be in $G$ for the path to be $pa_G(A)$-open. By repeating this reasoning, we can conclude that $A = V_1 \mathrel{\circ\!\!\rightarrow} V_2 \to V_3 \to \ldots \to V_n = B$ is in $G$. However, this contradicts that $B \notin de_G(A)$.

Case 1.3: $A = V_1 - V_2 - \ldots - V_m \mathrel{\leftarrow\!\!\circ} V_{m+1} \ldots V_n = B$. Note that $V_m \notin pa_G(A)$ due to the constraint C1. Then, this path is not $pa_G(A)$-open.

Case 1.4: $A = V_1 - V_2 - \ldots - V_m \to V_{m+1} \ldots V_n = B$. Note that $V_{m+1} \neq V_n$ because $B \notin de_G(A)$. Note also that $V_{m+1} \notin pa_G(A)$ due to the constraint C1. Then, $V_{m+1} \to V_{m+2}$ must be in $G$ for the path to be $pa_G(A)$-open. By repeating this reasoning, we can conclude that $A = V_1 - V_2 - \ldots - V_m \to V_{m+1} \to \ldots \to V_n = B$ is in $G$. However, this contradicts that $B \notin de_G(A)$.

Case 1.5: $A = V_1 - V_2 - \ldots - V_n = B$. This case contradicts the assumption that $B \notin de_G(A)$.

Case 2: Assume that $A \in de_G(B)$, $B \in de_G(A)$ and $uc_G(A) = uc_G(B)$. Then, there is an undirected path $\rho$ between $A$ and $B$ in $G$. Then, every path between $A$ and $B$ in $G$ falls within one of the following cases.

Case 2.1: $A = V_1 \leftarrow V_2 \ldots V_n = B$. Then, this path is not $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open.

Case 2.2: $A = V_1 \mathrel{\circ\!\!\rightarrow} V_2 \ldots V_n = B$. Note that $V_2 \neq V_n$ because $A \notin ad_G(B)$. Note also that $V_2 \notin ne_G(A) \cup pa_G(A \cup ne_G(A))$ due to the constraints C1 and C2. Then, $V_2 \to V_3$ must be in $G$ for the path to be $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open. By repeating this reasoning, we can conclude that $A = V_1 \mathrel{\circ\!\!\rightarrow} V_2 \to V_3 \to \ldots \to V_n = B$ is in $G$. However, this together with $\rho$ violate the constraint C1.

Case 2.3: $A = V_1 - V_2 \leftarrow V_3 \ldots V_n = B$. Then, this path is not $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open.

Case 2.4: $A = V_1 - V_2 \mathrel{\circ\!\!\rightarrow} V_3 \ldots V_n = B$. Note that $V_3 \neq V_n$ due to $\rho$ and the constraints C1 and C2.
Note also that $V_3 \notin ne_G(A) \cup pa_G(A \cup ne_G(A))$ due to the constraints C1 and C2. Then, $V_3 \to V_4$ must be in $G$ for the path to be $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open. By repeating this reasoning, we can conclude that $A = V_1 - V_2 \mathrel{\circ\!\!\rightarrow} V_3 \to \ldots \to V_n = B$ is in $G$. However, this together with $\rho$ violate the constraint C1.

Case 2.5: $A = V_1 - V_2 - V_3 \ldots V_n = B$ st $sp_G(V_2) = \emptyset$. Then, this path is not $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open.

Case 2.6: $A = V_1 - V_2 - \ldots - V_n = B$ st $sp_G(V_i) \neq \emptyset$ for all $2 \leq i \leq n-1$. Note that $V_i \in ne_G(V_1)$ for all $3 \leq i \leq n$ by the constraint C3. However, this contradicts that $A \notin ad_G(B)$.

Case 2.7: $A = V_1 - V_2 - \ldots - V_m - V_{m+1} - V_{m+2} \ldots V_n = B$ st $sp_G(V_i) \neq \emptyset$ for all $2 \leq i \leq m$ and $sp_G(V_{m+1}) = \emptyset$. Note that $V_i \in ne_G(V_1)$ for all $3 \leq i \leq m+1$ by the constraint C3. Then, this path is not $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open.

Case 2.8: $A = V_1 - V_2 - \ldots - V_m - V_{m+1} \leftarrow V_{m+2} \ldots V_n = B$ st $sp_G(V_i) \neq \emptyset$ for all $2 \leq i \leq m$. Note that $V_i \in ne_G(V_1)$ for all $3 \leq i \leq m+1$ by the constraint C3. Then, this path is not $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open.

Case 2.9: $A = V_1 - V_2 - \ldots - V_m - V_{m+1} \mathrel{\circ\!\!\rightarrow} V_{m+2} \ldots V_n = B$ st $sp_G(V_i) \neq \emptyset$ for all $2 \leq i \leq m$. Note that $V_{m+2} \neq V_n$ due to $\rho$ and the constraints C1 and C2. Note also that $V_{m+2} \notin ne_G(A) \cup pa_G(A \cup ne_G(A))$ due to the constraints C1 and C2. Then, $V_{m+2} \to V_{m+3}$ must be in $G$ for the path to be $(ne_G(A) \cup pa_G(A \cup ne_G(A)))$-open. By repeating this reasoning, we can conclude that $A = V_1 - V_2 - \ldots - V_m - V_{m+1} \mathrel{\circ\!\!\rightarrow} V_{m+2} \to \ldots \to V_n = B$ is in $G$. However, this together with $\rho$ violate the constraint C1.

Case 3: Assume that $A \in de_G(B)$, $B \in de_G(A)$ and $uc_G(A) \neq uc_G(B)$. Then, every path between $A$ and $B$ in $G$ falls within one of the following cases.
Case 3.1: $A = V_1 \leftarrow V_2 \ldots V_n = B$. Then, this path is not $pa_G(A)$-open.

Case 3.2: $A = V_1 \mathrel{\circ\!\!\rightarrow} V_2 \ldots V_n = B$. Note that $V_2 \neq V_n$ because $A \notin ad_G(B)$. Note also that $V_2 \notin pa_G(A)$ due to the constraint C1. Then, $V_2 \to V_3$ must be in $G$ for the path to be $pa_G(A)$-open. By repeating this reasoning, we can conclude that $A = V_1 \mathrel{\circ\!\!\rightarrow} V_2 \to V_3 \to \ldots \to V_n = B$ is in $G$. However, this together with the assumption that $A \in de_G(B)$ contradict the constraint C1.

Case 3.3: $A = V_1 - V_2 - \ldots - V_m \mathrel{\leftarrow\!\!\circ} V_{m+1} \ldots V_n = B$. Note that $V_m \notin pa_G(A)$ due to the constraint C1. Then, this path is not $pa_G(A)$-open.

Case 3.4: $A = V_1 - V_2 - \ldots - V_m \to V_{m+1} \ldots V_n = B$. Note that $V_{m+1} \neq V_n$ because, otherwise, this together with the assumption that $A \in de_G(B)$ contradict the constraint C1. Note also that $V_{m+1} \notin pa_G(A)$ due to the constraint C1. Then, $V_{m+1} \to V_{m+2}$ must be in $G$ for the path to be $pa_G(A)$-open. By repeating this reasoning, we can conclude that $A = V_1 - V_2 - \ldots - V_m \to V_{m+1} \to \ldots \to V_n = B$ is in $G$. However, this together with the assumption that $A \in de_G(B)$ contradict the constraint C1. □

Lemma 3. Let $X$ and $Y$ denote two nodes of a MAMP CG $G$ with only one connectivity component. If $X \perp_G Y \mid Z$ and there is a node $C \in Z$ st $sp_G(C) \neq \emptyset$, then $X \perp_G Y \mid Z \setminus C$.

Proof. Assume to the contrary that there is a $(Z \setminus C)$-open path $\rho$ between $X$ and $Y$ in $G$. Note that $C$ must occur in $\rho$ because, otherwise, $\rho$ is $Z$-open which contradicts that $X \perp_G Y \mid Z$. For the same reason, $C$ must be a non-triplex node in $\rho$. Then, $D - C - E$ must be a subpath of $\rho$ and, thus, the edge $D - E$ must be in $G$ by the constraint C3, because $sp_G(C) \neq \emptyset$. Then, the path obtained from $\rho$ by replacing the subpath $D - C - E$ with the edge $D - E$ is $Z$-open. However, this contradicts that $X \perp_G Y \mid Z$. □

Lemma 4. Let $X$ and $Y$ denote two nodes of a MAMP CG $G$ with only one connectivity component. If $X \perp_G Y \mid Z$ then $X \perp_{cl(G)} Y \mid Z$.

Proof. We prove the lemma by induction on $|Z|$. If $|Z| = 0$, then $uc_G(X) \neq uc_G(Y)$. Consequently, $X \perp_{cl(G)} Y$ follows from the pairwise separation base of $G$ because $X \notin ad_G(Y)$. Assume as induction hypothesis that the lemma holds for $|Z| < l$. We now prove it for $|Z| = l$. Consider the following cases.

Case 1: Assume that $uc_G(X) = uc_G(Y)$. Consider the following cases.

Case 1.1: Assume that $Z \subseteq uc_G(X)$. Then, the pairwise separation base of $G$ implies that $C \perp_{cl(G)} uc_G(X) \setminus C \setminus ne_G(C) \mid ne_G(C)$ for all $C \in uc_G(X)$ by repeated composition, which implies $X \perp_{cl(G)} Y \mid Z$ by the graphoid properties (Lauritzen, 1996, Theorem 3.7).

Case 1.2: Assume that there is some node $C \in Z \setminus uc_G(X)$ st $C \leftrightarrow D$ is in $G$ with $D \in uc_G(X)$ and $X \not\perp_G C \mid Z \setminus C$. Then, $Y \perp_G C \mid Z \setminus C$. To see it, assume the contrary. Then, $X \not\perp_G Y \mid Z \setminus C$ by weak transitivity because $X \perp_G Y \mid Z$. However, this contradicts Lemma 3. Now, note that $Y \perp_G C \mid Z \setminus C$ implies $Y \perp_{cl(G)} C \mid Z \setminus C$ by the induction hypothesis. Note also that $X \perp_G Y \mid Z \setminus C$ by Lemma 3 and, thus, $X \perp_{cl(G)} Y \mid Z \setminus C$ by the induction hypothesis. Then, $X \perp_{cl(G)} Y \mid Z$ by symmetry, composition and weak union.

Case 1.3: Assume that Cases 1.1 and 1.2 do not apply. Let $E \in Z \setminus uc_G(X)$. Such a node $E$ exists because, otherwise, Case 1.1 applies. Moreover, $X \perp_G E \mid Z \setminus E$ because, otherwise, there is some node $C$ that satisfies the conditions of Case 1.2. Note also that $X \perp_G Y \mid Z \setminus E$. To see it, assume the contrary. Then, there is a $(Z \setminus E)$-open path between $X$ and $Y$ in $G$. Note that $E$ must occur in the path because, otherwise, the path is $Z$-open, which contradicts that $X \perp_G Y \mid Z$. However, this implies that $X \not\perp_G E \mid Z \setminus E$, which is a contradiction.
Now, note that X ⊥_G E ∣ Z∖E and X ⊥_G Y ∣ Z∖E imply X ⊥_{cl(G)} E ∣ Z∖E and X ⊥_{cl(G)} Y ∣ Z∖E by the induction hypothesis. Then, X ⊥_{cl(G)} Y ∣ Z by composition and weak union.

Case 2: Assume that uc_G(X) ≠ uc_G(Y). Consider the following cases.

Case 2.1: Assume that there is some node C ∈ Z such that C ↔ X is in G. Then, Y ⊥_G C ∣ Z∖C because, otherwise, X ⊥̸_G Y ∣ Z. Then, Y ⊥_{cl(G)} C ∣ Z∖C by the induction hypothesis. Note that X ⊥_G Y ∣ Z∖C by Lemma 3 and, thus, X ⊥_{cl(G)} Y ∣ Z∖C by the induction hypothesis. Then, X ⊥_{cl(G)} Y ∣ Z by symmetry, composition and weak union.

Case 2.2: Assume that there is some node C ∈ Z ∩ uc_G(X) such that sp_G(C) ≠ ∅ and X ⊥_G C ∣ Z∖C. Then, X ⊥_{cl(G)} C ∣ Z∖C by the induction hypothesis. Note that X ⊥_G Y ∣ Z∖C by Lemma 3 and, thus, X ⊥_{cl(G)} Y ∣ Z∖C by the induction hypothesis. Then, X ⊥_{cl(G)} Y ∣ Z by composition and weak union.

Case 2.3: Assume that there is some node C ∈ Z ∩ uc_G(X) such that sp_G(C) ≠ ∅ and X ⊥̸_G C ∣ Z∖C. Then, Y ⊥_G C ∣ Z∖C. To see it, assume the contrary. Then, X ⊥̸_G Y ∣ Z∖C by weak transitivity because X ⊥_G Y ∣ Z. However, this contradicts Lemma 3. Now, note that Y ⊥_G C ∣ Z∖C implies Y ⊥_{cl(G)} C ∣ Z∖C by the induction hypothesis. Note also that X ⊥_G Y ∣ Z∖C by Lemma 3 and, thus, X ⊥_{cl(G)} Y ∣ Z∖C by the induction hypothesis. Then, X ⊥_{cl(G)} Y ∣ Z by composition and weak union.

Case 2.4: Assume that Cases 2.1-2.3 do not apply. Let V_1, ..., V_m be the nodes in Z ∩ uc_G(X). Let W_1, ..., W_n be the nodes in Z∖uc_G(X). Then, (1) X ⊥_{cl(G)} Y follows from the pairwise separation base of G because uc_G(X) ≠ uc_G(Y) and X ∉ ad_G(Y).
Moreover, for all 1 ≤ i ≤ m, (2) V_i ⊥_{cl(G)} Y follows from the pairwise separation base of G because V_i ∉ uc_G(Y) and V_i ∉ ad_G(Y), since sp_G(V_i) = ∅ because, otherwise, Case 2.2 or 2.3 applies. Moreover, for all 1 ≤ j ≤ n, (3) X ⊥_{cl(G)} W_j follows from the pairwise separation base of G because W_j ∉ uc_G(X) and W_j ∉ ad_G(X), since W_j ↔ X is not in G because, otherwise, Case 2.1 applies. Moreover, for all 1 ≤ i ≤ m and 1 ≤ j ≤ n, (4) V_i ⊥_{cl(G)} W_j follows from the pairwise separation base of G because uc_G(V_i) ≠ uc_G(W_j) and V_i ∉ ad_G(W_j), since sp_G(V_i) = ∅ because, otherwise, Case 2.2 or 2.3 applies. Then, (5) X ⊥_{cl(G)} Y ∣ Z by repeated symmetry, composition and weak union. □

We sort the connectivity components of a MAMP CG G as K_1, ..., K_n such that if X → Y is in G, then X ∈ K_i and Y ∈ K_j with i < j. It is worth mentioning that, in the proofs below, we make use of the fact that the independence model represented by G satisfies weak transitivity by Corollary 3. Note, however, that this property is not used in the construction of cl(G). In the expressions below, we give equal precedence to the operators set minus, set union and set intersection.

Lemma 5. Let X and Y denote two nodes of a MAMP CG G such that X, Y ∈ K_m, X ⊥_G Y ∣ Z and Z ∩ (K_{m+1} ∪ ... ∪ K_n) = ∅. Let H denote the subgraph of G induced by K_m. Let W = Z ∩ K_m. Let W_1 denote a minimal (wrt set inclusion) subset of W such that X ⊥_H W∖W_1 ∣ W_1. Then, X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1).

Proof. We define the restricted separation base of G as the following set of separations:
R1. A ⊥ B ∣ ne_G(A) for all A, B ∈ K_m such that A ∉ ad_G(B) and uc_G(A) = uc_G(B), and
R2. A ⊥ B for all A, B ∈ K_m such that A ∉ ad_G(B) and uc_G(A) ≠ uc_G(B).
We define the extended separation base of G as the following set of separations:
E1.
A ⊥ B ∣ ne_G(A) ∪ pa_G(K_m) for all A, B ∈ K_m such that A ∉ ad_G(B) and uc_G(A) = uc_G(B), and
E2. A ⊥ B ∣ pa_G(K_m) for all A, B ∈ K_m such that A ∉ ad_G(B) and uc_G(A) ≠ uc_G(B).
Note that the separations E1 (resp. E2) are in one-to-one correspondence with the separations R1 (resp. R2), such that the former can be obtained from the latter by adding pa_G(K_m) to the conditioning sets.

Let W_2 = W∖W_1. Then, X ⊥_H W_2 ∣ W_1 implies that X ⊥_{cl(H)} W_2 ∣ W_1 by Lemma 4. Note also that the pairwise separation base of H coincides with the restricted separation base of G. Then, X ⊥_{cl(H)} W_2 ∣ W_1 implies that X ⊥ W_2 ∣ W_1 can be derived from the restricted separation base of G by applying the compositional graphoid properties. We can now reuse this derivation to derive X ⊥ W_2 ∣ W_1 ∪ pa_G(K_m) from the extended separation base of G by applying the compositional graphoid properties: it suffices to apply the same sequence of properties but replacing any separation of the restricted separation base in the derivation with the corresponding separation of the extended separation base. In fact, X ⊥ W_2 ∣ W_1 ∪ pa_G(K_m) is not only in the closure of the extended separation base of G but also in the closure of the pairwise separation base of G, i.e. X ⊥_{cl(G)} W_2 ∣ W_1 ∪ pa_G(K_m). To show it, it suffices to show that the extended separation base is in the closure of the pairwise separation base. Specifically, consider any A, B ∈ K_m such that A ∉ ad_G(B) and uc_G(A) ≠ uc_G(B). Then, (1) A ⊥_{cl(G)} B ∣ pa_G(A) follows from the pairwise separation base of G, and (2) A ⊥_{cl(G)} pa_G(K_m)∖pa_G(A) ∣ pa_G(A) follows from the pairwise separation base of G by repeated composition. Then, (3) A ⊥_{cl(G)} B ∣ pa_G(K_m) by composition on (1) and (2), and weak union. Now, consider any A, B ∈ K_m such that A ∉ ad_G(B) and uc_G(A) = uc_G(B).
Then, (4) A ⊥_{cl(G)} B ∣ ne_G(A) ∪ pa_G(A ∪ ne_G(A)) follows from the pairwise separation base of G. Moreover, for any C ∈ A ∪ ne_G(A), (5) C ⊥_{cl(G)} pa_G(K_m)∖pa_G(C) ∣ pa_G(C) follows from the pairwise separation base of G by repeated composition. Then, (6) C ⊥_{cl(G)} pa_G(K_m)∖pa_G(A ∪ ne_G(A)) ∣ pa_G(A ∪ ne_G(A)) by weak union. Then, (7) A ⊥_{cl(G)} pa_G(K_m)∖pa_G(A ∪ ne_G(A)) ∣ ne_G(A) ∪ pa_G(A ∪ ne_G(A)) by repeated symmetry, composition and weak union. Then, (8) A ⊥_{cl(G)} B ∣ ne_G(A) ∪ pa_G(K_m) by composition on (4) and (7), and weak union.

Note that X ⊥_H Y ∣ W_1 because, otherwise, X ⊥̸_G Y ∣ Z, which is a contradiction. Then, we can repeat the reasoning above to show that X ⊥_{cl(G)} Y ∣ W_1 ∪ pa_G(K_m). Then, X ⊥_{cl(G)} Y ∪ W_2 ∣ W_1 ∪ pa_G(K_m) by composition on X ⊥_{cl(G)} W_2 ∣ W_1 ∪ pa_G(K_m).

Finally, we show that this implies that X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1). Specifically, (9) X ⊥_{cl(G)} Y ∪ W_2 ∣ W_1 ∪ pa_G(K_m) as shown above. Moreover, for any C ∈ X ∪ W_1, (10) C ⊥_{cl(G)} pa_G(K_m)∖pa_G(C) ∣ pa_G(C) follows from the pairwise separation base of G by repeated composition. Then, (11) C ⊥_{cl(G)} pa_G(K_m)∖pa_G(X ∪ W_1) ∣ pa_G(X ∪ W_1) by weak union. Then, (12) X ⊥_{cl(G)} pa_G(K_m)∖pa_G(X ∪ W_1) ∣ W_1 ∪ pa_G(X ∪ W_1) by repeated symmetry, composition and weak union. Then, (13) X ⊥_{cl(G)} Y ∪ W_2 ∣ W_1 ∪ pa_G(X ∪ W_1) by contraction on (9) and (12), and decomposition. Moreover, for any C ∈ X ∪ W_1, (14) C ⊥_{cl(G)} Z∖W ∪ pa_G(X ∪ W_1)∖pa_G(C) ∣ pa_G(C) follows from the pairwise separation base of G by repeated composition. Then, (15) C ⊥_{cl(G)} Z∖W∖pa_G(X ∪ W_1) ∣ pa_G(X ∪ W_1) by weak union. Then, (16) X ⊥_{cl(G)} Z∖W∖pa_G(X ∪ W_1) ∣ W_1 ∪ pa_G(X ∪ W_1) by repeated symmetry, composition and weak union.
Then, (17) X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1) by composition on (13) and (16), and weak union. □

Lemma 6. Let X and Y denote two nodes of a MAMP CG G such that Y ∈ K_1 ∪ ... ∪ K_m, X ∈ K_m and X ⊥_G Y ∣ Z. Let H denote the subgraph of G induced by K_m. Let W = Z ∩ K_m. Let W_1 denote a minimal (wrt set inclusion) subset of W such that X ⊥_H W∖W_1 ∣ W_1. Then, X ⊥̸_G C ∣ Z for all C ∈ pa_G(X ∪ W_1)∖Z.

Proof. Note that X ⊥̸_H D ∣ W∖D for all D ∈ W_1. To see it, assume the contrary. Then, X ⊥_H D ∣ W∖D and X ⊥_H W∖W_1 ∣ W_1 imply X ⊥_H W∖W_1 ∪ D ∣ W_1∖D by intersection, which contradicts the definition of W_1. Finally, note that X ⊥̸_H D ∣ W∖D implies that there is a (W∖D)-open path between X and D in G all of whose nodes are in K_m. Then, X ⊥̸_G C ∣ Z for all C ∈ pa_G(X ∪ W_1)∖Z. □

Lemma 7. Let X and Y denote two nodes of a MAMP CG G such that Y ∈ K_1 ∪ ... ∪ K_{m−1}, X ∈ K_m, X ⊥_G Y ∣ Z and Z ∩ (K_{m+1} ∪ ... ∪ K_n) = ∅. Let H denote the subgraph of G induced by K_m. Let W = Z ∩ K_m. Let W_1 denote a minimal (wrt set inclusion) subset of W such that X ⊥_H W∖W_1 ∣ W_1. Then, X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1).

Proof. Let W_2 = W∖W_1. Note that X ⊥̸_G C ∣ Z for all C ∈ pa_G(X ∪ W_1)∖Z by Lemma 6, because Y ∈ K_1 ∪ ... ∪ K_{m−1}, X ∈ K_m and X ⊥_G Y ∣ Z. Then, Y ∉ pa_G(X ∪ W_1) because, otherwise, X ⊥̸_G Y ∣ Z, which is a contradiction. Moreover, for any C ∈ X ∪ W_1, (1) C ⊥_{cl(G)} Y ∪ pa_G(K_m)∖pa_G(C) ∣ pa_G(C) follows from the pairwise separation base of G by repeated composition. Then, (2) C ⊥_{cl(G)} Y ∣ pa_G(K_m) by weak union. Then, (3) X ⊥_{cl(G)} Y ∣ W_1 ∪ pa_G(K_m) by repeated symmetry, composition and weak union. Moreover, (4) X ⊥_{cl(G)} W_2 ∣ W_1 ∪ pa_G(K_m) as shown in the third paragraph of the proof of Lemma 5. Then, (5) X ⊥_{cl(G)} Y ∪ W_2 ∣ W_1 ∪ pa_G(K_m) by composition on (3) and (4).
Moreover, for any C ∈ X ∪ W_1, (6) C ⊥_{cl(G)} pa_G(K_m)∖pa_G(C) ∣ pa_G(C) follows from the pairwise separation base of G by repeated composition. Then, (7) C ⊥_{cl(G)} pa_G(K_m)∖pa_G(X ∪ W_1) ∣ pa_G(X ∪ W_1) by weak union. Then, (8) X ⊥_{cl(G)} pa_G(K_m)∖pa_G(X ∪ W_1) ∣ W_1 ∪ pa_G(X ∪ W_1) by repeated symmetry, composition and weak union. Then, (9) X ⊥_{cl(G)} Y ∪ W_2 ∣ W_1 ∪ pa_G(X ∪ W_1) by contraction on (5) and (8), and decomposition. Moreover, for any C ∈ X ∪ W_1, (10) C ⊥_{cl(G)} Z∖W ∪ pa_G(X ∪ W_1)∖pa_G(C) ∣ pa_G(C) follows from the pairwise separation base of G by repeated composition. Then, (11) C ⊥_{cl(G)} Z∖W∖pa_G(X ∪ W_1) ∣ pa_G(X ∪ W_1) by weak union. Then, (12) X ⊥_{cl(G)} Z∖W∖pa_G(X ∪ W_1) ∣ W_1 ∪ pa_G(X ∪ W_1) by repeated symmetry, composition and weak union. Then, (13) X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1) by composition on (9) and (12), and weak union. □

Proof of Theorem 6. Since the independence model induced by G satisfies the decomposition property and cl(G) satisfies the composition property, it suffices to prove the theorem for ∣X∣ = ∣Y∣ = 1. Moreover, assume without loss of generality that Y ∈ K_1 ∪ ... ∪ K_m and X ∈ K_m. We prove the theorem by induction on ∣Z∣. The theorem holds for ∣Z∣ = 0 and m = 1 by Lemma 5, because X, Y ∈ K_1, X ⊥_G Y ∣ Z, Z ∩ (K_2 ∪ ... ∪ K_n) = ∅ and pa_G(X ∪ W_1)∖Z = ∅. Assume as induction hypothesis that the theorem holds for ∣Z∣ = 0 and m < l. We now prove it for ∣Z∣ = 0 and m = l. Consider the following cases.

Case 1: Assume that Y ∈ K_1 ∪ ... ∪ K_{l−1}. Then, (1) X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1) by Lemma 7, because Y ∈ K_1 ∪ ... ∪ K_{l−1}, X ∈ K_l, X ⊥_G Y ∣ Z and Z ∩ (K_{l+1} ∪ ... ∪ K_n) = ∅. Moreover, for any C ∈ pa_G(X ∪ W_1)∖Z, (2) X ⊥̸_G C ∣ Z by Lemma 6, because Y ∈ K_1 ∪ ... ∪ K_{l−1}, X ∈ K_l and X ⊥_G Y ∣ Z.
Then, (3) C ⊥_G Y ∣ Z because, otherwise, X ⊥̸_G Y ∣ Z, which is a contradiction. Then, (4) C ⊥_{cl(G)} Y ∣ Z by the induction hypothesis, because C, Y ∈ K_1 ∪ ... ∪ K_{l−1}. Then, (5) pa_G(X ∪ W_1)∖Z ⊥_{cl(G)} Y ∣ Z by repeated symmetry and composition. Then, (6) X ⊥_{cl(G)} Y ∣ Z by symmetry, contraction on (1) and (5), and decomposition.

Case 2: Assume that Y ∈ K_l. Then, (1) X ⊥_{cl(G)} Y ∣ Z ∪ pa_G(X ∪ W_1) by Lemma 5, because X, Y ∈ K_l, X ⊥_G Y ∣ Z and Z ∩ (K_{l+1} ∪ ... ∪ K_n) = ∅. Moreover, for any D ∈ pa_G(X ∪ W_1)∖Z, (2) X ⊥̸_G D ∣ Z by Lemma 6, because X, Y ∈ K_l and X ⊥_G Y ∣ Z. Then, (3) Y ⊥_G D ∣ Z because, otherwise, X ⊥̸_G Y ∣ Z, which is a contradiction. Then, (4) Y ⊥_{cl(G)} D ∣ Z by Case 1 replacing X with Y and Y with D, because D ∈ K_1 ∪ ... ∪ K_{l−1}, Y ∈ K_l and (3). Then, (5) Y ⊥_{cl(G)} pa_G(X ∪ W_1)∖Z ∣ Z by repeated composition. Then, (6) X ⊥_{cl(G)} Y ∣ Z by symmetry, contraction on (1) and (5), and decomposition.

This ends the proof for ∣Z∣ = 0. Assume as induction hypothesis that the theorem holds for ∣Z∣ < t. We now prove it for ∣Z∣ = t and m = 1. Let K_j be the connectivity component such that Z ∩ K_j ≠ ∅ and Z ∩ (K_{j+1} ∪ ... ∪ K_n) = ∅. Consider the following cases.

Case 3: Assume that j = 1. Then, X ⊥_{cl(G)} Y ∣ Z holds by Lemma 5, because X, Y ∈ K_1, X ⊥_G Y ∣ Z, Z ∩ (K_2 ∪ ... ∪ K_n) = ∅ and pa_G(X ∪ W_1)∖Z = ∅.

Case 4: Assume that j > 1 and pa_G(Z ∩ K_j)∖Z = ∅. Then, note that there is no (Z∖C)-open path between X and any C ∈ Z ∩ K_j. To see it, assume the contrary. Since X ∈ K_1 and j > 1, the path must reach K_j from one of its parents or children. However, the path cannot reach K_j from one of its children because, otherwise, the path has a triplex node outside Z since X ∈ K_1, j > 1 and Z ∩ (K_{j+1} ∪ ... ∪ K_n) = ∅. This contradicts that the path is (Z∖C)-open.
Then, the path must reach K_j from one of its parents. However, this contradicts that the path is (Z∖C)-open, because pa_G(Z ∩ K_j)∖Z = ∅. Then, (1) X ⊥_G C ∣ Z∖C as shown above. Then, (2) X ⊥_{cl(G)} C ∣ Z∖C by the induction hypothesis. Moreover, (3) X ⊥_G Y ∣ Z∖C by contraction on X ⊥_G Y ∣ Z and (1), and decomposition. Then, (4) X ⊥_{cl(G)} Y ∣ Z∖C by the induction hypothesis. Then, (5) X ⊥_{cl(G)} Y ∣ Z by composition on (2) and (4), and weak union.

Case 5: Assume that j > 1 and pa_G(C)∖Z ≠ ∅ for some C ∈ Z ∩ K_j. Then, note that there is no (Z∖C)-open path between X and Y. To see it, assume the contrary. If C is not in the path, then C ∈ pa_G(D) such that − D − is in the path and D ∈ Z because, otherwise, the path is Z-open, which contradicts that X ⊥_G Y ∣ Z. However, this implies a contradiction because C ∈ K_j and thus D ∈ K_{j+1} ∪ ... ∪ K_n, but Z ∩ (K_{j+1} ∪ ... ∪ K_n) = ∅. Therefore, C must be in the path. In fact, C must be a non-triplex node in the path because, otherwise, the path is not (Z∖C)-open. Then, either (i) − C −, (ii) ← C ⊸⊸ or (iii) ⊸⊸ C → is in the path. Case (i) implies that the path is Z-open, because pa_G(C)∖Z ≠ ∅. This contradicts that X ⊥_G Y ∣ Z. Cases (ii) and (iii) imply that the path has a directed subpath from C to (iv) X, (v) Y or (vi) a triplex node E in the path. Cases (iv) and (v) are impossible because X, Y ∈ K_1 but C ∈ K_j with j > 1. Case (vi) contradicts that the path is (Z∖C)-open, because C ∈ K_j and thus E ∈ K_{j+1} ∪ ... ∪ K_n, but Z ∩ (K_{j+1} ∪ ... ∪ K_n) = ∅. Then, (1) X ⊥_G Y ∣ Z∖C as shown above. Then, (2) X ⊥_{cl(G)} Y ∣ Z∖C by the induction hypothesis. Moreover, (3) X ⊥_G C ∣ Z∖C or C ⊥_G Y ∣ Z∖C by weak transitivity on X ⊥_G Y ∣ Z and (1). Then, (4) X ⊥_{cl(G)} C ∣ Z∖C or C ⊥_{cl(G)} Y ∣ Z∖C by the induction hypothesis.
Then, (5) X ⊥_{cl(G)} Y ∣ Z by symmetry, composition on (2) and (4), and weak union.

This ends the proof for ∣Z∣ = t and m = 1. Assume as induction hypothesis that the theorem holds for ∣Z∣ = t and m < l. In order to prove it for ∣Z∣ = t and m = l, it suffices to repeat Cases 1 and 2 if Z ∩ (K_{l+1} ∪ ... ∪ K_n) = ∅, and Cases 4 and 5 replacing 1 with l otherwise. □

Proof of Theorem 7. We first prove the "only if" part. Let G_1 and G_2 be two Markov equivalent MAMP CGs. First, assume that two nodes A and C are adjacent in G_2 but not in G_1. If A and C are in the same undirected connectivity component of G_1, then A ⊥ C ∣ ne_{G_1}(A) ∪ pa_{G_1}(A ∪ ne_{G_1}(A)) holds for G_1 by Theorem 5 but it does not hold for G_2, which is a contradiction. On the other hand, if A and C are in different undirected connectivity components of G_1, then A ⊥ C ∣ pa_{G_1}(C) or A ⊥ C ∣ pa_{G_1}(A) holds for G_1 by Theorem 5 but neither holds for G_2, which is a contradiction. Consequently, G_1 and G_2 must have the same adjacencies.

Finally, assume that G_1 and G_2 have the same adjacencies but G_1 has a triplex ({A, C}, B) that G_2 does not have. If A and C are in the same undirected connectivity component of G_1, then A ⊥ C ∣ ne_{G_1}(A) ∪ pa_{G_1}(A ∪ ne_{G_1}(A)) holds for G_1 by Theorem 5. Note also that B ∉ ne_{G_1}(A) ∪ pa_{G_1}(A ∪ ne_{G_1}(A)) because, otherwise, G_1 would not satisfy the constraint C1 or C2. Then, A ⊥ C ∣ ne_{G_1}(A) ∪ pa_{G_1}(A ∪ ne_{G_1}(A)) does not hold for G_2, which is a contradiction. On the other hand, if A and C are in different undirected connectivity components of G_1, then A ⊥ C ∣ pa_{G_1}(C) or A ⊥ C ∣ pa_{G_1}(A) holds for G_1 by Theorem 5. Note also that B ∉ pa_{G_1}(A) and B ∉ pa_{G_1}(C) because, otherwise, G_1 would not have the triplex ({A, C}, B).
Then, neither A ⊥ C ∣ pa_{G_1}(C) nor A ⊥ C ∣ pa_{G_1}(A) holds for G_2, which is a contradiction. Consequently, G_1 and G_2 must be triplex equivalent.

We now prove the "if" part. Let G_1 and G_2 be two triplex equivalent MAMP CGs. We just prove that all the non-separations in G_1 are also in G_2. The opposite result can be proven in the same manner by just exchanging the roles of G_1 and G_2 in the proof. Specifically, assume that α ⊥ β ∣ Z does not hold for G_1. We prove that α ⊥ β ∣ Z does not hold for G_2 either. We divide the proof in three parts.

Part 1

We say that a path has a triplex ({A, C}, B) if it has a subpath of the form A ⊸→ B ←⊸ C, A ⊸→ B − C, or A − B ←⊸ C. Let ρ_1 be any path between α and β in G_1 that is Z-open such that (i) no subpath of ρ_1 between α and β in G_1 is Z-open, (ii) every triplex node in ρ_1 is in Z, and (iii) ρ_1 has no non-triplex node in Z. Let ρ_2 be the path in G_2 that consists of the same nodes as ρ_1. Then, ρ_2 is Z-open. To see it, assume the contrary. Then, one of the following cases must occur.

Case 1: ρ_2 does not have a triplex ({A, C}, B) and B ∈ Z. Then, ρ_1 must have a triplex ({A, C}, B) because it is Z-open. Then, A and C must be adjacent in G_1 and G_2 because these are triplex equivalent. Let ϱ_1 be the path obtained from ρ_1 by replacing the triplex ({A, C}, B) with the edge between A and C in G_1. Note that ϱ_1 cannot be Z-open because, otherwise, it would contradict the condition (i). Then, ϱ_1 is not Z-open because A or C does not meet the requirements. Assume without loss of generality that C does not meet the requirements. Then, one of the following cases must occur.

Case 1.1: ϱ_1 does not have a triplex ({A, D}, C) and C ∈ Z. Then, one of the following subgraphs must occur in G_1.
[Figure omitted: four candidate subgraphs of G_1 over the nodes A, B, C and D.]

However, the first three subgraphs imply that ρ_1 is not Z-open, which is a contradiction. The fourth subgraph implies that ϱ_1 is Z-open, which is a contradiction.

Footnote 3 (on the subgraphs of Case 1.1): If ϱ_1 does not have a triplex ({A, D}, C), then A ← C, C → D or A − C − D must be in G_1. Moreover, recall that B is a triplex node in ρ_1. Then, A → B ← C, A → B ↔ C, A → B − C, A ↔ B ← C, A ↔ B ↔ C, A ↔ B − C, A − B ← C or A − B ↔ C must be in G_1. However, if A ← C is in G_1 then the only legal options are those that contain the edge B ← C. On the other hand, if A − C − D is in G_1 then the only legal options are A → B ← C and A ↔ B ↔ C.

Case 1.2: ϱ_1 has a triplex ({A, D}, C) and C ∉ Z ∪ san_{G_1}(Z). Note that C cannot be a triplex node in ρ_1 because, otherwise, ρ_1 would not be Z-open. Then, one of the following subgraphs must occur in G_1.

[Figure omitted: four candidate subgraphs of G_1 over the nodes A, B, C and D.]

However, the first and second subgraphs imply that C ∈ Z ∪ san_{G_1}(Z) because B ∈ Z, which is a contradiction. The third subgraph implies that B − D is in G_1 by the constraint C3 and, thus, that the path obtained from ρ_1 by replacing B − C − D with B − D is Z-open, which contradicts the condition (i). For the fourth subgraph, assume that A and D are adjacent in G_1. Then, one of the following subgraphs must occur in G_1.

[Figure omitted: three candidate subgraphs of G_1 over the nodes A, B, C, D and E.]

However, the first subgraph implies that the path obtained from ρ_1 by replacing A → B − C − D with A → D is Z-open, because D ∉ Z since ρ_1 is Z-open. This contradicts the condition (i). The second subgraph implies that the path obtained from ρ_1 by replacing A → B − C − D with A → D is Z-open, because D ∈ Z ∪ san_{G_1}(Z) since ρ_1 is Z-open. This contradicts the condition (i). Therefore, only the third subgraph is possible.
Thus, by repeatedly applying the previous reasoning, we can conclude without loss of generality that the following subgraph must occur in G_1, with n ≥ 4, V_1 = A, V_2 = B, V_3 = C, V_4 = D and where V_1 and V_n are not adjacent in G_1. Note that the subgraph below covers the case where A and D are not adjacent in the original subgraph by simply taking n = 4.

[Figure omitted: subgraph of G_1 over the nodes V_1, V_2, V_3, V_4, ..., V_{n−1}, V_n.]

Since V_1 and V_n are not adjacent in G_1, G_1 has a triplex ({V_1, V_n}, V_{n−1}) and, thus, so does G_2 because G_1 and G_2 are triplex equivalent. Then, one of the following subgraphs must occur in G_2.

[Figure omitted: three candidate subgraphs of G_2 over the nodes V_1, ..., V_{n−1}, V_n.]

Note that V_1, ..., V_n must be a path in G_2, because G_1 and G_2 are triplex equivalent. Note also that this path cannot have any triplex in G_2. To see it, recall that we assumed that ρ_2 does not have a triplex ({A, C}, B). Recall that V_1 = A, V_2 = B, V_3 = C. Moreover, if the path V_1, ..., V_n has a triplex ({V_i, V_{i+2}}, V_{i+1}) in G_2 with 2 ≤ i ≤ n − 2, then V_i and V_{i+2} must be adjacent in G_1 and G_2, because such a triplex does not exist in G_1, which is triplex equivalent to G_2. Specifically, V_i − V_{i+2} must be in G_1 because, as seen above, V_i − V_{i+1} − V_{i+2} is in G_1. Then, the path obtained from ρ_1 by replacing V_i − V_{i+1} − V_{i+2} with V_i − V_{i+2} is Z-open, which contradicts the condition (i). However, if the path V_1, ..., V_n has no triplex in G_2, then every edge in the path must be directed as ← in the case of the first and second subgraphs above, whereas every edge in the path must be undirected or directed as ← in the third subgraph above. Either case contradicts the constraint C1 or C2.

Case 2: Case 1 does not apply. Then, ρ_2 has a triplex ({A, C}, B) and B ∉ Z ∪ san_{G_2}(Z). Then, ρ_1 cannot have a triplex ({A, C}, B).
Then, A and C must be adjacent in G_1 and G_2 because these are triplex equivalent. Let ϱ_1 be the path obtained from ρ_1 by replacing the triplex ({A, C}, B) with the edge between A and C in G_1. Note that ϱ_1 cannot be Z-open because, otherwise, it would contradict the condition (i). Then, ϱ_1 is not Z-open because A or C does not meet the requirements. Assume without loss of generality that C does not meet the requirements. Then, one of the following cases must occur.

Case 2.1: ϱ_1 has a triplex ({A, D}, C) and C ∉ Z ∪ san_{G_1}(Z). Then, one of the following subgraphs must occur in G_1 (see footnote 4).

[Figure omitted: six candidate subgraphs of G_1 over the nodes A, B, C and D.]

However, this implies that C is a triplex node in ρ_1, which is a contradiction because ρ_1 is Z-open but C ∉ Z ∪ san_{G_1}(Z).

Case 2.2: ϱ_1 does not have a triplex ({A, D}, C) and C ∈ Z. Then, A ← C, C → D or A − C − D.

Case 2.2.1: If C → D or A − C − D, then one of the following subgraphs must occur in G_1.

[Figure omitted: three candidate subgraphs of G_1 over the nodes A, B, C and D.]

However, the first and second subgraphs imply that ρ_1 is not Z-open, which is a contradiction. The third subgraph implies that ϱ_1 is Z-open, which is a contradiction.

Case 2.2.2: If A ← C then ({A, D}, C) is not a triplex in ϱ_1. However, note that ρ_1 must have a triplex ({B, D}, C), because ρ_1 is Z-open and C ∈ Z. Then, one of the following subgraphs must occur in G_1.

[Figure omitted: three candidate subgraphs of G_1 over the nodes A, B, C and D.]

Assume that A and D are adjacent in G_1. Then, A ← D must be in G_1. Moreover, D ∈ Z because, otherwise, we can remove B and C from ρ_1 and get a Z-open path between A and B in G_1 that is shorter than ρ_1, which contradicts the condition (i). Then, D must be a triplex node in ρ_1. Then, one of the following subgraphs must occur in G_1.
[Figure omitted: five candidate subgraphs of G_1 over the nodes A, B, C, D and E.]

Thus, by repeatedly applying the previous reasoning, we can conclude without loss of generality that the following subgraph must occur in G_1, with n ≥ 4, V_1 = A, V_2 = B, V_3 = C, V_4 = D and where V_1 and V_n are not adjacent in G_1. Note that the subgraph below covers the case where A and D are not adjacent in the original subgraph by simply taking n = 4.

[Figure omitted: subgraph of G_1 over the nodes V_1, V_2, ..., V_{n−1}, V_n.]

Footnote 4 (on the subgraphs of Case 2.1): If ϱ_1 has a triplex ({A, D}, C), then A → C ←⊸ D, A → C − D, A ↔ C ←⊸ D, A ↔ C − D or A − C ←⊸ D must be in G_1. Moreover, recall that B is not a triplex node in ρ_1. Then, A ← B ← C, A ← B → C, A ← B ↔ C, A ← B − C, A → B → C, A ↔ B → C, A − B → C or A − B − C must be in G_1. However, if A → C is in G_1 then the only legal options are those that contain the edge B → C. On the other hand, if A ↔ C is in G_1 then the only legal option is A ← B → C. Finally, if A − C is in G_1 then the only legal options are A ← B → C and A − B − C.

Note that V_i is a triplex node in ρ_1 for all 3 ≤ i ≤ n − 1. Then, V_i ∈ Z for all 3 ≤ i ≤ n − 1 by the condition (ii) because ρ_1 is Z-open. Then, V_i must be a triplex node in ρ_2 for all 3 ≤ i ≤ n − 1 because, otherwise, Case 1 would apply instead of Case 2. Recall that V_2 = B is also a triplex node in ρ_2. Note that G_1 does not have a triplex ({V_1, V_n}, V_{n−1}) and, thus, G_2 does not have it either because these are triplex equivalent. Then, one of the following subgraphs must occur in G_2.

[Figure omitted: three candidate subgraphs of G_2 over the nodes V_1, ..., V_{n−1}, V_n.]

However, the first subgraph implies that V_{n−1} is not a triplex node in ρ_2, which is a contradiction. The second subgraph implies that G_2 has a cycle that violates the constraint C1. To see it, recall that V_i is a triplex node in ρ_2 for all 2 ≤ i ≤ n − 1 and, thus, V_i ← V_{i+1} is not in G_2 for all 1 ≤ i ≤ n − 2.
The third subgraph implies that V_{n−2} ↔ V_{n−1} is not in G_2 because, otherwise, V_1 and V_n would be adjacent by the constraint C3. Therefore, V_{n−2} → V_{n−1} must be in G_2 because V_{n−1} is a triplex node in ρ_2. However, this implies that V_{n−2} is not a triplex node in ρ_2, which is a contradiction.

Part 2

Let ρ_1 be any of the shortest Z-open paths between α and β in G_1 such that all its triplex nodes are in Z. Let ρ_2 be the path in G_2 that consists of the same nodes as ρ_1. We prove below that ρ_2 is Z-open. We prove this result by induction on the number of non-triplex nodes of ρ_1 that are in Z. If this number is zero, then Part 1 proves the result. Assume as induction hypothesis that the result holds when the number is smaller than m. We now prove it for m.

Let ρ_1^{A:B} denote the subpath of ρ_1 between the nodes A and B. Let C be any of the non-triplex nodes of ρ_1 that are in Z. Note that there must exist some node D ∈ pa_{G_1}(C)∖Z for ρ_1 to be Z-open. If D is in ρ_1, then ρ_1^{α:D} ∪ D → C ∪ ρ_1^{C:β} or ρ_1^{α:C} ∪ C ← D ∪ ρ_1^{D:β} is a Z-open path between α and β in G_1 that has fewer than m non-triplex nodes in Z. Then, the result holds by the induction hypothesis. On the other hand, if D is not in ρ_1, then ρ_1^{α:C} ∪ C ← D and D → C ∪ ρ_1^{C:β} are two paths. Moreover, they are Z-open in G_1 and they have fewer than m non-triplex nodes in Z. Then, by the induction hypothesis, there are two Z-open paths ρ_2^{α:D} and ρ_2^{D:β} in G_2 such that the former ends with the nodes C and D and the latter starts with these two nodes. Now, consider the following cases.

Case 1: ρ_2^{α:D} ends with A − C ← D. Then, ρ_2^{D:β} starts with D → C − B or D → C ←⊸ B. Then, ρ_2 = ρ_2^{α:C} ∪ ρ_2^{C:β} is a Z-open path in either case.

Case 2: ρ_2^{α:D} ends with A − C ↔ D. Then, ρ_2^{D:β} starts with D ↔ C − B or D ↔ C ←⊸ B.
Then, ρ_2 = ρ_2^{α:C} ∪ ρ_2^{C:β} is a Z-open path in either case.

Case 3: ρ_2^{α:D} ends with A ⊸→ C − D. Then, ρ_2^{D:β} starts with D − C ←⊸ B, or D − C − B such that there is some node E ∈ pa_{G_2}(C)∖Z. Then, ρ_2 = ρ_2^{α:C} ∪ ρ_2^{C:β} is a Z-open path in either case.

Case 4: ρ_2^{α:D} ends with A ⊸→ C ←⊸ D. Then, ρ_2^{D:β} starts with D ⊸→ C − B or D ⊸→ C ←⊸ B. Then, ρ_2 = ρ_2^{α:C} ∪ ρ_2^{C:β} is a Z-open path in either case.

Case 5: ρ_2^{α:D} ends with A − C − D such that there is some node E ∈ pa_{G_2}(C)∖Z. Then, ρ_2^{D:β} starts with D − C ←⊸ B, or D − C − B such that there is some node F ∈ pa_{G_2}(C)∖Z. Then, ρ_2 = ρ_2^{α:C} ∪ ρ_2^{C:β} is a Z-open path in either case.

Part 3

Assume that Part 2 does not apply. Then, every Z-open path between α and β in G_1 has some triplex node B_1 that is outside Z because, otherwise, Part 2 would apply. Note that for the path to be Z-open, G_1 must have a subgraph B_1 → ... → B_n such that B_1, ..., B_{n−1} ∉ Z but B_n ∈ Z. Let us convert every Z-open path between α and β in G_1 into a route by replacing each of its triplex nodes B_1 that are outside Z with the corresponding route B_1 → ... → B_n ← ... ← B_1. Let ϱ_1 be any of the shortest routes so constructed. Let ρ_1 be the path from which ϱ_1 was constructed. Note that ρ_1 cannot be Z-open such that all its triplex nodes are in Z because, otherwise, Part 2 would apply. Let W denote the set of all the triplex nodes in ρ_1 that are outside Z. Then, ρ_1 is one of the shortest (Z ∪ W)-open paths between α and β in G_1 such that all its triplex nodes are in Z ∪ W. To see it, assume to the contrary that ρ′_1 is a (Z ∪ W)-open path between α and β in G_1 that is shorter than ρ_1 and such that all the triplex nodes in ρ′_1 are in Z ∪ W. Let ϱ′_1 be the route resulting from replacing every node B_1 of ρ′_1 that is in W with the route B_1 → ... → B_n ← ...
← B_1 that was added to ρ_1 to construct ϱ_1. Clearly, ϱ′_1 is shorter than ϱ_1, which is a contradiction.

Let ϱ_2 and ρ_2 be the route and the path in G_2 that consist of the same nodes as ϱ_1 and ρ_1. Note that ρ_2 is (Z ∪ W)-open by Part 2. Consider any of the routes B_1 → ... → B_n ← ... ← B_1 that were added to ρ_1 to construct ϱ_1. This implies that ρ_1 has a triplex ({A, C}, B_1). Assume that B_1 → B_2 is in G_1 but B_1 − B_2 or B_1 ←⊸ B_2 is in G_2. Note that A ⊸→ B_1 or B_1 ←⊸ C is in G_2 because, as noted above, ρ_2 is (Z ∪ W)-open. Assume without loss of generality that A ⊸→ B_1 is in G_2. Then, A − B_1 → B_2 or A ⊸→ B_1 → B_2 is in G_1 whereas A ⊸→ B_1 − B_2 or A ⊸→ B_1 ←⊸ B_2 is in G_2. Therefore, A and B_2 must be adjacent in G_1 and G_2 because these are triplex equivalent. This implies that A → B_2 is in G_1. Moreover, A ∈ Z because, otherwise, we can construct a route that is shorter than ϱ_1 by simply removing B_1 from ϱ_1, which is a contradiction. This implies that A ↔ B_1 is in G_2 because, otherwise, ρ_2 would not be (Z ∪ W)-open. This implies that A ↔ B_1 − B_2 or A ↔ B_1 ←⊸ B_2 is in G_2, which implies that A − B_2 or A ←⊸ B_2 is in G_2. The situation is depicted in the following subgraphs.

[Figure omitted: two subgraphs of G_1 and four subgraphs of G_2 over the nodes A, C, B_1 and B_2.]

Now, let A′ be the node that precedes A in ρ_1. Note that A′ ← A cannot be in ρ_1 or ρ_2 because, otherwise, these would not be (Z ∪ W)-open since A ∈ Z. Then, A′ − A or A′ ⊸→ A is in G_1 and G_2. Then, A′ − A → B_2 or A′ ⊸→ A → B_2 is in G_1 whereas A′ − A ←⊸ B_2, A′ − A − B_2, A′ ⊸→ A ←⊸ B_2 or A′ ⊸→ A − B_2 is in G_2.
These four subgraphs of G_2 imply that A′ and B_2 must be adjacent in G_1 and G_2: the second subgraph due to the constraint C3 because A ↔ B_1 is in G_2, and the other three subgraphs because G_1 and G_2 are triplex equivalent. By repeating the reasoning in the paragraph above, we can conclude that A′ → B_2 is in G_1, which implies that A′ ∈ Z, which implies that A′ − A or A′ ↔ A is in G_2, which implies that A′ − B_2 or A′ ←⊸ B_2 is in G_2. By repeating the reasoning in the paragraph above (see footnote 5), we can conclude that α → B_2 is in G_1 and, thus, we can construct a route that is shorter than ϱ_1 by simply removing some nodes from ϱ_1, which is a contradiction. Consequently, B_1 → B_2 must be in G_2.

Footnote 5: Let A″ be the node that precedes A′ in ρ_1. For this repeated reasoning to be correct, it is important to realize that if A′ − A is in G_2, then A″ ↔ A′ must be in G_2, because A′ ∈ Z and ρ_2 is (Z ∪ W)-open.

Finally, assume that B_1 → B_2 → B_3 is in G_1 but B_1 → B_2 − B_3 or B_1 → B_2 ←⊸ B_3 is in G_2. Then, B_1 and B_3 must be adjacent in G_1 and G_2 because these are triplex equivalent. This implies that B_1 → B_3 is in G_1, which implies that we can construct a route that is shorter than ϱ_1 by simply removing B_2 from ϱ_1, which is a contradiction. By repeating this reasoning, we can conclude that B_1 → ... → B_n is in G_2 and, thus, that ρ_2 is Z-open. □

Proof of Lemma 1. Assume to the contrary that there are two such sets of directed node pairs. Let the MAMP CG G contain exactly the directed node pairs in one of the sets, and let the MAMP CG H contain exactly the directed node pairs in the other set. For every A → B in G such that A − B or A ↔ B is in H, replace the edge between A and B in H with A → B and call the resulting graph F.
We prove below that F is a MAMP CG that is triplex equivalent to G and thus to H, which is a contradiction since F has a proper superset of the directed node pairs in H.

First, note that F cannot violate the constraints C2 and C3. Assume to the contrary that F violates the constraint C1 due to a cycle ρ. Note that none of the directed edges in ρ can be in H because, otherwise, H would violate the constraint C1, since H has the same adjacencies as F but a subset of the directed edges in F. Then, all the directed edges in ρ must be in G. However, this implies the contradictory conclusion that G violates the constraint C1, since G has the same adjacencies as F but a subset of the directed edges in F.

Second, assume to the contrary that G (and, thus, H) has a triplex ({A, C}, B) that F does not have. Then, {A, B} or {B, C} must be a directed node pair in G because, otherwise, F would have a triplex ({A, C}, B) since F would have the same induced graph over {A, B, C} as H. Specifically, A → B or B ← C must be in G because, otherwise, G would not have a triplex ({A, C}, B). Moreover, neither A ← B nor B → C can be in H because, otherwise, H would not have a triplex ({A, C}, B). Therefore, if A → B or B ← C is in G and neither A ← B nor B → C is in H, then A → B or B ← C must be in F. However, this implies that B → C or A ← B must be in F because, otherwise, F would have a triplex ({A, C}, B), which would be a contradiction. However, this is a contradiction since neither B → C nor A ← B can be in G or H because, otherwise, neither G nor H would have a triplex ({A, C}, B).

Finally, assume to the contrary that F has a triplex ({A, C}, B) that G does not have (and, thus, nor does H). Then, A − B − C must be in H because, otherwise, A ← B or B → C would be in H and, thus, F would not have a triplex ({A, C}, B).
However, this implies that A → B or B ← C is in G because, otherwise, F would not have a triplex ({A, C}, B). However, this implies that B → C or A ← B is in G because, otherwise, G would have a triplex ({A, C}, B). Therefore, A → B → C or A ← B ← C is in G and, thus, A → B → C or A ← B ← C must be in F since A − B − C is in H. However, this contradicts the assumption that F has a triplex ({A, C}, B). □

Proof of Lemma 2. Assume to the contrary that there are two such sets of bidirected edges. Let the MDCG G contain exactly the bidirected edges in one of the sets, and let the MDCG H contain exactly the bidirected edges in the other set. For every A ↔ B in G such that A − B is in H, replace A − B with A ↔ B in H and call the resulting graph F. We prove below that F is a MDCG that is triplex equivalent to G, which is a contradiction since F has a proper superset of the bidirected edges in G.

First, note that F cannot violate the constraint C1. Assume to the contrary that F violates the constraint C2 due to a cycle ρ. Note that all the undirected edges in ρ are in H. In fact, they must also be in G, because G and H have the same directed node pairs and bidirected edges. Moreover, the bidirected edge in ρ must be in G or H. However, this is a contradiction. Now, assume to the contrary that F violates the constraint C3 because A − B − C and B ↔ D are in F but A and C are not adjacent in F (note that if A and C were adjacent in F, then they would not violate the constraint C3 or they would violate the constraint C1 or C2, which is impossible as we have just shown). Note that A − B − C must be in H. In fact, A − B − C must also be in G, because G and H have the same directed node pairs and bidirected edges. Moreover, B ↔ D must be in G or H.
However, this implies that A and C are adjacent in G or H by the constraint C3, which implies that A and C are adjacent in G and H because they are triplex equivalent and thus also in F, which is a contradiction. Consequently, F is a MAMP CG, which implies that F is a MDCG because it has the same directed edges as G and H.

Second, note that all the triplexes in G are in F too.

Finally, assume to the contrary that F has a triplex ({A, C}, B) that G does not have (and, thus, nor does H). Then, A − B − C must be in H because, otherwise, A ← B or B → C would be in H and thus F would not have a triplex ({A, C}, B). However, this implies that F has the same induced graph over {A, B, C} as G, which contradicts the assumption that F has a triplex ({A, C}, B). □

Proof of Theorem 8. It suffices to show that every Z-open path between α and β in G can be transformed into a Z-open path between α and β in G′ and vice versa, with α, β ∈ V and Z ⊆ V ∖ α ∖ β.

Let ρ denote a Z-open path between α and β in G. We can easily transform ρ into a path ρ′ between α and β in G′: simply replace every maximal subpath of ρ of the form V_1 ∗ V_2 ∗ ... ∗ V_{n−1} ∗ V_n (n ≥ 2) with V_1 ← ε_{V_1} ∗ ε_{V_2} ∗ ... ∗ ε_{V_{n−1}} ∗ ε_{V_n} → V_n, where ∗ stands for an undirected or bidirected edge. We now show that ρ′ is Z-open.

Case 1.1: If B ∈ V is a triplex node in ρ′, then ρ′ must have one of the following subpaths:

[Figure: three subpaths, over the node sets {A, B, C}, {A, B, ε_B, ε_C} and {ε_A, ε_B, B, C}.]

with A, C ∈ V. Therefore, ρ must have one of the following subpaths (specifically, if ρ′ has the i-th subpath above, then ρ has the i-th subpath below):

[Figure: three subpaths over the nodes A, B, C.]

In either case, B is a triplex node in ρ and, thus, B ∈ Z ∪ san_G(Z) for ρ to be Z-open. Then, B ∈ Z ∪ san_{G′}(Z) by construction of G′ and, thus, B ∈ D(Z) ∪ san_{G′}(D(Z)).
Case 1.2: If B ∈ V is a non-triplex node in ρ′, then ρ′ must have one of the following subpaths:

[Figure: five subpaths, three over the nodes A, B, C, one over {A, B, ε_B, ε_C} and one over {ε_A, ε_B, B, C}.]

with A, C ∈ V. Therefore, ρ must have one of the following subpaths (specifically, if ρ′ has the i-th subpath above, then ρ has the i-th subpath below):

[Figure: five subpaths over the nodes A, B, C.]

In either case, B is a non-triplex node in ρ and, thus, B ∉ Z for ρ to be Z-open. Since Z contains no error node, Z cannot determine any node in V that is not already in Z. Then, B ∉ D(Z).

Case 1.3: If ε_B is a triplex node in ρ′, then ρ′ must have one of the following subpaths:

[Figure: two subpaths over the nodes ε_A, ε_B, ε_C.]

Therefore, ρ must have one of the following subpaths (specifically, if ρ′ has the i-th subpath above, then ρ has the i-th subpath below):

[Figure: two subpaths over the nodes A, B, C.]

In either case, B is a triplex node in ρ and, thus, B ∈ Z ∪ san_G(Z) for ρ to be Z-open. Then, ε_B ∈ Z ∪ san_{G′}(Z) by construction of G′ and, thus, ε_B ∈ D(Z) ∪ san_{G′}(D(Z)).

Case 1.4: If ε_B is a non-triplex node in ρ′, then ρ′ must have one of the following subpaths:

[Figure: seven subpaths, over the node sets {A, B, ε_B, ε_C}, {ε_A, ε_B, B, C}, {α = B, ε_B, ε_C}, {ε_A, ε_B, B = β}, {A, B, ε_B, ε_C}, {ε_A, ε_B, B, C} and {ε_A, ε_B, ε_C}.]

with A, C ∈ V. Recall that ε_B ∉ Z because Z ⊆ V ∖ α ∖ β. In the first case, if α = A then A ∉ Z, else A ∉ Z for ρ to be Z-open. Then, ε_B ∉ D(Z). In the second case, if β = C then C ∉ Z, else C ∉ Z for ρ to be Z-open. Then, ε_B ∉ D(Z). In the third and fourth cases, B ∉ Z because α = B or β = B. Then, ε_B ∉ D(Z). In the fifth and sixth cases, B ∉ Z for ρ to be Z-open. Then, ε_B ∉ D(Z). The last case implies that ρ has the following subpath:

[Figure: one subpath over the nodes A, B, C.]

Thus, B is a non-triplex node in ρ, which implies that B ∉ Z or pa_G(B) ∖ Z ≠ ∅ for ρ to be Z-open. In either case, ε_B ∉ D(Z) (recall that pa_{G′}(B) = pa_G(B) ∪ ε_B by construction of G′).

Finally, let ρ′ denote a Z-open path between α and β in G′.
We can easily transform ρ′ into a path ρ between α and β in G: simply replace every maximal subpath of ρ′ of the form V_1 ← ε_{V_1} ∗ ε_{V_2} ∗ ... ∗ ε_{V_{n−1}} ∗ ε_{V_n} → V_n (n ≥ 2) with V_1 ∗ V_2 ∗ ... ∗ V_{n−1} ∗ V_n, where ∗ stands for an undirected or bidirected edge. We now show that ρ is Z-open. Note that all the nodes in ρ are in V.

Case 2.1: If B is a triplex node in ρ, then ρ must have one of the following subpaths:

[Figure: five subpaths over the nodes A, B, C.]

with A, C ∈ V. Therefore, ρ′ must have one of the following subpaths (specifically, if ρ has the i-th subpath above, then ρ′ has the i-th subpath below):

[Figure: five subpaths, over the node sets {A, B, C}, {A, B, ε_B, ε_C}, {ε_A, ε_B, B, C}, {ε_A, ε_B, ε_C} and {ε_A, ε_B, ε_C}.]

In the first three cases, B is a triplex node in ρ′ and, thus, B ∈ D(Z) ∪ san_{G′}(D(Z)) for ρ′ to be Z-open. Since Z contains no error node, Z cannot determine any node in V that is not already in Z. Then, B ∈ D(Z) iff B ∈ Z. Since there is no strictly descending route from B to any error node, any strictly descending route from B to a node D ∈ D(Z) implies that D ∈ V which, as seen, implies that D ∈ Z. Then, B ∈ san_{G′}(D(Z)) iff B ∈ san_{G′}(Z). Moreover, B ∈ san_{G′}(Z) iff B ∈ san_G(Z) by construction of G′. These results together imply that B ∈ Z ∪ san_G(Z).

In the last two cases, ε_B is a triplex node in ρ′ and, thus, B ∈ D(Z) ∪ san_{G′}(D(Z)) for ρ′ to be Z-open because Z contains no error node. Therefore, as shown in the previous paragraph, B ∈ Z ∪ san_G(Z).

Case 2.2: If B is a non-triplex node in ρ, then ρ must have one of the following subpaths:

[Figure: six subpaths over the nodes A, B, C.]

with A, C ∈ V. Therefore, ρ′ must have one of the following subpaths (specifically, if ρ has the i-th subpath above, then ρ′ has the i-th subpath below):

[Figure: six subpaths, three over the nodes A, B, C, one over {A, B, ε_B, ε_C}, one over {ε_A, ε_B, B, C} and one over {ε_A, ε_B, ε_C}.]

In the first five cases, B is a non-triplex node in ρ′ and, thus, B ∉ D(Z) for ρ′ to be Z-open.
Since Z contains no error node, Z cannot determine any node in V that is not already in Z. Then, B ∉ Z. In the last case, ε_B is a non-triplex node in ρ′ and, thus, ε_B ∉ D(Z) for ρ′ to be Z-open. Then, B ∉ Z or pa_{G′}(B) ∖ ε_B ∖ Z ≠ ∅. Then, B ∉ Z or pa_G(B) ∖ Z ≠ ∅ (recall that pa_{G′}(B) = pa_G(B) ∪ ε_B by construction of G′). □

Proof of Theorem 9. We find it easier to prove the theorem by defining separation in MAMP CGs in terms of routes rather than paths. A node B in a route ρ in a MAMP CG G is called a triplex node in ρ if A ⊸→ B ←⊸ C, A ⊸→ B − C, or A − B ←⊸ C is a subroute of ρ (note that maybe A = C in the first case). Note that B may be both a triplex and a non-triplex node in ρ. Moreover, ρ is said to be Z-open with Z ⊆ V when
● every triplex node in ρ is in D(Z), and
● no non-triplex node in ρ is in D(Z).
When there is no Z-open route in G between a node in X and a node in Y, we say that X is separated from Y given Z in G and denote it as X ⊥_G Y ∣ Z. It is straightforward to see that this and the original definition of separation in MAMP CGs introduced in Section 4 are equivalent, in the sense that they identify the same separations in G.

We prove the theorem for the case where L contains a single node B. The general case follows by induction. Specifically, given α, β ∈ V ∖ L and Z ⊆ V ∖ L ∖ α ∖ β, we show below that every Z-open route between α and β in [G′]_L can be transformed into a Z-open route between α and β in G′ and vice versa.

First, let ρ denote a Z-open route between α and β in [G′]_L. We can easily transform ρ into a Z-open route between α and β in G′: for each edge A → C or A ← C with A, C ∈ V ∪ ε that is in [G′]_L but not in G′, replace each of its occurrences in ρ with A → B → C or A ← B ← C, respectively. Note that B ∉ D(Z) because B, ε_B ∉ Z.
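The route-based definitions used in this proof can be illustrated with a short Python sketch. It is not part of the original proof. It assumes that the three triplex patterns are read as "at least one of the two edges at B has an arrowhead at B, and the other edge at B is either undirected or also has an arrowhead at B", that a route is given as a list of nodes plus a list of edge marks between consecutive nodes, and that the set D(Z) has already been computed and is passed in as `determined`. The names `is_triplex`, `is_open` and `determined`, and the string encoding of edges, are hypothetical choices made for this illustration only.

```python
from typing import List, Set

# Edge marks between consecutive route nodes X, Y, read left to right
# (an encoding assumed for this sketch):
#   "->"  X -> Y,   "<-"  X <- Y,   "<->"  X <-> Y,   "-"  X - Y
HEAD_AT_RIGHT = {"->", "<->"}  # the edge has an arrowhead at the right-hand node
HEAD_AT_LEFT = {"<-", "<->"}   # the edge has an arrowhead at the left-hand node


def is_triplex(left_edge: str, right_edge: str) -> bool:
    """Is B a triplex occurrence in the subroute  A left_edge B right_edge C ?"""
    head_from_left = left_edge in HEAD_AT_RIGHT   # arrowhead at B coming from A
    head_from_right = right_edge in HEAD_AT_LEFT  # arrowhead at B coming from C
    return (
        (head_from_left and head_from_right)        # arrowheads at B on both sides
        or (head_from_left and right_edge == "-")   # arrowhead from A, then B - C
        or (left_edge == "-" and head_from_right)   # A - B, then arrowhead from C
    )


def is_open(nodes: List[str], edges: List[str], determined: Set[str]) -> bool:
    """Check the two bullet conditions for every interior occurrence of a node.

    `determined` plays the role of D(Z) and is assumed to be given.
    A node may occur several times in a route, so each occurrence is
    checked separately; the two end nodes are not classified.
    """
    if len(edges) != len(nodes) - 1:
        raise ValueError("need exactly one edge mark per consecutive node pair")
    for i in range(1, len(nodes) - 1):
        triplex = is_triplex(edges[i - 1], edges[i])
        if triplex != (nodes[i] in determined):
            return False
    return True
```

For instance, under these assumptions the route A → B ← C is open exactly when B is in `determined`, while A → B → C is open exactly when B is not.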
Second, let ρ denote a Z-open route between α and β in G′. Note that B cannot participate in any undirected or bidirected edge in G′, because B ∈ V. Note also that B cannot be a triplex node in ρ, because B ∉ D(Z) since B, ε_B ∉ Z. Note also that B ≠ α, β. Then, B can only appear in ρ in the following configurations: A → B → C, A ← B ← C, or A ← B → C with A, C ∈ V ∪ ε. Then, we can easily transform ρ into a Z-open route between α and β in [G′]_L: replace each occurrence of A → B → C in ρ with A → C, each occurrence of A ← B ← C in ρ with A ← C, and each occurrence of A ← B → C in ρ with A ← ε_B → C. In the last case, note that ε_B ∉ D(Z) because B, ε_B ∉ Z. □

References

Andersson, S. A., Madigan, D. and Perlman, M. D. Alternative Markov Properties for Chain Graphs. Scandinavian Journal of Statistics, 28:33-85, 2001.
Bishop, C. M. Pattern Recognition and Machine Learning. Springer, 2006.
Bouckaert, R. R. Bayesian Belief Networks: From Construction to Inference. PhD Thesis, University of Utrecht, 1995.
Cox, D. R. and Wermuth, N. Linear Dependencies Represented by Chain Graphs. Statistical Science, 8:204-218, 1993.
Cox, D. R. and Wermuth, N. Multivariate Dependencies - Models, Analysis and Interpretation. Chapman & Hall, 1996.
Drton, M. Discrete Chain Graph Models. Bernoulli, 15:736-753, 2009.
Evans, R. J. and Richardson, T. S. Marginal log-linear Parameters for Graphical Markov Models. Journal of the Royal Statistical Society B, 75:743-768, 2013.
Geiger, D., Verma, T. and Pearl, J. Identifying Independence in Bayesian Networks. Networks, 20:507-534, 1990.
Kang, C. and Tian, J. Markov Properties for Linear Causal Models with Correlated Errors. Journal of Machine Learning Research, 10:41-70, 2009.
Koster, J. T. A. Marginalizing and Conditioning in Graphical Models. Bernoulli, 8:817-840, 2002.
Lauritzen, S. L. Graphical Models. Oxford University Press, 1996.
Levitz, M., Perlman, M. D. and Madigan, D. Separation and Completeness Properties for AMP Chain Graph Markov Models. The Annals of Statistics, 29:1751-1784, 2001.
Peña, J. M. Faithfulness in Chain Graphs: The Gaussian Case. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, 588-599, 2011.
Peña, J. M. Learning AMP Chain Graphs under Faithfulness. In Proceedings of the 6th European Workshop on Probabilistic Graphical Models, 251-258, 2012.
Richardson, T. and Spirtes, P. Ancestral Graph Markov Models. The Annals of Statistics, 30:962-1030, 2002.
Roverato, A. and Studený, M. A Graphical Representation of Equivalence Classes of AMP Chain Graphs. Journal of Machine Learning Research, 7:1045-1078, 2006.
Sadeghi, K. and Lauritzen, S. L. Markov Properties for Mixed Graphs. arXiv:1109.5909v4 [stat.OT].
Sadeghi, K. Stable Mixed Graphs. Bernoulli, to appear.
Sonntag, D. and Peña, J. M. Learning Multivariate Regression Chain Graphs under Faithfulness. In Proceedings of the 6th European Workshop on Probabilistic Graphical Models, 299-306, 2012.
Sonntag, D. and Peña, J. M. Chain Graph Interpretations and their Relations. In Proceedings of the 12th European Conference on Symbolic and Quantitative Approaches to Reasoning under Uncertainty, 510-521.
Studený, M. Probabilistic Conditional Independence Structures. Springer, 2005.
