Formulas for Counting the Sizes of Markov Equivalence Classes of Directed Acyclic Graphs

Yangbo He (heyb@pku.edu.cn)
LMAM, School of Mathematical Sciences, LMEQF, and Center for Statistical Science, Peking University

Bin Yu (binyu@stat.berkeley.edu)
Departments of Statistics and EECS, UC Berkeley

Abstract

The sizes of Markov equivalence classes of directed acyclic graphs play important roles in measuring the uncertainty and complexity in causal learning. A Markov equivalence class can be represented by an essential graph, and its undirected subgraphs determine the size of the class. In this paper, we develop a method to derive the formulas for counting the sizes of Markov equivalence classes. We first introduce a new concept, the core graph. Given its core graph, the size of a Markov equivalence class of interest is a polynomial of the number of vertices. Then, we discuss the recursive and explicit formulas of the polynomial, and provide an algorithm to derive the size formula via symbolic computation for any given core graph. The proposed size formula derivation sheds light on the relationship between the size of a Markov equivalence class and its representation graph, and makes size counting efficient, even when the essential graphs contain non-sparse undirected subgraphs.

Keywords: Directed acyclic graph; Markov equivalence class; Size formula; Causality

1. Introduction

A Markov equivalence class contains all statistically equivalent models of directed acyclic graphs (DAGs) (Pearl, 2000; Spirtes et al., 2001). In general, observational data is not sufficient to distinguish an underlying DAG from the others in the same Markov equivalence class. The size of a Markov equivalence class is the number of DAGs in the class.
It is used in the literature to measure the "uncertainty" of causal graphs or to evaluate the "complexity" of a Markov equivalence class in causal learning (Chickering, 2002; He and Geng, 2008). For example, He and Geng (2008) propose several criteria, all of which are defined on the sizes of Markov equivalence classes, to measure the uncertainty of causal graphs for a candidate intervention; choosing interventions by minimizing these criteria makes helpful but expensive interventions more efficient. Maathuis et al. (2009) introduce a method to estimate the average causal effects of the covariates on the response by considering the DAGs in the equivalence class; the size of the class determines the complexity of the estimation.

An essential graph represents a Markov equivalence class and its undirected subgraphs determine the size of the class (Andersson et al., 1997). The size of a small Markov equivalence class can be counted via traversal methods that list all DAGs in the class (Gillispie and Perlman, 2002). Recently, He et al. (2015) proposed a size counting algorithm that calculates the size of a Markov equivalence class by partitioning the class recursively. In general, this method is efficient for Markov equivalence classes represented by sparse essential graphs, but becomes very time-consuming when the essential graphs contain non-sparse undirected subgraphs.

Counting graphs based on formulas is usually elegant and efficient. Robinson (1973, 1977) provides recursive formulas to count DAGs with a given number of vertices. Steinsky (2003) develops recursive formulas to count Markov equivalence classes of size 1. Later, Gillispie (2006) introduces recursive formulas for arbitrary size, based on all configurations of the undirected essential graphs that produce this size.
However, few formulas are available for counting the size of a given Markov equivalence class, except the five formulas introduced in He et al. (2015) for Markov equivalence classes represented by five specific types of undirected essential graphs (trees, graphs with up to two missing edges, etc.).

In this paper, we focus on formulas for counting the size of a Markov equivalence class. We first introduce a new concept, the "core graph", which is an undirected chordal graph without dominating vertices. An undirected essential graph can be represented by its core graph and the number of dominating vertices. Given its core graph, the size of the corresponding Markov equivalence class is a polynomial of the number of dominating vertices. Then we develop an iterative method to derive the polynomial, and give the explicit polynomials for several specific types of core graphs and for all core graphs with up to five missing edges. Based on symbolic computation, we introduce a size formula derivation algorithm for general core graphs and a formula-based size counting algorithm for general Markov equivalence classes. Our experiments show that the proposed size formula derivation is efficient in general and that the formula-based algorithm can speed up size counting dramatically for the Markov equivalence classes represented by essential graphs with non-sparse undirected subgraphs.

The rest of the paper is arranged as follows. In Section 2, we give a brief introduction to Markov equivalence classes and their size counting. In Section 3, we propose a method to derive the size formulas and to count the sizes of Markov equivalence classes based on these formulas. In Section 4, we study the size formulas and formula-based size counting of Markov equivalence classes experimentally. We conclude in Section 5 and finally present all proofs in the Appendix.

2. Markov Equivalence Class and Size Counting

A graph $G$ consists of a vertex set $V$ and an edge set $E$. A graph is directed (undirected) if all of its edges are directed (undirected). A sequence of edges that connect distinct vertices in $V$, say $\{v_1, \cdots, v_k\}$, is called a path from $v_1$ to $v_k$ if either $v_i \to v_{i+1}$ or $v_i - v_{i+1}$ is in $E$ for $i = 1, \cdots, k-1$. A path is partially directed if at least one edge in the path is directed. A path is directed (undirected) if all edges are directed (undirected). A cycle is a path from a vertex to itself.

A directed acyclic graph (DAG) $D$ is a directed graph without any directed cycle. Let $V$ be the vertex set of $D$ and $\tau$ be a subset of $V$. The induced subgraph $D_\tau$ of $D$ over $\tau$ is defined to be the graph whose vertex set is $\tau$ and whose edge set contains all of those edges of $D$ with both end points in $\tau$. A v-structure is a three-vertex induced subgraph of $D$ of the form $v_1 \to v_2 \leftarrow v_3$. A graph is called a chain graph if it contains no partially directed cycles. The isolated undirected subgraphs of the chain graph after removing all directed edges are the chain components of the chain graph. A chord of a cycle is an edge that joins two nonadjacent vertices in the cycle. An undirected graph is chordal if every cycle with four or more vertices has a chord.

A graphical model is a probabilistic model for which a DAG denotes the conditional independencies between random variables. A Markov equivalence class is a set of DAGs that encode the same set of conditional independencies. Let the skeleton of an arbitrary graph $G$ be the undirected graph with the same vertices and edges as $G$, regardless of their directions. Verma and Pearl (1990) prove that two DAGs are Markov equivalent if and only if they have the same skeleton and the same v-structures. Moreover, Andersson et al.
(1997) show that a Markov equivalence class can be represented uniquely by an essential graph, denoted by $C$, which has the same skeleton as $D$ and in which an edge is directed if and only if it has the same orientation in every DAG equivalent to $D$. An essential graph is a chain graph and each of its chain components is an undirected and connected chordal graph (UCCG for short). Let $\mathrm{Size}(C)$ denote the size of the Markov equivalence class represented by $C$ (size of $C$ for short). Clearly, $\mathrm{Size}(C) = 1$ if $C$ is a DAG; otherwise $C$ contains at least one chain component, and we denote its chain components by $C_{\tau_1}, \ldots, C_{\tau_k}$. We can calculate the size of $C$ by counting the DAGs in the Markov equivalence classes represented by its chain components using the following equation (Gillispie and Perlman, 2002; He and Geng, 2008):

$$\mathrm{Size}(C) = \prod_{i=1}^{k} \mathrm{Size}(C_{\tau_i}). \tag{1}$$

Since each chain component is a UCCG, by Equation (1) it is sufficient to compute the sizes of the Markov equivalence classes represented by these UCCGs in order to obtain the size of a Markov equivalence class.

Let $U$ be a UCCG, $\tau$ be the vertex set of $U$, and $D$ be a DAG in the equivalence class represented by $U$. A vertex $v$ is a root of $D$ if all directed edges adjacent to $v$ are out of $v$, and $D$ is $v$-rooted if $v$ is a root of $D$. A $v$-rooted sub-class of $U$ is the set of all $v$-rooted DAGs in the Markov equivalence class represented by $U$. A $v$-rooted essential graph of $U$, denoted by $U^{(v)}$, is a graph that has the same skeleton as $U$ and in which an edge is directed if and only if it has the same orientation in every $v$-rooted DAG of $U$. He et al. (2015) show that a $v$-rooted sub-class of $U$ can be represented uniquely by a $v$-rooted essential graph and that a Markov equivalence class can be partitioned into sub-classes represented by its rooted essential graphs.
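
For very small graphs, the definitions above can be checked by brute force. The following Python sketch is an illustration only, not the counting method of He et al. (2015): it enumerates every orientation of a UCCG, keeps the acyclic orientations without v-structures (by the characterization of Verma and Pearl, these are the DAGs in the class represented by the UCCG), and tallies them by root. The function names and the edge-list representation are assumptions made for this example, and the input is assumed to be a connected chordal graph.

```python
from itertools import combinations, product


def is_acyclic(vertices, arcs):
    """Return True if the directed graph has no directed cycle."""
    remaining = set(vertices)
    while remaining:
        # A vertex can be peeled off if no remaining vertex points to it.
        free = {v for v in remaining
                if not any(h == v and t in remaining for t, h in arcs)}
        if not free:
            return False  # every remaining vertex has a parent: a cycle exists
        remaining -= free
    return True


def has_v_structure(vertices, arcs, skeleton):
    """Return True if some a -> c <- b exists with a and b non-adjacent."""
    for c in vertices:
        parents = [t for t, h in arcs if h == c]
        if any(frozenset(pair) not in skeleton
               for pair in combinations(parents, 2)):
            return True
    return False


def rooted_sizes(vertices, edges):
    """Map each vertex v of a UCCG to the number of v-rooted DAGs in its class."""
    skeleton = {frozenset(e) for e in edges}
    sizes = {v: 0 for v in vertices}
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(b, a) if flip else (a, b)
                for (a, b), flip in zip(edges, flips)]
        if not is_acyclic(vertices, arcs):
            continue
        if has_v_structure(vertices, arcs, skeleton):
            continue
        heads = {h for _, h in arcs}
        for v in vertices:
            if v not in heads:  # v has no incoming arc, so v is a root
                sizes[v] += 1
    return sizes


path = rooted_sizes("abc", [("a", "b"), ("b", "c")])
triangle = rooted_sizes("abc", [("a", "b"), ("b", "c"), ("a", "c")])
print(path, sum(path.values()))
print(triangle, sum(triangle.values()))
```

The path on three vertices yields one DAG per root (size 3), and the triangle yields two DAGs per root (size 6 = 3!). The enumeration visits all $2^{|E|}$ orientations, so it is only practical for a handful of edges.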

Lemma 1. Let $U$ be a UCCG over $\tau = \{v_i\}_{i=1,\cdots,p}$, $U^{(v_i)}$ be the $v_i$-rooted essential graph, and $\mathrm{Size}(U^{(v_i)})$ be the size of the $v_i$-rooted sub-class represented by $U^{(v_i)}$. We have $\mathrm{Size}(U^{(v_i)}) \ge 1$ for any $i = 1, \cdots, p$, and

$$\mathrm{Size}(U) = \sum_{i=1}^{p} \mathrm{Size}(U^{(v_i)}). \tag{2}$$

For any $i \in \{1, \cdots, p\}$, the undirected subgraphs of $U^{(v_i)}$ in Lemma 1 are UCCGs, so we can calculate $\mathrm{Size}(U^{(v_i)})$ in Equation (2) using Equation (1). As a result, using Equation (1) and Equation (2), He et al. (2015) propose to calculate the size of a Markov equivalence class by partitioning it recursively into rooted sub-classes until the sizes of all these sub-classes can be completely determined by the numbers of vertices and edges. However, when the UCCGs contain non-sparse subgraphs, this method might be very time-consuming.

In the next section, we will show that the size of the Markov equivalence class represented by a UCCG depends on a subgraph of the UCCG, and introduce a size formula derivation algorithm and a formula-based counting algorithm, which can greatly accelerate size counting of Markov equivalence classes with non-sparse undirected subgraphs.

3. Formulas for sizes of Markov equivalence classes

In Section 3.1, we introduce the concept of core graph, which determines the size formula of a Markov equivalence class. Then, in Section 3.2, we discuss the recursive and explicit formulas for the size of a Markov equivalence class given its core graph. Finally, in Section 3.3, we provide algorithms to derive size formulas and to count the sizes of Markov equivalence classes based on these formulas.

3.1 Core graph

A vertex is dominating in a UCCG $U$ if it is adjacent to all other vertices in $U$. A dominating vertex pruned subgraph of $U$ is obtained by removing some dominating vertices from $U$.
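
These two definitions are straightforward to compute. As a minimal sketch (the adjacency-dictionary representation and the function names are assumptions of this example, not part of the paper), the following Python code finds the dominating vertices of an undirected graph and removes all of them at once, returning the pruned subgraph together with the number of removed vertices.

```python
def dominating_vertices(adj):
    """Return the vertices adjacent to all other vertices.

    adj maps each vertex to the set of its neighbours (undirected graph,
    no self-loops).
    """
    n = len(adj)
    return {v for v, nbrs in adj.items() if len(nbrs) == n - 1}


def prune_dominating(adj):
    """Remove every dominating vertex; return (pruned graph, number removed)."""
    dom = dominating_vertices(adj)
    pruned = {v: nbrs - dom for v, nbrs in adj.items() if v not in dom}
    return pruned, len(dom)


# Path a - b - c plus a vertex d joined to a, b and c: b and d are dominating.
U = {
    "a": {"b", "d"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"a", "b", "c"},
}
pruned, m = prune_dominating(U)
print(pruned, m)  # two isolated vertices a and c remain, m = 2
```

A single pass suffices: a vertex that became dominating only after the removal would be adjacent to all remaining vertices and to all removed ones, so it would already have been dominating in the original graph.
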
We denote a dominating vertex pruned subgraph of $U$ as $U^{m-}$ if it is obtained by removing $m$ dominating vertices from $U$. An extended graph of $H$, denoted by $H^{m+}$, is a graph obtained by adding $m$ dominating vertices to $H$.

Definition 2 (Core graph of a UCCG). The core graph of $U$ is the minimal dominating vertex pruned subgraph of $U$.

Let $m$ be the number of dominating vertices in $U$ and $K$ be the core graph of $U$. Clearly, $K$ is the same as $U^{m-}$. If $U$ is a complete graph, all vertices in $U$ are dominating, so the core graph of $U$ is a null graph. Let $K$ be an undirected graph over $V$. Clearly, according to Definition 2, the undirected graph $K$ is a core graph of some UCCG if and only if $K$ is an undirected chordal graph without dominating vertices. The complement of $K$, denoted by $K^c$, is a graph on the same vertices in which an edge appears if and only if it does not occur in $K$. Proposition 3 presents a property of the complement of a core graph.

Proposition 3 (Complement of core graph). Let $U$ be a UCCG, $m$ be the number of dominating vertices in $U$, $K$ be the core graph of $U$, and $K^c$ be the complement of $K$. Then $K^c$ is a connected graph, and any two edges in $K^c$ either share a common vertex or are connected by an edge.

This property helps us to construct a core graph. In Table 1, we list all core graphs and the corresponding complement graphs of the UCCGs with up to three missing edges.

Table 1: Core graphs and their complements when at most three edges are missing; $K$, $K^c$, and $K_\emptyset$ denote a core graph, the complement of $K$, and a null graph, respectively. [Graph drawings not recoverable; the number of drawings per cell is shown.]

    Number of missing edges    0                1              2              3
    $K$                        $K_\emptyset$    [1 drawing]    [1 drawing]    [3 drawings]
    $K^c$                      $K_\emptyset$    [1 drawing]    [1 drawing]    [3 drawings]

Let $U$ be a UCCG with $m$ dominating vertices and $K$ be the core graph of $U$.
As an extended graph of $K$, $U$ is the same as $K^{m+}$ up to the labels of vertices, so we have $\mathrm{Size}(U) = \mathrm{Size}(K^{m+})$. Clearly, the size of the Markov equivalence class represented by a UCCG $U$ is determined by its core graph $K$ and the number of dominating vertices $m$. For an undirected chordal graph $K$ and a nonnegative integer $m$, we define a function $f(K, m)$ as follows:

$$f(K, m) := \mathrm{Size}(K^{m+}). \tag{3}$$

The following lemma follows directly from the definition of $f(K, m)$.

Lemma 4. Let $K$ be an undirected chordal graph, and $K^{k+}$ be an extended graph of $K$. We have $f(K^{k+}, m) = f(K, m + k)$.

Consider the UCCGs with at most two missing edges. As shown in Table 1, only one core graph exists for each number of missing edges, so the sizes of the corresponding Markov equivalence classes are determined by the number of vertices in the UCCGs. When three edges are missing in the UCCGs, three core graphs exist, so three sizes are possible given the number of vertices. This explains the result in He et al. (2015) that the size of a Markov equivalence class is determined by the number of vertices ($p$) alone only when no more than two edges are missing in the UCCG.

The size of $U$ might be huge; for a UCCG $U$ with $p$ vertices, $\mathrm{Size}(U)$ reaches the maximum $p!$ when $U$ is a complete graph. In general, the denser the UCCG, the larger the corresponding class and the more time-consuming the size counting. Fortunately, a dense UCCG $U$ might have a sparse core graph $K$ when many dominating vertices exist. In the next section, given the core graph $K$, we discuss the formula of $f(K, m)$, which can be used to speed up the calculation of $\mathrm{Size}(U)$.

3.2 Size formulas based on core graphs

In this section, we propose a method to derive the size formula $f(K, m)$ defined in Equation (3). We first introduce a recursive formula of $f(K, m)$ given $K$, then propose a method to derive the explicit size formulas, and finally give the explicit formulas for several specific types of core graphs and for all core graphs with up to five missing edges.

Theorem 5 introduces the main recursive formula for the size of a Markov equivalence class whose representation graph is extended from an undirected chordal graph $K$.

Theorem 5. Let $K$ be an undirected chordal graph over $V$. For any integer $m \ge 0$, let $K^{m+}$ be an extended graph of $K$ and $f(K, m)$ be the size of $K^{m+}$ defined in Equation (3). We have $f(K, 0) = \mathrm{Size}(K)$, and for any integer $m > 0$,

$$f(K, m) = m \cdot f(K, m-1) + \sum_{v \in V} f(K_{N_v}, m)\, \frac{\mathrm{Size}(K^{(v)})}{\mathrm{Size}(K_{N_v})}, \tag{4}$$

where $K^{(v)}$ is a $v$-rooted graph of $K$ and $K_{N_v}$ is the induced subgraph of $K$ on the neighbors of $v$.

Theorem 5 shows that the size function $f(K, m)$ can be calculated through the term $f(K, m-1)$ and the terms related to some subgraphs of $K$. Below, we discuss the explicit formula of $f(K, m)$. First, we have the following corollary.

Corollary 6. Let $K$ be an undirected chordal graph. The formula $f(K, m)$ defined in Equation (3) is a polynomial divisible by $m!$, that is, $f(K, m)/m!$ is a polynomial of $m$.

In the recursive formula in Equation (4), the second term on the right side is crucial for deriving the explicit formula of $f(K, m)$. Define

$$g(K, m) := \frac{1}{m!} \sum_{v \in V} f(K_{N_v}, m)\, \frac{\mathrm{Size}(K^{(v)})}{\mathrm{Size}(K_{N_v})}. \tag{5}$$

If $K$ is an undirected chordal graph, its induced subgraph $K_{N_v}$ is also an undirected chordal graph. According to Corollary 6, the formula $f(K_{N_v}, m)$ is a polynomial divisible by $m!$; it follows that $g(K, m)$ defined in Equation (5) is a polynomial of $m$. Let $d$ be the degree of the polynomial $g(K, m)$. Then $g(K, m)$ can be represented by

$$g(K, m) = \sum_{i=1}^{d+1} \gamma_i m^{i-1}. \tag{6}$$

Given the polynomial $g(K, m)$, the following theorem shows the explicit formula of $f(K, m)$.

Theorem 7. Let $K$ be an undirected chordal graph, $\{\gamma_i, i = 1, 2, \cdots, d+1\}$ be the coefficients of the polynomial $g(K, m)$ defined in Equation (6), and let $a_{ij} = (-1)^{j-i} \binom{j}{i-1}$ for any $i \le j$. We have, for any $m \ge 0$,

$$f(K, m) = \left( \beta_0 + \sum_{i=1}^{d+1} \beta_i m^i \right) m!, \tag{7}$$

where $\beta_0 = \mathrm{Size}(K)$, $\beta_{d+1} = \gamma_{d+1} / a_{d+1,d+1}$, and $\beta_i = \left( \gamma_i - \sum_{j=i+1}^{d+1} a_{i,j} \beta_j \right) / a_{i,i}$ for any integer $i \in [1, d]$.

According to Theorem 7, to obtain the explicit formula of $f(K, m)$ for an undirected chordal graph $K$, we just need to calculate the size $\mathrm{Size}(K)$ and the polynomial $g(K, m)$ defined in Equation (6). The algorithms for general core graphs $K$ will be introduced in Section 3.3. Below, we discuss the formulas for some specific types of undirected chordal graphs.

When an undirected chordal graph contains some isolated vertices, these vertices can be removed and the corresponding size formula can be obtained as follows.

Corollary 8 (Isolated vertices). Suppose the graph $K$ is composed of an undirected chordal graph $K_1$ and $j$ isolated vertices. We have

$$f(K, m) = f(K_1, m) + j \cdot \mathrm{Size}(K_1) \cdot m \cdot m!. \tag{8}$$

In particular, when $K_1$ is a null graph, we have $f(K, m) = (jm + 1)\, m!$.

A tree is a connected graph without cycles, and a tree plus graph is generated by adding one more edge to a tree. We give four explicit size formulas for four specific types of undirected chordal graphs in Corollary 9.

Corollary 9. Let $K$ be an undirected chordal graph with $p$ vertices.
1. If $K$ is a null graph, we have $f(K, m) = m!$.
2. If $K$ is a tree, we have $f(K, m) = [(p-1)m^2 + (2p-1)m + p]\, m!$.
3. If $K$ is a tree plus, we have $f(K, m) = [m^3 + 2pm^2 + (4p-1)m + 2p]\, m!$.
4. If $K$ is composed of isolated edges, we have $f(K, m) = 2^{p/2-1} (pm^2/2 + 3pm/2 + 2)\, m!$.
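
The back-substitution in Theorem 7 is easy to mechanize. The sketch below computes the coefficients $\beta_i$ from the $\gamma_i$ and $\mathrm{Size}(K)$, and checks the result on a tree with $p$ vertices. The function names and the use of exact rational arithmetic are choices made for this illustration. For the check we take $\mathrm{Size}(K) = f(K, 0) = p$ and $g(K, m) = 2(p-1)m + p$; the latter is not stated in the paper but follows from Equations (4) and (5), which give $g(K, m) = f(K, m)/m! - f(K, m-1)/(m-1)!$, applied to the tree formula of Corollary 9.

```python
from fractions import Fraction
from math import comb, factorial


def a(i, j):
    """Coefficient a_ij = (-1)^(j-i) * C(j, i-1) from Theorem 7 (i <= j)."""
    return (-1) ** (j - i) * comb(j, i - 1)


def betas(gamma, size_K):
    """Solve for [beta_0, ..., beta_{d+1}] given gamma = [gamma_1, ..., gamma_{d+1}]."""
    n = len(gamma)  # n = d + 1
    beta = [Fraction(size_K)] + [Fraction(0)] * n
    for i in range(n, 0, -1):  # i = d+1, d, ..., 1
        tail = sum(a(i, j) * beta[j] for j in range(i + 1, n + 1))
        beta[i] = (Fraction(gamma[i - 1]) - tail) / a(i, i)
    return beta


def f_value(beta, m):
    """Evaluate Equation (7): (beta_0 + sum_i beta_i m^i) * m!."""
    return sum(b * m ** i for i, b in enumerate(beta)) * factorial(m)


p = 3  # a path on three vertices is a tree
beta = betas([p, 2 * (p - 1)], size_K=p)
print([int(b) for b in beta])  # [3, 5, 2], i.e. (p, 2p - 1, p - 1)
print(f_value(beta, 1))        # 10
```

The recovered coefficients $(p, 2p-1, p-1)$ agree with item 2 of Corollary 9, and the value 10 at $m = 1$ is the size of the UCCG obtained by adding one dominating vertex to the path on three vertices.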

By Corollary 8, Corollary 9 and Theorem 7, we can obtain the size formula $f(K, m)$ given an undirected chordal graph $K$. He et al. (2015) give two explicit size formulas for essential graphs with one or two missing edges; here we do the same for core graphs with at most five missing edges. In Table 2, we list all core graphs with up to five missing edges, together with their corresponding size formulas. We give an example to demonstrate the derivation of these formulas. Consider the last core graph (id 16) in Table 2: $K$ is composed of a complete graph with five vertices ($K_1$) and one isolated vertex. We have $\mathrm{Size}(K_1) = 120$ and $f(K_1, m) = (m+5)!$ from Lemma 4, so by Corollary 8, $f(K, m)/m! = [(m+5)! + 120\, m \cdot m!]/m! = 120m + (m+5) \cdots (m+1)$.

Given an undirected connected chordal graph $U$, when its core graph $K$ is small, we can calculate $g(K, m)$ directly following its definition in Equation (5), and then obtain the explicit formula of $f(K, m)$ according to Theorem 7. However, when the core graph is large, the derivation of $g(K, m)$ becomes more complicated. In the next section, we will provide an algorithm to derive the explicit formulas of $f(K, m)$ for a general core graph $K$.

Table 2: The explicit formulas for all core graphs with up to five missing edges; $n_0$ and $p$ are the number of missing edges and the number of vertices in the core graph $K$, respectively. [The drawings of $K$ are not recoverable.]

    id    $(n_0, p)$    $f(K, m)/m!$
    1     (1, 2)        $2m + 1$
    2     (2, 3)        $m^2 + 5m + 2$
    3     (3, 3)        $3m + 1$
    4     (3, 4)        $3m^2 + 7m + 4$
    5     (3, 4)        $m^3 + 6m^2 + 17m + 6$
    6     (4, 4)        $4m^2 + 12m + 4$
    7     (4, 4)        $2m^2 + 8m + 3$
    8     (4, 5)        $(m+1)(m+4)(2m+3)$
    9     (4, 5)        $24m + (m+4) \cdots (m+1)$
    10    (5, 4)        $m^2 + 7m + 2$
    11    (5, 5)        $2m^3 + 11m^2 + 29m + 10$
    12    (5, 5)        $m^3 + 10m^2 + 19m + 10$
    13    (5, 5)        $m^3 + 10m^2 + 19m + 10$
    14    (5, 6)        $m^4 + 14m^3 + 55m^2 + 82m + 40$
    15    (5, 6)        $(m+1)(2m+3)(m^2 + 7m + 16)$
    16    (5, 6)        $120m + (m+5) \cdots (m+1)$

Algorithm 1: SizeF($K$)
    Input: $K$, an undirected chordal graph;
    Output: $f(K, m)$, a polynomial of $m$.
    Let type be the type of $K$ and $p$ be the number of vertices in $K$;
    switch type do
        case null graph: return $m!$;
        case tree: return $[(p-1)m^2 + (2p-1)m + p]\, m!$;
        case tree-plus: return $[m^3 + 2pm^2 + (4p-1)m + 2p]\, m!$;
        case isolated-edge graph: return $2^{p/2-1}(pm^2/2 + 3pm/2 + 2)\, m!$;
    Let $w$ be the number of dominating vertices in $K$; remove these vertices from $K$;
    if $w > 0$ then
        $h(m) \leftarrow$ SizeF($K$);
        return $h(m + w)$;
    Let $k$ be the number of isolated vertices in $K$; remove these vertices from $K$;
    if $k > 0$ then
        return SizeF($K$) $+\ \mathrm{Size}(K) \cdot k \cdot m \cdot m!$ (see Size($K$) in Algorithm 2);
    return SizeGF($K$) (see SizeGF($K$) in Algorithm 1.1);

3.3 Algorithms

In this section, we introduce two main algorithms.
The algorithm SizeF($K$) in Algorithm 1 gives the explicit formula of $f(K, m)$ for an undirected chordal graph $K$. The algorithm Size($C$) in Algorithm 2 counts the size of the Markov equivalence class represented by an essential graph $C$. Algorithm 1 and Algorithm 2 call each other recursively.

Algorithm 2: Size($C$)
    Input: $C$, an essential graph;
    Output: the size of the Markov equivalence class represented by $C$.
    Let $C_1, \cdots, C_J$ be all chain components of $C$; for any integer $1 \le j \le J$, let $m_j$ be the number of dominating vertices in $C_j$ and $K_j$ be the core graph of $C_j$;
    for $j \leftarrow 1$ to $J$ do
        $f_j(m) \leftarrow$ SizeF($K_j$);
    return $\prod_{j=1}^{J} f_j(m_j)$.

In Algorithm 1, we first give the explicit formula of $f(K, m)$ when $K$ is a null, tree, tree-plus or isolated-edge graph according to Corollary 9. Otherwise, when the undirected chordal graph $K$ contains dominating vertices or isolated vertices, we simplify the formula derivation according to Lemma 4 or Corollary 8, respectively. Finally, for a general undirected chordal graph $K$, we derive the explicit formula of $f(K, m)$ by the algorithm SizeGF($K$) in Algorithm 1.1.

The algorithm SizeGF($K$) in Algorithm 1.1 first calculates the polynomial $g(K, m)$ defined in Equation (6) and then derives the explicit polynomial $f(K, m)$ according to Theorem 7. Suppose that the undirected chordal graph $K$ contains $J$ isolated connected subgraphs. We calculate the polynomial $g(K, m)$ in the first part of Algorithm 1.1 (lines 1 to 4) according to Corollary 10.

Corollary 10. Let $K$ be an undirected chordal graph with $J$ isolated connected subgraphs, denoted by $K_1, \cdots, K_J$ respectively, $V(K_j)$ be the set of vertices in $K_j$, and $g(K, m)$ be the polynomial defined in Equation (6).
We have

$$g(K, m) = \sum_{j=1}^{J} \frac{\mathrm{Size}(K)}{\mathrm{Size}(K_j)} \sum_{v \in V(K_j)} \frac{f(K_{j,N_v}, m)}{m!} \cdot \frac{\mathrm{Size}(K_j^{(v)})}{\mathrm{Size}(K_{j,N_v})}, \tag{9}$$

where $K_j^{(v)}$ is the $v$-rooted essential graph of $K_j$, and $K_{j,N_v}$ is the induced subgraph of $K_j$ on the neighbours of $v$.

In Algorithm 1.1, we need to calculate $\mathrm{Size}(K_j^{(v)})$ for some $j$ and $v$, which are the sizes of Markov equivalence classes represented by rooted essential graphs. He et al. (2015) propose an algorithm called ChainCom to construct the rooted essential graph and all of its chain components for a UCCG and a root vertex. For completeness, we give ChainCom in Algorithm 3 in the Appendix.

In Algorithm 2, we first find the core graphs of the chain components of the essential graph $C$, then calculate the size of the corresponding Markov equivalence class by using the formulas obtained from Algorithm 1. When some subgraphs of these chain components contain dominating vertices, formula-based size counting will display its advantages; this will be studied experimentally in the next section.

Algorithm 1.1: SizeGF($K$)
    Input: $K$, an undirected chordal graph;
    Output: $f(K, m)$, a polynomial of $m$.
    1  Let $K_1, \cdots, K_J$ be the $J$ UCCGs in $K$, and $V(K_j)$ be the vertex set of $K_j$;
    2  Set $S^{(v)}_{K_j} \leftarrow \mathrm{Size}(K_j^{(v)})$ for any integer $j \in [1, J]$ and any $v \in V(K_j)$;
    3  $S_{K_j} \leftarrow \sum_{v \in V(K_j)} \mathrm{Size}(K_j^{(v)})$, $S_K \leftarrow \prod_{j=1}^{J} S_{K_j}$;
    4  $g(m) \leftarrow \sum_{j=1}^{J} \frac{S_K}{S_{K_j}} \sum_{v \in V(K_j)} \frac{\mathrm{SizeF}(K_{j,N_v})}{m!} \cdot \frac{S^{(v)}_{K_j}}{\mathrm{Size}(K_{j,N_v})}$, and denote it as $\sum_{i=1}^{d+1} \gamma_i m^{i-1}$;
    5  Set $a_{ij} \leftarrow (-1)^{j-i} \binom{j}{i-1}$ for $i \le j \le d+1$; $\beta_0 \leftarrow S_K$; $\beta_{d+1} \leftarrow \gamma_{d+1} / a_{d+1,d+1}$;
    6  for $i \leftarrow d$ to 1 do
    7      $\beta_i \leftarrow \left( \gamma_i - \sum_{j=i+1}^{d+1} a_{i,j} \beta_j \right) / a_{i,i}$;
    8  return $\left( \sum_{i=0}^{d+1} \beta_i m^i \right) m!$.

4. Experimental Results

In this section, we introduce the implementation of the formula derivation and formula-based counting algorithms, and conduct experiments to evaluate the formula-based size counting algorithm proposed in Section 3. All experiments are run on a Linux server with a 2.0 GHz Intel CPU. These experiments show that the proposed algorithms greatly speed up size counting, especially when the corresponding UCCGs contain dense subgraphs.

4.1 A Python package for size formula derivation

We developed a Python package named countMEC to derive the size formulas and to count the sizes of Markov equivalence classes based on these formulas. The symbolic computation in countMEC depends on the Python package sympy. The following example demonstrates the usage of the package countMEC.

    1. from countMEC import *
    2. G=ran_conn_chordal_graph(15,95)
    3. K=core_graph(G)
    4. F=SizeF([K])
    5. S=Size(G)

In this example, we first import the package countMEC and randomly generate a UCCG $G$ with 15 vertices and 95 edges. The graph $G$ is shown on the left of Figure 1. Then, we get the core graph of $G$, denoted by $K$, which is shown on the right of Figure 1. The graph $G$ contains 7 dominating vertices and the core graph $K$ contains just 8 vertices and 17 edges. In the fourth line, we call SizeF(·) (Algorithm 1); it outputs the following size formula: $F(m) = (m^3 + 16m^2 + 77m + 108)(m+2)!$. In the last line, we call Size(·) (Algorithm 2) and get $S = 643749120$, which is the size of $G$. It is easy to check that $S = F(7)$. In this example, it takes 0.5 seconds to count the size using the proposed formula-based algorithm, while 440 seconds are needed with the method introduced in He et al. (2015); we compare the running times of the two methods thoroughly in the next section.

Figure 1: A UCCG $G$ with 15 vertices and 95 edges (left) and its core graph $K$ (right).
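
The identity $S = F(7)$ can be confirmed without the package. The snippet below simply evaluates the reported formula at $m = 7$ using the Python standard library; it does not reproduce countMEC itself, and the function name F is chosen here only to mirror the notation of the example.

```python
from math import factorial


def F(m):
    """Size formula reported for the core graph K in Section 4.1."""
    return (m**3 + 16 * m**2 + 77 * m + 108) * factorial(m + 2)


S = 643749120  # value returned by Size(G) in the example
print(F(7), F(7) == S)  # 643749120 True
```

Here the polynomial factor equals 1774 at $m = 7$ and $(7+2)! = 362880$, whose product is 643749120.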

4.2 Formula-based size counting

In this section, we experimentally compare the running time of our proposed counting algorithms to that of the benchmark algorithm introduced in He et al. (2015).

Let $U_p^n$ be the set of Markov equivalence classes with $p$ vertices and $n$ edges. We obtain random chordal graphs from $U_p^n$ following He et al. (2015). First, we construct a tree by connecting two vertices (one is sampled from the connected vertices and the other from the isolated vertices) sequentially until all $p$ vertices are connected. Then, we randomly insert an edge such that the resulting graph is chordal, repeatedly until the number of edges reaches $n$. Repeating this procedure $N$ times, we obtain $N$ samples from $U_p^j$ for each integer $j$ ($\le n$).

We first consider the UCCGs in $U_p^n$ with $p \le 12$ for each integer $n \in [p+2, p(p-1)/2 - 3]$. Because the results have similar patterns for different $p$, we only report the experiments for $p = 12$ in this paper. Based on the $10^5$ samples from $U_{12}^n$ for each integer $n \in [14, 63]$, we plot the mean, the minimum, the median, and the maximum of the counting time used by the benchmark algorithm (blue dashed lines) and by the proposed Algorithm 2 (red solid lines) in the four panels of Figure 2, respectively. In each panel of Figure 2, the main window displays all results ($n \in [14, 63]$) of both algorithms, the two upper sub-windows display the results of both algorithms for $n \in [14, 39]$ and $n \in [40, 50]$, respectively, and the lower sub-window displays the results of Algorithm 2 again with a suitably scaled vertical axis.

We see that the counting time (mean, minimum, median, and maximum) of the benchmark algorithm increases with the number of edges ($n$); size counting based on the benchmark algorithm becomes very time-consuming when the graphs are dense. Meanwhile, the time used by Algorithm 2 first increases and then decreases with the number of edges.
Figure 2 shows that size counting based on Algorithm 2 remains efficient for both sparse and dense graphs.

Figure 2: The mean, the minimum, the median, and the maximum of the counting time of Markov equivalence classes with 12 vertices and $n$ edges.

We also study the sets $U_p^n$ that contain UCCGs with tens of vertices under sparsity constraints. The number of vertices $p$ is set to 20, 50, and 100, and the number of edges $n$ is set to $rp$, where $r$ is the ratio of $n$ to $p$. For each $p$, we consider three ratios: 3, 4, and 5. The graphs in $U_p^{rp}$ are sparse since $r \le 5$. For each pair $(p, r)$, $10^5$ UCCGs are generated randomly and then sorted in ascending order of the counting time used by the benchmark algorithm. The ordered $10^5$ UCCGs are divided into four subsets: $S_1$ contains the first 500 UCCGs, $S_2$ contains the next 49500 UCCGs, $S_3$ contains the next 49500 UCCGs after $S_2$, and $S_4$ contains the last 500 UCCGs. For each subset, we report the average counting time and the average of the ratios in Table 3 for the benchmark algorithm ($T_1$) and the proposed Algorithm 2 ($T_2$). We see that, on average, (1) the proposed Algorithm 2 is faster than the benchmark algorithm in all cases, and (2) the more edges the UCCGs have ($r$ from 3 to 5), or the more time the benchmark algorithm uses (subsets from $S_1$ to $S_4$), the smaller $T_2/T_1$ is, that is, the higher the speedup achieved by Algorithm 2. For example, for the subset $S_4$ and $r = 5$, the average counting time is shortened substantially for all $p \in \{20, 50, 100\}$, and the average ratio $T_2/T_1$ is reduced to nearly 0.02.

                      r = 3                      r = 4                        r = 5
  p   Subset    T1      T2    T2/T1       T1       T2    T2/T1         T1        T2    T2/T1
 20   S1       0.01    0.01   0.76       0.02     0.02   0.76         0.02      0.02   0.75
      S2       0.03    0.03   0.77       0.17     0.13   0.74         1.47      0.96   0.67
      S3       0.10    0.07   0.74       1.03     0.52   0.63        21.73      4.38   0.38
      S4       0.68    0.32   0.55      21.14     2.23   0.17       954.22     10.92   0.02
 50   S1       0.07    0.05   0.79       0.19     0.15   0.79         0.74      0.56   0.76
      S2       0.18    0.14   0.77       0.77     0.55   0.73         5.82      3.21   0.59
      S3       0.55    0.40   0.74       5.18     2.39   0.59       113.22     17.46   0.34
      S4       5.62    2.10   0.41     238.98    18.80   0.15     17598.39    128.65   0.02
100   S1       0.26    0.21   0.80       0.73     0.58   0.80         3.18      2.27   0.71
      S2       0.78    0.60   0.77       2.92     2.05   0.71        21.86     10.90   0.53
      S3       2.25    1.63   0.74      19.96     9.04   0.56       429.61     55.63   0.27
      S4      21.14    7.81   0.43     897.18    59.59   0.10     59093.25    516.44   0.02

Table 3: The average counting time ($T_1$ for the benchmark algorithm and $T_2$ for Algorithm 2) and ratios ($T_2/T_1$) for UCCGs with $p$ vertices and $pr$ edges in different subsets.

We have to point out that the chordal graphs generated in our experiments might not be uniformly distributed in the space of chordal graphs, and that the results in Figure 2 and Table 3 are not accurate estimates of the expectations of the corresponding statistics.

5. Conclusion and discussion

In this paper, we propose a method to derive the size formulas of Markov equivalence classes and to count the sizes based on these formulas. A core graph of an undirected connected chordal graph is introduced, and the size formula derivation based on the core graph is proposed. We discuss both recursive and explicit forms of the size formulas and give an algorithm to derive these formulas. Compared to the benchmark counting algorithm, the proposed algorithm can generate more size formulas efficiently, and by these formulas, size counting is accelerated dramatically when the essential graph contains non-sparse undirected subgraphs.

Acknowledgments

This work was supported partially by NSFC (11671020, 11101008, 71271211).

Appendix A. Algorithm ChainCom($U$, $v$)

For completeness, we give the algorithm ChainCom($U$, $v$) in Algorithm 3, which was introduced in He et al. (2015), to construct the rooted essential graph $U^{(v)}$ and all of its chain components.

Algorithm 3: ChainCom($U$, $v$)
Input: $U$, a UCCG; $v$, a vertex of $U$.
Output: the $v$-rooted essential graph of $U$ and all of its chain components.
  Set $A = \{v\}$, $B = \tau \setminus v$, $G = U$ and $O = \emptyset$
  while $B$ is not empty do
    Set $T = \{w : w \text{ in } B \text{ and adjacent to } A\}$;
    Orient all edges between $A$ and $T$ as $c \to t$ in $G$, where $c \in A$, $t \in T$;
    repeat
      for each edge $y - z$ in the vertex-induced subgraph $G_T$ do
        if $x \to y - z$ in $G$ and $x$ and $z$ are not adjacent in $G$ then
          Orient $y - z$ to $y \to z$ in $G$
    until no more undirected edges in $G_T$ can be oriented;
    Set $A = T$ and $B = B \setminus T$;
    Append all isolated undirected graphs in $G_T$ to $O$;
  return $G$ and $O$

Appendix B. Proofs of Results

In this section, we provide the proofs of the main results of our paper.

Proof of Proposition 3. Let $v_{i_1} - v_{j_1}$ and $v_{i_2} - v_{j_2}$ be two edges in $K^c$. If they neither share a common vertex nor are connected by an edge, then $v_{i_1}, v_{j_1}, v_{i_2}, v_{j_2}$ are four distinct vertices and there is no edge between $\{v_{i_1}, v_{j_1}\}$ and $\{v_{i_2}, v_{j_2}\}$. Since $K^c$ is the complement of $K$, the four edges $v_{i_1} - v_{i_2}$, $v_{i_2} - v_{j_1}$, $v_{j_1} - v_{j_2}$, and $v_{j_2} - v_{i_1}$ appear in $K$, while the two edges $v_{i_1} - v_{j_1}$ and $v_{i_2} - v_{j_2}$ do not occur in $K$. This implies that no chord exists in the cycle $v_{i_1} - v_{i_2} - v_{j_1} - v_{j_2} - v_{i_1}$ in $K$, which is a contradiction because $K$ is a chordal graph.

Since no dominating vertices appear in $K$, for any vertex $v$ in $K$ there exists another vertex in $K$ that is not adjacent to $v$. Consequently, there is no isolated vertex in $K^c$. Following the argument in the last paragraph, no two edges occur separately in two isolated subgraphs of $K^c$. As a result, $K^c$ is a connected graph. □

Before proving Theorem 5, we give the following lemma.

Lemma 11. Let $U$ be an undirected chordal graph over $V$ and $U^{(v)}$ be the $v$-rooted graph of $U$.
We have that the subgraph of $U^{(v)}$ on the neighbors of $v$, denoted by $U^{(v)}_{N_v}$, is undirected.

Proof. We can get $U^{(v)}$ using Algorithm 3. Consider any edge, denoted by $v_i - v_j$, in $U_{N_v}$; the vertices $v$, $v_i$ and $v_j$ form a triangle. According to Algorithm 3, $v_i - v_j$ cannot be oriented to a directed edge, since $v \to v_i - v_j$ is not an induced subgraph ($v$ and $v_j$ are adjacent). Therefore, $U^{(v)}_{N_v}$ is undirected.

Proof of Theorem 5. Denote the vertices of $K$ as $V = \{v_1, \cdots, v_p\}$, and the $m$ extended vertices in $K^{m+}$ as $V' = \{v_{p+1}, \cdots, v_{p+m}\}$. From Lemma 1, we have

$$f(K, m) = \sum_{v \in V} \mathrm{Size}\big((K^{m+})^{(v)}\big) + \sum_{v \in V'} \mathrm{Size}\big((K^{m+})^{(v)}\big). \tag{10}$$

Figure 3: The directions of edges among $v$, $N_v$, $V'$ and the other vertices in $(K^{m+})^{(v)}$, where $v \to N_v$ represents that each edge between $v$ and $N_v$ is directed from $v$ to the vertex in $N_v$, and $V' - N_v$ represents that all edges between $V'$ and $N_v$ are undirected.

For any $v \in V'$, since $v$ is adjacent to all other vertices in $K^{m+}$, from Lemma 11, we have $\mathrm{Size}\big((K^{m+})^{(v)}\big) = f(K, m-1)$ and

$$\sum_{v \in V'} \mathrm{Size}\big((K^{m+})^{(v)}\big) = m \cdot f(K, m-1). \tag{11}$$

For any $v \in V$, the neighbor set of $v$ in $K^{m+}$ is $N_v \cup V'$; from Lemma 11, $(K_{N_v})^{m+}$ is a chain component of $(K^{m+})^{(v)}$ when $m > 0$. According to Algorithm 3 and Lemma 11, the directions of edges among $v$, $N_v$, $V'$ and the other vertices in $(K^{m+})^{(v)}$ are displayed in Figure 3. All edges are directed from $N_v \cup V'$ to $V \setminus (N_v \cup \{v\})$ in $(K^{m+})^{(v)}$. We have

$$\mathrm{Size}\big((K^{m+})^{(v)}\big) = \mathrm{Size}\Big(\big((K^{m+})^{(v)}\big)_{N_v \cup V'}\Big) \cdot \mathrm{Size}\Big(\big((K^{m+})^{(v)}\big)_{V \setminus (N_v \cup \{v\})}\Big).$$

First, according to Lemma 11, $\big((K^{m+})^{(v)}\big)_{N_v \cup V'}$ is the same as $(K_{N_v})^{m+}$; thus, $\mathrm{Size}\Big(\big((K^{m+})^{(v)}\big)_{N_v \cup V'}\Big) = f(K_{N_v}, m)$ holds.
Then, consider the undirected edges in $\big((K^{m+})^{(v)}\big)_{V \setminus (N_v \cup \{v\})}$. According to Algorithm 3, because all vertices in $V'$ are parents of vertices in $V \setminus (N_v \cup \{v\})$, we have that $\big((K^{m+})^{(v)}\big)_{V \setminus (N_v \cup \{v\})}$ has the same chain components as $\big(K^{(v)}\big)_{V \setminus (N_v \cup \{v\})}$. As a result, $\mathrm{Size}\Big(\big((K^{m+})^{(v)}\big)_{V \setminus (N_v \cup \{v\})}\Big) = \mathrm{Size}\Big(\big(K^{(v)}\big)_{V \setminus (N_v \cup \{v\})}\Big)$. Moreover, according to Equation (1), we have $\mathrm{Size}\Big(\big(K^{(v)}\big)_{V \setminus (N_v \cup \{v\})}\Big) = \dfrac{\mathrm{Size}(K^{(v)})}{\mathrm{Size}(K_{N_v})}$. Consequently, we have

$$\mathrm{Size}\big((K^{m+})^{(v)}\big) = f(K_{N_v}, m)\, \frac{\mathrm{Size}(K^{(v)})}{\mathrm{Size}(K_{N_v})}. \tag{12}$$

Theorem 5 follows directly from Equations (10), (11) and (12). □

Proof of Corollary 6. For any undirected chordal graph $K$, from Theorem 5, we have

$$f(K, m) = m \cdot f(K, m-1) + \sum_{v \in V} f(K_{N_v}, m)\, h(K, v), \tag{13}$$

where $V$ is the set of vertices in $K$ and $h(K, v)$ is an integer function of $K$ and $v$. Consider the $f(\cdot, \cdot)$ terms on the right side of Equation (13); we can calculate them by using Equation (13) again as follows:

$$f(K, m-1) = (m-1) \cdot f(K, m-2) + \sum_{v \in V} f(K_{N_v}, m-1)\, h(K, v) \tag{14}$$

and

$$f(K_{N_v}, m) = m \cdot f(K_{N_v}, m-1) + \sum_{v' \in N_v} f\big(K_{N'_{v'}}, m\big)\, h(K_{N_v}, v'), \tag{15}$$

where $N'_{v'}$ is the neighbor set of $v'$ in $K_{N_v}$. Replacing $f(K, m-1)$ and $f(K_{N_v}, m)$ in Equation (13) by the corresponding terms in Equations (14) and (15), we find that $f(K, m)$ is the sum of the following three types of terms:

1. $m(m-1) f(K, m-2)$,
2. $m f(K_{N_v}, m-1)\, h(K, v)$, for any $v \in V(K)$, and
3. $f(K_{N'_{v'}}, m)\, h(K_{N_v}, v')$, where $N'_{v'}$ is the neighbor set of $v'$ in $K_{N_v}$, for any $v', v$ such that $v' \in N_v$.

Notice that for any $v, v'$, we have $N'_{v'} \subset N_v \subset V$, so the graphs in the above three types of terms are smaller than that in Equation (13).
In this way, using Equation (13) repeatedly, we can calculate $f(K, m)$ from smaller graphs. Finally, $f(K, m)$ can be calculated from only $f(K, 0)$ and $f(K_\emptyset, k)$ for $k \le m$. As a result, $f(K, m)$ is the sum of some polynomials of $m$, and each term of the polynomials contains either $m!\, f(K, 0)$ or $\frac{m!}{k!} f(K_\emptyset, k)$ for $k \le m$. Because $K_\emptyset$ is the null graph, $f(K_\emptyset, k) = k!$, and we have that $f(K, m)$ is a polynomial divisible by $m!$. □

Proof of Theorem 7. We just need to show that Formula (7) is the solution of Equation (4). First, when $m = 0$, we have $f(K, m) = \beta_0 = \mathrm{Size}(K)$. Theorem 7 holds if the following equation holds:

$$\Big(\beta_0 + \sum_{i=1}^{d+1} \beta_i m^i\Big) m! = m \Big(\beta_0 + \sum_{i=1}^{d+1} \beta_i (m-1)^i\Big) (m-1)! + \Big(\sum_{i=1}^{d+1} \gamma_i m^{i-1}\Big) m!.$$

Equivalently,

$$\sum_{i=1}^{d+1} \beta_i m^i - \sum_{i=1}^{d+1} \beta_i (m-1)^i = \sum_{i=1}^{d+1} \gamma_i m^{i-1}. \tag{16}$$

Consider the left side of Equation (16):

$$\begin{aligned}
\sum_{i=1}^{d+1} \beta_i \big[m^i - (m-1)^i\big]
&= \sum_{i=1}^{d+1} \beta_i \Big[\sum_{j=0}^{i-1} (-1)^{i-(j+1)} \binom{i}{j} m^j\Big] \\
&= \sum_{j=0}^{d} \sum_{i=j+1}^{d+1} \Big[(-1)^{i-(j+1)} \binom{i}{j} \beta_i m^j\Big] \\
&= \sum_{k=1}^{d+1} \Big[\sum_{i=k}^{d+1} (-1)^{i-k} \binom{i}{k-1} \beta_i\Big] m^{k-1}.
\end{aligned}$$

If Equation (16) holds for any $m > 0$, we have that $\sum_{i=k}^{d+1} (-1)^{i-k} \binom{i}{k-1} \beta_i = \gamma_k$ holds for any $k = 1, \cdots, d+1$. Let

$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1,d+1} \\
0 & a_{22} & \cdots & a_{2,d+1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{d+1,d+1}
\end{pmatrix}
= \begin{pmatrix}
\binom{1}{0} & -\binom{2}{0} & \cdots & (-1)^{d} \binom{d+1}{0} \\
0 & \binom{2}{1} & \cdots & (-1)^{d-1} \binom{d+1}{1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \binom{d+1}{d}
\end{pmatrix},$$

$\beta = (\beta_1, \cdots, \beta_{d+1})^T$, and $\gamma = (\gamma_1, \cdots, \gamma_{d+1})^T$. We have $A\beta = \gamma$. It is easy to verify that $\beta$ in Theorem 7 is the solution of $A\beta = \gamma$. □

Proof of Corollary 8. Let $v_1, \cdots, v_j$ be the $j$ isolated vertices, $v'_1, \cdots, v'_m$ be the $m$ extended vertices, $V(K_1)$ be the vertices in $K_1$, and $V(K)$ be the vertices in $K$. Clearly, we have $V(K) = V(K_1) \cup \{v_1, \cdots, v_j\}$. Because $K$ is composed of $K_1$ and $j$ isolated vertices, Equation (8) holds when $m = 0$ since $\mathrm{Size}(K) = \mathrm{Size}(K_1)$.

Consider the case $m = 1$. From Theorem 5, we have

$$f(K, m) = m \cdot f(K, m-1) + \sum_{v \in V(K)} f(K_{N_v}, m)\, \frac{\mathrm{Size}(K^{(v)})}{\mathrm{Size}(K_{N_v})}. \tag{17}$$

Since $m - 1 = 0$, we have $f(K, m-1) = \mathrm{Size}(K) = \mathrm{Size}(K_1)$, and $m \cdot f(K, m-1) = m \cdot \mathrm{Size}(K_1)$. Moreover, for any $v \in V(K_1)$, $\mathrm{Size}(K^{(v)}) = \mathrm{Size}\big((K_1)^{(v)}\big)$ and $K_{N_v} = (K_1)_{N_v}$ hold. For any $v \in \{v_1, \cdots, v_j\}$, $\mathrm{Size}(K^{(v)}) = \mathrm{Size}(K_1)$ and $K_{N_v}$ is a null graph; it follows that $f(K_{N_v}, m) = m!$ and $\mathrm{Size}(K_{N_v}) = 1$. From Equation (17), we have

$$\begin{aligned}
f(K, m) &= m \cdot \mathrm{Size}(K_1) + \sum_{v \in V(K_1)} f\big((K_1)_{N_v}, m\big)\, \frac{\mathrm{Size}\big(K_1^{(v)}\big)}{\mathrm{Size}\big((K_1)_{N_v}\big)} + \mathrm{Size}(K_1)\, j\, m! \\
&= f(K_1, m) + \mathrm{Size}(K_1)\, j\, m! \\
&= f(K_1, 1) + \mathrm{Size}(K_1)\, j.
\end{aligned}$$

Hence Equation (8) holds for $m = 1$.

Suppose that Equation (8) holds for $m = k-1$, and consider $m = k$. From Equation (17), we have

$$f(K, k) = k \cdot f(K, k-1) + \sum_{v \in V(K_1)} f\big((K_1)_{N_v}, k\big)\, \frac{\mathrm{Size}\big(K_1^{(v)}\big)}{\mathrm{Size}\big((K_1)_{N_v}\big)} + \mathrm{Size}(K_1)\, j\, k!. \tag{18}$$

Since Equation (8) holds for $m = k-1$, we have

$$f(K, k-1) = f(K_1, k-1) + j (k-1)\, \mathrm{Size}(K_1)\, (k-1)!. \tag{19}$$

From Theorem 5, we can get

$$f(K_1, k) = k f(K_1, k-1) + \sum_{v \in V(K_1)} f\big((K_1)_{N_v}, k\big)\, \frac{\mathrm{Size}\big(K_1^{(v)}\big)}{\mathrm{Size}\big((K_1)_{N_v}\big)}. \tag{20}$$

From Equations (18), (19), and (20), we have

$$f(K, k) = f(K_1, k) + j (k-1)\, \mathrm{Size}(K_1)\, k! + \mathrm{Size}(K_1)\, j\, k! = f(K_1, k) + \mathrm{Size}(K_1)\, j\, k\, k!.$$

As a result, Equation (8) holds for any integer $m \ge 0$. □

Proof of Corollary 9.

The proof of (1). When $K$ is the null graph, $K^{m+}$ is a complete graph with $m$ vertices, and the result (1) holds obviously.
The proof of (2). Let $d_1, d_2, d_3, \cdots, d_p$ be the degrees of vertices $v_1, \cdots, v_p$ in $K$; we have $\sum_{i=1}^{p} d_i = 2(p-1)$. Consider $g(K, m)$ defined in Equation (5),

$$g(K, m) = \sum_{i=1}^{p} \frac{f(K_{N_{v_i}}, m)}{m!}\, \frac{\mathrm{Size}(K^{(v_i)})}{\mathrm{Size}(K_{N_{v_i}})}.$$

Since $K$ is a tree, $K_{N_{v_i}}$ is composed of $d_i$ isolated vertices, so $\frac{f(K_{N_{v_i}}, m)}{m!} = 1 + d_i m$. We also have $\mathrm{Size}(K^{(v_i)}) = 1$ and $\mathrm{Size}(K_{N_{v_i}}) = 1$ if $K$ is a tree. Consequently,

$$g(K, m) = \sum_{i=1}^{p} (1 + d_i m) = p + 2(p-1) m.$$

The result (2) holds according to Theorem 7.

The proof of (3). Consider a tree plus graph; if it is chordal, the added edge must be in a triangle, otherwise the tree plus graph is not chordal. Let $d_1, d_2, d_3, \cdots, d_p$ be the degrees of vertices $v_1, \cdots, v_p$ in $K$, where $d_1, d_2, d_3$ are the degrees of the three vertices in the triangle; we have $\sum_{i=1}^{p} d_i = 2p$. Moreover, considering the induced subgraph of $K$ over $N_{v_i}$, we have that $K_{N_{v_i}}$ is composed of an edge and $d_i - 2$ isolated vertices for $i = 1, 2, 3$, and $K_{N_{v_i}}$ just contains $d_i$ isolated vertices for $i = 4, 5, \cdots, p$. Following Corollary 8, we can calculate $g(K, m)$ as follows:

$$\begin{aligned}
g(K, m) &= \frac{1}{m!} \Big[ 2 (d_1 + d_2 + d_3 - 6)\, m\, m! + 3 (m+2)! + 2 \sum_{i \ne 1,2,3} \big(d_i\, m\, m! + m!\big) \Big] \\
&= 3m^2 + 4pm - 3m + 2p.
\end{aligned}$$

The result (3) holds according to Theorem 7.

The proof of (4). Consider a vertex $v$ in $K$; we have that $(K^{m+})^{(v)}$ contains $p/2$ chain components, of which one is a complete graph with $m+1$ vertices and the others are one-edge graphs. We can calculate $g(K, m)$ defined in Equation (6) as follows:

$$g(K, m) = 2^{p/2}\, pm/2 + 2^{p/2}\, p/2.$$

As a result, the result (4) holds according to Theorem 7. □

Proof of Corollary 10. According to the definition of $g(K, m)$ in Equation (5),

$$g(K, m) = \sum_{j=1}^{J} \sum_{v \in V(K_j)} \frac{f(K_{N_v}, m)}{m!}\, \frac{\mathrm{Size}(K^{(v)})}{\mathrm{Size}(K_{N_v})}.$$

Because $K$ is composed of $K_1, \cdots, K_J$, which are $J$ isolated connected graphs, we have that $K_{N_v} = K_{j, N_v}$ and $\mathrm{Size}(K^{(v)}) = \mathrm{Size}\big(K_j^{(v)}\big) \prod_{l \ne j} \mathrm{Size}(K_l) = \mathrm{Size}\big(K_j^{(v)}\big)\, \frac{\mathrm{Size}(K)}{\mathrm{Size}(K_j)}$. Consequently, Corollary 10 holds. □

References

S. A. Andersson, D. Madigan, and M. D. Perlman. A characterization of Markov equivalence classes for acyclic digraphs. The Annals of Statistics, 25(2):505–541, 1997.

D. M. Chickering. Learning equivalence classes of Bayesian-network structures. The Journal of Machine Learning Research, 2:445–498, 2002.

S. B. Gillispie. Formulas for counting acyclic digraph Markov equivalence classes. Journal of Statistical Planning and Inference, 136(4):1410–1432, 2006.

S. B. Gillispie and M. D. Perlman. The size distribution for Markov equivalence classes of acyclic digraph models. Artificial Intelligence, 141(1-2):137–155, 2002.

Yangbo He and Zhi Geng. Active learning of causal networks with intervention experiments and optimal designs. Journal of Machine Learning Research, 9:2523–2547, 2008.

Yangbo He, Jinzhu Jia, and Bin Yu. Counting and exploring sizes of Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research, 16:2589–2609, 2015.

M. H. Maathuis, M. Kalisch, and P. Bühlmann. Estimating high-dimensional intervention effects from observational data. The Annals of Statistics, 37(6A):3133–3164, 2009. ISSN 0090-5364.

J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge Univ Pr, 2000.

R. Robinson. Counting labeled acyclic digraphs. New Directions in the Theory of Graphs, pages 239–273, 1973.

R. Robinson. Counting unlabeled acyclic digraphs. In Combinatorial Mathematics V, pages 28–43. Springer, 1977.

P. Spirtes, C. N. Glymour, and R. Scheines. Causation, Prediction, and Search. The MIT Press, 2001.

B. Steinsky. Enumeration of labelled chain graphs and labelled essential directed acyclic graphs. Discrete Mathematics, 270(1-3):266–277, 2003.

T. Verma and J. Pearl. Equivalence and synthesis of causal models. In Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, page 270. Elsevier Science Inc., 1990.
