Statistical models for cores decomposition of an undirected random graph
The $k$-core decomposition is a widely studied summary statistic that describes a graph's global connectivity structure. In this paper, we move beyond using $k$-core decomposition as a tool to summarize a graph and propose using $k$-core decompositio…
Authors: Vishesh Karwa, Michael J. Pelsmajer, Sonja Petrovic
Vishesh Karwa 1, Michael J. Pelsmajer 2, Sonja Petrović 3, Despina Stasi 4, Dane Wilburne 5 6

Abstract: The $k$-core decomposition is a widely studied summary statistic that describes a graph's global connectivity structure. In this paper, we move beyond using $k$-core decomposition as a tool to summarize a graph and propose using $k$-core decomposition as a tool to model random graphs. We propose using the shell distribution vector, a way of summarizing the decomposition, as a sufficient statistic for a family of exponential random graph models. We study the properties and behavior of the model family, implement a Markov chain Monte Carlo algorithm for simulating graphs from the model, implement a direct sampler from the set of graphs with a given shell distribution, and explore the sampling distributions of some of the commonly used complementary statistics as good candidates for heuristic model fitting. These algorithms provide first fundamental steps necessary for solving the following problems: parameter estimation in this ERGM, extending the model to its Bayesian relative, and developing a rigorous methodology for testing goodness of fit of the model and model selection. The methods are applied to a synthetic network as well as the well-known Sampson monks dataset.

1. Introduction

Network analyses are often concerned, either directly or indirectly, with the degrees of the nodes in the network, a natural approach since counting the number of edges incident to a node gives a basic local measure of connectivity. Several familiar statistical frameworks assign a probability distribution to the set of networks on a fixed number of nodes based on their degree information, e.g. Holland and Leinhardt [1981], Chatterjee et al. [2011], Olhede and Wolfe [2012], and Rinaldo et al. [2013].
However, despite the rich structure degree-based models offer compared to simpler models such as Erdős-Rényi-Gilbert, they fail to capture certain vital connectivity information about the network. In some applications, it matters not just to how many other nodes a particular node in the network is connected, but also to which other nodes it is connected. For example, a node $v$ may seem important if it has high degree, but if all its neighbors are themselves unimportant due to having no additional connections (e.g., if they all have degree 1), then the "influence" or "centrality" of $v$ within the network is not actually all that impressive, after all. This distinction is especially crucial in applications concerning information dispersal as in Pei et al. [2012], the spread of infectious diseases or viruses as in Kitsak et al. [2010], or robustness to node failure. In the social network context, this importance can be interpreted as "celebrity status" of a node. Whereas degree-centric analyses are not well-suited to model such situations, the core decomposition of a network graph can capture precisely this type of information.

Cores of a graph were introduced by Seidman [1983] to study tightly-knit groups in social networks. Since then, core decomposition has been used as a tool for numerous applications varying from understanding protein networks [Wuchty and Almaas, 2005], visualization of large networks [Alvarez-Hamelin et al., 2006], and understanding the topology of the Internet graph [Carmi et al., 2007] to name a few. In studies such as Kitsak et al. [2010] and Bae and Kim [2014], the authors identify spreader nodes and rank them in terms of their spreading influence, using a graph's core decomposition.
Methods for identifying spreaders using cores were extended to dynamic networks in Miorandi and Pellegrini [2010] and core decomposition in general was extended to weighted networks in Eidsaa and Almaas [2013]. An important feature of a core decomposition is that it can be computed efficiently (see, e.g., Lee et al. [2013]), even for "uncertain graphs", which are graphs whose edges have some probability of existing; such graphs have applications in biological networks that model, for instance, protein interactions (see Bonchi et al. [2014]).

Although core decomposition has become an important and widely used tool as a descriptive summary statistic of the network, it is a statistic for which there does not exist an associated statistical model. The goal of this paper is to place the core decomposition of a network on a rigorous statistical foundation and present it as a tool for statistical modeling rather than descriptive analysis. We construct a natural model based on core decomposition by embedding the core structure of a graph in the family of exponential random graph models (ERGMs) and describe its theoretical properties. We restrict the support of the model to allow only networks with a fixed degeneracy to have a positive probability. We show that this eliminates certain bad properties common to many ERGMs and expect that such support restrictions may help improve the properties of other ERGMs as well.

Author footnotes:
1 Carnegie Mellon University and Harvard University, vishesh@cmu.edu
2 Illinois Institute of Technology, pelsmajer@iit.edu
3 Illinois Institute of Technology, sonja.petrovic@iit.edu
4 Illinois Institute of Technology, stasdes@iit.edu
5 Illinois Institute of Technology, dwilburn@hawk.iit.edu
6 Authors are listed in alphabetical order; contributions are equal.
We study three common inference tasks as they apply to the support restricted ERGM: sampling, maximum likelihood estimation, and goodness-of-fit testing. More specifically, the contributions of this paper are as follows:

1. In Section 2, we summarize the core decomposition of a network in the form of a shell distribution, and in Section 3 we introduce a support restricted exponential random graph model with the shell distribution as a sufficient statistic.
2. In Section 4, we perform simulation studies to understand the behavior of the model by relying on an MCMC algorithm to sample from the model and to estimate the parameters of the model.
3. In Section 5, we present an algorithm to sample from the space of graphs given a fixed shell distribution.
4. We return to the theoretical properties of the model in Sections 6 and 7, where we study the space of graphs with a fixed shell distribution and describe the marginal polytope associated with the model and conditions for the existence of MLE, respectively.

ERGMs provide a natural framework to model networks through their sufficient statistics; see Robins et al. [2007] for an introduction. Goldenberg et al. [2009] provide a comprehensive review of various ways to model networks, including ERGMs. ERGMs are a special case of the venerable class of exponential families which are known to possess excellent statistical properties; see Brown [1986] for a theoretical treatment of exponential families and Rinaldo et al. [2009] in particular for discrete exponential family models, including ERGMs. ERGMs have been the workhorse of many applied studies, and the literature is too vast to be surveyed here; see Snijders et al. [2006], Saul and Filkov [2007] and Goodreau et al. [2009] for examples of studies that use ERGMs for network modeling.
Our goal is to add to the toolbox of ERGMs the ability to model the core structure of a graph. Doing so has two important consequences: First, it puts the core structure of a graph, summarized by its shell distribution, on a firm statistical footing. Second, it allows us to understand what properties of a network are captured by the shell distribution. It is worth noting that any ERGM based on a core decomposition cannot be specialized to the Erdős-Rényi model, i.e., the Erdős-Rényi model is not a submodel of any ERGM based on the core decomposition. In fact, the same is true for any ERGM with sufficient statistics based on the degree sequence of the network. As such, the shell distribution ERGM would occupy a unique space in the network literature.

Models based on the core distribution go beyond the dyadic independence assumption inherent in the degree sequence based network models and are able to capture transitivity effects. These models differ from the ERGMs based on subgraph counts, such as triangles and stars, which also go beyond the dyadic independence assumption. This is because the core structure of a network is a global sufficient statistic in the following sense: To which core a node belongs depends in some way on the entire network; see Section 2 for the precise definition of a core and some examples. In contrast, subgraph counts measure local and coarse properties of the network.

We want to point out that for all the good properties of ERGMs, they are not without drawbacks. Recent empirical and theoretical work has brought to light some undesirable properties of some special classes of ERGMs; these properties are often termed "model degeneracy" (Rinaldo et al. [2009], Schweinberger [2011], Chatterjee and Diaconis [2013], Hunter et al. [2008]) or "inconsistency" (Shalizi et al. [2013]). As noted in Rinaldo et al.
[2009], "model degeneracy" is an umbrella term used to denote many undesirable properties of ERGMs. One specific drawback to note is that it may be difficult to sample efficiently (Bannister et al.), but that is an issue for ERGMs in general and outside the scope of this paper. We discuss these issues in Section 9.2 and explain how we fix them by placing support restrictions on the class of models that we consider. Since the word degeneracy also refers to a graph-theoretic notion which is relevant to this work, we avoid the use of the term "model degeneracy" and instead use the term "bad" behavior of the model.

2. Technical preliminaries: cores and shells

We restrict our analysis to the set of simple graphs, representing networks without multiple edges and self-loops. For the remainder of this manuscript, let $\mathcal{G}_n$ denote the set of all simple graphs on $n$ nodes. We are interested in distributions over the set $\mathcal{G}_n$; thus $G$ will denote a random variable with state space $\mathcal{G}_n$, and $G = g$ its realization. We will also consider families of subsets of $\mathcal{G}_n$ below.

Definition 1 (Seidman [1983]). The $k$-core of a graph $g$, denoted by $H_k(g)$ or simply $H_k$ if the graph is clear from the context, is the maximal subgraph in which every vertex has degree at least $k$ (see footnote 1).

As it is often useful to think of the $k$-core as the output of an algorithm for which the graph $g$ is the input, we also use the equivalent algorithmic definition: $H_k$ is the subgraph obtained by iteratively deleting vertices of degree less than $k$; see Algorithm 1. For example, for the particular graph $G = g$ on the left of Figure 1, $H_0(g)$ is just the graph itself, $H_1$ is $g$ without the isolated vertex, the 2-core $H_2$ is shown in the middle, and $H_3$ and $H_4$ are the same graph, shown on the right. For $k \ge 5$, $H_k$ is the empty graph.

Fig 1: A small graph $g$ (left), its 2-core (center), and its 3- and 4-core (right).
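The algorithmic definition of the $k$-core above (iteratively delete vertices of degree less than $k$) can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's Algorithm 1: the function name `k_core`, the adjacency-dictionary representation, and the toy graph are assumptions made for this example.

```python
def k_core(adj, k):
    """Return the vertex set of the k-core of a simple undirected graph.

    adj: dict mapping each vertex to the set of its neighbours
         (hypothetical representation chosen for this sketch).
    """
    # Work on a copy so that the caller's graph is not modified.
    alive = {v: set(nbrs) for v, nbrs in adj.items()}
    # Start with every vertex whose degree is already below k.
    stack = [v for v, nbrs in alive.items() if len(nbrs) < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # v was deleted earlier
        # Delete v and lower the degree of each remaining neighbour.
        for u in alive.pop(v):
            alive[u].discard(v)
            if len(alive[u]) < k:
                stack.append(u)
    return set(alive)


# Toy graph (an assumption, not the graph of Figure 1):
# a triangle 0-1-2, a pendant vertex 3 attached to 2, and an isolated vertex 4.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
cores = {k: k_core(toy, k) for k in range(4)}
```

On the toy graph, the 0-core is the whole vertex set, the 1-core drops the isolated vertex, the 2-core is the triangle, and the 3-core is empty, mirroring the nesting of cores described for Figure 1.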
Each node is contained in several $k$-cores, for every $k$ from 0 to whatever the largest $k$ is for that node. Thus, the following node statistic captures all core information for a node.

Definition 2. A vertex $v$ in a graph $g$ has shell index $i$ if $v \in H_i(g)$ but $v \notin H_{i+1}(g)$. Define $s_g : V \to \mathbb{N}$ as the function that maps vertices of $g$ to the non-negative integers according to their shell indices, so that if $v$ has shell index $i$ we may write $s_g(v) = i$. If the graph $g$ is clear from the context, we drop the subscript and simply write $s(v) = i$.

In other words, the shell index of a vertex $v$ indicates the highest core to which $v$ belongs. For example, not all nodes in the 2-core $H_2(g)$ in the middle of Figure 1 have shell index 2 in $g$: the six nodes on the right have shell index 4. The vertex set $V(g)$ of any network $g$ can be partitioned according to the shell indices, since the shell index exists, is well-defined and is unique for all vertices.

There are two natural ways to record all of the shell index information about a network, and hence, record the information that captures its core structure. First, the shell sequence $s(g)$ of an $n$-vertex graph $g$ with vertices $v_1, \ldots, v_n$ is a vector of length $n$ whose $i$th entry is the shell index of vertex $v_i$. Second, if the interest is in unlabeled graphs (i.e., exchangeable models for labeled graphs), it is natural to summarize the sequence with a histogram as follows. The shell distribution $n_S(g)$ of an $n$-vertex graph $g$ is a vector of length $n$ whose $j$th entry $n_j(g)$ is the number of vertices of $g$ that have shell index $j$, for $0 \le j \le n-1$.

Footnote 1: This is the usual definition of the $k$-core and it appropriately describes the notion of node importance and robust degree. Seidman's original definition also requires the subgraph to be connected.

(The shell index of a vertex is bounded above by its degree, which is bounded above by $n-1$.)
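As a hedged illustration of Definition 2 and of the shell distribution, the sketch below computes shell indices by repeatedly peeling a vertex of minimum remaining degree. This is the standard peeling idea for core decompositions, written in its simplest quadratic-time form; it is not claimed to be the authors' implementation, and the names `shell_indices` and `shell_distribution` are hypothetical.

```python
def shell_indices(adj):
    """Map each vertex to its shell index s(v).

    adj: dict mapping each vertex to the set of its neighbours.
    """
    alive = {v: set(nbrs) for v, nbrs in adj.items()}
    shell = {}
    k = 0  # largest minimum degree seen so far during peeling
    while alive:
        # Peel a vertex of minimum remaining degree.
        v = min(alive, key=lambda x: len(alive[x]))
        k = max(k, len(alive[v]))
        shell[v] = k
        for u in alive.pop(v):
            alive[u].discard(v)
    return shell


def shell_distribution(adj):
    """Vector (n_0, ..., n_{n-1}) counting the vertices of each shell index."""
    counts = [0] * len(adj)
    for s in shell_indices(adj).values():
        counts[s] += 1
    return counts


# Toy graph (an assumption for illustration): triangle 0-1-2,
# pendant vertex 3 attached to 2, isolated vertex 4.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
toy_shells = shell_distribution(toy)
```

For the toy graph the isolated vertex has shell index 0, the pendant vertex has shell index 1, and the triangle vertices have shell index 2, so the histogram has one, one, and three vertices in shells 0, 1, and 2.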
Note that $\sum_{j=0}^{n-1} n_j(g) = n$. In symbols,

$$n_S(g) := (n_0(g), n_1(g), \ldots, n_{n-1}(g)),$$

where $n_j(g) = |\{ v \in V(g) : s(v) = j \}|$ for $0 \le j \le n-1$. For example, the graphs in Figures 2a and 2b both have shell distribution $(0, 8, 0, 0, 0, 0, 0, 0)$. The graphs in Figures 2c and 2d have shell distributions $(0, 0, 8, 0, 0, 0, 0, 0)$ and $(0, 0, 4, 4, 0, 0, 0, 0)$, respectively. These graphs illustrate the fact that the degree and core structures of a graph are not obtainable from one another. Graph $g$ of Figure 1 has shell distribution $(1, 5, 5, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0)$.

Fig 2: (a) Vertices have degrees 1, 2, and 3. (b) All eight vertices have degree 1. (c) All vertices belong to the 0-core, 1-core and 2-core. Higher cores are empty. (d) All vertices are in the $k$-core for $k = 0, 1, 2$, but 4 of the vertices are also in the 3-core. The graphs in (a) and (b) have the same core structure but different degree structure. The graphs in (c) and (d) have the same degree structure but different core structure.

Finally, the degeneracy of a graph $g \in \mathcal{G}_n$, denoted by $\operatorname{dgen}(g)$, is the index of the largest nonzero entry in the shell distribution vector $n_S(g)$. In other words, the degeneracy of a graph is the maximum index of a non-empty shell. Thus we may define the following subset of the set of simple $n$-vertex graphs $\mathcal{G}_n$:

$$\mathcal{G}_{n,m} = \{ g \in \mathcal{G}_n : \operatorname{dgen}(g) = m \}.$$

3. The shell distribution ERGM

A natural way to model random graphs using their core structure is to embed summaries of their core structure in the exponential random graph model (ERGM) framework. In what follows, we define a family of ERGMs using one such summary, namely the shell distribution, as a sufficient statistic. Let $G = g$ be an instance of a random graph from the set $\mathcal{G}_n$.
Partitioning the vertex set of $g$ according to the shell indices implies that the probability of observing $g$ is

$$P(G = g; p) = (\varphi(p))^{-1} \prod_{j=0}^{n-1} p_j^{n_j(g)}, \tag{1}$$

where $p_j \in (0, 1)$ is the parameter that represents the propensity of shell $j$ to have vertices in it, $p = (p_0, p_1, \ldots, p_{n-1})$ is the parameter vector, the integers $n_j(g)$ are the components of the shell distribution vector $n_S(g)$ as defined above, and $\varphi(p)$ is the partition function. [One may also think of $p_j$ as representing the attractiveness of shell $j$.] Note that a feature of the model is that there is no dyad independence assumption.

Equation (1) is the most direct way to define an ERGM based on the shell distribution. One can easily see that it can be written in exponential family form (see Appendix 9), which allows us to take advantage of various good properties of exponential families. It turns out, however, that specification (1) of the model has many undesirable properties, common to other ERGMs [Rinaldo et al., 2009]; details are given in Appendix 9. There are several ways to avoid these issues that arise from specifying the model as in Equation (1); one such way is to add an additional parameter to the model as follows. We restrict the support of the model to the set $\mathcal{G}_{n,m}$ of all simple graphs whose degeneracy is equal to $m$:

$$P(G = g; p, m) = \begin{cases} (\varphi(p))^{-1} \prod_{j=0}^{m} p_j^{n_j(g)} & \text{if } g \in \mathcal{G}_{n,m}, \\ 0 & \text{otherwise,} \end{cases} \tag{2}$$

where

$$\varphi(p) = \sum_{g \in \mathcal{G}_{n,m}} \prod_{j=0}^{m} p_j^{n_j(g)}$$

is the normalizing constant (partition function). Equation (2) defines a multinomial-like distribution over the partition of nodes induced by the shell distribution. By limiting degeneracy, the model has a significantly reduced number of parameters, which offers an additional advantage in estimation over the more general model.
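For very small $n$, the partition function in Equation (2) can be evaluated by brute force, which may help make the support restriction concrete. The sketch below enumerates all labeled simple graphs on $n$ vertices, keeps those of degeneracy $m$, and sums the weights $\prod_j p_j^{n_j(g)}$. All function names are hypothetical, the parameter values are illustrative, and the enumeration is only feasible for a handful of vertices (there are $2^{\binom{n}{2}}$ graphs).

```python
from itertools import combinations
from math import prod


def shell_counts(n, edges):
    """Shell distribution (n_0, ..., n_{n-1}) of the graph on vertices 0..n-1."""
    nbrs = {v: set() for v in range(n)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    counts, k = [0] * n, 0
    while nbrs:
        # Peel a vertex of minimum remaining degree; k tracks its shell index.
        v = min(nbrs, key=lambda x: len(nbrs[x]))
        k = max(k, len(nbrs[v]))
        counts[k] += 1
        for u in nbrs.pop(v):
            nbrs[u].discard(v)
    return counts


def degeneracy(counts):
    """Index of the largest non-empty shell (assumes at least one vertex)."""
    return max(j for j, c in enumerate(counts) if c > 0)


def all_graphs(n):
    """Yield every labeled simple graph on n vertices as a tuple of edges."""
    dyads = list(combinations(range(n), 2))
    for r in range(len(dyads) + 1):
        yield from combinations(dyads, r)


def partition_function(n, m, p):
    """phi(p) of Equation (2): sum over g in G_{n,m} of prod_j p_j^{n_j(g)}."""
    total = 0.0
    for edges in all_graphs(n):
        counts = shell_counts(n, edges)
        if degeneracy(counts) == m:
            total += prod(p[j] ** counts[j] for j in range(m + 1))
    return total


# Illustrative call: n = 4 vertices, degeneracy m = 1, equal shell weights.
phi = partition_function(4, 1, [0.5, 0.5])
```

With $n = 4$ and $m = 1$ the sum runs over the labeled forests with at least one edge, since a graph has degeneracy at most 1 exactly when it has no cycle. Dividing any single graph's weight by `phi` then gives its probability under Equation (2).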
For each fixed value $m$ of degeneracy, the model defined by Equation (2) is an ERGM supported on the subset of graphs $\mathcal{G}_{n,m}$. We have thus defined a family of models with parameters $p$ and $m$, where $p = (p_0, \ldots, p_m) \in \Delta_{m+1}$ and $m \in \{0, \ldots, n-1\}$. It is a union of ERGMs, one for each distinct value of $m$. For the remainder of the paper, this support restriction is assumed to be present and made implicit, unless otherwise mentioned, to ease notation. The dimension of the parameter space is $m + 1$ and is a function of the parameter $m$.

Remark 3. In this paper, we will treat $m$ as fixed and known. When fitting the model to real networks, $m$ will be selected by setting it equal to the degeneracy of the observed graph, assuming the sample size $N = 1$ as is most common in applications. Estimating $m$ and fitting the shell ERGM when $N > 1$ and the observed graphs have distinct degeneracy values is an open question. The choice of fixing $m$ rather than treating it as an estimable parameter is both reasonable and warranted. The degeneracy of a graph is an important metric that describes its sparsity and is easily calculable from the data. If the degeneracy is not fixed, the large majority of our parameters will not be estimable as the observed graphs are expected to be sparse (real networks usually are), with observed degeneracy much smaller than $N$; see also Section 4.1. Moreover, simulations show that allowing $m$ to be different from the observed degeneracy leads to a poorly behaved model, as explained in Section 9.2. Intuitively, having $p_i > 0$ for large shell indices $i$ ensures that large-index shells attract most nodes.

In order to express this model in exponential family form, define the set of natural parameters $\theta_i = \log(p_i / p_m)$. Note that by definition, $\theta_m = 0$, so there are $m$ linearly independent parameters; we will thus denote by $\theta = (\theta_0, \ldots, \theta_{m-1})$ the vector of natural parameters.
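The step from the $p$-parametrization to the natural parameters can be checked with a short calculation. The following is a sketch that uses only the identity $\sum_{j=0}^{m} n_j(g) = n$, which holds for every $g \in \mathcal{G}_{n,m}$ because all shells above $m$ are empty:

```latex
\begin{align*}
\prod_{j=0}^{m} p_j^{\,n_j(g)}
  &= p_m^{\,n - \sum_{j=0}^{m-1} n_j(g)} \prod_{j=0}^{m-1} p_j^{\,n_j(g)}
   = p_m^{\,n} \prod_{j=0}^{m-1} \left( \frac{p_j}{p_m} \right)^{n_j(g)} \\
  &= \exp\Big\{ n \log p_m + \sum_{j=0}^{m-1} n_j(g)\, \theta_j \Big\}.
\end{align*}
```

The factor $p_m^{\,n}$ does not depend on $g$, so it cancels against the normalizing constant, leaving a density that depends on $g$ only through $(n_0(g), \ldots, n_{m-1}(g))$ and on the parameters only through $\theta$.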
The shell distribution ERGM can now be written in the following form:

$$P(G = g) = \exp\Big\{ \sum_{j=0}^{m-1} n_j(g)\, \theta_j - \psi(\theta) \Big\}, \tag{3}$$

where $\psi(\theta)$ is the log-partition function (or the log normalizing constant), given by

$$\psi(\theta) = \log \sum_{g \in \mathcal{G}_{n,m}} \exp\Big\{ \sum_{j=0}^{m-1} n_j(g)\, \theta_j \Big\}. \tag{4}$$

The $m$-truncated shell distribution $(n_0(g), \ldots, n_{m-1}(g))$ is a minimal sufficient statistic of the model. The natural parameter space is

$$\Theta = \{ \theta \in \mathbb{R}^m : \psi(\theta) < \infty \} = \mathbb{R}^m. \tag{5}$$

Given this model specification, the overarching objective is to use it to perform statistical inference. However, as is usually the case for ERGMs, evaluating the log-partition function above is intractable for any reasonably sized $N$. This will affect the computation of the maximum likelihood estimator (MLE), requiring one to resort to MCMC methods, as well as testing model fit. In the remainder of this paper, we study three important aspects of these problems. First, both MLE computation and model fitting depend on our ability to sample from the model with a given parameter value. To this end, we provide an MCMC algorithm for sampling from the model, summarize the results of several simulations, and provide an interpretation of the model parameters and the sampling distribution of realizable graph shell structures. Second, from the theory of exponential families, we know that the MLE is unique if it exists. But the question of existence is not often easy to address; we solve it here for the shell distribution model. Finally, testing model fit necessitates the ability to sample from the fibers of the model, that is, the subspaces of $\mathcal{G}_n$ with given fixed values of the shell distribution. We provide an algorithm for performing this task. We begin with theoretical considerations, then proceed to simulation results.

3.1.
Sample space restriction and degeneracy of real-world networks

In ERGMs, sample space restriction leads to an improvement in the properties of the conditional model and estimation algorithms, as shown in Snijders and Van Duijn [2002], Snijders [2002]. A usual approach is to condition on the degree sequence, maximum degree, or degree distribution, etc. In contrast, we are conditioning on the observed degeneracy of the graph. This is more robust than conditioning on the degree, as we are allowing the degrees to be somewhat free but still controlling sparsity in another way.

Degeneracy of real networks tends to be small relative to the number of nodes. A table illustrating this for the undirected graphs from the Pajek collection of datasets [Batagelj and Mrvar] is included below.

Network dataset | #Nodes | #Edges | Degen. | Shell distribution
Scotland | 244 | 256 | 4 | (16, 26, 183, 7, 12)
Geom | 7343 | 11898 | 21 | (1185, 2218, 1714, 1023, 503, 248, 122, 126, 34, 27, 20, 52, 0, 1, 7, 14, 17, 0, 0, 0, 0, 22)
NDyeast | 2114 | 2277 | 5 | (244, 1199, 478, 169, 18, 6)
NetScience | 1589 | 2742 | 19 | (128, 320, 390, 281, 223, 89, 21, 60, 27, 30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20)
USpowerGrid | 4941 | 6594 | 5 | (0, 1588, 3122, 195, 24, 12)
Erdős | 6927 | 11850 | 10 | (0, 4780, 954, 466, 258, 179, 113, 73, 49, 17, 38)

Observe that the degeneracy of the graph is allowed to grow as the number of nodes grows, but is expected to be significantly smaller than $n$ in real-world networks.

4. Inference and implementation of the shell distribution ERGM

Many inference problems associated with ERGMs require generating random samples from the model at a fixed parameter value.
In particular, problems such as computing an MLE using Monte Carlo methods (Snijders [2002]), sampling from the posterior distribution of the parameters (Caimo and Friel [2011]) and exploring the space of graphs that have high probability under the model each require random samples from the model. In this section, we present a commonly used MCMC algorithm to sample graphs from the shell distribution ERGM and use this algorithm to obtain maximum likelihood estimates and to understand the properties of random graphs that arise from the shell distribution ERGM.

Sampling from the shell distribution ERGM: As is the case with most ERGMs, sampling from the shell distribution ERGM is intractable and we need to resort to Markov chain Monte Carlo (MCMC) schemes. We use a Metropolis-Hastings algorithm with a tie-no-tie proposal (see Caimo and Friel [2011]) to generate graphs from the model. At each iteration, the algorithm proposes a graph $g'$ from the current state $g$ and decides to accept it with probability

$$\min\left\{ 1, \frac{P(g') \cdot P(g' \to g)}{P(g) \cdot P(g \to g')} \right\} = \min\left\{ 1, \prod_i p_i^{\,n_i(g') - n_i(g)} \cdot \frac{P(g' \to g)}{P(g \to g')} \right\}, \tag{6}$$

where $\{g \to g'\}$ denotes the event that the Markov chain moves from $g$ to $g'$. Note that when the proposed graph $g'$ has degeneracy not equal to $m$, by definition of the model, $P(g') = 0$, hence the acceptance probability is 0.

A simple proposal distribution that is commonly used for proposing new graphs in the Metropolis framework is to randomly select a dyad and swap it. However, during experiments, we found that this leads to Markov chains with poor mixing properties. Instead, we use a "tie-no-tie" (TNT) proposal, also used in Caimo and Friel [2011]. At each iteration, the TNT proposal randomly chooses between the set of edges and non-edges, and then swaps a randomly chosen dyad within the selected set.
But this proposal is not symmetric: Let $\pi$ be the probability of choosing the set of edges, $ne(g)$ be the number of non-edges in $g$ and $e(g)$ be the number of edges in $g$. Then the Hastings ratio $\frac{P(g' \to g)}{P(g \to g')}$ is determined as follows:

$$\frac{P(g' \to g)}{P(g \to g')} = \begin{cases} \dfrac{\pi}{1 - \pi} \cdot \dfrac{ne(g)}{e(g) + 1}, & \text{if } g' \text{ is obtained from } g \text{ by adding an edge,} \\[2ex] \dfrac{1 - \pi}{\pi} \cdot \dfrac{e(g)}{ne(g) + 1}, & \text{if } g' \text{ is obtained from } g \text{ by removing an edge.} \end{cases} \tag{7}$$

Remark 4. Computing the acceptance probability using Equation (6) requires one to compute the so-called vector of "change statistics" $\{ n_i(g') - n_i(g) \}$, $i = 1, \ldots, n$, at each step; see Hunter and Handcock [2006]. For many existing ERGMs, the change statistics can be computed locally, i.e. without resorting to computing the sufficient statistics for the proposed network $g'$. However, this is not the case for the shell distribution as it is a global sufficient statistic. In order to compute the change statistics, we need to recompute the shell distribution for the proposed network $g'$ at each step of the Markov chain. This increases the computational complexity of the algorithm, even though one can compute the shell distribution in linear time.

4.1. Estimating the parameters of the shell distribution ERGM

A natural starting point for estimating the parameter values $\theta$ and $m$ from a real network is either (a) to use their observed counterparts or (b) to use a maximum likelihood estimate. We will discuss these two estimating methods for both $\theta$ and $m$. Estimation of $m$ is tricky, as it represents the model dimension, and we observe only one graph. Also, for any observed graph, allowing $m$ to be different from the observed degeneracy leads to many undesirable properties of the resulting model. We explain this issue at length in Section 9.2. Thus for simulation studies based on real networks we fix $m$ to be the observed degeneracy. Estimation of $\theta$ is more involved.
One can estimate $\theta$ naively by using the empirical shell distribution and setting $\hat{\theta}_j = n_j / n$, or one can use a more principled likelihood-based estimator (such as an MLE or a Bayes estimate). It turns out that using the observed shell distribution as an empirical estimate leads to a poor (or uninteresting) parameter estimate; in particular, networks sampled from the empirical estimate do not resemble the observed network. Namely, the model puts most of its mass on graphs with all nodes in the largest possible shell (see also Sections 4.2 and 4.3). On the other hand, computing an MLE of $\theta$ from the observed network is intractable due to the normalizing constant $\psi(\theta)$ given in Equation (4). Maximizing the likelihood requires the repeated use of Markov chain Monte Carlo sampling, as described below; see also Hunter and Handcock [2006] and references therein. Bayesian estimates are also intractable due to two normalizing constants; see Caimo and Friel [2011] for more details.

We use Markov chain Monte Carlo MLE (Geyer and Thompson [1992], Snijders [2002]) to estimate $\theta$. For $t = 0, 1, \ldots$, let $\theta_t$ be the parameter estimate at iteration $t$. We estimate the ratio of the intractable normalizing constants $\frac{\psi(\theta)}{\psi(\theta_t)}$ using samples from $\theta_t$ obtained by the Markov chain algorithm presented earlier (in this ratio, $\psi$ is to be read as the normalizing constant itself, i.e. the exponential of the log-partition function in Equation (4)). Specifically, let $g_1, \ldots, g_B$ be a random sample from the model $\theta_t$; then

$$\frac{\psi(\theta)}{\psi(\theta_t)} \approx \frac{1}{B} \sum_{b=1}^{B} \exp\big\{ (\theta - \theta_t)\, n_S(g_b) \big\}.$$

Then, $\theta_{t+1}$ is estimated by maximizing the estimated log-likelihood, given by

$$\hat{l}(\theta, \theta_t) = (\theta - \theta_t)\, n_S(g_{\mathrm{obs}}) - \log \frac{\psi(\theta)}{\psi(\theta_t)},$$

and the process is repeated until convergence; see Hunter and Handcock [2006] for more details. Estimation of the normalizing constant requires a good initial value $\theta_0$ (Hunter and Handcock [2006]).
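The Monte Carlo approximation of the log-likelihood ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `estimated_loglik_ratio` is hypothetical, the sufficient statistics are passed in as plain lists holding the $m$-truncated shell distributions, and the sampled statistics are assumed to come from a sampler run at $\theta_t$. The log-sum-exp trick is a numerical-stability choice made for this sketch.

```python
import math


def estimated_loglik_ratio(theta, theta_t, obs_stat, sampled_stats):
    """Monte Carlo estimate of the log-likelihood ratio between theta and theta_t.

    theta, theta_t : natural parameter vectors of equal length.
    obs_stat       : truncated shell distribution of the observed graph.
    sampled_stats  : truncated shell distributions of graphs sampled at theta_t.
    """
    diff = [a - b for a, b in zip(theta, theta_t)]

    def dot(stat):
        return sum(d * s for d, s in zip(diff, stat))

    # log of (1/B) * sum_b exp{(theta - theta_t) . n_S(g_b)}, computed stably.
    exponents = [dot(s) for s in sampled_stats]
    top = max(exponents)
    log_ratio = top + math.log(
        sum(math.exp(e - top) for e in exponents) / len(exponents)
    )
    return dot(obs_stat) - log_ratio


# Illustrative values only (not data from the paper).
value = estimated_loglik_ratio([0.2, -0.1], [0.0, 0.0], [3, 5], [[2, 6], [4, 4], [3, 5]])
```

One iteration of the procedure would maximize this function over `theta` with any numerical optimizer, set $\theta_{t+1}$ to the maximizer, and draw a fresh sample at the new value.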
We use a heuristic grid search to obtain a good starting point that is close to the MLE, where closeness to the MLE is evaluated by checking if the empirical version of the following moment equation holds:

$$E_{\hat{\theta}}[n_S(g)] = n_S(g_{\mathrm{obs}}),$$

where $g_{\mathrm{obs}}$ is the observed graph and $\hat{\theta}$ is an MLE. The behavior of the MCMC-MLE estimator depends on the choice of a good starting point $\theta_0$. For the current simulations, we use a heuristic starting point, but one could also consider the step length algorithm in Hummel et al. [2012] to find a good starting point close to the MLE.

What do graphs from the shell distribution ERGM look like?

We use the MCMC algorithm described above to explore the structure of random graphs generated by the model for fixed and estimated parameter values. In particular, for a fixed choice of parameters $\theta$ and $m$ of the shell distribution ERGM, we explore the space of graphs that have high probability mass under the model by sampling a large number of graphs $\{g_b\}_{b=1}^{B}$ using the MCMC algorithm. We use these samples to find out what features of any given network can be captured by modeling its core structure through the shell distribution ERGM. In the simulation studies below, we employ two types of parameter values to simulate graphs: known fixed parameters and parameters estimated from a real-world network. For the known parameters, we always use degeneracy $m = 3$. Parameter estimates based on real-world networks are obtained using a combination of a heuristic grid search (to initialize the MCMC MLE algorithm) and MCMC MLE. To explore the sampled space of graphs, we summarize the distribution of the sampled graphs $\{g_b\}$ by using several summary statistics: boxplots of the degree distribution and shell distributions, and histograms of number of edges, two stars, and triangles, centrality, size of largest shell and size of the innermost shell.
When the parameters are estimated using a real-world network, we also compare the distribution of these summary statistics with the corresponding observed statistic. It may be tempting to use this comparison as a way to assess the goodness of fit of the model; however, one must exercise caution:

Remark 5. It is important to note that comparing the sampling distribution of summary statistics with the observed values is not a formal goodness-of-fit test of the model, but instead a heuristic approach to evaluate how well the model fits the data. It follows along the lines of the goodness-of-fit testing proposed for more general ERGMs in Hunter et al. [2008]. Ideally, one should be able to either derive the asymptotic distribution of any test statistic or, since in this case we usually observe a single network, perform an exact test. However, doing so requires several important steps, foremost, a good choice of a test statistic that can play the role of a generalized goodness-of-fit statistic. In the case of, say, hierarchical log-linear models for contingency tables, one can use the chi-square statistic, and sample from the conditional distribution given the observed sufficient statistic to approximate the exact distribution of $\chi^2$. In the case of this ERGM, however, we do not have at our disposal such a statistic that can reliably 'measure' the distance of the observed network from the expected network. The main obstacle is that the dyads are not independent in this model, unlike the case of hierarchical models in which cells in the contingency table (arising from the incidence matrix) are independent.

To this end, we follow the generally used strategies for ERGMs and report the sampling distributions of various complementary network statistics, such as the number of edges and the number of triangles.
For completeness, we explore the distribution of these statistics when conditioning on the sufficient statistics in Section 6.

Fig 3: Sampling distributions of summary statistics from the Equal Attractiveness model (panels: Edges, Triangles, 2 stars, Centrality, Size of largest shell, Size of innermost shell).

Fig 4: Sampling distributions of summary statistics from the Decaying Attractiveness model (panels: Edges, Triangles, 2 stars, Centrality, Size of largest shell, Size of innermost shell).

4.2. Example 1: Various fixed Shell probabilities

In this section, we study the properties of the shell distribution ERGM by simulating graphs from various fixed parameters. We set $m = 3$, $n = 18$ and consider two models:

1. Equal attractiveness, i.e., $p_i = 1/4$ for all $i$;
2. Decaying attractiveness, i.e., $p_i \propto e^{-i}$ for all $i$.

Equal Attractiveness Model: This model posits that every shell has equal attractiveness, i.e., $p_i = 1/4$ for all $i$, and since $\theta_i = \log(p_i / p_m)$, it follows that $\theta = (0, 0, 0, 0)$. Hence, by definition, this model places a uniform mass over the set of all 3-degenerate graphs. The sampling distributions of various summary statistics of graphs sampled from this model are shown in Figure 3. Note that even though the model posits that every shell has equal attractiveness a priori, the sampled graphs are such that most nodes tend to lie in the innermost shell, which is shell 3 in this case. This can be seen from the histogram of the size of the innermost shell in Figure 3.
There are at least three reasons for this behavior, the first one related to the very definition of the shell index. Namely, the existence of higher-index shells in a graph requires a certain minimum number of nodes in it, and hence, a priori, higher shells have higher levels of natural “attractiveness”, to which we refer as intrinsic graph-theoretic attractiveness. In this sense, the innermost shell is always the most attractive. Secondly, the model puts a uniform distribution on the space of all (3-degenerate) graphs, not on the space of all shell distributions. For example, consider the 4-truncated shell distributions $(0, 0, 0, 18)$ and $(18, 0, 0, 0)$: there are many graphs realizing the former, yet exactly one graph realizing the latter, namely the empty graph. Thus, the sampling distribution of the shell distributions is non-uniform. Finally, there is also an issue with the slow mixing of the Markov chain. Shell distributions with a large number of nodes in the higher-indexed shells are “stable” in the sense that adding or removing a single edge tends to leave the shell distribution unchanged. On the other hand, when most nodes are in lower-index shells, adding or removing a few edges leads to large changes in the shell distribution.

It is worth noting that the second and the third issues above are, in fact, related to each other and also to an issue that arises naturally in ERGMs in general. Namely, ERGMs model random graphs, not sufficient statistics; thus a uniform distribution over the set of graphs is not a uniform distribution over the set of sufficient statistics one may care about. This is made evident by the current example: a uniform distribution over 3-degenerate graphs induces a non-uniform distribution on graph statistics such as the number of triangles, the number of edges, and the number of 2-stars.
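The second point can be checked by brute force on a very small case. The sketch below is an illustration only, with $n = 5$ instead of $n = 18$ so that all $2^{10}$ labeled graphs can be enumerated; it counts how many graphs of degeneracy at most $m = 3$ realize each 4-truncated shell distribution. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def shell_distribution(n, edges, m):
    """(m+1)-truncated shell distribution, or None if the degeneracy exceeds m."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    deg = {v: len(adj[v]) for v in adj}
    remaining, k = set(adj), 0
    shell = [0] * (m + 1)
    while remaining:
        v = min(remaining, key=deg.get)  # peel a node of smallest remaining degree
        k = max(k, deg[v])               # k is the shell index of v
        if k > m:
            return None
        shell[k] += 1
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                deg[w] -= 1
    return tuple(shell)

def count_shell_distributions(n, m):
    """Number of labeled graphs on n nodes realizing each shell distribution."""
    pairs = list(combinations(range(n), 2))
    counts = Counter()
    for mask in range(1 << len(pairs)):  # each bitmask encodes one labeled graph
        edges = [p for i, p in enumerate(pairs) if (mask >> i) & 1]
        s = shell_distribution(n, edges, m)
        if s is not None:
            counts[s] += 1
    return counts

counts = count_shell_distributions(n=5, m=3)
print(counts[(5, 0, 0, 0)], counts[(0, 0, 0, 5)])  # prints "1 25"
```

Under the uniform distribution on these 1023 graphs, the shell distribution $(0, 0, 0, 5)$ is therefore 25 times as likely as $(5, 0, 0, 0)$, which is realized only by the empty graph. The same effect, on a much larger scale, is at work for $n = 18$.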
Decaying Attractiveness Model: The decaying attractiveness model posits that the attractiveness of each shell decays exponentially with its index, i.e., $p_i = c e^{-i}$, where $c$ is some constant. This model aims to overcome the problems imposed by the intrinsic graph-theoretic attractiveness of the higher-index shells. Figure 4 shows the sampling distributions of summary statistics of the samples from this model. The histogram of the size of the innermost shell is bimodal, with one mode at 16 and a second one at 4. The histograms of the number of two-stars and the number of triangles are bimodal as well.

4.3. Example 2: Sampson monastery data

The Sampson dataset is a widely studied network of size 18 that records interactions among a group of monks in a New England monastery and their evolution over time (Sampson [1968]). The first three time periods of the original Sampson data are commonly used (e.g., in the ergm package) and often aggregated. The network at any of these three time periods makes for an uninteresting second example from the point of view of shells: namely, all nodes are in the same shell and the network has degeneracy 3, and we have already considered such networks in Section 4.2. The aggregate network over the three time periods also has just about all nodes (all but 4) in the highest shell, with degeneracy 5. In order to obtain a more varied shell distribution as a case study to examine the model behavior, we consider instead an arbitrary subgraph of the aggregate network; specifically, we use the upper triangular part of the adjacency matrix and symmetrize it. This undirected network is shown in Figure 5, color-coded by shells; it has $n = 18$ nodes, $e = 35$ edges and a density of 0.23.
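The construction just described (keep the upper triangular part of the adjacency matrix and symmetrize it) can be sketched as follows. This is a minimal illustration, assuming the aggregate network is available as a 0/1 adjacency matrix stored as a list of lists; `symmetrize_upper` and `density` are hypothetical helper names, and the 3-node matrix is toy data, not the Sampson network.

```python
def symmetrize_upper(adj):
    """Undirected adjacency matrix built from the upper triangle of `adj` only."""
    n = len(adj)
    sym = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j]:                 # entries below the diagonal are ignored
                sym[i][j] = sym[j][i] = 1
    return sym

def density(sym):
    """Number of edges and edge density e / C(n, 2) of an undirected 0/1 matrix."""
    n = len(sym)
    edges = sum(sym[i][j] for i in range(n) for j in range(i + 1, n))
    return edges, edges / (n * (n - 1) / 2)

# Toy directed matrix: only the tie 0 -> 1 lies above the diagonal.
directed = [[0, 1, 0],
            [0, 0, 0],
            [1, 1, 0]]
undirected = symmetrize_upper(directed)
edges, dens = density(undirected)
```

For the network used here, the reported density agrees with $e / \binom{n}{2} = 35 / 153 \approx 0.23$.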
The observed degeneracy is 3 and the observed 4-truncated shell distribution is $(0, 2, 3, 13)$; there are 3 nonempty shells, and the innermost shell (shell 3) contains the highest number of nodes (13).

To use this Sampson-derived network to study the properties of the shell distribution ERGM, we set $m = 3$ and use MCMC-MLE to estimate the value of $\theta$. Using a heuristic grid search, we found $\theta_0 = (2, 1, 1, 0)$ to be a good initial estimate. The estimated MLE is $\hat{\theta}_{MLE} = (-7.95, 2.79, 0.91, 0)$, which corresponds to $\hat{p}_{MLE} = (0.00, 0.82, 0.13, 0.05)$. Recall that $\theta_i = \log(p_i / p_m)$, and hence $\theta_i$ can be interpreted as the log-odds of attractiveness of shell $i$ relative to shell $m$. For this dataset, the log-odds of attractiveness of shell 1 relative to shell 3 (2.79) is about 3 times that of shell 2 (0.91), thus indicating that the network has a rich periphery in the sense of Rombach et al. [2014]. This can also be seen by noting that $\hat{p}_1 = 0.82$; recall that the $p_i$ can also be interpreted as the propensity of the $i$-th shell to have nodes in it beyond its intrinsic graph-theoretic attractiveness (as explained in Section 4.2).

Next, using $m = 3$ and the MLE $\hat{\theta} = (-7.95, 2.79, 0.91, 0)$, we simulated networks from the model using the MCMC algorithm presented earlier in this Section to study what properties of the network are captured by the model. One can think of these sampled graphs as samples from the posterior predictive distribution. Convergence of a 40,000-step Markov chain was verified using the usual diagnostics, such as trace plots and autocorrelation plots, to ensure sufficient mixing.

Figures 6, 7, 8 summarize the results of the simulations. Specifically, Figure 6 shows the sampling distributions of various summary statistics in the form of histograms and compares them with the observed values. Several interesting results emerge.
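Before turning to those results, the correspondence between $\hat{\theta}_{MLE}$ and $\hat{p}_{MLE}$ reported above can be checked numerically. Since $\theta_i = \log(p_i / p_m)$ and, as in the equal attractiveness example, the $p_i$ sum to one, it follows that $p_i = e^{\theta_i} / \sum_j e^{\theta_j}$. The sketch below applies this map and its inverse; the function names are hypothetical.

```python
import math

def theta_to_p(theta):
    """Map theta (with theta_m = 0 as last entry) to shell probabilities p."""
    weights = [math.exp(t) for t in theta]
    total = sum(weights)
    return [w / total for w in weights]

def p_to_theta(p):
    """Log-odds of each shell relative to the last shell m: theta_i = log(p_i / p_m)."""
    return [math.log(x / p[-1]) for x in p]

# MLE reported for the Sampson-derived network.
theta_hat = [-7.95, 2.79, 0.91, 0.0]
p_hat = theta_to_p(theta_hat)
print([round(x, 2) for x in p_hat])  # prints [0.0, 0.82, 0.13, 0.05]

# Decaying attractiveness model of Section 4.2: p_i proportional to exp(-i), i = 0..3.
decaying_p = theta_to_p([-i for i in range(4)])
decaying_theta = p_to_theta(decaying_p)  # approximately [3, 2, 1, 0]
```

The same map shows that the decaying attractiveness model corresponds to $\theta = (3, 2, 1, 0)$, since $\theta_i = \log(e^{-i} / e^{-3}) = 3 - i$.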
The sampling distributions of the summary statistics are all unimodal and, with the exceptions noted below, close to the observed statistics shown by the red lines. Notice that the histogram of triangles is centered around the observed value; thus the shell distribution model captures triadic effects quite well, at least in this small example. We would like to draw a comparison with degree-based models, which do not capture triadic effects, by definition. It is widely believed that the centrality of a network is related to its core distribution, and the histogram of centrality provides additional support for this hypothesis. The distribution of the size of the largest shell is also captured by the model. However, the sampling distribution of the number of edges suggests that the observed number of edges is much smaller than what we expect under the model. This may be due to the fact that the model has a bias towards graphs with higher-index shells (innermost cores), and these shells tend to be densely connected. A similar situation is true for the number of two-stars. The sampling distribution of the size of the innermost shell indicates that it can have anywhere from 5 to 18 nodes, with two modes at 15 and 16; compare this with the observed number of 13 nodes in shell 3.

We also consider the various shell distributions visited by the Markov chain. The top 10 most frequently visited shell distributions are given in Table 1.

Fig 5: A subset of the Sampson Monastery Dataset. Nodes are colored according to their shell index: black is 1, red is 2, and green is 3.

Fig 6: Sampling distribution of summary statistics from the model estimated from the dataset in Figure 5 (panels: Edges, Triangles, 2 stars, Centrality, Size of largest shell, Size of innermost shell). The red dashed lines indicate the observed values of the statistics.

Table 1: The top 10 visited shell distributions

Shell Distribution    Density (in %)
0.1.1.16              5.95
0.0.1.17              5.79
0.1.2.15              5.22
0.0.2.16              4.54
0.2.1.15              4.22
0.0.0.18              3.88
0.1.3.14              3.64
0.2.2.14              3.63
0.1.0.17              3.54
0.0.3.15              2.89

Figures 7 and 8 show the box plots of degree and shell distributions, respectively, of the sampled graphs, and include the observed degree and shell distributions as dotted lines. Note that the sampling distribution of degree distributions is quite different from that of shell distributions, showing that the shell distribution model captures features that go beyond the degrees, and justifying our initial motivation for constructing the model. In addition, the sampling distribution of the shell distribution is concentrated around the observed shell distribution. This is to be expected: as we used the observed shell distribution to estimate the model, it serves as a check that the MLE of $\theta$ obtained using MCMC-MLE is indeed a good estimate. Recall that the MLE can equivalently be characterized as follows: if $\hat{\theta}$ is an MLE, then $E_{\hat{\theta}}[n_s(g)] = n_s(g_{obs})$. Figure 8 serves as a visual confirmation of this equation. In fact, the observed shell distribution is $n_s(g_{obs}) = (0, 2, 3, 13)$ and the estimate of the expected shell distribution (based on the MCMC samples from $\hat{\theta}$) turns out to be $\hat{E}_{\hat{\theta}}[n_s(g)] = (0.00, 2.29, 3.06, 12.66)$. Finally, even though the general trend in the observed degree distribution is captured by the model, as suggested by Figure 7, there is a substantial deviation between the observed degree distribution and the one suggested by the model.
This reinforces the observation that the degree distribution and shell distribution capture different aspects of the Sampson network, and the shell distribution ERGM captures properties of the network beyond the degrees. In fact, it is well known that degree-based models have independent dyads, whereas the shell distribution ERGM does not. This is further evidenced by Figure 6.

Fig 7: Degree Distribution (box plots; x-axis: Degree, 0 to 17; y-axis: Number of Nodes).
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 
Fig 8: Shell Distribution (horizontal axis: Shell Index; vertical axis: Number of Nodes).

Fig 9: Box plots of degree distributions and shell distributions for the shell distribution model estimated from Sampson data. The dashed lines represent the observed distributions.

5. A sampling algorithm for generating graphs with a given shell distribution

In the last two decades, there have been several contributions in the graph theory and computer science literature on computing cores decompositions. Given the wide-ranging application of cores, a natural problem that arises is to find an algorithm that randomly generates graphs with a given core structure. Such an algorithm is presented in Baur et al. [2007] for graphs with additional restrictions on the number of edges between pairs of shells.
This section provides a simple algorithm (Algorithm 3) for sampling the space of graphs with a given shell distribution (sometimes called the fiber of that distribution), such that any graph has positive probability of being constructed (Theorem 9). This is an independent sampler, not a Markov chain. Simulations indicate good performance in terms of discovering new graphs at a fast pace. While the true sampling distribution is not known, our experiments show that reasonably long runs will give good estimates.

Algorithm 1: Compute Shell Sequence
  input : a graph $g$
  output: its shell sequence $s(g) = (s_1, \dots, s_n)$
  Initialize $s^* = 0$.
  Repeatedly remove vertices of degree at most $s^*$ in $g$, incrementing $s^*$ by 1 if no eligible vertices remain in $g$; quit when $g$ is empty. The shell index of each vertex is the value of $s^*$ when it was deleted.

For convenience, we restate the basic algorithm for producing the shell sequence of a graph as Algorithm 1. There is no need to implement it, since the linear-time algorithm from Batagelj and Zaveršnik [2003] is already implemented as the graph.coreness function from the Csardi and Nepusz [2006] igraph package in R.

Note that the order in which the vertices of $g$ are deleted in Algorithm 1 is neither unique nor arbitrary: vertices are deleted in increasing order of their shell indices, but not all vertices with the same shell index are interchangeable. For example, consider the graph in Figure 2a, for which every vertex has shell index equal to 1: the first vertex deleted will be, by necessity, one of the vertices of degree 1, but the second vertex deleted can vary depending on the choice of the first vertex.

Our sampling algorithm will generate graphs with vertices in an order that is compatible with Algorithm 1, so we will need to know more about such orderings. To that end, we give a simple condition for a graph $g$ on vertices $\{v_1, \dots, v_n\}$ that determines whether Algorithm 1 could potentially process its vertices in that order, yielding a pre-specified sorted shell sequence $s_1 \le \dots \le s_n$.

Condition 6. For all $i \in [n]$:
1. $v_i$ has at least $s_i$ neighbors $v_j$ with $s_j \ge s_i$, and
2. $v_i$ has at most $s_i$ neighbors $v_j$ with $j > i$.

Lemma 7. Consider any graph $g \in \mathcal{G}_n$ on vertices labeled $v_1, \dots, v_n$ and a sorted sequence of $n$ non-negative integers $s_1 \le \dots \le s_n$. Algorithm 1 can process the vertices of $g$ in the given order, yielding shell indices $s(v_i) = s_i$ for all $i \in [n]$, if and only if $g$ satisfies Condition 6.

Proof. Consider Algorithm 1 on a graph $g$ satisfying Condition 6, at the moment when $s^*$ increments from $s - 1$ to $s$. The subgraph induced by $\{v_i : s_i \ge s\}$ has minimum degree at least $s$ by Condition 6(i), so none of those vertices can have been deleted yet. On the other hand, if $v_i$ is the vertex remaining in $g$ with smallest index $i$, then $v_i$ was not eligible for deletion at $s^* = s - 1$, so it must have at least $s$ neighbors $v_j$ with $j > i$; by Condition 6(ii), $s_i \ge s$. Thus, the vertices remaining in $g$ at that moment are precisely those $v_i$ with $s_i \ge s$. Applying the argument for any $s$ and for $s + 1$ shows that the vertices $v_i$ with $s_i = s$ are precisely those which Algorithm 1 deletes when $s^* = s$, as required.

For the other direction, suppose that Algorithm 1 processes the vertices of $g$ in order, yielding $s(v_i) = s_i$ for all $i \in [n]$. Then Condition 6(ii) is true since $s^* = s_i$ when $v_i$ is deleted. Suppose that Condition 6(i) is not true for some $v_i$. Just before $s^*$ increments from $s_i - 1$ to $s_i$, all vertices $v_j$ with $s_j < s_i$ have been deleted, so $v_i$ has fewer than $s_i$ neighbors remaining. Then $v_i$ could be deleted, which would make its shell index $s_i - 1$ according to the algorithm, a contradiction.

Given a sorted shell sequence $s_1, \dots, s_n$ of some simple graph, we initially aim to construct a graph $g$ in $n$ steps, by adding edges from $v_i$ to $v_j$ with $j > i$ during Step $i$ so that Condition 6 is satisfied. At Step $i$, we will need to know how many neighbors $v_i$ already has with shell index at least $s_i$; call this number $t_i$. Then Condition 6 can be restated as follows: $v_i$ has between $s_i - t_i$ and $s_i$ new neighbors added during Step $i$, where
$$t_i = |\{v_j : v_j v_i \in g,\ j < i,\ s_j \ge s_i\}|.$$
These considerations are summarized in Algorithm 2.

Algorithm 2: Graph sampler: initial version
  input : a sequence of non-negative integers $s_1 \le \dots \le s_n$
  output: a graph $g$ on vertices $v_1, \dots, v_n$ with shell sequence $s(g) = (s_1, \dots, s_n)$
  for $i \leftarrow 1$ to $n$ do
    Make $v_i$ adjacent to a set $S$ of vertices $v_j$ with $j > i$ such that $s_i - t_i \le |S| \le s_i$
    Update $t_j$ values as needed.
  end

However, Algorithm 2 could get stuck if it is unable to choose $S$ as required. This problem will not happen as long as the number of vertices $v_j$ with $i < j \le n$ is at least $s_i - t_i$. For Steps $i \le n - s_n$, the number of such vertices, $n - i$, satisfies $n - i \ge s_n \ge s_i \ge s_i - t_i$, so the problem can only occur for $i > n - s_n$. To avoid this, we will modify those steps of the algorithm.

Consider $i \ge n - s_n$. Since the number of vertices $v_j$ with $j > i$ is $n - i \le s_n$ and $s_i = s_n$, the condition $s_i - t_i \le |S| \le s_i$ reduces to just $|S| \ge s_n - t_i$. The number of vertices in $\{v_j : j \ge n - s_n\}$ is $s_n + 1$, including $v_i$, so $v_i$ has $s_n$ potential neighbors in that set. Thus, for such $i$, Condition 6 is equivalent to Condition 8, which is as follows:

Condition 8. For all $i \in [n]$ with $i \ge n - s_n$, $v_i$ has at most $t_i$ non-neighbors in the set $\{v_j : n - s_n \le j \le n\}$.

As we process vertices $v_i$ with $i \ge n - s_n$, let $t'_i$ represent the maximum number of non-neighbors allowed among unprocessed vertices. Initialize $t'_i = t_i$ for all $n - s_n \le i \le n$.
To satisfy Condition 8, each $t'_j$ decreases by 1 whenever $v_j$ is not made adjacent to the currently active vertex $v_i$. When some $t'_j$ reaches zero, we make $v_j$ adjacent to all remaining vertices and then remove $v_j$ from further consideration; note that this does not change $t'_i$ for any $i \ne j$. Since no $t'_i$ will ever go below zero, we will be able to process all $v_i$ with $i \ge n - s_n$ so that Condition 8 is satisfied.

Finally, recall that $t_i = |\{v_j : v_j v_i \in g,\ j < i,\ s_j \ge s_i\}|$. Since the given sequence $s_1, \dots, s_n$ is sorted in increasing order, $s_j > s_i$ is impossible when $j < i$. Thus, an equivalent definition of $t_i$ is:
$$t_i = |\{v_j : v_j v_i \in g,\ j < i,\ s_j = s_i\}|. \qquad (8)$$

Algorithm 3 constructs graphs within the restrictions permitted by Condition 6 (for $i < n - s_n$) and Condition 8 (for $i \ge n - s_n$), choosing randomly among all possibilities whenever there is more than one option. As we have shown, the algorithm will never get stuck. Thus, we have the following result:

Theorem 9. For any graph $g$ with shell sequence $s(g)$, Algorithm 3 produces $g$, up to isomorphism, with positive probability.

Algorithm 3: Graph Sampler: construct a random graph with a given shell sequence
  input : a sorted integer sequence $s_1 \le \dots \le s_n$
  output: a graph $g$ with shell sequence $s(g) = (s_1, \dots, s_n)$
  Initialize $v_1, \dots, v_n$ to be the vertices of $g$.
  Initialize $t_1 = \dots = t_n = 0$.
  for $i \leftarrow 1$ to $n - s_n - 1$ do
    Choose a random subset $R$ of $\{v_j : i < j \le n\}$ with $\max\{0, s_i - t_i\} \le |R| \le s_i$
    for $v_j \in R$ do
      Add the edge $v_i v_j$ to $g$
      if $s_j = s_i$ then $t_j \leftarrow t_j + 1$
    end
  end
  Initialize $S = \{v_j : n - s_n \le j \le n\}$
  for $v_j \in S$ do
    if $t_j = 0$ then
      $S \leftarrow S \setminus \{v_j\}$
      Add edges from $v_j$ to all $v_k \in S$ in $g$
    end
  end
  while $S \ne \emptyset$ do
    Pick any $v_i \in S$
    $S \leftarrow S \setminus \{v_i\}$
    Choose a random subset $R$ of $S$ with $|R| \ge |S| - t_i$
    for $v_j \in R$ do
      Add the edge $v_i v_j$ to $g$
    end
    for $v_j \in S \setminus R$ do
      $t_j \leftarrow t_j - 1$
      if $t_j = 0$ then
        $S \leftarrow S \setminus \{v_j\}$
        Add edges from $v_j$ to all $v_k \in S$ in $g$
      end
    end
  end

A comment on the running time of Algorithm 3: since a random set $R$ can be chosen from a given set $S$ in $O(|S|)$ time, this algorithm runs in $O(|V|^2)$ time.

We conclude this section by summarizing simulation results. Algorithm 3 randomly constructs both labeled graphs (which requires permuting the node labels of the output of the algorithm) and unlabeled graphs with a given shell distribution. It produces graphs in every isomorphism class of the shell distribution, and our simulations give preliminary evidence that it also does so quite fast.

As an example, consider the shell distribution $(0, 2, 1, 4, 0, 0, 0)$ on 7 vertices. For labeled graphs, 10,000 runs of the algorithm produced more than 7,400 distinct graphs, which indicates a very high discovery rate of the fiber. For unlabeled graphs, discovering the 12 isomorphism classes requires only 100 calls to the algorithm.

6. Behavior of complementary statistics on the fiber of the shell ERGM

In this section, we explore, both theoretically and experimentally, the behavior of various subgraphs on the fiber of graphs with a given shell distribution. In the network literature, subgraphs such as edges and triangles are used to perform heuristic goodness-of-fit tests.
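The experimental part of this exploration draws graphs from the fiber with Algorithm 3 of Section 5. The following is a minimal Python sketch of that sampler, not the authors' implementation. It assumes 0-based vertex labels, the standard `random` module, and one particular way of "choosing randomly" that the text does not prescribe: a uniformly random subset size within the allowed range, then a uniformly random subset of that size. The names `coreness`, `sample_graph`, `link`, and `saturate` are ours.

```python
import random


def coreness(adj):
    """Shell index of every vertex by repeated peeling (Algorithm 1)."""
    deg = [len(nb) for nb in adj]
    alive = set(range(len(adj)))
    core = [0] * len(adj)
    k = 0
    while alive:
        low = [v for v in alive if deg[v] <= k]
        if not low:          # nothing left to peel at this level
            k += 1
            continue
        for v in low:
            alive.discard(v)
            core[v] = k
            for u in adj[v]:
                if u in alive:
                    deg[u] -= 1
    return core


def sample_graph(s, rng):
    """Sketch of Algorithm 3; s is a sorted shell sequence, rng a random.Random."""
    n, m = len(s), s[-1]
    if list(s) != sorted(s) or n < m + 1 or s[n - m - 1] != m:
        raise ValueError("need a sorted sequence with at least m + 1 entries equal to m")
    adj = [set() for _ in range(n)]
    t = [0] * n  # t[j]: earlier neighbours of j with the same shell index

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    # First loop of Algorithm 3: all vertices before the last m + 1.
    for i in range(n - m - 1):
        size = rng.randint(max(0, s[i] - t[i]), s[i])  # assumed: uniform size
        for j in rng.sample(range(i + 1, n), size):
            link(i, j)
            if s[j] == s[i]:
                t[j] += 1

    top = set(range(n - m - 1, n))  # the last m + 1 vertices

    def saturate(j):
        """No further non-neighbours allowed: join j to everything left in top."""
        top.discard(j)
        for k in top:
            link(j, k)

    for j in sorted(top):
        if t[j] == 0:
            saturate(j)
    while top:
        i = rng.choice(sorted(top))
        top.discard(i)
        rest = sorted(top)
        size = rng.randint(max(0, len(rest) - t[i]), len(rest))
        chosen = set(rng.sample(rest, size))
        for j in chosen:
            link(i, j)
        for j in rest:
            if j not in chosen:   # j uses up one allowed non-neighbour
                t[j] -= 1
                if t[j] == 0:
                    saturate(j)
    return adj


s = [1, 1, 2, 3, 3, 3, 3]  # shell distribution (0, 2, 1, 4, 0, 0, 0)
g = sample_graph(s, random.Random(0))
print(coreness(g) == s)
```

By Lemma 7, the peeling routine applied to any output should reproduce the input sequence, which gives a quick sanity check for an implementation of the sampler.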
Hence, understanding how these subgraph counts can vary across the set of graphs with a fixed shell distribution is important. We present the results in terms of a sorted shell sequence, but note that a sorted shell sequence is equivalent to a shell distribution, as one can be obtained from the other uniquely.

The following are lower and upper bounds on the number of edges and triangles in a graph with a prescribed shell sequence and degeneracy $m$.

Proposition 10. If $g$ is a graph with sorted shell sequence $s_1 \le \dots \le s_n$, then the maximum number of edges in $g$ is
$$\binom{m}{2} + \sum_{i=1}^{n-m} s_i.$$

Proof. By Lemma 7, each vertex $v_i$ has at most $s_i$ neighbors $v_j$ with $j > i$, and the total number of $v_j$ with $j > i$ is $n - i$. We will construct a graph so that the first bound is realized for $v_i$ with $i \le n - m$ and the second bound is realized for $i \ge n - m$; thus, it has the maximum possible number of edges. Begin with a complete graph $G_0$ on the $m$ highest indexed vertices, $v_{n-m+1}, \dots, v_n$. Then for each $1 \le i \le n - m$, add exactly $s_i$ edges from $v_i$ to $V(G_0)$. This yields a graph with the desired number of edges.

Proposition 11. If $g$ is a graph with sorted shell sequence $s_1 \le \dots \le s_n$ and corresponding shell distribution $n_S(g) = (n_0, \dots, n_{n-1})$, then the minimum number of edges in $g$ is
$$\sum_{j=1}^{m} f(n_j, j), \quad \text{where} \quad f(n_j, j) = \begin{cases} \lceil j n_j / 2 \rceil & \text{if } j < n_j, \\ j n_j - \binom{n_j}{2} & \text{if } j \ge n_j. \end{cases}$$

Proof. For any $0 \le i \le m$, the vertices with shell index $i$ must have at least $i$ neighbors in $\{v_j : s_j \ge i\}$. We will construct a graph in stages as $j$ goes from $m$ down to 0, adding vertices with shell index $j$ during stage $j$, using the minimum possible number of edges to satisfy the previous condition.

First, given any $d < n$, we show how to construct a graph $G(n, d)$ with $n$ vertices, minimum degree $d$, and the fewest possible number of edges.
Let the vertex set be $\mathbb{Z}_n$ and arrange the vertices evenly around a circle. If $d$ is even, make each vertex adjacent to the $d/2$ closest vertices to it on either side. If $d$ is odd and $n$ is even, make each vertex adjacent to the $(d-1)/2$ closest vertices to it on either side and also to the vertex directly across from it. If $d$ is odd and $n$ is odd, there is no $d$-regular graph on $n$ vertices, but we can construct an $n$-vertex graph with one vertex of degree $d + 1$ and all other vertices of degree $d$, as follows. Begin by making each vertex adjacent to the $(d-1)/2$ closest vertices to it on each side. Then, for $0 \le i \le \frac{n-1}{2}$, make vertex $i$ adjacent to vertex $i + \frac{n+1}{2}$. Note that for $i = \frac{n-1}{2}$, we get an edge from vertex $\frac{n-1}{2}$ to vertex $n \equiv 0 \bmod n$. The degree of vertex 0 increases by two and every other vertex degree increases by one, as required. Note that the number of edges in $G(n, d)$ is $\lceil nd/2 \rceil$.

Now, start with $G(n_m, m)$, which we can do since $n_m \ge m + 1$. Next we consider $j$ starting from $j = m - 1$ down to $j = 0$, adding $n_j$ vertices with shell index $j$ at each step as follows. If $n_j > j$, then we add a disjoint copy of $G(n_j, j)$. If $n_j \le j$, we add a disjoint complete graph on $n_j$ vertices and, from each of its vertices, add edges to exactly $j - n_j + 1$ other vertices (which were added at earlier steps).

Let $f(n_j, j)$ be the number of edges added in Step $j$. Then $f(n_j, j) = \lceil j n_j / 2 \rceil$ when $j < n_j$, and
$$f(n_j, j) = \binom{n_j}{2} + n_j (j - n_j + 1) = j n_j - \binom{n_j}{2}$$
when $j \ge n_j$. The minimum number of edges is thus $\sum_{j=1}^{m} f(n_j, j)$.

We now study the behavior of the number of triangles, starting with a sharp upper bound.

Proposition 12. The maximum number of triangles for a graph with sorted shell sequence $s_1 \le \dots \le s_n = m$ is
$$\binom{m}{3} + \sum_{i=1}^{n-m} \binom{s_i}{2}.$$

Proof. The construction in the proof of Proposition 10 produces a graph with the right number of triangles, and the argument is similar.

Obtaining an explicit lower bound for the number of triangles is difficult. Instead, we construct graphs with the given shell sequence with relatively few triangles, thus providing an upper bound for the minimum number of triangles for graphs with the specified shell sequence. The first construction begins with a complete graph on $m$ vertices but then minimizes additional edges added in subsequent steps.

Lemma 13. Let $s_1 \le \dots \le s_n$ be a sorted shell sequence. Then, there exists a graph $g$ with this shell sequence and exactly $A$ triangles, where
$$A = \binom{s_n}{3} + \sum_{i=\max(1,\, n-2s_n+1)}^{n-s_n} \binom{s_i}{2}.$$

Proof. Start with a complete graph on vertices $S_0 := \{v_i : n - s_n + 1 \le i \le n\}$. Let $S_1 := \{v_i : \max(1, n - 2s_n + 1) \le i \le n - s_n\}$ and for each $v_i \in S_1$, add exactly $s_i$ edges from $v_i$ to $S_0$. Finally, for $1 \le i \le n - 2s_n$, add to the graph a vertex $v_i$ and exactly $s_i$ edges from $v_i$ to $S_1$.

The idea in the next construction is to grow a (nearly balanced) bipartite graph with partite sets $S, S'$ rapidly. However, it may be impossible to make a bipartite graph, so we maintain another set $S_0$ for the vertices that cannot be placed into $S$ or $S'$. Every triangle will have at least one vertex in $S_0$.

Fix any sorted shell sequence $s_1 \le \dots \le s_n = m$. If $n_m \ge 2m$, let $S_0 = \emptyset$, let $S, S'$ be sets of sizes $\lfloor n_m/2 \rfloor, \lceil n_m/2 \rceil$, and let $G$ be the complete bipartite graph with partite sets $S, S'$. Otherwise, $m + 1 \le n_m < 2m$. Let $a_0 = 2s_n - n_m$ and let $a_m = a'_m = n_m - s_n$, then let $S_0, S, S'$ be vertex sets of sizes $a_0, a_m, a'_m$ respectively. Initialize $G$ to be the union of a complete graph on $S_0$ and the complete tripartite graph with partite sets $S_0, S, S'$. Note that $G$ has $n_m$ vertices and minimum degree $s_n$.
Starting with $j = m - 1$ and decreasing $j$ after each step, add $n_j$ vertices to $S \cup S'$, split so that $|S| - |S'|$ is 0 or $\pm 1$. Make each new vertex in $S$ adjacent to $j$ vertices in $S'$ if $|S'| \ge j$. Otherwise, make each new vertex in $S$ adjacent to every vertex in $S'$ and also adjacent to $j - |S'|$ vertices in $S_0$; this adds $\binom{j - |S'|}{2}$ triangles per new vertex in $S$. Similarly, add $j$ edges from each new vertex of $S'$ to vertices in $S$ if possible, or to vertices in $S \cup S_0$ otherwise, which adds $\binom{j - |S|}{2}$ triangles per new vertex of $S'$. Let $B$ be the number of triangles in the graph obtained.

Although we cannot give a simple formula for $B$, we can compute $B$ directly, without actually constructing the graph. If $n_m \ge 2m$, then $B = 0$. Otherwise, $m + 1 \le n_m < 2m$. In that case, we use $a_j, a'_j$ to represent the sizes of $S, S'$ after step $j$, where $j$ is initialized to be $m$ and then decreases after each step. The number of new vertices in $S, S'$ in each step is represented by $x, x'$. Then the computation can be performed as follows.

Algorithm 4: Compute $B$
  if $n_m \ge 2m$ then $B \leftarrow 0$
  else
    Initialize $j \leftarrow m$, $a_0 \leftarrow 2s_n - n_m$, $a_m = a'_m \leftarrow n_m - s_n$, and $B \leftarrow \binom{a_0}{3} + \binom{a_0}{2}(a_m + a'_m) + a_0 a_m a'_m$
    while $j > 1$ do
      $j \leftarrow j - 1$
      if $n_j$ is even then $x \leftarrow n_j/2$ and $x' \leftarrow n_j/2$
      else if $a_{j+1} > a'_{j+1}$ then $x \leftarrow \lfloor n_j/2 \rfloor$ and $x' \leftarrow \lceil n_j/2 \rceil$
      else $x \leftarrow \lceil n_j/2 \rceil$ and $x' \leftarrow \lfloor n_j/2 \rfloor$
      $a_j \leftarrow a_{j+1} + x$ and $a'_j \leftarrow a'_{j+1} + x'$
      $B \leftarrow B + x \binom{j - a'_j}{2} + x' \binom{j - a_j}{2}$   /* where $\binom{k}{2} = 0$ whenever $k < 2$ */
    end

Moreover, if ever $\min(j - a'_j, j - a_j) < 2$, then $B$ will remain fixed thereafter, since $j$ is decreasing and $a_j$ and $a'_j$ are increasing evenly. Thus, the algorithm can be terminated early if $\min(j - a'_j, j - a_j) < 2$.

Proposition 14. Let $s_1 \le \dots \le s_n$ be a sorted shell sequence. Then, the minimum number of triangles in a graph with this shell sequence is at most $\min\{A, B\}$.

Proof. This follows immediately from Lemma 13 and the previous construction.

In order to further understand the behavior of these subgraph counts on the fibers of the model, we simulated graphs using Algorithm 3 with the shell distribution corresponding to the Sampson network studied above. Here, we summarize the results of those simulations.

Recall that the 4-truncated shell distribution of the Sampson network is $(0, 2, 3, 13)$. The network has 35 edges and 14 triangles. Simulating 50,000 graphs with this shell distribution using Algorithm 3 produced graphs with as many as 41 and as few as 27 edges. Propositions 10 and 11 show that the maximum and minimum number of edges for graphs with this shell distribution are 41 and 24, respectively: here $m = 3$ and $n = 18$, so the maximum is $\binom{3}{2} + (2 \cdot 1 + 3 \cdot 2 + 10 \cdot 3) = 41$ and the minimum is $f(2, 1) + f(3, 2) + f(13, 3) = 1 + 3 + 20 = 24$. The maximum number of triangles among the simulated graphs was 30, and the minimum was 0. The upper bound for the number of triangles in a graph with this shell distribution, as given by Proposition 12, is 34. The upper bound $\min\{A, B\}$ of Proposition 14 on the minimum number of triangles is 0 (since $n_3 = 13 \ge 2m = 6$ gives $B = 0$), which coincides exactly with the minimum number of triangles observed in the simulations.

It is worth noting that, among the 50,000 simulated graphs with shell distribution corresponding to that of the Sampson network, no two were isomorphic. In other words, 50,000 calls to Algorithm 3 produced 50,000 distinct graphs. This again suggests that Algorithm 3 discovers the fiber of graphs with a fixed shell structure at a high rate.

7. Existence of MLE and the model polytope

It is well known from the theory of exponential families (e.g., the classical text Brown [1986]) that the MLE of the natural parameters of the model exists if and only if the average sufficient statistic of the sample lies in the interior of the following convex polyhedron. For discrete exponential families, and ERGMs in particular, Rinaldo et al. [2009] offer details on the relevance of this polyhedron to the problem of maximum likelihood estimation and study its properties from both a theoretical and an algorithmic point of view.

Definition 15. The model polytope (or marginal polytope) for the shell distribution ERGM (4) with the sufficient statistic vector $(n_0(g), \dots, n_{m-1}(g))$ is the convex hull of all possible vectors of minimal sufficient statistics:
$$P_{n,m} = \operatorname{conv}\{(n_0(g), \dots, n_{m-1}(g)) \mid g \in \mathcal{G}_{n,m}\} \subset \mathbb{R}^m.$$

Of course, each value of $m$ gives rise to a different polytope, but each turns out to be a subpolytope (in fact, a face, as explained below) of the one with unrestricted degeneracy $m \le n - 1$. Thus we define it as a special case and study its geometry first. For simplicity of notation, denote the minimal sufficient statistic vector of the unrestricted model (i.e., the truncated shell distribution) by $n^*_S(g) = (n_0(g), \dots, n_{n-2}(g))$.

Definition 16. The model polytope for the shell distribution ERGM with unrestricted degeneracy is
$$P_n := \operatorname{conv}\{n^*_S(g) \mid g \in \mathcal{G}_n\} \subset \mathbb{R}^{n-1}.$$

Denote by $\bar{n}^*_S$ the average sufficient statistic of the sample $g_1, \dots, g_N$; its $j$th entry is $\frac{1}{N} \sum_{i=1}^{N} n^*_j(g_i)$.

Proposition 17. For a sample of size $N = 1$, $\bar{n}^*_S$ never lies in the interior of $P_n$; that is, the MLE never exists.

Proof. Determining whether $\bar{n}^*_S$ lies in the relative interior of $P_n$ or on its boundary requires an explicit description of the polytope. We will show that $P_n$ is a dilate of a simplex. To this end, let us consider the polytope of non-truncated shell distributions:
$$\overline{P}_n = \operatorname{conv}\{(n_0, \dots, n_{n-1}) : (n_0, \dots, n_{n-1}) = n_S(g) \text{ for some } g \in \mathcal{G}_n\}.$$
We claim that $(n_0, \dots, n_{n-1}) = n_S(g)$ for some $g \in \mathcal{G}_n$ if and only if $n_m \ge m + 1$ and $\sum_j n_j = n$, where $m = \operatorname{dgen}(g)$. That $n_m \ge m + 1$ is a necessary condition is clear by definition.
To see that it is sufficient, it is enough to construct a graph $g$ with this sequence. But this is straightforward: starting with $K_m$, add $n_m - m$ vertices and connect each of them with every vertex of $K_m$. This gives the $m$-shell. Then, to construct the $j$-shell for all other $j$, simply add as many vertices as are necessary in the shell, and connect each of them with $j$ edges to some subset of the original $K_m$.

Listing all integer points of this polytope, it is not difficult to see that it is simply an $n$-dilate of the simplex,
$$\overline{P}_n = \operatorname{conv}\{n e_i\} = n \Delta_{n-1} \subset \mathbb{R}^n,$$
where $e_i$ is the $i$-th standard unit vector in $\mathbb{R}^n$.

Finally, to obtain the polytope $P_n$ with the truncated sequences, simply omit the last coordinate from $\overline{P}_n$. The only effect this has on the polytope is that it interprets the simplex $\Delta_{n-1}$ as living in $\mathbb{R}^{n-1}$, instead of the way it is written above, as a polytope in $\mathbb{R}^n$ but embedded in the hyperplane $\sum_j n_j = n$.

Finally, note that all realizable integer points (i.e., those corresponding to a shell distribution) lie on the boundary of this polytope, and not in its relative interior, since any realizable integer point must have a 0 in some component, as is evident from the necessary and sufficient conditions for shell distribution realizability given above. Thus, the MLE never exists for a single observation $g$.

Remark 18. In case of larger samples, the MLE may or may not exist. The decision requires checking if the average sufficient statistic is on the boundary of $P_n$.

We have shown that the polytope for the unrestricted degeneracy model, $P_n$, is just a dilate of the simplex, and all of the realizable sufficient statistics lie on its boundary. But the simple structure of $P_n$ also implies that $\overline{P}_{n,m} \subset P_n$ for each $m \le n - 1$, where $\overline{P}_{n,m}$ denotes the embedding of $P_{n,m}$ into $\mathbb{R}^{n-1}$. Indeed, any point $p \in P_{n,m} \subset \mathbb{R}^m$ corresponds to a point $\bar{p} \in \mathbb{R}^{n-1}$ which is clearly a realizable shell distribution vector.
Thus $\bar{p}$ is a point in the polytope $P_n$ that lies on the face cut out by the equations that set all coordinates beyond the $m$-th to zero.

Remark 19. Setting the degeneracy parameter $m$ to be equal to that of the observed graph and using the corresponding ERGM (4) with sample space $\mathcal{G}_{n,m}$ behaves better than using unrestricted degeneracy $m \le n - 1$ in general. In particular, many of the points that lie on the boundary of $P_n$ lie in the relative interior of a face of some $P_{n,m}$, thus the MLE has a positive probability of existing. The asymptotics of this construction are of interest to the behavior of the MLE problem, but are beyond the scope of the present paper.

8. Discussion

Cores have been widely used to study and summarize networks. In this paper we study the core decomposition of a network with an eye towards statistical inference. We embed the core structure of a network, as captured by its shell distribution, in the exponential random graph framework. We examine the theoretical properties of the model and study the problem of inference in the model, which boils down to three tasks: existence of the MLE, sampling from the model, and sampling from the fiber. The existence of MLE question is answered by characterizing the model polytope. To enable maximum likelihood estimation, we introduce a new type of support restriction that avoids bad behavior of the model common to many other classes of ERGMs. We develop an MCMC algorithm to sample from the model and apply this algorithm to estimate the MLE and perform heuristic goodness-of-fit tests. We also study the fiber, which is the space of all graphs with a fixed shell distribution, and develop a sampling algorithm that can generate any graph with a predefined core structure with positive probability. Further, we describe the fiber in detail by computing bounds on subgraph counts induced by fixing the core structure of a network.
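The edge and triangle bounds recalled above (Propositions 10, 11 and 12) depend only on the shell distribution, so they can be evaluated without constructing any graph. The following Python sketch, offered as an illustration rather than as the authors' code, does this for the 7-vertex shell distribution $(0, 2, 1, 4, 0, 0, 0)$ used in Section 5. The function names are ours, and the input is assumed to be a realizable shell distribution given as a tuple $(n_0, n_1, \dots)$.

```python
from math import comb


def sorted_shell_sequence(n_vec):
    """Expand a shell distribution (n_0, n_1, ...) into s_1 <= ... <= s_n."""
    return [j for j, count in enumerate(n_vec) for _ in range(count)]


def max_edges(n_vec):
    """Proposition 10: C(m, 2) plus the sum of the n - m smallest shell indices."""
    s = sorted_shell_sequence(n_vec)
    n, m = len(s), s[-1]
    return comb(m, 2) + sum(s[: n - m])


def min_edges(n_vec):
    """Proposition 11: sum of f(n_j, j) over j = 1, ..., m."""
    m = sorted_shell_sequence(n_vec)[-1]
    total = 0
    for j in range(1, m + 1):
        nj = n_vec[j]
        if j < nj:
            total += (j * nj + 1) // 2       # ceil(j * n_j / 2)
        else:
            total += j * nj - comb(nj, 2)    # also gives 0 when n_j = 0
    return total


def max_triangles(n_vec):
    """Proposition 12: C(m, 3) plus the sum of C(s_i, 2) over i <= n - m."""
    s = sorted_shell_sequence(n_vec)
    n, m = len(s), s[-1]
    return comb(m, 3) + sum(comb(si, 2) for si in s[: n - m])


dist = (0, 2, 1, 4, 0, 0, 0)
print(max_edges(dist), min_edges(dist), max_triangles(dist))  # 10 9 5
```

For this small example the edge bounds are only one apart: every graph in the fiber contains the complete graph on the four shell-3 vertices and two edges from the shell-2 vertex, and the only freedom is whether the two shell-1 vertices share an edge or each attach elsewhere.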
Our experiments and theoretical results indicate that the shell distribution model captures information beyond the degree distribution and, in particular, the triadic effects quite well. The model support is obtained by conditioning on the degeneracy of a graph. Conditioning is common in ERGMs, as it improves model properties and stability of estimation algorithms. The choice of degeneracy, and thus of the specific shell ERGM, depends on the data and is meant to provide a way to improve not only the model's stability, but also its interpretability.

There are several interesting extensions of this work worth pursuing. Inference in the shell distribution ERGM gives rise to several important problems that deserve attention. Firstly, even though the shell distribution of a network can be computed in linear time, when embedded in an MCMC algorithm to compute change statistics, this process is very slow. In contrast, the change statistics of most ERGMs can be computed locally, without the need to recompute the sufficient statistic of the entire graph. A natural question to ask is whether one can compute the change statistics of the shell distribution more efficiently. In particular, the following is of critical interest: is there a way to use the local change in the network, such as adding or deleting edges, to re-compute the shell distribution? A related question concerns the proposal distribution used in the MCMC algorithm. Since we restrict the support of the model to graphs with degeneracy equal to $m$, it would be useful to find proposal distributions that generate networks that are always in this set.

We considered one type of summary statistic of the core distribution, namely the shell distribution, and studied the associated ERGM thoroughly. Other interesting ways to summarize the core structure can be used to develop ERGMs.
As mentioned, ERGMs based on the core decomposition go beyond the dyadic assumption that is inherent in degree-based analysis. An interesting summary statistic to consider is the degree of a node in its core. In a different direction, for many datasets, including the Sampson dataset, the network in question is directed. Notions of core decomposition can be defined for such generalizations of graphs as well: for example, the $(k, l)$-core of Giatsidis et al. [2013] for directed graphs. It is not difficult to extend our model and algorithms to this notion of core decomposition, and it would be interesting to see how that model would perform. Finally, the support restriction applied to the core ERGM may be useful in other contexts, but a natural question to ask is how one should select the degeneracy parameter $m$.

Acknowledgements

The authors would like to thank Stephen Fienberg and Alessandro Rinaldo for several useful discussions on this topic, and the anonymous reviewers for their careful reading of our paper and their suggestions for clarifications. Petrović, Stasi and Wilburne were supported by AFOSR/DARPA grant FA9550-14-1-0141. Karwa gratefully acknowledges support by a grant from the Singapore National Research Foundation under the Interactive and Digital Media Programme Office to the Living Analytics Research Centre.

References

José Ignacio Alvarez-Hamelin, Luca Dall'Asta, Alain Barrat, and Alessandro Vespignani. k-core decomposition: a tool for the visualization of large scale networks. In Advances in Neural Information Processing Systems 18, page 41. MIT Press, 2006.
Joonhyun Bae and Sangwook Kim. Identifying and ranking influential spreaders in complex networks by neighborhood coreness. Physica A: Statistical Mechanics and its Applications, 395:549–559, 2014.
Vladimir Batagelj and Matjaž Zaveršnik. An O(m) algorithm for cores decomposition of networks.
CoRR, cs.DS/0310049, 2003.
Michael J. Bannister, William E. Devanny, and David Eppstein. ERGMs are Hard. Preprint, available at arXiv:1412.1787 [cs.DS].
Vladimir Batagelj and Andrej Mrvar. Pajek datasets. URL http://vlado.fmf.uni-lj.si/pub/networks/data/.
Michael Baur, Marco Gaertler, Robert Görke, Marcus Krug, and Dorothea Wagner. Generating graphs with predefined k-core structure. Proceedings of the European Conference of Complex Systems, 2007.
Francesco Bonchi, Francesco Gullo, Andreas Kaltenbrunner, and Yana Volkovich. Core decomposition of uncertain graphs. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014.
Lawrence Brown. Fundamentals of Statistical Exponential Families, volume 9 of Monograph Series. IMS Lecture Notes, 1986.
Alberto Caimo and Nial Friel. Bayesian inference for exponential random graph models. Social Networks, 33(1):41–55, 2011.
Shai Carmi, Shlomo Havlin, Scott Kirkpatrick, Yuval Shavitt, and Eran Shir. A model of internet topology using k-shell decomposition. Proceedings of the National Academy of Sciences, USA, 104:11150–11154, 2007.
Sourav Chatterjee and Persi Diaconis. Estimating and understanding exponential random graph models. The Annals of Statistics, 41(5):2428–2461, 2013.
Sourav Chatterjee, Persi Diaconis, and Allan Sly. Random graphs with a given degree sequence. Ann. Appl. Probab., 21(4):1400–1435, 2011.
Gabor Csardi and Tamas Nepusz. The igraph software package for complex network research. InterJournal, Complex Systems:1695, 2006.
Marius Eidsaa and Eivind Almaas. s-core network decomposition: A generalization of k-core analysis to weights. Physical Review, 88(6):062819, 2013.
Charles J Geyer and Elizabeth A Thompson. Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society.
Series B (Methodological), pages 657–699, 1992.
Christos Giatsidis, Dimitrios M. Thilikos, and Michalis Vazirgiannis. D-cores: measuring collaboration of directed graphs based on degeneracy. Knowledge and Information Systems, 35(2):311–343, 2013.
Anna Goldenberg, Alice X. Zheng, Stephen E. Fienberg, and Edoardo M. Airoldi. A survey of statistical network models. Foundations and Trends in Machine Learning, 2(2):129–233, 2009.
Steven M Goodreau, James A Kitts, and Martina Morris. Birds of a feather, or friend of a friend? Using exponential random graph models to investigate adolescent social networks. Demography, 46(1):103–125, 2009.
Paul W Holland and Samuel Leinhardt. An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association, 76(373):33–50, 1981.
Ruth M Hummel, David R Hunter, and Mark S Handcock. Improving simulation-based algorithms for fitting ERGMs. Journal of Computational and Graphical Statistics, 21(4):920–939, 2012.
David R Hunter and Mark S Handcock. Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15(3), 2006.
David R Hunter, Steven M Goodreau, and Mark S Handcock. Goodness of fit of social network models. Journal of the American Statistical Association, 103(481), 2008.
Maksim Kitsak, Lazaros K Gallos, Shlomo Havlin, Fredrik Liljeros, Lev Muchnik, H Eugene Stanley, and Hernán A Makse. Identification of influential spreaders in complex networks. Nature Physics, 6(11):888–893, 2010.
Michael M. Lee, Indrajit Roy, Alvin AuYoung, Vanish Talwar, K.R. Jayaram, and Yuanyuan Zhou. Views and transactional storage for large graphs. Middleware, pages 287–306, 2013.
Daniele Miorandi and Francesco De Pellegrini. K-shell decomposition for dynamic complex networks.
Modeling and Optimization in Mobile Ad Hoc and Wireless Networks WiOpt 2010 Proceedings of the 8th International Proceedings on, pages 488–496, 2010.
Sofia C. Olhede and Patrick Wolfe. Degree-based network models. Preprint, arXiv:1211.6537, 2012.
Sen Pei, Lev Muchnik, Jose Andrade Jr., Zhiming Zheng, and Hernán Makse. Searching for superspreaders of information in real-world social media. Nature Scientific Reports, 4, 2012.
Alessandro Rinaldo, Stephen E. Fienberg, and Yi Zhou. On the geometry of discrete exponential families with application to exponential random graph models. Electronic Journal of Statistics, 3:446–484, 2009.
Alessandro Rinaldo, Sonja Petrović, Stephen E Fienberg, et al. Maximum likelihood estimation in the β-model. The Annals of Statistics, 41(3):1085–1110, 2013.
Garry Robins, Pip Pattison, Yuval Kalish, and Dean Lusher. An introduction to exponential random graph (p*) models for social networks. Social Networks, 29(2):173–191, 2007.
M. Puck Rombach, Mason A. Porter, James H. Fowler, and Peter J. Mucha. Core-periphery structure in networks. SIAM Journal of Applied Math, 74(1):167–190, 2014.
Kayvan Sadeghi and Alessandro Rinaldo. Statistical models for degree distributions of networks. NIPS Workshop, 2014.
Samuel Franklin Sampson. A novitiate in a period of change: An experimental and case study of social relationships. PhD thesis, Cornell University, September, 1968.
Zachary M Saul and Vladimir Filkov. Exploring biological network structure using exponential random graph models. Bioinformatics, 23(19):2604–2611, 2007.
Michael Schweinberger. Instability, sensitivity, and degeneracy of discrete exponential families. Journal of the American Statistical Association, 106(496):1361–1370, 2011.
Stephen B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269–287, 1983.
Cosma Rohilla Shalizi, Alessandro Rinaldo, et al.
Consistency under sampling of exponential random graph models. The Annals of Statistics, 41(2):508–535, 2013.
Tom AB Snijders. Markov chain Monte Carlo estimation of exponential random graph models. Journal of Social Structure, 3(2):1–40, 2002.
Tom AB Snijders and Marijtje AJ Van Duijn. Conditional maximum likelihood estimation under various specifications of exponential random graph models. Contributions to social network analysis, information theory, and other topics in statistics, pages 117–134, 2002.
Tom AB Snijders, Philippa E Pattison, Garry L Robins, and Mark S Handcock. New specifications for exponential random graph models. Sociological Methodology, 36(1):99–153, 2006.
Stefan Wuchty and Eivind Almaas. Peeling the yeast protein network. Proteomics, 5(2):444–449, 2005.

9. Appendix A

This appendix deals with the case when the graph degeneracy $m$ is not restricted to one value for all graphs under the model. In other words, the unrestricted model gives positive probability to networks of degeneracy less than or equal to any fixed value of $m \le n-1$. For simplicity, we will refer to this as the unrestricted model, motivated by the sample space restrictions placed in defining the core distribution ERGM in Section 3. We will see that the choice of any particular such $m \le n-1$ does not affect the behavior of the model; instead, problems arise when allowing degeneracy to vary within the graphs in the model. Section 9.1 introduces the unrestricted model, which is ill-behaved (cf. Remark 3). Section 9.2 explains this behavior.

9.1. The model with unrestricted degeneracy

For completeness, let us re-derive the model, from first principles, for the unrestricted case $m \le n-1$, for which the sample space is the set of all graphs with $n$ nodes, $\mathcal{G}_n$.
Again, to take advantage of the theory of exponential families, we rewrite Equation 1 in exponential family form by re-parameterizing $P(G = g)$ in terms of the normalized probabilities $\tilde p_j = p_j / p_{n-1}$. (Our notation very closely follows Sadeghi and Rinaldo [2014].) Observe that $p_{n-1} = 1 / \bigl(1 + \sum_{j=0}^{n-2} \tilde p_j\bigr)$, and thus $P(G = g)$ can be written as

$$P(G = g) = \varphi(p) \prod_{j=0}^{n-1} (\tilde p_j\, p_{n-1})^{n_j(g)} = \varphi(p)\, p_{n-1}^{\sum_{j=0}^{n-1} n_j(g)} \prod_{j=0}^{n-1} \tilde p_j^{\,n_j(g)} = \varphi(p)\, p_{n-1}^{\,n} \prod_{j=0}^{n-1} \tilde p_j^{\,n_j(g)},$$

or, more compactly, using that $\tilde p_{n-1} = 1$ and renaming the constant $\varphi(p)$ to $\phi(\tilde p)$ to reflect the re-parametrization:

$$P(G = g) = \frac{\phi(\tilde p)}{\bigl(1 + \sum_{j=0}^{n-2} \tilde p_j\bigr)^{n}} \prod_{j=0}^{n-2} \tilde p_j^{\,n_j(g)}. \tag{9}$$

Next, let $\theta_j = \log \tilde p_j$ and define the normalizing constant in terms of $\theta$ as

$$\psi(\theta) = n \log\Bigl(1 + \sum_{j=0}^{n-2} \exp(\theta_j)\Bigr) - \log \phi(\tilde p).$$

With this, we can write $P(G = g)$ in exponential family form:

$$P(G = g) = \exp\Bigl\{ \sum_{j=0}^{n-2} n_j(g)\, \theta_j - \psi(\theta) \Bigr\}. \tag{10}$$

For this version of the model, the minimal sufficient statistic is given by the truncated shell distribution $n^*_S(g) = (n_0(g), \dots, n_{n-2}(g))$. As before, it is not difficult to see that the natural parameter space for the model is $\Theta = \mathbb{R}^{n-1}$.

To obtain the log-partition function $\psi(\theta)$ in closed form, for fixed $n$, consider the set of graphs on $n$ nodes as an ordered list, $\mathcal{G}_n = \{g_1 = K_n, \dots, g_i, \dots, g_M = \bar K_n\}$, where the graphs are listed in non-increasing order of the number of edges, and where $M = 2^{\binom{n}{2}}$. Note that in the empty graph $g_M$ every vertex has shell index 0, while in the complete graph $g_1 = K_n$ the shell indices are $s(v) = n-1$ for all $v \in V(K_n)$. Therefore,

$$P(G = g_M) = \frac{\phi(\tilde p)}{\bigl(1 + \sum_{j=0}^{n-2} \tilde p_j\bigr)^{n}} \cdot \tilde p_0^{\,n}, \tag{11}$$

and

$$P(G = g_1) = \frac{\phi(\tilde p)}{\bigl(1 + \sum_{j=0}^{n-2} \tilde p_j\bigr)^{n}}. \tag{12}$$

For any other graph $g_i \in \mathcal{G}_n \setminus \{\bar K_n, K_n\}$,

$$P(G = g_i) = \frac{\phi(\tilde p)}{\bigl(1 + \sum_{j=0}^{n-2} \tilde p_j\bigr)^{n}} \prod_{j=0}^{n-2} \tilde p_j^{\,n_j(g_i)}. \tag{13}$$

Using $\sum_{i=1}^{M} P(G = g_i) = 1$ and Equations (11)–(13), the normalizing constant $\phi(\tilde p)$ can be rewritten as:

$$\phi(\tilde p) = \frac{\bigl(1 + \sum_{j=0}^{n-2} \tilde p_j\bigr)^{n}}{1 + \dots + \prod_{j=0}^{n-2} \tilde p_j^{\,n_j(g_i)} + \dots + \tilde p_0^{\,n}}. \tag{14}$$

Finally, substituting $\theta_j = \log \tilde p_j$ and Equation (14) into the definition of $\psi(\theta)$ gives

$$\psi(\theta) = \log\Bigl(1 + \dots + \prod_{j=0}^{n-2} \tilde p_j^{\,n_j(g_i)} + \dots + \tilde p_0^{\,n}\Bigr) = \log\Bigl(1 + \dots + e^{\sum_{j=0}^{n-2} n_j(g_i)\theta_j} + \dots + e^{n\theta_0}\Bigr).$$

Fig 10: Truncated shell distributions of all non-isomorphic simple graphs on 3 vertices: $n^*_S(g_1) = (0, 0)$, $n^*_S(g_2) = (0, 3)$, $n^*_S(g_3) = (1, 2)$, $n^*_S(g_4) = (3, 0)$.

Example 20. Determining $\psi(\theta)$ for the case $n = 3$ depends on counting simple graphs on three nodes up to isomorphism. Namely, there are 4 non-isomorphic simple graphs on 3 vertices (see Figure 10): $\mathcal{G}_n$ consists of 1 copy of $g_1$, 3 isomorphic copies of $g_2$, 3 isomorphic copies of $g_3$ and 1 copy of $g_4$. For $g_1 = K_3$, each vertex has shell index 2, so $n^*_S(g_1) = (0, 0)$. For $g_2$, each vertex has shell index 1 and therefore $n^*_S(g_2) = (0, 3)$. Two vertices of $g_3$ have shell index 1 while the remaining vertex has shell index 0, so $n^*_S(g_3) = (1, 2)$, and $n^*_S(g_4) = (3, 0)$ as every vertex of $g_4 = \bar K_3$ has shell index 0. Therefore, the log-partition function for $n = 3$ is

$$\psi(\theta) = \log\bigl(1 + 3\tilde p_1^{\,3} + 3\tilde p_0 \tilde p_1^{\,2} + \tilde p_0^{\,3}\bigr) = \log\bigl(1 + 3e^{3\theta_1} + 3e^{2\theta_1 + \theta_0} + e^{3\theta_0}\bigr).$$

9.2. Bad behavior of the unrestricted model

In this subsection we illustrate how the model misbehaves if the degeneracy $m$ is not controlled. The model with unrestricted degeneracy allows, for any fixed $m$, the support of the model to contain graphs with degeneracy less than or equal to $m$, i.e.
the sample space of the model is defined as follows:

$$\mathcal{G}_{n,\le m} = \{ g \in \mathcal{G}_n : \operatorname{dgen}(g) \le m \}.$$

Note that a special case is $\mathcal{G}_{n,\le n-1} = \mathcal{G}_n$, that is, a graph with any degeneracy is allowed with positive probability under the model. If we allow the model to put positive mass on graphs with degeneracy less than or equal to $m$, then for any generic point in the parameter space $\Theta$, the following behavior occurs. The likelihood function has many modes, and the local modes of the model correspond to graphs in which all nodes lie in the shells that are most popular (with respect to the $m$-th shell). The example below illustrates this point, followed by Lemma 22, which makes this intuitive explanation of the model behavior precise.

Example 21. Let $m = 4$ and consider the unrestricted shell ERGM supported on the sample space $\mathcal{G}_{n,\le 4}$, i.e. the model puts positive mass on all graphs with degeneracy less than or equal to 4. Let $\theta = (\theta_0, \dots, \theta_4)$ be a parameter vector of this model. Recall that $\theta_i = \log(p_i / p_m)$ and hence $\theta_4 = 0$. Without significant loss of generality, let us assume that $\theta_3 > \theta_0, \theta_1, \theta_2$. Hence amongst shells 0, 1, 2 and 3, the 3rd shell has the highest attractiveness, relative to the 4th shell. Consider the set of graphs whose degeneracy is less than $m = 4$, i.e. $\mathcal{G}_{n,\le 3}$. Let $g$ be any graph in $\mathcal{G}_{n,\le 3}$ with at least one node outside shell 3; then $n_S(g) = (n_0(g), n_1(g), n_2(g), n_3(g), 0)$. Let $g^*$ be any graph in $\mathcal{G}_{n,\le 3}$ in which all nodes lie in shell 3, the most attractive shell, i.e., $n_S(g^*) = (0, 0, 0, n, 0)$. Then $P(g^*) > P(g)$. Indeed, the following computation is straightforward:

$$\begin{aligned}
\log \frac{P(g^*)}{P(g)} &= \sum_{i=0}^{m-1} \theta_i \bigl(n_i(g^*) - n_i(g)\bigr) \\
&= -\sum_{i=0}^{m-2} \theta_i\, n_i(g) + \theta_{m-1} \bigl(n - n_{m-1}(g)\bigr) \\
&= -\sum_{i=0}^{m-2} \theta_i\, n_i(g) + \theta_{m-1} \Bigl(\sum_{i=0}^{m-2} n_i(g)\Bigr) \\
&= \sum_{i=0}^{m-2} n_i(g)\, (\theta_{m-1} - \theta_i) > 0.
\end{aligned}$$
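Both Example 20 and Example 21 can be checked by brute force for small $n$. The sketch below is an illustration under stated assumptions, not code from this paper: it enumerates all labeled graphs on $n$ nodes, groups them by full shell distribution $(n_0, \dots, n_{n-1})$, evaluates $\psi(\theta)$ directly from the definition, and finds the shell distributions that maximize $\sum_j \theta_j n_j(g)$ among graphs of degeneracy at most $d$. All function names are hypothetical, and the enumeration is only feasible for very small $n$ (there are $2^{\binom{n}{2}}$ graphs).

```python
import math
from collections import Counter
from itertools import combinations, product


def shell_distribution(n, edges):
    """Full shell distribution (n_0, ..., n_{n-1}) via iterative k-core pruning."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    dist = [0] * n
    alive = set(range(n))
    k = 0
    while alive:
        # Nodes of degree <= k in the surviving subgraph have shell index k.
        low = [v for v in alive if len(adj[v] & alive) <= k]
        if not low:
            k += 1
            continue
        for v in low:
            alive.discard(v)
            dist[k] += 1
    return tuple(dist)


def all_graphs(n):
    """Yield every labeled simple graph on n nodes as a tuple of edges."""
    dyads = list(combinations(range(n), 2))
    for mask in product((0, 1), repeat=len(dyads)):
        yield tuple(d for d, bit in zip(dyads, mask) if bit)


def fiber_sizes(n):
    """Number of labeled graphs on n nodes sharing each shell distribution."""
    return Counter(shell_distribution(n, g) for g in all_graphs(n))


def log_partition(n, theta):
    """psi(theta) for theta = (theta_0, ..., theta_{n-2}), summing over all graphs."""
    # zip truncates each distribution to its first n-1 entries (the truncated n*_S).
    return math.log(sum(
        count * math.exp(sum(t * x for t, x in zip(theta, dist)))
        for dist, count in fiber_sizes(n).items()
    ))


def modes_below(n, theta, d):
    """Shell distributions maximizing sum_j theta_j n_j(g) over degeneracy <= d."""
    best, argmax = None, set()
    for dist in fiber_sizes(n):
        if any(dist[j] for j in range(d + 1, n)):
            continue  # some node has shell index above d
        score = sum(t * x for t, x in zip(theta, dist))
        if best is None or score > best + 1e-12:
            best, argmax = score, {dist}
        elif abs(score - best) <= 1e-12:
            argmax.add(dist)
    return argmax
```

For $n = 3$, `fiber_sizes(3)` reproduces the multiplicities 1, 3, 3, 1 of Example 20, and `log_partition(3, (theta0, theta1))` agrees with the closed form given there. For $n = 5$ and a parameter vector whose largest entry among shells 0 to 3 is $\theta_3$, `modes_below(5, theta, 3)` returns only the distribution with all five nodes in shell 3, as in Example 21.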
This should be interpreted as follows: among the set of all graphs with degeneracy less than or equal to 3, the most likely graph will be such that all nodes are in the shell corresponding to the largest $\theta$. Thus, in some sense, the local mode is a "degenerate" mode (no pun intended!). In the above example, we could have chosen any $\theta_k$, $k \neq m$, to be the most attractive shell, and the shell distribution of $g^*$ should be modified accordingly, i.e. $n_k(g^*) = n$ and $n_i(g^*) = 0$ for all $i \neq k$. Moreover, we could have considered the mode over any restricted sample space, not just $\mathcal{G}_{n,\le 3}$. Lemma 22 illustrates this point by generalizing the example in several directions, in particular by allowing there to be more than one 'popular' shell.

Let $m$ be the degeneracy of the model, and let $\theta = (\theta_0, \dots, \theta_{m-1})$ be the parameter vector of the shell ERGM. Define $[m] = \{0, 1, \dots, m-1\}$.

Lemma 22. Consider the shell ERGM on the sample space $\mathcal{G}_{n,\le m}$ with parameter vector $(\theta_0, \dots, \theta_m)$, where $\theta_m = 0$ by definition. Let $g$ be any graph in $\mathcal{G}_{n,\le d}$ with degeneracy $d < m$, i.e., $n_i(g) = 0$ for all $i > d$. Let $\mathcal{L}_d = \{ l \in [d] : \theta_l = \max_{i \in [d]} \theta_i \}$ and let $\mathcal{L}^c_d = [d] \setminus \mathcal{L}_d$, and suppose that at least one node of $g$ lies in a shell in $\mathcal{L}^c_d$. Let $g^*$ be any network with degeneracy $d$ such that nodes exist only in the most popular shells, i.e. $n_i(g^*) = 0$ for all $i \notin \mathcal{L}_d$. Then $P(g^*) > P(g)$.

Proof. Let $\theta^* = \max_{i \in [d]} \theta_i$, and consider the following, as in Example 21:

$$\begin{aligned}
\log \frac{P(g^*)}{P(g)} &= \sum_{i \in [d]} \theta_i \bigl(n_i(g^*) - n_i(g)\bigr) \\
&= \sum_{i \in \mathcal{L}^c_d} \theta_i \bigl(0 - n_i(g)\bigr) + \sum_{i \in \mathcal{L}_d} \theta_i \bigl(n_i(g^*) - n_i(g)\bigr) \\
&= -\sum_{i \in \mathcal{L}^c_d} \theta_i\, n_i(g) + \theta^* \sum_{i \in \mathcal{L}_d} \bigl(n_i(g^*) - n_i(g)\bigr) \\
&= -\sum_{i \in \mathcal{L}^c_d} \theta_i\, n_i(g) + \theta^* \Bigl(n - \sum_{i \in \mathcal{L}_d} n_i(g)\Bigr) \\
&= -\sum_{i \in \mathcal{L}^c_d} \theta_i\, n_i(g) + \theta^* \sum_{i \in \mathcal{L}^c_d} n_i(g) \\
&= \sum_{i \in \mathcal{L}^c_d} n_i(g)\, (\theta^* - \theta_i) > 0.
\end{aligned}$$

The fourth equality holds since $n_i(g^*) = 0$ for all $i \in \mathcal{L}^c_d$.
The fifth equality holds because $\sum_{i \in [d]} n_i(g) = n$.

As an additional example of the behavior explained in Lemma 22, let $m = 5$, $d = 3$ and let $\theta = (a, \alpha, b, \alpha, c, 0)$ where $\alpha > a, b, c$. By Lemma 22, among all graphs with degeneracy at most 3, graphs with shell distribution $(0, k, 0, n-k, 0, 0)$ are the modes, where $n - k \ge 4$. Thus, among $d$-degenerate graphs, only graphs where all nodes lie in the most popular shells are modes. These graphs are vastly different from each other in terms of their topological properties (e.g. density, number of triangles), yet they occur as modes of the same parameter vector.

The reason why such behavior occurs is that allowing graphs with degeneracy less than $m$ introduces a linear constraint on the shell distributions of these graphs. Thus, to eliminate such behavior, we define the model so that any graph with degeneracy less than $m$ has probability 0. Two consequences of this fact are that when fitting the shell ERGM to an observed graph, (1) $m$ cannot be larger than the observed degeneracy, and (2) graphs with degeneracy less than the observed degeneracy have probability 0.

To see why (1) holds, let $g$ be an observed graph with shell distribution $n_S(g)$ and degeneracy $\hat m$. Consider fitting the shell ERGM to $g$ by allowing $m > \hat m$. If the sample space is $\mathcal{G}_{n,m}$, the observed graph has probability 0 under the model! On the other hand, if we let the sample space be $\mathcal{G}_{n,\le m}$ and we have $\hat m < m$, the observed network lies in the set $\mathcal{G}_{n,\le \hat m} \subsetneq \mathcal{G}_{n,\le m}$. Lemma 22 can be applied to show that the model has an undesirable property. Let $\operatorname{supp}(n_S) = \{ i \in [m] : n_i(g) \neq 0 \}$. Let $\Theta_g$ be the subset of the parameter space in which the indices of the largest values of $\theta$ correspond to $\operatorname{supp}(n_S)$, i.e.,

$$\Theta_g = \bigl\{ \theta \in \Theta : \forall s \in \operatorname{supp}(n_S),\ \theta_s = \max_{i \in [m]} \theta_i \bigr\}.$$

By Lemma 22, any parameter in $\Theta_g$ will have the observed graph $g$ as one of its modes.
Moreover, these models will have several other modes whose shell distributions are quite different from that of the observed graph.

The above discussion shows that if we allow $m > \hat m$, there exists a large subset of the parameter space where the model misbehaves. A natural question to ask is the converse: does there exist a parameter vector for which the observed graph is the only mode? An easy algebraic calculation in the example below shows that even the weaker requirement that the model assign higher mass to the observed graph than to graphs with vastly different shell distributions cannot be met.

Example 23. Let the observed shell distribution be $n_S(g) = (0, k, 0, n-k, 0)$, with $n - k \ge 4$ and $k > 0$. Hence the observed degeneracy is $\hat m = 3$. Consider the shell ERGM with $m = 4$ and sample space $\mathcal{G}_{n,\le 4}$. Consider two graphs $g_1$ and $g_2$ with shell distributions $(0, 0, 0, n, 0)$ and $(0, n, 0, 0, 0)$. We will show that there does not exist any point in the parameter space such that $P(g) > P(g_1)$ and $P(g) > P(g_2)$ simultaneously. To this end, let $\theta = (\theta_0, \theta_1, \theta_2, \theta_3, 0)$ be any point in the parameter space. Note that

$$\log \frac{P(g)}{P(g_1)} = (\theta_1 - \theta_3)\, k \quad \text{and} \quad \log \frac{P(g)}{P(g_2)} = (\theta_3 - \theta_1)(n - k).$$

For both of these terms to be positive at the same time, we need $\theta_1 > \theta_3$ and $\theta_3 > \theta_1$, which is impossible. Moreover, if $\theta_1 = \theta_3$, then the model places equal probability on the observed graph $g$ and on $g_1$ and $g_2$, which is undesirable.
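The incompatibility in Example 23 can also be read off from a single product. The short derivation below is an added worked step using only the two log-ratios stated in the example; the numerical values $n = 10$, $k = 3$ and $\theta_1 - \theta_3 = 0.5$ are illustrative assumptions and do not come from any dataset in the paper.

```latex
\[
\log\frac{P(g)}{P(g_1)} \cdot \log\frac{P(g)}{P(g_2)}
  = (\theta_1-\theta_3)\,k \cdot (\theta_3-\theta_1)(n-k)
  = -\,k\,(n-k)\,(\theta_1-\theta_3)^2 \;\le\; 0 .
\]
% Since k > 0 and n - k >= 4, the product is strictly negative unless
% theta_1 = theta_3, so the two log-ratios can never both be positive.
% Illustration with n = 10, k = 3, theta_1 - theta_3 = 0.5:
\[
\log\frac{P(g)}{P(g_1)} = 0.5 \cdot 3 = 1.5,
\qquad
\log\frac{P(g)}{P(g_2)} = -0.5 \cdot 7 = -3.5 .
\]
```

In this illustration the observed graph is more likely than $g_1$ but less likely than $g_2$; reversing the sign of $\theta_1 - \theta_3$ swaps the roles of $g_1$ and $g_2$.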