Modeling and verifying a broad array of network properties

Motivated by widely observed examples in nature, society and software, where groups of already related nodes arrive together and attach to an existing network, we consider network growth via sequential attachment of linked node groups, or graphlets. …

Authors: Vladimir Filkov, Zachary M. Saul, Soumen Roy, Raissa M. D'Souza, Premkumar T. Devanbu

Vladimir Filkov,1 Zachary M. Saul,1 Soumen Roy,2,3 Raissa M. D'Souza,2,3 and Premkumar T. Devanbu1

1 Department of Computer Science, University of California, Davis, CA 95616
2 Center for Computational Science and Engineering, University of California, Davis, CA 95616
3 Department of Mechanical and Aeronautical Engineering, University of California, Davis, CA 95616

Motivated by widely observed examples in nature, society and software, where groups of related nodes arrive together and attach to existing networks, we consider network growth via sequential attachment of linked node groups, or graphlets. We analyze the simplest case, attachment of the three-node W-graphlet, where, with probability $\alpha$, we attach a peripheral node of the graphlet, and with probability $(1-\alpha)$, we attach the central node. Our analytical results and simulations show that tuning $\alpha$ produces a wide range in degree distribution and degree assortativity, achieving assortativity values that capture a diverse set of real-world systems. We introduce a fifteen-dimensional attribute vector derived from seven well-known network properties, which enables comprehensive comparison between any two networks. Principal Component Analysis of this attribute vector space shows a significantly larger coverage potential of real-world network properties by a simple extension of the above model when compared against a classic model of network growth.

PACS numbers: 89.75.Hc, 89.75.Fb

The ubiquity and importance of network structures has recently become apparent, leading to an increased focus on network growth mechanisms [1].
Existing models of network growth primarily consider the arrival of single nodes at each time step; however, there are numerous examples in natural and artificial systems where networks grow not just by the addition of single nodes but by the addition of groups of already related nodes. For example, in biology, in developmental transcriptional gene regulation, whole pathways can be added or eliminated by a mutation in a master regulator [3]; and in the evolution of biological networks, gene duplication can add subnetworks to the network [4]. Growth of computer software networks (composed of interacting functions or classes) is often due to adding small groups of related elements simultaneously. For example, 1) functions to allocate, use, and free a resource (such as a file) are usually added together and 2) in object-oriented languages, good design principles call for classes to be added in small groups called design patterns [5]. Further, in social networks within cities, families arrive as units, and growth can be described via aggregation of small pre-existing modules. Similarly, in corporate enterprises, the practice of "lift-outs", employing pre-existing functional teams of people (rather than building up a team from individual hires), is on the rise [6]. This insight suggests that a new class of network growth models incorporating group arrival could lead to more realistic models. Moreover, most existing work in modeling network growth focuses on matching one or a few attributes of empirical networks, such as the degree distribution or the clustering coefficient. But networks can differ in many ways while being similar in others, e.g. some with the same degree distribution have different levels of assortative mixing. Thus, a more comprehensive comparison, simultaneously across many important attributes, is desirable.

Electronic address (V. Filkov): filkov@cs.ucdavis.edu
Hence, the purpose of this letter is two-fold. Our first intent is to propose modeling network growth by sequential aggregation of groups of nodes, represented by small, connected graphs or graphlets attaching preferentially in the network, rather than by preferential attachment of single nodes. Thus, we introduce the graphlet arrival model and show that in spite of its added complexity important analytical results can be obtained. The model based on iteratively adding the three-node W-graphlet yields networks with degree distributions (the distribution of the probability of observing a node of degree $k$) that follow an asymptotic power law, i.e., $p_k \sim k^{-\gamma}$, where the exponent $\gamma$ lies in the range $3 \le \gamma \le 5$, in agreement with those found in a number of highly-cited studies of real-world systems where graphlets could play a crucial role [7]. We also analytically derive the degree assortativity, $\rho$, a measure of the tendency of nodes to link to nodes of like degree, which has the power to discriminate between empirical networks from various fields, even if they have similar degree distributions [8, 9]. As noted recently [8], an interesting open problem is to come up with a single growth model which could generate networks of both positive assortativity, like social networks, and negative assortativity, like technological and biological networks. We find that our model yields tunable assortativity with respect to a parameter, $\alpha$, which determines the graphlet attachment point probability, as explained below. Our numerical results for networks up to $\simeq 10^7$ nodes in size (which covers most real-world networks) show assortative behavior ($\rho > 0$) for lower $\alpha$ and disassortative behavior ($\rho < 0$) for higher values of $\alpha$. Our analytical calculations show that $\rho \ge 0$ for infinite size networks.
The second intent of this letter is to introduce techniques for comprehensively comparing networks across a suite of network properties simultaneously, allowing for a much more in-depth evaluation of network models than is possible using the common practice of comparing primarily degree distribution. To that end, we compare the ability of our model networks to match the variability of 113 real networks under 15 attributes, and demonstrate how data mining methods like clustering and statistical dimension reduction (via Principal Component Analysis) can be utilized to assess that match. A simple extension of our model yields remarkably large coverage of the attribute space spanned by the 113 real networks, and a significant match of the ranges of real networks over all attributes.

FIG. 1: The Growth Process. The W-graphlet arrives and merges into the existing network at either its midpoint (with probability $1-\alpha$) or its periphery (with probability $\alpha$). Here we show the process after the arrival of 10 graphlets for $\alpha = 0$ (left) and $\alpha = 1$ (right). Already the creation/suppression of hubs is evident, as well as the more homogeneous nature of the degree distribution for $\alpha = 1$. Networks grown with $0 < \alpha < 1$ show behaviors intermediate between these two.

To fully model with the graphlet arrival paradigm, one must decide on which graphlet(s) to use, with which of their nodes to attach, and where in the network to attach them. Common undirected graphlets include the dyad (edge), the two triads (3 nodes) and the six tetrads (4 nodes). To properly analyze their arrival and attachment into the network one must classify the graphlets' nodes into equivalence classes based on symmetry. Our model, illustrated in Fig.
1, considers the simplest non-trivial case: a series of arriving triads, each consisting of a single node of degree two and two identical nodes of degree one, which we call the W-graphlet. This graphlet's asymmetry provides a choice of two topologically different attachment points (the two nodes of degree one are equivalent but different from the single node of degree two), unlike the edge and triangle graphlets, which allow only one. The graphlets attach to the network by merging one of their vertices into an existing node selected with probability proportional to the node's degree, i.e., via preferential attachment. The model chooses the degree-one merge point with probability $\alpha$ and the degree-two merge point with probability $(1-\alpha)$.

First, we derive the asymptotic degree distribution, $p_k$, for the W-graphlet arrival model via a master equation approach. Starting with a single edge at time $t = 0$, the number of nodes at time $t$ is $N(t) = 2t + 2 \approx 2t$ for large $t$. Let $d_i(t)$ denote the degree of vertex $i$ at time $t$. Then, the probability that incoming graphlet $j$ merges with node $i$ is

$$p_{j \to i} = \frac{d_i(t)}{\sum_i d_i(t)} = \frac{d_i(t)}{2N(t)} = \frac{d_i(t)}{4t},$$

where $\sum_i d_i(t) = 2N(t)$, as there is (to leading order) one edge for each node in the graph. Let $N_k(t)$ be the number of nodes with degree $k$ at time $t$. Due to the asymmetry of the W, we get separate equations of $N_k(t)$ for $k \ge 3$, $k = 2$ and $k = 1$. Making the natural assumption that $p_k(t) = N_k(t)/N(t)$ and assuming steady state ($p_k(t) \to p_k$) leads to $N_k(t) = 2t\,p_k$. From this and the $N_k(t)$ equations, which are to be detailed elsewhere [10], we get, for $k \ge 3$,

$$p_k = \alpha\,\frac{k-1}{k+4}\,p_{k-1} + (1-\alpha)\,\frac{k-2}{k+4}\,p_{k-2},$$

with $p_2 = \frac{\alpha}{15}(7-\alpha)$ and $p_1 = \frac{2}{5}(2-\alpha)$. Since $p_k$ for $k \ge 3$ depends on both $p_{k-1}$ and $p_{k-2}$ non-trivially, we cannot solve it analytically.
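The recurrence for $p_k$ is straightforward to iterate numerically. The Python sketch below does so and estimates the tail exponent from a local log-log slope. The function names `degree_distribution` and `tail_exponent`, and the degree cutoffs used, are illustrative assumptions and are not part of the original analysis.

```python
import math


def degree_distribution(alpha, k_max):
    """Iterate the steady-state recurrence for p_k, k = 1..k_max.

    Returns a list p where p[k] is the probability of degree k (p[0] is unused).
    """
    if k_max < 2:
        raise ValueError("k_max must be at least 2")
    p = [0.0] * (k_max + 1)
    p[1] = 2.0 * (2.0 - alpha) / 5.0       # boundary value p_1
    p[2] = alpha * (7.0 - alpha) / 15.0    # boundary value p_2
    for k in range(3, k_max + 1):
        p[k] = (alpha * (k - 1) * p[k - 1]
                + (1.0 - alpha) * (k - 2) * p[k - 2]) / (k + 4)
    return p


def tail_exponent(p, k_low, k_high):
    """Estimate gamma from the log-log slope of p_k between two degrees."""
    return -math.log(p[k_high] / p[k_low]) / math.log(k_high / k_low)


if __name__ == "__main__":
    alpha = 0.5
    p = degree_distribution(alpha, 4000)
    print("sum of p_k      :", sum(p))
    print("numerical gamma :", tail_exponent(p, 1000, 2000))
    print("predicted gamma :", (6.0 - alpha) / (2.0 - alpha))
```

For $\alpha = 0$ the even-degree probabilities vanish exactly, so the slope estimate should then be taken between two odd degrees.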
For large $k$, however, a simple linear approximation results in $\gamma = (6-\alpha)/(2-\alpha)$. The results of numerical solutions are shown in the inset to Fig. 2.

FIG. 2: (Color online) Plot of $\rho$ (vertical axis) versus $\alpha$ (horizontal axis). Points show the mean $\rho$ over 100 simulations of $10^6$-node networks. Bars represent values within two standard deviations. Solid line is the theoretical prediction for networks with maximum degree 2500. Inset: Distribution of average degree for $\alpha = 0.2$ and $\alpha = 0.7$ over an ensemble of 5000 realizations, together with best fitting lines with slopes equal to the respective analytical $\gamma$'s of 3.2 and 4.1.

The degree assortativity, $\rho$, is defined as the Pearson correlation coefficient between the degrees of all pairs of connected vertices in the network [8]. Here, using a rate equation approach [11, 12], we directly calculate $\rho$ from $e_{kl}$, the probability distribution that an edge in an undirected graph is incident to vertices of degree $k$ and $l$, and $p_k$, the degree distribution [10]. Let $E_{kl}(t)$ denote the number of edges with a vertex of degree $k$ at one end and a vertex of degree $l$ at the other at time $t$. We note that $\sum_{k \ge l} E_{kl}(t) = 2t + 2 \approx 2t$ for large $t$, which implies $E_{kl} = 2t\,e_{kl}$ in steady state. To derive a rate equation for $E_{kl}(t)$ we account for the processes that change it when a new W arrives. The processes that increase $E_{kl}$ are when: with probability $(1-\alpha)$, a W merges its midpoint to a vertex of degree $k-2$, which is already attached to a vertex of degree $l$ (and the same argument with $k$ and $l$ reversed); with probability $\alpha$, a W merges one of its endpoints to a vertex of degree $k-1$ (respectively $l-1$), which is already attached to a vertex of degree $l$ (respectively $k$); in the special case when $k = 1$, with probability $(1-\alpha)$, a W merges its midpoint to a vertex of degree $l-2$, producing two new edges, each incident to vertices of degree $l$ and 1; in the special case when $k = 2$, with probability $\alpha$, a W merges one of its endpoints to a vertex of degree $l-1$, producing one new edge incident to vertices of degree $l$ and 2. The processes that decrease $E_{kl}$ are when: with probability $(1-\alpha)$, a new W merges its midpoint to a vertex of degree $k$ (respectively $l$), which is already attached to a vertex of degree $l$ (respectively $k$); with probability $\alpha$, a new W merges one of its endpoints to a vertex of degree $k$ (respectively $l$), which is already attached to a vertex of degree $l$ (respectively $k$). From these cases, and incorporating preferential attachment (by multiplying the number of edges gained or lost by $m/4t$, where $m$ is the degree of the node to which the new W is attached), we derive a rate equation for $E_{kl}$:

$$4t\,\frac{d}{dt}E_{kl} = (1-\alpha)\left[E_{k-2,l}(t)\,(k-2) + E_{k,l-2}(t)\,(l-2) + 2N_{l-2}\,(l-2)\,\delta_{k,1}\right] + \alpha\left[E_{k-1,l}(t)\,(k-1) + E_{k,l-1}(t)\,(l-1) + N_{l-1}\,(l-1)\,\delta_{k,2}\right] - E_{kl}(t)\,(k+l),$$

where $\delta_{i,j}$ is the Kronecker delta. Substituting $E_{kl}(t) = 2t\,e_{kl}$ and $N_k(t) = 2t\,p_k$ eliminates time from this equation and yields expressions for $e_{k \ge 3,\,l}$, $e_{1,\,l \ge 3}$ and $e_{2,\,l \ge 3}$. To initialize the recurrences we similarly calculate $e_{11} = 0$, $e_{21} = e_{12} = 2\alpha/7$, and $e_{22} = \alpha(e_{12} + p_1)/8$. In addition, because of the symmetry of the $E_{ij}$ terms, and since the edges are undirected, when $i = j$ we are over-counting, so we divide $e_{ij}$ by 2. Conversely, when $0 < |i-j| \le 2$, we are under-counting, so we multiply $e_{ij}$ by 2. Therefore, the $e_{kl}$'s (and hence $\rho$) can be calculated [10] for a given value of $\alpha$. A plot of $\rho$ versus $\alpha$ can be seen in Fig. 2, where good agreement is apparent with networks simulated from our model.
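As a rough numerical check on the assortativity discussion, one can grow a W-graphlet network directly and measure the Pearson correlation of the degrees at the two ends of every edge. The sketch below is a minimal pure-Python illustration. The function names, the seeding scheme, and the endpoint-list trick for degree-proportional sampling are assumptions of this sketch, not the authors' simulation code.

```python
import random


def grow_w_network(steps, alpha, seed=0):
    """Grow a network by `steps` W-graphlet arrivals, starting from one edge."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]   # each node appears once per incident edge
    n = 2
    for _ in range(steps):
        # Uniform choice over edge endpoints is degree-proportional sampling.
        target = rng.choice(endpoints)
        if rng.random() < alpha:
            # A peripheral node merges: target - centre - leaf.
            new_edges = [(target, n), (n, n + 1)]
        else:
            # The central node merges: target gains two new leaves.
            new_edges = [(target, n), (target, n + 1)]
        n += 2
        for u, v in new_edges:
            edges.append((u, v))
            endpoints.extend((u, v))
    return n, edges


def degree_assortativity(n, edges):
    """Pearson correlation of end-point degrees, each edge counted both ways."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    m = 2 * len(edges)
    mean = sum(deg[u] + deg[v] for u, v in edges) / m
    var = sum(deg[u] ** 2 + deg[v] ** 2 for u, v in edges) / m - mean ** 2
    cov = sum(2 * deg[u] * deg[v] for u, v in edges) / m - mean ** 2
    if var == 0:
        return float("nan")   # all end-point degrees equal: undefined
    return cov / var


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        n, edges = grow_w_network(20000, alpha, seed=1)
        print(alpha, n, degree_assortativity(n, edges))
```

Each arrival adds two nodes and two edges, so the grown network is always a tree with $N(t) = 2t + 2$ nodes, matching the counting used in the derivation.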
Previous attempts to create a model that admitted varying $\rho$ values worked by rewiring the edges of an existing network [13]. In contrast, our model grows networks with a range of negative and positive $\rho$ values from first principles, giving insight into how assortativity may arise in networks. We note that in our experiments $\rho$ approaches 0 from the negative side for $\alpha < 2/3$, but it does so very slowly and is negative for all networks we tried (up to $10^7$ nodes). It can be shown [10] that in the thermodynamic limit Newman's original formula for $\rho$ [8] yields $\rho = 0$ when $\alpha < 2/3$.

The W-graphlet arrival model always produces trees and hence is not expected to match empirical networks on some interesting properties (such as clustering coefficient). Therefore, we examine a simple extension to the W-model which allows it to produce denser graphs, without significantly affecting the model's degree distribution and assortativity features. The extended model, illustrated in Fig. 3, adds with probability $\beta$ at each time step, $l$ edges (or dyads) from the arriving graphlet into the existing network, with the attachment points being chosen uniformly at random. In addition to allowing denser graphs, this "$W_\beta$-model" also reflects the behavior in various real-world networks, where a newly arriving graphlet may attach to the existing network at more than one point (e.g., new families arriving in a city). A theoretical analysis for the extended model is very complex. Instead, in the following model comparison we simulate networks for many values across the possible parameter space $(\alpha, \beta, l)$.

FIG. 3: (Color online) Illustration of the $W_\beta$ model. Once the graphlet attaches to the network, based on $\alpha$, we introduce up to $l$ (here $l = 4$) additional edges from the graphlet into random nodes of the network, each with probability $\beta$.
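The text and the caption of Fig. 3 admit slightly different readings of how the extra edges are drawn. The sketch below follows the caption: each of up to $l$ candidate edges is created independently with probability $\beta$. Two further choices are assumptions of this sketch and are not stated in the source: the graphlet end of each edge is picked uniformly among the nodes the graphlet has just added, and duplicate edges are skipped. The function name `extra_edges` is hypothetical.

```python
import random


def extra_edges(new_nodes, n_existing, beta, l, rng, present=frozenset()):
    """Draw the additional W-beta edges for one graphlet arrival.

    new_nodes  : ids of the nodes the graphlet has just added (assumed sources)
    n_existing : number of nodes present before the arrival, ids 0..n_existing-1
    present    : set of (u, v) pairs that already exist and must not be repeated
    """
    added = set()
    for _ in range(l):
        if rng.random() >= beta:
            continue                     # this candidate edge is not created
        u = rng.choice(new_nodes)        # graphlet end (assumption: uniform)
        v = rng.randrange(n_existing)    # network end, uniform at random
        if (u, v) not in present and (v, u) not in present:
            added.add((u, v))
    return sorted(added)


if __name__ == "__main__":
    rng = random.Random(42)
    # Nodes 0..9 exist; the arriving graphlet has just added nodes 10 and 11.
    print(extra_edges([10, 11], 10, beta=0.5, l=4, rng=rng))
```

Setting $\beta = 0$ recovers the tree-producing W-model, since no candidate edge is ever accepted.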
Existing network literature compares networks or network models by studying one or two particular properties, most commonly the degree distribution. In this letter, we introduce a fifteen-dimensional attribute vector of seven well-known network properties, which should enable a general and comprehensive comparison between any set of networks. These properties are: the number of nodes, the number of edges, the geodesic distribution, the betweenness coefficient distribution, the clustering coefficient distribution, the assortativity, and the degree distribution of the network. For the four distributions, we use the mean, standard deviation, and skewness as proxy attributes, for a total of 15 attributes. Networks are mapped to points in a 15-dimensional space defined by these attributes, normalizing each value by subtracting the attribute mean and dividing by the attribute's standard deviation.

FIG. 4: (Color online) Symmetric heatmap of attribute correlations among networks. Red (blue) indicates perfect correlation (anti-correlation). White is the intermediate case of no correlation. The small amount of clustering along the diagonal attests to the relative independence of the attributes.

Our collection of real-world networks consists of 113 diverse networks from biological, social and technical domains. It includes software call graphs [14], a social network of software developers [15], political social networks [16, 17], 3 gene networks [18, 19, 20], 3 protein-protein interaction networks [21], cellular networks for several organisms [22], and several others downloaded from a web repository of networks [23, 24, 25, 26, 27, 28, 29, 30]. The degree of overlap, or dependence, between the attributes when characterizing networks can be assessed by the symmetric heatmap in Fig.
4, showing the pairwise correlations (Pearson) of the network attributes over a representative sample of real-world networks (one from each data set described above). The rows and columns of the heatmap are ordered so that, within the limitations of the hierarchical clustering used, the attributes most correlated with each other are placed closest. The map allows us to identify clusters of "similar" network attributes by looking for blocks of squares along the diagonal of the figure. Since there is only a small amount of clustering along the diagonal, it follows that most network attributes we have chosen are relatively independent, and thus each contributes information to our analysis.

In the following analysis we have eliminated 4 of the 15 network attributes and retained 11. One reason is that some attributes, like the number of nodes and the number of edges, were tightly correlated, as indicated in the heatmap, so we kept only one of them, the number of edges. Another reason is that since the $l$'th moment of a power-law distribution, $p(k) \sim k^{-\gamma}$, is only defined for $l < (\gamma - 1)$, we have omitted the skew of the degree distribution, as a precaution. For the same reason, the variance and skew of the betweenness distribution have been left out, even though the exact nature of the betweenness distribution does not seem to be known. The distributions of the clustering coefficient and the geodesic are defined for the models investigated in this paper [34] and have hence been retained.

Next, we compare a collection of $W_\beta$-arrival growth networks to the above collection and to a baseline collection of networks from the well-known BA model [31]. We chose BA as a baseline because, like BA, our graphlet-arrival model uses the mechanism of preferential attachment, only instead of nodes we have graphlets arriving.
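The attribute normalization and the pairwise Pearson correlations behind the heatmap can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, every attribute is assumed to vary across networks (non-zero standard deviation), and the toy matrix is made-up data, not values from the 113-network collection.

```python
import math


def standardize(rows):
    """Z-score every column of a list of equal-length attribute vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    # Assumes no attribute is constant across networks (std > 0).
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def correlation_matrix(rows):
    """Pearson correlations between attribute columns."""
    z = standardize(rows)
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
            for i in range(d)]


if __name__ == "__main__":
    # Rows are networks, columns are attributes (toy values for illustration).
    toy = [[1.0, 3.0, 5.0], [2.0, 5.0, 1.0], [3.0, 7.0, 4.0], [4.0, 9.0, 2.0]]
    for row in correlation_matrix(toy):
        print(["%+.2f" % c for c in row])
```

In the toy matrix the second column is a linear function of the first, so the two are perfectly correlated; a pair of attributes like this would be a candidate for keeping only one, as was done with the node and edge counts.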
We sample a large swath of the parameter space for the $W_\beta$-arrival model, iterating across several possible values for each parameter and creating networks that cover the size range of real-world networks. To this end, we use network sizes ranging from 500 to 5250 nodes at 250-node intervals, $\alpha$ values in the range $0 \le \alpha \le 1$ at intervals of 0.1, $\beta$ values in the range $0 \le \beta \le 1$ at intervals of 0.1, and $l$ values in the range 1 to 5. For each possible combination of values of these four parameters, we create five networks, giving us a total of 60,500 networks. For the BA model, we generate 500 sample networks by varying the number of nodes in the same range as our model (with identical increments), varying the number of edges added at each attachment from 1 to 5, and creating 5 sample networks for each possible combination of these two parameters.

To objectively assess the extent to which our model networks cover the range of attributes simultaneously, we visualize the attribute space using an established statistical dimension-reduction technique, Principal Components Analysis (PCA), which guarantees maximal retention of the variance when projecting data into a lower dimension [32]. PCA finds the projection of an $n$-dimensional data set onto a space of the same dimension, where the new axes, or principal components, are orthogonal and linear combinations of the original dimensional variables, such that the first $d$ axes, $d \le n$, retain the maximal variance of the original data set possible with that many dimensions. Fig. 5 shows the projections of the sets of $W_\beta$ model, BA model, and real-world networks onto the first three principal components (out of 11) of the real-world data set found by the PCA algorithm.
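One way to see what this projection does is to compute the leading principal components of a reference set (here, that role is played by the real-world networks) and project other points onto them. The sketch below uses power iteration with deflation in plain Python, which is one standard way to obtain the top eigenvectors of a covariance matrix. It is not necessarily the algorithm the authors used, and all function names and the toy data are illustrative assumptions.

```python
import math


def covariance(rows):
    """Column means and sample covariance matrix of the reference data."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def top_components(cov, n_components, iters=1000):
    """Leading eigenpairs of a symmetric PSD matrix via power iteration."""
    d = len(cov)
    work = [row[:] for row in cov]                # copy that gets deflated
    comps, eigvals = [], []
    for _component in range(n_components):
        start = [1.0 / (j + 1) for j in range(d)]  # arbitrary start vector
        scale = math.sqrt(sum(x * x for x in start))
        v = [x / scale for x in start]
        for _step in range(iters):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-15:
                break                              # no variance left
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(work[i][j] * v[j] for j in range(d))
                  for i in range(d))
        comps.append(v)
        eigvals.append(lam)
        for i in range(d):                         # remove the found component
            for j in range(d):
                work[i][j] -= lam * v[i] * v[j]
    return comps, eigvals


def explained_fraction(cov, eigvals):
    """Share of total variance captured by the given eigenvalues."""
    return sum(eigvals) / sum(cov[i][i] for i in range(len(cov)))


def project(rows, mean, comps):
    """Coordinates of rows in the space spanned by the reference components."""
    d = len(mean)
    return [[sum((r[j] - mean[j]) * c[j] for j in range(d)) for c in comps]
            for r in rows]


if __name__ == "__main__":
    reference = [[t, 2.0 * t] for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]  # toy data
    mean, cov = covariance(reference)
    comps, eigvals = top_components(cov, 1)
    print(comps[0], explained_fraction(cov, eigvals))
    print(project([[1.0, 2.0], [2.0, -1.0]], mean, comps))
```

Power iteration converges slowly when two leading eigenvalues are close, so a production analysis would more likely rely on a library eigensolver; the point here is only that the model and BA networks are projected onto axes fitted to the real-world set.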
These principal components retain 71% of the original data variance and demonstrate the larger coverage potential of the extended graphlet arrival model. We note that these results are fairly stable with respect to the number of variables used in the PCA analysis: using 2 to 4 fewer (or more) than the 11 variables does not qualitatively change the results [10]. While PCA has been used before to cluster networks [33], our methodology here is novel in that it offers a general and explicit way to compare growth models relative to each other, with respect to the fraction of PCA space they cover. Additionally, it allows for models to be compared more finely, along individual or combinations of original variables, by projecting those variable vectors in the same PCA space, e.g. assortativity in Fig. 5, and then observing the spread differentials between the model networks along those vectors.

In conclusion, graphlet arrival models are a positive step toward more realistic network models which, as we show, better approximate empirical networks in biology, software, and social science, both in the modeling step (graphlet versus node arrival) and in the results (matching more complex measures of networks, like assortativity). A broad degree distribution and wide variation of assortativity are features of the W-arrival model which are not present in preferential attachment models that grow via individual nodes and/or edges. In particular, we believe that the attachment asymmetry of the W-graphlet is largely responsible for these features and that they would not be apparent in a graphlet model of fully connected graphlets (e.g. edge, triangle, or square).
Therefore, we expect more complete graphlet arrival models (whose theoretical analysis would also be more complex), considering a larger set of possible graphlets, to yield even better models of empirical networks (we also note that the addition of simpler graphlets should expand the range of possible $\gamma$'s to below 3, where the exponents of most real-world networks with power-law degree distributions reside). Finally, we anticipate that the technique of comprehensive comparison of networks across a suite of network properties introduced in this letter will find wider use in the network literature.

This work was funded in part by the National Science Foundation under Grant No. IIS-0613949. We thank the anonymous referees for their invaluable comments which improved this paper.

[1] Albert R. and Barabási A. L., Rev. Mod. Phys., 74, 47 (2002); Newman M. E. J., SIAM Review, 45, 167 (2003).
[2] Alberts B. et al., Molecular Biology of the Cell (Garland Science, London, 2007, 5th Ed.).
[3] Davidson E. H., The Regulatory Genome: Gene Regulatory Networks In Development And Evolution (Academic Press/Elsevier, San Diego, 2006).
[4] Weitz J. S. et al., PLoS Biology, 5(1), (2007); Kashtan N. and Alon U., Proc. Natl. Acad. Sci. USA, 102, 13773 (2005); Middendorf M. et al., Proc. Natl. Acad. Sci. USA, 102, 3192 (2005).
[5] Gamma E. et al., Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley Longman Publishing Co., Boston, 1995).
[6] Groysberg B. and Abrahams R., Harvard Business Rev., 84 (2006); McGregor J., Business Week, Dec. 18 (2006).
[7] Steyvers M. and Tenenbaum J. B., Cognitive Science, 29, 41 (2005); Redner S., Eur. Phys. J. B, 4, 131 (1998); Liljeros F. et al., Nature, 411, 907 (2001).
[8] Newman M. E. J., Phys. Rev. Lett., 89, 208701 (2002).
[9] Newman M. E. J. and Park J., Phys. Rev. E, 68, 036122 (2003).
[10] Saul Z. M. et al., unpublished.
[11] Callaway D. S. et al., Phys. Rev. E, 64, 041902 (2001).
[12] Krapivsky P. L. and Redner S., Phys. Rev. E, 63, 066123 (2001).
[13] Xulvi-Brunet R. and Sokolov I. M., Phys. Rev. E, 70, 066102 (2004).
[14] Saul Z. M. et al., Proc. ACM SIGSOFT Internatl. Symp. on Foundations of Software Engineering, 15-24 (2007).
[15] Bird C. et al., Proc. Internatl. Workshop on Mining Software Repositories, 137-143 (2006).
[16] Adamic L. A. and Glance N., The WWW-2005 Workshop on the Weblogging Ecosystem (2005).
[17] Krebs V., A network of books about recent US politics sold at www.amazon.com. url: http://www.orgnet.com/
[18] Teixeira M. C. et al., Nucl. Acids Res., 34, D446 (2006).
[19] Shen-Orr S. S. et al., Nature Genetics, 31, 64 (2002).
[20] Lee T. I. et al., Science, 298, 799 (2002).
[21] Breitkreutz B.-J. et al., Genome Biology, 4(3) (2003).
[22] Jeong H. et al., Nature, 407, 651 (2000).
[23] Newman M. E. J., url: http://www-personal.umich.edu/~mejn/netdata/. Last checked March 8, 2008.
[24] Newman M. E. J., Phys. Rev. E, 74, 036104 (2006).
[25] Watts D. J. and Strogatz S. H., Nature, 393, 440 (1998).
[26] Newman M. E. J., Proc. Natl. Acad. Sci. USA, 98, 404 (2001).
[27] Lusseau D. et al., Behav. Ecol. Sociobiol., 54, 396 (2003).
[28] Girvan M. and Newman M. E. J., Proc. Natl. Acad. Sci. USA, 99, 7821 (2002).
[29] Zachary W. W., J. Anthropol. Res., 33, 452 (1977).
[30] Knuth D. E., The Stanford GraphBase: A Platform for Combinatorial Computing (Addison-Wesley, Reading, MA, 1993).
[31] Barabási A. L. and Albert R., Science, 286, 509 (1999).
[32] Jolliffe I. T., Principal Component Analysis (Springer-Verlag, New York, 2002, 2nd Ed.).
[33] Costa L. da F. et al., Advances in Physics, 56, 167 (2007).
[34] Dorogovtsev S. N., Goltsev A. V., Mendes J. F. F., Phys. Rev. E, 65, 066122 (2002).

FIG. 5: (Color online) A projection of our model nets (orange points), the BA model nets (light blue), and real-world nets (black) onto the first three principal components of the eleven-dimensional PCA space of our real-world data set (we omitted the number of nodes, skew of the degree distribution, and variance and skew of the betweenness distribution from the original 15 attributes, as described in the text). Here, the PCA1 axis is primarily composed of (in terms of their coefficients' magnitude) a combination of the number of edges, mean and skew of geodesic, mean and st. dev. of clustering, and mean and st. dev. of degree. PCA2 is mainly a combination of the st. dev. and mean of geodesic, and assortativity. PCA3 is mainly a combination of the mean of betweenness, mean of clustering, number of edges, and st. dev. of degree. As an example of a spread along an original parameter, the grey arrow is parallel to and shows the direction and magnitude of assortativity when projected onto this space.
