A preferential attachment model with Poisson growth for scale-free networks
We propose a scale-free network model with a tunable power-law exponent. The Poisson growth model, as we call it, is an offshoot of the celebrated model of Barab\'{a}si and Albert where a network is generated iteratively from a small seed network; at…
Authors: Paul Sheridan, Yuichi Yagahara, Hidetoshi Shimodaira
Tokyo Institute of Technology, Department of Mathematical and Computing Sciences, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8552, Japan
sherida6@is.titech.ac.jp
September 8, 2021

Abstract

We propose a scale-free network model with a tunable power-law exponent. The Poisson growth model, as we call it, is an offshoot of the celebrated model of Barabási and Albert where a network is generated iteratively from a small seed network; at each step a node is added together with a number of incident edges preferentially attached to nodes already in the network. A key feature of our model is that the number of edges added at each step is a random variable with Poisson distribution, and, unlike the Barabási-Albert model where this quantity is fixed, it can generate any network. Our model is motivated by an application in Bayesian inference implemented as Markov chain Monte Carlo to estimate a network; for this purpose, we also give a formula for the probability of a network under our model.

Keywords: Bayesian inference, complex networks, network models, power-law, scale-free

1 Introduction

Until recent times, modeling of large-scale, real-world networks was primarily limited in scope to the theory of random networks made popular by Erdős and Rényi (1959). In the Erdős-Rényi model, for instance, a network of $N$ nodes is generated by connecting each pair of nodes with a specified probability. The degree distribution $p(k)$ of a large-scale random network is described by a binomial distribution, where the degree $k$ of a node denotes the number of undirected edges incident upon it.
Thus, degree in a random network has a strong central tendency and is subject to exponential decay so that the average degree of a network is representative of the degree of a typical node.

Over the past decade, though, numerous empirical studies of complex networks, as they are known, have established that in many such systems (networks arising from real-world phenomena as diverse in origin as man-made networks like the World Wide Web, to naturally occurring ones like protein-protein interaction networks, to citation networks in the scientific literature; see Albert et al. (1999), Jeong et al. (2001), and Redner (1998), respectively) the majority of nodes have only a few edges, while some nodes, often called hubs, are highly connected. This characteristic cannot be explained by the theory of random networks. Instead, many complex networks exhibit a degree distribution that closely follows a power-law $p(k) \propto k^{-\gamma}$ over a large range of $k$, with an exponent $\gamma$ typically between 2 and 3. A network that is described by a power-law is called scale-free, and this property is thought to be fundamental to the organization of many complex systems; Strogatz (2001).

As preliminary experimental evidence mounted (see Watts and Strogatz (1998), for instance), a simple, theoretical explanation accounting for the universality of power-laws soon followed; the network model of Barabási and Albert (1999) (BA) provided a fundamental understanding of the development of a wide variety of complex networks on an abstract level. Beginning with a connected seed network of $t_0$ nodes, the BA algorithm generates a network using two basic mechanisms: growth, where over a series of iterations $t = t_0, t_0 + 1, \ldots$
the network is augmented by a single node together with $m \le t_0$ undirected incident edges; and preferential attachment, where the new edges are connected with exactly $m$ nodes already in the network such that the probability a node of degree $k$ gets an edge is proportional to $r(k) = k$, the degree of the node. When $m$ is fixed throughout, Bollobás et al. (2001) showed rigorously that the BA model follows a power-law with exponent $\gamma = 3$ in the limit of large $t$. The sudden appearance of the BA model in the literature, nearly a decade ago, sparked a flurry of research in the field, and, consequently, numerous variations and generalizations upon this prototypal model have been proposed; Albert and Barabási (2002) and Newman (2003) rank as the preeminent survey papers on the subject.

In this paper, we propose a new growing model based on preferential attachment: the Poisson growth (PG) model. Our model, as described in Section 2, is an extension of the BA model in two regards. Firstly, we consider the number of edges added at a step to be a random quantity; at each step, we assign a value to $m$ according to a Poisson distribution with expectation $\lambda > 0$. Secondly, we avail ourselves of a more general class of preferential attachment functions $r(k)$ studied by several authors including Krapivsky and Redner (2001) and Dorogovtsev et al. (2000). In Section 3 we argue that the degree distribution of the PG model follows a power-law with exponent $\gamma$ that can be tuned to any value greater than 2; the technical details of our argument are left for the Appendix. In addition, we conducted a simulation study to support our theoretical claims. Our results, provided in Section 4, show that the values of $\gamma$ we estimated from networks generated under the PG model are in agreement with those predicted by our formulae for the power-law exponent.
Our motivation for proposing the PG model, as explained in Section 5, arises from a need for a simple, yet realistic model that is serviceable in applications. In fact, with our model every network has a nonzero probability of being generated, in addition to possessing a tunable power-law exponent. In contrast, the BA model has a fixed $\gamma$, and is subject to numerous structural constraints which severely limit the variety of generable networks. We give a simple formula for the probability of a network under the PG model, which can be applied quite naturally in Bayesian inference using Markov chain Monte Carlo (MCMC) methods. Firstly, given a network $G$, we may estimate the PG model parameters, or engage in model selection in the case when we have more models; or, going against the grain, we may estimate an unknown $G$ from data using our PG model formula as a scale-free prior distribution.

Finally, other scale-free models have been put forth in the literature, besides our new model, that are realistic enough for use in applications. An extension of the BA model by Albert and Barabási (2000) incorporates "local events," which allows for modifications, such as rewiring of existing edges, to the network at each step. Other authors, including Solé et al. (2002), proposed a class of growing models based on node duplication and edge rewiring. Other scale-free models not based on growth have also been proposed; for example, the static model of Lee et al. (2005). Among these models, the PG model is the simplest preferential attachment model sufficiently realistic for use in applications.

2 The Poisson Growth Model

In the PG model, we begin with a small seed network of $t_0$ nodes. Let $G_t = (V_t, E_t)$ be the network at the onset of time step $t \ge t_0$ where $V_t = \{v_1, v_2, \ldots, v_t\}$ is a set of $t$ nodes and $E_t$ is a multiset of undirected edges so that multiple edges between nodes, but no loops, are permitted. The updated network $G_{t+1}$ is generated from $G_t$ as follows:

Poisson growth: A new node $v_{t+1}$ is added to the network together with $m_t$ incident edges; $m_t$ is a random variable assigned according to a Poisson distribution with expectation $\lambda > 0$.

Preferential attachment: Each edge emanating from $v_{t+1}$ is connected with a node already in the network. Node selection can be considered as a series of $m_t$ independent trials, where at each trial the probability of selecting a node from $V_t$ with degree $k$ is

$$q_t(k) = \frac{r(k)}{\sum_{i=1}^{t} r(k_{i,t})}, \tag{1}$$

where $k_{i,t}$ is the degree of node $v_i$ at step $t$. Define $s_{i,t}$ as the number of times node $v_i$ is chosen at step $t$. Then the entire selection procedure is equivalent to drawing a vector $(s_{1,t}, s_{2,t}, \ldots, s_{t,t})$ from a multinomial distribution with probabilities $q_t(k_{1,t}), \ldots, q_t(k_{t,t})$ and sample size $m_t$. Equivalently, $s_{i,t}$ has Poisson distribution with expectation $\lambda q_t(k_{i,t})$ independently for $i = 1, \ldots, t$.

The PG model is determined by the choice of $r(k)$; we concentrate on two specifications and discuss their implications in the next section. Firstly, let

$$r(k) = k + a \tag{2}$$

where the offset $a \ge 0$ is a constant. More generally, we define

$$r(k) = k + a, \quad k \ge 1, \qquad \text{and} \qquad r(0) = b \tag{3}$$

by taking $a \ge -1$ with extended domain, but in doing so define a threshold parameter $b \ge 0$. Indeed, the latter formulation includes the former as a special case when we restrict $a = b \ge 0$, so that overall our model is specified by the parameter $\theta = (a, b, \lambda)$.
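The two growth mechanisms above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`attachment_weight`, `sample_poisson`, `simulate_pg`) are hypothetical, the seed is assumed to be a pair of connected nodes as in the simulation study of Section 4, and only the degree sequence is tracked, not the edge multiset.

```python
import math
import random


def attachment_weight(k, a, b):
    """Preferential attachment function r(k) of (3): k + a for k >= 1, b for k = 0."""
    return k + a if k >= 1 else b


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_pg(n_nodes, a, b, lam, seed=None):
    """Grow a PG network to n_nodes nodes and return the list of node degrees."""
    rng = random.Random(seed)
    degrees = [1, 1]  # assumed seed network: two nodes joined by one edge
    while len(degrees) < n_nodes:
        m_t = sample_poisson(lam, rng)  # Poisson growth: number of new edges
        weights = [attachment_weight(k, a, b) for k in degrees]
        targets = []
        # Assumption of this sketch: if every weight is zero, no edge is attached.
        if m_t > 0 and sum(weights) > 0:
            # m_t independent trials with probabilities q_t(k) of (1);
            # the same node may be chosen repeatedly (multiple edges).
            targets = rng.choices(range(len(degrees)), weights=weights, k=m_t)
        for i in targets:
            degrees[i] += 1
        degrees.append(len(targets))  # degree of the new node
    return degrees
```

For example, `simulate_pg(5000, 0.0, 0.0, 1.0, seed=1)` returns the degrees of one network of the size used in Section 4; the empirical degree distribution is then the normalized histogram of that list.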
The BA model can be explained as a reduction of our model by taking $a = b = 0$, and by fixing $1 \le m_t = m \le t_0$ so that the number of edges added to the system at each step is a constant; the new edges are preferentially attached from the new node to exactly $m$ other nodes. Many structural constraints are implicit in the BA model. Indeed, at step $t$, a network with $t$ nodes must have $m(t - t_0) + |E_{t_0}|$ edges, none of which are multiple, whereas the number of edges for the PG model can take other values.

A number of extensions of the BA model based upon generalizing $r(k)$ have been proposed. In particular, Krapivsky et al. (2000) analyzed a version where the preferential attachment function is not linear in the degree $k$ of a node, but instead can be a power of the degree $k^{\nu}$, $\nu > 0$. They showed that for the scale-free property to hold, $r(k)$ must be asymptotically linear in $k$. In a subsequent work, Krapivsky and Redner (2001) and Dorogovtsev et al. (2000) independently went on to establish that adding the offset $a > -m$ as in (2) does not violate this requirement, and derived the power-law exponent $\gamma = 3 + a/m$. Their result is analogous with our reported power-law exponent in (7) with $\lambda = m$ as seen in the next section. Furthermore, Krapivsky and Redner (2001) investigated an attachment function similar to (3) defined by $r(k) = k$, $k \ge 2$, $r(1) = b$, $r(0) = 0$. As they took $m \ge 1$ they did not need to be concerned with nodes of degree $k = 0$. The power-law exponent they derived in this case is reminiscent of our result in (6).

3 The Degree Distribution of the Poisson Growth Model

In this section, we discuss the degree distribution $p(k)$ for networks generated under the PG model.
The main result is that the degree distribution follows a power-law

$$p(k) \sim k^{-\gamma}, \tag{4}$$

where $a_k \sim b_k$ indicates these two sequences are proportional to each other so that $a_k / b_k$ converges to a nonzero constant as $k \to \infty$. This result is an immediate consequence of the recursive formula

$$(k + a - 1 + \gamma)\, p(k) = (k + a - 1)\, p(k - 1) \tag{5}$$

for sufficiently large $k$, and thus $p(k) \sim (k + a - 1)^{-\gamma} \sim k^{-\gamma}$. The power-law exponent is

$$\gamma = 3 + \frac{a + (b - a)\, p(0)}{\lambda} \tag{6}$$

for the preferential attachment function defined in (3), and the exponent takes the range $\gamma > 2$; the lower limit $\gamma \to 2$ can be attained by letting $a = -1$, $b = 0$, and $\lambda \to 0$. This lower limit is in fact the limit for any form of $r(k)$ when $\lambda$ does not depend on $t$; $\gamma$ must be larger than 2 to ensure the mean degree $\sum_{k=0}^{\infty} k\, p(k) = 2\lambda$ converges. For the special case (2), the exponent becomes

$$\gamma = 3 + \frac{a}{\lambda}, \tag{7}$$

and the range is $\gamma \ge 3$.

To make the argument precise, we have to note that $G_t$ is generated randomly and the degree distribution of $G_t$ also varies. Let $n_t(k)$ be the number of nodes in $G_t$ with degree $k$. Since $\sum_{k=0}^{\infty} n_t(k) = t$, the observed degree distribution of $G_t$ is defined by $p_t(k) = n_t(k)/t$ for $k \ge 0$. For sufficiently large $t$, $p_t(k)$ may follow the power-law of (4) as seen below.

We consider a moderately large $k$ for the asymptotic argument as $t \to \infty$. The maximum value of $k$ for consideration is $k \sim t^{c}$ for a given $t$ with a constant $c = 1/(\gamma + 2 + \epsilon)$ with any $\epsilon > 0$. Then, the expectation of $p_t(k)$ can be expressed as

$$E(p_t(k)) \sim k^{-\gamma}, \tag{8}$$

which is the power-law we would like to show for $p_t(k)$. The variance of $p_t(k)$ will be shown as

$$V(p_t(k)) = O(k^{2+\epsilon}\, t^{-1}), \tag{9}$$

indicating the variance reduces by the factor $1/t$. Note that (9) is not a tight upper bound, and the variance can be much smaller. See the Appendix for the proof of (8) and (9).
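As a worked illustration of (6) and (7), the following Python sketch evaluates the predicted exponent for a given $\theta = (a, b, \lambda)$. It assumes that $p(0)$ is taken as the root in $(0, 1)$ of the quadratic equation (11) stated later in this section, and that for $a = b$ the quadratic term vanishes so the equation is solved as a linear one. The function names are illustrative only.

```python
import math


def stationary_p0(a, b, lam):
    """Root in (0, 1) of quadratic (11) for p(0); linear equation when a == b."""
    lin = 2 * lam + a + lam * b - (b - a) * math.exp(-lam)  # coefficient of x
    const = (2 * lam + a) * math.exp(-lam)                  # minus the constant term
    if math.isclose(a, b):
        return const / lin
    disc = lin * lin + 4 * (b - a) * const
    return (math.sqrt(disc) - lin) / (2 * (b - a))


def predicted_gamma(a, b, lam):
    """Power-law exponent (6): gamma = 3 + (a + (b - a) p(0)) / lambda."""
    return 3 + (a + (b - a) * stationary_p0(a, b, lam)) / lam
```

Evaluating `predicted_gamma` at $\theta = (-0.9, 0.1, 1)$, $(-0.9, 0.1, 3)$ and $(0.5, 0.5, 3)$ gives values that round to 2.44, 2.72 and 3.17, the theoretical column of Table 1.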
Let $0 < d < 1/(2\gamma + 2 + \epsilon)$, and consider $k = O(t^{d})$, which is even smaller than $t^{c}$. Then,

$$\frac{\sqrt{V(p_t(k))}}{E(p_t(k))} = O(k^{\gamma + 1 + \epsilon/2}\, t^{-1/2}) = O(t^{\alpha}) \tag{10}$$

with $\alpha = d(\gamma + 1 + \epsilon/2) - 1/2 < 0$, and thus the limiting distribution $\lim_{t \to \infty} p_t(k) = p(k)$ follows the power-law of (4). By taking $\epsilon \to 0$, the power-law of $p_t(k)$ is shown up to $k \sim t^{d}$ with $d < 1/(2\gamma + 2)$.

It remains to give an expression for $p(0)$ in (6). We will show in the Appendix that $p(0)$ is a solution of the quadratic equation

$$(b - a)\, x^2 + \left(2\lambda + a + \lambda b - (b - a) e^{-\lambda}\right) x - (2\lambda + a) e^{-\lambda} = 0. \tag{11}$$

For $a \ne b$, one of the solutions

$$p(0) = \frac{1}{2(b - a)} \left[ \left\{ \left(2\lambda + a + \lambda b - (b - a) e^{-\lambda}\right)^2 + 4 (b - a)(2\lambda + a) e^{-\lambda} \right\}^{1/2} - \left(2\lambda + a + \lambda b - (b - a) e^{-\lambda}\right) \right] \tag{12}$$

is the unique stable solution with $0 < p(0) < 1$; this can be checked by looking at the sign of $p_{t+1}(0) - p_t(0)$ in the neighborhood of $p(0)$.

4 Simulation Study

A small simulation study was conducted to support our theoretical claims of Section 3. Specifically, we wish to confirm via simulation that the degree distribution $p(k)$ of (4) as well as its expected value $E(p_t(k))$ as in (8) follow a power-law with $\gamma$ as in (6). To that end we generated networks under the PG model for a variety of parameter settings. For each specification of $\theta$ we generated $n_{\mathrm{sim}} = 10^4$ networks of size $N = 5000$, each from a seed network of a pair of connected nodes. We included the BA model, generated under analogous conditions, so as to demonstrate the soundness of our results which are summarized in Table 1.

In point of fact, estimating $\gamma$ from a network can be quite tricky and it has been the subject of some attention in the literature; see Goldstein et al. (2004). We sided with the maximum likelihood (ML) approach described by Newman (2005).
In this methodology, the ML estimate of $\gamma$ for a particular network is given by

$$\hat{\gamma} = 1 + \left( \sum_{k \ge k_{\min}} n(k) \right) \cdot \left( \sum_{k \ge k_{\min}} n(k) \log \frac{k}{k_{\min}} \right)^{-1}$$

where $n(k)$ is the number of nodes with degree $k$, and $k_{\min}$ is the minimum degree after which the power-law behavior holds. Bauke (2007) studied selecting a value for $k_{\min}$ by using a $\chi^2$ goodness of fit test over a range of $k_{\min}$; however, we shied away from this level of scrutiny as we found that taking $k_{\min} = 10$ was reasonable for our examples. This methodology is illustrated in Figure 1 (a) and (b) where we plot the degree distribution with $\hat{\gamma}$ for a typical network generated by the BA and PG model, respectively.

Returning to Table 1, in each case, we confirm (4) has power-law exponent as predicted by (6) and (12). We computed $\hat{\gamma}$ for each network, and calculated the mean and standard deviation of $\hat{\gamma}$ values for $n_{\mathrm{sim}}$ networks. We observe that the mean $\hat{\gamma}$ agrees well with the predicted $\gamma$, and the variation of $\hat{\gamma}$ is relatively small as suggested by (10).

In addition, to show that the same holds for (8), in each case we computed the average degree distribution of the $n_{\mathrm{sim}}$ networks as an estimate of $E(p(k))$. Then we estimated the degree exponent $\hat{\gamma}_{\mathrm{avg}}$ as seen in the table and Figure 2. Again, the simulated results match well with theory.

5 Discussion

The PG model has a special place in the class of preferential attachment models. It has a tunable power-law exponent and a simple implementation, yet it can generate any network. In contrast, the BA model and its generalizations described in Section 2 have serious restrictions on the types of networks that can be generated because $m$ is held constant. For example, at step $t$ an instantiation of the BA model will consist of a $t$ node network with the number of edges equal to exactly $m(t - t_0)$, plus the number of edges in the seed network.
The simple design of our model makes computing the probability of a network straightforward. This in combination with its modeling potential gives rise to several useful applications in Bayesian inference.

In explicit terms, let $G = (V, E)$ be a network with $N = |V|$ nodes where $V = \{v'_1, \ldots, v'_N\}$. Furthermore, let $G_N = (V_N, E_N)$ be a network generated under the PG model after step $N - 1$ so that $V_N = \{v_1, \ldots, v_N\}$, where the seed network consists of a single node. The association between $V$ and $V_N$ is defined by a permutation $\sigma = (\sigma_1, \ldots, \sigma_N)$ so that $v_i = v'_{\sigma_i}$. Given $G$, once we specify $\sigma$, then it is straightforward to compute $k_{i,t}$, $s_{i,t}$ for $i = 1, \ldots, t$; $t = 1, \ldots, N - 1$. Then the probability of $G$ given $\theta = (a, b, \lambda)$ and $\sigma$ is

$$P(G \mid \theta, \sigma) = \prod_{t=1}^{N-1} \left( \prod_{i=1}^{t} \frac{e^{-\lambda q_t(k_{i,t})} \left(\lambda q_t(k_{i,t})\right)^{s_{i,t}}}{s_{i,t}!} \right).$$

One application is when $G$ is known and we wish to estimate $\theta$. This can be done by assigning a prior $\pi(\theta)$ for $\theta$ and the uniform prior on $\sigma$. The posterior probability of $(\theta, \sigma)$ given $G$ is

$$\pi(\theta, \sigma \mid G) \propto P(G \mid \theta, \sigma)\, \pi(\theta).$$

Using MCMC to produce a chain of values for $(\theta, \sigma)$, the posterior $\pi(\theta \mid G)$ is simply obtained from the histogram of $\theta$ in the chain. Moreover, this procedure can be used for model comparison, if we have several models for generating the network.

Another application is when we wish to make inference about $G$ from data $D$ with likelihood function $P(D \mid G)$. The posterior probability of $(G, \theta, \sigma)$ given $D$ is

$$\pi(G, \theta, \sigma \mid D) \propto P(D \mid G)\, P(G \mid \theta, \sigma)\, \pi(\theta).$$

Then the posterior $\pi(G \mid D)$ is simply obtained from the frequency of $G$ in the chain. Indeed, we used this approach for inferring a gene network from microarray data in Sheridan et al. (2007).

Recall that the PG model produces networks with multiple edges.
In practice, we often want to restrict our interest to networks without multiple edges. As an approximation, we could apply the formula for $P(G \mid \theta, \sigma)$ just as well in this case. Alternatively, we propose a slight modification to our model where we generate $m_t$ edges at step $t$ according to a binomial distribution with parameter $p = \lambda/t$ and sample size $t$. In this formulation the seed network must be selected such that $\lambda \le t_0$, otherwise $p > 1$ may occur. Then by sampling nodes without replacement, multiple edges are avoided. In our simulation (results not included) we found that these modifications do not change the power-law.

Finally, though we made specific choices for $r(k)$ in our arguments, the PG model can be generalized to a wider class of preferential attachment functions. For instance, Dorogovtsev and Mendes (2001) investigated accelerated growth models where $m_t$ increases as the network grows. It should be possible to incorporate accelerated growth into the PG model by gradually increasing the value of $\lambda$ over time. Another line of generalizations of the PG model is via the inclusion of local events.

Appendix: Proofs

The expected value of $p_t(k)$

Here we give the proof of (8). We assume that the functional form of $r(k)$ is (2), and a modification to handle (3) is mentioned at the bottom. Let $I(A)$ denote the indicator function of the event $A$, so $I(A) = 1$ if $A$ is true and $I(A) = 0$ if $A$ is false. We use the notation $P(\cdot)$, $E(\cdot)$ and $V(\cdot)$ for the probability, expectation and the variance, and also $P(\cdot \mid A)$, $E(\cdot \mid A)$ and $V(\cdot \mid A)$ for those given a condition $A$. By noting

$$n_{t+1}(k) = \sum_{i=1}^{t} I(k_{i,t} + s_{i,t} = k) + I(m_t = k),$$

the conditional expectation of $n_{t+1}(k)$ given $G_t$ is

$$\begin{aligned}
E(n_{t+1}(k) \mid G_t) &= \sum_{i=1}^{t} P(k_{i,t} + s_{i,t} = k \mid G_t) + P(m_t = k \mid G_t) \\
&= \sum_{i=1}^{t} \frac{e^{-\lambda q_t(k_{i,t})} \left(\lambda q_t(k_{i,t})\right)^{k - k_{i,t}}}{(k - k_{i,t})!} + \frac{e^{-\lambda} \lambda^{k}}{k!} \\
&= \sum_{s=0}^{k} n_t(k - s)\, \frac{e^{-\lambda q_t(k - s)} \left(\lambda q_t(k - s)\right)^{s}}{s!} + \frac{e^{-\lambda} \lambda^{k}}{k!}.
\end{aligned} \tag{13}$$

The last term $e^{-\lambda} \lambda^{k} / k! \sim (e\lambda/k)^{k}$ can be ignored for a large $k$, since it is exponentially small as $k$ grows.

We examine the terms in the summation over $s = 0, 1, \ldots, k$ for $k = O(t^{c})$ as $t \to \infty$. For a fixed $s$, $q_t(k - s) \sim k/t$ for a linear preferential attachment model. More specifically, for $r(k) = k + a$, $k \ge 0$,

$$q_t(k - s) = \frac{r(k - s)}{\sum_{i=1}^{t} r(k_{i,t})} = \frac{k - s + a}{t(2\lambda + a)} \left(1 + O(t^{-1/2})\right),$$

because the mean degree of $G_t$ is

$$\frac{1}{t} \sum_{i=1}^{t} k_{i,t} = \frac{2}{t} \left( |E_{t_0}| + \sum_{t' = t_0}^{t-1} m_{t'} \right) = 2\lambda + O(t^{-1/2}),$$

and the denominator of $q_t(k)$ is

$$\sum_{i=1}^{t} r(k_{i,t}) = \sum_{i=1}^{t} (k_{i,t} + a) = t \left(2\lambda + a + O(t^{-1/2})\right). \tag{14}$$

Thus the sum in (13) over $s = 0, 1$ becomes

$$n_t(k) \left( 1 - \frac{\lambda (k + a)}{(2\lambda + a)\, t} + O(k t^{-3/2}) \right) + n_t(k - 1) \left( \frac{\lambda (k + a - 1)}{(2\lambda + a)\, t} + O(k t^{-3/2}) \right).$$

For $s \ge 2$, each term is $\sim n_t(k - s)(k/t)^{s}$. By noting $\sum_{s=2}^{k} n_t(k - s) \le t$, the sum over $s = 2, \ldots, k$ becomes $O(k^2 t^{-1})$.

Next, we take the expectation of (13) with respect to $G_t$ to obtain the unconditional expectation $E(n_{t+1}(k))$, and replace $n_t(k) = t\, p_t(k)$. Using the results of the previous paragraph, we get

$$E(p_{t+1}(k)) = E(p_t(k)) - \frac{\lambda}{(2\lambda + a)\, t} \left\{ \left(k' + \gamma + O(k t^{-1/2})\right) E(p_t(k)) - \left(k' + O(k t^{-1/2})\right) E(p_t(k - 1)) + O(k^2 t^{-1}) \right\} \tag{15}$$

with $k' = k + a - 1$ and the $\gamma$ of (7). Let us assume $E(p_t(k - 1)) \sim (k - 1)^{-\gamma}$, and remember $c < 1/(\gamma + 2)$. By taking the limit $t \to \infty$ and equating $E(p_{t+1}(k)) = E(p_t(k))$, we get

$$\left(k' + \gamma + o(1)\right) E(p_t(k)) = \left(k' + o(1)\right) E(p_t(k - 1)).$$

So that, for sufficiently large $t$, $E(p_t(k)) \sim k^{-\gamma}$ also holds for $k$. Since $E(p_t(k)) = O(1)$ for a fixed $k$, the power-law holds for any $k$ by induction up to $k \sim t^{c}$.
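The stationary relation obtained above, which is the recursion (5), can be checked numerically. The sketch below is only an illustration under stated assumptions: it iterates the relation from an arbitrary starting value $p(1) = 1$ (the overall scale does not affect the slope), assumes $a > -1$ so that no term vanishes, and uses hypothetical function names. The log-log slope of the resulting sequence over a range of large $k$ should be close to $-\gamma$.

```python
import math


def stationary_tail(gamma, a, k_max):
    """Iterate (k' + gamma) p(k) = k' p(k - 1) with k' = k + a - 1, starting at p(1) = 1."""
    p = [0.0, 1.0]  # p[0] is a placeholder; the recursion is applied from k = 2
    for k in range(2, k_max + 1):
        k_prime = k + a - 1
        p.append(p[-1] * k_prime / (k_prime + gamma))
    return p


def log_log_slope(p, k_lo, k_hi):
    """Slope of log p(k) against log k between k_lo and k_hi."""
    return (math.log(p[k_hi]) - math.log(p[k_lo])) / (math.log(k_hi) - math.log(k_lo))
```

For instance, with `p = stationary_tail(3.0, 0.0, 2000)`, the value of `log_log_slope(p, 1000, 2000)` is within 0.01 of $-3$, in line with $p(k) \sim k^{-\gamma}$.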
For $r(k)$ of (3), the preferential attachment is modified to

$$r(k) = k + a + (b - a)\, I(k = 0), \quad k \ge 0.$$

This changes the denominator of $q_t(k)$ in (14) to

$$\sum_{i=1}^{t} r(k_{i,t}) = t \left(2\lambda + a + (b - a)\, p_t(0) + O(t^{-1/2})\right), \tag{16}$$

and thus $2\lambda + a$ in the updating formula (15) is replaced with $2\lambda + a + (b - a)\, p(0)$, leading to (6). Note that $p_t(0) = p(0) + O(t^{-1/2})$ from (9) shown in the next section.

The variance of $p_t(k)$

Here we give the proof of (9) by working on $V(n_t(k)) = t^2\, V(p_t(k))$. Although $r(k)$ of (2) is again assumed, the argument is basically the same for (3). By noting the identity

$$V(n_{t+1}(k)) = E\left(V(n_{t+1}(k) \mid G_t)\right) + V\left(E(n_{t+1}(k) \mid G_t)\right), \tag{17}$$

we evaluate the two terms on the right hand side.

The conditional variance of $n_{t+1}(k)$ given $G_t$ is evaluated rather similarly as the conditional expectation of (13). By noting $V(I(A)) = P(A) - P(A)^2$, $V(n_{t+1}(k) \mid G_t)$ is expressed for $k = O(t^{c})$ as

$$\sum_{s=0}^{k} n_t(k - s) \left\{ \frac{e^{-\lambda q_t(k - s)} \left(\lambda q_t(k - s)\right)^{s}}{s!} - \left( \frac{e^{-\lambda q_t(k - s)} \left(\lambda q_t(k - s)\right)^{s}}{s!} \right)^{2} \right\} \approx n_t(k)\, \frac{\lambda (k + a)}{(2\lambda + a)\, t} + n_t(k - 1)\, \frac{\lambda (k + a - 1)}{(2\lambda + a)\, t}, \tag{18}$$

where terms from $I(m_t = k)$ are ignored for a large $k$. Thus, the first term in (17) is

$$E\left(V(n_{t+1}(k) \mid G_t)\right) = O(k^{-\gamma + 1}).$$

On the other hand, the second term in (17) is evaluated by considering the variance of (13) as

$$\begin{aligned}
V\left(E(n_{t+1}(k) \mid G_t)\right) \le{} & V(n_t(k)) \left( 1 - \frac{2\lambda (k + a)}{(2\lambda + a)\, t} + O(k t^{-3/2}) \right) \\
& + 2 \sqrt{V(n_t(k))}\, \sqrt{V(n_t(k - 1))} \left( \frac{\lambda (k + a - 1)}{(2\lambda + a)\, t} + O(k t^{-3/2}) \right) \\
& + V(n_t(k - 1))\, O(k^2 t^{-2}) + \sqrt{V(n_t(k))}\, O(k^2 t^{-1}) + O(k^4 t^{-2}).
\end{aligned}$$

We substitute these two expressions for those in (17). We will show, by induction, that

$$V(n_t(k)) < A\, k^{2+\epsilon}\, t \tag{19}$$

holds for all $(t, k)$ with $k = O(t^{c})$ using some constant $A$. Let us assume that (19) holds for $(t, k)$ and $(t, k - 1)$.
By taking a sufficiently large $A$, we have

$$V(n_{t+1}(k)) \le A\, k^{2+\epsilon} \left(t - (2\lambda + a)^{-1}\right) + o(k^{1+\epsilon/2}) < A\, k^{2+\epsilon}\, (t + 1), \tag{20}$$

implying that (19) also holds for $(t + 1, k)$.

On the other hand, for any random variable $0 \le n \le t$ with its expectation $E(n)$ fixed, the largest possible variance $O(t)\, E(n)$ is attained if the probability concentrates on the extreme values $0$ and $t$. Applying this upper bound to $n_t(k)$ with $k \sim t^{c}$, we obtain $V(n_t(k))/t = O(E(n_t(k))) = O(k^{-\gamma} t) = O(k^{2+\epsilon})$, implying that (19) holds for any $(t, k)$ with $k \sim t^{c}$.

For induction with respect to $k$, we only have to show

$$V(n_t(k)) < v(k)\, t \tag{21}$$

for a sufficiently large $k$ so that terms from $I(m_t = k)$ in (18) can be ignored. $v(k)$ is an arbitrary constant depending on $k$. We start from $k = 0$. First note that

$$n_{t+1}(0) = \sum_{i=1}^{t} I(k_{i,t} = 0 \cap s_{i,t} = 0) + I(m_t = 0).$$

Thus $E(n_{t+1}(0) \mid G_t) = n_t(0)\, e^{-\lambda q_t(0)} + e^{-\lambda}$, and so

$$V\left(E(n_{t+1}(0) \mid G_t)\right) = V(n_t(0)) \left( 1 - \frac{2\lambda a}{(2\lambda + a)\, t} + O(t^{-3/2}) \right).$$

On the other hand, $V(n_{t+1}(0) \mid G_t)$ is expressed as

$$n_t(0) \left( e^{-\lambda q_t(0)} - e^{-2\lambda q_t(0)} \right) + e^{-\lambda} - e^{-2\lambda} + 2\, n_t(0) \left( 1 - e^{-\lambda q_t(0)} \right) e^{-\lambda}.$$

By substituting these two expressions for those in (17), we observe that the increase of the variance, i.e., $V(n_{t+1}(0)) - V(n_t(0))$ is bounded by a constant, and we have $V(n_t(0)) = O(t)$.

Let us assume (21) holds up to $k - 1$. Then $V(n_{t+1}(k))$ can be expressed quite similarly as (20), but $E(V(n_{t+1}(k) \mid G_t))$ includes additional terms from $I(m_t = k)$; $V(I(m_t = k)) = O(1)$ and $E\left(\sum_{i=1}^{t} \mathrm{Cov}(I(k_{i,t} + s_{i,t} = k),\, I(m_t = k) \mid G_t)\right)$. For $k_{i,t} = k$, the covariance term $\le P(m_t = k)\,(1 - P(s_{i,t} = 0 \mid G_t)) = O(t^{-1})$, and for $k_{i,t} = k - s$ with $s \ge 1$, the covariance term $\le P(s_{i,t} = s)\,(1 - P(m_t = k)) = O(t^{-s})$. Thus, by taking the sum over $i = 1, \ldots, t$, it becomes $O(t \cdot t^{-1}) = O(1)$.
Therefore, $V(n_{t+1}(k)) - V(n_t(k))$ is bounded by a constant, and (21) holds for $k$. By induction, (21) holds for any $k$.

The equation of $p(0)$

Here we derive (11) for the $r(k)$ of (3). By taking the expectation of $E(n_{t+1}(0) \mid G_t) = n_t(0)\, e^{-\lambda q_t(0)} + e^{-\lambda}$ with respect to $G_t$, and using (16), we get

$$E(n_{t+1}(0)) = E(n_t(0)) \left( 1 - \frac{\lambda b}{\left(2\lambda + a + (b - a)\, p(0)\right) t} + O(t^{-3/2}) \right) + e^{-\lambda}.$$

By substituting $n_t(0) = t\, p_t(0)$ and taking the limit $t \to \infty$, we get a formula for $f(x) = (t + 1)\left(E(p_{t+1}(0)) - E(p_t(0))\right)$ as a function of $x = p(0)$

$$f(x) = -x \left( 1 + \frac{\lambda b}{2\lambda + a + (b - a)\, x} \right) + e^{-\lambda}.$$

The quadratic equation (11) is obtained by letting $f(x) = 0$. In addition, the condition $df(x)/dx < 0$ was checked for the stable solution.

References

Albert, R., Jeong, H., Barabási, A.-L. (1999). Diameter of the world-wide web. Nature, 401, 130–131.
Albert, R., Barabási, A.-L. (2000). Topology of evolving networks: local events and universality. Phys. Rev. Lett., 85, 5234–5237.
Albert, R., Barabási, A.-L. (2002). Statistical mechanics of complex networks. Rev. Mod. Phys., 74, 47–97.
Barabási, A.-L., Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512.
Bauke, H. (2007). Parameter estimation for power-law distributions by maximum likelihood methods. The European Physical Journal B - Condensed Matter and Complex Systems, 58(2), 167–173.
Bollobás, B., Riordan, O., Spencer, J., Tusnády, G. (2001). The degree sequence of a scale-free random graph process. Random Structures Algorithms, 18, 279–290.
Dorogovtsev, S.N., Mendes, J.F.F., Samukhin, A.N. (2000). Structure of growing networks with preferential linking. Phys. Rev. Lett., 85, 4633–4636.
Dorogovtsev, S.N., Mendes, J.F.F. (2001). Effect of accelerated growth of communications networks on their structure. Phys. Rev. E, 63, 025101.
Erdős, P., Rényi, A. (1959). On random graphs I. Publicationes Mathematicae, 6, 290–297.
Goldstein, M.L., Morris, S.A., Yen, G.G. (2004). Problems with fitting to the power-law distribution. The European Physical Journal B, 41, 255–258.
Jeong, H., Mason, S., Barabási, A.-L., Oltvai, Z.N. (2001). Lethality and centrality in protein networks. Nature, 411, 41–42.
Krapivsky, P.L., Redner, S., Leyvraz, F. (2000). Connectivity of growing random networks. Phys. Rev. Lett., 85, 4629–4632.
Krapivsky, P.L., Redner, S. (2001). Organization of growing random networks. Phys. Rev. E, 63, 066123.
Lee, D.S., Goh, K.I., Kahng, B., Kim, D. (2005). Scale-free random graphs and Potts model. Pramana Journal of Physics, 64, 1149–1159.
Newman, M. (2003). The structure and function of complex networks. SIAM Review, 45(2), 176–256.
Newman, M.E.J. (2005). Power laws, Pareto distributions and Zipf's law. Contemporary Physics, 46(5), 323–351.
Redner, S. (1998). How popular is your paper? An empirical study of the citation distribution. The European Physical Journal B, 4, 131–134.
Sheridan, P., Kamimura, T., Shimodaira, H. (2007). Scale-free networks in Bayesian inference with applications to bioinformatics. Proceedings of The International Workshop on Data-Mining and Statistical Science (DMSS2007), 1–16, Tokyo.
Solé, R.V., Pastor-Satorras, R., Smith, E., Kepler, T.B. (2002). A model of large-scale proteome evolution. Advances in Complex Systems, 5, 43–54.
Strogatz, S.H. (2001). Exploring complex networks. Nature, 410, 268–276.
Watts, D.J., Strogatz, S.H. (1998). Collective dynamics of small-world networks. Nature, 393, 440–442.

Table 1: Summary of estimated power-law exponents from simulated networks. The last column is the theoretically predicted $\gamma$.

Model   Parameters                    Mean k   Mean $\hat{\gamma}$ ± s.d.   $\hat{\gamma}_{\mathrm{avg}}$   $\gamma$
BA      m = 1                         2.0      3.03 ± 0.15                  3.03                            3
PG      $\theta$ = (0, 0, 1)          2.0      3.03 ± 0.12                  3.03                            3
PG      $\theta$ = (−0.9, 0.1, 1)     2.0      2.54 ± 0.10                  2.51                            2.44
PG      $\theta$ = (−0.9, 0.1, 3)     6.0      2.86 ± 0.05                  2.86                            2.72
PG      $\theta$ = (0.5, 0.5, 3)      6.0      3.15 ± 0.05                  3.15                            3.17

Figure 1: Degree distribution $p(k)$ of a typical network plotted on a log-log scale with the power-law line using estimated exponent $\hat{\gamma}$. (a) Generated under the BA model; $\hat{\gamma} = 3.03$. (b) Generated under the PG model with $\theta = (0, 0, 1)$; $\hat{\gamma} = 3.01$.

Figure 2: Average degree distribution $E(p(k))$ of the simulation with the power-law line using estimated exponent $\hat{\gamma}_{\mathrm{avg}}$. Plotted for (a) the BA model and for (b) the PG model with $\theta = (0, 0, 1)$, where $\hat{\gamma}_{\mathrm{avg}} = 3.03$ for both cases.