Joint Estimation of the Non-parametric Transitivity and Preferential Attachment Functions in Scientific Co-authorship Networks

Join t Estimation of the Non-parametric T ransitivit y and Preferen tial A ttac hmen t F unctions in Scien tiﬁc Co-authorship Net w orks Masaaki Inoue ∗ † 1,2 , Thong Pham † 2 , and Hidetoshi Shimo daira 1,2 1 Ky oto Universit y 2 RIKEN Cen ter for AIP Abstract W e prop ose a statistical method to estimate simultaneously the non-parametric transitivit y and pref- eren tial attac hment functions in a gro wing net work, in con trast to con ven tional metho ds that either estimate eac h function in isolation or assume some functional form for them. Our mo del is shown to b e a go od ﬁt to tw o real-world co-authorship netw orks and b e able to bring to light in triguing details of the preferen tial attac hment and transitivity phenomena that w ould b e unav ailable under traditional metho ds. W e also introduce a metho d to quan tify the amoun t of contributions of those phenomena in the growth pro cess of a netw ork based on the probabilistic dynamic pro cess induced by the mo del formula. Applying this method, we found that transitivity dominated preferen tial attachmen t in b oth co-authorship net- w orks. This suggests the imp ortance of indirect relations in scientiﬁc creative pro cesses. The prop osed metho ds are implemented in the R pac k age FoFaF . Keywor ds: transitivit y , preferen tial attac hment, co-authorship netw ork, collab oration netw ork, complex net work, net work growth 1 Introduction Science has nev er b een more collab orativ e. In this era that has b een witnessing an unprecedented explosion of m ulti-author sc holarly articles (Larivière, Gingras, Sugimoto, & T sou, 2015), collab oration has become more and more imp ortan t in the path to scientiﬁc success (Jones, W uch t y , & Uzzi, 2008; Bornmann, 2017). Promising ideas from numerous analytic ﬁelds, including complex netw ork theory , statistics, and informetrics, ha ve b een wea v ed together to understand this collaborative nature of science (Zeng et al., 2017; F ortunato et al., 2018). One of the early attempts to analyze the formativ e pro cess of scientiﬁc collab orations came from ph ysics when Newman proposed a non-parametric method to estimate the preferen tial attac hment (P A) and tran- sitivit y functions from scientiﬁc collab oration net works (Newman, 2001a). P A (Price, 1965; Merton, 1968; Price, 1976; Alb ert & Barabási, 1999) and transitivity (Heider, 1946; Holland & Leinhardt, 1970, 1971, 1976) are tw o of the most fundamen tal mec hanisms of net work gro wth. On the one hand, P A is a phenomenon concerning the ﬁrst-order structure of a net work. In P A, the higher the n umber of co-authors a scientist already has, the more collab orators they will form. On the other hand, transitivity concerns the second-order structure: co-authors of co-authors are likely to collab orate. Newman’s metho d is non-parametric in the sense that it do es not assume an y forms for either the P A or transitivity function. The metho d, how ev er, considers ∗ Corresponding author: minoue@sys.i.kyoto-u.ac.jp † These authors contributed equally . 1 eac h phenomenon in isolation and thus completely ignores any entanglemen ts of the tw o phenomena, which are entirely plausible in real-world netw orks. Apart from this non-parametric-in-isolation approac h, a join t-estimation approac h, in which the t wo phenomena are considered sim ultaneously , has b een attempted recen tly (Kronegger, Mali, F erligo j, & Doreian, 2012; F erligo j, Kronegger, Mali, Snijders, & Doreian, 2015; Zinilli, 2016), all under the framework of sto c hastic actor-based mo dels (Snijders, 2001). This approach is, how ev er, inherently parametric: it assumes the forms of the P A and transitivity functions a priori , th us risks losing ﬁne details of the t wo phenomena, details that are diﬃcult to b e captured by an y parametric functional forms. W e argue that the ideal method, whenever p ossible, should com bine the b est of b oth w orlds: it should consider b oth phenomena simultaneously , and it should not assume any functional forms for them. Our main contributions are three-fold. In our ﬁrst contribution, we prop ose a netw ork growth mo del that com bines non-parametric P A and transitivity functions and derive an eﬃcient Minorize-Maximization (MM) algorithm (Hunter & Lange, 2000) to estimate them sim ultaneously . This iterative algorithm is guaran teed to increase the model’s log-likelihoo d p er iteration. W e demonstrate through simulated examples that our approac h are capable of capturing complex details of P A and transitivit y , while the conv en tional approaches cannot (cf. Fig. 1). W e also p erform a systematic simulation to conﬁrm the p erformance of our algorithm. In our second contribution, we suggest a metho d to quantify the amount of con tributions of P A and transitivit y in the growth pro cess of a netw ork. Our quantiﬁcation exploits the probabilistic dynamic pro cess induced by the netw ork growth form ula and can b e e asily extended to other netw ork growth mec hanisms. In our third contribution, we apply the prop osed metho ds to tw o real-w orld co-authorship net works and unco ver some interesting prop erties that would b e unav ailable under conv entional approaches. In particular, as con trast to the typical pow er-la w functional form assumption, the transitivity eﬀect seems to be highly non-p o w er-law. W e also found that transitivit y dominated P A in the growth pro cesses of b oth netw orks. This suggests the importance of indirect relations in scientiﬁc creative processes: it do es matter who y our collab orators collab orate with. All the prop osed metho ds are implemen ted in the R pack age FoFaF . The rest of the paper is organized as follo ws. The proposed method is discussed in details in Section 2. In Section 3, w e discuss ho w to exploit the probabilistic dynamic pro cess imp osed by the mo del formula to sensibly quan tify the amount of con tributions of P A and transitivity . W e apply the prop osed metho d to t wo real-world collab oration net works and discuss the results in Section 4. Concluding remarks are given in Section 5. 2 Proposed Method W e ﬁrst review brieﬂy the history of P A and transitivit y mo delling and then describ e our netw ork growth mo del that incorp orates non-parametric P A and transitivity functions. W e also explain its relation to some con ven tional net work mo dels. W e then discuss maxim um partial lik eliho o d estimation for the model and pro vide tw o sim ulated examples to demonstrate ho w our metho d works. W e conclude the section with a systematic simulation to inv estigate the p erformance of the prop osed metho d. 2.1 P A and tr ansitivity mo del ling The notion of a rich-get-ric her phenomenon has its ro ot in the theoretical w orks of Y ule (Y ule, 1925) and Simon (Simon, 1955). Its status as a fundamental pro cess in informetrics was cemented by revolutionary w orks of Merton (Merton, 1968) and Price (Price, 1965, 1976). The term “preferential attac hment” was coined b y Barabási and Alb ert when they re-discov ered the mechanism in the context of complex netw orks (Alb ert & Barabási, 1999). In P A, the probabilit y a no de with degree k receives a new edge is prop ortional to its P A function A k . When A k is an increasing function on av erage, the P A eﬀect exists: a no de with a high degree k is more lik ely to receive more new connections. T o estimate the P A phenomenon in a netw ork is to estimate the function A k giv en that netw ork’s growth data. V arious non-parametric approaches (Newman, 2001a; Pham, Sheridan, & Shimo daira, 2015) and parametric ones (Massen & Jonathan, 2007; Gómez, Kapp en, & Kalten brunner, 2 2011) hav e b een prop osed. In parametric metho ds, p o w er-law function forms, e.g., A k = ( k + 1) α , are often emplo yed (Krapivsky , Ro dgers, & Redner, 2001). T ransitivit y started out as a concept in psychology (Heider, 1946) and was dev elop ed theoretically in the framew ork of so cial net work analysis by Holland and Leinhardt in the 1970s (Holland & Leinhardt, 1970, 1971, 1976). It was introduced to the informetrician’s modelling to olbox in 2001 when Newman pro vided a heuristic metho d to estimate the transitivity function in real-w orld co-authorship netw orks (Newman, 2001a) and Snijders introduced his now-famous sto c hastic actor-based models that include transitivit y as a netw ork formation mechanism (Snijders, 2001). In transitivity , the probabilit y that a pair of tw o no des with b common neigh b ors is proportional to the transitivit y function B b . When B b is an increasing function on av erage, the transitivity eﬀect is at play: the more common neighbors a pair of no des share, the easier for them to connect. Similar to the case of P A, non-parametric approaches (Newman, 2001a) and parametric approac hes (Kronegger et al., 2012; F erligo j et al., 2015; Zinilli, 2016) ha ve b een prop osed to estimate B b from observed net work data. W e re-emphasize that all existing metho ds either consider P A or transitivit y in isolation, or are of a parametric nature. 2.2 Pr op ose d network mo del Our mo del can b e viewed as a discrete Marko v mo del, whic h is a p opular framew ork in so cial netw ork mo deling (Holland & Leinhardt, 1977). Let G t denote the netw ork at time t . Starting from a seed netw ork G 0 , at each time-step t = 1 , · · · , T , v ( t ) new no des and m ( t ) new edges are added to G t − 1 to form G t . In particular, at the onset of time-step t , let k i ( t ) denote the degree of no de i and b ij ( t ) the num ber of common neigh b ors b et ween no des i and j in G t − 1 . Our mo del dictates that the probability that a new edge emerges b et w een no de i and no de j at time step t is indep endent of other new edges at that time and is equal to P ij ( t ) ∝ A k i ( t ) A k j ( t ) B b ij ( t ) , (1) where A k is the P A function of the degree k and B b is the transitivit y function of the num ber of common neigh b ors b . In other words, the un-ordered pair of the t wo ends ( i, j ) of a new edge follo ws a categorical distribution o v er all un-ordered pairs of no des existing at time t . Each pair’s w eight is proportional to the pro duct of P A and transitivit y v alues of that pair at t . Thus this form ulation can capture simultaneously P A and transitivity eﬀects. Supp ose that the join t distribution of v ( t ) , m ( t ) , and G 0 is go verned by some parameter vector θ . W e mak e a standard assumption, which is virtually employ ed in all netw ork models, that θ is indep enden t of A k and B b . As w e shall see later, this indep endence assumption enables a partial likelihoo d approach in which one can ignore θ in estimating A k and B b . Next we discuss the relation betw een the mo del in Eq. (1) and mo dels in the literature. 2.3 R elate d mo dels As explained earlier, while there are mo dels that either include a non-parametric A k function (Pham et al., 2015) or a non-parametric B b function (Newman, 2001a), Eq. (1) is the ﬁrst to combine b oth non-parametric functions. It includes as sp ecial cases some well-kno wn complex netw ork mo dels, such as the Barabási-Alb ert mo del (Albert & Barabási, 1999) or the Erdös-Rényi-with-gro wth mo del (Callaw a y , Hopcroft, Klein b erg, Newman, & Strogatz, 2001). The well-kno wn sto c hastic actor-based mo del (Snijders, 2001, 2017; Ripley , Snijders, Bo da, Vörös, & Preciado, 2018) has b een emplo yed in studies of scientiﬁc co-authorship net works (Kronegger et al., 2012; F erligo j et al., 2015; Zinilli, 2016). It is, how ev er, not clear how to con vert the P A and transitivity functions in our probabilistic setting to those in the setting of sto c hastic actor-based mo del, since the tw o mo dels are deﬁned diﬀerently . W e note that the P A and transitivity phenomena are virtually mo delled in a parametric manner in the sto c hastic actor-based mo del. 3 One key assumption of the mo del in Eq. (1) is that A k and B b do not dep end on t , i.e., they stay unc hanged throughout the gro wth pro cess. While this time inv ariability assumption is standard and employ ed in all the net work mo dels mentioned previously , there is a growing b ody of mo dels departing from it. A time-v arying A k has b een discussed in the context of citation netw orks (Csárdi, Strandburg, Zalányi, T obo c hnik, & Érdi, 2007; W ang, Y u, & Y u, 2008; Medo, Cimini, & Gualdi, 2011), while diﬀerent parametric forms for such A k are studied b y Medo (Medo, 2014). More recently , the R pac k age tergm (Krivitsky & Handco c k, 2019) allows the estimation of time-v arying parametric P A and transitivity functions. There is, how ever, no existing work that employs time-v arying and non-parametric mo delling sim ulataneously , presumably for the reason that a h uge amount of data is probably needed in such a mo del. It is likely that in practical situations one alwa ys has to choose b et ween non-parametric mo delling and time-v arying mo delling. W e demonstrate in Section 4.4 that the time inv ariability assumption do indeed hold in all real-world netw orks analyzed in this pap er. 2.4 Maximum Partial Likeliho o d Estimation Here we describ e how to estimate the parameters of the mo del in Eq. (1). Denote D = { G 0 , G 1 , · · · , G T } the observ ed data, and let A = [ A 0 , A 1 , . . . , A k max ] with A k > 0 b e the P A function and B = [ B 0 , B 1 , . . . , B b max ] with B b > 0 b e the transitivit y function. Here k max is the maximum degree and b max is the maximum n umber of common neighbors betw een a pair of nodes. Giv en D , our goal is to estimate A and B without assuming any sp eciﬁc functional forms, an approach we call “non-parametric”. With the indep endence assumption stated in the previous section, the part of the log-likelihoo d that con tains A and B and the part of the log-likelihoo d that contains θ are separable, i.e., L ( A , B , θ | D ) = L ( A , B | D ) + L ( θ | D ) holds, where L denotes the log-lik eliho od function. This allo ws us to ignore θ in estimating A k and B b . Starting from Eq. (1), with some calculations w e arrive at L ( A , B | D ) = T X t =1 k max X k 1 =0 k max X k 2 = k 1 b max X b =0 m k 1 ,k 2 ,b ( t ) log A k 1 A k 2 B b − T X t =1 m ( t ) log k max X k 1 =0 k max X k 2 = k 1 b max X b =0 n k 1 ,k 2 ,b ( t ) A k 1 A k 2 B b ! , (2) where n k 1 ,k 2 ,b ( t ) is the n um b er of no de pairs ( i, j ) that satisfy ( k i ( t ) , k j ( t ) , b ij ( t )) = ( k 1 , k 2 , b ) with k 1 ≤ k 2 at time-step t , and m k 1 ,k 2 ,b ( t ) is the n umber of new edges betw een suc h no de pairs. The num b er of new edges at time-step t is then expressed as m ( t ) = P k max k 1 =0 P k max k 2 = k 1 P b max b =0 m k 1 ,k 2 ,b ( t ) . Although analytically maximizing L ( A , B | D ) is intractable, we can derive an eﬃcient MM algorithm that iteratively up dates A and B . See App endix A for its deriv ation. W e also denote the ﬁnal result of the algorithm ˆ A and ˆ B , estimates of A and B . 2.5 Il lustr ate d examples W e demonstrate the eﬀectiveness of our metho d in tw o examples. In the ﬁrst example, we simulate a netw ork using Eq. (1) with A k = 3 log (max k, 1) α + 1 and B b = 3 log (max b, 1) α + 1 . This functional form, which deviates substantially from the p o w er-law form, has b een used to demonstrate the working of non-parametric P A estimation metho ds (Pham et al., 2015). The net work has a total of N = 1000 no des. At each time-step, one new node is added to the net work with m ( t ) = 5 new edges. In the second example, we ﬁrst estimate A k and B b b y applying our prop osed metho d to a real-w orld co-authorship netw ork b etw een authors in statistics journals (cf. Section 4), and then use these parameter v alues for simulating a netw ork based on Eq. (1). In the pro cess, we kept the initial condition and the num b er of new no des and new edges at each time-step exactly as what were observ ed in the real-w orld netw ork. W e apply three estimation metho ds to each simulated net work. The ﬁrst is our prop osed metho d, which join tly estimates the non-parametric functions A k and B b . The second is a joint parametric metho d, which join tly estimates P A and transitivity using the simplistic functional forms A k = ( k + 1) α and B b = ( b + 1) α . 4 This parametric formation is used widely in v arious P A and transitivity estimation methods (Massen & Jonathan, 2007; Gómez et al., 2011). The third metho d ignores the join t existence of P A and transitivity: it consists of tw o sub-metho ds: the ﬁrst one is a non-parametric metho d for estimating P A in isolation (Pham et al., 2015), and the second one is a maximum likelihoo d version of a non-parametric metho d for estimating transitivit y in isolation (Newman, 2001a). The results are shown in Fig. 1. In b oth examples, while the joint parametric metho d somehow succeeded in obtaining the general trends of A k and B b , it failed to capture the deviations from the p o w er-law form in the tw o functions. The non-parametric-in-isolation metho d grossly ov er-estimated b oth P A and transitivit y mec hanisms, due to its complete disregard of their join t existence. The prop osed metho d work ed reasonably w ell, succeeding in capturing the P A and transitivity functions in ﬁne details. 2.6 Simulation study W e p erform a syste matic simulation to ev aluate how well the prop osed metho d can estimate A k and B b . W e c ho ose A k = ( k + 1) α and B b = ( b + 1) β as the true functions. This p o wer-la w functional form has b een used in previous sim ulation studies of P A estimation metho ds (Pham et al., 2015; Pham, Sheridan, & Shimo daira, to app ear). W e consider ﬁve v alues ( 0 , 0 . 5 , 1 , 1 . 5 , and 2 ) for the exp onen t α and six v alues ( 0 , 0 . 5 , 1 , 1 . 5 , 2 , 2 . 5 , and 3 ) for the exp onen t β . These are the ranges of α and β observ ed in Section 4.2. F or eac h combination of α and β , we simulate 10 netw orks. In each netw ork, the total num b er of no des is 1000 and there are ﬁve new edges at each time-step. F or each simulated netw ork, we ﬁrst estimate A k and B b as describ ed in the previous section and then ﬁt ( k + 1) α and ( b + 1) β to the estimation results to ﬁnd the estimates of α and β . In other words, we indirectly measure how well A k and B b are estimated b y lo oking at how well α and β are estimated: if the estimates of α and β are go o d, the estimations of A k and B b are likely successful, to o. Figure 2 shows the true and estimated v alues of α and β . The prop osed metho d successfully recov ers α and β in all combinations. This implies that the estimation of A k and B b w ent w ell. 3 Quantifying the amount of contributions of P A and transitivity Our mo del leads to a simple answer to a previously-unraised yet fascinating question: how can one compare the amount of contributions of P A and transitivity in the growth pro cess of a net work? T o the b est of our kno wledge, no one has attempted to quan tify the amoun t of contributions of diﬀerent netw ork growth mec hanisms. T o answer this question, one must ﬁnd a meaningful wa y to deﬁne the amount of contributions so that they are computable and comparable. W e achiev e this by considering the dynamic pro cess expressed in Eq. (1). This probabilistic dynamic pro cess suggests that the v ariability of the P A/transitivity v alues in the set of no de pairs is a sensible measure for the amount of contribution of P A/transitivity . Let us deﬁne the amount of con tributions of P A and transitivity at time-step t . Denote them as s P A ( t ) and s trans ( t ) , resp ectiv ely . T aking logarithm of b oth sides of Eq. (1), one gets: log 2 P ij ( t ) = log 2 [ A k i ( t ) A k j ( t ) ] + log 2 B b ij ( t ) − C ( t ) , (3) with C ( t ) = log 2 P i,j A k i ( t ) A k j ( t ) B b ij ( t ) is the logarithm of the normalizing constant at time-step t and is indep enden t of i and j . Equation (3) implies that, lo oking lo cally at a no de pair ( i, j ) , P A and transitivit y con tribute to log 2 P ij ( t ) by the amoun ts of log 2 [ A k i ( t ) A k j ( t ) ] and log 2 B b ij ( t ) , resp ectiv ely; the amount of con tribution is measured by log 2 fold-c hanges. What is imp ortan t globally is, ho wev er, the relativ e sizes of all log 2 [ A k i ( t ) A k j ( t ) ] and log 2 B b ij ( t ) at that time-step t . F or example, consider the case when A k = 1 , ∀ k . In this case, the the v alue of log 2 [ A k i ( t ) A k j ( t ) ] will b e the same for all no de pairs, and th us P A w ould hav e no role in determining whic h pair w ould get a new edge. By considering the case when B b = 1 , ∀ b , one can see that the same reasoning should also apply to log 2 B ij ( t ) . 5 10 0 10 1 10 2 10 3 10 0 10 1 10 2 10 3 10 4 10 0 10 1 10 2 2 × 10 2 10 0 10 1 10 2 10 3 10 4 10 5 10 6 1 10 100 200 1 10 100 200 1 10 100 1 10 20 Degree k + 1 Number of common neighbors b + 1 Degree k + 1 Number of common neighbors b + 1 PA function A k T ransitivity function B b PA function A k T ransitivity function B b T rue Joint nonparametric (proposed) Joint parametric Nonparametric in isolation T rue Joint nonparametric (proposed) Joint parametric Nonparametric in isolation T rue Joint nonparametric (proposed) Joint parametric Nonparametric in isolation T rue Joint nonparametric (proposed) Joint parametric Nonparametric in isolation A B C D Figure 1: Prop osed metho d compared with conv entional metho ds in estimating the P A and transitivity functions in t wo sim ulated netw orks. A , B : the estimated P A and transitivity functions from a sim ulated net work with A k = 3 log(max k , 1) α + 1 and B b = 3 log(max b, 1) α + 1 . C , D : the estimated P A and transitivity functions from a simulated netw ork in which the true A k and B b are the A k and B b estimated from a real- w orld co-authorship netw ork b et ween authors in statistics journals. The interv al at each p oin t of the prop osed metho d represents the standard deviation obtained as a by-product of the maxim um likelihoo d estimation. In b oth netw orks, the prop osed metho d successfully captured the ﬁne details of P A and transitivity . 6 0.0 0.5 1.0 1.5 2.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 β α T rue Estimated Figure 2: T rue and estimated exp onen ts α and β from the p o w er-law forms A k = ( k + 1) α and B b = ( b + 1) β . The exp onen ts are estimated by a tw o-step pro cedure: ﬁrst A k and B b are estimated join tly by the prop osed metho d, then ( k + 1) α and ( b + 1) β are ﬁtted to the estimated results b y least square. Eac h estimated p oin t is the mean of the results of 10 sim ulations, with the error bars display the standard errors of the mean. This observ ation prompts us to deﬁne s P A ( t ) and s trans ( t ) as the standard deviations of log 2 [ A k i ( t ) A k j ( t ) ] and log 2 B b ij ( t ) , respectively , when ( i, j ) is sampled based on Eq. (1). Let U ( t ) b e the set formed by all no de pairs ( i, j ) that exist at time-step t . The probability P ij ( t ) in Eq. (1) can b e written explicitly as P ij ( t ) = A k i ( t ) A k j ( t ) B b ij ( t ) / X ( i,j ) ∈ U ( t ) A k i ( t ) A k j ( t ) B b ij ( t ) . The aforementioned standard deviations can b e calculated as follo ws. s P A ( t ) :=   X ( i,j ) ∈ U ( t ) P ij ( t )  log 2 [ A k i ( t ) A k j ( t ) ] − E P A ( t )  2   1 / 2 , (4) s trans ( t ) :=   X ( i,j ) ∈ U ( t ) P ij ( t )  log 2 B b ij ( t ) − E trans ( t )  2   1 / 2 , (5) in which E P A ( t ) := P ( i,j ) ∈ U ( t ) P ij ( t ) log 2 [ A k i ( t ) A k j ( t ) ] , and E trans ( t ) := P ( i,j ) ∈ U ( t ) P ij ( t ) log 2 B b ij ( t ) . Al- though A k and B b are only deﬁned up to multiplicativ e constants, the standard deviations of log 2 [ A k i ( t ) A k j ( t ) ] and log 2 B b ij ( t ) are inv ariant to constan t factors in A k and B b , and th us s P A ( t ) and s trans ( t ) are well-deﬁned. The use of base-2 logarithms allo ws us to in terpret s P A ( t ) and s trans ( t ) as log 2 fold-c hanges; a contribution v alue of s indicates a change of the probabilit y 2 s times in Eq. (1). W e also note that, although A k and B b are assumed to b e time-in v ariant, k i ( t ) , b ij ( t ) , and P ij ( t ) c hange ov er time, thus leading to dynamic nature of s P A ( t ) and s trans ( t ) . In real-world situations, what are a v ailable to us is not the true v alues A and B , but only their estimates ˆ A and ˆ B . W e plug these estimates into Eqs. (4) and (5) to obtain ˆ s P A ( t ) and ˆ s trans ( t ) , estimates of s P A ( t ) and s trans ( t ) . 7 The requiremen t that ( i, j ) is sampled from Eq. (1) is needed to faithfully reﬂect the probabilistic dynamic pro cess and leads to the following interpretation of s P A ( t ) and s trans ( t ) . Assume that at some time-step t we observ ed m ( t ) ≥ 2 new edges whose end p oin ts are ( i 1 , j 1 ) , · · · , ( i m ( t ) , j m ( t ) ) . Consider the sample standard deviation of log 2 ( B b i l j l ( t ) ) for l = 1 , · · · , m ( t ) , which is deﬁned as h trans ( t ) :=    1 m ( t ) − 1 m ( t ) X l =1 log 2 ( B b i l j l ( t ) ) 2 − 1 m ( t )( m ( t ) − 1)   m ( t ) X l =1 log 2 ( B b i l j l ( t ) )   2    1 / 2 . Similarly , deﬁne h P A ( t ) as the sample standard deviation of log 2 ( A k i l ( t ) A k j l ( t ) ) for l = 1 , · · · , m ( t ) . Standard calculations then give us s trans ( t ) 2 = E h trans ( t ) 2 and s P A ( t ) 2 = E h P A ( t ) 2 . Plugging in the estimates ˆ A and ˆ B , we can view ˆ s P A ( t ) and ˆ s trans ( t ) as the estimates of the exp ectations of the sample standard deviations in P A and transitivity v alues observ ed at the end p oints of new edges at time-step t . As w e shall see in Section 4.3, this interpretation also giv es us a mean to visualize how well the mo del ﬁts an observed net work. Finally , we note that this quantiﬁcation approach is not limited to P A and transitivit y . Giv en a growth form ula in which all gro wth mechanisms are com bined in a multiplicativ e wa y , for example, as in Eq. (1), the standard deviations of the logarithmic v alue of each growth mechanism can b e used as a measure of the con tribution of that mechanism. 4 Results and Discussion 4.1 Two c o-authorship networks W e applied our proposed method to tw o diﬀerent scien tiﬁc co-authorship net works: SMJ (Ronda-Pupo & Pham, 2018) and ST A (Ji & Jin, 2016), where no des represent authors and links represent co-authorship in pap ers. SMJ includes pap ers published in the Strategic Managemen t Journal, considered to b e one of the top journals in strategy and management, from 1980 to 2017. ST A includes pap ers in four statistics journals: the Journal of the American Statistical Asso ciation, the Journal of the Ro y al Statistical Society (Series B), the Annals of Statistics, and Biometrik a, from 2003 to 2012. These journals are generally considered the top journals in statistics. New collab orations and rep eated collab orations are p ooled together in b oth net works. The time resolution is c hosen to b e one-year in SMJ and six-month in ST A. T able 1 shows the summary statistics for tw o netw orks. The ratios ∆ | V | / | V | and ∆ | E | / | E | are b oth close to one, whic h indicate that each netw ork grows out from a very small initial netw ork. Since the n um b er of new edges ∆ | V | is lo osely corresp onding to the n um b er of a v ailable data in our statistical model, ST A has the biggest amoun t of data. The clustering co eﬃcien ts in b oth net works are rather high, but nev ertheless fall in the normal range observed in real-world net works (Newman, 2001b). T able 1: Summary statistics for t wo scientiﬁc co-authorship netw orks. | V | and | E | are the total num b ers of no des and edges in the ﬁnal snapshot, respectively . T is the num ber of time-steps. ∆ | V | and ∆ | E | are the incremen ts of nodes and edges after the initial snapshot, respectively . C is the clustering co eﬃcien t of the ﬁnal snapshot. k max is the maxim um degree and b max is the maxim um num ber of common neighbors. Dataset | V | | E | T ∆ | V | ∆ | E | C k max b max SMJ 2704 4131 24 1991 3538 0.378 34 15 ST A 3607 6808 20 3261 6509 0.320 65 19 It is instructive to lo ok at more ﬁne-grained statistics. Figures 3A and B sho w the distributions of the n umber of collab orators k in the ﬁnal snapshot of SMJ and ST A, resp ectively . Since the degree distributions in t wo datasets exhibit signs of hea vy-tails, w e ﬁtted one of the most representativ e class of heavy-tail distribution, the p o wer-la w distribution k − γ deg , to these degree distributions by Clauset’s pro cedure (Clauset, 8 Shalizi, & Newman, 2009). This pro cedure ﬁrst c ho oses the minimum degree k min from whic h the p o wer- la w holds, and then uses a maxim um likelihoo d approach to estimate the p o wer-la w exp onen t γ deg . The estimated p o wer-la w exp onen ts for degree distributions in SMJ and ST A are 2.97 and 3.35, respectively . These v alues fall in the range of 2 < γ deg < 4 , whic h is a commonly observ ed range for γ deg in real-w orld net works (Newman, 2005; Clauset et al., 2009). The situation with the distributions of b ij is, ho wev er, less clear. Figures 3C and D show the distributions of the num b er of no de pairs with b common neighbors in the ﬁnal snapshot of SMJ and ST A, resp ectiv ely . W e also ﬁtted the pow er-law distribution b − γ cn to the distributions of b b y Clauset’s pro cedure and found that γ cn in SMJ and ST A are 2.99 and 3.22, resp ectiv ely . The p o wer-la w form, ho w ever, seems to b e not a very go od ﬁt for these distributions. The ranges of b in tw o distributions seem to b e to o narro w to say anything deﬁnitely about the tails. T o the b est of our kno wledge, no previous w ork has studied the distributions of b ij , either in co-authorship netw orks or any other netw ork types. Since ﬁguring out the distributional form of b ij is not our main goal, we lea ve this task as future w ork. 4.2 P A and tr ansitivity eﬀe cts ar e highly non-p ower-law Applying the prop osed metho d to tw o data-sets, we found that the estimated P A and transitivity functions displa y non-p o w er-law and complex trends (Figures 4). In both net w orks, the v alue of A k increases on a verage in k , whic h implies the existence of the P A phenomenon: the more collab orators an author gets, the more likely they w ould get a new one. This is consisten t with previous results in the literature, in which the phenomenon has b een found in collab oration net works in diverse ﬁelds (Newman, 2001a; Milo jević, 2010; Kronegger et al., 2012; F erligo j et al., 2015). The situation with the transitivity functions is more complex. In b oth SMJ and ST A, there is a h uge jump when b go es from 0 to 1 : B 1 /B 0 is ab out 60 times in SMJ and almost 100 times in ST A. These jumps in B b v alues ha ve b een previously observed in co-authorship netw orks (Newman, 2001a; Milo jević, 2010). After this initial jump, B b , how ev er, sta ys relatively horizontal in b oth SMJ and ST A, b efore slightly increases again in SMJ. This complex departure from the pow er-law form renders any statement ab out a univ ersal transitivit y eﬀect moot. The v alue of B b at every b > 0 is, how ever, at least one order of magnitude higher than B 0 , which suggests that, co-authors of co-authors seem to b e at least ten times more lik ely to b ecome new co-authors, comparing with the case when there is no m utual co-author. It is informative to supplement the non-parametric analysis with a parametric one, since the theoretical literature oﬀers many insights in this context. Here, w e follo w the standard practice and ﬁt the p o wer-la w functional forms A k = ( k + 1) α and B b = ( b + 1) β (Krapivsky et al., 2001; Jeong, Néda, & Barabási, 2003; Pham et al., 2015). T o ﬁnd the P A attachmen t exp onen t α and the transitivity attachmen t exp onen t β , w e substitute these forms to Eq. (1) and numerically maximizes the resulted log-lik eliho od function with resp ected to α and β . T able 2 shows the estimated v alues of α and β . T able 2: Estimated v alues of the P A attachmen t exp onen t α and the transitivity attac hment exp onen t β in t wo net works. The v alues are estimated by maxim um p artial lik eliho od estimation. The interv al given at eac h estimated v alue is tw o-sigma. Net work P A attachmen t exp onen t α T ransitivit y attachmen t exp onen t β SMJ 0 . 93 ( ± 0 . 04) 2 . 50 ( ± 0 . 07) ST A 0 . 84 ( ± 0 . 03) 3 . 05 ( ± 0 . 04) The P A attachmen t α in b oth netw orks are in the sub-linear region, i.e., 0 < α < 1 , which is a frequently observ ed range in real-world net works (Newman, 2001a; Pham et al., 2015; Ronda-Pup o & Pham, 2018). While this region has been shown to giv e rise to a heavy-tail degree distribution when there is only P A at pla y (Krapivsky et al., 2001), there is no such theoretical result when P A jointly exists with transitivity . It is, ho wev er, not entirely unreasonable to exp ect that the sub-linear v alue of α is resp onsible for the observed hea vy-tail degree distributions in Figs. 3A and B. 9 γ d e g = 2.97 γ d e g = 3.35 γ c n = 2.99 γ c n = 3.22 10 0 10 1 10 2 10 3 10 0 10 1 10 2 10 3 10 0 10 2 10 4 10 7 10 0 10 2 10 4 10 7 1 5 10 40 1 7 30 70 1 3 3 20 1 3 3 20 Number of collaborators k + 1 Number of collaborators k + 1 Number of common neighbors b + 1 Number of common neighbors b + 1 Number of authors Number of authors Number of pairs of authors Number of pairs of authors B C D A SMJ S TA SMJ S TA Figure 3: The distributions of k i and b ij in the ﬁnal snapshots of t wo netw orks. A , B : the degree distributions in the ﬁnal snapshots of SMJ and ST A, resp ectiv ely . These distributions b oth displa y typical shap es of hea vy- tails. In eac h panel, the solid line presents the ﬁtted p o wer-la w distribution, and the dotted line indicates where the minimum degree k min is set. C , D : the distributions of the num b er of pairs with b common neigh b ors in the ﬁnal snapshots of SMJ and ST A, resp ectiv ely . In each panel, the solid line presen ts the ﬁtted p o w er-law distribution, and the dotted line indicates where the minimum num b er of common neighbours b min is set. In contrast to the degree distributions, the ranges of these distributions of b ij are to o narrow for any signs of hea vy-tails to emerge. 10 SMJ STA SMJ STA 1 5 10 30 10 0 10 1 10 2 10 3 10 4 1 5 15 70 1 3 10 25 Degree k + 1 Number of common neighbours b + 1 PA function A k T ransitivity function B b A B Figure 4: Non-parametric joint estimation of the preferen tial attac hment A k and transitivit y functions B b in SMJ and ST A. The v ertical bar at each estimated v alue indicates the ± 2 σ conﬁdence interv al. A : the estimated P A functions are increasing on av erage in b oth netw orks. This implies the existence of the P A phenomenon: a highly-connected author is likely to get more new collab orations than a lowly-connected one. B : The transitivity eﬀect is highly non-p o w er-law in b oth netw orks. While B b greatly increases when b c hanges from 0 to 1 in b oth netw orks, after this initial h uge jump, B b sta ys relatively horizontal in SMJ and only slightly increases in ST A. The huge jump at b = 1 implies that co-authors of co-authors is at least ten times more lik ely to b ecome new co-authors, comparing with the case when there is no mutual co-author. 11 The transitivit y attachmen t exp onen ts β are b oth greater than 1 , indicating an exp onen tially faster growth rate of the transitivit y function comparing to the P A function. This is eviden t in, for example, ST A: while A 10 is less than 10 , B 10 is already larger than 100 . T o the best of our kno wledge, there is no theoretical result on the eﬀect of β on the structure of a gro wing net w ork, even for the supposedly simpler case when there is only transitivity . Ov erall, the results in this section indicate the join t existence of P A and transitivit y phenomena in b oth net works. Our non-parametric approach revealed that a conv en tional p o w er-law functional form in a parametric approach may not b e the b est to describ e the tw o phenomena. F or A k , the p o w er-law form ﬁts reasonably w ell the lo w-degree part, but cannot capture the deviations from the p o wer-la w form in the high-degree part. F or B b , the p o w er-law form is ev en less suitable. W e hop e our non-parametric ﬁndings could oﬀer hin ts on more suitable parametric forms for A k and B b . 4.3 T r ansitivity dominate d P A in b oth networks After obtaining the estimates ˆ A and ˆ B , we can compute the amount of contributions of P A and transitivity in the gro wth pro cess of each net work by plugging these estimates into Eqs. (4) and (5). The estimated amoun t of contributions ˆ s P A ( t ) and ˆ s trans ( t ) are sho wn in Fig. 5 as solid lines. 0 1 2 3 4 0 10 15 23 0 5 10 19 t t Contribution Estimated PA contrib. Estimated trans. contrib. Simulated PA contrib. Simulated trans. contrib. SMJ S TA Figure 5: Estimated and simulated contributions of P A and transitivity at each time-step in SMJ and ST A. The amount of con tribution is measured by log 2 fold-c hanges in the model Eq. (1). The solid lines are the estimated contributions ˆ s P A ( t ) and ˆ s trans ( t ) calculated by plugging the estimates ˆ A and ˆ B in to Eqs. (4) and (5). Each dotted line is the a v erage of the corresp onding true con tributions of 100 simulated net works that use ˆ A and ˆ B as the true functions. The band around each depicts the ± tw o times the p opulation standard deviation of the sim ulated con tributions. The solid and dotted lines agree well with each other, whic h suggests that ˆ s P A ( t ) and ˆ s trans ( t ) are reliable. In each netw ork, ˆ s trans ( t ) is greater than ˆ s P A ( t ) for all t . One might ask whether these tendencies hold for the true v alues s P A ( t ) and s trans ( t ) as w ell, or they are just artifacts arising when w e plug ˆ A and ˆ B 12 in to Eqs. (4) and (5). W e demonstrate by simulations that, if the true A and B are close to the estimates ˆ A and ˆ B , s P A ( t ) and s trans ( t ) are similar to ˆ s P A ( t ) and ˆ s trans ( t ) . F or each real netw ork, w e sim ulated 100 net works based on Eq. (1) using the estimates ˆ A and ˆ B as true functions. W e k ept all the aspects of the gro wth pro cess that are not gov erned by Eq. (1) the same as what observed in the real netw ork. This includes using the observed initial graph and the observ ed num b ers of new no des and new edges at each time-step in the sim ulation. Sin ce ˆ A and ˆ B are the true P A and transitivity functions for each sim ulated netw ork, w e w ere able to calculate the true con tributions of P A and transitivity in eac h simulated netw ork using Eqs. (4) and (5). The b eha viours of the simulated contributions are very similar to the estimated contributions ˆ s P A ( t ) and ˆ s trans ( t ) , which indicates that the latter are likely to b e reliable. As explained in Section 3, one can interpret the con tributions ˆ s P A ( t ) and ˆ s trans ( t ) as estimates of the exp ectations of ˆ h P A ( t ) and ˆ h trans ( t ) , the sample standard deviations of P A and transitivity v alues at end p oin ts of actually-observed new edges at time-step t . This is expressed as E ˆ h P A ( t ) ≈ ˆ s P A ( t ); E ˆ h trans ( t ) ≈ ˆ s trans ( t ) , where the estimates ˆ s P A ( t ) and ˆ s trans ( t ) slightly o verestimate the exp ectations, b ecause E ˆ h trans ( t ) ≤ ( E ˆ h trans ( t ) 2 ) 1 / 2 ≈ ˆ s trans ( t ) . Figure 6 shows the observed v alues ˆ h P A ( t ) and ˆ h trans ( t ) , the estimates ˆ s P A ( t ) and ˆ s trans ( t ) of their exp ecta- tions, and the estimates of their standard deviations (App endix B). The observed v alues generally fall within t wo standard deviations around the estimates of their exp ectations, which implies that Eq. (1) is consistent with the observ ed data. 0.1 1.0 3.0 4.0 1 10 15 20 23 1 5 10 15 19 t t Contr ib ution SMJ S TA PA contrib. T rans. contrib. Sample sd of PA values Sample sd of trans. values Figure 6: The sample standard deviations of P A and transitivity v alues at end p oin ts of actually-observ ed new edges, ˆ h P A ( t ) and ˆ h trans ( t ) , agree w ell with their estimated exp ectations, ˆ s P A ( t ) and ˆ s trans ( t ) . This implies that the statistical mo del is consistent with the observed data. The band around ˆ s P A ( t ) depicts the in terv al of ± tw o standard deviations of ˆ h P A ( t ) . The band around ˆ s trans ( t ) is similar. Ov erall, the data indicate the gov erning role of transitivity in the growth pro cesses of b oth netw orks: it is mostly the diﬀerences in the transitivity v alues that decide where new collab orations are formed. This 13 in tuitive result is consistent with previous results whic h found that common neigh b ors are more eﬀectiv e than P A at link prediction in co-authorship netw orks (Liben-Now ell & Kleinberg, 2007). If P A w as what dominates, a scientist would only need to indiscriminately acquire as muc h collaborators as p ossible in order to bo ost their n umber of collab orators in the future. In ligh t of the curren t result, they , how ever, migh t need to be more selectiv e, since a collab orator who has collab orated with a lot of p eople might oﬀer more adv antages. 4.4 Diagnosis: time-invarianc e and go o dness-of-ﬁt Finally , we consider t wo questions that are critical to our real-w orld data analysis. The ﬁrst concerns the v alidity of the time-in v ariance assumption of A k and B b in t wo net works: in eac h netw ork, do A k and B b sta y relativ ely unc hanged throughout the growth process? The second is whether Eq. (1) is a reasonably go od mo del for the netw orks. Although Fig. (6) already hinted at an aﬃrmative answer for b oth questions, w e examine each question in ﬁner details. 4.4.1 Time in v ariance of the P A and transitivit y functions One wa y to answer the ﬁrst question is to compare the A k and B b in Fig. 4 with the A k and B b estimated using only some p ortion of the growth pro cess, for many diﬀerent portions. If they are similar, one can conclude that A k and B b indeed stay unchanged throughout the growth pro cess, and thus the time-inv ariance assumption is v alid. T o this end, from eac h original netw ork, w e create three new netw orks. The ﬁrst new netw ork (“First Half ”) contains only the ﬁrst half of the gro wth process, thus allows estimating A k and B b in this p ortion. In the second new net work (“Initial 0.5”), w e set the initial time at the middle of the time-line, eﬀectiv ely enabling estimation of A k and B b of the second half of the growth pro cess. In the third new netw ork (“Initial 0.75”), we set the initial time at the 3 / 4 p oin t of the time-line. This netw ork lets us estimating A k and B B in the last quarter of the gro wth process. The estimated A k and B b in these three new netw orks then are compared with the A k and B b obtained from the full growth pro cess (Figure 7). Visual insp ection of Fig. 7 suggests that b oth the P A and transitivity functions stay relatively unc hanged in the gro wth pro cess of eac h net work. This v alidates the time-inv ariance assumption. 4.4.2 Go odness-of-ﬁt W e use a simulation-based approac h to inv estigate the go odness-of-ﬁt of the mo del. F or each real-world net work, w e re-use the sim ulation data used in Fig. 5, which consists of 100 sim ulated net works generated using the estimated A k and B b of that netw ork as true functions. W e compare some statistics of the simulated net works with the corresp onding statistics of the real net work. If Eq. (1) is a go od ﬁt, then the observed statistics and the simulated statistics m ust b e close. Similar simulation-based approaches hav e b een prop osed for inspecting go odness-of-ﬁt of exponential random-graph mo dels (Hun ter, Go odreau, & Handco c k, 2008) and sto c hastic actor-based mo dels (Conaldi, Lomi, & T onellato, 2012; J. Lospinoso, 2012). F or an ov erview, w e lo ok at how well the model can replicate the observed degree curv es. In Fig. 8, for eac h real-w orld net w ork w e choose uniformly at random ten nodes from the top 1% of all no des in term of the n umber of new edges accumulated during the growth pro cess. F or eac h no de, w e then plot the evolution line of the observed degree v alue and the simulated degree v alue. The closer this line to the line of equalit y is, the b etter the model captures the observ ed degree growth of that node. Although for some no des the sim ulated degree sometimes tends to b e low er than the observed degree, the lines are o verall reasonably close to the iden tity line, which implies the mo del captured the degree growth w ell. F or a closer insp ection, we then lo ok at how well the mo del replicates the probability distribution of new edges during the growth pro cess. In particular, consider sampling uniformly at random an edge e from the set of all new edges in the growth pro cess. Supp ose that e is betw een a node pair with d egrees K 1 and K 2 ( K 1 ≤ K 2 ) and the n umber of their common neigh b ors is X . The relative frequency , or observed probability , that K 1 = k 1 , K 2 = k 2 , and X = b is p k 1 ,k 2 ,b = P t m k 1 ,k 2 ,b ( t ) / P k max k 1 =0 P k max k 2 = k 1 P b max b =0 P t m k 1 ,k 2 ,b ( t ) , in 14 1 3 10 20 1 2 5 10 10 0 10 1 10 2 10 3 10 0 10 1 10 2 10 3 10 4 1 5 10 40 1 5 10 60 1 3 10 1 2 3 4 5 10 20 Number of collaborators k + 1 Number of collaborators k + 1 Number of common neighbours b + 1 Number of common neighbours b + 1 PA function A k PA function A k T ransitivity function B b T ransitivity function B b Full First Half Initial 0.5 Initial 0.75 Full First Half Initial 0.5 Initial 0.75 Full First Half Initial 0.5 Initial 0.75 Full First Half Initial 0.5 Initial 0.75 A B C D SMJ SMJ S TA S TA Figure 7: Time inv ariance of the P A and transitivity functions. A and C : P A and transitivity functions of SMJ. B and D : P A and transitivity functions of ST A. While “First Half ” contains only the ﬁrst half of the growth process, the initial time is set at the middle and at the 3 / 4 p oin t of the time-line in “Initial 0.5” and “Initial 0.75”, resp ectiv ely . In each data-set, all four P A /transitivit y functions agree well with eac h other, which suggests that the P A and transitivity functions stay relatively unchanged throughout the gro wth pro cess. 15 1 3 10 30 1 3 10 30 1 3 10 30 1 3 10 30 Observed degree Observed degree Simulated degree Simulated degree SMJ S TA Figure 8: P airs of observ ed and simulated degree of some high-degree no des in t wo netw orks. The simulation data is the same as the sim ulation data used in Fig. 5. In eac h netw ork, ten no des are chosen uniformly at random from the top 1% of all no des in term of the num b er of new edges accumulated during the growth pro cess. Eac h line represen ts the observed degrees and the corresp onding simulated v alues at each time-step of a no de. Eac h sim ulated v alue is a veraged o ver 100 simulations. In each netw ork, the pairs of observed and sim ulated degrees are reasonably close to the identit y line, which suggests the mo del ﬁts the degree curves w ell. 16 whic h m k 1 ,k 2 ,b ( t ) is the num ber of new edges emerged at time t b et w een a node pair whose degrees are k 1 and k 2 and their n umber of common neighbors is b . The probability p k 1 ,k 2 ,b th us summarizes information ab out the asso ciations of k 1 , k 2 , and b at the end p oin ts of new edges through out the growth pro cess. Our join t estimation of P A and transitivit y is compared with t wo con v entional approaches in which P A (Pham et al., 2015) and transitivity (Newman, 2001a) are estimated in isolation. F or each of these t wo approac hes, w e ﬁrst estimate the P A/transitivity function in isolation and then use the estimated function to generate 100 netw orks in order to insp ect ho w well eac h existing metho d replicates p k 1 ,k 2 ,b . In order to visualize this probabilit y distribution, which is m ulti-dimensional, we slice it into many one-dimensional ones b y conditioning. Firstly , we lo ok at p k | b ∈B := P r ( K 1 + K 2 = k | X ∈ B ) = X b ∈B k max X k 1 =0 p k 1 ,k − k 1 ,b / X b ∈B k max X k 1 =0 k max X k 2 = k 1 p k 1 ,k 2 ,b , with the conv en tion that p k 1 ,k 2 ,b = 0 whenever k 1 > k 2 or k 2 > k max . This is the probability distribution of K 1 + K 2 conditioning on the even t X ∈ B . Sinc e we know from Fig. 3 that the n um b er of no de pairs with b = 0 or b = 1 is v astly greater than the rest, we consider tw o probability distributions p k | b ≤ 1 and p k | b ≥ 2 and show their cumulativ e probability distributions in Fig. 9. In all cases, the joint estimation approac h b est replicated the observed distributions. It is surprising to observe that the B b -in-isolation approac h, which do es not explicitly leverage any information ab out k , has more or less the same replication performance as the A k -in-isolation approac h, whic h explicitly do es. This suggests that the dimension of b preserves a fair amoun t of the information ab out k . Secondly , we lo ok at p b | ( k 1 ,k 2 ) ∈K := P r ( X = b | ( K 1 , K 2 ) ∈ K ) = X ( k 1 ,k 2 ) ∈K p k 1 ,k 2 ,b / b max X b =0 X ( k 1 ,k 2 ) ∈K p k 1 ,k 2 ,b , where K is a non-empt y set of un-ordered pairs. This is the probability distribution of X conditioning on the even t ( K 1 , K 2 ) ∈ K . Given a pair of no de whose degrees are k 1 and k 2 and their num b er of common neigh b ours is b , there is a natural condition imp osed on b : b must b e not greater than either k 1 or k 2 . So if one chooses K such that k 1 or k 2 could b e to o small, the range of b would b e severely limited. F or this reason, w e consider t wo probability distributions: p b | max( k 1 ,k 2 ) ≤ 9 and p b | max( k 1 ,k 2 ) ≥ 10 , both allow a large range for b . Their cumulativ e distributions are sho wn in Fig. 10. Once again, the joint estimation approach b est replicated the observed cumulativ e probability distributions in all cases. While the B b -in-isolation approach replicated fairly well the observed distributions in most cases, the A k -in-isolation approach completely failed to do so in all cases. This implies that, while the dimension of b seems to preserve a fair amoun t of the information ab out k 1 and k 2 , the dimensions of k 1 and k 2 main tain little information ab out b . Ov erall, the joint estimation approach p erformed comparatively w ell. The surprisingly go od p erformance of the B b -in-isolation approach is, in fact, in agreement with the dominating role of B b in the gro wth pro cess of b oth netw orks. Combining the results in Fig. 8 with those in Figs. 9 and 10, w e conclude that the join t estimation approach captured reasonably well b oth ﬁrst-order and second-order information of the netw orks. This go od ﬁt is consistent with the fact that the key assumption of time-in v ariability of A k and B b is satisﬁed in b oth netw orks. 5 Conclusion W e prop osed a statistical net work mo del that incorp orates non-parametric P A and transitivity functions and deriv ed an eﬃcien t MM algorithm for estimating its parameters. W e also presented a metho d that is able to quan tify the amoun t of con tributions of not only P A and transitivit y but also many other netw ork gro wth mec hanisms by exploiting the probabilistic dynamic pro cess induced by the mo del formula. 17 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 1 10 100 1 10 100 1 10 100 1 10 100 Sum of two degrees k + 1 Sum of two degrees k + 1 Sum of two degrees k + 1 Sum of two degrees k + 1 Cumulative Probability Cumulative Probability Cumulative Probability Cumulative Probability Observed Joint estimation PA in isolation T rans. in isolation Observed Joint estimation PA in isolation T rans. in isolation Observed Joint estimation PA in isolation T rans. in isolation Observed Joint estimation PA in isolation T rans. in isolation A B C D SMJ S TA SMJ S TA Figure 9: Observed and sim ulated cum ulativ e probabilit y distributions p k | b ≤ 1 and p k | b ≥ 2 of k = k 1 + k 2 in t wo net works. F or each estimation metho d, we generate 100 net works from the estimation result and rep ort the a verage v alues ov er 100 simulations. A and B : the cumulativ e probability distribution p k | b ≤ 1 in SMJ and ST A, resp ectiv ely . C and D : the cumulativ e probability distribution p k | b ≥ 2 in SMJ and ST A, resp ectiv ely . In all cases, our join t estimation approach replicated the observ ed distributions comparatively well. 18 0.7 0.8 0.9 1.0 0.6 0.7 0.8 0.9 1.0 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 1 3 10 1 3 10 1 3 10 1 3 10 30 Number of common neighbors b + 1 Number of common neighbors b + 1 Number of common neighbors b + 1 Number of common neighbors b + 1 Cumulative Probability Cumulative Probability Cumulative Probability Cumulative Probability Observed Joint estimation PA in isolation T rans. in isolation Observed Joint estimation PA in isolation T rans. in isolation Observed Joint estimation PA in isolation T rans. in isolation Observed Joint estimation PA in isolation T rans. in isolation A B C D SMJ S TA SMJ S TA Figure 10: Observ ed and simulated cumulativ e probability distributions p b | max( k 1 ,k 2 ) ≤ 9 and p b | max( k 1 ,k 2 ) ≥ 10 in tw o netw orks. F or each estimation metho d, we generate 100 net works from the estimation result and rep ort the av erage v alues ov er 100 sim ulations. A and B : the cumulativ e probability distribution p b | max( k 1 ,k 2 ) ≤ 9 in SMJ and ST A, resp ectiv ely . C and D : the cumulativ e probability distribution of p b | max( k 1 ,k 2 ) ≥ 10 in SMJ and ST A, respectively . In all cases, our join t estimation approac h replicated the observ ed distributions comparativ ely well. 19 W e show ed that the prop osed net work mo del is a reasonably goo d ﬁt to tw o real-w orld co-authorship net works and rev ealed intriguing prop erties of the P A and transitivity functions in those netw orks. The P A function is increasing on av erage in b oth netw orks, whic h implies the P A eﬀect is at play . Excluding the high degree part, it do es follo w the conv entional p o wer-la w form reasonably well. The transitivity function is, ho wev er, highly non-p o w er-law in t wo netw orks: it jumps greatly after b = 0 , but stays relatively horizon tal or only sligh tly increases afterwards. This non-conv entional form implies that co-authors of co-authors seems to b e at least ten times more lik ely to b ecome new co-authors, comparing with the case when there is no mutual co-author. W e also found transitivit y dominating P A in both net works, which suggests the imp ortance of indirect relations in scientiﬁc creativ e pro cesses. There are some fascinating directions for further developing the statistical metho dology . Firstly , although the proposed model and most other net work mo dels in the literature assume that new edges at each time- step are indep enden t, such edges are hardly so in real-w orld collab oration net works. Eﬃcien tly relaxing this assumption migh t lead to b etter mo dels for this netw ork type. Secondly , it is curious to see whether one could tak e the time-inv ariabilit y test developed for sto c hastic actor-based mo dels (J. A. Lospinoso, Sch wein b erger, Snijders, & Ripley, 2011) and adapt it to our mo del. On the application fron t, this work la ys out a p oten tially fruitful approach for analyzing complex net- w orks, while raising more questions than it answ ers. Do es transitivity alwa ys dominate P A in co-authorship net works? Which parametric forms are capable of capturing the ﬁne details seen in Fig. 4? What are the prop erties of P A and transitivity in co-authorship netw orks at the level of institutions or countries? W e hope this pap er has convinced informetricians to include non-parametric mo delling of P A and transitivity in to their to olbox. 6 Ackno wledgements This work w as supp orted in part by JSPS KAKENHI Grant Numbers JP19K20231 to TP and JP16H02789 to HS. The funding source had no role in study design; in the collection, analysis and in terpretation of data; in the writing of the rep ort; and in the decision to submit the article for publication. References Alb ert, R., & Barabási, A. (1999). Emergence of scaling in random netw orks. Scienc e , 286 , 509–512. https://doi.org/10.1126/science.286.5439.509 Bornmann, L. (2017). Is collab oration among scientists related to the citation impact of papers b ecause their qualit y increases with collab oration? An analysis based on data from f1000prime and normalized citation scores. Journal of the Asso ciation for Information Scienc e and T e chnolo gy , 68 (4), 1036–1047. https://doi.org/10.1002/asi.23728 Calla wa y , D. S., Hop croft, J. E., Kleinberg, J. M., Newman, M. E. J., & Strogatz, S. H. (2001). Are randomly gro wn graphs really random? Physic al R eview E , 64 , 041902. https://doi.org/10.1103/ PhysRevE.64.041902 Clauset, A., Shalizi, C. R., & Newman, M. E. J. (2009). Po wer-la w distributions in empirical data. SIAM R eview , 51 (4), 661–703. https://doi.org/10.1137/070710111 Conaldi, G., Lomi, A., & T onellato, M. (2012). Dynamic mo dels of aﬃliation and the netw ork structure of problem solving in an op en source soft ware pro ject. Or ganizational R ese ar ch Metho ds , 15 (3), 385–412. https://doi.org/10.1177/1094428111430541 Csárdi, G., Strandburg, K. J., Zalán yi, L., T ob ochnik, J., & Érdi, P . (2007). Modeling inno v ation by a kinetic description of the patent citation system. Physic a A: Statistic al Me chanics and its Applic ations , 374 (2), 783–793. https://doi.org/10.1016/j.physa.2006.08.022 F erligo j, A., Kronegger, L., Mali, F., Snijders, T. A. B., & Doreian, P . (2015). Scientiﬁc collab oration dynamics in a national scien tiﬁc system. Scientometrics , 104 (3), 985–1012. https://doi.org/10 .1007/s11192-015-1585-7 20 F ortunato, S., Bergstrom, C. T., Börner, K., Ev ans, J. A., Helbing, D., Milo jević, S., . . . Barabási, A.-L. (2018). Science of science. Scienc e , 359 (6379). https://doi.org/10.1126/science.aao0185 Gómez, V., Kapp en, H. J., & Kaltenbrunner, A. (2011). Mo deling the structure and evolution of discussion cascades. In Pr o c e e dings of the 22nd ACM c onfer enc e on hyp ertext and hyp erme dia (pp. 181–190). New Y ork, NY, USA: ACM. https://doi.org/10.1145/1995966.1995992 Heider, F. (1946). Attitudes and cognitive organization. The Journal of Psycholo gy , 21 (1), 107–112. https://doi.org/10.1080/00223980.1946.9917275 Holland, P. W., & Leinhardt, S. (1970). A metho d for detecting structure in sociometric data. Americ an Journal of So ciolo gy , 76 (3), 492–513. https://doi.org/10.1086/224954 Holland, P. W., & Leinhardt, S. (1971). T ransitivity in structural models of small groups. Comp ar ative Gr oup Studies , 2 (2), 107–124. https://doi.org/10.1177/104649647100200201 Holland, P. W., & Leinhardt, S. (1976). Lo cal structure in so cial netw orks. So ciolo gic al Metho dolo gy , 7 , 1–45. https://doi.org/10.2307/270703 Holland, P. W., & Leinhardt, S. (1977). A dynamic mo del for so cial netw orks. The Journal of Mathematic al So ciolo gy , 5 (1), 5–20. https://doi.org/10.1080/0022250X.1977.9989862 Hun ter, D. R., Goo dreau, S. M., & Handco c k, M. S. (2008). Go o dness of ﬁt of social netw ork mod- els. Journal of the Americ an Statistic al Asso ciation , 103 (481), 248–258. https://doi.org/10.1198/ 016214507000000446 Hun ter, D. R., & Lange, K. (2000). Quan tile regression via an MM algorithm. Journal of Computational and Gr aphic al Statistics , 60–77. https://doi.org/10.2307/1390613 Hun ter, D. R., & Lange, K. (2004). A tutorial on MM algorithms. The Americ an Statistician , 58 , 30–37. https://doi.org/10.1198/0003130042836 Jeong, H., Néda, Z., & Barabási, A. (2003). Measuring preferential attac hmen t in evolving netw orks. Eur ophysics L etters , 61 (61), 567–572. https://doi.org/10.1209/epl/i2003-00166-9 Ji, P ., & Jin, J. (2016). Coauthorship and citation netw orks for statisticians. The Annals of Applie d Statistics , 10 (4), 1779–1812. https://doi.org/10.1214/15-AOAS896 Jones, B. F., W uch ty , S., & Uzzi, B. (2008). Multi-universit y researc h teams: Shifting impact, geograph y , and stratiﬁcation in science. Scienc e , 322 (5905), 1259–1262. https://doi.org/10.1126/science .1158357 Krapivsky , P ., Ro dgers, G., & Redner, S. (2001). Organization of growing random netw orks. Physic al R eview E , 066123. https://doi.org/10.1103/PhysRevE.63.066123 Krivitsky , P. N., & Handco c k, M. S. (2019). tergm: Fit, simulate and diagnose mo dels for netw ork evolution based on exp onen tial-family random graph mo dels [Computer softw are manual]. (R pack age version 3.6.0) https://CRAN.R-project.org/package=tergm Kronegger, L., Mali, F., F erligo j, A., & Doreian, P . (2012). Collab oration structures in slov enian scientiﬁc comm unities. Scientometrics , 90 (2), 631–647. https://doi.org/10.1007/s11192-011-0493-8 Larivière, V., Gingras, Y., Sugimoto, C. R., & T sou, A. (2015). T eam size matters: Collaboration and scien tiﬁc impact since 1900. Journal of the Asso ciation for Information Scienc e and T e chnolo gy , 66 (7), 1323–1332. https://doi.org/10.1002/asi.23266 Lib en-No w ell, D., & Klein berg, J. (2007). The link-prediction problem for so cial netw orks. Journal of the Americ an So ciety for Information Scienc e and T e chnolo gy , 58 (7), 1019–1031. https://doi.org/ 10.1002/asi.20591 Lospinoso, J. (2012). Statistic al mo dels for so cial network dynamics (Doctoral dissertation, Oxford Univ ersity, UK). https://ora.ox.ac.uk/objects/ora:6726 Lospinoso, J. A., Sch w einberger, M., Snijders, T. A. B., & Ripley , R. M. (2011). Assessing and accounting for time heterogeneity in sto c hastic actor orien ted mo dels. A dvanc es in Data A nalysis and Classiﬁc ation , 5 (2), 147–176. https://doi.org/10.1007/s11634-010-0076-1 Massen, C., & Jonathan, P . (2007). Preferential attachmen t during the ev olution of a p oten tial energy landscap e. The Journal of Chemic al Physics , 127 , 114306. https://doi.org/10.1063/1.2773721 Medo, M. (2014). Statistical v alidation of high-dimensional mo dels of growing netw orks. Physic al R eview E , 89 , 032801. https://doi.org/10.1103/PhysRevE.89.032801 21 Medo, M., Cimini, G., & Gualdi, S. (2011). T emp oral eﬀects in the growth of netw orks. Physic al R eview L etter , 107 , 238701. https://doi.org/10.1103/PhysRevLett.107.238701 Merton, R. K. (1968). The Matthew eﬀect in science. Scienc e , 159 (3810), 56–63. https://doi.org/ 10.1126/science.159.3810.56 Milo jević, S. (2010). Mo des of collab oration in mo dern science: Bey ond p o wer la ws and preferential attac h- men t. Journal of the Americ an So ciety for Information Scienc e and T e chnolo gy , 61 (7), 1410–1423. https://doi.org/10.1002/asi.21331 Newman, M. E. J. (2001a). Clustering and preferential attachmen t in gro wing net works. Physic al R eview E , 64 (2), 025102. https://doi.org/10.1103/PhysRevE.64.025102 Newman, M. E. J. (2001b). The structure of scien tiﬁc collaboration net w orks. Pr o c e e dings of the National A c ademy of Scienc es , 98 (2), 404–409. https://doi.org/10.1073/pnas.98.2.404 Newman, M. E. J. (2005). P ow er la ws, Pareto distributions and Zipf ’s law. Contemp or ary Physics , 46 , 323–351. https://doi.org/10.1080/00107510500052444 Pham, T., Sheridan, P ., & Shimodaira, H. (2015). P AFit: a statistical metho d for measuring preferen tial attac hment in temp oral complex net works. PLOS ONE (9), e0137796. https://doi.org/10.1371/ journal.pone.0137796 Pham, T., Sheridan, P ., & Shimodaira, H. (2016). Join t estimation of prefe ren tial attachmen t and node ﬁtness in gro wing complex netw orks. Scientiﬁc R ep orts , 6 . https://doi.org/10.1038/srep32558 Pham, T., Sheridan, P ., & Shimo daira, H. (to app ear). P AFit: an R pack age for estimating preferential attac hment and node ﬁtness in temp oral complex netw orks. Journal of Statistic al Softwar e . https:// Price, D. d. S. (1965). Net works of scien tiﬁc pap ers. Scienc e , 149 (3683), 510–515. https://doi.org/ 10.1126/science.149.3683.510 Price, D. d. S. (1976). A general theory of bibliometric and other cumulativ e adv antage pro cesses. Journal of the Americ an So ciety for Information Scienc e , 27 (5), 292–306. https://doi.org/10.1002/asi .4630270505 Ripley , R. M., Snijders, T. A., Bo da, Z., Vörös, A., & Preciado, P . (2018). Manual for siena version 4.0 (version ma y 24, 2018) [Computer softw are manual]. http://www.stats.ox.ac.uk/~snijders/siena/ Ronda-Pup o, G. A., & Pham, T. (2018). The evolutions of the rich get richer and the ﬁt get richer phenomena in sc holarly netw orks: the case of the strategic management journal. Scientometrics . https://doi.org/10.1007/s11192-018-2761-3 Simon, H. A. (1955). On a class of skew distribution functions. Biometrika , 42 (3/4), 425–440. https:// doi.org/10.1093/biomet/42.3-4.425 Snijders, T. A. (2001). The statistical ev aluation of so cial net work dynamics. So ciolo gic al Metho dolo gy , 31 (1), 361–395. https://doi.org/10.1111/0081-1750.00099 Snijders, T. A. (2017). Sto c hastic actor-oriented models for net work dynamics. Annual R eview of Statis- tics and Its Applic ation , 4 (1), 343–363. https://doi.org/10.1146/annurev-statistics-060116 -054035 W ang, M., Y u, G., & Y u, D. (2008). Measuring the preferential attac hment mec hanism in citation net- w orks. Physic a A: Statistic al Me chanics and its Applic ations , 387 (18), 4692–4698. https://doi.org/ 10.1016/j.physa.2008.03.017 Y ule, G. U. (1925). A mathematical theory of ev olution, based on the conclusions of Dr. J.C. Willis,F.R.S. Philosophic al T r ansactions of the R oyal So ciety of L ondon B: Biolo gic al Scienc es , 213 (402–410), 21–87. https://doi.org/10.1098/rstb.1925.0002 Zeng, A., Shen, Z., Zhou, J., W u, J., F an, Y., W ang, Y., & Stanley , H. E. (2017). The science of science: F rom the persp ective of complex systems. Physics R ep orts , 714–715 , 1–73. https://doi.org/10.1016/ j.physrep.2017.10.001 Zinilli, A. (2016). Comp etitiv e pro ject funding and dynamic complex net works: evidence from Pro jects of National Interest (PRIN). Scientometrics , 108 (2), 633–652. https://doi.org/10.1007/s11192-016 -1976-4 22 A An MM algorithm for estimating the non-parametric P A and transitivity functions T o maximize the partial log-likelihoo d function l ( A , B ) in Eq. (2), we deriv e an instance of the Minorize- Maximization algorithms (Hun ter & Lange, 2000). Denote A ( q ) k the v alue of A k at iteration q ( q ≥ 0 ), and A ( q ) = [ A ( q ) 0 , A ( q ) 1 , . . . , A ( q ) k max ] the v alue of A at that iteration. Deﬁne B ( q ) b and B ( q ) in a similar wa y . Starting from some initial v alues ( A 0 , B 0 ) at iteration q = 0 , w e w ant to compute ( A ( q +1) , B ( q +1) ) from ( A ( q ) , B ( q ) ) . In MM algorithms, one derives such up date formulas by ﬁrst ﬁnding a surrogate function Q ( A , B ) that satisﬁes l ( A , B ) ≥ Q ( A , B ) , ∀ A , B and l ( A ( q ) , B ( q ) ) = Q ( A ( q ) , B ( q ) ) , and then maximize the surrogate function. One can prov e that, if ( A q +1 , B q +1 ) maximizes Q ( A , B ) , then l ( A ( q +1) , B ( q +1) ) ≥ l ( A ( q ) , B ( q ) ) , i.e., the ob jectiv e function is increased monotonically per iteration. Since there can be man y surrogate functions that satisfy the conditions, the main indicator for ev aluating a particular Q ( A , B ) is ho w easily we can maximize it. Based on previous works (Pham et al., 2015; Pham, Sheridan, & Shimo daira, 2016), the following function is a surrogate function of l : Q 0 ( A , B ) = T X t =0 K X i =0 K X j = i B X l =0 m i,j,l ( t ) log A i A j B l − T X t =0 m ( t ) log   K X i =0 K X j = i B X l =0 n i,j,l ( t ) A ( q ) i A ( q ) j B ( q ) l   − T X t =0 m ( t ) P K i =0 P K j = i P B l =0 n i,j,l ( t ) A i A j B l P K i =0 P K j = i P B l =0 n i,j,l ( t ) A ( q ) i A ( q ) j B ( q ) l + T X t =0 m ( t ) , (A.1) where K := k max and B := b max . The product A i A j B l in the numerator of the third term in the r.h.s. of Eq. (A.1) preven ts parallel up dating of A and B . One wa y to deal with this pro duct is to apply the AM-GM inequality (Hunter & Lange, 2004): − A i A j B l ≥ − 1 2 A ( q ) j A ( q ) i A 2 i + A ( q ) i A ( q ) j A 2 j ! B l ≥ − 1 4    A ( q ) j B ( q ) l  A ( q ) i  3 A 4 i + A ( q ) i B ( q ) l  A ( q ) j  3 A 4 j    − 1 2 A ( q ) i A ( q ) j B ( q ) l B 2 l . Plugging this inequalit y to Eq. (A.1), we obtain the ﬁnal surrogate function: Q ( A , B ) = T X t =0 K X i =0 K X j = i B X l =0 m i,j,l ( t ) log A i A j B l − T X t =0 m ( t ) log   K X i =0 K X j = i B X l =0 n i,j,l ( t ) A ( q ) i A ( q ) j B ( q ) l   − T X t =0 m ( t ) P K i =0 P K j = i P B l =0 n i,j,l ( t ) 1 4 A ( q ) j B ( q ) l  A ( q ) i  3 A 4 i + A ( q ) i B ( q ) l  A ( q ) j  3 A 4 j ! + 1 2 A ( q ) i A ( q ) j B ( q ) l B 2 l ! P K i =0 P K j = i P B l =0 n i,j,l ( t ) A ( q ) i A ( q ) j B ( q ) l + T X t =0 m ( t ) . 23 Solving ∂ Q ( A , B ) ∂ A k = 0 and ∂ Q ( A , B ) ∂ B b = 0 , we obtain the following closed-form formulas: A ( q +1) k = 4 v u u u u u u u t P T t =0 P k i =0 m i,k, · ( t ) + P T t =0 P K j = k m k,j, · ( t ) P T t =0 m ( t ) P K j = k +1 P B l =0 n k,j,l ( t ) A ( q ) j B ( q ) l  A ( q ) k  3 + P k − 1 i =0 P B l n i,k,l ( t ) A ( q ) i B ( q ) l  A ( q ) k  3 + P B l =0 n k,k,l ( t )    A ( q ) k B ( q ) l  A ( q ) k  3 + A ( q ) k B ( q ) l  A ( q ) k  3    P K i =0 P K j = i P B l =0 n i,j,l ( t ) A ( q ) i A ( q ) j B ( q ) l , B ( q +1) b = v u u u u u t P T t =0 m · , · ,b ( t ) P T t =0 m ( t ) P K i =0 P K j = i n i,j,b ( t ) A ( q ) i A ( q ) j B ( q ) b P K i =0 P K j = i P B l =0 n i,j,l ( t ) A ( q ) i A ( q ) j B ( q ) l , where m i,k, · ( t ) := P B l =0 m i,k,l ( t ) and m · , · ,b := P K i =0 P K j = i m i,j,b ( t ) . Based on these formulas, at each iteration A ( q +1) and B ( q +1) can b e computed in parallel without solving an y additional optimization problems. This enables the metho d to work with large data-sets. The ob jective function v alue l ( A ( q +1) , B ( q +1) ) , as explained earlier, is guaranteed to b e increasing in q . B Estimation of the standard deviations of ˆ h trans ( t ) and ˆ h P A ( t ) W e hav e the following closed-form formula for the v ariance of the sample v ariance h trans ( t ) 2 : V h trans ( t ) 2 = 1 m ( t ) E (log 2 B b ij − E log 2 B b ij ) 4 − ( m ( t ) − 3) s trans ( t ) 4 m ( t )( m ( t ) − 1) . The delta metho d then gives: sd ( h trans ( t )) ≈ 1 2  E h trans ( t ) 2  − 1 / 2 p V h trans ( t ) 2 ≈ 1 2 ( s trans ( t )) − 1 p V h trans ( t ) 2 . The standard deviation of ˆ h trans ( t ) then can b e calculated by plugging ˆ A and ˆ B into the ab o ve formula. The standard deviation of ˆ h P A ( t ) follows the same deriv ation. 24

Joint Estimation of the Non-parametric Transitivity and Preferential Attachment Functions in Scientific Co-authorship Networks

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment