Multiplex Network Regression: How do relations drive interactions?
We introduce a statistical regression model to investigate the impact of dyadic relations on complex networks generated from observed repeated interactions. It is based on generalised hypergeometric ensembles (gHypEG), a class of statistical network …
Authors: Giona Casiraghi
G. Casiraghi: Multiplex Network Regression: How do relations drive interactions? (Submitted for publication: 10 July 2020)

Giona Casiraghi∗
Chair of Systems Design, ETH Zürich, Weinbergstrasse 56/58, 8092 Zürich, Switzerland.

Abstract

We introduce a statistical regression model to investigate the impact of dyadic relations on complex networks generated from observed repeated interactions. It is based on generalised hypergeometric ensembles (gHypEG), a class of statistical network ensembles developed recently to deal with multi-edge graph and count data. We represent different types of known relations between system elements by weighted graphs, separated in the different layers of a multiplex network. With our method, we can regress the influence of each relational layer, the explanatory variables, on the interaction counts, the dependent variables. Moreover, we can quantify the statistical significance of the relations as explanatory variables for the observed interactions. To demonstrate the power of our approach, we investigate an example based on empirical data.

1 Introduction

In the study of real-world complex systems, we often deal with datasets of observed repeated interactions between individuals. These datasets are used to generate networks where the system's elements are represented by vertices and interactions by edges. We ask whether these interactions are random events or whether they are driven by existing relations between the system's elements. To answer this question, we propose a statistical model to regress relations, which we identify as covariate variables, on a network created from interactions, which we will refer to as our dependent variables. In general, a regression model explains dependent variables as a function of some covariates, accounting for random effects.
Here, we assume that the observed interactions are driven by different relations, possibly masked by combinatorial effects. By combinatorial effects, we mean that elements that interact more are, in general, also more likely to interact with each other, even if they have no relations. This problem is well known in network theory, where it is referred to as degree-correction (see, e.g., [20, 31, 33]). For example, the fact that two individuals have contact very often can be explained by multiple reasons. They may interact because they are friends, because they work together, or simply because they are very active and hence have high chances to meet. Therefore, to fully understand a system, we have to disentangle relations from combinatorial effects.

∗ E-mail: gcasiraghi@ethz.ch.

Datasets of interactions are ubiquitous across disciplines. Examples are recorded contacts between individuals (e.g., SocioPatterns [25, 40], Reality Mining [13]), mutualistic interactions between species in ecology [12, 28], economic transactions between countries and firms [16, 32], and collaborations between firms [42]. In these cases, researchers are interested in learning whether the observed interactions are driven by relations between the elements of the system. They ask whether friendship plays a role in the contacts between students [25], whether homophily drives interactions within social and political networks [2], i.e., whether individuals sharing similar characteristics are more likely to interact [27], or whether collaborations between companies are driven by geographical distance or industrial sector similarity [42].
There exist different approaches addressing the problem of quantifying the interdependence between observed edges and dyadic relations in networks. This problem, however, is exacerbated by the fact that the dyadic relations represented in complex networks are not independent of one another. There is a broad literature on modelling relational data to account for some of these properties that can arguably be traced to the Social Relations Model of [44]. Because of the non-independence of dyadic relations, ordinary least squares regression models are inappropriate to analyse network data [21]. To partially overcome their limits, [21] introduced a regression method based on the quadratic assignment procedure developed by [19]. An alternative approach to address the problem of the non-independence of edges is that taken in latent space models [18]. There, although the model still assumes the probability of edges to be independent in the sampling process, the dependence is accounted for in the latent space constructed from the data.

Other statistical methods commonly used in the analysis of social networks are based on exponential random graph models (ERGMs) or their extensions (see, e.g., [36–39]). Although effective under specific conditions, all these methods have been developed for unweighted graphs. This means that they are not optimal for datasets which contain repeated interactions, which need to be represented as integer-weighted graphs, usually referred to as multi-edge networks or multi-graphs. A common workaround is to threshold the interactions to obtain an unweighted graph (e.g., [10]). Clearly, this approach does not exploit all the information available in the data and therefore may produce sub-optimal results [1]. Addressing these limitations, ERGMs, for example, have been extended for count data [22, 23].
Furthermore, the latent space framework [18] introduced a regression model that naturally admits different covariates and deals with count data as well [35]. The relational events model [3], instead, handles interactions recorded along with a time stamp. This framework has since been extended to include, e.g., missing observations, auto-regressive models, and information to capture hidden homophily. Other models that partially address those issues include the random dot product graphs [41] and the stochastic block models [34]. Using the most general theory, none of these models requires the observed interactions to be binary: they can be counts or continuous [17]. However, in their generalisation to count or continuous data, these models require the assumption of a specific distribution for the number of edges, which implicitly assumes a distinctive edge generating process, as discussed in [4]. Moreover, many of these models do not scale well to large datasets. In particular, in the case of ERGMs, the increased size of the sample space makes the numerical estimation of model parameters by Monte-Carlo simulations impractical. As a result, it is challenging to fit large datasets of repeated interactions.

The generalised hypergeometric ensemble of random graphs (gHypEG) allows us to address these limitations, providing a suitable model for the analysis of complex systems [4, 5, 8]. GHypEGs combine two characteristics that are essential for the study of multi-edge networks. First, they are specifically tailored to the analysis of multi-graphs, allowing the easy interpretation of parameters. Second, their underlying probability distribution can be stated in closed form, thus simplifying the study of datasets with a large number of repeated interactions.
We demonstrate the power of our approach and its performance with an example based on an empirical dataset consisting of more than 180 000 interactions. The available data consist of an interaction network, built from recorded contact counts between high-school students, and of further information such as the students' gender, class membership and topic, self-reported friendship relations, and Facebook connections.

2 Methodology

2.1 Network representation

Figure 1: The multiplex network representation of a relational dataset. The bottom layer (blue) captures the interaction counts that are observed. The top layers (yellow) encode different types of relations, like weighted friendship links, or community membership. The model we propose allows us to understand how these relational layers impact interactions.

Relational datasets such as the one provided in [25] consist of interaction counts and a collection of dyadic relations and vertex attributes. Vertex attributes, such as community membership or gender, often yield strong relations between individuals, as individuals in the same community tend to interact more than individuals in different ones. We can study this type of data by representing it as a multiplex network. Multiplex networks are a particular class of interconnected multi-layer networks where the vertices of each layer correspond (cf. fig. 1) [15].

Suppose that we have a dataset consisting of $m$ recorded interactions between $n$ elements and $r$ different types of relations between them. We encode the interactions in a graph with $n = |V|$ vertices and $m$ (multi-)edges. Since two individuals may interact more than once, multiple edges may exist between the same pair of vertices, giving rise to a multi-edge graph.
In the following, we will refer to this graph as the interaction layer $\mathcal{I}$. For each type of relation, we can generate a graph that encodes the dyadic relations between the elements of the system as weighted edges between vertices. The weight of each edge encodes the strength of the relation. We will refer to these $r$ graphs as the relational layers $\mathcal{R}_l$ with $l \in [1, r]$. Let now $\mathcal{M}$ be the multiplex network generated by the $r + 1$ layers and $n = |V|$ vertices. Figure 1 illustrates the multiplex approach we take.

In the following, we propose a framework to perform statistical regressions with these network layers as covariates. We assume the multi-edge graph $\mathcal{I}$ to be the dependent variable and the remaining layers $\mathcal{R}_l$ to be the covariates, or explanatory variables. The resulting model has the following form:

$$\mathcal{I} = f(\mathcal{R}_1, \dots, \mathcal{R}_r; \theta_1, \dots, \theta_r), \tag{1}$$

for some function $f : \mathbb{R}^{V \times V} \times \dots \times \mathbb{R}^{V \times V} \times \mathbb{R}^{r} \to \mathbb{N}^{V \times V}$, where $\theta_l$, $l \in [1, r]$, are the parameters of the regression model corresponding to each layer $\mathcal{R}_l$.

2.2 Statistical Model

Generalised Hypergeometric Ensembles of Random Graphs (gHypEG). The approach described in this paper exploits the generalised hypergeometric ensemble of random graphs (gHypEG). This class of models extends the configuration model (CM) [29, 30] by encoding complex topological patterns, while at the same time preserving degree distributions. The aim of this article is to estimate how to bias the process underlying the configuration model, based on observed data. For this reason, before introducing the formulation of our regression model, we provide a brief overview of gHypEG. A more formal presentation is given in [4, 5, 7].

In the CM, the probability of connecting two vertices depends only on their (out- and in-) degrees.
The CM assigns to each vertex as many out-stubs (or half-edges) as its out-degree, and as many in-stubs as its in-degree. It then connects random pairs of vertices joining out- and in-stubs. This is done by sampling uniformly at random one out- and one in-stub from the pool of all out- and in-stubs respectively, and then connecting them, until no more stubs are available [14]. The left side of fig. 2 illustrates this case focusing on a vertex A. The probability of connecting vertex A with one of the vertices B, C, or D depends only on the abundance of stubs, and hence on the in-degree of the vertices themselves. The higher the in-degree, the higher the number of in-stubs of the vertex. Hence, the higher the probability to randomly sample a stub belonging to the vertex.

Figure 2: Probabilities of connecting different stubs in CM and gHypEG. Graphical illustration of the probability of connecting two vertices as a function of degrees (left figure), and degree and propensities (right figure). Higher propensities can be related to strong relations between vertices, shown as dashed connection. A stronger relation (thicker line) may result in a higher propensity to interact, as shown for the pair (A,D).

GHypEG give an expression for the probability distribution underlying this process, where the degrees of the vertices are preserved in expectations [5]. This result is achieved exploiting an urn representation of the problem. Edges are balls in an urn, and sampling from the CM corresponds to sampling balls (i.e., edges) from an urn constructed as follows. For each pair of vertices $(i, j)$, we can denote with $k^{\mathrm{out}}_i$ and $k^{\mathrm{in}}_j$ their respective out- and in-degrees.
The number of combinations of out-stubs of $i$ with in-stubs of $j$ which could create an edge is given by $k^{\mathrm{out}}_i k^{\mathrm{in}}_j$. For each dyad $(i, j)$ we place $k^{\mathrm{out}}_i k^{\mathrm{in}}_j$ balls of a given colour in the urn. This provides us with an urn containing $\sum_{ij} k^{\mathrm{out}}_i k^{\mathrm{in}}_j$ edges of as many colours as there are pairs of vertices that could be connected. The process of sampling $m$ edges from such a 'soft' configuration model is thus described by sampling $m$ balls from the urn, and the probability distribution of observing a graph $\mathcal{I}$ under the model is given by the multivariate hypergeometric distribution with parameters $\Xi = \{k^{\mathrm{out}}_i k^{\mathrm{in}}_j\}_{i,j}$:

$$\Pr(\mathcal{I} \mid \Xi) = \binom{\sum_{ij} \Xi_{ij}}{m}^{-1} \prod_{i,j \in V} \binom{\Xi_{ij}}{A_{ij}}, \tag{2}$$

where $A_{ij}$ denotes the element $ij$ of the adjacency matrix of $\mathcal{I}$, and the probability of observing $\mathcal{I}$ is non-zero only if $\sum_{ij} A_{ij} = m$.

GHypEG expand this formulation to allow the modification of the CM based on observations about the system. Specifically, we aim at modelling the probability of connecting two vertices not only based on degrees (i.e., number of stubs), but also on an independent propensity of two vertices to be connected. Such propensities capture non-degree-related effects, which are incorporated into the model in the form of the odds of connecting one pair of vertices instead of another. The right side of fig. 2 illustrates this case, where A is most likely to connect with vertex D, even though D has only one available stub. We can see this in the following way. Suppose that there was an underlying social network connecting the vertices of $\mathcal{I}$. Then, we could expect that vertices that have a strong connection in the social network (thick dashed line in fig. 2) have a high propensity to interact.
This results in a higher probability to observe interactions between the pair (A,D) compared to all others.

In [4], we have investigated how block and community structures can be encoded by specifying suitable propensities in the form of a block matrix. Here, we look into the more general case where we aim at constraining the configuration model such that given edges are more likely than others according to external information about the process modelled. Such external information will constitute the covariates in our regression model.

We collect propensities in a matrix $\Omega$. The matrix thus encodes dyadic propensities of vertices that go beyond what is prescribed by the combinatorial matrix $\Xi$. The ratio between any two elements $\Omega_{ij}$ and $\Omega_{kl}$ of the propensity matrix is the odds-ratio of observing an edge between vertices $i$ and $j$ instead of $k$ and $l$, independently of the degrees of the vertices. The probability of a graph $\mathcal{I}$ depends on the stubs' configuration specified by $\Xi$, and on the odds defined by $\Omega$. As for the case of the CM, this process can be seen as sampling edges from an urn, where edges characterised by a large propensity are more likely to be sampled. Such a probability distribution is described by the multivariate Wallenius' noncentral hypergeometric distribution [9, 43]:

$$\Pr(\mathcal{I} \mid \Xi, \Omega) = \left[ \prod_{i,j} \binom{\Xi_{ij}}{A_{ij}} \right] \int_0^1 \prod_{i,j} \left( 1 - z^{\Omega_{ij} / S_\Omega} \right)^{A_{ij}} dz \tag{3}$$

with $S_\Omega = \sum_{i,j} \Omega_{ij} (\Xi_{ij} - A_{ij})$.

Here, we assume that the entries of the matrix of stubs' configuration $\Xi$ are built according to the configuration model. This is the most general way to encode combinatorial effects generated by the different activity, i.e., degree, of vertices. It means that more active vertices, i.e., those with a higher degree, are more likely to interact. Hence, $\Xi$ is entirely defined by $\mathcal{I}$.
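The urn construction behind eq. (2) can be illustrated with a short numerical sketch. It is not taken from the ghypernet library: the helper names `xi_matrix` and `log_prob_cm`, the toy adjacency matrix, and the choice of a directed multi-graph in which self-loops are allowed are assumptions made here purely for illustration.

```python
import numpy as np
from math import lgamma


def log_binom(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def xi_matrix(A):
    """Combinatorial matrix Xi_ij = k_out_i * k_in_j of a directed multi-graph."""
    k_out = A.sum(axis=1)  # out-degrees (row sums)
    k_in = A.sum(axis=0)   # in-degrees (column sums)
    return np.outer(k_out, k_in)


def log_prob_cm(A, Xi):
    """Log of eq. (2): multivariate hypergeometric probability of A given Xi."""
    m = int(A.sum())
    log_p = -log_binom(int(Xi.sum()), m)
    for a, x in zip(A.ravel(), Xi.ravel()):
        log_p += log_binom(int(x), int(a))
    return log_p


# Hypothetical toy multi-graph with m = 5 edges (assumption, not from the paper)
A = np.array([[0, 2, 1],
              [1, 0, 0],
              [0, 1, 0]])
Xi = xi_matrix(A)
print(Xi)
print(np.exp(log_prob_cm(A, Xi)))
```

In this toy case the entries of $\Xi$ sum to $m^2$, as expected when $\Xi$ is the outer product of out- and in-degrees.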
Regression Model. The aim of our regression model is to find a suitable way to estimate $\Omega$, based on the covariate layers $\{\mathcal{R}_l\}_{l \in [1, r]}$. We thus propose to define $\Omega$ as a function of the relational layers $\{\mathcal{R}_l\}_{l \in [1, r]}$:

$$\Omega := \prod_{l=1}^{r} \left( R^{(l)} \right)^{\theta_l} = \exp\left( \sum_{l=1}^{r} \theta_l \log R^{(l)} \right), \tag{4}$$

where $R^{(l)}$ is the adjacency matrix constructed from the network $\mathcal{R}_l$, and powers, logarithms, and exponentials are taken element-wise. Under this assumption, we fix a multiplicative relation between the different layers. That means, a value of 0 for a dyad $i, j$ in any layer $R^{(l)}$ corresponds to encoding the impossibility of observing any edge between $i$ and $j$. Moreover, a value of 1 for a dyad $i, j$ in a layer $R^{(l)}$ means that the layer does not affect the probability of observing this dyad. Furthermore, the right-hand side of eq. (4) provides a simple way to interpret the parameters $\theta_l$ of the model. If the covariate layers are specified in a convenient way, as we will show later, $\theta_l$ reflects the log-odds of observing an interaction between a pair of vertices for which there is an edge in the covariate layer $\mathcal{R}_l$, against a pair for which there is no edge in $\mathcal{R}_l$.

We can now specify the statistical model in eq. (1). We take $f$ as the expectation of the gHypEG that maximises the probability of observing $\mathcal{I}$, given the relational layers $\{\mathcal{R}_l\}_{l \in [1, r]}$. Estimating such a model is therefore equivalent to finding maximum likelihood estimators (MLE) for the parameter vector $\Theta = (\theta_1, \dots, \theta_r)$ entering eq. (4). Equations (3) and (4) show that the likelihood of $\Theta$ given the observed graph $\mathcal{I}$ is defined by

$$L(\Theta \mid \mathcal{I}) = \left[ \prod_{i,j} \binom{\Xi_{ij}}{A_{ij}} \right] \int_0^1 \prod_{i,j} \left( 1 - z^{\prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l} / S_\Theta} \right)^{A_{ij}} dz \tag{5}$$

with $S_\Theta = \sum_{i,j} \prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l} (\Xi_{ij} - A_{ij})$.

Although the numerical maximisation of eq.
(5) is difficult, for $m \ll \sum_{ij} \Xi_{ij}$ we can approximate the Wallenius non-central multivariate hypergeometric distribution with a multinomial distribution with appropriately chosen probabilities $p_{ij} = \Xi_{ij} \Omega_{ij} / \sum_{kl} \Xi_{kl} \Omega_{kl}$ (cf. [45]). Because $\sum_{ij} \Xi_{ij} \approx m^2$, the multinomial approximation holds even for small networks. Therefore eq. (5) as a function of $\Theta$ can be approximated up to constants by

$$L(\Theta \mid \mathcal{I}) \sim \prod_{i,j \in V} \left( \frac{\Xi_{ij} \prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l}}{\sum_{u,v \in V} \Xi_{uv} \prod_{l=1}^{r} \left( R^{(l)}_{uv} \right)^{\theta_l}} \right)^{A_{ij}}. \tag{6}$$

We obtain the MLE $\hat{\Theta} = \operatorname{argmax}_\Theta L(\Theta \mid \mathcal{I})$ of eq. (6) by solving numerically the system given by $\nabla L(\Theta) = 0$. Each component of the gradient of the log-likelihood $\nabla \log L(\Theta)$ is then given by

$$\frac{\partial \log L(\Theta \mid \mathcal{I})}{\partial \theta_l} = -m \, \frac{\sum_{ij} \log\left( R^{(l)}_{ij} \right) \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h}}{\sum_{ij} \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h}} + \sum_{ij} A_{ij} \log R^{(l)}_{ij}. \tag{7}$$

Thanks to the asymptotic properties of MLEs, we can compute the confidence intervals for the parameter estimates $\hat{\Theta}$. With $c$ as the appropriate $z$-critical value for a given confidence level (e.g., 1.96 for 95% confidence intervals), the confidence interval for one parameter $\theta_l$ is given as follows:

$$\theta_l \in \left[ \hat{\theta}_l - c \sqrt{\left( J(\hat{\Theta})^{-1} \right)_{ll}}, \; \hat{\theta}_l + c \sqrt{\left( J(\hat{\Theta})^{-1} \right)_{ll}} \right], \tag{8}$$

where $J(\hat{\Theta}) = -\nabla^2 \log L(\hat{\Theta} \mid \mathcal{I})$ is the observed Fisher information matrix [11]. From eq. (6) we get the following expression for $J(\hat{\Theta})$:

$$J(\hat{\Theta})_{lk} = m \, \frac{\left( \sum_{ij} w_{ij} \right) \left( \sum_{ij} \log R^{(l)}_{ij} \log R^{(k)}_{ij} \, w_{ij} \right)}{\left( \sum_{ij} w_{ij} \right)^2} - m \, \frac{\left( \sum_{ij} \log R^{(l)}_{ij} \, w_{ij} \right) \left( \sum_{ij} \log R^{(k)}_{ij} \, w_{ij} \right)}{\left( \sum_{ij} w_{ij} \right)^2}, \tag{9}$$

where, for brevity, $w_{ij} = \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\hat{\theta}_h}$.

In the R library ghypernet [6], available to download from CRAN, we provide the nrm routine to perform network regression model estimation.
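To make the estimation step concrete, the following sketch maximises the multinomial approximation of eq. (6) with plain Newton-Raphson steps, using the gradient of eq. (7) and the Fisher information of eq. (9), and then builds the confidence interval of eq. (8). This is not the `nrm` routine of ghypernet: the function names `score_and_information` and `fit_theta`, the optimiser, the toy data, and the assumption that all layer entries are strictly positive (so that logarithms exist) are choices made here for illustration.

```python
import numpy as np


def score_and_information(theta, a, xi, X):
    """Gradient, eq. (7), and Fisher information, eq. (9), of the log of eq. (6)."""
    m = a.sum()
    w = xi * np.exp(X @ theta)           # Xi_ij * Omega_ij for every dyad
    p = w / w.sum()                      # multinomial probabilities p_ij
    mean = p @ X                         # weighted mean of log R^(l)_ij
    grad = a @ X - m * mean              # eq. (7)
    second = (X * p[:, None]).T @ X      # weighted mean of log R^(l) * log R^(k)
    info = m * (second - np.outer(mean, mean))  # eq. (9)
    return grad, info


def fit_theta(A, Xi, layers, max_iter=100, tol=1e-10):
    """Newton-Raphson MLE of theta; no line search, so only a sketch."""
    mask = Xi > 0                        # dyads that can host an edge
    a = A[mask].astype(float)
    xi = Xi[mask].astype(float)
    X = np.column_stack([np.log(R[mask]) for R in layers])
    theta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        grad, info = score_and_information(theta, a, xi, X)
        step = np.linalg.solve(info, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    _, info = score_and_information(theta, a, xi, X)
    std_err = np.sqrt(np.diag(np.linalg.inv(info)))
    return theta, std_err


# Hypothetical toy data: 4 vertices, 15 interactions, one categorical layer
A = np.array([[0, 3, 1, 0],
              [2, 0, 0, 1],
              [0, 1, 0, 4],
              [1, 0, 2, 0]])
Xi = np.outer(A.sum(axis=1), A.sum(axis=0))
labels = np.array([0, 0, 1, 1])
R = np.where(labels[:, None] == labels[None, :], np.e, 1.0)

theta_hat, std_err = fit_theta(A, Xi, [R])
ci = np.column_stack([theta_hat - 1.96 * std_err,
                      theta_hat + 1.96 * std_err])  # eq. (8), 95% level
print(theta_hat, ci)
```

With a single layer taking only the values $e$ and 1, the estimate can be checked by hand: the fitted probability mass on same-label dyads must equal the observed fraction of edges falling on them.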
2.3 General regression model

The model described in the previous section can be generalised to account for multiple observations of the multiplex $\mathcal{M}$. For example, suppose we have data about contacts between students in a school, and we have collected the same type of data for different schools. Let us assume now that we want to learn whether gender homophily plays the same role in the interactions across all the schools. This implies that while the relations between the individuals change for different observations, e.g., gender distribution in different schools, the effect that the relations have on the interactions remains constant. In other words, the relational layers change for each observation, i.e., $\mathcal{R}^{(i)} \neq \mathcal{R}^{(j)}$, where $i$ and $j$ are different observations. On the other hand, the parameter $\theta$ quantifying the effect of the relations on the interactions is assumed to be constant, i.e., $\theta^{(i)} = \theta^{(j)} = \theta$.

Suppose thus that we have $N$ independent observations of the multiplex $\mathcal{M}$, each denoted $\mathcal{M}^{(i)}$. We assume that the influence of the independent layers $\mathcal{R}^{(i)}_l$ on the dependent layer $\mathcal{I}^{(i)}$ is fixed, i.e., $\theta^{(i)} = \theta$ for each observation $i \in \{1, \dots, N\}$. Since each observation $\mathcal{I}^{(i)}$ is independent and follows the distribution of the gHypEG given in eq. (3), the joint probability distribution is just the product of the individual probabilities. Therefore the likelihood of the parameter vector $\Theta$ is given by

$$L(\Theta \mid \mathcal{I}^{(1)}, \dots, \mathcal{I}^{(N)}) := \prod_{i=1}^{N} L(\Theta \mid \mathcal{I}^{(i)}), \tag{10}$$

where $L(\Theta \mid \mathcal{I}^{(i)})$ is defined as in eq. (5). It is worth noting that the interaction layers $\mathcal{I}^{(i)}$ come from the same class of distributions but are not identically distributed.
This is true unless the number of edges $m^{(i)} = m$ and the matrix $\Xi^{(i)} = \Xi$ are constant for each observation $(i)$.

Given the likelihood in eq. (10), we can derive the MLE $\hat{\Theta}$ of the parameter vector $\Theta$. Denoting with $\mathcal{L}(\Theta \mid \mathcal{I}^{(i)}) = \log L(\Theta \mid \mathcal{I}^{(i)})$ the log-likelihood of $\Theta$ and by

$$\bar{\mathcal{L}}(\Theta) = \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}(\Theta \mid \mathcal{I}^{(i)}) \tag{11}$$

the average log-likelihood, $\hat{\Theta}$ is defined as follows:

$$\hat{\Theta} = \operatorname{argmax}_{\Theta} \bar{\mathcal{L}}(\Theta). \tag{12}$$

2.4 Model selection and effect sizes

Recall we have a multiplex $\mathcal{M}$ with $r + 1$ layers. Suppose we have estimated the statistical regression model defined in section 2.3. We thus know the MLEs $\{\hat{\theta}_l\}_{l \in [1, r]}$ corresponding to the $r$ relational layers $\{\mathcal{R}_l\}_{l \in [1, r]}$, and each of their values quantifies the strength of the effect each layer has on the interaction layer $\mathcal{I}$. Are all these parameters needed? In other words, we want to quantify the goodness of fit of the model with all parameters $\{\hat{\theta}_l\}_{l \in [1, r]}$, and compare it to a model with fewer parameters. This allows us to select the parameters and the layers with significant effect, and disregard those with non-significant effects on the interactions.

We want to determine which of two statistical models defined by the sub-multiplexes $\{\mathcal{R}_l\}_{l \in [1, q]}$ and $\{\mathcal{R}_l\}_{l \in [1, q+s]}$ as in eq. (1), one with $q$ and the other one with $q + s$ relational layers, better describes the observed interaction layer $\mathcal{I}$. Both models are described by eq. (5) with the appropriate layers chosen as predictors. The two models are nested, as one is a particular case of the other. In fact, the model defined by $\{\mathcal{R}_l\}_{l \in [1, q]}$ can be obtained by setting to 0 the $s$ coefficients $\{\theta_l\}_{l \in [q+1, q+s]}$ corresponding to the layers $\{\mathcal{R}_l\}_{l \in [q+1, q+s]}$ in the second model (cf. eq. (4)).
We can perform model selection using the likelihood ratio test. In particular, we can identify the null hypothesis $H_0$ by the model defined by $\{\mathcal{R}_l\}_{l \in [1, q]}$ with $\tilde{q}$ parameters, and the alternative hypothesis $H_1$ by the model defined by $\{\mathcal{R}_l\}_{l \in [1, q+s]}$, with $\tilde{s}$ more parameters. This allows testing whether the explanatory power of the more complex model with $\tilde{q} + \tilde{s}$ parameters is high enough to justify the increase in complexity.

Alternatively, as discussed already in [4], we can use AIC and BIC to choose the best of the two models. Moreover, information criteria allow us to compare all models built using different combinations of layers, even when they are not nested. If we proceed in a step-wise fashion, constructing the sub-multiplexes corresponding to the set of predictors with decreasing AIC scores, we obtain a forward selection method that allows building models of increasing complexity.

Finally, the goodness of fit of the model can be assessed qualitatively through the adjusted McFadden's pseudo-r-squared $\rho^2$ [26]. The adjusted McFadden's pseudo-r-squared is a coefficient of determination analogous to the multiple-correlation coefficient used in OLS linear regression models, adjusted for model complexity. It is based on maximum likelihood estimates of model parameters, and it is hence suitable to evaluate the goodness of fit of our model. It is defined as follows:

$$\rho^2 = 1 - \frac{\mathcal{L}(\hat{\Theta}_{q+s} \mid \mathcal{I}) - K}{\mathcal{L}(\Theta_0 \mid \mathcal{I})}. \tag{13}$$

In eq. (13), $\mathcal{L}(\hat{\Theta}_{q+s} \mid \mathcal{I})$ is the log-likelihood of the full model obtained from the MLE $\hat{\Theta}_{q+s}$ of the parameter vector $\Theta_{q+s}$, $\mathcal{L}(\Theta_0 \mid \mathcal{I})$ is the log-likelihood of the null model where no explanatory variable is used (i.e., the CM), and $K$ is the number of degrees of freedom of the full model. The closer the value of $\rho^2$ is to 1, the better is the fit of the model.
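The quantities used for model selection are simple functions of the maximised log-likelihoods, as the following sketch shows. The function names and all numerical values are hypothetical and chosen here for illustration; in particular, the number of fitted parameters is passed as $K$ in eq. (13), which is an assumption of this sketch. The likelihood ratio statistic is conventionally compared against a chi-squared distribution with as many degrees of freedom as there are additional parameters.

```python
def likelihood_ratio_statistic(loglik_null, loglik_alt):
    """Statistic 2 * (log L_1 - log L_0) for two nested models."""
    return 2.0 * (loglik_alt - loglik_null)


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * loglik


def adjusted_mcfadden(loglik_full, loglik_null, n_params):
    """Adjusted McFadden pseudo-r-squared, eq. (13), with K = n_params."""
    return 1.0 - (loglik_full - n_params) / loglik_null


# Hypothetical log-likelihoods (illustrative values, not results from the paper)
loglik_cm = -1000.0   # null model without covariates (the CM)
loglik_q = -800.0     # model with q = 2 layers
loglik_qs = -790.0    # model with q + s = 3 layers

lr = likelihood_ratio_statistic(loglik_q, loglik_qs)
rho2 = adjusted_mcfadden(loglik_qs, loglik_cm, n_params=3)
print(lr, aic(loglik_q, 2), aic(loglik_qs, 3), rho2)
```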
The inclusion of the number of degrees of freedom $K$ adjusts for model complexity, by penalising models with an excessive number of parameters. The value of $\rho^2$ can also be seen as an estimate of the amount of variability in the data explained by the model, in terms of the relative increase in likelihood of the model. More generally, we can define a McFadden coefficient

$$\mathrm{MC} = 1 - \frac{\mathcal{L}(\hat{\Theta}_{q+s} \mid \mathcal{I})}{\mathcal{L}(\hat{\Theta}_{q} \mid \mathcal{I})} \tag{14}$$

that allows us to evaluate the relative increment in likelihood provided by extending the model defined by $\{\mathcal{R}_l\}_{l \in [1, q]}$ with $q$ parameters with $s$ more parameters. Here, $\mathcal{L}$ denotes the log-likelihood. Large values of the McFadden coefficient can then be used as a proxy for the improvement generated by the addition of new predictors which introduce new information into the model. Low values of the McFadden coefficient, on the other hand, reflect the introduction of predictors that do not improve the model considerably and thus do not provide new insights on the data.

In the next section, we show an application of these methods on an empirical dataset about human interactions.

3 Application: High School Contacts Analysis

3.1 Data

We showcase our method with a case study. Specifically, we apply our technique to a SocioPatterns dataset [25], to measure the strength and the significance of the effect of each layer of information provided on the observed number of interactions. At https://www.sg.ethz.ch/nrm-tutorial, we provide a tutorial companion to this article with the code used to generate the results shown here.

In this case study, we analyse a dataset consisting of 188 508 recorded contacts between 327 students over five days that we represent in the graph of interactions $\mathcal{I}$.
The dataset contains additional types of relations between the students that we encode as predictors in different relational layers.

The available relations that serve as covariates for the regression model are the following. There are two social networks providing connections between the students. One social network contains self-reported (directed) friendship relations. The second social network reports Facebook connections between students, which result in undirected links. Furthermore, students are assigned to 9 different classes, grouped into 4 topical blocks. We hypothesise that students in the same class and in the same topical block are more likely to interact with each other. Moreover, the gender of each student is provided.

These last 3 layers are thus built according to categorical information about the vertices. In practice, these would require defining a block model for each category and then estimating them all together. However, here, because we want to be able to join together categorical data with dyadic data, we proceed differently than in [4]. Let $l$ be a labelling of vertices and $\mathcal{R}_l$ the corresponding layer in the multiplex representation. We can define a partition vector $z^l$, whose $i$-th entry $z^l_i$ specifies the label of vertex $i$. For every dyad $i, j$ for which $z^l_i = z^l_j$, we set $R^{(l)}_{ij} = \kappa$. In the cases where $z^l_i \neq z^l_j$, we set $R^{(l)}_{ij} = 1$. When performing the MLE of the parameter $\theta_l$, corresponding to the layer $\mathcal{R}_l$, we rescale the value of $R^{(l)}$ such that $(R^{(l)})^{\theta_l}$ estimates the strength of the effect provided by the labelling $l$. Note that we could choose any value for $\kappa$, as what we are interested in is the MLE propensity $\kappa^{\theta}$, which will be constant, satisfying eq. (5).
If we fix $\kappa = e$, where $e$ is Euler's number, $\theta_l$ can be easily interpreted in terms of the order of magnitude of the contribution of $R^{(l)}$ to the propensity matrix of the gHypEG. Furthermore, if we define the odds-ratio $\omega = \Omega_{ij} / \Omega_{kl}$ between a dyad $i, j$ with the same label and a dyad $k, l$ with different labels, all other layers being equal, we can see that the log-odds $\log(\omega)$ are given by $\theta$:

$$\log(\omega) = \log\left( \frac{\Omega_{ij}}{\Omega_{kl}} \right) = \log\left( \left( \frac{R^{(l)}_{ij}}{R^{(l)}_{kl}} \right)^{\theta_l} \right) = \theta_l \log\left( \frac{R^{(l)}_{ij}}{R^{(l)}_{kl}} \right) = \theta_l \log\left( \frac{e}{1} \right) = \theta_l. \tag{15}$$

If $\theta_l$ is larger than 0, there is a positive effect, as $\kappa^{\theta_l} > 1$ is increasing the propensity $\Omega_{ij}$ for $i, j$ with the same label. Similarly, if $\theta_l < 0$ there is a negative effect, i.e., the graph is disassortative with respect to the labelling $l$.

Figure 3: The graph obtained from the contacts between students. Each student is coloured according to their class membership (PC, PC*, PSI*, 2BIO1, 2BIO2, 2BIO3, MP, MP*1, MP*2) and the internal ring groups classes on a similar topic. From this figure, it is clear that most of the contacts happen between students of the same class, and there is a preference for contacts between students attending classes on the same topic.

The first relational layer $\mathcal{R}_C$ in the dataset reflects the separation of students into 9 classes. We want to control for the separation into classes, as the encounters between students attending different classes are naturally limited, as can be observed in fig. 3. To build $R^{(C)}$ we can set $R^{(C)}_{ij} = e$ if $i, j$ are in the same class, and $R^{(C)}_{ij} = 1$ if $i, j$ are in different classes.

A second relational layer $\mathcal{R}_T$ is built according to the topic of the different classes students take.
The nine classes are grouped into four topical areas of 3, 3, 2, and 1 classes, respectively. There are three classes of type "MP" (MP, MP*1, MP*2), two of type "PC" (PC and PC*), one of type "PSI" (PSI*) and three of type "BIO" (2BIO1, 2BIO2, 2BIO3). This separation is highlighted by the internal ring in fig. 3. The layer $R_T$ is defined similarly to $R_C$, setting $R^{(T)}_{ij} = e$ if $i, j$ attend classes in the same topical area, and $R^{(T)}_{ij} = 1$ if not.

The third relational layer $R_G$ is built using the gender of the students. We want to correct for gender homophily, as this could partially play a role in student interactions. We build its adjacency matrix $R^{(G)}$ as above.

The dataset also provides information about actual friendship relations between the students. We can build the fourth predictor using the social network obtained from self-reported friendship relations. Because this data is self-reported, it generates a directed social network. In fact, some students report a friendship with another student who does not report it in return. For this reason, and given that the interactions on which we want to regress are undirected, we model this predictor as two separate layers. We set one layer $R_f$ to capture all corresponded (i.e., reciprocated) friendships, that is, all those edges that are symmetric. We set a second layer $R_{1/2f}$ to capture all non-corresponded friendships, that is, all those edges that are asymmetric. Both layers are built following a process similar to the one discussed above: if there is a friendship relation between two vertices $i, j$, we set the value of the adjacency matrix of the corresponding layer to $R^{(f)}_{ij} = \kappa = e$. Otherwise, we set the value in the adjacency matrix to 1. This way, we can interpret the parameter $\theta_f$ as the log-odds of observing an interaction between two 'friends' against two non-'friends'.
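The split of the directed friendship network into a corresponded and a non-corresponded layer can be sketched as follows. This is an illustrative Python snippet under stated assumptions: the function name `friendship_layers` and the toy adjacency matrix are hypothetical, and `A[i, j] > 0` is taken to mean that student $i$ reports student $j$ as a friend.

```python
import numpy as np

def friendship_layers(A, kappa=np.e):
    """Split a directed friendship matrix into two undirected layers (hypothetical helper).

    Returns (R_f, R_half): R_f marks reciprocated friendships and R_half marks
    one-sided ones; both hold kappa where the relation is present and 1 elsewhere.
    """
    reported = np.asarray(A) > 0
    mutual = reported & reported.T                  # both students report the tie
    one_sided = (reported | reported.T) & ~mutual   # exactly one student reports it
    R_f = np.where(mutual, kappa, 1.0)
    R_half = np.where(one_sided, kappa, 1.0)
    return R_f, R_half

# Toy example: students 0 and 1 name each other; student 0 names student 2 only.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 0, 0]])
R_f, R_half = friendship_layers(A)
```

Both returned matrices are symmetric, which matches the undirected interactions on which the model regresses.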
The fifth predictor is built using the provided Facebook connections. This dataset is, according to [25], incomplete. In fact, not all students disclosed their Facebook accounts to extract relations. That means that, for some students, we know the presence or the absence of Facebook relationships, while for others we cannot say anything. Instead of modelling the lack of information as a lack of relations, we can split this predictor into two non-separable layers. We do so by building a layer $R_{fb}$ similarly to the friendship layer. In particular, we set $R^{(fb)}_{ij} = 1$ for all those dyads for which we have no data. Moreover, we generate a dummy 'correction' layer $R_\varepsilon$ where we set $R^{(\varepsilon)}_{ij} = e$ for all those dyads for which we have no data, and $R^{(\varepsilon)}_{ij} = 1$ otherwise. In this way, we can estimate the true effect of the available Facebook relations, correcting for the effect of the non-available ones. This is because, in this case,

$$\log\left(\omega^{\text{friend}}_{\text{non-friend}}\right) = \log\left(\frac{\Omega_{\text{fb friend}}}{\Omega_{\text{non-friend}}}\right) = \log\left[\left(\frac{R^{(fb)}_{\text{fb friend}}}{R^{(fb)}_{\text{non-friend}}}\right)^{\theta_{fb}} \cdot \left(\frac{R^{(\varepsilon)}_{\text{data}}}{R^{(\varepsilon)}_{\text{data}}}\right)^{\theta_{\varepsilon}}\right] = \theta_{fb}, \tag{16}$$

$$\log\left(\omega^{\text{friend}}_{\text{no data}}\right) = \log\left(\frac{\Omega_{\text{fb friend}}}{\Omega_{\text{no data}}}\right) = \log\left[\left(\frac{R^{(fb)}_{\text{fb friend}}}{R^{(fb)}_{\text{no data}}}\right)^{\theta_{fb}} \cdot \left(\frac{R^{(\varepsilon)}_{\text{data}}}{R^{(\varepsilon)}_{\text{no data}}}\right)^{\theta_{\varepsilon}}\right] = \theta_{fb} - \theta_{\varepsilon}. \tag{17}$$

Hence, $\theta_{fb}$ provides the log-odds of interactions between students that are friends on Facebook against those between students that are not friends. Similarly, $\theta_{fb} - \theta_{\varepsilon}$ gives the log-odds of interactions between students that are friends on Facebook against those with a student that did not provide access to the data.

In general, we assume that the absence of an edge in either $R_f$ or $R_{fb}$ is not enough to prevent an interaction from happening.
It is for this reason that we choose to set the weight of the relations between students who are not "friends" in either of the two layers to 1. If we were to set it to 0 instead, we would have disallowed the presence of edges between those dyads in the model entirely, in contrast with what is observed in the data.

We speculate that $R_C$ will have a very strong influence on the interactions, since the division into classes acts as a physical boundary for student interactions. In general, moreover, we would assume that the information provided by the two social networks (self-reported friendship and Facebook) will be comparable, as the reported friendship relations should be part of the Facebook connections. Similarly, we expect that corresponded friendships will yield a stronger effect on interactions than non-corresponded ones.

3.2 Model

We build a regression model with the five predictors described above. With such a model, we answer the question of whether interactions between students are related to (a) friendship relations, as perceived by the students themselves, and (b) Facebook connections. The estimated effects are corrected for the degree of the vertices, i.e., for how active students are, and for the fact that the students are physically separated into different classes.

As a first step, we estimate a model for case (a) and one for case (b). The first two columns of table 1 provide the estimates of $\Theta^{(a)}$ and $\Theta^{(b)}$, respectively. In both cases, we see that there is a strong effect provided by the two social networks, signalled by a positive value of the estimated parameters. We also see that the offline social network defined by the friendship relations has a stronger effect than the online social network. This can be seen both from the effect size, given by the absolute value of the parameters, and from the larger value of $\rho^2$ and the smaller AIC.
The third and fourth columns in table 1 show the models estimated after correcting for the control variables defined by $R_C$, $R_T$, $R_G$. We notice that, while the effect of friendship relations remains strong, the effect of Facebook connections almost entirely disappears when controlling for the class membership of students. Finally, in the full model shown in the fifth column of table 1, it can be seen that friendship relations completely take over the small explanatory power provided by Facebook connections.

Table 1: Fitted parameters for the 7-layer model, standard errors for the estimates (in parentheses), and corresponding significance of the parameter estimates, obtained from a standard t-test as described in eq. (8). Three stars correspond to a p-value $p < \alpha = 0.001$. For the regression, we used $\kappa = e$.

| Group | Layer | $\Theta^{(a)}$ | $\Theta^{(b)}$ | $\Theta^{(a\dagger)}$ | $\Theta^{(b\dagger)}$ | $\Theta^{(a\dagger + b\dagger)}$ |
|---|---|---|---|---|---|---|
| Control | $R^{(C)}$ | | | 3.196*** (0.011) | 3.318*** (0.011) | 3.168*** (0.011) |
| Control | $R^{(T)}$ | | | 2.275*** (0.021) | 2.281*** (0.021) | 2.281*** (0.021) |
| Control | $R^{(G)}$ | | | 0.194*** (0.005) | 0.258*** (0.005) | 0.200*** (0.005) |
| Friendship | $R^{(f)}$ | 3.696*** (0.005) | | 1.810*** (0.006) | | 1.820*** (0.006) |
| Friendship | $R^{(1/2f)}$ | 2.147*** (0.015) | | 0.385*** (0.015) | | 0.421*** (0.015) |
| Facebook | $R^{(fb)}$ | | 2.344*** (0.006) | | 0.535*** (0.006) | 0.106*** (0.006) |
| Facebook | $R^{(\varepsilon)}$ | | 0.564*** (0.005) | | 0.330*** (0.005) | 0.357*** (0.005) |
| | AIC | 516342.8 | 643916.5 | 4061.8 | 71444.8 | 0 |
| | $\rho^2$ | 0.175 | 0.081 | 0.556 | 0.506 | 0.559 |

*** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$

As we expected, the results of the regression show a strong effect of the separation of vertices into the categories corresponding to classes. In the full model, the large positive value of $\theta_C$ implies an odds-ratio of $e^{\theta_C} = 23.76$ for the probability of an interaction between classmates against an encounter between students of different classes, everything else being equal. This means that two classmates are approximately 24 times more likely to meet than students of different classes. Class topics are also a driving force for the interactions. Contact between students attending classes on the same topic is about 10 times more likely to be observed than contact between students attending classes on different topics. The value of $\theta_G$ supports the presence of a weak gender homophily in the encounters between students, with an odds-ratio of 1.22. The effect of self-reported friendship is large and positive as expected, while non-corresponded friendships yield a much lower effect, even though this is larger than that provided by Facebook connections.

Table 2: Model selection steps. For each step, we report the McFadden R squared ($\rho^2$), the improvement in AIC, and the relative goodness-of-fit in terms of the MC coefficient. The 6 models are ordered by increasing complexity. Standard errors are given in parentheses.

| Group | Layer | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|---|
| Control | $R^{(C)}$ | 4.641*** (0.007) | 4.417*** (0.007) | 3.207*** (0.008) | 3.179*** (0.008) | 3.176*** (0.008) | 3.168*** (0.008) |
| Control | $R^{(T)}$ | | | 2.307*** (0.016) | 2.314*** (0.016) | 2.282*** (0.016) | 2.281*** (0.016) |
| Control | $R^{(G)}$ | | | | | 0.205*** (0.004) | 0.200*** (0.004) |
| Friendship | $R^{(f)}$ | | 1.819*** (0.004) | 1.812*** (0.004) | 1.817*** (0.004) | 1.801*** (0.004) | 1.820*** (0.005) |
| Friendship | $R^{(1/2f)}$ | | | | | | 0.421*** (0.011) |
| Facebook | $R^{(fb)}$ | | | | 0.121*** (0.005) | 0.124*** (0.005) | 0.106*** (0.005) |
| Facebook | $R^{(\varepsilon)}$ | | | | 0.347*** (0.004) | 0.351*** (0.004) | 0.357*** (0.004) |
| | AIC | 98609.8 | 21323.0 | 6370.6 | 2514.3 | 713.7 | 0.0 |
| | MC | 0.486 | 0.112 | 0.024 | 0.006 | 0.003 | 0.001 |
| | $\rho^2$ | 0.486 | 0.543 | 0.554 | 0.557 | 0.558 | 0.559 |

*** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$

We now proceed to study the contribution of each relational layer to model fit. To do so, we follow a stepwise selection method, as described in section 2.4. We introduce one predictor after the other, starting from those that have the highest contribution according to AIC. This means that, in the first step, we add the predictor whose corresponding model has the lowest AIC. Then, in the second step, we add each of the remaining predictors in turn to the first one, to find the second-best contribution, and we proceed in this way until all predictors have been added.

During the stepwise selection process, we monitor the increment in goodness-of-fit in terms of AIC, and the relative goodness-of-fit in terms of the McFadden coefficient MC. As described above, these two criteria give us two alternative ways to perform model selection. By looking at the change in AIC (cf. table 2), we see a clear difference between the first two models and the remaining ones. In fact, both predictors $R_C$ and $R_f$ provide a substantial decrease in AIC. In other words, they have good explanatory power for the observed interactions. The other predictors, on the other hand, provide a less marked decrease in AIC. We also observe that the third and fourth models, corresponding to the introduction of the layer $R_T$ (topic) and of the couple $R_{fb}$ and $R_\varepsilon$ (fb), provide a similar decrease in AIC. Finally, the last two predictors provide a smaller decrease in AIC. In terms of information, however, the best model according to AIC is nevertheless the one that incorporates all parameters [24].
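The stepwise procedure described above can be sketched as a greedy loop. The snippet below is only an illustration under stated assumptions: `forward_selection` is a hypothetical helper, and `toy_aic` is a stand-in scoring function with made-up numbers. In an actual analysis, the scoring function would refit the gHypEG regression for each candidate set of layers and return its AIC.

```python
def forward_selection(candidates, aic):
    """Greedy forward selection: at each step, add the predictor that yields the lowest AIC.

    `aic` is any callable mapping a list of predictor names to an AIC value.
    Returns the list of (predictor added, AIC after adding it), in order of inclusion.
    """
    selected, remaining, path = [], list(candidates), []
    while remaining:
        best = min(remaining, key=lambda c: aic(selected + [c]))
        selected.append(best)
        remaining.remove(best)
        path.append((best, aic(selected)))
    return path

# Toy, made-up AIC gains (NOT the values of table 2), used only to exercise the loop.
# "facebook" stands for the couple of layers R_fb and R_eps, added together.
toy_gain = {"class": 500.0, "friendship": 300.0, "topic": 50.0, "facebook": 40.0}

def toy_aic(predictors):
    """Hypothetical additive score: every predictor lowers the AIC by a fixed amount."""
    return 1000.0 - sum(toy_gain[p] for p in predictors)

path = forward_selection(toy_gain, toy_aic)
```

With an additive toy score the order of inclusion is simply the order of the gains; with correlated predictors, as in the dataset studied here, the gain of a predictor depends on which layers are already in the model, which is why each step has to be re-evaluated.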
If we consider the relative improvement in likelihood instead, as provided by the McFadden coefficient, we see a similar pattern. The first two parameters provide a definite improvement in the goodness of fit, the second two a small improvement, while the last two show a negligible improvement. The reason for these results lies in the fact that the predictors are partially correlated. In fact, the class predictor and the friendship predictors provide largely independent data. The third predictor (topic), although important, is a superset of the class predictor, since students in the same class necessarily attend classes on the same topic. Hence, it yields a smaller improvement in the goodness of fit of the model. A solution to this issue could be obtained by modelling the different classes as separate blocks in a BCCM (cf. [4]), increasing the number of parameters but capturing both the class and the topic membership at the same time. The fourth predictor, reporting Facebook relations, is incomplete. Hence, it can only explain part of the data. Moreover, it is partly correlated with self-declared friendship, as people who declare themselves friends are often friends on Facebook (40% of the two social networks overlap).

From this example, we can conclude that in the dataset studied the observed interactions are strongly influenced by social relations in the form of friendship links, even when we correct for the subdivision into classes and topics, as can also be visualised in fig. 3. Gender homophily is, instead, relatively weak after accounting for all factors. Moreover, it is interesting to note that non-corresponded friendships, i.e., friendships that have been declared by only one student, have a very low effect on the observed interactions, as long as corresponded friendships are taken into account.
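The odds-ratios discussed in section 3.2 follow directly from the full-model estimates in the last column of table 1, since with $\kappa = e$ each $\theta_l$ is a log-odds. A short Python sketch of this arithmetic is given below; the dictionary keys are illustrative names chosen here, not identifiers from the original analysis.

```python
import math

# Full-model estimates from the last column of table 1 (kappa = e).
theta = {
    "class": 3.168,
    "topic": 2.281,
    "gender": 0.200,
    "friendship": 1.820,
    "half_friendship": 0.421,
    "facebook": 0.106,
    "facebook_no_data": 0.357,
}

# With kappa = e, the odds-ratio associated with a layer is exp(theta).
odds_ratio = {name: math.exp(value) for name, value in theta.items()}

# Eq. (17): Facebook friends against dyads without Facebook data.
log_odds_fb_vs_no_data = theta["facebook"] - theta["facebook_no_data"]
odds_fb_vs_no_data = math.exp(log_odds_fb_vs_no_data)

for name, ratio in odds_ratio.items():
    print(f"{name}: {ratio:.2f}")
```

This reproduces the odds-ratios of 23.76 for classmates and 1.22 for gender quoted in the text, and a value just below 10 for class topics. The negative difference $\theta_{fb} - \theta_\varepsilon$ corresponds to an odds-ratio below 1 for Facebook friends against dyads with no Facebook data.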
4 Conclusion

In this article, we have proposed a new statistical model to quantify how observed interactions depend on different relations, in the framework of multiplex networks. The model is based on the assumption that interactions between elements of a system are driven by two factors. The first factor is the existence of relations between elements, such as friendship or homophily. The second is the combinatorial randomness caused by the activity of the elements. Elements that are more active are more likely to interact with each other, even if they are unrelated.

Unlike common approaches used in network analysis, our methodology has been specifically designed to deal with multi-edge graphs. It therefore allows the use of all the available data, without the need to threshold it to obtain unweighted, i.e., binary, graphs. In fact, repeated interactions between elements of a system generate multi-edge graphs, where the vertices correspond to the elements of the system. Similarly, relations can have varying intensity and should be encoded in weighted graphs. This is why thresholding the data into binary networks can be a waste of useful information.

Our model separates random and deterministic influences on interactions, accounting for the randomness as combinatorial effects. We hence identify how much known relations drive the interactions. To achieve this, we base our regression model on generalised hypergeometric ensembles of random graphs, a class of statistical network ensembles we have recently introduced. The formulation of our model allows us to estimate the strength of the dependence between relations and interactions, together with its statistical significance.
Moreover, the parameters estimated by our model can be readily interpreted as the log-odds of observing interactions between two different pairs of vertices. Thanks to these characteristics, the statistical regression model described in this article is a powerful tool for the analysis of complex systems consisting of a large number of highly interacting elements.

Studying how different relations drive observed interactions is not only necessary to increase the understanding of a system, it is also needed to control the dynamics of a system. In fact, to do so we have to appropriately modify the relations that are the driving forces underlying its behaviour. Similarly, if we want to increase the resilience of a system, we want to affect the relations that are responsible for its weaknesses. Having a clear understanding of how and which relations impact the behaviour of the elements of a system is a necessary condition to properly control it.

In conclusion, the method we propose is a major advance for the analysis of relational datasets and complex networks. By allowing the study of multi-edge and weighted graphs, it increases the breadth of applicability of network theory. In future work, it will make it possible to identify missing interactions, according to null models based on known relations. Thanks to this, it will be possible to uncover unknown relations between elements of a system.

Acknowledgment

The author thanks S. Schweighofer, G. Vaccario and F. Schweitzer for useful discussion, and V. Nanumyan for designing fig. 3.

References

[1] Ahnert, S. E.; Garlaschelli, D.; Fink, T. M. A.; Caldarelli, G. (2007). Ensemble approach to the analysis of weighted networks. Physical Review E 76(1), 016101.
[2] Brandenberger, L. (2018). Trading favors: Examining the temporal dynamics of reciprocity in congressional collaborations using relational event models. Social Networks 54, 238–253.
[3] Butts, C. T. (2008). A Relational Event Framework for Social Action. Sociological Methodology 38(1), 155–200.
[4] Casiraghi, G. (2019). The block-constrained configuration model. Applied Network Science 4(1), 123.
[5] Casiraghi, G.; Nanumyan, V. (2018). Generalised hypergeometric ensembles of random graphs: the configuration model as an urn problem. arXiv preprint arXiv:1810.06495.
[6] Casiraghi, G.; Nanumyan, V. (2020). ghypernet: Fit and Simulate Generalised Hypergeometric Ensembles of Graphs.
[7] Casiraghi, G.; Nanumyan, V.; Scholtes, I.; Schweitzer, F. (2016). Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks. arXiv preprint arXiv:1607.02441.
[8] Casiraghi, G.; Nanumyan, V.; Scholtes, I.; Schweitzer, F. (2017). From relational data to graphs: Inferring significant links using generalized hypergeometric ensembles. In: International Conference on Social Informatics. Springer, pp. 111–120.
[9] Chesson, J. (1978). Measuring Preference in Selective Predation. Ecology 59(2), 211–215.
[10] Cranmer, S. J.; Desmarais, B. A. (2011). Inferential network analysis with exponential random graph models. Political Analysis 19(1), 66–86.
[11] Degroot; Schervish (2002). Probability and Statistics. Addison-Wesley Publishing, 3rd edn. ISBN 0201524880.
[12] Dicks, L. V.; Corbet, S. A.; Pywell, R. F. (2002). Compartmentalization in plant-insect flower visitor webs. Journal of Animal Ecology 71(1), 32–43.
[13] Eagle, N.; Pentland, A. S. (2006). Reality mining: sensing complex social systems. Personal and Ubiquitous Computing 10(4), 255–268.
[14] Fosdick, B. K.; Larremore, D. B.; Nishimura, J.; Ugander, J. (2018). Configuring random graph models with fixed degree sequences. SIAM Review 60(2), 315–355.
[15] Garas, A. (ed.) (2016). Interconnected Networks. Understanding Complex Systems, Cham: Springer International Publishing. ISBN 978-3-319-23945-3.
[16] Garas, A.; Argyrakis, P.; Rozenblat, C.; Tomassini, M.; Havlin, S. (2010). Worldwide spreading of economic crisis. New Journal of Physics 12(11), 113043.
[17] Hoff, P.; Fosdick, B.; Volfovsky, A.; Stovel, K. (2013). Likelihoods for fixed rank nomination networks. Network Science 1(03), 253–277.
[18] Hoff, P. D.; Raftery, A. E.; Handcock, M. S. (2002). Latent Space Approaches to Social Network Analysis. Journal of the American Statistical Association 97(460), 1090–1098.
[19] Hubert, L.; Schultz, J. (1976). Quadratic assignment as a general data analysis strategy. British Journal of Mathematical and Statistical Psychology 29(2), 190–241.
[20] Karrer, B.; Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Physical Review E 83(1), 016107.
[21] Krackhardt, D. (1988). Predicting with networks: Nonparametric multiple regression analysis of dyadic data. Social Networks 10(4), 359–381.
[22] Krivitsky, P. N. (2012). Exponential-family random graph models for valued networks. Electronic Journal of Statistics 6, 1100–1128.
[23] Krivitsky, P. N.; Butts, C. T. (2017). Exponential-family random graph models for rank-order relational data. Sociological Methodology 47(1), 68–112.
[24] Lehmann, E. L.; Romano, J. P. (eds.) (2005). Testing Statistical Hypotheses. Springer Texts in Statistics, New York, NY: Springer New York. ISBN 978-0-387-98864-1.
[25] Mastrandrea, R.; Fournet, J.; Barrat, A. (2015). Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS ONE 10(9), 1–26.
[26] McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics, 105–142.
[27] McPherson, M.; Smith-Lovin, L.; Cook, J. M. (2001). Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology 27(1), 415–444.
[28] Memmott, J. (1999). The structure of a plant-pollination food web. Ecology Letters 2, 276–280.
[29] Molloy, M.; Reed, B. (1995). A critical point for random graphs with a given degree sequence. Random Structures & Algorithms 6(2-3), 161–180.
[30] Molloy, M.; Reed, B. (1998). The Size of the Giant Component of a Random Graph with a Given Degree Sequence. Combinatorics, Probability and Computing 7(3), 295–305.
[31] Newman, M. E. J.; Peixoto, T. P. (2015). Generalized Communities in Networks. Physical Review Letters 115(8), 088701.
[32] Onnela, J.-P.; Chakraborti, A.; Kaski, K.; Kertész, J.; Kanto, A. (2003). Dynamics of market correlations: Taxonomy and portfolio analysis. Physical Review E 68(5), 056110.
[33] Peixoto, T. P. (2014). Hierarchical Block Structures and High-Resolution Model Selection in Large Networks. Physical Review X 4(1), 011047.
[34] Rohe, K.; Chatterjee, S.; Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. The Annals of Statistics 39(4), 1878–1915.
[35] Sewell, D. K.; Chen, Y. (2015). Latent Space Models for Dynamic Networks. Journal of the American Statistical Association 110(512), 1646–1657.
[36] Snijders, T.; Spreen, M.; Zwaagstra, R. (1995). The Use of Multilevel Modeling for Analysis of Personal Networks: Networks of Cocaine Users in an Urban Area. Journal of Qualitative Anthropology 5(2), 85–105.
[37] Snijders, T. A. (2011). Statistical Models for Social Networks. Annual Review of Sociology 37(1), 131–153.
[38] Snijders, T. A. B. (1996). Stochastic actor-oriented models for network change. The Journal of Mathematical Sociology 21(1-2), 149–172.
[39] Snijders, T. A. B.; van de Bunt, G. G.; Steglich, C. E. G. (2010). Introduction to stochastic actor-based models for network dynamics. Social Networks 32(1), 44–60.
[40] Stehlé, J.; Voirin, N.; Barrat, A.; Cattuto, C.; Isella, L.; Pinton, J. F.; Quaggiotto, M.; van den Broeck, W.; Régis, C.; Lina, B.; Vanhems, P. (2011). High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE 6(8).
[41] Sussman, D. L.; Tang, M.; Fishkind, D. E.; Priebe, C. E. (2012). A Consistent Adjacency Spectral Embedding for Stochastic Blockmodel Graphs. Journal of the American Statistical Association 107(499), 1119–1128.
[42] Tomasello, M. V.; Napoletano, M.; Garas, A.; Schweitzer, F. (2017). The rise and fall of R&D networks. Industrial and Corporate Change 26(4), 617–646.
[43] Wallenius, K. T. (1963). Biased Sampling: the Noncentral Hypergeometric Probability Distribution. Ph.D. thesis, Stanford University.
[44] Warner, R. M.; Kenny, D. A.; Stoto, M. (1979). A new round robin analysis of variance for social interaction data. Journal of Personality and Social Psychology 37(10), 1742–1757.
[45] Zingg, C.; Casiraghi, G.; Vaccario, G.; Schweitzer, F. (2019). What is the Entropy of a Social Organization? Entropy 21(9), 901.