Multiplex Network Regression: How do relations drive interactions?

G. Casiraghi: Multiplex Network Regression: How do relations drive interactions? (Submitted for publication: 10 July 2020)

Giona Casiraghi∗
Chair of Systems Design, ETH Zürich, Weinbergstrasse 56/58, 8092 Zürich, Switzerland.

Abstract

We introduce a statistical regression model to investigate the impact of dyadic relations on complex networks generated from observed repeated interactions. It is based on generalised hypergeometric ensembles (gHypEG), a class of statistical network ensembles developed recently to deal with multi-edge graph and count data. We represent different types of known relations between system elements by weighted graphs, separated in the different layers of a multiplex network. With our method, we can regress the influence of each relational layer, the explanatory variables, on the interaction counts, the dependent variables. Moreover, we can quantify the statistical significance of the relations as explanatory variables for the observed interactions. To demonstrate the power of our approach, we investigate an example based on empirical data.

1 Introduction

In the study of real-world complex systems, we often deal with datasets of observed repeated interactions between individuals. These datasets are used to generate networks where the system's elements are represented by vertices and interactions by edges. We ask whether these interactions are random events or whether they are driven by existing relations between the system's elements. To answer this question, we propose a statistical model to regress relations, which we identify as covariate variables, on a network created from interactions, which we will refer to as our dependent variables. In general, a regression model explains dependent variables as a function of some covariates, accounting for random effects.
Here, we assume that the observed interactions are driven by different relations, possibly masked by combinatorial effects. By combinatorial effects, we mean that elements that interact more in general are also more likely to interact with each other, even if they have no relations. This problem is well known in network theory, where it is referred to as degree-correction (see e.g., [20, 31, 33]). For example, the fact that two individuals have contact very often can be explained by multiple reasons. They may interact because they are friends, because they work together, or simply because they are very active, and hence have high chances to meet. Therefore, to have a full understanding of a system, we have to disentangle relations from combinatorial effects.

∗ E-mail: gcasiraghi@ethz.ch.

Datasets of interactions are ubiquitous across disciplines. Examples of these are recorded contacts between individuals (e.g., SocioPatterns [25, 40], Reality Mining [13]), mutualistic interactions between species in ecology [12, 28], economic transactions between countries and firms [16, 32], and collaborations between firms [42]. In these cases, researchers are interested in learning whether the observed interactions are driven by relations between the elements of the system. They ask whether friendship plays a role in the contacts between students [25], whether homophily drives interactions within social and political networks [2], i.e., whether individuals sharing similar characteristics are more likely to interact [27], or whether collaborations between companies are driven by geographical distance or industrial sector similarity [42].
There exist different approaches addressing the problem of quantifying the interdependence between observed edges and dyadic relations in networks. This problem, however, is exacerbated by the fact that the dyadic relations represented in complex networks are not independent of one another. There is a broad literature on modelling relational data to account for some of these properties that can arguably be traced to the Social Relations Model of [44]. Because of the non-independence of dyadic relations, ordinary least squares regression models are inappropriate to analyse network data [21]. To partially overcome their limits, [21] introduced a regression method based on the quadratic assignment procedure developed by [19]. An alternative approach to address the problem of the non-independence of edges is that taken in latent space models [18]. There, although the model still assumes the probability of edges to be independent in the sampling process, the dependence is accounted for in the latent space constructed from the data.

Other statistical methods commonly used in the analysis of social networks are based on exponential random graph models (ERGMs) or their extensions (see, e.g., [36–39]). Although effective under specific conditions, all these methods have been developed for unweighted graphs. This means that they are not optimal for datasets which contain repeated interactions, which need to be represented as integer-weighted graphs, usually referred to as multi-edge networks or multi-graphs. A common workaround is to threshold the interactions to obtain an unweighted graph (e.g., [10]). Clearly, this approach does not exploit all the information available in the data and may therefore produce sub-optimal results [1]. Addressing these limitations, ERGMs, for example, have been extended for count data [22, 23].
Furthermore, the latent space framework [18] introduced a regression model that naturally admits different covariates and deals with count data as well [35]. The relational events model [3], instead, handles interactions recorded along with a time stamp. This framework has since been extended to include, e.g., missing observations, auto-regressive models, and information to capture hidden homophily. Other models that partially address those issues include the random dot product graphs [41], and the stochastic block models [34]. Using the most general theory, none of these models requires the observed interactions to be binary: they can be counts or continuous [17]. However, in their generalisation to count or continuous data, these models require the assumption of a specific distribution for the number of edges, which implicitly assumes a distinctive edge generating process, as discussed in [4]. Moreover, many of these models do not scale well to large datasets. In particular, in the case of ERGMs, the increased size of the sample space makes the numerical estimation of model parameters by Monte-Carlo simulations impractical. As a result, it is challenging to fit large datasets of repeated interactions.

The generalised hypergeometric ensemble of random graphs (gHypEG) allows us to address these limitations, providing a suitable model for the analysis of complex systems [4, 5, 8]. GHypEGs combine two characteristics that are essential for the study of multi-edge networks. First, they are specifically tailored to the analysis of multi-graphs, allowing the easy interpretation of parameters. Second, their underlying probability distribution can be stated in closed form, thus simplifying the study of datasets with a large number of repeated interactions.
We demonstrate the power of our approach and its performance with an example based on an empirical dataset consisting of more than 180 000 interactions. The available data consist of an interaction network, built from recorded contact counts between high-school students, and of further information such as students' gender, class membership and topic, self-reported friendship relations, and Facebook connections.

2 Methodology

2.1 Network representation

Figure 1: The multiplex network representation of a relational dataset. The bottom layer (blue) captures the interaction counts that are observed. The top layers (yellow) encode different types of relations, like weighted friendship links, or community membership. The model we propose allows us to understand how these relational layers impact interactions.

Relational datasets such as the one provided in [25] consist of interaction counts and a collection of dyadic relations and vertex attributes. Vertex attributes, such as community membership or gender, often yield strong relations between individuals, as individuals in the same community tend to interact more than individuals in different ones. We can study this type of data by representing it as a multiplex network. Multiplex networks are a particular class of interconnected multi-layer networks where the vertices of each layer correspond (cf. fig. 1) [15].

Suppose that we have a dataset consisting of $m$ recorded interactions between $n$ elements and $r$ different types of relations between them. We encode the interactions in a graph with $n = |V|$ vertices and $m$ (multi-)edges. Since two individuals may interact more than once, multiple edges may exist between the same pair of vertices, giving rise to a multi-edge graph.
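As a minimal sketch of this encoding, assume the recorded interactions arrive as a plain list of vertex pairs, one entry per contact. This input format, the function name `interaction_matrix` and the toy data are illustrative assumptions, not the format of the dataset in [25]. Counting repeated pairs gives the integer-weighted adjacency matrix of the multi-edge graph.

```python
from collections import Counter

def interaction_matrix(contacts, vertices, directed=False):
    """Count repeated interactions into an integer adjacency matrix."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    A = [[0] * n for _ in range(n)]
    for (u, v), count in Counter(contacts).items():
        i, j = index[u], index[v]
        A[i][j] += count
        # For undirected data, mirror the count so that A stays symmetric.
        if not directed and i != j:
            A[j][i] += count
    return A

# Hypothetical toy data: five recorded contacts between three individuals.
contacts = [("a", "b"), ("a", "b"), ("b", "c"), ("a", "b"), ("c", "a")]
A = interaction_matrix(contacts, ["a", "b", "c"])
m = len(contacts)  # number of (multi-)edges
```

In the undirected case each contact appears twice in `A`, once per direction, so the number of multi-edges `m` equals the sum over the upper triangle.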
In the following, we will refer to this graph as the interaction layer $\mathcal{I}$. For each type of relation, we can generate a graph that encodes the dyadic relations between the elements of the system as weighted edges between vertices. The weight of each edge encodes the strength of the relation. We will refer to these $r$ graphs as the relational layers $\mathcal{R}_l$ with $l \in [1, r]$. Let now $\mathcal{M}$ be the multiplex network generated by the $r + 1$ layers and $n = |V|$ vertices. Figure 1 illustrates the multiplex approach we take.

In the following, we propose a framework to perform statistical regressions with these network layers as covariates. We assume the multi-edge graph $\mathcal{I}$ to be the dependent variable and the remaining layers $\mathcal{R}_l$ to be the covariates, or explanatory variables. The resulting model has the following form:

$$\mathcal{I} = f(\mathcal{R}_1, \dots, \mathcal{R}_r; \theta_1, \dots, \theta_r), \tag{1}$$

for some function $f : \mathbb{R}^{V \times V} \times \dots \times \mathbb{R}^{V \times V} \times \mathbb{R}^{r} \to \mathbb{N}^{V \times V}$, where $\theta_l$, $l \in [1, r]$, are the parameters of the regression model corresponding to each layer $\mathcal{R}_l$.

2.2 Statistical Model

Generalised Hypergeometric Ensembles of Random Graphs (gHypEG)

The approach described in this paper exploits the generalised hypergeometric ensemble of random graphs (gHypEG). This class of models extends the configuration model (CM) [29, 30] by encoding complex topological patterns, while at the same time preserving degree distributions. The aim of this article is to estimate how to bias the process underlying the configuration model, based on observed data. For this reason, before introducing the formulation of our regression model, we provide a brief overview of gHypEG. A more formal presentation is given in [4, 5, 7].

In the CM, the probability of connecting two vertices depends only on their (out- and in-) degrees.
The CM assigns to each vertex as many out-stubs (or half-edges) as its out-degree, and as many in-stubs as its in-degree. It then connects random pairs of vertices joining out- and in-stubs. This is done by sampling uniformly at random one out- and one in-stub from the pool of all out- and in-stubs respectively, and then connecting them, until no more stubs are available [14]. The left side of fig. 2 illustrates this case focusing on a vertex A. The probability of connecting vertex A with one of the vertices B, C, or D depends only on the abundance of stubs, and hence on the in-degree of the vertices themselves. The higher the in-degree, the higher the number of in-stubs of the vertex. Hence, the higher the probability of randomly sampling a stub belonging to the vertex.

Figure 2: Probabilities of connecting different stubs in CM and gHypEG. Graphical illustration of the probability of connecting two vertices as a function of degrees (left figure), and degree and propensities (right figure). Higher propensities can be related to strong relations between vertices, shown as dashed connections. A stronger relation (thicker line) may result in a higher propensity to interact, as shown for the pair (A,D).

GHypEG give an expression for the probability distribution underlying this process, where the degrees of the vertices are preserved in expectation [5]. This result is achieved by exploiting an urn representation of the problem. Edges are balls in an urn, and sampling from the CM corresponds to sampling balls (i.e., edges) from an urn constructed as follows. For each pair of vertices $(i, j)$, we denote with $k^{\mathrm{out}}_i$ and $k^{\mathrm{in}}_j$ their respective out- and in-degrees.
The number of combinations of out-stubs of $i$ with in-stubs of $j$ which could create an edge is given by $k^{\mathrm{out}}_i k^{\mathrm{in}}_j$. For each dyad $(i, j)$ we place $k^{\mathrm{out}}_i k^{\mathrm{in}}_j$ balls of a given colour in the urn. This provides us with an urn containing $\sum_{ij} k^{\mathrm{out}}_i k^{\mathrm{in}}_j$ balls, of as many colours as there are pairs of vertices that could be connected. The process of sampling $m$ edges from such a 'soft' configuration model is thus described by sampling $m$ balls from the urn, and the probability distribution of observing a graph $\mathcal{I}$ under the model is given by the multivariate hypergeometric distribution with parameters $\Xi = \{k^{\mathrm{out}}_i k^{\mathrm{in}}_j\}_{i,j}$:

$$\Pr(\mathcal{I} \mid \Xi) = \binom{\sum_{ij} \Xi_{ij}}{m}^{-1} \prod_{i,j \in V} \binom{\Xi_{ij}}{A_{ij}}, \tag{2}$$

where $A_{ij}$ denotes the element $ij$ of the adjacency matrix of $\mathcal{I}$, and the probability of observing $\mathcal{I}$ is non-zero only if $\sum_{ij} A_{ij} = m$.

GHypEG expand this formulation to allow the modification of the CM based on observations about the system. Specifically, we aim at modelling the probability of connecting two vertices not only based on degrees (i.e., number of stubs), but also on an independent propensity of two vertices to be connected. Such propensities capture non-degree related effects, which are incorporated into the model in the form of the odds of connecting one pair of vertices instead of another. The right side of fig. 2 illustrates this case, where A is most likely to connect with vertex D, even though D has only one available stub. We can see this in the following way. Suppose that there was an underlying social network connecting the vertices of $\mathcal{I}$. Then, we could expect that vertices that have a strong connection in the social network (thick dashed line in fig. 2) have a high propensity to interact.
This results in a higher probability of observing interactions between the pair (A,D) compared to all others. In [4], we have investigated how block and community structures can be encoded by specifying suitable propensities in the form of a block matrix. Here, we look into the more general case where we aim at constraining the configuration model such that given edges are more likely than others according to external information about the process modelled. Such external information will constitute the covariates in our regression model.

We collect propensities in a matrix $\Omega$. The matrix thus encodes dyadic propensities of vertices that go beyond what is prescribed by the combinatorial matrix $\Xi$. The ratio between any two elements $\Omega_{ij}$ and $\Omega_{kl}$ of the propensity matrix is the odds-ratio of observing an edge between vertices $i$ and $j$ instead of $k$ and $l$, independently of the degrees of the vertices. The probability of a graph $\mathcal{I}$ depends on the stubs' configuration specified by $\Xi$, and on the odds defined by $\Omega$. As for the case of the CM, this process can be seen as sampling edges from an urn, where edges characterised by a large propensity are more likely to be sampled. Such a probability distribution is described by the multivariate Wallenius' noncentral hypergeometric distribution [9, 43]:

$$\Pr(\mathcal{I} \mid \Xi, \Omega) = \left[ \prod_{i,j} \binom{\Xi_{ij}}{A_{ij}} \right] \int_0^1 \prod_{i,j} \left( 1 - z^{\frac{\Omega_{ij}}{S_\Omega}} \right)^{A_{ij}} dz \tag{3}$$

with $S_\Omega = \sum_{i,j} \Omega_{ij} (\Xi_{ij} - A_{ij})$.

Here, we assume that the entries of the matrix of stubs' configuration $\Xi$ are built according to the configuration model. This is the most general way to encode combinatorial effects generated by the different activity, i.e., degree, of vertices. It means that more active vertices, i.e., those that have a higher degree, are more likely to interact. Hence, $\Xi$ is entirely defined by $\mathcal{I}$.
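To make the combinatorial part concrete, the following sketch builds $\Xi$ from a small directed adjacency matrix and evaluates the configuration-model probability of eq. (2). The function names and the toy matrix are illustrative assumptions. Self-loops are treated as possible dyads for simplicity, so that $\sum_{ij} \Xi_{ij} = m^2$ holds exactly.

```python
import math

def xi_matrix(A):
    """Xi_ij = k_out_i * k_in_j, computed from the adjacency matrix A."""
    k_out = [sum(row) for row in A]        # out-degrees (row sums)
    k_in = [sum(col) for col in zip(*A)]   # in-degrees (column sums)
    return [[ko * ki for ki in k_in] for ko in k_out]

def log_prob_cm(A, Xi):
    """log Pr(I | Xi) of eq. (2): multivariate hypergeometric distribution."""
    m = sum(map(sum, A))
    total = sum(map(sum, Xi))
    logp = -math.log(math.comb(total, m))
    for row_a, row_x in zip(A, Xi):
        for a, x in zip(row_a, row_x):
            logp += math.log(math.comb(x, a))
    return logp

# Hypothetical directed multi-edge graph with m = 5 edges on 3 vertices.
A = [[0, 2, 1],
     [1, 0, 0],
     [0, 1, 0]]
Xi = xi_matrix(A)
logp = log_prob_cm(A, Xi)
```

Because an observed edge $(i, j)$ contributes to both $k^{\mathrm{out}}_i$ and $k^{\mathrm{in}}_j$, we always have $\Xi_{ij} \geq A_{ij}$, so every binomial coefficient in the product is positive and the logarithm is well defined.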
Regression Model

The aim of our regression model is to find a suitable way to estimate $\Omega$, based on the covariate layers $\{\mathcal{R}_l\}_{l \in [1, r]}$. We thus propose to define $\Omega$ as a function of the relational layers $\{\mathcal{R}_l\}_{l \in [1, r]}$:

$$\Omega := \prod_{l=1}^{r} \left( R^{(l)} \right)^{\theta_l} = \exp\left\{ \sum_{l=1}^{r} \theta_l \log R^{(l)} \right\}, \tag{4}$$

where $R^{(l)}$ is the adjacency matrix constructed from the network $\mathcal{R}_l$, and products, powers and logarithms are taken element-wise. Under this assumption, we fix a multiplicative relation between the different layers. That means, a value of 0 for a dyad $i, j$ in any layer $R^{(l)}$ corresponds to encoding the impossibility of observing any edge between $i$ and $j$. Moreover, a value of 1 for a dyad $i, j$ in a layer $R^{(l)}$ means that the layer does not affect the probability of observing this dyad. Furthermore, the right-hand side of eq. (4) provides a simple way to interpret the parameters of the model $\theta_l$. If the covariate layers are specified in a convenient way, as we will show later, $\theta_l$ reflects the log-odds of observing an interaction between a pair of vertices for which there is an edge in the covariate layer $\mathcal{R}_l$, against a pair for which there is no edge in $\mathcal{R}_l$.

We can now specify the statistical model in eq. (1). We take $f$ as the expectation of the gHypEG that maximises the probability of observing $\mathcal{I}$, given the relational layers $\{\mathcal{R}_l\}_{l \in [1, r]}$. Estimating such a model is therefore equivalent to finding maximum likelihood estimators (MLE) for the parameter vector $\Theta = (\theta_1, \dots, \theta_r)$, which enters eq. (3) through eq. (4). Equations (3) and (4) show that the likelihood of $\Theta$ given the observed graph $\mathcal{I}$ is defined by

$$L(\Theta \mid \mathcal{I}) = \left[ \prod_{i,j} \binom{\Xi_{ij}}{A_{ij}} \right] \int_0^1 \prod_{i,j} \left( 1 - z^{\frac{\prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l}}{S_\Theta}} \right)^{A_{ij}} dz \tag{5}$$

with $S_\Theta = \sum_{i,j} \prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l} (\Xi_{ij} - A_{ij})$.

Although the numerical maximisation of eq.
(5) is difficult, for $m \ll \sum_{ij} \Xi_{ij}$ we can approximate the Wallenius non-central multivariate hypergeometric distribution with a multinomial distribution with appropriately chosen probabilities $p_{ij} = \Xi_{ij} \Omega_{ij} / \sum_{kl} \Xi_{kl} \Omega_{kl}$ (cf. [45]). Because $\sum_{ij} \Xi_{ij} \approx m^2$, the multinomial approximation holds even for small networks. Therefore eq. (5) as a function of $\Theta$ can be approximated up to constants by

$$L(\Theta \mid \mathcal{I}) \sim \prod_{i,j \in V} \left( \frac{\Xi_{ij} \prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l}}{\sum_{i,j \in V} \Xi_{ij} \prod_{l=1}^{r} \left( R^{(l)}_{ij} \right)^{\theta_l}} \right)^{A_{ij}}. \tag{6}$$

We obtain the MLE $\hat{\Theta} = \operatorname{argmax}_\Theta L(\Theta \mid \mathcal{I})$ of eq. (6) by solving numerically the system given by $\nabla L(\Theta) = 0$. Each component of the gradient of the log-likelihood $\nabla \log L(\Theta)$ is then given by

$$\frac{\partial \log L(\Theta \mid \mathcal{I})}{\partial \theta_l} = -m \, \frac{\sum_{ij} \log\left( R^{(l)}_{ij} \right) \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h}}{\sum_{ij} \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h}} + \sum_{ij} A_{ij} \log\left( R^{(l)}_{ij} \right). \tag{7}$$

Thanks to the asymptotic properties of MLEs, we can compute the confidence intervals for the parameter estimates $\hat{\Theta}$. With $c$ as the appropriate $z$-critical value for a given confidence (e.g., 1.96 for 95% confidence intervals), the confidence interval for one parameter estimate $\hat{\theta}_l$ is given as follows:

$$\hat{\theta}_l \in \left[ \hat{\theta}_l - c \sqrt{\left( J(\hat{\Theta})^{-1} \right)_{ll}}, \; \hat{\theta}_l + c \sqrt{\left( J(\hat{\Theta})^{-1} \right)_{ll}} \right], \tag{8}$$

where $J(\hat{\Theta}) = -\nabla^2 \log L(\hat{\Theta} \mid \mathcal{I})$ is the observed Fisher information matrix [11]. From eq.
(6) we get the following expression for $J(\hat{\Theta})$:

$$J(\hat{\Theta})_{lk} = m \, \frac{\left( \sum_{ij} \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h} \right) \left( \sum_{ij} \log\left( R^{(l)}_{ij} \right) \log\left( R^{(k)}_{ij} \right) \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h} \right)}{\left( \sum_{ij} \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h} \right)^2} - m \, \frac{\left( \sum_{ij} \log\left( R^{(l)}_{ij} \right) \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h} \right) \left( \sum_{ij} \log\left( R^{(k)}_{ij} \right) \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h} \right)}{\left( \sum_{ij} \Xi_{ij} \prod_{h=1}^{r} \left( R^{(h)}_{ij} \right)^{\theta_h} \right)^2} \tag{9}$$

In the R library ghypernet [6], available for download from CRAN, we provide the nrm routine to perform network regression model estimation.

2.3 General regression model

The model described in the previous section can be generalised to account for multiple observations of the multiplex $\mathcal{M}$. For example, suppose we have data about contacts between students in a school, and we have collected the same type of data for different schools. Let us assume now we want to learn whether gender homophily plays the same role in the interactions across all the schools. This implies that while the relations between the individuals change for different observations, e.g., gender distribution in different schools, the effect that the relations have on the interactions remains constant. In other words, the relational layers change for each observation, i.e., $\mathcal{R}^{(i)} \neq \mathcal{R}^{(j)}$, where $i$ and $j$ are different observations. On the other hand, the parameter $\theta$ quantifying the effect of the relations on the interactions is assumed to be constant, i.e., $\theta^{(i)} = \theta^{(j)} = \theta$.

Suppose thus that we have $N$ independent observations of the multiplex $\mathcal{M}$, each denoted $\mathcal{M}^{(i)}$. We assume that the influence of the independent layers $\mathcal{R}^{(i)}_l$ on the dependent layer $\mathcal{I}^{(i)}$ is fixed, i.e., $\theta^{(i)} = \theta$ for each observation $i \in \{1, \dots, N\}$.
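For a single observation, the estimation step of section 2.2 (eqs. (6) to (9)) can be sketched in a few lines. The following is a minimal illustration for one covariate layer and is not the `nrm` routine of ghypernet: the function name, the flat-list data layout and the toy numbers are assumptions made for this example, and it uses Newton's method, which the text does not prescribe. It also assumes that the covariate varies across dyads, so that the information is non-zero.

```python
import math

def fit_single_layer(a, xi, log_r, max_iter=50, tol=1e-12):
    """Newton iterations for one theta under the multinomial likelihood, eq. (6).

    a, xi and log_r are flat lists over dyads holding the interaction counts
    A_ij, the combinatorial weights Xi_ij and the log-covariates log R_ij.
    Returns the estimate and the observed Fisher information at the estimate.
    """
    m = sum(a)
    observed = sum(ai * lr for ai, lr in zip(a, log_r))

    def moments(theta):
        # Weighted first and second moments of log R under p_ij ~ Xi_ij * R_ij^theta.
        w = [x * math.exp(theta * lr) for x, lr in zip(xi, log_r)]
        z = sum(w)
        mean = sum(wi * lr for wi, lr in zip(w, log_r)) / z
        mean_sq = sum(wi * lr * lr for wi, lr in zip(w, log_r)) / z
        return mean, mean_sq

    theta = 0.0
    for _ in range(max_iter):
        mean, mean_sq = moments(theta)
        grad = observed - m * mean          # eq. (7)
        info = m * (mean_sq - mean ** 2)    # eq. (9) reduced to a single layer
        step = grad / info
        theta += step
        if abs(step) < tol:
            break
    mean, mean_sq = moments(theta)
    return theta, m * (mean_sq - mean ** 2)

# Hypothetical toy data: four dyads, the first two carrying the relation (R = e).
a = [6, 2, 1, 1]
xi = [4, 4, 4, 4]
log_r = [1.0, 1.0, 0.0, 0.0]
theta_hat, info = fit_single_layer(a, xi, log_r)
half_width = 1.96 / math.sqrt(info)                    # eq. (8), 95% interval
ci = (theta_hat - half_width, theta_hat + half_width)
```

For a binary layer such as this one, setting eq. (7) to zero can be solved by hand, which gives a convenient check: $e^{\hat{\theta}}$ equals the ratio of interactions on related to unrelated dyads, divided by the corresponding ratio of $\Xi$ mass. In the toy data this is $(8/2)/(8/8) = 4$.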
Since each observation $\mathcal{I}^{(i)}$ is independent and follows the distribution of the gHypEG given in eq. (3), the joint probability distribution is just the product of each probability. Therefore the likelihood of the parameter vector $\Theta$ is given by

$$L(\Theta \mid \mathcal{I}^{(1)}, \dots, \mathcal{I}^{(N)}) := \prod_{i=1}^{N} L(\Theta \mid \mathcal{I}^{(i)}), \tag{10}$$

where $L(\Theta \mid \mathcal{I}^{(i)})$ is defined as in eq. (5). It is worth noting that the interaction layers $\mathcal{I}^{(i)}$ come from the same class of distributions but are not identically distributed. They are identically distributed only if the number of edges $m^{(i)} = m$ and the matrix $\Xi^{(i)} = \Xi$ are the same for every observation $i$.

Given the likelihood in eq. (10), we can derive the MLE $\hat{\Theta}$ of the parameter $\Theta$. Denoting with $\log L(\Theta \mid \mathcal{I}^{(i)})$ the log-likelihood of $\Theta$ for observation $i$ and by

$$\bar{L}(\Theta) = \frac{1}{N} \sum_{i=1}^{N} \log L(\Theta \mid \mathcal{I}^{(i)}) \tag{11}$$

the average log-likelihood, $\hat{\Theta}$ is defined as follows:

$$\hat{\Theta} = \operatorname{argmax}_{\Theta} \bar{L}(\Theta). \tag{12}$$

2.4 Model selection and effect sizes

Recall we have a multiplex $\mathcal{M}$ with $r + 1$ layers. Suppose we have estimated the statistical regression model defined in section 2.3. We thus know the MLEs $\{\hat{\theta}_l\}_{l \in [1, r]}$ corresponding to the $r$ relational layers $\{\mathcal{R}_l\}_{l \in [1, r]}$, and each of their values quantifies the strength of the effect each layer has on the interaction layer $\mathcal{I}$. Are all these parameters needed? In other words, we want to quantify the goodness of fit of the model with all parameters $\{\hat{\theta}_l\}_{l \in [1, r]}$, and compare it to a model with fewer parameters. This allows us to select the parameters and the layers with significant effect, and disregard those with non-significant effects on the interactions.

We want to compare which of two statistical models defined by the sub-multiplexes $\{\mathcal{R}_l\}_{l \in [1, q]}$ and $\{\mathcal{R}_l\}_{l \in [1, q+s]}$ as in eq. (1), one with $q$ and the other one with $q + s$ relational layers, better describes the observed interaction layer $\mathcal{I}$. Both models are described by eq.
(5) with the appropriate layers chosen as predictors. The two models are nested, as one is a particular case of the other. In fact, the model defined by $\{\mathcal{R}_l\}_{l \in [1, q]}$ can be obtained by setting to 0 the $s$ coefficients $\{\theta_l\}_{l \in [q+1, q+s]}$ corresponding to the layers $\{\mathcal{R}_l\}_{l \in [q+1, q+s]}$ in the second model (cf. eq. (4)).

We can perform model selection using the likelihood ratio test. In particular, we can identify the null hypothesis $H_0$ by the model defined by $\{\mathcal{R}_l\}_{l \in [1, q]}$ with $\tilde{q}$ parameters, and the alternative hypothesis $H_1$ by the model defined by $\{\mathcal{R}_l\}_{l \in [1, q+s]}$, with $\tilde{s}$ more parameters. This allows testing whether the explanatory power of the more complex model with $\tilde{q} + \tilde{s}$ parameters is high enough to justify the increase in complexity. Alternatively, as discussed already in [4], we can use AIC and BIC to choose the best between the two models. Moreover, information criteria allow us to compare all models built using different combinations of layers, even when they are not nested. If we proceed in a step-wise fashion, constructing the sub-multiplexes corresponding to the sets of predictors with decreasing AIC scores, we obtain a forward selection method that allows building models of increasing complexity.

Finally, the goodness of fit of the model can be assessed qualitatively through the adjusted McFadden's pseudo-r-squared $\rho^2$ [26]. The adjusted McFadden's pseudo-r-squared is a coefficient of determination analogous to the multiple-correlation coefficient used in OLS linear regression models, adjusted for model complexity. It is based on maximum likelihood estimates of model parameters, and it is hence suitable to evaluate the goodness of fit of our model. It is defined as follows:

$$\rho^2 = 1 - \frac{\log L(\hat{\Theta}_{q+s} \mid \mathcal{I}) - K}{\log L(\Theta_0 \mid \mathcal{I})}. \tag{13}$$

In eq. (13), $\log L(\hat{\Theta}_{q+s} \mid \mathcal{I})$ is the log-likelihood of the full model obtained from the MLE $\hat{\Theta}_{q+s}$ of the parameter vector $\Theta_{q+s}$, $\log L(\Theta_0 \mid \mathcal{I})$ is the log-likelihood of the null model where no explanatory variable is used (i.e., the CM), and $K$ is the number of degrees of freedom of the full model. The closer the value of $\rho^2$ is to 1, the better is the fit of the model. The inclusion of the number of degrees of freedom $K$ adjusts for model complexity, by penalising models with an excessive number of parameters. The value of $\rho^2$ can also be seen as an estimate of the amount of variability in the data explained by the model, in terms of the relative increase in likelihood of the model.

More generally, we can define a McFadden coefficient

$$\mathrm{MC} = 1 - \frac{\log L(\hat{\Theta}_{q+s} \mid \mathcal{I})}{\log L(\hat{\Theta}_{q} \mid \mathcal{I})}, \tag{14}$$

that allows us to evaluate the relative increment in likelihood provided by extending the model defined by $\{\mathcal{R}_l\}_{l \in [1, q]}$ with $q$ parameters with $s$ more parameters. Large values of the McFadden coefficient can then be used as a proxy for the improvement generated by the addition of new predictors which introduce new information into the model. Low values of the McFadden coefficient, on the other hand, reflect the introduction of predictors that do not improve the model considerably and thus do not yield new insights on the data.

In the next section, we show an application of these methods to an empirical dataset about human interactions.

3 Application: High School Contacts Analysis

3.1 Data

We showcase our method with a case study. Specifically, we apply our technique to a SocioPatterns dataset [25], to measure the strength and the significance of the effect of each layer of information provided on the observed number of interactions.
At https://www.sg.ethz.ch/nrm-tutorial, we provide a tutorial companion to this article with the code used to generate the results shown here.

In this case study, we analyse a dataset consisting of 188 508 recorded contacts between 327 students over five days, which we represent in the graph of interactions $\mathcal{I}$. The dataset contains additional types of relations between the students that we encode as predictors in different relational layers. The available relations, which serve as covariates for the regression model, are the following. There are 2 social networks providing connections between the students. One social network contains self-reported (directed) friendship relations. The second social network reports Facebook connections between students, which result in undirected links. Furthermore, students are assigned to 9 different classes, grouped into 4 topical blocks. We hypothesise that students in the same class and in the same topical block are more likely to interact with each other. Moreover, the gender of each student is provided.

These last 3 layers are thus built according to categorical information about the vertices. In practice, these require defining a block model for each category and then estimating them all together. However, here, because we want to be able to join together categorical data with dyadic data, we proceed differently than in [4]. Let $l$ be a labelling of vertices and $\mathcal{R}_l$ the corresponding layer in the multiplex representation. We can define a partition vector $z^l$, whose $i$-th entry $z^l_i$ specifies the label of vertex $i$. For every dyad $i, j$ for which $z^l_i = z^l_j$, we set $R^{(l)}_{ij} = \kappa$. In the cases where $z^l_i \neq z^l_j$, we set $R^{(l)}_{ij} = 1$. When performing the MLE of the parameter $\theta_l$, corresponding to the layer $\mathcal{R}_l$, we rescale the value of $R^{(l)}$ such that $(R^{(l)})^{\theta_l}$ estimates the strength of the effect provided by the labelling $l$.
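The construction of a categorical layer from a partition vector can be sketched as follows. The function name `label_layer`, the default $\kappa = e$ and the toy labels are illustrative assumptions. The diagonal is filled like any other same-label dyad and only matters if self-loops are considered.

```python
import math

def label_layer(labels, kappa=math.e):
    """R_ij = kappa if vertices i and j share a label, and 1 otherwise."""
    n = len(labels)
    return [[kappa if labels[i] == labels[j] else 1.0 for j in range(n)]
            for i in range(n)]

# Hypothetical partition vector z for four vertices.
z = ["MP", "MP", "PC", "BIO"]
R = label_layer(z)
```

The same helper applies to any vertex attribute, so the class, topic and gender layers differ only in the partition vector that is passed in.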
Note that we could choose any value for $\kappa$, as what we are interested in is the MLE propensity $\kappa^{\theta}$, which is the same for every choice of $\kappa$ when maximising eq. (5). If we fix $\kappa = e$, where $e$ is Euler's number, $\theta_l$ can be easily interpreted in terms of the order of magnitude of the contribution of $R^{(l)}$ to the propensity matrix of the gHypEG. Furthermore, if we define the odds-ratio $\omega = \Omega_{ij} / \Omega_{kl}$ between a dyad $i, j$ with the same label and a dyad $k, l$ with different labels, all other layers being equal, we can see that the log-odds $\log(\omega)$ are given by $\theta_l$:

$$\log(\omega) = \log\left( \frac{\Omega_{ij}}{\Omega_{kl}} \right) = \log\left( \left( \frac{R^{(l)}_{ij}}{R^{(l)}_{kl}} \right)^{\theta_l} \right) = \theta_l \log\left( \frac{R^{(l)}_{ij}}{R^{(l)}_{kl}} \right) = \theta_l \log\left( \frac{e}{1} \right) = \theta_l \tag{15}$$

If $\theta_l$ is larger than 0, there is a positive effect, as $\kappa^{\theta_l} > 1$ is increasing the propensity $\Omega_{ij}$ for $i, j$ with the same label. Similarly, if $\theta_l < 0$ there is a negative effect, i.e., the graph is disassortative with respect to the labelling $l$.

Figure 3: The graph obtained from the contacts between students. Each student is coloured according to their class membership and the internal ring groups classes on a similar topic. From this figure, it is clear that most of the contacts happen between students of the same class, and there is a preference for contacts between students attending classes on the same topic.

The first relational layer $\mathcal{R}_C$ in the dataset reflects the separation of students into 9 classes. We want to control for the separation into classes, as the encounters between students attending different classes are naturally limited, as can be observed in fig. 3. To build $R^{(C)}$ we can set $R^{(C)}_{ij} = e$ if $i, j$ are in the same class, and $R^{(C)}_{ij} = 1$ if $i, j$ are in different classes.
(Submitted for publication: 10 July 2020) A second relational lay er R T is built according to the topic of the diﬀerent classes students take. The nine classes are group ed into four topical areas, of 3,3,2,1 classes, resp ectively . There are three classes of type "MP" (MP , MP01, MP02), t w o of t ype "PC" (PC and PC0), one of type "PSI" (PSI0) and 3 of type "BIO" (2BIO1, 2BIO2, 2BIO3). This separation is highlighted by the internal ring in ﬁg. 3. The lay er R T is deﬁned similarly to R C , setting R ( T ) ij = e if i, j attend classes in the same topical area, and R ( T ) ij = 1 if not. The third relational lay er R G is built using the gender of the students. W e wan t to correct for gender homophily , as this could partially play a role in student in teractions. W e build its adjacency matrix R ( G ) as ab o v e. The dataset also provides information ab out actual friendship relations b etw een the students. W e can build the fourth predictor using the so cial net w orks obtained from self-r ep orte d friendship r elations . Because this data is self-rep orted, it generates a directed so cial netw ork. In fact, some studen ts rep ort a friendship relation with another studen t, which hav e no corresp onding link from the other student. F or this reason, we can mo del this predictor as tw o separate lay ers , giv en that the in teractions on whic h w e wan t to regress are undirected. W e set one la y er R f to capture all corresp onded friendships, i.e., all those edges that are symmetric. W e set a second la y er R 1 / 2 f to capture all non-corresp onded friendships, i.e., all those edges that are asymmetric. Both la y ers are built follo wing a similar pro cess to the one discussed ab o v e: if there is a friendship relation b etw een tw o v ertices i, j , we set the v alue of the adjacency matrix of the corresp onding la y er to R ( f ) ij = κ = e . Otherwise, we set the v alue in the adjacency matrix to 1 . 
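The split of a directed friendship network into a corresponded and a non-corresponded layer can be illustrated as follows. This is a minimal sketch, assuming the self-reports are available as a binary directed adjacency matrix; the function name `friendship_layers` and the toy matrix are hypothetical.

```python
import numpy as np

def friendship_layers(A, kappa=np.e):
    """Split a directed 0/1 adjacency matrix into two symmetric layers.

    R_f    : kappa where both i->j and j->i are reported, 1 elsewhere.
    R_half : kappa where exactly one direction is reported, 1 elsewhere.
    """
    A = np.asarray(A, dtype=bool)
    mutual = A & A.T                   # reported by both students
    one_sided = (A | A.T) & ~mutual    # reported by only one student
    R_f = np.where(mutual, kappa, 1.0)
    R_half = np.where(one_sided, kappa, 1.0)
    return R_f, R_half

# Toy example: students 0 and 1 name each other; student 0 names
# student 2, who does not reciprocate.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 0, 0]])
R_f, R_half = friendship_layers(A)
```

Both resulting matrices are symmetric, so they can serve as predictors for undirected interactions, and no dyad carries the weight $\kappa$ in both layers at once.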
This way, we can interpret the parameter $\theta_f$ as the log-odds of observing an interaction between two 'friends' against two non-'friends'.

The fifth predictor is built using the provided Facebook connections. This dataset is, according to [25], incomplete: not all students disclosed their Facebook accounts for the extraction of relations. That means that, for some students, we know the presence or the absence of Facebook relationships, while for others we cannot say anything. Instead of modelling the lack of information as a lack of relations, we split this predictor into two non-separable layers. We do so by building a layer $R_{fb}$ similarly to the friendship layer. In particular, we set $R^{(fb)}_{ij} = 1$ for all those dyads for which we have no data. Moreover, we generate a dummy 'correction' layer $R_{\varepsilon}$ where we set $R^{(\varepsilon)}_{ij} = e$ for all those dyads for which we have no data, and $R^{(\varepsilon)}_{ij} = 1$ otherwise. In this way, we can estimate the true effect of the available Facebook relations, correcting for the effect of the non-available ones. In this case, we have

$$
\log\left(\omega^{\text{friend}}_{\text{non-friend}}\right) = \log\left(\frac{\Omega_{\text{fb friend}}}{\Omega_{\text{non-friend}}}\right) = \log\left[\left(\frac{R^{(fb)}_{\text{fb friend}}}{R^{(fb)}_{\text{non-friend}}}\right)^{\theta_{fb}} \cdot \left(\frac{R^{(\varepsilon)}_{\text{data}}}{R^{(\varepsilon)}_{\text{data}}}\right)^{\theta_{\varepsilon}}\right] = \theta_{fb}, \tag{16}
$$

$$
\log\left(\omega^{\text{friend}}_{\text{no data}}\right) = \log\left(\frac{\Omega_{\text{fb friend}}}{\Omega_{\text{no data}}}\right) = \log\left[\left(\frac{R^{(fb)}_{\text{fb friend}}}{R^{(fb)}_{\text{no data}}}\right)^{\theta_{fb}} \cdot \left(\frac{R^{(\varepsilon)}_{\text{data}}}{R^{(\varepsilon)}_{\text{no data}}}\right)^{\theta_{\varepsilon}}\right] = \theta_{fb} - \theta_{\varepsilon}. \tag{17}
$$

Hence, $\theta_{fb}$ provides the log-odds of interactions between students that are friends on Facebook against those between students that are not friends. Similarly, $\theta_{fb} - \theta_{\varepsilon}$ gives the log-odds of interactions between students that are friends on Facebook against those with a student who did not provide access to the data.
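Equations (16) and (17) can be checked numerically on a toy example. The sketch below assumes that the missing information is given as a boolean dyad matrix `known` (how such a matrix is derived from the raw data is not specified here), and it uses hypothetical parameter values rather than fitted estimates. The propensity is taken as the product of the layers raised to their parameters, as in the equations above.

```python
import numpy as np

def facebook_layers(fb, known, kappa=np.e):
    """Build the Facebook layer and its 'correction' layer.

    fb    : boolean matrix, True where a Facebook connection is observed.
    known : boolean matrix, True where the dyad has data (hypothetical input).
    """
    fb = np.asarray(fb, dtype=bool)
    known = np.asarray(known, dtype=bool)
    R_fb = np.where(fb & known, kappa, 1.0)   # 1 for non-friends and for no data
    R_eps = np.where(known, 1.0, kappa)       # kappa only where data is missing
    return R_fb, R_eps

def log_odds(layers, thetas, dyad_a, dyad_b):
    """Log of Omega[dyad_a] / Omega[dyad_b], with Omega = prod_l R_l ** theta_l."""
    log_omega = sum(t * np.log(R) for R, t in zip(layers, thetas))
    return log_omega[dyad_a] - log_omega[dyad_b]

# Toy example: (0, 1) are Facebook friends, (0, 2) are known non-friends,
# and there is no data for the dyad (1, 2).
fb = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
known = [[1, 1, 1], [1, 1, 0], [1, 0, 1]]
R_fb, R_eps = facebook_layers(fb, known)

theta_fb, theta_eps = 0.5, 0.3  # hypothetical values
friend_vs_nonfriend = log_odds([R_fb, R_eps], [theta_fb, theta_eps], (0, 1), (0, 2))
friend_vs_nodata = log_odds([R_fb, R_eps], [theta_fb, theta_eps], (0, 1), (1, 2))
```

In this toy setting the first quantity equals `theta_fb` and the second equals `theta_fb - theta_eps`, matching the two equations.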
In general, we assume that the absence of an edge in either $R_f$ or $R_{fb}$ is not enough to prevent an interaction from happening. It is for this reason that we set the weight of the relations between students who are not "friends" in either of the two layers to 1. If we were to set it to 0 instead, we would have disallowed the presence of edges between those dyads in the model entirely, in contrast with what is observed in the data.

We speculate that $R_C$ will have a very strong influence on the interactions, since the division into classes acts as a physical boundary for student interactions. Moreover, we would expect the information provided by the two social networks (self-reported friendship and Facebook) to be comparable, as the reported friendship relations should be part of the Facebook connections. Similarly, we expect that corresponded friendships will yield a stronger effect on interactions than non-corresponded ones.

3.2 Model

We build a regression model with the five predictors described above. With such a model, we answer the question of whether interactions between students are related to (a) friendship relations, as perceived by the students themselves, and (b) Facebook connections. The estimated effects are corrected for the degree of the vertices, i.e., for how active students are, and for the fact that the students are physically separated in different classes.

As a first step, we estimate a model for case (a) and one for case (b). The first two columns of table 1 provide the estimates of $\Theta^{(a)}$ and $\Theta^{(b)}$, respectively. In both cases, we see that there is a strong effect provided by the two social networks, signalled by a positive value of the estimated parameters. Also, we see that the offline social network defined by the friendship relations has a stronger effect than the online social network.
This can be seen both from the effect size, given by the absolute value of the parameters, and from the larger value of $\rho^2$ and the smaller AIC. The third and fourth columns of table 1 show the models estimated after correcting for the control variables defined by $R_C$, $R_T$, $R_G$. We notice that, while the effect of friendship relations remains strong, the effect of Facebook connections almost entirely disappears when controlling for the class membership of students. Finally, in the full model shown in the fifth column of table 1, it can be seen that friendship relations completely take over the small explanatory power produced by Facebook connections.

Table 1: Fitted parameters for the 7-layer model, standard errors for the estimates (in parentheses), and corresponding significance of the parameter estimates, obtained from a standard t-test as described in eq. (8). 3 stars correspond to a p-value $p < \alpha = 0.001$. For the regression, we used $\kappa = e$.

| Group | Layer | $\Theta^{(a)}$ | $\Theta^{(b)}$ | $\Theta^{(a\dagger)}$ | $\Theta^{(b\dagger)}$ | $\Theta^{(a\dagger + b\dagger)}$ |
|---|---|---|---|---|---|---|
| Control | $R^{(C)}$ | | | 3.196*** (0.011) | 3.318*** (0.011) | 3.168*** (0.011) |
| | $R^{(T)}$ | | | 2.275*** (0.021) | 2.281*** (0.021) | 2.281*** (0.021) |
| | $R^{(G)}$ | | | 0.194*** (0.005) | 0.258*** (0.005) | 0.200*** (0.005) |
| Friendship | $R^{(f)}$ | 3.696*** (0.005) | | 1.810*** (0.006) | | 1.820*** (0.006) |
| | $R^{(1/2f)}$ | 2.147*** (0.015) | | 0.385*** (0.015) | | 0.421*** (0.015) |
| Facebook | $R^{(fb)}$ | | 2.344*** (0.006) | | 0.535*** (0.006) | 0.106*** (0.006) |
| | $R^{(\varepsilon)}$ | | 0.564*** (0.005) | | 0.330*** (0.005) | 0.357*** (0.005) |
| | AIC | 516342.8 | 643916.5 | 4061.8 | 71444.8 | 0 |
| | $\rho^2$ | 0.175 | 0.081 | 0.556 | 0.506 | 0.559 |

*** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$

As we expected, the results of the regression show a strong effect from the separation of vertices into the categories corresponding to classes. In the full model, the value of $\theta_C \gg 0$ implies an odds-ratio of $e^{\theta_C} = 23.76$ for an interaction between classmates against an encounter between students of different classes, everything else being equal. This means that the odds of an encounter between two classmates are approximately 24 times those of an encounter between students of different classes. Class topics are also a driving force for the interactions. Contact between students attending classes on the same topic is about 10 times more likely to be observed than contact between students attending classes on different topics. The value of $\theta_G$ supports the presence of a weak gender homophily in the encounters between students, with an odds-ratio of 1.22. The effect of self-reported friendship is large and positive as expected, while non-corresponded friendships yield a much lower effect, even though this is larger than that provided by Facebook connections.

Table 2: Model selection steps. For each step, we report the McFadden R squared ($\rho^2$), the improvement in AIC, and the relative goodness-of-fit in terms of the MC coefficient. Standard errors are given in parentheses. The 6 models are ordered by increasing complexity.

| Group | Layer | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|---|
| Control | $R^{(C)}$ | 4.641*** (0.007) | 4.417*** (0.007) | 3.207*** (0.008) | 3.179*** (0.008) | 3.176*** (0.008) | 3.168*** (0.008) |
| | $R^{(T)}$ | | | 2.307*** (0.016) | 2.314*** (0.016) | 2.282*** (0.016) | 2.281*** (0.016) |
| | $R^{(G)}$ | | | | | 0.205*** (0.004) | 0.200*** (0.004) |
| Friendship | $R^{(f)}$ | | 1.819*** (0.004) | 1.812*** (0.004) | 1.817*** (0.004) | 1.801*** (0.004) | 1.820*** (0.005) |
| | $R^{(1/2f)}$ | | | | | | 0.421*** (0.011) |
| Facebook | $R^{(fb)}$ | | | | 0.121*** (0.005) | 0.124*** (0.005) | 0.106*** (0.005) |
| | $R^{(\varepsilon)}$ | | | | 0.347*** (0.004) | 0.351*** (0.004) | 0.357*** (0.004) |
| | AIC | 98609.8 | 21323.0 | 6370.6 | 2514.3 | 713.7 | 0.0 |
| | MC | 0.486 | 0.112 | 0.024 | 0.006 | 0.003 | 0.001 |
| | $\rho^2$ | 0.486 | 0.543 | 0.554 | 0.557 | 0.558 | 0.559 |

*** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$

We now proceed to study the contribution of each relational layer to the model fit. To do so, we follow a stepwise selection method, as described in section 2.4. We introduce one predictor after the other, starting from those that have the highest contribution according to AIC. This means that, in the first step, we add the predictor whose corresponding model has the lowest AIC. Then, in the second step, we add each of the remaining predictors in turn to the first one to find the second-best contribution, and we proceed in this way until all predictors have been added.

During the stepwise selection process, we monitor the increment in goodness-of-fit in terms of AIC, and the relative goodness-of-fit in terms of the McFadden coefficient MC. As described above, these two criteria give us two alternative ways to perform model selection. By looking at the change in AIC (cf. table 2), we see a clear difference between the first two models and the remaining ones. Both predictors $R_C$ and $R_f$ provide a substantial decrease in AIC. In other words, they have good explanatory power for the observed interactions. The other predictors, on the other hand, provide a less marked decrease in AIC. We also observe that the second two models, corresponding to the introduction of the layer $R_T$ (topic) and of the couple $R_{fb}$ and $R_{\varepsilon}$ (fb), provide a similar decrease in AIC. Finally, the last two predictors provide a smaller decrease in AIC.
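The stepwise procedure just described amounts to a greedy forward selection. The following sketch shows the control flow only: `aic_of` stands for any routine that fits the model with a given list of predictors and returns its AIC, and the `toy_aic` function with its `gains` values is invented purely to make the example runnable. It is not the model-fitting code used for the results above.

```python
def forward_selection(predictors, aic_of):
    """Greedy forward selection: at each step add the predictor whose
    inclusion yields the lowest AIC, until all predictors are included.

    Returns the list of (predictor added, AIC of the resulting model).
    """
    selected, remaining, path = [], list(predictors), []
    while remaining:
        best = min(remaining, key=lambda p: aic_of(selected + [p]))
        selected.append(best)
        remaining.remove(best)
        path.append((best, aic_of(selected)))
    return path

# Hypothetical stand-in for a model fit: each predictor lowers the AIC by
# a fixed amount, and each parameter costs the usual penalty of 2.
gains = {"class": 500, "friendship": 300, "topic": 50}

def toy_aic(subset):
    return 1000 - sum(gains[p] for p in subset) + 2 * len(subset)

path = forward_selection(["topic", "friendship", "class"], toy_aic)
```

Predictors that must enter together, such as the couple $R_{fb}$ and $R_{\varepsilon}$, can be handled by treating the pair as a single entry in `predictors`.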
In terms of information, however, the best model according to AIC is nevertheless the one that incorporates all parameters [24].

If we consider the relative improvement in likelihood instead, as provided by the McFadden coefficient, we see a similar pattern. The first two parameters provide a definite improvement in the goodness of fit, the second two a small improvement, while the last two show a negligible improvement.

The reason for these results lies in the fact that the predictors are partially correlated. The class predictor and the friendship predictor provide largely independent data. The third predictor (topic), although important, is a superset of the class predictor: students in the same class necessarily attend classes on the same topic. Hence, it yields a smaller improvement in the goodness of fit of the model. A solution to this issue could be obtained by modelling the different classes as separate blocks in a BCCM (cf. [4]), increasing the number of parameters but capturing both the class and the topic membership at the same time. The fourth predictor, reporting Facebook relations, is incomplete. Hence, it can only explain part of the data. Moreover, it is partly correlated with self-declared friendship, as people who declare themselves friends are often friends on Facebook (40% of the two social networks overlap).

From this example, we can conclude that, in the dataset studied, the observed interactions are strongly influenced by social relations in the form of friendship links, even when we correct for the subdivision into classes and topics, as can also be visualised in fig. 3. Gender homophily is, instead, relatively weak after accounting for all factors. Moreover, it is interesting to note that non-corresponded friendships, i.e., friendships that have been declared by only one student, have a very low effect on the observed interactions, as long as corresponded friendships are taken into account.
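Because the layers are built with $\kappa = e$, the effect sizes summarised above can be recovered by exponentiating the fitted parameters. The short sketch below does this for the full-model column of table 1; the dictionary keys are descriptive names chosen here for illustration.

```python
import math

# Full-model estimates, copied from the last column of table 1.
theta = {
    "class": 3.168,
    "topic": 2.281,
    "gender": 0.200,
    "friendship": 1.820,
    "non-corresponded friendship": 0.421,
    "facebook": 0.106,
}

# With kappa = e, each parameter is a log-odds, so exp(theta) is the
# odds-ratio of a related dyad against an unrelated one.
odds_ratio = {name: math.exp(value) for name, value in theta.items()}

for name, value in odds_ratio.items():
    print(f"{name}: {value:.2f}")
```

The values for class, topic, and gender agree with the odds-ratios quoted in the text (23.76, about 10, and 1.22).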
4 Conclusion

In this article, we have proposed a new statistical model to quantify how observed interactions depend on different relations, in the framework of multiplex networks. The model is based on the assumption that interactions between elements of a system are driven by two factors. The first factor is the existence of relations between elements, such as friendship or homophily. The second is the combinatorial randomness caused by the activity of the elements. Elements that are more active are more likely to interact with each other, even if they are unrelated.

Different from common approaches used in network analysis, our methodology has been specifically designed to deal with multi-edge graphs. It therefore allows the use of all the available data, without the need to threshold it to obtain unweighted, i.e., binary, graphs. In fact, repeated interactions between elements of a system generate multi-edge graphs, where the vertices correspond to the elements of the system. Similarly, relations can have varying intensity and should be encoded in weighted graphs. This is why thresholding the data into binary networks can be a waste of useful information.

Our model separates random and deterministic influences on interactions, accounting for the randomness as combinatorial effects. We hence identify how much known relations drive the interactions. To achieve this, we base our regression model on generalized hypergeometric ensembles of random graphs, a class of statistical network ensembles we have recently introduced. The formulation of our model allows us to estimate the strength of the dependence between relations and interactions, together with its statistical significance.
Moreover, the parameters estimated by our model can be readily interpreted as the log-odds of observing interactions between two different pairs of vertices. Thanks to these characteristics, the statistical regression model described in this article is a powerful tool for the analysis of complex systems consisting of a large number of highly interacting elements.

Studying how different relations drive observed interactions is not only necessary to increase the understanding of a system; it is also needed to control the dynamics of a system. To do so, we have to appropriately modify the relations that are the driving forces underlying its behavior. Similarly, if we want to increase the resilience of a system, we want to affect the relations that are responsible for its weaknesses. Having a clear understanding of how and which relations impact the behavior of the elements of a system is a necessary condition to properly control it.

In conclusion, the method we propose is a major advance for the analysis of relational datasets and complex networks. By allowing the study of multi-edge and weighted graphs, it increases the breadth of applicability of network theory. In future work, it will allow the identification of missing interactions, according to null models based on known relations. Thanks to this, it will be possible to uncover unknown relations between elements of a system.

Acknowledgment

The author thanks S. Schweighofer, G. Vaccario and F. Schweitzer for useful discussion, and V. Nanumyan for designing fig. 3.

References

[1] Ahnert, S. E.; Garlaschelli, D.; Fink, T. M. A.; Caldarelli, G. (2007). Ensemble approach to the analysis of weighted networks. Physical Review E 76(1), 016101.
[2] Brandenberger, L. (2018). Trading favors: Examining the temporal dynamics of reciprocity in congressional collaborations using relational event models. Social Networks 54, 238–253.
[3] Butts, C. T. (2008). A Relational Event Framework for Social Action. Sociological Methodology 38(1), 155–200.
[4] Casiraghi, G. (2019). The block-constrained configuration model. Applied Network Science 4(1), 123.
[5] Casiraghi, G.; Nanumyan, V. (2018). Generalised hypergeometric ensembles of random graphs: the configuration model as an urn problem. arXiv preprint arXiv:1810.06495.
[6] Casiraghi, G.; Nanumyan, V. (2020). ghypernet: Fit and Simulate Generalised Hypergeometric Ensembles of Graphs.
[7] Casiraghi, G.; Nanumyan, V.; Scholtes, I.; Schweitzer, F. (2016). Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks. arXiv preprint arXiv:1607.02441.
[8] Casiraghi, G.; Nanumyan, V.; Scholtes, I.; Schweitzer, F. (2017). From relational data to graphs: Inferring significant links using generalized hypergeometric ensembles. In: International Conference on Social Informatics. Springer, pp. 111–120.
[9] Chesson, J. (1978). Measuring Preference in Selective Predation. Ecology 59(2), 211–215.
[10] Cranmer, S. J.; Desmarais, B. A. (2011). Inferential network analysis with exponential random graph models. Political Analysis 19(1), 66–86.
[11] Degroot; Schervish (2002). Probability and Statistics. Addison-Wesley Publishing, 3rd edn. ISBN 0201524880.
[12] Dicks, L. V.; Corbet, S. A.; Pywell, R. F. (2002). Compartmentalization in plant-insect flower visitor webs. Journal of Animal Ecology 71(1), 32–43.
[13] Eagle, N.; Pentland, A. S. (2006). Reality mining: sensing complex social systems. Personal and Ubiquitous Computing 10(4), 255–268.
[14] Fosdick, B. K.; Larremore, D. B.; Nishimura, J.; Ugander, J. (2018). Configuring random graph models with fixed degree sequences. SIAM Review 60(2), 315–355.
[15] Garas, A. (ed.) (2016). Interconnected Networks. Understanding Complex Systems, Cham: Springer International Publishing. ISBN 978-3-319-23945-3.
[16] Garas, A.; Argyrakis, P.; Rozenblat, C.; Tomassini, M.; Havlin, S. (2010). Worldwide spreading of economic crisis. New Journal of Physics 12(11), 113043.
[17] Hoff, P.; Fosdick, B.; Volfovsky, A.; Stovel, K. (2013). Likelihoods for fixed rank nomination networks. Network Science 1(03), 253–277.
[18] Hoff, P. D.; Raftery, A. E.; Handcock, M. S. (2002). Latent Space Approaches to Social Network Analysis. Journal of the American Statistical Association 97(460), 1090–1098.
[19] Hubert, L.; Schultz, J. (1976). Quadratic assignment as a general data analysis strategy. British Journal of Mathematical and Statistical Psychology 29(2), 190–241.
[20] Karrer, B.; Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Physical Review E 83(1), 016107.
[21] Krackhardt, D. (1988). Predicting with networks: Nonparametric multiple regression analysis of dyadic data. Social Networks 10(4), 359–381.
[22] Krivitsky, P. N. (2012). Exponential-family random graph models for valued networks. Electronic Journal of Statistics 6, 1100–1128.
[23] Krivitsky, P. N.; Butts, C. T. (2017). Exponential-family random graph models for rank-order relational data. Sociological Methodology 47(1), 68–112.
[24] Lehmann, E. L.; Romano, J. P. (eds.) (2005). Testing Statistical Hypotheses. Springer Texts in Statistics, New York, NY: Springer New York. ISBN 978-0-387-98864-1.
[25] Mastrandrea, R.; Fournet, J.; Barrat, A. (2015). Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS ONE 10(9), 1–26.
[26] McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics, 105–142.
[27] McPherson, M.; Smith-Lovin, L.; Cook, J. M. (2001). Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology 27(1), 415–444.
[28] Memmott, J. (1999). The structure of a plant-pollination food web. Ecology Letters 2, 276–280.
[29] Molloy, M.; Reed, B. (1995). A critical point for random graphs with a given degree sequence. Random Structures & Algorithms 6(2-3), 161–180.
[30] Molloy, M.; Reed, B. (1998). The Size of the Giant Component of a Random Graph with a Given Degree Sequence. Combinatorics, Probability and Computing 7(3), 295–305.
[31] Newman, M. E. J.; Peixoto, T. P. (2015). Generalized Communities in Networks. Physical Review Letters 115(8), 088701.
[32] Onnela, J.-P.; Chakraborti, A.; Kaski, K.; Kertész, J.; Kanto, A. (2003). Dynamics of market correlations: Taxonomy and portfolio analysis. Physical Review E 68(5), 056110.
[33] Peixoto, T. P. (2014). Hierarchical Block Structures and High-Resolution Model Selection in Large Networks. Physical Review X 4(1), 011047.
[34] Rohe, K.; Chatterjee, S.; Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. The Annals of Statistics 39(4), 1878–1915.
[35] Sewell, D. K.; Chen, Y. (2015). Latent Space Models for Dynamic Networks. Journal of the American Statistical Association 110(512), 1646–1657.
[36] Snijders, T.; Spreen, M.; Zwaagstra, R. (1995). The Use of Multilevel Modeling for Analysis of Personal Networks: Networks of Cocaine Users in an Urban Area. Journal of Qualitative Anthropology 5(2), 85–105.
[37] Snijders, T. A. (2011). Statistical Models for Social Networks. Annual Review of Sociology 37(1), 131–153.
[38] Snijders, T. A. B. (1996). Stochastic actor-oriented models for network change. The Journal of Mathematical Sociology 21(1-2), 149–172.
[39] Snijders, T. A. B.; van de Bunt, G. G.; Steglich, C. E. G. (2010). Introduction to stochastic actor-based models for network dynamics. Social Networks 32(1), 44–60.
[40] Stehlé, J.; Voirin, N.; Barrat, A.; Cattuto, C.; Isella, L.; Pinton, J. F.; Quaggiotto, M.; van den Broeck, W.; Régis, C.; Lina, B.; Vanhems, P. (2011). High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE 6(8).
[41] Sussman, D. L.; Tang, M.; Fishkind, D. E.; Priebe, C. E. (2012). A Consistent Adjacency Spectral Embedding for Stochastic Blockmodel Graphs. Journal of the American Statistical Association 107(499), 1119–1128.
[42] Tomasello, M. V.; Napoletano, M.; Garas, A.; Schweitzer, F. (2017). The rise and fall of R&D networks. Industrial and Corporate Change 26(4), 617–646.
[43] Wallenius, K. T. (1963). Biased Sampling: the Noncentral Hypergeometric Probability Distribution. Ph.D. thesis, Stanford University.
[44] Warner, R. M.; Kenny, D. A.; Stoto, M. (1979). A new round robin analysis of variance for social interaction data. Journal of Personality and Social Psychology 37(10), 1742–1757.
[45] Zingg, C.; Casiraghi, G.; Vaccario, G.; Schweitzer, F. (2019). What is the Entropy of a Social Organization? Entropy 21(9), 901.
