Contrasting Multiple Social Network Autocorrelations for Binary Outcomes, With Applications To Technology Adoption

Bin Zhang, A.C. Thomas, Patrick Doreian, David Krackhardt, Ramayya Krishnan

June 4, 2018

Abstract

The rise of socially targeted marketing suggests that decisions made by consumers can be predicted not only from their personal tastes and characteristics, but also from the decisions of people who are close to them in their networks. One obstacle to consider is that there may be several different measures for "closeness" that are appropriate, either through different types of friendships, or different functions of distance on one kind of friendship, where only a subset of these networks may actually be relevant. Another is that these decisions are often binary and more difficult to model with conventional approaches, both conceptually and computationally. To address these issues, we present a hierarchical model for individual binary outcomes that uses and extends the machinery of the auto-probit method for binary data. We demonstrate the behavior of the parameters estimated by the multiple network-regime auto-probit model (m-NAP) under various sensitivity conditions, such as the impact of the prior distribution and the nature of the structure of the network, and demonstrate on several examples of correlated binary data in networks of interest to Information Systems, including the adoption of Caller Ring-Back Tones, whose use is governed by direct connection but explained by additional network topologies.

1 Introduction

The prevalence and widespread adoption of online social networks have made the analysis of these networks, particularly the behaviors of individuals embedded within, an important topic of study in information systems Agarwal et al. (2008); Oinas-Kukkonen et al.
(2010), building off previous work in the context of technology diffusion Brancheau and Wetherbe (1990); Chatterjee and Eliashberg (1990); Premkumar et al. (1994). While past investigations into behavior in networks were typically limited to hundreds of people, contemporary data collection and retrieval technologies enable easy access to network data on a much larger scale. Analyzing the behavior of these individuals, such as their purchasing or technology adoption tendencies, requires statistical techniques that can handle both the scope and the complexity of the data.

The social network aspect is one such complexity. Researchers once assumed that an individual's decision to purchase a product or adopt a technology is solely associated with their personal attributes, such as age, education, and income Kamakura and Russell (1989); Allenby and Rossi (1998), though this could be due to the lack of both social network data and a mechanism for handling it; indeed, recent developments have shown that an individual's decisions are associated with the decisions of their neighbors in their social networks Bernheim (1994); Manski (2000); Smith and LeSage (2004). This could be due to a "contagious" effect, where someone imitates the behavior of their friends, or an indication of latent homophily, in which some unobserved and shared trait drives both the tendency for two people to form a friendship and for each to adopt (Aral et al., 2009; Shalizi and Thomas, 2011); either social property will increase the ability to predict a person's adoption behavior beyond their personal characteristics.

Each of these produces outcomes that are correlated between members of the network who are connected. A popular approach to study this phenomenon is to use a model with explicit autocorrelation between individual outcomes, defined with a single network structure term.
With the depth of data now available, an actor is very often observed to be a member of multiple distinct but overlapping networks, such as a friend network, a work colleague network, a family network, and so forth, and each of these networks may have some connection to the outcome of interest, so a model that condenses all networks into one relation will be insufficient. While models have been developed to include two or more network autocorrelation terms, such as Doreian (1989), these do not allow for the immediate and principled inclusion of binary outcomes; other methods to deal with binary outcomes on multiple networks, such as Yang and Allenby (2003), instead take a weighted average of other networks in the system, combining them into one, which has the side effect of constraining the sign of each network autocorrelation component to be identical, which may be undesirable if there are multiple effects thought to be in opposition to one another.

To deal with these issues, we construct a model for binary outcomes that uses the probit framework, allowing us to represent these outcomes as if they are dichotomized outcomes from a multivariate Gaussian random variable; this is then presented as in Doreian (1989) to have multiple regimes of network autocorrelation. We first use the Expectation-Maximization algorithm (EM) to find a maximum likelihood estimator for the model parameters, then use Markov Chain Monte Carlo, a method from Bayesian statistics, to develop an alternate estimate based on the posterior mean. We also study the sensitivity of both solutions to the change of parameters' prior distribution. Preliminary experiments show that the E-M solution to this model is degenerate, and cannot produce a usable variance-covariance matrix for parameter estimates, and so the MCMC method is preferred.
Our software is also validated by using the posterior quantiles method of Cook et al. (2006). We ensure that the parameter estimates from the model are correct by testing first on simulated data, before moving on to real examples of network-correlated behavior.

The rest of the paper is organized as follows. We discuss the literature on the network autocorrelation model in Section 2. Our two estimation algorithms for the multi-network autoprobit, based on EM and MCMC, are presented in Section 3. In Section 4 we present the results of experiments on software validation and on the behavior of the parameter estimates. Conclusions and suggestions for future work complete the paper in Section 5.

2 Background

Network models of behavior are developed to study the process of social influence on the diffusion of a behavior, which is the process "by which an innovation is communicated through certain channels over time among the members of a social system ... a special type of communication concerned with the spread of messages that are perceived as new ideas" Rogers (1962). These models have been widely used to study diffusion since the Bass (1969) model, a population-level approach that assumes that everyone in the social network has the same probability of interacting. Such an assumption is not realistic because, in a large social network, the probability that two randomly chosen nodes are connected is not uniform; for example, people who are physically closer communicate more and are likely to exert greater influence on each other. A refinement to this approach is a model where the outcomes of neighboring individuals are explicitly linked, such as the simultaneous autoregressive model (SAR).
The general method of SAR is described in Anselin (1988) and Cressie (1993); it considers simultaneous autoregression on the residuals of the form

\[
y = X\beta + \theta, \qquad \theta = \rho W \theta + \epsilon
\]

where $y$ is a vector of observed outcomes, in this case consumer choice; $X$ is a matrix of explanatory variables. Rather than an independent error term, $\theta$ represents error terms whose correlation is specified by $W$, the social network matrix of interest, and $\rho$, the corresponding network autocorrelation, distributing a Gaussian error term $\epsilon_i$. Maximum likelihood estimate solutions are provided by Ord (1975), Doreian (1980, 1982), and Smirnov (2005).

Standard network autocorrelation models, such as those of Burt (1987) and Leenders (1997), can only accommodate one network. However, an actor is very often under the influence of multiple networks, such as that of friends and that of colleagues. So if a research question requires investigating which of several network autocorrelation terms plays the most significant role in consumers' decisions, none of these models is adequate, and a model that can accommodate two or more networks is necessary.

Cohesion and structural equivalence are two competing social network models to explain diffusion of innovation. In the cohesion model, a focal person's adoption is influenced by his or her neighbors in the network. In the structural equivalence model, a focal person's adoption is influenced by the people who have the same position in the social network, such as sharing many common neighbors. While considerable work has been done on these models on real data, the question of which network model best explains diffusion has not been resolved. To approach this, Doreian (1989) introduced a model for "two regimes of network effects autocorrelation" (see footnote 1) for continuous outcomes.
The model is described as below:

\[
y = X\beta + \rho_1 W_1 y + \rho_2 W_2 y + \epsilon
\]

where $y$ is the dependent variable; $X$ is a matrix of explanatory variables; each $W$ represents a social structure underlying each autoregressive regime. This model takes both the interdependence of actors and their attributes, such as demographics, into consideration; these interdependencies are each described by a weight matrix $W_i$. Doreian's model can capture both an actor's intrinsic opinion and the influence from alters in his or her social network.

Footnote 1: The term "network effects" can refer to two directly related concepts: the autocorrelation between individual behaviors on a network, and the increased impact of a technology to an individual when used by more people within a network. Our meaning is the first, though we use the term partial network autocorrelation to avoid ambiguity.

As this model takes a continuous dependent variable, Fujimoto and Valente (2011) present a plausible solution for binary outcomes by directly inserting an autocorrelation term $Wy$ into the right hand side of a logistic regression:

\[
y_i \sim \mathrm{Be}(p_i), \qquad \log\left(\frac{p_i}{1 - p_i}\right) = X_i\beta + \rho \sum_{j} W_{ij} y_j
\]

Due to its speed of implementation, this method is called "quick and dirty" (QAD) by Doreian (1982). Although it may support a binary dependent variable and multiple network terms, this model does not satisfy the assumption of logistic regression: the observations are not conditionally independent, and the estimation results are biased. Thomas (2012) shows that this method has more consequences than expected for the estimation procedure beyond simple bias; for example, in cases where $W$ is a directed graph, networks that are directional cannot be distinguished from their reversed counterparts.

Yang and Allenby (2003) propose a hierarchical Bayesian autoregressive mixture model to analyze the effect of multiple network autocorrelation terms on a binary outcome.
Technically, their model can accommodate only one network effect, composed of several smaller networks that are weighted and added together. This model therefore assumes that all component network coefficients must have the same sign (see footnote 2), and must also be statistically significant or insignificant together. Such assumptions do not hold if the effect of any but not all of the component networks is statistically insignificant, or of the opposite sign to the other networks, so a method that estimates coefficients for each $W$ separately is necessary for our applications. We contrast our method with the Yang-Allenby grand $W$ construction method, a finite mixture of coefficient matrices, in Appendix A.5.

Footnote 2: It is of course possible to specify terms in the $W$ matrix as negative, to represent anticorrelation on a tie, but this must be done a priori, and is redundant in our approach.

3 Method

We propose a variant of the auto-probit model that accommodates multiple regimes of network autocorrelation terms for the same group of actors, which we call the multiple network auto-probit model (m-NAP). We then provide two methods to obtain estimates for our model. The first is Expectation-Maximization, which employs a maximum likelihood approach, and the second is a Markov Chain Monte Carlo routine that treats the model as Bayesian. Detailed descriptions of both estimation methods are given in Appendix A.1 and A.2.

3.1 Model Specification

The actors are assumed to have $k$ different types of network connections between them, where $W_i$ is the $i$th network in question, $i \in \{1, ..., k\}$. $y$ is the vector of length $n$ of observed binary choices, and is an indicator function of the latent preference of consumers $z$. If $z$ is larger than the threshold 0, the consumer chooses $y = 1$; if $z$ is smaller than 0, the consumer chooses $y = 0$.
\[
y = I(z > 0)
\]
\[
z = X\beta + \theta + \epsilon, \qquad \epsilon \sim \mathrm{Normal}_n(0, I_n)
\]
\[
\theta = \sum_{i=1}^{k} \rho_i W_i \theta + u, \qquad u \sim \mathrm{Normal}_n(0, \sigma^2 I_n)
\]

$z$ is a function of the exogenous covariates $X$, the autocorrelation term $\theta$, and individual error. $X$ is an $n \times m$ covariate matrix that includes a constant as its first column; these covariates could be the exogenous characteristics of consumers. $\beta$ is an $m \times 1$ coefficient vector associated with $X$. $\theta$ is the autocorrelation term, which is responsible for the nonzero covariances in $z$. $\theta$ can be described as the aggregation of multiple network structures $W_i$ and coefficients $\rho_i$. Each $W_i$ is a network structure describing connections and relationships among consumers.

Our model explicitly allows multiple competing networks that can be defined by different mechanisms on an existing basis of network ties; for example, $W_1$ describes an effect acting directly on a declared tie, such as homophily or social influence, whereas $W_2$ describes the structural equivalence due to those ties. It can also be that each $W_i$ is defined by a different type of network edge, such as friendship, colleagueship, or mutual group membership; note that none of these relationships need be mutually exclusive. Each coefficient $\rho_i$ describes the effect size of its corresponding network $W_i$, so that we can compare the relative scales of competing network structures for the same group of actors embedded in social networks.

The error term for the model is an augmented expression that consists of two parts, $\epsilon$ and $u$. $\epsilon$ is the unobservable error term of $z$ that describes individual-level variation that is not shared on the network, and $u$ is the error that is then distributed along each network, accounting for the non-zero covariance between units.
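To make the data-generating process above concrete, the following is a minimal Python sketch that draws one data set from the model specification. It is an illustration under stated assumptions, not the authors' R implementation: the function names `solve` and `simulate_mnap`, the four-actor example networks, and the parameter values are all hypothetical. It uses the fact that the equation for $\theta$ can be rearranged as $(I_n - \sum_i \rho_i W_i)\theta = u$, and it assumes that this matrix is invertible for the chosen $\rho_i$.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def simulate_mnap(X, beta, Ws, rhos, sigma2, rng):
    """Draw (y, z, theta, u) once from the model equations (hypothetical helper)."""
    n = len(X)
    # A = I_n - sum_i rho_i W_i, assumed invertible
    A = [[(1.0 if r == c else 0.0)
          - sum(rho * W[r][c] for rho, W in zip(rhos, Ws))
          for c in range(n)] for r in range(n)]
    u = [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]  # network-distributed error
    theta = solve(A, u)                                     # theta = A^{-1} u
    z = [sum(x * b for x, b in zip(X[i], beta)) + theta[i] + rng.gauss(0.0, 1.0)
         for i in range(n)]                                 # latent preference
    y = [1 if zi > 0 else 0 for zi in z]                    # observed binary choice
    return y, z, theta, u


# Illustrative (made-up) inputs: four actors on a ring, two network regimes.
W1 = [[0.0, 0.5, 0.0, 0.5],
      [0.5, 0.0, 0.5, 0.0],
      [0.0, 0.5, 0.0, 0.5],
      [0.5, 0.0, 0.5, 0.0]]   # row-normalized ring adjacency
W2 = [[0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0],
      [1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0]]   # ties between opposite actors
Ws = [W1, W2]
rhos = [0.3, -0.2]            # autocorrelations of opposite sign
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 0.0], [1.0, 2.0]]  # constant plus one covariate
beta = [0.2, 1.0]

y, z, theta, u = simulate_mnap(X, beta, Ws, rhos, sigma2=0.5, rng=random.Random(0))
```

The two coefficients in `rhos` are deliberately given opposite signs, which is the situation a single combined network cannot represent. For realistic network sizes a numerical linear algebra library would replace the hand-written solver.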
If we marginalize this model by integrating out $\theta$, all the unobserved interdependency will be isolated in a single expression for the distribution of $z$, given parameters $\beta$, $\rho$ and $\sigma^2$, as multivariate normal with mean $X\beta$ and variance $Q$:

\[
z \sim \mathrm{Normal}(X\beta, Q)
\]

where

\[
Q = I_n + \sigma^2 \left(I_n - \sum_{i=1}^{k} \rho_i W_i\right)^{-1} \left[\left(I_n - \sum_{i=1}^{k} \rho_i W_i\right)^{-1}\right]^{\top}.
\]

The non-standard form of the covariance matrix can therefore pose a significant computational issue.

3.2 Expectation-Maximization Solution

We first develop an approach that maximizes the likelihood of the model using E-M. Since $z$ is latent, we treat it as unobservable data, for which the E-M algorithm is one of the most used methods. A detailed description of our solution for $k$ regimes of network autocorrelation is in Appendix A.1.

The method consists of two steps: first, estimate the expected value of functions of the unobserved $z$ given the current parameter set $\phi$ ($\phi = \{\beta, \rho, \sigma^2\}$). Second, use these estimates to form a complete data set $\{y, X, z\}$, with which we estimate a new $\phi$ by maximizing the expectation of the likelihood of the complete data.

We first initialize the parameters to be estimated,

\[
\beta_i \sim \mathrm{Normal}(\nu_\beta, \Omega_\beta); \qquad \rho_j \sim \mathrm{Normal}(\nu_\rho, \Omega_\rho); \qquad \sigma^2 \sim \mathrm{Gamma}(a, b)
\]

where $i = 1, ..., m$, and $j = 1, ..., k$. Let these values equal $\phi^{(0)}$.

For the E-step, we calculate the conditional expectation of the log-likelihood, with respect to the augmented data,

\[
G(\phi \mid \phi^{(t)}) = E_{z \mid y, \phi^{(t)}}\left[\log L(\phi \mid z, y)\right]
= -\frac{n}{2}\log 2\pi - \frac{1}{2}\log|Q| - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \check{q}_{ij}\left(E[z_i z_j] - E[z_i] X_j\beta - E[z_j] X_i\beta + (X_i\beta)(X_j\beta)\right)
\]

where $t$ is the current step number and $\check{q}_{ij}$ is element $(i, j)$ in the matrix $Q^{-1}$.

In the M-step, we maximize $G(\phi \mid \phi^{(t)})$ to get $\beta^{(t+1)}$, $\rho^{(t+1)}$ and $[\sigma^2]^{(t+1)}$ for the next step.
\[
\beta^{(t+1)} = \arg\max_{\beta} G(\beta \mid \rho^{(t)}, [\sigma^2]^{(t)});
\]
\[
\rho^{(t+1)} = \arg\max_{\rho} G(\rho \mid \beta^{(t+1)}, [\sigma^2]^{(t)});
\]
\[
[\sigma^2]^{(t+1)} = \arg\max_{[\sigma^2]} G([\sigma^2] \mid \beta^{(t+1)}, \rho^{(t+1)})
\]

We replace $\phi^{(t)}$ with $\phi^{(t+1)}$ and repeat the E-step and M-step until all the parameters converge. Parameter estimates from the E-M algorithm converge to the MLE estimates Wu (1983).

It is worth noting that an analytical solution is not always available for all the parameters. Consider the maximization with respect to the autocorrelation variance parameter $\sigma^2$:

\[
[\sigma^2]^{(t+1)} = \arg\max_{[\sigma^2]} G(\phi \mid \phi^{(t)})
\]
\[
\frac{\partial \log L}{\partial [\sigma^2]} = \frac{\partial}{\partial [\sigma^2]} \left\{ -\frac{1}{2}\log|Q| - \frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta) \right\} \qquad (1)
\]

The first term on the right hand side of Equation (1) is:

\[
\frac{\partial}{\partial [\sigma^2]} \log|Q| = \frac{\partial}{\partial [\sigma^2]} \log \left| I_n + [\sigma^2] \left(I_n - \sum_{i=1}^{k} \rho_i W_i\right)^{-1} \left[\left(I_n - \sum_{i=1}^{k} \rho_i W_i\right)^{-1}\right]^{\top} \right|
\]

The second term is:

\[
\frac{\partial}{\partial [\sigma^2]} (z - X\beta)^{\top} Q^{-1} (z - X\beta) = \frac{\partial}{\partial [\sigma^2]} (z - X\beta)^{\top} \left\{ I_n + [\sigma^2] \left(I_n - \sum_{i=1}^{k} \rho_i W_i\right)^{-1} \left[\left(I_n - \sum_{i=1}^{k} \rho_i W_i\right)^{-1}\right]^{\top} \right\}^{-1} (z - X\beta)
\]

This is not solvable analytically, and numerical methods are needed to get the estimators for this parameter and for $\rho$.

As it happens, the E-M algorithm produces a degenerate solution. This is because it estimates the mode of $\sigma^2$, the variance of the error term of the autocorrelation term $\theta$, which is at 0 (see Figure 1), and produces a singular variance-covariance matrix estimate using the Hessian approximation. Thus we have to find another solution.

Figure 1: An estimated probability distribution for $\sigma^2$, the variance of $\theta$. Maximum likelihood methods, such as the Expectation-Maximization method, will choose $\sigma^2 = 0$, a degenerate solution.

3.3 Full Bayesian Solution

We turn to Bayesian methods.
Since a consumer's observed choice is determined by his or her unobserved preference, this model has a hierarchical structure, so it is natural to think of using a hierarchical Bayesian method. In addition to the model specification above, prior distributions for each of the highest-level parameters in the model are also required. As before, $y$ is the observed dichotomous choice and is determined by the latent preference $z$. With Markov Chain Monte Carlo, we generate draws from a series of full conditional probability distributions, derived from the joint distribution. We summarize the forms of the full conditional distributions of all the parameters to estimate in Table 1, and in full in Appendix A.2. Given the observed choice of the consumer, the latent variable $z$ is generated from a truncated normal distribution with a mean of $X\beta + \theta$ and unit variance.

Table 1: Cyclical conditional sampling steps for Markov Chain Monte Carlo

Parameter     Density                                           Draw Type
$z$           $\mathrm{TruncNormal}_n(X\beta + \theta, I_n)$    Parallel
$\beta$       $\mathrm{Normal}_m(\nu_\beta, \Omega_\beta)$      Parallel
$\theta$      $\mathrm{Normal}_n(\nu_\theta, \Omega_\theta)$    Parallel
$\sigma^2$    $\mathrm{InvGamma}(a, b)$                         Single
$\rho_i$      Metropolis step                                   Sequential

The prior distributions of the parameters (shown in Table 1) are adapted from priors proposed by Smith and LeSage (2004):

• $\beta$ follows a multivariate normal distribution with mean $\nu_\beta$ and variance $\Omega_\beta$.
• $\sigma^2$ follows an inverse gamma distribution with parameters $a$ and $b$.
• Each $\rho_i$ follows a normal distribution with mean $\nu_\rho$ and variance $\Omega_\rho$.

The sampler algorithm was constructed in the R programming language, including a mechanism to generate data from the model. Validation of the algorithm was conducted using the method of posterior quantiles (Cook et al., 2006), ensuring the correctness of the code for all analyses.
Posterior quantiles is a simulation-based method that generates data from the model and verifies that the software produces parameter estimates distributed randomly around the true parameter values. For a detailed description of the implementation, please see Appendix A.3.

3.4 Sensitivity to Prior Specification

We test the performance of the sampler using prior distributions that are closer to our chosen model than the trivial priors used to check the model code, in order to assess the behavior of the algorithm under non-ideal conditions. We demonstrate on data simulated from the model, using two pre-existing network configurations, and specify different prior distributions for each parameter.

To demonstrate, we choose a prior distribution for $\rho_1$ with high variance, $\rho_1 \sim \mathrm{Normal}(0, 100)$. As shown in Figure 2(a), the posterior draws of $\rho_1$ have high temporal autocorrelation. To compare, we choose a narrow prior distribution for $\rho_1$, $\rho_1 \sim \mathrm{Normal}(0.05, 0.05^2)$; the posterior draws for $\rho_1$ are shown in Figure 2(b), and the temporal autocorrelation is considerably smaller. With the volume of data under consideration, it is clear that the posterior distribution of $\rho$ is sensitive to its prior distribution.

Figure 2: Testing the sensitivity of the inference of an autocorrelation parameter $\rho_1$ to the prior distribution. Panel (a): $\rho \sim \mathrm{Normal}(0, 100)$. Panel (b): $\rho \sim \mathrm{Normal}(0.05, 0.05^2)$. (a) The Markov Chain for a weakly informative prior distribution is consistent with the "oracle" value $\rho_1$, but the chain has significant temporal autocorrelation. (b) The Markov Chain with a strongly informative prior distribution has much less temporal autocorrelation, but is beholden to its prior distribution more than the data.

In most of our examples, we do not have a great deal of prior information available on any network parameters, suggesting that most of our analyses will be conducted with minimally informative prior distributions.
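The temporal autocorrelation visible in Figure 2(a) limits how much independent information a chain of draws carries. The following minimal Python sketch illustrates this on a synthetic chain, not on output from the m-NAP sampler. It assumes a first-order autoregressive chain with a made-up persistence of 0.95, and it uses the AR(1) approximation $n(1 - r)/(1 + r)$ for the effective sample size, where $r$ is the lag-1 autocorrelation; the function names are hypothetical.

```python
import random


def lag1_autocorrelation(draws):
    """Sample lag-1 autocorrelation of a sequence of draws."""
    n = len(draws)
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws)
    cov = sum((draws[t] - mean) * (draws[t + 1] - mean) for t in range(n - 1))
    return cov / var


def effective_sample_size(draws):
    """AR(1) approximation to the effective sample size: n (1 - r) / (1 + r)."""
    r = lag1_autocorrelation(draws)
    return len(draws) * (1.0 - r) / (1.0 + r)


def thin(draws, step):
    """Keep every `step`-th draw."""
    return draws[::step]


# Synthetic stand-in for a slowly mixing chain (assumed AR(1), persistence 0.95).
rng = random.Random(1)
phi = 0.95
chain = [0.0]
for _ in range(19999):
    chain.append(phi * chain[-1] + rng.gauss(0.0, 1.0))

thinned = thin(chain, 50)
r_full = lag1_autocorrelation(chain)      # close to phi for a long chain
r_thin = lag1_autocorrelation(thinned)    # much closer to zero after thinning
ess_full = effective_sample_size(chain)   # far below the 20000 raw draws
```

Under these assumptions the raw chain is worth only a small fraction of its nominal length, while draws kept at wide spacing are close to uncorrelated. Thinning does not add information; it only makes the retained draws easier to treat as independent.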
With such high autocorrelation between sequential draws, the effective sample size is extremely small. We therefore use a high degree of thinning to produce a series of approximately uncorrelated draws from the posterior.

4 Applications

4.1 Auto Purchase Data of Yang and Allenby (2003)

We use Yang and Allenby's 2003 Japanese car data to compare the findings of our method with those in the original study. The data consist of information on 857 purchase decisions of mid-size cars; the dependent variable is whether the car purchased was Japanese ($y_m = 1$) or otherwise ($y_m = 0$). All the car models in the data are substitutable and have roughly similar prices. An important question of interest is whether consumers' preferences for Japanese cars are interdependent. The interdependence in the network is measured by geographical location, where $W_{ij} = 1$ if consumers $i$ and $j$ live in the same zip code, and 0 otherwise. Explanatory variables include actors' demographic information such as age, annual household income, ethnic group, and education, and other information such as the price of the car, whether optional accessories were purchased for the car, and the latitude and longitude of the actor's location. To construct a network, Yang and Allenby use whether the consumers' home addresses are in the same zip code as the indicator of a connection. Thus the network structure $W$, the cohesion, is the joint membership of the same geographic area.

By comparing the parameters of Yang and Allenby's model to those for m-NAP on the same dataset, with the same underlying definition of network structure, we contrast our approaches and demonstrate the value of separating the impact of various network autocorrelations. The comparison of the coefficient estimates from Yang and Allenby and our Bayesian solution is shown in Figure 3, for both explanatory variables and for network autocorrelations.
We specify a second network term $W_2$ to be the structural equivalence of two consumers, calculated from the simple adjacency distance between the two vectors representing individuals' connections to other individuals in the network. In an undirected network with non-weighted edges, the adjacency distance between two nodes $i$ and $j$ is the square root of the number of individuals who have different relationships to $i$ and $j$ respectively,

\[
d_{ij} = \sqrt{\sum_{k=1,\, k \neq i,j}^{N} (A_{ik} - A_{jk})^2}, \qquad (2)
\]

where $A_{ik} = 1$ if nodes $i$ and $k$ are neighbors, and 0 otherwise. The larger $d$ between nodes $i$ and $j$, the less structurally equivalent they are. We use the inverse of $d_{ij}$ plus one in order to construct a measure with a positive, finite relationship with role equivalence, so that $s_{ij} = \frac{1}{d_{ij} + 1}$. In our setting, an element $A_{ij}$ in Equation (2) is from matrix $W_1$, so $d_{ij}$ is the adjacency distance between any two vectors $A_i$ and $A_j$, representing consumer $i$'s connections and consumer $j$'s connections to all the other consumers in the data, respectively. Adding one to $d_{ij}$ before inverting avoids a zero denominator; the result, $s_{ij}$, becomes an element of the structural equivalence matrix $W_2$.

The comparison is shown in Figure 3. Each box contains the estimates of one parameter from three methods: from left to right, Yang and Allenby, NAP with 1 network, and NAP with 2 networks. All the coefficient estimates, $\hat{\beta}_i$, $\hat{\rho}_2$, and $\hat{\sigma}^2$, of the three methods have similar means, standard deviations and credible intervals. One interesting result is that the second network, structural equivalence, has a significant negative effect. This suggests a diminishing cluster effect; when the number of people in the cluster gets bigger, the influence does not increase proportionally.
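The construction of the structural equivalence matrix from Equation (2) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `structural_equivalence` and the four-node star network are hypothetical, and setting the diagonal of the result to zero (no self-ties) is an assumption that the text does not state.

```python
import math


def structural_equivalence(A):
    """Return S with S[i][j] = 1 / (d_ij + 1), where d_ij follows Equation (2).

    A is a symmetric 0/1 adjacency matrix given as a list of lists.
    The diagonal of S is left at zero (assumption: no self-ties).
    """
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Sum squared differences over all third parties k (k != i, j).
            sq = sum((A[i][k] - A[j][k]) ** 2
                     for k in range(n) if k != i and k != j)
            S[i][j] = 1.0 / (math.sqrt(sq) + 1.0)
    return S


# Made-up example: a star network, node 0 tied to nodes 1, 2 and 3.
A = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
S = structural_equivalence(A)
```

In this example the three leaves have identical ties to every third party, so their adjacency distance is zero and their equivalence score takes its maximum value of one, even though no two leaves are directly connected. The hub and any leaf differ in their ties to the two remaining nodes, which gives a distance of $\sqrt{2}$ and a lower score.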
Figure 3: A comparison of coefficient estimates between the Yang-Allenby method and m-NAP with 1 or 2 networks. The models give similar results, while noting that there is now a negative and statistically significant effect on the network representing structural equivalence. $\beta_0$: coefficient of constant term; $\beta_1$: coefficient of $X_1$, car price; $\beta_2$: coefficient of $X_2$, car's optional accessory; $\beta_3$: coefficient of $X_3$, consumer's age; $\beta_4$: coefficient of $X_4$, consumer's income; $\beta_5$: coefficient of $X_5$, consumer's ethnicity; $\beta_6$: coefficient of $X_6$, residence longitude; $\beta_7$: coefficient of $X_7$, residence latitude; $\rho_1$: coefficient of the first network autocorrelation term, $W_1$, cohesion; $\rho_2$: coefficient of the second network autocorrelation term, $W_2$, structural equivalence; $\sigma^2$: estimated variance of the error term in autocorrelation.

4.2 Caller Ring-Back Tone Usage In A Mobile Network

We use m-NAP to investigate the purchase of Caller Ring Back Tones (CRBT) within a cellular phone network, a technology of increasing interest around the world. When someone calls the subscriber of a CRBT, the caller does not hear the standard ring-back tone but instead hears a song, joke or other message chosen by the subscriber until the subscriber answers the phone or the mailbox takes over. As soon as a CRBT is downloaded, it is set as the default ring-back tone, and is triggered automatically by every incoming call.

Our data were obtained from a large Indian telecommunications company (source and raw data confidential). We have cellular phone call records and CRBT purchase records over a three-month period, and phone account holders' demographic information such as age and gender. We extract a community of 597 users that are highly internally connected from a population with approximately 26 million unique users using the Transitive Clustering and Pruning (T-CLAP) algorithm (Zhang et al., 2011). Within this cluster, network edges are specified between users who call each other during the period of observation, as mutual symmetric connection implies equal and stable relationships (Hanneman and Riddle, 2005), rather than weaker relationships or calls related to businesses (inquiries or telemarketers).

We include several explanatory variables in this model:

• The gender of the cellular phone account holder;
• The age of the account holder;
• The number of unique outbound connections from the user (known as the "outdegree").

From our original network, we derive two matrices corresponding to cohesion and structural equivalence. Cohesion assumes that callers who make phone calls to each other will hear the called party's CRBT and are thus more likely to buy that ring-back tone, or to become interested in CRBT and eventually adopt the technology. Since the number of people each caller contacts varies drastically, we normalize the cohesion matrix by dividing each row by the total number of adopters, so that each matrix element represents a percentage of adoption. Structural equivalence is once again defined from the adjacency distance between two callers. Here it is less clear that there is an obvious mechanism for how structural equivalence can impact adoption, as it relates to a relationship that does not expose the caller to the CRBT.

Figure 4: Trace plot of CRBT network parameters.
Description of parameters: $\beta_0$: coefficient of constant term; $\beta_1$: coefficient of consumer's gender; $\beta_2$: coefficient of consumer's age; $\beta_3$: coefficient of number of called contacts; $\rho_1$: coefficient of the first network autocorrelation term, $W_1$, cohesion; $\rho_2$: coefficient of the second network autocorrelation term, $W_2$, structural equivalence; $\sigma^2$: estimated variance of the error term in autocorrelation; loglike: log-likelihood of $y$.

Estimates for each parameter of the model are shown in Figure 4. Again, we observe a significant negative effect for structural equivalence. This new network autocorrelation, with a coefficient of opposite sign from that of the first network autocorrelation $W_1$, cannot be identified by any earlier models.

5 Conclusion

We have introduced a new auto-probit model to study the binary choices of a group of actors that have multiple network relationships among them. We specified the fitting of the model for both E-M and hierarchical Bayesian methods. We found that the E-M solution cannot estimate the parameters for this particular model, so only the hierarchical Bayesian solution can be used here. We also validated our Bayesian solution by using the posterior quantiles method, and the results show that our software returns accurate estimates. Finally, we compared the estimates returned by Yang and Allenby, NAP with one network effect (cohesion), and NAP with two network effects (cohesion and structural equivalence), using real data.

In future work, we want to ensure that the approach can recover variability in the network effect size. Assuming $W\theta$ has a strong effect, we will vary the true value of $\rho$ from small to large, and observe whether our solution can capture the variation. Finally, we also want to study how multicollinearities between the columns of $X$, and between $X$ and $W\theta$, affect the estimated results.

References

Agarwal, R., Gupta, A. K. and Kraut, R. (2008).
Editorial overview – the interplay between digital and social networks. Information Systems Research, 19 243–252.

Allenby, G. M. and Rossi, P. E. (1998). Marketing models of consumer heterogeneity. Journal of Econometrics, 89 57–78.

Anselin, L. (1988). Spatial Econometrics: Methods and Models. 1st ed. Studies in Operational Regional Science, Springer, The Netherlands.

Aral, S., Muchnik, L. and Sundararajan, A. (2009). Distinguishing Influence Based Contagion from Homophily Driven Diffusion in Dynamic Networks. Proceedings of the National Academy of Sciences, 106 21544.

Bass, F. M. (1969). A new product growth for model consumer durables. Management Science, 15 215–227.

Bernheim, B. D. (1994). A theory of conformity. Journal of Political Economy, 102 841–77.

Brancheau, C. J. and Wetherbe, C. J. (1990). The adoption of spreadsheet software: Testing innovation diffusion theory in the context of end-user computing. Information Systems Research, 1 115–143.

Burt, R. S. (1987). Social contagion and innovation: Cohesion versus structural equivalence. American Journal of Sociology, 92 1287.

Chatterjee, R. and Eliashberg, J. (1990). The innovation diffusion process in a heterogeneous population: A micromodeling approach. Management Science, 36 1057–1079.

Cook, S. R., Gelman, A. and Rubin, D. B. (2006). Validation of software for Bayesian models using posterior quantiles. Journal of Computational and Graphical Statistics, 15 675–692.

Cressie, N. A. C. (1993). Statistics for Spatial Data. Revised ed. Probability and Statistics series, Wiley-Interscience, New York.

Doreian, P. (1980). Linear models with spatially distributed data: Spatial disturbances or spatial effects. Sociological Methods and Research, 9 29–60.

Doreian, P. (1982). Maximum likelihood methods for linear models: Spatial effects and spatial disturbance terms. Sociological Methods and Research, 10 243–269.
Doreian, P. (1989). Two Regimes of Network Effects Autocorrelation, chap. 14. The Small World, Ablex Publishing, Norwood, NJ, 280–295.

Fujimoto, K. and Valente, T. W. (2011). Network influence on adolescent alcohol use: Relational, positional, and affiliation-based peer influence. Unpublished manuscript.

Hanneman, R. and Riddle, M. (2005). Introduction to social network methods. Online, Riverside, CA.

Kamakura, W. A. and Russell, G. J. (1989). A probabilistic choice model for market segmentation and elasticity structure. Journal of Marketing Research, 26 379–390.

Leenders, R. T. (1997). Longitudinal behavior of network structure and actor attributes: modeling interdependence of contagion and selection, chap. Evolution of Social Networks. Gordon and Breach, New York, 165–184.

Manski, C. F. (2000). Economic analysis of social interactions. Journal of Economic Perspectives, 14 115–136.

Oinas-Kukkonen, H., Lyytinen, K. and Yoo, Y. (2010). Social networks and information systems: Ongoing and future research streams. Journal of the Association for Information Systems, 11 61–68.

Ord, K. (1975). Estimation methods for models of spatial interaction. Journal of the American Statistical Association, 70 120–126.

Premkumar, G., Ramamurthy, K. and Nilakanta, S. (1994). Implementation of electronic data interchange: an innovation diffusion perspective. Journal of Management Information Systems - Special section: Strategic and competitive information systems archive, 11 157–186.

Rogers, E. M. (1962). Diffusion of Innovations. Free Press, New York.

Shalizi, C. R. and Thomas, A. C. (2011). Homophily and Contagion Are Generically Confounded in Observational Social Network Studies. Sociological Methods and Research, 40 211–239.

Smirnov, O. A. (2005). Computation of the information matrix for models with spatial interaction on a lattice.
Journal of Computational and Graphical Statistics, 14 910–927.

Smith, T. E. and LeSage, J. P. (2004). A Bayesian Probit Model with Spatial Dependencies. In Advances in Econometrics: Volume 18: Spatial and Spatiotemporal Econometrics (K. R. Pace and J. P. LeSage, eds.). Elsevier, United Kingdom, 127–160.

Thomas, A. C. (2012). The social contagion hypothesis: Comment on "social contagion theory: Examining dynamic social networks and human behavior". In press at Statistics in Medicine.

Wu, C. F. J. (1983). On the convergence properties of the EM algorithm. The Annals of Statistics, 11 95–103.

Yang, S. and Allenby, G. M. (2003). Modeling interdependent consumer preferences. Journal of Marketing Research, XL 282–294.

Zhang, B., Krackhardt, D., Krishnan, R. and Doreian, P. (2011). An effective and efficient subpopulation extraction method in large social networks. Proceedings of International Conference on Information Systems.

APPENDIX

A.1 E-M solution implementation

A.1.1 Deduction

First, get the distribution of $\theta$:

$$\Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)\,\theta = u$$

$$\theta = \Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1} u$$

$$\theta \sim \mathrm{Normal}\left(0,\; \sigma^2 \Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1} \left[\Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1}\right]^{\top}\right)$$

Then get the distribution of $z \mid \beta, \rho, \sigma^2$:

$$z \sim \mathrm{Normal}(X\beta, Q), \quad \text{where } Q = I_n + \sigma^2 \Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1} \left[\Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1}\right]^{\top}$$

The joint distribution of $y$ and $z$ can be transformed as:

$$p(y \mid z)\, p(z \mid \beta, \rho, \sigma^2) = p(y, z \mid \beta, \rho, \sigma^2) = p(z \mid y; \beta, \rho, \sigma^2)\, p(y) \tag{3}$$

The right-hand side of equation (3) consists of two distributions we already have, as shown below.
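$$p(y) = \frac{\frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2}(z - X\beta)^{\top}(z - X\beta)\right\}}{\Phi(X\beta)}\, I(z > 0)$$

$$z \mid \beta, \rho, \sigma^2 \sim \mathrm{Normal}(X\beta, Q)$$

$$z \mid y, X; \beta, \rho, \sigma^2 \sim \mathrm{TruncNormal}(X\beta, Q)$$

Consider the parameter $\beta$ only:

$$p(\beta, z \mid y) = p(\beta \mid z, y)\, p(z \mid y)$$

$$z \mid y, X; \beta \sim \mathrm{TruncNormal}(X\beta, Q)$$

Assume $\mathrm{Var}(z) = 1$:

$$L(\beta \mid z) = \frac{1}{\sqrt{2\pi}} \sum_{i=1}^{n} \exp\left\{-\frac{1}{2}(z_i - X_i\beta)^2\right\}$$

$$\hat{\beta} = (X^{\top}X)^{-1} X^{\top} R, \quad \text{where } R = \mathrm{E}[z \mid \theta, y]$$

Then include the parameters $\rho$ and $\sigma^2$:

$$\mathrm{E}[z]^{(t+1)} = \mathrm{E}[z \mid y, \beta^{(t)}] = f(\beta^{(t)}, y)$$

$$\begin{aligned}
\log L(\beta, \rho, \sigma^2 \mid z) &= \log p(z \mid \beta, \rho, \sigma^2) = \log \prod_{i=1}^{n} p(z_i \mid \beta, \rho, \sigma^2) \\
&= \sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi |Q|}} - \frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta) \\
&= \sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi |Q|}} - \frac{1}{2}\left(z^{\top} Q^{-1} z - z^{\top} Q^{-1} X\beta - \beta^{\top} X^{\top} Q^{-1} z + \beta^{\top} X^{\top} Q^{-1} X\beta\right)
\end{aligned} \tag{4}$$

Writing the matrix products above element by element gives:

$$\begin{aligned}
(4) &= \sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi |Q|}} - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (z_i - X_i\beta)\, \check{q}_{ij}\, (z_j - X_j\beta) \\
&= \sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi |Q|}} - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \check{q}_{ij} \left(z_i z_j - z_i X_j\beta - z_j X_i\beta + (X_i\beta)(X_j\beta)\right)
\end{aligned}$$

where $\check{q}_{ij}$ is the $(i, j)$ element of $\check{Q}$, and $\check{Q} = Q^{-1}$.

A.1.2 Expectation step

In the expectation step, get the expected log-likelihood of the parameters:

$$\begin{aligned}
Q(\phi \mid \phi^{(t)}) &= \mathrm{E}_{z \mid y, \phi^{(t)}}\left[\log L(\phi \mid z, y)\right] \\
&= \mathrm{E}\left[\sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi |Q|}}\right] - \mathrm{E}\left[\frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta)\right] \\
&= -\frac{n}{2} \log 2\pi - \frac{n}{2} \log |Q| - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \check{q}_{ij} \left(\mathrm{E}[z_i z_j] - \mathrm{E}[z_i] X_j\beta - \mathrm{E}[z_j] X_i\beta + (X_i\beta)(X_j\beta)\right)
\end{aligned}$$

where $\phi$ is the parameter set, and $t$ is the number of steps.

A.1.3 Maximization step

In the maximization step, get the parameter estimates that maximize the expected log-likelihood.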
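First, estimate $\beta$:

$$\beta^{(t+1)} = \arg\max_{\beta} Q(\phi \mid \phi^{(t)}) = \arg\max_{\beta} \sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi |Q|}} - \frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta) \tag{5}$$

Solving Equation (5) analytically gives:

$$\frac{\partial \log L}{\partial \beta} = \frac{\partial}{\partial \beta}\left\{-\frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta)\right\}$$

$$\begin{aligned}
\frac{\partial}{\partial \beta}(z - X\beta)^{\top} Q^{-1} (z - X\beta) &= \frac{\partial}{\partial \beta}\left(z^{\top} Q^{-1} z - z^{\top} Q^{-1} X\beta - \beta^{\top} X^{\top} Q^{-1} z + \beta^{\top} X^{\top} Q^{-1} X\beta\right) \\
&= -2 X^{\top} Q^{-1} z + 2 X^{\top} Q^{-1} X\beta
\end{aligned} \tag{6}$$

where the last line uses the symmetry of $Q$. Setting Equation (6) to 0 and replacing $z$ by its expectation $R$:

$$-2 X^{\top} Q^{-1} R + 2 X^{\top} Q^{-1} X\beta = 0$$

$$\hat{\beta} = \left(X^{\top} Q^{-1} X\right)^{-1} X^{\top} Q^{-1} R$$

Second, estimate the parameter $\rho$:

$$\rho^{(t+1)} = \arg\max_{\rho} Q(\phi \mid \phi^{(t)})$$

Assume $\rho = \{\rho_1, \ldots, \rho_k\}$. Without loss of generality, $\rho_1$ can be estimated as:

$$\rho_1^{(t+1)} = \arg\max_{\rho_1} Q(\phi \mid \phi^{(t)})$$

$$\frac{\partial \log L}{\partial \rho_1} = \frac{\partial}{\partial \rho_1}\left\{-\frac{1}{2} \log |Q| - \frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta)\right\}$$

$$\frac{\partial}{\partial \rho_1} \log |Q| = -\mathrm{tr}(W_1 Q^{-1})$$

$$\frac{\partial}{\partial \rho_1}(z - X\beta)^{\top} Q^{-1} (z - X\beta) = \frac{\partial}{\partial \rho_1}\left(z^{\top} Q^{-1} z - z^{\top} Q^{-1} X\beta - \beta^{\top} X^{\top} Q^{-1} z + \beta^{\top} X^{\top} Q^{-1} X\beta\right)$$

It is impossible to obtain an analytical solution for $\rho_i$.

Third, estimate the parameter $\sigma^2$. Treat $\sigma^2$ as a single variable, written $[\sigma^2]$:

$$[\sigma^2]^{(t+1)} = \arg\max_{[\sigma^2]} Q(\phi \mid \phi^{(t)})$$

$$\frac{\partial \log L}{\partial [\sigma^2]} = \frac{\partial}{\partial [\sigma^2]}\left\{-\frac{1}{2} \log |Q| - \frac{1}{2}(z - X\beta)^{\top} Q^{-1} (z - X\beta)\right\} \tag{7}$$

The first term on the right-hand side of the equation above is:

$$\frac{\partial}{\partial [\sigma^2]} \log |Q| = \frac{\partial}{\partial [\sigma^2]} \log \left| I_n + [\sigma^2] \Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1} \left[\Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1}\right]^{\top} \right|$$

The second term is:

$$\frac{\partial}{\partial [\sigma^2]}(z - X\beta)^{\top} Q^{-1} (z - X\beta) = \frac{\partial}{\partial [\sigma^2]}(z - X\beta)^{\top} \left\{ I_n + [\sigma^2] \Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1} \left[\Big(I_n - \sum_{i=1}^{k} \rho_i W_i\Big)^{-1}\right]^{\top} \right\}^{-1} (z - X\beta)$$

This again cannot be solved analytically.

A.2 Markov Chain Monte Carlo estimation

The Markov Chain Monte Carlo method generates a sequence of draws that approaches the posterior distribution of interest. Our solution consists of the following steps.

Step 1.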
Generate $z$, which follows a truncated normal distribution:

$$z \sim \mathrm{TruncNormal}_n(X\beta + \theta, I_n)$$

where $I_n$ is the $n \times n$ identity matrix. If $y_i = 1$, then $z_i \ge 0$; if $y_i = 0$, then $z_i < 0$.

Step 2. Generate $\beta$, $\beta \sim \mathrm{Normal}(\nu_\beta, \Omega_\beta)$.

1. Define $\beta_0 = (0, 0, \ldots, 0)^{\top}$.

2. Define $D = h I_n$, a baseline variance matrix corresponding to the prior $p(\beta)$, where $h$ is a large constant, e.g. 400.

$$D^{-1} = \begin{pmatrix} \sigma_0^2 & 0 & \cdots & 0 \\ 0 & \sigma_0^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_0^2 \end{pmatrix}$$

Set $\sigma_0^2 = \frac{1}{400}$, a small number close to 0 compared with the Normal(0, 1) case, where $\sigma_0^2 = 1$.

3. $\Omega_\beta = \left(D^{-1} + X^{\top}X\right)^{-1}$. This is because:

$$z = X\beta + \theta + \epsilon$$

$$\beta = X^{-1}(z - \theta - \epsilon)$$

$$\therefore\; \beta \sim \mathrm{Normal}\left(X^{-1}(z - \theta),\, (X^{\top}X)^{-1}\right)$$

Based on the law of initial values, $\Omega_\beta = \left(D^{-1} + X^{\top}X\right)^{-1}$.

4. Then $\nu_\beta$ can be represented by $\nu_\beta = \Omega_\beta\left(X^{\top}(z - \theta) + D^{-1}\beta_0\right)$.

Step 3. Generate $\theta$, $\theta \sim \mathrm{Normal}(\nu_\theta, \Omega_\theta)$.

1. First, define $B = I_n - \sum_i \rho_i W_i$:

$$\theta = \sum_i \rho_i W_i \theta + u$$

$$\Big(I_n - \sum_i \rho_i W_i\Big)\theta = u$$

$$B\theta = u, \qquad \theta = B^{-1}u$$

Let $\mathrm{Var}(u) = \sigma^2 I_n$. Then

$$\mathrm{Var}(\theta) = \mathrm{Var}(B^{-1}u) = (B^{\top}B)^{-1} \sigma^2 I_n = \left(\frac{B^{\top}B}{\sigma^2}\right)^{-1}$$

2. We then add an offset $I_n$ to $\frac{B^{\top}B}{\sigma^2}$, so that

$$\Omega_\theta = \left(I_n + \frac{B^{\top}B}{\sigma^2}\right)^{-1}$$

3. $\nu_\theta = \Omega_\theta (z - X\beta)$, since $\theta = (z - X\beta) - \epsilon$.

Step 4. Generate $\sigma^2$, $\sigma^2 \sim \mathrm{InvGamma}(a, b)$:

$$a = s_0 + \frac{n}{2}, \qquad b = \frac{2}{\theta^{\top} B^{\top} B \theta + \frac{2}{q_0}}$$

where $s_0$ and $q_0$ are the parameters of the conjugate prior of $\sigma^2$, and $n$ is the size of the data.

Step 5. Finally, we generate the coefficients $\rho_i$ for the $W_i$ using Metropolis-Hastings sampling with a random walk chain:

$$\rho_i^{\text{new}} = \rho_i^{\text{old}} + \Delta_i$$

where the increment random variable $\Delta_i \sim \mathrm{Normal}(\nu_\Delta, \Omega_\Delta)$. The acceptance probability $\alpha$ is obtained by:

$$\alpha = \min\left\{ \frac{|B_{\text{new}}| \exp\left(-\frac{1}{2\sigma^2}\, \theta^{\top} B_{\text{new}}^{\top} B_{\text{new}}\, \theta\right)}{|B_{\text{old}}| \exp\left(-\frac{1}{2\sigma^2}\, \theta^{\top} B_{\text{old}}^{\top} B_{\text{old}}\, \theta\right)},\; 1 \right\}$$

A.3 Validation of Bayesian Software

One challenge of Bayesian methods is getting an error-free implementation.
Bayesian solutions often have high complexity, and a lack of software causes many researchers to develop their own, greatly increasing the chance of software error; many models are not validated, and many of them have errors and do not return correct estimates. It is therefore necessary to confirm that the code returns correct results. The validation of Bayesian software implementations has a short history; we wrote a program using a standard method, the method of posterior quantiles (Cook et al., 2006), to validate our software. This is again a simulation-based method. The idea is to generate data from the model and verify that the software properly recovers the underlying parameters in a principled way. First, we draw the parameters $\theta$ from the prior distribution $p(\Theta)$, then generate data from the distribution $p(y \mid \theta)$. If the software is correctly coded, the quantiles of each true parameter should be uniformly distributed with respect to the algorithm output. For example, the 95% credible interval should contain the true parameter with probability 95%.

Assume we want to estimate the parameter $\theta$ in the Bayesian model $p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$, where $p(\theta)$ is the prior distribution of $\theta$, $p(y \mid \theta)$ is the distribution of the data, and $p(\theta \mid y)$ is the posterior distribution. The estimated quantile can be defined as:

$$\hat{q}(\theta_0) = \hat{P}(\theta < \theta_0) = \frac{1}{N} \sum_{i=1}^{N} I(\theta_i < \theta_0)$$

where $\theta_0$ is the true value drawn from the prior distribution; $\theta_1, \ldots, \theta_N$ is the series of draws from the posterior distribution generated by the software being tested; and $N$ is the number of draws in the MCMC. The quantile is the probability that a posterior sample is smaller than the true value, and the estimated quantile is the fraction of posterior draws generated by the software that are smaller than the true value.
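The estimated quantile defined above can be illustrated with a minimal sketch. It is not the validation program used in the paper: the function names `estimated_quantile` and `replicate_quantiles` are hypothetical, and the toy model (a Normal(0, 1) prior with a single Normal($\theta$, 1) observation, whose exact posterior is Normal($y/2$, variance $1/2$)) is an assumption chosen so that exact posterior draws can stand in for MCMC output.

```python
import random


def estimated_quantile(posterior_draws, true_value):
    """Return (1/N) * sum_i I(theta_i < theta_0) for a list of posterior draws."""
    n_draws = len(posterior_draws)
    if n_draws == 0:
        raise ValueError("need at least one posterior draw")
    return sum(1 for draw in posterior_draws if draw < true_value) / n_draws


def replicate_quantiles(n_replications, n_draws, seed=0):
    """Run the prior -> data -> posterior -> quantile loop on a toy model.

    Toy model (an assumption, not the m-NAP model): theta ~ Normal(0, 1) and
    one observation y | theta ~ Normal(theta, 1), so the exact posterior is
    Normal(y / 2, variance 1/2) and stands in for the sampler under test.
    """
    rng = random.Random(seed)
    posterior_sd = 0.5 ** 0.5
    quantiles = []
    for _ in range(n_replications):
        theta_true = rng.gauss(0.0, 1.0)   # true value drawn from the prior
        y = rng.gauss(theta_true, 1.0)     # data generated given the true value
        draws = [rng.gauss(y / 2.0, posterior_sd) for _ in range(n_draws)]
        quantiles.append(estimated_quantile(draws, theta_true))
    return quantiles


if __name__ == "__main__":
    # Sorted quantiles over 10 replications; roughly evenly spread if all is well.
    sorted_qs = sorted(replicate_quantiles(n_replications=10, n_draws=1000))
    print([round(q, 2) for q in sorted_qs])
```

Because the stand-in sampler here is exact, the quantiles should look roughly uniform on (0, 1); replacing the exact posterior draws with output from a sampler under test is what turns this loop into a check of that sampler.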
If the software is correctly coded, then the distribution of the quantile $\hat{q}(\theta_0)$ for parameter $\theta$ should approach Uniform(0, 1) as $N \to \infty$ (Cook et al., 2006). The whole process up to now is defined as one replication. If we run a number of replications, we expect the values of $\hat{q}(\theta_0)$ to be uniformly distributed, meaning that the posterior draws are randomly distributed around the true value.

We now describe the simulations we ran. Assume the model we want to estimate is:

$$z = X_1\beta_1 + X_2\beta_2 + \theta + \epsilon; \qquad \theta = \rho_1 W_1 \theta + \rho_2 W_2 \theta + u$$

We then specified a prior distribution for each parameter and used MCMC to simulate the posterior distributions:

$$\beta \sim \mathrm{Normal}(0, 1); \qquad \sigma^2 \sim \mathrm{InvGamma}(5, 10); \qquad \rho \sim \mathrm{Normal}(0.05, 0.05^2)$$

We performed a simulation of 10 replications to validate our hierarchical Bayesian MCMC software. The generated sample size for $X$ is 50, so the network structure $W$ is 50 by 50. In each replication we generated 20,000 draws from the posterior distribution of all the parameters in $\phi$ ($\phi = \{\beta_1, \beta_2, \rho_1, \rho_2, \sigma^2\}$) and kept one from every 20 draws, yielding 1000 draws for each parameter. We then counted the number of draws larger than the true parameters in each replication. If the software is correctly written, each estimated value should be randomly distributed around the true value, so the number of estimates larger than the true value should be uniformly distributed among the 10 replications. We pooled all these quantiles for the five parameters, 50 in total, and the sorted results are shown in Figure 5.

Figure 5: Distribution of sorted quantiles of parameters $\beta_1$, $\beta_2$, $\rho_1$, $\rho_2$, $\sigma^2$ over 10 replications. The roughly uniform distribution indicates that the algorithm code functions correctly for data simulated from the model.

A.4 Solution diagnostic

We ran an MCMC experiment to confirm that there is no autocorrelation among the draws of each parameter.
In this experiment, we set the length of the MCMC chain to 30,000, the burn-in to 10,000, and the thinning to 20, which is used to remove the autocorrelation between draws. The trace plots generated from our code for the 1000 draws after burn-in and thinning are shown in Figure 6 below.

Figure 6: Trace plot of a two-network auto-probit model. $\beta_0$: coefficient of the constant term; $\beta_1$: coefficient of car price; $\beta_2$: coefficient of car's optional accessory; $\beta_3$: coefficient of consumer's age; $\beta_4$: coefficient of consumer's income; $\beta_5$: coefficient of consumer's ethnicity; $\beta_6$: coefficient of residence longitude; $\beta_7$: coefficient of residence latitude; $\rho_1$: coefficient of the first network autocorrelation term, $W_1$, cohesion; $\rho_2$: coefficient of the second network autocorrelation term, $W_2$, structural equivalence; $\sigma^2$: estimated variance of the error term in the autocorrelation.

We have 12 plots in total. Each plot depicts the draws for a particular parameter. The first 9 plots, from left to right and top to bottom, are the traces for the $\beta_i$, the coefficients of the independent variables. Each point represents the value of the estimated coefficient $\hat{\beta}_i$, and the red line represents the mean. We observe that all $\hat{\beta}_i$ are randomly distributed around the mean, and the mean is significant, showing that the estimation results are valid. The 10th and 11th plots are for the two estimated network effect coefficients $\hat{\rho}_1$ and $\hat{\rho}_2$. We found that both $\hat{\rho}_i$ are also significant and randomly distributed around their means. The only coefficient showing autocorrelation is $\sigma^2$.

Note that not all values of $\rho_1$ and $\rho_2$ make $B$ ($B = I_n - \rho_1 W_1 - \rho_2 W_2$) invertible. The plot below shows the relationship between the values of $\rho_1$ and $\rho_2$ and the invertibility of $B$. The green area is where $B$ is invertible, and the red area is where it is not. If we limit draws to the green area, $\rho_1$ and $\rho_2$ will be correlated.
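The invertibility condition on $B$ can be probed numerically. The following is a minimal sketch under stated assumptions rather than the authors' diagnostic code: the helper names `build_b`, `determinant`, and `is_invertible` are hypothetical, the determinant threshold `tol` is an arbitrary choice, and the 2-by-2 matrices in the example are toy inputs, not the networks used in the paper.

```python
def build_b(rho1, w1, rho2, w2):
    """Form B = I_n - rho1 * W1 - rho2 * W2 for square list-of-lists matrices."""
    n = len(w1)
    return [
        [(1.0 if i == j else 0.0) - rho1 * w1[i][j] - rho2 * w2[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # whole column is zero below the diagonal: singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_invertible(rho1, w1, rho2, w2, tol=1e-10):
    """Treat B as invertible when |det(B)| exceeds a small tolerance."""
    return abs(determinant(build_b(rho1, w1, rho2, w2))) > tol


if __name__ == "__main__":
    w = [[0.0, 1.0], [1.0, 0.0]]  # toy network, used for both W1 and W2
    print(is_invertible(0.3, w, 0.2, w))  # det(B) = 1 - 0.5**2 = 0.75
    print(is_invertible(0.5, w, 0.5, w))  # det(B) = 1 - 1.0**2 = 0
```

In this toy case $B$ is singular exactly when $\rho_1 + \rho_2 = \pm 1$, so whether a value of $\rho_1$ is admissible depends on $\rho_2$; this is one simple way to see how restricting draws to the invertible region couples the two coefficients.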
When we draw $\rho_1$ and $\rho_2$ from a bivariate normal, there is no apparent correlation between them (see Figure 7). We understand that the correlation between $\rho_1$ and $\rho_2$ comes from the definition of $W_1$ and $W_2$, not from the uncorrelated prior.

Figure 7: Regions of validity for $\rho_1$ and $\rho_2$ for which $B$ is invertible (green) or not (red).

A.5 W as a mixture of matrices

Yang and Allenby (2003) specified the autoregressive matrix $W$ as a finite mixture of coefficient matrices, each related to a specific covariate:

$$W = \sum_{i=1}^{n} \phi_i W_i, \qquad \sum_{i=1}^{n} \phi_i = 1$$

where $i$ indexes the covariates, $i = 1, \ldots, n$; $\phi_i$ is the weight of the component matrix $W_i$; and $W_i$ is associated with the covariate $X_i$.
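The finite-mixture construction above amounts to a weighted sum of component matrices with weights that sum to one. A minimal sketch, assuming plain list-of-lists matrices and a hypothetical helper name `mix_matrices` (neither taken from Yang and Allenby's implementation), is:

```python
def mix_matrices(weights, matrices, tol=1e-9):
    """Return W = sum_i phi_i * W_i, requiring the weights phi_i to sum to 1."""
    if not matrices or len(weights) != len(matrices):
        raise ValueError("need one weight per component matrix")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("mixture weights must sum to 1")
    n = len(matrices[0])
    return [
        [sum(phi * w[i][j] for phi, w in zip(weights, matrices))
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two toy component matrices (illustrative values only).
    w1 = [[0.0, 1.0], [1.0, 0.0]]
    w2 = [[0.0, 0.5], [0.5, 0.0]]
    print(mix_matrices([0.25, 0.75], [w1, w2]))
```

Under this construction the components enter through a single combined $W$ with weights tied together by the sum-to-one constraint, whereas the multiple-network model in this paper gives each $W_i$ its own coefficient $\rho_i$.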
