Modeling homophily and stochastic equivalence in symmetric relational data
This article discusses a latent variable model for inference and prediction of symmetric relational data. The model, based on the idea of the eigenvalue decomposition, represents the relationship between two nodes as the weighted inner-product of n…
Authors: Peter D. Hoff
Peter D. Hoff*

October 22, 2018

* Departments of Statistics, Biostatistics and the Center for Statistics and the Social Sciences, University of Washington, Seattle, Washington 98195-4322. Web: http://www.stat.washington.edu/hoff/. This work was partially funded by NSF grant number 0631531.

Abstract

This article discusses a latent variable model for inference and prediction of symmetric relational data. The model, based on the idea of the eigenvalue decomposition, represents the relationship between two nodes as the weighted inner-product of node-specific vectors of latent characteristics. This "eigenmodel" generalizes other popular latent variable models, such as latent class and distance models: It is shown mathematically that any latent class or distance model has a representation as an eigenmodel, but not vice-versa. The practical implications of this are examined in the context of three real datasets, for which the eigenmodel has as good or better out-of-sample predictive performance than the other two models.

Some key words: Factor analysis, latent class, Markov chain Monte Carlo, social network.

1 Introduction

Let $\{y_{i,j} : 1 \le i < j \le n\}$ denote data measured on pairs of a set of $n$ objects or nodes. The examples considered in this article include friendships among people, associations among words and interactions among proteins. Such measurements are often represented by a sociomatrix $Y$, which is a symmetric $n \times n$ matrix with an undefined diagonal. One of the goals of relational data analysis is to describe the variation among the entries of $Y$, as well as any potential covariation of $Y$ with observed explanatory variables $X = \{x_{i,j}, 1 \le i < j \le n\}$.

To this end, a variety of statistical models have been developed that describe $y_{i,j}$ as some function of node-specific latent variables $u_i$ and $u_j$ and a linear predictor $\beta^T x_{i,j}$. In such formulations, $\{u_1, \ldots, u_n\}$ represent across-node variation in the $y_{i,j}$'s and $\beta$ represents covariation of the $y_{i,j}$'s with the $x_{i,j}$'s. For example, Nowicki and Snijders [2001] present a model in which each node $i$ is assumed to belong to an unobserved latent class $u_i$, and a probability distribution describes the relationships between each pair of classes (see Kemp et al. [2004] and Airoldi et al. [2005] for recent extensions of this approach). Such a model captures stochastic equivalence, a type of pattern often seen in network data in which the nodes can be divided into groups such that members of the same group have similar patterns of relationships.

Figure 1: Networks exhibiting homophily (left panel) and stochastic equivalence (right panel).

An alternative approach to representing across-node variation is based on the idea of homophily, in which the relationships between nodes with similar characteristics are stronger than the relationships between nodes having different characteristics. Homophily provides an explanation to data patterns often seen in social networks, such as transitivity ("a friend of a friend is a friend"), balance ("the enemy of my friend is an enemy") and the existence of cohesive subgroups of nodes. In order to represent such patterns, Hoff et al. [2002] present a model in which the conditional mean of $y_{i,j}$ is a function of $\beta^T x_{i,j} - |u_i - u_j|$, where $\{u_1, \ldots, u_n\}$ are vectors of unobserved, latent characteristics in a Euclidean space.
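To make the latent distance formulation concrete, the following sketch simulates a binary network from it and counts triangles, alongside a graph in which the same number of edges is allocated at random. The probit link, the standard normal latent positions, the value `mu = 0` and all function names are illustrative assumptions, not choices stated at this point in the article.

```python
import math
import random
from itertools import combinations

def probit(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_latent_distance(n, mu, dim=2, seed=0):
    """Assumed setup: u_i ~ N(0, I_dim), Pr(y_ij = 1) = Phi(mu - |u_i - u_j|)."""
    rng = random.Random(seed)
    u = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    edges = set()
    for i, j in combinations(range(n), 2):
        if rng.random() < probit(mu - math.dist(u[i], u[j])):
            edges.add((i, j))  # stored with i < j
    return edges

def shuffle_edges(n, n_edges, seed=0):
    """Random allocation of the same number of edges among all pairs."""
    rng = random.Random(seed)
    return set(rng.sample(list(combinations(range(n), 2)), n_edges))

def count_triangles(n, edges):
    """Number of triples {i, j, k} in which all three edges are present."""
    return sum((i, j) in edges and (i, k) in edges and (j, k) in edges
               for i, j, k in combinations(range(n), 3))

n = 60
distance_edges = simulate_latent_distance(n, mu=0.0)
random_edges = shuffle_edges(n, len(distance_edges))
print("triangles, latent distance model:", count_triangles(n, distance_edges))
print("triangles, random allocation:   ", count_triangles(n, random_edges))
```

Matching the edge count in the random graph keeps the comparison about how edges are arranged rather than how many there are.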
In the context of binary relational data, such a model predicts the existence of more transitive triples, or "triangles," than would be seen under a random allocation of edges among pairs of nodes. An important assumption of this model is that two nodes with a strong relationship between them are also similar to each other in terms of how they relate to other nodes: A strong relationship between $i$ and $j$ suggests $|u_i - u_j|$ is small, but this further implies that $|u_i - u_k| \approx |u_j - u_k|$, and so nodes $i$ and $j$ are assumed to have similar relationships to other nodes.

The latent class model of Nowicki and Snijders [2001] and the latent distance model of Hoff et al. [2002] are able to identify, respectively, classes of nodes with similar roles, and the locational properties of the nodes. These two items are perhaps the two primary features of interest in social network and relational data analysis. For example, discussion of these concepts makes up more than half of the 734 pages of main text in Wasserman and Faust [1994]. However, a model that can represent one feature may not be able to represent the other: Consider the two graphs in Figure 1. The graph on the left displays a large degree of transitivity, and can be well-represented by the latent distance model with a set of vectors $\{u_1, \ldots, u_n\}$ in two-dimensional space, in which the probability of an edge between $i$ and $j$ is decreasing in $|u_i - u_j|$. In contrast, representation of the graph by a latent class model would require a large number of classes, none of which would be particularly cohesive or distinguishable from the others. The second panel of Figure 1 displays a network involving three classes of stochastically equivalent nodes, two of which (say $A$ and $B$) have only across-class ties, and one ($C$) that has both within- and across-class ties.
This graph is well-represented by a latent class model in which edges occur with high probability between pairs having one member in each of $A$ and $B$ or in $B$ and $C$, and among pairs having both members in $C$ (in models of stochastic equivalence, nodes within each class are not differentiated). In contrast, representation of this type of graph with a latent distance model would require the dimension of the latent characteristics to be on the order of the class membership sizes.

Many real networks exhibit combinations of structural equivalence and homophily in varying degrees. In these situations, use of either the latent class or distance model would only be representing part of the network structure. The goal of this paper is to show that a simple statistical model based on the eigenvalue decomposition can generalize the latent class and distance models: Just as any symmetric matrix can be approximated with a subset of its largest eigenvalues and corresponding eigenvectors, the variation in a sociomatrix can be represented by modeling $y_{i,j}$ as a function of $\beta^T x_{i,j} + u_i^T \Lambda u_j$, where $\{u_1, \ldots, u_n\}$ are node-specific factors and $\Lambda$ is a diagonal matrix. In this article, we show mathematically and by example how this eigenmodel can represent both stochastic equivalence and homophily in symmetric relational data, and thus is more general than the other two latent variable models.

The next section motivates the use of latent variables models for relational data, and shows mathematically that the eigenmodel generalizes the latent class and distance models in the sense that it can compactly represent the same network features as these other models but not vice-versa.
Section 3 compares the out-of-sample predictive performance of these three models on three different datasets: a social network of 12th graders; a relational dataset on word association counts from the first chapter of Genesis; and a dataset on protein-protein interactions. The first two networks exhibit latent homophily and stochastic equivalence respectively, whereas the third shows both to some degree. In support of the theoretical results of Section 2, the latent distance and class models perform well for the first and second datasets respectively, whereas the eigenmodel performs well for all three. Section 4 summarizes the results and discusses some extensions.

2 Latent variable modeling of relational data

2.1 Justification of latent variable modeling

The use of probabilistic latent variable models for the representation of relational data can be motivated in a natural way: For undirected data without covariate information, symmetry suggests that any probability model we consider should treat the nodes as being exchangeable, so that

$$\Pr(\{y_{i,j} : 1 \le i < j \le n\} \in A) = \Pr(\{y_{\pi i, \pi j} : 1 \le i < j \le n\} \in A)$$

for any permutation $\pi$ of the integers $\{1, \ldots, n\}$ and any set of sociomatrices $A$. Results of Hoover [1982] and Aldous [1985, chap. 14] show that if a model satisfies the above exchangeability condition for each integer $n$, then it can be written as a latent variable model of the form

$$y_{i,j} = h(\mu, u_i, u_j, \epsilon_{i,j}) \qquad (1)$$

for i.i.d. latent variables $\{u_1, \ldots, u_n\}$, i.i.d. pair-specific effects $\{\epsilon_{i,j} : 1 \le i < j \le n\}$ and some function $h$ that is symmetric in its second and third arguments. This result is very general: it says that any statistical model for a sociomatrix in which the nodes are exchangeable can be written as a latent variable model. Different choices of $h$ lead to different models for $y$.
A general probit model for binary network data can be put in the form of (1) as follows:

$$\{\epsilon_{i,j} : 1 \le i < j \le n\} \sim \text{i.i.d. normal}(0, 1)$$

$$\{u_1, \ldots, u_n\} \sim \text{i.i.d. } f(u \mid \psi)$$

$$y_{i,j} = h(\mu, u_i, u_j, \epsilon_{i,j}) = \delta_{(0,\infty)}(\mu + \alpha(u_i, u_j) + \epsilon_{i,j}),$$

where $\mu$ and $\psi$ are parameters to be estimated, and $\alpha$ is a symmetric function, also potentially involving parameters to be estimated. Covariation between $Y$ and an array of predictor variables $X$ can be represented by adding a linear predictor $\beta^T x_{i,j}$ to $\mu$. Finally, integrating over $\epsilon_{i,j}$ we obtain $\Pr(y_{i,j} = 1 \mid x_{i,j}, u_i, u_j) = \Phi[\mu + \beta^T x_{i,j} + \alpha(u_i, u_j)]$. Since the $\epsilon_{i,j}$'s can be assumed to be independent, the conditional probability of $Y$ given $X$ and $\{u_1, \ldots, u_n\}$ can be expressed as

$$\Pr(y_{i,j} = 1 \mid x_{i,j}, u_i, u_j) \equiv \theta_{i,j} = \Phi[\mu + \beta^T x_{i,j} + \alpha(u_i, u_j)] \qquad (2)$$

$$\Pr(Y \mid X, u_1, \ldots, u_n) = \prod_{i < j} \theta_{i,j}^{y_{i,j}} (1 - \theta_{i,j})^{1 - y_{i,j}}$$

[…] $\lambda_k > 0$ or $\lambda_k < 0$. In this way, the model can represent both positive or negative homophily in varying degrees, and stochastically equivalent nodes (nodes with the same or similar latent vectors) may or may not have strong relationships with one another.

We now show that the eigenmodel generalizes the latent class and distance models: Let $\mathcal{S}_n$ be the set of $n \times n$ sociomatrices, and let

$$\mathcal{C}_K = \{C \in \mathcal{S}_n : c_{i,j} = m_{u_i,u_j},\ u_i \in \{1, \ldots, K\},\ M \text{ a } K \times K \text{ symmetric matrix}\};$$

$$\mathcal{D}_K = \{D \in \mathcal{S}_n : d_{i,j} = -|u_i - u_j|,\ u_i \in \mathbb{R}^K\};$$

$$\mathcal{E}_K = \{E \in \mathcal{S}_n : e_{i,j} = u_i^T \Lambda u_j,\ u_i \in \mathbb{R}^K,\ \Lambda \text{ a } K \times K \text{ diagonal matrix}\}.$$

In other words, $\mathcal{C}_K$ is the set of possible values of $\{\alpha(u_i, u_j), 1 \le i < j \le n\}$ under a $K$-dimensional latent class model, and similarly for $\mathcal{D}_K$ and $\mathcal{E}_K$.

$\mathcal{E}_K$ generalizes $\mathcal{C}_K$: Let $C \in \mathcal{C}_K$ and let $\tilde{C}$ be a completion of $C$ obtained by setting $c_{i,i} = m_{u_i,u_i}$. There are at most $K$ unique rows of $\tilde{C}$ and so $\tilde{C}$ is of rank $K$ at most.
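The rank bound can be checked on a small example. The sketch below builds the completed matrix from class labels and a symmetric matrix of between-class effects, counts its distinct rows, and computes its rank exactly with rational arithmetic. The 3 × 3 matrix `M`, the randomly drawn labels and the helper names are hypothetical and chosen only for illustration.

```python
import random
from fractions import Fraction

def completed_class_matrix(labels, M):
    """C-tilde with entries c_ij = M[u_i][u_j], diagonal included."""
    return [[M[a][b] for b in labels] for a in labels]

def matrix_rank(rows):
    """Exact rank via Gauss-Jordan elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(A), len(A[0])
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot at or below the current pivot row.
        pivot = next((r for r in range(rank, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(n_rows):
            if r != rank and A[r][col] != 0:
                factor = A[r][col] / A[rank][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[rank])]
        rank += 1
    return rank

random.seed(1)
K, n = 3, 12
# Hypothetical symmetric K x K matrix of between-class effects.
M = [[2, -1, 0],
     [-1, 1, 3],
     [0, 3, -2]]
labels = [random.randrange(K) for _ in range(n)]  # class of each node, 0..K-1
C_tilde = completed_class_matrix(labels, M)
n_unique_rows = len({tuple(row) for row in C_tilde})
rank = matrix_rank(C_tilde)
print("distinct rows:", n_unique_rows, "rank:", rank, "K:", K)
```

Nodes in the same class share a row of the completed matrix, so the number of distinct rows, and with it the rank, cannot exceed $K$ however large $n$ is.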
Since the set $\mathcal{E}_K$ contains all sociomatrices that can be completed as a rank-$K$ matrix, we have $\mathcal{C}_K \subseteq \mathcal{E}_K$. Since $\mathcal{E}_K$ includes matrices with $n$ unique rows, $\mathcal{C}_K \subset \mathcal{E}_K$ unless $K \ge n$, in which case the two sets are equal.

$\mathcal{E}_{K+1}$ weakly generalizes $\mathcal{D}_K$: Let $D \in \mathcal{D}_K$. Such a (negative) distance matrix will generally be of full rank, in which case it cannot be represented exactly by an $E \in \mathcal{E}_K$ for $K < n$. However, what is critical from a modeling perspective is whether or not the order of the entries of each $D$ can be matched by the order of the entries of an $E$. This is because the probit and ordered probit models we are considering include threshold variables $\{\mu_y : y \in \mathcal{Y}\}$ which can be adjusted to accommodate monotone transformations of $\alpha(u_i, u_j)$. With this in mind, note that the matrix of squared distances among a set of $K$-dimensional vectors $\{z_1, \ldots, z_n\}$ is a monotonic transformation of the distances, is of rank $K + 2$ or less (as $D^2 = [z_1^T z_1, \ldots, z_n^T z_n]^T \mathbf{1}^T + \mathbf{1} [z_1^T z_1, \ldots, z_n^T z_n] - 2 Z Z^T$) and so is in $\mathcal{E}_{K+2}$. Furthermore, letting $u_i = (z_i, \sqrt{r^2 - z_i^T z_i}) \in \mathbb{R}^{K+1}$ for each $i \in \{1, \ldots, n\}$, we have $u_i^T u_j = z_i^T z_j + \sqrt{(r^2 - |z_i|^2)(r^2 - |z_j|^2)}$. For large $r$ this is approximately $r^2 - |z_i - z_j|^2 / 2$ (since $\sqrt{(r^2 - a)(r^2 - b)} \approx r^2 - (a + b)/2$ when $r$ is large), which is an increasing function of the negative distance $d_{i,j}$. For large enough $r$ the numerical order of the entries of this $E \in \mathcal{E}_{K+1}$ is the same as that of $D \in \mathcal{D}_K$.

$\mathcal{D}_K$ does not weakly generalize $\mathcal{E}_1$: Consider $E \in \mathcal{E}_1$ generated by $\Lambda = 1$, $u_1 = 1$ and $u_i = r$ with $0 < r < 1$ for $i > 1$. Then $r = e_{1,i_1} = e_{1,i_2} > e_{i_1,i_2} = r^2$ for all $i_1, i_2 \ne 1$. For which $K$ is such an ordering of the elements of $D \in \mathcal{D}_K$ possible? If $K = 1$ then such an ordering is possible only if $n = 3$. For $K = 2$ such an ordering is possible for $n \le 6$.
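The $K = 2$ claim can be illustrated with one symmetric arrangement: node 1 at the origin and the remaining nodes equally spaced on the unit circle. The sketch below checks, for several $n$, whether every centre-to-ring distance is strictly smaller than every ring-to-ring distance, which is the required ordering when $d_{i,j}$ is the negative distance. The ring arrangement, the tolerance and the function names are assumptions of this illustration. A failure of this particular arrangement does not by itself rule out other arrangements.

```python
import math

def ring_configuration(m):
    """Node 1 at the origin plus m nodes equally spaced on the unit circle."""
    ring = [(math.cos(2 * math.pi * k / m), math.sin(2 * math.pi * k / m))
            for k in range(m)]
    return [(0.0, 0.0)] + ring

def ordering_holds(points, tol=1e-9):
    """True if node 1 is strictly closer to every other node than any two of
    the other nodes are to each other."""
    centre, ring = points[0], points[1:]
    to_centre = max(math.dist(centre, p) for p in ring)
    between = min(math.dist(p, q)
                  for a, p in enumerate(ring) for q in ring[a + 1:])
    return to_centre < between - tol

# Keys are the total number of nodes n = m + 1.
results = {m + 1: ordering_holds(ring_configuration(m)) for m in range(2, 8)}
for n_nodes, ok in results.items():
    print(n_nodes, ok)
```

With this arrangement the ordering holds for $n = 3, \ldots, 6$ and fails from $n = 7$ on, where neighbouring ring nodes are no farther from each other than from the centre.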
This is because the kissing number in $\mathbb{R}^2$, or the number of non-overlapping spheres of unit radius that can simultaneously touch a central sphere of unit radius, is 6. If we put node 1 at the center of the central sphere, and 6 nodes at the centers of the 6 kissing spheres, then we have $d_{1,i_1} = d_{1,i_2} = d_{i_1,i_2}$ for all $i_1, i_2 \ne 1$. We can only have $d_{1,i_1} = d_{1,i_2} > d_{i_1,i_2}$ if we remove one of the non-central spheres to allow for more room between those remaining, leaving one central sphere plus five kissing spheres for a total of $n = 6$. Increasing $n$ increases the necessary dimension of the Euclidean space, and so for any $K$ there are $n$ and $E \in \mathcal{E}_1$ that have entry orderings that cannot be matched by those of any $D \in \mathcal{D}_K$.

A less general positive semi-definite version of the eigenmodel has been studied by Hoff [2005], in which $\Lambda$ was taken to be the identity matrix. Such a model can weakly generalize a distance model, but cannot generalize a latent class model, as the eigenvalues of a latent class model could be negative.

3 Model comparison on three different datasets

3.1 Parameter estimation

Bayesian parameter estimation for the three models under consideration can be achieved via Markov chain Monte Carlo (MCMC) algorithms, in which posterior distributions for the unknown quantities are approximated with empirical distributions of samples from a Markov chain. For these algorithms, it is useful to formulate the probit models described in Section 2.1 in terms of an additional latent variable $z_{i,j} \sim \text{normal}[\beta^T x_{i,j} + \alpha(u_i, u_j)]$, for which $y_{i,j} = y$ if $\mu_y < z_{i,j} < \mu_{y+1}$. Using conjugate prior distributions where possible, the MCMC algorithms proceed by generating a new state $\phi^{(s+1)} = \{Z^{(s+1)}, \mu^{(s+1)}, \beta^{(s+1)}, u_1^{(s+1)}, \ldots, u_n^{(s+1)}\}$ from a current state $\phi^{(s)}$ as follows:

1. For each $\{i, j\}$, sample $z_{i,j}$ from its (constrained normal) full conditional distribution.
2. For each $y \in \mathcal{Y}$, sample $\mu_y$ from its (normal) full conditional distribution.
3. Sample $\beta$ from its (multivariate normal) full conditional distribution.
4. Sample $u_1, \ldots, u_n$ and their associated parameters:
   • For the latent distance model, propose and accept or reject new values of the $u_i$'s with the Metropolis algorithm, and then sample the population variances of the $u_i$'s from their (inverse-gamma) full conditional distributions.
   • For the latent class model, update each class variable $u_i$ from its (multinomial) conditional distribution given current values of $Z$, $\{u_j : j \ne i\}$ and the variance of the elements of $M$ (but marginally over $M$ to improve mixing). Then sample the elements of $M$ from their (normal) full conditional distributions and the variance of the entries of $M$ from its (inverse-gamma) full conditional distribution.
   • For the latent vector model, sample each $u_i$ from its (multivariate normal) full conditional distribution, sample the mean of the $u_i$'s from their (normal) full conditional distributions, and then sample $\Lambda$ from its (multivariate normal) full conditional distribution.

To facilitate comparison across models, we used prior distributions in which the level of prior variability in $\alpha(u_i, u_j)$ was similar across the three different models. An R package that implements the MCMC is available at cran.r-project.org/src/contrib/Descriptions/eigenmodel.html.

3.2 Cross validation

To compare the performance of these three different models we evaluated their out-of-sample predictive performance under a range of dimensions ($K \in \{3, 5, 10\}$) and on three different datasets exhibiting varying combinations of homophily and stochastic equivalence. For each combination of dataset, dimension and model we performed a five-fold cross validation experiment as follows:

1. Randomly divide the $\binom{n}{2}$ data values into 5 sets of roughly equal size, letting $s_{i,j}$ be the set to which pair $\{i, j\}$ is assigned.
2. For each $s \in \{1, \ldots, 5\}$:
   (a) Obtain posterior distributions of the model parameters conditional on $\{y_{i,j} : s_{i,j} \ne s\}$, the data on pairs not in set $s$.
   (b) For pairs $\{k, l\}$ in set $s$, let $\hat{y}_{k,l} = E[y_{k,l} \mid \{y_{i,j} : s_{i,j} \ne s\}]$, the posterior predictive mean of $y_{k,l}$ obtained using data not in set $s$.

This procedure generates a sociomatrix $\hat{Y}$, in which each entry $\hat{y}_{i,j}$ represents a predicted value obtained from using a subset of the data that does not include $y_{i,j}$. Thus $\hat{Y}$ is a sociomatrix of out-of-sample predictions of the observed data $Y$.

Table 1: Cross validation results and area under the ROC curves.

| K | Add health (dist) | Add health (class) | Add health (eigen) | Genesis (dist) | Genesis (class) | Genesis (eigen) | Protein interaction (dist) | Protein interaction (class) | Protein interaction (eigen) |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 0.82 | 0.64 | 0.75 | 0.62 | 0.82 | 0.82 | 0.83 | 0.79 | 0.88 |
| 5 | 0.81 | 0.70 | 0.78 | 0.66 | 0.82 | 0.82 | 0.84 | 0.84 | 0.90 |
| 10 | 0.76 | 0.69 | 0.80 | 0.74 | 0.82 | 0.82 | 0.85 | 0.86 | 0.90 |

3.3 Adolescent Health social network

The first dataset records friendship ties among 247 12th-graders, obtained from the National Longitudinal Study of Adolescent Health (www.cpc.unc.edu/projects/addhealth). For these data, $y_{i,j} = 1$ or 0 depending on whether or not there is a close friendship tie between student $i$ and $j$ (as reported by either $i$ or $j$). These data are represented as an undirected graph in the first panel of Figure 2. Like many social networks, these data exhibit a good deal of transitivity. It is therefore not surprising that the best performing models considered (in terms of area under the ROC curve, given in Table 1) are the distance models, with the eigenmodels close behind.
In contrast, the latent class models perform poorly, and the results suggest that increasing $K$ for this model would not improve its performance.

Figure 2: Social network data and unscaled ROC curves for the $K = 3$ models.

3.4 Word neighbors in Genesis

The second dataset we consider is derived from word and punctuation counts in the first chapter of the King James version of Genesis (www.gutenberg.org/dirs/etext05/bib0110.txt). There are 158 unique words and punctuation marks in this chapter, and for our example we take $y_{i,j}$ to be the number of times that word $i$ and word $j$ appear next to each other (a model extension, appropriate for an asymmetric version of this dataset, is discussed in the next section). These data can be viewed as a graph with weighted edges, the unweighted version of which is shown in the first panel of Figure 3. The lack of a clear spatial representation of these data is not unexpected, as text data such as these do not have groups of words with strong within-group connections, nor do they display much homophily: a given noun may appear quite frequently next to two different verbs, but these verbs will not appear next to each other. A better description of these data might be that there are classes of words, and connections occur between words of different classes. The cross validation results support this claim, in that the latent class model performs much better than the distance model on these data, as seen in the second panel of Figure 3 and in Table 1. As discussed in the previous section, the eigenmodel generalizes the latent class model and performs equally well.
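The five-fold procedure of Section 3.2 and the area-under-ROC summary reported in Table 1 can be sketched in a few lines. Fitting any of the three models by MCMC is out of scope here, so `fit_predict` is a hypothetical stand-in, and `mean_tie_predictor` is a deliberately simple toy predictor rather than one of the models in the article. The toy "hub" data and all names are assumptions for illustration.

```python
import random
from itertools import combinations

def assign_folds(n, n_folds=5, seed=0):
    """Step 1: randomly split the n-choose-2 pairs into folds of near-equal size."""
    pairs = list(combinations(range(n), 2))
    random.Random(seed).shuffle(pairs)
    return {pair: k % n_folds for k, pair in enumerate(pairs)}

def cross_validate(y, fit_predict, n_folds=5, seed=0):
    """Step 2: predict each fold using only the data on the other folds.

    y maps each pair (i, j), i < j, to its observed value.
    fit_predict(train, test_pairs) is a hypothetical stand-in for model
    fitting; it returns {pair: predicted value} for the held-out pairs.
    """
    n = 1 + max(j for _, j in y)
    folds = assign_folds(n, n_folds, seed)
    y_hat = {}
    for s in range(n_folds):
        train = {p: v for p, v in y.items() if folds[p] != s}
        test_pairs = [p for p in y if folds[p] == s]
        y_hat.update(fit_predict(train, test_pairs))
    return y_hat

def auc(scores, labels):
    """Area under the ROC curve: fraction of (positive, negative) pairs that
    are ranked correctly, counting ties as one half."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def mean_tie_predictor(train, test_pairs):
    """Toy predictor (NOT one of the article's models): score a held-out pair
    by the average training value over pairs sharing a node with it."""
    totals, counts = {}, {}
    for (i, j), v in train.items():
        for node in (i, j):
            totals[node] = totals.get(node, 0.0) + v
            counts[node] = counts.get(node, 0) + 1
    def node_mean(node):
        return totals.get(node, 0.0) / counts.get(node, 1)
    return {(k, l): 0.5 * (node_mean(k) + node_mean(l)) for k, l in test_pairs}

# Toy weighted data: node 0 is a hub tied to every node, all other pairs are 0.
n = 12
y = {(i, j): (2 if i == 0 else 0) for i, j in combinations(range(n), 2)}
y_hat = cross_validate(y, mean_tie_predictor)
pairs = sorted(y)
# Score the ability to predict a non-zero relationship.
print(auc([y_hat[p] for p in pairs], [y[p] > 0 for p in pairs]))
```

Writing the area under the curve as a proportion of correctly ordered positive and negative pairs avoids constructing the ROC curve explicitly, at the cost of a quadratic number of comparisons.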
We note that parameter estimates for these data were obtained using the ordered probit versions of the models (as the data are not binary), but the out-of-sample predictive performance was evaluated based on each model's ability to predict a non-zero relationship.

Figure 3: Relational text data from Genesis and unscaled ROC curves for the $K = 3$ models.

3.5 Protein-protein interaction data

Our last example is the protein-protein interaction data of Butland et al. [2005], in which $y_{i,j} = 1$ if proteins $i$ and $j$ bind and $y_{i,j} = 0$ otherwise. We analyze the large connected component of this graph, which includes 230 proteins and is displayed in the first panel of Figure 4. This graph indicates patterns of both stochastic equivalence and homophily: Some nodes could be described as "hubs", connecting to many other nodes which in turn do not connect to each other. Such structure is better represented by a latent class model than a distance model. However, most nodes connecting to hubs generally connect to only one hub, which is a feature that is hard to represent with a small number of latent classes. To represent this structure well, we would need two latent classes per hub, one for the hub itself and one for the nodes connecting to the hub. Furthermore, the core of the network (the nodes with more than two connections) displays a good degree of homophily in the form of transitive triads, a feature which is easiest to represent with a distance model. The eigenmodel is able to capture both of these data features and performs better than the other two models in terms of out-of-sample predictive performance. In fact, the $K = 3$ eigenmodel performs better than the other two models for any value of $K$ considered.

4 Discussion

Latent distance and latent class models provide concise, easily interpreted descriptions of social networks and relational data.
However, neither of these models will provide a complete picture of relational data that exhibit degrees of both homophily and stochastic equivalence. In contrast, we have shown that a latent eigenmodel is able to represent datasets with either or both of these data patterns. This is due to the fact that the eigenmodel provides an unrestricted low-rank approximation to the sociomatrix, and is therefore able to represent a wide array of patterns in the data.

Figure 4: Protein-protein interaction data and unscaled ROC curves for the $K = 3$ models.

The concept behind the eigenmodel is the familiar eigenvalue decomposition of a symmetric matrix. The analogue for directed networks or rectangular matrix data would be a model based on the singular value decomposition, in which data $y_{i,j}$ could be modeled as depending on $u_i^T D v_j$, where $u_i$ and $v_j$ represent vectors of latent row and column effects respectively. Statistical inference using the singular value decomposition for Gaussian data is straightforward. A model-based version of the approach for binary and other non-Gaussian relational datasets could be implemented using the ordered probit model discussed in this paper.

Acknowledgment

This work was partially funded by NSF grant number 0631531.

References

Edoardo Airoldi, David Blei, Eric Xing, and Stephen Fienberg. A latent mixed membership model for relational data. In LinkKDD '05: Proceedings of the 3rd international workshop on Link discovery, pages 82–89, New York, NY, USA, 2005. ACM Press. ISBN 1-59593-215-1. doi: http://doi.acm.org/10.1145/1134271.1134283.

David J. Aldous. Exchangeability and related topics. In École d'été de probabilités de Saint-Flour, XIII–1983, volume 1117 of Lecture Notes in Math., pages 1–198. Springer, Berlin, 1985.

G. Butland, J. M. Peregrin-Alvarez, J. Li, W. Yang, X. Yang, V. Canadien, A. Starostine, D. Richards, B. Beattie, N. Krogan, M. Davey, J. Parkinson, J. Greenblatt, and A. Emili. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature, 433:531–537, 2005.

Peter D. Hoff. Bilinear mixed-effects models for dyadic data. J. Amer. Statist. Assoc., 100(469):286–295, 2005. ISSN 0162-1459.

Peter D. Hoff, Adrian E. Raftery, and Mark S. Handcock. Latent space approaches to social network analysis. J. Amer. Statist. Assoc., 97(460):1090–1098, 2002. ISSN 0162-1459.

D. N. Hoover. Row-column exchangeability and a generalized model for probability. In Exchangeability in probability and statistics (Rome, 1981), pages 281–291. North-Holland, Amsterdam, 1982.

Charles Kemp, Thomas L. Griffiths, and Joshua B. Tenenbaum. Discovering latent classes in relational data. AI Memo 2004-019, Massachusetts Institute of Technology, 2004.

Krzysztof Nowicki and Tom A. B. Snijders. Estimation and prediction for stochastic blockstructures. J. Amer. Statist. Assoc., 96(455):1077–1087, 2001. ISSN 0162-1459.

Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, 1994.