생존 이론으로 보는 정보 전파와 네트워크 추론

본 논문은 서바이벌 분석을 기반으로 정보·행동·질병 등 다양한 전염 현상의 확산을 모델링하고, 관측된 감염 시각 데이터를 이용해 숨겨진 네트워크 구조를 추정하는 두 가지 위험 모델(가법·곱법)을 제안한다. 가법 모델은 기존 연속시간 전파 모델들을 일반화하고, 곱법 모델은 노드가 다른 노드의 감염 위험을 증가시키거나 감소시킬 수 있도록 확장한다. 두 모델 모두 로그우도 최적화를 통해 볼록(convex) 문제로 변환되어 효율적으로 학습 가능하며, …

저자: Manuel Gomez Rodriguez, Jure Leskovec, Bernhard Schoelkopf

생존 이론으로 보는 정보 전파와 네트워크 추론
Mo deling Information Propagation with Surviv al Theory Man uel Gomez-Ro driguez 1 , 2 manuelgr@tuebingen.mpg.de Jure Lesk o v ec 2 jure@cs.st anford.edu Bernhard Sc h¨ olk opf 1 bs@tuebingen.mpg.de 1 MPI for In telligent Systems and 2 Stanford Univ ersity Abstract Net works provide a ‘skeleton’ for the spread of con tagions, lik e, information, ideas, b e- ha viors and diseases. Man y times net w orks o ver whic h con tagions diffuse are unobserved and need to be inferred. Here we apply sur- viv al theory to dev elop general additiv e and m ultiplicative risk mo dels under which the net work inference problems can b e solved effi- cien tly b y exploiting their conv exit y . Our additiv e risk mo del generalizes several exis- ting net work inference models. W e show all these mo dels are particular cases of our more general mo del. Our m ultiplicative mo del allo ws for mo deling scenarios in whic h a node can either increase or decrease the risk of acti- v ation of another no de, in contrast with pre- vious approac hes, which consider only posi- tiv e risk increments. W e ev aluate the p erfor- mance of our netw ork inference algorithms on large syn thetic and real cascade datasets, and sho w that our mo dels are able to predict the length and duration of cascades in real data. 1. Introduction Net work diffusion is one of the fundamental pro cesses taking place in netw orks (Rogers, 1995). F or example, information, diseases, rumors, and b eha viors spread o ver underlying social and information net w orks. Abs- tractly , w e think of a c ontagion that appears at some no de of a net w ork and then spreads lik e an epidemic from no de to no de o v er the edges of the netw ork. F or example, in information propagation, the con tagion corresp onds to a piece of information (Lib en-No w ell & Klein b erg, 2008; Lesko v ec et al., 2009), the no des corresp ond to people and infection ev en ts are the times Pr o c e e dings of the 30 th International Confer enc e on Ma- chine L e arning , Atlan ta, Georgia, USA, 2013. JMLR: W&CP volume 28. Copyrigh t 2013 by the author(s). when nodes learn about the information. Similarly , w e can think ab out the spread of a new t yp e of b eha vior or an action, lik e, purchasing and recommending a new product (Lesk ov ec et al., 2006) or the propagation of a contagious disease o ver so cial net work of indivi- duals (Bailey, 1975). Propagation often occurs o ver netw orks whic h are hi- dden or unobserved. Ho wev er, w e can observ e the trace of the contagion spreading. F or example, in in- formation diffusion, we observ e when a node learns ab out the information but not who they heard it from. In epidemiology , a person can become ill but cannot tell who infected her. And, in mark eting, it is p ossi- ble to observ e when customers buy pro ducts but not who influenced their decisions. Th us, w e can observe a set of contagion infection times and the goal is to infer the edges of the underlying netw ork o ver which the con tagion diffused (Gomez-Ro driguez et al., 2010). In this pap er w e prop ose a general theoretical frame- w ork to mo del information propagation and then infer hidden or unobserved netw orks using surviv al theory . W e generalize previous w ork, develop efficien t netw ork inference metho ds, and v alidate them exp erimen tally . In particular, our methods not only identify the net- w ork structure but also infer which links inhibit or encourage the diffusion of the con tagion. Our approach to information propagation. W e consider c ontagions spreading across a fixed p opula- tion of no des. The contagion spreads by no des forcing other nodes to switch from b eing uninfected to being infected, but no des cannot switc h in the opp osite di- rection. Therefore, we can represen t whether a node is infected at any giv en time as a nondecreasing (binary) coun ting pro cess. W e then model the instan taneous risk of infection, i.e., the hazar d r ate (Aalen et al., 2008) of a node by using the infection times of other previously infected no des as explanatory v ariables or c ovariates . By inferring which no des influence the ha- zard rate of a given node, we disco v er the edges of Mo deling Information Propagation with Surviv al Theory the underlying netw ork ov er whic h propagation takes place. In particular, if the hazard rate of no de i de- p ends on the infection time of no de j , then there is a directed edge ( j, i ) in the underlying net work. W e then develop t wo models. First, w e in tro duce an additiv e risk mo del under which the hazard rate of eac h node is an additiv e function of the infection times of other previously infected nodes. W e show that sev e- ral previous approaches to netw ork inference (Gomez- Ro driguez et al., 2011; Du et al., 2012; W ang et al., 2012; Gomez-Ro driguez et al., 2013) are particular cases of our more general additive risk model. Ho w- ev er, all these mo dels implicitly consider previously infected no des to only incr e ase the instan taneous risk of infection. W e then relax this assumption and de- v elop a multiplicativ e risk model under which the ha- zard rate of each no de is m ultiplicative on the infec- tion times of other previously infected nodes. This allo ws previously infected no des to either incr e ase or de cr e ase the risk of another no de getting infected. F or example, trendsetters’ probability of buying a product ma y increase when she observes her peers buying a pro duct but ma y also de cr e ase when she realizes that aver age, mainstr e am friends are buying the product. Similarly , consider an example of a blog which often men tions pieces of information from a general news media sites, but only whenever they are not related to sp orts. Therefore, if the general news media site publishes a piece of information related to sp orts, w e w ould like the blog’s risk of adopting the information to b e smaller than for other type of information. Last, w e sho w ho w to efficien tly fit the parameters of both mo dels b y using the maximum lik eliho od principle and b y exploiting conv exity of the optimization problems. Related w ork. In recen t y ears, many net w ork in- ference algorithms hav e b een dev elop ed (Saito et al., 2009; Gomez-Rodriguez et al., 2010; 2011; 2013; My ers & Lesko v ec, 2010; Sno wsill et al., 2011; Netrapalli & Sangha vi, 2012; Gomez-Ro driguez & Sc h¨ olk opf, 2012; W ang et al., 2012). These approac hes differ in a sense that some infer only the net w ork structure (Gomez- Ro driguez et al., 2010; Snowsill et al., 2011), while others infer not only the net w ork structure but also the str ength or the a verage latency of every edge in the netw ork (Saito et al., 2009; Myers & Lesko vec, 2010; Gomez-Rodriguez et al., 2011; 2013; W ang et al., 2012). Most of the approaches use only temporal in- formation while a few metho ds (Netrapalli & Sang- ha vi, 2012; W ang et al., 2012) consider b oth temp o- ral information and additional non-temp oral features. Our w ork provides tw o nov el con tributions ov er ab o v e approac hes. First, our additive risk mo del is a gene- ralization of several mo dels which ha ve b een prop osed previously in the literatu re. Second, w e dev elop a m ul- tiplicativ e mo del which allo ws no des to increase or de- crease the risk of infection of another no de. 2. Mo deling information propagation with surviv al analysis Information propagation as a coun ting pro- cess. W e consider multiple independent con tagions spreading across an unobserv ed net work on N nodes. As a single contagion spreads, it creates a c asc ade . A cascade t c of contagion c is simply a N -dimensional v ector t c := ( t c 1 , . . . , t c N ) recording the times when eac h of N nodes got infected by the con tagion c : t c i ∈ [ t 0 , t 0 + T c ] ∪ {∞} , where t 0 is the infection time of the first no de. Generally , contagions do not infect all the no des of the net w ork, and symbol ∞ is used for no des that w ere not infected b y the con tagion c during the observ ation windo w [ t 0 , t 0 + T c ]. F or simplicit y , we assume T c = T for all cascades; the results generalize trivially . In an information or rumor propagation se- tting, each cascade c corresp onds to a different piece of information or rumor, nodes i are people, and the infection time of a no de t c i is simply the time when no de i first learned ab out the piece of information or rumor. No w, consider no de i , cascade t c = t , and an indi- cator function N i ( t ) suc h that N i ( t ) = 1 if no de i is infected by time t in the cascade and N i ( t ) = 0 otherwise. Then, we define the filtr ation F t as the set of no des that has b een infected b y time t and their infection times, i.e. , F t = ( t t 0 . Then w e can decomp ose N i ( t ) uniquely as N i ( t ) = Λ i ( t ) + M i ( t ), where Λ i ( t ) is a nondecreasing predictable process, called cumulative intensity pr o c ess and M i ( t ) i s a mean zero martingale. This is called the Do ob-Mey er decomp osition of a s ubmartingale (Aalen et al., 2008). Consider Λ i ( t ) to b e absolutely conti- n uous, then there exists a pr e dictable nonnegativ e in- tensit y pro cess λ i ( t ) suc h that: N i ( t ) = Z t 0 λ i ( s ) ds + M i ( t ) . (1) No w, w e assume that the intensit y pro cess λ i ( t ) de- p ends on a v ector of explanatory v ariables or c ova- riates , s ( t ) = γ ( t T , or equiv alen tly t n = ∞ ) and apply logarithms. Therefore, the log-likelihoo d of cas- Mo deling Information Propagation with Surviv al Theory cade t is: log f ( t ; A ) = X i : t i T X m : t m 0 are the parameters of the mo del, which represen t the p ositive or ne gative in- fluence of node j on no de i . If β j i > 1, then when no de j gets infected, the instantaneous risk of infec- tion of node i increases. Similarly , if β j i < 1, then it decreases, and, if β j i = 1, no de j does not hav e an y effect on the risk of node i , i.e., there is no edge in the netw ork. The baseline function α 0 i ( t ) hav e a com- plex shape and is c hosen based on exp ert kno wledge. F or simplicity , we consider simple functions suc h as α 0 i ( t ) = e α 0 i , α 0 i ( t ) = e α 0 i t , or α 0 i ( t ) = e α 0 i /t , where w e set α 0 i to some v alue equal for all nodes i . W e note that w e also tried to include α 0 i as a v ariable in the netw ork inference problem, but this did not lead to impro ved p erformance. Our goal now is to infer the optimal parameters β j i that maximize the lik eliho o d of a set of observ ed cas- cades C . Imp ortantly , b y inferring the parameters β j i , w e also discov er the underlying netw ork ov er which propagation occurs. If β j i 6 = 1, then there is an edge from node j to node i , and if β j i = 1, there is not edge. T o this aim, w e need to compute the lik eliho od of a cascade starting from the hazard rate of each no de. W e first compute the cumulativ e likelihoo d F i ( t | s ( t )) of infection of a no de i using Eq. 2: F i ( t | s ( t ); β i ) = 1 − exp   − X j : t j ≤ t,j > 0 Y k : t k 0 β ki Z t j t j − 1 α 0 i ( t ) dt   , (11) where β i = ( β 1 i , . . . , β N i ). Then, the likelihoo d of infection f i ( t | s ( t )) is: f i ( t | s ( t ); β i ) = α 0 i ( t ) Y k : t k 0 β ki Z t j t j − 1 α 0 i ( t ) dt   , (13) where B := { β j i | i, j = 1 , . . . , n, i 6 = j } . Ho wev er, Eq. 13 only considers infected no des. The fact that some no des are not infected b y the con tagion is also informativ e. W e then add surviv al terms for any non- infected no de n ( t n > T , or equiv alently t n = ∞ ). W e no w reparameterize β j i to α j i = log( β j i ) and apply logarithms to compute the log-likelihoo d of a cascade as, log f ( t ; A ) = X i : t i 0 α ki Z t j t j − 1 α 0 i ( t ) dt − X n : t n >T X j : t j ≤ T e P k : t k 0 α ki Z t j t j − 1 α 0 i ( t ) dt, (14) where A := [ α i ] ∈ R N × N and α ii = 0. The first three terms represent the infected no des and the last term Mo deling Information Propagation with Surviv al Theory represen ts the surviving ones up to the observ ation windo w cut-off T . Assuming indep enden t cascades, the log-lik elihoo d of a set of cascades C is the sum of the log-likelihoo ds of the individual cascades given b y Eq. 14. Then, we apply the maxim um likelihoo d principle on the log-likelihoo d of the set of cascades to find the optimal parameters α i of ev ery no de i : minimize A − P c ∈ C log f ( t c ; A ) (15) The solution to Eq. 15 is unique and computable: Theorem 3. The network infer enc e pr oblem under the multiplic ative risk mo del define d in Eq. 15 is c onvex in A . Pr o of. Result follows from linearit y , comp osition rules for con vexit y , and conv exit y of the exp onen tial. Mo del parameters hav e natural interpretation. If α j i > 0, node j increases the hazard rate of no de i (p ositiv e influence), if α j i < 0, no de j decreases the hazard rate of no de i (negativ e influence), and finally if a parameter α j i = 0, no de j does not hav e an y in- fluence on i – there is no edge b et ween j and i . Ho wev er, there are some undesirable prop erties of the solution to the multiplicativ e risk mo del as defined by Eq. 15. The optimal net w ork will b e dense: any pair of no des ( j, i ) that are not infected b y the same con tagion at least once will hav e negativ e influence on eac h other. Ev en w orse, the negative influence b et ween those pairs of no des will be arbitrarily large, making the optimal solution un b ounded. W e prop ose the following solu- tion to this issue. If pair ( j, i ) does not get infected in any common cascades, we set α j i to zero and do not include it in the log-likelihoo d computation. This rules out in teractions b etw een no des that got infected in disjoin t sets of cascades and a voids un b ounded opti- mal solutions. In other words, we assume that if no de j has (p ositiv e or negativ e) influence on node i , then i and j should get infected b y at least one common con tagion and naturally j should get infected b efore i . By ruling out in teractions b et ween nodes that got infected in disjoin t cascades we successfully reduce the net work density of the optimal solution. How ever, the solution is not encouraged to b e sparse yet. W e ac hieve ev en greater sparsity b y including L1-norm regulariza- tion term (Bo yd & V andenberghe, 2004). Therefore, w e finally solve: minimize A − P c ∈ C g ( t c ; A ) + λ P j,i | α j i | , (16) where λ is a sparsity penalty p arameter and g ( t c ; A ) is the log-likelihoo d of cascade t c whic h omits para- meters α j i of pairs ( j, i ) that did not get infected b y 0 0.2 0.4 0.6 0.8 1 0 2000 4000 6000 8000 10000 Accuracy Number of cascades Additive, γ (t j ; t) = (t-t j ) Multiplicative, α 0,i (t) = e -3 (a) Edge accuracy (C-P) 0 0.2 0.4 0.6 0.8 1 0 2000 4000 6000 8000 10000 Accuracy Number of cascades Additive, γ (t j ; t) = 1(t>t j ) Multiplicative, α 0,i (t) = e -4 t (b) Edge accuracy (HI) 0 0.2 0.4 0.6 0.8 1 0 2000 4000 6000 8000 10000 MSE Number of cascades Additive, γ (t j ; t) = (t-t j ) Multiplicative, α 0,i (t) = e -3 (c) MSE (C-P) 0 0.2 0.4 0.6 0.8 1 0 2000 4000 6000 8000 10000 MSE Number of cascades Additive, γ (t j ; t) = 1(t>t j ) Multiplicative, α 0,i (t) = e -4 t (d) MSE (HI) Figure 1. Edge accuracy and MSE of our inference me- tho ds for additive and multiplicativ e propagation models against num b er of cascades. W e used 1,024 no de Core- P eriphery (C-P) and Hierarc hical (HI) Kroneck er netw orks with an av erage of four edges p er no de and T = 4. at least one common contagion (and t j < t i ). The ab o ve problem is con vex by using the same reasoning as in Th. 3. Finally , we note that by in troducing a L1- norm regularization term, we are essentially assuming Laplacian prior ov er A . Dep ending on the domain, other priors may b e more appropriate; as long as they are jointly log-concav e on A , the netw ork inference problem will still b e conv ex. 5. Exp erimental ev aluation W e ev aluate the p erformance of b oth the additive and the m ultiplicativ e model on synthetic netw orks that mimic the structure of real net w orks as well as on a dataset of more than 10 million information cascades spreading b et w een 3.3 million w ebsites ov er a 4 month p eriod 1 . 5.1. Exp erimen ts on synthetic data In this section, we compare the performance of our inference algorithms for additive and m ultiplicativ e mo dels for different net work structures, time shaping functions, baselines, and observ ation windows. W e skip a comparison to other metho ds such as Net- Ra te , KernelCascade , moNet or InfoP a th since our mo del is able to mimic these metho ds by simply c ho osing the appropriate time shaping function γ ( · ; · ), T able 1. Rather, w e focus on comparing m ultiplicativ e and additiv e mo dels. 1 Av ailable at http://snap.stanford.edu/infopath/ Mo deling Information Propagation with Surviv al Theory 0 0.2 0.4 0.6 0.8 1 1 2 3 4 5 6 7 8 9 10 Accuracy T Additive, γ (t j ; t) = e 0.25(t-t j ) Multiplicative, α 0,i (t) = e -2 (a) Edge accuracy 0 0.2 0.4 0.6 0.8 1 1 2 3 4 5 6 7 8 9 10 MSE T Additive, γ (t j ; t) = e 0.25(t-t j ) Multiplicative, α 0,i (t) = e -2 (b) MSE Figure 2. Edge accuracy and MSE of our inference me- tho ds for additive and multiplicativ e propagation models against observ ation windo w. W e used a 1,024 node Ran- dom Kroneck er netw ork with an av erage of 4 edges p er no de. Exp erimen tal setup. First we generate realis- tic synthetic netw orks using the Kroneck er graph mo del (Lesk ov ec et al., 2010), and set the edge ha- zard function parameters α j,i randomly , dra wn from a uniform distribution. W e then sim ulate and record a set of cascades propagating o v er the net w ork using the additiv e or the multiplicativ e mo del. F or eac h cascade w e pic k the cascade initiator node uniformly at ran- dom and generate the infection times follo wing a simi- lar pro cedure as in Austin (2012): W e draw a uniform random v ariable p er no de, and then use inv erse trans- form sampling (Devro y e, 1986) to generate piecewise lik eliho ods of no de infections. Note that every time a paren t of no de i gets infected, we need to consider a new in terv al in the piecewise likelihoo d of infection of no de i . P erformance vs. n umber of cascades. W e ev a- luate our inference metho ds by computing t w o di- fferen t measures: edge accuracy and mean squared error (MSE). Edge accuracy quan tifies the fraction of edges the metho d was able to infer correctly: 1 − P i,j | I ( α ∗ i,j ) − I ( ˆ α i,j ) | P i,j I ( α ∗ i,j )+ P i,j I ( ˆ α i,j ) , where I ( α ) = 1 if α > 0 and I ( α ) = 0 otherwise. The MSE quan tifies the error in the estimates of parameters α : E  ( α ∗ − ˆ α ) 2  , where α ∗ is the true parameter of the mo del and ˆ α is the estimated parameter. Figure 1 sho ws the edge accuracy and the MSE against cascade size for tw o types of Kroneck er net works: hie- rarc hical and core-periphery , using different additive and multiplicativ e propagation mo dels. Comparing additiv e and multiplicativ e mo dels w e find that in or- der to infer the netw orks to same accuracy the mul- tiplicativ e mo del requires more data. This means it is more difficult to disco ver the net work and fit the parameters for the m ultiplicativ e model than for the additiv e model. Moreo ver, estimating the v alue of the model parameters is considerably harder than sim- ply discov ering edges and therefore more cascades are T opic or news even t ( Q ) # sites # memes Arab Spring 950 17,975 Bailout 1,127 36,863 F ukushima 1,244 24,888 Gaddafi 1,068 38,166 Kate Middleton 1,292 15,112 T able 2. T opic and news even t statistics. needed for accurate estimates. P erformance vs. observ ation windo w length. Lengthening the observ ation window increases the n umber of observed infections and results in a more represen tative sample of the underlying dynamics. Therefore, it should in tuitively result in more accu- rate estimates for both the additive and multiplica- tiv e mo dels. Figure 2 sho ws performance against di- fferen t observ ation windo w lengths for a random net- w ork (Erd˝ os & R ´ en yi, 1960), using additiv e and m ulti- plicativ e mo dels ov er 1,000 cascades. The exp erimen- tal results support the abov e in tuition. How ever, giv en a sufficiently large observ ation windo w, increasing fur- ther the length of the windo w does not increase perfor- mance significantly , as observed in case of the additive mo del with exp onen tial time shaping function. 5.2. Exp erimen ts on real data Exp erimen tal setup. W e trace the flow of informa- tion using memes (Lesk ov ec et al., 2009). Memes are short textual phrases (lik e, “lipstic k on a pig”) that tra vel through a set of blogs and mainstream media w ebsites. W e consider eac h meme m as a separate information cascade c m . Since all documents whic h con tain memes are time-stamp ed, a cascade c m is sim- ply a record of times when sites first mentioned meme m . W e use more than 10 million distinct memes from 3.3 million websites ov er a perio d of 4 mon ths, from Ma y 2011 till August 2011. Our aim is to consider sites that actively spread memes o ver the W eb, so we select the top 5,000 sites in terms of the n umber of memes they mentioned. Moreov er, w e are interested in inferring propagation mo dels related to particular topics or even ts. Therefore, w e consider w e are also given a keyw ord query Q related to the ev ent/topic of interest. When w e infer the parameters of the additive or multiplicativ e mo dels for a giv en query Q , we only consider do cumen ts (and the memes they mention) that include k eywords in Q . T able 2 summarizes the num b er of sites and meme cascades for sev eral topics and real world even ts. Unfortunately , true mo dels (or ground truth) are un- kno wn on real data. Previous net work inference algo- Mo deling Information Propagation with Surviv al Theory 10 -4 10 -3 10 -2 10 -1 10 0 1 10 100 CCDF Cascade size (nodes) Test A1 A2 M1 M2 (a) CS: Arab Spring 10 -4 10 -3 10 -2 10 -1 10 0 1 10 100 CCDF Cascade size (nodes) Test A1 A2 M1 M2 (b) CS: Bailout 10 -4 10 -3 10 -2 10 -1 10 0 1 10 100 CCDF Cascade size (nodes) Test A1 A2 M1 M2 (c) CS: F ukushima 10 -4 10 -3 10 -2 10 -1 10 0 1 10 100 CCDF Cascade size (nodes) Test A1 A2 M1 M2 (d) CS: Gaddafi 10 -4 10 -3 10 -2 10 -1 10 0 1 10 100 CCDF Cascade size (nodes) Test A1 A2 M1 M2 (e) CS: K. Middleton 10 -4 10 -3 10 -2 10 -1 10 0 0 20 40 60 80 100 120 CCDF Cascade duration (days) Test A1 A2 M1 M2 (f ) CD: Arab Spring 10 -4 10 -3 10 -2 10 -1 10 0 0 20 40 60 80 100 120 CCDF Cascade duration (days) Test A1 A2 M1 M2 (g) CD: Bailout 10 -4 10 -3 10 -2 10 -1 10 0 0 20 40 60 80 100 120 CCDF Cascade duration (days) Test A1 A2 M1 M2 (h) CD: F ukushima 10 -4 10 -3 10 -2 10 -1 10 0 0 20 40 60 80 100 120 CCDF Cascade duration (days) Test A1 A2 M1 M2 (i) CD: Gaddafi 10 -4 10 -3 10 -2 10 -1 10 0 0 20 40 60 80 100 120 CCDF Cascade duration (days) Test A1 A2 M1 M2 (j) CD: K. Middleton Figure 3. Cascade size (CS) and cascade duration (CD) distributions for test sets on several topics. W e compare the cascade test set (T est) against syn thetic cascade sets, generated using tw o additive models, one (A1) with γ ( t j ; t i ) = I ( t i > t j ) and other (A2) with γ ( t j ; t i ) = 1 / ( t i − t j ), and tw o multiplicativ e mo dels, one (M1) with α 0 ,i ( t i ) = e b I ( t i > t j ) and another (M2) with α 0 ,i ( t i ) = e b /t i , where we set b = − 3. rithms (Gomez-Ro driguez et al., 2010; 2011; 2013; Du et al., 2012) hav e b een v alidated using explicit h yp er- link cascades. How ever, the use of hyperlinks to refer to the source of information is relativ ely rare (esp e- cially in mainstream media) (Gomez-Ro driguez et al., 2010) and hyperlinks only allow us to test for no des whic h increase the instan taneous risk of infection of other no des, as in the additive risk mo del, but not for no des whic h either increase or decrease it. T o o ver- come this, we instead ev aluate the predictive p o wer of our models. F or each query Q w e create a training and a test set of cascades. The training (test) set contains 80% (20%) of the recorded cascades for the ev en t/topic of in terest; both sets are disjoint and created at ran- dom. W e use the training set to fit the parameters of our additiv e and m ultiplicative mo dels and the test set to ev aluate the mo dels. Cascade size prediction. F or each ev en t/topic of in terest, we ev aluate the predictiv e p o wer of b oth mo- dels, learned using the training set, b y comparing the cascade size distribution of the test set against a syn- thetically generated cascade set using the trained mod- els. W e build the syn thetically generated cascade set b y simulating a set of cascades starting from the true source nodes of the cascades in the test set using the mo del learned from the training set and an observ ation windo w equal to the one of the test set. Figures 3(a-e) sho w the distribution of cascade sizes for the test sets and for the synthetically generated cascade sets using t wo differen t additive mo dels and tw o m ultiplicativ e mo dels. None of the mo dels is a clear winner in terms of similarity with the test sets, but the additive mo del with in verse linear time shaping function (A2) tends to underestimate the cascade size. Surprisingly , the cas- cade size distributions in the syn thetically generated cascade se ts are v ery similar to the empirical distribu- tions, sp ecially up to 10 infected no des p er cascade. Cascade duration prediction. Next w e further ev aluate the predictive p o w er of b oth mo dels, learned using the training sets, by comparing the cascade du- ration distribution of the test sets against syntheti- cally generated cascade sets using the trained mo dels. Figures 3(f-j) show the distribution of the cascade du- ration of the test sets and the synthetically generated cascade sets using the same additiv e and multiplicativ e mo dels. In this case, the p erformance of the mo dels differs more dramatically . The additive model with in- v erse linear time shaping function gets the closest to the empirical distribution of the test set at the cost of underestimating the cascade size. 6. Conclusion Our w ork here con tributes tow ards a general mathe- matical theory of information propagation o ver net- w orks while also providing flexible metho ds. More- o ver, there are also many ven ues for future w ork. In the additive model, external influences that are en- dogenous to the net work (My ers et al., 2012) could be considered by including an extra additive term α i 0 ( t ). In the m ultiplicative model, one could consider non- parametric baselines α i, 0 ( t ) b y fitting the mo del using partial lik eliho o d. Both mo dels could b e extended to include other types of cov ariates s ( t ) and also to con- sider time v arying parameters α j i ( t ) in order to infer dynamic net works (Gomez-Ro driguez et al., 2013). Fi- nally , developing go odness of fit tests would b e useful to c ho ose among mo dels in a more principled manner. Mo deling Information Propagation with Surviv al Theory References Aalen, O.O., Borgan, Ø., and Gjessing, H.K. Survival and event history analysis: a pr o c ess p oint of view . Springer V erlag, 2008. Austin, P .C. Generating surviv al times to simulate co x proportional hazards mo dels with time-v arying co v ariates. Statistics in Me dicine , 2012. Bailey , N. T. J. The Mathematic al The ory of Infe ctious Dise ases and its Applic ations . Hafner Press, 2nd edition, 1975. Bo yd, S.P . and V anden b erghe, L. Convex optimiza- tion . Cambridge Universit y Press, 2004. Devro ye, L. Non-uniform r andom variate gener ation , v olume 4. Springer-V erlag New Y ork, 1986. Du, N., Song, L., Smola, A., and Y uan, M. Learning net works of heterogeneous influence. In NIPS ’12: Neur al Information Pr o c essing Systems , 2012. Erd˝ os, P . and R´ en yi, A. On the ev olution of random graphs. Public ation of the Mathematic al Institute of the Hungarian A c ademy of Scienc e , 5:17–67, 1960. Gomez-Ro driguez, M. and Sc h¨ olkopf, B. Submo dular Inference of Diffusion Netw orks from Multiple T rees. In ICML ’12: Pr o c e e dings of the 29th International Confer enc e on Machine L e arning , 2012. Gomez-Ro driguez, M., Lesk ov ec, J., and Krause, A. Inferring Net works of Diffusion and Influence. In KDD ’10: Pr o c e e dings of the 16th ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , 2010. Gomez-Ro driguez, M., Balduzzi, D., and Sch¨ olk opf, B. Unco vering the T emp oral Dynamics of Diffusion Net works. In ICML ’11: Pr o c e e dings of the 28th In- ternational Confer enc e on Machine L e arning , 2011. Gomez-Ro driguez, M., Lesko vec, J., and Sc h¨ olk opf, B. Structure and Dynamics of Information P athw ays in On-line Media. In WSDM ’13: Pr o c e e dings of the 6th A CM International Confer enc e on Web Se ar ch and Data Mining , 2013. Kemp e, D., Klein b erg, J. M., and T ardos, ´ E. Maximiz- ing the spread of influence through a social net w ork. In KDD ’03: Pr o c e e dings of the 9th ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , 2003. Lesk ov ec, J., Adamic, L. A., and Hub erman, B. A. The dynamics of viral marketing. In EC’ 06: Pr o c e e dings of the eigth International Confer enc e on Ele ctr onic Commer c e , 2006. Lesk ov ec, J., Backstrom, L., and Kleinberg, J. Meme- trac king and the dynamics of the news cycle. In KDD ’09: Pr o c e e dings of the 15th ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , 2009. Lesk ov ec, J., Chakrabarti, D., Klein b erg, J., F alout- sos, C., and Ghahramani, Z. Kroneck er graphs: An approac h to modeling net w orks. Journal of Machine L e arning R ese ar ch , 11:985–1042, 2010. Lib en-No well, David and Klein b erg, Jon. T racing the flo w of information on a global scale using Inter- net chain-letter data. Pr o c e e dings of the National A c ademy of Scienc es , 105(12):4633–4638, 2008. My ers, S. and Lesko v ec, J. On the Conv exity of La- ten t So cial Netw ork Inference. In NIPS ’10: Neur al Information Pr o c essing Systems , 2010. My ers, S., Lesko v ec, J., and Zhu, C. Information Dif- fusion and External Influence in Netw orks. In KDD ’12: Pr o c e e dings of the 18th ACM SIGKDD Interna- tional Confer enc e on Know le dge Disc overy and Data Mining , 2012. Netrapalli, P . and Sanghavi, S. Finding the graph of epidemic cascades. In ACM SIGMET- RICS/Performanc e ’12: Pr o c e e dings of the A CM SIGMETRICS and Performanc e Confer enc e , 2012. Rogers, E. M. Diffusion of Innovations . F ree Press, New Y ork, fourth edition, 1995. Saito, K., Kimura, M., Ohara, K., and Motoda, H. Learning con tin uous-time information diffusion mo del for so cial b eha vioral data analysis. A dvanc es in Machine L e arning , pp. 322–337, 2009. Sno wsill, T.M., Fyson, N., De Bie, T., and Cristianini, N. Refining causalit y: who copied from whom? In KDD ’11: Pr o c e e dings of the 17th ACM SIGKDD International Confer enc e on Know le dge Disc overy and Data Mining , 2011. W ang, L., Ermon, S., and Hopcroft, J. F eature- enhanced probabilistic mo dels for diffusion net work inference. In ECML PKDD ’12: Pr o c e e dings of the Eur op e an Confer enc e on Machine L e arning and Principles and Pr actic e of Know le dge Disc overy in Datab ases , 2012.

원본 논문

고화질 논문을 불러오는 중입니다...

댓글 및 학술 토론

Loading comments...

의견 남기기