Influential Node Detection in Implicit Social Networks using Multi-task Gaussian Copula Models

Influential node detection is a central research topic in social network analysis. Many existing methods rely on the assumption that the network structure is completely known \textit{a priori}. However, in many applications, network structure is unav…

Authors: Qunwei Li, Bhavya Kailkhura, Jayaraman J. Thiagarajan

Qunwei Li (qli33@syr.edu), Syracuse University
Bhavya Kailkhura (kailkhura1@llnl.gov) ∗, Lawrence Livermore National Labs
Jayaraman J. Thiagarajan (jjayaram@llnl.gov), Lawrence Livermore National Labs
Zhenliang Zhang (zhenliang.zhang@intel.com), Intel Labs
Pramod K. Varshney (varshney@syr.edu), Syracuse University

Editor: Oren Anava, Marco Cuturi, Azadeh Khaleghi, Vitaly Kuznetsov, Alexander Rakhlin

Abstract

Influential node detection is a central research topic in social network analysis. Many existing methods rely on the assumption that the network structure is completely known a priori. However, in many applications, network structure is unavailable to explain the underlying information diffusion phenomenon. To address the challenge of information diffusion analysis with incomplete knowledge of network structure, we develop a multi-task low rank linear influence model. By exploiting the relationships between contagions, our approach can simultaneously predict the volume (i.e., time series prediction) for each contagion (or topic) and automatically identify the most influential nodes for each contagion. The proposed model is validated using synthetic data and an ISIS Twitter dataset. In addition to improving the volume prediction performance significantly, we show that the proposed approach can reliably infer the most influential users for specific contagions.

1. Introduction

Information emerges dynamically and diffuses quickly via agent interactions in complex networks (e.g., social networks) (López-Pintado, 2008). Consequently, understanding and predicting information diffusion mechanisms is challenging.
There is a rapidly growing interest in exploiting knowledge of the information dynamics to better characterize the factors influencing the spread of diseases, planned terrorist attacks, effective social marketing campaigns, etc. (Guille and Hacid, 2012). The broad applicability of this problem in social network analysis has led to focused research on the following questions: (I) Which contagions are the most popular and can diffuse the most? (II) Which members of the network are influential and play important roles in the diffusion process? (III) What is the range over which the contagions can diffuse (Guille et al., 2013)? While attempting to answer these questions, one is confronted with two crucial challenges. First, a descriptive diffusion model, which can mimic the behavior observed in real-world data, is required. Second, efficient learning algorithms are required for inferring the influence structure based on the assumed diffusion model.

∗. This work was supported in part by ARO under Grant W911NF-14-1-0339. This work was performed under the auspices of the U.S. Dept. of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

A variety of information diffusion prediction frameworks have been developed in the literature (Yang and Leskovec, 2010; Wang et al., 2013; Guille et al., 2013; Du et al., 2013; Zhang et al., 2016). A typical assumption in many of these approaches is that a connected network graph and knowledge of the corresponding structure are available a priori. However, in practice, the structure of the network can be implicit or difficult to model; for example, modeling the structure of the spread of an infectious disease is almost impossible. As a result, diffusion prediction models that are unaware of the network structure have gained interest. For example, Yang and Leskovec (2010) proposed a linear influence model (LIM), which can effectively predict the information volume by assuming that each of the contagions spreads with the same influence in an implicit network. Subsequently, Wang et al. (2013) extended LIM by exploiting the sparse structure in the influence function to identify the influential nodes. Though the relationships between multiple contagions can be used for more accurate modeling, most of the existing approaches ignore that information.

In this paper, we address the above issues by augmenting linear influence models with complex task dependency information. More specifically, we consider the dependency of different contagions in the network, and characterize their relationships using copula theory. Furthermore, by imposing a low-rank regularizer, we are able to characterize the clustering structure of the contagions and the nodes in the network. Through this novel formulation, we attempt to both improve the accuracy of the prediction system and better regularize the influence structure learning problem. Finally, we develop an efficient algorithm based on proximal mappings to solve this optimization problem. Experiments with synthetic data reveal that the proposed approach fares significantly better than a state-of-the-art multi-task variant of LIM, both in terms of volume prediction and influence structure estimation performance. In addition, we demonstrate the superiority of the proposed method in predicting the time-varying volume of tweets using the ISIS Twitter dataset$^{1}$.

1. ISIS dataset from Kaggle is available at https://www.kaggle.com/kzaman/how-isis-uses-twitter.

2. Background

In this section, we present the formulation of the linear influence model (LIM) (Yang and Leskovec, 2010) and discuss its limitations. Consider a set of $N$ nodes that participate in an information diffusion process of $K$ different contagions over time. Node $u \in \{1, \dots, N\}$ can be infected by contagion $k \in \{1, \dots, K\}$ at time $t \in \{0, 1, \dots, T\}$. The volume $V_k(t)$ is defined as the total number of nodes that get infected by the contagion $k$ at time $t$. Let the indicator function $M_{u,k}(t) = 1$ represent the event that node $u$ got infected by contagion $k$ at time $t$, and $0$ otherwise. LIM models the volume $V_k(t)$ as a sum of influences of the nodes $u$ that got infected before time $t$:

$$V_k(t+1) = \sum_{u=1}^{N} \sum_{l=0}^{L-1} M_{u,k}(t-l)\, I_u(l+1), \tag{1}$$

where each node $u$ has a particular non-negative influence function $I_u(l)$. One can simply think of $I_u(l)$ as the number of follow-up infections $l$ time units after $u$ got infected. The value of $L$ is set to indicate that the influence of a node drops to $0$ after $L$ time units. Thus, the influence of node $u$ is denoted by the vector $I_u = (I_u(1), \dots, I_u(L))^T \in \mathbb{R}^{L \times 1}$. Next, using the notation $V_k = (V_k(1), \dots, V_k(T))^T \in \mathbb{R}^{T \times 1}$ and $I = (I_1^T, \cdots, I_N^T)^T \in \mathbb{R}^{LN \times 1}$, the inference procedure of LIM can be formulated as follows:

$$\text{minimize} \quad \sum_{k=1}^{K} \| V_k - M_k \cdot I \|_2^2 + \mathbb{1}(I), \tag{2}$$

where $M_k$ is obtained via concatenation of $M_{u,k}$, $\| \cdot \|_2$ denotes the Euclidean norm, and $\mathbb{1}(I)$ is an indicator function that is zero when $I_u(l) \geq 0$ and $+\infty$ otherwise.

Though LIM has been effective in predicting the future volume for each contagion, it assumes that each node has the same influence across all the contagions. Consequently, to achieve contagion-sensitive node selection in an implicit network, the LIM model was extended and the multitask sparse linear influential model (MSLIM) was proposed in (Wang et al., 2013). The influence function is defined by extending $I_u$ in LIM into the contagion-sensitive $I_{u,k} \in \mathbb{R}^{L \times 1}$, which is an $L$-length vector representing the influence of the node $u$ for the contagion $k$.
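To make the LIM recursion in Eq. (1) concrete, the following is a minimal sketch for a single contagion. The function name `lim_predict_volume` and the array layouts (`M` as an N × (T+1) 0/1 array, `I` as an N × L array whose column `l` stores $I_u(l+1)$) are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def lim_predict_volume(M, I, t):
    """Evaluate Eq. (1) for one contagion: V(t+1) = sum_u sum_l M_u(t-l) * I_u(l+1).

    M : (N, T+1) array of 0/1 infection indicators (assumed layout).
    I : (N, L) non-negative array, I[u, l] holds I_u(l+1) (assumed layout).
    t : current time index; the return value is the predicted volume at t+1.
    """
    N, L = I.shape
    total = 0.0
    for l in range(L):
        s = t - l
        if s < 0:
            # No infections are recorded before time 0.
            break
        # Nodes infected at time s contribute their lag-(l+1) influence.
        total += M[:, s] @ I[:, l]
    return float(total)

# Hypothetical toy data: node 0 infected at time 0, node 1 infected at time 1.
M_toy = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]])
I_toy = np.array([[1.0, 0.5],
                  [2.0, 0.0]])
v_next = lim_predict_volume(M_toy, I_toy, t=1)
```

In this toy case the prediction at time 2 combines the lag-1 influence of node 1 and the lag-2 influence of node 0.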
For each contagion $k$, let $I_k \in \mathbb{R}^{LN \times 1}$ be the vector obtained by concatenating $I_{1k}, \dots, I_{Nk}$. For each node $u$, the influence matrix is defined as $I_u = (I_{u1}, \dots, I_{uK}) \in \mathbb{R}^{L \times K}$. Using this notation, the inference procedure to estimate $I_{u,k}$ was formulated as follows:

$$\text{minimize} \quad \frac{1}{2} \sum_{k=1}^{K} \| V_k - M_k \cdot I_k \|_2^2 + \lambda \sum_{u=1}^{N} \| I_u \|_F + \gamma \sum_{u=1}^{N} \sum_{k=1}^{K} \| I_{uk} \|_2 + \mathbb{1}(I), \tag{3}$$

where $\| \cdot \|_F$ denotes the Frobenius norm. The penalty term $\| I_u \|_F$ was used to encourage the entire matrix $I_u$ to be zero altogether, which means that the node $u$ is non-influential for all the contagions. If the estimated $\| I_u \|_F > 0$ (i.e., the matrix $I_u$ is non-zero), a fine-grained selection is performed by the penalty $\sum_{u=1}^{N} \sum_{k=1}^{K} \| I_{uk} \|_2$, which is essentially a group-Lasso penalty and can encourage the sparsity of the vectors $\{ I_{uk} \}$. For a specific contagion $k$, one can identify the most influential nodes by finding the optimal solution $\{ \hat{I}_{uk} \}$ of (3).

However, the penalty terms used in MSLIM encourage certain nodes to have no influence on any of the contagions, which may not be true in practice. Furthermore, for most real-world applications, there exist complex dependencies among the contagions. In order to alleviate these shortcomings, we propose a novel probabilistic multi-task learning framework and develop efficient optimization strategies.

3. Proposed Approach

Probabilistic Multi-Contagion Modeling of Diffusion: We assume a linear regression model for each task: $V_k = M_k I_k + n_k$, where $V_k$, $M_k$ and $I_k$ are defined as before, and $n_k \in \mathbb{R}^{T \times 1}$ is an i.i.d. zero-mean Gaussian noise vector with the covariance matrix $\Sigma_k$. The distribution of $V_k$ given $M_k$, $I_k$ and $\Sigma_k$ can be expressed as

$$V_k \mid M_k, I_k, \Sigma_k \sim \mathcal{N}(M_k I_k, \Sigma_k) = \frac{\exp\left( -\frac{1}{2} (V_k - M_k I_k)^T \Sigma_k^{-1} (V_k - M_k I_k) \right)}{(2\pi)^{T/2} |\Sigma_k|^{1/2}}. \tag{4}$$

Assuming that the influence for a single contagion is also Gaussian distributed, we can express the marginal distributions as $I_k \mid m_k, \Theta_k \sim \mathcal{N}(m_k, \Theta_k)$, where $m_k \in \mathbb{R}^{LN \times 1}$ is the mean vector and can be expressed as $m_k = [\mathbf{m}_{1,k}^T, \dots, \mathbf{m}_{N,k}^T]^T$, and $\Theta_k \in \mathbb{R}^{LN \times LN}$ is the covariance matrix of $I_k$. For a node $u$ and contagion $k$, we assume that the variables in the influence $I_{uk}$ have the same mean, i.e., $\mathbf{m}_{u,k} = m_{u,k} \mathbf{1}_{L \times 1}$, where $m_{u,k}$ is a scalar and $\mathbf{1}_{L \times 1}$ is a vector of all ones with dimension $L \times 1$. Let $m_0 \in \mathbb{R}^{N \times K}$ represent the mean matrix with entries $m_{u,k}$; it is connected to the full mean matrix as $m = (m_1, \dots, m_K) = Q m_0$, where $Q = I_{N \times N} \otimes \mathbf{1}_{L \times 1} \in \mathbb{R}^{LN \times N}$, $I_{N \times N}$ is the identity matrix with dimension $N \times N$, and $\otimes$ is the Kronecker product operator.

3.1 Dependence Structure Modeling Using Copulas

Consider a general case where the contagions are correlated. We construct a new influence matrix $I = (I_1, \dots, I_K) \in \mathbb{R}^{LN \times K}$. In our formulation, the $I_k$'s are assumed to be correlated, and the joint distribution of $I$ is not a simple product of all the marginal distributions of $I_k$, as is adopted by most multi-task learning formulations. Here, we propose to use a multi-task copula that is obtained by tailoring the copula model for the multi-task learning problem.

Theorem 1 (Sklar's Theorem). Consider an $N$-dimensional distribution function $F$ with marginal distribution functions $F_1, \dots, F_N$. Then there exists a copula $C$, such that for all $x_1, \dots, x_N$ in $[-\infty, \infty]$, $F(x_1, \dots, x_N) = C(F_1(x_1), \dots, F_N(x_N))$. If $F_n$ is continuous for $1 \leq n \leq N$, then $C$ is unique; otherwise it is determined uniquely on $\mathrm{Ran}F_1 \times \dots \times \mathrm{Ran}F_N$, where $\mathrm{Ran}F_n$ is the range of $F_n$. Conversely, given a copula $C$ and univariate CDFs $F_1, \dots, F_N$, $F$ is a valid multivariate CDF with marginals $F_1, \dots, F_N$.
As a direct consequence of Sklar's Theorem, for continuous distributions, the joint probability density function (PDF) $f(x_1, \dots, x_N)$ is obtained by

$$f(x_1, \dots, x_N) = \left( \prod_{n=1}^{N} f_n(x_n) \right) c(F_1(x_1), \dots, F_N(x_N)), \tag{5}$$

where $f_n(\cdot)$ is the marginal PDF and $c$ is termed the copula density, given by

$$c(v) = \frac{\partial^N C(v_1, \dots, v_N)}{\partial v_1 \cdots \partial v_N}, \tag{6}$$

where $v_n = F_n(x_n)$. We extend the copula theory to multi-task learning and express the joint distribution of $I$ as follows:

$$p(I_1, I_2, \dots, I_K) = \left( \prod_{k=1}^{K} \mathcal{N}(m_k, \Theta_k) \right) c(F_1(I_1), F_2(I_2), \dots, F_K(I_K)), \tag{7}$$

where $F_k(I_k)$ is the CDF of the influence for the $k$th contagion. The copula density function $c(\cdot)$ takes all marginal CDFs $\{ F_k(I_k) \}_{k=1}^{K}$ as its arguments, and maintains the output correlations in a parametric form.

Gaussian copula: There are a finite number of well-defined copula families that can characterize several dependence structures. Although one could investigate the choice of an appropriate copula, we consider the Gaussian copula for its favorable analytical properties. A Gaussian copula can be constructed from the multivariate Gaussian CDF, and the resulting prior on $I$ is given by a matrix-variate Gaussian distribution as

$$I \sim \mathcal{MN}_{LN \times K}(m, U, \Omega) = \frac{\exp\left( -\frac{1}{2} \mathrm{tr}\left[ U^{-1} (I - m) \Omega^{-1} (I - m)^T \right] \right)}{(2\pi)^{LNK/2} |\Omega|^{LN/2} |U|^{K/2}}, \tag{8}$$

where $U \in \mathbb{R}^{LN \times LN}$ is the row covariance matrix modeling the correlation between the influence of different nodes, $\Omega \in \mathbb{R}^{K \times K}$ is the column covariance matrix modeling the correlation between the influence for different contagions, and $m \in \mathbb{R}^{LN \times K}$ is the mean matrix of $I$. The two covariances can be computed as $E\left[ (I - m)(I - m)^T \right] = U\, \mathrm{tr}(\Omega)$ and $E\left[ (I - m)^T (I - m) \right] = \Omega\, \mathrm{tr}(U)$, respectively.
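As a sanity check on Eq. (8), the sketch below evaluates the matrix-variate Gaussian log-density directly and compares it with the equivalent multivariate Gaussian on the column-stacked vector $\mathrm{vec}(I)$, whose covariance is $\Omega \otimes U$ (a standard identity for this distribution). The function names and the small random test matrices are illustrative assumptions.

```python
import numpy as np

def matrix_normal_logpdf(X, m, U, Omega):
    """Log of Eq. (8) for X of shape (n, K), row covariance U, column covariance Omega."""
    n, K = X.shape
    D = X - m
    # tr[U^{-1} D Omega^{-1} D^T], computed with linear solves instead of inverses.
    quad = np.trace(np.linalg.solve(U, D) @ np.linalg.solve(Omega, D.T))
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_O = np.linalg.slogdet(Omega)
    return -0.5 * (quad + n * K * np.log(2 * np.pi) + n * logdet_O + K * logdet_U)

def vec_gaussian_logpdf(X, m, U, Omega):
    """Same density written as vec(X) ~ N(vec(m), Omega kron U), column-major vec."""
    d = (X - m).flatten(order="F")
    S = np.kron(Omega, U)
    quad = d @ np.linalg.solve(S, d)
    _, logdet_S = np.linalg.slogdet(S)
    return -0.5 * (quad + d.size * np.log(2 * np.pi) + logdet_S)

# Hypothetical small example with random symmetric positive definite covariances.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
U_toy = A @ A.T + 4 * np.eye(4)
B = rng.standard_normal((3, 3))
Omega_toy = B @ B.T + 3 * np.eye(3)
X_toy = rng.standard_normal((4, 3))
m_toy = rng.standard_normal((4, 3))
lp_matrix = matrix_normal_logpdf(X_toy, m_toy, U_toy, Omega_toy)
lp_vec = vec_gaussian_logpdf(X_toy, m_toy, U_toy, Omega_toy)
```

The two values should agree up to floating-point error, which illustrates how $U$ and $\Omega$ jointly encode the node-wise and contagion-wise correlations.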
We assume that the $N$ individual nodes are spreading the contagions and influencing others independently, and thus the row covariance matrix is diagonal and can be expressed as $U = \mathrm{diag}(e_1^2, e_2^2, \dots, e_N^2) \otimes I_{L \times L}$, where $e_n^2$, $n \in \{1, \dots, N\}$, are scalars. The posterior distribution for $I$, which is proportional to the product of the likelihood function in Eq. 4 and the prior in Eq. 8, is given as

$$p(I \mid M, V, \Sigma, U, \Omega) \propto p(V \mid M, I, \Sigma)\, p(I \mid m, U, \Omega) = \left( \prod_{k=1}^{K} \mathcal{N}(M_k I_k, \Sigma_k) \right) \mathcal{MN}_{LN \times K}(I \mid m, U, \Omega), \tag{9}$$

where $M = (M_1, \dots, M_K) \in \mathbb{R}^{T \times LNK}$, $V = (V_1, \dots, V_K) \in \mathbb{R}^{T \times K}$, and $\Sigma$ is the corresponding covariance matrix of $n = (n_1, \dots, n_K) \in \mathbb{R}^{T \times K}$. We assume $\Sigma_k \triangleq \sigma^2 I_{T \times T}$ for all $k = 1, \dots, K$, and an identical value $e_n^2 = e^2$ for all $n = 1, \dots, N$. We employ maximum a posteriori (MAP) and maximum likelihood estimation (MLE), and obtain $I$, $m$, and $\Omega$ by

$$\min_{I, m, \Omega} \quad \frac{1}{\sigma^2} \sum_{k=1}^{K} \| V_k - M_k I_k \|_2^2 + \frac{1}{e^2} \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right) + LN \ln |\Omega| + \mathbb{1}(I).$$

However, if we assume $\Omega^{-1}$ to be non-sparse, the solution for $\Omega^{-1}$ will not be defined (when $K > LN$) or will overfit (when $K$ is of the same order as $LN$) (Rai et al., 2012). In fact, some contagions in the network can be uncorrelated, which makes the corresponding entries of $\Omega^{-1}$ zero. Hence, we add an $\ell_1$ penalty to promote sparsity of the matrix $\Omega^{-1}$ to obtain

$$\min_{I, m, \Omega} \quad \sum_{k=1}^{K} \| V_k - M_k I_k \|_2^2 + \lambda_1 \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right) - \lambda_2 \ln |\Omega^{-1}| + \lambda_3 \| \Omega \|_1 + \mathbb{1}(I).$$

3.2 Modeling Structure of Influence Matrix $I$

In order to better characterize the influence matrix, we propose to impose a low-rank structure on the influence matrix $I$. The nodes or the contagions in the influence network are known to form communities (or clustering structures), which may be captured using the low-rank property of the influence matrix.
Note that the sparse structure in the influence matrix implies that most individuals only influence a small fraction of the contagions in the network, while there can be a few nodes with wide-spread influence. We incorporate this into our formulation by using a sparsity-promoting regularizer over $I_{u,k}$:

$$\min_{I, m, \Omega} \quad \sum_{k=1}^{K} \| V_k - M_k I_k \|_2^2 + \lambda_1 \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right) - \lambda_2 \ln |\Omega^{-1}| + \lambda_3 \| \Omega \|_1 + \lambda_4 \| I \|_* + \lambda_5 \sum_{u=1}^{N} \sum_{k=1}^{K} \| I_{uk} \|_2 + \mathbb{1}(I), \tag{10}$$

where $\| \cdot \|_*$ denotes the nuclear norm, and $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$ and $\lambda_5$ are the regularization parameters. With the estimated $\{ \hat{I}_{uk} \}$, one can predict the total volume of the contagion $k$ at $T + 1$ by $\hat{V}_k(T+1) = \sum_{u=1}^{N} \sum_{l=0}^{L-1} M_{uk}(T - l)\, \hat{I}_{uk}(l+1)$.

4. Algorithm

We adopt an alternating optimization approach to solve the problem in Eq. 10.

Optimization w.r.t. $m$: Given $I$ and $\Omega^{-1}$, the mean matrix $m$ can be obtained by solving the following problem:

$$\min_{m} \quad \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right).$$

The estimate $\hat{m}$ can be obtained analytically as $\hat{m} = \frac{1}{L} Q Q^T I$.

Optimization w.r.t. $\Omega$: Given $I$ and $m$, the contagion inverse covariance matrix $\Omega^{-1}$ can be estimated by solving the following optimization problem:

$$\min_{\Omega} \quad \lambda_1 \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right) - \lambda_2 \ln |\Omega^{-1}| + \lambda_3 \| \Omega \|_1.$$

The above is an instance of the standard inverse covariance estimation problem with sample covariance $\frac{\lambda_1}{\lambda_2} (I - m)^T (I - m)$, which can be solved using standard tools. In particular, we use the graphical Lasso procedure in (Friedman et al., 2008):

$$\hat{\Omega}^{-1} = \mathrm{gLasso}\left( \frac{\lambda_1}{\lambda_2} (I - m)^T (I - m),\ \lambda_3 \right). \tag{11}$$

Optimization w.r.t. $I$: The corresponding optimization problem becomes

$$\min_{I} \quad \sum_{k=1}^{K} \| V_k - M_k I_k \|_2^2 + \lambda_1 \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right) + \lambda_4 \| I \|_* + \lambda_5 \sum_{u=1}^{N} \sum_{k=1}^{K} \| I_{uk} \|_2 + \mathbb{1}(I).$$

We rewrite the problem as

$$\min_{I} \quad \ell(I) + \lambda_4 \| I \|_* + \mathbb{1}(I), \tag{12}$$

where $\ell(I) = \sum_{k=1}^{K} \| V_k - M_k I_k \|_2^2 + \lambda_1 \mathrm{tr}\left( (I - m) \Omega^{-1} (I - m)^T \right) + \lambda_5 \sum_{u=1}^{N} \sum_{k=1}^{K} \| I_{uk} \|_2$.

Algorithm 1 Incremental Proximal Descent
    Initialize $I = A$
    repeat
        Set $I = I - \theta \nabla_I \ell(I)$
        Set $I = \mathrm{prox}_{\theta \lambda_4 \| \cdot \|_*}(I)$
        Set $I = P_{\mathbb{1}}(I)$
    until convergence
    return $I$

This formulation involves a sum of a convex differentiable loss and convex non-differentiable regularizers, which renders the problem non-trivial. A number of algorithms have been developed for the case where the optimal solution is easy to compute when each regularizer is considered in isolation. This corresponds to the case where the proximal operator, defined for a convex regularizer $R : \mathbb{R}^{LN \times K} \to \mathbb{R}$ at a point $Z$ by

$$\mathrm{prox}_R(Z) = \arg\min_{I} \ \frac{1}{2} \| I - Z \|_F^2 + R(I),$$

is easy to compute for each regularizer taken separately. See (Combettes and Pesquet, 2011) for a broad overview of proximal methods. The proximal operator for the nuclear norm is given by the shrinkage operation as follows (Beck and Teboulle, 2009). If $U \mathrm{diag}(\sigma_1, \dots, \sigma_n) V^T$ is the singular value decomposition of $Z$, then

$$\mathrm{prox}_{\lambda_4 \| \cdot \|_*}(Z) = U \mathrm{diag}\left( (\sigma_i - \lambda_4)_+ \right)_i V^T.$$

The proximal operator of the indicator function $\mathbb{1}(I)$ is simply the projection onto $I_{u,k}(l) \geq 0$, which is denoted by $P_{\mathbb{1}}(I)$. Next, we mention a matching serial algorithm introduced in (Bertsekas, 2011). We present here a version where the updates are performed according to a cyclic order (Richard et al., 2012). Note that one can also randomly select the order of the updates. We use Algorithm 1 to solve the optimization problem in Eq. 12.

5. Experiments

We compare the performance of the proposed approach to MSLIM by applying it to both synthetic and real datasets.
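As a reference for the optimization steps of Section 4 that underlie these experiments, the closed-form mean update $\hat{m} = \frac{1}{L} Q Q^T I$ and the two proximal maps used in Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the gradient step on $\ell(I)$ and the graphical Lasso update are omitted.

```python
import numpy as np

def update_mean(I, L):
    """m_hat = (1/L) Q Q^T I with Q = I_N kron 1_L.

    Each block of L consecutive rows (one node) is replaced by its column-wise average.
    """
    LN, K = I.shape
    N = LN // L
    Q = np.kron(np.eye(N), np.ones((L, 1)))
    return Q @ (Q.T @ I) / L

def prox_nuclear(Z, tau):
    """Singular value soft-thresholding: U diag((sigma_i - tau)_+) V^T."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def project_nonneg(Z):
    """Projection onto the constraint set I_{u,k}(l) >= 0."""
    return np.maximum(Z, 0.0)

# Hypothetical toy input: N = 2 nodes, L = 2 lags, K = 2 contagions.
I_toy = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0],
                  [7.0, 8.0]])
m_hat = update_mean(I_toy, L=2)
shrunk = prox_nuclear(np.diag([3.0, 1.0]), tau=2.0)
clipped = project_nonneg(np.array([-1.0, 2.0]))
```

In Algorithm 1, `prox_nuclear` would be called with threshold $\theta \lambda_4$ after each gradient step, followed by `project_nonneg`.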
Since the volume of a contagion over time, $V_k(t)$, can be viewed as a time series, we set up this problem as a time series prediction task and evaluate the performance using the prediction mean-squared error (MSE). Furthermore, for the synthetic dataset, where we have access to the true influence matrix $I$, we also evaluate the performance of the influence matrix prediction task using the metric $\| \hat{I} - I \|_F$. We determined the regularization parameters for the proposed model using cross validation. In particular, we use the first 60% of the time instances as the training set and the rest for validation. Following (Wang et al., 2013), we combine the training and validation sets to re-train the model with the best selected regularization parameters and estimate the influence matrix.

5.1 Synthetic Data

We created a synthetic dataset with the number of nodes fixed at $N = 100$ and the number of contagions at $K = 20$. In addition, we assumed that $L = 10$ and $T = 20$. A rank-5 (low-rank) influence matrix $I$ was generated randomly with uniformly distributed entries. The matrix $M$ was generated with uniformly distributed random integers in $\{0, 1\}$. Following our model assumption, the volume for each contagion was calculated as $V_k = M_k \times I_k + \mathcal{N}(0, \Delta)$, where $\mathcal{N}(0, \Delta)$ is a multivariate normal distribution with covariance matrix $\Delta$. In Table 1, we present the results obtained using the proposed approach and its comparison to MSLIM. As can be observed, the proposed approach achieves lower error than MSLIM on both the volume prediction and the influence matrix estimation tasks.

Table 1: Prediction performance for different information diffusion models on synthetic data.

Approach                             MSLIM     Proposed
Volume Prediction MSE                0.834     0.007
Influence Matrix Estimation Error    0.7681    0.62

5.2 ISIS Twitter Data

In this section, we demonstrate the application of the proposed approach to a real-world analysis task.
We begin by describing the Twitter dataset used for analysis and the procedure adopted to extract the set of contagions. Following this, we discuss the problem setup and present comparisons to MSLIM on predicting the time-varying tweet volume. Finally, we present a qualitative analysis of the inferred influence structure for different contagions.

The ISIS dataset from Kaggle$^{2}$ is comprised of over 17,000 tweets from 112 users posted between January 2015 and May 2016. In addition to the actual tweets, meta-information such as the user name and the timestamp for each tweet is included. We performed standard pre-processing by removing a variety of stop words, e.g., URLs and symbols. After preprocessing, we converted each tweet into a bag-of-words representation and extracted the term frequency-inverse document frequency (tf-idf) feature.

Topic Modeling: When applying our approach, the first step is to define semantically meaningful contagions. A simple way of defining topics is to directly use words as topics (e.g., ISIS). However, a single word may not be rich enough to represent a broad topic (e.g., social network sites). Hence, we propose to perform topic modeling on the tweets based on the tf-idf features. In our experiment, we obtained the topics using Non-negative Matrix Factorization (NMF), which is a popular scheme for topic discovery, with the number of topics $K$ set at 10. Table 2 lists the top 10 words for each of the topics learned using NMF.

Volume Time Series Prediction: In our experiment, we set one day as the discrete time step for aggregating the tweet volume. The parameter $L$ denotes the number of time steps it takes for the influence of a user to decay to zero. We set the parameter $L$ equal to 5 since we observed that beyond $L = 5$, there is hardly any improvement in performance. The MSE on the predicted volume is computed over the entire period of observation.
The comparison of the prediction MSE is presented in Table 3. It can be seen that the proposed approach significantly outperforms MSLIM in predicting the time-varying volume.

2. ISIS dataset from Kaggle is available at https://www.kaggle.com/kzaman/how-isis-uses-twitter.

Table 2: Top words for each topic learned using NMF with the ISIS Twitter dataset.

Topic 1     isis, ramiallolah, iraq, attack, libya, warreporter1, saa, aamaq, usa, abu
Topic 2     killed, soldiers, today, airstrikes, injured, wounded, civilians, militants, iraqi, attack
Topic 3     syria, russia, ramiallolah, turkey, ypg, breakingnews, usa, group, saa, terror
Topic 4     state, islamic, fighters, fighting, group, saudi, new, http, wilaya, control
Topic 5     aleppo, nid, gazaui, rebels, north, today, northern, syrian, ypg, turkish
Topic 6     assad, regime, myra, forces, rebels, fsa, pro, islam, syrian, jaysh
Topic 7     al, qaeda, nusra, abu, sham, ahrar, islam, jabhat, http, warreporter1
Topic 8     army, iraq, near, ramiallolah, iraqi, lujah, turkey, ramadi, west, sinai
Topic 9     allah, people, muslims, abu, accept, muslim, make, know, don, islam
Topic 10    breaking, islamicstate, forces, amaqagency, city, fighters, iraqi, near, area, syrian

Figure 1: Comparing statistics from the estimated influence matrix with the volume of tweets corresponding to each of the users to identify influential users: (a) average influence, (b) maximum influence. In both cases, the users with a large influence score are marked in red.

Influential Node Detection: For a contagion $k$, we identify the most influential nodes with respect to this contagion as the nodes having high $\| I_{u,k} \|_2$ values. First, in Figure 2(a), we plot the correlation among the 10 topics learned by NMF. More specifically, we plot the pair-wise correlation structure learned by our approach. It can be seen that a strong positive correlation structure exists, which enabled the improved prediction in Table 3.
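The per-contagion ranking by $\| I_{u,k} \|_2$ described above can be sketched as follows. The sketch assumes the estimated influence matrix is stored as an $LN \times K$ array in which rows $uL, \dots, (u+1)L - 1$ belong to node $u$ (matching the concatenation of $I_{1k}, \dots, I_{Nk}$); the function names and the toy values are hypothetical.

```python
import numpy as np

def influence_scores(I, N, L):
    """Return S of shape (N, K) with S[u, k] = ||I_{u,k}||_2."""
    K = I.shape[1]
    # Reshape so that axis 1 indexes the L lags of each node.
    return np.linalg.norm(I.reshape(N, L, K), axis=1)

def top_nodes(I, N, L, k, n_top=3):
    """Indices of the n_top nodes with the largest score for contagion k."""
    scores = influence_scores(I, N, L)[:, k]
    return np.argsort(-scores)[:n_top]

# Hypothetical toy matrix: N = 3 nodes, L = 2 lags, K = 1 contagion.
I_toy = np.array([[3.0], [4.0], [0.0], [1.0], [6.0], [8.0]])
scores_toy = influence_scores(I_toy, N=3, L=2)
ranking = top_nodes(I_toy, N=3, L=2, k=0, n_top=2)
```

Averaging or maximizing the rows of the score matrix over contagions would give per-user summary scores of the kind plotted in Figure 1.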
Following this, we use the predicted influence matrix to select a set of highly influential nodes from the dataset. A simple approach to select the influential users can be to select the ones with a large number of tweets. However, we argue that the influence predicted in an information diffusion model can be vastly different. Consequently, we consider a user to be influential if she has a high influence score for at least one of the topics, or if she can be influential for multiple topics. For example, in Figure 1(a), we plot the average influence scores of the users (averaged over all the topics) against the total number of tweets. Similarly, in Figure 1(b), we plot the influence scores of the users (maximum over all the topics) against the total number of tweets. The first striking observation is that the users with high influence scores are not necessarily the ones with the largest number of tweets. Instead, their impact on the information diffusion relies heavily on the complex dynamics of the implicit network. Finally, in Figure 2(b), we plot the percentage of tweets regarding each of the topics for the top 9 influential nodes. Influential nodes are obtained as a union of the nodes identified based on both the average and the maximum influence scores. More specifically, we select the union of users with average influence score greater than 1.3 and maximum influence score greater than 1.8. In addition to displaying the distribution across topics, for each influential user, we show the total number of tweets posted by that user.

Figure 2: (a) Correlation structure among the topics (non-black color represents positive correlation), (b) Top 9 influential users and their tweet distributions.

Table 3: Volume prediction performance on the ISIS Twitter dataset.

Approach                 MSLIM    Proposed
Volume Prediction MSE    2.7      0.329
It can be seen that the total number of tweets of these users varies considerably and is therefore not a good indication of their influence.

6. Conclusion

In this paper, we considered the problem of influential node detection and volume time series prediction. We proposed a descriptive diffusion model to take dependencies among the topics into account. We also proposed an efficient algorithm based on alternating methods to perform inference and learning on the model. It was shown that the proposed technique outperforms existing influential node detection techniques. Furthermore, the proposed model was validated both on a synthetic and a real (ISIS) dataset. We showed that the proposed approach can efficiently select the most influential users for specific contagions. We also presented several interesting patterns of the selected influential users for the ISIS dataset.

References

Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.

Dimitri P. Bertsekas. Incremental gradient, subgradient, and proximal methods for convex optimization: A survey. Optimization for Machine Learning, 2010:1–38, 2011.

Patrick L. Combettes and Jean-Christophe Pesquet. Proximal splitting methods in signal processing. In Fixed-point algorithms for inverse problems in science and engineering, pages 185–212. Springer, 2011.

Nan Du, Le Song, Manuel Gomez-Rodriguez, and Hongyuan Zha. Scalable influence estimation in continuous-time diffusion networks. In Advances in Neural Information Processing Systems, pages 3147–3155, 2013.

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441, 2008.

Adrien Guille and Hakim Hacid. A predictive model for the temporal dynamics of information diffusion in online social networks. In Proceedings of the 21st International Conference on World Wide Web, pages 1145–1152. ACM, 2012.

Adrien Guille, Hakim Hacid, Cecile Favre, and Djamel A. Zighed. Information diffusion in online social networks: A survey. ACM SIGMOD Record, 42(2):17–28, 2013.

Dunia López-Pintado. Diffusion in complex social networks. Games and Economic Behavior, 62(2):573–590, 2008.

Piyush Rai, Abhishek Kumar, and Hal Daume. Simultaneously leveraging output and task structures for multiple-output regression. In Advances in Neural Information Processing Systems (NIPS), pages 3185–3193, 2012.

Emile Richard, Pierre-andre Savalle, and Nicolas Vayatis. Estimation of simultaneously sparse and low rank matrices. In Proceedings of the 29th International Conference on Machine Learning (ICML), pages 1351–1358, 2012.

Yingze Wang, Guang Xiang, and Shi-Kuo Chang. Sparse multi-task learning for detecting influential nodes in an implicit diffusion network. In AAAI, 2013.

Jaewon Yang and Jure Leskovec. Modeling information diffusion in implicit networks. In 2010 IEEE International Conference on Data Mining, pages 599–608. IEEE, 2010.

Peng Zhang, Jing He, Guodong Long, Guangyan Huang, and Chengqi Zhang. Towards anomalous diffusion sources detection in a large network. ACM Transactions on Internet Technology (TOIT), 16(1):2, 2016.
