HNP3: A Hierarchical Nonparametric Point Process for Modeling Content Diffusion over Social Media

This paper introduces a novel framework for modeling temporal events with complex longitudinal dependency that are generated by dependent sources. This framework takes advantage of multidimensional point processes for modeling time of events. The int…

Authors: Seyed Abbas Hosseini, Ali Khodadadi, Soheil Arabzade

Seyed Abbas Hosseini, Ali Khodadadi, Soheil Arabzade and Hamid R. Rabiee
AICT Innovation Center, Department of Computer Engineering
Sharif University of Technology, Tehran, Iran
Email: {a_hosseini, khodadadi, arabzade}@ce.sharif.edu, rabiee@sharif.edu

Abstract: This paper introduces a novel framework for modeling temporal events with complex longitudinal dependency that are generated by dependent sources. This framework takes advantage of multidimensional point processes for modeling the time of events. The intensity function of the proposed process is a mixture of intensities, and its complexity grows with the complexity of the temporal patterns in the data. Moreover, it utilizes a hierarchical dependent nonparametric approach to model the marks of events. These capabilities allow the proposed model to adapt its temporal and topical complexity according to the complexity of the data, which makes it a suitable candidate for real-world scenarios. An online inference algorithm is also proposed that makes the framework applicable to a vast range of applications. The framework is applied to a real-world application, modeling the diffusion of content over networks. Extensive experiments reveal the effectiveness of the proposed framework in comparison with state-of-the-art methods.

I. INTRODUCTION

A huge amount of information in the form of news, photos, and tweets propagates through social media and networks. Analyzing this information can help us understand the users' interests and their influence on each other. This kind of knowledge helps us understand how applications such as online advertising operate through incentivizing users [1]. Considering the temporal dynamics of the different topics discussed over the networks can immensely help marketers run more effective campaigns [2].
Therefore, there has been a large amount of research on the analysis and modeling of the content shared over social networks, with the aim of extracting users' preferences and the amount of their influence on each other. Users of social networks often share information in one form or another. The content and temporal characteristics of what is shared, as well as the relations among members, are the three main sources that allow us to identify users' interests over time and their influence characteristics.

Modeling the content that is shared on a network over time has many challenges. This content covers a wide range of topics. Each of these topics emerges at some point, becomes popular to some extent, influences some parts of the network, and eventually fades out. However, topics propagate at different rates. Therefore, we need a flexible model that can not only represent the dynamics of topic popularity, but also model the diversity and diffusion rate of topics over time.

There exist many dependent nonparametric models for the diversity and dynamics of the topics in a text stream. Two nonparametric topic-cluster models were introduced in [3], [4] that cluster news based on their topics and concurrently infer the number of clusters. These models have two main drawbacks. First, they only model a single source of information and hence do not consider the impact of different sources on each other. Second, they only consider time as a covariate, while modeling the time of news events can enhance the accuracy of the method in finding topics and also the influence of users on each other. The authors in [5] have recently proposed a nonparametric point process that jointly models both the topic and the time of events. However, this method assumes that the data are generated by a single source and hence is not applicable to analyzing events over a network.

A rich literature exists on modeling information diffusion over networks [6]–[8].
These methods model the time of events using a point process such as the Hawkes process [9] but fall short of considering the content. Some recent methods such as [2], [10] consider the content of the events but assume that the topics are already known. The authors in [11] have recently proposed a method that jointly models the content and the time of events to infer the topic of the events and the influence network. However, this method assumes that the number of topics is bounded and known, and also that the time and topics of events are independent. These assumptions are not valid in social and information networks, where new topics arise over time and the rate of diffusion of different content is heavily dependent on its topic [12].

In this paper, we propose a nonparametric point process that jointly models the topic and time of events generated by the users of a network, and infers the users' influence on each other and their dynamic interests over time in an online manner. In this model, each topic has a specific temporal dynamic which determines its diffusion rate through the network. The model is nonparametric and adapts the number of topics according to the complexity of the data. In summary, we make the following contributions:

• We introduce a nonparametric multidimensional point process that can jointly model the time and topic of events for a set of dependent sources. This model permits topics to be shared among different sources using a hierarchical structure and is able to adapt its complexity according to the complexity of the data.
• Our model provides a dynamic hierarchical clustering over the events, in three levels. In the first level, the events are clustered based on the root event that has triggered them. In the second level, for each user, the root events of each cluster are grouped based on their topic and temporal dynamics. Finally, in the third level, the topics of events are clustered irrespective of their user. This clustering allows us to better understand the interests of users and also the trending of topics over the network.
• We propose an efficient online inference algorithm based on collapsed Sequential Monte Carlo that relies on marginalizing global latent variables to speed up the inference process. The inference algorithm is online, which makes it a suitable choice for real applications with millions of events.
• We conduct several experiments on synthetic and real-world datasets to evaluate the performance of our model. To this end, we collected a dataset consisting of 100,000 news articles published over 3 months by 100 news websites.

The remainder of this paper is organized as follows. In Section II we briefly review the necessary background. Details of the proposed method are discussed in Section III. The proposed inference algorithm is discussed in Section IV. To demonstrate the effectiveness of the proposed model, extensive experimental results are reported and analyzed in Section V. Finally, Section VI concludes this paper and discusses paths for future research.

II. BACKGROUND

We aim to infer the users' interests and their influence on each other by analyzing the content being propagated over the network. To this end, we use dependent nonparametric models and point processes to jointly model the occurrence time and topics of the events. To keep the paper self-contained, in this section we review some necessary background on non-exchangeable nonparametric models and temporal point processes.

A. Dependent Nonparametric Models

A Bayesian nonparametric model is a Bayesian model with an infinite-dimensional parameter space. Dependent nonparametric models extend traditional models to define a probability measure over a set of dependent measures or clusterings, usually indexed by a covariate [13].
For example, the Recurrent Chinese Restaurant Franchise Process (RCRFP) is a dependent nonparametric model for clustering dependent groups of data [4]. This process assumes that the data is categorized into a set of disjoint groups and that the data in each group is exchangeable. However, it is assumed that the groups are indexed by a covariate such as time and are dependent on each other. In this model, the number of clusters is unknown, and hence RCRFP infers the number of clusters in each group and simultaneously clusters them into a set of shared clusters to capture the latent structure of each group. For example, in our problem RCRFP can be used to cluster the set of events of different users. Moreover, since people are interested in a set of common topics, RCRFP shares the clusters among the users. Although this model is a good match for clustering the events over a network, the exchangeability of the events of each user is not a valid assumption in this problem. In Section III, we propose an extended version of RCRFP that also models the dependency among the customers of a restaurant.

B. Temporal Point Processes

Temporal point processes are a set of powerful methods for modeling a list of time-stamped events $(t_1, \ldots, t_n)$. A temporal point process can be completely specified by the distribution of its inter-event times [14]:

$$f(t_1, \ldots, t_n) = \prod_{i=1}^{n} f(t_i \mid t_1, \ldots, t_{i-1}) = \prod_{i=1}^{n} f^*(t_i) \qquad (1)$$

To specify a point process, it suffices to define $f^*(t)$, or equivalently $f(t \mid \mathcal{H}_t)$, where $\mathcal{H}_t$ is the history of events up to time $t$. A more intuitive way to characterize a temporal point process is the conditional intensity function [15], which is defined as:

$$\lambda^*(t) = \frac{f^*(t)}{1 - F^*(t)} \qquad (2)$$

where $F^*(t)$ is the CDF of $f^*(t)$. Different point processes can be determined by specifying appropriate intensity functions.
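As a minimal illustration of Eq. (2), the sketch below evaluates the conditional intensity for an exponential inter-event density, for which the ratio is constant. The rate value and the function names are assumptions made for this example only and do not come from the paper.

```python
import math


def conditional_intensity(f, F, t):
    """Eq. (2): lambda*(t) = f*(t) / (1 - F*(t))."""
    return f(t) / (1.0 - F(t))


# Hypothetical example: exponential inter-event density with an assumed rate.
rate = 2.0


def density(t):
    # f*(t) for exponentially distributed inter-event times
    return rate * math.exp(-rate * t)


def cdf(t):
    # F*(t), the CDF of the density above
    return 1.0 - math.exp(-rate * t)


# The intensity equals the rate at every elapsed time t, up to rounding.
values = [conditional_intensity(density, cdf, t) for t in (0.1, 1.0, 3.0)]
print(values)
```

For other inter-event densities the same ratio generally varies with the time elapsed since the last event.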
For instance, in a homogeneous Poisson process, the intensity is independent of the history and is constant over time, i.e. $\lambda^*(t) = \lambda$ [16].

In order to model the events of multiple dependent sources, multidimensional point processes can be utilized. In a multidimensional point process, the intensity of a dimension depends on the event history of all dimensions. Each event can also be associated with some auxiliary information. This information is known as the mark of an event, and the associated point process is called a marked point process. For example, the topics of tweets propagated through a network can be considered as the marks of events.

Dependent nonparametric models are a set of flexible tools for modeling marks of events that can adapt their complexity according to the data. These models can become a powerful tool for modeling temporal data when combined with point processes. Moreover, these models can become more flexible if the complexity of the intensity function can be adapted to the complexity of the temporal data. In the next section, we describe HNP3, which is a nonparametric multidimensional point process.

III. PROPOSED MODEL

In order to model the propagation of content over a social network, we propose the Hierarchical NonParametric Point Process (HNP3). HNP3 is a framework for modeling the event histories of a group of dependent sources, in which the topics are shared among the sources and the number of topics is unbounded. The main idea of HNP3 is to use a multidimensional point process to model the time of events and a hierarchical nonparametric model to model the marks of events.

Let $D(t) = \{e_i\}_{i=1}^{N(t)}$ denote the set of events observed until time $t$, where the event $e_i$ is a triple $(t_i, u_i, d_i)$ which indicates that at time $t_i$, user $u_i$ shares document $d_i$. Since the members of a network influence each other, the events in a network are mutually exciting, i.e. each event triggers some new events in the network. Hence, the events can be categorized into endogenous and exogenous events. Endogenous events are the responses of users to the actions of their neighbors within the network, and exogenous events are user actions based on external drivers. Let $s_i$ denote the triggering event for event $i$. If the event is exogenous, then $s_i = i$; otherwise $s_i$ is the index of the event that has triggered the $i$th event.

Each event $e_i$ also has a corresponding latent topic $\theta_i$ which is regarded as its mark. We assume that the topic of an endogenous event is the same as the topic of its triggering event. Moreover, we assume that each user $u$ has a distribution $G_u^t$ over the topics at time $t$ that represents the user's interest in different topics, and that the user selects the topic of an exogenous event randomly from this distribution at time $t$. Since the users of a network are usually interested in a set of common topics, we assume that the favorite topics are shared among the users. Let $\{\phi_k\}_{k=1}^{K(t)}$ denote the set of unique topics over the network until time $t$. Each user $u$ is interested in a subset of these topics at any time $t$, which is denoted by $\{\psi_{ui}\}_{i=1}^{K_u(t)}$, where $K_u(t)$ denotes the number of topics that user $u$ is interested in at time $t$. Each topic is a distribution over the words of the dictionary. Every document with topic $\theta$ has the same distribution over the words as the distribution of $\theta$. Moreover, we assume that each topic has a specific temporal dynamic which shows the rate at which the events of that topic diffuse over the network.

As depicted in Fig. 1, we propose a three-level nonparametric model for clustering events according to their topic. In the first level, the Hawkes process clusters the events based on their triggering event.
We use a variation of RCRFP to cluster the exogenous events of each user in the second level and share the topics among all users in the last level.

For clarity, we use the following notation for the remainder of this paper. $D^s_{uk}(t)$ denotes the set of events triggered by event $s$ that are generated by user $u$ until time $t$ with topic $\phi_k$. Let $D^0(t)$ be the set of exogenous events until time $t$. We use dot notation to represent the union over the dotted variable, e.g., $D_{u\cdot}(t)$ represents the events of user $u$ before time $t$ with any topic, and $D_{\bar{u}k}(t)$ represents the events of all users except $u$, before time $t$, with topic $k$. Moreover, let $z_i$ be the index of the topic of the $i$th event among the $\phi_k$s. That is, $\theta_i = \phi_{z_i}$.

A. The Proposed Generative Model

We assume that the times at which user $u$ publishes documents follow a Hawkes process with intensity function:

$$\lambda_u(t) = \mu_u + \sum_{s=1}^{N(t)-1} \lambda_u(t, s) \qquad (3)$$

where $\mu_u$ is the exogenous intensity, which shows the tendency of user $u$ to generate new events, and $N(t)$ represents the number of events until time $t$. $\lambda_u(t, s)$ is the amount of intensity of user $u$ at time $t$ that is caused by event $s$. It is defined as $\lambda_u(t, s) = \alpha_{u_s u}\, \kappa_{z_s}(t, t_s)$, where $\alpha_{u_s u}$ is the influence of the user of event $s$ on $u$, and $\kappa_{z_s}(t, t_s)$ is a kernel function which determines the diffusion rate of events with topic $z_s$. In our case, we use the exponential kernel:

$$\kappa_k(t, t_s) = e^{-\beta_k (t - t_s)} \qquad (4)$$

As mentioned before, we use the topic of the document as the mark of events. Using the aforementioned assumptions, if the event $e_i$ is endogenous and we know the triggering event $s_i$, then the topic of $e_i$ is the same as the topic of $e_{s_i}$, i.e. $\theta_i = \theta_{s_i}$.
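The intensity in Eqs. (3) and (4) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the event representation as `(t_s, u_s, z_s)` tuples, the function name `hawkes_intensity`, and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def hawkes_intensity(u, t, events, mu, alpha, beta):
    """Intensity of user u at time t, following Eqs. (3) and (4).

    events: list of (t_s, u_s, z_s) tuples (time, user index, topic index);
            this representation is an assumption of the sketch.
    mu:     mu[u] is the exogenous intensity of user u.
    alpha:  alpha[v][u] is the influence of user v on user u.
    beta:   beta[k] is the decay rate of the exponential kernel of topic k.
    """
    total = mu[u]
    for t_s, u_s, z_s in events:
        if t_s < t:  # only past events contribute
            total += alpha[u_s][u] * math.exp(-beta[z_s] * (t - t_s))
    return total


# Hypothetical two-user, two-topic example.
mu = [0.1, 0.2]
alpha = [[0.0, 0.5],
         [0.3, 0.0]]
beta = [1.0, 2.0]
events = [(0.0, 0, 0), (1.0, 1, 1)]

lam_user1 = hawkes_intensity(1, 2.0, events, mu, alpha, beta)
print(lam_user1)
```

In this example only the event of user 0 at time 0 excites user 1, so the result is the base rate of user 1 plus one decayed kernel term; a topic with a larger decay rate loses its influence faster.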
If the event is exogenous, the user $u_i$ selects one of his previously used topics $\psi_{uj}$ with probability $\frac{n_{uj}(t)}{n_{u\cdot}(t) + \gamma}$, or selects a new topic with probability $\frac{\gamma}{n_{u\cdot}(t) + \gamma}$:

$$p(\theta_i \mid u_i = u, s_i = i, t_i, z_{1:i-1}) = \sum_{k=1}^{K_u(t)} \frac{n_{uk}(t)}{n_{u\cdot}(t) + \gamma}\, \delta(\psi_{uk}) + \frac{\gamma}{\gamma + n_{u\cdot}(t)}\, \delta(\psi^t_{u,new}) \qquad (5)$$

where $\gamma$ is a parameter which shows the tendency of users to talk about new topics, and $n_{uk}(t)$ is the weighted number of exogenous events of user $u$ with topic $\psi_{uk}$, that is:

$$n_{uk}(t) = \sum_{e \in D^0_u(t)} \exp(-\nu (t - t_e))\, I(\theta_e = \psi_{uk}) \qquad (6)$$

where $\exp(-\nu (t - t_e))$ is a kernel which represents the decaying impact of events over time. In order to share the topics among the users, we use the same idea as RCRFP and assume that users select their new topics from a common discrete distribution which shows the popularity of topics over the network:

$$p(\psi^t_{u,new} \mid \psi_{\cdot\cdot}, \zeta, H) = \sum_{l=1}^{K(t)} \frac{m_l(t)}{\zeta + m_{\cdot}(t)}\, \delta(\phi_l) + \frac{\zeta}{\zeta + m_{\cdot}(t)}\, H \qquad (7)$$

where $\zeta$ controls the probability of drawing a brand-new topic from the base distribution $H$, and $m_k(t)$ shows the popularity of topic $\phi_k$ over the whole network. $m_k(t)$ is the weighted number of times users have selected topic $\phi_k$ as a new topic from Eq. (7), that is:

$$m_k(t) = \sum_{e \in D^0(t)} \exp(-\nu (t - t_e))\, I(\theta_e = \phi_k,\ l_e = 1) \qquad (8)$$

where $l_e = 1$ indicates that the topic of the exogenous event $e$ is a new one for its user and is sampled from Eq. (7). Finally, we draw the content of a document from the distribution of its topic over the words of the dictionary:

$$d_i \mid \phi_{1:K(t)}, z_i \sim \mathrm{Mult}(\phi_{z_i})$$

IV. INFERENCE

We use a two-step iterative algorithm to update our beliefs about the latent variables in an online manner. First, we use collapsed Sequential Monte Carlo (SMC) [17] to estimate the posterior distribution of the local latent variables by marginalizing out all global latent variables except the $\beta_k$s. In the second step, we estimate $\beta_k$ using the learned distributions. Each particle represents a hypothesis about the set of latent variables, and its weight shows our confidence in it.
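To make concrete the generative step that the particles have to reason about, the topic-selection rule of Eqs. (5) and (6) for exogenous events can be sketched as follows. The data layout (a list of `(t_e, topic)` pairs for one user), the `NEW_TOPIC` sentinel, and the function names are assumptions of this sketch, not part of the paper.

```python
import math

NEW_TOPIC = "<new>"  # hypothetical sentinel standing for a not-yet-used topic


def weighted_counts(exo_events, t, nu):
    """Eq. (6): time-decayed counts of one user's past exogenous events per topic.

    exo_events: list of (t_e, topic) pairs for a single user (assumed layout).
    """
    counts = {}
    for t_e, topic in exo_events:
        if t_e < t:
            counts[topic] = counts.get(topic, 0.0) + math.exp(-nu * (t - t_e))
    return counts


def topic_probabilities(exo_events, t, nu, gamma):
    """Eq. (5): probability of reusing each past topic or opening a new one."""
    counts = weighted_counts(exo_events, t, nu)
    total = sum(counts.values()) + gamma
    probs = {topic: n / total for topic, n in counts.items()}
    probs[NEW_TOPIC] = gamma / total
    return probs


# Hypothetical history; with nu = 0 all past events count equally.
history = [(0.0, "a"), (1.0, "a"), (2.0, "b")]
probs = topic_probabilities(history, t=3.0, nu=0.0, gamma=1.0)
print(probs)
```

With a positive `nu`, older events contribute less, so a topic that the user has not touched recently gradually loses probability mass to recent topics and to the new-topic option.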
On observing each new event, each particle is updated by appending a new $(s_{n+1}, z_{n+1})$ to it, and its weight is updated correspondingly. To this end, we need a proposal distribution $q(s_{n+1}, z_{n+1} \mid s_{1:n}, z_{1:n}, \mathcal{H}_t)$ to sample from. In order to minimize the variance of the weights, we use the posterior [18], i.e. $p(s_{n+1}, z_{n+1} \mid s_{1:n}, z_{1:n}, \mathcal{H}_t)$.

[Figure 1: timelines of events for User 1, User 2, and User 3 along a Time axis.]
Fig. 1. The illustration of the HNP3 model: The top-level restaurant represents the popularity of topics over the network. The interest of each user corresponds to a distribution over the topics, which is represented by a restaurant. The popularity of each topic is the weighted sum of the exogenous events generated by the user. Exogenous and endogenous events are represented by circles and squares, respectively. The arrows show the triggering relationship among events.

We assume a Gamma prior over the $\beta_k$s. In order to compute the expected value of the posterior, we draw $M$ samples from the prior and find the mean as follows:

$$E[\beta_k \mid t_{1:N}, z_{1:N}, s_{1:N}] \approx \sum_{m=1}^{M} w_m \beta^{(m)} \qquad (9)$$

where $w_m$ is the weight of the $m$th sample and is proportional to the likelihood $p(t_{1:N}, z_{1:N}, s_{1:N} \mid \beta^{(m)})$. In the next section we show the effectiveness of the proposed inference algorithm through several experiments on synthetic and real data.

V. EXPERIMENTAL RESULTS

In this section, we empirically evaluate the performance of HNP3 using both synthetic and real data. The experiments on synthetic data are used to evaluate the effectiveness of the inference algorithm introduced in Section IV. For the real data, we investigate the performance of the HNP3 model in inferring the hot topics over the network and their corresponding temporal dynamics. Moreover, we evaluate its power in predicting the time of next events and in inferring the influence network.

A. Synthetic Data

In order to evaluate the performance of the proposed inference algorithm, we generated a set of $10^4$ events using the proposed generative model. We used the exponential kernel for all four topics with different $\beta$ parameters. Figure 2(a) shows the performance of HNP3 in estimating the influence matrix $\alpha$ and the exogenous intensity parameters $\mu_u$. As is evident in Figure 2(a), HNP3 does not make a significant improvement over the Hawkes method in the first 1000 events, but after learning the topics and their corresponding kernels, its error decreases considerably. Since the ability of HNP3 to predict the time of future events heavily depends on correctly estimating the topic kernels, we compared HNP3 and the Hawkes process based on the mean likelihood of the time of next events, to confirm the efficiency of the proposed algorithm in learning the kernels. As depicted in Figure 2(b), the likelihood of the time of future events is consistently higher under HNP3 than under the Hawkes process. In order to determine the number of particles in the inference algorithm, we tested the algorithm with different numbers of particles. As depicted in Figure 2(c), the precision of the algorithm in estimating the parameters of the model depends only weakly on the number of particles. Therefore, we used 8 particles in all of our experiments.

B. Real Data

We also evaluated the performance of the proposed method on a real dataset gathered from EventRegistry (footnote 1). For the real data, we first analyze the performance of HNP3 in modeling the content of events. To this end, we try to address the following questions: 1) How well can HNP3 capture different topics? and 2) How well can HNP3 capture the temporal dynamics of topics? We also analyze the performance of HNP3 in predicting the time of next events and compare its performance with two well-known state-of-the-art methods.
1) Dataset Description: Our real dataset consists of articles extracted from EventRegistry, which is an online aggregator of news articles from around the world. We collected news articles containing each of 3 different tags, FIFA, Iran-Sanctions, and Paris-Attack, from 2015/11/01 to 2016/01/13. The collected data contains about 100,000 news articles from 100 different news sites. The sites are treated as nodes and the articles as events. We preprocessed the data, removed some stop-words and irrelevant words, and extracted the bag of words for each article.

Footnote 1: http://eventregistry.org/

[Figure 2: (a) Relative MSE vs. number of events, for Hawkes and HNP3. (b) Log likelihood vs. number of events, for Hawkes and HNP3. (c) MSE vs. number of particles ($2^1$ to $2^6$), for HNP3.]
Fig. 2. The performance of the HNP3 method on synthetic data. Panel (a) shows the relative error in estimating the influence matrix and the exogenous intensity parameters. Panel (b) compares the mean log likelihood of the time of next events in the HNP3 and Hawkes models. Panel (c) shows the error in estimating the influence matrix with different numbers of particles.

2) Results:

Content Analysis. To show the performance of HNP3 in detecting different topics, we depict the most frequent words in the 3 main topics discovered by HNP3. Figures 3a, 3c, and 3e show the word clouds of the most frequent words in the 3 main topics learned by HNP3. As can be seen, HNP3 can detect meaningful clusters which are representative of the true topics and represent the corresponding events. To analyze the temporal dynamics of different topics, we plot the intensity function of each topic against time, which is representative of its popularity over time.
Figures 3b, 3d, and 3f show the intensity functions of the 3 detected topics over time. The results show some interesting patterns that confirm the good performance of HNP3 in capturing the temporal dynamics of popularity for different topics. As can be seen from Fig. 3f, the popularity of the Paris-Attack topic rises suddenly at one point in time. This is reasonable, since we started collecting data two weeks before the Paris-Attack event. Therefore, the intensity of events is zero before the event and rises suddenly once a large number of events are generated after it happens. Since the FIFA topic is discussed all the time, its intensity is evenly distributed over the time axis. The Iran-Sanctions topic has a periodic popularity pattern. Since the negotiations about the Iran sanctions took place periodically, it is expected that its popularity rises just after these negotiations and then fades out. The above results indicate that the performance of HNP3 is acceptable in detecting different topics, capturing their triggering kernels, and capturing their temporal dynamics over time.

[Figure 3: panels (a), (c), and (e) are word clouds; panels (b), (d), and (f) plot Intensity against Time ($\times 10^4$).]
Fig. 3. Three main topics extracted by HNP3 from the EventRegistry dataset. For each topic, we show the word cloud of the most frequent words in the first column. The second column shows the intensity function of each extracted topic, capturing its popularity dynamics, against time.

Prediction. We also compared the performance of HNP3 in predicting the time of next events with the Hawkes and Dirichlet-Hawkes (DH) models. To this end, we trained each model with some events and computed the time likelihood of the next events for each model. Fig. 4 shows the likelihood of the next 100 events for the HNP3, Hawkes, and DH models. As shown in Fig. 4, HNP3 performs better than the Hawkes and DH models. Moreover, it can be seen that the HNP3 and DH models, which utilize the content of events, perform better than the Hawkes model, which ignores the content. We also observe that the HNP3 model, which considers the network effect and the influence of friends, performs better than the DH model, which does not consider the influence of users on each other.

Fig. 4. The performance of different methods in predicting the time of next events for the EventRegistry data.

VI. CONCLUSION

In this paper, we introduced a framework for modeling dependent groups of temporal events with complex longitudinal dependencies. This framework is able to jointly model the time and marks of events and adapt itself to the complexity of the data. The framework also provides a hierarchical clustering of the events by utilizing the dependency between the content and time of events. This clustering may have many applications in different areas. For instance, we used the framework for modeling content diffusion over social media, and the clustering allowed us to infer the source of events and also the hot topics over the network. HNP3 uses multidimensional point processes for modeling the time of events; the intensity function of this process is a mixture of intensities, and its complexity grows with the amount of data. In addition, HNP3 utilizes dependent nonparametric methods for modeling the marks of events. These capabilities allow HNP3 to adapt its temporal and topical complexity according to the complexity of the data, which makes it a suitable candidate for real-world scenarios.
Since the diffusion of content over networks has gained a lot of attention in recent years, we applied HNP3 to this real application and designed an online inference algorithm based on SMC, which can efficiently infer the parameters of the model. Experiments on synthetic data showed the efficiency of our inference algorithm. The experimental results on real data confirmed the superior performance of the proposed method compared to other recent methods in finding different topics and their diffusion rates.

There are many ways to extend this study. For example, we used the Hawkes process for modeling the time of events. One plan to extend this method is to use more complex point processes that are analogous to more complex clustering algorithms such as the hierarchical dd-CRP [19].

REFERENCES

[1] M. Farajtabar, N. Du, M. Gomez-Rodriguez, I. Valera, H. Zha, and L. Song, "Shaping social activity by incentivizing users," in Advances in Neural Information Processing Systems, 2014, pp. 2474–2482.
[2] N. Du, L. Song, H. Woo, and H. Zha, "Uncover topic-sensitive information diffusion networks," in Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013, pp. 229–237.
[3] A. Ahmed, Q. Ho, C. H. Teo, J. Eisenstein, E. P. Xing, and A. J. Smola, "Online inference for the infinite topic-cluster model: Storylines from streaming text," in International Conference on Artificial Intelligence and Statistics, 2011, pp. 101–109.
[4] A. Ahmed and E. P. Xing, "Timeline: A dynamic hierarchical Dirichlet process model for recovering birth/death and evolution of topics in text stream," arXiv preprint, 2012.
[5] N. Du, M. Farajtabar, A. Ahmed, A. J. Smola, and L. Song, "Dirichlet-Hawkes processes with applications to clustering continuous-time document streams," 2015.
[6] T. Iwata, A. Shah, and Z. Ghahramani, "Discovering latent influence in online social activities via shared cascade Poisson processes," in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2013, pp. 266–274.
[7] K. Zhou, H. Zha, and L. Song, "Learning social infectivity in sparse low-rank networks using multi-dimensional Hawkes processes," in Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics (AISTAT'13), 2013, pp. 641–649.
[8] L. Tran, M. Farajtabar, L. Song, and H. Zha, "Netcodec: Community detection from individual activities," in SIAM International Conference on Data Mining (SDM). SIAM, 2015.
[9] T. J. Liniger, "Multivariate Hawkes processes," Ph.D. dissertation, Eidgenössische Technische Hochschule ETH Zürich, Nr. 18403, 2009.
[10] S. H. Yang and H. Zha, "Mixture of mutually exciting processes for viral diffusion," in Proceedings of the 30th International Conference on Machine Learning (ICML'13), 2013, pp. 1–9.
[11] X. He, T. Rekatsinas, J. Foulds, L. Getoor, and Y. Liu, "Hawkestopic: A joint model for network inference and topic modeling from text-based cascades," in Proceedings of the 32nd International Conference on Machine Learning (ICML-15), 2015, pp. 871–880.
[12] N. Du, L. Song, M. Yuan, and A. J. Smola, "Learning networks of heterogeneous influence," in Advances in Neural Information Processing Systems, 2012, pp. 2780–2788.
[13] N. J. Foti, S. Williamson et al., "A survey of non-exchangeable priors for Bayesian nonparametric models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 2, pp. 359–371, 2015.
[14] D. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes, Vol. I. Springer Ser. Statist., Springer, New York, 2002.
[15] O. Aalen, O. Borgan, and H. Gjessing, Survival and Event History Analysis: A Process Point of View. Springer Science & Business Media, 2008.
[16] J. F. C. Kingman, Poisson Processes. Oxford University Press, 1992.
[17] A. Smith, A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice. Springer Science & Business Media, 2013.
[18] A. Ahmed, Q. Ho, C. H. Teo, J. Eisenstein, E. P. Xing, and A. J. Smola, "Online inference for the infinite topic-cluster model: Storylines from streaming text," in International Conference on Artificial Intelligence and Statistics, 2011, pp. 101–109.
[19] S. Ghosh, M. Raptis, L. Sigal, and E. B. Sudderth, "Nonparametric clustering with distance dependent hierarchies," 2014.
