Modeling Temporal Activity Patterns in Dynamic Social Networks
The focus of this work is on developing probabilistic models for user activity in social networks by incorporating the social network influence as perceived by the user. For this, we propose a coupled Hidden Markov Model, where each user's activity e…
Authors: Vasanthan Raghavan, Greg Ver Steeg, Aram Galstyan
Vasanthan Raghavan∗¹, Greg Ver Steeg², Aram Galstyan², and Alexander G. Tartakovsky¹

¹ Department of Mathematics, University of Southern California, Los Angeles, 90089, CA, USA
² Information Sciences Institute, University of Southern California, Marina del Rey, 90292, CA, USA

Email: Vasanthan Raghavan∗ - vasanthan_raghavan@ieee.org; Greg Ver Steeg - gregv@isi.edu; Aram Galstyan - galstyan@isi.edu; Alexander G. Tartakovsky - tartakov@usc.edu

∗ Corresponding author

Abstract

The focus of this work is on developing probabilistic models for user activity in social networks by incorporating the social network influence as perceived by the user. For this, we propose a coupled Hidden Markov Model, where each user's activity evolves according to a Markov chain with a hidden state that is influenced by the collective activity of the friends of the user. We develop generalized Baum-Welch and Viterbi algorithms for model parameter learning and state estimation for the proposed framework. We then validate the proposed model using a significant corpus of user activity on Twitter. Our numerical studies show that with sufficient observations to ensure accurate model learning, the proposed framework explains the observed data better than either a renewal process-based model or a conventional uncoupled Hidden Markov Model. We also demonstrate the utility of the proposed approach in predicting the time to the next tweet. Finally, clustering in the model parameter space is shown to result in distinct natural clusters of users characterized by the interaction dynamic between a user and his network.
Keywords

Activity Profile Modeling, Twitter, Data-Fitting, Explanation, Prediction, Hidden Markov Model, Coupled Hidden Markov Model, Social Network Influence, User Clustering

Introduction

Social networking websites such as Facebook, Twitter, etc. have become immensely popular, with hundreds of millions of users that engage in various forms of activity on these websites. These social networks provide an unparalleled opportunity to study individual and collective behavior at a very large scale. Such studies have profound implications on wide-ranging applications such as efficient resource allocation, user-specific information dissemination, user classification, and rapid detection of anomalous behavior such as bot or compromised accounts, etc.

The simplest model for user activity is a Poisson process, where each activity event (e.g., posting, tweeting, etc.) occurs independent of the past history at a time-independent rate. However, recent empirical evidence from multiple sources (e-mail logs, web surfing, letter correspondence, research output, etc.) suggests that human activity has distinctly non-Poissonian characteristics. In particular, the inter-event duration distribution (which is exponential for the Poisson process) has been shown to be heavy-tailed and bursty for a number of different activity types [1, 2]. Different approaches have been put forward to explain the non-Poisson nature of the activity patterns [3–10].

Nevertheless, despite significant recent progress, open questions remain. Most remarkably, on an individual scale, existing studies so far have mostly discarded the role and impact of the social network where the user activity takes place, instead describing each user via an independent stochastic process.
On the other hand, it is clear that social interactions on networks affect user activity, and discarding these interactions should generally lead to sub-optimal models.

The main contribution of this paper is to develop a probabilistic model of user activity that explicitly takes into account the interaction between users by introducing a coupling between two stochastic processes. Specifically, we propose a coupled Hidden Markov Model (coupled HMM) to describe inter-connected dynamics of user activity. In our model, the individual dynamics of each user is coupled to the aggregated activity profile of his neighbors (friends or followers) in the network. While a user's activity may be preferentially affected by specific neighbors, the predictive power of the model can be substantially improved using the aggregated activity of all the neighbors. The hidden states in our model correspond to different patterns in user activity, similar to the approach suggested in [7]. However, here the state transitions are influenced by the activity of the neighbors, and in turn, the activity of the aggregated set of neighbors is influenced by the state of the given user. While many variants of the classical HMM approach exist in the literature, the key distinguishing feature of this work is the bi-directional influence between the activity of a user and his network, specifically tailored for social network applications. Nevertheless, being a variant of the conventional (uncoupled) HMM, the proposed model enjoys the same computational advantages in terms of parameter learning as [7] and other HMM variants.

We perform a number of experiments with data describing user activity traces on Twitter, and demonstrate that the proposed approach has a better performance both in terms of explaining observed data (model fitting) and predicting future activity (generalization).
In particular, we report statistically significant improvement over two baseline approaches, a renewal process-based model and a conventional HMM. Furthermore, we use the learned models to cluster users, and find that the resulting cluster structure allows intuitive characterization of the users in terms of the interaction dynamics between a user and his social network.

The rest of the paper is organized as follows. Section 2 reviews related work in activity profile modeling, especially as applicable in social network settings. Section 3 proposes a coupled HMM framework for the activity profile of users and distinguishes the proposed model from prior work. Section 4 develops statistical methodologies to learn model parameters and shows how to use the proposed framework for data-fitting and forecasting. Section 5 validates the modeling assumptions and illustrates the utility and efficacy of the proposed approach with data from [11]. Concluding remarks are provided in Section 6. Without any prejudice, we use the male gender-specific connotations for all the users and their attributes in this work.

Related Work

Recent research on social networks has focused on understanding the properties of networks induced by social interactions, modeling information diffusion on such networks, characterizing their evolution in time, etc. In the direction of modeling the temporal activity of users' communication in such networks, several models have been proposed in the literature. Approaches based on simple point process models have been proposed for user requests in a peer-to-peer setting [12] and social network evolution [13].
In addition to the one-parameter exponential observation density for user activity utilized in [6], more general two-parameter models such as the Weibull (or stretched exponential) have been proposed for modeling inter-post duration in the context of instant-messaging networks [14], accessing patterns in Internet-media [15], understanding inter-post dynamics of original content in general online social networks [16], and inter-call duration in cell-phone networks [17].

To explain the bursty features of human dynamics, [1–4] suggested the priority selection/queue mechanism. An alternative mechanism motivated by circadian and weekly cycles of human activity and captured by cascading non-homogeneous Poisson processes was suggested in [5, 6] for the heavy-tails in inter-event durations. Although this model has been shown to be consistent with empirical observations, it is computationally intensive in terms of parameter estimation. To overcome this issue, Malmgren et al. [7] suggested a simpler two-state HMM for the activity of users in an email/communication network where the states reflect a measure of the user's activity. Similar models, suggestive of a few states determining activity patterns, have been considered for short message correspondence in [18] and Digg activity in [19]. Other work has also emphasized the importance of distinguishing active versus inactive users in activity and influence modeling tasks [20].

An important feature that characterizes the priority selection mechanism is that human behavioral patterns are driven by responses to/from others. On the other hand, the line of work motivated by the cascading non-homogeneous Poisson processes-based models explains activity patterns due to other mechanisms such as circadian and weekly cycles, task repetition, and changing communication needs.
Models that traverse between these two extremes and incorporate the influence of a user's social network on activity have not been studied in extensive detail in the literature. Some of the examples of such a bridging effort include user-generated activity traces used for inferring underlying social relationships [21–23], incorporating similarities in the behavior of users in a specific user's social network to model his actions [24, 25], a Poisson regression model to determine the users that most influence a given user [26], and a user activity-driven (rather than connectivity-driven) model for social network evolution [27]. Notwithstanding the fact that similar ideas have been sporadically pursued in other settings, our work provides the missing link for a network-driven approach to capturing individual behavioral patterns.

While the theory of classical HMMs is well-developed [28, 29], HMMs are ill-suited in settings where multiple processes interact with each other and/or information about the history of the process needed for future inferencing is not reflected in the current state. To incorporate complicated dependencies between several interacting variables, many variants of the classical HMM set-up that are special cases of the more general theory of dynamic Bayesian networks [30, 31] have been introduced in the literature. Some of these extensions include the class of autoregressive HMMs [32–34], input-output HMMs [35, 36], factorial HMMs [37], and coupled HMMs. Different variants of coupled HMMs have been used in diverse settings including models for complex human actions and behaviors [38, 39], freeway traffic [40], audio-visual speech [41, 42], EEG classification [43], spread of infection in social networks [44], etc.
Modeling Activity Profile of Twitter Users

Let $T_i$, $i = 0, 1, \cdots, N$ denote the time-stamps of a specific user's tweets over a given period-of-interest. We can equivalently define the inter-tweet duration $\Delta_i$ as

$$\Delta_i \triangleq T_i - T_{i-1}, \quad i = 1, 2, \cdots, N.$$

One of the main goals of this work is to develop a mathematical model for $\{\Delta_i\}$, $\Delta_1^N = [\Delta_1, \cdots, \Delta_N]$. Along the lines of [7], we start by developing a simplistic $k = 2$-state HMM for $\{\Delta_i\}$.

Influence-Free Hidden Markov Modeling

Assumption 1 – Underlying States: We assume that a variable $Q_i$, taking one of two possible values $\{0, 1\}$, reflects the state of the user-of-interest. Specifically, $Q_i = 0$ denotes that the user is in an Inactive state between $T_{i-1}$ and $T_i$, whereas $Q_i = 1$ denotes that the user is in an Active state. We also assume that $Q_i$ ($i \geq 1$) evolves in a time-homogeneous Markovian manner and is dependent only on $Q_{i-1}$ and is conditionally independent of $Q_0^{i-2} = [Q_0, \cdots, Q_{i-2}]$ given $Q_{i-1}$. A first-order Markovian model is a reasonable first approximation capturing short-term memory in human behavioral dynamics [2, 45]. The state transition probability matrix $P = \{P[m, n]\}$ is given as

$$P = \begin{bmatrix} 1 - \beta_{0,1} & \beta_{0,1} \\ \beta_{1,0} & 1 - \beta_{1,0} \end{bmatrix},$$

with $P[m, n] = P(Q_i = n \mid Q_{i-1} = m)$, $m, n \in \{0, 1\}$. The density of the initial state $Q_0$ is denoted as $P(Q_0 = j) = \pi_j$, $j = 0, 1$.

Note that the switching from the Inactive state to the Active state in the HMM paradigm can capture the nocturnal/work-home patterns of individual users without any further explicit modeling [7]. Further, explicit modeling of circadian and weekly cycles in social network settings is more difficult than for email communications due to the "often on" more random nature of social network interactions.
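As an illustration of Assumption 1, the following Python sketch draws a state sequence from the two-state chain and computes its long-run Active fraction. The function names (`simulate_states`, `stationary_active`), the default initial probability and the seed are illustrative assumptions and are not taken from the paper; the stationary fraction $\beta_{0,1} / (\beta_{0,1} + \beta_{1,0})$ is the standard result for a two-state Markov chain.

```python
import random


def simulate_states(n, beta01, beta10, pi1=0.5, seed=0):
    """Draw Q_0, ..., Q_n from the two-state chain of Assumption 1.

    beta01 = P(Inactive -> Active), beta10 = P(Active -> Inactive),
    pi1 = P(Q_0 = 1). All default values are hypothetical.
    """
    rng = random.Random(seed)
    q = 1 if rng.random() < pi1 else 0
    states = [q]
    for _ in range(n):
        switch = beta01 if q == 0 else beta10
        if rng.random() < switch:
            q = 1 - q  # flip between Inactive (0) and Active (1)
        states.append(q)
    return states


def stationary_active(beta01, beta10):
    """Long-run fraction of steps spent in the Active state."""
    return beta01 / (beta01 + beta10)
```

With `beta01 = 0.1` and `beta10 = 0.3`, for example, the chain spends about a quarter of its steps in the Active state, which a long simulated sequence should reproduce approximately.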
Assumption 2 – Observation Density: In general, $Q_i$ is hidden (unobservable) and we can only observe $\{\Delta_i\}$ (or equivalently, $\{T_i\}$). In the Inactive state, $\{\Delta_i\}$ form samples from a "low"-rate point process, whereas in the Active state, $\{\Delta_i\}$ form samples from a "high"-rate point process. Specifically, let the probability density function of $\Delta_i$ be given as

$$\Delta_i \sim \begin{cases} f_1(\cdot) & \text{if } Q_i = 1 \\ f_0(\cdot) & \text{if } Q_i = 0, \end{cases}$$

for an appropriate choice of $f_0(\cdot)$ and $f_1(\cdot)$.

As mentioned earlier, an exponential model for $f_\cdot(\cdot)$ corresponds to a Poisson process assumption under either state. While the exponential model is captured by a single parameter (see Table 1 for details), this simplicity often constrains the model fit either in the small inter-tweet (bursty) regime or large inter-tweet regime (tails). Two-parameter extensions of the exponential such as the gamma or Weibull density allow a better fit in these two regimes. Both the gamma and the Weibull models result in similar modeling fits. While the gamma model allows for simple parameter estimate formulas (see Table 1), the Weibull model requires the solving of coupled equations in the model parameters (often, a numerically intensive procedure). Thus, we will restrict attention to the exponential and gamma model choices in this work.

Influence-Driven Hidden Markov Modeling

A more sophisticated influence-driven model is developed now by making the following additional assumptions:

Assumption 3 – Influence of Neighbors: In addition to $Q_{i-1}$, the evolution of $Q_i$ is also influenced by the aggregated activity of all the users interacting with and influencing the user-of-interest ("neighbors," for short). While neighbors is a broad rubric, we restrict attention to the friends and followers in this work.
For example, a series of tweets from the neighbors can result in a reply/retweet by the user, or a long period of non-activity from the neighbors could induce the user to initiate a burst of activity. Let the variable $Z_i$ ($i = 1, \cdots, N$) capture the influence of the neighbors' tweets on the user-of-interest. Examples of candidate influence structures (information theoretically, $Z_i$ is viewed as a side-information metric) include:

1. A binary indicator function that reflects whether there was a mention of the user between $T_{i-1}$ and $T_i$ (or not);
2. The number of such mentions;
3. Total traffic (aggregated activity) of the friends of the user, etc.

Motivated by the same short-term memory assumption as before, the coupling between $\{Q_i\}$ and $\{Z_i\}$ is simplified by the Markovian condition that $P(Q_i \mid Q_1^{i-1}, Z_1^i) = P(Q_i \mid Q_{i-1}, Z_i)$. In general, to keep computational requirements in inferencing low, it is helpful to assume that the evolution of $Q_i$ is captured by a summary statistic $\phi(Z_i) : Z_i \mapsto [0, 1]$ such that

$$P(Q_i \mid Q_{i-1}, Z_i) = P_0(Q_i \mid Q_{i-1}) \cdot (1 - \phi(Z_i)) + P_1(Q_i \mid Q_{i-1}) \cdot \phi(Z_i)$$

with $P_k[m, n] = P_k(Q_i = n \mid Q_{i-1} = m)$ where

$$P_0 = \begin{bmatrix} 1 - p_0 & p_0 \\ q_0 & 1 - q_0 \end{bmatrix} \quad \text{and} \quad P_1 = \begin{bmatrix} 1 - p_1 & p_1 \\ q_1 & 1 - q_1 \end{bmatrix}.$$

In particular, the choice $\phi(Z_i) = \mathbb{1}(Z_i > \tau)$ for a suitable threshold $\tau$ implies that the user switches from the transition probability matrix $P_0$ to $P_1$ depending on the magnitude of the influence structure. To paraphrase, the user evolves according to a baseline dynamics corresponding to $P_0$ if his network activity is below a certain threshold and evolves according to an elevated dynamics corresponding to $P_1$ if his network activity exceeds that threshold. The discussion on model learning elaborates on a simple method to determine the appropriate choice of $\tau$.
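The threshold-based switching between $P_0$ and $P_1$ can be sketched in a few lines of Python. This is a minimal illustration under the indicator choice $\phi(Z_i) = \mathbb{1}(Z_i > \tau)$; the helper names `make_matrix` and `coupled_transition` and the numeric values in the usage note are hypothetical.

```python
def make_matrix(p, q):
    """2x2 transition matrix [[1-p, p], [q, 1-q]], rows indexed by Q_{i-1}."""
    return [[1.0 - p, p], [q, 1.0 - q]]


def coupled_transition(z, tau, P0, P1):
    """Return P(Q_i = n | Q_{i-1} = m, Z_i = z) as a 2x2 nested list.

    Uses phi(z) = 1 if z > tau else 0, so the result is P0 (baseline)
    when z <= tau and P1 (elevated) when z > tau.
    """
    phi = 1.0 if z > tau else 0.0
    return [
        [(1.0 - phi) * P0[m][n] + phi * P1[m][n] for n in (0, 1)]
        for m in (0, 1)
    ]
```

For instance, with `P0 = make_matrix(0.1, 0.2)`, `P1 = make_matrix(0.4, 0.5)` and `tau = 2`, an interval with `z = 5` mentions uses the elevated matrix, while `z = 1` falls back to the baseline. A soft summary statistic with values strictly between 0 and 1 would instead produce a convex mixture of the two matrices.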
The discussion on model validation provides some empirical justification for this assumption.

Assumption 4 – Evolution of Influence Structure: Noting that $Z_i$ is a function of the activity of all the neighbors (and not a specific user), we hypothesize that $Z_i$ is dependent on $Q_{i-1}$, but only weakly dependent on $Z_{i-1}$. Motivated by this thinking, we make the simplistic assumption that

$$P(Z_i \mid Q_1^{i-1}, Z_1^{i-1}) = P(Z_i \mid Q_{i-1}). \tag{1}$$

Rephrasing, (1) presumes that user aggregation de-correlates $Z_i$ from its past history. While the above assumption can be justified under certain scenarios (see the discussion under model validation), more general influence evolution models need to be considered and the loss in explanatory/predictive power by making the simplistic assumption in (1) needs to be studied carefully. This is the subject of ongoing work. Further, let the probability density function of $Z_i$ be given as

$$Z_i \sim \begin{cases} g_0(\cdot) & \text{if } Q_{i-1} = 0 \\ g_1(\cdot) & \text{if } Q_{i-1} = 1. \end{cases}$$

Different candidates for $g_j(\cdot)$ are illustrated in Table 2.

Combining the above four assumptions, the joint density of the observations $\{\Delta_i\}$, the influence structure $\{Z_i\}$, and the state $\{Q_i\}$ can be simplified as

$$\begin{aligned} P(\Delta_1^N, Z_1^N, Q_0^N) &= P(Q_0, Z_1, Q_1, \Delta_1, \cdots, Z_N, Q_N, \Delta_N) && (2) \\ &= P(Q_0) \prod_{i=1}^{N} P(Z_i \mid Q_{i-1}) \prod_{i=1}^{N} P(Q_i \mid Q_{i-1}, Z_i) \prod_{i=1}^{N} P(\Delta_i \mid Q_i). && (3) \end{aligned}$$

The dependence relations that drive the coupled HMM framework for user activity are illustrated in Figs. 1 and 2.

Comparison with Related Models and Architectures

Many extensions to the classical HMM architecture have been proposed in the literature for inferencing problems in different settings. We now compare the proposed coupled HMM architecture with some of these extensions and variants.
The conditional dependencies of the involved variables corresponding to the different architectures relative to the proposed model in this work are summarized in Table 3. The number of model parameters for these architectures with $k$ states, $\ell$ network influence structure levels, and $m$ parameters for the observations are presented in Table 4.

The simplest extension of the HMM architecture, an autoregressive HMM [33], ties the evolution of the user's inter-tweet duration to his state and his past observations. Thus, this model does not incorporate the influence of the user's network on his activity. A more general framework called a class of input-output HMMs is proposed in [35] and [36], where the user's inter-tweet duration is not only dependent on his state, but also on an external input such as the network influence structure. In contrast, a coupled variation of the factorial HMM in [37] takes the viewpoint of the network influence structure being an additional state rather than an external input. In either model, the current state of the user depends not only on his past state, but also on the state of his network. However, in both models, the evolution of the user's network is one-sided and independent of the user's interaction.

The case of a fully-coupled HMM overcomes this one-sided evolution by ensuring that the state of the user and his network influence each other. The price to pay for such generality (lack of structure in the conditional dependencies) is that the model parameters have to be learned via approximation algorithms instead of iterative techniques. To overcome this difficulty, a structured architecture is proposed in [38], where

$$\begin{aligned} P(Q_i \mid \{Q_{i-1}, Z_{i-1}\}) &= P(Q_i \mid Q_{i-1}) \cdot P(Q_i \mid Z_{i-1}) \\ P(Z_i \mid \{Q_{i-1}, Z_{i-1}\}) &= P(Z_i \mid Q_{i-1}) \cdot P(Z_i \mid Z_{i-1}). \end{aligned}$$
Similarly, [43] proposes another architecture, where

$$\begin{aligned} P(Q_i \mid \{Q_{i-1}, Z_{i-1}\}) &= \beta_1 \cdot P(Q_i \mid Q_{i-1}) + \beta_2 \cdot P(Q_i \mid Z_{i-1}) \\ P(Z_i \mid \{Q_{i-1}, Z_{i-1}\}) &= \gamma_1 \cdot P(Z_i \mid Q_{i-1}) + \gamma_2 \cdot P(Z_i \mid Z_{i-1}) \end{aligned}$$

for appropriate normalization constants $\{\beta_i\}$ and $\{\gamma_i\}$. While parameter learning algorithms simplify in either scenario, from a social network perspective, the number of model parameters remains large for small values of $k$ and $\ell$ relative to the proposed coupled HMM architecture in this work. For example, these structured coupled HMMs are described by 10 and 18 model parameters with an $m = 1$ parameter observation density in each state (and 12 and 20 parameters with $m = 2$) relative to 8 and 10 model parameters with the proposed model in the same settings (see Table 4 for details).

In the backdrop of this discussion, the proposed coupled HMM architecture offers a principled, novel and easily-motivated modeling framework, specifically useful in social network contexts. Despite being simple, it offers significant performance gains over the classical HMM architecture and its variants. As the subsequent discussion also shows, the proposed architecture has the added benefit of allowing model learning via simple re-estimation formulas.

Methodology

Learning Model Parameters

It is of interest to infer the underlying states $\{Q_i\}$ that cannot be observed directly. This task is performed with the aid of the observations $\{\Delta_i\}$ in the HMM setting, and with the aid of $\{\Delta_i\}$ and the influence structure $\{Z_i\}$ in the coupled HMM setting.

In the HMM setting, a locally optimal choice of model parameters is sought to maximize the likelihood function $P(\Delta_1^N \mid \lambda)$.
Following the result in [46, 47], starting with an initial choice of HMM parameters $\bar{\lambda}$, the model parameters are re-estimated to maximize Baum's auxiliary function $Q(\lambda, \bar{\lambda})_{\mathrm{HMM}}$, defined as

$$Q(\lambda, \bar{\lambda})_{\mathrm{HMM}} \triangleq \sum_{Q_0^N} \log P(\Delta_1^N, Q_0^N \mid \lambda) \cdot P(\Delta_1^N, Q_0^N \mid \bar{\lambda}).$$

It can be easily checked [28, 29] that this maximization breaks into a term-by-term optimization of individual model parameters. The model re-estimation formulas are given by the Baum-Welch algorithm:

$$\hat{\beta}_{a,b} = \frac{\sum_{i=1}^{N} \xi_{i-1}(a, b)}{\sum_{i=1}^{N} \left( \xi_{i-1}(a, 0) + \xi_{i-1}(a, 1) \right)}, \quad a \neq b, \; a, b \in \{0, 1\}.$$

The Baum-Welch estimates of the parameters defining the different observation densities are presented in Table 1. In these equations, $\xi_i(a, b)$ for $i = 0, \cdots, N - 1$ is defined as

$$\xi_i(a, b) \triangleq P(Q_i = a, Q_{i+1} = b \mid \Delta_1^N, \bar{\lambda}).$$

The update equation for $\xi_i(a, b)$ follows from [28, Eq. (37)] and the forward-backward procedure.

In the coupled HMM setting, as in [36], we are interested in model parameters that maximize the conditional likelihood function $P(\Delta_1^N \mid Z_1^N, \lambda)$. If the conditional likelihood is known in closed-form (as in the input-output HMM case [36]) or a tight lower bound to it is known, a conditional expectation maximization algorithm along the lines of [48–50] can be pursued. In the proposed coupled HMM setting, the conditional likelihood appears to be neither amenable to a simple formula nor a tight lower bound. To overcome this technical difficulty, we now propose a two-step procedure to learn an estimate of the model parameters that maximize $P(\Delta_1^N \mid Z_1^N, \lambda)$.

In the first step, we fix the threshold $\tau$ that distinguishes the baseline dynamics $P_0$ from the elevated dynamics $P_1$ to an appropriate choice $\tau_{\mathrm{init}}$.
We then treat $\Delta_1^N$ and $Z_1^N$ as training observations and consider a generalized auxiliary function $Q(\lambda, \bar{\lambda})_{\mathrm{CHMM}}$ of the form:

$$Q(\lambda, \bar{\lambda})_{\mathrm{CHMM}} \triangleq \sum_{Q_0^N} \log P(\Delta_1^N, Z_1^N, Q_0^N \mid \lambda) \cdot P(\Delta_1^N, Z_1^N, Q_0^N \mid \bar{\lambda}). \tag{4}$$

As before, $\lambda$ and $\bar{\lambda}$ denote the optimization variable corresponding to the parameter space (all the parameters except for $\tau$) and its initial estimate, respectively. A straightforward extension of the proof in [46, 47] shows that maximizing $Q(\lambda, \bar{\lambda})_{\mathrm{CHMM}}$ in the $\lambda$ variable results in a local maximization of the joint likelihood function $P(\Delta_1^N, Z_1^N \mid \lambda)$. To obtain analogous re-estimation formulas for an iterative solution to a local maximum, we define the equivalent intermediate variable $\tilde{\xi}_i(a, b)$ for $i = 0, \cdots, N - 1$:

$$\tilde{\xi}_i(a, b) \triangleq P(Q_i = a, Q_{i+1} = b \mid \Delta_1^N, Z_1^N, \bar{\lambda}).$$

Using (3), we can simplify (4) and the joint optimization of the model parameters again breaks into a term-by-term optimization. We then have the following analogous model parameter estimates for $k \in \{0, 1\}$:

$$\tilde{p}_k = \frac{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \tilde{\xi}_{i-1}(0, 1)}{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \tilde{\xi}_{i-1}(0, 0) + \sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \tilde{\xi}_{i-1}(0, 1)}$$

$$\tilde{q}_k = \frac{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \tilde{\xi}_{i-1}(1, 0)}{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \tilde{\xi}_{i-1}(1, 0) + \sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \tilde{\xi}_{i-1}(1, 1)}$$

with $\mathcal{Z}_0 = \{i : Z_i \leq \tau\}$ and $\mathcal{Z}_1 = \{i : Z_i > \tau\}$. The re-estimation formulas for the observation density parameters follow the same structure as in Table 1 by replacing $\xi_i(\cdot, \cdot)$ with $\tilde{\xi}_i(\cdot, \cdot)$. For the parameters defining the density of the influence structure, Table 2 provides a list of re-estimation formulas. The intermediate variable $\tilde{\xi}_i(a, b)$, $1 \leq i \leq N - 1$ is updated by a generalized forward-backward procedure whose steps are illustrated in Table 5 (see (7)-(13)). We denote by $\lambda_{\mathrm{init}}$ the converged model parameters that locally maximize $Q(\lambda, \bar{\lambda})_{\mathrm{CHMM}}$.
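The transition re-estimation step for $\tilde{p}_k$ and $\tilde{q}_k$ can be sketched as follows, assuming the pairwise posteriors $\tilde{\xi}_{i-1}(a, b)$ have already been produced by a forward-backward pass (not shown here). The function name `reestimate_pq` and the list-based data layout are assumptions made for illustration only.

```python
def reestimate_pq(xi, z, tau):
    """Re-estimate (p_k, q_k) for k in {0, 1} from pairwise posteriors.

    xi[i-1] is the 2x2 nested list for xi_tilde_{i-1}(a, b) and
    z[i-1] is Z_i, for i = 1, ..., N. Index i contributes to k = 0
    when Z_i <= tau and to k = 1 when Z_i > tau.
    Returns {k: (p_k, q_k)}; an entry is NaN if its denominator is zero.
    """
    estimates = {}
    for k in (0, 1):
        idx = [i for i, zi in enumerate(z) if (zi > tau) == (k == 1)]
        # Accumulate the posterior mass for every (a, b) transition.
        s = [[sum(xi[i][a][b] for i in idx) for b in (0, 1)] for a in (0, 1)]
        row0 = s[0][0] + s[0][1]
        row1 = s[1][0] + s[1][1]
        p_k = s[0][1] / row0 if row0 > 0 else float("nan")
        q_k = s[1][0] / row1 if row1 > 0 else float("nan")
        estimates[k] = (p_k, q_k)
    return estimates
```

If every $Z_i$ falls on one side of the threshold, the other group has no data and its estimates are returned as NaN. In that case the update has the same form as the uncoupled Baum-Welch transition update.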
Note that $\lambda_{\mathrm{init}}$ is a local maximum only in the $\lambda$ space and not in $\{\tau \times \lambda\}$. Thus, the choice $\{\tau_{\mathrm{init}}, \lambda_{\mathrm{init}}\}$ does not maximize $P(\Delta_1^N \mid Z_1^N, \lambda)$, not even locally. Therefore, in the next step, we locally optimize the conditional likelihood over a local region around $\{\tau_{\mathrm{init}}, \lambda_{\mathrm{init}}\}$. That is,

$$\{\tilde{\tau}, \tilde{\lambda}\} = \arg\max_{\{\tau, \lambda\} \in \mathcal{L}} P(\Delta_1^N \mid Z_1^N, \lambda)$$

where $\mathcal{L} = \{\tau : \tau = \tau_{\mathrm{init}} + \Delta\tau \text{ and } \lambda : \lambda = \lambda_{\mathrm{init}} + \Delta\lambda\}$. In this work, we focus on a box-constrained $\mathcal{L}$. Alternately, a local gradient search in the model parameter space can be pursued to locally optimize the conditional likelihood function. Note that the conditional density can be written as

$$P(\Delta_1^N \mid Z_1^N, \lambda) = \frac{\tilde{\alpha}_N(0) + \tilde{\alpha}_N(1)}{\hat{\alpha}_N(0) + \hat{\alpha}_N(1)},$$

where the forward algorithm variable $\tilde{\alpha}_i(j)$ is updated with the same formulas as (8)-(10) (see Table 5). On the other hand, $\hat{\alpha}_i(j)$ follows the same formula as $\tilde{\alpha}_i(j)$, but by constraining $P(\Delta_i \mid Q_i = a) = 1$ for all $i$ and $a$.

Model Verification

The fit of the different models to the observed data is studied in two ways. In the first approach, the model parameters learned via the (generalized) Baum-Welch algorithm are used with a state estimation procedure to estimate the most probable state sequence associated with the observations. For the HMM setting, state estimation is straightforward via the use of the Viterbi algorithm [28]. The generalization of the Viterbi algorithm to the coupled HMM setting requires the definition of intermediate variables $\delta_{i+1}(j)$ and $\phi_{i+1}(j)$ as illustrated in Table 5 (see (14)-(19)). As with the Viterbi algorithm, the most probable state sequence is then estimated as

$$\begin{aligned} Q_N^{\star} &= \arg\max_{j \in \{0, 1\}} \delta_N(j) \\ Q_i^{\star} &= \phi_{i+1}(Q_{i+1}^{\star}), \quad 1 \leq i \leq N - 1. \end{aligned}$$
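A minimal sketch of a generalized Viterbi pass is given below. Table 5 is not reproduced in this section, so the recursion here is an assumption derived directly from the factorization in (3): each step multiplies the best previous score by $P(Z_i \mid Q_{i-1})$, $P(Q_i \mid Q_{i-1}, Z_i)$ and $P(\Delta_i \mid Q_i)$, carried out in the log domain. The function name and the callable arguments `g`, `trans` and `f` are illustrative.

```python
import math


def coupled_viterbi(z, delta, pi, g, trans, f):
    """Most probable state path Q_0, ..., Q_N under the factorization (3).

    z[i-1] = Z_i and delta[i-1] = Delta_i for i = 1..N; pi[j] = P(Q_0 = j).
    g(z, a)        -> P(Z_i = z | Q_{i-1} = a)
    trans(a, b, z) -> P(Q_i = b | Q_{i-1} = a, Z_i = z)
    f(d, b)        -> density of Delta_i = d given Q_i = b
    """
    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    score = [log(pi[0]), log(pi[1])]
    back = []  # back[i-1][b] = best Q_{i-1} given Q_i = b
    for zi, di in zip(z, delta):
        new_score, ptr = [], []
        for b in (0, 1):
            cands = [score[a] + log(g(zi, a)) + log(trans(a, b, zi))
                     for a in (0, 1)]
            a_best = 0 if cands[0] >= cands[1] else 1
            new_score.append(cands[a_best] + log(f(di, b)))
            ptr.append(a_best)
        score = new_score
        back.append(ptr)

    # Backtrack from the best final state.
    path = [0 if score[0] >= score[1] else 1]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

Ties are broken in favor of the Inactive state, which is an arbitrary choice. The returned list has $N + 1$ entries because it also includes an estimate of $Q_0$.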
The observed inter-tweet durations corresponding to the classified states are compared with the inter-tweet durations obtained with the proposed model(s) via a graphical method such as the Quantile-Quantile (Q-Q) plot. Recall that a Q-Q plot plots the quantiles corresponding to the true observations with the quantiles corresponding to the model(s) [51]. If the proposed model reflects the observations correctly, the quantiles lie on the (reference) straight-line that extrapolates the first and the third quartiles. Discrepancies from the straight-line benchmark indicate artifacts introduced by the model(s) not seen in the observations and/or features in the observation not explained by the model(s).

In the second approach, the fits of the different models to the data are studied via a more formal metric such as the Akaike Information Criterion (AIC), defined as

$$\mathrm{AIC}(n) \triangleq 2k - 2\log(L),$$

where $k$ denotes the number of parameters used in the model, $n$ is the length of the observation sequence, and $L$ is the optimized likelihood function for the observation sequence corresponding to the learned model. The AIC penalizes models with more parameters and the model that results in the smallest value of AIC is the most suitable model (for the observed data) from the class of models considered. In the HMM setting with $k_{\mathrm{HMM}}$ parameters, the AIC corresponding to $\{\Delta_i\}$ is given as

$$\mathrm{AIC}(n)_{\mathrm{HMM}} \triangleq 2k_{\mathrm{HMM}} - 2\log P(\Delta_1^N \mid \lambda) = 2k_{\mathrm{HMM}} - 2\log\left( \alpha_N(0) + \alpha_N(1) \right),$$

where the converged model parameter estimates from the Baum-Welch algorithm are used to compute $\alpha_i(j) \triangleq P(\Delta_1^i, Q_i = j)$ using the forward procedure. In the coupled HMM setting with $k_{\mathrm{CHMM}}$ parameters, the corresponding AIC metric is defined as

$$\mathrm{AIC}(n)_{\mathrm{CHMM}} \triangleq 2k_{\mathrm{CHMM}} - 2\log P(\Delta_1^N \mid Z_1^N, \lambda), \tag{5}$$

where the model parameter estimates that maximize $P(\Delta_1^N \mid Z_1^N, \lambda)$ are to be used in (5).
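The conditional log-likelihood entering (5) can be computed as the difference of two forward passes, one with the observation densities and one with $P(\Delta_i \mid Q_i) = 1$. The sketch below assumes a forward recursion derived from the factorization in (3), since Table 5 is not reproduced here, and rescales the forward variable at every step to avoid numerical underflow. All function and argument names are illustrative.

```python
import math


def forward_loglik(z, delta, pi, g, trans, f):
    """log sum_j alpha_N(j) for the coupled model, with per-step scaling.

    z[i-1] = Z_i, delta[i-1] = Delta_i; pi[j] = P(Q_0 = j).
    g(z, a), trans(a, b, z), f(d, b) are the factors of (3).
    """
    alpha = list(pi)
    loglik = 0.0
    for zi, di in zip(z, delta):
        new = [
            sum(alpha[a] * g(zi, a) * trans(a, b, zi) for a in (0, 1)) * f(di, b)
            for b in (0, 1)
        ]
        scale = sum(new)
        loglik += math.log(scale)
        alpha = [v / scale for v in new]  # renormalize to sum to one
    return loglik


def conditional_loglik(z, delta, pi, g, trans, f):
    """log P(Delta | Z): joint forward pass minus the pass with f == 1."""
    joint = forward_loglik(z, delta, pi, g, trans, f)
    marginal = forward_loglik(z, delta, pi, g, trans, lambda d, b: 1.0)
    return joint - marginal


def aic(num_params, loglik):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2 * num_params - 2 * loglik
```

A useful sanity check is that when both states share the same observation density, the conditional log-likelihood reduces to the sum of the log densities of the observed durations, regardless of the transition and influence parameters.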
With the model parameters learned as explained in the previous section, an upper bound to $\mathrm{AIC}(n)_{\mathrm{CHMM}}$ is obtained as

$$\mathrm{AIC}(n)_{\mathrm{CHMM}} \leq \overline{\mathrm{AIC}}(n) \triangleq 2k_{\mathrm{CHMM}} - 2\log\left( \frac{\tilde{\alpha}_N(0) + \tilde{\alpha}_N(1)}{\hat{\alpha}_N(0) + \hat{\alpha}_N(1)} \right).$$

Forecasting

Given $\Delta_1^n$ (and $Z_1^n$), forecasting $\Delta_{n+1}$ is of immense importance in tasks such as advertising, anomaly detection (detecting when a compromised account will post next), etc. A simple maximum a posteriori (MAP) predictor of the form

$$\tilde{\Delta}_{n+1}\big|_{\mathrm{MAP}} = \arg\max_{y} f(\Delta_{n+1} = y \mid \Delta_1^n, Z_1^n) = \arg\max_{y} \sum_{i=0}^{k-1} \tilde{\beta}_i f_i(\Delta_{n+1} = y)$$

where

$$\tilde{\beta}_i = \frac{\sum_j \tilde{\alpha}_n(j) \, P(Z_{n+1} \mid Q_n = j) \, P(Q_{n+1} = i \mid Z_{n+1}, Q_n = j)}{\sum_j \tilde{\alpha}_n(j)}$$

fails when $f_i(\Delta_{n+1} = y)$ is unimodal with the same mode for all $i$. This is always the case with exponential observation models (mode is 0) and with gamma models if $k_i \lambda_i < 1$ for all $i$ (mode is 0), which is typically the case with the best model fits for many users. On the other hand, a conditional mean predictor of the form

$$\tilde{\Delta}_{n+1}\big|_{\mathrm{CM}} = E[\Delta_{n+1} \mid \Delta_1^n, Z_1^n] = \sum_{i=0}^{k-1} \tilde{\beta}_i \, E[\Delta_{n+1} \mid Q_{n+1} = i]$$

results in large forecasting errors in the Inactive state if the mean inter-tweet durations in the two states are very disparate (typically the case for many users). To overcome these problems, we consider a predictor of the form

$$\tilde{\Delta}_{n+1} = \sum_{i=0}^{k-1} \mathbb{1}\left( \tilde{Q}_{n+1} = i \mid \Delta_1^n \right) E[\Delta_{n+1} \mid Q_{n+1} = i]$$

where $\tilde{Q}_{n+1}$ is the state estimate using the (generalized) Viterbi algorithm with $\Delta_1^n$ (and $Z_1^n$) as inputs and study the forecasting performance in the Active state with a Symmetric Mean Absolute Percentage Error (SMAPE) metric:

$$\mathrm{SMAPE}(N) \triangleq \frac{1}{N} \sum_{i=1}^{N} \frac{\left| \Delta_i - \tilde{\Delta}_i \right|}{\Delta_i + \tilde{\Delta}_i} \cdot \mathbb{1}(Q_i = 1).$$

The SMAPE metric is a normalized error metric and a smaller value indicates a better model for forecasting. It is seen as a percentage error and is bounded between 0% and 100%.
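The state-based predictor and the SMAPE metric can be sketched directly from their definitions. The function names and the list-based inputs below are illustrative assumptions; in particular, `state_means` stands for the per-state mean inter-tweet durations implied by whichever observation density was fitted.

```python
def predict_next(q_hat, state_means):
    """Predictor that returns E[Delta | Q = q_hat] for the estimated state.

    q_hat is the estimated next state (0 or 1) and state_means[j] is the
    mean inter-tweet duration in state j.
    """
    return state_means[q_hat]


def smape_active(delta, pred, q):
    """SMAPE restricted to the Active state, normalized by the full length N.

    delta[i], pred[i] are the observed and forecast durations and q[i] is
    the corresponding state; only indices with q[i] == 1 contribute.
    """
    n = len(delta)
    total = sum(
        abs(d - p) / (d + p)
        for d, p, s in zip(delta, pred, q)
        if s == 1
    )
    return total / n
```

As written in the definition, the sum is divided by the total number of observations and not by the number of Active intervals, so users who are rarely Active obtain a smaller value for the same per-interval error. The sketch also assumes positive durations and forecasts so that each denominator is non-zero.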
Numerical Results

The dataset used to illustrate the efficacy of the models proposed in this work is a 30-day long record of Twitter activity described in [11]. This dataset consists of $N_t = 652{,}522$ tweets from $N_u = 30{,}750$ users (with at least one tweet). The time-scale on which the tweets are collected is minutes. While the dataset has been collected using a snowball sampling technique and reflects a population primarily based out of West Asia, London and Pakistan, the users in the dataset appear to be from diverse socio-economic and political backgrounds and have a broad array of interests. Further, the properties of the dataset on a collective scale are similar in nature to well-understood properties of similar datasets [11], suggesting a high confidence on the suitability of the dataset in studying user behavior on an individual scale and in its generalizability to other datasets.

Since reliable model learning can be accomplished only for users with sufficient activity, we focus on users with a large number of tweets over the data collection period. There were 223 users with over 600 tweets and 115 users with over 1,000 tweets.

Validating Model Assumptions

The coupled HMM framework developed in this paper is built on four main assumptions, two of which lead to the HMM formulation and two that couple the influence of the neighbors to the HMM. Notwithstanding the fact that these assumptions are based on a rational model of user behavior, the first two assumptions have been well-studied and justified in the literature [7, 18–20]. We now provide some empirical results to justify the latter two assumptions.
For this, we start with two typical users (denoted as User-I and User-II) whose activity over the thirty-day period consists of: i) 807 tweets, 260 mentions, and 16,935 tweets from his social network of 62 friends, and ii) 1,914 tweets, 1,108 mentions, and 10,281 tweets from his social network of 92 friends. We also consider an extreme case of a highly active user (denoted as User-III) whose activity over the thirty-day period consists of 2,387 tweets, 2,872 mentions, and 58,810 tweets from his social network of 206 friends. Users-I and II do not appear to be popular public figures, whereas User-III is a popular journalist, an advocate on many political issues, and an activist. Assumption 3 hypothesizes that each user's activity switches from a baseline dynamics to an elevated state of dynamics depending on the magnitude of the influence structure relative to a threshold $\tau$. To test this assumption, we study the data from User-III for $n = 1000$. With this data, we use the generalized Baum-Welch algorithm to learn model parameters (as a function of $\tau$) for a coupled HMM with the number of mentions as the influence structure. Fig. 3 plots the learned transition probabilities for User-III as a function of $\tau$. From this figure, we see that both $p_0$ and $p_1$ start off at approximately the same value (and similarly for $q_0$ and $q_1$). However, as $\tau$ increases, the transition probabilities stabilize at different values, suggesting that there is indeed a baseline and an elevated state in user dynamics. Similar behavior is also seen with data from Users-I and II. On the other hand, Assumption 4 hypothesizes that $Z_i$ and $Z_{i-1}$ are conditionally independent given $Q_{i-1}$. To test this assumption, we use the model parameters learned with the generalized Baum-Welch algorithm in the generalized Viterbi algorithm to estimate the most likely state sequence corresponding to the observations.
The conditional correlation coefficient between $Z_i$ and $Z_{i-1}$, defined as
\[
\rho(Z_i, Z_{i-1} \mid Q_{i-1} = j) \triangleq \frac{E\Big[ \big( Z_i - E[Z_i \mid Q_{i-1} = j] \big) \cdot \big( Z_{i-1} - E[Z_{i-1} \mid Q_{i-1} = j] \big) \Big]}{\sqrt{E\Big[ \big( Z_i - E[Z_i \mid Q_{i-1} = j] \big)^2 \Big]} \cdot \sqrt{E\Big[ \big( Z_{i-1} - E[Z_{i-1} \mid Q_{i-1} = j] \big)^2 \Big]}}, \quad j \in \{0, 1\},
\]
is used to study conditional independence. Table 6 lists $\rho(Z_i, Z_{i-1} \mid Q_{i-1} = j)$ and the $p$-value corresponding to this coefficient for Users-I to III with $n = 800$, $n = 1000$ and $n = 1000$, respectively. From this table, we see that the correlation coefficient in all the six cases studied has a small (absolute) value. A simple explanation for this observation is that user aggregation significantly diminishes the correlation between $Z_{i-1}$ and $Z_i$. Specifically, at a (standard) significance level of 5%, the null hypothesis of conditional independence between $Z_i$ and $Z_{i-1}$ cannot be rejected in five of the six cases studied, indicating that Assumption 4 can be justified for many users.

Model Fits For Users-I to III
We now study the following models for the activity profile of the three users: i) conventional two-state HMM, ii) coupled HMM with a binary influence structure that is set to 1 when there is a mention of the user and 0 otherwise, iii) coupled HMM with the number of such mentions as the influence structure, and iv) coupled HMM with the social network traffic of the friends of the user as the influence structure. Exponential and gamma densities are considered for the observations (inter-tweet duration). On the other hand, geometric, Poisson and shifted zeta densities are considered for the number of mentions, and a geometric density is considered for the total traffic. See Tables 1 and 2 for model details.
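As a minimal sketch of the candidate densities just listed, the following Python functions evaluate log-densities under the parameterizations of Tables 1 and 2 (gamma with shape $k$ and scale $\lambda$; geometric and Poisson on $k \geq 0$). The function names are hypothetical, and the shifted zeta density is omitted because the Python standard library offers no Riemann zeta function.

```python
import math


def exponential_logpdf(x: float, rate: float) -> float:
    """log of rho * exp(-rho * x) for x >= 0."""
    return math.log(rate) - rate * x


def gamma_logpdf(x: float, shape: float, scale: float) -> float:
    """log of x^(k-1) * exp(-x / lam) / (Gamma(k) * lam^k); requires x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def geometric_logpmf(z: int, gamma: float) -> float:
    """log of (1 - gamma) * gamma^z for z = 0, 1, 2, ... and 0 < gamma < 1."""
    return math.log1p(-gamma) + z * math.log(gamma)


def poisson_logpmf(z: int, gamma: float) -> float:
    """log of gamma^z * exp(-gamma) / Gamma(z + 1) for z = 0, 1, 2, ..."""
    return z * math.log(gamma) - gamma - math.lgamma(z + 1)


# A gamma density with k = 1 and lambda = 1 / rho reduces to an
# exponential density of rate rho (toy values chosen for illustration).
rho = 0.4
gap = exponential_logpdf(2.5, rho) - gamma_logpdf(2.5, 1.0, 1.0 / rho)
```

Working with log-densities avoids numerical underflow when the likelihood of a long observation sequence is accumulated.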
Tables 7-9 list the AIC scores for these three users with the different models as a function of the number of observations $n$ for different choices: $n = 100$, $250$, $500$, $750$, or $1000$. From these tables, the following conclusions can be made:
1. For all three users, both a Poisson process model and a renewal process model are significantly sub-optimal for the observations as they implicitly assume a single state for the user's activity. This is in conformance with similar observations in [7]. A conventional HMM with two states overcomes this problem by assuming that the user switches between an Active and an Inactive state and thus provides a better baseline to compare the performance of the proposed modeling framework.
2. For all combinations of users, $n$ and types of influence structure, a two-parameter gamma density for the observations results in a better fit than is possible with an exponential model. This should not be entirely surprising since an exponential density is a special case of the gamma density (a gamma with $k = 1$ and $\lambda = 1/\rho$ results in an exponential of rate $\rho$). Similar observations have also been made in related recent work [14–17]. This conclusion is also reinforced by observing the Q-Q plots of the true inter-tweet durations (in the Active and Inactive states) relative to the inter-tweet duration values obtained from four models for User-III (see Figs. 4(a)-(d) and Figs. 4(e)-(h), respectively). The four competing models illustrated are: i) Model A: conventional HMM with exponential density, ii) Model B: conventional HMM with gamma density, iii) Model C: coupled HMM with geometric influence structure and exponential density, and iv) Model D: coupled HMM with geometric influence structure and gamma density. The most probable state sequence vector corresponding to the observations of User-III is estimated with the (generalized) Viterbi algorithm.
As can be seen from Fig. 4, Model D is the best fit from among these four models. Nevertheless, the discrepancies of some of the quantiles from the reference straight line show that even this model does not completely capture all the features in the observations, suggesting a direction for future work in this area.
3. We now explain how a coupled HMM works. State estimation with Models B and D is performed using the (generalized) Viterbi algorithm. In the $n = 1000$ case, while the HMM declares 898 of the 1000 inter-tweet periods (corresponding to 21.06% of the total observation period for User-III) as Active, the coupled HMM declares only 845 periods (corresponding to 16.46% of the total observation period) as Active. In terms of discrepancies in state estimation between the two models, 53 inter-tweet periods declared as Active by the HMM are re-classified as Inactive by the coupled HMM, whereas all the Inactive states of the HMM are also classified as Inactive by the coupled HMM. Carefully studying $\Delta_i$, $Q_{i-1}$ and $Z_i$ for these 53 re-classified periods, it can be seen that the coupled HMM declares a period as Inactive either when $\Delta_i$ is large (independent of the nature of $Q_{i-1}$ or $Z_i$), or when $\Delta_i$ is small but, in addition, $Z_i$ is also small and $Q_{i-1} = 0$. In other words, if the user is in the Inactive state and the influence structure does not suggest a switch to the Active state, a small inter-tweet period is treated as an anomaly rather than as an indicator of change to the Active state. Thus, in contrast to the HMM setting where the state estimate depends primarily on the magnitude of $\Delta_i$, the coupled HMM is less trigger-happy in the sense that it considers the magnitude of $\Delta_i$ in the context of neighbors' activity before declaring a state as Active or Inactive.
4.
In terms of general trends with AIC as the metric for model fitting, if $n$ is small (say, $n = 100$ to $250$), a conventional HMM is competitive and comparable with (sometimes, even better than) a coupled HMM with more parameters. In addition to there not being sufficient data to learn a complicated model, this trend conforms with the popular intuition of Occam's razor that simple models suffice for short observation sequences.
5. Social network traffic (that includes replies, retweets, and modified tweets from all of a user's friends) typically overwhelms the number of mentions by roughly an order of magnitude or more (see typical examples with Users-I and II above). Thus, social network traffic serves as a good influence structure to couple an HMM when the number of mentions is too small to learn a sophisticated model reliably. This is typically the case when $n$ is moderate (neither too small nor too large). For example, with Users-I and II, traffic leads to a better model fit for $n = 500$ and $n = 100$, respectively.
6. However, as $n$ increases, the more directional nature of a mention (relative to the traffic) means that mentions carry more "information" than the traffic about the capacity of a user to respond/reply conditioned on seeing a certain type of tweet from his network. This is clear from the general trend of lower AIC scores with the number of mentions than with the social network traffic for large $n$ values ($n = 750$ or $1000$).
7. For all combinations of users and $n$, the binary influence structure for the mentions results in a poorer fit than the number of mentions. This is because it is more efficient to capture the number of mentions with a one-parameter model than to expend that parameter on a binary value.
In other words, the loss in performance is due to the use of a hard decision metric (binary value), provided that the soft decision metric (the number of mentions) is captured accurately. While the three models (geometric, Poisson and shifted zeta) for the number of mentions result in comparable performance for Users-I and II, the geometric results in a superior fit for User-III. Thus, a geometric density can serve as a robust model choice for the number of mentions. Given that the shifted zeta density captures heavy tails, the above trend also suggests that the number of mentions over an inter-tweet duration is not likely to be heavy-tailed.

Performance Across Users
Given that the exponential observation density consistently under-performs in model fitting relative to a gamma density, we henceforth focus on the performance of Model D with Model B as the baseline. The performance gain is captured by the relative AIC gain metric, defined as
\[
\Delta \mathrm{AIC} \triangleq \mathrm{AIC}_{\text{Model B}} - \mathrm{AIC}_{\text{Model D}}. \qquad (6)
\]
In Fig. 5 and Table 10, $\Delta \mathrm{AIC}$ values are presented for different $n$ values for Users-I to III. As can be seen from this data, Model B performs better than Model D for $n < 550$ for User-I and Model D gets better as $n$ increases after that. For User-III, Model D is better than Model B for all $n > 200$ and the performance gain improves with increasing $n$ until $n = 800$ and then slightly decreases after that. In general, the performance gain with Model D for the typical user is negligible for small values of $n$ and this gain improves (in general) as $n$ increases. To study this aspect more carefully, we now consider a corpus of 100 users with different numbers of tweets and mentions over their periods of activity.
For all the users studied, it is observed that a local optimum (to reasonable accuracy) is achieved by the generalized Baum-Welch algorithm within 20-30 iterations, independent of the model parameter initializations. Fig. 6(a)-(b) plots the histogram of $\Delta \mathrm{AIC}$ for the corpus of 100 users with $n = 500$ and $n = 1000$, respectively. From Fig. 6, it can be seen that Model D significantly out-performs Model B for a large fraction of the users and this improvement gets better as $n$ increases. To understand this, recall that $\exp\left( -\frac{\Delta \mathrm{AIC}}{2} \right)$ is the relative likelihood that Model B minimizes the information loss relative to Model D. Thus, a $\Delta \mathrm{AIC}$ value larger than 4.61 and 9.21 leads to a relative likelihood of less than 10% and 1%, respectively. For the corpus studied here, Model D is at least 100 times as likely to minimize information loss for 25% of the users at $n = 500$ and 72% of the users at $n = 1000$. With a more relaxed benchmark, Model D is at least ten times as likely to minimize information loss for 33% and 85% of the users at $n = 500$ and $1000$, respectively. For predictive performance, analogous to $\Delta \mathrm{AIC}$ in (6), we define the relative SMAPE gain metric as
\[
\Delta \mathrm{SMAPE} \triangleq \mathrm{SMAPE}_{\text{Model B}} - \mathrm{SMAPE}_{\text{Model D}}.
\]
Fig. 6(c) plots the histogram of $\Delta \mathrm{SMAPE}$ for $n = 500$ and it can again be seen that Model D is better than Model B in terms of predictive power for a large fraction of users. Thus, a coupled HMM provides a better modeling paradigm for the activity of a large set of users in social networks.

User Clustering
After learning the model parameters for each user, we now consider the similarity between different users as implied by those parameters.
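The $\Delta \mathrm{AIC}$ thresholds of 4.61 and 9.21 quoted in the discussion of Fig. 6 follow directly from inverting $\exp(-\Delta \mathrm{AIC} / 2)$. A minimal Python sketch of that arithmetic is given below; the function names and the toy list of $\Delta \mathrm{AIC}$ values are hypothetical and are not taken from the dataset.

```python
import math
from typing import Sequence


def relative_likelihood(delta_aic: float) -> float:
    """Relative likelihood exp(-delta_aic / 2) of the baseline model."""
    return math.exp(-delta_aic / 2.0)


def delta_aic_threshold(rel_likelihood: float) -> float:
    """AIC gain at which the relative likelihood equals the given value."""
    return -2.0 * math.log(rel_likelihood)


def fraction_exceeding(delta_aics: Sequence[float], threshold: float) -> float:
    """Fraction of users whose AIC gain is larger than the threshold."""
    return sum(1 for d in delta_aics if d > threshold) / len(delta_aics)


ten_percent = delta_aic_threshold(0.10)   # 2 * ln(10)
one_percent = delta_aic_threshold(0.01)   # 4 * ln(10)

# Toy AIC gains for four hypothetical users.
toy_gains = [1.0, 5.0, 10.0, -2.0]
share = fraction_exceeding(toy_gains, ten_percent)
```

A negative gain, as for the last toy user, corresponds to a relative likelihood above one, that is, to the baseline model being preferred.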
Since the dataset from [11] has only around 200 users with sufficient activity ($n > 500$) to learn general probabilistic models where the coupled HMM parameters learned via the generalized Baum-Welch algorithm converge to local optima in the model parameter space, we focus on a corpus of 150 of these users. The mean number of tweets and mentions for this corpus is 975.40 and 629.17, respectively. The mean number of friends and followers for the corpus is 251.83 and 206.46, respectively.
With the number of mentions as the influence structure, we focus on two coupled HMM-specific parameters that capture the interaction dynamic between a user and his social network to perform user clustering in the model parameter space. The parameter $p_1 / p_0$ measures the propensity (likelihood) of a user to become active upon seeing a large number of mentions in his timeline relative to a lack of such mentions:
\[
\frac{p_1}{p_0} = \frac{P(Q_i = 1 \mid Q_{i-1} = 0, Z_i > \tau)}{P(Q_i = 1 \mid Q_{i-1} = 0, Z_i \leq \tau)}.
\]
On the other hand, the parameter $\gamma_1 / \gamma_0$ measures the propensity of the user's network to respond with a mention upon seeing activity at the user relative to his inactivity:
\[
\frac{\gamma_1}{\gamma_0} = \frac{P(Z_i > 0 \mid Q_{i-1} = 1)}{P(Z_i > 0 \mid Q_{i-1} = 0)}.
\]
Three natural clusters can be identified in the model parameter space as a function of the $\left\{ \frac{p_1}{p_0}, \frac{\gamma_1}{\gamma_0} \right\}$ values:
1) The baseline scenario where the users are not significantly influenced by their neighbors and vice versa corresponds to $\frac{p_1}{p_0} = 1 = \frac{\gamma_1}{\gamma_0}$. The users for whom $\frac{p_1}{p_0} < 1$ are the "tails" in the model space corresponding to this baseline scenario.
2) On the other hand, a large value of $\frac{p_1}{p_0}$ indicates that more mentions can induce a user to the Active state. Restated, a user can be induced to post at a higher frequency by an active social network.
3) Similarly, a large value of $\frac{\gamma_1}{\gamma_0}$ indicates that a user's social network can be induced to become active (with a larger number of mentions) by the user's activity.
Motivated by this argument of three natural clusters, Fig. 7 clusters 150 users using the $K$-means algorithm for $K = 3$. The result of this clustering is that 92 users belong to Cluster 1 centered around $\left( \frac{p_1}{p_0}, \frac{\gamma_1}{\gamma_0} \right) = (1, 1)$. Of the remaining 58 users, 35 belong to Cluster 2 where $\frac{p_1}{p_0} > 1.5$ and 23 belong to Cluster 3 where $\frac{\gamma_1}{\gamma_0} > 1.5$. To paraphrase the above discussion, Cluster 1 is made up of the majority of the users and corresponds to the baseline scenario. On the other hand, Clusters 2 and 3 consist of the users outside Cluster 1: those who are tightly knit to their network, or whose network is tightly knit to them. These users are significantly affected by social influence.
Some of the typical attributes/qualities that can best describe users in Cluster 2 are: commentarial, activist, garrulous, argumentative, opinionated, etc. Illustrating these facets, a sample activity listing over a single session of a typical user in Cluster 2 is provided in Table 11. The session begins with a question of "yeh.watching" (apparently, a cricket match between Pakistan and England) by a friend of the user. This is followed by a conversation between friends and unsolicited commentaries/observations on the ongoing match by the user-of-interest. Such active commentary is typical of this user's posting behavior. Another sample argument between two users in Cluster 2 is provided in Table 12. This argument is an exchange of political opinions with each user trying to convince the other about their respective positions. At the end of the argument, one of the users realizes and acknowledges that he has become more vocal on social media, yet also sensitive to other users' positions.
On the other hand, the social network of users in Cluster 3 shares similar attributes with the users in Cluster 2, even though the users themselves are often more reluctant to follow suit. To illustrate this subtle difference in behaviors, a sample activity listing of a typical user in Cluster 3 is presented in Table 13. Here, we see that the user's social network is strongly opinionated in response to a news story introduced by the user. Despite introducing the story, the user himself is not sufficiently polarized/aggressive in his response on social media. Similarly, after introducing another story, the user blindly agrees with other users' positions and jokes on the matter. Thus, these examples illustrate how the coupled HMM paradigm introduced in this work captures broad features of user behavior despite not capturing the textual content in any detail.

Conclusion
We have introduced a new class of coupled Hidden Markov Models to describe temporal patterns of user activity which incorporate the social effects of influence from the activity of a user's neighbors. While there have been many works on models for user activity in diverse social network settings, our work is the first to incorporate social network influence on a user's activity. We have shown that the proposed model results in better explanatory and predictive power over existing baseline models such as a renewal process-based model or an uncoupled HMM. User clustering in the model parameter space resulted in clusters with distinct interaction dynamics between users and their networks. Specifically, three clusters are identified: one corresponding to a baseline scenario of no influence of a user on his network (and vice versa), and two corresponding to significant influence of a user on his network and of the network on the user, respectively.
While our work has developed a social network-driven user activity model, it has only scratched the surface in this promising arena of research. It would be useful to pursue a more detailed study of different candidate models for the influence structures and the observations. It is also of interest to understand which type of tweets/posts (mentions, replies, retweets, undirected tweets) or the total traffic carries more "information" in developing good models for explanation and prediction at the individual scale. It would also be of interest to develop hierarchical social influence-driven models for groups of users as well as to better understand those facets of a user's social network that influence him the most. In particular, a careful study of other network influence structures (such as transfer entropy-weighted traffic [23]) that can capture the interaction dynamic between a user and his network is of importance. Combining temporal activity patterns with unstructured information such as the topic or nature of discussion, textual content, etc., could result in much better predictive performance than temporal activity alone. Further, understanding the contribution of the temporal and the textual parts of such a composite model in prediction would also be useful in understanding the limits and capabilities of activity profile modeling at the individual scale.

List of Abbreviations
1. HMM: Hidden Markov Model
2. Coupled HMM: Coupled Hidden Markov Model
3. Q-Q Plot: Quantile-Quantile Plot
4. AIC: Akaike Information Criterion
5. SMAPE: Symmetric Mean Absolute Percentage Error

Competing Interests
The authors declare that they have no competing interests.

Authors' Contributions
VR performed the numerical experiments. All the authors developed the model, designed the experiments, interpreted the results, wrote the paper, and approved the final manuscript.
Acknowledgment
This work was supported by the Defense Advanced Research Projects Agency (DARPA) under grant # W911NF-12-1-0034 at the University of Southern California. The authors would like to thank Sofus Macskassy for providing the Twitter dataset that was used in illustrating the algorithms proposed in this work.

References
1. Barabási AL: The origin of bursts and heavy tails in human dynamics. Nature 2005, 435:207–211.
2. Goh KI, Barabási AL: Burstiness and memory in complex systems. Europhysics Letters 2008, 81(48002).
3. Vázquez A, Oliveira JG, Dezsö Z, Goh K, Kondor I, Barabási AL: Modeling bursts and heavy tails in human dynamics. Physical Review E 2006, 73(3):036127.
4. Vázquez A, Racz B, Lukacs A, Barabási AL: Impact of non-Poissonian activity patterns on spreading processes. Physical Review Letters 2007, 98(15):158702.
5. Malmgren RD, Stouffer DB, Motter AE, Amaral LAN: A Poissonian explanation for heavy tails in e-mail communication. Proceedings of the National Academy of Sciences 2008, 105(47):18153–18158.
6. Malmgren RD, Stouffer DB, Campanharo A, Amaral LAN: On universality in human correspondence activity. Science 2009, 325(5948):1696–1700.
7. Malmgren RD, Hofman JM, Amaral LAN, Watts DJ: Characterizing individual communication patterns. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Paris, France 2009:607–615.
8. Stehlé J, Barrat A, Bianconi G: Dynamical and bursty interactions in social networks. Physical Review E 2010, 81(3):035101.
9. Rybski D, Buldyrev SV, Havlin S, Liljeros F, Makse HA: Communication activity in a social network: relation between long-term correlations and inter-event clustering. Scientific Reports 2012, 2:560.
10. Jo HH, Karsai M, Kertész J, Kaski K: Circadian pattern and burstiness in mobile phone communication. New Journal of Physics 2012, 14:013055.
11. Macskassy SA: On the study of social interactions in Twitter. In Proceedings of International Conference on Weblogs and Social Media, Dublin, Ireland 2012:226–233.
12. Guo L, Chen S, Xiao Z, Tan E, Ding X, Zhang X: Measurements, analysis and modeling of BitTorrent-like systems. In Proceedings of ACM SIGCOMM Internet Measurement Conference, New Orleans, LA 2005:35–48.
13. Leskovec J, Backstrom L, Kumar R, Tomkins A: Microscopic evolution of social networks. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Las Vegas, NV 2008:462–470.
14. Xiao Z, Guo L, Tracey J: Understanding instant messaging traffic characteristics. In Proceedings of IEEE International Conference on Distributed Computing Systems, Toronto, Canada 2007:51.
15. Guo L, Tan E, Chen S, Xiao Z, Zhang X: The stretched exponential distribution of Internet media access patterns. In Proceedings of ACM Symposium on Principles of Distributed Computing, Toronto, Canada 2008:283–294.
16. Guo L, Tan E, Chen S, Zhang X, Zhao Y: Analyzing patterns of user content generation in online social networks. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Paris, France 2009:369–378.
17. Jiang ZQ, Xie WJ, Li MX, Podobnik B, Zhou WX, Stanley HE: Calling patterns in human communication dynamics. Proceedings of the National Academy of Sciences 2013, 110(5):1600–1605.
18. Wu Y, Zhou C, Xiao J, Kurths J, Schellnhuber HJ: Evidence for a bimodal distribution in human communication. Proceedings of the National Academy of Sciences 2010, 107(44):18803–18808.
19. Hogg T, Lerman K: Stochastic models of user-contributory web sites. In Proceedings of the International Conference on Weblogs and Social Media, San Jose, CA 2009:50–57.
20. Romero DM, Galuba W, Asur S, Huberman BA: Influence and passivity in social media. Social Science Research Network Working Paper Series 2010.
21. Rodriguez MG, Leskovec J, Krause A: Inferring networks of diffusion and influence. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC 2010:1019–1028.
22. Zhao K, Bianconi G: Social interactions model and adaptability of human behavior. Frontiers in Physiology 2011, (2):101.
23. Steeg GV, Galstyan A: Information transfer in social media. In Proceedings of World Wide Web Conference, Lyon, France 2012:509–518.
24. Tan C, Tang J, Sun J, Lin Q, Wang F: Social action tracking via noise tolerant time-varying factor graphs. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Washington, DC 2010:1049–1058.
25. Tan C, Lee L, Tang J, Jiang L, Zhou M, Li P: User-level sentiment analysis incorporating social networks. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, San Diego, CA 2011:1397–1405.
26. Trusov M, Bodapati AV, Bucklin RE: Determining influential users in Internet social networks. Journal of Marketing Research 2010, XLVII:643–658.
27. Perra N, Gonçalves B, Pastor-Satorras R, Vespignani A: Activity driven modeling of time varying networks. Scientific Reports 2012, 2:469.
28. Rabiner LR: A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 1989, 77(2):257–286.
29. Bilmes JA: A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Tech. rep., International Computer Science Institute, Berkeley, CA 1998.
30. Pearl J: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann 1988.
31. Murphy KP: Dynamic Bayesian Networks: Representation, Inference and Learning. Tech. rep., University of California, Berkeley 2002. [Ph.D. dissertation].
32. Poritz AB: Linear predictive hidden Markov models and the speech signal. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Paris, France 1982, 7:1291–1294.
33. Juang BH, Rabiner LR: Mixture autoregressive hidden Markov models for speech signals. IEEE Transactions on Acoustics, Speech, and Signal Processing 1985, ASSP-33(6):1404–1413.
34. Kenny P, Lennig M, Mermelstein P: A linear predictive HMM for vector-valued observations with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 1990, 38(2):220–225.
35. Cacciatore T, Nowlan SJ: Mixtures of controllers for jump linear and non-linear plants. In Proceedings of Neural Information Processing Systems, Denver, CO 1993, 7:719–726.
36. Bengio Y, Frasconi P: Input-output HMM's for sequence processing. IEEE Transactions on Neural Networks 1996, 7(5):1231–1249.
37. Ghahramani Z, Jordan MI: Factorial hidden Markov models. Machine Learning 1997, 29:1–31.
38. Brand M, Oliver N, Pentland A: Coupled hidden Markov models for complex action recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico 1997:994–999.
39. Pavlovic V: Dynamic Bayesian networks for information fusion with applications to human-computer interfaces. Tech. rep., University of Illinois, Urbana-Champaign, IL 1999. [Ph.D. dissertation].
40. Kwon J, Murphy KP: Modeling freeway traffic with coupled HMMs. Tech. rep., University of California, Berkeley, CA 2000.
41. Chu S, Huang T: Bimodal speech recognition using coupled hidden Markov models. In Proceedings of IEEE International Conference on Spoken Language Processing, Beijing, China 2000, 2:747–750.
42. Chu S, Huang T: Audio-visual speech modeling using coupled hidden Markov models. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL 2002:2009–2012.
43. Zhong S, Ghosh J: HMMs and coupled HMMs for multi-channel EEG classification. In Proceedings of International Joint Conference on Neural Networks, Honolulu, HI 2002:1154–1159.
44. Dong W, Pentland A, Heller KA: Graph-coupled HMMs for modeling the spread of infection. In Proceedings of Conference on Uncertainty in Artificial Intelligence, Catalina Is., CA 2012:227–236.
45. Karsai M, Kaski K, Barabási AL, Kertész J: Universal features of correlated bursty behaviour. Scientific Reports 2012, 2:397.
46. Dempster AP, Laird NM, Rubin DB: Maximum-likelihood from incomplete data via the EM algorithm. Journal of Royal Statistical Society, Part B 1977, 39:1–38.
47. Liporace LA: Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory 1982, IT-28(5):729–734.
48. Jordan MI, Jacobs RA: Hierarchical mixtures of experts and the EM algorithm. Neural Computation 1994, 6(2):181–214.
49. Jebara T, Pentland A: Maximum conditional likelihood via bound maximization and the CEM algorithm. In Proceedings of Neural Information Processing Systems, Denver, CO 1998, 11:494–500.
50. Salojärvi J, Puolamäki K, Kaski S: Expectation maximization algorithms for conditional likelihoods. In Proceedings of the International Conference on Machine Learning, Bonn, Germany 2005, 22:753–760.
51. Gnanadesikan R: Methods for Statistical Data Analysis of Multivariate Observations. Wiley-Interscience, 2nd edition 1997.

Figures
Figure 1 - Coupled HMM
Coupled HMM framework for user activity.
€ Q 0 € Δ 1 € Δ 2 € Δ 3 € Q 1 € Q 2 € Q 3 € Z 1 € Z 2 € Z 3 Time User of interest Other users Observations Figure 2 - State transition Pictorial illustration of state-transition ev olution in the prop osed mo del. I n a ct i ve H i d d e n St a t e s O b se rva t i o n s Act i ve € p k € q k € 1 − p k € 1 − q k Po i n t p ro ce ss “l o w ” ra t e Po i n t p ro ce ss “h i g h ” ra t e € Z i I n f l u e n ce St ru ct u re 24 Figure 3 - Mo del Validation Learned transition probabilities as a function of τ for User-II I with n = 1000. Figure 4 - Q-Q Plots Q-Q plots of in ter-tw eet duration in (a)-(d) A ctive and (e)-(h) Inactive states for User-II I under four differen t mo dels. (a) (b) (c) (d) (e) (f ) (g) (h) 25 Figure 5 - Perfo rmance Imp rovement AIC impro vemen t for Model D relativ e to Mo del B . Figure 6 - Histograms Histogram of AIC gain with coupled HMM for a corpus of 100 users with (a) n = 500 and (b) n = 1000 observ ations. (c) Histogram of SMAPE gain with coupled HMM for n = 500. (a) (b) (c) 26 Figure 7 - User Clustering Clustering 150 users into three clusters based on learned mo del parameter v alues, p 1 /p 0 and γ 1 /γ 0 . 27 T ables T able 1 - Observation Mo dels Mo dels for observ ation giv en state Q i = j, j ∈ { 0 , 1 } . Name Densit y function ( f j (∆ i )) Baum-W elch parameter estimate Exp onen tial ρ j · exp( − ρ j · ∆ i ) b ρ j = N − 1 P i =1 ξ i ( j, 0)+ ξ i ( j, 1) N − 1 P i =1 ∆ i · ξ i ( j, 0)+ ξ i ( j, 1) Gamma 1 Γ( k j ) λ k j j · ∆ k j − 1 i exp − ∆ i λ j b k j and b λ j solv e: k j λ j = N − 1 P i =1 ∆ i · ξ i ( j, 0)+ ξ i ( j, 1) N − 1 P i =1 ξ i ( j, 0)+ ξ i ( j, 1) where Γ( x ) is the Gamma function log( λ j ) + Ψ( k j ) = N − 1 P i =1 log(∆ i ) · ξ i ( j, 0)+ ξ i ( j, 1) N − 1 P i =1 ξ i ( j, 0)+ ξ i ( j, 1) where Ψ( x ) = Γ 0 ( x ) Γ( x ) is the di-gamma function T able 2 - Influence Structure Mo dels Mo dels for influence structure giv en state Q i − 1 = j, j ∈ { 0 , 1 } . 
| Name | Density function ($g_j(Z_i)$) | Baum-Welch parameter estimate |
|---|---|---|
| Binary mentions | $P(Z_i = k \mid Q_{i-1} = j) = 1 - \gamma_j$ if $k = 0$, and $\gamma_j$ if $k = 1$ | $\tilde{\gamma}_j = \frac{\sum_{i=1,\, i:\, Z_i = 1}^{N} \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}$ |
| No. of mentions: Geometric | $P(Z_i = k \mid Q_{i-1} = j) = (1 - \gamma_j) \cdot \gamma_j^k$, $k \geq 0$ | $\tilde{\gamma}_j = \frac{\sum_{i=1}^{N} Z_i \cdot \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} (Z_i + 1) \cdot \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}$ |
| No. of mentions: Poisson | $P(Z_i = k \mid Q_{i-1} = j) = \frac{\gamma_j^k \cdot \exp(-\gamma_j)}{\Gamma(k+1)}$, $k \geq 0$ | $\tilde{\gamma}_j = \frac{\sum_{i=1}^{N} Z_i \cdot \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}$ |
| No. of mentions: Shifted zeta | $P(Z_i = k \mid Q_{i-1} = j) = \frac{(1+k)^{-\gamma_j}}{\zeta(\gamma_j)}$, $k \geq 0$ | $\tilde{\gamma}_j$ solves: $-\frac{\zeta'(\gamma_j)}{\zeta(\gamma_j)} = \frac{\sum_{i=1}^{N} \log(1 + Z_i) \cdot \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} \left( \tilde{\xi}_{i-1}(j,0) + \tilde{\xi}_{i-1}(j,1) \right)}$ |

Here $\zeta(x)$ is the Riemann-zeta function.

Table 3 - Conditional Dependency Assumptions
Conditional dependencies of the involved variables in different HMM architectures.

| First-Order AR-HMM [32–34] | IO-HMM [36] | Coupled Factorial HMM [37] | Coupled HMM [38, 43] | Proposed Model |
|---|---|---|---|---|
| $\{Q_i, \Delta_{i-1}\} \to \Delta_i$ | $\{Z_i, Q_i\} \to \Delta_i$ | $\{Z_i, Q_i\} \to \Delta_i$ | $Q_i \to \Delta_i$ | $Q_i \to \Delta_i$ |
| - | $Z_{i-1} \to Z_i$ | $Z_{i-1} \to Z_i$ | $\{Q_{i-1}, Z_{i-1}\} \to Z_i$ | $Q_{i-1} \to Z_i$ |
| $Q_{i-1} \to Q_i$ | $\{Q_{i-1}, Z_i\} \to Q_i$ | $\{Q_{i-1}, Z_{i-1}\} \to Q_i$ | $\{Q_{i-1}, Z_{i-1}\} \to Q_i$ | $\{Q_{i-1}, Z_i\} \to Q_i$ |

Table 4 - Number of Parameters
Number of parameters describing different models with $k$ states, $\ell$ network influence structure levels and an $m$ parameter observation density in each state. Number of parameters for $k = \ell = 2$ and $m = 1$ or $m = 2$ are also provided.

| Model | Trans. prob. matrix | Obs. density | Influence structure | Total | $m = 1$ | $m = 2$ |
|---|---|---|---|---|---|---|
| Conventional HMM | $k(k-1)$ | $km$ | - | $k(k+m-1)$ | 4 | 6 |
| First-Order AR-HMM | $k(k-1)$ | $k(m+1)$ | - | $k(k+m)$ | 6 | 8 |
| IO-HMM | $\ell k(k-1)$ | $\ell k m$ | $\ell(\ell-1)$ | $\ell\left( k(k+m-1) + \ell - 1 \right)$ | 10 | 14 |
| Coupled Fact. HMM | $\ell k(k-1)$ | $\ell k m$ | $\ell(\ell-1)$ | $\ell\left( k(k+m-1) + \ell - 1 \right)$ | 10 | 14 |
| Coupled HMM [38] | $(\ell+k)(k-1)$ | $km$ | $(\ell+k)(\ell-1)$ | $(\ell+k)(\ell+k-2) + km$ | 10 | 12 |
| Coupled HMM [43] | $2(\ell+k)(k-1)$ | $km$ | $2(\ell+k)(\ell-1)$ | $2(\ell+k)(\ell+k-2) + km$ | 18 | 20 |
| Proposed Model | $\ell k(k-1)$ | $km$ | $k(\ell-1)$ | $k(\ell k + m - 1)$ | 8 | 10 |

Table 5 - Algorithm Derivation
Steps in generalized Baum-Welch and Viterbi algorithms. Here, $a$, $b$ and $c$ are state variable values and $i$ is the time-index.

Steps in generalized Baum-Welch algorithm:

$$\tilde{\xi}_i(a,b) = \frac{\tilde{\alpha}_i(a)\, P(Z_{i+1} \mid Q_i = a)\, P(Q_{i+1} = b \mid Q_i = a, Z_{i+1})\, P(\Delta_{i+1} \mid Q_{i+1} = b)\, \tilde{\beta}_{i+1}(b)}{\sum_{a,b} \tilde{\alpha}_i(a)\, P(Z_{i+1} \mid Q_i = a)\, P(Q_{i+1} = b \mid Q_i = a, Z_{i+1})\, P(\Delta_{i+1} \mid Q_{i+1} = b)\, \tilde{\beta}_{i+1}(b)}, \quad \text{where} \tag{7}$$

$$\tilde{\alpha}_i(a) \triangleq P(\Delta_1^i, Z_1^i, Q_i = a), \quad 2 \leq i \leq N-1 \tag{8}$$

$$= \left[ \sum_b \tilde{\alpha}_{i-1}(b)\, P(Z_i \mid Q_{i-1} = b)\, P(Q_i = a \mid Q_{i-1} = b, Z_i) \right] P(\Delta_i \mid Q_i = a) \tag{9}$$

$$\tilde{\alpha}_1(a) = \left[ \sum_b \pi_b\, P(Z_1 \mid Q_0 = b)\, P(Q_1 = a \mid Q_0 = b, Z_1) \right] P(\Delta_1 \mid Q_1 = a) \tag{10}$$

$$\tilde{\beta}_i(b) \triangleq P(\Delta_{i+1}^N, Z_{i+1}^N \mid Q_i = b), \quad 2 \leq i \leq N-1 \tag{11}$$

$$= P(Z_{i+1} \mid Q_i = b) \left[ \sum_c P(Q_{i+1} = c \mid Q_i = b, Z_{i+1})\, P(\Delta_{i+1} \mid Q_{i+1} = c)\, \tilde{\beta}_{i+1}(c) \right] \tag{12}$$

$$\tilde{\beta}_N(b) = 1. \tag{13}$$

Steps in generalized Viterbi algorithm:

$$\delta_1(a) \triangleq P(Q_1 = a, \Delta_1, Z_1 \mid \lambda) \tag{14}$$

$$= \left[ \sum_b P(Q_0 = b)\, P(Z_1 \mid Q_0 = b)\, P(Q_1 = a \mid Q_0 = b, Z_1) \right] \cdot P(\Delta_1 \mid Q_1 = a) \tag{15}$$

$$\phi_1(a) = 0 \tag{16}$$

$$\delta_{i+1}(a) \triangleq \max_{Q_1^i} \left[ P(Q_1^i, Q_{i+1} = a, \Delta_1^{i+1}, Z_1^{i+1} \mid \lambda) \right], \quad 1 \leq i \leq N-1 \tag{17}$$

$$= \max_b \left[ \delta_i(b)\, P(Z_{i+1} \mid Q_i = b)\, P(Q_{i+1} = a \mid Q_i = b, Z_{i+1}) \right] P(\Delta_{i+1} \mid Q_{i+1} = a) \tag{18}$$

$$\phi_{i+1}(a) = \arg\max_b \left[ \delta_i(b)\, P(Z_{i+1} \mid Q_i = b)\, P(Q_{i+1} = a \mid Q_i = b, Z_{i+1}) \right]. \tag{19}$$

Table 6 - Conditional Correlation Coefficient
Conditional correlation coefficient $\rho(Z_i, Z_{i-1} \mid Q_{i-1})$ for Users-I to III in Active and Inactive states.

| User | No. of samples ($Q_{i-1} = 1$) | Corr. coef. ($Q_{i-1} = 1$) | $p$-value ($Q_{i-1} = 1$) | No. of samples ($Q_{i-1} = 0$) | Corr. coef. ($Q_{i-1} = 0$) | $p$-value ($Q_{i-1} = 0$) |
|---|---|---|---|---|---|---|
| User-I | 660 | 0.1127 | 0.0037 | 138 | −0.0244 | 0.7762 |
| User-II | 909 | 0.0590 | 0.0753 | 89 | −0.0540 | 0.6155 |
| User-III | 844 | −0.0459 | 0.1832 | 154 | 0.1306 | 0.1065 |

Table 7 - AIC Comparison for User-I
AIC scores for a typical user with different model fits. The bolded figures correspond to the model with the best fit.

| Model | $n = 100$ | $n = 250$ | $n = 500$ | $n = 750$ |
|---|---|---|---|---|
| Observation density: Exponential | | | | |
| Poisson process model | 1017.71 | 2711.88 | 5157.41 | 7556.58 |
| Conventional two-state HMM | 835.42 | 2155.51 | 3852.15 | 5630.03 |
| Coupled HMM, Binary mention | 844.19 | 2169.88 | 3867.18 | 5658.23 |
| Coupled HMM, No. mentions (Geometric) | 840.65 | 2163.20 | 3853.10 | 5621.40 |
| Coupled HMM, No. mentions (Poisson) | 840.96 | 2163.54 | 3853.57 | 5621.90 |
| Coupled HMM, No. mentions (Shifted zeta) | 841.93 | 2163.39 | 3853.83 | 5623.03 |
| Coupled HMM, Social network traffic (Geometric) | 836.22 | 2156.11 | 3951.16 | 5789.35 |
| Observation density: Gamma | | | | |
| Renewal process model | 879.02 | 2284.01 | 4214.99 | 6230.80 |
| Conventional two-state HMM | 830.66 | 2134.12 | 3806.88 | 5578.28 |
| Coupled HMM, Binary mention | 838.81 | 2148.02 | 3816.54 | 5595.53 |
| Coupled HMM, No. mentions (Geometric) | 836.36 | 2141.94 | 3808.53 | 5571.18 |
| Coupled HMM, No. mentions (Poisson) | 837.42 | 2142.66 | 3810.11 | 5572.50 |
| Coupled HMM, No. mentions (Shifted zeta) | 837.76 | 2142.13 | 3809.29 | 5584.69 |
| Coupled HMM, Social network traffic (Geometric) | 834.81 | 2149.96 | 3797.65 | 5583.40 |

Table 8 - AIC Comparison for User-II
AIC scores for a typical user with different model fits. The bolded figures correspond to the model with the best fit.

| Model | $n = 100$ | $n = 500$ | $n = 1000$ |
|---|---|---|---|
| Observation density: Exponential | | | |
| Poisson process model | 698.48 | 3444.09 | 6837.57 |
| Conventional two-state HMM | 504.20 | 2373.26 | 4343.94 |
| Coupled HMM, Binary mention | 517.35 | 2383.19 | 4354.92 |
| Coupled HMM, No. mentions (Geometric) | 504.62 | 2364.04 | 4333.98 |
| Coupled HMM, No. mentions (Poisson) | 504.42 | 2363.96 | 4333.78 |
| Coupled HMM, No. mentions (Shifted zeta) | 504.76 | 2363.94 | 4333.80 |
| Coupled HMM, Social network traffic (Geometric) | 489.87 | 2391.80 | 4406.64 |
| Observation density: Gamma | | | |
| Renewal process model | 613.98 | 2960.90 | 5691.82 |
| Conventional two-state HMM | 500.68 | 2334.20 | 4261.41 |
| Coupled HMM, Binary mention | 511.86 | 2339.44 | 4265.43 |
| Coupled HMM, No. mentions (Geometric) | 501.72 | 2320.02 | 4257.15 |
| Coupled HMM, No. mentions (Poisson) | 501.39 | 2319.73 | 4256.61 |
| Coupled HMM, No. mentions (Shifted zeta) | 501.92 | 2320.04 | 4256.93 |
| Coupled HMM, Social network traffic (Geometric) | 487.70 | 2336.06 | 4283.44 |

Table 9 - AIC Comparison for User-III
AIC scores for an extreme case of a highly active user with different model fits. The bolded figures correspond to the model with the best fit.

| Model | $n = 100$ | $n = 500$ | $n = 1000$ |
|---|---|---|---|
| Observation density: Exponential | | | |
| Poisson process model | 653.73 | 3141.36 | 6250.35 |
| Conventional two-state HMM | 479.31 | 2359.48 | 4611.80 |
| Coupled HMM, Binary mention | 495.63 | 2391.71 | 4677.30 |
| Coupled HMM, No. mentions (Geometric) | 478.69 | 2337.30 | 4552.67 |
| Coupled HMM, No. mentions (Poisson) | 483.11 | 2339.65 | 4682.30 |
| Coupled HMM, No. mentions (Shifted zeta) | 490.09 | 2370.70 | 4753.87 |
| Coupled HMM, Social network traffic (Geometric) | 499.81 | 2402.47 | 4663.76 |
| Observation density: Gamma | | | |
| Renewal process model | 585.78 | 2830.55 | 5594.71 |
| Conventional two-state HMM | 476.39 | 2290.53 | 4440.57 |
| Coupled HMM, Binary mention | 485.79 | 2318.99 | 4494.15 |
| Coupled HMM, No. mentions (Geometric) | 473.56 | 2262.10 | 4394.30 |
| Coupled HMM, No. mentions (Poisson) | 489.48 | 2262.22 | 4521.20 |
| Coupled HMM, No. mentions (Shifted zeta) | 484.44 | 2300.41 | 4469.11 |
| Coupled HMM, Social network traffic (Geometric) | 485.73 | 2313.83 | 4458.81 |

Table 10 - AIC Improvement
$\Delta$AIC for Users-I to III with different $n$ values.

| User | Small $n$ | Moderate $n$ | Large $n$ |
|---|---|---|---|
| User-I | −5.70 ($n = 100$) | −1.65 ($n = 500$) | 7.10 ($n = 750$) |
| User-II | −1.04 ($n = 100$) | 14.18 ($n = 500$) | 4.26 ($n = 1000$) |
| User-III | 2.83 ($n = 100$) | 28.43 ($n = 500$) | 46.27 ($n = 1000$) |

Table 11 - Sample Activity Listing
Sample activity of a typical user in Cluster 2.

| Posted by | Intended for | Text content |
|---|---|---|
| @Friend | @Cluster-2-user | yeh.watching |
| @Cluster-2-user | | RT @Non-Friend: In the U.S., you can text “FLOOD” to 27722 to donate $10 to the #Pakistan Relief Fund. [URL] #helppakistan |
| @Cluster-2-user | @Friend | :D |
| @Cluster-2-user | | Is it mandatory for Saeed Ajmal to bowl on short ball in every over #fail #PakCricket |
| @Friend | @Cluster-2-user | i think afridi should go for gull |
| @Cluster-2-user | | Believe me when I say that I predicted that shoaib will get yardy in this over :D #PakCricket |
| @Cluster-2-user | | That was a khooooni yorker by Umer #Gull Waqar ki yaad aa gai #PakCricket |
| @Friend | | RT @Cluster-2-user: 300 not out by Boom Boom [URL] |
| @Friend1 | @Friend2 ... @Friend8 @Cluster-2-user | Baba dam darood aka Muhammad Yousaf has taken an unbelievable catch #PakCricket |
| @Friend | @Cluster-2-user | ;-) |
| @Friend | | RT @Cluster-2-user: #blog post 300 not out by Boom Boom [URL] #PakCricket #BoomBoom #Afridi |
| @Friend | @Cluster-2-user | now what? another scandal against #pak cricket #Pakistan for winning the match |
| @Cluster-2-user | @Friend | i think the scandal is going to be damaging bat of Swan on the yorker attempt b Shabby #PakCricket |
| @Cluster-2-user | | RT @Friend: There’s always second chances... it just depends on how hard you fight for it !! |
| @Cluster-2-user | | RT @Friend: dedicated to men in Green hope this streak continues and we we the coming one as well.... [URL] |
| @Cluster-2-user | | Hay Jazba Janoon to Himmat na haar... the formula behind the success of cricket team #PakCricket |

Table 12 - Sample Argument
Sample argument between two typical users in Cluster 2.

| Posted by | Intended for | Text content |
|---|---|---|
| @Cluster-2-user-2 | @Cluster-2-user-1 | quick question...is @salmantaseer one of the bad guys?? i thought the PPP was the lesser of the evils over there.. #justwondering |
| @Cluster-2-user-1 | @Cluster-2-user-2 | PPP lesser of the evils?! Hahahaha everyone working under Zardari is as evil as Bieber can ever dream of being. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | aww..but sherry rehman is so sweet..and i love the rehman malik hairdo..and SMQ is so totally suave..btw who are the good guys?? |
| @Cluster-2-user-1 | @Cluster-2-user-2 | wow man! You’re kidding, right? |
| @Cluster-2-user-2 | @Cluster-2-user-1 | man!!why do i get the feeling that i just said like a totally not cool thing.. |
| @Cluster-2-user-1 | @Cluster-2-user-2 | Err yes. Totally uncool :P |
| @Cluster-2-user-2 | @Cluster-2-user-1 | so no good guys huh??and btw yes i do think @fbhutto is totally wannabeish... :) |
| @Cluster-2-user-1 | @Cluster-2-user-2 | Yeah, I don’t like her either. I like Imran Khan better. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | i liked him till i saw a report ver his old 92 world cup winning team members said they never liked him.. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | if he culdnt get his own team members behind him then..... |
| @Cluster-2-user-1 | @Cluster-2-user-2 | Well, can’t say anything about that. But he’s a good person, a better one in politics at least. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | well..subcontinental politics na...everyone has a murky past or present...just depends on how deep one digs..no offence intended |
| @Cluster-2-user-1 | @Cluster-2-user-2 | None taken. But look at his hospital. Despite a few controversies, I can tell you firsthand that it’s amazing. Good man. |
| @Cluster-2-user-1 | | #becauseoftwitter I’ve become more vocal in terms of venting. |
| @Cluster-2-user-1 | | #becauseoftwitter I’ve become more sensitive. I lose a follower and I go berserk. |

Table 13 - Sample Activity Listing
Sample activity of a typical user in Cluster 3.

| Posted by | Intended for | Text content |
|---|---|---|
| @Cluster-3-user | | My maid told today; somebody threw one day old daughter wrapped in a polythene bag in Filth Drum outside a house. #humanitarian |
| @Cluster-3-user | | I’ll write on this very issue; daughter, poverty or an illegitimate child was she? why our social system allows us to take these plunges? |
| @Friend-1 | @Cluster-3-user | did the baby survived? what has went wrong with people, they are becoming so heartless. :( |
| @Cluster-3-user | @Friend-1 | haan she survived and somebody has adopted her.. only there.. but it is heart breaking.. |
| @Friend-1 | @Cluster-3-user | oh Thanks God. Yes it is really heart breaking. |
| @Friend-2 | @Cluster-3-user | alive ???? =O |
| @Cluster-3-user | @Friend-2 | yeah alive.. an issueless couple has adopted her |
| @Friend-2 | @Cluster-3-user | Thank God! .... thts literally inhuman act! I mean the adopting parents should atleast get back to her parents and hang them! |
| @Cluster-3-user | @Friend-2 | where would they find them? to hang |
| @Cluster-3-user | | Half of Pakistanis say match fixing allegations against cricketers untrue [URL] #Cricket #Pakistan |
| @Friend-3 | @Cluster-3-user | which means half of Pakistan believes the allegations of #Cricket #MatchFixing against the #Pakistan team to be true #Honesty #ICC |
| @Cluster-3-user | @Friend-3 | ha ha ha yeah.. agreed |
| @Friend-4 | @Cluster-3-user | The proper way of match fixing is thru umpires, didn t give collingwood out he made a big partnership & gave well set akmal out |
| @Cluster-3-user | @Friend-4 | true.. collingwood has fixed match.. LOL |
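
The generalized forward and Viterbi recursions listed in Table 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes two states, the exponential observation density of Table 1 and the binary-mention influence structure of Table 2. The function names, the `trans[z][b][a]` layout for $P(Q_i = a \mid Q_{i-1} = b, Z_i = z)$ and all numerical parameter values are hypothetical choices made for this example. The forward variables are left unscaled, so long sequences would need rescaling or log-domain arithmetic to avoid underflow.

```python
import math


def exp_density(rate, delta):
    """Exponential observation density: rate * exp(-rate * delta)."""
    return rate * math.exp(-rate * delta)


def mention_prob(gamma, b, z):
    """Binary-mention model: P(Z_i = z | Q_{i-1} = b)."""
    return gamma[b] if z == 1 else 1.0 - gamma[b]


def forward(deltas, zs, pi, gamma, trans, rates):
    """Unscaled forward variables, one list per time step (Eqs. 9-10).

    deltas[i-1], zs[i-1] : Delta_i and Z_i for i = 1..N (N >= 1)
    pi[b]                : P(Q_0 = b)
    gamma[b]             : P(Z_i = 1 | Q_{i-1} = b)
    trans[z][b][a]       : P(Q_i = a | Q_{i-1} = b, Z_i = z)  (assumed layout)
    rates[a]             : exponential rate of Delta_i in state a
    """
    states = range(len(pi))
    alphas, prev = [], pi  # using pi as "alpha_0" reproduces Eq. (10)
    for delta, z in zip(deltas, zs):
        cur = [
            sum(prev[b] * mention_prob(gamma, b, z) * trans[z][b][a]
                for b in states) * exp_density(rates[a], delta)
            for a in states
        ]
        alphas.append(cur)
        prev = cur
    return alphas


def viterbi(deltas, zs, pi, gamma, trans, rates):
    """Most likely Q_1..Q_N (Eqs. 15-19); Q_0 is summed out as in Eq. (15)."""
    states = range(len(pi))
    score = forward(deltas[:1], zs[:1], pi, gamma, trans, rates)[0]
    backptrs = []
    for delta, z in zip(deltas[1:], zs[1:]):
        new_score, ptr = [], []
        for a in states:
            cand = [score[b] * mention_prob(gamma, b, z) * trans[z][b][a]
                    for b in states]
            best = max(states, key=cand.__getitem__)
            ptr.append(best)                                   # Eq. (19)
            new_score.append(cand[best] * exp_density(rates[a], delta))  # Eq. (18)
        backptrs.append(ptr)
        score = new_score
    # Backtrack from the best final state.
    state = max(states, key=score.__getitem__)
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical parameter values, chosen only for illustration.
pi = [0.5, 0.5]
gamma = [0.2, 0.6]
trans = [[[0.9, 0.1], [0.3, 0.7]],   # rows used when Z_i = 0
         [[0.6, 0.4], [0.1, 0.9]]]   # rows used when Z_i = 1
rates = [0.1, 2.0]                   # state 0 = Inactive, state 1 = Active
deltas = [0.2, 0.5, 8.0, 12.0, 0.3]  # inter-tweet durations Delta_1..Delta_5
zs = [1, 1, 0, 0, 1]                 # binary mentions Z_1..Z_5

alphas = forward(deltas, zs, pi, gamma, trans, rates)
likelihood = sum(alphas[-1])         # P(Delta_1..N, Z_1..N)
path = viterbi(deltas, zs, pi, gamma, trans, rates)
```

Here `likelihood` is the joint probability of the observed durations and mentions under the assumed parameters, and `path` is the decoded state sequence. For a sequence this short, both can be cross-checked by enumerating every state sequence directly.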