Modeling Temporal Activity Patterns in Dynamic Social Networks

Vasanthan Raghavan*1, Greg Ver Steeg2, Aram Galstyan2, and Alexander G. Tartakovsky1

1 Department of Mathematics, University of Southern California, Los Angeles, 90089, CA, USA
2 Information Sciences Institute, University of Southern California, Marina del Rey, 90292, CA, USA

Email: Vasanthan Raghavan* - vasanthan raghavan@ieee.org; Greg Ver Steeg - gregv@isi.edu; Aram Galstyan - galstyan@isi.edu; Alexander G. Tartakovsky - tartakov@usc.edu
* Corresponding author

Abstract

The focus of this work is on developing probabilistic models for user activity in social networks by incorporating the social network influence as perceived by the user. For this, we propose a coupled Hidden Markov Model, where each user's activity evolves according to a Markov chain with a hidden state that is influenced by the collective activity of the friends of the user. We develop generalized Baum-Welch and Viterbi algorithms for model parameter learning and state estimation for the proposed framework. We then validate the proposed model using a significant corpus of user activity on Twitter. Our numerical studies show that with sufficient observations to ensure accurate model learning, the proposed framework explains the observed data better than either a renewal process-based model or a conventional uncoupled Hidden Markov Model. We also demonstrate the utility of the proposed approach in predicting the time to the next tweet. Finally, clustering in the model parameter space is shown to result in distinct natural clusters of users characterized by the interaction dynamic between a user and his network.
Keywords

Activity Profile Modeling, Twitter, Data-Fitting, Explanation, Prediction, Hidden Markov Model, Coupled Hidden Markov Model, Social Network Influence, User Clustering

1 Introduction

Social networking websites such as Facebook, Twitter, etc. have become immensely popular, with hundreds of millions of users that engage in various forms of activity on these websites. These social networks provide an unparalleled opportunity to study individual and collective behavior at a very large scale. Such studies have profound implications for wide-ranging applications such as efficient resource allocation, user-specific information dissemination, user classification, and rapid detection of anomalous behavior such as bot or compromised accounts.

The simplest model for user activity is a Poisson process, where each activity event (e.g., posting, tweeting, etc.) occurs independent of the past history at a time-independent rate. However, recent empirical evidence from multiple sources (e-mail logs, web surfing, letter correspondence, research output, etc.) suggests that human activity has distinctly non-Poissonian characteristics. In particular, the inter-event duration distribution (which is exponential for the Poisson process) has been shown to be heavy-tailed and bursty for a number of different activity types [1, 2]. Different approaches have been put forward to explain the non-Poisson nature of the activity patterns [3–10].

Nevertheless, despite significant recent progress, open questions remain. Most remarkably, on an individual scale, existing studies so far have mostly discarded the role and impact of the social network where the user activity takes place, instead describing each user via an independent stochastic process. On the other hand, it is clear that social interactions on networks affect user activity, and discarding these interactions should generally lead to sub-optimal models.
The main contribution of this paper is to develop a probabilistic model of user activity that explicitly takes into account the interaction between users by introducing a coupling between two stochastic processes. Specifically, we propose a coupled Hidden Markov Model (coupled HMM) to describe inter-connected dynamics of user activity. In our model, the individual dynamics of each user is coupled to the aggregated activity profile of his neighbors (friends or followers) in the network. While a user's activity may be preferentially affected by specific neighbors, the predictive power of the model can be substantially improved using the aggregated activity of all the neighbors. The hidden states in our model correspond to different patterns in user activity, similar to the approach suggested in [7]. However, here the state transitions are influenced by the activity of the neighbors, and in turn, the activity of the aggregated set of neighbors is influenced by the state of the given user. While many variants of the classical HMM approach exist in the literature, the key distinguishing feature of this work is the bi-directional influence between the activity of a user and his network, specifically tailored for social network applications. Nevertheless, being a variant of the conventional (uncoupled) HMM, the proposed model enjoys the same computational advantages in terms of parameter learning as [7] and other HMM variants.

We perform a number of experiments with data describing user activity traces on Twitter, and demonstrate that the proposed approach has a better performance both in terms of explaining observed data (model fitting) and predicting future activity (generalization). In particular, we report statistically significant improvement over two baseline approaches, a renewal process-based model and a conventional HMM.
Furthermore, we use the learned models to cluster users, and find that the resulting cluster structure allows intuitive characterization of the users in terms of the interaction dynamics between a user and his social network.

The rest of the paper is organized as follows. Section 2 reviews related work in activity profile modeling, especially as applicable in social network settings. Section 3 proposes a coupled HMM framework for the activity profile of users and distinguishes the proposed model from prior work. Section 4 develops statistical methodologies to learn model parameters and how to use the proposed framework for data-fitting and forecasting. Section 5 validates the modeling assumptions and illustrates the utility and efficacy of the proposed approach with data from [11]. Concluding remarks are provided in Section 6.

Without any prejudice, we use the male gender-specific connotations for all the users and their attributes in this work.

Related Work

Recent research on social networks has focused on understanding the properties of networks induced by social interactions, modeling information diffusion on such networks, characterizing their evolution in time, etc. In the direction of modeling the temporal activity of users' communication in such networks, several models have been proposed in the literature. Approaches based on simple point process models have been proposed for user requests in a peer-to-peer setting [12] and social network evolution [13].
In addition to the one-parameter exponential observation density for user activity utilized in [6], more general two-parameter models such as the Weibull (or stretched exponential) have been proposed for modeling inter-post duration in the context of instant-messaging networks [14], accessing patterns in Internet-media [15], understanding inter-post dynamics of original content in general online social networks [16], and inter-call duration in cell-phone networks [17].

To explain the bursty features of human dynamics, [1–4] suggested the priority selection/queue mechanism. An alternative mechanism motivated by circadian and weekly cycles of human activity and captured by cascading non-homogeneous Poisson processes was suggested in [5, 6] for the heavy tails in inter-event durations. Although this model has been shown to be consistent with empirical observations, it is computationally intensive in terms of parameter estimation. To overcome this issue, Malmgren et al. [7] suggested a simpler two-state HMM for the activity of users in an email/communication network where the states reflect a measure of the user's activity. Similar models, suggestive of a few states determining activity patterns, have been considered for short message correspondence in [18] and Digg activity in [19]. Other work has also emphasized the importance of distinguishing active versus inactive users in activity and influence modeling tasks [20].

An important feature that characterizes the priority selection mechanism is that human behavioral patterns are driven by responses to/from others. On the other hand, the line of work motivated by the cascading non-homogeneous Poisson process-based models explains activity patterns due to other mechanisms such as circadian and weekly cycles, task repetition, and changing communication needs.
Models that traverse between these two extremes and incorporate the influence of a user's social network on activity have not been studied in extensive detail in the literature. Some of the examples of such a bridging effort include user-generated activity traces used for inferring underlying social relationships [21–23], incorporating similarities in the behavior of users in a specific user's social network to model his actions [24, 25], a Poisson regression model to determine the users that most influence a given user [26], and a user activity-driven (rather than connectivity-driven) model for social network evolution [27]. Notwithstanding the fact that similar ideas have been sporadically pursued in other settings, our work provides the missing link for a network-driven approach to capturing individual behavioral patterns.

While the theory of classical HMMs is well-developed [28, 29], HMMs are ill-suited in settings where multiple processes interact with each other and/or information about the history of the process needed for future inferencing is not reflected in the current state. To incorporate complicated dependencies between several interacting variables, many variants of the classical HMM set-up that are special cases of the more general theory of dynamic Bayesian networks [30, 31] have been introduced in the literature. Some of these extensions include the class of autoregressive HMMs [32–34], input-output HMMs [35, 36], factorial HMMs [37], and coupled HMMs. Different variants of coupled HMMs have been used in diverse settings including models for complex human actions and behaviors [38, 39], freeway traffic [40], audio-visual speech [41, 42], EEG classification [43], spread of infection in social networks [44], etc.

Modeling Activity Profile of Twitter Users

Let $T_i$, $i = 0, 1, \cdots, N$ denote the time-stamps of a specific user's tweets over a given period-of-interest.
We can equivalently define the inter-tweet duration $\Delta_i$ as $\Delta_i \triangleq T_i - T_{i-1}$, $i = 1, 2, \cdots, N$. One of the main goals of this work is to develop a mathematical model for $\{\Delta_i\} \triangleq \Delta_1^N = [\Delta_1, \cdots, \Delta_N]$. Along the lines of [7], we start by developing a simplistic $k = 2$-state HMM for $\{\Delta_i\}$.

Influence-Free Hidden Markov Modeling

Assumption 1 – Underlying States: We assume that a variable $Q_i$, taking one of two possible values $\{0, 1\}$, reflects the state of the user-of-interest. Specifically, $Q_i = 0$ denotes that the user is in an Inactive state between $T_{i-1}$ and $T_i$, whereas $Q_i = 1$ denotes that the user is in an Active state. We also assume that $Q_i$ ($i \geq 1$) evolves in a time-homogeneous Markovian manner and is dependent only on $Q_{i-1}$ and is conditionally independent of $Q_0^{i-2} = [Q_0, \cdots, Q_{i-2}]$ given $Q_{i-1}$. A first-order Markovian model is a reasonable first approximation capturing short-term memory in human behavioral dynamics [2, 45]. The state transition probability matrix $P = \{P[m, n]\}$ is given as
\[
P = \begin{bmatrix} 1 - \beta_{0,1} & \beta_{0,1} \\ \beta_{1,0} & 1 - \beta_{1,0} \end{bmatrix},
\]
with $P[m, n] = P(Q_i = n \mid Q_{i-1} = m)$, $m, n \in \{0, 1\}$. The density of the initial state $Q_0$ is denoted as $P(Q_0 = j) = \pi_j$, $j = 0, 1$.

Note that the switching from the Inactive state to the Active state in the HMM paradigm can capture the nocturnal/work-home patterns of individual users without any further explicit modeling [7]. Further, explicit modeling of circadian and weekly cycles in social network settings is more difficult than for email communications due to the "often on" more random nature of social network interactions.

Assumption 2 – Observation Density: In general, $Q_i$ is hidden (unobservable) and we can only observe $\{\Delta_i\}$ (or equivalently, $\{T_i\}$). In the Inactive state, $\{\Delta_i\}$ form samples from a "low"-rate point process, whereas in the Active state, $\{\Delta_i\}$ form samples from a "high"-rate point process.
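To make Assumptions 1 and 2 concrete, the following is a minimal simulation sketch of the two-state model. It assumes exponential observation densities (one rate per state); the function name, the argument names, and the numerical values in the usage lines are hypothetical and chosen only for illustration.

```python
import random

def simulate_two_state_hmm(n, beta01, beta10, rate_inactive, rate_active,
                           pi_active=0.5, seed=0):
    """Draw hidden states Q_1..Q_n and inter-tweet durations Delta_1..Delta_n."""
    rng = random.Random(seed)
    # switch[m] is the probability of leaving state m (off-diagonal entry of P).
    switch = (beta01, beta10)
    # Assumption for this sketch: exponential durations, one rate per state.
    rates = (rate_inactive, rate_active)
    state = 1 if rng.random() < pi_active else 0   # initial state Q_0
    states, durations = [], []
    for _ in range(n):
        if rng.random() < switch[state]:
            state = 1 - state                      # Q_i depends only on Q_{i-1}
        states.append(state)
        durations.append(rng.expovariate(rates[state]))
    return states, durations

# Illustrative values only: a slow Inactive state and a fast Active state.
states, durations = simulate_two_state_hmm(
    n=1000, beta01=0.1, beta10=0.2, rate_inactive=0.01, rate_active=1.0)
```

Because only the durations would be observed in practice, the returned `states` play the role of the hidden sequence that the learning and state-estimation procedures try to recover.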
Specifically, let the probability density function of $\Delta_i$ be given as
\[
\Delta_i \sim \begin{cases} f_1(\cdot) & \text{if } Q_i = 1 \\ f_0(\cdot) & \text{if } Q_i = 0, \end{cases}
\]
for an appropriate choice of $f_0(\cdot)$ and $f_1(\cdot)$.

As mentioned earlier, an exponential model for $f_{\cdot}(\cdot)$ corresponds to a Poisson process assumption under either state. While the exponential model is captured by a single parameter (see Table 1 for details), this simplicity often constrains the model fit either in the small inter-tweet (bursty) regime or large inter-tweet regime (tails). Two-parameter extensions of the exponential such as the gamma or Weibull density allow a better fit in these two regimes. Both the gamma and the Weibull models result in similar modeling fits. While the gamma model allows for simple parameter estimate formulas (see Table 1), the Weibull model requires the solving of coupled equations in the model parameters (often, a numerically intensive procedure). Thus, we will restrict attention to the exponential and gamma model choices in this work.

Influence-Driven Hidden Markov Modeling

A more sophisticated influence-driven model is developed now by making the following additional assumptions:

Assumption 3 – Influence of Neighbors: In addition to $Q_{i-1}$, the evolution of $Q_i$ is also influenced by the aggregated activity of all the users interacting with and influencing the user-of-interest ("neighbors," for short). While neighbors is a broad rubric, we restrict attention to the friends and followers in this work. For example, a series of tweets from the neighbors can result in a reply/retweet by the user, or a long period of non-activity from the neighbors could induce the user to initiate a burst of activity. Let the variable $Z_i$ ($i = 1, \cdots, N$) capture the influence of the neighbors' tweets on the user-of-interest. Examples of candidate influence structures (information theoretically, $Z_i$ is viewed as a side-information metric) include:

1. A binary indicator function that reflects whether there was a mention of the user between $T_{i-1}$ and $T_i$ (or not);
2. The number of such mentions;
3. Total traffic (aggregated activity) of the friends of the user, etc.

Motivated by the same short-term memory assumption as before, the coupling between $\{Q_i\}$ and $\{Z_i\}$ is simplified by the Markovian condition that $P(Q_i \mid Q_1^{i-1}, Z_1^i) = P(Q_i \mid Q_{i-1}, Z_i)$. In general, to keep computational requirements in inferencing low, it is helpful to assume that the evolution of $Q_i$ is captured by a summary statistic $\phi(Z_i) : Z_i \mapsto [0, 1]$ such that
\[
P(Q_i \mid Q_{i-1}, Z_i) = P_0(Q_i \mid Q_{i-1}) \cdot (1 - \phi(Z_i)) + P_1(Q_i \mid Q_{i-1}) \cdot \phi(Z_i)
\]
with $P_k[m, n] = P_k(Q_i = n \mid Q_{i-1} = m)$ where
\[
P_0 = \begin{bmatrix} 1 - p_0 & p_0 \\ q_0 & 1 - q_0 \end{bmatrix} \quad \text{and} \quad P_1 = \begin{bmatrix} 1 - p_1 & p_1 \\ q_1 & 1 - q_1 \end{bmatrix}.
\]
In particular, the choice $\phi(Z_i) = \mathbb{1}(Z_i > \tau)$ for a suitable threshold $\tau$ implies that the user switches from the transition probability matrix $P_0$ to $P_1$ depending on the magnitude of the influence structure. To paraphrase, the user evolves according to a baseline dynamics corresponding to $P_0$ if his network activity is below a certain threshold and evolves according to an elevated dynamics corresponding to $P_1$ if his network activity exceeds that threshold. The discussion on model learning elaborates on a simple method to determine the appropriate choice of $\tau$. The discussion on model validation provides some empirical justification for this assumption.

Assumption 4 – Evolution of Influence Structure: Noting that $Z_i$ is a function of the activity of all the neighbors (and not a specific user), we hypothesize that $Z_i$ is dependent on $Q_{i-1}$, but only weakly dependent on $Z_{i-1}$. Motivated by this thinking, we make the simplistic assumption that
\[
P(Z_i \mid Q_1^{i-1}, Z_1^{i-1}) = P(Z_i \mid Q_{i-1}). \tag{1}
\]
Rephrasing, (1) presumes that user aggregation de-correlates $Z_i$ from its past history. While the above assumption can be justified under certain scenarios (see the discussion under model validation), more general influence evolution models need to be considered and the loss in explanatory/predictive power by making the simplistic assumption in (1) needs to be studied carefully. This is the subject of ongoing work. Further, let the probability density function of $Z_i$ be given as
\[
Z_i \sim \begin{cases} g_0(\cdot) & \text{if } Q_{i-1} = 0 \\ g_1(\cdot) & \text{if } Q_{i-1} = 1. \end{cases}
\]
Different candidates for $g_j(\cdot)$ are illustrated in Table 2.

Combining the above four assumptions, the joint density of the observations $\{\Delta_i\}$, the influence structure $\{Z_i\}$, and the state $\{Q_i\}$ can be simplified as
\begin{align}
P(\Delta_1^N, Z_1^N, Q_0^N) &= P(Q_0, Z_1, Q_1, \Delta_1, \cdots, Z_N, Q_N, \Delta_N) \tag{2} \\
&= P(Q_0) \prod_{i=1}^{N} P(Z_i \mid Q_{i-1}) \prod_{i=1}^{N} P(Q_i \mid Q_{i-1}, Z_i) \prod_{i=1}^{N} P(\Delta_i \mid Q_i). \tag{3}
\end{align}
The dependence relations that drive the coupled HMM framework for user activity are illustrated in Figs. 1 and 2.

Comparison with Related Models and Architectures

Many extensions to the classical HMM architecture have been proposed in the literature for inferencing problems in different settings. We now compare the proposed coupled HMM architecture with some of these extensions and variants. The conditional dependencies of the involved variables corresponding to the different architectures relative to the proposed model in this work are summarized in Table 3. The number of model parameters for these architectures with $k$ states, $\ell$ network influence structure levels, and $m$ parameters for the observations are presented in Table 4.

The simplest extension of the HMM architecture, an autoregressive HMM [33], ties the evolution of the user's inter-tweet duration to his state and his past observations.
Thus, this model does not incorporate the influence of the user's network on his activity. A more general framework called a class of input-output HMMs is proposed in [35] and [36], where the user's inter-tweet duration is not only dependent on his state, but also on an external input such as the network influence structure. In contrast, a coupled variation of the factorial HMM in [37] takes the viewpoint of the network influence structure being an additional state rather than an external input. In either model, the current state of the user depends not only on his past state, but also on the state of his network. However, in both models, the evolution of the user's network is one-sided and independent of the user's interaction.

The case of a fully-coupled HMM overcomes this one-sided evolution by ensuring that the state of the user and his network influence each other. The price to pay for such generality (lack of structure in the conditional dependencies) is that the model parameters have to be learned via approximation algorithms instead of iterative techniques. To overcome this difficulty, a structured architecture is proposed in [38], where
\begin{align*}
P(Q_i \mid \{Q_{i-1}, Z_{i-1}\}) &= P(Q_i \mid Q_{i-1}) \cdot P(Q_i \mid Z_{i-1}) \\
P(Z_i \mid \{Q_{i-1}, Z_{i-1}\}) &= P(Z_i \mid Q_{i-1}) \cdot P(Z_i \mid Z_{i-1}).
\end{align*}
Similarly, [43] proposes another architecture, where
\begin{align*}
P(Q_i \mid \{Q_{i-1}, Z_{i-1}\}) &= \beta_1 \cdot P(Q_i \mid Q_{i-1}) + \beta_2 \cdot P(Q_i \mid Z_{i-1}) \\
P(Z_i \mid \{Q_{i-1}, Z_{i-1}\}) &= \gamma_1 \cdot P(Z_i \mid Q_{i-1}) + \gamma_2 \cdot P(Z_i \mid Z_{i-1})
\end{align*}
for appropriate normalization constants $\{\beta_i\}$ and $\{\gamma_i\}$. While parameter learning algorithms simplify in either scenario, from a social network perspective, the number of model parameters remains large even for small values of $k$ and $\ell$ relative to the proposed coupled HMM architecture in this work.
For example, these structured coupled HMMs are described by 10 and 18 model parameters with an $m = 1$ parameter observation density in each state (and 12 and 20 parameters with $m = 2$) relative to 8 and 10 model parameters with the proposed model in the same settings (see Table 4 for details).

In the backdrop of this discussion, the proposed coupled HMM architecture offers a principled, novel and easily-motivated modeling framework, specifically useful in social network contexts. Despite being simple, it offers significant performance gains over the classical HMM architecture and its variants. As the subsequent discussion also shows, the proposed architecture has the added benefit of allowing model learning via simple re-estimation formulas.

Methodology

Learning Model Parameters

It is of interest to infer the underlying states $\{Q_i\}$ that cannot be observed directly. This task is performed with the aid of the observations $\{\Delta_i\}$ in the HMM setting, and with the aid of $\{\Delta_i\}$ and the influence structure $\{Z_i\}$ in the coupled HMM setting.

In the HMM setting, a locally optimal choice of model parameters is sought to maximize the likelihood function $P(\Delta_1^N \mid \lambda)$. Following the result in [46, 47], starting with an initial choice of HMM parameters $\bar{\lambda}$, the model parameters are re-estimated to maximize Baum's auxiliary function $Q(\lambda, \bar{\lambda})\big|_{\mathrm{HMM}}$, defined as
\[
Q(\lambda, \bar{\lambda})\big|_{\mathrm{HMM}} \triangleq \sum_{Q_0^N} \log\left( P(\Delta_1^N, Q_0^N \mid \lambda) \right) \cdot P(\Delta_1^N, Q_0^N \mid \bar{\lambda}).
\]
It can be easily checked [28, 29] that this maximization breaks into a term-by-term optimization of individual model parameters. The model re-estimation formulas are given by the Baum-Welch algorithm (the summation index is written as $n$ here to keep it distinct from the state indices $i, j$):
\[
\widehat{\beta}_{i,j} = \frac{\sum_{n=1}^{N} \xi_{n-1}(i, j)}{\sum_{n=1}^{N} \left( \xi_{n-1}(i, 0) + \xi_{n-1}(i, 1) \right)}, \quad i \neq j, \; i, j \in \{0, 1\}.
\]
The Baum-Welch estimates of the parameters defining the different observation densities are presented in Table 1.
In these equations, $\xi_i(a, b)$ for $i = 0, \cdots, N-1$ is defined as
\[
\xi_i(a, b) \triangleq P(Q_i = a, Q_{i+1} = b \mid \Delta_1^N, \bar{\lambda}).
\]
The update equation for $\xi_i(a, b)$ follows from [28, Eq. (37)] and the forward-backward procedure.

In the coupled HMM setting, as in [36], we are interested in model parameters that maximize the conditional likelihood function $P(\Delta_1^N \mid Z_1^N, \lambda)$. If the conditional likelihood is known in closed-form (as in the input-output HMM case [36]) or a tight lower bound to it is known, a conditional expectation maximization algorithm along the lines of [48–50] can be pursued. In the proposed coupled HMM setting, the conditional likelihood appears to admit neither a simple formula nor a tight lower bound. To overcome this technical difficulty, we now propose a two-step procedure to learn an estimate of the model parameters that maximize $P(\Delta_1^N \mid Z_1^N, \lambda)$.

In the first step, we fix the threshold $\tau$ that distinguishes the baseline dynamics $P_0$ from the elevated dynamics $P_1$ to an appropriate choice $\tau_{\mathrm{init}}$. We then treat $\Delta_1^N$ and $Z_1^N$ as training observations and consider a generalized auxiliary function $Q(\lambda, \bar{\lambda})\big|_{\mathrm{CHMM}}$ of the form:
\[
Q(\lambda, \bar{\lambda})\big|_{\mathrm{CHMM}} \triangleq \sum_{Q_0^N} \log\left( P(\Delta_1^N, Z_1^N, Q_0^N \mid \lambda) \right) \cdot P(\Delta_1^N, Z_1^N, Q_0^N \mid \bar{\lambda}). \tag{4}
\]
As before, $\lambda$ and $\bar{\lambda}$ denote the optimization variable corresponding to the parameter space (all the parameters except for $\tau$) and its initial estimate, respectively. A straightforward extension of the proof in [46, 47] shows that maximizing $Q(\lambda, \bar{\lambda})\big|_{\mathrm{CHMM}}$ in the $\lambda$ variable results in a local maximization of the joint likelihood function $P(\Delta_1^N, Z_1^N \mid \lambda)$. To obtain analogous re-estimation formulas for an iterative solution to a local maximum, we define the equivalent intermediate variable $\widetilde{\xi}_i(a, b)$ for $i = 0, \cdots, N-1$:
\[
\widetilde{\xi}_i(a, b) \triangleq P(Q_i = a, Q_{i+1} = b \mid \Delta_1^N, Z_1^N, \bar{\lambda}).
\]
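For the uncoupled HMM, the pairwise posteriors $\xi_i(a, b)$ and the transition re-estimation formula can be sketched as follows. This is a minimal illustration of the classical scaled forward-backward pass, not the generalized procedure of Table 5; it assumes exponential observation densities, and the function names and the example numbers are hypothetical.

```python
import math

def pairwise_posteriors(durations, P, pi, rates):
    """Scaled forward-backward pass for an uncoupled HMM.

    Returns (xi, loglik), where xi[i][a][b] is
    P(Q_i = a, Q_{i+1} = b | durations) for i = 0..N-1.
    """
    N, k = len(durations), len(pi)
    # B[i][j] = f_j(Delta_{i+1}); exponential densities are an assumption here.
    B = [[rates[j] * math.exp(-rates[j] * d) for j in range(k)] for d in durations]
    alpha = [list(pi)]            # alpha[0] is the law of the initial state Q_0
    scale = [1.0]
    for i in range(N):
        raw = [sum(alpha[i][m] * P[m][j] for m in range(k)) * B[i][j]
               for j in range(k)]
        c = sum(raw)              # c = P(Delta_{i+1} | Delta_1..Delta_i)
        scale.append(c)
        alpha.append([x / c for x in raw])
    beta = [[1.0] * k for _ in range(N + 1)]
    for i in range(N - 1, -1, -1):
        beta[i] = [sum(P[m][j] * B[i][j] * beta[i + 1][j] for j in range(k))
                   / scale[i + 1] for m in range(k)]
    xi = [[[alpha[i][a] * P[a][b] * B[i][b] * beta[i + 1][b] / scale[i + 1]
            for b in range(k)] for a in range(k)] for i in range(N)]
    loglik = sum(math.log(c) for c in scale[1:])
    return xi, loglik

def reestimate_transitions(xi):
    """Baum-Welch update: expected a->b transitions over expected departures from a."""
    k = len(xi[0])
    counts = [[sum(x[a][b] for x in xi) for b in range(k)] for a in range(k)]
    return [[counts[a][b] / sum(counts[a]) for b in range(k)] for a in range(k)]

# Illustrative numbers only.
P = [[0.9, 0.1], [0.2, 0.8]]
xi, loglik = pairwise_posteriors([30.0, 0.2, 0.5, 0.1, 45.0, 20.0, 0.3],
                                 P, pi=[0.5, 0.5], rates=[0.05, 2.0])
P_new = reestimate_transitions(xi)
```

The off-diagonal entries of `P_new` correspond to the re-estimated $\widehat{\beta}_{0,1}$ and $\widehat{\beta}_{1,0}$; iterating the two functions (together with an update of the observation density parameters) gives one possible Baum-Welch loop.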
Using (3), we can simplify (4) and the joint optimization of the model parameters again breaks into a term-by-term optimization. We then have the following analogous model parameter estimates for $k \in \{0, 1\}$:
\begin{align*}
\widetilde{p}_k &= \frac{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \widetilde{\xi}_{i-1}(0, 1)}{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \widetilde{\xi}_{i-1}(0, 0) + \sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \widetilde{\xi}_{i-1}(0, 1)} \\
\widetilde{q}_k &= \frac{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \widetilde{\xi}_{i-1}(1, 0)}{\sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \widetilde{\xi}_{i-1}(1, 0) + \sum_{i=1,\, i \in \mathcal{Z}_k}^{N} \widetilde{\xi}_{i-1}(1, 1)}
\end{align*}
with $\mathcal{Z}_0 = \{i : Z_i \leq \tau\}$ and $\mathcal{Z}_1 = \{i : Z_i > \tau\}$. The re-estimation formulas for the observation density parameters follow the same structure as in Table 1 by replacing $\xi_i(\cdot, \cdot)$ with $\widetilde{\xi}_i(\cdot, \cdot)$. For the parameters defining the density of the influence structure, Table 2 provides a list of re-estimation formulas. The intermediate variable $\widetilde{\xi}_i(a, b)$, $1 \leq i \leq N-1$ is updated by a generalized forward-backward procedure whose steps are illustrated in Table 5 (see (7)-(13)).

We denote by $\lambda_{\mathrm{init}}$ the converged model parameters that locally maximize $Q(\lambda, \bar{\lambda})\big|_{\mathrm{CHMM}}$. Note that $\lambda_{\mathrm{init}}$ is a local maximum only in the $\lambda$ space and not in $\{\tau \times \lambda\}$. Thus, the choice $\{\tau_{\mathrm{init}}, \lambda_{\mathrm{init}}\}$ does not maximize $P(\Delta_1^N \mid Z_1^N, \lambda)$, not even locally. Therefore, in the next step, we locally optimize the conditional likelihood over a local region around $\{\tau_{\mathrm{init}}, \lambda_{\mathrm{init}}\}$. That is,
\[
\{\widetilde{\tau}, \widetilde{\lambda}\} = \arg\max_{\{\tau, \lambda\} \in \mathcal{L}} P(\Delta_1^N \mid Z_1^N, \lambda)
\]
where $\mathcal{L} = \{\tau : \tau = \tau_{\mathrm{init}} + \Delta\tau \text{ and } \lambda : \lambda = \lambda_{\mathrm{init}} + \Delta\lambda\}$. In this work, we focus on a box-constrained $\mathcal{L}$. Alternately, a local gradient search in the model parameter space can be pursued to locally optimize the conditional likelihood function. Note that the conditional density can be written as
\[
P(\Delta_1^N \mid Z_1^N, \lambda) = \frac{\widetilde{\alpha}_N(0) + \widetilde{\alpha}_N(1)}{\widehat{\alpha}_N(0) + \widehat{\alpha}_N(1)},
\]
where the forward algorithm variable $\widetilde{\alpha}_i(j)$ is updated with the same formulas as (8)-(10) (see Table 5).
On the other hand, $\widehat{\alpha}_i(j)$ follows the same formula as $\widetilde{\alpha}_i(j)$, but by constraining $P(\Delta_i \mid Q_i = a) = 1$ for all $i$ and $a$.

Model Verification

The fit of the different models to the observed data is studied in two ways. In the first approach, the model parameters learned via the (generalized) Baum-Welch algorithm are used with a state estimation procedure to estimate the most probable state sequence associated with the observations. For the HMM setting, state estimation is straightforward via the use of the Viterbi algorithm [28]. The generalization of the Viterbi algorithm to the coupled HMM setting requires the definition of intermediate variables $\delta_{i+1}(j)$ and $\phi_{i+1}(j)$ as illustrated in Table 5 (see (14)-(19)). As with the Viterbi algorithm, the most probable state sequence is then estimated as
\begin{align*}
Q_N^{\star} &= \arg\max_{j \in \{0, 1\}} \delta_N(j) \\
Q_i^{\star} &= \phi_{i+1}(Q_{i+1}^{\star}), \quad 1 \leq i \leq N-1.
\end{align*}
The observed inter-tweet durations corresponding to the classified states are compared with the inter-tweet durations obtained with the proposed model(s) via a graphical method such as the Quantile-Quantile (Q-Q) plot. Recall that a Q-Q plot plots the quantiles corresponding to the true observations against the quantiles corresponding to the model(s) [51]. If the proposed model reflects the observations correctly, the quantiles lie on the (reference) straight-line that extrapolates the first and the third quartiles. Discrepancies from the straight-line benchmark indicate artifacts introduced by the model(s) not seen in the observations and/or features in the observation not explained by the model(s).
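The Q-Q comparison described above can be sketched numerically as follows. This is an illustrative example rather than the procedure used in the paper: it pairs empirical quantiles with the quantiles of an assumed exponential model and computes the reference line through the first and third quartiles. The helper names and the synthetic sample are hypothetical.

```python
import math
import random
import statistics

def qq_points(observed, model_quantile, n=100):
    """Pair model quantiles with empirical quantiles at p = 1/n, ..., (n-1)/n."""
    probs = [i / n for i in range(1, n)]
    empirical = statistics.quantiles(observed, n=n)   # n - 1 cut points
    return [(model_quantile(p), e) for p, e in zip(probs, empirical)]

def quartile_line(points, n=100):
    """Slope and intercept of the line through the first and third quartile points."""
    (x1, y1), (x3, y3) = points[n // 4 - 1], points[3 * n // 4 - 1]
    slope = (y3 - y1) / (x3 - x1)
    return slope, y1 - slope * x1

# Synthetic check: exponential durations against an exponential model of rate 2.
rate = 2.0
rng = random.Random(1)
sample = [rng.expovariate(rate) for _ in range(20000)]
points = qq_points(sample, lambda p: -math.log(1.0 - p) / rate)
slope, intercept = quartile_line(points)
```

When the model matches the data, as in this synthetic case, the points lie close to the reference line, which itself is close to the identity line; systematic departures in the upper quantiles would point to a tail that the model does not capture.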
In the second approach, the fits of the different models to the data are studied via a more formal metric such as the Akaike Information Criterion (AIC), defined as
\[
\mathrm{AIC}(n) \triangleq 2k - 2\log(L),
\]
where $k$ denotes the number of parameters used in the model, $n$ is the length of the observation sequence, and $L$ is the optimized likelihood function for the observation sequence corresponding to the learned model. The AIC penalizes models with more parameters and the model that results in the smallest value of AIC is the most suitable model (for the observed data) from the class of models considered.

In the HMM setting with $k_{\mathrm{HMM}}$ parameters, the AIC corresponding to $\{\Delta_i\}$ is given as
\[
\mathrm{AIC}(n)\big|_{\mathrm{HMM}} \triangleq 2k_{\mathrm{HMM}} - 2\log\left( P(\Delta_1^N \mid \lambda) \right) = 2k_{\mathrm{HMM}} - 2\log\left( \alpha_N(0) + \alpha_N(1) \right),
\]
where the converged model parameter estimates from the Baum-Welch algorithm are used to compute $\alpha_i(j) \triangleq P(\Delta_1^i, Q_i = j)$ using the forward procedure. In the coupled HMM setting with $k_{\mathrm{CHMM}}$ parameters, the corresponding AIC metric is defined as
\[
\mathrm{AIC}(n)\big|_{\mathrm{CHMM}} \triangleq 2k_{\mathrm{CHMM}} - 2\log\left( P(\Delta_1^N \mid Z_1^N, \lambda) \right), \tag{5}
\]
where the model parameter estimates that maximize $P(\Delta_1^N \mid Z_1^N, \lambda)$ are to be used in (5). With the model parameters learned as explained in the previous section, an upper bound to $\mathrm{AIC}(n)\big|_{\mathrm{CHMM}}$ is obtained as
\[
\mathrm{AIC}(n)\big|_{\mathrm{CHMM}} \leq \mathrm{AIC}(n) \triangleq 2k_{\mathrm{CHMM}} - 2\log\left( \frac{\widetilde{\alpha}_N(0) + \widetilde{\alpha}_N(1)}{\widehat{\alpha}_N(0) + \widehat{\alpha}_N(1)} \right).
\]

Forecasting

Given $\Delta_1^n$ (and $Z_1^n$), forecasting $\Delta_{n+1}$ is of immense importance in tasks such as advertising, anomaly detection (detecting when a compromised account will post next), etc.
A simple maximum a posteriori (MAP) predictor of the form
\[
\widetilde{\Delta}_{n+1}\big|_{\mathrm{MAP}} = \arg\max_{y} f(\Delta_{n+1} = y \mid \Delta_1^n, Z_1^n) = \arg\max_{y} \sum_{i=0}^{k-1} \widetilde{\beta}_i f_i(\Delta_{n+1} = y)
\]
where
\[
\widetilde{\beta}_i = \frac{\sum_j \widetilde{\alpha}_n(j) \, P(Z_{n+1} \mid Q_n = j) \, P(Q_{n+1} = i \mid Z_{n+1}, Q_n = j)}{\sum_j \widetilde{\alpha}_n(j)}
\]
fails when $f_i(\Delta_{n+1} = y)$ is unimodal with the same mode for all $i$. This is always the case with exponential observation models (mode is 0) and with gamma models if $k_i \lambda_i < 1$ for all $i$ (mode is 0), which is typically the case with the best model fits for many users. On the other hand, a conditional mean predictor of the form
\[
\widetilde{\Delta}_{n+1}\big|_{\mathrm{CM}} = E[\Delta_{n+1} \mid \Delta_1^n, Z_1^n] = \sum_{i=0}^{k-1} \widetilde{\beta}_i \, E[\Delta_{n+1} \mid Q_{n+1} = i]
\]
results in large forecasting errors in the Inactive state if the mean inter-tweet durations in the two states are very disparate (typically the case for many users). To overcome these problems, we consider a predictor of the form
\[
\widetilde{\Delta}_{n+1} = \sum_{i=0}^{k-1} \mathbb{1}\left( \widetilde{Q}_{n+1} = i \mid \Delta_1^n \right) E[\Delta_{n+1} \mid Q_{n+1} = i]
\]
where $\widetilde{Q}_{n+1}$ is the state estimate using the (generalized) Viterbi algorithm with $\Delta_1^n$ (and $Z_1^n$) as inputs and study the forecasting performance in the Active state with a Symmetric Mean Absolute Percentage Error (SMAPE) metric:
\[
\mathrm{SMAPE}(N) \triangleq \frac{1}{N} \sum_{i=1}^{N} \left| \frac{\Delta_i - \widetilde{\Delta}_i}{\Delta_i + \widetilde{\Delta}_i} \right| \cdot \mathbb{1}(Q_i = 1).
\]
The SMAPE metric is a normalized error metric and a smaller value indicates a better model for forecasting. It is seen as a percentage error and is bounded between 0% and 100%.

Numerical Results

The dataset used to illustrate the efficacy of the models proposed in this work is a 30-day long record of Twitter activity described in [11]. This dataset consists of $N_t = 652{,}522$ tweets from $N_u = 30{,}750$ users (with at least one tweet). The time-scale on which the tweets are collected is minutes.
While the dataset has been collected using a snowball sampling technique and reflects a population primarily based out of West Asia, London and Pakistan, the users in the dataset appear to be from diverse socio-economic and political backgrounds and have a broad array of interests. Further, the properties of the dataset on a collective scale are similar in nature to well-understood properties of similar datasets [11], suggesting high confidence in the suitability of the dataset for studying user behavior on an individual scale and in its generalizability to other datasets. Since reliable model learning can be accomplished only for users with sufficient activity, we focus on users with a large number of tweets over the data collection period. There were 223 users with over 600 tweets and 115 users with over 1,000 tweets.

Validating Model Assumptions

The coupled HMM framework developed in this paper is built on four main assumptions, two of which lead to the HMM formulation and two that couple the influence of the neighbors to the HMM. Notwithstanding the fact that these assumptions are based on a rational model of user behavior, the first two assumptions have been well-studied and justified in the literature [7, 18–20]. We now provide some empirical results to justify the latter two assumptions.

For this, we start with two typical users (denoted as User-I and User-II) whose activity over the thirty-day period consists of: i) 807 tweets, 260 mentions, and 16,935 tweets from his social network of 62 friends, and ii) 1,914 tweets, 1,108 mentions, and 10,281 tweets from his social network of 92 friends. We also consider an extreme case of a highly active user (denoted as User-III) whose activity over the thirty-day period consists of 2,387 tweets, 2,872 mentions, and 58,810 tweets from his social network of 206 friends.
Users-I and II do not appear to be popular public figures, whereas User-III is a popular journalist, advocate on many political issues, and an activist.

Assumption 3 hypothesizes that each user's activity switches from a baseline dynamics to an elevated state of dynamics depending on the magnitude of $\tau$. To test this assumption, we study the data from User-III for $n = 1000$. With this data, we use the generalized Baum-Welch algorithm to learn model parameters (as a function of $\tau$) for a coupled HMM with the number of mentions as the influence structure. Fig. 3 plots the learned transition probabilities for User-III as a function of $\tau$. From this figure, we see that both $p_0$ and $p_1$ start off at approximately the same value (and similarly for $q_0$ and $q_1$). However, as $\tau$ increases, the transition probabilities stabilize at different values, suggesting that there is indeed a baseline and an elevated state in user dynamics. Similar behavior is also seen with data from Users-I and II.

On the other hand, Assumption 4 hypothesizes that $Z_i$ and $Z_{i-1}$ are conditionally independent given $Q_{i-1}$. To test this assumption, we use the model parameters learned with the generalized Baum-Welch algorithm in the generalized Viterbi algorithm to estimate the most likely state sequence corresponding to the observations. The conditional correlation coefficient between $Z_i$ and $Z_{i-1}$, defined as

$$\rho(Z_i, Z_{i-1} \mid Q_{i-1} = j) \triangleq \frac{E\Big[\big(Z_i - E[Z_i \mid Q_{i-1} = j]\big) \cdot \big(Z_{i-1} - E[Z_{i-1} \mid Q_{i-1} = j]\big)\Big]}{\sqrt{E\Big[\big(Z_i - E[Z_i \mid Q_{i-1} = j]\big)^2\Big]} \cdot \sqrt{E\Big[\big(Z_{i-1} - E[Z_{i-1} \mid Q_{i-1} = j]\big)^2\Big]}}, \quad j \in \{0, 1\},$$

is used to study conditional independence. Table 6 lists $\rho(Z_i, Z_{i-1} \mid Q_{i-1} = j)$ and the $p$-value corresponding to this coefficient for Users-I to III with $n = 800$, $n = 1000$ and $n = 1000$, respectively.
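The conditional correlation coefficient defined above can be estimated from a sequence of influence values and an estimated state sequence along the following lines. This is a minimal Python sketch under stated assumptions: `z[i]` and `q[i]` are aligned lists, the expectations are replaced by sample averages over the indices with $Q_{i-1} = j$, and the function name is illustrative. The $p$-value reported in Table 6 is not computed here.

```python
import math


def conditional_correlation(z, q, state):
    """Sample correlation of (Z_i, Z_{i-1}) over indices with Q_{i-1} == state."""
    pairs = [(z[i], z[i - 1]) for i in range(1, len(z)) if q[i - 1] == state]
    if len(pairs) < 2:
        return float("nan")  # not enough samples in this state
    mean_cur = sum(cur for cur, _ in pairs) / len(pairs)
    mean_prev = sum(prev for _, prev in pairs) / len(pairs)
    cov = sum((cur - mean_cur) * (prev - mean_prev) for cur, prev in pairs)
    var_cur = sum((cur - mean_cur) ** 2 for cur, _ in pairs)
    var_prev = sum((prev - mean_prev) ** 2 for _, prev in pairs)
    if var_cur == 0 or var_prev == 0:
        return float("nan")  # correlation undefined for a constant sequence
    return cov / math.sqrt(var_cur * var_prev)
```

Calling the function once with `state=0` and once with `state=1` gives the two coefficients per user that the six cases in the text correspond to.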
From this table, we see that the correlation coefficient in all the six cases studied has a small (absolute) value. A simple explanation for this observation is that user aggregation significantly diminishes the correlation between $Z_{i-1}$ and $Z_i$. Specifically, at a (standard) significance level of 5%, the null hypothesis of conditional independence between $Z_i$ and $Z_{i-1}$ cannot be rejected in five of the six cases studied, indicating that Assumption 4 can be justified for many users.

Model Fits For Users-I to III

We now study the following models for the activity profile of the three users: i) conventional two-state HMM, ii) coupled HMM with a binary influence structure that is set to 1 when there is a mention of the user and 0 otherwise, iii) coupled HMM with the number of such mentions as the influence structure, and iv) coupled HMM with the social network traffic of the friends of the user as the influence structure. Exponential and gamma densities are considered for the observations (inter-tweet duration). On the other hand, geometric, Poisson and shifted zeta densities are considered for the number of mentions, and a geometric density is considered for the total traffic. See Tables 1 and 2 for model details.

Tables 7-9 list the AIC scores for these three users with the different models as a function of the number of observations $n$ for different choices: $n = 100$, $n = 250$, $n = 500$, $n = 750$, or $n = 1000$. From these tables, the following conclusions can be made:

1. For all the three users, both a Poisson process model and a renewal process model are significantly sub-optimal for the observations as they implicitly assume a single state for the user's activity. This is in conformance with similar observations in [7]. A conventional HMM with two states overcomes this problem by assuming that the user switches between an Active and an Inactive state and thus provides a better baseline to compare the performance of the proposed modeling framework.

2. For all combinations of users, $n$ and types of influence structure, a two-parameter gamma density for the observations results in a better fit than possible with an exponential model. This should not be entirely surprising since an exponential density is a special case of the gamma density (a gamma with $k = 1$ and $\lambda = 1/\rho$ results in an exponential of rate $\rho$). Similar observations have also been made in related recent work [14–17]. This conclusion is also reinforced by observing the Q-Q plots of the true inter-tweet durations (in the Active and Inactive states) relative to the inter-tweet duration values obtained from four models for User-III (see Figs. 4(a)-(d) and Figs. 4(e)-(h), respectively). The four competing models illustrated are: i) Model A: conventional HMM with exponential density, ii) Model B: conventional HMM with gamma density, iii) Model C: coupled HMM with geometric influence structure and exponential density, and iv) Model D: coupled HMM with geometric influence structure and gamma density. The most probable state sequence vector corresponding to the observations of User-III is estimated with the (generalized) Viterbi algorithm. As can be seen from Fig. 4, Model D is the best fit from among these four models. Nevertheless, the discrepancies of some of the quantiles from the reference straight line show that even this model does not completely capture all the features in the observations, suggesting a direction for future work in this area.

3. We now explain how a coupled HMM works. State estimation with Models B and D is performed using the (generalized) Viterbi algorithm. In the $n = 1000$ case, while the HMM declares 898 of the 1000 inter-tweet periods (corresponding to 21.06% of the total observation period for User-III) as Active, the coupled HMM declares only 845 periods (corresponding to 16.46% of the total observation period) as Active. In terms of discrepancies in state estimation between the two models, 53 inter-tweet periods declared as Active by the HMM are re-classified as Inactive by the coupled HMM, whereas all the Inactive states of the HMM are also classified as Inactive by the coupled HMM. Carefully studying $\Delta_i$, $Q_{i-1}$ and $Z_i$ for these 53 periods that are re-classified, it can be seen that the coupled HMM declares a period as Inactive either when $\Delta_i$ is large (independent of the nature of $Q_{i-1}$ or $Z_i$), or when $\Delta_i$ is small and, in addition, $Z_i$ is also small and $Q_{i-1} = 0$. In other words, if the user is in the Inactive state and the influence structure does not suggest a switch to the Active state, a small inter-tweet period is treated as an anomaly rather than as an indicator of change to the Active state. Thus, in contrast to the HMM setting where the state estimate depends primarily on the magnitude of $\Delta_i$, the coupled HMM is less trigger-happy in the sense that it considers the magnitude of $\Delta_i$ in the context of neighbors' activity before declaring a state as Active or Inactive.

4. In terms of general trends with AIC as the metric for model fitting, if $n$ is small (say, $n = 100$ to $250$), a conventional HMM is competitive and comparable with (sometimes, even better than) a coupled HMM with more parameters. In addition to there not being sufficient data to learn a complicated model, this trend conforms with the popular intuition of Occam's razor that simple models suffice for observations of small length.

5. Social network traffic (that includes replies, retweets, and modified tweets from all of a user's friends) typically overwhelms the number of mentions by at least an order of magnitude (see typical examples with Users-I and II above). Thus, social network traffic serves as a good influence structure to couple an HMM when the number of mentions is too small to learn a sophisticated model reliably. This is typically the case when $n$ is moderate (neither too small nor too large). For example, with Users-I and II, traffic leads to a better model fit for $n = 500$ and $n = 100$, respectively.

6. However, as $n$ increases, the more directional nature of a mention (relative to the traffic) means that mentions carry more "information" about the capacity of a user to respond/reply conditioned on seeing a certain type of tweet from his network (than the traffic). This is clear from the general trend of lower AIC scores with the number of mentions than with the social network traffic for large $n$ values ($n = 750$ or $1000$).

7. For all combinations of users and $n$, the binary influence structure for the mentions results in a poorer fit than the number of mentions. This is because it is more efficient to capture the number of mentions with a one-parameter model than to expend that parameter on a binary value. In other words, the loss in performance is due to the use of a hard decision metric (binary value), provided that the soft decision metric (the number of mentions) is captured accurately. While the three models (geometric, Poisson and shifted zeta) for the number of mentions result in comparable performance for Users-I and II, the geometric results in a superior fit for User-III. Thus, a geometric density can serve as a robust model choice for the number of mentions.
Given that the shifted zeta density captures heavy tails, the above trend also suggests that the number of mentions over an inter-tweet duration is not likely to be heavy-tailed.

Performance Across Users

Given that the exponential observation density consistently under-performs in model fitting relative to a gamma density, we henceforth focus on the performance of Model D with Model B as the baseline. This performance gain is captured by the relative AIC gain metric, defined as

$$\Delta\mathrm{AIC} \triangleq \mathrm{AIC}\Big|_{\text{Model B}} - \mathrm{AIC}\Big|_{\text{Model D}}. \qquad (6)$$

In Fig. 5 and Table 10, $\Delta\mathrm{AIC}$ values are presented for different $n$ values for Users-I to III. As can be seen from this data, Model B performs better than Model D for $n < 550$ for User-I and Model D gets better as $n$ increases after that. For User-III, Model D is better than Model B for all $n > 200$ and the performance gain improves with increasing $n$ till $n = 800$ and then slightly decreases after that. In general, the performance gain with Model D for the typical user is negligible for small values of $n$ and this gain improves (in general) as $n$ increases.

To study this aspect more carefully, we now consider a corpus of 100 users with different numbers of tweets and mentions over their periods of activity. For all the users studied, it is observed that a local optimum (to reasonable accuracy) is achieved by the generalized Baum-Welch algorithm within 20-30 iterations and independent of the model parameter initializations. Fig. 6(a)-(b) plots the histogram of $\Delta\mathrm{AIC}$ for the corpus of 100 users with $n = 500$ and $n = 1000$, respectively. From Fig. 6, it can be seen that Model D significantly out-performs Model B for a large fraction of the users and this improvement gets better as $n$ increases. To understand this, recall that $\exp\left(-\frac{\Delta\mathrm{AIC}}{2}\right)$ is the likelihood that Model B minimizes the information loss relative to Model D. Thus, a $\Delta\mathrm{AIC}$ value larger than 4.61 (respectively, 9.21) leads to a relative likelihood below 10% (respectively, 1%). For the corpus studied here, Model D is 100 times as likely to minimize information loss for 25% of the users at $n = 500$ and 72% of the users at $n = 1000$, respectively. With a more relaxed benchmark, Model D is ten times as likely to minimize information loss for 33% and 85% of the users at $n = 500$ and $1000$, respectively.

For predictive performance, analogous to $\Delta\mathrm{AIC}$ in (6), we define the relative SMAPE gain metric as

$$\Delta\mathrm{SMAPE} \triangleq \mathrm{SMAPE}\Big|_{\text{Model B}} - \mathrm{SMAPE}\Big|_{\text{Model D}}.$$

Fig. 6(c) plots the histogram of $\Delta\mathrm{SMAPE}$ for $n = 500$ and it can again be seen that Model D is better than Model B in terms of predictive power for a large fraction of users. Thus, a coupled HMM provides a better modeling paradigm for the activity of a large set of users in social networks.

User Clustering

After learning the model parameters for each user, we now consider the similarity between different users as implied by those parameters. Since the dataset from [11] has only around 200 users with sufficient activity ($n > 500$) to learn general probabilistic models where the coupled HMM parameters learned via the generalized Baum-Welch algorithm converge to local optima in the model parameter space, we focus on a corpus of 150 of these users. The mean number of tweets and mentions for this corpus is 975.40 and 629.17, respectively. The mean number of friends and followers for the corpus is 251.83 and 206.46, respectively.

With the number of mentions as the influence structure, we focus on two coupled HMM-specific parameters that capture the interaction dynamic between a user and his social network to perform user clustering in the model parameter space.
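As a brief aside on the $\Delta\mathrm{AIC}$ thresholds used in the performance comparison across users, the relative-likelihood reading of an AIC difference can be checked with a few lines of Python. The function names and the sample list of differences are illustrative assumptions and are not values from the paper.

```python
import math


def relative_likelihood(delta_aic):
    """Relative likelihood exp(-delta_aic / 2) of the model with the larger AIC."""
    return math.exp(-delta_aic / 2.0)


def delta_aic_threshold(factor):
    """Smallest AIC difference at which one model is `factor` times as likely."""
    return 2.0 * math.log(factor)


def fraction_favoring(delta_aics, factor):
    """Fraction of users whose AIC gain exceeds the threshold for `factor`."""
    threshold = delta_aic_threshold(factor)
    return sum(1 for d in delta_aics if d > threshold) / len(delta_aics)
```

Rounded to two decimals, `delta_aic_threshold(10)` and `delta_aic_threshold(100)` reproduce the 4.61 and 9.21 thresholds quoted earlier.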
The parameter $p_1/p_0$ measures the propensity (likelihood) of a user to become active upon seeing a large number of mentions in his timeline relative to a lack of such mentions:

$$\frac{p_1}{p_0} = \frac{P(Q_i = 1 \mid Q_{i-1} = 0, Z_i > \tau)}{P(Q_i = 1 \mid Q_{i-1} = 0, Z_i \leq \tau)}.$$

On the other hand, the parameter $\gamma_1/\gamma_0$ measures the propensity of the user's network to respond with a mention upon seeing activity at the user relative to his inactivity:

$$\frac{\gamma_1}{\gamma_0} = \frac{P(Z_i > 0 \mid Q_{i-1} = 1)}{P(Z_i > 0 \mid Q_{i-1} = 0)}.$$

Three natural clusters can be identified in the model parameter space as a function of the $\left\{\frac{p_1}{p_0}, \frac{\gamma_1}{\gamma_0}\right\}$ values: 1) The baseline scenario where the users are not significantly influenced by their neighbors and vice versa corresponds to $\frac{p_1}{p_0} = 1 = \frac{\gamma_1}{\gamma_0}$. The users for whom $\frac{p_1}{p_0} < 1$ are the "tails" in the model space corresponding to this baseline scenario. 2) On the other hand, a large value of $\frac{p_1}{p_0}$ indicates that more mentions can induce a user to the Active state. Restated, a user can be induced to post at a higher frequency by an active social network. 3) Similarly, a large value of $\frac{\gamma_1}{\gamma_0}$ indicates that a user's social network can be induced to become active (with a larger number of mentions) by the user's activity.

Motivated by this argument of three natural clusters, Fig. 7 clusters 150 users using the $K$-means algorithm for $K = 3$. The result of this clustering is that 92 users belong to Cluster 1 centered around $\left(\frac{p_1}{p_0}, \frac{\gamma_1}{\gamma_0}\right) = (1, 1)$. Of the remaining 58 users, 35 belong to Cluster 2 where $\frac{p_1}{p_0} > 1.5$ and 23 belong to Cluster 3 where $\frac{\gamma_1}{\gamma_0} > 1.5$. To paraphrase the above discussion, Cluster 1 is made of a majority of the users corresponding to the baseline scenario. On the other hand, Clusters 2 and 3 consist of users outside Cluster 1 and those who are either tightly knit to their network (or vice versa). These users are significantly affected by their social influence.
Some of the typical attributes/qualities that can best describe users in Cluster 2 are: commentarial, activist, garrulous, argumentative, opinionated, etc. Illustrating these facets, a sample activity listing over a single session of a typical user in Cluster 2 is provided in Table 11. The session begins with a question of "yeh.watching" (apparently, a cricket match between Pakistan and England) by a friend of the user. This is followed by a conversation between friends and unsolicited commentaries/observations on the ongoing match by the user-of-interest. Such active commentary is typical of this user's posting behavior. Another sample argument between two users in Cluster 2 is provided in Table 12. This argument is an exchange of political opinions with each user trying to convince the other about their respective positions. At the end of the argument, one of the users realizes and acknowledges that he has become more vocal on social media, yet also sensitive to other users' positions.

On the other hand, the social network of users in Cluster 3 share similar attributes as users in Cluster 2 even though the users themselves are often more reluctant to follow suit. To illustrate this subtle difference in behaviors, a sample activity listing of a typical user in Cluster 3 is presented in Table 13. Here, we see that the user's social network is strongly opinionated in response to a news story introduced by the user. Despite introducing the story, the user himself is not sufficiently polarized/aggressive in his response on social media. Similarly, after introducing another story, the user blindly agrees with other users' positions and jokes on the matter. Thus, these examples illustrate how the coupled HMM paradigm introduced in this work captures broad features on user behavior despite not capturing the textual content in any detail.
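The $K$-means step behind the three clusters discussed in this section can be sketched as follows. This is a minimal Python illustration of plain Lloyd iterations on $(p_1/p_0, \gamma_1/\gamma_0)$ pairs; the paper does not describe its implementation, so the function names, the caller-supplied initial centers, and the sample points are hypothetical and are not the authors' code or data.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centers, max_iterations=50):
    """Lloyd's algorithm with caller-supplied starting centers (max_iterations >= 1)."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each user goes to the nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster in place
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return labels, centers


# Hypothetical (p1/p0, gamma1/gamma0) pairs, one per user.
users = [(1.0, 1.0), (0.9, 1.1), (1.1, 0.95), (2.0, 1.0), (2.4, 1.1), (1.0, 2.2), (1.1, 2.6)]
labels, centers = kmeans(users, [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)])
```

Starting the centers near the baseline point $(1, 1)$ and along each axis mirrors the three scenarios described in the text, but $K$-means results generally depend on the initialization, so several starts would normally be compared.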
Conclusion

We have introduced a new class of coupled Hidden Markov Models to describe temporal patterns of user activity which incorporate the social effects of influence from the activity of a user's neighbors. While there have been many works on models for user activity in diverse social network settings, our work is the first to incorporate social network influence on a user's activity. We have shown that the proposed model results in better explanatory and predictive power over existing baseline models such as a renewal process-based model or an uncoupled HMM. User clustering in the model parameter space resulted in clusters with distinct interaction dynamics between users and their networks. Specifically, three clusters are identified: a baseline scenario of no influence of a user on his network (and vice versa), and two clusters with significant influence of a user on his network, and of the network on the user, respectively.

While our work has developed a social network-driven user activity model, it has only scratched the surface in this promising arena of research. It would be useful to pursue a more detailed study of different candidate models for the influence structures and the observations. It is also of interest to understand which type of tweets/posts (mentions, replies, retweets, undirected tweets) or the total traffic carries more "information" in developing good models for explanation and prediction at the individual scale. It would also be of interest to develop hierarchical social influence-driven models for groups of users as well as better understand those facets of a user's social network that influence him the most. In particular, a careful study of other network influence structures (such as transfer entropy-weighted traffic [23]) that can capture the interaction dynamic between a user and his network is of importance.
Combining temporal activity patterns with unstructured information such as the topic or nature of discussion, textual content, etc., could result in much better predictive performance than temporal activity alone. Further, understanding the contribution of the temporal and the textual parts of such a composite model in prediction would also be useful in understanding the limits and capabilities of activity profile modeling at the individual scale.

List of Abbreviations

1. HMM: Hidden Markov Model
2. Coupled HMM: Coupled Hidden Markov Model
3. Q-Q Plot: Quantile-Quantile Plot
4. AIC: Akaike Information Criterion
5. SMAPE: Symmetric Mean Absolute Percentage Error

Competing Interests

The authors declare that they have no competing interests.

Authors' Contributions

VR performed the numerical experiments. All the authors developed the model, designed the experiments, interpreted the results, wrote the paper, and approved the final manuscript.

Acknowledgment

This work was supported by the Defense Advanced Research Projects Agency (DARPA) under grant # W911NF-12-1-0034 at the University of Southern California. The authors would like to thank Sofus Macskassy for providing the Twitter dataset that was used in illustrating the algorithms proposed in this work.

References

1. Barabási AL: The origin of bursts and heavy tails in human dynamics. Nature 2005, 435:207–211.
2. Goh KI, Barabási AL: Burstiness and memory in complex systems. Europhysics Letters 2008, 81(48002).
3. Vázquez A, Oliveira JG, Dezsö Z, Goh K, Kondor I, Barabási AL: Modeling bursts and heavy tails in human dynamics. Physical Review E 2006, 73(3):036127.
4. Vázquez A, Racz B, Lukacs A, Barabási AL: Impact of non-Poissonian activity patterns on spreading processes. Physical Review Letters 2007, 98(15):158702.
5. Malmgren RD, Stouffer DB, Motter AE, Amaral LAN: A Poissonian explanation for heavy tails in e-mail communication. Proceedings of the National Academy of Sciences 2008, 105(47):18153–18158.
6. Malmgren RD, Stouffer DB, Campanharo A, Amaral LAN: On universality in human correspondence activity. Science 2009, 325(5948):1696–1700.
7. Malmgren RD, Hofman JM, Amaral LAN, Watts DJ: Characterizing individual communication patterns. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Paris, France 2009:607–615.
8. Stehlé J, Barrat A, Bianconi G: Dynamical and bursty interactions in social networks. Physical Review E 2010, 81(3):035101.
9. Rybski D, Buldyrev SV, Havlin S, Liljeros F, Makse HA: Communication activity in a social network: relation between long-term correlations and inter-event clustering. Scientific Reports 2012, 2:560.
10. Jo HH, Karsai M, Kertész J, Kaski K: Circadian pattern and burstiness in mobile phone communication. New Journal of Physics 2012, 14:013055.
11. Macskassy SA: On the study of social interactions in Twitter. In Proceedings of International Conference on Weblogs and Social Media, Dublin, Ireland 2012:226–233.
12. Guo L, Chen S, Xiao Z, Tan E, Ding X, Zhang X: Measurements, analysis and modeling of BitTorrent-like systems. In Proceedings of ACM SIGCOMM Internet Measurement Conference, New Orleans, LA 2005:35–48.
13. Leskovec J, Backstrom L, Kumar R, Tomkins A: Microscopic evolution of social networks. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Las Vegas, NV 2008:462–470.
14. Xiao Z, Guo L, Tracey J: Understanding instant messaging traffic characteristics. In Proceedings of IEEE International Conference on Distributed Computing Systems, Toronto, Canada 2007:51.
15. Guo L, Tan E, Chen S, Xiao Z, Zhang X: The stretched exponential distribution of Internet media access patterns. In Proceedings of ACM Symposium on Principles of Distributed Computing, Toronto, Canada 2008:283–294.
16. Guo L, Tan E, Chen S, Zhang X, Zhao Y: Analyzing patterns of user content generation in online social networks. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Paris, France 2009:369–378.
17. Jiang ZQ, Xie WJ, Li MX, Podobnik B, Zhou WX, Stanley HE: Calling patterns in human communication dynamics. Proceedings of the National Academy of Sciences 2013, 110(5):1600–1605.
18. Wu Y, Zhou C, Xiao J, Kurths J, Schellnhuber HJ: Evidence for a bimodal distribution in human communication. Proceedings of the National Academy of Sciences 2010, 107(44):18803–18808.
19. Hogg T, Lerman K: Stochastic models of user-contributory web sites. In Proceedings of the International Conference on Weblogs and Social Media, San Jose, CA 2009:50–57.
20. Romero DM, Galuba W, Asur S, Huberman BA: Influence and passivity in social media. Social Science Research Network Working Paper Series 2010.
21. Rodriguez MG, Leskovec J, Krause A: Inferring networks of diffusion and influence. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC 2010:1019–1028.
22. Zhao K, Bianconi G: Social interactions model and adaptability of human behavior. Frontiers in Physiology 2011, (2):101.
23. Steeg GV, Galstyan A: Information transfer in social media. In Proceedings of World Wide Web Conference, Lyon, France 2012:509–518.
24. Tan C, Tang J, Sun J, Lin Q, Wang F: Social action tracking via noise tolerant time-varying factor graphs. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, Washington, DC 2010:1049–1058.
25. Tan C, Lee L, Tang J, Jiang L, Zhou M, Li P: User-level sentiment analysis incorporating social networks. In Proceedings of ACM SIGKDD Conference on Knowledge, Discovery, and Data Mining, San Diego, CA 2011:1397–1405.
26. Trusov M, Bodapati AV, Bucklin RE: Determining influential users in Internet social networks. Journal of Marketing Research 2010, XLVII:643–658.
27. Perra N, Gonçalves B, Pastor-Satorras R, Vespignani A: Activity driven modeling of time varying networks. Scientific Reports 2012, 2:469.
28. Rabiner LR: A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 1989, 77(2):257–286.
29. Bilmes JA: A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Tech. rep., International Computer Science Institute, Berkeley, CA 1998.
30. Pearl J: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann 1988.
31. Murphy KP: Dynamic Bayesian Networks: Representation, Inference and Learning. Tech. rep., University of California, Berkeley 2002. [Ph.D. dissertation].
32. Poritz AB: Linear predictive hidden Markov models and the speech signal. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Paris, France 1982, 7:1291–1294.
33. Juang BH, Rabiner LR: Mixture autoregressive hidden Markov models for speech signals. IEEE Transactions on Acoustics, Speech, and Signal Processing 1985, ASSP-33(6):1404–1413.
34. Kenny P, Lennig M, Mermelstein P: A linear predictive HMM for vector-valued observations with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 1990, 38(2):220–225.
35. Cacciatore T, Nowlan SJ: Mixtures of controllers for jump linear and non-linear plants. In Proceedings of Neural Information Processing Systems, Denver, CO 1993, 7:719–726.
36. Bengio Y, Frasconi P: Input-output HMM's for sequence processing. IEEE Transactions on Neural Networks 1996, 7(5):1231–1249.
37. Ghahramani Z, Jordan MI: Factorial hidden Markov models. Machine Learning 1997, 29:1–31.
38. Brand M, Oliver N, Pentland A: Coupled hidden Markov models for complex action recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico 1997:994–999.
39. Pavlovic V: Dynamic Bayesian networks for information fusion with applications to human-computer interfaces. Tech. rep., University of Illinois, Urbana-Champaign, IL 1999. [Ph.D. dissertation].
40. Kwon J, Murphy KP: Modeling freeway traffic with coupled HMMs. Tech. rep., University of California, Berkeley, CA 2000.
41. Chu S, Huang T: Bimodal speech recognition using coupled hidden Markov models. In Proceedings of IEEE International Conference on Spoken Language Processing, Beijing, China 2000, 2:747–750.
42. Chu S, Huang T: Audio-visual speech modeling using coupled hidden Markov models. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL 2002:2009–2012.
43. Zhong S, Ghosh J: HMMs and coupled HMMs for multi-channel EEG classification. In Proceedings of International Joint Conference on Neural Networks, Honolulu, HI 2002:1154–1159.
44. Dong W, Pentland A, Heller KA: Graph-coupled HMMs for modeling the spread of infection. In Proceedings of Conference on Uncertainty in Artificial Intelligence, Catalina Is., CA 2012:227–236.
45. Karsai M, Kaski K, Barabási AL, Kertész J: Universal features of correlated bursty behaviour. Scientific Reports 2012, 2:397.
46. Dempster AP, Laird NM, Rubin DB: Maximum-likelihood from incomplete data via the EM algorithm. Journal of Royal Statistical Society, Part B 1977, 39:1–38.
47. Liporace LA: Maximum likelihood estimation for multivariate observations of Markov sources. IEEE Transactions on Information Theory 1982, IT-28(5):729–734.
48. Jordan MI, Jacobs RA: Hierarchical mixtures of experts and the EM algorithm. Neural Computation 1994, 6(2):181–214.
49. Jebara T, Pentland A: Maximum conditional likelihood via bound maximization and the CEM algorithm. In Proceedings of Neural Information Processing Systems, Denver, CO 1998, 11:494–500.
50. Salojärvi J, Puolamäki K, Kaski S: Expectation maximization algorithms for conditional likelihoods. In Proceedings of the International Conference on Machine Learning, Bonn, Germany 2005, 22:753–760.
51. Gnanadesikan R: Methods for Statistical Data Analysis of Multivariate Observations. Wiley-Interscience, 2nd edition 1997.

Figures

Figure 1 - Coupled HMM
Coupled HMM framework for user activity. [Diagram: hidden states $Q_0, Q_1, Q_2, Q_3$ and observations $\Delta_1, \Delta_2, \Delta_3$ for the user of interest, with influence structures $Z_1, Z_2, Z_3$ from other users, laid out along a time axis.]

Figure 2 - State transition
Pictorial illustration of state-transition evolution in the proposed model. [Diagram: hidden states Inactive and Active with transition probabilities $p_k$, $1 - p_k$, $q_k$, $1 - q_k$ and influence structure $Z_i$; the observations are point processes with a "low" rate and a "high" rate.]

Figure 3 - Model Validation
Learned transition probabilities as a function of $\tau$ for User-III with $n = 1000$.

Figure 4 - Q-Q Plots
Q-Q plots of inter-tweet duration in (a)-(d) Active and (e)-(h) Inactive states for User-III under four different models.

Figure 5 - Performance Improvement
AIC improvement for Model D relative to Model B.

Figure 6 - Histograms
Histogram of AIC gain with coupled HMM for a corpus of 100 users with (a) $n = 500$ and (b) $n = 1000$ observations. (c) Histogram of SMAPE gain with coupled HMM for $n = 500$.
Figure 7 - User Clustering
Clustering 150 users into three clusters based on learned model parameter values, $p_1/p_0$ and $\gamma_1/\gamma_0$.

Tables

Table 1 - Observation Models
Models for observation given state $Q_i = j$, $j \in \{0, 1\}$.

| Name | Density function $f_j(\Delta_i)$ | Baum-Welch parameter estimate |
|---|---|---|
| Exponential | $\rho_j \exp(-\rho_j \Delta_i)$ | $\widehat{\rho}_j = \dfrac{\sum_{i=1}^{N-1} \left[ \xi_i(j,0) + \xi_i(j,1) \right]}{\sum_{i=1}^{N-1} \Delta_i \left[ \xi_i(j,0) + \xi_i(j,1) \right]}$ |
| Gamma | $\dfrac{1}{\Gamma(k_j)\, \lambda_j^{k_j}}\, \Delta_i^{k_j - 1} \exp\left( -\dfrac{\Delta_i}{\lambda_j} \right)$ | $\widehat{k}_j$ and $\widehat{\lambda}_j$ solve: $k_j \lambda_j = \dfrac{\sum_{i=1}^{N-1} \Delta_i \left[ \xi_i(j,0) + \xi_i(j,1) \right]}{\sum_{i=1}^{N-1} \left[ \xi_i(j,0) + \xi_i(j,1) \right]}$ and $\log(\lambda_j) + \Psi(k_j) = \dfrac{\sum_{i=1}^{N-1} \log(\Delta_i) \left[ \xi_i(j,0) + \xi_i(j,1) \right]}{\sum_{i=1}^{N-1} \left[ \xi_i(j,0) + \xi_i(j,1) \right]}$ |

Here $\Gamma(x)$ is the Gamma function and $\Psi(x) = \Gamma'(x)/\Gamma(x)$ is the digamma function.

Table 2 - Influence Structure Models
Models for influence structure given state $Q_{i-1} = j$, $j \in \{0, 1\}$.

| Name | Density function $g_j(Z_i)$ | Baum-Welch parameter estimate |
|---|---|---|
| Binary mentions | $P(Z_i = k \mid Q_{i-1} = j) = \begin{cases} 1 - \gamma_j & \text{if } k = 0 \\ \gamma_j & \text{if } k = 1 \end{cases}$ | $\widetilde{\gamma}_j = \dfrac{\sum_{i=1,\, i:\, Z_i = 1}^{N} \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}$ |
| No. of mentions, Geometric | $P(Z_i = k \mid Q_{i-1} = j) = (1 - \gamma_j)\, \gamma_j^{k}$, $k \geq 0$ | $\widetilde{\gamma}_j = \dfrac{\sum_{i=1}^{N} Z_i \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} (Z_i + 1) \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}$ |
| No. of mentions, Poisson | $P(Z_i = k \mid Q_{i-1} = j) = \dfrac{\gamma_j^{k} \exp(-\gamma_j)}{\Gamma(k+1)}$, $k \geq 0$ | $\widetilde{\gamma}_j = \dfrac{\sum_{i=1}^{N} Z_i \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}$ |
| No. of mentions, Shifted zeta | $P(Z_i = k \mid Q_{i-1} = j) = \dfrac{(1 + k)^{-\gamma_j}}{\zeta(\gamma_j)}$, $k \geq 0$ | $\widetilde{\gamma}_j$ solves: $-\dfrac{\zeta'(\gamma_j)}{\zeta(\gamma_j)} = \dfrac{\sum_{i=1}^{N} \log(1 + Z_i) \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}{\sum_{i=1}^{N} \left( \widetilde{\xi}_{i-1}(j,0) + \widetilde{\xi}_{i-1}(j,1) \right)}$ |

Here $\zeta(x)$ is the Riemann zeta function.

Table 3 - Conditional Dependency Assumptions
Conditional dependencies of the involved variables in different HMM architectures.

| First-Order AR-HMM [32–34] | IO-HMM [36] | Coupled Factorial HMM [37] | Coupled HMM [38, 43] | Proposed Model |
|---|---|---|---|---|
| $\{Q_i, \Delta_{i-1}\} \to \Delta_i$ | $\{Z_i, Q_i\} \to \Delta_i$ | $\{Z_i, Q_i\} \to \Delta_i$ | $Q_i \to \Delta_i$ | $Q_i \to \Delta_i$ |
| - | $Z_{i-1} \to Z_i$ | $Z_{i-1} \to Z_i$ | $\{Q_{i-1}, Z_{i-1}\} \to Z_i$ | $Q_{i-1} \to Z_i$ |
| $Q_{i-1} \to Q_i$ | $\{Q_{i-1}, Z_i\} \to Q_i$ | $\{Q_{i-1}, Z_{i-1}\} \to Q_i$ | $\{Q_{i-1}, Z_{i-1}\} \to Q_i$ | $\{Q_{i-1}, Z_i\} \to Q_i$ |

Table 4 - Number of Parameters
Number of parameters describing different models with $k$ states, $\ell$ network influence structure levels and an $m$-parameter observation density in each state. The number of parameters for $k = \ell = 2$ and $m = 1$ or $m = 2$ is also provided.

| Model | Trans. prob. matrix | Obs. density | Influence structure | Total | Total, $k = \ell = 2$, $m = 1$ | Total, $k = \ell = 2$, $m = 2$ |
|---|---|---|---|---|---|---|
| Conventional HMM | $k(k-1)$ | $km$ | - | $k(k+m-1)$ | 4 | 6 |
| First-Order AR-HMM | $k(k-1)$ | $k(m+1)$ | - | $k(k+m)$ | 6 | 8 |
| IO-HMM | $\ell k(k-1)$ | $\ell k m$ | $\ell(\ell-1)$ | $\ell\left(k(k+m-1) + \ell - 1\right)$ | 10 | 14 |
| Coupled Fact. HMM | $\ell k(k-1)$ | $\ell k m$ | $\ell(\ell-1)$ | $\ell\left(k(k+m-1) + \ell - 1\right)$ | 10 | 14 |
| Coupled HMM [38] | $(\ell+k)(k-1)$ | $km$ | $(\ell+k)(\ell-1)$ | $(\ell+k)(\ell+k-2) + km$ | 10 | 12 |
| Coupled HMM [43] | $2(\ell+k)(k-1)$ | $km$ | $2(\ell+k)(\ell-1)$ | $2(\ell+k)(\ell+k-2) + km$ | 18 | 20 |
| Proposed Model | $\ell k(k-1)$ | $km$ | $k(\ell-1)$ | $k(\ell k + m - 1)$ | 8 | 10 |

Table 5 - Algorithm Derivation
Steps in generalized Baum-Welch and Viterbi algorithms. Here, $a$, $b$ and $c$ are state variable values and $i$ is the time-index.

Steps in generalized Baum-Welch algorithm:

\begin{align}
\widetilde{\xi}_i(a, b) &= \frac{\widetilde{\alpha}_i(a)\, P(Z_{i+1} \mid Q_i = a)\, P(Q_{i+1} = b \mid Q_i = a, Z_{i+1})\, P(\Delta_{i+1} \mid Q_{i+1} = b)\, \widetilde{\beta}_{i+1}(b)}{\sum_{a, b} \widetilde{\alpha}_i(a)\, P(Z_{i+1} \mid Q_i = a)\, P(Q_{i+1} = b \mid Q_i = a, Z_{i+1})\, P(\Delta_{i+1} \mid Q_{i+1} = b)\, \widetilde{\beta}_{i+1}(b)}, \quad \text{where} \tag{7} \\
\widetilde{\alpha}_i(a) &\triangleq P(\Delta_1^i, Z_1^i, Q_i = a), \quad 2 \leq i \leq N - 1 \tag{8} \\
&= \left[ \sum_b \widetilde{\alpha}_{i-1}(b)\, P(Z_i \mid Q_{i-1} = b)\, P(Q_i = a \mid Q_{i-1} = b, Z_i) \right] P(\Delta_i \mid Q_i = a) \tag{9} \\
\widetilde{\alpha}_1(a) &= \left[ \sum_b \pi_b\, P(Z_1 \mid Q_0 = b)\, P(Q_1 = a \mid Q_0 = b, Z_1) \right] P(\Delta_1 \mid Q_1 = a) \tag{10} \\
\widetilde{\beta}_i(b) &\triangleq P(\Delta_{i+1}^N, Z_{i+1}^N \mid Q_i = b), \quad 2 \leq i \leq N - 1 \tag{11} \\
&= P(Z_{i+1} \mid Q_i = b) \left[ \sum_c P(Q_{i+1} = c \mid Q_i = b, Z_{i+1})\, P(\Delta_{i+1} \mid Q_{i+1} = c)\, \widetilde{\beta}_{i+1}(c) \right] \tag{12} \\
\widetilde{\beta}_N(b) &= 1. \tag{13}
\end{align}

Steps in generalized Viterbi algorithm:

\begin{align}
\delta_1(a) &\triangleq P(Q_1 = a, \Delta_1, Z_1 \mid \lambda) \tag{14} \\
&= \left[ \sum_b P(Q_0 = b)\, P(Z_1 \mid Q_0 = b)\, P(Q_1 = a \mid Q_0 = b, Z_1) \right] \cdot P(\Delta_1 \mid Q_1 = a) \tag{15} \\
\phi_1(a) &= 0 \tag{16} \\
\delta_{i+1}(a) &\triangleq \max_{Q_1^i} \left[ P(Q_1^i, Q_{i+1} = a, \Delta_1^{i+1}, Z_1^{i+1} \mid \lambda) \right], \quad 1 \leq i \leq N - 1 \tag{17} \\
&= \max_b \left[ \delta_i(b)\, P(Z_{i+1} \mid Q_i = b)\, P(Q_{i+1} = a \mid Q_i = b, Z_{i+1}) \right] P(\Delta_{i+1} \mid Q_{i+1} = a) \tag{18} \\
\phi_{i+1}(a) &= \arg\max_b \left[ \delta_i(b)\, P(Z_{i+1} \mid Q_i = b)\, P(Q_{i+1} = a \mid Q_i = b, Z_{i+1}) \right]. \tag{19}
\end{align}

Table 6 - Conditional Correlation Coefficient
Conditional correlation coefficient $\rho(Z_i, Z_{i-1} \mid Q_{i-1})$ for Users-I to III in Active and Inactive states.

| User | No. of samples ($Q_{i-1} = 1$) | Corr. coef. ($Q_{i-1} = 1$) | p-value ($Q_{i-1} = 1$) | No. of samples ($Q_{i-1} = 0$) | Corr. coef. ($Q_{i-1} = 0$) | p-value ($Q_{i-1} = 0$) |
|---|---|---|---|---|---|---|
| User-I | 660 | 0.1127 | 0.0037 | 138 | −0.0244 | 0.7762 |
| User-II | 909 | 0.0590 | 0.0753 | 89 | −0.0540 | 0.6155 |
| User-III | 844 | −0.0459 | 0.1832 | 154 | 0.1306 | 0.1065 |

Table 7 - AIC Comparison for User-I
AIC scores for a typical user with different model fits. The bolded figures correspond to the model with the best fit.

| Model | AIC, $n = 100$ | AIC, $n = 250$ | AIC, $n = 500$ | AIC, $n = 750$ |
|---|---|---|---|---|
| Observation density: Exponential | | | | |
| Poisson process model | 1017.71 | 2711.88 | 5157.41 | 7556.58 |
| Conventional two-state HMM | 835.42 | 2155.51 | 3852.15 | 5630.03 |
| Coupled HMM, Binary mention | 844.19 | 2169.88 | 3867.18 | 5658.23 |
| Coupled HMM, No. mentions (Geometric) | 840.65 | 2163.20 | 3853.10 | 5621.40 |
| Coupled HMM, No. mentions (Poisson) | 840.96 | 2163.54 | 3853.57 | 5621.90 |
| Coupled HMM, No. mentions (Shifted zeta) | 841.93 | 2163.39 | 3853.83 | 5623.03 |
| Coupled HMM, Social network traffic (Geometric) | 836.22 | 2156.11 | 3951.16 | 5789.35 |
| Observation density: Gamma | | | | |
| Renewal process model | 879.02 | 2284.01 | 4214.99 | 6230.80 |
| Conventional two-state HMM | 830.66 | 2134.12 | 3806.88 | 5578.28 |
| Coupled HMM, Binary mention | 838.81 | 2148.02 | 3816.54 | 5595.53 |
| Coupled HMM, No. mentions (Geometric) | 836.36 | 2141.94 | 3808.53 | 5571.18 |
| Coupled HMM, No. mentions (Poisson) | 837.42 | 2142.66 | 3810.11 | 5572.50 |
| Coupled HMM, No. mentions (Shifted zeta) | 837.76 | 2142.13 | 3809.29 | 5584.69 |
| Coupled HMM, Social network traffic (Geometric) | 834.81 | 2149.96 | 3797.65 | 5583.40 |

Table 8 - AIC Comparison for User-II
AIC scores for a typical user with different model fits. The bolded figures correspond to the model with the best fit.

| Model | AIC, $n = 100$ | AIC, $n = 500$ | AIC, $n = 1000$ |
|---|---|---|---|
| Observation density: Exponential | | | |
| Poisson process model | 698.48 | 3444.09 | 6837.57 |
| Conventional two-state HMM | 504.20 | 2373.26 | 4343.94 |
| Coupled HMM, Binary mention | 517.35 | 2383.19 | 4354.92 |
| Coupled HMM, No. mentions (Geometric) | 504.62 | 2364.04 | 4333.98 |
| Coupled HMM, No. mentions (Poisson) | 504.42 | 2363.96 | 4333.78 |
| Coupled HMM, No. mentions (Shifted zeta) | 504.76 | 2363.94 | 4333.80 |
| Coupled HMM, Social network traffic (Geometric) | 489.87 | 2391.80 | 4406.64 |
| Observation density: Gamma | | | |
| Renewal process model | 613.98 | 2960.90 | 5691.82 |
| Conventional two-state HMM | 500.68 | 2334.20 | 4261.41 |
| Coupled HMM, Binary mention | 511.86 | 2339.44 | 4265.43 |
| Coupled HMM, No. mentions (Geometric) | 501.72 | 2320.02 | 4257.15 |
| Coupled HMM, No. mentions (Poisson) | 501.39 | 2319.73 | 4256.61 |
| Coupled HMM, No. mentions (Shifted zeta) | 501.92 | 2320.04 | 4256.93 |
| Coupled HMM, Social network traffic (Geometric) | 487.70 | 2336.06 | 4283.44 |

Table 9 - AIC Comparison for User-III
AIC scores for an extreme case of a highly active user with different model fits. The bolded figures correspond to the model with the best fit.

| Model | AIC, $n = 100$ | AIC, $n = 500$ | AIC, $n = 1000$ |
|---|---|---|---|
| Observation density: Exponential | | | |
| Poisson process model | 653.73 | 3141.36 | 6250.35 |
| Conventional two-state HMM | 479.31 | 2359.48 | 4611.80 |
| Coupled HMM, Binary mention | 495.63 | 2391.71 | 4677.30 |
| Coupled HMM, No. mentions (Geometric) | 478.69 | 2337.30 | 4552.67 |
| Coupled HMM, No. mentions (Poisson) | 483.11 | 2339.65 | 4682.30 |
| Coupled HMM, No. mentions (Shifted zeta) | 490.09 | 2370.70 | 4753.87 |
| Coupled HMM, Social network traffic (Geometric) | 499.81 | 2402.47 | 4663.76 |
| Observation density: Gamma | | | |
| Renewal process model | 585.78 | 2830.55 | 5594.71 |
| Conventional two-state HMM | 476.39 | 2290.53 | 4440.57 |
| Coupled HMM, Binary mention | 485.79 | 2318.99 | 4494.15 |
| Coupled HMM, No. mentions (Geometric) | 473.56 | 2262.10 | 4394.30 |
| Coupled HMM, No. mentions (Poisson) | 489.48 | 2262.22 | 4521.20 |
| Coupled HMM, No. mentions (Shifted zeta) | 484.44 | 2300.41 | 4469.11 |
| Coupled HMM, Social network traffic (Geometric) | 485.73 | 2313.83 | 4458.81 |

Table 10 - AIC Improvement
$\Delta$AIC for Users-I to III with different $n$ values.

| User | Small n | Moderate n | Large n |
|---|---|---|---|
| User-I | −5.70 (n = 100) | −1.65 (n = 500) | 7.10 (n = 750) |
| User-II | −1.04 (n = 100) | 14.18 (n = 500) | 4.26 (n = 1000) |
| User-III | 2.83 (n = 100) | 28.43 (n = 500) | 46.27 (n = 1000) |

Table 11 - Sample Activity Listing
Sample activity of a typical user in Cluster 2.

| Posted by | Intended for | Text content |
|---|---|---|
| @Friend | @Cluster-2-user | yeh.watching |
| @Cluster-2-user | | RT @Non-Friend: In the U.S., you can text “FLOOD” to 27722 to donate $10 to the #Pakistan Relief Fund. [URL] #helppakistan |
| @Cluster-2-user | @Friend | :D |
| @Cluster-2-user | | Is it mandatory for Saeed Ajmal to bowl on short ball in every over #fail #PakCricket |
| @Friend | @Cluster-2-user | i think afridi should go for gull |
| @Cluster-2-user | | Believe me when I say that I predicted that shoaib will get yardy in this over :D #PakCricket |
| @Cluster-2-user | | That was a khooooni yorker by Umer #Gull Waqar ki yaad aa gai #PakCricket |
| @Friend | | RT @Cluster-2-user: 300 not out by Boom Boom [URL] |
| @Friend1 | @Friend2 ... @Friend8 @Cluster-2-user | Baba dam darood aka Muhammad Yousaf has taken an unbelievable catch #PakCricket |
| @Friend | @Cluster-2-user | ;-) |
| @Friend | | RT @Cluster-2-user: #blog post 300 not out by Boom Boom [URL] #PakCricket #BoomBoom #Afridi |
| @Friend | @Cluster-2-user | now what? another scandal against #pakcricket #Pakistan for winning the match |
| @Cluster-2-user | @Friend | i think the scandal is going to be damaging bat of Swan on the yorker attempt b Shabby #PakCricket |
| @Cluster-2-user | | RT @Friend: There’s always second chances... it just depends on how hard you fight for it !! |
| @Cluster-2-user | | RT @Friend: dedicated to men in Green hope this streak continues and we we the coming one as well.... [URL] |
| @Cluster-2-user | | Hay Jazba Janoon to Himmat na haar... the formula behind the success of cricket team #PakCricket |

Table 12 - Sample Argument
Sample argument between two typical users in Cluster 2.

| Posted by | Intended for | Text content |
|---|---|---|
| @Cluster-2-user-2 | @Cluster-2-user-1 | quick question...is @salmantaseer one of the bad guys?? i thought the PPP was the lesser of the evils over there.. #justwondering |
| @Cluster-2-user-1 | @Cluster-2-user-2 | PPP lesser of the evils?! Hahahaha everyone working under Zardari is as evil as Bieber can ever dream of being. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | aww..but sherry rehman is so sweet..and i love the rehman malik hairdo..and SMQ is so totally suave..btw who are the good guys?? |
| @Cluster-2-user-1 | @Cluster-2-user-2 | wow man! You’re kidding, right? |
| @Cluster-2-user-2 | @Cluster-2-user-1 | man!!why do i get the feeling that i just said like a totally not cool thing.. |
| @Cluster-2-user-1 | @Cluster-2-user-2 | Err yes. Totally uncool :P |
| @Cluster-2-user-2 | @Cluster-2-user-1 | so no good guys huh??and btw yes i do think @fbhutto is totally wannabeish... :) |
| @Cluster-2-user-1 | @Cluster-2-user-2 | Yeah, I don’t like her either. I like Imran Khan better. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | i liked him till i saw a report ver his old 92 world cup winning team members said they never liked him.. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | if he culdnt get his own team members behind him then..... |
| @Cluster-2-user-1 | @Cluster-2-user-2 | Well, can’t say anything about that. But he’s a good person, a better one in politics at least. |
| @Cluster-2-user-2 | @Cluster-2-user-1 | well..subcontinental politics na...everyone has a murky past or present...just depends on how deep one digs..no offence intended |
| @Cluster-2-user-1 | @Cluster-2-user-2 | None taken. But look at his hospital. Despite a few controversies, I can tell you firsthand that it’s amazing. Good man. |
| @Cluster-2-user-1 | | #becauseoftwitter I’ve become more vocal in terms of venting. |
| @Cluster-2-user-1 | | #becauseoftwitter I’ve become more sensitive. I lose a follower and I go berserk. |

Table 13 - Sample Activity Listing
Sample activity of a typical user in Cluster 3.

| Posted by | Intended for | Text content |
|---|---|---|
| @Cluster-3-user | | My maid told today; somebody threw one day old daughter wrapped in a polythene bag in Filth Drum outside a house. #humanitarian |
| @Cluster-3-user | | I’ll write on this very issue; daughter, poverty or an illegitimate child was she? why our social system allows us to take these plunges? |
| @Friend-1 | @Cluster-3-user | did the baby survived? what has went wrong with people, they are becoming so heartless. :( |
| @Cluster-3-user | @Friend-1 | haan she survived and somebody has adopted her.. only there.. but it is heart breaking.. |
| @Friend-1 | @Cluster-3-user | oh Thanks God. Yes it is really heart breaking. |
| @Friend-2 | @Cluster-3-user | alive ???? =O |
| @Cluster-3-user | @Friend-2 | yeah alive.. an issueless couple has adopted her |
| @Friend-2 | @Cluster-3-user | Thank God! .... thts literally inhuman act! I mean the adopting parents should atleast get back to her parents and hang them! |
| @Cluster-3-user | @Friend-2 | where would they find them? to hang |
| @Cluster-3-user | | Half of Pakistanis say match fixing allegations against cricketers untrue [URL] #Cricket #Pakistan |
| @Friend-3 | @Cluster-3-user | which means half of Pakistan believes the allegations of #Cricket #MatchFixing against the #Pakistan team to be true #Honesty #ICC |
| @Cluster-3-user | @Friend-3 | ha ha ha yeah.. agreed |
| @Friend-4 | @Cluster-3-user | The proper way of match fixing is thru umpires, didn t give collingwood out he made a big partnership & gave well set akmal out |
| @Cluster-3-user | @Friend-4 | true.. collingwood has fixed match.. LOL |
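The generalized forward recursion (Eqs. 9 and 10) and Viterbi recursion (Eqs. 15, 18 and 19) of Table 5 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes two states, the binary-mentions influence model of Table 2, and the exponential observation density of Table 1. All function names, the `trans[z][b][a]` layout, and the parameter values are hypothetical. The forward pass rescales each step to avoid underflow, and the Viterbi pass works in the log domain, which assumes all probabilities are strictly positive.

```python
import math


def p_z(z, b, gamma):
    """P(Z_i = z | Q_{i-1} = b) under the binary-mentions model of Table 2."""
    return gamma[b] if z == 1 else 1.0 - gamma[b]


def p_obs(delta, a, rho):
    """Exponential density rho_a * exp(-rho_a * delta) from Table 1."""
    return rho[a] * math.exp(-rho[a] * delta)


def step_weight(b, a, z, gamma, trans):
    """P(Z_i = z | Q_{i-1} = b) * P(Q_i = a | Q_{i-1} = b, Z_i = z)."""
    return p_z(z, b, gamma) * trans[z][b][a]


def forward_loglik(deltas, zs, pi, gamma, trans, rho):
    """Log-likelihood of (deltas, zs) via the forward recursion, Eqs. (9)-(10)."""
    states = range(len(pi))
    prev = list(pi)  # starting from pi makes the first step equal to Eq. (10)
    loglik = 0.0
    for delta, z in zip(deltas, zs):
        cur = [
            sum(prev[b] * step_weight(b, a, z, gamma, trans) for b in states)
            * p_obs(delta, a, rho)
            for a in states
        ]
        norm = sum(cur)  # rescale so the alpha vector sums to one
        loglik += math.log(norm)
        prev = [c / norm for c in cur]
    return loglik


def viterbi(deltas, zs, pi, gamma, trans, rho):
    """Most likely Q_1..Q_N via Eqs. (15), (18) and (19), in the log domain."""
    states = range(len(pi))
    # Eq. (15): the initial score sums over the unobserved state Q_0.
    score = [
        math.log(
            sum(pi[b] * step_weight(b, a, zs[0], gamma, trans) for b in states)
            * p_obs(deltas[0], a, rho)
        )
        for a in states
    ]
    backpointers = []
    for delta, z in zip(deltas[1:], zs[1:]):
        new_score, pointers = [], []
        for a in states:
            cands = [score[b] + math.log(step_weight(b, a, z, gamma, trans))
                     for b in states]
            best = max(states, key=cands.__getitem__)
            pointers.append(best)  # Eq. (19)
            new_score.append(cands[best] + math.log(p_obs(delta, a, rho)))  # Eq. (18)
        score = new_score
        backpointers.append(pointers)
    state = max(states, key=score.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):  # backtrack from the last step
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters: state 0 = Inactive (low rate), state 1 = Active (high rate).
pi = [0.5, 0.5]                     # P(Q_0 = b)
gamma = [0.2, 0.6]                  # P(Z_i = 1 | Q_{i-1} = j)
trans = [[[0.9, 0.1], [0.3, 0.7]],  # trans[z][b][a] for z = 0
         [[0.6, 0.4], [0.1, 0.9]]]  # trans[z][b][a] for z = 1
rho = [0.1, 2.0]                    # exponential rate in each state
deltas = [0.2, 9.0, 0.3]            # inter-tweet durations
zs = [1, 0, 1]                      # binary mention indicators
print(forward_loglik(deltas, zs, pi, gamma, trans, rho))
print(viterbi(deltas, zs, pi, gamma, trans, rho))
```

Other observation densities or influence models from Tables 1 and 2 could be substituted by swapping `p_obs` and `p_z`; the recursions themselves would stay unchanged.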
