The Bayesian Echo Chamber: Modeling Social Influence via Linguistic Accommodation
We present the Bayesian Echo Chamber, a new Bayesian generative model for social interaction data. By modeling the evolution of people's language usage over time, this model discovers latent influence relationships between them. Unlike previous work …
Authors: Fangjian Guo, Charles Blundell, Hanna Wallach, Katherine Heller
Fangjian Guo (Duke University, Durham, NC, USA; guo@cs.duke.edu)
Charles Blundell (Gatsby Unit, UCL, London, UK; c.blundell@gatsby.ucl.ac.uk)
Hanna Wallach (Microsoft Research, New York, NY, USA; wallach@microsoft.com)
Katherine Heller (Duke University, Durham, NC, USA; kheller@stat.duke.edu)

Appearing in Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS) 2015, San Diego, CA, USA. JMLR: W&CP volume 38. Copyright 2015 by the authors.

Abstract

We present the Bayesian Echo Chamber, a new Bayesian generative model for social interaction data. By modeling the evolution of people’s language usage over time, this model discovers latent influence relationships between them. Unlike previous work on inferring influence, which has primarily focused on simple temporal dynamics evidenced via turn-taking behavior, our model captures more nuanced influence relationships, evidenced via linguistic accommodation patterns in interaction content. The model, which is based on a discrete analog of the multivariate Hawkes process, permits a fully Bayesian inference algorithm. We validate our model’s ability to discover latent influence patterns using transcripts of arguments heard by the US Supreme Court and the movie “12 Angry Men.” We showcase our model’s capabilities by using it to infer latent influence patterns from Federal Open Market Committee meeting transcripts, demonstrating state-of-the-art performance at uncovering social dynamics in group discussions.

1 INTRODUCTION

As increasing quantities of social interaction data become available, often through online sources, researchers strive to find new ways of using these data to learn about human behavior. Most social processes, in which people or groups of people interact with one another in order to achieve specific (and sometimes contradictory) goals, are extremely complex. In order to construct realistic models of these social processes, it is therefore necessary to take into account their structure (e.g., who spoke with whom), content (e.g., what was said), and temporal dynamics (e.g., when they spoke).

When studying social processes, one of the most pervasive questions is “who influences whom?” This question is of interest not only to sociologists and psychologists, but also to political scientists, organizational scientists, and marketing researchers. Since influence relationships are seldom made explicit, they must be inferred from other information. Influence has traditionally been studied by analyzing declared structural links in observed networks, such as Facebook “friendships” [Backstrom et al., 2006], paper citations [de Solla Price, 1965], and bill co-sponsorships [Fowler, 2006]. For many domains, however, explicitly stated links do not exist, are unreliable, or fail to reflect pertinent behavior. In these domains, researchers have used observed interaction dynamics as a proxy by which to infer influence and other social relationships. Much of this work has concentrated (either implicitly or explicitly) on turn-taking behavior, i.e., “who acts next.”

In this paper, we take a different approach: we move beyond turn-taking behavior, and present a new model, the Bayesian Echo Chamber, that uses observed interaction content, in the context of temporal dynamics, to capture influence. Our model draws upon a substantial body of work within sociolinguistics indicating that when two people interact, either orally or in writing, the use of a word by one person can increase the other person’s probability of subsequently using that word.
Furthermore, the extent of this increase depends on power differences and influence relationships: the language used by a less powerful person will drift further so as to more closely resemble or “accommodate” the language used by more powerful people. This phenomenon is known as linguistic accommodation [West and Turner, 2010]. We demonstrate that linguistic accommodation can reveal more nuanced influence patterns than those revealed by simple reciprocal behaviors such as turn-taking.

The Bayesian Echo Chamber is a new, mutually exciting, dynamic language model that combines ideas from Hawkes processes [Hawkes, 1971] with ideas from Bayesian language modeling. We draw inspiration from Blundell et al.’s model of turn-taking behavior [2012] (described in section 2) to define a new model of the mutual excitation of words in social interactions. This approach, which leverages a discrete analog of a multivariate Hawkes process, enables the Bayesian Echo Chamber to capture linguistic accommodation patterns via latent influence variables. These variables define a weighted influence network that reveals fine-grained information about who influences whom.

We provide details of the Bayesian Echo Chamber in section 3, including an MCMC algorithm for inferring the latent influence variables (and other parameters) from real-world data. To validate this algorithm, we provide parameter recovery results obtained using synthetic data. In section 5 we compare to several baseline models on data sets including arguments heard by the US Supreme Court [MacWhinney, 2007] and the transcript of the 1957 movie “12 Angry Men.” We compare influence networks inferred using our model to those inferred using Blundell et al.’s model.
We show that by focusing on linguistic accommodation patterns, our model infers different, and more substantively meaningful, influence networks than those inferred from turn-taking behavior. We also combine our model with Blundell et al.’s so as to jointly model turn-taking and linguistic accommodation. We investigate the possibility of tying the latent influence parameters to see if a single global notion of influence can be discovered. Finally, we showcase our model’s potential as an exploratory analysis tool for social scientists using recently released transcripts of the Federal Reserve’s Federal Open Market Committee meetings.

2 INFLUENCE VIA TURN-TAKING

In this section, we give a brief description of a variant of Blundell et al.’s model for inferring influence from turn-taking behavior. Unlike Blundell et al.’s original paper, which modeled pairwise actions, we concentrate on a broadcast or group discussion setting appropriate for the data that we wish to model. (We also do not cluster participants by their interaction patterns.) This setting, in which every utterance is heard by and thus potentially influences every participant, occurs in many scenarios of interest to social scientists. Furthermore, the comparatively information-impoverished nature of this setting makes it one in which the ability to infer influence relationships is deemed extremely valuable.

Blundell et al.’s model specifies a probabilistic generative process for the time stamps $\mathcal{T} = \{\mathcal{T}^{(p)}\}_{p=1}^{P}$ associated with a set of actions made by $P$ people. In a group discussion setting, these actions correspond to utterances, and the model captures who will speak next and when that next utterance will occur.
Letting $N^{(p)}(T)$ denote the total number of utterances made by person $p$ over the entire observation interval $[0, T)$, each utterance made by $p$ is associated with a time stamp indicating its start time, i.e., $\mathcal{T}^{(p)} = \{t^{(p)}_n\}_{n=1}^{N^{(p)}(T)}$. We assume that the duration of each utterance $\Delta t^{(p)}_n$ is observed and that its end time $t'^{(p)}_n$ can be calculated from its start time and duration: $t'^{(p)}_n = t^{(p)}_n + \Delta t^{(p)}_n$.

Hawkes processes [Hawkes, 1971], a class of self- and mutually exciting doubly stochastic point processes, form the mathematical foundation of Blundell et al.’s model. A Hawkes process is a particular form of inhomogeneous Poisson process with a conditional stochastic rate function $\lambda(t)$ that depends on the time stamps of all events prior to time $t$. Blundell et al. model turn-taking interactions using coupled Hawkes processes. For a group discussion setting, we instead define a multivariate Hawkes process, in which each person $p$ is associated with his or her own Hawkes process defined on $(0, \infty)$. Letting $N^{(p)}(\cdot)$ denote the counting measure of person $p$’s Hawkes process, which takes as its argument an interval $[a, b)$ and returns the number of utterances made by $p$ during that interval, the stochastic rate function for $p$’s Hawkes process is

$$\lambda^{(p)}(t) = \lambda^{(p)}_0 + \sum_{q \neq p} \int_0^{t^-} g^{(qp)}(t, u)\,\mathrm{d}N^{(q)}(u) \qquad (1)$$

$$= \lambda^{(p)}_0 + \sum_{q \neq p} \sum_{n \,:\, t'^{(q)}_n \,\cdots}$$

[…] $t^{*}\}_{p=1}^{P}$.

The predictive probability of held-out data is then $P(\mathcal{W}_{\text{test}} \mid \mathcal{T}_{\text{test}}, \mathcal{D}_{\text{test}}, \mathcal{W}_{\text{train}}, \mathcal{T}_{\text{train}}, \mathcal{D}_{\text{train}})$. Although this probability is analytically intractable, its logarithm can be approximated via the lower bound $\frac{1}{S} \sum_{s=1}^{S} \log P(\mathcal{W}_{\text{test}} \mid \mathcal{T}_{\text{test}}, \mathcal{D}_{\text{test}}, \Theta^{(s)})$, where $\Theta^{(s)}$ denotes a set of sampled parameter values drawn from the posterior distribution $P(\Theta \mid \mathcal{W}_{\text{train}}, \mathcal{T}_{\text{train}}, \mathcal{D}_{\text{train}})$.
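The lower bound on the predictive log probability follows from Jensen's inequality: the average of per-sample log probabilities never exceeds the log of the averaged probability. The sketch below illustrates this numerically. The function names and the four example log probabilities are hypothetical stand-ins, not values from the paper; in practice each entry would be the held-out log probability evaluated at one posterior sample $\Theta^{(s)}$.

```python
import math

def jensen_lower_bound(log_probs):
    """Average of per-sample log probabilities: (1/S) * sum_s log P(W_test | Theta_s)."""
    return sum(log_probs) / len(log_probs)

def log_mean_exp(log_probs):
    """Log of the Monte Carlo average (1/S) * sum_s P(W_test | Theta_s).

    The maximum is subtracted before exponentiating to avoid underflow.
    """
    m = max(log_probs)
    return m + math.log(sum(math.exp(lp - m) for lp in log_probs) / len(log_probs))

# Hypothetical held-out log probabilities under S = 4 posterior samples.
sample_log_probs = [-4293.1, -4292.8, -4293.0, -4292.9]

bound = jensen_lower_bound(sample_log_probs)   # the quantity used as a lower bound
estimate = log_mean_exp(sample_log_probs)      # log of the averaged probability
```

The two quantities coincide only when every sample assigns the same probability to the held-out data, so the gap between them gives a rough sense of how much the bound loses.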
Approximate log probabilities obtained using the Bayesian Echo Chamber (with $S = 3000$ samples after 1000 burn-in sampling iterations), a unigram language model, and Blei and Lafferty’s dynamic topic model [2006] are provided in table 1. Log probabilities for additional data sets are provided in the supplementary material. The unigram language model is equivalent to setting all influence parameters in our model to zero. In all experiments involving the dynamic topic model, each data set was sliced into $K = 10$ or $K = 5$ equally-sized time slices (depending on the training–testing split), with the last slice taken to be the test set and utterances treated as documents. Each log probability reported for the dynamic topic model is the highest value obtained using either 5, 10, or 20 topics. Since inference for the dynamic topic model was performed using a variational inference algorithm,[3] its log probabilities are also lower bounds and standard deviations are not available. For all data sets, the Bayesian Echo Chamber outperformed both the unigram language model and the dynamic topic model.

5.3 Influence Recovery

In this section, we demonstrate that the Bayesian Echo Chamber can recover known influence patterns in Supreme Court arguments and in the movie “12 Angry Men.” We also use these data sources to compare influence networks inferred by our model to those inferred by the model described in section 2. All reported influence parameters were obtained by averaging 3,000 posterior samples; posterior standard deviations are provided in the supplementary material.

5.3.1 US Supreme Court

As described previously, Supreme Court arguments are extremely formulaic: the attorneys representing the petitioner present their argument first, speaking for a total of 30 minutes before the respondent’s attorneys are allowed to present their argument. Justices routinely interrupt these presentations.
We therefore anticipate that influence networks inferred from linguistic accommodation patterns will reveal significant influence exerted by the petitioner’s attorneys, simply because they speak first, establishing the language used in the rest of the discussion. We also anticipate that influence networks inferred from turn-taking behavior will reveal significant influence exerted by the justices over the attorneys. This is because the justices interrogate the attorneys during their presentations.

[3] Inference code obtained from http://www.cs.princeton.edu/~blei/topicmodeling.html

Table 1: Predictive Log Probabilities of Held-Out Data.

| Data Set | Our Model (10% Test Set) | Unigram (10% Test Set) | DTM (10% Test Set) | Our Model (20% Test Set) | Unigram (20% Test Set) | DTM (20% Test Set) |
|---|---|---|---|---|---|---|
| Synthetic | -4292.97 ± 0.02 | -4297.92 ± 0.04 | -4364.81 | -8702.92 ± 0.04 | -8717.77 ± 0.08 | -8948.07 |
| DC v. Heller | -7383.45 ± 0.12 | -7794.25 ± 0.21 | -7533.58 | -12404.21 ± 0.15 | -13126.73 ± 0.26 | -12744.73 |
| L&G v. Texas | -6663.33 ± 0.12 | -6937.66 ± 0.18 | -6759.06 | -10248.80 ± 0.21 | -10791.25 ± 0.23 | -10459.87 |
| Citizens United v. FEC | -5770.12 ± 0.14 | -6120.67 ± 0.18 | -5851.224 | -16370.7 ± 0.95 | -17157.21 ± 0.40 | -16400.46 |
| “12 Angry Men” | -4667.47 ± 0.24 | -4920.21 ± 0.14 | -4691.11 | -8722.97 ± 0.27 | -9222.99 ± 0.25 | -8787.35 |

Figure 2: Influence Networks (Posterior Mean) Inferred from the DC v. Heller Case. (a) Network Inferred Using Our Model and (b) Inferred Using Blundell et al.’s Model. (c) Total Influence Exerted/Received by Each Person.

As an illustrative example, we present results obtained from the District of Columbia v.
Heller case in figure 2. (The other two cases, Lawrence and Garner v. Texas and Citizens United v. Federal Election Commission, exhibited remarkably similar influence networks.) The influence network[4] inferred using the Bayesian Echo Chamber is shown in figure 2(a), while the network inferred using Blundell et al.’s model is shown in figure 2(b). To illustrate posterior uncertainty, networks drawn with different posterior quantiles are provided in the supplementary material. The total influence exerted and received by each participant are shown for each model in figure 2(c). The error bars represent the posterior standard deviation. The justices present for this case were Alito, Breyer, Ginsburg, Kennedy, Roberts, Scalia, Stevens, Souter, and Thomas, while the attorneys were Dellinger (representing the petitioner), Gura (representing the respondent), and Clement (as amicus curiae, supporting the petitioner). Ultimately, Alito, Kennedy, Roberts, Scalia, and Thomas (the majority) sided with the respondent, while Breyer, Ginsburg, Souter, and Stevens (the minority) sided with the petitioner. Neither Alito nor Thomas spoke ten or more utterances, so they were not included in our analyses.

[4] Plotted using qgraph [Epskamp et al., 2012].

The influence network inferred using our model is very sparse. As expected, Dellinger (who represented the petitioner and presented his argument first) is shown as exerting the most influence. The justices with the most influence are Kennedy and Roberts, both of whom ultimately supported the respondent and thus interrogated Dellinger much more than the other justices.

The most striking pattern in the influence network inferred using Blundell et al.’s model is that the three attorneys received much more influence from the justices than vice versa.
This pattern could be seen as reflecting the status difference between justices and attorneys or as reflecting the formulaic structure of the Supreme Court: attorneys present arguments, while justices interrupt to make comments or ask questions.

5.3.2 “12 Angry Men”

Unlike Supreme Court arguments, the dialog in “12 Angry Men” is informal and intended to seem natural. Since the focus of the movie is discussion-based consensus building in a group setting, we therefore anticipate that the narrative of the movie will be reflected in influence networks inferred from linguistic accommodation patterns and from turn-taking behavior.

Figure 3: Influence Networks (Posterior Mean) Inferred from “12 Angry Men.” (a) Network Inferred Using Our Model and (b) Inferred Using Blundell et al.’s Model. (c) Total Influence Exerted/Received by Each Person.

The influence network inferred using our model is shown in figure 3(a), while the total influence exerted and received by each juror are shown in the top of figure 3(c). The most significant pattern is that three individuals exert more influence over others than others do over them: Juror 8, Juror 3, and, to a lesser extent, Juror 10. Juror 8 is the protagonist of the movie, and initially casts the only “not guilty” vote. The other jurors ultimately change their votes to match his. Juror 3, the antagonist, is the last to change his vote. It is therefore unsurprising that Juror 8, the first to vote “not guilty,” should dominate the discussion content. Similarly, Juror 3, the last to change his “guilty” vote, is most invested in discussing the defendant’s supposed guilt.
Juror 10 is one of the last three jurors, along with Jurors 3 and 4, to change his vote. However, unlike Juror 4 (who stands out marginally in figure 3(a) and, according to figure 3(c), has less influence over others than others do over him), Juror 10 is argumentative as he changes his mind. Overall, the consistency of the inferred influence network with the narrative of the movie confirms that the Bayesian Echo Chamber can indeed uncover substantive influence relationships.

The influence network inferred using Blundell et al.’s model and the total influence exerted and received by each juror are shown in figure 3(b) and the bottom of figure 3(c), respectively. The four jurors who exert more influence over others than others do over them (Juror 2, Juror 5, Juror 9, and Juror 11) are the first four jurors to change their votes. Jurors 5 and 11, who exert the most influence, are verbose, while Jurors 2 and 9 are comparatively taciturn. Juror 8 exerts little influence because he must respond to questions and defend his position as he tries to persuade the others to agree with him, much like the attorneys in the Supreme Court.

5.4 Exploratory Analysis of FOMC Meetings

Finally, we performed an exploratory analysis of the relationships inferred from transcripts of 32 Federal Open Market Committee meetings surrounding the 2007–2008 financial crisis, ranging from March 27, 2006 to December 15, 2008, inclusive. Since utterance durations are not available for these transcripts (also preventing the use of Blundell et al.’s model), we set the duration of each utterance to a value proportional to its length in tokens. We divided the meetings into three subsets: March 27, 2006 through June 28, 2006; August 8, 2006 through August 7, 2007; and August 10, 2007 through December 15, 2008.
The first subset corresponds to meetings with a resultant policy of tightening; the second to meetings with a neutral outcome; and the third to meetings that resulted in easing. These meetings were all chaired by Bernanke.

Figure 4: Influence Networks (Posterior Mean) Inferred from FOMC Meetings Using Our Model. (a) March 27, 2006–June 28, 2006. (b) August 8, 2006–August 7, 2007. (c) August 10, 2007–December 15, 2008.

Figure 4 depicts the influence network for each subset (aggregated by averaging over the meetings in that subset) inferred using our model. In the first network, corresponding to pre-crash meetings from March 27, 2006 through June 28, 2006, Bernanke, Fisher, and Lacker play the biggest roles, with Bernanke, the chair, exerting the most influence over others. Given his role as chair, Bernanke’s involvement is arguably unsurprising, but Fisher and Lacker’s roles are notable. Unlike Bernanke, Fisher and Lacker are both “hawks” and thus generally in favor of tightening monetary policy; the meetings in this subset all resulted in an outcome of tightening. In the second network, corresponding to pre-crash meetings from August 8, 2006 through August 7, 2007, Bernanke, Fisher, and Lacker all continue to play significant roles, but the network is much less sparse, with both hawks and “doves” (those generally in favor of easing monetary policy) exerting influence over others. In contrast to the meetings in the previous subset, these meetings resulted in neutrality, i.e., neither tightening nor easing. Finally, in the third network, corresponding to post-crash meetings from August 10, 2007 through December 15, 2008, there are fewer strong influence relationships.
Bernanke (the chair and a dove) still plays a major role, while Fisher and Lacker’s roles are significantly diminished. Instead, Dudley, also a dove and a close ally of Bernanke, plays a much greater role, especially in his relationship with Bernanke. These meetings all resulted in monetary policy easing, a strategy generally favored by doves and opposed by hawks.

There has been little work in political science, economics, or computer science on analyzing these meeting transcripts. As a result, the inferred networks not only showcase our model’s ability to discover latent influence relationships from linguistic accommodation, but also constitute a research contribution of substantive interest to political scientists, economists, and other social scientists studying the financial crisis.

5.5 Model Combination

Since influence can be inferred from both turn-taking behavior and linguistic accommodation, we explored the possibility of combining the Bayesian Echo Chamber and Blundell et al.’s model to form a “supermodel” with a single set of shared influence parameters. The simplest way to share these parameters is to tie them together as $\rho^{(qp)} = r \nu^{(qp)}$, where $r$ is a scaling factor and $\rho^{(qp)}$ and $\nu^{(qp)}$ correspond to the influence from person $q$ to person $p$ in our model and Blundell et al.’s model, respectively. Tying the influence parameters in this way provides the model with the capacity to capture a global notion of influence that is based upon both turn-taking and linguistic accommodation.

This tied model, whose likelihood is the product of the Bayesian Echo Chamber’s likelihood and that of Blundell et al.’s model but with shared influence parameters, assigned lower probabilities to held-out data than the fully factorized model (i.e., separate influence parameters).
Log probabilities, obtained using a 90%–10% training–testing split and a vocabulary of $V = 300$ word types in order to reduce computation time, are provided in the supplementary material. Interestingly, the networks inferred by the model with tied parameters are extremely similar to those inferred using the Bayesian Echo Chamber. These results suggest that linguistic accommodation reflects a more informative notion of influence than that evidenced via turn-taking. We expect that investigating other ways of combining turn-taking-based models with ours will be a promising direction for future exploration.

6 DISCUSSION

The Bayesian Echo Chamber is a new generative model for discovering latent influence networks via linguistic accommodation patterns. We demonstrated that our model can recover known influence patterns in synthetic data, in arguments heard by the US Supreme Court, and in the movie “12 Angry Men.” We compared influence networks inferred using our model to those inferred using a variant of Blundell et al.’s turn-taking-based model and showed that by modeling linguistic accommodation patterns, our model infers different, and often more meaningful, influence networks. Finally, we showcased our model’s potential as an exploratory analysis tool for social scientists by inferring latent influence relationships between members of the Federal Reserve’s Federal Open Market Committee. Promising avenues for future work include (1) modeling linguistic accommodation separately for function and content words and (2) explicitly modeling the dynamic evolution of influence networks over time.

Acknowledgements

Thanks to Juston Moore and Aaron Schein for their work on early stages of this project, and to Aaron for the “Bayesian Echo Chamber” model name.
This work was supported in part by the Center for Intelligent Information Retrieval, in part by NSF grant #IIS-1320219, and in part by NSF grant #SBE-0965436. Any opinions, findings and conclusions or recommendations expressed in this material are the authors’ and do not necessarily reflect those of the sponsor.

References

Backstrom, L., Huttenlocher, D., Kleinberg, J., and Lan, X. (2006). Group formation in large social networks: Membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Blei, D. M. and Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning.

Blundell, C., Heller, K. A., and Beck, J. (2012). Modelling reciprocating relationships with Hawkes processes. In Advances in Neural Information Processing Systems.

Bremaud, P. and Massouli, L. (1996). Stability of nonlinear Hawkes processes. The Annals of Probability, pages 1563–1588.

Daley, D. J. and Vere-Jones, D. (1988). An Introduction to the Theory of Point Processes. Springer.

Danescu-Niculescu-Mizil, C., Lee, L., Pang, B., and Kleinberg, J. (2012). Echoes of power: Language effects and power differences in social interaction. In Proceedings of the 21st International Conference on World Wide Web, pages 699–708.

de Solla Price, D. J. (1965). Networks of scientific papers. Science, 149(3683):510–515.

DuBois, C., Butts, C., and Smyth, P. (2013). Stochastic blockmodeling of relational event dynamics. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, pages 238–246.

Epskamp, S., Cramer, A. O., Waldorp, L. J., Schmittmann, V. D., and Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48(4):1–18.

Fowler, J. H. (2006). Legislative cosponsorship networks in the US House and Senate. Social Networks, 28:454–465.

Gerrish, S. M. and Blei, D. M. (2010). A language-based approach to measuring scholarly impact. In Proceedings of the 26th International Conference on Machine Learning.

Hawkes, A. G. (1971). Point spectra of some self-exciting and mutually exciting point processes. Journal of the Royal Statistical Society: Series B (Methodology), 58:83–90.

Iwata, T., Shah, A., and Ghahramani, Z. (2013). Discovering latent influence in online social activities via shared cascade Poisson processes. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 266–274.

Linderman, S. W. and Adams, R. P. (2014). Discovering latent network structure in point process data. International Conference on Machine Learning (ICML).

MacWhinney, B. (2007). The TalkBank project. Creating and Digitizing Language Corpora: Synchronic Databases.

Neal, R. M. (2003). Slice sampling. Annals of Statistics, 31(3):705–767.

Perry, P. O. and Wolfe, P. J. (2013). Point process modelling for directed interaction networks. Journal of the Royal Statistical Society: Series B (Methodology), 75(5):821–849.

Rasmussen, J. G. (2013). Bayesian inference for Hawkes processes. Methodology and Computing in Applied Probability, 15(3):623–642.

Simma, A. and Jordan, M. I. (2010). Modeling events with cascades of Poisson processes. Proceedings of Uncertainty in Artificial Intelligence.

West, R. and Turner, L. (2010). Introducing Communication Theory: Analysis and Applications. McGraw Hill.

Zhou, K., Zha, H., and Song, L. (2013). Learning triggering kernels for multi-dimensional Hawkes processes. In Proceedings of the 30th International Conference on Machine Learning, pages 1301–1309.

Supplementary Material for “The Bayesian Echo Chamber”

1 INFLUENCE VIA TURN-TAKING

In this section, we provide appropriate priors and details of an inference algorithm for the variant of Blundell et al.’s model [2012] described in section 2 of the paper. For real-world group discussions, the utterance start times $\mathcal{T} = \{\mathcal{T}^{(p)}\}_{p=1}^{P}$ and durations $\mathcal{D} = \{\{\Delta t^{(p)}_n\}_{n=1}^{N^{(p)}(T)}\}_{p=1}^{P}$ are observed, while parameters $\Theta = \{\lambda^{(p)}_0, \{\nu^{(qp)}\}_{q \neq p}, \tau^{(p)}_T\}_{p=1}^{P}$ are unobserved; however, information about the values of these parameters can be quantified via their posterior distribution given $\mathcal{T}$ and $\mathcal{D}$, obtained via Bayes’ theorem, i.e.,

$$P(\Theta \mid \mathcal{T}, \mathcal{D}) \propto P(\mathcal{T} \mid \Theta, \mathcal{D})\, P(\Theta). \qquad (5)$$

The likelihood term has the form

$$P(\mathcal{T} \mid \Theta, \mathcal{D}) = \prod_{p=1}^{P} \exp\left(-\Lambda^{(p)}(T)\right) \prod_{n=1}^{N^{(p)}(T)} \lambda^{(p)}\left(t^{(p)}_n\right), \qquad (6)$$

where $\Lambda^{(p)}(T) = \int_0^T \lambda^{(p)}(t)\,\mathrm{d}t$ is the expected total number of utterances made over the entire observation interval from 0 to $T$ [Daley and Vere-Jones, 1988].

Like Blundell et al., we place an improper prior over $\lambda^{(p)}_0 > 0$. We also use priors to ensure that the multivariate Hawkes process is stationary. Specifically, we employ the stationarity condition of Bremaud and Massouli [1996]. If $M$ is a $P \times P$ matrix given by

$$M^{(qp)} = \int_u^{\infty} g^{(qp)}(t, u)\,\mathrm{d}t = \nu^{(qp)} \tau^{(p)}_T, \qquad (7)$$

then this condition requires the spectral radius of $M$ to be strictly less than one.
This condition is not straightforward to enforce with tractable constraints; however, since the spectral radius of $M$ is upper-bounded by any matrix norm, the condition may be enforced by requiring that $\|M\| < 1$ for a chosen matrix norm $\|\cdot\|$. We use the maximum absolute column sum norm:

$$\|M\|_{1 \to 1} = \max_{\|x\|_1 = 1} \|Mx\|_1 \qquad (8)$$

$$= \max_{p = 1, \cdots, P} \tau^{(p)}_T \sum_{q \neq p} \nu^{(qp)}. \qquad (9)$$

Rewriting this expression implies an improper joint prior over $\{\tau^{(p)}_T\}_{p=1}^{P}$ and $\{\{\nu^{(qp)}\}_{q \neq p}\}_{p=1}^{P}$ in which

$$0 < \tau^{(p)}_T < \frac{1}{\sum_{q \neq p} \nu^{(qp)}} \quad \text{and} \qquad (10)$$

$$0 < \nu^{(qp)} < \frac{1}{\tau^{(p)}_T} - \sum_{r \neq q,\, r \neq p} \nu^{(rp)}. \qquad (11)$$

Although the resultant posterior distribution $P(\Theta \mid \mathcal{T}, \mathcal{D})$ is analytically intractable, posterior samples can be drawn using either the conditional intensity function approach or the cluster process approach described by Rasmussen [2013]. Like Blundell et al., we take the former approach and use a slice-within-Gibbs algorithm [Neal, 2003] that sequentially samples each parameter from its conditional posterior.

This slice-within-Gibbs algorithm requires frequent evaluation of the likelihood in equation 6; however, the computational cost can be reduced by noting that the product over rate functions can be efficiently computed using the following recurrence relation:

$$\lambda^{(p)}\left(t^{(p)}_n\right) = \lambda^{(p)}_0 + \left(\lambda^{(p)}\left(t^{(p)}_{n-1}\right) - \lambda^{(p)}_0\right) \exp\left(-\frac{t^{(p)}_n - t^{(p)}_{n-1}}{\tau^{(p)}_T}\right) + \sum_{q \neq p} \sum_{m \,:\, t^{(p)}_{n-1} \le t'^{(q)}_m \,\cdots}$$

[…]
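The stationarity constraint in equations 7 through 11 can be checked numerically for a candidate parameter setting. The following is a minimal sketch, assuming the influence parameters are stored as a nested list `nu[q][p]` and the time scales as a list `tau[p]`; these names, the helper functions, and the example values are hypothetical and are not taken from the authors' implementation. Because the column sum norm only upper-bounds the spectral radius, the check is sufficient but not necessary for stationarity.

```python
def column_sum_norm(nu, tau):
    """Maximum absolute column sum of M, where M[q][p] = nu[q][p] * tau[p] for q != p.

    nu[q][p] is the influence of person q on person p; tau[p] is person p's time scale.
    """
    P = len(tau)
    return max(
        tau[p] * sum(abs(nu[q][p]) for q in range(P) if q != p)
        for p in range(P)
    )

def tau_upper_bounds(nu):
    """Upper bound on each tau[p] implied by the norm constraint: 1 / sum_{q != p} nu[q][p]."""
    P = len(nu)
    bounds = []
    for p in range(P):
        total = sum(nu[q][p] for q in range(P) if q != p)
        # With no incoming influence, the constraint places no finite limit on tau[p].
        bounds.append(1.0 / total if total > 0 else float("inf"))
    return bounds

def is_stationary(nu, tau):
    """Sufficient (not necessary) check: the column sum norm of M is strictly below one."""
    return column_sum_norm(nu, tau) < 1.0

# Hypothetical parameters for P = 3 people (diagonal entries are ignored).
nu = [[0.0, 0.5, 0.2],
      [0.3, 0.0, 0.4],
      [0.1, 0.2, 0.0]]
tau = [1.5, 1.0, 1.2]

norm = column_sum_norm(nu, tau)
bounds = tau_upper_bounds(nu)
```

A sampler that proposes a new value for one `tau[p]` or `nu[q][p]` could use the same bounds to restrict the proposal interval, which is one way to read the improper joint prior stated above.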