Predictability of conversation partners

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv paper.

Recent developments in sensing technologies have enabled us to examine the nature of human social behavior in greater detail. By applying an information theoretic method to the spatiotemporal data of cell-phone locations, [C. Song et al. Science 327, 1018 (2010)] found that human mobility patterns are remarkably predictable. Inspired by their work, we address a similar predictability question in a different kind of human social activity: conversation events. The predictability in the sequence of one’s conversation partners is defined as the degree to which one’s next conversation partner can be predicted given the current partner. We quantify this predictability by using the mutual information. We examine the predictability of conversation events for each individual using the longitudinal data of face-to-face interactions collected from two company offices in Japan. Each subject wears a name tag equipped with an infrared sensor node, and conversation events are marked when signals are exchanged between sensor nodes in close proximity. We find that the conversation events are predictable to some extent; knowing the current partner decreases the uncertainty about the next partner by 28.4% on average. Much of the predictability is explained by long-tailed distributions of interevent intervals. However, some predictability remains in the data even after accounting for the contribution of these long-tailed distributions. In addition, an individual’s predictability is correlated with that individual’s position in the static social network derived from the data. Individuals confined to a community - in the sense of an abundance of surrounding triangles - tend to have low predictability, and those bridging different communities tend to have high predictability.


💡 Research Summary

This paper investigates how predictable a person’s next conversation partner is, using high‑resolution face‑to‑face interaction data collected in two Japanese company offices. Each employee wore an infrared‑equipped name‑tag that recorded a “conversation event” whenever two tags faced each other within three meters for at least one minute. The two datasets comprise 163 participants over 73 days (≈52 k events) and 211 participants over 120 days (≈125 k events).

The authors disregard the timing of events because preliminary analysis showed weak temporal correlations, and instead focus on the ordered list of conversation partners for each individual – the “partner sequence”. For each person i they compute three entropy‑based quantities, following the methodology used for human mobility predictability:

  1. Random entropy H₀ᵢ = log₂(kᵢ), where kᵢ is the number of distinct partners i ever talked to. This is the entropy if all partners were chosen uniformly at random.

  2. Uncorrelated entropy H₁ᵢ = –∑₍j∈Nᵢ₎ Pᵢ(j) log₂ Pᵢ(j), where Nᵢ is the set of i’s partners and Pᵢ(j) is the empirical probability that i talks with partner j. This captures the heterogeneity of partner preferences.

  3. Conditional entropy H₂ᵢ = –∑₍j∈Nᵢ₎ Pᵢ(j) ∑₍ℓ∈Nᵢ₎ Pᵢ(ℓ|j) log₂ Pᵢ(ℓ|j), where Pᵢ(ℓ|j) is the probability that ℓ is the next partner after a conversation with j.

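The difference between the last two quantities is the mutual information Iᵢ = H₁ᵢ – H₂ᵢ, which quantifies how much knowing the current partner reduces uncertainty about the next partner. Iᵢ ranges from 0 (the current partner tells us nothing about the next one) to H₁ᵢ (the next partner is fully determined by the current one).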
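The three entropies and the mutual information can be estimated from a single partner sequence with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `partner_entropies` is hypothetical, and it assumes plug-in (frequency-based) estimates, with the weight Pᵢ(j) in H₂ᵢ taken over all events except the last one so that Pᵢ(j)·Pᵢ(ℓ|j) equals the observed frequency of the transition j → ℓ.

```python
from collections import Counter
from math import log2


def partner_entropies(seq):
    """Return (H0, H1, H2, I) in bits for one individual's partner sequence.

    Assumption: plain frequency (plug-in) estimates, no finite-size correction.
    """
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two conversation events")

    counts = Counter(seq)                       # how often each partner appears
    h0 = log2(len(counts))                      # random entropy, log2(k_i)
    h1 = -sum((c / n) * log2(c / n) for c in counts.values())

    # Transitions (current partner j -> next partner l).
    pair_counts = Counter(zip(seq, seq[1:]))
    prev_counts = Counter(seq[:-1])             # occurrences of j as "current"
    m = n - 1                                   # number of transitions
    h2 = 0.0
    for (j, _next), c in pair_counts.items():
        # (c / m) is the joint frequency P(j, l); c / prev_counts[j] is P(l | j).
        h2 -= (c / m) * log2(c / prev_counts[j])

    return h0, h1, h2, h1 - h2


# A strictly alternating sequence is fully predictable: H2 = 0 and I = H1.
h0, h1, h2, mi = partner_entropies(["a", "b"] * 50)
```

Because H₁ᵢ is computed over all events and H₂ᵢ over transitions only, the estimate of Iᵢ can deviate slightly from the ideal value for short sequences, which is one reason a minimum sequence length is useful.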

After discarding individuals with fewer than 100 conversation events (to avoid severe finite‑size bias), the authors find that every remaining individual has Iᵢ > 0. On average, H₂ᵢ is 28.4 % smaller than H₁ᵢ, meaning that knowing the current partner removes a little over a quarter of the uncertainty about the next partner.

A major source of this predictability is the burstiness of conversation patterns. Inter‑event intervals for a given pair follow a heavy‑tailed (power‑law) distribution, so after a conversation with j, the same pair is likely to converse again shortly thereafter. To quantify the contribution of burstiness, the authors randomize inter‑event intervals within each day while preserving the overall interval distribution. The resulting “burst‑randomized” mutual information I_burst accounts for about 79.5 % of the original Iᵢ, indicating that most of the predictability stems from the bursty temporal clustering of interactions.

To test whether any predictability remains after removing burst effects, consecutive conversations with the same partner are merged into a single event, eliminating repeated‑partner runs. The mutual information computed on these merged sequences (I_merge) is still positive (average ≈ 0.12 bits), confirming that beyond burstiness, there is an intrinsic sequential structure in partner selection.
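The merging step described above amounts to collapsing runs of identical consecutive partners. A minimal sketch follows; the helper name `merge_runs` is an assumption, not taken from the paper.

```python
from itertools import groupby


def merge_runs(seq):
    """Collapse consecutive conversations with the same partner into one event."""
    # groupby groups adjacent equal items, so each run yields exactly one key.
    return [partner for partner, _run in groupby(seq)]


merged = merge_runs(["a", "a", "b", "b", "b", "a", "c", "c"])
# merged == ["a", "b", "a", "c"]
```

After merging, no partner follows itself, so any mutual information measured on the merged sequence cannot come from immediate repetition of the same partner.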

The authors also examine the static conversation network built by aggregating all events: nodes are individuals, edge weights are total conversation counts. Both networks are single connected components with high clustering coefficients (≈ 0.65 and 0.61) and positive degree assortativity, typical of social graphs. For each individual they compute the clustering coefficient and the abundance of triangles (a proxy for being embedded in a tight community) as well as the presence of “weak links” that bridge different communities.
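As an illustration of the "abundance of triangles" idea, the sketch below computes the standard unweighted local clustering coefficient, i.e. the fraction of a node's neighbor pairs that are themselves connected. It assumes edge weights are ignored; the paper may use a weighted variant, and the function name `local_clustering` is hypothetical.

```python
from itertools import combinations


def local_clustering(edges):
    """Unweighted local clustering coefficient for every node in an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue                            # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    coeff = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeff[node] = 0.0                   # no neighbor pairs, no triangles
            continue
        # Count neighbor pairs that are directly connected (triangles at node).
        triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeff[node] = 2 * triangles / (k * (k - 1))
    return coeff


# "c" sits in one triangle (a, b, c) and also links out to "d".
coeff = local_clustering([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

In this toy graph, "a" and "b" have coefficient 1 (fully embedded in the triangle), while "c" has 1/3 because two of its three neighbor pairs are unconnected, the signature of a node that also reaches outside its group.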

A clear relationship emerges: individuals embedded in dense communities (high clustering, many surrounding triangles) exhibit low Iᵢ, i.e., their next partner is relatively unpredictable given the current one. Conversely, “bridge” individuals who connect disparate communities via weak links show higher Iᵢ, meaning their partner choice follows a more deterministic pattern. This suggests that social role—whether one is a community core or a broker—affects the sequential dynamics of conversations.

Statistical significance is verified through bootstrap tests: 100 random partner sequences (preserving partner frequencies but destroying temporal order) produce I values that are significantly lower than the empirical Iᵢ at the 1 % level. The authors also note that randomizing the order of simultaneous conversations (multiple partners within the same minute) does not inflate Iᵢ; in fact, it slightly reduces it, confirming robustness against this source of noise.
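A shuffling test of this kind can be sketched as follows. The code is an illustration under stated assumptions rather than the authors' procedure: the function names, the fixed random seed, and the "+1" correction in the empirical p-value are choices made here, and the mutual information is the plain frequency-based estimate.

```python
import random
from collections import Counter
from math import log2


def mutual_information(seq):
    """Plug-in estimate of I = H1 - H2 (bits) for a partner sequence."""
    n = len(seq)
    h1 = -sum((c / n) * log2(c / n) for c in Counter(seq).values())
    prev_counts = Counter(seq[:-1])
    pair_counts = Counter(zip(seq, seq[1:]))
    h2 = -sum((c / (n - 1)) * log2(c / prev_counts[j])
              for (j, _next), c in pair_counts.items())
    return h1 - h2


def shuffle_test(seq, n_shuffles=100, seed=0):
    """Compare the observed I with I on order-shuffled copies of the sequence.

    Shuffling keeps each partner's frequency but destroys the temporal order.
    Returns (observed I, list of shuffled I values, empirical p-value).
    """
    rng = random.Random(seed)
    observed = mutual_information(seq)
    shuffled = list(seq)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(mutual_information(shuffled))
    # Fraction of shuffles at least as large as the observed value (+1 corrected).
    p_value = (1 + sum(v >= observed for v in null)) / (n_shuffles + 1)
    return observed, null, p_value


# A strictly cyclic sequence should beat every shuffled copy.
observed, null, p_value = shuffle_test(["a", "b", "c"] * 40)
```

With 100 shuffles, the smallest attainable p-value under this convention is 1/101, just below the 1 % level. The shuffled values are not exactly zero because short sequences produce a positive finite-size bias in the estimate.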

In summary, the paper demonstrates that conversation partner sequences are not random. Predictability arises primarily from bursty interaction patterns, but a residual deterministic component remains, strongly linked to an individual’s position in the underlying social network. These findings challenge the common modeling assumption that state transitions (e.g., opinion changes, infection events) are Markovian and suggest that incorporating both temporal burstiness and network‑based role information could improve models of information spreading, opinion dynamics, and epidemic processes in real‑world social systems.

