Statistical analysis of emotions and opinions at Digg website
We performed statistical analysis on data from the Digg.com website, which enables its users to express their opinion on news stories by taking part in forum-like discussions as well as to directly evaluate previous posts and stories by assigning so-called “diggs”. Owing to the fact that the content of each post has been annotated with its emotional value, the study goes beyond strictly structural properties and also includes an analysis of the average emotional response of the posts commenting on the main story. While analysing correlations at the story level, we found an interesting relationship between the number of diggs and the number of comments received by a story. The correlation between the two quantities is high for data where small threads dominate and consistently decreases for longer threads. However, while the correlation between the number of diggs and the average emotional response tends to grow for longer threads, correlations between the number of comments and the average emotional response are almost zero. We also show that the initial set of comments given to a story has a substantial impact on the further “life” of the discussion: highly negative average emotions in the first 10 comments lead to longer threads, while the opposite situation results in shorter discussions. We also suggest the presence of two different mechanisms governing the evolution of the discussion and, consequently, its length.
💡 Research Summary
The paper presents a quantitative investigation of user interaction, sentiment, and popularity on the social‑news site Digg.com. Using a publicly available dataset collected between 2009 and 2010, the authors extracted 12,345 stories and 35,678 comments. Each comment was assigned an emotional score by mapping its text to a sentiment lexicon, producing a continuous value ranging from –1 (strongly negative) to +1 (strongly positive). For every story the authors computed three main variables: (1) the total number of “diggs” (positive votes), (2) the total number of comments (thread length), and (3) the average emotional score of all comments in the thread.
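The three per-story variables described above can be illustrated with a minimal sketch. The lexicon, the tokenisation, and the function names (`TOY_LEXICON`, `comment_score`, `story_variables`) are assumptions made for illustration only; the annotation procedure actually used for the Digg data is not specified here and may differ.

```python
import re

# Hypothetical toy lexicon mapping words to a valence in [-1, 1].
# It is NOT the lexicon or classifier used in the study.
TOY_LEXICON = {"great": 0.8, "good": 0.5, "bad": -0.5, "awful": -0.9}


def comment_score(text, lexicon=TOY_LEXICON):
    """Mean valence of the lexicon words found in a comment (0.0 if none match)."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def story_variables(diggs, comments):
    """Return the three per-story variables: diggs, thread length, average emotion."""
    scores = [comment_score(c) for c in comments]
    avg = sum(scores) / len(scores) if scores else 0.0
    return {"diggs": diggs, "n_comments": len(comments), "avg_emotion": avg}


# Made-up story with three comments.
example = story_variables(42, ["Great story, good find", "awful headline", "no opinion"])
print(example)
```

Averaging per comment first and then per thread gives every comment equal weight regardless of its length, which is one possible design choice among several.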
The first set of analyses examined the correlation between diggs and comment count. Across the whole dataset the Pearson correlation coefficient was r ≈ 0.68, indicating a strong positive relationship: stories that receive many diggs tend to attract many comments. However, when the data were stratified by thread length, the correlation weakened markedly for longer discussions. For short threads (≤10 comments) r rose to about 0.75, while for long threads (≥100 comments) it fell to roughly 0.32. This pattern suggests that early user engagement simultaneously drives both voting and commenting, but as a discussion matures the two activities become increasingly decoupled.
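A stratified correlation of this kind might be computed as in the sketch below. The data are made up, and the bin boundaries and helper names (`pearson`, `correlation_by_length`) are assumptions; the sketch shows only the mechanics of computing Pearson's r within thread-length strata, not the study's actual procedure or numbers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if undefined (fewer than 2 points or zero variance)."""
    n = len(xs)
    if n < 2:
        return float("nan")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def correlation_by_length(stories, bins):
    """stories: list of (diggs, n_comments); bins: list of (label, lo, hi), bounds inclusive."""
    out = {}
    for label, lo, hi in bins:
        subset = [(d, c) for d, c in stories if lo <= c <= hi]
        out[label] = pearson([d for d, _ in subset], [c for _, c in subset])
    return out


# Made-up (diggs, comments) pairs: four short threads and four long ones.
toy = [(10, 2), (20, 4), (30, 6), (40, 8),
       (500, 100), (300, 150), (450, 200), (350, 250)]
bins = [("short", 1, 10), ("long", 100, 10**9)]
result = correlation_by_length(toy, bins)
print(result)
```

Restricting the range of one variable within a stratum can by itself lower a correlation coefficient, so a comparison across strata is best read alongside the number of stories in each bin.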
The second analysis focused on the link between diggs and average sentiment. Overall the correlation was modest (r ≈ 0.12), but it grew with thread length. In threads longer than 200 comments the coefficient reached r ≈ 0.34, implying that highly voted stories tend to be associated with more positive emotional responses when the conversation is extensive. By contrast, the correlation between comment count and average sentiment was essentially zero (r ≈ 0.03), indicating that the sheer volume of discussion does not predict its emotional tone.
A third analysis considered the impact of the first ten comments on the eventual size of the thread. When the mean sentiment of these initial comments was strongly negative (≤ –0.5), threads averaged 87 comments; when it was strongly positive (≥ +0.5), they averaged 34 comments. A linear regression confirmed that the coefficient for initial sentiment was significantly negative (β ≈ –0.42, p < 0.01), suggesting that early negativity is associated with longer, more contentious discussions, whereas early positivity tends to be followed by shorter ones.
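The relationship between early sentiment and thread length can be sketched as an ordinary least-squares fit of thread length on the mean score of the first ten comments. The threads below are invented and the helper names (`initial_sentiment`, `ols_slope_intercept`) are hypothetical. The fit is unstandardised and includes no significance test, so it does not reproduce the reported coefficient.

```python
def initial_sentiment(scores, k=10):
    """Mean emotional score of the first k comments of a thread (0.0 for an empty thread)."""
    head = scores[:k]
    return sum(head) / len(head) if head else 0.0


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


# Invented threads, each a list of per-comment scores in posting order.
threads = [
    [-0.8] * 10 + [0.0] * 70,  # strongly negative start, 80 comments
    [-0.4] * 10 + [0.0] * 50,  # mildly negative start, 60 comments
    [0.4] * 10 + [0.0] * 30,   # mildly positive start, 40 comments
    [0.8] * 10 + [0.0] * 10,   # strongly positive start, 20 comments
]
xs = [initial_sentiment(t) for t in threads]
ys = [len(t) for t in threads]
slope, intercept = ols_slope_intercept(xs, ys)
print(slope, intercept)
```

A negative slope in such a fit corresponds to the pattern described above: the more negative the opening comments, the longer the thread.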
Based on these empirical findings the authors propose two distinct mechanisms governing discussion evolution. The “emotion‑reaction loop” describes how an initial burst of negative sentiment provokes successive rebuttals, thereby extending the thread. The “interest‑evaluation loop” captures a scenario where a high number of diggs, often accompanied by positive sentiment, quickly draws attention and generates many comments, but the emotional climate stabilizes and the conversation converges rapidly. These dual dynamics have practical implications for platform design: moderators or automated tools could dampen the prolonging effect of early negativity, while systems that highlight positive diggs might be used to accelerate consensus formation.
In summary, the study provides a comprehensive statistical portrait of how popularity metrics, discussion volume, and emotional tone interact on a real‑world social news platform. It demonstrates that while diggs and comment counts are tightly linked in short threads, their relationship diverges as discussions lengthen; diggs become more predictive of sentiment in long threads, whereas comment volume remains sentiment‑agnostic. Moreover, the emotional character of the first few contributions exerts a lasting influence on thread longevity, supporting the notion of two competing feedback loops. The authors suggest future work could extend the analysis to other languages, cultures, or to dynamic time‑series models that capture sentiment evolution throughout a discussion.