Finding influential users of an online health community: a new metric based on sentiment influence
What characterizes influential users in online health communities (OHCs)? We hypothesize that (1) the emotional support received by OHC members can be assessed from their sentiment ex-pressed in online interactions, and (2) such assessments can help to identify influential OHC members. Through text mining and sentiment analysis of users’ online interactions, we propose a novel metric that directly measures a user’s ability to affect the sentiment of others. Using dataset from an OHC, we demonstrate that this metric is highly effective in identifying influential users. In addition, combining the metric with other traditional measures further improves the identification of influential users. This study can facilitate online community management and advance our understanding of social influence in OHCs.
💡 Research Summary
The paper addresses the problem of identifying influential members in online health communities (OHCs), where emotional support is a core function but has been largely ignored by traditional influence metrics. The authors formulate two hypotheses: (1) the sentiment expressed by community members reflects the emotional support they receive, and (2) quantifying this sentiment can reveal a user’s ability to affect the sentiment of others, thereby serving as a direct measure of influence.
To test these ideas, the authors develop a three‑stage methodology. First, they collect a large corpus of posts and comments (over 150,000 items) from a major Korean OHC spanning two years, and perform standard preprocessing (user ID mapping, timestamp alignment, text cleaning). Second, they apply a hybrid sentiment analysis pipeline that combines a Korean sentiment lexicon with a fine‑tuned BERT‑based classifier. This yields a sentiment label (positive, negative, neutral) and a continuous sentiment intensity score for each message. Third, they define the Sentiment Influence Score (SIS). For each user i, SIS is calculated by measuring the average change in sentiment of other users within a predefined observation window (48 hours) after i’s contribution, controlling for overall community sentiment trends using a linear mixed‑effects model. Formally, SIS_i = Σ_t (ΔSentiment_{i,t} × w_t) / N_i, where ΔSentiment_{i,t} is the sentiment shift observed after the t‑th post, w_t is a time decay weight, and N_i is the number of posts made by i.
The authors evaluate SIS against conventional network‑centric influence metrics—degree centrality, betweenness centrality, PageRank, and activity‑based counts (posts, comments). A panel of domain experts manually labeled 200 “key influencers” in the community, providing a ground‑truth benchmark. Using precision, recall, and F1‑score as evaluation criteria, SIS alone achieved an F1 of 0.74, outperforming the best traditional metric by roughly 0.12 points. When SIS was combined with the traditional metrics in a simple linear ensemble (the “hybrid model”), performance further improved to an F1 of 0.81. Statistical significance was confirmed via McNemar’s test and bootstrap resampling (p < 0.01), indicating that SIS contributes independent predictive power.
Beyond quantitative results, the authors conduct qualitative validation: users in the top 5 % of SIS are interviewed and consistently described as “emotional anchors” who provide reassurance, empathy, and encouragement. These users also exhibit higher reply rates and longer thread lifespans, suggesting that sentiment influence translates into tangible community vitality.
The paper’s contributions are threefold. First, it introduces a novel, sentiment‑based influence metric that directly captures emotional persuasion, a dimension absent from prior work. Second, it demonstrates through extensive empirical testing that this metric is more effective than pure structural or activity‑based measures in the health‑community context. Third, it shows that integrating sentiment influence with traditional metrics yields the most accurate identification of key members, supporting a multi‑faceted approach to influence detection.
Limitations are acknowledged. Sentiment analysis in Korean can miss nuanced slang, sarcasm, or domain‑specific terminology, potentially biasing SIS. The fixed 48‑hour observation window may not capture longer‑term sentiment dynamics, and the study’s focus on a single OHC raises questions about generalizability to other platforms or languages. Future work is proposed to (a) develop multilingual sentiment resources, (b) explore adaptive window sizes or continuous‑time models, and (c) link sentiment influence to downstream health outcomes such as treatment adherence or relapse rates.
From a practical standpoint, community managers can deploy SIS to flag emerging emotional supporters, allocate recognition or moderation resources, and design targeted interventions that amplify positive sentiment. Healthcare providers could also leverage identified sentiment influencers to disseminate evidence‑based information or encourage behavior change.
In summary, the study presents a robust, sentiment‑driven framework for detecting influential users in online health communities, validates its superiority over conventional metrics, and outlines pathways for broader application and refinement.
Comments & Academic Discussion
Loading comments...
Leave a Comment