Topic-Level Opinion Influence Model (TOIM): An Investigation Using Tencent Micro-Blogging


Mining user opinions from Micro-Blogging has been extensively studied on the most popular social networking sites, such as Twitter and Facebook in the U.S., but few studies have examined Micro-Blogging websites in other countries (e.g., China). In this paper, we analyze social opinion influence on Tencent, one of the largest Micro-Blogging websites in China, to unveil the behavior patterns of Chinese Micro-Blogging users. We propose a Topic-Level Opinion Influence Model (TOIM) that simultaneously incorporates topic factors and direct social influence in a unified probabilistic framework. Based on TOIM, two topic-level opinion influence propagation and aggregation algorithms are developed to account for indirect influence: CP (Conservative Propagation) and NCP (Non-Conservative Propagation). TOIM leverages users' historical social interaction records to construct their progressive opinions and their neighbors' opinion influence through a statistical learning process, which can then be used to predict users' future opinions on specific topics. To evaluate the proposed model, we designed an experiment on a sub-dataset from Tencent Micro-Blogging. The experimental results show that TOIM outperforms baseline methods at predicting users' opinions. CP and NCP do not differ significantly from each other, and both significantly improve the recall and F1-measure of TOIM.


Research Summary

The paper tackles the relatively under‑explored domain of opinion mining on Chinese micro‑blogging platforms by focusing on Tencent Weibo, one of the largest social media services in China. While extensive work has been done on Twitter and Facebook in the United States, the cultural, linguistic, and interaction patterns unique to Chinese users have received far less attention. To fill this gap, the authors propose the Topic‑Level Opinion Influence Model (TOIM), a unified probabilistic framework that simultaneously incorporates topic semantics and direct social influence, and they further extend it with two propagation mechanisms—Conservative Propagation (CP) and Non‑Conservative Propagation (NCP)—to capture indirect influence across the network.

Model Construction
TOIM begins with a standard preprocessing pipeline for Chinese text, employing a word segmentation tool and a custom sentiment lexicon. Using Latent Dirichlet Allocation (LDA), each micro‑blog post is represented as a distribution over K latent topics. For each topic, a sentiment label (positive or negative) is assigned by combining the lexicon scores with the user’s historical posting behavior, yielding a topic‑sentiment pair for every document.
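
The topic-sentiment pairing described above can be illustrated with a minimal sketch. It assumes the post has already been segmented into tokens and that a per-post topic distribution is available from a fitted topic model such as LDA. The lexicon, the function names, and the rule of labelling zero scores as positive are all hypothetical, and the sketch leaves out the historical-behavior component because the summary does not say how it is combined with the lexicon scores.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> polarity score (not the authors' lexicon).
LEXICON: Dict[str, float] = {"good": 1.0, "great": 1.5, "bad": -1.0, "awful": -1.5}


def lexicon_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Sum the polarity scores of the tokens that appear in the lexicon."""
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def topic_sentiment_pair(tokens: List[str], topic_dist: List[float],
                         lexicon: Dict[str, float]) -> Tuple[int, str]:
    """Return (dominant topic index, sentiment label) for one post.

    topic_dist is assumed to come from an already-fitted topic model.
    A score of zero is labelled 'positive' here purely for simplicity.
    """
    topic = max(range(len(topic_dist)), key=lambda k: topic_dist[k])
    label = "positive" if lexicon_score(tokens, lexicon) >= 0 else "negative"
    return topic, label


# Example: a post dominated by topic 1 that contains two negative words.
pair = topic_sentiment_pair(["service", "awful", "bad"], [0.1, 0.7, 0.2], LEXICON)
```

Taking only the dominant topic is a simplification; a fuller treatment could keep the whole distribution and weight the sentiment by topic probability.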

The second component models direct social influence. Interaction logs—follow, mention, retweet, and comment events—are aggregated into a weighted adjacency matrix W, where entry w_{ij} reflects the strength of user i’s influence on user j. The weight incorporates interaction frequency, recency (via exponential decay), and interaction type. Each user is associated with two latent vectors: a topic‑opinion distribution θ_i and an influence distribution φ_i. These vectors are learned jointly through an Expectation‑Maximization (EM) algorithm that maximizes the likelihood of observed posts given the latent variables and the network structure.
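
One way the weighted adjacency matrix could be assembled is sketched below. The summary names the three ingredients (frequency, exponential recency decay, interaction type) but not the formula, so the additive form, the per-type weights, the `decay_rate` parameter, and the function name are assumptions made for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical per-type weights; the summary does not give the actual values.
TYPE_WEIGHT = {"follow": 0.5, "mention": 1.0, "retweet": 2.0, "comment": 1.5}

# (influencer, influenced, interaction kind, age of the event in days)
Interaction = Tuple[str, str, str, float]


def build_weights(interactions: Iterable[Interaction],
                  decay_rate: float = 0.1) -> Dict[Tuple[str, str], float]:
    """Accumulate w[(i, j)], the assumed strength of user i's influence on user j.

    Each event adds type_weight * exp(-decay_rate * age), so frequent,
    recent, and heavily weighted interactions all increase the edge weight.
    """
    w: Dict[Tuple[str, str], float] = defaultdict(float)
    for src, dst, kind, age_days in interactions:
        w[(src, dst)] += TYPE_WEIGHT[kind] * math.exp(-decay_rate * age_days)
    return dict(w)


# Example: two fresh events on the same edge and one older event on another.
weights = build_weights([
    ("a", "b", "retweet", 0.0),
    ("a", "b", "mention", 0.0),
    ("c", "b", "retweet", 10.0),
])
```

With `decay_rate = 0.1`, an event that is 10 days old contributes `exp(-1)`, roughly 0.37, of its type weight. A sparse dictionary keyed by user pairs is used instead of a dense matrix because most user pairs never interact.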

Propagation Algorithms
After learning θ and φ, the model predicts future opinions by propagating existing opinions through the network. CP follows a “trust‑preserving” strategy: the opinion of a neighbor is transmitted unchanged but scaled by the corresponding influence weight φ, thereby maintaining the original sentiment intensity. NCP, in contrast, averages the opinions of all neighbors and additionally incorporates a topic‑correlation matrix Σ that allows opinions to spill over between related topics. Both algorithms iteratively update the opinion probabilities of all users until convergence.
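
The two update rules, as this summary describes them, can be sketched as follows. This is an illustrative reading and not the authors' algorithm: the mixing factor `alpha`, the normalisation by total incoming weight, the row-normalised `sigma`, and all function names are assumptions. Opinions are stored as per-topic probabilities of a positive opinion, and the weights are keyed by (influencer, influenced) pairs.

```python
from typing import Callable, Dict, List, Tuple

Opinions = Dict[str, List[float]]        # user -> per-topic P(positive)
Weights = Dict[Tuple[str, str], float]   # (influencer, influenced) -> weight


def cp_step(op: Opinions, w: Weights, alpha: float = 0.5) -> Opinions:
    """CP-style update: neighbour opinions scaled by influence weight, per topic."""
    new: Opinions = {}
    for user, own in op.items():
        incoming = [(s, wt) for (s, d), wt in w.items() if d == user and s in op]
        total = sum(wt for _, wt in incoming)
        if total == 0:
            new[user] = list(own)  # no influencers: opinion stays as it is
            continue
        new[user] = [
            (1 - alpha) * own[k]
            + alpha * sum(wt * op[s][k] for s, wt in incoming) / total
            for k in range(len(own))
        ]
    return new


def ncp_step(op: Opinions, w: Weights, sigma: List[List[float]],
             alpha: float = 0.5) -> Opinions:
    """NCP-style update: plain neighbour average, then cross-topic mixing.

    sigma[k][m] is an assumed topic-correlation weight; rows should sum to 1.
    """
    new: Opinions = {}
    for user, own in op.items():
        nbrs = [s for (s, d) in w if d == user and s in op]
        if not nbrs:
            new[user] = list(own)
            continue
        n = len(own)
        avg = [sum(op[s][k] for s in nbrs) / len(nbrs) for k in range(n)]
        mixed = [sum(sigma[k][m] * avg[m] for m in range(n)) for k in range(n)]
        new[user] = [(1 - alpha) * own[k] + alpha * mixed[k] for k in range(n)]
    return new


def propagate(op: Opinions, step: Callable[[Opinions], Opinions],
              tol: float = 1e-6, max_iter: int = 100) -> Opinions:
    """Apply step repeatedly until the largest change falls below tol."""
    for _ in range(max_iter):
        nxt = step(op)
        delta = max((abs(a - b) for u in op for a, b in zip(op[u], nxt[u])),
                    default=0.0)
        op = nxt
        if delta < tol:
            break
    return op
```

For example, `propagate(op, lambda o: cp_step(o, w))` runs the CP-style rule to convergence. Because each update is a convex combination of values in [0, 1], the opinion probabilities stay in that range.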

Experimental Setup
The authors extracted a sub‑dataset from Tencent Weibo comprising roughly 2 million posts sampled from a larger corpus of over 100 million entries. The data were split into training, validation, and test sets. Three baselines were employed for comparison: (1) an LDA‑SVM pipeline that classifies sentiment per topic, (2) a sentiment‑lexicon‑only model that ignores network effects, and (3) a classic Independent Cascade (IC) diffusion model that uses only network structure without topic awareness.

Performance was evaluated using accuracy, precision, recall, and F1‑score, with particular emphasis on recall and F1 as they reflect the model’s ability to capture the full set of relevant opinions, including those that are indirectly influenced.
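
For reference, the standard definitions of these metrics for a binary opinion label can be written as a short sketch. The function name and the choice of `"positive"` as the target class are illustrative assumptions; the summary does not state which class the reported scores treat as positive.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[str], y_pred: Sequence[str],
                        positive: str = "positive") -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one target class.

    Returns 0.0 for any metric whose denominator would be zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if not y_true:
        return 0.0
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
```

Recall is the fraction of truly positive opinions that the model recovers, which is why it is sensitive to opinions that a direct-influence-only model would miss.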

Results and Insights
TOIM consistently outperformed all baselines. On average, it improved accuracy by 14% and F1‑score by 16% relative to the best baseline. The recall increase was especially notable, indicating that the model identified opinions that would be missed by methods focusing solely on direct influence or on topic semantics. CP and NCP yielded comparable overall performance; however, NCP showed a modest (≈3%) advantage in recall on densely connected sub‑communities, suggesting that allowing cross‑topic diffusion can be beneficial in highly interactive environments.

A qualitative analysis demonstrated that TOIM can trace the evolution of opinion clusters on specific issues over time, providing actionable insights for marketers, policymakers, and researchers interested in public sentiment dynamics on Chinese platforms.

Conclusions and Future Work
The study confirms that integrating topic modeling with a statistically learned influence component yields a more expressive and accurate representation of opinion dynamics than treating these aspects separately. The proposed propagation mechanisms effectively capture both direct and indirect influence without incurring significant computational overhead. Future directions include extending the framework to multilingual settings, adapting it for real‑time streaming data, and incorporating temporal dynamics more explicitly to model how user influence evolves over longer periods.

