Exploring Content and Social Connections of Fake News with Explainable Text and Graph Learning
The global spread of misinformation and concerns about content trustworthiness have driven the development of automated fact-checking systems. Since false information often exploits social media dynamics such as “likes” and user networks to amplify its reach, effective solutions must go beyond content analysis to incorporate these factors. Moreover, simply labelling content as false can be ineffective or even reinforce biases such as automation bias and confirmation bias. This paper proposes an explainable framework that combines content, social media, and graph-based features to enhance fact-checking. It integrates a misinformation classifier with explainability techniques to deliver complete and interpretable insights supporting classification decisions. Experiments demonstrate that multimodal information improves performance over single modalities, with evaluations conducted on datasets in English, Spanish, and Portuguese. Additionally, the framework’s explanations were assessed for interpretability, trustworthiness, and robustness with a novel protocol, showing that it effectively generates human-understandable justifications for its predictions.
💡 Research Summary
The paper presents mu2X, an end‑to‑end framework that jointly leverages textual content, shallow metadata, and the local social‑graph structure of a social‑media post to detect misinformation and to provide human‑understandable explanations for its decisions. A post is formalized as a tuple (T, Gₖ, M, lang) where T is the raw text, Gₖ denotes the k‑hop neighborhood (replies, quotes, mentions, etc.), M contains numeric metadata (likes, retweets, etc.), and lang indicates the language. Textual encoding is performed by language‑specific pretrained language models: BERTweet for English, BERTweet‑BR for Portuguese, and RoBERTa‑uito for Spanish. Metadata features are linearly projected and concatenated with the text embedding to produce a multimodal node vector Xₚᵢ.
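The fusion step above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the dimensions, the `build_node_vector` helper, and the random stand-in for the BERTweet embedding are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (the paper summary does not specify them)
D_TEXT = 8   # text embedding size from the pretrained language model
D_META = 3   # raw metadata features, e.g. likes, retweets, replies
D_PROJ = 4   # size of the linear metadata projection

def build_node_vector(text_emb, metadata, W_meta, b_meta):
    """Linearly project the metadata and concatenate it with the text
    embedding to form the multimodal node vector X_pi."""
    meta_proj = metadata @ W_meta + b_meta        # shape (D_PROJ,)
    return np.concatenate([text_emb, meta_proj])  # shape (D_TEXT + D_PROJ,)

text_emb = rng.standard_normal(D_TEXT)       # stand-in for a BERTweet output
metadata = np.array([120.0, 35.0, 7.0])      # likes, retweets, replies
W_meta = rng.standard_normal((D_META, D_PROJ))
b_meta = np.zeros(D_PROJ)

x_pi = build_node_vector(text_emb, metadata, W_meta, b_meta)
```

In practice the projection weights `W_meta` would be learned jointly with the rest of the model, and metadata counts would typically be normalized (e.g. log-scaled) before projection.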
The detection component employs a Graph Attention Network (GAT) that aggregates the multimodal vectors of a node’s neighbors using self‑attention weights αₖⱼ, followed by a linear transformation and a non‑linear activation to obtain a final node representation mₚᵢ. A softmax classifier then outputs the probability of the node being misinformation (label 0) or factual (label 1). By integrating the graph structure, the model captures propagation patterns, such as bursts of retweets or coordinated mentions, that are indicative of misinformation campaigns.
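A single-head GAT update of the kind described above can be sketched in plain NumPy. This is a hedged illustration of the standard GAT mechanism (Veličković et al.), not mu2X's exact layer: the LeakyReLU attention scoring, dimensions, and tanh activation are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def gat_node_update(X, neighbors, i, W, a):
    """Single-head GAT update for node i.

    X: (N, F) multimodal node vectors; neighbors: indices of i's
    neighborhood (including i itself); W: (F, F') shared linear map;
    a: (2*F',) attention vector.
    """
    h = X @ W  # transform all node vectors
    # Unnormalized attention logits e_ij = LeakyReLU(a^T [h_i || h_j])
    pairs = np.concatenate(
        [np.tile(h[i], (len(neighbors), 1)), h[neighbors]], axis=1)
    scores = pairs @ a
    logits = np.maximum(0.2 * scores, scores)   # LeakyReLU
    alpha = softmax(logits)                     # attention weights over neighbors
    m_i = np.tanh(alpha @ h[neighbors])         # aggregated node representation
    return m_i, alpha

rng = np.random.default_rng(1)
N, F, Fp = 5, 6, 4
X = rng.standard_normal((N, F))
W = rng.standard_normal((F, Fp))
a = rng.standard_normal(2 * Fp)
m_i, alpha = gat_node_update(X, [0, 1, 2], 0, W, a)

# Softmax classifier head: P(misinformation), P(factual)
W_cls = rng.standard_normal((Fp, 2))
probs = softmax(m_i @ W_cls)
```

Stacking such layers over the k-hop neighborhood Gₖ lets attention weights highlight which replies, quotes, or mentions most influenced the prediction.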
For explainability, mu2X combines two post‑hoc, model‑agnostic techniques. GraphLIME, an extension of LIME for graph neural networks, identifies the most influential graph‑level features (e.g., specific neighbor types, edge weights) by applying HSIC‑Lasso on the k‑hop subgraph. Simultaneously, Integrated Gradients assigns an importance score to each token in the text, yielding a token‑level relevance vector ζₜₚᵢ ∈ ℝ^|T| over the post’s tokens.
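The Integrated Gradients side of the pipeline can be illustrated with a self-contained sketch. The toy linear "model" and numerical gradients below are assumptions for demonstration; real token attributions would differentiate the trained classifier with respect to token embeddings (e.g., via an autodiff library such as Captum).

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=64):
    """Approximate IG attributions for a scalar function f at input x:
    IG_i = (x_i - b_i) * integral_0^1 df/dx_i(b + t*(x - b)) dt,
    estimated with a midpoint Riemann sum and central-difference gradients."""
    def grad(z, eps=1e-5):
        g = np.zeros_like(z)
        for i in range(z.size):
            zp, zm = z.copy(), z.copy()
            zp[i] += eps
            zm[i] -= eps
            g[i] = (f(zp) - f(zm)) / (2 * eps)
        return g

    path = [baseline + (k + 0.5) / steps * (x - baseline) for k in range(steps)]
    avg_grad = np.mean([grad(z) for z in path], axis=0)
    return (x - baseline) * avg_grad

# Toy "model": a linear score over three per-token features.
# For a linear model, IG is exact: attribution_i = w_i * x_i.
w = np.array([0.5, -1.0, 2.0])
f = lambda z: float(w @ z)
x = np.array([1.0, 2.0, 3.0])
zeta = integrated_gradients(f, x, baseline=np.zeros_like(x))
# zeta ≈ [0.5, -2.0, 6.0]; by completeness, zeta.sum() == f(x) - f(baseline)
```

The completeness property (attributions summing to the difference in model output between the input and the baseline) is what makes IG scores directly interpretable as each token's contribution to the misinformation probability.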