FactAppeal: Identifying Epistemic Factual Appeals in News Media
How is a factual claim made credible? We propose the novel task of Epistemic Appeal Identification, which identifies whether and how factual statements have been anchored to external sources or evidence. To advance research on this task, we present FactAppeal, a manually annotated dataset of 3,226 English-language news sentences. Unlike prior resources that focus solely on claim detection and verification, FactAppeal captures the nuanced epistemic structure of these claims and the evidentiary basis used to support them. FactAppeal contains span-level annotations that identify factual statements and the mentions of sources on which they rely. Moreover, the annotations include fine-grained characteristics of factual appeals, such as the type of source (e.g. Active Participant, Witness, Expert, Direct Evidence), whether it is mentioned by name, mentions of the source's role and epistemic credentials, attribution to the source via direct or indirect quotation, and other features. We model the task with a range of encoder models and generative decoder models, the latter in the 2B–9B parameter range. Our best-performing model, based on Gemma 2 9B, achieves a macro-F1 score of 0.73.
💡 Research Summary
The paper “FactAppeal: Identifying Epistemic Factual Appeals in News Media” introduces a novel task called Epistemic Appeal Identification, which goes beyond traditional claim detection and verification by determining not only whether a sentence contains a factual claim but also how that claim is anchored to external sources or evidence. To support this task, the authors release FactAppeal, a manually annotated corpus of 3,226 English‑language news sentences. Each sentence is annotated at the span level with fine‑grained tags that capture (i) factual statements with or without an appeal, (ii) source mentions, (iii) source attributes (e.g., titles, credentials), (iv) the method of appeal (direct quotation vs. indirect paraphrase), (v) recipient, (vi) appeal time and location, and (vii) the type of epistemic source (Active Participant, Witness, Official, Expert, Direct Evidence, Expert Document, News Report, etc.).
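To make the span-level scheme concrete, the sketch below shows what one annotated sentence might look like. The example sentence, field names, and serialization are invented for illustration and do not reflect the dataset's actual release format.

```python
# Hypothetical FactAppeal-style annotation for a single sentence.
# Tag names follow the summary above; the sentence, offsets, and fields
# are illustrative assumptions, not the released format.
sentence = ("Dr. Jane Smith, a virologist at State University, "
            "said the outbreak began in early March.")

def span(text: str) -> tuple[int, int]:
    """Character offsets of `text` within the sentence (end exclusive)."""
    start = sentence.index(text)
    return start, start + len(text)

annotation = {
    "sentence": sentence,
    "spans": [
        {"tag": "Source", "offsets": span("Dr. Jane Smith"),
         "named": True, "source_type": "Expert"},
        {"tag": "Source_Attribute",
         "offsets": span("a virologist at State University")},
        {"tag": "Fact_With_Appeal", "method": "indirect",
         "offsets": span("the outbreak began in early March")},
    ],
}
```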
The annotation scheme is hierarchical: a “Fact_With_Appeal” span is always linked to a “Source” span, which may be marked as Named or Unnamed and assigned a source type. Additional modifiers indicate whether the appeal is made through a direct quotation or an indirect paraphrase. The authors also define a typology based on two dimensions, proximity to the event (internal vs. external) and human vs. non-human sources, allowing a nuanced distinction between, for example, a witness who observed an incident and a scientific report that provides expert knowledge.
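The two-dimensional typology lends itself to a simple lookup table. In the sketch below, the placement of each source type on the two axes follows the description in this summary; in particular, treating Direct Evidence as internal and non-human is our reading rather than an explicit statement by the authors.

```python
# Source typology keyed by the two dimensions described above:
# proximity to the event (internal/external) and human vs. non-human.
# Axis assignments are inferred from the summary and are illustrative.
SOURCE_TYPOLOGY = {
    "Active_Participant": ("internal", "human"),
    "Witness":            ("internal", "human"),
    "Official":           ("internal", "human"),
    "Direct_Evidence":    ("internal", "non-human"),
    "Expert":             ("external", "human"),
    "Expert_Document":    ("external", "non-human"),
    "News_Report":        ("external", "non-human"),
}

def describe(source_type: str) -> str:
    proximity, nature = SOURCE_TYPOLOGY[source_type]
    return f"{source_type}: {proximity} to the event, {nature} source"

print(describe("Witness"))          # Witness: internal to the event, human source
print(describe("Expert_Document"))  # Expert_Document: external to the event, non-human source
```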
Inter-annotator agreement was measured on a subset annotated by both an author and a research assistant. After converting span annotations to binary word-level labels, the authors report an Intersection-over-Union (IoU) of 0.74 and a Cohen's κ of 0.82 across tags, indicating strong agreement. Agreement is especially high for the main tags (Fact_With_Appeal, Source, Source_Attribute), while rare tags such as Appeal_Time and Appeal_Location occur too infrequently to support reliable agreement estimates.
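This agreement protocol is straightforward to reproduce in spirit: flatten each annotator's spans to binary word-level labels for a given tag, then compute IoU over the positively labeled words and Cohen's κ over all words. The sketch below is a reconstruction under those assumptions, not the authors' actual evaluation script.

```python
# Word-level agreement between two annotators for one tag, following the
# binarization described above (1 = word inside a span of that tag).
from sklearn.metrics import cohen_kappa_score

def iou(labels_a, labels_b):
    """Intersection-over-union of the positively labeled word indices."""
    a = {i for i, y in enumerate(labels_a) if y}
    b = {i for i, y in enumerate(labels_b) if y}
    return len(a & b) / len(a | b) if (a | b) else 1.0

# Toy labels for an 8-word sentence; real labels would come from spans.
annotator_1 = [0, 1, 1, 1, 0, 0, 1, 0]
annotator_2 = [0, 1, 1, 0, 0, 0, 1, 0]

print(f"IoU:   {iou(annotator_1, annotator_2):.2f}")                # 0.75
print(f"kappa: {cohen_kappa_score(annotator_1, annotator_2):.2f}")  # 0.75
```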
For modeling, the authors experiment with encoder-only models (RoBERTa, DeBERTa) and generative decoder-only models in the 2B to 9B parameter range (LLaMA-2, Gemma-2). Models are trained to predict multiple overlapping spans and their associated attributes jointly, and performance is evaluated using macro-averaged F1. The best result is achieved by Gemma 2 9B, which attains a macro-F1 of 0.73, outperforming the smaller models and demonstrating that large language models can handle the multi-label, span-based nature of the task. Nonetheless, overall performance leaves clear headroom, particularly for low-frequency source types and indirect quotations.
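The summary does not spell out the exact input-output formulation, but a common reduction for this kind of span labeling converts spans to BIO token labels and scores predictions with macro-averaged F1; overlapping spans can be handled by running one label sequence per tag type. A minimal sketch under that assumption:

```python
# Per-tag BIO reduction and macro-F1 scoring; an illustrative setup,
# not necessarily the formulation used in the paper.
from sklearn.metrics import f1_score

def spans_to_bio(num_tokens, spans):
    """spans: list of (start_token, end_token, tag), end exclusive."""
    labels = ["O"] * num_tokens
    for start, end, tag in spans:
        labels[start] = f"B-{tag}"
        for i in range(start + 1, end):
            labels[i] = f"I-{tag}"
    return labels

gold = spans_to_bio(8, [(1, 4, "Fact_With_Appeal"), (5, 6, "Source")])
pred = spans_to_bio(8, [(1, 3, "Fact_With_Appeal"), (5, 6, "Source")])

print(f1_score(gold, pred, average="macro"))  # ~0.89 on this toy pair
```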
Statistical analysis of the dataset shows that over 80% of sentences contain a factual claim, reflecting the factual nature of news reporting. However, only about one-third of those factual statements are accompanied by an explicit epistemic appeal, meaning that “Fact_Without_Appeal” spans are roughly twice as common as “Fact_With_Appeal” spans. The distribution of source types is skewed toward internal sources (Active Participants, Witnesses, Officials), with external sources (Experts, Expert Documents, News Reports) appearing less frequently.
The paper’s contributions are threefold: (1) it defines a new, richer task that integrates factuality detection with source attribution, providing a more complete picture of how credibility is constructed in news; (2) it offers a high‑quality, richly annotated dataset that can serve as a benchmark for future work in epistemic reasoning, discourse analysis, and fact‑checking; (3) it demonstrates that current large language models can be adapted to this complex multi‑span labeling problem, setting a baseline for subsequent research.
Future directions suggested include expanding the dataset to cover low‑resource languages and domains, improving annotation density for rare tags, integrating coreference resolution and entity linking to handle multi‑sentence appeals, and exploring multi‑task learning where epistemic appeal detection is jointly trained with claim verification or evidence retrieval. The authors also envision applications in media literacy tools that highlight the evidential basis of news claims, thereby helping readers assess credibility more critically.