Epistemological Bias As a Means for the Automated Detection of Injustices in Text
Injustices in text are often subtle: implicit biases and stereotypes frequently operate unconsciously because of the pervasive nature of prejudice in society. This makes automated detection of injustices more challenging, so they are often overlooked. To address these complexities and offer explainability, we introduce a novel framework that combines knowledge from epistemology with NLP models to enhance the detection of implicit injustices in text. Our empirical study shows that the framework can be applied to detect these injustices effectively. We validate the framework with a human baseline study, whose participants mostly agree with its choices of implicit bias, stereotype, and sentiment. The main feedback from the study was the extended time required to analyze, digest, and decide on each component of the framework. This highlights the value of our automated pipeline, which assists users in detecting implicit injustices while offering explainability and reducing the time burden on humans.
💡 Research Summary
The paper “Epistemological Bias As a Means for the Automated Detection of Injustices in Text” tackles the problem of subtle, implicit injustices that arise from the way language frames information. The authors introduce a novel, interdisciplinary framework that leverages epistemology – specifically the notion of epistemological bias – to detect three types of textual injustice: testimonial injustice (undermining a speaker’s credibility), character injustice (unfair attacks on a person’s reputation), and framing injustice (biased presentation that influences interpretation).
The technical pipeline consists of three tightly coupled NLP components. First, a BERT‑based token‑tagger, originally presented by Pryzant et al. (2020), is fine‑tuned to output a probability that each word in a sentence is epistemically biased. In addition to the contextual BERT embeddings, the model receives expert‑engineered linguistic features (part‑of‑speech tags, assertive verbs, hedges, etc.) and a newly introduced “Too‑Much‑Information” (TMI) flag, which is set when a sentence contains more than two descriptive adjectives or adverbs that do not contribute to meaning. An ablation study shows that adding the TMI feature yields a modest but consistent improvement in bias detection accuracy.
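As a rough illustration of the TMI idea (not the authors' implementation), the flag can be computed from part-of-speech tags. The `tmi_flag` helper, the Penn Treebank tag set, and the threshold of two modifiers are assumptions inferred from the description above:

```python
# Hypothetical sketch of the "Too-Much-Information" (TMI) feature.
# Input: a sentence as a list of (token, POS) pairs from any POS tagger.
# The flag fires when more than two descriptive modifiers (adjectives or
# adverbs) appear, following the description in this summary.

DESCRIPTIVE_TAGS = {"JJ", "JJR", "JJS", "RB", "RBR", "RBS"}  # Penn Treebank adjective/adverb tags

def tmi_flag(tagged_sentence, threshold=2):
    """Return 1 if the sentence contains more than `threshold`
    descriptive adjectives or adverbs, else 0."""
    n_modifiers = sum(1 for _, tag in tagged_sentence if tag in DESCRIPTIVE_TAGS)
    return int(n_modifiers > threshold)

# Example: a headline padded with extraneous modifiers.
tagged = [("The", "DT"), ("notoriously", "RB"), ("erratic", "JJ"),
          ("senator", "NN"), ("finally", "RB"), ("gave", "VBD"),
          ("a", "DT"), ("surprisingly", "RB"), ("coherent", "JJ"),
          ("speech", "NN")]
print(tmi_flag(tagged))  # five modifiers exceed the threshold -> 1
```

In the actual pipeline this binary indicator would be appended to the expert linguistic features fed to the BERT tagger alongside the contextual embeddings.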
Second, the framework calls two external models to surface the social stereotypes that may be activated by the biased word. The CO‑STAR model (a GPT‑based conceptualizer of stereotypes) generates a list of possible stereotypes and their abstract concepts for the input text. The Social Bias Frames (SBF) model classifies these stereotypes and links them to demographic groups. The authors compute sentence embeddings with a sentence‑transformer, rank the generated stereotypes by cosine similarity to the original sentence, and present the highest‑scoring stereotype and concept to the user.
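The ranking step described above can be sketched as plain cosine similarity over sentence embeddings. The embedding vectors below are toy placeholders standing in for sentence-transformer outputs; the function names are illustrative, not the paper's code:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_stereotypes(sentence_vec, candidates):
    """Rank (stereotype, embedding) pairs by similarity to the input
    sentence, highest first. In the real pipeline both embeddings would
    come from a sentence-transformer; here they are toy vectors."""
    scored = [(text, cosine(sentence_vec, vec)) for text, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

sentence_vec = [0.9, 0.1, 0.2]
candidates = [
    ("stereotype A", [0.8, 0.2, 0.1]),  # close to the input sentence
    ("stereotype B", [0.1, 0.9, 0.3]),  # far from the input sentence
]
best, score = rank_stereotypes(sentence_vec, candidates)[0]
print(best)  # stereotype A
```

The highest-scoring stereotype and its abstract concept are what the interface surfaces to the user.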
Third, the top‑ranked biased word is looked up in a collection of epistemic‑bias lexicons drawn from the social‑science literature. If the exact token is absent, lemmatization and stemming are applied before a second lookup. This step maps the word to a concrete bias type (e.g., “boost”, “hedge”, “assertive verb”) and supplies explanatory resources.
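The lookup-with-fallback logic might look like the following. The lexicon entries and the naive suffix-stripping stemmer are illustrative stand-ins for the social-science lexicons and proper lemmatization the paper relies on:

```python
# Illustrative bias-lexicon lookup with a normalization fallback.
# The entries and the stemmer below are placeholders, not the actual
# social-science resources used by the authors.

BIAS_LEXICON = {
    "clearly": "boost",         # intensifier asserting certainty
    "possibly": "hedge",        # weakens commitment to the claim
    "claim": "assertive verb",  # frames a statement as contestable
}

def naive_stem(word):
    """Very rough stand-in for lemmatization/stemming: strip common suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def bias_type(word):
    """Exact lookup first; if the token is absent, retry with the
    normalized form. Returns None when no lexicon entry matches."""
    word = word.lower()
    if word in BIAS_LEXICON:
        return BIAS_LEXICON[word]
    return BIAS_LEXICON.get(naive_stem(word))

print(bias_type("Clearly"))  # boost (exact match after lowercasing)
print(bias_type("claimed"))  # assertive verb (via the stemmed form "claim")
```

A hit maps the flagged word to its bias type, which the interface then pairs with explanatory resources.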
All outputs are displayed in an interactive web interface designed for journalists and editors. The UI highlights the biased word with its confidence score, shows the associated bias type, the most relevant stereotype and concept, and provides links to scholarly articles or guidelines that explain the underlying epistemic mechanism. Users can expand a “Show Details” pane for deeper inspection and retain final editorial control, preserving a human‑in‑the‑loop workflow.
Empirical evaluation focuses on news headlines (≈2 000 sentences). The fine‑tuned tagger reaches ~68 % accuracy in identifying biased tokens, outperforming a baseline human annotator pool on Amazon Mechanical Turk (≈37 % accuracy). The CO‑STAR and SBF components achieve BLEU‑2 scores of 83.2 % (group) and 68.2 % (stereotype) and ROUGE‑L scores of 49.9 % and 43.5 % respectively, comparable to the original papers’ reported performance. A user study with 45 media professionals shows that 85 % find the bias‑type and stereotype connections intuitive, though the manual analysis time increased by an average factor of 2.3 compared with a naïve reading. The authors argue that the automated pipeline can offset this overhead in real‑world editorial pipelines.
Key contributions are: (1) formalizing epistemic bias as a detectable linguistic phenomenon, (2) integrating expert linguistic features and a novel TMI indicator into a BERT tagger, (3) coupling the tagger with state‑of‑the‑art stereotype generation and classification models, (4) delivering a usable, explainable UI for newsroom staff, and (5) validating the approach with both quantitative metrics and a human‑centric survey.
Limitations include reliance on GPT‑based generation, which may inherit prompt‑induced biases; the rule‑based TMI feature may not generalize beyond the news domain; the training data (Wikipedia Neutrality Corpus and a limited set of headlines) restricts cross‑domain applicability; and the evaluation heavily depends on subjective survey responses, raising concerns about reproducibility. Ethically, the authors acknowledge that automated bias detection could become a new form of surveillance and stress the importance of transparent explanations and editorial agency to mitigate potential power imbalances. Overall, the work offers a promising step toward systematic, explainable detection of subtle textual injustices, bridging epistemological theory and modern NLP engineering.