📝 Original Info
- Title: ‘The Dentist is an involved parent, the bartender is not’: Revealing Implicit Biases in QA with Implicit BBQ
- ArXiv ID: 2512.06732
- Date: 2025-12-07
- Authors: Aarushi Wagh, Saniya Srivastava
📝 Abstract
Existing benchmarks evaluating biases in large language models (LLMs) primarily rely on explicit cues, declaring protected attributes like religion, race, gender by name. However, real-world interactions often contain implicit biases, inferred subtly through names, cultural cues, or traits. This critical oversight creates a significant blind spot in fairness evaluation. We introduce ImplicitBBQ, a benchmark extending the Bias Benchmark for QA (BBQ) with implicitly cued protected attributes across 6 categories. Our evaluation of GPT-4o on ImplicitBBQ illustrates troubling performance disparity from explicit BBQ prompts, with accuracy declining up to 7% in the "sexual orientation" subcategory and consistent decline located across most other categories. This indicates that current LLMs contain implicit biases undetected by explicit benchmarks. ImplicitBBQ offers a crucial tool for nuanced fairness evaluation in NLP.
💡 Deep Analysis
Deep Dive into "The Dentist is an involved parent, the bartender is not": Revealing Implicit Biases in QA with Implicit BBQ.
Existing benchmarks evaluating biases in large language models (LLMs) primarily rely on explicit cues, declaring protected attributes like religion, race, gender by name. However, real-world interactions often contain implicit biases, inferred subtly through names, cultural cues, or traits. This critical oversight creates a significant blind spot in fairness evaluation. We introduce ImplicitBBQ, a benchmark extending the Bias Benchmark for QA (BBQ) with implicitly cued protected attributes across 6 categories. Our evaluation of GPT-4o on ImplicitBBQ illustrates troubling performance disparity from explicit BBQ prompts, with accuracy declining up to 7% in the “sexual orientation” subcategory and consistent decline located across most other categories. This indicates that current LLMs contain implicit biases undetected by explicit benchmarks. ImplicitBBQ offers a crucial tool for nuanced fairness evaluation in NLP.
📄 Full Content
"The Dentist is an involved parent, the bartender is not": Revealing
Implicit Biases in QA with Implicit BBQ
Aarushi Wagh
Georgia Institute of Technology
awagh31@gatech.edu
Saniya Srivastava
Georgia Institute of Technology
ssrivastava334@gatech.edu
Abstract
Existing benchmarks evaluating biases in large
language models (LLMs) primarily rely on ex-
plicit cues, declaring protected attributes like
religion, race, gender by name. However, real-
world interactions often contain implicit biases,
inferred subtly through names, cultural cues, or
traits. This critical oversight creates a signifi-
cant blind spot in fairness evaluation. We intro-
duce ImplicitBBQ, a benchmark extending the
Bias Benchmark for QA (BBQ) with implicitly
cued protected attributes across 6 categories.
Our evaluation of GPT-4o on ImplicitBBQ il-
lustrates troubling performance disparity from
explicit BBQ prompts, with accuracy declining
up to 7% in the "sexual orientation" subcate-
gory and consistent decline located across most
other categories. This indicates that current
LLMs contain implicit biases undetected by
explicit benchmarks. ImplicitBBQ offers a cru-
cial tool for nuanced fairness evaluation in NLP.
1
1
Introduction
Large language models (LLMs) are increasingly be-
ing used as fundamental components of many NLP
applications. Their widespread integration into crit-
ical functions in society, including healthcare, fi-
nance, and human resources, raises critical ques-
tions regarding their potential to inherit, spread,
and reinforce societal bias. Trained on vast inter-
net corpora, LLMs inevitably reflect human prej-
udices and stereotypes. Algorithmic bias, which
occurs when systematic error creates discrimina-
tory outcomes, can exacerbate existing disparities
and pose tangible societal risks. Even minor biases,
scaled across millions of LLM decisions, can lead
to systemic discrimination, necessitating rigorous
evaluation.
Currently, bias benchmarks like the Bias Bench-
mark for QA (BBQ) (Parrish et al., 2022) rely pre-
1Code and data are available at https://github.com/
ssrivastava22/ImplicitBBQ.
dominantly on self-reported protected attributes
(e.g., “a Jewish person and Muslim person”). This
explicit specification is not very representative of
the tact in social interactions in the real world,
where identities are typically inferred based on sub-
tle cues like names, cultural practices, or appear-
ances. Evidence has indicated that LLMs may pass
explicit bias tests but remain with implicit biases,
like how humans may hold egalitarian values but
with subconscious correlations (Bai et al., 2024).
This discrepancy creates a significant blind spot,
for models may appear unbiased on explicit tests
and yet harbor hidden biases in subtle, real-world
contexts.
To address this crucial evaluation gap, we intro-
duce ImplicitBBQ, a new extension to the BBQ
dataset specifically aimed at testing LLMs for fine-
grained, hidden biases. Our empirical test of GPT-
4o on ImplicitBBQ demonstrates substantial per-
formance degradation compared to the baseline
dataset. Hence, ImplicitBBQ is a highly significant
resource to robust testing of LLM fairness and to
mitigate subtle biases that have serious implications
in high-stakes real-world applications.
2
Related Work
Bias evaluation in LLMs has mainly been fo-
cused on metrics like the Bias Benchmark for QA
(BBQ) (Parrish et al., 2022) using clearly specified
protected attributes. Extensions such as Korean-
BBQ have adapted these explicit benchmarks to dif-
ferent cultural contexts (Jin et al., 2024). But these
explicit approaches may not be able to model all
the subtleties of biases that are conveyed through
implicit cues in real scenarios.
Implicit bias detection within LLMs has been
explored more thoroughly in recent studies draw-
ing inspiration from psychological tests such as the
Implicit Association Test (IAT) (Greenwald et al.,
1998) (Lin and Li, 2025). Prompt-based meth-
ods, including the LLM Word Association Test and
arXiv:2512.06732v1 [cs.CL] 7 Dec 2025
LLM Relative Decision Test, have been suggested
to uncover implicit discrimination and unconscious
associations within LLMs (Bai et al., 2024). These
methods are likely to uncover biases not evident
when models are evaluated against typical explicit
baselines alone. While such enhancements recog-
nize deeper correlations, there remains a knowl-
edge gap in question-answering benchmarks that
particularly evaluate how implicit biases regulate
LLM decision-making in nuanced QA.
Beyond IAT-inspired prompting, self-reflection-
based evaluations have also examined how explicit
and implicit biases diverge in LLMs. Zhao et al.
(2025) map implicit bias measurement to IAT-style
prompts and explicit bias to Self-Report Assess-
ment (SRA) by having the model perform self-
reflection on its own output, finding a systematic
inconsistency where explicit stereotyping is mild
among outputs, but implicit stereotyping is strong.
These results suggest
…(Full text truncated)…
📸 Image Gallery
Reference
This content is AI-processed based on ArXiv data.