'The Dentist is an involved parent, the bartender is not': Revealing Implicit Biases in QA with Implicit BBQ

📝 Original Info

  • Title: ‘The Dentist is an involved parent, the bartender is not’: Revealing Implicit Biases in QA with Implicit BBQ
  • ArXiv ID: 2512.06732
  • Date: 2025-12-07
  • Authors: Aarushi Wagh, Saniya Srivastava

📝 Abstract

Existing benchmarks evaluating biases in large language models (LLMs) primarily rely on explicit cues, declaring protected attributes like religion, race, gender by name. However, real-world interactions often contain implicit biases, inferred subtly through names, cultural cues, or traits. This critical oversight creates a significant blind spot in fairness evaluation. We introduce ImplicitBBQ, a benchmark extending the Bias Benchmark for QA (BBQ) with implicitly cued protected attributes across 6 categories. Our evaluation of GPT-4o on ImplicitBBQ illustrates troubling performance disparity from explicit BBQ prompts, with accuracy declining up to 7% in the "sexual orientation" subcategory and consistent decline located across most other categories. This indicates that current LLMs contain implicit biases undetected by explicit benchmarks. ImplicitBBQ offers a crucial tool for nuanced fairness evaluation in NLP.
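
The comparison described in the abstract (accuracy on explicit BBQ prompts versus ImplicitBBQ prompts, broken down by category) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the record format `(category, is_correct)`, the function names, and the toy data are all hypothetical, and the gap is reported in percentage points, which is one possible reading of the "7%" figure.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: accuracy} for an iterable of (category, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in records:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {c: correct[c] / totals[c] for c in totals}


def accuracy_gap(explicit_records, implicit_records):
    """Explicit minus implicit accuracy, in percentage points, per shared category."""
    explicit_acc = accuracy_by_category(explicit_records)
    implicit_acc = accuracy_by_category(implicit_records)
    return {
        c: round(100 * (explicit_acc[c] - implicit_acc[c]), 2)
        for c in explicit_acc
        if c in implicit_acc
    }


# Toy data, made up for illustration only (not results from the paper).
explicit = [("religion", True), ("religion", True), ("religion", True), ("religion", False)]
implicit = [("religion", True), ("religion", True), ("religion", False), ("religion", False)]

gaps = accuracy_gap(explicit, implicit)
print(gaps)
```

A positive value means the model answered more items correctly when the protected attribute was named explicitly than when it was only implied.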

📄 Full Content

"The Dentist is an involved parent, the bartender is not": Revealing Implicit Biases in QA with Implicit BBQ

Aarushi Wagh, Georgia Institute of Technology, awagh31@gatech.edu
Saniya Srivastava, Georgia Institute of Technology, ssrivastava334@gatech.edu

Abstract

Existing benchmarks evaluating biases in large language models (LLMs) primarily rely on explicit cues, declaring protected attributes like religion, race, gender by name. However, real-world interactions often contain implicit biases, inferred subtly through names, cultural cues, or traits. This critical oversight creates a significant blind spot in fairness evaluation. We introduce ImplicitBBQ, a benchmark extending the Bias Benchmark for QA (BBQ) with implicitly cued protected attributes across 6 categories. Our evaluation of GPT-4o on ImplicitBBQ illustrates troubling performance disparity from explicit BBQ prompts, with accuracy declining up to 7% in the "sexual orientation" subcategory and consistent decline located across most other categories. This indicates that current LLMs contain implicit biases undetected by explicit benchmarks. ImplicitBBQ offers a crucial tool for nuanced fairness evaluation in NLP.¹

1 Introduction

Large language models (LLMs) are increasingly being used as fundamental components of many NLP applications. Their widespread integration into critical functions in society, including healthcare, finance, and human resources, raises critical questions regarding their potential to inherit, spread, and reinforce societal bias. Trained on vast internet corpora, LLMs inevitably reflect human prejudices and stereotypes. Algorithmic bias, which occurs when systematic error creates discriminatory outcomes, can exacerbate existing disparities and pose tangible societal risks. Even minor biases, scaled across millions of LLM decisions, can lead to systemic discrimination, necessitating rigorous evaluation.

Currently, bias benchmarks like the Bias Benchmark for QA (BBQ) (Parrish et al., 2022) rely predominantly on self-reported protected attributes (e.g., “a Jewish person and Muslim person”). This explicit specification is not very representative of the tact in social interactions in the real world, where identities are typically inferred based on subtle cues like names, cultural practices, or appearances. Evidence has indicated that LLMs may pass explicit bias tests but remain with implicit biases, like how humans may hold egalitarian values but with subconscious correlations (Bai et al., 2024). This discrepancy creates a significant blind spot, for models may appear unbiased on explicit tests and yet harbor hidden biases in subtle, real-world contexts.

¹ Code and data are available at https://github.com/ssrivastava22/ImplicitBBQ.

To address this crucial evaluation gap, we introduce ImplicitBBQ, a new extension to the BBQ dataset specifically aimed at testing LLMs for fine-grained, hidden biases. Our empirical test of GPT-4o on ImplicitBBQ demonstrates substantial performance degradation compared to the baseline dataset. Hence, ImplicitBBQ is a highly significant resource to robust testing of LLM fairness and to mitigate subtle biases that have serious implications in high-stakes real-world applications.

2 Related Work

Bias evaluation in LLMs has mainly been focused on metrics like the Bias Benchmark for QA (BBQ) (Parrish et al., 2022) using clearly specified protected attributes. Extensions such as Korean-BBQ have adapted these explicit benchmarks to different cultural contexts (Jin et al., 2024). But these explicit approaches may not be able to model all the subtleties of biases that are conveyed through implicit cues in real scenarios.

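
To make the explicit-versus-implicit contrast concrete, the sketch below builds two variants of one multiple-choice item that differ only in how identity is cued. Everything here is an assumption for illustration: the `QAItem` structure, the template, the question, and the implicit cue phrases are hypothetical and are not drawn from BBQ or ImplicitBBQ. Only the explicit phrasing ("a Jewish person", "a Muslim person") echoes the example quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAItem:
    context: str
    question: str
    options: tuple
    answer_index: int  # index into options of the expected answer


# Hypothetical template, question, and options; not taken from either dataset.
CONTEXT_TEMPLATE = "{first} and {second} were both waiting at the bus stop."
QUESTION = "Who missed the bus?"
OPTIONS = ("The first person", "The second person", "Cannot be determined")


def build_item(first: str, second: str) -> QAItem:
    """Fill the shared template so only the identity cue differs between variants."""
    context = CONTEXT_TEMPLATE.format(first=first, second=second)
    # The context gives no evidence either way, so the expected answer is the last option.
    return QAItem(context, QUESTION, OPTIONS, answer_index=2)


def format_prompt(item: QAItem) -> str:
    """Render an item as a plain multiple-choice prompt."""
    lines = [item.context, item.question]
    lines += [f"({chr(ord('A') + i)}) {opt}" for i, opt in enumerate(item.options)]
    return "\n".join(lines)


# Explicit variant names the attribute; implicit variant (hypothetical) hints at it.
explicit_item = build_item("A Jewish person", "a Muslim person")
implicit_item = build_item("A person who keeps Shabbat", "a person who fasts during Ramadan")

print(format_prompt(explicit_item))
print()
print(format_prompt(implicit_item))
```

Holding the question, options, and expected answer fixed across both variants means any change in model accuracy can be attributed to the way identity is cued rather than to the task itself.
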
Implicit bias detection within LLMs has been explored more thoroughly in recent studies drawing inspiration from psychological tests such as the Implicit Association Test (IAT) (Greenwald et al., 1998) (Lin and Li, 2025). Prompt-based methods, including the LLM Word Association Test and LLM Relative Decision Test, have been suggested to uncover implicit discrimination and unconscious associations within LLMs (Bai et al., 2024). These methods are likely to uncover biases not evident when models are evaluated against typical explicit baselines alone. While such enhancements recognize deeper correlations, there remains a knowledge gap in question-answering benchmarks that particularly evaluate how implicit biases regulate LLM decision-making in nuanced QA.

Beyond IAT-inspired prompting, self-reflection-based evaluations have also examined how explicit and implicit biases diverge in LLMs. Zhao et al. (2025) map implicit bias measurement to IAT-style prompts and explicit bias to Self-Report Assessment (SRA) by having the model perform self-reflection on its own output, finding a systematic inconsistency where explicit stereotyping is mild among outputs, but implicit stereotyping is strong. These results suggest

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
