AI Hallucination from Students' Perspective: A Thematic Analysis
As students increasingly rely on large language models, hallucinations pose a growing threat to learning. To mitigate this, AI literacy must expand beyond prompt engineering to address how students should detect and respond to LLM hallucinations. To support this, we need to understand how students experience hallucinations, how they detect them, and why they believe they occur. To investigate these questions, we asked university students three open-ended questions about their experiences with AI hallucinations, their detection strategies, and their mental models of why hallucinations occur. Sixty-three students responded to the survey. Thematic analysis of their responses revealed that reported hallucination issues primarily relate to incorrect or fabricated citations, false information, overconfident but misleading responses, poor adherence to prompts, persistence in incorrect answers, and sycophancy. To detect hallucinations, students rely either on intuitive judgment or on active verification strategies, such as cross-checking with external sources or re-prompting the model. Students’ explanations for why hallucinations occur reflected several mental models, including notable misconceptions. Many described AI as a research engine that fabricates information when it cannot locate an answer in its “database.” Others attributed hallucinations to issues with training data, inadequate prompting, or the model’s inability to understand or verify information. These findings illuminate vulnerabilities in AI-supported learning and highlight the need for explicit instruction in verification protocols, accurate mental models of generative AI, and awareness of behaviors such as sycophancy and confident delivery that obscure inaccuracy. The study contributes empirical evidence for integrating hallucination awareness and mitigation into AI literacy curricula.
💡 Research Summary
This paper investigates how university students experience, detect, and explain hallucinations generated by large language models (LLMs) such as ChatGPT. Sixty‑three senior computer‑engineering students were asked three open‑ended survey questions: (1) describe concrete examples of LLM hallucination they had encountered, (2) explain how they identified those hallucinations, and (3) discuss why they think hallucinations occur. The authors applied Braun and Clarke’s thematic analysis using the open‑source tool Taguette, achieving high inter‑rater reliability (≈87‑93 %). The analysis yielded four main themes for the types of hallucinations (RQ1), two themes for detection strategies (RQ2), and five themes for students’ mental models of the phenomenon (RQ3).
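The reported inter-rater reliability (≈87‑93 %) can be illustrated with a minimal percent-agreement computation. This is a sketch only: the coder labels below are invented for illustration, not the study's data, and percent agreement is just one common reliability measure.

```python
# Minimal sketch: percent agreement between two coders assigning
# theme labels to the same survey responses. Labels are hypothetical,
# not taken from the study.

def percent_agreement(coder_a, coder_b):
    """Fraction of items that both coders labeled identically."""
    if len(coder_a) != len(coder_b):
        raise ValueError("coders must label the same set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Hypothetical labels for eight responses (one disagreement on item 5).
coder_a = ["citation", "fact", "confidence", "prompt",
           "citation", "sycophancy", "fact", "citation"]
coder_b = ["citation", "fact", "confidence", "prompt",
           "fact", "sycophancy", "fact", "citation"]

print(f"{percent_agreement(coder_a, coder_b):.0%}")  # 7 of 8 items agree
```

In practice, thematic-analysis studies often supplement raw percent agreement with a chance-corrected statistic such as Cohen's kappa, since agreement can occur by chance when few codes are in play.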
Types of hallucinations (RQ1). Students reported seven sub‑categories: fabricated or non‑existent citations (≈25 % of mentions), outright fabricated facts or statistics, over‑confident but incorrect answers, incomplete or overly generic responses, failure to follow or misinterpretation of prompts (including the model inserting unstated assumptions), persistence or “looping” where repeated prompting does not correct the error, and sycophantic or self‑contradictory behavior. These issues appeared across domains such as programming, mathematics, electrical engineering, and general knowledge queries. A small portion of comments credited newer model versions with producing fewer hallucinations or expressed emotional reactions (frustration, disappointment).
Detection strategies (RQ2). Students relied on two broad approaches. The first, “intuitive judgment,” involved recognizing logical inconsistency, nonsensical language, excessive detail, or irrelevance as red flags. The second, “verification‑based detection,” comprised concrete actions: cross‑checking with external sources (academic papers, reputable websites), re‑prompting the model to test consistency, and, for code‑related tasks, actually executing the generated code to see if it runs. While many students combined both approaches, the study highlights that intuition alone is insufficient and systematic verification is essential.
Mental models of why hallucinations occur (RQ3). Participants articulated five explanatory frameworks: (1) the “database‑gap” myth – the belief that the model fabricates information when it cannot retrieve it from an internal database; (2) training‑data bias or contamination – errors stem from flawed or noisy data used during pre‑training; (3) prompt‑design failure – vague or insufficient prompts lead the model to guess; (4) inherent limits of understanding – the model cannot truly comprehend or reason, merely predicts plausible token sequences; and (5) the model’s confidence mechanism – the system presents answers with high confidence regardless of factual correctness. Notably, the first model reflects a widespread misconception that LLMs function like search engines, which has pedagogical implications.
Implications for AI literacy education. The findings suggest that AI‑enhanced learning curricula should move beyond prompt‑engineering tutorials. First, educators need to provide domain‑specific examples of hallucinations (e.g., fabricated citations, false statistics) so students can recognize patterns. Second, curricula must embed explicit verification protocols—fact‑checking checklists, source‑validation worksheets, and hands‑on code testing—to cultivate systematic habits. Third, instruction should correct the “LLM as a database” myth by explaining the probabilistic nature of generative models and their lack of built‑in truth verification. Fourth, awareness of model behaviors such as sycophancy and over‑confidence should be raised, teaching students to treat confidence cues as unreliable indicators of accuracy. Finally, integrating these components with critical‑thinking training will help students use LLMs responsibly, mitigate misinformation, and avoid academic integrity breaches.
In sum, this study provides empirical, student‑centered evidence that LLM hallucinations are multifaceted—spanning factual, citation, confidence, and interactional dimensions—and that students employ both intuitive and procedural strategies to detect them. By revealing prevalent misconceptions about why hallucinations happen, the paper underscores the need for a holistic AI literacy framework that combines technical verification skills with accurate mental models of generative AI. Such an approach promises to safeguard learning outcomes while preserving the pedagogical benefits of LLMs.