Enhancing the Reliability of Low-Cost LLM Tutors through Hierarchical Pedagogical Oversight

Reading time: 5 minutes

📝 Original Info

  • Title: Enhancing the Reliability of Low-Cost LLM Tutors through Hierarchical Pedagogical Oversight
  • ArXiv ID: 2512.22496
  • Date: 27 Dec 2025
  • Authors: Saisab Sadhu¹, Ashim Dhor¹ (¹Indian Institute of Science Education and Research Bhopal, India)

📝 Abstract

Large Language Models (LLMs) are increasingly deployed as automated tutors to address educator shortages; however, they often fail at pedagogical reasoning, frequently validating incorrect student solutions (sycophancy) or providing overly direct answers that hinder learning. We introduce Hierarchical Pedagogical Oversight (HPO), a framework that adapts structured adversarial synthesis to educational assessment. Unlike cooperative multi-agent systems that often drift toward superficial consensus, HPO enforces a dialectical separation of concerns: specialist agents first distill dialogue context, which then grounds a moderated, five-act debate between opposing pedagogical critics. We evaluate this framework on the MRBench dataset of 1,214 middle-school mathematics dialogues. Our 8B-parameter model achieves a Macro F1 of 0.845, outperforming GPT-4o (0.812) by 3.3% while using 20× fewer parameters. These results establish adversarial reasoning as a critical mechanism for deploying reliable, low-compute pedagogical oversight in resource-constrained environments.


📄 Full Content

Hierarchical Pedagogical Oversight: A Multi-Agent Adversarial Framework for Reliable AI Tutoring

Saisab Sadhu¹, Ashim Dhor¹
¹Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, India
sadhusaisab@gmail.com, ashimdhor2003@gmail.com

1 Introduction

The deployment of Large Language Models (LLMs) as automated tutors offers a scalable solution to the global shortage of qualified educators (Kochmar et al. 2025). However, recent benchmarks reveal a fundamental reliability gap: LLMs frequently validate incorrect student reasoning to maintain conversational rapport, a phenomenon known as sycophancy (Wang et al. 2024), or fail to identify implicit conceptual errors (Demszky et al. 2023). In educational settings without human-in-the-loop supervision, such "pedagogical hallucinations" (Ji et al. 2023) can actively reinforce misconceptions.
These failures often stem from a structural flaw in current agent design: the conflation of generation and evaluation. Tasking a single model to simultaneously teach and assess its own pedagogical quality creates vulnerability to confirmation bias.

Building on our prior work in financial NLP, where structured adversarial synthesis (SAS) mitigated bias in market analysis (Sadhu, Patra, and Basu 2025), we adapt this dialectical approach to education with critical architectural modifications. Unlike generative financial synthesis, pedagogical oversight is a constrained classification task requiring strict adherence to instructional taxonomies (Mistake Identification and Guidance Quality).

(Accepted for presentation at the AAAI 2026 EGSAI Community Activity.)

We propose Hierarchical Pedagogical Oversight (HPO), a framework designed to operationalize adversarial reasoning for education, leveraging principles of AI feedback (Bai et al. 2022) to audit model behavior. HPO decouples the tutoring process from evaluative judgment through a three-phase pipeline: (1) Intelligence Distillation, (2) Adversarial Debate, and (3) Synthesis. We empirically validate HPO on the MRBench dataset (Maurya et al. 2025). Our results show that an 8B-parameter model, when structured via HPO, outperforms GPT-4o by 3.3% in Macro F1. We demonstrate that this performance is driven by the adversarial protocol itself; removing the "Devil's Advocate" moderator results in a significant performance drop.

2 Methodology: The HPO Framework

2.1 Problem Formalization

We define pedagogical oversight as a classification task. Given a dialogue history D, a new student utterance u_{n+1} containing a potential misconception, and a candidate tutor response R_cand, the system maps the tuple (D, u_{n+1}, R_cand) to a label vector (y_MI, y_PG).
Here, y_MI ∈ {0, 1} denotes Mistake Identification, and y_PG ∈ {0, 1, 2} denotes Guidance Quality (Direct Solution vs. Partial vs. Effective Scaffolding).

2.2 Phase 1: Intelligence Distillation

To ground the downstream debate, three parallel specialist agents (Conceptual Analyst, Behavioral Analyst, Trajectory Analyst) distill the raw dialogue into a "Pedagogical Briefing" (B), extracting the mathematical concept, engagement signals, and longitudinal understanding trajectory (see Appendix A). This grounded context prevents hallucination of student intent.

2.3 Phase 2: Structured Adversarial Debate

The core of HPO is a deterministic, five-act debate protocol designed to stress-test the candidate response. In Act I (Opening), a Permissive Critic and a Strict Critic generate opposing theses regarding the response's quality. For instance, given a tutor hint "Think about what happens when you add fractions," the Permissive Critic might argue this provides effective scaffolding, while the Strict Critic
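The three-phase pipeline described above can be sketched in code. The following is a minimal, illustrative skeleton only: all class and function names are hypothetical (the paper does not publish an implementation), and `query_llm` stands in for a call to the 8B-parameter model, stubbed here with canned strings so the control flow can run end to end.

```python
# Hypothetical sketch of the HPO pipeline: specialist agents distill a
# "Pedagogical Briefing", opposing critics hold a five-act debate over the
# candidate tutor response, and a synthesis step emits the label vector
# (y_MI, y_PG). The model call is stubbed; labels below are fixed stand-ins.
from dataclasses import dataclass


@dataclass
class OversightLabels:
    y_mi: int  # Mistake Identification: 0 or 1
    y_pg: int  # Guidance Quality: 0 = direct, 1 = partial, 2 = effective scaffolding


def query_llm(prompt: str) -> str:
    """Placeholder for the underlying model call (stubbed for illustration)."""
    return f"[model output for: {prompt[:40]}...]"


def distill_briefing(dialogue: str) -> dict:
    """Phase 1: three specialist agents ground the debate in dialogue context."""
    return {
        "concept": query_llm(f"Conceptual Analyst: extract the math concept.\n{dialogue}"),
        "engagement": query_llm(f"Behavioral Analyst: summarize engagement signals.\n{dialogue}"),
        "trajectory": query_llm(f"Trajectory Analyst: trace understanding over time.\n{dialogue}"),
    }


def adversarial_debate(briefing: dict, candidate: str, acts: int = 5) -> list:
    """Phase 2: deterministic five-act exchange between opposing critics."""
    transcript = []
    for act in range(1, acts + 1):
        for critic in ("Permissive Critic", "Strict Critic"):
            transcript.append(
                query_llm(f"Act {act}, {critic}: assess {candidate!r} given {briefing['concept']}")
            )
    return transcript


def synthesize(transcript: list) -> OversightLabels:
    """Phase 3: a moderator collapses the debate into (y_MI, y_PG).
    Trivial stand-in here; the paper uses a moderated LLM judgment."""
    query_llm("Moderator: weigh the transcript and decide.\n" + "\n".join(transcript))
    return OversightLabels(y_mi=1, y_pg=2)  # fixed stub output


def hpo_oversight(dialogue: str, candidate: str) -> OversightLabels:
    briefing = distill_briefing(dialogue)
    transcript = adversarial_debate(briefing, candidate)
    return synthesize(transcript)


labels = hpo_oversight(
    dialogue="Student: 1/2 + 1/3 = 2/5?  Tutor: ...",
    candidate="Think about what happens when you add fractions.",
)
print(labels.y_mi, labels.y_pg)  # → 1 2 (fixed stub values)
```

Note the separation of concerns the paper emphasizes: generation (the candidate tutor response) never evaluates itself; judgment is produced only after the briefing-grounded adversarial exchange.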


Reference

This content is AI-processed based on open access ArXiv data.
