📝 Original Info
Title: Strengthening the Reliability of Low-Cost LLM Tutors through Hierarchical Pedagogical Oversight
ArXiv ID: 2512.22496
Date: Pending
Authors: Saisab Sadhu¹, Ashim Dhor¹
¹ Indian Institute of Science Education and Research Bhopal, India
📄 Full Content
Hierarchical Pedagogical Oversight: A Multi-Agent Adversarial Framework for Reliable AI Tutoring*
Saisab Sadhu¹, Ashim Dhor¹
¹Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, India
sadhusaisab@gmail.com, ashimdhor2003@gmail.com
Abstract
Large Language Models (LLMs) are increasingly deployed as automated tutors to address educator shortages; however, they often fail at pedagogical reasoning, frequently validating incorrect student solutions (sycophancy) or providing overly direct answers that hinder learning. We introduce Hierarchical Pedagogical Oversight (HPO), a framework that adapts structured adversarial synthesis to educational assessment. Unlike cooperative multi-agent systems that often drift toward superficial consensus, HPO enforces a dialectical separation of concerns: specialist agents first distill dialogue context, which then grounds a moderated, five-act debate between opposing pedagogical critics. We evaluate this framework on the MRBench dataset of 1,214 middle-school mathematics dialogues. Our 8B-parameter model achieves a Macro F1 of 0.845, outperforming GPT-4o (0.812) by 3.3 points while using 20× fewer parameters. These results establish adversarial reasoning as a critical mechanism for deploying reliable, low-compute pedagogical oversight in resource-constrained environments.
1 Introduction
The deployment of Large Language Models (LLMs) as automated tutors offers a scalable solution to the global shortage of qualified educators (Kochmar et al. 2025). However, recent benchmarks reveal a fundamental reliability gap: LLMs frequently validate incorrect student reasoning to maintain conversational rapport, a phenomenon known as sycophancy (Wang et al. 2024), or fail to identify implicit conceptual errors (Demszky et al. 2023). In educational settings without human-in-the-loop supervision, such “pedagogical hallucinations” (Ji et al. 2023) can actively reinforce misconceptions. These failures often stem from a structural flaw in current agent design: the conflation of generation and evaluation. Tasking a single model to simultaneously teach and assess its own pedagogical quality creates vulnerability to confirmation bias.
Building on our prior work in financial NLP, where structured adversarial synthesis (SAS) mitigated bias in market analysis (Sadhu, Patra, and Basu 2025), we adapt this dialectical approach to education with critical architectural modifications. Unlike generative financial synthesis, pedagogical oversight is a constrained classification task requiring strict adherence to instructional taxonomies (Mistake Identification and Guidance Quality).

*Accepted for presentation at the AAAI 2026 EGSAI Community Activity (AAAI 2026). Copyright © 2026, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
We propose Hierarchical Pedagogical Oversight (HPO), a framework designed to operationalize adversarial reasoning for education, leveraging principles of AI feedback (Bai et al. 2022) to audit model behavior. HPO decouples the tutoring process from evaluative judgment through a three-phase pipeline: (1) Intelligence Distillation, (2) Adversarial Debate, and (3) Synthesis. We empirically validate HPO on the MRBench dataset (Maurya et al. 2025). Our results show that an 8B-parameter model, when structured via HPO, outperforms GPT-4o by 3.3 Macro-F1 points. We demonstrate that this performance is driven by the adversarial protocol itself; removing the “Devil's Advocate” moderator results in a significant performance drop.
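As a rough sketch, the three-phase pipeline reduces to a sequential composition. The function names below are ours (illustrative, not from the paper's implementation), with each LLM-backed stage stubbed as a plain callable:

```python
def hpo_pipeline(dialogue, candidate_response, distill, debate, synthesize):
    """Hierarchical Pedagogical Oversight, phase by phase.

    `distill`, `debate`, and `synthesize` stand in for the LLM-backed
    stages; each is an ordinary callable in this sketch.
    """
    # Phase 1: specialist agents compress the dialogue into a briefing.
    briefing = distill(dialogue)
    # Phase 2: opposing critics debate the candidate response over five
    # moderated acts, grounded in the briefing.
    transcript = debate(briefing, candidate_response)
    # Phase 3: synthesis maps the debate transcript onto the two labels
    # (Mistake Identification, Guidance Quality).
    return synthesize(transcript)
```

Keeping the stages decoupled like this is also what enables the ablation mentioned above: swapping `debate` for a single-pass judge isolates the contribution of the adversarial protocol.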
2 Methodology: The HPO Framework
2.1 Problem Formalization
We define pedagogical oversight as a classification task. Given a dialogue history D, a new student utterance u_{n+1} containing a potential misconception, and a candidate tutor response R_cand, the system maps the tuple (D, u_{n+1}, R_cand) to a label vector (y_MI, y_PG). Here, y_MI ∈ {0, 1} denotes Mistake Identification, and y_PG ∈ {0, 1, 2} denotes Guidance Quality (Direct Solution vs. Partial vs. Effective Scaffolding).
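Stated in code, the input tuple and label space look like the following minimal sketch (type and field names are our own, not from the paper's implementation):

```python
from dataclasses import dataclass
from enum import IntEnum


class MistakeIdentification(IntEnum):
    """y_MI: did the candidate response identify the student's mistake?"""
    MISSED = 0
    IDENTIFIED = 1


class GuidanceQuality(IntEnum):
    """y_PG: how the candidate response guides the student."""
    DIRECT_SOLUTION = 0
    PARTIAL = 1
    EFFECTIVE_SCAFFOLDING = 2


@dataclass
class OversightInput:
    dialogue_history: list[str]  # D
    student_utterance: str       # u_{n+1}, may contain a misconception
    candidate_response: str      # R_cand


@dataclass
class OversightLabel:
    y_mi: MistakeIdentification
    y_pg: GuidanceQuality
```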
2.2 Phase 1: Intelligence Distillation
To ground the downstream debate, three parallel specialist agents (Conceptual Analyst, Behavioral Analyst, Trajectory Analyst) distill the raw dialogue into a “Pedagogical Briefing” (B), extracting the mathematical concept, engagement signals, and longitudinal understanding trajectory (see Appendix A). This grounded context prevents hallucination of student intent.
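A minimal sketch of this phase, assuming a generic `complete(prompt)` call into the underlying model (the specialist prompts and briefing field names here are illustrative, not the paper's):

```python
from concurrent.futures import ThreadPoolExecutor

# One prompt template per specialist agent (illustrative wording).
SPECIALISTS = {
    "concept": "Identify the mathematical concept under discussion:\n{dialogue}",
    "behavior": "Summarize the student's engagement signals:\n{dialogue}",
    "trajectory": "Trace how the student's understanding evolves:\n{dialogue}",
}


def complete(prompt: str) -> str:
    """Placeholder for a call to the underlying 8B model."""
    raise NotImplementedError


def distill_briefing(dialogue: str, llm=complete) -> dict[str, str]:
    """Run the three specialist agents in parallel and merge their
    outputs into a single Pedagogical Briefing B."""
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        futures = {
            name: pool.submit(llm, template.format(dialogue=dialogue))
            for name, template in SPECIALISTS.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

Running the three analysts concurrently keeps latency close to a single model call, and the merged briefing B is then passed into the debate phase.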
2.3 Phase 2: Structured Adversarial Debate
The core of HPO is a deterministic, five-act debate protocol designed to stress-test the candidate response. In Act I (Opening), a Permissive Critic and a Strict Critic generate opposing theses regarding the response's quality. For instance, given a tutor hint “Think about what happens when you add fractions,” the Permissive Critic might argue this provides effective scaffolding, while the Strict Critic

arXiv:2512.22496v1 [cs.MA] 27 Dec 2025