Ivy: A Structured AI Coaching System for Procedural Learning
📝 Abstract
In procedural skill learning, instructional explanations must convey not just steps, but the causal, goal-directed, and compositional logic behind them. Large language models (LLMs) often produce fluent yet shallow responses that miss this structure. We present Ivy, an AI coaching system that delivers structured, multi-step explanations by combining symbolic Task-Method-Knowledge (TMK) models with a generative interpretation layer: an LLM that constructs explanations while being constrained by TMK structure. TMK encodes causal transitions, goal hierarchies, and problem decompositions, and guides the LLM within explicit structural bounds. We evaluate Ivy's responses against GPT and retrieval-augmented GPT baselines using expert and independent annotations across three inferential dimensions. Results show that symbolic constraints consistently improve the structural quality of explanations for "how" and "why" questions. This study demonstrates a scalable AI for education approach that strengthens the pedagogical value of AI-generated explanations in intelligent coaching systems.
📄 Content
Improving Procedural Skill Explanations via Constrained Generation: A Symbolic-LLM Hybrid Architecture

Rahul Dass, Thomas Bowlin, Zebing Li, Xiao Jin and Ashok Goel
Georgia Institute of Technology
{rdass7,tbowlin3,zebing.li,xjin96,ag25}@gatech.edu

Introduction

Ivy (Dass et al. 2025) is an AI coaching system that was deployed in Spring 2025 in an online graduate-level AI course at a large U.S. university [Redacted for double-blind review], where it answered students' questions about procedural skills.
Across multiple pilot studies, Ivy's responses were perceived as fluent and often helpful, yet we observed a recurring gap: explanations can lack the structured logic instructors expect for teaching procedural skills, namely how steps depend on state (causality), why steps serve an overarching goal (teleology), and how problems decompose into subgoals (decomposition) (Lum et al. 2025).

This gap matters educationally. Learners increasingly rely on AI-generated explanations when practicing procedural tasks like writing code, debugging, or constructing arguments. But skill transfer, the benchmark of meaningful learning, depends not only on surface fluency but on understanding why and how procedures work (Bransford et al. 2000). Transfer requires causal insight into the logic of tasks, a hallmark of expert reasoning that unconstrained LLMs may fail to replicate (Bransford et al. 2000). Thus, improving the inferential structure of explanations is central to pedagogical value, not a peripheral nicety.

Copyright © 2026, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Currently, Ivy uses Task-Method-Knowledge (TMK) models (Goel and Rugaber 2017) that we refer to as "TMK-Basic": they capture goals and associated mechanisms but do not explicitly encode causal transitions, goal hierarchies, or formal goal decomposition. In this work, we introduce "TMK-Structured", which makes these three inferencing structures explicit: causal transitions (modeled through finite state machines), teleological linkages (via goal hierarchies), and hierarchical decomposition (through goal abstraction), each capturing a different aspect of how a goal unfolds, why it matters, and how it can be broken down.

We present a constrained generation architecture that separates symbolic control (TMK layer) from generative interpretation and synthesis (LLM layer); see Figure 1.
Figure 1: Concept diagram of Ivy's two-layered inferencing. Symbolic control (TMK layer): finite state machines, goal hierarchies, safety conditions, and transitions encode structured knowledge. Generative interpretation (LLM layer): the LLM interprets TMK and generates explanations, performing run-time inference within these constraints. A learner question reaches Ivy, which accesses the TMK model and returns a response or explanation.

We evaluate Ivy+TMK-Structured against three baselines, including Ivy+TMK-Basic, using expert and independent ratings of correctness and three inferential dimensions. Our primary hypothesis is that TMK-Structured will increase the prevalence and explicitness of these inferential structures in generated explanations relative to the baselines.

Together, the architecture and evaluation advance a practical question for AI in education: can symbolic procedural constraints make LLM explanations not only more reliable, but also more pedagogically aligned with the way instructors teach procedural skills? We observe promising gains, most notably in problem decomposition, the dimension most directly scaffolded by the TMK-Structured design.

arXiv:2511.20942v1 [cs.AI] 26 Nov 2025

Related Work

Structured Reasoning and Transfer

Long-standing work in the learning sciences shows that successful skill transfer requires more than surface fluency; learners must understand underlying principles and know when/how/why to apply them (Bransford et al. 2000). Caus