AI-based Verbal and Visual Scaffolding in a Serious Game: Effects on Learning and Cognitive Load

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv paper.

Due to their interactive nature, serious games offer valuable opportunities for supporting learning in educational contexts. Recent advances in large language models (LLMs) have further opened the door to new forms of personalized scaffolding in education. In this study, we combine both worlds and compare three scaffolding designs in a serious game: (i) no scaffolding, (ii) chat-based (verbal) scaffolding provided by an AI-based non-player character (NPC), and (iii) combined chat-based (verbal) and action-based (visual) scaffolding, in which the AI may try to either explain or demonstrate the next step towards a solution. The scaffolding conditions are embedded in Qookies, a serious game designed to introduce fundamental concepts of quantum technologies. A total of 152 school students, university students, and members of the general public were randomly assigned to one of the three conditions. The results show that all groups experienced significant learning gains, confirming the overall effectiveness of the serious game itself. No significant differences in learning outcomes emerged between scaffolding conditions. However, intrinsic cognitive load was lower in the combined chat-and-action (verbal+visual) scaffolding condition than in the chat-only (verbal) condition, suggesting that visual demonstrations may offer more accessible support. Interaction analyses further revealed that players engaged with the AI character primarily for level-related questions and action recommendations, while deeper interactions were relatively rare.

💡 Research Summary

This paper investigates how different levels of AI‑based scaffolding affect learning outcomes and cognitive load within a serious educational game called “Qookies,” which is designed to introduce fundamental concepts of quantum technologies to a heterogeneous audience (middle‑school, high‑school, university students, and the general public). Three scaffolding conditions were compared: (1) no scaffolding, (2) verbal scaffolding delivered via a large‑language‑model (LLM) chatbot (Llama 3.1 70B) that can explain concepts, clarify steps, and give hints without revealing solutions, and (3) a combined verbal + visual scaffolding condition in which the same chatbot is complemented by an AI‑driven action module based on one‑shot reinforcement learning (RL). The RL component can perform in‑game actions or demonstrations that illustrate the next step toward a solution, while still deferring to the player when its knowledge is insufficient.

A total of 152 participants were randomly assigned to one of the three conditions. Learning was measured with pre‑ and post‑test items assessing conceptual understanding of quantum technologies, as well as self‑report scales of motivation and interest. Cognitive load was assessed with self‑report scales that distinguish intrinsic cognitive load (ICL), extraneous cognitive load (ECL), and germane cognitive load (GCL).

Results showed significant learning gains across all three groups, confirming that the game itself is an effective learning environment. However, there were no statistically significant differences in post‑test scores among the scaffolding conditions, indicating that the presence or type of AI support did not directly boost learning performance beyond what the game already provided.
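
The within-group comparison described above can be illustrated with a small sketch. It assumes a paired pre/post design and computes the mean gain and a paired t statistic per condition. The summary does not state which statistical test the authors used, so the choice of test is an assumption, and the scores below are made-up placeholders, not study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_gain_summary(pre, post):
    """Return (mean gain, paired t statistic) for matched pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post scores")
    gains = [after - before for before, after in zip(pre, post)]
    mean_gain = mean(gains)
    spread = stdev(gains)  # sample standard deviation of the gains
    # The t statistic is undefined when every participant has the same gain.
    if spread > 0:
        t_stat = mean_gain / (spread / sqrt(len(gains)))
    else:
        t_stat = float("nan")
    return mean_gain, t_stat


# Hypothetical placeholder scores, NOT data from the study.
toy_scores = {
    "no_scaffolding": ([3, 4, 2, 5], [6, 6, 5, 7]),
    "verbal": ([2, 5, 3, 4], [5, 7, 6, 6]),
    "verbal_visual": ([3, 3, 4, 5], [6, 5, 7, 7]),
}

for condition, (pre, post) in toy_scores.items():
    gain, t_stat = paired_gain_summary(pre, post)
    print(f"{condition}: mean gain = {gain:.2f}, paired t = {t_stat:.2f}")
```

A large positive t statistic in every condition would correspond to the reported pattern of learning gains in all groups. Comparing the gains between conditions is a separate step that this sketch does not cover.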

In contrast, intrinsic cognitive load differed between the verbal‑only and the combined verbal + visual conditions: participants who received visual demonstrations in addition to verbal hints reported lower ICL (p < 0.05). Extraneous load remained comparable across groups, and germane load showed a slight, non‑significant increase in the combined condition. Because learning outcomes did not differ, the lower ICL is best read as a sign that visual demonstrations make the abstract quantum content feel less demanding and offer more accessible support, not as evidence that learners understood the material better.

Interaction log analysis revealed that learners primarily used the AI NPC for level‑related queries (e.g., “What should I do next?”) and for requesting concrete action recommendations. Deep, metacognitive dialogues or conceptual debates with the AI were rare. This pattern indicates that participants treated the AI as a problem‑solving tool rather than as a conversational partner for higher‑order reasoning.

From a technical standpoint, the verbal scaffold leveraged a pre‑trained LLM with carefully crafted prompts containing the target concepts for each level, which limited hallucinations while still allowing flexible natural‑language interaction. The visual scaffold employed a reinforcement‑learning agent that learns from observing gameplay and generates action plans grounded in the current game state. The two modules were integrated through an “embedding interface” that aligns dialogue and action generation with the game’s internal representation, ensuring consistency and reducing the risk of contradictory advice.
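
The prompt-based grounding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Level` class, the `build_system_prompt` function, the prompt wording, and the example concept names are all hypothetical, and no real LLM API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Hypothetical description of one game level."""
    name: str
    target_concepts: tuple[str, ...]


def build_system_prompt(level: Level) -> str:
    """Assemble a system prompt that restricts the NPC to the level's concepts."""
    concepts = "\n".join(f"- {concept}" for concept in level.target_concepts)
    return (
        "You are a helpful non-player character in an educational game.\n"
        f"Current level: {level.name}\n"
        "Only discuss these target concepts:\n"
        f"{concepts}\n"
        "Give hints and explanations, but do not reveal the full solution.\n"
        "If you are unsure, say so instead of guessing."
    )


# Placeholder level and concept names, chosen only for illustration.
example_level = Level(name="Level 1", target_concepts=("superposition", "measurement"))
print(build_system_prompt(example_level))
```

Listing the target concepts explicitly in the prompt is one plausible way to narrow what the model talks about, which matches the summary's point about limiting hallucinations while keeping the dialogue open-ended.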

The study contributes several insights to the emerging field of AI‑enhanced educational games. First, it demonstrates a practical implementation of multimodal AI (LLM + RL) within a single NPC, offering both textual explanations and demonstrative actions. Second, it provides empirical evidence that the primary benefit of such scaffolding may lie in managing intrinsic cognitive load rather than directly increasing test scores. Third, the findings support the theoretical claim that visual demonstrations can make complex, abstract scientific concepts more accessible, especially for learners with limited prior knowledge.

Limitations include the modest sample size, the short‑term nature of the assessment (no long‑term retention or transfer tests), and the reliance on self‑reported cognitive load measures. Additionally, while the LLM prompts were designed to mitigate hallucinations, the system was not immune to occasional errors, which could affect trust and learning if not carefully monitored.

In conclusion, AI‑based scaffolding, particularly when it combines verbal explanations with visual demonstrations, can reduce learners’ intrinsic cognitive load in a serious game context without necessarily improving immediate learning outcomes beyond the game’s inherent instructional value. Designers of future educational games should therefore consider multimodal AI companions that provide concrete visual support alongside natural‑language hints, especially when teaching abstract or technically demanding subjects such as quantum physics.
