Theory of Mind for Explainable Human-Robot Interaction

Reading time: 5 minutes
...

📝 Original Info

  • Title: Theory of Mind for Explainable Human-Robot Interaction
  • ArXiv ID: 2512.23482
  • Date: 2025-12-29
  • Authors: ** Author information not provided in the source (to be confirmed from the paper body or DOI) **

📝 Abstract

Within the context of human-robot interaction (HRI), Theory of Mind (ToM) is intended to serve as a user-friendly backend to the interface of robotic systems, enabling robots to infer and respond to human mental states. When integrated into robots, ToM allows them to adapt their internal models to users' behaviors, enhancing the interpretability and predictability of their actions. Similarly, Explainable Artificial Intelligence (XAI) aims to make AI systems transparent and interpretable, allowing humans to understand and interact with them effectively. Since ToM in HRI serves related purposes, we propose to consider ToM as a form of XAI and evaluate it through the eValuation XAI (VXAI) framework and its seven desiderata. This paper identifies a critical gap in the application of ToM within HRI, as existing methods rarely assess the extent to which explanations correspond to the robot's actual internal reasoning. To address this limitation, we propose to integrate ToM within XAI frameworks. By embedding ToM principles inside XAI, we argue for a shift in perspective, as current XAI research focuses predominantly on the AI system itself and often lacks user-centered explanations. Incorporating ToM would enable a change in focus, prioritizing the user's informational needs and perspective.

💡 Deep Analysis

Figure 1

📄 Full Content

As interactions between humans and robots become increasingly common (Lee 2021), it is intuitive to seek more human-like modes of interaction that make robots' behaviors easier to understand (Sridharan and Meadows 2019; Kerzel et al. 2023). This need naturally motivates the application of ToM in HRI. ToM refers to the human ability to attribute mental states such as beliefs, desires, and intentions to oneself and others in order to predict and explain behavior (Premack and Woodruff 1978). When embedded in robots, ToM methods emphasize understanding and adapting to users' mental states, which can be used to produce explanations that are often more intuitive and user-friendly (Williams, Fiore, and Jentsch 2022). ToM also allows robots to interpret and respond to users' inferred mental states, fostering more natural, adaptive, and transparent interactions (Yuan et al. 2022). XAI, on the other hand, aims to make black-box models more transparent and interpretable; however, it frequently overlooks user-centered evaluations (Rong et al. 2024).
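To make the idea of a robot attributing mental states concrete, the sketch below performs simple Bayesian goal inference from observed user actions, one common way to implement ToM-like reasoning. The goals, actions, and probabilities are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of Bayesian goal inference, one way to give a robot
# ToM-like reasoning about a user's intentions (illustrative assumptions;
# the goals, actions, and probabilities below are not from the paper).

# Hypothetical likelihoods P(observed action | user goal).
ACTION_LIKELIHOOD = {
    "fetch_mug":    {"walk_to_kitchen": 0.8, "walk_to_desk": 0.1},
    "fetch_laptop": {"walk_to_kitchen": 0.1, "walk_to_desk": 0.8},
}

def infer_goal(observed_actions, prior=None):
    """Return a posterior belief over the user's goal given observed actions."""
    goals = list(ACTION_LIKELIHOOD)
    belief = dict(prior) if prior else {g: 1.0 / len(goals) for g in goals}
    for action in observed_actions:
        # Bayes update: multiply by the likelihood of the action under each goal,
        # then renormalize so the belief remains a probability distribution.
        belief = {g: belief[g] * ACTION_LIKELIHOOD[g].get(action, 1e-3) for g in goals}
        total = sum(belief.values())
        belief = {g: p / total for g, p in belief.items()}
    return belief

if __name__ == "__main__":
    print(infer_goal(["walk_to_kitchen"]))
    # -> {'fetch_mug': ~0.89, 'fetch_laptop': ~0.11}
```

A belief of this kind is what a robot could both act on and cite when explaining its behavior to the user.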

Since both XAI and ToM in HRI aim to make internal reasoning more understandable to humans and enhance human-AI collaboration, we propose considering ToM as a form of XAI, and therefore evaluate it accordingly. To this end, we evaluated recent ToM studies in HRI using an XAI evaluation framework and identified some limitations. Existing ToM approaches rarely assess whether the information presented to users accurately reflects the robot’s internal reasoning, nor do they evaluate the robustness and reproducibility of the explanations.

To address gaps in both ToM and XAI, particularly regarding explanation fidelity and user-centered evaluation, we propose leveraging ToM within an XAI framework, combining ToM’s user-centered perspective with XAI’s technical rigor. This shift in perspective aims to enable evaluations that encompass both fidelity to the model and alignment with user understanding, ultimately narrowing the gap between system transparency and human interpretability.
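One way to make the fidelity criterion concrete is to check, over a log of interactions, whether the mental state cited in each explanation matches the robot's actual internal estimate. The sketch below is an assumed illustration of such a check; the paper does not prescribe this exact procedure, and the data layout is hypothetical.

```python
# Minimal sketch of an explanation-fidelity check: over a log of interactions,
# how often does the goal named in the robot's explanation match the goal its
# internal model actually ranks highest? (Illustrative only.)

def explanation_fidelity(interactions):
    """Fraction of interactions whose explanation cites the robot's own
    most-probable goal.

    Each interaction is a dict with:
      'posterior'      -- dict mapping goal -> probability (internal belief)
      'explained_goal' -- goal named in the explanation shown to the user
    """
    if not interactions:
        return 0.0
    hits = sum(
        int(max(item["posterior"], key=item["posterior"].get) == item["explained_goal"])
        for item in interactions
    )
    return hits / len(interactions)

# Usage, reusing the posterior format from the goal-inference sketch above:
logs = [
    {"posterior": {"fetch_mug": 0.89, "fetch_laptop": 0.11},
     "explained_goal": "fetch_mug"},
    {"posterior": {"fetch_mug": 0.30, "fetch_laptop": 0.70},
     "explained_goal": "fetch_mug"},   # explanation disagrees with the model
]
print(explanation_fidelity(logs))      # 0.5
```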

In artificial intelligence, ToM is often treated as a heuristic in which one of the interacting participants is replaced by a robot. In this section, we review recent studies that have used ToM in HRI to evaluate human-AI collaboration and understanding.

Several studies examine whether humans naturally attribute ToM to robots even in the absence of explicit ToM mechanisms. A first study found that humans are able to interpret robots’ behavior similarly to human behavior, provided that the robots display distinct and interpretable social cues. However, when a robot’s cues deviate from human expectations, this understanding diminishes (Banks 2020). A second study, which examined the robustness and conviction of large language models (LLMs), demonstrated that while LLMs can serve as a useful tool in human-robot interaction (Becker et al. 2025), they do not function as reliable ToM agents (Verma, Bhambri, and Kambhampati 2024). These findings suggest that effective human-robot interaction is facilitated when robots produce responses that align with typical human behavior, and highlight the need for the integration of explicit ToM-like mechanisms in robotic systems.

A second line of research has investigated embedding ToM-like reasoning directly within robots and assessing its impact on trust, helpfulness, and mutual understanding. Some studies have focused on evaluating user perception, revealing that robots equipped with ToM capabilities are perceived more positively (Mou et al. 2020), particularly when they provide assistance aligned with users’ goals (Cantucci and Falcone 2022). Similarly, robots that reason about human beliefs are generally considered more helpful and socially competent (Shvo et al. 2022), and are also regarded as more trustworthy (Angelopoulos et al. 2025). At the same time, when providing explanations, robots may fail to enhance user understanding or improve decision-making, as not all explanations are equally effective (Yuan et al. 2022). In contrast, approaches that implement multiple levels of explanation have been shown to improve user comprehension and the overall interaction (Kerzel et al. 2022). Although these studies evaluate human-AI collaboration and occasionally describe their work as XAI, none assess it using XAI-specific criteria. Moreover, none have explicitly integrated ToM with XAI, highlighting a gap that our work addresses.

While the field of ToM primarily claims to enhance user understanding, trust, and, more broadly, human-AI collaboration, these claims are often not systematically evaluated. This gap matters because, if ToM purports to provide explanations for users, it should be assessed using the same criteria applied in XAI. Indeed, these objectives align closely with those of XAI, which aims to design AI systems that are interpretable and comprehensible to humans (Rong et al. 2024). We therefore propose to systematically evaluate ToM with a rigor that matches its claims.


Reference

This content is AI-processed based on open access ArXiv data.
