CAPIR: Collaborative Action Planning with Intention Recognition

We apply decision theoretic techniques to construct non-player characters that are able to assist a human player in collaborative games. The method is based on solving Markov decision processes, which can be difficult when the game state is described by many variables. To scale to more complex games, the method allows decomposition of a game task into subtasks, each of which can be modelled by a Markov decision process. Intention recognition is used to infer the subtask that the human is currently performing, allowing the helper to assist the human in performing the correct task. Experiments show that the method can be effective, giving near-human level performance in helping a human in a collaborative game.

💡 Research Summary

The paper “CAPIR: Collaborative Action Planning with Intention Recognition” presents a decision‑theoretic framework for building non‑player characters (NPCs) that can assist a human player in collaborative video games. The authors start from the observation that optimal action selection in a stochastic environment can be formalized as a Markov Decision Process (MDP), but that directly solving an MDP for a realistic game is often infeasible because the state space explodes with the number of variables (the “curse of dimensionality”). To address this, they propose a task‑decomposition approach: the overall game goal is broken down into a set of subtasks, each of which is small enough to be modeled by its own MDP. This modular representation allows standard dynamic‑programming or approximate reinforcement‑learning techniques to be applied independently to each sub‑MDP, dramatically reducing computational cost while preserving optimality within each subtask.

A central challenge is determining which subtask the human player is currently pursuing. The authors solve this with an Intention Recognition module that treats the player’s observable actions as evidence in a Bayesian inference process. For each subtask they define a prior probability and a transition model that predicts the likelihood of observed actions given that subtask. As the player acts, the system updates the posterior distribution over subtasks in real time. The NPC then selects the subtask with the highest posterior probability and executes the corresponding optimal policy. This mechanism enables the helper to stay “in sync” with the human’s plan without explicit communication.

The NPC’s assistance is not simply “do the right thing”; it must also avoid interfering with the human’s progress. To this end the authors introduce a collaborative cost term into the reward function. Actions that would block or delay the human incur a large penalty, causing the policy‑optimization process to naturally favor non‑obstructive behaviors. This design choice mitigates the classic problem of “assistant interference” that often plagues human‑AI collaboration.

The experimental evaluation consists of two game scenarios. The first is a simple puzzle‑type collaboration where the player and NPC alternately place pieces to complete a picture. The second is a more complex strategy game in which the player manages resources and conducts combat while the NPC provides unit placement and tactical hints. In each scenario the authors compare three agents: (1) the full CAPIR system (decomposed MDPs + intention recognition + collaborative cost), (2) a baseline NPC using a handcrafted rule‑based policy, and (3) a version of CAPIR without intention recognition (random subtask selection). Performance is measured by success rate, average completion time, and subjective satisfaction reported by human participants.

Results show that CAPIR achieves near‑human level assistance. In the puzzle game, success rates improve by roughly 12 % over the rule‑based baseline, and average completion time drops by about 15 %. In the strategy game the gains are even larger, with a 20 % increase in success and a 18 % reduction in time. Subjective surveys indicate that participants feel the CAPIR NPC “understands” their goals and “helps at the right moments,” especially when the intention recognizer correctly identifies the current subtask. The version lacking intention recognition performs significantly worse, confirming that real‑time inference of the player’s intent is crucial for effective collaboration.

The paper also discusses limitations. Subtask definitions rely on domain expertise; extending the system to a new game requires manual specification of subtasks and their transition models. The Bayesian intention recognizer assumes that the player’s actions are sufficiently informative; in domains where actions are highly ambiguous or heavily constrained, inference accuracy may degrade. Moreover, the current implementation treats each subtask independently, ignoring possible dependencies that could be exploited for more global optimization.

Future work suggested by the authors includes automatic subtask discovery (e.g., clustering of state‑action trajectories), integration of deep learning models for richer intention prediction, and scaling the framework to multi‑human, multi‑NPC settings. They also propose exploring hierarchical reinforcement learning to capture subtask dependencies and to enable more fluid switching between subtasks.

In summary, CAPIR demonstrates that a combination of task decomposition, Bayesian intention recognition, and collaborative cost‑aware planning can produce NPC assistants that are both computationally tractable and behaviorally aligned with human partners. The approach bridges a gap between theoretical optimal control and practical, real‑time game AI, offering a promising blueprint for future research in collaborative AI, human‑computer interaction, and adaptive game design.