Detecting Perspective Shifts in Multi-agent Systems

February 18, 2026

Reading time: 5 minutes


📝 Original Info

Title: Detecting Perspective Shifts in Multi-agent Systems
ArXiv ID: 2512.05013
Date: 2025-12-04
Authors: Eric Bridgeford, Hayden Helm

📝 Abstract

Generative models augmented with external tools and update mechanisms (or agents) have demonstrated capabilities beyond intelligent prompting of base models. As agent use proliferates, dynamic multi-agent systems have naturally emerged. Recent work has investigated the theoretical and empirical properties of low-dimensional representations of agents based on query responses at a single time point. This paper introduces the Temporal Data Kernel Perspective Space (TDKPS), which jointly embeds agents across time, and proposes several novel hypothesis tests for detecting behavioral change at the agent- and group-level in black-box multi-agent systems. We characterize the empirical properties of our proposed tests, including their sensitivity to key hyperparameters, in simulations motivated by a multi-agent system of evolving digital personas. Finally, we demonstrate via natural experiment that our proposed tests detect changes that correlate sensitively, specifically, and significantly with a real exogenous event. As far as we are aware, TDKPS is the first principled framework for monitoring behavioral dynamics in black-box multi-agent systems, a critical capability as generative agent deployment continues to scale.


📄 Full Content

The general improvement of Large Language Models (LLMs) has spurred the development of generative systems that use tools to interact directly with their environment (Yee et al., 2024). Consider a system consisting of a data-scraper with access to the internet, a database for storage, and an LLM. When prompted, the system begins to scrape the internet based on the prompt. The LLM assesses the relevance of each datum before it is added to the database. Once the scrape is complete, the LLM provides a response based on the data stored in the database. We refer to a system whose behavior can be affected by a change in its tools (e.g., the

Recent progress in the design of generative agents has led to their deployment in increasingly complex environments, often populated by other agents performing similar information-gathering and reasoning tasks. These environments are characterized as multi-agent systems in which agents interact, exchange information, or otherwise influence each other's behaviors. Multi-agent systems are inherently complex (Han et al., 2024): each agent typically consists of multiple dependent components; their update mechanisms are often loosely defined and depend on interaction with their environment; and the effects of one agent's actions on itself and others can be convoluted. Given the rise in popularity of generative agents for different tasks, developing methods for understanding the dynamics of multi-agent systems is critical to advancing the reliability and safety of agent use in shared environments.

One of the most fundamental statistical challenges in studying multi-agent systems is determining whether, or when, the behavior of an agent (or of a collection of agents) has changed.

In this paper, we address this problem in the black-box setting, where an agent’s internal mechanisms are inaccessible and only its inputs and outputs are available for analysis. This setting is the most universal and realistic regime for monitoring modern multi-agent systems, given that an agent may be proprietary, may have access to private external tools, or may have access to privileged information.

Contribution. As far as we are aware, the framework developed herein, the Temporal Data Kernel Perspective Space (TDKPS), is the first to enable principled statistical inference on general agent dynamics in black-box multi-agent systems. We describe and characterize the statistical properties of the first tests for temporal change detection at the agent- and group-level.
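To make the change-detection problem concrete, the following is a minimal sketch of a generic two-sample permutation test applied to one agent's embedded responses at two time points. It is not the test proposed in the paper, whose construction is not given in this excerpt. The function names (`mean_shift`, `permutation_pvalue`), the choice of statistic (Euclidean distance between mean embeddings), and the toy Gaussian data are all assumptions made for illustration.

```python
import math
import random


def mean_vector(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def mean_shift(before, after):
    """Euclidean distance between the mean embeddings of two samples."""
    mean_before, mean_after = mean_vector(before), mean_vector(after)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mean_after, mean_before)))


def permutation_pvalue(before, after, n_permutations=999, seed=0):
    """Permutation p-value for 'the mean embedding did not change'.

    Time labels are shuffled across the pooled responses; the p-value is the
    share of shuffles whose statistic is at least as large as the observed one
    (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = mean_shift(before, after)
    pooled = list(before) + list(after)
    n_before = len(before)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if mean_shift(pooled[:n_before], pooled[n_before:]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_permutations)


# Toy data (hypothetical): 4-dimensional "response embeddings" for one agent.
_toy_rng = random.Random(1)


def sample_embeddings(center, n, dim=4, scale=0.5):
    return [[center + _toy_rng.gauss(0, scale) for _ in range(dim)] for _ in range(n)]


responses_t1 = sample_embeddings(0.0, 20)
responses_t2_stable = sample_embeddings(0.0, 20)   # same behavior at time 2
responses_t2_shifted = sample_embeddings(2.0, 20)  # behavior has moved

p_stable = permutation_pvalue(responses_t1, responses_t2_stable)
p_shifted = permutation_pvalue(responses_t1, responses_t2_shifted)
print(f"stable agent p-value:  {p_stable:.3f}")
print(f"shifted agent p-value: {p_shifted:.3f}")
```

The sketch only needs the agent's outputs, which is what makes this style of test compatible with the black-box setting; a group-level variant would need a statistic defined over several agents at once.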

[Figure 1 panel title: Temporal Data Kernel Perspective Space (t=2018-04)]

Figure 1. The T = 2, 2-d Temporal Data Kernel Perspective Space (“TDKPS”) of a multi-agent system consisting of generative agents parameterized by different, dynamic retrieval datasets. Each dot/triangle is an agent. TDKPS enables interpretable and principled analysis of multi-agent systems in the black-box setting. For more experimental details, see Section 3.1.

Multi-agent systems. The majority of research on multi-agent generative systems has been in the context of computational sociology and behavioral simulation (Park et al., 2023; McGuinness et al., 2025; Chen et al., 2025a). Broadly, prior work simulates predefined agent architectures interacting within sandboxed environments (Park et al., 2023; Piao et al., 2025) or along fixed interaction graphs (Papachristou & Yuan, 2024; Helm et al., 2024; Chuang et al., 2024).

Quantitative tracking of agent dynamics in these studies is typically limited to aggregate system-level statistics of group behavior (Sun et al., 2025; Chen et al., 2025b; Tran et al., 2025). The growing deployment of agentic systems in live, open environments motivates our more general treatment of black-box multi-agent systems and our approach to characterizing general agent behavior.

Representations of models. A core component of any generative agent is the LLM that drives its reasoning, tool use, and responses. Understanding differences between models is therefore a natural starting point for understanding differences between agents. Numerous methods embed language models into low-dimensional spaces, via their internal representations (Duderstadt et al., 2024; Huh et al., 2024), parameter weights (Putterman et al., 2024), or responses to shared queries (Acharyya et al., 2025), enabling standard statistical inference. Among these, the Data Kernel Perspective Space (DKPS) (Helm et al., 2024; Acharyya et al., 2025; Helm et al., 2025a) is most directly relevant: it represents black-box models via response similarities to a reference query set. To our knowledge, no existing framework provides representations for studying the dynamics of general agent behavior.
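The idea of representing black-box models through their responses to a shared query set can be sketched as follows. This is an illustration under stated assumptions, not the DKPS procedure as specified in the cited papers: it assumes each agent's responses have already been embedded into fixed-length vectors, uses the Euclidean (Frobenius) distance between agents' stacked response embeddings, and applies classical multidimensional scaling to obtain 2-d coordinates. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def pairwise_response_distances(responses: np.ndarray) -> np.ndarray:
    """Agent-by-agent distances computed from embedded query responses.

    `responses` has shape (n_agents, n_queries, embed_dim); entry [i, q] is
    assumed to be agent i's (average) embedded response to query q.
    """
    n_agents = responses.shape[0]
    flat = responses.reshape(n_agents, -1)
    diffs = flat[:, None, :] - flat[None, :, :]
    return np.linalg.norm(diffs, axis=-1)


def classical_mds(distances: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Classical MDS: low-dimensional coordinates from a distance matrix."""
    n = distances.shape[0]
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    gram = -0.5 * centering @ (distances ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating-point error.
    scales = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scales


# Synthetic example: 6 agents, 5 queries, 8-dimensional response embeddings.
# Agents 0-2 respond alike; agents 3-5 respond alike but differently from 0-2.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(5, 8))
group_b = group_a + 3.0
responses = np.stack(
    [group_a + 0.1 * rng.normal(size=(5, 8)) for _ in range(3)]
    + [group_b + 0.1 * rng.normal(size=(5, 8)) for _ in range(3)]
)

distances = pairwise_response_distances(responses)
coords = classical_mds(distances, n_components=2)
print(coords.round(2))
```

In the resulting coordinates, agents with similar responses land close together, which is what allows standard statistical inference to be run on points rather than on raw text.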

Inference on structured objects. Detecting temporal changes in multi-agent systems requires statistical methods for structured, high-dimensional data. Prior work on structured objects, including latent position graphs (Tang et al., 2013), connectomes (Chung et al., 2021; Bridgeford et al., 2025), and physiological signals (Chen et al., 2022), typically cons
