From Particles to Agents: Hallucination as a Metric for Cognitive Friction in Spatial Simulation
Traditional architectural simulations (e.g., Computational Fluid Dynamics, evacuation, structural analysis) model elements as deterministic, physics-based “particles” rather than cognitive “agents”. To bridge this gap, we introduce \textbf{Agentic Environmental Simulations}, in which large multimodal generative models actively predict the next state of spatial environments based on semantic expectation. Drawing on examples from accessibility-oriented AR pipelines and multimodal digital twins, we propose a shift from chronological time-steps to Episodic Spatial Reasoning, in which simulations advance through meaningful, surprisal-triggered events. Within this framework we posit AI hallucinations as diagnostic tools: by formalizing \textbf{Cognitive Friction} ($C_f$), we can reveal “Phantom Affordances”, i.e., semiotic ambiguities in built space. Finally, we challenge current HCI paradigms by treating environments as dynamic cognitive partners and propose a human-centered framework of cognitive orchestration for designing AI-driven simulations that preserve autonomy, affective clarity, and cognitive integrity.
💡 Research Summary
The paper argues that traditional architectural simulations—such as computational fluid dynamics, crowd evacuation, and structural analysis—treat elements as deterministic physical “particles” and therefore ignore the cognitive dimension of human occupants. To overcome this limitation, the authors introduce Agentic Environmental Simulations, where the fundamental computational unit is a reasoning loop powered by large multimodal generative models (LLMs, VLMs). These models predict the next state of a spatial environment based on semantic expectations rather than purely physical laws.
A central contribution is the shift from chronological, high‑frequency time‑step simulation to Episodic Spatial Reasoning. Inspired by dual‑process theories (System 1 fast heuristics, System 2 slow reasoning) and event‑segmentation research, the pipeline runs a low‑cost physics‑based “Heuristic Autopilot” for routine locomotion. When a surprisal measure exceeds a predefined threshold τ—typically at doorways, signage ambiguities, or other cognitively salient junctions—the high‑cost multimodal LLM is activated to generate an expectation E_gen of the proximal state. The physical environment supplies the ground‑truth R_phys.
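The surprisal-gated pipeline described above can be sketched in a few lines. This is a minimal illustrative toy, not the paper's implementation: the function names (`heuristic_autopilot`, `surprisal`, `generative_expectation`), the threshold value, and the toy surprisal rule are all assumptions introduced for clarity.

```python
# Minimal sketch of the Episodic Spatial Reasoning loop: a cheap
# System-1 "Heuristic Autopilot" runs every step, and the high-cost
# multimodal model is invoked only when surprisal exceeds tau.
# All names and values here are illustrative, not from the paper.

TAU = 1.0  # surprisal threshold tau (hypothetical units)

def heuristic_autopilot(state):
    """System-1 step: cheap, physics-style advance of the agent."""
    return {"pos": state["pos"] + 1, "context": state["context"]}

def surprisal(state):
    """Toy surprisal: spikes at cognitively salient junctions (e.g. doorways)."""
    return 2.0 if state["context"] == "doorway" else 0.1

def generative_expectation(state):
    """Stand-in for the high-cost multimodal LLM call producing E_gen."""
    return f"expected open passage at pos {state['pos']}"

def episodic_step(state):
    """Advance one step; escalate to System 2 only when surprisal > TAU."""
    state = heuristic_autopilot(state)
    if surprisal(state) > TAU:
        return state, generative_expectation(state)  # episodic event triggered
    return state, None  # routine locomotion, no LLM call

state = {"pos": 0, "context": "corridor"}
state, e_gen = episodic_step(state)
print(e_gen)  # routine corridor step: no expectation generated

state["context"] = "doorway"
state, e_gen = episodic_step(state)
print(e_gen)  # expectation E_gen generated at the doorway
```

The design point is economic: the expensive generative call is amortized over many cheap autopilot steps, firing only at the event boundaries that event-segmentation research identifies as cognitively meaningful.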
The authors reconceptualize AI hallucinations not as model failures but as diagnostic signals of semiotic ambiguity. They define a Cognitive Friction metric as the divergence between the model's generated expectation and the physical ground truth, C_f = D(E_gen, R_phys), where D is a semantic distance. High C_f flags “Phantom Affordances”: locations where the environment's semiotics mislead a reasonable observer.
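A toy computation of Cognitive Friction makes the metric concrete. The choice of cosine distance over embedding vectors, and the vectors themselves, are assumptions for illustration; the paper's exact distance function D is not specified in this excerpt.

```python
# Sketch of Cognitive Friction C_f as a semantic distance D between the
# generated expectation E_gen and the physical ground truth R_phys.
# Cosine distance over embeddings is an assumed, illustrative choice of D.
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def cognitive_friction(e_gen_vec, r_phys_vec):
    """C_f = D(E_gen, R_phys); here D is cosine distance over embeddings."""
    return cosine_distance(e_gen_vec, r_phys_vec)

# Expectation matches reality -> zero friction.
print(cognitive_friction([1.0, 0.0], [1.0, 0.0]))  # 0.0
# Orthogonal semantics (a "phantom affordance", e.g. a door-like wall) -> high friction.
print(cognitive_friction([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Under this reading, a hallucination is simply a point where C_f is large: the model's semantically reasonable expectation diverges from the built reality, which is exactly the diagnostic signal the authors want to surface.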