Title: Visual Categorization Across Minds and Models: Cognitive Analysis of Human Labeling and Neuro-Symbolic Integration
ArXiv ID: 2512.09340
Date: 2025-12-10
Authors: Chethana Prasad Kabgere
📝 Abstract
Understanding how humans and AI systems interpret ambiguous visual stimuli offers critical insight into the nature of perception, reasoning, and decision-making. This paper examines image labeling performance across human participants and deep neural networks, focusing on low-resolution, perceptually degraded stimuli. Drawing from computational cognitive science, cognitive architectures, and connectionist-symbolic hybrid models, we contrast human strategies such as analogical reasoning, shape-based recognition, and confidence modulation with AI's feature-based processing. Grounded in Marr's tri-level hypothesis, Simon's bounded rationality, and Thagard's frameworks of representation and emotion, we analyze participant responses in relation to Grad-CAM visualizations of model attention. Human behavior is further interpreted through cognitive principles modeled in ACT-R and Soar, revealing layered and heuristic decision strategies under uncertainty. Our findings highlight key parallels and divergences between biological and artificial systems in representation, inference, and confidence calibration. The analysis motivates future neuro-symbolic architectures that unify structured symbolic reasoning with connectionist representations. Such architectures, informed by principles of embodiment, explainability, and cognitive alignment, offer a path toward AI systems that are not only performant but also interpretable and cognitively grounded.
Chethana Prasad Kabgere, Student, Georgia Institute of Technology, Atlanta, Georgia, USA.
ckabgere3@gatech.edu
Index Terms—Visual classification, Analogical reasoning, Embodied cognition, Distributed cognition, Neuro-symbolic integration
I. INTRODUCTION
ARTIFICIAL intelligence (AI) systems, particularly those built on deep neural networks, have demonstrated remarkable capabilities in tasks like image classification. Nonetheless, their underlying mechanisms differ fundamentally from human cognition. AI typically relies on connectionist representations—encoding statistical regularities in pixel data—while human cognition leverages symbolic reasoning, analogies, and contextual understanding derived from embodied experience. These distinct modes of representation raise compelling questions about how each “mind” processes ambiguous visual information under uncertainty.
Visual classification of low-resolution images offers fertile ground for examining these differences. According to Marr’s three levels of analysis, any vision system must be understood across (1) the computational level, defining what problem is solved; (2) the algorithmic level, specifying the representations and procedures used; and (3) the implementation level, describing its physical realization [1], [12]. Whereas AI systems implement hierarchical feature extraction via convolutional architectures, humans utilize generative and analogical processes to interpret sparse visual stimuli [13].
Our study is informed by several core cognitive-science principles. Bounded rationality posits that humans use satisficing heuristics rather than optimal calculations, due to computational and environmental limitations [14], [3]. Analogical reasoning enables humans to map novel objects onto known categories through structural similarities [4], [5]. Embodied cognition asserts that perceptual processes are grounded in bodily experience and interaction with the world [6]. Finally, human reasoning often operates through distributed cognition, where knowledge is represented and accessed across mental and environmental contexts [7].
Research Question: How do human cognitive strategies for labeling ambiguous visual stimuli compare to the feature-based labeling processes of AI systems, and how can these insights inform the design of cognitively aligned neuro-symbolic AI architectures?
This paper examines how these cognitive principles manifest through a comparison between human participants and a ResNet-18 model trained on CIFAR-10. By combining participant-reported strategies and confidence ratings with AI attention visualizations (e.g., Grad-CAM), we analyze how representational format, reasoning procedure, and implementation produce convergent or divergent classification behaviors.
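The Grad-CAM attention maps mentioned above follow a standard recipe: each channel of a convolutional layer is weighted by the global-average-pooled gradient of the target class score with respect to that channel, and the heatmap is the ReLU of the weighted sum of feature maps. A minimal NumPy sketch of that weighting step (independent of the actual ResNet-18 pipeline used in the study; the array shapes are illustrative) might look like:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Compute a Grad-CAM heatmap from one conv layer's activations.

    feature_maps: (K, H, W) activations A^k of the chosen layer.
    gradients:    (K, H, W) gradients of the class score w.r.t. A^k.
    """
    # Channel weights alpha_k: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)
    cam = np.maximum(cam, 0.0)
    # Normalize to [0, 1] for overlay on the input image.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

In practice `feature_maps` and `gradients` would come from a forward and backward pass through the network’s last convolutional layer; keeping them as plain arrays here makes the weighting and ReLU steps themselves visible.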
Emerging research suggests that neuro-symbolic integration—combining connectionist perception with symbolic reasoning—offers a promising pathway toward AI systems with improved interpretability and human-like cognitive robustness [8]. By situating our empirical findings within these theoretical frameworks, this work aims to inform the design of AI architectures that more closely mirror human cognitive processes in visual reasoning.
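As a rough illustration of the kind of integration meant here, one can imagine a connectionist module producing class scores that a symbolic layer then re-ranks against explicit attribute rules. The class names, attributes, and rules below are purely hypothetical, not part of the study:

```python
# Toy neuro-symbolic pipeline: a "perception" module emits class scores,
# and a symbolic layer re-ranks them against hand-written attribute rules.
# Required attributes per class (hypothetical examples).
RULES = {
    "bird":  {"has_wings"},
    "plane": {"has_wings", "is_metallic"},
    "dog":   set(),
}

def neuro_symbolic_label(scores, observed_attributes):
    """Return the highest-scoring class whose rules the observations satisfy."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    for label in ranked:
        # Symbolic check: every required attribute must have been observed.
        if RULES[label] <= observed_attributes:
            return label
    # No rule satisfied: fall back to the connectionist answer.
    return ranked[0]

scores = {"plane": 0.6, "bird": 0.3, "dog": 0.1}
print(neuro_symbolic_label(scores, {"has_wings"}))  # "plane" fails is_metallic, so "bird"
```

The point of the sketch is the division of labor: the network supplies graded evidence, while the symbolic layer contributes hard, human-readable constraints that can veto a high-confidence but rule-violating label.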
II. LITERATURE SURVEY
A. Foundational Concepts (Part A)
Symbolic and connectionist approaches have long shaped
cognitive science’s understanding of human and arti