Title: Visual Categorization Across Minds and Models: Cognitive Analysis of Human Labeling and Neuro-Symbolic Integration
ArXiv ID: 2512.09340
Date: 2025-12-10
Authors: Chethana Prasad Kabgere
📝 Abstract
Understanding how humans and AI systems interpret ambiguous visual stimuli offers critical insight into the nature of perception, reasoning, and decision-making. This paper examines image labeling performance across human participants and deep neural networks, focusing on low-resolution, perceptually degraded stimuli. Drawing from computational cognitive science, cognitive architectures, and connectionist-symbolic hybrid models, we contrast human strategies such as analogical reasoning, shape-based recognition, and confidence modulation with AI's feature-based processing. Grounded in Marr's tri-level hypothesis, Simon's bounded rationality, and Thagard's frameworks of representation and emotion, we analyze participant responses in relation to Grad-CAM visualizations of model attention. Human behavior is further interpreted through cognitive principles modeled in ACT-R and Soar, revealing layered and heuristic decision strategies under uncertainty. Our findings highlight key parallels and divergences between biological and artificial systems in representation, inference, and confidence calibration. The analysis motivates future neuro-symbolic architectures that unify structured symbolic reasoning with connectionist representations. Such architectures, informed by principles of embodiment, explainability, and cognitive alignment, offer a path toward AI systems that are not only performant but also interpretable and cognitively grounded.
Chethana Prasad Kabgere, Student, Georgia Institute of Technology, Atlanta, Georgia, USA.
ckabgere3@gatech.edu
Index Terms—Visual classification, Analogical reasoning, Embodied cognition, Distributed cognition, Neuro-symbolic integration
I. INTRODUCTION
ARTIFICIAL intelligence (AI) systems, particularly those built on deep neural networks, have demonstrated remarkable capabilities in tasks like image classification. Nonetheless, their underlying mechanisms differ fundamentally from human cognition. AI typically relies on connectionist representations—encoding statistical regularities in pixel data—while human cognition leverages symbolic reasoning, analogies, and contextual understanding derived from embodied experience. These distinct modes of representation raise compelling questions about how each “mind” processes ambiguous visual information under uncertainty.
Visual classification of low-resolution images offers fertile ground for examining these differences. According to Marr’s three levels of analysis, any vision system must be understood across (1) the computational level, defining what problem is solved; (2) the algorithmic level, specifying the representations and procedures used; and (3) the implementation level, describing its physical realization [1], [12]. Whereas AI systems implement hierarchical feature extraction via convolutional architectures, humans utilize generative and analogical processes to interpret sparse visual stimuli [13].
Our study is informed by several core cognitive-science principles. Bounded rationality posits that humans use satisficing heuristics rather than optimal calculations, due to computational and environmental limitations [14], [3]. Analogical reasoning enables humans to map novel objects onto known categories through structural similarities [4], [5]. Embodied cognition asserts that perceptual processes are grounded in bodily experience and interaction with the world [6]. Finally, human reasoning often operates through distributed cognition, where knowledge is represented and accessed across mental and environmental contexts [7].
Research Question: How do human cognitive strategies for labeling ambiguous visual stimuli compare to the feature-based labeling processes of AI systems, and how can these insights inform the design of cognitively aligned neuro-symbolic AI architectures?
This paper examines how these cognitive principles manifest through a comparison between human participants and a ResNet-18 model trained on CIFAR-10. By combining participant-reported strategies and confidence ratings with AI attention visualizations (e.g., Grad-CAM), we analyze how representational format, reasoning procedure, and implementation produce convergent or divergent classification behaviors.
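The Grad-CAM attention maps mentioned above follow a standard recipe: each channel of a convolutional layer is weighted by the global-average-pooled gradient of the target class score with respect to that channel, and the heatmap is the ReLU of the weighted sum of feature maps. A minimal NumPy sketch of that weighting step (independent of the actual ResNet-18 pipeline used in the study; the array shapes are illustrative) might look like:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Compute a Grad-CAM heatmap from one conv layer's activations.

    feature_maps: (K, H, W) activations A^k of the chosen layer.
    gradients:    (K, H, W) gradients of the class score w.r.t. A^k.
    """
    # Channel weights alpha_k: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)
    cam = np.maximum(cam, 0.0)
    # Normalize to [0, 1] for overlay on the input image.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

In practice `feature_maps` and `gradients` would come from a forward and backward pass through the network’s last convolutional layer; keeping them as plain arrays here makes the weighting and ReLU steps themselves visible.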
Emerging research suggests that neuro-symbolic integration—combining connectionist perception with symbolic reasoning—offers a promising pathway toward AI systems with improved interpretability and human-like cognitive robustness [8]. By situating our empirical findings within these theoretical frameworks, this work aims to inform the design of AI architectures that more closely mirror human cognitive processes in visual reasoning.
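As a rough illustration of the kind of integration meant here, one can imagine a connectionist module producing class scores that a symbolic layer then re-ranks against explicit attribute rules. The class names, attributes, and rules below are purely hypothetical, not part of the study:

```python
# Toy neuro-symbolic pipeline: a "perception" module emits class scores,
# and a symbolic layer re-ranks them against hand-written attribute rules.
# Required attributes per class (hypothetical examples).
RULES = {
    "bird":  {"has_wings"},
    "plane": {"has_wings", "is_metallic"},
    "dog":   set(),
}

def neuro_symbolic_label(scores, observed_attributes):
    """Return the highest-scoring class whose rules the observations satisfy."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    for label in ranked:
        # Symbolic check: every required attribute must have been observed.
        if RULES[label] <= observed_attributes:
            return label
    # No rule satisfied: fall back to the connectionist answer.
    return ranked[0]

scores = {"plane": 0.6, "bird": 0.3, "dog": 0.1}
print(neuro_symbolic_label(scores, {"has_wings"}))  # "plane" fails is_metallic, so "bird"
```

The point of the sketch is the division of labor: the network supplies graded evidence, while the symbolic layer contributes hard, human-readable constraints that can veto a high-confidence but rule-violating label.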
II. LITERATURE SURVEY
A. Foundational Concepts (Part A)
Symbolic and connectionist approaches have long shaped
cognitive science’s understanding of human and arti