In this paper we introduce a biologically inspired Convolutional Neural Network (CNN) architecture, called LGN-CNN, whose first convolutional layer is composed of a single filter that mimics the role of the Lateral Geniculate Nucleus (LGN). The first layer of the neural network exhibits a rotationally symmetric pattern, justified by the structure of the net itself, which turns out to be an approximation of a Laplacian of Gaussian (LoG). The latter function is in turn a good approximation of the receptive field profiles (RFPs) of the cells in the LGN. The analogy with the visual system is thus established, emerging directly from the architecture of the neural network. A proof of rotation invariance of the first layer is given for a fixed LGN-CNN architecture, and computational results are shown. The contrast invariance capability of the LGN-CNN is then investigated, and a comparison between the Retinex effects of the first layer of the LGN-CNN and those of a LoG is provided on different images. A statistical study is carried out on the filters of the second convolutional layer with respect to biological data. In conclusion, the model we have introduced approximates well the RFPs of both the LGN and V1, attaining behavior similar to that of the long-range connections of LGN cells which produce Retinex effects.
## Summary & Analysis
This paper introduces LGN-CNN, a biologically inspired convolutional neural network (CNN) architecture that mimics the structure of the visual system, with the aim of aligning visual information processing in deep networks more closely with the mechanisms of human vision. The core innovation lies in designing the first convolutional layer of this CNN as a single filter that closely approximates a Laplacian of Gaussian (LoG), reflecting how the Lateral Geniculate Nucleus (LGN) processes input before passing it to the primary visual cortex (V1). This design yields rotational symmetry, mirroring a property of LGN cells. The second layer develops Gabor-like filters that reproduce the characteristics of V1 simple cells, strengthening the network's resemblance to the biological visual pathway.
The paper demonstrates that this architecture exhibits rotation and contrast invariance, effectively reproducing Retinex effects on various images. Statistical analyses of the second convolutional layer confirm its alignment with biological data. The significance of LGN-CNN lies in its potential to improve deep learning models by better mimicking human visual processing mechanisms, which can be particularly beneficial for image recognition and processing applications.
## Full Paper Content (ArXiv Source)
# Introducing the LGN-CNN architecture
In this section we introduce one of the main novelties of this paper: a CNN architecture that is inspired by the structure of the visual system and, in particular, takes into account LGN cells.
The retinal action in a CNN has been implemented in , where the authors propose a bottleneck model for the retinal output. In our model we propose a single-filter layer at the beginning of the CNN that mimics the action of the LGN. As we have already discussed in Section 12, the RFP of an LGN cell can be modeled by a LoG that acts directly on the visual stimulus. Since the LGN preprocesses the visual stimulus before it reaches V1, we add a first layer at the beginning of the CNN that reproduces the role of the LGN.
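For concreteness, the following NumPy sketch (an illustration of the LoG model only; the kernel size and the width $`\sigma`$ are arbitrary choices, not values taken from the paper) builds a discrete LoG kernel of the kind used to model the center-surround RFP of an LGN cell:

```python
import numpy as np

def log_kernel(size: int, sigma: float) -> np.ndarray:
    """Discrete Laplacian-of-Gaussian (LoG) kernel, the classical
    center-surround model for an LGN receptive field profile."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    # LoG(x, y) = -1/(pi*sigma^4) * (1 - r^2/(2*sigma^2)) * exp(-r^2/(2*sigma^2))
    kernel = -(1.0 / (np.pi * sigma**4)) * (1 - r2 / (2 * sigma**2)) \
             * np.exp(-r2 / (2 * sigma**2))
    # Re-center so the truncated discrete kernel sums to zero,
    # as the continuous LoG does over the whole plane.
    return kernel - kernel.mean()

# Example: a 7x7 LoG, comparable in size to a single filter Psi^0 of layer l^0.
psi0_like = log_kernel(size=7, sigma=1.2)
```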
In particular, given a classical CNN, we can add before the other convolutional layers a layer $`\ell^0`$ composed of a single filter $`\Psi^0`$ of size $`s^0 \times s^0`$ followed by a ReLU function. Note that after the first layer $`\ell^0`$ we do not apply any pooling. In this way, taking a classical CNN and adding $`\ell^0`$ does not modify the structure of the neural network, and the number of parameters increases only by $`s^0 \times s^0`$. Furthermore, $`\Psi^0`$ prefilters the input image without modifying its dimensions; this mimics the behavior of the LGN and brings the neural network closer to the structure of the visual system. Figure 1 shows a scheme of the first steps of the visual pathway (i.e., the LGN and V1) in parallel with the first two layers $`\ell^0`$ and $`\ell^1`$ of the LGN-CNN architecture.
Figure 1: Scheme of the LGN and V1 in parallel with the first two layers $`\ell^0`$ and $`\ell^1`$ of the LGN-CNN architecture.
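A minimal PyTorch sketch of this construction is given below. It is our own illustration, not the authors' released code: the kernel sizes, the number of $`\ell^1`$ filters, and the assumption of single-channel (grayscale) input are arbitrary choices. Only the structural points from the text are reproduced, namely a single $`s^0 \times s^0`$ filter with a ReLU and no pooling in $`\ell^0`$, "same" padding so the image dimensions are preserved, and a parameter increase of exactly $`s^0 \times s^0`$.

```python
import torch
import torch.nn as nn

class LGNCNN(nn.Module):
    """Sketch of the LGN-CNN: a single-filter layer l^0 (the LGN analogue)
    prepended to an otherwise classical CNN whose next layer l^1
    plays the role of V1."""
    def __init__(self, s0: int = 7, n_filters_l1: int = 32):
        super().__init__()
        # l^0: one s0 x s0 filter Psi^0, 'same' padding, no bias, no pooling,
        # so the input image is prefiltered without changing its dimensions.
        self.l0 = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=s0, padding=s0 // 2, bias=False),
            nn.ReLU(),
        )
        # l^1: an ordinary convolutional layer, whose filters are expected
        # to become Gabor-like (V1 simple-cell analogue).
        self.l1 = nn.Sequential(
            nn.Conv2d(1, n_filters_l1, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # ... the remaining layers of the classical CNN would follow here.

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.l1(self.l0(x))

# Adding l^0 increases the parameter count only by s0*s0 (here 7*7 = 49):
model = LGNCNN(s0=7)
print(sum(p.numel() for p in model.l0.parameters()))  # -> 49
```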
The theoretical idea behind this structure can be found in a simple result on rotationally symmetric convex functionals. In particular, we recall that a rotationally symmetric convex functional $`F`$ has a unique minimizer $`\omega`$. Since $`F`$ is rotationally symmetric, $`F(\omega \circ g) = F(\omega)`$ for any rotation $`g`$, so that $`\omega \circ g`$ is also a minimizer. Since the minimizer is unique, $`\omega = \omega \circ g`$ for every rotation $`g`$, implying the rotational symmetry of $`\omega`$. There are several results on the symmetry of minimizers of functionals, as for example in , . Our aim is to extend these results to the case of CNNs, and in particular to our architecture, which we name the Lateral Geniculate Nucleus Convolutional Neural Network (LGN-CNN).
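As an informal numerical counterpart to this argument (not part of the paper; `scipy.ndimage.rotate` and the tolerance of the check are our own choices), the following sketch quantifies how close a filter is to being rotationally symmetric by comparing it with rotated copies of itself. A learned $`\Psi^0`$ that approximates a LoG should give values near zero, whereas an oriented Gabor-like filter would not.

```python
import numpy as np
from scipy.ndimage import rotate

def rotational_asymmetry(kernel: np.ndarray, angles=(30, 60, 90, 135)) -> float:
    """Mean relative L2 distance between a kernel and its rotated copies.
    A value close to 0 indicates approximate rotational symmetry,
    as expected for a LoG-like filter Psi^0."""
    errs = []
    for a in angles:
        rot = rotate(kernel, angle=a, reshape=False, order=3, mode="nearest")
        errs.append(np.linalg.norm(rot - kernel) / np.linalg.norm(kernel))
    return float(np.mean(errs))

# e.g. rotational_asymmetry(psi0_like) is small for the LoG kernel sketched
# earlier, up to interpolation error, while an oriented filter scores much higher.
```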
We will also show that the Gabor-like filters in the second convolutional layer, reprojected into the $`(n_x, n_y)`$ plane introduced by Ringach and recalled above, satisfy the same elongation properties that characterize the RFPs of simple cells in V1. This analysis reinforces the link between our architecture and the structure of the visual system, at least as regards simple cells in V1.
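As a sketch of this reprojection (our own illustration; the Gabor fitting of each $`\ell^1`$ filter is assumed to have been performed separately, and all numerical values below are hypothetical), Ringach's coordinates are usually stated as $`n_x = \sigma_x f`$ and $`n_y = \sigma_y f`$, where $`\sigma_x, \sigma_y`$ are the widths of the fitted Gaussian envelope and $`f`$ is the spatial frequency of the fitted Gabor:

```python
import numpy as np

def ringach_coordinates(sigma_x: float, sigma_y: float, freq: float):
    """Map a Gabor fit of an l^1 filter to Ringach's (n_x, n_y) plane:
    n_x = sigma_x * f (envelope width along the modulation axis, in cycles),
    n_y = sigma_y * f (envelope length across it, in cycles).
    Macaque simple cells in Ringach's data cluster at relatively small
    values of (n_x, n_y), i.e. low elongation."""
    return sigma_x * freq, sigma_y * freq

# Hypothetical example: a filter whose fitted envelope is (sigma_x, sigma_y)
# = (1.5, 2.5) pixels with spatial frequency f = 0.2 cycles/pixel lands at
nx, ny = ringach_coordinates(1.5, 2.5, 0.2)   # -> (0.30, 0.50)
```

Plotting each filter of $`\ell^1`$ as a point in the $`(n_x, n_y)`$ plane then allows a direct comparison with the distribution of simple cells reported by Ringach.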