LGN-CNN: a biologically inspired CNN architecture

Introducing LGN-CNN architecture

In this section we introduce one of the main novelties of this paper: a CNN architecture inspired by the structure of the visual system which, in particular, takes LGN cells into account.

The retinal action in a CNN has been implemented in , where the authors proposed a bottleneck model for the retinal output. In our model we propose a single filter layer at the beginning of the CNN that mimics the action of the LGN. As already discussed in Section 12, the RFP of an LGN cell can be modeled by a LoG that acts directly on the visual stimulus. Since the LGN preprocesses the visual stimulus before it reaches V1, we add a first layer at the beginning of the CNN that reproduces the role of the LGN.
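A minimal sketch of the LoG profile mentioned above, written as a discrete kernel (the size and width parameters below are illustrative, not taken from the paper):

```python
import numpy as np

def log_kernel(size, sigma):
    """Discrete Laplacian-of-Gaussian (LoG) kernel of shape (size, size).

    A center-surround, rotationally symmetric profile of the kind used
    to model the receptive field profile (RFP) of an LGN cell.  The
    mean is subtracted so the kernel has zero DC response: a constant
    stimulus produces no output.
    """
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    g = np.exp(-r2 / (2.0 * sigma**2))
    log = (r2 - 2.0 * sigma**2) / sigma**4 * g  # Laplacian of the Gaussian
    return log - log.mean()  # enforce zero mean

kernel = log_kernel(size=9, sigma=1.5)
```

By construction the kernel is invariant under the symmetries of the sampling grid (transposition and 180-degree rotation), consistent with the rotational symmetry attributed to LGN receptive fields.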

In particular, given a classical CNN, we can add, before the other convolutional layers, a layer $`\ell^0`$ composed of a single filter $`\Psi^0`$ of size $`s^0 \times s^0`$ followed by a ReLU function. Note that no pooling is applied after the first layer $`\ell^0`$. In this way, taking a classical CNN and adding $`\ell^0`$ does not modify the structure of the neural network, and the number of parameters increases only by $`s^0 \times s^0`$. Furthermore, $`\Psi^0`$ prefilters the input image without modifying its dimensions; this mimics the behavior of the LGN and brings the neural network closer to the structure of the visual system. Figure 1 shows a scheme of the first steps of the visual pathway (i.e., LGN and V1) in parallel with the first two layers $`\ell^0`$ and $`\ell^1`$ of the LGN-CNN architecture.
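The layer $`\ell^0`$ can be sketched as follows (a plain NumPy illustration, not the paper's implementation; the filter values and the size $`s^0 = 7`$ are arbitrary placeholders):

```python
import numpy as np

def lgn_layer(image, psi0):
    """Sketch of the layer ell^0: correlate the input with the single
    filter Psi^0 ('same' zero padding, stride 1) and apply a ReLU.

    The output keeps the spatial dimensions of the input and no pooling
    is applied, so a classical CNN stacked on top is left unchanged;
    only the s0 * s0 entries of Psi^0 are added as parameters.
    """
    s0 = psi0.shape[0]
    pad = s0 // 2
    padded = np.pad(image, pad, mode="constant")
    h, w = image.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + s0, j:j + s0] * psi0)
    return np.maximum(out, 0.0)  # ReLU

rng = np.random.default_rng(0)
psi0 = rng.standard_normal((7, 7))   # the single filter Psi^0
img = rng.random((32, 32))           # a toy visual stimulus
filtered = lgn_layer(img, psi0)
assert filtered.shape == img.shape   # dimensions are preserved
```

In a deep-learning framework this corresponds to a one-channel convolution with "same" padding and no bias, prepended to an otherwise unmodified network.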

Figure 1: Scheme of the LGN and V1 in parallel with the first two layers $`\ell^0`$ and $`\ell^1`$ of the LGN-CNN architecture.

The theoretical idea behind this structure rests on a simple result on rotationally symmetric convex functionals. In particular, we recall that a rotationally symmetric convex functional $`F`$ has a unique minimizer $`\omega`$. Since $`F`$ is rotationally symmetric, $`F(\omega \circ g) = F(\omega)`$ for any rotation $`g`$. Thus, since the minimizer is unique, $`\omega = \omega \circ g`$, implying the rotational symmetry of $`\omega`$. There are several results on symmetries of minimizers of functionals, as for example in , . Our aim is to extend these results to the case of CNNs, in particular to our architecture, which we name the Lateral Geniculate Nucleus Convolutional Neural Network (LGN-CNN).
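The symmetry argument can be written out in two lines (a sketch, assuming the minimizer is unique, e.g. when $`F`$ is strictly convex):

```latex
% F rotation invariant with unique minimizer \omega
\begin{align*}
  F(\omega \circ g) &= F(\omega) = \min F
      && \text{(rotation invariance of } F\text{)} \\
  \Rightarrow\quad \omega \circ g &= \omega \quad \text{for every rotation } g
      && \text{(uniqueness of the minimizer)}
\end{align*}
```

Hence the minimizer inherits the rotational symmetry of the functional, which is the mechanism by which a radially symmetric, LoG-like filter is expected to emerge in $`\ell^0`$.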

We will also show that the Gabor-like filters in the second convolutional layer, reprojected onto the $`(n_x, n_y)`$ plane introduced by Ringach and recalled above, satisfy the same elongation properties that characterize the RFPs of simple cells in V1. This analysis should reinforce the link between our architecture and the structure of the visual system, at least as regards simple cells in V1.
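For reference, Ringach's coordinates are the dimensionless products of the Gabor envelope widths and the carrier frequency; a small helper (the parameter names are ours, and the exact convention should be checked against the paper's Section on Ringach's analysis):

```python
def ringach_coordinates(sigma_x, sigma_y, f):
    """Dimensionless Ringach coordinates (n_x, n_y) of a fitted Gabor:
    n_x = sigma_x * f and n_y = sigma_y * f, where sigma_x and sigma_y
    are the widths of the Gaussian envelope and f is the spatial
    frequency of the sinusoidal carrier.  The position of a filter in
    the (n_x, n_y) plane summarizes its shape, so elongated V1-like
    filters and blob-like filters fall in different regions.
    """
    return sigma_x * f, sigma_y * f

nx, ny = ringach_coordinates(sigma_x=2.0, sigma_y=4.0, f=0.5)
```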