Bipolar Morphological Neural Networks: Convolution Without Multiplication


Authors: Elena Limonova, Daniil Matveev, Dmitry Nikolaev, Vladimir V. Arlazarov

Elena Limonova 1,2,4, Daniil Matveev 2,3, Dmitry Nikolaev 2,4, Vladimir V. Arlazarov 2,5

1 Institute for Systems Analysis, FRC CSC RAS, Moscow, Russia; 2 Smart Engines Service LLC, Moscow, Russia; 3 Moscow State University, Moscow, Russia; 4 Institute for Information Transmission Problems RAS, Moscow, Russia; 5 Moscow Institute of Physics and Technology, Dolgoprudnii, Russia

ABSTRACT

In this paper we introduce novel bipolar morphological neuron and bipolar morphological layer models. The models use only addition, subtraction, and maximum operations inside the neuron, with exponent and logarithm as activation functions for the layer. Unlike previously introduced morphological neural networks, the proposed models approximate the classical computations and show better recognition results. We also propose a layer-by-layer approach to train bipolar morphological networks, which can be further developed into an incremental approach for separate neurons to reach higher accuracy. Neither approach requires special training algorithms, and both can use a variety of gradient descent methods. To demonstrate the efficiency of the proposed model, we consider classical convolutional neural networks and convert their pre-trained convolutional layers to bipolar morphological layers. Since the experiments on recognition of MNIST and MRZ symbols show only a moderate decrease of accuracy after conversion and training, the bipolar neuron model can provide faster inference and be very useful in mobile and embedded systems.

Keywords: morphological neural network, bipolar neuron, computational efficiency, layer-by-layer training

1. INTRODUCTION

Neural network methods are widely used in problems of recognition and machine vision [1-5]. Various deep neural network architectures have been developed for solving problems of current interest.
The scope of neural network usage is steadily growing. Networks are now actively used on mobile devices and embedded systems with limited performance and a strong need for low power consumption. Although a number of methods exist for improving the inference speed of neural networks [6-9], each of them has its own constraints that limit applicability, so this remains an area of active research. One way to improve the inference time of neural networks is to use a computationally simplified neuron model. Calculations in such a model can be implemented using fewer logic gates than the sequence of multiplications and additions used in the classical neuron model. This means that calculations in a simplified neuron can be performed in less time and with less energy, which is especially important for mobile recognition systems. Examples of neural networks with simplified neuron models are networks with integer calculations [10] and morphological neural networks [11]. Integer data types can speed up inference because integer sums and products are computed faster than real ones on modern mobile central processors. This is due to architectural features of the ARM processors most often used in mobile devices and embedded systems, as well as the presence of Single Instruction Multiple Data (SIMD) extensions, which perform one operation on all elements of a data register simultaneously [12]. For integer calculations this is very useful: a register of fixed size, for example 128 bits (for NEON and SSE), holds 4 float32 values but 16 8-bit values. However, replacing the classical neuron model with an integer model changes the calculation results due to accuracy loss in the weights and possible overflows.
Recent research introduces various methods to preserve recognition quality even with low-bit quantization of a network [6, 8, 13-17].

The morphological neuron model uses addition and maximum/minimum operations instead of addition and multiplication, as well as threshold nonlinear activation functions [11, 18]. This model largely appeals to the biological properties of neurons. Further developments of this idea are the dendrite morphological neuron, which allows simulating the processes of excitation and inhibition [19], and a generalization of the proposed model in terms of lattice algebra [20]. It is also worth noting that a morphological neural network is usually a single-layer perceptron. To train such a network, heuristic algorithms are used [21], which can be supplemented by stochastic gradient descent [22]. One more morphological neural network model, DenMo-Net, with dilation and erosion neurons, was presented in [23]. The three-layer DenMo-Net demonstrated good results relative to a classical three-layer architecture. However, the considered model does not seem to be scalable and does not show state-of-the-art quality in image recognition problems with such a simple structure.

In this paper, a new neuron model based on the idea of a morphological neuron is proposed. The proposed model is considered an approximation of a classical neuron, which allows us to adapt modern neural network architectures to it. We also introduce an approach to training and fine-tuning of such neural networks. We tested it on the MNIST digit recognition [24] and Machine-Readable Zone (MRZ) [25] symbol recognition problems. Experimental results show no recognition quality loss in these problems.

2. BIPOLAR MORPHOLOGICAL LAYER MODEL

We propose to approximate neural network layers, whose neurons calculate a linear combination of inputs, by a morphological structure.
Such layers can be fully-connected or convolutional; they are normally the most computationally complex part of a neural network. Let us describe the proposed morphological structure. We call it a bipolar morphological layer (BM layer), or a bipolar morphological neuron (BMN) for a single neuron. The word "bipolar" here refers to the two computational paths, which model the excitation and inhibition processes in the neuron. In biology, bipolar neurons are mostly responsible for perception and can be found, for example, in the retina.

Our main idea is to represent the major amount of computation using max/min and addition operations. The structure of the BM layer is inspired by an approximation of the classical layer. The calculation of one neuron can be expressed as 4 neurons placed in parallel, followed by addition/subtraction:

$$\sum_{i=1}^{N} x_i w_i = \sum_{i=1}^{N} p^{00}_i x_i w_i - \sum_{i=1}^{N} p^{01}_i x_i |w_i| - \sum_{i=1}^{N} p^{10}_i |x_i| w_i + \sum_{i=1}^{N} p^{11}_i |x_i||w_i|,$$

where

$$p^{kj}_i = \begin{cases} 1, & \text{if } (-1)^k x_i > 0 \text{ and } (-1)^j w_i > 0, \\ 0, & \text{otherwise.} \end{cases}$$

The values $p^{kj}_i$ define the connections of the new neurons. For each of them we can consider the inputs and weights as positive values and perform the approximation. Let us denote

$$M = \max_j (x_j w_j), \qquad k = \frac{\sum_{i=1}^{N} x_i w_i}{M} - 1.$$

The approximation is:

$$\sum_{i=1}^{N} x_i w_i = \exp\Big\{\ln \sum_{i=1}^{N} x_i w_i\Big\} = \exp\{\ln M(1+k)\} = (1+k)\exp \ln M = (1+k)\exp\Big(\ln \max_j (x_j w_j)\Big) =$$
$$= (1+k)\exp \max_j \ln(x_j w_j) = (1+k)\exp \max_j (\ln x_j + \ln w_j) = (1+k)\exp \max_j (y_j + v_j) \approx \exp \max_j (y_j + v_j),$$

where $y_j = \ln x_j$ are the new inputs and $v_j = \ln w_j$ are the new weights. Obviously, the approximation is accurate when $k \ll 1$. Since $0 \le k \le N - 1$, the best case is a sum containing only one non-zero term ($k = 0$), and the worst case is a sum with all terms equal ($k = N - 1$). In the worst case the real value of the sum is $N$ times larger than the approximated one, which cannot be called a good approximation.
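As a quick numerical illustration of the best and worst cases above, here is a minimal NumPy sketch (the helper name `morph_approx` is ours, not from the paper): the one-path approximation is nearly exact when a single term dominates ($k \approx 0$) and underestimates the true sum by a factor of $N$ when all $N$ terms are equal ($k = N - 1$).

```python
import numpy as np

def morph_approx(x, w):
    """Approximate sum(x * w) by exp(max_j(ln x_j + ln w_j)) for positive x, w."""
    return float(np.exp(np.max(np.log(x) + np.log(w))))

# Best case, k ~ 0: one dominant term, the approximation is nearly exact.
x = np.array([10.0, 1e-6, 1e-6, 1e-6])
w = np.array([2.0, 1e-6, 1e-6, 1e-6])
print(np.dot(x, w), morph_approx(x, w))   # both ~20

# Worst case, k = N - 1: all N = 8 equal terms, the true sum is 8x larger.
x = np.ones(8)
w = np.ones(8)
print(np.dot(x, w), morph_approx(x, w))   # 8.0 vs 1.0
```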
However, even an approximation with a well-limited absolute error can lead to an unpredictable decrease of neural network accuracy due to the strong non-linearity between layers. For example, low-precision neural networks do not show high accuracy after direct conversion, yet give excellent results with the help of special training approaches like the one in [16]. Thus, the accuracy of the single-neuron approximation should not be a decisive criterion in the case of neural networks. It is much more important whether the approximation leads to a high resulting quality of the network. In this paper we investigate the accuracy of the introduced approximation after conversion and training.

The approximation shown above leads to the BM layer structure presented in Fig. 1. The rectifier (ReLU) takes the values above zero and creates four computational paths for positive and negative inputs and coefficients. We then take the logarithm of the rectified input $X$ and perform the essential morphological operation of the layer:

$$(X \circledast V^j_i)_k = \max_l \big(X_l + (V^j_i)_{kl}\big).$$

The results $Y^{00}, Y^{01}, Y^{10}, Y^{11}$ of the four paths are passed to the exponential unit and combined by subtraction to obtain the output $Y$:

$$Y^0 := Y^{00} - Y^{01}, \qquad Y^1 := Y^{10} - Y^{11}, \qquad Y := Y^0 - Y^1.$$

Figure 1. The structure of a BM layer with input vector $X$ and weight matrix $V^j_i$. The $\circ$ symbol designates function composition.

The BM layer will produce results similar to the original in the case of a good logarithm approximation, which means that the sum has one major term. If there are several dominant terms, we can take this into account in our approximation. The ln and exp operations are performed on the activation (the signal transmitted between network layers) and can be considered part of an activation function.
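The four-path structure of Fig. 1 can be sketched for a single neuron in NumPy as follows (a hypothetical sketch under our own naming; `bmn` and `bmn_weights` are illustrative helpers, not code from the paper). Missing paths are encoded as $-\infty$ log-weights, which `exp` turns into exact zeros:

```python
import numpy as np

def bmn_weights(w):
    """Split weights into the two BM paths: v0 for positive w, v1 for negative."""
    with np.errstate(divide="ignore"):
        logw = np.log(np.abs(w))          # ln|w|; w == 0 maps to -inf
    v0 = np.where(w > 0, logw, -np.inf)
    v1 = np.where(w < 0, logw, -np.inf)
    return v0, v1

def bmn(x, w):
    """Bipolar morphological neuron: four max-plus paths combined by +/-."""
    v0, v1 = bmn_weights(w)
    with np.errstate(divide="ignore"):
        lx_pos = np.log(np.maximum(x, 0.0))    # ln ReLU(x)
        lx_neg = np.log(np.maximum(-x, 0.0))   # ln ReLU(-x)
    return (np.exp(np.max(lx_pos + v0)) - np.exp(np.max(lx_pos + v1))
            - np.exp(np.max(lx_neg + v0)) + np.exp(np.max(lx_neg + v1)))

x = np.array([3.0, -0.01, 0.01])
w = np.array([2.0, 0.5, -0.5])
print(np.dot(x, w), bmn(x, w))   # ~5.99 for both
```

Here each of the four paths happens to contain a single non-zero term ($k = 0$), so the BMN output matches the dot product almost exactly; with several comparable terms per path it would underestimate, as discussed above.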
Normally the activation functions do not make a significant contribution to the computational complexity of a neural network, so this increase in complexity should be of little consequence. If the activation function nevertheless takes noticeable time, we can approximate it by a simpler function, for example a piecewise linear one. One more option is to perform input quantization and use look-up tables, which is also fast. As a result, the BM neuron can be expressed as follows:

$$BMN(x, w) = \exp \max_j \big(\ln \mathrm{ReLU}(x_j) + v^0_j\big) - \exp \max_j \big(\ln \mathrm{ReLU}(x_j) + v^1_j\big) -$$
$$- \exp \max_j \big(\ln \mathrm{ReLU}(-x_j) + v^0_j\big) + \exp \max_j \big(\ln \mathrm{ReLU}(-x_j) + v^1_j\big),$$

where

$$v^k_j = \begin{cases} \ln |w_j|, & \text{if } (-1)^k w_j > 0, \\ -\infty, & \text{otherwise.} \end{cases}$$

If a layer of the neural network includes a bias added to the linear combination, the bias can be added after the proposed BM approximation.

3. TRAINING METHOD

Let us introduce a method for obtaining a neural network with BM layers. Training BM layers with standard algorithms can be challenging: because of the max operation there is only one non-zero gradient element, so only one weight is updated at each iteration. Some weights may never be updated and never fire after training, leaving parts of the network redundant. We therefore exploit the approximative nature of BM layers and train the network layer by layer. The idea is to modify the convolutional and fully-connected layers sequentially from the first to the last, freeze the modified structure and weights, and train the remaining layers. This approach allows us to fine-tune the network, avoid the issues of training BMNs directly, and still adapt the network to the changes and preserve calculation accuracy. We have obtained good results with this layer-by-layer method when training 8-bit integer neural networks [8].
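Schematically, the control flow of this layer-by-layer procedure looks as follows. This is only a sketch: the `Layer` class, its flags, and `train_step` are illustrative stand-ins for a real framework, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    kind: str              # "conv", "fc", "relu", ...
    trainable: bool = True
    is_bm: bool = False    # True once converted to a BM layer

def layer_by_layer_conversion(net, train_step):
    """Sketch: walk conv/fc layers front to back, convert each to its BM
    approximation, freeze it, then fine-tune the remaining layers."""
    for layer in net:
        if layer.kind not in ("conv", "fc"):
            continue
        layer.is_bm = True        # approximate the layer (weight conversion)
        layer.trainable = False   # freeze the converted structure and weights
        train_step(net)           # standard training of the unfrozen remainder

net = [Layer("conv1", "conv"),
       Layer("relu1", "relu", trainable=False),  # activations have no weights
       Layer("fc1", "fc")]
history = []
layer_by_layer_conversion(
    net, lambda n: history.append([l.name for l in n if l.trainable]))
print(history)   # [['fc1'], []]
```

After the loop every conv/fc layer is in BM form and frozen; `history` records which layers were still trainable at each fine-tuning stage.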
The idea of dividing the weights into groups and performing approximation and fine-tuning until the full network is approximated was introduced in [16] for lossless low-bit quantization. As a result, the first training method can be summarized as:

Algorithm 1: Training of a BM network
Data: training data
Result: neural network with BM layers
1. Train a classical neural network by standard methods;
2. For each conv and fc layer:
3.   Approximate the current layer and freeze its weights;
4.   Train the remaining part of the network by standard methods;
5. Perform steps 1-4 several times with different initial conditions and choose the best result.

We also present a second training method, which differs from method 1 at stages 3 and 4: the weights are not frozen, and the BM layers are trained together with the whole network. In this case we can face convergence issues and a slower training process, but we try to avoid them by initializing with the converted weights, which we expect to be close to the desired values. The next step in the algorithm's development is neuron-by-neuron fine-tuning, in which approximation and weight freezing in a layer are performed for only one neuron at a time.

4. EXPERIMENTAL RESULTS

4.1 MNIST

MNIST is a database of handwritten digit images and a standard dataset for image classification problems. The training set consists of 60000 gray images of 28 x 28 pixels [24]; there is also a test set of 10000 images. We use 10% of the training set for validation and the rest for training. Examples of MNIST images are shown in Fig. 2a.

We consider 2 simple convolutional neural networks (CNNs) and analyze the accuracy obtained after replacing convolutional and fully-connected layers with BM layers.
The notation used to represent the different architectures is as follows:
conv(n, wx, wy) - convolutional layer with n filters of size wx x wy;
fc(n) - fully-connected layer with n neurons;
maxpool(wx, wy) - max-pooling layer with a window of size wx x wy;
dropout(p) - dropout of the input signals with probability p;
relu - rectifier activation function ReLU(x) = max(x, 0);
softmax - standard softmax activation function.

The CNN 1 architecture is: conv1(30, 5, 5) - relu1 - dropout1(0.2) - fc1(10) - softmax1.

The CNN 2 architecture is: conv1(40, 5, 5) - relu1 - maxpool1(2, 2) - conv2(40, 5, 5) - relu2 - fc1(200) - relu3 - dropout1(0.3) - fc2(10) - softmax1.

For CNN 1 and CNN 2 we perform layer-by-layer conversion to the BM network. Table 1 shows the resulting accuracy depending on the converted part. The converted part "none" corresponds to the original classical network. For methods 1 and 2 we report the accuracy before fine-tuning (the BM layer's weights are not trained) and after fine-tuning (the whole network is trained). All values were averaged over 10 runs with random initialization.

The results for training without BM layers show a moderate accuracy decrease for the convolutional layers and a dramatic decline for the fully-connected layers, probably due to a drop in approximation quality for these layers. Moreover, accuracy with two convolutional layers converted is not much better than with the fully-connected layers only, which means that BM convolutions without training perform only slightly better than random. However, training the converted layers yields almost no accuracy decrease after convolutional layer conversion and better results for the fully-connected layers.
These results do not necessarily mean that a BM neural network with converted fully-connected layers cannot reach the accuracy of classical networks, because training methods for BM neurons are still to be investigated. In our case conversion works excellently for convolutional layers but gives poor results for fully-connected ones. At the same time, convolutional layers take the majority of the inference time, so the proposed method is well suited for speeding up inference. An area of future research is the development of new training approaches to achieve state-of-the-art quality for multiplication-free neural networks.

Figure 2. Example images from the training datasets: a) MNIST, b) MRZ.

Table 1. MNIST recognition accuracy for neural networks with BM layers (accuracy, %; "before" = before fine-tuning, "after" = after fine-tuning).

| CNN   | Converted part                                                             | Method 1, before | Method 1, after | Method 2, before | Method 2, after |
|-------|----------------------------------------------------------------------------|------------------|-----------------|------------------|-----------------|
| CNN 1 | none                                                                       | 98.72            | -               | 98.72            | -               |
| CNN 1 | conv1                                                                      | 42.47            | 98.51           | 38.38            | 98.76           |
| CNN 1 | conv1 - relu1 - dropout1 - fc1                                             | 26.89            | -               | 19.86            | 94.00           |
| CNN 2 | none                                                                       | 99.45            | -               | 99.45            | -               |
| CNN 2 | conv1                                                                      | 94.90            | 99.41           | 94.57            | 99.42           |
| CNN 2 | conv1 - relu1 - maxpool1 - conv2                                           | 21.25            | 98.68           | 36.23            | 99.37           |
| CNN 2 | conv1 - relu1 - maxpool1 - conv2 - relu2 - fc1                             | 10.01            | 74.95           | 17.25            | 99.04           |
| CNN 2 | conv1 - relu1 - maxpool1 - conv2 - relu2 - fc1 - dropout1 - relu3 - fc2    | 12.91            | -               | 48.73            | 97.86           |

4.2 MRZ symbols

Here we use a private dataset of about 2.8 x 10^5 gray images of 21 x 17 pixels in size. The images contain 37 MRZ symbols, extracted from real documents with a machine-readable zone. We use 10% of the data for validation and 9.4 x 10^4 additional images for testing.

The CNN 3 architecture is: conv1(8, 3, 3) - relu1 - conv2(30, 5, 5) - relu2 - conv3(30, 5, 5) - relu3 - dropout1(0.25) - fc1(37) - softmax1.

The CNN 4 architecture is: conv1(8, 3, 3) - relu1 - conv2(8, 5, 5) - relu2 - conv3(8, 3, 3) - relu3 - dropout1(0.25) - conv4(12, 5, 5) - relu4 - conv5(12, 3, 3) - relu5 - conv6(12, 1, 1) - relu6 - fc1(37) - softmax1.
Conversion results are shown in Table 2. The converted part "none" corresponds to the original classical network. For methods 1 and 2 we report the accuracy before fine-tuning (the BM layer's weights are not trained) and after fine-tuning (the whole network is trained). All values were averaged over 10 runs with random initialization.

The recognition accuracy differs only slightly for the first two converted layers when the remaining part of the network is trained (with frozen BM layers), but then it decreases significantly. A possible reason for this effect is the difficulty of adapting to the new approximate features extracted by the BM convolutional layers. However, training the fully converted network including the BM layers shows no significant accuracy decline for all convolutional layers, but visible degradation for the fully-connected layers. As a result, we currently recommend converting and training convolutional layers only to keep the original recognition quality, while BM neurons for fully-connected layers require further investigation.

Table 2. MRZ recognition accuracy for neural networks with BM layers.
| CNN   | Converted part                                                                                                  | Method 1, before | Method 1, after | Method 2, before | Method 2, after |
|-------|-----------------------------------------------------------------------------------------------------------------|------------------|-----------------|------------------|-----------------|
| CNN 3 | none                                                                                                            | 99.63            | -               | 99.63            | -               |
| CNN 3 | conv1                                                                                                           | 97.76            | 99.64           | 83.07            | 99.62           |
| CNN 3 | conv1 - relu1 - conv2                                                                                           | 8.59             | 99.47           | 21.12            | 99.58           |
| CNN 3 | conv1 - relu1 - conv2 - relu2 - conv3                                                                           | 3.67             | 98.79           | 36.89            | 99.57           |
| CNN 3 | conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - fc1                                                  | 12.58            | -               | 27.84            | 93.38           |
| CNN 4 | none                                                                                                            | 99.67            | -               | 99.67            | -               |
| CNN 4 | conv1                                                                                                           | 91.20            | 99.66           | 93.71            | 99.67           |
| CNN 4 | conv1 - relu1 - conv2                                                                                           | 6.14             | 99.52           | 73.79            | 99.66           |
| CNN 4 | conv1 - relu1 - conv2 - relu2 - conv3                                                                           | 23.58            | 99.42           | 70.25            | 99.66           |
| CNN 4 | conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4                                                | 29.56            | 99.04           | 77.92            | 99.63           |
| CNN 4 | conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5                                | 34.18            | 98.45           | 17.08            | 99.64           |
| CNN 4 | conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5 - relu5 - conv6                | 5.83             | 98.00           | 90.46            | 99.61           |
| CNN 4 | conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5 - relu5 - conv6 - relu6 - fc1  | 4.70             | -               | 27.57            | 95.46           |

5. CONCLUSION AND DISCUSSION

In this paper we propose a new bipolar morphological neuron, which approximates the classical neuron. We show how to convert a layer of a classical neural network to a BM layer and introduce an approach to training: layer-by-layer conversion to BM layers with training of the remaining part of the network using standard methods. In this way the approach allows us to avoid training issues, such as updating only one weight at each step due to the maximum operations. We demonstrate that the recognition accuracy of BM networks with only the convolutional layers converted is close to that of the original classical networks on the MNIST and MRZ datasets.

Bipolar morphological neural networks open new possibilities for speeding up neural network inference, because addition/subtraction and maximum/minimum have lower latency than multiplication on most modern devices.
FPGA implementations of BM neural networks can be simpler and more energy efficient, because they do not require multiplication units for convolutions. The proposed BM model uses complex activation functions, but they take much less time than a convolutional or fully-connected layer, because they are applied to the activation signal between layers and have only linear complexity. Furthermore, the activation functions can be approximated, implemented via look-up tables, and computed even faster. It should be noted that state-of-the-art methods for speeding up inference, such as low-precision computations, pruning, or structure simplifications, can also be applied to the BM model; however, the accuracy of the resulting networks is still to be determined. One more advantage of the structure is that BM layers can be included in already existing architectures without restricting them in any way. For example, classical morphological neural networks do not allow stacking many layers to increase quality, while here we can vary the number of BM layers without training concerns.

This work opens the way for wide research into bipolar morphological neural networks as a method that gives recognition accuracy close to that of classical neural networks (especially when converting only the computationally complex parts) with better inference speed. The introduced training method can be further developed to allow, for example, training from scratch and to improve the results for BM fully-connected layers.

ACKNOWLEDGMENTS

The reported study was partially supported by RFBR, research projects 17-29-03240 and 18-07-01384.

REFERENCES

[1] N. Skoryukina, D. P. Nikolaev, A. Sheshkus, and D. Polevoy, "Real time rectangular document detection on mobile devices," in Seventh International Conference on Machine Vision (ICMV 2014), 9445, 94452A, International Society for Optics and Photonics (2015). DOI: 10.1117/12.2181377.
[2] K. Bulatov, V. V. Arlazarov, T. Chernov, O. Slavin, and D. Nikolaev, "Smart IDReader: Document recognition in video stream," in CBDAR 2017, 39-44 (2018). DOI: 10.1109/ICDAR.2017.347.
[3] N. Abramov, A. Talalaev, and V. Fralenko, "Intelligent telemetry data analysis for diagnosing of the spacecraft hardware," Journal of Information Technologies and Computing Systems (01), 64-75 (2016).
[4] O. S. Avsentiev, T. V. Meshcheryakova, and V. V. Navoev, "Sequential application of the hierarchy analysis method and associative training of a neural network in examination problems," Vestnik YuUrGU. Ser. Mat. Model. Progr. 10(3), 142-147 (2017). DOI: 10.14529/mmp170312.
[5] A. Sheshkus, D. P. Nickolaev, A. Ingacheva, and N. Skoruykina, "Approach to recognition of flexible form for credit card expiration date recognition as example," in ICMV 2015, A. V. P. R. D. Nikolaev, ed., 9875, 1-5, SPIE (Dec. 2015). DOI: 10.1117/12.2229534.
[6] S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan, "Deep learning with limited numerical precision," in Proceedings of the 32nd International Conference on Machine Learning (ICML-15), D. Blei and F. Bach, eds., 1737-1746, JMLR Workshop and Conference Proceedings (2015).
[7] E. L. Denton, W. Zaremba, J. Bruna, Y. LeCun, and R. Fergus, "Exploiting linear structure within convolutional networks for efficient evaluation," in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Weinberger, eds., 1269-1277, Curran Associates, Inc. (2014).
[8] D. Ilin, E. Limonova, V. Arlazarov, and D. Nikolaev, "Fast integer approximations in convolutional neural networks using layer-by-layer training," in Ninth International Conference on Machine Vision, 103410Q, International Society for Optics and Photonics (2017). DOI: 10.1117/12.2268722.
[9] E. E. Limonova, A. V. Sheshkus, A. A. Ivanova, and D. P. Nikolaev, "Convolutional neural network structure transformations for complexity reduction and speed improvement," Pattern Recognition and Image Analysis 28(1), 24-33 (2018).
[10] V. Vanhoucke, A. Senior, and M. Z. Mao, "Improving the speed of neural networks on CPUs," in Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011 (2011).
[11] G. X. Ritter and P. Sussner, "An introduction to morphological neural networks," Proceedings of 13th International Conference on Pattern Recognition 4, 709-717 (1996).
[12] D. A. Patterson and J. L. Hennessy, Computer Organization and Design, Fourth Edition: The Hardware/Software Interface (The Morgan Kaufmann Series in Computer Architecture and Design), Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 4th ed. (2008).
[13] S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou, "DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients," arXiv preprint arXiv:1606.06160 (2016).
[14] M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio, "Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1," arXiv preprint arXiv:1602.02830 (2016).
[15] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, "XNOR-Net: ImageNet classification using binary convolutional neural networks," in European Conference on Computer Vision, 525-542, Springer (2016).
[16] A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen, "Incremental network quantization: Towards lossless CNNs with low-precision weights," (02 2017).
[17] Y. Choukroun, E. Kravchik, and P. Kisilev, "Low-bit quantization of neural networks for efficient inference," (02 2019).
[18] P. Sussner and E. L. Esmi, Constructive Morphological Neural Networks: Some Theoretical Aspects and Experimental Results in Classification, 123-144, Springer Berlin Heidelberg, Berlin, Heidelberg (2009).
[19] G. X. Ritter, L. Iancu, and G. Urcid, "Morphological perceptrons with dendritic structure," in The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ '03., 2, 1296-1301 (May 2003).
[20] G. X. Ritter and G. Urcid, "Lattice algebra approach to single-neuron computation," IEEE Transactions on Neural Networks 14, 282-295 (March 2003).
[21] H. Sossa and E. Guevara, "Efficient training for dendrite morphological neural networks," Neurocomputing 131, 132-142 (2014).
[22] E. Zamora and H. Sossa, "Dendrite morphological neurons trained by stochastic gradient descent," in 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 1-8 (Dec 2016).
[23] R. Mondal, S. Santra, and B. Chanda, "Dense morphological network: an universal function approximator," arXiv preprint arXiv:1901.00109 (2019).
[24] THE MNIST DATABASE of handwritten digits, http://yann.lecun.com/exdb/mnist/.
[25] Machine-readable passport, https://en.wikipedia.org/wiki/Machine-readable_passport.
