
A Deep Convolutional Auto-Encoder with Pooling - Unpooling Layers in Caffe

February 23, 2026


📝 Original Info

Title: A Deep Convolutional Auto-Encoder with Pooling - Unpooling Layers in Caffe
ArXiv ID: 1701.04949
Date: 2017-01-19
Authors: Volodymyr Turchenko (vtu@uleth.ca), Eric Chalmers (eric.chalmers@uleth.ca), Artur Luczak (luczak@uleth.ca)

📝 Abstract

This paper presents the development of several models of a deep convolutional auto-encoder in the Caffe deep learning framework and their experimental evaluation on the MNIST dataset. We have created five models of a convolutional auto-encoder which differ architecturally by the presence or absence of pooling and unpooling layers in the auto-encoder's encoder and decoder parts. Our results show that the developed models perform very well in dimensionality reduction and unsupervised clustering tasks, and yield small classification errors when the learned internal code is used as the input to a supervised linear classifier and a multi-layer perceptron. The best results were provided by a model where the encoder part contains convolutional and pooling layers, followed by an analogous decoder part with deconvolution and unpooling layers, without the use of switch variables in the decoder part. The paper also discusses practical details of creating a deep convolutional auto-encoder in the widely used Caffe deep learning framework. We believe that the approach and results presented in this paper could help other researchers build efficient deep neural network architectures in the future.


📄 Full Content

A Deep Convolutional Auto-Encoder with Pooling - Unpooling Layers in Caffe

Volodymyr Turchenko, Eric Chalmers, Artur Luczak

Canadian Centre for Behavioural Neuroscience, Department of Neuroscience, University of Lethbridge, 4401 University Drive, Lethbridge, AB, T1K 3M4, Canada, {vtu, eric.chalmers, luczak}@uleth.ca

Abstract – This paper presents the development of several models of a deep convolutional auto-encoder in the Caffe deep learning framework and their experimental evaluation on the MNIST dataset. We have created five models of a convolutional auto-encoder which differ architecturally by the presence or absence of pooling and unpooling layers in the auto-encoder’s encoder and decoder parts. Our results show that the developed models perform very well in dimensionality reduction and unsupervised clustering tasks, and yield small classification errors when the learned internal code is used as the input to a supervised linear classifier and a multi-layer perceptron. The best results were provided by a model where the encoder part contains convolutional and pooling layers, followed by an analogous decoder part with deconvolution and unpooling layers, without the use of switch variables in the decoder part. The paper also discusses practical details of creating a deep convolutional auto-encoder in the widely used Caffe deep learning framework. We believe that the approach and results presented in this paper could help other researchers build efficient deep neural network architectures in the future.

Keywords – Deep convolutional auto-encoder, machine learning, neural networks, dimensionality reduction, unsupervised clustering.

Introduction

An auto-encoder (AE) model is based on an encoder-decoder paradigm, where an encoder first transforms an input into a typically lower-dimensional representation, and a decoder is tuned to reconstruct the initial input from this representation through the minimization of a cost function [1-4]. An AE is trained in an unsupervised fashion, which allows it to extract generally useful features from unlabeled data. AEs and unsupervised learning methods have been widely used in many scientific and industrial applications, mainly for tasks such as network pre-training, feature extraction, dimensionality reduction, and clustering. A classic or shallow AE has only one hidden layer, which is a lower-dimensional representation of the input. In the last decade, the revolutionary success of deep neural network (NN) architectures has shown that deep AEs with many hidden layers in the encoder and decoder parts are the state-of-the-art models in unsupervised learning. In comparison with a shallow AE, when the number of trainable parameters is the same, a deep AE can reproduce the input with lower reconstruction error [5]. A deep AE can extract hierarchical features in its hidden layers and, therefore, substantially improve the quality of the solution to a specific task. One variation of a deep AE [5] is a deep convolutional auto-encoder (CAE) which, instead of fully-connected layers, contains convolutional layers in the encoder part and deconvolution layers in the decoder part. Deep CAEs may be better suited to image processing tasks because they fully utilize the properties of convolutional neural networks (CNNs), which have been shown to provide better results on noisy, shifted (translated) and corrupted image data [6].
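The encoder-decoder paradigm and the reconstruction cost described above can be illustrated with a very small sketch. This is not one of the paper's Caffe models: it is a toy, fully connected, linear auto-encoder written in NumPy, and the layer sizes, learning rate, random data, and all function names (`init_params`, `encode`, `decode`, `train_step`) are assumptions chosen only for illustration. The cost is assumed to be the mean squared reconstruction error.

```python
import numpy as np


def init_params(n_in, n_code, seed=0):
    """Create small random weights for a toy linear encoder and decoder."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(0.0, 0.1, size=(n_in, n_code)),
        "W_dec": rng.normal(0.0, 0.1, size=(n_code, n_in)),
    }


def encode(params, x):
    """Map inputs to the lower-dimensional internal code."""
    return x @ params["W_enc"]


def decode(params, code):
    """Reconstruct inputs from the internal code."""
    return code @ params["W_dec"]


def reconstruction_loss(params, x):
    """Mean squared error between the input and its reconstruction."""
    x_hat = decode(params, encode(params, x))
    return float(np.mean((x_hat - x) ** 2))


def train_step(params, x, lr=0.1):
    """One step of plain gradient descent on the reconstruction loss."""
    code = encode(params, x)
    x_hat = decode(params, code)
    grad_out = 2.0 * (x_hat - x) / x.size          # dL/dx_hat
    grad_w_dec = code.T @ grad_out                  # dL/dW_dec
    grad_w_enc = x.T @ (grad_out @ params["W_dec"].T)  # dL/dW_enc
    params["W_dec"] -= lr * grad_w_dec
    params["W_enc"] -= lr * grad_w_enc


# Hypothetical data: 64 random 16-dimensional inputs, compressed to 4 values.
x = np.random.default_rng(1).random((64, 16))
params = init_params(n_in=16, n_code=4)

loss_before = reconstruction_loss(params, x)
for _ in range(500):
    train_step(params, x)
loss_after = reconstruction_loss(params, x)

print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

No labels appear anywhere in the sketch: the input itself serves as the training target, which is what makes the training unsupervised. A convolutional auto-encoder keeps the same structure but replaces the two matrix products with convolution and deconvolution layers.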
Modern deep learning frameworks, e.g. ConvNet2 [7], Theano with the lightweight extensions Lasagne and Keras [8-10], Torch7 [11], Caffe [12], TensorFlow [13] and others, have become very popular tools in deep learning research since they provide fast deployment of state-of-the-art deep learning models along with established training algorithms (Stochastic Gradient Descent, AdaDelta, etc.), allowing rapid research progress and emerging commercial applications. Moreover, these frameworks implement many current approaches to network initialization, parametrization and regularization, as well as example models. Among its many features, we have chosen the Caffe deep learning framework [12] mainly for two reasons: (i) the description of a deep NN is straightforward, being just a text file describing the layers, and (ii) Caffe has a Matlab wrapper, which is very convenient and allows getting Caffe results directly into a Matlab workspace for further processing (visualization, etc.) [12].
The goal of this paper is to present the practical implementation of several CAE models in the Caffe deep learning framework, as well as experimental results on solving an unsupervised clustering task using the MNIST dataset. This study is an extended version of our paper published on arXiv [14]. All Caffe .prototxt files developed to reproduce our models, along with Matlab-based visualization scripts, are included in the supplementary materials. The paper is organized as fo

…(Full text truncated)…


Reference

This content is AI-processed based on ArXiv data.
