Greedy Deep Dictionary Learning
Abstract—In this work we propose a new deep learning tool – deep dictionary learning. Multi-level dictionaries are learnt in a greedy fashion – one layer at a time. This requires solving a simple (shallow) dictionary learning problem; the solution to this is well known. We apply the proposed technique on some benchmark deep learning datasets. We compare our results with other deep learning tools like stacked autoencoder and deep belief network; and state-of-the-art supervised dictionary learning tools like discriminative K-SVD and label consistent K-SVD. Our method yields better results than all.
Index Terms—Deep Learning, Dictionary Learning, Feature Extraction
I. INTRODUCTION
In recent years there has been a lot of interest in dictionary learning. However, the concept has been around for much longer. Its application in vision [1] and information retrieval [2] dates back to the late 1990s. In those days the term 'dictionary learning' had not yet been coined; researchers used the term 'matrix factorization'. The goal was to learn an empirical basis from the data, which required decomposing the data matrix into a basis (dictionary) matrix and a feature matrix, hence the name 'matrix factorization'.
The current popularity of dictionary learning owes to K-SVD [3, 4]. K-SVD is an algorithm that decomposes a matrix (training data) into a dense basis and sparse coefficients. However, the concept of such a dense-sparse decomposition predates K-SVD [5]. Since the advent of K-SVD in 2006, there has been a plethora of work on this topic. Dictionary learning can be used both for unsupervised problems (mainly inverse problems in image processing) and for problems arising in supervised feature extraction.
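The dense-sparse decomposition described above can be made concrete. The following is a minimal numpy sketch of shallow dictionary learning by alternating minimization, with a few ISTA iterations standing in for the sparse-coding step and a least-squares update standing in for K-SVD's atom-by-atom update; function names and parameter values are ours, chosen for illustration, not taken from the paper.

```python
import numpy as np

def _sparse_code(D, X, Z, lam, iters=10):
    """A few ISTA iterations for min_Z 0.5||X - D Z||_F^2 + lam ||Z||_1."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        Z = Z - step * D.T @ (D @ Z - X)                       # gradient step
        Z = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0)  # soft threshold
    return Z

def shallow_dictionary_learning(X, n_atoms, lam=0.1, n_iter=20, seed=0):
    """Dense dictionary / sparse codes, X ~ D Z (a simplified K-SVD stand-in).
    X has shape (n_features, n_samples); returns D and sparse Z."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
    Z = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):
        Z = _sparse_code(D, X, Z, lam)             # codes with D fixed
        D = X @ np.linalg.pinv(Z)                  # least-squares dictionary update
        D /= np.linalg.norm(D, axis=0) + 1e-12     # renormalize atoms
    return D, _sparse_code(D, X, Z, lam)           # codes matching the final D
```

K-SVD proper updates one atom at a time via a rank-one SVD; the batch least-squares update above is a coarser but commonly used alternative for illustration.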
Dictionary learning has been used in virtually all inverse problems arising in image processing, from simple image [6, 7] and video [8] denoising and image inpainting [9] to more complex problems like color image restoration [10], inverse halftoning [11] and even medical image reconstruction [12, 13]. Solving inverse problems is not the goal of this work; we are more interested in dictionary learning from the perspective of machine learning, and mention [6-13] only for completeness.
Mathematical transforms like the DCT, wavelets, curvelets, Gabor filters etc. have been widely used in image classification problems [14-16]. These techniques apply the transform as a sparsifying step, followed by statistical feature extraction methods like PCA or LDA, before feeding the features to a classifier. Just as dictionary learning is replacing such fixed transforms (wavelet, DCT, curvelet etc.) in signal processing problems, it is also replacing them in feature extraction scenarios. Dictionary learning gives researchers the opportunity to design dictionaries that yield not only sparse representations (like curvelets, wavelets, the DCT etc.) but also discriminative information.
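As an illustration of the classical pipeline described above (a fixed sparsifying transform followed by statistical feature extraction), the sketch below applies an orthonormal 2-D DCT and then a plain SVD-based PCA. This is a hypothetical example of the generic recipe, not code from the cited works.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal type-II DCT matrix (a fixed, non-learned transform)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def dct_pca_features(images, n_components=16):
    """Fixed transform (2-D DCT) then PCA, as in classical pipelines.
    images: array of shape (n_samples, h, w)."""
    n, h, w = images.shape
    Ch, Cw = dct_matrix(h), dct_matrix(w)
    # 2-D DCT of each image: Ch @ img @ Cw.T
    coeffs = np.einsum('ij,njk,lk->nil', Ch, images, Cw)
    flat = coeffs.reshape(n, -1)
    flat = flat - flat.mean(axis=0)                # center before PCA
    _, _, Vt = np.linalg.svd(flat, full_matrices=False)
    return flat @ Vt[:n_components].T              # project onto top components
```

The features returned here would then be fed to any standard classifier; dictionary learning replaces the fixed `dct_matrix` with a basis learned from the data.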
Initial techniques proposed naïve approaches that learnt a separate dictionary for each class [17-19]. Later approaches incorporated discriminative penalties into the dictionary learning framework. One such technique is to include a softmax discriminative cost function [20-22]; other discriminative penalties include the Fisher discrimination criterion [23], linear predictive classification error [24, 25] and the hinge loss function [26, 27]. In [28, 29] discrimination is introduced by forcing the learned features to map to the corresponding class labels.
All prior studies on dictionary learning (DL) are 'shallow' learning models, just like the restricted Boltzmann machine (RBM) [30] and the autoencoder (AE) [31]. DL, RBM and AE all fall under the broader topic of representation learning. In DL, the cost function is the Euclidean distance between the data and its representation under the learned basis; for the RBM it is the Boltzmann energy; in the AE, the cost is the Euclidean reconstruction error between the data and the decoded representation / features.
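The three costs just contrasted can be written side by side. The notation here is ours (X for data, D for the dictionary, Z for the codes, W and W' for encoder and decoder weights, phi for the activation), not the paper's:

```latex
% Dictionary learning: Euclidean fit of data to a learned basis
\min_{D, Z}\; \|X - D Z\|_F^2 \quad \text{(typically with a sparsity penalty on } Z\text{)}

% Autoencoder: Euclidean error between data and its decoded representation
\min_{W, W'}\; \|X - W' \phi(W X)\|_F^2

% RBM: learning minimizes the negative log-likelihood induced by the energy
E(v, h) = -b^{\top} v - c^{\top} h - v^{\top} W h
```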
At almost the same time that dictionary learning started gaining popularity, researchers in machine learning observed that better (more abstract and compact) representations can be achieved by going deeper. The Deep Belief Network (DBN) is formed by stacking one RBM after another [32, 33]. Similarly, stacked autoencoders (SAEs) are created by nesting one AE inside another [34, 35].
Following the success of the DBN and the SAE, we propose to learn multi-level deep dictionaries. This is the first work on deep dictionary learning. The rest of the paper is organized into several sections….
II. LITERATURE REVIEW
We will briefly review prior studies on dictionary learning,
stacked autoencoders and deep Boltzmann machines.
A. Dictionary Learning
Early studies in dictionary learning wanted to learn a basis for
Snigdha Tariyal, Angshul Majumdar, Member IEEE, Richa Singh, Senior Member IEEE, a