Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery

Notice: This research summary and analysis were automatically generated using AI technology. For exact details, please refer to the original arXiv source.

We consider the problem of using a factor model we call spike-and-slab sparse coding (S3C) to learn features for a classification task. The S3C model resembles both the spike-and-slab RBM and sparse coding. Since exact inference in this model is intractable, we derive a structured variational inference procedure and employ a variational EM training algorithm. Prior work on approximate inference for this model has not prioritized the ability to exploit parallel architectures and scale to enormous problem sizes. We present an inference procedure appropriate for use with GPUs which allows us to dramatically increase both the training set size and the number of latent factors. We demonstrate that this approach improves upon the supervised learning capabilities of both sparse coding and the ssRBM on the CIFAR-10 dataset. We evaluate our approach’s potential for semi-supervised learning on subsets of CIFAR-10. We demonstrate state-of-the-art self-taught learning performance on the STL-10 dataset and use our method to win the NIPS 2011 Workshop on Challenges In Learning Hierarchical Models’ Transfer Learning Challenge.


💡 Research Summary

The paper introduces Spike‑and‑Slab Sparse Coding (S3C), a probabilistic latent‑variable model that combines binary “spike” variables indicating whether a latent factor is active with continuous “slab” variables that carry the actual value when the spike is on. This hybrid structure inherits the sparsity‑inducing behavior of traditional sparse coding while preserving the stochastic generative formulation of spike‑and‑slab Restricted Boltzmann Machines (ssRBMs). Exact inference in S3C is intractable because the posterior couples discrete and continuous variables. To address this, the authors develop a structured variational inference scheme that factorizes the approximate posterior as a product of independent Bernoulli distributions for spikes and conditional Gaussians for slabs. The variational parameters (spike logits, slab means, and variances) are updated in closed form using block‑coordinate ascent, and all updates can be expressed as matrix‑vector operations, making the algorithm highly amenable to GPU acceleration.
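To make the structure concrete, here is a minimal NumPy sketch of block-coordinate ascent on a spike-and-slab variational posterior for a single observation. It assumes an isotropic visible noise precision and uses standard spike-and-slab conjugacy formulas; the variable names, the sequential (rather than GPU-parallelized) update order, and the exact form of the spike log-odds are illustrative choices, not the paper's precise equations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: D visible units, K latent factors (values are arbitrary).
D, K = 16, 8
W = rng.standard_normal((D, K)) * 0.1   # dictionary; column i is filter W_i
alpha = np.ones(K)                       # slab prior precisions
beta = 10.0                              # isotropic visible noise precision
b = -2.0 * np.ones(K)                    # spike prior log-odds (sparse prior)
mu = np.zeros(K)                         # slab prior means

v = rng.standard_normal(D)               # one observation

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Variational parameters: h_hat[i] = Q(h_i = 1), s_hat[i] = E_Q[s_i | h_i = 1].
h_hat = np.full(K, 0.5)
s_hat = np.zeros(K)

# Posterior slab precision given h_i = 1: closed form, constant during ascent.
post_prec = alpha + beta * np.sum(W ** 2, axis=0)

for sweep in range(20):                  # block-coordinate ascent sweeps
    for i in range(K):
        # Residual with factor i's expected contribution removed.
        recon = W @ (h_hat * s_hat)
        r = v - (recon - W[:, i] * h_hat[i] * s_hat[i])
        # Closed-form slab mean update (Gaussian conjugacy).
        s_hat[i] = (beta * W[:, i] @ r + alpha[i] * mu[i]) / post_prec[i]
        # Spike update: prior log-odds plus the evidence gained by letting
        # factor i explain the residual (standard spike-and-slab log-odds).
        gain = (0.5 * post_prec[i] * s_hat[i] ** 2
                - 0.5 * alpha[i] * mu[i] ** 2
                + 0.5 * np.log(alpha[i] / post_prec[i]))
        h_hat[i] = sigmoid(b[i] + gain)
```

Each inner update is a dot product against a residual, so in a vectorized implementation the same sweeps become the matrix-vector operations that make the method GPU-friendly.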

Training proceeds via a variational Expectation‑Maximization (EM) loop. In the E‑step, the structured variational inference yields expectations of the latent variables for each data point. In the M‑step, the expected complete‑data log‑likelihood is maximized with respect to the model parameters: the weight matrix, slab prior precision, and spike prior probability. Because the expectations are already computed, the M‑step reduces to standard linear‑regression‑type updates for the weights and conjugate‑prior updates for the hyper‑parameters. The entire pipeline operates on mini‑batches, allowing the model to scale to large datasets and thousands of latent factors.
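The M-step's "linear-regression-type" weight update can be sketched as follows. This is a hedged toy version: the E-step expectations are faked with random sparse codes, the second moment E[zzᵀ] is approximated by outer products of the means plus a small diagonal standing in for the posterior covariance, and the hyper-parameter refits are illustrative closed forms rather than the paper's exact updates.

```python
import numpy as np

rng = np.random.default_rng(1)
D, K, N = 16, 8, 200                     # visible dim, factors, batch size

# Stand-ins for E-step outputs on one mini-batch (names are illustrative):
# E_h[n, i] = Q(h_i = 1) and E_hs[n, i] = E_Q[h_i * s_i] for example n.
E_h = rng.random((N, K))
E_hs = E_h * rng.standard_normal((N, K))
V = rng.standard_normal((N, D))          # the mini-batch of data

# Weight update: least-squares regression of the data onto the expected
# codes. G approximates the summed second moment E[z z^T]; the small
# diagonal term stands in for the omitted posterior covariance.
G = E_hs.T @ E_hs + 1e-3 * np.eye(K)
W_new = np.linalg.solve(G, E_hs.T @ V).T     # shape (D, K)

# Spike prior refit: match the average spike activation probability,
# then convert back to log-odds (a conjugate-style moment match).
p_spike = E_h.mean(axis=0)
b_new = np.log(p_spike) - np.log1p(-p_spike)
```

Because every quantity here is a batched matrix product or a linear solve, the M-step parallelizes over mini-batches just as readily as the E-step.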

Empirical evaluation focuses on image classification benchmarks. On CIFAR‑10, a model with 1,600 latent factors and 256‑pixel filters trained with the proposed GPU‑friendly variational EM outperforms both ssRBM and conventional sparse coding in fully supervised settings, achieving over 78 % accuracy compared to roughly 73 % for ssRBM. In semi‑supervised experiments where only 10 % of the training labels are available, S3C still maintains an accuracy advantage of 3‑5 % over competing methods, demonstrating its ability to extract useful structure from unlabeled data.

The authors also test self‑taught learning on STL‑10, pre‑training S3C on 100 k unlabeled images and then training a linear SVM on the learned features. The resulting classifier reaches 81.2 % accuracy, surpassing the previous state‑of‑the‑art by a notable margin. Finally, the model won the NIPS 2011 Workshop “Challenges In Learning Hierarchical Models” Transfer Learning Challenge, confirming that features learned by S3C transfer effectively across domains.

Key contributions are: (1) a structured variational inference algorithm that exploits parallel hardware, enabling training on massive datasets and high‑dimensional latent spaces; (2) a spike‑and‑slab formulation that simultaneously enforces sparsity and retains a principled probabilistic interpretation; and (3) comprehensive experimental evidence that S3C delivers superior performance in supervised, semi‑supervised, and transfer‑learning scenarios. The paper suggests future work on deep hierarchical extensions, integration of modern stochastic optimizers, and application to non‑image modalities such as text or time‑series data.

