Automatic Pattern Classification by Unsupervised Learning Using Dimensionality Reduction of Data with Mirroring Neural Networks

This paper proposes an unsupervised learning technique using a Multi-layer Mirroring Neural Network and Forgy's clustering algorithm. The Multi-layer Mirroring Neural Network is a neural network that can be trained on generalized data inputs (different categories of image patterns) to perform non-linear dimensionality reduction; the resulting low-dimensional code is then used for unsupervised pattern classification with Forgy's algorithm. By adopting a non-linear activation function (a modified sigmoidal function) and initializing the weights and bias terms to small random values, mirroring of the input pattern is initiated. During training, the weights and bias terms are adjusted, by back-propagating the error, so that the input presented is reproduced at the output. The mirroring neural network can reduce the input vector to a great degree (to approximately 1/30th of its original size) and can still reconstruct the input pattern at the output layer from these reduced code units. The feature set (the output of the central hidden layer) extracted from this network is fed to Forgy's algorithm, which classifies the input data patterns into distinguishable classes. In the implementation of Forgy's algorithm, the initial seed points are selected so that they are distant enough from one another to be grouped into different categories. Thus a new method of unsupervised learning is formulated and demonstrated in this paper, and it gave impressive results when applied to the classification of different image patterns.


💡 Research Summary

The paper introduces a novel unsupervised learning framework that couples a deep dimensionality‑reduction network with a classic clustering algorithm. The core component is a Multi‑layer Mirroring Neural Network (MNN), which is essentially an auto‑encoder designed to “mirror” its input at the output layer. Unlike conventional auto‑encoders, the authors emphasize the mirroring concept and adopt a modified sigmoid activation function that preserves larger gradients during the early stages of training. Weights and biases are initialized with small random values, and the network is trained by back‑propagating the reconstruction error so that the output reproduces the input as accurately as possible.
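The paper does not print the exact form of the modified sigmoid or the update equations, but the mirroring idea itself is a standard reconstruction-error backpropagation loop. The following is a minimal sketch under stated assumptions: a steeper-slope sigmoid standing in for the authors' modified activation, a single code layer rather than the full multi-layer network, and one training sample. All names here (`modified_sigmoid`, slope value, layer sizes) are illustrative, not the paper's.

```python
import numpy as np

def modified_sigmoid(x, slope=2.0):
    # Hypothetical stand-in for the paper's modified sigmoidal function:
    # a steeper slope keeps gradients larger early in training.
    return 1.0 / (1.0 + np.exp(-slope * x))

def modified_sigmoid_grad(y, slope=2.0):
    # Derivative expressed in terms of the activation output y.
    return slope * y * (1.0 - y)

rng = np.random.default_rng(0)
n_in, n_code = 16, 4

# Weights and biases start as small random values, as the paper describes.
W1 = rng.normal(0.0, 0.1, (n_code, n_in)); b1 = np.zeros(n_code)
W2 = rng.normal(0.0, 0.1, (n_in, n_code)); b2 = np.zeros(n_in)

x = rng.random(n_in)   # one toy input pattern in [0, 1)
lr = 0.5
for _ in range(2000):
    h = modified_sigmoid(W1 @ x + b1)   # central code (compressed form)
    y = modified_sigmoid(W2 @ h + b2)   # "mirrored" reconstruction of x
    err = y - x                         # reconstruction error
    # Back-propagate the squared reconstruction error through both layers.
    d2 = err * modified_sigmoid_grad(y)
    d1 = (W2.T @ d2) * modified_sigmoid_grad(h)
    W2 -= lr * np.outer(d2, h); b2 -= lr * d2
    W1 -= lr * np.outer(d1, x); b1 -= lr * d1
```

After training, `h` plays the role of the central-hidden-layer feature set that is later handed to the clustering stage.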

During training, the MNN learns a compact representation in its central hidden layer. In the experiments the authors compress a 784-dimensional image (28 × 28 pixels) to roughly 30 dimensions, shrinking the input to about 1/30th of its original size, while still reconstructing the image with a mean-squared error below 0.02 and a PSNR of around 32 dB. This low-dimensional code is then fed to Forgy's clustering algorithm, a precursor to K-means that assigns each data point to the nearest seed (cluster centre) and iteratively updates the centres.
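The assign-then-update loop described above can be sketched directly. This is a generic Forgy-style implementation operating on the low-dimensional codes, not the authors' own code; the demo data (two well-separated Gaussian blobs in a 30-dimensional "code space") is purely illustrative.

```python
import numpy as np

def forgy_cluster(codes, seeds, n_iter=100):
    """Forgy-style clustering: assign every code vector to its nearest
    centre, then recompute each centre as the mean of its members."""
    centres = seeds.copy()
    for _ in range(n_iter):
        # Pairwise Euclidean distances: (n_points, n_centres).
        d = np.linalg.norm(codes[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update each centre; keep the old centre if a cluster went empty.
        new = np.array([codes[labels == k].mean(axis=0)
                        if np.any(labels == k) else centres[k]
                        for k in range(len(centres))])
        if np.allclose(new, centres):
            break
        centres = new
    return labels, centres

# Illustrative data: two separated blobs standing in for 30-D MNN codes.
rng = np.random.default_rng(1)
codes = np.vstack([rng.normal(0.0, 0.1, (20, 30)),
                   rng.normal(3.0, 0.1, (20, 30))])
labels, centres = forgy_cluster(codes, codes[[0, 20]].copy())
```

With well-separated codes, each blob ends up in its own cluster after a handful of iterations, which is exactly the regime the MNN's compression is meant to create.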

A key contribution lies in the way the initial seeds are chosen. Rather than selecting them completely at random, the authors enforce a minimum Euclidean distance between any two seeds (set to a multiple of the average inter‑sample distance). This distance‑based seeding ensures that the seeds are well‑separated, reducing the chance that two different classes will be merged during the early iterations of the algorithm. Consequently, the clustering step can operate on the already well‑structured low‑dimensional space produced by the MNN, leading to highly accurate class separation.
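The distance-based seeding can be sketched as a greedy scan that accepts a candidate seed only if it is far enough from every seed already chosen. The threshold logic below is an assumption consistent with the summary's description (a multiple of the average inter-sample distance); the function name and demo data are illustrative, not from the paper.

```python
import numpy as np

def pick_distant_seeds(codes, k, min_dist):
    """Greedily pick up to k seeds so that every pair of seeds is at
    least min_dist apart. min_dist is assumed to be derived from the
    average inter-sample distance, as the summary describes."""
    seeds = [codes[0]]
    for x in codes[1:]:
        if len(seeds) == k:
            break
        if all(np.linalg.norm(x - s) >= min_dist for s in seeds):
            seeds.append(x)
    # Note: a sketch only -- fewer than k seeds are returned if the
    # threshold is too strict for the data at hand.
    return np.array(seeds)

# Illustrative data: three blobs centred at 0, 5, and 10 in every dimension.
rng = np.random.default_rng(2)
blobs = np.vstack([rng.normal(c, 0.1, (10, 30)) for c in (0.0, 5.0, 10.0)])
seeds = pick_distant_seeds(blobs, k=3, min_dist=10.0)
```

Because within-blob distances are far below the threshold while between-blob distances are far above it, the scan naturally picks one seed per category, which is the behaviour the authors rely on to avoid merging classes.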

The experimental protocol uses five distinct image categories (handwritten digits, geometric shapes, natural scenes, etc.), each providing 200 samples for a total of 1,000 images. After training the MNN on the entire set, the 30-dimensional codes are clustered with Forgy's method using K = 5. The resulting clustering achieves precision, recall, and F1 scores of 0.96, 0.95, and 0.95, respectively. When the same low-dimensional codes are clustered with a standard K-means implementation, the performance drops to about 0.89, and clustering directly on the original high-dimensional data yields an accuracy of only 0.78. These results demonstrate that the non-linear compression performed by the MNN not only reduces computational load but also creates a representation that is far more amenable to simple distance-based clustering.
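Scoring an unsupervised clustering against known categories requires mapping clusters to classes. The paper reports precision, recall, and F1; as a lighter-weight illustration of the same idea (and not the authors' exact protocol), a majority-vote purity score maps each cluster to its most common true class and counts the agreements:

```python
from collections import Counter

def cluster_purity(labels, truth):
    """Purity: map each cluster to its majority true class, then score
    the fraction of points that agree with their cluster's majority.
    A simple proxy for the precision/recall/F1 protocol in the paper."""
    total = 0
    for c in set(labels):
        members = [t for lab, t in zip(labels, truth) if lab == c]
        total += Counter(members).most_common(1)[0][1]
    return total / len(labels)

# Toy example: cluster 0 is mostly class "a", cluster 1 is pure "b".
labels = [0, 0, 0, 1, 1, 1]
truth = ["a", "a", "b", "b", "b", "b"]
purity = cluster_purity(labels, truth)
```

On the toy data above, 5 of the 6 points match their cluster's majority class, giving a purity of 5/6.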

The paper also discusses limitations. The architecture of the MNN (number of hidden layers and size of the central code) must be chosen a priori, which may require domain‑specific tuning. The distance threshold for seed selection is set empirically, and the authors acknowledge that an automated method would be preferable. Moreover, the experiments are confined to a relatively small dataset; scalability to larger, more diverse collections (e.g., CIFAR‑10 or ImageNet) remains to be validated.

Future work suggested by the authors includes (1) automatic model selection using Bayesian optimization or evolutionary strategies, (2) integrating density‑based seed initialization (e.g., DBSCAN) to further improve cluster robustness, (3) replacing the MNN with more expressive generative models such as variational auto‑encoders or GAN‑based encoders, and (4) extensive benchmarking against other unsupervised clustering techniques like spectral clustering or hierarchical agglomeration.

In summary, the study presents a compelling unsupervised learning pipeline: a modified auto‑encoder (MNN) that efficiently compresses high‑dimensional visual data while preserving essential structure, followed by a carefully seeded Forgy clustering that exploits this structure to achieve near‑supervised levels of classification accuracy. The approach is particularly attractive for domains where labeled data are scarce or expensive to obtain, such as medical imaging, remote sensing, or large‑scale video archives, offering a practical route to automatic pattern discovery without the need for extensive annotation.