Classification of Log-Polar-Visual Eigenfaces using Multilayer Perceptron
In this paper we present a simple yet novel approach to the challenges of scaling and rotation of face images in face recognition. The proposed approach registers the training and testing visual face images by a log-polar transformation, which can absorb the complications introduced by scaling and rotation. The log-polar images are projected into an eigenspace and finally classified using an improved multilayer perceptron. In the experiments we used the ORL face database and the Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) database for visual face images. Experimental results show that the proposed approach significantly improves recognition performance from visual to log-polar-visual face images. For the ORL face database, the recognition rate rises from 89.5% on visual face images to 97.5% on log-polar-visual face images, while for the OTCBVS database it rises from 87.84% to 96.36%.
💡 Research Summary
The paper tackles two persistent challenges in visual face recognition—variations in scale and rotation—by introducing a log‑polar transformation as a preprocessing step, followed by conventional eigenface (PCA) feature extraction and classification with an improved multilayer perceptron (MLP). The log‑polar mapping converts a face image’s scaling and rotation into simple translations along the radial (log‑radius) and angular axes, respectively. Consequently, faces that are rotated or resized in the Cartesian domain become nearly translation‑invariant in the log‑polar domain, which greatly simplifies subsequent statistical modeling.
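The mapping described above can be sketched in a few lines of NumPy. The paper does not give implementation details, so the sampling grid, output size, and nearest-neighbour interpolation below are assumptions; the essential property is that rows index log-radius and columns index angle, so scaling the input shifts rows and rotating it shifts columns.

```python
import numpy as np

def log_polar(img, out_shape=(64, 64)):
    """Resample a 2-D grayscale image onto a log-polar grid (a sketch;
    grid choice and interpolation are our assumptions, not the paper's).

    Rows index log-radius, columns index angle, so a scaling of the
    input becomes a row shift and a rotation becomes a column shift.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0      # assume the face is centered
    max_r = min(cy, cx)
    n_rho, n_theta = out_shape
    # log-spaced radii from 1 pixel out to the image border
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    r, t = np.meshgrid(rho, theta, indexing="ij")
    ys = np.clip(np.rint(cy + r * np.sin(t)).astype(int), 0, h - 1)
    xs = np.clip(np.rint(cx + r * np.cos(t)).astype(int), 0, w - 1)
    return img[ys, xs]                          # nearest-neighbour lookup
```

In practice a library routine such as OpenCV's log-polar warp (with proper interpolation) would replace this sketch; the version above only makes the coordinate change explicit.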
After transformation, the authors apply Principal Component Analysis (PCA) to the set of log‑polar images, constructing an eigenspace that captures the dominant variance of the normalized data. Because the log‑polar step already suppresses much of the geometric variability, the PCA basis vectors more faithfully represent intrinsic facial structure rather than artefacts of pose or size. Each transformed image is projected onto this eigenspace, yielding a compact feature vector.
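A minimal eigenface step over the flattened log-polar images can be written with an SVD; the number of retained components `k` is an assumption here, since the paper's summary does not state it.

```python
import numpy as np

def fit_eigenspace(X, k):
    """Build a k-dimensional eigenspace from flattened images (one per
    row of X). Rows of the returned basis are the eigenfaces, i.e. the
    leading principal directions of the centred data."""
    mean = X.mean(axis=0)
    # SVD of the centred data avoids forming the full covariance matrix
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(x, mean, basis):
    """Project an image (or a batch of images) onto the eigenspace,
    yielding the compact feature vector fed to the classifier."""
    return (x - mean) @ basis.T
```

The same `mean` and `basis` fitted on the training set are reused to project test images, so train and test features live in the same coordinate system.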
These vectors are fed into a multilayer perceptron. The authors refer to the network as an “improved” MLP, implying that they tuned the number of hidden layers, neuron counts, learning rate, momentum, and possibly regularization to achieve better non‑linear discrimination than a single‑layer perceptron. Training proceeds via standard back‑propagation, and the output layer produces a one‑hot encoding of the subject identity.
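Since the paper does not disclose the exact architecture, the following is only a plausible sketch of such a classifier: one hidden layer, softmax outputs over subject identities, and back-propagation with a momentum term (one of the tuning knobs mentioned above). Layer sizes, learning rate, and epoch count are all assumed values.

```python
import numpy as np

def train_mlp(X, y, n_hidden=16, lr=0.2, momentum=0.9, epochs=500, seed=0):
    """One-hidden-layer MLP trained by back-propagation with momentum
    (a sketch; hyperparameters are assumptions, not the paper's values).

    X: (n, d) feature vectors, e.g. eigenspace projections.
    y: integer class labels. Returns a predict(X) -> labels closure.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = int(y.max()) + 1
    Y = np.eye(c)[y]                           # one-hot identity targets
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, c));  b2 = np.zeros(c)
    V = [np.zeros_like(p) for p in (W1, b1, W2, b2)]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)               # hidden activations
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)      # softmax posteriors
        G = (P - Y) / n                        # cross-entropy gradient
        dH = G @ W2.T * (1 - H ** 2)           # back-prop through tanh
        grads = [X.T @ dH, dH.sum(axis=0), H.T @ G, G.sum(axis=0)]
        for i, (p, g) in enumerate(zip((W1, b1, W2, b2), grads)):
            V[i] = momentum * V[i] - lr * g    # momentum update
            p += V[i]
    def predict(Xq):
        return np.argmax(np.tanh(Xq @ W1 + b1) @ W2 + b2, axis=1)
    return predict
```

The argmax over the output layer recovers the predicted subject identity from the one-hot encoding.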
Experimental validation uses two publicly available datasets: the ORL database (40 subjects, 10 frontal images each) and the Object Tracking and Classification Beyond Visible Spectrum (OTCBVS) database, which includes more challenging illumination and background variations. For each dataset the authors evaluate two conditions: (1) raw visual images processed directly by PCA‑MLP, and (2) images first transformed to log‑polar coordinates and then processed by the same pipeline. Results show a substantial boost in recognition rates after log‑polar conversion. On ORL, accuracy rises from 89.5 % (visual) to 97.5 % (log‑polar‑visual). On OTCBVS, accuracy improves from 87.84 % to 96.36 %. The consistent gains across both a controlled dataset (ORL) and a more realistic one (OTCBVS) demonstrate that the log‑polar step effectively normalizes scale and rotation, allowing the PCA‑MLP combination to focus on discriminative facial features.
Key contributions are: (i) the novel application of log‑polar transformation to face recognition, providing inherent scale‑ and rotation‑invariance; (ii) empirical evidence that traditional eigenface methods, when coupled with this transformation, achieve state‑of‑the‑art performance without redesigning the feature extractor; (iii) a thorough comparative study on two diverse databases, confirming the method’s generality.
Nevertheless, the approach has limitations. Log‑polar mapping assumes the face is roughly centered; mis‑alignment can cause distortion, so a reliable face detection and alignment stage is a prerequisite that the paper does not detail. The transformation itself is computationally intensive, potentially hindering real‑time deployment unless optimized (e.g., GPU acceleration or approximation). The experiments are limited to relatively small datasets and do not explore extreme variations such as occlusion, extreme pose angles, or cross‑spectrum imaging. Moreover, the paper lacks a full description of the MLP architecture (layer sizes, activation functions, training epochs), which hampers reproducibility.
Future work could address these issues by (a) integrating robust face detection and alignment with the log‑polar pipeline for a fully automated system; (b) replacing the linear PCA step with deep convolutional or transformer‑based encoders that can learn richer representations from log‑polar images; (c) extending evaluation to larger, more diverse benchmarks (e.g., LFW, MegaFace) and testing robustness to occlusion and illumination changes; (d) optimizing the log‑polar computation through parallel processing or learned approximations; and (e) exploring multimodal fusion of log‑polar and original visual features to exploit complementary information. Such extensions would move the method from a promising research prototype toward practical deployment in security, surveillance, and authentication applications where faces appear at varying scales and orientations.