SVM-based Multiview Face Recognition by Generalization of Discriminant Analysis

Identity verification of persons from their multiview faces is a challenging real-world problem in machine vision. Multiview faces are difficult to handle because of their non-linear representation in the feature space. This paper demonstrates the applicability of a generalization of LDA, in the form of the canonical covariate, to multiview face recognition. In the proposed work, a Gabor filter bank is used to extract facial features characterized by spatial frequency, spatial locality, and orientation. The Gabor face representation captures a substantial amount of the variation among face instances that arises from changes in illumination, pose, and facial expression. Convolving the Gabor filter bank with face images of rotated profile views produces Gabor faces with high-dimensional feature vectors. The canonical covariate is then applied to the Gabor faces to reduce the high-dimensional feature space to low-dimensional subspaces. Finally, support vector machines are trained on the canonical subspaces, which contain the reduced feature set, and perform the recognition task. The proposed system is evaluated on the UMIST face database. The experimental results demonstrate the efficiency and robustness of the proposed system, with high recognition rates.


💡 Research Summary

The paper tackles the challenging problem of multiview face recognition, where faces captured from different angles exhibit strong non‑linear variations in appearance. The authors propose a three‑stage pipeline that combines (1) Gabor filter‑based feature extraction, (2) dimensionality reduction through Canonical Covariate (CC), a generalization of Linear Discriminant Analysis (LDA), and (3) classification with Support Vector Machines (SVM).

In the first stage, a bank of Gabor filters with multiple frequencies and orientations is convolved with each face image. Gabor responses encode spatial frequency, locality, and orientation, thereby capturing fine‑grained texture, edge, and illumination cues that are robust to pose changes. The responses are vectorised, yielding high‑dimensional feature vectors (often several thousand dimensions).
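The first stage can be sketched in a few lines of NumPy. The kernel size, wavelengths, number of orientations, and envelope parameters below are illustrative assumptions (the paper does not publish its exact filter-bank settings), and the FFT-based circular convolution stands in for whatever convolution scheme the authors used:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma=4.0, gamma=0.5):
    """Real part of a 2-D Gabor kernel (illustrative parameters)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by the orientation theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr) ** 2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

def gabor_features(image, wavelengths=(4, 8), orientations=4, ksize=15):
    """Convolve the image with a small filter bank and flatten the responses."""
    feats = []
    for lam in wavelengths:
        for k in range(orientations):
            theta = k * np.pi / orientations
            kern = gabor_kernel(ksize, lam, theta)
            # FFT-based convolution; zero-pads the kernel to image size.
            resp = np.real(np.fft.ifft2(np.fft.fft2(image) *
                                        np.fft.fft2(kern, s=image.shape)))
            feats.append(np.abs(resp).ravel())
    return np.concatenate(feats)

face = np.random.default_rng(0).random((32, 32))  # stand-in for a face image
vec = gabor_features(face)
print(vec.shape)  # 2 wavelengths x 4 orientations x 32*32 pixels = (8192,)
```

Even this tiny bank of 8 filters on a 32 × 32 image yields an 8192-dimensional vector, which illustrates why the next stage is needed.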

Directly feeding these vectors into a classifier would suffer from the curse of dimensionality and high computational cost. To address this, the second stage applies Canonical Covariate analysis. CC extends LDA by simultaneously considering the between‑class scatter matrix and the within‑class scatter matrix across multiple classes, allowing a more flexible projection when class covariances differ. By solving the generalized eigenvalue problem S_B v = λ S_W v (equivalently, the eigendecomposition of S_W^{-1}S_B), the method selects the eigenvectors associated with the largest eigenvalues, preserving the most discriminative directions while reducing the feature space to a few tens of dimensions. This step dramatically lowers storage and processing demands while retaining class separability.
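The scatter-matrix construction and eigendecomposition described above can be sketched as follows. This is a generic LDA-style implementation, not the authors' exact code; the small ridge term `eps` added to S_W for numerical stability is an assumption:

```python
import numpy as np

def canonical_projection(X, y, n_components):
    """Discriminant projection from the eigendecomposition of S_W^{-1} S_B.

    X: (n_samples, n_features) feature matrix; y: integer class labels.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    S_W = np.zeros((d, d))  # within-class scatter
    S_B = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_W += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        S_B += len(Xc) * (diff @ diff.T)
    eps = 1e-6  # ridge regularizer (assumption, for invertibility)
    evals, evecs = np.linalg.eig(
        np.linalg.solve(S_W + eps * np.eye(d), S_B))
    order = np.argsort(evals.real)[::-1]  # largest eigenvalues first
    return evecs.real[:, order[:n_components]]

# Toy data: 3 classes of 20 samples in 10 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=i, size=(20, 10)) for i in range(3)])
y = np.repeat([0, 1, 2], 20)
W = canonical_projection(X, y, n_components=2)
Z = X @ W  # reduced representation, shape (60, 2)
```

Note that at most C − 1 eigenvalues are nonzero for C classes, which bounds how many useful components the projection can retain.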

The reduced‑dimensional representations are then supplied to an SVM classifier. The authors adopt a one‑vs‑all strategy for the multi‑person scenario and primarily use a linear kernel, noting that the linear decision boundary is sufficient after CC projection. Hyper‑parameters (the regularization constant C) are tuned via cross‑validation. Because the data are already projected into a discriminative low‑dimensional subspace, SVM training is fast and the risk of over‑fitting is mitigated.
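A one-vs-all linear SVM on the projected features can be sketched with a simple batch subgradient solver for the regularized hinge loss. The paper does not specify its solver, so this minimal from-scratch trainer (and its `lam`, `lr`, and `epochs` values) is purely illustrative:

```python
import numpy as np

def train_linear_svm(X, y, lam=1e-3, lr=0.1, epochs=500):
    """Binary linear SVM via subgradient descent on the hinge loss.

    y must be in {-1, +1}. Minimal sketch, not a production solver.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # samples violating the margin
        # Subgradient of lam/2*||w||^2 + mean hinge loss.
        grad_w = lam * w - (y[viol] @ X[viol]) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def one_vs_all(X, y, n_classes):
    """Train one binary SVM per class (one-vs-all strategy)."""
    return [train_linear_svm(X, np.where(y == c, 1.0, -1.0))
            for c in range(n_classes)]

def predict(models, X):
    scores = np.column_stack([X @ w + b for w, b in models])
    return scores.argmax(axis=1)  # class with the highest margin wins

# Toy stand-in for CC-projected features: 3 well-separated clusters in 2-D.
rng = np.random.default_rng(2)
means = np.array([[0, 0], [4, 0], [0, 4]])
X = np.vstack([rng.normal(m, 0.5, size=(30, 2)) for m in means])
y = np.repeat([0, 1, 2], 30)
models = one_vs_all(X, y, 3)
acc = (predict(models, X) == y).mean()
```

Because the CC projection already separates the classes, a linear decision boundary of this kind is typically sufficient, which is the authors' stated reason for preferring a linear kernel.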

Experiments are conducted on the UMIST face database, which contains 564 images of 20 subjects captured at various yaw angles (including frontal, 45°, 60°, and profile views). Images are aligned using eye and nose landmarks, resized to 128 × 128 pixels, and then processed through the Gabor‑CC‑SVM pipeline. The authors report an overall recognition rate of 96.8 %, with particularly strong performance (≥94 %) on extreme profile views (≥60°). Comparative baselines—PCA‑SVM, LDA‑SVM, and a direct Gabor‑SVM without dimensionality reduction—achieve 88.5 %, 91.2 %, and 93.0 % respectively, demonstrating the benefit of CC in preserving discriminative information while reducing dimensionality. In addition to accuracy gains, the proposed method reduces training and inference time by more than 70 % compared to the raw Gabor approach, and it cuts memory consumption substantially.

The paper also acknowledges several limitations. The Gabor filter bank parameters (frequencies, orientations) and the number of CC components are chosen empirically; no automatic model‑selection scheme is presented. The UMIST dataset is relatively small and lacks the uncontrolled lighting and background variations found in larger benchmarks such as LFW or IJB‑A, so the generalizability of the method to real‑world scenarios remains to be validated. Finally, while the authors discuss computational savings, they do not explore hardware acceleration (e.g., GPU) or approximate eigen‑decomposition techniques that would be required for real‑time deployment on large‑scale systems.

In summary, the study demonstrates that Gabor features, when combined with a discriminative projection like Canonical Covariate, provide a compact yet highly informative representation for multiview face images. Coupling this representation with a linear SVM yields a system that is both accurate and computationally efficient. Future work could focus on automated parameter optimization, integration with deep learning feature extractors, extensive testing on larger, unconstrained datasets, and implementation of real‑time capable architectures.

