Automated User Identification from Facial Thermograms with Siamese Networks

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

The article analyzes the use of thermal imaging technologies for biometric identification based on facial thermograms. It presents a comparative analysis of infrared spectral ranges (NIR, SWIR, MWIR, and LWIR). The paper also defines key requirements for thermal cameras used in biometric systems, including sensor resolution, thermal sensitivity, and a frame rate of at least 30 Hz. Siamese neural networks are proposed as an effective approach for automating the identification process. In experiments conducted on a proprietary dataset, the proposed method achieved an accuracy of approximately 80%. The study also examines the potential of hybrid systems that combine visible and infrared spectra to overcome the limitations of individual modalities. The results indicate that thermal imaging is a promising technology for developing reliable security systems.


💡 Research Summary

The paper investigates the use of facial thermograms—thermal images captured in the infrared spectrum—as a biometric modality and proposes an automated identification system based on Siamese neural networks. After outlining the shortcomings of conventional visible‑light face recognition (sensitivity to illumination, pose, expression, and susceptibility to spoofing), the authors argue that thermal imaging captures intrinsic physiological features such as vascular patterns, which are invariant to makeup, aging, and lighting conditions.

A detailed review of infrared spectral bands is provided: Near‑IR (NIR, 0.75–1.4 µm) and Short‑Wave IR (SWIR, 1.4–3 µm) rely mainly on reflected radiation and require active illumination; Mid‑Wave IR (MWIR, 3–8 µm) captures emitted heat but often requires cooled detectors; Long‑Wave IR (LWIR, 8–14 µm) coincides with the peak thermal emission of the human body (≈9–10 µm) and enables passive, non‑intrusive imaging. Consequently, LWIR is identified as the most suitable band for facial thermography.
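The ≈9–10 µm emission peak quoted above can be checked with Wien's displacement law, λ_peak = b / T. The following is a minimal sketch rather than anything taken from the paper: the facial skin temperature of 34 °C is an assumed illustrative value, and the helper names are hypothetical. The band edges are the ones listed in the summary.

```python
# Wien's displacement law: lambda_peak = b / T
WIEN_B_UM_K = 2897.77  # Wien displacement constant, in µm·K

# Band edges in µm, as listed in the summary
BANDS = {
    "NIR": (0.75, 1.4),
    "SWIR": (1.4, 3.0),
    "MWIR": (3.0, 8.0),
    "LWIR": (8.0, 14.0),
}


def peak_wavelength_um(temp_kelvin: float) -> float:
    """Wavelength of peak blackbody emission for a given temperature."""
    return WIEN_B_UM_K / temp_kelvin


def band_of(wavelength_um: float) -> str:
    """Name of the listed band that contains the wavelength."""
    for name, (low, high) in BANDS.items():
        if low <= wavelength_um < high:
            return name
    return "outside listed bands"


# Assumption: facial skin temperature of about 34 °C (illustrative value only)
skin_temp_k = 273.15 + 34.0
peak = peak_wavelength_um(skin_temp_k)
print(f"peak emission at {peak:.2f} µm, band: {band_of(peak)}")
```

Under this assumption the peak lands inside the LWIR band, which is consistent with the summary's reasoning for preferring LWIR for passive imaging.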

Hardware requirements are enumerated: a minimum spatial resolution of 320 × 240 pixels (preferably 640 × 512 pixels or higher), thermal sensitivity measured as Noise Equivalent Temperature Difference (NETD) ≤30 mK (≤20 mK for high‑precision tasks), and a frame rate of at least 30 Hz to avoid motion blur. Commercial uncooled micro‑bolometer cameras from vendors such as FLIR, Teledyne, Xenics, Lynred, and InfraTec meet these specifications.
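The thresholds above can be expressed as a simple checklist. The sketch below is only illustrative: `CameraSpec`, `unmet_requirements`, and both example cameras are hypothetical, and the values are made up rather than taken from any vendor datasheet.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CameraSpec:
    width_px: int
    height_px: int
    netd_mk: float        # Noise Equivalent Temperature Difference, in mK
    frame_rate_hz: float


def unmet_requirements(spec: CameraSpec, high_precision: bool = False) -> List[str]:
    """List which of the summary's minimum requirements a camera fails."""
    netd_limit = 20.0 if high_precision else 30.0
    problems = []
    if spec.width_px < 320 or spec.height_px < 240:
        problems.append("resolution below 320 x 240")
    if spec.netd_mk > netd_limit:
        problems.append(f"NETD above {netd_limit:.0f} mK")
    if spec.frame_rate_hz < 30.0:
        problems.append("frame rate below 30 Hz")
    return problems


# Hypothetical cameras with invented values, for illustration only
camera_a = CameraSpec(640, 512, 25.0, 30.0)
camera_b = CameraSpec(256, 192, 50.0, 25.0)

print(unmet_requirements(camera_a))                       # no unmet requirements
print(unmet_requirements(camera_a, high_precision=True))  # fails the stricter NETD limit
print(unmet_requirements(camera_b))                       # fails all three checks
```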

The experimental dataset consists of 3,720 thermograms (about 116 per participant on average) extracted from video streams of 32 participants recorded with a UNI‑T UTi260B camera. Each video (3–5 minutes) contains thousands of frames covering a range of head orientations, facial expressions, and subtle movements, so the selected frames carry natural variation without artificial augmentation.

A Siamese network architecture is employed because it learns a similarity metric directly instead of training a closed‑set classifier. Two identical subnetworks share weights and map each input image to a compact feature vector; the Euclidean distance between the two vectors indicates whether the pair belongs to the same person, with a small distance meaning a likely match. This design is well suited to open‑set verification, where identities unseen during training may appear in operation. The model is trained for 300 epochs with standard data augmentation (small rotations, scaling, expression variations) to improve generalisation.
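The shared-weight idea can be sketched in a few lines of plain Python. This is not the authors' model: the single linear layer with ReLU stands in for the real subnetwork, the contrastive loss is an assumption (the summary does not name the training loss), and the weights, margin, threshold, and 16-value "image" are all hypothetical.

```python
import math
import random


def embed(image, weights):
    """One branch of the pair: a linear layer with ReLU (stand-in for a CNN).

    Both inputs pass through this same function with the same weights,
    which is what makes the network Siamese.
    """
    return [max(0.0, sum(w * x for w, x in zip(row, image))) for row in weights]


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, same_person, margin=1.0):
    """Assumed loss: pull matching pairs together, push others past the margin."""
    if same_person:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def same_identity(img_a, img_b, weights, threshold=0.5):
    """Verification decision: a small embedding distance means a likely match."""
    return euclidean(embed(img_a, weights), embed(img_b, weights)) < threshold


# Hypothetical data: random weights and a flattened 16-value "image"
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
img = [rng.random() for _ in range(16)]

print(same_identity(img, img, weights))  # identical inputs give distance 0
```

Because the decision depends only on a distance and a threshold, a new person can be enrolled by storing a reference embedding, without retraining a classifier. That is the open-set property the summary refers to.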

Evaluation on an 80/20 train‑test split yields an accuracy of 0.7999, a precision of 0.7761, a recall of 0.7899, and an F1‑score of 0.7368. These figures suggest that the Siamese approach can discriminate between thermal facial images with roughly 80 % accuracy despite the modest dataset size.
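For reference, the four metrics follow from confusion-matrix counts as sketched below. The counts are hypothetical, since the summary does not report them. One point worth noting: the harmonic mean of the reported precision and recall comes to about 0.783 rather than the reported 0.7368. The summary does not explain the gap; it may reflect a different averaging scheme (for example macro-averaging over classes), but that is only a guess.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def f1_from(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not from the paper)
example = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(example)

# Harmonic mean of the precision and recall values reported in the summary
harmonic_f1 = f1_from(0.7761, 0.7899)
print(f"F1 implied by reported precision/recall: {harmonic_f1:.3f}")
```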

The authors acknowledge several limitations. A dataset of 32 subjects is insufficient for deployment in security‑critical environments that demand near‑perfect performance and robustness across diverse demographics, lighting, and ambient temperature conditions. Thermal images are also vulnerable to physiological variations (exercise, emotional state, alcohol intake) and external factors (glasses, breathing, ambient temperature), which can introduce noise and reduce discriminative power. Moreover, while thermal imaging raises the bar against simple photo or mask attacks, sophisticated physical perturbations and adversarial patterns can still deceive infrared sensors, indicating a need for dedicated anti‑spoofing mechanisms.

Future work is outlined: (1) expanding the dataset to thousands of subjects to capture broader variability; (2) employing higher‑resolution, lower‑NETD cameras to improve signal quality; (3) exploring more advanced network designs such as attention‑based or transformer models to better capture subtle thermal patterns; (4) integrating visible‑light imagery to create a multispectral hybrid system that leverages the complementary strengths of both modalities; and (5) developing robust liveness detection and adversarial defenses tailored to thermal data.

In conclusion, the study demonstrates that facial thermograms, when processed with Siamese neural networks, provide a viable, contact‑less biometric modality that is largely immune to illumination changes and offers inherent resistance to basic spoofing. However, achieving the reliability required for real‑world security applications will require larger, more diverse datasets, improved sensor hardware, and sophisticated algorithmic enhancements, especially in the areas of anti‑spoofing and multimodal fusion. The paper thus positions thermal‑based biometrics as a promising direction for next‑generation authentication systems while clearly mapping the technical challenges that must be addressed for practical adoption.

