Instant Automated Inference of Perceived Mental Stress through Smartphone PPG and Thermal Imaging
Background: A smartphone is a promising tool for daily cardiovascular measurement and mental stress monitoring. Smartphone camera-based PhotoPlethysmoGraphy (PPG) and a low-cost thermal camera can be combined into cheap, convenient, mobile monitoring systems. However, to ensure reliable results, a person has to remain still for several minutes while a measurement is taken. This is cumbersome and makes such systems impractical in real-life mobile situations.
Objective: We propose a system that combines PPG and thermography with the aim of improving cardiovascular signal quality and capturing stress responses quickly.
Methods: Using a smartphone camera with a low-cost thermal camera attached, we built a novel system that continuously and reliably measures two types of cardiovascular events: (i) blood volume pulse and (ii) vasoconstriction/dilation-induced temperature changes of the nose tip. Seventeen healthy participants completed a series of stress-inducing mental workload tasks and measured their physiological responses to the stressors over a short window of time (20 seconds) immediately after each task. Participants reported their level of perceived mental stress on a 10-cm Visual Analogue Scale (VAS). We used normalized K-means clustering to reduce interpersonal differences in the self-reported ratings. For the instant stress inference task, we built novel low-level feature sets representing the variability of cardiovascular patterns. We then used the automatic feature learning capability of artificial Neural Networks (NN) to improve the mapping between the extracted features and the self-reported ratings. We compared the proposed method with existing machine learning methods based on hand-engineered features.
Results, Conclusions: … due to limited space here, we refer to our manuscript.
💡 Research Summary
This paper presents a novel mobile stress‑monitoring system that simultaneously records photoplethysmography (PPG) using a smartphone’s rear RGB camera and thermal imaging of the nose tip using a low‑cost FLIR One 2G attachment. The authors designed a signal‑processing pipeline that extracts raw blood‑volume‑pulse (BVP) signals from temporal variations in the Shannon entropy of the red‑channel images, and then derives P‑P (peak‑to‑peak) intervals after removing a moving‑average baseline. Thermal frames are processed to obtain a one‑dimensional temperature time series from the nose tip, with a filter that suppresses respiratory cycles to isolate vasoconstriction‑related temperature fluctuations. Signal quality indices (pSQI) indicate that the PPG data carry high‑quality cardiac cyclic information (mean 0.755) and that filtering markedly reduces the respiratory component of the thermal signal (pSQI falls from 0.714 to 0.157).
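The PPG steps described above (per-frame entropy, baseline removal, peak-to-peak intervals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the one-second window, the simple local-maximum peak detector, and the synthetic 1.2 Hz trace standing in for real per-frame entropy values are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a flat sequence of pixel intensities."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def moving_average(signal, window):
    """Centered moving average; the window is truncated at both edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def remove_baseline(signal, window):
    """Subtract the moving-average baseline to suppress slow drift."""
    baseline = moving_average(signal, window)
    return [s - b for s, b in zip(signal, baseline)]


def find_peaks(signal):
    """Indices of positive local maxima (a deliberately simple detector)."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > 0 and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def pp_intervals(peak_indices, fps):
    """Peak-to-peak intervals in seconds from peak frame indices."""
    return [(b - a) / fps for a, b in zip(peak_indices, peak_indices[1:])]


# Assumed demo input: a 20 s trace at 30 fps, modelled as a 1.2 Hz pulse
# plus linear drift. In a real pipeline each sample would instead be
# shannon_entropy(red_channel_pixels_of_frame).
FPS = 30
trace = [
    math.sin(2 * math.pi * 1.2 * i / FPS) + 0.5 * i / FPS
    for i in range(20 * FPS)
]
pulse = remove_baseline(trace, window=FPS + 1)
intervals = pp_intervals(find_peaks(pulse), FPS)
print(f"{len(intervals)} intervals, mean {sum(intervals) / len(intervals):.3f} s")
```

The resulting interval sequence is the kind of low-level input the summary says is passed to the neural network; real recordings would need a more robust peak detector than the one shown here.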
Seventeen healthy participants performed a series of cognitively demanding tasks (e.g., Stroop, N‑back) to induce varying levels of mental stress. Immediately after each task, a 20‑second recording of both modalities was captured, and participants rated their perceived stress on a 10‑cm visual analogue scale (VAS). To reduce inter‑subject variability, the VAS scores were normalized and clustered using K‑means, producing discrete stress labels.
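The labeling step can be illustrated with a small sketch. The assumptions are loud ones: the summary does not say how the scores were normalized, how many clusters were used, or whether clustering was pooled or per participant, so the per-participant min-max scaling, the choice of two clusters, the pooled clustering, and the VAS values below are all hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one participant's VAS scores to [0, 1] (assumed scheme)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm in one dimension.

    Centroids start evenly spaced over the data range, so the result is
    deterministic. Returns (labels, centroids).
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every value.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical VAS ratings in cm: p1 uses the whole scale, p2 only the middle.
vas = {"p1": [1.0, 2.0, 8.0, 9.0], "p2": [4.0, 4.5, 6.0, 6.5]}
pooled = [v for scores in vas.values() for v in min_max_normalize(scores)]
stress_labels, centers = kmeans_1d(pooled, k=2)
print(stress_labels)
```

Without the per-participant rescaling, all of p2's ratings would sit between p1's low and high groups, which is the kind of inter-subject difference the normalization is meant to remove.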
Two feature‑generation strategies were compared. Traditional high‑level features (HRV metrics such as LF/HF ratio, SDNN, and thermal directionality) showed low correlation with the short‑window VAS scores and yielded modest classification accuracies (≈68% for PPG‑only, ≈59% for thermal‑only). In contrast, the authors fed low‑level sequences (raw P‑P intervals and temperature fluctuations) directly into a multilayer perceptron neural network, allowing the model to learn discriminative high‑level representations automatically. In leave‑one‑subject‑out cross‑validation, the multimodal neural network achieved 78.33% accuracy, outperforming the single‑modality models and coming close to state‑of‑the‑art stress‑recognition systems, which typically require at least two minutes of data to reach ~80% accuracy.
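The evaluation protocol, leave-one-subject-out cross-validation, can be sketched as below. The splitting logic is the point of the example. The nearest-centroid classifier is only a stand-in for the multilayer perceptron, the toy features and subject IDs are hypothetical, and pooling predictions across folds to compute accuracy is an assumption about how the reported figure was aggregated.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def fit_centroids(features, labels):
    """Stand-in model: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def loso_accuracy(features, labels, subject_ids):
    """Accuracy pooled over all folds; no subject is in both train and test."""
    correct = 0
    for _, train, test in leave_one_subject_out(subject_ids):
        model = fit_centroids(
            [features[i] for i in train], [labels[i] for i in train]
        )
        correct += sum(predict(model, features[i]) == labels[i] for i in test)
    return correct / len(labels)


# Hypothetical toy data: three subjects, one low- and one high-stress sample each.
features = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.0, 0.3], [1.0, 0.7]]
labels = [0, 1, 0, 1, 0, 1]
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
accuracy = loso_accuracy(features, labels, subjects)
print(accuracy)
```

Splitting by subject rather than by sample keeps a participant's recordings out of the training set whenever that participant is being tested, so the score reflects performance on people the model has not seen.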
Additional experiments examined the impact of labeling strategies; models performed best when stress labels were normalized per participant, highlighting the importance of handling subjective rating differences.
Overall, the study demonstrates that (1) high‑quality cardiovascular and thermal signals can be extracted from ultra‑short (20 s) smartphone recordings, (2) low‑level physiological variability can be leveraged by neural networks to achieve robust stress inference, and (3) combining PPG with thermal imaging substantially improves performance over either modality alone. The proposed approach offers a practical, non‑invasive solution for real‑time mental‑stress monitoring in everyday mobile contexts.