Modified TSception for Analyzing Driver Drowsiness and Mental Workload from EEG
Driver drowsiness is a leading cause of traffic accidents, necessitating real-time, reliable detection systems to ensure road safety. This study proposes a Modified TSception architecture for robust assessment of driver fatigue and mental workload using electroencephalography (EEG). The model introduces a five-layer hierarchical temporal refinement strategy to capture multi-scale brain dynamics, extending the original TSception's three-layer approach. Key innovations include Adaptive Average Pooling (ADP), which provides structural flexibility across varying EEG dimensions, and a two-stage fusion mechanism that optimizes spatiotemporal feature integration for improved stability. Evaluated on the SEED-VIG dataset, the Modified TSception achieves 83.46% accuracy, comparable to the original model (83.15%), but with a substantially narrower confidence interval (0.24 vs. 0.36), indicating better performance stability. The architecture's generalizability was further validated on the STEW mental workload dataset, achieving state-of-the-art accuracies of 95.93% and 95.35% for 2-class and 3-class classification, respectively. These results show that the proposed modifications improve consistency and cross-task generalizability, making the model a reliable framework for EEG-based safety monitoring.
💡 Research Summary
The paper presents Modified TSception, a deep-learning architecture designed to improve the reliability and cross-task generalization of EEG-based driver drowsiness and mental-workload detection. Building on the original TSception, the authors introduce three major enhancements. (1) A five-layer hierarchical temporal pathway replaces the original three layers: the first three layers capture standard rhythmic activity (0.5 s, 0.25 s, and 0.125 s windows), and the two additional layers employ halved sampling rates to act as refinement filters for transient events such as alpha spindles and theta bursts. (2) Adaptive Average Pooling (ADP), used in the third temporal block and in the spatial blocks, adjusts the pooling size dynamically to produce a fixed-size feature map regardless of input length or channel count, which makes the network less dependent on a particular device or recording protocol. (3) A two-stage fusion mechanism first merges the spatial features and then refines the combined representation with 1×1 pointwise convolutions, with the aim of strengthening inter-channel relationships and reducing overfitting.
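To make the fixed-size property of adaptive average pooling concrete, here is a minimal pure-Python sketch of the 1-D case. It is not the authors' code: the function name `adaptive_avg_pool_1d` and the toy inputs are hypothetical, and the bin boundaries follow the floor/ceil rule commonly used by framework implementations such as PyTorch's adaptive pooling layers.

```python
def adaptive_avg_pool_1d(values, output_size):
    """Average `values` into exactly `output_size` bins, whatever len(values) is."""
    n = len(values)
    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size                        # floor(i * n / out)
        end = ((i + 1) * n + output_size - 1) // output_size  # ceil((i + 1) * n / out)
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled

# Two hypothetical inputs of different length reduce to the same output size.
short_signal = [float(v) for v in range(8)]
long_signal = [float(v) for v in range(200)]
pooled_short = adaptive_avg_pool_1d(short_signal, 4)
pooled_long = adaptive_avg_pool_1d(long_signal, 4)
print(len(pooled_short), len(pooled_long))  # 4 4
```

Because the window size is derived from the input length rather than fixed in advance, the layers that follow always see the same feature dimension, which is the property the summary attributes to ADP.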
The architecture processes EEG tensors of shape (1, 17, 200), that is, a one-second segment from 17 scalp electrodes sampled at 200 Hz. Each temporal block uses 2-D convolutions with kernel (1, k_t), LeakyReLU activation, and either average pooling or ADP, followed by batch normalization. Spatial blocks employ kernels of (17, 1) and (8, 1) to capture full-head and partial-head interactions, respectively, and are likewise followed by ADP and batch normalization. The concatenated spatiotemporal features pass through a fully connected layer (64 units, dropout rate 0.5) and a softmax classifier. Training uses cross-entropy loss with label smoothing to mitigate class imbalance.
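The shape bookkeeping implied by these kernels can be checked with a few lines of arithmetic. The sketch below assumes unpadded, stride-1 convolutions and assumes that k_t equals the window duration times the sampling rate, using the 0.5 s, 0.25 s, and 0.125 s windows given earlier in the summary. Neither padding nor stride is stated in the summary, and the helper `conv_output_size` is a hypothetical name.

```python
SAMPLING_RATE_HZ = 200   # from the summary
N_CHANNELS = 17          # from the summary
SEGMENT_SAMPLES = 200    # one-second segment at 200 Hz

def conv_output_size(input_size, kernel_size, stride=1):
    """Output length of an unpadded ("valid") convolution along one axis."""
    return (input_size - kernel_size) // stride + 1

# Assumed temporal kernel lengths k_t for the 0.5 s, 0.25 s and 0.125 s windows.
window_seconds = [0.5, 0.25, 0.125]
kernel_lengths = [int(w * SAMPLING_RATE_HZ) for w in window_seconds]

# Width of each temporal feature map before pooling (assumes stride 1, no padding).
temporal_widths = [conv_output_size(SEGMENT_SAMPLES, k) for k in kernel_lengths]

# Height after the full-head (17, 1) spatial kernel: all channels collapse to one row.
full_head_rows = conv_output_size(N_CHANNELS, 17)

print(kernel_lengths)   # [100, 50, 25]
print(temporal_widths)  # [101, 151, 176]
print(full_head_rows)   # 1
```

Under these assumptions the three temporal branches produce feature maps of different widths, which is one reason a pooling step that normalizes the output size is convenient before the branches are concatenated.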
Experimental evaluation was conducted on two publicly available datasets. On SEED-VIG (driver drowsiness), Modified TSception achieved 83.46% accuracy with a confidence interval (CI) of ±0.24. The accuracy is comparable to the original TSception's 83.15%, while the CI is markedly tighter than the original's ±0.36, which points to more stable performance, a critical factor for real-world safety systems. On the STEW dataset (mental workload), the model reached state-of-the-art performance: 95.93% accuracy for binary workload classification and 95.35% for three-class classification, surpassing previously reported CNN-based approaches (generally ≤94%). The cross-task results suggest that the hierarchical temporal layers and ADP help the network learn EEG representations that transfer across tasks.
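The summary does not state how the confidence intervals were computed. As an illustration only, the sketch below shows one common choice, a normal-approximation 95% interval for the mean of repeated runs, and how two models with similar mean accuracy can differ in interval width. The accuracy lists are invented for the example and are not results from the paper, and `ci_half_width` is a hypothetical helper.

```python
import math
import statistics

def ci_half_width(scores, z=1.96):
    """Half-width of a normal-approximation 95% CI for the mean of `scores`."""
    return z * statistics.stdev(scores) / math.sqrt(len(scores))

# Hypothetical per-run accuracies (%), invented for illustration only.
steady_runs = [83.2, 83.5, 83.4, 83.6, 83.3]
erratic_runs = [82.4, 84.1, 83.0, 84.3, 82.7]

for name, runs in [("steady", steady_runs), ("erratic", erratic_runs)]:
    print(f"{name}: {statistics.mean(runs):.2f} ± {ci_half_width(runs):.2f}")
```

A narrower interval at a similar mean is the pattern the reported ±0.24 versus ±0.36 comparison describes, although the exact procedure behind those figures may differ from this sketch.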
The authors acknowledge several limitations. The current design is optimized for 17-channel, 1-second inputs, and its scalability to high-density or long-duration recordings remains untested. Real-time inference latency and power consumption, both essential for in-vehicle deployment, were not quantified. Moreover, the study does not explore data augmentation, domain adaptation, or multimodal fusion (e.g., eye tracking, heart rate), which could further enhance robustness.
In summary, Modified TSception advances EEG‑based cognitive‑state monitoring by (i) enriching temporal feature granularity through additional hierarchical layers, (ii) ensuring dimensional consistency across diverse recording setups via Adaptive Average Pooling, and (iii) improving spatiotemporal synergy with a two‑stage fusion strategy. These innovations yield comparable or higher accuracy while substantially reducing performance variance, making the model a promising candidate for reliable, real‑time driver‑monitoring systems and broader cognitive‑load assessment applications.