Divergence Framework for EEG based Multiclass Motor Imagery Brain Computer Interface
Like most real-world data, EEG signals contain ubiquitous non-stationarities that significantly perturb the feature distribution and thus deteriorate the performance of a Brain Computer Interface (BCI). In this letter, a novel method based on Joint Approximate Diagonalization (JAD) is proposed to optimize stationarity for multiclass motor imagery BCI in an information theoretic framework. Specifically, the proposed method estimates a subspace that optimizes discriminability between the classes while simultaneously preserving stationarity within the motor imagery classes. The subspace is determined through gradient descent optimization on an orthogonal manifold. The performance of the proposed stationarity-enforcing algorithm is compared with that of the baseline One-Versus-Rest (OVR)-CSP and JAD algorithms on the publicly available BCI Competition IV dataset IIa. Results show an improvement in average classification accuracy across subjects over the baseline algorithms, demonstrating the benefit of alleviating within-session non-stationarities.
💡 Research Summary
The paper addresses a critical challenge in EEG‑based motor‑imagery brain‑computer interfaces (BCIs): the presence of non‑stationarities that distort feature distributions and degrade classification performance, especially in multiclass scenarios. While the Common Spatial Patterns (CSP) algorithm has been the work‑horse for binary motor‑imagery classification, its extensions to multiclass problems—such as One‑Versus‑Rest (OVR) CSP, pairwise binary voting, and information‑theoretic filter selection—still suffer from intra‑session variability because they do not explicitly enforce stationarity within each class.
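The OVR-CSP baseline mentioned above contrasts each class against the pooled remaining classes. A minimal sketch of that idea, assuming zero-mean class covariance matrices and using the standard generalized eigenvalue formulation (function names and the filters-per-class count are illustrative, not from the paper):

```python
import numpy as np
from scipy.linalg import eigh

def ovr_csp(class_covs, n_filters=2):
    """One-Versus-Rest CSP sketch: for each class, contrast its covariance
    against the pooled covariance of all remaining classes."""
    filters = []
    total = sum(class_covs)
    for cov_k in class_covs:
        rest = total - cov_k
        # Generalized eigenvalue problem: cov_k w = lambda (cov_k + rest) w
        eigvals, eigvecs = eigh(cov_k, cov_k + rest)
        # Keep eigenvectors with the largest eigenvalues, i.e. directions
        # where class k has maximal relative variance
        order = np.argsort(eigvals)[::-1]
        filters.append(eigvecs[:, order[:n_filters]])
    # Stack per-class filters: channels x (K * n_filters)
    return np.concatenate(filters, axis=1)
```

As the summary notes, such filters maximize discriminability but impose no stationarity constraint, which is the gap the proposed frameworks target.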
To overcome this limitation, the authors propose two novel frameworks that integrate stationarity constraints into an information‑theoretic divergence formulation based on Joint Approximate Diagonalization (JAD). The first framework, termed DivOVR‑WS, augments the conventional OVR‑CSP objective with a regularization term that penalizes the Kullback‑Leibler (KL) divergence between each trial’s covariance matrix and its class‑wise mean covariance. A scalar regularization parameter λ balances discriminability (the CSP‑style term) against stationarity (the KL term). The second framework, DivJAD‑WS, builds directly on the JAD approach, which seeks a linear transform V that jointly diagonalizes the K class covariance matrices, thereby minimizing the sum of KL divergences between transformed covariances and diagonal matrices. In DivJAD‑WS the same KL‑based stationarity term is added, yielding a combined loss Δ(V) = (1‑λ)·J(R) + λ·J_s(Id_R), where J(R) is the original JAD divergence and J_s measures trial‑to‑class divergence. The optimization is performed on the orthogonal manifold using Riemannian gradient descent; the orthogonal matrix R is initialized randomly (or with the OVR‑CSP solution for faster convergence) and updated via line search until convergence criteria are met. After optimization, the spatial filter matrix V = (R·W)^T is obtained (W being the whitening transform), and the top d filters are selected using an information‑theoretic criterion such as mutual information.
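The combined loss and the manifold update described above can be sketched as follows, assuming zero-mean Gaussian trial distributions so the KL divergence has a closed form; the function names, the fixed step size, and the matrix-exponential retraction are illustrative choices, not necessarily the authors' exact implementation:

```python
import numpy as np
from scipy.linalg import expm

def kl_gauss(S1, S2):
    """KL divergence between zero-mean Gaussians N(0, S1) and N(0, S2)."""
    d = S1.shape[0]
    return 0.5 * (np.trace(np.linalg.inv(S2) @ S1) - d
                  + np.log(np.linalg.det(S2) / np.linalg.det(S1)))

def combined_loss(R, class_covs, trial_covs, lam):
    """Delta(R) = (1 - lam) * J(R) + lam * J_s(R): JAD divergence plus a
    trial-to-class stationarity penalty (both KL-based, sketch only)."""
    # J: divergence of each transformed class covariance from its diagonal
    j = sum(kl_gauss(R @ C @ R.T, np.diag(np.diag(R @ C @ R.T)))
            for C in class_covs)
    # J_s: divergence of each trial covariance from its class-wise mean
    j_s = sum(kl_gauss(R @ X @ R.T, R @ class_covs[k] @ R.T)
              for k, trials in enumerate(trial_covs) for X in trials)
    return (1 - lam) * j + lam * j_s

def retract(R, G, step):
    """One descent step on the orthogonal manifold: project the Euclidean
    gradient G onto the tangent space (skew-symmetric part) and retract
    with a matrix exponential, so the update stays orthogonal."""
    A = G @ R.T - R @ G.T  # skew-symmetric tangent direction
    return expm(-step * A) @ R
```

In this sketch `lam` plays the role of the paper's λ: at `lam = 0` only the JAD discriminability term remains, while larger values weight the within-class stationarity penalty more heavily.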
Experimental validation uses the publicly available BCI Competition IV dataset IIa, which contains four motor‑imagery classes (left hand, right hand, feet, tongue) from nine subjects. Each subject contributed 288 training and 288 testing trials. Pre‑processing extracts a 0.5–3.5 s window after cue onset and applies an 8–30 Hz band‑pass filter. Covariance matrices are estimated with a shrinkage estimator. For all methods, eight spatial filters are retained and fed to a Linear Discriminant Analysis (LDA) classifier. The regularization parameter λ is tuned via cross‑validation over the range
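The preprocessing chain described above (0.5–3.5 s window, 8–30 Hz band-pass, shrinkage covariance, log-variance features for LDA) could be sketched as below; the fixed shrinkage intensity, filter order, and function names are assumptions for illustration, not details taken from the paper:

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 250  # sampling rate of BCI Competition IV dataset IIa (Hz)

def preprocess_trial(trial, fs=FS, band=(8.0, 30.0), window=(0.5, 3.5)):
    """Band-pass filter one trial (channels x samples) to 8-30 Hz and
    crop the 0.5-3.5 s window after cue onset."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, trial, axis=1)
    start, stop = int(window[0] * fs), int(window[1] * fs)
    return filtered[:, start:stop]

def shrinkage_cov(x, gamma=0.1):
    """Shrinkage covariance: blend the sample covariance with a scaled
    identity (gamma is a hypothetical fixed shrinkage intensity)."""
    c = x @ x.T / x.shape[1]
    return (1 - gamma) * c + gamma * (np.trace(c) / c.shape[0]) * np.eye(c.shape[0])

def log_var_features(trial, spatial_filters):
    """Project a preprocessed trial through the spatial filters and take
    log-variance: the CSP-style feature vector fed to the LDA classifier."""
    z = spatial_filters.T @ trial
    return np.log(np.var(z, axis=1))
```

With eight retained filters, each trial reduces to an eight-dimensional log-variance feature vector, matching the LDA input described in the text.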