A P300 ERP-based Brain-Computer Interface (BCI) speller is an assistive communication tool. It searches for the P300 event-related potential (ERP) elicited by target stimuli, distinguishing it from the neural responses to non-target stimuli embedded in electroencephalogram (EEG) signals. Conventional methods require a lengthy calibration procedure to construct the binary classifier, which reduces overall efficiency. We therefore propose a unified framework with minimal calibration effort: given a small amount of labeled calibration data, an adaptive semi-supervised EM-GMM algorithm updates the binary classifier. We evaluated our method on character-level prediction accuracy, information transfer rate (ITR), and BCI utility. We performed calibration on training data and reported results on testing data. Our results indicate that, out of 15 participants, 9 exceeded the minimum character-level accuracy of 0.7 using either our adaptive method or the benchmark, and for 7 of these 9 participants our adaptive method performed better than the benchmark. The proposed semi-supervised learning framework provides a practical and efficient alternative for improving the overall spelling efficiency of real-time BCI speller systems, particularly in contexts with limited labeled data.
A Brain-Computer Interface (BCI) is a technology-driven system that captures, processes, and converts brain signals into actionable commands to control an output device and perform specific tasks [1]. One common type is the EEG-based BCI, which uses electroencephalography (EEG) to record brain activity from the scalp. EEG-based BCIs are widely used due to their low cost, non-invasiveness, and high temporal resolution. Among them, speller systems represent a practical communication tool for individuals with severe physical impairments [2].
Among EEG-based speller systems, the P300 speller has received particular attention due to its reliability and ease of implementation. The P300 speller provides a non-invasive way for users to communicate, and it has been beneficial for individuals with severe motor impairments such as amyotrophic lateral sclerosis (ALS) [3]. The P300 is a specific event-related potential (ERP) characterized by a positive voltage deflection occurring approximately 300 milliseconds after a stimulus. It is typically elicited in response to a rare but relevant event, referred to as the target stimulus, while frequent irrelevant events are known as non-target stimuli [4]. Users are asked to focus on the target character they wish to type and mentally respond whenever a stimulus group contains that character, while ignoring all other groups [5].
The Row-Column Paradigm (RCP) is a widely used stimulus presentation paradigm for P300-based BCI spellers. Figure 1 shows the virtual keyboard presented to the user, arranged in six rows and six columns [5]. During a sequence, each of the six rows and six columns is flashed once in random order. The row and the column containing the character the user intends to spell are the target stimulus groups. Therefore, each sequence always contains two target stimuli and ten non-target stimuli.
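The structure of one RCP sequence can be sketched in a few lines of Python. This is only an illustration: the function name `flash_sequence` and the convention that stimulus ids 0 to 5 denote rows and 6 to 11 denote columns are assumptions made here, not details taken from the paper.

```python
import random

N_ROWS, N_COLS = 6, 6  # keyboard size described in the text


def flash_sequence(target_row, target_col, rng):
    """Return one RCP sequence as a list of (stimulus_id, is_target) pairs.

    Assumed convention: ids 0..5 are rows, ids 6..11 are columns.
    Each row and each column is flashed exactly once, in random order.
    """
    stimuli = list(range(N_ROWS + N_COLS))
    rng.shuffle(stimuli)
    targets = {target_row, N_ROWS + target_col}
    return [(s, s in targets) for s in stimuli]


rng = random.Random(0)
seq = flash_sequence(target_row=2, target_col=4, rng=rng)
n_target = sum(is_t for _, is_t in seq)
n_non_target = len(seq) - n_target
print(n_target, n_non_target)  # 2 10
```

Whatever the random order, the labels always split into two targets and ten non-targets, which is the class imbalance the binary classifier has to handle.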
The conventional framework of an EEG-based BCI speller system begins with the acquisition of neural signals from the user (Figure 2). The signals are pre-processed using spatial and spectral filters [6]. The stimulus-specific EEG segment is obtained by truncating the signal from the stimulus onset with a fixed time window, e.g., 800 ms. Features extracted from these stimulus-specific segments are then passed to a binary classifier that differentiates target responses from non-target responses, and the classifier outputs are used to compute character-level probabilities [7]. In the RCP, the system identifies the intended character by locating the intersection of the row and column associated with target responses. Finally, feedback is provided to the user, forming a closed-loop system that supports real-time performance improvement [8,9].
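Two steps of this pipeline, the fixed-window truncation and the row-column intersection, can be sketched as follows. This is a minimal illustration under stated assumptions: the 250 Hz sampling rate, the keyboard layout in `KEYBOARD`, the stimulus-id convention (0 to 5 for rows, 6 to 11 for columns), and the idea of summing per-flash classifier scores are all hypothetical choices made here, not the paper's implementation.

```python
# Hypothetical 6x6 layout; the actual keyboard in Figure 1 may differ.
KEYBOARD = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX", "YZ1234", "56789_"]


def extract_epoch(signal, onset_idx, fs_hz, window_ms=800):
    """Truncate one channel from the stimulus onset using a fixed time window."""
    n_samples = fs_hz * window_ms // 1000
    return signal[onset_idx:onset_idx + n_samples]


def select_character(flashes, n_rows=6, n_cols=6):
    """Pick the character at the intersection of the best row and best column.

    flashes: iterable of (stimulus_id, score), where a larger score means
    the classifier considers the response more target-like.
    """
    totals = [0.0] * (n_rows + n_cols)
    for stim, score in flashes:
        totals[stim] += score  # accumulate evidence over repeated sequences
    row = max(range(n_rows), key=lambda r: totals[r])
    col = max(range(n_cols), key=lambda c: totals[n_rows + c])
    return row, col, KEYBOARD[row][col]


# Toy single-channel signal: 1000 samples at an assumed 250 Hz.
signal = list(range(1000))
epoch = extract_epoch(signal, onset_idx=100, fs_hz=250)
print(len(epoch))  # 200

# Two sequences in which row 1 and column 2 (id 8) receive high scores.
flashes = []
for _ in range(2):
    for stim in range(12):
        flashes.append((stim, 1.0 if stim in (1, 8) else -0.5))
row, col, char = select_character(flashes)
print(char)  # I
```

Summing scores across sequences is one simple way to pool evidence; a probabilistic system would typically combine likelihoods into character-level posteriors instead.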
Various existing machine learning (ML) methods, such as support vector machines (SVM) [10], convolutional neural networks (CNN) [11], logistic regression [12], linear discriminant analysis (LDA) [13], and stepwise LDA (swLDA) [14], have been used successfully to construct binary classifiers. Most existing approaches rely on supervised learning, which requires large amounts of labeled data for calibration.
However, the performance of the system is limited by the tedious calibration procedure. In practice, collecting such labeled EEG data is time-consuming and can be boring for users. These challenges are further amplified when such systems are deployed in practical applications. EEG signals are highly susceptible to external noise, and individual neural variability adds complexity to signal processing. As a result, supervised methods typically require extensive calibration with large datasets, which leads to user fatigue that degrades data quality over time [15,16].
To address these challenges, semi-supervised methods provide a practical approach by utilizing a small amount of labeled data together with a larger pool of unlabeled data. Previous work [17] applied unsupervised training for real-time P300 adaptation. Its Bayesian model embedded the constraints of the P300 speller paradigm and optimized the inner product between the EEG feature vectors and a weight vector. The authors fitted a Gaussian Mixture Model (GMM) to the scalar inner product via the Expectation-Maximization (EM) algorithm. While this eliminated the need for labeled data, the approach performed poorly when few repetitions were used, and its training was unstable, requiring many random starting points to find a good solution.
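To make the EM step concrete, the sketch below fits a two-component one-dimensional GMM to scalar scores. It is a generic textbook EM loop, not the Bayesian model of [17]: it omits the speller-paradigm constraints and the optimization of the weight vector, and the synthetic data, the initial values, and the function names are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(xs, mu, var, weights, n_iter=50):
    """Fit a two-component 1-D GMM by EM, starting from the given parameters."""
    mu, var, weights = list(mu), list(var), list(weights)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: responsibility-weighted updates of means, variances, weights.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            sq = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs))
            var[k] = max(sq / nk, 1e-6)  # floor to avoid a collapsed component
            weights[k] = nk / len(xs)
    return mu, var, weights


# Synthetic scores (assumed): non-target ~ N(0, 1), target ~ N(3, 1),
# mixed 10:2 as in one RCP sequence.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(1000)]
scores += [rng.gauss(3.0, 1.0) for _ in range(200)]

mu, var, weights = em_gmm_1d(scores, mu=[0.0, 3.0], var=[1.0, 1.0],
                             weights=[0.5, 0.5])
print(mu, var, weights)
```

Because EM only finds a local optimum, the result depends on the starting values, which is consistent with the instability noted above for fully unsupervised training.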
Building on this foundation, we proposed a semi-supervised training framework for P300-based BCI systems that integrated a small amount of labeled data for initialization with ongoing unsupervised adaptation. Instead of working on the inner product, we applied a Gaussian Mixture Model directly to the EEG feature vectors with a data-driven covariance matrix assumption. This approach can be used in the following setting. For example, the user first types a simple word such as “GO”, which initiates the spelling task and helps initialize the model parameters. It then transits