Sparse Bayesian Modeling of EEG Channel Interactions Improves P300 Brain-Computer Interface Performance

Guoxuan Ma∗, Yuan Zhong∗, Moyan Li∗, Yuxiao Nie, and Jian Kang†
Department of Biostatistics, University of Michigan, Ann Arbor MI 48109
February 23, 2026

Abstract

Electroencephalography (EEG)-based P300 brain-computer interfaces (BCIs) enable communication without physical movement by detecting stimulus-evoked neural responses. Accurate and efficient decoding remains challenging due to high dimensionality, temporal dependence, and complex interactions across EEG channels. Most existing approaches treat channels independently or rely on black-box machine learning models, limiting interpretability and personalization. We propose a sparse Bayesian time-varying regression framework that explicitly models pairwise EEG channel interactions while performing automatic temporal feature selection. The model employs a relaxed-thresholded Gaussian process prior to induce structured sparsity in both channel-specific and interaction effects, enabling interpretable identification of task-relevant channels and channel pairs. Applied to a publicly available P300 speller dataset of 55 participants, the proposed method achieves a median character-level accuracy of 100% using all stimulus sequences and attains the highest overall decoding performance among competing statistical and deep learning approaches. Incorporating channel interactions yields subgroup-specific gains of up to 7% in character-level accuracy, particularly among participants who abstained from alcohol (up to 18% improvement). Importantly, the proposed method improves median BCI-Utility by approximately 10% at its optimal operating point, achieving peak throughput after only seven stimulus sequences.
These results demonstrate that explicitly modeling structured EEG channel interactions within a principled Bayesian framework enhances predictive accuracy, improves user-centric throughput, and supports personalization in P300 BCI systems.

Keywords: Brain-Computer Interface, Bayesian Method, Gaussian Process, P300 Speller

∗ Equally contributed
† To whom correspondence should be addressed: jiankang@umich.edu

1 Introduction

Brain-computer interfaces (BCIs) facilitate direct communication between the human brain and external devices such as computers. The widespread interest in BCI systems stems from their broad range of potential applications in movement and communication assistance, particularly for individuals with motor impairments (Pfurtscheller et al. 2008). Other applications include performance enhancement in healthy users, neurorehabilitation, and other cognitive and clinical research domains (Van Erp et al. 2012, Kwak et al. 2015).

BCI systems typically acquire brain activity through noninvasive electroencephalography (EEG). Among various EEG signals, event-related potentials (ERPs) refer to brain responses elicited by external stimuli such as visual, auditory, or somatosensory cues (Farwell & Donchin 1988). In an ERP-based BCI design, the system presents multiple stimuli in an on-screen keyboard, and the user focuses attention on and mentally responds to the desired stimulus. The stimulus that the user intends to select is called the target stimulus. After each stimulus presentation, the BCI classifies the corresponding EEG response as either target or non-target, depending on whether it contains ERP components of target perception. This design is often named after the P300 component, a deflection in the EEG signal that peaks approximately 300 ms after the onset of a rare or unexpected target stimulus (Fazel-Rezai et al. 2012).
Among various applications of P300-based BCIs, the P300 speller is one of the most well-known. It functions as a virtual keyboard that allows users to type characters by attending to flashing groups of letters or symbols, which are then decoded based on the elicited P300 responses (Farwell & Donchin 1988).

Data and motivation. To better illustrate the problem, we briefly describe the motivating EEG dataset from a publicly available P300-based BCI experiment by Won et al. (2022). The study collected EEG recordings from 55 participants during BCI spelling tasks, together with questionnaire data on demographics and physical and mental states before and after the experiment. Each participant completed both a calibration (training) phase and a test phase using a P300 speller. During spelling, rows and columns of a virtual keyboard (a 6 × 6 matrix; see Figure 1A) were flashed repeatedly in a random order. As a result, each stimulus sequence consisted of 12 flashes (i.e., 6 rows and 6 columns). For a given target character, the flashes corresponding to its row and column constituted target stimuli, while all remaining flashes were non-targets. Each character selection included 15 such stimulus sequences.

Figure 1: An illustration of the P300 BCI spelling task and recorded signal. (A) The P300 BCI speller presents a sequence of row or column flashes on a virtual screen to the user. The user focuses on a target character and responds to the row and column flashes that contain the target character. The EEG signals of each flash are recorded from multiple channels and are segmented into a 1200 ms window following stimulus onset. (B) The names and layout of the 32 scalp EEG channels according to the international 10-20 system. (C) Example of recorded multi-channel EEG signals and the derived channel connectivity.
EEG signals were recorded from 32 scalp channels arranged according to the international 10-20 system and segmented into 1200 ms windows following the stimulus onset to capture the event-related responses associated with the P300 component. More details are available in Section 4.

As illustrated in Figure 1C, discriminative information between target and non-target signals varies across channels and time. For example, channels FC2 and PO3 show clear target-related deflections in the P300 latency range, whereas channel CP1 exhibits much weaker differences. Even within informative channels, only limited post-stimulus time intervals contribute meaningfully to discrimination. This motivates temporal channel selection via sparse modeling to reduce noise from uninformative channels and time points.

Beyond marginal temporal features, the signal interactions between EEG channels, summarized through measures of channel connectivity, provide an additional and complementary source of discriminative information. Figure 1C shows that the connectivity patterns derived from the target signal differ from those of non-target signals. Incorporating channel connectivity as an additional predictor, alongside temporal EEG features, therefore has the potential to improve classification performance.

Furthermore, because subject-level questionnaire data on demographics and physical and mental states are available, this dataset enables investigation of heterogeneity across individuals, including whether certain subgroups of participants benefit more than others from accounting for signal interactions in modeling. This opens the door to identifying populations for whom modeling channel interactions is particularly advantageous, with implications for personalized and adaptive BCI systems.
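To make the connectivity summary concrete, the following is a minimal sketch, assuming connectivity is measured by the Fisher Z-transformed Pearson correlation between each pair of channels within one post-stimulus window (the definition adopted in Section 2.1). The function name `fisher_z_interactions`, the clipping constant `eps`, and the simulated array dimensions are illustrative assumptions and are not taken from the study's code.

```python
import numpy as np

def fisher_z_interactions(X, eps=1e-6):
    """Pairwise interaction features for one flash.

    X is a (K, T) array: K channels by T time points.
    Returns a vector of length K*(K-1)/2, ordered over pairs k1 < k2.
    """
    corr = np.corrcoef(X)                      # (K, K) Pearson correlations between rows
    k1, k2 = np.triu_indices(X.shape[0], k=1)  # all channel pairs with k1 < k2
    # Clipping is an added safeguard (assumption) so that |r| = 1 does not give +/- infinity.
    r = np.clip(corr[k1, k2], -1 + eps, 1 - eps)
    return np.arctanh(r)                       # arctanh(r) = 0.5 * log((1 + r) / (1 - r))

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 150))  # hypothetical window: 32 channels, 150 time points
z = fisher_z_interactions(X)
print(z.shape)  # one feature per channel pair
```

With 32 channels this produces 32 × 31 / 2 = 496 interaction features per flash, which is why sparsity over channel pairs matters for interpretability.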
Related work. The most-studied and fundamental computational challenge in P300-based BCIs is the classification of brain activity following each stimulus as either a target or non-target response. Accurate classification of these EEG segments enables the identification of the target row and column in the speller matrix, and hence the selection of the desired character. The original work (Farwell & Donchin 1988) developed four classification methods, stepwise linear discriminant analysis (SWLDA), peak picking, area, and covariance, with the best performance achieved by SWLDA in their experiment. Subsequent studies have sought to improve P300 speller performance through more advanced machine learning and signal processing techniques, including independent component analysis (ICA) (Xu et al. 2004) and support vector machines (SVMs) (Kaper et al. 2004). A comprehensive review by Philip & George (2020) highlighted the strength of ensemble methods, which combine the advantages of multiple classifiers and are particularly effective for handling the class imbalance inherent in P300 datasets. Bayesian methods have also gained increasing attention for BCI classification. For example, Zhang et al. (2015) proposed a sparse Bayesian model using Laplace priors for EEG signal classification. Barthélemy et al. (2023) introduced a Bayesian accumulation of Riemannian probabilities, providing an end-to-end framework for P300 BCI classification. Recently, Ma et al. (2022) developed a Bayesian generative model that characterizes the probability distribution of multi-trial EEG signals, which provides both a flexible simulation tool for EEG data and a novel probabilistic classifier for P300 BCIs.

Despite the success of existing methods, most approaches overlook potential functional relationships among brain regions and treat EEG signals from different channels as independent predictors.
However, studies have shown that brain functions arise from the coordinated activity of distributed areas rather than isolated regions (Tononi & Edelman 1998, Friston et al. 1997). In the BCI context, Kabbara et al. (2016) demonstrated clear differences in functional brain networks between target and non-target visual stimuli. These findings suggest that interactions among EEG channels, which reflect network-level brain dynamics, carry important information for distinguishing stimulus types (target/non-target). Modeling such inter-channel dependencies can therefore enhance both the predictive accuracy and the neurophysiological interpretability of BCI systems.

To model signal interactions, many existing approaches adopt a two-step strategy, first identifying main effects and then refitting the model with both main and interaction effects (Hao et al. 2018, Wang et al. 2021). However, this approach is less suitable for EEG data, as the presence of an interaction does not necessarily depend on the existence of corresponding main effects. Zhao et al. (2025) took a different approach by introducing the Gaussian Latent channel model with Sparse time-varying effects (GLASS) for the P300 speller, which uses constrained multinomial logistic regression for target classification while accounting for correlations between channels via latent channel decompositions. Although GLASS incorporates channel correlations into the model structure, it does not explicitly model channel interactions as predictors, and therefore cannot directly evaluate and interpret the effect sizes of inter-channel relationships. Recent Bayesian approaches have incorporated both main and interaction terms within a unified inference framework using hierarchical shrinkage priors (Griffin & Brown 2017). Nonetheless, such general frameworks are less effective in accounting for the unique temporal and structural dependencies in EEG data.
More recently, neural network-based methods have been proposed for EEG classification tasks. Examples include multi-task autoencoder models (Ditthapron et al. 2019), compact convolutional neural networks such as EEGNet (Lawhern et al. 2018), and weighted ensemble strategies (Kshirsagar & Londhe 2019). These approaches can implicitly capture interaction effects among EEG channels and their association with stimulus type outcomes. However, they often require extensive task-specific architectural design and large training datasets, which vary considerably across studies. Moreover, the black-box nature of neural networks limits interpretability, making it difficult to identify which specific channels or channel pairs drive classification decisions. This lack of interpretability constrains the scientific insights that can be drawn about the underlying neural mechanisms and may hinder model generalizability. Consequently, there is a strong need for statistical methods that can incorporate inter-channel interaction effects while preserving interpretability and computational efficiency.

Our contributions. In this paper, we propose a Bayesian time-varying regression model with signal interactions via relaxed-thresholded Gaussian process (SI-RTGP) priors. Our model relaxes the traditional linearity assumption among EEG predictors by explicitly modeling signal interaction effects across channels, while performing temporal-spatial channel selection, thereby improving both predictive performance and interpretability. To our knowledge, this is among the first models to explicitly incorporate inter-channel interactions into EEG-based prediction within a Bayesian framework.

We propose a relaxed-thresholded Gaussian process (RTGP) prior to flexibly model the association between EEG signals and stimulus-type outcomes. It has several advantages compared to existing work.
First, the RTGP prior defines a broad class of temporally varying functions that are piecewise smooth and sparse, which enables automatic feature selection with Bayesian inference. Second, compared with existing thresholded Gaussian process priors, such as the soft-thresholded (Kang et al. 2018) and hard-thresholded (Cai et al. 2020) variants, the proposed RTGP prior is more flexible, capable of adapting to both sparse and non-sparse patterns by tuning a relaxation parameter. It also offers substantial computational advantages, allowing efficient MCMC sampling even for large-scale EEG datasets. We evaluate the proposed method using both synthetic data and EEG data from the P300 speller study by Won et al. (2022). The SI-RTGP model achieves higher classification accuracy for many participants and identifies meaningful channels and channel pairs, which provides valuable insights into the neural mechanisms underlying P300 responses.

2 Method

2.1 Bayesian time-varying model with signal interactions

Our model is individual-specific, meaning that a separate model is built for each participant using only their own calibration data and evaluated on their corresponding test data. For clarity, the subject index is suppressed in the notation throughout the model development.

Suppose a total of $R$ target characters are typed during the calibration phase. For each character $r = 1, \dots, R$, the BCI presents $S$ stimulus sequences, each consisting of $J = 12$ flashes: 6 row stimuli ($j = 1, \dots, 6$) and 6 column stimuli ($j = 7, \dots, 12$) on a $6 \times 6$ speller matrix, in a random order. Let $I = \{(r, s, j) \mid r = 1, \dots, R;\ s = 1, \dots, S;\ j = 1, \dots, J\}$ denote the index set of all stimulus presentations. The total number of flashes is therefore $n = |I| = R \times S \times J$, and we use $i \in I$ to index an individual flash.
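The indexing scheme above can be sketched as follows. This is an illustration under stated assumptions: the number of characters `R = 3` and the target positions in `targets` are hypothetical, `S = 15` and `J = 12` follow the dataset description, and the helper `is_target` is an illustrative name. The index `j` identifies which row or column was flashed, not the randomized presentation order.

```python
from itertools import product

R, S, J = 3, 15, 12  # R is hypothetical; S = 15 and J = 12 follow the dataset description

# Index set I = {(r, s, j)}, kept 1-based to match the notation in the text.
index_set = list(product(range(1, R + 1), range(1, S + 1), range(1, J + 1)))
n = len(index_set)   # n = R * S * J

def is_target(j, target_row, target_col):
    """Stimuli j = 1..6 are rows and j = 7..12 are columns of the 6 x 6 matrix."""
    return j == target_row or j == 6 + target_col

# Hypothetical (row, column) position of each target character, both in 1..6.
targets = [(2, 5), (1, 1), (6, 3)]
labels = [int(is_target(j, *targets[r - 1])) for (r, s, j) in index_set]

print(n, sum(labels))  # total flashes, and how many of them are targets
```

Each sequence contains exactly two target flashes (one row, one column) out of twelve, so the target class makes up one sixth of the flashes. This is the class imbalance mentioned in the related work.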
Let $K$ denote the number of EEG channels and $T$ the number of time points within the post-stimulus window. Denote by $X_{ki}(t)$ the observed intensity of the EEG signal of the $i$-th stimulus from channel $k$ at time $t$. Let $X_{ki} = (X_{ki}(1), \dots, X_{ki}(T))^\top \in \mathbb{R}^T$ and $X_i = (X_{1i}^\top, X_{2i}^\top, \dots, X_{Ki}^\top)^\top \in \mathbb{R}^p$, where $p = KT$. We denote by $Z_i(k_1, k_2)$ the signal interaction between channels $k_1$ and $k_2$, where $1 \le k_1 < k_2 \le K$. In this study, we define the signal interaction as the Fisher Z-transformation of the Pearson correlation between $X_{k_1 i}$ and $X_{k_2 i}$, i.e.,

$$Z_i(k_1, k_2) = \frac{1}{2} \log\left\{ \frac{1 + \mathrm{cor}(X_{k_1 i}, X_{k_2 i})}{1 - \mathrm{cor}(X_{k_1 i}, X_{k_2 i})} \right\}.$$

Let $Z_i = \{Z_i(k_1, k_2)\}_{1 \le k_1 < k_2 \le K}$ […]

Definition 1. Given $\omega > 0$, suppose $f(x) \sim \mathrm{GP}(0, \kappa)$ and $\tilde f(x) \sim \mathrm{N}(f(x), \xi^2)$. Let $g(x) = f(x)\, I(|\tilde f(x)| > \omega) \triangleq T_r(f, \omega, \xi^2)$; then $g(x)$ follows a relaxed-thresholded Gaussian process, denoted as $g(x) \sim \mathrm{RTGP}(\kappa, \omega, \xi^2)$.

In Definition 1, $I(\cdot)$ denotes the indicator function and $T_r$ is the relaxed-thresholding function. The introduction of $\tilde f(x)$ allows the full conditional distribution of $f(x)$ to have a conjugate and closed-form expression. The parameter $\xi^2$ represents the conditional variance of $\tilde f(x)$ given $f(x)$ and serves as a relaxing parameter that controls the independent white noise added to $f(x)$. Smaller values of $\xi^2$ impose a stricter constraint that preserves the mean structure of $f(x)$, while larger values provide greater flexibility.

To illustrate, Figure 3 compares different thresholded GP functions. Let $f(x) \sim \mathrm{GP}(0, \kappa)$. The soft-thresholding function $T_s(f(x), 0.5)$ sets values with $|f(x)| < 0.5$ to zero, and otherwise shrinks the magnitude by the threshold 0.5. The hard-thresholding function $T_h(f(x), 0.5)$ also sets values with magnitude below 0.5 to zero, while leaving larger values unchanged.
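The three thresholding operators can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the functions are applied pointwise to values of $f$ on a grid; the function names and the example vector are illustrative and are not taken from the authors' implementation.

```python
import numpy as np

def soft_threshold(f, omega):
    """T_s: zero out |f| <= omega, otherwise shrink the magnitude by omega."""
    return np.sign(f) * np.maximum(np.abs(f) - omega, 0.0)

def hard_threshold(f, omega):
    """T_h: zero out |f| <= omega, otherwise leave f unchanged."""
    return f * (np.abs(f) > omega)

def relaxed_threshold(f, omega, xi2, rng):
    """T_r: keep f wherever a noisy copy f_tilde ~ N(f, xi2) exceeds omega in magnitude."""
    f_tilde = f + np.sqrt(xi2) * rng.standard_normal(f.shape)
    return f * (np.abs(f_tilde) > omega)

rng = np.random.default_rng(1)
f = np.array([-1.2, -0.4, 0.0, 0.3, 0.8])

print(soft_threshold(f, 0.5))
print(hard_threshold(f, 0.5))
print(relaxed_threshold(f, 0.5, xi2=0.01, rng=rng))
```

Unlike the soft and hard versions, the relaxed operator is random given $f$: the selection indicator depends on the auxiliary draw, and the relaxing parameter `xi2` controls how closely that indicator tracks $|f| > \omega$.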
Both soft and hard thresholded GPs impose sparsity and piecewise smoothness, with the hard-thresholded GP introducing jump discontinuities and the soft-thresholded GP remaining continuous.

Figure 3: Illustration of different thresholded Gaussian process priors. $T_s(\cdot, 0.5)$ and $T_h(\cdot, 0.5)$ represent the soft and hard thresholding functions with threshold 0.5. $T_r(\cdot, 0.5, \xi^2)$ represents the proposed relaxed-thresholding function with different values of the relaxing parameter $\xi^2$.

The second row of Figure 3 illustrates the proposed RTGP under different values of $\xi^2$. When $\xi^2 = 0.01$, $T_r\{f(x), 0.5, 0.01\}$ closely resembles the hard-thresholded GP. For $\xi^2 = 0.1$, $T_r\{f(x), 0.5, 0.1\}$ is a continuous function that preserves sparsity, similar to the soft-thresholded GP. As $\xi^2$ increases to 1, $T_r\{f(x), 0.5, 1\}$ recovers $f(x)$. This flexibility is particularly valuable in modeling EEG data, where the true signal pattern is unknown. For example, if the true curve near $x = 100$ has meaningful magnitude, as in Figure 3, RTGP can adapt and recover the pattern by choosing an appropriate $\xi^2$ (e.g., $\xi^2 = 1$), whereas both soft- and hard-thresholded GPs would enforce sparsity with probability 1. The following proposition formalizes the relationship between the RTGP and other thresholded GPs.

Proposition 1. Given a thresholding parameter $\omega > 0$, let $T_r(\theta, \omega, \xi^2) = \theta \cdot I(|\tilde\theta| > \omega)$, $T_h(\theta, \omega) = \theta \cdot I(|\theta| > \omega)$, and $T_s(\theta, \omega) = \mathrm{sgn}(\theta)(|\theta| - \omega) \cdot I(|\theta| > \omega)$, where $\theta \sim P_\theta(\theta)$. Then for any $\epsilon > 0$, there exists $\xi^2$ such that $\Pr(|T_r(\theta, \omega, \xi^2) - T_h(\theta, \omega)| < \epsilon) > 0$, $\Pr(|T_r(\theta, \omega, \xi^2) - \theta| < \epsilon) > 0$, and $\Pr(|T_r(\theta, \omega, \xi^2) - T_s(\theta^\star, \omega)| < \epsilon) > 0$, where $\theta^\star = \theta + \omega$ when $\theta > 0$ and $\theta^\star = \theta - \omega$ when $\theta < 0$. Furthermore, $\lim_{\xi^2 \to 0} T_r(\theta, \omega, \xi^2) = T_h(\theta, \omega)$ and $\lim_{\xi^2 \to \infty} T_r(\theta, \omega, \xi^2) = \theta$.
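The two limits in Proposition 1 can be read off from the probability that the relaxed threshold retains $\theta$. The calculation below is an informal sketch, not the proof given in the supplementary materials; it assumes $\tilde\theta \mid \theta \sim \mathrm{N}(\theta, \xi^2)$ as in Definition 1 and writes $\Phi$ for the standard normal distribution function.

```latex
\begin{align*}
\Pr\bigl(|\tilde\theta| > \omega \mid \theta\bigr)
  &= \Phi\!\left(\frac{\theta - \omega}{\xi}\right)
   + \Phi\!\left(\frac{-\theta - \omega}{\xi}\right), \\
\lim_{\xi \to 0} \Pr\bigl(|\tilde\theta| > \omega \mid \theta\bigr)
  &= I(|\theta| > \omega) \quad \text{for } |\theta| \neq \omega, \\
\lim_{\xi \to \infty} \Pr\bigl(|\tilde\theta| > \omega \mid \theta\bigr)
  &= \Phi(0) + \Phi(0) = 1 .
\end{align*}
```

As $\xi \to 0$ the retention indicator therefore agrees with the hard-thresholding indicator, and as $\xi \to \infty$ the value $\theta$ is always retained. Intermediate values of $\xi$ give a smooth retention probability in $\theta$.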
Proposition 1 provides a mathematical counterpart to Figure 3: the relaxed-thresholded function reduces to the soft or hard thresholded function with positive probability, and this flexibility is controlled by the relaxing parameter $\xi^2$. The proof is included in the supplementary materials.

Given a stationary kernel $\kappa$, we assign $\beta_k(t) \sim \mathrm{RTGP}(\kappa, \omega_1, \xi^2)$ for channel $k$. That is,

$$\beta_k(t) = E_k(t)\, I(|\tilde E_k(t)| > \omega_1), \quad E_k(t) \sim \mathrm{GP}(0, \kappa), \quad \tilde E_k(t) \sim \mathrm{N}(E_k(t), \xi^2), \tag{2}$$

where $\{E_k(\cdot)\}_{k=1}^{K}$ follow independent Gaussian processes. There are various choices for the kernel function $\kappa(\cdot, \cdot)$; for instance, we use the modified squared exponential (MSE) kernel, defined as

$$\kappa(x, x') = \exp\left\{ -\alpha \left( \|x\|_2^2 + \|x'\|_2^2 \right) - \rho \|x - x'\|_2^2 \right\},$$

where $\alpha > 0$, $\rho > 0$, and $\|\cdot\|_2$ is the $L_2$-norm (Wu et al. 2024, 2025). Here, $\alpha$ is the decay parameter that controls the decay rate of the variance. The parameter $\rho$ is the smoothing parameter; a smaller value of $\rho$ corresponds to a smoother GP. Note that the MSE kernel becomes a standard squared exponential kernel when $\alpha = 0$. In our study, we set the hyperparameter $\alpha$ to a small value (0.01) and estimate $\rho$ following the discussion in Lin et al. (2023). Specifically, $\rho$ is estimated by averaging the smoothing parameters estimated from GP models fitted to the EEG signal on each channel. We also include a sensitivity analysis of the hyperparameter $\rho$ in the supplementary materials.

Similarly, for the effect of signal interaction across channels, we assign $\zeta(k_1, k_2) \sim \mathrm{RTGP}(\sigma_\eta^2 \kappa_I, \omega_2, \xi^2)$, $k_1 < k_2$, where $\kappa_I$ represents the identity kernel. That is,

$$\zeta(k_1, k_2) = \eta(k_1, k_2)\, I(|\tilde\eta(k_1, k_2)| > \omega_2), \quad \eta(k_1, k_2) \overset{\text{iid}}{\sim} \mathrm{N}(0, \sigma_\eta^2), \quad \tilde\eta(k_1, k_2) \sim \mathrm{N}(\eta(k_1, k_2), \xi^2), \tag{3}$$

and $\sigma_\eta^2 \sim \mathrm{IG}(a_\eta, b_\eta)$.
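A draw from the prior in (2) can be sketched as follows. This is an illustration under stated assumptions: the time grid is rescaled to $[-1, 1]$ (the scaling of $t$ is not specified here), the values `rho=10.0` and `omega=0.5` are hypothetical, and `alpha=0.01` follows the text. The function names are illustrative. The Gaussian process is sampled through an eigendecomposition rather than a Cholesky factor purely for numerical robustness.

```python
import numpy as np

def mse_kernel(x, alpha, rho):
    """Modified squared exponential kernel matrix for scalar inputs x."""
    x = np.asarray(x, dtype=float)
    sq = x ** 2
    diff2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-alpha * (sq[:, None] + sq[None, :]) - rho * diff2)

def draw_rtgp(x, alpha, rho, omega, xi2, rng):
    """One prior draw of beta(t) = E(t) * I(|E_tilde(t)| > omega) on the grid x."""
    kappa = mse_kernel(x, alpha, rho)
    eigval, eigvec = np.linalg.eigh(kappa)
    eigval = np.clip(eigval, 0.0, None)       # guard against tiny negative eigenvalues
    e = eigvec @ (np.sqrt(eigval) * rng.standard_normal(len(x)))  # E ~ N(0, kappa)
    e_tilde = e + np.sqrt(xi2) * rng.standard_normal(len(x))      # E_tilde ~ N(E, xi2)
    return e * (np.abs(e_tilde) > omega)

rng = np.random.default_rng(2)
t = np.linspace(-1.0, 1.0, 60)  # hypothetical rescaled post-stimulus time grid
beta = draw_rtgp(t, alpha=0.01, rho=10.0, omega=0.5, xi2=1e-4, rng=rng)
print(np.mean(beta == 0.0))     # fraction of time points set exactly to zero
```

The interaction prior in (3) has the same form with the kernel matrix replaced by $\sigma_\eta^2$ times the identity, so each $\eta(k_1, k_2)$ is an independent normal draw.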
We choose the identity kernel here to assume prior independence across channel pairs. Other kernels can be adopted when prior information is available in different applications.

By combining Model (1) with the prior specifications in (2) and (3), we define the proposed model as a Bayesian time-varying classification model with signal interactions via relaxed-thresholded Gaussian process prior (SI-RTGP).

3 Posterior Computation

In this section, we describe the posterior computation for Model (1) with priors (2) and (3) using the probit link function. We represent the Gaussian processes by the Karhunen-Loève expansion and obtain an equivalent model representation. Specifically, consider the spectral decomposition of the kernel function, $\kappa(x, x') = \sum_{l=1}^{\infty} \lambda_l \psi_l(x) \psi_l(x')$, where $\{\lambda_l\}_{l=1}^{\infty}$ are the eigenvalues in descending order, and $\{\psi_l(x)\}_{l=1}^{\infty}$ are the corresponding orthonormal eigenfunctions. By Mercer's Theorem, we can represent the Gaussian process $E_k(t)$ in (2) by $E_k(t) = \sum_{l=1}^{\infty} e_{kl} \psi_l(t)$, where $e_{kl}$ are the Karhunen-Loève coefficients. We truncate the expansion to the leading $L$ terms, where $L$ is chosen following the common practice in principal component analysis of retaining enough components to explain a high proportion of the total variation. Then, the prior of $\beta_k(\cdot)$ specified in (2) becomes

$$\beta_k(t) = \left\{ \sum_{l=1}^{L} e_{kl} \psi_l(t) \right\} I(|\tilde E_k(t)| > \omega_1), \quad \tilde E_k(t) \sim \mathrm{N}\left( \sum_{l=1}^{L} e_{kl} \psi_l(t),\ \xi^2 \right), \tag{4}$$

and $e_{kl} \sim \mathrm{N}(0, \sigma_e^2 \lambda_l)$. We set $\sigma_e^2$ to a large value and $a_\eta = b_\eta = 0.001$ so that the priors are non-informative. For flexibility, we set $\xi^2 = 1$ in the first 200 iterations and then gradually decrease its value to 0.0001. For $\omega_1$ and $\omega_2$, we set them to zero in the first 200 steps, then assign an adaptive discrete prior, i.e., $P(\omega_1 = \gamma_{1z}) = 1/Z$ and $P(\omega_2 = \gamma_{2z}) = 1/Z$, $z = 1, \dots, Z$, where $\{\gamma_{1z}\}_{z=1}^{Z}$ and $\{\gamma_{2z}\}_{z=1}^{Z}$ are $Z$ evenly spaced numbers between the $a_\omega$ quantile and the $b_\omega$ quantile of $\{|\tilde E_k(t)|\}_{k=1,t=1}^{K,T}$ and $\{|\tilde\eta(k_1, k_2)|\}_{k_1 < k_2}$ […]

[…] Let $\nu_k(t) = I(|\beta_k(t)| > 0)$ denote the true signal support, and define its estimated counterpart as $\hat\nu_k(t) = 1$ if $|\hat\beta_k(t)| > 0$, and $\hat\nu_k(t) = 0$ otherwise. We quantify selection performance using the effective selection window ratio (ESWR) and the exclusive effective window ratio (EEWR), defined respectively as

$$\mathrm{ESWR}(\nu_k) = \frac{\left| \{ t : \hat\nu_k(t) = 1 \ \&\ \nu_k(t) = 1 \} \right|}{\left| \{ t : \nu_k(t) = 1 \} \right|}, \qquad \mathrm{EEWR}(\nu_k) = \frac{\left| \{ t : \hat\nu_k(t) = 0 \ \&\ \nu_k(t) = 0 \} \right|}{\left| \{ t : \nu_k(t) = 0 \} \right|}.$$

Accordingly, ESWR reflects how effectively a method recovers true signal time points, whereas EEWR reflects how effectively it excludes non-signal time points.

5.3 Simulation results

Table 2 reports the character-level prediction accuracy under different simulation settings. Overall, the proposed SIRTGP methods achieve the best prediction performance in nearly all settings; the exception is the weakest interaction setting ($\tau^2 = 1$), where RTGP-P attains a slightly higher mean accuracy. In most configurations, the SIRTGP-P method achieves the highest accuracy among the competing approaches. As the peak ratio $\alpha$ increases, prediction accuracy generally improves. The advantage of SIRTGP is most pronounced under weak main effects ($\alpha = 2.5$). Under stronger main effects, several machine learning methods exhibit less stable behavior, whereas SIRTGP remains consistently competitive. Prediction accuracy decreases as the noise variance $\sigma^2$ increases for all methods. Nevertheless, SIRTGP maintains superior performance across all noise levels. In addition, increasing $\tau^2$ strengthens channel-level spatial dependence in the noise process.
In this setting, SIRTGP achieves the largest performance gains and the best overall accuracy, which highlights the benefit of explicitly modeling interaction structure when such dependence is present.

Table 3 reports the selection accuracy results. Compared with SWLDA, SIRTGP attains substantially higher ESWR on channels with true signal activity, indicating more accurate recovery of contiguous signal-support regions. In contrast, SWLDA tends to select isolated time points, which leads to reduced coverage of the true signal window. Meanwhile, both methods achieve comparably high EEWR, suggesting effective control of false discoveries in non-signal regions.

Table 2: Character-level prediction accuracy under different simulation settings. Entries are Mean (SD), where Mean is the mean accuracy and SD is the standard deviation.

(a) Different peak ratios under $\sigma^2 = 20$ and $\tau^2 = 9$.

| Method | $\alpha = 2.5$ | $\alpha = 3.0$ | $\alpha = 3.5$ |
|---|---|---|---|
| EEGNet | 0.804 (0.196) | 0.916 (0.135) | 0.784 (0.236) |
| LR | 0.524 (0.204) | 0.769 (0.190) | 0.583 (0.298) |
| RF | 0.502 (0.223) | 0.769 (0.210) | 0.545 (0.321) |
| SVC | 0.694 (0.220) | 0.886 (0.147) | 0.693 (0.278) |
| SWLDA | 0.689 (0.229) | 0.885 (0.159) | 0.683 (0.286) |
| XGBoost | 0.498 (0.244) | 0.645 (0.222) | 0.428 (0.279) |
| STGP-P | 0.530 (0.254) | 0.650 (0.282) | 0.719 (0.283) |
| RTGP-L | 0.715 (0.214) | 0.882 (0.153) | 0.952 (0.087) |
| RTGP-P | 0.752 (0.216) | 0.885 (0.151) | 0.950 (0.093) |
| SIRTGP-L | 0.897 (0.138) | 0.946 (0.092) | 0.974 (0.057) |
| SIRTGP-P | 0.931 (0.115) | 0.959 (0.078) | 0.978 (0.048) |

(b) Different noise variances under $\alpha = 2.5$ and $\tau^2 = 9$.

| Method | $\sigma^2 = 20$ | $\sigma^2 = 25$ | $\sigma^2 = 40$ |
|---|---|---|---|
| EEGNet | 0.804 (0.196) | 0.742 (0.213) | 0.694 (0.237) |
| LR | 0.524 (0.204) | 0.450 (0.192) | 0.417 (0.208) |
| RF | 0.502 (0.223) | 0.415 (0.201) | 0.370 (0.228) |
| SVC | 0.694 (0.220) | 0.611 (0.220) | 0.560 (0.244) |
| SWLDA | 0.689 (0.229) | 0.605 (0.230) | 0.548 (0.252) |
| XGBoost | 0.498 (0.244) | 0.408 (0.274) | 0.393 (0.261) |
| STGP-P | 0.530 (0.254) | 0.493 (0.244) | 0.405 (0.208) |
| RTGP-L | 0.715 (0.214) | 0.662 (0.218) | 0.506 (0.200) |
| RTGP-P | 0.752 (0.216) | 0.691 (0.223) | 0.547 (0.214) |
| SIRTGP-L | 0.897 (0.138) | 0.860 (0.160) | 0.700 (0.209) |
| SIRTGP-P | 0.931 (0.115) | 0.893 (0.150) | 0.768 (0.207) |

(c) Different interaction strengths under $\alpha = 2.5$ and $\sigma^2 = 20$.

| Method | $\tau^2 = 1$ | $\tau^2 = 4$ | $\tau^2 = 9$ |
|---|---|---|---|
| EEGNet | 0.434 (0.191) | 0.810 (0.221) | 0.804 (0.196) |
| LR | 0.829 (0.185) | 0.603 (0.273) | 0.524 (0.204) |
| RF | 0.588 (0.224) | 0.576 (0.299) | 0.502 (0.223) |
| SVC | 0.749 (0.205) | 0.727 (0.259) | 0.694 (0.220) |
| SWLDA | 0.762 (0.212) | 0.723 (0.266) | 0.689 (0.229) |
| XGBoost | 0.572 (0.237) | 0.536 (0.238) | 0.498 (0.244) |
| STGP-P | 0.618 (0.276) | 0.582 (0.274) | 0.530 (0.254) |
| RTGP-L | 0.828 (0.177) | 0.787 (0.193) | 0.715 (0.214) |
| RTGP-P | 0.837 (0.181) | 0.801 (0.201) | 0.752 (0.216) |
| SIRTGP-L | 0.801 (0.186) | 0.834 (0.170) | 0.897 (0.138) |
| SIRTGP-P | 0.827 (0.190) | 0.863 (0.163) | 0.931 (0.115) |

Table 3: Selection accuracy measured by ESWR and EEWR under $\alpha = 2.5$, $\sigma^2 = 20$, and $\tau^2 = 9$. The reported numbers are Mean (SD) across simulation replications.

| Metric | Method | Ch 1 | Ch 2 | Ch 3 | Ch 4 | Ch 5 | Ch 6 |
|---|---|---|---|---|---|---|---|
| ESWR | SIRTGP | 0.465 (0.103) | 0.518 (0.144) | 0.479 (0.100) | 0.497 (0.123) | – | – |
| ESWR | SWLDA | 0.393 (0.067) | 0.395 (0.055) | 0.378 (0.066) | 0.425 (0.082) | – | – |
| EEWR | SIRTGP | 0.918 (0.087) | 0.940 (0.073) | 0.944 (0.090) | 0.941 (0.074) | 0.960 (0.046) | 0.971 (0.039) |
| EEWR | SWLDA | 0.936 (0.052) | 0.934 (0.067) | 0.949 (0.058) | 0.935 (0.059) | 0.947 (0.031) | 0.941 (0.034) |

6 Conclusion

In this study, we propose a Bayesian time-varying regression model with channel interactions via relaxed-thresholded Gaussian process (RTGP) priors for the P300 BCI speller. The proposed SIRTGP model achieves the best overall predictive performance under varying signal and noise conditions in simulation, and shows superior temporal channel selection compared with the baseline method. The application of SIRTGP to a publicly available dataset demonstrates its real-world advantage over common predictive models used in P300 spellers.
Furthermore, through the proposed framework, we identify key channels and channel pairs that contribute to P300 detection, which offers insights for future BCI studies and neural signal modeling.

References

Aben, B., Calderon, C. B., Van den Bussche, E. & Verguts, T. (2020), ‘Cognitive effort modulates connectivity between dorsal anterior cingulate cortex and task-relevant cortical areas’, Journal of Neuroscience 40(19), 3838–3848.

Aguilar, O. & Grullon, R. (2025), ‘Spatial dynamics and functional connectivity in EEG: Insights from lexical processing’, bioRxiv pp. 2025–04.

Barthélemy, Q., Chevallier, S., Bertrand-Lalo, R. & Clisson, P. (2023), ‘End-to-end P300 BCI using Bayesian accumulation of Riemannian probabilities’, Brain-Computer Interfaces 10(1), 50–61.

Cai, Q., Kang, J. & Yu, T. (2020), ‘Bayesian network marker selection via the thresholded graph Laplacian Gaussian prior’, Bayesian Analysis 15(1), 79.

Costumero, V., Bueichekú, E., Adrián-Ventura, J. & Ávila, C. (2020), ‘Opening or closing eyes at rest modulates the functional connectivity of V1 with default and salience networks’, Scientific Reports 10(1), 9137.

Dal Seno, B., Matteucci, M. & Mainardi, L. T. (2009), ‘The utility metric: a novel method to assess the overall performance of discrete brain–computer interfaces’, IEEE Transactions on Neural Systems and Rehabilitation Engineering 18(1), 20–28.

Ditthapron, A., Banluesombatkul, N., Ketrat, S., Chuangsuwanich, E. & Wilaiprasitporn, T. (2019), ‘Universal joint feature extraction for P300 EEG classification using multi-task autoencoder’, IEEE Access 7, 68415–68428.

Douw, L., Nieboer, D., van Dijk, B. W., Stam, C. J. & Twisk, J. W. (2014), ‘A healthy brain in a healthy body: brain network correlates of physical and mental fitness’, PLoS One 9(2), e88202.

Elton, A., Garbutt, J. C. & Boettiger, C. A. (2021), ‘Risk and resilience for alcohol use disorder revealed in brain functional connectivity’, NeuroImage: Clinical 32, 102801.

Farwell, L. A. & Donchin, E. (1988), ‘Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials’, Electroencephalography and Clinical Neurophysiology 70(6), 510–523.

Fazel-Rezai, R., Allison, B. Z., Guger, C., Sellers, E. W., Kleih, S. C. & Kübler, A. (2012), ‘P300 brain computer interface: current challenges and emerging trends’, Frontiers in Neuroengineering 5, 14.

Friston, K. J., Buechel, C., Fink, G. R., Morris, J., Rolls, E. & Dolan, R. J. (1997), ‘Psychophysiological and modulatory interactions in neuroimaging’, NeuroImage 6(3), 218–229.

Griffin, J. & Brown, P. (2017), ‘Hierarchical shrinkage priors for regression models’, Bayesian Analysis 12(1), 135–159.

Hao, N., Feng, Y. & Zhang, H. H. (2018), ‘Model selection for high-dimensional quadratic regression via regularization’, Journal of the American Statistical Association 113(522), 615–625.

Indovina, I. & Macaluso, E. (2004), ‘Occipital–parietal interactions during shifts of exogenous visuospatial attention: trial-dependent changes of effective connectivity’, Magnetic Resonance Imaging 22(10), 1477–1486.

Ismaylova, E., Di Sante, J., Gouin, J.-P., Pomares, F. B., Vitaro, F., Tremblay, R. E. & Booij, L. (2018), ‘Associations between daily mood states and brain gray matter volume, resting-state functional connectivity and task-based activity in healthy adults’, Frontiers in Human Neuroscience 12, 168.

Kabbara, A., Khalil, M., El-Falou, W., Eid, H. & Hassan, M. (2016), ‘Functional brain connectivity as a new feature for P300 speller’, PLoS One 11(1), e0146282.

Kang, J., Reich, B. J. & Staicu, A.-M. (2018), ‘Scalar-on-image regression via the soft-thresholded Gaussian process’, Biometrika 105(1), 165–184.
Kap er, M., Meinic ke, P ., Grossek atho efer, U., Lingner, T. & Ritter, H. (2004), ‘Bci comp e- tition 2003-data set iib: supp ort vector mac hines for the p300 sp eller paradigm’, IEEE T r ansactions on biome dic al Engine ering 51 (6), 1073–1076. Kshirsagar, G. B. & Londhe, N. D. (2019), ‘W eigh ted ensemble of deep conv olution neu- ral net works for single-trial character detection in dev anagari-script-based p300 sp eller’, IEEE T r ansactions on Co gnitive and Developmental Systems 12 (3), 551–560. Kurihara, Y., T ak ahashi, T. & Osu, R. (2022), ‘The relationship b etw een stabilit y of in- terp ersonal co ordination and inter-brain eeg synchronization during anti-phase tapping’, Scientiﬁc r ep orts 12 (1), 6164. Kw ak, N.-S., M ¨ uller, K.-R. & Lee, S.-W. (2015), ‘A low er limb exoskeleton con trol sys- tem based on steady state visual evok ed p oten tials’, Journal of neur al engine ering 12 (5), 056009. La whern, V. J., Solon, A. J., W ayto wich, N. R., Gordon, S. M., Hung, C. P . & Lance, B. J. (2018), ‘Eegnet: a compact conv olutional neural netw ork for eeg-based brain–computer in terfaces’, Journal of neur al engine ering 15 (5), 056013. Lin, Z., Si, Y. & Kang, J. (2023), ‘Latent subgroup iden tiﬁcation in image-on-scalar regres- sion’, arXiv pr eprint arXiv:2307.00129 . Ma, G., Kang, J., Thompson, D. E. & Huggins, J. E. (2023), ‘Bci-utilit y metric for asyn- c hronous p300 brain-computer in terface systems’, IEEE T r ansactions on Neur al Systems and R ehabilitation Engine ering 31 , 3968–3977. 32 Ma, T., Li, Y., Huggins, J. E., Zhu, J. & Kang, J. (2022), ‘Bay esian inferences on neu- ral activity in eeg-based brain-computer in terface’, Journal of the Americ an Statistic al Asso ciation 117 (539), 1122–1133. Mic hael, E. B., Keller, T. A., Carp enter, P . A. & Just, M. A. (2001), ‘fmri inv estigation of sen tence comprehension b y eye and by ear: Mo dalit y ﬁngerprin ts on cognitiv e pro cesses’, Human br ain mapping 13 (4), 239–252. Noad, K. 
N., W atson, D. M. & Andrews, T. J. (2024), ‘F amiliarity enhances functional connectivit y b et ween visual and nonvisual regions of the brain during natural viewing’, Cer ebr al Cortex 34 (7), bhae285. Pfurtsc heller, G., M ¨ uller-Putz, G. R., Scherer, R. & Neup er, C. (2008), ‘Rehabilitation with brain-computer interface systems’, Computer 41 (10), 58–65. Philip, J. T. & George, S. T. (2020), ‘Visual p300 mind-sp eller brain-computer interfaces: a walk through the recen t developmen ts with sp ecial fo cus on classiﬁcation algorithms’, Clinic al EEG and neur oscienc e 51 (1), 19–33. Sak amoto, Y. & Aono, M. (2009), Sup ervised adaptive do wnsampling for P300-based brain- computer interface, in ‘2009 Annual In ternational Conference of the IEEE Engineering in Medicine and Biology So ciety’, IEEE, pp. 567–570. Sarraf, J., Pattnaik, P . et al. (2023), ‘A study of classiﬁcation techniques on P300 sp eller dataset’, Materials T o day: Pr o c e e dings 80 , 2047–2050. Shi, R. & Kang, J. (2015), ‘Thresholded m ultiscale gaussian pro cesses with applica- tion to bay esian feature selection for massiv e neuroimaging data’, arXiv pr eprint arXiv:1504.06074 . 33 Shokri-Ko jori, E., T omasi, D., Wiers, C. E., W ang, G.-J. & V olko w, N. D. (2017), ‘Alcohol aﬀects brain functional connectivity and its coupling with b ehavior: greater eﬀects in male heavy drinkers’, Mole cular psychiatry 22 (8), 1185–1195. T erstege, D. J., Durante, I. M. & Epp, J. R. (2022), ‘Brain-wide neuronal activ ation and functional connectivit y are mo dulated by prior exp osure to rep etitiv e learning episo des’, F r ontiers in Behavior al Neur oscienc e 16 , 907707. T ononi, G. & Edelman, G. M. (1998), ‘Consciousness and complexity’, scienc e 282 (5395), 1846–1851. T rimmel, K., v an Graan, A. L., Caciagli, L., Haag, A., Ko epp, M. J., Thompson, P . J. & Duncan, J. S. (2018), ‘Left temp oral lob e language net w ork connectivit y in temp oral lob e epilepsy’, Br ain 141 (8), 2406–2418. 
V an Erp, J., Lotte, F. & T angermann, M. (2012), ‘Brain-computer interfaces: b eyond medical applications’, Computer 45 (4), 26–34. W ang, C., Jiang, B. & Zhu, L. (2021), ‘P enalized in teraction estimation for ultrahigh dimensional quadratic regression’, Statistic a Sinic a 31 (3), 1549–1570. W ang, W., Li, H., W ang, Y., Liu, L. & Qian, Q. (2024), ‘Changes in eﬀective connectivit y during the visual-motor in tegration tasks: a preliminary f-nirs study’, Behavior al and Br ain F unctions 20 (1), 4. W on, K., Kwon, M., Ahn, M. & Jun, S. C. (2022), ‘EEG dataset for RSVP and P300 sp eller brain-computer interfaces’, Scientiﬁc data 9 (1), 388. W u, B., Guo, Y. & Kang, J. (2024), ‘Ba yesian spatial blind source separation via the thresh- olded gaussian pro cess’, Journal of the Americ an Statistic al Asso ciation 119 (545), 422– 433. 34 W u, B., W u, K. & Kang, J. (2025), ‘Bay esian scalar-on-image regression with a spa- tially v arying single-lay er neural netw ork prior’, Journal of Machine L e arning R ese ar ch 26 (116), 1–38. Xu, N., Gao, X., Hong, B., Miao, X., Gao, S. & Y ang, F. (2004), ‘Bci comp etition 2003- data set iib: enhancing p300 w av e detection using ica-based subspace pro jections for b ci applications’, IEEE tr ansactions on biome dic al engine ering 51 (6), 1067–1072. Xu, Y., Johnson, T. D., Heitzeg, M. & Kang, J. (2026), ‘Ba yesian image mediation analysis’, Journal of the Americ an Statistic al Asso ciation In Press . Xu, Y. & Kang, J. (2025), Ba yesian image regression with soft-thresholded conditional autoregressiv e prior, in ‘The Thirteenth In ternational Conference on Learning Represen- tations’. Zhang, Y., Zhou, G., Jin, J., Zhao, Q., W ang, X. & Cic ho cki, A. (2015), ‘Sparse ba y esian classiﬁcation of eeg for brain–computer interface’, IEEE tr ansactions on neur al networks and le arning systems 27 (11), 2256–2267. Zhao, B., Huggins, J. E. & Kang, J. 
(2025), ‘Bay esian inference on brain-computer inter- faces via glass’, Journal of the Americ an Statistic al Asso ciation In Press . 35
