Classification of EEG Signals using Genetic Programming for Feature Construction

Ícaro Marcelino Miranda
University of Brasília, Brasília, Brazil
icaro.marcelino@hotmail.com

Claus Aranha
Tsukuba University, Tsukuba, Japan
caranha@cs.tsukuba.ac.jp

Marcelo Ladeira
University of Brasília, Brasília, Brazil
mladeira@unb.br

ABSTRACT
The analysis of electroencephalogram (EEG) waves is of critical importance for the diagnosis of sleep disorders, such as sleep apnea and insomnia, as well as seizures, epilepsy, head injuries, dizziness, headaches and brain tumors. In this context, one important task is the identification of visible structures in the EEG signal, such as sleep spindles and K-complexes. The identification of these structures is usually performed by visual inspection by human experts, a process that can be error-prone and susceptible to biases. There is therefore interest in developing technologies for the automated analysis of EEG. In this paper, we propose a new Genetic Programming (GP) framework for feature construction and dimensionality reduction from EEG signals. We use these features to automatically identify spindles and K-complexes on data from the DREAMS project. Using 5 different classifiers, the set of attributes produced by GP obtained better AUC scores than those obtained from PCA or the full set of attributes. The proposed framework also obtained a better balance of Specificity and Recall than other models recently proposed in the literature. Analysis of the features most used by GP also suggested improvements for data acquisition protocols in future EEG examinations.
CCS CONCEPTS
• Mathematics of computing → Dimensionality reduction; • Computing methodologies → Genetic programming;

KEYWORDS
Classification, EEG, Dimensionality Reduction, Feature Construction, Feature Selection, Genetic Programming, K Complex, Sleep Spindles

ACM Reference Format:
Ícaro Marcelino Miranda, Claus Aranha, and Marcelo Ladeira. 2019. Classification of EEG Signals using Genetic Programming for Feature Construction. In Genetic and Evolutionary Computation Conference (GECCO ’19), July 13–17, 2019, Prague, Czech Republic. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3321707.3321737

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
GECCO ’19, July 13–17, 2019, Prague, Czech Republic
© 2019 Copyright held by the owner/author(s). Publication rights licensed to the Association for Computing Machinery.
ACM ISBN 978-1-4503-6111-8/19/07. . . $15.00
https://doi.org/10.1145/3321707.3321737

1 INTRODUCTION
About 40% of the world’s population suffers from some sleep disorder [28, 41]. Sleep quality directly affects health and quality of life. Poor sleep leads many people to seek out specialized clinics for an accurate diagnosis. One of the most common techniques of analysis is polysomnography (PSG), which observes brain activity, eye movement, muscle tension, and other body signals. The examination consists of collecting data through a series of electrodes connected to the patient’s skin and scalp during his or her usual nighttime sleep.

This examination allows the diagnosis of several disorders, such as obstructive sleep apnea, insomnia, narcolepsy, restless legs syndrome and bruxism.
It is also useful for the identification of visible waveforms such as sleep spindles (SS) and K-complexes (KC) which, besides assisting in sleep staging, are related to the consolidation of memory and sensory systems. Abnormalities in their forms may indicate neuropathologies or sleep disorders. In patients with sleep or neurological disorders, the study of these waveforms helps in understanding neurophysiological functioning and thus allows hypotheses about the problem to be raised. In particular, sleep spindles have a number of theoretical and clinical implications for understanding how brain activity during sleep is affected and how the disorder develops [42].

The identification of waveforms in EEG signals is usually done by visual inspection by experts. This is a time-consuming and tiring process, which may introduce biases and errors [34]. Consequently, specialists do not always arrive at the same identification, as illustrated in Figure 2. Because of this, there is interest in the development of automated tools for waveform detection [42].

There are many challenges related to EEG signal analysis. The signals have spatial and temporal covariance, implying highly dependent samples. They are also non-stationary, noisy and sensitive to external interference [19]. In order to describe these signals without losing information, a high number of features must be extracted from the original signals, resulting in high-dimensional samples.

One way to improve the automated classification of EEG signal structures is by using dimensionality reduction and feature construction techniques. In this sense, Genetic Programming (GP) can be used to evolve a function that generates a set of new, reduced features from the original ones. Using GP for feature construction has two advantages. First, GP can generate non-linear combinations of features, making it more expressive than traditional feature reduction techniques.
Second, an analysis of the rules created by GP may allow insights about the importance of the different original features, as suggested by Ivert et al. [17].

Guo et al. have previously proposed a GP framework for feature construction in the context of seizure detection in EEG signals using KNN [13]. In this paper, we build upon this work and apply it to the more difficult problem of detecting structures such as Sleep Spindles and K-complexes. More specifically, we use short samples (2 s vs 26 s) to precisely identify the locations of the structures, we use AUC instead of Precision as the fitness function, and we explore five different classifiers instead of just KNN.

To test the proposed framework (Figure 1) we perform the identification of Sleep Spindles and K-complexes on the DREAMS [9] dataset. Starting from a set of 75 features per sample, the proposed GP finds a constructed set with a median of 12 features. We show that the feature set found by GP achieved better AUC than the full set of features, or even a set of 29 features selected by PCA. Additionally, the proposed model achieves a better balance of Recall and Specificity when compared with other recently proposed models for the same problem. Finally, and perhaps more interestingly, an analysis of the rules constructed by GP showed that we could use only one of the three EEG channels in the dataset to obtain the same quality of identification. This result suggests that a simpler examination could be used, causing less discomfort for patients.

2 THE EEG CLASSIFICATION PROBLEM
Electroencephalogram (EEG) is a typically noninvasive examination for the observation of the electrical activity of the brain [36]. This information is obtained through electrodes attached to the scalp with a conductive paste. Through the analysis of these data it is possible to detect diseases and psychiatric and neurological problems.
Usually, the analysis of these signals is done visually by experts (Figure 2), which makes the process tiresome, tedious and susceptible to errors [34].

Figure 1: Framework proposed in this work for the identification of structures in sleep EEG
Figure 2: Example of Sleep Spindles and K-complexes identified from EEG data by two different experts.

To assist specialists in this visual task, a number of methods for the automatic processing and analysis of EEG signals have been proposed. We highlight the use of automatic methods for the study of apnea [38], epilepsy [13], drowsiness [4], sleep spindles [1, 5, 8, 9, 21, 43], K-complexes [14, 31, 37], sleep stages [22] and schizophrenia [33].

2.1 Sleep Spindles and K-complexes
Sleep has two main phases: REM (Rapid Eye Movement) sleep and NREM (non-REM) sleep. Occupying up to 80% of the sleep time, the NREM phase is divided into 4 stages, ranging from lightest to deepest sleep [30].

In particular, stage 2 of NREM sleep has as its main characteristic the appearance of specific waveforms, K-complexes and Sleep Spindles. The beginning of this stage is defined by the occurrence of these signals. Because of this well-defined presence, they are very important for sleep staging.

Although Sleep Spindles mark entry into stage 2 of NREM sleep, they may also occur in stage 3 [2]. When a spindle occurs, the amplitude of the EEG signal increases and decreases progressively, with a minimum duration of 0.5 s and a defined bandwidth between 12 and 14 Hz under the criterion of Rechtschaffen and Kales [32] (some authors consider 11 to 16 Hz). Peak-to-peak amplitude settings between 5 and 25 µV can also be found [9]. The occurrence of spindles contributes to memory consolidation and to continuous sleep [3], and is relevant to the study of sleep and neurological disorders.
The characteristics of the spindles change with the patient’s age and sex [6]. Spindles tend to occur less with advancing age [29]. As for gender, the phenomenon usually occurs about twice as often during the sleep of women, due to hormonal factors [10, 24].

Sleep spindles, in general, have a well-defined structure (occurrence, bandwidth, and amplitude). However, advancing age and pathologies cause irregularities in their shape. Typically, the number of spindles decreases and their shape deteriorates [18]. Their shape may be distorted and they are more subject to interference [8]. Patients with schizophrenia, for example, do not have normal spindle patterns [12]. Changes can also be observed due to fatal familial insomnia [27], autism and epilepsy [16]. This lack of a standard pattern is important for the diagnosis of neurological diseases, but it makes the phenomenon more difficult to identify for both specialists and automatic methods.

The K-complex is a sharp negative wave immediately followed by a positive component that clearly stands out in the EEG, having a minimum duration of 0.5 s in the frequencies of 12 to 14 Hz [2]. In the identification, K-complexes can be easily confused with any waveform with high peaks [11, 20] (Figure 2). Abnormal activity of the K-complex may be related to epilepsy, restless leg syndrome, and obstructive sleep apnea.

2.2 DREAMS Data
We use the databases collected by the DREAMS project [9], which consist of a series of polysomnography (PSG) recordings with expert annotations of phenomena or sleep disorders. We use the sleep spindle and K-complex datasets from this project. Their purpose is to tune, train and test automatic detection algorithms.

The Spindles dataset consists of 30-minute stretches of the central EEG channel (extracted from full-night PSG records), independently annotated by two experts.
The data were acquired in a sleep laboratory of a Belgian hospital (Brainnet™ System of MEDATEC, Brussels) using a 32-channel digital polygraph. It is important to highlight that all records in this dataset are from patients with various sleep pathologies: dysomnia, restless legs syndrome, insomnia, apnea syndrome or hypopnea. EOG, EMG and EEG channels (FP1-A1, O1-A1 and C3-A1 or CZ-A1) were recorded, using the European Data Format (EDF) for storage. Only EEG channels are used in this work. The sampling frequency varies between patients: the 30-minute records were sampled at 200 Hz, 100 Hz or 50 Hz. The recordings were given independently to two experts, who annotated their estimates of the locations of the sleep spindles.

The K-complex records were collected in the same hospital as the spindle records, with the same equipment. There are 10 polysomnographic recordings from healthy individuals. As with the previous dataset, we only use EEG channels. The sampling frequency was 200 Hz for all subjects, with a duration of 30 min. In the same way, the excerpts were given independently to two experts. To reduce bias, the experts did not have access to the sleep staging of the records.

2.3 Data preparation
The original EEG data was prepared for the automated identification process using the following procedure.

2.3.1 Wavelet Transform. Wavelet transforms are mathematical tools capable of decomposing signals into several components that allow analysis at different time and frequency scales [7].

An input signal x passes through a low-pass filter g and a high-pass filter h (in parallel, not sequentially), each with a cut-off frequency equal to one half of the sampling frequency of the input. Then the two generated sub-signals, that is, the outputs of the filters, are downsampled by a factor of two (see Figure 3).
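The filter-bank step described above can be illustrated with a minimal sketch. The paper does not state which mother wavelet or implementation was used, so the Haar filter pair below is an assumption chosen only because it fits in a few lines; the function names `haar_dwt_level` and `haar_dwt` are hypothetical. Feeding the low-pass output back into the same step yields further decomposition levels.

```python
import math


def haar_dwt_level(signal):
    """One decomposition level: low-pass and high-pass Haar filters applied
    in parallel, each followed by downsampling by a factor of two."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Low-pass branch (approximation): scaled pairwise sums.
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    # High-pass branch (detail): scaled pairwise differences.
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level decomposition: the low-pass output of each level becomes
    the input of the next. Returns ([D1, ..., Dn], final approximation)."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        details.append(detail)
    return details, approx


# Hypothetical input: a 512-sample window (2 s at 256 Hz).
window = [math.sin(2.0 * math.pi * 13.0 * t / 256.0) for t in range(512)]
details, approx = haar_dwt(window, levels=5)
print([len(d) for d in details])  # [256, 128, 64, 32, 16]
```

Each level halves the number of coefficients, so five levels on a 512-sample window leave 16 approximation coefficients. A production pipeline would more likely rely on a dedicated wavelet library and a smoother wavelet family.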
Figure 3: Example of Discrete Wavelet Transform in 3 levels

This process can be repeated at several levels, with the output of the low-pass filter serving as the input signal to a new pair of filters, followed by downsampling.

2.3.2 Feature extraction. The EEG data used were sampled at different frequencies (50, 100 or 200 Hz). For the application of the Discrete Wavelet Transform (DWT) [23] with 5 decomposition levels, the data were resampled to a higher frequency of 256 Hz in all cases by means of cubic spline interpolation.

For each decomposition level (D1 to D5) in each EEG channel, the signal was separated into 2-second samples, and the following attributes were calculated for each sample: signal amplitude average, signal amplitude standard deviation (SD), symmetry, Power Spectral Density (PSD), and signal curve length.

Following this procedure, we obtain 900 samples (2 s samples from 30 minutes of signal) with 75 real-valued attributes (3 EEG channels, 5 DWT levels, and 5 attributes per level). These attributes are summarized in Tables 1, 2, and 3. Here, the columns represent the coefficients of the DWT levels and the rows the operations performed.

                D1      D2      D3      D4      D5
Average         ARG0    ARG5    ARG10   ARG15   ARG20
SD              ARG1    ARG6    ARG11   ARG16   ARG21
Symmetry        ARG2    ARG7    ARG12   ARG17   ARG22
PSD             ARG3    ARG8    ARG13   ARG18   ARG23
Curve Length    ARG4    ARG9    ARG14   ARG19   ARG24
Table 1: Attributes for the central EEG channel (CZ-A1 or C3-A1)

                D1      D2      D3      D4      D5
Average         ARG25   ARG30   ARG35   ARG40   ARG45
SD              ARG26   ARG31   ARG36   ARG41   ARG46
Symmetry        ARG27   ARG32   ARG37   ARG42   ARG47
PSD             ARG28   ARG33   ARG38   ARG43   ARG48
Curve Length    ARG29   ARG34   ARG39   ARG44   ARG49
Table 2: Attributes for the EEG channel FP1-A1

                D1      D2      D3      D4      D5
Average         ARG50   ARG55   ARG60   ARG65   ARG70
SD              ARG51   ARG56   ARG61   ARG66   ARG71
Symmetry        ARG52   ARG57   ARG62   ARG67   ARG72
PSD             ARG53   ARG58   ARG63   ARG68   ARG73
Curve Length    ARG54   ARG59   ARG64   ARG69   ARG74
Table 3: Attributes for the EEG channel O1-A1

3 PROPOSED FRAMEWORK
Previously, Guo et al. [13] proposed the use of Genetic Programming (GP) for the construction of features for EEG analysis in the classification of epileptic episodes. Taking this framework as a base, we develop a framework for the identification of structures in sleep EEG, which we describe in this section.

Several characteristics of the structure identification problem differentiate it from the earlier classification work. We must divide the EEG signals into multiple short samples in order to identify the position of the Spindles and K-complexes in the signal. As a consequence, the data becomes highly unbalanced, complicating the training process. Also, we work on three distinct EEG channels (as opposed to a single channel in the original work).

We tested several improvements on the original work to deal with this harder problem. First, we use the Area Under the ROC Curve (AUC) of the classifiers instead of the accuracy as the fitness measure, since the AUC is more robust and discriminating [15]. Also, we compare several classifiers in addition to KNN, to explore the relationship between classifier choice and GP feature construction.

GP is widely applied in the construction and selection of features because of its good performance. In classification problems it is possible to evolve a tree for each problem class, selecting the best attributes and creating new features for each of them [25]. Even with unbalanced data, this GP approach obtains good results [40]. In the literature, there are also application studies on benchmarks [35] and on databases with few samples [26].
Finally, we publish the program of the proposed framework and experiments at our repository (see footnote 1) for reproducibility purposes.

3.1 GP for Feature Construction
Our proposed framework uses Genetic Programming (GP) to generate the constructed set of features from the original features. The GP tree is defined as follows: The input nodes are selected from the original attributes. The intermediate nodes are selected from a set of arithmetic operators {+, −, ×}, as well as a set of protected operators {/, ln, √}. These protected operators have their definitions slightly modified to avoid errors such as division by zero, as follows:

\[
\text{protected division}(a, b) =
\begin{cases}
1, & b = 0 \\
a / b, & b \neq 0
\end{cases}
\]

\[
\text{protected log}(a) =
\begin{cases}
1, & a = 0 \\
\ln(|a|), & a \neq 0
\end{cases}
\]

\[
\text{protected square root}(a) = \sqrt{|a|}
\]

Additionally, a special "Feature Operator" [13], F, is used to indicate how to obtain the constructed features from the GP tree. The feature operator returns the value of its input as its output, without any changes. Its purpose is to mark a subtree as one of the constructed features. Each F operator is the root of a subtree that expresses the function to calculate one attribute of the constructed set. In this way, a GP tree containing ten nodes with the F operator generates a constructed attribute set with 10 attributes.

For example, the tree depicted in Figure 4 shows a GP individual with two subtrees marked by the F operator. If we assume that the original attribute set has 26 attributes (a..z), this tree generates a constructed attribute set with two attributes: F1 = a and F2 = b − 1.

The use of the F operator allows a single GP tree to express multiple attribute-constructing functions. In this way, we avoid having to explicitly define beforehand how many attributes will be constructed, which would be necessary if each attribute were expressed by a separate tree [13].
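The protected operators and the role of the F operator can be sketched as follows. This is an illustrative reading, not the authors' implementation: the nested-tuple tree encoding, the function names, and the "+" root that joins the two F-marked subtrees are assumptions made for the example, and the protected logarithm is taken to return 1 when its own argument is zero.

```python
import math


def protected_div(a, b):
    """Division that returns 1 instead of failing when the divisor is zero."""
    return 1.0 if b == 0 else a / b


def protected_log(a):
    """Natural log of |a|; returns 1 when a is zero (assumed reading)."""
    return 1.0 if a == 0 else math.log(abs(a))


def protected_sqrt(a):
    """Square root of |a|, so negative inputs never raise."""
    return math.sqrt(abs(a))


BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": protected_div,
}
UNARY = {"ln": protected_log, "sqrt": protected_sqrt}


def evaluate(node, sample, features):
    """Evaluate a tree node on one sample. Every F node passes its input
    through unchanged and records it as one constructed feature."""
    if isinstance(node, str):           # terminal: name of an original attribute
        return sample[node]
    if isinstance(node, (int, float)):  # terminal: numeric constant
        return float(node)
    op, *args = node
    values = [evaluate(arg, sample, features) for arg in args]
    if op == "F":
        features.append(values[0])
        return values[0]
    if op in UNARY:
        return UNARY[op](values[0])
    return BINARY[op](*values)


def constructed_features(tree, sample):
    """Return the constructed feature vector of one sample, one entry per F."""
    features = []
    evaluate(tree, sample, features)
    return features


# Two F-marked subtrees, F1 = a and F2 = b - 1; the "+" root is hypothetical.
tree = ("+", ("F", "a"), ("F", ("-", "b", 1.0)))
print(constructed_features(tree, {"a": 3.0, "b": 5.0}))  # [3.0, 4.0]
```

Because F is an identity function, moving or duplicating an F-rooted subtree during crossover changes the number of constructed features without changing how any individual feature is computed.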
Also, this allows GP trees in the population to exchange useful subtrees containing sets of constructed attributes that were successful. We believe that this allows the GP to pass the most relevant subtrees on to the next generations and thereby keep attributes and attribute subsets that facilitate classification.

To evaluate one GP tree using the structure described above, we use the following procedure. First, we generate the set of constructed attributes for the GP tree using the F operators. Second, we train a classifier using this set of constructed attributes. Finally, we use the AUC value of the classifier as the fitness value for the GP tree. In this way, the product of the evolutionary process is both a set of constructed attributes and a classifier trained on those attributes.

Footnote 1: https://github.com/IcaroMarcelino/SleepEEG

Figure 4: Example of GP program marked with F operators. This GP tree constructs two attributes: F1 = a and F2 = b − 1

The parameters used for the evolutionary process in the current framework are listed in Table 4. As these parameters provided good results in the initial runs, they were kept for the subsequent experiments. The "Uniform mutation" selects a random node from the individual and replaces the subtree rooted at that node with a randomly generated one.
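The last step of the fitness evaluation, scoring a trained classifier by its AUC, can be sketched without any external library. The rank-sum formulation below is a standard way to compute the AUC and is not taken from the authors' code; the function name `auc_score` and the example scores are hypothetical.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.
    labels are 0/1; scores are classifier outputs (higher means more positive).
    Tied scores receive the average of the ranks they span."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one sample of each class")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of samples sharing the same score.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Hypothetical classifier scores for four samples.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the measure remains informative on unbalanced test sets where accuracy would be dominated by the majority class.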
Parameter               Value
Number of generations   300
Number of individuals   100
Selection               Tournament (size = 3)
Crossover               One Point
Mutation                Uniform
Primitive Functions     +, −, /, ∗, ln, √, F
Fitness                 AUC
Table 4: GP Model Parameterization

3.2 Classifiers
We compare the performance of the set of attributes constructed by GP using five different classifiers:
• K Nearest Neighbors (KNN)
• Naive Bayes (NB)
• Support Vector Machines (SVM)
• Decision Tree (DT)
• Multilayer Perceptron (MLP)

We perform an initial tuning procedure using the full set of 75 features to select the hyperparameters used by each classifier in the next experiments. For each tuned classifier, we execute 100 runs for each parameter value tested in the following sets:
• KNN: k ∈ {3, 5, 7, 9, 11, 13, 15, 17, 19}
• SVM: kernel ∈ {Radial Basis Function (RBF), polynomial, sigmoid}
• MLP: activation function ∈ {ReLU, logistic}, neurons in hidden layer ∈ {15, 30, 45, 60, 75}

The parameters that provided the best performance were selected: KNN with k = 5, SVM with an RBF kernel, and MLP with a single hidden layer of 15 neurons and ReLU activation.

4 EXPERIMENT
We perform several evaluations of the classifiers on the Sleep Spindles and K-complexes datasets in order to analyse the performance of the proposed framework. The results of the classifiers are compared using the full set of 75 attributes, a reduced set of 29 attributes selected by PCA, and the set of attributes constructed by the GP framework.

The training dataset (used to train both the GP and the classifiers) was generated by simple random sampling of 70% of the signal samples, labelling them as positive samples (Sleep Spindles or K-complexes) if both specialists agreed on the label. Additionally, because the dataset is highly unbalanced, we balance the training dataset by randomly removing samples from the majority class until both classes have the same number of signal samples.
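The construction of the training labels and the balancing step can be sketched as below. The function names, the fixed seed and the toy annotations are assumptions for illustration; the paper only states that a training sample is positive when both experts agree and that majority-class samples are removed at random until the classes are equal in size.

```python
import random


def consensus_labels(expert1, expert2):
    """Training label: positive (1) only if both experts marked the sample."""
    return [1 if a == 1 and b == 1 else 0 for a, b in zip(expert1, expert2)]


def undersample(samples, labels, seed=0):
    """Randomly drop majority-class samples until both classes have the
    same number of samples. Original sample order is preserved."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy annotations for eight 2 s samples (hypothetical values).
expert1 = [1, 1, 0, 0, 0, 0, 1, 0]
expert2 = [1, 0, 0, 0, 1, 0, 1, 0]
labels = consensus_labels(expert1, expert2)
samples = list(range(8))  # stand-ins for 75-dimensional feature vectors
balanced_x, balanced_y = undersample(samples, labels)
print(labels)           # [1, 0, 0, 0, 0, 0, 1, 0]
print(len(balanced_x))  # 4
```

Undersampling discards data, but it keeps every remaining sample real, which suits a setting where the positive class is small and the negative class is plentiful.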
The test dataset was generated from the remaining 30% of the samples, and each signal sample was labeled as positive if either specialist annotated it as positive. Also, the balancing procedure is not performed on the testing dataset. This resulted in a slightly harder testing dataset.

For each experiment, we repeat the training/testing procedure 10 times, and report the aggregate results of these 10 repetitions as described in the subsections below.

4.1 Results
The first experiment was aimed at verifying the performance of the classifiers on the test dataset without dimensionality reduction by GP, i.e., using the 75 features. The results are also useful to justify the application of feature construction.

The classifiers' performance can be seen in Figure 5a. Only MLP has good results, achieving an AUC greater than 0.7 with low SD.

Applying PCA to the data with a 95% threshold for variance, ensuring little loss of information, the initial set of 75 features can be represented with 29 attributes. The performance of the trained classifiers with the feature set generated by PCA on the respective test set can be seen in Figure 5b. For this problem, the PCA representation caused a performance reduction in all classifiers except NB.

Applying the GP feature reduction, the AUC of the classification increases for all classifiers except KNN (Figure 5c) and the SD is reduced in all cases. This method is capable of reducing the number of features from 75 to less than 29 (Figure 7).

Using the same approach for training K-complex classifiers, the models achieve high AUC scores too (Figure 5d).
4.2 Analysis of gender difference
As mentioned before, there are gender differences in sleep spindles. To see if this difference affects the performance of the classifiers, the data were separated by gender.

Gender   Expert 1   Expert 2
Male     157        121
Female   198        288
Table 5: Spindles scored by the experts for each gender

Figure 5: Performance of the classifiers on the test set. (a) Full set of 75 initial attributes (Sleep Spindles); (b) 29 PCA components (Sleep Spindles); (c) reduced feature sets generated by GP (Sleep Spindles); (d) reduced feature sets generated by GP (K-complex); (e) reduced feature sets generated by GP (Sleep Spindles, male patients only); (f) reduced feature sets generated by GP (Sleep Spindles, female patients only).
Figure 6: Number of occurrences of each feature in the models generated by GP (same models as in Figure 5c).
Figure 7: Number of dimensions in the models generated by GP (same models as in Figure 5c).

Observing Figures 5e and 5f, the classifiers' performance for female patients is higher than for male patients. As sleep spindles occur more often in female patients, the data contain more representative samples of the waveform (see Table 5), which facilitates the training of more efficient classifiers.

4.3 Constructed Features Analysis
In Figure 6, the frequency of occurrence of features in the trained models shows that some attributes are more relevant than others in the dataset.
The greater occurrence of the features associated with the central EEG channel indicates that it has a more important role in the identification of sleep spindles. Using only this channel for training (i.e., only the first 25 features of the dataset), the performance of all classifiers increases for sleep spindle and K-complex identification (Figure 8). This attribute reduction contributes to a better understanding of the phenomenon, providing a more efficient approach by reducing the use of electrodes, consuming fewer resources and generating less discomfort for the patient.

Figure 8: Classifiers' performance over the central EEG channel attributes for Sleep Spindles and K-complexes

5 COMPARISON TO LITERATURE MODELS

Reference                        Recall   Specif.   Prec.   F1
Lachner-Piza et al., 2018 [21]   0.65     0.98      0.38    0.48
Tsanas and Clifford, 2015 [39]   0.76     0.92      0.33    0.46
Zhuang et al., 2016 [44]         0.51     0.99      0.70    0.59
Proposed model                   0.75     0.98      0.35    0.48
Table 6: Comparison between the proposed model and literature models

In Table 6, the best model generated with the proposed approach (with the NB classifier) is compared with models from the literature that also used DREAMS data (see footnote 2). Tsanas and Clifford [39] and Zhuang et al. [44] proposed approaches based on the continuous wavelet transform (CWT) and the estimation of the probability of spindle occurrences. Lachner-Piza et al. [21] proposed an SVM approach with a feature selection method based on the label-feature and feature-feature correlations for determining the relevance and redundancy of each feature.

Footnote 2: Further comparisons between sleep spindle identifiers can be seen in [21].

Observing the performance of the models, all obtained high specificity, indicating that the identification of samples where no spindles are present is reliable. Moreover, there is a trade-off between sensitivity and precision. In the context of applying automatic identifiers, false negatives are more unwanted than false positives. That is, a highly precise but not very sensitive classifier misses many events (false negatives), which limits its usefulness. In a semi-automatic application with low sensitivity, a specialist still has to inspect the recording to find the events that the classifier missed. This is the case of the detector of Zhuang et al. [44].

The detectors of Tsanas and Clifford [39] and Lachner-Piza et al. [21] and the proposed model achieve a better compromise between sensitivity (recall) and precision. This implies that the identification of signal stretches as spindles is more reliable.

The proposed approach allows the generation of models that are competitive with the literature. The Tsanas and Clifford model, although having a slightly higher sensitivity than the proposed model (0.76 vs 0.75), has lower specificity (0.92 vs 0.98) and precision (0.33 vs 0.35). Compared with the model of Lachner-Piza et al., our model has only slightly lower precision, a difference of 0.03.

6 DISCUSSION
The GP feature construction improves the performance of a classifier by reducing the search space and generating more explicit relations between variables. Observing the reduction in the number of attributes, the dimensionality of the problem is reduced by up to 7 times in most cases.

In addition, by analyzing the most frequent attributes, it is clear which ones are most relevant to the models. With this information, it is possible to select the most important EEG channels. In the case of K-complexes and sleep spindles, the central EEG channel alone is sufficient to perform the waveform identification.

The single-channel approach already reduces the search space to one third of its original size (25 of the 75 features). Furthermore, fewer electrodes are required for the examination, making it more comfortable for the patient and consequently bringing sleep in the laboratory closer to everyday sleep, avoiding biases.
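As a cross-check of the comparison in Table 6 (Section 5), the reported F1 values can be recomputed from the reported precision and recall. The sketch below uses only the standard definitions of the metrics; the function names and the dictionary layout are hypothetical, and the numbers are copied from Table 6.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Recall (sensitivity), specificity and precision from confusion counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return recall, specificity, precision


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# (recall, precision, reported F1) as listed in Table 6.
table6 = {
    "Lachner-Piza et al. [21]": (0.65, 0.38, 0.48),
    "Tsanas and Clifford [39]": (0.76, 0.33, 0.46),
    "Zhuang et al. [44]": (0.51, 0.70, 0.59),
    "Proposed model": (0.75, 0.35, 0.48),
}
for name, (recall, precision, reported_f1) in table6.items():
    print(name, round(f1_score(precision, recall), 2), reported_f1)
```

All four recomputed values agree with the table after rounding to two decimals, which confirms that the F1 column is the harmonic mean of the Prec. and Recall columns.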
7 CONCLUSIONS
The use of automatic methods to identify sleep phenomena makes it possible to classify EEG signal segments with good performance, indicating whether or not a particular event occurs. Excerpts of 30 minutes can contain hundreds of events to be identified. In this respect, the proposed model can be used to accelerate the process, leaving the final classification to the expert. The model can also be useful for sleep staging, since the presence of spindles and K-complexes strongly characterizes sleep stage 2.

It has also been shown that it is possible to significantly improve the performance of classifiers by selecting and constructing attributes. In addition, the use of GP allows greater interpretability and mathematical analysis of the new attributes generated, which may help to better understand the model and the problem. It is also possible to inspect the generated attributes using knowledge of the application domain.

The approach also does not require in-depth knowledge of the application domain. In the first experiment, no assumptions about the data were made. The ease of defining the terms in which the solutions are written, that is, the operators and the terminals, allows the creation of hybrid models with biomedical information. It can also facilitate communication between specialists from different areas.

The automation of the selection and construction of attributes generates a dataset suited to the desired classifier. However, it is very simple to apply the generated attributes to another classifier. The processing time is also reduced with the smaller number of dimensions.

The PSG signals used in sleep clinics are stored directly on computers. Therefore, the proposed technique can be easily applied in this context. The generated models, once trained, make predictions quickly, facilitating a real-time approach.
Measurement of micro-event activity in EEG signals in different populations can provide important information about abnormalities in brain signals and assist in the investigation and hypothesis assessment of observed phenomena or disturbances. This underscores the importance of the study. With the proposed models, the identification of spindles or K-complexes is less costly for the specialist. The models may even replace the specialist in this task if their performance is satisfactory for the requested analysis. Therefore, the diagnosis can be faster and feedback to the patient suffering from some disorder is more efficient. Moreover, this methodology can easily be extended to other classification problems.

REFERENCES

[1] Wessam Al-salman, Yan Li, Peng Wen, and Mohammed Diykh. 2018. An efficient approach for EEG sleep spindles detection based on fractal dimension coupled with time frequency image. Biomedical Signal Processing and Control 41 (2018), 210–221.
[2] Richard B Berry, Rita Brooks, Charlene E Gamaldo, Susan M Harding, CL Marcus, BV Vaughn, et al. 2012. The AASM manual for the scoring of sleep and associated events. Rules, Terminology and Technical Specifications, Darien, Illinois, American Academy of Sleep Medicine (2012).
[3] Z Clemens, D Fabo, and P Halasz. 2005. Overnight verbal memory retention correlates with the number of sleep spindles. Neuroscience 132, 2 (2005), 529–535.
[4] Thiago LT da Silveira, Alice J Kozakevicius, and Cesar R Rodrigues. 2016. Automated drowsiness detection through wavelet packet analysis of a single EEG channel. Expert Systems with Applications 55 (2016), 559–565.
[5] Aurora D’Atri, Luana Novelli, Michele Ferrara, Oliviero Bruni, and Luigi De Gennaro. 2018. Different maturational changes of fast and slow sleep spindles in the first four years of life. Sleep medicine 42 (2018), 73–82.
[6] Luigi De Gennaro and Michele Ferrara. 2003. Sleep spindles: an overview. Sleep medicine reviews 7, 5 (2003), 423–440.
[7] Paulo Cupertino de Lima. 2002. Wavelets: uma introdução. Matemática Universitária 33 (2002), 13–44.
[8] Stéphanie Devuyst, Thierry Dutoit, Jean-François Didier, François Meers, Etienne Stanus, Patricia Stenuit, and Myriam Kerkhofs. 2006. Automatic sleep spindle detection in patients with sleep disorders. In Engineering in Medicine and Biology Society, 2006. EMBS’06. 28th Annual International Conference of the IEEE. IEEE, 3883–3886.
[9] Stéphanie Devuyst, Thierry Dutoit, Patricia Stenuit, and Myriam Kerkhofs. 2011. Automatic sleep spindles detection - overview and development of a standard proposal assessment method. In Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE. IEEE, 1713–1716.
[10] Andrea Dzaja, Sara Arber, Jenny Hislop, Myriam Kerkhofs, Caroline Kopp, Thomas Pollmächer, Päivi Polo-Kantola, Debra J Skene, Patricia Stenuit, Irene Tobler, et al. 2005. Women’s sleep in health and disease. Journal of psychiatric research 39, 1 (2005), 55–76.
[11] Aykut Erdamar, Fazıl Duman, and Sinan Yetkin. 2012. A wavelet and teager energy operator based method for automatic detection of K-Complex in sleep EEG. Expert Systems with Applications 39, 1 (2012), 1284–1290.
[12] Fabio Ferrarelli, Reto Huber, Michael J Peterson, Marcello Massimini, Michael Murphy, Brady A Riedner, Adam Watson, Pietro Bria, and Giulio Tononi. 2007. Reduced sleep spindle activity in schizophrenia patients. American Journal of Psychiatry 164, 3 (2007), 483–492.
[13] Ling Guo, Daniel Rivero, Julián Dorado, Cristian R Munteanu, and Alejandro Pazos. 2011. Automatic feature extraction using genetic programming: An application to epileptic EEG classification. Expert Systems with Applications 38, 8 (2011), 10425–10436.
[14] Elena Hernández-Pereira, Veronica Bolón-Canedo, Noelia Sánchez-Maroño, Diego Álvarez-Estévez, Vicente Moret-Bonillo, and Amparo Alonso-Betanzos. 2016. A comparison of performance of K-complex classification methods using feature selection. Information Sciences 328 (2016), 1–14.
[15] Jin Huang and Charles X Ling. 2005. Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on knowledge and Data Engineering 17, 3 (2005), 299–310.
[16] Saam Iranmanesh and Esther Rodriguez-Villegas. 2017. An Ultralow-Power Sleep Spindle Detection System on Chip. IEEE transactions on biomedical circuits and systems 11, 4 (2017), 858–866.
[17] Annica Ivert, Claus Aranha, and Hitoshi Iba. 2015. Feature selection and classification using ensembles of genetic programs and within-class and between-class permutations. In Evolutionary Computation (CEC), 2015 IEEE Congress on. IEEE, 1121–1128.
[18] C Principe Jose and R Smith Jack. 1982. Sleep spindle characteristics as a function of age. Sleep 5, 1 (1982), 73–84.
[19] Jasmin Kevric and Abdulhamit Subasi. 2017. Comparison of signal decomposition methods in classification of EEG signals for motor-imagery BCI system. Biomedical Signal Processing and Control 31 (2017), 398–406.
[20] Laerke K Krohne, Rie B Hansen, Julie AE Christensen, Helge BD Sorensen, and Poul Jennum. 2014. Detection of K-complexes based on the wavelet transform. In Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE. IEEE, 5450–5453.
[21] Daniel Lachner-Piza, Nino Epitashvili, Andreas Schulze-Bonhage, Thomas Stieglitz, Julia Jacobs, and Matthias Dümpelmann. 2018. A single channel sleep-spindle detector based on multivariate classification of EEG epochs: MUSSDET. Journal of neuroscience methods 297 (2018), 31–43.
[22] Tarek Lajnef, Sahbi Chaibi, Perrine Ruby, Pierre-Emmanuel Aguera, Jean-Baptiste Eichenlaub, Mounir Samet, Abdennaceur Kachouri, and Karim Jerbi. 2015. Learning machines and sleeping brains: automatic sleep stage classification using decision-tree multi-class support vector machines. Journal of neuroscience methods 250 (2015), 94–105.
[23] Stephane G Mallat. 1989. A theory for multiresolution signal decomposition: the wavelet representation. IEEE transactions on pattern analysis and machine intelligence 11, 7 (1989), 674–693.
[24] Rachel Manber and Roseanne Armitage. 1999. Sex, steroids, and sleep: a review. Sleep 22, 5 (1999), 540–541.
[25] Durga Prasad Muni, Nikhil R Pal, and Jyotirmay Das. 2006. Genetic programming for simultaneous feature selection and classifier design. (2006).
[26] RJ Nandi, Asoke K Nandi, Rangaraj M Rangayyan, and D Scutt. 2006. Classification of breast masses in mammograms using genetic programming and feature selection. Medical and biological engineering and computing 44, 8 (2006), 683–694.
[27] E Niedermeyer and Marcia Ribeiro. 2000. Considerations of nonconvulsive status epilepticus. Clinical Electroencephalography 31, 4 (2000), 192–195.
[28] Maurice M Ohayon. 2011. Epidemiological overview of sleep disorders in the general population. Sleep Medicine Research (SMR) 2, 1 (2011), 1–9.
[29] Kevin R Peters, Laura B Ray, Stuart Fogel, Valerie Smith, and Carlyle T Smith. 2014. Age differences in the variability and distribution of sleep spindle and rapid eye movement densities. PloS one 9, 3 (2014), e91047.
[30] Anil Natesan Rama, S Charles Cho, and Clete A Kushida. 2006. Normal human sleep. Sleep: A comprehensive handbook (2006), 3–9.
[31] Rakesh Ranjan, Rajeev Arya, Steven Lawrence Fernandes, Erukonda Sravya, and Vinay Jain. 2018. A Fuzzy Neural Network Approach for Automatic K-Complex Detection in Sleep EEG Signal. Pattern Recognition Letters (2018).
[32] Allan Rechtschaffen. 1968. A manual for standardized terminology, techniques and scoring system for sleep stages in human subjects. Brain information service (1968).
[33] J. Röschke, J. Fell, and P Beckmann. 1995. Nonlinear analysis of sleep EEG data in schizophrenia: calculation of the principal Lyapunov exponent. Psychiatry research 56, 3 (1995), 257–269.
[34] Michael H Silber, Sonia Ancoli-Israel, Michael H Bonnet, Sudhansu Chokroverty, Madeleine M Grigg-Damberger, Max Hirshkowitz, Sheldon Kapen, Sharon A Keenan, Meir H Kryger, Thomas Penzel, et al. 2007. The visual scoring of sleep in adults. Journal of Clinical Sleep Medicine 3, 02 (2007), 22–22.
[35] Ranyart R Suárez, José María Valencia-Ramírez, and Mario Graff. 2014. Genetic programming as a feature selection algorithm. In Power, Electronics and Computing (ROPEC), 2014 IEEE International Autumn Meeting on. IEEE, 1–5.
[36] D Puthankattil Subha, Paul K Joseph, Rajendra Acharya, and Choo Min Lim. 2010. EEG signal analysis: a survey. Journal of medical systems 34, 2 (2010), 195–212.
[37] Lin Sun, Xiangmin Zhang, Shaoxiong Huang, Jiuxing Liang, and Yuxi Luo. 2018. K-complex morphological features in male obstructive sleep apnea-hypopnea syndrome patients. Respiratory physiology & neurobiology 248 (2018), 10–16.
[38] M Emin Tagluk, Mehmet Akin, and Necmettin Sezgin. 2010. Classıfıcation of sleep apnea by using wavelet transform and artificial neural networks. Expert Systems with Applications 37, 2 (2010), 1600–1607.
[39] Athanasios Tsanas and Gari D. Clifford. 2015. Stage-independent, single lead EEG sleep spindle detection using the continuous wavelet transform and local weighted smoothing. Frontiers in Human Neuroscience 9 (2015), 181. https://doi.org/10.3389/fnhum.2015.00181
[40] Felipe Viegas, Leonardo Rocha, Marcos Gonçalves, Fernando Mourão, Giovanni Sá, Thiago Salles, Guilherme Andrade, and Isac Sandin. 2018. A genetic programming approach for feature selection in highly dimensional skewed data. Neurocomputing 273 (2018), 554–569.
[41] Alan Wade, Nava Zisapel, and Patrick Lemoine. 2008. Prolonged-release melatonin for the treatment of insomnia: targeting quality of sleep and morning alertness. (2008).
[42] Oren M Weiner and Thien Thanh Dang-Vu. 2016. Spindle oscillations in sleep disorders: a systematic review. Neural plasticity 2016 (2016).
[43] Cüneyt Yücelbaş, Şule Yücelbaş, Seral Özşen, Gülay Tezel, Serkan Küççüktürk, and Şebnem Yosunkaya. 2018. Automatic detection of sleep spindles with the use of STFT, EMD and DWT methods. Neural Computing and Applications 29, 8 (2018), 17–33.
[44] Xiaobin Zhuang, Yuanqing Li, and Nengneng Peng. 2016. Enhanced automatic sleep spindle detection: a sliding window-based wavelet analysis and comparison using a proposal assessment method. Applied Informatics 3, 1 (08 Dec 2016), 11. https://doi.org/10.1186/s40535-016-0027-9
