Addressing Class Imbalance in Classification Problems of Noisy Signals by using Fourier Transform Surrogates

Randomizing the Fourier-transform (FT) phases of temporal-spatial data generates surrogates that approximate examples from the data-generating distribution. We propose such FT surrogates as a novel tool to augment and analyze training of neural networks and explore the approach in the example of sleep-stage classification.

Authors: Justus T. C. Schwabedal, John C. Snyder, Ayse Cakmak

Justus T. C. Schwabedal, John C. Snyder, Ayse Cakmak, Shamim Nemati, Gari D. Clifford

Abstract — Randomizing the Fourier-transform (FT) phases of temporal-spatial data generates surrogates that approximate examples from the data-generating distribution. We propose such FT surrogates as a novel tool to augment and analyze training of neural networks and explore the approach in the example of sleep-stage classification. By computing FT surrogates of raw EEG, EOG, and EMG signals of under-represented sleep stages, we balanced the CAPSLPDB sleep database. We then trained and tested a convolutional neural network for sleep-stage classification, and found that our surrogate-based augmentation improved the mean F1-score by 7%. As another application of FT surrogates, we formulated an approach to compute saliency maps for individual sleep epochs. The visualization is based on the response of inferred class probabilities under replacement of short data segments by partial surrogates. To quantify how well the distributions of the surrogates and the original data match, we evaluated a trained classifier on surrogates of correctly classified examples, and summarized these conditional predictions in a confusion matrix. We show how such conditional confusion matrices can qualitatively explain the performance of surrogates in class balancing. The FT-surrogate augmentation approach may improve classification on noisy signals if carefully adapted to the data distribution under analysis.

Index Terms — saliency maps, deep machine learning, class imbalance, sleep staging

I. INTRODUCTION

Classification problems in biomedical signals are often imbalanced by one or more orders of magnitude.
For example, epileptic seizures are rare minute-long events that interrupt hours, days, or even weeks of apparently normal cortical activity in the electroencephalogram (EEG) [1]. As another example, certain transitional sleep stages, such as S1 and S3, are underrepresented with respect to more stable stages such as wakefulness or Rapid Eye Movement (REM) sleep [2]. Rare events, such as a possibly fatal status epilepticus or sleep-onset REM indicative of narcolepsy, are especially important in the biomedical realm. It is therefore imperative that such underrepresented classes are not swamped by the more prevalent ones.

Classification algorithms such as logistic regression, support vector machines, and random forest models can be extended to incorporate class imbalances in their cost-function structure (see Haixiang et al. [3] for a review on class balancing). However, by design, such extensions are aimed at general applicability, offering only little flexibility to incorporate domain-specific knowledge. In mini-batch-based methods of deep learning, the imbalanced class distribution is typically equilibrated by discarding examples of prevalent classes, or by repeating those in the minority. A ten-fold up-sampling may, however, lead to partial over-fitting, whereas under-sampling unsatisfactorily discards vast amounts of valuable data.

Instead, it has been proposed to sample from inferred distributions of minority classes [4]. In principle, deep generative models, in particular generative adversarial networks, can be used to approximate examples from these distributions [5], [6]. However, these methods are very data-hungry, and we believe they will likely fail to generate a variety of examples of rare classes in a dataset.

(J. Schwabedal, S. Nemati, A. Cakmak, and G. Clifford are with the Department of Biomedical Informatics, Emory University, GA, USA. Manuscript received XXX, revised XXX.)
The class-imbalance problem has also been addressed in the development of automatic sleep-staging systems (see Aboalayon et al. [7] for a survey of such systems). However, in the few deep-learning approaches we found, classes were balanced either by discarding data [8], [9], or by up-sampling through repetitions of data [10]. Both approaches introduce biases in the predictions, either falsely pointing away from abnormality or falsely predicting illnesses. Possible remedies can come from domain-knowledge-based models of the rare class, which can either be based in physical understanding of the biophysical process that generates the observations, or through a statistical approach.

In this article, we discuss up-sampling based on Fourier-Transform (FT) surrogates [11]. We further describe a surrogate-based method to construct saliency maps for a trained classifier. Specifically, we measure the response of inferred class probabilities under a surrogate replacement of short data segments. Our method adds to the several techniques that have been developed for image data, such as methods of gradient ascent, saliency maps [12], and deconvolution networks [13]. Previously, the surrogate approach has been developed to test the hypothesis that a signal was generated by a linear stationary stochastic process, and it has been applied to EEG signals in this context [14]. However, we present a very different utilization in exploring the question of how surrogates can facilitate machine learning, both in training classifiers and in analyzing what these learn to recognize.

II. METHODS

A. Surrogates Based on the Fourier Transform

The complex Fourier components s_n of a signal x_n can be decomposed into amplitudes a_n and phases ϕ_n as s_n = a_n e^{iϕ_n}. Sample sequences of stationary linear random processes are uniquely defined by the Fourier amplitudes a_n, whereas their Fourier phases are random numbers in the interval [0, 2π).
Under this assumption, we can draw a new sequence y_n that is statistically independent from x_n while representing the same generating distribution, as first demonstrated by Theiler [15]. We simply replace the Fourier phases of s_n by new random numbers from the interval [0, 2π) and apply the inverse Fourier transform. Under the assumptions of linearity and stationarity, we use this FT-surrogate method to generate new independent samples of the sleep database analyzed here.

In Fig. 1, we show examples of EEG segments together with examples of their FT surrogates. Example (a) is dominated by EEG alpha waves centered around 10 Hz, whereas in example (b), such alpha waves are only visible in the segment's first half. Comparing their surrogates allows us to better understand the effect of nonstationarity on the FT-surrogate technique. While the surrogate represents the data in example (a) visually well, surrogate (b) does not show a strong localization of the alpha waves to a particular subsection. The power in this band is smeared across the whole surrogate segment, thus leading to a very different visual appearance.

Schreiber et al. [16] extended the FT-surrogate method to simultaneously model the time-domain amplitude distribution P(x_n) in addition to the Fourier-amplitude distribution P(s_n) of the original signal. In short, their algorithm starts by computing a regular FT surrogate. Next, the time-domain distribution of the surrogate is replaced by the original one. Then, the adjusted surrogate is Fourier transformed again and the Fourier distribution is replaced by the original one. The last two steps are repeated iteratively until the time-domain distribution converges sufficiently. Accordingly, these surrogates are called iterative amplitude-adjusted FT (IAAFT) surrogates.

B. Polysomnographic Database

We processed the CAPSLPDB sleep database consisting of 101 overnight polysomnographies (PSGs) [17], [18].
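The two surrogate constructions described in Section II-A can be sketched in a few lines of NumPy. This is a minimal illustration under the stated assumptions, not the authors' released code; the function names are ours:

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """FT surrogate: keep the Fourier amplitudes, randomize the phases [15]."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=s.shape)
    phases[0] = np.angle(s[0])          # keep the DC component (signal mean)
    if x.size % 2 == 0:
        phases[-1] = np.angle(s[-1])    # Nyquist component must stay real
    return np.fft.irfft(np.abs(s) * np.exp(1j * phases), n=x.size)

def iaaft_surrogate(x, n_iter=100, rng=None):
    """IAAFT surrogate [16]: alternate amplitude and distribution adjustments."""
    rng = np.random.default_rng() if rng is None else rng
    amplitudes = np.abs(np.fft.rfft(x))
    sorted_x = np.sort(x)
    y = ft_surrogate(x, rng)            # start from a plain FT surrogate
    for _ in range(n_iter):
        # impose the original Fourier amplitudes, keep the current phases
        s = np.fft.rfft(y)
        y = np.fft.irfft(amplitudes * np.exp(1j * np.angle(s)), n=x.size)
        # impose the original time-domain distribution by rank ordering
        y = sorted_x[np.argsort(np.argsort(y))]
    return y
```

Keeping the DC (and, for even-length signals, the Nyquist) phase fixed ensures the surrogate remains real-valued with the original mean; a fixed iteration count stands in for the convergence criterion of the original algorithm.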
Each recording contained about eight hours of multichannel recordings and sleep-stage annotations scored by an expert according to the R&K 68 rules [19]. We did not take into account recordings rbd11, brux1 (the score of brux1 was recovered after the analysis), and nfle27 because of missing sleep scores, and n4, n8, n12, and n16 because these only contained EEG channels.

The remainder of the recordings were divided into five equidistant age bins. The division was based on the data distribution. Each record had been divided into 30-second intervals, each assigned one of the sleep stages Wake, S1, S2, S3, S4, REM, or MT by an expert sleep technician. In Fig. 2, we summarize the distribution of stages stratified by age groups. We ignored stage MT, which occurred only 685 times. In each age group, stage S1 was least well represented, averaging at 4% across all groups. The fraction of stage Wake increased with age, while the fractions of S4 and REM decreased.

From all available channels in each recording, we selected a subset including two EEG channels, one EOG, and one EMG channel, which is a maximal subset included in all recordings. The available EEG channels were also heterogeneous regarding the recording site and derivation. They were selected with a preference list (numbers of recordings including EEG1 and EEG2): F3-C3 (n = 67), P3-O1 (67), C4-M1 (29), F4-C4 (26), C3-M2 (6), O2-M1 (3), P3-Cz (2), F7-Cz (2). We resampled all signals to 32 Hz after applying a 13-Hz 4th-order Butterworth low-pass filter to reduce aliasing.

Alongside different stages of sleep, aging is also known to correlate with characteristic EEG patterns. This co-variation leads to an implicit class under-representation of, for example, wakefulness in the young.
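The anti-aliasing and resampling step described above (a 13-Hz 4th-order Butterworth low-pass, then resampling to 32 Hz) might look as follows in SciPy. This is a sketch: the paper does not specify the resampling interpolation or whether zero-phase filtering was used, so linear interpolation and a forward-backward filter are assumptions here:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def preprocess(signal, fs_in, fs_out=32.0, cutoff=13.0, order=4):
    """Low-pass filter at `cutoff` Hz, then resample to `fs_out` Hz."""
    # 4th-order Butterworth low-pass, applied forward and backward (zero phase)
    sos = butter(order, cutoff, btype="low", fs=fs_in, output="sos")
    filtered = sosfiltfilt(sos, signal)
    # resample by linear interpolation onto the new time grid (assumption)
    duration = signal.size / fs_in
    t_in = np.arange(signal.size) / fs_in
    t_out = np.arange(0.0, duration, 1.0 / fs_out)
    return np.interp(t_out, t_in, filtered)
```

Because `sosfiltfilt` applies the filter twice, the effective attenuation at the cutoff is squared relative to a single pass.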
Moreover, the database consists of individuals suffering from various diseases or disorders that are represented differentially across age, and the available channels reflect to some extent the disease-specific investigation: records of young nocturnal-frontal-lobe-epilepsy patients included more EEG channels than regular PSGs, for example. We do not attempt to address all of these sources of class imbalance within the scope of this article, because at this level of detail, the present database is too small.

C. Network Architecture and Training

We explored a convolutional neural-network architecture as a deep learning model for our sleep database. The goal was to optimize the F1-score for all six classes across the different age groups. We used Google Cloud's ml-engine infrastructure for all computations, including Bayesian hyperparameter optimization.

Our architecture takes as input 30-second raw sequences of two EEG, one EMG, and one EOG channel as one example, and outputs softmax-based probabilities for the six classes Wake, S1, S2, S3, S4, and REM. Two parts constitute our network architecture: first, each channel is processed by dedicated neural networks only operating on that one channel; second, their outputs are merged to process interrelations among channels. Note that the 4-channel input would suffice for sleep-scoring experts to deduce sleep stages. The network architecture is summarized in detail in Tab. I.

Channel pipes. In the first stage, each channel is processed with a pipe of one-dimensional convolutional layers. While all pipes share the same architecture, each channel type has its own parameters; i.e., the two EEG channels share the same parameters. We chose parameter sharing across EEG channels because the heterogeneity in our dataset prohibited training dedicated channels for specific electrode locations.
Choosing the same pipe architecture for each channel facilitated joining their outputs in the second stage. After each convolutional layer, we apply dropout with p = 0.33. The Scale layer was initialized with a factor of 0.05 µV⁻¹. Biases were initialized as zero, and weights were initialized by drawing from standard Glorot-uniform distributions [20].

Joined pipe. In the second stage, outputs of the first stage are stacked to form an (n, 4, f)-dimensional tensor, where f is the number of filters and n the length of each of the four joined sequences. A two-dimensional convolution layer is applied to the result, followed by two dense layers and the six-neuron softmax layer to be matched with class probabilities. After the first two dense layers, we apply dropout with p = 0.015. Biases were initialized as zero, and weights were initialized by drawing from a Glorot-uniform distribution.

We trained the network on 7000 mini-batches of 128 examples, using an RMSProp optimization algorithm with a learning rate of 0.0016, a decay parameter of 0.9, and no momentum [21]. The number of steps was chosen through our experience of visually inspecting validation and training loss, and we assured that these quantities always reached stable values.

Fig. 1. Examples of FT surrogates. We show four 30-second signals of EEG (a-c) and EOG (d) from CAPSLPDB recorded during different stages of sleep (indicated). Each signal (black line) is counterposed with a representation of its FT surrogates (red line). Panel (b) illustrates the effect of non-stationarity on the technique, Panel (d) that of nonlinearity.

Fig. 2. Stage distribution across age groups. The relative histogram displays the distribution of all 107,738 epochs stratified by age and sleep stage.

TABLE I
NEURAL NETWORK ARCHITECTURE. F: number of filters, W: dimension of each filter, S: stride parameter, ReLU: rectified linear unit.

Channel-pipe architecture, each with 32,936 trainable parameters:
    Input      30-second signal            960
    Scale      scalar rescaling            960
    Conv1D     W: 16, F: 16, ReLU          960 × 16
    MaxPool    W: 3, S: 2                  480 × 16
    Conv1D     W: 19, F: 19, ReLU          480 × 19
    MaxPool    W: 3, S: 2                  240 × 19
    Conv1D     W: 23, F: 23, ReLU          240 × 23
    MaxPool    W: 3, S: 2                  120 × 23
    Conv1D     W: 27, F: 27, ReLU          120 × 27
    MaxPool    W: 3, S: 2                  60 × 27

Joined-pipe architecture with 64,371 trainable parameters:
    Input      output of channel pipes     60 × 4 × 27
    Conv2D     W: 20 × 4, F: 10, ReLU      41 × 1 × 10
    Dense      85 neurons, ReLU            85
    Dense      85 neurons, ReLU            85
    Dense      6 neurons, softmax          6

D. Validation Split and Data Sampling

We split off a validation set from the database by holding one recording back from each age group. On these five recordings, we validated an instance of a neural network which we trained on the training set consisting of all other records. In a 5-fold cross-validation, we split the database (and trained networks) five times, each with different validation recordings. This yielded a total validation set of five recordings from each age group, i.e., a total of 25 recordings.

During training, we sorted the training set by stage label for up-sampling and augmentation, which we controlled by two parameters β ∈ [0, 1] and α ∈ [0, 1]. As a last step, we shuffled the processed training set to randomly group examples into mini-batches of size 128.

Up-sampling. We computed the number of repetitions n_c of under-represented class c necessary to match the number of the most frequent class.
We then multiplied n_c by a factor β, and added a corresponding number of random repetitions to the training set. The factor allowed us to control up-sampling. For the presented results, we set β = 0.9.

Augmentation. Each channel in the repeated examples was replaced by an FT surrogate with probability α. This means that, for a given repeated example, only some of its channels may be augmented by surrogate replacements.

With this publication, we provide the preprocessed database and scripts reproducing our results (https://github.com/cliffordlab/sleep-convolutions-tf).

III. RESULTS

A. Training with FT Surrogate-based Class Balancing

We started our analysis by training the feed-forward neural network model without replacing any repeated signals by FT surrogates (α = 0). Our training did not show considerable over-fitting, as indicated by a close proximity of training- and test-set accuracies. Though not groundbreaking, our classification results on the five-fold validation set shown in Fig. 3 were within the range of previously reported results for sleep-stage classification, especially for the very complex CAPSLPDB.

Fig. 3. Confusion matrix of test-set predictions. We evaluated the network on examples from the test set and computed the fraction of labels of a certain class with respect to their predictions, shown in this color-coded confusion-matrix representation.

We leveraged the trained model to investigate how well signals of different sleep stages are represented by their FT surrogates. For each correctly predicted example, we computed a surrogate and re-applied the classifier. We analyzed the confusion matrix for these surrogate labels conditioned on a correct original prediction. As shown in Fig. 4(a), FT surrogates of stages Wake, S1, and S4 were predicted to be from the correct class with probabilities larger than 80%, whereas surrogates of S2 and REM showed the lowest conditional accuracies.
Comparing the off-diagonal matrix elements, we found that S1-surrogates are more often misclassified as Wake, S2 as S3, S3 as S4, and REM as S1. For example, the misclassification S1 → Wake may be explained by the redistribution of non-stationary bursts of alpha oscillations when drawing a surrogate, as visible in Fig. 1(b): in the surrogate, the alpha rhythm appears in more than 50% of the segment, thus making a classification as Wake more likely, by eye and by algorithm. We hypothesize that the misclassifications S2 → S3, S3 → S4, and REM → S1 are also due to non-stationarities, i.e., K-complexes, bursts of delta waves, or rapid eye movements.

We also evaluated the conditional confusion matrix when replacing the original correctly predicted examples with IAAFT surrogates, as shown in Fig. 4(b). Comparing the conditional accuracies of FT and IAAFT surrogates, we observed that the latter were predicted equally well or better for all stages except S1. Standard deviations in conditional confusion values were around 1%.

Next, we increased the augmentation probability α to values between 0 and 1, thus replacing fractions of up-sampled signals by FT surrogates. At each α, we performed our scheme of five-fold cross-validation and observed how prediction probabilities changed. We found a consistent maximum of the F1-score at about α = 0.4 (cf. Fig. 5). The convex dependence of the F1-score on α can be better understood when decomposing the measure into its constituent per-class accuracies, summarized in Fig. 6. While the accuracies of stages Wake, S2, and S4 slightly increase, the S1- and S3-accuracies rapidly decrease towards zero beyond α > 0.4. These two opposing objectives create the quantitative compromise exhibited as a non-trivial maximum in the α-dependence of the F1-score. Notice that the accuracy of stage S2, of which no surrogates were created, showed the greatest benefit of surrogate-based up-sampling.
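The conditional confusion analysis of Fig. 4 can be sketched as follows. The `predict` interface is a hypothetical stand-in for the trained classifier, not the authors' API:

```python
import numpy as np

def conditional_confusion(model, examples, labels, make_surrogate, n_classes=6):
    """Confusion matrix of surrogate predictions, conditioned on the
    original example having been classified correctly."""
    counts = np.zeros((n_classes, n_classes))
    for x, y in zip(examples, labels):
        if model.predict(x) != y:
            continue                      # condition on a correct prediction
        y_surr = model.predict(make_surrogate(x))
        counts[y, y_surr] += 1
    # normalize each row to conditional probabilities
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_sums, 1.0)
```

Row c of the result estimates P(surrogate predicted as c' | original of class c was predicted correctly), which is the quantity displayed in Fig. 4.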
Fig. 4. Conditional confusion matrix. The correct predictions (diagonal in Fig. 3) were transformed to (a) FT surrogates and (b) IAAFT surrogates, and then re-scored by the sleep-staging algorithm. The results are presented in the respective conditional confusion matrices.

Unfortunately, we were not yet able to evaluate and compare IAAFT surrogates with these results due to temporal and budget constraints.

B. Partial FT Surrogates to Analyze Class Probabilities

Based on FT surrogates, we propose a novel technique to create saliency maps from which we can read out the relative importance of a subsection of a signal for the predicted class probabilities. First, we selected a window length and a subset of channels in which we presumed to find a relevant feature. To query the relevance of the data at a given location in the epoch, one could, naively, zero out the subsection in question and observe how the inferred probabilities change. However, imputing such quiescent periods can introduce class biases; for example, a very low-voltage EMG signal strongly indicates REM sleep over other sleep stages. Instead, we spliced out the signal window and replaced the subsection with an FT surrogate generated from the remainder of the signal under analysis, as visualized in Fig. 7. All splicing was performed smoothly by cosine half-wave interpolation of 0.5-second overlaps. For a given window location, the partial surrogate replacement was performed multiple times. For each replacement, the epoch was then processed by our sleep-staging algorithm, and the class probabilities were recorded. Finally, we averaged these class probabilities over the independent replacements. The averaged probabilities as a function of the window position yielded a saliency map that described the relevance of localized features for the classification result of a specific example.
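The partial-surrogate replacement and the resulting saliency map can be sketched as follows. This is a simplified single-channel sketch; the `predict_proba` interface and the exact handling of the 0.5-second cosine cross-fades are our assumptions:

```python
import numpy as np

def partial_surrogate(x, start, stop, fs=32.0, overlap=0.5, rng=None):
    """Replace x[start:stop] by an FT surrogate drawn from the rest of the
    signal, splicing the edges with cosine half-wave cross-fades."""
    rng = np.random.default_rng() if rng is None else rng
    remainder = np.concatenate([x[:start], x[stop:]])
    # FT surrogate of the remainder, cut to the window length
    s = np.fft.rfft(remainder)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=s.shape)
    phases[0] = np.angle(s[0])
    surr = np.fft.irfft(np.abs(s) * np.exp(1j * phases), n=remainder.size)
    patch = surr[:stop - start]
    y = x.copy()
    y[start:stop] = patch
    # cosine half-wave cross-fades over `overlap` seconds at both edges
    n_ov = int(overlap * fs)
    fade = 0.5 * (1.0 - np.cos(np.linspace(0.0, np.pi, n_ov)))
    y[start:start + n_ov] = (1 - fade) * x[start:start + n_ov] + fade * patch[:n_ov]
    y[stop - n_ov:stop] = fade * x[stop - n_ov:stop] + (1 - fade) * patch[-n_ov:]
    return y

def saliency(model, x, fs=32.0, win=5.0, step=1.0, n_rep=50, rng=None):
    """Average class probabilities under repeated partial-surrogate
    replacement, as a function of the window position."""
    rng = np.random.default_rng() if rng is None else rng
    n_win, n_step = int(win * fs), int(step * fs)
    probs = []
    for start in range(0, x.size - n_win + 1, n_step):
        p = np.mean([model.predict_proba(
                         partial_surrogate(x, start, start + n_win, fs, rng=rng))
                     for _ in range(n_rep)], axis=0)
        probs.append(p)
    return np.array(probs)
```

Plotting the rows of the returned array against the window position gives a saliency map of the kind shown in Fig. 8(a).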
We demonstrate the partial FT-surrogate technique with an example epoch of stage REM that was misclassified as stage S2 by our algorithm (cf. Fig. 8). In the latter half of the example, there is a K-complex visible in both EEG channels and the EOG channel, which according to the rules leads to a stage change to S2 in the following epoch. Had it occurred in the earlier half of the example, the example would have been scored as S2.

We analyzed this epoch using our partial-surrogate method (cf. Fig. 8(a)), and counterposed the result with naive zeroing out of equivalent subsections (cf. Fig. 8(b)). The prediction probabilities of sleep stages S2 and REM crossed or reversed as the 5-second surrogate-replacement window slid across the location of the K-complex. The probabilities also reversed for the zero-out method, however, not concurrently with the visually identified event.

Fig. 5. Average F1-score versus augmentation probability. The average F1-score depending on surrogate augmentation probability α shows a distinct maximum, both for the test set as well as for the training set.

Fig. 6. Per-class accuracy versus augmentation probability. The per-class accuracy depending on surrogate augmentation probability α shows slightly increasing patterns for stages Wake, S2, S4, and REM, and sharply decreasing patterns for stages S1 and S3. This discrepancy explains to a certain extent the convex F1-score dependence (cf. Fig. 5).

Fig. 7. Example of a partial FT surrogate. A 4-second subsection of an EEG signal recorded during stage Wake is shown (black line), together with a partial FT surrogate (red line). The partial surrogate replaced the anomaly in the segment between seconds 13 and 17. Note that the surrogate dominantly contains the ~10-Hz alpha waves also visible in the rest of the signal.

Fig. 8. Partial FT-surrogate analysis. This 30-second REM epoch was misclassified as S2 with P(S2) = 68% (black dashed line). The suspicion was that the K-complex at about 17 seconds caused the misclassification. The classifier used was trained with α = 0.4. (a) We applied the partial FT-surrogate technique to both EEG signals using a 5-second-long moving window with an overlap of 0.5 seconds, and 500 surrogate replacements. The averaged probabilities of S2 and REM change as a function of the window location. The temporary reversal of probabilities indicates that the K-complex at about 17 seconds caused the misclassification. (b) The surrogate approach is counterposed with simple zeroing out, in which an equivalent 5-second window is (smoothly) replaced by zeros. This naive approach also shows a reversal of probabilities, but at the wrong position. Note that there was no offset in the signal.

IV. DISCUSSION

We explored two applications of Fourier-transform (FT) surrogates to sleep-stage classification: we analyzed how up-sampling minority examples with FT surrogates affects the prediction scores, and we described a method of saliency maps based on partial FT surrogates that allows us to analyze how individual class probabilities depend on subsections of the signals.

The convex dependence of the F1-score on the augmentation probability indicates a possible benefit of surrogate-based up-sampling. However, this might not be the case for all class labels equally. Increases in the S2-accuracy seemed to come at the expense of stages S1 and S3 for larger values of α. Based on these results, we hypothesize that the effect of surrogate augmentation on an individual class accuracy does not directly depend on the conditional prediction accuracies, which are on the diagonal of the conditional confusion matrix (cf. Fig. 4(a)); instead, augmentation may introduce mixing between class labels, indicated by a large off-diagonal element, upon which the accuracy of one of the mixed labels will dominate.
Accordingly, we hypothesize the accuracy increase of S2 and REM to be at the expense of classification accuracy of S1, and the increase in accuracy for S2 and S4 to be at the expense of classification accuracy of S3. The conditional confusion matrix of IAAFT surrogates exhibits higher accuracy and lower off-diagonal elements indicating mixing of labels (cf. Fig. 4(b)). One interpretation of the results is that IAAFT surrogates are able to model the data distribution more accurately; on the other hand, the results are also consistent with the data distribution being highly collapsed into regions that are well predicted by our algorithm. While the former would suggest benefits of using IAAFT over simple surrogates, the latter would mean that using IAAFT would increase the tendency to over-fit the data. To date, we understand little about the topological properties of the IAAFT distribution, and it is therefore hard to reason which effect will dominate. It would thus be interesting to see how training with IAAFT surrogates impacts accuracy scores in this and other examples of biomedical data analysis. Specifically, we predict from our hypothesis that augmentation with IAAFT surrogates will have a less negative impact on the S1 classification accuracy.

Partial surrogate analysis is not restricted to neural-network-based or other differentiable classifiers, as these saliency maps are created purely by controlling inputs and output probabilities. Also, the technique, aimed at transient signal features, does not greatly suffer from the requirement of stationarity, since the replaced subsections are of lengths at which EEG signals are approximately statistically stationary. However, features without temporal localization cannot be delineated with our technique. For example, a constant alpha-wave background will not be detected to distinguish Wake from S1, because the surrogate replacement will also contain alpha waves (compare Fig. 1(a) and (b)).
Such features are more likely to be highlighted by gradient-based saliency maps, and when training on a wavelet representation of the signal as data input. The example shown in Fig. 8 highlights the strength of our technique: it allowed us to gather evidence that our sleep-staging algorithm learned about the existence of K-complexes and their relevance for distinguishing between REM and S2. This was particularly unclear given the relatively poor accuracy of the classifier.

We conclude from the present work that the ability to draw independent examples from the data distribution is important in training, analysis, and validation of deep machine-learning models. As in this work, such examples can be used to balance and augment a database to achieve better generalization, and to understand which statistical properties of the data are instrumental for black-box learning algorithms to make predictions. Unless the database is large enough to train a deep generative model that mimics the data distribution, it is necessary to build the generator from a strong set of constraints rooted in specific domain knowledge. This is especially the case for under-represented classes for which we do not have a lot of data. Usage of FT surrogates is constrained to stationary linear random data, as the current work illustrates. For IAAFT surrogates, we cannot formulate the precise constraints. In the future, it may also be helpful to query mechanism-based models to generate surrogates in situations, particularly for nonlinear signals, that are not well represented by FT-based surrogates, such as electrocardiograms.

We also plan to adapt our approach to identify ambiguous or mislabeled data, which are often mislabeled for two general reasons: natural inter- and intra-observer variability for transitional epochs, and errors due to quantization or coarse windowing of data.
Although the issue of only moderate inter- and intra-rater agreement levels is known in sleep-stage labeling [22], the latter issue is a particularly under-explored problem in sleep-stage classification. In particular, we plan to use partial FT-surrogate analysis to identify epochs that are ambiguous due to short transient events. The ability to programmatically exclude such edge cases from training may enhance the efficacy of sleep-stage classification.

ACKNOWLEDGEMENTS

This research is supported in part by funding from the James S. McDonnell Foundation, Grant 220020484 (http://www.jsmf.org), the Rett Syndrome Research Trust and Emory University, and the National Science Foundation Grant 1636933 (BD Spokes: SPOKE: SOUTH: Large-Scale Medical Informatics for Patient Care Coordination and Engagement). Dr. Nemati's research is funded through an NIH Early Career Development Award in Biomedical Big Data science (1K01ES025445-01A1). This work was partially funded by NSF grant 1822378.

REFERENCES

[1] F. Mormann, R. G. Andrzejak, C. E. Elger, and K. Lehnertz, "Seizure prediction: The long and winding road," Brain, vol. 130, no. 2, pp. 314–333, 2006. http://dx.doi.org/10.1093/brain/awl241
[2] M. A. Carskadon and W. C. Dement, "Normal human sleep: An overview," Principles and Practice of Sleep Medicine, vol. 4, pp. 13–23, 2005.
[3] G. Haixiang, L. Yijing, J. Shang, G. Mingyun, H. Yuanyue, and G. Bing, "Learning from class-imbalanced data: Review of methods and applications," Expert Sys. Appl., vol. 73, pp. 220–239, 2017. http://dx.doi.org/10.1016/j.eswa.2016.12.035
[4] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, pp. 321–357, 2002. http://dx.doi.org/10.1613/jair.953
[5] M. Mirza and S.
Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, 2014.
[6] A. Antoniou, A. Storkey, and H. Edwards, "Data augmentation generative adversarial networks," arXiv preprint, 2017. http://arxiv.org/abs/1711.04340
[7] K. A. Aboalayon, M. Faezipour, W. S. Almuhammadi, and S. Moslehpour, "Sleep stage classification using EEG signal analysis: A comprehensive survey and new investigation," Entropy, vol. 18, no. 9, p. 272, 2016. http://dx.doi.org/10.3390/e18090272
[8] O. Tsinalis, P. M. Matthews, Y. Guo, and S. Zafeiriou, "Automatic sleep stage scoring with single-channel EEG using convolutional neural networks," arXiv preprint, 2016. http://arxiv.org/abs/1610.01683
[9] H. Dong, A. Supratak, W. Pan, C. Wu, P. M. Matthews, and Y. Guo, "Mixed neural network approach for temporal sleep stage classification," IEEE T. Neur. Sys. Reh., no. 99, p. 1, 2017. http://dx.doi.org/10.1109/tnsre.2017.2733220
[10] A. Supratak, H. Dong, C. Wu, and Y. Guo, "DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG," IEEE T. Neur. Syst. Reh., vol. 25, no. 11, pp. 1998–2008, Aug. 2017. http://dx.doi.org/10.1109/tnsre.2017.2721116
[11] T. Schreiber and A. Schmitz, "Surrogate time series," Physica D, vol. 142, no. 3, pp. 346–382, 2000. http://dx.doi.org/10.1016/S0167-2789(00)00043-9
[12] K. Simonyan, A. Vedaldi, and A. Zisserman, "Deep inside convolutional networks: Visualising image classification models and saliency maps," arXiv preprint arXiv:1312.6034, 2013.
[13] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European Conference on Computer Vision. Springer, 2014, pp. 818–833.
[14] R. G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, and C. E.
Elger , “Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state, ” Phys. Rev . E , vol. 64, no. 6, p. 061907, 2001. [Online]. A v ailable: http://dx.doi.org/10.1103/PhysRe vE.64.061907 [15] J. Theiler , S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer, “T esting for nonlinearity in time series: The method of surrogate data, ” Physica D , vol. 58, no. 1-4, pp. 77–94, 1992. [Online]. A vailable: http://dx.doi.org/10.1016/0167-2789(92)90102-S [16] T . Schreiber and A. Schmitz, “Improved surrogate data for nonlinearity tests, ” Phys. Rev . Lett. , vol. 77, no. 4, pp. 635–638, 1996. [Online]. A vailable: http://dx.doi.or g/10.1103/PhysRevLett.77.635 [17] A. L. Goldberger , L. A. N. Amaral, L. Glass, J. M. Hausdorff, P . C. Ivano v , R. G. Mark, J. E. Mietus, G. B. Moody , C.-K. Peng, and H. E. Stanley , “Physiobank, physiotoolkit, and physionet, ” Cir culation , vol. 101, no. 23, pp. e215–e220, 2000. [Online]. A vailable: http://dx.doi.org/doi:10.1161/01.CIR.101.23.e215 [18] M. G. T erzano, L. Parrino, A. Smerieri, R. Chervin, S. Chokroverty , C. Guilleminault, M. Hirshkowitz, M. Mahowald, H. Moldofsky , A. Rosa, R. Thomas, and A. W alters, “ Atlas, rules, and recording techniques for the scoring of cyclic alternating pattern (CAP) in human sleep. ” Sleep Med. , vol. 3, no. 2, pp. 187–199, 2002. [Online]. A vailable: http://dx.doi.or g/10.1016/S1389-9457(02)00003-5 [19] A. Rechtschaffen and A. Kales, A Manual of Standardized T erminolo gy , T echniques, and Scoring Systems for Sleep Stages of Human Subjects . Brain Information/Brain Research Institute, 1968. [20] X. Glorot and Y . Bengio, “Understanding the difficulty of training deep feedforward neural networks, ” in Pr oceedings of the Thirteenth International Confer ence on Artificial Intelligence and Statistics , 2010, pp. 249–256. [21] T . Tieleman and G. 
Hinton, “Lecture 6.5 - RmsProp: Divide the gradient by a running av erage of its recent magnitude, ” COURSERA: Neural Networks for Machine Learning, 2012. [22] H. Danker-Hopfe, P . Anderer , J. Zeitlhofer , M. Boeck, H. Dorn, G. Gruber , E. Heller, E. Loretz, D. Moser, S. Parapatics, B. Saletu, A. Schmidt, and G. Dorffner , “Interrater reliability for sleep scoring according to the Rechtschaffen & Kales and the new AASM standard, ” J. Sleep Res. , vol. 18, no. 1, pp. 74–84, 2009. 8 [Online]. A vailable: http://www .ncbi.nlm.nih.gov/pubmed/19250176 http://doi.wiley .com/10.1111/j.1365-2869.2008.00700.x
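As a supplementary illustration of the phase-randomization procedure behind FT surrogates [11], [15], the following sketch generates a surrogate of a real-valued 1-D signal by randomizing Fourier phases while preserving the amplitude spectrum. This is not the authors' code; the function name and the use of NumPy's real FFT are our own assumptions.

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Return a Fourier-transform surrogate of a real 1-D signal.

    The amplitude spectrum is preserved while the phases are drawn
    uniformly at random, so the surrogate shares the power spectrum
    (and hence the linear autocorrelation) of the original signal.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.fft.rfft(x)                     # one-sided spectrum of real input
    phases = rng.uniform(0.0, 2.0 * np.pi, X.shape)
    phases[0] = 0.0                        # keep the mean (DC bin) real
    if x.size % 2 == 0:
        phases[-1] = 0.0                   # keep the Nyquist bin real
    Xs = np.abs(X) * np.exp(1j * phases)   # same amplitudes, random phases
    return np.fft.irfft(Xs, n=x.size)
```

Partial surrogates, as used for the saliency maps in this paper, would apply the same randomization only to a short data segment; the sketch above covers only the whole-epoch case.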
