Combining Support Vector Machine and Elephant Herding Optimization for Cardiac Arrhythmias
Many people are currently suffering from heart diseases that can lead to untimely death. The most common heart abnormality is arrhythmia, which is simply irregular beating of the heart. A prediction system for the early intervention and prevention of…
Authors: Aboul Ella Hassanien, Moataz Kilany, Essam H. Houssein
Aboul Ella Hassanien 1,*, Moataz Kilany 2,*, and Essam H. Houssein 2

2 Faculty of Computers and Information, Information Technology Department, Cairo University, Egypt
3 Faculty of Computers and Information, Computer Science Department, Minia University, Egypt
4 Scientific Research Group in Egypt (SRGE) http://www.egyptscience.net
* aboitcairo@gmail.com

ABSTRACT

Many people currently suffer from heart diseases that can lead to untimely death. The most common heart abnormality is arrhythmia, which is simply irregular beating of the heart. A prediction system for the early intervention and prevention of heart diseases, including cardiovascular diseases (CVDs) and arrhythmia, is important. This paper addresses the classification of electrocardiogram (ECG) heartbeats as normal or abnormal. The approach combines swarm optimization algorithms with a modified Pan–Tompkins algorithm (MPTA) and support vector machines (SVMs). The MPTA was implemented to remove ECG noise, followed by the application of the extended features extraction algorithm (EFEA) for ECG feature extraction. Then, elephant herding optimization (EHO) was used to find a subset of ECG features from a larger feature pool that provided better classification performance than that achieved using the whole set. Finally, SVMs were used for classification. The results show that the EHO-SVM approach achieved the following values for five statistical indices: accuracy, 93.31%; sensitivity, 45.49%; precision, 46.45%; F-measure, 45.48%; and specificity, 45.48%. Furthermore, the results demonstrate a clear improvement in accuracy compared to that of other methods when applied to the MIT-BIH arrhythmia database.

Introduction

The World Health Organization (WHO) identifies CVDs as the main cause of death around the world.
An estimated 17.5 million people died from CVDs in 2012, representing 31% of all global deaths [1]. Accordingly, cardiac health research has received substantial attention from researchers, especially those targeting preventive, medical and technological advances. The main interest of researchers in this field is the improvement of traditional cardiovascular diagnosis technologies. ECG is a common and vital diagnostic tool for many cardiac disorders and breathing disorders, such as obstructive sleep apnea syndrome, and for monitoring other functional or structural cardiac abnormalities [2]. The availability, reasonable cost, simplicity and low risk of ECG have made it a popular technique that has been applied in many research fields during the past two decades. ECG is a non-invasive tool that measures the electrophysiological activity of the heart and the cardiovascular system [3] and supports the analysis of heart function.

A heartbeat signal has three main characteristic features: the P wave, QRS complex, and T wave. Each feature appears as a distinguishable peak that is repeated in each beat signal. Cardiac arrhythmia detection requires analysis of the morphology, amplitude, and duration of the P, QRS and T peaks.

The automation of ECG signal analysis based on the main characteristic P, QRS and T waves is an important research field for several reasons. Physicians depend on these signals to diagnose many cardiac diseases, such as autonomic malfunction, and other vascular, respiratory or even psychological dysfunctions. The automation process involves numerous fields. This paper employs ECG analysis techniques to produce a model that efficiently and accurately detects heartbeats belonging to a set of categories known by cardiologists. To obtain good results, we combined novel optimization techniques with classifier methods to perform heartbeat classification [4].

Several recent studies on ECG classification and modeling have been presented.
For example, in [5], temporal features and Hermite function coefficients are extracted from ECG signals as the input vector of a block-based neural network. In [6], bacterial foraging optimization (BFO) and particle swarm optimization (PSO) are combined with neural networks (NNs) for the detection of left and right bundle branch block ECG patterns. In [7], the cutoff frequency of ECG was investigated, and the spectrum of the ECG signal was extracted from four classes. In [8], the proposed algorithm required approximately 15 min to filter a training set composed of 250 labeled samples. A five-level ECG signal quality classification algorithm using SVM was outlined in [9]. In [10], the continuous wavelet transform and the Daubechies wavelet were applied to the benchmark MIT-BIH Arrhythmia database. In addition, hybrid firefly and PSO (FFPSO) were combined with NNs to detect bundle branch block in [11]. Additionally, PSO with a random asynchronous approach was introduced in [12]. Finally, in [13], a fuzzy classifier with a genetic algorithm (GA) was proposed to classify ECG signals more precisely based on a dynamic model of the ECG signal.

The aim of this paper is to present an automatic classification approach for cardiac arrhythmias. The results reported in [14] show that ECG classification of arrhythmias can be highly accurate. These past results motivate our focus on the classification of ECG heartbeat signals as normal or abnormal. Feature extraction and selection techniques play a major role in the domain of signal processing, and the performance of identification systems depends strongly on these techniques [15]. This paper introduces a hybrid optimization and classification approach that uses EHO [16] to select relevant features and optimize the SVM classifier parameters for ECG heartbeat signals.
The introduced classification approach has the following advantages over alternative approaches:

1) We applied a recent optimization algorithm that employs a simple and relatively quick search pattern.
2) Our validation relied on a stable benchmark dataset acquired by the MIT-BIH Laboratory and employed a relatively large number of records (10 patients).
3) Validation was conducted using 3-fold leave-one-out cross-validation for generalized performance.
4) The classification model relied on a large number of features compared with previous ECG classification problems.
5) Staged optimization was employed for ECG classification optimization, i.e., feature selection and parameter optimization were performed in separate stages, in contrast to many former studies. Staged optimization prevents the loss of opportunities that arises when search agents change correlated parameters (features and classification penalty) simultaneously.

The structure of this paper is as follows. The next section introduces the techniques and materials employed in this paper. The methodology is explained in detail in terms of the applied dataset, feature extraction and selection, and the optimization process. Then, the experimental results and a discussion of these results are presented. Finally, concluding remarks and future work are provided in the last section.

Materials and Methods

Electrocardiogram (ECG) Signals

Six known heartbeat types can be identified. Each heartbeat can be accurately described by an ECG waveform consisting of five peaks (features). The detection and evaluation of each peak and its variance, distance and other mathematical characteristics leads to a powerful identification of heartbeat properties [17]. Table 1 shows a description of each waveform. All these characteristic points should be detected.

Table 1. ECG waves.

Wave  Description
P     Atrial depolarization.
Q     Point before R, with slope < 0.
R     Distance between two peaks of QRS.
S     Point after R, with slope > 0.
T     Ventricular repolarization.

As a part of the ECG automatic detection process, additional features are extracted from the P, Q, R, S and T waveforms as feature vectors [17]. Those five basic components (P, Q, R, S, and T) are used to interpret the ECG, as shown in Table 2.

Table 2. ECG parameters description.

Amplitude                   Duration
P wave   0.25 mV            PR interval   0.12 s to 0.20 s
Q wave   25% of R wave      ST interval   0.05 s to 0.15 s
R wave   1.60 mV            QT interval   0.35 s to 0.44 s
T wave   0.1 to 0.5 mV      QRS interval  0.09 s

Elephant Herding Optimization

Elephant herding optimization (EHO) is an algorithm introduced by Gai-Ge Wang et al. in 2012 [16]. EHO targets global optimization problems, and the herding behavior of elephants is modeled as follows: (1) the population is composed of clans, and each clan has a fixed number of elephants; (2) at each generation, a fixed number of male elephants leave their family group and live far away; (3) in each clan, the elephants live together under a leader called the matriarch. Exploration and exploitation in EHO are achieved by the clan updating operator and the separating operator. Algorithm 1 provides the algorithmic framework of EHO. For more details about EHO, see [16].

Algorithm 1. Pseudo code of EHO.

Initialization: set the generation counter g = 1, the maximum generation MaxGen, and the population
while g < MaxGen do
    Sort all the elephants according to their fitness (objective function)
    Perform the clan updating operator
    Perform the separating operator
    Assess the population at the newly updated positions
    g = g + 1
end while

Support Vector Machines (SVMs)

Several classifiers have been proposed in the signal processing domain, including artificial neural networks [18], SVMs, and fuzzy logic systems [19].
Most researchers have focused on SVM for CVD classification of ECG signals [20]. The classification of ECG signals for CVDs using SVM is the main objective of this paper. Previous research illustrated the strong performance of SVM, in which data are represented as P-dimensional vectors [21]. Classification is performed by means of optimal separating hyperplanes, which ensure the greatest margin between the closest data points that belong to separate classes. SVMs depend on kernels in the classification process, and kernel selection is a challenging task that strongly affects the classification performance. The SVM algorithm aims to find the hyperplane with the greatest margin separating a positive class from a negative class, as expressed in the following equations.

$$ f(x) = (w \cdot \phi(x)) + b \tag{1} $$

$$ R_{SVM}(C) = C \, \frac{1}{N} \sum_{i=1}^{N} L_{\varepsilon}(y_i, y_i^{\Delta}) + \frac{1}{2} w^{T} \cdot w \tag{2} $$

$$ L_{\varepsilon}(y_i, y_i^{\Delta}) = \begin{cases} |y_i - y_i^{\Delta}| - \varepsilon, & |y_i - y_i^{\Delta}| \geq \varepsilon \\ 0, & \text{otherwise} \end{cases} \tag{3} $$

$$ y^{\Delta} = f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^{*}) K(X_i, X) + b \tag{4} $$

$$ K(X_i, X) = \exp\left( -\frac{1}{\delta^{2}} (X_i - X)^{2} \right) \tag{5} $$

Here, x is a point in the input space and φ(x) is its non-linear mapping into a high-dimensional feature space. w and b are the weight vector and the threshold, respectively, which are estimated by minimizing the regularized risk in Eq. 2. α_i and α_i^* are Lagrange multipliers. K(X_i, X) is the Gaussian kernel, and δ² is the width of the kernel function. C is a positive real constant, and ε is the insensitivity parameter of the loss in Eq. 3.

Proposed Approach for ECG Heartbeat Classification

Five distinct points (P, Q, R, S and T waves) are included in each ECG signal. Fig. 1 shows the four phases of the SVM feature optimization process of the proposed approach: (1) preprocessing, (2) ECG feature extraction, (3) feature selection and optimization, and (4) classification and validation. Later, we provide a detailed model for phases (3) and (4), which are shown in Fig. 2.
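As a rough illustration of the kind of search used in phase (3), the EHO loop of Algorithm 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' MATLAB implementation: the clan updating and matriarch formulas follow the commonly cited description of EHO [16] as we understand it, the separating operator is simplified to a uniform re-initialization, and the function name `eho_maximize`, the toy objective, the bounds, and all default parameter values are hypothetical.

```python
import random

def eho_maximize(fitness, dim, lo, hi, n_clans=5, clan_size=6,
                 alpha=0.5, beta=0.1, max_gen=50, seed=0):
    """Toy elephant herding optimization that maximizes `fitness`.

    All defaults are illustrative assumptions, not the paper's settings.
    """
    rng = random.Random(seed)

    def clip(v):
        return min(hi, max(lo, v))

    # Each clan is a list of positions (lists of floats) inside [lo, hi].
    clans = [[[rng.uniform(lo, hi) for _ in range(dim)]
              for _ in range(clan_size)] for _ in range(n_clans)]
    best = max((e for c in clans for e in c), key=fitness)[:]

    for _ in range(max_gen):
        for clan in clans:
            clan.sort(key=fitness, reverse=True)  # matriarch (fittest) first
            matriarch = clan[0][:]
            center = [sum(e[d] for e in clan) / clan_size for d in range(dim)]
            # Clan updating operator: move each elephant toward the matriarch.
            for j in range(1, clan_size):
                r = rng.random()
                clan[j] = [clip(x + alpha * (m - x) * r)
                           for x, m in zip(clan[j], matriarch)]
            # The matriarch itself is placed relative to the clan center.
            clan[0] = [clip(beta * c) for c in center]
            # Separating operator (simplified): re-initialize the worst one.
            worst = min(range(clan_size), key=lambda j: fitness(clan[j]))
            clan[worst] = [rng.uniform(lo, hi) for _ in range(dim)]
        candidate = max((e for c in clans for e in c), key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate[:]  # remember the best position seen so far
    return best, fitness(best)

# Toy objective (an assumption for the demo): maximum 0.0 at the origin.
def neg_sphere(x):
    return -sum(v * v for v in x)

best_pos, best_fit = eho_maximize(neg_sphere, dim=3, lo=-5.0, hi=5.0)
```

In the setting of this paper, the toy objective would be replaced by the cross-validated SVM accuracy, and each position would encode SVM parameters and a feature subset.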
In this paper, the EHO algorithm was modified for the purpose of classification optimization. Elephant locations encode the SVM parameters and the selected feature set, while elephant fitness is the average classification accuracy over all cross-validation folds. For the fitness calculation, the SVM is trained with three training sets and validated against three validation sets. Algorithm 2 provides the algorithmic framework of the EHO-SVM classifier presented in Fig. 2. Algorithm 3 shows the fitness calculation process using the SVM classifier.

Figure 1. The proposed approach for ECG heartbeat classification.

Algorithm 2. EHO-SVM approach.

Input: training sets (folds) (T1, T2, ..., Tn)
Input: validation sets (folds) (V1, V2, ..., Vn)
Output: classification accuracy
Initialization:
    Generation counter g ← 1
    Initialize population locations (SVM and kernel parameters / selected features)
    Evaluate population fitness (g) (Alg. 3, Eq. 6)
while g < MaxGen do
    Sort all the elephants according to their fitness
    Apply the clan updating operator
    Apply the elephant separating operator
    Evaluate population fitness (g) (Alg. 3, Eq. 6)
    Find the best elephant, i.e., the one with the highest fitness (classification accuracy)
    g = g + 1
end while

Fitness Function

An optimization algorithm generally depends on a fitness function to find the best solution. The fitness function provides the algorithm with a value that quantifies the fitness of each solution found in the search space. In this paper, we selected classification accuracy as the solution qualifier throughout the search process. Classification accuracy is in the range [0, 1], and each elephant (search agent) is characterized by a number of accuracies that depends on the cross-validation strategy. In this paper, each elephant has three accuracy values, one for each fold in the 3-fold cross-validation strategy.
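The fitness just described, the mean of the per-fold accuracies of one elephant, can be sketched as follows. This is a minimal sketch that assumes the fold accuracies have already been computed by training and validating the SVM; the function name `elephant_fitness` and the example values are hypothetical.

```python
def elephant_fitness(fold_accuracies):
    """Mean accuracy over the cross-validation folds for one elephant."""
    if not fold_accuracies:
        raise ValueError("at least one fold accuracy is required")
    if any(not 0.0 <= a <= 1.0 for a in fold_accuracies):
        raise ValueError("accuracies must lie in [0, 1]")
    return sum(fold_accuracies) / len(fold_accuracies)

# Hypothetical accuracies of one elephant on the three folds.
fitness = elephant_fitness([0.90, 0.96, 0.93])
```

Averaging over folds means that an elephant is rewarded only when its parameters and features work across all validation sets rather than on a single favorable split.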
The accuracy values for all folds are averaged to obtain the fitness value for the search algorithm, as shown in Equation 6.

$$ f(i, j) = \sum_{k=1}^{n} \frac{Acc_{i,j,k}}{n} \tag{6} $$

where f(i, j) is the fitness value for elephant i in iteration j, n is the number of folds selected for cross-validation, and Acc_{i,j,k} is the accuracy of the evaluation for elephant i in iteration j on data fold k. Algorithm 3 shows the fitness calculation process using the SVM classifier.

Figure 2. The general approach for ECG heartbeat classification based on EHO.

Algorithm 3. Evaluate elephant population fitness.

Input: training sets (folds) (T1, T2, ..., Tn)
Input: validation sets (folds) (V1, V2, ..., Vn)
Input: population number (j)
Output: total accuracy (fitness) of each elephant
for each elephant E(i,j) in population j do
    Get elephant location Loc(i,j) ← Location(E(i,j))
    SVM parameters P ← GetParameters(Loc(i,j))
    Selected features F ← GetFeatures(Loc(i,j))
    Acc ← 0
    for each fold k with training set Tk ∈ {T1, T2, T3} do
        Train SVM on Tk using P, F
        Validation accuracy V ← Validate on Vk
        Acc ← Acc + V
    end for
    TotalAccuracy ← Acc / n
    Fitness(i,j) ← TotalAccuracy
end for
Exit

Pre-processing Phase Using MPTA

Power-line interference and baseline wandering are regarded as the most prominent types of noise that strongly affect ECG signals. Patient respiration, with a frequency in the range of 0.15 to 0.3 Hz, is the main source of baseline wandering. Power-line interference is categorized as narrow-band noise centered at approximately 60 Hz and occupies a bandwidth less than 1 kHz. The other sources of noise are wide-band and also affect ECG signals. The hardware used to acquire ECG signals has the ability to suppress power-line interference; however, wide-band noise and baseline wandering cannot be suppressed by hardware alone.
Therefore, software algorithms are used to remove baseline wandering and other wide-band noise [22]. In this paper, MPTA [23] is used to remove different types of artifacts and noise. First, a bandpass filter, composed of a low-pass filter and a high-pass filter, is used to reduce noise. Then, a derivative filter is used to obtain the slope information. Amplitude squaring is performed, and the signal is passed to a moving window integrator. Finally, a thresholding technique is applied, and the peaks are detected.

ECG Feature Extraction

A wave analysis technique is required to perform feature extraction. Wave analysis techniques decompose a given wave into its wavelet building blocks. In this paper, two feature extraction techniques are applied to extract features for classification, such as the RR interval.

Feature Extraction Using MPTA: Nine heartbeat waves are extracted from the ST segment and QRS complex based on MPTA, and the ECG signals are decomposed into low-frequency signals. Therefore, the low-frequency band is utilized to detect the P, QRS, and T waves.

Feature Extraction Using Improved Feature Extraction Algorithm (IFEA): We apply an IFEA [24] to obtain more features. The algorithm takes the output of MPTA, pinpoints the wave components from the results, and calculates new features. The MPTA output describes different types of locations (points in time) on the ECG waveform in terms of descriptive letters (annotations). Numerous letters are employed for this purpose, such as P, N, and T, representing the P-type wave, the R peak of a normal beat, and the T-type wave, respectively. There are also auxiliary symbols, such as the opening and closing brackets, which represent the beginning and end of a wave type with the wave peak enclosed in between.
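To make the annotation letters and brackets just described more concrete, the sketch below splits a bracketed annotation string into beats and reads the beat type from the middle (R-peak) letter. It is only an illustration: the function names `parse_beats` and `beat_types`, the sample string, and the assumption that every beat consists of exactly three complete bracketed groups are ours, whereas real MPTA output can be inconsistent or incomplete.

```python
import re

def parse_beats(annotation):
    """Split a bracketed annotation string into (P, R-type, T) triples."""
    letters = re.findall(r"\(([^()]+)\)", annotation)
    if len(letters) % 3 != 0:
        raise ValueError("expected three bracketed groups per beat")
    return [tuple(letters[i:i + 3]) for i in range(0, len(letters), 3)]

def beat_types(annotation):
    """Return the middle (R-peak) letter of every beat."""
    return [r_type for _, r_type, _ in parse_beats(annotation)]

# Hypothetical annotation string: two normal beats, then one V beat.
types = beat_types("(P)(N)(t)(P)(N)(t)(P)(V)(t)")
```

A production extractor would also need to handle missing P or T groups, which is one of the defects the IFEA is said to resolve.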
The following is an example of three consecutive heartbeats from the data set: (P)(N)(t)(P)(V)(t)(P)(N)(t), which are two normal beats with a ventricular arrhythmia beat in the middle. The R-type wave peak takes one of a series of letters that annotates the type of the heartbeat as a whole. For example, N is a normal beat, as in the "(P)(N)(t)" wave. A "(P)(V)(t)" waveform indicates a beat with a ventricular arrhythmia. The algorithm also resolves some defects in the P-QRS-T extractor when patterns of the waveforms are inconsistent, incomplete or missing for the corresponding beats.

MPTA is employed to extract the nine heartbeat wave characteristic features (1 through 9) shown in Table 3, which represent detailed features of the previously described P, Q, R, S, and T waveforms. The additional ten features (10 through 19) are extracted by IFEA. All nineteen features extracted with MPTA and IFEA are listed in Table 3.

Table 3. Heartbeat features extracted by MPTA and EFEA.

N   Feature       Meaning
1   PS            Beginning location of P waveform.
2   P             Peak location of P waveform.
3   PE            End location of P waveform.
4   Q             Beginning of QRS complex.
5   R             R peak of QRS complex.
6   S             End of QRS complex.
7   TS            Beginning of T waveform.
8   T             Peak of T waveform.
9   TE            End of T waveform.
10  QRS           QRS = S - Q.
11  P-R           PRSeg = Q - PE.
12  P-R           PRInt = Q - PS.
13  S-T           STSeg = TS - S.
14  Q-T           TE - Q.
15  R-R           RNext - R.
16  P-P           PNext - P.
17  R-R / P-P     RR-PPSim = ABS((R-R) - (P-P)).
18  R-R variance  Var(R-R).
19  Heartbeat     60/R-R.

Feature Selection and Optimization Based on EHO

Swarm intelligence (SI) is a branch of artificial intelligence that mimics the collective behavior of social swarms in nature, such as elephant herds, social spiders, gray wolves, and ant colonies. A swarm is composed of a set of agents that interact among themselves and with the environment without central control. Recent research introduced swarm-based algorithms that can rapidly solve search-based problems at low cost. The types of swarms include nature-inspired and population-based. Classification and feature optimization (feature selection and parameter optimization) are two of many application domains that successfully employ SI. Other domains include machine learning, bioinformatics, medical informatics, dynamical systems and operations research [25]. The proposed approach utilizes EHO for feature selection and parameter optimization to improve the classification accuracy. The feature optimization framework is illustrated in Fig. 2, which depicts the last two phases of the classification approach: Phase 3 is feature selection and optimization, and Phase 4 is classification and validation. Fig. 2 also shows how EHO employs the SVM classifier to evaluate the fitness of each search agent in each optimization iteration.

ECG Classification and Optimization Parameters

Research efforts have shown the dependency between feature optimization and SVM parameter optimization. A known approach is to perform optimization via multiple stages of feature optimization followed by SVM parameter optimization rather than simultaneously optimizing features and parameters in the same run. Fig. 3 illustrates the multi-stage feature and parameter optimization model. In this work, we established a four-stage optimization process, where the parameter and feature optimization processes alternate from stage to stage.

Figure 3. The staged feature and parameter optimization approach.

Results

The simulation results are obtained using MATLAB R2014a. The experimental setup of the dataset, training data and testing data is presented in the following section.

ECG Dataset Description

Researchers use standard databases for analysis purposes. The PhysioNet website is dedicated to medical data corresponding to various diseases [26].
PhysioNet databases are composed of hundreds of digitized medical records of ECG, EEG and other types of physiologic signals. Each ECG record is annotated and revised by a number of cardiologists. Many research efforts depend on the MIT-BIH Arrhythmia database provided by PhysioNet and obtained by the MIT-BIH Laboratory, which consists of several ECG signal records for patients with different types of abnormalities and diseases that affect heart rhythms [27]. The MIT-BIH Arrhythmia database comprises 48 half-hour recordings from 25 male and 22 female subjects. The signals were collected at 360 samples/sec/channel over a 10 mV range with 11-bit resolution. Additionally, each record is annotated by two or more cardiologists independently, and approximately 110,000 annotations are included in the database [27]. We applied the proposed classification approach to a subset of the dataset that includes 10 patients with 16 heartbeat types.

The data were processed into 10 feature vectors (one for each patient), which combined represent 24,474 records of 10 features each. These data are considered sufficiently large to cover the great variability of patients while maintaining a reasonable level of computational overhead. Table 4 shows the datasets employed in our experiment.

Table 4. ECG dataset description.

N   Patient No.  Gender  Age      PhysioNet Standard Beat Types
1   202          Male    68       N-A-a-V-F
2   203          Male    43       N-a-V-F
3   205          Male    59       N-A-V-F
4   207          Female  89       L-R-A-V-E
5   214          Male    53       L-V-F
6   215          Male    81       N-A-V-F
7   217          Male    65       N-V-/-f
8   219          Male    Unknown  N-A-V-F
9   221          Male    83       N-V
10  223          Male    73       N-A-a-V-F-e

The types of heartbeat are represented by symbols defined by PhysioNet, as shown in Table 5. This set of beat types is translated from 16 classes into two classes, normal and abnormal (N and A), with type N considered to be normal and all other types considered to be abnormal (A).
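The two-class translation described above can be sketched in a few lines. This is a minimal illustration: the function name `to_binary_label` and the sample sequence are hypothetical, and note that the class label "A" (abnormal) is distinct from the PhysioNet beat symbol "A" (atrial premature), which itself maps to the abnormal class.

```python
def to_binary_label(symbol):
    """Map a PhysioNet beat symbol to 'N' (normal) or 'A' (abnormal)."""
    return "N" if symbol == "N" else "A"

# Hypothetical sequence of beat symbols from one record.
symbols = ["N", "N", "V", "A", "N", "F", "a"]
labels = [to_binary_label(s) for s in symbols]
counts = {c: labels.count(c) for c in ("N", "A")}
```

Counting the two classes after the mapping is a quick way to see how imbalanced a record is before training.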
We selected ten patients with a sufficient number of beat types to ensure the validity of the classification results and to describe several types of heartbeat. A total of 25,210 ECG beats of different types were used for classification.

Table 5. Heartbeat descriptions.

Beat  Description                                Total number
N     Normal beat                                16742
L     Left bundle                                3460
V     Premature ventricular contraction          2154
/     Paced beat                                 1542
!     Ventricular flutter wave                   472
A     Atrial premature                           228
f     Fusion of paced and normal beat            260
x     Non-conducted P-wave (blocked APB)         133
R     Right bundle                               86
|     Isolated QRS-like artifact                 37
F     Fusion of ventricular                      30
a     Atrial premature                           22
E     Ventricular escape                         16
e     Atrial escape                              16
[     Start of ventricular flutter/fibrillation  6
]     End of ventricular flutter/fibrillation    6

Parameters Settings

The experiments involve cross-validation, SVM parameter settings, and EHO parameter settings. We performed 3-fold leave-one-out cross-validation on all datasets. Table 6 summarizes the selected settings for SVM and EHO. A subset of the settings is determined based on the recommendations of the algorithm designers, and the others are set via comprehensive testing.

Table 6. Parameter settings for SVMs and EHO.

SVM                        EHO
Parameter  Value           Parameter      Value
Kernel     Radial Basis    Alpha Factor   5
Penalty    [1, 1000]       Beta Factor    0.0005
Gamma      [0, 1000]       Elephant Keep  2
Scaling    [-1, 1]         Clans Count    5
                           Elephants      30

Performance Measurements

Five standard criteria are used to evaluate the proposed approach: (1) accuracy (Acc), (2) precision (Prec), (3) specificity (Sp), (4) F-measure (F), and (5) sensitivity (Se). These measures depend on the four counts of a binary classification result: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). Mathematically, the performance measures are defined by the following equations, where PPV denotes the precision and TPR the sensitivity.

• Accuracy (Acc):

$$ Acc = \frac{TP + TN}{TP + FP + FN + TN} \times 100 \tag{7} $$

• Precision (Prec):

$$ Prec = \frac{TP}{TP + FP} \times 100 \tag{8} $$

• Sensitivity (Se):

$$ Se = \frac{TP}{TP + FN} \times 100 \tag{9} $$

• F-measure (F):

$$ F = \frac{2 \times PPV \times TPR}{PPV + TPR} \tag{10} $$

• Specificity (Sp):

$$ Sp = \frac{TN}{TN + FP} \times 100 \tag{11} $$

Discussion

The following section presents the classification results for the 10 selected patients in terms of the performance measures for each patient. Tables 7 and 8 summarize the classification accuracy, specificity, sensitivity, precision, and F-measure for each patient record. Each patient has four result sets, one for each stage of the optimization, as discussed in the previous sections. The stage in which the best accuracy is obtained for each patient is indicated by boldface font. The problem considered in this paper is not a binary classification problem, so we extract the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) measures by means of a confusion matrix constructed for the classification test.

Table 8 summarizes the best results for each classifier in terms of the classification accuracy, precision, sensitivity, F-measure, and specificity, and compares the values obtained by SVM and EHO-SVM. The SVM values were acquired from the early stage of optimization (ST1), where the SVM model was assigned random parameters for all patients; the results were then averaged over all patients for each metric. The EHO-SVM values are calculated as the average of the best values over all patients for each metric stated in Table 7. Fig. 4 shows the results for each classifier and a visual comparison of the best results obtained by SVM and EHO.
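For reference, the performance measures of Eqs. 7 to 11 can be computed from the four confusion-matrix counts as in the following sketch. The function name `binary_metrics`, the example counts, and the choice to return 0 when a denominator is zero are assumptions made for illustration; the paper does not state how undefined cases were handled.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return Acc, Prec, Se, F and Sp in percent from confusion counts."""
    def pct(num, den):
        # Assumption: report 0.0 when the ratio is undefined.
        return 100.0 * num / den if den else 0.0

    acc = pct(tp + tn, tp + fp + fn + tn)   # Eq. 7
    prec = pct(tp, tp + fp)                 # Eq. 8 (PPV)
    se = pct(tp, tp + fn)                   # Eq. 9 (TPR)
    sp = pct(tn, tn + fp)                   # Eq. 11
    # Eq. 10: harmonic mean of precision and sensitivity (both in percent).
    f = 2 * prec * se / (prec + se) if (prec + se) else 0.0
    return {"Acc": acc, "Prec": prec, "Se": se, "F": f, "Sp": sp}

# Hypothetical counts for one validation fold.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Because precision and sensitivity are already expressed in percent, the F-measure from Eq. 10 is also in percent without an extra factor of 100.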
The proposed approach achieves the best classification performance with the highest number of features. MPTA was employed to extract nine heartbeat wave characteristic features (indices 1 through 9), which represent detailed features for the previously described P, Q, R, S, and T waveforms. Additionally, ten features are extracted by means of the proposed IFEA (indices 10 through 19) [24]. The behavior of EHO during the search process is depicted in Fig. 5, which illustrates the evolution of the fitness function value (averaged value) associated with the best global swarm parameter for all patients and stages, also known as the convergence of the fitness function, based on EHO.

Table 7. Summary of EHO-SVM approach results.

Patient No.  Stage No.  Acc     Prec    Se      F       Sp
202          Stage 1    97.09%  44.51%  33.02%  36.19%  85.74%
             Stage 2    97.10%  19.42%  20.00%  19.71%  80.00%
             Stage 3    97.10%  19.42%  20.00%  19.71%  80.00%
             Stage 4    97.28%  49.52%  34.03%  38.59%  86.49%
203          Stage 1    85.95%  35.36%  21.43%  21.13%  81.41%
             Stage 2    89.68%  17.94%  20.00%  18.91%  80.00%
             Stage 3    89.69%  24.63%  20.34%  19.61%  80.31%
             Stage 4    89.69%  24.63%  20.34%  19.61%  80.31%
205          Stage 1    98.79%  47.77%  45.12%  45.92%  92.31%
             Stage 2    98.79%  47.77%  45.12%  45.92%  92.31%
             Stage 3    98.72%  47.48%  44.71%  45.42%  91.93%
             Stage 4    98.76%  47.92%  44.45%  45.80%  91.74%
207          Stage 1    81.21%  32.92%  31.04%  28.41%  83.89%
             Stage 2    82.07%  44.10%  33.65%  32.83%  83.75%
             Stage 3    78.35%  15.69%  19.97%  17.57%  79.98%
             Stage 4    80.91%  20.77%  20.32%  18.88%  80.88%
214          Stage 1    95.66%  46.27%  43.91%  44.50%  93.93%
             Stage 2    97.21%  48.60%  50.00%  49.29%  50.00%
             Stage 3    97.21%  48.60%  50.00%  49.29%  50.00%
             Stage 4    97.70%  48.20%  45.98%  46.96%  96.00%
215          Stage 1    98.75%  47.13%  46.06%  46.57%  95.83%
             Stage 2    98.81%  47.41%  46.08%  46.71%  95.85%
             Stage 3    98.81%  47.41%  46.08%  46.71%  95.85%
             Stage 4    98.81%  47.41%  46.08%  46.71%  95.85%
217          Stage 1    84.95%  71.33%  69.30%  67.63%  94.44%
             Stage 2    86.06%  74.70%  70.72%  69.19%  94.83%
             Stage 3    86.11%  74.74%  71.13%  69.42%  94.92%
             Stage 4    86.11%  74.74%  71.13%  69.42%  94.92%
219          Stage 1    98.14%  44.62%  42.79%  43.47%  90.85%
             Stage 2    98.70%  49.12%  42.64%  45.30%  90.60%
             Stage 3    99.26%  48.57%  48.36%  48.37%  95.84%
             Stage 4    99.26%  48.57%  48.36%  48.37%  95.84%
221          Stage 1    99.54%  99.13%  99.23%  99.18%  99.23%
             Stage 2    99.67%  99.11%  99.70%  99.41%  99.70%
             Stage 3    99.71%  99.14%  99.83%  99.48%  99.83%
             Stage 4    99.71%  99.14%  99.83%  99.48%  99.83%
223          Stage 1    88.82%  28.14%  28.11%  27.99%  93.59%
             Stage 2    90.55%  30.16%  28.68%  29.20%  93.68%
             Stage 3    90.93%  29.70%  29.59%  29.55%  94.50%
             Stage 4    91.81%  29.26%  30.92%  30.05%  95.91%

Table 8. Summary of the experimental results.

Measures     SVM     EHO-SVM  Improvement
Accuracy     80.31%  94.07%   13.76%
Precision    40.45%  52.32%   11.87%
Sensitivity  40.49%  47.85%   7.36%
F-measure    38.48%  47.58%   9.10%
Specificity  40.48%  47.58%   7.10%

Figure 4. Classification performance for SVM and EHO-SVM.

As shown by Fig. 5, each record reaches the maximum classification accuracy at an arbitrary stage. Some records reach the maximum in stage 1, for example, patient number 205 under EHO optimization, while others reach the maximum in stages 2, 3 or 4. Overall, multi-stage optimization is important and can produce better results than those of single-stage optimization. The convergence curves for EHO show the accuracy of the algorithm and how fast it reaches the final accuracy. For the test conducted in this paper with the defined parameters (Table 6), EHO reaches the maximum accuracy within relatively few iterations. This conclusion is valid both for each stage individually and for the overall convergence curve across all stages.

Figure 5. Average convergence curves for EHO.

Accuracy Analysis

To avoid possible bias in the selection of the test and training sets, 3-fold cross-validation is utilized in this paper; hence, the ECG dataset was divided into three parts. For comparison, we consider some previous studies based on the same dataset.
In 6 , the MIT -BIH Arrhythmia database was tested using PSO, GA, BFO, and bacterial foraging–particle swarm optimization (BFPSO) with SVM. In 10 , CWT and the histogram representation were applied to determine the QRS, T and P wa ves. Furthermore, in 28 , the optimal number of Hermite functions to represent the QRS wav e was studied. In 29 , SVM was utilized to cluster heartbeats based on only two types of features, in contrast to our work with nineteen features. Additionally , in 30 , wa velet time frequency (WTF) was applied to detect sudden amplitude and frequency jumps, b ut the ECG signals were recorded under hypnosis to obtain heart rate variability . Ho wev er , this work did not focus on the heart rate classification accuracy . These comparisons are shown in T able 9 , where it is clear that the proposed approach outperforms the compared studies. The proposed classification approach was applied to 10 patients, 16 types of heartbeat and 24,474 records. The proposed classification approach was validated and e valuated for ef ficiency based on the suf ficiently large data co vering a large v ariety of patients. It is important to note that this approach requires a number of future improv ements. The proposed model currently tar gets two classes of heartbeat arrhythmias, normal and abnormal, which are considered to be relati vely general classes. Howe ver , we are improving this w ork to accurately separate more precise classes of heartbeat, such as PVC, F , A, R, and F . The proposed model also applies a relatively traditional classification technique (SVM); we plan to employ deep learning techniques to achiev e better classification performance. Finally , a more advanced and more popular feature extraction technique, such as wa velet transform, is required in future work. 11/ 13 T able 9. Comparison of the results and methods of studies that used the same MIT -BIH Arrhythmia database. 
Studies   Year  Approach            Accuracy
[28]      2015  Hermite functions   90%
[6]       2015  BFPSO-SVM           76.74%
[10]      2016  Delineation Method  92.44%
[29]      2017  SVM                 93.1%
[30]      2017  WTF                 NA
Proposed  2017  EHO-SVM             93.31%

Conclusion and Future Work

ECG analysis helps cardiologists to make decisions about cardiac arrhythmias more accurately and easily, and thus to save the lives of thousands of people. An ECG records the electrical activity of the heart over a specific time; hence, it is considered an important diagnostic tool for assessing heart function. In this paper, we have developed a hybrid approach for automatic ECG signal classification by means of EHO and SVMs. The proposed approach includes three modules for automatic ECG signal classification: an efficient preprocessing module, a feature extraction module, and a feature optimization and classification module. In the preprocessing module, the MPTA and IFEA are applied to extract nineteen heartbeat features. Additionally, we use SVMs to classify features extracted from the previous module. Finally, in the last module, EHO is utilized to optimize the features and parameters extracted by the SVMs. The experiments showed that the proposed approach achieves precise detection. Moreover, the proposed approach shows promise for use by medical experts who wish to diagnose heart and cardiac disorders based on ECG signals.

In future work, we will propose an automated cardiac arrhythmia classification approach using hybrid SVMs and spiking neural networks with recent meta-heuristic optimization algorithms to focus on common disorders, such as congestive heart failure, and other cases of biomedical time series. Additional important goals are to analyze ECG signals in the time domain and to detect the optimal representation of the P, QRS and T patterns.

References

1. Cardiovascular diseases (CVDs). http://www.who.int/mediacentre/factsheets/fs317/en.
2. Bortolan, G. & Willems, J. Diagnostic ECG classification based on neural networks. J. Electrocardiol. 26, 75–79 (1992).
3. Hasan, M. & Mamun, M. Hardware approach of R-peak detection for the measurement of fetal and maternal heart rates. J. applied research technology 10, 835–844 (2012).
4. Houssein, E. H., Kilany, M. & Hassanien, A. E. ECG signals classification: A review. Int. J. Intell. Eng. Informatics 5, 376–396 (2017).
5. Shadmand, S. & Mashoufi, B. A new personalized ECG signal classification algorithm using block-based neural network and particle swarm optimization. Biomed. Signal Process. Control. 25, 12–23 (2016).
6. Kora, P. & Kalva, S. R. Hybrid bacterial foraging and particle swarm optimization for detecting bundle branch block. SpringerPlus 4, 1–19 (2015).
7. Moein, S. & Logeswaran, R. Intelligent ECG signal noise removal using PSONN. Int. J. Comput. Appl. 45, 9–17 (2012).
8. Pasolli, E. & Melgani, F. Genetic algorithm-based method for mitigating label noise issue in ECG signal classification. Biomed. Signal Process. Control. 19, 130–136 (2015).
9. Li, Q., Rajagopalan, C. & Clifford, G. D. A machine learning approach to multi-level ECG signal quality classification. Comput. Methods Prog. Biomed. 117, 435–447 (2014).
10. Yochum, M., Renaud, C. & Jacquir, S. Automatic detection of P, QRS and T patterns in 12 leads ECG signal based on CWT. Biomed. Signal Process. Control. 25, 46–52 (2016).
11. Kora, P. & Krishna, K. R. Hybrid firefly and particle swarm optimization algorithm for the detection of bundle branch block. Int. J. Cardiovasc. Acad. 2, 44–48 (2016).
12. Adam, A., Shapiai, M. I., Mohd Tumari, M. Z., Mohamad, M. S. & Mubin, M. Feature selection and classifier parameters estimation for EEG signals peak detection using particle swarm optimization. The Sci. World J. 2014, 13 (2014).
13. Vafaie, M., Ataei, M. & Koofigar, H. Heart diseases prediction based on ECG signals' classification using a genetic-fuzzy system and dynamical model of ECG signals. Biomed. Signal Process. Control. 14, 291–296 (2014).
14. Jovic, A. & Jovic, F. Classification of cardiac arrhythmias based on alphabet entropy of heart rate variability time series. Biomed. Signal Process. Control. 31, 217–230 (2017).
15. Mihandoost, S. & Amirani, M. C. Cyclic spectral analysis of electrocardiogram signals based on GARCH model. Biomed. Signal Process. Control. 31, 79–88 (2017).
16. Wang, G. G., Deb, S. & d. S. Coelho, L. Elephant herding optimization. In Computational and Business Intelligence (ISCBI), 2015 3rd International Symposium on, 1–5 (IEEE, 2015).
17. Kutlu, Y. & Kuntalp, D. A multi-stage automatic arrhythmia recognition and classification system. Comput. Biol. Medicine 41, 37–45 (2011).
18. Gaetano, A. D., Panunzi, S., Rinaldi, F., Risi, A. & Sciandrone, M. A patient adaptable ECG beat classifier based on neural networks. Appl. Math. Comput. 213, 243–249 (2009).
19. Tsipouras, M. G., Voglis, C. & Fotiadis, D. I. A framework for fuzzy expert system creation application to cardiovascular diseases. IEEE Transactions on Biomed. Eng. 54, 2089–2105 (2007).
20. Ghosh, D., Midya, B. L., Koley, C. & Purkait, P. Wavelet aided SVM analysis of ECG signals for cardiac abnormality detection. In 2005 Annual IEEE India Conference - Indicon, 9–13 (IEEE, 2005).
21. Karpagachelvi, S., Arthanari, M. & Sivakumar, M. Classification of electrocardiogram signals with support vector machines and extreme learning machine. Neural Comput. Appl. 21, 1331–1339 (2012).
22. Rocha, T. et al. A lead dependent ischemic episodes detection strategy using Hermite functions. Biomed. Signal Process. Control. 5, 271–281 (2010).
23. Kritika Bawa, P. S. R-peak detection by modified Pan-Tompkins algorithm. Int. J. Adv. Res. & Technol. 3 (2014).
24. Houssein, E. H., Kilany, M., Hassanien, A. E. & Snasel, V. A two-stage feature extraction approach for ECG signals. In International Afro-European Conference for Industrial Advancement, 299–310 (Springer, 2016).
25. Hassanien, A. E. & Emary, E. Swarm Intelligence: Principles, Advances, and Applications (CRC Press, 2016).
26. Goldberger, A. L. et al. PhysioBank, PhysioToolkit, and PhysioNet. Circ. 101, e215–e220 (2000).
27. Korürek, M. & Nizam, A. Clustering MIT–BIH arrhythmias with ant colony optimization using time domain and PCA compressed wavelet coefficients. Digit. Signal Process. 20, 1050–1060 (2010).
28. Márquez, D. G., Otero, A., García, C. A. & Presedo, J. A study on the representation of QRS complexes with the optimum number of Hermite functions. Biomed. Signal Process. Control. 22, 11–18 (2015).
29. Chen, S., Hua, W., Li, Z., Li, J. & Gao, X. Heartbeat classification using projected and dynamic features of ECG signal. Biomed. Signal Process. Control. 31, 165–173 (2017).
30. Chen, X., Yang, R., Ge, L., Zhang, L. & Lv, R. Heart rate variability analysis during hypnosis using wavelet transformation. Biomed. Signal Process. Control. 31, 1–5 (2017).

Author contributions statement

Moataz Kilany performed the experiments, discussed the data and wrote the paper. Aboul Ella Hassanien conceived and supervised the research, discussed the experiments and polished the paper. Essam H. Houssein wrote parts of the paper and compiled the references. All authors read and approved the final paper.

Additional information

The authors declare no competing interests.