MULTIVARIATE FEATURES EXTRACTION FOR DETECTION OF EPILEPTIC SEIZURES IN ELECTROENCEPHALOGRAM

Abstract: In recent years, researchers in biomedical engineering have introduced many techniques for detecting epileptic seizures in the electroencephalogram (EEG). The main objective of this paper is to develop a technique that is capable of differentiating between epileptic and normal signals. The technique consists of two stages: the first is feature extraction and the second is classification of the extracted features. Fast Fourier transform coefficients, autoregressive model parameters, and nonlinear features were used as input features for the classification phase. The classification phase consists of three neural network classifiers and a majority decision over their outputs. The classification accuracy of the proposed technique was higher than that of the other techniques reviewed. With the majority method, the proposed technique reaches an accuracy of 99.5%.

Keywords: Epileptic seizures, autoregressive model, correlation dimension, Lyapunov exponent, artificial neural network.

1. INTRODUCTION

Epilepsy is the second most common serious neurological disorder after stroke. One percent of the world's population suffers from epilepsy, and 30% of epileptics are not helped by medication [1]. In Egypt, 643,639 people suffer from epilepsy [2]. An epileptic seizure appears as an abnormality in EEG recordings and is characterized by a brief, episodic, synchronous neuronal discharge with dramatically increased amplitude. There are two types of epilepsy: generalized epilepsy and partial epilepsy. Generalized epilepsy involves the entire brain at once, whereas partial epilepsy involves a portion of the brain. An epileptic seizure may cause a short period of amnesia, an attack of abnormal rage, sudden anxiety or fear, a moment of incoherent speech or mumbling, or several twitch-like contractions of muscles, usually in the head region [3].
There are two different types of EEG signals, depending on where on the head the signal is taken: scalp or intracranial. For scalp EEG, the focus of this research, small metal discs (electrodes) are placed on the scalp with good mechanical and electrical contact. Intracranial EEG (IEEG) is obtained from special electrodes implanted in the brain during surgery [4].

This research concentrates on building a computer-aided diagnosis system that classifies epileptic and normal signals by analyzing the EEG. The system consists of two stages. The first stage is feature extraction from the EEG signals and the second stage is the classification of these features, as shown in figure 1. The extracted features are fast Fourier transform coefficients, autoregressive model parameters, and nonlinear features.

[Figure 1: Block diagram of proposed technique. EEG Signal -> Features Extraction -> Classification -> Output (Normal or Abnormal)]

The classification technique used to classify these features is the artificial neural network (ANN). The classification phase consists of three neural network classifiers, one for each feature group, and a majority decision.

2. REVIEW OF PREVIOUS WORK

In the last decade, different techniques of classification and input representation for the automatic recognition of EEG patterns have been developed. For example, Chaovalitwongse et al. [1] applied data mining techniques to EEG data in order to distinguish between the brain's normal and pre-seizure epileptic activities through measures of the brain dynamics. These measures include Lyapunov exponents, angular frequency, and entropy. They used support vector machines as the classification technique. They considered the seizure as involving two states: pre-seizure and post-seizure. The total classification accuracy of this technique was 88.3%. The probabilities of correctly predicting pre-seizure, post-seizure, and normal EEGs were about 90%, 81%, and 94%, respectively. Adeli et al.
[4] applied discrete Daubechies wavelets of order four and harmonic wavelets to the analysis of epileptic EEG records. They found that analyzing EEG signals with the wavelet transform improved understanding of the mechanisms causing epileptic disorders, and that this algorithm can be extended to create computational models for automatic detection of epileptic discharges.

In [5], Subasi et al. used the fast Fourier transform (FFT) and the autoregressive (AR) model with maximum likelihood estimation (MLE) as features. They classified these features with artificial neural networks (ANNs) into two groups: epileptic seizure and non-epileptic seizure. The classification accuracies were 91.6% and 92.3% for FFT and AR with MLE, respectively.

Gigola et al. [6] analyzed EEG signals using accumulated energy based on wavelet analysis to predict epileptic seizure onset. They showed that accumulated energy with wavelets can contribute to predicting seizure onset from EEG signals, because this method predicted the seizure onset in 12 of 13 pre-seizure signals.

Abdulhamit Subasi [7] used a new approach based on neural network and fuzzy logic technologies for detection of epileptic seizures. This approach incorporates both heuristics and deep knowledge to exploit the best characteristics of each. A dynamic fuzzy neural network (DFNN) is used in the classification of EEG signals. EEG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT). These sub-band frequencies were then used as input to a DFNN with two discrete outputs: normal and epileptic. The classification accuracies of the DFNN model and the neural network were 93% and 92%, respectively. He concluded that the accuracy rate of the DFNN model was higher than that of the neural network model.

Gao et al. [8] applied recurrence time statistics to detect epilepsy from continuously monitored EEG signals.
They compared the recurrence time method with a nonlinear method, the short-term maximum Lyapunov exponent (STLmax) method. They found that detection using the STLmax method is much noisier and less accurate than detection using the recurrence time based method. The recurrence time based method is at least 10 times faster than the STLmax method, and much easier to use for automatic seizure detection.

Harikumar et al. [9] developed a fuzzy classification model for epilepsy risk level analysis of EEG signals. They extracted five features from the EEG signal for each epoch. These features are the energy of the epoch, the number of positive and negative peaks, spikes, the total number of spikes and sharp waves in the channels, and the variance of each epoch. The percentage performance of the fuzzy systems was as low as 40%. They tried to enhance this performance by using an optimization method. After optimization, the performance became 80%.

Hoeve [10] presented an algorithm for the detection and classification of epileptic activity in the EEG using independent component analysis (ICA). His results showed that the sensitivity and the selectivity of classification were 74% and 19%, respectively.

McSharry et al. [11] focused on the detection of epileptic seizures from scalp EEG recordings by linear and nonlinear techniques and by multidimensional probability evolution (MDPE). Linear analysis methods, such as variance, power spectrum, and auto-correlation function, have been used to detect linear changes in the EEG signal. Nonlinear analysis methods, such as correlation dimension, correlation density, cross-correlation integral, Lyapunov exponent similarity measures, and nonlinear predictability, have been used to detect dynamical changes in the EEG signal. They found that the variance and MDPE were able to detect the seizure in each of the ten scalp EEG recordings.
Esteller [12] detected seizure onset using an automatic system that consists of several stages: preprocessing, processing (feature extraction and selection), classification (training of a neural network), and validation (testing of the designed neural network). The research focused on the extraction and selection of features intended to detect seizure onset. Several features were used, such as correlation dimension, Lyapunov exponent, energy, energy derivative, accumulated energy, nonlinear energy, zero crossings, power spectrum, power spectrum of frequency bands, coherence, fractal dimension, derivative of fractal dimension, accumulated fractal dimension, entropy, mutual information, mean frequency, cross correlation, 4th coefficients of 5th order AR model, absolute value of 4th wavelet coefficients, and spike detector. The k-factor equation was used to select discriminative features. The classification and validation stages were not implemented.

In [13], McGrogan presented a system for the automatic detection of epileptic seizures within EEG recordings. He proposed the power parameters, discrimination of single parameters, reflection coefficients, prediction error, total and band-limited power, and rhythmicity as features. He used artificial neural networks to classify EEG signals as either seizure or non-seizure. His results were presented as two measures, sensitivity and specificity. The specificity and the sensitivity of classification were 88.5% and 67.5%, respectively. From the specificity and the sensitivity, the total classification accuracy of this system was 78%.

From the previous work, we note that the highest accuracy of the previous techniques for detection of epileptic seizures is 93%. In this paper, we develop a technique that is capable of differentiating between epileptic and normal signals. This technique reached an accuracy of 99.5%, compared to the accuracy of the technique in [7], which is about 93%.

3.
PROPOSED TECHNIQUE

3.1 EEG Data Acquisition

We obtained the EEG data from the clinical neurophysiology unit of El-kasr El-einy hospital. The personal computer picked up EEG signals through a data acquisition system that contains a data acquisition card (Unilink type, manufactured by Schwarzer) and signal processors. EEG data produced by the EEG device (model: Etas 32, electrode placement system: 10-20 international system) can be recorded into computer memory using this card. The Brainlab programming package was used. The Brainlab program provides facilities to select the desired montage. The data acquisition system provides real-time data processing. Eighteen channels of EEG are recorded simultaneously, with all the electrodes referenced to a common potential (referential montage). The EEG is sampled at 250 Hz. The EEG is broken down into epochs for the purpose of feature extraction, where each epoch is 10 seconds long. Five patients with known clinical epilepsy findings were included for classification. We selected the normal and epileptic forms from the recording with the assistance of an expert neurophysiologist, who described the clinical case for each recording. The software for analyzing the EEG data was implemented using Matlab 7.

3.2 Features extraction

3.2.1 Autoregressive Model

The autoregressive (AR) method is an alternative way to calculate the spectrum of signals. It is especially useful when the signals have a low signal-to-noise ratio. The AR model of order p can be written as:

$$x_t = \alpha_1 x_{t-1} + \dots + \alpha_p x_{t-p} + z_t \qquad (1)$$

Here $x_t$ is the time series of data, $z_t$ is a purely random process, and the parameters $\alpha_1, \dots, \alpha_p$ are called the AR coefficients. The name "autoregressive" comes from the fact that $x_t$ is regressed on its own past values. The selection of the model order in AR spectral estimation is a critical subject. The most popular approach to finding the optimal order of the autoregressive model is trial and error.
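To make equation (1) concrete, the following is a minimal sketch of how AR coefficients of a chosen order could be estimated from one epoch by solving the Yule-Walker equations with the Levinson-Durbin recursion. It is written in Python for illustration only and is not the Matlab code used in this study. The function names `autocovariance`, `yule_walker`, and `simulate_ar2`, as well as the synthetic test signal, are assumptions introduced here.

```python
import random


def autocovariance(x, max_lag):
    """Biased sample autocovariance r[0..max_lag] of the mean-removed series."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    return [sum(xc[t] * xc[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def yule_walker(x, order):
    """Estimate alpha_1..alpha_p in x_t = sum_j alpha_j * x_{t-j} + z_t.

    Solves the Yule-Walker equations with the Levinson-Durbin recursion and
    returns (coefficients, estimated variance of the residual z_t).
    """
    r = autocovariance(x, order)
    coeffs = []   # alpha_1..alpha_m for the current order m
    err = r[0]    # prediction error variance for the current order
    for m in range(1, order + 1):
        acc = r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / err  # reflection coefficient of order m
        coeffs = [coeffs[j] - k * coeffs[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return coeffs, err


def simulate_ar2(n, a1, a2, seed=0):
    """Hypothetical test signal: an AR(2) process driven by Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


if __name__ == "__main__":
    # 2500 samples corresponds to one 10-second epoch sampled at 250 Hz.
    epoch = simulate_ar2(2500, 0.6, -0.3)
    features, noise_var = yule_walker(epoch, order=5)
    print("AR(5) feature vector:", [round(a, 3) for a in features])
```

Fitting the same epoch with several values of `order` and comparing the resulting classification performance is one way to carry out the trial-and-error order selection mentioned above.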
After trying different orders of the autoregressive model, we found that the 5th order gave the best classification performance. So, we used 5 AR coefficients in our experiment. There are different methods to calculate AR coefficients. In our experiment, we estimated the AR model parameters using the Yule-Walker method [4].

3.2.2 Fast Fourier transform

The fast Fourier transform (FFT) is a mathematical transformation that is applied to signals to obtain their frequency components. We used the first 18 coefficients of the FFT on our data set as input to the ANN classifier, as in [4]. We found that the 18 coefficients gave the worst performance, about 67.8%. We therefore used another method to increase the classification performance. This method is Fisher's discriminant ratio (FDR) [11], which is considered a feature selection method. Using FDR, we selected different numbers of features (3, 5, 10, 15, 20, 25, 30, and 35) from the FFT according to the variance between them. We found that three features give the best performance, as shown in fig. 2.

[Figure 2. The accuracy of classification of FFT with FDR. Vertical axis: accuracy of classification; horizontal axis: FFT coefficients.]

3.2.3 Nonlinear features

A dynamical system is defined as a system that moves or changes in time. The brain is considered a dynamical device. In fact, there is ample evidence for nonlinearity, in particular in small assemblies of neurons. Brain electrical activity sometimes exhibits an unpredictable EEG pattern. This unpredictable behavior is called chaos. A dynamical system can be described by several features, such as correlation dimension, Lyapunov exponents, approximate entropy, etc. In this research, we calculate two important chaotic parameters, namely the correlation dimension (D2) and the largest Lyapunov exponent (λk).
The mathematical description of a dynamical system consists of two parts: the state, which is a snapshot of the process at a given instant in time, and the dynamics, which is the set of rules by which the states evolve over time. In the case of the brain as a dynamical system, the available information about the system is a set of EEG measurements from scalp electrodes. There is no mathematical description of the underlying dynamics of the brain because the total number of state variables is not known. Therefore, to study the dynamics of such a system, we first need to reconstruct the state space trajectory. The most common way to do this is to apply the time-delay embedding theorem, which builds a higher-dimensional geometric object by embedding the measured signal into an m-dimensional embedding space. A suitable m can be found using the false nearest neighbor (FNN) algorithm. The dimension m at which false neighbors disappear is the smallest dimension that can be used for the given data. Once the dimension m is known, the state space trajectory of the system can be reconstructed, and the dynamics of the system can be estimated.

3.2.3.1 Correlation dimension (D2)

The Grassberger-Procaccia algorithm [15] uses a correlation integral C(r) to represent the object. C(r) is defined as the average number of neighbors that each point in the reconstructed phase space has within a given distance r, as given in equation (2):

$$C(r) = \frac{1}{N_p} \sum_{i<j} \theta\left(r - \lVert x(i) - x(j) \rVert\right) \qquad (2)$$

Here $\lVert \cdot \rVert$ denotes the Euclidean distance between the reconstructed state vectors x(i) and x(j), $N_p = k(k-1)/2$ is the number of distinct pairs among the k reconstructed state vectors, and θ is the Heaviside unit step function (i.e., θ(x)=0 when x<0 and θ(x)=1 when x ≥ 0). The correlation dimension D2 is defined as the slope of the linear region of the plot of log(C(r)) versus log(r) for small values of r [16].
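As a rough illustration of the reconstruction and of the correlation integral, the sketch below embeds a scalar series with a time delay, counts the pairs of state vectors closer than r, and fits a straight line to log C(r) against log r. It is a brute-force Python sketch under stated assumptions, not the software package used in this work: the function names, the delay `tau`, the radii, and the sine-wave test signal are all hypothetical choices made for the example.

```python
import math


def delay_embed(x, m, tau):
    """Reconstructed state vectors [x(i), x(i+tau), ..., x(i+(m-1)*tau)]."""
    n_vec = len(x) - (m - 1) * tau
    return [[x[i + j * tau] for j in range(m)] for i in range(n_vec)]


def correlation_sum(vectors, r):
    """C(r): fraction of the k(k-1)/2 distinct pairs with distance <= r."""
    k = len(vectors)
    close = sum(1 for i in range(k) for j in range(i + 1, k)
                if math.dist(vectors[i], vectors[j]) <= r)
    return close / (k * (k - 1) / 2)


def loglog_slope(vectors, radii):
    """Least-squares slope of log C(r) versus log r over the given radii."""
    pts = []
    for r in radii:
        c = correlation_sum(vectors, r)
        if c > 0:  # log C(r) is undefined when no pair is closer than r
            pts.append((math.log(r), math.log(c)))
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    num = sum((px - mx) * (py - my) for px, py in pts)
    den = sum((px - mx) ** 2 for px, _ in pts)
    return num / den


if __name__ == "__main__":
    # Assumed test signal: a sine wave, whose reconstructed trajectory is a
    # closed curve, so the fitted slope should come out close to one.
    signal = [math.sin(0.1 * t) for t in range(400)]
    vectors = delay_embed(signal, m=2, tau=16)  # tau is about a quarter period
    print("estimated slope:", loglog_slope(vectors, [0.1, 0.2, 0.4]))
```

The pair count grows quadratically with the number of state vectors, so a practical implementation would normally use a neighbor search structure. The fitted slope is only a finite-data estimate of D2, which is defined by the limiting log-log slope of C(r) as r tends to zero.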
That is presented in equation (3),

$$D_2 = \lim_{r \to 0} \frac{\log C(r)}{\log r} \qquad (3)$$

3.2.3.2 Largest Lyapunov Exponent (λk)

Lyapunov exponents quantify the sensitivity of the system to initial conditions. This sensitivity is an important feature of chaotic systems: it describes how small changes in the state of a system grow at an exponential rate and eventually dominate the behavior. Lyapunov exponents are defined as the long-time average exponential rates of divergence of nearby states [15]. A positive, finite value of λ means an exponential divergence of nearby trajectories. If a system has at least one positive Lyapunov exponent, then the system is chaotic. The larger the positive exponent, the more chaotic the system becomes. Lyapunov exponents are arranged such that λ1 ≥ λ2 ≥ ... ≥ λn, where λ1 and λn correspond to the most rapidly expanding and contracting principal axes, respectively. The largest Lyapunov exponent λ1 is calculated as a measure of the chaotic behavior of the system using the Wolf algorithm [16]. Consider two trajectories with nearby initial conditions on an attracting manifold. When the attractor is chaotic, the trajectories diverge, on average, at an exponential rate characterized by the largest Lyapunov exponent λ1. The algorithm used is as follows:

1. Compute the distance d0 between two very close points in the reconstructed phase space orbit.
2. Follow both points as they travel a short distance along the orbit. The distance d1 between them is calculated.
3. If d1 becomes too large, one of the points is kept and an appropriate replacement for the other point is chosen.
4. The two points are now allowed to evolve again following steps 1-3.
5.
After s propagation steps, the largest Lyapunov exponent λ1 is estimated by using equation (4),

$$\lambda_1 = \frac{1}{t_s - t_0} \sum_{k=1}^{s} \log_2 \frac{d_1(t_k)}{d_0(t_{k-1})} \qquad (4)$$

In this work, the correlation dimension and the largest Lyapunov exponent were calculated using a software package for signal processing with emphasis on nonlinear time-series analysis [14].

3.3 Classifier

In our experiment, we used an artificial neural network as the classifier. There are different types of artificial neural networks. We used a feed-forward multilayered neural network, which is a commonly used type. It consists of a layer of input neurons, a layer of output neurons, and one or more hidden layers. We used one hidden layer for this network. We used each feature group (AR, FFT, and nonlinear features) as input features to a neural network, and we computed the performance of each neural network in each case. We used five input neurons for the 5th-order AR model, three input neurons for the three best FFT features, and five for the nonlinear features. The number of output neurons is two. The optimal number of hidden neurons was found by trial and error, which is the most popular approach. The data set used for training the network contains 432 signals (216 normal and 216 epileptic), and the data set used for testing the network contains 216 signals (108 normal and 108 epileptic).

3.4 Majority method

The majority method is a novel method to enhance the accuracy of classification. There are three networks for classification; networks 1, 2, and 3 are used for validating the three FFT features, the five AR features, and the five nonlinear features of the testing data, respectively. The majority method depends on the simulation result from each network. We feed the testing result of each network to the majority stage. The outcome of the majority stage is the majority result over all features. For example, if the outputs of network 1 and network 2 are both the normal case, the majority of the total output is the normal case, as in Fig. 3.
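The majority stage described above can be sketched as follows. This is a minimal Python illustration, not the Matlab implementation used in this study; the label encoding (1 for normal, 0 for epileptic), the function names, and the example decisions are assumptions made for the sketch.

```python
from collections import Counter

NORMAL, EPILEPTIC = 1, 0  # assumed label encoding for this sketch


def majority_vote(decisions):
    """Return the label chosen by most classifiers for one test signal."""
    return Counter(decisions).most_common(1)[0][0]


def combine_networks(net1, net2, net3):
    """Apply the majority vote signal by signal to three lists of decisions."""
    return [majority_vote(votes) for votes in zip(net1, net2, net3)]


def accuracy(predicted, actual):
    """Percentage of signals whose predicted label matches the true label."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


if __name__ == "__main__":
    # Hypothetical decisions of the three networks on four test signals.
    truth = [NORMAL, NORMAL, EPILEPTIC, EPILEPTIC]
    net1 = [1, 0, 0, 0]
    net2 = [1, 1, 1, 0]
    net3 = [0, 1, 0, 0]
    combined = combine_networks(net1, net2, net3)
    print("majority decisions:", combined)
    print("accuracy of majority stage:", accuracy(combined, truth))
```

With three two-class networks there is always a strict majority, so no tie-breaking rule is needed. In the hypothetical data above, each network misclassifies a different signal, which is the situation in which voting helps most.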
[Figure 3. Block diagram of validation stage with majority method. The AR features, nonlinear features, and FFT features with FDR of the testing data are fed to Net1, Net2, and Net3; in the example shown, the network outputs 1, 0, and 1 pass through the majority method to give 1 (normal case).]

4. RESULTS

The results of classification are expressed in terms of accuracy. The accuracy is the total percentage of correct predictions. The results are presented in Table 1, which contains the classification performance of the ANN for the fast Fourier transform with FDR, autoregressive, and nonlinear features. As Table 1 shows, the AR, FFT, and nonlinear features gave 99.07%, 90.3%, and 98.6%, respectively. After using the majority method, the accuracy of discriminating between epileptic and normal cases became 99.5%. From these results, we note that the majority method improved the performance of the system by about 0.4 percentage points (from 99.07% to 99.5%) relative to the AR features alone. Although small, this improvement is significant because it helps the neurophysiologist reach a correct diagnostic decision for each case. The majority method over FFT, AR, and nonlinear features thus enhances the performance and gives better classification accuracy than FFT, AR, or nonlinear features alone.

Table 1: The accuracy of classification of different features and after using majority method

Features                     Accuracy
3 FFT coefficients by FDR    90.3%
5 AR coefficients            99.07%
D2 with λk                   98.6%
Majority method              99.5%

5. CONCLUSION

From the previous results, we conclude that the ANN classifier with the autoregressive model is better for detection of epileptic seizures than the ANN classifier with FFT or nonlinear features. The majority method with three classifiers over three feature groups is the best method for enhancing the performance of the system. From this system, we can create computational models for automatic detection of epileptic discharges in EEG signals that can be used to predict the onset of seizures.

6.
ACKNOWLEDGEMENT

The authors would like to thank El-kasr El-einy Hospital, especially the clinical neurophysiology unit, for their assistance with access to their equipment, the acquisition of the samples used in this study, and their expertise in EEG analysis.

REFERENCES

[1] Wanpracha Chaovalitwongse, Panos M. Pardalos, and Oleg A. Prokopyev, "EEG Classification in Epilepsy", Kluwer Academic Publishers, June 30, 2004, URL: http://coewww.rutgers.edu/ie/research/working_paper/paper%2004-027.pdf.
[2] "Statistics by Country for Arrhythmias", 2005, URL: http://www.cureresearch.com/e/epilepsy/stats-country.htm.
[3] John G. Webster, Medical Instrumentation: Application and Design, third edition, pp. 173-175, John Wiley & Sons, Inc., 1998.
[4] Hojjat Adeli, Ziqin Zhou, and Nahid Dadmehr, "Analysis of EEG records in an epileptic patient using wavelet transform", Journal of Neuroscience Methods, Vol. 123, pp. 69-87, 2003.
[5] Abdulhamit Subasi, M. Kemal Kiymik, Ahmet Alkan, and Etem Koklukaya, "Neural network classification of EEG signals by using AR with MLE preprocessing for epileptic seizure detection", Mathematical and Computational Applications, Vol. 10, No. 1, pp. 57-70, 2005.
[6] S. Gigola, F. Ortiz, C. E. D'Attellis, W. Silva, and S. Kochen, "Prediction of epileptic seizures using accumulated energy in a multiresolution framework", Journal of Neuroscience Methods, Vol. 138, Issues 1-2, pp. 107-111, 30 September 2004.
[7] Abdulhamit Subasi, "Automatic detection of epileptic seizure using dynamic fuzzy neural networks", Expert Systems with Applications, Vol. 31, Issue 2, pp. 320-328, August 2006.
[8] J. B. Gao, Hui Liu, K. E. Hild, J. C. Principe, and J. C. Sackellares, "Epileptic Seizure Detection from ECoG Using Recurrence Time Statistics", Proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2004), San Francisco, CA, September 2004.
[9] R. Harikumar and B. Sabarish Narayanan, "Fuzzy Techniques for Classification of Epilepsy Risk Level from EEG Signals", International Conference IEEE TENCON 2003, IISc, Bangalore, October 2003.
[10] M. Hoeve, "Detecting epileptic seizure activity in the EEG by Independent Component Analysis", MSc thesis, University of Twente, Enschede, the Netherlands, 2003.
[11] P. E. McSharry, T. He, L. A. Smith, and L. Tarassenko, "Linear and non-linear methods for automatic seizure detection in scalp electro-encephalogram recordings", Medical & Biological Engineering & Computing, Vol. 40, pp. 447-461, March 2002.
[12] Rosana Esteller, "Detection of Seizure Onset in Epileptic Patients from Intracranial EEG Signals", PhD thesis, School of Electrical and Computer Engineering, Georgia Institute of Technology, June 1999.
[13] N. McGrogan, under supervisor Prof. L. Tarassenko, "Neural Network Detection of Epileptic Seizures in the Electroencephalogram", Probationary Research Transfer Report, Department of Engineering Science, Oxford University, February 22, 1999, URL: http://www.new.ox.ac.uk/~nmcgroga/work/transfer.pdf.
[14] DPI Göttingen, Tstool package ver. 1.11, 2003, URL: http://www.physik3.gwdg.de/tstool.
[15] Mohamed I. Owis, Ahmed H. Abou-Zied, Abou-Bakr M. Youssef, and Yasser M. Kadah, "Study of Features Based on Nonlinear Dynamical Modeling in ECG Arrhythmia Detection and Classification", IEEE Trans. Biomedical Engineering, Vol. 49, No. 7, pp. 733-736, July 2002.
[16] Mohamed I. Owis, "Novel techniques for cardiac arrhythmia detection", PhD thesis, Cairo University, Egypt, 2001.