International Journal of Mechanical Engineering and Technology (IJMET)
Volume 10, Issue 01, January 2019, pp. 690–698, Article ID: IJMET_10_01_070
Available online at http://www.iaeme.com/ijmet/issues.asp?JType=IJMET&VType=10&IType=1
ISSN Print: 0976-6340 and ISSN Online: 0976-6359
© IAEME Publication, Scopus Indexed

INSIGHT TO MUTUAL INFORMATION AND MATRIX FACTORIZATION WITH LINEAR NEURAL NETWORKS FOR EPILEPSY CLASSIFICATION FROM EEG SIGNALS

Harikumar Rajaguru and Sunil Kumar Prabhakar
Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam, India

ABSTRACT
The human brain exhibits rich spatiotemporal dynamics and is therefore highly complex. An epileptic seizure is caused by a sudden, temporary electrical disturbance of the brain. Seizures are sometimes confused with other events and sometimes go unnoticed altogether. Predicting the occurrence of an epileptic seizure is difficult, and so is understanding its course. Electroencephalography (EEG) is used to analyze this widespread brain disorder; it is one of the best techniques for probing brain activity and is highly useful for diagnosing neurological disease. An EEG monitoring system produces a very large amount of data, which is difficult to analyze visually. The dimensionality of the EEG data is therefore reduced using Mutual Information (MI) and Matrix Factorization (MF). The reduced values are then classified with Linear Layer Networks to classify epilepsy from EEG signals. When MI is used for dimensionality reduction and the result is classified with Linear Layer Networks, an average classification accuracy of 96.60% is obtained. When MF is used with Linear Layer Networks, an average classification accuracy of 97.47% is obtained.

Keywords: EEG, Seizure, MI, MF, Epilepsy
Cite this Article: Harikumar Rajaguru and Sunil Kumar Prabhakar, Insight to Mutual Information and Matrix Factorization with Linear Neural Networks for Epilepsy Classification from EEG Signals, International Journal of Mechanical Engineering and Technology, 10(01), 2019, pp. 690-698.
http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=10&IType=1

http://www.iaeme.com/IJMET/index.asp
editor@iaeme.com

1. INTRODUCTION
Epilepsy is a set of chronic brain disorders characterized by recurrent seizures [1]. The basic diagnostic test for epilepsy is the EEG [2]. EEG provides a continuous measure of cortical function and has high spatiotemporal resolution. Considerable effort is spent on interpreting EEG signals for a range of clinical purposes. In present clinical practice, the gold standard for interpreting EEG signals is visual scanning and inspection, which is tedious [3]. Experienced electroencephalographers are in short supply, so there is a need for automated systems that help with the detailed interpretation of EEG signals [4].

Some of the most relevant works in the literature on EEG signal processing and epilepsy classification from EEG signals are as follows. Ferlazzo et al. [5] used the permutation entropy of scalp EEG to investigate epilepsy. Harikumar and Kumar [6] applied fuzzy techniques and aggregation operators to classify epilepsy risk levels using EEG signals and cerebral blood flow. Swami et al. [7] designed an automated system to detect epileptic seizures with a Support Vector Machine (SVM) classifier. Rajaguru and Prabhakar [8] developed a Modified Sparse Representation Classifier and a Naïve Bayesian Classifier for the classification of epilepsy from EEG signals.
Sahu et al. [9] developed a chaos-based nonlinear analysis of epileptic seizures. Harikumar and Kumar [10] analyzed the frequency behavior of EEG signals in epileptic patients from a wavelet thresholding perspective. Gandhi et al. [11] diagnosed epilepsy with a combined Duffing oscillator model. Prabhakar and Rajaguru [12] combined Genetic Algorithms with dimensionality reduction techniques to classify epilepsy from EEG signals. Tzallas et al. [13] carried out a detailed time-frequency analysis for the detection of epileptic seizures. Prabhakar and Rajaguru [14] classified epilepsy risk levels using various distance measures. Rajaguru and Prabhakar [15],[16] used Bayesian Linear Discriminant Analysis (BLDA), hybrid Artificial Bee Colony with Particle Swarm Optimization (ABC-PSO), Sparse Principal Component Analysis (S-PCA), and Soft Decision Tree classifiers to classify epilepsy risk levels from EEG signals. In this work, MI and MF are used as dimensionality reduction techniques, and the reduced values are then classified with Linear Layer Networks.

The paper is organized as follows. Section 2 discusses the materials and methods, Section 3 describes the application of linear layer networks as post classifiers for the classification of epilepsy from EEG signals, Section 4 provides the results and discussion, and Section 5 gives the conclusion. The flow of the work is shown in Figure 1.

Figure 1 Proposed Flow of Work

2. DATA SET AND DIMENSIONALITY REDUCTION TECHNIQUES
The EEG data analyzed here come from 20 patients and were obtained from the Neurology Department of the Sri Ramakrishna Hospital, Coimbatore. The data are stored in European Data Format (EDF). Sixteen channel electrodes were placed on the scalp of each epileptic patient according to the standard International 10-20 system. Each recording lasts more than 55 minutes and is split into epochs for ease of analysis and computation. Each epoch of a single channel has approximately 400 values, so across all epochs of the 16 channels of the 20 patients the volume of data is very large. To reduce the overall dimensionality of the EEG signals, Mutual Information and Matrix Factorization are used.

2.1. Mutual Information
Given $w_a \in \mathrm{dom}(a)$ and $w_b \in \mathrm{dom}(b)$, the probabilities are denoted as follows:
$p(w_a)$: the probability that $a$ takes the value $w_a$.
$p(w_a, w_b)$: the joint probability that $a$ takes the value $w_a$ and $b$ takes the value $w_b$.
$p(w_b \mid w_a) = \dfrac{p(w_a, w_b)}{p(w_a)}$: the conditional probability that $b$ takes the value $w_b$ given that $a$ takes the value $w_a$.

Entropy and Mutual Information are closely related [17]. Entropy is one of the central notions of information theory and measures the uncertainty in a random variable. The entropy of a random variable $a$, denoted $H(a)$, is defined as

$$H(a) = -\sum_{w_a \in \mathrm{dom}(a)} p(w_a) \log p(w_a)$$

MI describes the information that one random variable carries about another. The MI of two random variables $a$ and $b$, denoted $I(a, b)$, is defined as

$$I(a, b) = \sum_{w_a \in \mathrm{dom}(a)} \sum_{w_b \in \mathrm{dom}(b)} p(w_a, w_b) \log \frac{p(w_a, w_b)}{p(w_a)\, p(w_b)}$$

The information that $b$ carries about $a$ is the reduction in uncertainty about $a$ due to knowledge of $b$, and vice versa.
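The entropy and MI definitions above can be sketched in a few lines of NumPy. This is a minimal illustration only: the paper does not state how the probabilities are estimated from EEG samples, so the histogram-based helper `joint_from_samples` and its `bins` parameter are assumptions introduced here, and natural logarithms are used.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(a) = -sum p(w_a) log p(w_a), in nats."""
    p = np.asarray(p, dtype=float)
    nz = p > 0                                # treat 0 * log 0 as 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def mutual_information(joint):
    """I(a, b) from a joint table with joint[i, j] = p(w_a = i, w_b = j)."""
    joint = np.asarray(joint, dtype=float)
    p_a = joint.sum(axis=1, keepdims=True)    # marginal p(w_a), shape (n, 1)
    p_b = joint.sum(axis=0, keepdims=True)    # marginal p(w_b), shape (1, m)
    nz = joint > 0                            # skip cells with zero probability
    ratio = joint[nz] / (p_a @ p_b)[nz]       # p(w_a, w_b) / (p(w_a) p(w_b))
    return float(np.sum(joint[nz] * np.log(ratio)))

def joint_from_samples(a, b, bins=8):
    """Hypothetical helper: estimate the joint table by histogramming two signals."""
    counts, _, _ = np.histogram2d(a, b, bins=bins)
    return counts / counts.sum()

# Two independent binary variables share no information;
# a variable paired with itself shares all of its entropy.
independent = np.outer([0.5, 0.5], [0.25, 0.75])
identical = np.diag([0.5, 0.5])
mi_independent = mutual_information(independent)
mi_identical = mutual_information(identical)
```

For the independent table the MI is zero up to rounding, and for the identical pair it equals $H(a) = \log 2$, which matches the reading of MI as a reduction in uncertainty.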
The greater the value of $I(a, b)$, the more information $a$ and $b$ share about each other.

2.2. Matrix Factorization
Matrix factorization is a well-known computational method for reducing the dimensionality of data [18]. It has been widely used in chemical spectral analysis, data mining, image processing, the formulation of new drugs and so on. The method factorizes the data matrix into sparse, low-rank, non-negative matrices, and it is also known as Positive Matrix Factorization. Assume $L$ to be a non-negative matrix of dimension $c \times d$. Non-negative matrix factorization decomposes $L$ into sparse, low-rank, non-negative factors so that the original data is approximated as

$$L \approx KQ$$

where $K$ and $Q$ have non-negative elements. $K$ has dimension $c \times r$ and is termed the basis matrix because its columns contain a set of basis vectors. $Q$ has dimension $r \times d$ and is termed the weight matrix because its rows hold the coefficient sequences. The rank $r$ is chosen so that $(c + d)\,r < cd$. The columns of $Q$ are in one-to-one correspondence with the columns of $L$, so each column of $KQ$ is easily interpreted as a weighted sum of the basis vectors in $K$, with the weights given by the corresponding column of $Q$. Because of the non-negativity constraints, the factorization is purely additive, and each basis vector therefore represents a local component of the original data.

3. POST CLASSIFICATION WITH LINEAR LAYER NEURAL NETWORK
The dimensionally reduced values are then fed to the Linear Neural Network to classify the risk of epilepsy from EEG signals. A linear neural network is an affine mapping [19]. The network can be trained in two ways: with the least squares method or with the Widrow-Hoff algorithm.
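The reduction step of Section 2.2 and the two training routes just named can be sketched together as follows. This is an illustration under stated assumptions, not the authors' implementation: the paper names no factorization algorithm, so multiplicative updates are assumed; the data are synthetic random numbers rather than EEG; and the function names, learning rate `lr`, iteration counts and rank `r=5` are hypothetical choices made for this sketch.

```python
import numpy as np

def nmf(L, r, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative c x d matrix L by K (c x r) @ Q (r x d)."""
    rng = np.random.default_rng(seed)
    c, d = L.shape
    K = rng.random((c, r)) + eps
    Q = rng.random((r, d)) + eps
    for _ in range(n_iter):
        # Multiplicative updates (assumed algorithm): factors stay non-negative.
        Q *= (K.T @ L) / (K.T @ K @ Q + eps)
        K *= (L @ Q.T) / (K @ Q @ Q.T + eps)
    return K, Q

def augment(Z):
    """Append a constant 1 to every sample so the bias is learned as a weight."""
    return np.hstack([Z, np.ones((Z.shape[0], 1))])

def train_least_squares(Z, T):
    """Batch training: least squares solution via the pseudo-inverse."""
    return np.linalg.pinv(augment(Z)) @ T

def train_widrow_hoff(Z, T, lr=0.05, epochs=200):
    """Online training: Widrow-Hoff (LMS) rule, one sample at a time."""
    Za = augment(Z)
    W = np.zeros((Za.shape[1], T.shape[1]))
    for _ in range(epochs):
        for z, t in zip(Za, T):
            err = t - z @ W                   # output error for this sample
            W += lr * np.outer(z, err)        # step against the squared-error gradient
    return W

def predict(Z, W):
    """Affine map: weighted sum of the inputs plus bias."""
    return augment(Z) @ W

# Synthetic stand-in for EEG data: 60 samples (columns) of 40 non-negative values.
rng = np.random.default_rng(1)
L = rng.random((40, 60))
K, Q = nmf(L, r=5)                            # (40 + 60) * 5 < 40 * 60
Z = Q.T                                       # one 5-dimensional feature row per sample
T = Z @ rng.random((5, 1)) + 0.3              # synthetic targets, exactly affine in Z
W_ls = train_least_squares(Z, T)
W_wh = train_widrow_hoff(Z, T)
```

The pseudo-inverse route needs all samples at once and fits these noise-free targets in one step, whereas the Widrow-Hoff route only approaches the same least squares solution gradually, at a speed that depends on the learning rate and on how correlated the features are.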
Least squares can be implemented using the pseudo-inverse, while the Widrow-Hoff algorithm is a simple approximate gradient descent on the least squares error. In time series analysis, applications include prediction and noise removal, where the time series is first embedded in $\mathbb{R}^k$; linear layer networks are therefore widely used in the signal analysis community.

The linear neural node has a schematic form in which data flows from left to right. The real numbers $z_1, \dots, z_n$ are first presented to the input layer. Each edge carries a weight: $w_{ik}$ is the weight of the edge that travels from node $k$ to node $i$, and a signal passing along that edge is multiplied by $w_{ik}$. At the node, the incoming signals are summed and a value $d$, taken as the resting state of the cell, is added; the result is then passed along the axon. The end result is expressed as

$$z \mapsto w_{11} z_1 + w_{12} z_2 + \dots + w_{1n} z_n + d$$

which can be visualized as an $n$-dimensional plane. A single linear neural node is therefore an affine map. Multiple computational nodes are combined to obtain a linear neural network. The affine problem is represented as $Fz + g = t$ and is equivalent to solving a linear problem $\hat{F}\hat{z} = t$, in which the bias is absorbed into the matrix by setting $\hat{F} = [F \;\; g]$ and $\hat{z} = (z^T, 1)^T$. Here $F$ is $m \times n$ and $g$ is $m \times 1$, where $n$ is the number of nodes in the input layer and $m$ is the number of nodes in the output layer. Training a linear network means determining the weights and biases that best match a given set of input-output pairs. When all the data are available, training amounts to computing a pseudo-inverse.

4. RESULTS AND DISCUSSION
With Mutual Information and Matrix Factorization as the dimensionality reduction techniques and Linear Neural Networks as the classifier, the average results are reported in Tables 1 and 2 in terms of Performance Index, Accuracy, Quality Value, Time Delay, Specificity and Sensitivity. The formulae for the Performance Index (PI), Sensitivity, Specificity and Accuracy are as follows:

$$\mathrm{PI} = \frac{\mathrm{PC} - \mathrm{MC} - \mathrm{FA}}{\mathrm{PC}} \times 100$$

where PC denotes Perfect Classification, MC denotes Missed Classification and FA denotes False Alarm. The Sensitivity, Specificity and Accuracy measures are given by

$$\mathrm{Sensitivity} = \frac{\mathrm{PC}}{\mathrm{PC} + \mathrm{FA}} \times 100$$

$$\mathrm{Specificity} = \frac{\mathrm{PC}}{\mathrm{PC} + \mathrm{MC}} \times 100$$

$$\mathrm{Accuracy} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2}$$

The Quality Value (QV) is defined as

$$\mathrm{QV} = \frac{C}{(R_{fa} + 0.2)\,(T_{dly} \cdot P_{dct} + 6 \cdot P_{msd})}$$

where $C$ is the scaling constant, $R_{fa}$ is the number of false alarms per set, $T_{dly}$ is the average delay of the onset classification in seconds, $P_{dct}$ is the percentage of perfect classification and $P_{msd}$ is the percentage of risk levels missed.

The time delay is given by

$$\mathrm{Time\ Delay} = 2 \times \frac{\mathrm{PC}}{100} + 6 \times \frac{\mathrm{MC}}{100}$$

Several reasons justify the form of the QV formula.
i. QV increases monotonically as $R_{fa}$ decreases. The lower the false alarm rate, the better the classifier performance.
ii. A constant false alarm rate of 0.2 per set is added to $R_{fa}$ in the QV formula. In our method, the false alarm rate is low and usually ranges from 0 to 0.5. A rate higher than 0.5 is unacceptable.
iii. $T_{dly} \cdot P_{dct} + 6 \cdot P_{msd}$ is the weighted average of the delay of onset classification, where $T_{dly}$ is the average delay of onset classification and 6 seconds is used as the delay for a risk level missed by the method.
The weights applied to the average classification delay and to the missed-risk-level delay are the percentages of perfectly classified and missed risk levels respectively; consequently, the weighted average delay reflects the quality of a classifier with respect to classification delay.
iv. As a result of (iii), QV is inversely proportional to the weighted average of the delay of onset classification. This reflects the fact that the shorter the onset classification delay, the better the classifier.
v. Theoretically, the weighted delay $T_{dly} \cdot P_{dct} + 6 \cdot P_{msd}$ could be zero if no level is missed ($P_{msd}$ is zero) and all classification delays $T_{dly}$ are zero. However, this cannot happen because $T_{dly}$ can never be zero: a perfect risk level cannot be classified sooner than the length of the classification window, which lasts 2.0 seconds. As a result, the weighted delay cannot be very small.

The constant $C$ is empirically set to 10 because this scales the value of QV to an easily readable range. The higher the value of QV, the better the classifier; among different classifiers, the one with the highest QV should be the best.
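The performance measures defined in this section can be written as small Python functions. This is a sketch for checking the arithmetic only: the function names and input values are hypothetical and not taken from the tables, PC, MC and FA are passed as percentages, and passing $P_{dct}$ and $P_{msd}$ to the QV function as fractions is an assumption, since the paper does not state the unit convention.

```python
def performance_index(pc, mc, fa):
    """PI = (PC - MC - FA) / PC * 100, with inputs in percent."""
    return (pc - mc - fa) / pc * 100.0

def sensitivity(pc, fa):
    """Sensitivity = PC / (PC + FA) * 100."""
    return pc / (pc + fa) * 100.0

def specificity(pc, mc):
    """Specificity = PC / (PC + MC) * 100."""
    return pc / (pc + mc) * 100.0

def accuracy(pc, mc, fa):
    """Accuracy is the mean of sensitivity and specificity."""
    return (sensitivity(pc, fa) + specificity(pc, mc)) / 2.0

def time_delay(pc, mc):
    """Delay in seconds: 2 s weighted by PC plus 6 s weighted by MC."""
    return 2.0 * pc / 100.0 + 6.0 * mc / 100.0

def quality_value(r_fa, t_dly, p_dct, p_msd, c=10.0):
    """QV = C / ((R_fa + 0.2) * (T_dly * P_dct + 6 * P_msd))."""
    return c / ((r_fa + 0.2) * (t_dly * p_dct + 6.0 * p_msd))

# Hypothetical inputs for illustration only.
pc, mc, fa = 95.0, 3.0, 2.0
pi_value = performance_index(pc, mc, fa)
acc_value = accuracy(pc, mc, fa)
delay_value = time_delay(pc, mc)
qv_value = quality_value(r_fa=0.02, t_dly=2.0, p_dct=0.95, p_msd=0.03)
```

Raising `r_fa` or `p_msd` in the last call lowers the returned QV, which is the behavior that points (i) and (iv) describe.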
Table 1 Performance Analysis of MI with Linear Neural Networks

Parameters          Epoch 1   Epoch 2   Epoch 3   Average
PC (%)              94.08     93.54     94.38     94.00
MC (%)              6.66      4.37      4.37      5.13
FA (%)              0.41      2.08      1.24      1.24
PI (%)              91.67     92.89     93.88     92.81
Specificity (%)     93.33     95.6      95.63     94.86
Sensitivity (%)     99.38     97.31     98.35     98.35
Time Delay (sec)    2.26      2.13      2.15      2.18
Quality Values      21.98     21.57     22.12     21.89
Accuracy (%)        96.36     96.47     96.99     96.60

Table 2 Performance Analysis of MF with Linear Neural Networks

Parameters          Epoch 1   Epoch 2   Epoch 3   Average
PC (%)              96.25     95.84     95.84     95.97
MC (%)              2.49      1.45      1.66      1.87
FA (%)              1.24      2.70      2.49      2.14
PI (%)              96.08     95.65     95.65     95.79
Specificity (%)     97.50     98.54     98.33     98.12
Sensitivity (%)     98.15     95.99     96.30     96.11
Time Delay (sec)    2.07      2.00      2.02      2.03
Quality Values      22.78     22.07     22.14     22.33
Accuracy (%)        97.82     97.27     97.32     97.47

5. CONCLUSION
When Mutual Information is used as the dimensionality reduction technique and the result is classified with Linear Neural Networks, an average classification accuracy of 96.60% and an average quality value of 21.89 are obtained. When Matrix Factorization is used as the dimensionality reduction technique and the result is classified with Linear Neural Networks, an average classification accuracy of 97.47% and an average quality value of 22.33 are obtained. Matrix Factorization thus outperforms Mutual Information when classified with Linear Neural Networks. Future work will apply other dimensionality reduction techniques with Linear Neural Networks to classify the epilepsy risk level from EEG signals.
REFERENCES
[1] S.K Prabhakar, H. Rajaguru, “ICA, LGE and FMI as Dimensionality Reduction Techniques followed by GMM as Post Classifier for the Classification of Epilepsy Risk Levels from EEG Signals”, 9th IEEE European Modelling Symposium 2015, October 6-8, Madrid, Spain, 978-1-5090-0206-1/15, DOI: 10.1109/EMS.2015.20
[2] S.K Prabhakar, H. Rajaguru, “Classification of Epilepsy Risk using Variable Thresholding Based Feature Extraction Technique and suitable Post Classifiers”, 9th IEEE European Modelling Symposium 2015, October 6-8, Madrid, Spain, 978-1-5090-0206-1/15, DOI: 10.1109/EMS.2015.22
[3] S.K Prabhakar, H. Rajaguru, “Entropy Based PAPR Reduction for STTC System Utilized for Classification of Epilepsy from EEG Signals Using PSD and SVM”, IFMBE Proceedings (Springer), 3rd International Conference on Movement, Health and Exercise (MoHE), September 28-30, 2016, Malaysia.
[4] S.K Prabhakar, H. Rajaguru, “Wireless Systems with Reduced PAPR Using K-means Modified PTS Implemented for Epilepsy Classification from EEG Signals”, IFMBE Proceedings (Springer), International Conference on Advancements of Medicine and Health Care through Technology (MEDITECH), October 12-15, 2016, Romania.
[5] E. Ferlazzo, N. Mammone, V. Cianci, S. Gasparini, A. Gambardella, A. Labate, M. A. Latella, V. Sofia, M. Elia, F. C. Morabito, and U. Aguglia, “Permutation entropy of scalp EEG: A tool to investigate epilepsies. Suggestions from absence epilepsies,” Clin. Neurophysiol., vol. 125, no. 1, pp. 13–20, Jan. 2014.
[6] R. Harikumar, P. Sunil Kumar, “Fuzzy Techniques and Aggregation Operators in Classification of Epilepsy Risk Levels for Diabetic Patients Using EEG Signals and Cerebral Blood Flow”, Journal of Biomaterials and Tissue Engineering, Vol. 5, No. 4, pp. 316-322, April 2015.
[7] P. Swami, B. K. Panigrahi, A. K. Godiyal, M. Bhatia, J. Santhosh, and S. Anand, “Robust Expert System Design for Automated Detection of Epileptic Seizures using SVM Classifier,” IEEE Proc. Int. Conf. Parallel, Distributed and Grid Comput., Solan, India, pp. 219–222, Dec. 2014.
[8] H. Rajaguru, S.K Prabhakar, “A Framework for Epilepsy Classification Using Modified Sparse Representation Classifiers and Naïve Bayesian Classifier from EEG Signals”, Journal of Medical Imaging and Health Informatics, Vol. 6, No. 8, pp. 1829-1837, December 2016.
[9] R. Sahu, T. Parija, B. Mohapatra, B. Rout, S. Sahu, R. Panda, P. Pal, and T. Gandhi, “Chaos Based Nonlinear Analysis of Epileptic Seizure,” IEEE Proc. Int. Conf. Emerging Trends Engg. Tech., Goa, India, pp. 594–598, Nov. 2010.
[10] R. Harikumar, P. Sunil Kumar, “Frequency behaviors of electroencephalography signals in epileptic patients from a wavelet Thresholding perspective”, Applied Mathematical Sciences, Vol. 9, 2015, No. 50, pp. 2451-2457, HIKARI Ltd, http://dx.doi.org/10.12988/ams.2015.52135
[11] T. Gandhi, P. Bhowmik, A. Mohapatra, S. Das, S. Anand, and B. K. Panigrahi, “Epilepsy Diagnosis Using Combined Duffing Oscillator and PNN Model,” Bioinforma. Intell. Control., vol. 1, no. 1, pp. 64–70, June 2012.
[12] S.K Prabhakar, H. Rajaguru, “Utilizing Genetic Algorithms with Dimensionality Reduction Techniques for Epilepsy Classification from EEG Signals”, International Journal of Pharmacy and Technology, Vol. 8, Issue 1, March 2016, pp. 11334-11346.
[13] T. Tzallas, M.G. Tsipouras, and D.I. Fotiadis, “Epileptic Seizure Detection in EEGs Using Time-Frequency Analysis,” IEEE Trans. Info. Tech. Biomedicine, vol. 13, no. 5, pp. 703–710, Sep. 2009.
[14] S.K Prabhakar, H. Rajaguru, “A Different Approach to Epilepsy Risk Level Classification Utilizing Various Distance Measures as Post Classifiers”, Proceedings of the 8th Biomedical Engineering International Conference (BMEiCON), November 25-27, 2015, Pattaya, Thailand.
[15] H. Rajaguru, S.K Prabhakar, “Bayesian Linear Discriminant Analysis with Hybrid ABC-PSO Classifier for Classifying Epilepsy from EEG Signals”, IEEE Proceedings of the International Conference on Computing Methodologies and Communication (ICCMC 2017), Erode, India.
[16] H. Rajaguru, S.K Prabhakar, “Sparse PCA and Soft Decision Tree Classifiers for Epilepsy Classification from EEG signals”, IEEE Proceedings of the International Conference on Electronics, Communication and Aerospace Technology (ICECA 2017), Coimbatore, India, pp. 581-584.
[17] Yiping Ke, James Cheng, Wilfred Ng, “An information-theoretic approach to quantitative association rule mining”, 22 July 2007, Springer-Verlag London Limited.
[18] M. Berry, M. Browne, A. Langville, P. Pauca, and R.J. Plemmons, “Algorithms and applications for approximate nonnegative matrix factorization,” Computational Statistics and Data Analysis, vol. 52, pp. 155–173, 2007.
[19] V.V. Ramalingam, S. Ganesh Kumar, and V. Sugumaran, Analysis of EEG Signals Using Data Mining Approach, International Journal of Computer Engineering and Technology, pp. 206-212.