Pattern recognition of epileptic EEG graphoelements with adaptive segmentation, supervised and unsupervised learning algorithms
Vladimir Krajca1, Jiri Hozman1, Jitka Mohylová2, Svojmil Petránek3
1 Czech Technical University in Prague, Faculty of Biomedical Engineering, Czech Republic, vladimir.krajca@fbmi.cvut.cz
2 VŠB-Technical University of Ostrava, Faculty of Electrical Engineering and Computer Science, Czech Republic, jitka.mohylova@vsb.cz
3 Hospital Na Bulovce, Dept. of Neurology, Prague, ebupetranek@seznam.cz

1 Introduction
The electroencephalogram (EEG) provides markers of brain disturbances in epilepsy. In short EEG recordings, epileptic graphoelements may not manifest themselves. Visual analysis of lengthy signals is tedious: the EEG activity must be tracked on the computer screen and the epileptiform graphoelements detected by eye, so the process needs to be automated. EEG wave classification by supervised and by unsupervised learning algorithms is compared, and a combination of both is used.

2 Aim of study
- To show that artificial neural networks (ANN) classify EEG graphoelements more precisely than the previously used cluster analysis.
- Cluster analysis can be used in preprocessing, for semi-automatic creation of etalons (prototypes) for learning classifiers.
- Etalons can be extracted both manually and automatically from the original EEG recordings, from segments detected by adaptive segmentation and described by a feature set from the time, frequency, and entropy domains.

3 Automatic identification of EEG graphoelements
In different areas of EEG processing, such as
- brain maturation assessment in newborns
- monitoring and detection of epileptic seizures in adults
computerized analysis of the micro- and macrostructure of the EEG is desirable.
EEG microstructure: identification of single graphoelements and/or frequency bands, EEG bursts, artefacts, etc.
Macrostructure: trends, detection of significant events, behavioral states, and sleep stages; it reveals hidden information in long-term EEG processing.

4 Cluster analysis and adaptive segmentation yield a color identification of the classes, which reflects the microstructure (short events). Temporal profiles reflect the macrostructure: class membership over the course of time.

5 Macrostructure is reflected in temporal profiles (example: time scale 15 min/page)
[Figure labels: SIGNIFICANT EVENT (artefact); SIGNIFICANT EVENT (epi paroxysms)]

6 The cursor in the profile (15 min/page) selects the event at that position in the original EEG recording. Example: muscle artefacts (blue color).
[Figure labels: ORIGINAL EEG, 10 s/page; PROFILE, 15 min/page; CURSOR]

7 Example: epileptic event at the cursor position

8 Example: epileptic events are reflected in the temporal profile

9 Adaptive segmentation and the identified clusters improve feature extraction and etalon selection (segment boundaries and segment types/classes can be used as a guide).

10 Cluster analysis
Advantages: unsupervised learning ("push the button and wait for results"); classes are ordered by increasing segment amplitude.
Disadvantages: the classes (clusters) are selected by the computer. The last (red) class can consist of genuine epileptic spikes, but it can also contain artefacts.

11 Learning (supervised) classifiers
Advantages: with supervised learning we decide ourselves which class is first, second, etc. We can decide (by teaching) which types of graphoelements we are looking for. One class can consist of movement artefacts, which can be eliminated later.
Disadvantages: teaching the classifier and selecting etalons (prototypes) is tedious work that requires a skilled expert.
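Slide 10 states that cluster analysis returns classes ordered by increasing segment amplitude. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes plain k-means (the slides do not name the clustering algorithm here), hypothetical two-feature segments of the form (amplitude, dominant frequency), and a naive deterministic initialization from the first k segments.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain k-means. Returns (centers, labels).

    Assumption: the first k points seed the centers, which keeps the
    example deterministic; real data would need a better initialization.
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each segment goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # no change, converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def order_by_amplitude(centers, labels, amp_index=0):
    """Renumber classes so that class 0 has the lowest mean amplitude."""
    order = sorted(range(len(centers)), key=lambda c: centers[c][amp_index])
    remap = {old: new for new, old in enumerate(order)}
    return [centers[c] for c in order], [remap[lab] for lab in labels]

# Hypothetical segments: (amplitude in microvolts, dominant frequency in Hz).
segments = [
    (150.0, 2.0), (10.0, 9.5), (55.0, 4.0),
    (12.0, 10.0), (160.0, 2.5), (60.0, 3.5),
    (11.0, 9.0), (58.0, 4.5), (155.0, 1.5),
]

centers, labels = kmeans(segments, k=3)
ordered_centers, ordered_labels = order_by_amplitude(centers, labels)
print(ordered_labels)
```

After the renumbering, the highest class index always holds the largest-amplitude segments. This is why, as slide 10 warns, the last class can mix genuine epileptic spikes with high-amplitude artefacts.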
12 Expert in semi-automatic etalon selection
The best compromise between visual and fully automated EEG analysis is a semi-automatic method that uses both machine learning and the expertise of the physician. As a first, preprocessing step, cluster analysis is used for etalon extraction: it is effective, but the classes are created independently of the user's wishes and can be inhomogeneous.

13 Learning classifier
Teaching is tedious. Etalons, typical representatives of the desired classes, must be created or selected by a teacher and are submitted to the classifier during the learning process. At least 50-100 prototypes per class are necessary (personal experience). Manual prototype selection is time-consuming, but the class centers of the clusters can be exploited for automatic prototype selection; outliers are then edited by an expert.

14 Automatic classification of EEG graphoelements by cluster analysis
- Efficient, with no need for learning.
- Hybrid segments that exhibit features of several overlapping classes can be misclassified.
- No possibility to influence the classification, i.e. to specify user-defined classes (artefacts in the last class, etc.).
- Clusters are created by the "natural" data structure.
- Clusters have a spherical shape in the feature space and are formed without user intervention.

16 Testing the methodology on real data
EEG record of a patient diagnosed with epilepsy (length 31 min, 8 classes). Epileptic graphoelements and impulse artefacts have similar parameters (features).

17 Cluster analysis
Noise/muscle artefacts are misclassified into the blue (6th) class of impulse artefacts; see its position in the temporal profiles. Note the good identification of continuous impulse artefacts in the 13th channel.

18 Cluster analysis
Misclassified "hybrid" segments exhibiting features of both classes.
Blue and violet are the class colors. Fuzzy cluster analysis might help to eliminate the hybrid segments.

19 Cluster analysis can be used for semi-automatic extraction of etalons from the raw, original EEG
Typical, representative segments of a cluster lie in the feature space near its center of gravity. These members closest to the class center are the etalons of the class. Because cluster analysis runs relatively quickly, candidate etalons are readily available, and only minimal effort is needed for the final editing of the etalon set.

20 Cluster analysis
Representative segments closest to the cluster center = etalons for teaching the learning classifier (neural network)

21 Learning classifiers could solve or reduce the problems mentioned above.
Method:
1. The user specifies what to search for
2. Realisation is performed by an ANN (artificial neural network)
3. Learning by GA (genetic algorithms)
4. Weight initialization (to avoid local minima): simulated annealing

22 ANN 24-12-8
- 24 inputs (features)
- 12 neurons in the hidden layer (combining the input features; the number was set empirically, by trial and error)
- 8 outputs (8 classes)

23 ANN, 3-layer perceptron
Improvement over the cluster analysis method: impulse and noisy artefacts are now distinguished.

24 ANN, 3-layer perceptron
Classes are now more homogeneous.

25 How to select etalons?
1. The expert selects etalons with a mouse on the computer screen
2. (Semi-)automatically by cluster analysis (minor editing of the etalon database)

26 ANN, etalon selection
Example of etalon selection by mouse within the boundaries of the segments found by adaptive segmentation
[Figure labels: ETALON; SPECTRUM; FEATURES]

27 ANN, etalon selection: ETALON SELECTION FROM EEG; DATABASE EDITING
1 - etalon selection
2 - etalon identification (class number entered by a teacher)
3 - click on the etalon: features histogram (4) and spectrum (5)
Parameters are compared in a small window (6).
Average features and average spectrum for each class (7).

29 ANN, summary sheets
Epileptic prototypes and artefacts in two different channels

31 Results visualization
Different types of activity can be identified by color directly in the real EEG and in the temporal profiles under the cursor position.

32 ANN, etalon selection

33 Features evaluation
ANN - MLP (3-layer perceptron). Scatterogram AP-Sigma (amplitude vs. sigma frequency band), panels a-d.

34 Conclusion
- An ANN trained with a genetic algorithm and simulated annealing can learn to recognize EEG graphoelements much better than the unsupervised learning algorithm.
- The types of graphoelements in the classes can be specified by the user.
- Cluster analysis provides "natural" clusters; it is not possible to specify that, for example, class number six consists of artefacts.
- Cluster analysis can be used in the first step of processing, for etalon specification.
- Adaptive segmentation can be used for manual selection of etalons from the EEG, by plotting the segment boundaries in the graph.

35 CLUSTER ANALYSIS
Fuzzy cluster analysis (FCM algorithm). Impulses are wrongly assigned to the class of epileptic graphoelements. Threshold 0.3.
IMPROVED DATA HOMOGENEITY: ATYPICAL SEGMENTS ARE EXCLUDED

36 Fuzzy cluster analysis: elimination of outliers with membership lower than 0.3

37 ANN - MLP (3-layer perceptron)
The feature histogram and the spectrum can be used to assess the quality of the proposed features. Fig. 11 again shows that classes no. 5 and 7 (counting from zero) should be merged.

38 But!
Artefacts can fall into the last class, just like slow high-voltage activity, and impair the detection accuracy.
[Figure label: ARTEFACTS]

39 Automatic classification and visualization of epileptic EEG by supervised and unsupervised algorithms

40 MOTIVATION: secondary paroxysm

41 Comparison of fuzzy k-NN and cluster analysis

44 Cluster

46 Fuzzy k-NN

47 Problems to be solved
- Etalon description: feature selection
- Database of prototypes
- Generalization: the presented examples are based on etalons extracted from the beginning of the same recording
- Robust identification
- Optimal MLP structure (number of hidden neurons)
- Modern, better classifiers inspired by nature (genetic algorithms, ant colony optimization, ...)
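Slides 35 and 36 describe excluding atypical segments whose fuzzy membership is below 0.3. A minimal sketch of that thresholding step follows, assuming the standard fuzzy c-means membership formula with fuzzifier m = 2 and cluster centers that are already known. The centers, the sample segments, and the function names are hypothetical. Only the 0.3 threshold comes from the slides.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in every cluster.

    Assumption: fuzzifier m = 2, a common default; the slides do not
    state the value used.
    """
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with a center: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]

def drop_atypical(points, centers, threshold=0.3):
    """Split points into (kept, dropped) by their highest membership."""
    kept, dropped = [], []
    for p in points:
        u = fcm_memberships(p, centers)
        best = max(range(len(u)), key=u.__getitem__)
        target = kept if u[best] >= threshold else dropped
        target.append((p, best, u[best]))
    return kept, dropped

# Hypothetical cluster centers and segments in a 2-D feature space.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
segments = [(1.0, 1.0), (5.0, 5.0), (9.0, 0.5)]

kept, dropped = drop_atypical(segments, centers)
for p, cls, u in kept:
    print(f"kept    {p} -> class {cls}, membership {u:.2f}")
for p, cls, u in dropped:
    print(f"dropped {p}, best membership {u:.2f}")
```

The segment at (5, 5) is equally far from all four centers, so each membership is 0.25 and it is excluded as a hybrid. With c clusters the highest membership is never below 1/c, so a threshold of 0.3 only removes segments when there are at least four classes, as in the 8-class recording discussed above.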