ENHANCED AUTOMATIC IDENTIFICATION OF ARRHYTHMIA IN ELECTROCARDIOGRAM (ECG) SIGNALS BASED ON FRACTAL FEATURES AND SVM TECHNIQUE

By
Maram Hasan Al-Alfi

Supervisor
Dr. Rashiq Marie

This Thesis Was Submitted in Partial Fulfillment of the Requirements for the Master's Degree in Computer Science

Faculty of Scientific Research and Graduate Studies
Zarqa University
Zarqa, Jordan
April, 2014

ACKNOWLEDGMENTS

I thank Allah, for I would not have been able to complete this thesis without His aid and support. I would like to express my deepest appreciation to my advisor, Dr. Rashiq Marie, for his leadership, support, and attention to detail. I have enjoyed the aid and support of my mother, who instilled in me confidence and a drive for pursuing my MSc degree. Finally, I would like to thank the staff members of the Department of Computer Science at Zarqa University for their continuous help.

TABLE OF CONTENTS

List of Tables
List of Figures
List of Acronyms
List of Publications
Abstract in Arabic
Abstract in English
Chapter 1: Introduction
    1.1 Overview
    1.2 Problem Definition
    1.3 Thesis Contribution
    1.4 Thesis Organization
Chapter 2: Literature Review and Related Work
    2.1 Introduction
    2.2 Related Work
    2.3 Taxonomy and research definition
Chapter 3: Electrocardiogram (ECG) Biosignals and Fractal Geometry
    3.1 Overview
    3.2 Properties of ECG Signals
    3.3 Normal cardiac Electrocardiogram versus Abnormal
    3.4 Fractal features
    3.5 Explanation of Fractal Geometry and Fractal Dimensions
        3.5.1 Calculating Fractal Dimensions
        3.5.2 Examples of Deterministic Fractals
        3.5.3 Fractals and Fractal Geometry applications
    3.6 Fractal Features Extraction From ECG Signals
        3.6.1 Time Domain Methods of Estimating FD
Chapter 4: ECG Feature Extracting using (PSM) and Classification with (SVM)
    4.1 Introduction
    4.2 Feature Extraction Using Power Spectrum Method (PSM)
        4.2.1 PSM Methodology Algorithm
    4.3 Arrhythmia Classification based on SVM
        4.3.1 Multiclass SVM
        4.3.2 Application of OAA SVM Using Fractal Features in ECG Arrhythmia diagnosis
Chapter 5: Experimental Evaluation
    5.1 Dataset Description
    5.2 Experimental Results
        5.2.1 Fractal features Extraction
        5.2.2 Classification with SVM
Chapter 6: Conclusion and Future Work
    6.1 Conclusion
    6.2 Future work
References
Appendices
    Appendix A: Matlab Code

LIST OF TABLES

Table 1: Description of the Used Dataset
Table 2: The Estimated FD Values for Normal Sinus Rhythm Signals
Table 3: The Estimated FD Values for Ventricular Premature Arrhythmia Signals
Table 4: The Estimated FD Values for Atrial Premature Arrhythmia Signals
Table 5: The Estimated FD Values for Right Bundle Branch Block Arrhythmia Signals
Table 6: The Estimated FD Values for Left Bundle Branch Block Arrhythmia Signals
Table 7: Distinct Range of FD Values for Sample ECG Signals Using PSM
Table 8: Ranges of FD Values for Sample ECG Signal Using Katz’s Method
Table 9: Ranges of FD Values for Sample ECG Signal Using Higuchi’s Method
Table 10: Ranges of FD Values for Sample ECG Signal Using Hurst’s Method
Table 11: Average of the Estimated FD Values
Table 12: Number of Training and Testing Beats Used
Table 13: Class Percentage Accuracy Achieved on the Testing PSM-FD Values with a Total Number of 122 PSM-FD Training Values

LIST OF FIGURES

Figure 1: Propagation of the depolarization wave in the heart muscle
Figure 2: Typical shape of ECG signal and its essential waves
Figure 3: A Normal sinus rhythm
Figure 4: Premature Ventricular
Figure 5: Atrial Premature
Figure 6: Right bundle-branch block
Figure 7: Left bundle-branch block
Figure 8: Fern Leaf
Figure 9: Classical geometry objects
Figure 10: Fractal Curves
Figure 11: Demonstration of fractal dimensions with Euclidean line segments
Figure 12: Demonstration of fractal dimensions with Euclidean planes
Figure 13: The Koch Curve
Figure 14: The Sierpinski Triangle
Figure 15: (left) Normal Sinus Rhythm ECG signal of size 1024, (right) Zoom in version of Normal Sinus Rhythm ECG signal of size 512
Figure 16: (left) Atrial Premature Arrhythmia ECG signal of size 1024, (right) Zoom in version of Atrial Premature Arrhythmia ECG signal of size 512
Figure 17: Measured power spectrum of Normal Sinus Rhythm ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128
Figure 18: Measured power spectrum of Atrial Premature ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128
Figure 19: Estimate the Fourier Dimension q of ECG Signal
Figure 20: SVM Model
Figure 21: Hyper Plane
Figure 22: SVM method using fractal features for ECG Arrhythmia diagnosis
Figure 23: (A), (B) ECG signals of Normal Rhythm
Figure 24: (A), (B) ECG signals of a Premature Ventricular Arrhythmia
Figure 25: (A), (B) ECG signals of an Atrial Premature Arrhythmia
Figure 26: (A), (B) ECG signals of Right Bundle-Branch Block Arrhythmia
Figure 27: (A), (B) ECG signals of Left Bundle-Branch Block Arrhythmia

LIST OF ACRONYMS

AA: Average Accuracy
AF: Atrial Fibrillation
AFIB: Atrial Fibrillation beat
AP: Atrial Premature beat
BII: Heart Block
ECG: Electrocardiogram
FD: Fractal Dimension
FFT: Fast Fourier Transform
HRV: Heart Rate Variability
LBBB: Left Bundle Branch Block
PDF: Probability Distribution Function
Poly: Polynomial
PSDF: Power Spectral Density Function
PSM: Power Spectrum Method
PVC: Premature Ventricular Contraction (ventricular premature beat)
RBBB: Right Bundle Branch Block
RBF: Radial Basis Function
RS: Rescaled Range Method
RSF: Random Scaling Fractal
SVM: Support Vector Machine
SVT: Supraventricular Tachycardia

LIST OF PUBLICATIONS

Rashiq R. Marie and Maram H. Al Alfi (May 11, 2014), “Identification of Cardiac Diseases from (ECG) Signals based on Fractal Analysis”, International Journal of Computers and Technology (IJCIT), Vol. 13, no. 6, pp. 4556-4565.

Maram H. Al Alfi and Rashiq R. Marie (2014), “Support Vector Machine based Arrhythmia Classification using Fractal Dimension Feature of ECG Signal”, International Journal of Computer Science Issues (IJCSI) (Submitted).

ABSTRACT (TRANSLATED FROM ARABIC)

An Enhanced Mechanism for Detecting Arrhythmia in the Electrocardiogram Based on Fractal Features and the SVM Technique

By Maram Hasan Al-Alfi
Supervisor: Dr. Rashiq Marie

The analysis of the electrocardiogram (ECG) is one of the main research interests in biomedical signal processing. The reasons for this interest are the growth in cardiac health care activities around the world and the rapid advance in digital computer technology, which plays an essential role in detecting disease states in biosignals. Because the assessment of diagnostic results for these biosignals depends heavily on quantity, accuracy, and speed, computer-based analysis is very useful in clinical therapy.

In this thesis, a method for analyzing the ECG signal using fractal features and the SVM technique is proposed. Practical experiments showed that this method provides a good electronic diagnostic pattern for cardiac arrhythmia, and that it can be used by a specialist physician to diagnose different types of this disease with an average accuracy of 33.88%.

ECG signals exhibit fractal patterns, and I computed the fractal dimension (FD) of the ECG time series in the feature extraction phase. For this purpose, the Power Spectrum Method (PSM) was applied to four kinds of abnormal waveform and to the normal one. All ECG signals were collected from the MIT-BIH arrhythmia database.

Finally, a multi-class SVM classifier was built and fed with a vector of the fractal dimensions (FDs) of the ECG signals extracted in the previous phase, labeled according to four kinds of abnormal waveform and one normal. The results obtained confirm the superiority of the proposed method for identifying arrhythmia compared with other traditional methods, which base their analysis of ECG signals on morphology features and three ECG temporal features, namely the QRS complex duration, the RR interval (the time span between two consecutive R points, representing the distance between the QRS peaks of the present and previous beats), and the RR interval averaged over the last ten beats [88]. The results also indicate that substantial improvements in classification accuracy can be achieved with the proposed classification system.

ABSTRACT

The analysis of the electrocardiogram (ECG) is one of the major research interests in biomedical signal processing. The reasons for this interest are the growth in cardiac health care activities all over the world and the rapid advance in digital computer technology, which plays an essential role in the detection of disease states from biosignals. Because the assessment of diagnostic results for these biosignals depends heavily on quantity, accuracy, and speed, computer-based analysis is very useful in clinical therapy.
In this thesis, a method for analyzing ECG signals using fractal features and the support vector machine (SVM) technique is proposed. Practical experiments show that this method provides a good electronic diagnostic pattern for cardiac arrhythmia, and that it can be used by a specialist doctor to diagnose various types of this disease with an average accuracy of 89.33%.

Because ECG signals show fractal patterns, the fractal dimension (FD) of the ECG time series is estimated in a feature extraction phase. For this purpose, the Power Spectrum Method (PSM) is applied to four kinds of abnormal waveform and to normal beats. All ECG signals were acquired from the MIT-BIH arrhythmia database.

Finally, a multi-class SVM classifier is constructed and fed with a vector of the ECG signal FDs extracted in the previous phase, labeled according to the four kinds of abnormal waveform and the normal one. The results confirm the superiority of the proposed method for identifying cardiac arrhythmia compared with the traditional one, which analyzes ECG signals based on morphology features and three ECG temporal features: the QRS complex duration (the QRS complex being the combination of three of the graphical deflections seen on a typical ECG), the RR interval (the time span between two consecutive R points, representing the distance between the QRS peaks of the present and previous beats), and the RR interval averaged over the last ten beats [33]. The results also suggest that substantial improvements in classification accuracy can be achieved with the proposed classification system.

Chapter 1: Introduction

1.1 Overview

Computer technology has an important role in structuring biological systems.
The huge growth of high-performance computing techniques in recent years has supported the development of useful and accurate models of biological systems and has contributed significantly to new approaches to fundamental problems in modeling their behavior. The importance of analyzing biological time series, which typically display complex dynamics, has long been recognized in the area of nonlinear analysis. Several features have been proposed to detect important hidden dynamical properties of such signals. These nonlinear dynamical techniques have been applied in many areas, including medicine and biology [1]. In 2004, nonlinear techniques were used to analyze physiological signals: heart rate, nerve activity, renal blood flow, arterial pressure, EEG, and respiratory signals [1].

To investigate the time-varying spectral characteristics of the underlying process, most methods begin by computing the time variation of the common statistical properties of the process [3]. However, these methods fail to deal properly with the nonlinearity of the process. Fractal analysis, which I have applied here to analyze electrocardiogram (ECG) signals, allows me to process these signals effectively and obtain their higher-order statistics.

The ECG signal is the electrical signal generated by the heart muscle and measured on the skin surface of the body. This biosignal is essentially non-stationary and displays a fractal-like self-similarity. It may contain indicators of current disease, or even warnings about impending diseases. The indicators may be present at all times or may occur at random on the time scale. However, studying a set of irregularities in the huge amount of data collected over several hours is hard and time consuming. Therefore, computer-based analytical tools for in-depth study and classification of data over day-long intervals can be very useful in diagnostics.
1.2 Problem Definition

The ECG has a basic role in cardiology, since it consists of effective, simple, low-cost procedures for the diagnosis of cardiac disorders, which are highly relevant because of their impact on patients' lives. One of the pathological alterations observable by ECG is cardiac rhythm disturbance (arrhythmia). Arrhythmia is considered to lead to life-threatening conditions. The detection of abnormalities in intensive care patients is therefore essential and critical. Automatic ECG analysis and abnormality detection is very helpful: it aids clinical staff in the absence of doctors, and it helps doctors to diagnose and act faster in emergency conditions. Designing a low-cost, high-performance, and simple-to-use ECG tool that offers a combination of diagnostic features seems to be a global pursuit.

1.3 Thesis Contributions

The contributions of this thesis can be summarized as follows:
1. The implementation of an automatic approach to achieve highly reliable detection of cardiac abnormalities, which includes fractal feature extraction, arrhythmia classification, and assessment.
2. Feature extraction based on fractal analysis, using the Power Spectrum Method (PSM), for different cardiac diseases.
3. Evaluation of the extracted features using the SVM classification technique for the detection of cardiac arrhythmia.
4. Evaluation of the performance of a suitable classifier architecture and classifier inputs in the detection of various cardiac arrhythmias.

1.4 Thesis Organization

The thesis is organized in six chapters. This introductory chapter contains the problem definition and the thesis contributions. Chapter 2 reviews the literature on the research topics related to this thesis work.
Chapter 3 provides medical and technical background for understanding the ECG signal, describes the morphologies of normal heartbeats and of different arrhythmias, introduces fractals and fractal geometry together with their applications, describes how fractal dimensions are computed, gives examples of individual fractals, and finally presents some methods for computing the fractal dimension in the time domain. Chapter 4 focuses on the Power Spectrum Method (PSM), the proposed method for extracting ECG features, which makes use of the characteristics of the Power Spectral Density Function (PSDF) of a random scaling fractal signal in the frequency domain, and applies it to identify cardiac diseases from ECG signals. The chapter then describes the classification process, which uses the One Against All (OAA) SVM procedure to classify five types of heartbeat. Chapter 5 discusses and evaluates the experimental results obtained from classifying five types of heartbeat, namely normal sinus rhythm (N), atrial premature beat (A), ventricular premature beat (V), right bundle branch block (RBBB), and left bundle branch block (LBBB). The experiment automatically analyzes electrocardiograms with the help of fractal features, localizes points of interest, and decides whether they are normal or not. Chapter 6 concludes the thesis; it describes the results obtained with the proposed method and makes recommendations for future improvements.

Chapter 2: Literature Review and Related Work

2.1 Introduction

An ECG provides two major kinds of information. First, measuring the time intervals on the ECG helps determine the duration of the electrical wave crossing the heart, and consequently whether the electrical activity is normal, slow, fast, or irregular.
Second, measuring the amount of electrical activity passing through the heart muscle enables a pediatric cardiologist to find out whether parts of the heart are too large or are overworked [6]. Physicians therefore diagnose arrhythmia from long-term ECG data obtained with an ECG recording system. They interpret the morphology of the ECG waveform and decide whether a heartbeat belongs to the normal sinus rhythm or to a class of arrhythmia [5]. As the number of remote and mobile healthcare systems that adopt ECG recorders increases, the importance of a better and more robust automatic arrhythmia classification algorithm is increasingly acknowledged. ECG analysis basically consists of recognizing the pattern of the signal and classifying arrhythmia in real time.

2.2 Related Work

Many algorithms have been proposed over the years for developing automated systems that accurately classify electrocardiographic signals. Alka Vishwa and Archana Sharma [7] used Artificial Neural Networks (ANN) to classify whether a patient is suffering from arrhythmia; the ANN structure is used to test patients' records. The authors conclude that this phase of the study will be expanded into classification between types of arrhythmia. The accuracy achieved at this level is quite low, so a further extension is to use algorithms that give better accuracy, such as k-nearest neighbors and support vector machines. Silipo et al. [8] compared two classification techniques for ECG classification, one with supervised and the other with unsupervised learning. Yu et al. and Guyon et al. [9] presented the use of feature selection methods for choosing a subset of the original features. An obvious advantage of feature selection is a reduction in the time and cost of feature acquisition, as well as in classifier training and testing time.
Feature selection also helps improve classifier accuracy, provided that noisy, irrelevant, or redundant features are eliminated. Song et al. [10] proposed Support Vector Machine (SVM) based arrhythmia classification with a reduction of the feature dimensions by linear discriminant analysis (LDA). Raghav and Mishra [11] classified ECG arrhythmia using the local fractal dimensions of the ECG signal as the features for classifying arrhythmic beats. Their method matches the fractal dimension series of the test ECG waveform to those of representative ECG waveforms of different types of arrhythmia. Mahmoodabadi et al. [12] described an approach to ECG feature extraction that uses the Daubechies wavelet transform; they developed and evaluated an ECG feature extraction system based on the multi-resolution wavelet transform. Saxena et al. [13] described an approach for effective feature extraction from ECG signals; their paper presents a composite method developed for data compression, signal retrieval, and feature extraction of ECG signals.

Chouhan and Mehta [14] presented an algorithm for the detection of QRS complexes. The recognition of QRS complexes forms the basis of almost all automated ECG analysis algorithms. The presented algorithm uses a modified definition of the slope of the ECG signal as the feature for QRS detection. A succession of transformations of the filtered and baseline-drift-corrected ECG signal is used to extract the new modified slope feature. Alexakis et al. [15] described a method for the automatic extraction of both time-interval and morphological features from the ECG in order to classify ECGs as normal or arrhythmic. The method combined artificial neural network (ANN) and Linear Discriminant Analysis (LDA) techniques for feature extraction.
Five ECG features, namely RR, RTc, T wave amplitude, T wave skewness, and T wave kurtosis, were used in their method. These features are obtained with the assistance of automatic algorithms. The onset and end of the T wave were detected using the tangent method. The three feature combinations used had very similar performance in terms of the average performance metrics.

2.3 Taxonomy and research definition

This thesis presents an enhanced method for diagnosing cardiac arrhythmia from the ECG using an OAA SVM classifier based on the fractal dimension. The proposed method first extracts the features of ECG arrhythmia based on fractal theory. In this phase, three time-domain methods and one frequency-domain method are used to estimate the fractal dimension values for the normal and the different pathological conditions, which establishes a different range of FD for each specific disease. These ranges are used to distinguish clearly between healthy and non-healthy persons by placing each of them in a distinct FD range. This should facilitate the use of the method as a supplement that supports the diagnosis of a pathological or normal heart condition. The Power Spectrum Method (PSM) distinguishes between the ECG signals of healthy and non-healthy persons better than the other methods. The results also suggest that the FD is a practical tool for identifying abnormal characteristics in ECG recordings.

Because an SVM is known to offer remarkable classification performance, in this study I have chosen the widely used One Against All (OAA) SVM method, optimized by fractal feature selection, to classify a standard arrhythmia dataset [16] and to compare the accuracy rates obtained. After the fractal features had been extracted, the OAA-SVM classifier was trained on them in order to recognize and classify the ECG beats.
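The one-against-all idea described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the thesis implementation (whose code is in MATLAB and is not reproduced here): the two-dimensional feature vectors, the class labels, the function names, and the training settings are all made up for the example, and a plain linear SVM trained by subgradient descent on the regularised hinge loss stands in for whatever kernel and solver were actually used.

```python
# Illustrative one-against-all (OAA) linear SVM on made-up 2-D features.
# Data, names and hyperparameters are assumptions for this sketch only.

def train_binary_svm(samples, targets, lr=0.01, lam=0.001, epochs=5000):
    """Fit (w, b) by per-sample subgradient descent on the L2-regularised
    hinge loss. targets must contain only +1 and -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink w: gradient step on the (lam / 2) * ||w||^2 term.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: push the hyperplane away from x.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def train_oaa(samples, labels):
    """Train one binary SVM per class: that class (+1) against the rest (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_binary_svm(samples, targets)
    return models


def predict_oaa(models, x):
    """Return the class whose binary SVM yields the largest decision value."""
    def score(cls):
        w, b = models[cls]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)


# Hypothetical feature vectors for three beat classes (not thesis data).
X = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
     (3.0, 1.0), (3.2, 1.2), (2.9, 0.8),
     (2.0, 3.0), (2.1, 3.2), (1.8, 2.9)]
y = ["N", "N", "N", "V", "V", "V", "A", "A", "A"]

models = train_oaa(X, y)
predictions = [predict_oaa(models, x) for x in X]
```

Each class gets its own binary decision function, and a beat is assigned to the class with the highest decision value. With a single FD feature, a class whose range lies between two others cannot be isolated by one linear threshold, which is one reason a nonlinear kernel is commonly preferred in that setting.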
Compared with the diagnosis method based on ECG morphology features and three ECG temporal features, namely the QRS complex duration, the RR interval (the time span between two consecutive R points, representing the distance between the QRS peaks of the present and previous beats), and the RR interval averaged over the last ten beats [33], the proposed method has the advantages of a simple architecture and the ability to reach a global optimum.

Chapter 3: Electrocardiogram (ECG) Biosignals and Fractal Geometry

3.1 Overview

Electrocardiography deals with the electrical activity of the heart, which is monitored by placing sensors at the limb extremities of the subject [1]. As shown in Figure 1 [4], the electrocardiogram (ECG) is a faithful record of the origin and propagation of the electric potential through the cardiac muscles. It is considered a representative signal of cardiac physiology that is useful in diagnosing cardiac disorders. The cardiac cycle mainly consists of three electrical components representing the activation and deactivation of the atria and ventricles, the blood-pumping chambers of the heart. During each cardiac cycle the atria contract in diastole to fill the ventricles, which then contract during systole to supply blood to the lungs and the systemic circulation. Contraction of the atria and ventricles is tightly coordinated by a wave of depolarization spreading through the muscular walls of these chambers. The depolarization wave reflects the movement of charge across myocyte membranes and is in effect an electrical current spreading through the heart. Following contraction, cardiac muscle returns to a resting state, which is associated with a reversal of the movement of charge across the myocyte membranes; this second wave of electrical activity is termed cardiac repolarization. The leads of the ECG machine are designed to detect and record these two waves of cardiac electrical activity.
The depolarization wave spreads through the heart in a highly predictable pattern, and to understand the ECG readout this pattern of spread needs to be understood [4]. The deflection produced by atrial depolarization is termed a P wave, while ventricular depolarization produces the QRS complex. The diffuse deflection produced by ventricular repolarization is termed a T wave. The nomenclature of the QRS complex can cause some confusion but is in fact quite straightforward. Within the QRS complex, any positive deflection, that is, a deflection above the isoelectric line, is termed an R wave. Any negative deflection that follows an R wave is termed an S wave. However, if the first deflection of the QRS complex is negative, this deflection is termed a Q wave [4]. The section of the ECG recording connecting the end of the QRS complex and the beginning of the T wave is termed the ST segment.

Figure 1. Propagation of the depolarization wave in the heart muscle

The potential difference recorded at two points of the electromagnetic field reflects the ECG signal. The shape of the ECG signal and the cyclic repetition of its characteristic parts, including the P-QRS-T complex, constitute essential information about the operation of the electrical conduction system of the heart. By analyzing ECG signals recorded simultaneously at different points of the human body, we can obtain essential diagnostic information related to heart functioning. This information concerns not only the electrophysiological parameters of the heart but also its anatomical and mechanical properties. In essence, the ECG signal is an electric signal generated directly by the heart muscle cells. The information included in the ECG signal is directly related to the source of the signal, that is, the heart itself. ECG signals are recorded as a difference of electric potentials between two points inside the heart, on its surface, or on the surface of the human body.
The potential difference corresponds to the voltage recorded between the two points where the measurements are taken. This voltage is the amplitude of the ECG signal recorded in the two-pole (two-electrode) system. Such a two-electrode system applied to the recording of the ECG signal is referred to as an ECG lead. The ECG signal recorded on paper or on an electronic data carrier is called an electrocardiogram.

Although biologists have traditionally represented heartbeats as sine waves, scientists have come to recognize that they can be better characterized using fractal geometry [18]. ECG signals, oscillating at the borderline between chaos and order, have a fractal nature. If the beat is too periodic, heart failure might be the result, whereas a heart attack might occur when it is too aperiodic. Fractals form a relatively new branch of mathematics and art. A fractal is generally described as “a rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole” [20]. While classical Euclidean geometry works with objects that exist in integer dimensions, fractal geometry deals with objects in non-integer dimensions. Euclidean geometry is a description of lines, ellipses, circles, and so on. Fractal geometry, however, is described by algorithms, that is, sets of instructions on how to create a fractal [20].

Euclidean geometry is perfectly suited to the world that humans have created. But if one considers the structures that are present in nature, many of the rules of Euclidean geometry no longer apply. Clouds are not perfect spheres, mountains are not symmetric cones, and lightning does not travel in a straight line. Nature is rough, and until very recently this roughness was impossible to measure. The discovery of fractal geometry made it possible to explore mathematically the kinds of rough irregularities that exist in nature.
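To make the idea of a non-integer dimension concrete, the short Python sketch below evaluates the standard self-similarity dimension D = log N / log s for an object made of N copies of itself, each reduced by a factor s. The function name and the choice of examples are assumptions made for illustration; the Koch curve and the Sierpinski triangle are used because they are classical deterministic fractals.

```python
import math


def similarity_dimension(copies, reduction):
    """Self-similarity dimension D = log(N) / log(s) for an object built
    from `copies` (N) pieces, each `reduction` (s) times smaller than the whole."""
    return math.log(copies) / math.log(reduction)


# Euclidean objects give integer dimensions.
line_segment = similarity_dimension(2, 2)   # 2 halves, each 1/2 as long
square = similarity_dimension(4, 2)         # 4 quarter squares, side 1/2

# Classical deterministic fractals give non-integer dimensions.
koch_curve = similarity_dimension(4, 3)           # 4 copies at scale 1/3
sierpinski_triangle = similarity_dimension(3, 2)  # 3 copies at scale 1/2

for name, value in [("line segment", line_segment),
                    ("square", square),
                    ("Koch curve", koch_curve),
                    ("Sierpinski triangle", sierpinski_triangle)]:
    print(f"{name}: D = {value:.4f}")
```

The Koch curve comes out at about 1.26 and the Sierpinski triangle at about 1.58, both strictly between the dimensions of a line and a plane. Measured signals such as the ECG are not exactly self-similar, so their dimension has to be estimated numerically rather than read off from N and s.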
In our world there are many objects that exist in integer dimensions: zero-dimensional points, one-dimensional lines and curves, two-dimensional plane figures such as circles and squares, and three-dimensional solid objects such as spheres and cubes. However, many things in nature are better described with a dimension that lies between two whole numbers. While a straight line has a dimension of exactly one, a fractal curve has a dimension between one and two, depending on how much space it takes up as it curves and twists [20].

3.2 Properties of ECG Signals

ECG signals are among the best-known biomedical signals. Given their nature, they pose a number of challenges during registration, processing, and analysis. Characteristic properties of biomedical signals include non-stationarity, susceptibility to noise, and variability among individuals. ECG signals show all these properties. For the purposes of ECG diagnostics, a typical ECG signal (viewed as normal) has been defined that reflects the electrical activity of the heart muscle during a single cardiac cycle. Figure 2 [5] defines some characteristic segments, points, and parameters used to capture the essence of the signal. In medical diagnostics, the relationships between the shape and parameters of the signal and the functioning of the heart are often expressed as linguistic statements resulting in logic expressions. For instance, we have terms such as “extended R wave,” “shortened QT interval,” “unclear Q wave,” “elevated ST segment,” and “low T wave.” The expert cardiologist forms his or her own model of the process, which is described in a linguistic fashion. This model is clearly formed on the basis of acquired knowledge and experience [5].

Figure 2. Typical shape of ECG signal and its essential waves

3.3 Normal cardiac Electrocardiogram versus Abnormal

Normal sinus rhythm is the rhythm of a healthy normal heart, in which the sinus node triggers the cardiac activation. It is easily diagnosed by noting that the three deflections, P-QRS-T, follow in this order and are differentiable, as shown in Figure 3 [4]. The sinus rhythm is normal if its frequency is between 60 and 100/min [4].

Figure 3. Normal sinus rhythm.

An arrhythmia is an abnormality in the heart's rhythm, or heartbeat pattern. The heartbeat can be too slow, too fast, have extra beats, or otherwise be irregular [4]. Some types of cardiac arrhythmia are described below, with a brief account of their properties.

Ventricular Arrhythmias

In ventricular arrhythmias, ventricular activation does not originate from the AV node and/or does not proceed in the ventricles in a normal way. If the activation proceeds to the ventricles along the conduction system, the inner walls of the ventricles are activated almost simultaneously and the activation front proceeds mainly radially toward the outer walls. As a result, the QRS-complex is of relatively short duration. If the ventricular conduction system is broken or the ventricular activation starts far from the AV node, it takes longer for the activation front to proceed throughout the ventricular mass. The criterion for normal ventricular activation is a QRS-interval shorter than 0.1 s. A QRS-interval lasting longer than 0.1 s indicates abnormal ventricular activation [4].

Premature Ventricular Contraction (PVC)

Figure 4 [4] shows a premature ventricular contraction, which is one that occurs abnormally early. If its origin is in the atrium or in the AV node, it has a supraventricular origin. The complex produced by this supraventricular arrhythmia lasts less than 0.1 s. If the origin is in the ventricular muscle, the QRS-complex has a very abnormal form and lasts longer than 0.1 s.
Usually the P-wave is not associated with it [5].

Figure 4. Premature Ventricular.

Atrial Premature (AP)

Atrial premature complexes are also called premature atrial contractions (PACs) and may cause heart palpitations or an unusual awareness of heartbeats. Palpitations may be heartbeats that are extra fast, extra slow, or irregularly timed. PACs occur when a beat of the heart occurs early in the heart cycle, that is, prematurely (Cincinnati Children's) [6]. PACs result in a feeling that the heart has skipped a beat, or that the heartbeat has briefly paused. Sometimes PACs occur and cannot be felt. Premature beats are common and usually harmless. Rarely, PACs may indicate a serious heart condition such as life-threatening arrhythmias.

When a premature beat occurs in the upper chambers of the heart, it is known as an atrial complex or contraction. Premature beats can also occur in the lower chambers of the heart. These are known as ventricular complexes. Causes and symptoms of both types of premature beats are similar. Atrial and ventricular hypertrophies are illustrated in Figure 5 [4].

Figure 5. Atrial Premature

Bundle-Branch Block

Bundle-branch block denotes a conduction defect in either of the bundle-branches or in either fascicle of the left bundle-branch. If the two bundle-branches exhibit a block simultaneously, the progress of activation from the atria to the ventricles is completely inhibited; this is regarded as third-degree atrioventricular block. The consequence of left or right bundle-branch block is that activation of the ventricle must await initiation by the opposite ventricle. After this, activation proceeds entirely on a cell-to-cell basis. The absence of involvement of the conduction system, which initiates early activity at many sites, results in a much slower activation process than along the normal pathways. The consequence is manifest in bizarrely shaped QRS-complexes of abnormally long duration.
Right Bundle-Branch Block (RBBB)

If the right bundle-branch is defective so that the electrical impulse cannot travel through it to the right ventricle, activation reaches the right ventricle by proceeding from the left ventricle. It then travels through the septal and right ventricular muscle mass. This progress is, of course, slower than that through the conduction system and leads to a QRS-complex wider than 0.1 s. Usually the duration criterion for the QRS-complex in right bundle-branch block (RBBB), as well as in left bundle-branch block (LBBB), is >0.12 s [5]. With normal activation, the electrical forces of the right ventricle are partially concealed by the larger sources arising from the activation of the left ventricle. In RBBB, activation of the right ventricle is delayed so much that it can be seen following the activation of the left ventricle (activation of the left ventricle takes place normally). Right bundle-branch block is illustrated in Figure 6 [4].

Figure 6. Right bundle-branch block

Left Bundle-Branch Block (LBBB)

The situation in LBBB is similar, but activation proceeds in a direction opposite to that in RBBB. Again, the duration criterion for complete block is 0.12 s or more for the QRS-complex [4]. Because the activation wavefront travels in more or less the normal direction in LBBB, the signals' polarities are generally normal. However, because of the abnormal sites of initiation of the left ventricular activation front and the presence of normal right ventricular activation, the outcome is complex: the electric heart vector makes a slower and larger loop to the left, which is seen as a broad and tall R-wave, usually in leads I, aVL, V5, or V6, as illustrated in Figure 7 [4].

Figure 7. Left bundle-branch block

3.4 Fractal Features

Two of the most important properties of fractals are self-similarity and non-integer dimension.
We can explain self-similarity by looking carefully at the fern leaf shown in Figure 8 [20]: every little leaf that is part of the bigger one has the same shape as the whole fern leaf. We can say that the fern leaf is self-similar. The same holds for fractals: we can magnify them many times, and after every step we will see the same shape, which is characteristic of that particular fractal.

The non-integer dimension is more difficult to explain. Classical geometry deals with objects of integer dimensions: zero-dimensional points, one-dimensional lines and curves, two-dimensional plane figures such as squares and circles, and three-dimensional solids such as cubes and spheres, as shown in Figure 9. However, many natural phenomena are better described using a dimension between two whole numbers. So, while a straight line has a dimension of one, Figure 10 [20] demonstrates how a fractal curve will have a dimension between one and two, depending on how much space it takes up as it twists and curves. The more the flat fractal fills a plane, the closer it approaches two dimensions. Likewise, a "hilly fractal scene" will reach a dimension somewhere between two and three. So a fractal landscape made up of a large hill covered with tiny mounds would be close to the second dimension, while a rough surface composed of many medium-sized hills would be close to the third dimension [20].

Figure 8. Fern Leaf

Figure 9. Classical geometry objects

Figure 10. Fractal Curves

3.5 Explanation of Fractal Geometry and Fractal Dimension

Fractal dimension can be demonstrated by first defining a fractal set as:

$$N_n = \frac{C}{r_n^{D}} \tag{1}$$

where $N_n$ is the number of fragments with linear dimension $r_n$, $C$ is some constant, and $D$ defines the fractal dimension. If this equation is rearranged with simple algebra, the outcome is:

$$D = \frac{\ln(N_{n+1}/N_n)}{\ln(r_n/r_{n+1})} \tag{2}$$

Given a line of unit length, we can divide it in varying ways and do different things with each segment.
In Figure 11.a, the segment is divided into two parts, making $r_1 = 1/2$, where $r$ is the length of the division [17]. One of the parts is kept and the other is discarded, so $N_1 = 1$. If we divide the remaining segment into two parts and again keep only one of the fragments, then $r_2 = 1/4$ and $N_2 = 1$. If this process is repeated (iterated), $D$ turns out to be zero, which is equivalent to the Euclidean point. Regardless of the number of iterations, at order $n$, $N_n = 1$. Hence, $D$ will always be zero. This makes sense because if we take a line segment and continually divide it into two, keeping only one of the pieces, the length of the line segment will approach zero as the order approaches infinity.

Figure 11. Demonstration of fractal dimensions with Euclidean line segments.

A Euclidean line, which exists in the first dimension, can be demonstrated just as simply. This example is modeled in Figure 11.b. The line segment is again divided into two parts; however, we keep all the fragments, so we get $r_1 = 1/2$ and $N_1 = 2$. Iterating again, $r_2 = 1/4$ and $N_2 = 4$. Hence, $D = \ln 2 / \ln 2 = 1$. This also makes sense because we never remove any part of the line, so it always remains of unit length.

In the first two examples, the results are both Euclidean figures, with dimensions of zero and one, respectively. It is, however, just as easy to create a line segment with a fractal dimension between zero and one. In Figure 11.c the line has been divided into three parts and only the two end pieces are kept. After the first iteration, we get $r_1 = 1/3$ and $N_1 = 2$. When this process is repeated, we get $r_2 = 1/9$ and $N_2 = 4$. Therefore, $D = \ln 2 / \ln 3 = 0.6309$.

To show how to generate line segments with a different fractal dimension, we start with a line segment of unit length and divide it into five parts, as in Figure 11.d. By keeping only the two end pieces and the center piece, we get $r_1 = 1/5$ and $N_1 = 3$. Iterating again, we get $r_2 = 1/25$ and $N_2 = 9$. In this example, $D = \ln 3 / \ln 5 = 0.6826$. As this process is iterated, the resulting infinite set of points is called dust.
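The four line-segment constructions can be checked numerically. The short Python sketch below assumes that equation (2) has the form D = ln(N2/N1) / ln(r1/r2) and uses the first two orders of each construction as read from the discussion of Figure 11; the function and dictionary names are illustrative only.

```python
import math


def similarity_dimension(r1, n1, r2, n2):
    """D = ln(N2/N1) / ln(r1/r2) for two successive orders of a construction."""
    return math.log(n2 / n1) / math.log(r1 / r2)


# (r1, N1, r2, N2) for each construction, as described for Figure 11.
constructions = {
    "11.a (keep 1 of 2)": (1 / 2, 1, 1 / 4, 1),
    "11.b (keep 2 of 2)": (1 / 2, 2, 1 / 4, 4),
    "11.c (keep 2 of 3)": (1 / 3, 2, 1 / 9, 4),
    "11.d (keep 3 of 5)": (1 / 5, 3, 1 / 25, 9),
}

dimensions = {name: similarity_dimension(*v) for name, v in constructions.items()}
for name, d in dimensions.items():
    print(f"Figure {name}: D = {d:.4f}")
```

The same function applies unchanged to the square constructions of Figure 12, since only the counts and the scale factors enter the formula.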
Fractal dimensions are not limited to values between zero and one. Applying the same method to the Euclidean square produces objects with a fractal dimension between zero and two. In each of the following examples, each square is divided into nine squares of equal size, making $r_1 = 1/3$. The iterations continue $n$ times [17].

Figure 12.a demonstrates the Euclidean point: only one square is kept with each iteration, making $N_1 = N_2 = 1$. In Figure 12.b, we keep only the top three squares with each iteration, making $N_1 = 3$ and $N_2 = 9$. Through this process we obtain a Euclidean line with a dimension of one. The last Euclidean figure that can be derived from this example is the plane in Figure 12.c. To accomplish this, we keep all the squares with each iteration. To produce a figure with a fractal dimension, we keep only the two pieces in the upper left and lower right corners with each iteration, as in Figure 12.d, making $r_1 = 1/3$, $N_1 = 2$ and $N_2 = 4$. Hence, at the second order, $D = \ln 2 / \ln 3 = 0.6309$. On the other hand, if we remove only the center piece with each iteration, as in Figure 12.e, then we get $N_1 = 8$ and $N_2 = 64$. This example produces a fractal dimension of $D = \ln 8 / \ln 3 = 1.8928$.

Figure 12. Demonstration of fractal dimensions with Euclidean planes.

3.5.1 Calculating Fractal Dimension

For certain objects that we have dealt with all our lives, such as squares, lines, and cubes, it is easy to assign a dimension. We intuitively feel that a square has two dimensions, a line has one dimension, and a cube has three dimensions. We might feel this way because there are two directions in which we can move on a square, one direction on a line, and three directions in a cube. On a fractal, however, the number of directions in which we can move is not the same everywhere: in some places we can move in a certain number of directions, and in others in a different number. This is what causes fractal dimensions to be non-integers.
To derive a formula for calculating fractal dimensions that works for all figures, let us first look at how to calculate the dimensions of the figures we already know. A line can be divided into $n = n^1$ separate pieces. Each of those pieces is $1/n$ the size of the whole line, and each piece, if magnified $n$ times, would look exactly the same as the original. Repeating the process for a square, we find that it can be divided into $n^2$ pieces. The same concept holds true for a cube: we need $n^3$ pieces to reassemble a cube. Each of the pieces would be $1/n$ the linear size of the whole figure. The exponent in each of these examples is the dimension. For fractals, we need a generalized formula, which can be derived from what we already know. Because of the form this formula takes (a ratio of two logarithms), it is independent of the base used for the logarithms.

For a line: $N = n^1$
For a square: $N = n^2$
For a cube: $N = n^3$

If we look back at Figures 11 and 12, they were divided into pieces that, when zoomed in on $n$ times, reproduced the starting figure. Because of this, we divide the natural logarithm of the number of divisions by the natural logarithm of the magnification factor. The resulting formula gives the dimension, represented by $D$ [17]:

$$D = \frac{\ln(\text{number of divisions})}{\ln(\text{magnification factor})} \tag{3}$$

For a line: $D = \ln(n^1)/\ln(n) = 1$
For a square: $D = \ln(n^2)/\ln(n) = 2$
For a cube: $D = \ln(n^3)/\ln(n) = 3$

Each of these examples was easy because the magnification factor was always $n$. For fractals, the magnification factor is a constant that varies from one fractal to another.

3.5.2 Examples of Deterministic Fractals and Their Applications

The Koch Curve

All the previous examples dealt with removing pieces from various geometric figures. Fractals and fractal dimensions can also be defined by adding onto geometric figures. The Koch curve is named after Helge von Koch, who described it in 1904. The generation of this fractal is simple. We begin with a straight line of unit length and divide it into three equally sized parts. The middle section is replaced with an equilateral triangle and its base is removed.
After one iteration, the length has increased by a factor of four-thirds. As this process is repeated, the length of the figure tends to infinity as the length of the side of each new triangle goes to zero. Assuming this could be iterated an infinite number of times, the result would be as in Figure 13 [20], which is infinitely wiggly, having no straight lines whatsoever. Fractals of this type, which are constructed by humans from a fixed rule, are called deterministic fractals [17].

Figure 13. The Koch Curve

To calculate the dimension of the Koch curve, we look at the image of the fractal and note that it has a magnification factor of three and that, with each iteration, it is divided into four smaller pieces. Knowing this, we get:

D = ln(4) / ln(3)
D = 1.3863 / 1.0986
D = 1.2619

The Koch curve has a dimension of 1.2619.

The Sierpinski Triangle

The Sierpinski triangle in Figure 14 [20] is created by infinite removals. Each triangle is divided into four smaller triangles, and the central, upside-down triangle is removed. As this process is iterated an infinite number of times, the total area of the set tends to zero as the size of each new triangle goes to zero [18].

Figure 14. The Sierpinski Triangle

On closer examination of the process used to generate the Sierpinski triangle and of the image it produces, we see that the magnification factor is two. With each magnification, there are three divisions of the triangle. With this data, we get:

D = ln(3) / ln(2)
D = 1.0986 / 0.6931
D = 1.5850

The Sierpinski triangle has a dimension of 1.5850 [17].

3.5.3 Fractals and Fractal Geometry Applications

Fractal geometry has permeated many areas of science, such as astrophysics and the biological sciences, and has become one of the most important techniques in computer graphics.

Fractals in astrophysics

Astrophysicists believe that the key to understanding how stars are formed, and how they ultimately found their home in the Universe, is the fractal nature of interstellar gas.
Fractal distributions are hierarchical, like smoke trails or billowy clouds in the sky [20]. Turbulence shapes both the clouds in the sky and the clouds in space, giving them an irregular but repetitive pattern that is impossible to describe without the help of fractal geometry.

Fractals in the Biological Sciences

Biologists have traditionally modeled nature using Euclidean representations of natural objects or series. They represented heartbeats as sine waves, conifer trees as cones, animal habitats as simple areas, and cell membranes as curves or simple surfaces. However, scientists have come to recognize that many natural constructs are better characterized using fractal geometry. Scientists discovered that the basic architecture of a chromosome is tree-like; every chromosome consists of many "mini-chromosomes" and can therefore be treated as a fractal [18]. For a human chromosome, for example, the fractal dimension D equals 2.34 (between the dimension of the plane and that of space). Self-similarity has also been found in DNA sequences. In the opinion of some biologists, the fractal properties of DNA can be used to resolve evolutionary relationships in animals.

The human body is also governed by fractal rhythms, which are recorded as ECG signals. ECG signals, oscillating at the borderline between chaos and order, have such a fractal nature. If the beat is too periodic, heart failure might be the result, but a heart attack might occur when it is too aperiodic.

Another characteristic of fractals is encountered in the fibrillating heart. In the normal heart, an electrical signal is sent in a regulated wave through the entire three-dimensional structure, causing each cell to contract and then relax. In the fibrillating heart this wave is somehow broken up, so that the organ is never entirely relaxed or entirely contracted at any one moment. This uncoordinated wave can cause the blockage of arteries and can eventually lead to the death of the contracting organ [21].
Fractals in computer graphics

The biggest use of fractals in everyday life is in computer science. Many image compression schemes use fractal algorithms to compress computer graphics files to less than a quarter of their original size. Computer graphic artists use many fractal forms to create textured landscapes and other intricate models. Fractal signals can also be used to model natural objects, allowing us to describe our environment mathematically with higher accuracy than ever before, as we will see in the analysis of ECG biomedical signals in this thesis.

3.6 Fractal Features Extraction From ECG Signals

The fractal dimension (FD) is a descriptive measure that has proven useful in quantifying the complexity or self-similarity of biomedical signals. Such analysis of the complexity of biomedical signals helps us to study the physiological processes underlying the systems. The FD can be used to study the dynamics of transitions between different states of systems such as the heart, under various physiological and pathological conditions [23]. Since the ECG signal of a human heart is a self-similar object, it has a fractal dimension that can be extracted using mathematical methods to help identify and distinguish specific pathological conditions of the heart.

Several methods have been proposed in the literature to estimate the FD of signals or time series data, in either the time or the frequency domain. Analysis in the time domain processes the signal data directly, while analysis in the frequency domain requires a Fourier or wavelet transform of the signal [25]. This section investigates time domain methods, based on fractal geometry, for computing FD values from ECG time series signals in order to extract their main features.

3.6.1 Time Domain Methods of Estimating FD

Herein, the fractal complexity of a signal is characterized in real time by computing its FD using each of Katz's method, Higuchi's method and Hurst's method.

A. Katz's method

The FD of a signal curve, based on Katz's method [24], can be defined as:

$$FD = \frac{\log(L)}{\log(d)} \tag{4}$$

where $L$ is the total length of the signal curve, that is, the sum of the distances between successive points, and $d$ is the diameter, estimated as the distance between the first data point and the data point farthest from it. With $\mathrm{dist}(i, j)$ denoting the distance between points $i$ and $j$ of the curve, $d$ and $L$ can be expressed mathematically as:

$$d = \max_{i}\,\mathrm{dist}(1, i) \tag{5}$$

$$L = \sum_{i=1}^{N-1} \mathrm{dist}(i, i+1) \tag{6}$$

Normalizing the distances in (4) by the average distance between successive points, say $a$, gives:

$$FD = \frac{\log(L/a)}{\log(d/a)} \tag{7}$$

Defining $n$ as the number of steps in the signal curve, which is one less than the number of points $N$, we have $n = L/a$. Substituting $n$ in (7), the FD according to Katz's approach is expressed as:

$$FD = \frac{\log(n)}{\log(n) + \log(d/L)} \tag{8}$$

B. Higuchi's method

Higuchi proposed an efficient algorithm to calculate the FD directly from a time series [28]. Assume a one-dimensional time series $X = \{X(1), X(2), X(3), \dots, X(N)\}$, where $N$ is the total number of samples; in our case the series $X$ consists of the successive values of the ECG signal. Higuchi's algorithm constructs $k$ new time series as:

$$X_k^m = \{X(m),\, X(m+k),\, X(m+2k),\, \dots,\, X(m + Mk)\}, \quad m = 1, 2, \dots, k \tag{9}$$

where $k$ and $m$ are integers representing the time interval between points and the initial time value, respectively, and $M = \lfloor (N-m)/k \rfloor$. For each new time series constructed, the length is computed as:

$$L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{M} \left|X(m+ik) - X(m+(i-1)k)\right|\right)\frac{N-1}{Mk}\right] \tag{10}$$

where $(N-1)/(Mk)$ is a normalization factor for the curve length of $X_k^m$. The length of the series $L(k)$ for the time interval $k$ is computed as the mean of the $k$ values $L_m(k)$, for $m = 1, 2, \dots, k$:

$$L(k) = \frac{1}{k}\sum_{m=1}^{k} L_m(k) \tag{11}$$

If $L(k)$ is proportional to $k^{-FD}$, then the curve describing the shape of the ECG time series is fractal-like with dimension FD. In this case, if $\ln(L(k))$ is plotted against $\ln(1/k)$ for $k = 1, 2, 3, \dots, k_{\max}$, the points fall on a straight line with a slope equal to FD. The fractal dimension of the ECG signal is calculated with the above method using both adaptive and fixed windowing.

C. Rescaled Range (R/S) Method

Hurst developed the R/S method, a statistical technique for analyzing a large number of natural phenomena [19].
The R/S method is one of the oldest and best-known methods for estimating the Hurst parameter $H$. Let $\{X_k,\ k = 1, 2, 3, \dots, N\}$ be a set of $N$ sample points of an ECG recording. The mean $\bar{X}(N)$ and the standard deviation $S(N)$ of these points are, respectively,

$$\bar{X}(N) = \frac{1}{N}\sum_{k=1}^{N} X_k \quad \text{and} \quad S(N) = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(X_k - \bar{X}(N)\right)^2}$$

The R/S statistic, or rescaled adjusted range, is defined by the ratio:

$$\frac{R(N)}{S(N)} = \frac{\max(0, W_1, W_2, \dots, W_N) - \min(0, W_1, W_2, \dots, W_N)}{S(N)} \tag{12}$$

where

$$W_k = (X_1 + X_2 + X_3 + \dots + X_k) - k\,\bar{X}(N), \quad k = 1, 2, 3, \dots, N \tag{13}$$

Hurst found empirically that many time series observed in nature are well represented by the relation

$$\frac{R(N)}{S(N)} \approx C\,N^{H} \tag{14}$$

where $C$ is a finite positive constant. By taking logarithms we obtain:

$$\log\!\left(\frac{R(N)}{S(N)}\right) \approx \log(C) + H \log(N) \tag{15}$$

Therefore, the slope of a plot of log(R/S) against log(N) provides the Hurst parameter $H$ [27]. The relation between the Hurst exponent and the fractal dimension is simply FD = 2 - H, so the fractal dimension can easily be evaluated from the rescaled range analysis with the help of these equations.

Chapter 4
ECG Feature Extracting using PSM and Classification with SVM

4.1 Introduction

As seen in the previous chapter, the Hurst parameter (dimension) $H$ measures the self-affinity of a time series in the time domain. Herein, I describe this feature by processing the time series in the frequency domain, where I assume that the power spectrum of the signal is dominated by a Random Scaling Fractal (RSF) model $P(f) = c/f^{2q}$, where $c > 0$. An automatic ECG arrhythmia diagnosis method based on SVM is then proposed. It relies on estimating the Fractal Dimension (FD) of ECG recordings with the Power Spectrum Method (PSM), which makes use of the characteristic Power Spectral Density Function (PSDF) of a random scaling fractal signal. Thirty-one datasets of ECG signals taken from the MIT-BIH arrhythmia database [16] have been used to estimate the FD.
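For comparison with the frequency-domain approach introduced in this chapter, the time-domain R/S estimate of H from Section 3.6.1 can be sketched in Python. This is only an illustration under stated assumptions: the slope is fitted over block sizes that double from a hypothetical minimum of 8 samples, R/S is averaged over non-overlapping blocks (a common variant, not necessarily the one used in this thesis), and the input is synthetic white noise rather than an ECG recording.

```python
import math
import random


def rescaled_range(block):
    """R/S of one block: range of the mean-adjusted partial sums W_k over the std."""
    n = len(block)
    mean = sum(block) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in block) / n)
    if std == 0.0:
        return None
    partial, w = 0.0, [0.0]          # the leading 0 realises max(0, ...) and min(0, ...)
    for v in block:
        partial += v - mean
        w.append(partial)
    return (max(w) - min(w)) / std


def hurst_rs(x, min_block=8):
    """Slope of log(R/S) against log(block size), by ordinary least squares."""
    log_n, log_rs = [], []
    size = min_block
    while size <= len(x) // 2:
        values = [rescaled_range(x[i:i + size])
                  for i in range(0, len(x) - size + 1, size)]
        values = [v for v in values if v]
        if values:
            log_n.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    mx = sum(log_n) / len(log_n)
    my = sum(log_rs) / len(log_rs)
    num = sum((a - mx) * (b - my) for a, b in zip(log_n, log_rs))
    den = sum((a - mx) ** 2 for a in log_n)
    return num / den


rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]   # synthetic stand-in signal
h = hurst_rs(noise)
fd = 2.0 - h                                          # FD = 2 - H
```

For uncorrelated noise the estimate is expected to lie near 0.5, although R/S is known to be biased upward for short blocks.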
In the following section I introduce a power spectrum method (PSM), based on frequency analysis, with which I try to capture the fractal behavior of ECG signals using the RSF model.

4.2 Feature Extraction using Power Spectrum Method (PSM)

Fractals are applicable when the underlying process being modeled mathematically has a similar appearance regardless of the scale over which it is observed. It turns out that many natural signals can be modeled using fractals. Many signals observed in nature are random fractals, including biomedical signals such as the ECG time series. Random Scaling Fractal (RSF) signals are signals whose probability distribution function (PDF) has the same "shape" irrespective of the scale over which they are observed. Accordingly, random fractal signals are statistically self-similar (self-affine); that is, they are self-similar in a statistical sense [24]. The ECG time series exhibits the features of self-affinity, so it can be considered an example of an RSF signal.

RSF signals are characterized by power spectra whose frequency distribution is proportional to $1/\omega^{2q}$, where $\omega$ is the frequency and $q > 0$ is the "Fourier Dimension", a value that is simply related to the Fractal Dimension $D$ and the Hurst (dimension) parameter $H$ by the relation $q = H + 1/2 = (5 - 2D)/2$. This power law describes the conventional RSF models, which are based on stationary processes in which the "statistics" of the RSF signals are invariant in time and the value of $q$ is constant.

Assume that $X(t)$, in the time domain, is a time series of an ECG signal that is taken to be self-affine. Figure 15 and Figure 16 each plot 1024 points of a normal and an abnormal ECG signal, respectively, together with a similar-looking zoomed-in version of 512 points. The power spectrum of such a signal can be written as $P(\omega) = |X(\omega)|^2$, where $X(\omega)$ is the Fast Fourier Transform (FFT) of the time series in the frequency domain (i.e. $X(\omega) = \mathcal{F}_t(X(t))$).
For such a time series the power spectrum $P(\omega)$ obeys the RSF model

$$P(\omega) = \frac{c}{\omega^{2q}} \tag{16}$$

Figure 17 and Figure 18 show examples of the measured power spectrum of a normal and an abnormal ECG signal, respectively, for different window sizes. These figures provide evidence that the power spectrum of ECG time series signals obeys the RSF model. The behavior of ECG signals can be characterized by estimating the parameter $q$ in the proposed model, where the estimated value of this parameter reflects the degree of self-similarity (fractality) in the ECG signal. To do this, the least squares technique is applied to the measurements of ECG signals as follows.

Let $X_1, X_2, X_3, \dots, X_N$ ($N$ being a power of 2) be sample points of an ECG signal, and consider the case in which the digital power spectrum $P_i = P(\omega_i)$ is obtained by applying an FFT to this time series. This data can be approximated by:

$$\hat{P}_i = \frac{c}{\omega_i^{2q}} \tag{17}$$

or

$$\ln \hat{P}_i = C - 2q \ln \omega_i \tag{18}$$

If we consider the error function

$$e(q, C) = \sum_{i} \left(\ln P_i - \ln \hat{P}_i\right)^2 \tag{19}$$

where $C = \ln c$, and it is assumed that the frequency $\omega_i > 0$ and the measured power spectrum $P_i > 0$ for all $i$, then the solution of the equations $\partial e/\partial q = 0$ and $\partial e/\partial C = 0$ (least squares method) gives:

$$q = \frac{1}{2}\,\frac{N\sum_i (\ln P_i)(\ln \omega_i) - \left(\sum_i \ln \omega_i\right)\left(\sum_i \ln P_i\right)}{\left(\sum_i \ln \omega_i\right)^2 - N\sum_i (\ln \omega_i)^2} \tag{20}$$

and

$$C = \frac{1}{N}\sum_i \ln P_i + \frac{2q}{N}\sum_i \ln \omega_i \tag{21}$$

where the sums run over the $N$ points used in the fit. Since the power spectrum of a real signal of size $N$ is symmetric about the DC level, where the DC level is taken to be the midpoint $N/2 + 1$ of the array, in practice only the data that lie to the right of DC are used [24].

Figure 15. (left) Normal Sinus Rhythm ECG signal of size 1024, (right) Zoom in version of Normal Sinus Rhythm ECG signal of size 512

Figure 16. (left) Atrial Premature Arrhythmia ECG signal of size 1024, (right) Zoom in version of Atrial Premature Arrhythmia ECG signal of size 512

Figure 17. Measured power spectrum of Normal Sinus Rhythm ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128

Figure 18. Measured power spectrum of Atrial Premature ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128

4.2.1 PSM Methodology Algorithm

The ECG time series exhibits the features of self-affinity, so it can be considered an example of an RSF signal. To estimate the fractal parameter of this series, I converted it to the frequency domain, in which I assumed that the empirical power spectrum of each series has an envelope Power Spectrum Density Function (PSDF) given by the RSF model $P(\omega) = c/\omega^{2q}$. Using a moving window technique, I chose a window of size $N$ to move over the points of the time series to be analyzed. For each window segment I applied the PSM to estimate the Fourier Dimension $q$, after normalizing the segment and transforming it to the spectral domain. The following algorithm summarizes the steps of the methodology, which is shown as a block diagram in Figure 19.

Step 1: Use a window of size N = 512 over the points of a given ECG time series to extract a signal array of points, say $X_i$, $i = 1, 2, \dots, N$. This process is applied to 38720 cardiac beats for 31 persons.
Step 2: Normalize the signal obtained in Step 1 to give $Y$.
Step 3: Compute the Discrete Fourier Transform (DFT) of $Y$ using a Fast Fourier Transform (FFT) with spectral shifting (fftshift) to yield $Z$.
Step 4: Compute the empirical power spectrum $P = |Z|^2$.
Step 5: Extract the right half of the computed power spectrum.
Step 6: Compute the parameter $q$ using the computational formula of the PSM given in equation (20).
Step 7: Iterate Step 1 through Step 6 until the end of the time series.
Step 8: Compute the Fractal Dimension $D$, where $D = (5 - 2q)/2$.

[Block diagram: ECG Time Series -> X -> Y -> FFT -> fftshift -> Z -> Power Spectrum -> P -> PSM -> Estimated Value of q]

Figure 19. Estimate the Fourier Dimension q of ECG Signal.
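Steps 1 to 8 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in this work, and it rests on several assumptions that are not stated in the text: the power-law exponent is taken as 2q (the convention implied by q = H + 1/2), normalization is peak scaling, windows do not overlap, the frequency index k stands in for the frequency (which only shifts the constant C), and a naive DFT replaces the FFT so that no external library is needed. The test signals are synthetic noise, not ECG data.

```python
import cmath
import math
import random


def half_power_spectrum(y):
    """(k, |DFT_k|^2) for the positive frequencies k = 1 .. N/2 - 1 (right of DC)."""
    n = len(y)
    out = []
    for k in range(1, n // 2):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(y))
        out.append((k, abs(coeff) ** 2))
    return out


def estimate_q(window):
    """Least-squares fit of ln P = C - 2q ln(k) on one window; returns q."""
    peak = max(abs(v) for v in window) or 1.0
    y = [v / peak for v in window]            # ASSUMED normalization: peak scaling
    pts = [(math.log(k), math.log(p)) for k, p in half_power_spectrum(y) if p > 0]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(p for _, p in pts)
    sxy = sum(x * p for x, p in pts)
    sxx = sum(x * x for x, _ in pts)
    two_q = (n * sxy - sx * sy) / (sx * sx - n * sxx)
    return two_q / 2.0


def fractal_dimension(q):
    """D = (5 - 2q) / 2, from q = (5 - 2D) / 2."""
    return (5.0 - 2.0 * q) / 2.0


def psm_features(series, window=512):
    """(q, D) for each non-overlapping window (ASSUMED step equal to the window size)."""
    feats = []
    for start in range(0, len(series) - window + 1, window):
        q = estimate_q(series[start:start + window])
        feats.append((q, fractal_dimension(q)))
    return feats


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(256)]     # flat spectrum, q near 0
walk, level = [], 0.0
for step in noise:                                     # random walk, spectrum ~ 1/k^2
    level += step
    walk.append(level)

q_noise = estimate_q(noise)
q_walk = estimate_q(walk)
features = psm_features(walk, window=128)
```

On these synthetic inputs the fit should separate the flat-spectrum noise (q near 0) from the random walk (q near 1), which is the kind of contrast the method relies on when comparing ECG classes.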
4.3 Arrhythmia Classification based on (SVM)

Finding arrhythmia characteristics corresponding to Premature Ventricular Contraction (PVC), Atrial Premature (AP), and Bundle-Branch Block (BBB) [22] in ECG recordings has received considerable attention in recent years. Differences between normal and abnormal ECG signals cannot be determined easily, especially by the human eye. Developing an intelligent method for the identification of such cardiac diseases is very helpful in the biomedical field: it can aid clinical staff in the absence of a doctor, and it can help a doctor to diagnose and act faster in emergency conditions. The Support Vector Machine (SVM) has the advantages of accurate classification, a simple architecture, less overfitting, and robustness to noise. As will be seen in Chapter 5 of this thesis, the Fractal Dimension can quantitatively describe the non-linear behavior of signals, so it can be used as a feature for diagnosing ECG arrhythmia.

The support vector machine usually deals with pattern classification, that is, with classifying different types of patterns. Patterns may be linear or non-linear. Linear patterns are patterns that are easily distinguishable, or that can easily be separated in a low dimension, whereas non-linear patterns are patterns that are not easily distinguishable or cannot easily be separated, and hence need to be manipulated further so that they can be separated. Figure 20 [28] shows the main idea of SVM: the construction of an optimal hyperplane for classifying linearly separable patterns, one that maximizes the margin of the hyperplane, i.e. the distance from the hyperplane to the nearest point of each pattern. The main objective of SVM is to maximize the margin so that it can classify the given patterns correctly: the larger the margin, the more reliably the patterns are classified.
The equation below is the hyperplane representation:

$$aX + bY = C \tag{22}$$

Figure 21 [28] shows the basic idea of the hyperplane in three dimensions when it is used to separate two different patterns. Basically, this plane comprises three lines that separate two different patterns in 3-D space: the marginal line and two other lines, one on either side of the marginal line, where the support vectors are located.

Figure 20. SVM Model

Figure 21. Hyper Plane

For patterns that are not linearly separable, the given patterns are mapped into a new space, usually of higher dimension, so that they become linearly separable. This is achieved by using a kernel function $\varphi(x)$, i.e. $x \rightarrow \varphi(x)$. Selecting the kernel function is an important aspect of SVM-based classification. Commonly used kernel functions include Linear, Polynomial (Poly) and Radial Basis Function (RBF). Different kernel functions create different mappings and hence different non-linear separation surfaces. The problem of finding the optimal classifier thus translates into solving a quadratic programming problem. The aim is to find a separating hyperplane that makes the gap between the two classes, $2/\|w\|$, as large as possible, where $w$ is the weight vector; in other words, we have to maximize the width of the margin. It is expressed as:

$$\min\ \Phi(w) = \tfrac{1}{2}(w, w) \quad \text{such that} \quad y_i\left[(w, x_i) + b\right] \ge 1,\ \ i = 1, 2, \dots, n \tag{23}$$

where $x_i$ are the training patterns, $y_i \in \{-1, +1\}$ are their class labels, and $b$ is the bias of the hyperplane.

4.3.1 Multiclass SVM

The SVM technique was originally proposed for binary classification. However, the classification of ECG signals often involves the simultaneous discrimination of several information classes. To address this issue, a number of multiclass classification strategies can be adopted [28], [29]. The most popular methods based on combining binary SVMs are the one-against-all (OAA) and the one-against-one (OAO) strategies. The former involves a reduced number of binary decompositions (and thus of SVMs), which are, however, more complex.
The latter requires a shorter training time, but may incur conflicts between classes due to the nature of the score function used for the decision. Both strategies generally lead to similar results in terms of classification accuracy. In this thesis, I considered the OAA strategy. Briefly, this strategy is based on the following procedure. Let $\Omega = \{\omega_1, \omega_2, \dots, \omega_T\}$ be the set of T possible labels (information classes) associated with the ECG beats that we wish to classify. First, a group of T SVM classifiers is trained. Each classifier solves a binary classification problem defined by the discrimination between one information class $\omega_i$ (i = 1, 2, ..., T) and all the others (i.e., $\Omega - \{\omega_i\}$). Then, in the classification phase, the "winner-takes-all" rule is used to decide which label to assign to each beat: the winning class is the one whose SVM classifier shows the highest output (discriminant function value).

4.3.2 Application of OAA SVM Using Fractal Features in ECG Arrhythmia Diagnosis

Using an OAA SVM based on fractal features to diagnose ECG arrhythmia involves extracting the ECG fractal features with the Power Spectrum Method (PSM), forming the training vector, establishing the OAA SVMs, training them, and diagnosing the arrhythmia. Figure 22 gives the block diagram of the proposed method, which can be summarized as follows:

Step 1. Extracting features: The fractal dimension reflects the fractal character of ECG signals, and different types of ECG signals yield different FD values when PSM is applied to them. In this step, ECG arrhythmia is deduced from the different ranges of FD for healthy and non-healthy persons. PSM is applied to 38720 cardiac beats for 31 persons. The selected beats are sampled by applying a windowing technique with a window size of 512, and the fractal dimension feature is extracted. This process is repeated for all 31 persons.

Step 2. Establishing and training the multi-class OAA SVM networks based on the arrhythmia classes: In this step, the SVM classifier is built as follows:

1. Load the dataset as a vector of points, which represents a total of 122 FD points estimated by PSM in Step 1 for the 31 persons.
2. Create a two-column matrix containing the FD vector as the first column and the labels corresponding to the FD values estimated in Step 1 as the second column, i.e.: Normal Sinus Rhythm: 1, Ventricular Premature Arrhythmia: 2, Atrial Premature: 3, Right bundle-branch block: 4, Left bundle-branch block: 5.
3. Create a new column vector, groups, that classifies the data into only two groups by applying the OAA strategy.
4. Create a 5-fold cross-validation that randomly selects training and testing points from the groups to feed the SVM model, i.e.: indices = crossvalind('Kfold', groups, 5); for i = 1:5, test = (indices == i); train = ~test; end
5. Use the Matlab built-in function svmclassify to classify the test set vector, using the Linear, Poly, and RBF kernel functions.
6. Evaluate the performance of the classifier.
7. Repeat steps 3-6 after exchanging the two groups, so that all disease labels are covered.

Step 3. Diagnosing arrhythmia with the trained OAA SVM: In this step, the trained OAA SVM is used to diagnose an unknown arrhythmia. Fractal features are first extracted from the testing samples using PSM. They are then passed to the trained multi-class OAA SVM classifier, and the diagnosis results are obtained.

Figure 22. SVM method using fractal features for ECG arrhythmia diagnosis. (Block diagram: testing samples of unknown arrhythmia ECG to be diagnosed; extracting fractal features using PSM; SVM trainer; SVM classification model; SVM classifier; diagnosis results.)
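The two OAA-specific operations in the procedure above, relabelling the data into two groups and applying the winner-takes-all rule, can be sketched as follows. This is a minimal Python illustration, not the thesis implementation: the function names are assumptions, and the discriminant values are made-up numbers standing in for the outputs of the five trained binary SVMs.

```python
def one_against_all_groups(labels, target):
    """Relabel a multi-class vector as 1 (target class) or 2 (all other classes)."""
    return [1 if label == target else 2 for label in labels]

def winner_takes_all(scores):
    """Return the class whose binary classifier gives the highest output."""
    return max(scores, key=scores.get)

# Class codes as listed in Step 2: 1 = N, 2 = PVC, 3 = AP, 4 = RBBB, 5 = LBBB.
labels = [1, 1, 2, 3, 4, 5, 2]
groups_for_pvc = one_against_all_groups(labels, target=2)

# Hypothetical discriminant values for one beat, one per binary classifier;
# real values would come from the trained SVM decision functions.
scores = {1: -1.3, 2: 0.4, 3: -0.2, 4: 0.9, 5: -0.7}
predicted = winner_takes_all(scores)
print(groups_for_pvc, predicted)
```

With T = 5 classes the relabelling is repeated five times, once per target class, which gives the five binary problems on which the individual SVMs are trained.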
Chapter 5
Experimental Evaluation and Discussion

5.1 Dataset Description

In this work, the fractal dimension (FD) of 31 ECG datasets was determined in the time domain and the frequency domain, and ranges of FD were then established for a healthy person and for persons with various heart diseases. The ECG signals for the present study were obtained from the MIT-BIH database via the PhysioNet website [16]. The MIT-BIH database contains both normal and abnormal types of ECG signals. In this study, the considered beats belong to the following classes: Normal Sinus Rhythm (N), Premature Ventricular Contraction (PVC), Atrial Premature (AP), Right Bundle-Branch Block (RBBB), and Left Bundle-Branch Block (LBBB). The beats were selected from the recordings of 31 persons, which correspond to the following files:

Normal Sinus Rhythm (N): 17052m, 16420m, 19088m, 19093m, 16265m, 16483m, 16273m, 16549m, 16539m, 16795m, 17453m, 18177m, 18184m, 19090m, 19830m, 16786m, 16277m, 16792m, and 16272m.
Premature Ventricular Contraction (PVC): 200m, 208m, and 215m.
Atrial Premature (AP): 100m, 209m, and 223m.
Right Bundle-Branch Block (RBBB): 124m, 231m, and 232m.
Left Bundle-Branch Block (LBBB): 214m, 109m, and 217m.

The properties of these signals are described in Table 1. Figures 23 to 27 show plots of two datasets for each type of signal.

Table 1. Description of the used dataset

Signal Type                        No. of samples/signal   Sampling frequency   Sample interval
Ventricular Premature Arrhythmia   3600                    360 Hz               0.6500000 sec
Atrial Premature                   3600                    360 Hz               0.6500000 sec
Right bundle-branch block          3600                    360 Hz               0.6500000 sec
Left bundle-branch block           3600                    360 Hz               0.6500000 sec
Normal Sinus Rhythm                1280                    128 Hz               0.1132740 sec

Figure 23. (A), (B) ECG signals of Normal Rhythm
Figure 24. (A), (B) ECG signals of Premature Ventricular Arrhythmia
Figure 25. (A), (B) ECG signals of Atrial Premature Arrhythmia
Figure 26. (A), (B) ECG signals of Right Bundle-Branch Block Arrhythmia
Figure 27. (A), (B) ECG signals of Left Bundle-Branch Block Arrhythmia

5.2 Experimental Results

As the block diagram of the proposed method shows (Figure 22 in the previous chapter), the classification process must be fed with fractal features extracted from the ECG signals. To do so, I used 31 datasets composed of ECG signals recorded from healthy subjects and from patients with heart arrhythmia. I performed the experiments using Matlab 7 on ECG datasets from the MIT-BIH arrhythmia database [16].

5.2.1 Fractal Features Extraction

The FD feature of each class of ECG time series signal was extracted using a non-overlapping window of 512 points by means of the methods presented in section 5.2.1.1, chapter 5 of this thesis. Table 2 shows the FD estimates for the Normal heart rhythm signals. They support the view that the healthy heart is a fractal heart, since the Higuchi, Hurst, and PSM estimates all lie between 1 and 2 (several Katz estimates exceed 2). Tables 3-6 show the FD estimates for the pathological signals, and Tables 7-10 show the intervals (lower and upper bound) of the estimated FD for each specific disease under each estimation method. Comparing the intervals in Tables 7-10 shows that only PSM clearly distinguishes between healthy and non-healthy persons, because it places each signal type in a distinct, non-overlapping FD range.

Table 11 shows the average FD value for each ECG signal type under each estimation method. For PSM, the average FD value for Normal Sinus Rhythm is 1.522589. For the arrhythmias Left Bundle Branch Block, Right Bundle Branch Block, Atrial Premature, and Ventricular Premature Arrhythmia the values are lower: 0.742733, 0.438833, 0.249733, and 0.082267 respectively.
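The range comparison described above can be checked mechanically. The sketch below, written in Python as an illustration only, copies the PSM ranges from Table 7 and the Katz ranges from Table 8, looks up which signal type a given PSM FD value falls into, and tests whether a set of ranges overlaps. The function names are assumptions, and the lookup is not the SVM classifier used in this thesis.

```python
# FD ranges copied from Table 7 (PSM) and Table 8 (Katz) of this chapter.
PSM_FD_RANGES = {
    "Ventricular Premature Arrhythmia": (0.0349, 0.1345),
    "Atrial Premature": (0.2266, 0.2790),
    "Right Bundle Branch Block": (0.3902, 0.5159),
    "Left Bundle Branch Block": (0.6954, 0.8321),
    "Normal Sinus Rhythm": (1.1189, 1.9838),
}
KATZ_FD_RANGES = {
    "Ventricular Premature Arrhythmia": (1.6248, 1.8216),
    "Atrial Premature": (1.6217, 2.2321),
    "Right Bundle Branch Block": (1.7198, 1.8760),
    "Left Bundle Branch Block": (1.6644, 1.7174),
    "Normal Sinus Rhythm": (1.6402, 2.4422),
}

def signal_type_for_fd(fd, ranges=PSM_FD_RANGES):
    """Return the signal type whose range contains fd, or None if no range does."""
    for name, (low, high) in ranges.items():
        if low <= fd <= high:
            return name
    return None

def ranges_overlap(ranges):
    """True if any two ranges share at least one value."""
    spans = sorted(ranges.values())
    # After sorting by lower bound, any overlap shows up between neighbours.
    return any(prev[1] >= nxt[0] for prev, nxt in zip(spans, spans[1:]))

print(signal_type_for_fd(0.25), ranges_overlap(PSM_FD_RANGES), ranges_overlap(KATZ_FD_RANGES))
```

Values that fall in the gaps between the PSM ranges (for example 0.6) match no signal type in this lookup, which is one reason a trained classifier is used instead of fixed thresholds.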
The average FD value thus decreases in the arrhythmic recordings, and this decrease indicates a decrease in the heterogeneity of the cardiac recording [23]. Moreover, if we compare the average FD values of normal and abnormal heart rhythms obtained by each of the time-domain methods (i.e. Katz's, Higuchi's, and Hurst's methods) with those obtained by the frequency-domain method (i.e. PSM), it is clear that PSM distinguishes the normal condition from the pathological one more clearly than the other methods. PSM can therefore provide a significant clinical advantage: it can readily be incorporated 'on line' to indicate (and possibly help control) the onset of a pathological condition, which is signalled by a drop in the FD value.

Table 2. The Estimated FD Values for Normal Sinus Rhythm Signals

No.   Dataset (1280 beats)   Katz     Higuchi's   Hurst's   PSM
1.    17052m                 2.1743   1.6051      1.3384    1.1819
2.    16420m                 2.1372   1.5768      1.3452    1.4657
3.    19088m                 1.7247   1.1140      1.3570    1.9838
4.    19093m                 1.6402   1.0492      1.3405    1.9392
5.    16265m                 2.4422   1.7299      1.3377    1.7882
6.    16483m                 2.1783   1.3252      1.3949    1.2708
7.    16273m                 2.2544   1.4789      1.3648    1.4580
8.    16549m                 2.1744   1.2713      1.3558    1.4893
9.    16539m                 2.0761   1.5325      1.3467    1.6843
10.   16795m                 2.0527   1.3285      1.3360    1.4527
11.   17453m                 2.1010   1.4149      1.4081    1.8547
12.   18177m                 2.1240   1.3793      1.1263    1.9123
13.   18184m                 2.2229   1.4506      1.3885    1.6968
14.   19090m                 1.7677   1.1015      1.3374    1.1478
15.   19830m                 1.6790   1.0295      1.3479    1.5554
16.   16786m                 1.9652   1.5126      1.3634    1.1189
17.   16277m                 1.8652   1.3061      1.3277    1.1617
18.   16792m                 2.3240   1.5122      1.1872    1.1955
19.   16272m                 1.9763   1.5072      1.1942    1.5722

Table 3. The Estimated FD Values for Ventricular Premature Arrhythmia Signals

No.   Dataset (3600 beats)   Katz     Higuchi's   Hurst's   PSM
1.    200m                   1.6248   1.1312      1.2818    0.0774
2.    208m                   1.6607   1.1616      1.2415    0.0349
3.    215m                   1.8216   1.4922      1.0592    0.1345

Table 4. The Estimated FD Values for Atrial Premature Arrhythmia Signals

No.   Dataset (3600 beats)   Katz     Higuchi's   Hurst's   PSM
1.    100m                   1.8583   1.3014      1.2676    0.2436
2.    209m                   2.2321   1.4429      1.0869    0.2266
3.    223m                   1.6217   1.1095      1.3258    0.2790

Table 5. The Estimated FD Values for Right Bundle Branch Block Arrhythmia Signals

No.   Dataset (3600 beats)   Katz     Higuchi's   Hurst's   PSM
1.    124m                   1.8544   1.2683      1.0567    0.3902
2.    231m                   1.7198   1.1971      1.2580    0.4104
3.    232m                   1.8760   1.2378      1.2723    0.5159

Table 6. The Estimated FD Values for Left Bundle Branch Block Arrhythmia Signals

No.   Dataset (3600 beats)   Katz     Higuchi's   Hurst's   PSM
1.    214m                   1.6661   1.1670      1.3110    0.8321
2.    109m                   1.7174   1.1585      1.1073    0.6954
3.    217m                   1.6644   1.1259      1.2376    0.7007

Table 7. Distinct Range of FD Values for Sample ECG Signals Using PSM

Signal Type                        Range
Ventricular Premature Arrhythmia   0.0349 - 0.1345
Atrial Premature                   0.2266 - 0.2790
Right Bundle Branch Block          0.3902 - 0.5159
Left Bundle Branch Block           0.6954 - 0.8321
Normal Sinus Rhythm                1.1189 - 1.9838

Table 8. Ranges of FD Values for Sample ECG Signal Using Katz's Method

Signal Type                        Range
Ventricular Premature Arrhythmia   1.6248 - 1.8216
Atrial Premature                   1.6217 - 2.2321
Right Bundle Branch Block          1.7198 - 1.8760
Left Bundle Branch Block           1.6644 - 1.7174
Normal Sinus Rhythm                1.6402 - 2.4422

Table 9. Ranges of FD Values for Sample ECG Signal Using Higuchi's Method

Signal Type                        Range
Ventricular Premature Arrhythmia   1.1312 - 1.4922
Atrial Premature                   1.1095 - 1.4429
Right Bundle Branch Block          1.1971 - 1.2683
Left Bundle Branch Block           1.1259 - 1.1670
Normal Sinus Rhythm                1.0295 - 1.7299

Table 10. Ranges of FD Values for Sample ECG Signal Using Hurst's Method

Signal Type                        Range
Ventricular Premature Arrhythmia   1.0592 - 1.2818
Atrial Premature                   1.0869 - 1.3258
Right Bundle Branch Block          1.0567 - 1.2723
Left Bundle Branch Block           1.1073 - 1.3110
Normal Sinus Rhythm                1.1263 - 1.4081

Table 11. Average of the Estimated FD Values

Signal Type                        Katz       Higuchi's   Hurst's    PSM
Ventricular Premature Arrhythmia   1.702367   1.261667    1.194167   0.082267
Atrial Premature                   1.904033   1.284600    1.226767   0.249733
Right Bundle Branch Block          1.816733   1.234400    1.195667   0.438833
Left Bundle Branch Block           1.682633   1.150467    1.218633   0.742733
Normal Sinus Rhythm                2.046305   1.380279    1.326195   1.522589

5.2.2 Classification with SVM

To obtain reliable assessments of the classification accuracy of the investigated classifier, I carried out five different trials with the OAA SVM procedure described in Section 4.3.2, each with a new set of randomly selected testing and training values, where each value is the PSM-FD of one type of disease. The results obtained on the test set in these five trials were then averaged. The number of ECG beats and of PSM-FD values used for each class in the experiment, together with a comparison of the average run time needed for each, is reported in Table 12. The classification performance summarized in Table 13 was evaluated in terms of two measures:

1) the accuracy of each class, which is the percentage of correctly classified beats among the beats of the considered class;
2) the average accuracy (AA), which is the average of the classification accuracies obtained for the different classes.

The accuracy of the PSM FD SVM is 100% for Normal Sinus Rhythm with every kernel used in the experiment, which means that the proposed method clearly distinguishes the normal condition from the pathological ones. PSM can therefore provide a significant clinical advantage: it can readily be incorporated 'on line' to indicate (and possibly help control) the onset of a pathological condition, as the high accuracy rates achieved in the experiment (Table 13) suggest.

Table 12. Number of ECG Beats According to PSM-FD Used in the Experiment with a Comparison of Average Run Time Needed for Each

Class                       ECG Beats   PSM FD
N                           24150       38
A                           338         21
V                           4039        21
RBBB                        3789        21
LBBB                        1801        21
Total                       34117       122
Average Run-Time (second)   313200      2.93333

Table 13. Class Percentage Accuracy Achieved on the Testing PSM-FD Values with a Total Number of 122 PSM-FD Training Values

Method                 AA        N         A         V         RBBB      LBBB
SVM-linear             78.90 %   81.42 %   80.25 %   74.84 %   82.53 %   72.58 %
SVM-poly               85.75 %   85.74 %   83.19 %   84.48 %   92.03 %   89.94 %
SVM-rbf                87.48 %   88.69 %   87.39 %   81.48 %   95.98 %   87.49 %
PSM FD - SVM-linear    85.41 %   100 %     81.97 %   80.33 %   82.79 %   81.96 %
PSM FD - SVM-poly      87.97 %   100 %     85.25 %   89.87 %   82.79 %   81.96 %
PSM FD - SVM-rbf       89.33 %   100 %     89.94 %   89.97 %   82.79 %   83.97 %

As reported in Table 13, the AA achieved on the test set by the proposed PSM-FD-SVM classifier based on the Gaussian kernel (SVM-rbf) is 89.33%. This result is better than those achieved with the linear and polynomial kernels: the AA is 85.41% for the SVM-linear classifier and 87.97% for the SVM-poly classifier. This experiment appears to confirm what is observed in other application fields, i.e., the superiority of the SVM based on the Gaussian kernel over traditional classifiers when dealing with feature spaces of very high dimensionality. Besides the accuracies of the proposed PSM-FD-SVM system and its low average run time (Table 12), Table 13 lists the plain SVM classifiers as a reference classification, which makes it possible to quantify how far the proposed system improves on these results.

Chapter 6
Conclusion and Future Work

6.1 Conclusion

In this thesis, an enhanced diagnosis method for identifying cardiac ECG arrhythmia using an OAA SVM classifier based on the fractal dimension was presented.
The proposed method first extracts the features of ECG arrhythmia based on fractal theory. In this phase, three time-domain methods and one frequency-domain method are used to estimate the fractal dimension values for the normal and the different pathological conditions, which establishes a range of FD for each specific disease. Such intervals are used to distinguish clearly between healthy and non-healthy persons by placing each of them in a distinct FD range. This should facilitate the application of the method as a supplemental tool to support the diagnosis of a pathological or normal heart condition. The Power Spectrum Method (PSM) distinguishes between the ECG signals of healthy and non-healthy persons better than the other methods. The results also suggest that FD is a practical tool for identifying abnormal characteristics in ECG recordings.

After the fractal features were extracted, the OAA SVM classifier was trained on these features in order to recognize and classify the ECG beats. Compared with the diagnosis method based on ECG morphology features and three ECG temporal features, i.e., the QRS complex duration, the RR interval (the time span between two consecutive R points, representing the distance between the QRS peaks of the present and previous beats), and the RR interval averaged over the last ten beats [33], the proposed method has the advantages of a simple architecture and global optimum ability.

The OAA SVM tool trained with simulated data was found to be capable of predicting ECG arrhythmia classes accurately when the beat data were presented to the trained OAA SVM for prediction. The simulation results used for verification show that the accuracy of the proposed diagnosis method, using the OAA SVM classifier based on the fractal dimension, is high, with an average accuracy of 89.33%.
6.2 Future Work

From the experimental results obtained, we can strongly recommend the SVM approach based on fractal features for classifying ECG signals as an alternative to the traditional methods of diagnosing cardiac arrhythmia, on account of its superior generalization capability compared with traditional classification techniques. This capability generally gives it higher classification accuracy, less overfitting, and robustness to noise. For future work, research should verify whether increasing the number of training beats raises the classification accuracies and makes the differences between the classifiers less pronounced. It would also be interesting to analyze other cardiac signals, such as the Heart Voice signal and the Heart Rate Variability (HRV) signal.

REFERENCES

[1] Netter FH (1971), Heart, The Ciba Collection of Medical Illustrations, Ciba Pharmaceutical Company, Summit, N.J., Vol. 5: 293 pp.
[2] Kannathal (2004), Acharya (2004), Garret (2003), Yuru (2004).
[3] Robert (1995), Kaplan (1999), Laurent (1998).
[4] Goldman MJ (1986), Principles of Clinical Electrocardiography, Lange Medical Publications, Los Altos, Cal., 12th ed.: 460 pp.
[5] Scheidt S (1984), Basic Electrocardiography: Abnormalities of Electrocardiographic Patterns, Ciba Pharmaceutical Company, Summit, N.J., Vol. 6/36: 32 pp.
[6] Marissa Selner (August 20, 2012), Medically Reviewed by Peter Rudd, MD.
[7] A. Vishwa and A. Sharma (December 2011), “Arrhythmic ECG Signal Classification Using Machine Learning Techniques”, International Journal of Computer Science, Information Technology, & Security (IJCSITS), Vol. 1, No. 2.
[8] Silipo (2011), “ECG Feature Extraction Techniques - A Survey Approach”, International Journal of Engineering, Science and Technology, Vol. 3, No. 8: pp. 122-131.
[9] Yu and Guyon, “Ensemble Feature Weighting Based on Local Learning and Diversity”, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence.
[10] Song et al. (December 2005), “Support Vector Machine Based Arrhythmia Classification Using Reduced Features”, International Journal of Control, Automation, and Systems, Vol. 3, No. 4: pp. 571-579.
[11] S. Raghav and A. K. Mishra (2008), “Fractal Feature Based ECG Arrhythmia Classification”, TENCON, IEEE Region 10 Conference.
[12] S. Z. Mahmoodabadi, A. Ahmadian, and M. D. Abolhasani (2005), “ECG Feature Extraction Using Daubechies Wavelets”, Proceedings of the Fifth IASTED International Conference on Visualization, Imaging and Image Processing, pp. 343.
[13] S. C. Saxena, A. Sharma, and S. C. Chaudhary (1997), “Data Compression and Feature Extraction of ECG Signals”, International Journal of Systems Science, Vol. 28, No. 5: pp. 483-498.
[14] V. S. Chouhan and S. S. Mehta (2008), “Detection of QRS Complexes in 12 Lead ECG Using Adaptive Quantized Threshold”, IJCSNS International Journal of Computer Science and Network Security, Vol. 8, No. 1.
[15] V. S. Chouhan and S. S. Mehta (March 2007), “Total Removal of Baseline Drift from ECG Signal”, Proceedings of International Conference on Computing: Theory and Applications, ICTTA-07: pp. 512-515, ISI.
[16] MIT-BIH Arrhythmia Database from PhysioBank, Physiologic Signal Archives for Biomedical Research. Retrieved 10 March 2013, from http://www.physionet.org/physiobank/database.
[17] Retrieved 10 March 2013, from http://library.thinkquest.org/3493/frames/fractal.html.
[18] Turner, M. J. (2000), Modeling Nature With Fractals, Leicester.
[19] P. Vanouplines, “Rescaled Range Analysis and the Fractal Dimension of pi”, University Library, Free University Brussels, Pleinlaan 2, 1050 Brussels, Belgium.
[20] Mandelbrot, B. B. (1982), “The Fractal Geometry of Nature”, San Francisco.
[21] Angel Chang (February 1993), “Fractals in Biological Systems”.
[22] Al Alfi, M. (2014), “Enhanced Automatic Identification of Arrhythmia in Electrocardiogram (ECG) Signals Based on Fractal Features and SVM Technique”, Unpublished Master Dissertation, Zarqa Private University, Zarqa, Jordan.
[23] Accardo A., Affinito M., Carrozzi M., Bouquet F. (1997), “Use of the Fractal Dimension for the Analysis of Electroencephalographic Time Series”, Biol Cybern, 77: 339-350.
[24] J. M. Blackledge (2006), “Digital Signal Processing: Mathematical and Computation Methods: Software Development and Applications”, 2nd Edition, London: Horwood Publishing Limited.
[25] Schepers HE, van Beek JHGM, Bassingthwaighte JB (1992), “Four Methods to Estimate the Fractal Dimension from Selfaffine Signals”, IEEE Eng Med Biol, (6): 57-64.
[26] Farid Melgani and Yakoub Bazi (September 2008), “Classification of Electrocardiogram Signals With Support Vector Machines and Particle Swarm Optimization”, IEEE Transactions on Information Technology in Biomedicine, Vol. 12, No. 5.
[27] Kaplan D. and Glass L. (1995), “Understanding Nonlinear Dynamics”, Textbooks in Mathematical Sciences, T. F. Banchoff, New York: Springer.
[28] F. Melgani and L. Bruzzone (Aug. 2004), “Classification of Hyperspectral Remote Sensing Images with Support Vector Machine”, IEEE Trans. Geosci. Remote Sens., Vol. 42, No. 8: pp. 1778-1790.
[29] C.-W. Hsu and C.-J. Lin (Mar. 2002), “A Comparison of Methods for Multiclass Support Vector Machines”, IEEE Trans. Neural Netw., Vol. 13, No. 2: pp. 415-425.
[30] Chih-Wei Hsu and Chih-Jen Lin (2002), “A Comparison of Methods for Multiclass Support Vector Machines”, IEEE Transactions on Neural Networks, Vol. 13, No. 2.
[31] M. Stone (1974), “Cross-validatory Choice and Assessment of Statistical Predictions”, J. R. Statist. Soc. B, Vol. 36: pp. 111-147.
[32] Arle J. E. and Simon R. H. (1990), “An Application of Fractal Dimension to the Detection of Transients in the Electroencephalogram”, Electroencephalogr Clin Neurophysiol, 75: 296-305.
[33] P. de Chazal and R. B. Reilly (Dec. 2006), “A Patient Adapting Heart Beat Classifier Using ECG Morphology and Heartbeat Interval Features”, IEEE Trans. Biomed. Eng., Vol. 53, No. 12: pp. 2535-2543.

Appendices

Appendix A: Matlab Code

%%%%%%%%%%%%% Matlab Code %%%%%%%%%%%%%%%
%--------------------------- Rescaled Range Algorithm ---------------------------
% D = HURST(Z) calculates the Hurst exponent H of time series Z using the
% R/S (rescaled range) algorithm and returns the fractal dimension D = 2 - H.
% When Z is not supplied, the recording selected by 'dataset' is loaded.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function D = Hurst(Z)
clc
if nargin < 1 || isempty(Z)
    dataset = 1;
    switch dataset
        case 1
            load('801m'); Z = val';
        case 2
            load('ecg4_20m'); Z = val';
        case 3
            load('ecg10_20m'); Z = val';
        case 4
            load('ecg11_20m'); Z = val';
        case 5
            load('ecg12_20m'); Z = val';
        case 6
            Z = load('sig_y1.txt');
        case 7
            Z = load('sig_y2.txt');
        case 8
            Z = load('sig_y3.txt');
        case 9
            Z = load('sig_y4.txt');
        case 10
            Z = load('sig_y5.txt');
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
m = length(Z);
x = zeros(1,m);
y = zeros(1,m);
y2 = zeros(1,m);
for tau = 3:m
    X = zeros(1,tau);
    Zsr = mean(Z(1:tau));              % mean of the first tau samples
    for t = 1:tau
        X(t) = sum(Z(1:t) - Zsr);      % cumulative deviation from the mean
    end
    R = max(X) - min(X);               % range of the cumulative deviations
    S = std(Z(1:tau),1);               % population standard deviation
    H = log10(R/S)/log10(tau/2);       % Hurst exponent estimate for this tau
    x(tau) = log10(tau);
    y(tau) = H;
    y2(tau) = log10(R/S);
end
D = 2 - H;                             % fractal dimension from the last estimate
%plot(x,y,'k--',x,y2,'k-'),legend('H-track','R/S-track','Location','South')
%xlabel('lg(number of test)'),ylabel('lg(R/S)')
%axis([x(1) x(end) -inf +inf]),drawnow
%figure(gcf)
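As a cross-check that does not require Matlab, the final step of the rescaled-range listing above (the estimate for the full series length) can be mirrored in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, the series must not be constant, and only the last value of H is computed rather than the whole H-track.

```python
import math

def hurst_fd(series):
    """Fractal dimension D = 2 - H from one rescaled-range (R/S) estimate."""
    n = len(series)
    mean = sum(series) / n
    # Cumulative sum of deviations from the mean.
    cumulative, running = [], 0.0
    for value in series:
        running += value - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)                     # range R
    s = math.sqrt(sum((v - mean) ** 2 for v in series) / n)   # population std S
    h = math.log10(r / s) / math.log10(n / 2)                 # Hurst exponent
    return 2 - h

# Two tiny hand-checkable series (illustrative values, not ECG data).
fd_alternating = hurst_fd([1.0, -1.0, 1.0, -1.0])
fd_trending = hurst_fd([1.0, 1.0, -1.0, -1.0])
print(fd_alternating, fd_trending)
```

The alternating series keeps returning to its mean, so R/S stays small and the estimate is close to D = 2, whereas the series with a persistent run gives a larger R/S and a smaller D.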
%----------------------------- Katz Algorithm -----------------------------
% S  = number of beat samples = number of beat intervals + 1
% fs = number of beat samples per second = S / (total time of all samples)
% Input:
%   x:  (either column or row) vector of length N
%   fs: number of beat samples per second
% Output:
%   f:  Katz fractal dimension of x
% When x and fs are not supplied, the recording selected by 'dataset' is loaded.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function f = KatzFD(x,fs)
clc
if nargin < 2
    dataset = 5;
    switch dataset
        case 1
            load('ecg2_20m'); x = val'; fs = 500;
        case 2
            load('ecg4_20m'); x = val'; fs = 500;
        case 3
            load('ecg10_20m'); x = val'; fs = 500;
        case 4
            load('ecg11_20m'); x = val'; fs = 500;
        case 5
            load('ecg12_20m'); x = val'; fs = 500;
        case 6
            x = load('sig_y1.txt'); fs = 1.2095833;
        case 7
            x = load('sig_y2.txt'); fs = .9770833;
        case 8
            x = load('sig_y3.txt'); fs = 1.615278;
        case 9
            x = load('sig_y4.txt'); fs = .7652778;
        case 10
            x = load('sig_y5.txt'); fs = .9673611;
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
x = x(:);                              % force a column vector
x = (x - mean(x))/std(x);              % normalize the signal
n = length(x);
t = (0:n-1)'/fs;                       % time axis in seconds
x1 = [t x];
d = zeros(1,n-1);
dmax = zeros(1,n-1);
for i = 1:n-1
    % distance between successive points
    d(i) = sqrt(abs(x1(i+1,1)-x1(i,1))^2 + abs(x1(i+1,2)-x1(i,2))^2);
    % distance from the first point
    dmax(i) = sqrt(abs(x1(i+1,1)-x1(1,1))^2 + abs(x1(i+1,2)-x1(1,2))^2);
end
totlen = sum(d);
avglen = mean(d);
maxdist = max(dmax);
numstep = double(totlen/avglen);
den = double(maxdist/totlen);
f = log(numstep)/(log(den) + log(numstep));

%----------------------------- Higuchi's Algorithm -----------------------------
% function xhfd = hfd(x,kmax)
% k: integer, indicates the interval time
% m: integer, indicates the initial time
% N: total number of samples in one epoch
% Input:
%   x:    (either column or row) vector of length N
%   kmax: maximum value of k
% Output:
%   xhfd: Higuchi fractal dimension of x
% When x is not supplied, the recording selected by 'dataset' is loaded.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function xhfd = HiguchiFD(x,kmax)
clc
if nargin < 1 || isempty(x)
    dataset = 5;
    switch dataset
        case 1
            load('ecg2_20m'); x = val';
        case 2
            load('ecg4_20m'); x = val';
        case 3
            load('ecg10_20m'); x = val';
        case 4
            load('ecg11_20m'); x = val';
        case 5
            load('ecg12_20m'); x = val';
        case 6
            x = load('sig_y1.txt');
        case 7
            x = load('sig_y2.txt');
        case 8
            x = load('sig_y3.txt');
        case 9
            x = load('sig_y4.txt');
        case 10
            x = load('sig_y5.txt');
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
if ~exist('kmax','var') || isempty(kmax)
    kmax = 7;                          % 1280/256 = 5 interval time series
end
x = x(:)';
N = length(x);
Lmk = zeros(kmax,kmax);
for k = 1:kmax
    for m = 1:k
        Lmki = 0;
        for i = 1:fix((N-m)/k)
            Lmki = Lmki + abs(x(m+i*k) - x(m+(i-1)*k));
        end
        Ng = (N-1)/(fix((N-m)/k)*k);   % normalization factor
        Lmk(m,k) = (Lmki*Ng)/k;
    end
end
Lk = zeros(1,kmax);
for k = 1:kmax
    Lk(1,k) = sum(Lmk(1:k,k))/k;       % mean curve length for interval k
end
lnLk = log(Lk);
lnk = log(1./(1:kmax));
b = polyfit(lnk,lnLk,1);               % slope of ln(L(k)) against ln(1/k)
xhfd = b(1);

%----------------------------- PSM Algorithm -----------------------------
%============================================================
% When no signal is supplied, the recording 16273m is loaded.
function Result = Estm_q(signal)
clc
Result = [];
if nargin < 1 || isempty(signal)
    load('16273m');                    % val 1x1280
    %val = load('sig_y2.txt');
    %load 17052m
    signal = val';
end
S = signal;
format short
if length(S) < 1024
    disp('The size of the data should be 1024 or more...');
    return;
end
D = fix(length(S)./512);
Trc = S(1:D*512);                      % drop the incomplete last window
N = 512;                               % size of the window that we need
m = N/2 + 1;
C2 = 0; jj = 0;
for C1 = 1:N:length(Trc)               % take different windows along the signal
    jj = jj + 1;
    C2 = C2 + 1;
    fs = Trc(C1:C2*N);                 % extract one window of size N
    fs = fs/max(fs);                   % normalize the signal
    FS = fft(fs);                      % calculate the FFT
    FS = fftshift(FS);                 % move the zero-frequency component to the center
    ps = abs(FS).^2;                   % calculate the power spectrum
    p = ps(1:N/2);                     % take the left half of the power spectrum, exclude the DC
    % calculations to recover the estimated q (estm_q)
    x1 = 0;                            % sum[log(k)]
    x2 = 0;                            % sum[log(p)]
    x12 = 0;                           % sum[log(k)*log(p)]
    x11 = 0;                           % sum[log(k)^2]
    for i = 1:N/2
        k = abs(i-m);                  % k = frequency
        if (p(i) ~= 0) && (k ~= 0)
            x1 = x1 + log(k);
            x2 = x2 + log(p(i));
            x12 = x12 + log(k)*log(p(i));
            x11 = x11 + log(k)*log(k);
        end
    end
    estm_q = ((N/2)*x12 - x1*x2)/(x1^2 - (N/2)*x11);   % the formula to estimate q
    Result(1,jj) = C2;
    Result(2,jj) = (5 - (2*estm_q))/2;
    %Result(2,jj) = estm_q;
    ps(513) = 0;
    %figure;
    %t = 1:0.1:10;
    %plot(estm_q);
    %plot(ps);
    %plot(ApplyThreshold(Result(2,:), -0.5, 0.5), 'r');
    plot(Result(2,:));
    %plot(ps);
    %line('XData', [0 9], 'YData', [0.0741 0.0741], 'LineStyle', '-', ...
    %     'LineWidth', 1, 'Color','m');
    %line('XData', [0 9], 'YData', [0 0], 'LineStyle', '-', ...
    %     'LineWidth', 1, 'Color','b');
    %title('Power spectrum of Normal Sinus Rhythm for 17052m dataset with 256 window size');
end                                    % end of windowing

%----------------------------- SVM Classification -----------------------------
%============================================================
clc
load 'test_svm_Normal_V.txt';          % load ECG dataset
data = test_svm_Normal_V(:,1);         % PSM-FD values
groups = test_svm_Normal_V(:,2);       % class labels
numInst = size(data,1);
k = 5;
options = optimset('maxiter', 5000, 'largescale', 'off');   % options settings for SVMTRAIN

% create a two-class problem: class 1 against the rest
First = 2*ones(numInst,1);
First(groups == 1) = 1;
cvFolds = crossvalind('Kfold',groups,k);    % get indices of 5-fold CV
cp = classperf(groups);                     % init performance tracker
for i = 1:k                                 % for each fold
    test = (cvFolds == i);                  % get indices of test instances
    train = ~test;                          % get indices of training instances
    % train an SVM model over the training instances
    svmStruct = svmtrain(data(train,:),groups(train),'KERNEL_FUNCTION','rbf', ...
        'showplot',false,'quadprog_opts',options);
    % test using the test instances
    classes = svmclassify(svmStruct,data(test,:),'showplot',false);
    % evaluate and update the performance object
    cp = classperf(cp,classes,test);
end
cp.CorrectRate                              % get accuracy

% classes 1-2 against the rest
% (for one-against-all as in Section 4.3.2, use groups == 2 instead)
Second = 2*ones(numInst,1);
Second(ismember(groups,[1 2])) = 1;
indices2 = crossvalind('Kfold',Second,k);
cp = classperf(Second);
for i = 1:k
    test = (indices2 == i);
    train = ~test;
    svmStructS = svmtrain(data(train,:),Second(train),'KERNEL_FUNCTION','linear','showplot',false);
    classes2 = svmclassify(svmStructS,data(test,:),'showplot',false);
    cp = classperf(cp,classes2,test);
end
cp.CorrectRate
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% classes 1-3 against the rest (for one-against-all, use groups == 3 instead)
Third = 2*ones(numInst,1);
Third(ismember(groups,[1 2 3])) = 1;
indices3 = crossvalind('Kfold',Third,k);
cp = classperf(Third);
for i = 1:k
    test = (indices3 == i);
    train = ~test;
    svmStructT = svmtrain(data(train,:),Third(train),'KERNEL_FUNCTION','linear','showplot',false);
    classes3 = svmclassify(svmStructT,data(test,:),'showplot',false);
    cp = classperf(cp,classes3,test);
end
cp.CorrectRate
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% classes 1-4 against the rest (for one-against-all, use groups == 4 instead)
Forth = 2*ones(numInst,1);
Forth(ismember(groups,[1 2 3 4])) = 1;
indices4 = crossvalind('Kfold',Forth,k);
cp = classperf(Forth);
for i = 1:k
    test = (indices4 == i);
    train = ~test;
    svmStructFo = svmtrain(data(train,:),Forth(train),'KERNEL_FUNCTION','linear','showplot',false);
    classes4 = svmclassify(svmStructFo,data(test,:),'showplot',false);
    cp = classperf(cp,classes4,test);
end
cp.CorrectRate
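The bookkeeping that the SVM listing above delegates to crossvalind and classperf can be illustrated without Matlab. The Python sketch below assigns fold numbers for 5-fold cross-validation and computes the per-class accuracy and the average accuracy (AA) defined in Section 5.2.2. The function names are assumptions, and the fold assignment is a plain shuffle rather than the group-aware split that crossvalind performs when it is given class labels.

```python
import random

def kfold_indices(n, k, seed=0):
    """Assign each of n samples a fold number 1..k, with near-equal fold sizes."""
    folds = [(i % k) + 1 for i in range(n)]
    random.Random(seed).shuffle(folds)
    return folds

def class_accuracy(true_labels, predicted_labels, target):
    """Percentage of samples of class `target` that were predicted as `target`."""
    hits = [p == t for t, p in zip(true_labels, predicted_labels) if t == target]
    return 100.0 * sum(hits) / len(hits)

def average_accuracy(per_class):
    """AA: plain mean of the per-class accuracies."""
    return sum(per_class) / len(per_class)

folds = kfold_indices(122, 5)            # 122 PSM-FD values, 5 folds
test_mask = [f == 1 for f in folds]      # fold 1 held out for testing
train_mask = [not held_out for held_out in test_mask]

# Per-class accuracies of the PSM FD - SVM-rbf row of Table 13 (N, A, V, RBBB, LBBB).
aa_rbf = average_accuracy([100.0, 89.94, 89.97, 82.79, 83.97])
print(sum(test_mask), sum(train_mask), round(aa_rbf, 2))
```

Averaging the five per-class accuracies of the PSM FD - SVM-rbf row in this way reproduces the AA of 89.33% reported in Table 13.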