
ENHANCED AUTOMATIC IDENTIFICATION OF ARRHYTHMIA IN ELECTROCARDIOGRAM (ECG) SIGNALS BASED ON FRACTAL FEATURES AND SVM TECHNIQUE.

By
Maram Hasan Al-Alfi
Supervisor
Dr. Rashiq Marie
This Thesis Was Submitted in Partial Fulfillment of the Requirements
for the Master Degree in Computer Science
Faculty of Scientific Research and Graduate Studies
Zarqa University
Zarqa, Jordan
April, 2014
ACKNOWLEDGMENTS
Thanks Allah , thanks so much because I would not have been able to complete this
thesis without His aid and support.
I would like to express my deepest appreciation to my advisor, Dr. Rashiq Marie for his
leadership, support,and attention to details.
I have enjoyed the aid and support of my mother who instilled in me confidence and a
drive for pursuing my MSc. degree.
Finally I would like to thank the staff members of the Department of Computer
Science at Zarqa University for their continuous aids.
TABLE OF CONTENTS
List of Tables
List of Figures
List of Acronyms
List of Publications
Abstract in Arabic
Abstract in English
Chapter 1: Introduction
  1.1 Overview
  1.2 Problem Definition
  1.3 Thesis Contribution
  1.4 Thesis Organization
Chapter 2: Literature Review and Related Work
  2.1 Introduction
  2.2 Related Work
  2.3 Taxonomy and research definition
Chapter 3: Electrocardiogram (ECG) Biosignals and Fractal Geometry
  3.1 Overview
  3.2 Properties of ECG Signals
  3.3 Normal cardiac Electrocardiogram versus Abnormal
  3.4 Fractal features
  3.5 Explanation of Fractal Geometry and Fractal Dimensions
    3.5.1 Calculating Fractal Dimensions
    3.5.2 Examples of Deterministic Fractals
    3.5.3 Fractals and Fractal Geometry applications
  3.6 Fractal Features Extraction From ECG Signals
    3.6.1 Time Domain Methods of Estimating FD
Chapter 4: ECG Feature Extracting using (PSM) and Classification with (SVM)
  4.1 Introduction
  4.2 Feature Extraction Using Power Spectrum Method (PSM)
    4.2.1 PSM Methodology Algorithm
  4.3 Arrhythmia Classification based on SVM
    4.3.1 Multiclass SVM
    4.3.2 Application of OAA SVM Using Fractal Features in ECG Arrhythmia diagnosis
Chapter 5: Experimental Evaluation
  5.1 Dataset Description
  5.2 Experimental Results
    5.2.1 Fractal features Extraction
    5.2.2 Classification with SVM
Chapter 6: Conclusion and Future Work
  6.1 Conclusion
  6.2 Future work
References
Appendices
  Appendix A: Matlab Code
LIST OF TABLES
Table 1   Description of the Used Dataset
Table 2   The Estimated FD Values for Normal Sinus Rhythm Signals
Table 3   The Estimated FD Values for Ventricular Premature Arrhythmia Signals
Table 4   The Estimated FD Values for Atrial Premature Arrhythmia Signals
Table 5   The Estimated FD Values for Right Bundle Branch Block Arrhythmia Signals
Table 6   The Estimated FD Values for Left Bundle Branch Block Arrhythmia Signals
Table 7   Distinct Range of FD Values for Sample ECG Signals Using PSM
Table 8   Ranges of FD Values for Sample ECG Signal Using Katz's Method
Table 9   Ranges of FD Values for Sample ECG Signal Using Higuchi's Method
Table 10  Ranges of FD Values for Sample ECG Signal Using Hurst's Method
Table 11  Average of the Estimated FD Values
Table 12  Number of Training and Testing Beats Used
Table 13  Class Percentage Accuracy Achieved on the Testing PSM-FD Values with a Total Number of 122 PSM-FD Training Values
LIST OF FIGURES
Figure 1   Propagation of the depolarization wave in the heart muscle
Figure 2   Typical shape of ECG signal and its essential waves
Figure 3   Normal sinus rhythm
Figure 4   Premature Ventricular
Figure 5   Atrial Premature
Figure 6   Right bundle-branch block
Figure 7   Left bundle-branch block
Figure 8   Fern Leaf
Figure 9   Classical geometry objects
Figure 10  Fractal Curves
Figure 11  Demonstration of fractal dimensions with Euclidean line segments
Figure 12  Demonstration of fractal dimensions with Euclidean planes
Figure 13  The Koch Curve
Figure 14  The Sierpinski Triangle
Figure 15  (left) Normal Sinus Rhythm ECG signal of size 1024, (right) Zoom in version of Normal Sinus Rhythm ECG signal of size 512
Figure 16  (left) Atrial Premature Arrhythmia ECG signal of size 1024, (right) Zoom in version of Atrial Premature Arrhythmia ECG signal of size 512
Figure 17  Measured power spectrum of Normal Sinus Rhythm ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128
Figure 18  Measured power spectrum of Atrial Premature ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128
Figure 19  Estimate the Fourier Dimension q of ECG Signal
Figure 20  SVM Model
Figure 21  Hyper Plane
Figure 22  SVM method using fractal features for ECG Arrhythmia diagnosis
Figure 23  (A), (B) ECG signals of Normal Rhythm
Figure 24  (A), (B) ECG signals of a Premature Ventricular Arrhythmia
Figure 25  (A), (B) ECG signals of an Atrial Premature Arrhythmia
Figure 26  (A), (B) ECG signals of Right Bundle-Branch Block Arrhythmia
Figure 27  (A), (B) ECG signals of Left Bundle-Branch Block Arrhythmia
LIST OF ACRONYMS
AA     Average Accuracy
AF     Atrial Fibrillation
AFIB   Atrial Fibrillation beat
AP     Atrial premature beat
BII    Heart Block
ECG    Electrocardiogram
FD     Fractal Dimension
FFT    Fast Fourier Transform
HRV    Heart Rate Variability
LBBB   Left Bundle Branch Block
PDF    Probability Distribution Function
Poly   Polynomial
PSDF   Power Spectral Density Function
PSM    Power Spectrum Method
PVC    Ventricular premature beat
RBBB   Right Bundle Branch Block
RBF    Radial Basis Function
RS     Rescaled Range Method
RSF    Random Scaling Fractal
SVM    Support Vector Machine
SVT    Supraventricular tachycardia
LIST OF PUBLICATIONS
Rashiq R. Marie and Maram H. Al Alfi (May 11, 2014), "Identification of Cardiac Diseases from (ECG) Signals based on Fractal Analysis", International Journal of Computers and Technology (IJCIT), Vol. 13, no. 6, pp. 4556-4565.
Maram H. Al Alfi and Rashiq R. Marie (2014), "Support Vector Machine based Arrhythmia Classification using Fractal Dimension Feature of ECG Signal", International Journal of Computer Science Issues (IJCSI) (Submitted).
AN ENHANCED MECHANISM FOR DETECTING CARDIAC ARRHYTHMIA
IN THE ELECTROCARDIOGRAM BASED ON
FRACTAL FEATURES AND THE SVM TECHNIQUE
Prepared by
Maram Hasan Al-Alfi
Supervisor
Dr. Rashiq Marie
ABSTRACT
Analysis of the electrocardiogram (ECG) is one of the main research interests in
biomedical signal processing. The reasons for this interest are the growth in cardiac
health care activities all over the world and the rapid advance in digital computer
technology, which plays an essential role in detecting disease states in biosignals.
Because the assessment of diagnostic results for these biosignals depends heavily on
quantity, accuracy, and speed, computer-based analysis is very useful in clinical
treatment.
In this thesis a method for analyzing the ECG signal using fractal features and the
SVM technique is proposed. Practical experiment showed that this method provides a
good electronic diagnostic pattern for cardiac arrhythmia, and that it can be used by a
specialist physician to diagnose different types of this disease with an average
accuracy of 89.33%.
ECG signals exhibit fractal patterns, and I computed the fractal dimension (FD) of the
ECG time series in the feature extraction phase. For this purpose the Power Spectrum
Method (PSM) was applied to four kinds of abnormal waveforms and to normal ones. All
ECG signals were collected from the arrhythmia database (Massachusetts Institute of
Technology, BIH).
Finally, a multi-class SVM classifier was built and fed with a vector of the fractal
dimensions (FDs) of the ECG signals extracted in the previous phase, labeled according
to four kinds of abnormal waveforms and one normal kind. The results obtained confirm
the superiority of the proposed method for identifying cardiac arrhythmia compared with
other traditional methods, which analyze ECG signals based on morphology features and
three ECG temporal features, namely the QRS complex duration, the RR interval (the time
span between two consecutive R points, representing the distance between the QRS peaks
of the present and previous beats), and the RR interval averaged over the last ten
beats [33]. The results also indicate that substantial improvements in classification
accuracy can be achieved with the proposed classification system.
ENHANCED AUTOMATIC IDENTIFICATION OF ARRHYTHMIA
IN ELECTROCARDIOGRAM (ECG) SIGNALS BASED ON
FRACTAL FEATURES AND SVM TECHNIQUE.
By
Maram Hasan Al-Alfi
Supervisor
Dr. Rashiq Marie
ABSTRACT
Analysis of the electrocardiogram (ECG) is one of the major research
interests in biomedical signal processing. The reasons for this interest are the growth
in cardiac health care activities all over the world and the rapid advance in digital
computer technology, which plays an essential role in the detection of disease states
from biosignals. Because the assessment of diagnostic results for these biosignals
depends heavily upon quantity, accuracy, and speed, computer-based analysis
is very useful in clinical therapy.
In this thesis a method for analyzing ECG signals using fractal features and the
support vector machine (SVM) technique is proposed. I found from
practical experiment that this method provides a good electronic diagnostic pattern for
cardiac arrhythmia, and that it can be used by a specialist doctor to diagnose various
types of this disease with an average accuracy of 89.33%.
Because ECG signals show fractal patterns, the fractal dimension (FD) of the ECG
time series is estimated in a feature extraction phase. For
this purpose the Power Spectrum Method (PSM) has been applied to four kinds of
abnormal waveforms and to normal beats. All ECG signals were acquired from the
MIT-BIH (Massachusetts Institute of Technology - Beth Israel Hospital) arrhythmia
database. Finally, a multi-class SVM classifier was constructed and fed with a vector
of the ECG signal FDs extracted in the previous phase, labeled according to the four
kinds of abnormal waveforms and the normal one.
The obtained results confirm the superiority of the proposed method for
identifying cardiac arrhythmia as compared to the traditional one, which analyzes ECG
signals based on morphology features and three ECG temporal features, i.e., the QRS
complex duration (the QRS complex is the combination of three of the graphical
deflections seen on a typical ECG), the RR interval (the time span between two
consecutive R points, representing the distance between the QRS peaks of the present
and previous beats), and the RR interval averaged over the last ten beats [33]. The
results also suggest that substantial improvements in classification accuracy can be
achieved by the proposed classification system.
Chapter 1
Introduction
1.1 Overview
Computer technology plays an important role in modeling biological systems. The
huge growth of high-performance computing techniques in recent years, with regard to
the development of useful and accurate models of biological systems, has contributed
significantly to new approaches to fundamental problems of modeling the behavior of
biological systems. The importance of analyzing biological time series, which typically
display complex dynamics, has long been recognized in the area of nonlinear
analysis. Several features have been proposed to detect hidden but important dynamical
properties of the signals. These nonlinear dynamical techniques have been applied to
many areas, including medicine and biology [1].
In 2004, nonlinear techniques were used to analyze
physiological signals: heart rate, nerve activity, renal blood flow, arterial pressure,
EEG, and respiratory signals [1]. To investigate the time-varying spectral characteristics
of the underlying process, most methods begin by computing the time
variation of the common statistical properties of the process [3]. However, these
methods fail to deal properly with the nonlinearity of the process. Fractal analysis,
which I have applied here to electrocardiogram (ECG) signals, allows me to
process these signals effectively and obtain their higher-order statistics.
The ECG signal is the electrical signal generated by the heart muscle and measured
on the skin surface of the body. This biosignal is essentially non-stationary, and it
displays a fractal-like self-similarity. It may contain indicators of current disease, or
even warnings about impending diseases. The indicators may be present at all times or
may occur at random on the time scale. However, studying irregularities in the huge
amount of data collected over several hours is hard and time consuming. Therefore,
computer-based analytical tools for in-depth study and classification of data over
day-long intervals can be very useful in diagnostics.
1.2 Problem Definition
The ECG has a basic role in cardiology, since it provides an effective, simple, and
low-cost procedure for the diagnosis of cardiac disorders, which are very relevant for
their impact on the patient's life. One of the pathological alterations observable by
ECG is cardiac rhythm disturbance (or arrhythmia). Arrhythmia can lead to
life-threatening conditions, so the detection of abnormalities in intensive care
patients is essential and critical. Automatic analysis of the ECG and automatic
abnormality detection are therefore very helpful: they aid clinical staff in the absence
of doctors, and they help doctors to diagnose and act faster in emergency
conditions. Designing a low-cost, high-performance, and simple-to-use tool for ECG
analysis that offers a combination of diagnostic features seems to be a global pursuit.
1.3 Thesis Contributions
The contributions of this thesis can be summarized as follows:
1. The implementation of an automatic approach to achieve highly reliable detection
of cardiac abnormalities, which includes fractal feature extraction, arrhythmia
classification, and assessment.
2. Feature extraction based on fractal analysis, using the Power Spectrum
Method (PSM), for different cardiac diseases.
3. Evaluation of the extracted features using the SVM classification technique for
the detection of cardiac arrhythmia.
4. Evaluation of the performance of a suitable classifier architecture and classifier
inputs in the detection of various cardiac arrhythmias.
1.4 Thesis Organization
The thesis is organized in six chapters. This introductory chapter
contains the problem definition and the thesis contributions. The remaining chapters
are organized as follows.
Chapter 2 reviews the literature on the research topics related
to this thesis work.
Chapter 3 provides medical and technical information useful for
understanding the ECG signal, describes the morphologies of normal heart beats and of
different arrhythmias, describes fractals and fractal geometry together with
their applications, describes how fractal dimensions are computed, gives examples of
individual fractals, and finally presents some methods for computing the fractal
dimension in the time domain.
Chapter 4 focuses on the Power Spectrum Method (PSM), the proposed
method for extracting ECG features, which makes use of the characteristic Power
Spectral Density Function (PSDF) of a Random Scaling Fractal signal in the frequency
domain, and applies it to identify cardiac diseases from ECG signals. The chapter
then describes the classification process, which uses the One Against All (OAA) SVM
procedure to classify five types of heart beats.
Chapter 5 discusses and evaluates the experimental results
obtained from classifying five types of heart beat, namely, normal sinus rhythm (N),
atrial premature beat (A), ventricular premature beat (V), Right Bundle Branch Block
(RBBB), and Left Bundle Branch Block (LBBB). The experiment automatically analyzes
electrocardiograms on the basis of fractal features, localizes points of interest, and
decides whether they are normal or not.
Chapter 6 concludes this thesis. It describes the results obtained with the
proposed method and gives recommendations for future improvements.
Chapter 2
Literature Review and Related Work
2.1 Introduction
An ECG provides two major kinds of information. Firstly, measuring the time intervals on
the ECG helps in determining the duration of the electrical wave
crossing the heart, and consequently whether the electrical activity is
normal or slow, fast or irregular. Secondly, measuring the amount of electrical activity
passing through the heart muscle enables a pediatric cardiologist to find out if
parts of the heart are too large or are overworked [6]. Thus, physicians diagnose
arrhythmia based on long-term ECG data using an ECG recording system.
Physicians interpret the morphology of the ECG waveform and decide whether the
heartbeat belongs to the normal sinus rhythm or to the class of arrhythmia [5]. As the
number of remote and mobile healthcare systems that adopt ECG recorders
increases, the importance of a better and more robust automatic
arrhythmia classification algorithm is increasingly acknowledged. The analysis of the
ECG basically consists of recognizing its pattern and classifying arrhythmia in real time.
2.2 Related Work
Many algorithms have been proposed in previous years for developing
automated systems that accurately classify electrocardiographic signals. Alka
Vishwa and Archana Sharma [7] used Artificial Neural Networks (ANN) to classify
whether the patient is suffering from arrhythmia. The ANN structure is used to test
patients' records. The authors conclude that this phase of the study will be expanded
into classification between types of arrhythmia. The accuracy achieved at this level is
quite low, so a further expansion is to use algorithms that give better accuracy,
such as k-nearest neighbors and support vector machines.
Silipo et al. [8] presented a comparison of two ECG classification
techniques, one with supervised and the other with unsupervised learning.
Yu et al. and Guyon et al. [9] presented the use of feature selection methods for
choosing a subset of the original features. An obvious advantage of
feature selection is the reduction in the time and cost of feature acquisition, as well as
in classifier training and testing time. Feature selection also helps to
improve classifier accuracy, provided that noisy, irrelevant, or redundant features are
eliminated.
Song et al. [10] proposed Support Vector Machine (SVM) based arrhythmia
classification with the feature dimensions reduced by linear discriminant analysis
(LDA). Raghav and Mishra [11] classified ECG
arrhythmia using the local fractal dimensions of the ECG signal as the features for
classifying the arrhythmic beats. The method is based on matching the fractal dimension
series of the test ECG waveform to those of representative ECG waveforms of different
types of arrhythmia.
Mahmoodabadi et al. [12] described an approach for ECG feature extraction
which utilizes the Daubechies wavelet transform. They developed and evaluated an
ECG feature extraction system based on the multi-resolution
wavelet transform. Saxena et al. [13] described an approach for effective feature
extraction from ECG signals. Their paper deals with a competent composite method
developed for data compression, signal retrieval, and feature extraction
of ECG signals. An algorithm was presented by Chouhan and Mehta [14] for the detection
of QRS complexes. The recognition of QRS complexes forms the basis of more or
less all automated ECG analysis algorithms. The presented algorithm utilizes a modified
definition of the slope of the ECG signal as the feature for QRS detection. A succession of
transformations of the filtered and baseline-drift-corrected ECG signal is used to
extract a new modified slope feature. A method for the automatic extraction of both time
interval and morphological features from the ECG, in order to classify
ECGs into normal and arrhythmic, was described by Alexakis et al. [15]. The method
combined artificial neural networks (ANN) and Linear Discriminant
Analysis (LDA) techniques for feature extraction. Five ECG features, namely RR, RTc,
T wave amplitude, T wave skewness, and T wave kurtosis, were used in their method.
These features are obtained with the assistance of automatic algorithms. The onset and
end of the T wave were detected using the tangent method. The three feature
combinations used had very similar performance when considering the average
performance metrics.
2.3 Taxonomy and research definition
In this thesis an enhanced diagnosis method for identifying cardiac ECG
arrhythmia using an OAA SVM classifier based on the fractal dimension is presented. The
proposed method first extracts the features of ECG arrhythmia based on fractal
theory. In this phase three time-domain methods and one frequency-domain method
are used to estimate the fractal dimension values for the normal condition and for
different pathological conditions, which establishes a different range of FD for each
specific disease. Such intervals are used to distinguish clearly between healthy and
non-healthy persons by placing each of them in a distinct FD range. This should
facilitate its application as a supplemental method to support the diagnosis of a
pathological or normal heart condition. The Power Spectrum Method (PSM) distinguishes
between the ECG signals of healthy and non-healthy persons better than the other methods.
The results also suggest that FD is a practical tool for identifying abnormal
characteristics in ECG recordings. After the fractal features were extracted, and since
SVM is known to offer remarkable classification performance, I chose the widely used
One Against All (OAA) SVM method, optimized by fractal feature selection, to classify
a standard arrhythmia dataset [16] and to compare the accuracy rates obtained. The
OAA-SVM classifier was trained on these features in order to recognize and
classify the ECG beats. Compared with the diagnosis method based
on ECG morphology features and three ECG temporal features, i.e., the QRS complex
duration, the RR interval (the time span between two consecutive R points, representing
the distance between the QRS peaks of the present and previous beats), and the RR
interval averaged over the last ten beats [33], the proposed method has the advantages
of a simple architecture and global optimum ability.
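The frequency-domain idea mentioned above can be illustrated with a minimal Python sketch. It assumes a power-law spectrum of the form P(k) ~ 1/k^beta and estimates beta as the negative slope of a straight-line fit in log-log coordinates. The function names, the naive DFT, and the synthetic test signal are illustrative assumptions and not the thesis implementation (which is in Matlab, Appendix A); converting the exponent into a fractal dimension depends on the model adopted in Chapter 4 and is deliberately left out.

```python
import cmath
import math


def power_spectrum(signal):
    """Naive DFT power |X[k]|^2 for bins k = 1 .. N/2 - 1 (the DC term is skipped)."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2):
        x_k = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append((k, abs(x_k) ** 2))
    return spectrum


def spectral_exponent(signal):
    """Least-squares slope of log P(k) against log k; returns beta for P(k) ~ 1/k^beta."""
    points = [(math.log(k), math.log(p)) for k, p in power_spectrum(signal) if p > 0]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx


def synthetic_power_law_signal(n, beta):
    """Hypothetical test signal: cosines whose amplitudes decay as k^(-beta/2)."""
    return [
        sum(k ** (-beta / 2) * math.cos(2 * math.pi * k * t / n) for k in range(1, n // 2))
        for t in range(n)
    ]


# Usage: the fit should recover an exponent close to the one used to build the signal.
demo_signal = synthetic_power_law_signal(64, beta=2.0)
estimated_beta = spectral_exponent(demo_signal)
```

A real ECG window would replace the synthetic signal; an FFT routine would normally replace the naive DFT, which is used here only to keep the sketch dependency-free.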
Chapter 3
Electrocardiogram (ECG) Biosignals and Fractal Geometry
3.1 Overview
Electrocardiography deals with the electrical activity of the heart, monitored by
placing sensors at the limb extremities of the subject [1]. As shown in Figure 1 [4],
the electrocardiogram (ECG) is a faithful record of the origin and propagation of the
electric potential through the cardiac muscle. It is considered a representative signal of
cardiac physiology, useful in diagnosing cardiac disorders.
The cardiac cycle mainly consists of three electrical components representing
the activation and deactivation of the atria and ventricles, the blood-pumping
chambers of the heart. During each cardiac cycle the atria contract in diastole to fill
the ventricles, which then contract during systole to supply blood to the lungs and the
systemic circulation. Contraction of the atria and ventricles is tightly coordinated by a
wave of depolarization spreading through the muscular walls of these chambers.
The depolarization wave reflects the movement of charge across myocyte
membranes and is in effect an electrical current spreading through the heart.
Following contraction, cardiac muscle returns to a resting state, and this is associated
with a reversal of the movement of charge across the myocyte membranes. This second
wave of electrical activity is termed cardiac repolarization.
The leads of the ECG machine are designed to detect and record these two
waves of cardiac electrical activity. The depolarization wave spreads through the heart
in a highly predictable pattern, and to understand the ECG readout, the pattern of spread
of cardiac depolarization needs to be understood [4].
The deflection produced by atrial depolarization is termed a P wave, while
ventricular depolarization produces the QRS complex. The diffuse deflection produced
by ventricular repolarization is termed a T wave. The nomenclature of the QRS
complex can cause some confusion but is in fact quite straightforward. Within the QRS
complex, any positive deflection, that is, a deflection above the isoelectric line, is
termed an R wave. Any negative deflection which follows an R wave is termed an S
wave. However, if the first deflection of the QRS complex is negative, this deflection is
termed a Q wave [4]. The section of the ECG recording connecting the end of the QRS
complex and the beginning of the T wave is termed the ST segment.
Figure 1. Propagation of the depolarization wave in the heart muscle
The potential difference recorded at two points of the electromagnetic field
reflects the ECG signal. The shape of the ECG signal and the cyclic repetition of its
characteristic parts, including the P-QRS-T complex, constitute essential information
about the operation of the electrical conduction system of the heart. By analyzing the ECG
signals recorded simultaneously at different points of the human body, we can obtain
essential diagnostic information related to heart functioning. This information concerns
not only the electrophysiological parameters of the heart but also its
anatomical and mechanical properties. In essence, the ECG signal is an electric signal
generated directly by the heart muscle cells.
The information included in the ECG signal is directly related to the source of
the signal, that is, the heart itself. ECG signals are recorded as a difference of electric
potentials at two points inside the heart, on its surface, or on the surface of the
human body. The potential difference corresponds to the voltage recorded between the two
points where the measurements were taken. This voltage is the amplitude of the ECG
signal recorded in the two-pole (two-electrode) system. Such a two-electrode system
applied to the recording of the ECG signal is referred to as an ECG lead. The ECG
signal recorded on paper or on an electronic data carrier is called an electrocardiogram.
Although biologists have traditionally represented heartbeats as sine waves,
scientists have come to recognize that they can be better characterized using fractal
geometry [18]. ECG signals, oscillating at the borderline between chaos and order,
have a fractal nature. If the beat is too periodic, heart failure might be the result, but a
heart attack might occur when it is too aperiodic. Fractals are a new branch of
mathematics and art. A fractal is generally described as "a rough or fragmented geometric
shape that can be split into parts, each of which is (at least approximately) a
reduced-size copy of the whole" [20]. While classical Euclidean geometry works with objects
which exist in integer dimensions, fractal geometry deals with objects in non-integer
dimensions. Euclidean geometry describes lines, ellipses, circles, etc. Fractal
geometry, however, is described by algorithms, that is, sets of instructions on how to
create a fractal [20].
Euclidean geometry is perfectly suited to the world that humans have created. But
if one considers the structures that are present in nature, many of the rules of Euclidean
geometry disappear. Clouds are not perfect spheres, mountains are not symmetric cones, and
lightning does not travel in a straight line. Nature is rough, and until very recently this
roughness was impossible to measure. The discovery of fractal geometry made it
possible to explore mathematically the kinds of rough irregularities that exist in nature.
In our world there are many objects which exist in integer dimensions:
zero-dimensional points, one-dimensional lines and curves, two-dimensional plane figures
such as circles and squares, and three-dimensional solid objects such as spheres and cubes.
However, many things in nature are better described with a dimension that lies part of
the way between two whole numbers. While a straight line has a dimension of exactly
one, a fractal curve will have a dimension between one and two, depending on how much
space it takes up as it curves and twists [20].
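As a small worked illustration of a non-integer dimension, the sketch below evaluates the standard similarity dimension D = log(N) / log(1/r) for an object made of N copies of itself, each scaled by a factor r. The function name is a hypothetical helper introduced here, and the Koch curve parameters (N = 4, r = 1/3) are the textbook values rather than figures taken from this chapter.

```python
import math


def similarity_dimension(copies, scale_factor):
    """Similarity dimension D = log(N) / log(1/r) for N self-similar copies scaled by r."""
    if copies < 1 or not 0.0 < scale_factor < 1.0:
        raise ValueError("need copies >= 1 and 0 < scale_factor < 1")
    return math.log(copies) / math.log(1.0 / scale_factor)


# A line segment splits into 2 halves, a square into 4 quarter-size squares.
line_dimension = similarity_dimension(2, 1 / 2)
square_dimension = similarity_dimension(4, 1 / 2)

# The Koch curve replaces each segment by 4 segments of one third the length,
# which gives a dimension between one and two (about 1.26).
koch_dimension = similarity_dimension(4, 1 / 3)
```

The line and the square recover the integer dimensions one and two, while the Koch curve lands strictly between them, which is the behavior the paragraph above describes for fractal curves.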
3.2 Properties of ECG Signals
ECG signals are among the best-known biomedical signals. Given their nature,
they pose a number of challenges during their registration, processing, and
analysis. Characteristic properties of biomedical signals include their nonstationarity,
susceptibility to noise, and variability among individuals. ECG signals show all these
properties. For the purposes of ECG diagnostics, a typical ECG signal
(viewed as normal) has been defined that reflects the electrical activity of the heart muscle during a
single heart cycle. Figure 2 [5] defines some characteristic segments, points, and
parameters used to capture the essence of the signal. In medical diagnostics, the
relationships between the shape and parameters of the signal and the functioning of the
heart are often expressed in terms of linguistic statements resulting in some logic
expressions. For instance, we have terms such as “extended R wave,” “shortened
QT interval,” “unclear Q wave,” “elevated ST segment,” “low T wave,” etc. The expert
cardiologist forms his/her own model of the process, which is described in a linguistic
fashion. It is clear that the model is formed on the basis of acquired knowledge and
experience [5].
Figure 2. Typical shape of ECG signal and its essential waves
3.3 Normal cardiac Electrocardiogram versus Abnormal
Normal sinus rhythm is the rhythm of a healthy normal heart, where the sinus
node triggers the cardiac activation. This is easily diagnosed by noting that the three
deflections, P-QRS-T, follow in this order and are differentiable, as shown in Figure 3 [4].
The sinus rhythm is normal if its frequency is between 60 and 100/min [4].
Figure 3. Normal sinus rhythm.
An arrhythmia is an abnormality in the heart’s rhythm, or heartbeat pattern. The
heartbeat can be too slow, too fast, have extra beats, or otherwise be irregular [4].
Below are some types of cardiac arrhythmia, each with a brief description of its properties.
Ventricular Arrhythmias
In ventricular arrhythmias ventricular activation does not originate from the AV
node and/or does not proceed in the ventricles in a normal way. If the activation
proceeds to the ventricles along the conduction system, the inner walls of the ventricles
are activated almost simultaneously and the activation front proceeds mainly radially
toward the outer walls. As a result, the QRS-complex is of relatively short duration. If
the ventricular conduction system is broken or the ventricular activation starts far from
the AV node, it takes a longer time for the activation front to proceed throughout the
ventricular mass. The criterion for normal ventricular activation is a QRS-interval
shorter than 0.1 s. A QRS-interval lasting longer than 0.1 s indicates abnormal
ventricular activation[4].

Premature Ventricular Contraction(PVC)
Figure 4 [4] shows a premature ventricular contraction, which is one that occurs
abnormally early. If its origin is in the atrium or in the AV node, it has a
supraventricular origin. The complex produced by this supraventricular arrhythmia lasts
less than 0.1 s. If the origin is in the ventricular muscle, the QRS-complex has a very
abnormal form and lasts longer than 0.1 s. Usually the P-wave is not associated with it
[5].
Figure 4. Premature Ventricular Contraction.

Atrial Premature (AP)
Atrial premature complexes are also called premature atrial contractions (PACs)
and may cause heart palpitations or unusual awareness of heartbeats. Palpitations
may be heartbeats that are extra fast, extra slow, or irregularly timed. PACs occur
when a heartbeat occurs early in the heart cycle, that is, prematurely
(Cincinnati Children’s) [6]. PACs result in a feeling that the heart has skipped a
beat, or that the heartbeat has briefly paused. Sometimes PACs occur without
being felt. Premature beats are common, and usually harmless. Rarely, PACs
may indicate a serious heart condition such as a life-threatening arrhythmia.
When a premature beat occurs in the upper chambers of the heart, it is known as an
atrial complex or contraction. Premature beats can also occur in the lower chambers
of the heart. These are known as ventricular complexes. Causes and symptoms of
both types of premature beats are similar. Atrial and ventricular hypertrophies are
illustrated in Figure 5 [4].
Figure 5. Atrial Premature
Bundle-branch block
Bundle-Branch Block denotes a conduction defect in either of the bundle-branches
or in either fascicle of the left bundle-branch. If the two bundle-branches exhibit a block
simultaneously, the progress of activation from the atria to the ventricles is completely
inhibited; this is regarded as third-degree atrioventricular block. The consequence of
left or right Bundle-Branch Block is that activation of the ventricle must await initiation
by the opposite ventricle. After this, activation proceeds entirely on a cell-to-cell basis.
The absence of involvement of the conduction system, which initiates early activity of
many sites, results in a much slower activation process along normal pathways. The
consequence is manifest in bizarrely shaped QRS-complexes of abnormally long duration.

Right Bundle-Branch Block (RBBB)
If the right bundle-branch is defective so that the electrical impulse cannot travel
through it to the right ventricle, activation reaches the right ventricle by proceeding from
the left ventricle. It then travels through the septal and right ventricular muscle mass.
This progress is, of course, slower than that through the conduction system and leads to
a QRS-complex wider than 0.1 s. Usually the duration criterion for the QRS-complex in
right Bundle-Branch Block (RBBB), as well as for left Bundle-Branch Block
(LBBB), is >0.12 s [5].
With normal activation the electrical forces of the right ventricle are partially
concealed by the larger sources arising from the activation of the left ventricle. In right
Bundle-Branch Block (RBBB), activation of the right ventricle is so much delayed, that
it can be seen following the activation of the left ventricle. (Activation of the left
ventricle takes place normally). Right Bundle-Branch Block is illustrated in Figure 6 [4].
Figure 6. Right bundle-branch block

Left Bundle-Branch Block (LBBB)
The situation in left Bundle-Branch Block (LBBB) is similar, but activation
proceeds in a direction opposite to RBBB. Again the duration criterion for complete
block is 0.12 s or more for the QRS-complex [4]. Because the activation wavefront
travels in more or less the normal direction in LBBB, the signals' polarities are generally
normal. However, because of the abnormal sites of initiation of the left ventricular
activation front and the presence of normal right ventricular activation the outcome is
complex and the electric heart vector makes a slower and larger loop to the left and is
seen as a broad and tall R-wave, usually in leads I, aVL, V5, or V6 as illustrated in
Figure 7[4].
Figure 7. Left bundle-branch block
3.4 Fractal Features
Two of the most important properties of fractals are self-similarity and non-integer
dimension. We can explain self-similarity by looking carefully at the fern leaf
shown in Figure 8 [20] below, and noticing that every little leaf, a part of the bigger one,
has the same shape as the whole fern leaf. We can say that the fern leaf is self-similar.
The same is true of fractals: we can magnify them many times and after every step we will
see the same shape, which is characteristic of that particular fractal.
The non-integer dimension is more difficult to explain. Classical geometry deals
with objects of integer dimensions: zero-dimensional points, one-dimensional lines and
curves, two-dimensional plane figures such as squares and circles, and three-dimensional
solids such as cubes and spheres, as shown in Figure 9. However, many natural
phenomena are better described using a dimension between two whole numbers. So
while a straight line has a dimension of one, Figure 10 [20] demonstrates how a fractal
curve has a dimension between one and two, depending on how much space it
takes up as it twists and curves. The more the flat fractal fills a plane, the closer it
approaches two dimensions. Likewise, a "hilly fractal scene" has a dimension
somewhere between two and three. So a fractal landscape made up of a large hill covered
with tiny mounds would be close to the second dimension, while a rough surface
composed of many medium-sized hills would be close to the third dimension [20].
Figure 8. Fern Leaf
Figure 9. Classical geometry objects
Figure 10. Fractal Curves
3.5 Explanation of Fractal Geometry and Fractal Dimension
Fractal dimension can be demonstrated by first defining a fractal set as:
$$N_i = \frac{C}{r_i^{D}} \qquad (1)$$
where $N_i$ is the number of fragments with the linear dimension defined as $r_i$, $C$ is some
constant, and $D$ defines the fractal dimension. If this equation is rearranged with simple
algebra, the outcome is:
$$D = \frac{\ln(N_{i+1}/N_i)}{\ln(r_i/r_{i+1})} \qquad (2)$$
Given a line of unit length, we can divide it in varying ways and do different
things with each segment. In Figure 11.a, the segment is divided into two parts,
making $r_1 = 1/2$, where $r$ is the length of each division [17]. One of the parts is kept and
the other is discarded, so $N_1 = 1$. If we divide the remaining segment into two parts
and again keep only one of the fragments, then $r_2 = 1/4$ and $N_2 = 1$. If this process is
repeated (iterated), D turns out to be zero, which is equivalent to the Euclidean
point. Regardless of the number of iterations, at order n, $N_n = 1$. Hence, D will always be
zero. This makes sense because if we take a line segment and
continually divide it into two, keeping only one of the pieces, the length of the line
segment approaches zero as the order approaches infinity.
Figure 11. Demonstration of fractal dimensions with Euclidean line segments.
A Euclidean line, which exists in the first dimension, can be demonstrated just as
simply. This example is modeled in Figure 11.b. The line segment is again divided into
two parts; however, we keep all the fragments, so we get $r_1 = 1/2$ and $N_1 = 2$. Iterating again,
we get $r_2 = 1/4$ and $N_2 = 4$. Hence, $D = \ln(2)/\ln(2) = 1$. This also makes sense because we never
remove any part of the line, so it always remains of unit length.
In the first two examples, the results are both Euclidean figures with dimensions of
zero and one, respectively. It is, however, just as easy to create a line segment with a
fractal dimension between zero and one. In Figure 11.c the line is divided into
three parts and only the two end pieces are kept. After the first iteration, we get
$r_1 = 1/3$ and $N_1 = 2$. When this process is repeated, we get $r_2 = 1/9$ and $N_2 = 4$.
Therefore, $D = \ln(2)/\ln(3) = 0.6309$. To show how to generate line segments with a varying
fractal dimension, we start with a line segment of unit length and divide it into five
distinct parts as in Figure 11.d. By keeping only the two end pieces and the center piece,
we get $r_1 = 1/5$ and $N_1 = 3$. Iterating again, we get $r_2 = 1/25$ and $N_2 = 9$. In this
example $D = \ln(3)/\ln(5) = 0.6826$. As this process is iterated, the resulting infinite set of points is
called a dust.
Fractal dimensions are not limited to being between zero and one. Applying
the same method to the Euclidean square produces objects with a fractal dimension
between zero and two. In each of the following examples, each square is divided
into nine squares of equal size, making $r_1 = 1/3$. The iterations continue n times [17].
Figure 12.a demonstrates the Euclidean point, by keeping only one square with each
iteration, making $N_1 = N_2 = 1$. In Figure 12.b, we keep only the top three squares with
each iteration, making $N_1 = 3$ and $N_2 = 9$. Through this process we obtain a Euclidean
line with a dimension of one. The last Euclidean figure which can be derived from this
example is the plane in Figure 12.c. To accomplish this, we keep all the squares with
each iteration.
To produce a figure with a fractal dimension, we keep only the two pieces in the
upper left and lower right corners with each iteration as in Figure 12.d, making $N_1 = 2$
and $N_2 = 4$. Hence, at the second order $D = \ln(2)/\ln(3) = 0.6309$. On the other hand, if we
remove only the center piece with each iteration, as in Figure 12.e, then we get $N_1 = 8$
and $N_2 = 64$. This example produces a fractal dimension of $\ln(8)/\ln(3) = 1.8928$.
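As a quick numerical check of the line-segment and square constructions above, the following Python sketch evaluates the relation D = ln(N2/N1) / ln(r1/r2) for two successive orders of a construction. The function name `fractal_dimension` and its argument names are illustrative choices and do not come from the cited sources.

```python
import math

def fractal_dimension(n1: int, r1: float, n2: int, r2: float) -> float:
    """D = ln(N2/N1) / ln(r1/r2) for two successive orders of a construction."""
    return math.log(n2 / n1) / math.log(r1 / r2)

# Figure 11.a: halve the segment and keep one piece (a point).
print(round(fractal_dimension(1, 1/2, 1, 1/4), 4))    # 0.0
# Figure 11.b: halve the segment and keep both pieces (a line).
print(round(fractal_dimension(2, 1/2, 4, 1/4), 4))    # 1.0
# Figure 11.c: cut into thirds and keep the two end pieces.
print(round(fractal_dimension(2, 1/3, 4, 1/9), 4))    # 0.6309
# Figure 11.d: cut into fifths and keep three pieces.
print(round(fractal_dimension(3, 1/5, 9, 1/25), 4))   # 0.6826
# Figure 12.e: cut a square into nine and remove only the center.
print(round(fractal_dimension(8, 1/3, 64, 1/9), 4))   # 1.8928
```

Only the ratios N2/N1 and r1/r2 enter the result, which is why any two successive orders of the same construction give the same D.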
Figure 12. Demonstration of fractal dimensions with Euclidean planes.
3.5.1 Calculating Fractal Dimension
For certain objects which we have dealt with all of our lives, such as squares, lines,
and cubes, it is easy to assign a dimension. We intuitively feel that a square has two
dimensions, a line has one dimension, and a cube has three dimensions. We might feel
this way because there are two directions in which we can move on a square, one
direction on a line, and three directions in a cube. On a fractal, however, the number of
directions in which we can move is not the same everywhere: sometimes we can move in a
certain number of directions and sometimes in a different number of
directions. This is what causes fractal dimensions to be non-integers.
To derive a formula for calculating fractal dimensions which will work with all
figures, let’s first look at how to calculate the dimensions for the figures which we
already know. A line can be divided into $n = n^1$ separate pieces. Each of those pieces is
$1/n$ the size of the whole line and each piece, if magnified n times, would look exactly the
same as the original.
Repeating the process for a square, we find that it can be divided into $n^2$ pieces. The
same concept holds true for a cube: we need $n^3$ pieces to reassemble a cube. Each of the
pieces would be $1/n$ the size of the whole figure. The exponent in each of these examples
is the dimension. For fractals, we need a generalized formula, which can be derived from
what we already know. Because of the way in which this formula ends up, it is
independent of the base used for the logarithms.
For a line: number of pieces = $n^1$
For a square: number of pieces = $n^2$
For a cube: number of pieces = $n^3$
If we look back at Figures 11 and 12, they were divided into pieces that, when magnified
n times, reproduce the starting figure. Because of this, we divide the natural logarithm of the number of
divisions by the natural logarithm of the magnification factor. The resulting formula
gives the dimension, represented by D [17].
$$D = \frac{\ln(\text{number of divisions})}{\ln(\text{magnification factor})} \qquad (3)$$
For a line: $D = \ln(n)/\ln(n) = 1$
For a square: $D = \ln(n^2)/\ln(n) = 2$
For a cube: $D = \ln(n^3)/\ln(n) = 3$
Each of these examples was easy because the magnification factor was always n. For
fractals, however, the magnification factor is a constant that varies from fractal to fractal.
3.5.2 Examples of Deterministic Fractals and its applications
• The Koch Curve
All the previous examples dealt with removing pieces from various
geometric figures. Fractals and fractal dimensions can also be defined by adding onto
geometric figures. The Koch curve is named after Helge von Koch, who introduced it in 1904. The
generation of this fractal is simple. We begin with a straight line of unit length and divide
it into three equally sized parts. The middle section is replaced with an equilateral
triangle whose base is removed. After one iteration, the length has increased to four-thirds of its original value.
As this process is repeated, the length of the figure tends to infinity as the length of the
side of each new triangle goes to zero. Assuming this could be iterated an infinite
number of times, the result would be as in Figure 13 [20], which is infinitely wiggly,
having no straight lines whatsoever. Fractals of this type, which are constructed by humans
following a fixed rule, are called Deterministic Fractals [17].
Figure 13. The Koch Curve
To calculate the dimension of the Koch Curve, we look at the image of the fractal
and realize that it has a magnification factor of three and with each iteration, it is divided
into four smaller pieces. Knowing this, we get :
D = ln(4) / ln(3)
D = 1.3863 / 1.0986
D = 1.2619
The Koch Curve has a dimension of 1.2619.
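The calculation above can be reproduced with a few lines of Python. This is a minimal sketch of D = ln(number of pieces) / ln(magnification factor), together with the growth of the curve length by a factor of 4/3 per iteration; the function names `similarity_dimension` and `koch_length` are hypothetical helpers introduced only for illustration.

```python
import math

def similarity_dimension(pieces: int, magnification: float) -> float:
    """D = ln(number of pieces) / ln(magnification factor)."""
    return math.log(pieces) / math.log(magnification)

def koch_length(iterations: int, initial_length: float = 1.0) -> float:
    """Each iteration replaces every segment by four segments one third as long."""
    return initial_length * (4.0 / 3.0) ** iterations

print(round(similarity_dimension(4, 3), 4))          # Koch curve: 1.2619
print(round(similarity_dimension(9, 3), 4))          # square cut into 3 x 3 pieces: 2.0
print([round(koch_length(i), 3) for i in range(4)])  # [1.0, 1.333, 1.778, 2.37]
```

The second call recovers the integer dimension of a square, which shows that the same formula covers both Euclidean and fractal figures.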
• The Sierpinski Triangle
The Sierpinski triangle in Figure 14 [20] is created by infinite removals. Each triangle is
divided into four smaller triangles, and the central, upside-down one is
removed. As this process is iterated an infinite number of times, the total area of the set
tends to zero as the size of each new triangle goes to zero [18].
Figure 14. The Sierpinski Triangle
After closer examination of the process used to generate the Sierpinski Triangle and the
image produced by this process, we realize that the magnification factor is two. With
each magnification, there are three divisions of the triangle. With this data, we get:
D = ln(3) / ln(2)
D = 1.0986 / 0.6931
D = 1.5850
The Sierpinski Triangle has a dimension of 1.5850 [17].
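A short numerical sketch can make the construction concrete. Assuming an equilateral starting triangle of unit side, each iteration keeps three of the four sub-triangles, so the number of triangles is multiplied by 3 while their side is halved. The function `sierpinski_stats` is a hypothetical helper written for this illustration.

```python
import math

def sierpinski_stats(iterations: int, side: float = 1.0):
    """Return (triangle count, total area, total perimeter) after the given
    number of iterations, starting from one equilateral triangle."""
    count = 3 ** iterations                  # three sub-triangles kept per triangle
    small_side = side / (2 ** iterations)    # each kept triangle has half the side
    area = count * (math.sqrt(3) / 4.0) * small_side ** 2
    perimeter = count * 3 * small_side
    return count, area, perimeter

for n in (0, 1, 2, 5, 10):
    count, area, perimeter = sierpinski_stats(n)
    print(n, count, round(area, 5), round(perimeter, 3))
```

In this sketch the retained area is multiplied by 3/4 at every iteration while the total perimeter is multiplied by 3/2, which is the behavior one expects from a set whose dimension lies between one and two.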
3.5.3 Fractals and Fractal Geometry applications
Fractal geometry has permeated many areas of science, such as astrophysics and the
biological sciences, and has become one of the most important techniques in computer
graphics.

Fractals in astrophysics
Astrophysicists believe that the key to understanding how stars formed and ultimately
found their home in the Universe is the fractal nature of interstellar gas. Fractal
distributions are hierarchical, like smoke trails or billowy clouds in the sky [20].
Turbulence shapes both the clouds in the sky and the clouds in space, giving them an
irregular but repetitive pattern that is impossible to describe without the help of
fractal geometry.

Fractals in the Biological Sciences
Biologists have traditionally modeled nature using Euclidean representations of
natural objects or series. They represented heartbeats as sine waves, conifer trees as
cones, animal habitats as simple areas, and cell membranes as curves or simple surfaces.
However, scientists have come to recognize that many natural constructs are better
characterized using fractal geometry. Scientists discovered that the basic architecture of a
chromosome is tree-like; every chromosome consists of many 'mini-chromosomes', and
can therefore be treated as a fractal [18]. For a human chromosome, for example, the fractal
dimension D equals 2.34 (between the plane and the space dimension). Self-similarity has
also been found in DNA sequences. In the opinion of some biologists, the fractal properties
of DNA can be used to resolve evolutionary relationships in animals.
The human body is also governed by fractal rhythms, such as those recorded in ECG signals.
ECG signals, oscillating at the borderline between chaos and order, have such a fractal
nature. If the beat is too periodic, heart failure might be the result, but a heart attack
might occur when it is too aperiodic.
Another characteristic of fractals is encountered with the fibrillating heart. For
the normal heart, an electrical signal is sent in a regulated wave through the entire
three-dimensional structure, causing each cell to contract and then relax. This wave is
somehow broken up in the fibrillating heart, leaving the organ never entirely
relaxed or entirely in contraction at any moment. This uncoordinated wave can cause the blockage of arteries and
can eventually lead to the death of the contracting organ [21].
• Fractals in computer graphics
The biggest use of fractals in everyday life is in computer science. Many image
compression schemes use fractal algorithms to compress computer graphics files to less
than a quarter of their original size.
Computer graphic artists use many fractal forms to create textured landscapes and
other intricate models. But fractal signals can also be used to model natural objects,
allowing us to define mathematically our environment with a higher accuracy than ever
before as we will see in analysis of ECG biomedical signals in this thesis.
3.6 Fractal Features Extraction From ECG Signals
FD is a descriptive measure that has proven useful in quantifying the
complexity or self-similarity of biomedical signals. Such analysis of the complexity of
biomedical signals helps us to study the physiological processes underlying these systems.
The FD can be used to study the dynamics of transitions between different states of systems
such as the heart, and under various physiological and pathological conditions [23]. Since the ECG
signal of a human heart is a self-similar object, it has a fractal dimension that
can be extracted using mathematical methods to help identify and distinguish
specific pathological conditions of the heart. Several methods have been proposed in
the literature to estimate the FD of signals or time series data in either the time or the frequency
domain. Analysis in the time domain processes the signal data directly, while analysis
in the frequency domain requires a Fourier or wavelet transform of the signal [25]. This
section investigates time domain methods for computing FD values from ECG time
series, based on fractal geometry, in order to extract their main features.
3.6.1 Time Domain Methods of Estimating FD
Herein, the fractal complexity of the signal is characterized in real-time by computing its
FD using Katz’s method, Higuchi’s method and Hurst’s method.
A. Katz’s method
The FD of a signal curve, based on Katz’s method [24], can be defined as:
$$FD = \frac{\log(L)}{\log(d)} \qquad (4)$$
where L is the total length of the signal curve, i.e. the sum of the distances between successive points,
and d is the diameter, estimated as the distance between the first data point and the data
point farthest from it.
d and L can be expressed mathematically as:
$$d = \max_{i}\ \operatorname{dist}(1, i) \qquad (5)$$
$$L = \sum_{i=1}^{N-1} \operatorname{dist}(i, i+1) \qquad (6)$$
where dist(i, j) is the distance between points i and j of the curve.
Normalizing the distances in (4) by the average distance between successive points, say a,
gives:
$$FD = \frac{\log(L/a)}{\log(d/a)} \qquad (7)$$
Defining n as the number of steps in the signal curve, which is one less than the number of points N,
we have n = L/a. Substituting n in (7), FD according to Katz’s approach is expressed as:
$$FD = \frac{\log(n)}{\log(n) + \log(d/L)} \qquad (8)$$
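The following Python sketch illustrates Katz's estimate, FD = log(n) / (log(n) + log(d/L)). It is written under stated assumptions rather than taken from the thesis: the curve points are taken as (i, x_i) with unit sample spacing, distances are Euclidean, and the function name `katz_fd` is hypothetical.

```python
import math

def katz_fd(signal):
    """Katz fractal dimension of a sampled waveform.

    Assumption: point i of the curve is (i, signal[i]) and distances
    are Euclidean distances in that plane.
    """
    points = list(enumerate(signal))

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # L: total curve length, the sum of distances between successive points.
    total_length = sum(dist(points[i], points[i + 1])
                       for i in range(len(points) - 1))
    # d: distance from the first point to the point farthest from it.
    diameter = max(dist(points[0], p) for p in points[1:])
    # n: number of steps, one less than the number of points.
    n_steps = len(points) - 1
    return math.log10(n_steps) / (math.log10(n_steps)
                                  + math.log10(diameter / total_length))

flat = [0.0] * 10                   # a straight line
zigzag = [0, 1, 0, 1, 0, 1, 0, 1]   # a rougher curve
print(katz_fd(flat), katz_fd(zigzag))
```

A straight line gives exactly 1 because its length equals its diameter, while the zigzag curve gives a value above 1. The base of the logarithm does not affect the result.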
B. Higuchi’s method
Higuchi proposed an efficient algorithm to calculate the FD directly from a time series
[28]. Assume a one-dimensional time series X = {X(1), X(2), X(3), …, X(N)}, where N
is the total number of samples; in our case the series X would be the successive values
of the ECG signal. Higuchi’s algorithm constructs k new time series as:
$$X_k^m = \{X(m),\ X(m+k),\ X(m+2k),\ \ldots,\ X(m+Mk)\}, \quad m = 1, 2, \ldots, k \qquad (9)$$
where k and m are integers that represent the time interval between points and the initial time value,
respectively, and $M = \lfloor (N-m)/k \rfloor$.
For each new time series $X_k^m$ constructed, the length $L_m(k)$ is computed as:
$$L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{M} \left|X(m+ik) - X(m+(i-1)k)\right|\right)\frac{N-1}{Mk}\right] \qquad (10)$$
where $(N-1)/(Mk)$ is a normalization factor for the curve length of $X_k^m$.
The length of the series L(k) for the time interval k is computed as the mean of the k
values $L_m(k)$, for m = 1, 2, ..., k:
$$L(k) = \frac{1}{k}\sum_{m=1}^{k} L_m(k) \qquad (11)$$
If L(k) is proportional to $k^{-FD}$, then the curve describing the shape of the ECG time series
is fractal-like with the dimension FD. In this case, if ln(L(k)) is plotted against ln(1/k), k
= 1, 2, 3, ..., $k_{max}$, the points fall on a straight line with a slope equal to FD. The fractal
dimension of the ECG signal is calculated via the above method while applying adaptive and
fixed windowing methods.
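The construction above can be sketched in Python as follows. This is an illustrative implementation, not the code used in the thesis: the function names `higuchi_fd` and `_slope` and the default `k_max=8` are assumptions, and the slope is obtained from an ordinary least-squares fit of ln L(k) against ln(1/k).

```python
import math
import random

def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension; requires len(x) > k_max and a non-constant x."""
    n_samples = len(x)
    log_inv_k, log_length = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):              # 1-based m, as in the text
            steps = (n_samples - m) // k       # M = floor((N - m) / k)
            if steps < 1:
                continue
            total = sum(abs(x[m + i * k - 1] - x[m + (i - 1) * k - 1])
                        for i in range(1, steps + 1))
            # L_m(k): normalized curve length of the sub-series X_k^m
            lengths.append(total * (n_samples - 1) / (steps * k) / k)
        mean_length = sum(lengths) / len(lengths)   # L(k)
        log_inv_k.append(math.log(1.0 / k))
        log_length.append(math.log(mean_length))
    return _slope(log_inv_k, log_length)

random.seed(0)
ramp = list(range(100))                              # smooth curve
noise = [random.gauss(0, 1) for _ in range(1000)]    # very rough curve
print(round(higuchi_fd(ramp), 4), round(higuchi_fd(noise), 2))
```

A straight ramp yields a dimension of 1, while white noise yields a value close to 2, which brackets the range expected for signal curves.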
C. Rescaled Range (R/S) Method
Hurst developed the R/S method, which is a statistical technique to analyze a large
number of natural phenomena [19]. The R/S method is one of the oldest and best known
methods for estimating H (the Hurst parameter). Let $\{X_k\}$, k = 1, 2, 3, ..., N be a set of N
sample points of an ECG recording. The mean $\bar{X}(N)$ and the standard deviation S(N) of
these points are, respectively,
$$\bar{X}(N) = \frac{1}{N}\sum_{k=1}^{N} X_k \quad \text{and} \quad S(N) = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(X_k - \bar{X}(N)\right)^2}$$
The R/S statistic, or rescaled adjusted range, is defined by the ratio:
$$\frac{R(N)}{S(N)} = \frac{\max(0, W_1, W_2, \ldots, W_N) - \min(0, W_1, W_2, \ldots, W_N)}{S(N)} \qquad (12)$$
where
$$W_k = (X_1 + X_2 + X_3 + \ldots + X_k) - k\,\bar{X}(N), \qquad (13)$$
k = 1, 2, 3, ..., N.
Hurst found empirically that many time series observed in nature are well
represented by the relation
$$E\left[\frac{R(N)}{S(N)}\right] \approx C\,N^{H} \qquad (14)$$
where C is a finite positive constant. By taking logs we obtain:
$$\log\left(E\left[\frac{R(N)}{S(N)}\right]\right) = \log(C) + H\log(N) \qquad (15)$$
Therefore, the slope of a plot of log(R/S) against log(N) provides the Hurst parameter
H [27]. The relation between the Hurst exponent and the fractal dimension is simply
FD = 2 - H. The fractal dimension can therefore be
easily evaluated with the help of these equations in rescaled range analysis.
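A minimal Python sketch of rescaled range analysis is given below. It makes two assumptions that are not stated in the text: the expectation of R/S for a given N is approximated by averaging over non-overlapping blocks of that size, and the block sizes (16 to 256) are arbitrary illustrative values. The function names are hypothetical.

```python
import math
import random

def rescaled_range(x):
    """R/S statistic of one block, using the adjusted partial sums W_k."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    w, running = [], 0.0
    for k, v in enumerate(x, start=1):
        running += v
        w.append(running - k * mean)        # W_k = (X_1 + ... + X_k) - k * mean
    return (max(0.0, *w) - min(0.0, *w)) / std

def hurst_rs(x, block_sizes=(16, 32, 64, 128, 256)):
    """Slope of log(mean R/S) against log(block size)."""
    log_n, log_rs = [], []
    for size in block_sizes:
        blocks = [x[i:i + size] for i in range(0, len(x) - size + 1, size)]
        mean_rs = sum(rescaled_range(b) for b in blocks) / len(blocks)
        log_n.append(math.log(size))
        log_rs.append(math.log(mean_rs))
    mx, my = sum(log_n) / len(log_n), sum(log_rs) / len(log_rs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(log_n, log_rs))
    sxx = sum((a - mx) ** 2 for a in log_n)
    return sxy / sxx

random.seed(0)
noise = [random.gauss(0, 1) for _ in range(4096)]
h = hurst_rs(noise)
print(round(h, 2), round(2 - h, 2))   # estimated H and FD = 2 - H
```

For uncorrelated noise the estimate of H is expected to lie near 0.5, although R/S estimates are known to be biased for short blocks.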
Chapter 4
ECG Feature Extracting using PSM and Classification with SVM
4.1 Introduction
As seen in the previous chapter, the Hurst parameter (dimension) H
measures the self-affinity of a time series in the time domain. Herein, I
describe this feature by processing the time series in the
frequency domain, in which I assume that the power spectrum of the signal is
dominated by a Random Scaling Fractal (RSF) model $P(f) = c/|f|^{2q}$, where c > 0. An
automatic ECG arrhythmia diagnosis method based on SVM is then proposed, using
estimates of the Fractal Dimension (FD) of ECG recordings obtained with
the Power Spectrum Method (PSM), which makes use of the characteristics of the Power
Spectral Density Function (PSDF) of a Random Scaling Fractal signal. Thirty-one data sets of
ECG signals taken from the MIT-BIH arrhythmia database [16] have been used to
estimate the FD. In the following section I introduce the power spectrum method
(PSM), which relies on frequency analysis to capture the fractal
behavior of ECG signals based on the RSF model.
4.2 Feature Extraction using Power Spectrum Method (PSM)
Fractals are applicable when the underlying process being mathematically
modeled has a similar appearance regardless of the scale over which it is observed. It
turns out that many natural signals can be modeled using fractals. Many signals
observed in nature are random fractals, including biomedical signals such as the ECG time
series. Random Scaling Fractal (RSF) signals are signals whose probability
distribution function (PDF) has the same ‘shape’ irrespective of the scale over which
they are observed. Accordingly, random fractal signals are statistically self-similar
(self-affine), that is, they are self-similar in a statistical sense [24]. The ECG time
series exhibits this self-affinity, so it can be considered an
example of an RSF signal. RSF signals are characterized by power spectra whose
frequency distribution is proportional to $1/\omega^{2q}$, where $\omega$ is the frequency and q > 0 is the
‘Fourier Dimension’, a value that is simply related to the Fractal Dimension FD (denoted D) and the
Hurst (dimension) parameter H by the relation q = H + 1/2 = (5 - 2D)/2. This power
law describes the conventional RSF models, which are based on stationary processes
in which the ‘statistics’ of the RSF signals are invariant in time and the value of q is
constant. Assume X(t) is an ECG time series in the time domain, which is
assumed to be a self-affine signal. Figure 15 and Figure 16 each plot
1024 points of a normal and an abnormal ECG signal, respectively, together with a
zoomed-in version of 512 points from each of them. The power spectrum of
such a signal can be written as $P(\omega) = |X(\omega)|^2$, where $X(\omega)$ is the Fast Fourier Transform
(FFT) of the time series in the frequency domain (i.e. $X(\omega) = \mathcal{F}[X(t)]$). For such a time
series the power spectrum $P(\omega)$ obeys the RSF model
$$P(\omega) = \frac{c}{|\omega|^{2q}} \qquad (16)$$
Figure 17 and Figure 18 show examples of the measured power
spectrum of a normal and an abnormal ECG signal, respectively, for different window sizes.
These figures provide evidence that the power spectrum of the ECG time series
obeys the RSF model. The behavior of ECG signals can be characterized by
estimating the parameter q in the proposed model, where the estimated value of this
parameter reflects the degree of self-similarity (fractality) in the ECG signal. To do this,
the least squares technique is applied to the measurements of ECG signals as follows:
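Let $X_1, X_2, X_3, \ldots, X_N$ (N being a power of 2) be sample points of an ECG signal, and
consider the case in which the digital power spectrum $P_i \equiv P(\omega_i)$ is given by applying
an FFT to this time series. This data can be approximated by:
$$\hat{P}_i = \frac{c}{|\omega_i|^{2q}} \qquad (17)$$
or
$$\ln \hat{P}_i = C - 2q \ln|\omega_i| \qquad (18)$$
If we consider the error function
$$e(q, C) = \sum_{i=1}^{N}\left(\ln P_i - \ln \hat{P}_i\right)^2 \qquad (19)$$
where $C = \ln c$, and it is assumed that the spatial frequency $\omega_i > 0$ and the measured power
spectrum $P_i > 0$ for all i, then the solution of the equations $\partial e/\partial q = 0$ and
$\partial e/\partial C = 0$ (least squares method)
gives:
$$q = \frac{N\sum_{i=1}^{N}(\ln P_i)(\ln \omega_i) - \left(\sum_{i=1}^{N}\ln \omega_i\right)\left(\sum_{i=1}^{N}\ln P_i\right)}{2\left[\left(\sum_{i=1}^{N}\ln \omega_i\right)^2 - N\sum_{i=1}^{N}(\ln \omega_i)^2\right]} \qquad (20)$$
and
$$C = \frac{1}{N}\sum_{i=1}^{N}\ln P_i + \frac{2q}{N}\sum_{i=1}^{N}\ln \omega_i \qquad (21)$$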
Since the power spectrum of a real signal of size N is symmetric about the DC level,
where the DC level is taken to be at the midpoint + 1 of the array, in practice only the
data that lies to the right of DC is used [24].
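The least-squares estimate can be illustrated with a short Python sketch. It assumes the power-law model P = c / ω^(2q), so that ln P is a straight line in ln ω with slope -2q and intercept C = ln c; under that assumption the closed-form solution is the ordinary regression formula. The function name `psm_fit` and the synthetic test spectrum are illustrative only.

```python
import math

def psm_fit(omega, power):
    """Least-squares fit of ln P = C - 2 q ln(omega).

    Assumes every frequency and every power value is strictly positive.
    Returns (q, C) with C = ln c.
    """
    n = len(omega)
    log_w = [math.log(w) for w in omega]
    log_p = [math.log(p) for p in power]
    sum_w = sum(log_w)
    sum_p = sum(log_p)
    sum_wp = sum(a * b for a, b in zip(log_w, log_p))
    sum_ww = sum(a * a for a in log_w)
    q = (n * sum_wp - sum_w * sum_p) / (2.0 * (sum_w ** 2 - n * sum_ww))
    big_c = sum_p / n + 2.0 * q * sum_w / n
    return q, big_c

# Synthetic spectrum that follows the model exactly, with q = 0.8 and c = 3.
omega = [float(k) for k in range(1, 257)]
power = [3.0 / w ** (2 * 0.8) for w in omega]
q, big_c = psm_fit(omega, power)
print(round(q, 6), round(math.exp(big_c), 6))   # 0.8 3.0
```

On measured spectra the points scatter around the fitted line, so q is only an estimate; the synthetic example simply confirms that the formulas recover known parameters.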
Figure 15. (left) Normal Sinus Rhythm ECG signal of size 1024, (right) Zoom in version of Normal Sinus Rhythm ECG signal of size 512
Figure 16. (left) Atrial Premature Arrhythmia ECG signal of size 1024, (right) Zoom in version of Atrial Premature Arrhythmia ECG signal of size 512
Figure 17. Measured power spectrum of Normal Sinus Rhythm ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128
Figure 18. Measured power spectrum of Atrial Premature ECG signal (left-to-right and top-to-bottom) for window size 1024; 512; 256; 128
4.2.1 PSM Methodology Algorithm
The ECG time series exhibits self-affinity, so it can be considered
an example of an RSF signal. To estimate the fractal parameter of this series, I
converted it to the frequency domain, in which I assumed that the empirical power spectrum
of each series has an envelope Power Spectral Density Function (PSDF) given
by the RSF model $P(\omega) = c/|\omega|^{2q}$. Using a moving window technique, I chose a
window of size N to move over the points of the time series to be analyzed. For each
window segment I applied the PSM to estimate the Fourier Dimension q, after
normalizing the segment and transforming it to the spectral domain.
The following algorithm summarizes the steps of the methodology,
which is illustrated by the block diagram in Figure 19.
Step 1: Use a window of size N = 512 over the points of a given ECG time series to
extract a signal array of points, say $X_i$, i = 1, 2, ..., N. This process is applied to
38720 cardiac beats for 31 persons.
Step 2: Normalize the signal obtained in Step 1 to give $Y_i$.
Step 3: Compute the Discrete Fourier Transform (DFT) of $Y_i$ using a Fast Fourier
Transform (FFT) and apply shifting (fftshift in Figure 19) to yield $Z_i$.
Step 4: Compute the empirical power spectrum of $Z_i$: $P_i = |Z_i|^2$.
Step 5: Extract the right half of the computed power spectrum.
Step 6: Compute the parameter q using the computational formula of the PSM given in
equation (20).
Step 7: Iterate Step 1 through Step 6 until the end of the time series.
Step 8: Compute the Fractal Dimension D, where $D = (5 - 2q)/2$.
[Block diagram: ECG time series X, normalization, Y, FFT, Z, fftshift, power spectrum P, PSM, estimated value of q]
Figure 19. Estimate the Fourier Dimension q of ECG Signal .
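The steps of the methodology can be sketched end to end in Python. This is a hedged illustration rather than the thesis implementation: the normalization (division by the peak absolute value), the non-overlapping window step, the power-law exponent 2q and all function names are assumptions, and a small pure-Python radix-2 FFT stands in for a library routine. Taking the positive-frequency bins 1 to N/2 - 1 is equivalent to taking the half of the shifted spectrum that lies to the right of DC.

```python
import cmath
import math
import random

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def estimate_q(window):
    """Steps 2 to 6 for one window: normalize, FFT, power spectrum, fit q."""
    n = len(window)
    peak = max(abs(v) for v in window) or 1.0
    y = [v / peak for v in window]                 # assumed normalization
    z = fft(y)
    bins = range(1, n // 2)                        # positive frequencies only
    xs = [math.log(k) for k in bins]
    ys = [math.log(max(abs(z[k]) ** 2, 1e-300)) for k in bins]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return -slope / 2.0                            # ln P = C - 2 q ln(omega)

def moving_window_fd(signal, size=512, step=512):
    """Return a list of (q, D) pairs, one per window, with D = (5 - 2q) / 2."""
    results = []
    for start in range(0, len(signal) - size + 1, step):
        q = estimate_q(signal[start:start + size])
        results.append((q, (5.0 - 2.0 * q) / 2.0))
    return results

random.seed(1)
demo = [random.gauss(0, 1) for _ in range(2048)]   # stand-in for an ECG record
for q, d in moving_window_fd(demo):
    print(round(q, 3), round(d, 3))
```

With real data the `demo` list would be replaced by the samples of an ECG recording; white noise is used here only because its flat spectrum should give q close to zero.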
4.3 Arrhythmia Classification based on (SVM)
Finding arrhythmia characteristics corresponding to Premature Ventricular
Contraction (PVC), Atrial Premature (AP), and Bundle-Branch Block (BBB) [22] from
ECG recordings has received considerable attention in recent years. Differences between
normal and abnormal ECG signals cannot be easily determined, especially with the human
eye. Developing an intelligent method for identifying such cardiac diseases is very
helpful in the biomedical field, as it will aid clinical staff in the absence of a doctor.
It will also help doctors to diagnose and act faster in emergency conditions.
The Support Vector Machine (SVM) has the advantages of highly accurate
classification, a simple architecture, less overfitting and robustness to noise. As seen
in chapter 5 of this thesis, the Fractal Dimension can quantitatively describe the non-linear
behavior of signals, and thus it can be used as a feature for diagnosing ECG arrhythmia.
The support vector machine deals with pattern classification, that is,
classifying different types of patterns. Patterns may be linear
or non-linear. Linear patterns are easily distinguishable, or can be easily
separated in a low dimension, whereas non-linear patterns are not easily
distinguishable or cannot be easily separated, and hence need to
be further manipulated so that they can be easily separated. Figure 20 [28] shows the
main idea of SVM: for linearly separable patterns, construct an optimal hyper plane
for classification that maximizes the margin of the
hyper plane, i.e. the distance from the hyper plane to the nearest point of each pattern.
The main objective of SVM is to maximize the margin so that it can correctly
classify the given patterns: the larger the margin, the more reliably the patterns are
classified. The equation shown below is the hyper plane representation:
aX + bY = C        (22)
Figure 21 [28] shows the basic idea of the hyper plane in three dimensions when it is
used to separate two different patterns. The figure shows three lines that separate the
two patterns in 3-D space: the central separating (marginal) line and two other lines,
one on either side of it, on which the support vectors are located.
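As an illustration of the hyper plane and its margin, the following minimal Python sketch evaluates the decision value of a point and the margin width 2/||w||. The weight vector, offset, and sample point are hypothetical values chosen only for this example; in terms of Equation (22), the weights correspond to (a, b) and the offset to -C.

```python
import math


def decision_value(weights, offset, point):
    """Signed score w.x + offset; its sign tells on which side of the hyper plane the point lies."""
    return sum(w * x for w, x in zip(weights, point)) + offset


def margin_width(weights):
    """Width of the margin band, 2 / ||w||, for a hyper plane in canonical form."""
    return 2.0 / math.sqrt(sum(w * w for w in weights))


# Hypothetical hyper plane 3X + 4Y = 5, i.e. weights (3, 4) and offset -5.
weights, offset = [3.0, 4.0], -5.0
print(decision_value(weights, offset, [3.0, 4.0]))  # positive: the point lies on the positive side
print(margin_width(weights))                        # 2 / 5
```

A larger norm of the weight vector gives a narrower margin, which is why SVM training minimizes that norm.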
Figure 20 SVM Model
Figure 21 Hyper Plane
For non-linearly separable patterns, the given patterns are mapped into a new space,
usually of higher dimension, so that they become linearly separable. This is
done by using a kernel function with an associated mapping φ(x), i.e. x → φ(x).
Selecting the kernel function is an important aspect of SVM-based
classification. Commonly used kernel functions include Linear, Polynomial (Poly), and
Radial Basis Function (RBF).
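The three kernels named above can be sketched in a few lines of Python. This is only an illustrative sketch: the parameter names and default values (degree, coef0, sigma) are assumptions made for the example and are not the settings used in the experiments of this thesis.

```python
import math


def dot(x, y):
    """Inner product of two equally long feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """Linear kernel: the plain inner product."""
    return dot(x, y)


def poly_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel (x.y + coef0)^degree; degree and coef0 are assumed defaults."""
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 sigma^2)); sigma is an assumed default."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# One-dimensional features, as with a single FD value per window.
print(linear_kernel([1.5], [0.4]), poly_kernel([1.5], [0.4]), rbf_kernel([1.5], [0.4]))
```

The RBF kernel depends only on the distance between the two points, so it equals 1 for identical inputs and decays towards 0 as they move apart.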
Different kernel functions create different mappings and therefore different non-linear
separation surfaces. The problem of finding the optimal classifier then translates into
solving a quadratic programming problem: seek the separating hyper plane that makes
the margin between the two classes (2/||w||) as large as possible, where w is a weight
vector, which is equivalent to minimizing ||w||. It is expressed as:
Min Φ(w) = ½ (w · w),
such that: y_i (w · x_i + b) ≥ 1,  i = 1, ..., n        (23)
where x_i are the training patterns, y_i ∈ {−1, +1} are their class labels, and b is the
bias of the hyper plane.
4.3.1 Multiclass SVM
The SVM technique was originally proposed for binary classification. The
classification of ECG signals, however, often involves the simultaneous discrimination
of several information classes. To address this issue, a number of multiclass
classification strategies can be adopted [28], [29]. The most popular methods based on
combining binary SVMs are the one-against-all (OAA) and the one-against-one (OAO)
strategies. The former involves a reduced number of binary decompositions (and thus, of
SVMs), which are, however, more complex. The latter requires a shorter training time,
but may incur conflicts between classes due to the nature of the score function used for
the decision.
Both strategies generally lead to similar results in terms of classification
accuracy. In this thesis, I adopted the OAA strategy. Briefly, this strategy is based
on the following procedure. Let Ω = {ω1, ω2, ..., ωT} be the set of T possible labels
(information classes) associated with the ECG beats that we desire to classify. First, a
group of T SVM classifiers is trained. Each classifier aims at solving a binary
classification problem defined by the discrimination between one information class
ωi (i = 1, 2, ..., T) and all others (i.e., Ω − {ωi}). Then, in the classification phase, the
“winner-takes-all” rule is used to decide which label to assign to each beat. This means
that the winning class is the one that corresponds to the SVM classifier of the group that
shows the highest output (discriminant function value).
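The two ingredients of the OAA strategy, relabeling for each binary problem and the winner-takes-all decision, can be sketched as follows. This is a minimal Python illustration; the function names and the example discriminant values are hypothetical, and the binary SVMs themselves are assumed to be trained elsewhere.

```python
def one_against_all_labels(labels, positive_class):
    """Relabel a multiclass label list for one binary problem: +1 for the chosen class, -1 for all others."""
    return [1 if label == positive_class else -1 for label in labels]


def winner_takes_all(scores_by_class):
    """Return the class whose binary classifier produced the highest discriminant value."""
    return max(scores_by_class, key=scores_by_class.get)


# Labels 1..5 follow the class coding used in this chapter (1 = Normal Sinus Rhythm, ...).
labels = [1, 2, 3, 2, 5]
print(one_against_all_labels(labels, positive_class=2))

# Hypothetical outputs of the T = 5 binary classifiers for a single beat.
scores = {1: -0.7, 2: 0.9, 3: 0.1, 4: -1.2, 5: -0.3}
print(winner_takes_all(scores))
```

With T classes, the relabeling is applied T times, once per class, and each relabeled set is used to train one binary SVM.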
4.3.2 Application of OAA SVM Using Fractal Features in ECG
Arrhythmia Diagnosis
Using OAA SVM based on fractal features to diagnose ECG arrhythmia involves
extracting ECG fractal features using the Power Spectrum Method (PSM), forming the
training vector, establishing the OAA SVMs, training them, and diagnosing arrhythmia.
Figure 22 gives the block diagram of the proposed method, which can be summarized as follows:
Step 1. Extracting features: The fractal dimension (FD) reflects the fractal character of
ECG signals, and different types of ECG signals yield different FD values
when PSM is applied to them. In this step, ECG arrhythmia is deduced from the different
ranges of FD for healthy and non-healthy persons. PSM is applied to 38720 cardiac beats
from 31 persons. The selected beats are sampled by applying a windowing technique
with a window size of 512, and the fractal dimension of each window is extracted as the
feature. This process is repeated for all 31 persons.
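The windowing in Step 1 can be sketched in Python as below. The sketch assumes non-overlapping windows and drops any incomplete trailing window, which mirrors the truncation `fix(length(S)./512)` in the PSM code of Appendix A; the function name is hypothetical.

```python
def split_into_windows(signal, window_size=512):
    """Cut a signal into consecutive non-overlapping windows, discarding an incomplete tail."""
    n_windows = len(signal) // window_size
    return [signal[i * window_size:(i + 1) * window_size] for i in range(n_windows)]


# Placeholder signals with the record lengths listed in Table 1.
normal_record = list(range(1280))
arrhythmia_record = list(range(3600))
print(len(split_into_windows(normal_record)))      # windows per normal record
print(len(split_into_windows(arrhythmia_record)))  # windows per arrhythmia record
```

Under these assumptions a 1280-sample record yields 2 windows and a 3600-sample record yields 7, which appears consistent with the PSM FD counts in Table 12 (19 x 2 = 38 for the normal class and 3 x 7 = 21 for each arrhythmia class).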
Step 2. Establishing and training the multi-class OAA SVM networks based on the
arrhythmia classes: In this step, the SVM classifier is built by the
following steps:
1. Load the dataset as a vector of points, which represents a total of 122 FD points
estimated by PSM in Step 1 for the 31 persons.
2. Create a two-column matrix containing the FD vector as the first column and the
labels corresponding to the FD values estimated in Step 1 as the second column,
i.e.: Normal Sinus Rhythm: 1, Ventricular Premature Arrhythmia: 2, Atrial
Premature: 3, Right bundle-branch block: 4, Left bundle-branch block: 5.
3. Create a new column vector, groups, that classifies the data into only two groups by
applying the OAA strategy.
4. Create a 5-fold cross-validation partition to randomly select training and testing
points from the groups to feed the SVM model, i.e.:
indices = crossvalind('Kfold', groups, 5); for i = 1:5, test = (indices == i); train = ~test.
5. Use the built-in Matlab function svmclassify to classify the test set vector with
the Linear, Poly, and RBF kernel functions.
6. Evaluate the performance of the classifier.
7. Repeat steps 3-6 after exchanging the two groups, so that all disease labels are covered.
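The fold assignment of step 4 can be sketched in Python as follows. This is a plain random 5-fold split written only for illustration: the function names and the seed are hypothetical, and the sketch makes no attempt to balance the two groups across folds.

```python
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Assign every sample a fold number in 1..k after a random shuffle."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    folds = [0] * n_samples
    for position, sample in enumerate(order):
        folds[sample] = position % k + 1
    return folds


def train_test_for_fold(folds, fold):
    """Return (train, test) index lists: the chosen fold is the test set, the rest is training."""
    test = [i for i, f in enumerate(folds) if f == fold]
    train = [i for i, f in enumerate(folds) if f != fold]
    return train, test


folds = kfold_indices(122, k=5)  # 122 PSM-FD points, as in step 1
for fold in range(1, 6):
    train, test = train_test_for_fold(folds, fold)
    print(fold, len(train), len(test))
```

Each point is used for testing exactly once and for training in the other four folds, so the accuracies of the five folds can be averaged into a single estimate.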
Step 3. Diagnosing arrhythmia with the trained OAA SVM: In this step, the trained
OAA SVM is used to diagnose the unknown arrhythmia. Fractal features are first
extracted from the testing samples using PSM. These features are then passed to the
trained multi-class OAA SVM classifier, and the diagnosis results are obtained.
Figure 22. SVM method using fractal features for ECG Arrhythmia diagnosis.
(Blocks in the diagram: Testing Samples of Unknown Arrhythmia ECG to be Diagnosed;
Extracting Fractal Features Using PSM; SVM Trainer; SVM Classification Model;
SVM Classifier; Diagnosis Results.)
Chapter 5
Experimental Evaluation and Discussion
5.1 Dataset Description
In this work, the fractal dimension (FD) of 31 ECG datasets has been
determined in the time domain and in the frequency domain, and ranges of FD have then
been established for a healthy person and for persons with various heart diseases. The ECG
signals for the present study were obtained from the MIT-BIH database via the PhysioNet
website [16]. The MIT-BIH database contains both normal and abnormal types of ECG
signals. In this study, the considered beats refer to the following classes: Normal Sinus
Rhythm (N), Premature Ventricular Contraction (PVC), Atrial Premature (AP), Right
Bundle-Branch Block (RBBB), and Left Bundle-Branch Block (LBBB). The beats were
selected from the recordings of 31 persons, which correspond to the following files:
17052m, 16420m, 19088m, 19093m, 16265m, 16483m, 16273m, 16549m, 16539m, 16795m,
17453m, 18177m, 18184m, 19090m, 19830m, 16786m, 16277m, 16792m, and 16272m for
Normal Sinus Rhythm (N); 200m, 208m, and 215m for Premature Ventricular
Contraction (PVC); 100m, 209m, and 223m for Atrial Premature (AP); 124m, 231m,
and 232m for Right Bundle-Branch Block (RBBB); and 214m, 109m, and 217m for Left
Bundle-Branch Block (LBBB). The properties of these signals are described in Table 1.
Figures 23 to 27 show plots of two datasets of each signal type.
Table 1. Description of the used dataset
Signal Type                        No. of samples/signal   Sampling frequency   Sample intervals
Ventricular Premature Arrhythmia   3600                    360 Hz               0.6500000 sec
Atrial Premature                   3600                    360 Hz               0.6500000 sec
Right bundle-branch block          3600                    360 Hz               0.6500000 sec
Left bundle-branch block           3600                    360 Hz               0.6500000 sec
Normal Sinus Rhythm                1280                    128 Hz               0.1132740 sec
Figure 23. (A), (B) ECG signals of Normal Rhythm
Figure 24. (A), (B) ECG signals of a Premature Ventricular Arrhythmia
Figure 25. (A), (B) ECG signals of an Atrial Premature Arrhythmia
Figure 26. (A), (B) ECG signals of Right Bundle-Branch Block Arrhythmia
Figure 27. (A), (B) ECG signals of Left Bundle-Branch Block Arrhythmia
5.2 Experimental Results
As the block diagram of the proposed method (Figure 22 in the previous chapter)
shows, the classification process must be fed with fractal features extracted from the
ECG signals. To do so, I used 31 datasets composed of ECG signals recorded from
healthy subjects and from patients with heart arrhythmia. The experiments were
performed using Matlab 7 on ECG datasets from the MIT-BIH
arrhythmia database [16].
5.2.1 Fractal Features Extraction
The FD feature of each class of ECG time-series signal was extracted using a
non-overlapping window of 512 points by means of the methods presented in section
5.2.1.1, chapter 5 of this thesis. Table 2 shows the results obtained for the estimation of
FD from the Normal heart rhythm signals, which support the view that the healthy heart
is a fractal heart, since the FD values estimated by Higuchi's method, Hurst's method,
and PSM lie between 1 and 2.
Tables 3-6 show the results obtained for the estimation of FD from the pathological
signals, and Tables 7-10 show the intervals (lower bound and upper bound) of the
estimated FD for each specific disease corresponding to each estimation method. By
comparing the estimated FD intervals shown in Tables 7-10, it is clear that only PSM
distinguishes clearly between healthy and non-healthy persons by placing each of
them in a distinct FD range. Table 11 shows the average FD values
for each ECG signal type for each estimation method. For PSM,
the average FD value for Normal Sinus Rhythm is 1.522589. For the
heart arrhythmias Left Bundle Branch Block, Right Bundle Branch Block, Atrial
Premature, and Ventricular Premature Arrhythmia, the values are lower and are equal to
0.742733, 0.438833, 0.249733, and 0.082267, respectively.
This decrease in the average FD value indicates a decrease in the heterogeneity of
the cardiac recording [23]. Meanwhile, if we
compare the average FD values of normal and abnormal heart rhythms
obtained by each of the time-domain methods (i.e., Katz's, Higuchi's, and Hurst's
methods) and by the frequency-domain method (i.e., PSM), it is clear that PSM
distinguishes between the normal condition and the pathological one more
clearly than the other methods. PSM can therefore provide a significant clinical advantage,
since it can readily be incorporated 'on line' to flag (and possibly help control) the
onset of a pathological condition, which is indicated by a drop in the FD value.
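Because the PSM ranges do not overlap, a simple range lookup already separates the five signal types. The Python sketch below illustrates this using the PSM ranges reported in Table 7; the function name is hypothetical, and the lookup is only an illustration of the separation, not the SVM classifier used in this thesis.

```python
# PSM FD ranges (lower bound, upper bound) taken from Table 7.
PSM_FD_RANGES = {
    "Ventricular Premature Arrhythmia": (0.0349, 0.1345),
    "Atrial Premature": (0.2266, 0.2790),
    "Right Bundle Branch Block": (0.3902, 0.5159),
    "Left Bundle Branch Block": (0.6954, 0.8321),
    "Normal Sinus Rhythm": (1.1189, 1.9838),
}


def signal_type_from_psm_fd(fd):
    """Return the signal type whose PSM range contains fd, or None if fd falls in a gap."""
    for signal_type, (lower, upper) in PSM_FD_RANGES.items():
        if lower <= fd <= upper:
            return signal_type
    return None


print(signal_type_from_psm_fd(1.4580))  # PSM value of record 16273m in Table 2
print(signal_type_from_psm_fd(0.2436))  # PSM value of record 100m in Table 4
print(signal_type_from_psm_fd(0.6000))  # a value between two ranges
```

Values that fall between two ranges are left undecided by this lookup, which is one reason a trained classifier is used for the actual diagnosis.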
Table 2. The Estimated FD Values for Normal Sinus Rhythm Signals
                            FD estimation methods
No.  Dataset (1280 beats)   Katz     Higuchi's   Hurst’s   PSM
1.   17052m                 2.1743   1.6051      1.3384    1.1819
2.   16420m                 2.1372   1.5768      1.3452    1.4657
3.   19088m                 1.7247   1.1140      1.3570    1.9838
4.   19093m                 1.6402   1.0492      1.3405    1.9392
5.   16265m                 2.4422   1.7299      1.3377    1.7882
6.   16483m                 2.1783   1.3252      1.3949    1.2708
7.   16273m                 2.2544   1.4789      1.3648    1.4580
8.   16549m                 2.1744   1.2713      1.3558    1.4893
9.   16539m                 2.0761   1.5325      1.3467    1.6843
10.  16795m                 2.0527   1.3285      1.3360    1.4527
11.  17453m                 2.1010   1.4149      1.4081    1.8547
12.  18177m                 2.1240   1.3793      1.1263    1.9123
13.  18184m                 2.2229   1.4506      1.3885    1.6968
14.  19090m                 1.7677   1.1015      1.3374    1.1478
15.  19830m                 1.6790   1.0295      1.3479    1.5554
16.  16786m                 1.9652   1.5126      1.3634    1.1189
17.  16277m                 1.8652   1.3061      1.3277    1.1617
18.  16792m                 2.3240   1.5122      1.1872    1.1955
19.  16272m                 1.9763   1.5072      1.1942    1.5722
Table 3. The Estimated FD Values for Ventricular Premature Arrhythmia Signals
                            FD estimation methods
No.  Dataset (3600 beats)   Katz     Higuchi's   Hurst’s   PSM
1.   200m                   1.6248   1.1312      1.2818    0.0774
2.   208m                   1.6607   1.1616      1.2415    0.0349
3.   215m                   1.8216   1.4922      1.0592    0.1345
Table 4. The Estimated FD Values for Atrial Premature Arrhythmia Signals
                            FD estimation methods
No.  Dataset (3600 beats)   Katz     Higuchi's   Hurst’s   PSM
1.   100m                   1.8583   1.3014      1.2676    0.2436
2.   209m                   2.2321   1.4429      1.0869    0.2266
3.   223m                   1.6217   1.1095      1.3258    0.2790
Table 5. The Estimated FD Values for Right Bundle Branch Block Arrhythmia Signals
                            FD estimation methods
No.  Dataset (3600 beats)   Katz     Higuchi's   Hurst’s   PSM
1.   124m                   1.8544   1.2683      1.0567    0.3902
2.   231m                   1.7198   1.1971      1.2580    0.4104
3.   232m                   1.8760   1.2378      1.2723    0.5159
Table 6. The Estimated FD Values for Left Bundle Branch Block Arrhythmia Signals
                            FD estimation methods
No.  Dataset (3600 beats)   Katz     Higuchi's   Hurst’s   PSM
1.   214m                   1.6661   1.1670      1.3110    0.8321
2.   109m                   1.7174   1.1585      1.1073    0.6954
3.   217m                   1.6644   1.1259      1.2376    0.7007
Table 7. Distinct Range of FD Values for Sample ECG Signals Using PSM
Signal Type                        Range
Ventricular Premature Arrhythmia   0.0349 - 0.1345
Atrial Premature                   0.2266 - 0.2790
Right Bundle Branch Block          0.3902 - 0.5159
Left Bundle Branch Block           0.6954 - 0.8321
Normal Sinus Rhythm                1.1189 - 1.9838
Table 8. Ranges of FD Values for Sample ECG Signal Using Katz’s Method
Signal Type                        Range
Ventricular Premature Arrhythmia   1.6248 - 1.8216
Atrial Premature                   1.6217 - 2.2321
Right Bundle Branch Block          1.7198 - 1.8760
Left Bundle Branch Block           1.6644 - 1.7174
Normal Sinus Rhythm                1.6402 - 2.4422
Table 9. Ranges of FD Values for Sample ECG Signal Using Higuchi’s Method
Signal Type                        Range
Ventricular Premature Arrhythmia   1.1312 - 1.4922
Atrial Premature                   1.1095 - 1.4429
Right Bundle Branch Block          1.1971 - 1.2683
Left Bundle Branch Block           1.1259 - 1.1670
Normal Sinus Rhythm                1.0295 - 1.7299
Table 10. Ranges of FD Values for Sample ECG Signal Using Hurst’s Method
Signal Type                        Range
Ventricular Premature Arrhythmia   1.0592 - 1.2818
Atrial Premature                   1.0869 - 1.3258
Right Bundle Branch Block          1.0567 - 1.2723
Left Bundle Branch Block           1.1073 - 1.3110
Normal Sinus Rhythm                1.1263 - 1.4081
Table 11. Average of the Estimated FD Values
Signal Type                        Katz       Higuchi's   Hurst’s    PSM
Ventricular Premature Arrhythmia   1.702367   1.261667    1.194167   0.082267
Atrial Premature                   1.904033   1.284600    1.226767   0.249733
Right Bundle Branch Block          1.816733   1.234400    1.195667   0.438833
Left Bundle Branch Block           1.682633   1.150467    1.218633   0.742733
Normal Sinus Rhythm                2.046305   1.380279    1.326195   1.522589
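The averages in Table 11 are arithmetic means of the per-record estimates. As a worked check, the following short Python sketch recomputes the PSM averages of the four arrhythmia classes from the PSM values listed in Tables 3-6; the variable and function names are chosen only for this illustration.

```python
# PSM FD estimates per record, copied from Tables 3-6.
PSM_FD_BY_CLASS = {
    "Ventricular Premature Arrhythmia": [0.0774, 0.0349, 0.1345],
    "Atrial Premature": [0.2436, 0.2266, 0.2790],
    "Right Bundle Branch Block": [0.3902, 0.4104, 0.5159],
    "Left Bundle Branch Block": [0.8321, 0.6954, 0.7007],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


for signal_type, estimates in PSM_FD_BY_CLASS.items():
    print(signal_type, round(mean(estimates), 6))
```

The rounded means reproduce the PSM entries of Table 11 for these four classes.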
5.2.2 Classification with SVM
In order to obtain reliable assessments of the classification accuracy of the
investigated classifier, I carried out five different trials using the OAA SVM
procedure described in Section 4.3.2, each with a new set of randomly selected testing and
training values, where each value is the PSM-FD of one type of
disease. The results obtained on the test set in these five trials were then averaged. The
detailed numbers of ECG beats and PSM-FD values for each class used in the
experiment, together with a comparison of the average run time needed for each, are
reported in Table 12. The classification performance summarized in Table 13 was
evaluated in terms of two measures:
1) the accuracy of each class, which is the percentage of correctly
classified beats among the beats of the considered class;
2) the average accuracy (AA), which is the average over the classification accuracies
obtained for the different classes.
The accuracy of PSM FD SVM is 100% for Normal Sinus Rhythm with all kernels used
in the experiment, which means that the proposed method clearly distinguishes
between the normal condition and the pathological ones. PSM can therefore provide
a significant clinical advantage, since it can readily be incorporated 'on line' to flag
(and possibly help control) the onset of a pathological condition, as indicated by the
high accuracy rates shown in Table 13.
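The two measures can be sketched in Python as follows. The function names and the small label lists are hypothetical; the per-class percentages in the last call are those reported for PSM FD - SVM-rbf in Table 13.

```python
def per_class_accuracy(true_labels, predicted_labels):
    """Percentage of correctly classified beats among the beats of each true class."""
    total, correct = {}, {}
    for truth, predicted in zip(true_labels, predicted_labels):
        total[truth] = total.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (1 if truth == predicted else 0)
    return {label: 100.0 * correct[label] / total[label] for label in total}


def average_accuracy(class_accuracies):
    """Average accuracy (AA): the unweighted mean of the per-class accuracies."""
    values = list(class_accuracies.values())
    return sum(values) / len(values)


# Hypothetical labels, only to show the calling convention.
print(per_class_accuracy(["N", "N", "V", "V"], ["N", "N", "V", "A"]))

# Per-class accuracies of PSM FD - SVM-rbf as reported in Table 13.
rbf_row = {"N": 100.0, "A": 89.94, "V": 89.97, "RBBB": 82.79, "LBBB": 83.97}
print(round(average_accuracy(rbf_row), 2))
```

Because AA weights every class equally, the large number of normal beats does not dominate the result.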
Table 12. Number of ECG Beats According to PSM - FD Used in the Experiment with a
Comparison of Average Run Time Needed for Each.
Class                       ECG Beats   PSM FD
N                           24150       38
A                           338         21
V                           4039        21
RBBB                        3789        21
LBBB                        1801        21
Total                       34117       122
Average Run-Time (second)   313200      2.93333
Table 13. Class Percentage Accuracy Achieved on the Testing PSM - FD Values
with a Total Number of 122 PSM - FD Training Values
Method                 AA       N        A        V        RBBB     LBBB
SVM-linear             78.90%   81.42%   80.25%   74.84%   82.53%   72.58%
SVM-poly               85.75%   85.74%   83.19%   84.48%   92.03%   89.94%
SVM-rbf                87.48%   88.69%   87.39%   81.48%   95.98%   87.49%
PSM FD - SVM-linear    85.41%   100%     81.97%   80.33%   82.79%   81.96%
PSM FD - SVM-poly      87.97%   100%     85.25%   89.87%   82.79%   81.96%
PSM FD - SVM-rbf       89.33%   100%     89.94%   89.97%   82.79%   83.97%
As reported in Table 13, the AA achieved on the test set with the proposed PSM-FD - SVM
classifier based on the Gaussian kernel (SVM-rbf) is equal to
89.33%. This result is better than those achieved by the SVM-linear and the SVM-poly
classifiers, whose AA values are equal to 85.41% and 87.97%, respectively.
This experiment appears to confirm what is observed in
other application fields, i.e., the superiority of SVM based on the Gaussian kernel as
compared to traditional classifiers when dealing with feature spaces of very high
dimensionality. In addition to the accuracies of the proposed PSM-FD - SVM
classification system and its low average run time shown in Table 12, Table 13 lists
the results of the SVM-linear, SVM-poly, and SVM-rbf classifiers without PSM-FD
features as a reference classification, against which the improvement obtained by the
proposed system can be quantified.
Chapter 6
Conclusion and Future Work
6.1 Conclusion
In this thesis, an enhanced diagnosis method for identifying cardiac ECG
arrhythmia using an OAA SVM classifier based on the fractal dimension was presented.
The proposed method first extracts the features of ECG arrhythmia based on fractal
theory. In this phase, three time-domain methods and one frequency-domain method
are used to estimate the fractal dimension values for the normal and the different
pathological conditions, which establishes different ranges of FD for each specific
disease. Such intervals are used to distinguish clearly between healthy and
non-healthy persons by placing each of them in a distinct FD range. This should facilitate
its application as a supplemental method to support the diagnosis of a pathological or
normal heart condition. The Power Spectrum Method (PSM) distinguishes between the
ECG signals of healthy and non-healthy persons better than the other methods.
The results also suggest that FD is a practical tool for identifying abnormal
characteristics in ECG recordings.
After the fractal features had been extracted, the OAA SVM classifier was trained on
these features in order to recognize and classify the ECG beats. Compared with the
diagnosis method based on ECG morphology features and three ECG
temporal features, i.e., the QRS complex duration, the RR interval (the time span
between two consecutive R points representing the distance between the QRS peaks of
the present and previous beats), and the RR interval averaged over the ten last beats [33],
the proposed method has the advantages of a simple architecture and global optimum ability.
The OAA SVM tool trained with simulated data was found to be capable of
predicting ECG arrhythmia classes accurately when the beat data were presented to the
trained OAA SVM for prediction. The verification results show that the
accuracy of the proposed diagnosis method using the OAA SVM classifier based on
the fractal dimension is high, with an average accuracy of 89.33%.
6.2 Future Work
From the obtained experimental results, we can strongly recommend the use of the
SVM approach based on fractal features for classifying ECG signals as an alternative
to the traditional diagnosis methods of cardiac arrhythmia, on account of its
superior generalization capability as compared to traditional classification techniques.
This capability generally provides higher classification accuracies, lower
overfitting, and robustness to noise. Future research should verify whether, as
the number of training beats increases, the classification accuracies increase and the
differences between the classifiers become less pronounced. It would also be interesting
to analyze other cardiac signals, such as the Heart Voice signal and the Heart Rate
Variability (HRV) signal.
REFERENCES
[1] Netter FH (1971), Heart, The Ciba Collection of Medical Illustrations, Ciba
Pharmaceutical Company, Summit, N.J, Vol. 5: 293 pp.
[2] Kannathal ,( 2004), Acharya , (2004), Garret ( 2003), Yuru ( 2004).
[3] Robert (1995 ), Kaplan ( 1999), Laurent (1998).
[4] Goldman MJ (1986),Principles of Clinical Electrocardiography, Lange Medical
Publications, Los Altos, Cal, 12th ed: 460 pp.
[5] Scheidt S (1984), Basic Electrocardiography: Abnormalities of Electrocardiographic
Patterns, Ciba Pharmaceutical Company, Summit, N.J,Vol. 6/36: 32 pp.
[6] Marissa Selner,( August 20, 2012) , Medically Reviewed by Peter Rudd, MD .
[7] A.Vishwa and A. Sharma , (December 2011),“Arrhythmic ECG signal classification
Using Machine Learning Techniques”. International Journal of Computer Science,
Information Technology, & Security (IJCSITS), Vol. 1, No. 2.
[8] Silipo,( 2011),” ECG Feature Extraction Techniques - A Survey Approach” ,
International Journal of Engineering, Science and Technology ,Vol. 3, No. 8:pp.
122-131.
[9] Yu ,and Guyon ,“Ensemble Feature Weighting Based on Local Learning and
Diversity”,Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence.
[10]Song et al, “Support Vector Machine Based Arrhythmia Classification Using
Reduced Features”,( December 2005), International Journal of Control, Automation,
and Systems, vol. 3, no. 4:pp. 571-579.
[11] Raghav , S. ;Mishra ,A ,K,(2008) ,” Fractal feature based ECG arrhythmia
classification”, TENCON , IEEE Region 10 Conference.
[12] S. Z. Mahmoodabadi, A. Ahmadian, and M. D. Abolhasani,( 2005) , “ ECG
Feature Extraction using Daubechies Wavelets”, Proceedings of the fifth IASTED
International conference on Visualization, Imaging and Image Processing, pp. 343 .
[13] S. C. Saxena, A. Sharma, and S. C. Chaudhary,( 1997.), “Data compression and
feature extraction of ECG signals” ,International Journal of Systems Science, vol.
28, no. 5:pp. 483-498.
[14] V. S. Chouhan, and S. S. Mehta,( 2008) “Detection of QRS Complexes in 12 lead
ECG
using Adaptive Quantized Threshold”, IJCSNS International Journal of
Computer Science and Network Security, vol. 8, no. 1.
[15] V. S. Chouhan, and S. S. Mehta,( March, 2007), “Total Removal of Baseline
Drift from ECG Signal”, Proceedings of International conference on Computing:
Theory and Applications, ICTTA–07:pp. 512-515, ISI.
[16] MIT-BIH Arrhythmia Database from PhysioBank- Physiologic Signal Archives for
Biomedical Research. Retrieved 10 March , 2013 , from http://www.physionet.org/ physiobank/database.
[17] Retrieved 10 March, 2013, from http://library.thinkquest.org/3493/ frames/fractal.html.
[18] Turner, M.J, (2000), Modeling Nature With Fractals, Leicester .
[19] P. Vanouplines, “Rescaled Range Analysis and the Fractal Dimension of pi”,
University Library, Free University Brussels, Pleinlaan 2, 1050 Brussels Belgium.
[20] Mandelbrot, B.B,( 1982) ,“The Fractal Geometry of Nature”, San Francisco.
[21] Angel Chang,(February,1993)“Fractals in Biological Systems”.
[22]Al Alfi. M ,(2014), “Enhanced Automatic Identification of Arrhythmia in
Electrocardiogram (ECG) Signals based on Fractal Features and SVM
Technique”, Unpublished Master Dissertation,Zarqa Private University,Jordan,Zarqa.
[23] Accardo A., Affinito M., Carrozzi M, Bouquet F,( 1997) ”Use of the Fractal
Dimension for the Analysis of Electroencephalographic Time Series”, Biol Cyber, 77:
339-350.
[24] J.M.Blackledge,(2006),”Digital Signal Processing: Mathematical and
Computation Methods: Software Development and Applications”, 2nd
Edition, London: Horwood Publishing Limited.
[25] Schepers HE, van Beek JHGM, Bassingtwaighte JB,(1992), ”Four Methods
to Estimate the Fractal Dimension from Selfaffine Signals”, IEEE Engg Me
Bio, (6): 57-64.
[26] Farid Melgani and Yakoub Bazi ,( September 2008) ,“Classification of
Electrocardiogram Signals With Support Vector Machines and Particle Swarm
Optimization”, IEEE Transactions on Information Technology in Biomedicine, Vol.
12, No. 5.
[27] Kaplan D. and Glass L,( 1995) ”Understanding Nonlinear Dynamics
Textbooks in Mathematical Sciences” , T F Banchoff ,New York: Springer.
[28] F. Melgani and L. Bruzzone,(Aug. 2004),“Classification of Hyperspectral
Remote Sensing Images with Support Vector Machine”, IEEE Trans. Geosci,
Remote Sens, vol. 42, no. 8: pp. 1778–1790.
[29] C.-W. Hsu and C.-J. Lin, (Mar. 2002),“A Comparison of Methods for Multiclass
Support Vector Machines,” IEEE Trans. Neural Netw.,vol. 13, no.2: pp. 415– 425.
[30]Chih-Wei Hsu, Chih-Jen Lin,(2002),” A Comparison of Methods for Multiclas
Support Vector Machines”, IEEE Transactions on Neural Networks, vol. 13,No 2.
[31] M. Stone,( 1974) ,“Cross-validatory Choice and Assessment of Statistical
Predictions”, J. R. Statist. Soc. B,vol. 36: pp. 111–147.
[32] Arle J. E. and Simon R. H,(1990) ”An Application of Fractal Dimension to the
Detection of Transients in the Electroencephalogram Electroencephalogr”, Clin
Neurophysiology, 75, 296305.
[33] F. de Chazal and R. B. Reilly,( Dec. 2006) ,“A Patient Adapting Heart Beat
Classifier Using ECG Morphology and Heartbeat Interval Features,” IEEE Trans.
Biomed. Eng. Vol. 53, no. 12: pp. 2535–2543.
Appendices
Appendix A: Matlab Code
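%%%%%%%%%%%%% Matlab Code %%%%%%%%%%%%%%%
%--------------------------- Rescaled Range Algorithm -----------------------
% D = HURST(Z) estimates the fractal dimension D = 2 - H of the time series Z,
% where H is the Hurst exponent calculated using the R/S (rescaled range) algorithm.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function D = Hurst(Z)
% Note: the input Z is overwritten by the dataset selected below.
clc;
dataset=1;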
switch dataset
case 1
load 801m
Z = val' ;
case 2
load ecg4_20m
Z = val' ;
case 3
load ecg10_20m
Z = val' ;
case 4
load ecg11_20m
Z = val' ;
case 5
load ecg12_20m
Z = val' ;
case 6
Z = load('sig_y1.txt');
case 7
Z = load('sig_y2.txt');
case 8
Z = load('sig_y3.txt');
case 9
Z = load('sig_y4.txt');
case 10
Z = load('sig_y5.txt');
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
m=length(Z);
x=zeros(1,m);
y=zeros(1,m);
y2=zeros(1,m);
for tau=3:m
X=zeros(1,tau);
Zsr=mean(Z(1:tau));
for t=1:tau
X(t)=sum(Z(1:t)-Zsr);
end;
R=max(X)-min(X);
S=std(Z(1:tau),1);
H=log10(R/S)/log10(tau/2);
x(tau)=log10(tau);
y(tau)=H;
y2(tau)=log10(R/S);
end;
D=2-H;
%plot(x,y,'k--',x,y2,'k-'),legend('H-track','R/S-track','Location','South')
%xlabel('lg(number of test)'),ylabel('lg(R/S)')
%axis([x(1) x(end) -inf +inf]),drawnow
%figure(gcf).
%%%%%-----------------------------Katz Algorithm------------------------%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% S = num of beat samples = num of beat intervals + 1
% fs = number of beat samples \ second
% fs= S/full time taken of all samples
%Input:
%x: (either column or row) vector of length N
%fs = number of beat samples \ second
%Output:
%f: Katz fractal dimension of x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
function f = KatzFD(x,fs)
clc
dataset=5;
switch dataset
case 1
load ecg2_20m
x = val' ;
fs =500;
case 2
load ecg4_20m
x = val' ;
fs =500;
case 3
load ecg10_20m
x = val' ;
fs =500;
case 4
load ecg11_20m
x = val' ;
fs =500;
case 5
load ecg12_20m
x = val' ;
fs =500;
case 6
x = load('sig_y1.txt');
fs = 1.2095833;
case 7
x = load('sig_y2.txt');
fs = .9770833;
case 8
x = load('sig_y3.txt');
fs = 1.615278;
case 9
x = load('sig_y4.txt');
fs = .7652778;
case 10
x = load('sig_y5.txt');
fs = .9673611;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
x=(x-mean(x))/std(x);
n = length(x);
t=(0:1/fs:(n-1)/fs);
t=t';
x1=[t x];
for i=1:n-1
d(i)=sqrt(abs(x1(i+1,1)-x1(i,1))^2+abs(x1(i+1,2)-x1(i,2))^2);
dmax(i)=sqrt((abs(x1(i+1,1)-x1(1,1))^2+abs(x1(i+1,2)-x1(1,2))^2));
end
totlen=sum(d);
avglen=mean(d);
maxdist=max(dmax);
numstep=double(totlen/avglen);
den=double(maxdist/totlen);
f=(log(numstep))/((log(den)+log(numstep)));
%-----------------------------Higuchi's Algorithm-----------------------
%function xhfd=hfd(x,kmax)
%k:integer indicates interval time
%m:integer indicates initial time
%N:is the total numberof samples in one epoch
%Input:
%x: (either column or row) vector of length N
%kmax: maximum value of k
%Output:
%xhfd: Higuchi fractal dimension of x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
function xhfd=HiguchiFD(x,kmax)
clc
dataset=5;
switch dataset
case 1
load ecg2_20m
x = val' ;
case 2
load ecg4_20m
x = val' ;
case 3
load ecg10_20m
x = val' ;
case 4
load ecg11_20m
x = val' ;
case 5
load ecg12_20m
x = val' ;
case 6
x = load('sig_y1.txt');
case 7
x = load('sig_y2.txt');
case 8
x = load('sig_y3.txt');
case 9
x = load('sig_y4.txt');
case 10
x = load('sig_y5.txt');
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
if ~exist('kmax','var')||isempty(kmax),
kmax= 7 ; % default maximum interval time (1280/256 = 5 interval time series)
end;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
x=x(:)';
N=length(x);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
Lmk=zeros(kmax,kmax);
for k=1:kmax,
for m=1:k,
Lmki=0;
for i=1:fix((N-m)/k),
Lmki=Lmki+abs(x(m+i*k)-x(m+(i-1)*k));
end;
Ng=(N-1)/(fix((N-m)/k)*k);
Lmk(m,k)=(Lmki*Ng)/k;
end;
end;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
Lk=zeros(1,kmax);
for k=1:kmax,
Lk(1,k)=sum(Lmk(1:k,k))/k;
end;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
lnLk=log(Lk);
lnk=log(1./[1:kmax]);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
b=polyfit(lnk,lnLk,1);
xhfd=b(1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%
%--------------------------------------PSM Algorithm------------------------------------------%
%============================================================
function Result=Estm_q(signal)
% Estimates the fractal dimension with the Power Spectrum Method (PSM),
% one value per non-overlapping window of 512 samples.
% Result(1,:) holds the window index, Result(2,:) the estimated FD.
clc
Result=[];
load 16273m % loads val (1x1280); note that this overwrites the input argument
%n1 = val';
%val =load('sig_y2.txt');
%load 17052m
signal = val';
S =signal;
format short
if length(S) < 1024
disp('The size of the data should be 1024 or more...');
return;
else
D=fix(length(S)./512); % number of complete windows
Trc=S(1:D*512); % truncate the signal to a whole number of windows
N=512; % size of the window that we need
m=N/2+1; % index of the zero-frequency component after fftshift
C2=0;jj=0;
for C1=1:N:length(Trc) % take different windows along the signal
jj=jj+1;
C2=C2+1;
fs=Trc(C1:C2*N); % extract one window of N samples
fs=fs/max(fs); % normalize the signal
FS=fft(fs); % calculate the FFT
FS=fftshift(FS); % apply shifting (to move the zero-frequency component to the center)
ps=abs(FS).^2; % calculate the power spectrum
p=ps(1:N/2); % take the left half of the power spectrum, excluding the DC component
% calculations to recover the estimated q (estm_q) by a least-squares fit
x1 =0; %sum[log(k)]
x2 =0; %sum[log(p)]
x12=0; %sum[log(k)*log(p)]
x11=0; %sum[log(k)^2]
for i= 1:N/2
k=abs(i-m); % k=frequency
if((p(i)~=0) && (k~=0))
x1=x1+log(k);
x2=x2+log(p(i));
x12=x12+log(k)*log(p(i));
x11=x11+log(k)*log(k);
end
end
estm_q=((N/2)*x12-x1*x2)/(x1^2-(N/2)*x11);% The formula to estimate q
Result(1,jj)=C2;
Result(2,jj)=(5- (2*estm_q ))/2; % fractal dimension of this window
%Result(2,jj)= estm_q ;
%figure, ;
%t = 1:0.1:10;
%plot(estm_q );
%plot(ps);
%plot( ApplyThreshold (Result(2,:), -0.5, 0.5), 'r');
plot(Result(2,:));
%plot(ps);
%line('XData', [0 9], 'YData', [ 0.0741 0.0741 ], 'LineStyle', '-', ...
%'LineWidth', 1, 'Color','m');
%line('XData', [0 9], 'YData', [ 0 0 ], 'LineStyle', '-', ...
% 'LineWidth', 1, 'Color','b');
%title('Power spectrum of Normal Sinus Rhythm for 17052m dataset with 256 window size ');
end % end of windowing
end
%--------------------------------------SVM Classification-------------------------------------%
%============================================================
clc
load 'test_svm_Normal_V.txt'; %# load ECG dataset
data = test_svm_Normal_V(:,1);
groups = test_svm_Normal_V(:,2); %# create a two-class problem
numInst = size(data);
First =zeros(numInst);
for i=1:numInst
if groups(i)==1
First(i)=1;
classF = data(groups(i));
else
First(i)=2;
classF = data(groups(i));
end
end
k=5;
cvFolds = crossvalind('Kfold',groups,k); %# get indices of 5-fold CV
cp = classperf(groups); %# init performance tracker
for i = 1:k
%# for each fold
test = (cvFolds == i); %# get indices of test instances
train = ~test;
%# get indices training instances
%# train an SVM model over training instances
options = optimset('maxiter', 5000, 'largescale','off'); %options settings for SVMTRAIN
svmStruct = svmtrain(data(train,:),groups(train),'KERNEL_FUNCTION','rbf', ...
'showplot',false,'quadprog_opts',options);
%# test using test instances
classes = svmclassify(svmStruct,data(test,:),'showplot',false);
%# evaluate and update performance object
cp = classperf(cp,classes,test);
end
%# get accuracy
cp.CorrectRate
Second = zeros(numel(groups),1);
for i=1:numel(groups)
if ismember(groups(i), [1 2]) %# classes 1 and 2 form the first group
Second(i)=1;
else
Second(i)=2;
end
end
classS = data; %# feature column used by svmtrain/svmclassify below
indices2 = crossvalind('Kfold',Second,k);
cp = classperf(Second);
for i = 1:k
test = (indices2 == i); train = ~test;
options = optimset('maxiter', 5000, 'largescale','off'); %options settings for SVMTRAIN
svmStructS = svmtrain(classS(train,:),Second(train),'KERNEL_FUNCTION','linear', ...
'showplot',false);
classes2 = svmclassify(svmStructS,classS(test,:),'showplot',false);
cp = classperf(cp,classes2,test);
end
cp.CorrectRate
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Third = zeros(numel(groups),1);
for i=1:numel(groups)
if ismember(groups(i), [1 2 3]) %# classes 1 to 3 form the first group
Third(i)=1;
else
Third(i)=2;
end
end
classT = data; %# feature column used by svmtrain/svmclassify below
indices3 = crossvalind('Kfold',Third,k);
cp = classperf(Third);
for i = 1:k
test = (indices3 == i); train = ~test;
options = optimset('maxiter', 5000, 'largescale','off'); %options settings for SVMTRAIN
svmStructT = svmtrain(classT(train,:),Third(train),'KERNEL_FUNCTION','linear', ...
'showplot',false);
classes3 = svmclassify(svmStructT,classT(test,:),'showplot',false);
cp= classperf(cp,classes3,test);
end
cp.CorrectRate
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Forth = zeros(numel(groups),1);
for i=1:numel(groups)
if ismember(groups(i), [1 2 3 4]) %# classes 1 to 4 form the first group
Forth(i)=1;
else
Forth(i)=2;
end
end
classFo = data; %# feature column used by svmtrain/svmclassify below
indices4 = crossvalind('Kfold',Forth,k);
cp = classperf(Forth);
for i = 1:k
test = (indices4 == i); train = ~test;
options = optimset('maxiter', 5000, 'largescale','off'); %options settings for SVMTRAIN
svmStructFo = svmtrain(classFo(train,:),Forth(train),'KERNEL_FUNCTION','linear', ...
'showplot',false);
classes4 = svmclassify(svmStructFo,classFo(test,:),'showplot',false);
cp =classperf(cp,classes4,test);
end
cp.CorrectRate
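
The listing above relies on svmtrain, svmclassify, crossvalind, and classperf. Later MATLAB releases provide fitcsvm with built-in k-fold cross-validation in the Statistics and Machine Learning Toolbox, so the same train/test cycle can be sketched more compactly. The example below is a minimal sketch on synthetic data, not the thesis experiment: the feature values, the class labels, and the names x, y, mdl, cv, and acc are assumptions for illustration.

```matlab
% Sketch only: 5-fold cross-validated RBF SVM with fitcsvm.
% Synthetic one-column feature (e.g. one fractal-dimension value per window).
rng(0);                                         % reproducible data and folds
x = [1.2 + 0.01*randn(50,1); 1.8 + 0.01*randn(50,1)];
y = [ones(50,1); 2*ones(50,1)];                 % two class labels

mdl = fitcsvm(x, y, 'KernelFunction', 'rbf', ...
    'KernelScale', 'auto', 'Standardize', true);
cv  = crossval(mdl, 'KFold', 5);                % partitioned model, 5 folds
acc = 1 - kfoldLoss(cv);                        % fraction classified correctly
```

Here acc plays the role of cp.CorrectRate in the listing, averaged over the five held-out folds.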