International Journal of Research in Information Technology (IJRIT), Volume 1, Issue 12, December 2013, pp. 274-280
www.ijrit.com, ISSN 2001-5569
Comparative Analysis of Classification
Techniques using Cardiotocography Dataset
V. Subha (1), Dr. D. Murugan (2), Jency Rani (3), Dr. K. Rajalakshmi (4)

1 Assistant Professor, Department of CSE, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India. Subha_velappan@yahoo.co.in
2 Associate Professor, Department of CSE, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India. dhanushkodim@yahoo.com
3 PG Scholar, Department of CSE, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India. jency.rani96@gmail.com
4 Assistant Professor, CITE, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India.
Abstract
Cardiotocography (CTG) is the most widely used technique to monitor fetal health and fetal heart rate. By observing the Cardiotocography trace patterns, doctors can understand the state of the fetus and decide whether it is in a normal or a pathologic state. A major challenge in the medical domain is the extraction of intelligible knowledge from medical diagnosis data such as CTG data. Using this CTG data, four classification methods (Naive Bayes, Decision Tree, Multi-Layer Perceptron and Radial Basis Function network) are compared in this paper. The metrics Accuracy, Sensitivity, Specificity and F-Score are used to evaluate them. In order to improve the accuracy, feature selection methods are also applied and the performance of the classifiers is compared.
Keywords: Cardiotocography, Data mining, Feature selection, classification.
1. Introduction
Data mining is the process of extracting knowledge from databases. Nowadays, disease prediction is increasingly difficult for doctors in the medical field. Data mining techniques are helpful for extracting patterns from medical data, which aid in medical decision making.
Cardiotocography (CTG) is a technical means of recording the fetal heart rate (FHR) and the uterine contractions (UC) during pregnancy, typically in the third trimester, to evaluate maternal and fetal well-being. During CTG analysis, the FHR patterns are observed manually by obstetricians. In the recent past, the fetal heart rate baseline and its frequency analysis have been investigated from many aspects [2]. Fetal heart rate (FHR) monitoring is mainly used to find out how much oxygen a fetus is receiving during labor. Even so, death and long-term disability still occur due to hypoxia during delivery. More than 50% of these deaths are caused by failure to recognize abnormal FHR patterns. Currently available data mining techniques can be used for analyzing and classifying CTG data, avoiding human mistakes and assisting doctors in making a decision.
2. Related works
Rooth et al. [6] introduced the International Federation of Gynecology and Obstetrics (FIGO) guidelines as an attempt to standardize the use of electronic monitoring of FHR. The first work on automatic CTG analysis following the FIGO guidelines consisted of describing and extracting the CTG morphological features [7]. Bernardes [5] developed SisPorto, a system for automatic analysis of CTG tracings, based on an improvement of the morphological feature extraction introduced in [7]. Krupa et al. [4] proposed a new method for FHR signal analysis based on Empirical Mode Decomposition (EMD) for feature extraction and an SVM with an RBF kernel for classification of FHR recordings.
The performance of a neural network based classification model [8] was compared with the most commonly used unsupervised clustering methods, Fuzzy C-means and k-means clustering. Mei-Ling Huang et al. [9] predicted fetal distress using discriminant analysis (DA), decision tree (DT) and artificial neural network (ANN) models. Back propagation networks, Radial Basis Function networks and Support Vector Machines [10] were used for classifying cardiotocogram data. Sundar et al. [11] evaluated a machine learning based Naive Bayes probabilistic classifier for classifying CTG data. Artificial Neural Networks (ANNs) were used as a classifier [12] to detect FHR acceleration and deceleration patterns and to estimate the FHR baseline and variability. ANNs were used to classify deceleration patterns into episodic and periodic decelerations [13] according to the FIGO guidelines and based on the relationship between the parameters of the deceleration and the associated uterine contraction. ANNs with Radial Basis Functions (RBF) and Multi-Layer Perceptrons (MLP) were the best performing classifiers.
Four different feature selection methods, namely PCA, Info Gain, GAME and a meta selection method [14], were analyzed. The meta feature selection method provided higher sensitivity and specificity than the other methods. The best feature selection method for the Naive Bayes classifier was reported. Feature selection with 15 attributes and with 10 attributes was compared [15]; the 15-attribute subset yielded better performance than the 10-attribute subset. Chudacek et al. [16] used an SVM with a polynomial kernel, Naive Bayes and a decision tree (C4.5 algorithm) to analyze FHR signals based on linear and non-linear features. They used three FS methods: Principal Component Analysis, Information Gain and Group of Adaptive Models Evolution (GAME). Georgoulas et al. [17] proposed a FS method based on bPSO (binary Particle Swarm Optimization) for FHR signal analysis using SVM and k-NN.
3. Materials and methods
3.1 Dataset
The cardiotocography dataset is downloaded from the UCI Machine Learning Repository [1]. It contains 2126 instances and 21 attributes, plus 1 class attribute. The machine learning tool WEKA is used for experimentation.
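The experiments in this paper were run in WEKA. Purely as an illustration (not part of the original study), the Python sketch below shows one way the same data could be loaded for scripting, assuming the UCI spreadsheet has been exported to a hypothetical ctg.csv file containing the 21 measurement columns of Table 1 plus the NSP class column.

    # Illustrative only: load a hypothetical CSV export of the UCI CTG data
    # and separate the 21 predictive attributes from the 3-class NSP label.
    import pandas as pd

    df = pd.read_csv("ctg.csv")          # hypothetical export of CTG.xls (UCI repository)
    X = df.drop(columns=["NSP"])         # 21 predictive attributes
    y = df["NSP"]                        # 1 = Normal, 2 = Suspect, 3 = Pathologic

    print(X.shape)                       # expected (2126, 21)
    print(y.value_counts())              # class distribution over the 2126 instances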
3.2 Classification Algorithms
The following four types of classifiers are used for data classification.
Naive Bayes Classifier (NB): Naive Bayes is a statistical classifier which assumes no dependency between attributes. It attempts to maximize the posterior probability in determining the class. In theory, this classifier has the minimum error rate, but this may not always be the case. According to Bayes' theorem,
P(A|B) = P(A) * P(B|A) / P(B)
where P(B|A) = P(A∩B) / P(A)
Based on the above formula, the Bayesian classifier calculates the conditional probability of an instance belonging to each class and, based on these conditional probabilities, assigns the instance to the class with the highest conditional probability.
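To make the calculation concrete, the toy sketch below (illustrative numbers only, not taken from the CTG data) computes the posterior of each fetal state for one hypothetical instance and picks the class with the largest value.

    # Toy Bayes-rule calculation with made-up priors P(class) and
    # likelihoods P(x | class) for a single hypothetical instance x.
    priors = {"Normal": 0.78, "Suspect": 0.14, "Pathologic": 0.08}
    likelihoods = {"Normal": 0.02, "Suspect": 0.10, "Pathologic": 0.25}

    scores = {c: priors[c] * likelihoods[c] for c in priors}   # P(class) * P(x | class)
    evidence = sum(scores.values())                            # P(x), the normalising constant
    posteriors = {c: s / evidence for c, s in scores.items()}  # P(class | x)

    prediction = max(posteriors, key=posteriors.get)           # highest posterior wins
    print(posteriors, prediction)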
Decision Tree (DT): Decision trees are powerful and popular tools for classification and prediction. They represent rules that can be understood by humans and used in knowledge systems such as databases. The decision tree is a popular classifier which is simple and easy to implement. It requires no domain knowledge or parameter setting and can handle high-dimensional data. Hence it is well suited to exploratory knowledge discovery. The performance of decision trees can be enhanced with suitable attribute selection: correct selection of attributes partitions the data set into distinct classes. Our work uses the J48 decision tree for classification.
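The study itself uses WEKA's J48, a C4.5 implementation. As a rough stand-in for readers working in Python, the sketch below trains scikit-learn's CART-style tree with an entropy split criterion; it is analogous to, not identical with, J48, and it reuses the hypothetical ctg.csv export described in Section 3.1.

    # Analogous (not identical) to the J48 experiment: a CART decision tree
    # with entropy-based splits, evaluated by 10-fold cross-validation.
    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    df = pd.read_csv("ctg.csv")                      # hypothetical CSV export (see Section 3.1)
    X, y = df.drop(columns=["NSP"]), df["NSP"]

    tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
    acc = cross_val_score(tree, X, y, cv=10, scoring="accuracy").mean()
    print("10-fold CV accuracy: %.3f" % acc)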
Multi-Layer Perceptron (MLP): The multi-layer perceptron classifier is based on the back propagation algorithm to classify instances of data. The nodes in this neural network are all sigmoid units. The back propagation neural network is referred to as a network of simple processing elements working together to produce an output. It learns a set of weights for predicting the class label of tuples. The neural network consists of three kinds of layers, namely an input layer, one or more hidden layers, and an output layer [13]. Each layer is made up of units. The input layer of the network corresponds to the attributes; the inputs are fed simultaneously into the units making up the input layer. These inputs pass through the input layer and are then weighted and fed simultaneously to a second layer of neuron-like units, known as a hidden layer. The outputs of the hidden layer units can be input to another hidden layer, and so on. The number of hidden layers is arbitrary, although in practice usually only one is used [4]. At its core, back propagation is simply an efficient and exact method for calculating all the derivatives of a single target quantity with respect to a large set of input quantities [13].
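The paper uses WEKA's MultilayerPerceptron; the sketch below is a comparable single-hidden-layer network in scikit-learn with sigmoid (logistic) units trained by back propagation, with standardised inputs. The hidden layer size is an arbitrary choice for illustration, and ctg.csv is the hypothetical export from Section 3.1.

    # Illustrative MLP: one sigmoid hidden layer trained by back propagation.
    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv("ctg.csv")                      # hypothetical CSV export (see Section 3.1)
    X, y = df.drop(columns=["NSP"]), df["NSP"]

    mlp = make_pipeline(
        StandardScaler(),                            # scale attributes before training
        MLPClassifier(hidden_layer_sizes=(10,),      # one hidden layer, size chosen arbitrarily
                      activation="logistic",         # sigmoid units, as described above
                      max_iter=2000, random_state=0),
    )
    print(cross_val_score(mlp, X, y, cv=10, scoring="accuracy").mean())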
Radial Basis Function (RBF) Networks: An RBF network consists of three layers: an input layer, a hidden layer and an output layer. The hidden units provide a set of functions that constitute an arbitrary basis for the input patterns; these hidden units are known as radial centers. The transformation from the input space to the hidden unit space is nonlinear, whereas the transformation from the hidden unit space to the output space is linear. The dimension of each center for a p-input network is p x 1. The radial basis functions in the hidden layer produce a significant non-zero response only when the input falls within a small localized region of the input space; each hidden unit has its own receptive field in the input space. The network uses the Gaussian radial function: the activation function of a hidden unit computes the Euclidean distance between the input vector and the center of that unit.
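To illustrate the architecture just described, the short numpy sketch below builds a minimal RBF network on random stand-in data: k-means picks the radial centers, each hidden unit applies a Gaussian of the Euclidean distance to its center, and the hidden-to-output mapping is fitted linearly by least squares. It is a sketch of the idea only, not the WEKA RBF network implementation used in the experiments.

    # Minimal RBF network sketch: Gaussian hidden units + linear output layer.
    import numpy as np
    from sklearn.cluster import KMeans

    def rbf_hidden(X, centers, sigma):
        # exp(-||x - c||^2 / (2 sigma^2)) for every input x and center c
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 21))                   # stand-in for the 21 CTG attributes
    y = rng.integers(1, 4, size=200)                 # stand-in for NSP labels 1/2/3

    centers = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X).cluster_centers_
    H = rbf_hidden(X, centers, sigma=1.0)            # nonlinear input -> hidden mapping
    T = np.eye(3)[y - 1]                             # one-hot targets for the 3 classes
    W, *_ = np.linalg.lstsq(H, T, rcond=None)        # linear hidden -> output mapping
    pred = H.dot(W).argmax(axis=1) + 1               # predicted class per instance
    print((pred == y).mean())                        # training accuracy of the toy model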
3.3 Classification methods
Two types of classification approaches have been employed, as shown in Fig. 1: the classification methods are applied to the full-feature CTG dataset (21 attributes) and, after feature subset selection, to the reduced dataset (15 attributes); the performance of both is evaluated and the results are compared.
[Fig. 1 Classification methods: CTG dataset (21 attributes) -> feature subset selection -> classification with the full feature set (21 attributes) and with the selected feature subset (15 attributes) -> performance evaluation -> result comparison]
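The two branches of Fig. 1 can be mirrored in a few lines outside WEKA. The sketch below is illustrative only: the attribute ranker and classifier are stand-ins, and ctg.csv is the hypothetical export from Section 3.1. It evaluates the same classifier once on all 21 attributes and once on a 15-attribute subset chosen by a filter score inside the cross-validation pipeline.

    # Two-branch comparison from Fig. 1: full feature set vs. a selected subset.
    import pandas as pd
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.tree import DecisionTreeClassifier

    df = pd.read_csv("ctg.csv")                      # hypothetical CSV export (see Section 3.1)
    X, y = df.drop(columns=["NSP"]), df["NSP"]

    clf = DecisionTreeClassifier(random_state=0)
    full = cross_val_score(clf, X, y, cv=10).mean()  # branch 1: all 21 attributes

    reduced_pipe = make_pipeline(                    # branch 2: keep the 15 best-ranked attributes
        SelectKBest(mutual_info_classif, k=15), clf)
    reduced = cross_val_score(reduced_pipe, X, y, cv=10).mean()

    print("full: %.3f  reduced: %.3f" % (full, reduced))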
3.4 Feature Selection
Feature selection is frequently used as a preprocessing step for machine learning. It is the process of choosing a subset of the original features so that the feature space is optimally reduced according to a certain evaluation criterion. The main goal of feature selection is to find a feature subset that produces higher classification accuracy. High-dimensional data (i.e., data sets with hundreds or thousands of features) can contain a high degree of irrelevant and redundant information, which may greatly degrade the performance of learning algorithms. Therefore, feature selection becomes very necessary for machine learning tasks. The following three feature selection methods in WEKA are used.
Gain Ratio Attribute Evaluation: Evaluates the worth of an attribute by measuring the gain ratio with respect
to the class.
GainR (Class, Attribute) = (H(Class) - H(Class | Attribute)) / H(Attribute).
ReliefF Attribute Evaluation: Evaluates the worth of an attribute by repeatedly sampling an instance and
considering the value of the given attribute for the nearest instance of the same and different class.
Symmetrical Uncertainty Attribute Evaluation: Evaluates the worth of an attribute by measuring the symmetrical uncertainty with respect to the class.
SymmU (Class, Attribute) = 2 * (H(Class) - H(Class | Attribute)) / (H(Class) + H(Attribute)).
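As a concrete illustration of these two scores, the sketch below computes the gain ratio and the symmetrical uncertainty of one attribute of the CTG data (ASTV, discretised into equal-width bins here for simplicity; WEKA handles discretisation internally), using plain entropy calculations and the hypothetical ctg.csv export from Section 3.1.

    # Entropy-based attribute scores: gain ratio and symmetrical uncertainty.
    import numpy as np
    import pandas as pd

    def entropy(series):
        p = series.value_counts(normalize=True).to_numpy()
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    def cond_entropy(cls, attr):
        # H(Class | Attribute) = sum over values v of P(v) * H(Class | Attribute = v)
        return sum(w * entropy(cls[attr == v])
                   for v, w in attr.value_counts(normalize=True).items())

    df = pd.read_csv("ctg.csv")                      # hypothetical CSV export (see Section 3.1)
    cls = df["NSP"]
    attr = pd.cut(df["ASTV"], bins=10)               # discretise one attribute

    info_gain = entropy(cls) - cond_entropy(cls, attr)
    gain_ratio = info_gain / entropy(attr)
    symm_unc = 2 * info_gain / (entropy(cls) + entropy(attr))
    print(gain_ratio, symm_unc)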
4. Experimental results
The CTG dataset contains 2126 instances and 21 attributes with 1 class attribute. The dataset consists of measurements of fetal heart rate (FHR) and uterine contraction (UC). The CTGs were classified by expert obstetricians and a consensus classification label was assigned to each of them. The classification is with respect to a fetal heart rate class code (N-Normal, S-Suspect and P-Pathologic). Therefore this dataset can be used for a 3-class experiment.
The attributes of the dataset are described as follows:
Table 1: Attribute information of CTG dataset

 1  LB        FHR baseline (beats per minute)
 2  AC        Accelerations per second
 3  FM        Fetal movements per second
 4  UC        Uterine contractions per second
 5  DL        Light decelerations per second
 6  DS        Severe decelerations per second
 7  DP        Prolonged decelerations per second
 8  ASTV      Percentage of time with abnormal short term variability
 9  MSTV      Mean value of short term variability
10  ALTV      Percentage of time with abnormal long term variability
11  MLTV      Mean value of long term variability
12  Width     Width of FHR histogram
13  Min       Minimum of FHR histogram
14  Max       Maximum of FHR histogram
15  Nmax      Histogram peaks
16  Nzeros    Histogram zeros
17  Mode      Histogram mode
18  Mean      Histogram mean
19  Median    Histogram median
20  Variance  Histogram variance
21  Tendency  Histogram tendency
22  NSP       Fetal state class code (Normal=1; Suspect=2; Pathologic=3)
The following metrics are used for performance evaluation.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
F-Score = (2 * A * B) / (A + B)
where
A (precision) = TP / (TP + FP) and B (recall) = TP / (TP + FN)
TP: True Positives
FP: False Positives
TN: True Negatives
FN: False Negatives
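A short sketch of how these metrics follow from the four counts, with A as precision and B as recall in the F-Score definition; the counts used in the call are hypothetical and only demonstrate the arithmetic.

    # Evaluation metrics computed from the confusion-matrix counts of one class
    # treated as "positive" (e.g. Pathologic vs. the rest).
    def evaluate(tp, fp, tn, fn):
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        sensitivity = tp / (tp + fn)                 # recall, B in the F-Score formula
        specificity = tn / (tn + fp)
        precision = tp / (tp + fp)                   # A in the F-Score formula
        f_score = 2 * precision * sensitivity / (precision + sensitivity)
        return accuracy, sensitivity, specificity, f_score

    # Hypothetical counts, purely to show the calculation:
    print(evaluate(tp=150, fp=20, tn=1900, fn=26))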
4.1 Results
The classification techniques (NB, DT, MLP and RBF) are compared using the original CTG dataset with 21 attributes. The metrics Accuracy, Sensitivity, Specificity and F-Score are used to evaluate them. The experimental results are shown in Table 2.
Table 2: Comparison of classification techniques using the full-feature CTG dataset (21 attributes)

Metrics       NB     DT     MLP    RBF
Accuracy      82.1   92.9   91.9   86.0
Sensitivity   82.1   92.9   91.9   86.0
Specificity   87.6   92.8   91.8   88.0
F-Score       83.7   92.9   91.8   86.7
From this, it is observed that DT yields better performance than other classifiers in terms of accuracy.
Table 3: Comparison of feature selection methods using the Naive Bayes classifier

Metrics       Gain Ratio   ReliefF   Symmetrical Uncertainty
Accuracy      83.9         82.8      83.4
Sensitivity   83.9         82.8      83.4
Specificity   88.3         87.7      88.1
F-Score       85.2         84.2      84.8
Table 3 illustrates the performance of the Naïve Bayes classifier using the three feature selection methods. It is observed that the Gain Ratio feature selection method yields better performance than the other feature selection methods. Hence, the Gain Ratio feature selection method is selected for further experimentation. The reduced CTG dataset with 15 attributes is formed using this Gain Ratio feature selection method. The classification techniques (NB, DT, MLP and RBF) are then compared using the feature-reduced CTG dataset with 15 attributes. The performance of these classifiers is evaluated using the same metrics. The experimental results are shown in Table 4.
Table 4: Comparison of classification techniques using the reduced CTG dataset (15 attributes)

Metrics       NB     DT     MLP    RBF
Accuracy      83.9   93.3   92.6   87.2
Sensitivity   83.9   93.3   92.6   87.2
Specificity   88.3   93.2   92.4   89.0
F-Score       85.2   93.3   92.5   87.8
Table 4 shows that the Decision Tree yields higher accuracy than the other classifiers with the reduced feature set. The overall performance is then compared between the full CTG dataset (21 attributes) and the feature-reduced CTG dataset (15 attributes). The comparison of the classification techniques on the full CTG dataset with 21 attributes and the reduced CTG dataset with 15 attributes is shown in Fig. 2. The observations illustrate that classification using the reduced feature set performs better than with the full-feature dataset.
[Fig. 2 Comparison of classification techniques before feature subset selection and after feature subset selection: bar chart of accuracy (%) for the NB, DT, MLP and RBF classifiers]
5. Conclusion
In this paper, four classification techniques (NB, DT, MLP and RBF) are compared using the original CTG dataset with 21 attributes. The DT classifier yields better performance than the other classifiers in terms of accuracy. To improve the accuracy, a feature selection method is applied: the Gain Ratio feature selection method is applied to the original CTG dataset to obtain a reduced CTG dataset with 15 attributes. The performance of the classification techniques is analysed on both the original CTG dataset with 21 attributes and the reduced CTG dataset with 15 attributes. Classification with the reduced CTG dataset yields better performance than with the original dataset. In the future, other classification methods and feature selection techniques can be applied to this dataset so that performance can be further improved.
References
[1] UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets/Cardiotocography
[2] Shahad Nidhal, M. A. Mohd Ali and Hind Najah, “A novel cardiotocography fetal heart rate baseline estimation
algorithm”, Scientific Research and Essays, Vol.5, No.24, 2010, pp.4002-4010.
[3] Antonia Costa, MD; Diogo Ayres-de-Campos, PhD; Fernanda Costa, MD; Cristina Santos, MS; Joao Bernardes,
PhD, “Prediction of neonatal acidemia by computer analysis of fetal heart rate and ST event signals”, AJOG -
American Journal of Obstetrics and Gynecology, 2009.
[4] N. Krupa, M. A. Mohd Ali, E. Zahedi, S. Ahmed and F. M. Hassan, “Antepartum fetal heart rate feature extraction and
classification using empirical mode decomposition and support vector machine”, BioMed. Eng. OnLine, Vol.10,
No.61, 2011.
[5] J. Bernardes, C. Moura, J. P. de Sa, and L. P. Leite, “The Porto system for automated cardiotocographic signal
analysis”, J. Perinat. Med., pp.61-65, 1991.
[6] G. Rooth, A. Huch, and R. Huch, “Guidelines for the use of fetal monitoring,” Int. J. Gynaecol. Obstet., vol.25,
1987, pp.159-167.
[7] R. Mantel, H. P. van Geijn, F. J. Caron, J. M. Swartjes, E. E. van Woerden, and H. W. Jongsma, “Computer
analysis of antepartum fetal heart rate: Detection of accelerations and decelerations”, Int. J. Biomed.
Comput., Vol.25, No.4, 1990, pp.273-286.
[8] Sundar, C., M. Chitradevi and G. Geetharamani, “An Overview Of Research Challenges for Classification Of
Cardiotocogram Data”, Journal of Computer Science, Vol.9, No.2, 2013, pp.198-206.
[9] Mei-Ling Huang and Yung-Yan Hsu, “Fetal distress prediction using discriminant analysis, decision tree, and
artificial neural network”, J. Biomedical Science and Engineering, Vol.5, 2012, pp.526-533.
[10] Sundar Chinnasamy, Chitradevi Muthusamy and Geetharamani Gopal, “An Outlier Based Bi-Level Neural
Network Classification System for Improved Classification of Cardiotocogram Data”, Life Science Journal,
Vol.10, No.1, 2013, pp.244-251.
[11] Sundar, C., M. Chitradevi and G. Geetharamani, “An Analysis on the Performance of Naïve Bayes Probabilistic
Model based Classifier for Cardiotocogram Data Classification”, International Journal on Computational
Sciences & Applications, Vol.3, No.1, 2013, pp.17-26.
[12] P. A. Warrick, E. F. Hamilton, and M. Macieszczak, “Neural network based detection of fetal heart rate
patterns”, in Proc. IEEE Int. Joint Conf. on Neural Netw., Vol.5, 2005, pp.2400-2405.
[13] M. Jezewski, P. Labaj, J. Wrobel, A. Matonia, J. Jezewski, and D. Cholewa, “Automated classification of
deceleration patterns in fetal heart rate signal using neural networks”, in IFMBE Proc., Vol.18, 2007, pp.5-8.
[14] V. Chudacek, J. Spilka, B. Rubackova, M. Koucky, “Evaluation of Feature Subsets for Classification of
Cardiotocographic Recordings”, Computers in Cardiology, Vol.35, 2008, pp.845-848.
[15] Mohamed El Bachir Menai, Fatimah J. Mohder, and Fayha Al-mutairi, “Influence of Feature Selection on
Naïve Bayes Classifier for Recognizing Patterns in Cardiotocograms”, Journal of Medical and Bioengineering,
Vol.2, No.1, 2013, pp.66-70.
[16] V. Chudacek, J. Spilka, M. Huptych, G. Georgoulas, P. Janku, M. Koucky, C. Stylios, and L. Lhotska, “Linear
and non-linear features for intrapartum cardiotocography evaluation”, Computing in Cardiology,2010;37:991002.
[17] G. Georgoulas, C. D. Stylios, and P. P. Groumpos, “Predicting the risk of metabolic acidosis for newborns
based on fetal heart rate signal classification using support vector machines”, IEEE Trans. Biomed. Eng., 2006,
pp.875-884.
[18] Ian H. Witten and Eibe Frank, “Data Mining: Practical Machine Learning Tools and Techniques”, Second Edition, Morgan
Kaufmann Publishers, Imprint of Elsevier, 2005.