Comparative Analysis of Classification Techniques using Cardiotocography Dataset

V. Subha (1), Dr. D. Murugan (2), Jency Rani (3), Dr. K. Rajalakshmi (4)

1 Assistant Professor, Department of CSE, Manonmaniam Sundaranar University, Tirunelveli, TamilNadu, India. Subha_velappan@yahoo.co.in
2 Associate Professor, Department of CSE, Manonmaniam Sundaranar University, Tirunelveli, TamilNadu, India. dhanushkodim@yahoo.com
3 PG Scholar, Department of CSE, Manonmaniam Sundaranar University, Tirunelveli, TamilNadu, India. jency.rani96@gmail.com
4 Assistant Professor, CITE, Manonmaniam Sundaranar University, Tirunelveli, TamilNadu, India.

IJRIT International Journal of Research in Information Technology, Volume 1, Issue 12, December 2013, Pg. 274-280, www.ijrit.com, ISSN 2001-5569

Abstract

Cardiotocography (CTG) is the most widely used technique for monitoring fetal health and fetal heart rate. By observing CTG trace patterns, doctors can assess the state of the fetus and decide whether it is normal or pathologic. A major challenge in the medical domain is the extraction of intelligible knowledge from medical diagnosis data such as CTG data. In this paper, four classification methods (Naive Bayes, Decision Tree, Multi-Layer Perceptron and Radial Basis Function) are compared on CTG data, using Accuracy, Sensitivity, Specificity and F-Score as evaluation metrics. To improve accuracy, feature selection methods are also applied and the performance of the classifiers is compared.

Keywords: Cardiotocography, Data mining, Feature selection, Classification.

1. Introduction

Data mining is the process of extracting knowledge from databases. Disease prediction remains a difficult task for doctors, and data mining techniques help extract patterns from medical data that aid medical decision making. Cardiotocography (CTG) is a technical means of recording the fetal heart rate (FHR) and uterine contractions (UC) during pregnancy, typically in the third trimester, to evaluate maternal and fetal well-being. During CTG analysis, FHR patterns are observed manually by obstetricians. In the recent past, the fetal heart rate baseline and its frequency analysis have been researched from many aspects [2]. Fetal heart rate (FHR) monitoring is mainly used to find out the amount of oxygen a fetus is receiving during labor. Even so, death and long-term disablement still occur due to hypoxia during delivery, and more than 50% of these deaths were caused by failure to recognize abnormal FHR patterns. Currently available data mining techniques can be used to analyze and classify CTG data in order to avoid human mistakes and assist doctors in taking decisions.

2. Related works

Rooth et al. [6] introduced the International Federation of Gynaecology and Obstetrics (FIGO) guidelines as an attempt to standardize the use of electronic monitoring of FHR.
The first work on automatic CTG analysis following the FIGO guidelines consists of describing and extracting the CTG morphological features [7]. Bernardes [5] developed SisPorto, a system for automatic analysis of CTG tracings, based on an improvement of the morphological feature extraction introduced in [7]. Krupa et al. [4] proposed a new method for FHR signal analysis based on Empirical Mode Decomposition (EMD) for feature extraction and an SVM with an RBF kernel for classification of FHR recordings. The performance of a neural network based classification model [8] was compared with the commonly used unsupervised clustering methods, Fuzzy C-means and k-means clustering. Mei-Ling Huang et al. [9] predicted fetal distress using discriminant analysis (DA), decision tree (DT), and artificial neural network (ANN). Back propagation networks, Radial Basis Functions and Support Vector Machines [10] were used for classifying cardiotocogram data. Sundar et al. [11] evaluated a Naive Bayes probabilistic classifier for classifying CTG data. Artificial Neural Networks (ANNs) were used as a classifier [12] to detect FHR acceleration and deceleration patterns and to estimate the FHR baseline and variability. ANNs were also used to classify deceleration patterns into episodic and periodic decelerations [13] according to the FIGO guidelines, based on the relationship between the parameters of the deceleration and the associated uterine contraction; ANNs with Radial Basis Functions (RBF) and Multi-Layer Perceptrons (MLP) were the best performing classifiers. Four feature selection methods, PCA, Info Gain, GAME and a meta-selection method [14], were analyzed; the meta feature selection method provided higher sensitivity and specificity than the other methods. The best feature selection method for a Naive Bayes classifier was reported, and feature subsets with 15 attributes and 10 attributes were compared [15]; the 15-attribute subset yielded better performance than the 10-attribute one. Chudacek et al. [16] used an SVM with a polynomial kernel, Naive Bayes, and a decision tree (C4.5 algorithm) to analyze FHR signals based on linear and non-linear features; they used three feature selection methods: Principal Component Analysis, Information Gain, and Group of Adaptive Models Evolution (GAME). Georgoulas et al. [17] proposed a feature selection method based on bPSO (binary Particle Swarm Optimization) for FHR signal analysis using SVM and k-NN.

3. Materials and methods

3.1 Dataset

The cardiotocography dataset is downloaded from the UCI Machine Learning Repository [1]. It contains 2126 instances with 21 attributes and 1 class attribute. The machine learning tool WEKA [18] is used for experimentation.

3.2 Classification Algorithms

The following four types of classifiers are used for data classification.

Naive Bayes Classifier (NB): Naive Bayes is a statistical classifier which assumes no dependency between attributes. It attempts to maximize the posterior probability in determining the class. In theory, this classifier has the minimum error rate, but this does not always hold in practice. According to Bayes' theorem,

P(A|B) = P(A) * P(B|A) / P(B), where P(B|A) = P(A ∩ B) / P(A).

Based on this formula, the Bayesian classifier calculates the conditional probability of an instance belonging to each class and assigns the instance to the class with the highest conditional probability.
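The experiments in this paper are run in WEKA. Purely as an illustration of the posterior-maximization idea described above, the following is a minimal sketch in scikit-learn; the file name "ctg.csv", the class column name "NSP" and the use of Gaussian likelihoods for the numeric attributes are assumptions, not details taken from the paper.

```python
# Illustrative sketch only: approximates a Naive Bayes classifier on the UCI CTG data.
# "ctg.csv" and the "NSP" class column are assumed names for an exported copy of the dataset.
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

data = pd.read_csv("ctg.csv")      # 2126 instances: 21 CTG attributes + NSP class
X = data.drop(columns=["NSP"])     # the 21 attributes (LB, AC, FM, ...)
y = data["NSP"]                    # 1 = Normal, 2 = Suspect, 3 = Pathologic

nb = GaussianNB()                  # picks the class with the highest posterior probability
scores = cross_val_score(nb, X, y, cv=10, scoring="accuracy")
print("10-fold cross-validated accuracy: %.1f%%" % (100 * scores.mean()))
```

GaussianNB models each numeric attribute with a normal density per class, which mirrors WEKA's default handling of numeric attributes in its NaiveBayes implementation; the exact figures reported in Section 4 still depend on WEKA's own estimators and evaluation setup.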
Decision Tree (DT): Decision trees are powerful and popular tools for classification and prediction. A decision tree represents rules that can be understood by humans and used in knowledge systems such as databases. It is a simple classifier that is easy to implement, requires no domain knowledge or parameter setting, and can handle high-dimensional data, which makes it well suited to exploratory knowledge discovery. The performance of decision trees can be enhanced with suitable attribute selection: correct selection of attributes partitions the data set into distinct classes. Our work uses the J48 decision tree for classification.

Multi-Layer Perceptron (MLP): The Multi-Layer Perceptron classifier uses the back propagation algorithm to classify data instances. The nodes in this neural network all use sigmoid activation. A back propagation neural network is a network of simple processing elements working together to produce an output; it learns a set of weights for predicting the class label of tuples. The network consists of an input layer, one or more hidden layers, and an output layer [13], each made up of units. The input layer corresponds to the attributes, and the inputs are fed simultaneously into its units. These inputs pass through the input layer and are then weighted and fed simultaneously to a second layer of neuron-like units, known as a hidden layer. The outputs of the hidden-layer units can be input to another hidden layer, and so on. The number of hidden layers is arbitrary, although in practice usually only one is used [4]. At its core, back propagation is simply an efficient and exact method for calculating all the derivatives of a single target quantity with respect to a large set of input quantities [13].

Radial Basis Function (RBF) Networks: An RBF network consists of three layers: an input layer, a hidden layer and an output layer. The hidden units provide a set of functions that constitute an arbitrary basis for the input patterns and are known as radial centers. The transformation from input space to hidden-unit space is nonlinear, whereas the transformation from hidden-unit space to output space is linear. Each center is a vector of the same dimension as the input (p x 1 for a p-input network). The radial basis functions in the hidden layer produce a significant non-zero response only when the input falls within a small localized region of the input space, so each hidden unit has its own receptive field in input space. The network uses the Gaussian radial function, and the activation function of a hidden unit computes the Euclidean distance between the input vector and the center of the unit.

3.3 Classification methods

Two types of classification approaches are employed, as shown in Fig. 1: the classification methods are applied once to the full-feature CTG data set (21 attributes) and once to a feature subset (15 attributes) obtained by feature subset selection, and the performance of the two approaches is evaluated and compared.

Fig. 1 Classification methods

3.4 Feature Selection

Feature selection is frequently used as a preprocessing step in machine learning. It is the process of choosing a subset of the original features so that the feature space is optimally reduced according to a certain evaluation criterion. The main goal of feature selection is to find a feature subset that produces higher classification accuracy.
High-dimensional data (i.e., data sets with hundreds or thousands of features) can contain a high degree of irrelevant and redundant information, which may greatly degrade the performance of learning algorithms. Feature selection therefore becomes very necessary for machine learning tasks. The following three feature selection methods from WEKA are used.

Gain Ratio Attribute Evaluation: Evaluates the worth of an attribute by measuring the gain ratio with respect to the class:
GainR(Class, Attribute) = (H(Class) - H(Class | Attribute)) / H(Attribute).

ReliefF Attribute Evaluation: Evaluates the worth of an attribute by repeatedly sampling an instance and considering the value of the given attribute for the nearest instances of the same and of a different class.

Symmetrical Uncertainty Attribute Evaluation: Evaluates the worth of an attribute by measuring the symmetrical uncertainty with respect to the class:
SymmU(Class, Attribute) = 2 * (H(Class) - H(Class | Attribute)) / (H(Class) + H(Attribute)).

4. Experimental results

The CTG dataset contains 2126 instances and 21 attributes with 1 class attribute. The dataset consists of measurements of fetal heart rate (FHR) and uterine contraction (UC) signals. The CTGs were classified by expert obstetricians and a consensus classification label was assigned to each of them, with respect to a fetal state class code (N - Normal, S - Suspect, P - Pathologic); the dataset can therefore be used for a 3-class experiment. The attributes of the dataset are described in Table 1.

Table 1: Attribute information of the CTG dataset

1  LB        FHR baseline (beats per minute)
2  AC        Accelerations per second
3  FM        Fetal movements per second
4  UC        Uterine contractions per second
5  DL        Light decelerations per second
6  DS        Severe decelerations per second
7  DP        Prolonged decelerations per second
8  ASTV      Percentage of time with abnormal short term variability
9  MSTV      Mean value of short term variability
10 ALTV      Percentage of time with abnormal long term variability
11 MLTV      Mean value of long term variability
12 Width     Width of FHR histogram
13 Min       Minimum of FHR histogram
14 Max       Maximum of FHR histogram
15 Nmax      Number of histogram peaks
16 Nzeros    Number of histogram zeros
17 Mode      Histogram mode
18 Mean      Histogram mean
19 Median    Histogram median
20 Variance  Histogram variance
21 Tendency  Histogram tendency
22 NSP       Fetal state class code (Normal=1; Suspect=2; Pathologic=3)

The following metrics are used for performance evaluation, where TP, FP, TN and FN denote True Positives, False Positives, True Negatives and False Negatives respectively:

Accuracy    = (TP + TN) / (TP + TN + FP + FN)
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
F-Score     = (2 * A * B) / (A + B), where A (precision) = TP / (TP + FP) and B (recall) = TP / (TP + FN)

4.1 Results

The classification techniques (NB, DT, MLP and RBF) are first compared on the original CTG dataset with 21 attributes, using Accuracy, Sensitivity, Specificity and F-Score. The experimental results are shown in Table 2.

Table 2: Comparison of classification techniques using the full-feature CTG data set (21 attributes)

Metrics        NB     DT     MLP    RBF
Accuracy       82.1   92.9   91.9   86.0
Sensitivity    82.1   92.9   91.9   86.0
Specificity    87.6   92.8   91.8   88.0
F-Score        83.7   92.9   91.8   86.7

From this, it is observed that DT yields better performance than the other classifiers in terms of accuracy.
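To make the evaluation above concrete, the sketch below derives the four metrics from a 3-class confusion matrix in a one-vs-rest fashion, giving one set of TP/FP/TN/FN values per class. Averaging the per-class values weighted by class support (as WEKA's summary output does) is an assumption about how the single figures in Table 2 were obtained.

```python
# Minimal sketch: per-class Accuracy, Sensitivity, Specificity and F-Score from a
# multi-class confusion matrix, treating each class one-vs-rest.
from sklearn.metrics import confusion_matrix

def per_class_metrics(y_true, y_pred, labels=(1, 2, 3)):
    cm = confusion_matrix(y_true, y_pred, labels=list(labels))
    total = cm.sum()
    results = {}
    for i, label in enumerate(labels):
        tp = cm[i, i]
        fn = cm[i, :].sum() - tp          # true class i, predicted as something else
        fp = cm[:, i].sum() - tp          # predicted as class i, actually something else
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0              # sensitivity
        results[label] = {
            "accuracy": (tp + tn) / total,
            "sensitivity": recall,
            "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
            "f_score": 2 * precision * recall / (precision + recall)
                       if (precision + recall) else 0.0,
        }
    return results

# Dummy example (not the paper's data): true labels vs. predicted labels
print(per_class_metrics([1, 1, 2, 3, 3], [1, 2, 2, 3, 1]))
```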
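Before moving to the feature-selection experiments, a minimal sketch of the Gain Ratio score defined in Section 3.4 may also help. The equal-width binning of numeric attributes and the reuse of the X, y variables from the earlier Naive Bayes sketch are illustrative assumptions; WEKA's GainRatioAttributeEval handles numeric attributes with its own internal discretization, so this is not the exact computation behind the results that follow.

```python
# Sketch of GainR(Class, Attribute) = (H(Class) - H(Class | Attribute)) / H(Attribute)
# for one numeric attribute, binned with a simple equal-width scheme for illustration.
import numpy as np

def entropy(values):
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(attribute, klass, bins=10):
    edges = np.histogram_bin_edges(attribute, bins=bins)
    binned = np.digitize(attribute, edges[1:-1])      # discretize the numeric attribute
    h_class = entropy(klass)
    # H(Class | Attribute): class entropy inside each bin, weighted by the bin's size
    h_cond = sum((binned == b).mean() * entropy(klass[binned == b])
                 for b in np.unique(binned))
    h_attr = entropy(binned)
    return (h_class - h_cond) / h_attr if h_attr > 0 else 0.0

# Hypothetical usage with X, y from the earlier sketch: rank all 21 attributes and
# keep the 15 highest-scoring ones to form the reduced dataset.
# scores = {col: gain_ratio(X[col].to_numpy(), y.to_numpy()) for col in X.columns}
# selected = sorted(scores, key=scores.get, reverse=True)[:15]
```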
Table 3: Comparison of feature selection methods using the Naive Bayes classifier

Metrics        Gain Ratio   ReliefF   Symmetrical Uncertainty
Accuracy       83.9         82.8      83.4
Sensitivity    83.9         82.8      83.4
Specificity    88.3         87.7      88.1
F-Score        85.2         84.2      84.8

Table 3 illustrates the performance of the Naive Bayes classifier using the three feature selection methods. The Gain Ratio feature selection method yields better performance than the other feature selection methods, so it is selected for further experimentation. The reduced CTG dataset with 15 attributes is formed using the Gain Ratio feature selection method. The classification techniques (NB, DT, MLP and RBF) are then compared on this feature-reduced CTG dataset with 15 attributes, and their performance is evaluated using the same metrics. The experimental results are shown in Table 4.

Table 4: Comparison of classification techniques using the reduced CTG dataset (15 attributes)

Metrics        NB     DT     MLP    RBF
Accuracy       83.9   93.3   92.6   87.2
Sensitivity    83.9   93.3   92.6   87.2
Specificity    88.3   93.2   92.4   89.0
F-Score        85.2   93.3   92.5   87.8

Table 4 shows that the Decision Tree again yields higher accuracy than the other classifiers with the reduced feature set. The overall performance is then compared between the full CTG dataset (21 attributes) and the feature-reduced CTG dataset (15 attributes); the comparison of the classification techniques on both datasets is shown in Fig. 2. The observations illustrate that classification using the reduced feature set performs better than classification with the full-feature dataset.

Fig. 2 Comparison of classification accuracy (%) of NB, DT, MLP and RBF before and after feature subset selection

5. Conclusion

In this paper, four classification techniques (NB, DT, MLP and RBF) are compared using the original CTG dataset with 21 attributes. The DT classifier yields better performance than the other classifiers in terms of accuracy. To improve accuracy further, feature selection is applied: the Gain Ratio feature selection method is applied to the original CTG dataset to obtain a reduced CTG dataset with 15 attributes. The performance of the classification techniques is then compared between the original CTG dataset (21 attributes) and the reduced CTG dataset (15 attributes). Classification with the reduced CTG dataset yields better performance than with the original dataset. In future work, other classification methods and feature selection techniques can be applied to this dataset to improve performance further.

References

[1] UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets/Cardiotocography
[2] Shahad Nidhal, M. A. Mohd. Ali and Hind Najah, "A novel cardiotocography fetal heart rate baseline estimation algorithm", Scientific Research and Essays, Vol.5, No.24, 2010, pp.4002-4010.
[3] Antonia Costa, MD; Diogo Ayres-de-Campos, PhD; Fernanda Costa, MD; Cristina Santos, MS; Joao Bernardes, PhD, "Prediction of neonatal acidemia by computer analysis of fetal heart rate and ST event signals", AJOG - American Journal of Obstetrics and Gynecology, 2009.
[4] N. Krupa, M. A. Mohd Ali, E. Zahedi, S. Ahmed and F. M. Hassan, "Antepartum fetal heart rate feature extraction and classification using empirical mode decomposition and support vector machine", BioMed. Eng. OnLine, Vol.10, No.61, 2011.
[5] J. Bernardes, C. Moura, J. P. de Sa, and L. P. Leite, "The Porto system for automated cardiotocographic signal analysis", J. Perinat. Med., 1991, pp.61-65.
[6] G. Rooth, A. Huch, and R. Huch, "Guidelines for the use of fetal monitoring", Int. J. Gynaecol. Obstet., Vol.25, 1987, pp.159-167.
[7] R. Mantel, H. P. van Geijn, F. J. Caron, J. M. Swartjes, E. E. van Woerden, and H. W. Jongsma, "Computer analysis of antepartum fetal heart rate: detection of accelerations and decelerations", Int. J. Biomed. Comput., Vol.25, No.4, 1990, pp.273-286.
[8] Sundar, C., M. Chitradevi and G. Geetharamani, "An Overview of Research Challenges for Classification of Cardiotocogram Data", Journal of Computer Science, Vol.9, No.2, 2013, pp.198-206.
[9] Mei-Ling Huang and Yung-Yan Hsu, "Fetal distress prediction using discriminant analysis, decision tree, and artificial neural network", J. Biomedical Science and Engineering, Vol.5, 2012, pp.526-533.
[10] Sundar Chinnasamy, Chitradevi Muthusamy and Geetharamani Gopal, "An Outlier Based Bi-Level Neural Network Classification System for Improved Classification of Cardiotocogram Data", Life Science Journal, Vol.10, No.1, 2013, pp.244-251.
[11] Sundar, C., M. Chitradevi and G. Geetharamani, "An Analysis on the Performance of Naive Bayes Probabilistic Model based Classifier for Cardiotocogram Data Classification", International Journal on Computational Sciences & Applications, Vol.3, No.1, 2013, pp.17-26.
[12] P. A. Warrick, E. F. Hamilton, and M. Macieszczak, "Neural network based detection of fetal heart rate patterns", in Proc. IEEE Int. Joint Conf. on Neural Networks, Vol.5, 2005, pp.2400-2405.
[13] M. Jezewski, P. Labaj, J. Wrobel, A. Matonia, J. Jezewski, and D. Cholewa, "Automated classification of deceleration patterns in fetal heart rate signal using neural networks", in IFMBE Proc., Vol.18, 2007, pp.5-8.
[14] V. Chudacek, J. Spilka, B. Rubackova, and M. Koucky, "Evaluation of Feature Subsets for Classification of Cardiotocographic Recordings", Computers in Cardiology, Vol.35, 2008, pp.845-848.
[15] Mohamed El Bachir Menai, Fatimah J. Mohder, and Fayha Al-mutairi, "Influence of Feature Selection on Naive Bayes Classifier for Recognizing Patterns in Cardiotocograms", Journal of Medical and Bioengineering, Vol.2, No.1, 2013, pp.66-70.
[16] V. Chudacek, J. Spilka, M. Huptych, G. Georgoulas, P. Janku, M. Koucky, C. Stylios, and L. Lhotska, "Linear and non-linear features for intrapartum cardiotocography evaluation", Computing in Cardiology, Vol.37, 2010.
[17] G. Georgoulas, D. Chrysostomos, and P. P. Groumpos, "Predicting the risk of metabolic acidosis for newborns based on fetal heart rate signal classification using support vector machines", IEEE Trans. Biomed. Eng., 2006, pp.875-884.
[18] Ian H. Witten and Eibe Frank, "Data Mining: Practical Machine Learning Tools and Techniques", Second Edition, Morgan Kaufmann Publishers, Imprint of Elsevier, 2005.