Diagnosis of Parkinson's Disease Using ANN and SVM Models
Y. Cuskun¹, K. Kaplan¹, H. M. Ertunc¹
yasincuskun@gmail.com, kaplan.kaplan@kocaeli.edu.tr, hmertunc@kocaeli.edu.tr
¹ Department of Mechatronics, Sensor Laboratory, Kocaeli, 41380, TURKEY
Abstract
Parkinson's disease is a chronic neurological disorder that occurs in the brain due to the lack of a substance called 'dopamine'. Individuals with this disease may have symptoms such as movement disorders, postural and balance disorders, changes in speech and changes in handwriting. In this study, speech changes were considered for disease identification. Speech data of diseased and healthy individuals were recorded and 22 speech features were extracted for each individual from these recordings. The extracted features were used to classify individuals as diseased or healthy by means of ANN (Artificial Neural Network) and SVM (Support Vector Machine) classification methods, which are widely used in machine learning. The purpose of the classification is to identify individuals with Parkinson's disease and provide a decision support mechanism for medical doctors. As a result of the experimental studies, Parkinson's disease was classified with 96.146% accuracy by the SVM model and 94.71% accuracy by the ANN model.
Keywords: Diagnosis of Parkinson's Disease, Classification, ANN, SVM
2. Introduction
Parkinson's disease is a chronic neurological disease that occurs in the brain with the lack of a substance called 'dopamine'. The disease was first described by the British physician James Parkinson in 1817 and characterized as 'shaking or trembling'. This disorder is a neurological health problem that affects more than 4 million people worldwide. It is estimated that 200 thousand people suffer from Parkinson's disease in Turkey [1]. Parkinson's disease is the most common neurodegenerative disease after Alzheimer's disease. It can cause movement disorders such as tremor and slowness of movement, muscular rigidity, posture and balance disorders, and changes in speech and handwriting. It is not possible to cure the disease with medical treatment, but the symptoms can be brought under control with medications and some surgical interventions. For this reason, in this study, diseased individuals were distinguished from
healthy individuals by considering the speech disorders seen in patients. In the study, 22 features were extracted from the speech data of 195 diseased and healthy individuals. Then, using ANN and SVM models, which are widely used in machine learning, diseased individuals were distinguished from healthy individuals with these features. The aim of the study is to help patients to control the symptoms even if it is not possible to treat the disease [1]. In some studies in the literature, classifications have been made for Parkinson's disease using classification methods such as stacked autoencoders and PNN (probabilistic neural networks) [2, 3]. In this study, ANN and SVM, classification methods that are commonly used in machine learning and usually give successful results, have been used for the diagnosis of Parkinson's disease.
3. Theoretical Background
3.1. Artificial Neural Network
Artificial neural networks are computer systems that simulate the learning function, which is the most basic feature of the human brain. They perform the learning process with the help of samples. These networks consist of interconnected artificial neurons, and each connection has a weight value. The information that an artificial neural network possesses is stored in these weight values and distributed over the network [4]. The learning process takes place when these weight values are updated. Artificial neural networks usually consist of three layers: the inputs of the system are received at the input layer, processed in the hidden layer, and transmitted to the output layer. During this process, the model inputs are first multiplied by initially random weights, and the value in each neuron of the hidden layer is determined as follows;
\mathrm{net}_j = \sum_{i=1}^{d} x_i \, w_{ij} + b_j \qquad (1)
where net_j represents the value of each neuron, d is the number of inputs, x_i is the i-th input, b_j is the bias term, and w_ij is the corresponding weight. After the net values are calculated, the output of each hidden neuron is found with the aid of the activation function. If the activation function is denoted by f, this is expressed as in the following equation;
π’š = 𝒇(𝒏𝒆𝒕𝒋 )
(2)
The activation function may be a sigmoid or hyperbolic tangent function, chosen in accordance with the structure of the system. After the value of each neuron in the hidden layer is found, the same operations are repeated to obtain the model output. If the model used is a complex
model, the number of hidden layers can be increased. Figure 1 shows an artificial neural network model with a single hidden layer.
Figure 1. Artificial Neural Network model with a single hidden layer.
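As a minimal illustration of Eqs. (1) and (2), and not the authors' MATLAB implementation, the forward pass through one hidden layer can be sketched in Python as follows; the 22-11-1 layer sizes and the sigmoid choice are assumptions made here for illustration only.

```python
import numpy as np

def sigmoid(x):
    # Activation function f(net) used in Eq. (2)
    return 1.0 / (1.0 + np.exp(-x))

def forward_layer(x, W, b):
    """One layer: net_j = sum_i x_i * w_ij + b_j (Eq. 1), y_j = f(net_j) (Eq. 2)."""
    net = x @ W + b          # weighted sum of the inputs plus bias
    return sigmoid(net)      # layer output after the activation function

# Hypothetical dimensions: 22 speech features -> 11 hidden neurons -> 1 output
rng = np.random.default_rng(0)
x  = rng.random(22)                       # one sample of 22 features
W1 = rng.standard_normal((22, 11)) * 0.1  # randomly initialised weights
b1 = np.zeros(11)
W2 = rng.standard_normal((11, 1)) * 0.1
b2 = np.zeros(1)

hidden = forward_layer(x, W1, b1)
output = forward_layer(hidden, W2, b2)    # value later thresholded at 0.5
```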
The learning of the model is usually carried out with the backpropagation algorithm. In the backpropagation algorithm, the output value is first computed using the randomly initialised weight values. Then, using this output value, the weights are updated as follows.
π’˜π’π’†π’˜ = π’˜π’π’π’… + βˆ†w
(3)
where the weight change Δw is
\Delta w_{new} = \eta \, \frac{\Delta J}{\Delta w_{old}} + \alpha \, \Delta w_{old} \qquad (4)
Here η is called the learning rate and α is called the momentum coefficient. The ΔJ value indicates the difference between the model output and the expected output. These steps are repeated for each training sample to find the most appropriate weight values for the model. In this way, the learning of the model is realized [6].
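The update of Eqs. (3) and (4) can be written compactly as a function. The following Python fragment is only a schematic sketch, not the authors' code; grad_J stands for the derivative of the error J with respect to the weights, and the negative sign is the usual gradient-descent convention.

```python
import numpy as np

def update_weights(w_old, dw_old, grad_J, eta=0.8, alpha=0.4):
    """Gradient-descent step with momentum, following Eqs. (3)-(4).

    w_old  : current weight matrix
    dw_old : previous weight change (momentum term)
    grad_J : dJ/dw, gradient of the error with respect to the weights
    eta    : learning rate; alpha: momentum coefficient (values from Table 1)
    """
    # Eq. (4): new weight change (negative sign so that the error decreases)
    dw_new = -eta * grad_J + alpha * dw_old
    # Eq. (3): updated weights
    w_new = w_old + dw_new
    return w_new, dw_new
```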
3.2. Support Vector Machines
The SVM classification method is generally used for binary classification. In this binary classification, a decision function is used to search for the most appropriate hyperplane that can divide the training data. The aim is to maximize the distance between the hyperplane and the nearest points when the hyperplane is located. As shown in Fig. 2, the hyperplane that maximizes this margin is called the optimal hyperplane, and the points that bound the margin width are called support vectors [7].
Figure 2. Optimum hyperplane and support vectors.
The hyperplanes passing through the support vectors can be found by the following equation.
π’˜. π’™π’Š + 𝒃 = ±πŸ
(5)
where w is the weight vector (the hyperplane normal) and b is the bias term [8]. To maximize the margin between the hyperplanes, the expression ||w|| must be minimal. In this case the following constrained optimization problem needs to be solved.
\min \left[ \tfrac{1}{2} \| w \|^2 \right] \qquad (6)
The related constraints are expressed in the form [9]
y_i (w \cdot x_i + b) - 1 \ge 0, \quad y_i \in \{-1, +1\} \qquad (7)
If this optimization problem is solved with Lagrange multipliers, the expression
L(w, b, \alpha) = \tfrac{1}{2} \| w \|^2 - \sum_{i=1}^{k} \alpha_i \, y_i \, (w \cdot x_i + b) + \sum_{i=1}^{k} \alpha_i \qquad (8)
is obtained. As a result, for a two-class problem that can be linearly separated, the decision function can be written as [9]
f(x) = \operatorname{sign} \left( \sum_{i=1}^{k} \lambda_i \, y_i \, (x \cdot x_i) + b \right) \qquad (9)
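Purely as an illustration of Eq. (9), and not code from the paper, the decision function reduces to the sign of a weighted sum over the support vectors:

```python
import numpy as np

def svm_decision(x, support_vecs, labels, lambdas, b):
    """Eq. (9): f(x) = sign( sum_i lambda_i * y_i * (x . x_i) + b ).

    support_vecs : array of shape (k, d), the support vectors x_i
    labels       : array of shape (k,), class labels y_i in {-1, +1}
    lambdas      : array of shape (k,), Lagrange multipliers
    b            : bias term of the hyperplane
    """
    s = np.sum(lambdas * labels * (support_vecs @ x)) + b
    return np.sign(s)
```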
In some cases, part of the training data cannot be separated linearly. In such cases the problem is solved by introducing a positive slack (artificial) variable, as in Fig. 3. The trade-off between maximizing the margin and minimizing the error is controlled with a parameter C that takes positive values [10].
Figure 3. Hyperplane identification for non-linear data sets.
When the regularization parameter and the slack variable are added, the optimization problem for non-linearly separable data classes becomes
\min \left[ \tfrac{1}{2} \| w \|^2 + C \sum_{i=1}^{r} \xi_i \right] \qquad (10)
with the constraints
y_i (w \cdot \varphi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N \qquad (11)
In order to solve the optimization problem expressed in Eqs. 10 and 11, the data that cannot be linearly separated in the input space is, as shown in Fig. 4, mapped to a higher dimensional space in which it can be linearly separated, and the hyperplane is determined there.
Figure 4. Mapping the data to a higher dimensional space with the kernel function.
In the mathematical model of the SVM, the data can be classified linearly by using a kernel function expressed as K(x_i, x_j) = φ(x_i) · φ(x_j). In this case, the decision function can be written as follows [10].
f(x) = \operatorname{sign} \left( \sum_i \alpha_i \, y_i \, \varphi(x) \cdot \varphi(x_i) + b \right) \qquad (12)
4. Method
Speech data from Parkinson's patients and healthy individuals were recorded and 22 features were extracted from these data. The features were obtained from the UCI database [11]. Some of these features are the average vocal fundamental frequency, the maximum vocal fundamental frequency, the minimum vocal fundamental frequency, several measures of variation in fundamental frequency, and several measures of variation in amplitude. A 195x23 data matrix was obtained after these features were collected. For a reliable classification, approximately 80% of the data set consisting of diseased and healthy individuals was selected as the training data (156 samples) and the remaining 20% as the test data (39 samples). After the data sets were created, the training and test sets were obtained by shuffling the data. The data sets were then classified by a program run in the MATLAB environment.
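The classification itself was run in MATLAB. Purely to illustrate the data preparation described above, the following Python sketch loads the UCI Parkinsons file (assumed here to be the standard parkinsons.data CSV with a name column, 22 feature columns and a status label) and performs a shuffled 156/39 split:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the UCI Parkinsons dataset: 195 rows, 22 speech features plus a status label
data = pd.read_csv("parkinsons.data")          # assumed local copy of the UCI file
X = data.drop(columns=["name", "status"])      # 22 extracted speech features
y = data["status"]                             # 1 = Parkinson's, 0 = healthy

# Shuffle and split into 156 training and 39 test samples (about 80% / 20%),
# keeping the diseased/healthy ratio similar in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=39, shuffle=True, random_state=42, stratify=y)
```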
4.1. Artificial Neural Network
An ANN model with two hidden layers was created, and the training and test data sets were randomly determined. The ANN model constructed for this study is given in Fig. 5. First, with the forward propagation algorithm, the output value is computed from the initially random weight values, and the error between the model output and the expected output is found. This error is used in the backpropagation algorithm, in which the error value is differentiated with respect to the weight values in each layer in turn. This derivative is multiplied by the learning rate and the weight values are updated accordingly.
Figure 5. ANN model used in the classification process.
The parameters used in the ANN model are given in Table 1. These values were found intuitively.
Table 1. ANN parameters.

ANN Parameters                              Value
η (Learning Rate)                           0.8
α (Momentum Coefficient)                    0.4
Number of Neurons in First Hidden Layer     11
Number of Neurons in Second Hidden Layer    5
Activation Function                         Sigmoid
Since the ANN produces a continuous output value, this value is thresholded so that diseased and healthy individuals can be identified. The threshold is chosen according to the activation function; since the sigmoid is used as the activation function, the threshold is set to 0.5.
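The model itself was implemented in MATLAB. As a rough equivalent of the configuration in Table 1 (two hidden layers of 11 and 5 sigmoid neurons, learning rate 0.8, momentum 0.4), a scikit-learn sketch, continuing from the split shown earlier, could look as follows; MLPClassifier adds its own output layer and its predict() method thresholds the sigmoid output at 0.5, so this only approximates the described model.

```python
from sklearn.neural_network import MLPClassifier

# Approximate the ANN of Table 1: two hidden layers (11 and 5 neurons),
# sigmoid activation, learning rate 0.8 and momentum 0.4 with plain SGD.
ann = MLPClassifier(hidden_layer_sizes=(11, 5),
                    activation="logistic",
                    solver="sgd",
                    learning_rate_init=0.8,
                    momentum=0.4,
                    max_iter=2000,
                    random_state=42)

ann.fit(X_train, y_train)
# predict() thresholds the sigmoid output at 0.5, as described above
y_pred_ann = ann.predict(X_test)
```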
4.2. Support Vector Machines
Using the same data set that was employed for the ANN model, the classification process was repeated with the SVM method. To this end, a non-linear classification was performed by the SVM model. The kernel function (Radial Basis Function) is shown in Eq. 13.
(π’Š)
(𝒋)
𝑲(𝒙 , 𝒙 ) = 𝒆
||π’™π’Š −𝒙𝒋 ||𝟐
𝟐𝝈𝟐
(13)
Table 2. SVM parameters.

SVM Parameters     Value
C                  1
Kernel Function    RBF
σ                  1
The parameters used in the SVM model were determined empirically and are given in Table 2.
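Again as an illustration rather than the authors' MATLAB code, an RBF-kernel SVM with the Table 2 parameters can be set up with scikit-learn, continuing from the split shown in Section 4; note that scikit-learn parameterises the kernel of Eq. (13) with gamma = 1/(2σ²), so σ = 1 corresponds to gamma = 0.5.

```python
from sklearn.svm import SVC

# RBF-kernel SVM with the parameters of Table 2 (C = 1, sigma = 1).
# scikit-learn's RBF is exp(-gamma * ||x_i - x_j||^2), so gamma = 1 / (2 * sigma**2).
sigma = 1.0
svm = SVC(kernel="rbf", C=1.0, gamma=1.0 / (2.0 * sigma ** 2))

svm.fit(X_train, y_train)
y_pred_svm = svm.predict(X_test)
```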
5. Results and Discussions
The test set error formula used in the results is given in Eq. 14, the Precision and Recall values used for the f1 score are given in Eq. 15, and the f1 score is given in Eq. 16.
\frac{\sum_{i=1}^{N} (O_t - O_m)^2}{N} \qquad (14)
where N is the number of data points in the test set, O_t is the target output value, and O_m is the model output.
P\,(\mathrm{Precision}) = \frac{TP}{TP + FP}, \qquad R\,(\mathrm{Recall}) = \frac{TP}{TP + FN} \qquad (15)
where TP is the number of correctly classified Parkinson's individuals, FP is the number of healthy individuals incorrectly classified as Parkinson's, and FN is the number of Parkinson's individuals incorrectly classified as healthy. Precision (P) indicates the ratio of correctly predicted positive observations to the total number of predicted positive observations. Recall (R) indicates the ratio of correctly predicted positive observations to all observations that actually belong to the positive class.
π’‡πŸ 𝒔𝒄𝒐𝒓𝒆 =
𝟐∗𝑷∗𝑹
𝑷+𝑹
(16)
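A direct transcription of Eqs. (14)-(16) into Python (only a sketch; the original evaluation was carried out in MATLAB) could look like this, with class 1 taken as Parkinson's:

```python
import numpy as np

def test_error(o_target, o_model):
    """Mean squared test error of Eq. (14)."""
    o_target = np.asarray(o_target, dtype=float)
    o_model = np.asarray(o_model, dtype=float)
    return np.sum((o_target - o_model) ** 2) / len(o_target)

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and f1 score of Eqs. (15)-(16)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))   # correctly detected Parkinson's
    fp = np.sum((y_pred == 1) & (y_true == 0))   # healthy classified as Parkinson's
    fn = np.sum((y_pred == 0) & (y_true == 1))   # Parkinson's classified as healthy
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1
```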
The 5-fold cross validation test errors and f1 scores obtained are shown in Table 3;
Table 3. 5-fold cross validation values and averages.

5-fold cross      ANN                          SVM
validation        Test Error   f1 score (%)    Test Error   f1 score (%)
1. Trial          0.0980       93.33           0.0862       94.74
2. Trial          0.0483       96.67           0.0517       96.84
3. Trial          0.1204       91.80           0.1034       93.48
4. Trial          0.0739       95.08           0.0172       98.9
5. Trial          0.0549       96.67           0.0517       96.77
Average           0.0791       94.71           0.06204      96.146
As seen in Table 3, the average test error of the ANN model is 0.0791 and its average f1 score is 94.71%, while the average test error is 0.06204 and the average f1 score is 96.146% for the SVM classification. When the values obtained from the models are examined, it can be seen that the classification success rate of SVM is higher than that of the ANN model. The classification results with ANN are shown in Fig. 6 and the classification results with SVM are shown in Fig. 7, respectively.
Figure 6. Classification results with ANN.
Figure 7. Classification results with SVM.
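The 5-fold comparison summarised in Table 3 was produced in MATLAB; a rough scikit-learn equivalent that reuses the model definitions sketched in Section 4 and reports only the f1 score would be:

```python
from sklearn.model_selection import cross_val_score

# 5-fold cross validation of both models on the full feature matrix,
# scoring with the f1 score as in Table 3 (the test errors would need a custom scorer).
for name, model in [("ANN", ann), ("SVM", svm)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean f1 = {100 * scores.mean():.2f}%")
```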
6. Conclusions
In this study, 22 features extracted from the speech data of Parkinson's and healthy individuals were classified with the ANN and SVM classification methods. As shown in Table 3, when the SVM and ANN classifications were compared, the classification accuracy with SVM was 96.146% and the accuracy with ANN was 94.71%. Thus, the SVM classification was observed to be more successful. As a result of this study, the decision-support model obtained by the SVM method can help doctors to make decisions for the diagnosis of Parkinson's disease.
7. Acknowledgment
This work was performed at the Sensor Laboratory of the Department of Mechatronics Engineering, Kocaeli University.
References
[1] İsenkul M E 2011 Parkinson Hastalığının Teşhisi İçin Veri Toplama ve Örüntü Tanıma Sistemi
[2] Badem H, Çalışkan A, Baştürk A and Yüksel M E 2016 Electrical-Electronics and Biomedical Engineering Conference Yığınlanmış Özdevinimli Kodlayıcı ile Parkinson Hastalığının Sınıflandırılması ve Teşhis Edilmesi p 1
[3] Barışçıl M S, Çetin O, Er O and Demirtaş F 2012 Electric Letters on Science & Engineering Olasılıksal Sinir Ağının (PNN) Parkinson Hastalığının Teşhisinde Kullanılması p 1
[4] Öztemel E 2012 Papatya Publishing and Education Yapay Sinir Ağları
[5] Bayram S, Kaplan K, Kuncan M and Ertunç H M 2013 Turkish National Committee for Automatic Control Bilyeli Rulmanlarda Zaman Uzayında İstatistiksel Öznitelik Çıkarımı ve Yapay Sinir Ağları Metodu ile Hata Boyutunun Kestirimi p 986
[6] Duda R O, Hart P E and Stork D G 2000 Pattern Classification (New York: Wiley-Interscience)
[7] Vapnik V N 2000 The Nature of Statistical Learning Theory (New York: Springer Science & Business Media)
[8] Kavzoğlu T and Çölkesen İ 2010 Harita Dergisi Destek Vektör Makineleri ile Uydu Görüntülerinin Sınıflandırılmasında Kernel Fonksiyonlarının Etkilerinin İncelenmesi p 73
[9] Osuna E E, Freund R and Girosi F 1997 Massachusetts Institute of Technology, Artificial Intelligence Laboratory Support Vector Machines: Training and Applications p 144
[10] Cortes C and Vapnik V 1995 Kluwer Academic Publishers Support Vector Networks p 273
[11] Lichman M 2013 UCI Machine Learning Repository (CA: University of California, School of Information and Computer Science) http://archive.ics.uci.edu/ml