1401806

advertisement
1
AUTOMATIC CLASSIFYING METHOD FOR WEDGE
2
TIGHTNESS BY SUPPORT VECTOR MACHINE AND
3
ARTIFICIAL NEURAL NETWORK
4
5
Running head: Automatic wedge tightness by Support Vector Machine and
6
Artificial Neural Network Technique
7
8
Thanachai Poombansao1*, Waree Kongprawechnon1, Somsak
9
Kittipiyakul1, Chonlada Theerawon2, Muntita Charoenlarp1 and It
10
Chunsangnate1
11
12
1
School of Information, Computer, and Communication Technology (ICT),
13
Sirindhorn International Institute of Technology, Thammasat University, Khlong Luang,
14
Pathum Thani, 12121, Thailand. Tel. 0-2986-9009; Fax. 0-2986-9112;
15
E-mail: t.poombansao@gmail.com
16
2
17
2
National Electronics and Computer Technology Center (NECTEC), Khlong Luang, Pathum
18
Thani, 12121, Thailand. Tel. 0-2564-6900; Fax. 0-2564-6901-3;
19
*Corresponding author
20
21
Abstract
22
This study proposed an automatic classifying system for wedge tightness of a
23
generator’s wedge. It consists of 4 processes called data collection,
24
preprocessing, feature extraction, and classification. The aim of this study is to
25
classify and verify the tightness of a generator’s wedge for the Electricity
26
Generating Authority of Thailand (EGAT) by using Machine Learning
27
algorithms, called Support Vector Machine (SVM) and Artificial Neural
28
Network (ANN). The linear and radial basis function (RBF) are selected for
29
SVM classifier. The evaluation is completed by using a 10-fold cross validation
30
technique to give high accuracy and a low number of False Negatives (FN).
31
From the simulation results, the signals extracted in the frequency domain has
32
less FN than the time domain, and ANN gives the best performance among
33
classifiers.
34
Keywords: Wedge tightness signal, Support Vector Machine, Artificial Neural
35
Network, classification
2
3
36
37
Introduction
38
Nowadays, the use of electricity is increasing rapidly around the world due to the
39
exponential growth of new technology and population. Electricity is involved in all
40
daily activities such as industries, hospitals, and houses. Therefore, more usage of
41
power plants is needed to support the demand of users, which cause the deterioration
42
of generators. The information of a generator such as the tightness of a generator’s
43
wedge is very important to evaluate the safety of the generator. Normally, human
44
judgment has been used to determine the wedge performance inside the generator by
45
inspection and evaluation. A human can check the wedge tightness by tapping with a
46
hammer, feeling for vibration, listening to the sound by trained ears, and recording
47
the degree of looseness on a chart. A tight wedge has very little vibration or hard
48
ringing sound. On the other hand, a loose wedge has noticeably vibration and a very
49
hollow sound (Geoff Klempner and Isidor Kerszenbaum, 2004). This method is very
50
subjective because it depends on individual judgment from operators by hearing the
51
sound from a slot wedge (Summer project pal, 2011). However, it consumes both
52
time and cost because the rotor in the generator has to be removed and the generator
53
needs to shut down during operation. Hence, a robot has been used instead of a
54
human to determine the wedge performance, which is cost efficient because the rotor
55
does not need to be removed during operation. George introduced the testing of stator
56
wedge tightness in an electrical generator that contains a vibration sensor attached to
3
4
57
the stator core lamination and a mounting system for mounting the base assembly to
58
impact assembly (George, F.D. et al, 1996). At first, robots used for generators in
59
Thailand were imported robots. However, the robots are very expensive and too big
60
for some generators in Thailand. Therefore, the Electricity Generating Authority of
61
Thailand (EGAT) plans to develop its own robot that is smaller with lower cost. The
62
robot also is required to automatically classify the wedge performance of a generator.
63
The task of a robot can be divided into two main functions including wedge
64
tapping control and automatic analyze wedge tightness system. Wedge tapping
65
control is used to adjust the tapping pattern of the robot to be a single, periodic, or
66
repeating pattern. The wedge tightness signals are analyzed afterward. There are 4
67
steps to analyze wedge tightness signals including data collection, preprocessing,
68
feature extraction, and classification.
69
Data collection is the raw data received by the robot to illustrate the wedge
70
inside the generator. Preprocessing refers to improving the raw data quality including
71
filtering, and noise cancellation. Feature extraction reduces the pattern vector to a
72
lower dimension and keeps the useful information from the original vector, called the
73
feature vector. Classification takes the feature vector as an input to define the
74
tightness of the wedge by placing the feature vector into the most appropriate class
75
(Tohka J., 2013). Many classification methods have been applied to the system such
76
as nearest neighbor classifier, wavelet transform, probabilistic classification, and
77
multiclass classification.
4
5
78
SVM classification is an alternative way for pattern recognition by using the
79
separation on the hyperplane (decision boundary) to distinguish data diversity. ANN
80
classification is the machine learning algorithm to teach to recognize and remember
81
different traits. ANN also has low computation loads, and flexibility for pattern
82
recognition. A computational neuron can produce either a linear or non-linear input
83
which allows the network to learning efficiently. ANN is configured for a specific
84
application such as pattern recognition or data classification through a learning
85
process. There are many research works with SVM and ANN for classification such
86
as Cardiac Auscultation Analysis, a screening system to diagnose disease from heart
87
sounds in digital format (Phatiwuttipat, P. et al, 2011).
88
classification methods also are used for analyzing construction materials with
89
ultrasonic waves to find the number of defects, and the result is very accurate
90
(Saechai, S. et al, 2012). The objective of this work is to develop an automatic
91
classification system for determining the tightness of a generator’s wedge.
The SVM and ANN
92
This study is beneficial for identifying the tightness of a generator’s wedge
93
in a power plant. A brief description of both SVM and ANN classifiers, the
94
methodology, and the comparison between both classification results is discussed in
95
the following sections.
96
97
Material and methods
5
6
98
System Configuration
99
In the experiment, Figure 1 illustrates the developed robot by the National
100
Electronic and Computer Technology Center (NECTEC) with size 180-300
101
millimeters wide, 300 millimeters long and 20 millimeters thick. The developed
102
robot is cheaper than the cost of an imported robot (about 250,000 U.S. dollars).
103
The overall robot system is shown in Figure 2. The control box acts like a
104
brain and nervous system which is responsibility for controlling and linking all
105
functions into the entire system, including movement of a camera, movement of
106
robot, and wedge tapping. The display monitor is used for monitoring each activity in
107
the system including video camera, wedge’s result, position of the robot and the
108
graphical user interface (GUI). The joystick is used for controlling the direction of the
109
camera, the movement of robot, and tapping pulse. In addition, the robot connects to
110
the control box via cable. This study focuses on wedge tightness signal analysis.
111
The raw data of a wedge tightness signal obtained from the robot goes
112
through the process called “automatic analyze wedge tightness system” in order to
113
classify the data. The block diagram in Figure 3 illustrates the signal processing part
114
of the wedge tightness signal.
115
116
Data collection
6
7
117
Data collection comes from the robot tapping on the generator and receiving a
118
reacted signal back. “Signal conditioning” is used in order to focus and make the
119
signals even clearer. A Data Acquisition System (DAQ) is used to convert analog
120
data samples from the robot into digital numeric values to plot a graph (Figure 4).
121
Figure 4 a) and Figure 5 a) show the tight wedge data signal and the loose wedge
122
signal, respectively. The real raw data is obtained from the board calibration, which
123
has 20,000 sampling points. The robot taps every 100 ms and each sampling point has
124
a time interval of 50 µs.
125
In this experiment, there are 240 inputs for each wedge signal (tight and loose). The
126
inputs have been reduced into 2,000 sampling points to extract only significant parts.
127
The 2,000 sampling points of tight wedge signal and loose wedge signal are shown in
128
Figure 4 and Figure 5, respectively.
129
130
Preprocessing
131
Preprocessing is used for the preparation of signal. Preprocessing refers to
132
operations performed on the raw data to improve its quality such as filtering, and
133
noise cancellation. The wedge tightness signal response is reduced by using a
134
threshold to capture the significant part. The threshold is set to 2,000 sampling points
135
to remove the unwanted part of the signal.
136
7
8
137
Feature extraction
138
Feature extraction extracts the significant features from the raw data. Fast
139
Fourier Transform (FFT) is used for feature extraction in the frequency domain. FFT
140
is a very useful tool for signal analysis to obtain the features for classification (Kim et
141
al., Unpublished). FFT computes the same as the Discrete Fourier Transform (DFT)
142
by letting
143
as follows:
be complex numbers. The DFT can be defined by the formula
144
145
where
146
By contrast, FFT takes approximately
147
which takes approximately
148
experiment to obtain the input features of classifiers as follows.
149
150
151
152
operations which is faster than DFT,
operations. There are 2 methods used in this
i) Time domain classifier: 2,000 data points in the time domain are used in
this experiment, as shown in Figure 6.
ii) Frequency domain classifier: 2,000 data points are transformed to the
frequency domain by using Fast Fourier Transform (FFT), as shown in Figure 7.
153
154
Classification
8
9
155
The input sample is separated into 2 classes including 120 tight wedge data
156
signals and 120 loose wedge data signals. The classification methods used in this
157
experiment are Support Vector Machine (SVM) and Artificial Neural Network
158
(ANN).
159
160
Support Vector Machine (SVM)
161
Support Vector Machine (SVM) is one of the important models for pattern
162
recognition and classification. The main concept of SVM is to transform an input
163
signal into feature space and divide the clear gap between classes with an optimal
164
hyperplane (decision boundary) and maximum margin (support vector) (Avci, E.,
165
2009), as shown in Figure 8 (Chatunapalak, I. et al., 2012). Since most of the data are
166
non-linear and non-separable, a kernel function is used for mapping a non-linear
167
input space into a linear space (higher dimensional feature space) to perform the
168
separation (Ahmad, A.R. et al., 2009). Figure 9 illustrates how to transform a non-
169
linear data point to a higher dimensional feature space by using SVM with a kernel
170
function.
171
Many kernel functions are used for SVM classifiers such as linear, polynomial, radial
172
basis unction (RBF), Fourier, etc. The kernel parameters in a kernel function must be
173
set properly to maximize the accuracy of SVM classification. This study selected
174
linear and RBF kernels for SVM classification.
9
10
175
SVM classification trained and evaluated input samples by a data splitting
176
method. The training and testing samples generally split into 90% and 10% of the
177
input samples for each class, respectively. The input, training, and testing samples are
178
provided in Table 1. The binary classifications of SVM are performed for
179
classification. Each data input is prepared in the form of row vector of size
180
where
181
columns of features (2,000 features), as shown in the formula as follows:
represents the number of input samples. Each input sample contains
182
183
The output of a wedge tightness signal depends on the data class. The
184
numbers 1 and 0 are set as a target for class 1 and class 2 (tight and loose wedge),
185
respectively. The output vector for binary classification respectively is presented as
186
follows:
187
188
The proper kernel function is selected to train the non-linear SVM classifier. The
189
linear and RBF kernels are used to train classifiers and test results because the RBF
190
kernel has good classification according to many studies. The RBF kernel gives high
191
classification performance in this study. Ten-fold cross validation is applied for
192
classification, as shown in Figure 10. The datasheets are randomly divided into 10
193
subsets. Each subset is constructed and trained by using 9 out of 10 (90%) folds and
10
11
194
tested 1 out of 10 (10%) to obtain a cross validation estimate of its error rate. The
195
process is repeated in the same steps for 10 subsets and averaged for classification
196
accuracy from all data (Delen et al., 2005). The proposed SVM classification used
197
MATLAB for simulations.
198
199
Artificial Neural Network (ANN)
200
An Artificial Neural Network (ANN), usually called Neural Network (NN), is
201
a good technique for determining pattern recognition. The main concept of ANN is
202
inspired by animal nervous systems (brain), which is composed of the large number
203
of highly interconnected processing elements (neurons) working to solve specific
204
problems through a learning process. The learning process of ANN includes adaptive
205
learning, self-organization, real-time operation, and fault tolerance. ANN uses the
206
back-propagate model to solve non-linear signals.
207
The back-propagated model is a feed-forward multi layered ANN that allows
208
a signal to travel one way from input to output with no feedback (loops). Figure 11
209
illustrates the back-propagated model of ANN. According to Figure 11, the ANN
210
network consists of 3 layers including input layer, hidden layer, and output layer. The
211
input layer represents the raw data fed into the network as an input, which forwards to
212
the hidden layer. The hidden layer consists of neurons to solving complex problem by
11
12
213
calculation, and forwards the result into the output layer. The output layer takes the
214
results from the hidden layer, performs a calculation, and gives the final result.
215
The input samples and classes are the same as SVM. MATLAB is used to
216
train, validate, and test for Neural Network Pattern Recognition Protocol (nprtool) by
217
selected input and target. Training, validation, and testing is set to 85%, 5%, and
218
10%, respectively, and the number of hidden neurons is set to 10.
219
220
Results and discussion
221
Each raw data of wedge tightness signal performs preprocessing and feature
222
extraction to obtain the input features for the classifier. The number of total input
223
samples, training samples, and testing samples are mentioned in a previous section.
224
The trained classifier is evaluated by test samples. The classification accuracies are
225
analyzed by using two different types of input features from the data that were
226
obtained in the time domain and frequency domain. The efficiency of each single
227
classification is obtained by a 2×2 confusion matrix. The confusion matrix is a table
228
that shows the performance of the algorithm. An example of a confusion matrix is
229
shown in Figure 12. The first and second rows represent the data with target tight
230
wedge and loose wedge, respectively. The first and second columns represent the
231
predictor of a tight wedge and loose wedge, respectively. The true positives (TP) are
232
the number of tight wedges which are correctly classified. False positives (FP) are the
12
13
233
number of loose wedges which are misclassified from tight wedges. On the other
234
hand, true negatives (TN) are the number of loose wedges which are correctly
235
classified and false negatives (FN) shows the number of tight wedges which are
236
misclassified from loose wedges. Positive predictive value (PPV) and negative
237
predictive value (NPV) are the rate that the predictor can give a correct classification
238
for the tight wedge and loose wedge, respectively. Sensitivity and specificity are the
239
correctness prediction for tight wedge and loose wedge, respectively. Accuracy
240
(ACC) value computes the average accuracy in the classification tables. All Values in
241
the tables are the average classification accuracies of 10 confusion matrices.
242
243
Time Domain Analysis
244
The result of classification algorithms for wedge tightness signal in the time
245
domain is shown in Table 1 including SVM (linear and RBF) with kernel functions
246
and ANN classifier.
247
From Table 1, the value of specification in the time domain between 2 classes
248
(tight and loose wedge) is classified. The sensitivity of ANN (98.24%) has better
249
performance than VSM in both linear (96.08%) and RBF (96.1%) functions. The
250
specification of ANN (90.52%) also has better efficiency than the others (linear
251
88.59% and RBF 88.67%). In comparison, the SVM with RBF kernel function has an
252
accuracy of about 91.38% which is worse than that of SVM with linear function
13
14
253
(92.34%), but has twice faster average training time than that of the linear function.
254
From the result, ANN classifier provides the best efficiency, compared to both SVM
255
(linear and RBF) classifiers in the time domain. However, a false negative is very
256
important for wedge tightness because the misclassification of a loose wedge can
257
cause uncertainly in a generator. Since false negatives still exist in the time domain
258
(1.25% for ANN, 1.97% for SVM with linear function, and 1.98% for SVM with
259
RBF function), frequency domain classification is applied to decrease the number of
260
false negatives.
261
262
Frequency Domain Analysis
263
The classification result in frequency domain is shown in Table 2 including
264
SVM (linear and RBF) with kernel functions and ANN classifiers. From Table 2, the
265
number of false negatives for SVM with RBF function is 0.08%, which is the worst
266
classification, compared to the other methods. The numbers of false negatives in
267
ANN and SVM with linear function are reduced to 0%, which is a perfect
268
classification for wedge tightness. The training time decreases when using SVM with
269
linear function (11.182s), SVM with RBF function (7.095s), and ANN (2s) methods.
270
271
Conclusion
14
15
272
The automatic classification method for wedge tightness by support vector machine
273
and artificial neural network provides good recognition and classification results in
274
the frequency domain. The classification result shows that the ANN classification
275
method has the best performance, which has the highest accuracy and shortest
276
training time. The result shows that the frequency domain signal has more accuracy
277
than the time domain signal. By comparison between 2 SVM methods, the linear
278
function provides a better solution in terms of accuracy, but consumes more time than
279
the RBF function does.
280
For further study, binary-classification (tight or loose wedge) from this
281
experiment may not be sufficient to identify a suspected loose wedge generator in a
282
real environment. The multi-classification technique can be developed in the future to
283
predict how many degrees of looseness in a suspected loose wedge for a generator.
284
For instance, multi-class support vector machine has been used by an automatic
285
screening method to detect glaucoma for early treatment and preserve eye sight
286
(Vejjanugraha, P. et al, 2013).
287
wedge tightness needs Graphic User Interface (GUI) for ease of use and
288
understanding.
Moreover, the automatic classifying method for
289
290
Acknowledgements
15
16
291
This research is financially supported by School of Information, Computer, and
292
Communication Technology (ICT) and the National Research University Project,
293
Thailand office of Higher Education Commission (NRU). The author would like to
294
thank the National Electronics and Computer Technology Center (NECTEC) and
295
Electricity Generating Authority of Thailand (EGAT) for advice and data in this
296
research.
297
298
299
300
301
302
303
304
305
306
307
16
17
308
309
310
References
311
Geoff Klempner and Isidor Kerszenbaum (2004). Stator winding mechanical tests. In:
312
Operation and maintenance of large turbo-generators. John Wiley & Sons,
313
Inc, Hoboken, New Jersey, p. 485-487.
314
Summer project pal (2011). A new method for Stator Slot Wedge Testing of Large
315
…… Generators. Available from: http://topicideas.net/how-to-a-new-method-for-
316
…… stator-slot-wedge-testing-of-large-generators. Access date: Jan, 2001.
317
George, F.D., Mark, W.F., and Harry, L.S., inventors; Westinghouse Electric
318
...…….Corporation, assignee. February 27, 1996. System and method for impact
319
…… testing wedge tightness in an electrical generator. U.S. patent no. 5,493,894.
320
Tohka, J. (2013). Introduction to Pattern Recognition. Department of Signal
321
……….Processing, Tempere University of Technology.
322
Phatiwuttipat, P., Kongprawechnon, W., and Tungpimolrut, K. (2012). Cardiac
323
…........Auscultation Analysis System with Neural Network and SVM technique.
324
………International Institute of Technology, Thammasat University, Pathum
325
………Thani, Thailand.
17
18
326
Saechai, S., Kongprawechnon, W., and Sahamitmongkol, R. (2012). Test System for
327
………Defect Detection in Construction Material with Ultrasonic Waves by Support
328
………Vector Machine and Neural Network, SCIS-ISIS 2012, Kobe, Japan.
329
Kim, J.S., Kim, I.B., and Yoo, K.S. An experimental comparison of ultrasonic pulse
330
………velocity and frequency domain signal of concrete, unpublished.
331
Avci, E. (2009). A new intelligent diagnosis system for the heart valve disease by
332
………Using genetic-SVM classifier. Expert Systems with Applications, 36: 10618-
333
………10626.
334
Ahmad, A.R., Khalid, M., Gaudin, C.V., and Poisson, E. (2004). Comparison of
335
………support vector machine and neural network in character level discriminant
336
………training for online word recognition. UNITEN Students Conference on
337
………Research and Development, Malaysia.
338
Chatunapalak, I., Phatiwuttipat, P., Kongprawechnon, W., and Tungpimolrut, K.
339
………(2012). Childhood musical murmur classification with support vector machine
340
………technique. In Proceeding of the 9th International Conference on Electrical
341
………Engineering/Electronics, Computer, Telecommunications and Information
342
………Technology (ECTI-CON 2012), Hua Hin, Thailand.
343
Delen, D., Walker, G., and Kadam, A. (2005). Predicting breast cancer survivability:
344
………a comparison of three data mining methods. Artificial Intelligence in
345
………Medicine, 34: 10618-10626.
18
19
346
Vejjanugraha, P., Kongprawechnon, W., Kondo, T., Tungpimolrut, K., Kaneko, H.,
347
………and Sintuwong, S. (2013). An Automatic Screening Method For Detecting
348
………Glaucoma
349
………University, Khlong Luang, Pathum Thani, Thailand.
Using
Multi-class
Support
350
351
352
353
354
355
356
357
358
359
360
361
362
19
Vector
Machine,
Thammasat
20
363
Table 1. Classification results in the time domain
Methods
Kernel
TP
FP
FN
TN
PPV
NPV
Sensitivity
Specification
ACC
Training
Function
(%)
(%)
(%)
(%)
(%)
(%)
(%)
(%)
(%)
time (S)
ANN
-
46.25
5
1.25
47.51
90.55
97
98.24
90.52
93.75
2.000
SVM
Linear
48.04
5.71
1.97
44.31
89.4
95.79
96.08
88.59
92.34
7.460
RBF
48.03
6.66
1.98
43.34
87.84
95.69
96.1
86.67
91.38
3.857
364
365
366
367
368
369
370
371
20
21
372
Methods
373
Kernel
TP
FP
FN
TN
PPV
NPV
Sensitivity
Specification
ACC
Training
Function
(%)
(%)
(%)
(%)
(%)
(%)
(%)
(%)
(%)
time (S)
ANN
-
56.67
0
0
43.33
100
100
100
100
100
2.000
SVM
Linear
50
0
0
50
100
100
100
100
100
11.182
RBF
49.92
0.46
0.08
49.55
99.12
99.84
99.84
99.09
99.47
7.095
Table 2. Classification results in the frequency domain
374
21
22
375
376
377
378
Figure 1. Imported robot for testing.
379
380
381
382
383
384
22
23
385
386
Control box
Control
Movement of
camera
PLC
+
Joystick
Automatic
Analysis of wedge
tightness system
Display Monitor
+
Manual
Camera
Control
Movement of
robot
387
388
Figure 2. Overall robot system
389
390
391
392
393
394
23
24
395
396
DATA COLLECTION
PREPROCESSING
FEATURE EXTRACTION
CLASSIFICATION
TIME ANALYSIS
Wedge tightness
signal response
Noise Removal
Time Domain
FREQUENCY ANALYSIS
FFT Transform
Frequency
Domain
397
398
Figure 3. Block diagram of signal processing part
399
400
401
402
403
404
405
406
407
24
SVM or ANN
CLASSIFICATION
RESULTS
25
408
409
(a)
(b)
410
411
412
Figure 4. a) The original sampling points of tight wedge signal b) The 2,000 sampling
points of tight wedge signal
413
414
415
25
26
416
417
(a)
(b)
418
419
420
Figure 5. a) The original sampling points of loose wedge signal b) The 2,000 sampling
points of loose wedge signal
421
422
423
26
27
424
425
(a)
426
(b)
427
428
429
430
Figure 6. a) Tight wedge signal in the time domain b) Loose wedge signal in the time
domain
431
432
433
434
27
28
435
436
437
(a)
438
(b)
439
440
441
442
Figure 7. a) Tight wedge signal in the frequency domain b) Loose wedge signal in the
frequency domain
443
444
445
446
28
29
447
448
449
450
Figure 8. Basic concept of support vector machine
451
452
453
454
455
456
457
458
29
30
459
460
461
462
Figure 9. Transformation non-linear data points to higher dimension linear space
463
464
465
466
467
468
469
470
30
31
471
472
Have not been tested sets
Test sets
Tested sets
10%
474
10%
10%
10%
10%
10%
10%
10%
10%
10%
10%
10%
10%
10%
473
10%
10%
10%
Round 1
10%
10%
10%
Round 2
Figure 10. Ten-fold cross-validation procedure
475
476
477
478
479
480
481
31
10%
…
10%
10%
10%
10%
10%
10%
10%
10%
10%
Round 10
32
482
483
484
485
486
Figure 11. Back-Propagation model of artificial neural network
487
488
489
490
491
32
33
492
493
494
Condition positive Condition negative
Test outcome
positive
True Positive
(TP)
False Positive
(FP)
Positive Predictive
Value (PPV)
TP
TP+FP
Test outcome
negative
False Negative
(FN)
True Negative
(TN)
Negative Predictive
Value (NPV)
TN
FN+TN
Sensitivity
TP
TP+FN
Specification
TN
FP+TN
Accuracy
TP+TN
TP+FP+FN+TN
495
496
Figure 12. Example of 2×2 confusion matrix
33
Download