Indian Journal of Engineering & Materials Sciences
Vol.16, August 2009, pp. 205-210
S N Omkarᵃ*, J Senthilnathᵃ & S Sureshᵇ
ᵃDepartment of Aerospace Engineering, Indian Institute of Science, Bangalore 560 012, India
ᵇDepartment of Electrical Engineering, Indian Institute of Technology, New Delhi 110 016, India
Received 24 July 2008; accepted 16 July 2009
In this paper, the pattern classification problem in tool wear monitoring is solved using nature inspired techniques such as Genetic Programming (GP) and Ant-Miner (AM). The main advantage of GP and AM is their ability to learn the underlying data relationships and express them in the form of a mathematical equation or simple rules. The knowledge extracted from the training data set takes the form of a Genetic Programming Classifier Expression (GPCE) for GP and of rules for AM. The GPCE and the AM extracted rules are then applied to the data in the testing/validation set to obtain the classification accuracy. A major attraction of GP evolved GPCE and AM based classification is the possibility of obtaining expert system like rules that can subsequently be applied directly by the user in his/her application. The classification accuracy obtained using GP and AM is as good as that obtained in the earlier study (i.e. using the ANN approach).
Keywords: Tool wear monitoring; Genetic Programming; Ant-Miner
In manufacturing, the development of a mathematical model is an important task for critically analyzing the process. These mathematical models relate the inputs of the system to the desired outputs. In some cases, obtaining a mathematical model (i.e., the relationship between the inputs and the desired outputs) can be a difficult task. In such situations, there is a need to build mathematical models based on the given input-output data. These models should effectively identify the underlying input-output relationship. Artificial neural networks (ANNs) have been successfully used to find the input-output relationship. Neural network (NN) applications in manufacturing can be broadly classified into pattern classification problems¹ and function approximation problems².
Purushothaman et al.¹ have formulated the tool wear monitoring problem as a pattern classification problem. In their study, the six inputs to the NN are speed, feed, depth of cut, axial force, radial force and tangential force. The output of the NN is the flank wear bandwidth. After training the neural network, the output for a given input pattern is assigned, based on the flank wear bandwidth, to class-1 or class-2; the task is thus treated as a pattern classification problem. Also in their study, to reduce the complexity, the input dimensions are reduced from six to two, and a NN is trained for the pattern classification problem.
* For correspondence (E-mail: omkar@aero.iisc.ernet.in)
NNs have been used for various manufacturing applications²,³. The main limitation of NNs is that, although they capture the relationship between the input-output data effectively, the weights do not express this relationship explicitly. However, nature inspired techniques like Genetic Programming (GP) and Ant-Miner (AM) can give an explicit relationship between the input and output classes. In the Genetic Programming approach⁴, arithmetic function sets are used to evolve Genetic Programming Classifier Expressions (GPCE). The GPCE can be expressed mathematically or as expert system like rules for pattern classification⁵,⁶. On the other hand, AM⁷,⁸ is a class of data classification algorithm modeled on the actions of an ant colony. AM is used to extract simple rules from the given data set⁸⁻¹⁰.
In this study, we use the GP and AM approaches for the pattern classification problem of tool wear monitoring discussed by Purushothaman et al.¹ A genetic programming classifier expression is evolved as a discriminant function between two classes using a training set of data points. The GPCE is then applied to the data in the testing/validation set to obtain the classification accuracy. On the other hand, AM is used to derive knowledge in the form of simple rules from the training data. These rules are applied sequentially to the testing data set to obtain the classification accuracy. These nature inspired techniques are intriguing because of their ability to classify the data efficiently and because of the simplicity of the mathematical expression and rules that are extracted.
In this paper, the nature inspired techniques for pattern classification and their application to tool wear monitoring are discussed. The mathematical model and rule extraction are described.
Nature Inspired Techniques for Pattern Classification
Nature inspired techniques form the field of research that works with computational techniques inspired in part by nature and natural systems. These techniques provide a more robust and efficient approach for solving complex real-world problems¹¹,¹². Many nature inspired techniques such as Artificial Neural Network¹³, Ant Colony Optimization⁷, Genetic Programming⁴ and Particle Swarm Optimization (PSO)¹⁴,¹⁵ have been proposed. Among these, we briefly describe two methods: GP and AM.
Genetic programming for pattern classification
Genetic programming is an evolutionary approach which applies Darwin's principle of survival of the fittest to a population of parametric solutions of a given problem. GP evolves a population of computer programs, which are possible solutions to a given problem. Each program or individual in the population is generally represented as a tree composed of functions and data/terminals appropriate to the problem domain. The set of functions F and the set of terminals/inputs T must satisfy the closure and sufficiency properties. The closure property demands that the function set is well defined and closed for any combination of arguments that it may encounter. On the other hand, the sufficiency property requires that the set of functions in F and the set of terminals in T be able to express a solution of the problem. The function set may contain standard arithmetic operators, mathematical functions, logical operators, and domain-specific functions. The terminal set usually consists of feature variables and constants.
Each individual in the population is assigned a fitness value, which quantifies how good the solution is. The fitness value is computed by a problem-dependent fitness function. GP uses genetic operations like reproduction, crossover and mutation to generate the next generation. Hence, the solution is evolved through the generations.
Koza⁴ has applied GP for a two-class pattern classification problem. In a two-class problem, a single GP expression is evolved. While evaluating the GP expression, if the result is positive, then the input data are assigned to one class (say class-1); else they are assigned to the other class (class-2). Thus, in the training set the desired (known) output d is +1 for samples belonging to one class (say class-1), and the desired (known) output d is -1 for samples belonging to the other class (class-2). Hence, the output of a GP expression is either +1 (indicating that the input sample belongs to that class) or -1 (indicating that the input sample does not belong to that class). We call the GP expression evolved in a two-class problem the GPCE for the pattern classification problem. This GPCE is the mathematical model evolved for the pattern classification problem. It divides the feature space into two regions. GP uses the function set that contains operators and functions to evolve a GPCE as the discriminant function for the two classes present in the training set. Let Y be the output of the GPCE.
IF GPCE(x) ≥ 0 THEN Y = +1, x ∈ class-1
IF GPCE(x) < 0 THEN Y = -1, x ∉ class-1
where x is the vector of input feature values. In the present study, for evolving a GPCE we have used the function set with only arithmetic operations (+, -, ÷, and ×).
Koza⁴ has shown that in GP the evolution is a never-ending process, and hence a termination criterion is needed. The termination criterion for GP is generally based on the problem or is limited by the number of generations. In GP, a user-defined fitness function has to be maximized for the application at hand.
Thus, at the end of a GP run, we have a current population of individuals and also the fittest individual that appeared during the run. The fittest individual that has evolved for the given problem is its solution or desired mathematical model.
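The tree representation and the sign-based decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the nested-tuple encoding, the names `FUNCTIONS`, `evaluate` and `classify`, the small hand-written tree, and the protected division (returning 1.0 for a zero divisor, one common way to satisfy the closure property) are all assumptions made for the example.

```python
import operator


def protected_div(a, b):
    # Assumed convention: return 1.0 when the divisor is zero so that
    # every expression stays defined (closure property).
    return a / b if b != 0 else 1.0


# Function set restricted to the four arithmetic operations.
FUNCTIONS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": protected_div,
}


def evaluate(tree, x):
    """Recursively evaluate an expression tree for the feature dict x."""
    if isinstance(tree, tuple):      # internal node: (function, left, right)
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    if isinstance(tree, str):        # terminal: feature variable
        return x[tree]
    return tree                      # terminal: numeric constant


def classify(tree, x):
    """Return +1 (class-1) if the expression is >= 0, otherwise -1."""
    return 1 if evaluate(tree, x) >= 0 else -1


# Hypothetical hand-written tree: (x1 + x3) - (x4 / 2 + (x5 + 30)).
toy_tree = ("SUB", ("ADD", "x1", "x3"),
            ("ADD", ("DIV", "x4", 2), ("ADD", "x5", 30)))

sample = {"x1": 450, "x2": 6, "x3": 15, "x4": 150, "x5": 115, "x6": 150}
print(evaluate(toy_tree, sample), classify(toy_tree, sample))
```

In an actual GP run the tree would be produced by reproduction, crossover and mutation rather than written by hand; only the evaluation and the sign test are shown here.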
OMKAR et al.: RULE EXTRACTION FOR TOOL WEAR MONITORING
Ant-miner for pattern classification
An ant colony optimization approach for the discovery of classification rules, called Ant-Miner, has been proposed⁸⁻¹⁰. Ant-miner follows a sequential covering approach to discover a list of classification rules covering all, or almost all, the training cases. At first, the list of discovered rules is empty and the training set consists of all the training cases. A rule is added to the rule list when it correctly classifies a pre-defined number of training cases. A three step process (rule construction, rule pruning and pheromone updating) is repeated for each training case until one rule is extracted. This rule is added to the list of discovered rules, and the training cases that are covered correctly by this rule (i.e., cases satisfying the rule antecedent and having the class predicted by the rule consequent) are removed from the training set. This process is performed iteratively while the number of uncovered training cases is greater than a user-specified threshold.
Each classification rule has the form IF <term1 AND term2 AND …> THEN <class>. Each term is a triple <attribute, operator, value>, where value is a value belonging to the domain of attribute. The operator element in the triple is a relational operator. The six inputs of the tool wear monitoring problem constitute the attribute set. The relational operators greater than (>), less than (<), greater than or equal to (>=), less than or equal to (<=) and equal to (=) constitute the operator set.
Applications to Tool Wear Monitoring
Monitoring of tool wear is an important requirement for realizing automated manufacturing. Tool wear is a very complex phenomenon which can lead to machine down time and product rejects, and can also cause problems to personnel¹⁶. The three most important tasks in the area of tool monitoring are¹⁷:
1. The fast detection of collisions, i.e. any unintended contacts between the tool and the workpiece or parts of the machine (causing e.g. rapidly increasing forces);
2. The identification of tool breakage, e.g. outbreaks at brittle cutting edges; and
3. The determination of the tool's wear caused by abrasion, erosion, or other influences.
Purushothaman et al.¹ have experimentally studied and simulated the challenges involved in classifying the tool wear data as a two-class pattern classification problem using NN. In their experimental study, over ranges of parameters such as speed, feed and depth of cut, data is collected on axial force, radial force, tangential force, and flank wear bandwidth. The machining conditions and the resources used are explained in Purushothaman et al.¹ Thus, in their study, there are six inputs, namely speed, feed, depth of cut, axial force, radial force and tangential force. The flank wear bandwidth is the output. The output (flank wear bandwidth) is modified for the pattern classification problem as follows: (i) all the data points for which the flank wear bandwidth is less than or equal to 200 belong to class-1 and (ii) all the data points for which the flank wear bandwidth is greater than 200 belong to class-2.
In their study, 113 data points are collected, of which 87 data points (or samples) belong to class-1 and 26 data points belong to class-2. They used 20 data points belonging to class-1 and 10 data points belonging to class-2 for training the NN. The rest of the data points, 67 (class-1) and 16 (class-2), are used for testing. The input is reduced from six dimensions to two dimensions using the optimal discriminant method and the NN is trained.
Quantifying the input-output relationship is difficult using NN. Hence, Genetic Programming and Ant-Miner are used to obtain a mathematical model and simple rules for this problem.
In the present study, we use the same data points as in Purushothaman et al.¹ for this pattern classification problem using GP and AM. A partial list of data
points using the six input features x₁ (speed), x₂ (feed), x₃ (depth of cut), x₄ (axial force), x₅ (radial force) and x₆ (tangential force) and the desired output (i.e., class-1 or class-2) is given in Table 1.
Table 1— Subset of experimental data set
         Input                                                                                                        Output
Sl. No.  x₁ (Speed)  x₂ (Feed)  x₃ (Depth of cut)  x₄ (Axial force)  x₅ (Radial force)  x₆ (Tangential force)  Class
1        450         6          15                 150               115                150                    1
2        450         10         50                 60                50                 115                    1
3        450         10         200                180               130                450                    1
4        350         10         50                 60                90                 125                    1
5        300         10         50                 45                80                 70                     1
6        450         10         150                750               650                500                    2
7        400         10         50                 175               350                140                    2
8        450         10         150                240               850                620                    2
9        456         10         100                550               590                430                    2
10       450         10         200                1100              1200               840                    2
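The class labels in Table 1 follow from the flank wear threshold stated earlier in this section. A minimal sketch of that labelling step is given below; the function name `wear_class` and the example flank wear values are hypothetical and are not taken from the data set.

```python
# Threshold on the flank wear bandwidth that separates the two classes.
FLANK_WEAR_THRESHOLD = 200


def wear_class(flank_wear):
    """Map a flank wear bandwidth to class 1 (<= 200) or class 2 (> 200)."""
    return 1 if flank_wear <= FLANK_WEAR_THRESHOLD else 2


# Illustrative values only (not measurements from the study).
for wear in (120, 200, 260):
    print(wear, "-> class", wear_class(wear))
```

Note that the boundary value itself falls in class-1, since the first condition is "less than or equal to 200".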
In the GP or AM approach to pattern classification, the given data set is divided into a training set and a validation/testing set. In the case of GP, the training set data points are used for obtaining the GPCE (mathematical model) and the testing set data points are used for obtaining the classification accuracy, whereas AM extracts rules from the training data and the extracted rules are used to classify the test data. We have used 21 data points that belong to class-1 and 11 data points that belong to class-2 for obtaining the GPCE and simple rules. The rest of the data points, 66 (belonging to class-1) and 15 (belonging to class-2), are used for testing/validation and for obtaining the classification accuracy.
Mathematical Model and Rule Extraction
Genetic programming
The genetic programming parameters, which include population size, GP generations, cross over weight, mutation weight, mutation rate and tournament size, are varied until they produce the most favorable classification result. The optimum values of these parameters are as follows:
Population size = 2000
GP generations = 5,00,000
Cross over weight = 70
Mutation weight = 20
Mutation rate = 60
Tournament size = 3000
For the above parameters, we have done several runs to evolve a GPCE with the training set, and the best GPCE obtained is listed below. The GP expression evolved is in the form of a LISP s-expression, and this expression can be easily converted into a mathematical expression as follows.
GPCE: (MUL (SUB (MUL (SUB (ADD 20 x₄) (SUB -44 2)) x₄) (ADD (SUB (DIV x₃ 21) (DIV -75 -111)) 13)) (DIV (ADD (ADD (ADD x₃ x₁) (ADD 32 -29)) (SUB (DIV (SUB (ADD 20 x₄) (SUB -44 2)) -2) x₅)) x₄))
Equivalent mathematical model is
\[ \frac{2x_1 + 2x_3 - 60 - x_4 - 2x_5}{2x_4} \left( x_4^2 + 66x_4 - \frac{x_3}{21} - 12.3243 \right) \qquad \ldots (1) \]
For a given input sample, if the above expression is greater than or equal to zero, then the input sample belongs to class-1. Otherwise, the input sample belongs to class-2. From this mathematical expression, we can derive simple rules. This is as follows:
Classification rule: If (x₁ + x₃) > x₄/2 + x₅ + 30, then this sample belongs to class-1.
This rule says that if the sum of the value of speed (x₁) and depth of cut (x₃) is greater than the sum of half the value of axial force (x₄), the value of radial force (x₅) and the constant value 30, then the flank wear width will be less than 200 (class-1).
The advantage of the above classification rule is that any person without much knowledge about the physical process can easily use it for classification. The rule also represents the knowledge that is learned while obtaining the GPCE.
Ant-miner
The Ant-Miner parameters Number_of_ants, Min_cases_per_rule, Max_uncovered_cases and Number_rules_to_converge were varied to extract different sets of rules, and the overall classification efficiencies obtained were recorded. The optimum values of these parameters are as follows:
Number_of_ants = 25
Min_cases_per_rule = 6
Max_uncovered_cases = 3
Number_rules_to_converge = 5
AM extracted rules from the training data set and the extracted rules were used to classify the test data. Following are some of the rules extracted by the algorithm:
For the class-1 of tool wear data set: x₁ <= 372
This rule says that if the value of speed (x₁) is less than or equal to 372, then the sample belongs to class-1.
For the class-2 of tool wear data set: x₁ > 393 and x₂ <= 17 and x₅ > 307
This rule says that if the value of speed (x₁) is greater than 393, the value of feed (x₂) is less than or equal to 17 and the value of radial force (x₅) is greater than 307, then the sample belongs to class-2.
Simulations and Results
To evaluate the performance, the data set is used to arrive at the classification matrix, which is of size n × n, where n is the number of classes. A typical entry qᵢⱼ in the classification matrix shows how many samples belonging to class i have been classified into class j. For a perfect classifier, the classification matrix is diagonal. However, due to misclassification we get off-diagonal elements. The individual efficiency of class i is defined as the number of correctly classified samples of class i divided by the total number of samples of class i:
\[ \frac{q_{ii}}{\sum_{j=1}^{n} q_{ij}} \qquad \ldots (2) \]
The overall efficiency is defined as
\[ \frac{1}{N} \sum_{i=1}^{n} q_{ii} \qquad \ldots (3) \]
where N is the total number of elements in the data set.
GP simulation and classification
Initially, GP learns from the training data set and evolves the GPCE. The classification matrices obtained after applying the GPCE to the training and testing data are shown in Tables 2 and 3 respectively. From the classification matrix for the training data we can notice that samples belonging to class-1 and class-2 are classified without any misclassifications, and hence each class has an individual efficiency of 100%. The overall classification efficiency is therefore 100%. The overall classification efficiency obtained for the training data is a measure of the relevance of the GPCE extracted.
Next, the GPCE is applied to the testing data set and the efficiencies are evaluated. As we can notice from the classification matrix generated for the testing data (Table 3), two of the samples belonging to class-2 are misclassified as class-1, but the overall efficiency is still high at 97.53%.
Purushothaman et al.¹ applied NN to solve this problem, and the classification accuracy obtained in their approach is 96.36%. We can observe that the classification accuracy obtained in the GP approach is comparable to that of the NN approach.
Ant-miner simulation and classification
The classification matrices obtained after applying the rules derived by AM to the training and testing data are shown in Tables 4 and 5 respectively. From the classification matrix for the training data we can notice that samples belonging to class-1 are classified without any misclassifications, and hence class-1 has an individual efficiency of 100%. For class-2, however, a single case is misclassified as class-1. Hence class-2 has an individual efficiency of 90.90%. The overall classification efficiency is nevertheless high at 96.87%. The overall classification efficiency obtained for the training data is a measure of the relevance of the rules extracted.
Next, the rules extracted are applied to the testing data set and the efficiencies are evaluated (Table 5). As we can notice from the classification matrix generated for the testing data, there are some
misclassifications between the two classes, but the overall efficiency of 95.06% is almost the same as that obtained for the training data set.
Table 2— Classification matrix of tool wear monitoring training data set by GP algorithm
          Class-1   Class-2   Individual Efficiency
Class-1   21        0         100%
Class-2   0         11        100%
Overall efficiency = 100%
Table 3— Classification matrix of tool wear monitoring testing data set by GP algorithm
          Class-1   Class-2   Individual Efficiency
Class-1   66        0         100%
Class-2   2         13        86.67%
Overall efficiency = 97.53%
Table 4— Classification matrix of tool wear monitoring training data set by AM algorithm
          Class-1   Class-2   Individual Efficiency
Class-1   21        0         100%
Class-2   1         10        90.9%
Overall efficiency = 96.87%
Table 5— Classification matrix of tool wear monitoring testing data set by AM algorithm
          Class-1   Class-2   Individual Efficiency
Class-1   66        0         100%
Class-2   4         11        73.34%
Overall efficiency = 95.06%
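The efficiencies reported in the tables can be recomputed directly from the classification matrices. The sketch below is an illustration under stated assumptions: the function name `efficiencies` and the list-of-lists layout are choices made for this example, rows are taken to index the true class and columns the assigned class, and each individual efficiency is the diagonal entry divided by the number of samples of that class, which is the reading that reproduces the tabulated values.

```python
def efficiencies(q):
    """Return (individual efficiencies, overall efficiency) for matrix q.

    q[i][j] is the number of samples of class i classified as class j.
    """
    n = len(q)
    total = sum(sum(row) for row in q)
    # Correctly classified samples of class i over all samples of class i.
    individual = [q[i][i] / sum(q[i]) for i in range(n)]
    # Sum of the diagonal over the total number of samples.
    overall = sum(q[i][i] for i in range(n)) / total
    return individual, overall


# Classification matrices as reported for the testing data (Tables 3 and 5).
gp_test = [[66, 0], [2, 13]]
am_test = [[66, 0], [4, 11]]

for name, q in (("GP", gp_test), ("AM", am_test)):
    individual, overall = efficiencies(q)
    print(name, [round(100 * e, 2) for e in individual], round(100 * overall, 2))
```

The same function applies unchanged to the training matrices of Tables 2 and 4.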
Conclusions
In this paper, nature inspired techniques such as genetic programming and ant-miner are used to solve a pattern classification problem that arises in tool wear monitoring. These techniques evolve a mathematical model or a rule base that expresses the input-output relationship explicitly. This approach is better than other approaches such as NN, in the sense that it gives an insight into the knowledge contained in the data set. Also, the GPCE and the AM extracted rules may be used in developing a rule-based expert system.
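As an illustration of how the extracted knowledge might be packaged for direct use, the sketch below encodes the GP classification rule and the two Ant-Miner rules quoted in this paper. The function and variable names, the triple encoding, and the choice to return `None` when no listed Ant-Miner rule fires are assumptions made for this example; the paper lists only some of the extracted rules, so the Ant-Miner rule list here is incomplete.

```python
import operator

# Relational operators allowed in an Ant-Miner term.
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "=": operator.eq}


def gp_rule(x):
    """GP-derived rule: class 1 if x1 + x3 > x4/2 + x5 + 30, else class 2."""
    return 1 if x["x1"] + x["x3"] > x["x4"] / 2 + x["x5"] + 30 else 2


# Each rule: (list of <attribute, operator, value> terms, predicted class).
AM_RULES = [
    ([("x1", "<=", 372)], 1),
    ([("x1", ">", 393), ("x2", "<=", 17), ("x5", ">", 307)], 2),
]


def am_classify(x, rules=AM_RULES):
    """Apply the rules in order and return the class of the first rule whose
    terms all hold, or None if no listed rule covers the sample."""
    for terms, label in rules:
        if all(OPS[op](x[attr], value) for attr, op, value in terms):
            return label
    return None


# Feature values of the form shown in Table 1.
sample_a = {"x1": 300, "x2": 10, "x3": 50, "x4": 45, "x5": 80, "x6": 70}
sample_b = {"x1": 450, "x2": 10, "x3": 200, "x4": 1100, "x5": 1200, "x6": 840}

for sample in (sample_a, sample_b):
    print(gp_rule(sample), am_classify(sample))
```

A complete expert system would need the full Ant-Miner rule list and a default class for uncovered samples; both are left open here because the paper does not state them.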
References
1 Purushothaman S & Srinivasa Y G, Int J Prod Res, 36 (1998) 635-651.
2 Anderson K, Cook G E, Kasai G & Ramaswamy K, IEEE Trans Ind Appl, 26 (1990) 824-830.
3 Cook G E, Barnett R J, Anderson K & Strauss A M, IEEE Trans Ind Appl, 31 (1995) 1484-1491.
4 Koza J R, Genetic Programming: On the Programming of Computers by Means of Natural Selection, (M I T Press, Cambridge, USA), 1992.
5 Kishore J K, Patnaik L M, Mani V & Agarwal V K, IEEE Trans Evolut Comput, 4 (2000) 242-258.
6 Suresh S, Omkar S N, Mani V & Menaka C, J Aerospace Sci Technol, 56 (2004) 26-41.
7 Dorigo M & Blum C, Theor Comput Sci, 344 (2005) 243-278.
8 Parpinelli R S, Lopes H S & Freitas A A, IEEE Trans Evol Comput, 6 (2002) 321-332.
9 Omkar S N & Raghavendra K U, IEEE Int Conf Industrial Technology, (2006) 1559-1562.
10 Omkar S N & Raghavendra T R, Eng Appl Artif Intell, 21 (2008) 1381-1388.
11 Back T & Schwefel H P, Evolut Comput, 1 (1993).
12 Yao X E, Evolutionary Computation: Theory and Applications, (World Scientific, Singapore), (1999).
13 Haykin S, Neural Networks – A Comprehensive Foundation, 2nd ed, (New York), 1994.
14 Eberhart R & Kennedy J, A new optimizer using particle swarm theory, in Proc Int Sym Micro Machine and Human Science, Japan, 1995.
15 Eberhart R & Kennedy J, Particle swarm optimization, in Proc IEEE Int Conf Neural Networks, 1995.
16 Dimla D E Jr, Lister P M & Leighton N J, Int J Mach Tools Manufact, 37 (1997) 1219-1241.
17 Golz H U, Schillo E, Wolf A, Kaufeld M, Sprengel P, Johannsen P & Heinek D, Bewertung von Werkzeugüberwachungssystemen aus Sicht der Anwender, in Überwachung von Zerspan- und Umformprozessen, (VDI-Verlag, Düsseldorf), (1995) 309-317.