19th International Conference on Production Research
AN INTELLIGENT REASONING MODEL FOR YARN MANUFACTURE
Jian-Guo Yang, Fu Zhou, Jing-Zhu Pang, Zhi-Jun Lv
College Of Mechanical Engineering, University of DongHua, Ren Min Bei Road 2999, Song Jiang Zone,
Shanghai, P R China
Abstract
Although much work has been done to construct prediction models of yarn processing quality, the
relation between spinning variables and yarn properties has not yet been established conclusively.
Support vector machines (SVMs), based on statistical learning theory, are finding growing application in
machine learning and pattern recognition because of their high accuracy and good generalization
capability. This study briefly introduces the SVM regression algorithm and presents an SVM-based
system architecture for predicting yarn properties. Model selection, which amounts to a search in hyper-parameter space, is performed with a grid-search method to find suitable parameters. Experimental
results are compared with those of ANN models. The investigation indicates that, on small data
sets and in real-life production, SVM models are able to maintain stable predictive accuracy and are
more suitable for the noisy and dynamic spinning process.
Keywords:
Support vector machines, Structural risk minimization, Predictive model, Kernel function, Yarn quality
1 INTRODUCTION
Changing economic and political conditions and the
increasing globalisation of the market mean that the textile
sector faces ever greater challenges. To stay competitive,
companies increasingly need to invest in new
products. Along the textile chain, innovative technologies
and solutions are required to continuously optimize the
production process. High quality standards and
extensive technical and trade know-how are thus
prerequisites for keeping abreast of the growing dynamics of
the sector [1]. Although much work has been done to
construct prediction models of yarn processing quality, the
relation between spinning variables and yarn properties
has not yet been established conclusively. The
increasing quality demands from the spinners make clear
the need to explore further innovative ways of predicting
quality. The widespread use of artificial intelligence
(AI) has created a revolution in the domain of quality
prediction, for example through the application of artificial neural
networks (ANNs) in textile engineering [2]. This study
presents an intelligent predictive model for yarn process
quality based on support vector machines. The relevant
algorithm, model selection and experiments are presented
in detail.
2 SVM REGRESSION ALGORITHMS
The main objective of regression is to approximate a
function g(x) from a given noisy set of samples
$G = \{(x_i, y_i)\}_{i=1}^{N}$ obtained from the function g. The
basic idea of support vector machines (SVM) for
regression is to map the data x into a high-dimensional
feature space via a nonlinear mapping and to perform a
linear regression in this feature space:
$$ f(x) = \sum_{i=1}^{D} w_i \Phi_i(x) + b \qquad (1) $$
where w denotes the weight vector, b is a constant known
as the “bias”, and $\{\Phi_i(x)\}_{i=1}^{D}$ are called features. Thus, the
problem of nonlinear regression in the lower-dimensional input
space is transformed into a linear regression in the high-dimensional feature space. The unknown parameters w
and b in Equation (1) are estimated using the training set
G. To avoid over-fitting and thereby improve the
generalization capability, the following regularized functional,
which sums the empirical risk and a complexity
term $\|w\|^2$, is minimized [3]:
$$ R_{reg} = R_{emp} + \lambda \|w\|^2 = \frac{1}{M} \sum_{i=1}^{M} |f(x_i) - y_i|_{\varepsilon} + \lambda \|w\|^2 \qquad (2) $$
where λ is a regularization constant and the cost function
defined by
$$ |f(x) - y|_{\varepsilon} = \begin{cases} |f(x) - y| - \varepsilon, & |f(x) - y| \ge \varepsilon \\ 0, & |f(x) - y| < \varepsilon \end{cases} \qquad (3) $$
is called Vapnik’s “ε-insensitive loss function”. It can be
shown that the minimizing function has the following form:
$$ f(x, \alpha, \alpha^*) = \sum_{i=1}^{M} (\alpha_i^* - \alpha_i)\, k(x_i, x) + b \qquad (4) $$
with $\alpha_i \alpha_i^* = 0$ and $\alpha_i, \alpha_i^* \ge 0$, and the kernel
function $k(x_i, x)$ describes the dot product in the D-dimensional feature space:
$$ k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle \qquad (5) $$
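As a small numerical illustration of Equations (3) to (5), the following Python sketch may help. It assumes a degree-2 polynomial kernel on 2-D inputs, chosen here only because its feature map is known explicitly; the function names, sample points and coefficients are hypothetical and do not come from the study.

```python
import math


def dot(u, v):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map of the kernel (x . z)^2 for 2-D inputs (D = 3)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """k(x, z) = (x . z)^2, evaluated without building the feature vectors."""
    return dot(x, z) ** 2


def predict(x, train_x, beta, b, kernel):
    """Equation (4) with beta_i standing for alpha_i* - alpha_i."""
    return sum(beta_i * kernel(x_i, x) for x_i, beta_i in zip(train_x, beta)) + b


def eps_insensitive(residual, eps):
    """Equation (3): deviations smaller than eps cost nothing."""
    return max(0.0, abs(residual) - eps)


# Equation (5): the kernel equals the dot product of the features.
x, z = (1.0, 2.0), (3.0, -1.0)
print(poly2_kernel(x, z), dot(phi(x), phi(z)))

# Equation (4) with two hypothetical training points and coefficients.
train_x = [(1.0, 0.0), (0.0, 1.0)]
beta = [0.5, -0.25]
f_value = predict((2.0, 2.0), train_x, beta, b=0.1, kernel=poly2_kernel)
print(f_value, eps_insensitive(f_value - 1.0, eps=0.2))
```

The point of the check is that `poly2_kernel` never constructs the three features, yet returns the same number as the explicit dot product, which is why only the kernel needs to be specified.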
It is important to note that the features $\Phi_j$ need not be
computed; rather, what is needed is the kernel function, which
is very simple and has a known analytical form. The only
condition required is that the kernel function satisfies
Mercer’s condition. Some of the most commonly used kernels
are the linear, polynomial, radial basis function, and
sigmoid kernels. Note also that for Vapnik’s ε-insensitive loss
function, the Lagrange multipliers $\alpha_i, \alpha_i^*$ are sparse, i.e.
they take nonzero values after the optimization (2) only
for points on or outside the boundary of the ε-insensitive zone, which
follows from the Karush–Kuhn–Tucker conditions. The
coefficients $\alpha_i, \alpha_i^*$ are obtained by maximizing the
following form:
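$$ \text{Max: } R(\alpha^*, \alpha) = -\frac{1}{2} \sum_{i,j=1}^{M} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j) K(x_i, x_j) - \varepsilon \sum_{i=1}^{M} (\alpha_i^* + \alpha_i) + \sum_{i=1}^{M} y_i (\alpha_i^* - \alpha_i) \qquad (6) $$
$$ \text{s.t. } \sum_{i=1}^{M} (\alpha_i^* - \alpha_i) = 0, \quad 0 \le \alpha_i^*, \alpha_i \le C \qquad (7) $$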
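To make the dual problem of Equations (6) and (7) concrete, the Python sketch below evaluates the objective and checks the constraints for a given pair of coefficient vectors. It does not solve the quadratic program; the function names and the two-point example (linear kernel, hand-picked coefficients) are hypothetical.

```python
def dual_objective(alpha_star, alpha, K, y, eps):
    """Value of R(alpha*, alpha) in Equation (6) for a kernel matrix K."""
    M = len(y)
    beta = [alpha_star[i] - alpha[i] for i in range(M)]
    quad = sum(beta[i] * beta[j] * K[i][j] for i in range(M) for j in range(M))
    tube = sum(alpha_star[i] + alpha[i] for i in range(M))
    fit = sum(y[i] * beta[i] for i in range(M))
    return -0.5 * quad - eps * tube + fit


def is_feasible(alpha_star, alpha, C, tol=1e-9):
    """Constraints of Equation (7): zero-sum condition and box [0, C]."""
    balanced = abs(sum(s - a for s, a in zip(alpha_star, alpha))) <= tol
    boxed = all(-tol <= v <= C + tol for v in list(alpha_star) + list(alpha))
    return balanced and boxed


# Hypothetical example: inputs 0 and 1, targets 0 and 1, linear kernel x * z.
K = [[0.0, 0.0], [0.0, 1.0]]
y = [0.0, 1.0]
alpha_star, alpha = [0.0, 0.5], [0.5, 0.0]

print(is_feasible(alpha_star, alpha, C=1.0))                     # prints True
print(f"{dual_objective(alpha_star, alpha, K, y, eps=0.1):.3f}")  # prints 0.275
```

A quadratic programming solver would search over all feasible pairs for the one that maximizes this value.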
Only some of the coefficients $\alpha_i, \alpha_i^*$ will be different from
zero, and the data points associated with them are called
support vectors. The parameters C and ε are free and have to
be chosen by the user. Computing b requires a more
direct use of the Karush–Kuhn–Tucker conditions that lead
to the quadratic programming problem stated above. The
key idea is to pick those values for a point $x_k$ on the
margin, i.e. with $\alpha_k$ or $\alpha_k^*$ in the open interval (0, C). One $x_k$
would be sufficient, but for stability it is
recommended that one take the average over all points on
the margin. A more detailed description of SVM for
regression can be found in Refs. [3-6].
3 SVM BASED YARN PREDICTIVE MODEL
3.1 Model Architecture
Considering salient features of SVM such as the
absence of local minima, the sparseness of the solution
and the improved generalization, an
SVM-based yarn quality prediction system is proposed (see
Fig.1). The system architecture consists mainly of three
modules: data acquisition, reasoning machine, and
user interface. The user interface provides
friendly interactive operation with the model, including data
cleaning, model training, parameter selection, and so on.
The data acquisition module collects the various
data from the yarn production process and transforms them into an engineering
database. The reasoning machine is in essence an SVM-based yarn
process simulator, which is used to train the
predictive models and then to make real-world
process decisions for different raw material
inputs.
[Figure: block diagram with the blocks User Interface (Raw Material, Yarn Properties, Yarn Quality Prediction), Reasoning Machines (SVM-based Process Simulator), Textile Engineering Database, Data Acquisition, and Yarn Production Process]
Fig.1 Yarn Quality Predictive Model Architecture
3.2 Model Selections
In the yarn predictive learning task, an appropriate model
and parameter estimation method should be selected to
obtain a high level of performance from the learning machine.
Lacking a priori information about the accuracy of the y-values, it can be difficult to come up with a reasonable
value of ε a priori. Instead, one would rather specify the
degree of sparseness and let the algorithm automatically
compute ε from the data. This is the idea of ν-SVM, a
modification of the original ε-SVM introduced by
Schölkopf, Smola, Williamson et al. [6], which was used to
construct the yarn predictive model in our study. Under this
approach, the parameters that usually have to be chosen are the
following:
• the penalty term C, which determines the trade-off
between the complexity of the decision function and
the training error;
• the sparsity parameter ν, set in accordance with the noise
in the output values in order to get the highest
generalization accuracy;
• the kernel function K(x, y).
According to Ref. [7], the sparsity parameter ν
may usually be chosen in the interval [0.3, 0.6]; here
ν = 0.583. The radial basis function (RBF) kernel, given by
Equation (8), is used:
$$ K(x, y) = \exp\left( -\frac{\|x - y\|^2}{2\sigma^2} \right) \qquad (8) $$
where σ is the width parameter of the RBF kernel.
The RBF kernel nonlinearly maps samples into a higher-dimensional space, so, unlike the linear kernel, it can
handle the case in which the relation between inputs and
outputs is nonlinear. In addition, the sigmoid kernel
behaves like the RBF kernel for certain parameters. A second reason
for using the RBF kernel is the number of hyper-parameters,
which influences the complexity of model selection: the
polynomial kernel has more hyper-parameters than the
RBF kernel. Finally, the RBF kernel has fewer
numerical difficulties; a key point is that $0 < K(x, y) \le 1$, in
contrast to polynomial kernels, whose values may
go to infinity or zero when the degree is large. Moreover, it
should be noted that the sigmoid kernel is not valid (i.e. not the
inner product of two vectors) under some parameters [4].
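The numerical argument can be checked with a few lines of Python. This is only a sketch: the polynomial kernel is taken in the common form (x·y + c)^d, which is an assumption since the paper does not write it out, and the sample vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma):
    """Equation (8): exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def poly_kernel(x, y, degree, coef0=1.0):
    """Polynomial kernel in the assumed form (x . y + coef0)^degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


x, y = [2.0, 1.0], [1.5, 3.0]
x_small, y_small = [0.1, 0.1], [0.2, 0.1]

# The RBF value stays in (0, 1] whatever the inputs are.
print("rbf:", rbf_kernel(x, y, sigma=1.0), rbf_kernel(x, x, sigma=1.0))

# The polynomial value grows or vanishes quickly as the degree increases.
for degree in (2, 10, 30):
    print(degree, poly_kernel(x, y, degree), poly_kernel(x_small, y_small, degree, coef0=0.0))
```

Libraries that parametrize the RBF kernel as exp(-γ‖x − y‖²) correspond to Equation (8) with γ = 1/(2σ²).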
3.3 Optimization of Model Parameter
In the SVM model two key
parameters still need to be chosen: C and σ. Unfortunately, it is
difficult to know beforehand which C and σ are best for
a given problem. Our goal is to identify a good pair (C, σ)
so that the model can accurately predict unknown data
(i.e., testing data). A common way to do so is to
separate the training data into two parts, one of which is
treated as unknown when training the model. The
prediction accuracy on the held-out part then reflects more precisely
the performance on predicting unknown data. This
procedure is called cross-validation. Cross-validation can also
help to prevent over-fitting. In this study,
the training data were divided into p subsets of equal size, and
the regression function was built from p − 1 of the subsets with a given set of
parameters {C, σ}. The performance of the parameter set is
measured by the computational risk, here the mean squared
error (MSE, see Equation 9), on the last subset. The above
procedure is repeated p times, so that each subset is used
once for testing. Averaging the MSE over the p trials gives
an estimate of the expected generalization error for
training on sets of size $\frac{p-1}{p}\, l$, where l is the number of
training data.
$$ MSE = \frac{1}{pq} \sum_{j=1}^{p} \sum_{i=1}^{q} \left( y_{ti}^{(j)} - y_{pi}^{(j)} \right)^2 \qquad (9) $$
where q is the number of samples in each tested subset of the
training set, and $y_{ti}^{(j)}$ and $y_{pi}^{(j)}$ are the
i-th observed value and
predicted value in the j-th tested subset, respectively. To
find good pairs (C, σ), a “grid-search”
[8] on C and σ is employed in this work. First, the
possible ranges of the two parameters C and σ were
divided into r candidate pairs; then each pair was tried
using cross-validation, and the one with the best cross-validation accuracy was picked as the optimal parameters
of the model.
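A grid search over (C, σ) with p-fold cross-validation might be sketched as follows in Python with scikit-learn. Both the library and everything specific in the block are assumptions: the paper does not name its software, the data are synthetic stand-ins for the 26 spinning samples, and the candidate values of C and σ are illustrative. The width σ is converted to scikit-learn's `gamma` by gamma = 1/(2σ²) so that the kernel matches Equation (8).

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import NuSVR


def sigma_to_gamma(sigma):
    """Convert the RBF width of Equation (8) to scikit-learn's gamma."""
    return 1.0 / (2.0 * sigma ** 2)


# Synthetic stand-in data: 26 samples with 10 inputs (not the paper's data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(26, 10))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=26)

# Illustrative candidate values; the paper does not list its search ranges.
C_values = [0.1, 1.0, 10.0, 100.0]
sigma_values = [0.1, 0.3, 1.0, 3.0]
param_grid = {"C": C_values, "gamma": [sigma_to_gamma(s) for s in sigma_values]}

p = 5  # number of cross-validation subsets (assumed)
search = GridSearchCV(
    NuSVR(nu=0.583, kernel="rbf"),       # nu-SVR with the nu reported in the text
    param_grid,
    scoring="neg_mean_squared_error",    # scikit-learn maximizes, hence the sign
    cv=KFold(n_splits=p, shuffle=True, random_state=0),
)
search.fit(X, y)

best_C = search.best_params_["C"]
best_sigma = (1.0 / (2.0 * search.best_params_["gamma"])) ** 0.5
cv_mse = -search.best_score_
print(f"best C = {best_C}, best sigma = {best_sigma:.3f}, CV MSE = {cv_mse:.4f}")
```

`GridSearchCV` averages the per-subset MSE, which coincides with Equation (9) when all subsets have the same size q; with 26 samples and p = 5 the subsets differ by one sample, so the two averages differ slightly.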
4 THE EXPERIMENTS STUDY
In this work, a small population (a total of twenty-six
different data samples) from real worsted spinning was
acquired. To demonstrate the generalization performance
of the SVM model, different experiments were completed and
compared with ANN models. To keep the problem
simple, as in most ANN models [2, 9], some fibre properties
and processing information were selected as the SVM
model’s inputs: mean fibre diameter (MFD, μm),
diameter distribution (CVD, %), hauteur (HT, mm), fibre
length distribution (CVH, %), short fibre content (SFC, %),
yarn count (CT, tex), twist (TW, t.p.m.), draft ratio (DR),
spinning speed (SS, r.p.m.) and traveler number (TN). Four
yarn properties, namely unevenness (CV%), elongation at
break (EB, %), breaking force (BF, cN) and ends-down per
1000 spindle hours (ED), served as the SVM model’s
outputs.
One of the primary aspects of developing an SVM
regression model is the selection of the penalty term C
and the width σ of the RBF kernel. To optimize
these two parameters, the “grid-search” method described above was
applied in the present work. In practice, optimizing the model
parameters is an iterative process that
repeatedly shrinks the search area until
a satisfactory solution is obtained. Table 1 lists the final search
area and optimal values of the four SVM models,
respectively.
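The shrinking of the search area can be pictured as a coarse-to-fine loop. The Python sketch below is one possible reading, not the authors' procedure: it searches on a logarithmic grid, then re-centres a narrower grid on the best pair. The function names, the number of rounds and the toy error surface (standing in for the cross-validation MSE) are all hypothetical.

```python
import math


def log_grid(lo, hi, n):
    """n points spaced evenly on a log scale between lo and hi (inclusive)."""
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    return [math.exp(math.log(lo) + k * step) for k in range(n)]


def shrinking_grid_search(cv_error, c_range, sigma_range, rounds=4, n=5):
    """Grid-search (C, sigma), then zoom in around the best pair, repeatedly."""
    best = None
    for _ in range(rounds):
        c_grid = log_grid(c_range[0], c_range[1], n)
        s_grid = log_grid(sigma_range[0], sigma_range[1], n)
        best = min((cv_error(c, s), c, s) for c in c_grid for s in s_grid)
        _, c_best, s_best = best
        # New search area: one grid step on either side of the current best,
        # which halves the width of each range on the log scale.
        c_step = (c_range[1] / c_range[0]) ** (1.0 / (n - 1))
        s_step = (sigma_range[1] / sigma_range[0]) ** (1.0 / (n - 1))
        c_range = (c_best / c_step, c_best * c_step)
        sigma_range = (s_best / s_step, s_best * s_step)
    return best  # (error, C, sigma)


def toy_error(c, sigma):
    """Hypothetical error surface with its minimum at C = 100, sigma = 0.1."""
    return (math.log10(c) - 2.0) ** 2 + (math.log10(sigma) + 1.0) ** 2


err, c_opt, sigma_opt = shrinking_grid_search(toy_error, (1e-2, 1e4), (1e-3, 1e1))
print(f"C ~ {c_opt:.1f}, sigma ~ {sigma_opt:.3f}, error = {err:.4f}")
```

In a real run `cv_error` would train the SVM and return the cross-validation MSE for the given pair, so each round costs n × n cross-validated fits.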
After the completion of model development (training), all
the SVM-based and ANN-based models were applied to
the unseen testing data set. Statistical parameters, namely
the correlation coefficient between the actual and
predicted values (R), the mean squared error, and the mean
error (%), were used to compare the predictive power of the
SVM-based and ANN-based models. Results are shown in
Table 2. For the ANN models, the mean
error of three models is more than 10%, while that of the
CV% model is about 5%, and the correlation coefficients (R)
of the CV% and EB models are low, at 0.76 and
0.67 respectively. For the SVM models, by contrast, the mean
error is less than 10% except for ED, which is still high,
and the correlation coefficient (R) of every model is
0.80 or higher. In addition, the number of cases with over
10% error decreases for every property, from 1, 6, 5 and 6 in the ANN models to 0, 2, 2
and 3 in the SVM models. Among the four yarn
properties considered in our work, ends-down per 1000
spindle hours can be affected by different operators and
observers [10], and such data often undermine the
prediction accuracy of regression models. Even
so, for ED, the statistical parameters of the SVM
model are as good as or better than those of the ANN model.
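The comparison statistics can be computed as in the Python sketch below. The paper does not give the formula for "mean error%"; the sketch assumes it is the mean absolute error relative to the actual value, and the sample numbers are hypothetical rather than taken from Table 2.

```python
import math


def correlation(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_squared_error(actual, predicted):
    """Average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def percent_errors(actual, predicted):
    """Absolute error of each case as a percentage of the actual value."""
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def summarize(actual, predicted, threshold=10.0):
    """Collect R, MSE, mean error % and the count of cases above the threshold."""
    pct = percent_errors(actual, predicted)
    return {
        "R": correlation(actual, predicted),
        "MSE": mean_squared_error(actual, predicted),
        "mean_error_pct": sum(pct) / len(pct),
        "cases_over_threshold": sum(1 for e in pct if e > threshold),
    }


# Hypothetical actual and predicted values for one yarn property.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [11.5, 19.0, 31.5, 40.0]
print(summarize(actual, predicted))
```

The magnitude of the MSE depends on the scale of the data, so values computed on raw and on normalized outputs are not directly comparable.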
5 CONCLUSIONS
Support vector machines are a new learning-by-example paradigm with many potential applications in
science and engineering. The salient features of SVM
include the absence of local minima, the sparseness of the
solution and the improved generalization. Because SVMs are a
relatively new technique, their application to textile
production has hitherto been quite limited. However, the
elegance of the formalism involved and their successful
use in diverse science and engineering applications
confirm the expectations raised by this appealing
learning-from-examples approach. In this study, we presented an
SVM model for predicting yarn properties and
compared it with the BP neural network model. We
found that, like the ANN model, the SVM model is able to
predict with reasonably good accuracy in most cases.
A more interesting finding is that, on a small data
set and in real-life production, the predictive power of the ANN
models appears to decrease, while the SVM models
maintain stable predictive accuracy to
some extent. The experimental results indicate that
SVM models are more suitable for the noisy and dynamic
spinning process. Of course, as with other emerging industrial
techniques, applied issues with SVM call for
further development and investigation,
such as how to design the kernel function
and how to set the SVM hyper-parameters (to make
industrial model development easier). Our research
thus far demonstrates that SVMs can provide
spinners with an alternative means of predicting yarn
properties more correctly and reliably.
6 ACKNOWLEDGMENT
This research was supported by the National Science
Foundation and the Technology Support Plan of the People’s
Republic of China, under contract numbers 70371040 and
2006BAF01A44 respectively.
7 REFERENCES
[1] Renate Esswein, “Knowledge assures quality”, International Textile Bulletin, 2004, Vol15, no2, 17-21.
[2] R. Chattopadhyay and A. Guha, “Artificial Neural Networks: Applications to Textiles”, Textile Progress, 2004, Vol35, no1, 1-42.
[3] V. David Sanchez A., “Advanced Support Vector Machines and Kernel Methods”, Neurocomputing, 2003, Vol55, no3, 5-20.
[4] V. N. Vapnik, 1999, The Nature of Statistical Learning Theory, 2nd ed., Berlin: Springer, 31-188.
[5] B. Schölkopf, C. Burges and A. Smola, 1999, Advances in Kernel Methods: Support Vector Learning, Cambridge, MA: MIT Press, 5-73.
[6] B. Schölkopf, A. Smola, R. C. Williamson, et al., “New support vector algorithms”, Neural Computation, 2000, Vol12, no4, 1207-1245.
[7] Athanassia Chalimourda, B. Schölkopf and A. Smola, “Experimentally Optimal ν in Support Vector Regression for Different Noise Models and Parameter Settings”, IEEE Trans. on Neural Netw., 2004, Vol17, no2, 127-141.
[8] Chih-Wei Hsu, Chih-Chung Chang and Chih-Jen Lin, A practical guide to support vector classification, available at http://www.csie.ntu.edu.tw/~cjlin/paper
[9] Refael B., Lijing W. and Xungai W., “Predicting worsted spinning performance with an artificial neural network model”, Textile Res. J., 2004, Vol74, no8, 757-763.
[10] Peter R. Lord, 2003, Handbook of Yarn Production (Technology, Science and Economics), Abington, England: Woodhead Publishing Limited, 95-212.
Table 1 The optimal values of σ and C
Output parameter        Optimal value
CV%                     σ = 0.973, C = 1606
Elongation at break     σ = 0.016, C = 14.55
Breaking force          σ = 0.012, C = 101.19
Ends-down               σ = 0.287, C = 2.975
Table 2 Comparison of the predictive power of the SVM-based and ANN-based models
Sample No.                    Predicted value using ANN model     Predicted value using SVM model
                              CV%      EB       BF       ED       CV%      EB       BF       ED
W21                           19.32    13.81    113.89   70.41    19.66    12.85    116.24   72.06
W22                           20.52    16.55    61.91    75.78    20.88    12.25    76.87    72.40
W23                           15.62    12.32    153.46   39.40    16.84    15.59    156.57   42.22
W24                           20.66    16.55    61.91    75.78    20.75    12.25    76.87    72.40
W25                           22.60    19.77    47.00    69.84    19.66    12.76    76.86    59.31
W26                           20.70    11.87    66.76    79.22    21.20    12.59    66.62    81.27
Correlation coefficient R     0.76     0.67     0.96     0.88     0.88     0.80     0.99     0.91
Mean squared error            0.01     0.12     0.07     0.03     0.003    0.05     0.01     0.03
Mean error%                   5.73     24.35    13.67    19.99    2.85     9.23     5.52     17.29
Cases with over 10% error     1        6        5        6        0        2        2        3