19th International Conference on Production Research

AN INTELLIGENT REASONING MODEL FOR YARN MANUFACTURE

Jian-Guo Yang, Fu Zhou, Jing-Zhu Pang, Zhi-Jun Lv
College of Mechanical Engineering, Donghua University, Ren Min Bei Road 2999, Song Jiang Zone, Shanghai, P. R. China

Abstract
Although much work has been done to construct prediction models of yarn processing quality, the relation between spinning variables and yarn properties has not been conclusively established so far. Support vector machines (SVMs), based on statistical learning theory, are gaining applications in machine learning and pattern recognition because of their high accuracy and good generalization capability. This study briefly introduces SVM regression algorithms and presents an SVM-based system architecture for predicting yarn properties. Model selection, which amounts to a search in hyperparameter space, is performed with the grid-search method to find suitable parameters. Experimental results have been compared with those of ANN models. The investigation indicates that on small data sets and in real-life production, SVM models are capable of maintaining stable predictive accuracy and are more suitable for the noisy and dynamic spinning process.

Keywords: Support vector machines, Structural risk minimization, Predictive model, Kernel function, Yarn quality

1 INTRODUCTION
Changing economic and political conditions and the increasing globalisation of the market mean that the textile sector faces ever-greater challenges. To stay competitive, there is an increasing need for companies to invest in new products. Along the textile chain, innovative technologies and solutions are required to continuously optimize the production process. High quality standards and extensive technical and trade know-how are thus prerequisites for keeping abreast of the growing dynamics of the sector [1].
Although much work has been done to construct prediction models of yarn processing quality, the relation between spinning variables and yarn properties has not been conclusively established so far. The increasing quality demands from spinners make clear the need to explore innovative ways of quality prediction further. The widespread use of artificial intelligence (AI) has created a revolution in the domain of quality prediction, for example the application of artificial neural networks (ANN) in textile engineering [2]. This study presents a support vector machine based intelligent predictive model for yarn process quality. The relevant algorithms, model selection and experiments are presented in detail.

2 SVM REGRESSION ALGORITHMS
The main objective of regression is to approximate a function g(x) from a given noisy set of samples G = \{(x_i, y_i)\}_{i=1}^{N} obtained from the function g. The basic idea of support vector machine (SVM) regression is to map the data x into a high-dimensional feature space via a nonlinear mapping and to perform a linear regression in this feature space:

f(x) = \sum_{i=1}^{D} w_i \phi_i(x) + b    (1)

where w denotes the weight vector, b is a constant known as the "bias", and \{\phi_i(x)\}_{i=1}^{D} are called features. Thus, the problem of nonlinear regression in the lower-dimensional input space is transformed into a linear regression in the high-dimensional feature space. The unknown parameters w and b in Equation (1) are estimated using the training set G. To avoid over-fitting and thereby improve the generalization capability, the following regularized functional, the sum of the empirical risk and a complexity term \lambda \|w\|^2, is minimized [3]:

R_{reg} = R_{emp} + \lambda \|w\|^2 = \frac{1}{M} \sum_{i=1}^{M} |f(x_i) - y_i|_{\varepsilon} + \lambda \|w\|^2    (2)

where \lambda is a regularization constant and the cost function defined by

|f(x) - y|_{\varepsilon} = \begin{cases} 0, & |f(x) - y| \le \varepsilon \\ |f(x) - y| - \varepsilon, & \text{otherwise} \end{cases}    (3)

is called Vapnik's "\varepsilon-insensitive loss function".
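As an illustration of Equation (3), the ε-insensitive loss is a few lines of numpy; this sketch is ours (the function name and the ε value are illustrative, not from the paper):

```python
import numpy as np

def eps_insensitive_loss(y_pred, y_true, eps=0.1):
    """Vapnik's epsilon-insensitive loss (Equation 3): zero for residuals
    inside the eps-tube, linear (residual - eps) outside it."""
    residual = np.abs(np.asarray(y_pred, float) - np.asarray(y_true, float))
    return np.where(residual <= eps, 0.0, residual - eps)
```

A residual of 0.05 inside the tube thus costs nothing, while a residual of 0.30 costs 0.20; this tube is what makes the resulting solution sparse.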
It can be shown that the minimizing function has the following form:

f(x, \alpha, \alpha^*) = \sum_{i=1}^{M} (\alpha_i - \alpha_i^*) k(x_i, x) + b    (4)

with \alpha_i \alpha_i^* = 0 and \alpha_i, \alpha_i^* \ge 0, where the kernel function k(x_i, x) describes the dot product in the D-dimensional feature space:

k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle    (5)

It is important to note that the features \phi_j need not be computed; all that is needed is the kernel function, which is very simple and has a known analytical form. The only condition required is that the kernel function satisfy Mercer's condition. Some of the most widely used kernels are the linear, polynomial, radial basis function (RBF) and sigmoid kernels. Note also that for Vapnik's ε-insensitive loss function, the Lagrange multipliers \alpha_i, \alpha_i^* are sparse, i.e. they take nonzero values after the optimization (2) only if the corresponding points lie on the boundary, which means that they satisfy the Karush–Kuhn–Tucker conditions. The coefficients \alpha_i, \alpha_i^* are obtained by maximizing the following form:

\max \; R(\alpha^*, \alpha) = -\frac{1}{2} \sum_{i,j=1}^{M} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j) K(x_i, x_j) - \varepsilon \sum_{i=1}^{M} (\alpha_i^* + \alpha_i) + \sum_{i=1}^{M} y_i (\alpha_i^* - \alpha_i)    (6)

\text{s.t.} \; \sum_{i=1}^{M} (\alpha_i^* - \alpha_i) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C    (7)

Only some of the coefficients \alpha_i, \alpha_i^* will be different from zero, and the data points associated with them are called support vectors. The parameters C and ε are free and have to be chosen by the user. Computing b requires a more direct use of the Karush–Kuhn–Tucker conditions that lead to the quadratic programming problem stated above. The key idea is to pick a point x_k on the margin, i.e. one whose \alpha_k or \alpha_k^* lies in the open interval (0, C). One such x_k would be sufficient, but for stability it is recommended to average over all points on the margin. A more detailed description of SVM regression can be found in Refs. [3–6].
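Once the coefficients are known, the expansion in Equation (4) is straightforward to evaluate. A minimal numpy sketch under our own assumptions (the support vectors and coefficient values below are invented for illustration, not taken from a trained model):

```python
import numpy as np

def rbf_kernel(x1, x2, sigma=1.0):
    """RBF kernel, one of the Mercer kernels mentioned above."""
    diff = np.asarray(x1, float) - np.asarray(x2, float)
    return np.exp(-np.sum(diff ** 2) / (2 * sigma ** 2))

def svr_predict(x, support_vectors, coeffs, b, kernel=rbf_kernel):
    """Equation (4): f(x) = sum_i (alpha_i - alpha_i*) k(x_i, x) + b,
    where coeffs[i] holds the difference (alpha_i - alpha_i*)."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + b

# Toy usage with two invented support vectors
svs = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
f = svr_predict(np.array([0.0, 0.0]), svs, coeffs=[0.5, -0.2], b=1.0)
```

Note that only the support vectors enter the sum, which is why the sparsity of the multipliers makes prediction cheap.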
3 SVM BASED YARN PREDICTIVE MODEL

3.1 Model Architecture
Considering some salient features of SVMs, such as the absence of local minima, the sparseness of the solution and the improved generalization, an SVM-based yarn quality prediction system is proposed (shown in Fig. 1). The system architecture mainly consists of three modules, i.e. data acquisition, reasoning machines, and user interface.

Fig. 1 Yarn Quality Predictive Model Architecture (User Interface: raw material, yarn properties, yarn quality prediction; Reasoning Machines: SVM-based process simulator, textile engineering database; Data Acquisition: yarn production process)

Among them, the user interface provides friendly interactive operation of the model, including data cleaning, model training, parameter selection, and so on. The data acquisition module collects the various data from the yarn production process and transforms them into an engineering database. The reasoning machines are in essence an SVM-based yarn process simulator, used to train the predictive models and then make real-world process decisions from the different raw-material inputs.

3.2 Model Selection
In the yarn predictive learning task, an appropriate model and parameter estimation method should be selected to obtain a high level of performance of the learning machine. Lacking a priori information about the accuracy of the y-values, it can be difficult to come up with a reasonable value of ε a priori. Instead, one would rather specify the degree of sparseness and let the algorithm compute ε from the data automatically. This is the idea of ν-SVM, a modification of the original ε-SVM introduced by Schölkopf, Smola, Williamson et al. [6], which was used to construct the yarn predictive models in our study. Under this approach, the parameters usually to be chosen are the following: the penalty term C, which determines the trade-off between the complexity of the decision function and the number of training errors; and the sparsity parameter ν, chosen in accordance with the noise in the output values in order to obtain the highest generalization accuracy.

According to Ref. [7], the sparsity parameter ν may usually be chosen in the interval [0.3, 0.6]; here ν = 0.583. The radial basis function (RBF) kernel, given by Equation (8), is used:

K(x, y) = \exp(-\|x - y\|^2 / 2\sigma^2)    (8)

where σ is the width parameter of the RBF kernel. The RBF kernel nonlinearly maps samples into a higher-dimensional space, so unlike the linear kernel it can handle the case where the relation between inputs and outputs is nonlinear. In addition, the sigmoid kernel behaves like the RBF kernel for certain parameters. Another reason for using the RBF kernel is the number of hyper-parameters, which influences the complexity of model selection: the polynomial kernel has more hyper-parameters than the RBF kernel. Finally, the RBF kernel has fewer numerical difficulties; a key point is that 0 < K(x, y) \le 1, in contrast to polynomial kernels, whose values may go to infinity or zero when the degree is large. Moreover, it is noted that the sigmoid kernel is not valid (i.e. not the inner product of two vectors) for some parameters [4].

3.3 Optimization of Model Parameters
Obviously, two key parameters in the SVM model still need to be chosen: C and σ. Unfortunately, it is difficult to know beforehand which C and σ are best for a given problem. Our goal is to identify good (C, σ) so that the model can accurately predict unknown (i.e., testing) data. A common way is therefore to separate the training data into two parts, of which one is treated as unknown when training the model. The prediction accuracy on this held-out set then more precisely reflects the performance on predicting unknown data.
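As a pointer for reproduction, scikit-learn's NuSVR class implements the ν-SVR variant described here with an RBF kernel; note that its gamma parameter corresponds to 1/(2σ²) in Equation (8). This is our own sketch: the data below are random placeholders standing in for the 26 spinning samples, with parameter values borrowed from the CV% model in Table 1:

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)
X = rng.normal(size=(26, 10))   # 26 samples, 10 fibre/process inputs (placeholder data)
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=26)  # synthetic target

sigma = 0.973                   # Table 1 value for the CV% model
model = NuSVR(nu=0.583, C=1606, gamma=1.0 / (2 * sigma ** 2))
model.fit(X, y)
pred = model.predict(X)
```

With ν fixed, the tube width ε is determined automatically from the data, which is precisely the motivation given above for preferring ν-SVM over ε-SVM.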
This improved procedure is called cross-validation. The cross-validation procedure can also further prevent the over-fitting problem. In this study, the regression function was built with a given set of parameters {C, σ}. The performance of a parameter set is measured by the computational risk, here the mean squared error (MSE, see Equation 9) on the held-out subset. The procedure is repeated p times, so that each subset is used once for testing. Averaging the MSE over the p trials gives an estimate of the expected generalization error for training on sets of size (p-1)l/p, where l is the number of training data:

MSE = \frac{1}{pq} \sum_{j=1}^{p} \sum_{i=1}^{q} \left( y_{ti}^{(j)} - y_{pi}^{(j)} \right)^2    (9)

where q is the number of samples in each tested subset of the training set, and y_{ti}^{(j)} and y_{pi}^{(j)} are the i-th observed and predicted values under the j-th tested subset, respectively. In order to find better pairs of (C, σ), a "grid search" [8] on C and σ is employed in this work. First, over the possible ranges of the two parameters, C and σ were divided into r pairs; then each pair of parameters was tried using cross-validation, and the pair with the best cross-validation accuracy was picked as the optimal parameters of the model.

4 THE EXPERIMENTAL STUDY
In this work, a small population (a total of twenty-six different data samples) from real worsted spinning was acquired. To demonstrate the generalization performance of the SVM model, different experiments were completed and compared with ANN models. To simplify the problem, as in most ANN models [2, 9], some fibre properties and processing information were selected as the SVM model's inputs: mean fibre diameter (MFD, μm), diameter distribution (CVD, %), hauteur (HT, mm), fibre length distribution (CVH, %), short fibre content (SFC, %), yarn count (CT, tex), twist (TW, t.p.m.), draft ratio (DR), spinning speed (SS, r.p.m.) and traveller number (TN).
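The p-fold estimate of Equation (9) and the grid search over (C, σ) of Section 3.3 can be sketched as follows; the structure is ours, and the trainer is passed in as a generic `fit_factory` rather than the paper's ν-SVR:

```python
import numpy as np

def cv_mse(fit, X, y, p=5):
    """p-fold cross-validation MSE in the spirit of Equation (9):
    each of the p subsets is held out once, and the squared errors
    are averaged over all p trials."""
    folds = np.array_split(np.arange(len(X)), p)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        predict = fit(X[train_idx], y[train_idx])   # train on the remaining folds
        errors.append(np.mean((y[test_idx] - predict(X[test_idx])) ** 2))
    return float(np.mean(errors))

def grid_search(fit_factory, X, y, grid, p=5):
    """Try every candidate (C, sigma) pair with cross-validation and
    return the pair with the lowest cross-validation MSE."""
    return min(grid, key=lambda params: cv_mse(fit_factory(*params), X, y, p))
```

Here `fit_factory(C, sigma)` is assumed to return a training function mapping (X_train, y_train) to a prediction function; in the paper's setting it would wrap the ν-SVR trainer, and iteratively shrinking `grid` around the best pair gives the refinement described above.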
Four yarn properties, namely unevenness (CV%), elongation at break (EB, %), breaking force (BF, cN) and ends-down per 1000 spindle-hours (ED), served as the SVM models' outputs. One of the primary aspects of developing an SVM regression model is the selection of the penalty term C and the RBF kernel width σ. To optimize these two parameters, the "grid search" method above was applied in the present work. In fact, optimizing the model parameters is an iterative process which continuously shrinks the search area and, as a result, obtains a satisfying solution. Table 1 lists the final search areas and optimal values of the four SVM models. After the completion of model development and training, all the models based on SVM (and ANN) were subjected to the unseen testing data set. Statistical parameters such as the correlation coefficient between the actual and predicted values (R), the mean squared error, and the mean error (%) were used to compare the predictive power of the SVM-based and ANN-based models. Results are shown in Table 2. It is observed that for the ANN models, the mean error (%) of three models is more than 10%, with only the CV% model remaining at about 5%, and the correlation coefficient (R) of the CV% and EB models is very low, 0.76 and 0.67 respectively. However, for the SVM models, the mean error (%) is less than 10% except that of ED, which is still high, and the correlation coefficient (R) of all models is improved to more than 0.80. Furthermore, the number of cases with over 10% error also decreases, from 5 and 6 in the ANN models to 2 and 3 in the SVM models. In fact, among all four yarn properties considered in our work, ends-down per 1000 spindle-hours can be affected by different operators and observers [10], and such data often undermine the prediction accuracy of various regression models.
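The statistics used in Table 2 can be computed with a short helper; this sketch is ours (the function name is invented, and it computes raw statistics, whereas the small MSE values in Table 2 suggest the paper's data were normalized first):

```python
import numpy as np

def predictive_power(y_true, y_pred):
    """Statistics of the kind used in Table 2 to compare SVM and ANN models."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    rel_err = np.abs(y_true - y_pred) / np.abs(y_true)
    return {
        "R": float(np.corrcoef(y_true, y_pred)[0, 1]),    # correlation coefficient
        "MSE": float(np.mean((y_true - y_pred) ** 2)),    # mean squared error
        "mean_error_pct": float(np.mean(rel_err) * 100),  # mean error %
        "cases_over_10pct": int(np.sum(rel_err > 0.10)),  # cases with over 10% error
    }
```

Applying such a helper to each of the four output properties, once for the ANN predictions and once for the SVM predictions, reproduces the layout of Table 2.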
Even so, for ED, almost all statistical parameters of the SVM model are much better than those of the ANN model.

5 CONCLUSIONS
Support vector machines are a new learning-by-example paradigm with many potential applications in science and engineering. Their salient features include the absence of local minima, the sparseness of the solution and the improved generalization. SVMs being a relatively new technique, their applications in textile production have hitherto been quite limited. However, the elegance of the formalism involved and their successful use in diverse science and engineering applications confirm the expectations raised by this appealing learning-from-examples approach. In this study, we presented SVM models for predicting yarn properties and compared them with BP neural network models. We found that, like the ANN models, the SVM models are able to predict with reasonably good accuracy in most cases. A more interesting phenomenon is that with small data sets and in real-life production the predictive power of ANN models appears to decrease, while SVM models are still capable of maintaining stable predictive accuracy to some extent. The experimental results indicate that SVM models are more suitable for the noisy and dynamic spinning process. Of course, like other emerging industrial techniques, applied issues with SVMs call for further development and investigation, such as how to design the kernel function and how to set the SVM hyper-parameters (to make industrial model development easier). Our research thus far demonstrates that SVMs are able to provide an alternative solution for spinners to predict yarn properties more correctly and reliably.

6 ACKNOWLEDGMENT
This research was supported by the National Science Foundation and the Technology Support Plan of the People's Republic of China, under contract numbers 70371040 and 2006BAF01A44 respectively.
7 REFERENCES
[1] Renate Esswein, "Knowledge assures quality", International Textile Bulletin, 2004, Vol. 15, No. 2, 17-21.
[2] R. Chattopadhyay and A. Guha, "Artificial Neural Networks: Applications to Textiles", Textile Progress, 2004, Vol. 35, No. 1, 1-42.
[3] V. David Sanchez A., "Advanced Support Vector Machines and Kernel Methods", Neurocomputing, 2003, Vol. 55, No. 3, 5-20.
[4] V. N. Vapnik, 1999, The Nature of Statistical Learning Theory, 2nd ed., Berlin: Springer, 31-188.
[5] B. Schölkopf, C. Burges, and A. Smola, 1999, Advances in Kernel Methods — Support Vector Learning, Cambridge, MA: MIT Press, 5-73.
[6] B. Schölkopf, A. Smola, R. C. Williamson, et al., "New support vector algorithms", Neural Computation, 2000, Vol. 12, No. 4, 1207-1245.
[7] Athanassia Chalimourda, B. Schölkopf, A. Smola, "Experimentally Optimal ν in Support Vector Regression for Different Noise Models and Parameter Settings", Neural Networks, 2004, Vol. 17, No. 1, 127-141.
[8] Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin, "A Practical Guide to Support Vector Classification", available at http://www.csie.ntu.edu.tw/~cjlin/paper
[9] R. Beltran, L. Wang, X. Wang, "Predicting worsted spinning performance with an artificial neural network model", Textile Research Journal, 2004, Vol. 74, No. 8, 757-763.
[10] Peter R. Lord, 2003, Handbook of Yarn Production (Technology, Science and Economics), Abington, England: Woodhead Publishing Limited, 95-212.

Table 1 The optimal values of σ and C

Output parameter        Optimal values
CV%                     σ = 0.973, C = 1606
Elongation at break     σ = 0.016, C = 14.55
Breaking force          σ = 0.012, C = 101.19
Ends-down               σ = 0.287, C = 2.975

Table 2 Comparison of the predictive power of the SVM-based and ANN-based models

                            Predicted value using ANN model    Predicted value using SVM model
Sample No.                  CV%     EB     BF      ED          CV%     EB     BF      ED
W21                         19.32   13.81  113.89  70.41       19.66   12.85  116.24  72.06
W22                         20.52   16.55  61.91   75.78       20.88   12.25  76.87   72.40
W23                         15.62   12.32  153.46  39.40       16.84   15.59  156.57  42.22
W24                         20.66   16.55  61.91   75.78       20.75   12.25  76.87   72.40
W25                         22.60   19.77  47.00   69.84       19.66   12.76  76.86   59.31
W26                         20.70   11.87  66.76   79.22       21.20   12.59  66.62   81.27
Correlation coefficient R   0.76    0.67   0.96    0.88        0.88    0.80   0.99    0.91
Mean squared error          0.01    0.12   0.07    0.03        0.003   0.05   0.01    0.03
Mean error %                5.73    24.35  13.67   19.99       2.85    9.23   5.52    17.29
Cases with over 10% error   1       6      5       6           0       2      2       3