Header Space reserved for Publication Control chart pattern recognition using wavelet analysis and neural networks Chuen-Sheng Cheng 1, Hui-Ping Cheng 2, Kuo-Ko Huang 1 1 Dept. of Industrial Engineering and Management, Yuan-Ze University, Tao-Yuan, Taiwan 320, ROC 1 E-mail: ieccheng@saturn.yzu.edu.tw 2 Department of Business Administration, Ming-Dao University, Changhua, Taiwan 52345, ROC 2 E-mail: hpcheng@mdu.edu.tw Abstract Control charts are useful tool in detecting out-of-control situations in process data. There are many unnatural patterns that may exist in process data indicating the process is out of control. The presence of unnatural patterns implies that a process is affected by assignable causes, and corrective actions should be taken. Identification of unnatural patterns can greatly narrow the set of possible causes that must be investigated, and thus the diagnostic work could be reduced in length. This paper presents a modified self-organizing neural network developed for control chart pattern analysis. The aim is to develop a pattern clustering approach when no prior knowledge of the unnatural patterns is available. This paper also investigates the use of features extracted from wavelet analysis as the components of the input vectors. Experimental results and comparisons based on simulated and real data show that the proposed approach performs better than traditional approach. Our research concluded that the extracted features can improve the performance of the proposed neural network. Keywords: Control chart; Pattern recognition; Wavelet analysis; Feature extraction 1. Introduction Statistical process control charts have been widely used for many years to monitor the quality characteristics of a process. A process is considered out of control if a point falls outside the control limits or a series of points exhibit an unnatural pattern (also known as nonrandom variation). Analysis of unnatural patterns is an important aspect of control charting. These unnatural patterns provide valuable information regarding potentialities for process improvement. It is well documented that a particular unnatural pattern on a control chart is often associated with a specific set of assignable causes (Western Electric Company, 1958). Therefore, once any unnatural patterns are recognized, the scope of process diagnosis can be greatly narrowed to a small set of possible causes that must be investigated, and thus the diagnostic search could be reduced in length. Fig. 1 displays examples of the types of unnatural pattern, the reader is referred to the Western Electric Handbook (Western Electric Company, 1958) for a detailed description of these unnatural patterns. The typical unnatural patterns on control charts are defined as follows. 1. Trends. A trend can be defined as a continuous movement in one direction (either upward or downward). 2. Sudden shifts. A shift may be defined as a sudden or abrupt change in the average of the process. 3. Systematic variation. One of the characteristics of a natural pattern is that the point-to-point fluctuations are unsystematic or unpredictable. In systematic variations a low point is always followed by a high one or vice versa. Page Number Header Space reserved for Publication 4. 5. Cycles. Cyclic behaviour of the process mean can be recognized by a series of high portions or peaks interspersed with low portions or troughs. Mixtures. In a mixture the points tend to fall near the high and low edge of the pattern with an absence of normal fluctuations near the middle. A mixture is actually a combination of data from separate distributions. Numerous approaches have been proposed to control chart pattern recognition. These include statistical (Cheng & Hubele, 1996; Yang & Yang, 2005), rule-based expert system (Cheng, 1989) and artificial neural network techniques (Al-Assaf, 2004; Cheng, 1997; Guh & Hsieh, 1999; Guh & Tannock, 1999; Hassan et al., 2003; Hwarng & Hubele, 1993; Hwarng & Hubele, 1993; Jang et al., 2003; Pacella et al., 2004; Perry et al., 2001; Pham & Chan, 1998; Pham & Chan, 2001; Pham & Oztemel, 1992; Pham & Oztemel, 1994; Pham & Wani, 1997; Smith, 1994; Wani & Pham, 1999; Yang & Yang, 2002). Supervised neural networks have been successfully employed in references (Al-Assaf, 2004; Cheng, 1997; Guh & Hsieh, 1999; Guh & Tannock, 1999; Hassan et al., 2003; Hwarng & Hubele, 1993; Hwarng & Hubele, 1993; Jang et al., 2003; Perry et al., 2001; Pham & Oztemel, 1992; Pham & Oztemel, 1994; Pham & Wani, 1997; Smith, 1994; Wani & Pham, 1999; Yang & Yang, 2002). These neural networks learn to recognize patterns by being presented with representative samples during a training phase. Ideally, sample patterns should be developed from a real process. A common approach adopted by previous researches was to generate training samples based on predefined mathematical model. An implicit assumption of this approach is that the groups of unnatural patterns are known in advance. In actual cases, sufficient training samples of unnatural patterns may not be readily available. In addition, the use of pre-defined models may create problems for patterns not previously encountered. Unsupervised neural networks can be used to cluster data into groups with similar features. This approach has been studied in (Pacella et al., 2004; Pham & Chan, 1998; Pham & Chan, 2001). (a) Normal pattern (b) Increasing trend (c) Decreasing trend (d) Cy clic pattern (e) Sy stematic pattern (f ) Mixture (g) Upward shif t (h) Downward shif t Figure. 1 - Examples of control chart patterns. Page Number Header Space reserved for Publication This paper presents a self-organizing neural network developed for control chart pattern recognition. The aim is to develop a pattern clustering approach when no prior knowledge of the unnatural patterns is available. This paper also investigates the use of features extracted from wavelet analysis as the components of the input vectors. The paper comprises six main sections. Section 2 briefly reviews self-organizing map. In Section 3, the SOM-based approach is explained in detail. Section 4 shows the results of using the proposed neural network to classify control chart patterns. Section 5 reports the performance evaluation based on a small set of patterns collected from manufacturing process. Finally, conclusions are made in Section 6. 2. Self-organizing map Fig. 2 illustrates a typical architecture of the Kohonen self-organizing map. There are m cluster units, arranged in a one- or two- dimensional array; the input vectors are n-tuples. The weight vector for a cluster unit serves as an exemplar (or template) of the input patterns associated with that cluster. During the self-organization process, the cluster unit whose weight vector matches the input vector most closely is chosen as the winner. The winning unit and its neighbouring units (determined by a radius, R and neighbourhood topology) update their weights. The SOM neural network can be used to cluster a set of p continuous-valued vectors x ( x1, x2 , x3 ,.., xn ) into m clusters. The training of a Kohonen self-organizing map can be summarized as follows: 1. Set topological neighbourhood parameter R. Set learning rate parameter . 2. Initialize weights wij . The weights are initialized to random values chosen from the same range of 3. 4. values as the components of the input vectors. Present an input vector from the training set to the map. For each j , compute D( j ) (wij xi ) 2 . 5. 6. Find index J such that D( j ) is a minimum. For all units j within a specified neighbourhood of J , and for all i : 7. Update learning rate when all input vectors have been presented. Reduce radius of topological neighbourhood. Repeat steps 3 to 7 until a specified number of iterations are completed. i wij (new) wij (old ) [ xi wij (old )] . 8. The number of input units in the network, n , is equal to the dimension of the vectors to be classified by the network. Note that the components of the input vector could be raw (unprocessed) data and/or a set of features extracted from the data. The number of output nodes, m , is set arbitrarily. The value of m represents the maximum number of clusters to be formed. If it is higher than the number of actual classes, only part of the output nodes will be committed. If m is too high, many trivial class templates will be created. The learning rate is a slowly decreasing function of training epochs. The radius of the neighborhood around a cluster unit also decreases as the clustering process progresses. yj y1 w11 wi1 x1 wn1 w1 j wij ym wnj xi w1m wim wnm xn Figure. 2 - Architecture of the Kohonen self-organizing neural network. Page Number Header Space reserved for Publication In the recall mode, only steps 4 to 6 are used. When a new input vector is presented to the network, the trained neural network will determine the best matched template. For more details on the Kohonen selforganizing map, refer to reference (Kohonen, 1989). 3. The proposed methodology This paper presents a self-organizing neural network developed for control chart pattern recognition. As with other types of self-organizing neural networks, no desired output is provided with any input in the training phase. The network segregates the input vectors into clusters as training proceeds. The clusters identify the different classes present in the input data set. A good classifier should have high classification accuracies with both the training and the test data sets and should involve only a small number of committed nodes by comparison with the number of patterns in the data set. The problem of the basic SOM is that many nodes are needed to obtain good accuracies (Pham & Chan, 1998). In order to alleviate this problem, we propose a modification to the training of the traditional SOM. The operation of the proposed network involves the following steps: 1. Formation of clusters Apply SOM to cluster the training set into m clusters. The value of m is set substantially larger than the number of potential classes in the data set. 2. Merging of clusters Treat the exemplars of the committed output nodes as the input vectors. Set the number of output nodes to m * , where m* m , and apply the SOM procedure. The rationale of the proposed modification is explained as follows. In the phase of cluster formation, the number of maximum clusters to be formed ( m ) is set at a high value so that input vectors with differing prominent features will cause new cluster nodes to be formed. In other words, the value of m should be kept high to ensure that different patterns are separated into distinct classes. This approach will establish the most representative class template. However, a high value of m will cause many trivial class templates to be created at the beginning of the training. The second phase can be thought of as merging clusters which seem to represent the same pattern class. It is intuitive that a clustering algorithm will perform better using a smooth version of input patterns (i.e., the class templates). The components of the input vector could be raw (unprocessed) data and/or a set of features extracted from the data. Previous researches (Hassan et al., 2003; Pham & Wani, 1997; Smith, 1994; Wani & Pham, 1999) indicated that the performance of the neural network-based pattern recognizer can be improved using extracted statistical features as the input vector. Al-Assaf (2004) showed that neural network works on a reduced set of coefficients obtained from the wavelet analysis performs better than using raw data as input. The present work investigates the use of features extracted from wavelet analysis. The characteristics would be extracted from the process data and added to the input vector before presenting to the neural network. Fast Haar transform (FHT) (Kaiser, 1998; Newland, 1993) was used to de-noise and extract important features from unnatural patterns. Consider a data vector consisting of 2 L numbers a0 ( ) , where 1 2 L . We can form two new vectors, defined as follows: 1 a1 ( ) [a0 (2 1) a0 (2 )] (1) 2 1 d1 ( ) [d 0 (2 1) d 0 (2 )] (2) 2 where 1 2 L1 . Applying the same process, we obtain 1 al ( ) [al 1 (2 1) al 1 (2 )] (3) 2 1 d l ( ) [d l 1 (2 1) d l 1 (2 )] (4) 2 where 1 2 Ll . Both al ( ) and d l ( ) have 2 Ll components. In this research we will study the effect of input vector augmented by al and d l . As an example, consider the following process data: Page Number Header Space reserved for Publication x [ x1 , x2 , x3 , x4 , x5 , x6 , x7 , x8 ]T (5) The extracted characteristics, a 2 and d 2 , are: 1 [a1 (1) a1 (2)] 2 1 a2 (2) [a1 (3) a1 (4)] 2 1 d 2 (1) [ d1 (1) d1 ( 2)] 2 1 d 2 (2) [d1 (3) d1 (4)] 2 The overall (or augmented) input vector presented to the network would be: x* [ x1 , x2 , x3 , x4 , x5 , x6 , x7 , x8 , a2 (1), a2 (2), d 2 (1), d 2 (2)]T a2 (1) (6) (7) (8) (9) (10) 4. Simulation results and discussion The implementing of the proposed neural network-based algorithm requires no knowledge of the unnatural patterns. However, to estimate the performance of the proposed neural network, unnatural process data were simulated. The patterns include normal, cyclic (sine wave and cosine wave) pattern, increasing trend, decreasing trend, upward shift and downward shift. The training data set consists of 462 input vectors (66 for each type). The testing data set contained the same number of input vectors as the training set. An input vector includes a time series of 32 data points. The generation of the different types of patterns for the training and test data sets is described in reference (Cheng, 1989). The parameters used for simulating unnatural patterns are given in Table 1. The topology and training parameters are given in Table 2. The proposed neural network was implemented in Matlab (MathWorks, 2004). The proposed neural network will be evaluated according to the following criteria: it should have the minimum number of established clusters and the highest classification accuracy. A set of experiments was conducted to estimate the performance of the proposed methodology. The performance of the basic SOM was selected as the benchmark. In order to have a fair comparison, the number of maximum clusters was kept the same for all competitive procedures. Table 1 - Parameters for simulating patterns Parameters (in terms of ) Pattern type Increasing trend Decreasing trend Upward shift Downward shift Cyclic pattern gradient: 0.1 to 0.3 gradient: -0.3 to -0.1 shift magnitude: 0.5 to 3.0 shift magnitude: -3.0 to -0.5 amplitude:0.5 to 3.0; period: 12 Table 2 - Topology and training parameters Network Topology Basic SOM Linear ( 7 1 ) Proposed approach (cluster-forming) Proposed approach (cluster-merging) Rectangular ( 8 8 ) Linear ( 7 1 ) 0.9 (geometric decrease) 0.9 (geometric decrease) 0.01 (geometric decrease) Page Number R 1 (remains constant) 1 (remains constant) 0 (remains constant) Training epochs 50 50 50 Header Space reserved for Publication The results obtained through simulation are summarized in Tables 3-5. From Table 3 we can conclude that the proposed approach offers better performances than the basic SOM. The accuracy was improved from 80.95% to 85.06% by using cluster-merging. The results of the testing data set also indicate that the proposed approach performed better than basic SOM. It is fair to conclude that the proposed method can reduce the number of clusters while maintaining good levels of accuracy. Table 4 presents a confusion matrix to describe the misclassification of patterns. There was tendency for the cyclic pattern to be most confused with random pattern, the increasing trend with upward shift, and the decreasing trend with downward shift. Fig. 3 depicts the class templates after training. It is clear that the class template is a smooth version of the input patterns and eventually becomes a noise-free pattern. In some unsupervised classification tasks, the user may have no knowledge as to how many potential classes there are in the data set. An experiment was conducted to measure the sensitivity of the proposed approach. Table 5 illustrates the sensitivity analysis of the proposed procedure. The results show that a reasonable solution was still obtained at various of number of m * . Table 3 - Results for basic SOM and the proposed approach Training n Network Basic SOM Proposed approach 32 32 m 7 64 m* Testing Accuracy (%) After merging (%) Accuracy (%) 80.95 93.51 - 81.39 85.06 86.58 - 7 Table 4 - Confusion matrix of the proposed approach using training data Recognized patterns Input patterns Up-trend Dn-trend Cycle-1 Cycle-2 Up-shift Dn-shift Normal Up-trend 84.8 0.0 0.0 0.0 15.2 0.0 0.0 Dn-trend 0.0 84.8 0.0 0.0 0.0 15.2 0.0 Cycle-1 0.0 0.0 86.4 0.0 0.0 0.0 13.6 Cycle-2 0.0 0.0 0.0 87.9 0.0 0.0 12.1 Up-shift 14.3 0.0 1.5 0.0 75.7 0.0 8.5 Dn-shift 0.0 15.2 0.0 1.5 0.0 77.3 6.0 Normal 0.0 0.0 0.0 1.5 0.0 0.0 98.5 Page Number Header Space reserved for Publication 0.50 Variable normal up-trend dn-trend cycle1 cycle2 up-shift dn-shift Data 0.25 0.00 -0.25 -0.50 5 10 15 20 25 30 Index Figure. 3 - Class templates. From Tables 3-5, the following observations can be made about the properties of the proposed procedure. First, based on the results of testing phase, it is concluded that the proposed approach can work as a pattern classifier to classify new process data. Second, the proposed approach can cluster unnatural pattern data to form a training set useful for supervised learning. Table 5 - Sensitivity analysis Network Basic SOM Proposed approach (8×8) Proposed approach (9×9) n m m* Accuracy (%) After merging (%) 32 6 7 8 9 10 - - - - - 70.35 80.95 82.03 84.63 85.06 - - - - - 64 6 93.51 73.81 64 7 93.51 85.06 64 8 93.51 86.58 64 9 93.51 88.96 64 10 93.51 88.31 81 6 94.16 77.49 81 7 94.16 86.80 81 8 94.16 85.28 81 9 94.16 86.15 81 10 94.16 85.93 32 32 Page Number Header Space reserved for Publication Table 6 - Comparison of different approaches Network Basic SOM (7×1) Proposed approach with raw data Proposed approach with raw data and a3 ,d3 n m m* Accuracy (%) After merging (%) 32 7 - 80.95 - 32 64 7 93.51 85.06 40 64 7 92.21 87.88 The proposed neural network described above used solely the process data in the process of clustering. The performance might be improved if the process data as well as extracted features were employed in the clustering process. The results in Table 6 show that the augmented input vector could attain better accuracy level. Evaluated by the same data set, the accuracy was improved from 85.06% to 87.88%. The improvement was due to the fact that the augmented input vectors include global shape as well as local details. 5. Application This section describes how the proposed neural network was used to classify a small set of patterns collected from industry process. The set comprised 30 patterns. After coding, each pattern was a 32dimensional vector with real numbers between -1 and +1 as components. Fig. 4 illustrates the correct grouping of the input patterns. This grouping was based on visual inspection of experienced engineers. The grouping structure using the proposed approach and basic SOM is presented in Table 7. The maximum number of clusters to be formed was set to the number of actual classes. With basic SOM, only four cluster nodes were committed, while the proposed approach established the correct number of clusters. The results show that the classification accuracies obtained with the proposed neural network are better than basic SOM. It is clear that basic SOM failed to differentiate between upward shifts and increasing trends completely. They were all assigned to the same cluster unit. As a final comparison, we applied the run chart of Minitab statistical software (Minitab, 2004) to analyze the above patterns. The run chart performs two tests for randomness that provide information on the nonrandom variation due to trends, oscillation, mixtures, and clustering (shift). The first test is based on the number of runs up or down and is sensitive to two types of nonrandom behavior - oscillation and trends. The second test is based on the total number of runs that occur both above and below the median. This test is designed to detect mixtures and clustering. Table 8 summarizes the results for trends and shifts. A P value less than a predetermined level of significance will indicate that an unnatural pattern is present. At significant level of 0.05, all trend patterns and three upward shifts are not significant. In other words, the run chart cannot recognize these patterns. Page Number Header Space reserved for Publication P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15 P16 P17 P18 P19 P20 P21 P22 P23 P24 P25 P26 P27 P28 P29 P30 Figure. 4 - Correct grouping for real data. Table 7 - Grouping results for real data Network n m m* Committed nodes Grouping structure Accuracy (%) C1 {P1, P2 , P3 , P4 , P5 , P6} C 2 {P7 , P8 , P9 , P10 , P11, P12 , Basic SOM 32 5 - P25 , P26 , P27 , P28 , P29 , P30} 4 80.00 C3 {P13, P14, P15, P16, P17, P18} C4 {P19, P20, P21, P22, P23, P24} C1 {P1, P2 , P3 , P4 , P5 , P6} C2 {P7 , P8 , P9 , P11, P12} Proposed aproach 32 20 5 5 C3 {P13, P14, P15, P16, P17, P18} C4 {P19, P20, P21, P22, P23, P24} C5 {P10, P25, P26, P27 , P28, P29, P30} Page Number 96.67 Header Space reserved for Publication Pattern Table 8 - Recognition results using Minitab P value for clustering P value for trend P7 0.35965 0.66701 P8 0.03617 0.50000 P9 0.03617 0.09766 P10 0.35965 0.50000 P11 0.14047 0.50000 P12 0.23613 0.90234 P25 0.00594 0.04211 P26 0.07528 0.19398 P27 0.35965 0.66701 P28 0.03617 0.95789 P29 0.03617 0.50000 P30 0.14047 0.33299 6. Conclusions Statistical process control (SPC) makes use of control charts to determine whether a process is functioning correctly. A control chart may exhibit many unnatural patterns indicating that the process is out-of-control. Correct recognition of these patterns is important to achieving early detection of potential problems and maintaining the quality of the process. This paper has described the application of self-organizing neural network to control chart pattern classification. Several methods have been proposed to improve the performance of the basic SOM network. The application of the proposed approach has been analyzed by means of simulations and real data collected from manufacturing process. The results indicate that the proposed method performs better than traditional approach. One advantage of the proposed approach is that it requires no previous information about unnatural pattern appearances and related mathematical models. The proposed approach can work as a pattern classifier to classify new process data. In addition, the proposed approach can cluster unnatural pattern data to form a training set useful for supervised learning. Currently, we are extending this work to investigate the performance of supervised pattern-recognizer based on training data clustered by the proposed approach. Page Number Header Space reserved for Publication References Al-Assaf, Y. (2004). Recognition of control chart patterns using multi-resolution wavelets analysis and neural network. Computers and Industrial Engineering, 47, 17-29. Cheng, C. S. (1989). Group technology and expert systems concepts applied to statistical process control in smallbatch manufacturing. Ph.D. Dissertation, Arizona State University, Tempe, Arizona. Cheng, C. S. (1997). A neural network approach for the analysis of control chart patterns. International Journal of Production Research, 35, 667-697. Cheng, C. S., & Hubele, N. F. (1996). A pattern recognition algorithm for an x-bar control chart. IIE Transactions, 28, 215-224. Guh, R. S., & Hsieh, Y. C. (1999). A neural network based model for abnormal pattern recognition of control charts. Computers and Industrial Engineering, 36, 97-108. Guh, R. S., & Tannock, J. D. T. (1999). Recognition of control chart concurrent patterns using a neural network approach. International Journal of Production Research, 37, 1743-1765. Hassan, A., Baksh, M., Shaharoun, A. M., & Jamaluddin, H. (2003). Improved SPC chart pattern recognition using statistical features. International Journal of Production Research, 41, 1587-1603. Hwarng, H. B., & Hubele, N. F. (1993). X-bar chart pattern identification through efficient off-line neural network training. IIE Transactions, 25, 27-40. Hwarng, H. B., & Hubele, N. F. (1993). Back-propagation pattern recognizers for X-bar control chart: methodology and performance. Computers and Industrial Engineering, 24, 219-235. Jang, K. Y., Yang, K., & Kang, C. (2003). Application of artificial neural network to identify non-random variation patterns on the run chart in automotive assembly process. International Journal of Production Research, 41, 1239-1254. Kaiser, G. (1998). The Fast Harr Transform-Gateway to wavelets. IEEE Potentials, April/May, 34-37. Kohonen, T. (1989). Self-Organisation and Associative Memory. (3rd ed.). Berlin: Springer-Verlag. MathWorks. (2004). MATLAB 7.0 User’s Guide. Natick: MathWorks Inc. Minitab. (2004). Minitab 14.0 User’s Guide. Minitab Inc. Newland, D. E. (1993). An Introduction to Random Vibrations. Spectral & Wavelet Analysis, NY: John Wiley & Sons. Pacella, M., Semeraro, Q., & Anglani, A. (2004). Adaptive resonance theory-based neural algorithms for manufacturing process quality control. International Journal of Production Research, 42, 4581-4607. Perry, M. B., Spoerre, J. K., & Velasco, T. (2001). Control chart pattern recognition using back propagation artificial neural networks. International Journal of Production Research, 39, 3399-3418. Pham, D. T., & Chan, A. B. (1998). Control chart pattern recognition using a new type of self-organizing neural network. Proceedings of the Institution of Mechanical Engineers, Part I, 212, 115-127. Pham, D. T., & Chan, A. B. (2001). Unsupervised adaptive resonance theory neural networks for control chart pattern recognition. Proceedings of the Institution of Mechanical Engineers, Part B, 215, 59-67. Pham, D. T., & Oztemel, E. (1992). Control chart pattern recognition using neural networks. Journal of Systems Engineering, 2, 256-262. Pham, D. T., & Oztemel, E. (1994). Control chart pattern recognition using learning vector quantization networks. International Journal of Production Research, 32, 721-729. Pham, D. T., & Wani, M. A. (1997). Feature-based control chart pattern recognition. International Journal of Production Research, 35, 1875-1890. Smith, A. E. (1994). X-bar and R control chart interpretation using neural computing. International Journal of Production Research, 32, 309-320. Wani, M. A., & Pham, D. T. (1999). Efficient control chart pattern recognition through synergistic and distributed artificial neural networks. Proceedings of the Institution of Mechanical Engineers, Part B, 213, 157-169. Western Electric Company. (1958). Statistical Quality Control Handbook. Indianapolis, Indiana: Western Electric Co. Inc. Yang, M. S., & Yang, J. H. (2002). A fuzzy-soft learning vector quantization for control chart pattern recognition. International Journal of Production Research, 40, 2721-2731. Yang, J. H., & Yang, M. S. (2005). A control chart pattern recognition system using a statistical correlation coefficient method. Computers and Industrial Engineering, 48, 205-221. Page Number