Designing effectively a set of X̄ quality control charts

Francisco Aparisi
Universidad Politécnica de Valencia
Departamento de Estadística e I. O. Aplicadas y Calidad
Camino de Vera s/n. 46022 Valencia, Spain
+34-963877490
faparisi@eio.upv.es

Marco A. de Luna
ITESM Campus Guadalajara
Departamento de Ingeniería Mecánica e Industrial
Mexico
+52-3336693000
mdeluna@itesm.mx

Eugenio Epprecht
Pontificia Universidade Católica de Río de Janeiro
Departamento de Engenharia Industrial
CEP: 22453-900, Gávea, Rio de Janeiro - RJ - Brasil
+55-2135271287
eke@ind.puc-rio.br

ABSTRACT

Nowadays it is common practice in industry to control simultaneously p variables of a unit or a productive process. There are two possibilities: first, to employ a set of univariate charts (like the X̄ chart) or, second, to use a multivariate chart (like the T² control chart). The design and performance of the T² control chart have been widely studied in the literature. However, the effective design of a set of X̄ charts has not been researched in depth.

In this paper we show how to find the best parameters of a set of X̄ charts. The parameters are found by solving an optimization problem with Genetic Algorithms (GAs). For example, in the case of two variables (p = 2) the variables to find are the upper and lower control limits of the two charts (UCLX1, LCLX1, UCLX2, and LCLX2) and the two sample sizes (n1 and n2). The objectives are:

1.- To detect as soon as possible (minimum ARL) a shift of size A = (A1, A2).
2.- Not to detect a shift of size B = (B1, B2).
3.- The in-control ARL has to be the specified value ARL0.

After the optimization is carried out we compare the result against the performance of the T² control chart.

Categories and Subject Descriptors
J.2 [Physical Sciences and Engineering]: Engineering. G.3 [Probability and Statistics]: Multivariate Statistics.

General Terms
Algorithms

Keywords
Statistical Process Control, Genetic Algorithms, multivariate quality control.

1. INTRODUCTION

In industrial situations quality control charts are often used to observe whether a process is in control. When there is one quality characteristic, Shewhart control charts (X̄ charts) are usually applied to monitor process shifts. But there are many situations in which the simultaneous control of two or more related quality characteristics is necessary, for example, the quality control of several dimensions of a part. In these cases it is still possible to design a statistical process control scheme for all quality characteristics using Shewhart control charts, but other feasible multivariate schemes exist.

As an example of how to control several variables with X̄ charts, let us consider a part which has two quality characteristics, represented by the normally distributed variables X1 and X2. First we consider the case where both characteristics are independent. If the two variables are monitored separately, a univariate X̄ chart with, for example, 3-sigma limits can be constructed for each characteristic. Each chart has a probability α = 0.0027 of exceeding the 3-sigma control limits (the type I error). The probability that both variables fall inside the control limits when the process is in control is (1 − 0.0027)(1 − 0.0027) = 0.994607. So the overall type I error for this case is α′ = 1 − 0.994607 = 0.0054. If there are p statistically independent quality characteristics and charts with type I error α are used, the real type I error α′ is

$$\alpha' = 1 - (1 - \alpha)^p \qquad (1)$$

Therefore, if we want to fix the type I error for the process as a whole, having p independent variables, equation (1) can be used to calculate the suitable type I error for each chart and, from it, the correct control limits. If the variables are not independent, which is the most common case, a more complex procedure has to be employed to obtain control limits that yield a given α′ value.
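Equation (1) and its inverse can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and do not come from the paper, and the limit multiplier assumes two-sided symmetric limits on a normally distributed sample mean.

```python
from statistics import NormalDist


def overall_alpha(alpha: float, p: int) -> float:
    """Equation (1): overall type I error of p independent charts."""
    return 1.0 - (1.0 - alpha) ** p


def per_chart_alpha(alpha_overall: float, p: int) -> float:
    """Equation (1) solved for alpha: per-chart type I error that
    yields the requested overall type I error."""
    return 1.0 - (1.0 - alpha_overall) ** (1.0 / p)


def limit_multiplier(alpha: float) -> float:
    """Multiplier k of sigma/sqrt(n) for symmetric two-sided limits
    with type I error alpha (assumes a normal sample mean)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


if __name__ == "__main__":
    # Two independent charts with 3-sigma limits (alpha = 0.0027 each).
    print("overall alpha:", overall_alpha(0.0027, 2))
    # Per-chart alpha and limit width that keep the overall alpha at 0.0027.
    a = per_chart_alpha(0.0027, 2)
    print("per-chart alpha:", a, "multiplier:", limit_multiplier(a))
```

With two independent charts, keeping the overall type I error at 0.0027 requires limits somewhat wider than 3 sigma on each chart, which is the adjustment the paragraph above describes.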
For simplicity, let us continue with the bivariate case, supposing that the variables follow a bivariate normal distribution with mean vector $\mu_0' = (\mu_{0,1}, \mu_{0,2})$ and covariance matrix

$$\Sigma_0 = \begin{pmatrix} \sigma_{0,1}^2 & \rho\,\sigma_{0,1}\sigma_{0,2} \\ \rho\,\sigma_{0,1}\sigma_{0,2} & \sigma_{0,2}^2 \end{pmatrix} \qquad (2)$$

and that X̄ charts are used. Then a value a has to be calculated such that

$$\alpha' = 1 - P\left(-a < \frac{\bar{X}_1 - \mu_{0,1}}{\sigma_{0,1}/\sqrt{n}} < a,\; -a < \frac{\bar{X}_2 - \mu_{0,2}}{\sigma_{0,2}/\sqrt{n}} < a\right) = 1 - P(-a < Z_1 < a,\; -a < Z_2 < a) = 1 - \int_{-a}^{a}\int_{-a}^{a} f(z_1, z_2)\,dz_1\,dz_2 \qquad (3)$$

where n is the sample size and $f(z_1, z_2)$ is the joint density function of a bivariate standard normal distribution with correlation ρ. So the control limits are $\mu_{0,1} \pm a\,\sigma_{0,1}/\sqrt{n}$ for $\bar{X}_1$ and $\mu_{0,2} \pm a\,\sigma_{0,2}/\sqrt{n}$ for $\bar{X}_2$.

Another solution to the problem of controlling several variables was provided by Hotelling (1947). Consider that p correlated characteristics are being measured simultaneously and that these characteristics follow a multivariate normal distribution with mean vector $\mu_0' = (\mu_{0,1}, \mu_{0,2}, \ldots, \mu_{0,p})$ and covariance matrix $\Sigma_0$ when the process is in control. When the ith sample of size n is taken we have n values of each characteristic, and it is possible to calculate the vector $\bar{X}_i$, which represents the ith sample average vector for the p characteristics. The charting statistic

$$T_i^2 = n(\bar{X}_i - \mu_0)'\,\Sigma_0^{-1}\,(\bar{X}_i - \mu_0) \qquad (4)$$

is called Hotelling's T² statistic. Notice that $T_i^2 \ge 0$. When the process is in control, $\mu_i = \mu_0$, $T_i^2$ is distributed as a chi-square variate with p degrees of freedom, and there is a probability α that this statistic exceeds a critical point $\chi^2_{p,\alpha}$, so that the overall error rate (type I) can be maintained exactly at the level α by triggering a signal when $T_i^2 > \chi^2_{p,\alpha}$. It is common practice to suggest the use of a Shewhart type control chart (figure 1) with an upper control limit (CL) of $\chi^2_{p,\alpha}$ (see, e.g., Jackson [1], Jackson [2] and Alt [3]).

If $\mu_i \ne \mu_0$, $T_i^2$ follows a non-central chi-squared distribution with p degrees of freedom and non-centrality parameter $\lambda = n(\mu_i - \mu_0)'\,\Sigma_0^{-1}\,(\mu_i - \mu_0)$. This scheme corresponds to the likelihood ratio test for H0: $\mu_i = \mu_0$ vs. H1: $\mu_i \ne \mu_0$. Recently, multivariate CUSUM and multivariate EWMA schemes have been defined, showing better power than the T² chart, especially for small or moderate process shifts (Crosier [4], Pignatello and Runger [5] and Lowry, Woodall, Champ and Rigdon [6]).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAC'08, March 16-20, 2008, Fortaleza, Ceará, Brazil.
Copyright 2008 ACM 978-1-59593-753-7/08/0003…$5.00.

2. DESIGNING A SET OF X̄ CONTROL CHARTS

The objective of an X̄ control chart, or Shewhart chart, is to detect that the mean of the process has shifted. The following statistical hypothesis test is posed:

$$H_0: m = m_0 \qquad H_1: m \ne m_0$$

where m is the present mean of the process and m0 is the mean when the process was in an in-control state. The procedure consists of taking a random sample from the process and computing the sample average, X̄. The value of this statistic is plotted on a chart with control limits UCL (Upper Control Limit) and LCL (Lower Control Limit). If the point falls beyond one of the control limits, the alternative hypothesis H1 is accepted: we conclude that the process mean has shifted and the process is out of control. In other words, an out-of-control signal is obtained when a point plots outside the control limits.

The ARL is the most widely used measure of the performance of a control chart. The ARL (Average Run Length) is the average number of points (samples) plotted until an out-of-control signal is shown.
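The ARL of a single X̄ chart can be computed directly from the definition above. The sketch below assumes independent successive samples and a normally distributed sample mean. The run length is then geometric and the ARL is the reciprocal of the probability that one point plots outside the limits. Function and parameter names are illustrative, not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def signal_probability(k_lower: float, k_upper: float,
                       delta: float, n: int) -> float:
    """Probability that one sample mean plots outside the limits
    mu0 - k_lower*sigma/sqrt(n) and mu0 + k_upper*sigma/sqrt(n),
    when the mean has shifted by delta standard deviations."""
    shift = delta * sqrt(n)  # shift measured in units of sigma/sqrt(n)
    inside = (STD_NORMAL.cdf(k_upper - shift)
              - STD_NORMAL.cdf(-k_lower - shift))
    return 1.0 - inside


def arl(k_lower: float, k_upper: float, delta: float, n: int) -> float:
    """Average run length: mean of a geometric run-length distribution."""
    return 1.0 / signal_probability(k_lower, k_upper, delta, n)


if __name__ == "__main__":
    # 3-sigma limits, sample size 5: in-control and two shifted cases.
    for delta in (0.0, 0.5, 1.0):
        print(f"shift = {delta:.1f} sigma  ARL = {arl(3.0, 3.0, delta, 5):.2f}")
```

The lower and upper multipliers are kept separate because an optimized design, unlike the classical 3-sigma chart, need not be symmetric about the in-control mean.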
When the process is in control, the value of the ARL should be as large as possible (no false alarms). However, when there is a shift in the process mean we would like to detect it quickly and, therefore, the value of the ARL should be as low as possible. The "ideal" chart would have an in-control ARL of infinity and an out-of-control ARL equal to 1. It is not possible to obtain this ideal chart, so the in-control and out-of-control ARL values must be balanced.

When p variables are monitored using a set of X̄ control charts, p means are monitored simultaneously, following the previous scheme. The whole scheme (the p charts) must have the in-control ARL that the user desires and an out-of-control ARL as low as possible, i.e. minimum. Therefore, the following parameters of the p charts are to be found:

1.- Control limits: UCLX1, LCLX1, UCLX2, LCLX2, …, UCLXp, LCLXp.
2.- Sample sizes: n1, n2, …, np.

Therefore, the design of a set of X̄ control charts consists of finding these parameters so as to fulfill the ARL requirements of the user.

3. SOLVING THE OPTIMIZATION PROBLEM

The design of a set of X̄ control charts can be posed as an optimization problem.

Given:
Number of variables to monitor, p.
Magnitude of the shift to be detected, d.
In-control ARL, ARL(d = 0) = ARL0.

Find:
Control limits and sample sizes UCLX1, LCLX1, UCLX2, LCLX2, …, UCLXp, LCLXp, n1, n2, …, np that minimize ARL(d) subject to the restriction ARL(d = 0) = ARL0.

The ARL function to minimize is:

$$ARL(d) = \frac{1}{1 - \int_{LCL_{X1}}^{UCL_{X1}} \cdots \int_{LCL_{Xp}}^{UCL_{Xp}} (2\pi)^{-p/2}\,|\Sigma|^{-1/2} \exp\left[-\tfrac{1}{2}(\bar{x} - \mu_1)'\,\Sigma^{-1}\,(\bar{x} - \mu_1)\right] d\bar{x}_p \cdots d\bar{x}_1} \qquad (5)$$

where $\bar{x}$ is the vector of sample means (the integration variable), $\mu_1$ is the vector of out-of-control means, $\mu_0$ is the vector of in-control means, so that the shift is $d = \mu_1 - \mu_0$, and Σ is the covariance matrix of the variables, taken here for the vector of sample means ($\Sigma_0/n$ when every chart uses the same sample size n).

In addition, more restrictions are added to take into account the real application in industry. Normally, there is a maximum sample size that can be taken from the process, nmax. In some cases the user may also specify that all sample sizes must be equal.

This optimization problem is not easy, as integer and real variables are mixed. Moreover, the ARL function is nonlinear. Therefore, it is a suitable problem to be solved using Genetic Algorithms (GAs). The use of GAs to find the optimum parameters of quality control charts is relatively recent. Some examples can be found in Aparisi and García-Díaz [7], Champ and Aparisi [8] and He and Grigoryan [9].

4. SOFTWARE AND EXAMPLE OF APPLICATION

One of the objectives of this work is to help the final user in industry to solve the optimization problem posed here. For that reason, user-friendly software has been developed. Figure 1 shows this software, which at the moment solves the problem when two or three variables are monitored, the most common cases.

The user has to input the "Model Parameters", i.e., the number of variables (means) to be controlled, the desired in-control ARL, the maximum sample size that can be used, the option to force all sample sizes to be equal, the specification of the shift that must be detected, and the correlation coefficients of the variables. The user can also specify the parameters of the GA (number of generations, population size, …), although the default values shown by the software have been chosen to optimize the performance of the GA.

An example of application is solved using the software. Figure 1 shows the solution to this problem. Two variables are monitored. When the process is in an in-control state the means are $\mu_0 = (2.2, 7.8)'$. When the process is in control an ARL of 400 is desired, ARL0 = 400. The maximum sample size that can be used is nmax = 5, and it is required that the two variables be sampled with the same sample size. The correlation coefficient between the variables is r = 0.8. The user wants to obtain the best parameters for the two Shewhart charts to minimize the ARL for detecting an increase of 0.7 in the first variable and an increase of 0.5 in the second variable.

After running the software (less than one minute) the solution found is:

n = 5 (for both variables).
UCLX1 = 2.81; LCLX1 = −4.25.
UCLX2 = 3.81; LCLX2 = −4.43.
ARL(d = 0) = ARL0 = 400.02.
ARL(d = (0.7, 0.5)′) = 9.38.

Therefore, this scheme will need an average of 9.38 samples to detect a shift to the new mean $\mu_1 = (2.9, 8.3)'$.

Figure 1. Software solving the example of application.

5. COMPARISON WITH THE T² CONTROL CHART

As mentioned in the introduction, the alternative for monitoring p variables is the T² control chart. The software helps the user to compare the performance of p Shewhart charts against that of a T² control chart. Figure 1 shows how the software reports the ARL of the T² control chart for detecting the specified shift with the same in-control ARL, ARL0 = 400. In this case, the ARL of the multivariate chart is 20.89, against 9.38 for the two Shewhart charts. Therefore, the set of two Shewhart charts needs, on average, less than half as many samples to detect the out-of-control state.

At first sight, it seems that the set of X̄ charts is the better option. However, the chart on the right of Figure 2 has to be considered. This chart shows an ARL comparison between both schemes. The set of X̄ charts achieves its lower out-of-control ARL by moving the maximum of the ARL away from the in-control point; that is, the largest ARL values do not occur at the in-control state (no shift). Therefore, for some shifts the set of X̄ charts will have very large out-of-control ARLs, i.e., these shifts will be very difficult to detect. In contrast, for the T² control chart the ARL for any shift is always lower than its in-control ARL.

Figure 2. Comparison of ARLs.

The fact that some shifts are not easily detected by the set of X̄ charts, while the same set shows a much lower out-of-control ARL for a given shift, can be exploited by the user.
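The comparison can be explored numerically for the bivariate example. The sketch below is not the authors' software and rests on several assumptions not stated in the paper: both variables have unit standard deviation, the reported control limits are multiples of σ/√n around the in-control mean with negative lower limits, and successive samples are independent. The ARL of the two X̄ charts integrates the bivariate normal density with Simpson's rule. The ARL of the T² chart uses the fact that, for p = 2, chi-square tail probabilities have a closed form.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def arl_two_xbar(lcl, ucl, shift, n, rho, steps=4000):
    """ARL of two X-bar charts sharing sample size n (requires |rho| < 1).
    lcl, ucl: standardized limits (multiples of sigma/sqrt(n) about mu0).
    shift: mean shifts in sigma units (unit variances assumed)."""
    d1, d2 = shift[0] * sqrt(n), shift[1] * sqrt(n)
    lo1, hi1 = lcl[0] - d1, ucl[0] - d1   # limits recentred on the new mean
    lo2, hi2 = lcl[1] - d2, ucl[1] - d2
    s = sqrt(1.0 - rho * rho)             # conditional std. dev. of Z2 given Z1

    def integrand(z1):
        # density of Z1 times P(lo2 < Z2 < hi2 | Z1 = z1)
        cond = STD.cdf((hi2 - rho * z1) / s) - STD.cdf((lo2 - rho * z1) / s)
        return STD.pdf(z1) * cond

    h = (hi1 - lo1) / steps               # Simpson's rule, steps must be even
    total = integrand(lo1) + integrand(hi1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo1 + i * h)
    inside = total * h / 3.0              # P(both means inside their limits)
    return 1.0 / (1.0 - inside)


def arl_t2_bivariate(shift, n, rho, arl0, terms=200):
    """ARL of a T2 chart for p = 2 with in-control ARL arl0
    (unit variances assumed)."""
    half_cl = log(arl0)  # chi-square(2 df): P(X > x) = exp(-x/2) = 1/arl0
    d1, d2 = shift
    lam = n * (d1 * d1 - 2.0 * rho * d1 * d2 + d2 * d2) / (1.0 - rho * rho)
    # Non-central chi-square tail as a Poisson(lam/2) mixture of central
    # chi-square tails with 2 + 2j degrees of freedom.
    weight = exp(-lam / 2.0)              # Poisson weight for j = 0
    term = exp(-half_cl)                  # exp(-c) * c**k / k! for k = 0
    tail = term                           # P(chi-square(2 df) > CL)
    power = 0.0
    for j in range(terms):
        power += weight * tail
        weight *= (lam / 2.0) / (j + 1)
        term *= half_cl / (j + 1)
        tail += term
    return 1.0 / power


if __name__ == "__main__":
    lcl, ucl = (-4.25, -4.43), (2.81, 3.81)   # limits of the example (assumed signs)
    for d in ((0.0, 0.0), (0.7, 0.5), (-0.3, -0.3)):
        print(d, "X-bar set:", round(arl_two_xbar(lcl, ucl, d, 5, 0.8), 2),
              "T2:", round(arl_t2_bivariate(d, 5, 0.8, 400), 2))
```

Under these assumptions the output should land close to the values reported in the example, with small differences expected from the rounding of the printed limits. The third shift illustrates the trade-off discussed above: a small shift in the direction opposite to the design shift gives the set of X̄ charts an ARL larger than its in-control value, whereas the T² chart's ARL stays below 400.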
As Woodall [X] states, in some situations it is required not to detect some shifts while keeping the ability to detect others.

6. CONCLUSIONS

In this work we have posed the optimization of a set of X̄ quality control charts for the case in which the user specifies the in-control ARL, with the objective of finding the parameters that minimize the out-of-control ARL for a given shift magnitude. User-friendly software has been developed to help users in industry find these optimal parameters using GAs.

The results of the optimization show that the set of X̄ charts detects the shift it was designed for sooner (lower out-of-control ARL) than a T² control chart. However, the set of X̄ charts optimized for a given shift magnitude will not easily detect other shifts. These shifts are shown by the software and, as commented by Woodall [X], this behavior meets a need found in some industries.

7. ACKNOWLEDGMENTS

This work has been supported by the Ministry of Education and Science of Spain, research project number DPI2006-06124, including European FEDER funding, and by the ITESM-Foundation Carolina agreement.

8. REFERENCES

[1] Jackson, J. E., 1959, Quality Control Methods for Several Related Variables. Technometrics, 1 (4), 359-377.

[2] Jackson, J. E., 1985, Multivariate Quality Control. Communications in Statistics, 14 (11), 2657-2688.

[3] Alt, F. B., 1985, Multivariate Control Charts. Encyclopedia of Statistical Sciences, vol. 6 (S. Kotz and N. L. Johnson, Eds. Wiley, New York), 110-122.

[4] Crosier, R. B., 1988, Multivariate Generalizations of Cumulative Sum Quality-Control Schemes. Technometrics, 30, 291-303.

[5] Pignatello, J. J., Jr. and Runger, G. C., 1990, Comparisons of Multivariate CUSUM Charts. Journal of Quality Technology, 22, 173-186.

[6] Lowry, C. A., Woodall, W. H., Champ, C. W. and Rigdon, S. E., 1992, A Multivariate Exponentially Weighted Moving Average Control Chart. Technometrics, 34, 46-53.

[7] Aparisi, F. and García-Díaz, J. C., 2004, Optimization of Univariate and Multivariate Exponentially Weighted Moving Average Control Charts using Genetic Algorithms. Computers and Operations Research, 31 (9), 1437-1454.

[8] Champ, C. W. and Aparisi, F., 2007, Hotelling's T² Double Sampling Charts. Quality and Reliability Engineering International, accepted.

[9] He, D. and Grigoryan, A., 2002, Construction of Double Sampling S-Control Charts for Agile Manufacturing. Quality and Reliability Engineering International, 18, 343-355.