QUICK & DIRTY GRR PROCEDURE TO RANK TEST METHOD VARIABILITY

Mike Mercer, Quality Engineering Specialist, 3M, St. Paul, MN
Steve Cox, Lean Six Sigma Coach, 3M, St. Paul, MN

Introduction

One of the first steps in a process improvement effort is to assess the stability of the process. One of the simplest ways to do so is with a control chart. Once a baseline has been established, some thought must be given to how much of the variability is actual process variability and how much is due to test method measurement error. This matters in any improvement effort. For many processes, especially those that rely on destructive testing, as the PSA industry does, a large portion of the variability can be due to the test method measurement. Processes with high test method variability can often achieve major improvement just by improving the test.

In any event, trying to improve a process without at least a good estimate of test method error is frustrating at best and potentially quite costly. Experimenting without knowing the minimum effect that can be detected can be costly in both resources and morale and, possibly more important, can lead to erroneous conclusions and chasing down dead ends. The situation can be worse if experimentation follows a Design of Experiments (DOE) methodology, in that many runs can fail to show any factors as significant. In this case frustration abounds and the damage done extends well past the process under study.

As an example, Figure 1 shows the results of a typical study to select the material that demonstrates the highest CD Tensile. The chart shows the average result of 5 measurements for each formulation. Figure 2 is taken from ASTM D3759 and shows that the width of the 95% confidence limits for a single operator testing different materials is +/- 62.7%. So instead of selecting the best material, what we really did was reject the worst one, as depicted in Figure 3.
This paper will demonstrate an easy way to characterize the test method's impact on process variation.

Control Charting PSA Tape Processes

Where a process has only one quality characteristic value per coated roll, the classic Individual Moving Range Control Chart is a valuable way to judge the stability of the process. See Figure 4. Often the Individual Moving Range Control Chart is used to plot quality characteristics for lanes. Figure 4 depicts the coat weight for the left side of the coater; a second chart is used to depict the coat weight for the right side.

When more than one quality characteristic value is collected per roll, we can have an entirely different situation if we apply the classic Xbar Range Control Chart. By their very nature, tape coating processes can violate the assumptions required for successful implementation of classic control charting techniques. Classic Xbar Range control charts are based on the assumption that the variability within the subgroup is the same as the variability between subgroups. When crossweb measurements are used to represent a subgroup (i.e., within variability), their variability is often much less than the downweb variability (i.e., between variability). When classic control charts are used, the process often looks wildly out of control. See, for instance, Figure 5, which depicts a classic Xbar Range control chart where each subgroup is composed of two coat weight measurements, one from each side of the web, plotted for each coated roll. The crossweb variation is, by coater design, optimized to be uniform, whereas the downweb variation is influenced by other factors.

Luckily for us, Don Wheeler¹ has described a procedure for situations where the within variation (crossweb) is different from the between variation (downweb). Luckily also, Minitab² has implemented a control chart for this situation; see Figure 6. The top chart depicts the stability of the process average.
The middle range chart depicts the downweb range stability. The bottom range chart depicts the crossweb stability. Figure 7 shows the standard deviation computations that accompany these between/within control charts. The estimate of the within standard deviation is the variation due to crossweb; the between standard deviation is the variation due to downweb. The total is the square root of the sum of the squares of the two components, and is the standard deviation that represents the total variation due to the process. It is the number that can be used to calculate a Cpk index, for instance.

It is not our intention to teach the use of Between/Within control charting techniques, since Wheeler has already done that for us. However, it is this very structure of between/within variation that we will use to obtain estimates of test method variation from typical tape coating processes.

Figure 1. Selection of the best formulation

Figure 2. ASTM D3759 confidence intervals

Figure 3. Rejection of the worst

Figure 4. Individual Moving Range Control Chart for Coat Weight
[I-MR Chart of Left CW, plotted against Observation. Individual Value chart: UCL=33.951, mean=31.760, LCL=29.569. Moving Range chart: UCL=2.691, mean MR=0.824, LCL=0.]

Figure 5. Xbar Range Control Chart for Tape Coat Weight

Figure 6. Between/Within Control Chart for Coat Weight

Figure 7. Standard Deviation Calculations from Between/Within Control Chart

Gage R&R Studies

In cases where one process is under study, it is common to use a Gage R&R study to assess the “health” of the test method measurement system. The action-level criterion for the Gage R&R method is often quoted from the Automotive Industry Action Group (AIAG) Measurement Systems Analysis Manual³ (MSA). Donald Wheeler⁴ has published a critique of this action level and makes a good case that, at least for control charts, the criterion is far too conservative.
In any event, we will use the Gage R&R ratio as another metric to rank the health of the measurement system. A gage study has the advantage of looking at operators and at their repeatability and reproducibility. There are some issues with gage studies, probably the most prominent being that the samples must be carefully selected so they bracket the process width. This is sometimes difficult to do, and if it is not done properly the resulting Gage R&R ratio does not reflect what is really occurring, because the estimate of process variation is incorrect.

$$\text{Gage R\&R ratio} = \frac{\sigma_{\text{Test Method}}}{\sigma_{\text{Process}}}$$
Equation 1

Thus time can be spent fixing an adequate measurement system when that effort would be much better spent on process improvement. The difficulties of gage studies can be overcome with careful planning and preparation. Destructive tests can present additional challenges, which can usually be compensated for by careful planning.

Gage R&R studies have the disadvantage that they require additional experimentation above and beyond what is routinely collected. Typically ten samples are collected for testing, and three operators each test every sample three times. This amounts to 90 data points (10 samples x 3 operators x 3 repeats), which is a considerable amount of additional work. The resulting statistical output comprises two components of the test method standard deviation: the repeatability portion and the reproducibility portion. Repeatability is the variability one person gets testing the same material repeatedly. Reproducibility is the variability due to different operators. Repeatability is always present. Reproducibility, if present, is in many cases the result of poor training. In the best of cases it is negligible.

Many tape processes require that multiple specimens be tested and averaged to represent the reported result of the quality measurement.
In some cases the test is such that multiple specimens are needed to “average out” test variability to an acceptable level. It may be that multiple specimens are a customer requirement; it may also be that it is simply the way things have always been done. PSTC test methods routinely require multiple specimens to be tested and the average or median reported; see Figure 8.

Figure 8. PSTC Test Methods requiring multiple specimens

If a Gage R&R study were performed on PSTC 107, where five specimens are required to represent the reported result of the test method, then the Gage R&R study would require 450 individual tests (90 times 5 specimens per average). Clearly, a simpler method of estimating the test method repeatability standard deviation would be welcome.

Concept for Quick and Dirty Gage R&R Studies

Often the individual values obtained to calculate the reported result of the quality characteristic represented by a test method are discarded and only the average value is recorded. The resulting quality characteristic is often charted on the classic Individual Moving Range Control Chart, which gives a picture of the process stability. The information lost by discarding the individual results is in actuality the information needed to calculate the repeatability standard deviation of a Gage R&R study. If we can calculate the standard deviation of the individual results, we can estimate the standard deviation of the test method itself, since the two are related as follows, where n is the number of individual results averaged into each reported result.

$$\sigma_{\text{Test Method}} = \frac{\sigma_{\text{Individuals}}}{\sqrt{n}} = \sqrt{\frac{\sigma^2_{\text{Individuals}}}{n}}$$
Equation 2

If, instead of discarding the individual values of the quality characteristic, we use them in conjunction with the Between/Within Control Chart for one lane, we can obtain an estimate of test method variability from the within standard deviation.

$$\sigma_{\text{Individuals}} = \sigma_{\text{Within}}$$
Equation 3

The between standard deviation is a representation of the true process variability.
However, the observed process standard deviation is a combination of both process and test method variation.

$$\sigma_{\text{Process}} = \sqrt{\sigma^2_{\text{Between}} + \sigma^2_{\text{Test Method}}}$$
Equation 4

Substituting equations 2 and 3 into equation 4 gives

$$\sigma_{\text{Process}} = \sqrt{\sigma^2_{\text{Between}} + \frac{\sigma^2_{\text{Within}}}{n}}$$
Equation 5

Substituting these relations into the definition of the Gage R&R ratio from equation 1 gives

$$\text{Gage R\&R ratio} = \frac{\sqrt{\dfrac{\sigma^2_{\text{Within}}}{n}}}{\sqrt{\sigma^2_{\text{Between}} + \dfrac{\sigma^2_{\text{Within}}}{n}}} = \sqrt{\frac{\sigma^2_{\text{Within}}}{n\,\sigma^2_{\text{Between}} + \sigma^2_{\text{Within}}}}$$
Equation 6

We now have a way to calculate the Gage R&R ratio directly from the output of a Between/Within Control Chart when we use the individual values of the test method.

Process Capability Calculations

The next things to consider are the capability of the process and the portion of the variability that is due to the test method. Fortunately, Minitab again has a procedure to help us analyze this in the presence of Between/Within variability. For illustration purposes only, assume that the data represented earlier in Figure 6 are in fact two repeats of the coating weight, averaged to give the reported coating weight test result, instead of measurements from the right and left sides of the web. We do this to show the calculations of the Between and Within standard deviations as calculated before.

[Between/Within Capability Sixpack of Coat Weight. Specifications: LSL 29.5, USL 34.5. Individuals Chart of Subgroup Means: UCL=33.937, mean=31.807, LCL=29.677. Moving Range Chart of Subgroup Means: UCL=2.617, mean MR=0.801, LCL=0. Range Chart of All Data: UCL=1.176, mean R=0.360, LCL=0. Capability Histogram. Normal Prob Plot: AD 0.492, P 0.214. Capability Plot: StDev Between 0.67315, Within 0.319149, B/W 0.744974, Overall 0.925557. Capability stats: Cp 1.12, Cpk 1.03, Pp 0.9, Ppk 0.83, Cpm *.]

Figure 9. 
Minitab Between/Within Capability Sixpack

This chart contains a lot of information about our process as well as the test method. The values of the Between and Within standard deviations are given and are identical to those in Figure 7. The Gage R&R ratio from these data is

$$\text{Gage R\&R ratio} = \sqrt{\frac{\sigma^2_{\text{Within}}}{n\,\sigma^2_{\text{Between}} + \sigma^2_{\text{Within}}}} = \sqrt{\frac{.319^2}{2 \times .673^2 + .319^2}} = .32$$
Equation 7

In addition we have the Cp, Cpk, Pp, and Ppk metrics. The Cpk metric (calculated from the control chart standard deviation estimates) and the Ppk metric (calculated from the overall, or population, standard deviation) would be examined. In this example the Cpk of 1.03 is greater than the Ppk of .83, indicating that special cause variation is present. Any difference between the Cp and the Cpk, and similarly between the Pp and the Ppk, is due to the process being off center from the specifications; in this case the process is slightly off center.

Quick and Dirty Ranking

In many process improvement situations there are many processes and many quality characteristics to work on. Capability metrics are relatively easy to compute; gage metrics are less easy and in some cases much more difficult. Where repeats are present, the described method offers a quick way to do a gage study. It allows many processes to be “triaged” so resources are focused on the processes most in need of work. In the above example the process is not capable (Ppk<1) and the test method is marginal, so this would be a potential candidate for improvement. In the factory that produced this product there would likely be other processes more in need of improvement. There are situations where a quick estimate of test variation can be very useful even in the case of very capable processes. In the case of process improvement, several criteria can be used to evaluate the potential for improvement activities: process capability metrics, test variation, resource availability, and economic factors.
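The Equation 6 calculation and the triage idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name gage_rr_ratio is hypothetical, the standard deviations are the values printed in Figure 9, the rows are copied from Table 1, and sorting by Gage R&R ratio alone is an assumed stand-in for whatever scoring criteria an improvement team would actually choose.

```python
import math


def gage_rr_ratio(sigma_within, sigma_between, n):
    """Gage R&R ratio per Equation 6 (illustrative helper, not a library call)."""
    test_method_var = sigma_within ** 2 / n              # Equations 2 and 3
    process_var = sigma_between ** 2 + test_method_var   # Equation 5
    return math.sqrt(test_method_var / process_var)      # Equation 1


# Between and Within standard deviations as printed in Figure 9; n = 2 repeats.
ratio = gage_rr_ratio(sigma_within=0.319149, sigma_between=0.67315, n=2)
print(f"Gage R&R ratio: {ratio:.2f}")

# Rows copied from Table 1: (process, product, characteristic, Ppk, Gage R&R ratio).
rows = [
    ("SDC", "A", "Adhesion", 0.90, 0.65),
    ("SDC", "A", "Coat Weight", 0.83, 0.32),
    ("SDC", "B", "Adhesion", 1.12, 0.10),
    ("MCM", "C", "Adhesion", 0.65, 0.71),
    ("MCM", "C", "Shear", 1.25, 0.20),
    ("MCM", "D", "Tack", 1.24, 0.11),
]

# Assumed triage rule, for illustration only: largest Gage R&R ratio first.
ranked = sorted(rows, key=lambda row: row[4], reverse=True)
for process, product, characteristic, ppk, grr in ranked:
    print(f"{process} {product} {characteristic}: Ppk={ppk:.2f}, Gage R&R={grr:.2f}")
```

With these inputs the first print statement reproduces the .32 of Equation 7. In practice the ranking would weigh the other columns of Table 1 as well, so the single-key sort here should be read only as a placeholder for that scoring step.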
In an actual case study there were two processes and a total of 29 products, each with many quality characteristics represented as the result of a test method. They were analyzed in two days using the described procedures. The individual test data had been recorded in the quality records and were available for analysis. A sixpack was run for each process/product/quality characteristic. The Gage R&R ratio was computed and tabulated along with the other measures depicted in Table 1. The data developed were then put in a cause and effect (C&E) matrix and scored against criteria selected by the improvement team. The scored matrix was then sorted into rank order and the top ten were selected for improvement work.

Table 1. Quick and dirty process capability improvement opportunities

Description of improvement activity for each metric:
Cpk: Potential for process improvement. >1.33 is better
Cp/Cpk ratio: Potential for process improvement via centering. Bigger is better
Ppk: Actual process performance. >1.33 is better
Cpk/Ppk ratio: Potential for process improvement via removing special causes
Gage R&R ratio: Potential for process improvement via test method. Smaller is better

Process  Product  Quality Characteristic  Cpk   Cp/Cpk ratio  Ppk   Cpk/Ppk ratio  Gage R&R ratio
SDC      A        Adhesion                0.95  1.31          .90   1.05           .65
SDC      A        Coat Weight             1.03  1.09          .83   1.24           .32
SDC      B        Adhesion                1.52  1.00          1.12  1.35           .10
MCM      C        Adhesion                1.00  1.00          .65   1.54           .71
MCM      C        Shear                   1.33  1.1           1.25  1.06           .20
“        “        “                       “     “             “     “              “
“        “        “                       “     “             “     “              “
“        “        “                       “     “             “     “              “
“        “        “                       “     “             “     “              “
MCM      D        Tack                    1.33  1.4           1.24  1.07           .11

In situations where repeats are used in testing, the described technique can greatly reduce the time required to achieve results. It is particularly effective in ranking processes for improvement. This allows the most needed improvements to be done first so the greatest value can be achieved. It should be noted that in all cases discussed the metric described would be for individual values.
In the case where specifications are based on averages, the calculated standard deviations would need to be corrected for sample size.

Literature Citations

1 Donald Wheeler, Advanced Topics in Statistical Process Control, SPC Press, 1995
2 Minitab Software, www.minitab.com. Revision 15
3 Measurement Systems Analysis, 3rd Edition, 2002, AIAG, Automotive Industry Action Group
4 Donald Wheeler, Good Data, Bad Data and Process Behavior Charts. http://www.spcpress.com/pdf/good_data_%20bad_data.pdf

TECH 32 Technical Seminar Speaker

Quick & Dirty GRR Procedure to Rank Test Method Variability
Michael Mercer, 3M

Michael Mercer is a quality engineering specialist with the Industrial and Transportation Business at 3M. He has a broad background in quality, including quality management, serving 20 years as ISO 9002 team leader for the Industrial Adhesive and Tapes Division, and statistics, with emphasis on test method variability and control charting. Mercer currently is developing a test method Center of Excellence for the Industrial and Transportation Business. He earned a B.S. degree in chemical engineering from the State University of New York at Buffalo, and pursued additional education in chemical engineering from the University of Minnesota.