Simplified Design of Experiments
Quality Day, Orange Empire 0701, October 28, 2010

Goals of Today’s Session
• Understand why Design of Experiments is preferable.
• Learn how to perform basic experiments to optimize results and reduce variation.

Approaches to Experimentation
• Build-test-fix (What experimentation?)
• One-factor-at-a-time (OFAT)
• Designed experiments (DOE)

Build-Test-Fix
• The tinkerer’s approach
• Impossible to know if true optimum achieved
  – Quit when it works!
• Consistently slow
  – Requires intuition, luck, rework
  – Continual fire-fighting

One Factor at a Time (OFAT)
• Approach:
  – Run all factors at one condition
  – Repeat, changing condition of one factor
  – Continuing to hold that factor at that condition, rerun with another factor at its second condition
  – Repeat until all factors are at their optimum conditions
• Slow and requires many tests
• Can miss interactions!

“One Factor at a Time” is Like The Blind Man and the Elephant
What we conclude may be determined by where we are looking!

Design of Experiments
• A statistics-based approach to designed experiments.
• A methodology to achieve a predictive knowledge of a complex, multi-variable process with the fewest trials possible.
• An optimization of the experimental process itself.
Key Concepts
• DOE is about better understanding of our processes.

Illustration of a Process
INPUTS (Factors, X variables): People, Materials, Equipment, Policies, Procedures, Methods, Environment
PROCESS: A Blending of Inputs which Generates Corresponding Outputs
OUTPUTS (Responses, Y variables): responses related to performing a service, responses related to producing a product, responses related to completing a task

Injection Molding Process
INPUTS (Factors, X variables): Type of Raw Material, Mold Temperature, Holding Pressure, Holding Time, Gate Size, Screw Speed, Moisture Content
PROCESS: Manufacturing Injection Molded Parts
OUTPUTS (Responses, Y variables): % shrinkage from mold size, thickness of molded part, number of defective parts

Concrete Mixing Process
INPUTS (Factors, X variables): Type of cement, Percent water, Type of Additives, Percent Additives, Mixing Time, Curing Conditions, % Plasticizer
PROCESS: Discovering Optimal Concrete Mixture
OUTPUTS (Responses, Y variables): compressive strength, modulus of elasticity, modulus of rupture, Poisson's ratio

Microwave Popcorn Making Process
INPUTS (Factors, X variables): Brand: Cheap vs Costly; Time: 4 min vs 6 min; Power: 75% or 100%; Height: On bottom or raised
PROCESS: Making the Best Microwave Popcorn
OUTPUTS (Responses, Y variables): Taste: Scale of 1 to 10; Bullets: Grams of unpopped corns

Three Key Principles
• Replication [DOE’s version of Sample Size]
  – Replication of an experiment
  – Allows an estimate of experimental error
  – Allows for a more precise estimate of the sample mean value
• Randomization [Run experiments in random order]
  – Cornerstone of all statistical methods
  – “Average out” effects of extraneous factors
  – Reduce bias and systematic errors
• Blocking [What can influence my experiment?]
  – Increases precision of experiment
  – “Factor out” variables not studied

Steps in a Designed Experiments Study
1. Brainstorm the problem / causes
2. Design the experiment
3. Perform the experiment
4. Analyze the data
5. Validate your results

Determining Input Factors to Optimize Output
FACTORIAL EXPERIMENT: IMPROVEMENT OF THE MEAN

Case Study
• You have been assigned to look into a problem with your company’s lamination process. Customers have been complaining about separation in your wood laminate, and you are leading a team to determine how to reduce the separation.

Step 1. Brainstorm
• You met with your team, and through your investigation, you determine that the best approach would be to minimize the amount of CURL.
[Figure: Laminate Curl]

Step 1. Brainstorm
• After brainstorming and list reduction, your team decided to look at 3 factors that they felt influenced the amount of curl:
  A  Top Roll Tension, currently set at 22
  B  Bottom Roll Tension, currently set at 22
  C  Rewind Tension, currently set at 9

Step 2. Design the Experiment
The team decides to study 2 levels for each of the factors, and determine the effect on the response variable, curl:

Factor                 Low (-) Setting   High (+) Setting
Top Roll Tension       16                28
Bottom Roll Tension    16                28
Rewind Tension         6                 12

Note: Team will make sure that their selected values are FEASIBLE.

Step 2. Design the Experiment
• This experiment will have 2 levels (high and low settings) and 3 factors (top roll tension, bottom roll tension, and rewind tension)
• Number of Experiments = $2^3 = 8$

Step 2. Design the Experiment

Run   Top Roll Tension   Bottom Roll Tension   Rewind Tension
1     -                  -                     -
2     +                  -                     -
3     -                  +                     -
4     +                  +                     -
5     -                  -                     +
6     +                  -                     +
7     -                  +                     +
8     +                  +                     +

Sign pattern: Top Roll Tension alternates “-, +”; Bottom Roll Tension alternates “- -, + +”; Rewind Tension alternates “- - - -, + + + +”.
Note: If another variable is added, there would be $2^4 = 16$ runs. The next design generation would be “- - - - - - - -, + + + + + + + +”.

Step 3. Perform the Experiment

Run   Top Roll Tension   Bottom Roll Tension   Rewind Tension   Replication 1   Replication 2   Average   Std. Deviation
1     16                 16                    6                87              88              87.5      0.707
2     28                 16                    6                76              78              77.0      1.414
3     16                 28                    6                90              92              91.0      1.414
4     28                 28                    6                83              80              81.5      2.121
5     16                 16                    12               101             96              98.5      3.535
6     28                 16                    12               92              91              91.5      0.707
7     16                 28                    12               100             104             102.0     2.828
8     28                 28                    12               92              91              91.5      0.707

Perform the runs in random order to ensure statistical validity.

Step 4. Analyze the Data
• We are going to calculate the MAIN EFFECTS for the 3 factors.
• We are also going to calculate the INTERACTION EFFECTS between each of the 2 factor combinations and the three factor combination.
• Let’s start with the main effects…

Step 4. Analyze the Data
Visualization of Main Effects
[Figure: cube plot of the average curl at the eight factor-level combinations (87.5, 77.0, 91.0, 81.5, 98.5, 91.5, 102.0, 91.5); axes: Top Roll Tension, Bottom Roll Tension, Rewind Tension]
Note: lowest curl achieved when Top Roll Tension is set HIGH and Bottom Roll and Rewind Tensions are set LOW.

Step 4. Analyze the Data
Main Effect Calculations
• Average the “High” settings and subtract the average of the “Low” settings:

\[ \text{Main Effect of Top Roll Tension} = \frac{77.0 + 81.5 + 91.5 + 91.5}{4} - \frac{87.5 + 91.0 + 98.5 + 102.0}{4} = -9.38 \]

\[ \text{Main Effect of Bottom Roll Tension} = \frac{91.0 + 81.5 + 102.0 + 91.5}{4} - \frac{87.5 + 77.0 + 98.5 + 91.5}{4} = 2.88 \]

\[ \text{Main Effect of Rewind Tension} = \frac{98.5 + 91.5 + 102.0 + 91.5}{4} - \frac{87.5 + 77.0 + 91.0 + 81.5}{4} = 11.63 \]

Step 4. Analyze the Data
Interaction Effects

Run   Top Roll Tension (A)   Bottom Roll Tension (B)   Rewind Tension (C)   AB   AC   BC   ABC   Avg Curl
1     -                      -                         -                    +    +    +    -     87.5
2     +                      -                         -                    -    -    +    +     77.0
3     -                      +                         -                    -    +    -    +     91.0
4     +                      +                         -                    +    -    -    -     81.5
5     -                      -                         +                    +    -    -    +     98.5
6     +                      -                         +                    -    +    -    -     91.5
7     -                      +                         +                    -    -    +    -     102.0
8     +                      +                         +                    +    +    +    +     91.5

Simply multiply the signs of the columns, i.e. “+ times + equals +”, “- times - equals +”, “+ times - equals -” and “- times + equals -”.

Step 4. Analyze the Data
Interaction Effect Calculations
• Average the “High” settings and subtract the average of the “Low” settings:

\[ \text{Interaction AB} = \frac{87.5 + 81.5 + 98.5 + 91.5}{4} - \frac{77.0 + 91.0 + 91.5 + 102.0}{4} = -0.63 \]

\[ \text{Interaction AC} = \frac{87.5 + 91.0 + 91.5 + 91.5}{4} - \frac{77.0 + 81.5 + 98.5 + 102.0}{4} = 0.63 \]

\[ \text{Interaction BC} = \frac{87.5 + 77.0 + 102.0 + 91.5}{4} - \frac{91.0 + 81.5 + 98.5 + 91.5}{4} = -1.13 \]

\[ \text{Interaction ABC} = \frac{77.0 + 91.0 + 98.5 + 91.5}{4} - \frac{87.5 + 81.5 + 91.5 + 102.0}{4} = -1.13 \]

Step 4. Analyze the Data
Pareto of Effect Size
[Figure: Pareto chart of effect size, largest to smallest: Rewind Tension (C), Top Roll Tension (A), Bottom Roll Tension (B), BC Interaction, 3-Way Interaction, AB Interaction, AC Interaction]
The Pareto shows the absolute value of the effects so that their sizes can be compared.

Step 4. Analyze the Data
Visualization of Interaction Effects
[Figure: Interaction, Top and Bottom Roll Tension. Curl vs Top Roll Tension (Low to High); lines for Bottom Roll Low (slope = -8.75) and Bottom Roll High (slope = -10.0)]
Note: parallel lines indicate LACK OF INTERACTION

Step 4. Analyze the Data
Visualization of Interaction Effects
[Figure: Interaction, Top Roll and Rewind Tension. Curl vs Top Roll Tension (Low to High); lines for Rewind Low (slope = -10.0) and Rewind High (slope = -8.75)]
Note: parallel lines indicate LACK OF INTERACTION

Step 4. Analyze the Data
Visualization of Interaction Effects
[Figure: Interaction, Bottom Roll and Rewind Tension. Curl vs Bottom Roll Tension (Low to High); lines for Rewind Low (slope = 4.0) and Rewind High (slope = 1.75)]
Note: parallel lines indicate LACK OF INTERACTION

Step 4. Analyze the Data
Test of Significance
• We can test for the significance of the effects using the t-distribution and confidence intervals.
• The first step is to calculate the POOLED STANDARD DEVIATION for all of the observations…
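The main effect and interaction effect calculations in Step 4 can be reproduced with a short script. This is only a sketch: the helper names `full_factorial` and `effect` are made up for illustration, and the only data assumed are the eight average curl values from the Step 3 table, listed in run order 1 to 8.

```python
from itertools import product
from math import prod

# Average curl for runs 1-8 in standard order (Step 3 table).
avg_curl = [87.5, 77.0, 91.0, 81.5, 98.5, 91.5, 102.0, 91.5]


def full_factorial(k):
    """Return the 2**k runs in standard order as tuples of -1/+1.

    The first factor alternates fastest (-, +, -, +, ...), matching the
    sign table in Step 2.
    """
    # product() varies the last position fastest, so reverse each tuple.
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]


def effect(design, response, columns):
    """Average response at '+' minus average response at '-'.

    `columns` holds factor indices; several indices give an interaction,
    whose sign column is the product of the factor signs.
    """
    signs = [prod(run[c] for c in columns) for run in design]
    high = [y for s, y in zip(signs, response) if s > 0]
    low = [y for s, y in zip(signs, response) if s < 0]
    return sum(high) / len(high) - sum(low) / len(low)


design = full_factorial(3)
columns = {
    "A": (0,), "B": (1,), "C": (2,),
    "AB": (0, 1), "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2),
}
effects = {name: effect(design, avg_curl, cols) for name, cols in columns.items()}

# Print in Pareto order (largest absolute effect first).
for name, value in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>3}: {value:+.3f}")
```

Sorting by absolute value mirrors the Pareto of effect size; the signs show the direction in which each factor moves the curl.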
Step 4. Analyze the Data
Pooled Standard Deviation
We are studying 7 factors/interactions (A, B, C, AB, AC, BC, ABC) in 8 runs. We did 2 replications of each run.

\[ S_p = \sqrt{\frac{n_1 S_1^2 + n_2 S_2^2 + \cdots + n_v S_v^2}{n_1 + n_2 + \cdots + n_v}} \quad \text{for } v \text{ runs} \]

Note: if the number of replicates is the same for all runs, we can simply calculate the pooled standard deviation as the square root of the average of the variances.

Step 4. Analyze the Data
Pooled Standard Deviation

Run   Replication 1   Replication 2   Average   Std. Deviation   Variance
1     87              88              87.5      0.707            0.4998
2     76              78              77.0      1.414            1.9994
3     90              92              91.0      1.414            1.9994
4     83              80              81.5      2.121            4.4986
5     101             96              98.5      3.535            12.4962
6     92              91              91.5      0.707            0.4998
7     100             104             102.0     2.828            7.9976
8     92              91              91.5      0.707            0.4998

Simplified Calculation: Avg Variance = 3.8113, Std Dev = 1.952

Step 4. Analyze the Data
t Statistic for Significance
• Key information for calculation:
  – α = risk (Confidence = 1 - α) [5%, 95%]
  – p = number of “+” per effect column [4]
  – r = number of replicates [2]
  – Sp = Pooled Standard Deviation [1.952]
  – f = degree of fractionalization (in our case 0, since we are doing a full factorial)
  – k = number of factors [3]
  – Degrees of Freedom = $(r-1)\,2^{k-f}$ [8]

Step 4. Analyze the Data
t Statistic for Significance
The t statistic for 95% confidence (5% risk), 2 tailed test, 8 degrees of freedom is 2.306.
The Error is calculated as:

\[ \text{Error} = t_{.025,8}\; S_p \sqrt{\frac{2}{p\,r}} = 2.306 \times 1.952 \times \sqrt{\frac{2}{(4)(2)}} = 2.25 \]

The confidence interval for each of the effects is:
  – Effect +/- Error
[Figure: t table for 2 tailed test]

Step 4. Analyze the Data
t Statistic for Significance
• Since the Error = 2.25, any effect that falls within the limits 0 +/- 2.25 is considered NOT STATISTICALLY SIGNIFICANT.

Factor                     Effect    Result
Rewind Tension (C)         11.63     Statistically Significant
Top Roll Tension (A)       -9.38     Statistically Significant
Bottom Roll Tension (B)    2.88      Statistically Significant
BC Interaction             -1.13     Not Statistically Significant
ABC Interaction            -1.13     Not Statistically Significant
AB Interaction             -0.63     Not Statistically Significant
AC Interaction             0.63      Not Statistically Significant

Step 4. Analyze the Data
Conclusion
• Based on this outcome, I will conclude that Top Roll Tension, Bottom Roll Tension, and Rewind Tension are significant at the 95% confidence level.
• I will conclude that I should set:
  – Top Roll Tension HIGH (28)
  – Bottom Roll Tension LOW (16)
  – Rewind Tension LOW (6)

Step 4. Analyze the Data
Conclusion

Run   Top Roll Tension   Bottom Roll Tension   Rewind Tension   Replication 1   Replication 2   Average   Std. Deviation
1     16                 16                    6                87              88              87.5      0.707
2     28                 16                    6                76              78              77.0      1.414
3     16                 28                    6                90              92              91.0      1.414
4     28                 28                    6                83              80              81.5      2.121
5     16                 16                    12               101             96              98.5      3.535
6     28                 16                    12               92              91              91.5      0.707
7     16                 28                    12               100             104             102.0     2.828
8     28                 28                    12               92              91              91.5      0.707

Grand Average = 90.0625

Analysis of Data - Response Model
• The factor effects can be used to establish a model to predict responses.

\[ \text{Expected Response} = \text{Grand Average} + \frac{\text{Effect}_A}{2}A + \frac{\text{Effect}_B}{2}B + \frac{\text{Effect}_C}{2}C + \frac{\text{Effect}_{AB}}{2}AB + \frac{\text{Effect}_{AC}}{2}AC + \frac{\text{Effect}_{BC}}{2}BC + \frac{\text{Effect}_{ABC}}{2}ABC \]

where A, B and C are the coded settings of factors A, B and C (-1 for low, +1 for high), and AB, AC, BC and ABC are the products of those settings.

Calculating Expected Response

Factor                     Effect    Effect/2   Setting        Value
Grand Average                                                  90.0625
Top Roll Tension (A)       -9.38     -4.69      1              -4.69
Bottom Roll Tension (B)    2.88      1.44       -1             -1.44
Rewind Tension (C)         11.63     5.815      -1             -5.815
AB Interaction             -0.63     -0.315     (1)(-1)        0.315
AC Interaction             0.63      0.315      (1)(-1)        -0.315
BC Interaction             -1.13     -0.565     (-1)(-1)       -0.565
ABC Interaction            -1.13     -0.565     (1)(-1)(-1)    -0.565

So, if we set A at high (+1), B at low (-1) and C at low (-1), we would predict:
90.0625 - 4.69 - 1.44 - 5.815 + 0.315 - 0.315 - 0.565 - 0.565 = 76.99

Step 5. Validation of Model
• Run the process with the new settings for a trial period to collect data on the curl response.
• Compare the new data to the historical data to confirm improvement.
• A simple Test of Hypothesis of before and after data with a t test can be used.

Step 5. Validation of Model
• We are trying to prove that the curl with the new process settings is significantly less than the curl with the old process settings at a 95% confidence level. Our historical standard deviation has been 3.5.
  – Ho: µold ≤ µnew
  – Ha: µold > µnew

Commonly Used Z-Values

1-α (or 1-β)   α (or β)   Z Value
.995           .005       2.575
.990           .010       2.326
.975           .025       1.960
.950           .050       1.645
.900           .100       1.282
.800           .200       0.842

Step 5. Validation of Model
• Sample Size Required:
  – α = 5%
  – β = 10%
  – σ = 3.4
  – Change to detect, δ = 2

\[ n = \frac{(Z_\alpha + Z_\beta)^2\, \sigma^2}{\delta^2} = \frac{(1.645 + 1.282)^2\,(3.4)^2}{2^2} = 24.75 \approx 25 \]

Step 5. Validation of Model
• Suppose our historical data (25 data points) is as follows for curl:
  – 100, 95, 98, 102, 97, 90, 91, 98, 101, 95
  – 94, 105, 96, 104, 100, 96, 98, 96, 91, 99
  – 100, 102, 98, 95, 96
• Average = 97.48
• Sample Standard Deviation = 3.8419

Step 5. Validation of Model
• Now, we run our new settings and collect 25 additional data points:
  – 86, 77, 85, 81, 81, 83, 78, 79, 81, 80
  – 81, 78, 76, 75, 79, 83, 85, 81, 78, 78
  – 85, 81, 79, 83, 72
• Average = 80.20
• Sample Standard Deviation = 3.391

t test – Assuming Equal Variances in Excel
[Worksheet columns “Old” and “New” hold the 25 historical and 25 new curl values listed above.]

t-Test: Two-Sample Assuming Equal Variances

                               Old            New
Mean                           97.48          80.2
Variance                       14.76          11.5
Observations                   25             25
Pooled Variance                13.13
Hypothesized Mean Difference   0
df                             48
t Stat                         16.86034207
P(T<=t) one-tail               4.21133E-22
t Critical one-tail            1.677224197
P(T<=t) two-tail               8.42266E-22
t Critical two-tail            2.010634722

t critical (48 degrees of freedom) = 1.677
Calculated t value = 16.86
We can conclude with more than 95% confidence that the new parameter settings have significantly reduced the amount of curl in our process.
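The Step 5 arithmetic (required sample size, then the equal-variance two-sample t test) can be checked outside Excel. The sketch below uses only the Python standard library. The function names `sample_size` and `pooled_t_test` are illustrative, and the critical value 1.677 is copied from the Excel output above rather than computed.

```python
from math import ceil, sqrt
from statistics import mean, variance


def sample_size(z_alpha, z_beta, sigma, delta):
    """n = (Z_alpha + Z_beta)^2 * sigma^2 / delta^2, rounded up to a whole sample."""
    return ceil((z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


def pooled_t_test(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances.

    Returns (t statistic, pooled variance, degrees of freedom).
    """
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, pooled_var, df


# Historical (old settings) and trial (new settings) curl data from the slides.
old = [100, 95, 98, 102, 97, 90, 91, 98, 101, 95,
       94, 105, 96, 104, 100, 96, 98, 96, 91, 99,
       100, 102, 98, 95, 96]
new = [86, 77, 85, 81, 81, 83, 78, 79, 81, 80,
       81, 78, 76, 75, 79, 83, 85, 81, 78, 78,
       85, 81, 79, 83, 72]

# alpha = 5% (Z = 1.645), beta = 10% (Z = 1.282), sigma = 3.4, change to detect = 2.
n_required = sample_size(1.645, 1.282, 3.4, 2)

t_stat, pooled_var, df = pooled_t_test(old, new)
T_CRITICAL_ONE_TAIL = 1.677  # taken from the Excel output for df = 48, alpha = 0.05

print(f"required n = {n_required}")
print(f"pooled variance = {pooled_var:.2f}, df = {df}, t = {t_stat:.2f}")
print("reject Ho" if t_stat > T_CRITICAL_ONE_TAIL else "fail to reject Ho")
```

The test is one-tailed because the alternative hypothesis is that the old mean exceeds the new mean, so only a large positive t counts as evidence of improvement.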
Determining Input Factors to Reduce Variation
FACTORIAL EXPERIMENT: REDUCTION OF VARIATION

Variation Reduction
• Standard deviations are not normally distributed, and therefore cannot be used directly as a response variable.
• Options include:
  – Use the natural or base 10 log to obtain normality
  – Use –log10(s) for normality
  – Use the F Statistic on the average variances for the high and low settings

Data – Curl Reduction

Run   Top Roll Tension (A)   Bottom Roll Tension (B)   Rewind Tension (C)   AB   AC   BC   ABC   Avg Curl   Std Dev   Variance
1     -                      -                         -                    +    +    +    -     87.5       0.707     .004998
2     +                      -                         -                    -    -    +    +     77.0       1.414     1.9994
3     -                      +                         -                    -    +    -    +     91.0       1.414     1.9994
4     +                      +                         -                    +    -    -    -     81.5       2.121     4.4986
5     -                      -                         +                    +    -    -    +     98.5       3.535     12.4962
6     +                      -                         +                    -    +    -    -     91.5       0.707     .004998
7     -                      +                         +                    -    -    +    -     102.0      2.828     7.9976
8     +                      +                         +                    +    +    +    +     91.5       0.707     .004998

Calculate an F Value for each Effect
For Top Roll Tension

\[ S^2_{+} = \frac{1.994 + 4.4986 + .004998 + .004998}{4} = 1.6256 \qquad S^2_{-} = \frac{.004998 + 1.9994 + 12.4962 + 7.9976}{4} = 5.6245 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{5.6245}{1.6256} = 3.460 \]

\[ \text{Degrees of Freedom} = \frac{(r-1)\,2^{k-f}}{2} = \frac{1 \times 2^{3-0}}{2} = 4 \]

Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
r = # replicates (2), k = # factors (3), f = fractionalization (0)
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???
[Figure: F Table for α = .025]

Calculate an F Value for each Effect
For Bottom Roll Tension

\[ S^2_{+} = \frac{1.9994 + 4.4986 + 7.9976 + .004998}{4} = 3.625 \qquad S^2_{-} = \frac{.004998 + 1.9994 + 12.4962 + .004998}{4} = 3.626 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{3.626}{3.625} = 1.000 \]

Degrees of Freedom = 4; Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???
Calculate an F Value for each Effect
For Rewind Tension

\[ S^2_{+} = \frac{12.4962 + .004998 + 7.9976 + .004998}{4} = 5.126 \qquad S^2_{-} = \frac{.004998 + 1.9994 + 1.9994 + 4.4986}{4} = 2.1256 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{5.126}{2.1256} = 2.412 \]

Degrees of Freedom = 4; Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???

Calculate an F Value for each Effect
For Interaction of Top Roll and Bottom Roll Tension

\[ S^2_{+} = \frac{.004998 + 4.4986 + 12.4962 + .004998}{4} = 4.2512 \qquad S^2_{-} = \frac{1.9994 + 1.9994 + .004998 + 7.9976}{4} = 3.0003 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{4.2512}{3.0003} = 1.4169 \]

Degrees of Freedom = 4; Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???

Calculate an F Value for each Effect
For Interaction of Top Roll and Rewind Tension

\[ S^2_{+} = \frac{.004998 + 1.9994 + .004998 + .004998}{4} = 0.5036 \qquad S^2_{-} = \frac{1.9994 + 4.4986 + 12.4962 + 7.9976}{4} = 6.74795 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{6.74795}{0.5036} = 13.3994 \]

Degrees of Freedom = 4; Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???

Calculate an F Value for each Effect
For Interaction of Bottom Roll and Rewind Tension

\[ S^2_{+} = \frac{.004998 + 1.9994 + 7.9976 + .004998}{4} = 2.5017 \qquad S^2_{-} = \frac{1.9994 + 4.4986 + 12.4962 + .004998}{4} = 4.7498 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{4.7498}{2.5017} = 1.8986 \]

Degrees of Freedom = 4; Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???
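The F value slides all follow one recipe: average the run variances where the effect column is “+”, average them where it is “-”, and divide the larger average by the smaller. A minimal sketch of that recipe follows; the function name `f_ratios` is hypothetical, and the variance list is passed as a parameter. With the variances as printed in the Data – Curl Reduction table, the AC ratio comes out near the slide's 13.40. Note that squaring the listed standard deviations instead (0.707² ≈ 0.4998, the value shown in the pooled standard deviation table) gives different ratios, so it may be worth rechecking which variance values are intended before relying on the comparison with 9.60.

```python
from itertools import product
from math import prod

F_CRITICAL = 9.60  # F(.025, 4, 4), as quoted on the slides

EFFECT_COLUMNS = {
    "A": (0,), "B": (1,), "C": (2,),
    "AB": (0, 1), "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2),
}


def f_ratios(variances):
    """Larger/smaller ratio of the average '+' and '-' variances for each effect.

    `variances` must hold the eight run variances in standard order (runs 1-8).
    """
    # 2^3 design in standard order; factor A alternates fastest.
    design = [tuple(reversed(run)) for run in product((-1, 1), repeat=3)]
    ratios = {}
    for name, cols in EFFECT_COLUMNS.items():
        signs = [prod(run[c] for c in cols) for run in design]
        plus = [v for s, v in zip(signs, variances) if s > 0]
        minus = [v for s, v in zip(signs, variances) if s < 0]
        s2_plus = sum(plus) / len(plus)
        s2_minus = sum(minus) / len(minus)
        ratios[name] = max(s2_plus, s2_minus) / min(s2_plus, s2_minus)
    return ratios


# Variances exactly as printed in the Data - Curl Reduction table.
as_printed = [0.004998, 1.9994, 1.9994, 4.4986, 12.4962, 0.004998, 7.9976, 0.004998]

# Alternative assumption: variances recomputed as the square of each std dev.
std_devs = [0.707, 1.414, 1.414, 2.121, 3.535, 0.707, 2.828, 0.707]
from_std_devs = [s ** 2 for s in std_devs]

for label, data in (("as printed", as_printed), ("std dev squared", from_std_devs)):
    print(label)
    for name, f_value in f_ratios(data).items():
        flag = "exceeds" if f_value > F_CRITICAL else "below"
        print(f"  {name:>3}: F = {f_value:7.3f} ({flag} {F_CRITICAL})")
```

Keeping the variances as an argument makes it easy to rerun the comparison with whichever set of values is confirmed.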
Calculate an F Value for each Effect
For Interaction of All Factors

\[ S^2_{+} = \frac{1.9994 + 1.9994 + 12.4962 + .004998}{4} = 4.1250 \qquad S^2_{-} = \frac{.004998 + 4.4986 + .004998 + 7.9976}{4} = 3.1265 \]

\[ F = \frac{S^2_{\text{Larger}}}{S^2_{\text{Smaller}}} = \frac{4.1250}{3.1265} = 1.31937 \]

Degrees of Freedom = 4; Confidence 95%, Risk 5%: $F_{.025,4,4} = 9.60$
Any effect with an F value greater than 9.60 is significant at the 95% level.
Significant for reduced variation???

Let’s Summarize…

Factor                     F Value
Top Roll Tension (A)       3.460
Bottom Roll Tension (B)    1.0000
Rewind Tension (C)         2.412
AB Interaction             1.4169
AC Interaction             13.3994
BC Interaction             1.8986
ABC Interaction            1.31937

Note: the AC (Top Roll-Rewind Tension) interaction is the only factor or interaction that exceeds the F Critical Value of 9.60.

Conclusion
• Variation is minimized when Top Roll Tension and Rewind Tension are set to the same levels (i.e. High-High or Low-Low).
• Our optimal settings to minimize curl were Top Roll High, Bottom Roll Low, and Rewind Low. Top Roll and Rewind influenced the curl in OPPOSITE DIRECTIONS! How can we decide?

Case Studies
Instructions: Work in teams to read the case study and then develop an experimental design. You should include:
• What factors to study
• How many and what levels to study
• How many replicates to take
• How many runs?
• Design your experiment.

Case Study 1
• You have been working late recently, and subsisting on microwave popcorn. As a result, you have decided to find the formula for the best popcorn. You are down to two brands, A and B. You also find that the time varies between 4 and 6 minutes and power between 75% and 100%.
• You judge the quality of the popcorn by taste and the number of unpopped kernels.

Case 1 Design
• What factors to study
• How many and what levels to study
• How many replicates to take
• How many runs?
• Design your experiment

Case Study 2
• Your company has a high-pressure chemical reactor which filters impurities from your product. You have been asked to determine the proper settings to maximize the filtration rate. After a brainstorming session, your team decides that temperature, pressure, chemical concentration % and stir rate all influence the filtration rate (gallons per hour).
• Looking over the historical records, you see temperature has varied from 24°C to 35°C. Pressure can range from 10 PSIG to 15 PSIG, and chemical concentration from 2% to 4%. The lowest stir rate has been 15 RPM and the highest 30 RPM.
• Design an experiment to determine what the optimal levels would be to maximize the filtration rate.

Case 2 Design
• What factors to study
• How many and what levels to study
• How many replicates to take
• How many runs?
• Design your experiment

Case Study 3
• A manufacturer of ice cream has hired you to assist them with achieving their target fill rate of 2.50 pounds. After discussing the relevant factors with them, it is concluded that there are two main variables impacting the final weight: fill temperature and the overfill %. Overfill is the percentage of air that is incorporated into the ice cream. The fill temperature is sensitive, and they have machines that vary from 20°F to 25°F. Overfill percentage is also tightly controlled. Studies have shown that overfill percentages greater than 120% are perceived as “lower quality” by customers and percentages less than 90% cause the ice cream to be difficult to scoop.

Case 3 Design
• What factors to study
• How many and what levels to study
• How many replicates to take
• How many runs?
• Design your experiment

Points to Remember
1. Get a clear understanding of the problem you intend to solve.
2. Conduct an exhaustive and detailed brainstorming session.
3. Teamwork: involve people from all aspects of the process being studied.
4. Randomize the experiment trial order.
5. Replicate to understand and estimate variation.
6. Perform confirmatory runs and experiments to test your model’s validity.

Your Assignment
• Find a problem and run a Designed Experiment using the methods we learned today within 1-2 weeks of completing this session.
• No problem? Optimize microwave popcorn or chocolate chip cookies!

Questions?