Chapter 6 The 2^k Factorial Design

6.1 Introduction
• The 2^k design is a special case of the general factorial design (Chapter 5)
• k factors, each with only two levels
• Levels:
  – quantitative (temperature, pressure, …) or qualitative (machine, operator, …)
  – High and low
  – Each replicate has 2 × 2 × ⋯ × 2 = 2^k observations
• Assumptions: (1) the factors are fixed, (2) the design is completely randomized, and (3) the usual normality assumptions are satisfied
• Widely used in factor screening experiments

6.2 The 2^2 Factorial Design
• Two factors, A and B, each with two levels, low and high
• Example: the concentration of reactant vs. the amount of catalyst (Page 219)
• “-” and “+” denote the low and high levels of a factor, respectively
• Low and high are arbitrary terms
• Geometrically, the four runs form the corners of a square
• Factors can be quantitative or qualitative, although their treatment in the final model will be different
• Average effect of a factor = the change in response produced by a change in the level of that factor, averaged over the levels of the other factors
• (1), a, b and ab: the totals of the n replicates taken at each treatment combination
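The average-effect definition above can be sketched numerically. This is a minimal illustration, not the textbook's procedure verbatim: the treatment totals and n below are assumed values chosen for illustration (they are not listed in the slides), and the function name `main_effects` is hypothetical.

```python
# ASSUMPTION: illustrative treatment totals over n = 3 replicates;
# these numbers are not given in the slides.
n = 3
totals = {"(1)": 80.0, "a": 100.0, "b": 60.0, "ab": 90.0}


def main_effects(totals, n):
    """Return the average effects (A, B) of a 2^2 design from treatment totals."""
    one, a, b, ab = (totals[key] for key in ("(1)", "a", "b", "ab"))
    # Change in mean response when A goes low -> high, at each level of B.
    a_at_low_b = (a - one) / n
    a_at_high_b = (ab - b) / n
    # Change in mean response when B goes low -> high, at each level of A.
    b_at_low_a = (b - one) / n
    b_at_high_a = (ab - a) / n
    # Average each change over the two levels of the other factor.
    effect_a = (a_at_low_b + a_at_high_b) / 2
    effect_b = (b_at_low_a + b_at_high_a) / 2
    return effect_a, effect_b


effect_a, effect_b = main_effects(totals, n)
print(f"A = {effect_a:.2f}, B = {effect_b:.2f}")
```

Averaging the two simple effects is algebraically the same as dividing the contrast ab + a - b - (1) by 2n, which is why the closed-form expressions use the factor 1/(2n).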
• The main effects:

$$A = \frac{1}{2n}\{[ab - b] + [a - (1)]\} = \frac{1}{2n}[ab + a - b - (1)]$$
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n}$$
$$B = \frac{1}{2n}\{[ab - a] + [b - (1)]\} = \frac{1}{2n}[ab + b - a - (1)]$$
$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n}$$

• The interaction effect:

$$AB = \frac{1}{2n}\{[ab - b] - [a - (1)]\} = \frac{1}{2n}[ab + (1) - a - b] = \frac{ab + (1)}{2n} - \frac{a + b}{2n}$$

• In that example, A = 8.33, B = -5.00 and AB = 1.67

• Analysis of Variance
• The total effect of each factor is measured by its contrast:

$$\text{Contrast}_A = ab + a - b - (1)$$
$$\text{Contrast}_B = ab + b - a - (1)$$
$$\text{Contrast}_{AB} = ab + (1) - a - b$$

• Sum of squares:

$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}, \quad SS_B = \frac{[ab + b - a - (1)]^2}{4n}, \quad SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$
$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{4n}$$
$$SS_E = SS_T - SS_A - SS_B - SS_{AB}$$

Response: Conversion
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model             291.67        3       97.22      24.82     0.0002
A                 208.33        1      208.33      53.19   < 0.0001
B                  75.00        1       75.00      19.15     0.0024
AB                  8.33        1        8.33       2.13     0.1828
Pure Error         31.33        8        3.92
Cor Total         323.00       11

Std. Dev.    1.98     R-Squared         0.9030
Mean        27.50     Adj R-Squared     0.8666
C.V.         7.20     Pred R-Squared    0.7817
PRESS       70.50     Adeq Precision   11.669

The F-test for the “model” source tests the significance of the overall model; that is, is A, B, AB, or some combination of these effects important?

• Table of plus and minus signs:

Treatment combination   I   A   B   AB
(1)                     +   -   -   +
a                       +   +   -   -
b                       +   -   +   -
ab                      +   +   +   +

• The regression model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

  – x1 and x2 are coded variables that represent the two factors; each takes only the values -1 and +1
  – Use the method of least squares to estimate the coefficients
  – For that example,

$$\hat{y} = 27.5 + \left(\frac{8.33}{2}\right) x_1 + \left(\frac{-5.00}{2}\right) x_2$$

  – Model adequacy: residuals (Pages 224~225) and normal probability plot (Figure 6.2)

• Response surface plot (Figure 6.3):

$$\hat{y} = 18.33 + 0.8333\,\text{Conc} - 5.00\,\text{Catalyst}$$

6.3 The 2^3 Design
• Three factors, A, B and C, each with two levels
(Figure 6.4(a))
• Design matrix (Figure 6.4(b))
• Treatment combinations: (1), a, b, ab, c, ac, bc, abc
• Seven degrees of freedom among the eight treatment combinations: one for each main effect and one for each interaction

• Estimate a main effect:

$$A = \frac{1}{4n}[a - (1) + ab - b + ac - c + abc - bc]$$
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} = \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc]$$

• Estimate a two-factor interaction: one-half the difference between the average A effects at the two levels of B

$$AB = \frac{1}{4n}[abc - bc + ab - b - ac + c - a + (1)] = \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n}$$

• Three-factor interaction:

$$ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\} = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)]$$

• Contrast: Table 6.3
  – Every column except I has an equal number of plus and minus signs
  – The inner product of any two columns = 0
  – I is an identity element
  – The product of any two columns yields another column
  – Orthogonal design
• Sum of squares: $SS = (\text{Contrast})^2 / 8n$

Table of - and + Signs for the 2^3 Factorial Design (pg. 231)

Treatment       Factorial Effect
Combination     I     A      B      AB     C      AC     BC     ABC
(1)             +     -      -      +      -      +      +      -
a               +     +      -      -      -      -      +      +
b               +     -      +      -      -      +      -      +
ab              +     +      +      +      -      -      -      -
c               +     -      -      +      +      -      -      +
ac              +     +      -      -      +      +      -      -
bc              +     -      +      -      +      -      +      -
abc             +     +      +      +      +      +      +      +
Contrast              24     18     6      14     2      4      4
Effect                3.00   2.25   0.75   1.75   0.25   0.50   0.50

• Example 6.1: A = carbonation, B = pressure, C = speed, y = fill deviation

• Estimation of Factor Effects

         Term        Effect   SumSqr   % Contribution
Model    Intercept
Error    A           3        36       46.1538
Error    B           2.25     20.25    25.9615
Error    C           1.75     12.25    15.7051
Error    AB          0.75     2.25     2.88462
Error    AC          0.25     0.25     0.320513
Error    BC          0.5      1        1.28205
Error    ABC         0.5      1        1.28205
Error    LOF                  0
Error    P Error              5        6.41026

Lenth's ME 1.25382, Lenth's SME 1.88156

• ANOVA summary, full model

Response: Fill-deviation
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model              73.00        7       10.43      16.69     0.0003
A                  36.00        1       36.00      57.60   < 0.0001
B                  20.25        1       20.25      32.40     0.0005
C                  12.25        1       12.25      19.60     0.0022
AB                  2.25        1        2.25       3.60     0.0943
AC                  0.25        1        0.25       0.40     0.5447
BC                  1.00        1        1.00       1.60     0.2415
ABC                 1.00        1        1.00       1.60     0.2415
Pure Error          5.00        8        0.63
Cor Total          78.00       15

Std. Dev.    0.79     R-Squared         0.9359
Mean         1.00     Adj R-Squared     0.8798
C.V.        79.06     Pred R-Squared    0.7436
PRESS       20.00     Adeq Precision   13.416

• The regression model and response surface:
  – The regression model:

$$\hat{y} = 1.00 + \left(\frac{3.00}{2}\right) x_1 + \left(\frac{2.25}{2}\right) x_2 + \left(\frac{1.75}{2}\right) x_3 + \left(\frac{0.75}{2}\right) x_1 x_2$$

  – Response surface and contour plot (Figure 6.7)

Factor           Coefficient Estimate   DF   Standard Error   95% CI Low   95% CI High
Intercept               1.00             1        0.20           0.55          1.45
A-Carbonation           1.50             1        0.20           1.05          1.95
B-Pressure              1.13             1        0.20           0.68          1.57
C-Speed                 0.88             1        0.20           0.43          1.32
AB                      0.38             1        0.20          -0.072         0.82

• Contour and response surface plots, speed at the high level
[Figure: DESIGN-EXPERT contour plot and response surface plot of Fill-deviation; X = A: Carbonation (10.00 to 12.00), Y = B: Pressure (25.00 to 30.00); actual factor C: Speed = 250.00]

• Refine the model: remove nonsignificant factors

Response: Fill-deviation
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model              70.75        4       17.69      26.84   < 0.0001
A                  36.00        1       36.00      54.62   < 0.0001
B                  20.25        1       20.25      30.72     0.0002
C                  12.25        1       12.25      18.59     0.0012
AB                  2.25        1        2.25       3.41     0.0917
Residual            7.25       11        0.66
  LOF               2.25        3        0.75       1.20     0.3700
  Pure E            5.00        8        0.63
C Total            78.00       15

Std. Dev.    0.81     R-Squared         0.9071
Mean         1.00     Adj R-Squared     0.8733
C.V.        81.18     Pred R-Squared    0.8033
PRESS       15.34     Adeq Precision   15.424

6.4 The General 2^k Design
• k factors, each with two levels
• Interactions:
  – $\binom{k}{2}$ two-factor interactions
  – $\binom{k}{3}$ three-factor interactions
  – …
  – one k-factor interaction
• The standard order for a 2^4 design: (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd
• The general approach for the statistical analysis:
  – Estimate factor effects
  – Form the initial model (full model)
  – Perform the analysis of variance (Table 6.9)
  – Refine the model
  – Analyze the residuals
  – Interpret the results
• Contrasts, effects and sums of squares:

$$\text{Contrast}_{AB\cdots K} = (a \pm 1)(b \pm 1)\cdots(k \pm 1)$$
$$AB\cdots K = \frac{2}{n 2^k}\left(\text{Contrast}_{AB\cdots K}\right)$$
$$SS_{AB\cdots K} = \frac{1}{n 2^k}\left(\text{Contrast}_{AB\cdots K}\right)^2$$

  – Use the minus sign in a factor's parentheses if that factor appears in the effect and the plus sign otherwise; after expanding, replace 1 by (1)

6.5 A Single Replicate of the 2^k Design
• These are 2^k factorial designs with one observation at each corner of the “cube”
• An unreplicated 2^k factorial design is also sometimes called a “single replicate” of the 2^k
• If the factor levels are spaced too closely, the chance increases that the noise will overwhelm the signal in the data
• Lack of replication causes potential problems in statistical testing
  – Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error)
  – With no replication, fitting the full model results in zero degrees of freedom for error
• Potential solutions to this problem
  – Pooling high-order interactions to estimate error (sparsity of effects principle)
  – Normal probability plotting of effects (Daniel, 1959)

• Example 6.2 (A single replicate of the 2^4 design)
  – A 2^4 factorial was used to investigate the effects of four factors on the filtration rate of a resin
  – The factors are A = temperature, B = pressure, C = concentration of formaldehyde, D = stirring rate

• Estimates of the effects

         Term        Effect     SumSqr     % Contribution
Model    Intercept
Error    A           21.625     1870.56    32.6397
Error    B            3.125     39.0625     0.681608
Error    C            9.875     390.062     6.80626
Error    D           14.625     855.563    14.9288
Error    AB           0.125     0.0625      0.00109057
Error    AC         -18.125     1314.06    22.9293
Error    AD          16.625     1105.56    19.2911
Error    BC           2.375     22.5625     0.393696
Error    BD          -0.375     0.5625      0.00981515
Error    CD          -1.125     5.0625      0.0883363
Error    ABC          1.875     14.0625     0.245379
Error    ABD          4.125     68.0625     1.18763
Error    ACD         -1.625     10.5625     0.184307
Error    BCD         -2.625     27.5625     0.480942
Error    ABCD         1.375     7.5625      0.131959

Lenth's ME 6.74778, Lenth's SME 13.699

• The normal probability plot of the effects
[Figure: DESIGN-EXPERT normal plot of effects for Filtration Rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate); x-axis: Effect (-18.12 to 21.62), y-axis: Normal % probability; labeled points: A, AD, C, D, AC]

[Figure: DESIGN-EXPERT interaction graphs for Filtration Rate. Left: X = A: Temperature, lines for C: Concentration at C- and C+, with B: Pressure = 0.00 and D: Stirring Rate = 0.00. Right: X = A: Temperature, lines for D: Stirring Rate at D- and D+, with B: Pressure = 0.00 and C: Concentration = 0.00]

• B is not significant and all interactions involving B are negligible
• Design projection: 2^4 design => 2^3 design in A, C and D
• ANOVA table (Table 6.13)

Response: Filtration Rate
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model            5535.81        5     1107.16      56.74   < 0.0001
A                1870.56        1     1870.56      95.86   < 0.0001
C                 390.06        1      390.06      19.99     0.0012
D                 855.56        1      855.56      43.85   < 0.0001
AC               1314.06        1     1314.06      67.34   < 0.0001
AD               1105.56        1     1105.56      56.66   < 0.0001
Residual          195.12       10       19.51
Cor Total        5730.94       15

Std. Dev.     4.42     R-Squared         0.9660
Mean         70.06     Adj R-Squared     0.9489
C.V.          6.30     Pred R-Squared    0.9128
PRESS       499.52     Adeq Precision   20.841

• The regression model:

Final Equation in Terms of Coded Factors:
Filtration Rate =
  +70.06250
  +10.81250 * Temperature
   +4.93750 * Concentration
   +7.31250 * Stirring Rate
   -9.06250 * Temperature * Concentration
   +8.31250 * Temperature * Stirring Rate

• Residual analysis (P. 251)
• Response surface (P. 252)

[Figure: DESIGN-EXPERT normal plot of residuals for Filtration Rate; x-axis: Studentized Residuals (-1.83 to 1.65), y-axis: Normal % probability]

• Half-normal plot: the absolute values of the effect estimates plotted against the cumulative normal probabilities
[Figure: DESIGN-EXPERT half-normal plot for Filtration Rate; x-axis: |Effect| (0.00 to 21.63), y-axis: Half Normal % probability; labeled points: A, AC, AD, D, C]

• Example 6.3 (Data transformation in a factorial design): A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill

• The normal probability plot of the effect estimates
[Figure: DESIGN-EXPERT half-normal plot for adv._rate (A: load, B: flow, C: speed, D: mud); x-axis: |Effect| (0.00 to 6.44), y-axis: Half Normal % probability; labeled points: B, C, D, BD, BC]

• Residual analysis
[Figure: DESIGN-EXPERT plots for adv._rate. Left: normal plot of residuals, x-axis: Residual (-1.96375 to 2.58625). Right: Residuals vs. Predicted, x-axis: Predicted (1.69 to 13.71)]

• The residual plots indicate that there are problems with the equality of variance assumption
• The usual approach to this problem is to employ a transformation on the response
• In this example, $y^* = \ln y$

[Figure: DESIGN-EXPERT half-normal plot for Ln(adv._rate) (A: load, B: flow, C: speed, D: mud); x-axis: |Effect| (0.00 to 1.16), y-axis: Half Normal % probability; labeled points: B, C, D]
• Three main effects are large
• No indication of large interaction effects
• What happened to the interactions?

Response: adv._rate    Transform: Natural log    Constant: 0.000
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model               7.11        3       2.37      164.82   < 0.0001
B                   5.35        1       5.35      371.49   < 0.0001
C                   1.34        1       1.34       93.05   < 0.0001
D                   0.43        1       0.43       29.92     0.0001
Residual            0.17       12       0.014
Cor Total           7.29       15

Std. Dev.    0.12     R-Squared         0.9763
Mean         1.60     Adj R-Squared     0.9704
C.V.         7.51     Pred R-Squared    0.9579
PRESS        0.31     Adeq Precision   34.391

• Following the log transformation

Final Equation in Terms of Coded Factors:
Ln(adv._rate) =
  +1.60
  +0.58 * B
  +0.29 * C
  +0.16 * D

[Figure: DESIGN-EXPERT plots for Ln(adv._rate). Left: normal plot of residuals, x-axis: Residual (-0.166184 to 0.194177). Right: Residuals vs. Predicted, x-axis: Predicted (0.57 to 2.63)]

• Example 6.4:
  – Two factors (A and D) affect the mean number of defects
  – A third factor (B) affects variability
  – Residual plots were useful in identifying the dispersion effect
  – The magnitude of the dispersion effects:

$$F_i^* = \ln\frac{S^2(i^+)}{S^2(i^-)}$$

  – When the variances at the positive and negative levels of factor i are equal, this statistic has an approximately normal distribution

6.6 The Addition of Center Points to the 2^k Design
• Based on the idea of replicating some of the runs in a factorial design
• Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:

First-order model (with interaction):
$$y = \beta_0 + \sum_{i=1}^{k}\beta_i x_i + \sum_{i<j}\sum \beta_{ij} x_i x_j + \varepsilon$$

Second-order model:
$$y = \beta_0 + \sum_{i=1}^{k}\beta_i x_i + \sum_{i<j}\sum \beta_{ij} x_i x_j + \sum_{i=1}^{k}\beta_{ii} x_i^2 + \varepsilon$$

• Let $\bar{y}_F$ be the average of the $n_F$ factorial runs and $\bar{y}_C$ the average of the $n_C$ center runs; $\bar{y}_F = \bar{y}_C$ indicates no “curvature”
• The hypotheses are:

$$H_0: \sum_{i=1}^{k}\beta_{ii} = 0 \qquad H_1: \sum_{i=1}^{k}\beta_{ii} \neq 0$$

$$SS_{\text{Pure Quad}} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$

• This sum of squares has a single degree of freedom

• Example 6.6: $n_C = 5$
• Usually between 3 and 6 center points will work well
• Design-Expert provides the analysis, including the F-test for pure quadratic curvature

Response: yield
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model               2.83        3       0.94       21.92     0.0060
A                   2.40        1       2.40       55.87     0.0017
B                   0.42        1       0.42        9.83     0.0350
AB            2.500E-003        1   2.500E-003      0.058    0.8213
Curvature     2.722E-003        1   2.722E-003      0.063    0.8137
Pure Error          0.17        4       0.043
Cor Total           3.00        8

Std. Dev.    0.21     R-Squared         0.9427
Mean        40.44     Adj R-Squared     0.8996
C.V.         0.51     Pred R-Squared    N/A
PRESS        N/A      Adeq Precision   14.234

• If curvature is significant, augment the design with axial runs to create a central composite design (CCD). The CCD is a very effective design for fitting a second-order response surface model
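The pure quadratic curvature test above can be sketched in a few lines. This is only an illustration under stated assumptions: the two averages are assumed values (the slides do not list them), chosen to be consistent with the Curvature row of the yield ANOVA table; the pure-error mean square 0.043 is read from that table; and the function name `pure_quadratic_ss` is hypothetical.

```python
def pure_quadratic_ss(mean_factorial, mean_center, n_factorial, n_center):
    """Single-degree-of-freedom sum of squares for pure quadratic curvature."""
    diff = mean_factorial - mean_center
    return n_factorial * n_center * diff ** 2 / (n_factorial + n_center)


# ASSUMPTION: illustrative averages, not stated in the slides.
n_f, n_c = 4, 5                  # 2^2 factorial runs and center runs
ybar_f, ybar_c = 40.425, 40.46   # factorial-point and center-point averages
ms_error = 0.043                 # pure-error mean square from the ANOVA table

ss_curvature = pure_quadratic_ss(ybar_f, ybar_c, n_f, n_c)
# One degree of freedom, so the mean square equals the sum of squares.
f_curvature = ss_curvature / ms_error
print(f"SS_PureQuad = {ss_curvature:.3e}, F = {f_curvature:.3f}")
```

A small F ratio such as this one gives no evidence against H0, so under these assumed inputs the first-order model with interaction would be retained and no axial runs would be needed.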