Design of Engineering Experiments Part 5 – The $2^k$ Factorial Design
• Text reference, Chapter 6
• Special case of the general factorial design; k factors, all at two levels
• The two levels are usually called low and high (they could be either quantitative or qualitative)
• Very widely used in industrial experimentation
• Form a basic “building block” for other very useful experimental designs (DNA)
• Special (short-cut) methods for analysis
• We will make use of Design-Expert
Chapter 6 Design & Analysis of Experiments 8E 2012 Montgomery

The Simplest Case: The $2^2$
• “-” and “+” denote the low and high levels of a factor, respectively
• Low and high are arbitrary terms
• Geometrically, the four runs form the corners of a square
• Factors can be quantitative or qualitative, although their treatment in the final model will be different

Chemical Process Example
A = reactant concentration, B = catalyst amount, y = recovery

Analysis Procedure for a Factorial Design
• Estimate factor effects
• Formulate model
  – With replication, use full model
  – With an unreplicated design, use normal probability plots
• Statistical testing (ANOVA)
• Refine the model
• Analyze residuals (graphical)
• Interpret results

Estimation of Factor Effects
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\,[ab + a - b - (1)]$$
$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}\,[ab + b - a - (1)]$$
See textbook, pg. 235-236 for manual calculations
The effect estimates are: A = 8.33, B = -5.00, AB = 1.67
Practical interpretation?
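As a check on the effect estimates quoted above, the contrasts can be evaluated directly. The sketch below makes two assumptions that are not in the slide text: the treatment totals (1) = 80, a = 100, b = 60, ab = 90 with n = 3 replicates are recalled from the textbook's chemical process example, and the sum of squares is computed with the standard $2^2$ result contrast²/(4n). The AB contrast used is ab + (1) − a − b. The function name `effects_2x2` is hypothetical.

```python
def effects_2x2(t1, a, b, ab, n):
    """Effect estimates and sums of squares for a replicated 2^2 design.

    t1, a, b, ab are the treatment totals for runs (1), a, b, ab;
    n is the number of replicates per run.
    """
    contrasts = {
        "A": ab + a - b - t1,
        "B": ab + b - a - t1,
        "AB": ab + t1 - a - b,
    }
    # Effect = contrast / (2n); sum of squares = contrast^2 / (4n).
    effects = {name: c / (2 * n) for name, c in contrasts.items()}
    sums_sq = {name: c ** 2 / (4 * n) for name, c in contrasts.items()}
    return effects, sums_sq


# Assumed totals (recalled from the textbook example, not given on the slides).
effects, sums_sq = effects_2x2(t1=80, a=100, b=60, ab=90, n=3)
for name in ("A", "B", "AB"):
    print(f"{name}: effect = {effects[name]:.2f}, SS = {sums_sq[name]:.3f}")
```

Under these assumed totals the printed effects agree with the slide's A = 8.33, B = -5.00, AB = 1.67.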
Design-Expert analysis
$$AB = \frac{ab + (1)}{2n} - \frac{a + b}{2n} = \frac{1}{2n}\,[ab + (1) - a - b]$$

Estimation of Factor Effects
Form Tentative Model
        Term          Effect     SumSqr     % Contribution
Model   Intercept
Model   A             8.33333    208.333    64.4995
Model   B             -5         75         23.2198
Model   AB            1.66667    8.33333    2.57998
Error   Lack Of Fit              0          0
Error   P Error                  31.3333    9.70072
Lenth's ME   6.15809
Lenth's SME  7.95671

Statistical Testing - ANOVA
The F-test for the “model” source is testing the significance of the overall model; that is, is either A, B, or AB or some combination of these effects important?

Design-Expert output, full model

Design-Expert output, edited or reduced model

Residuals and Diagnostic Checking

The Response Surface

The $2^3$ Factorial Design

Effects in The $2^3$ Factorial Design
$$A = \bar{y}_{A^+} - \bar{y}_{A^-}, \qquad B = \bar{y}_{B^+} - \bar{y}_{B^-}, \qquad C = \bar{y}_{C^+} - \bar{y}_{C^-}$$
etc, etc, ...
Analysis done via computer

An Example of a $2^3$ Factorial Design
A = gap, B = Flow, C = Power, y = Etch Rate

Table of – and + Signs for the $2^3$ Factorial Design (pg.
218)

Properties of the Table
• Except for column I, every column has an equal number of + and – signs
• The sum of the product of signs in any two columns is zero
• Multiplying any column by I leaves that column unchanged (identity element)
• The product of any two columns yields a column in the table:
  $A \times B = AB$
  $AB \times BC = AB^2C = AC$
• Orthogonal design
• Orthogonality is an important property shared by all factorial designs

Estimation of Factor Effects

ANOVA Summary – Full Model

Model Coefficients – Full Model

Refine Model – Remove Nonsignificant Factors

Model Coefficients – Reduced Model

Model Summary Statistics for Reduced Model
• $R^2$ and adjusted $R^2$
$$R^2 = \frac{SS_{Model}}{SS_T} = \frac{5.106 \times 10^5}{5.314 \times 10^5} = 0.9608$$
$$R^2_{Adj} = 1 - \frac{SS_E / df_E}{SS_T / df_T} = 1 - \frac{20857.75 / 12}{5.314 \times 10^5 / 15} = 0.9509$$
• $R^2$ for prediction (based on PRESS)
$$R^2_{Pred} = 1 - \frac{PRESS}{SS_T} = 1 - \frac{37080.44}{5.314 \times 10^5} = 0.9302$$

Model Summary Statistics
• Standard error of model coefficients (full model)
$$se(\hat{\beta}) = \sqrt{V(\hat{\beta})} = \sqrt{\frac{\sigma^2}{n 2^k}} = \sqrt{\frac{MS_E}{n 2^k}} = \sqrt{\frac{2252.56}{2(8)}} = 11.87$$
• Confidence interval on model coefficients
$$\hat{\beta} - t_{\alpha/2,\, df_E}\, se(\hat{\beta}) \le \beta \le \hat{\beta} + t_{\alpha/2,\, df_E}\, se(\hat{\beta})$$

The Regression Model

Model Interpretation
Cube plots are often useful visual displays of experimental results

Cube Plot of Ranges
What do the large ranges
when gap and power are at the high level tell you?

The General $2^k$ Factorial Design
• Section 6-4, pg. 253, Table 6-9, pg. 25
• There will be k main effects, and
  $\binom{k}{2}$ two-factor interactions
  $\binom{k}{3}$ three-factor interactions
  ...
  1 k-factor interaction

6.5 Unreplicated $2^k$ Factorial Designs
• These are $2^k$ factorial designs with one observation at each corner of the “cube”
• An unreplicated $2^k$ factorial design is also sometimes called a “single replicate” of the $2^k$
• These designs are very widely used
• Risks…if there is only one observation at each corner, is there a chance of unusual response observations spoiling the results?
• Modeling “noise”?

Spacing of Factor Levels in the Unreplicated $2^k$ Factorial Designs
If the factors are spaced too closely, it increases the chances that the noise will overwhelm the signal in the data
More aggressive spacing is usually best

Unreplicated $2^k$ Factorial Designs
• Lack of replication causes potential problems in statistical testing
  – Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error)
  – With no replication, fitting the full model results in zero degrees of freedom for error
• Potential solutions to this problem
  – Pooling high-order interactions to estimate error
  – Normal probability plotting of effects (Daniel, 1959)
  – Other methods…see text

Example of an Unreplicated $2^k$ Design
• A $2^4$ factorial was used to investigate the effects of four factors on the filtration rate of a resin
• The factors are A = temperature, B = pressure, C = mole ratio, D = stirring rate
• Experiment was performed in a pilot plant
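For a $2^4$ design such as the one just described, the number of effects of each order and the table of – and + signs can be generated mechanically. The following is a minimal sketch; the function name `sign_table` and the use of single-letter factor names are choices made here for illustration, not something prescribed by the slides.

```python
from itertools import combinations, product
from math import comb


def sign_table(factors):
    """Table of -1/+1 signs for a 2^k design in standard order.

    `factors` is a string of single-letter factor names, e.g. "ABCD".
    Returns (effects, rows): effect labels and one dict of signs per run.
    """
    k = len(factors)
    # Main effects first, then two-factor interactions, and so on.
    effects = ["".join(c) for r in range(1, k + 1)
               for c in combinations(factors, r)]
    rows = []
    # product() varies its last element fastest; reversing makes the
    # first factor change fastest, which gives (1), a, b, ab, c, ...
    for levels in product((-1, 1), repeat=k):
        x = dict(zip(factors, reversed(levels)))
        row = {}
        for effect in effects:
            sign = 1
            for f in effect:          # interaction column = product of factor columns
                sign *= x[f]
            row[effect] = sign
        rows.append(row)
    return effects, rows


effects, rows = sign_table("ABCD")
counts = {j: sum(1 for e in effects if len(e) == j) for j in range(1, 5)}
expected = {j: comb(4, j) for j in range(1, 5)}   # k choose j effects of order j
print(counts)
```

Each effect column sums to zero and any two distinct columns have a zero sum of products, which is the orthogonality property of the sign table.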
The Resin Plant Experiment

Estimates of the Effects

The Half-Normal Probability Plot of Effects

Design Projection: ANOVA Summary for the Model as a $2^3$ in Factors A, C, and D

The Regression Model

Model Residuals are Satisfactory

Model Interpretation – Main Effects and Interactions

Model Interpretation – Response Surface Plots
With concentration at either the low or high level, high temperature and high stirring rate result in high filtration rates

Outliers: suppose that cd = 375 (instead of 75)

Dealing with Outliers
• Replace with an estimate
• Make the highest-order interaction zero
• In this case, estimate cd such that ABCD = 0
• Analyze only the data you have
• Now the design isn’t orthogonal
• Consequences?
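The "replace with an estimate" option can be sketched as follows: treat the suspect run as unknown and solve for the value that makes the highest-order interaction contrast zero. The function name `replace_suspect_run` is hypothetical. The response values are recalled from the textbook's filtration-rate table and are not given in these slides, so they should be treated as an assumption; only cd = 75 and the outlier value 375 come from the slide.

```python
def replace_suspect_run(y, suspect):
    """Estimate one suspect response in an unreplicated 2^k design by
    forcing the highest-order interaction contrast to zero.

    y       : responses in standard order, (1), a, b, ab, c, ...; length 2^k
    suspect : 0-based position of the suspect run in that order
    """
    k = len(y).bit_length() - 1
    if len(y) != 2 ** k:
        raise ValueError("y must contain 2^k responses")

    def sign(i):
        # In standard order, factor j is high in run i when bit j of i is set.
        # The k-factor interaction sign is the product of all factor signs,
        # i.e. (-1) raised to the number of factors at the low level.
        lows = k - bin(i).count("1")
        return 1 if lows % 2 == 0 else -1

    rest = sum(sign(i) * y[i] for i in range(len(y)) if i != suspect)
    # Solve sign(suspect) * x + rest = 0 for x.
    return -rest / sign(suspect)


# Assumed data (recalled from the textbook), with cd replaced by the outlier 375.
y = [45, 71, 48, 65, 68, 60, 80, 65, 43, 100, 45, 104, 375, 86, 70, 96]
cd_index = 12                      # (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd
estimate = replace_suspect_run(y, cd_index)
print(f"estimated cd = {estimate}")
```

The estimate depends only on the other fifteen runs, so the outlier value itself has no influence on it.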
The Drilling Experiment
Example 6.3
A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill

Normal Probability Plot of Effects – The Drilling Experiment

Residual Plots
[Design-Expert plot for adv._rate: Residuals vs. Predicted]

Residual Plots
• The residual plots indicate that there are problems with the equality of variance assumption
• The usual approach to this problem is to employ a transformation on the response
• Power family transformations are widely used: $y^* = y^{\lambda}$
• Transformations are typically performed to
  – Stabilize variance
  – Induce at least approximate normality
  – Simplify the model

Selecting a Transformation
• Empirical selection of lambda
• Prior (theoretical) knowledge or experience can often suggest the form of a transformation
• Analytical selection of lambda…the Box-Cox (1964) method (simultaneously estimates the model parameters and the transformation parameter lambda)
• Box-Cox method implemented in Design-Expert

The Box-Cox Method
[Design-Expert plot for adv._rate; vertical axis: Ln(ResidualSS)]
A log transformation is recommended
Lambda: Current = 1, Best = -0.23, Low C.I. = -0.79, High C.I.
= 0.32
[Plot title: Box-Cox Plot for Power Transform; horizontal axis: Lambda]
Recommend transform: Log (Lambda = 0)
The procedure provides a confidence interval on the transformation parameter lambda
If unity is included in the confidence interval, no transformation would be needed

Effect Estimates Following the Log Transformation
Three main effects are large
No indication of large interaction effects
What happened to the interactions?

ANOVA Following the Log Transformation

Following the Log Transformation

The Log Advance Rate Model
• Is the log model “better”?
• We would generally prefer a simpler model in a transformed scale to a more complicated model in the original metric
• What happened to the interactions?
• Sometimes transformations provide insight into the underlying mechanism

Other Examples of Unreplicated $2^k$ Designs
• The sidewall panel experiment (Example 6.4, pg. 274)
  – Two factors affect the mean number of defects
  – A third factor affects variability
  – Residual plots were useful in identifying the dispersion effect
• The oxidation furnace experiment (Example 6.5, pg. 245)
  – Replicates versus repeat (or duplicate) observations?
  – Modeling within-run variability
• Example 6.6, Credit Card Marketing, page 278
  – Using DOX in marketing and marketing research, a growing application
  – Analysis is with the JMP screening platform

Other Analysis Methods for Unreplicated $2^k$ Designs
• Lenth’s method (see text, pg.
262)
  – Analytical method for testing effects, uses an estimate of error formed by pooling small contrasts
  – Some adjustment to the critical values in the original method can be helpful
  – Probably most useful as a supplement to the normal probability plot
• Conditional inference charts (pg. 264)

Overview of Lenth’s method
For an individual contrast, compare to the margin of error

Adjusted multipliers for Lenth’s method
Suggested because the original method makes too many type I errors, especially for small designs (few contrasts)
Simulation was used to find these adjusted multipliers
Lenth’s method is a nice supplement to the normal probability plot of effects
JMP has an excellent implementation of Lenth’s method in the screening platform

The $2^k$ design and design optimality
The model parameter estimates in a $2^k$ design (and the effect estimates) are least squares estimates.
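For example, for a $2^2$ design the model is
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$
The four observations from a $2^2$ design:
$$(1) = \beta_0 + \beta_1(-1) + \beta_2(-1) + \beta_{12}(-1)(-1) + \varepsilon_1$$
$$a = \beta_0 + \beta_1(1) + \beta_2(-1) + \beta_{12}(1)(-1) + \varepsilon_2$$
$$b = \beta_0 + \beta_1(-1) + \beta_2(1) + \beta_{12}(-1)(1) + \varepsilon_3$$
$$ab = \beta_0 + \beta_1(1) + \beta_2(1) + \beta_{12}(1)(1) + \varepsilon_4$$
In matrix form:
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \quad
\mathbf{y} = \begin{bmatrix} (1) \\ a \\ b \\ ab \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \end{bmatrix}, \quad
\boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \end{bmatrix}$$

The least squares estimate of β is
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
= \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}^{-1}
\begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix}
= \frac{1}{4}\,\mathbf{I}_4
\begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix}$$
$$\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{\beta}_{12} \end{bmatrix}
= \begin{bmatrix} \dfrac{(1) + a + b + ab}{4} \\[4pt] \dfrac{a + ab - b - (1)}{4} \\[4pt] \dfrac{b + ab - a - (1)}{4} \\[4pt] \dfrac{(1) - a - b + ab}{4} \end{bmatrix}$$
The numerators are the “usual” contrasts
The X′X matrix is diagonal – consequences of an orthogonal design
The regression coefficient estimates are exactly half of the “usual” effect estimates

The matrix X′X has interesting and useful properties:
$$V(\hat{\beta}) = \sigma^2\,(\text{diagonal element of } (\mathbf{X}'\mathbf{X})^{-1}) = \frac{\sigma^2}{4}$$
Minimum possible value for a four-run design
$$|\mathbf{X}'\mathbf{X}| = 256$$
Maximum possible value for a four-run design
Notice that these results depend on both the design that you have chosen and the model
What about predicting the response?

$$V[\hat{y}(x_1, x_2)] = \sigma^2\, \mathbf{x}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}, \qquad \mathbf{x}' = [1,\ x_1,\ x_2,\ x_1 x_2]$$
$$V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4}\,(1 + x_1^2 + x_2^2 + x_1^2 x_2^2)$$
The maximum prediction variance occurs when $x_1 = \pm 1,\ x_2 = \pm 1$:
$$V[\hat{y}(x_1, x_2)] = \sigma^2$$
The prediction variance when $x_1 = x_2 = 0$ is
$$V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4}$$
What about average prediction variance over the design space?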
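The matrix results above can be verified numerically. The sketch below uses plain Python lists; the response values in `y` are hypothetical and chosen only to illustrate that each regression coefficient is half of the corresponding effect estimate, and the helper names are not from the slides.

```python
# Model matrix for the 2^2 design; columns are I, x1, x2, x1*x2.
X = [
    [1, -1, -1,  1],   # (1)
    [1,  1, -1, -1],   # a
    [1, -1,  1, -1],   # b
    [1,  1,  1,  1],   # ab
]


def xtx(X):
    """Return X'X as a list of lists."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]


def coefficients(X, y):
    """Least squares estimates; valid here because X'X = N * I (orthogonal columns)."""
    n_runs = len(X)
    return [sum(row[j] * yi for row, yi in zip(X, y)) / n_runs
            for j in range(len(X[0]))]


def pred_var_over_sigma2(x1, x2):
    """V[y_hat(x1, x2)] / sigma^2 = x'(X'X)^{-1}x with (X'X)^{-1} = I/4."""
    x = [1, x1, x2, x1 * x2]
    return sum(v * v for v in x) / 4


y = [20, 40, 30, 52]                 # hypothetical single-replicate responses
beta = coefficients(X, y)
effect_A = (y[3] + y[1] - y[2] - y[0]) / 2   # usual A effect with n = 1

print(xtx(X))
print(beta, effect_A)
print(pred_var_over_sigma2(1, 1), pred_var_over_sigma2(0, 0))
```

The shortcut in `coefficients` relies on the diagonal X′X; for a non-orthogonal design a general linear solve would be needed instead.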
Average prediction variance
$$I = \frac{1}{A}\int_{-1}^{1}\int_{-1}^{1} V[\hat{y}(x_1, x_2)]\, dx_1\, dx_2, \qquad A = \text{area of design space} = 2^2 = 4$$
$$I = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} \frac{\sigma^2}{4}\,(1 + x_1^2 + x_2^2 + x_1^2 x_2^2)\, dx_1\, dx_2 = \frac{4\sigma^2}{9}$$

[Design-Expert FDS Graph: StdErr Mean vs. Fraction of Design Space; Min StdErr Mean: 0.500, Max StdErr Mean: 1.000; Cuboidal radius = 1; Points = 10000]

For the $2^2$ and in general the $2^k$
• The design produces regression model coefficients that have the smallest variances (D-optimal design)
• The design results in minimizing the maximum variance of the predicted response over the design space (G-optimal design)
• The design results in minimizing the average variance of the predicted response over the design space (I-optimal design)

Optimal Designs
• These results give us some assurance that these designs are “good” designs in some general ways
• Factorial designs typically share some (most) of these properties
• There are excellent computer routines for finding optimal designs (JMP is outstanding)

Addition of Center Points to a $2^k$ Design
• Based on the idea of replicating some of the runs in a factorial design
• Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:
First-order model (interaction)
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \varepsilon$$
Second-order model
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon$$

$\bar{y}_F = \bar{y}_C \Rightarrow$ no "curvature"
The hypotheses are:
$$H_0: \sum_{i=1}^{k} \beta_{ii} = 0 \qquad H_1: \sum_{i=1}^{k} \beta_{ii} \ne 0$$
$$SS_{Pure\ Quad} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$
This sum of squares has a single degree of freedom

Example 6.7, Pg. 286
Refer to the original experiment shown in Table 6.10. Suppose that four center points are added to this experiment, and at the points $x_1 = x_2 = x_3 = x_4 = 0$ the four observed filtration rates were 73, 75, 66, and 69. The average of these four center points is 70.75, and the average of the 16 factorial runs is 70.06. Since $\bar{y}_F$ and $\bar{y}_C$ are very similar, we suspect that there is no strong curvature present.
$n_C = 4$
Usually between 3 and 6 center points will work well
Design-Expert provides the analysis, including the F-test for pure quadratic curvature

ANOVA for Example 6.7

If curvature is significant, augment the design with axial runs to create a central composite design. The CCD is a very effective design for fitting a second-order response surface model

Practical Use of Center Points (pg. 289)
• Use current operating conditions as the center point
• Check for “abnormal” conditions during the time the experiment was conducted
• Check for time trends
• Use center points as the first few runs when there is little or no information available about the magnitude of error
• Center points and qualitative factors?

Center Points and Qualitative Factors
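The single-degree-of-freedom curvature sum of squares from Example 6.7 can be reproduced from the numbers quoted in the slides (center runs 73, 75, 66, 69; factorial average 70.06; 16 factorial runs). In the sketch below the F ratio uses the sample variance of the center runs as the error mean square, which is an assumption made here; the ANOVA referred to in the slides may use a different or pooled error term. The function name `curvature_test` is hypothetical.

```python
from statistics import mean, variance


def curvature_test(center_runs, y_bar_f, n_f):
    """Pure quadratic curvature check for a 2^k design with center points.

    center_runs : responses observed at the design center
    y_bar_f     : average of the factorial (corner) runs
    n_f         : number of factorial runs
    Returns (y_bar_c, ss_pure_quad, f_ratio).
    """
    n_c = len(center_runs)
    y_bar_c = mean(center_runs)
    ss_pure_quad = n_f * n_c * (y_bar_f - y_bar_c) ** 2 / (n_f + n_c)
    # Assumption: error mean square taken from the center runs only
    # (sample variance, n_c - 1 degrees of freedom).
    ms_error = variance(center_runs)
    return y_bar_c, ss_pure_quad, ss_pure_quad / ms_error


y_bar_c, ss_pq, f_ratio = curvature_test([73, 75, 66, 69], y_bar_f=70.06, n_f=16)
print(f"center average = {y_bar_c}, SS_PureQuad = {ss_pq:.3f}, F = {f_ratio:.3f}")
```

An F ratio well below 1, as here, is consistent with the slide's conclusion that no strong curvature is present.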