6BV04 Screening Designs
/ department of mathematics and computer science

Contents
• regression analysis and effects
• 2^p experiments
• blocks
• 2^(p-k) experiments (fractional factorial experiments)
• software
• literature

Three factors: example
Response: deviation of the filling height of bottles
Factors:
A  carbon dioxide level (%)
B  pressure (psi)
C  speed (bottles/min)

Effects
How do we determine whether an individual factor is of importance?
Measure the outcome at 2 different settings of that factor.
Scale the settings such that they become the values +1 and -1.

[Figure: measurement plotted against the setting of factor A at -1 and +1. The effect is the difference between the two measurements; the slope is that of the line connecting them.]
N.B. effect = 2 * slope, because the settings -1 and +1 are 2 units apart.

Example: the measurement is 35 at A = -1 and 50 at A = +1.
Effect factor A = 50 - 35 = 15

More factors
We denote factors with capitals: A, B, ...
Each factor only attains two settings: -1 and +1.
The joint settings of all factors in one measurement are called a level combination.

Level combinations for two factors:
  A    B
 -1   -1
 -1   +1
 +1   -1
 +1   +1

Notation
A level combination is written in small letters. The small letters denote which factors are set at +1; the letters that do not appear are set at -1.
Example: ac means A and C at +1, the remaining factors at -1.
N.B. (1) means that all factors are set at -1.
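The notation rule above can be sketched in a few lines of Python. This is only an illustration; the function name levels and its arguments are hypothetical and not part of the course material.

```python
def levels(label, factors):
    """Translate a level-combination label such as 'ac' into coded settings.

    label   -- small-letter label, or '(1)' for all factors at -1
    factors -- the factor names in capitals, e.g. 'ABC'
    """
    letters = "" if label == "(1)" else label
    # A factor is at +1 exactly when its small letter appears in the label.
    return {f: (1 if f.lower() in letters else -1) for f in factors}


print(levels("ac", "ABC"))   # {'A': 1, 'B': -1, 'C': 1}
print(levels("(1)", "ABC"))  # {'A': -1, 'B': -1, 'C': -1}
```

The special case for (1) only makes the convention explicit: that label contains no factor letters, so every factor ends up at -1.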
An experiment consists of performing measurements at different level combinations.
A run is a measurement at one level combination.
Suppose that there are 2 factors, A and B. We perform 4 measurements with the following settings:
• A -1 and B -1 (short: (1))
• A +1 and B -1 (short: a)
• A -1 and B +1 (short: b)
• A +1 and B +1 (short: ab)

A 2^2 Experiment with 4 runs
        A    B   yield
(1)    -1   -1
b      -1   +1
a      +1   -1
ab     +1   +1

Note:
CAPITALS for factors and effects (A, BC, CDEF)
small letters for level combinations, i.e. the settings of the experiments (a, bc, cde, (1))

Graphical display
[Figure: the four level combinations drawn as the corners of a square, with A on the horizontal axis and B on the vertical axis: (1) at (-1, -1), a at (+1, -1), b at (-1, +1), ab at (+1, +1).]

Measurements at the four corners:
           A = -1   A = +1
B = +1       40       60
B = -1       35       50

2 estimates for effect A:
at B = -1: 50 - 35 = 15
at B = +1: 60 - 40 = 20
Which estimate is superior?
Measurements at the four corners:
           A = -1   A = +1
B = +1       40       60
B = -1       35       50

2 estimates for effect A: 50 - 35 = 15 and 60 - 40 = 20.
Combine both estimates:
Effect A = ½(50-35) + ½(60-40) = 17.5

In the same way we estimate the effect B (note that all 4 measurements are used!):
Effect B = ½(40-35) + ½(60-50) = 7.5

The interaction effect AB is half the difference between the two estimates for the effect A:
Effect AB = ½(60-40) - ½(50-35) = 2.5

Interaction effects
Cross terms in linear regression models cause interaction effects:
Y = 3 + 2 x_A + 4 x_B + 7 x_A x_B
x_A → x_A + 1 gives Y → Y + 2 + 7 x_B, so the increase depends on x_B.
Likewise for x_B → x_B + 1.
This explains the notation AB.

[Figures: four interaction plots of Output against Factor A (low, high), each with one line for B low and one line for B high. Panels: No interaction, Interaction I, Interaction II, Interaction III. Without interaction the two lines are parallel.]

Trick to Compute Effects
(Coded) settings and measurements:
        A    B   yield
(1)    -1   -1    35
b      -1   +1    40
a      +1   -1    50
ab     +1   +1    60

Effect estimates: multiply the yield column by the sign column and halve the sum.
Effect A = ½(-35 - 40 + 50 + 60) = 17.5
Effect B = ½(-35 + 40 - 50 + 60) = 7.5

The column AB equals the product of the columns A and B:
        A    B   AB   yield
(1)    -1   -1   +1    35
b      -1   +1   -1    40
a      +1   -1   -1    50
ab     +1   +1   +1    60
Effect AB = ½(60-40) - ½(50-35) = ½(+35 - 40 - 50 + 60) = 2.5

        I   A   B   AB   yield
(1)     +   -   -   +     35
b       +   -   +   -     40
a       +   +   -   -     50
ab      +   +   +   +     60
Computational rules: I×A = A, I×B = B, A×B = AB etc.
This holds true in general (i.e., also for more factors).

3 Factors: a 2^3 Design
        I   A   B   AB   C   AC   BC   ABC   yield
(1)     +   -   -   +    -   +    +    -       5
a       +   +   -   -    -   -    +    +       2
b       +   -   +   -    -   +    -    +       7
ab      +   +   +   +    -   -    -    -       1
c       +   -   -   +    +   -    -    +       7
ac      +   +   -   -    +   +    -    -       6
bc      +   -   +   -    +   -    +    -       9
abc     +   +   +   +    +   +    +    +       7

[Figure: scheme of the 2^3 design as a cube with axes A, B and C and the yields at its corners: (1)=5, a=2, b=7, ab=1, c=7, ac=6, bc=9, abc=7.]
effect A = ¼(+16 - 28) = -3
effect AB = ¼(+20 - 24) = -1
Here 16 and 20 are the sums of the yields with a + in the column concerned, 28 and 24 the sums of the yields with a -.

Back to 2 factors – Blocking
        I   A   B   AB   day
(1)     +   -   -   +     ?
b       +   -   +   -     ?
a       +   +   -   -     ?
ab      +   +   +   +     ?
Suppose that we cannot perform all measurements on the same day, so that they are spread over day 1 and day 2. We are not interested in the difference between the 2 days, but we must take its effect into account. How do we accomplish that?
Back to 2 factors – Blocking
Suppose that (1) and b are measured on day 1, and a and ab on day 2. The day then acts as a "hidden" block effect:
        I   A   B   AB   day
(1)     +   -   -   +    1 (-)
b       +   -   +   -    1 (-)
a       +   +   -   -    2 (+)
ab      +   +   +   +    2 (+)
We note that the columns A and day are the same. Consequence: the effect of A and the day effect cannot be distinguished. This is called confounding (or aliasing).

A general guideline is to confound the day effect with an interaction of the highest possible order. How can we accomplish that here?

Solution: let the day column equal the AB column.
        I   A   B   AB   day
(1)     +   -   -   +     +
b       +   -   +   -     -
a       +   +   -   -     -
ab      +   +   +   +     +
day 1: a, b
day 2: (1), ab
or interchange the days!

Choose within the days by drawing lots which experiment must be performed first. In general, the order of experiments must be determined by drawing lots. This is called randomisation.

Here is a scheme for 3 factors:
        I   A   B   AB   C   AC   BC   ABC   day
(1)     +   -   -   +    -   +    +    -      ?
a       +   +   -   -    -   -    +    +      ?
b       +   -   +   -    -   +    -    +      ?
ab      +   +   +   +    -   -    -    -      ?
c       +   -   -   +    +   -    -    +      ?
ac      +   +   -   -    +   +    -    -      ?
bc      +   -   +   -    +   -    +    -      ?
abc     +   +   +   +    +   +    +    +      ?
Interactions of order 3 or higher can be neglected in practice. How should we divide the experiments over 2 days?
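One way to answer this question, following the guideline above of confounding the day effect with the interaction of highest order, is to put all runs with ABC = -1 on one day and all runs with ABC = +1 on the other. A minimal Python sketch under that assumption (the function name split_by_abc is hypothetical):

```python
from itertools import product


def split_by_abc():
    """Divide the 8 runs of a 2^3 design over 2 days using the sign of ABC."""
    factors = "ABC"
    days = {1: [], 2: []}
    for settings in product((-1, 1), repeat=3):  # coded settings of A, B, C
        a, b, c = settings
        # Label the run by the small letters of the factors that are at +1.
        label = "".join(f.lower() for f, x in zip(factors, settings) if x == 1)
        label = label or "(1)"
        day = 1 if a * b * c == -1 else 2  # day column = ABC column
        days[day].append(label)
    return days


print(split_by_abc())
# {1: ['(1)', 'bc', 'ac', 'ab'], 2: ['c', 'b', 'a', 'abc']}
```

This gives the same two blocks as the blocked design analysed later in these slides: (1), ab, ac, bc in one block and a, b, c, abc in the other. Which block is run on which day, and the order within a day, should still be decided by drawing lots.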
Fractional experiments
Often the number of factors is too large to allow a complete 2^p design (i.e., all 2^p possible settings -1 and +1 of the p factors).
By performing only a cleverly chosen subset of the 2^p experiments, we can arrange that relatively few runs suffice to estimate the main effects and (possibly) the 2nd order interactions.

Fractional experiments
Take from the full 2^3 scheme only the runs (1), a, b and ab:
        I   A   B   AB   C   AC   BC   ABC
(1)     +   -   -   +    -   +    +    -
a       +   +   -   -    -   -    +    +
b       +   -   +   -    -   +    -    +
ab      +   +   +   +    -   -    -    -
With this half fraction (only 4 = ½×8 experiments) we see that a number of columns are the same (apart from a minus sign):
I = -C, A = -AC, B = -BC, AB = -ABC
We say that these effects are confounded or aliased. In this particular case we have an ill-chosen fraction, because I and C are confounded: C is at -1 in every run, so the main effect of C cannot be estimated.

Fractional experiments – Better Choice: I = ABC
Take the runs a, b, c and abc:
        I   A   B   AB   C   AC   BC   ABC
a       +   +   -   -    -   -    +    +
b       +   -   +   -    -   +    -    +
c       +   -   -   +    +   -    -    +
abc     +   +   +   +    +   +    +    +
Aliasing structure: I = ABC, A = BC, B = AC, C = AB
The other "best choice" would be: I = -ABC

In the case of 3 factors, further reducing the number of experiments is not possible in practice, because this leads to undesired confounding. For example, the quarter fraction consisting of the runs a and abc:
        I   A   B   AB   C   AC   BC   ABC
a       +   +   -   -    -   -    +    +
abc     +   +   +   +    +   +    +    +
I = A = BC = ABC, B = C = AB = AC
Other quarter fractions also have confounded main effects, which is unacceptable.

Further remarks on fractions
• there exist computational rules for aliases. E.g., it follows from A = C that AB = BC. Note that I = A^2 = B^2 = C^2 etc. always holds (see the next lecture)
• tables and software are available for choosing a suitable fraction. The extent of confounding is indicated by the resolution. Resolution III is the minimum; designs with a higher resolution are very much preferred.

Plackett-Burman designs
So far we discussed fractional designs for screening. This is sensible if one cannot exclude the possibility of interactions.
If one knows from prior knowledge that there are no interactions, or if one is for some reason only interested in main effects, then Plackett-Burman designs are preferred. They are able to detect significant main effects using only very few runs.
A disadvantage of these designs is their complicated aliasing structure.

Number of measurements
For every main or interaction effect that has to be estimated separately, at least one measurement is necessary. If there are k blocks, then this requires k - 1 additional measurements. The remaining measurements are used for estimation of the variance. It is important to have sufficient measurements for the variance.

Choice of design
After a design has been chosen, the factors A, B, ... must be assigned to the factors of the experiment. It is recommended to combine any prior knowledge of the factors with the alias structure. The individual measurements must be performed in a random order.
• never confound two effects that might both be significant
• if you know that a certain effect will not be significant, you can confound it with an effect that might be significant.

Centre points and Replications
If there are not enough measurements to obtain a good estimate of the variance, then one can perform replications. Another possibility is to add centre points.
Adding centre points serves two purposes:
• a better variance estimate
• the possibility to test for curvature using a lack-of-fit test
[Figure: the 2^2 design square with corners (1), a, b, ab on the axes A and B, and the centre point in the middle at A = 0, B = 0.]

Curvature
A design in which each factor is only allowed to attain the levels -1 and +1 implicitly assumes a linear model. This is because, knowing only the function values at -1 and +1, the functions 1 and x^2 cannot be distinguished: both equal 1 at x = -1 and at x = +1. We can distinguish them by adding the level 0, where x^2 = 0. This is the idea behind adding centre points.
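The two purposes of centre points can be illustrated numerically. The sketch below uses the yields of the centre-point example analysed later in these slides (corner yields 5, 6, 9, 1 and centre yields 8, 8, 7). It assumes the usual single-degree-of-freedom curvature sum of squares, n_F n_C (mean_F - mean_C)^2 / (n_F + n_C); that this is what the software reports as "Lack-of-fit" is an assumption, although the numbers agree. The function name curvature_check is hypothetical.

```python
def curvature_check(corner_yields, centre_yields):
    """Pure-error and curvature sums of squares for a 2^p design with centre points."""
    n_f, n_c = len(corner_yields), len(centre_yields)
    mean_f = sum(corner_yields) / n_f
    mean_c = sum(centre_yields) / n_c
    # Pure error: spread of the replicated centre points around their own mean.
    ss_pure = sum((y - mean_c) ** 2 for y in centre_yields)
    ms_pure = ss_pure / (n_c - 1)
    # Curvature: compares the mean of the corners with the mean of the centre
    # (assumed formula, see the text above).
    ss_curv = n_f * n_c * (mean_f - mean_c) ** 2 / (n_f + n_c)
    return ss_pure, ss_curv, ss_curv / ms_pure


ss_pure, ss_curv, f_ratio = curvature_check([5, 6, 9, 1], [8, 8, 7])
print(round(ss_pure, 4), round(ss_curv, 4), round(f_ratio, 2))
# 0.6667 10.0119 30.04
```

If the response were linear in the factors, the mean of the corner points and the mean of the centre points would estimate the same quantity, so a large curvature sum of squares relative to the pure error points to a quadratic term.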
Analysis of a Design
        A   B   C   Yield
(1)     -   -   -     5
a       +   -   -     2
b       -   +   -     7
ab      +   +   -     1
c       -   -   +     7
ac      +   -   +     6
bc      -   +   +     9
abc     +   +   +     7

Analysis of a Design – With 2-way Interactions
Analysis Summary
----------------
File name: <Untitled>

Estimated effects for Yield
----------------------------------------------------------------------
average = 5.5  +/- 0.25
A:A     = -3.0 +/- 0.5
B:B     = 1.0  +/- 0.5
C:C     = 3.5  +/- 0.5
AB      = -1.0 +/- 0.5
AC      = 1.5  +/- 0.5
BC      = 0.5  +/- 0.5
----------------------------------------------------------------------
Standard errors are based on total error with 1 d.f.

Analysis of a Design – With 2-way Interactions
Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                   18.0         1       18.0        36.00    0.1051
B:B                    2.0         1        2.0         4.00    0.2952
C:C                   24.5         1       24.5        49.00    0.0903
AB                     2.0         1        2.0         4.00    0.2952
AC                     4.5         1        4.5         9.00    0.2048
BC                     0.5         1        0.5         1.00    0.5000
Total error            0.5         1        0.5
--------------------------------------------------------------------------------
Total (corr.)         52.0         7

R-squared = 99.0385 percent
R-squared (adjusted for d.f.) = 93.2692 percent
Standard Error of Est. = 0.707107
Mean absolute error = 0.25
Durbin-Watson statistic = 2.5
Lag 1 residual autocorrelation = -0.375

Analysis of a Design – Only Main Effects
Analysis Summary
----------------
File name: <Untitled>

Estimated effects for Yield
----------------------------------------------------------------------
average = 5.5  +/- 0.484123
A:A     = -3.0 +/- 0.968246
B:B     = 1.0  +/- 0.968246
C:C     = 3.5  +/- 0.968246
----------------------------------------------------------------------
Standard errors are based on total error with 4 d.f.

Effect estimates remain the same!
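The effect estimates in the output above can be reproduced by hand with the sign-column trick from the earlier slides: multiply each yield by the sign of the column concerned, add up, and divide by half the number of runs. A small Python sketch (the helper names sign and effect are illustrative only):

```python
from math import prod

# Yields of the 2^3 design, indexed by level combination.
yields = {"(1)": 5, "a": 2, "b": 7, "ab": 1, "c": 7, "ac": 6, "bc": 9, "abc": 7}


def sign(column, run):
    """Sign of a column such as 'AB' in a run such as 'ac' (product of factor signs)."""
    return prod(1 if f.lower() in run else -1 for f in column)


def effect(column, data):
    """Effect estimate: signed sum of the yields divided by half the number of runs."""
    contrast = sum(sign(column, run) * y for run, y in data.items())
    return contrast / (len(data) / 2)


for column in ["A", "B", "C", "AB", "AC", "BC"]:
    print(column, effect(column, yields))
# A -3.0
# B 1.0
# C 3.5
# AB -1.0
# AC 1.5
# BC 0.5
```

The estimate of an effect depends only on its own sign column and the yields, not on which other terms are in the model. That is why dropping the interactions changes the standard errors but leaves the effect estimates unchanged.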
Analysis of a Design – Only Main Effects
Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                   18.0         1       18.0         9.60    0.0363
B:B                    2.0         1        2.0         1.07    0.3601
C:C                   24.5         1       24.5        13.07    0.0225
Total error            7.5         4        1.875
--------------------------------------------------------------------------------
Total (corr.)         52.0         7

R-squared = 85.5769 percent
R-squared (adjusted for d.f.) = 74.7596 percent
Standard Error of Est. = 1.36931
Mean absolute error = 0.8125
Durbin-Watson statistic = 2.16667 (P=0.3180)
Lag 1 residual autocorrelation = -0.125

Analysis of a Design with Blocks
        Block   A   B   C   Yield
(1)       1     -   -   -     5
ab        1     +   +   -     1
ac        1     +   -   +     6
bc        1     -   +   +     9
a         2     +   -   -     2
b         2     -   +   -     7
c         2     -   -   +     7
abc       2     +   +   +     7

Analysis of a Design with Blocks – With 2-way Interactions
Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                   18.0         1       18.0
B:B                    2.0         1        2.0
C:C                   24.5         1       24.5
AB                     2.0         1        2.0
AC                     4.5         1        4.5
BC                     0.5         1        0.5
blocks                 0.5         1        0.5
Total error            0.0         0
--------------------------------------------------------------------------------
Total (corr.)         52.0         7

R-squared = 100.0 percent
R-squared (adjusted for d.f.) = 100.0 percent

Saturated design: 0 df for the error term → no testing possible

Analysis of a Design with Blocks – Only Main Effects
Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                   18.0         1       18.0         7.71    0.0691
B:B                    2.0         1        2.0         0.86    0.4228
C:C                   24.5         1       24.5        10.50    0.0478
blocks                 0.5         1        0.5         0.21    0.6749
Total error            7.0         3        2.33333
--------------------------------------------------------------------------------
Total (corr.)         52.0         7

R-squared = 86.5385 percent
R-squared (adjusted for d.f.) = 76.4423 percent
Standard Error of Est. = 1.52753
Mean absolute error = 0.75
Durbin-Watson statistic = 3.21429 (P=0.0478)
Lag 1 residual autocorrelation = -0.642857

Analysis of a Fractional Design (I = -ABC)
        A   B   C   Yield
(1)     -   -   -     5
ac      +   -   +     6
bc      -   +   +     9
ab      +   +   -     1

Analysis of a Fractional Design (I = -ABC)
Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A-BC                12.25        1      12.25
B:B-AC                 0.25        1       0.25
C:C-AB                20.25        1      20.25
Total error            0.0         0
--------------------------------------------------------------------------------
Total (corr.)         32.75        3

R-squared = 100.0 percent
R-squared (adjusted for d.f.) = 0.0 percent

Estimated effects for Yield
----------------------------------------------------------------------
average = 5.25
A:A-BC  = -3.5
B:B-AC  = -0.5
C:C-AB  = 4.5
----------------------------------------------------------------------
No degrees of freedom left to estimate standard errors.
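The labels A-BC, B-AC and C-AB in this output follow from the defining relation I = -ABC together with the rule I = A^2 = B^2 = C^2: multiplying both sides by A gives A = -A^2 BC = -BC, and so on. A Python sketch of that computation (the function names multiply and alias are hypothetical, not taken from any package):

```python
def multiply(word1, word2):
    """Product of two effect words; letters occurring twice cancel because A*A = I."""
    letters = set(word1) ^ set(word2)  # symmetric difference drops squared letters
    return "".join(sorted(letters)) or "I"


def alias(effect, generator="ABC", sign=-1):
    """Alias of an effect under the defining relation I = sign * generator."""
    return ("-" if sign < 0 else "") + multiply(effect, generator)


for e in "ABC":
    print(e, "=", alias(e))
# A = -BC
# B = -AC
# C = -AB
```

With sign=+1 the same function returns the aliasing structure of the other half fraction, I = ABC: A = BC, B = AC, C = AB. Each estimate in the table above is therefore an estimate of a main effect minus a two-factor interaction, which can only be read as the main effect if that interaction is negligible.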
Analysis of a Design with Centre Points
           A   B   Yield
(1)        -   -     5
a          +   -     6
b          -   +     9
ab         +   +     1
centre     0   0     8
centre     0   0     8
centre     0   0     7

Pure Error = \frac{1}{3-1} \sum_{i=1}^{3} (y_i - \bar{y})^2 = \frac{1}{3}, where y_1, y_2, y_3 are the yields at the three centre points.

Analysis of a Design with Centre Points
Analysis of Variance for Yield
--------------------------------------------------------------------------------
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:A                   12.25        1      12.25        36.75    0.0261
B:B                    0.25        1       0.25         0.75    0.4778
AB                    20.25        1      20.25        60.75    0.0161
Lack-of-fit           10.0119      1      10.0119      30.04    0.0317
Pure error             0.666667    2       0.333333
--------------------------------------------------------------------------------
Total (corr.)         43.4286      6

R-squared = 75.4112 percent
R-squared (adjusted for d.f.) = 50.8224 percent
Standard Error of Est. = 0.57735
Mean absolute error = 1.18367
Durbin-Watson statistic = 0.801839 (P=0.1157)
Lag 1 residual autocorrelation = 0.524964

P-Value < 0.05 → Lack-of-fit!

Software
• Statgraphics: menu Special -> Experimental Design
• StatLab: http://www.win.tue.nl/statlab2/
• Design Wizard (illustrates blocks and fractions): http://www.win.tue.nl/statlab2/designApplet.html
• Box (simple optimization illustration): http://www.win.tue.nl/~marko/box/box.html

Literature
• J. Trygg and S. Wold, Introduction to Experimental Design – What is it? Why and Where is it Useful?, homepage of chemometrics, editorial August 2002: www.acc.umu.se/~tnkjtg/Chemometrics/editorial/aug2002.html
• Introduction from moresteam.com: www.moresteam.com/toolbox/t408.cfm
• V. Czitrom, One-Factor-at-a-Time Versus Designed Experiments, American Statistician 53 (1999), 126-131
• Thumbnail Handbook for Factorial DOE, StatEase