Multifactor Experiments
November 26, 2013
Gui Citovsky, Julie Heymann, Jessica Sopp, Jin Lee, Qi Fan, Hyunhwan Lee, Jinzhu Yu, Lenny Horowitz, Shuvro Biswas

Outline
• Two-Factor Experiments with Fixed Crossed Factors
• 2^k Factorial Experiments
• Other Selected Types of Two-Factor Experiments

Two-Factor Experiments with Fixed Crossed Factors

First, single factor
• Comparison of two or more treatments (groups)
• Single treatment factor
• Example: a study to compare the average flight distances of three types of golf balls that differ in the shape of their dimples: circular, fat elliptical, thin elliptical
• Treatments: circular, fat elliptical, thin elliptical
• Treatment factor: type of ball

Single factor continued

Two-Factor Experiments With Fixed Crossed Factors
• Two fixed factors, A with a ≥ 2 levels and B with b ≥ 2 levels
• ab treatment combinations
• If n observations are obtained under each treatment combination (n replicates), there are abn experimental units in total

Two-Factor Experiments With Fixed Crossed Factors
• Example: heat treatment experiment to evaluate the effects of a quenching medium (two levels: oil and water) and quenching temperature (three levels: low, medium, high) on the surface hardness of steel
• 2 x 3 = 6 treatment combinations
• If 3 steel samples are treated for each combination, we have N = 18 observations

Model and Estimates of its Parameters
Let $y_{ijk}$ = kth observation on the (i, j)th treatment combination, i = 1, 2, …, a, j = 1, 2, …, b, and k = 1, 2, …, n. Let the random variable $Y_{ijk}$ correspond to the observed outcome $y_{ijk}$.
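As a small illustration of the (i, j, k) indexing, the heat treatment design described above can be enumerated in a few lines. This is only a sketch: the names `media`, `temperatures`, `n_reps`, `combos` and `units` are illustrative and do not come from the slides.

```python
from itertools import product

# Factor A: quenching medium (a = 2 levels); factor B: temperature (b = 3 levels)
media = ["oil", "water"]
temperatures = ["low", "medium", "high"]
n_reps = 3  # n replicates per treatment combination

# The ab treatment combinations (i, j)
combos = list(product(media, temperatures))

# One experimental unit per (i, j, k), k = 1..n, for abn units in total
units = [(m, t, k) for (m, t) in combos for k in range(1, n_reps + 1)]

print(len(combos), "treatment combinations;", len(units), "observations")
```

Each tuple in `units` plays the role of one index triple (i, j, k) in the notation $y_{ijk}$.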
Basic Model: $Y_{ijk} \sim N(\mu_{ij}, \sigma^2)$, independent. Equivalently,
$Y_{ijk} = \mu_{ij} + \epsilon_{ijk}$, where the $\epsilon_{ijk}$ are i.i.d. $N(0, \sigma^2)$

Table format

Parameters
• Grand mean: $\mu = \bar{\mu}_{\cdot\cdot} = \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b}\mu_{ij}$
• ith row average: $\bar{\mu}_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b}\mu_{ij}$
• jth column average: $\bar{\mu}_{\cdot j} = \frac{1}{a}\sum_{i=1}^{a}\mu_{ij}$
• ith row main effect: $\tau_i = \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot}$
• jth column main effect: $\beta_j = \bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot}$
• (i, j)th row-column interaction: $(\tau\beta)_{ij} = \mu_{ij} - \mu - \tau_i - \beta_j = \mu_{ij} - \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot j} + \bar{\mu}_{\cdot\cdot}$

Least Squares Estimates
• $\hat{\mu} = \bar{y}_{\cdot\cdot\cdot}$
• $\hat{\tau}_i = \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}$
• $\hat{\beta}_j = \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}$
• $\widehat{(\tau\beta)}_{ij} = \bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}$
• $\hat{\mu}_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_j + \widehat{(\tau\beta)}_{ij} = \bar{y}_{\cdot\cdot\cdot} + (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}) = \bar{y}_{ij\cdot}$

Variance
• Sample variance for the (i, j)th cell: $s_{ij}^2 = \frac{\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij\cdot})^2}{n-1}$
• Pooled estimate for $\sigma^2$: $s^2 = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b}(n-1)s_{ij}^2}{N - ab}$

Example
• Experiment to study how the mechanical bonding strength of capacitors depends on the type of substrate (factor A) and bonding material (factor B).
• 3 substrates: Al2O3 with bracket, Al2O3 no bracket, BeO no bracket
• 4 types of bonding material: Epoxy I, Epoxy II, Solder I and Solder II
• Four capacitors were tested at each factor level combination

                       Bonding Material
Substrate            Epoxy I                  Epoxy II                 Solder I                 Solder II
Al2O3 no bracket     1.51, 1.96, 1.83, 1.98   2.62, 2.82, 2.69, 2.93   2.96, 2.82, 3.11, 3.11   3.67, 3.40, 3.25, 2.90
Al2O3 with bracket   1.63, 1.80, 1.92, 1.71   3.12, 2.94, 3.23, 2.99   2.91, 2.93, 3.01, 2.93   3.48, 3.51, 3.24, 3.45
BeO                  3.04, 3.16, 3.09, 3.50   1.91, 2.11, 1.78, 2.25   3.04, 2.91, 2.48, 2.83   3.47, 3.42, 3.31, 3.76

Example continued
• $\hat{\tau}_1 = \bar{y}_{1\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot} = 2.723 - 2.800 = -0.077$
• $\hat{\beta}_1 = \bar{y}_{\cdot 1\cdot} - \bar{y}_{\cdot\cdot\cdot} = 2.261 - 2.800 = -0.539$
• $\widehat{(\tau\beta)}_{11} = \bar{y}_{11\cdot} - \bar{y}_{1\cdot\cdot} - \bar{y}_{\cdot 1\cdot} + \bar{y}_{\cdot\cdot\cdot} = 1.820 - 2.723 - 2.261 + 2.800 = -0.364$
• Cell sample standard deviations:
  s11 = 0.217   s12 = 0.138   s13 = 0.139   s14 = 0.321
  s21 = 0.124   s22 = 0.131   s23 = 0.044   s24 = 0.122
  s31 = 0.208   s32 = 0.209   s33 = 0.240   s34 = 0.192
• Pooled sample variance: $s^2 = \frac{(0.217)^2 + \cdots + (0.192)^2}{12} = 0.0349$

Example continued: Sample Means

                       Bonding Material
Substrate            Epoxy I   Epoxy II   Solder I   Solder II   Row mean
Al2O3 no bracket     1.820     2.765      3.000      3.305       2.723
Al2O3 with bracket   1.765     3.070      2.945      3.420       2.800
BeO                  3.198     2.013      2.815      3.490       2.879
Column mean          2.261     2.616      2.920      3.405       2.800

Example continued: Other Model Parameters

Two-Way Analysis of Variance
We define the following sums of squares:
• $SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$
• $SSA = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = bn\sum_{i=1}^{a}(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = bn\sum_{i=1}^{a}\hat{\tau}_i^2$
• $SSB = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = an\sum_{j=1}^{b}(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = an\sum_{j=1}^{b}\hat{\beta}_j^2$
• $SSAB = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2 = n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2 = n\sum_{i=1}^{a}\sum_{j=1}^{b}\widehat{(\tau\beta)}_{ij}^2$
• $SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij\cdot})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}e_{ijk}^2$

Analysis of Variance
• Degrees of freedom:
  • SST: N − 1
  • SSA: a − 1
  • SSB: b − 1
  • SSAB: (a − 1)(b − 1)
  • SSE: N − ab
• SST = SSA + SSB + SSAB + SSE.
• The degrees of freedom follow the same identity: N − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1) + (N − ab)

Analysis of Variance
• Mean square = (sum of squares) / (d.f.)
• $MSA = \frac{SSA}{a-1}$
• $MSB = \frac{SSB}{b-1}$
• $MSAB = \frac{SSAB}{(a-1)(b-1)}$
• $MSE = \frac{SSE}{N-ab} = s^2$

Hypothesis Test
We test three hypotheses:
• $H_{0A}: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ vs. $H_{1A}$: not all $\tau_i = 0$
• $H_{0B}: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ vs. $H_{1B}$: not all $\beta_j = 0$
• $H_{0AB}: (\tau\beta)_{11} = (\tau\beta)_{12} = \cdots = (\tau\beta)_{ab} = 0$ vs. $H_{1AB}$: not all $(\tau\beta)_{ij} = 0$
If all interaction terms are equal to zero, then the effect of one factor on the mean response does not depend on the level of the other factor.

When do we reject H0?
• Use F-statistics to test our hypotheses by taking the ratio of each mean square to the MSE.
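The mean-square ratios described above can be checked numerically on the capacitor data. The following is a minimal sketch in plain Python, not the analysis used in the slides. The nested list `data` is a reconstruction of the flattened data table (rows are substrates, columns are bonding materials), and all variable names are illustrative.

```python
# data[i][j] holds the n = 4 strengths for substrate i and bonding material j
data = [
    [[1.51, 1.96, 1.83, 1.98], [2.62, 2.82, 2.69, 2.93],
     [2.96, 2.82, 3.11, 3.11], [3.67, 3.40, 3.25, 2.90]],   # Al2O3 no bracket
    [[1.63, 1.80, 1.92, 1.71], [3.12, 2.94, 3.23, 2.99],
     [2.91, 2.93, 3.01, 2.93], [3.48, 3.51, 3.24, 3.45]],   # Al2O3 with bracket
    [[3.04, 3.16, 3.09, 3.50], [1.91, 2.11, 1.78, 2.25],
     [3.04, 2.91, 2.48, 2.83], [3.47, 3.42, 3.31, 3.76]],   # BeO
]
a, b, n = len(data), len(data[0]), len(data[0][0])
N = a * b * n


def mean(xs):
    return sum(xs) / len(xs)


# Cell, row, column and grand means (balanced design)
cell = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
row = [mean(cell[i]) for i in range(a)]
col = [mean([cell[i][j] for i in range(a)]) for j in range(b)]
grand = mean(row)

# Sums of squares as defined on the Two-Way ANOVA slide
ssa = b * n * sum((r - grand) ** 2 for r in row)
ssb = a * n * sum((c - grand) ** 2 for c in col)
ssab = n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
               for i in range(a) for j in range(b))
sse = sum((y - cell[i][j]) ** 2
          for i in range(a) for j in range(b) for y in data[i][j])
sst = sum((y - grand) ** 2
          for i in range(a) for j in range(b) for y in data[i][j])

# Mean squares and F statistics (each mean square divided by MSE)
mse = sse / (N - a * b)
f_a = (ssa / (a - 1)) / mse
f_b = (ssb / (b - 1)) / mse
f_ab = (ssab / ((a - 1) * (b - 1))) / mse

print(f"MSE = {mse:.4f}, F_A = {f_a:.2f}, F_B = {f_b:.2f}, F_AB = {f_ab:.2f}")
```

Each F value is then compared with the corresponding upper-α critical point of the F distribution, which this sketch leaves to a table or a statistics package.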
• Reject $H_{0A}$ if $F_A = \frac{MSA}{MSE} > f_{a-1,\,N-ab,\,\alpha}$
• Reject $H_{0B}$ if $F_B = \frac{MSB}{MSE} > f_{b-1,\,N-ab,\,\alpha}$
• Reject $H_{0AB}$ if $F_{AB} = \frac{MSAB}{MSE} > f_{(a-1)(b-1),\,N-ab,\,\alpha}$
• We test the interaction hypothesis $H_{0AB}$ first.

Summary (Table 13.5)

Source of Variation   Sum of Squares (SS)                                               Degrees of Freedom (d.f.)   Mean Square (MS)                       F
Main Effects A        $SSA = bn\sum_{i=1}^{a}\hat{\tau}_i^2$                            a − 1                       $MSA = \frac{SSA}{a-1}$                $F_A = \frac{MSA}{MSE}$
Main Effects B        $SSB = an\sum_{j=1}^{b}\hat{\beta}_j^2$                           b − 1                       $MSB = \frac{SSB}{b-1}$                $F_B = \frac{MSB}{MSE}$
Interaction AB        $SSAB = n\sum_{i=1}^{a}\sum_{j=1}^{b}\widehat{(\tau\beta)}_{ij}^2$   (a − 1)(b − 1)              $MSAB = \frac{SSAB}{(a-1)(b-1)}$       $F_{AB} = \frac{MSAB}{MSE}$
Error                 $SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}e_{ijk}^2$       N − ab                      $MSE = \frac{SSE}{N-ab}$
Total                 $SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$   N − 1

Example: Bonding Strength of Capacitors

    data Capacitors;
      * the trailing @@ lets several observations be read from one data line;
      input Bonding $ Substrate $ Strength @@;
      datalines;
    Epoxy1 Al203 1.51 Epoxy1 Al203 1.96
    Epoxy1 Al203 1.83 Epoxy1 Al203 1.98
    …
    ;
    proc glm plots=diagnostics data=Capacitors;
      title "Analysis of Bonding Strength of Capacitors";
      class Bonding Substrate;
      * Bonding | Substrate expands to both main effects plus their interaction;
      model Strength = Bonding | Substrate;
    run;

Bonding Strength of Capacitors ANOVA Table
• At α = 0.05, we reject $H_{0B}$ and $H_{0AB}$ but fail to reject $H_{0A}$.
• The main effect of bonding material and the interaction between bonding material and substrate are both significant.
• The main effect of substrate is not significant at this α.

Main Effects Plot
• Definition: a main effects plot is a line plot of the row means of factor A and the column means of factor B.
[Figure: Factor A main effects plot, mean response vs. substrate (Al2O3, Al2O3 + bracket, BeO); Factor B main effects plot, mean response vs. bonding material (Epoxy I, Epoxy II, Solder I, Solder II)]

Interaction Plot

Model Diagnostics with Residual Plots
• Why do we look at residual plots?
• Is our constant variance assumption true?
• Is our normality assumption true?

2^k Factorial Experiments
• 2^k factorial experiments are a class of multifactor experiments in which each factor is studied at 2 levels.
• If there are k factors, then we have $2^k$ treatment combinations
• 2-factor and 3-factor experiments can be generalized to experiments with more than 3 factors

2^2 experiment
• $2^2$ experiment: an experiment with factors A and B, each at two levels.
  ab = (A high, B high)
  b = (A low, B high)
  a = (A high, B low)
  (1) = (A low, B low)

2^2 experiment cont'd
Assume a balanced design with n observations for each treatment combination, and denote these observations by $y_{ij}$:
$Y_{ij} \sim N(\mu_i, \sigma^2)$, i = (1), a, b, ab, j = 1, 2, …, n

2^2 experiment cont'd
• Main effect of factor A ($\tau$): difference in the mean response between the high level of A and the low level of A, averaged over the levels of B
• Main effect of factor B ($\beta$): difference in the mean response between the high level of B and the low level of B, averaged over the levels of A
• Interaction effect of AB ($\tau\beta$): difference between the mean effect of A at the high level of B and at the low level of B
• $\tau = \frac{(\mu_{ab} - \mu_b) + (\mu_a - \mu_{(1)})}{2}$
• $\beta = \frac{(\mu_{ab} - \mu_a) + (\mu_b - \mu_{(1)})}{2}$
• $(\tau\beta) = \frac{(\mu_{ab} - \mu_b) - (\mu_a - \mu_{(1)})}{2}$

2^2 experiment cont'd
The least squares estimates of the main effects and the interaction effect are obtained by replacing the treatment means by the corresponding cell sample means.
• Est. Main Effect A = $\hat{\tau} = \frac{(\bar{y}_{ab} - \bar{y}_b) + (\bar{y}_a - \bar{y}_{(1)})}{2}$
• Est. Main Effect B = $\hat{\beta} = \frac{(\bar{y}_{ab} - \bar{y}_a) + (\bar{y}_b - \bar{y}_{(1)})}{2}$
• Est. Interaction AB = $\widehat{\tau\beta} = \frac{(\bar{y}_{ab} - \bar{y}_b) - (\bar{y}_a - \bar{y}_{(1)})}{2}$

2^2 experiment cont'd
Contrast Coefficients for Effects in a 2^2 Experiment

Treatment       Effect
combination     I    A    B    AB
(1)             +    -    -    +
a               +    +    -    -
b               +    -    +    -
ab              +    +    +    +

*Notice that the term-by-term product of any two of the contrast vectors A, B and AB equals the third one.

2^3 experiment
• $2^3$ experiment: an experiment with factors A, B, and C, with n observations at each treatment combination.
$Y_{ij} \sim N(\mu_i, \sigma^2)$, i = (1), a, b, ab, c, ac, bc, abc, j = 1, 2, …, n.
• Est. Main Effect A = $\frac{(\bar{y}_{abc} - \bar{y}_{bc}) + (\bar{y}_{ab} - \bar{y}_b) + (\bar{y}_{ac} - \bar{y}_c) + (\bar{y}_a - \bar{y}_{(1)})}{4}$
• Est. Main Effect B = $\frac{(\bar{y}_{abc} - \bar{y}_{ac}) + (\bar{y}_{ab} - \bar{y}_a) + (\bar{y}_{bc} - \bar{y}_c) + (\bar{y}_b - \bar{y}_{(1)})}{4}$
• Est. Main Effect C = $\frac{(\bar{y}_{abc} - \bar{y}_{ab}) + (\bar{y}_{ac} - \bar{y}_a) + (\bar{y}_{bc} - \bar{y}_b) + (\bar{y}_c - \bar{y}_{(1)})}{4}$

2^3 experiment cont'd
• Est. Interaction Effect AB = $\frac{\{(\bar{y}_{abc} - \bar{y}_{bc}) - (\bar{y}_{ac} - \bar{y}_c)\} + \{(\bar{y}_{ab} - \bar{y}_b) - (\bar{y}_a - \bar{y}_{(1)})\}}{4}$
• Est. Interaction Effect BC = $\frac{\{(\bar{y}_{abc} - \bar{y}_{ac}) - (\bar{y}_{ab} - \bar{y}_a)\} + \{(\bar{y}_{bc} - \bar{y}_c) - (\bar{y}_b - \bar{y}_{(1)})\}}{4}$
• Est. Interaction Effect AC = $\frac{\{(\bar{y}_{abc} - \bar{y}_{bc}) - (\bar{y}_{ab} - \bar{y}_b)\} + \{(\bar{y}_{ac} - \bar{y}_c) - (\bar{y}_a - \bar{y}_{(1)})\}}{4}$
• Est. Interaction Effect ABC = $\frac{\{(\bar{y}_{abc} - \bar{y}_{bc}) - (\bar{y}_{ac} - \bar{y}_c)\} - \{(\bar{y}_{ab} - \bar{y}_b) - (\bar{y}_a - \bar{y}_{(1)})\}}{4}$

2^3 experiment cont'd
Contrast Coefficients for Effects in a 2^3 Experiment

Treatment       Effect
Combination     I    A    B    AB   C    AC   BC   ABC
(1)             +    -    -    +    -    +    +    -
a               +    +    -    -    -    -    +    +
b               +    -    +    -    -    +    -    +
ab              +    +    +    +    -    -    -    -
c               +    -    -    +    +    -    -    +
ac              +    +    -    -    +    +    -    -
bc              +    -    +    -    +    -    +    -
abc             +    +    +    +    +    +    +    +

2^3 experiment example
Factors affecting bicycle performance:
• Seat height (Factor A): 26" (-), 30" (+)
• Generator (Factor B): Off (-), On (+)
• Tire pressure (Factor C): 40 psi (-), 55 psi (+)

2^3 experiment example cont'd
Travel Times from Bicycle Experiment

Factor          Time (secs.)
A    B    C     Run 1   Run 2   Mean
-    -    -     51      54      52.5
+    -    -     41      43      42.0
-    +    -     54      60      57.0
+    +    -     44      43      43.5
-    -    +     50      48      49.0
+    -    +     39      39      39.0
-    +    +     53      51      52.0
+    +    +     41      44      42.5

2^3 experiment example cont'd
• A = $\frac{(42.5 - 52.0) + (39.0 - 49.0) + (43.5 - 57.0) + (42.0 - 52.5)}{4} = -10.875$
• B = $\frac{(42.5 - 39.0) + (52.0 - 49.0) + (43.5 - 42.0) + (57.0 - 52.5)}{4} = 3.125$
• C = $\frac{(42.5 - 43.5) + (52.0 - 57.0) + (39.0 - 42.0) + (49.0 - 52.5)}{4} = -3.125$
• AB = $\frac{\{(42.5 - 52.0) - (39.0 - 49.0)\} + \{(43.5 - 57.0) - (42.0 - 52.5)\}}{4} = -0.625$
• AC = $\frac{\{(42.5 - 52.0) - (43.5 - 57.0)\} + \{(39.0 - 49.0) - (42.0 - 52.5)\}}{4} = 1.125$
• BC = $\frac{\{(42.5 - 39.0) - (43.5 - 42.0)\} + \{(52.0 - 49.0) - (57.0 - 52.5)\}}{4} = 0.125$
• ABC = $\frac{\{(42.5 - 52.0) - (39.0 - 49.0)\} - \{(43.5 - 57.0) - (42.0 - 52.5)\}}{4} = 0.875$

2^k experiment
• $2^k$ experiments, where k > 3.
• With n i.i.d. observations $y_{ij}$ (j = 1, 2, …, n) at the ith treatment combination and sample means $\bar{y}_i$ (i = 1, 2, …, $2^k$), each estimated effect has the form
  Est. Effect = $\frac{\sum_{i=1}^{2^k} c_i \bar{y}_i}{2^{k-1}}$, where each contrast coefficient $c_i = \pm 1$.

Statistical Inference for 2^k Experiments

Basic Notations and Derivations
• Est. Effect = $\frac{\sum_{i=1}^{2^k} c_i \bar{y}_i}{2^{k-1}}$
• Var(Est. Effect) = $\frac{\sum_{i=1}^{2^k} c_i^2\,\mathrm{Var}(\bar{Y}_i)}{(2^{k-1})^2} = \frac{\sum_{i=1}^{2^k}(\pm 1)^2(\sigma^2/n)}{(2^{k-1})^2} = \frac{(\sigma^2/n)\,2^k}{2^{2k-2}} = \frac{\sigma^2}{n2^{k-2}}$
• SE(Est. Effect) = $\frac{s}{\sqrt{n2^{k-2}}}$
• $s^2 = MSE = \frac{\sum_{i=1}^{2^k}\sum_{j=1}^{n}(y_{ij} - \bar{y}_i)^2}{2^k(n-1)}$, with d.f. $\nu = 2^k(n-1)$

CI and Hypotheses Test with t Test
• A CI for any population effect is therefore given by
  Est. Effect $\pm\ t_{\nu,\alpha/2}\,\frac{s}{\sqrt{n2^{k-2}}}$
• The t-statistic for testing the significance of any estimated effect is
  $t_{\text{Effect}} = \frac{\text{Est. Effect}}{SE(\text{Est. Effect})} = \frac{(\text{Est. Effect})\sqrt{n2^{k-2}}}{s}$

Hypotheses Test with F Test
• Equivalently, we can use an F test:
  $F_{\text{Effect}} = t_{\text{Effect}}^2 = \frac{(n2^{k-2})(\text{Est. Effect})^2}{s^2}$
• The estimated effect is significant at level α if
  $|t_{\text{Effect}}| > t_{\nu,\alpha/2} \iff F_{\text{Effect}} > t_{\nu,\alpha/2}^2 = f_{1,\nu,\alpha}$

Sums of Squares for Effects
• $F_{\text{Effect}} = \frac{(n2^{k-2})(\text{Est. Effect})^2}{s^2} = \frac{MS_{\text{Effect}}}{MSE}$
• $\Rightarrow SS_{\text{Effect}} = MS_{\text{Effect}} = (n2^{k-2})(\text{Est. Effect})^2$ (each effect has 1 d.f.)
• $SS_{\text{Treatments}} = n\sum_{i=1}^{2^k}(\bar{y}_i - \bar{y})^2 = SS_1 + SS_2 + \cdots + SS_{2^k-1}$
The effects are mutually orthogonal contrasts.

Regression Approach to 2^k Experiments
• A $2^2$ experiment:
  $x_1 = -1$ if A is low, $+1$ if A is high
  $x_2 = -1$ if B is low, $+1$ if B is high
• Multiple regression model: $E(Y) = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_{12} x_1 x_2$
• $\gamma_0 = \mu$, $\gamma_1 = \frac{\tau}{2}$, $\gamma_2 = \frac{\beta}{2}$ and $\gamma_{12} = \frac{(\tau\beta)}{2}$

Regression Approach to 2^k Experiments
• $2^3$ experiment:
  $E(Y) = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_3 + \gamma_{12} x_1 x_2 + \gamma_{13} x_1 x_3 + \gamma_{23} x_2 x_3 + \gamma_{123} x_1 x_2 x_3$
• $\hat{y} = \bar{y} + \frac{\hat{A}}{2}x_1 + \frac{\hat{B}}{2}x_2 + \frac{\hat{C}}{2}x_3 + \frac{\widehat{AB}}{2}x_1 x_2 + \frac{\widehat{AC}}{2}x_1 x_3 + \frac{\widehat{BC}}{2}x_2 x_3 + \frac{\widehat{ABC}}{2}x_1 x_2 x_3$
• If all interactions are dropped from the model, the new fitted model is $\hat{y} = \bar{y} + \frac{\hat{A}}{2}x_1 + \frac{\hat{B}}{2}x_2 + \frac{\hat{C}}{2}x_3$

Regression Approach to 2^k Experiments
• The interpolation formula:
  $x_i = \frac{\text{Specified level} - \text{Average level}}{\text{Range}/2} = \frac{\text{Specified level} - (\text{High} + \text{Low})/2}{(\text{High} - \text{Low})/2}$

Bicycle Example: Main Effects Model
• A (seat height) = -10.875, B (generator) = 3.125, C (tire pressure) = -3.125, $\bar{y} = 47.1875$
• Main effects model: $\hat{y} = \bar{y} + \frac{\hat{A}}{2}x_1 + \frac{\hat{B}}{2}x_2 + \frac{\hat{C}}{2}x_3$
  $\hat{y} = 47.1875 - \frac{10.875}{2}x_1 + \frac{3.125}{2}x_2 - \frac{3.125}{2}x_3$
• Minimum travel time (seat high, generator off, tire pressure high):
  $\hat{y} = 47.1875 - 5.4375(+1) + 1.5625(-1) - 1.5625(+1) = 38.625$ sec

Bicycle Example: Main Effects Model
Sums of squares for the omitted interaction effects, each computed as $n2^{k-2}(\text{Est. Effect})^2 = 4(\text{Est. Effect})^2$:
• $SS_{\text{interactions}} = SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC} = 1.5625 + 5.0625 + 0.0625 + 3.0625 = 9.75$, d.f. = 4
• Pure SSE = 33.5, d.f. = 8
• Pooled SSE = 33.5 + 9.75 = 43.25, d.f. = 12 (total)
• MSE = $\frac{SSE}{\text{d.f.}} = \frac{43.25}{12} = 3.604$

Bicycle Example: Residual Diagnostics
To check model assumptions, examine the residuals $e_{ij} = y_{ij} - \hat{y}_{ij}$:
• Normality
• Equal error variance

    proc glm plots=diagnostics data=biker;
      class A B C;
      * A|B|C fits all main effects and interactions of the three factors;
      model travel = A|B|C;
    run;

Single Replicated Case
• Unreplicated case: n = 1
• Problems in statistical testing
  • Unusual response? Noise? Spoiling the results?
  • 0 degrees of freedom for error
  • Cannot use formal tests and C.I.s to estimate the error and assess effects
• Potential solutions
  • Pooling high-order interactions to estimate error
  • Graphical approach: normal plot of the estimated effects
• Estimated effects
  • Independent, orthogonal, normally distributed, with common variance $\frac{\sigma^2}{n2^{k-2}}$

Single Replicated Case
• Effect sparsity principle
  • If the number of effects is large (e.g. k = 4, 15 effects), a majority of them are small, ~ $N(0, \sigma^2)$, and a few are large and more influential, ~ $N(\mu \neq 0, \sigma^2)$
• Reduced model
  • Retain only the significant effects and omit the non-significant ones
  • Obtain the sums of squares for the omitted effects => pooled error sum of squares (SSE), i.e. the error due to ignoring negligible effects
  • Error d.f. = number of pooled omitted effects
  • MSE = SSE / error d.f.
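The pooling steps above can be sketched with the bicycle data from the earlier 2^3 example. This is an illustrative plain-Python sketch, not the SAS analysis from the slides. It estimates every effect from the contrast signs, converts each to a sum of squares with $n2^{k-2}(\text{Est. Effect})^2$, and pools the four interaction sums of squares with the pure error. The names `runs`, `effect`, `names` and `pooled_ss` are assumptions made for the example.

```python
from math import prod

# Two runs per treatment combination, keyed by the (A, B, C) levels coded as -1 / +1
runs = {
    (-1, -1, -1): [51, 54],
    (+1, -1, -1): [41, 43],
    (-1, +1, -1): [54, 60],
    (+1, +1, -1): [44, 43],
    (-1, -1, +1): [50, 48],
    (+1, -1, +1): [39, 39],
    (-1, +1, +1): [53, 51],
    (+1, +1, +1): [41, 44],
}
k, n = 3, 2
means = {x: sum(ys) / n for x, ys in runs.items()}


def effect(factors):
    """Estimated effect: contrast of the cell means divided by 2^(k-1).

    The contrast coefficient of each cell is the product of the coded
    levels of the factors involved in the effect.
    """
    contrast = sum(prod(x[f] for f in factors) * m for x, m in means.items())
    return contrast / 2 ** (k - 1)


names = {"A": (0,), "B": (1,), "C": (2,),
         "AB": (0, 1), "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2)}
est = {name: effect(f) for name, f in names.items()}

# Sum of squares for each effect (1 d.f. each)
ss = {name: n * 2 ** (k - 2) * e ** 2 for name, e in est.items()}

# Pure error from the replicates, with 2^k (n - 1) d.f.
pure_sse = sum((y - means[x]) ** 2 for x, ys in runs.items() for y in ys)
pure_df = 2 ** k * (n - 1)

# Pool the omitted interaction effects into the error term
omitted = ("AB", "AC", "BC", "ABC")
pooled_ss = sum(ss[name] for name in omitted)
mse = (pure_sse + pooled_ss) / (pure_df + len(omitted))

for name, e in est.items():
    print(f"{name:>3}: effect = {e:7.3f}, SS = {ss[name]:8.4f}")
print(f"pooled MSE = {mse:.3f} on {pure_df + len(omitted)} d.f.")
```

In an unreplicated design (n = 1) the pure error term has 0 d.f., so the pooled interaction sums of squares would be the only source for the MSE.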
  • Perform formal statistical inferences

Other Types of Two-Factor Experiments
Section 13.3

Two-Factor Experiments with (Crossed and) Mixed Factors
• A is a fixed factor with a levels
• B is a random factor with b levels
• Assume a balanced design with n ≥ 2 observations at each of the a x b treatment combinations

Example:
• Compare three testing laboratories
• Material tested comes in batches
• Several samples from each batch are tested in each laboratory
• Laboratories represent a fixed factor
• Batches represent a random factor
• The two factors are crossed, since samples from each batch are tested in each laboratory
• Model?

Mixed Effects Model
• $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
  • $\mu$, $\tau_i$ are fixed parameters
  • $\beta_j$, $(\tau\beta)_{ij}$ are random parameters
  • $\epsilon_{ijk}$ are i.i.d. $N(0, \sigma^2)$ random errors

The (Probability) Distribution of the Random Effects
• The random $\beta_j$ are the main effects of B, which are assumed to be i.i.d. $N(0, \sigma_\beta^2)$, where $\sigma_\beta^2$ is called the variance component of the B (random factor) main effect. The density of $\beta_j$ is therefore
  $f_{\beta_j}(x) = \frac{1}{\sqrt{2\pi\sigma_\beta^2}}\exp\left(-\frac{x^2}{2\sigma_\beta^2}\right)$

Variance Components Model
• $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
• $\mathrm{Var}(Y_{ijk}) = \sigma_Y^2 = \sigma_B^2 + \sigma_{AB}^2 + \sigma^2$
• SST = SSA + SSB + SSAB + SSE (same as the fixed-effects model)

Expected Mean Squares
• $E(MSA) = \sigma^2 + n\sigma_{AB}^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a-1}$
• $E(MSB) = \sigma^2 + n\sigma_{AB}^2 + an\sigma_B^2$
• $E(MSAB) = \sigma^2 + n\sigma_{AB}^2$
• $E(MSE) = \sigma^2$

Unbiased estimators of variance components
• $\hat{\sigma}^2 = MSE$
• $\hat{\sigma}_{AB}^2 = (MSAB - \hat{\sigma}^2)/n$
• $\hat{\sigma}_B^2 = (MSB - \hat{\sigma}^2 - n\hat{\sigma}_{AB}^2)/(an)$

Common tests
• $H_{0A}: \tau_1 = \cdots = \tau_a = 0$ vs. $H_{1A}$: at least one $\tau_i \neq 0$
• $H_{0B}: \sigma_B^2 = 0$ vs. $H_{1B}: \sigma_B^2 > 0$
• $H_{0AB}: \sigma_{AB}^2 = 0$ vs. $H_{1AB}: \sigma_{AB}^2 > 0$

Common tests: results
• Reject $H_{0A}$ if $F_A = MSA/MSAB > f_{a-1,\,(a-1)(b-1),\,\alpha}$
• Reject $H_{0B}$ if $F_B = MSB/MSAB > f_{b-1,\,(a-1)(b-1),\,\alpha}$
• Reject $H_{0AB}$ if $F_{AB} = MSAB/MSE > f_{(a-1)(b-1),\,\nu,\,\alpha}$, where $\nu = N - ab$ is the error d.f.

Two-Factor Experiments w. Nested and Mixed Factors
• Model:
• Where,

Two-Factor Experiments w. Nested and Mixed Factors
• Orthogonal Decomposition of Sum of Squares

Two-Factor Experiments w. Nested and Mixed Factors
• ANOVA Table

Illustrative Example
• Consider the following experiment:
  • A: Concentration of Reactant
  • B: Concentration of Catalyst

Analysis with SAS
• Code

Analysis with SAS
• Selected Output

Summary
• Two-factor experiments with multiple levels
• Model: $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
• We can decompose the sum of squares as: SST = SSA + SSB + SSAB + SSE
• And compute test statistics under $H_0$ as: $F_A = \frac{MSA}{MSE}$, $F_B = \frac{MSB}{MSE}$, $F_{AB} = \frac{MSAB}{MSE}$

Summary
• 2^k Factorial Experiments
• k factors, 2 levels each
• Calculate the sum of squares due to an effect as $SS_{\text{Effect}} = (n2^{k-2})(\text{Est. Effect})^2$

Acknowledgements
• Tamhane, Ajit C., and Dorothy D. Dunlop. "Analysis of Multifactor Experiments." Statistics and Data Analysis: From Elementary to Intermediate. Upper Saddle River, NJ: Prentice Hall, 2000.
• Cody, Ronald P., and Jeffrey K. Smith. "Analysis of Variances: Two Independent Variables." Applied Statistics and the SAS Programming Language. 5th ed. Upper Saddle River, NJ: Prentice Hall, 2006.
• Prof. Wei Zhu
• Previous Presentations