Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 1 What If There Are More Than Two Factor Levels? • The t-test does not directly apply • There are lots of practical situations where there are either more than two levels of interest, or there are several factors of simultaneous interest • The analysis of variance (ANOVA) is the appropriate analysis “engine” for these types of experiments • The ANOVA was developed by Fisher in the early 1920s, and initially applied to agricultural experiments • Used extensively today for industrial experiments Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 2 An Example (See pg. 61) • An engineer is interested in investigating the relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate. • The response variable is etch rate. • She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at each level of RF power. • The experimenter chooses 4 levels of RF power 160W, 180W, 200W, and 220W • The experiment is replicated 5 times – runs made in random order Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 3 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 4 An Example (See pg. 62) Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 5 • Does changing the power change the mean etch rate? • Is there an optimum level for power? • We would like to have an objective way to answer these questions • The t-test really doesn’t apply here – more than two factor levels Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 6 The Analysis of Variance (Sec. 3.2, pg. 62) • In general, there will be a levels of the factor, or a treatments, and n replicates of the experiment, run in random order…a completely randomized design (CRD) • N = an total runs • We consider the fixed effects case…the random effects case will be discussed later • Objective is to test hypotheses about the equality of the a treatment means Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 7 The Analysis of Variance • The name “analysis of variance” stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment • The basic single-factor ANOVA model is i 1, 2,..., a yij i ij , j 1, 2,..., n an overall mean, i ith treatment effect, ij experimental error, NID(0, 2 ) Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 8 Models for the Data There are several ways to write a model for the data: yij i ij is called the effects model Let i i , then yij i ij is called the means model Regression models can also be employed Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 9 Efeitos fixos ou efeitos aleatórios? • O modelo estatístico Yij =+i +ij pode descrever duas situações diferentes com respeito aos efeitos de tratamento. • Primeiro, os a tratamentos poderiam ser especificamente escolhidos pelo experimentador. • Nessa situação, deseja-se testar hipóteses sobre as médias de tratamento, e nossas conclusões se aplicam somente aos níveis do fator considerados na análise. • As conclusões não podem ser estendidas para tratamentos similares que não foram explicitamente considerados. • Podemos também estimar os parâmetros do modelo (,i,2). • Esse é o chamado modelo de efeitos fixos. Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 10 Efeitos fixos ou efeitos aleatórios? • Alternativamente, os a tratamentos poderiam representar uma amostra aleatória de uma população maior de tratamentos. • Nessa situação, deveríamos ser capazes de estender as conclusões – baseadas numa amostra de tratamentos para todos os tratamentos na população, tendo eles sido explicitamente considerados ou não na análise. • Os i’s são variáveis aleatórias. • Nesse caso testamos hipóteses sobre a variabilidade dos i’s e tentamos estimar essa variabilidade. • Modelo de efeitos aleatórios ou modelo de componentes de variância (Capítulo 13). Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 11 The Analysis of Variance of the Fixed Effects Model • Total variability is measured by the total sum of squares: a n SST ( yij y.. ) 2 i 1 j 1 • The basic ANOVA partitioning is: a n a n 2 ( y y ) [( y y ) ( y y )] ij .. i. .. ij i. 2 i 1 j 1 i 1 j 1 a a n n ( yi. y.. ) 2 ( yij yi. ) 2 i 1 i 1 j 1 SST SSTreatments SS E Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 12 The Analysis of Variance SST SSTreatments SSE • A large value of SSTreatments reflects large differences in treatment means • A small value of SSTreatments likely indicates no differences in treatment means • Formal statistical hypotheses are: H 0 : 1 2 a H1 : At least one mean is different Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 13 The Analysis of Variance • While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared. • A mean square is a sum of squares divided by its degrees of freedom: dfTotal dfTreatments df Error an 1 a 1 a(n 1) SSTreatments SS E MSTreatments , MS E a 1 a(n 1) • If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal. • If treatment means differ, the treatment mean square will be larger than the error mean square. Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 14 Teorema de Cochran Z i ~ NID(0,1) para i 1,2,...,n n Z i 1 2 i Q1 Q2 ... Qv , com ν n e Q j t em n j graus de liberdade ( j 1,..., ). Então, Q1 , Q2 ,...,Q são independentement edist ribuídas com Q j ~ n2j se, e somentese, n n1 n2 ... nv Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 15 The Analysis of Variance is Summarized in a Table • Computing…see text, pp 69 • The reference distribution for F0 is the Fa-1, a(n-1) distribution • Reject the null hypothesis (equal treatment means) if F0 F ,a1,a( n1) Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 16 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 17 ANOVA Table Example 3-1 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 18 The Reference Distribution: P-value Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 19 ANOVA calculations are usually done via computer • Text exhibits sample calculations from three very popular software packages, Design-Expert, JMP and Minitab • See pages 98-100 • Text discusses some of the summary statistics provided by these packages Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 20 ANOVA NO R • No R está disponível a função aov(analysis of variance). aov(stats) R Documentation Fit an Analysis of Variance Model Description Fit an analysis of variance model by a call to lm for each stratum. Usage aov(formula, data = NULL, projections = FALSE, qr = TRUE, contrasts = NULL, ...) Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 21 formula a formula specifying the model. data A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way. projections Logical flag: should the projections be returned? qr Logical flag: should the QR decomposition be returned? contrasts A list of contrasts to be used for some of the factors in the formula. These are not used for any Error term, and supplying contrasts for factors only in the Error term will give a warning. Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 22 dados=read.table(“m://flavia//etche.txt”,header=T) boxplot(dados$y~dados$rf) Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 23 summary(aov(dados$y~dados$rf)) Df Sum Sq Mean Sq F value Pr(>F) dados$rf 3 66871 22290 66.797 2.883e-09 *** Residuals 16 5339 334 --Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 24 Estimação dos parâmetros do modelo Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 25 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 26 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 27 Intervalos de confiança simultâneos • As expressões dadas em 3.12 e 3.13 valem separadamente. Isto é, o nível de confiança 1- aplica-se somente a uma particular estimativa. • Porém, em muitos problemas, o experimentador pode desejar calcular vários intervalos de confiança. • Se existem r intervalos de nível 1- , então a probabilidade de que esses r intervalos sejam válidos conjuntamente é pelo menos 1-r . • O número de intervalos simultâneos, r, não deve ser muito grande. • Por exemplo, se r=5 e =0,05, o intervalo simultâneo teria confiança de pelo menos 75% e se r=10, com o mesmo , o nível seria de pelo menos 50%. • Uma abordagem que assegura que o nível de confiança simultâneo não seja tão pequeno é substituir o /2 por /(2r) em cada uma das experssões 3.12, 3.13. Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 28 Intervalos de confiança simultâneos • Esse método é conhecido como método de Bonferroni. • Essencialmente, se existe um conjunto de r intervalos de confiança a serem construídos o método propõe substituir α/2 por α/(2r). • Isso produz um conjunto de r intervalos de confiança para os quais o nível de confiança simultâneo é pelo menos 100(1-α)%. • A justificativa para o método é devida à desigualdade de Bonferroni: r r P Eic 1 i 1 Chapter 3 PE i i 1 Design & Analysis of Experiments 7E 2009 Montgomery 29 Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 30 Model Adequacy Checking in the ANOVA Text reference, Section 3.4, pg. 75 • • • • • • Checking assumptions is important Normality Constant variance Independence Have we fit the right model? Later we will talk about what to do if some of these assumptions are violated Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 31 Model Adequacy Checking in the ANOVA • Examination of residuals (see text, Sec. 3-4, pg. 75) eij yij yˆij yij yi. • Computer software generates the residuals • Residual plots are very useful • Normal probability plot of residuals Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 32 Other Important Residual Plots Chapter 3 Design & Analysis of Experiments 7E 2009 Montgomery 33