CH 10: ANALYSIS OF VARIANCE

WHAT IS ANOVA?
In chapter 8 we covered hypothesis testing. We now need to understand how to compare more than two means, so 3, 4 or more means will be tested at the same time. We will follow the same 4 step process as in chapter 8.

The null hypothesis will now be
H0: µ1 = µ2 = µ3 = … = µk
The alternative hypothesis will always be
H1: at least two means differ.

The test does not tell us which population means differ; it only tests whether they differ. You will learn how to find out which means differ in more advanced Statistics courses.

So why is this hypothesis test called the analysis of variance (ANOVA)? Even though we are testing whether more than 2 population means differ or not, we do not compare the means directly in the test. Instead, we compare the variation between the groups with the variation within the groups and then decide whether the means are equal or not.

CONCEPTS AND DEFINITIONS
Independent Variable: This is the factor/variable of interest. It is the variable that is divided into the different groups/treatments. E.g. we want to compare 3 types of learning materials for educational research (Type A, Type B, Type C). Each learning material would be a different group/treatment.
Dependent Variable: This is the response variable. E.g. the test scores obtained in each group/treatment after the different types of learning materials were administered.
nj: Number of observations in the jth group/treatment
k: Number of groups/treatments
n: Total number of observations in ALL the groups/treatments (n = n1 + n2 + … + nk)
yij: The ith observation in the jth group/treatment (e.g. y11; y35; y42; y73)
ȳj: The mean of the observations in the jth group/treatment
ȳ: The grand mean = the mean of ALL the observations

ASSUMPTIONS FOR ANOVA
1. Observations are assumed to be normally distributed within each group or treatment.
2. Observations are from a random sample and are independent.
3. The variances of all groups/treatments are assumed to be equal (σ1² = σ2² = σ3² = ⋯ = σk²).

HYPOTHESIS TESTING FOR MORE THAN 2 POPULATION MEANS (ANOVA)

Step 1: Hypotheses
H0: µ1 = µ2 = µ3 = … = µk
H1: at least two means differ.

Step 2: CV and RR
α can be 10% or 5% or 2.5% or 1% or 0.5%
d.f. (numerator) = ν1 = k – 1
d.f. (denominator) = ν2 = n – k
We use the F-distribution to find the critical value for an ANOVA test. It is a smooth curve that is skewed to the right, and F is never negative. We therefore always have a one-tailed test with the rejection region in the right-hand tail (Page A10-A19).
CV: Fcrit = Fα, ν1, ν2 (for α = 0.05: F0.05, k–1, n–k)
RR: {F│F > F0.05, k–1, n–k}

Step 3: Test Statistic
In order to test whether or not differences exist between more than 2 populations, we have to calculate 2 types of deviations.
o First we need to see what the deviation or difference is between the groups/treatments, and then what the deviation or difference is within the groups/treatments. But how do we do this?
o We use sums of squares to measure these deviations or differences.
o This involves 6 steps:

1) TSS = Total Sum of Squares = SST + SSE
Measures the total deviation between treatments and within treatments.
$TSS = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2 = \sum_{j=1}^{k} \sum_{i=1}^{n_j} y_{ij}^2 - \frac{\left( \sum_{j=1}^{k} \sum_{i=1}^{n_j} y_{ij} \right)^2}{n}$

2) SST = Sum of Squares for Treatments
Measures the deviation between each group or treatment mean and the grand mean.
$SST = \sum_{j=1}^{k} n_j (\bar{y}_j - \bar{y})^2$

3) SSE = Sum of Squares for Error = TSS – SST
Measures the deviation between each observation and its group or treatment mean.
$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2$

4) MST = Mean Square for Treatment
Measures the average or mean of the sum of squares for treatment.
MST = SST / (k – 1)

5) MSE = Mean Square for Error
Measures the average or mean of the sum of squares for error.
MSE = SSE / (n – k)

6) F-Statistic = F = MST / MSE

ANOVA OUTPUT TABLE
Source of variation   Sum of squares   D.f.    Mean square           F-statistic
Treatments (T)        SST              k – 1   MST = SST / (k – 1)   F = MST / MSE
Sampling error (E)    SSE              n – k   MSE = SSE / (n – k)
Total                 TSS              n – 1

Step 4: Conclusion
Option 1: Reject the null hypothesis if Fstat > Fcrit, which means there is enough statistical evidence at the 5% significance level to conclude that the population means are not all equal, i.e. at least 2 means differ.
Option 2: Do not reject the null hypothesis if Fstat < Fcrit, which means there is not enough statistical evidence at the 5% significance level to conclude that at least 2 population means differ.

Exercise 1
A university president collects data showing the number of absences over the past academic year for a random sample of six professors in the Faculty of Science. She does the same for a random sample of nine professors in the Economic and Management Sciences (EMS) Faculty and for a random sample of eight professors in the Faculty of Arts.

Absences
Faculty of Science   EMS Faculty   Faculty of Arts
8                    5             9
10                   7             10
6                    6             10
8                    7             9
4                    7             7
8                    6             5
                     8             13
                     8             7
                     1

Test at the 5% significance level whether there is sufficient evidence to infer that the mean absences of the 3 faculties differ. Use the 4 step process.

Solution:

Step 1: Hypotheses
H0: µ1 = µ2 = µ3
H1: at least two means differ

Step 2: CV and RR
α = 5% = 0.05
ν1 = d.f. (numerator) = k – 1 = 3 – 1 = 2
So we look in the top row for numerator df = 2 (the critical value is somewhere down that column).
ν2 = d.f. (denominator) = n – k = 23 – 3 = 20
So we look for 20 down the first column (on page A14).
We look in the block of values that intersect at numerator df = 2 and denominator df = 20 and α = 0.05.
Fcrit = F0.05, 2, 20 = 3.49
RR: {F│F > F0.05, 2, 20} = {F│F > 3.49}

Step 3: Test Statistic
Group (or treatment)        Science   EMS      Arts
Sample size = nj (n = 23)   6         9        8
Total = ∑yij                44        55       70
Total = ∑yij²               344       373      654
Mean = ȳj                   7.3333    6.1111   8.75

Grand Total = ∑∑yij = 44 + 55 + 70 = 169
Grand Mean = ȳ = 169/23 = 7.3478

Step 3.1: TSS = ∑∑yij² – (∑∑yij)²/n = (344 + 373 + 654) – (44 + 55 + 70)²/23 = 1371 – 28561/23 = 1371 – 1241.782609 = 129.2174
Step 3.2: SST = ∑nj(ȳj – ȳ)² = 6 × (7.3333 – 7.3478)² + 9 × (6.1111 – 7.3478)² + 8 × (8.75 – 7.3478)² = 29.4954
Step 3.3: SSE = TSS – SST = 129.2174 – 29.4954 = 99.722
Step 3.4: MST = SST / (k – 1) = 29.4954 / 2 = 14.7477
Step 3.5: MSE = SSE / (n – k) = 99.722 / 20 = 4.9861
Step 3.6: F = MST / MSE = 14.7477 / 4.9861 = 2.9578

Source of variation   Sum of squares   D.f.          Mean square   F-statistic
Treatments (T)        29.4954          3 – 1 = 2     14.7477       2.9578
Sampling error (E)    99.722           23 – 3 = 20   4.9861
Total                 129.2174         23 – 1 = 22

Step 4: Conclusion
F(critical) = 3.49 > F(statistic) = 2.9578
Therefore, the F statistic does not fall in the rejection region and we do not reject the null hypothesis. There is not sufficient evidence at the 5% significance level to infer that the mean absences of the three faculties differ.

Exercise 2
In a collaborative trial, four laboratories were sent samples from a reservoir and requested to perform ten assays and report the results based on percentage of a labelled amount of the drug (see Table 1). Were there any significant differences based on the laboratory performing the analysis? Table 2 shows the partial results of the ANOVA test output. Answer this question using the 4 step process.
Table 1: Descriptive data for four different laboratories (rounded off to four decimals)
Groups    Lab (A)   Lab (B)   Lab (C)   Lab (D)
Count     10        10        10        10
Sum       999       996.9     995.1     1000
Average   99.9      99.69     99.51     100
Variance  0.0622    0.1721    0.0477    0.1156

Table 2: ANOVA table for four different laboratories (rounded off to four decimals)
Source of Variation   SS       df    MS       F     F crit
Between Groups (T)    1.4370   ___   0.4790   ___   ___
Within Groups (E)     3.5780   36
Total                 5.0150   ___

Step 1: Hypotheses
H0: µ1 = µ2 = µ3 = µ4
H1: at least two means differ

Step 2: CV and RR
α = not given, so use the default of 5% = 0.05 (can be different)
ν1 = d.f. (numerator) = k – 1 = 4 – 1 = 3
So we look in the top row for numerator df = 3 (the critical value is somewhere down that column).
ν2 = d.f. (denominator) = n – k = 40 – 4 = 36
So we look for 36 down the first column (on page A18), but 36 is not there, so we round off to the nearest ten. That means we round up to 40.
We look in the block of values that intersect at numerator df = 3 and denominator df = 40 and α = 0.05.
Fcrit = F0.05, 3, 40 = 2.84
RR: {F│F > F0.05, 3, 40} = {F│F > 2.84}

Step 3: Test Statistic
Df (numerator) = k – 1 = 4 – 1 = 3 (done in step 2)
Df (total) = n – 1 = 40 – 1 = 39 (also = 3 + 36)
Check that the given MS is correct: MST = 1.4370 / 3 = 0.479
MSE = SSE / (n – k) = 3.5780 / 36 = 0.0994
F stat = MST / MSE = 0.479 / 0.0994 = 4.8189

Source of Variation   SS       df   MS       F        F crit
Between Groups (T)    1.4370   3    0.4790   4.8189   2.84
Within Groups (E)     3.5780   36   0.0994
Total                 5.0150   39

Step 4: Conclusion
Compare F(crit) to F(stat). F(crit) = 2.84 < F(stat) = 4.8189 and therefore the F statistic falls in the rejection region. There is sufficient statistical evidence to infer that at least two means differ.
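
The arithmetic in both exercises can be cross-checked with a few lines of Python. The sketch below is not part of the course material: the function name one_way_anova and all variable names are illustrative, and the critical values are simply the F-table values quoted in the solutions, not computed by the script.

```python
# Minimal one-way ANOVA sketch in pure Python (no external libraries).
# Names are illustrative assumptions, not taken from the course material.

def one_way_anova(groups):
    """Return (SST, SSE, TSS, MST, MSE, F) for a list of samples."""
    k = len(groups)                          # number of groups/treatments
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # SST: each group mean versus the grand mean, weighted by group size
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # SSE: each observation versus its own group mean
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    tss = sst + sse                          # TSS = SST + SSE

    mst = sst / (k - 1)                      # numerator d.f. = k - 1
    mse = sse / (n - k)                      # denominator d.f. = n - k
    return sst, sse, tss, mst, mse, mst / mse


# Exercise 1: absences per faculty (raw data from the exercise)
science = [8, 10, 6, 8, 4, 8]
ems = [5, 7, 6, 7, 7, 6, 8, 8, 1]
arts = [9, 10, 10, 9, 7, 5, 13, 7]

sst, sse, tss, mst, mse, f_stat = one_way_anova([science, ems, arts])
f_crit_ex1 = 3.49                            # F(0.05; 2, 20) from the F table
reject_ex1 = f_stat > f_crit_ex1
print(f"Exercise 1: SST={sst:.4f} SSE={sse:.4f} TSS={tss:.4f} "
      f"F={f_stat:.4f} reject H0: {reject_ex1}")

# Exercise 2: only the sums of squares are given, so complete Table 2 directly
k2, n2 = 4, 40
sst2, sse2 = 1.4370, 3.5780
mst2 = sst2 / (k2 - 1)
mse2 = sse2 / (n2 - k2)
f_stat2 = mst2 / mse2
f_crit_ex2 = 2.84                            # F(0.05; 3, 40) as used in Step 2
reject_ex2 = f_stat2 > f_crit_ex2
print(f"Exercise 2: MST={mst2:.4f} MSE={mse2:.4f} "
      f"F={f_stat2:.4f} reject H0: {reject_ex2}")
```

The script keeps full precision, whereas the hand calculations round intermediate values to four decimals, so the two agree only to about three decimal places. The decisions are the same: do not reject H0 in Exercise 1, reject H0 in Exercise 2.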