Course: A0392 – Statistik Ekonomi (Economic Statistics)
Year: 2006

Session 10
One-Way Analysis of Variance

Outline:
• The ANOVA table model for one-way classification
• ANOVA with equal replications
• ANOVA with unequal replications

Analysis of Variance
• The Completely Randomized Design: One-Way Analysis of Variance
  – ANOVA Assumptions
  – F Test for Difference in c Means
  – The Tukey-Kramer Procedure

General Experimental Setting
• Investigator Controls One or More Independent Variables
  – Called treatment variables or factors
  – Each treatment factor contains two or more groups (or levels)
• Observe Effects on Dependent Variable
  – Response to groups (or levels) of independent variable
• Experimental Design: The Plan Used to Test Hypothesis

Completely Randomized Design
• Experimental Units (Subjects) are Assigned Randomly to Groups
  – Subjects are assumed to be homogeneous
• Only One Factor or Independent Variable
  – With 2 or more groups (or levels)
• Analyzed by One-Way Analysis of Variance (ANOVA)

Randomized Design Example
[Diagram: Factor (Training Method) → Factor Levels (Groups) → Randomly Assigned Units → Dependent Variable (Response)]
Dependent Variable (Response):
  21 hrs   17 hrs   31 hrs
  27 hrs   25 hrs   28 hrs
  29 hrs   20 hrs   22 hrs

One-Way Analysis of Variance F Test
• Evaluate the Difference Among the Mean Responses of 2 or More (c) Populations
  – E.g., several types of tires, oven temperature settings
• Assumptions
  – Samples are randomly and independently drawn
    • This condition must be met
  – Populations are normally distributed
    • F Test is robust to moderate departure from normality
  – Populations have equal variances
    • Less sensitive to this requirement when samples are of equal size from each population

Why ANOVA?
• Could Compare the Means One by One using Z or t Tests for Difference of Means
• Each Z or t Test Contains Type I Error
• The Total Type I Error with k Pairs of Means is $1 - (1 - \alpha)^k$ (treating the k tests as independent)
  – E.g., if there are 5 means and $\alpha = .05$
    • Must perform 10 comparisons
    • Type I Error is $1 - (.95)^{10} = .40$
    • 40% of the time you will reject the null hypothesis of equal means in favor of the alternative when the null is true!

Hypotheses of One-Way ANOVA
• $H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
  – All population means are equal
  – No treatment effect (no variation in means among groups)
• $H_1$: Not all $\mu_i$ are the same
  – At least one population mean is different (others may be the same!)
  – There is a treatment effect
  – Does not mean that all population means are different

One-Way ANOVA (No Treatment Effect)
$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: Not all $\mu_i$ are the same
The Null Hypothesis is True
[Figure: three population distributions sharing a common mean, $\mu_1 = \mu_2 = \mu_3$]

One-Way ANOVA (Treatment Effect Present)
$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: Not all $\mu_i$ are the same
The Null Hypothesis is NOT True
[Figure: two panels of population distributions whose means $\mu_1$, $\mu_2$, $\mu_3$ are not all equal]

One-Way ANOVA (Partition of Total Variation)
Total Variation (SST) = Variation Due to Group (SSA) + Variation Due to Random Sampling (SSW)
• SSA is commonly referred to as:
  – Among Group Variation
  – Sum of Squares Among
  – Sum of Squares Between
  – Sum of Squares Model
  – Sum of Squares Explained
  – Sum of Squares Treatment
• SSW is commonly referred to as:
  – Within Group Variation
  – Sum of Squares Within
  – Sum of Squares Error
  – Sum of Squares Unexplained

Total Variation
$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$
$X_{ij}$: the i-th observation in group j
$n_j$: the number of observations in group j
$n$: the total number of observations in all groups
$c$: the number of groups
$$\bar{\bar{X}} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n} \quad \text{(the overall or grand mean)}$$

Total Variation (continued)
$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{21} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$
[Figure: response X plotted for Group 1, Group 2 and Group 3 around the grand mean $\bar{\bar{X}}$]

Among-Group Variation
$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2 \qquad MSA = \frac{SSA}{c - 1}$$
$\bar{X}_j$: the sample mean of group j
$\bar{\bar{X}}$: the overall or grand mean
Variation Due to Differences Among Groups

Among-Group Variation (continued)
$$SSA = n_1 (\bar{X}_1 - \bar{\bar{X}})^2 + n_2 (\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c (\bar{X}_c - \bar{\bar{X}})^2$$
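The double sums for SST and SSA translate almost directly into code. The following is a minimal sketch in plain Python; the function names and the three small groups are hypothetical, chosen only so that the arithmetic is easy to check by hand.

```python
def grand_mean(groups):
    """Overall mean of every observation in every group."""
    values = [x for g in groups for x in g]
    return sum(values) / len(values)


def total_variation(groups):
    """SST: squared deviations of each observation from the grand mean."""
    gm = grand_mean(groups)
    return sum((x - gm) ** 2 for g in groups for x in g)


def among_group_variation(groups):
    """SSA: squared deviations of each group mean from the grand mean,
    weighted by the group size n_j."""
    gm = grand_mean(groups)
    return sum(len(g) * (sum(g) / len(g) - gm) ** 2 for g in groups)


# Hypothetical data (not from the slides): c = 3 groups, 3 observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]

sst = total_variation(groups)          # 48.0
ssa = among_group_variation(groups)    # 42.0
msa = ssa / (len(groups) - 1)          # 21.0
```

For these made-up groups the grand mean is 4, so SST = 48 and SSA = 42. By the partition SST = SSA + SSW, the remaining 6 is the variation within groups.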
[Figure: response X plotted for Group 1, Group 2 and Group 3, showing the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$]

Within-Group Variation
$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \qquad MSW = \frac{SSW}{n - c}$$
$\bar{X}_j$: the sample mean of group j
$X_{ij}$: the i-th observation in group j
Summing the variation within each group and then adding over all groups

Within-Group Variation (continued)
$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$
[Figure: response X plotted for Group 1, Group 2 and Group 3, showing deviations of the observations from their own group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]

Within-Group Variation (continued)
$$MSW = \frac{SSW}{n - c} = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2 + \cdots + (n_c - 1) S_c^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_c - 1)}$$
• For c = 2, this is the pooled variance in the t test.
• If more than 2 groups, use F Test.
• For 2 groups, use t test. F Test more limited.

One-Way ANOVA F Test Statistic
• Test Statistic
  – $F = \dfrac{MSA}{MSW}$
    • MSA is mean squares among
    • MSW is mean squares within
• Degrees of Freedom
  – $df_1 = c - 1$
  – $df_2 = n - c$

One-Way ANOVA Summary Table

Source of Variation   Degrees of Freedom   Sum of Squares     Mean Squares (Variance)   F Statistic
Among (Factor)        c – 1                SSA                MSA = SSA/(c – 1)         MSA/MSW
Within (Error)        n – c                SSW                MSW = SSW/(n – c)
Total                 n – 1                SST = SSA + SSW

Features of One-Way ANOVA F Statistic
• The F Statistic is the Ratio of the Among Estimate of Variance and the Within Estimate of Variance
  – The ratio is never negative
  – df1 = c – 1 will typically be small
  – df2 = n – c will typically be large
• The Ratio Should Be Close to 1 if the Null is True

Features of One-Way ANOVA F Statistic (continued)
• If the Null Hypothesis is False
  – The numerator should be greater than the denominator
  – The ratio should be larger than 1

One-Way ANOVA F Test Example
As production manager, you want to see if 3 filling machines have different mean filling times. You assign 15 similarly trained & experienced workers, 5 per machine, to the machines. At the .05 significance level, is there a difference in mean filling times?

Machine1   Machine2   Machine3
 25.40      23.40      20.00
 26.31      21.80      22.20
 24.10      23.50      19.75
 23.74      22.75      20.60
 25.10      21.60      20.40

One-Way ANOVA Example: Scatter Diagram
$\bar{X}_1 = 24.93$   $\bar{X}_2 = 22.61$   $\bar{X}_3 = 20.59$   $\bar{\bar{X}} = 22.71$
[Figure: scatter diagram of filling times (vertical axis from 19 to 27) for the three machines, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]

One-Way ANOVA Example Computations
$\bar{X}_1 = 24.93$   $\bar{X}_2 = 22.61$   $\bar{X}_3 = 20.59$   $\bar{\bar{X}} = 22.71$
$n_j = 5$   $c = 3$   $n = 15$
$$SSA = 5\left[(24.93 - 22.71)^2 + (22.61 - 22.71)^2 + (20.59 - 22.71)^2\right] = 47.164$$
$$SSW = 4.2592 + 3.112 + 3.682 = 11.0532$$
$$MSA = SSA/(c - 1) = 47.164/2 = 23.5820$$
$$MSW = SSW/(n - c) = 11.0532/12 = .9211$$

Summary Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares (Variance)   F Statistic
Among (Factor)        3 – 1 = 2            47.1640          23.5820                   MSA/MSW = 25.60
Within (Error)        15 – 3 = 12          11.0532          .9211
Total                 15 – 1 = 14          58.2172

One-Way ANOVA Example Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not All Equal
$\alpha = .05$
$df_1 = 2$, $df_2 = 12$
Critical Value(s): 3.89
[Figure: F distribution with the rejection region of area $\alpha = 0.05$ to the right of 3.89]
Test Statistic:
$$F = \frac{MSA}{MSW} = \frac{23.5820}{.9211} = 25.6$$
Decision: Reject at $\alpha = 0.05$.
Conclusion: There is evidence that at least one $\mu_i$ differs from the rest.

The Tukey-Kramer Procedure
• Tells which Population Means are Significantly Different
  – E.g., $\mu_1 = \mu_2 \neq \mu_3$
  – 2 groups whose means may be significantly different
• Post Hoc (A Posteriori) Procedure
  – Done after rejection of equal means in ANOVA
• Pairwise Comparisons
  – Compare absolute mean differences with critical range
[Figure: density f(X) against X, with $\mu_1 = \mu_2$ sharing one distribution and $\mu_3$ a separate one]

The Tukey-Kramer Procedure: Example
(Filling-time data for Machine1, Machine2 and Machine3 as tabulated above.)
1. Compute absolute mean differences:
   $|\bar{X}_1 - \bar{X}_2| = |24.93 - 22.61| = 2.32$
   $|\bar{X}_1 - \bar{X}_3| = |24.93 - 20.59| = 4.34$
   $|\bar{X}_2 - \bar{X}_3| = |22.61 - 20.59| = 2.02$
2. Compute critical range:
   $$\text{Critical Range} = Q_{U(c,\, n-c)} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)} = 1.618$$
3. All of the absolute mean differences are greater than the critical range.
There is a significant difference between each pair of means at the 5% level of significance.

Levene’s Test for Homogeneity of Variance
• The Null Hypothesis
  – $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_c^2$
  – The c population variances are all equal
• The Alternative Hypothesis
  – $H_1$: Not all $\sigma_j^2$ are equal $(j = 1, 2, \ldots, c)$
  – Not all the c population variances are equal

Levene’s Test for Homogeneity of Variance: Procedure
1. For each observation, obtain the absolute value of the difference between the observation and the median of its group.
2. Perform a one-way analysis of variance on these absolute differences.

Levene’s Test for Homogeneity of Variances: Example
As production manager, you want to see if 3 filling machines have different variance in filling times. You assign 15 similarly trained & experienced workers, 5 per machine, to the machines. At the .05 significance level, is there a difference in the variance in filling times?

Machine1   Machine2   Machine3
 25.40      23.40      20.00
 26.31      21.80      22.20
 24.10      23.50      19.75
 23.74      22.75      20.60
 25.10      21.60      20.40

Levene’s Test: Absolute Difference from the Median

Time
          Machine1   Machine2   Machine3
           25.40      23.40      20.00
           26.31      21.80      22.20
           24.10      23.50      19.75
           23.74      22.75      20.60
           25.10      21.60      20.40
median     25.10      22.75      20.40

abs(Time - median(Time))
          Machine1   Machine2   Machine3
           0.30       0.65       0.40
           1.21       0.95       1.80
           1.00       0.75       0.65
           1.36       0.00       0.20
           0.00       1.15       0.00

Summary Table

SUMMARY
Groups     Count   Sum    Average   Variance
Machine1   5       3.87   0.774     0.35208
Machine2   5       3.5    0.7       0.19
Machine3   5       3.05   0.61      0.5005

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        0.067453   2    0.033727   0.097048   0.908218   3.88529
Within Groups         4.17032    12   0.347527
Total                 4.237773   14

Levene’s Test Example: Solution
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: Not All Equal
$\alpha = .05$
$df_1 = 2$, $df_2 = 12$
Critical Value(s): 3.89
[Figure: F distribution with the rejection region of area $\alpha = 0.05$ to the right of 3.89]
Test Statistic:
$$F = \frac{MSA}{MSW} = \frac{0.0337}{0.3475} = 0.0970$$
Decision: Do not reject at $\alpha = 0.05$.
Conclusion: There is no evidence that at least one $\sigma_j^2$ differs from the rest.
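The two-step Levene procedure can be reproduced in a few lines. This is a minimal sketch in plain Python using the filling-time data from the example; the helper names `one_way_anova` and `levene_median` are hypothetical, and the critical value 3.89 is copied from the slide rather than computed.

```python
from statistics import mean, median


def one_way_anova(groups):
    """Return (SSA, SSW, F) using the one-way ANOVA formulas."""
    n = sum(len(g) for g in groups)          # total number of observations
    c = len(groups)                          # number of groups
    grand = sum(sum(g) for g in groups) / n  # grand mean
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = 0.0
    for g in groups:
        m = mean(g)
        ssw += sum((x - m) ** 2 for x in g)
    f = (ssa / (c - 1)) / (ssw / (n - c))    # F = MSA / MSW
    return ssa, ssw, f


def levene_median(groups):
    """Step 1: absolute deviations from each group's median.
    Step 2: one-way ANOVA on those absolute deviations."""
    abs_dev = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_anova(abs_dev)


machines = [
    [25.40, 26.31, 24.10, 23.74, 25.10],  # Machine1
    [23.40, 21.80, 23.50, 22.75, 21.60],  # Machine2
    [20.00, 22.20, 19.75, 20.60, 20.40],  # Machine3
]

F_CRIT = 3.89  # critical value for df1 = 2, df2 = 12 at alpha = .05, from the slide

ssa, ssw, f_stat = levene_median(machines)
reject = f_stat > F_CRIT
print(f"SSA={ssa:.6f} SSW={ssw:.5f} F={f_stat:.4f} reject={reject}")
```

The statistic is far below the critical value, which matches the "do not reject" decision. If SciPy is available, `scipy.stats.levene` with `center='median'` should report the same statistic together with a p-value.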
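Returning to the test of mean filling times, the one-way ANOVA F statistic and the Tukey-Kramer critical range can be checked the same way. This is a sketch in plain Python, not a general implementation. The critical value 3.89 is taken from the slide. The studentized range value `Q_U = 3.77` is an assumption: it is the tabled value for c = 3 and n − c = 12 at the 5% level, which the slides do not state but which reproduces their critical range of 1.618.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

machines = {
    "Machine1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Machine2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Machine3": [20.00, 22.20, 19.75, 20.60, 20.40],
}

F_CRIT = 3.89  # from the slide: df1 = 2, df2 = 12, alpha = .05
Q_U = 3.77     # assumption: tabled studentized range value, not given in the slides

groups = list(machines.values())
n = sum(len(g) for g in groups)          # 15 observations
c = len(groups)                          # 3 groups
grand = sum(sum(g) for g in groups) / n  # grand mean

# Among-group and within-group sums of squares
ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ssw = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw


def critical_range(n_j, n_k):
    """Tukey-Kramer critical range for two groups of sizes n_j and n_k."""
    return Q_U * sqrt(msw / 2 * (1 / n_j + 1 / n_k))


# Absolute mean difference for every pair, and whether it exceeds the range
pairs = {}
for (a, xa), (b, xb) in combinations(machines.items(), 2):
    diff = abs(mean(xa) - mean(xb))
    pairs[(a, b)] = (diff, diff > critical_range(len(xa), len(xb)))

print(f"F = {f_stat:.2f}, reject H0: {f_stat > F_CRIT}")
for (a, b), (diff, significant) in pairs.items():
    print(f"{a} vs {b}: |diff| = {diff:.2f}, significant: {significant}")
```

Because every group has 5 observations, the critical range is the same for all three pairs; with unequal group sizes `critical_range` would be evaluated separately for each pair.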