Basic Business Statistics, 10th Edition
Chapter 11: Analysis of Variance
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.

Learning Objectives
In this chapter, you learn:
- The basic concepts of experimental design
- How to use one-way analysis of variance to test for differences among the means of several populations (also referred to as "groups" in this chapter)
- When to use a randomized block design
- How to use two-way analysis of variance and the concept of interaction

Chapter Overview
Analysis of Variance (ANOVA)
- One-Way ANOVA: F-test, Tukey-Kramer test
- Randomized Block Design: Multiple Comparisons
- Two-Way ANOVA: Interaction Effects

General ANOVA Setting
- Investigator controls one or more independent variables
  - Called factors (or treatment variables)
  - Each factor contains two or more levels (or groups or categories/classifications)
- Observe effects on the dependent variable
  - Response to levels of independent variable
- Experimental design: the plan used to collect the data

Completely Randomized Design
- Experimental units (subjects) are assigned randomly to treatments
  - Subjects are assumed homogeneous
- Only one factor or independent variable
  - With two or more treatment levels
- Analyzed by one-way analysis of variance (ANOVA)

One-Way Analysis of Variance
- Evaluate the difference among the means of three or more groups
  - Examples: accident rates for 1st, 2nd, and 3rd shift; expected mileage for five brands of tires
- Assumptions
  - Populations are normally distributed
  - Populations have equal variances
  - Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA
- $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
  - All population means are equal
  - i.e., no treatment effect (no variation in means among groups)
- $H_1$: Not all of the population means are the same
  - At least one population mean is different
  - i.e., there is a treatment effect
  - Does not mean that all population means are different (some pairs may be the same)

One-Factor ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$: Not all $\mu_j$ are the same
- All means are the same: the null hypothesis is true (no treatment effect)
[Figure: group distributions centered at a common mean, $\mu_1 = \mu_2 = \mu_3$]

One-Factor ANOVA (continued)
- At least one mean is different: the null hypothesis is NOT true (treatment effect is present)
[Figure: two panels of group distributions in which at least one of $\mu_1$, $\mu_2$, $\mu_3$ differs from the others]

Partitioning the Variation
Total variation can be split into two parts:
$SST = SSA + SSW$
- SST = Total Sum of Squares (total variation)
- SSA = Sum of Squares Among Groups (among-group variation)
- SSW = Sum of Squares Within Groups (within-group variation)

Partitioning the Variation (continued)
- Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)
- Among-Group Variation = dispersion between the factor sample means (SSA)
- Within-Group Variation = dispersion that exists among the data values within a particular factor level (SSW)

Partition of Total Variation
Total Variation (SST), d.f. = n - 1
  = Variation Due to Factor (SSA), d.f. = c - 1
  + Variation Due to Random Sampling (SSW), d.f. = n - c
- SSA is commonly referred to as: Sum of Squares Between, Sum of Squares Among, Sum of Squares Explained, Among Groups Variation
- SSW is commonly referred to as: Sum of Squares Within, Sum of Squares Error, Sum of Squares Unexplained, Within-Group Variation

Total Sum of Squares
$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$
Where:
- SST = total sum of squares
- c = number of groups (levels or treatments)
- $n_j$ = number of observations in group j
- $X_{ij}$ = ith observation from group j
- $\bar{\bar{X}}$ = grand mean (mean of all data values)

Total Variation (continued)
$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$
[Figure: response X plotted for Groups 1, 2 and 3, with deviations measured from the grand mean]

Among-Group Variation
$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$
Where:
- SSA = sum of squares among groups
- c = number of groups
- $n_j$ = sample size from group j
- $\bar{X}_j$ = sample mean from group j
- $\bar{\bar{X}}$ = grand mean (mean of all data values)

Among-Group Variation (continued)
- Variation due to differences among groups
- Mean Square Among = SSA / degrees of freedom:
$$MSA = \frac{SSA}{c - 1}$$
$$SSA = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$
[Figure: response X for Groups 1, 2 and 3, showing the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ relative to the grand mean]

Within-Group Variation
$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$
Where:
- SSW = sum of squares within groups
- c = number of groups
- $n_j$ = sample size from group j
- $\bar{X}_j$ = sample mean from group j
- $X_{ij}$ = ith observation in group j

Within-Group Variation (continued)
- Summing the variation within each group and then adding over all groups
- Mean Square Within = SSW / degrees of freedom:
$$MSW = \frac{SSW}{n - c}$$
$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$
[Figure: response X for Groups 1, 2 and 3, with deviations measured from each group's own mean]

Obtaining the Mean Squares
$$MSA = \frac{SSA}{c - 1} \qquad MSW = \frac{SSW}{n - c} \qquad MST = \frac{SST}{n - 1}$$

One-Way ANOVA Table

Source of Variation   SS                df      MS (Variance)         F ratio
Among Groups          SSA               c - 1   MSA = SSA / (c - 1)   F = MSA / MSW
Within Groups         SSW               n - c   MSW = SSW / (n - c)
Total                 SST = SSA + SSW   n - 1

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

One-Way ANOVA F Test Statistic
$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: At least two population means are different
- Test statistic:
$$F = \frac{MSA}{MSW}$$
  - MSA is mean squares among groups
  - MSW is mean squares within groups
- Degrees of freedom
  - $df_1 = c - 1$ (c = number of groups)
  - $df_2 = n - c$ (n = sum of sample sizes from all populations)

Interpreting One-Way ANOVA F Statistic
- The F statistic is the ratio of the among estimate of variance to the within estimate of variance
  - The ratio is never negative
  - $df_1 = c - 1$ will typically be small
  - $df_2 = n - c$ will typically be large
- Decision rule: reject $H_0$ if $F > F_U$, otherwise do not reject $H_0$
[Figure: F distribution with the rejection region of area $\alpha = .05$ to the right of $F_U$]

One-Way ANOVA F Test Example
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

One-Way ANOVA Example: Scatter Diagram
[Figure: scatter diagram of distance (190 to 270) by club, marking the group means and the grand mean]
$\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$, $\bar{\bar{x}} = 227.0$

One-Way ANOVA Example Computations
$\bar{X}_1 = 249.2$, $n_1 = 5$
$\bar{X}_2 = 226.0$, $n_2 = 5$
$\bar{X}_3 = 205.8$, $n_3 = 5$
$\bar{\bar{X}} = 227.0$, $n = 15$, $c = 3$

$SSA = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$
$SSW = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$
$MSA = 4716.4 / (3 - 1) = 2358.2$
$MSW = 1119.6 / (15 - 3) = 93.3$
$$F = \frac{2358.2}{93.3} = 25.275$$

One-Way ANOVA Example Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: $\mu_j$ not all equal
$\alpha = 0.05$, $df_1 = 2$, $df_2 = 12$
- Critical value: $F_U = 3.89$
- Test statistic: $F = MSA / MSW = 2358.2 / 93.3 = 25.275$
- Decision: reject $H_0$ at $\alpha = 0.05$
- Conclusion: there is evidence that at least one $\mu_j$ differs from the rest
[Figure: F distribution with the rejection region of area $\alpha = .05$ to the right of $F_U = 3.89$; the observed $F = 25.275$ falls in the rejection region]

One-Way ANOVA Excel Output
EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.89
Within Groups         1119.6   12   93.3
Total                 5836.0   14

The Tukey-Kramer Procedure
- Tells which population means are significantly different
  - e.g.: $\mu_1 = \mu_2 \neq \mu_3$
- Done after rejection of equal means in ANOVA
- Allows pair-wise comparisons
  - Compare absolute mean differences with critical range
[Figure: distributions of x with $\mu_1 = \mu_2$ and a separate $\mu_3$]
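
Returning to the golf club computations above, the sums of squares and the F statistic can be reproduced with a short script. This is a minimal sketch in plain Python; the function name `one_way_anova` and the variable names are illustrative choices, and the critical value 3.89 is copied from the slides rather than computed.

```python
# One-way ANOVA by hand for the golf club example (data taken from the slides).
groups = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}


def one_way_anova(samples):
    """Return (SSA, SSW, SST, MSA, MSW, F) for a list of samples."""
    all_values = [x for sample in samples for x in sample]
    n = len(all_values)          # total number of observations
    c = len(samples)             # number of groups
    grand_mean = sum(all_values) / n
    group_means = [sum(s) / len(s) for s in samples]

    # Among-group variation: n_j * (group mean - grand mean)^2, summed over groups
    ssa = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, group_means))
    # Within-group variation: squared deviations from each group's own mean
    ssw = sum((x - m) ** 2 for s, m in zip(samples, group_means) for x in s)
    # Total variation: squared deviations from the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return ssa, ssw, sst, msa, msw, msa / msw


ssa, ssw, sst, msa, msw, f_stat = one_way_anova(list(groups.values()))

F_CRITICAL = 3.89  # F_U for df1 = 2, df2 = 12 at alpha = 0.05, as quoted in the slides
reject_h0 = f_stat > F_CRITICAL

print(f"SSA={ssa:.1f} SSW={ssw:.1f} SST={sst:.1f}")
print(f"MSA={msa:.1f} MSW={msw:.1f} F={f_stat:.3f} reject H0: {reject_h0}")
```

Computing SST directly, instead of as SSA + SSW, gives an independent check that the partition holds for the data.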

Tukey-Kramer Critical Range
$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$$
where:
- $Q_U$ = value from the Studentized Range Distribution with c and n - c degrees of freedom for the desired level of $\alpha$ (see appendix E.9 table)
- MSW = Mean Square Within
- $n_j$ and $n_{j'}$ = sample sizes from groups j and j'

The Tukey-Kramer Procedure: Example

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

1. Compute absolute mean differences:
   $|\bar{x}_1 - \bar{x}_2| = |249.2 - 226.0| = 23.2$
   $|\bar{x}_1 - \bar{x}_3| = |249.2 - 205.8| = 43.4$
   $|\bar{x}_2 - \bar{x}_3| = |226.0 - 205.8| = 20.2$
2. Find the $Q_U$ value from the table in appendix E.10 with c = 3 and (n - c) = (15 - 3) = 12 degrees of freedom for the desired level of $\alpha$ ($\alpha$ = 0.05 used here): $Q_U = 3.77$
3. Compute the critical range:
   $$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.77 \sqrt{\frac{93.3}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 16.285$$
4. Compare each absolute mean difference with the critical range:
   $|\bar{x}_1 - \bar{x}_2| = 23.2$, $|\bar{x}_1 - \bar{x}_3| = 43.4$, $|\bar{x}_2 - \bar{x}_3| = 20.2$
5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance. Thus, with 95% confidence we can conclude that the mean distance for club 1 is greater than for clubs 2 and 3, and that the mean distance for club 2 is greater than for club 3.

The Randomized Block Design
- Like One-Way ANOVA, we test for equal population means (for different factor levels, for example)...
- ...but we want to control for possible variation from a second factor (with two or more levels)
- Levels of the secondary factor are called blocks

Partitioning the Variation
Total variation can now be split into three parts:
$SST = SSA + SSBL + SSE$
- SST = Total variation
- SSA = Among-Group variation
- SSBL = Among-Block variation
- SSE = Random variation
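
Returning to the Tukey-Kramer example for the golf clubs, the five steps can be sketched in code. This is a minimal sketch: the group means, sample sizes, MSW and $Q_U = 3.77$ are copied from the slides (the table lookup is not reproduced), and the helper name `critical_range` is an illustrative choice.

```python
from itertools import combinations
from math import sqrt

# Values taken from the golf club example in the slides.
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
msw = 93.3   # mean square within, from the one-way ANOVA table
q_u = 3.77   # studentized range value for c = 3, n - c = 12, alpha = 0.05 (from the slides)


def critical_range(q_u, msw, n_j, n_k):
    """Tukey-Kramer critical range for one pair of groups."""
    return q_u * sqrt((msw / 2) * (1 / n_j + 1 / n_k))


# Compare every pair of groups: (absolute difference, critical range, significant?)
comparisons = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(q_u, msw, sizes[a], sizes[b])
    comparisons[(a, b)] = (diff, cr, diff > cr)

for (a, b), (diff, cr, significant) in comparisons.items():
    print(f"{a} vs {b}: |diff|={diff:.1f}, critical range={cr:.3f}, significant={significant}")
```

Because the critical range depends on both sample sizes, the same function also covers unequal group sizes, which is the case the Kramer adjustment addresses.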

Sum of Squares for Blocking
$SST = SSA + SSBL + SSE$
$$SSBL = c \sum_{i=1}^{r} (\bar{X}_{i.} - \bar{\bar{X}})^2$$
Where:
- c = number of groups
- r = number of blocks
- $\bar{X}_{i.}$ = mean of all values in block i
- $\bar{\bar{X}}$ = grand mean (mean of all data values)

Partitioning the Variation
Total variation can now be split into three parts:
$SST = SSA + SSBL + SSE$
- SST and SSA are computed as they were in One-Way ANOVA
- $SSE = SST - (SSA + SSBL)$

Mean Squares
- Mean square blocking: $MSBL = \dfrac{SSBL}{r - 1}$
- Mean square among groups: $MSA = \dfrac{SSA}{c - 1}$
- Mean square error: $MSE = \dfrac{SSE}{(r - 1)(c - 1)}$

Randomized Block ANOVA Table

Source of Variation   SS     df               MS     F ratio
Among Treatments      SSA    c - 1            MSA    MSA / MSE
Among Blocks          SSBL   r - 1            MSBL   MSBL / MSE
Error                 SSE    (r - 1)(c - 1)   MSE
Total                 SST    rc - 1

c = number of populations
r = number of blocks
rc = sum of the sample sizes from all populations
df = degrees of freedom

Blocking Test
$H_0: \mu_{1.} = \mu_{2.} = \mu_{3.} = \cdots$
$H_1$: Not all block means are equal
$$F = \frac{MSBL}{MSE}$$
- Blocking test: $df_1 = r - 1$, $df_2 = (r - 1)(c - 1)$
- Reject $H_0$ if $F > F_U$

Main Factor Test
$H_0: \mu_{.1} = \mu_{.2} = \mu_{.3} = \cdots = \mu_{.c}$
$H_1$: Not all population means are equal
$$F = \frac{MSA}{MSE}$$
- Main factor test: $df_1 = c - 1$, $df_2 = (r - 1)(c - 1)$
- Reject $H_0$ if $F > F_U$

The Tukey Procedure
- To test which population means are significantly different
  - e.g.: $\mu_1 = \mu_2 \neq \mu_3$
- Done after rejection of equal means in randomized block ANOVA design
- Allows pair-wise comparisons
  - Compare absolute mean differences with critical range
[Figure: distributions of x with $\mu_1 = \mu_2$ and a separate $\mu_3$]

The Tukey Procedure (continued)
$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{r}}$$
Compare: is $|\bar{x}_{.j} - \bar{x}_{.j'}| >$ Critical Range?
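
The randomized block partition and its two F ratios can be sketched as follows. The slides give no data set for this design, so the numbers below are invented purely for illustration (rows are hypothetical blocks, columns are hypothetical treatment groups, one observation per cell); only the formulas come from the slides.

```python
# Hypothetical data, invented for illustration: data[i][j] is the single
# observation for block i (row) under treatment group j (column).
data = [
    [70, 61, 82],
    [77, 75, 88],
    [76, 67, 90],
    [80, 63, 96],
]
r = len(data)       # number of blocks
c = len(data[0])    # number of treatment groups

grand_mean = sum(sum(row) for row in data) / (r * c)
block_means = [sum(row) / c for row in data]
group_means = [sum(row[j] for row in data) / r for j in range(c)]

# SST and SSA as in one-way ANOVA (each group has r observations here)
sst = sum((x - grand_mean) ** 2 for row in data for x in row)
ssa = r * sum((m - grand_mean) ** 2 for m in group_means)
# Blocking sum of squares: c * sum over blocks of (block mean - grand mean)^2
ssbl = c * sum((m - grand_mean) ** 2 for m in block_means)
# Error is what remains after removing treatment and block variation
sse = sst - (ssa + ssbl)

msa = ssa / (c - 1)
msbl = ssbl / (r - 1)
mse = sse / ((r - 1) * (c - 1))

f_treatment = msa / mse   # main factor test, df = (c - 1, (r - 1)(c - 1))
f_block = msbl / mse      # blocking test, df = (r - 1, (r - 1)(c - 1))

print(f"SSA={ssa:.2f} SSBL={ssbl:.2f} SSE={sse:.2f} SST={sst:.2f}")
print(f"F (treatments)={f_treatment:.3f}  F (blocks)={f_block:.3f}")
```

Each F ratio would then be compared with the upper-tail critical value $F_U$ for its degrees of freedom, which this sketch leaves to a table lookup.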
In the Tukey procedure for the randomized block design, if the absolute mean difference for a pair of group means ($|\bar{x}_{.1} - \bar{x}_{.2}|$, $|\bar{x}_{.1} - \bar{x}_{.3}|$, $|\bar{x}_{.2} - \bar{x}_{.3}|$, etc.) is greater than the critical range, then there is a significant difference between that pair of means at the chosen level of significance.

Factorial Design: Two-Way ANOVA
Examines the effect of:
- Two factors of interest on the dependent variable
  - e.g., percent carbonation and line speed on a soft drink bottling process
- Interaction between the different levels of these two factors
  - e.g., does the effect of one particular carbonation level depend on the level at which the line speed is set?

Two-Way ANOVA (continued)
- Assumptions
  - Populations are normally distributed
  - Populations have equal variances
  - Independent random samples are drawn

Two-Way ANOVA Sources of Variation
Two factors of interest: A and B
- r = number of levels of factor A
- c = number of levels of factor B
- n' = number of replications for each cell
- n = total number of observations in all cells (n = rcn')
- $X_{ijk}$ = value of the kth observation of level i of factor A and level j of factor B

Two-Way ANOVA Sources of Variation (continued)
$SST = SSA + SSB + SSAB + SSE$

Source                                                   Degrees of Freedom
SST    Total variation                                   n - 1
SSA    Factor A variation                                r - 1
SSB    Factor B variation                                c - 1
SSAB   Variation due to interaction between A and B      (r - 1)(c - 1)
SSE    Random variation (error)                          rc(n' - 1)

Two Factor ANOVA Equations
- Total variation:
$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{\bar{X}})^2$$
- Factor A variation:
$$SSA = c\,n' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{\bar{X}})^2$$
- Factor B variation:
$$SSB = r\,n' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{\bar{X}})^2$$

Two Factor ANOVA Equations (continued)
- Interaction variation:
$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}})^2$$
- Sum of squares error:
$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$$

Two Factor ANOVA Equations (continued)
where:
- Grand mean:
$$\bar{\bar{X}} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'}$$
- Mean of ith level of factor A (i = 1, 2, ..., r):
$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{cn'}$$
- Mean of jth level of factor B (j = 1, 2, ..., c):
$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r} \sum_{k=1}^{n'} X_{ijk}}{rn'}$$
- Mean of cell ij:
$$\bar{X}_{ij.} = \frac{\sum_{k=1}^{n'} X_{ijk}}{n'}$$
- r = number of levels of factor A
- c = number of levels of factor B
- n' = number of replications in each cell

Mean Square Calculations
- Mean square factor A: $MSA = \dfrac{SSA}{r - 1}$
- Mean square factor B: $MSB = \dfrac{SSB}{c - 1}$
- Mean square interaction: $MSAB = \dfrac{SSAB}{(r - 1)(c - 1)}$
- Mean square error: $MSE = \dfrac{SSE}{rc(n' - 1)}$

Two-Way ANOVA: The F Test Statistic
- F test for factor A effect
  - $H_0: \mu_{1..} = \mu_{2..} = \mu_{3..} = \cdots$
  - $H_1$: Not all $\mu_{i..}$ are equal
  - $F = MSA / MSE$; reject $H_0$ if $F > F_U$
- F test for factor B effect
  - $H_0: \mu_{.1.} = \mu_{.2.} = \mu_{.3.} = \cdots$
  - $H_1$: Not all $\mu_{.j.}$ are equal
  - $F = MSB / MSE$; reject $H_0$ if $F > F_U$
- F test for interaction effect
  - $H_0$: the interaction of A and B is equal to zero
  - $H_1$: the interaction of A and B is not zero
  - $F = MSAB / MSE$; reject $H_0$ if $F > F_U$

Two-Way ANOVA Summary Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                      F Statistic
Factor A              SSA              r - 1                MSA = SSA / (r - 1)               MSA / MSE
Factor B              SSB              c - 1                MSB = SSB / (c - 1)               MSB / MSE
AB (Interaction)      SSAB             (r - 1)(c - 1)       MSAB = SSAB / [(r - 1)(c - 1)]    MSAB / MSE
Error                 SSE              rc(n' - 1)           MSE = SSE / [rc(n' - 1)]
Total                 SST              n - 1

Features of Two-Way ANOVA F Test
- Degrees of freedom always add up
  - n - 1 = rc(n' - 1) + (r - 1) + (c - 1) + (r - 1)(c - 1)
  - Total = error + factor A + factor B + interaction
- The denominator of the F test is always the same (MSE), but the numerator is different for each test
- The sums of squares always add up
  - SST = SSE + SSA + SSB + SSAB
  - Total = error + factor A + factor B + interaction

Examples: Interaction vs. No Interaction
[Figure: two panels, "No interaction" and "Interaction is present", each plotting mean response against Factor A levels with one line per Factor B level (Levels 1, 2 and 3)]

Multiple Comparisons: The Tukey Procedure
- Unless there is a significant interaction, you can determine the levels that are significantly different using the Tukey procedure
- Consider all absolute mean differences and compare to the calculated critical range
- Example: absolute differences for factor A, assuming three levels:
  $|\bar{X}_{1..} - \bar{X}_{2..}|$, $|\bar{X}_{1..} - \bar{X}_{3..}|$, $|\bar{X}_{2..} - \bar{X}_{3..}|$

Multiple Comparisons: The Tukey Procedure
- Critical range for factor A:
$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{c\,n'}}$$
  (where $Q_U$ is from Table E.10 with r and rc(n' - 1) d.f.)
- Critical range for factor B:
$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{r\,n'}}$$
  (where $Q_U$ is from Table E.10 with c and rc(n' - 1) d.f.)
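
The two-way ANOVA sums of squares, mean squares and F statistics described above can be sketched for a balanced design as follows. The slides give no two-way data set, so the cell values are invented for illustration (2 levels of factor A, 3 levels of factor B, 2 replications per cell); variable names such as `n_rep` for n' are also illustrative.

```python
# Hypothetical balanced data, invented for illustration: cells[i][j] holds the
# n' replicates for level i of factor A and level j of factor B.
cells = [
    [[12.0, 14.0], [15.0, 17.0], [20.0, 22.0]],
    [[11.0, 13.0], [18.0, 20.0], [29.0, 31.0]],
]
r = len(cells)            # levels of factor A
c = len(cells[0])         # levels of factor B
n_rep = len(cells[0][0])  # replications per cell (n')


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(x for row in cells for cell in row for x in cell)
a_means = [mean(x for cell in row for x in cell) for row in cells]
b_means = [mean(x for row in cells for x in row[j]) for j in range(c)]
cell_means = [[mean(cell) for cell in row] for row in cells]

sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)
ssa = c * n_rep * sum((m - grand) ** 2 for m in a_means)
ssb = r * n_rep * sum((m - grand) ** 2 for m in b_means)
ssab = n_rep * sum(
    (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
    for i in range(r)
    for j in range(c)
)
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(r)
    for j in range(c)
    for x in cells[i][j]
)

msa = ssa / (r - 1)
msb = ssb / (c - 1)
msab = ssab / ((r - 1) * (c - 1))
mse = sse / (r * c * (n_rep - 1))

# All three tests share the same denominator, MSE.
f_a, f_b, f_ab = msa / mse, msb / mse, msab / mse

print(f"SSA={ssa:.2f} SSB={ssb:.2f} SSAB={ssab:.2f} SSE={sse:.2f} SST={sst:.2f}")
print(f"F_A={f_a:.3f} F_B={f_b:.3f} F_AB={f_ab:.3f}")
```

Each sum of squares is computed from its own formula, so checking that the four components add up to SST is a useful sanity test for a balanced design.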

Chapter Summary
- Described one-way analysis of variance
  - The logic of ANOVA
  - ANOVA assumptions
  - F test for difference in c means
  - The Tukey-Kramer procedure for multiple comparisons
- Considered the Randomized Block Design
  - Treatment and block effects
  - Multiple comparisons: Tukey procedure
- Described two-way analysis of variance
  - Examined effects of multiple factors
  - Examined interaction between factors