Chapter 11 - McGraw Hill Higher Education

A PowerPoint Presentation Package to Accompany
Applied Statistics in Business &
Economics, 4th edition
David P. Doane and Lori E. Seward
Prepared by Lloyd R. Jaisingh
McGraw-Hill/Irwin
Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter 11
Analysis of Variance
Chapter Contents
11.1 Overview of ANOVA
11.2 One-Factor ANOVA (Completely Randomized Model)
11.3 Multiple Comparisons
11.4 Tests for Homogeneity of Variances
11.5 Two-Factor ANOVA without Replication (Randomized Block Model)
11.6 Two-Factor ANOVA with Replication (Full Factorial Model)
11.7 Higher Order ANOVA Models (Optional)
Chapter Learning Objectives
LO11-1: Use basic ANOVA terminology correctly.
LO11-2: Recognize from data format when one-factor ANOVA is appropriate.
LO11-3: Interpret sums of squares and calculations in an ANOVA table.
LO11-4: Use Excel or other software for ANOVA calculations.
LO11-5: Use a table or Excel to find critical values for the F distribution.
LO11-6: Explain the assumptions of ANOVA and why they are important.
LO11-7: Understand and perform Tukey's test for paired means.
LO11-8: Use Hartley's test for equal variances in c treatment groups.
LO11-9: Recognize from data format when two-factor ANOVA is needed.
LO11-10: Interpret main effects and interaction effects in two-factor ANOVA.
LO11-11: Recognize the need for experimental design and GLM (optional).
11.1 Overview of ANOVA
LO11-1: Use basic ANOVA terminology correctly.
• Analysis of variance (ANOVA) is a comparison of means.
• ANOVA allows you to compare more than two means simultaneously.
• Proper experimental design efficiently uses limited data to draw the strongest possible inferences.
The Goal: Explaining Variation
• ANOVA seeks to identify sources of variation in a numerical dependent variable Y (the response variable).
• Variation in Y about its mean is explained by one or more categorical independent variables (the factors) or is unexplained (random error).
• Each possible value of a factor or combination of factors is a treatment.
• We test whether each factor has a significant effect on Y using, for example, the hypotheses:
H0: μ1 = μ2 = μ3 = μ4 (e.g., mean defect rates are the same for all four plants)
H1: Not all the means are equal
• The test uses the F distribution.
• If we cannot reject H0, we conclude that observations within each treatment have a common mean μ.
Figure 11.3
LO11-6: Explain the assumptions of ANOVA and why they are important.
ANOVA Assumptions
• Analysis of variance assumes that the
- observations on Y are independent,
- populations being sampled are normal,
- populations being sampled have equal variances.
• ANOVA is somewhat robust to departures from the normality and equal variance assumptions.
ANOVA Calculations
• Software (e.g., Excel, MegaStat, MINITAB, SPSS) is used to analyze data.
• Large samples increase the power of the test, but power also depends on the degree of variation in Y.
• Lowest power would be in a small sample with high variation in Y.
11.2 One-Factor ANOVA
(Completely Randomized Model)
LO11-2: Recognize from data format when one-factor ANOVA is appropriate.
One-Factor ANOVA as a Linear Model
• An equivalent way to express the one-factor model is to say that observation i in treatment j comes from a population with a common mean (μ) plus a treatment effect (Aj) plus random error (εij):
yij = μ + Aj + εij,   j = 1, 2, …, c and i = 1, 2, …, n
• Random error is assumed to be normally distributed with zero mean and the same variance for all treatments.
• A fixed effects model only looks at what happens to the response for particular levels of the factor. The hypotheses are:
H0: A1 = A2 = … = Ac = 0
H1: Not all Aj are zero
• If H0 is true, then the ANOVA model collapses to yij = μ + εij.
• Excel's one-factor ANOVA tool (under Data Analysis) can be used to analyze the data.
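To make the linear model concrete, the following is a minimal sketch that splits each observation into an estimated common mean, an estimated treatment effect, and a residual. The data values and the names groups, mu_hat, effects, and residuals are hypothetical and chosen only for illustration; they do not come from the textbook.

```python
from statistics import mean

# Hypothetical data: c = 3 treatments with n = 4 observations each.
groups = {
    "A": [10.0, 12.0, 11.0, 13.0],
    "B": [14.0, 15.0, 13.0, 16.0],
    "C": [9.0, 8.0, 10.0, 9.0],
}

# Estimate of the common mean: the grand mean of all observations.
all_obs = [y for ys in groups.values() for y in ys]
mu_hat = mean(all_obs)

# Estimated treatment effect: treatment mean minus grand mean.
effects = {name: mean(ys) - mu_hat for name, ys in groups.items()}

# Residual: observation minus its own treatment mean.
residuals = {name: [y - mean(ys) for y in ys] for name, ys in groups.items()}

# Each observation equals common mean + treatment effect + residual.
rebuilt = {
    name: [mu_hat + effects[name] + e for e in residuals[name]]
    for name in groups
}

for name in groups:
    print(f"{name}: estimated effect = {effects[name]:+.3f}")
```

With equal group sizes the estimated effects sum to zero, and under H0 all of them would be close to zero apart from sampling error.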
LO11-3: Interpret sums of squares and calculations in an ANOVA table.
Partitioned Sum of Squares
• Use Appendix F or Excel to obtain the critical value of F for a given α.
• For ANOVA, the F test is a right-tailed test.
Table 11.2
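As an illustration of how the sum of squares is partitioned, the sketch below computes the between-treatment, within-treatment (error), and total sums of squares for a small data set and forms the F statistic. The data are hypothetical, and the names ssa, sse, and sst are an assumed notation for the between, error, and total sums of squares.

```python
from statistics import mean

# Hypothetical data: three treatments, four observations each.
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [9.0, 8.0, 10.0, 9.0],
]

c = len(groups)                      # number of treatments
n = sum(len(g) for g in groups)      # total number of observations
grand_mean = mean([y for g in groups for y in g])

# Between-treatment sum of squares (explained by the factor).
ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within-treatment sum of squares (unexplained random error).
sse = sum((y - mean(g)) ** 2 for g in groups for y in g)

# Total sum of squares, computed directly from the grand mean.
sst = sum((y - grand_mean) ** 2 for g in groups for y in g)

# Mean squares divide each sum of squares by its degrees of freedom.
msa = ssa / (c - 1)
mse = sse / (n - c)
f_stat = msa / mse

print(f"SSA = {ssa:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"F = {f_stat:.2f} with df = ({c - 1}, {n - c})")
```

The total computed directly agrees with the sum of the two parts, which is the partition the ANOVA table reports.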
LO11-5: Use a table or Excel to find critical values for the F distribution.
Decision Rule for an F-test
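The slide's own statement of the rule is not reproduced in this text. In its usual right-tailed form, with c treatments and n total observations, the rule can be written as below. The symbols MSA and MSE (mean squares between and within treatments) and F_calc are an assumed notation.

```latex
\[
F_{\text{calc}} = \frac{MSA}{MSE},
\qquad
\text{reject } H_0 \text{ if } F_{\text{calc}} > F_{\alpha;\; c-1,\; n-c}
\]
```

Here the critical value has c - 1 numerator and n - c denominator degrees of freedom. Equivalently, H0 is rejected when the p-value of the test is less than α.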
11.3 Multiple Comparisons
LO11-7: Understand and perform Tukey's test for paired means.
Tukey’s Test
• After rejecting the hypothesis of equal means, we naturally want to know: Which means differ significantly?
• In order to maintain the desired overall probability of Type I error, a simultaneous confidence interval for the difference of means must be obtained.
• For c groups, there are c(c – 1)/2 distinct pairs of means to be compared.
• These types of comparisons are called multiple comparison tests.
• Tukey's studentized range test (or HSD, for "honestly significant difference") is a multiple comparison test that has good power and is widely used.
• It is named for statistician John Wilder Tukey (1915–2000).
• This test is not available in Excel's Tools > Data Analysis but is available in MegaStat and Minitab.
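The pairwise comparisons can be sketched as follows. This example enumerates the distinct pairs of means and computes a studentized range statistic for each pair, assuming equal group sizes. The data are hypothetical, and the textbook's own test statistic and table may be scaled differently, so treat this as one common form rather than the book's exact procedure. Critical values must still come from a studentized range table or statistical software.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Hypothetical data: c = 3 groups with equal sizes.
groups = {
    "A": [10.0, 12.0, 11.0, 13.0],
    "B": [14.0, 15.0, 13.0, 16.0],
    "C": [9.0, 8.0, 10.0, 9.0],
}
c = len(groups)
n_per_group = 4
n_total = c * n_per_group

means = {name: mean(ys) for name, ys in groups.items()}

# Pooled error variance (MSE) from the one-factor ANOVA.
sse = sum((y - means[name]) ** 2 for name, ys in groups.items() for y in ys)
mse = sse / (n_total - c)

# Every distinct pair of groups: c(c - 1)/2 of them.
pairs = list(combinations(groups, 2))

# Studentized range statistic for each pair (equal group sizes assumed).
q_stats = {
    (j, k): abs(means[j] - means[k]) / sqrt(mse / n_per_group)
    for j, k in pairs
}

print(f"{len(pairs)} pairs to compare")
for (j, k), q in q_stats.items():
    print(f"{j} vs {k}: |difference| = {abs(means[j] - means[k]):.2f}, q = {q:.3f}")
```

A pair of means is declared significantly different when its statistic exceeds the critical value for c groups and n - c error degrees of freedom.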
11.4 Tests for Homogeneity of Variances
LO11-8: Use Hartley's test for equal variances in c treatment groups.
ANOVA Assumptions
• ANOVA assumes that observations on the response variable are from normally distributed populations that have the same variance.
• The one-factor ANOVA test is only slightly affected by inequality of variance when group sizes are equal.
• Test this assumption of homogeneous variances using Hartley's Fmax test.
• The hypotheses are H0: σ1² = σ2² = … = σc² versus H1: not all the σj² are equal.
• The test statistic is the ratio of the largest sample variance to the smallest sample variance.
Hartley’s Test
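• The decision rule is: reject H0 if the test statistic exceeds the critical value from a table of Hartley's Fmax distribution.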
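A minimal sketch of Hartley's calculation follows. The data are hypothetical, and the function name hartley_decision and its parameter h_critical are introduced here only for illustration. The critical value itself is not computed; it must be looked up in a Hartley table for the given number of groups and group size.

```python
from statistics import variance

# Hypothetical data: c = 3 groups with equal sizes.
groups = {
    "A": [10.0, 12.0, 11.0, 13.0],
    "B": [14.0, 15.0, 13.0, 16.0],
    "C": [9.0, 8.0, 10.0, 9.0],
}

# Sample variance (n - 1 divisor) of each treatment group.
sample_vars = {name: variance(ys) for name, ys in groups.items()}

# Hartley's statistic: largest sample variance over smallest.
h_stat = max(sample_vars.values()) / min(sample_vars.values())


def hartley_decision(h_stat, h_critical):
    """Return True if H0 (equal variances) is rejected.

    h_critical is a table value supplied by the reader;
    it is not derived in this sketch.
    """
    return h_stat > h_critical


print(f"Sample variances: {sample_vars}")
print(f"Hartley's statistic = {h_stat:.3f}")
```

Because the statistic is a ratio of the largest to the smallest variance, it is always at least 1, and only large values count as evidence against equal variances.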
LO11-6: Explain the assumptions of ANOVA and why they are important.
Levene’s Test
• Levene's test is a more robust alternative to Hartley's F test.
• Levene's test does not assume a normal distribution.
• It is based on the distances of the observations from their sample medians rather than their sample means.
• A computer program (e.g., MINITAB) is needed to perform this test.
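The median-based idea can be sketched in a few lines: replace each observation by its absolute distance from its group median, then compute a one-factor ANOVA F statistic on those distances. This is a simplified illustration with hypothetical data and helper names (one_way_f, levene_median_stat), not the output of any particular package, and it stops short of computing a p-value.

```python
from statistics import mean, median


def one_way_f(groups):
    """F statistic of a one-factor ANOVA on a list of groups."""
    c = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([y for g in groups for y in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_between / (c - 1)) / (ss_within / (n - c))


def levene_median_stat(groups):
    """Median-based Levene statistic: ANOVA F on absolute deviations."""
    distances = [[abs(y - median(g)) for y in g] for g in groups]
    return one_way_f(distances)


# Hypothetical data: three groups, four observations each.
data = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [9.0, 8.0, 10.0, 9.0],
]

w = levene_median_stat(data)
print(f"Levene (median-based) statistic = {w:.3f}")
```

The resulting statistic is compared with an F critical value with c - 1 and n - c degrees of freedom, so a large value suggests the group variances differ.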