Lecture 12 - Wharton Statistics Department

Lecture 12
• One-way Analysis of Variance (Chapter 15.2)
• Multiple comparisons for one-way ANOVA (Chapter 15.7)
Review of one-way ANOVA
• Objective: Compare the means of K populations of interval data based on independent random samples from each.
• H0: μ1 = μ2 = … = μK
• H1: At least two means differ
• Notation: x_ij – ith observation of the jth sample; x̄_j – mean of the jth sample; n_j – number of observations in the jth sample; x̄ – grand mean of all observations
Example 15.1
• The marketing manager for an apple juice manufacturer needs to decide how to market a new product. Three strategies are considered, which emphasize the convenience, quality, and low price of the product, respectively.
• An experiment was conducted as follows:
• In three cities an advertising campaign was launched.
• In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
• The weekly sales were recorded for twenty weeks following the beginning of the campaigns.
Rationale Behind Test Statistic
• Two types of variability are employed when testing for the equality of population means:
– Variability of the sample means
– Variability within samples
• The test statistic is essentially (variability of the sample means)/(variability within samples).
The rationale behind the test statistic – I
• If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean).
• If the alternative hypothesis is true, at least some of the sample means would differ.
• Thus, we measure variability between sample means.
Variability between sample means
• The variability between the sample means is measured as the sum of squared distances between each group mean and the grand mean, times the sample size of the group.
• This sum is called the Sum of Squares for Treatments (SST). In our example the treatments are represented by the different advertising strategies.
Sum of squares for treatments (SST)
SST = Σ_{j=1}^{k} n_j (x̄_j − x̄)²
where there are k treatments, n_j is the size of sample j, and x̄_j is the mean of sample j.
Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation between sample means, which supports H1.
Sum of squares for treatments (SST)
• Solution – continued
Calculate SST:
x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65
The grand mean is calculated by
x̄ = (n1 x̄1 + n2 x̄2 + … + nk x̄k) / (n1 + n2 + … + nk) = 613.07
SST = Σ_{j=1}^{k} n_j (x̄_j − x̄)²
    = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)²
    = 57,512.23
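The arithmetic above is easy to reproduce programmatically. Below is a minimal Python sketch (not from the lecture) that computes the grand mean and SST from the quoted summary statistics:

```python
# Minimal sketch: reproduce the SST calculation from the sample summaries above.
means = [577.55, 653.00, 608.65]   # sample mean weekly sales per strategy
ns = [20, 20, 20]                  # weeks observed per city

grand_mean = sum(n * m for n, m in zip(ns, means)) / sum(ns)
sst = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
print(round(grand_mean, 2), round(sst, 2))  # 613.07 57512.23
```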
Sum of squares for treatments (SST)
Is SST = 57,512.23 large enough to reject H0 in favor of H1?
Large compared to what?
[Figure: two dot plots of Treatments 1, 2, and 3 with the same sample means (x̄1 = 10, x̄2 = 15, x̄3 = 20) but different within-sample spreads.]
A small variability within the samples makes it easier to draw a conclusion about the population means.
The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
The rationale behind the test statistic – II
• Large variability within the samples weakens the “ability” of the sample means to represent their corresponding population means.
• Therefore, even though sample means may markedly differ from one another, SST must be judged relative to the “within samples variability”.
Within samples variability
• The variability within samples is measured by adding all the squared distances between observations and their sample means.
• This sum is called the Sum of Squares for Error (SSE). In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).
Sum of squares for errors (SSE)
• Solution – continued
Calculate SSE:
s1² = 10,774.44, s2² = 7,238.61, s3² = 8,670.24
SSE = Σ_{j=1}^{k} Σ_{i=1}^{n_j} (x_ij − x̄_j)² = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
    = (20 − 1)(10,774.44) + (20 − 1)(7,238.61) + (20 − 1)(8,670.24)
    = 506,983.50
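As with SST, the SSE arithmetic can be checked with a short Python sketch using only the sample variances above:

```python
# Minimal sketch: reproduce SSE from the three sample variances above.
variances = [10774.44, 7238.61, 8670.24]
ns = [20, 20, 20]
sse = sum((n - 1) * s2 for n, s2 in zip(ns, variances))
print(round(sse, 2))  # 506982.51 (506,983.50 in the slides; the small gap is rounding in the variances)
```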
Sum of squares for errors (SSE)
Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to reject the null hypothesis that all the means are equal?
The mean sum of squares
To perform the test we need to calculate the mean squares as follows:
Calculation of MST (Mean Square for Treatments):
MST = SST / (k − 1) = 57,512.23 / (3 − 1) = 28,756.12
Calculation of MSE (Mean Square for Error):
MSE = SSE / (n − k) = 506,983.50 / (60 − 3) = 8,894.45
The F test rejection region
And finally the hypothesis test:
H0: 1 = 2 = …=k
H1: At least two means differ
MST
Test statistic: F 
MSE
R.R: F>Fa,k-1,n-k
The F test
H0: μ1 = μ2 = μ3
H1: At least two means differ
Test statistic: F = MST/MSE = 28,756.12 / 8,894.45 = 3.23
R.R.: F > F_{α, k−1, n−k} = F_{0.05, 3−1, 60−3} ≈ 3.15
Since 3.23 > 3.15, there is sufficient evidence to reject H0 in favor of H1, and conclude that at least one of the mean sales differs from the others.
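A minimal Python sketch of the whole F test, using scipy.stats.f for the critical value and p-value (the numbers assume the summary quantities from Example 15.1):

```python
# Minimal sketch: the full one-way ANOVA F test from summary quantities.
from scipy import stats

sst, sse = 57512.23, 506983.50
k, n = 3, 60

mst = sst / (k - 1)                         # 28,756.12
mse = sse / (n - k)                         # 8,894.45
f_stat = mst / mse                          # about 3.23

f_crit = stats.f.ppf(0.95, k - 1, n - k)    # about 3.16 (the table value 3.15 uses 60 df)
p_value = stats.f.sf(f_stat, k - 1, n - k)  # about 0.047, matching the ANOVA table below
print(f_stat > f_crit)                      # True: reject H0 at the 5% level
```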
Required Conditions for Test
• Independent simple random samples from each population
• The populations are normally distributed (look for extreme skewness and outliers; probably okay regardless if each n_j ≥ 30).
• The variances of all the populations are equal (rule of thumb: check that the largest sample standard deviation is less than twice the smallest standard deviation; a quick check for the example appears below).
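A minimal sketch of the rule-of-thumb check for Example 15.1, using the three sample variances quoted earlier:

```python
# Minimal sketch: the equal-variance rule of thumb applied to Example 15.1.
stds = [v ** 0.5 for v in (10774.44, 7238.61, 8670.24)]  # about 103.8, 85.1, 93.1
print(max(stds) < 2 * min(stds))  # True, so the equal-variance condition looks acceptable
```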
ANOVA Table – Example 15.1
Analysis of Variance

Source      DF    Sum of Squares    Mean Square    F Ratio    Prob > F
City         2         57,512.23       28,756.1     3.2330      0.0468
Error       57        506,983.50        8,894.4
C. Total    59        564,495.73
Model for ANOVA
• X_ij = ith observation of the jth sample
• X_ij = μ + a_j + ε_ij
• μ is the overall mean level, a_j is the differential effect of the jth treatment, and ε_ij is the random error in the ith observation under the jth treatment. The errors are assumed to be independent, normally distributed with mean zero and variance σ².
• The a_j are normalized: Σ_{j=1}^{K} a_j = 0
Model for ANOVA Cont.
• The expected response to the jth treatment is E(X_ij) = μ + a_j
• Thus, if all treatments have the same expected response (i.e., H0: all populations have the same mean), a_j = 0 for j = 1, …, K. In general, a_j − a_j′ is the difference between the means of populations j and j′.
• MSE is an estimate of Var(ε_ij) = σ²
• Sums of squares decomposition: SS(Total) = SST + SSE
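A one-line check that the decomposition holds for the Example 15.1 numbers:

```python
# Quick check of SS(Total) = SST + SSE with the Example 15.1 numbers.
sst, sse = 57512.23, 506983.50
print(sst + sse)  # 564495.73, the "C. Total" row of the ANOVA table above
```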
Review Question: Intro to ANOVA
• Which case does ANOVA generalize?
– 2-sample mean comparison with the equal variance assumption
– 2-sample mean comparison with unequal variances permitted
– 2-sample variance comparison
Relationship between F-test and t-test for two samples
• For comparing two samples, the F-statistic equals the square of the t-statistic with equal variances.
• For two samples, the ANOVA F-test is equivalent to testing H0: μ1 = μ2 versus H1: μ1 ≠ μ2.
Comparing Pairs
• What crucial information does F not provide?
– If F is statistically significant, there is evidence that not all group means are equal, but we don’t know where the differences between group means are.
• Examples of differences that make F statistically significant:
– Assume 4 groups with true means μ1, μ2, μ3, μ4 (e.g., mean sales at 4 locations of a store).
– F can be significant when only one mean differs from the other three, when the means split into two distinct groups, or when all of μ1, μ2, μ3, μ4 differ from one another.
15.7 Multiple Comparisons
• When the null hypothesis is rejected, it may be desirable to find which mean(s) differ and how they rank.
• Three statistical inference procedures geared at doing this are presented:
– Fisher’s least significant difference (LSD) method
– Bonferroni adjustment to Fisher’s LSD
– Tukey’s multiple comparison method
15.7 Multiple Comparisons
• Two means are considered different if the difference between the corresponding sample means is larger than a critical number. Then, the larger sample mean is believed to be associated with a larger population mean.
• Conditions common to all the methods here:
– The ANOVA model is the one-way analysis of variance.
– The conditions required to perform the ANOVA are satisfied.
Fisher Least Significant Difference (LSD) Method
• This method builds on the equal-variances t-test of the difference between two means.
• The test statistic is improved by using MSE rather than s_p².
• We conclude that μi and μj differ (at the α significance level) if |x̄i − x̄j| > LSD, where
LSD = t_{α/2, n−k} √(MSE (1/n_i + 1/n_j))
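A minimal sketch of the LSD computation for Example 15.1 (α = 0.05), using scipy.stats.t for the critical value:

```python
# Minimal sketch: Fisher's LSD for Example 15.1 at alpha = 0.05.
from math import sqrt
from scipy import stats

mse, n, k = 8894.45, 60, 3
ni = nj = 20
t_crit = stats.t.ppf(1 - 0.05 / 2, n - k)     # t_{.025, 57}, about 2.002
lsd = t_crit * sqrt(mse * (1 / ni + 1 / nj))  # about 59.72 (59.71 in the slides)
print(round(lsd, 2))
```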
Experimentwise Type I error rate (αE) (the effective Type I error)
• Using Fisher’s method may result in an increased probability of committing a Type I error.
• The experimentwise Type I error rate is the probability of committing at least one Type I error when each test is carried out at significance level α. If C independent tests are done,
αE = 1 − (1 − α)^C
• The Bonferroni adjustment determines the required Type I error probability per test (α) to secure a pre-determined overall αE.
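For the three pairwise comparisons of Example 15.1 treated as independent tests, the formula gives:

```python
# Minimal sketch: experimentwise error rate for C = 3 independent tests at alpha = .05.
alpha, C = 0.05, 3
alpha_E = 1 - (1 - alpha) ** C
print(round(alpha_E, 4))  # 0.1426: the chance of at least one false rejection
```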
Multiple Comparisons Problem
• A hypothetical study of the effect of birth control pills is done.
• Two groups of women (one taking birth control pills, the other not) are followed and 100 variables are recorded for each subject, such as blood pressure and psychological and medical problems.
• After the study, two-sample t-tests are performed for each variable and it is found that women taking birth control pills have higher incidences of depression at the 5% significance level (the p-value equals .02).
• Does this provide strong evidence that women taking birth control pills are more likely to be depressed?
Bonferroni Adjustment
• Suppose we carry out C tests at significance level α.
• If the null hypothesis for each test is true, the probability that we will falsely reject at least one hypothesis is at most Cα.
• Thus, if we carry out C tests at significance level α/C, the experimentwise Type I error rate is at most C(α/C) = α.
Bonferroni Adjustment for ANOVA
• The procedure:
– Compute the number of pairwise comparisons (C) [all pairs: C = k(k−1)/2], where k is the number of populations.
– Set α = αE/C, where αE is the true probability of making at least one Type I error (called the experimentwise Type I error).
– We conclude that μi and μj differ, at the α/C significance level per test (experimentwise error rate at most α), if
|x̄i − x̄j| > t_{α/(2C), n−k} √(MSE (1/n_i + 1/n_j))
Fisher and Bonferroni Methods
• Example 15.1 - continued
– Rank the effectiveness of the marketing strategies (based on mean weekly sales).
– Use Fisher’s method and the Bonferroni adjustment method.
• Solution (Fisher’s method)
– The sample mean sales were 577.55, 653.00, 608.65. Then,
|x̄1 − x̄2| = |577.55 − 653.00| = 75.45
|x̄1 − x̄3| = |577.55 − 608.65| = 31.10
|x̄2 − x̄3| = |653.00 − 608.65| = 44.35
LSD = t_{.05/2, 57} √(8894 (1/20 + 1/20)) = 59.71
– Only |x̄1 − x̄2| = 75.45 exceeds 59.71, so Fisher’s method finds a significant difference only between means 1 and 2.
Fisher and Bonferroni Methods
• Solution (the Bonferroni adjustment)
– We calculate C = k(k−1)/2 = 3(2)/2 = 3.
– We set α = .05/3 = .0167, thus t_{.0167/2, 60−3} = 2.467 (Excel).
|x̄1 − x̄2| = |577.55 − 653.00| = 75.45
|x̄1 − x̄3| = |577.55 − 608.65| = 31.10
|x̄2 − x̄3| = |653.00 − 608.65| = 44.35
2.467 √(8894 (1/20 + 1/20)) = 73.54
Again, the significant difference is only between means 1 and 2.
Tukey Multiple Comparisons
• The test procedure:
– Assumes an equal number of observations per population.
– Find a critical number ω as follows:
ω = q_α(k, ν) √(MSE / n_g)
k = the number of populations
ν = degrees of freedom = n − k
n_g = number of observations per population
α = significance level
q_α(k, ν) = a critical value obtained from the studentized range table (app. B17/18)
Tukey Multiple Comparisons
• Select a pair of means. Calculate the difference between the larger and the smaller mean, x̄max − x̄min.
• If x̄max − x̄min > ω, there is sufficient evidence to conclude that μmax > μmin.
• Repeat this procedure for each pair of samples. Rank the means if possible.
If the sample sizes are not extremely different, we can use the above procedure with n_g calculated as the harmonic mean of the sample sizes:
n_g = k / (1/n1 + 1/n2 + … + 1/nk)
Tukey Multiple Comparisons
• Example 15.1 - continued. We had three populations (three marketing strategies).
k = 3,
sample sizes were equal: n1 = n2 = n3 = 20,
ν = n − k = 60 − 3 = 57,
MSE = 8894.
Take q.05(3, 57) ≈ q.05(3, 60) = 3.40 from the table. Then
ω = q_α(k, ν) √(MSE / n_g) = q.05(3, 57) √(8894 / 20) = 71.70

Population       Mean
Sales - City 1   577.55
Sales - City 2   653.00
Sales - City 3   608.65

Compare each x̄max − x̄min with ω:
City 1 vs. City 2: 653.00 − 577.55 = 75.45
City 1 vs. City 3: 608.65 − 577.55 = 31.10
City 2 vs. City 3: 653.00 − 608.65 = 44.35
Only the City 1 vs. City 2 difference (75.45) exceeds ω = 71.70, so Tukey’s method also finds only those two mean sales to differ.
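A minimal sketch of the ω computation, using scipy.stats.studentized_range (available in SciPy 1.7+) in place of the printed table; note it uses the exact ν = 57 rather than the table's 60:

```python
# Minimal sketch: Tukey's critical difference via the studentized range distribution.
from math import sqrt
from scipy import stats

mse, k, nu, ng = 8894.0, 3, 57, 20
q_crit = stats.studentized_range.ppf(0.95, k, nu)  # about 3.40
omega = q_crit * sqrt(mse / ng)                    # about 71.8 (71.70 with the table value)
print(round(omega, 2))
```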
Practice Problems
• 15.16, 15.22, 15.26, 15.66