Inference about more than Two Population Central Values

• In Chapter 5 we studied inferences about a population mean μ using a single sample from the population.
• In Chapter 6 we studied inferences about the difference between two population means μ1 − μ2 based on a random sample from each population.
• Chapter 8 deals with inferences about means μ1, μ2, . . . , μt from t > 2 populations.
• The first and foremost question we ask is whether the populations have different means.
• Take the case t = 3, so that we have three populations with means μ1, μ2, μ3.
• A sample from each population yields sample means and variances ȳ1, ȳ2, ȳ3 and s1², s2², s3².
• All possible differences μi − μj are:

  μ1 − μ2,  μ1 − μ3,  μ2 − μ3.

• To test each H0 : μi − μj = 0, a t-statistic of the form

  tij = (ȳi − ȳj) / ( sp √(1/ni + 1/nj) ),  where  sp² = [(ni − 1)si² + (nj − 1)sj²] / (ni + nj − 2),

  is used.
• Suppose each test is made at the same level α. Thus for each pair (i, j),

  P(|tij| > tα/2 | H0 is true) = α.

• If all pairwise tests of means fail to reject the null hypotheses, then one may conclude that all the means μi are equal.
• This procedure has a serious flaw, which will now be described.
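As a numerical illustration, the pairwise t-statistic above can be computed directly. This is a small sketch in Python with made-up samples (the data are not from the text):

```python
import math

def pooled_t(y1, y2):
    """Two-sample t-statistic t_ij using the pooled variance s_p^2."""
    n1, n2 = len(y1), len(y2)
    m1 = sum(y1) / n1
    m2 = sum(y2) / n2
    s1 = sum((y - m1) ** 2 for y in y1) / (n1 - 1)  # sample variance s_i^2
    s2 = sum((y - m2) ** 2 for y in y2) / (n2 - 1)
    sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)  # pooled variance
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical samples from two of the three populations:
t12 = pooled_t([1, 2, 3], [2, 3, 4])
print(round(t12, 3))  # -1.225
```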
• In other words, the Type I error probability for each individual test is α. We shall call α the per-comparison error rate.
• Now let us ask the question: what is the probability of making one or more Type I errors when we make all three tests? (We shall call this the overall error rate.)
• Theory says that if the three t random variables were statistically independent, then the overall error rate would be 1 − (1 − α)³ [this works out to 0.14 if α = 0.05, much larger than 0.05].
• The three test statistics are not statistically independent, however: for example, t12 and t13 both involve ȳ1, and furthermore all three have the same denominator.
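The overall error rate can also be estimated by simulation. The sketch below is illustrative, not from the text: it assumes three normal populations with equal means, samples of size 5, and pairwise pooled t-tests at α = .05 (using t.025 = 2.306 for 8 df). The estimated overall rate comes out well above .05:

```python
import math
import random

random.seed(1)

def pooled_t(a, b):
    """Pairwise pooled two-sample t-statistic (Chapter 6 style)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    s1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)
    s2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

T_CRIT = 2.306            # t_{.025} with 8 df, since n1 = n2 = 5
n_sims, n = 2000, 5
hits = 0
for _ in range(n_sims):
    # All three populations share the same mean, so H0 is true.
    g = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
    tests = [pooled_t(g[0], g[1]), pooled_t(g[0], g[2]), pooled_t(g[1], g[2])]
    if any(abs(t) > T_CRIT for t in tests):
        hits += 1         # at least one Type I error on this round

print(hits / n_sims)      # estimated overall error rate, well above 0.05
```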
• Recall the natural approach: faced with the task of determining whether the evidence points to μ1 = μ2 = · · · = μt or not, we drew on knowledge acquired in Chapter 6 and performed t-tests of H0 : μi = μj for all pairs of means.
• This inflated overall error rate is the flaw in the multiple-testing approach to testing H0 : μ1 = μ2 = · · · = μt.
• One solution is to use a very small α for each test in order to have a reasonably small overall error rate. For example, in the previous situation, if we had used α = .01, then the overall error rate would be less than .03.
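The arithmetic behind these two numbers, under the independence approximation:

```python
# If the c tests were independent, the overall Type I error rate
# would be 1 - (1 - alpha)^c.
def overall_rate(alpha, c=3):
    return 1 - (1 - alpha) ** c

print(round(overall_rate(0.05), 4))  # 0.1426 -- the 0.14 quoted above
print(round(overall_rate(0.01), 4))  # 0.0297 -- less than .03, as claimed
```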
• Thus the overall error rate is not exactly 0.14 for our three-test situation, and computing its exact value is theoretically difficult.
• In general, for c tests made at level α, the overall error rate can be larger than 1 − (1 − α)^c.
• We would like to avoid using a very small α for each test if possible, because we lose power by doing so.

The Analysis of Variance and the F-test

• The alternative to making multiple t-tests is to make a single F-test of H0 : μ1 = μ2 = · · · = μt versus Ha : at least one μj is different in value. This is called the analysis of variance F-test.
• We can, at this point, begin talking in terms of experiments and their outcomes. Recall that an experiment is a planned data-collection activity.
• A simplistic view is that an experiment is a planned way to observe the effect of treatments that are applied to experimental units.

Examples

(a) Different amounts of a headache drug (treatments) are given to people with headaches (experimental units) to observe the effect.
(b) Different furnace temperatures (treatments) are used to temper steel rods (experimental units) to see the extent of tempering at each temperature.

• In this context we can imagine, theoretically, a parent population of experimental units whose measure of interest has mean μ and variance σ².
• We imagine applying each treatment i to all elements in the parent population to obtain a treated population, say Ti.
• We will assume the treated population Ti has mean μi, possibly different from μ, but that its variance is still σ²; i.e., the treatment does not affect the variance among experimental units.
• In the past we have talked about populations without worrying about how they might have materialized. Here we are simply saying that we can imagine some populations as arising through experimentation.
• Each of these populations has mean μi and variance σ², i = 1, 2, . . . , t.
• The simplest experimental design is named the completely randomized design (CRD).
• Here we have (say) t treatments and (say) nt experimental units. These experimental units are divided into t sets of n, and the sets are assigned at random, each to a treatment.
• We can, equivalently, characterize this design as one that acquires a simple random sample of size n from each of t different treated populations T1, T2, . . . , Tt.
• If all of the treatments have the population mean μ, then μ1 = μ2 = · · · = μt = μ.
• If this were the case, the variation within the n samples taken from each treated population T1, T2, . . . , Tt (called the within-sample variance) would be the same as the variation among all nt sample values.

[Figure: dot plots of samples from T1, T2, T3 on a common axis when μ1 = μ2 = μ3; the three samples overlap.]

• If some of the μj are different, then the variance in the overall sample will be inflated by the differences in population means, while the within-sample variances remain the same for each sample.

[Figure: dot plots of samples from T1, T2, T3 on a common axis when μ1, μ2, μ3 differ; the three samples are shifted apart.]

• This suggests that to investigate whether μ1 = μ2 = · · · = μt, we may wish to compare the observed variation within samples from the populations Ti to the variation among (or between) the samples.
• This is the motivation for analyzing variances when looking for differences in population means.
• Note that the expected value of the mean square between treatment groups is σ² when μ1 = μ2 = · · · = μt. This expectation gets larger than σ² as the μi's get further apart in value.
• Thus the ratio

  Mean Square between treatment groups
  ------------------------------------
  Mean Square within treatment groups

  has expected (average) value 1 when μ1 = μ2 = · · · = μt and greater than one otherwise.
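A small simulation can illustrate this motivation. The sketch below uses a hypothetical setup (three normal populations with σ = 1 and n = 10 per group, not data from the text) and averages the between/within mean-square ratio over repeated samples:

```python
import random

random.seed(2)

def msb_msw_ratio(groups):
    """Mean square between groups over mean square within groups (equal n)."""
    t = len(groups)
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / t
    msb = n * sum((m - grand) ** 2 for m in means) / (t - 1)
    msw = sum(sum((y - m) ** 2 for y in g)
              for g, m in zip(groups, means)) / (t * (n - 1))
    return msb / msw

def avg_ratio(mus, n=10, sims=500):
    """Average ratio over repeated samples from normal populations."""
    total = 0.0
    for _ in range(sims):
        groups = [[random.gauss(mu, 1) for _ in range(n)] for mu in mus]
        total += msb_msw_ratio(groups)
    return total / sims

r_null = avg_ratio([0, 0, 0])   # equal means: ratio averages near 1
r_alt = avg_ratio([0, 0, 3])    # unequal means: ratio is inflated
print(round(r_null, 2), round(r_alt, 2))
```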
• It will be difficult for us to decide how much larger than 1 the ratio must be before we declare that at least one μi is different from the others.
• The solution to this problem is to assume the treated populations are Normally distributed.
• Then the above ratio is an F random variable whenever μ1 = μ2 = · · · = μt.
• When the μi are not all equal, the distribution of this ratio is shifted to the right. Thus the observed value of the ratio will tend to be larger than the percentile value from the F-table at a specified level α.
• Thus we may test the hypothesis

  H0 : μ1 = μ2 = · · · = μt vs. Ha : at least one μi different,

  using the statistic F = (Mean Square Between Groups)/(Mean Square Within Groups).
• Since our test essentially involves mean values ȳ, the central limit theorem tells us that we will have approximately an F random variable even if the treated populations are not dramatically different from normal populations.
• The F-test is carried out by constructing an analysis of variance table, discussed below.
• Section 8.2 in the text gives details about computation of the relevant analysis of variance table for this case. Here we summarize the results.

Data: (Note: the sample sizes in the t treatment groups need not be the same.)

  Treatment 1: y11, y12, . . . , y1n1    with sample mean ȳ1.
  Treatment 2: y21, y22, . . . , y2n2    with sample mean ȳ2.
  ...
  Treatment t: yt1, yt2, . . . , ytnt    with sample mean ȳt.

Let nT = n1 + n2 + · · · + nt denote the total number of observations, and let ȳ.. denote the mean of all nT observations.

The Analysis of Variance (AOV) Table for a CRD

The AOV table in this case is:

  Source      df       SS     MS                   F
  Treatment   t − 1    SSB    MSB = SSB/(t − 1)    MSB/MSE
  Error       nT − t   SSW    MSE = SSW/(nT − t)
  Total       nT − 1   TSS

where

  SSB = Σ (i = 1 to t) ni (ȳi. − ȳ..)²,
  SSW = Σ (i = 1 to t) Σ (j = 1 to ni) (yij − ȳi.)² = (n1 − 1)s1² + (n2 − 1)s2² + · · · + (nt − 1)st²,
  TSS = Σ (i = 1 to t) Σ (j = 1 to ni) (yij − ȳ..)².

Note that (n1 − 1) + (n2 − 1) + · · · + (nt − 1) = nT − t, so MSE = SSW/(nT − t) is the pooled sample variance s².
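The AOV computations summarized here can be sketched directly. The function below is an illustrative implementation for a CRD (names are my own, not from the text):

```python
def aov_table(groups):
    """One-way AOV quantities for a CRD with possibly unequal sample sizes."""
    t = len(groups)
    nT = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / nT                 # ybar..
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum(sum((y - m) ** 2 for y in g) for g, m in zip(groups, means))
    msb = ssb / (t - 1)                                      # treatment MS
    mse = ssw / (nT - t)                                     # pooled variance
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSE": mse, "F": msb / mse}

# Tiny made-up example: two groups of three observations.
tab = aov_table([[1, 2, 3], [3, 4, 5]])
print(tab["SSB"], tab["SSW"], tab["F"])  # 6.0 4.0 6.0
```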
Example 8.1 (Old Edition):

A horticulturist was investigating the phosphorus content of tree leaves from three different varieties of apple trees (1, 2, and 3). Random samples of five leaves from each of the three varieties were analyzed for phosphorus content. The data are:

  Variety   Phosphorus Content        Sample Size   Sample Mean   Sample Variance
  1         .35 .40 .58 .50 .47       5             0.460         .00795
  2         .65 .70 .90 .84 .79       5             0.776         .01033
  3         .60 .80 .75 .73 .66       5             0.708         .00617
Analysis of Variance Table:

  Source of Variation   d.f.   S.S.    M.S.    F
  Variety               2      .2766   .1383   16.97
  Error                 12     .0978   .0082
  Total                 14     .3744

• Since the computed F value exceeds F.05,2,12 = 3.89, the null hypothesis that the mean phosphorus contents for the three varieties are all equal is rejected.
• It appears that the mean phosphorus content for Variety 1 is smaller than those for Varieties 2 and 3. Techniques for testing such hypotheses will be discussed in Chapter 9.

Model for data from a CRD

• The model of the CRD which we will use is:

  yij = μ + αi + εij ,  i = 1, 2, . . . , t; j = 1, 2, . . . , ni,

  where E(εij) = 0, Var(εij) = σ², and the εij's are all independent.
• Here αi is the effect due to treatment i. Thus the treated-population means are μi = μ + αi in this notation.
• If none of the treatments has an effect, then α1, α2, · · · , αt are all zero, which is equivalent to μ1 = μ2 = · · · = μt = μ for some μ.
• We may talk of treatment means or treatment effects in single-treatment-factor CRD experiments, irrespective of sample size.
• SSW is simply the familiar pooled SS that we saw for the two-sample case back in Chapter 6, extended to the t > 2 case.
• MSE is the pooled estimator of the population variance, σ².
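The AOV entries can be checked from the raw data of Example 8.1; small differences from any printed (rounded) table entries are rounding effects. A quick sketch:

```python
# Phosphorus data from Example 8.1, one list per variety.
v1 = [.35, .40, .58, .50, .47]
v2 = [.65, .70, .90, .84, .79]
v3 = [.60, .80, .75, .73, .66]
groups = [v1, v2, v3]

def mean(g):
    return sum(g) / len(g)

def var(g):
    m = mean(g)
    return sum((y - m) ** 2 for y in g) / (len(g) - 1)

means = [mean(g) for g in groups]
grand = mean([y for g in groups for y in g])                 # ybar..
ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)   # between SS
ssw = sum((len(g) - 1) * var(g) for g in groups)             # within (pooled) SS
f_stat = (ssb / 2) / (ssw / 12)                              # MSB / MSE

print([round(m, 3) for m in means])                # [0.46, 0.776, 0.708]
print(round(ssb, 4), round(ssw, 4), round(f_stat, 2))  # 0.2766 0.0978 16.97
```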
• To summarize, our model for the multiple-population case (with possibly unequal sample sizes) is

  yij = μ + αi + εij ,  i = 1, 2, . . . , t; j = 1, 2, . . . , ni,

  where the εij's are independent, normally distributed, and for each i the population parameters are (μi, σ²).
• A problem occurs if there is heterogeneity of variance.

Checking the Equal Variance Assumption

• We wish to test

  H0 : σ1² = σ2² = · · · = σt²  vs.  Ha : not all σi²'s are equal.

• A test of this hypothesis, proposed by Hartley (1940), was discussed in Chapter 7. The test is the F-max test, which uses the largest and smallest of the sample variances:

  Fmax = max(Si²) / min(Si²).

• The test statistic presumes equal sample sizes. Recall that Table 12 gives critical values for α = 0.05 and α = 0.01 with df = n − 1. If the sample sizes are not equal, use the largest ni.
• When sample sizes are nearly equal, heterogeneity of variance is not as great a problem unless the variances are severely different. Such cases are usually detectable by simply looking at the sample variances.
• Hartley's test is very sensitive to departures from normality. An alternative test, Levene's test, was also discussed in Chapter 7.
• When heterogeneity appears to be present, a variance-stabilizing transformation can sometimes be found. Such transformations are discussed in Section 8.5, along with Hartley's test. Finding a useful variance-stabilizing transformation is not, in general, easy.
• If an approximate relationship between σ² and μ is found by examining the sample means ȳi and the corresponding sample variances Si², then it is possible to determine an appropriate transformation using theoretical arguments.
• For example, if this relationship is of the form σ² = kμ for some constant k, then the transformation yT = √y is suggested. Or, if the relationship is of the form σ² = kμ² for some constant k, then the transformation yT = log(y + 1) is recommended. (See Table 8.15 for other possibilities.)
• The transformed data are then analyzed using the usual method. See Examples 8.4, 8.5, 8.6.
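For the Example 8.1 sample variances, Hartley's statistic is easy to compute (the Table 12 critical-value lookup is not reproduced here; this is only a sketch of the statistic itself):

```python
# Hartley's F-max statistic from the Example 8.1 sample variances.
sample_vars = [0.00795, 0.01033, 0.00617]
f_max = max(sample_vars) / min(sample_vars)
print(round(f_max, 2))  # 1.67 -- to be compared against the Table 12 critical value
```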