Notes Set #4 Stat402B (Spring 2016) Last update: January 27, 2016

Stat 402B (Spring 2016): Notes Set #4
Comparisons or Contrasts
• The difference $\mu_p - \mu_q$ is just one of many possible comparisons among the means. The important comparisons may not be of the form $\mu_p - \mu_q$.
• Suppose there are $a$ treatments and each is replicated $n$ times. It is possible to subdivide the treatment sum of squares from the analysis of variance into sums of squares, each with one degree of freedom, which can be used to test a particular hypothesis about the means $\mu_1, \mu_2, \ldots, \mu_a$.
• For example, the sum of squares needed to test the hypothesis $H_0: \mu_1 - \tfrac{1}{2}(\mu_2 + \mu_3) = 0$ vs. $H_a: \mu_1 - \tfrac{1}{2}(\mu_2 + \mu_3) \neq 0$ will have 1 d.f.
• A linear combination of the $\mu$'s of the type $\mu_1 - \tfrac{1}{2}\mu_2 - \tfrac{1}{2}\mu_3$ is called a comparison or a contrast of the means.
Comparisons or Contrasts (Cont’d 1)
• If $c_1, c_2, \ldots, c_a$ are constants s.t. $\sum_{i=1}^{a} c_i = 0$, then $\Gamma = \sum_{i=1}^{a} c_i \mu_i$ is called a contrast or a comparison in the $\mu_i$'s.
• Want to test the hypotheses:
$$H_0: \sum_{i=1}^{a} c_i \mu_i = 0 \quad \text{vs.} \quad H_1: \sum_{i=1}^{a} c_i \mu_i \neq 0$$
• The contrast or comparison in the sample means $\bar{y}_{i.}$ is
$$C = \sum_{i=1}^{a} c_i \bar{y}_{i.}$$
and is the estimate of the contrast $\Gamma = \sum_{i=1}^{a} c_i \mu_i$.
• The variance of $C$ is
$$V(C) = \frac{\sigma^2}{n} \sum_{i=1}^{a} c_i^2$$
• By replacing $\sigma^2$ by its estimate $s_E^2 = MS_E$ we get a t-statistic
$$t_0 = \frac{\sum_{i=1}^{a} c_i \bar{y}_{i.}}{s_E \sqrt{\frac{1}{n} \sum_{i=1}^{a} c_i^2}}$$
Reject the null hypothesis above if $|t_0|$ exceeds $t_{\alpha/2, N-a}$ (or calculate a p-value).
• We could also use an F statistic instead. A single degree of freedom sum of squares for testing the above hypothesis is
$$SS_c = \frac{\left( \sum_{i=1}^{a} c_i \bar{y}_{i.} \right)^2}{\frac{1}{n} \sum_{i=1}^{a} c_i^2}$$
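As a minimal sketch of the equal-sample-size formulas above, the following Python computes the contrast estimate, the t-statistic and the single degree of freedom sum of squares. The treatment means, n = 5 and MS_E = 333.70 are taken from the plasma etching ANOVA table in these notes; the function name `contrast_test` is illustrative only.

```python
import math

def contrast_test(coefs, means, n, mse):
    """Contrast estimate, t0, SS_c and F0 for equal sample sizes n."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    c_hat = sum(c * m for c, m in zip(coefs, means))   # C = sum of c_i * ybar_i.
    scale = sum(c * c for c in coefs) / n              # (1/n) * sum of c_i^2
    t0 = c_hat / (math.sqrt(mse) * math.sqrt(scale))   # s_E = sqrt(MS_E)
    ss_c = c_hat ** 2 / scale                          # single-d.f. sum of squares
    f0 = ss_c / mse                                    # F0 = MS_c / MS_E
    return c_hat, t0, ss_c, f0

# Treatment means from the plasma etching example; contrast mu_1 - mu_2.
means = [551.2, 587.4, 625.4, 707.0]
c_hat, t0, ss_c, f0 = contrast_test([1, -1, 0, 0], means, n=5, mse=333.70)
print(round(c_hat, 1), round(t0, 3), round(ss_c, 2), round(f0, 2))
```

Squaring the returned `t0` gives the same number as `f0`, which is the equivalence between the t and F forms of the test.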
• This gives the F-statistic
$$F_0 = \frac{MS_c}{MS_E} = \frac{SS_c/1}{MS_E}$$
which turns out to be computationally equal to $t_0^2$.
• For a significance level of $\alpha$, the critical point is the upper $100\alpha\%$ point of the $F(1, N-a)$ distribution, where $N = \sum_{i=1}^{a} n_i$. Thus, we reject $H_0$ if $F_0 > F_{\alpha, 1, N-a}$.
• If the sample sizes are unequal, i.e., the $a$ treatments are replicated $n_1, \ldots, n_a$ times, respectively, then the t statistic and $SS_c$ are modified as follows:
$$t_0 = \frac{\sum_{i=1}^{a} c_i \bar{y}_{i.}}{s_E \sqrt{\sum_{i=1}^{a} \frac{c_i^2}{n_i}}} \qquad SS_c = \frac{\left( \sum_{i=1}^{a} c_i \bar{y}_{i.} \right)^2}{\sum_{i=1}^{a} \frac{c_i^2}{n_i}}$$
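The unequal-sample-size versions can be sketched the same way. The function name `contrast_test_unequal` and the three-treatment data below are hypothetical, made up purely to illustrate the arithmetic; they do not come from the text.

```python
import math

def contrast_test_unequal(coefs, means, sizes, mse):
    """t0 and SS_c for a contrast when treatment i has sizes[i] replicates."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    c_hat = sum(c * m for c, m in zip(coefs, means))        # sum of c_i * ybar_i.
    scale = sum(c * c / n for c, n in zip(coefs, sizes))    # sum of c_i^2 / n_i
    t0 = c_hat / (math.sqrt(mse) * math.sqrt(scale))
    ss_c = c_hat ** 2 / scale
    return t0, ss_c

# Hypothetical data (not from the notes): three treatments, unequal replication.
means = [10.0, 12.0, 15.0]
sizes = [4, 6, 5]
mse = 4.0
t0, ss_c = contrast_test_unequal([1, -0.5, -0.5], means, sizes, mse)
print(round(t0, 3), round(ss_c, 3), round(ss_c / mse, 3))
```

When every entry of `sizes` equals the same n, the scale term reduces to (1/n) times the sum of squared coefficients, so the function agrees with the equal-sample-size formulas.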
Orthogonal Contrasts
• The contrasts usually chosen to be tested are those that are of interest to the experimenter or those suggested by the treatment structure.
• Such contrasts must be determined before the experiment design begins, and are thus called pre-planned comparisons.
• In an experiment with equal sample sizes, it is possible to find a set of comparisons such that their sums of squares, each with one degree of freedom, form a subdivision of $SS_{Trt}$.
• If $c_1, \ldots, c_a$ and $d_1, \ldots, d_a$ are constants s.t. $\sum_{i=1}^{a} c_i = 0$, $\sum_{i=1}^{a} d_i = 0$, and $\sum_{i=1}^{a} c_i d_i = 0$, then $\sum_{i=1}^{a} c_i \mu_i$ and $\sum_{i=1}^{a} d_i \mu_i$ are called orthogonal contrasts in the $\mu_i$'s.
• The corresponding contrasts of the sample means are statistically independent of each other when the sample sizes are equal, i.e. $n_1 = \cdots = n_a$.
• In that case, the contrast sums of squares of a set of $a - 1$ mutually orthogonal contrasts form a complete partitioning of the treatment sum of squares $SS_{Trt}$, i.e.
$$SS_{Trt} = SS_{C_1} + SS_{C_2} + \cdots + SS_{C_{a-1}}$$
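The orthogonality conditions are easy to check numerically. The sketch below tests that each coefficient vector sums to zero and that every pair has zero inner product; the three vectors are the coefficients of the contrasts used in the plasma etching example of these notes, and the helper names are illustrative.

```python
from itertools import combinations

def is_contrast(coefs, tol=1e-12):
    """True if the coefficients sum to zero."""
    return abs(sum(coefs)) <= tol

def are_orthogonal(c, d, tol=1e-12):
    """True if sum of c_i * d_i is zero (equal sample sizes assumed)."""
    return abs(sum(ci * di for ci, di in zip(c, d))) <= tol

contrasts = {
    "C1": [1, -1, 0, 0],     # mu_1 vs mu_2
    "C2": [1, 1, -1, -1],    # mu_1 + mu_2 vs mu_3 + mu_4
    "C3": [0, 0, 1, -1],     # mu_3 vs mu_4
}

all_contrasts = all(is_contrast(c) for c in contrasts.values())
mutually_orthogonal = all(
    are_orthogonal(contrasts[p], contrasts[q])
    for p, q in combinations(contrasts, 2)
)
print(all_contrasts, mutually_orthogonal)
```

With a = 4 treatments, a set of a − 1 = 3 mutually orthogonal contrasts is the largest possible, which is why these three can partition the treatment sum of squares.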
Plasma Etching Example (continued)
ANOVA Table incorporating contrasts (Table 3.11 in the text)
Source of Variation          d.f.   SS             MS           F        p-value
RF Power                     3      66,870.55      22,290.18    66.8     < .0001
  Orthogonal Contrasts
  C1: µ1 = µ2                1      (3276.10)      3276.10      9.82     < 0.01
  C2: µ1 + µ2 = µ3 + µ4      1      (46,948.05)    46,948.05    140.69   < .001
  C3: µ3 = µ4                1      (16,646.40)    16,646.40    49.88    < .001
Error                        16     5339.20        333.70
Total                        19     72,209.75
Note: In actual situations, the contrasts are selected at the planning stage so that they lead to meaningful conclusions about the treatment means or effects. Thus they are very much related to the structure of the factor levels. Ask whether the contrasts here are meaningful in this experiment.
Example (Cont’d)
Compute the values of the contrasts and the sums of squares as follows:

Contrast   C = Σ c_i ȳ_i.                                  Value of C   SS_C
1          +1(551.2) − 1(587.4)                            −36.2        (−36.2)² / (2/5) = 3276.10
2          +1(551.2) + 1(587.4) − 1(625.4) − 1(707.0)      −193.8       (−193.8)² / (4/5) = 46,948.05
3          +1(625.4) − 1(707.0)                            −81.6        (−81.6)² / (2/5) = 16,646.40
These contrast sums of squares completely partition the treatment sum of squares. The F-tests on the contrasts are usually incorporated in the analysis of variance as above. We see that
$$SS_{Trt} = 66{,}870.55 = 3276.10 + 46{,}948.05 + 16{,}646.40$$
since the 3 contrasts considered are orthogonal to each other and thus partition the treatment sum of squares into 3 single degree of freedom sums of squares.
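The partition can be verified directly from the treatment means. The sketch below recomputes each contrast sum of squares and compares the total with the treatment sum of squares computed from the means, using the means and n = 5 from this example; the function name `contrast_ss` is illustrative.

```python
def contrast_ss(coefs, means, n):
    """Single-d.f. sum of squares for a contrast with equal sample sizes n."""
    estimate = sum(c * m for c, m in zip(coefs, means))
    return estimate ** 2 / (sum(c * c for c in coefs) / n)

means = [551.2, 587.4, 625.4, 707.0]   # treatment means from the example
n = 5                                  # replicates per treatment
contrasts = [[1, -1, 0, 0], [1, 1, -1, -1], [0, 0, 1, -1]]

# Treatment sum of squares: n * sum of squared deviations from the grand mean.
grand_mean = sum(means) / len(means)
ss_trt = n * sum((m - grand_mean) ** 2 for m in means)

ss_parts = [contrast_ss(c, means, n) for c in contrasts]
print([round(s, 2) for s in ss_parts], round(sum(ss_parts), 2), round(ss_trt, 2))
```

The agreement between the two totals depends on the contrasts being mutually orthogonal and on equal sample sizes; a non-orthogonal set would generally not add up to the treatment sum of squares.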
Diagnostic Plots of Residuals
1. Probability Plot: used to detect possible deviations from normality of the error distribution. Also helps to locate possible outliers.
2. Residuals vs. Time: may reveal changes in the experimental technique that occur as the experiment proceeds. The data may display more or less variability as time goes on.
3. Residuals vs. Predicted Values (fitted values): may help show whether the absolute values of the residuals increase (or decrease) as the size of the response increases, indicating that the model is suspect. Ordinarily, if the model is correct, the residuals should not be related to the size of the response.
4. Residuals vs. Extraneous Variables of Interest: such plots may
• increase basic knowledge about the subject
• suggest variables that must be controlled
• lead one to consider these variables as new factors in the experiment.
Choice of Sample Size for Oneway Classification
To simplify matters consider the equal sample size case i.e., n1 = n2 =
· · · = na = n
Prob. of Type II error = β = P(fail to reject H0|H0 is false)
= 1 − P(reject H0|H0 is false)
= 1 − P (F0 > Fα,a−1,N −a|H0 is false)
The relevant OC curves, which are in Table V of the Appendix, plot $\beta$ vs. a parameter $\phi$ where
$$\phi^2 = \frac{n \sum_{i=1}^{a} \tau_i^2}{a \sigma^2} = \frac{n \sum_{i=1}^{a} (\mu_i - \bar{\mu})^2}{a \sigma^2}$$
Separate curves are available for $\alpha = .05$ and $\alpha = .01$ and a range of values of $\nu_1 = a - 1$, $\nu_2 = N - a$. (Note: We want to use the OC curves to determine $n$ for a specified power to reject a specified $H_a$ at a chosen $\alpha$, assuming $\sigma^2$ is known.)
How to use OC Curves
See Example 3.10 in the text for the plasma etching experiment example.
A practical approach is to specify the problem as finding the sample size needed to reject the null hypothesis if any pair of treatment means differ by at least $D$ units. It can be shown that the minimum value of $\phi^2$ for any configuration of $\mu$'s that satisfies this condition (see p. 107 of the text) is:
$$\phi^2 = \frac{n D^2}{2 a \sigma^2}$$
Since the power increases (or β decreases) as φ2 increases, this value gives
us a way to find the minimum n that provides a test that meets the specified
power.
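A sketch of where this minimum comes from, assuming the least favorable configuration is the one in which two treatment means sit exactly $D$ apart, symmetric about the overall mean, and all remaining means equal the overall mean (indices $p$ and $q$ are just labels for that pair):

```latex
\tau_p = -\frac{D}{2}, \qquad \tau_q = +\frac{D}{2}, \qquad \tau_i = 0 \ \ (i \neq p, q)
\quad\Longrightarrow\quad
\sum_{i=1}^{a} \tau_i^2 = \frac{D^2}{4} + \frac{D^2}{4} = \frac{D^2}{2},
\qquad
\phi^2 = \frac{n \sum_{i=1}^{a} \tau_i^2}{a \sigma^2} = \frac{n D^2}{2 a \sigma^2}.
```

Any other configuration with a pair of means at least $D$ apart has a larger $\sum \tau_i^2$, so designing for this case is conservative.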
Here we give a simple example where we are using a = 4 treatments,
σ = 2 is known from past data, and the experimenter plans to use α = .05.
Suppose that the experimenter wants to be able to reject the null hypothesis if any pair of treatment means differ by at least 5 units. The minimum value of $\phi^2$ under these conditions is
$$\phi^2 = \frac{n (5)^2}{2 (4) (2)^2} = 0.78125\,n$$
From the OC curve for $\alpha = .05$ and $\nu_1 = 3$, using $\nu_2 = 4(n - 1)$ and $\phi^2 = 0.78125\,n$, we can construct the following table for different guesses of $n$:

n    φ²      φ       ν2 = 4(n − 1)    β      Power
5    3.90    1.98    16               .18    .82
6    4.69    2.17    20               .11    .89
7    5.47    2.34    24               .07    .93
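The φ², φ and ν2 columns of this table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `phi_squared` is illustrative, and β and the power still have to be read from the OC curves (or obtained from a noncentral F distribution), which the code does not attempt.

```python
import math

def phi_squared(n, D, a, sigma):
    """Minimum phi^2 when some pair of means differs by at least D."""
    return n * D ** 2 / (2 * a * sigma ** 2)

a, sigma, D = 4, 2.0, 5.0   # values from the example above

rows = []
for n in (5, 6, 7):
    phi2 = phi_squared(n, D, a, sigma)
    # (n, phi^2, phi, error degrees of freedom nu2 = a(n - 1))
    rows.append((n, phi2, math.sqrt(phi2), a * (n - 1)))

for n, phi2, phi, nu2 in rows:
    print(f"n={n}  phi^2={phi2:.3f}  phi={phi:.2f}  nu2={nu2}")
```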