STATISTICS 402B Spring 2016

Name
Midterm Exam II (100 points)
1. Four different washing solutions (labeled, say, 1, 2, 3, and 4) are being compared to study their effectiveness
in retarding bacterial growth in 5-gallon milk containers. The bacteria count remaining in a container after
3 hours is the response variable.
(a) (6) Give at least one advantage and one disadvantage of conducting this experiment using a completely
randomized design. (Hint: think of sample sizes, randomization, estimate of experimental error)
(b) (6) The experiment is run in a laboratory and only four trials can be run in a day. The experimenter
decides to use a randomized complete block design with “Day” as a blocking factor. If measurements
are being taken over five days, give an experimental plan for the experiment (i.e., the order of experimental runs to be run each day).
(c) (8) Suppose the model is $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$; $i = 1, \ldots, 4$ (Solutions), $j = 1, \ldots, 5$ (Days). Fill out
a partial ANOVA table (SV and d.f. only) for the experiment described in part (b).

SV          d.f.
Solution
Day
Error
Total
(d) (6) Calculate a 95% CI for $\tau_1 - \tau_2$ if it is given that $\bar{y}_{1.} = 23.1$, $\bar{y}_{2.} = 25.2$, $SSE = 23.52$, and
$t_{.025,12} = 2.179$. (Hint: First, use the above table and the given SSE to calculate $MSE$.)
(e) (8) Suppose that 4 chemical analysts are available. The experimenter decides to use a Latin square design in order to control the variation among the different days as well as among the different analysts.
Thus Day and Analyst are blocking factors. Show a possible experimental plan the experimenter could
have used (after randomization). Start with a basic plan.
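A run order of the kind asked for in part (b) can be produced by shuffling the four solutions independently within each day. The sketch below uses only Python's standard `random` module. The function name `rcbd_plan`, the day labels, and the seed are illustrative assumptions, not part of the exam.

```python
import random

def rcbd_plan(treatments, blocks, seed=None):
    """Return {block: run order} with each treatment run once per block.

    The order is randomized separately inside every block, which is the
    restriction on randomization that defines an RCBD.
    """
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # independent shuffle for this block only
        plan[block] = order
    return plan

solutions = [1, 2, 3, 4]                      # washing solutions
days = [f"Day {d}" for d in range(1, 6)]      # five days act as blocks
plan = rcbd_plan(solutions, days, seed=402)   # seed is arbitrary (assumption)

for day, order in plan.items():
    print(day, order)
```

Each printed row is one day's run order. A different seed gives a different, equally valid plan.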
2. The following table gives the percent shrinkage during dyeing of 4 types of fabric at 4 different dye temperatures. The effects of both fabric type and dye temperature, as well as any combined effects, were
of interest. The fabric-temperature combinations were allocated completely at random to 32 experimental
runs so that each combination was replicated twice, and the runs were performed in random order. The data are shown
below:
                      Temperature
Fabric    210°F       215°F       220°F       225°F
I         0.8, 1.6    1.8, 2.4    3.2, 4.0    7.5, 8.2
II        2.2, 1.8    3.4, 2.8    5.2, 4.6    9.8, 9.2
III       2.8, 3.6    4.2, 4.8    6.2, 7.4    11.8, 12.6
IV        2.2, 2.8    3.3, 3.9    5.2, 5.8    10.4, 11.0
(a) (6) Describe the type of experiment used here by naming both the treatment arrangement and the
experimental design used.
(b) (10) Complete the ANOVA table for this experiment. You may use the JMP output noting that some
numbers are missing in the JMP output.

Source of Variation    d.f.    SS    MS    F    p-value
Fabric
Temperature
Fabric*Temperature
Error                  16
Total                  31
(c) (6) Obtain numbers from the JMP output and fill out the following table of means:

                      Temperature
Fabric              210°F    215°F    220°F    225°F    $\bar{y}_{i..}$
I
II
III
IV
$\bar{y}_{.j.}$
(d) (6) How would you determine whether there is significant interaction between fabric type and temperature effects? (If you perform a test of hypothesis, you must give the test statistic, the p-value,
and your decision.) Does the plot in the JMP output support your conclusion? Explain how or why.
(e) (6) Calculate the 95% CI for the difference in the effects of Fabric I and II. ($t_{.025,16} = 2.12$)
(f) (6) Calculate the 95% CI for the difference in the effects of Temperature 220°F and 225°F. ($t_{.025,16} = 2.12$)
3. (a) (6) Give two advantages of using a factorial experiment instead of single factor experiments to study
effects of several factors.
(b) (4) Explain how to define the main effect A of a $2^2$ factorial with factors A and B. Recall that the
treatment means are identified using the notation (1), a, b, ab.
(c) (4) Explain how to define the interaction effect AB of a $2^2$ factorial with factors A and B. Recall that
the treatment means are identified using the notation (1), a, b, ab.
Spring 2016
STATISTICS 402B
Examination 2 FORMULA SHEET
Single Factor in a CRD
Levels of the factor are the treatments; there are a treatments.
Model: $y_{ij} = \mu_i + e_{ij}$, where $e_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, or equivalently, by the effects model: $y_{ij} = \mu + \tau_i + e_{ij}$, where
$i = 1, \ldots, a$; $j = 1, \ldots, n_i$.
ANOVA table to test $H_0: \mu_1 = \cdots = \mu_a$ vs. $H_a$: at least one inequality,
or $H_0: \tau_1 = \cdots = \tau_a$ vs. $H_a$: at least one inequality

Source of Variation    d.f.     SS       MS       F
Treatment              a - 1    SSTrt    MSTrt    MSTrt/MSE
Error                  N - a    SSE      MSE
Total                  N - 1    SST

In the above table, $N = \sum_{i=1}^{a} n_i$ if the sample sizes are $n_1, n_2, \ldots, n_a$.
If the sample sizes are equal, i.e., $n_1 = n_2 = \cdots = n_a = n$, we have that $N = an$.
Reject $H_0$ at $\alpha$ level of significance if $F > F_{\alpha, a-1, N-a}$ or use the p-value from JMP.
LSD Procedure: (can be used this way only when all sample sizes are equal to n)
$LSD_\alpha = t_{\alpha/2, N-a} \cdot s_E \sqrt{2/n}$, where $s_E = \sqrt{MSE}$
Tukey's Method: (can be used this way only when all sample sizes are equal to n)
$Tukey_\alpha = HSD_\alpha = Q \cdot s_E \sqrt{2/n}$, where Q is taken from the JMP output.
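The two yardsticks above differ only in their multiplier, which a few lines of Python make concrete. This is a minimal sketch for equal sample sizes. The function names and every numeric input (`t_crit`, `q_jmp`, `mse`, `n`) are hypothetical values chosen for illustration, not taken from any exam problem.

```python
import math

def lsd(t_crit, mse, n):
    """Least significant difference: t_{alpha/2, N-a} * s_E * sqrt(2/n)."""
    return t_crit * math.sqrt(mse) * math.sqrt(2.0 / n)

def hsd(q_jmp, mse, n):
    """Tukey yardstick as on the formula sheet: Q * s_E * sqrt(2/n),
    where Q is the value JMP reports."""
    return q_jmp * math.sqrt(mse) * math.sqrt(2.0 / n)

# Hypothetical inputs (assumptions for illustration only)
t_crit = 2.12   # t critical value for the error d.f.
q_jmp = 2.86    # Q as it would be read from JMP output
mse = 0.25      # mean square error, so s_E = 0.5
n = 4           # common sample size per treatment

lsd_value = lsd(t_crit, mse, n)
hsd_value = hsd(q_jmp, mse, n)
print(f"LSD = {lsd_value:.4f}, HSD = {hsd_value:.4f}")
```

Two treatment means are declared different when the absolute difference of their sample means exceeds the chosen yardstick. With these inputs the Tukey yardstick is the larger of the two, reflecting its control of the experimentwise error rate.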
Single Factor in a RCBD
There are a treatments; b blocks.
Model: $y_{ij} = \underbrace{\mu + \tau_i}_{\mu_i} + \beta_j + \epsilon_{ij}$, $i = 1, \ldots, a$; $j = 1, \ldots, b$, where $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$
ANOVA table to test $H_0: \mu_1 = \cdots = \mu_a$ vs. $H_a$: at least one inequality,
or $H_0: \tau_1 = \cdots = \tau_a$ vs. $H_a$: at least one inequality

SV            d.f.              SS       MS                F
Treatments    a - 1             SSTrt    SSTrt/(a - 1)     MSTrt/MSE
Blocks        b - 1             SSBlk    SSBlk/(b - 1)
Error         (a - 1)(b - 1)    SSE      MSE (= $s_E^2$)
Total         N - 1 (= ab - 1)  SST
Confidence Intervals for Pairwise Comparisons
A $100(1 - \alpha)\%$ C.I. for $\tau_p - \tau_q$ (or $\mu_p - \mu_q$) is $(\bar{y}_{p.} - \bar{y}_{q.}) \pm t_{\alpha/2, \nu} \cdot s_E \sqrt{2/b}$, where
$\nu = (a - 1)(b - 1)$
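As a check on how the pieces of this interval fit together, the sketch below plugs in the quantities stated in Problem 1(d): treatment means 23.1 and 25.2, SSE = 23.52, a = 4 solutions, b = 5 days, and t = 2.179. The function name `rcbd_pairwise_ci` is an illustrative assumption. Treat the result as a worked illustration under those stated numbers, not an official solution.

```python
import math

def rcbd_pairwise_ci(mean_p, mean_q, sse, a, b, t_crit):
    """CI for tau_p - tau_q in an RCBD with a treatments and b blocks."""
    df_error = (a - 1) * (b - 1)          # error degrees of freedom
    mse = sse / df_error                  # MSE = SSE / d.f.
    s_e = math.sqrt(mse)
    margin = t_crit * s_e * math.sqrt(2.0 / b)
    diff = mean_p - mean_q
    return diff - margin, diff + margin

# Values quoted in Problem 1(d)
lower, upper = rcbd_pairwise_ci(23.1, 25.2, sse=23.52, a=4, b=5, t_crit=2.179)
print(f"95% CI for tau_1 - tau_2: ({lower:.3f}, {upper:.3f})")
```

Here the error d.f. is 12, which matches the subscript on the t value given in the problem, and MSE = 23.52/12 = 1.96.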
Single Factor in a Latin Square Design
Model: $y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k + \epsilon_{ijk}$, where $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$ and $y_{ijk}$ is an observation in the $i$th row, $k$th column
for the $j$th treatment; $i, j, k = 1, \ldots, p$. (Note: $\tau_j$ is the $j$th treatment effect.)
ANOVA table to test $H_0: \tau_1 = \cdots = \tau_p$ vs. $H_a$: at least one inequality

SV            d.f.              SS        MS                F
Treatments    p - 1             SSTrt     MSTrt             MSTrt/MSE
Rows          p - 1             SSRows    MSRows
Cols          p - 1             SSCols    MSCols
Error         (p - 1)(p - 2)    SSE       MSE = $s_E^2$
Total         $p^2 - 1$         SST

Confidence Interval for Pairwise Comparisons:
A $100(1 - \alpha)\%$ CI for $\tau_p - \tau_q$ is $(\bar{y}_{.p.} - \bar{y}_{.q.}) \pm t_{\alpha/2, \nu} \cdot s_E \sqrt{2/p}$, where $s_E^2 = MSE$ and $\nu = (p - 1)(p - 2)$
Basic Latin Squares:
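The basic squares themselves are not reproduced in this copy. One common way to obtain a basic (cyclic) square and then randomize it is sketched below, using only Python's standard `random` module. The function names, the choice of a cyclic starting square, and the seed are assumptions for illustration. The sheet's own basic squares may be laid out differently.

```python
import random

def basic_latin_square(p):
    """Cyclic p x p square: row i is 0, 1, ..., p-1 shifted left by i."""
    return [[(i + j) % p for j in range(p)] for i in range(p)]

def randomize_square(square, labels, seed=None):
    """Randomly permute rows, columns, and treatment labels of a square."""
    rng = random.Random(seed)
    p = len(square)
    rows = list(range(p))
    cols = list(range(p))
    trts = list(labels)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(trts)
    return [[trts[square[r][c]] for c in cols] for r in rows]

def is_latin(square):
    """True if every symbol appears exactly once in each row and column."""
    p = len(square)
    symbols = set(square[0])
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == p and rows_ok and cols_ok

basic = basic_latin_square(4)
plan = randomize_square(basic, labels=[1, 2, 3, 4], seed=2016)  # seed arbitrary
for row in plan:
    print(row)
```

Rows could stand for days and columns for analysts, with the entries giving the treatment assigned to each day-analyst cell. Permuting rows, columns, and labels preserves the Latin property, which `is_latin` confirms.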
Two-way Factorial in a CRD
Two factors A (at a levels) and B (at b levels), crossed, giving an A × B factorial with each treatment combination
replicated n times.
Model: $y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$, $i = 1, \ldots, a$; $j = 1, \ldots, b$; $k = 1, \ldots, n$;
$\tau_i$ = effect of the $i$-th level of A; $\beta_j$ = effect of the $j$-th level of B; $(\tau\beta)_{ij}$ = interaction effect; and $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$
ANOVA Table:

SV           d.f.              SS       MS                 F
Treatment    ab - 1            SSTrt    MSTrt              MSTrt/MSE
A            a - 1             SSA      MSA                MSA/MSE
B            b - 1             SSB      MSB                MSB/MSE
AB           (a - 1)(b - 1)    SSAB     MSAB               MSAB/MSE
Error        ab(n - 1)         SSE      MSE (= $s_E^2$)
Total        abn - 1           SST

Confidence Intervals for Main Effects:
Factor A: A $100(1 - \alpha)\%$ C.I. for $\tau_1 - \tau_2$ (or $\bar{\mu}_{1.} - \bar{\mu}_{2.}$) is
$(\bar{y}_{1..} - \bar{y}_{2..}) \pm t_{\alpha/2, \nu} \cdot s_E \sqrt{2/(bn)}$,
where $\nu = ab(n - 1)$ and $s_E^2 = MSE$
Factor B: A $100(1 - \alpha)\%$ C.I. for $\beta_1 - \beta_2$ (or $\bar{\mu}_{.1} - \bar{\mu}_{.2}$) is
$(\bar{y}_{.1.} - \bar{y}_{.2.}) \pm t_{\alpha/2, \nu} \cdot s_E \sqrt{2/(an)}$,
where $\nu = ab(n - 1)$ and $s_E^2 = MSE$
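To illustrate the main-effect intervals, the sketch below applies them to the fabric dye experiment of Problem 2, with a = b = 4 and n = 2. The level means and MSE = 0.2466 are those printed in the edited JMP output, and t = 2.12 is the value quoted in the exam. The function name `main_effect_ci` and its `obs_per_mean` parameter are illustrative assumptions. This is a worked illustration under those numbers, not an official answer key.

```python
import math

def main_effect_ci(mean_1, mean_2, mse, t_crit, obs_per_mean):
    """CI for a difference of two level means of one factor.

    obs_per_mean is b*n for factor A levels and a*n for factor B levels.
    """
    margin = t_crit * math.sqrt(mse) * math.sqrt(2.0 / obs_per_mean)
    diff = mean_1 - mean_2
    return diff - margin, diff + margin

a, b, n = 4, 4, 2      # fabrics, temperatures, replicates
mse = 0.2466           # Mean Square for Error from the JMP output
t_crit = 2.12          # t_{.025,16} as given in the exam

# Fabric I vs. Fabric II: each fabric mean averages b*n = 8 observations
fabric_ci = main_effect_ci(3.6875, 4.8750, mse, t_crit, obs_per_mean=b * n)

# 220 F vs. 225 F: each temperature mean averages a*n = 8 observations
temp_ci = main_effect_ci(5.2000, 10.0625, mse, t_crit, obs_per_mean=a * n)

print("Fabric I - II:  (%.3f, %.3f)" % fabric_ci)
print("Temp 220 - 225: (%.3f, %.3f)" % temp_ci)
```

Because a = b here, both comparisons share the same margin of error. Only the point estimates differ.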
3. (d) (12) The graph below shows the Yield totals of a $2^2$ factorial with 3 replications for each treatment
combination, where A (Concentration) and B (Catalyst) are the two factors under study:
i. Calculate the Factor A effect.
ii. Calculate the Factor B effect.
iii. Calculate the Interaction AB effect.
iv. Calculate the degrees of freedom for Total SS in this experiment.
v. Calculate the degrees of freedom for Error SS in this experiment.
vi. Calculate the degrees of freedom for Treatment SS in this experiment.
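The graph for this part is not reproduced in this copy, so the treatment totals below are hypothetical placeholders, chosen only to show the arithmetic. The sketch assumes the usual contrast definitions for a $2^2$ factorial when totals (not means) over n replicates are given: each effect is its contrast divided by 2n. The function names are illustrative.

```python
def effects_2x2(total_1, total_a, total_b, total_ab, n):
    """Main and interaction effects from treatment totals over n replicates."""
    effect_a = (total_a + total_ab - total_b - total_1) / (2 * n)
    effect_b = (total_b + total_ab - total_a - total_1) / (2 * n)
    effect_ab = (total_ab + total_1 - total_a - total_b) / (2 * n)
    return effect_a, effect_b, effect_ab

def degrees_of_freedom(n, n_treatments=4):
    """d.f. for Total, Error, and Treatment SS in a replicated factorial CRD."""
    total = n_treatments * n - 1
    treatment = n_treatments - 1
    error = n_treatments * (n - 1)
    return {"total": total, "error": error, "treatment": treatment}

# Hypothetical totals for (1), a, b, ab (NOT read from the missing graph)
effect_a, effect_b, effect_ab = effects_2x2(80, 100, 60, 90, n=3)
df = degrees_of_freedom(n=3)

print(f"A = {effect_a:.2f}, B = {effect_b:.2f}, AB = {effect_ab:.2f}")
print(df)
```

With 3 replicates of 4 treatment combinations there are 12 runs, so the d.f. split as 11 = 3 + 8 regardless of the totals used.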
Edited JMP Output for Fabric Dye Data (Problem #2)

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model       15    329.32469         21.9550        89.0443
Error       16    3.94500           0.2466         Prob > F
C. Total    31    333.26969                        <.0001*

Effect Tests
Source                Nparm    DF    Sum of Squares    F Ratio    Prob > F
Fabric                               37.67594                     <.0001*
Temperature                          288.08094                    <.0001*
Fabric*Temperature                   3.56781                      0.1951

Effect Details

Fabric
Least Squares Means Table
Level    Least Sq Mean    Std Error    Mean
I        3.6875000                     3.68750
II       4.8750000                     4.87500
III      6.6750000                     6.67500
IV       5.5750000                     5.57500

Temperature
Least Squares Means Table
Level    Least Sq Mean    Std Error    Mean
210      2.225000                      2.2250
215      3.325000                      3.3250
220      5.200000                      5.2000
225      10.062500                     10.0625

Fabric*Temperature
Least Squares Means Table
Level      Least Sq Mean    Std Error
I,210      1.200000         0.35111430
I,215      2.100000         0.35111430
I,220      3.600000         0.35111430
I,225      7.850000         0.35111430
II,210     2.000000         0.35111430
II,215     3.100000         0.35111430
II,220     4.900000         0.35111430
II,225     9.500000         0.35111430
III,210    3.200000         0.35111430
III,215    4.500000         0.35111430
III,220    6.800000         0.35111430
III,225    12.200000        0.35111430
IV,210     2.500000         0.35111430
IV,215     3.600000         0.35111430
IV,220     5.500000         0.35111430
IV,225     10.700000        0.35111430
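The sums of squares in this output can be reproduced directly from the raw shrinkage data in Problem 2 using the two-way factorial decomposition on the formula sheet. The sketch below is plain Python with no statistics library. The variable names are illustrative, and the data are transcribed from the Problem 2 table.

```python
# Percent shrinkage: data[fabric][temperature] = (replicate 1, replicate 2)
data = {
    "I":   {210: (0.8, 1.6), 215: (1.8, 2.4), 220: (3.2, 4.0), 225: (7.5, 8.2)},
    "II":  {210: (2.2, 1.8), 215: (3.4, 2.8), 220: (5.2, 4.6), 225: (9.8, 9.2)},
    "III": {210: (2.8, 3.6), 215: (4.2, 4.8), 220: (6.2, 7.4), 225: (11.8, 12.6)},
    "IV":  {210: (2.2, 2.8), 215: (3.3, 3.9), 220: (5.2, 5.8), 225: (10.4, 11.0)},
}
fabrics = list(data)
temps = [210, 215, 220, 225]
a, b, n = len(fabrics), len(temps), 2

obs = [y for f in fabrics for t in temps for y in data[f][t]]
grand = sum(obs) / len(obs)

# Cell, fabric, and temperature means
cell = {(f, t): sum(data[f][t]) / n for f in fabrics for t in temps}
fab_mean = {f: sum(cell[f, t] for t in temps) / b for f in fabrics}
temp_mean = {t: sum(cell[f, t] for f in fabrics) / a for t in temps}

# Sums of squares for the two-way factorial in a CRD
ss_a = b * n * sum((fab_mean[f] - grand) ** 2 for f in fabrics)
ss_b = a * n * sum((temp_mean[t] - grand) ** 2 for t in temps)
ss_trt = n * sum((m - grand) ** 2 for m in cell.values())
ss_ab = ss_trt - ss_a - ss_b
ss_e = sum((y - cell[f, t]) ** 2 for f in fabrics for t in temps for y in data[f][t])
ss_t = sum((y - grand) ** 2 for y in obs)

ms_e = ss_e / (a * b * (n - 1))
f_ab = (ss_ab / ((a - 1) * (b - 1))) / ms_e

print(f"SSA={ss_a:.5f} SSB={ss_b:.5f} SSAB={ss_ab:.5f}")
print(f"SSE={ss_e:.5f} SST={ss_t:.5f} MSE={ms_e:.4f} F_AB={f_ab:.3f}")
```

The computed `ss_trt` corresponds to the Model line, and `ss_a`, `ss_b`, and `ss_ab` correspond to the Effect Tests lines. The F ratios that are blank in the edited output follow by dividing each mean square by `ms_e`.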