Test3_f10.doc

Test 3 St 711 Fall 2010 Dickey
1. I had 6 wines (labeled A through F) to evaluate using tasters as my blocks. I gave each taster a sample
of 3 of the 6 wines resulting in a balanced incomplete block design. Here is part of the analysis of
variance I got using the code
PROC GLM; CLASS TASTER WINE; MODEL RATING = WINE TASTER; RUN;
Dependent Variable: rating

                          Sum of
Source             DF     Squares        F Value    Pr > F
Model              __     200.9652778      34.01    <.0001
Error              __       8.6180556
Corrected Total    59     209.5833333

Source     DF     Type I SS      F Value    Pr > F
wine       __       1.9833333       1.61    0.1829
taster     __     198.9819444      42.53    <.0001

Source     DF     Type III SS    F Value    Pr > F
wine       XX       3.3819444       2.75    0.0340
taster     XX     198.9819444      42.53    <.0001
(A) (8 pts.) How many total observations _______ and how many tasters ________ did I have?
(B) (4 pts.) List the appropriate p-value _______ to test for rating differences between wines.
(C) (16 pts.) Fill in the 4 blanks (degrees of freedom) in the table above.
(D) (5 pts.) Find, if possible, the number of tasters ______ that tasted wine A. If not possible put “NP”.
(E) (6 pts.) Find, if possible, the number of tasters ______ that tasted both wine A and C and the
number ______ that tasted both E and F. If not possible, put “NP”.
(F) (3 pts.) I also ran PROC MIXED with TASTER random getting this on my output:
Type 3 Tests of Fixed Effects
           Num    Den
Effect     DF     DF     F Value    Pr > F
wine        5     35        2.57    0.0439
Which of the 3 p-values for wine (0.1829, 0.0340, 0.0439) from GLM and MIXED should I use _______
if I am only interested in the tasters in the experiment (maybe they are a wine judging team from a
national magazine)?
2. I ran an experiment to study the juice content in apples. Several seasons ago I planted 5 trees from
each of 40 families, picking 10 locations at random around the state and using 4 different families for
each location. The 20 trees in each location are now producing fruit and I harvest 3 apples from each
tree then squeeze out and measure the juice in each apple.
(A) (12 pts.) If I use the code
PROC GLM; CLASS LOCATION FAMILY TREE;
MODEL JUICE = LOCATION FAMILY(LOCATION) TREE(FAMILY*LOCATION); RUN;
what degrees of freedom will I have for LOCATION______, for FAMILY(LOCATION) ______,
for TREE(FAMILY*LOCATION) ______ and for error _____.
(B) (4 pts.) What will be the denominator degrees of freedom ______ for the F test for
FAMILY(LOCATION)?
3. (24 pts.) I have 6 fields each laid out in a 3x4 grid of plots. I want to run an experiment with 4
varieties (V) of corn and 3 fertilizers (F) using these fields as blocks. There are three possible designs
listed on the left. For each design give the degrees of freedom for blocks and the denominator degrees
of freedom for the F tests for V F V*F assuming all these are in your model.
                                                   Denominator df for
                                       Block df    V        F        V*F
Randomized Complete Block              _______     _____    _____    _____
Split Plot (V=whole plot treatment)    _______     _____    _____    _____
Split Block                            _______     _____    _____    _____
4. I have 8 factors A,B,…,H each at 2 levels (-1,1) in a 2^8 factorial arrangement. I can only afford to make
32 runs so I’ll use only the runs where ABCD and CDEF and EFGH are all 1.
(A) (8 pts.) Write out the generalized interaction of CDEF and EFGH ______
and that of ABCD and CDEF ________
(B) (5 pts.) Those two generalized interactions can be used to discover 2 of the aliases of C. List
those two aliases.
_______________
______________
(C) (5 pts.) Altogether, how many aliases _______ does C have (not counting C itself)?
***** Answers***;
1. With 59 total df we know that n = 60 = bk = tr, and we know k = 3 wines per taster (block), so there are
b = 20 tasters. We know t = 6 wines in all, so there are r = 10 replicates of each. We also know that
λ = r(k-1)/(t-1) = 10(2)/5 = 4 tasters taste any given pair of wines. That answers A, D, & E. If only these
tasters are of interest then we would treat them as fixed effects and use the GLM Type III tests, so 0.0340
would answer B and F. With 5 df for 6 wines and 19 df for 20 tasters we have 24 model and thus 35 error df
(the same 35 that PROC MIXED reports as Den DF). These go in the blanks of the analysis of variance table.
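The bookkeeping for problem 1 can be checked with a few lines of arithmetic. This is a minimal sketch in Python (not part of the original SAS analysis); the variable names are mine, and the only inputs are the 59 total df, the 3 wines per taster, the 6 wines, and the sums of squares printed in the GLM output. It also rebuilds two of the printed F values as a check on the degrees of freedom.

```python
# BIBD bookkeeping for the wine problem; inputs are read off the problem statement.
n = 60   # total observations (corrected total df 59, plus 1)
k = 3    # wines per taster (block size)
t = 6    # wines (treatments)

b = n // k                      # number of tasters (blocks)
r = n // t                      # number of tasters who taste each wine
lam = r * (k - 1) // (t - 1)    # number of tasters who taste any given pair

wine_df = t - 1
taster_df = b - 1
model_df = wine_df + taster_df
error_df = (n - 1) - model_df

# Rebuild printed F values from the printed sums of squares.
mse = 8.6180556 / error_df
f_model = (200.9652778 / model_df) / mse
f_wine_type3 = (3.3819444 / wine_df) / mse

print("tasters:", b, " replicates:", r, " lambda:", lam)
print("df (wine, taster, model, error):", wine_df, taster_df, model_df, error_df)
print("F model:", round(f_model, 2), " F wine (Type III):", round(f_wine_type3, 2))
```

If the degrees of freedom are right, the rebuilt F values should agree with the 34.01 and 2.75 shown in the output.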
2. 10 locations -> 9 df for L. 4 families per location -> 3(10) = 30 df for F(L). (This mean square would be
subtracted from MS(L) to start the estimation process for the location variance so it would also be the
denominator of F for testing location effects.)
5 trees per family, 40 families -> 4(40) = 160 df for tree(family, location). Using the same reasoning as before,
this mean square is thus the denominator for testing family(location) so 160 is the denominator df of
that test.
There are 200 trees and 600 apples so 599 total df of which we’ve used up 9+30+160=199 leaving 400 df
for apple to apple variation in yield (the error term in our model).
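The nested degrees of freedom in problem 2 follow a simple pattern: at each level, (units per parent - 1) times the number of parents. A minimal sketch in Python (my variable names, not SAS output), using only the counts given in the problem:

```python
# Nested design counts taken from the problem statement.
locations = 10
families_per_location = 4
trees_per_family = 5
apples_per_tree = 3

n_families = locations * families_per_location
n_trees = n_families * trees_per_family
n_apples = n_trees * apples_per_tree

df_location = locations - 1
df_family = locations * (families_per_location - 1)   # FAMILY(LOCATION)
df_tree = n_families * (trees_per_family - 1)         # TREE(FAMILY*LOCATION)
df_error = n_trees * (apples_per_tree - 1)            # apples within trees

# The pieces must add up to the corrected total df.
assert df_location + df_family + df_tree + df_error == n_apples - 1

print(df_location, df_family, df_tree, df_error)
```

The denominator df for testing FAMILY(LOCATION) is then `df_tree`, since the tree mean square is the next level down.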
3. In RCB, with t = 12 treatment combinations and b = 6 blocks, (t-1)(b-1) = 11(5) = 55 df for error.
Row 1: 5 (for blocks) 55 55 55 (denominators for tests)
In Split Plot, (a-1)(b-1) df for error A (the whole plot error, with a = 4 varieties), so 3(5) = 15. 55-15 = 40 so
Row 2: 5 15 40 40 (note the problem statement that fields are blocks).
Split block separates out F*block with 2(5) = 10 df more, leaving 40-10 = 30 for V*F, so
Row 3: 5 15 10 30
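The three rows for problem 3 all come from splitting the same 55 df that the RCB leaves for error. A minimal sketch in Python (the names are mine) that does the splitting:

```python
# 6 fields (blocks), 4 varieties (V), 3 fertilizers (F).
blocks, v, f = 6, 4, 3

block_df = blocks - 1
rcb_error = (v * f - 1) * (blocks - 1)      # one pooled error term in the RCB

whole_plot_error = (v - 1) * (blocks - 1)   # V*block, the denominator for V
split_plot_error = rcb_error - whole_plot_error

f_strip_error = (f - 1) * (blocks - 1)      # F*block, the denominator for F
vf_error = rcb_error - whole_plot_error - f_strip_error

# Each row: block df, then denominator df for V, F, V*F.
rows = {
    "Randomized Complete Block": (block_df, rcb_error, rcb_error, rcb_error),
    "Split Plot": (block_df, whole_plot_error, split_plot_error, split_plot_error),
    "Split Block": (block_df, whole_plot_error, f_strip_error, vf_error),
}

for design, dfs in rows.items():
    print(design, dfs)
```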
4. CDGH and ABEF (others: ABCDEFGH and ABCD(CDEF)(EFGH) = ABGH)
C = DGH = ABCEF (plus 3 from given interactions and two more from others above)
With 1/8 of an experiment, each effect must represent 8 of the original ones or 7 besides C in this
case. With the extra aliases listed here in the answers, you could write them all out.
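Writing them all out is mechanical: a generalized interaction keeps the letters that appear an odd number of times, which is a symmetric difference of letter sets. A minimal sketch in Python (the helper name `product` is mine) that builds the 7 words of the defining relation and the aliases of C:

```python
from itertools import combinations


def product(*words):
    """Multiply effects: letters appearing an even number of times cancel."""
    letters = set()
    for word in words:
        letters ^= set(word)
    return "".join(sorted(letters))


generators = ["ABCD", "CDEF", "EFGH"]

# Every product of one or more generators is also held at 1 in the fraction.
defining_words = set()
for size in range(1, len(generators) + 1):
    for combo in combinations(generators, size):
        defining_words.add(product(*combo))

# Multiplying C by each defining word gives an alias of C.
aliases_of_c = sorted(product("C", word) for word in defining_words)

print(sorted(defining_words))
print(aliases_of_c)
```

The same loop works for any other effect by swapping "C" for that effect's letters.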