Test 3 St 711 Fall 2010 Dickey

1. I had 6 wines (labeled A through F) to evaluate using tasters as my blocks. I gave each taster a sample of 3 of the 6 wines resulting in a balanced incomplete block design. Here is part of the analysis of variance I got using the code

   PROC GLM; CLASS TASTER WINE;
   MODEL RATING = WINE TASTER; RUN;

   Dependent Variable: rating

   Source            DF   Sum of Squares   F Value   Pr > F
   Model             __   200.9652778      34.01     <.0001
   Error             __     8.6180556
   Corrected Total   59   209.5833333

   Source   DF   Type I SS      F Value   Pr > F
   wine     __     1.9833333     1.61     0.1829
   taster   __   198.9819444    42.53     <.0001

   Source   DF   Type III SS    F Value   Pr > F
   wine     XX     3.3819444     2.75     0.0340
   taster   XX   198.9819444    42.53     <.0001

(A) (8 pts.) How many total observations _______ and how many tasters ________ did I have?

(B) (4 pts.) List the appropriate p-value _______ to test for rating differences between wines.

(C) (16 pts.) Fill in the 4 blanks (degrees of freedom) in the table above.

(D) (5 pts.) Find, if possible, the number of tasters ______ that tasted wine A. If not possible put “NP”.

(E) (6 pts.) Find, if possible, the number of tasters ______ that tasted both wine A and C and the number ______ that tasted both E and F. If not possible, put “NP”.

(F) (3 pts.) I also ran PROC MIXED with TASTER random getting this on my output:

   Type 3 Tests of Fixed Effects

   Effect   Num DF   Den DF   F Value   Pr > F
   wine     5        35       2.57      0.0439

Which of the 3 p-values, (0.1829, 0.0340, 0.0439) for wine _______ from GLM and MIXED should I use if I am only interested in the tasters in the experiment (maybe they are a wine judging team from a national magazine).

2. I ran an experiment to study the juice content in apples. Several seasons ago I planted 5 trees from each of 40 families, picking 10 locations at random around the state and using 4 different families for each location. The 20 trees in each location are now producing fruit and I harvest 3 apples from each tree then squeeze out and measure the juice in each apple.

(A) (12 pts.)
If I use the code

   PROC GLM; CLASS LOCATION FAMILY TREE;
   MODEL JUICE = LOCATION FAMILY(LOCATION) TREE(FAMILY*LOCATION); RUN;

what degrees of freedom will I have for LOCATION ______, for FAMILY(LOCATION) ______, for TREE(FAMILY*LOCATION) ______ and for error _____?

(B) (4 pts.) What will be the denominator degrees of freedom ______ for the F test for FAMILY(LOCATION)?

3. (24 pts.) I have 6 fields each laid out in a 3x4 grid of plots. I want to run an experiment with 4 varieties (V) of corn and 3 fertilizers (F) using these fields as blocks. There are three possible designs listed on the left. For each design give the degrees of freedom for blocks and the denominator degrees of freedom for the F tests for V, F, and V*F, assuming all these are in your model.

                                                     Denominator df for
                                         Block df    V       F       V*F
   Randomized Complete Block             _______     _____   _____   _____
   Split Plot (V=whole plot treatment)   _______     _____   _____   _____
   Split Block                           _______     _____   _____   _____

4. I have 8 factors A,B,…,H each at 2 levels (-1,1) in a 2^8 factorial arrangement. I can only afford to make 32 runs so I’ll use only the runs where ABCD and CDEF and EFGH are all 1.

(A) (8 pts.) Write out the generalized interaction of CDEF and EFGH ______ and that of ABCD and CDEF ________

(B) (5 pts.) Those two generalized interactions can be used to discover 2 of the aliases of C. List those two aliases. _______________ ______________

(C) (5 pts.) Altogether, how many _______ aliases does C have (not counting C itself)?

***** Answers *****

1. With 59 total df we know that n = 60 = bk = tr, and we know k = 3 wines per taster (block), so there are b = 20 tasters. We know t = 6 wines in all, so there are r = 10 replicates of each. We also know that λ = r(k-1)/(t-1) = 10(2)/5 = 4 tasters taste any given pair of wines. That answers A, D, & E. If only these tasters are of interest then we would treat them as fixed effects and use the GLM Type III tests, so 0.0340 would answer B and F.
With 5 df for 6 wines and 19 df for 20 tasters we have 24 model and thus 35 error df (matching the Den DF of 35 in the PROC MIXED output). These values fill in the blanks in the analysis of variance table.

2. 10 locations -> 9 df for L. 4 families per location -> 3(10) = 30 df for F(L). (This mean square would be subtracted from MS(L) to start the estimation process for the location variance, so it would also be the denominator of F for testing location effects.) 5 trees per family, 40 families -> 4(40) = 160 df for tree(family, location). Using the same reasoning as before, this mean square is the denominator for testing family(location), so 160 is the denominator df of that test. There are 200 trees and 600 apples, so 599 total df, of which we’ve used up 9+30+160 = 199, leaving 400 df for apple to apple variation in yield (the error term in our model).

3. In RCB, (t-1)(b-1) = 11(5) = 55 df for error.
   Row 1: 5 (for blocks) 55 55 55 (denominators for tests)
In Split Plot, (a-1)(b-1) df for error A, so 3(5) = 15, and 55-15 = 40 for error B.
   Row 2: 5 15 40 40 (note the problem statement that fields are blocks).
Split block separates out F*block with 2(5) = 10 df more, leaving 40-10 = 30.
   Row 3: 5 15 10 30

4. (A) CDGH and ABEF (others: ABCD(EFGH) = ABCDEFGH and ABCD(CDEF)(EFGH) = ABGH).
(B) C = DGH = ABCEF (plus 3 from the given interactions and two more from the others above).
(C) With 1/8 of an experiment, each effect must represent 8 of the original ones, or 7 besides C in this case. With the extra aliases listed here in the answers, you could write them all out.
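
The aliases in answer 4 can be written out mechanically. Here is a minimal Python sketch (not part of the original test) that treats each effect as a set of letters, so that a generalized interaction is the symmetric difference of two sets (squared letters cancel). The function names multiply, defining_words and aliases are illustrative choices of mine, not anything from SAS.

```python
from itertools import combinations


def multiply(a: str, b: str) -> str:
    """Generalized interaction: keep letters that appear in exactly one word."""
    return "".join(sorted(set(a) ^ set(b)))


def defining_words(generators):
    """All non-identity products of the generators (the defining relation)."""
    words = set()
    for size in range(1, len(generators) + 1):
        for subset in combinations(generators, size):
            word = ""
            for g in subset:
                word = multiply(word, g)
            words.add(word)
    # Sort by word length, then alphabetically, for stable output.
    return sorted(words, key=lambda w: (len(w), w))


def aliases(effect, generators):
    """Aliases of an effect: its product with every defining word."""
    products = (multiply(effect, w) for w in defining_words(generators))
    return sorted(products, key=lambda w: (len(w), w))


# Generators taken from problem 4: runs where ABCD, CDEF and EFGH are all 1.
GENERATORS = ["ABCD", "CDEF", "EFGH"]
DEFINING = defining_words(GENERATORS)
C_ALIASES = aliases("C", GENERATORS)

print(DEFINING)
# ['ABCD', 'ABEF', 'ABGH', 'CDEF', 'CDGH', 'EFGH', 'ABCDEFGH']
print(C_ALIASES)
# ['ABD', 'DEF', 'DGH', 'ABCEF', 'ABCGH', 'CEFGH', 'ABDEFGH']
```

Three generators give 2^3 - 1 = 7 defining words, which is why C (or any other effect) has 7 aliases in this 1/8 fraction.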