Experimental Design in Agriculture
CROP 590
Final Exam, Winter, 2014

Name______Key____________

Please show your work!

Part I. Short Answer

1) An agronomist wants to measure the effect of irrigation (none, once, and twice during the cropping season) and nitrogen fertilizer (25, 50, and 75 kg/ha) on the yield of durum wheat. He decides to use a factorial set of treatments and a strip plot design, with 4 blocks.

8 pts  a) Complete the ANOVA by filling in the shaded cells (use the F table at the end of this exam). What are your conclusions from the ANOVA?

Source                 df    SS      MS     F      F critical   Conclusion
Block                  3     1.08    0.36
Irrigation             2     3.34    1.67   5.57   5.14         significant
Block*Irrigation       6     1.80    0.30
Nitrogen               2     3.02    1.51   5.21   5.14         significant
Block*Nitrogen         6     1.74    0.29
Irrigation*Nitrogen    4     0.24    0.06   0.60   3.26         not significant
Error                  12    1.20    0.10
Total                  35    12.42

Irrigation is tested against Block*Irrigation (1.67/0.30), nitrogen against Block*Nitrogen (1.51/0.29), and the interaction against Error (0.06/0.10). Both main effects are significant at the 5% level; the interaction is not.

8 pts  b) What means would you report from this experiment? Calculate the appropriate standard errors for those means.

Because the interaction is not significant, we can report the means for each of the main effects and their standard errors (r = 4 blocks, a = 3 irrigation levels, b = 3 nitrogen levels).

For irrigation means:  se = sqrt(MS(Block*Irrigation) / (r*b)) = sqrt(0.30 / (4*3)) = 0.158
For nitrogen means:    se = sqrt(MS(Block*Nitrogen) / (r*a)) = sqrt(0.29 / (4*3)) = 0.155

8 pts 6 pts  2) An experiment was conducted to determine the effects of inoculation with four bacterial strains on dry weight of a perennial grass species. The experiment was replicated in four complete blocks. The researcher intended to obtain additional harvests from the same plots for several years. A colleague advised him to treat the harvest time as a sub-plot factor in a split-plot analysis. The researcher then asks for your opinion. What type of analysis should be considered for this data set? Explain why you are recommending that analysis.

A repeated measures analysis is recommended when repeated observations are taken from the same experimental units over time. There is likely to be some correlation in errors from one time period to the next.
Furthermore, the correlations are likely to be greater between observations taken at short time intervals than between those taken at more distant sampling periods. In order for a split-plot analysis to be valid, one has to be able to assume that the covariance between subplots within each main plot is equal for all pairs of observations. This is not likely to be the case when the subplot is time. Patterns in the covariance structure can be taken into account in a repeated measures analysis.

3) The effect of storage temperature on seed viability was studied in a Completely Randomized Design (CRD). Three samples were stored at each of four temperatures: 10, 30, 50, and 70 °F. At the end of a one-year storage period the samples were tested for germination percentage. The estimate of MSE from the ANOVA was 19.0 with 8 df.

8 pts  a) Complete the table of orthogonal polynomial contrasts by filling in the shaded cells.

             Storage temperature (°F)
             10     30     50     70     Sum ki^2   Li      SSL       Fcalc
Mean         58     31     18     13
Linear       -3     -1     1      3      20         -148    3285.6    172.926
Quadratic    1      -1     -1     1      4          22      363       19.1053
Cubic        -1     3      -3     1      20         -6      5.4       0.28421

6 pts  b) What do these results tell you about the relationship between storage temperature and seed viability?

For the quadratic contrast:
Li = 58 - 31 - 18 + 13 = 22
SSL = 3*22^2/4 = 363
F = 363/19 = 19.1053
Critical F with 1, 8 df = 5.32

The linear (F = 172.926) and quadratic (F = 19.1053) contrasts exceed the critical value; the cubic contrast (F = 0.28421) does not. The relationship between storage temperature and germination percentage is therefore best described by a model that includes a linear and a quadratic component:

Yij = b0 + b1*Xi + b2*Xi^2 + eij

The response to temperature is curvilinear. Germination decreases with increased temperature. For the range of treatments included in this experiment, the loss in germination is very rapid in the low temperature range, but slows down at higher temperatures.

6 pts  4) Consider a split-plot design with 4 levels of factor A (main plots) and 2 levels of factor B (sub-plots). Assume that there is a soil gradient from high clay on the west to low clay on the east side of the field.
Circle the design below that is most likely to effectively control experimental error due to this field effect.

Answer: Design A

5) Researchers in state X wished to determine the best varieties of a new annual crop to recommend for commercial production. Five varieties were evaluated at three locations over a two-year period (a total of six sites). A randomized block design was used at each site with three blocks. Yield data were collected from each of the six trials. After performing an analysis at each site to check for outliers and confirm that assumptions for the ANOVA were satisfied, PROC GLIMMIX was used to determine if variances across sites met the assumption of homogeneity of variance needed for a combined analysis. The output is shown below:

Covariance Parameter Estimates
Cov Parm         Group                     Estimate    Standard Error
Residual (VC)    Site Hilltown 2018        7.9417      3.9708
Residual (VC)    Site Hilltown 2019        34.2160     17.1080
Residual (VC)    Site Springfield 2018     21.5927     10.7963
Residual (VC)    Site Springfield 2019     20.3512     10.1756
Residual (VC)    Site Waterbury 2018       21.7168     10.8584
Residual (VC)    Site Waterbury 2019       11.8882     5.9441

Tests of Covariance Parameters Based on the Restricted Likelihood
Label              DF    -2 Res Log Like    ChiSq    Pr > ChiSq    Note
common variance    5     324.77             4.92     0.4260        DF

4 pts  a) The covariance parameter estimate of 34.2160 represents (choose one):
   i) The mean yield in Hilltown in 2019
   ii) The MSE from the ANOVA for yield for Hilltown in 2019
   iii) The covariance of yield in Hilltown in 2018 and 2019
   iv) The Mean Square for varieties in Hilltown in 2019

5 pts  b) What conclusion can be drawn about the homogeneity of variance assumption from the Chi Square test shown above?

The observed probability for the Chi Square test (0.4260) is much greater than 0.05, so we fail to reject the null hypothesis that the variances are homogeneous.
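The probability reported for the test of common variance in question 5 can be checked by hand. The sketch below is only an illustration and not part of the SAS output: it assumes the test statistic follows a chi-square distribution with 5 df (six site variances reduced to one common variance), and the helper name chi2_sf is hypothetical. It uses only the Python standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-z), 1.0              # Q(1, z) = exp(-z)
    else:
        q, a = math.erfc(math.sqrt(z)), 0.5   # Q(1/2, z) = erfc(sqrt(z))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

chisq = 4.92   # ChiSq value copied from the output above
df = 5         # DF copied from the output above
p_value = chi2_sf(chisq, df)
print(f"Pr > ChiSq = {p_value:.4f}")
```

The result should land close to the 0.4260 in the output; any small discrepancy would come from ChiSq being rounded to two decimals in the printed table.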
The Random statement with a /test option was used in PROC GLM to generate Expected Mean Squares for an across-site analysis:

Source          Type III Expected Mean Square
Site            Var(Error) + 3 Var(Site*Variety) + 5 Var(Block(Site)) + 15 Var(Site)
Block(Site)     Var(Error) + 5 Var(Block(Site))
Variety         Var(Error) + 3 Var(Site*Variety) + Q(Variety)
Site*Variety    Var(Error) + 3 Var(Site*Variety)

6 pts  c) Based on the results above, what would be the appropriate ratio of Mean Squares to use to test for significant differences among varieties?

MS(varieties)/MS(site*variety). The expected mean square for Variety differs from that for Site*Variety only by the term Q(Variety).

5 pts  d) Are the blocks nested within sites? Explain your answer.

Blocks are nested because each block is unique to each site. They represent a random sample of possible blocks.

A summary of the results of the combined ANOVA across sites is shown below:

Source          DF    Type III SS    Mean Square    F Value    Pr > F
Site            5     10072          2014.37        16.62      <.0001
Block(Site)     12    1038.85        86.5712        4.41       0.0001
Variety         4     1072.25        268.061        4.94       0.0062
Site*Variety    20    1084.69        54.2347        2.76       0.002
Error           48    941.652        19.6178

6 pts  e) Use the ANOVA above and the graph below to give a brief interpretation of these results. Can generalizations be made about the relative performance of varieties across sites? Can you note any trends that might warrant further investigation?

The Site*Variety interaction is highly significant, so we have to be cautious about interpreting the main effects of varieties and making generalizations about the performance of varieties across sites. Nonetheless, differences among the varieties are very large and significant. Variety E was consistently good in Springfield and Waterbury. Variety D was consistently good in Hilltown. Variety B was consistently poor at all sites. The variation among sites was large in comparison to the variation among varieties, with 2019 being the better year at all sites. Blocking was effective.
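As a check on the combined ANOVA for question 5, the F values can be recomputed from the mean squares using the denominators implied by the expected mean squares. This is a sketch under one stated assumption: the denominator for Site is taken to be the synthesized term MS(Block(Site)) + MS(Site*Variety) - MS(Error), which has the same expectation as MS(Site) apart from 15 Var(Site). The variable names are illustrative only.

```python
# Mean squares copied from the combined ANOVA table.
ms = {
    "Site": 2014.37,
    "Block(Site)": 86.5712,
    "Variety": 268.061,
    "Site*Variety": 54.2347,
    "Error": 19.6178,
}

# Denominators chosen so that numerator and denominator share every
# expected-mean-square term except the one being tested.
denominators = {
    # Assumption: synthesized error term for Site (see lead-in).
    "Site": ms["Block(Site)"] + ms["Site*Variety"] - ms["Error"],
    "Block(Site)": ms["Error"],
    "Variety": ms["Site*Variety"],
    "Site*Variety": ms["Error"],
}

f_values = {source: ms[source] / denom for source, denom in denominators.items()}

for source, f in f_values.items():
    print(f"{source:<14} F = {f:.2f}")
```

Under that assumption, the recomputed ratios should agree with the F Value column of the table to two decimals.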
Further analyses could be conducted to determine the relative importance of locations and years in contributing to the variation among sites. It appears that most of the Site*Variety interaction was due to differences in relative performance of varieties in Hilltown vs. the other sites. Further experimentation would be needed to determine if this is a consistent pattern and what environmental factors (rainfall, soil type, diseases, etc.) might be impacting the yield of these varieties.

[Graph for question 5e; lines labeled Variety D, Variety E, and Variety B]

Part II. Experimental Design Question

A researcher has developed a new herbicide that can control a parasitic weed in red clover fields in the Willamette Valley. The herbicide can be applied as a seed treatment on clover, or as a post-emergence spray, but optimum rates have not been established. Widely grown varieties of clover may differ in their tolerance to the herbicide. The researcher would like to develop recommendations for use of the herbicide. Assume that the primary reason for growing the crop is for seed production. The parasite is prevalent on several acres of land that are available with a cooperative farmer who grows clover near Salem. Design an experiment that would meet the objectives of the researcher.

6 pts  1) What type of experimental design will you use? Justify your choice. Indicate any basic assumptions that you have made.

There are many reasonable solutions to this question. One possibility is to use a split-plot arrangement of treatments. It would be difficult to apply the herbicide spray to small plots, so the herbicide treatments (Factor A) will be the main plot and clover varieties (Factor B) could be the sub-plot. Rates applied to seeds may not be directly comparable to rates that are applied post-emergence, because seed treatments are active only in the volume of soil immediately surrounding each seed, whereas post-emergence sprays are applied to the entire surface area of the soil.
For that reason there is no need to consider herbicide rates as a separate factor. There is also no indication from the description that one might consider applying both the seed treatment and the post-emergence spray to the same crop, so there is no reason to look at factorial combinations of seed treatments and sprays.

Four blocks will be used to account for natural variation in the prevalence of the weed seed in the field. We have to assume that the level of the parasite is reasonably uniform within blocks. Four reps are needed to provide sufficient degrees of freedom for testing the main plots (herbicide treatments), which is of primary interest in this experiment.

6 pts  2) List the treatments of the experiment. Be sure to include any necessary controls. Explain why you have chosen this particular set of treatments.

Factor A: Herbicide Treatments
   C   = no herbicide
   SL  = seed treatment, low rate
   SH  = seed treatment, high rate
   PEL = post-emergence spray, low rate
   PEM = post-emergence spray, medium rate
   PEH = post-emergence spray, high rate

Factor B: Varieties, 2 levels (widely grown, old standard variety and the most promising new variety)

I am assuming that establishing an effective and safe rate for a seed treatment is more straightforward than determining the rate of spray that will be needed to control the parasite without causing crop damage, so I have limited the seed treatments to two levels. Three levels of herbicide spray will be applied so that the equation for the response curve can be used to estimate the optimum level of herbicide.

6 pts  3) Break out the ANOVA in terms of Sources of Variation and degrees of freedom.

Source of Variation            DF
Total                          47
Block                          3
Herbicide                      5
Block * Herbicide (error a)    15
Cultivar                       1
Herbicide * Cultivar           5
Error b                        18

4) Identify two meaningful questions that this experiment might address that are not adequately evaluated from an ANOVA.
Indicate the coefficients that would be needed for each of the treatment means to estimate Sums of Squares for the corresponding contrasts. Show how you would determine if the two contrasts are orthogonal (or not). (There is a table of orthogonal polynomial contrast coefficients at the end of this exam that can be used for reference.)

6 pts  With the exception of the highlighted questions below, all of the contrasts would meet the requirement of providing additional information not adequately evaluated from the ANOVA.

Questions pertinent to the objectives:
1) Does the use of an herbicide increase seed yield in comparison to the control (no herbicide)?
2) Does the method of herbicide application (seed vs. spray) have an effect on the seed yield of red clover?
3) Is there a difference in seed yield for the low vs. the high level of seed treatment?
4) Does red clover show a linear response to increasing rates of post-emergence spray?
5) Does red clover show a quadratic response to increasing rates of post-emergence spray?
6) Do the two varieties differ in seed yield? (equivalent to the test for main effects of varieties in the ANOVA)
7) Are the effects of the herbicide treatments the same for both varieties?
(Can do individual tests for interactions of contrast 6 with contrasts 2-5)

6 pts  Contrast coefficients:

Contrast  Control  Control  SL       SL       SH       SH       PEL      PEL      PEM      PEM      PEH      PEH
          Cult1    Cult2    Cult1    Cult2    Cult1    Cult2    Cult1    Cult2    Cult1    Cult2    Cult1    Cult2
1         -5       -5       1        1        1        1        1        1        1        1        1        1
2         0        0        3        3        3        3        -2       -2       -2       -2       -2       -2
3         0        0        -1       -1       1        1        0        0        0        0        0        0
4         0        0        0        0        0        0        -1       -1       0        0        1        1
5         0        0        0        0        0        0        -1       -1       2        2        -1       -1
6         0        0        -1       1        -1       1        -1       1        -1       1        -1       1
7         0        0        3        -3       3        -3       -2       2        -2       2        -2       2
8         0        0        -1       1        1        -1       0        0        0        0        0        0
9         0        0        0        0        0        0        -1       1        0        0        1        -1
10        0        0        0        0        0        0        -1       1        2        -2       -1       1

Test for orthogonality: use contrasts 1 and 2 and show that the sum of products = 0

(-5)(0)+(-5)(0)+(1)(3)+(1)(3)+(1)(3)+(1)(3)+(1)(-2)+(1)(-2)+(1)(-2)+(1)(-2)+(1)(-2)+(1)(-2) = 0


F Distribution 5% Points

Denominator     Numerator df
df              1        2        3        4        5        6        7
1               161.45   199.5    215.71   224.58   230.16   233.99   236.77
2               18.51    19.00    19.16    19.25    19.30    19.33    19.36
3               10.13    9.55     9.28     9.12     9.01     8.94     8.89
4               7.71     6.94     6.59     6.39     6.26     6.16     6.08
5               6.61     5.79     5.41     5.19     5.05     4.95     4.88
6               5.99     5.14     4.76     4.53     4.39     4.28     4.21
7               5.59     4.74     4.35     4.12     3.97     3.87     3.79
8               5.32     4.46     4.07     3.84     3.69     3.58     3.50
9               5.12     4.26     3.86     3.63     3.48     3.37     3.29
10              4.96     4.10     3.71     3.48     3.32     3.22     3.13
11              4.84     3.98     3.59     3.36     3.20     3.09     3.01
12              4.75     3.88     3.49     3.26     3.10     3.00     2.91
13              4.67     3.80     3.41     3.18     3.02     2.92     2.83
14              4.60     3.74     3.34     3.11     2.96     2.85     2.76
15              4.54     3.68     3.29     3.06     2.90     2.79     2.71
16              4.49     3.63     3.24     3.01     2.85     2.74     2.66
17              4.45     3.59     3.20     2.96     2.81     2.70     2.61
18              4.41     3.55     3.16     2.93     2.77     2.66     2.58
19              4.38     3.52     3.13     2.90     2.74     2.63     2.54
20              4.35     3.49     3.10     2.87     2.71     2.60     2.51
21              4.32     3.47     3.07     2.84     2.68     2.57     2.49
22              4.30     3.44     3.05     2.82     2.66     2.55     2.46
23              4.28     3.42     3.03     2.80     2.64     2.53     2.44
24              4.26     3.40     3.00     2.78     2.62     2.51     2.42
25              4.24     3.38     2.99     2.76     2.60     2.49     2.40
26
27
28
29
30


Student's t Distribution (2-tailed probability)

df    0.40     0.05      0.01
1     1.376    12.706    63.657
2     1.061    4.303     9.925
3     0.978    3.182     5.841
4     0.941    2.776     4.604
5     0.920    2.571     4.032
6     0.906    2.447     3.707
7     0.896    2.365     3.499
8     0.889    2.306     3.355
9     0.883    2.262     3.250
10    0.879    2.228     3.169
11    0.876    2.201     3.106
12    0.873    2.179     3.055
13    0.870    2.160     3.012
14    0.868    2.145     2.977
15    0.866    2.131     2.947
16    0.865    2.120     2.921
17    0.863    2.110     2.898
18    0.862    2.101     2.878
19    0.861    2.093     2.861
20    0.860    2.086     2.845
21    0.859    2.080     2.831
22    0.858    2.074     2.819
23    0.858    2.069     2.807
24    0.857    2.064     2.797
25    0.856    2.060     2.787
26    0.856    2.056     2.779
27    0.855    2.052     2.771
28    0.855    2.048     2.763
29    0.854    2.045     2.756
30    0.854    2.042     2.750
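
The orthogonality test shown for contrasts 1 and 2 in Part II, question 4 applies in the same way to every pair of contrasts in the coefficient table. The sketch below is only an illustration: the coefficients are copied from that table, and the function and variable names are hypothetical. It assumes equal replication of all twelve treatment combinations, so a sum of products of zero is sufficient for orthogonality.

```python
from itertools import combinations

# Rows are contrasts 1-10; columns are Control, SL, SH, PEL, PEM, PEH,
# each split into Cult1 and Cult2 (12 treatment combinations in all).
contrasts = {
    1:  [-5, -5,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
    2:  [ 0,  0,  3,  3,  3,  3, -2, -2, -2, -2, -2, -2],
    3:  [ 0,  0, -1, -1,  1,  1,  0,  0,  0,  0,  0,  0],
    4:  [ 0,  0,  0,  0,  0,  0, -1, -1,  0,  0,  1,  1],
    5:  [ 0,  0,  0,  0,  0,  0, -1, -1,  2,  2, -1, -1],
    6:  [ 0,  0, -1,  1, -1,  1, -1,  1, -1,  1, -1,  1],
    7:  [ 0,  0,  3, -3,  3, -3, -2,  2, -2,  2, -2,  2],
    8:  [ 0,  0, -1,  1,  1, -1,  0,  0,  0,  0,  0,  0],
    9:  [ 0,  0,  0,  0,  0,  0, -1,  1,  0,  0,  1, -1],
    10: [ 0,  0,  0,  0,  0,  0, -1,  1,  2, -2, -1,  1],
}

def sum_of_products(c1, c2):
    """Sum of cross products of two sets of contrast coefficients."""
    return sum(x * y for x, y in zip(c1, c2))

# Each contrast must sum to zero to be a valid contrast.
invalid = [k for k, c in contrasts.items() if sum(c) != 0]

# Collect any pair whose sum of products is not zero.
non_orthogonal_pairs = [
    (i, j)
    for i, j in combinations(sorted(contrasts), 2)
    if sum_of_products(contrasts[i], contrasts[j]) != 0
]

print("Contrasts not summing to zero:", invalid)
print("Non-orthogonal pairs:", non_orthogonal_pairs)
```

An empty list of non-orthogonal pairs means the whole set is mutually orthogonal, so the contrast Sums of Squares are independent pieces of the treatment Sum of Squares.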