Experimental Design in Agriculture
CROP 590
Final Exam, Winter, 2016

Name_______KEY___________

Part I. Short answer – please show your work

1) An experiment is conducted to evaluate the yield of seven oat cultivars at four locations that represent a sample of the environments in which the cultivars are likely to be grown. The experimental design at each location is a randomized complete block design with three replications.

Source          df   Mean Square   Expected Mean Square
Location         3   MS1           σ²_e + 7σ²_Rep(Loc) + 21σ²_Loc
Rep(Loc)         8   MS2           σ²_e + 7σ²_Rep(Loc)
Cultivar         6   MS3           σ²_e + 3σ²_(Loc x Cultivar) + 12θ²_Cult
Loc*Cultivar    18   MS4           σ²_e + 3σ²_(Loc x Cultivar)
Error           48   MS5           σ²_e

3 pts  a) Based on the Expected Mean Squares given in the table above, what would be the appropriate ratio of Mean Squares to use to calculate an F value to determine if there are differences among the cultivars?

MS3/MS4

4 pts  b) Are Replications and Locations nested or cross-classified? Explain your answer.

Reps are nested within locations. Each block is unique to each location.

4 pts  c) The seven oat cultivars include the most promising new cultivars from your breeding program, and you are considering them for commercial release. Do you think that cultivars should be designated as fixed or random effects in this experiment? Defend your choice.

Cultivars should be fixed. We would like to know how the new varieties compare to each other and to check varieties. We are interested in this particular set of cultivars.

4 pts  d) Using Cultivars as an example, explain what the Expected Mean Square in the ANOVA represents and define each of its components.

The Expected Mean Square σ²_e + 3σ²_(Loc x Cultivar) + 12θ²_Cult represents the variation among cultivar means.
σ²_e is the experimental error variance
σ²_(Loc x Cultivar) is the variance component for location x cultivar interactions
θ²_Cult is the variation among the fixed cultivar effects

2) A researcher wished to study the relationships between irrigation and nitrogen response in corn. Because irrigation could only be applied to large plots, she decided to use a split plot design with the irrigation treatments (irrigated and nonirrigated) as main plots and nitrogen fertility (60, 90, 120, 150 and 180 lbs/acre) as the subplots. The trial was planted in four complete blocks. Yield was recorded in bu/acre.

10 pts  Complete the ANOVA (fill in shaded areas):

Source            df      SS        MS        F
Total             39   12879
Block              3    1911    637
Irrigation         1    7445   7445       58.164
Error a            3     384    128
Nitrogen           4    1834    458.5     15.28
Irrigation x N     4     585    146.25     4.875
Error b           24     720     30

5 pts  a) Using the F table in the back of this exam, what are your conclusions regarding the effects of irrigation and nitrogen on corn yield?

The irrigation x N interaction is significant (F = 4.875 is greater than the critical F = 2.78 with 4 and 24 df), so we have to be careful about interpreting the results of main effects. The response of corn to N depends on irrigation.

5 pts  b) Calculate the standard error for an irrigation treatment mean.

se = sqrt(MS Error a / (r*b)) = sqrt(128/20) = 2.53, where r = 4 blocks and b = 5 nitrogen levels

8 pts  3) You are reading an article that was published in 1965. The authors were evaluating the effect of growth promoters on Douglas Fir seedlings. Measurements were taken at monthly intervals over the first two years of growth, and time of sampling was analyzed as a sub-plot factor in a split-plot analysis. What type of analysis should be considered for this data set today? What are the advantages of the current methods of analysis compared to the split-plot in time?

Today we would recommend a repeated measures analysis when repeated observations are taken from the same experimental units over time. There is likely to be some correlation in errors from one time period to the next.
Furthermore, the correlations are likely to be greater between observations taken at short time intervals than between those taken at more distant sampling periods. Patterns in the covariance structure are taken into account in a repeated measures analysis.

4) An experiment was conducted to determine the effect of storage temperature on the potency of an antibiotic. Fifteen samples of the antibiotic were obtained and three samples, selected at random from the fifteen, were stored at each of five temperatures: 10, 30, 50, 70, 90. At the end of a thirty day storage period the samples were tested for potency with the following results:

Temperature   10   30   50   70   90
Mean          58   31   18   13   11

Source         df       SS        MS        F
Total          14   4680.4
Temperature     4   4520.4   1130.1   70.63**
Error          10    160.0     16.0

Orthogonal Polynomial Coefficients are used to obtain the following contrasts:

                  Temperature
Contrast     10   30   50   70   90      L   Σk²     SS(L)        F
Linear       -2   -1    0    1    2   -112    10   3763.20   235.2
Quadratic     2   -1   -2   -1    2     58    14    720.86    45.05
Cubic        -1    2    0   -2    1    -11    10     36.30     2.27
Quartic       1   -4    6   -4    1      1    70      0.04     0

8 pts  a) Fill in the shaded areas to complete the analysis of contrasts. Show your calculations below.

L = (2*58) + (-1*31) + (-2*18) + (-1*13) + (2*11) = 58

SS(L) = r*L² / Σk² = 3*(58)² / 14 = 10092/14 = 720.86

6 pts  b) What do these results tell you about the relationship between storage temperature and antibiotic potency? Use the F table at the end of this exam to support your conclusions.

These data indicate that antibiotic potency is a quadratic function of temperature. The critical F at the alpha = 0.05 level, with 1 and 10 df, is 4.96. The F values for the linear and quadratic contrasts exceed the critical value, whereas the F values for the cubic and quartic contrasts are nonsignificant. The linear contrast is negative, so potency is decreasing with increased storage temperature. The significant quadratic contrast indicates that the response is curvilinear. The rate of the decline in potency is reduced at higher temperatures.
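The contrast arithmetic in question 4 can be checked with a short script. This is only a sketch in Python (not part of the original exam); the variable and function names are illustrative, and the means, coefficients, r = 3, and error mean square of 16.0 are taken from the tables above.

```python
# Treatment means at 10, 30, 50, 70 and 90 degrees (from the exam table)
means = [58, 31, 18, 13, 11]
r = 3        # samples stored at each temperature
mse = 16.0   # error mean square from the ANOVA (10 df)

# Orthogonal polynomial coefficients for five equally spaced levels
coefficients = {
    "Linear":    [-2, -1, 0, 1, 2],
    "Quadratic": [2, -1, -2, -1, 2],
    "Cubic":     [-1, 2, 0, -2, 1],
    "Quartic":   [1, -4, 6, -4, 1],
}

def contrast(k, means, r, mse):
    """Return (L, sum of k^2, SS(L), F) for one single-df contrast."""
    L = sum(ki * m for ki, m in zip(k, means))
    sum_k2 = sum(ki ** 2 for ki in k)
    ss = r * L ** 2 / sum_k2
    return L, sum_k2, ss, ss / mse

results = {name: contrast(k, means, r, mse) for name, k in coefficients.items()}
for name, (L, sum_k2, ss, f) in results.items():
    print(f"{name:<10} L={L:>5} sum_k2={sum_k2:>3} SS(L)={ss:9.2f} F={f:7.2f}")

# The four single-df contrasts should partition the Temperature SS (4520.4)
total_ss = sum(ss for _, _, ss, _ in results.values())
print(f"Sum of contrast SS = {total_ss:.1f}")
```

Because the four contrasts are mutually orthogonal, their sums of squares add up to the Temperature sum of squares, which is a quick way to catch an arithmetic slip in any one row.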
5) Eight meadowfoam families were evaluated for seed oil content in a field study. The experiment was blocked to account for soil heterogeneity and for ease of field operations. Each of the 8 families was randomly assigned to two complete blocks. A 3' x 20' area of each plot was harvested and threshed and the seeds were cleaned and weighed. A representative sample of seed was taken from each plot and sent to the OSU seed lab for determination of oil content (%). The researcher requested that duplicate NMR analyses be conducted on each sample. All of the data were analyzed with PROC GLM in SAS.

The GLM Procedure

Dependent Variable: Oil

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              15      50.75397187    3.38359812      6.43   <.0001
Error              16       8.41925000    0.52620313
Corrected Total    31      59.17322187

R-Square   Coeff Var   Root MSE   Oil Mean
0.857719    2.831204   0.725399   25.62156

Source          DF   Type III SS   Mean Square   F Value   Pr > F
Block            1    2.32740312    2.32740312      4.42   0.0516
Family           7   45.33204687    6.47600670     12.31   <.0001
Block*Family     7    3.09452187    0.44207455      0.84   0.5706

4 pts  a) Are the F Value and Pr > F for Families in this output correct? Explain your answer.

No. Although the appropriate error term (the block*family interaction) has been included in the model, SAS has used the residual error as the default for the F test. The residual error (0.52620313) represents the pooled sampling error rather than the true experimental error (the block*family mean square, 0.44207455). Based on the expected mean squares, we expect the block*family interaction to be greater than or equal to the sampling variance. Using the sampling error for the F test will therefore tend to inflate the F ratio. That's not the case in this data set, but we still have too many degrees of freedom when we use the sampling error for the F test, which will increase Type I error.

6 pts  b) Calculate the correct F statistic for families and determine if there are significant differences among families using the F table at the back of this exam.
F observed = 6.476/0.44207 = 14.65
Critical F with 7 and 7 df = 3.79
Reject H0 and conclude that there are significant differences among the families.

6 pts  6) An experiment has been conducted to determine the effects of Nitrogen and Phosphorus fertilizer on the growth of spinach. Because the fertilizer treatments were applied with a farm-scale fertilizer spreader, a strip-plot design was used with three complete blocks. In the diagram below, shade or circle examples of the designated experimental units:

a) Block I – an experimental unit for a Nitrogen treatment
b) Block II – an experimental unit for a Phosphorus treatment
c) Block III – the experimental unit for a specific combination of Nitrogen and Phosphorus that would be used to evaluate the importance of Nitrogen x Phosphorus interactions

[Diagram: Blocks I, II, and III, each with nitrogen strips (N1, N2, N3) crossed by phosphorus strips (P1, P2, P3)]

Part II. Experimental Design (Answer Questions A through E)

As an agronomist, you are interested in studying the effect of phosphate fertilizer and potash fertilizer on the yield of a perennial forage crop. Optimum rates have been established for each of the fertilizers individually, but you would like to find out if the application of one fertilizer affects the response to the other fertilizer. Other studies have indicated that the timing of application has an effect on the crop's ability to use the fertilizer. To test this, you decide to use three different application dates: November 1, January 1, and March 1. The fertilizer application does not require large machinery. A local farmer has a large field that has been uniformly planted to the forage crop. There is also greenhouse space available, and flats in which you could plant the crop.

Answers will vary.

A) Which site will you use for the experiment? Justify your choice.
3 pts

I would use the farmer's field, as it would be very difficult to make recommendations about the best time for fertilizer application based on a greenhouse experiment in pots.

4 pts  B) List the treatments of the experiment. Be sure to include any necessary controls. Explain why you have chosen this particular set of treatments.

Fertilizer application date is one factor, with three dates. Because the optimum rates for the fertilizers are already known, I would use a 2x2 factorial combination of P and K (NoP-NoK, P only, K only, P+K). The NoP-NoK treatment acts as a control.

6 pts  C) What type of experimental design will you use? Justify your choice. Indicate any basic assumptions that you have made.

Although it would be perfectly acceptable to combine the 3 application dates and the fertilizer treatments in a 3-way factorial in an RBD, I would be inclined to treat the application dates as the main plot and apply the fertilizer treatments as subplots in a split-plot design. This would permit the fertilizer to be applied in a contiguous area at each application date, which might provide some benefits similar to blocking. This might increase the precision for testing the interactions of P and K, which is our primary interest. It might also be easier logistically, because all of the plots to be fertilized on a given date could be flagged in one section of each block, rather than having to move from one plot to another throughout the whole field on each date. Because there are not very many levels of the treatments, I would increase the number of replications to six to get a little more power for testing the main effects of application dates.

6 pts  D) Draw a diagram to indicate the experimental layout. For one replication, show how the treatments will be randomized and assigned to experimental units.

I have shown a possible layout for a split-plot arrangement of treatments in an RBD.
For the simpler alternative (a 3-way factorial in an RBD) you would simply randomize the 12 combinations of application date and fertilizer treatments in each block. Note that for the RBD, you might consider using a single unfertilized control plot in each block (a total of 10 treatments), since the control plots for each application date would be managed in the same way. That would be a little trickier to analyze, but it would save some time and resources in the field.

8 pts  E) Break out the ANOVA in terms of Sources of Variation and degrees of freedom. Indicate the appropriate error terms for the F tests for the effects of interest.

Source                  df                       MS    F
Block                   r-1 = 5
Application Date (D)    d-1 = 2                  MS1   MS1/MS2
Error a                 (r-1)(d-1) = 10          MS2
Phosphorus (P)          p-1 = 1                  MS3   MS3/MS9
Potassium (K)           k-1 = 1                  MS4   MS4/MS9
P x K                   (p-1)(k-1) = 1           MS5   MS5/MS9
D x P                   (d-1)(p-1) = 2           MS6   MS6/MS9
D x K                   (d-1)(k-1) = 2           MS7   MS7/MS9
D x P x K               (d-1)(p-1)(k-1) = 2      MS8   MS8/MS9
Error b                 by subtraction = 45      MS9
Total                   r*d*p*k-1 = 71

F Distribution 5% Points

Denominator                    Numerator df
df               1       2       3       4       5       6       7
 1          161.45   199.5   215.71  224.58  230.16  233.99  236.77
 2           18.51   19.00   19.16   19.25   19.30   19.33   19.36
 3           10.13    9.55    9.28    9.12    9.01    8.94    8.89
 4            7.71    6.94    6.59    6.39    6.26    6.16    6.08
 5            6.61    5.79    5.41    5.19    5.05    4.95    4.88
 6            5.99    5.14    4.76    4.53    4.39    4.28    4.21
 7            5.59    4.74    4.35    4.12    3.97    3.87    3.79
 8            5.32    4.46    4.07    3.84    3.69    3.58    3.50
 9            5.12    4.26    3.86    3.63    3.48    3.37    3.29
10            4.96    4.10    3.71    3.48    3.32    3.22    3.13
11            4.84    3.98    3.59    3.36    3.20    3.09    3.01
12            4.75    3.88    3.49    3.26    3.10    3.00    2.91
13            4.67    3.80    3.41    3.18    3.02    2.92    2.83
14            4.60    3.74    3.34    3.11    2.96    2.85    2.76
15            4.54    3.68    3.29    3.06    2.90    2.79    2.71
16            4.49    3.63    3.24    3.01    2.85    2.74    2.66
17            4.45    3.59    3.20    2.96    2.81    2.70    2.61
18            4.41    3.55    3.16    2.93    2.77    2.66    2.58
19            4.38    3.52    3.13    2.90    2.74    2.63    2.54
20            4.35    3.49    3.10    2.87    2.71    2.60    2.51
21            4.32    3.47    3.07    2.84    2.68    2.57    2.49
22            4.30    3.44    3.05    2.82    2.66    2.55    2.46
23            4.28    3.42    3.03    2.80    2.64    2.53    2.44
24            4.26    3.40    3.00    2.78    2.62    2.51    2.42
25            4.24    3.38    2.99    2.76    2.60    2.49    2.40
26
27
28
29
30

Student's t Distribution (2-tailed probability)

df     0.40     0.05     0.01
 1    1.376   12.706   63.657
 2    1.061    4.303    9.925
 3    0.978    3.182    5.841
 4    0.941    2.776    4.604
 5    0.920    2.571    4.032
 6    0.906    2.447    3.707
 7    0.896    2.365    3.499
 8    0.889    2.306    3.355
 9    0.883    2.262    3.250
10    0.879    2.228    3.169
11    0.876    2.201    3.106
12    0.873    2.179    3.055
13    0.870    2.160    3.012
14    0.868    2.145    2.977
15    0.866    2.131    2.947
16    0.865    2.120    2.921
17    0.863    2.110    2.898
18    0.862    2.101    2.878
19    0.861    2.093    2.861
20    0.860    2.086    2.845
21    0.859    2.080    2.831
22    0.858    2.074    2.819
23    0.858    2.069    2.807
24    0.857    2.064    2.797
25    0.856    2.060    2.787
26    0.856    2.056    2.779
27    0.855    2.052    2.771
28    0.855    2.048    2.763
29    0.854    2.045    2.756
30    0.854    2.042    2.750