
Experimental Design in Agriculture
CROP 590
Final Exam, Winter, 2014
Name______Key____________
Please show your work!
Part I. Short Answer
1) An agronomist wants to measure the effect of irrigation (none, once, and twice during
the cropping season) and nitrogen fertilizer (25, 50, and 75 kg/ha) on the yield of
durum wheat. He decides to use a factorial set of treatments and a strip plot design,
with 4 blocks.
8 pts
a) Complete the ANOVA by filling in the shaded cells (use the F table at the end of
this exam). What are your conclusions from the ANOVA?

Source                 df     SS      MS      F     F critical   Conclusion
Block                   3    1.08    0.36
Irrigation              2    3.34    1.67    5.57     5.14       significant
Block*Irrigation        6    1.8     0.30
Nitrogen                2    3.02    1.51    5.21     5.14       significant
Block*Nitrogen          6    1.74    0.29
Irrigation*Nitrogen     4    0.24    0.06    0.60     3.26       not significant
Error                  12    1.20    0.10
Total                  35   12.42

8 pts
b) What means would you report from this experiment? Calculate the appropriate
standard errors for those means.
Because the interactions are not significant, we can report the means for each of the
main effects and their standard errors.
For irrigation means:
$se = \sqrt{\dfrac{MS_{blocks*irrigation}}{r \times b}} = \sqrt{\dfrac{0.30}{4 \times 3}} = 0.158$
For nitrogen means:
$se = \sqrt{\dfrac{MS_{blocks*nitrogen}}{r \times a}} = \sqrt{\dfrac{0.29}{4 \times 3}} = 0.155$
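The F ratios in part a) and the standard errors in part b) can be checked with a short script. This is a minimal sketch using only the mean squares given in the ANOVA table; the variable names are illustrative, and the critical values are copied from the F table at the end of the exam rather than computed.

```python
from math import sqrt

# Mean squares taken from the strip-plot ANOVA table.
ms = {
    "Irrigation": 1.67,
    "Block*Irrigation": 0.30,
    "Nitrogen": 1.51,
    "Block*Nitrogen": 0.29,
    "Irrigation*Nitrogen": 0.06,
    "Error": 0.10,
}

# In a strip-plot design each effect is tested against its own error term.
error_term = {
    "Irrigation": "Block*Irrigation",
    "Nitrogen": "Block*Nitrogen",
    "Irrigation*Nitrogen": "Error",
}

# 5% critical values read from the F table (2,6 df; 2,6 df; 4,12 df).
f_critical = {"Irrigation": 5.14, "Nitrogen": 5.14, "Irrigation*Nitrogen": 3.26}

f_ratio = {effect: ms[effect] / ms[err] for effect, err in error_term.items()}
significant = {effect: f_ratio[effect] > f_critical[effect] for effect in f_ratio}

# Standard errors of main-effect means use the matching error mean square.
r, a, b = 4, 3, 3  # blocks, irrigation levels, nitrogen levels
se_irrigation = sqrt(ms["Block*Irrigation"] / (r * b))
se_nitrogen = sqrt(ms["Block*Nitrogen"] / (r * a))

for effect in f_ratio:
    print(f"{effect}: F = {f_ratio[effect]:.2f}, significant = {significant[effect]}")
print(f"SE of an irrigation mean = {se_irrigation:.3f}")
print(f"SE of a nitrogen mean = {se_nitrogen:.3f}")
```

Each main-effect mean is an average over r*b (or r*a) plots, which is why the divisor is 12 in both cases.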
8 pts
6 pts
2) An experiment was conducted to determine the effects of inoculation with four
bacterial strains on dry weight of a perennial grass species. The experiment was
replicated in four complete blocks. The researcher intended to obtain additional
harvests from the same plots for several years. A colleague advised him to treat the
harvest time as a sub-plot factor in a split-plot analysis. The researcher then asks for
your opinion. What type of analysis should be considered for this data set? Explain
why you are recommending that analysis.
A repeated measures analysis is recommended when repeated observations are
taken from the same experimental units over time. There is likely to be some
correlation in errors from one time period to the next. Furthermore, the correlations
are likely to be greater between observations taken at short time intervals
than between those taken at more distant sampling periods. In order for a
split-plot analysis to be valid, one has to be able to assume that the covariance between
subplots within each main plot is equal for all pairs of observations. This is not likely
to be the case when the subplot factor is time. Patterns in the covariance structure can be
taken into account in a repeated measures analysis.
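To make the covariance argument concrete, the sketch below contrasts the equal-correlation pattern that a split-plot analysis assumes (compound symmetry) with a first-order autoregressive pattern, AR(1), in which correlation decays as harvests get further apart. AR(1) is only one of several structures a repeated measures analysis might consider and is not named in the answer above; the correlation of 0.6 and the four harvests are hypothetical values chosen for illustration.

```python
def compound_symmetry(n_times, rho):
    """Correlation matrix with the same correlation for every pair of times."""
    return [[1.0 if i == j else rho for j in range(n_times)] for i in range(n_times)]


def ar1(n_times, rho):
    """AR(1) correlation matrix: rho raised to the lag between two times."""
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]


rho = 0.6        # hypothetical correlation between adjacent harvests
n_harvests = 4   # hypothetical number of harvests

cs = compound_symmetry(n_harvests, rho)
ar = ar1(n_harvests, rho)

for name, matrix in (("Compound symmetry", cs), ("AR(1)", ar)):
    print(name)
    for row in matrix:
        print("  " + "  ".join(f"{value:.3f}" for value in row))
```

Under compound symmetry the first and last harvests are as strongly correlated as two adjacent harvests, which is the assumption that is unlikely to hold when the subplot factor is time.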
3) The effect of storage temperature on seed viability was studied in a Completely
Randomized Design (CRD). Three samples were stored at each of four
temperatures: 10, 30, 50, and 70 °F. At the end of a one year storage period the
samples were tested for germination percentage. The estimate of MSE from the
ANOVA was 19.0 with 8 df.
8 pts
a) Complete the table of orthogonal polynomial contrasts by filling in the shaded
cells.

              Storage temperature (°F)
Contrast      10    30    50    70    Σki²     Li       SSL      Fcalc
Mean          58    31    18    13
Linear        -3    -1     1     3     20    -148    3285.6    172.926
Quadratic      1    -1    -1     1      4      22     363       19.1053
Cubic         -1     3    -3     1     20      -6       5.4      0.28421

6 pts
b) What do these results tell you about the relationship between storage
temperature and seed viability?
Example calculation for the quadratic contrast:
$L_i = 58 - 31 - 18 + 13 = 22$
$SS_L = 3 \times 22^2 / 4 = 363$
$F = 363 / 19 = 19.1053$
Critical F with 1, 8 df = 5.32
The relationship between storage temperature and germination percentage is best
described by a model that includes a linear and a quadratic component:
$Y_{ij} = b_0 + b_1 X_i + b_2 X_i^2 + e_{ij}$
The response to temperature is curvilinear. Germination decreases with increased
temperature. For the range of treatments included in this experiment, the loss in
germination is very rapid in the low temperature range, but slows down at higher
temperatures.
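The contrast calculations for all three polynomial components can be reproduced as follows. This is a sketch based on the treatment means, r = 3 samples per temperature, and MSE = 19.0 given in the question; the critical value 5.32 is copied from the F table, and the variable names are illustrative.

```python
means = [58, 31, 18, 13]  # germination % at the four storage temperatures
r = 3                     # samples per temperature
mse = 19.0                # error mean square from the ANOVA (8 df)
f_crit = 5.32             # 5% critical F with 1 and 8 df, from the F table

contrasts = {
    "Linear": [-3, -1, 1, 3],
    "Quadratic": [1, -1, -1, 1],
    "Cubic": [-1, 3, -3, 1],
}

results = {}
for name, k in contrasts.items():
    L = sum(ki * mean for ki, mean in zip(k, means))  # contrast value
    sum_k2 = sum(ki ** 2 for ki in k)                 # sum of squared coefficients
    ss = r * L ** 2 / sum_k2                          # contrast sum of squares (1 df)
    f = ss / mse
    results[name] = {"sum_k2": sum_k2, "L": L, "SS": ss, "F": f, "significant": f > f_crit}
    print(f"{name}: L = {L}, SS = {ss:.1f}, F = {f:.4f}, significant = {f > f_crit}")

# A complete set of orthogonal contrasts partitions the treatment sum of squares.
grand_mean = sum(means) / len(means)
ss_treatment = r * sum((mean - grand_mean) ** 2 for mean in means)
ss_contrasts = sum(res["SS"] for res in results.values())
print(f"Treatment SS = {ss_treatment:.1f}, sum of contrast SS = {ss_contrasts:.1f}")
```

The linear and quadratic components exceed the critical value and the cubic does not, which is the basis for the second-degree model in the answer.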
6 pts
4) Consider a split-plot design with 4 levels of factor A (main plots) and 2 levels of factor
B (sub-plots). Assume that there is a soil gradient from high clay on the west to low
clay on the east side of the field. Circle the design below that is most likely to
effectively control experimental error due to this field effect.
Answer: Design A
5) Researchers in state X wished to determine the best varieties of a new annual crop
to recommend for commercial production. Five varieties were evaluated at three
locations over a two year period (a total of six sites). A randomized block design was
used at each site with three blocks. Yield data were collected from each of the six
trials.
After performing an analysis at each site to check for outliers and confirm that
assumptions for the ANOVA were satisfied, PROC GLIMMIX was used to determine
if variances across sites met the assumption of homogeneity of variance needed for
a combined analysis.
The output is shown below:
Covariance Parameter Estimates
Cov Parm        Group                    Estimate   Standard Error
Residual (VC)   Site Hilltown 2018         7.9417       3.9708
Residual (VC)   Site Hilltown 2019        34.2160      17.1080
Residual (VC)   Site Springfield 2018     21.5927      10.7963
Residual (VC)   Site Springfield 2019     20.3512      10.1756
Residual (VC)   Site Waterbury 2018       21.7168      10.8584
Residual (VC)   Site Waterbury 2019       11.8882       5.9441

Tests of Covariance Parameters Based on the Restricted Likelihood
Label             DF   -2 Res Log Like   ChiSq   Pr > ChiSq   Note
common variance    5       324.77         4.92     0.4260      DF

4 pts
a) The covariance parameter estimate of 34.2160 represents (choose one):
i) The mean yield in Hilltown in 2019
ii) The MSE from the ANOVA for yield for Hilltown in 2019
iii) The covariance of yield in Hilltown in 2018 and 2019
iv) The Mean Square for varieties in Hilltown in 2019
Answer: ii) The MSE from the ANOVA for yield for Hilltown in 2019
5 pts
b) What conclusion can be drawn about the homogeneity of variance assumption
from the Chi Square test shown above?
The observed probability for the Chi Square test (0.4260) is much greater than 0.05, so we
fail to reject the null hypothesis that the variances are homogeneous. The assumption
needed for a combined analysis across sites is reasonable.
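The probability reported for the Chi Square test can be checked by hand. The sketch below uses the closed-form upper-tail probability of a chi-square distribution with 5 df, so it needs only the Python standard library; the function name is hypothetical, and the formula applies to 5 df only.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf_5df(x):
    """Upper-tail probability of a chi-square variate with exactly 5 df."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2) * (1 + x / 3)


chisq = 4.92       # test statistic from the GLIMMIX output
n_sites = 6        # six site-specific residual variances
df = n_sites - 1   # one common variance replaces six, so 5 df

p_value = chi2_sf_5df(chisq)
homogeneous = p_value > 0.05  # True means no evidence against a common variance

print(f"ChiSq = {chisq}, df = {df}, Pr > ChiSq = {p_value:.4f}")
print(f"Fail to reject homogeneity at the 5% level: {homogeneous}")
```

The result should be close to the reported 0.4260; a small difference is expected because the statistic 4.92 is rounded in the output.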
Question 5, continued.
The Random statement with a /test option was used in PROC GLM to generate
Expected Mean Squares for an across site analysis:
Source         Type III Expected Mean Square
Site           Var(Error) + 3 Var(Site*Variety) + 5 Var(Block(Site)) + 15 Var(Site)
Block(Site)    Var(Error) + 5 Var(Block(Site))
Variety        Var(Error) + 3 Var(Site*Variety) + Q(Variety)
Site*Variety   Var(Error) + 3 Var(Site*Variety)
6 pts
c) Based on the results above, what would be the appropriate ratio of Mean
Squares to use to test for significant differences among varieties?
MS(varieties)/MS(site*variety)
d) Are the blocks nested within sites? Explain your answer.
5 pts
Blocks are nested because each block is unique to each site. They represent a
random sample of possible blocks.
A summary of the results of the combined ANOVA across sites is shown below:
Source          DF   Type III SS   Mean Square   F Value   Pr > F
Site             5    10072         2014.37       16.62    <.0001
Block(Site)     12     1038.85        86.5712      4.41    0.0001
Variety          4     1072.25       268.061       4.94    0.0062
Site*Variety    20     1084.69        54.2347      2.76    0.002
Error           48      941.652       19.6178
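The F values in the combined ANOVA follow from the expected mean squares listed earlier: each effect is divided by the mean square (or combination of mean squares) that has the same expectation apart from the term being tested. The sketch below reproduces them from the reported mean squares. The synthetic denominator for Site is inferred here from the expected mean squares and happens to reproduce the reported F of 16.62; that this is exactly how the SAS output was computed is an assumption.

```python
# Mean squares from the combined ANOVA across sites.
ms = {
    "Site": 2014.37,
    "Block(Site)": 86.5712,
    "Variety": 268.061,
    "Site*Variety": 54.2347,
    "Error": 19.6178,
}

# Variety and Site*Variety differ only by Q(Variety), so Site*Variety is the denominator.
f_variety = ms["Variety"] / ms["Site*Variety"]

# Block(Site) and Site*Variety each differ from Error by a single variance component.
f_block = ms["Block(Site)"] / ms["Error"]
f_interaction = ms["Site*Variety"] / ms["Error"]

# No single mean square matches Site. The combination
# MS(Block(Site)) + MS(Site*Variety) - MS(Error) has expectation
# Var(Error) + 3 Var(Site*Variety) + 5 Var(Block(Site)).
site_denominator = ms["Block(Site)"] + ms["Site*Variety"] - ms["Error"]
f_site = ms["Site"] / site_denominator

print(f"Site:         F = {f_site:.2f}")
print(f"Block(Site):  F = {f_block:.2f}")
print(f"Variety:      F = {f_variety:.2f}")
print(f"Site*Variety: F = {f_interaction:.2f}")
```

Testing Variety against the Error mean square instead would give a much larger F and would overstate the evidence for variety differences across sites.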
6 pts
e) Use the ANOVA on the previous page and the graph below to give a brief
interpretation of these results. Can generalizations be made about the relative
performance of varieties across sites? Can you note any trends that might
warrant further investigation?
The Site*Variety interaction is highly significant, so we have to be cautious about
interpreting the main effects of varieties and making generalizations about the
performance of varieties across sites. Nonetheless, differences among the varieties
are very large and significant. Variety E was consistently good in Springfield and
Waterbury. Variety D was consistently good in Hilltown. Variety B was consistently
poor at all sites. The variation among sites was large in comparison to the variation
among varieties, with 2019 being the better year at all sites. Blocking was effective.
Further analyses could be conducted to determine the relative importance of
locations and years in contributing to the variation among sites. It appears that most
of the Site*Variety interaction was due to differences in relative performance of
varieties in Hilltown vs the other sites. Further experimentation would be needed to
determine if this is a consistent pattern and what environmental factors (rainfall, soil
type, diseases, etc.) might be impacting the yield of these varieties.
[Graph not reproduced. Labeled lines: Variety D, Variety E, Variety B]
Part II. Experimental Design Question
A researcher has developed a new herbicide that can control a parasitic weed in red
clover fields in the Willamette Valley. The herbicide can be applied as a seed treatment
on clover, or as a post-emergence spray, but optimum rates have not been established.
Widely grown varieties of clover may differ in their tolerance to the herbicide. The
researcher would like to develop recommendations for use of the herbicide. Assume that
the primary reason for growing the crop is for seed production.
The parasite is prevalent on several acres of land that are available with a cooperative
farmer who grows clover near Salem.
Design an experiment that would meet the objectives of the researcher.
6 pts
1) What type of experimental design will you use? Justify your choice. Indicate any
basic assumptions that you have made.
There are many reasonable solutions to this question.
One possibility is to use a split-plot arrangement of treatments. It would be difficult to
apply the herbicide spray to small plots, so the herbicide treatments (Factor A) will be
the main plot and clover varieties (Factor B) could be the sub-plot.
Rates applied to seeds may not be directly comparable to rates that are applied post
emergence, because seed treatments are active only in the volume of soil immediately
surrounding each seed, whereas post-emergence sprays are applied to the entire
surface area of the soil. For that reason there is no need to consider herbicide rates as a
separate factor. There is also no indication from the description that one might consider
applying both the seed treatment and the post-emergence spray to the same crop, so
there is no reason to look at factorial combinations of seed treatments and sprays.
Four blocks will be used to account for natural variation in the prevalence of the weed
seed in the field. We have to assume that the level of the parasite is reasonably uniform
within blocks. Four reps are needed to provide sufficient degrees of freedom for testing
the main plots (herbicide treatments), which is of primary interest in this experiment.
2) List the treatments of the experiment. Be sure to include any necessary controls.
Explain why you have chosen this particular set of treatments.
6 pts
Factor A: Herbicide Treatments
C = no herbicide
SL = seed treatment, low rate
SH = seed treatment, high rate
PEL = post-emergence spray, low rate
PEM = post-emergence spray, medium rate
PEH = post-emergence spray, high rate
Factor B: Varieties – 2 levels (widely grown, old standard variety and the most promising
new variety)
I am assuming that establishing an effective and safe rate for a seed treatment is more
straightforward than determining the rate of spray that will be needed to control the
parasite without causing crop damage, so I have limited the seed treatments to two
levels. Three levels of herbicide spray will be applied so that the equation for the
response curve can be used to estimate the optimum level of herbicide.
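A field layout for this design could be randomized as sketched below: the six herbicide treatments are assigned at random to main plots within each of the four blocks, and the two varieties are then assigned at random to the sub-plots within each main plot. The variety labels, the function name, and the seed are placeholders introduced for illustration only.

```python
import random

herbicides = ["C", "SL", "SH", "PEL", "PEM", "PEH"]  # Factor A, main plots
varieties = ["Standard", "New"]                      # Factor B, placeholder labels
n_blocks = 4


def randomize_split_plot(seed=None):
    """Return a list of (block, main_plot, herbicide, sub_plot_order) tuples."""
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        # Fresh randomization of herbicide treatments in every block.
        main_plots = rng.sample(herbicides, k=len(herbicides))
        for position, herbicide in enumerate(main_plots, start=1):
            # Independent randomization of varieties within each main plot.
            sub_plots = rng.sample(varieties, k=len(varieties))
            layout.append((block, position, herbicide, sub_plots))
    return layout


layout = randomize_split_plot(seed=590)
for block, position, herbicide, sub_plots in layout:
    print(f"Block {block}, main plot {position}: {herbicide:<3} | {', '.join(sub_plots)}")
```

Randomizing separately at the two levels is what produces the two error terms (error a and error b) in the ANOVA for this design.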
3) Break out the ANOVA in terms of Sources of Variation and degrees of freedom.
6 pts
Source of Variation            DF
Total                          47
Block                           3
Herbicide                       5
Block * Herbicide (error a)    15
Cultivar                        1
Herbicide * Cultivar            5
Error b                        18
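The degrees of freedom can be derived from the number of blocks and factor levels. This is a small sketch for the split-plot described above (4 blocks, 6 herbicide treatments, 2 cultivars); the variable names are illustrative.

```python
r, a, b = 4, 6, 2  # blocks, herbicide treatments (main plots), cultivars (sub-plots)

df = {
    "Total": r * a * b - 1,
    "Block": r - 1,
    "Herbicide": a - 1,
    "Block * Herbicide (error a)": (r - 1) * (a - 1),
    "Cultivar": b - 1,
    "Herbicide * Cultivar": (a - 1) * (b - 1),
    "Error b": a * (r - 1) * (b - 1),
}

# The component df must add up to the total df.
components = sum(value for source, value in df.items() if source != "Total")

for source, value in df.items():
    print(f"{source:<30}{value:>3}")
print(f"Sum of components: {components}")
```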
4) Identify two meaningful questions that this experiment might address that are not
adequately evaluated from an ANOVA. Indicate the coefficients that would be
needed for each of the treatment means to estimate Sums of Squares for the
corresponding contrasts. Show how you would determine if the two contrasts are
orthogonal (or not). (There is a table of orthogonal polynomial contrast coefficients at
the end of this exam that can be used for reference.)
6 pts
With the exception of the highlighted questions below, all of the contrasts would meet
the requirement of providing additional information not adequately evaluated from the
ANOVA.
Questions pertinent to the objectives:
1) Does the use of an herbicide increase seed yield in comparison to the control (no
herbicide)?
2) Does the method of herbicide application (seed vs spray) have an effect on the seed
yield of red clover?
3) Is there a difference in seed yield for the low vs the high level of seed treatment?
4) Does red clover show a linear response to increasing rates of post-emergence spray?
5) Does red clover show a quadratic response to increasing rates of post-emergence
spray?
6) Do the two varieties differ in seed yield? (equivalent to the test for main effects of
varieties in the ANOVA)
7) Are the effects of the herbicide treatments the same for both varieties? (Can do
individual tests for interactions of contrast 6 with contrasts 2-5)
6 pts
Contrast coefficients:
            Control         SL            SH           PEL           PEM           PEH
Contrast  Cult1  Cult2  Cult1  Cult2  Cult1  Cult2  Cult1  Cult2  Cult1  Cult2  Cult1  Cult2
1            -5     -5      1      1      1      1      1      1      1      1      1      1
2             0      0      3      3      3      3     -2     -2     -2     -2     -2     -2
3             0      0     -1     -1      1      1      0      0      0      0      0      0
4             0      0      0      0      0      0     -1     -1      0      0      1      1
5             0      0      0      0      0      0     -1     -1      2      2     -1     -1
6             0      0     -1      1     -1      1     -1      1     -1      1     -1      1
7             0      0      3     -3      3     -3     -2      2     -2      2     -2      2
8             0      0     -1      1      1     -1      0      0      0      0      0      0
9             0      0      0      0      0      0     -1      1      0      0      1     -1
10            0      0      0      0      0      0     -1      1      2     -2     -1      1
Test for orthogonality – use contrasts 1 and 2 and show that the sum of products = 0
(-5)(0)+(-5)(0)+(1)(3)+(1)(3)+(1)(3)+(1)(3)+(1)(-2)+(1)(-2)+(1)(-2)+(1)(-2)+(1)(-2)+(1)(-2)=0
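The same sum-of-products check can be applied to every pair of contrasts at once. The sketch below uses the coefficients from the contrast table, in the column order Control, SL, SH, PEL, PEM, PEH with Cult1 and Cult2 under each; the helper name is illustrative. The simple sum-of-products test assumes equal replication of all treatment combinations, which holds for this design.

```python
from itertools import combinations

# Contrast coefficients, one list of 12 values per contrast number.
contrasts = {
    1: [-5, -5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    2: [0, 0, 3, 3, 3, 3, -2, -2, -2, -2, -2, -2],
    3: [0, 0, -1, -1, 1, 1, 0, 0, 0, 0, 0, 0],
    4: [0, 0, 0, 0, 0, 0, -1, -1, 0, 0, 1, 1],
    5: [0, 0, 0, 0, 0, 0, -1, -1, 2, 2, -1, -1],
    6: [0, 0, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1],
    7: [0, 0, 3, -3, 3, -3, -2, 2, -2, 2, -2, 2],
    8: [0, 0, -1, 1, 1, -1, 0, 0, 0, 0, 0, 0],
    9: [0, 0, 0, 0, 0, 0, -1, 1, 0, 0, 1, -1],
    10: [0, 0, 0, 0, 0, 0, -1, 1, 2, -2, -1, 1],
}


def sum_of_products(u, v):
    """Sum of products of matching coefficients; zero means orthogonal."""
    return sum(x * y for x, y in zip(u, v))


# Every valid contrast has coefficients that sum to zero.
coefficient_sums = {number: sum(k) for number, k in contrasts.items()}

# Collect any pairs whose sum of products is not zero.
non_orthogonal = [
    (i, j)
    for i, j in combinations(contrasts, 2)
    if sum_of_products(contrasts[i], contrasts[j]) != 0
]

print("Contrasts 1 and 2:", sum_of_products(contrasts[1], contrasts[2]))
print("Non-orthogonal pairs:", non_orthogonal)
```

An empty list of non-orthogonal pairs means the ten contrasts are mutually orthogonal.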
F Distribution 5% Points
Denominator                       Numerator df
df             1       2       3       4       5       6       7
 1          161.45  199.5   215.71  224.58  230.16  233.99  236.77
 2           18.51   19.00   19.16   19.25   19.30   19.33   19.36
 3           10.13    9.55    9.28    9.12    9.01    8.94    8.89
 4            7.71    6.94    6.59    6.39    6.26    6.16    6.08
 5            6.61    5.79    5.41    5.19    5.05    4.95    4.88
 6            5.99    5.14    4.76    4.53    4.39    4.28    4.21
 7            5.59    4.74    4.35    4.12    3.97    3.87    3.79
 8            5.32    4.46    4.07    3.84    3.69    3.58    3.50
 9            5.12    4.26    3.86    3.63    3.48    3.37    3.29
10            4.96    4.10    3.71    3.48    3.32    3.22    3.13
11            4.84    3.98    3.59    3.36    3.20    3.09    3.01
12            4.75    3.88    3.49    3.26    3.10    3.00    2.91
13            4.67    3.80    3.41    3.18    3.02    2.92    2.83
14            4.60    3.74    3.34    3.11    2.96    2.85    2.76
15            4.54    3.68    3.29    3.06    2.90    2.79    2.71
16            4.49    3.63    3.24    3.01    2.85    2.74    2.66
17            4.45    3.59    3.20    2.96    2.81    2.70    2.61
18            4.41    3.55    3.16    2.93    2.77    2.66    2.58
19            4.38    3.52    3.13    2.90    2.74    2.63    2.54
20            4.35    3.49    3.10    2.87    2.71    2.60    2.51
21            4.32    3.47    3.07    2.84    2.68    2.57    2.49
22            4.30    3.44    3.05    2.82    2.66    2.55    2.46
23            4.28    3.42    3.03    2.80    2.64    2.53    2.44
24            4.26    3.40    3.00    2.78    2.62    2.51    2.42
25            4.24    3.38    2.99    2.76    2.60    2.49    2.40
26
27
28
29
30
Student's t Distribution
(2-tailed probability)
df 0.40 0.05 0.01
1 1.376 12.706 63.657
2 1.061 4.303 9.925
3 0.978 3.182 5.841
4 0.941 2.776 4.604
5 0.920 2.571 4.032
6 0.906 2.447 3.707
7 0.896 2.365 3.499
8 0.889 2.306 3.355
9 0.883 2.262 3.250
10 0.879 2.228 3.169
11 0.876 2.201 3.106
12 0.873 2.179 3.055
13 0.870 2.160 3.012
14 0.868 2.145 2.977
15 0.866 2.131 2.947
16 0.865 2.120 2.921
17 0.863 2.110 2.898
18 0.862 2.101 2.878
19 0.861 2.093 2.861
20 0.860 2.086 2.845
21 0.859 2.080 2.831
22 0.858 2.074 2.819
23 0.858 2.069 2.807
24 0.857 2.064 2.797
25 0.856 2.060 2.787
26 0.856 2.056 2.779
27 0.855 2.052 2.771
28 0.855 2.048 2.763
29 0.854 2.045 2.756
30 0.854 2.042 2.750