Experimental Design in Agriculture CROP 590 Name_______

Experimental Design in Agriculture
CROP 590
Final Exam, Winter, 2016
Name_______KEY___________
Part I. Short answer – please show your work
1) An experiment is conducted to evaluate the yield of seven oat cultivars at four
locations that represent a sample of the environments in which the cultivars are
likely to be grown. The experimental design at each location is a randomized
complete block design with three replications.
3 pts
4 pts
Source         df   Mean Square   Expected Mean Square
Location        3   MS1           σ²e + 7σ²Rep(Loc) + 21σ²Loc
Rep(Loc)        8   MS2           σ²e + 7σ²Rep(Loc)
Cultivar        6   MS3           σ²e + 3σ²Loc x Cultivar + 12θ²Cult
Loc*Cultivar   18   MS4           σ²e + 3σ²Loc x Cultivar
Error          48   MS5           σ²e
a) Based on the Expected Mean Squares given in the table above, what would be the
appropriate ratio of Mean Squares to use to calculate an F value to determine if
there are differences among the cultivars?
MS3/MS4
b) Are Replications and Locations nested or cross-classified? Explain your answer.
Reps are nested within locations. Each block is unique to each location.
4 pts
4 pts
c) The seven oat cultivars include the most promising new cultivars from your breeding
program, and you are considering them for commercial release. Do you think that
cultivars should be designated as fixed or random effects in this experiment? Defend
your choice.
Cultivars should be fixed. We would like to know how the new varieties compare to
each other and to check varieties. We are interested in this particular set of
cultivars.
d) Using Cultivars as an example, explain what the Expected Mean Square in the
ANOVA represents and define each of its components.
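The Expected Mean Square σ²e + 3σ²Loc x Cultivar + 12θ²Cult is the expected value of
the Cultivar Mean Square, which measures the variation among cultivar means.
σ²e is the experimental error variance
σ²Loc x Cultivar is the variance component for location x cultivar interactions
(its coefficient, 3, is the number of replications per location)
θ²Cult is the variation among the fixed cultivar effects
(its coefficient, 12, is the number of observations in each cultivar mean: 3 reps x 4 locations)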
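The degrees of freedom and EMS coefficients in the table for question 1 follow directly from the numbers of locations, replications, and cultivars. The short Python sketch below reproduces them; the variable names are illustrative only and are not part of the exam.

```python
# Degrees of freedom for an RCBD repeated at several locations (question 1).
n_loc, n_rep, n_cult = 4, 3, 7  # values stated in the question

df = {
    "Location": n_loc - 1,
    "Rep(Loc)": n_loc * (n_rep - 1),             # reps are nested in locations
    "Cultivar": n_cult - 1,
    "Loc*Cultivar": (n_loc - 1) * (n_cult - 1),
    "Error": n_loc * (n_rep - 1) * (n_cult - 1),
}
df_total = n_loc * n_rep * n_cult - 1

# Coefficients in the Cultivar Expected Mean Square
coef_loc_x_cult = n_rep        # multiplies the Loc x Cultivar variance component
coef_cult = n_rep * n_loc      # multiplies the fixed cultivar term

for source, value in df.items():
    print(f"{source:<14}{value:>4}")
print(f"{'Total':<14}{df_total:>4}")
print("Cultivar EMS coefficients:", coef_loc_x_cult, coef_cult)
```

Because the Cultivar and Loc*Cultivar expected mean squares differ only by the cultivar term, MS3/MS4 is the ratio that isolates cultivar differences.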
2) A researcher wished to study the relationships between irrigation and nitrogen
response in corn. Because irrigation could only be applied to large plots, she decided
to use a split plot design with the irrigation treatments (irrigated and nonirrigated)
as main plots and nitrogen fertility (60, 90, 120, 150 and 180 lbs/acre) as the
subplots. The trial was planted in four complete blocks. Yield was recorded in
bu/acre.
10 pts
Complete the ANOVA (fill in shaded areas):

Source            df      SS       MS        F
Total             39   12879
Block              3    1911    637
Irrigation         1    7445   7445      58.164
Error a            3     384    128
Nitrogen           4    1834    458.5    15.28
Irrigation x N     4     585    146.25    4.875
Error b           24     720     30
5 pts
a) Using the F table in the back of this exam, what are your conclusions regarding
the effects of irrigation and nitrogen on corn yield?
The irrigation x N interaction is significant (F = 4.875 exceeds the critical F of
2.78 with 4 and 24 df), so we have to be careful about interpreting the main
effects. The response of corn to N depends on irrigation.
b) Calculate the standard error for an irrigation treatment mean.
5 pts
se = sqrt(MS Error a / (r*b)) = sqrt(128 / (4*5)) = sqrt(6.4) = 2.53 bu/acre,
where r = 4 blocks and b = 5 nitrogen levels
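As a check on the arithmetic in question 2, the following Python sketch recomputes the mean squares, F ratios, and the standard error of an irrigation mean from the sums of squares and degrees of freedom given in the ANOVA. The dictionary layout and variable names are assumptions made for illustration.

```python
import math

# Sums of squares and degrees of freedom from the split-plot ANOVA in question 2
ss = {"Block": 1911, "Irrigation": 7445, "Error a": 384,
      "Nitrogen": 1834, "Irrigation x N": 585, "Error b": 720}
df = {"Block": 3, "Irrigation": 1, "Error a": 3,
      "Nitrogen": 4, "Irrigation x N": 4, "Error b": 24}

ms = {source: ss[source] / df[source] for source in ss}

# Main-plot factor is tested against Error a; subplot effects against Error b
f_irrigation = ms["Irrigation"] / ms["Error a"]
f_nitrogen = ms["Nitrogen"] / ms["Error b"]
f_interaction = ms["Irrigation x N"] / ms["Error b"]

# Each irrigation mean averages r blocks x b nitrogen levels = 20 plots
n_blocks, n_nitrogen = 4, 5
se_irrigation_mean = math.sqrt(ms["Error a"] / (n_blocks * n_nitrogen))

print(f"F irrigation   = {f_irrigation:.3f}")
print(f"F nitrogen     = {f_nitrogen:.2f}")
print(f"F interaction  = {f_interaction:.3f}")
print(f"se (irrigation mean) = {se_irrigation_mean:.2f}")
```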
8 pts
3) You are reading an article that was published in 1965. The authors were evaluating
the effect of growth promoters on Douglas Fir seedlings. Measurements were taken
at monthly intervals over the first two years of growth, and time of sampling was
analyzed as a sub-plot factor in a split-plot analysis. What type of analysis should be
considered for this data set today? What are the advantages of the current methods
of analysis compared to the split-plot in time?
Today we would recommend a repeated measures analysis when repeated
observations are taken from the same experimental units over time. There is likely
to be some correlation in errors from one time period to the next. Furthermore, the
correlations are likely to be greatest between observations that are taken at short
time intervals compared to those that are taken at more distant sampling periods.
Patterns in the covariance structure are taken into account in a repeated measures
analysis.
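To illustrate the kind of covariance pattern described above, the sketch below builds a first-order autoregressive, AR(1), correlation matrix, in which the correlation between two sampling times decays as the gap between them grows. AR(1) is only one of several structures a repeated measures analysis might consider, and both the choice of AR(1) and the value rho = 0.8 are assumptions made for illustration, not results from the article in question.

```python
def ar1_correlation(n_times, rho):
    """Return an n_times x n_times AR(1) correlation matrix as nested lists.

    The correlation between times i and j is rho ** |i - j|, so it shrinks
    as the sampling times get farther apart.
    """
    return [[rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


# Hypothetical example: four monthly measurements, rho = 0.8 (assumed value)
corr = ar1_correlation(4, 0.8)
for row in corr:
    print("  ".join(f"{value:.3f}" for value in row))
```

A split-plot in time, by contrast, implicitly assumes the same correlation between every pair of sampling times.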
4) An experiment was conducted to determine the effect of storage temperature on
the potency of an antibiotic. Fifteen samples of the antibiotic were obtained and
three samples, selected at random from the fifteen, were stored at each of five
temperatures: 10, 30, 50, 70, 90. At the end of a thirty day storage period the
samples were tested for potency with the following results:
Temperature   Mean
10            58
30            31
50            18
70            13
90            11

Source        df       SS        MS         F
Total         14   4680.4
Temperature    4   4520.4    1130.1    70.63**
Error         10    160.0      16.0
Orthogonal Polynomial Coefficients are used to obtain the following contrasts:

              Temperature
Contrast      10    30    50    70    90       L    Σk²     SS(L)       F
Linear        -2    -1     0     1     2    -112     10   3763.20   235.2
Quadratic      2    -1    -2    -1     2      58     14    720.86    45.05
Cubic         -1     2     0    -2     1     -11     10     36.30     2.27
Quartic        1    -4     6    -4     1       1     70      0.04     0
8 pts
a) Fill in the shaded areas to complete the analysis of contrasts. Show your
calculations below.
L = (2*58) + (-1*31) + (-2*18) + (-1*13) + (2*11) = 58
SS(L) = r*L² / Σk² = 3*(58)² / 14 = 10092 / 14 = 720.86
6 pts
b) What do these results tell you about the relationship between storage
temperature and antibiotic potency? Use the F table at the end of this exam to
support your conclusions.
These data indicate that antibiotic potency is a quadratic function of
temperature. The critical F at the alpha=0.05 level, with 1 and 10 df is 4.96. The F
values for the linear and quadratic contrasts exceed the critical value, whereas
the F for cubic and quartic contrasts are nonsignificant. The linear contrast is
negative, so potency is decreasing with increased storage temperature. The
significant quadratic contrast indicates that the response is curvilinear. The rate
of the decline in potency is reduced at higher temperatures.
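The contrast calculations for question 4 can be reproduced with the short Python sketch below. It uses the treatment means, r = 3 samples per temperature, and the error mean square of 16.0 from the ANOVA; the data structures and names are illustrative assumptions.

```python
# Orthogonal polynomial contrasts for five equally spaced temperatures
means = [58, 31, 18, 13, 11]   # potency means at 10, 30, 50, 70, 90
r = 3                          # samples per temperature
ms_error = 16.0                # error mean square from the ANOVA (10 df)

coefficients = {
    "Linear":    [-2, -1,  0,  1, 2],
    "Quadratic": [ 2, -1, -2, -1, 2],
    "Cubic":     [-1,  2,  0, -2, 1],
    "Quartic":   [ 1, -4,  6, -4, 1],
}

results = {}
for name, k in coefficients.items():
    L = sum(ki * m for ki, m in zip(k, means))   # contrast value
    sum_k2 = sum(ki ** 2 for ki in k)            # sum of squared coefficients
    ss_L = r * L ** 2 / sum_k2                   # contrast sum of squares (1 df)
    results[name] = (L, sum_k2, ss_L, ss_L / ms_error)

for name, (L, sum_k2, ss_L, f_value) in results.items():
    print(f"{name:<10} L={L:>5}  sum k^2={sum_k2:>3}  "
          f"SS(L)={ss_L:>8.2f}  F={f_value:>7.2f}")

# The four single-df contrasts partition the Temperature sum of squares
ss_temperature = sum(values[2] for values in results.values())
print(f"Sum of contrast SS = {ss_temperature:.1f}")
```

Each F value is compared with the critical F of 4.96 for 1 and 10 df.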
5) Eight meadowfoam families were evaluated for seed oil content in a field study. The
experiment was blocked to account for soil heterogeneity and for ease of field
operations. Each of the 8 families was randomly assigned to two complete blocks. A
3' x 20' area of each plot was harvested and threshed and the seeds were cleaned
and weighed. A representative sample of seed was taken from each plot and sent to
the OSU seed lab for determination of oil content (%). The researcher requested that
duplicate NMR analyses be conducted on each sample. All of the data was analyzed
in PROC GLM in SAS.
The GLM Procedure
Dependent Variable: Oil

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             15      50.75397187    3.38359812      6.43   <.0001
Error             16       8.41925000    0.52620313
Corrected Total   31      59.17322187

R-Square   Coeff Var   Root MSE   Oil Mean
0.857719    2.831204   0.725399   25.62156

Source         DF   Type III SS   Mean Square   F Value   Pr > F
Block           1    2.32740312    2.32740312      4.42   0.0516
Family          7   45.33204687    6.47600670     12.31   <.0001
Block*Family    7    3.09452187    0.44207455      0.84   0.5706
a) Is the F Value and Pr>F for Families in this output correct? Explain your answer.
4 pts
No. Although the appropriate error term (block*family interaction) has been
included in the model, SAS has used the residual error as the default for the F
test. The residual error (0.52620313) represents the pooled sampling error rather
than the true error (0.44207455). Based on the expected mean squares, we
expect the block*family interaction to be greater than or equal to the sampling
variance. Using the sampling error for the F test will therefore tend to inflate the
F ratio. That’s not the case in this data set, but we still have too many degrees of
freedom when we use the sampling error for the F test, which will increase Type
I error.
6 pts
b) Calculate the correct F statistic for families and determine if there are significant
differences among families using the F table at the back of this exam.
F observed = 6.476/0.44207 = 14.65
Critical F with 7 and 7 df = 3.79
Reject H0 and conclude that there are significant differences among the
families
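The difference between the default test and the test against the block*family mean square in question 5 can be seen in the small Python sketch below. The mean squares are copied from the SAS output and the critical value from the F table in this exam; the variable names are illustrative.

```python
# Mean squares taken from the PROC GLM output in question 5
ms_family = 6.47600670
ms_block_family = 0.44207455   # experimental error (7 df)
ms_residual = 0.52620313       # sampling error from duplicate analyses (16 df)

# Default test: Family against the residual (7 and 16 df)
f_default = ms_family / ms_residual

# Appropriate test: Family against Block*Family (7 and 7 df)
f_correct = ms_family / ms_block_family

f_critical = 3.79              # 5% point for 7 and 7 df, from the F table
reject_h0 = f_correct > f_critical

print(f"F (residual error)      = {f_default:.2f}")
print(f"F (block*family error)  = {f_correct:.2f}")
print("Significant differences among families:", reject_h0)
```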
6 pts
6) An experiment has been conducted to determine the effects of Nitrogen and
Phosphorus fertilizer on the growth of spinach. Because the fertilizer treatments
were applied with a farm-scale fertilizer spreader, a strip-plot design was used with
three complete blocks. In the diagram below, shade or circle examples of the
designated experimental units:
a) Block I – an experimental unit for a Nitrogen treatment
b) Block II – an experimental unit for a Phosphorus treatment
c) Block III – the experimental unit for a specific combination of Nitrogen and
Phosphorus that would be used to evaluate the importance of Nitrogen x
Phosphorus interactions
[Diagram: field layout of the strip-plot design showing Blocks I, II, and III, each
with nitrogen levels N1-N3 and phosphorus levels P1-P3 applied in strips.]
Part II. Experimental Design (Answer Questions A through E)
As an agronomist, you are interested in studying the effect of phosphate fertilizer and
potash fertilizer on the yield of a perennial forage crop. Optimum rates have been
established for each of the fertilizers individually, but you would like to find out if the
application of one fertilizer affects the response to the other fertilizer. Other studies
have indicated that the timing of application has an effect on the crop’s ability to use
the fertilizer. To test this, you decide to use three different application dates: November
1, January 1, and March 1. The fertilizer application does not require large machinery. A
local farmer has a large field that has been uniformly planted to the forage crop. There
is also greenhouse space available, and flats in which you could plant the crop.
Answers will vary.
A) Which site will you use for the experiment? Justify your choice.
3 pts
4 pts
I would use the farmer’s field, as it would be very difficult to make
recommendations about the best time for fertilizer application based on a
greenhouse experiment in pots.
B) List the treatments of the experiment. Be sure to include any necessary controls.
Explain why you have chosen this particular set of treatments.
Fertilizer application date is one factor, with three dates. Because the optimum rates for
the fertilizers are already known, I would use a 2x2 factorial combination of P and K
(NoP-NoK, P only, K only, P+K). The NoP-NoK treatment acts as a control.
6 pts
C) What type of experimental design will you use? Justify your choice. Indicate any
basic assumptions that you have made.
Although it would be perfectly acceptable to combine the 3 application dates and
the fertilizer treatments in a 3-way factorial in an RBD, I would be inclined to treat
the application dates as the main plot and apply the fertilizer treatments as
subplots in a split-plot design. This would permit the fertilizer to be applied in a
contiguous area at each application date, which might provide some benefits similar
to blocking. This might increase the precision for testing the interactions of P and K,
which is our primary interest. It might also be easier logistically, because all of the
plots to be fertilized on a given date could be flagged in one section of each block,
rather than having to move from one plot to another throughout the whole field on
each date. Because there are not very many levels of the treatments, I would
increase the number of replications to six to get a little more power for testing the
main effects of application dates.
6 pts
D) Draw a diagram to indicate the experimental layout. For one replication, show how
the treatments will be randomized and assigned to experimental units.
I have shown a possible layout for a split-plot arrangement of treatments in an RBD.
For the simpler alternative (a 3-way factorial in an RBD) you would simply randomize
the 12 combinations of application date and fertilizer treatments in each block. Note
that for the RBD, you might consider using a single unfertilized control plot in each
block (a total of 10 treatments), since the control plots for each application date
would be managed in the same way. That would be a little trickier to analyze, but it
would save some time and resources in the field.
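One way to generate the randomization for a single replication of the split-plot layout is sketched below in Python: application dates are randomized to main plots within the block, and the four fertilizer treatments are then randomized separately within each main plot. The treatment labels follow the answer to part B; the function name and the seed are arbitrary choices for illustration.

```python
import random

dates = ["Nov 1", "Jan 1", "Mar 1"]                     # main-plot factor
fertilizers = ["NoP-NoK", "P only", "K only", "P+K"]    # subplot treatments


def randomize_block(rng):
    """Return one block as a list of (date, [subplot treatments in field order])."""
    main_plots = dates[:]
    rng.shuffle(main_plots)            # randomize dates to main plots
    layout = []
    for date in main_plots:
        subplots = fertilizers[:]
        rng.shuffle(subplots)          # independent randomization in each main plot
        layout.append((date, subplots))
    return layout


rng = random.Random(2016)              # arbitrary seed so the layout is repeatable
block_layout = randomize_block(rng)
for date, subplots in block_layout:
    print(f"{date:<6} | " + " | ".join(subplots))
```

Calling randomize_block once per block gives an independent randomization for each of the six replications.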
8 pts
E) Break out the ANOVA in terms of Sources of Variation and degrees of freedom.
Indicate the appropriate error terms for the F tests for the effects of interest.
Source                 df                        MS    F
Block                  r-1 = 5
Application Date (D)   d-1 = 2                   MS1   MS1/MS2
Error a                (r-1)(d-1) = 10           MS2
Phosphorus (P)         p-1 = 1                   MS3   MS3/MS9
Potassium (K)          k-1 = 1                   MS4   MS4/MS9
PxK                    (p-1)(k-1) = 1            MS5   MS5/MS9
DxP                    (d-1)(p-1) = 2            MS6   MS6/MS9
DxK                    (d-1)(k-1) = 2            MS7   MS7/MS9
DxPxK                  (d-1)(p-1)(k-1) = 2       MS8   MS8/MS9
Error b                by subtraction = 45       MS9
Total                  r*d*p*k-1 = 71
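The degrees of freedom for this split-plot ANOVA can be verified with the Python sketch below, using r = 6 blocks, d = 3 application dates, and p = k = 2 levels of each fertilizer. With three dates, each interaction involving D has 2 df, and Error b by subtraction is 45, which matches the direct formula d(r-1)(pk-1). Variable names are illustrative.

```python
# Split-plot: dates on main plots, 2x2 factorial of P and K on subplots
r, d, p, k = 6, 3, 2, 2

df = {
    "Block": r - 1,
    "Application Date (D)": d - 1,
    "Error a": (r - 1) * (d - 1),
    "Phosphorus (P)": p - 1,
    "Potassium (K)": k - 1,
    "PxK": (p - 1) * (k - 1),
    "DxP": (d - 1) * (p - 1),
    "DxK": (d - 1) * (k - 1),
    "DxPxK": (d - 1) * (p - 1) * (k - 1),
}
df_total = r * d * p * k - 1

# Error b is what remains after all other sources are accounted for
df_error_b = df_total - sum(df.values())
df["Error b"] = df_error_b

for source, value in df.items():
    print(f"{source:<22}{value:>4}")
print(f"{'Total':<22}{df_total:>4}")
```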
F Distribution 5% Points

Denominator                      Numerator df
df             1       2       3       4       5       6       7
 1        161.45  199.5   215.71  224.58  230.16  233.99  236.77
 2         18.51   19.00   19.16   19.25   19.30   19.33   19.36
 3         10.13    9.55    9.28    9.12    9.01    8.94    8.89
 4          7.71    6.94    6.59    6.39    6.26    6.16    6.08
 5          6.61    5.79    5.41    5.19    5.05    4.95    4.88
 6          5.99    5.14    4.76    4.53    4.39    4.28    4.21
 7          5.59    4.74    4.35    4.12    3.97    3.87    3.79
 8          5.32    4.46    4.07    3.84    3.69    3.58    3.50
 9          5.12    4.26    3.86    3.63    3.48    3.37    3.29
10          4.96    4.10    3.71    3.48    3.32    3.22    3.13
11          4.84    3.98    3.59    3.36    3.20    3.09    3.01
12          4.75    3.88    3.49    3.26    3.10    3.00    2.91
13          4.67    3.80    3.41    3.18    3.02    2.92    2.83
14          4.60    3.74    3.34    3.11    2.96    2.85    2.76
15          4.54    3.68    3.29    3.06    2.90    2.79    2.71
16          4.49    3.63    3.24    3.01    2.85    2.74    2.66
17          4.45    3.59    3.20    2.96    2.81    2.70    2.61
18          4.41    3.55    3.16    2.93    2.77    2.66    2.58
19          4.38    3.52    3.13    2.90    2.74    2.63    2.54
20          4.35    3.49    3.10    2.87    2.71    2.60    2.51
21          4.32    3.47    3.07    2.84    2.68    2.57    2.49
22          4.30    3.44    3.05    2.82    2.66    2.55    2.46
23          4.28    3.42    3.03    2.80    2.64    2.53    2.44
24          4.26    3.40    3.00    2.78    2.62    2.51    2.42
25          4.24    3.38    2.99    2.76    2.60    2.49    2.40
26
27
28
29
30
Student's t Distribution
(2-tailed probability)
df 0.40 0.05 0.01
1 1.376 12.706 63.657
2 1.061 4.303 9.925
3 0.978 3.182 5.841
4 0.941 2.776 4.604
5 0.920 2.571 4.032
6 0.906 2.447 3.707
7 0.896 2.365 3.499
8 0.889 2.306 3.355
9 0.883 2.262 3.250
10 0.879 2.228 3.169
11 0.876 2.201 3.106
12 0.873 2.179 3.055
13 0.870 2.160 3.012
14 0.868 2.145 2.977
15 0.866 2.131 2.947
16 0.865 2.120 2.921
17 0.863 2.110 2.898
18 0.862 2.101 2.878
19 0.861 2.093 2.861
20 0.860 2.086 2.845
21 0.859 2.080 2.831
22 0.858 2.074 2.819
23 0.858 2.069 2.807
24 0.857 2.064 2.797
25 0.856 2.060 2.787
26 0.856 2.056 2.779
27 0.855 2.052 2.771
28 0.855 2.048 2.763
29 0.854 2.045 2.756
30 0.854 2.042 2.750