CROP 590 Experimental Design in Agriculture

Lab exercise – 9th week
RBD with subsamples
Repeated Measures
SAS On-line Documentation
PROC GLM
PROC MIXED
Part I. RBD with subsamples
We will analyze data from the Randomized Complete Block Design (RBD) that we looked
at in Week 3, but in this case we will be utilizing data from subsamples taken within each plot.
A horticulturist performed a field experiment to determine the effect of fungicide
treatments applied to plots where azaleas were to be grown. The treatments consisted of a
control (no fungicide), an old product and a new product. The fungicides were applied to plots
before inoculation in a Randomized Complete Block Design with 4 replications. Two plants were
planted in each plot after inoculation. After several weeks the plants were dug up and their root
weights were recorded.
1. Download the data for this exercise from the file Lab9_subsample.xlsx.
2. In Week 3 we analyzed an RBD using plot means. Now we’ll analyze the subsample data
from which the plot means were calculated. Create a data set in SAS from the data on the
“subsamples” spreadsheet (in the file Lab9_subsample.xlsx).
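One way to create the data set is with PROC IMPORT, sketched below. The file path and the data set name subsamp are placeholders, and the sketch assumes that the first row of the worksheet holds the variable names Block, Fungicide, and Rootwt (the names used in the PROC GLM statements of this exercise). Adjust these to match your copy of the file. Reading .xlsx files this way requires the SAS/ACCESS Interface to PC Files.

```sas
/* Sketch: read the "subsamples" worksheet into a SAS data set.   */
/* The path is a placeholder; point it at your copy of the file.  */
proc import datafile="C:\path\to\Lab9_subsample.xlsx"
            out=subsamp          /* hypothetical data set name */
            dbms=xlsx
            replace;
   sheet="subsamples";
   getnames=yes;                 /* first row holds variable names */
run;

/* Check the import: 4 blocks x 3 fungicides x 2 plants = 24 rows */
proc print data=subsamp;
run;
```

Because the PROC GLM steps in this exercise have no DATA= option, they will use the most recently created data set, which here would be subsamp.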
For the analysis, the model statement is now:
PROC GLM;
Title 'RBD with subsamples';
Class Block Fungicide;
Model Rootwt = Block Fungicide Block*Fungicide;
The Block*Fungicide interaction mean square is the experimental error that you would
want to use for constructing confidence intervals for your treatment means. By default, SAS
calculates F statistics using the residual (sampling error), so we need to identify the
correct error term with the statements:
TEST H=Block E=Block*Fungicide;
TEST H=Fungicide E=Block*Fungicide;
The same error term would need to be specified when you run an LSD test on treatment
means.
MEANS Fungicide/LSD E=Block*Fungicide;
RUN;
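The reasoning behind these error terms can be written out with the linear model for this design. The notation below is an assumption introduced for illustration and does not appear in the SAS output: $\beta_i$ is the block effect, $\tau_j$ is the fungicide effect, $\varepsilon_{ij}$ is the plot-to-plot (experimental) error, and $\delta_{ijk}$ is the plant-to-plant (sampling) error within a plot.

```latex
\[
Y_{ijk} = \mu + \beta_i + \tau_j + \varepsilon_{ij} + \delta_{ijk},
\qquad i = 1,\dots,4;\quad j = 1,2,3;\quad k = 1,2
\]
\[
F_{\text{Fungicide}} = \frac{MS_{\text{Fungicide}}}{MS_{\text{Block*Fungicide}}},
\qquad
F_{\text{Block*Fungicide}} = \frac{MS_{\text{Block*Fungicide}}}{MS_{\text{Residual}}}
\]
```

With 4 blocks, 3 fungicides, and 2 plants per plot (24 observations), the degrees of freedom for Block, Fungicide, Block*Fungicide, and the residual are 3, 2, 6, and 12.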
3. Were there significant differences among the fungicides? Was blocking effective? Was the
plot-to-plot variation greater than the variation among plants within plots?
4. Compare these F tests to the F tests that were computed by default using the residual mean
square (sampling error). This demonstrates why it is so important to know which F tests are
appropriate when using SAS. In this example, only the Block*Fungicide mean square should
be tested with the residual mean square as an error term.
5. Compare your output to the results obtained in Week 3 (see Lab3_Annotations.pdf).
Compare the magnitude of the Sums of Squares and Mean Squares between the two
analyses (i.e., using plot means vs with subsamples). Are the results of the F tests for
treatments the same?
6. If you are willing to assume that Block and Block*Fungicide are random effects, you can
use the Random statement in PROC GLM to generate expected mean squares and the
appropriate F tests. You will still need to specify the correct error term if you request means
and standard errors.
PROC GLM;
Class Block Fungicide;
Model Rootwt = Block Fungicide Block*Fungicide;
Random Block Block*Fungicide/Test;
lsmeans Fungicide/stderr E=block*fungicide;
RUN;
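For reference, when blocks and plots are treated as random, the expected mean squares for this balanced design take the general form sketched below. The symbols are assumptions introduced here, not SAS notation: $\sigma^2_\delta$ is the sampling variance among plants within plots, $\sigma^2_\varepsilon$ is the plot-to-plot variance, $\sigma^2_\beta$ is the block variance, and $\tau_j$ are the fixed fungicide effects, with $r = 4$ blocks, $t = 3$ fungicides, and $n = 2$ plants per plot. In the SAS output the corresponding terms appear as Var(Error), Var(Block*Fungicide), Var(Block), and Q(Fungicide).

```latex
\begin{align*}
E[MS_{\text{Block}}]           &= \sigma^2_\delta + n\,\sigma^2_\varepsilon + tn\,\sigma^2_\beta \\
E[MS_{\text{Fungicide}}]       &= \sigma^2_\delta + n\,\sigma^2_\varepsilon
                                  + \frac{rn \sum_{j} \tau_j^2}{t-1} \\
E[MS_{\text{Block*Fungicide}}] &= \sigma^2_\delta + n\,\sigma^2_\varepsilon \\
E[MS_{\text{Residual}}]        &= \sigma^2_\delta
\end{align*}
```

Each F test uses as its denominator the mean square whose expectation matches the numerator's expectation except for the term being tested, which is why Block and Fungicide are tested against Block*Fungicide.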
7. Alternatively, you can analyze the data using mixed models. PROC MIXED will give you
estimates of variances for random effects, and F tests for fixed effects. The covtest option
can be used to get tests of significance for variance components. Estimates of standard
error are more realistic than from an ANOVA, because the additional variation from random
effects due to sampling from a larger reference population is taken into account. The syntax
for PROC MIXED is similar to that of PROC GLM, but only the fixed effects are included in the
MODEL statement, and a RANDOM statement is required to specify random effects.
PROC MIXED covtest;
Class Block Fungicide;
Model Rootwt = Fungicide;
Random Block Block*Fungicide;
lsmeans Fungicide/pdiff;
RUN;
Look at the Covariance parameter estimate for Block*Fungicide. Try using the Mean
Squares and Expected Mean Squares from the ANOVA to derive the value given for the
Block*Fungicide variance in PROC MIXED.
Note the value for the standard error of a difference between two fungicide treatments. This
should be the same as the value from PROC GLM, because comparisons of treatments are
not affected by the variation among blocks. Can you calculate the standard error of a
difference from the ANOVA output?
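As a hint for both calculations, the general formulas for a balanced design are sketched below. The symbols are assumptions introduced here: $n = 2$ is the number of plants per plot, $r = 4$ is the number of blocks, and $\hat\sigma^2_\varepsilon$ is the estimated plot-to-plot (Block*Fungicide) variance. For balanced data, the default REML estimates from PROC MIXED agree with these ANOVA-based estimates as long as the estimates are not negative.

```latex
\[
\hat\sigma^2_\varepsilon
  = \frac{MS_{\text{Block*Fungicide}} - MS_{\text{Residual}}}{n}
\]
\[
SE\left(\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot j' \cdot}\right)
  = \sqrt{\frac{2\,MS_{\text{Block*Fungicide}}}{r\,n}}
\]
```

Each fungicide mean is an average over $rn = 8$ plants, which is where the divisor in the second formula comes from.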
Part II. Repeated Measures
When repeated measurements are taken on the same experimental units over time, the
residuals associated with multiple observations from a particular plot (or experimental unit) may
be correlated (not independent). If the correlations are similar for all pairs of observations over
time, then it may be possible to analyze the data as a split-plot, using time as the sub-plot. The
assumption that the repeated measures have equal variances and that all pairs of them are
equally correlated is known as compound symmetry (CS). If the correlations are not the same,
then analyzing the data under the CS assumption may inflate the
Type I error rate, and some adjustments are needed. In this exercise we will first analyze a
data set as a split-plot, and then compare results using the repeated statement in SAS PROC
MIXED to make adjustments for correlated errors.
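To make the CS assumption concrete, the covariance matrix for one experimental unit measured at four times (as with the four cuttings in this exercise) can be sketched as follows. The symbols $\sigma^2$ (the common variance) and $\rho$ (the common correlation) are notation assumed here for illustration.

```latex
\[
\operatorname{Cov}
\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \end{pmatrix}
= \sigma^2
\begin{pmatrix}
1    & \rho & \rho & \rho \\
\rho & 1    & \rho & \rho \\
\rho & \rho & 1    & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}
\]
```

This structure has only two parameters. By contrast, an unstructured covariance matrix for four times allows 4 separate variances and 6 separate covariances, for 10 parameters in all.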
Yield was determined for four varieties of alfalfa that were harvested at four times (cuttings)
(Lab9_repeated.xlsx). The experiment was conducted as a Completely Randomized Design (CRD)
with five replications. (Data were taken from a course taught by Prof. Dubcovsky at UC-Davis.)
1) Run the analysis as a split-plot. Since this is a CRD, we will use the rep(variety) MS as the
error term for the main plots (varieties). Cuttings and varieties*cuttings can be tested using
the residual as an error term (the default). Conduct LSD tests for varieties and cuttings.
Remember to specify the appropriate error term if you do not wish to use the residual to
calculate LSD.
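A minimal sketch of the split-plot analysis described in step 1 is given below. It assumes a long-format data set named one (the name used in the PROC MIXED example in this handout) with one row per rep, variety, and cutting combination and the variables rep, variety, cutting, and yield. Treat it as a starting point rather than the only correct program.

```sas
/* Sketch of the split-plot analysis; assumes a long-format data set */
/* named "one" with variables rep, variety, cutting, and yield.      */
proc glm data=one;
   title 'split-plot analysis of repeated cuttings';
   class rep variety cutting;
   /* rep(variety) is the main-plot error in a CRD */
   model yield = variety rep(variety) cutting variety*cutting;
   /* test varieties against the main-plot error */
   test h=variety e=rep(variety);
   /* LSD for varieties uses the main-plot error */
   means variety / lsd e=rep(variety);
   /* LSD for cuttings uses the residual (the default) */
   means cutting / lsd;
run;
```

With 4 varieties, 5 replications, and 4 cuttings (80 observations), the degrees of freedom should be 3 for variety, 16 for rep(variety), 3 for cutting, 9 for variety*cutting, and 48 for the residual.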
2) Note how the data has been reformatted on the ‘repeated’ spreadsheet in the data file.
Each cutting is now considered as another variable. Input this data set and run the repeated
analysis. The option ‘nouni’ suppresses the output of a separate analysis for each cutting.
The ‘repeated’ statement will generate both a multivariate analysis (MANOVA) and a
univariate ANOVA.
PROC GLM;
title 'repeated measures analysis';
class rep variety;
model yield1 yield2 yield3 yield4=variety/ nouni;
repeated cutting /printe;
run;
Note the two adjusted probabilities (the Greenhouse-Geisser and Huynh-Feldt adjustments) that
are intended to correct for the unequal correlations among pairs of repeated measures. Compare
these results with your split-plot analysis. The MANOVA approach tends to be very
conservative (it has low power, which inflates the Type II error rate).
3) PROC MIXED has many features that are ideal for analysis of repeated measures (but
beyond the scope of this course). Below is an example of an analysis that could be used to
adjust for the error structure in this trial. The data input is the same as it was for the
split-plot analysis of this data set.
PROC MIXED data=one covtest;
title 'repeated measures using mixed models - unstructured';
class rep cutting variety;
model yield=variety cutting variety*cutting;
repeated cutting / subject=rep(variety) type=un r rcorr;
run;
In some situations it may be appropriate to include a ‘random’ statement in addition to the
‘repeated’ statement. The value of -2 Res Log Likelihood for this unstructured model can be
compared to the values for other covariance structures that are available for repeated
measures analyses, provided the fixed effects are the same in each model. Smaller values
indicate a better fit. Because adding covariance parameters tends to reduce this value, the
AIC and BIC in the Fit Statistics table, which penalize extra parameters, are often compared
as well. The level of significance of the Null Model Likelihood
Ratio Test shows whether the current model fits better than an analysis that makes no
adjustments for correlations among the errors.
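As an illustration of such a comparison, the sketch below refits the same fixed effects with two simpler covariance structures, compound symmetry (type=cs) and first-order autoregressive (type=ar(1)). It assumes the same data set one used above. Whether AR(1) is sensible depends on the cuttings being roughly equally spaced in time, which is an assumption here and is not stated in the exercise.

```sas
/* Sketch: same fixed effects, compound symmetry covariance structure */
proc mixed data=one covtest;
   title 'repeated measures using mixed models - compound symmetry';
   class rep cutting variety;
   model yield = variety cutting variety*cutting;
   repeated cutting / subject=rep(variety) type=cs r rcorr;
run;

/* Sketch: same fixed effects, first-order autoregressive structure */
proc mixed data=one covtest;
   title 'repeated measures using mixed models - AR(1)';
   class rep cutting variety;
   model yield = variety cutting variety*cutting;
   /* AR(1) assumes equally spaced cuttings (an assumption here) */
   repeated cutting / subject=rep(variety) type=ar(1) r rcorr;
run;
```

The Fit Statistics tables from these runs can then be set side by side with the one from the unstructured model.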