CROP 590 Experimental Design in Agriculture
Lab exercise – 9th week

RBD with subsamples
Repeated Measures

SAS On-line Documentation: PROC GLM, PROC MIXED

Part I. RBD with subsamples

We will analyze data from the Randomized Complete Block Design (RBD) that we looked at in Week 3, but in this case we will use data from subsamples taken within each plot.

A horticulturist performed a field experiment to determine the effect of fungicide treatments applied to plots where azaleas were to be grown. The treatments consisted of a control (no fungicide), an old product and a new product. The fungicides were applied to plots before inoculation in a Randomized Complete Block Design with 4 replications. Two plants were planted in each plot after inoculation. After several weeks the plants were dug up and their root weights were recorded.

1. Download the data for this exercise from the file Lab9_subsample.xlsx.

2. In Week 3 we analyzed an RBD using plot means. Now we’ll analyze the subsample data from which the averages were derived. Create a data set in SAS from the data on the “subsamples” spreadsheet (in the file Lab9_subsample.xlsx). For the analysis, the model statement is now:

    PROC GLM;
    Title 'RBD with subsamples';
    Class Block Fungicide;
    Model Rootwt = Block Fungicide Block*Fungicide;

The Block*Fungicide interaction mean square is the experimental error that you would want to use for constructing confidence intervals for your treatment means. We need to identify the error term that SAS should use in calculating the F statistics with the statements:

    TEST H=Block E=Block*Fungicide;
    TEST H=Fungicide E=Block*Fungicide;

The same error term needs to be specified when you run an LSD test on treatment means.

    MEANS Fungicide/LSD E=Block*Fungicide;
    RUN;

3. Were there significant differences among the fungicides? Was blocking effective? Was the plot-to-plot variation greater than the variation among plants within plots?

4.
Compare these F tests to the F tests that were computed by default using the residual mean square (sampling error). This demonstrates why it is so important to know which F tests are appropriate when using SAS. In this example, only the Block*Fungicide mean square should be tested with the residual mean square as an error term.

5. Compare your output to the results obtained in Week 3 (see Lab3_Annotations.pdf). Compare the magnitude of the Sums of Squares and Mean Squares between the two analyses (i.e., using plot means vs. using subsamples). Are the results of the F tests for treatments the same?

6. If you are willing to assume that Blocks and Block*Fungicide are random effects, you can use the Random statement in PROC GLM to generate expected mean squares and the appropriate F tests. You will still need to specify the correct error term if you request means and standard errors.

    PROC GLM;
    Class Block Fungicide;
    Model Rootwt = Block Fungicide Block*Fungicide;
    Random Block Block*Fungicide/Test;
    lsmeans Fungicide/stderr E=block*fungicide;
    RUN;

7. Alternatively, you can analyze the data using mixed models. PROC MIXED will give you estimates of variances for random effects, and F tests for fixed effects. The covtest option can be used to get tests of significance for variance components. Estimates of standard error are more realistic than from an ANOVA, because the additional variation from random effects due to sampling from a larger reference population is taken into account. The syntax for PROC MIXED is the same as for PROC GLM, but only the fixed effects are included in the MODEL statement, and a RANDOM statement is required to specify random effects.

    PROC MIXED covtest;
    Class Block Fungicide;
    Model Rootwt = Fungicide;
    Random Block Block*Fungicide;
    lsmeans Fungicide/pdiff;
    RUN;

Look at the Covariance parameter estimate for Block*Fungicide.
Try using the Mean Squares and Expected Mean Squares from the ANOVA to derive the value given for the Block*Fungicide variance in PROC MIXED.

Note the value for the standard error of a difference among fungicide treatments. This should be the same as the value from PROC GLM, because comparisons of treatments are not affected by the variation among blocks. Can you calculate the standard error of a difference from the ANOVA output?

Part II. Repeated Measures

When repeated measurements are taken on the same experimental units over time, the residuals associated with multiple observations from a particular plot (or experimental unit) may be correlated (not independent). If the correlations are similar for all pairs of observations over time, then it may be possible to analyze the data as a split-plot, using time as the sub-plot. The assumption that all correlations among repeated measures are the same is known as compound symmetry (CS). If the correlations are not the same, then the CS assumption may inflate the Type I error rate, and some adjustments are needed. In this exercise we will first analyze a data set as a split-plot, and then compare results using the repeated statement in SAS PROC MIXED to make adjustments for correlated errors.

Yield was determined for four varieties of alfalfa that were harvested at four times (cuttings) (Lab9_repeated.xlsx). The experiment was conducted as a CRD with five replications. (Data were taken from a course taught by Prof. Dubcovsky at UC-Davis.)

1) Run the analysis as a split-plot. Since this is a CRD, we will use the rep(variety) MS as the error term for the main plots (varieties). Cuttings and varieties*cuttings can be tested using the residual as an error term (the default). Conduct LSD tests for varieties and cuttings. Remember to specify the appropriate error term if you do not wish to use the residual to calculate LSD.

2) Note how the data has been reformatted on the ‘repeated’ spreadsheet in the data file.
Each cutting is now considered as another variable. Input this data set and run the repeated analysis. The option ‘nouni’ suppresses the output of a separate analysis for each cutting. The ‘repeated’ statement will generate both a multivariate analysis (MANOVA) and a univariate ANOVA.

    PROC GLM;
    title 'repeated measures analysis';
    class rep variety;
    model yield1 yield2 yield3 yield4 = variety / nouni;
    repeated cutting / printe;
    RUN;

Note the two adjusted probabilities (the Greenhouse-Geisser and Huynh-Feldt adjustments) that are intended to correct for the unequal correlations among pairs of repeated measures. Compare these results with your split-plot analysis. The MANOVA approach tends to be very conservative (it inflates Type II error).

3) PROC MIXED has many features that are ideal for analysis of repeated measures (but beyond the scope of this course). Below is an example of an analysis that could be used to adjust for the error structure in this trial. The data input is the same as it was for the split-plot analysis of this data set.

    PROC MIXED data=one covtest;
    title 'repeated measures using mixed models - unstructured';
    class rep cutting variety;
    model yield = variety cutting variety*cutting;
    repeated cutting / subject=rep(variety) type=un r rcorr;
    run;

In some situations it may be appropriate to include a ‘random’ statement in addition to the ‘repeated’ statement. The value of -2 Res Log Likelihood for this unstructured model can be compared to other models for repeated measures analyses that are available. Smaller values indicate a better fit for the model. The level of significance of the Null Model Likelihood Ratio Test shows how much better the current model is than an analysis that makes no adjustments for correlations among the errors.
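
The comparison of -2 Res Log Likelihood values described above could be set up along the following lines. This is only a minimal sketch: it assumes the same data set (one) and variable names as the unstructured run, and the two alternative covariance structures shown, compound symmetry (type=cs) and first-order autoregressive (type=ar(1)), are chosen for illustration rather than prescribed by this exercise. The AR(1) structure also assumes that the cuttings are roughly equally spaced in time, which may or may not be reasonable for this trial.

```sas
/* Illustrative comparison runs (not part of the original exercise).   */
/* Same data set, CLASS, and MODEL statements as the unstructured run; */
/* only the TYPE= option on the REPEATED statement changes.            */

/* Compound symmetry: one common correlation for all pairs of cuttings */
proc mixed data=one covtest;
  title 'repeated measures using mixed models - compound symmetry';
  class rep cutting variety;
  model yield = variety cutting variety*cutting;
  repeated cutting / subject=rep(variety) type=cs r rcorr;
run;

/* AR(1): correlation declines as cuttings get further apart in time   */
proc mixed data=one covtest;
  title 'repeated measures using mixed models - AR(1)';
  class rep cutting variety;
  model yield = variety cutting variety*cutting;
  repeated cutting / subject=rep(variety) type=ar(1) r rcorr;
run;
```

Because the fixed effects are identical in all of these runs, their -2 Res Log Likelihood values can be compared directly. The Fit Statistics table also reports AIC and BIC, which penalize the larger number of covariance parameters in the unstructured model.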