CSS 590 Experimental Design in Agriculture

Lab exercise – 7th week
Orthogonal Contrasts
Regression in the ANOVA
SAS On-line Documentation
GLM Procedure
REG Procedure
Part I. Using contrast statements in a factorial experiment
An agricultural chemist suspected that the activity of root growth hormone was dependent on
temperature and concentration. He designed an experiment using two different temperature
baths of a nutrient solution containing 2 ppm, 4 ppm, and 6 ppm of the root growth hormone.
He had two assistants to help measure the root growth response so the experiment was run as
a randomized block design with two blocks.
TEMP  CONC  BLOCK  ROOTS
Cold  2     1      76
Cold  2     2      60
Cold  4     1      63
Cold  4     2      58
Cold  6     1      50
Cold  6     2      45
Warm  2     1      23
Warm  2     2      15
Warm  4     1      36
Warm  4     2      25
Warm  6     1      47
Warm  6     2      36
1) Copy this data into a SAS data step using the appropriate input statement and a datalines
statement.
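A minimal sketch of such a data step is shown here. The data set name A is an assumption, chosen only because Part II refers to data=A; the variable names follow the column headings of the table above, and temp is read as a character variable.

```sas
/* Sketch only: the data set name A is an assumption (Part II uses data=A). */
DATA A;
  INPUT temp $ conc block roots;   /* $ marks temp as a character variable */
  DATALINES;
Cold 2 1 76
Cold 2 2 60
Cold 4 1 63
Cold 4 2 58
Cold 6 1 50
Cold 6 2 45
Warm 2 1 23
Warm 2 2 15
Warm 4 1 36
Warm 4 2 25
Warm 6 1 47
Warm 6 2 36
;
RUN;

/* Print the data to confirm that all 12 observations were read correctly. */
PROC PRINT data=A;
RUN;
```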
2) Develop a set of orthogonal contrasts to answer the following questions:
a. Is there a difference in root growth due to temperature?
b. Is there a linear response to concentration?
c. Is there a quadratic response to concentration?
d. Does the nature of the response to concentration (linear or quadratic) depend on the
temperature?
Cold, 2 ppm   Cold, 4 ppm   Cold, 6 ppm   Warm, 2 ppm   Warm, 4 ppm   Warm, 6 ppm
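One way to check that a candidate set of contrasts is mutually orthogonal is to write each contrast as a row of coefficients over the six treatment means (in the order listed above) and multiply the matrix by its transpose. The sketch below assumes that SAS/IML is available; the matrix names C and G are illustrative only, and the coefficients are one possible answer to questions a through d.

```sas
/* Sketch only: assumes SAS/IML is licensed; C and G are illustrative names. */
PROC IML;
  /* Columns: Cold2 Cold4 Cold6 Warm2 Warm4 Warm6 */
  C = {-1 -1 -1  1  1  1,    /* main effect of temp */
       -1  0  1 -1  0  1,    /* conc linear         */
        1 -2  1  1 -2  1,    /* conc quadratic      */
        1  0 -1 -1  0  1,    /* temp x conc linear  */
       -1  2 -1  1 -2  1};   /* temp x conc quad    */
  rowsums = C[ ,+];          /* each contrast should sum to zero */
  G = C * T(C);              /* cross-products between contrasts */
  PRINT rowsums, G;
QUIT;
```

With equal replication, the contrasts are orthogonal when every off-diagonal element of G is zero; the diagonal elements are simply the sums of squared coefficients for each contrast.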
3) Run a SAS program on this data to address these questions.
PROC GLM;
TITLE 'contrasts in a factorial experiment';
CLASS block temp conc;
MODEL roots = block temp conc temp*conc;
/*make sure that your coefficients correspond to the order of your means*/
CONTRAST 'main effect of temp' temp -1 1;
CONTRAST 'conc lin' conc -1 0 1;
CONTRAST 'conc quad' conc 1 -2 1;
CONTRAST 'temp x conc lin' temp*conc 1 0 -1 -1 0 1;
CONTRAST 'temp x conc quad' temp*conc -1 2 -1 1 -2 1;
LSMEANS conc temp;
LSMEANS temp*conc/out=new;
PROC GPLOT data=new;
TITLE 'Effect of conc and temp on root growth';
SYMBOL1 i=join v=dot;
AXIS1 label=('Root growth');
PLOT lsmean*conc=temp/ vaxis=axis1;
RUN;
4) How would you interpret the output?
5) Given these results, the chemist would like to conduct additional experiments to see if
there are differences in root growth among varieties of this plant species at cold
temperatures with 2 ppm root growth hormone. He thinks a range in root growth of about
20 g would be of practical importance. He wants to know how many replications will be
needed to detect differences of this magnitude. After running the power analysis below,
what will he conclude? How many reps would he need to detect differences at the 0.01
probability level?
PROC POWER;
Title 'determine #reps when Power=0.80, alpha=0.05';
onewayanova test=overall
groupmeans = 48|58|68
ALPHA = 0.05
stddev = 3
power = 0.80
npergroup = .
;
run;
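For the second question, the same analysis can be repeated with only the significance level changed. This is a sketch that keeps the group means, standard deviation, and power from the program above and assumes nothing else about the planned experiment.

```sas
/* Sketch only: identical to the program above except for ALPHA and the title. */
PROC POWER;
  TITLE 'determine #reps when Power=0.80, alpha=0.01';
  ONEWAYANOVA TEST=OVERALL
    GROUPMEANS = 48|58|68
    ALPHA = 0.01
    STDDEV = 3
    POWER = 0.80
    NPERGROUP = .     /* the missing value tells PROC POWER to solve for n */
  ;
RUN;
```

Comparing the computed N per group from the two runs shows how many extra replications the stricter test would cost.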
Part II. Reanalysis of Part I using nonclass variables
One could analyze the data from Part I by considering the quantitative factor
(concentration) to be a nonclass variable:
PROC GLM data=A;
TITLE 'regression vs contrasts in a factorial experiment';
CLASS block temp;
MODEL roots = block temp conc conc*conc temp*conc temp*conc*conc/solution;
RUN;
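To compare this regression with the contrasts in Part I, it may help to look at the Type I (sequential) sums of squares, which add each term in the order given in the MODEL statement. The sketch below assumes the data set A from Part I; the SS1 option simply limits the printed table to Type I SS. Because the design is balanced and the concentrations are equally spaced, the sequential SS for conc, conc*conc, temp*conc, and temp*conc*conc should correspond to the 'conc lin', 'conc quad', 'temp x conc lin', and 'temp x conc quad' contrasts, respectively.

```sas
/* Sketch only: same model as above, restricted to Type I (sequential) SS. */
PROC GLM data=A;
  TITLE 'sequential SS for the polynomial terms';
  CLASS block temp;
  MODEL roots = block temp conc conc*conc temp*conc temp*conc*conc / solution ss1;
RUN;
QUIT;
```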
Part III: Regression with orthogonal polynomials compared to regression on a
quantitative variable (this section is optional!)
An experiment was conducted to determine the effect of storage temperature on seed viability.
Fifteen seed samples were obtained, and three samples selected at random from the fifteen
were stored at each of five temperatures: 10, 30, 50, 70, and 90. At the end of a one-year storage
period, the samples were tested for germination percentage, with the following results:
Temp  Germination
10    62
10    55
10    57
30    26
30    36
30    31
50    16
50    15
50    23
70    10
70    11
70    18
90    13
90    11
90    9
1) Copy this data into a SAS data step using the appropriate input statement and a datalines
statement.
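A minimal sketch of the data step follows. The data set name three is an assumption, chosen to match the SET statement used later in this part; the programs that refer to data=yourfile would need the same name substituted.

```sas
/* Sketch only: the data set name three is an assumption (see "set three" below). */
DATA three;
  INPUT Temp Germination;   /* both variables are numeric */
  DATALINES;
10 62
10 55
10 57
30 26
30 36
30 31
50 16
50 15
50 23
70 10
70 11
70 18
90 13
90 11
90 9
;
RUN;
```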
2) Use Proc GLM to analyze the data as a CRD. Remember to specify that temperature is a
class variable. Are there significant differences among the Temperature treatments?
3) Use Contrast statements to determine if the effect of temperature is linear, quadratic, cubic,
or quartic (refer to the table of polynomial coefficients handed out in class).
PROC GLM;
TITLE 'Use of orthogonal polynomials';
CLASS Temp;
MODEL Germination = Temp;
CONTRAST 'Temp Linear' Temp -2 -1 0 1 2;
CONTRAST 'Temp Quadratic' Temp 2 -1 -2 -1 2;
CONTRAST 'Temp Cubic' Temp -1 2 0 -2 1;
CONTRAST 'Temp Quartic' Temp 1 -4 6 -4 1;
LSMEANS Temp / stderr out=new2;
PROC GPLOT data=new2;
PLOT lsmean*Temp;
RUN;
QUIT;
Based on your output, which polynomials will you retain in your model?
4) Now try running the same regression, but consider Temperature to be a continuous rather
than a class variable.
PROC GLM data=yourfile;
TITLE 'Linear Regression of germination on temperature';
MODEL Germination = Temp;
PROC GLM data=yourfile;
TITLE 'Quadratic Regression of germination on temperature';
MODEL Germination = Temp Temp*Temp;
PROC GLM data=yourfile;
TITLE 'Cubic Regression of germination on temperature';
MODEL Germination = Temp Temp*Temp Temp*Temp*Temp;
PROC GLM data=yourfile;
TITLE 'Quartic Regression of germination on temperature';
MODEL Germination = Temp Temp*Temp Temp*Temp*Temp Temp*Temp*Temp*Temp;
RUN;
Proc GLM provides direct estimates of the regression coefficients for nonclass variables.
Note how these change as we go to higher order polynomials.
Compare Type I vs Type III SS as we go to higher order polynomials. For regression
analysis using nonclass variables we generally use the Type I (sequential) SS. Type I SS
partitions the Model SS into component SS due to adding each variable sequentially to the
model in the order that they appear in the model statement. Type III SS (partial SS) determines
the SS explained by each variable after all of the other variables have been included in the
model, regardless of the order that they appear in the model statement. Compare the SS for
the quartic regression to the output from the analysis of contrasts.
What happens to MSE as we go to higher order polynomials? How does it compare to the
result from the analysis of contrasts? How do you explain the differences in the error term?
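One way to explore the last question is to split the residual of a lower-order regression into lack of fit and pure error. The sketch below is not part of the original exercise: it assumes the data are in a data set named three, and TempClass is a hypothetical copy of Temp created only so that temperature can appear both as a regressor and as a class variable. With Type I SS, the TempClass line then holds whatever variation among the five temperature means the quadratic polynomial did not explain.

```sas
/* Sketch only: TempClass and the data set name lof are hypothetical. */
DATA lof;
  SET three;
  TempClass = Temp;   /* copy of Temp to be used as a class variable */
RUN;

PROC GLM data=lof;
  TITLE 'Quadratic regression with lack-of-fit partition';
  CLASS TempClass;
  /* Terms are fit in order: linear, quadratic, then remaining treatment variation */
  MODEL Germination = Temp Temp*Temp TempClass / ss1;
RUN;
QUIT;
```

The error term in this model is based only on variation among samples within a temperature, so its MSE can be compared with the MSE from the analysis of contrasts.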
PROC REG is a general purpose regression procedure in SAS that could be used in this
example. PROC GLM or PROC MIXED are better choices if you have both class and nonclass
variables in your model. PROC REG has nine options for selecting the best model. In our
example, we could request a forward selection process to sequentially add terms to the model.
Because PROC REG cannot analyze a squared or higher order polynomial term in the model
statement, we must first create new variables representing these terms:
data four;
set three;
tempsq=temp*temp;
tempcu=temp*temp*temp;
tempqu=temp*temp*temp*temp;
PROC REG data=four;
TITLE 'Forward Selection';
MODEL Germination = Temp tempsq tempcu tempqu/SELECTION=FORWARD ss1;
RUN;
QUIT;