STAT 217
Project 3
Due in class, March 6, 2009
2-WAY ANOVA
As part of a study of ocean production of greenhouse gases, a researcher needs to evaluate how
different concentrations of dimethylsulfoniopropionate (DMSP) are related to differences in methane
production in the surface ocean. This could demonstrate a potential source of methane, a greenhouse
gas.
The treatment is the concentration of DMSP set at 1, 15, or 30 nanoMoles. The response is a
concentration of methane (in nanoMoles) measured by removing a small amount of the gas atop a
sealed beaker of sea water with a syringe. Concentrations are expected to accumulate over time in the
collection beaker, with higher production from the higher concentrations of DMSP.
Each of the 9 different beakers was set up and then was randomly assigned a concentration of DMSP
(1, 15, or 30). Each would be run for 27 days and would be measured every 3 days starting on the 3rd
day, producing 8 different measurements over time from each beaker. Each level of DMSP is
replicated 3 times, for a total of 9 beakers. A total of 9 (beakers) * 8 (time points) = 72 measurements are to
be made in the experiment. Two factors are of interest, DMSP concentration (1, 15, or 30) and Day of
the study (1, 2, 3, …, 8). We want to initially consider the possibility of a day by DMSP concentration
interaction using a 2-WAY ANOVA.
This data set was generated based on potential observations to explore how the analysis would be
performed and the types of conclusions that might be available. Researchers sometimes try to
anticipate the types of results they might expect in advance and should always make sure they know
how to analyze data that they are planning to collect. Sometimes the best way to do that is to simulate
observations and try to analyze them. You can treat this as real data for your analysis.
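The kind of simulation described above could be sketched in R roughly as follows. This is only an illustration of the layout (3 DMSP levels, 3 beakers per level, 8 time points): the mean structure, the error standard deviation, the seed, and the object names `design`, `sim`, `fit`, and `tab` are all hypothetical and are not the values used to generate the project data. It uses base R `anova()` rather than `Anova()` so that it runs without extra packages.

```r
# Minimal sketch: simulate a data set with the same layout as the study
# (9 beakers x 8 time points) and fit the 2-way ANOVA with an interaction.
# All numeric settings below are hypothetical, chosen only for illustration.
set.seed(217)

# 3 DMSP levels, 3 beakers per level, 8 measurement occasions per beaker
design <- expand.grid(time = 1:8, rep = 1:3, level = c(1, 15, 30))
design$beaker <- factor(paste(design$level, design$rep, sep = "_"))

# Hypothetical mean structure: methane accumulates over time,
# and accumulates faster at higher DMSP concentrations
mu <- 1 + design$time * (0.5 + 0.02 * design$level)
design$y <- mu + rnorm(nrow(design), sd = 2)

# Treat both explanatory variables as categorical factors
sim <- data.frame(y      = design$y,
                  level  = factor(design$level),
                  time   = factor(design$time),
                  beaker = design$beaker)

# Interaction model; with a balanced design the sequential table from
# anova() has the same sums of squares as a Type II table
fit <- lm(y ~ level * time, data = sim)
tab <- anova(fit)
print(tab)
```

Because each level by time combination has exactly 3 observations, the degrees of freedom in the simulated table have the same structure as in the output provided for the project.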
After the questions, you will find R output that will allow you to complete the questions. In your
report, you will need to extract the necessary information from the output and report it with your
answers. Not all the output will be needed. Unlike the other projects, this one is due the Friday
after Exam 2. We recommend completing the project prior to the exam
since it is on material that will be included on the exam.
1) Using the interaction plot below, summarize the results. Assess the potential for an interaction
based on this plot alone, and, if present, describe its shape.
2) Find the ANOVA table for the 2-WAY ANOVA model with an interaction. Report the ANOVA
table.
3) For the first test that you should conduct in this situation with this model, make a decision and
write a conclusion.
4) Two different designs were considered for this study. In the design that was not used, a beaker
could not be measured again once it had been measured, essentially killing the
beaker for continued use. This would have required many more beakers than the design they chose to
use. Instead, they chose to repeatedly measure each beaker once it was set up.
Note all the assumptions involved in the model discussed in #2.
Use the provided information about the design of the experiment, as well as the diagnostic plots
available below, to discuss whether the assumptions are met. Reference specific plots or parts of the
discussion in your answer.
5) Find and report the ANOVA table for the additive model. Make a decision for each test in the
table.
6) Assuming the assumptions in #4 are met, is there ever a situation where the tests in #5 are
dangerous to consider? Is that the case here?
7) A plot of the results of fitting the model in #5 is also provided below and is easily distinguished
from the plot used in #1. Discuss these results based on the previous results.
8) Compare the ANOVA tables from #2 and #5, explaining similarities and differences between the
tables.
9) What can you say about the effect of DMSP on methane production based on the analysis you have completed?
R OUTPUT AND PLOTS: You need to extract the needed information into your project. It should
stand alone without the assignment sheet. More output is provided here than you will need.
[Figure: Plot of Means. Mean of methane$y versus methane$time (1 to 8), with one line per methane$level (1, 15, 30).]
[Figure: Diagnostic plots for lm(y ~ level * time). Panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Constant Leverage: Residuals vs Factor Levels. Observations 28, 29, and 30 are labeled in the plots.]
[Figure: Plot of Means. Mean of fitted(Model.2) versus methane$time, with one line per methane$level (1, 15, 30).]
> Model.1 <- (lm(y ~ level*time, data=methane))
> Anova(Model.1)
Anova Table (Type II tests)
Response: y
           Sum Sq Df F value    Pr(>F)
level      174.42  2 19.0116 8.299e-07 ***
time       542.82  7 16.9051 5.002e-11 ***
level:time  83.65 14  1.3025    0.2411
Residuals  220.18 48
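As a check on how the F value in the level:time row follows from the rest of the table, the usual ratio of mean squares can be worked out by hand. This is a sketch using the rounded sums of squares printed in the Model.1 output, so the last digit can differ slightly from what R reports.

```latex
F_{\text{level:time}}
  = \frac{SS_{\text{level:time}} / df_{\text{level:time}}}
         {SS_{\text{Residuals}} / df_{\text{Residuals}}}
  = \frac{83.65 / 14}{220.18 / 48}
  \approx \frac{5.975}{4.587}
  \approx 1.30
```

The same ratio, with the row's own sum of squares and degrees of freedom in the numerator, gives the F values in the level and time rows.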
> tapply(methane$y, list(level=methane$level, time=methane$time), mean,
+ na.rm=TRUE) # means
     time
level        1        2         3         4         5         6         7         8
   1  2.245726 1.233817 0.6283477 0.6928263  3.860116  6.374335  7.004601  7.413630
   15 1.569131 2.821702 4.4734457 5.3913263  6.581243  6.943360  7.977578  8.559368
   30 1.111514 3.096452 5.5158552 7.1259956 10.094806 11.134223 10.998626 10.872325
> tapply(methane$y, list(level=methane$level, time=methane$time), sd,
+ na.rm=TRUE) # std. deviations
     time
level         1        2         3         4        5        6         7         8
   1  0.2683789 1.443737 1.0489705 0.6792052 1.805697 1.628782 0.6767806 0.5442285
   15 2.5176581 1.720250 2.9120991 3.1869022 3.250689 3.839488 2.9182479 2.9276899
   30 1.6796614 1.420231 0.6024972 2.1034778 2.088390 2.046137 2.1690521 2.5487892
> tapply(methane$y, list(level=methane$level, time=methane$time), function(x)
+ sum(!is.na(x))) # counts
time
level 1 2 3 4 5 6 7 8
1 3 3 3 3 3 3 3 3
15 3 3 3 3 3 3 3 3
30 3 3 3 3 3 3 3 3
> Model.2 <- lm(y ~ level + time, data=methane)
> Anova(Model.2, type="II")
Anova Table (Type II tests)
Response: y
          Sum Sq Df F value    Pr(>F)
level     174.42  2  17.796 7.804e-07 ***
time      542.82  7  15.824 1.012e-11 ***
Residuals 303.83 62
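One way to read the additive table against the Model.1 table is to check where its Residuals row comes from. The arithmetic below is a sketch based only on the rounded values printed in the two tables: dropping the level:time term moves its sum of squares and degrees of freedom into the residual row, which changes the denominator of each F ratio.

```latex
SS_{\text{Residuals}}^{\text{additive}} = 220.18 + 83.65 = 303.83,
\qquad
df_{\text{Residuals}}^{\text{additive}} = 48 + 14 = 62

F_{\text{level}}
  = \frac{174.42 / 2}{303.83 / 62}
  \approx \frac{87.21}{4.900}
  \approx 17.80
```

The sums of squares for level and time are unchanged between the two tables; only the residual row, and therefore the F values and p-values, differ.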
> Model.3 <- lm(y ~ level, data=methane)
> Anova(Model.3, type="II")
Anova Table (Type II tests)
Response: y
          Sum Sq Df F value   Pr(>F)
level     174.42  2  7.1073 0.001561 **
Residuals 846.66 69
> Model.4 <- lm(y ~ time, data=methane)
> Anova(Model.4, type="II")
Anova Table (Type II tests)
Response: y
          Sum Sq Df F value    Pr(>F)
time      542.82  7  10.377 1.262e-08 ***
Residuals 478.25 64