STAT 217 Project 3
Due in class, March 6, 2009
2-WAY ANOVA

As part of a study of ocean production of greenhouse gases, a researcher needs to evaluate how different concentrations of dimethylsulfoniopropionate (DMSP) are related to differences in methane production in the surface ocean. This could demonstrate a potential source of methane, a greenhouse gas.

The treatment is the concentration of DMSP, set at 1, 15, or 30 nanoMoles. The response is the concentration of methane (in nanoMoles), measured by using a syringe to remove a small amount of the gas atop a sealed beaker of sea water. Concentrations are expected to accumulate over time in the collection beaker, with higher production from the higher concentrations of DMSP.

Each of the 9 different beakers was set up and then randomly assigned a concentration of DMSP (1, 15, or 30). Each would be run for 27 days and would be measured every 3 days starting on the 3rd day, producing 8 different measurements over time from each beaker. Each level of DMSP is replicated 3 times, for a total of 9 beakers. A total of 9 (beakers) * 8 (time points) = 72 measurements are to be made in the experiment.

Two factors are of interest: DMSP concentration (1, 15, or 30) and Day of the study (1, 2, 3, …, 8). We want to initially consider the possibility of a Day by DMSP concentration interaction using a 2-WAY ANOVA.

This data set was generated based on potential observations to explore how the analysis would be performed and the types of conclusions that might be available. Researchers sometimes try to anticipate the types of results they might expect in advance and should always make sure they know how to analyze data that they are planning to collect. Sometimes the best way to do that is to simulate observations and try to analyze them. You can treat this as real data for your analysis.

After the questions, you will find R output that will allow you to complete the questions.
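To illustrate what "simulate observations and try to analyze them" can look like for this design, here is a minimal R sketch. It is not the code used to generate the project data. The seed, the slopes, the noise standard deviation, and the names methane.sim and Model.sim are all hypothetical choices made for illustration, and the noise is assumed to be independent across measurements. Base R's anova() is used so that no extra package is required.

```r
set.seed(217)  # arbitrary seed, for reproducibility only

# Design from the handout: 9 beakers, 8 measurement occasions per beaker
design <- expand.grid(time = 1:8, beaker = 1:9)

# Randomly assign each DMSP level (1, 15, 30) to 3 of the 9 beakers
dmsp <- sample(rep(c(1, 15, 30), each = 3))
design$level <- dmsp[design$beaker]

# Hypothetical data-generating values (NOT the ones behind the project data):
# methane accumulates over time, faster at higher DMSP, plus independent noise
slope <- c(0.8, 1.0, 1.4)[match(design$level, c(1, 15, 30))]
y <- slope * design$time + rnorm(nrow(design), mean = 0, sd = 2)

methane.sim <- data.frame(
  y     = y,
  level = factor(design$level),
  time  = factor(design$time)
)

# Cell counts: 3 observations in each of the 24 level-by-time cells
counts <- table(methane.sim$level, methane.sim$time)
print(counts)

# 2-WAY ANOVA model with an interaction, fit to the simulated data
Model.sim <- lm(y ~ level * time, data = methane.sim)
print(anova(Model.sim))
```

Changing the hypothetical slopes or the noise level and rerunning the sketch shows how the ANOVA table responds to stronger or weaker effects.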
In your report, you will need to extract the necessary information from the output and report it with your answers. Not all the output will be needed. This project differs from the others that we will complete because it is due the Friday after Exam 2. We recommend completing the project prior to the exam since it covers material that will be included on the exam.

1) Using the interaction plot below, summarize the results. Assess the potential for an interaction based on this plot alone, and, if present, describe its shape.

2) Find the ANOVA table for the 2-WAY ANOVA model with an interaction. Report the ANOVA table.

3) For the first test that you should conduct in this situation with this model, make a decision and write a conclusion.

4) Two different designs were considered for this study. The design that was not used required that once a beaker was measured, it could not be measured again, essentially killing the beaker for continued use. This would have required many more beakers than the design they chose to use. Instead, they chose to repeatedly measure each beaker once it was set up. Note all the assumptions involved in the model discussed in #2. Use the provided information about the design of the experiment as well as the diagnostic plots available below to discuss whether the assumptions are met. Reference specific plots or parts of the discussion in your answer.

5) Find and report the ANOVA table for the additive model. Make a decision for each test in the table.

6) Assuming the assumptions in #4 are met, is there ever a situation where the tests in #5 are dangerous to consider? Is that the case here?

7) A plot of the results of fitting the model in #5 is also provided below and is easily distinguished from the plot used in #1. Discuss these results based on the previous results.

8) Compare the ANOVA tables from #2 and #5, explaining similarities and differences between the tables.

9) What can you say about the effect of DMSP on methane production based on the analysis you have completed?

R OUTPUT AND PLOTS:
You need to extract the needed information into your project. It should stand alone without the assignment sheet. More output is provided here than you will need.

[Figure: Plot of Means. Mean of methane$y versus methane$time (1 to 8), with one line per methane$level (1, 15, 30).]

[Figure: Diagnostic plots for lm(y ~ level * time). Panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Constant Leverage: Residuals vs Factor Levels. Observations 28, 29, and 30 are labeled in the panels.]

[Figure: Plot of Means. Mean of fitted(Model.2) versus methane$time (1 to 8), with one line per methane$level (1, 15, 30).]

> Model.1 <- (lm(y ~ level*time, data=methane))
> Anova(Model.1)
Anova Table (Type II tests)

Response: y
           Sum Sq Df F value    Pr(>F)
level      174.42  2 19.0116 8.299e-07 ***
time       542.82  7 16.9051 5.002e-11 ***
level:time  83.65 14  1.3025    0.2411
Residuals  220.18 48

> tapply(methane$y, list(level=methane$level, time=methane$time), mean,
+   na.rm=TRUE) # means
     time
level        1        2         3         4         5         6         7
   1  2.245726 1.233817 0.6283477 0.6928263  3.860116  6.374335  7.004601
   15 1.569131 2.821702 4.4734457 5.3913263  6.581243  6.943360  7.977578
   30 1.111514 3.096452 5.5158552 7.1259956 10.094806 11.134223 10.998626
     time
level         8
   1   7.413630
   15  8.559368
   30 10.872325

> tapply(methane$y, list(level=methane$level, time=methane$time), sd,
+   na.rm=TRUE) # std. deviations
     time
level         1        2         3         4        5        6         7
   1  0.2683789 1.443737 1.0489705 0.6792052 1.805697 1.628782 0.6767806
   15 2.5176581 1.720250 2.9120991 3.1869022 3.250689 3.839488 2.9182479
   30 1.6796614 1.420231 0.6024972 2.1034778 2.088390 2.046137 2.1690521
     time
level         8
   1  0.5442285
   15 2.9276899
   30 2.5487892

> tapply(methane$y, list(level=methane$level, time=methane$time), function(x)
+   sum(!is.na(x))) # counts
     time
level 1 2 3 4 5 6 7 8
   1  3 3 3 3 3 3 3 3
   15 3 3 3 3 3 3 3 3
   30 3 3 3 3 3 3 3 3

> Model.2 <- lm(y ~ level + time, data=methane)
> Anova(Model.2, type="II")
Anova Table (Type II tests)

Response: y
          Sum Sq Df F value    Pr(>F)
level     174.42  2  17.796 7.804e-07 ***
time      542.82  7  15.824 1.012e-11 ***
Residuals 303.83 62

> Model.3 <- lm(y ~ level, data=methane)
> Anova(Model.3, type="II")
Anova Table (Type II tests)

Response: y
          Sum Sq Df F value   Pr(>F)
level     174.42  2  7.1073 0.001561 **
Residuals 846.66 69

> Model.4 <- lm(y ~ time, data=methane)
> Anova(Model.4, type="II")
Anova Table (Type II tests)

Response: y
          Sum Sq Df F value    Pr(>F)
time      542.82  7  10.377 1.262e-08 ***
Residuals 478.25 64
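
When extracting values from the output, it can help to see how the columns of an ANOVA table relate to one another. The following base R sketch recomputes the F values and p-values for the Model.1 table from its Sum Sq and Df columns. The variable names are hypothetical, and because the sums of squares are typed in as printed (rounded), the results should agree with the table only up to small rounding differences.

```r
# Sum Sq and Df copied from the Model.1 (interaction model) table
ss <- c(level = 174.42, time = 542.82, "level:time" = 83.65)
df <- c(level = 2, time = 7, "level:time" = 14)
ss.resid <- 220.18
df.resid <- 48

ms  <- ss / df              # mean square for each term
mse <- ss.resid / df.resid  # residual mean square

F.value <- ms / mse         # each F statistic is a mean square over the MSE
p.value <- pf(F.value, df1 = df, df2 = df.resid, lower.tail = FALSE)

print(cbind(Df = df, "Sum Sq" = ss, "F value" = F.value, "Pr(>F)" = p.value))
```

The same arithmetic applies to the Model.2, Model.3, and Model.4 tables once their own Sum Sq and Df values are substituted.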