BSC 5936.01 AUTUMN 2004
EXERCISE 3: ONE-WAY ANALYSIS OF VARIANCE

The Data

The data are the results from the manipulative experiment on pitcher plants that most of you executed under Tom Miller’s guidance. The Excel file includes these variables:

TREATMENT     color code for treatment levels as you implemented them
ANTS          one of three treatment levels, 0 (control), 2, or 20 added dead ants
REP           identity of the replicate (1-25)
VOLUME        amount of fluid remaining in the leaf when data were taken
STAGE I       number of stage I mosquito larvae found in the leaf
STAGE II      number of stage II mosquito larvae found in the leaf
STAGE III     number of stage III mosquito larvae found in the leaf
STAGE IV      number of stage IV mosquito larvae found in the leaf
TOTAL MOSQ.   total number of mosquito larvae found in the leaf
MITES         number of mites found in the leaf

For this exercise, you will use TOTAL MOSQ. (hereafter denoted as “mosquitoes”) and MITES (hereafter denoted as “mites”) as the response variables for analysis, disregarding the stage of each larva.

Please DO NOT INCLUDE REPLICATE 16 IN ANY ANALYSES: DELETE ALL THREE TREATMENT LEVELS IN THIS REPLICATE. You’ll notice there are no data for replicate 16 in the “red” treatment and Tom believes the data in the “orange” treatment to be compromised, so let’s just forget that one entirely. This means that you have 24 replicates of each of three treatment levels.

It appears to me (who is not an Excel jockey) that the entries in TOTAL MOSQ. are calculated as the sum of the preceding columns, which could make it difficult to import the file into some software packages intact; some packages require numbers and don’t accept formulas. If you delete the columns with “stage” designations, you may get blank or missing cells for TOTAL MOSQ. in the software file you create.
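One way to sidestep the formula problem is to recompute the total yourself after import and drop replicate 16 in the same step. The sketch below is only an illustration in plain Python: the rows and all counts in them are invented, the function name prepare is hypothetical, and it assumes the spreadsheet has already been read into one record per leaf with the column names listed above.

```python
# Invented example rows standing in for the spreadsheet (NOT the real data).
# Each row is one leaf: its ANTS level, its replicate, and the four stage counts.
STAGES = ("STAGE I", "STAGE II", "STAGE III", "STAGE IV")

rows = [
    {"ANTS": 0,  "REP": 15, "STAGE I": 2, "STAGE II": 1, "STAGE III": 0, "STAGE IV": 1},
    {"ANTS": 2,  "REP": 15, "STAGE I": 3, "STAGE II": 0, "STAGE III": 2, "STAGE IV": 0},
    {"ANTS": 20, "REP": 15, "STAGE I": 5, "STAGE II": 4, "STAGE III": 1, "STAGE IV": 2},
    {"ANTS": 0,  "REP": 16, "STAGE I": 1, "STAGE II": 1, "STAGE III": 1, "STAGE IV": 0},
    {"ANTS": 2,  "REP": 16, "STAGE I": 0, "STAGE II": 2, "STAGE III": 0, "STAGE IV": 0},
    {"ANTS": 20, "REP": 16, "STAGE I": 4, "STAGE II": 0, "STAGE III": 3, "STAGE IV": 1},
    {"ANTS": 0,  "REP": 17, "STAGE I": 0, "STAGE II": 0, "STAGE III": 1, "STAGE IV": 0},
    {"ANTS": 2,  "REP": 17, "STAGE I": 2, "STAGE II": 2, "STAGE III": 0, "STAGE IV": 1},
    {"ANTS": 20, "REP": 17, "STAGE I": 6, "STAGE II": 3, "STAGE III": 2, "STAGE IV": 0},
]


def prepare(rows, excluded_rep=16):
    """Drop every row of the excluded replicate and recompute TOTAL MOSQ.

    The total is rebuilt as a plain number from the four stage columns,
    so it no longer depends on a spreadsheet formula surviving the import.
    """
    cleaned = []
    for row in rows:
        if row["REP"] == excluded_rep:
            continue  # removes all three treatment levels of that replicate
        new_row = dict(row)  # copy so the imported rows are left untouched
        new_row["TOTAL MOSQ."] = sum(row[stage] for stage in STAGES)
        cleaned.append(new_row)
    return cleaned


cleaned = prepare(rows)
for row in cleaned:
    print(row["ANTS"], row["REP"], row["TOTAL MOSQ."])
```

With the real file, a quick check that each ANTS level ends up with 24 rows would confirm that replicate 16 was removed from all three treatment levels.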
Most software packages will recognize the values in ANTS as levels of a categorical variable, although you may need to tell the package that those numbers represent “categories” when you do analyses with them, so that the software does not treat 0, 2, and 20 ants as if you want a regression, which you do not want with these data. So be careful when you import these data and attempt to work with them in your favorite software package. Forewarned is forearmed.

The Questions

1. Examine the data; make box plots of the mosquitoes and mites for each treatment level and also tabulate the average numbers of mosquitoes and mites for each treatment level, along with the standard error, the minimum, and the maximum within each level.

2. Perform a one-way analysis of variance on each dependent variable, examining the residuals for each variable to see if they appear to match the assumptions of the analysis. From perusing the residuals, offer a diagnosis of whether the analysis is valid (i.e., whether the assumptions are violated) and comment upon whether you think that examining the box plots or the tabulated averages and other measures offered any hint about the validity of the analysis as diagnosed from the residual graphs.

3. If you think that the analyses of the raw data are valid, say so here and say why, then go to question 4. If you think they are not valid, then subject the data to an appropriate transformation and perform the analysis again, examine the residuals, and offer a new diagnosis. Do this until you find a transformation for which the pattern in the residuals satisfies you. If you cannot find such a transformation, choose the analysis that offers the most hope for a potentially valid analysis and then proceed to question 4. HINT: IF I WERE YOU, I’D TRY A SQUARE ROOT TRANSFORMATION, CREATING A NEW VARIABLE Y FROM THE ORIGINAL RAW DATA X AS $Y = \sqrt{X} + \sqrt{X + 1}$.

4. Use an a priori contrast to test the null hypothesis that the average value of the control treatment (0 ants added) does not differ from the average values of the two other treatment levels (do this for each response variable).

5. Now pretend you have no a priori hypothesis; if the omnibus test was significant, use a multiple comparison method of your choice to delineate which treatment levels appeared to be distinct from one another.

6. All things considered, looking at your results in total, write the “results” section of a paper on this experiment. This means that you must decide which results to present in the paper (all graphs, some graphs, some tables, all tables, box plots, residuals, etc.), the best order in which to present them, and how to write concise prose descriptions that get your points across. The answer to this question must take no more than ½ of a single page in 12 point type, single spaced, extra space between paragraphs, and include no more than two figures and one table (the ex-editor in me is now writing). Clearly this forces you to confront how much of the real work in data analysis is ever reflected in the written “results” sections of papers. Look at recent journal papers for models.

COMMENT. Of course, in the real world you would rarely, if ever, perform a multiple comparison if you had prior expectations, and you certainly wouldn’t include both analyses in a single write-up (that’s a hint for question 6, meaning you also have to decide if you had a prior expectation in your head to decide which results to include). In other words, either you had expectations or you didn’t, and you have to choose. One exception to this rule of thumb might be if you did an experiment that had a control and several treatment levels; you might legitimately test the a priori expectation that the control was different from the treatments in toto, but have no prior expectations about which treatment levels might be different from which other treatment levels.
In such a case, you might first test the effect of control vs. treatments and then use a modified multiple comparison to test for differences among treatment levels.
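
For those who want to see the arithmetic behind questions 2 through 4 rather than trust a menu item, here is a minimal sketch in plain Python. It is not a prescribed method. It assumes the square root transformation in the hint is Y = sqrt(X) + sqrt(X + 1), it assumes the control-versus-others comparison is coded as a single contrast with coefficients 1, -1/2, -1/2, and it uses invented counts in place of the real data. The function names are hypothetical. The sketch stops at the F and t statistics; p-values would come from the F and t distributions in your statistics package.

```python
import math
from statistics import mean


def sqrt_transform(x):
    """Square root transformation assumed from the hint: sqrt(x) + sqrt(x + 1)."""
    return math.sqrt(x) + math.sqrt(x + 1)


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within


def contrast_t(groups, coeffs, ms_within):
    """t statistic for a linear contrast of group means (coeffs sum to zero)."""
    estimate = sum(c * mean(g) for c, g in zip(coeffs, groups))
    std_err = math.sqrt(
        ms_within * sum(c ** 2 / len(g) for c, g in zip(coeffs, groups))
    )
    return estimate / std_err


# Invented counts for the 0, 2, and 20 ant levels (NOT the real data).
raw = [[1, 3, 2, 0, 4], [4, 6, 3, 5, 7], [9, 12, 8, 15, 11]]
transformed = [[sqrt_transform(x) for x in g] for g in raw]

f_stat, df_b, df_w, ms_w = one_way_anova(transformed)
# Control versus the average of the two ant-addition levels.
t_stat = contrast_t(transformed, (1.0, -0.5, -0.5), ms_w)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}; contrast t({df_w}) = {t_stat:.2f}")
```

The residuals you are asked to examine are simply each transformed value minus its own group mean, so the same group means computed above can be reused for the residual plots.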