1 Agenda: STAT 401 Week 6 Lab 1

1. Current HW questions
2. Next HW questions / practice exam questions
3. Example ANOVA and multiple comparison problems worked through with data (with JMP)
4. Example problems from book.

2 Examples

2.1 T-Rex Data

12 different bones were taken from a T-rex skeleton. Multiple samples were taken from each bone in order to estimate oxygen content in the bone. The data set (trex.csv) has two variables: oxygen content and bone number.

NB: this data set uses unequal sample sizes. If we were doing all the math by hand, this would get ugly and annoying. However, we will do it by computer.

1. Test the null hypothesis that all bones have the same mean oxygen isotope concentration. Report the p-value and a one-sentence conclusion.
2. The first five bones (Rib 16, the two Gastralia and the two vertebrae) are bones in the body cavity. The other seven are bones in the extremities (legs, feet or tail). Estimate the difference between the average of the "body" bones and the average of the "extremity" bones. Test the null hypothesis that this is 0. Report your estimate, its se, and the p-value for the test.
3. Use a plot of residuals vs predicted values to evaluate the ANOVA assumptions. NB: need to save residuals, save predicted.
4. Imagine that the investigators were specifically interested in the difference between bone 4 (dorsal vertebrae 1) and bone 11 (mid-caudal). Test whether this difference = 0. Report the T statistic and p-value. (no conclusion needed). Treat it as if this comparison is the only comparison we are interested in.
5. Actually, the contrast between bones 4 and 11 really isn't an a-priori question. If you used Tukey's adjustment for multiple comparisons, what p-value do you report for the comparison of bone 4 and 11?
6. Estimate a 95% confidence interval for the mean difference between bones 4 and 11.
7. Estimate 95% simultaneous confidence intervals for all pairs of bones using the appropriate adjustment.
Report the simultaneous interval for the difference between bones 4 and 11.
8. Is the interval for the mean difference longer, the same length, or shorter than the simultaneous confidence intervals? Briefly explain why this is reasonable.

2.2 Book Problem 10.7 p. 401

Note: this is not the homework problem 7.

Source     df    SS          MS        f
Mixture
Error                        13.929
Total            5664.415

2.3 Penicillin Example

This is an example of randomized block design from Box, Box, and Hunter. The questions we are interested in answering can be answered by referring to a two-way ANOVA and using Tukey's method. The data is saved as penicillin.dat.

There are four different processes for producing penicillin (A, B, C, and D). 5 batches of raw material (corn steep liquor sampled from rail tanker cars) are drawn. Each sample is split into four parts and each process is run on one part from each sample. The response variable is the yield of penicillin.

Model: $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$

• Test the null hypothesis of no treatment effect against the alternative that at least two population means are different.
• Test the null hypothesis of no block effect.
• Use the Tukey method to compare the sample means.
• Verify the ANOVA assumptions.

2.4 Chapter 11, Problem 9

The data set for problem 9 has been put in prob9.csv.

2.5 Chapter 11, Problem 18

The data set for problem 18 has been put in prob18.csv.

2.6 Sugar Cane Study: Interactions

This is a 3x3 factorial experiment with four replicates. The first factor is three varieties of sugar cane. The second factor is three levels of nitrogen:

N1: 150 lbs/acre
N2: 210 lbs/acre
N3: 270 lbs/acre

The data are in cane.dat. Does nitrogen have an effect on yield? Be careful.

2.7 More book problems

Ch 10: 3, 38
Ch 11: 6

3 References

• T-Rex Information
• STAT 401 Page
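
For the T-Rex questions in Section 2.1 (questions 1 and 2), the lab itself uses JMP. For anyone who wants to see the arithmetic the software performs with unequal sample sizes, the Python sketch below computes the one-way ANOVA F statistic and a "body" vs "extremity" style contrast with its standard error. Everything in it is an assumption for illustration: the four small groups are made-up numbers (not trex.csv), and the function names one_way_anova and contrast are hypothetical. p-values are left out because the Python standard library has no F or t distribution; in the lab they come from JMP.

```python
from math import sqrt

def one_way_anova(groups):
    """One-way ANOVA sums of squares; group sizes may differ.

    groups: dict mapping a group label to a list of observations.
    """
    all_obs = [y for ys in groups.values() for y in ys]
    n_total = len(all_obs)
    k = len(groups)
    grand_mean = sum(all_obs) / n_total
    means = {g: sum(ys) / len(ys) for g, ys in groups.items()}

    # Between-group SS weights each squared deviation by its group size.
    ss_between = sum(len(ys) * (means[g] - grand_mean) ** 2
                     for g, ys in groups.items())
    # Within-group (error) SS pools squared residuals over all groups.
    ss_within = sum((y - means[g]) ** 2
                    for g, ys in groups.items() for y in ys)

    df_between, df_within = k - 1, n_total - k
    mse = ss_within / df_within
    f_stat = (ss_between / df_between) / mse
    return {"means": means, "ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "mse": mse, "f": f_stat}

def contrast(groups, coefs, mse):
    """Estimate, standard error, and t statistic for sum(c_i * mean_i)."""
    estimate = sum(c * sum(groups[g]) / len(groups[g])
                   for g, c in coefs.items())
    se = sqrt(mse * sum(c ** 2 / len(groups[g]) for g, c in coefs.items()))
    return estimate, se, estimate / se

# Made-up data: A and B play the role of "body" bones, C and D "extremity".
toy = {"A": [10, 12], "B": [11, 13, 15], "C": [8, 10], "D": [7, 9, 11]}

fit = one_way_anova(toy)
# Average of the two "body" means minus average of the two "extremity" means.
coefs = {"A": 0.5, "B": 0.5, "C": -0.5, "D": -0.5}
est, se, t_stat = contrast(toy, coefs, fit["mse"])

print("F =", round(fit["f"], 3), "on", fit["df_between"], "and",
      fit["df_within"], "df")
print("contrast estimate =", round(est, 3), "se =", round(se, 3),
      "t =", round(t_stat, 3))
```

For the actual T-Rex contrast the same idea applies with coefficients of 1/5 on each of the five body bones and -1/7 on each of the seven extremity bones, and the t statistic is compared to a t distribution with the error degrees of freedom.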