Name:___________________________________
Homework #9
Due: Thu, Jul 9, 11am

Read the directions carefully, and answer all parts of each question. Write neatly; if I can’t read it, it will be counted wrong. You may work in groups, but you must submit your own solution. Any submissions that are word-for-word identical will receive a score of 0. Use this sheet as a cover page or lose five points. No late homework will be accepted under any circumstances.

With the recent plans to close Guantanamo Bay prison, the question arises of where the current “residents” will be placed upon its closure. Authorities are investigating which prisons would have experience in this matter. Below is the number of violent offenders held at the listed locations in the given years.

Year        Marion, IL   Florence, CO   Hardin, MT   Year Avg
2005        45           47             50           47.33333
2006        52           54             55           53.66667
2007        43           48             50           47
2008        48           49             51           49.33333
Loc. Avg    47           49.5           51.5

1. Using One-Way ANOVA, test that all location (factor) means are equal. If necessary, determine which means differ from each other. SST = 132.667, SSB = 40.67, SSW = 92, and α=0.05.

H0: μMarion = μFlorence = μHardin
HA: At least two means are different

Fcrit (dfnum = 3 – 1 = 2, dfden = 12 – 3 = 9) = 4.256
If Fstat > 4.256, reject H0. Else, fail to reject H0.

$$F_{stat} = \frac{40.67/(3-1)}{92/(12-3)} = 1.99$$

Fail to reject H0. Statistically, all three prisons deal with the same number of violent offenders.

At this point, you would run Tukey-Kramer to determine which means are different. However, since we failed to reject the initial hypothesis, there’s no need. You’re done. Had you wanted to run Tukey-Kramer, the critical value (using the studentized range value for 3 means and 12 – 3 = 9 degrees of freedom) would have been:

$$D_{crit} = 3.95\sqrt{\frac{92/(12-3)}{2}\left(\frac{1}{4}+\frac{1}{4}\right)} = 6.31$$

2. We also should ensure that there is no excess variation in the years that might affect our results. Using Two-Way ANOVA, test that all yearly (block) means are equal. If necessary, determine which means differ from each other and test to ensure that your original results are unaffected.
SST = 132.667, SSB = 40.67, SSBL = 84.67, SSW = 7.33, and α=0.05.

H0: μ2005 = μ2006 = μ2007 = μ2008
HA: At least two means are different

“I may be drunk, Miss, but in the morning I will be sober and you will still be ugly.” -Winston Churchill

Fcrit (dfnum = 4 – 1 = 3, dfden = (4 – 1)(3 – 1) = 6) = 4.757
If Fstat > 4.757, reject H0. Else, fail to reject H0.

$$F_{stat} = \frac{84.67/(4-1)}{7.33/\left((4-1)(3-1)\right)} = 23.10$$

Reject H0 (Big Time!). We have some different means. This implies two things. First, we’d like to know which means are different. Second, our original One-Way tests are inaccurate. We’ll need to go back and redo those using the Two-Way structure. But first, use Fisher’s LSD to determine which block (annual) means are different.

H0: μi = μj
HA: μi ≠ μj

$$D_{crit} = 2.4469\sqrt{\frac{7.33}{(4-1)(3-1)}}\sqrt{\frac{2}{4}} = 1.91$$

If Dstat > 1.91 or < -1.91, reject H0. Else, fail to reject H0.

μ2005 – μ2006 = 47.33 – 53.66 = -6.33, Reject H0
μ2005 – μ2007 = 47.33 – 47 = 0.33, Fail to Reject H0
μ2005 – μ2008 = 47.33 – 49.33 = -2, Reject H0
μ2006 – μ2007 = 53.66 – 47 = 6.66, Reject H0
μ2006 – μ2008 = 53.66 – 49.33 = 4.33, Reject H0
μ2007 – μ2008 = 47 – 49.33 = -2.33, Reject H0

Wow, nearly all pairs are unequal. 2005 and 2007 are statistically the same, but there is a lot of variation in these means. Now we need to re-run our original test on the factors (locations), replacing SSW and dfSSW with their Two-Way values to make the correct inference.

H0: μMarion = μFlorence = μHardin
HA: At least two means are different

Fcrit (dfnum = 3 – 1 = 2, dfden = (4 – 1)(3 – 1) = 6) = 5.143
If Fstat > 5.143, reject H0. Else, fail to reject H0.

$$F_{stat} = \frac{40.67/(3-1)}{7.33/\left((4-1)(3-1)\right)} = 16.64$$

Reject H0.
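The sums of squares and F statistics used in questions 1 and 2 can be recomputed from the raw counts. The following is a minimal Python sketch, assuming the table values as read from the assignment (45, 47, 50 for 2005, and so on); the variable names are illustrative and not part of the assignment.

```python
# Counts of violent offenders: one row per year (block),
# columns are Marion, Florence, Hardin (factor levels).
rows = [
    [45, 47, 50],  # 2005
    [52, 54, 55],  # 2006
    [43, 48, 50],  # 2007
    [48, 49, 51],  # 2008
]
b = len(rows)      # number of blocks (years) = 4
k = len(rows[0])   # number of factor levels (locations) = 3
grand = sum(sum(r) for r in rows) / (b * k)

loc_means = [sum(r[j] for r in rows) / b for j in range(k)]
year_means = [sum(r) / k for r in rows]

# Sums of squares
sst = sum((x - grand) ** 2 for r in rows for x in r)
ssb = b * sum((m - grand) ** 2 for m in loc_means)    # between locations
ssbl = k * sum((m - grand) ** 2 for m in year_means)  # between years (blocks)
ssw_one_way = sst - ssb          # within, ignoring blocks
ssw_two_way = sst - ssb - ssbl   # within, after removing block variation

# F statistics
f_one_way = (ssb / (k - 1)) / (ssw_one_way / (b * k - k))
df_within = (b - 1) * (k - 1)
f_block = (ssbl / (b - 1)) / (ssw_two_way / df_within)
f_factor = (ssb / (k - 1)) / (ssw_two_way / df_within)

print(f"SST={sst:.3f} SSB={ssb:.3f} SSBL={ssbl:.3f}")
print(f"SSW one-way={ssw_one_way:.3f} two-way={ssw_two_way:.3f}")
print(f"F one-way={f_one_way:.2f} block={f_block:.2f} factor={f_factor:.2f}")
```

Because the sketch works from unrounded sums of squares, the last printed digit of an F statistic can differ slightly from a hand calculation that starts from the rounded values given in the problem.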
Now that we’ve factored out the differences caused by yearly variation, it appears that we have significant differences between the prisons. Let’s use Tukey-Kramer to figure out which.

H0: μi = μj
HA: μi ≠ μj

$$D_{crit} = 4.34\sqrt{\frac{7.33/\left((4-1)(3-1)\right)}{2}\left(\frac{1}{4}+\frac{1}{4}\right)} = 2.40$$

If Dstat > 2.40 or < -2.40, reject H0. Else, fail to reject H0.

μMarion – μFlorence = 47 – 49.5 = -2.5, Reject H0.
μMarion – μHardin = 47 – 51.5 = -4.5, Reject H0.
μFlorence – μHardin = 49.5 – 51.5 = -2, Fail to Reject H0.

So, Marion has a different level of experience. Because both Dstat values involving Marion are negative, it looks like Marion has significantly less experience than the other two prisons in dealing with violent offenders. If I were the Justice Department, I’d “lose” the phone number to Marion when placing Guantanamo detainees.

The Obama Administration recently raised the federal tax rate on cigarettes in an effort to curb smoking. From a dataset compiled from the 50 states, the correlation between pack price and cigarette purchases is -0.26.

3. Test that this correlation is significant (α=0.05).

H0: ρ = 0
HA: ρ ≠ 0

tcrit = ± 2.0086
If tstat > 2.0086 or < -2.0086, reject H0. Else, fail to reject H0.

$$t_{stat} = \frac{-0.26}{\sqrt{\dfrac{1-(-0.26)^2}{50-2}}} = -1.87$$

Fail to reject H0. It appears that pack price and percentage of smokers do not have a significant linear relationship.

The following regression was run to further quantify the relationship between cigarette purchases and pack price.

%SMOKEi = 26.14 – 1.05PRICEi + εi, r2 = 0.068

4. Answer or comment on the following:
a. If the national pack price were 3.00, what percent of the population would smoke?
26.14 – 1.05(3.00) = 22.99%
b. If the national pack price were 4.50, what percent of the population would smoke?
26.14 – 1.05(4.50) = 21.415%
c.
The p-value on the coefficient on PRICEi is 0.06. Comment on the significance of pack price as a predictor of smoking.
At α = 0.05 we would fail to reject H0 (0.06 > 0.05), meaning that pack price is not a significant predictor of smoking. At α = 0.10, however, we would reject H0 (0.06 < 0.10).
d. The r2 = 0.068. Comment on the effectiveness of this regression.
This indicates that only 6.8% of the variation in smoking is explained by pack price. Because r2 does not depend on units, measuring price in dollars and smoking in percentages is not what makes it small; pack price is simply not a strong predictor.
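The arithmetic in questions 3 and 4 can be reproduced with a short Python sketch. The function names `corr_t_stat` and `predict_smoke` are illustrative and not part of the assignment; the inputs (r = -0.26, n = 50, and the fitted coefficients 26.14 and -1.05) come from the problems above.

```python
import math


def corr_t_stat(r, n):
    """t statistic for H0: rho = 0, based on n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


def predict_smoke(price, intercept=26.14, slope=-1.05):
    """Predicted percent of smokers from the fitted line in question 4."""
    return intercept + slope * price


t_stat = corr_t_stat(-0.26, 50)
print(f"t statistic: {t_stat:.2f}")

for price in (3.00, 4.50):
    print(f"price {price:.2f} -> predicted %SMOKE {predict_smoke(price):.3f}")

# r^2 = 0.068 is the share of variation in %SMOKE explained by PRICE.
r_squared = 0.068
print(f"explained variation: {r_squared:.1%}")
```

The t statistic is then compared with the critical value from question 3, and the two predictions correspond to parts a and b of question 4.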