Key

Name:___________________________________
Homework #9
Due: Thu, Jul 9, 11am
Read the directions carefully, and answer all parts of each question. Write neatly; if I can’t read it, it will
be counted wrong. You may work in groups, but you must submit your own solution. Any submissions
that are word-for-word identical will receive a score of 0. Use this sheet as a cover page or lose five
points. No late homework will be accepted under any circumstances.
With the recent plans to close the Guantanamo Bay prison, the question arises of where the current
“residents” should be placed upon its closure. Authorities are investigating which prisons have experience
in this matter. Below is the number of violent offenders held at the listed locations in the given years.
Year        Marion, IL   Florence, CO   Hardin, MT   Year Avg
2005            45            47            50       47.33333
2006            52            54            55       53.66667
2007            43            48            50       47
2008            48            49            51       49.33333
Loc. Avg        47            49.5          51.5
1. Using One-Way ANOVA, test that all location (factor) means are equal. If necessary, determine
which means differ from each other. SST = 132.667, SSB = 40.67, and SSW = 92 and α=0.05.
H0: μMarion = μFlorence = μHardin
HA: At least two means are different
Fcrit (α = 0.05, upper tail; dfnum = 3 – 1 = 2, dfden = 12 – 3 = 9) = 4.256
If Fstat > 4.256, reject H0. Else, fail to reject H0.
Fstat = (40.67 / (3 – 1)) / (92 / (12 – 3)) = 20.34 / 10.22 = 1.99
Fail to reject H0. All three prisons deal with the same number (statistically) of violent offenders.
At this point, you would run Tukey-Kramer to determine which means are different. However,
since we failed to reject the initial hypothesis, there’s no need. You’re done. Had you wanted
to run Tukey-Kramer, the critical value would have been:
Dcrit = 3.95 × √[ (92 / (12 – 3)) / 2 × (1/4 + 1/4) ] = 6.31
(3.95 is the studentized range value for 3 means and 12 – 3 = 9 degrees of freedom.)
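As a check on the sums of squares quoted in Question 1, here is a minimal Python sketch that recomputes them from the twelve counts in the table. The individual cell values are read from the table together with its year and location averages, and the variable names are illustrative only. The F statistic comes out at about 1.99 before any rounding of the inputs.

```python
# Violent-offender counts per location, ordered 2005, 2006, 2007, 2008
# (cell values as read from the table and its row/column averages).
counts = {
    "Marion": [45, 52, 43, 48],
    "Florence": [47, 54, 48, 49],
    "Hardin": [50, 55, 50, 51],
}


def mean(values):
    return sum(values) / len(values)


all_values = [x for group in counts.values() for x in group]
grand_mean = mean(all_values)

# Total, between-location, and within-location sums of squares.
sst = sum((x - grand_mean) ** 2 for x in all_values)
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in counts.values())
ssw = sst - ssb

k = len(counts)      # number of locations (factor levels)
n = len(all_values)  # total number of observations
f_stat = (ssb / (k - 1)) / (ssw / (n - k))

print(f"SST = {sst:.3f}, SSB = {ssb:.2f}, SSW = {ssw:.2f}, F = {f_stat:.2f}")
```

The printed F statistic is then compared with the tabled critical value for (k – 1, n – k) degrees of freedom.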
2. We also should ensure that there is no excess variation in the years that might affect our results.
Using Two-Way ANOVA, test that all yearly (block) means are equal. If necessary, determine
which means differ from each other and test to ensure that your original results are unaffected.
SST = 132.667, SSB = 40.67, SSBL = 84.67, and SSW = 7.33, and α=0.05.
H0: μ2005 = μ2006 = μ2007 = μ2008
HA: At least two means are different
“I may be drunk, Miss, but in the morning I will be sober and you will still be ugly.”
-Winston Churchill
Fcrit (α = 0.05, upper tail; dfnum = 4 – 1 = 3, dfden = (4 – 1)(3 – 1) = 6) = 4.757
If Fstat > 4.757, reject H0. Else, fail to reject H0.
Fstat = (84.67 / (4 – 1)) / (7.33 / ((4 – 1)(3 – 1))) = 23.10
Reject H0 (Big Time!). We have some different means.
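The block test can be reproduced with a short Python sketch that starts from the rounded sums of squares given in the problem statement. The helper name f_ratio is hypothetical; the sketch also checks that SST is partitioned into SSB + SSBL + SSW up to rounding.

```python
# Rounded sums of squares as given in the problem statement.
SST = 132.667  # total
SSB = 40.67    # between locations (factor)
SSBL = 84.67   # between years (blocks)
SSW = 7.33     # within, after removing factor and block variation

k = 3  # locations
b = 4  # years


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Ratio of two mean squares."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# In the two-way layout, SSW is what remains of SST after SSB and SSBL.
residual = SST - SSB - SSBL

df_error = (b - 1) * (k - 1)
f_block = f_ratio(SSBL, b - 1, SSW, df_error)

print(f"SST - SSB - SSBL = {residual:.2f}")
print(f"df = ({b - 1}, {df_error}), F_block = {f_block:.2f}")
```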
This implies two things. First, we’d like to know which means are different. Second, our original
One-Way tests are inaccurate. We’ll need to go back and redo those using the Two-Way
structure. But first, use Fisher’s LSD to determine which block (annual) means are different.
H0: μi = μj
HA: μi ≠ μj
Dcrit = 2.4469 × √(7.33 / ((4 – 1)(3 – 1))) × √(2/4) = 1.91
If Dstat > 1.91 or < -1.91, reject H0. Else, fail to reject H0.
μ2005 – μ2006 = 47.33 – 53.66 = -6.33,
Reject H0
μ2005 – μ2007 = 47.33 – 47 = 0.33,
Fail to Reject H0
μ2005 – μ2008 = 47.33 – 49.33 = -2,
Reject H0
μ2006 – μ2007 = 53.66 – 47 = 6.66,
Reject H0
μ2006 – μ2008 = 53.66 – 49.33 = 4.33,
Reject H0
μ2007 – μ2008 = 47 – 49.33 = -2.33,
Reject H0
Wow, nearly all pairs are unequal. 2005 and 2007 are statistically the same, but there is a lot of
variation in these means.
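The six pairwise comparisons can be sketched in Python as follows. The critical difference follows the formula used in this key, with t = 2.4469 and the factor √(2/4), both taken as given rather than derived here. The year means come from the table, and the dictionary names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Year (block) means from the data table.
year_means = {2005: 47.33333, 2006: 53.66667, 2007: 47.0, 2008: 49.33333}

T_CRIT = 2.4469                      # two-tailed t value, 6 df, alpha = 0.05
MSW = 7.33 / ((4 - 1) * (3 - 1))     # two-way mean square within
d_crit = T_CRIT * sqrt(MSW) * sqrt(2 / 4)  # formula as used in this key

reject = {}
for y1, y2 in combinations(year_means, 2):
    diff = year_means[y1] - year_means[y2]
    reject[(y1, y2)] = abs(diff) > d_crit
    verdict = "Reject H0" if reject[(y1, y2)] else "Fail to reject H0"
    print(f"{y1} vs {y2}: {diff:6.2f}  {verdict}")

print(f"Dcrit = {d_crit:.2f}")
```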
Now we need to re-run our original tests on the factors (locations) altering the SSW and dfSSW to
make the correct inference.
H0: μMarion = μFlorence = μHardin
HA: At least two means are different
Fcrit (α = 0.05, upper tail; dfnum = 3 – 1 = 2, dfden = (4 – 1)(3 – 1) = 6) = 5.143
If Fstat > 5.143, reject H0. Else, fail to reject H0.
Fstat = (40.67 / (3 – 1)) / (7.33 / ((4 – 1)(3 – 1))) = 16.64
Reject H0. Now that we’ve factored out the differences caused by yearly variation, it appears
that we have significant differences between the prisons. Let’s use Tukey-Kramer to figure out
which.
H0: μi = μj
HA: μi ≠ μj
Dcrit = 4.34 × √[ (7.33 / ((4 – 1)(3 – 1))) / 2 × (1/4 + 1/4) ] = 2.40
If Dstat > 2.40 or < -2.40, reject H0. Else, fail to reject H0.
μMarion – μFlorence = 47 – 49.5 = -2.5,
Reject H0.
μMarion – μHardin = 47 – 51.5 = -4.5,
Reject H0.
μFlorence – μHardin = 49.5 – 51.5 = -2,
Fail to Reject H0.
So, Marion has a different level of experience. Because of the signs on the Dstat’s associated with
Marion, it looks like they have significantly less experience than the other two prisons in dealing
with violent offenders. If I were the Justice Department, I’d “lose” the phone number to Marion
when placing Guantanamo detainees.
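The location comparisons can be sketched in Python. The studentized range value 4.34 and the two-way MSW are taken from the solution above, and the function name critical_range is hypothetical.

```python
from itertools import combinations
from math import sqrt

# Location (factor) means from the data table; each is based on 4 years.
loc_means = {"Marion": 47.0, "Florence": 49.5, "Hardin": 51.5}
N_PER_LOCATION = 4

Q_CRIT = 4.34                     # studentized range value used in this key
MSW = 7.33 / ((4 - 1) * (3 - 1))  # two-way mean square within


def critical_range(q, msw, n_i, n_j):
    """Tukey-Kramer critical range for two means with sample sizes n_i, n_j."""
    return q * sqrt((msw / 2) * (1 / n_i + 1 / n_j))


d_crit = critical_range(Q_CRIT, MSW, N_PER_LOCATION, N_PER_LOCATION)

differs = {}
for loc_a, loc_b in combinations(loc_means, 2):
    diff = loc_means[loc_a] - loc_means[loc_b]
    differs[(loc_a, loc_b)] = abs(diff) > d_crit
    verdict = "Reject H0" if differs[(loc_a, loc_b)] else "Fail to reject H0"
    print(f"{loc_a} - {loc_b} = {diff:5.1f}  {verdict}")

print(f"Dcrit = {d_crit:.2f}")
```

Only the two comparisons involving Marion exceed the critical range, which is what singles Marion out in the conclusion above.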
The Obama Administration recently raised the federal tax rate on cigarettes in an effort to curb
smoking. From a dataset compiled from the 50 states, the correlation between pack price and
cigarette purchases is -0.26.
3. Test that this correlation is significant (α=0.05).
H0: ρ = 0
HA: ρ ≠ 0
tcrit (df = 50 – 2 = 48) = ± 2.0106
If tstat > 2.0106 or < -2.0106, reject H0. Else, fail to reject H0.
tstat = -0.26 / √[ (1 – (-0.26)^2) / (50 – 2) ] = -1.87
Fail to reject H0. It appears that pack price and percentage of smokers do not have a
significant linear relationship.
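The test statistic for a correlation coefficient can be sketched in a few lines of Python. The function name correlation_t is hypothetical; r = -0.26 and n = 50 come from the problem statement.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    df = n - 2
    t_stat = r / sqrt((1 - r ** 2) / df)
    return t_stat, df


t_stat, df = correlation_t(-0.26, 50)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The absolute value of t is then compared with the two-tailed table value for the printed degrees of freedom.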
The following regression was run to further quantify the relationship between cigarette purchases
and pack price.
%SMOKEi = 26.14 – 1.05PRICEi + εi, r2 = 0.068
4. Answer or comment on the following:
a. If the national pack price were 3.00, what percent of the population would smoke?
26.14 – 1.05(3.00) = 22.99%
b. If the national pack price were 4.50, what percent of the population would smoke?
26.14 – 1.05(4.50) = 21.415%
c. The p-value on the coefficient on PRICEi is 0.06. Comment on the significance of pack
price as a predictor of smoking.
Since 0.06 > 0.05, at alpha = 0.05 we would fail to reject H0, meaning that pack
price is not a significant predictor of smoking. At alpha = 0.10, however, 0.06 < 0.10, so we
would reject H0 and call pack price significant.
d. The r2 = 0.068. Comment on the effectiveness of this regression.
This indicates that only 6.8% of the variation in smoking is explained by pack price. Because r2
is unit-free, this is not an artifact of measuring price in dollars and smoking in percentages;
pack price alone is not a strong predictor.
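The arithmetic in Question 4 can be sketched in Python. The coefficients, the correlation, and the p-value come from the problem statement; the function names are hypothetical. With a single predictor, r2 is the square of the correlation coefficient, which links the 0.068 here to the -0.26 in Question 3.

```python
INTERCEPT = 26.14  # estimated %SMOKE at a pack price of zero
SLOPE = -1.05      # estimated change in %SMOKE per dollar of pack price


def predicted_smoke_pct(price):
    """Fitted percentage of smokers at a given pack price."""
    return INTERCEPT + SLOPE * price


def is_significant(p_value, alpha):
    """Reject H0 (coefficient = 0) when the p-value is below alpha."""
    return p_value < alpha


for price in (3.00, 4.50):
    print(f"price {price:.2f}: {predicted_smoke_pct(price):.3f}% smoke")

# With one predictor, r-squared equals the squared correlation.
r_squared = (-0.26) ** 2
print(f"r2 = {r_squared:.4f}")

for alpha in (0.05, 0.10):
    print(f"alpha {alpha:.2f}: significant = {is_significant(0.06, alpha)}")
```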