Homework 9 Key

Stat 4220 Homework due April 10
1. For each scenario, write the letter for the kind of hypothesis test or confidence interval described.
Hint: You will use each letter either twice or never.
A. One sample z-test for a mean
B. One sample t-test for a mean
C. Matched pairs difference in means
D. Two sample z-test for means independent
E. Two sample t-test for means independent
F. One sample z-test for a proportion
G. One sample t-test for a proportion
H. Two sample z-test for p1-p2
I. None of the above
i.
A_______ An anthropology major believes the distribution of homes per city from the Anasazi
Indians is normally distributed with a standard deviation of 12 homes. A random sample of 10
Anasazi cities shows an average of 46 homes. He wants an 85% confidence interval for the true
overall average.
ii.
H_______ A History major suspects that Paris has more criminals today than it did in 1500. She
learns that in 1500 there were 200 thousand people, and 2 thousand criminals. Today there are
2,211 thousand people, and 30 thousand criminals. She wonders if the difference is significant.
iii.
C_______ An international studies student has found 90 families where one sibling is living in
the US and the other sibling is living in China. The average weight of the US siblings is 195 pounds with
a standard deviation of 20 pounds. The average weight of the Chinese siblings is 180 pounds with a
standard deviation of 15 pounds. The standard deviation of the difference across siblings was 8
pounds. She plans on writing a book discussing whether this is evidence that the American
lifestyle is more fattening than the Chinese lifestyle.
iv.
I_______ A psychology major wants to know how much money it would take before a person
would do the Macarena in Prexy’s Pasture. He randomly samples 20 people and gets an
average of $30 with a standard deviation of $90. He wants to use 93% confidence.
v.
B_______ A criminal justice major wants to know the average time a drug dealer spends in jail
in Colorado. The mayor says it should be longer than 15 years. Assume the distribution is
normal. A random sample of 10 convicted drug dealers has an average of 20 years with a
standard deviation of 5 years. The goal is to test the mayor’s claim.
vi.
I_______ A theater and dance major wants to know if more women or men have seen a ballet.
He randomly samples 200 women and finds 11% have seen a ballet. He samples 200 men and
finds 7% have seen a ballet. He wants to use a 10% significance test.
vii.
B_______ A communication major wants to know the average blood pressure for someone who
is about to give a speech. He randomly samples 40 people before they give a speech and gets
an average systolic blood pressure of 190 with a standard deviation of 30 mmHg. He wants a
98% confidence interval for the true average systolic blood pressure of someone who is about
to give a talk.
viii.
A_______ An art major is testing whether a new painting was made by Michelangelo. It is
known that the amount of lead in a square inch of any of Michelangelo’s paintings has a mean
of 82 ppm and a standard deviation of 13 ppm. On the new painting 60 random square inches
are selected, and there is an average of 70 ppm of lead per square. She wants to test if this
painting has significantly different lead levels on average using α=0.01.
ix.
E_______ An accounting major knows the marketing people are getting paid more than the
finance people. He wants a 96% confidence interval for the difference in salaries between the
two majors. The 80 marketing people average $62/year with a standard deviation of $12/year.
The 50 finance people average $59/year with a standard deviation of $4/year. The standard
deviation of the differences is $3.2/year. His confidence interval will be used to accuse the CFO
of favoritism.
x.
F_______ A philosophy major wants to estimate the proportion of people who know what a
philosophy major does with 95% confidence. He randomly samples 100 people and exactly half
know what he does.
xi.
F_______ A political science major wants to know whether more than half the people in
Laramie vote on election day. A random sample of 350 people showed 185 of them voted.
xii.
C_______ A journalism major is tracking the number of protests in San Francisco and
New York. He randomly selects 100 days and finds the number of protests in each city on each
of those days. The average in New York was 2.4 protests, the average in San Francisco was 0.7
protests. The standard deviation in New York was 2.3 while in San Francisco it was 5.7 and the
standard deviation of the differences was 1.2 protests. His goal is to find with 80% confidence
what the average difference is in the number of protests between the two cities.
xiii.
H_______ A biology major wants to know the difference between spraying your counter with
Lysol and spraying it with alcohol. A petri dish with a million bacteria on it had 99% of the
germs die with Lysol. A different dish with a million bacteria on it had 80% die with alcohol. He
wants a 95% confidence interval for the true population difference.
xiv.
E_______ An English major thinks contemporary books have more words than they did 50 years
ago. She randomly selects 40 books that were written this year, and randomly selects 40 books
written 50 years ago. Her data shows that modern books have an average of 140 thousand
words with a standard deviation of 70 thousand words. Fifty years ago it was an average of 90
thousand words with a standard deviation of 10 thousand words. She wants a test with 10%
significance.
2. 80 random men and 70 random women were sent to visit the orthodontist for the first time. 61 of
the men and 55 of the women came back with braces. Write the equation for the 96% confidence
interval of difference in the proportions of men and women who will get braces at the
orthodontist. Use the numbers given in the problem without simplifying (except for the z or t
score). You do not need to solve it out.
(61/80 - 55/70) +- 2.054 * sqrt( (61/80)*(1 - 61/80)/80 + (55/70)*(1 - 55/70)/70 )
3. The Branding Iron says the average salary for a UW graduate is higher than the salary of a CSU
graduate. Test at the 10% significance level if this is true given the data below.
Survey of UW graduates:
  Participants: 32 Cowboys
  Average Salary: $66,500
  Standard Deviation for each Cowboy: $10,000
Survey of CSU graduates:
  Participants: 32 Rams
  Average Salary: $59,500
  Standard Deviation for each Ram: $8,000
Pooled standard deviation for each graduate: $9,000
Matched Pairs standard deviation for each graduate: $8,522
Average standard deviation for both graduates: $9,000
H0: µ1 ≤ µ2
HA: µ1 > µ2
α=.1
t31 = ((66500 - 59500) - 0) / sqrt(10000^2/32 + 8000^2/32) = 3.092
.001<p-value<.0025
Reject
Our data supports the claim that UW graduates earn more than CSU graduates
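The arithmetic for the test statistic can be checked with a short Python sketch. The function name `two_sample_t` is made up for this illustration, and it assumes the unpooled standard error that the key's formula uses.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """t statistic for a difference in means (unpooled standard error).

    Hypothetical helper for illustration; not part of the course materials.
    """
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - null_diff) / se

# Cowboys versus Rams, using the summary numbers from the problem
t_stat = two_sample_t(66500, 10000, 32, 59500, 8000, 32)
print(round(t_stat, 3))
```

The p-value range still has to come from the t-table, since this sketch only reproduces the statistic.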
4. Harry Potter believes that he can tell if a person is a bad guy by listening to the background music
when they come near. To find out if this is the case, Harry records what type of music he hears
around 114 random people. Then Harry performs the Crucius curse to determine if the person is a
good guy or bad guy. Based on the following data, determine if the proportion of good guys with
ominous music differs from the proportion of bad guys with ominous music.
                    Allegiance
Background Music    Good Guys   Bad Guys
Ominous Music       45          16
Happy Music         38          25
H0: πG = πB
HA: πG ≠ πB (claim)
α = .05
n1*p1 = 83 * (45/83) = 45
n1*(1 - p1) = 83 * (1 - 45/83) = 38
n2*p2 = 41 * (16/41) = 16
n2*(1 - p2) = 41 * (1 - 16/41) = 25
All are greater than 15
pooled p-hat = (45 + 16) / (83 + 41) = .4919
Z = (45/83 - 16/41) / sqrt( .4919*(1 - .4919)/83 + .4919*(1 - .4919)/41 ) = 1.59
Tail-probability = .0559 (double it because it is two-sided)
p-value = .1118
Fail to reject the null
Harry Potter can’t tell who the bad guys are by listening to the background music
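The steps above (count check, pooled proportion, z statistic, doubled tail) can be sketched in Python. The function name `two_prop_z_test` and the `min_count` parameter are assumptions made for this illustration; the check treats the rule as "at least 15 in every cell", which is one reading of the condition used in the key.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, min_count=15):
    """Pooled two-proportion z-test with a two-sided p-value.

    Hypothetical helper; min_count=15 mirrors the count check in the key.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]          # successes and failures in each group
    conditions_ok = all(c >= min_count for c in counts)
    p_pool = (x1 + x2) / (n1 + n2)               # pooled proportion under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z))) # double the tail for a two-sided test
    return conditions_ok, z, p_value

# Good guys: 45 of 83 with ominous music; bad guys: 16 of 41
ok, z, p_value = two_prop_z_test(45, 83, 16, 41)
print(ok, round(z, 2), round(p_value, 3))
```

The p-value differs slightly from the key's .1118 because the key rounds Z to 1.59 before looking it up in the table.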
5. The Statistics Department really wants to make sure the students fill out the on-line evaluations.
They realize students are often frustrated with tricky questions that cannot be done, but the
evaluations are so important they survey 28 students in Dr. Crawford’s class, and 19 students in
Michele’s class. The department wants to know if certain teachers are more effective at getting
their students to complete the evaluations. The results are shown below:
                                          Class
Did you fill out the online evaluation?   Crawford   Michele
Yes                                       16         8
No                                        12         11
Find a 95% confidence interval for the difference in proportions of students in each class who filled out
the on-line class evaluation.
This question cannot be done: the sample sizes are too small (three of the four counts, 12, 8, and 11, are below 15).
6. Santa Claus suspects that calculators are the perfect Christmas gift. To test this he gives a
calculator to 16 college students, but gives an X-box to 16 other students. Then Santa watched
their average GPR for the year. Use α=0.25. Assume grades are normally distributed with equal
variance for both groups. Do a hypothesis test (all the steps) on whether a calculator increases GPR
more than an X-box does.
Students which got a calculator
  Number: 16
  Mean GPR for the year: 3.2175
  Standard deviation (per GPR): 0.8
Students which got an X-box
  Number: 16
  Mean GPR for the year: 2.92
  Standard deviation (per GPR): 1.5
The pooled standard deviation is 1.15
The matched pairs standard deviation is 1.07
The data is normally distributed. It is not matched pairs, but it does say equal variances, so this is a
pooled two sample means test.
1) H0: μCalculators ≤ μXbox
2) HA: μCalculators > μXbox
3) α=0.25
4) t(n-1) = ((x̄1 - x̄2) - (μ1 - μ2)) / sqrt(s^2/n1 + s^2/n2)
   t15 = ((3.2175 - 2.92) - 0) / sqrt(1.2^2/16 + 1.2^2/16) = 0.7
5) 0.20 < P-value < 0.25 because on the T-table 0.691 < 0.7 < 0.866
6) P-value < 0.25 = α means Reject the null
7) There is sufficient evidence to show that the average GPR for students who received a
calculator is higher than the average GPR of students who received an X-box.
If students express a concern that this will not really answer Santa’s question of whether a calculator is a
good present, or suspect Santa chose α=0.25 just so that he could get the results he wanted, tell them
Santa is fictitious.
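For the pooled test in problem 6, a small Python sketch shows where the 1.2 in the denominator comes from. The function names `pooled_sd` and `pooled_t` are hypothetical. The sketch uses the standard pooled-variance formula, and with the two group standard deviations it gives about 1.2, which is the value the key plugs in rather than the 1.15 listed in the problem.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation from two sample SDs (hypothetical helper)."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def pooled_t(mean1, mean2, sp, n1, n2, null_diff=0.0):
    """t statistic for a difference in means using a pooled SD."""
    return ((mean1 - mean2) - null_diff) / (sp * math.sqrt(1 / n1 + 1 / n2))

sp = pooled_sd(0.8, 16, 1.5, 16)               # calculator group, X-box group
t_stat = pooled_t(3.2175, 2.92, sp, 16, 16)
print(round(sp, 3), round(t_stat, 2))
```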
7. Brent wonders why his Pop-Tarts are always burnt, when Mason’s seem to be just right. He thinks
it might be because his outlets are getting more voltage than Mason’s outlets. To test this, he gets
all the multimeters he can find. Then Brent takes the multimeters to each house and randomly
selects an electric outlet. The voltage levels are shown below. Assume they are normally distributed.
Burn level     Brent   Mason
Multimeter 1   12      15
Multimeter 2   15      15
Multimeter 3   17      18
Multimeter 4   13      11
Multimeter 5   11      16
Multimeter 6   18      22
Multimeter 7   15      14
With α=.10 test whether the voltage is higher at Brent’s house than at Mason’s house.
This is a matched pairs t-test because the numbers correspond to each multimeter (they cannot be
switched around). Therefore we can simplify the data by simply subtracting the values:
               Difference
Multimeter 1   -3
Multimeter 2   0
Multimeter 3   -1
Multimeter 4   2
Multimeter 5   -5
Multimeter 6   -4
Multimeter 7   1
x̄D = (-3 + 0 - 1 + 2 - 5 - 4 + 1) / 7 = -1.429
sD = sqrt( ((-3 + 1.429)^2 + (0 + 1.429)^2 + (-1 + 1.429)^2 + (2 + 1.429)^2 + (-5 + 1.429)^2 + (-4 + 1.429)^2 + (1 + 1.429)^2) / (7 - 1) ) = 2.64
H0: μd ≤ 0
HA: μd > 0 (claim. Brent thinks subtracting will give him a positive number)
α=.10
The voltage levels are assumed normal, so we can use the t-test.
t6 = (x̄D - μD) / (sD / sqrt(n)) = (-1.429 - 0) / (2.64 / sqrt(7)) = -1.432
This is a greater than test, so we want the probability to the right of the test statistic. The t-table tells
the lower tail probability, so we want one minus that probability.
The tail probability is between 0.1 and 0.15.
.85 < p-value < .9.
Fail to Reject
We cannot say Brent’s outlets get more voltage than Mason’s.
Looking at the picture this makes sense: we are nowhere near the rejection region. We cannot reject
the null hypothesis, so we cannot conclude that Brent’s outlets carry more voltage than Mason’s (we should investigate
whether his outlets get LESS voltage than Mason’s).
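The matched pairs arithmetic for problem 7 can be reproduced with Python's standard library. This is only a sketch: the variable names are made up here, and the p-value range still comes from the t-table with 6 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Readings from the problem, one pair per multimeter
brent = [12, 15, 17, 13, 11, 18, 15]
mason = [15, 15, 18, 11, 16, 22, 14]

# Brent minus Mason, so a positive mean would support Brent's claim
diffs = [b - m for b, m in zip(brent, mason)]
d_bar = mean(diffs)
s_d = stdev(diffs)                      # sample SD, divides by n - 1
t_stat = (d_bar - 0) / (s_d / math.sqrt(len(diffs)))

print(diffs, round(d_bar, 3), round(s_d, 2), round(t_stat, 2))
```

The statistic comes out negative, which is why the upper-tail p-value lands above 0.5.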
8. A study asked 800 married men if they would be willing to carry their wife’s purse through the
store. 720 of them said they were willing. Find a 90% confidence interval for the proportion of
men who said they would carry the purse.
720/800 +- 1.645 sqrt(720/800*(1-720/800)/800) = (0.883, 0.917)
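A minimal Python sketch of the same one-proportion interval, with the critical value looked up from the normal distribution instead of a table. The function name `one_prop_ci` is an assumption for illustration.

```python
import math
from statistics import NormalDist

def one_prop_ci(successes, n, confidence):
    """Large-sample z-interval for one proportion (hypothetical helper)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.645 for 90%
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = one_prop_ci(720, 800, 0.90)
print(round(low, 3), round(high, 3))
```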
9. The average interest rate for 10 year CD is an interesting measure for how the economy is doing. A
study group examined 42 banks and found an average 10 year CD rate of 0.03 with a standard
deviation of 0.001. Calculate the 99% confidence interval for the true average 10 year CD rate
across the nation.
0.03 +- 2.704*0.001/sqrt(42) = (0.02958, 0.03042)
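The interval for a mean follows the same pattern. In this sketch the critical value is passed in by hand, because the Python standard library has no t-distribution; 2.704 is the table value the key uses. The function name `mean_ci` is hypothetical.

```python
import math

def mean_ci(xbar, sd, n, t_star):
    """t-interval for a mean, given a critical value read from a t-table."""
    margin = t_star * sd / math.sqrt(n)
    return xbar - margin, xbar + margin

# 42 banks, mean rate 0.03, SD 0.001, t* = 2.704 as in the key
low, high = mean_ci(0.03, 0.001, 42, t_star=2.704)
print(round(low, 5), round(high, 5))
```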
10. Wheel thickness on a train is random because wear and tear causes the wheel to wear down. The
distribution of wheel thickness is known to have a standard deviation of 0.14 inches. The average
wheel thickness tells you how old the train is. A random sample of 6 wheels on the Laramie
Express Train found an average wheel thickness of 3.22 inches. Find a 99.9% confidence interval
for the true average wheel thickness on the Laramie Express.
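Cannot do: the sample has only 6 wheels and the problem does not say wheel thickness is normally distributed.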
11. A study on children diagnosed with Deliops disease examined whether taking garlic pills would
prolong the life of the children. The children who did not take garlic lived an average of 4 years.
Children who took the garlic pills lived an average of 21 years. The confidence interval for the
difference in average for the non-garlic pills versus garlic pills was (-43, 9) years.
What would you conclude about the effectiveness of garlic pills?
We cannot show the pills are effective, since 0 is in the interval
12. We would like to compare married and unmarried people with regard to their answer to the
question “Do you think that marriage as an institution is becoming obsolete?” We sampled 150
married adults, and 175 unmarried adults. Of the married adults, 31% answered ‘yes’ to this
question, and of the unmarried adults, 46% answered ‘yes’. Provide a 99% confidence interval to
estimate the true difference in the married and unmarried proportions who would answer ‘yes’ to
this question.
(.31-.46)+-2.576 * sqrt(.31*(1-.31)/150 + .46*(1-.46)/175) = (-.2874, -0.0126)
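For the difference of two proportions, the interval uses the unpooled standard error. A short Python sketch, where the function name `two_prop_ci` is made up for this example:

```python
import math
from statistics import NormalDist

def two_prop_ci(p1, n1, p2, n2, confidence):
    """Large-sample z-interval for p1 - p2 (unpooled standard error)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 2.576 for 99%
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Married: 31% of 150 said yes; unmarried: 46% of 175 said yes
low, high = two_prop_ci(0.31, 150, 0.46, 175, 0.99)
print(round(low, 4), round(high, 4))
```

Both endpoints are negative, so the interval excludes 0 at the 99% level.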
13. We also polled single parents who had never been married and cohabiting parents. Out of the 90
parents who had never been married, 52 answered ‘yes’, and out of the 120 cohabiting parents, 74
answered ‘yes.’ Test the claim that the proportion of single (never married) parents is different
from the proportion of cohabiting parents.
H0: p1 = p2
Ha: p1 ≠ p2
Alpha = 0.05
Pbar = (52+74)/(90+120) = 0.6
Z = (52/90 - 74/120)/sqrt(.6*(1-.6)/90 + .6*(1-.6)/120) = -0.57
p-value = 0.2843*2 = 0.5686
Fail to Reject
We cannot say the single (never married) proportion is different from the cohabiting proportion
14. In 2011 a sample of 345 parents included 162 who said children should be financially independent
from their parents by the age of 22.
a. Provide a 90% confidence interval for the true proportion of all parents in 2011 who believe
children should be financially independent by age 22.
162/345 +- 1.645*sqrt((162/345)*(1-162/345)/345) = (0.4253, 0.5138)
b. In 1993 the proportion was 80%. Does your confidence interval indicate that the proportion
who would say this now is different from who said this in 1993?
Yes, 80% is not in the confidence interval
15. A sample of 808 young adults found that 22% say they will wait to have a baby because of the
economy. A sample of 100 elderly people found that 86% suggested young people should wait to
have a baby because of the economy. Test the claim that the proportion of young adults is
different from the proportion of elderly people who think waiting to have a baby is a good idea.
Cannot do.
16. A study investigated the link between speed of an automobile at the time of a crash, and the cost of
the insurance claim. Below is the residual plot and the output from Excel.
a) Are there any assumptions for regression that you feel ought to be examined?
It looks to me like linearity, but I can imagine how some students might worry about
independence
b) Test at the 5% significance level whether increased speed causes higher claim cost.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.153390483
R Square            0.02352864
Adjusted R Square   0.018596967
Standard Error      0.403097257
Observations        200

ANOVA
             df    SS            MS            F             Significance F
Regression   1     0.775215051   0.775215051   4.770924141   0.030119026
Residual     198   32.17250485   0.162487398
Total        199   32.9477199

               Coefficients   Standard Error   t Stat        P-value       Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0%
Intercept      0.691422089    0.103593709      6.67436369    2.43732E-10   0.487133484   0.895710693   0.487133484   0.895710693
X Variable 1   0.003633867    0.001663672      2.184244524   0.030119026   0.000353076   0.006914658   0.000353076   0.006914658
H0: b1 <= 0
HA: b1 > 0
Alpha = 0.05
T198=2.184244
p-value = 0.030119/2 = 0.0151 (This is tricky: the p-value in the table is always a two-tailed p-value, so it
was doubled, and that wasn’t what we wanted. You’ve got to un-double it.)
Reject
Our data does show that as the speed increases the claim cost increases
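The un-doubling step can be sketched in Python. The function name `one_sided_p` is hypothetical. It assumes the reported p-value is two-tailed, as in the Excel output, and that halving is only valid when the t statistic points in the direction of the alternative.

```python
def one_sided_p(t_stat, two_sided_p, alternative="greater"):
    """Convert a two-tailed p-value into a one-tailed one (hypothetical helper)."""
    half = two_sided_p / 2
    if alternative == "greater":
        # Half the two-tailed value if t is on the HA side, otherwise the complement
        return half if t_stat > 0 else 1 - half
    if alternative == "less":
        return half if t_stat < 0 else 1 - half
    raise ValueError("alternative must be 'greater' or 'less'")

# Slope row of the Excel output: coefficient and standard error
slope, se = 0.003633867, 0.001663672
t_stat = slope / se
p_one = one_sided_p(t_stat, 0.030119026, alternative="greater")
print(round(t_stat, 3), round(p_one, 4))
```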