Homework 11 Key

Stat 4220 homework – Due April 24
1) The casino boss has heard a rumor that the dice in his casino are rigged. (For those not familiar
with dice, there are 6 possible outcomes, and each outcome is supposed to be equally likely).
He hired a graduate student (lackey) to take a die and roll it 1002 times. He recorded that the
die rolled a one 150 times, rolled a two 155 times, rolled a three 164 times, rolled a four 180
times, rolled a five 150 times, and rolled a six 203 times. Test whether the die is rigged.
H0: The dice are not rigged (all proportions are 1/6)
Ha: The dice are rigged (not all proportions are 1/6)
α=0.05
Face          1      2      3      4      5      6
Expected      167    167    167    167    167    167
Observed      150    155    164    180    150    203
(O-E)^2/E     1.73   .86    .054   1.012  1.73   7.76
Chi-squared = 13.15
.02<p-value<.025
Reject
Our data show the die is rigged
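The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not part of the original key: the variable names are illustrative, and the critical value 11.07 (chi-square table, 5 degrees of freedom, α = 0.05) is looked up rather than computed.

```python
# Observed counts for faces 1..6 from the problem statement.
observed = [150, 155, 164, 180, 150, 203]

n = sum(observed)                 # 1002 rolls
expected = n / len(observed)      # 167 per face under H0

# Each face contributes (O - E)^2 / E to the statistic.
contributions = [(o - expected) ** 2 / expected for o in observed]
chi_sq = sum(contributions)

# Assumed table value: chi-square critical value for df = 6 - 1 = 5, alpha = 0.05.
CRITICAL_5DF_05 = 11.07
reject = chi_sq > CRITICAL_5DF_05

print(round(chi_sq, 2), reject)
```

Comparing against the table's critical value gives the same decision as bracketing the p-value between .02 and .025.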
2) The same group of students also indicated on the survey what year in school they were. Since
statistics is a sophomore level class, we might expect to see more sophomores than any other
group. Use a chi-square goodness-of-fit test to determine if the different years in school are
equally distributed.
Year         Count
Freshman     50
Sophomore    65
Junior       37
Senior       20
H0: The distribution across the classes is even
Ha: The distribution across the classes is uneven
α=0.05
Year         Expected    (O-E)^2/E
Freshman     43          1.14
Sophomore    43          11.26
Junior       43          0.84
Senior       43          12.30
Χ2=25.54
p-value < 0.0001
Reject
The distribution of classes is not even
3) For many years TV executives used the guideline that 30% of the viewing audience were
watching each of the traditional big three prime-time networks and 10% were watching cable
stations on a weekday night. A random sample of 500 viewers in the Tampa-St. Petersburg,
Florida, area last Monday night showed that 165 homes were tuned in to the ABC affiliate, 140
to the CBS affiliate, 125 to the NBC affiliate, and the remainder were viewing a cable station. At
the .05 significance level, can we conclude that the guideline is still reasonable?
H0: The distribution given is correct
HA: The distribution has changed
Alpha=0.05
Observed: 165,140,125,70
Expected: 150,150,150,50
Chi-square contributions: 1.5, .67, 4.17, 8
Chi-squared = 14.33
p-value = 0.0025
Reject
The distribution has changed
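When the null proportions are unequal, the only change is that each category gets its own expected count. The sketch below (an illustration, not the key's own method) recomputes the statistic and gets the p-value from the closed-form upper tail of a chi-square with exactly 3 degrees of freedom. That helper is hand-coded from a standard identity and is not valid for other degrees of freedom.

```python
import math

observed = [165, 140, 125, 70]          # ABC, CBS, NBC, cable
guideline = [0.30, 0.30, 0.30, 0.10]    # proportions claimed under H0

n = sum(observed)
expected = [n * p for p in guideline]   # 150, 150, 150, 50

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_sq_upper_tail_3df(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi_sq_upper_tail_3df(chi_sq)
print(round(chi_sq, 2), round(p_value, 4))
```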
4) A 98% confidence interval for the difference in the average length of a movie between “action
flicks” and “chick flicks” was (22.43, 25.61) minutes. Which of the following statements is true?
( ) 98% of “action flicks” and “chick flicks” last between 22.43 minutes and 25.61 minutes
( ) The probability the next “action flick” or “chick flick” lasts between 22.43 and 25.61 minutes is 98%
( ) We are 98% confident the average time for “chick flicks” and “action flicks” is between 22.43 and 25.61
( ) The evidence does not support the claim that “action flicks” have a different average than “chick flicks”
(X) “Action flicks” are 22.43 to 25.61 minutes longer than “chick flicks” on average with 98% confidence
( ) Of all possible “action flicks” and “chick flicks” 98% have an average difference of 22.43 to 25.61
5) University of Michigan surveyed high school seniors nationwide who smoke and asked them
which brands of cigarettes they use. Is there a relationship between Race and Cigarette Brand?
http://www.monitoringthefuture.org/data/tables/cigbrands/table1.html
LD Johnston, PM O'Malley, JG Bachman, JE Schulenberg. (Apr. 1999). Cigarette brands smoked by American teens: One brand predominates; three account for nearly all of teen smoking.
University of Michigan News and Information Services: Ann Arbor, MI. [On-line]. Available: www.isr.umich.edu/src/mtf; accessed 04/15/2013
Observed counts:
Brand               Black    White    Hispanic
Marlboro            6        1276     90
Newport             87       138      36
Camel               0        198      5
All other Brands    13       205      25
H0: race is independent of cigarette preference
Ha: cigarette preference depends on race
Alpha:0.05
Expected counts:
Brand               Black       White       Hispanic
Marlboro            69.95286    1199.098    102.9495
Newport             13.30736    228.1082    19.58442
Camel               10.35017    177.4175    15.23232
All other Brands    12.38961    212.3766    18.23377
Chi-square contributions, (O-E)^2/E:
Brand               Black       White       Hispanic
Marlboro            58.46749    4.932019    1.628851
Newport             408.0904    35.59491    13.75948
Camel               10.35017    2.387808    6.87357
All other Brands    0.030072    0.256217    2.510832
Chisq: 544.88 (df = 6)
p-value ≈ 0 (less than 0.0001)
Reject
Cigarette brand preference does depend on race
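For a test of independence, each expected count is (row total × column total) / grand total. A short Python sketch of that bookkeeping, using the observed table from the problem (variable names are illustrative):

```python
# Rows: Marlboro, Newport, Camel, all other brands.
# Columns: Black, White, Hispanic.
observed = [
    [6, 1276, 90],
    [87, 138, 36],
    [0, 198, 5],
    [13, 205, 25],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Under independence, expected count = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (4 - 1) * (3 - 1) = 6

print(round(chi_sq, 2), df)
```

Most of the statistic comes from a single cell (Newport, Black), where the observed count of 87 is far above the expected count of about 13.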
6) A study investigated whether people think Labrador retrievers are cuter than Afghan Hounds.
They walked a Labrador past 100 people and 78 petted the dog. They walked an Afghan Hound
past 90 people and 61 petted the dog. Find a 96% confidence interval for the difference in
proportions of people who will pet a Labrador versus an Afghan Hound.
78/100 - 61/90 ± 2.054*sqrt(78/100*(1-78/100)/100 + 61/90*(1-61/90)/90) = (-0.0300, 0.2344)
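The same interval can be reproduced in Python with only the standard library. This is a sketch under the usual large-sample assumptions. It uses the unrounded critical value from `statistics.NormalDist`, so the endpoints can differ in the third or fourth decimal from a hand calculation that uses a rounded table value.

```python
import math
from statistics import NormalDist

x1, n1 = 78, 100   # Labrador: people who petted, people passed
x2, n2 = 61, 90    # Afghan Hound: people who petted, people passed
confidence = 0.96

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Critical value z* leaves (1 - confidence) / 2 in each tail, about 2.054.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Unpooled standard error, as in the hand calculation.
std_err = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

lower, upper = diff - z_star * std_err, diff + z_star * std_err
print(round(lower, 4), round(upper, 4))
```

Because the interval contains 0, these data do not rule out equal petting rates for the two breeds at the 96% level.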
7) George Bush Sr. mentions on T.V. that the average age of a student at UW is 23 years old. To
test his hypothesis, you ask 3 randomly chosen UW students what their ages are, and use α=.01.
Assume the ages of students at UW are normally distributed.
The ages were : 22 years old, 28 years old, and 24 years old.
Test whether George Bush was right.
n =3
x̄ = (22+28+24)/3 = 24.67
μ0 = 23
df = 3-1 = 2
sx = sqrt(((22 - 24.67)^2 + (28 - 24.67)^2 + (24 - 24.67)^2)/(3 - 1)) = 3.055
H0: μ = 23
HA: μ ≠ 23
α=.01
The ages are normally distributed, so we can use a t-test
t_df = (x̄ - μ0)/(sx/sqrt(n))
t_2 = (24.67 - 23)/(3.055/sqrt(3)) = .9468
The t-value of .947 is between .816 and 1.061 on the t-table for 2 degrees of
freedom, so the tail probability is between .20 and .25. This is a two-sided test, so
we need to double the probability to get the p-value.
.4 < p-value < .5
Since the p-value is greater than .4, which is far above α = .01, we cannot reject the null hypothesis.
So we have no evidence that Bush was wrong. The data are consistent with an average
age of about 23 years old for students at UW.
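A small Python sketch of the same t-test, using the standard library's `statistics` module. The two bracketing table values are copied from a t-table for 2 degrees of freedom (an assumption about which table is in use). With the unrounded mean the statistic comes out near .945 rather than .9468, which does not change the bracket.

```python
import math
from statistics import mean, stdev

ages = [22, 28, 24]
mu_0 = 23                      # hypothesized mean

n = len(ages)
x_bar = mean(ages)             # sample mean
s = stdev(ages)                # sample standard deviation (divides by n - 1)

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
df = n - 1

# One-tail t-table values for df = 2 that bracket the statistic
# (tail probabilities 0.25 and 0.20).
T_25, T_20 = 0.816, 1.061
brackets = T_25 < t_stat < T_20   # two-sided p-value is then between .4 and .5

print(round(x_bar, 2), round(s, 3), round(t_stat, 3), brackets)
```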
8) The Working Imitation Design Gadget Engineering Tool is manufactured by a machine that
sometimes has a flaw in the production. According to the machine specs the flaw distribution per
hour should be:
Flaws per hour    Specified share
0 flaws           70%
1 flaw            15%
2 flaws           8%
3 flaws           3%
4 flaws           2%
5 flaws           1%
To see whether the machine is performing according to specifications we randomly sample 200 hours
and get the following flaw distribution:
Flaws per hour    Hours observed
0 flaws           131
1 flaw            28
2 flaws           16
3 flaws           11
4 flaws           8
5 flaws           6
Can we say with 5% significance that the machine is not performing at specifications?
There is not enough data to do a goodness of fit test: the expected counts for 4 flaws (200 × 2% = 4) and 5 flaws (200 × 1% = 2) are less than 5
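The expected-count condition is easy to check before running any chi-square test. A minimal sketch, assuming the common rule of thumb that every expected count should be at least 5 (dictionary names are illustrative):

```python
# Keys are flaws per hour; values are the specified proportions.
spec = {0: 0.70, 1: 0.15, 2: 0.08, 3: 0.03, 4: 0.02, 5: 0.01}
observed = {0: 131, 1: 28, 2: 16, 3: 11, 4: 8, 5: 6}

hours = sum(observed.values())   # 200 sampled hours
expected = {k: hours * p for k, p in spec.items()}

# Rule of thumb (assumed here): every expected count should be at least 5.
too_small = {k: e for k, e in expected.items() if e < 5}
print(too_small)
```

Any category that shows up in `too_small` fails the condition. One common workaround, not used in this key, is to merge sparse categories before testing.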
9) Billy says that the average speed of a mule is faster than the average speed of a zebra. To test
this he rides 7 different zebras and 7 different mules. Assume each time he rides the animal the
exact same way. Test whether Billy is right.
Ride     First  Second  Third  Fourth  Fifth  Sixth  Seventh  MEAN   S
Zebra    31     22      40     28      35     37     34       32.43  6.02
Mule     42     37      28     39      31     48     33       36.85  6.87
H0: mu1 <= mu2 (mu1 = average mule speed, mu2 = average zebra speed)
Ha: mu1 > mu2
Alpha: 0.05
T = (36.85-32.43)/sqrt(6.02^2/7+6.87^2/7) = 1.28
p-value = .12
Fail to Reject
We cannot say the mules run faster than the zebras
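A sketch of the two-sample t statistic in Python, computed from the raw speeds. It puts mule minus zebra in the numerator so that a large positive value supports Billy's claim (the sign depends only on the order of subtraction). The conservative choice of degrees of freedom and the table value 1.943 are assumptions, not something the key states.

```python
import math
from statistics import mean, stdev

zebra = [31, 22, 40, 28, 35, 37, 34]
mule = [42, 37, 28, 39, 31, 48, 33]

# Unpooled standard error of the difference in sample means.
std_err = math.sqrt(stdev(zebra) ** 2 / len(zebra) + stdev(mule) ** 2 / len(mule))

# Billy's claim is that mules are faster, so put mule minus zebra on top.
t_stat = (mean(mule) - mean(zebra)) / std_err

# Conservative degrees of freedom (assumed): the smaller sample size minus 1.
df = min(len(zebra), len(mule)) - 1

# Assumed table value: one-tail t critical value for df = 6, alpha = 0.05.
T_CRIT = 1.943
reject = t_stat > T_CRIT
print(round(t_stat, 2), df, reject)
```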
10) In a class survey done in a statistics class, students were asked, “Suppose that you are buying a
new car and the model you are buying is available in three colors: silver, blue, or green. Which
color would you pick?” Of the 111 students who responded, 59 picked silver, 27 picked green,
and 25 picked blue. Is there sufficient evidence to conclude that the colors are not equally
preferred?
H0: colors equally preferred
Ha: colors not equally preferred
Alpha=0.05
O: 59,27,25
E: 37,37,37
X2: 13.08,2.7,3.89
X2=19.68
p-value < 0.0001
Reject
Colors not equally preferred
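With three categories there are 2 degrees of freedom, and in that special case the chi-square upper tail has the simple closed form exp(-x/2), so the p-value can be computed without a table. A short sketch (the shortcut applies only when df = 2):

```python
import math

observed = {"silver": 59, "green": 27, "blue": 25}

n = sum(observed.values())            # 111 responses
expected = n / len(observed)          # 37 per color under H0

chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())

# With 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
p_value = math.exp(-chi_sq / 2)
print(round(chi_sq, 2), p_value)
```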
11) Suppose that on a typical day, the proportion of students who drive to campus is .30 (30%), the
proportion of students who bike is .60 (60%), and the remaining .10 (10%) come to campus some other way (e.g.,
walk, take the bus, get a ride). The campus sponsors a “spare the air” day to encourage people
not to drive to campus on that day. They want to know whether the proportion using each
mode of transportation on that day differs from the norm. To test this hypothesis, a random
sample of 300 students that day was asked how they got to campus, with the following results:
Method of Transportation    Frequency
Drive                       80
Bike                        200
Other                       20
Total                       300
H0:New program did nothing
Ha: New program effected a change
Alpha=0.05
O:80,200,20
E:90,180,30
X2:1.11,2.22,3.33
X2:6.667
p-value 0.035
Reject
The program did effect change
12) Ashley is eating ice cream when she gets a brain freeze. Her thought is that it’s because she was eating
with her left hand. So she gets 100 bowls of cookie dough ice cream and asks 50 of her friends to
randomly choose either their right or their left hand. They eat as fast as they can until they get a brain
freeze. Then she asks them to switch hands with a new bowl of cookie dough ice cream and eat until
they get a brain freeze. Here is her data:
Left hand:
50 bowls
Average time: 48 seconds
Standard Deviation: 27 seconds
Right hand:
50 bowls
Average time: 37 seconds
Standard Deviation: 24 seconds
Pooled Standard deviation: 25.5 seconds
Matched Pairs deviation: 2.5 seconds
Difference in deviations: 3 seconds
Make a 99% Confidence interval for the difference in times for each hand
It doesn’t say it’s normal, but n=50, so the mean is normal. This is matched pairs because each right hand is paired
to one of the left hands by being attached to the same body.
t score= 2.704
(48-37) ± 2.704 * 2.5 / sqrt(50)
(10.04, 11.96)
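A quick numerical check of the matched-pairs interval, as a Python sketch. The value 2.704 is taken from the key as given. It matches the df = 40 row of a standard t-table, which is presumably the closest row at or below df = 49, but that reading is an assumption. The margin is about 0.956, so the upper endpoint rounds to 11.96.

```python
import math

n = 50                 # friends, each measured with both hands
mean_left, mean_right = 48, 37
sd_diff = 2.5          # standard deviation of the paired differences
t_star = 2.704         # t-table value used in the key for 99% confidence

diff = mean_left - mean_right
margin = t_star * sd_diff / math.sqrt(n)

lower, upper = diff - margin, diff + margin
print(round(lower, 2), round(upper, 2))
```

Only the matched-pairs deviation enters the calculation. The two individual standard deviations and the pooled value are distractors for this design.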
13) The MagBlast company demolishes buildings by setting four charges on each corner of the
building. The charges are supposed to detonate at the same time, but sometimes something
goes wrong and not all of them ignite on time. If you had 500 buildings the distribution for the
expected number of charges that would go off on time is given below:
All five detonate      406
Only four              35
Only three             27
Only two               15
Just one detonates     10
No charges detonate    7
You have been asked to investigate what would happen to the charges if a building was demolished when
it is raining. To find out you randomly select 500 buildings around the United States and demolish them
when it is raining. The data you observed are given below:
All five detonate      417
Only four              25
Only three             31
Only two               17
Just one detonates     10
No charges detonate    0
Test whether your data supports the hypothesis that rain affects the detonation of the charges
(assuming homeland security does not catch you).
H0: The rain does not affect the detonation (proportions remain the same)
Ha: The rain does affect the detonation (at least one proportion is different)
Alpha = 0.05
Chi-square contributions, (O-E)^2/E:
All five detonate      .298
Only four              2.857
Only three             .5926
Only two               .2667
Just one detonates     0
No charges detonate    7
Chisq=11.01
.05<p-value<.06
Fail to Reject
Our data does not show that rain affects the detonation
14) Captain Buckwheat uses a sextant to measure the height of a ship when he spots it on the
horizon. After attacking the ship he calculates the gold looted. His goal is to be able to predict
the amount of gold based on the ship height. Below is the data and regression from the 500
ships he has attacked during his career.
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  9.8958    0.1879      52.66    <2e-16 ***
Height       6.5537    0.3154      20.78    <2e-16 ***
Based on the output above, are there any assumptions that you feel should be investigated?
Perhaps there is clumping, which might indicate a nonrandom sample
Based on the output above find a 90% confidence interval for the slope of the regression line.
6.5537 ± 1.65*0.3154 = (6.03, 7.07)
15) Every day I see the elevator says it’s been “inspected.” Somehow I feel dubious. I think the
elevator gets inspected less than 70% of the time. I plan on doing a hypothesis test by
investigating 100 elevators and using α=0.10. If the true percentage was actually only 65%, how
powerful would my test be?
Z=-1.28, critical value = 0.70 - 1.28*sqrt(0.70*0.30/100) = 0.64127
Power = 0.427 is the best answer, but if they use the null proportion when computing the standard
error of the alternative distribution (which I think is a reasonable error, easy to make) they’ll get .424
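The power calculation has two steps: find the rejection cutoff under the null proportion, then find the chance of falling below that cutoff when the true proportion is 0.65. A Python sketch using the normal approximation and the standard library's `NormalDist` (variable names are illustrative):

```python
import math
from statistics import NormalDist

p0, p_true = 0.70, 0.65   # null proportion and the assumed true proportion
n, alpha = 100, 0.10

std_normal = NormalDist()
z_alpha = std_normal.inv_cdf(alpha)          # about -1.28 (lower-tail test)

# Reject H0 when the sample proportion falls below this cutoff.
cutoff = p0 + z_alpha * math.sqrt(p0 * (1 - p0) / n)

# Power: probability of landing below the cutoff when p is really 0.65.
se_true = math.sqrt(p_true * (1 - p_true) / n)
power = std_normal.cdf((cutoff - p_true) / se_true)
print(round(cutoff, 5), round(power, 3))
```

Replacing `se_true` with the null standard error reproduces the slightly different answer of about .424.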
16) Randomly selected deaths of motorcycle riders are summarized in the table below. Use a .05
significance level to test the claim that such fatalities occur with equal frequency in the different
months.
Month     Jan   Feb   Mar   Apr   May   June  July  Aug   Sept  Oct   Nov   Dec   Tot
Observed  6     8     10    16    22    28    24    28    26    14    10    8     200
Expected  16.7  16.7  16.7  16.7  16.7  16.7  16.7  16.7  16.7  16.7  16.7  16.7  200
X2        6.86  4.53  2.7   0.03  1.68  7.64  3.19  7.64  5.18  0.44  2.69  4.53  47.1
H0: Even across the months
Ha: Not even
Alpha:0.05
X2:47.1
p-value ≈ 0 (less than 0.0001)
Reject
Fatalities are not even across the months