TOPIC 15
Central Limit Theorem
In-Class Activities
Activity 15-1: Smoking Rates
15-1, 15-2, 15-9
a.
π
b.
No, the sample result will not equal .209 exactly in general, because of sampling variability.
c.
The CLT predicts the sampling distribution of p̂ will be approximately normal, centered at .209,
with standard deviation equal to sqrt((.209)(.791)/100) ≈ .04066.
d.
Need to draw and shade graph. INCLUDE GRAPH HERE
e.
z = (.25 - .209) / .04066 = 1.01
f.
P(Z > 1.01) = .1562 (Table II)
g.
When the sample size increases to 400, the standard deviation of the sampling distribution will
decrease to .0203, which means there will be fewer sample proportions as far from the center of
.209, so it will be less likely that we will have a sample proportion greater than .25.
[Figure smoking.pdf: Sample Smoking Percentages (n=400); x-axis: Sample Proportion of Smokers (n=400)]
Applet value for part f: .1566
h.
z = (.25-.209) / .0203 = 2.02
P(Z > 2.02) = .0217
Yes – this probability has decreased as predicted.
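The comparison in parts f and h can be checked numerically. The following is a minimal sketch, assuming Python's standard-library NormalDist as a stand-in for Table II and the applet; the function name upper_tail is illustrative, not from the text. Because it does not round the z-score, its results agree with the applet values more closely than with the table values.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail(pi, n, cutoff):
    """P(sample proportion > cutoff) under the CLT normal approximation."""
    sd = sqrt(pi * (1 - pi) / n)  # SD of the sampling distribution of p-hat
    return 1 - NormalDist(mu=pi, sigma=sd).cdf(cutoff)


p_100 = upper_tail(0.209, 100, 0.25)  # part f, n = 100
p_400 = upper_tail(0.209, 400, 0.25)  # part h, n = 400
print(f"n=100: {p_100:.4f}")
print(f"n=400: {p_400:.4f}")
```

Quadrupling the sample size halves the standard deviation, which is why the tail probability beyond .25 drops so sharply.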
i.
No – the size of the population of the U.S. did not enter into the calculations.
j.
The previous calculations would not change in any way.
Rossman/Chance, Workshop Statistics, 3/e
Solutions, Unit 3, Topic 15
Activity 15-2: Smoking Rates
15-1, 15-2, 15-9
a.
N(.105, .0307)
[Figure utahsmokers.pdf: Sample Smoking Proportions - Utah (n = 100); x-axis: Sample Proportion of Smokers (n=100)]
b.
z = (.25-.105)/.0307 = 4.72, P(Z > 4.72) ≈ 0.000
c.
Yes. You would have strong reason to doubt that this state was Utah, because the probability of
finding a random sample of 100 people from Utah with 25 or more smokers is essentially zero; this
essentially never happens by chance alone. So if you find 25 smokers in a random sample of 100
people, you have very strong evidence that the sample is from some other state where the
proportion of smokers is greater than 10.5%.
Activity 15-3: Body Temperatures
12-1, 12-19, 15-3, 15-18, 15-19, 19-3, 19-7, 20-11, 22-10, 23-3
a.
These numbers are parameters: μ = 98.6°F, σ = 0.7°F.
b.
Yes – our sample size is greater than 30 and we have a simple random sample, so the CLT applies.
c.
The CLT says the sampling distribution of the sample means will be approximately normal with
mean 98.6°F and standard deviation = .7/sqrt(130) ≈ .061.
[Figure bodytempsnormal.pdf: Average Body Temperatures (degrees Fahrenheit)]
d.
P(98.5 ≤ X̄ ≤ 98.7) = P(-1.64 < Z < 1.64) = .9495 - .0505 = .8990 (Table II); .8989 (applet)
e.
P(98.2 ≤ X̄ ≤ 98.4) = P(-1.64 < Z < 1.64) = .9495 - .0505 = .8990 (Table II); .8989 (applet)
f.
These answers are the same – we have simply shifted the center of the plot, but the area within ±
0.1 degrees of the center has not changed.
g.
P(-0.1/.061 < Z < 0.1/.061) = P(-1.64 < Z < 1.64) = .9495 - .0505 = .8990 (Table II); .8989 (applet)
So there is about a 90% chance that a random sample of 130 will result in a sample mean body
temperature that is within ±.1 degrees of the actual population mean μ, if we assume the
population standard deviation is σ = 0.7°F.
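As a rough check on part g, the calculation can be sketched in Python, assuming the standard-library NormalDist in place of Table II. The variable names are illustrative. The sketch keeps the unrounded standard deviation (.0614 rather than .061), so its answer comes out slightly below the tabled .8990.

```python
from math import sqrt
from statistics import NormalDist

sigma = 0.7    # population SD in degrees F, as assumed in the solution
n = 130        # sample size
margin = 0.1   # half-width of the interval around mu

sd_mean = sigma / sqrt(n)  # SD of the sampling distribution of the sample mean
z = margin / sd_mean       # z-score of the interval endpoint
prob_within = NormalDist().cdf(z) - NormalDist().cdf(-z)
print(round(sd_mean, 4), round(prob_within, 4))
```

Note that μ never appears in the computation, which is the point of part f: the probability depends only on the margin, σ, and n.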
Activity 15-4: Solitaire
11-22, 11-23, 15-4, 15-14, 21-20, 27-18
a.
CLT says the sampling distribution will be approximately N(.1111,.0994).
z = (.1 - .1111) / .0994= -.11
P(Z < -.11) = .4562 (Table II); .4562 (applet)
b.
P(X≤1) = .308 + .385 = .693
c.
No – the probabilities in a and b are not close.
d.
The technical conditions for the CLT are not satisfied: nπ = 10×(1/9) = 1.111 < 10 and n(1-π) =
10×(8/9) = 8.888 < 10.
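The mismatch between parts a and b can be reproduced with a short sketch. It assumes the ten games are independent with win probability 1/9, so that the number of wins is binomial; that assumption reproduces the .693 in part b but is not stated in this solution. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, pi = 10, 1 / 9

# Technical conditions: both expected counts should be at least 10.
conditions_met = n * pi >= 10 and n * (1 - pi) >= 10

# Normal (CLT) approximation to P(p-hat <= .1), as in part a.
sd = sqrt(pi * (1 - pi) / n)
normal_approx = NormalDist(mu=pi, sigma=sd).cdf(0.1)

# Exact binomial P(X <= 1), assuming independent games (an assumption).
exact = sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(2))

print(conditions_met, round(normal_approx, 3), round(exact, 3))
```

The normal approximation gives roughly .46 while the binomial calculation gives about .69, which illustrates why the conditions matter.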
Activity 15-5: Capsized Tour Boat
First, weight is a quantitative variable, so the relevant statistic is the sample mean weight of the 47
passengers. Because the question is phrased in terms of the total weight in a sample of 47 adults, you
must rephrase it in terms of the sample mean weight. If total weight exceeds 7500 pounds, then the
sample mean weight must exceed 7500/47 or 159.574 pounds. So, you want to find the probability that x̄
> 159.574 (with n = 47 and σ = 35).
The CLT applies because the sample size (n = 47) is fairly large, greater than 30. The sampling
distribution of x̄ is, therefore, approximately normal with mean 167 pounds and standard deviation equal
to σ/sqrt(n) = 35/sqrt(47) = 5.105 pounds. A sketch of this sampling distribution is shown here:
[Figure WS3_CSE_3_15_42: sketch of the sampling distribution of the sample mean weight]
Now you can use the Normal Probability Calculator applet or the Standard Normal
Probability Table to find the probability of interest. The z-score corresponding to a sample mean weight
of 159.574 pounds is (159.574 – 167)/5.105 = –1.45. The probability of the weight being less than
159.574 pounds is found from the table to be .0736, so the probability of exceeding this weight is 1 –
.0736 = .9264. It’s not surprising the boat capsized with 47 passengers!
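The steps above (convert the total to a mean, find the standard deviation of the sample mean, then find the tail area) can be sketched as follows. This is a minimal illustration using Python's standard-library NormalDist rather than the applet or table, with variable names chosen for the example. It does not round the z-score, so the result differs from .9264 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 167, 35, 47   # population mean, population SD, passengers
limit_total = 7500           # total weight limit in pounds

threshold_mean = limit_total / n   # total > 7500 means sample mean > 7500/47
sd_mean = sigma / sqrt(n)          # SD of the sample mean
prob_exceed = 1 - NormalDist(mu=mu, sigma=sd_mean).cdf(threshold_mean)
print(round(threshold_mean, 3), round(sd_mean, 3), round(prob_exceed, 4))
```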
Homework Activities
Activity 15-6: Means or Proportions
a.
sample mean
b.
sample proportion
c.
sample mean
d.
sample mean
e.
sample proportion
Activity 15-7: Candy Colors
1-13, 2-19, 13-1, 13-2, 15-7, 15-8, 16-4, 16-22, 24-15, 24-16
a.
The CLT says the sample proportions will be approximately normally distributed with mean = .45
and standard deviation = sqrt((.45)(.55)/75) = .0568.
b.
[Figure reeseshaded.pdf: Proportion of Orange Reese's Pieces (n = 75)]
Student guesses.
c.
P(p̂ ≤ .4) = P(Z < -.88) = .1894
d.
P(.35 ≤ p̂ ≤ .55) = P(-1.76 < Z < 1.76) = .9608 - .0392 = .9216 (Table II); .9217 (applet)
e.
Yes. These probabilities are virtually identical. The simulated probability from the
applet was 92% and the normal model predicts a probability of about 92.2%.
Activity 15-8: Candy Colors
1-13, 2-19, 13-1, 13-2, 15-7, 15-8, 16-4, 16-22, 24-15, 25-16
a.
The CLT says the sample proportions will be approximately normally distributed with mean = .45
and standard deviation = sqrt((.45)(.55)/175) = .0372. The only change from when the sample size was
75 is the standard deviation, which is smaller now.
b.
[Figure reeseshaded1.pdf: Proportion of Orange Reese's Pieces (n = 175)]
Student guesses.
c.
P(p̂ ≤ .4) = P(Z < -1.34) = .0901 (Table II); .0895 (applet)
d.
This probability is smaller than when the sample size is 75. This makes sense because the
standard deviation (spread) has decreased, so fewer sample proportions fall as far from
the center of .45.
e.
P(.35 ≤ p̂ ≤ .55) = P(-2.69 < Z < 2.69) = .9964 - .0036 = .9928
f.
This probability is larger than when the sample size is 75. This makes sense because the standard
deviation has decreased, which concentrates more of the area under the curve near .45 (the
mean).
Activity 15-9: Smoking Rates
15-1, 15-2, 15-9
a.
The CLT predicts this distribution will be approximately N(.276, .02235).
[Figure kentucky.pdf: Sample Proportion of Smokers in Kentucky (n = 400)]
b.
P(p̂ ≤ .251) = P(Z < (.251 - .276)/.02235) = P(Z < -1.12) = .1314. Since the normal distribution is
symmetric, .301 will have a z-score of +1.12 and the area to the right of z = 1.12 will also be
.1314. Thus we can double this probability to find that the probability of obtaining a sample
proportion of Kentucky smokers more than .025 away from .276 is 2×.1314 or .2628.
c.
P(p̂ ≤ .226) = P(Z < (.226 - .276)/.02235) = P(Z < -2.24) = .0125. Thus the probability of obtaining a
sample proportion of Kentucky smokers more than .05 away from .276 is 2×.0125 or .025.
d.
You would have no reason to doubt that the state is Kentucky because part (b) shows that if the
state is Kentucky, you have more than a 26% chance of finding a sample result at least as extreme
as 25% smokers.
e.
Now you would have reason to doubt that the state is Kentucky because part (c) shows that if the
state is Kentucky, you have less than a 2.5% chance of finding a sample result at least as extreme
as 22.5% smokers.
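The two-sided probabilities behind parts d and e can be sketched in a few lines, assuming Python's standard-library NormalDist in place of Table II. The helper name two_sided is illustrative. The symmetry argument from part b appears as the factor of 2.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.276, 400
sd = sqrt(pi * (1 - pi) / n)  # about .02235


def two_sided(distance):
    """P(|p-hat - pi| > distance) under the normal approximation."""
    lower_tail = NormalDist(mu=pi, sigma=sd).cdf(pi - distance)
    return 2 * lower_tail  # the upper tail is equal by symmetry


p_b = two_sided(0.025)  # at least as extreme as .251
p_c = two_sided(0.05)   # at least as extreme as .226
print(round(p_b, 4), round(p_c, 4))
```

A result that occurs about a quarter of the time is unremarkable, while one that occurs about 2.5% of the time is rare enough to cast doubt on the assumed proportion.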
Activity 15-10: Candy Bar Weights
12-10, 14-10, 15-10
a.
P(2.18 < X < 2.22) = P((2.18 - 2.20)/0.04 < Z < (2.22 - 2.20)/0.04) = P(-.50 < Z < .50) = .6915 - .3085 =
.3830 (Table II); .3829 (applet)
b.
Yes. The CLT applies because the population has a normal distribution, as long as your sampling
method behaves like a simple random sample.
c.
The CLT says the sample means will be normally distributed with mean 2.20 ounces and standard
deviation = .04/sqrt(5) = .01789.
[Figure candybar.pdf: Average Candy Bar Weight (n = 5), with 2.18 and 2.22 marked]
d.
Student guess. This value should be greater than the answer to part a.
e.
P(2.18 < X̄ < 2.22) = P((2.18 - 2.20)/.01789 < Z < (2.22 - 2.20)/.01789) = P(-1.12 < Z < 1.12) = .8686 - .1314 =
.7372 (Table II); .7364 (applet). This probability is indeed larger than the probability we found in
part a.
f.
Student guess – they should guess that the probability will increase if the sample size is increased
to 40 because this will decrease the standard deviation which will concentrate more area under the
curve near the middle (mean).
g.
Now the standard deviation of the sample means is .04/sqrt(40) = .006325, so the curve is N(2.2, .006325).
P(2.18 < X̄ < 2.22) = P((2.18 - 2.20)/.006325 < Z < (2.22 - 2.20)/.006325) = P(-3.16 < Z < 3.16) = .9992 - .0008 =
.9984 (Table II); .9984 (applet)
This is larger than our answer in part e.
h.
The calculations in part g remain approximately correct regardless of the distribution of candy bar
weights because the sample size was large (40 > 30).
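The effect of sample size across parts a, e, and g can be seen in one loop. This is a minimal sketch, assuming Python's standard-library NormalDist in place of the applet; the function name prob_between is illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 2.20, 0.04  # population mean and SD of candy bar weights (ounces)


def prob_between(n, low=2.18, high=2.22):
    """P(low < sample mean < high) for a sample of size n."""
    dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return dist.cdf(high) - dist.cdf(low)


probs = {n: prob_between(n) for n in (1, 5, 40)}
for n, p in probs.items():
    print(n, round(p, 4))
```

The three results line up with the applet values quoted for a single bar, for n = 5, and for n = 40, and they increase as the standard deviation of the sample mean shrinks.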
Activity 15-11: Christmas Shopping
14-3, 14-7, 15-11, 19-1
a.
No – it is not valid to use the CLT because the sample size is too small and we do not know that
the population is normally distributed.
b.
Yes. With a sample size of 500, the CLT tells us that the sampling distribution of the sample
means would be approximately N($850, $11.18).
[Figure christmasshopping.pdf: Average Expected Christmas Expenditures ($)]
c.
P($831.61 ≤ X̄ ≤ $868.39) = P((831.61 - 850)/11.18 < Z < (868.39 - 850)/11.18) = P(-1.64 < Z < 1.64) = .9495
- .0505 = .8990 (Table II); .8989 (applet)
d.
P($828.09 ≤ X̄ ≤ $871.91) = P((828.09 - 850)/11.18 < Z < (871.91 - 850)/11.18) = P(-1.96 < Z < 1.96) = .9750
- .0250 = .9500 (Table II); .9499 (applet)
e.
P($821.20 ≤ X̄ ≤ $878.80) = P((821.20 - 850)/11.18 < Z < (878.80 - 850)/11.18) = P(-2.58 < Z < 2.58) = .9951
- .0049 = .9902 (Table II); .9900 (applet)
f.
First find the z-scores that mark 80% of the area in the middle of the standard normal curve:
P(-z* < Z < z*) ≈ .8000 → P(-1.28 < Z < 1.28) ≈ .8000.
As z = (x - 850)/11.18 and z = 1.28, x = (1.28)(11.18) + 850 = 864.31. Therefore, k = 864.31
- 850 = 14.31.
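Part f works backward from an area to a distance, which can be sketched with an inverse normal calculation. This assumes Python's standard-library NormalDist.inv_cdf in place of reading Table II in reverse, and the variable names are illustrative. Because it uses the unrounded z* (about 1.2816 rather than 1.28), k comes out near 14.33 instead of 14.31.

```python
from statistics import NormalDist

sd_mean = 11.18       # SD of the sample mean, taken from part b
central_area = 0.80   # area wanted in the middle of the curve

# The upper endpoint leaves (1 - .80)/2 = .10 in the right tail,
# so it sits at the .90 quantile of the standard normal curve.
z_star = NormalDist().inv_cdf((1 + central_area) / 2)
k = z_star * sd_mean
print(round(z_star, 4), round(k, 2))
```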
g.
P($981.61 ≤ X̄ ≤ $1018.39) = P((981.61 - 1000)/11.18 < Z < (1018.39 - 1000)/11.18) = P(-1.64 < Z < 1.64) =
.9495 - .0505 = .8990 (Table II); .8989 (applet)
This is exactly what we found in part c. The probability of falling within ±$18.39 of μ is the same,
regardless of what value we use for μ.
Activity 15-12: Jury Selection
11-4, 12-1, 15-12
a. The CLT applies to the jury pool with a sample size of 75 because it is large (75×.20 = 15 > 10). It
would not apply for the jury (sample size 12).
b. The CLT predicts the sampling distribution would be approximately N(.2, .046188).
[Figure jurypool.pdf: Sample Proportion of Senior Citizens (n = 75)]
c. P(p̂ ≥ .333) = P(Z > (.333 - .2)/.046188) = P(Z > 2.88) = .0020
d. This is the same as the empirical probability I found in Activity 11-4e. (Answers will vary.)
Activity 15-13: Non-English Speakers
a.
The CLT says this sampling distribution will be approximately N(.315, .046452).
[Figure nonenglish.pdf: Sample Proportion of Non-English Speakers]
b.
P(p̂ ≥ .5) = P(Z > (.5 - .315)/.046452) = P(Z > 3.98) ≈ 0.00
c.
P(p̂ ≤ .25) = P(Z < (.25 - .315)/.046452) = P(Z < -1.40) = .0808 (Table II); .0809 (applet)
d.
P(.2 ≤ p̂ ≤ .5) = P((.2 - .315)/.046452 < Z < (.5 - .315)/.046452) = P(-2.48 < Z < 3.98) = 1.000 - .0066 = .9934
e.
[Figure nonenglishohio.pdf: Sample Proportion of Non-English Speakers, with curves labeled Ohio and California]
f.
Judging from the plot, in Ohio, P(p̂ ≥ .5) is zero, P(p̂ ≤ .25) should be near 1, and
P(.2 ≤ p̂ ≤ .5) should be near zero.
Activity 15-14: Solitaire
11-22, 11-23, 15-4, 15-14, 21-20, 27-18
a.
Author A would need to play at least 90 games in order for nπ = n(1/9) ≥ 10 which would let us
use the CLT.
b.
Author B would need to play at least 60 games in order for nπ = n(1/6) ≥ 10 which would let us
use the CLT.
c.
If π = .8, the condition nπ ≥ 10 requires n ≥ 10/.8 = 12.5, or 13 games, but the condition
n(1-π) ≥ 10 requires n ≥ 10/.2 = 50 games. The authors would therefore need to play at least 50
games in order to use the CLT to approximate the sampling distribution of the sample proportion of wins.
Activity 15-15: Birth Weights
12-2, 14-9, 15-15, 21-17
a.
P(X < 2500) = P(Z < (2500 - 3300)/570) = P(Z < -1.40) = .0808 (Table II); .0802 (applet)
b.
P(X̄ < 2500) = P(Z < (2500 - 3300)/403.051) = P(Z < -1.99) = .0233 (Table II); .0236 (applet)
c.
This probability is less than the probability in part a. This makes sense because we are looking at
an average: it is harder for any pair of babies to have an average birth weight below 2500 grams
than for a single baby to weigh below this amount.
d.
P(X̄ < 2500) = P(Z < (2500 - 3300)/285) = P(Z < -2.81) = .0025 (Table II); .0025 (applet)
This probability is much less than the probability in part a. This makes sense because we are looking
at an average of 4 babies: it is harder for four babies to have an average birth weight below
2500 grams than for a single baby to weigh below this amount.
e.
P(3000 < X < 3600) = P((3000 - 3300)/570 < Z < (3600 - 3300)/570) = P(-.53 < Z < .53) = .4038 (Table II); .4013 (applet)
f.
Student expectation.
g.
P(3000 < X̄ < 3600) = P((3000 - 3300)/127.46 < Z < (3600 - 3300)/127.46) = P(-2.35 < Z < 2.35) = .9812 (Table II); .9814 (applet)
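The birth weight calculations all follow one pattern, so they can be sketched together. This is a minimal illustration assuming Python's standard-library NormalDist in place of the applet. The sample sizes 2, 4, and 20 are inferred from the standard deviations 403.051, 285, and 127.46 used above (570/sqrt(n)), and the helper name mean_dist is illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 3300, 570  # population mean and SD of birth weights (grams)


def mean_dist(n):
    """Sampling distribution of the mean birth weight of n babies."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))


# Parts a, b, d: probability that the mean is below 2500 grams.
below = {n: mean_dist(n).cdf(2500) for n in (1, 2, 4)}

# Parts e, g: probability that the mean is between 3000 and 3600 grams.
between = {n: mean_dist(n).cdf(3600) - mean_dist(n).cdf(3000) for n in (1, 20)}

print({n: round(p, 4) for n, p in below.items()})
print({n: round(p, 4) for n, p in between.items()})
```

Averaging more babies makes an extreme mean less likely and a near-typical mean more likely, which is the pattern described in parts c and d.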
Activity 15-16: Volunteerism
13-16, 15-16
a.




P(p̂ > .282) = P(Z > (.282 - .25)/sqrt((.25)(.75)/60000)) = P(Z > 18.1) ≈ 0
b.
P(p̂ > .282) = P(Z > (.282 - .25)/sqrt((.25)(.75)/500)) = P(Z > 1.65) = .0495
c.
The first scenario (with a sample of 60,000) provides stronger evidence against the claim that 25%
of the population served as volunteers. If 25% of the population had indeed served as volunteers,
we would never expect to see a sample result as extreme as this with a sample of size 60,000.
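The contrast between the two scenarios can be sketched numerically. This assumes Python's standard-library NormalDist in place of Table II and takes n = 60,000 from the formula in part a; the function name z_and_tail is illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi_claimed, p_hat = 0.25, 0.282  # claimed proportion and observed sample proportion


def z_and_tail(n):
    """z-score of p_hat and P(Z > z) if the claimed proportion is true."""
    sd = sqrt(pi_claimed * (1 - pi_claimed) / n)
    z = (p_hat - pi_claimed) / sd
    return z, 1 - NormalDist().cdf(z)


z_large, p_large = z_and_tail(60000)
z_small, p_small = z_and_tail(500)
print(round(z_large, 1), p_large)
print(round(z_small, 2), round(p_small, 4))
```

The same sample proportion is about 18 standard deviations above .25 with the large sample but under 2 standard deviations above it with n = 500.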
Activity 15-17: Tip Percentages
14-12, 15-17
a.
The CLT says the sampling distribution will be approximately N(15, .566).
P(X̄ ≥ 16.4) = P(Z > (16.4 - 15)/.566) = P(Z > 2.47) = .0068 (Table II); .0067 (applet)
b.
Yes. This provides strong evidence that the mean tip percentage is actually greater than 15%,
because if it were 15% or less, the chance that we would find a random sample of 50 tables with
an average tip percentage of at least 16.4% is less than 1%, so it is extremely unlikely.
c.
P(X̄ ≤ 14.4) = P(Z < (14.4 - 15)/.566) = P(Z < -1.06) = .1446
This does not provide strong evidence that the population mean tip percentage is less than 15%,
because if the population mean percentage is 15% (or more), we would expect to see a random
sample of 50 tables with an average percentage tip of 14.4% or less almost 15% of the time, which
is not rare.
Activity 15-18: Body Temperatures
12-1, 12-19, 15-3, 15-18, 15-19, 19-3, 19-17, 20-11, 22-10, 23-3
The CLT would still apply with a sample of size 40 because the sample size is still large (> 30).
The standard deviation of the sampling distribution of the sample mean would increase because of the
smaller sample size. This would decrease the probability that the sample mean body temperature would
fall between 98.5 and 98.7 (or between 98.2 and 98.4 if the population mean were 98.3), and the
probability that a random sample of size 40 results in a sample mean body temperature within ± 0.1
degrees of the actual population mean μ. This makes sense because the increased standard deviation
means the average body temperatures are more spread out – less concentrated around the population mean
μ.
Activity 15-19: Body Temperatures
12-1, 12-19, 15-3, 15-18, 15-19, 19-3, 19-17, 20-11, 22-10, 23-3
a.
[Figure bodytempshistogram.pdf: histogram of Body Temperatures (degrees Fahrenheit); y-axis: Frequency]
These body temperatures are fairly normally distributed with a couple of high outliers above
100°F.
b.
sample mean = 98.249°F, standard deviation = 0.733°F.
c.
CLT says the sampling distribution would be approximately N(98.6, .061394).
P(X̄ ≤ 98.249) = P(Z < (98.249 - 98.6)/.061394) = P(Z < -5.72) ≈ 0.00
d.
Yes. The probability found in part c is low enough to provide compelling evidence that the
population mean body temperature is not 98.6 degrees. If it were, we would essentially never
(probability near zero) find a sample of 130 healthy adults with an average body temperature as
low as 98.249°F. Since we did find such a sample, we do not believe the population mean
temperature is as high as 98.6°F.
Activity 15-20: IQ Scores
12-9, 14-13, 15-20
a.
P(X > 110) = P(Z > (110 - 105)/12) = P(Z > .42) = .3372 (Table II); .3385 (applet)
b.
P(X̄ > 110) = P(Z > (110 - 105)/3.795) = P(Z > 1.32) = .0934 (Table II); .0935 (applet)
c.
P(X̄ > 110) = P(Z > (110 - 105)/1.897) = P(Z > 2.63) = .0043 (Table II); .0042 (applet)
d.
Yes. The calculation in part c would be valid even if the distribution of IQs in the population
were skewed, because the sample size is large (greater than 30).