TOPIC 13
Sampling Distributions:
Proportions
In-Class Activities
Activity 13-1: Candy Colors
1-13, 2-19, 13-1, 13-2, 15-7, 15-8, 16-4, 16-22, 24-15, 24-16
Answers will vary. Here is one representative set of answers.
a.
Color    Count   Proportion (Count/25)
Orange   13      .52
Yellow   7       .28
Brown    5       .20
b.
This is a statistic. The symbol used to denote the proportion is p̂.
c.
This is a parameter. The symbol used to denote the proportion is π.
d.
No – we do not know the proportion of orange candies manufactured by Hershey.
e.
Yes – we know the proportion of orange candies among the 25 candies that we individually
selected.
f.
It is very unlikely that every student in the class obtained the same proportion of orange candies in
her sample.
g.
Answers will vary. Here is one representative set of answers.
[reeses.pdf: dotplot with horizontal axis “Sample Proportion of Orange Candies”]
h.
observational units = samples of 25 candies; variable = proportion of the sample that is colored orange
i.
This dotplot of the sample proportions is symmetric, mound-shaped (roughly normal) with center
of .6 and a spread from about .4 to about .84. The standard deviation is .09.
j.
Based on the sample results from this class, a reasonable guess for π would be .6.
Rossman/Chance, Workshop Statistics, 3/e
Solutions, Unit 3, Topic 13
k.
Most estimates would be reasonably close to π, but a very few estimates would be way off. We
see this from the dotplot. Most of the class results are the same – near .6, but a few of the class
results are quite extreme (far from .6).
l.
If each student had taken samples of size 10 instead of size 25 we would expect more variability
(greater horizontal spread) in our dotplot.
m. If each student had taken samples of size 75 instead of size 25 we would expect less variability
(less horizontal spread) in our dotplot.
Activity 13-2: Candy Colors
1-13, 2-19, 13-1, 13-2, 15-7, 15-8, 16-4, 16-22, 24-15, 24-16
Answers will vary. The following results are from one particular running of the applet.
a.
p̂ = .44
b.
p̂ = .54, .507, .47, .56, .432
No – I did not get the same sample proportion each time.
c.
[reeseapplet1.pdf]
d.
Yes – the distribution appears roughly normal, centered at about .45, with a standard deviation of
about .1.
e.
A normal curve seems to model the simulated sample proportions very well.
f.
mean of p̂ values = .449
standard deviation of p̂ values = .100
g.
Roughly speaking, more sample proportions are close to .45 than are far away from it.
h.
                     Number of 500        Percentage of 500
                     Sample Proportions   Sample Proportions
Within ±.10 of .45   354                  71.5%
Within ±.20 of .45   473                  95.6%
Within ±.30 of .45   491                  99.2%
i.
About 95% would capture the actual population proportion.
j.
No – you would not have any definite way of knowing whether or not your sample proportion
was within .20 of the population proportion. But you could be reasonably confident that your
sample proportion was within .20 of the population proportion because about 95% of the sample
proportions would be within .2 of π.
k.
mean of p̂ values = .446
standard deviation of p̂ values = .057
l.
The shape is still roughly normal and the center is still about .45. The spread, however, has
decreased significantly (from .1 to about .057).
m. The applet reports that 460/500 = 92% of the sample proportions are within .1 of .45.
n.
This is a much greater percentage (92% versus 71.5%) than it was when our sample size was n =
25.
o.
The sample proportion is more likely to be closer to the population proportion with a larger
sample size.
p.
.057×2 = .114
.45 ± .114 = [.336, .564]
q.
The applet reports that about 95% of the 500 sample proportions are within .114 of .45.
r.
about 95%
s.
theoretical mean of p̂ values = .45
theoretical standard deviation of p̂ values = $\sqrt{(.45)(.55)/25} \approx .0995 \approx .1$
t.
theoretical mean of p̂ values = .45
theoretical standard deviation of p̂ values = $\sqrt{(.45)(.55)/75} \approx .057$
u.
No – the normal model does not summarize this distribution well. This is not a contradiction to
the Central Limit Theorem because nπ = 25(.1) = 2.5 < 10.
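The applet runs in this activity can be approximated offline. The following is a minimal Python sketch, not the applet itself: the function names, the fixed seed, and the use of Python's `random` module are assumptions made for illustration. It draws 500 samples with π = .45 for n = 25 and n = 75, then reports the theoretical standard deviation next to the fraction of simulated sample proportions within .10 of .45.

```python
import math
import random


def theoretical_sd(pi, n):
    """Standard deviation of the sample proportion predicted by the CLT."""
    return math.sqrt(pi * (1 - pi) / n)


def simulate_sample_proportions(pi, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return their sample proportions."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [
        sum(rng.random() < pi for _ in range(n)) / n
        for _ in range(num_samples)
    ]


def fraction_within(values, center, margin, tolerance=1e-9):
    """Fraction of values within margin of center (tolerance guards float round-off)."""
    hits = sum(abs(v - center) <= margin + tolerance for v in values)
    return hits / len(values)


for n in (25, 75):
    p_hats = simulate_sample_proportions(0.45, n, 500)
    print(n, round(theoretical_sd(0.45, n), 4), fraction_within(p_hats, 0.45, 0.10))
```

Because the samples are random, the simulated fractions will differ somewhat from the applet's counts, but the larger sample size should show the smaller spread.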
Activity 13-3: Kissing Couples
13-3, 13-4, 16-6, 17-12, 24-4, 24-14
a.
This is a parameter. π = .5
b.
nπ = 124(.5) = 62 > 10 and n(1-π) = 124(.5) = 62 > 10, so the CLT does apply.
Shape: approx. normal
Center: π = .5
Spread: $\sqrt{(.5)(.5)/124} \approx .0449$
c.
Yes – the histogram does appear to be consistent with what the CLT predicts. It is bell-shaped,
centered at about .5 and extends from about .5 – 3(.0449) = .3653 to about .5 + 3(.0449) = .6347.
d.
p̂ = 80/124 = .645
e.
Yes – it would be very surprising to observe such a sample proportion (.645) if ½ of all kissing
couples lean their heads to the right – this sample proportion never occurred in 1000 simulations.
f.
z = (.645-.5) / .0449 = 3.23
g.
Yes – this is a very surprising z-score. P(Z > 3.23) ≈ .0006. If ½ of all kissing couples lean their
heads to the right, we would see a sample result as or more extreme in less than 1% of random
samples.
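The z-score and tail probability in parts f and g can be checked numerically. This is a small sketch under stated assumptions: the function names are illustrative, and the tail probability uses the standard normal model via `math.erfc` rather than a printed table.

```python
import math


def sd_of_sample_proportion(pi, n):
    """CLT standard deviation of the sample proportion."""
    return math.sqrt(pi * (1 - pi) / n)


def z_score(p_hat, pi, n):
    """Number of standard deviations the observed p_hat lies from pi."""
    return (p_hat - pi) / sd_of_sample_proportion(pi, n)


def upper_tail_probability(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = z_score(80 / 124, 0.5, 124)
print(round(z, 2), upper_tail_probability(z))
```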
Activity 13-4: Kissing Couples
13-3, 13-4, 24-4, 24-14
a.
Recall that the observed sample proportion of kissing couples who lean their heads to the right is
p̂ = 80/124 = .645. This value is not at all uncommon in the first histogram.
b.
The CLT says that the sample proportion in this case would vary approximately normally with
mean equal to .667 and standard deviation equal to $\sqrt{(.667)(.333)/124} \approx .042$.
The z-score for the observed sample proportion of .645 is, therefore,
$z = \frac{.645 - .667}{.042} \approx -0.52$,
so the observed sample proportion .645 lies only about half of a standard deviation from the population
proportion when π = .667.
c.
The observed sample proportion is barely one-half of a standard deviation away from what you
would expect if the population proportion were equal to 2/3, not a surprising result at all.
Therefore, the sample data provide no reason to doubt that the population proportion of kissing
couples who lean their heads to the right equals 2/3.
d.
The value .645 is pretty far along the lower tail of the second histogram. This indicates that the
observed sample proportion would rarely occur if the population proportion were equal to 3/4.
Further evidence of this result is provided by the rather large negative z-score:
$z = \frac{.645 - .750}{\sqrt{(.750)(.250)/124}} = \frac{.645 - .750}{.039} \approx -2.69$
Therefore, the sample data provide fairly strong evidence that the population proportion of kissing
couples who lean their heads to the right is not 3/4 (because it would be rather surprising to find a
sample proportion so far from this population proportion by chance alone).
e.
A reasonable estimate of the population proportion π is the sample proportion .645. An estimate
of the standard deviation of p̂ would then be $\sqrt{(.645)(.355)/124} \approx .043$.
Doubling this standard deviation gives .086. The interval is, therefore, .645 ± .086, which runs from
.559 to .731. Notice that 1/2 and 3/4 are not in this interval, but 2/3 is. The interval is consistent
with the earlier analysis of the plausibility of the values 1/2, 2/3, and 3/4 for the population
proportion of kissing couples who lean their heads to the right.
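The interval in part e can be reproduced with a few lines of Python. This is an illustrative sketch only: the function name `two_sd_interval` is an assumption, and the code uses the unrounded standard deviation, so its endpoints agree with the hand calculation only after rounding.

```python
import math


def two_sd_interval(p_hat, n):
    """Interval p_hat +/- 2 estimated standard deviations of the sample proportion."""
    sd = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * sd, p_hat + 2 * sd


low, high = two_sd_interval(0.645, 124)
print(round(low, 3), round(high, 3))

# Check which of the three conjectured population proportions fall in the interval.
for label, pi in (("1/2", 1 / 2), ("2/3", 2 / 3), ("3/4", 3 / 4)):
    print(label, low <= pi <= high)
```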
Homework Activities
Activity 13-5: Parameters vs. Statistics
a. statistic: p̂
b. parameter: π
c. statistic: x̄
d. parameter: μ
e. parameter: σ
f. statistic: s
g. parameter: π (population is all voters)
h. statistic: p̂
i. statistic: p̂
j. parameter: μ
k. statistic: p̂
l. statistic: x̄ (population is all American households)
m. parameter: μ
n. statistic: p̂
o. statistic: x̄
Activity 13-6: Generation M
3-8, 4-14, 13-6, 16-1, 16-3, 16-7, 18-1, 21-11, 21-12
a. parameter: π
b. statistic: p̂
c. statistic: p̂
d. statistic: x̄
e. parameter: μ
Activity 13-7: Presidential Approval
13-7, 13-8
a.
π                    0   .2       .4       .5       .6       .8       1
standard deviation   0   .01265   .01549   .01581   .01549   .01265   0
b.
π = .5 produces the most variability.
c.
π = 0 or π = 1 produces the least variability.
d.
If none (or all) of a population has a particular characteristic, then none (or all) of a sample must
have this characteristic as well, leaving no variability in the sample proportion. Similarly, if the
population proportion is close to zero or one, there is not much “room” for the sample proportion
to vary away from the population value. But if exactly half of a population has the characteristic,
this should produce the most varied sample proportions.
e.
Using a different sample size (500 rather than 1000) would not change the answers to parts a-c
(the amount of variability would change, but not the fact that the variability is largest at π = .5)
since the sample size is a constant in the denominator for the standard deviation for each of these
values.
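The table in part a and the claim in part e can be checked with a short calculation. This is a sketch under assumptions: the names `sd_table` and `PI_VALUES` are illustrative, and n = 1000 and n = 500 are the sample sizes mentioned in part e.

```python
import math


def sd_of_sample_proportion(pi, n):
    """CLT standard deviation of the sample proportion."""
    return math.sqrt(pi * (1 - pi) / n)


PI_VALUES = [0, 0.2, 0.4, 0.5, 0.6, 0.8, 1]


def sd_table(n):
    """Map each population proportion to its standard deviation for sample size n."""
    return {pi: sd_of_sample_proportion(pi, n) for pi in PI_VALUES}


for n in (1000, 500):
    table = sd_table(n)
    most_variable = max(table, key=table.get)  # value of pi with the largest SD
    print(n, most_variable, [round(sd, 5) for sd in table.values()])
```

Changing n rescales every entry by the same factor, so the value of π with the largest standard deviation does not move.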
Activity 13-8: Presidential Approval
13-7, 13-8
a.
n                    100         200        400         800         1600
standard deviation   0.0489898   0.034641   0.0244949   0.0173205   0.0122474
b.
As the sample size increases, the standard deviation decreases.
c.
The sample size must increase by a factor of 4 in order to cut the standard deviation in half.
d.
No – the answer to part c would not change if we used a proportion other than .4 to calculate the
standard deviations. (This time the numerator, π(1- π), is a constant in the calculations.)
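The pattern in parts c and d can be verified numerically. The sketch below is illustrative: the names are assumptions, and .25 is an arbitrary second proportion chosen only to show that the ratio does not depend on π.

```python
import math


def sd_of_sample_proportion(pi, n):
    """CLT standard deviation of the sample proportion."""
    return math.sqrt(pi * (1 - pi) / n)


SAMPLE_SIZES = [100, 200, 400, 800, 1600]


def sd_by_sample_size(pi):
    """Standard deviations for each sample size in SAMPLE_SIZES."""
    return [sd_of_sample_proportion(pi, n) for n in SAMPLE_SIZES]


for pi in (0.4, 0.25):
    sds = sd_by_sample_size(pi)
    ratio = sds[0] / sds[2]  # n = 100 versus n = 400, a fourfold increase
    print(pi, [round(sd, 7) for sd in sds], round(ratio, 6))
```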
Activity 13-9: Pet Ownership
13-9, 13-14, 13-15, 18-2, 20-21
a.
No – you cannot be certain that the sample proportion of cat households in your sample will be
closer to π than your competitor’s because of sampling variability, but it is much more likely.
b.
Yes – you have a better chance than your competitor of obtaining a sample proportion of cat
households that falls within ± .05 of π because you are using a larger sample size.
c.
n                    50      200
standard deviation   0.061   0.031
The sample size 200 produces the smaller standard deviation. It is ½ the size of the standard deviation
when the sample size is 50.
d.
The applet reports that 431/500 = 86.2% of the sample proportions are within .05 of .25.
e.
The applet reports that 250/500 = 50% of the sample proportions are within .05 of .25.
f.
Both distributions are, as expected, approximately normal and centered at .25, but the distribution
with samples of size 200 has a much smaller spread than the distribution using samples of size 50.
With samples of size 200, the distribution extends from a minimum of only about .18 to a
maximum of about .33, and more than 85% of the sample proportions fall within .05 of the mean
(.25). In contrast, when the sample size is 50, the sampling distribution extends from 0 to above .4
and only about 50% of sample proportions are within .05 of the mean.
Activity 13-10: Calling Heads or Tails
13-10, 17-14, 17-15, 24-19
Answers will vary. Here is one representative set of answers.
a.
p̂ = 16/20 = .8 said they would call heads. This is a statistic.
b.
[heads.pdf]
This distribution is approximately normal, centered at about .5, with standard deviation = .117.
c.
Based on this simulation, it would be extremely surprising to obtain our class result if, in fact,
50% of the population of students call heads. A value of .8 is in the far right tail of this
distribution. Values as extreme as .8 happened in only about .2% of samples in the simulation.
d.
[heads2.pdf]
This distribution is also approximately normal, but it is centered at about .7, with standard
deviation = .104. This time our class result is not uncommon, falling very close to the center of the
distribution (within one standard deviation). According to the applet, a result of at
least 16/20 students calling heads happened 117/1000 times or 11.7% of the time.
Activity 13-11: Racquet Spinning
11-9, 13-11, 17-3, 17-18, 18-3, 18-12, 18-13
a.
.5 is a parameter because it describes the long-run result of the spinning process.
b.
.46 is a statistic because it is the result of a sample.
c.
Answers will vary. The answers given here are from one particular running of the applet.
d.
[racquetapplet.pdf]
This distribution is approximately normal, with mean .502 and standard deviation .049.
e.
The Central Limit Theorem predicts this distribution will be approximately normal with mean .5
and standard deviation .05. The sampling distribution displayed by the applet simulation is very
close to this.
f.
The applet reports that 190/1000 =19% of the samples had a sample proportion of at least .54 and
183/1000 = 18.3% of the samples had a sample proportion of .46 or less. Together this is 37.3%
of the samples.
g.
This answer suggests that .46 is not very unlikely to occur by chance alone if the results are 50/50
in the long run. Such an outcome will happen about 37% of the time by chance alone – so it is
certainly not rare.
Activity 13-12: Halloween Practices
a.
.69 is a statistic because it is a number that summarizes a sample.
b.
No – this finding does not prove that π = .69. If we were to take another random sample of 1005
adults, we would most likely find a different (although similar) proportion of adults who planned
to give out Halloween treats.
c.
If π = .7, then the standard deviation = .0144, so .7 - 2×.0144 = .671. So yes, .69 would fall within
2 standard deviations of .7 in the sampling distribution.
d.
If π = .6, then the standard deviation = .0155, so .6 + 2×.0155 = .6309. So no, .69 would not fall
within 2 standard deviations of .6 in the sampling distribution.
e.
Using a common standard deviation of .015 (so 2s = .03):
π      0.61  0.62  0.63  0.64  0.65  0.66  0.67  0.68  0.69  0.70  0.71  0.72  0.73
π-2s   0.58  0.59  0.60  0.61  0.62  0.63  0.64  0.65  0.66  0.67  0.68  0.69  0.70
π+2s   0.64  0.65  0.66  0.67  0.68  0.69  0.70  0.71  0.72  0.73  0.74  0.75  0.76
So the potential values of π are .66 through .72, the values for which .69 lies between π-2s and π+2s.
f.
Based on our work in part e, the plausible values for the percentage of the population who planned
to give out Halloween treats from the doors of their homes in 1999 are between 66% and 72%,
inclusive.
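The scan over candidate values in part e can be written as a short loop. This is a hedged sketch: `plausible_values` is an illustrative name, the common standard deviation .015 is taken from part e, and a small tolerance is added (an assumption) so that floating-point round-off does not exclude the boundary values .66 and .72.

```python
def plausible_values(p_hat, candidates, sd, tolerance=1e-9):
    """Candidate proportions whose 2-standard-deviation band contains p_hat."""
    return [pi for pi in candidates if abs(p_hat - pi) <= 2 * sd + tolerance]


# Candidate population proportions .61, .62, ..., .73 as in the table.
candidates = [k / 100 for k in range(61, 74)]
print(plausible_values(0.69, candidates, 0.015))
```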
Activity 13-13: Distinguishing Between Colas
13-13, 17-24, 18-9
a.
π = 1/3
b.
Roll the die 30 times to represent the 30 trials. If you roll a 5 or 6, consider this a success (i.e.,
you successfully identified the odd cola). Otherwise (if you roll a 1, 2, 3 or 4), you failed to
identify the odd cola.
c.
Below are example results from the applet:
[colaapplet.pdf]
d.
Yes – the shape of this sampling distribution is approximately normal.
e.
empirical sampling distribution: mean = .336, standard deviation = .086
CLT predicts: mean = .333, standard deviation = .086
The simulated sampling distribution and CLT values are very close.
f.
The applet reports that in 169/1000 = 16.9% of the samples, the subject guessed correctly 40% or
more of the time.
g.
If a subject was correct 40% of the time in this experiment, I would not be convinced he/she was
doing better than guessing would allow, since if he/she was just guessing, he or she would get
40% or more correct about 17% of the time, so this is not all that surprising of an outcome for a
guesser.
h.
If a subject was correct 60% of the time in this experiment, I would be convinced he/she was
doing better than guessing would allow, since in this simulation – the subjects never got 60% or
more correct by just guessing.
i.
Applet.
j. The rough shapes of the histograms should look like:
[colaoverlap.pdf: two histograms on a common horizontal axis, “Cola successes”, running from 0.0 to 0.9]
Both distributions are approximately normal and have the same spread! However they have
different centers. There is little overlap in the distributions.
k.
The applet reports that in 746/1000 = 74.6% of the samples, the subject guessed correctly 60% or
more of the time.
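The die-rolling procedure from part b can be simulated directly and compared with the CLT values in part e. This is a minimal sketch, not the applet: the function name, the seed, and the choice of 1000 simulated subjects are assumptions made for illustration.

```python
import random
import statistics


def simulate_guessing(num_subjects=1000, num_trials=30, seed=0):
    """Simulate subjects who guess on every trial, using a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    proportions = []
    for _ in range(num_subjects):
        # A roll of 5 or 6 counts as correctly identifying the odd cola.
        successes = sum(rng.randint(1, 6) >= 5 for _ in range(num_trials))
        proportions.append(successes / num_trials)
    return proportions


proportions = simulate_guessing()
print(statistics.mean(proportions), statistics.stdev(proportions))
```

The printed mean and standard deviation should land near the CLT predictions of .333 and .086, though any single run will vary.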
Activity 13-14: Pet Ownership
13-9, 13-14, 13-15, 18-2, 20-21
a.
π = ⅓ is a parameter because it describes all American households.
b.
[catowners.pdf: sketch of the sampling distribution, horizontal axis “Proportion of Cat Owners” from 0.328 to 0.338]
The CLT says this sampling distribution will be approximately normal, centered at ⅓, with a
standard deviation of .001667.
c.
By the empirical rule, ninety-five percent of all sample proportions should fall between .3297 and
.3363 (within 2 standard deviations)
d.
This interval is so narrow because the sample size (80,000) is an extremely large sample size.
e.
p̂ = .316 is a statistic because it is a number obtained from a sample.
f.
z = (.316-.333) / .001667 = -10.2.
g.
This is an extremely unusual z-score. P(Z< -10.2) ≈ 0 – so the sample data do provide evidence
that the population proportion who own a pet cat is not one-third (we observed a sample result that
pretty much never happens when π = 1/3, so we are convinced π ≠ 1/3).
Activity 13-15: Pet Ownership
13-9, 13-14, 13-15, 18-2, 20-21
a.
[birdowners.pdf: sketch of the sampling distribution, horizontal axis “Proportion of Bird Owners” from 0.047 to 0.053]
The CLT says this sampling distribution will be approximately normal, centered at .05, with a
standard deviation of .000771.
b.
This standard deviation is much smaller because we are assuming π is smaller (.05 rather than
.333, further away from .5, see Activity 13-7).
c.
z = (.046-.05) / .000771 = -5.19. This is a very unusual z-score. P(Z< -5.19) ≈ 0 – so the survey
does provide evidence that the population proportion who own a pet bird is not 5% (we observed a
sample result that pretty much never happens when π = .05, so we are convinced π ≠ .05).
Activity 13-16: Volunteerism
13-16, 15-16, 21-17
a.
28.2% is a statistic. p̂ = .282
b.
[volunteerism.pdf: sketch of the sampling distribution, horizontal axis “Proportion of Volunteers” from 0.2450 to 0.2550]
The CLT says this sampling distribution will be approximately normal, centered at .25, with a
standard deviation of .001768.
c.
z = (.282-.25) / .001768 = 18.1
d.
Yes – this z-score is extreme enough to cast doubt on the assertion that 25% of the population
participated in a volunteer activity. P(Z > 18.1) ≈ 0, so if π really is .25, we would never expect to
see such a sample result. Yet we did see this sample result, so we have very strong evidence that
π is not .25.
e.
[volunteerism2.pdf: sketch of the sampling distribution, horizontal axis “Proportion of Volunteers” from 0.20 to 0.32]
The CLT says this sampling distribution will be approximately normal, centered at .25, with a
standard deviation of .0194.
z = (.282-.25) / .0194 = 1.65
This is not a particularly extreme z-score. P(Z > 1.65) = .0495, which means we have some
evidence that would make us doubt that π really is .25, but the evidence is not overwhelming.
f.
It makes sense that our answers differ so much because the sample sizes are so different. A small
difference between the sample and population proportions would be very surprising with a sample of size
60,000, but would not be as surprising with only 500 people.
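The effect of sample size on the z-score can be seen by running the same calculation for both samples. This sketch rests on an assumption: the larger sample size of 60,000 is not stated in these solutions but is inferred from the standard deviation .001768 in part b, since (.25)(.75)/(.001768)² ≈ 60,000. The function name is illustrative.

```python
import math


def z_score(p_hat, pi, n):
    """Standardized distance of p_hat from pi for samples of size n."""
    return (p_hat - pi) / math.sqrt(pi * (1 - pi) / n)


# 60000 is inferred from the reported standard deviation (an assumption); 500 is from part e.
for n in (60000, 500):
    print(n, round(z_score(0.282, 0.25, n), 2))
```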
Activity 13-17: Pursuit of Happiness
2-16, 3-25, 13-17, 25-1, 25-2, 25-4
a.
z = (.84-.8) / .007286 = 5.49
b.
Yes – this z-score is extreme enough to cast doubt on the assertion that 80% of the population felt
happy. P(Z >5.49) ≈ 0, so if π really is .80, we would never expect to see such a sample result.
Yet we did see this sample result, so we have strong evidence that π is not .80.
c.
π = .82: $z = \frac{.84 - .82}{\sqrt{(.82)(.18)/3014}} \approx 2.858$
π = .83: $z = \frac{.84 - .83}{\sqrt{(.83)(.17)/3014}} \approx 1.462$
π = .84: $z = \frac{.84 - .84}{\sqrt{(.84)(.16)/3014}} = 0.00$
π = .85: $z = \frac{.84 - .85}{\sqrt{(.85)(.15)/3014}} \approx -1.538$
π = .86: $z = \frac{.84 - .86}{\sqrt{(.86)(.14)/3014}} \approx -3.164$
π = .87: $z = \frac{.84 - .87}{\sqrt{(.87)(.13)/3014}} \approx -4.897$
π = .88: $z = \frac{.84 - .88}{\sqrt{(.88)(.12)/3014}} \approx -6.758$
Plausible values of the population proportion include .83 through .85, since they lie within 2 standard
deviations of the observed sample proportion.
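The z-scores in part c follow one formula applied to each candidate value, which makes them easy to compute in a loop. The sketch below is illustrative; the function names are assumptions, and "plausible" is taken to mean |z| ≤ 2, matching the 2-standard-deviation criterion used above.

```python
import math


def z_score(p_hat, pi, n):
    """Standardized distance of p_hat from a conjectured pi."""
    return (p_hat - pi) / math.sqrt(pi * (1 - pi) / n)


def plausible_proportions(p_hat, n, candidates):
    """Candidates for which p_hat lies within 2 standard deviations."""
    return [pi for pi in candidates if abs(z_score(p_hat, pi, n)) <= 2]


candidates = [k / 100 for k in range(82, 89)]  # .82, .83, ..., .88
for pi in candidates:
    print(pi, round(z_score(0.84, pi, 3014), 3))
print(plausible_proportions(0.84, 3014, candidates))
```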
Activity 13-18: Cursive Writing
13-18, 16-8
a. standard deviation = $\sqrt{(.15)(.85)/n}$
b.
We need $2\sqrt{(.15)(.85)/n} \le .05$, thus $n \ge \frac{(.15)(.85)}{(.025)^2} = 204$.
c.
We need $2\sqrt{(.15)(.85)/n} \le .02$, thus $n \ge \frac{(.15)(.85)}{(.01)^2} = 1275$.
We need a much larger sample size to be as “confident” that our sample proportion will fall within this
smaller range of the population proportion.
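The sample-size calculations in parts b and c can be sketched in code. This is an illustration under assumptions: the function name is hypothetical, the margin is interpreted as 2 standard deviations as in the solution, and a tiny offset is subtracted before rounding up so that floating-point round-off does not push an exact answer to the next integer.

```python
import math


def required_sample_size(pi, margin):
    """Smallest n for which 2 standard deviations of p-hat is at most margin."""
    sd_needed = margin / 2
    n = pi * (1 - pi) / sd_needed ** 2
    # Subtract a tiny amount before rounding up to absorb float round-off.
    return math.ceil(n - 1e-9)


print(required_sample_size(0.15, 0.05), required_sample_size(0.15, 0.02))
```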