Minitab Simulation to examine the sampling distribution of the sample proportion p-hat.
We will use Minitab to simulate randomly selecting samples from a population with a known proportion of
“success” (i.e., with a population in which a given proportion have a particular characteristic of interest).
Open Minitab (no need to open a data file) and follow these steps:
1) To simulate selecting a sample of size 20 from a population with p = 0.5 (50% of the population has the
characteristic of interest):
Calc > Random data > Binomial
Generate 500 rows of data
Store in X1
Number of trials = 20
Probability of success = 0.5
Now you have the results for 500 samples of size 20. The counts X (the values in X1) are the
number in each sample who have the characteristic of interest.
2) To calculate the proportion of “success” for each of these 500 random samples of size 20:
Calc > Calculator
Store result in variable: p-hat20
Expression: X1 / 20
3) Now we will examine the distribution of the 500 sample proportions.

Make a histogram for p-hat20.

Describe the distribution. Where is the distribution centered?

Use Stat > Basic Statistics > Descriptive Statistics to find the mean and the standard deviation of
the 500 sample proportions.
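For readers who want to check steps 1 to 3 outside Minitab, the same simulation can be sketched in Python using only the standard library. This is an illustration, not part of the worksheet: the function name simulate_sample_proportions and the seed are assumptions, and Python's random module stands in for Minitab's binomial generator, so the numbers will not match a Minitab run exactly.

```python
import random
import statistics


def simulate_sample_proportions(n, p, reps=500, seed=None):
    """Simulate `reps` samples of size `n`; return each sample's proportion of successes."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(reps):
        # Number of successes in one sample (one row of X1 in the worksheet).
        x = sum(rng.random() < p for _ in range(n))
        # The Calculator step: divide the count by the sample size.
        proportions.append(x / n)
    return proportions


# Hypothetical seed, used only so the run is reproducible.
p_hat20 = simulate_sample_proportions(n=20, p=0.5, seed=1)

print("mean of the 500 sample proportions:", round(statistics.mean(p_hat20), 4))
print("std dev of the 500 sample proportions:", round(statistics.stdev(p_hat20), 4))
```

statistics.stdev is the sample standard deviation (divisor n - 1), which should correspond to the StDev value that Descriptive Statistics reports.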
4) To simulate selecting a sample of size 70 from a population with p = 0.5 (50% of the population has the
characteristic of interest) and store the number in each sample who have that characteristic in X2:
Calc > Random data > Binomial
Generate 500 rows of data
Store in X2
Number of trials = 70
Probability of success = 0.5
5) To calculate the proportion of “success” for each of these 500 random samples of size 70:
Calc > Calculator
Store result in variable: p-hat70
Expression: X2 / 70
6) To simulate selecting a sample of size 200 from a population with p = 0.5 (50% of the population has the
characteristic of interest) and store the number in each sample who have that characteristic in X3:
Calc > Random data > Binomial
Generate 500 rows of data
Store in X3
Number of trials = 200
Probability of success = 0.5
Now you have the results for 500 samples of size 200. The counts X (the values in X3) are the
number in each sample that have the characteristic of interest.
7) To calculate the proportion of “success” for each of these 500 random samples of size 200:
Calc > Calculator
Store result in variable: p-hat200
Expression: X3 / 200
8) Now we will examine the distribution of the 500 sample proportions for samples of size 20, 70, and 200 by
viewing three histograms:
Graph > Histogram
Graph the variables: p-hat20, p-hat70, p-hat200
Click on Multiple Graphs.
Select ‘Same X’ and ‘Same Y’.
Describe and compare the three distributions. How are they similar? How are they different?
Use Stat > Basic Statistics > Descriptive Statistics to find the means and the standard deviations.
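The comparison in step 8 can also be sketched in Python. The example below is an illustration under stated assumptions: the seed, the helper names, and the choice of 21 bins centered on multiples of 0.05 are all hypothetical, and the shared bins only imitate the effect of ‘Same X’. It prints a rough text histogram for each sample size so the three shapes can be compared on one scale.

```python
import random
import statistics

rng = random.Random(2)  # hypothetical seed, for reproducibility only
REPS = 500              # number of simulated samples per sample size
P = 0.5                 # population proportion
N_BINS = 20             # bins are centered on 0.00, 0.05, ..., 1.00


def sample_proportions(n):
    """Return REPS simulated sample proportions for samples of size n."""
    return [sum(rng.random() < P for _ in range(n)) / n for _ in range(REPS)]


def bin_counts(values):
    """Count values in 21 shared bins, each centered on a multiple of 0.05."""
    counts = [0] * (N_BINS + 1)
    for v in values:
        counts[int(v * N_BINS + 0.5)] += 1
    return counts


results = {}
for n in (20, 70, 200):
    p_hat = sample_proportions(n)
    results[n] = {
        "mean": statistics.mean(p_hat),
        "sd": statistics.stdev(p_hat),
        "hist": bin_counts(p_hat),
    }

for n, r in results.items():
    print(f"n = {n}: mean = {r['mean']:.4f}, std dev = {r['sd']:.4f}")
    for i, c in enumerate(r["hist"]):
        if c:  # skip empty bins to keep the display short
            print(f"  {i / N_BINS:.2f} {'*' * (c // 5)}")
```

Because every sample size uses the same bins, a narrower column of stars means less spread in the sample proportions.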
9) Fill in the following:
Pop’n proportion = 50% (p = 0.5) [p is fixed; n changes]

Sample Size    Mean of the 500 sample proportions    Std dev of the 500 sample proportions
20             __________                            __________
70             __________                            __________
200            __________                            __________
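As a point of comparison for the simulated values, the standard textbook results for the sampling distribution of the sample proportion are given below. The worksheet itself does not state them; they assume independent draws with a fixed probability of success p, which is what the binomial simulation generates.

```latex
\mu_{\hat{p}} = p,
\qquad
\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}
```

With p = 0.5 this gives a standard deviation of about 0.1118 for n = 20, 0.0598 for n = 70, and 0.0354 for n = 200. The simulated standard deviations should land near these values, though not exactly on them, because only 500 samples are drawn.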
Refer to Example 3.32 on page 214. We will use Minitab to do the simulation described there. Note: we’ll
define “success” here to be that a person finds clothes shopping to be frustrating.
File > New > Minitab worksheet.
10) To simulate taking a sample of size n = 25 from a population in which 60% of the individuals find clothes
shopping frustrating, do the following:
Calc > Random data > Binomial
Number of trials = 25
Generate 500 rows of data
Probability of success = 0.6
Store in X1
We now have in column ‘X1’ the number of “successes” in each of 500 random samples of size 25 taken
from a population with a 60% “success” rate.
11) To calculate the sample proportions for each of these 500 trials:
Calc > Calculator
Store result in variable: p-hat60%
Expression: X1 / 25
12) To simulate taking a sample of size 25 from a population in which only 28% of the individuals find clothes
shopping frustrating, do the following:
Calc > Random data > Binomial
Number of trials = 25
Generate 500 rows of data
Probability of success = 0.28
Store in X2
We now have in column ‘X2’ the number of “successes” in each of 500 random samples of size 25, but
now taken from a population with 28% “successes”.
13) To calculate the sample proportions for each of these 500 trials:
Calc > Calculator
Store result in variable: p-hat28%
Expression: X2 / 25
14) To simulate taking a sample of size 25 from a population in which 75% of the individuals find clothes
shopping frustrating, do the following:
Calc > Random data > Binomial
Number of trials = 25
Generate 500 rows of data
Probability of success = 0.75
Store in X3
We now have in column ‘X3’ the number of “successes” in each of 500 random samples of size 25, but
now taken from a population with 75% “successes”.
15) To calculate the sample proportions for each of these 500 trials:
Calc > Calculator
Store result in variable: p-hat75%
Expression: X3 / 25
16) Now we will examine the distributions of the 500 sample proportions for samples of size 25, taken from
populations with 60%, 28%, and 75% as the population proportion.
Make three histograms:
Graph > Histogram
Graph the variables: p-hat60%, p-hat28%, p-hat75%
Click on Multiple Graphs.
Select ‘Same X’ and ‘Same Y’.
Describe the distributions. Where are they centered?
Use Stat > Basic Statistics > Descriptive Statistics to find the means and the standard deviations.
What are the means?
What are the standard deviations?
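Steps 10 to 16 can be sketched in Python as well. The seed and variable names below are hypothetical, and Python's random module stands in for Minitab's generator, so this is only a rough cross-check of where each distribution is centered when the sample size stays at 25 and the population proportion changes.

```python
import random
import statistics

rng = random.Random(3)  # hypothetical seed, for reproducibility only
N = 25                  # sample size, fixed in steps 10 to 16
REPS = 500              # number of simulated samples per population

summary = {}
for p in (0.60, 0.28, 0.75):
    # Each entry is one sample's proportion of "successes" (count / 25).
    p_hat = [sum(rng.random() < p for _ in range(N)) / N for _ in range(REPS)]
    summary[p] = (statistics.mean(p_hat), statistics.stdev(p_hat))

for p, (mean, sd) in summary.items():
    print(f"p = {p:.2f}: mean = {mean:.4f}, std dev = {sd:.4f}")
```

Each printed mean should sit close to the population proportion that generated it.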
File > New > Minitab worksheet.
17) Repeat #s 10 – 16 for samples of size 250 (so your Number of trials will be 250 instead of 25), and for
samples of size 800. When you calculate p-hat each time, be sure to divide by the new value of n.
Compare the histograms (using the same scale).
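The full set of nine simulations (three sample sizes by three population proportions) can be sketched in one Python loop. This is an illustration only: the seed and names are hypothetical, and the last column uses the standard textbook formula sqrt(p(1 - p)/n) for the standard deviation of a sample proportion, which the worksheet does not state.

```python
import math
import random
import statistics

rng = random.Random(4)  # hypothetical seed, for reproducibility only
REPS = 500              # number of simulated samples per combination

rows = []
for n in (25, 250, 800):
    for p in (0.60, 0.28, 0.75):
        # Divide each count by the current n, as step 17 reminds us to do.
        p_hat = [sum(rng.random() < p for _ in range(n)) / n for _ in range(REPS)]
        theory = math.sqrt(p * (1 - p) / n)  # textbook value, shown for comparison
        rows.append((n, p, statistics.mean(p_hat), statistics.stdev(p_hat), theory))

print(f"{'n':>4} {'p':>5} {'mean':>7} {'sd':>7} {'theory':>7}")
for n, p, mean, sd, theory in rows:
    print(f"{n:>4} {p:>5.2f} {mean:>7.4f} {sd:>7.4f} {theory:>7.4f}")
```

Reading down the output, the means stay near p at every sample size while the standard deviations shrink as n grows.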
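18) Fill in the following:

Sample Size = 25
Pop’n proportion p    Mean of the 500 sample proportions    Std dev of the 500 sample proportions
60%                   __________                            __________
28%                   __________                            __________
75%                   __________                            __________

Sample Size = 250
Pop’n proportion p    Mean of the 500 sample proportions    Std dev of the 500 sample proportions
60%                   __________                            __________
28%                   __________                            __________
75%                   __________                            __________

Sample Size = 800
Pop’n proportion p    Mean of the 500 sample proportions    Std dev of the 500 sample proportions
60%                   __________                            __________
28%                   __________                            __________
75%                   __________                            __________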
19) Use your results from #9 and #18 to explain the idea that the sample proportion p-hat is an unbiased
estimator of p, the population proportion.
20) Use your results from #9 and #18 to explain how sample size affects the variability of the estimate p-hat.
21) Explain why an estimate from a larger sample is better than an estimate from a smaller sample. Which is
more likely to give “reliable” results? Why?