Lab Handout 5

Chapter 5
Confidence Intervals
Course Website: www.math.mun.ca/~sneddon/st2500
These handouts are modifications of lab notes prepared by Lauren Granter.
In this session, we will study some examples to see how Minitab calculates confidence
intervals for a population mean and proportion. We’ll also do some numerical studies to
illustrate the interpretation of confidence intervals. As with sampling distributions, this will
be a numerical study of the Big Pot Theory discussed in class.
5.1 Confidence Intervals for Population Mean
Recall from class that we want to construct confidence intervals (CI) for the population mean
(µ) in two situations: when the sample size is large (n ≥ 30) and when the sample size is
small (n < 30).
5.1.1 CI for Population Mean: Large Sample Size
EX: The U.S. Commerce Dept. is interested in the average house price of all new houses
sold in the U.S. They selected a random sample of 345 homes, and found their average was
$201,400. If we assume the standard deviation of the price of all new homes sold is $38,000,
find a 95% confidence interval for the mean house price of all new homes sold.
In this case, we have a large sample size, so we can use this CI for the mean, as discussed
in class:
$$\bar{x} \pm z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right)$$
where zα/2 is the value on the standard normal curve with area of α/2 to its right, and n is
our sample size.
We do this in Minitab as follows:
1. Select Stat–Basic Statistics–1 Sample Z
2. Select Summarized Data, and enter 345 for Sample size and 201400 for mean.
3. Enter 38000 in the Standard deviation box.
4. Leave Test Mean blank.
5. Select OK.
The output is shown below:
One-Sample Z
The assumed standard deviation = 38000
  N    Mean  SE Mean            95% CI
345  201400     2046  (197390, 205410)
So we are 95% confident the mean selling price is between $197,390 and $205,410.
By default, Minitab found a 95% CI for the population mean, so α = 0.05. If we wanted
a 90% CI (or any other confidence level), we add the following step:
• Click Options, and enter the confidence level we want (say 90) in the Confidence
level box.
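The Minitab result can be cross-checked by applying the formula directly. The following is a minimal sketch in Python, which is not part of the lab; the function name `z_interval` is made up for illustration. It uses only the numbers given in the house price example and shows how changing the confidence level changes the interval.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for a CI on the mean when sigma is known."""
    alpha = 1 - confidence
    # z_{alpha/2}: standard normal value with area alpha/2 to its right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin


# House price example: n = 345, sample mean 201400, sigma 38000
low95, high95 = z_interval(201400, 38000, 345, confidence=0.95)
low90, high90 = z_interval(201400, 38000, 345, confidence=0.90)
print(f"95% CI: ({low95:.0f}, {high95:.0f})")
print(f"90% CI: ({low90:.0f}, {high90:.0f})")
```

The 95% interval should agree with the Minitab output above after rounding, and the 90% interval should be narrower.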
5.1.2 CI for Population Mean: Small Sample Size
EX: The data file pallet.mtw on the course website contains a sample of the weights of
wooden pallets of 2 types of shingles (“Boston” and “Vermont”).
1. Find a 90% CI for the mean weight of the Boston shingles pallets.
2. Find a 95% CI for the mean weight of the Vermont shingles pallets.
3. Evaluate whether the assumption needed for (1) and (2) has been seriously violated.
We’ll go through (1), and you can work on (2) and (3) on your own.
1. To find the 90% CI for a small sample size, first get the data from the course
webpage into your Minitab worksheet.
In this case, we have n < 30, so we need to use
$$\bar{x} \pm t_{n-1} \left( \frac{s}{\sqrt{n}} \right)$$
where t_{n−1} is the value from the T-distribution with (n − 1) degrees of freedom (df)
that has area α/2 to its right, and s is the sample standard deviation.
Then do the following in Minitab:
(a) Select Stat–Basic Statistics–1 Sample t
(b) Select the Boston column for Samples in columns.
(c) Leave Test mean blank.
(d) Select Options and change Confidence Level to 90.
(e) Select OK.
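For comparison, the same kind of interval can be computed outside Minitab. The sketch below is in Python and is not part of the lab. The weights are hypothetical numbers chosen for illustration, NOT the values in pallet.mtw, and the helper names (`t_pdf`, `t_cdf`, `t_critical`, `t_interval`) are made up here. The t critical value is found numerically with the standard library only, which is adequate for small df but is not how Minitab computes it.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Value with area (1 - confidence)/2 to its right, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(data, confidence=0.90):
    """CI for the mean using the sample standard deviation s."""
    n = len(data)
    xbar, s = mean(data), stdev(data)  # stdev uses the n - 1 divisor
    margin = t_critical(confidence, n - 1) * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical weights for illustration only (NOT the data in pallet.mtw)
weights = [3100, 3085, 3120, 3095, 3110, 3070, 3105, 3090, 3115, 3080]
low, high = t_interval(weights, confidence=0.90)
print(f"90% CI: ({low:.1f}, {high:.1f})")
```

Replacing `weights` with the Boston column from the worksheet should give an interval close to the one Minitab reports.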
Enter the answer from Minitab in the space below.
2. Find a 95% CI for the mean weight of the Vermont shingles pallets.
Enter your answer below:
3. Evaluate whether the assumption needed for (1) and (2) has been seriously violated.
Enter your answer below:
5.2 Confidence Interval for Population Proportion
EX: A study of 828 travellers showed that 567 of them purchased plane tickets on an airline
website in the past 12 months. Find a 96% confidence interval for the proportion of all
travellers that have purchased plane tickets on an airline website in the past 12 months.
Here we are looking at a proportion of people that have a certain characteristic. The
formula for a CI for a population proportion p is
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
where p̂ is the sample proportion.
In this example, our sample size is n = 828 and x = 567, which is the number of people
with the characteristic we are interested in. So our sample proportion is p̂ = 567/828 = 0.685.
We can find our CI for p in Minitab as follows:
1. Select Stat–Basic Statistics–1 Proportion
2. Select Summarized data and enter 828 after Number of trials and 567 after Number of events
3. Select Options, enter 96 for Confidence Level, and choose Use test and interval
based on normal distribution. Click OK.
4. Click OK.
The output is as follows:
Test and CI for One Proportion
Test of p = 0.5 vs p not = 0.5
Sample    X    N  Sample p                96% CI  Z-Value  P-Value
1       567  828  0.684783  (0.651623, 0.717943)    10.63    0.000
NOTE: Minitab uses Sample p instead of p̂.
Rounded to three decimal places, the 96% CI for p is (0.652, 0.718).
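The interval can also be reproduced from the formula for a proportion. This is a minimal sketch in Python, not part of the lab, using only the counts from the travellers example; the function name `proportion_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence=0.95):
    """Return (p_hat, lower, upper) for a normal-based CI on a proportion."""
    p_hat = x / n
    # z_{alpha/2}: standard normal value with area alpha/2 to its right
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Travellers example: 567 of 828 bought tickets on an airline website
p_hat, low, high = proportion_interval(567, 828, confidence=0.96)
print(f"Sample p = {p_hat:.6f}")
print(f"96% CI: ({low:.6f}, {high:.6f})")
```

The printed values should agree with the Sample p and 96% CI columns in the Minitab output.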
5.3 Interpreting a CI for µ
As we discussed in class, the formal interpretation of a confidence interval follows the Big
Pot Theory: if we can draw lots of samples, and create CI’s for each of these samples, we
would expect a certain percentage of them (90%, 99%, or whatever confidence level we’re
using) to contain the true value of µ.
Let’s see if we can get Minitab to illustrate this result numerically.
First, get Minitab to select 100 different samples, each of size n = 40, from a normal
distribution with µ = 10 and σ = 2. (We don't have to use data from the normal
distribution, though.) We do this as follows:
1. Select Calc–Random Data–Normal
2. Enter 40 in the space after Generate
3. Type C1–C100 in the space below Store in column
4. Set Mean = 10 and Standard deviation = 2.
5. Select OK.
The 100 samples are in columns C1–C100. For each sample, we want to find a 90%
confidence interval for µ. We do this as before:
1. Select Stat–Basic Statistics–1 Sample Z.
2. Choose Samples in Columns, and enter C1–C100 in the box.
3. Enter Standard deviation as 2.
4. Under Options, enter 90 for Confidence Level and hit OK.
5. Select OK.
A portion of the output I got (your answers will not be the same) is below:
One-Sample Z: C1, C2, C3, C4, C5, C6, C7, C8, ...
The assumed standard deviation = 2
Variable   N     Mean    StDev  SE Mean               90% CI
C1        40  9.79757  2.03591  0.31623  (9.39216, 10.20298)
C2        40  9.82073  1.79016  0.31623  (9.41532, 10.22614)
...
The last column contains the 100 CI’s for µ that Minitab found (one CI for each sample
of 40 observations).
Theory says: In this case we know that µ = 10 (that's what we told Minitab when we
created the samples). The theory says that about 90% of the CI's created this way should
contain µ = 10. For example, the interval (9.39216, 10.20298) contains µ = 10, since 10
falls between 9.39 and 10.20.
Take a look at the 100 intervals you have. Write down below how many of them do not
contain µ = 10.
Then the number that do contain µ = 10 is (100 – above result). Write down this value:
Out of your 100 intervals, how many (theoretically) should contain µ = 10?
How do your theoretical and actual results compare?
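The same experiment can be repeated outside Minitab. The sketch below is in Python and is not part of the lab. It draws 100 samples of size 40 from a normal distribution with µ = 10 and σ = 2, builds a 90% interval from each using the known σ, and counts how many contain µ. The seed is an arbitrary choice made here so that a run can be repeated; a different seed gives a different count, just as your Minitab results differ from your neighbour's.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

MU, SIGMA, N, NUM_SAMPLES, CONFIDENCE = 10, 2, 40, 100, 0.90

rng = random.Random(2500)  # arbitrary seed so the run can be repeated
z = NormalDist().inv_cdf(1 - (1 - CONFIDENCE) / 2)
margin = z * SIGMA / sqrt(N)  # same for every sample: sigma and n are fixed

covered = 0
for _ in range(NUM_SAMPLES):
    # One sample of N observations, like one column C1, C2, ... in Minitab
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = mean(sample)
    if xbar - margin <= MU <= xbar + margin:
        covered += 1

print(f"{covered} of {NUM_SAMPLES} intervals contain mu = {MU}")
```

The count will usually be near 90 but is rarely exactly 90, since the confidence level describes the long-run percentage over many samples rather than any single batch of 100.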