Chapter 7
Confidence Intervals and Sample Size
The inferences that were discussed in chapters 5 and 6 were
based on the assumption of an a priori hypothesis that the
researcher had about a population. However, there are times
when the researchers do not have a hypothesis. In such cases
they would simply like a good estimate of the parameter.
Point estimate – a single statistic used to estimate the parameter
Confidence interval – a point estimate ± margin of error
Chapter 7 - Page 183
The logic behind the creation of confidence intervals can be
demonstrated using the empirical rule, otherwise known as the
68-95-99.7 rule.
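As a quick check of the rule, the three percentages can be recomputed from the standard normal distribution. This is a minimal Python sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `central_area` is ours and is not part of these notes.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def central_area(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviations: {central_area(k):.4f}")
```

The middle value is close to, but not exactly, 95%, which is why a more precise critical value than 2 is used for a 95% confidence interval.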
To determine a more precise critical value, use the
probabilities in the z table to find the z-value.
The critical value is found by first determining the area in
one tail. The area in the left tail ($A_L$) is found by subtracting
the degree of confidence from 1 and then dividing this by 2:

$$A_L = \frac{1 - \text{degree of confidence}}{2}$$

For example, substituting into the formula for a 95% confidence
interval produces

$$A_L = \frac{1 - 0.95}{2} = 0.025$$

The critical z value for an area to the left of 0.025 is –1.96.
| Z | 0.09 | 0.08 | 0.07 | 0.06 | 0.05 | 0.04 | 0.03 | 0.02 | 0.01 | 0.00 |
|---|---|---|---|---|---|---|---|---|---|---|
| -2.2 | 0.0110 | 0.0113 | 0.0116 | 0.0119 | 0.0122 | 0.0125 | 0.0129 | 0.0132 | 0.0136 | 0.0139 |
| -2.1 | 0.0143 | 0.0146 | 0.0150 | 0.0154 | 0.0158 | 0.0162 | 0.0166 | 0.0170 | 0.0174 | 0.0179 |
| -2.0 | 0.0183 | 0.0188 | 0.0192 | 0.0197 | 0.0202 | 0.0207 | 0.0212 | 0.0217 | 0.0222 | 0.0228 |
| -1.9 | 0.0233 | 0.0239 | 0.0244 | 0.0250 | 0.0256 | 0.0262 | 0.0268 | 0.0274 | 0.0281 | 0.0287 |
| -1.8 | 0.0294 | 0.0301 | 0.0307 | 0.0314 | 0.0322 | 0.0329 | 0.0336 | 0.0344 | 0.0351 | 0.0359 |
| -1.7 | 0.0367 | 0.0375 | 0.0384 | 0.0392 | 0.0401 | 0.0409 | 0.0418 | 0.0427 | 0.0436 | 0.0446 |

The critical z value of –1.96 is also called the 2.5th percentile.
That means that 2.5% of all possible statistics are below that
value.
Critical values can also be found using a TI 84 calculator. Use
2nd Distr, #3 invNorm(percentile, μ, σ). For example
invNorm(0.025,0,1) gives –1.95996 which rounds to –1.96.
Find the critical z values for the following.

| Degree of Confidence | Area in Left Tail | z* |
|---|---|---|
| 0.90 | | |
| 0.95 | 0.025 | 1.96 |
| 0.99 | | |
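The same lookup can be sketched in software. The Python example below assumes Python 3.8 or later, where `statistics.NormalDist().inv_cdf` plays the role of invNorm with mean 0 and standard deviation 1. The function name `critical_z` is ours, chosen for illustration.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Return (area in left tail, positive critical value z*)."""
    area_left = (1 - confidence) / 2
    # inv_cdf returns the negative percentile, e.g. about -1.96 for 0.025.
    z_left = NormalDist().inv_cdf(area_left)
    return area_left, abs(z_left)

for confidence in (0.90, 0.95, 0.99):
    area_left, z_star = critical_z(confidence)
    print(f"confidence {confidence:.2f}: A_L = {area_left:.3f}, z* = {z_star:.3f}")
```

Because the normal distribution is symmetric, the positive z* is reported even though the left-tail percentile itself is negative.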
Confidence intervals for means require a critical value, t*,
which is found on the t tables. These critical values are
dependent upon both the degree of confidence and the sample
size, or more precisely, the degrees of freedom.
For example, the t* value for a 95% confidence interval with
7 degrees of freedom is 2.365.
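| One Tail Probability | 0.4 | 0.25 | 0.1 | 0.05 | 0.025 | 0.01 | 0.005 | 0.0005 |
|---|---|---|---|---|---|---|---|---|
| Two Tail Probability | 0.8 | 0.5 | 0.2 | 0.1 | 0.05 | 0.02 | 0.01 | 0.001 |
| Confidence Level | 20% | 50% | 80% | 90% | 95% | 98% | 99% | 99.9% |
| df = 5 | 0.267 | 0.727 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 | 6.869 |
| df = 6 | 0.265 | 0.718 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 | 5.959 |
| df = 7 | 0.263 | 0.711 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 | 5.408 |
| df = 8 | 0.262 | 0.706 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 | 5.041 |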
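When no t table is at hand, t* can be approximated numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and then searches for the critical value by bisection. This is one possible approach under our own assumptions (function names, 4000 integration steps, and a search range of 0 to 1000 are all ours); it is not necessarily how the printed table was produced.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=4000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def critical_t(confidence, df):
    """Positive t* that leaves (1 - confidence)/2 in each tail, by bisection."""
    target = 1 - (1 - confidence) / 2
    low, high = 0.0, 1000.0  # assumed search range for t*
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(round(critical_t(0.95, 7), 3))  # compare with the df = 7, 95% entry in the table
```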
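A problem is that the standard error of sampling
distributions includes variables we don’t know:

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

Therefore we have to estimate p and σ. The estimated
standard errors then become:

$$s_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \quad \text{and} \quad s_{\bar{x}} = \frac{s}{\sqrt{n}}$$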
| Parameter | Distribution | Estimated Standard Error |
|---|---|---|
| Proportion for one population, $p$ | Sampling distribution of $\hat{p}$ | $s_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ |
| Difference between proportions for two populations, $p_A - p_B$ | Sampling distribution of $\hat{p}_A - \hat{p}_B$ | $s_{\hat{p}_A - \hat{p}_B} = \sqrt{\frac{\hat{p}_A(1-\hat{p}_A)}{n_A} + \frac{\hat{p}_B(1-\hat{p}_B)}{n_B}}$ |
| Mean for one population or mean difference for dependent data, $\mu$ | Sampling distribution of $\bar{x}$ | $s_{\bar{x}} = \frac{s}{\sqrt{n}}$ |
| Difference between means of two independent populations, $\mu_A - \mu_B$ | Sampling distribution of $\bar{x}_A - \bar{x}_B$ | $s_{\bar{x}_A - \bar{x}_B} = \sqrt{\left[\frac{(n_A-1)s_A^2 + (n_B-1)s_B^2}{n_A + n_B - 2}\right]\left[\frac{1}{n_A} + \frac{1}{n_B}\right]}$ |
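The four estimated standard errors can be written as small functions. This is a minimal Python sketch; the function and argument names are ours, and the two-sample mean version assumes the pooled-variance formula shown in these notes.

```python
import math

def se_proportion(p_hat, n):
    """Estimated standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def se_diff_proportions(p_hat_a, n_a, p_hat_b, n_b):
    """Estimated standard error of the difference between two sample proportions."""
    return math.sqrt(p_hat_a * (1 - p_hat_a) / n_a + p_hat_b * (1 - p_hat_b) / n_b)

def se_mean(s, n):
    """Estimated standard error of a sample mean (or of a mean difference for dependent data)."""
    return s / math.sqrt(n)

def se_diff_means_pooled(s_a, n_a, s_b, n_b):
    """Estimated standard error of the difference between two independent means (pooled variance)."""
    pooled_variance = ((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / (n_a + n_b - 2)
    return math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))

# Illustrative call with made-up values: p-hat = 0.5 from a sample of 100.
print(se_proportion(0.5, 100))
```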
The reasoning process for determining the formulas for the
confidence intervals is the same in all cases.
1. Determine the degree of confidence. The most common
are 95%, 99% and 90%.
2. Use the degree of confidence along with the appropriate
table (z* or t*) to find the critical value.
3. Multiply the critical value times the standard error to find
the margin of error.
4. The confidence interval is the statistic plus or minus the
margin of error.
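The four steps can be sketched as a single function. The example below covers only the z* case and assumes Python 3.8 or later; the function name and the numbers in the usage line (a statistic of 0.40 with a standard error of 0.02) are made up for illustration.

```python
from statistics import NormalDist

def z_confidence_interval(statistic, standard_error, confidence=0.95):
    # Step 1: the degree of confidence is supplied by the caller.
    # Step 2: find the critical value z* for that degree of confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Step 3: margin of error = critical value times standard error.
    margin = z_star * standard_error
    # Step 4: the interval is the statistic plus or minus the margin of error.
    return statistic - margin, statistic + margin

# Hypothetical inputs, not taken from the notes.
low, high = z_confidence_interval(0.40, 0.02, confidence=0.95)
print(f"({low:.3f}, {high:.3f})")
```

For means, the same steps apply with t* in place of z*.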
Notice that all the confidence intervals have the same
format, even though some look more difficult than others.
statistic ± margin of error
statistic ± critical value × estimated standard error
Confidence intervals about the proportion for one
population:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Confidence intervals for the difference in proportions
between two populations, where $\hat{q} = 1 - \hat{p}$:

$$(\hat{p}_A - \hat{p}_B) \pm z^* \sqrt{\frac{\hat{p}_A \hat{q}_A}{n_A} + \frac{\hat{p}_B \hat{q}_B}{n_B}}$$

Confidence intervals for the mean for one population:

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

Confidence interval for the difference between two
independent means:

$$(\bar{x}_A - \bar{x}_B) \pm t^* \sqrt{\left[\frac{(n_A-1)s_A^2 + (n_B-1)s_B^2}{n_A + n_B - 2}\right]\left[\frac{1}{n_A} + \frac{1}{n_B}\right]}$$

where t* is the appropriate percentile from the t(nA+nB-2)
distribution.
| Case | Confidence interval | df | Assumptions | Calc |
|---|---|---|---|---|
| 1-sample, proportions (for categorical data) | $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ | | $np \ge 5$, $n(1-p) \ge 5$ | A: 1-PropZInt |
| 1-sample, means (for quantitative data) | $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ | $n - 1$ | If n < 30, population is approximately normally distributed. | 8: TInterval |
| 2-samples, proportions (for categorical data) | $(\hat{p}_A - \hat{p}_B) \pm z^* \sqrt{\frac{\hat{p}_A \hat{q}_A}{n_A} + \frac{\hat{p}_B \hat{q}_B}{n_B}}$ | | $np \ge 5$, $n(1-p) \ge 5$ for both populations | B: 2-PropZInt |
| 2-samples, means (for quantitative data) | $(\bar{x}_A - \bar{x}_B) \pm t^* \sqrt{\left[\frac{(n_A-1)s_A^2 + (n_B-1)s_B^2}{n_A + n_B - 2}\right]\left[\frac{1}{n_A} + \frac{1}{n_B}\right]}$ | $n_A + n_B - 2$ | If n < 30, population is approximately normally distributed. | 0: 2-SampTInt |
What does a confidence interval mean? For a 95%
confidence interval, 95% of all possible statistics are within z* (or
t*) standard errors of the mean of the sampling distribution.
Therefore, there is a 95% probability that randomly selected data
will produce one of those statistics, in which case the confidence
interval that is created will contain the parameter. Whether a
particular interval actually includes the parameter is unknown. We
only know that if the sampling process were repeated a large
number of times, producing many confidence intervals, about 95%
of them would contain the parameter.
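This repeated-sampling interpretation can be illustrated with a small simulation. The Python sketch below draws many random samples from a population whose proportion is known, builds a 95% interval from each, and counts how often the interval contains the true value. The population proportion (0.3), sample size (400), number of trials (2000), and seed are arbitrary choices of ours.

```python
import math
import random
from statistics import NormalDist

def coverage(true_p=0.3, n=400, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated one-proportion intervals that contain true_p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and compute its sample proportion.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land near 0.95, though it will vary somewhat with the seed and the number of trials.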
1. Find the 95% confidence interval of households prepared for a
natural disaster.
Assume that a random sample of 900 households was taken.
Of these, 98 claimed they are prepared. Can we conclude that
more than 10% are prepared?
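One way to work this problem is sketched below in Python, using the one-proportion formula $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$. The variable names are ours. The final check reflects one reading of the question: the claim "more than 10%" is supported only if the whole interval lies above 0.10.

```python
import math
from statistics import NormalDist

x, n = 98, 900                       # households prepared, sample size
p_hat = x / n                        # sample proportion
z_star = NormalDist().inv_cdf(0.975) # critical value for 95% confidence
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p-hat = {p_hat:.4f}, margin of error = {margin:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")

# The claim "more than 10%" needs the entire interval to be above 0.10.
supports_more_than_10_percent = lower > 0.10
print("Conclude more than 10% are prepared?", supports_more_than_10_percent)
```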
2. Find the 90% confidence interval for the difference between the
proportion of households in tornado/hurricane areas prepared for a
disaster and the proportion of households in earthquake areas.
Assume a random sample is taken from both populations.
In tornado/hurricane country, 122 of 800 households are prepared.
In earthquake country, 98 of 900 households are prepared.
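A possible calculation for this problem, sketched in Python with the two-proportion formula. Population A is taken to be the tornado/hurricane areas and B the earthquake areas; that labeling and the variable names are ours.

```python
import math
from statistics import NormalDist

x_a, n_a = 122, 800   # tornado/hurricane areas: prepared, sample size
x_b, n_b = 98, 900    # earthquake areas: prepared, sample size
p_a, p_b = x_a / n_a, x_b / n_b

z_star = NormalDist().inv_cdf(0.95)  # 90% confidence leaves 0.05 in each tail
standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
margin = z_star * standard_error

diff = p_a - p_b
lower, upper = diff - margin, diff + margin
print(f"difference = {diff:.4f}, margin of error = {margin:.4f}")
print(f"90% CI: ({lower:.4f}, {upper:.4f})")
```

If the interval lies entirely above 0, the data suggest a higher proportion of prepared households in the tornado/hurricane areas.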
3. Find the 99% confidence interval for the average daily caloric
intake of US residents.
Mean = 3250, SD = 600, n = 18
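A sketch of the calculation in Python, using $\bar{x} \pm t^* s/\sqrt{n}$ with df = n – 1 = 17. The critical value 2.898 is read from a standard t table at df = 17 and 99% confidence; the excerpt in these notes stops at df = 8, so that value is an outside lookup. Because n < 30, the interval also assumes the population is approximately normally distributed.

```python
import math

x_bar, s, n = 3250, 600, 18
df = n - 1
t_star = 2.898  # assumed: 99% confidence, df = 17, from a standard t table

standard_error = s / math.sqrt(n)
margin = t_star * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"df = {df}, standard error = {standard_error:.2f}, margin of error = {margin:.1f}")
print(f"99% CI: ({lower:.1f}, {upper:.1f}) calories")
```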
4. Find the 95% confidence interval for the mean difference
between the daily caloric intake of a person during a diet
and prior to the diet.

| Subject | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Before calories | 3820 | 3550 | 2840 | 4280 | 2960 | 2540 |
| During calories | 3760 | 3650 | 2530 | 3460 | 2960 | 2530 |
| During – Before | -60 | 100 | -310 | -820 | 0 | -10 |
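Because each subject is measured twice, the data are dependent, so the interval is built on the six differences with df = n – 1 = 5. The Python sketch below follows that approach; t* = 2.571 is the df = 5, 95% entry of the t table in these notes, and the variable names are ours.

```python
import math
from statistics import mean, stdev

before = [3820, 3550, 2840, 4280, 2960, 2540]
during = [3760, 3650, 2530, 3460, 2960, 2530]

# Difference for each subject: during minus before.
diffs = [d - b for d, b in zip(during, before)]
n = len(diffs)

d_bar = mean(diffs)   # mean of the differences
s_d = stdev(diffs)    # sample standard deviation of the differences
t_star = 2.571        # 95% confidence, df = 5, from the t table

margin = t_star * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"mean difference = {d_bar:.1f}, s = {s_d:.1f}")
print(f"95% CI: ({lower:.1f}, {upper:.1f}) calories")
```

An interval that contains 0 means the data do not show a clear change in average intake during the diet.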
5. Find the 99% confidence interval for the difference in mean daily
caloric intake of Canadian residents and Americans.
The table below shows the mean, standard deviation and sample
size for the two samples.

| Units: calories/day | Canadians | Americans |
|---|---|---|
| Mean | 2950 | 3250 |
| Standard Deviation | 550 | 600 |
| Sample size, n | 14 | 18 |
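A sketch of this calculation in Python, using the pooled standard error for two independent means with df = nA + nB – 2 = 30. The critical value 2.750 is read from a standard t table at df = 30 and 99% confidence, which is outside the excerpt in these notes. The difference is taken as Canadians minus Americans; that ordering is our choice.

```python
import math

mean_a, s_a, n_a = 2950, 550, 14   # Canadians
mean_b, s_b, n_b = 3250, 600, 18   # Americans

df = n_a + n_b - 2
t_star = 2.750  # assumed: 99% confidence, df = 30, from a standard t table

# Pooled variance, then the standard error of the difference in means.
pooled_variance = ((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / df
standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
margin = t_star * standard_error

diff = mean_a - mean_b
lower, upper = diff - margin, diff + margin
print(f"df = {df}, standard error = {standard_error:.1f}, margin of error = {margin:.1f}")
print(f"99% CI: ({lower:.1f}, {upper:.1f}) calories")
```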
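Sample Size Estimation
The margin of error for a proportion is

$$E = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Solving for n, and using $\hat{p} = 0.5$ because that makes
$\hat{p}(1-\hat{p})$ as large as it can be (0.25), gives

$$n = \frac{0.25\, z^{*2}}{E^2} \quad \text{or} \quad n = \frac{z^{*2}}{4E^2}$$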
Example 2. Estimate the sample size needed for a national
presidential poll if the desired margin of error is 3%. Assume 95%
degree of confidence.
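The sample size formula can be sketched in Python as follows. The function name and the `p_guess` argument are ours; the default of 0.5 matches the conservative choice behind $n = z^{*2}/(4E^2)$. The result is rounded up, since a fraction of a respondent cannot be sampled.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most the desired value."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z_star**2 * p_guess * (1 - p_guess) / margin_of_error**2
    return math.ceil(n)

# Desired margin of error of 3% at a 95% degree of confidence.
print(sample_size_for_proportion(0.03, confidence=0.95))
```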