USE OF THE t DISTRIBUTION

• Footnote: Who was “Student”? A pseudonym for
William Gosset
• The t is often thought of as a small-sample
technique
• But, STRICTLY SPEAKING, the t should be
used whenever the population standard
deviation σ is NOT KNOWN
• Some practitioners use z whenever the sample
is large
– Central Limit Theorem
– There isn’t much difference between t and z
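• Population standard deviation known? Yes
– Population normal? Yes: z value
– Population normal? No: check the sample size
   n >= 30: z or t (see note)
   n < 30: ERROR
• Population standard deviation known? No
– Population normal? Yes: t value
– Population normal? No: check the sample size
   n >= 30: z or t (see note)
   n < 30: ERROR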
Notes:
• For large samples with σ unknown, different
practitioners may proceed differently. Some
argue for using a z, appealing to the CLT. Others
use a t since it gives a slightly wider, more
conservative interval.
For this course: use a t whenever the
population standard deviation is not known.
• Small samples from non-normal populations are
beyond the scope of this course
Confidence intervals for the
population proportion π
• Sample proportion p = x/n
• E(p) = π and
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
In general π is not known, so it must be estimated
with p, and we use
$s_p = \sqrt{\frac{p(1-p)}{n}}$
• Then the confidence interval is
• $p \pm z_C \cdot s_p$
• Note that proportion problems always use
a z value
– Normal approximates binomial
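The claim that the normal approximates the binomial can be checked numerically. The sketch below compares an exact binomial probability with its normal approximation. The values n = 112, π = 0.625 and the cutoff k = 75 are assumptions chosen only for illustration, and the helper name `binom_cdf` is hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

# Illustrative values only (assumed for this sketch)
n, pi_true = 112, 0.625

def binom_cdf(k, n, pi_true):
    """Exact P(X <= k) for X ~ Binomial(n, pi_true)."""
    return sum(comb(n, i) * pi_true**i * (1 - pi_true)**(n - i)
               for i in range(k + 1))

# Normal distribution with the same mean and standard deviation as the count X
approx = NormalDist(mu=n * pi_true, sigma=sqrt(n * pi_true * (1 - pi_true)))

k = 75
exact_prob = binom_cdf(k, n, pi_true)
normal_prob = approx.cdf(k + 0.5)  # +0.5 is the usual continuity correction
print(f"exact = {exact_prob:.4f}, normal approximation = {normal_prob:.4f}")
```

With a sample this large and π not too close to 0 or 1, the two probabilities should be close, which is why a z value is used for proportions.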
• EXAMPLE: Of 112 students in a sample,
70 have paying jobs. Calculate a 95%
confidence interval for the proportion in the
population with paying jobs.
• p = 70/112 = 0.625
• $s_p = \sqrt{\frac{0.625 \times 0.375}{112}} = 0.045745315$
• 0.625 ± 1.96 × 0.045745315
• 0.625 ± 0.089660819 or 0.625 ± 0.09
• We are 95% confident that
0.54 ≤ π ≤ 0.71
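The same calculation can be sketched in a few lines of Python. The function name `proportion_ci` and its return layout are assumptions made for this illustration. It takes the critical value from the standard normal rather than the rounded 1.96, so the margin differs from the hand calculation only in the later decimals.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Return (p, s_p, margin, lower, upper) for a z-based interval."""
    p = x / n                           # sample proportion
    s_p = sqrt(p * (1 - p) / n)         # estimated standard error of p
    # two-sided critical value z_C for the requested confidence level
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_c * s_p
    return p, s_p, margin, p - margin, p + margin

# The paying-jobs example: 70 of 112 students
p, s_p, margin, lower, upper = proportion_ci(70, 112)
print(f"p = {p:.3f}, s_p = {s_p:.6f}")
print(f"{p:.3f} ± {margin:.3f}  ->  ({lower:.2f}, {upper:.2f})")
```

Passing a different `confidence` (for example 0.90) changes only the critical value; the rest of the calculation is unchanged.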
• EXAMPLE:
• In a sample of 320 professional
economists, 251 agreed that “offshoring”
jobs is good for the American economy.
Calculate a 90% confidence interval for
the proportion in the population of
professional economists who hold this
view.
Finding the Right Sample Size
• The error in the estimate is given by
$e = z_C \cdot \sigma_p$ or, substituting
$e = z_C \sqrt{\frac{\pi(1-\pi)}{n}}$
Solving for n yields:
$n = \frac{z_C^2 \cdot \pi(1-\pi)}{e^2}$
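The algebra behind "solving for n" is short. Written out step by step (square both sides, then isolate n):

```latex
\begin{aligned}
e   &= z_C \sqrt{\frac{\pi(1-\pi)}{n}} \\
e^2 &= z_C^2 \cdot \frac{\pi(1-\pi)}{n} \\
n   &= \frac{z_C^2 \cdot \pi(1-\pi)}{e^2}
\end{aligned}
```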
• In general π is not known
• Two solutions:
– Assume π = 0.5
• Result is the largest sample that would ever be
needed
– Conduct a pilot study and use the resulting p
as an estimate of π
• May give a somewhat smaller sample size if p is
much different from 0.5
• Saves sampling cost
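Both approaches use the same formula and differ only in the value plugged in for π. A minimal sketch follows. The function name `sample_size` and its parameters are assumptions for this illustration; the 3% maximum error and the pilot value 0.625 are taken from the example that follows. The result is always rounded up, since a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist

def sample_size(max_error, pi_guess=0.5, confidence=0.95, z_c=None):
    """Smallest n for which z_C * sqrt(pi(1-pi)/n) <= max_error."""
    if z_c is None:
        # two-sided critical value for the requested confidence level
        z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z_c ** 2 * pi_guess * (1 - pi_guess) / max_error ** 2
    # round first so floating-point noise cannot push an exact integer up by one
    return ceil(round(n, 9))

# z_c=1.96 matches the rounded table value used in hand calculations
n_worst_case = sample_size(0.03, 0.5, z_c=1.96)   # approach 1: assume pi = 0.5
n_pilot = sample_size(0.03, 0.625, z_c=1.96)      # approach 2: pilot estimate
print(n_worst_case, n_pilot)
```

Because π(1 − π) is largest at π = 0.5, the first call can never return a smaller n than the second.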
Example:
• Above we had a 95% confidence interval with n =
112 of 0.625 ± 0.09 or a 9% error. Suppose we
require a maximum error of 3%.
• Approach 1: let π = 0.5
$n = \frac{1.96^2 \times 0.5 \times 0.5}{0.03^2} = 1067.11$, rounded up to 1068
• Approach 2: assume π = 0.625
$n = \frac{1.96^2 \times 0.625 \times 0.375}{0.03^2} = 1000.42$, rounded up to 1001
The difference is more dramatic if p is much
different from 0.5. In a random sample of 300
students in NC, 30 have experienced “study”
abroad. A 95% confidence interval for the
population proportion is 10% ± 3.4%. Suppose we
require a maximum error of 2%. Approach 1 gives
_______ and approach 2 gives _________.