Uploaded by Kareem Hamdi

Topic 8 - updated

VIII. INTERVAL ESTIMATION
CHAPTER 8
Learning Objectives
1. Define a point estimate.
2. Define a confidence level.
3. Construct a confidence interval for the population mean when the
population standard deviation is known.
4. Construct a confidence interval for the population mean when the
population standard deviation is unknown.
5. Construct a confidence interval for the population proportion.
6. Determine the finite-population correction factor.
7. Calculate the required sample size to estimate a population
proportion or population mean.
Point Estimates
A point estimate is a single statistic, computed from
sample information, that is used to estimate a population
parameter.
Example: To estimate the average starting salary
of recent graduates from your university, μ, the university
takes a random sample of 100 recent graduates and
computes the sample mean, x̄.
Examples of Point Estimates
Below are examples of population parameters and the sample
statistics that are computed to obtain a point estimate of
the population parameters.
Population Parameter
Sample Statistic
Confidence Levels
A point estimate only tells part of the story. While we
expect the point estimate to be close to the population
parameter, we would like to measure how close it really is.
A confidence interval serves this purpose.
A confidence interval is a range of values constructed
from sample data so that the population parameter is likely
to occur within that range at a specified probability.
The specified probability is called the level of confidence.
Confidence Intervals
To compute a confidence interval for a population mean,
we will consider two situations:
• We use sample data to estimate μ with x̄, and the
population standard deviation (σ) is known.
• We use sample data to estimate μ with x̄, and the
population standard deviation is unknown. In this case, we
substitute the sample standard deviation (s) for the
population standard deviation (σ).
We first consider the case where σ is known.
Confidence Interval for the Population Mean,
Population Standard Deviation Known
If the population standard deviation is known, and the
population is normally distributed or the sample size is at
least 30, then a confidence interval for the population
mean is given by
\[ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} \]
where z is the z-value for a particular confidence level.
Obtaining the z-value
The area between z = -1.96 and z = +1.96 is 0.95.
Confidence Levels and z-Values
Below are three common confidence levels and their
associated z-value.
Confidence Level    z-Value
90 percent          1.645
95 percent          1.96
99 percent          2.58
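The z-values in the table can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `z_value` is hypothetical and not part of the slides. It finds the z for which the central area between -z and +z equals the confidence level.

```python
from statistics import NormalDist

def z_value(confidence_level: float) -> float:
    """Return z such that the area between -z and +z equals confidence_level.

    Hypothetical helper for illustration; it uses the standard normal
    inverse CDF from the standard library.
    """
    tail_area = (1.0 - confidence_level) / 2.0  # area in each tail
    return NormalDist().inv_cdf(1.0 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_value(level):.3f}")
```

The 99 percent value comes out slightly below 2.58; the table shows it rounded to two decimals.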
Example – Mean Income
A survey company wants to determine the mean income of
middle level employees in the retail industry. A random
sample of 361 employees reveals a sample mean of
$54,520. The standard deviation of this population is $3,060.
The company would like answers to the following
questions:
1. What is the population mean? What is a reasonable
value to use as an estimate of the population mean?
2. What is a reasonable range of values for the population
mean?
3. What do these results mean?
Solution – Mean Income
1. In this case, we do not know the population mean. We
do know the sample mean is $54,520. Hence, our best
estimate of the unknown population value is the
corresponding sample statistic. Thus, the sample mean of
$54,520 is a point estimate of the unknown population
mean.
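2. Suppose the company decides to use the 95 percent
level of confidence:
\[ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 54{,}520 \pm 1.96 \frac{3{,}060}{\sqrt{361}} = 54{,}520 \pm 316 \]
The confidence limits are $54,204 and $54,836. The
margin of error is ±$316.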
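The calculation for the mean income example can be reproduced with a short script. This is a sketch under the slide's assumptions (σ known, large sample); the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence_level=0.95):
    """Confidence interval for a mean when sigma is known (illustrative helper).

    Returns (lower limit, upper limit, margin of error).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # z times the standard error of the mean
    return sample_mean - margin, sample_mean + margin, margin

# Values from the mean income example: x-bar = 54,520, sigma = 3,060, n = 361.
lower, upper, margin = z_interval(54_520, 3_060, 361, 0.95)
print(round(lower), round(upper), round(margin))
```

Rounded to whole dollars, the limits and margin agree with the figures quoted in the solution.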
Solution – Mean Income
3. If we select many samples of 361 employees, and for
each sample we compute the mean and then
construct a 95% confidence interval, we could expect
about 95% of these confidence intervals to contain
the population mean. Conversely, about 5% of the
intervals would not contain the population mean
annual income, µ.
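This repeated-sampling interpretation can be illustrated with a small simulation. The sketch below assumes a normally distributed population and a population mean of $54,500; both are assumptions made only for the demonstration, since the slides do not give the true mean. The function name `coverage_rate` is likewise hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage_rate(true_mean, sigma, n, confidence_level=0.95, trials=2000, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials

# true_mean is an assumed value, used only to generate the simulated samples.
rate = coverage_rate(true_mean=54_500, sigma=3_060, n=361)
print(f"Share of intervals containing the population mean: {rate:.3f}")
```

The printed share should land close to 0.95, with some variation from run to run if the seed is changed.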
Confidence Interval for the Population Mean,
Population Standard Deviation Unknown
In most sampling situations the population standard
deviation (σ) is not known. We can use the sample
standard deviation s to estimate the population standard
deviation.
In this situation we can no longer use the previous
confidence interval formula, and because we do not know
σ we cannot use the z distribution.
To remedy this, we use the sample standard deviation and
replace the z distribution with the t distribution.
Characteristics of the t Distribution
1. It is, like the z distribution, a continuous distribution.
2. It is, like the z distribution, bell-shaped and symmetrical.
3. There is not one t distribution, but rather a “family” of t
distributions. All t distributions have a mean of 0, but their
standard deviations differ according to the sample size, n.
The standard deviation for a t distribution with 5
observations is larger than for a t distribution with 20
observations.
4. The t distribution is more spread out and flatter at the centre
than is the standard normal distribution. As the sample size
increases, however, the t distribution approaches the
standard normal distribution, because the errors in using s
to estimate σ decrease with larger n.
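The claim about differing standard deviations can be made concrete. A t distribution with df degrees of freedom (df > 2) has variance df / (df - 2), a standard result that is not stated on the slides. The sketch below uses the hypothetical helper `t_std_dev` and takes df = n - 1.

```python
from math import sqrt

def t_std_dev(n_observations: int) -> float:
    """Standard deviation of the t distribution with n - 1 degrees of freedom.

    Uses the standard result Var(t) = df / (df - 2), valid only for df > 2.
    """
    df = n_observations - 1
    if df <= 2:
        raise ValueError("the variance is finite only when df > 2")
    return sqrt(df / (df - 2))

for n in (5, 20, 100):
    print(f"n = {n:3d}: standard deviation = {t_std_dev(n):.3f}")
```

The values shrink toward 1, the standard deviation of the standard normal distribution, as n grows.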
[Figure: The Standard Normal Distribution and Student’s t distribution]
Confidence Interval for the Population Mean,
Population Standard Deviation Unknown
To develop a confidence interval for the population mean
with an unknown population standard deviation (σ) we:
1. Assume the sampled population is either normal or
approximately normal.
2. Estimate the population standard deviation (σ) with the
sample standard deviation (s).
3. Use the t-distribution rather than the z-distribution.
Confidence Interval for the Population Mean,
Population Standard Deviation Unknown
If the population standard deviation is unknown and the
population is normally distributed, then a confidence
interval for the population mean is given by
\[ \bar{x} \pm t \frac{s}{\sqrt{n}} \]
where t is the t-value for a particular confidence level and
sample size n.
The t-value is found in the t-table in the Appendix
(or in Excel) with n – 1 degrees of freedom (df).
Determining When to Use the z
Distribution or the t Distribution
Example – Life of Light Bulbs
A bulb manufacturer wishes to investigate the life of its
bulbs. A sample of 10 bulbs in use for 60 days revealed
a sample mean of 0.71 days of life remaining with a
standard deviation of 0.13 days.
Construct a 95% confidence interval for the population
mean. Would it be reasonable for the manufacturer to
conclude that after 60 days the population mean amount of
life remaining is 0.70 days?
Solution – Life of Light Bulbs
Given in the problem:
n = 10
x̄ = 0.71
s = 0.13
Confidence level = 95%
Because σ is unknown, we compute the confidence interval using
the t-distribution.
The t-value is found by looking in the t-table in Appendix B.2 with
n – 1 = 9 degrees of freedom (df) and a 95% confidence level: t = 2.262.
Solution – Life of Light Bulbs
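To determine the confidence interval we substitute the
values into the formula:
\[ \bar{x} \pm t \frac{s}{\sqrt{n}} = 0.71 \pm 2.262 \frac{0.13}{\sqrt{10}} = 0.71 \pm 0.093 \]
The endpoints of the confidence interval are 0.617 and
0.803.
The manufacturer can be reasonably sure (95% confident)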
that the mean remaining life is between 0.617 and
0.803 days. Because the value of 0.70 is in this
interval, it is possible that the mean of the population is
0.70.
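The light bulb calculation can be checked with a few lines of code. Python's standard library has no t distribution, so this sketch takes the t-value as an input; 2.262 is the usual t-table entry for 9 degrees of freedom at 95% confidence. The function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(sample_mean, s, n, t_value):
    """Confidence interval for a mean when sigma is unknown (illustrative helper).

    t_value must be looked up for n - 1 degrees of freedom and the chosen
    confidence level; it is not computed here.
    """
    margin = t_value * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

T_95_DF9 = 2.262  # t-table value: 9 degrees of freedom, 95% confidence

lower, upper = t_interval(sample_mean=0.71, s=0.13, n=10, t_value=T_95_DF9)
print(f"{lower:.3f} to {upper:.3f}")
print("0.70 inside interval:", lower <= 0.70 <= upper)
```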
Confidence Interval for a Population Proportion
Recall that a proportion is the fraction, ratio, or percent
indicating the part of the sample or the population having a
particular trait of interest. Recall that we can determine the
sample proportion with the following formula:
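\[ \bar{p} = \frac{x}{n} \]
where p̄ is the sample proportion, x is the number of sampled items
having the trait, and n is the sample size.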
The sample proportion provides a point estimate of the
population proportion, p.
Confidence Interval for a Population Proportion
To develop a confidence interval for a proportion, we need
to meet the following assumptions.
1. All binomial conditions are met.
2. The values np and n(1 – p) should both be greater than
or equal to 5.
If these assumptions are met, then a confidence interval for
a population proportion is given by
\[ \bar{p} \pm z \sigma_{p} \]
where p̄ is the sample proportion.
Confidence Interval for a Population Proportion
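Since we do not know the value of the population
proportion, we replace σp with the standard error of the
sample proportion, sp:
\[ s_{p} = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]
where p̄ is the sample proportion.
As a result, the confidence interval becomes
\[ \bar{p} \pm z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]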
Example – Dress Code
A company decides to poll its employees on whether to adopt
a dress code. The dress code will be adopted if at
least three-fourths of employees vote in favour of it.
A random sample of 3000 employees reveals that 2600
plan to vote for the dress code.
(a) What is the estimate of the population proportion?
(b) Develop a 95% confidence interval for the population
proportion.
(c) Basing your decision on this sample information, can
you conclude that the necessary proportion of
employees favours the dress code?
Solution – Dress Code
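(a) The sample proportion is
\[ \bar{p} = \frac{x}{n} = \frac{2600}{3000} = 0.867 \]
(b) The 95% C.I. is
\[ 0.867 \pm 1.96 \sqrt{\frac{0.867(1 - 0.867)}{3000}} = 0.867 \pm 0.012 \]
so the endpoints are 0.855 and 0.879.
(c) Conclude that the dress code proposal will pass
because every value in the interval estimate is greater than
75% of the employees.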
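The dress code numbers can be verified with a short script. This is a sketch, not the slides' own method of computation; the function name `proportion_interval` is hypothetical, and it also checks the rule that both counts should be at least 5, using the sample proportion in place of the unknown p.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence_level=0.95):
    """Confidence interval for a population proportion (illustrative helper).

    Returns (sample proportion, lower limit, upper limit).
    """
    p_bar = successes / n
    # Rule of thumb from the slides, applied with the sample proportion.
    if n * p_bar < 5 or n * (1 - p_bar) < 5:
        raise ValueError("normal approximation conditions are not met")
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar, p_bar - margin, p_bar + margin

p_bar, lower, upper = proportion_interval(successes=2600, n=3000)
print(f"p-bar = {p_bar:.3f}, interval = {lower:.3f} to {upper:.3f}")
print("Entire interval above 0.75:", lower > 0.75)
```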
Finite Population Correction Factor
The populations we have sampled so far have been very
large or infinite.
When the sampled population is not very large, we need to
adjust the way in which we compute the standard error
of the sample means and the standard error of the
sample proportions.
A population that has a fixed upper bound is finite.
Finite Population Correction Factor
For a finite population, where the total number of objects or
individuals is N and the number of objects or individuals in
the sample is n, we need to adjust the standard errors in
the confidence interval formulas.
To find the confidence interval for the mean we adjust the
standard error of the mean.
For the confidence interval for a proportion, we need to
adjust the standard error of the proportion.
Finite Population Correction Factor
This adjustment is called the finite-population correction
factor (FPC).
The usual rule is if the ratio of n/N is less than 0.05, the
correction factor is ignored.
Adjusting the Standard Errors with the FPC
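We adjust the standard error of the mean or proportion as
follows:
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]
\[ s_{p} = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \sqrt{\frac{N - n}{N - 1}} \]
where p̄ is the sample proportion. When σ is unknown, s replaces σ
in the first formula.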
Example – Charity Contribution
There are 350 families in one area in Brooks city. A poll of
50 families reveals the mean annual charity contribution is
$550 with a standard deviation of $85.
(a) Develop a 90 percent confidence interval for the
population mean.
(b) Interpret the confidence interval.
Solution – Charity Contribution
Given in Problem:
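N = 350, n = 50, x̄ = $550, and s = $85
(a) Since n/N = 50/350 = 0.14, the finite population correction
factor must be used.
The population standard deviation is not known; therefore, use the
t-distribution. With n – 1 = 49 degrees of freedom and a 90 percent
confidence level, t = 1.677:
\[ \bar{x} \pm t \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = 550 \pm 1.677 \frac{85}{\sqrt{50}} \sqrt{\frac{350 - 50}{350 - 1}} = 550 \pm 1.677(12.02)(0.93) \approx 550 \pm 18.75 \]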
Solution – Charity Contribution
(b) It is likely that the population mean is more than $531.25
but less than $568.75. The population mean could be $545
but is unlikely to be $525, because $545 is within the
confidence interval and $525 is not.
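The charity contribution interval can be recomputed as a check. In this sketch the t-value 1.677 (49 degrees of freedom, 90 percent confidence) is taken from a standard t-table rather than computed, and the function names are hypothetical. Because the code does not round intermediate values, its endpoints differ by a few cents from the slide's figures.

```python
from math import sqrt

def fpc(N, n):
    """Finite-population correction factor, sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def t_interval_finite(sample_mean, s, n, N, t_value):
    """t-based interval for a mean, applying the FPC when n/N is at least 0.05."""
    standard_error = s / sqrt(n)
    if n / N >= 0.05:  # usual rule: ignore the correction below 5 percent
        standard_error *= fpc(N, n)
    margin = t_value * standard_error
    return sample_mean - margin, sample_mean + margin

T_90_DF49 = 1.677  # t-table value: 49 degrees of freedom, 90% confidence

lower, upper = t_interval_finite(550, 85, n=50, N=350, t_value=T_90_DF49)
print(f"${lower:.2f} to ${upper:.2f}")
print("545 inside:", lower <= 545 <= upper, "| 525 inside:", lower <= 525 <= upper)
```

Either way, $545 falls inside the interval and $525 falls outside it, so the interpretation is unchanged.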
Choosing An Appropriate Sample Size
When working with confidence intervals, one important
variable is sample size. However, in practice, sample size
is not a variable. It is a decision we make so that our
estimate of a population parameter is a good one. Our
decision is based on three variables:
1. The margin of error the researcher will tolerate.
The margin of error, denoted by E, is the amount that is
added and subtracted to the sample mean or proportion to
determine the endpoints of the confidence interval.
Choosing An Appropriate Sample Size
2. The level of confidence desired.
The confidence level is the probability that the interval
procedure captures the population parameter. We
logically choose a relatively high level of confidence such
as 95%. Note that, for a given margin of error, higher
levels of confidence require larger sample sizes.
3. The variation or dispersion of the population being
studied. This is measured by the population standard
deviation.
As the variation increases, the sample size required
increases. We often need to estimate the standard
deviation by using a comparable study or pilot study.
Sample Size to Estimate a Population Mean
To estimate a population mean, we can express the
interaction among these three factors and the sample size
in the following formula.
\[ E = z \frac{\sigma}{\sqrt{n}} \]
This is the margin of error used to calculate the endpoints
of confidence intervals to estimate a population mean.
Solving this equation for n yields the following result:
\[ n = \left( \frac{z \sigma}{E} \right)^{2} \]
Sample Size to Estimate a Population Mean
where: n is the size of the sample.
z is the standard normal value corresponding to
the desired level of confidence
σ is the population standard deviation.
E is the maximum allowable error.
When the outcome is not a whole number, the usual
practice is to round up any fractional result.
Example – City Trees
An NGO wants to determine the mean number of trees
planted last month near the city. The error in estimating
the mean is to be less than 150 with a 95 percent level of
confidence. The NGO found a report by the Department of
Forest that estimated the standard deviation to be 1500.
What is the required sample size?
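Solution – City Trees
\[ n = \left( \frac{z \sigma}{E} \right)^{2} = \left( \frac{1.96 \times 1500}{150} \right)^{2} = (19.6)^{2} = 384.16 \]
Rounding up, the required sample size is 385.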
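The sample size for the tree-planting example can be computed directly from the example's inputs (σ = 1500, E = 150, 95 percent confidence). This is a minimal sketch; the function name `sample_size_for_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, confidence_level=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin_of_error (illustrative helper)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Any fractional result is rounded up, following the usual practice.
    return ceil((z * sigma / margin_of_error) ** 2)

n_required = sample_size_for_mean(sigma=1500, margin_of_error=150, confidence_level=0.95)
print(n_required)
```

Halving the margin of error roughly quadruples the required sample size, because E appears squared in the denominator.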
Sample Size to Estimate a Population Proportion
To determine the sample size for a proportion, the same
three variables need to be specified:
1. The margin of error
2. The desired level of confidence
3. The variation or dispersion of the population being
studied
The formula to determine the sample size of a proportion is
given on the next slide:
Sample Size to Estimate a Population Proportion
\[ n = p(1 - p) \left( \frac{z}{E} \right)^{2} \]
where: n is the size of the sample.
z is the standard normal value corresponding to
the desired level of confidence
p is the population proportion.
E is the maximum allowable error.
We find a value for p through a comparable study or pilot
study. When no reliable value is available, p should be set
equal to 0.5.
Example – City Trees
The study in the previous example also estimates the
proportion of trees planted. The NGO wants the estimate to
be within 0.15 of the population proportion, the desired
level of confidence is 90 percent, and no estimate is
available for the population proportion. What is the required
sample size?
Solution – City Trees
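Because no estimate of the population proportion is
available, we use p = 0.50:
\[ n = p(1 - p) \left( \frac{z}{E} \right)^{2} = 0.50(1 - 0.50) \left( \frac{1.645}{0.15} \right)^{2} = 30.07 \]
Rounding up, the required sample size is 31.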
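The sample size for the proportion can be computed from the example's inputs (E = 0.15, 90 percent confidence, p = 0.5). This is a minimal sketch; the function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence_level, p=0.5):
    """Sample size n = p(1 - p)(z / E)^2, rounded up (illustrative helper).

    p defaults to 0.5, which maximizes p(1 - p) and so gives the most
    conservative (largest) sample size when no estimate is available.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(p * (1 - p) * (z / margin_of_error) ** 2)

n_required = sample_size_for_proportion(margin_of_error=0.15, confidence_level=0.90)
print(n_required)
```

Any other value of p gives a sample size no larger than the p = 0.5 result, which is why 0.5 is the safe default.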