Uploaded by bensonkariuki64

Confidence intervals

Confidence interval of the Mean
i. Estimating the confidence interval for the population mean (σ known)
This is given through the following formula;
x̄ ± Z*σ/√n
Where:
x̄ is the sample mean
Z is the critical value at a given level of confidence, obtained from the inverse of the
standard normal cumulative distribution.
σ is the population standard deviation
n is the sample size
σ/√n is the Standard Error (S.E)
Z*σ/√n is the Margin of Error (M.E)
We can rewrite this formula as;
x̄ ± M.E
NOTE
At 80% confidence level, Z = 1.282
At 90% confidence level, Z = 1.645
At 95% confidence level, Z = 1.96
At 98% confidence level, Z = 2.326
At 99% confidence level, Z = 2.576
Example
The American Management Association wishes to have information on the mean income of store
managers in the retail industry. A random sample of 256 managers reveals a mean of $45,420.
The standard deviation of this population is $2,050. The association would like answers to the
following questions:
1. What is the sample mean?
x̄ = 45420
2. What is the point estimate?
A point estimate is a single value derived from the sample that estimates a
population parameter, e.g. the sample mean, the sample standard deviation or the sample
proportion. Here the point estimate of the population mean is the sample mean, $45,420.
3. What is the 95% confidence interval for the population mean?
x̄ ± M.E
x̄ ± Z*σ/√n
45420 ± (1.96*(2050/√256))
Lower limit = 45169
Upper limit = 45671
45,169 < µ < 45,671
4. How do we interpret these results?
We are 95% confident that the true population mean is between $45,169 and $45,671
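The calculation above can be reproduced with a short Python sketch. The function name z_confidence_interval is illustrative and not part of the original notes; it assumes σ is known and uses the standard library's NormalDist to obtain Z.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (lower, upper) limits for the population mean when sigma is known."""
    # Two-tailed critical value: half of alpha goes in each tail.
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Margin of error = Z * standard error.
    margin_of_error = z * sigma / math.sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Store managers example: x̄ = 45420, σ = 2050, n = 256, 95% confidence.
lower, upper = z_confidence_interval(45420, 2050, 256, 0.95)
print(round(lower), round(upper))  # 45169 45671
```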
Example
The Bun-and-Run is a franchise fast-food restaurant located in the Northeast specializing
in half-pound hamburgers, fish sandwiches, and chicken sandwiches. Soft drinks and
French fries are also available. The Planning Department of Bun-and-Run Inc. reports
that the distribution of daily sales for restaurants follows the normal distribution and that
the population standard deviation is $3,000. A sample of 40 showed the mean daily sales
to be $20,000.
(a) What is the population mean?
We do not know the population mean
(b) What is the best estimate of the population mean? What is this value called?
The best estimate is the sample mean, $20,000. This value is called the point
estimate of the population mean.
(c) Develop a 99 percent confidence interval for the population mean.
x̄ ± Z*σ/√n
20000 ± (2.58*(3000/√40)), using Z = 2.576 rounded to 2.58
Lower limit = 18776
Upper limit = 21224
18,776 < µ < 21,224
(d) Interpret the confidence interval.
We are 99% confident that the true population mean is between $18,776 and $21,224
Questions
1. A research firm conducted a survey to determine the mean amount steady smokers
spend on cigarettes during a week. The distribution of amounts spent per week followed
the normal distribution with a population standard deviation of $5. A sample of 49
steady smokers revealed a sample mean of x̄ = $20.
The point estimate is $20, as it is the sample mean.
The 95% confidence interval will be:
Z*S.E = M.E
C.I = Point estimate ± M.E
20 ± 1.96*(5/√49)
= (18.6, 21.4)
We are 95% confident that the true population mean is within the interval above.
2. Refer to the previous question. Suppose that 64 smokers (instead of 49) were
sampled.
Assume the sample mean remained the same.
a. What is the 95 percent confidence interval estimate of µ?
20 ± 1.96*(5/√64)
= (18.775, 21.225)
b. Explain why this confidence interval is narrower than the one determined in the
previous exercise
NB
Whenever we increase the sample size, the margin of error decreases and therefore the
confidence interval becomes narrower.
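The effect of the sample size on the margin of error can be checked numerically. This is a minimal sketch using the values from the two questions above (Z = 1.96, σ = 5); the helper name margin_of_error is an illustrative choice.

```python
import math

def margin_of_error(z, sigma, n):
    """Margin of error = Z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

# Same Z and sigma, two different sample sizes.
for n in (49, 64):
    print(n, round(margin_of_error(1.96, 5, n), 3))
# n = 49 gives 1.4 and n = 64 gives 1.225, so the interval narrows as n grows.
```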
Obtaining the Z critical value on Excel
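Assuming we are interested in finding the critical value at the 95% confidence level,
α = 0.05 is split equally between the two tails, so the cumulative probability is
1 - 0.05/2 = 0.975.
The Excel function for this will be:
=NORM.S.INV(probability)
For example, =NORM.S.INV(0.975) returns approximately 1.96.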
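For readers working outside Excel, a similar lookup can be sketched in Python. This assumes a two-tailed interval, so the cumulative probability passed to the inverse normal function is 1 - α/2; the function name z_critical is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed Z critical value for a given confidence level (e.g. 0.95)."""
    alpha = 1 - confidence
    # Inverse of the standard normal cumulative distribution at 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

# Reproduce the critical values listed in the NOTE earlier in these notes.
for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: Z = {z_critical(level):.3f}")
```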
ii. Estimating the confidence interval for the population mean (σ unknown)
Whenever we are estimating the confidence interval when the population standard
deviation is not known, we use the t-distribution.
The formula is given by;
x̄ ± t*s/√n
Where t is the critical value with n-1 degrees of freedom and s is the sample standard
deviation.
Example
1. Find the t-critical value at 95% confidence level when the sample size is 21.
Degrees of freedom (df) = n-1
Df = 21-1 = 20
From the t-table, t = 2.086
2. Find the t-critical value at 90% confidence level when the sample size is 17.
From the t-table, t = 1.746
Computing the t-critical Value using Excel
To compute the t-critical value at a given level of confidence with n-1 degrees of freedom, we use the following Excel function.
=T.INV.2T(probability, df)
Where 2T represents the two tails and probability is α = 1 - confidence level (e.g. 0.05 for a 95% confidence level).
Example
Recompute the above critical values using Excel
Find the t-critical value at 95% confidence level when the sample size is 21.
Degrees of freedom (df) = n-1
Df = 21-1 = 20
=T.INV.2T(0.05, 20)
= 2.085963447
Find the t-critical value at 90% confidence level when the sample size is 17.
=T.INV.2T(0.1, 16)
= 1.745883676
Relationship Between Z-distribution and t-distribution
The following characteristics of the t distribution are based on the assumption that the
population of interest is normal, or nearly normal.
• It is, like the z distribution, a continuous distribution.
• It is, like the z distribution, bell-shaped and symmetrical.
• There is not one t distribution, but rather a family of t distributions. All t
distributions have a mean of 0, but their standard deviations differ according to the
sample size, n. There is a t distribution for a sample size of 20, another for
a sample size of 22, and so on. The standard deviation for a t distribution with
5 observations is larger than for a t distribution with 20 observations.
• The t distribution is more spread out and flatter at the center than the standard
normal distribution. As the sample size increases, however, the t distribution
approaches the standard normal distribution, because the errors in using s to estimate
σ decrease with larger samples.
Example
A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10
tires driven 50,000 miles revealed a sample mean of 0.32 inches of tread remaining
with a standard deviation of 0.09 inches. Construct a 95 percent confidence interval
for the population mean.
Would it be reasonable for the manufacturer to conclude that after 50,000 miles the
population mean amount of tread remaining is 0.30 inches?
The formula is given by;
x̄ ± t*s/√n
0.32 ± (2.262*(0.09/√10))
Lower limit = 0.32 - (2.262*(0.09/√10)) = 0.2556
Upper limit = 0.32 + (2.262*(0.09/√10)) = 0.3844
0.2556 < µ < 0.3844
We are 95% confident that the true population mean is between 0.2556 and 0.3844.
Yes it is reasonable because 0.30 falls within the confidence interval.
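The tire example can be verified with a small Python sketch. It assumes the t-critical value (2.262 for 9 degrees of freedom at 95% confidence) is read from the t-table and passed in, since the Python standard library has no inverse t-distribution; the function name is illustrative.

```python
import math

def t_confidence_interval(sample_mean, s, n, t_critical):
    """Return (lower, upper) limits using a t-critical value taken from a table."""
    margin_of_error = t_critical * s / math.sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Tire example: x̄ = 0.32, s = 0.09, n = 10, t = 2.262 (df = 9, 95% confidence).
lower, upper = t_confidence_interval(0.32, 0.09, 10, 2.262)
print(round(lower, 4), round(upper, 4))  # 0.2556 0.3844

# A claimed mean of 0.30 is reasonable if it lies inside the interval.
print(lower <= 0.30 <= upper)  # True
```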
Estimating the Confidence interval of a Single Proportion
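Before constructing a confidence interval for a proportion, we first need to verify
that the normal approximation is appropriate. Two conditions are checked:
1. n*p(1-p) ≥ 10
2. n < 0.05N (the sample is less than 5% of the population)
For example, with n = 2000 and p = 0.8, n*p(1-p) = 2000*0.8*0.2 = 320, which is at least 10.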
This is given by the formula;
p ± Z*√(p(1-p)/n)
where p is the sample proportion.
Point estimate = p = x/n where x is the number of observations with the characteristic
of interest and n is the total sample size.
Standard error = √(p(1-p)/n)
Margin of error = Z*√(p(1-p)/n)
The estimate of a population proportion is the sample proportion.
Example:
The union representing the Bottle Blowers of America (BBA) is considering a
proposal to merge with the Teamsters Union. According to BBA union bylaws, at
least three-fourths of the union membership must approve any merger. A random
sample of 2,000 current BBA members reveals 1,600 plan to vote for the merger
proposal.
What is the estimate of the population proportion?
p = x/n
= 1600/2000
= 0.8
Develop a 95 percent confidence interval for the population proportion.
p ± Z*√(p(1-p)/n)
0.8 ± 1.96*√(0.8(1-0.8)/2000)
Lower limit = 0.7825
Upper limit = 0.8175
0.7825 < P < 0.8175
We are 95% confident that the true population proportion is between 78.25% and
81.75%.
Basing your decision on this sample information, can you conclude that the
necessary proportion of BBA members favor the merger? Why?
Since all values in the confidence interval are greater than 75%, then the necessary
proportion of BBA members favor the merger.
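The BBA example can be reproduced with the following Python sketch. The function name and the choice to raise an error when n*p(1-p) < 10 are illustrative assumptions, not part of the original notes; Z = 1.96 is taken from the list of critical values.

```python
import math

def proportion_confidence_interval(x, n, z):
    """Return (lower, upper) limits for a population proportion."""
    p = x / n  # point estimate: the sample proportion
    # Normal-approximation check described earlier: n*p*(1-p) should be at least 10.
    if n * p * (1 - p) < 10:
        raise ValueError("normal approximation condition n*p*(1-p) >= 10 not met")
    margin_of_error = z * math.sqrt(p * (1 - p) / n)
    return p - margin_of_error, p + margin_of_error

# BBA example: 1600 of 2000 sampled members favor the merger, 95% confidence.
lower, upper = proportion_confidence_interval(1600, 2000, 1.96)
print(round(lower, 4), round(upper, 4))  # 0.7825 0.8175

# The merger needs at least 75% approval; the whole interval lies above 0.75.
print(lower > 0.75)  # True
```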
Sample Size Estimation
1. When we need to compute the sample size given the standard deviation, we use
the following formula;
n = (Z_{α/2}*σ/E)²
Where n is the sample size
Z is the two tailed critical value
σ is the standard deviation
E is the margin of error we are willing to accept
Example
A student in public administration wants to determine the mean amount members
of city councils in large cities earn per month as remuneration for being a council
member. The error in estimating the mean is to be less than $100 with a 95
percent level of confidence. The student found a report by the Department of
Labor that reported a standard deviation of $1,000. What is the minimum required
sample size?
n = (Z*σ/E)²
= (1.96*1000/100)²
= 384.16
We round up this value to get n = 385
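The same calculation as a minimal Python sketch; the function name is illustrative, and rounding up with math.ceil reflects the rule that the sample size must be a whole number at least as large as the computed value.

```python
import math

def sample_size_for_mean(z, sigma, error):
    """Minimum n so that the margin of error does not exceed `error`."""
    # n = (Z * sigma / E)^2, always rounded up to the next whole number.
    return math.ceil((z * sigma / error) ** 2)

# City council example: Z = 1.96, sigma = 1000, E = 100.
print(sample_size_for_mean(1.96, 1000, 100))  # 385
```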
Question
A population is estimated to have a standard deviation of 10. We want to estimate
the population mean within 2, with an 80 percent level of confidence. How large a
sample is required?
2. The sample size when dealing with proportions is given by;
n = (Z_{α/2})²*p(1-p)/E²
Where E is the margin of Error.
Example
The study in the previous example also estimates the proportion of cities that have
private refuse collectors. The student wants the margin of error to be within .10 of the
population proportion, the desired level of confidence is 90 percent, and no estimate is
available for the population proportion. What is the required sample size?
Assume instead that a prior estimate of 30% was available for the proportion. What is
the new minimum sample size required?
When no prior estimate is available for the population proportion, we assume p =
50% = 0.5
n = Z²*p(1-p)/E²
n = 1.645²*0.5(1-0.5)/0.10²
= 67.65
We round up to get n = 68
When p = 0.3, we have;
n = 1.645²*0.3(1-0.3)/0.10²
= 56.83
Rounding up to n = 57
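The sample-size formula for proportions, n = Z²*p(1-p)/E², can be checked with a short Python sketch. The function name and the default of p = 0.5 when no prior estimate is available are illustrative choices; note that the margin of error E is squared in the denominator.

```python
import math

def sample_size_for_proportion(z, error, p=0.5):
    """Minimum n for estimating a proportion within `error` of the true value."""
    # n = Z^2 * p * (1 - p) / E^2, rounded up to the next whole number.
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

# 90% confidence (Z = 1.645), margin of error 0.10.
print(sample_size_for_proportion(1.645, 0.10))         # 68 (no prior estimate, p = 0.5)
print(sample_size_for_proportion(1.645, 0.10, p=0.3))  # 57 (prior estimate of 30%)
```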
Question
It is estimated that 60 percent of U.S. households subscribe to cable TV. You would like to
verify this statement for your class in mass communications. If you want your estimate to be
within 5 percentage points, with a 95 percent level of confidence, how large of a sample is
required?
Finite Population Correction Factor
The Finite Population Correction Factor (FPC) is used when you sample more than 5% of a
finite population. It is needed because, when a sample drawn without replacement takes up a
large share of the population, the observations are no longer independent and the usual
standard error of the estimate (e.g. of the mean or proportion) will be too big.
In other words, when the population is small enough that the sample is a major
fraction of it (i.e. greater than 5%), the standard error can be reduced by
applying the finite population correction factor to form the adjusted standard error.
This formula is used when we have been given the size of the population from which the sample is
drawn.
The formula is:
√((N - n)/(N - 1))
Where N is the size of the population and n the size of the sample.
Example
There are 250 families in Scandia, Pennsylvania. A random sample of 40 of these families
revealed the mean annual church contribution was $450 and the standard deviation of this was
$75.
1. Develop a 90 percent confidence interval for the population mean.
x̄ ± t*(s/√n)*√((N - n)/(N - 1))
450 ± 1.685*(75/√40)*√((250 - 40)/(250 - 1))
FPC = √(210/249) = 0.91835
Lower limit = 450 - 1.685*(75/√40)* 0.91835= 431.6
Upper limit = 450 + 1.685*(75/√40)*0.91835= 468.4
(431.6, 468.4)
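The Scandia example can be reproduced with the following Python sketch. The function names are illustrative, and the t-critical value 1.685 (df = 39, 90% confidence) is assumed to come from a t-table or from Excel.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def t_interval_with_fpc(sample_mean, s, n, population_size, t_critical):
    """Confidence interval for the mean using the adjusted standard error."""
    adjusted_se = s / math.sqrt(n) * fpc(population_size, n)
    margin_of_error = t_critical * adjusted_se
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Scandia example: N = 250, n = 40, x̄ = 450, s = 75, t = 1.685.
print(round(fpc(250, 40), 5))  # 0.91835
lower, upper = t_interval_with_fpc(450, 75, 40, 250, 1.685)
print(round(lower, 1), round(upper, 1))  # 431.6 468.4
```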
Question
Thirty people from a population of 300 were asked how much they had in savings. The sample
mean (x̄) was $1,500, with a sample standard deviation of $89.55. Construct a 95% confidence
interval estimate for the population mean.
FPC = √((300 - 30)/(300 - 1)) = √(270/299) = 0.9502684
S.E = 89.55/√30
= 16.3495183415
A.S.E = 16.3495183415*0.9502684 = 15.5364306351
tc = 2.045229642
M.E = 2.045229642*15.5364306351 = 31.7755684658
L.B = 1500 - 31.7755684658 = 1468.22
U.B = 1500 + 31.7755684658 = 1531.78