VIII. INTERVAL ESTIMATION
CHAPTER 8

Learning Objectives
1. Define a point estimate.
2. Define a confidence level.
3. Construct a confidence interval for the population mean when the population standard deviation is known.
4. Construct a confidence interval for the population mean when the population standard deviation is unknown.
5. Construct a confidence interval for the population proportion.
6. Determine the finite-population correction factor.
7. Calculate the required sample size to estimate a population proportion or population mean.

Point Estimates
A point estimate is a single statistic, computed from sample information, that is used to estimate a population parameter.
Example: To estimate the average starting salary of recent graduates from your university, μ, the university takes a random sample of 100 recent graduates and computes the sample mean, \( \bar{x} \).

Examples of Point Estimates
Below are examples of population parameters and the sample statistics that are computed to obtain point estimates of them.

Population Parameter        Sample Statistic
Mean, μ                     Sample mean, \( \bar{x} \)
Standard deviation, σ       Sample standard deviation, s
Proportion, p               Sample proportion, \( \bar{p} \)

Confidence Levels
A point estimate only tells part of the story. While we expect the point estimate to be close to the population parameter, we would like to measure how close it really is. A confidence interval serves this purpose.
A confidence interval is a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the level of confidence.

Confidence Intervals
To compute a confidence interval for a population mean, we will consider two situations:
• We use sample data to estimate μ with \( \bar{x} \), and the population standard deviation (σ) is known.
• We use sample data to estimate μ with \( \bar{x} \), and the population standard deviation is unknown. In this case, we substitute the sample standard deviation (s) for the population standard deviation (σ).
We first consider the case where σ is known.

Confidence Interval for the Population Mean, Population Standard Deviation Known
If the population standard deviation is known, and the population is normally distributed or the sample size is at least 30, then a confidence interval for the population mean is given by

\[ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} \]

where z is the z-value for a particular confidence level.

Obtaining the z-value
The area under the standard normal curve between z = -1.96 and z = +1.96 is 0.95.

Confidence Levels and z-Values
Below are three common confidence levels and their associated z-values.

Confidence Level    z-Value
90 percent          1.645
95 percent          1.96
99 percent          2.58

Example – Mean Income
A survey company wants to determine the mean income of middle-level employees in the retail industry. A random sample of 361 employees reveals a sample mean of $54,520. The standard deviation of this population is $3,060. The company would like answers to the following questions:
1. What is the population mean? What is a reasonable value to use as an estimate of the population mean?
2. What is a reasonable range of values for the population mean?
3. What do these results mean?

Solution – Mean Income
1. In this case, we do not know the population mean. We do know the sample mean is $54,520. Hence, our best estimate of the unknown population value is the corresponding sample statistic. Thus, the sample mean of $54,520 is a point estimate of the unknown population mean.
2. Suppose the company decides to use the 95 percent level of confidence:

\[ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 54{,}520 \pm 1.96 \frac{3{,}060}{\sqrt{361}} = 54{,}520 \pm 316 \]

The confidence limits are $54,204 and $54,836. The margin of error is ±$316.
3. If we select many samples of 361 employees, and for each sample we compute the mean and then construct a 95% confidence interval, we could expect about 95% of these confidence intervals to contain the population mean. Conversely, about 5% of the intervals would not contain the population mean annual income, µ.
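The mean income interval above can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function name `z_interval` is a hypothetical helper chosen for illustration, and the z-value is computed from the standard normal distribution rather than read from the table, so it differs from 1.96 only in later decimal places.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin) for a mean when sigma is known.

    Hypothetical helper for illustration; applies x_bar +/- z * sigma / sqrt(n).
    """
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


# Values from the mean income example: x_bar = 54,520, sigma = 3,060, n = 361.
lower, upper, margin = z_interval(54520, 3060, 361)
print(f"{lower:.0f} to {upper:.0f} (margin of error {margin:.0f})")
```

Rounded to the nearest dollar, the limits agree with the $54,204 and $54,836 reported in the solution.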
Confidence Interval for the Population Mean, Population Standard Deviation Unknown
In most sampling situations the population standard deviation (σ) is not known. We can use the sample standard deviation s to estimate the population standard deviation. Because σ is unknown, we can no longer use the previous confidence interval formula or the z distribution. Instead, we use the sample standard deviation and replace the z distribution with the t distribution.

Characteristics of the t Distribution
1. It is, like the z distribution, a continuous distribution.
2. It is, like the z distribution, bell-shaped and symmetrical.
3. There is not one t distribution, but rather a “family” of t distributions. All t distributions have a mean of 0, but their standard deviations differ according to the sample size, n. The standard deviation for a t distribution with 5 observations is larger than for a t distribution with 20 observations.
4. The t distribution is more spread out and flatter at the centre than is the standard normal distribution. As the sample size increases, however, the t distribution approaches the standard normal distribution, because the errors in using s to estimate σ decrease with larger n.

Figure: The standard normal distribution and Student’s t distribution.

Confidence Interval for the Population Mean, Population Standard Deviation Unknown
To develop a confidence interval for the population mean with an unknown population standard deviation (σ) we:
1. Assume the sampled population is either normal or approximately normal.
2. Estimate the population standard deviation (σ) with the sample standard deviation (s).
3. Use the t-distribution rather than the z-distribution.
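The claim that the t distribution is more spread out for small samples can be illustrated numerically. The sketch below assumes a standard result that the slides do not state: a t distribution with df degrees of freedom has variance df / (df − 2) when df > 2. The function name `t_std_dev` is hypothetical.

```python
from math import sqrt


def t_std_dev(n):
    """Standard deviation of the t distribution for a sample of size n.

    Assumes the standard result Var(t) = df / (df - 2) for df > 2,
    with df = n - 1 degrees of freedom.
    """
    df = n - 1
    if df <= 2:
        raise ValueError("the variance is finite only when df > 2")
    return sqrt(df / (df - 2))


# The spread shrinks toward 1 (the standard normal value) as n grows.
for n in (5, 20, 100):
    print(n, round(t_std_dev(n), 3))
```

With 5 observations the standard deviation is larger than with 20 observations, and both exceed 1, the standard deviation of the standard normal distribution.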
Confidence Interval for the Population Mean, Population Standard Deviation Unknown
If the population standard deviation is unknown and the population is normally distributed, then a confidence interval for the population mean is given by

\[ \bar{x} \pm t \frac{s}{\sqrt{n}} \]

where t is the t-value for a particular confidence level and sample size n. The t-value is found in the t-table in the Appendix (or computed in Excel) with n − 1 degrees of freedom (df).

Determining When to Use the z Distribution or the t Distribution
Assuming the population is normal: if the population standard deviation is known, use the z distribution; if it is unknown, use the t distribution.

Example – Life of Light Bulbs
A bulb manufacturer wishes to investigate the life of its bulbs. A sample of 10 bulbs that have been in use for 60 days revealed a sample mean of 0.71 days of life remaining with a standard deviation of 0.13 days. Construct a 95% confidence interval for the population mean. Would it be reasonable for the manufacturer to conclude that after 60 days the population mean amount of life remaining is 0.70 days?

Solution – Life of Light Bulbs
Given in the problem:
n = 10
\( \bar{x} \) = 0.71
s = 0.13
Confidence level = 95%
Because σ is unknown, we compute the confidence interval using the t-distribution. The t-value is found in the t-table in Appendix B.2 with n − 1 = 9 degrees of freedom (df) and a 95% confidence level: t = 2.262.

To determine the confidence interval, we substitute the values into the formula:

\[ \bar{x} \pm t \frac{s}{\sqrt{n}} = 0.71 \pm 2.262 \frac{0.13}{\sqrt{10}} = 0.71 \pm 0.093 \]

The endpoints of the confidence interval are 0.617 and 0.803. The manufacturer can be reasonably sure (95% confident) that the mean remaining life is between 0.617 and 0.803 days. Because the value of 0.70 is in this interval, it is possible that the mean of the population is 0.70.

Confidence Interval for a Population Proportion
Recall that a proportion is the fraction, ratio, or percent indicating the part of the sample or the population having a particular trait of interest. Recall that we can determine the sample proportion with the following formula:

\[ \bar{p} = \frac{x}{n} \]

where x is the number of items in the sample having the trait and n is the sample size. The sample proportion provides a point estimate of the population proportion, p.
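The light bulb interval from the example above can be checked with a few lines of code. This is a sketch only: the Python standard library has no t distribution, so the t-value 2.262 (95% confidence, 9 degrees of freedom) is hard-coded from a t-table, and the names `T_95_DF9` and `t_interval` are hypothetical.

```python
from math import sqrt

# Assumption: t-table value for a 95% confidence level with 9 degrees of freedom.
T_95_DF9 = 2.262


def t_interval(sample_mean, s, n, t_value):
    """Return (lower, upper) using x_bar +/- t * s / sqrt(n)."""
    margin = t_value * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the light bulb example: x_bar = 0.71, s = 0.13, n = 10.
lower, upper = t_interval(0.71, 0.13, 10, T_95_DF9)
print(f"{lower:.3f} to {upper:.3f}")

# Is the hypothesised mean of 0.70 inside the interval?
print(lower <= 0.70 <= upper)
```

The endpoints round to 0.617 and 0.803, and 0.70 falls inside the interval, which matches the conclusion of the solution.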
To develop a confidence interval for a proportion, we need to meet the following assumptions.
1. All binomial conditions are met.
2. The values np and n(1 – p) should both be greater than or equal to 5.
If these assumptions are met, then a confidence interval for a population proportion is given by

\[ \bar{p} \pm z \, \sigma_{\bar{p}}, \qquad \sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} \]

Since we do not know the value of the population proportion, we replace \( \sigma_{\bar{p}} \) with the standard error of the sample proportion, \( s_{\bar{p}} \):

\[ s_{\bar{p}} = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]

As a result, the confidence interval becomes

\[ \bar{p} \pm z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]

Example – Dress Code
A company decides to poll its employees on whether to adopt a dress code. The dress code will be adopted if at least three-fourths of employees vote in favour of it. A random sample of 3000 employees reveals that 2600 plan to vote for the dress code.
(a) What is the estimate of the population proportion?
(b) Develop a 95% confidence interval for the population proportion.
(c) Basing your decision on this sample information, can you conclude that the necessary proportion of employees favours the dress code?

Solution – Dress Code
(a) The sample proportion is

\[ \bar{p} = \frac{x}{n} = \frac{2600}{3000} = 0.867 \]

(b) The 95% C.I. is

\[ 0.867 \pm 1.96 \sqrt{\frac{0.867(1 - 0.867)}{3000}} = 0.867 \pm 0.012 \]

so the interval runs from 0.855 to 0.879.
(c) Conclude that the dress code proposal will pass, because the entire interval estimate lies above 0.75 (75% of the employees).

Finite Population Correction Factor
The populations we have sampled so far have been very large or infinite. When the sampled population is not very large, we need to adjust the way in which we compute the standard error of the sample means and the standard error of the sample proportions. A population that has a fixed upper bound is finite.
For a finite population, where the total number of objects or individuals is N and the number of objects or individuals in the sample is n, we need to adjust the standard errors in the confidence interval formulas.
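Looking back at the dress code example, the interval for the proportion can be verified numerically. The sketch below uses only the Python standard library; `proportion_interval` is a hypothetical helper name, and the z-value is computed from the standard normal distribution instead of being read from the table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (p_bar, lower, upper) using p_bar +/- z * sqrt(p_bar(1 - p_bar)/n)."""
    p_bar = successes / n
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar, p_bar - margin, p_bar + margin


# Values from the dress code example: 2600 of 3000 employees in favour.
p_bar, lower, upper = proportion_interval(2600, 3000)
print(f"{p_bar:.3f}: {lower:.3f} to {upper:.3f}")

# The proposal needs at least 0.75; check that the whole interval clears it.
print(lower > 0.75)
```

Comparing the lower limit, rather than the point estimate, with 0.75 is what supports the conclusion in part (c).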
To find the confidence interval for the mean, we adjust the standard error of the mean. For the confidence interval for a proportion, we adjust the standard error of the proportion. This adjustment is called the finite-population correction factor (FPC). The usual rule is that if the ratio n/N is less than 0.05, the correction factor is ignored.

Adjusting the Standard Errors with the FPC
We adjust the standard error of the mean or proportion as follows:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \qquad \sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}} \]

Example – Charity Contribution
There are 350 families in one area of Brooks city. A poll of 50 families reveals that the mean annual charity contribution is $550 with a standard deviation of $85.
(a) Develop a 90 percent confidence interval for the population mean.
(b) Interpret the confidence interval.

Solution – Charity Contribution
Given in the problem: N = 350, n = 50, \( \bar{x} \) = $550 and s = $85.
(a) Since n/N = 50/350 = 0.14 is greater than 0.05, the finite population correction factor must be used. The population standard deviation is not known, so we use the t-distribution with n − 1 = 49 degrees of freedom:

\[ \bar{x} \pm t \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = 550 \pm t \frac{85}{\sqrt{50}} \sqrt{\frac{350 - 50}{350 - 1}} \]

(b) It is likely that the population mean is more than $531.25 but less than $568.75. The population mean could plausibly be $545 but not $525, because $545 lies within the confidence interval and $525 does not.

Choosing An Appropriate Sample Size
When working with confidence intervals, one important variable is sample size. However, in practice, sample size is not a variable. It is a decision we make so that our estimate of a population parameter is a good one. Our decision is based on three variables:
1. The margin of error the researcher will tolerate. The margin of error, denoted by E, is the amount that is added to and subtracted from the sample mean or proportion to determine the endpoints of the confidence interval.
2. The level of confidence desired. The confidence level is the specified probability that the interval will contain the population parameter.
We logically choose a relatively high level of confidence such as 95%. Note that higher levels of confidence require larger sample sizes.
3. The variation or dispersion of the population being studied. This is measured by the population standard deviation. As the variation increases, the sample size required increases. We often need to estimate the standard deviation by using a comparable study or pilot study.

Sample Size to Estimate a Population Mean
To estimate a population mean, we can express the interaction among these three factors and the sample size in the following formula:

\[ E = z \frac{\sigma}{\sqrt{n}} \]

This is the margin of error used to calculate the endpoints of confidence intervals to estimate a population mean. Solving this equation for n yields the following result:

\[ n = \left( \frac{z \sigma}{E} \right)^2 \]

where:
n is the size of the sample.
z is the standard normal value corresponding to the desired level of confidence.
σ is the population standard deviation.
E is the maximum allowable error.
When the outcome is not a whole number, the usual practice is to round up any fractional result.

Example – City Trees
An NGO wants to determine the mean number of trees planted near the city in the last month. The error in estimating the mean is to be less than 150 with a 95 percent level of confidence. The NGO found a report by the Department of Forest that estimated the standard deviation to be 1500. What is the required sample size?

Solution – City Trees

\[ n = \left( \frac{z \sigma}{E} \right)^2 = \left( \frac{1.96 \times 1500}{150} \right)^2 = (19.6)^2 = 384.16 \]

Rounding up, the required sample size is 385.

Sample Size to Estimate a Population Proportion
To determine the sample size for a proportion, the same three variables need to be specified:
1. The margin of error
2. The desired level of confidence
3. The variation or dispersion of the population being studied
The formula to determine the sample size for a proportion is:

\[ n = p(1 - p) \left( \frac{z}{E} \right)^2 \]

where:
n is the size of the sample.
z is the standard normal value corresponding to the desired level of confidence.
p is the population proportion.
E is the maximum allowable error.
We find a value for p through a comparable study or pilot study. When no reliable value is available, p should be set equal to 0.5, which gives the largest possible value of p(1 − p) and therefore the most conservative sample size.

Example – City Trees
The study in the previous example also estimates the proportion of trees planted. The NGO wants the estimate to be within 0.15 of the population proportion, the desired level of confidence is 90 percent, and no estimate is available for the population proportion. What is the required sample size?

Solution – City Trees
Because no estimate of the population proportion is available, we use p = 0.50:

\[ n = p(1 - p) \left( \frac{z}{E} \right)^2 = (0.50)(1 - 0.50) \left( \frac{1.645}{0.15} \right)^2 = 30.07 \]

Rounding up, the required sample size is 31.
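Both City Trees sample sizes can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the z-values 1.96 and 1.645 are taken from the table of common confidence levels rather than computed.

```python
from math import ceil


def sample_size_mean(z, sigma, max_error):
    """Sample size for a mean: n = (z * sigma / E)^2, rounded up."""
    return ceil((z * sigma / max_error) ** 2)


def sample_size_proportion(z, max_error, p=0.5):
    """Sample size for a proportion: n = p(1 - p)(z / E)^2, rounded up.

    p defaults to 0.5, the value used when no estimate is available.
    """
    return ceil(p * (1 - p) * (z / max_error) ** 2)


# Mean number of trees: 95% confidence, sigma = 1500, E = 150.
n_mean = sample_size_mean(1.96, 1500, 150)

# Proportion of trees: 90% confidence, E = 0.15, no estimate of p.
n_prop = sample_size_proportion(1.645, 0.15)

print(n_mean, n_prop)
```

Rounding up with `ceil` follows the usual practice stated above: any fractional result is raised to the next whole number so that the margin of error is not exceeded.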