Chapter 8: ESTIMATION OF THE MEAN AND PROPORTION

ESTIMATION: AN INTRODUCTION
Definition: The assignment of value(s) to a population parameter based on a value of the corresponding sample statistic is called estimation.

ESTIMATION: AN INTRODUCTION cont.
Definition: The value(s) assigned to a population parameter based on the value of a sample statistic is called an estimate. The sample statistic used to estimate a population parameter is called an estimator.

ESTIMATION: AN INTRODUCTION cont.
The estimation procedure involves the following steps.
1. Select a sample.
2. Collect the required information from the members of the sample.
3. Calculate the value of the sample statistic.
4. Assign value(s) to the corresponding population parameter.

POINT AND INTERVAL ESTIMATES
A Point Estimate
An Interval Estimate

A Point Estimate
Definition: The value of a sample statistic that is used to estimate a population parameter is called a point estimate.

A Point Estimate cont.
Usually, whenever we use point estimation, we calculate the margin of error associated with that point estimation. The margin of error is calculated as follows:
Margin of error = \( \pm 1.96\,\sigma_{\bar{x}} \) or \( \pm 1.96\,s_{\bar{x}} \)

An Interval Estimation
Definition: In interval estimation, an interval is constructed around the point estimate, and it is stated that this interval is likely to contain the corresponding population parameter.

Figure 8.1 Interval estimation. (The figure shows \( \bar{x} \) = $1370 with an interval running from $1130 to $1610.)

An Interval Estimation cont.
Definition: Each interval is constructed with regard to a given confidence level and is called a confidence interval. The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by (1 – α)100%.
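The four estimation steps and the margin of error formula above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `point_estimate_with_margin` and the sample values are hypothetical, and the sample standard deviation s is used because σ is assumed unknown.

```python
import math
import statistics


def point_estimate_with_margin(sample, z=1.96):
    """Return (x_bar, margin), where margin = z * s / sqrt(n).

    z = 1.96 is the multiplier the slides use for the margin of error.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)      # step 3: the sample statistic
    s = statistics.stdev(sample)         # sample standard deviation
    s_x_bar = s / math.sqrt(n)           # estimated standard deviation of x_bar
    return x_bar, z * s_x_bar


# Steps 1 and 2: a hypothetical sample of 36 observations (made-up values).
sample = [1300 + 4 * i for i in range(36)]

# Step 4: assign a value to the population mean, with its margin of error.
x_bar, margin = point_estimate_with_margin(sample)
print(f"point estimate = {x_bar}, margin of error = +/- {margin:.2f}")
```

The point estimate alone says nothing about precision, so the sketch returns the margin of error alongside it.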
INTERVAL ESTIMATION OF A POPULATION MEAN: LARGE SAMPLES
Confidence Interval for μ for Large Samples
The (1 – α)100% confidence interval for μ is
\( \bar{x} \pm z\,\sigma_{\bar{x}} \) if σ is known
\( \bar{x} \pm z\,s_{\bar{x}} \) if σ is not known
where \( \sigma_{\bar{x}} = \sigma/\sqrt{n} \) and \( s_{\bar{x}} = s/\sqrt{n} \).
The value of z used here is read from the standard normal distribution table for the given confidence level.

INTERVAL ESTIMATION OF A POPULATION MEAN: LARGE SAMPLES cont.
Definition: The maximum error of estimate for μ, denoted by E, is the quantity that is subtracted from and added to the value of \( \bar{x} \) to obtain a confidence interval for μ. Thus,
\( E = z\,\sigma_{\bar{x}} \) or \( E = z\,s_{\bar{x}} \)

Figure 8.2 Finding z for a 95% confidence level. (Total shaded area is .9500 or 95%: an area of .4750 on each side of the mean, between z = -1.96 and z = 1.96.)

Figure 8.3 Area in the tails. (An area of α/2 lies in each tail, beyond -z and z; the area between them is (1 – α).)

Example 8-1
A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produces a mean price of $70.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $4.50.
(a) What is the point estimate of the mean price of all such textbooks? What is the margin of error for the estimate?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.

Solution 8-1
a) n = 36, \( \bar{x} \) = $70.50, and σ = $4.50
\( \sigma_{\bar{x}} = \sigma/\sqrt{n} = 4.50/\sqrt{36} = \$.75 \)
Point estimate of μ = \( \bar{x} \) = $70.50
Margin of error = \( \pm 1.96\,\sigma_{\bar{x}} = \pm 1.96(.75) = \pm\$1.47 \)

Solution 8-1
b) Confidence level is 90% or .90. z = 1.65.
\( \bar{x} \pm z\,\sigma_{\bar{x}} = 70.50 \pm 1.65(.75) = 70.50 \pm 1.24 \)
= (70.50 - 1.24) to (70.50 + 1.24)
= $69.26 to $71.74

Solution 8-1
We can say that we are 90% confident that the mean price of all such college textbooks is between $69.26 and $71.74.

Figure 8.4 Confidence intervals.
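The large-sample interval from Example 8-1 can be checked with a short Python sketch. The function name `z_interval` and its parameters are illustrative assumptions, not part of the slides. When no z is passed, the value is computed with `statistics.NormalDist` rather than read from a table, so it can differ slightly from the rounded table value 1.65.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sd, n, confidence, z=None):
    """Large-sample confidence interval for the mean: x_bar +/- z * sd / sqrt(n).

    sd is sigma if known, otherwise the sample standard deviation s.
    If z is None, it is computed from the standard normal distribution.
    """
    if z is None:
        alpha = 1 - confidence
        z = NormalDist().inv_cdf(1 - alpha / 2)  # leaves alpha/2 in each tail
    max_error = z * sd / math.sqrt(n)            # E in the slides
    return x_bar - max_error, x_bar + max_error


# Example 8-1 (b), using the table value z = 1.65 as the slides do.
low, high = z_interval(70.50, 4.50, 36, 0.90, z=1.65)
print(low, high)

# Same data with z computed instead of read from the table.
low_exact, high_exact = z_interval(70.50, 4.50, 36, 0.90)
print(low_exact, high_exact)
```

With z = 1.65 the endpoints are 69.2625 and 71.7375, which agree with the slide's $69.26 to $71.74 up to rounding. The computed z is a little below 1.65, so the second interval is slightly narrower.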
Example 8-2
According to a report by the Consumer Federation of America, National Credit Union Foundation, and the Credit Union National Association, households with negative assets carried an average of $15,528 in debt in 2002 (CBS.MarketWatch.com, May 14, 2002). Assume that this mean was based on a random sample of 400 households and that the standard deviation of debts for households in this sample was $4200. Make a 99% confidence interval for the 2002 mean debt for all such households.

Solution 8-2
Confidence level is 99% or .99
\( s_{\bar{x}} = s/\sqrt{n} = 4200/\sqrt{400} = \$210 \)
The sample is large (n > 30). Therefore, we use the normal distribution.
z = 2.58

Solution 8-2
\( \bar{x} \pm z\,s_{\bar{x}} = 15{,}528 \pm 2.58(210) = 15{,}528 \pm 541.80 \)
= $14,986.20 to $16,069.80
Thus, we can state with 99% confidence that the 2002 mean debt for all households with negative assets was between $14,986.20 and $16,069.80.

INTERVAL ESTIMATION OF A POPULATION MEAN: SMALL SAMPLES
The t Distribution
Confidence Interval for μ Using the t Distribution

The t Distribution
Conditions Under Which the t Distribution Is Used to Make a Confidence Interval About μ
The t distribution is used to make a confidence interval about μ if
1. The population from which the sample is drawn is (approximately) normally distributed
2. The sample size is small (that is, n < 30)
3. The population standard deviation, σ, is not known

The t Distribution cont.
The t distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the standard normal distribution. As the sample size becomes larger, the t distribution approaches the standard normal distribution. The t distribution has only one parameter, called the degrees of freedom (df). The mean of the t distribution is equal to 0 and its standard deviation is \( \sqrt{df/(df-2)} \).

Figure 8.5 The t distribution for df = 9 and the standard normal distribution.
(In the figure, the standard deviation of the standard normal distribution is 1.0, the standard deviation of the t distribution is \( \sqrt{9/(9-2)} = 1.134 \), and both curves are centered at μ = 0.)

Example 8-3
Find the value of t for 16 degrees of freedom and .05 area in the right tail of a t distribution curve.

Table 8.1 Determining t for 16 df and .05 Area in the Right Tail

        Area in the Right Tail Under the t Distribution Curve
df      .10      .05      .025      …    .001
1       3.078    6.314    12.706    …    318.309
2       1.886    2.920    4.303     …    22.327
3       1.638    2.353    3.182     …    10.215
.       …        …        …         …    …
16      1.337    1.746    2.120     …    3.686
.       …        …        …         …    …

The required value of t for 16 df and .05 area in the right tail is 1.746.

Figure 8.6 The value of t for 16 df and .05 area in the right tail. (df = 16; the area to the right of t = 1.746 is .05. This is the required value of t.)

Figure 8.7 The value of t for 16 df and .05 area in the left tail. (df = 16; the area to the left of t = -1.746 is .05.)

Confidence Interval for μ Using the t Distribution
The (1 – α)100% confidence interval for μ is
\( \bar{x} \pm t\,s_{\bar{x}} \) where \( s_{\bar{x}} = s/\sqrt{n} \)
The value of t is obtained from the t distribution table for n – 1 degrees of freedom and the given confidence level.

Example 8-4
Dr. Moore wanted to estimate the mean cholesterol level for all adult men living in Hartford. He took a sample of 25 adult men from Hartford and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Hartford are (approximately) normally distributed. Construct a 95% confidence interval for the population mean μ.

Solution 8-4
Confidence level is 95% or .95
\( s_{\bar{x}} = s/\sqrt{n} = 12/\sqrt{25} = 2.40 \)
df = n – 1 = 25 – 1 = 24
Area in each tail = .5 – (.95/2) = .5 – .4750 = .025
The value of t in the right tail is 2.064

Figure 8.8 The value of t. (df = 24; an area of .025 lies in each tail beyond t = -2.064 and t = 2.064, with .4750 on each side of 0 between them.)

Solution 8-4
\( \bar{x} \pm t\,s_{\bar{x}} = 186 \pm 2.064(2.40) = 186 \pm 4.95 \)
= 181.05 to 190.95
Thus, we can state with 95% confidence that the mean cholesterol level for all adult men living in Hartford lies between 181.05 and 190.95.
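The small-sample calculation in Example 8-4 can be sketched in Python. This is a minimal sketch under stated assumptions: the function names `t_std_dev` and `t_interval` are hypothetical, and the t value is passed in as read from the t table, because the Python standard library has no t-distribution quantile function.

```python
import math


def t_std_dev(df):
    """Standard deviation of the t distribution, sqrt(df / (df - 2)); needs df > 2."""
    if df <= 2:
        raise ValueError("the standard deviation is defined only for df > 2")
    return math.sqrt(df / (df - 2))


def t_interval(x_bar, s, n, t_value):
    """Confidence interval for the mean: x_bar +/- t * s / sqrt(n).

    t_value must be looked up for n - 1 degrees of freedom and the
    chosen confidence level (for example, 2.064 for df = 24 and 95%).
    """
    s_x_bar = s / math.sqrt(n)
    max_error = t_value * s_x_bar
    return x_bar - max_error, x_bar + max_error


# Figure 8.5: standard deviation of the t distribution for df = 9.
print(round(t_std_dev(9), 3))

# Example 8-4: n = 25, x_bar = 186, s = 12, t = 2.064 for df = 24.
low, high = t_interval(186, 12, 25, 2.064)
print(low, high)
```

The endpoints come out near 181.05 and 190.95, matching Solution 8-4. Passing the t value explicitly keeps the sketch dependency-free; a statistics library could supply it instead.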
Example 8-5
Twenty-five randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Assume that such expenses for all adults who buy books for general reading have an approximate normal distribution. Determine a 99% confidence interval for the corresponding population mean.

Solution 8-5
Confidence level is 99% or .99
\( s_{\bar{x}} = s/\sqrt{n} = 300/\sqrt{25} = \$60 \)
df = n – 1 = 25 – 1 = 24
Area in each tail = .5 – (.99/2) = .5 – .4950 = .005
The values of t are 2.797 and -2.797

Solution 8-5
The 99% confidence interval for μ is
\( \bar{x} \pm t\,s_{\bar{x}} = \$1450 \pm 2.797(60) = \$1450 \pm \$167.82 \)
= $1282.18 to $1617.82

INTERVAL ESTIMATION OF A POPULATION PROPORTION: LARGE SAMPLES
Estimator of the Standard Deviation of \( \hat{p} \)
The value of \( s_{\hat{p}} \), which gives a point estimate of \( \sigma_{\hat{p}} \), is calculated as
\( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \)

INTERVAL ESTIMATION OF A POPULATION PROPORTION: LARGE SAMPLES cont.
Confidence Interval for the Population Proportion, p
The (1 – α)100% confidence interval for the population proportion, p, is
\( \hat{p} \pm z\,s_{\hat{p}} \)
The value of z used here is obtained from the standard normal distribution table for the given confidence level, and \( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \).

Example 8-6
According to a 2002 survey by FindLaw.com, 20% of Americans needed legal advice during the past year to resolve such thorny issues as family trusts and landlord disputes (CBS.MarketWatch.com, August 6, 2002). Suppose a recent sample of 1000 adult Americans showed that 20% of them needed legal advice during the past year to resolve such family-related issues.
a) What is the point estimate of the population proportion? What is the margin of error of this estimate?
b) Find, with a 99% confidence level, the percentage of all adult Americans who needed legal advice during the past year to resolve such family-related issues.
Solution 8-6
n = 1000, \( \hat{p} \) = .20, and \( \hat{q} \) = .80
\( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(.20)(.80)/1000} = .01264911 \)
Note that \( n\hat{p} \) and \( n\hat{q} \) are both greater than 5.

Solution 8-6
a) Point estimate of p = \( \hat{p} \) = .20
Margin of error = \( \pm 1.96\,s_{\hat{p}} \) = ±1.96(.01264911) = ±.025 or ±2.5%

Solution 8-6
b) The confidence level is 99%, or .99. The z value for .4950 is approximately 2.58.
\( \hat{p} \pm z\,s_{\hat{p}} = .20 \pm 2.58(.01264911) = .20 \pm .033 \)
= .167 to .233 or 16.7% to 23.3%

Example 8-7
According to the analysis of a CNN–USA TODAY–Gallup poll conducted in October 2002, “Stress has become a common part of everyday life in the United States. The demands of work, family, and home place an increasing burden on the average American.” According to this poll, 40% of Americans included in the survey indicated that they had a limited amount of time to relax (Gallup.com, November 8, 2002). The poll was based on a randomly selected national sample of 1502 adults aged 18 and older. Construct a 95% confidence interval for the corresponding population proportion.

Solution 8-7
Confidence level = 95% or .95
\( s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(.40)(.60)/1502} = .01264069 \)
The value of z for .95 / 2 = .4750 is 1.96.

Solution 8-7
\( \hat{p} \pm z\,s_{\hat{p}} = .40 \pm 1.96(.01264069) = .40 \pm .025 \)
= .375 to .425 or 37.5% to 42.5%

DETERMINING THE SAMPLE SIZE FOR THE ESTIMATION OF THE MEAN
Given the confidence level and the standard deviation of the population, the sample size that will produce a predetermined maximum error E of the confidence interval estimate of μ is
\( n = \dfrac{z^2 \sigma^2}{E^2} \)
where E is
\( E = z\,\sigma_{\bar{x}} = z \cdot \dfrac{\sigma}{\sqrt{n}} \)

Example 8-8
An alumni association wants to estimate the mean debt of this year’s college graduates. It is known that the population standard deviation of the debts of this year’s college graduates is $11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within $800 of the population mean?
Solution 8-8
\( n = \dfrac{z^2 \sigma^2}{E^2} = \dfrac{(2.58)^2 (11{,}800)^2}{(800)^2} = 1448.18 \approx 1449 \)

DETERMINING THE SAMPLE SIZE FOR THE ESTIMATION OF PROPORTION
Given the confidence level and the values of p and q, the sample size that will produce a predetermined maximum error E of the confidence interval estimate of p is
\( n = \dfrac{z^2 p q}{E^2} \)
where E is
\( E = z\,\sigma_{\hat{p}} = z \sqrt{\dfrac{pq}{n}} \)

DETERMINING THE SAMPLE SIZE FOR THE ESTIMATION OF PROPORTION cont.
In case the values of p and q are not known:
1. Take the most conservative estimate of the sample size n by using p = .5 and q = .5. For a given E, these values of p and q will give the largest sample size in comparison to any other pair of values of p and q, since their product is greater than the product of any other pair.

DETERMINING THE SAMPLE SIZE FOR THE ESTIMATION OF PROPORTION cont.
2. Take a preliminary sample of arbitrarily determined size and calculate \( \hat{p} \) and \( \hat{q} \) from this sample. Then use them to find n.

Example 8-9
Lombard Electronics Company has just installed a new machine that makes a part that is used in clocks. The company wants to estimate the proportion of these parts produced by this machine that are defective. The company manager wants this estimate to be within .02 of the population proportion for a 95% confidence level. What is the most conservative estimate of the sample size that will limit the maximum error to within .02 of the population proportion?

Solution 8-9
The value of z for a 95% confidence level is 1.96.
p = .50 and q = .50
\( n = \dfrac{z^2 p q}{E^2} = \dfrac{(1.96)^2 (.50)(.50)}{(.02)^2} = 2401 \)
Thus, if the company takes a sample of 2401 parts, there is a 95% chance that the estimate of p will be within .02 of the population proportion.

Example 8-10
Consider Example 8-9 again. Suppose a preliminary sample of 200 parts produced by this machine showed that 7% of them are defective. How large a sample should the company select so that the 95% confidence interval for p is within .02 of the population proportion?
Solution 8-10
\( \hat{p} \) = .07 and \( \hat{q} \) = .93
\( n = \dfrac{z^2 \hat{p}\hat{q}}{E^2} = \dfrac{(1.96)^2 (.07)(.93)}{(.02)^2} = \dfrac{(3.8416)(.07)(.93)}{.0004} = 625.22 \approx 626 \)
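The two sample-size formulas can be checked against Examples 8-8, 8-9, and 8-10 with a short Python sketch. The function names are hypothetical. The result is always rounded up to the next whole number, as in the solutions. Rounding to six decimals before taking the ceiling is an added safeguard, not part of the slides: it keeps floating-point noise from pushing an exact answer such as 2401 up to 2402.

```python
import math


def _round_up(value):
    """Round up to the next integer, ignoring floating-point noise below 1e-6."""
    return math.ceil(round(value, 6))


def sample_size_mean(z, sigma, max_error):
    """n = z^2 * sigma^2 / E^2 for estimating a population mean."""
    return _round_up((z ** 2) * (sigma ** 2) / (max_error ** 2))


def sample_size_proportion(z, p, max_error):
    """n = z^2 * p * q / E^2 for estimating a proportion, with q = 1 - p.

    Use p = 0.5 for the most conservative (largest) sample size, or a
    preliminary sample proportion when one is available.
    """
    q = 1 - p
    return _round_up((z ** 2) * p * q / (max_error ** 2))


n_mean = sample_size_mean(2.58, 11_800, 800)            # Example 8-8
n_conservative = sample_size_proportion(1.96, 0.5, 0.02)  # Example 8-9
n_preliminary = sample_size_proportion(1.96, 0.07, 0.02)  # Example 8-10
print(n_mean, n_conservative, n_preliminary)
```

The preliminary estimate of 7% defective cuts the required sample from 2401 to 626, which shows how conservative the p = .5 assumption is when the true proportion is far from one half.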