Confidence interval of the Mean

i. Estimating the confidence interval for the population mean (σ known)

This is given by the following formula:

x̄ ± Z*σ/√n

Where:
x̄ is the sample mean
Z is the critical value of the standard normal distribution at the given level of confidence, found by taking the inverse of the cumulative probability (e.g. the inverse at 0.975 for 95% confidence)
σ is the population standard deviation
n is the sample size
σ/√n is the Standard Error (S.E)
Z*σ/√n is the Margin of Error (M.E)

We can therefore rewrite the formula as:

x̄ ± M.E

NOTE
At 80% confidence level, Z = 1.282
At 90% confidence level, Z = 1.645
At 95% confidence level, Z = 1.96
At 98% confidence level, Z = 2.326
At 99% confidence level, Z = 2.576

Example
The American Management Association wishes to have information on the mean income of store managers in the retail industry. A random sample of 256 managers reveals a mean of $45,420. The standard deviation of this population is $2,050. The association would like answers to the following questions:

1. What is the sample mean?
x̄ = 45420

2. What is the point estimate?
The point estimate is a measure derived from the sample that is used to estimate the population parameter, e.g. the sample mean, the sample standard deviation or the sample proportion. Here the point estimate of the population mean is the sample mean, $45,420.

3. What is the 95% confidence interval for the population mean?
x̄ ± M.E
x̄ ± Z*σ/√n
45420 ± (1.96*(2050/√256))
45420 ± 251.125
Lower limit = 45169
Upper limit = 45671
45,169 < µ < 45,671

4. How do we interpret these results?
We are 95% confident that the true population mean is between $45,169 and $45,671.

Example
The Bun-and-Run is a franchise fast-food restaurant located in the Northeast specializing in half-pound hamburgers, fish sandwiches, and chicken sandwiches. Soft drinks and French fries are also available. The Planning Department of Bun-and-Run Inc. reports that the distribution of daily sales for restaurants follows the normal distribution and that the population standard deviation is $3,000. A sample of 40 showed the mean daily sales to be $20,000.
(a) What is the population mean?
We do not know the population mean.

(b) What is the best estimate of the population mean? What is this value called?
The best estimate is the sample mean, $20,000. This value is called the point estimate of the population mean.

(c) Develop a 99 percent confidence interval for the population mean.
x̄ ± Z*σ/√n
20000 ± (2.58*(3000/√40)), with Z = 2.576 rounded to 2.58
Lower limit = 18776
Upper limit = 21224
18,776 < µ < 21,224

(d) Interpret the confidence interval.
We are 99% confident that the true population mean is between $18,776 and $21,224.

Questions

1. A research firm conducted a survey to determine the mean amount steady smokers spend on cigarettes during a week. The distribution of amounts spent per week followed the normal distribution with a population standard deviation of $5. A sample of 49 steady smokers revealed that x̄ = 20.

The point estimate is 20, as it is the sample mean.
The 95% confidence interval will be:
M.E = Z*S.E
C.I = Point estimate ± M.E
20 ± 1.96*(5/√49) = 20 ± 1.4 = (18.6, 21.4)
We are 95% confident that the true population mean is within the interval above.

2. Refer to the previous question. Suppose that 64 smokers (instead of 49) were sampled. Assume the sample mean remained the same.

a. What is the 95 percent confidence interval estimate of µ?
20 ± 1.96*(5/√64) = 20 ± 1.225 = (18.775, 21.225)

b. Explain why this confidence interval is narrower than the one determined in the previous exercise.
NB: Whenever we increase the sample size, the standard error σ/√n decreases, so the margin of error decreases and the confidence interval becomes narrower.

Obtaining the Z critical value on Excel
Assuming we are interested in finding the critical value at 95% confidence level, the Excel function is:
=NORM.S.INV(probability)
Here probability is the cumulative area to the left of the critical value. For a two-sided 95% interval this is 1 - 0.05/2 = 0.975, so =NORM.S.INV(0.975) returns approximately 1.96.

ii. Estimating the confidence interval for the population mean (σ unknown)

Whenever we are estimating the confidence interval when the population standard deviation is not known, we use the t-distribution.
The formula is given by:

x̄ ± t*s/√n

Where t is the critical value with n-1 degrees of freedom and s is the sample standard deviation.

Example
1. Find the t-critical value at 95% confidence level when the sample size is 21.
Degrees of freedom (df) = n-1
df = 21-1 = 20
From the t-table, t = 2.086

2. Find the t-critical value at 90% confidence level when the sample size is 17.
df = 17-1 = 16
From the t-table, t = 1.746

Computing the t-critical Value using Excel
When interested in computing the t critical value at a given level of confidence and n-1 degrees of freedom, we use the following formula:
=T.INV.2T(probability, df)
Where 2T represents the two tails and probability is the total area in both tails (1 minus the confidence level).

Example
Recompute the above critical values using Excel.

1. Find the t-critical value at 95% confidence level when the sample size is 21.
Degrees of freedom (df) = n-1
df = 21-1 = 20
=T.INV.2T(0.05, 20) = 2.085963447

2. Find the t-critical value at 90% confidence level when the sample size is 17.
df = 17-1 = 16
=T.INV.2T(0.1, 16) = 1.745883676

Relationship Between Z-distribution and t-distribution
The following characteristics of the t distribution are based on the assumption that the population of interest is normal, or nearly normal.
• It is, like the z distribution, a continuous distribution.
• It is, like the z distribution, bell-shaped and symmetrical.
• There is not one t distribution, but rather a family of t distributions. All t distributions have a mean of 0, but their standard deviations differ according to the sample size, n. There is a t distribution for a sample size of 20, another for a sample size of 22, and so on. The standard deviation for a t distribution with 5 observations is larger than for a t distribution with 20 observations.
• The t distribution is more spread out and flatter at the center than the standard normal distribution. As the sample size increases, however, the t distribution approaches the standard normal distribution, because the errors in using s to estimate σ decrease with larger samples.
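The last point can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names t_pdf, t_cdf and t_critical are illustrative assumptions, not part of these notes. The t CDF is approximated with Simpson's rule and inverted by bisection, which is one simple approach rather than how Excel computes T.INV.2T.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Two-tailed critical value t with P(-t < T < t) = confidence."""
    target = 0.5 + confidence / 2
    low, high = 0.0, 100.0  # assumption: the critical value lies below 100
    for _ in range(50):  # bisection on the increasing CDF
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


z = NormalDist().inv_cdf(0.975)  # standard normal critical value at 95%
for df in (5, 20, 100, 1000):
    print(f"df = {df:4d}: t = {t_critical(0.95, df):.4f}  (z = {z:.4f})")
```

The critical values for df = 20 at 95% and df = 16 at 90% should agree with the table values 2.086 and 1.746 quoted above, and the printed t values shrink toward z as df grows.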
Example
A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inches of tread remaining with a standard deviation of 0.09 inches. Construct a 95 percent confidence interval for the population mean. Would it be reasonable for the manufacturer to conclude that after 50,000 miles the population mean amount of tread remaining is 0.30 inches?

The formula is given by:
x̄ ± t*s/√n
With df = 10-1 = 9, t = 2.262
0.32 ± (2.262*(0.09/√10))
Lower limit = 0.32 - (2.262*(0.09/√10)) = 0.2556
Upper limit = 0.32 + (2.262*(0.09/√10)) = 0.3844
0.2556 < µ < 0.3844
We are 95% confident that the true population mean is between 0.2556 and 0.3844 inches.
Yes, it is reasonable, because 0.30 falls within the confidence interval.

Estimating the Confidence interval of a Single Proportion

Before constructing a confidence interval for a proportion, we first need to verify that the normal approximation is appropriate:
1. n*p(1-p) ≥ 10
2. n < 0.05N (the sample is less than 5% of the population of size N)
For the example below, n*p(1-p) = 2000*0.8*0.2 = 320, so the first condition is met.

The interval is given by the formula:

p ± Z*√(p(1-p)/n)

where p is the sample proportion.
Point estimate = p = x/n, where x is the number of observations with the characteristic of interest and n is the total sample size.
Standard error = √(p(1-p)/n)
Margin of error = Z*√(p(1-p)/n)
The estimate of a population proportion is the sample proportion.

Example
The union representing the Bottle Blowers of America (BBA) is considering a proposal to merge with the Teamsters Union. According to BBA union bylaws, at least three-fourths of the union membership must approve any merger. A random sample of 2,000 current BBA members reveals 1,600 plan to vote for the merger proposal.

What is the estimate of the population proportion?
p = x/n = 1600/2000 = 0.8

Develop a 95 percent confidence interval for the population proportion.
p ± Z*√(p(1-p)/n)
0.8 ± 1.96*√(0.8(1-0.8)/2000)
Lower limit = 0.7825
Upper limit = 0.8175
0.7825 < P < 0.8175
We are 95% confident that the true population proportion is between 78.25% and 81.75%.
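The interval for the BBA sample can be reproduced with a short sketch. The function name proportion_interval is a hypothetical helper introduced here for illustration; it uses the standard library's NormalDist for the Z critical value and applies the n*p(1-p) ≥ 10 check stated above.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a single proportion."""
    p_hat = successes / n
    # Condition from the notes: n*p(1-p) must be at least 10
    if n * p_hat * (1 - p_hat) < 10:
        raise ValueError("normal approximation not appropriate: n*p(1-p) < 10")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-tailed critical value
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# BBA example: 1,600 of 2,000 sampled members plan to vote for the merger
lower, upper = proportion_interval(1600, 2000)
print(f"{lower:.4f} < P < {upper:.4f}")
```

Because the sketch uses the unrounded critical value rather than 1.96, its limits may differ from the hand calculation in the last decimal places.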
Basing your decision on this sample information, can you conclude that the necessary proportion of BBA members favor the merger? Why?
Since all values in the confidence interval are greater than 75%, we can conclude that the necessary proportion of BBA members favor the merger.

Sample Size Estimation

1. When we need to compute the sample size given the standard deviation, we use the following formula:

n = (Z*σ/E)²

Where:
n is the sample size
Z is the two-tailed critical value (Zα/2)
σ is the standard deviation
E is the margin of error we are willing to accept

Example
A student in public administration wants to determine the mean amount members of city councils in large cities earn per month as remuneration for being a council member. The error in estimating the mean is to be less than $100 with a 95 percent level of confidence. The student found a report by the Department of Labor that reported a standard deviation of $1,000. What is the minimum required sample size?
n = (Z*σ/E)² = (1.96*1000/100)² = 384.16
We round up this value to get n = 385.

Question
A population is estimated to have a standard deviation of 10. We want to estimate the population mean within 2, with an 80 percent level of confidence. How large a sample is required?

2. The sample size when dealing with proportions is given by:

n = Z²*p(1-p)/E²

Where Z is the two-tailed critical value (Zα/2), p is the estimated proportion and E is the margin of error.

Example
The study in the previous example also estimates the proportion of cities that have private refuse collectors. The student wants the margin of error to be within 0.10 of the population proportion, the desired level of confidence is 90 percent, and no estimate is available for the population proportion. What is the required sample size? Assume that the prior estimate was 30% for the proportion.
What is the new minimum sample size required?

When no prior estimate is available for the population proportion, we assume p = 50% = 0.5.
n = Z²*p(1-p)/E²
n = 1.645²*0.5(1-0.5)/0.10² = 67.65
We round up to get n = 68.

When p = 0.3, we have:
n = 1.645²*0.3(1-0.3)/0.10² = 56.83
Rounding up gives n = 57.

Question
It is estimated that 60 percent of U.S. households subscribe to cable TV. You would like to verify this statement for your class in mass communications. If you want your estimate to be within 5 percentage points, with a 95 percent level of confidence, how large a sample is required?

Finite Population Correction Factor

The Finite Population Correction Factor (FPC) is used when you sample more than 5% of a finite population. It is needed because, when sampling without replacement from a small population, the observations are no longer independent, so the usual assumptions behind the Central Limit Theorem do not hold and the unadjusted standard error of the estimate (e.g. of the mean or proportion) will be too big. In other words, when the population is small enough that the sample is a major fraction of it (i.e. greater than 5%), the standard error is reduced by multiplying it by the finite population correction factor to form the adjusted standard error. This formula is used when we are given the size of the population from which the sample is drawn.

The formula is:

√((N - n)/(N - 1))

Where N is the size of the population and n the size of the sample.

Example
There are 250 families in Scandia, Pennsylvania. A random sample of 40 of these families revealed the mean annual church contribution was $450 and the standard deviation of this was $75.

1. Develop a 90 percent confidence interval for the population mean.
x̄ ± t*(s/√n)*√((N - n)/(N - 1))
With df = 40-1 = 39, t = 1.685
450 ± 1.685*(75/√40)*√((250 - 40)/(250 - 1))
FPC = 0.91835
Lower limit = 450 - 1.685*(75/√40)*0.91835 = 431.6
Upper limit = 450 + 1.685*(75/√40)*0.91835 = 468.4
(431.6, 468.4)

Question
Thirty people from a population of 300 were asked how much they had in savings.
The sample mean (x̄) was $1,500, with a sample standard deviation of $89.55. Construct a 95% confidence interval estimate for the population mean.

FPC = √((300 - 30)/299) = 0.9502684
S.E = 89.55/√30 = 16.3495183415
Adjusted S.E (A.S.E) = 16.3495183415*0.9502684 = 15.5364306351
t critical value (df = 29) = 2.045229642
M.E = 2.045229642*15.5364306351 = 31.7755684658
Lower bound (L.B) = 1500 - 31.7755684658 = 1468.22
Upper bound (U.B) = 1500 + 31.7755684658 = 1531.78
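The finite-population calculations above can be sketched in a few lines of Python. The function name fpc_interval and its parameters are illustrative assumptions; the t critical value is passed in by the caller (taken here from the worked solutions) rather than computed, since the standard library has no t quantile function.

```python
import math


def fpc_interval(mean, s, n, N, t_crit):
    """Confidence interval for a mean with the finite population correction.

    t_crit is supplied by the caller (e.g. from a t-table or Excel's
    T.INV.2T) and should use n - 1 degrees of freedom.
    """
    fpc = math.sqrt((N - n) / (N - 1))      # finite population correction
    adjusted_se = s / math.sqrt(n) * fpc    # adjusted standard error
    margin = t_crit * adjusted_se           # margin of error
    return mean - margin, mean + margin


# Savings question: n = 30, N = 300, t = 2.045229642 (df = 29, 95%)
lower, upper = fpc_interval(1500, 89.55, 30, 300, 2.045229642)
print(f"({lower:.2f}, {upper:.2f})")

# Scandia example: n = 40, N = 250, t = 1.685 (df = 39, 90%)
lower2, upper2 = fpc_interval(450, 75, 40, 250, 1.685)
print(f"({lower2:.2f}, {upper2:.2f})")
```

Both calls should land on the intervals worked out by hand above, up to rounding. In both cases the sample is more than 5% of the population (10% and 16%), which is the condition under which the correction applies.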