Interval Estimation (Confidence Intervals)
Section 9 [Book Chapter 8]

I. Introduction
- What does \(\bar{x}\) try to estimate?
- What does \(\bar{p}\) try to estimate?
- What does \(s\) try to estimate?
- The questions are:
  • What are we trying to measure?
  • We know we are not exactly right, so: what range might we be in (and how confident are we)?

II. Definitions
Estimator versus Estimate:
Estimator:
Estimate:
point estimate: a single number, computed from a sample, used to estimate a population parameter.
Exs:
interval estimate: a range of numbers, computed from a sample, that you believe contains the population parameter.
Exs:
confidence interval: an interval estimate reported together with a stated confidence level.
Exs:
Sampling error = \(\bar{x} - \mu\)

III. Properties of Estimators
- We want our estimators to have three properties:
1. Unbiased: E(estimator) = population parameter
   Exs:
2. Efficiency:
   - An estimator is efficient if its variance is smaller than that of any other unbiased estimator.
3. Consistency: \(\lim_{n \to \infty} P(|\text{estimator} - \text{population parameter}| < \varepsilon) = 1\) for every \(\varepsilon > 0\)
   As n increases, the probability that the estimator is within any fixed distance of the population parameter gets closer to 1.

IV. Interval Estimation (Confidence Intervals) for the Population Mean: large sample case (n ≥ 30)
Confidence Intervals for the Population Mean: Gives a range of where we believe the population parameter is.
A. What a confidence interval looks like
B. Clearing up misconceptions of confidence intervals (what a confidence interval does and does not say)
EX: We are going to calculate a 95% confidence interval for μ
What it means: If we take a large number of samples and calculate a 95% confidence interval for each of them (e.g. [-1, 1]; [-0.9, 1.1]; [-1.1, 0.9]; ...), about 95% of them will contain the population mean. For example, if we take 100 samples and calculate a confidence interval for each, we expect 95 to contain the population mean.
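The repeated-sampling reading of "95% confident" can be checked with a small simulation. The sketch below is only an illustration: the population values (mu = 50, sigma = 10), the sample size, and the number of trials are made-up assumptions, and it uses the known-σ interval \(\bar{x} \pm 1.96 \cdot \sigma/\sqrt{n}\) developed later in this section.

```python
import math
import random

def coverage_rate(mu=50.0, sigma=10.0, n=40, trials=2000, z=1.96, seed=0):
    """Share of simulated 95% intervals (sigma known) that contain mu.

    All default values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)  # same margin for every sample
    hits = 0
    for _ in range(trials):
        # Draw one random sample from a normal population and average it.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Does this sample's interval contain the true mean?
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains mu or it does not; the 95% describes the procedure over many samples.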
A Picture:

What it does NOT mean: Given a computed confidence interval [a, b], there is a 95% chance the population mean is between a and b. (We cannot assign a probability to a specific interval: μ is a fixed number, so it either is in [a, b] or it is not.)

C. Calculate a confidence interval for μ (n ≥ 30)
1. If you know σ:
   Estimate of μ = \(\bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
   α = 1 − confidence level
   α is the amount you want in both tails put together (i.e. α/2 is in each tail)
   \(Z_{\alpha/2}\): \(P(Z > Z_{\alpha/2}) = \alpha/2\)
   Table: Look up the z value for the following probability: \(0.5 - \alpha/2\)
   In Excel: \(Z_{\alpha/2}\) = ABS(NORMSINV(α/2))
   Picture:
   What would your \(Z_{\alpha/2}\) be if you wanted to calculate a 90% confidence interval?

   Ex: Sample of profits from small firms, in millions. We collect a random sample of 10,000 firms; the sample mean is 83.2 (million) and we know the population standard deviation is 3 (million). Calculate a 95% confidence interval for the population mean. Interpret.

2. If you don't know σ:
   Estimate of μ = \(\bar{x} \pm Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}\)

   Ex: Sample of ACT scores from Wisconsin students. You collect a random sample of 100 students and find a sample mean of 16; you also compute a sample variance of 9.
   Calculate a 90% confidence interval for the population mean. Interpret.
   Calculate a 99% confidence interval for the population mean. Interpret.

V. Interval Estimation (Confidence Intervals) for the Population Mean: small sample case (n < 30)
A. Important assumption needed for small sample: must assume population is normally distributed
Solution:
1. If you know σ: Assume population is normally distributed
   Estimate of μ = \(\bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
2. If you don't know σ: Assume population is normally distributed
   Estimate of μ = \(\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}\)
   t-dist with n − 1 degrees of freedom
Note t-dist. vs. z-dist:
Looking up t in the table:
Ex: (sample size = 30, 95% confidence interval)
t in Excel: =TINV(α, degrees of freedom)
Ex: (sample size = 30, 95% confidence interval)

VI.
Justifying the confidence interval
We want to show why \(\bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\) is a 100(1 − α) percent confidence interval for μ.
If n is large, what does the distribution of \(\bar{x}\) look like?

VII. Interval estimation (confidence interval) of a population proportion
Estimate of p = \(\bar{p} \pm Z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}\)
Ex: You calculate that the proportion of Polish people in Milwaukee in 1965 was 0.61. If your sample size was 400, calculate a 90% confidence interval.

VIII. Determining sample size (needed)
What if we have some idea of how much error we are willing to accept, and we want to know how large a sample we need to draw to get this margin of error?
What was the formula for our margin of error?
Ex: Suppose you are working for Blockbuster and they want to know the average number of days a video is returned late. Your boss wants the margin of error to be only 1 day, with a 99% confidence interval. Assume the population standard deviation is 3.2. How large a sample do you need to collect?

Practice Problems
1. An economist would like to estimate the mean annual income of widowers living in Idaho. A simple random sample of 100 widowers is taken, and their mean income is found to equal $11,500. The economist knows that the population standard deviation equals $2400.
A. Construct a point estimate for the mean income of all widowers in Idaho.
B. Should the z ratio or the t ratio be used for standardizing \(\bar{x}\)? Which table should be used for finding the z or t values?
C. Construct a 95% confidence interval for the mean income of all widowers in Idaho.
D. If the sample size increased, holding all else constant, how would this affect our confidence interval?
E. If the level of confidence increased, holding all else constant, how would this affect our confidence interval?
F. If the economist were to learn that the population standard deviation was actually larger than previously assumed, holding all else constant, what impact would this have on the confidence interval?
G.
If the sample mean increased by $500, holding all else constant, what impact would this have on the confidence interval?
2. A computer sales representative would like to construct a 99% confidence interval for estimating the mean cost of computer repairs at his company. He knows from past experience that σ = 35, and is willing to accept a margin of error no greater than $10. Given this information, how many computers would he have to sample?

Answers
1.
A. $11,500
B. The Z ratio should be used since our sample size is over 30.
C. \(11{,}500 \pm 1.96 \left( \frac{2400}{\sqrt{100}} \right) = 11{,}500 \pm 470.4\), so \(11{,}029.6 < \mu < 11{,}970.4\) (margin of error is 470.4)
D. Increasing the sample size would decrease the width.
E. Increasing the level of confidence would increase the width.
F. Increasing the population standard deviation would increase the width.
G. The confidence interval would shift to the right by $500 (\(11{,}529.6 < \mu < 12{,}470.4\)).
2. 82
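
The answers to 1C and 2 can be double-checked numerically. The sketch below is one possible way to do it in Python, using the standard library's NormalDist in place of the z table. The function names are made up for this illustration, and the sample-size formula \(n = (Z_{\alpha/2} \cdot \sigma / E)^2\), rounded up, is assumed to be the margin-of-error formula from Section VIII solved for n.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Z_{alpha/2}: the z value leaving alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_interval(xbar, sigma, n, confidence):
    """Known-sigma interval: xbar +/- Z_{alpha/2} * sigma / sqrt(n)."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

def required_sample_size(sigma, margin, confidence):
    """Smallest whole n whose margin of error is at most `margin`."""
    return math.ceil((z_critical(confidence) * sigma / margin) ** 2)

# Problem 1C: xbar = 11,500, sigma = 2,400, n = 100, 95% confidence.
low, high = mean_interval(11_500, 2_400, 100, 0.95)
# Problem 2: sigma = 35, margin of error = 10, 99% confidence.
n_needed = required_sample_size(35, 10, 0.99)

print(round(low, 1), round(high, 1))
print(n_needed)
```

Because the code uses an unrounded z value instead of the table's 1.96, the endpoints may differ from hand calculations in the second decimal place. The sample size is always rounded up, since rounding down would give a margin of error slightly above the target.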