ST 361: Estimation --- Interval Estimation for $\mu$ (§7.2, 7.4)

Topics:
I. Interval estimation: confidence interval
II. (Two-sided) Confidence interval for estimating population mean $\mu$
    (a) When the population SD is known: use Z distribution (§7.2)
    (b) When the population SD is NOT known: use t distribution (§7.4)
III. Two-sided confidence interval for estimating population mean difference $\mu_1 - \mu_2$ (§7.5)
    (a) When the population SD's $\sigma_1$, $\sigma_2$ are known
    (b) When the population SD's $\sigma_1$, $\sigma_2$ are NOT known
---------------------------------------------------------------------------------------------------------------------
I. Interval Estimate = to use an interval of plausible values to estimate the parameter

Why?
    Because of sampling variability, the point estimate is almost never exactly the correct value for the parameter.
    Point estimates don't tell us how close we are to the actual parameter.

So we use an interval called ___________________________ to report the likely range for the parameter of interest. It has the general form of ___________________________

Confidence interval for the population mean $\mu$

Suppose a sample $X_1, X_2, \ldots, X_n$ ($n \ge 30$) is randomly selected from a population with mean $\mu$ and SD $\sigma$. To estimate $\mu$ we use the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

1. Recall that $\bar{X} \sim$ _____________

2. Based on this normal distribution of $\bar{X}$, we can show that the middle 95% of the $\bar{X}$ fall within _____________________________________________________________
   Notice that this is an interval centering at __________________________

3. However, in reality we don't know $\mu$, and instead we only observe $\bar{X}$ from the sample collected. So what we really want is an interval centering at $\bar{X}$ with the same length, i.e., ________________. This interval is called _________________________________

4.
Meaning of the 95% confidence interval (CI) "$\bar{X} \pm 1.96\,\sigma_{\bar{X}}$":

   If $\bar{X}$ is from Region 2, then the interval $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$ _________________________
   If $\bar{X}$ is from Region 1, then the interval $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$ _________________________
   If $\bar{X}$ is from Region 3, then the interval $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$ _________________________

   Overall there are ______% of $\bar{X}$'s from Region 2. Hence _____________________
   _______________________________________________________

Meaning of the interval "$\bar{X} \pm 1.96\,\sigma_{\bar{X}}$": ___________________________.
Thus we say that we have 95% confidence that $\mu$ is covered within the range $(\bar{X} - 1.96\,\sigma_{\bar{X}},\ \bar{X} + 1.96\,\sigma_{\bar{X}})$.

♣ Comment:
(a) Note that a 95% CI should NOT be interpreted as a 95% CHANCE that $\mu$ is covered within the range $(\bar{X} - 1.96\,\sigma_{\bar{X}},\ \bar{X} + 1.96\,\sigma_{\bar{X}})$!
(b) Note that the fundamental assumption for constructing the CI for $\mu$ is that: ___________________________

II. (a) Confidence interval for $\mu$ when the population SD $\sigma$ is known (Ch 7.2)

If $\sigma$ is known, the CI for $\mu$ at a given confidence level is ___________________________

Ex. What is the critical value z* at confidence level 98%?

Ex. What is the critical value z* at confidence level 95%?

► Comment: the higher the confidence level is, the wider or narrower (choose one) a CI becomes.

Ex. X is from a population with mean $\mu$ and SD $\sigma = 10$. A sample of size 100 is collected and the sample mean is 5. What is the 90% CI for $\mu$?

Ex. The caffeine content (in mg) was examined for a random sample of 50 cups of black coffee dispensed by a new machine, and the mean was 110 mg. Denote by $\mu$ the mean caffeine content for cups dispensed by the machine, and assume the population SD of caffeine content for cups dispensed by the machine is 7.1 mg.
(1) Find an unbiased estimate for $\mu$ and explain why this estimate is unbiased.
(2) Construct the 99% CI for $\mu$.

II. (b) Confidence interval for $\mu$ when the population SD $\sigma$ is NOT known (Ch 7.4)

When $\sigma$ is unknown, the CI for $\mu$ at a given confidence level is ___________________________

Why use t* instead?
In practice, most of the time $\sigma$ is not known. To calculate the CI for $\mu$, we have to use _______ instead of $\sigma$.

   If X ~ Normal and $\sigma$ is known, $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim Z$, and hence we use a z critical value.
   If X ~ Normal and $s$ (instead of $\sigma$) is used in standardization, $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ is not normally distributed. Instead, it follows a __________

-------------------------------------------------------------------------------------------------------------------------------------
The t distribution:

The t distribution is similar to the standard normal distribution (the Z distribution) in many aspects:
(1)
(2)
(3)

However, it has _____________________ than the Z distribution.

Different sample sizes result in different thickness of the tails in a t distribution. The smaller the sample size is, ________________________________________

Each t distribution has a degree of freedom (df) associated with it. The df is defined as ___________ and the corresponding t distribution is denoted by ____________

When the sample size is very large (i.e., n > 120), $t(n-1) \approx Z$!

Use the t table to find the critical value (Page 566, Table IV)

Ex. Use the t table to find the 95% and 99% t-critical values for each of the following sample sizes:

                                                  t* (i.e., t-critical value)
Sample size n    Degree of freedom (df) = n-1     95%        99%
3                ____                             ____       ____
6                5                                2.571      4.032
12               11                               2.201      3.106
65               ____                             ____       ____

Note: When df > 30, there is little difference in the area under the curve between the t curves and the z curve. Therefore the textbook recommends using a z critical value when df > 30. But in our class, I'll ask you to always use a t critical value when $\sigma$ is unknown. When n is large, t* will become z* automatically.

Ex7. X ~ Normal distribution. n = 25, $\bar{X} = 8$ and s = 2. What is the 95% CI for the population mean $\mu$?

Ex8. X = # of claims received (per week) by an insurance company. Based on 41 weeks of samples, $\bar{X} = 18.5$ and s = 20.0. What is the 95% CI for $\mu$?

Interval Length and Margin of Error (Ch 7.2)

(a) Interval Length = ____________________________
(b) Margin of Error (MOE), aka _________________________________________, is defined as ___________________________________________

Ex.
Heights of the NCSU undergraduates are normally distributed with mean $\mu$ and SD $\sigma = 5$". A sample of size n = 100 was taken and the height of each individual was measured. The sample mean is $\bar{X} = 70.0$". What is the margin of error at the confidence level of 95%?

With a fixed confidence level, say, 95%, is a wider CI or a narrower CI better?

With a fixed confidence level, how can one reduce the MOE?

Ex. (Continued from the example above) John wishes to shorten the margin of error by half without compromising on the level of confidence, which is set to be 95%. What sample size should John use?
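
For checking hand calculations, the known-SD case (Section II (a)) and the margin-of-error questions above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the function names (z_critical, margin_of_error, z_interval, required_sample_size) are made up for this sketch, and it assumes the population SD is known, so it uses z* from the standard normal distribution rather than t*.

```python
import math
from statistics import NormalDist


def z_critical(level):
    """Two-sided z critical value z* for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def margin_of_error(sigma, n, level=0.95):
    """MOE = z* * sigma / sqrt(n); assumes the population SD sigma is known."""
    return z_critical(level) * sigma / math.sqrt(n)


def z_interval(xbar, sigma, n, level=0.95):
    """Two-sided CI for the population mean: xbar - MOE to xbar + MOE."""
    moe = margin_of_error(sigma, n, level)
    return xbar - moe, xbar + moe


def required_sample_size(sigma, target_moe, level=0.95):
    """Smallest n whose MOE is at most target_moe (sigma assumed known)."""
    n_exact = (z_critical(level) * sigma / target_moe) ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(n_exact, 9))


if __name__ == "__main__":
    # Heights example: sigma = 5 inches, n = 100, sample mean 70.0 inches.
    moe = margin_of_error(sigma=5, n=100, level=0.95)
    low, high = z_interval(xbar=70.0, sigma=5, n=100, level=0.95)
    print(f"MOE = {moe:.2f} in, 95% CI = ({low:.2f}, {high:.2f})")
    print("n needed to halve the MOE:", required_sample_size(5, moe / 2, 0.95))
```

Because the MOE is proportional to $1/\sqrt{n}$, cutting it in half at a fixed confidence level requires four times the sample size. When $\sigma$ is unknown, the same structure applies with $s$ in place of $\sigma$ and a t critical value with n-1 degrees of freedom in place of z*.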