Chapter 9 Section 2
Confidence Intervals about a Population Mean in Practice where the Population Standard Deviation is Unknown
Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 9 Section 2 – Slide 1 of 25
● Learning objectives
  1 Know the properties of the t-distribution
  2 Determine t-values
  3 Construct and interpret a confidence interval about a population mean
● In Section 1, we assumed that we knew the population standard deviation σ
● Since we did not even know the population mean μ, assuming that we knew σ seems unrealistic
● In this section, we construct confidence intervals in the case where we do not know the population standard deviation
● This is much more realistic
● If we don’t know the population standard deviation σ, we can’t use the formula
  Margin of error = 1.96 · σ / √n
  because we have no number to use for σ
● However, just as we can use the sample mean to approximate the population mean, we can also use the sample standard deviation s to approximate the population standard deviation σ
● Because we’ve changed our formula (by using s instead of σ), we can’t use the normal distribution any more
● Instead of the normal distribution, we use the Student’s t-distribution
● This distribution was developed specifically for the situation when σ is not known
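The claim that the normal critical value no longer works once s replaces σ can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are drawn from a normal population with hypothetical values μ = 10 and σ = 2, the sample size is n = 5, and 2.776 is the standard table value of t0.025 for 4 degrees of freedom (a value that does not appear in the slides).

```python
import math
import random
import statistics


def coverage(critical_value, n, trials=20000, mu=10.0, sigma=2.0, seed=1):
    """Fraction of simulated intervals xbar ± critical_value * s / sqrt(n)
    that contain the true mean mu (mu and sigma are hypothetical values)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
        margin = critical_value * s / math.sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


# Same interval formula, two different critical values, n = 5.
z_cov = coverage(1.96, n=5)   # normal critical value
t_cov = coverage(2.776, n=5)  # t critical value for 4 degrees of freedom
print(f"coverage using 1.96:  {z_cov:.3f}")
print(f"coverage using 2.776: {t_cov:.3f}")
```

With these settings the intervals built from 1.96 should capture μ noticeably less often than 95% of the time, while the intervals built from the t critical value should come out close to 95%.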
● Properties of the t-distribution
● Several properties of the Student’s t-distribution are familiar
  - Just like the standard normal distribution, it is centered at 0 and symmetric about 0
  - Just like the normal curve, the total area under the Student’s t curve is 1, the area to the left of 0 is ½, and the area to the right of 0 is also ½
  - Just like the normal curve, as |t| increases, the Student’s t curve gets close to, but never reaches, 0
● So what’s different?
● Unlike the normal, there are many different “standard” t-distributions
  - There is a “standard” one with 1 degree of freedom
  - There is a “standard” one with 2 degrees of freedom
  - There is a “standard” one with 3 degrees of freedom
  - Etc.
● The number of degrees of freedom is crucial for the t-distributions
● When σ is known, the z-score
  $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
  follows a standard normal distribution
● When σ is not known, the t-statistic
  $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
  follows a t-distribution with n - 1 degrees of freedom
● Comparing three curves
  [Figure: the standard normal curve, the t curve with 14 degrees of freedom, and the t curve with 4 degrees of freedom]
● The calculation of t-distribution values can be done in ways similar to the calculation of normal values
  - Using tables (such as Table V on the inside back cover)
  - Using technology (such as Excel,
MINITAB, calculators, StatCrunch, etc.)
● Because t-distribution tables are not complete, it is suggested that the calculations be done with one of the technology methods
● Critical values of the t-distribution for various degrees of freedom, compared to the normal

  n        Degrees of freedom   t0.025
  6        5                    2.571
  16       15                   2.131
  31       30                   2.042
  101      100                  1.984
  1001     1000                 1.962
  Normal   “Infinite”           1.960

● The difference between the two formulas
  $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ and $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
  is that the sample standard deviation s is used to approximate the population standard deviation σ
● The z-score has a normal distribution; the t-statistic (or the t-score) has a t-distribution
● A 95% confidence interval, with σ unknown, is
  $\bar{x} - t_{0.025} \cdot \frac{s}{\sqrt{n}}$ to $\bar{x} + t_{0.025} \cdot \frac{s}{\sqrt{n}}$
  where t0.025 is the critical value for the t-distribution with (n - 1) degrees of freedom
● The different confidence intervals with t0.025 would be
  - For n = 6, the sample mean ± 2.571 · s / √6
  - For n = 16, the sample mean ± 2.131 · s / √16
  - For n = 31, the sample mean ± 2.042 · s / √31
  - For n = 101, the sample mean ± 1.984 · s / √101
  - For n = 1001, the sample mean ± 1.962 · s / √1001
  - When σ is known, the sample mean ± 1.960 · σ / √n
● In
general, the $(1 - \alpha) \cdot 100\%$ confidence interval, when σ is unknown, is
  $\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ to $\bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
  where $t_{\alpha/2}$ is the critical value for the t-distribution with (n - 1) degrees of freedom
● As the sample size n gets large, there is less and less of a difference between the critical values for the normal and the critical values for the t-distribution
● It is correct to use the t-distribution when σ is not known
  - When using technology, always use the t-distribution
  - When doing a rough assessment by hand, the normal critical values can be used when n is large, for example if n is 30 or more
● When do the t-distribution and the normal differ by a lot?
● In either of two situations
  - When the sample size n is small (particularly if n is 10 or less), or
  - When the confidence level needs to be high (particularly if α is 0.005 or lower)
● When n and α are both small, the difference is large: for n = 6 (5 degrees of freedom) and an upper-tail area of 0.001, the t-distribution critical value is 5.893 compared to the normal critical value of 3.090
● Assume that we want to estimate the average weight of a particular type of very rare fish
  - We are only able to borrow 7 specimens of this fish
  - The average weight of these was 1.38 kg (the sample mean)
  - The standard deviation of these was 0.29 kg (the sample standard deviation)
● What is a 95% confidence interval for the true mean weight?
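Answering a question like this requires a t critical value first. The sketch below shows one way such a value could be obtained without a printed table. It is not the method used by Excel, MINITAB, or StatCrunch; it simply integrates the t density numerically (Simpson's rule) and searches for the critical value by bisection. The function names `t_pdf`, `upper_tail`, and `t_critical` are hypothetical helpers introduced for illustration.

```python
import math


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail(x, df, steps=1000):
    """Area to the right of x >= 0: one half minus the Simpson's-rule
    integral of the density over [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(tail_area, df):
    """Value t with area tail_area to its right, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > tail_area:
            lo = mid  # too much area remains to the right: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Upper-tail area 0.025 for several degrees of freedom.
for df in (5, 6, 15, 30, 100, 1000):
    print(df, round(t_critical(0.025, df), 3))
```

The printed values should agree with the t0.025 critical values quoted in this section to three decimal places; `t_critical(0.025, 6)` is the value needed for a sample of 7 fish.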
● n = 7, so the critical value t0.025 for 6 degrees of freedom is 2.447
● Our confidence interval thus is
  $1.38 - 2.447 \cdot \frac{0.29}{\sqrt{7}} \approx 1.11$ to $1.38 + 2.447 \cdot \frac{0.29}{\sqrt{7}} \approx 1.65$
  or (1.11, 1.65)
● Outliers are always a concern, but they are even more of a concern for confidence intervals using the t-distribution
  - The number of values n is often small, so each outlier has a major effect on the data set
  - The sample mean is sensitive to outliers
  - The sample standard deviation is sensitive to outliers
● This is a problem!
● So what can we do?
  - We must always check that the outlier is a legitimate data value (and not just a typo)
  - We can collect more data, for example to increase n to be over 30
● If neither method above will work, i.e. if the data value is a legitimate value and we are not able to collect more data, then there are other methods (“nonparametric methods”) that could apply; these are in Chapter 15
Summary: Chapter 9 – Section 2
● We used values from the normal distribution when we knew the value of the population standard deviation σ
● When we do not know σ, we estimate σ using the sample standard deviation s
● We use values from the t-distribution when we use s instead of σ, i.e. when we don’t know the population standard deviation
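The rare-fish example from this section can be checked with a few lines of code. This is a minimal sketch: `t_interval` is a hypothetical helper name, and the critical value 2.447 is taken directly from the slides rather than computed.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Confidence interval xbar ± t_crit * s / sqrt(n) for a population mean
    when sigma is unknown; t_crit must come from the t-distribution
    with n - 1 degrees of freedom."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Rare-fish example: 7 specimens, mean 1.38 kg, standard deviation 0.29 kg,
# t0.025 with 6 degrees of freedom = 2.447.
lower, upper = t_interval(xbar=1.38, s=0.29, n=7, t_crit=2.447)
print(f"({lower:.2f}, {upper:.2f})")  # prints (1.11, 1.65)
```

Passing the critical value in as an argument keeps the interval formula separate from the question of how the critical value is looked up (table or technology).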