Author: Brenda Gunderson, Ph.D., 2012 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License: http://creativecommons.org/licenses/by-nc-sa/3.0/ The University of Michigan Open.Michigan initiative has reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The attribution key provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact open.michigan@umich.edu with any questions, corrections, or clarification regarding the use of content. For more information about how to attribute these materials visit: http://open.umich.edu/education/about/terms-of-use. Some materials are used with permission from the copyright holders. You may need to obtain new permission to use those materials for other uses. This includes all content from: Mind on Statistics Utts/Heckard, 4th Edition, Cengage L, 2012 Text Only: ISBN 9781285135984 Bundled version: ISBN 9780538733489 Attribution Key For more information see: http:://open.umich.edu/wiki/AttributionPolicy Content the copyright holder, author, or law permits you to use, share and adapt: Creative Commons Attribution-NonCommercial-Share Alike License Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain. Make Your Own Assessment Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. Public Domain – Ineligible. WOrkds that are ineligible for copyright protection in the U.S. (17 USC §102(b)) *laws in your jurisdiction may differ. Content Open.Michigan has used under a Fair Use determination Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act (17 USC § 107) *laws in your jurisdiction may differ. Our determination DOES NOT mean that all uses of this third-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use t his content you should conduct your own independent analysis to determine whether or not your use will be Fair. Stat 250 Gunderson Lecture Notes Learning about a Population Mean Part 2: Confidence Interval for a Population Mean Chapter 11: Sections 1 and 2, CI Module 3 Do not put faith in what statistics say until you have carefully considered what they do not say. --William W. Watt 11.1 Introduction to Confidence Intervals for Means Chapter 10 introduced us to confidence intervals for estimating a population proportion and the difference between two population proportions. Recall it is important to understand how to interpret an interval and how to interpret what the confidence level really means. The interval provides a range of reasonable values for the parameter with an associated high level of confidence. For example we can say, “We are 95% confident that the proportion of Americans who do not get enough sleep at night is somewhere between 0.325 to 0.395, based on a random sample of n = 935 American adults. The 95% confidence level describes our confidence in the procedure we used to make the interval. If we repeated the procedure many times, we would expect about 95% of the intervals to contain the population parameter. 11.2 CI Module 3: Confidence Interval for a Population Mean Consider a study on the design of a highway sign. A question of interest is: What is the mean maximum distance at which drivers are able to read the sign? Data: A highway safety researcher will take a random sample of n = 16 drivers and measure the maximum distances (in feet) at which each can read the sign. Population parameter: = _________ mean maximum distance to read the sign for _____________ 155 Sample estimate x = _____ mean maximum distance to read the sign for _________________ But we know the sample estimate x may not equal , in fact, the possible x values vary from sample to sample. Because the sample mean is computed from a random sample, then it is a random variable, with a probability distribution. Sampling Distribution of the sample mean If x is the sample mean for a random sample of size n from a population with a normal model, then the distribution of the sample mean is: Central Limit Theorem If x is the sample mean for a random sample of size n from a population with any model, with mean, , and standard deviation , then when n is large, then the sampling distribution of the sample mean is approximately: So the possible x values vary normally around with a standard deviation of n . The standard deviation of the sample mean, , is roughly the average n distance of the possible sample mean values from the population mean . Since we don’t know the population standard deviation we will use the sample standard deviation s, resulting in the standard error of the sample mean. Standard Error of the Sample mean s.e.( x ) = where s = sample standard deviation The standard error of x estimates, roughly, the average distance of the possible x values from The possible x values result from considering all possible random samples of the same size n from the same population. 156 So we have our estimate of the population mean, the sample mean x , and we have its standard error. To make our confidence interval, we need to know the multiplier. Sample Estimate Multiplier x Standard error The multiplier for a confidence interval for the population mean is denoted by t*, which is the value in a Student’s t distribution with df = n – 1 such that the area between –t and t equals the desired confidence level. The value of t* will be found using Table A.2. First let’s give the formal result. One-sample t Confidence Interval for x t *s.e.( x ) where t * is an appropriate value for a t(n – 1) distribution. This interval requires we have a random sample from a normal population. If the sample size is large (n > 30), the assumption of normality is not so crucial and the result is approximate. Important items: be sure to check the conditions know how to interpret the confidence interval be able to explain what the confidence level of say 95% really means Try It! Using Table A.2 to find t* (a) Find t * for a 90% confidence interval based on n = 12 observations. (b) Find t * for a 95% confidence interval based on n = 30 observations. (c) Find t * for a 95% confidence interval based on n = 54 observations. (d) What happens to the value of t * as the sample size (and thus the degrees of freedom) gets larger? 157 From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012. Used with permission. 158 Try It! Confidence Interval for the Mean Maximum Distance Recall the study on the design of a highway sign. The researcher wanted to learn about the mean maximum distance at which drivers are able to read the sign. The researcher took a random sample of n = 16 drivers and measured the maximum distances (in feet) at which each can read the sign. The data are provided below. 440 360 490 600 600 490 540 400 540 490 600 540 240 440 440 490 a. Verify the necessary conditions for computing a confidence interval for the population mean distance. We are told that the sample was a random sample so we just need to check if a normal model for the response ‘max distance’ for the population is reasonable. Comments: 159 b. Compute the sample mean maximum distance and the standard error (without the outlier). c. Use a 95% confidence interval to estimate the population mean maximum distance at which all drivers can read the sign. Write a paragraph that interprets this interval and the confidence level. One-Sample Statistics N DISTANCE 15 Mean 497.3333 Std. Deviation 73.43283 Std. Error Mean 18.96028 One-Sample Test Tes t Value = 0 DISTANCE t 26.230 df 14 Sig. (2-tailed) .000 160 Mean Difference 497.3333 95% Confidence Interval of the Difference Lower Upper 456.6676 537.9991 Here is the summary of the Confidence intervals for the big 5 parameters covered in Chapters 10 and 11. We have now covered the one and two population proportion scenarios and the one population mean scenario Population Proportion Parameter Two Population Proportions Parameter p p1 p 2 Parameter pˆ 1 pˆ 2 Statistic Standard Error p̂ Statistic Standard Error s.e.( pˆ ) Population Mean pˆ (1 pˆ ) n s.e.( pˆ 1 pˆ 2 ) Confidence Interval x Statistic Standard Error pˆ 1 (1 pˆ 1 ) pˆ 2 (1 pˆ 2 ) n1 n2 Confidence Interval pˆ z s.e.( pˆ ) pˆ pˆ 1 pˆ 2 z *s.e. pˆ 1 pˆ 2 s n Paired Confidence Interval * d t s.e.(d ) df = n – 1 z* 2 n z* Sample Size n 2m s.e.( x ) Confidence Interval x t *s.e.( x ) df = n – 1 * Conservative Conf. Interval 2 Two Population Means General Parameter Statistic Standard Error s.e.x1 x2 s12 s22 n1 n2 1 2 x1 x 2 Pooled x1 x2 t s.e.(x1 x2 ) * df = min( n1 1, n2 1) 1 2 Parameter Statistic Standard Error x1 x 2 pooled s.e.x1 x2 s p where s p Confidence Interval 1 1 n1 n2 (n1 1)s12 (n 2 1) s 22 n1 n 2 2 Confidence Interval x1 x2 t * pooled s.e.(x1 x2 ) df = n1 n2 2 161 Additional Notes A place to … jot down questions you may have and ask during office hours, take a few extra notes, write out an extra practice problem or summary completed in lecture, create your own short summary about this chapter. 162