Topics for Today
Confidence Intervals
Stat203 Fall 2011 – Week 6 Lecture 2 Page 1 of 23

Where are we and how did we get here?
Even before confidence intervals, we discussed descriptive statistics:
- _________, relative frequency, percent frequency, cumulative frequency distributions
- Percentiles (_________)
- Ratio & Rates
- Cross-tabulations
- ____ Median Mode
- Range IQR __________________
… and we explored these in _______ of data.
… and we showed that there is ___________ from sample to sample.

The Question:
If we calculate a descriptive _________ from a ______, it is a guess at the __________ of the _________.
Think of the % heads for 10 flips of the bent coin being an approximation of the true probability of heads for that coin.
Here’s the question: How close is our _____ to the _____?

Examples:
From our bent coin, we have a guess that the ________________ from the bent coin was __%.
We made 10 flips to get this, so what is our __________ that 60% is close to the _____?
What if we had done 10,000 flips and got 60%? Would you be more _________ then?
You make judgments like this in your head all the time. What is this __________ based on?
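One way to make the 10-flip versus 10,000-flip comparison concrete is to simulate it. The sketch below assumes a hypothetical bent coin whose true probability of heads is 0.6 (in a real experiment this value is unknown; it is fixed here only so the estimates have something to be compared with). The function names, the seed, and the number of repeated experiments are illustrative choices, not part of the lecture.

```python
import random
import statistics

def percent_heads(n_flips, p_heads, rng):
    """Flip the (simulated) bent coin n_flips times and return the % heads."""
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return 100.0 * heads / n_flips

def spread_of_estimates(n_flips, p_heads=0.6, n_repeats=200, seed=1):
    """Repeat the whole experiment n_repeats times and summarize the estimates.

    p_heads=0.6 is an assumed 'true' value used only for this simulation.
    Returns (smallest estimate, largest estimate, standard deviation).
    """
    rng = random.Random(seed)
    estimates = [percent_heads(n_flips, p_heads, rng) for _ in range(n_repeats)]
    return min(estimates), max(estimates), statistics.pstdev(estimates)

for n in (10, 10_000):
    low, high, sd = spread_of_estimates(n)
    print(f"{n:>6} flips: estimates ranged from {low:.1f}% to {high:.1f}% (sd {sd:.2f})")
```

Comparing the two printed ranges shows how much the % heads moves from one experiment to the next at each number of flips.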
Let’s string together the last few topics
_______________________: we use these to summarize features of a sample
____________: rules for determining the likelihood of a specific event
_________________________: the true, underlying hypothetical distribution of a variable (like the underlying % relative frequency distribution)

____________________: a very common probability distribution in nature
… enables us to calculate the probability of many different events
[source: http://www.statsdirect.com/help/distributions/normal_distribution.htm ]

_________: When the population can’t be measured, we have to take a sample
… to be representative, samples should be random
[source: http://cancerandcandy.files.wordpress.com/2011/05/blog-4-pic-2.jpg ]

______________________: any statistic we calculate from a sample has its own distribution (i.e., the histogram of the statistic from many, many samples)
[source: http://www.philender.com/courses/intro/notes2/sample.html ]

______________________: the sampling distribution of the mean is ALWAYS approximately a Normal Distribution (as long as the sample size is big enough … say 30)
[source: http://www.pinkmonkey.com/studyguides/subjects/stats/chap8/img4.gif ]

Quantifying Confidence
So, to recap: __________ = the likelihood that our statistic is _____ to the population _________.
The two things that impact our confidence in a sample statistic:
- the ____ of the sample
- the ___________ of the population
We’ll go into the math of this for the sample mean. But first … this makes sense, right?
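The claim that the sample mean has its own, roughly Normal, distribution can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with rate 1 (a skewed distribution with mean 1 and standard deviation 1), which is a choice made for illustration and not something from the lecture. The sample size of 30 follows the rule of thumb above.

```python
import math
import random
import statistics

def sample_means(n, n_samples, rng):
    """Return the means of n_samples random samples, each of size n.

    Assumed population: exponential with rate 1, so its mean is 1 and its
    standard deviation is 1. This population is deliberately not Normal.
    """
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]

rng = random.Random(203)
means = sample_means(n=30, n_samples=5000, rng=rng)

center = statistics.fmean(means)   # centre of the sampling distribution
spread = statistics.pstdev(means)  # its standard deviation
# Share of sample means that fall within 2 standard deviations of the
# (assumed) population mean of 1.
within_2 = sum(abs(m - 1.0) <= 2 * spread for m in means) / len(means)

print(f"centre of the sample means: {center:.3f}")
print(f"spread of the sample means: {spread:.3f}")
print(f"population sd / sqrt(30):   {1 / math.sqrt(30):.3f}")
print(f"share within 2 spreads:     {within_2:.3f}")
```

Re-running with a larger or smaller `n` shows how the sample size changes the spread of the sample means.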
Think of these two populations:
- all people in ______
- everyone in the _____
Let’s say we were able to take a simple random sample of 1000 _________ and another random sample of 1000 people from the entire world and calculated the ____ personal income ($/day) for each sample.
Which do you think would be ______ to the ____ of the respective population?
The sample size is the same, so why do you think this?

The ___________ of the population affects our confidence.
[source: http://simun.info/ehlog/wp-content/uploads/2006/06/income_distribution.jpg ]

So how do we connect the topics?
First off, because of the ___, we know that the sampling distribution of the mean will be ______, centered at _ and with a standard deviation of ____.
The figure on the following page is the _____________________ for a mean of a sample of size n.
Mark where the population mean, µ, is on this graph.
Mark points that contain __% of all sample means of size n (i.e., if we took lots and lots of samples like this, where would __% of them be?) (say the tick-marks are 1.5 apart)

Sampling Distribution of the mean of a sample of size n:
Now … consider taking a sample. Mark a likely x̄ on the horizontal axis.

Now how far is it from that x̄ to µ, in terms of standard deviations (approximately)?
… is it 1 standard deviation below?
… 0.8 standard deviations above?
… 2.1 standard deviations above?
Imagine taking lots and lots of samples and calculating the x̄’s.
… how far are the closest 95% of x̄’s away from µ?

So … if we take just one sample, and consider the region between x̄ - 2(σ/√n) and x̄ + 2(σ/√n), that region will capture the true µ 95% of the time!
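The "captures the true µ 95% of the time" claim can be checked by simulation. The sketch below assumes a Normal population with µ = 50 and σ = 10 and samples of size n = 25; these numbers, the seed, and the function names are hypothetical and chosen only for illustration. It also assumes σ is known, as the region above does.

```python
import math
import random
import statistics

def covers_mu(mu, sigma, n, rng, multiplier=2.0):
    """Draw one sample of size n and report whether the region
    x_bar - multiplier*sigma/sqrt(n) to x_bar + multiplier*sigma/sqrt(n)
    contains the true mean mu."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    margin = multiplier * sigma / math.sqrt(n)
    return x_bar - margin <= mu <= x_bar + margin

def coverage_rate(mu=50.0, sigma=10.0, n=25, n_samples=2000, seed=6):
    """Fraction of n_samples simulated samples whose region captured mu.

    mu, sigma and n are assumed values for this simulation only.
    """
    rng = random.Random(seed)
    hits = sum(covers_mu(mu, sigma, n, rng) for _ in range(n_samples))
    return hits / n_samples

print(f"Share of regions that captured mu: {coverage_rate():.3f}")
```

Each simulated sample plays the role of the "just one sample" above; the printed share is how often its region contained µ.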
Let’s try this out:
- Take a sheet of paper and line up its edge with the normal curve on the "Sampling Distribution" slide (slide 14).
- Mark on the edge of that sheet the points that are µ, µ + 2(σ/√n) and µ - 2(σ/√n)
- Randomly select a point on the horizontal axis (this might be the x̄ from a particular sample)
- Line up the edge of your sheet centered at that x̄
- Do the points on your sheet of paper capture the true µ?
- Try again
This applet does something similar:
http://bcs.whfreeman.com/ips4e/cat_010/applets/confidenceinterval.html

Confidence Interval Formula
95% Confidence Interval for µ:
[ x̄ - ____(σ/√n) , x̄ + ____(σ/√n) ]
(it’s not quite exactly 2 as we used earlier)
Definition: An interval constructed as above, for repeated samples, will include the true mean µ 95% of the time.

… same for a 90%, or a 99% confidence interval.
90% Confidence Interval for µ:
[ x̄ - ____(σ/√n) , x̄ + ____(σ/√n) ]
99% Confidence Interval for µ:
[ x̄ - ____(σ/√n) , x̄ + ____(σ/√n) ]
Which interval is _____? A 90% confidence interval or a 99% confidence interval? Why?

Using Confidence Intervals
http://onlinelibrary.wiley.com/doi/10.1348/135532510X497258/full
http://www.medscape.com/viewarticle/750158
… and they work for %s too!
http://tinyurl.com/69we7hz

The magic 95%
Note that 19/20 = __%!
Almost every published confidence interval will be 95%. This is purely a _____ …
… 90% seems like you’re not really confident (___________ you’re wrong)
… 99% seems really high (only a _____ chance you’re wrong)
… 95% is just right … only a ____ chance you’re wrong.
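The 90%, 95% and 99% interval formulas above differ only in the multiplier placed in front of σ/√n. As a minimal sketch, the multiplier can be looked up from the standard Normal distribution with Python's `statistics.NormalDist`. The sample values used here (x̄ = 52, σ = 10, n = 25) are hypothetical, σ is assumed to be known, and the function names are illustrative.

```python
import math
from statistics import NormalDist

def z_multiplier(confidence):
    """Multiplier for sigma/sqrt(n) so that the central `confidence`
    share of a standard Normal distribution lies within +/- the result."""
    tail = (1.0 - confidence) / 2.0  # probability left in each tail
    return NormalDist().inv_cdf(1.0 - tail)

def confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Interval x_bar -/+ multiplier * sigma / sqrt(n), assuming sigma is known."""
    margin = z_multiplier(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample summary, for illustration only.
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(x_bar=52.0, sigma=10.0, n=25, confidence=level)
    print(f"{level:.0%} interval: [{low:.2f}, {high:.2f}] "
          f"(multiplier {z_multiplier(level):.3f})")
```

Comparing the printed intervals side by side is one way to approach the 90% versus 99% question above.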
Today’s Topics
Confidence Intervals
- How close is our estimate to the true population value
- Sample size and variability relate to confidence
- We can decide how confident we want to be

Reading for next lecture
Chapter 6 - more on confidence intervals