on confidence intervals

Topics for Today
Confidence Intervals
Stat203
Fall 2011 – Week 6 Lecture 2
Page 1 of 23
Where are we and how did we get here?
Even before confidence intervals, we
discussed descriptive statistics:
- _________, relative frequency, percent
frequency, cumulative frequency
distributions
- Percentiles (_________)
- Ratio & Rates
- Cross-tabulations
- ____ Median Mode
- Range IQR __________________
… and we explored these in _______ of data.
… and we showed that there is ___________
from sample to sample.
The Question:
If we calculate a descriptive _________ from a
______, it is a guess at the __________ of the
_________.
Think of the % heads for 10 flips of the bent
coin being an approximation of the true
probability of heads for that coin.
Here’s the question:
How close is our _____ to the _____?
Examples:
From our bent coin, we have a guess that the
________________ from the bent coin was
__%.
We made 10 flips to get this, so what is our
__________ that 60% is close to the _____?
What if we had done 10,000 flips and got
60%? Would you be more _________ then?
You make judgments like this in your head all
the time.
What is this __________ based on?
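To get a feel for the 10-flip versus 10,000-flip comparison, here is a small simulation sketch. It assumes, purely for illustration, that the true probability of heads is 0.6 (the real value for the bent coin is unknown); the function names, the number of repeats and the seed are also made up for this example.

```python
import random
import statistics

def percent_heads(n_flips, p_heads, rng):
    """Flip a coin n_flips times and return the percentage of heads."""
    heads = sum(1 for _ in range(n_flips) if rng.random() < p_heads)
    return 100 * heads / n_flips

def spread_of_guesses(n_flips, p_heads=0.6, n_repeats=200, seed=1):
    """Repeat the whole experiment and return the standard deviation of the guesses.

    p_heads=0.6 is an assumed 'true' probability, not a measured one.
    """
    rng = random.Random(seed)
    guesses = [percent_heads(n_flips, p_heads, rng) for _ in range(n_repeats)]
    return statistics.stdev(guesses)

small = spread_of_guesses(10)
large = spread_of_guesses(10_000)
print(f"10 flips:     guesses vary by about {small:.1f} percentage points")
print(f"10,000 flips: guesses vary by about {large:.1f} percentage points")
```

Comparing the two printed spreads shows how much more the guess moves around from experiment to experiment when it is based on only 10 flips.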
Let’s string together the last few topics
_______________________we use these
to summarize features of a sample
____________rules for determining the
likelihood of a specific event
_________________________the true,
underlying hypothetical distribution of a
variable (like the underlying % relative
frequency distribution)
____________________a very common
probability distribution in nature … enables
us to calculate the probability of many
different events
[source: http://www.statsdirect.com/help/distributions/normal_distribution.htm ]
_________When the population can’t be
measured, we have to take a sample … to
be representative, samples should be
random
[source: http://cancerandcandy.files.wordpress.com/2011/05/blog-4-pic-2.jpg ]
______________________any statistic we
calculate from a sample has its own
distribution (ie: the histogram of the
statistic from many many samples)
[source: http://www.philender.com/courses/intro/notes2/sample.html ]
______________________the sampling
distribution of the mean is ALWAYS
approximately a Normal Distribution (as long
as the sample size is big enough … say 30)
[source: http://www.pinkmonkey.com/studyguides/subjects/stats/chap8/img4.gif ]
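One rough way to check this claim is to simulate it. The sketch below assumes a strongly right-skewed population (exponential with mean 1, chosen only for illustration), draws many samples of size 30, and checks whether the sample means behave like a Normal curve, where about 68% of values fall within 1 standard deviation and about 95% within 2. The function names, the number of samples and the seed are all hypothetical.

```python
import random
import statistics

def sample_means(n, n_samples=2000, seed=2):
    """Draw n_samples samples of size n and return the mean of each one."""
    rng = random.Random(seed)
    # Assumed population: exponential with mean 1 (clearly not Normal).
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(n_samples)]

def share_within(means, k):
    """Fraction of the sample means within k standard deviations of their centre."""
    centre = statistics.fmean(means)
    sd = statistics.stdev(means)
    return sum(abs(m - centre) <= k * sd for m in means) / len(means)

means = sample_means(n=30)
print(f"within 1 sd: {share_within(means, 1):.2f}  (Normal curve: about 0.68)")
print(f"within 2 sd: {share_within(means, 2):.2f}  (Normal curve: about 0.95)")
```

Changing `n` to something small, like 2, gives a sense of why the sample size needs to be "big enough".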
Quantifying Confidence
So, to recap:
__________ = the likelihood that our statistic
is _____ to population _________.
The two things that impact our confidence in a
sample statistic:
- the ____ of the sample
- the ___________ of the population
We’ll go into the math of this for the sample
mean.
But first … this makes sense, right?
Think of these two populations:
- all people in ______
- everyone in the _____
Let’s say we were able to take a simple
random sample of 1000 _________ and
another random sample of 1000 people from
the entire world and calculated the ____
personal income ($/day) for each sample.
Which do you think would be ______ to the
____ of the respective population?
The sample size is the same, why do you think
this?
The ___________ of the population affects our
confidence.
[source: http://simun.info/ehlog/wp-content/uploads/2006/06/income_distribution.jpg ]
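The thought experiment with two populations can be sketched in code. The example below is a simplification: it assumes two hypothetical Normal populations with the same mean (100) but different spreads (standard deviations of 5 and 50), rather than real income data, which would be skewed. It takes repeated samples of 1000 from each and reports how far a sample mean typically lands from the true mean.

```python
import random
import statistics

def typical_miss(population_sd, true_mean=100.0, n=1000, n_samples=300, seed=3):
    """Average distance between a sample mean and the true population mean.

    The Normal shape, true_mean and population_sd are assumptions made
    for illustration only.
    """
    rng = random.Random(seed)
    misses = []
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, population_sd) for _ in range(n)]
        misses.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(misses)

narrow = typical_miss(population_sd=5.0)
wide = typical_miss(population_sd=50.0)
print(f"low-spread population:  sample mean is typically off by {narrow:.2f}")
print(f"high-spread population: sample mean is typically off by {wide:.2f}")
```

The sample size is 1000 in both cases, so any difference between the two printed values comes from the populations themselves.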
So how do we connect the topics?
First off, because of the ___, we know that the
sampling distribution of the mean will be
______, centered at _ and with a standard
deviation of ____.
The figure on the following page is the
_____________________ for a mean of a
sample of size n.
Mark where the population mean, µ, is on this
graph.
Mark points that contain __% of all sample
means of size n (ie: if we took lots and lots of
samples like this, where would __% of them
be?)
(say the tick-marks are 1.5 apart)
Sampling Distribution of a sample of size n:
Now … consider taking a sample. Mark a likely
x̄ on the horizontal axis.
Page 14 of 23
Now how far is it from that x̄ to the µ in terms
of standard deviations (approximately)?
… is it 1 standard deviation below?
… 0.8 standard deviations above?
… 2.1 standard deviations above?
Imagine taking lots and lots of samples and
calculating the x̄’s.
… how far are the closest 95% of x̄’s
away from µ?
So … if we take just one sample, and consider
the region from x̄ - 2(σ/√n) to x̄ + 2(σ/√n), that
region will capture the true µ about 95% of the time!
Let’s try this out:
- take a sheet of paper and line up the
edge of it with the normal curve on slide
14.
- Mark on the edge of that sheet the points
that are µ, µ + 2(σ/√n) and µ - 2(σ/√n)
- Randomly select a point on the horizontal
axis (this might be the x̄ from a particular
sample)
- Line up the edge of your sheet centered
at that x̄
- Do the points on your sheet of paper
capture the true µ?
- Try again
This applet does something similar:
http://bcs.whfreeman.com/ips4e/cat_010/applets/confidenceinterval.html
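In the same spirit as the paper exercise, here is a rough simulation sketch. It assumes a hypothetical Normal population with µ = 50 and σ = 10 and samples of size 25 (none of these values come from the lecture), builds the region x̄ ± 2(σ/√n) for each sample, and counts how often that region captures µ.

```python
import math
import random
import statistics

def coverage(mu=50.0, sigma=10.0, n=25, n_samples=5000, multiplier=2.0, seed=4):
    """Fraction of samples whose region x̄ ± multiplier·σ/√n contains µ.

    mu, sigma and n describe an assumed population and sample size.
    """
    rng = random.Random(seed)
    half_width = multiplier * sigma / math.sqrt(n)
    hits = 0
    for _ in range(n_samples):
        x_bar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / n_samples

print(f"regions that captured µ: {coverage():.1%}")
```

Each pass through the loop plays the role of one "try again" with the sheet of paper.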
Confidence Interval Formula
95% Confidence Interval for µ:
[
- ____(/n) ,
+ ____(/n) ]
(it’s not quite exactly 2 as we used earlier)
Definition:
An interval constructed as above, for
repeated samples, will include the true mean µ
95% of the time.
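As a minimal sketch of how this formula could be computed, the function below looks up the multiplier from the standard Normal distribution instead of hard-coding it. It assumes the population standard deviation σ is known, as in the formula above; the data values and σ = 0.5 are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def mean_confidence_interval(sample, sigma, level=0.95):
    """Interval x̄ ± z·σ/√n, where z is taken from the standard Normal curve.

    Assumes sigma (the population standard deviation) is known.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    x_bar = fmean(sample)
    margin = z * sigma / math.sqrt(len(sample))
    return x_bar - margin, x_bar + margin

sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9, 5.2]  # made-up measurements
low, high = mean_confidence_interval(sample, sigma=0.5)
print(f"95% confidence interval for µ: [{low:.2f}, {high:.2f}]")
```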
… same for a 90%, or a 99% confidence
interval.
90% Confidence Interval for µ:
[
- ____(/n) ,
+ ____(/n) ]
99% Confidence Interval for µ:
[
- ____(/n) ,
+ ____(/n) ]
Which interval is _____? A 90% confidence
interval or a 99% confidence interval?
Why?
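One way to check your answer is to look up the multiplier that goes with each confidence level. This short sketch uses the standard Normal distribution from Python's `statistics` module; the function name `multiplier` is just a label chosen for this example.

```python
from statistics import NormalDist

def multiplier(level):
    """Number of σ/√n units on each side of x̄ for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} interval: x̄ ± {multiplier(level):.3f} · σ/√n")
```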
Using Confidence Intervals
http://onlinelibrary.wiley.com/doi/10.1348/135532510X497258/full
http://www.medscape.com/viewarticle/750158
… and they work for %s too!
http://tinyurl.com/69we7hz
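For percentages, one common textbook approach (assumed here; it is not necessarily the method used in the linked articles) swaps σ/√n for √(p̂(1 - p̂)/n), where p̂ is the sample proportion. The sketch below applies it to 60% heads from 10 flips and from 10,000 flips. This approximation is rough for very small samples, so treat the 10-flip interval as illustrative only.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, level=0.95):
    """Normal-approximation interval for a proportion: p̂ ± z·√(p̂(1 - p̂)/n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

few = proportion_confidence_interval(6, 10)
many = proportion_confidence_interval(6000, 10_000)
print(f"6 heads in 10 flips:        [{few[0]:.1%}, {few[1]:.1%}]")
print(f"6000 heads in 10,000 flips: [{many[0]:.1%}, {many[1]:.1%}]")
```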
The magic 95%
Note that 19/20 = __%!
Almost every published confidence interval will
be 95%.
This is purely a _____ …
… 90% seems like you’re not really confident
(___________ you’re wrong)
… 99% seems really high (only a _____
chance you’re wrong)
… 95% is just right … only a ____ chance
you’re wrong.
Today’s Topics
Confidence Intervals
- How close is our estimate to the true
population value
- Sample size and variability relate to
confidence
- We can decide how confident we want to be
Reading for next lecture
Chapter 6 - more on confidence intervals