Chapter 6.1 — Confidence Intervals

Stat 226 – Introduction to Business Statistics I
Confidence Intervals
Spring 2009
Professor: Dr. Petrutza Caragea
Section A
Tuesdays and Thursdays 9:30-10:50 a.m.
Sample means vary in value and form a sampling distribution in which not
all samples result in x̄-values equal to the population mean µ. We should
not expect to obtain a sample mean x̄ (based on a specific sample) that is
exactly equal to the population mean µ.
However, we can expect the point estimate to be fairly close in value to
the population mean for a sufficiently large sample size (the sampling
distribution becomes approximately normal for large sample sizes).
Recall the 68-95-99.7 rule: approximately 95% of all observations from a
normal distribution fall within ±2 standard deviations of the mean.
Using this concept, we can construct so-called confidence intervals:
We know that x̄ follows a normal distribution with mean µ and standard
deviation σ/√n, i.e.,

$$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$

Therefore, we can anticipate approximately 95% of all random samples of
size n from some population with unknown µ and known σ to produce
sample means x̄ that fall between

$$\mu - 2 \cdot \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \mu + 2 \cdot \frac{\sigma}{\sqrt{n}}$$

If the sample size n is large enough, the sampling distribution of the
sample means is approximately normal. Our point estimate x̄ will hardly
ever be exactly equal to the population mean µ, but most likely (≈ 95% of
the time) it will fall within 2 standard deviations of the population mean µ,
where the standard deviation is that of the sampling distribution, σ/√n.

This interval

$$\left(\mu - 2 \cdot \frac{\sigma}{\sqrt{n}} \; ; \; \mu + 2 \cdot \frac{\sigma}{\sqrt{n}}\right)$$

is based on the 68-95-99.7 rule. We know from Chapter 1 that the actual
z-score corresponding to the middle 95% is z = 1.96, so more precisely we have

$$\left(\mu - 1.96 \cdot \frac{\sigma}{\sqrt{n}} \; ; \; \mu + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right)$$

We are going to use z = 1.96 in the future when constructing a 95%
confidence interval.
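The difference between the rule-of-thumb value 2 and the more precise value 1.96 can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name central_probability is made up for illustration and is not part of the course material.

```python
from statistics import NormalDist


def central_probability(z):
    """Probability that a standard normal variable falls between -z and +z."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)


# Compare the 68-95-99.7 rule value (2) with the more precise value (1.96).
for z in (2.0, 1.96):
    print(f"P(-{z} < Z < {z}) = {central_probability(z):.4f}")
```

With z = 2 the central area comes out slightly above 95%, while z = 1.96 gives almost exactly 95%, which is why 1.96 is used for 95% confidence intervals.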
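Example: ACT scores ∼ N(µ, 5.9); let's take samples of size n = 76
⇒ approximately 95% of all samples of size 76 will produce sample means
between

$$\mu - 1.96 \cdot \frac{5.9}{\sqrt{76}} \quad \text{and} \quad \mu + 1.96 \cdot \frac{5.9}{\sqrt{76}}$$

i.e. within about 1.33 of µ.

It can be shown that this concept can be reversed in the following sense:
for approximately 95% of all samples of size 76, the interval
x̄ ± 1.96 · 5.9/√76 contains the population mean µ.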
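One way to see why the statement can be reversed is that "x̄ lies close to µ" and "µ lies close to x̄" describe the same event. A short derivation, writing m = 1.96 · σ/√n as shorthand (m is used here only for illustration):

$$\mu - m \le \bar{x} \le \mu + m \iff |\bar{x} - \mu| \le m \iff \bar{x} - m \le \mu \le \bar{x} + m$$

Since the left-hand statement holds for approximately 95% of all samples, so does the right-hand one.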
Definition of a Confidence Interval
A confidence interval for the unknown population mean µ is an interval
(or range) of plausible values for µ. It is constructed such that with a
chosen degree (or level) of confidence C, the value of the unknown
population mean will be captured inside the interval.

Confidence Intervals (short: CI)
For each confidence interval we have a confidence level C:
- C provides information on how much “confidence” we can have in the
  method used to construct the CI
- usual choices for C are 90%, 95%, and 99%
- C can be interpreted as the long-run rate of success of the method used
  to construct the CI
A level C confidence interval for population mean µ
For a sufficiently large sample size n (so that the CLT applies and x̄
follows an approximately normal distribution) or a population that is
already normally distributed, the general formula for a level C confidence
interval for the population mean µ when σ is known is given by

$$\left(\bar{x} - z^* \cdot \frac{\sigma}{\sqrt{n}} \; ; \; \bar{x} + z^* \cdot \frac{\sigma}{\sqrt{n}}\right)$$

i.e. in short notation

$$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$$

The desired level of confidence C determines which critical value z* is
used. The three most commonly used confidence levels, 90%, 95%, and
99%, use critical values 1.645, 1.96, and 2.575, respectively. Use Table A
to find z*.

Example: 99% level of confidence
A 99% confidence interval is constructed such that in the long run it is
successful in capturing the true unknown population mean 99% of the
time.

Finding the critical value z* for a level C confidence interval: More
precisely we have that C = (1 − α) · 100%.
The relevant number is called α, measuring the difference between the
desired level of confidence and certainty (i.e. 100%).
Example: for C = 99% we have α = 0.01, which leaves an area of
α/2 = 0.005 in each tail of the standard normal curve.
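As an alternative to looking up Table A, the critical value can be computed from C = (1 − α) · 100%. The following is a minimal sketch, assuming Python's standard library is available; the function name critical_value is hypothetical.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* for a two-sided interval; confidence is a proportion such as 0.95."""
    alpha = 1.0 - confidence
    # z* cuts off an area of alpha/2 in the upper tail of the standard normal curve.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {critical_value(c):.3f}")
```

For 99% the computed value rounds to 2.576, slightly different from the Table A value 2.575 used in these notes; the difference comes from reading a table with limited resolution and has little effect on the resulting interval.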
Example: A random sample of size n = 25 from last semester’s heights
data yielded a sample mean of x̄ = 69.36. We know the population
standard deviation is σ = 4.004.
Find a 90% confidence interval for the unknown population mean µ.
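A worked sketch of this example, assuming the critical value z* = 1.645 that these notes quote for a 90% level of confidence:

$$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}} = 69.36 \pm 1.645 \cdot \frac{4.004}{\sqrt{25}} = 69.36 \pm 1.317316 \;\Longrightarrow\; (68.042684 \, , \, 70.677316)$$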
What about a 95% confidence interval?
Why settle for a 90% CI or 95% CI when we can construct 99% CIs?
The higher level of confidence comes with a price tag: The resulting
interval is wider than the 90% or 95% confidence interval:
$$99\%\ \text{CI} \;\Longrightarrow\; z^* = 2.575 \;\Longrightarrow\; 69.36 \pm \underbrace{2.575 \cdot \frac{4.004}{\sqrt{25}}}_{2.06206} \;\Longrightarrow\; (67.29794 \, , \, 71.42206)$$
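To compare the three confidence levels on the heights example side by side, the calculation can be sketched in a few lines of Python. The function name z_interval and the dictionary Z_TABLE are illustrative choices; the critical values are the Table A values quoted in these notes.

```python
import math


def z_interval(xbar, sigma, n, z_star):
    """Confidence interval xbar ± z* · sigma / sqrt(n) for known sigma."""
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Table A critical values as used in these notes.
Z_TABLE = {"90%": 1.645, "95%": 1.96, "99%": 2.575}

# Heights example: n = 25, sample mean 69.36, known sigma = 4.004.
intervals = {level: z_interval(69.36, 4.004, 25, z) for level, z in Z_TABLE.items()}

for level, (lower, upper) in intervals.items():
    print(f"{level} CI: ({lower:.5f}, {upper:.5f}), width = {upper - lower:.5f}")
```

The 99% row should reproduce the interval (67.29794, 71.42206), and the printed widths grow as the level of confidence increases.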
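The width of any confidence interval is given by the distance between its
upper and lower bound:

$$\text{width} = \left(\bar{x} + z^* \cdot \frac{\sigma}{\sqrt{n}}\right) - \left(\bar{x} - z^* \cdot \frac{\sigma}{\sqrt{n}}\right) = 2 \cdot z^* \cdot \frac{\sigma}{\sqrt{n}}$$

In the previous 3 examples, the width of the corresponding CIs was
90%: 2 · 1.317316 = 2.634632
95%: 2 · 1.569568 = 3.139136
99%: 2 · 2.06206 = 4.12412
Handout on simulated confidence intervals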
Chapter 6.1 — Interpretation of Confidence Intervals
Interpretation of Confidence Intervals
Referring to the handout on the 100 simulated confidence intervals, we can
take away the following facts:
1. We can be C% confident that the unknown population mean µ falls in the
   constructed level C confidence interval, i.e. between the lower and upper
   CI bound for a specific calculated example.
2. If we took repeated samples, approximately C% of the confidence intervals
   constructed from them would include the unknown population mean µ in the
   long run.
3. The interpretation of a CI is always in terms of the unknown population
   mean µ and never in terms of the sample mean x̄. The sample mean x̄, the
   center of every CI, will always be included in the CI by default.
Be careful:
Before we take a sample from a population, we can say there is a C%
chance (e.g. a 95% chance) that our confidence interval will include the
population parameter µ, provided we plan on constructing a level C
confidence interval (e.g. a 95% CI).
Once we have taken the sample, the outcome is fixed: our interval either
does contain µ or it does not. We just don’t know which. There is no longer
a C% chance; all we can say is that we are C% confident (e.g. 95%
confident).
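The long-run interpretation can be illustrated with a small simulation in the spirit of the handout on 100 simulated confidence intervals. This is only a sketch under stated assumptions: the population is taken to be normal, the "true" mean mu = 69.0 is a made-up value chosen for illustration (in practice µ is unknown), and the function name count_captures is hypothetical.

```python
import math
import random


def count_captures(mu, sigma, n, z_star, num_intervals=100, seed=2009):
    """Count how many simulated intervals xbar ± z*·sigma/sqrt(n) contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    margin = z_star * sigma / math.sqrt(n)
    captured = 0
    for _ in range(num_intervals):
        # Draw one random sample of size n and compute its sample mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Each individual interval either contains mu or it does not.
        if xbar - margin <= mu <= xbar + margin:
            captured += 1
    return captured


# mu = 69.0 is an assumed value for the simulation only.
hits = count_captures(mu=69.0, sigma=4.004, n=25, z_star=1.96)
print(f"{hits} of 100 simulated 95% intervals contain mu")
```

Each single interval is simply a hit or a miss; the 95% describes how often the method succeeds across many repetitions, so the count is typically close to, but not exactly, 95.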
We saw that the two properties of a high level of confidence and a
narrow (precise) CI work against each other.
The higher the level of confidence, the wider the confidence interval and
therefore the less precision we have in estimating the unknown µ.
remedy: If we need a certain level of confidence, but also a specific
precision, we can increase the sample size n
if n goes up ⇒ σx̄ = σ/√n will go down!
we get a narrower interval with more precision.

margin of error
The quantity

$$m = z^* \cdot \frac{\sigma}{\sqrt{n}}$$

is also referred to as the so-called margin of error.
Changing one of the three components z*, σ or n in the margin of error will
have the following impact on the width of the confidence interval:
1. level of confidence C = (1 − α) · 100% will change z* (a higher C means
   a larger z* and a wider interval)
2. sample size n will change standard deviation σx̄ (a larger n means a
   smaller σx̄ and a narrower interval)
3. population standard deviation σ (a larger σ means a wider interval)

sample size calculations
If we want both a high level of confidence and a small margin of error (i.e.
narrow confidence interval) we need to take a sample of size

$$n \ge \left(\frac{z^* \cdot \sigma}{m}\right)^2$$

The right-hand side rarely corresponds to an integer, so we always need to
round up to the next largest integer.
Why next largest? If we rounded down, the corresponding confidence
interval would no longer have the desired margin of error, but a
slightly larger one!
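The margin of error and the sample size formula can be sketched in Python as follows. The function names are illustrative, and the numbers are taken from the factory example in these notes (σ = 3.5, desired margin of error of 1 year, 95% level of confidence, so z* = 1.96).

```python
import math


def margin_of_error(z_star, sigma, n):
    """Margin of error m = z* · sigma / sqrt(n)."""
    return z_star * sigma / math.sqrt(n)


def required_sample_size(z_star, sigma, m):
    """Smallest integer n with n >= (z* · sigma / m)^2, i.e. always rounded up."""
    return math.ceil((z_star * sigma / m) ** 2)


# Factory example: estimate the mean age within 1 year at 95% confidence.
n_needed = required_sample_size(z_star=1.96, sigma=3.5, m=1.0)
print(n_needed)

# Rounding down would miss the target margin; rounding up meets it.
print(margin_of_error(1.96, 3.5, n_needed - 1))
print(margin_of_error(1.96, 3.5, n_needed))
```

Here (1.96 · 3.5 / 1)² = 47.0596, so the required sample size is n = 48. With n = 47 the margin of error would be slightly above 1 year, which is the reason for always rounding up.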
Example: What sample size should be used to estimate the mean age of
workers in a large factory within 1 year at a 95% level of confidence if the
standard deviation σ for the variable age is known to be 3.5?

Chapter 6.1 — Assumptions for Confidence Intervals
Necessary Assumptions for Constructing CIs
1. The sampling distribution of x̄ has to follow at least approximately a
   normal distribution, i.e.
   either the sample size is large enough for the CLT to apply if the
   population we sample from does not follow a normal distribution,
   or the population we sample from follows a normal distribution.
2. The sample taken has to be a random sample.
worksheets