Stat 226 – Introduction to Business Statistics I

Section 7.1 — Inference for population means, σ unknown
Inference for the population mean µ when the population
standard deviation σ is unknown
Spring 2009
Professor: Dr. Petrutza Caragea
Section A
Tuesdays and Thursdays 9:30-10:50 a.m.
When the population standard deviation σ is unknown, we have to estimate
it first from the collected data, using the sample standard deviation s:
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
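As a small illustration of the formula for s, the sketch below computes the sample standard deviation by hand and compares it with Python's `statistics.stdev`, which also divides by n − 1. The data values are hypothetical and were invented for this example; they do not come from the course.

```python
import math
import statistics

# Hypothetical measurements, invented purely for illustration
data = [19.0, 20.5, 21.0, 22.0, 20.0]

n = len(data)
x_bar = sum(data) / n  # sample mean

# Sum of squared deviations from the sample mean, divided by n - 1
sample_variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)
s = math.sqrt(sample_variance)

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}")
# statistics.stdev uses the same n - 1 divisor, so the two values should agree
print(f"statistics.stdev = {statistics.stdev(data):.3f}")
```

Dividing by n − 1 rather than n is what distinguishes the sample standard deviation from the population version.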
Chapter 7, Section 7.1
Using s instead of σ adds more variability to the distribution of
$\frac{\bar{x} - \mu}{s / \sqrt{n}}$, so we need a distribution with heavier tails.
The t-distribution accounts for the additional variation by having heavier
tails (see graph next page)
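To make "heavier tails" concrete, the following sketch compares the density of a t-distribution with 5 degrees of freedom against the standard normal density at a few points away from the center. The choice of df = 5 and of the evaluation points is an assumption made only for illustration; the density formula is the standard one for Student's t.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

# Further from 0, the t density stays noticeably above the normal density
for x in (2.0, 3.0, 4.0):
    print(f"x = {x}: normal {normal_pdf(x):.5f}, t(df=5) {t_pdf(x, 5):.5f}")
```

At the center the ordering reverses: the t density is slightly lower at 0, because the extra probability has moved out into the tails.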
[Figure: four density plots, each with dnorm(data) on the vertical axis (0.0 to 0.4) and data on the horizontal axis (-4 to 4)]
Recall that the normal distribution is characterized by two parameters:
µ (the mean) and
σ (the standard deviation)
The t-distribution is characterized by a single parameter, the so-called
“degrees of freedom” (short: df)
As the degrees of freedom increase, the t-distribution approaches the
standard normal distribution N(0, 1)
Why? As the sample size increases, s estimates σ more accurately because
we have more information about the population standard deviation in our
sample.
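The convergence of the t-distribution to N(0, 1) can be checked numerically. The sketch below measures the largest gap between the t density and the standard normal density on a grid from -4 to 4 for several degrees of freedom. The grid and the particular df values are assumptions chosen for illustration.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom.

    Uses lgamma so that large df values do not overflow the gamma function.
    """
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log(1 + x * x / df))

def max_gap(df):
    """Largest |t density - normal density| on a grid from -4 to 4."""
    grid = [i / 100 for i in range(-400, 401)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)

gaps = {df: max_gap(df) for df in (1, 5, 30, 100, 1000)}
for df, gap in gaps.items():
    print(f"df = {df:4d}: max density gap = {gap:.5f}")
```

The gap shrinks steadily as df grows, which matches the statement that the t-distribution approaches the standard normal.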
[Figure: density of the t-distribution with 5 degrees of freedom; vertical axis dt(data, 5), horizontal axis data (-4 to 4)]
t-distribution is symmetric
it is characterized by the degrees of freedom
Reading the t-table (Table D)
Finding critical values t* for a t-distribution (Table D)
Example:
1. Find the 95th percentile of a t-distribution with df = 5
2. Find the 5th percentile of a t-distribution with df = 15
3. Find the quantiles that bound the middle 95%
For the first example, t* is the critical value such that the area to the right (upper-tail
probability) of t* is equal to 0.05 (or 5%):
1. look down the “df column” (first column on the left) to 5
2. at the top of the table, find the right-tail (upper-tail) probability of 0.05
3. the critical value t* corresponds to where row and column intersect,
which is t* = 2.015
Note: the t-table works with the area above (right), while the z-table works
with the area below (left)
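The table lookup can be cross-checked numerically. The sketch below is one possible approach, not the method used in the course: it integrates the t density with Simpson's rule and finds t* by bisection, using only the standard library. The function names `t_cdf` and `t_critical` are hypothetical. The third example does not state its degrees of freedom, so df = 15 is assumed there purely for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log(1 + x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t > 0 else 0.5 - area

def t_critical(upper_tail, df):
    """Find t* with P(T > t*) = upper_tail by bisection."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid  # too much area to the right of mid: move right
        else:
            hi = mid
    return (lo + hi) / 2

# 1. 95th percentile, df = 5: area to the right is 0.05
print(round(t_critical(0.05, 5), 3))
# 2. 5th percentile, df = 15: area to the right is 0.95
print(round(t_critical(0.95, 15), 3))
# 3. middle 95% leaves 0.025 in each tail (df = 15 is an assumption)
upper = t_critical(0.025, 15)
print(round(-upper, 3), round(upper, 3))
```

Because the table is indexed by upper-tail probability, a percentile p corresponds to the column 1 − p, and by symmetry the 5th percentile is the negative of the 95th.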
Example: A random sample of 30 pills yielded a mean level of 20.5 mg of
aspirin and a standard deviation of 1.5 mg. Find a 95% confidence interval
for the mean level µ of aspirin in a pill.
Confidence intervals for µ when σ is unknown
A (1 − α) · 100% confidence interval for µ is given by
$$\bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}}$$
just change σ to s and z* to t*
look up t* corresponding to a t-distribution with df = n − 1
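Applied to the aspirin example (n = 30, x̄ = 20.5 mg, s = 1.5 mg), the interval can be sketched as follows. The critical value t* = 2.045 is not given in the slides; it is assumed here from a standard t-table for df = 29 and an upper-tail probability of 0.025.

```python
import math

n = 30        # sample size
x_bar = 20.5  # sample mean (mg)
s = 1.5       # sample standard deviation (mg)

# Assumption: t* read from a standard t-table, df = n - 1 = 29,
# upper-tail probability 0.025 (for a 95% confidence level)
t_star = 2.045

standard_error = s / math.sqrt(n)
margin = t_star * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"df = {n - 1}, margin of error = {margin:.3f} mg")
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f}) mg")
```

Under these assumptions the interval is roughly 19.94 mg to 21.06 mg. With the z critical value 1.96 in place of t*, the interval would be slightly narrower, which is the cost of estimating σ by s.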
What about assumptions for the t-test?
simple random sample (ensuring independence of observations)
data following a normal distribution, or a sufficiently large sample size
for the CLT to apply
Handout (t-test examples)