Chapter 23, part 1 (PowerPoints only)

Chapter 23
Confidence Intervals for a Population Mean μ; t distributions
• t distributions
• t confidence intervals for a population mean μ
• Sample size required to estimate μ
The Importance of the Central Limit Theorem

When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is

$N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
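A small simulation can make this model concrete. The sketch below is an illustration only: the exponential population (mean 10, so its standard deviation is also 10), the sample size of 40, the 5,000 repetitions and the seed are all assumptions chosen for the demo, not values from the slides.

```python
import math
import random
import statistics

random.seed(1)                 # fixed seed so the run is repeatable (arbitrary choice)

MU = SIGMA = 10.0              # an exponential population has mean = SD (assumed population)
N = 40                         # sample size (assumed)
REPS = 5000                    # number of simple random samples drawn (assumed)

sample_means = []
for _ in range(REPS):
    sample = [random.expovariate(1 / MU) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

observed_sd = statistics.stdev(sample_means)   # spread of the simulated sample means
model_sd = SIGMA / math.sqrt(N)                # sigma / sqrt(n) from the sampling model

print(f"mean of sample means: {statistics.fmean(sample_means):.3f} (mu = {MU})")
print(f"SD of sample means:   {observed_sd:.3f} (sigma/sqrt(n) = {model_sd:.3f})")
```

Even though the assumed population is strongly skewed, the simulated sample means should center near the population mean with a spread close to sigma/sqrt(n).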

Since the sampling model for x̄ is the Normal model, when we standardize x̄ we get the standard normal z:

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Note that $SD(\bar{x}) = \sigma/\sqrt{n}$.

If μ is unknown, we probably don't know σ either.
The sample standard deviation s provides an estimate of the population standard deviation σ.
For a sample of size n, the sample standard deviation s is:

$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$

n − 1 is the "degrees of freedom."
The value s/√n is called the standard error of x̄, denoted SE(x̄):

$SE(\bar{x}) = \frac{s}{\sqrt{n}}$
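As a quick check of these two formulas, the sketch below computes s and SE(x̄) step by step and compares s with Python's built-in `statistics.stdev`. The data are the eight unpopped-kernel percentages from the microwave popcorn example later in this deck; the variable names are illustrative.

```python
import math
import statistics

# Unpopped-kernel percentages from the microwave popcorn example
data = [7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2]

n = len(data)
xbar = sum(data) / n

# s = sqrt( sum of squared deviations / (n - 1) )
squared_deviations = sum((x - xbar) ** 2 for x in data)
s = math.sqrt(squared_deviations / (n - 1))

# Standard error of the sample mean: s / sqrt(n)
se = s / math.sqrt(n)

# statistics.stdev uses the same n - 1 divisor
assert math.isclose(s, statistics.stdev(data))

print(f"n = {n}, xbar = {xbar:.3f}, s = {s:.3f}, SE = {se:.3f}")
```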
Standardize using s for σ

Substitute s (sample standard deviation) for σ:

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \quad\longrightarrow\quad \frac{\bar{x} - \mu}{s/\sqrt{n}}$

Not quite correct: not knowing σ means using z is no longer correct.
t-distributions
Suppose that a Simple Random Sample of size n is drawn from a population whose distribution can be approximated by a N(µ, σ) model. When σ is known, the sampling model for the mean x̄ is N(μ, σ/√n).
When σ is estimated from the sample standard deviation s, the sampling model for the mean x̄ follows a t distribution t(μ, s/√n) with degrees of freedom n − 1.

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

is the 1-sample t statistic.
Confidence Interval Estimates

CONFIDENCE INTERVAL for μ

$\bar{x} \pm t^{*}\,\frac{s}{\sqrt{n}}$

where:
t* = critical value from the t distribution with n − 1 degrees of freedom
x̄ = sample mean
s = sample standard deviation
n = sample size
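The interval follows from the 1-sample t statistic. As a short derivation, let t* be the critical value that leaves central area C under the t distribution with n − 1 degrees of freedom (the symbol C for the confidence level is introduced here for illustration). Rearranging the inequality for μ gives the interval endpoints:

```latex
\begin{align*}
P\left(-t^{*} \le \frac{\bar{x} - \mu}{s/\sqrt{n}} \le t^{*}\right) &= C \\
P\left(\bar{x} - t^{*}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t^{*}\frac{s}{\sqrt{n}}\right) &= C
\end{align*}
```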



• For very small samples (n < 15), the data should follow a Normal model very closely.
• For moderate sample sizes (n between 15 and 40), t methods will work well as long as the data are unimodal and reasonably symmetric.
• For sample sizes larger than 40, t methods are safe to use unless the data are extremely skewed. If outliers are present, analyses can be performed twice, with the outliers and without.
t distributions
Very similar to z ~ N(0, 1)
• Sometimes called Student's t distribution; Gosset, brewery employee
• Properties:
i) symmetric around 0 (like z)
ii) degrees of freedom ν
    if ν > 1, E(t_ν) = 0
    if ν > 2, $\sigma = \sqrt{\nu/(\nu - 2)}$, which is always bigger than 1.
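The standard deviation of a t distribution with ν > 2 degrees of freedom is √(ν/(ν − 2)). The short sketch below tabulates it for a few arbitrarily chosen values of ν to show that it always exceeds 1 (the standard deviation of z) and shrinks toward 1 as ν grows. The helper name `t_sd` is hypothetical.

```python
import math

def t_sd(df):
    """Standard deviation of a t distribution; finite only for df > 2."""
    if df <= 2:
        raise ValueError("the standard deviation is finite only when df > 2")
    return math.sqrt(df / (df - 2))

# Degrees of freedom chosen only for illustration
for df in (3, 7, 30, 100, 1000):
    print(f"df = {df:>4}: SD = {t_sd(df):.4f}")
```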
Student's t Distribution

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}, \qquad t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

Degrees of Freedom: the t statistic uses $s = \sqrt{s^2}$, where

$s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}$

[Figure 11.3, Page 372: density curves of Z and of t with 1 and 7 degrees of freedom (t1, t7), horizontal axis from −3 to 3.]
t-Table: text, inside back cover

90% confidence interval; df = n − 1 = 10

                     Confidence level
Degrees of Freedom   0.80     0.90     0.95     0.98     0.99
1                    3.0777   6.314    12.706   31.821   63.657
2                    1.8856   2.9200   4.3027   6.9645   9.9250
...
10                   1.3722   1.8125   2.2281   2.7638   3.1693
...
100                  1.2901   1.6604   1.9840   2.3642   2.6259
∞                    1.282    1.6449   1.9600   2.3263   2.5758

90% confidence interval: $\bar{x} \pm 1.8125\,\frac{s}{\sqrt{11}}$
Student's t Distribution
P(t > 1.8125) = .05
P(t < −1.8125) = .05
[Figure: t10 density with central area .90 between −1.8125 and 1.8125 and area .05 in each tail.]
Comparing t and z Critical Values

Conf. level   z           t (n = 30)
90%           z = 1.645   t = 1.6991
95%           z = 1.96    t = 2.0452
98%           z = 2.33    t = 2.4620
99%           z = 2.58    t = 2.7564
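The z column can be reproduced with Python's standard library, which makes the gap between z and t critical values easy to see. In this sketch the t values are copied from the comparison above rather than computed, and the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

# t critical values for n = 30 (df = 29), copied from the comparison above
T_CRIT_N30 = {0.90: 1.6991, 0.95: 2.0452, 0.98: 2.4620, 0.99: 2.7564}

def z_critical(conf_level):
    """z* that leaves central area conf_level under the N(0, 1) curve."""
    return NormalDist().inv_cdf(0.5 + conf_level / 2)

for level, t_star in T_CRIT_N30.items():
    z_star = z_critical(level)
    print(f"{level:.0%}: z* = {z_star:.4f}, t* = {t_star:.4f}, "
          f"t*/z* = {t_star / z_star:.3f}")
```

At every confidence level the t critical value is the larger of the two, so a t interval is wider than the z interval built from the same s and n.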

Example
– An investor is trying to estimate the return on investment in companies that won quality awards last year.
– A random sample of 41 such companies is selected, and the return on investment is recorded for each company. The data for the 41 companies have x̄ = 14.75, s = 8.18.
– Construct a 95% confidence interval for the mean return.

degrees of freedom: d.f. = n − 1 = 41 − 1 = 40
from the t-table, t* = 2.0211

$\bar{x} \pm t^{*}\frac{s}{\sqrt{n}} = 14.75 \pm 2.0211\,\frac{8.18}{\sqrt{41}} = 14.75 \pm 2.58 = (12.17,\ 17.33)$

We are 95% confident that the interval (12.17, 17.33) contains the population mean return on investment for companies that win quality awards.
Example
• Because cardiac deaths increase after heavy snowfalls, a study was conducted to measure the cardiac demands of shoveling snow by hand.
• The maximum heart rates for 10 adult males were recorded while shoveling snow. The sample mean and sample standard deviation were x̄ = 175, s = 15.
• Find a 90% CI for the population mean maximum heart rate for those who shovel snow.

Solution
x̄ = 175, s = 15, n = 10; d.f. = n − 1 = 9
From the t-table, t* = 1.8331

$\bar{x} \pm t^{*}\frac{s}{\sqrt{n}} = 175 \pm 1.8331\,\frac{15}{\sqrt{10}} = 175 \pm 8.70 = (166.30,\ 183.70)$

We are 90% confident that the interval (166.30, 183.70) contains the mean maximum heart rate for snow shovelers.
EXAMPLE: Consumer Protection Agency
• Selected random sample of 16 packages of a product whose packages are marked as weighing 1 pound.
• From the 16 packages: x̄ = 1.10 pounds, s = .36 pound
  a. Find a 95% CI for the mean weight μ of the 1-pound packages.
  b. Should the company's claim that the mean weight μ is 1 pound be challenged?

EXAMPLE
95% CI, n = 16, df = n − 1 = 15, x̄ = 1.10, s = .36
critical value of t is t* = 2.1315

$\bar{x} \pm t^{*}\frac{s}{\sqrt{n}}$ becomes $1.10 \pm 2.1315\left(\frac{.36}{\sqrt{16}}\right) = 1.10 \pm .19 = (.91,\ 1.29)$

Since 1 pound is in the interval, the company's claim appears reasonable.
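The three interval calculations above all follow the same recipe, which can be sketched as one small function. The function name `t_interval` is hypothetical, and the critical values are the ones read from the t-table in each example rather than computed.

```python
import math

def t_interval(xbar, s, n, t_star):
    """Return (lower, upper) for xbar +/- t* * s/sqrt(n); t_star comes from a t-table."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# (label, sample mean, sample SD, n, t* for df = n - 1) from the three examples
examples = [
    ("quality-award returns, 95%", 14.75, 8.18, 41, 2.0211),
    ("snow-shoveling heart rate, 90%", 175.0, 15.0, 10, 1.8331),
    ("1-pound packages, 95%", 1.10, 0.36, 16, 2.1315),
]

for label, xbar, s, n, t_star in examples:
    lower, upper = t_interval(xbar, s, n, t_star)
    print(f"{label}: ({lower:.2f}, {upper:.2f})")
```

With s/√41 as the standard error, the margin for the investor example works out to about 2.58. For part b of the package example, checking whether 1 lies between the two endpoints is the same reasoning used to judge the company's claim.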
Chapter 23
Testing Hypotheses about Means
Sweetness in cola soft drinks
Cola manufacturers want to test how much the sweetness of cola drinks is affected by storage. The sweetness loss due to storage was evaluated by 10 professional tasters by comparing the sweetness before and after storage (a positive value indicates a loss of sweetness):

Taster   Sweetness loss
1        2.0
2        0.4
3        0.7
4        2.0
5        −0.4
6        2.2
7        −1.3
8        1.2
9        1.1
10       2.3

We want to test if storage results in a loss of sweetness, thus:
H0: μ = 0 versus HA: μ > 0
where μ is the mean sweetness loss due to storage.
We also do not know the population parameter σ, the standard deviation of the sweetness loss.
The one-sample t-test
As in any hypothesis test, a hypothesis test for μ requires a few steps:
1. State the null and alternative hypotheses (H0 versus HA)
   a) Decide on a one-sided or two-sided test
2. Calculate the test statistic t and determine its degrees of freedom
3. Find the area under the t distribution with the t-table or technology
4. State the P-value (or find bounds on the P-value) and interpret the result
The one-sample t-test; hypotheses
Step 1:
1. State the null and alternative hypotheses (H0 versus HA)
   a) Decide on a one-sided or two-sided test
H0: μ = μ0 versus HA: μ > μ0 (1-tail test)
H0: μ = μ0 versus HA: μ < μ0 (1-tail test)
H0: μ = μ0 versus HA: μ ≠ μ0 (2-tail test)
The one-sample t-test; test statistic
We perform a hypothesis test with null hypothesis H0: μ = μ0 using the test statistic

$t = \frac{\bar{y} - \mu_0}{SE(\bar{y})}$

where the standard error of ȳ is

$SE(\bar{y}) = \frac{s}{\sqrt{n}}$

When the null hypothesis is true, the test statistic follows a t distribution with n − 1 degrees of freedom. We use that model to obtain a P-value.
The one-sample t-test; P-Values
Recall:
The P-value is the probability, calculated assuming the null hypothesis H0 is true, of observing a value of the test statistic at least as extreme as the value we actually observed.
The calculation of the P-value depends on whether the hypothesis test is 1-tailed (that is, the alternative hypothesis is HA: μ < μ0 or HA: μ > μ0) or 2-tailed (that is, the alternative hypothesis is HA: μ ≠ μ0).
P-Values
Assume the value of the test statistic t is t0
If HA: μ > μ0, then P-value = P(t > t0)
If HA: μ < μ0, then P-value = P(t < t0)
If HA: μ ≠ μ0, then P-value = 2P(t > |t0|)
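These three rules can be sketched in code. Python's standard library has no t distribution, so this illustration swaps in a numerical technique: it integrates the t density with Simpson's rule to get the tail area. That is an assumption of the sketch, not the method used in the slides, and a statistics package would normally do this step. The function names and the `alternative` labels are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t0, df, steps=2000):
    """P(T > t0), using Simpson's rule on [0, |t0|] plus symmetry (steps must be even)."""
    a = abs(t0)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3          # area to the right of |t0|
    return upper if t0 >= 0 else 1 - upper

def p_value(t0, df, alternative):
    """P-value for the observed statistic t0 under the three forms of HA."""
    if alternative == "greater":         # HA: mu > mu0
        return t_upper_tail(t0, df)
    if alternative == "less":            # HA: mu < mu0
        return 1 - t_upper_tail(t0, df)
    if alternative == "two-sided":       # HA: mu != mu0
        return 2 * t_upper_tail(abs(t0), df)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Sanity check against the t-table: 1.8125 cuts off an upper tail of .05 when df = 10
print(round(t_upper_tail(1.8125, 10), 4))
```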
Sweetening colas (continued)
Is there evidence that storage results in sweetness loss in colas?
H0: μ = 0 versus HA: μ > 0 (one-sided test)

Taster               Sweetness loss
1                    2.0
2                    0.4
3                    0.7
4                    2.0
5                    −0.4
6                    2.2
7                    −1.3
8                    1.2
9                    1.1
10                   2.3
___________________________
Average              1.02
Standard deviation   1.196
Degrees of freedom   n − 1 = 9

$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{1.02 - 0}{1.196/\sqrt{10}} = 2.70$

P-value = P(t9 > 2.70)

Conf. Level   0.1     0.3     0.5     0.7     0.8     0.9     0.95    0.98    0.99
Two Tail      0.9     0.7     0.5     0.3     0.2     0.1     0.05    0.02    0.01
One Tail      0.45    0.35    0.25    0.15    0.1     0.05    0.025   0.01    0.005
df            Values of t
9             0.1293  0.3979  0.7027  1.0997  1.3830  1.8331  2.2622  2.8214  3.2498

2.2622 < t = 2.70 < 2.8214; thus 0.01 < P-value < 0.025.
Since P-value < .05, we reject H0. There is a significant loss of sweetness, on average, following storage.
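The summary statistics and the test statistic can be recomputed from the ten tasters' raw scores. This is a minimal sketch using only Python's standard library; the variable names are illustrative.

```python
import math
import statistics

# Sweetness loss reported by tasters 1-10
loss = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]
mu0 = 0.0                                    # value of mu under H0

n = len(loss)
ybar = statistics.fmean(loss)                # sample mean
s = statistics.stdev(loss)                   # sample SD (divides by n - 1)
t_stat = (ybar - mu0) / (s / math.sqrt(n))   # one-sample t statistic

print(f"n = {n}, df = {n - 1}, ybar = {ybar:.2f}, s = {s:.3f}, t = {t_stat:.2f}")
```

The resulting t falls between the df = 9 table entries 2.2622 and 2.8214, which is what places the one-tail P-value between 0.01 and 0.025.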
Finding P-values with Excel
TDIST(x, degrees_freedom, tails)
TDIST = P(t > x) for a random variable t following the t distribution (x positive).
Use it in place of t-table to obtain the P-value.
– x is the absolute value of the test statistic.
– Deg_freedom is an integer indicating the number of degrees of freedom.
– Tails specifies the number of distribution tails to return. If tails = 1, TDIST returns
the one-tailed P-value. If tails = 2, TDIST returns the two-tailed P-value.
Sweetness in cola soft drinks (cont.)

$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{1.02 - 0}{1.196/\sqrt{10}} = 2.70$

2.2622 < t = 2.70 < 2.8214; thus 0.01 < p < 0.025.
New York City Hotel Room Costs
The NYC Visitors Bureau claims that the average cost of a hotel room is $168 per night. A random sample of 25 hotels resulted in ȳ = $172.50 and s = $15.40.
H0: μ = 168
HA: μ ≠ 168
New York City Hotel Room Costs
H0: μ = 168
HA: μ ≠ 168
• n = 25; df = 24
ȳ = $172.50, s = $15.40

t = (ȳ − μ0) / (s/√n) = (172.50 − 168) / (15.40/√25) = 1.46

P-value = 2P(t > 1.46)
P-value = .158
[Figure: t density with 24 df; area .079 in each tail, below −1.46 and above 1.46.]

Conf. Level   0.1     0.3     0.5     0.7     0.8     0.9     0.95    0.98    0.99
Two Tail      0.9     0.7     0.5     0.3     0.2     0.1     0.05    0.02    0.01
One Tail      0.45    0.35    0.25    0.15    0.1     0.05    0.025   0.01    0.005
df            Values of t
24            0.1270  0.3900  0.6848  1.0593  1.3178  1.7109  2.0639  2.4922  2.7969

0.1 ≤ P-value ≤ 0.2
Do not reject H0: not sufficient evidence that true mean cost is different than $168
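Reading bounds on a P-value from a single row of the t-table is mechanical enough to sketch in code. The helper `bracket_p_value` below is hypothetical; the row values are the df = 24 entries of the t-table. It works with the magnitude of t only, so choosing the correct tail for a one-sided test is left to the caller.

```python
import math

# df = 24 row of the t-table as (one-tail area, critical value) pairs
ROW_DF24 = [(0.45, 0.1270), (0.35, 0.3900), (0.25, 0.6848), (0.15, 1.0593),
            (0.10, 1.3178), (0.05, 1.7109), (0.025, 2.0639), (0.01, 2.4922),
            (0.005, 2.7969)]

def bracket_p_value(t_stat, row, tails=2):
    """Return (low, high) bounds on the P-value from one t-table row."""
    t_abs = abs(t_stat)
    high, low = 0.5, 0.0              # one-tail bounds before consulting the row
    for area, crit in row:            # areas shrink as critical values grow
        if crit <= t_abs:
            high = area               # tail area is at most this
        else:
            low = area                # tail area is at least this
            break
    return tails * low, tails * high

t_stat = (172.50 - 168) / (15.40 / math.sqrt(25))
low, high = bracket_p_value(t_stat, ROW_DF24, tails=2)
print(f"t = {t_stat:.2f}; {low:.2f} <= P-value <= {high:.2f}")
```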
Microwave Popcorn
A popcorn maker wants a combination of
microwave time and power that delivers
high-quality popped corn with less than 10%
unpopped kernels, on average. After testing,
the research department determines that
power 9 at 4 minutes is optimum. The
company president tests 8 bags in his office
microwave and finds the following
percentages of unpopped kernels: 7, 13.2,
10, 6, 7.8, 2.8, 2.2, 5.2.
Do the data provide evidence that the mean
percentage of unpopped kernels is less than
10%?
H0: μ = 10
HA: μ < 10
where μ is the true unknown mean percentage of unpopped kernels
Microwave Popcorn
H0: μ = 10
HA: μ < 10
• n = 8; df = 7
ȳ = 6.775, s = 3.64

$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{6.775 - 10}{3.64/\sqrt{8}} = -2.51$

P-value = P(t < −2.51)
Exact P-value = .02
[Figure: t density with 7 df; area .02 in the lower tail, below −2.51.]

Conf. Level   0.1     0.3     0.5     0.7     0.8     0.9     0.95    0.98    0.99
Two Tail      0.9     0.7     0.5     0.3     0.2     0.1     0.05    0.02    0.01
One Tail      0.45    0.35    0.25    0.15    0.1     0.05    0.025   0.01    0.005
df            Values of t
7             0.1303  0.4015  0.7111  1.1192  1.4149  1.8946  2.3646  2.9980  3.4995

Reject H0: there is sufficient evidence that true mean percentage of unpopped kernels is less than 10%
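The same conclusion can be reached by comparing the test statistic with a critical value instead of finding the P-value. The sketch below recomputes t from the eight bags and applies a lower-tailed decision rule. The 5% significance level is an assumption made for this illustration, and 1.8946 is the one-tail .05 entry for df = 7 from the t-table.

```python
import math
import statistics

unpopped = [7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2]   # percentages from the 8 bags
mu0 = 10.0                                        # H0: mu = 10, HA: mu < 10
T_CRIT_05_DF7 = 1.8946                            # one-tail .05 critical value, df = 7 (assumed 5% level)

n = len(unpopped)
ybar = statistics.fmean(unpopped)
s = statistics.stdev(unpopped)
t_stat = (ybar - mu0) / (s / math.sqrt(n))

# Lower-tailed test: reject H0 when t falls below -t*
reject = t_stat < -T_CRIT_05_DF7
print(f"ybar = {ybar:.3f}, s = {s:.2f}, t = {t_stat:.2f}, reject H0: {reject}")
```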