Chapter 24 Confidence Intervals and Hypothesis

Chapter 24
Comparing Means:
Confidence Intervals and Hypotheses
Tests for the Difference between Two
Population Means µ1 - µ2
Confidence Intervals for the
Difference between Two Population
Means µ1 - µ2: Independent Samples
• Two random samples are drawn from the
two populations of interest.
• Because we compare two population
means, we use the statistic $\bar{x}_1 - \bar{x}_2$.
Population 1
Parameters: $\mu_1$ and $\sigma_1^2$ (values are unknown)
Sample size: $n_1$
Statistics: $\bar{x}_1$ and $s_1^2$

Population 2
Parameters: $\mu_2$ and $\sigma_2^2$ (values are unknown)
Sample size: $n_2$
Statistics: $\bar{x}_2$ and $s_2^2$

Estimate $\mu_1 - \mu_2$ with $\bar{x}_1 - \bar{x}_2$.
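Sampling distribution model for $\bar{x}_1 - \bar{x}_2$?
$$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$$
$$SD(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Estimate using
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Shape? A t distribution with degrees of freedom
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$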
A sometimes-used (though not always very good) estimate of the
degrees of freedom is $\min(n_1 - 1, n_2 - 1)$.
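[Figure: t distribution with df degrees of freedom for $\bar{x}_1 - \bar{x}_2$, centered at $\mu_1 - \mu_2$, with standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$]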
Confidence Interval for $\mu_1 - \mu_2$
Confidence interval
$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where $t^*_{df}$ is the value from the t-table
that corresponds to the confidence level and
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$
Example: “Cameron Crazies”.
Confidence interval for $\mu_1 - \mu_2$
• Do the “Cameron Crazies” at Duke home games help the Blue Devils play better
defense?
• Below are the points allowed by Duke (men) at home and on the road for the
conference games from a recent season.

Pts allowed at home:   44   56   44   54   75   101   91   81
Pts allowed on road:   56   70   74   80   67   65    79   58

home: $\bar{x}_1 = 68.25$, $s_1 = 21.8$, $n_1 = 8$
road: $\bar{x}_2 = 68.63$, $s_2 = 8.9$, $n_2 = 8$
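Example: “Cameron Crazies”. Confidence
interval for $\mu_1 - \mu_2$
Calculate a 95% CI for $\mu_1 - \mu_2$ where
$\mu_1$ = mean points per game allowed by Duke at home.
$\mu_2$ = mean points per game allowed by Duke on the road.
• $n_1 = 8$, $n_2 = 8$; $s_1^2 = 475.36$; $s_2^2 = 79.41$ (computed from the unrounded standard deviations, $s_1 \approx 21.80$ and $s_2 \approx 8.91$)
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} = \frac{\left(\dfrac{475.36}{8} + \dfrac{79.41}{8}\right)^2}{\dfrac{1}{7}\left(\dfrac{475.36}{8}\right)^2 + \dfrac{1}{7}\left(\dfrac{79.41}{8}\right)^2} = 9.27$$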
Example: “Cameron Crazies”. Confidence
interval for $\mu_1 - \mu_2$
• To use the t-table let’s use df = 9; $t_9^* = 2.2622$
• The confidence interval estimator for the
difference between two means is
$$(\bar{x}_1 - \bar{x}_2) \pm t_9^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (68.25 - 68.63) \pm 2.2622 \sqrt{\frac{475.36}{8} + \frac{79.41}{8}} = -.38 \pm 18.84 = (-19.22,\ 18.46)$$
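The interval above can be checked directly from the raw scores. The following is a minimal sketch using only Python's standard library; the critical value 2.2622 is the t-table value quoted above for df = 9, and the variable names are illustrative rather than part of the original example.

```python
import math
import statistics

# Points allowed by Duke in conference games (data from the example).
home = [44, 56, 44, 54, 75, 101, 91, 81]
road = [56, 70, 74, 80, 67, 65, 79, 58]

n1, n2 = len(home), len(road)
xbar1, xbar2 = statistics.mean(home), statistics.mean(road)
# statistics.variance returns the sample variance s^2 (divisor n - 1).
var1, var2 = statistics.variance(home), statistics.variance(road)

# Standard error of the difference between the two sample means.
se = math.sqrt(var1 / n1 + var2 / n2)

# Degrees of freedom from the formula used in the example.
df = (var1 / n1 + var2 / n2) ** 2 / (
    (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)
)

t_star = 2.2622  # t-table value for df = 9, 95% confidence (taken from the text)
diff = xbar1 - xbar2
lower, upper = diff - t_star * se, diff + t_star * se

print(f"df = {df:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Because the code carries the unrounded difference −0.375 rather than −.38, its endpoints can differ from the hand calculation in the last decimal place.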
Interpretation
• The 95% CI for $\mu_1 - \mu_2$ is (−19.22, 18.46).
• Since the interval contains 0, there appears to be
no significant difference between
$\mu_1$ = mean points per game allowed by Duke at home and
$\mu_2$ = mean points per game allowed by Duke on the road.
• The Cameron Crazies appear to have no effect on
the ABILITY of the Duke men to play defense.
How can this be?
Beware!! Common Mistake !!!
A common mistake is to calculate a one-sample
confidence interval for $\mu_1$, a one-sample confidence interval for
$\mu_2$, and to then conclude that $\mu_1$ and $\mu_2$ are equal if the
confidence intervals overlap.
This is WRONG because the variability in the sampling
distribution for $\bar{x}_1 - \bar{x}_2$ from two independent samples is more
complex and must take into account variability coming from both
samples. Hence the more complex formula for the standard error.
$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
INCORRECT: Two single-sample 95% confidence intervals.

          mean   st. dev. s   n    95% interval
Male      19.4   2.52         50   (18.68, 20.12)
Female    17.9   3.39         50   (16.94, 18.86)

The confidence interval for the male mean and the
confidence interval for the female mean overlap,
suggesting no significant difference between the true
mean for males and the true mean for females.
CORRECT: The 2-sample 95% confidence interval of the form
$$(\bar{y}_1 - \bar{y}_2) \pm t^*_{.025,\,df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
for the difference $\mu_{male} - \mu_{female}$ between the means
is (.313, 2.69). The interval is entirely positive, suggesting a significant difference
between the true mean for males and the true mean for females
(evidence that the true male mean is larger than the true female mean).
[Figure: number line marking 0, the interval endpoints .313 and 2.69, and the point estimate 1.5]
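Reason for Contradictory Result
It's always true that $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$ for $a, b \ge 0$. Specifically,
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \frac{s_1}{\sqrt{n_1}} + \frac{s_2}{\sqrt{n_2}}$$
$$SE(\bar{x}_1 - \bar{x}_2) \le SE(\bar{x}_1) + SE(\bar{x}_2)$$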
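The inequality can be checked numerically with the male/female summary statistics quoted in the example. This is a small sketch (variable names are illustrative): the overlap check implicitly works with the sum of the two one-sample standard errors, whereas the two-sample interval uses the smaller standard error of the difference.

```python
import math

# Summary statistics from the male/female example.
s_male, n_male = 2.52, 50
s_female, n_female = 3.39, 50

# One-sample standard errors.
se_male = s_male / math.sqrt(n_male)
se_female = s_female / math.sqrt(n_female)

# Standard error of the difference of two independent sample means.
se_diff = math.sqrt(s_male**2 / n_male + s_female**2 / n_female)

print(f"SE(male) + SE(female) = {se_male + se_female:.3f}")
print(f"SE(difference)        = {se_diff:.3f}")
```

The second number is smaller than the first, which is why two one-sample intervals can overlap even when the two-sample interval excludes 0.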
Does smoking damage the lungs of children exposed
to parental smoking?
Forced vital capacity (FVC) is the volume (in milliliters) of
air that an individual can exhale in 6 seconds.
FVC was obtained for a sample of children not exposed to
parental smoking and a group of children exposed to
parental smoking.
Parental smoking   FVC mean   s      n
Yes                75.5       9.3    30
No                 88.2       15.1   30

We want to know whether parental smoking decreases
children’s lung capacity as measured by the FVC test.
Is the mean FVC lower in the population of children
exposed to parental smoking?
$\mu_1$ = mean FVC of children with a smoking parent;
$\mu_2$ = mean FVC of children without a smoking parent
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} = 48.23$$
95% confidence interval for $(\mu_1 - \mu_2)$, with
df = 48.23, $t^* = 2.0104$:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (75.5 - 88.2) \pm 2.0104 \sqrt{\frac{9.3^2}{30} + \frac{15.1^2}{30}}$$
$$= -12.7 \pm 2.0104 \times 3.24 = -12.7 \pm 6.51 = (-19.21,\ -6.19)$$
We are 95% confident that mean lung capacity is between
6.19 and 19.21 milliliters LESS in children of smoking
parents.
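The same calculation can be done from summary statistics alone. Below is a minimal sketch in standard-library Python; the helper names `welch_df` and `two_sample_ci` are hypothetical, and the critical value 2.0104 is taken from the text rather than computed.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom for the unequal-variance two-sample t procedures."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def two_sample_ci(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2, given a critical value t_star."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - t_star * se, diff + t_star * se

# FVC summary statistics: group 1 = smoking parent, group 2 = no smoking parent.
df = welch_df(9.3, 30, 15.1, 30)
lower, upper = two_sample_ci(75.5, 9.3, 30, 88.2, 15.1, 30, t_star=2.0104)

print(f"df = {df:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Passing the critical value in as an argument keeps the sketch free of any statistics library; with a t-distribution routine available, `t_star` could instead be looked up from `df`.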
Do left-handed people have a shorter life-expectancy than
right-handed people?
• Some psychologists believe that the stress of being left-handed in a right-handed world leads to earlier deaths among
left-handers.
• Several studies have compared the life expectancies of left-handers and right-handers.
• One such study resulted in the data shown in the table.
Handedness   Mean age at death   s      n
Left         66.8                25.3   99
Right        75.2                15.1   888

[Photos: star left-handed quarterback Steve Young; left-handed presidents]

We will use the data to construct a confidence interval
for the difference in mean life expectancies for left-handers
and right-handers.
Is the mean life expectancy of left-handers less
than the mean life expectancy of right-handers?
$\mu_1$ = mean life expectancy of left-handers;
$\mu_2$ = mean life expectancy of right-handers
95% confidence interval for $(\mu_1 - \mu_2)$, with
df = 105.92, $t^* = 1.9826$:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (66.8 - 75.2) \pm 1.9826 \sqrt{\frac{(25.3)^2}{99} + \frac{(15.1)^2}{888}}$$
$$= -8.4 \pm 1.9826 \times 2.59 = -8.4 \pm 5.13 = (-13.53,\ -3.27)$$
[Photo: The “Bambino”, left-handed Babe Ruth, baseball’s all-time best player.]
We are 95% confident that the mean life expectancy for left-handers is between 3.27 and 13.53 years LESS than the mean
life expectancy for right-handers.
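With sample sizes as unequal as 99 and 888, it is instructive to compare the degrees of freedom from the formula with the conservative estimate min(n1 − 1, n2 − 1) mentioned earlier. A short sketch using the handedness summary statistics (variable names are illustrative):

```python
# Summary statistics from the handedness example.
s_left, n_left = 25.3, 99
s_right, n_right = 15.1, 888

v_left = s_left**2 / n_left
v_right = s_right**2 / n_right

# Degrees of freedom from the unequal-variance formula.
df_formula = (v_left + v_right) ** 2 / (
    v_left**2 / (n_left - 1) + v_right**2 / (n_right - 1)
)

# Conservative estimate: the smaller sample size minus one.
df_conservative = min(n_left - 1, n_right - 1)

print(f"df from formula:      {df_formula:.2f}")
print(f"conservative min df:  {df_conservative}")
```

The formula value always lies between min(n1 − 1, n2 − 1) and n1 + n2 − 2; here it sits close to the lower end because the smaller sample also has the larger variance and so dominates the standard error.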
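Two-sample t-test
The null hypothesis $H_0$ is that both population
means $\mu_1$ and $\mu_2$ are equal,
thus their difference is equal to zero.
$$H_0: \mu_1 - \mu_2 = 0$$
$$H_A: \mu_1 - \mu_2 \begin{cases} < 0 & \text{(1 tail)} \\ > 0 & \text{(1 tail)} \\ \neq 0 & \text{(2 tail)} \end{cases}$$
test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Because in a two-sample test
$H_0$ says $(\mu_1 - \mu_2) = 0$, the test
statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
P-values, where $t_0$ is the observed value of the test statistic:
• $H_A: \mu_1 - \mu_2 < 0$: P-value = $P(t < t_0)$
• $H_A: \mu_1 - \mu_2 > 0$: P-value = $P(t > t_0)$
• $H_A: \mu_1 - \mu_2 \neq 0$: P-value = $2P(t > |t_0|)$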
Does smoking damage the lungs of children
exposed to parental smoking?
Forced vital capacity (FVC) is the volume (in milliliters) of air that an
individual can exhale in 6 seconds.
FVC was obtained for a sample of children not exposed to parental
smoking and a group of children exposed to parental smoking.
Parental smoking   FVC mean   s      n
Yes                75.5       9.3    30
No                 88.2       15.1   30

We want to know whether parental smoking decreases
children’s lung capacity as measured by the FVC test.
Is the mean FVC lower in the population of children
exposed to parental smoking?
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 < 0$
$\mu_1$ = mean FVC of children with a smoking parent;
$\mu_2$ = mean FVC of children without a smoking parent
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} = 48.23$$
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{75.5 - 88.2}{\sqrt{\dfrac{9.3^2}{30} + \dfrac{15.1^2}{30}}} = \frac{-12.7}{\sqrt{2.9 + 7.6}} \approx -3.9$$
P-value = $P(t < -3.9) \approx .0001$
Conclusion: Reject $H_0$. Lung capacity is
significantly impaired in children of smoking parents.
Recall the 95% CI for $\mu_1 - \mu_2$: (−19.21, −6.19)
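The test statistic and P-value can be reproduced without a t-table. The sketch below uses only the standard library and approximates the t distribution's cumulative probability by numerically integrating its density with Simpson's rule; the helpers `t_pdf` and `t_cdf` are hypothetical names written for this illustration, not functions from any statistics package.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom (df may be fractional)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and the symmetry of the density."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics from the FVC example.
xbar1, s1, n1 = 75.5, 9.3, 30   # exposed to parental smoking
xbar2, s2, n2 = 88.2, 15.1, 30  # not exposed

v1, v2 = s1**2 / n1, s2**2 / n2
t0 = (xbar1 - xbar2) / math.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Lower-tail P-value, matching the alternative Ha: mu1 - mu2 < 0.
p_value = t_cdf(t0, df)

print(f"t = {t0:.2f}, df = {df:.2f}, P-value = {p_value:.5f}")
```

For an upper-tail alternative the P-value would be `1 - t_cdf(t0, df)`, and for a two-sided alternative `2 * (1 - t_cdf(abs(t0), df))`.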
Can directed reading activities in the classroom help improve reading ability?
A class of 21 third-graders participates in these activities for 8 weeks while a
control classroom of 23 third-graders follows the same curriculum without the
activities. After 8 weeks, all children take a reading test (scores in table).
$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 > 0$
$\mu_1$ = mean test score of activities participants
$\mu_2$ = mean test score of controls
$$t = \frac{51.48 - 41.52}{\sqrt{\dfrac{11.01^2}{21} + \dfrac{17.15^2}{23}}} = 2.31, \qquad df = 37.86$$
P-value = $P(t_{37.86} > 2.31) = .013$
There is evidence that reading activities
improve reading ability.
Robustness
The two-sample t procedures are more robust than the one-
sample t procedures. They are the most robust when both
sample sizes are equal and both sample distributions are similar.
But even when we deviate from this, two-sample tests tend to
remain quite robust.
• When planning a two-sample study, choose equal sample
sizes if you can.
As a guideline, a combined sample size (n1 + n2) of 40 or more
will allow you to work even with the most skewed distributions.
Pooled two-sample procedures
There are two versions of the two-sample t-test: one assuming
equal variance (“pooled 2-sample test”) and one not assuming
equal variance (“unequal” variance, as we have studied) for the
two populations. They have slightly different formulas and
degrees of freedom.
[Figure: Two normally distributed populations with unequal variances]
The pooled (equal variance) two-sample t-test was often used before
computers because it has exactly
the t distribution for degrees of
freedom n1 + n2 − 2.
However, the assumption of equal
variance is hard to check, and thus
the unequal variance test is safer.
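Pooled two-sample procedures (cont.)
When both populations have the
same standard deviation, the
pooled estimator of $\sigma^2$ is:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The pooled two-sample t statistic has exactly the t distribution
with (n1 + n2 − 2) degrees of freedom.
A level C confidence interval for µ1 − µ2 is
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
(with area C between −t* and t*).
To test the hypothesis H0: µ1 − µ2 = 0 against a
one-sided or a two-sided alternative,
compute the pooled two-sample t statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$
for the t(n1 + n2 − 2) distribution.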
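To see how much the two versions differ in practice, the sketch below applies both to the directed reading activities summary statistics from the earlier example. It assumes the usual textbook pooled variance, $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$; the variable names are illustrative.

```python
import math

# Summary statistics from the directed reading activities example.
xbar1, s1, n1 = 51.48, 11.01, 21  # activities participants
xbar2, s2, n2 = 41.52, 17.15, 23  # controls

# Unpooled (unequal-variance) version, as in the worked example.
v1, v2 = s1**2 / n1, s2**2 / n2
t_unpooled = (xbar1 - xbar2) / math.sqrt(v1 + v2)
df_unpooled = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Pooled (equal-variance) version: one shared variance estimate
# (assumed standard formula, see the lead-in).
df_pooled = n1 + n2 - 2
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df_pooled
t_pooled = (xbar1 - xbar2) / math.sqrt(sp2 / n1 + sp2 / n2)

print(f"unpooled: t = {t_unpooled:.2f}, df = {df_unpooled:.2f}")
print(f"pooled:   t = {t_pooled:.2f}, df = {df_pooled}")
```

The two statistics are close here, but the pooled version is only justified if the population variances really are equal, which the sample standard deviations 11.01 and 17.15 do not obviously support.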
Which type of test? One sample,
paired samples, two samples?
• Comparing vitamin content of bread
immediately after baking vs. 3 days
later (the same loaves are used on day
one and 3 days later).
→ Paired
• Comparing vitamin content of bread
immediately after baking vs. 3 days
later (tests made on independent
loaves).
→ Two samples
• Average fuel efficiency for 2005
vehicles is 21 miles per gallon. Is
average fuel efficiency higher in the
new generation “green vehicles”?
→ One sample
• Is blood pressure altered by use of
an oral contraceptive? Comparing a
group of women not using an oral
contraceptive with a group taking it.
→ Two samples
• Review insurance records for dollar
amount paid after fire damage in
houses equipped with a fire
extinguisher vs. houses without one.
Was there a difference in the
average dollar amount paid?
→ Two samples