Section 6.2 Confidence Intervals for the Difference mu1-mu2

Section 6.2: Confidence Intervals for the Difference between Two Population Means µ1 - µ2: Independent Samples
• Two random samples are drawn from the two populations of interest.
• Because we compare two population means, we use the statistic $\bar{x}_1 - \bar{x}_2$.
Population 1
Parameters: $\mu_1$ and $\sigma_1^2$ (values are unknown)
Sample size: $n_1$
Statistics: $\bar{x}_1$ and $s_1^2$

Population 2
Parameters: $\mu_2$ and $\sigma_2^2$ (values are unknown)
Sample size: $n_2$
Statistics: $\bar{x}_2$ and $s_2^2$

Estimate $\mu_1 - \mu_2$ with $\bar{x}_1 - \bar{x}_2$.
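Sampling distribution model for $\bar{x}_1 - \bar{x}_2$

$$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$$

$$SD(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Estimate using

$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Shape? A $t$ model centered at 0 with $df$ degrees of freedom, where

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$

A sometimes used (not always very good) estimate of the degrees of freedom is $\min(n_1 - 1, n_2 - 1)$.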
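As a rough illustration, the standard error and the degrees-of-freedom formula above can be written as two small Python functions. This is only a sketch: the function names `standard_error` and `approx_df` and the input numbers are made up for illustration and do not come from the slides.

```python
import math

def standard_error(s1_sq, n1, s2_sq, n2):
    """SE of (xbar1 - xbar2) for independent samples: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1_sq / n1 + s2_sq / n2)

def approx_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom from the formula above (hypothetical helper name)."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Made-up summary statistics: equal sample variances and equal sample sizes
print(standard_error(4.0, 10, 4.0, 10))
print(approx_df(4.0, 10, 4.0, 10))
print(min(10 - 1, 10 - 1))  # the cruder min(n1 - 1, n2 - 1) estimate
```

With equal sample variances and equal sample sizes the formula reduces to df = 2(n - 1), which is 18 in this made-up case, while min(n1 - 1, n2 - 1) gives only 9.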
Confidence Interval for µ1 - µ2

Confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t^*_{df}$ is the value from the t-table that corresponds to the confidence level, and

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$
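The interval formula can be sketched as a small Python function that takes the summary statistics and a critical value looked up in a t-table. The function name `two_sample_ci`, its parameter names, and the example numbers are assumptions made for illustration only; 2.1009 is the 95% table value for df = 18.

```python
import math

def two_sample_ci(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """CI for mu1 - mu2 from summary statistics; t_star comes from a t-table."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    margin = t_star * se
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Made-up numbers, purely illustrative (not from the slides)
low, high = two_sample_ci(10.0, 2.0, 10, 8.0, 2.0, 10, t_star=2.1009)
print(round(low, 2), round(high, 2))
```

Passing the critical value in as an argument keeps the sketch within the standard library and mirrors the table lookup used throughout this section.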
Example: “Cameron Crazies”. Confidence interval for µ1 - µ2
• Do the “Cameron Crazies” at Duke home games help the Blue Devils play better defense?
• Below are the points allowed by Duke (men) at home and on the road for the conference games from a recent season.

Pts allowed at home:  44  56  44  54  75  101  91  81
Pts allowed on road:  56  70  74  80  67   65  79  58

home: $\bar{x}_1 = 68.25$, $s_1 = 21.8$, $n_1 = 8$
road: $\bar{x}_2 = 68.63$, $s_2 = 8.9$, $n_2 = 8$
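Example: “Cameron Crazies”. Confidence interval for µ1 - µ2
Calculate a 95% CI for $\mu_1 - \mu_2$, where
$\mu_1$ = mean points per game allowed by Duke at home,
$\mu_2$ = mean points per game allowed by Duke on the road.
• $n_1 = 8$, $n_2 = 8$; $s_1^2 = 475.36$; $s_2^2 = 79.41$ (sample variances computed from the data; squaring the rounded values $s_1 = 21.8$ and $s_2 = 8.9$ gives slightly different numbers)

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} = \frac{\left(\dfrac{475.36}{8} + \dfrac{79.41}{8}\right)^2}{\dfrac{1}{7}\left(\dfrac{475.36}{8}\right)^2 + \dfrac{1}{7}\left(\dfrac{79.41}{8}\right)^2} = 9.27$$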
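The summary statistics and the degrees of freedom can be checked directly from the points-allowed data. The sketch below uses Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
import statistics

# Points allowed by Duke, from the table in this example
home = [44, 56, 44, 54, 75, 101, 91, 81]
road = [56, 70, 74, 80, 67, 65, 79, 58]

n1, n2 = len(home), len(road)
v1 = statistics.variance(home)  # sample variance (divides by n - 1)
v2 = statistics.variance(road)

# Degrees of freedom from the formula above
a, b = v1 / n1, v2 / n2
df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(statistics.mean(home), statistics.mean(road))
print(round(v1, 2), round(v2, 2))
print(df)
```

The unrounded road mean is 68.625, and the degrees of freedom come out just above 9, in line with the value used on the slides.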
Example: “Cameron Crazies”. Confidence interval for µ1 - µ2
• To use the t-table let’s use df = 9; $t_9^* = 2.2622$
• The confidence interval estimator for the difference between two means is

$$\begin{aligned}
(\bar{x}_1 - \bar{x}_2) \pm t_9^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
&= (68.25 - 68.63) \pm 2.2622\sqrt{\frac{475.36}{8} + \frac{79.41}{8}} \\
&= -0.38 \pm 18.84 = (-19.22,\ 18.46)
\end{aligned}$$
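The arithmetic of this interval can be reproduced in a few lines of Python from the rounded summary values on the slide. This is a sketch only; the critical value 2.2622 is the table value for df = 9 quoted above.

```python
import math

# Rounded summary values as given on the slide
xbar_home, xbar_road = 68.25, 68.63
var_home, var_road = 475.36, 79.41
n_home = n_road = 8

se = math.sqrt(var_home / n_home + var_road / n_road)
margin = 2.2622 * se  # t* for df = 9, 95% confidence
diff = xbar_home - xbar_road
low, high = diff - margin, diff + margin

print(round(se, 2), round(margin, 2))
print(round(low, 2), round(high, 2))
```

The standard error is about 8.33 points, so the margin of error is far larger than the observed difference of -0.38.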
Interpretation
• The 95% CI for $\mu_1 - \mu_2$ is (-19.22, 18.46).
• Since the interval contains 0, there appears to be no significant difference between
$\mu_1$ = mean points per game allowed by Duke at home and
$\mu_2$ = mean points per game allowed by Duke on the road.
• The Cameron Crazies appear to have no effect on the ABILITY of the Duke men to play defense. How can this be?
Example: confidence interval for µ1 - µ2
• Example (p. 6)
– Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories for lunch than people who do not eat high-fiber cereal for breakfast?
– A sample of 150 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fiber cereal.
– For each person the number of calories consumed at lunch was recorded.
Example: confidence interval for µ1 - µ2
Calories consumed at lunch (excerpt):
Consumers: 568, 498, 589, 681, 540, 646, 636, 739, 539, 596, 607, 529, 637, 617, 633, 555, ...
Non-consumers: 705, 819, 706, 509, 613, 582, 601, 608, 787, 573, 428, 754, 741, 628, 537, 748, ...
Solution: (all data on p. 6)
• The parameter to be tested is the difference between two means.
• The claim to be tested is: the mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).
• $n_1 = 43$, $n_2 = 107$; $s_1^2 = 4{,}103$; $s_2^2 = 10{,}670$

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} = 122.6$$
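The degrees-of-freedom value can be verified from the summary statistics alone. A minimal Python sketch, with variable names chosen here for illustration:

```python
# Summary statistics from the slide
s1_sq, n1 = 4103, 43     # consumers of high-fiber cereal
s2_sq, n2 = 10670, 107   # non-consumers

a, b = s1_sq / n1, s2_sq / n2
df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
df_min = min(n1 - 1, n2 - 1)  # the cruder approximation

print(round(df, 1), df_min)
```

The formula gives about 122.6, while min(n1 - 1, n2 - 1) gives 42.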
Example: confidence interval for µ1 - µ2
• Let’s use df = 120; $t_{120}^* = 1.9799$
• The confidence interval estimator for the difference between two means using the formula on p. 4 is

$$\begin{aligned}
(\bar{x}_1 - \bar{x}_2) \pm t_{120}^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
&= (604.02 - 633.239) \pm 1.9799\sqrt{\frac{4103}{43} + \frac{10670}{107}} \\
&= -29.21 \pm 27.66 = (-56.87,\ -1.55)
\end{aligned}$$
Interpretation
• The 95% CI is (-56.87, -1.55).
• Since the interval is entirely negative (that is,
does not contain 0), there is evidence from
the data that µ1 is less than µ2. We estimate
that non-consumers of high-fiber breakfast
consume on average between 1.55 and 56.87
more calories for lunch.
Example (cont.): confidence interval for µ1 - µ2 using min(n1 - 1, n2 - 1) to approximate the df
• Let’s use df = min(43 - 1, 107 - 1) = min(42, 106) = 42;
• $t_{42}^* = 2.0181$
• The confidence interval estimator for the difference between two means using the formula on p. 4 is

$$\begin{aligned}
(\bar{x}_1 - \bar{x}_2) \pm t_{42}^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
&= (604.02 - 633.239) \pm 2.0181\sqrt{\frac{4103}{43} + \frac{10670}{107}} \\
&= -29.21 \pm 28.19 = (-57.40,\ -1.02)
\end{aligned}$$
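To see how the choice of degrees of freedom changes the result, the two intervals for the cereal example can be computed side by side. This Python sketch uses the sample means, variances, and the two table values quoted on the slides; the dictionary name `intervals` is an illustrative choice.

```python
import math

diff = 604.02 - 633.239
se = math.sqrt(4103 / 43 + 10670 / 107)

# Critical values quoted on the slides for each choice of df
t_values = {"df = 120": 1.9799, "df = 42": 2.0181}

intervals = {}
for label, t_star in t_values.items():
    margin = t_star * se
    intervals[label] = (diff - margin, diff + margin)
    print(label, round(margin, 2), intervals[label])
```

The endpoints agree with the slide values to within about 0.01 (the slides round the difference in means to -29.21). The df = 42 interval is slightly wider, so the min(n1 - 1, n2 - 1) rule is the more cautious choice here, and both intervals lie entirely below 0.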
Beware!! Common Mistake!!!
A common mistake is to calculate a one-sample confidence interval for $\mu_1$, a one-sample confidence interval for $\mu_2$, and then to conclude that $\mu_1$ and $\mu_2$ are equal if the confidence intervals overlap.
This is WRONG because the variability in the sampling distribution for $\bar{x}_1 - \bar{x}_2$ from two independent samples is more complex and must take into account variability coming from both samples. Hence the more complex formula for the standard error:

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
INCORRECT: Two single-sample 95% confidence intervals
The confidence interval for the male mean and the confidence interval for the female mean overlap, suggesting no significant difference between the true mean for males and the true mean for females.

              Male    Female
mean          19.4    17.9
st. dev. s    2.52    3.39
n             50      50

Male interval: (18.68, 20.12)
Female interval: (16.94, 18.86)
CORRECT: The 2-sample 95% confidence interval of the form

$$(\bar{y}_1 - \bar{y}_2) \pm t^*_{.025,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

for the difference $\mu_{male} - \mu_{female}$ between the means is (.313, 2.69). The interval is entirely positive, suggesting a significant difference between the true mean for males and the true mean for females (evidence that the true male mean is larger than the true female mean).

[Number line: the interval from .313 to 2.69, centered at 1.5, lies entirely to the right of 0.]
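The contrast between the two approaches can be checked numerically. The Python sketch below recomputes both one-sample intervals and the two-sample interval from the male and female summary statistics. The critical values are assumptions, since the slides do not state them: 2.0096 is the 95% table value for df = 49, and 1.9866 is the value for a df of about 90, which is what the df formula gives for these data.

```python
import math

def one_sample_ci(mean, s, n, t_star):
    """One-sample CI for a mean: mean +/- t* s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return mean - margin, mean + margin

# Summary statistics from the slide; t* = 2.0096 is an assumed table value (df = 49)
male = one_sample_ci(19.4, 2.52, 50, t_star=2.0096)
female = one_sample_ci(17.9, 3.39, 50, t_star=2.0096)

# Two-sample interval for mu_male - mu_female
se_diff = math.sqrt(2.52 ** 2 / 50 + 3.39 ** 2 / 50)
t_two = 1.9866  # assumed table value for df of about 90
diff = 19.4 - 17.9
two_sample = (diff - t_two * se_diff, diff + t_two * se_diff)

intervals_overlap = male[0] <= female[1] and female[0] <= male[1]
print("one-sample intervals overlap:", intervals_overlap)
print("two-sample interval excludes 0:", two_sample[0] > 0)
```

Both checks print True: the separate intervals overlap, yet the interval for the difference lies entirely above 0.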
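Reason for Contradictory Result
It's always true that $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$ for $a, b \ge 0$. Specifically,

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \frac{s_1}{\sqrt{n_1}} + \frac{s_2}{\sqrt{n_2}}$$

that is,

$$SE(\bar{x}_1 - \bar{x}_2) \le SE(\bar{x}_1) + SE(\bar{x}_2)$$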
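A short worked derivation shows why the inequality holds, followed by the numbers from the male and female example (s = 2.52 and 3.39, n = 50 each). The numeric values are computed here and rounded to three decimals.

```latex
\[
(\sqrt{a} + \sqrt{b})^2 = a + b + 2\sqrt{ab} \;\ge\; a + b
\quad\Longrightarrow\quad
\sqrt{a + b} \;\le\; \sqrt{a} + \sqrt{b}
\qquad (a, b \ge 0)
\]
\[
\sqrt{\frac{2.52^2}{50} + \frac{3.39^2}{50}} \approx 0.597
\;<\;
\frac{2.52}{\sqrt{50}} + \frac{3.39}{\sqrt{50}} \approx 0.356 + 0.479 \approx 0.836
\]
```

Because the standard error of the difference is smaller than the sum of the two separate standard errors, two one-sample intervals can overlap even when the two-sample interval excludes 0.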
Does smoking damage the lungs of children exposed to parental smoking?
Forced vital capacity (FVC) is the volume (in milliliters) of air that an individual can exhale in 6 seconds.
FVC was obtained for a sample of children not exposed to parental smoking and a group of children exposed to parental smoking.

Parental smoking    FVC $\bar{x}$    s       n
Yes                 75.5             9.3     30
No                  88.2             15.1    30
We want to know whether parental smoking decreases
children’s lung capacity as measured by the FVC test.
Is the mean FVC lower in the population of children
exposed to parental smoking?
95% confidence interval for (µ1 − µ2), with df = min(30 - 1, 30 - 1) = 29, so t* = 2.0452:
$\mu_1$ = mean FVC of children with a smoking parent;
$\mu_2$ = mean FVC of children without a smoking parent

$$\begin{aligned}
(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
&= (75.5 - 88.2) \pm 2.0452\sqrt{\frac{9.3^2}{30} + \frac{15.1^2}{30}} \\
&= -12.7 \pm 2.0452 \times 3.24 \\
&= -12.7 \pm 6.63 = (-19.33,\ -6.07)
\end{aligned}$$

We are 95% confident that mean lung capacity (FVC) is between 6.07 and 19.33 milliliters LESS in children of smoking parents.
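The FVC interval can be reproduced from the table in a few lines of Python. This is a sketch with illustrative variable names; it also evaluates the df formula from earlier in the section for comparison with the min(n1 - 1, n2 - 1) value of 29 used above.

```python
import math

# Summary statistics from the table: (mean FVC, s, n)
exposed = (75.5, 9.3, 30)      # parental smoking: yes
unexposed = (88.2, 15.1, 30)   # parental smoking: no

a = exposed[1] ** 2 / exposed[2]
b = unexposed[1] ** 2 / unexposed[2]
se = math.sqrt(a + b)
diff = exposed[0] - unexposed[0]

t_star = 2.0452  # table value for df = 29, as on the slide
low, high = diff - t_star * se, diff + t_star * se

# Degrees of freedom from the full formula, for comparison with 29
df_formula = (a + b) ** 2 / (a ** 2 / 29 + b ** 2 / 29)

print(round(se, 2), low, high, round(df_formula, 1))
```

Without rounding the standard error to 3.24 first, the endpoints differ from the slide values by less than 0.01. The df formula gives roughly 48, so using df = 29 yields a somewhat wider, more cautious interval.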
Do left-handed people have a shorter life expectancy than right-handed people?
• Some psychologists believe that the stress of being left-handed in a right-handed world leads to earlier deaths among left-handers.
• Several studies have compared the life expectancies of left-handers and right-handers.
• One such study resulted in the data shown in the table.

Handedness    Mean age at death $\bar{x}$    s       n
Left          66.8                           25.3    99
Right         75.2                           15.1    888

[Images: star left-handed quarterback Steve Young; left-handed presidents]
We will use the data to construct a confidence interval for the difference in mean life expectancies for left-handers and right-handers.
Is the mean life expectancy of left-handers less than the mean life expectancy of right-handers?
95% confidence interval for (µ1 − µ2), with df = min(99 - 1, 888 - 1) = 98, so t* = 1.9845:
$\mu_1$ = mean life expectancy of left-handers;
$\mu_2$ = mean life expectancy of right-handers

[Image: the “Bambino”, left-handed Babe Ruth, baseball’s all-time best player]

$$\begin{aligned}
(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
&= (66.8 - 75.2) \pm 1.9845\sqrt{\frac{(25.3)^2}{99} + \frac{(15.1)^2}{888}} \\
&= -8.4 \pm 1.9845 \times 2.59 \\
&= -8.4 \pm 5.14 = (-13.54,\ -3.26)
\end{aligned}$$

We are 95% confident that the mean life expectancy for left-handers is between 3.26 and 13.54 years LESS than the mean life expectancy for right-handers.
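The same calculation in Python also shows how unevenly the two samples contribute to the standard error. This is a sketch using the summary statistics from the table; the variable names are illustrative.

```python
import math

left_mean, left_s, left_n = 66.8, 25.3, 99
right_mean, right_s, right_n = 75.2, 15.1, 888

var_left = left_s ** 2 / left_n      # s1^2 / n1, left-handed sample
var_right = right_s ** 2 / right_n   # s2^2 / n2, right-handed sample
se = math.sqrt(var_left + var_right)

t_star = 1.9845  # table value for df = 98, as on the slide
diff = left_mean - right_mean
low, high = diff - t_star * se, diff + t_star * se

print(round(var_left, 2), round(var_right, 2), round(se, 2))
print(low, high)
```

The left-handed sample, which is both smaller and more variable, accounts for about 6.47 of the total 6.72 under the square root, so it almost entirely determines the width of the interval.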