Stat 301 – Lecture 5 Body Mass Index Two Independent Samples 

Two Independent Samples
Question
In 2000, did men and women differ in terms of their body mass index?
Populations and Samples
[Diagram: a sample is drawn by random selection from each population (1. Female, 2. Male); inference runs from the samples back to the populations.]
Body Mass Index

                              Females (1)   Males (2)
Sample size, n                50            50
Sample mean, Y-bar            27.484        26.868
Sample standard deviation, s  7.860         7.215

Pooled standard deviation: $s_p = 7.544$
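The slide reports the pooled standard deviation without showing how it is obtained. A minimal sketch, assuming the usual pooled formula for two independent samples (sample variances weighted by their degrees of freedom), reproduces the value; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the slide (1 = females, 2 = males).
n1, s1 = 50, 7.860
n2, s2 = 50, 7.215

# Assumed standard pooled formula: weight each sample variance
# by its degrees of freedom, then take the square root.
pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
s_p = sqrt(pooled_variance)

print(round(s_p, 3))  # 7.544
```

Because the two sample sizes are equal, the pooled variance here is simply the average of the two sample variances.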
95% Confidence Interval
$$(\bar{Y}_1 - \bar{Y}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where $t^*$ comes from the t-table with $df = n_1 + n_2 - 2$.
95% Confidence Interval
$$(\bar{Y}_1 - \bar{Y}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$(27.484 - 26.868) \pm 1.9845\,(7.544)\sqrt{\frac{1}{50} + \frac{1}{50}}$$
$$0.616 \pm 1.9845\,(1.509)$$
$$0.616 \pm 2.995$$
$$-2.38 \text{ to } 3.61$$
Interpretation
We are 95% confident that the
difference in population mean
BMI for women compared to
men is between –2.38 and 3.61.
Women could have a mean BMI
as much as 2.38 lower than
men or as much as 3.61 higher
than men.
Difference?
Because zero is in the
confidence interval, there
could be no difference in
population mean BMI for
women compared to men.
This agrees with the test of
hypothesis.
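The test of hypothesis itself is not shown in this lecture. As a rough sketch, assuming it is a two-sided pooled t test of no difference in population means at the 5% level, the test statistic can be compared with the same critical value used for the interval.

```python
from math import sqrt

# Values taken from the slides.
n1, n2 = 50, 50
ybar1, ybar2 = 27.484, 26.868
s_p = 7.544
t_star = 1.9845  # two-sided 5% critical value used for the interval (df = 98)

std_error = s_p * sqrt(1 / n1 + 1 / n2)
t_stat = (ybar1 - ybar2) / std_error  # assumed pooled two-sample t statistic

# The null hypothesis of no difference is rejected only if |t| exceeds t_star.
reject_null = abs(t_stat) > t_star
print(round(t_stat, 2), reject_null)  # 0.41 False
```

A statistic this far inside the critical value corresponds to zero lying well within the confidence interval, which is why the two procedures agree.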
Two-sample model
$$Y = \mu_i + \varepsilon$$
• $Y$ represents a value of the variable of interest
• $\mu_i$ represents the ith population mean
• $\varepsilon$ represents the random error associated with an observation
Conditions
The random error term, $\varepsilon$, is
• Independent
• Identically distributed
• Normally distributed with standard deviation $\sigma$
Residuals
• Estimate of error (Observation – Fit)
• Residual: $\hat{\varepsilon} = Y - \bar{Y}_i$
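The residual calculation can be sketched as follows. The raw BMI data are not included in the slides, so the values below are hypothetical and invented purely for illustration; only the method (observation minus the group's sample mean) comes from the slide.

```python
from statistics import mean

# Hypothetical BMI values, invented for illustration only (not the lecture data).
samples = {
    "female": [22.1, 31.4, 27.0, 24.9],
    "male": [25.3, 29.8, 23.6, 28.1],
}

residuals = {}
for group, values in samples.items():
    fit = mean(values)  # the fit for every observation is its group mean
    residuals[group] = [y - fit for y in values]

for group, res in residuals.items():
    print(group, [round(r, 2) for r in res])
```

Centering each group at its own mean is what lets the residuals from both groups be pooled into a single display; within each group the residuals sum to zero.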
Checking Conditions
Independence.
• Hard to check this, but the fact that we obtained the data through separate random samples of women and men assures us that the statistical methods should work.
Checking Conditions
Identically distributed.
• Check using an outlier box plot. Unusual points may come from a different distribution.
• Check using a histogram. A bimodal shape could indicate two different distributions.
Checking Conditions
Normally distributed.
• Check with a histogram. It should be symmetric and mounded in the middle.
• Check with a normal quantile plot. The points should fall close to a diagonal line.
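To make the normal quantile plot concrete, the sketch below pairs each sorted residual with the standard normal quantile for its rank; plotting these pairs gives the quantile plot. The residuals are hypothetical (not the lecture data), and the plotting position (i - 0.5)/n is one common convention assumed here, which may differ from what the course software uses.

```python
from statistics import NormalDist

def normal_quantile_pairs(residuals):
    """Pair each sorted residual with the standard normal quantile for its rank."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # (i - 0.5) / n is an assumed plotting position; other conventions exist.
    return [(std_normal.inv_cdf((i - 0.5) / n), r)
            for i, r in enumerate(ordered, start=1)]

# Hypothetical residuals, invented for illustration only (not the lecture data).
example_residuals = [-6.1, -3.4, -2.0, -0.8, 0.3, 1.1, 2.5, 8.4]
pairs = normal_quantile_pairs(example_residuals)

for z, r in pairs:
    print(f"{z:6.2f}  {r:6.2f}")
```

If the residuals were normally distributed, the second column would increase roughly in proportion to the first, so the plotted points would track a straight line.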
Distributions: BMI centered by Gender
[Figure: histogram (vertical axis: Count), outlier box plot, and normal quantile plot of BMI centered by gender.]
BMI Residuals
• Histogram is skewed left and mounded to the right of zero.
• Box plot is fairly symmetric with two potential outliers on the high side.
• Normal quantile plot has points following the diagonal line for the first part but then wiggles around for larger values.
Equal Variance?
• All of the error terms are supposed to be from the same distribution with a single standard deviation, $\sigma$.
• Display the residuals for each group, male and female.
Equal Variance?
• Both males and females show about the same variability. The sample standard deviations are very close.
• The equal variance condition is satisfied.
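The slides judge equal variance from the displays of the residuals. As a supplementary quick check (an addition here, not part of the lecture's method), the ratio of the larger to the smaller sample standard deviation shows how close the two are.

```python
# Sample standard deviations from the slides: females, males.
s1, s2 = 7.860, 7.215

# Ratio of the larger to the smaller standard deviation; 1.0 means identical spread.
sd_ratio = max(s1, s2) / min(s1, s2)
print(round(sd_ratio, 2))  # 1.09
```

A ratio this close to 1 is consistent with the statement that the two groups show about the same variability.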
BMI Residuals
The identically distributed
and normally distributed
error conditions necessary
for statistical inference may
not be met for these data.
Consequences
• The P-value for the test may not be correct.
• Even so, there is not much of a difference between women and men, and I would not change my conclusion from the test of hypothesis.
Consequences
• The stated confidence level may not give the true coverage rate.
• I would still use the confidence interval but recognize that the true coverage rate is probably less than 95%.