Stat 301 – Lecture 5

Two Independent Samples

Question
In 2000, did men and women differ in terms of their body mass index?

Populations and Samples
[Diagram: Populations (1. Female, 2. Male) -> random selection -> Samples. Inference runs from the samples back to the populations.]

Body Mass Index

Group         n     Mean      Std. dev.
1. Females    50    27.484    7.860
2. Males      50    26.868    7.215

Pooled standard deviation: $s_p = 7.544$

95% Confidence Interval

$$(\bar{Y}_1 - \bar{Y}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$t^*$ from the t-table with $df = n_1 + n_2 - 2$

95% Confidence Interval

$$(27.484 - 26.868) \pm 1.9845\,(7.544)\sqrt{\frac{1}{50} + \frac{1}{50}}$$

$$0.616 \pm 1.9845\,(1.509)$$

$$0.616 \pm 2.995$$

-2.38 to 3.61

Interpretation
We are 95% confident that the difference in population mean BMI for women compared to men is between -2.38 and 3.61. Women could have a mean BMI as much as 2.38 lower than men or as much as 3.61 higher than men.

Difference?
Because zero is in the confidence interval, there could be no difference in population mean BMIs for women compared to men. This agrees with the test of hypothesis.

Two-sample model

$$Y = \mu_i + \varepsilon$$

• $Y$ represents a value of the variable of interest
• $\mu_i$ represents the $i$th population mean
• $\varepsilon$ represents the random error associated with an observation

Conditions
The random error term, $\varepsilon$, is
• Independent
• Identically distributed
• Normally distributed with standard deviation, $\sigma$

Residuals
Estimate of error (Observation – Fit)

$$\text{Residual} = \hat{\varepsilon} = Y - \bar{Y}_i$$

Checking Conditions
Independence.
• Hard to check this, but the fact that we obtained the data through separate random samples of women and men assures us that the statistical methods should work.

Checking Conditions
Identically distributed.
• Check using an outlier box plot. Unusual points may come from a different distribution.
• Check using a histogram. Bimodal shape could indicate two different distributions.

Checking Conditions
Normally distributed.
• Check with a histogram. Symmetric and mounded in the middle.
• Check with a normal quantile plot. Points falling close to a diagonal line.
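As a rough illustration of the residual and normal quantile checks described above, here is a minimal Python sketch. The BMI values are hypothetical and are not the lecture's data. The helper names `residuals_by_group` and `normal_quantile_points` are made up for this example, and the plotting positions (i - 0.5) / n are one common convention, so statistical software may use slightly different ones.

```python
from statistics import NormalDist, mean

# Hypothetical BMI values for illustration only; NOT the lecture's data.
bmi = {
    "female": [21.5, 24.0, 27.3, 30.1, 35.6],
    "male": [22.8, 25.1, 26.4, 28.0, 31.2],
}


def residuals_by_group(groups):
    """Residual = observation - fit, where the fit is the group's sample mean."""
    resid = []
    for values in groups.values():
        group_mean = mean(values)
        resid.extend(y - group_mean for y in values)
    return resid


def normal_quantile_points(resid):
    """Pair each sorted residual with a standard normal quantile.

    Plotting positions (i - 0.5) / n are an assumption; other conventions exist.
    """
    n = len(resid)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (std_normal.inv_cdf((i - 0.5) / n), r)
        for i, r in enumerate(sorted(resid), start=1)
    ]


resid = residuals_by_group(bmi)
points = normal_quantile_points(resid)

# Points that fall close to a straight line suggest roughly normal errors.
for z, r in points:
    print(f"{z:7.3f}  {r:7.3f}")
```

Plotting the printed pairs (normal quantile on one axis, residual on the other) gives the normal quantile plot. The same list of residuals can be fed to a histogram or box plot for the identically distributed check.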

Distributions
[Figure: normal quantile plot and histogram (vertical axis: Count) of BMI centered by Gender; horizontal axis runs from -20 to 20.]

BMI Residuals
• Histogram is skewed left and mounded to the right of zero.
• Box plot is fairly symmetric with two potential outliers on the high side.
• Normal quantile plot has points following the diagonal line for the first part but then wiggles around for larger values.

Equal Variance?
• All of the error terms are supposed to be from the same distribution with a single standard deviation, σ.
• Display the residuals for each group, male and female.

Equal Variance?
• Both males and females show about the same variability.
• The sample standard deviations are very close.
• The equal variance condition is satisfied.

BMI Residuals
The identically distributed and normally distributed error conditions necessary for statistical inference may not be met for these data.

Consequences
The P-value for the test may not be correct.
• Even so, there is not much of a difference between women and men, and I would not change my conclusion from the test of hypothesis.

Consequences
The stated confidence level may not give the true coverage rate.
• I would still use the confidence interval but recognize that the true coverage rate is probably less than 95%.
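For reference, the pooled 95% confidence interval discussed in this lecture can be reproduced from the summary statistics alone. The sketch below is one way to do it in Python. The function names `pooled_sd` and `pooled_ci` are made up for this example, and the critical value 1.9845 is copied from the lecture's t-table lookup (df = 98) rather than computed.

```python
from math import sqrt

# Summary statistics from the lecture's Body Mass Index slide.
n1, ybar1, s1 = 50, 27.484, 7.860  # sample 1: females
n2, ybar2, s2 = 50, 26.868, 7.215  # sample 2: males

# Critical value taken from the lecture (t-table, df = n1 + n2 - 2 = 98).
t_star = 1.9845


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation, weighting each sample variance by its df."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_ci(ybar1, ybar2, sp, n1, n2, t_star):
    """Interval (ybar1 - ybar2) +/- t* * sp * sqrt(1/n1 + 1/n2)."""
    diff = ybar1 - ybar2
    margin = t_star * sp * sqrt(1 / n1 + 1 / n2)
    return diff - margin, diff + margin


sp = pooled_sd(n1, s1, n2, s2)
lower, upper = pooled_ci(ybar1, ybar2, sp, n1, n2, t_star)

print(f"pooled sd = {sp:.3f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Because the interval contains zero, the code reaches the same conclusion as the lecture. It does not address whether the true coverage rate is actually 95% when the error conditions are in doubt.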