Regression Line ANS

Table Number:__________
Group Name: __KEY________________
Group Members:_____________________
_____________________
_____________________
Activity: The Regression Line
The data set, which is stored in the file North Carolina birth data, is a random sample of 1000 birth records taken
by the North Carolina State Center for Health and Environmental Statistics in 2005. Of particular interest will be
incidents of low infant birth weight. Low birth weight is defined as less than 2500 grams. The goal of this activity
will be to summarize the variables in this data set both graphically and numerically. The variables in the study are:
Variable Label
Description of variable
Plurality
Refers to the number of children associated with the birth
Sex
Sex of child (Gender 1=Male 2=Female)
Fage
Age of father (years)
Mage
Age of mother (years)
Weeks
Completed weeks of gestation
Visits
Number of pre-natal medical visits
Marital
Marital status (1=married 2=unmarried)
Racemom
Race of mother (0=Other Non-white 1=White 2=Black 3=American Indian 4=Chinese
5=Japanese 6=Hawaiian 7=Filipino 8=Other Asian or Pacific Islander)
Hispmom
Whether mother is of Hispanic origin (C=Cuban M=Mexican N=Non-Hispanic O=Other and
Unknown Hispanic P=Puerto Rican S=Central/South American U=Not Classifiable)
Gained
Weight gain during pregnancy (pounds)
Lowbw
Whether birth weight is 2500 grams or lower (0=infant was not low birth weight 1=infant was low
birth weight)
Tpounds
Birth weight in pounds
Smoke
0=no 1=yes for mother admitted to smoking
Mature
0=no (mother is 34 or younger) 1=yes (mother is 35 or older)
Premie
0=no 1=yes for being born at 36 weeks or sooner
1. Answer the following for Tpounds (response variable) regressed on Weeks (explanatory variable).
A. Make a scatterplot of the data with the regression line. Report the parameter estimates (estimates of the
slope and intercept). Copy and paste your information here:
Simple linear regression results:
Dependent Variable: tpounds
Independent Variable: weeks
tpounds = -6.0945723 + 0.34428023 weeks
Sample size: 998
R (correlation coefficient) = 0.67005908
R-sq = 0.44897917
Estimate of error standard deviation: 1.1186614
Parameter estimates:
Parameter   Estimate     Std. Err.     Alternative   DF    T-Stat       P-value
Intercept   -6.0945723   0.46463292    ≠ 0           996   -13.116962   <0.0001
Slope       0.34428023   0.012085186   ≠ 0           996   28.48779     <0.0001
B. Interpret the slope and the intercept.
Slope = (change in birth weight)/(change in weeks of gestation) = 0.344/1
For every one-week increase in the length of gestation, the birth weight is expected to (or tends to)
increase by 0.344 pounds on average.
Y-intercept: At 0 weeks of gestation, the predicted birth weight is -6.1 pounds. This makes no sense in the
context of the problem.
C. What percentage of the variation in Tpounds is explained by Weeks? Is this high or low?
About 45% (R-sq = 0.449)
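As a quick check, the percentage of variation explained is the square of the correlation coefficient reported in the output. A minimal Python sketch, using the R value copied from the output (the variable names are only for illustration):

```python
# Correlation coefficient for tpounds regressed on weeks, copied from the output.
r = 0.67005908

# For simple linear regression, R-sq is the square of r.
r_squared = r ** 2

# Express as a percentage of the variation in Tpounds explained by Weeks.
percent_explained = 100 * r_squared
print(f"R-sq = {r_squared:.4f}, about {percent_explained:.1f}% of the variation")
```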
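D. What is the predicted value for Tpounds when Weeks is 35? What if Weeks is 40?
When Weeks is 35, the predicted birth weight is about 6.0 pounds.
When Weeks is 40, the predicted birth weight is about 7.7 pounds.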
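These predictions come from plugging each value of Weeks into the fitted line. A small Python sketch, assuming the intercept and slope reported in the output (the function name predict_tpounds is hypothetical, chosen for this example):

```python
# Fitted line from the output: tpounds = -6.0945723 + 0.34428023 weeks
INTERCEPT = -6.0945723
SLOPE = 0.34428023


def predict_tpounds(weeks):
    """Return the predicted birth weight (pounds) for a given number of weeks."""
    return INTERCEPT + SLOPE * weeks


# Evaluate the line at the two gestation lengths asked about in part D.
for weeks in (35, 40):
    print(f"Weeks = {weeks}: predicted Tpounds = {predict_tpounds(weeks):.2f}")
```

The same function can be reused for any other value of Weeks that falls within the range of the data.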
E. Comment on the fit of the model.
The correlation coefficient is 0.67, which indicates a moderately strong positive association.
2. Answer the following for Fage (response variable) regressed on Mage (explanatory variable).
A. Make a scatterplot of the data with the regression line. Report the parameter estimates (estimates of the
slope and intercept). Copy and paste your information from StatCrunch here:
Simple linear regression results:
Dependent Variable: fage
Independent Variable: mage
fage = 6.6459857 + 0.85532832 mage
Sample size: 829
R (correlation coefficient) = 0.78050678
R-sq = 0.60919084
Estimate of error standard deviation: 4.2309059
Parameter estimates:
Parameter   Estimate     Std. Err.     Alternative   DF    T-Stat      P-value
Intercept   6.6459857    0.67379202    ≠ 0           827   9.8635566   <0.0001
Slope       0.85532832   0.023822415   ≠ 0           827   35.90435    <0.0001
B. Interpret the slope and the intercept in context of the “real world” data.
Slope = (change in father’s age)/(change in mother’s age) = 0.86/1
For every one-year increase in the age of the mother, the age of the father is expected to (or tends to)
increase by 0.86 years on average.
Y-intercept: When the mother is 0 years old, the predicted age of the father is 6.6 years. This makes no
sense in the context of the problem.
C. What percentage of the variation in Fage is explained by Mage? Is this high or low?
61%
D. What is the predicted value for Fage when Mage is 35? What if Mage is 17?
The predicted age of the father when the mother is 35 is 36.6 years.
The predicted age of the father when the mother is 17 is 21.2 years.
E. Comment on the fit of the model.
The correlation coefficient is 0.78, which indicates a strong positive association.
Answer the following using the information calculated
3. A friend of yours is pregnant with her first child. She is nervous that she might have her baby at 34 weeks.
What can you tell her about the predicted weight of the baby?
I would tell her that the predicted weight of her baby is 5.6 pounds.
4. Your cousin is 40 years old. Predict the age of the father of the child. Using the scatterplot, give a range for
the suspected ages of the father.
The predicted age of the father is 40.9 years old.
The range of the father’s age is 33 – 43 years.
5. Of the two regression models, which has the best fit and why?
The model with the higher correlation coefficient has the better fit. This is the second model,
fage regressed on mage (R = 0.78 versus R = 0.67).
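The comparison can also be made on R-sq, the proportion of variation explained. A minimal Python sketch, assuming the two correlation coefficients reported in the outputs (the dictionary keys are labels made up for this example):

```python
# Correlation coefficients copied from the two regression outputs.
models = {
    "tpounds ~ weeks": 0.67005908,
    "fage ~ mage": 0.78050678,
}

# R-sq (proportion of variation explained) is the square of r.
r_squared = {name: r ** 2 for name, r in models.items()}

# The model with the larger R-sq explains more of the variation in its response.
best_model = max(r_squared, key=r_squared.get)

for name, value in r_squared.items():
    print(f"{name}: R-sq = {value:.3f}")
print("Better fit:", best_model)
```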
BONUS. If time permits try one more set:
Answer the following for the Tpounds (response variable) regressed on Mage (Explanatory Variable).
A. Make a scatterplot of the data with the regression line. Report the parameter estimates (estimates of the
slope and intercept)
Simple linear regression results:
Dependent Variable: tpounds
Independent Variable: mage
tpounds = 6.7395617 + 0.013345865 mage
Sample size: 1000
R (correlation coefficient) = 0.054962431
R-sq = 0.0030208689
Estimate of error standard deviation: 1.5072433
Parameter estimates:
Parameter   Estimate      Std. Err.      Alternative   DF    T-Stat      P-value
Intercept   6.7395617     0.21262657     ≠ 0           998   31.696705   <0.0001
Slope       0.013345865   0.0076746494   ≠ 0           998   1.7389543   0.0824
B. Interpret the slope and the intercept.
Slope = (change in birth weight)/(change in mother’s age) = 0.0133
For every one-year increase in the age of the mother, birth weight is expected to (or tends to)
increase by 0.0133 pounds on average.
Y-intercept: When the mother is 0 years old, the predicted birth weight is about 6.7 pounds. This makes no
sense in the context of the problem.
C. What percentage of the variation in Tpounds is explained by Mage? Is this high or low?
0.3%. This is very low.
D. What is the predicted value for Tpounds when Mage is 35? What if Mage is 17?
When the mother’s age is 35, the birth weight is predicted to be 7.2 pounds.
When the mother is 17 years old, the birth weight is predicted to be 7 pounds.
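The parameter estimates in this output also show how weak the relationship is. A minimal Python sketch, assuming the usual relationship T-Stat = Estimate / Std. Err. and using 0.05 as an illustrative significance level (the level is an assumption, not part of the activity):

```python
# Parameter estimates and standard errors copied from the output for
# tpounds regressed on mage: name -> (estimate, standard error).
estimates = {
    "Intercept": (6.7395617, 0.21262657),
    "Slope": (0.013345865, 0.0076746494),
}

# T-Stat is the estimate divided by its standard error.
t_stats = {name: est / se for name, (est, se) in estimates.items()}

# P-value for the slope as reported in the output (not recomputed here).
slope_p_value = 0.0824
ALPHA = 0.05  # illustrative significance level (an assumption)
slope_is_significant = slope_p_value < ALPHA

for name, t in t_stats.items():
    print(f"{name}: T-Stat = {t:.4f}")
print("Slope significant at 0.05:", slope_is_significant)
```

With a P-value of 0.0824, the slope would not be judged different from 0 at the 0.05 level, which is consistent with the very low R-sq.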