Table Number:__________ Group Name: __KEY________________ Group Members:_____________________ _____________________ _____________________ Activity: The Regression Line The data set, which is stored in the file North Carolina birth data, is a random sample of 1000 birth records taken by the North Carolina State Center for Health and Environmental Statistics in 2005. Of particular interest will be incidents of low infant birth weight. Low birth weight is defined as less than 2500 grams. Then goal of this activity will be to summarize the variables in this data set both graphically and numerically. The variables in the study are: Variable Label Description of variable Plurality Refers to the number of children associated with the birth Sex Sex of child (Gender 1=Male 2=Female) Fage Age of father (years) Mage Age of mother (years) Weeks Completed weeks of gestation Visits Number of pre-natal medical visits Marital Marital status (1=married 2=unmarried) Racemom Hispmom Race of Mother (0=Other Non-white 1=White 2=Black 3=American Indian 4=C hinese 5=Japanese 6=Hawaiian 7=Filipino 8=Other Asian or Pacific Islander) Whether mother is of Hispanic origin (C=Cuban M=Mexican N=Non-Hispanic O=Other and Unknown Hispanic P=Puerto Rican S=Central/South American U=Not Classifiable) Gained Weight gain during pregnancy (pounds) Lowbw If birth weight is 2500 grams or lower, 0=infant was not low birth weight, 1=infant was low birth weight Tpounds Birth weight in pounds Smoke 0=no 1=yes for mother admitted to smoking Mature 0=no for mother is 34 or younger 1-yes for mother is 35 or older Premie 0=no 1=yes to being born 36 weeks or sooner. 1. Answer the following for the variables Tpounds (response variable) regressed on Weeks (Explanatory Variable). A. Make a scatterplot of the data with the regression line. Report the parameter estimates (estimates of the slope and intercept). Copy and paste your information here: Simple linear regression results: Dependent Variable: tpounds Independent Variable: weeks tpounds = -6.0945723 + 0.34428023 weeks Sample size: 998 R (correlation coefficient) = 0.67005908 R-sq = 0.44897917 Estimate of error standard deviation: 1.1186614 Parameter estimates: Parameter Estimate Std. Err. Intercept -6.0945723 0.46463292 Slope 0.34428023 0.012085186 B. Alternative DF T-Stat P-value ≠ 0 996 -13.116962 <0.0001 ≠ 0 996 28.48779 <0.0001 Interpret the slope and the intercept Slope = (Change in birthweight)/(change in weeks of Gestation) = 0.344 1 For every increase in length of gestation by one week, the birthweight is expected to (or tends to) increase by 0.344 pound on average. Y-intercept: After 0 weeks of gestation, the birthweight is -6.1 pounds. This makes no sense in the context of the problem. C. What percentage of the variation in Tpounds is explained by Weeks? Is this high or low? 44% D. What is the predicted value for Tpounds when Weeks is 35? What if Weeks is 40? About 6 pounds About 7.7 pounds E. Comment on the fit of the model. The correlation coefficient is 0.67 which indicates a moderately strong positive association. 2. Answer the following for the Fage (response variable) regressed on Mage (Explanatory Variable). A. Make a scatterplot of the data with the regression line. Report the parameter estimates (estimates of the slope and intercept). Copy and paste your information from StatCrunch here: Simple linear regression results: Dependent Variable: fage Independent Variable: mage fage = 6.6459857 + 0.85532832 mage Sample size: 829 R (correlation coefficient) = 0.78050678 R-sq = 0.60919084 Estimate of error standard deviation: 4.2309059 Parameter estimates: Parameter Estimate Std. Err. Intercept 6.6459857 0.67379202 Slope B. 0.85532832 0.023822415 Alternative DF T-Stat P-value ≠ 0 827 9.8635566 <0.0001 ≠ 0 827 35.90435 <0.0001 Interpret the slope and the intercept in context of the “real world” data. Slope = (Change in father’s age/(change in mother’s age) = 0.86 1 For every increase in age of the mother by one year, the age of the father is expected to (or tends to) increase by 0.86 years on average. Y-intercept: When the mother is 0 years old, the father is 6.6 years. This makes no sense in the context of the problem. C. What percentage of the variation in Fage is explained by Mage? Is this high or low? 61% D. What is the predicted value for Fage when Mage is 35? What if Mage is 17? The predicted age of the father when the mother is 35 is 36.6 years. The predicted age of the father when the mother is 17 is 21.2 E. Comment on the fit of the model. The correlation coefficient is 0.78 which indicates a strong positive association. Answer the following using the information calculated 3. A friend of yours is pregnant with her first child. She is nervous that she might have her baby at 34 weeks. What can you tell her about the predicted weight of the baby? I would tell her that the predicted weight of her baby is 5.6 pounds. 4. Your cousin is 40 years old. Predict the age of the father of the child. Using the scatterplot, give a range for the suspected ages of the father. The predicted age of the father is 40.9 years old. The range of the father’s age is 33 – 43 years. 5. Of the two regression models, which has the best fit and why? The one with the higher correlation coefficient has the better fit. This would be the second model, the one with fage regressed on mage. BONUS. If time permits try one more set: Answer the following for the Tpounds (response variable) regressed on Mage (Explanatory Variable). A. Make a scatterplot of the data with the regression line. Report the parameter estimates (estimates of the slope and intercept) Simple linear regression results: Dependent Variable: tpounds Independent Variable: mage tpounds = 6.7395617 + 0.013345865 mage Sample size: 1000 R (correlation coefficient) = 0.054962431 R-sq = 0.0030208689 Estimate of error standard deviation: 1.5072433 Parameter estimates: Parameter Estimate Std. Err. Intercept 6.7395617 0.21262657 Slope B. 0.013345865 0.0076746494 Alternative DF T-Stat P-value ≠ 0 998 31.696705 <0.0001 ≠ 0 998 1.7389543 0.0824 Interpret the slope and the intercept Slope = (Change in birth weight)/(change in mother’s age) = 0.0133 For every increase in age of the mother by one year, birth weight is expected to (or tends to) by 0.0133 pound on average. increase Y-intercept: When the mother is 0 years old, the birth weight is 7 pounds. This makes no sense in the context of the problem. C. What percentage of the variation in Tpounds is explained by Mage? Is this high or low? 0.3% This is very low. D. What is the predicted value for Tpounds when Mage is 35? What if Mage is 17? When the mother’s age is 35, the birth weight is predicted to be 7.2 pounds. When the mother is 17 years old, the birth weight is predicted to be 7 pounds.