Final exam solutions - Wharton Statistics Department

Final Exam Solutions 1. (8) Write the best answer below the question for the following multiple choice questions. No explanation necessary. (1) Ten tidal pools of the same size were found in a certain coastal region. It was randomly determined which 5 would receive a treatment (removal of limpets, a type of seaweed grazer) and which 5 would serve as controls. The amount of seaweed covering the floor of the tide pools was measured at the end of the study period. It was desired to see whether the amount of seaweed was affected by limpet removal. Which of the following statements is correct? (a) The data should be analyzed with a two-independent sample t-test. A causal inference can be drawn from the study. (b) The data should be analyzed with a two-independent sample t-test. A causal inference cannot be drawn from the study. (c) The data should be analyzed with a paired t-test. A causal inference can be drawn from the study. (d) The data should be analyzed with a paired t-test. A causal inference cannot be drawn from the study. A (2) Which of the following is not an assumption in the ideal model for comparing several populations used for the one-way ANOVA F test? (a) (b) (c) (d) (e) The sample sizes must be equal The populations must all be normally distributed The population variances must be equal The samples for each treatment must be selected randomly and independently All of the above are assumed. A (3) Suppose people are randomly assigned into three groups and given either one, two or three mg of a drug. The amount of pain they feel after having their wisdom teeth removed is the Y variable. The amount of drugs is the X variable. A simple regression of Y vs. X is done. (a) If the t-test for X is significant (p-value < .05), then we have evidence that X causes Y to change. (b) No matter what the t-ratio for X is, we cannot determine causation since a regression is being done. To provide evidence for causation, a one-way analysis of variance F test should be done instead. (c) There may be a confounding variable that causes both X and Y to increase. So statistical significance in this problem doesn’t provide evidence for causation. (d) If the t-test for X is insignificant, then we have evidence that X doesn’t cause Y. (e) More than one of the above is true. A (4) In a statistical report, the statement is made that the 95% confidence interval for the percentage of babies who are boys is between 51% and 55% (i.e., 53%  2%). This means, that if, in the future, a 95% confidence interval is computed in the same way for each of a large number of random samples of the same size (a) (b) (c) (d) 95% of such intervals will cover (contain) the midpoint 53% 95% of such intervals will cover (contain) the population percentage of boys. 95% of such intervals will overlap (intersect) the interval 51% to 55% 95% of such intervals will completely cover (contain) the interval 51% to 55% B (5) A study of human development showed two types of movies to groups of children. Crackers were available in a bowl, and the investigators compared the number of crackers eaten by children watching the different kinds of movies. One kind of movie was shown at 8 A.M. (right after the children had breakfast) and another at 11 A.M. (right before the children had lunch). It was found that more crackers were eaten during the movie shown at 11 A.M. than during the movie shown at 8 A.M. The investigators concluded that the different types of movies had an effect on appetite. The results cannot be trusted because (a) the study was not double blind. Neither the investigators nor the children should have been aware of which movie was being shown. (b) the investigators were biased. They knew beforehand what they hoped to show. (c) the investigators should have used several bowls, with crackers randomly placed in each. (d) the time the movie was shown is a confounding variable. D (6) A group of college students believes that herbal tea has remarkable restorative powers. To test its theory, the group makes weekly visits to a local nursing home, visiting with residents, talking with them and serving them herbal tea. After several months, many of the residents are more cheerful and healthy. Which of the following may be correctly concluded from this study? (a) herbal tea does improve one’s emotional state, at least for the residents of nursing homes. (b) there is some evidence that herbal tea may improve one’s emotional state. The results would be completely convincing if a scientist had conducted the study rather than a group of college students. (c) the results of the study are not convincing since only a local nursing home was used and only for a few months. (d) the results of the study are not convincing since the effect of herbal tea is confounded with several other factors. D (7) Does taking gingko tables twice a day provide significant improvement in mental performance? To investigate this issue, a researcher conducted a study with 150 adult subjects who took gingko tablets twice a day for a period of six months. At the end of the study, 200 variables related to the mental performance of the subjects were measured on each subject and the means compared to known means for these variables in the population of all adults. Nine of these variables were significantly better (in the sense of statistical significance) at the 5% level for the group taking the gingko tablets as compared to the population as a whole, and one variable was significantly better at the 1% level for the group taking the gingko tablets as compared to the population as a whole. It would be correct to conclude (a) there is good statistical evidence that taking gingko tablets twice a day provides some improvement in mental performance. (b) there is good statistical evidence that taking gingko tablets twice a day provides improvement for the variable that was significant at the 1% level. We should be somewhat cautious about making claims for the variables that were significant at the 5% level. (c) these results would have provided good statistical evidence that taking gingko tablets twice a day provides some improvement in mental performance if the number of subjects had been larger. It is premature to draw statistical conclusions from studies in which the number of subjects is less than the number of variables measured. (d) none of the above. D (8) Are proficiency test scores affected by the education of the child’s parents? To answer this question, a random sample of 9-year old children was drawn. Each child’s test score and the education level of the parent with the higher level were recorded. The education categories are less than high school, high school graduate, some college, and college graduate. The null hypothesis for the one-way analysis of variance F test is that the population mean test scores are the same for all four education categories. The alternative hypothesis is (a) that the population mean test score is larger for children of college graduates than for the other three educational categories (b) that the population mean test score is smaller for children whose parents both did not graduate from high school than for the other three educational categories (c) that the population mean test score for children of college graduates is larger than the population mean test score for children whose parents both did not graduate from high school (d) none of the above. D 2. (5) The following is a list of some of the statistical methods you have learned in this course: A. Two independent samples t-test B. Matched pairs t-test C. Methods for comparing several means (One-way analysis of variance F test and Tukey-Kramer adjusted procedure for comparing two of several means) E. Chi-squared test F. Simple linear regression G. Multiple regression For each of the situations described below, state the technique (from the list above) that you believe is most applicable. (a) A researcher for OSHA (Occupational Safety and Health Adminstration) wants to see whether cutbacks in enforcement of safety regulations coincided with an increase in work related accidents. For 20 industrial plants, she has the number of accidents in 1980 and 1995. B (b) A researcher wants to investigate how fertilizer affects soybean yield. She divides a farm into 30 one-acre plots. Each plot receives a different amount of fertilizer. Soybeans were then planted and the amount of soybeans harvested at the end of the season from each plot were recorded. F (c) Can music make you smarter? And if so, which kind of music works best? Two University of California at Irvine professors addressed these questions (as reported on “Dateline” in September 1994). A random sample of 135 students was given tests that measured the ability to reason. One third of the students were then put in a room where rock-and-roll music was played. A second group of 45 students was placed in a room and listened to music composed by Mozart. The last group was placed in a room where no music was played. The students then took a test. The differences (second test score minus first test score) were recorded. C (d) A bank would like to develop a model to predict the total sum of money customers withdraw from automatic teller machines (ATMs) on a weekend so that they can be sure to stock an adequate amount of money in each of the machines. They have data on the amount of money withdrawn last weekend for a random sample of 35 ATM machines throughout the city. They believe several factors can be useful in predicting the amount of money withdrawn including the average assessed value of houses in the vicinity of the ATM machine, how far away the nearest branch office of the bank is from the ATM machine, and whether or not the ATM machine is located in a shopping center. G (e) There is a theory that the anticipation of a birthday can prolong a person’s life. In a study set up to examine this notion statistically, it was found that only 60 of 747 people whose obituaries were published in Salt Lake City in 1975 died in the three-month period preceding their birthday. E 3. (6) A randomized experiment is done to measure the effect of a drug on developing mouse’s weights. 10 30-day old mice were randomly divided into groups of five. The drug group received the drug for ten days; the placebo group received a placebo for ten days. The weight gains at the end of the ten days were recorded. JMP output is shown below for two analyses, one is an analysis of how the weight gains for the two groups compare and the other is an analysis of how the log weight gains for the two groups compare. (a) (3) Assume the additive treatment effect model holds. Find an (approximate) 95% confidence interval for the amount by which taking the drug increases a mouse’s weight gain compared to what the mouse’s weight gain would have been taking the placebo. From the JMP output for Analysis I, a 95% confidence interval is (-1.274,30.311). (b) (3) Assume the multiplicative treatment effect model holds. Find an (approximate) 95% confidence interval for the amount by which taking the drug multiplies a mouse’s weight gain compared to what the mouse’s weight gain would have been taking the placebo. From the JMP output for Analysis II, a 95% confidence interval for the amount by which taking the drug increases a mouse’s log weight gain compared to what the gain would have been from taking the placebo is (0.8019,3.8481). Thus, a 95% confidence interval for the amount by which taking the drug multiplies a mouse’s weight gain compared to what it would have been from taking the placebo is (e 0.8019 , e 3.8481)  (2.23,46.90) (c) (3) Is there strong evidence that taking the drug as opposed to the placebo causes a change in mice’s mean weight gains? Justify your answer using an appropriate test. The multiplicative model appears more appropriate than the additive model. The box plots for Analysis I show that the drug group has much greater spread than the placebo group; the spreads of the two groups should be about equal if the additive model holds. On the other hand, the box plots for Analysis II show that the drug group and placebo group have about equal spreads on the log scale; this is what we would expect to be the case if the multiplicative model holds. Under the multiplicative model, the t-test of the null hypothesis that the mean of log weight gain for the drug group equals the mean of log weight gain for the placebo group versus the two sided alternative that the means are not the same provides a test of whether taking the drug as opposed to the placebo causes a change in mice’s mean weight gains. From Analysis II, the p-value for this test is .0078. Thus, there is strong evidence that taking the drug as opposed to the placebo causes a change in mice’s mean weight gains. Analysis I: Y= Weight Gains Oneway Analysis of Response By Group Response 40 30 20 10 0 Drug Placebo Group Means and Std Deviations Level Drug Placebo Number Mean Std Dev Std Err Mean Lower 95% Upper 95% 5 16.1461 15.2061 6.8004 -2.735 35.027 5 1.6276 1.8113 0.8100 -0.621 3.877 t-Test Difference t-Test DF Prob > |t| Estimate 14.519 2.120 8 0.0668 Std Error 6.848 Lower 95% -1.274 Upper 95% 30.311 Assuming equal variances Analysis II: Y=Log (Weight Gains) Oneway Analysis of Log Response By Group Log Response 4 3 2 1 0 -1 Drug Placebo Group Means and Std Deviations Level Drug Placebo Number Mean Std Dev Std Err Mean Lower 95% Upper 95% 5 2.35527 1.05441 0.47155 1.046 3.6645 5 0.03029 1.03417 0.46250 -1.254 1.3144 t-Test Difference t-Test DF Prob > |t| Estimate 2.32499 3.520 8 0.0078 Std Error 0.66050 Lower 95% 0.80188 Upper 95% 3.84810 Assuming equal variances 4. (10) Lotteries have become important sources of revenue for governments. Many people have criticized lotteries, however, referring to them as a tax on the poor and uneducated. In an examination of the issue, a random sample of 100 adults was asked how much they spend on lottery tickets and was interviewed about various socioeconomic variables. The following data was recorded: amount spent on lottery tickets as a percentage of total household income (Lottery), number of years of education (Education), age (Age), number of children (Children) and personal income in thousands of dollars (Income). The output from multiple regression of Lottery on Education, Age, Children and Income is shown below. Assume the ideal multiple linear regression holds for this problem. Response Lottery Summary of Fit Rsquare Rsquare Adj Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.433474 0.40962 ???? 5.39 100 Analysis of Variance Source DF Sum of Squares Mean Square F Ratio Model 4 615.4421 153.861 18.1722 Prob > F Error 95 ???? ???? C. Total 99 1419.7900 <.0001 Parameter Estimates Term Intercept Education Age Children Income Estimate Std Error t Ratio Prob>|t| 11.906094 1.785197 6.67 <.0001 -0.430018 0.132072 -3.26 0.0016 0.0291899 0.025228 1.16 0.2501 0.0934351 0.224313 0.42 0.6780 -0.074471 0.027726 -2.69 0.0085 Lottery Residual Residual by Predicted Plot 6 2 -2 -6 -10 0 5 10 15 Lottery Predicted (a) (2) Based on the multiple linear regression below, what would you predict the lottery spending of a 30 year old woman with 12 years of education, two children and an income of $30,000 to be? From the multiple regression, we would predict the percentage of total household income that the woman would spend on the lottery to be yˆ  11.91  0.430 *12  0.029 * 30  0.093 * 2  0.074 * 30  5.582 Thus, we would predict the woman to spend 5.582% of her income on the lottery or 0.05582 * 30000  $1674.60 on the lottery. Note than income is in thousands so the woman’s income is 30 in the prediction equation. (b) (2) The root mean square error ( ̂ ) is left blank. What is a reasonable estimate for the root mean square error? (i) 2.91 (ii) 5.91 (iii) 8.91 (iv) 10.91 A reasonable estimate is (i). We know that approximately 68% of the residuals should have magnitude less than or equal to 2.91 and approximately 95% of the residuals should have magnitude less than or equal to 2*2.91. Looking at the residual by predicted plot, only (i) is a reasonable estimate of the root mean square error. (c) (2) Is there strong evidence that the multiple regression model provides better predictions of Lottery than just using the sample mean of Lottery to predict Lottery? Justify your answer using a test. To test if the multiple regression model provides better predictions of Lottery than just using the sample mean of Lottery to predict Lottery, we use the overall F test which tests whether the coefficients on all of the explanatory variables (Education, Age, Children and Income) are equal to zero. From the JMP output in the Analysis of Variance table, the pvalue is <.0001. Thus, there is strong evidence that the multiple regression model provides better predictions than the sample mean does. (d) (2) What is an approximate 95% confidence interval for the coefficient on Age? An approximate 95% confidence interval for the coefficient is the point estimate plus or minus two standard errors: 0.0292  2 * .025  (0.0208,0.0792) . (e) (2) A goal of the study was to test the following theories: (i) Relatively uneducated people spend different mean amounts on lottery tickets than relatively educated people, all other things being equal. (ii) Older people spend different mean amounts on lottery tickets than younger people, all other things being equal. Assuming that there are no confounding variables, translate these theories into appropriate null and alternative hypotheses about the multiple linear regression model’s parameters. Test each of these theories at the 0.05 significance level and state your conclusions. Let the multiple regression model be written as {Lottery | Education  x1 , Age  x2 , Children  x3 , Income  x4 }   0  1 x1   2 x2   3 x3   4 x4 Theory (i) corresponds to 1  0 . The t-test of H 0 : 1  0 vs. H a : 1  0 gives a pvalue of 0.0016. Since ˆ1  0.430 , there is strong evidence that relatively uneducated people spend a higher mean amount on lottery tickets than relatively educated people, all other things being equal. Theory (ii) corresponds to  2  0 . The t-test of H 0 :  2  0 vs. H a :  2  0 gives a pvalue of 0.2501. Thus, there is no strong evidence that older people spend different mean amounts on lottery tickets than younger people, all other things being equal. 5. (6) For each of the following situations, a simple linear regression has been carried out. State whether (i) the the simple linear regression is well suited to answer the question of interest or (ii) the simple linear regression is not well suited to answer the question of interest and needs to be modified. If you answer (ii), state what the most salient problem with simple linear regression is and discuss briefly how you would try to fix it. (a) (3) We want to predict a car’s fuel consumption from its speed. The scatterplot below shows data on the British Ford Escort. The residual plot indicates that the simple linear regression model is not appropriate. The residual plot shows a clear pattern in the mean. The scatterplot of fuel used vs. speed indicates that the relationship between fuel used and speed is quadratic so that quadratic regression (a regression of fuel used on speed and speed squared) should be tried. Bivariate Fit of Fuel Used By Speed Fuel Used 13 14 13 11 9 7 5 0 50 100 Speed 150 Linear Fit Linear Fit Fuel Used = 7.5102418 + 0.0186022 Speed Summary of Fit RSquare RSquare Adj Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.109542 0.035337 2.309302 9.091429 14 Analysis of Variance Source DF Sum of Squares Mean Square F Ratio Source Model Error C. Total DF Sum of Squares Mean Square F Ratio 1 12 13 7.872450 63.994521 71.866971 7.87245 5.33288 1.4762 Prob > F 0.2477 Parameter Estimates Term Residual Intercept Speed Estimate Std Error t Ratio Prob>|t| 7.5102418 0.0186022 1.440329 0.015311 5.21 1.21 0.0002 0.2477 6 3 1314 0 -3 0 50 100 Speed 150 Observations with Largest Cook’s Distances Observation Number Cook’s Distance 13 0.280 14 0.083 Leverage 0.257 0.204 (b) (3) One of the most dangerous contaminants deposited over European countries following the Chernobyl accident of April 1987 was radioactive cesium. To study cesium transfer from contaminated soil to plants, researchers collected soil samples and samples of mushroom mycelia from 17 wooded locations in Umbria, Central Italy, from August 1986 to November 1989. The researchers measured concentrations (Bq/kg) of cesium in the soil and in the mushrooms. The researchers’ goal is to predict Y=concentration in mushrooms based on X=concentration in soil. The output from a simple linear regression is shown below. The simple linear regression model with these data is not appropriate for drawing inferences. Observation 17 has an enormous Cook’s distance of 10.08. This is much greater than the cutoff of 1 for classifying a point as being influential. Observation 17’s leverage is 0.755>2*(2/17). Thus, observation 17 is influential and has high leverage. We cannot draw reliable inferences over the whole range of explanatory variables (concentration in soil) in the data. We should omit observation 17 from the regression and consider the model to only be reliable for concentration in soil in the range of 0-500. Bivariate Fit of MUSHROOM By SOIL 200 17 MUSHROOM 150 100 50 16 0 0 250 500 750 1000 1250 1500 SOIL Linear Fit Linear Fit MUSHROOM = 16.725686 + 0.0959027 SOIL Summary of Fit Rsquare RSquare Adj Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.406386 0.366812 36.56475 44.58824 17 Analysis of Variance Source Model Error C. Total DF 1 15 16 Sum of Squares 13729.399 20054.718 33784.118 Mean Square 13729.4 1337.0 F Ratio 10.2690 Prob > F 0.0059 Parameter Estimates Term Intercept SOIL Estimate 16.725686 0.0959027 Std Error 12.41954 0.029927 t Ratio 1.35 3.20 Prob>|t| 0.1981 0.0059 Residual 75 50 17 25 0 -25 16 -50 0 250 500 750 1000 1250 1500 SOIL Observations with Largest Cook’s Distances Observation Number Cook’s Distance 16 0.081 17 10.08 Leverage 0.082 0.755 6. (7) A study was conducted to test the effects of a nonsteroidal anti-inflammatory drug on pain. The experiment examines the effects of giving the treatment both before and after surgery or after surgery only. 30 patients between the ages of 18 and 55 undergoing elective knee arthroscopy were enrolled in the study two weeks before their surgery and randomly divided into three groups. Group A received the nonsteroidal antiinflammatory drug (NSAID) both 3 days prior to surgery and 5 days after surgery. Group B received a placebo before surgery and the NSAID after surgery. Group C received the placebo both before and after surgery. Post-operatively all patients were given prescriptions for codeine which could be taken every 4 to 6 hours as needed. Pain scores were recorded at the time of enrollment in the study (two weeks before surgery), one day before surgery and one week after surgery (higher pain scores indicate that the patient is in more pain). Shown below are JMP analyses for three outcomes – (I) pain at time of enrollment in the study (two weeks before surgery); (II) pain one day before surgery; and (III) pain one week after surgery. (a) (2) Is there strong evidence that that not all of the treatments are equally effective one week after surgery, i.e., that some of the treatments have higher mean pain scores than other treatments one week after surgery? State this question in terms of a hypothesis test and answer the question of interest, using a test at the 0.05 level. Let  A,1 weekafter ,  B ,1 weekafter,  C ,1 weekafter be the mean pain scores for treatments A, B and C 1-week after surgery respectively. To test H 0 :  A,1 weekafter   B ,1 weekafter   C ,1 weekafter vs. H a : not all treatments have same mean one week after surgery, we use the one-way Analysis of Variance F-test for Analysis III. The p-value for this test is <.0001. Thus, there is strong evidence that not all treatments are equally effective one week after surgery. (b) (3) Choose a linear combination to test the hypothesis that NSAID is beneficial as a preoperative treatment (i.e., that it reduces pain before surgery). Find an approximate 95% confidence interval for the linear combination and test the hypothesis that the linear combination equals 0 at the 0.05 level. To investigate whether NSAID is beneficial as a preoperative treatment, we want to look at pain scores one day before surgery (We could also look at the differences between pain scores one day before surgery and two weeks before surgery. Although this would be a more powerful approach, it would require that we take into account the dependence between a patient’s score one day before surgery and two weeks before surgery as in a matched pairs t-test.) Because only group A had the treatment before surgery, we would like to compare group A to the combination of groups B and C. The appropriate linear  B  C where  A ,  B ,  C denote the means of 2 groups A, B and C one-day before surgery respectively. The point estimate of  27.211  27.350  6.983 . The standard error of g is is g  20.928  2 combination for doing that is    A  C12 C 22 C32 1 (1 / 2) 2 (1 / 2) 2   1.877    0.727 . Thus an approximate 95% n1 n2 n3 10 10 10 confidence interval for  is g  2 * SE ( g )  6.983  2 * 0.727  (5.529,8.437) . Because 0 is not in the confidence interval, we reject H 0 :   0 at the 0.05 level. Since ˆ g is negative and H 0 :   0 is rejected, there is strong evidence that NSAID is beneficial as a preoperative treatment. (c) (2) There is something in these analyses that calls into question whether patients were randomly assigned into Groups A, B and C. Explain what feature of these analyses calls the random assignment into question. If patients were in fact randomly assigned to Groups A, B and C, then we would not expect the pain scores of Groups A, B and C to be much different two weeks before surgery, since none of the groups have received any treatment at that point. However, we see from Analysis I that the p-value for testing the null hypothesis that the means of Groups A, B and C two weeks before surgery are all equal is <.0001. Furthermore, every patient in Group A had a higher pain score than any of the patients in Group B and C. The differences in pain scores between the groups before any treatment has been applied calls into question whether the patients were in fact randomly assigned. Analysis I – Outcome: Pain two weeks before surgery Oneway Analysis of Pain Two Weeks Before Surgery By Group Pain Two Weeks Before Surgery 30 27.5 25 22.5 20 17.5 15 A B C Group Means and Std Deviations Level A B C Number 10 10 10 Mean 19.9738 19.9886 25.6040 Std Dev 1.50822 1.82481 1.53550 Std Err Mean 0.47694 0.57706 0.48557 Lower 95% 18.995 18.805 24.608 Oneway Anova Summary of Fit Rsquare Adj Rsquare Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.746273 0.727478 1.629153 21.85545 30 Analysis of Variance Source Group Error C. Total DF 2 27 29 Sum of Squares 210.77451 71.66178 282.43629 Mean Square 105.387 2.654 F Ratio 39.7067 Prob > F <.0001 Upper 95% 20.952 21.173 26.600 Analysis II – Outcome: Pain One Day Before Surgery Oneway Analysis of Pain One Day Before Surgery By Group Pain One Day Before Surgery 32.5 30 27.5 25 22.5 20 17.5 A B C Group Means and Std Deviations Level A B C Number 10 10 10 Mean 20.9282 27.2110 27.3495 Std Dev 2.27684 1.84643 1.40720 Std Err Mean 0.72000 0.58389 0.44500 Lower 95% 19.451 26.013 26.436 Oneway Anova Summary of Fit Rsquare Adj Rsquare Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.738744 0.719392 1.877367 25.16289 30 Analysis of Variance Source Group Error C. Total DF 2 27 29 Sum of Squares 269.08526 95.16172 364.24698 Mean Square 134.543 3.525 F Ratio 38.1734 Prob > F <.0001 Upper 95% 22.406 28.409 28.263 Analysis III – Pain One Week After Surgery Oneway Analysis of Pain One Week After Surgery By Group Pain One Week After Surgery 22 20 18 16 14 12 A B C Group Means and Std Deviations Level A B C Number 10 10 10 Mean 15.6458 15.6439 19.6755 Std Dev 1.63775 1.65290 1.30114 Std Err Mean 0.51790 0.52269 0.41146 Lower 95% 14.583 14.571 18.831 Oneway Anova Summary of Fit Rsquare Adj Rsquare Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.628702 0.601198 1.539187 16.98841 30 Analysis of Variance Source Group Error C. Total DF 2 27 29 Sum of Squares 108.30977 63.96558 172.27535 Mean Square 54.1549 2.3691 F Ratio 22.8589 Prob > F <.0001 Upper 95% 16.708 16.716 20.520 7. (8) A study in Alachua County, Florida, investigated the relationship between Y=mental health impairment, X 1 =life events score and X 2  SES. Here is a description of the variables: Y  index of mental health impairment, which incorporates various dimensions of psychiatric symptoms, including aspects of anxiety and depression. Higher values indicate greater mental impairment. Values ranged from 17 to 41 in the sample. X 1 = life events score. Measure of number and severity of major life events the subject experienced within the past three years. These events range from severe personal disruptions such as a death in the family, a jail sentence, or an extramarital affair, to less severe events such as getting a new job, the birth of a child, moving within the same city, or having a child marry. A high X 1 score represents a greater number and/or greater severity of these life events. Scores range from 3 to 97 in the sample. X 2  socioeconomic status (SES). Composite index based on occupation, income and education. Measured on a standard scale, it ranges from 0 to 100; the higher the score, the higher the status. JMP output from a multiple regression of mental impairment on life events score and SES is shown below. Response Mental Impairment Summary of Fit RSquare RSquare Adj Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.339159 0.303438 4.556438 27.3 40 Analysis of Variance Source Model Error C. Total DF Sum of Squares Mean Square F Ratio 2 37 39 394.2384 768.1616 1162.4000 197.119 20.761 9.4946 Parameter Estimates Term Estimate Std Error t Ratio Prob>|t| Intercept 28.229813 2.174222 12.98 <.0001 Life Events 0.1032595 0.032499 3.18 0.0030 SES -0.097476 0.029085 -3.35 0.0019 Prob > F 0.0005 Mental Impairment Residual Residual by Predicted Plot 10 5 0 -5 -10 15 20 25 30 35 40 Mental Impairment Predicted (a) (2) Assuming there are no omitted confounding variables, is there strong evidence that an increase in life events increases mental impairment? Justify your answer using a test at the .05 level. The multiple regression model is {Mental _ Im pairment | Life _ events  x1 , SES  x2 }   0  1 x1   2 x2 . Assuming there are no omitted variables, increases in life events cause increases in mean mental impairment if 1  0 . The point estimate of  1 is ˆ1  0.1032 and the p-value for the ttest of H 0 : 1  0 is 0.0030. Thus, there is strong evidence that increases in life events cause increases in mean mental impairments, assuming there are no omitted confounding variables. (b) There is concern that age is a confounding variable. It is known that age is positively associated with mental impairment holding fixed life event score and SES fixed. Shown below is a regression of age on life event score and SES from another study. The same relationship between age, life event score and SES is believed to hold for this study. Suppose we were able to obtain the ages of study participants and run the multiple regression ˆ{Y | X 1 , X 2 , Age}  ˆ0  ˆ1 X 1  ˆ2 X 2  ˆ3 X 3 . How would you expect ˆ to compare to 0.1033, the estimated coefficient on X for the 1 1 regression of mental impairment on life event score and SES. Would you expect ˆ1 to be larger, smaller or about the same size as 0.1033? Explain your reasoning. ˆ{Y | X 1 , X 2 }  ˆ0*  ˆ1* X 1  ˆ2* X 2 { Age | X 1 , X 2 }  ˆ0  ˆ1 X 1  ˆ2 X 2 By the omitted variables bias formula, ˆ1  ˆ1*  ˆ1 ˆ3 . The regression of Age on life events index and SES from another study shown in the problem indicates that ˆ1 is probably less than 0 ( ˆ1 =-0.4083 in the regression from the other study). Because it is Let known that age is positively associated with mental impairment holding fixed life event score and SES, ̂ 3 is probably greater than zero. Thus, ˆ1 ˆ3 is probably less than zero and from the omitted variables bias formula, we conclude that ˆ is probably larger than 1 ̂ (=0.1033). * 1 Response Age Summary of Fit RSquare RSquare Adj Root Mean Square Error Mean of Response Observations (or Sum Wgts) 0.596241 0.480881 7.723134 45.7 10 Parameter Estimates Term Estimate Std Error t Ratio Prob>|t| Intercept 58.325987 7.374532 7.91 <.0001 Life Events Index -0.408285 0.127675 -3.20 0.0151 SES 0.0257135 0.093981 0.27 0.7923 Residual by Predicted Plot Age Residual 15 10 5 0 -5 -10 20 25 30 35 40 45 50 55 60 Age Predicted (c) (3) Ignore the issue of the potential confounding variable in part (b). A psychologist hypothesizes that for low levels of life scores, subjects with high SES and low SES should have about the same mean mental impairment but for high levels of life scores, subjects with high SES are better able to withstand the mental stress of potentially traumatic life events than subject with low SES and have lower levels of mean mental impairment. Write down the formula of a multiple regression model that could be used to test the psychologist’s hypothesis and state what the psychologist’s hypothesis is in terms of the parameters of the model you write down. The psychologist is hypothesizing that there is an interaction between SES and life events index. A multiple regression model that could be used to test the psychologists’ hypothesis is {Mental _ Im pairment | Life _ events  x1 , SES  x2 }   0  1 x1   2 x2   3 x1 * x2 The psychologists’ hypothesis is that there is a negative interactions between life events and SES (for low life events, high SES and low SES have about the same mean of mental impairment but for high life events, high SES has lower mean of mental impairment than low SES). Thus, the psychologists’ hypothesis is that  3  0 .

Final exam solutions - Wharton Statistics Department

Related documents

Products

Support

Final exam solutions - Wharton Statistics Department

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib