Final exam solutions - Wharton Statistics Department

advertisement
Final Exam Solutions
1. (8) Write the best answer below the question for the following multiple choice
questions. No explanation necessary.
(1) Ten tidal pools of the same size were found in a certain coastal region. It was
randomly determined which 5 would receive a treatment (removal of limpets, a type of
seaweed grazer) and which 5 would serve as controls. The amount of seaweed covering
the floor of the tide pools was measured at the end of the study period. It was desired to
see whether the amount of seaweed was affected by limpet removal. Which of the
following statements is correct?
(a) The data should be analyzed with a two-independent sample t-test. A causal
inference can be drawn from the study.
(b) The data should be analyzed with a two-independent sample t-test. A causal
inference cannot be drawn from the study.
(c) The data should be analyzed with a paired t-test. A causal inference can be drawn
from the study.
(d) The data should be analyzed with a paired t-test. A causal inference cannot be
drawn from the study.
A
(2) Which of the following is not an assumption in the ideal model for comparing several
populations used for the one-way ANOVA F test?
(a)
(b)
(c)
(d)
(e)
The sample sizes must be equal
The populations must all be normally distributed
The population variances must be equal
The samples for each treatment must be selected randomly and independently
All of the above are assumed.
A
(3) Suppose people are randomly assigned into three groups and given either one, two or
three mg of a drug. The amount of pain they feel after having their wisdom teeth
removed is the Y variable. The amount of drugs is the X variable. A simple regression of
Y vs. X is done.
(a) If the t-test for X is significant (p-value < .05), then we have evidence that X
causes Y to change.
(b) No matter what the t-ratio for X is, we cannot determine causation since a
regression is being done. To provide evidence for causation, a one-way analysis
of variance F test should be done instead.
(c) There may be a confounding variable that causes both X and Y to increase. So
statistical significance in this problem doesn’t provide evidence for causation.
(d) If the t-test for X is insignificant, then we have evidence that X doesn’t cause Y.
(e) More than one of the above is true.
A
(4) In a statistical report, the statement is made that the 95% confidence interval for the
percentage of babies who are boys is between 51% and 55% (i.e., 53%  2%). This
means, that if, in the future, a 95% confidence interval is computed in the same way for
each of a large number of random samples of the same size
(a)
(b)
(c)
(d)
95% of such intervals will cover (contain) the midpoint 53%
95% of such intervals will cover (contain) the population percentage of boys.
95% of such intervals will overlap (intersect) the interval 51% to 55%
95% of such intervals will completely cover (contain) the interval 51% to 55%
B
(5) A study of human development showed two types of movies to groups of children.
Crackers were available in a bowl, and the investigators compared the number of crackers
eaten by children watching the different kinds of movies. One kind of movie was shown
at 8 A.M. (right after the children had breakfast) and another at 11 A.M. (right before the
children had lunch). It was found that more crackers were eaten during the movie shown
at 11 A.M. than during the movie shown at 8 A.M. The investigators concluded that the
different types of movies had an effect on appetite. The results cannot be trusted because
(a) the study was not double blind. Neither the investigators nor the children should
have been aware of which movie was being shown.
(b) the investigators were biased. They knew beforehand what they hoped to show.
(c) the investigators should have used several bowls, with crackers randomly placed
in each.
(d) the time the movie was shown is a confounding variable.
D
(6) A group of college students believes that herbal tea has remarkable restorative
powers. To test its theory, the group makes weekly visits to a local nursing home,
visiting with residents, talking with them and serving them herbal tea. After several
months, many of the residents are more cheerful and healthy. Which of the following
may be correctly concluded from this study?
(a) herbal tea does improve one’s emotional state, at least for the residents of nursing
homes.
(b) there is some evidence that herbal tea may improve one’s emotional state. The
results would be completely convincing if a scientist had conducted the study
rather than a group of college students.
(c) the results of the study are not convincing since only a local nursing home was
used and only for a few months.
(d) the results of the study are not convincing since the effect of herbal tea is
confounded with several other factors.
D
(7) Does taking gingko tables twice a day provide significant improvement in mental
performance? To investigate this issue, a researcher conducted a study with 150 adult
subjects who took gingko tablets twice a day for a period of six months. At the end of the
study, 200 variables related to the mental performance of the subjects were measured on
each subject and the means compared to known means for these variables in the
population of all adults. Nine of these variables were significantly better (in the sense of
statistical significance) at the 5% level for the group taking the gingko tablets as
compared to the population as a whole, and one variable was significantly better at the
1% level for the group taking the gingko tablets as compared to the population as a
whole. It would be correct to conclude
(a) there is good statistical evidence that taking gingko tablets twice a day provides
some improvement in mental performance.
(b) there is good statistical evidence that taking gingko tablets twice a day provides
improvement for the variable that was significant at the 1% level. We should be
somewhat cautious about making claims for the variables that were significant at
the 5% level.
(c) these results would have provided good statistical evidence that taking gingko
tablets twice a day provides some improvement in mental performance if the
number of subjects had been larger. It is premature to draw statistical conclusions
from studies in which the number of subjects is less than the number of variables
measured.
(d) none of the above.
D
(8) Are proficiency test scores affected by the education of the child’s parents? To
answer this question, a random sample of 9-year old children was drawn. Each child’s
test score and the education level of the parent with the higher level were recorded. The
education categories are less than high school, high school graduate, some college, and
college graduate. The null hypothesis for the one-way analysis of variance F test is that
the population mean test scores are the same for all four education categories. The
alternative hypothesis is
(a) that the population mean test score is larger for children of college graduates than
for the other three educational categories
(b) that the population mean test score is smaller for children whose parents both did
not graduate from high school than for the other three educational categories
(c) that the population mean test score for children of college graduates is larger than
the population mean test score for children whose parents both did not graduate
from high school
(d) none of the above.
D
2. (5) The following is a list of some of the statistical methods you have learned in this
course:
A. Two independent samples t-test
B. Matched pairs t-test
C. Methods for comparing several means (One-way analysis of variance F test and
Tukey-Kramer adjusted procedure for comparing two of several means)
E. Chi-squared test
F. Simple linear regression
G. Multiple regression
For each of the situations described below, state the technique (from the list above) that
you believe is most applicable.
(a) A researcher for OSHA (Occupational Safety and Health Adminstration) wants to see
whether cutbacks in enforcement of safety regulations coincided with an increase in work
related accidents. For 20 industrial plants, she has the number of accidents in 1980 and
1995.
B
(b) A researcher wants to investigate how fertilizer affects soybean yield. She divides a
farm into 30 one-acre plots. Each plot receives a different amount of fertilizer. Soybeans
were then planted and the amount of soybeans harvested at the end of the season from
each plot were recorded.
F
(c) Can music make you smarter? And if so, which kind of music works best? Two
University of California at Irvine professors addressed these questions (as reported on
“Dateline” in September 1994). A random sample of 135 students was given tests that
measured the ability to reason. One third of the students were then put in a room where
rock-and-roll music was played. A second group of 45 students was placed in a room
and listened to music composed by Mozart. The last group was placed in a room where
no music was played. The students then took a test. The differences (second test score
minus first test score) were recorded.
C
(d) A bank would like to develop a model to predict the total sum of money customers
withdraw from automatic teller machines (ATMs) on a weekend so that they can be sure
to stock an adequate amount of money in each of the machines. They have data on the
amount of money withdrawn last weekend for a random sample of 35 ATM machines
throughout the city. They believe several factors can be useful in predicting the amount
of money withdrawn including the average assessed value of houses in the vicinity of the
ATM machine, how far away the nearest branch office of the bank is from the ATM
machine, and whether or not the ATM machine is located in a shopping center.
G
(e) There is a theory that the anticipation of a birthday can prolong a person’s life. In a
study set up to examine this notion statistically, it was found that only 60 of 747 people
whose obituaries were published in Salt Lake City in 1975 died in the three-month period
preceding their birthday.
E
3. (6) A randomized experiment is done to measure the effect of a drug on developing
mouse’s weights. 10 30-day old mice were randomly divided into groups of five. The
drug group received the drug for ten days; the placebo group received a placebo for ten
days. The weight gains at the end of the ten days were recorded. JMP output is shown
below for two analyses, one is an analysis of how the weight gains for the two groups
compare and the other is an analysis of how the log weight gains for the two groups
compare.
(a) (3) Assume the additive treatment effect model holds. Find an (approximate) 95%
confidence interval for the amount by which taking the drug increases a mouse’s weight
gain compared to what the mouse’s weight gain would have been taking the placebo.
From the JMP output for Analysis I, a 95% confidence interval is (-1.274,30.311).
(b) (3) Assume the multiplicative treatment effect model holds. Find an (approximate)
95% confidence interval for the amount by which taking the drug multiplies a mouse’s
weight gain compared to what the mouse’s weight gain would have been taking the
placebo.
From the JMP output for Analysis II, a 95% confidence interval for the amount by which
taking the drug increases a mouse’s log weight gain compared to what the gain would
have been from taking the placebo is (0.8019,3.8481). Thus, a 95% confidence interval
for the amount by which taking the drug multiplies a mouse’s weight gain compared to
what it would have been from taking the placebo is (e 0.8019 , e 3.8481)  (2.23,46.90)
(c) (3) Is there strong evidence that taking the drug as opposed to the placebo causes a
change in mice’s mean weight gains? Justify your answer using an appropriate test.
The multiplicative model appears more appropriate than the additive model. The box
plots for Analysis I show that the drug group has much greater spread than the placebo
group; the spreads of the two groups should be about equal if the additive model holds.
On the other hand, the box plots for Analysis II show that the drug group and placebo
group have about equal spreads on the log scale; this is what we would expect to be the
case if the multiplicative model holds. Under the multiplicative model, the t-test of the
null hypothesis that the mean of log weight gain for the drug group equals the mean of
log weight gain for the placebo group versus the two sided alternative that the means are
not the same provides a test of whether taking the drug as opposed to the placebo causes
a change in mice’s mean weight gains. From Analysis II, the p-value for this test is
.0078. Thus, there is strong evidence that taking the drug as opposed to the placebo
causes a change in mice’s mean weight gains.
Analysis I: Y= Weight Gains
Oneway Analysis of Response By Group
Response
40
30
20
10
0
Drug
Placebo
Group
Means and Std Deviations
Level
Drug
Placebo
Number
Mean
Std Dev Std Err Mean Lower 95% Upper 95%
5 16.1461 15.2061 6.8004 -2.735 35.027
5 1.6276 1.8113 0.8100 -0.621 3.877
t-Test
Difference
t-Test DF
Prob > |t|
Estimate 14.519 2.120 8 0.0668
Std Error
6.848
Lower 95% -1.274
Upper 95% 30.311
Assuming equal variances
Analysis II: Y=Log (Weight Gains)
Oneway Analysis of Log Response By Group
Log Response
4
3
2
1
0
-1
Drug
Placebo
Group
Means and Std Deviations
Level
Drug
Placebo
Number
Mean
Std Dev Std Err Mean Lower 95% Upper 95%
5 2.35527 1.05441 0.47155 1.046 3.6645
5 0.03029 1.03417 0.46250 -1.254 1.3144
t-Test
Difference
t-Test DF
Prob > |t|
Estimate 2.32499 3.520 8 0.0078
Std Error 0.66050
Lower 95% 0.80188
Upper 95% 3.84810
Assuming equal variances
4. (10) Lotteries have become important sources of revenue for governments. Many
people have criticized lotteries, however, referring to them as a tax on the poor and
uneducated. In an examination of the issue, a random sample of 100 adults was asked
how much they spend on lottery tickets and was interviewed about various
socioeconomic variables. The following data was recorded: amount spent on lottery
tickets as a percentage of total household income (Lottery), number of years of education
(Education), age (Age), number of children (Children) and personal income in thousands
of dollars (Income). The output from multiple regression of Lottery on Education, Age,
Children and Income is shown below. Assume the ideal multiple linear regression holds
for this problem.
Response Lottery
Summary of Fit
Rsquare
Rsquare Adj
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.433474
0.40962
????
5.39
100
Analysis of Variance
Source
DF
Sum of Squares Mean Square
F Ratio
Model
4 615.4421 153.861 18.1722
Prob > F
Error
95
????
????
C. Total 99 1419.7900
<.0001
Parameter Estimates
Term
Intercept
Education
Age
Children
Income
Estimate
Std Error
t Ratio
Prob>|t|
11.906094 1.785197 6.67 <.0001
-0.430018 0.132072 -3.26 0.0016
0.0291899 0.025228 1.16 0.2501
0.0934351 0.224313 0.42 0.6780
-0.074471 0.027726 -2.69 0.0085
Lottery Residual
Residual by Predicted Plot
6
2
-2
-6
-10
0
5
10
15
Lottery Predicted
(a) (2) Based on the multiple linear regression below, what would you predict the lottery
spending of a 30 year old woman with 12 years of education, two children and an income
of $30,000 to be?
From the multiple regression, we would predict the percentage of total household income
that the woman would spend on the lottery to be
yˆ  11.91  0.430 *12  0.029 * 30  0.093 * 2  0.074 * 30  5.582
Thus, we would predict the woman to spend 5.582% of her income on the lottery or
0.05582 * 30000  $1674.60 on the lottery. Note than income is in thousands so the
woman’s income is 30 in the prediction equation.
(b) (2) The root mean square error ( ̂ ) is left blank. What is a reasonable estimate for
the root mean square error?
(i)
2.91
(ii)
5.91
(iii)
8.91
(iv)
10.91
A reasonable estimate is (i). We know that approximately 68% of the residuals should
have magnitude less than or equal to 2.91 and approximately 95% of the residuals should
have magnitude less than or equal to 2*2.91. Looking at the residual by predicted plot,
only (i) is a reasonable estimate of the root mean square error.
(c) (2) Is there strong evidence that the multiple regression model provides better
predictions of Lottery than just using the sample mean of Lottery to predict Lottery?
Justify your answer using a test.
To test if the multiple regression model provides better predictions of Lottery than just
using the sample mean of Lottery to predict Lottery, we use the overall F test which tests
whether the coefficients on all of the explanatory variables (Education, Age, Children and
Income) are equal to zero. From the JMP output in the Analysis of Variance table, the pvalue is <.0001. Thus, there is strong evidence that the multiple regression model
provides better predictions than the sample mean does.
(d) (2) What is an approximate 95% confidence interval for the coefficient on Age?
An approximate 95% confidence interval for the coefficient is the point estimate plus or
minus two standard errors: 0.0292  2 * .025  (0.0208,0.0792) .
(e) (2) A goal of the study was to test the following theories:
(i)
Relatively uneducated people spend different mean amounts on lottery tickets
than relatively educated people, all other things being equal.
(ii)
Older people spend different mean amounts on lottery tickets than younger
people, all other things being equal.
Assuming that there are no confounding variables, translate these theories into
appropriate null and alternative hypotheses about the multiple linear regression model’s
parameters. Test each of these theories at the 0.05 significance level and state your
conclusions.
Let the multiple regression model be written as
{Lottery | Education  x1 , Age  x2 , Children  x3 , Income  x4 } 
 0  1 x1   2 x2   3 x3   4 x4
Theory (i) corresponds to 1  0 . The t-test of H 0 : 1  0 vs. H a : 1  0 gives a pvalue of 0.0016. Since ˆ1  0.430 , there is strong evidence that relatively uneducated
people spend a higher mean amount on lottery tickets than relatively educated people, all
other things being equal.
Theory (ii) corresponds to  2  0 . The t-test of H 0 :  2  0 vs. H a :  2  0 gives a pvalue of 0.2501. Thus, there is no strong evidence that older people spend different mean
amounts on lottery tickets than younger people, all other things being equal.
5. (6) For each of the following situations, a simple linear regression has been carried
out. State whether (i) the the simple linear regression is well suited to answer the
question of interest or (ii) the simple linear regression is not well suited to answer the
question of interest and needs to be modified. If you answer (ii), state what the most
salient problem with simple linear regression is and discuss briefly how you would try to
fix it.
(a) (3) We want to predict a car’s fuel consumption from its speed. The scatterplot below
shows data on the British Ford Escort.
The residual plot indicates that the simple linear regression model is not appropriate. The
residual plot shows a clear pattern in the mean. The scatterplot of fuel used vs. speed
indicates that the relationship between fuel used and speed is quadratic so that quadratic
regression (a regression of fuel used on speed and speed squared) should be tried.
Bivariate Fit of Fuel Used By Speed
Fuel Used
13
14
13
11
9
7
5
0
50
100
Speed
150
Linear Fit
Linear Fit
Fuel Used = 7.5102418 + 0.0186022 Speed
Summary of Fit
RSquare
RSquare Adj
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.109542
0.035337
2.309302
9.091429
14
Analysis of Variance
Source
DF
Sum of Squares
Mean Square
F Ratio
Source
Model
Error
C. Total
DF
Sum of Squares
Mean Square
F Ratio
1
12
13
7.872450
63.994521
71.866971
7.87245
5.33288
1.4762
Prob > F
0.2477
Parameter Estimates
Term
Residual
Intercept
Speed
Estimate
Std Error
t Ratio
Prob>|t|
7.5102418
0.0186022
1.440329
0.015311
5.21
1.21
0.0002
0.2477
6
3
1314
0
-3
0
50
100
Speed
150
Observations with Largest Cook’s Distances
Observation Number
Cook’s Distance
13
0.280
14
0.083
Leverage
0.257
0.204
(b) (3) One of the most dangerous contaminants deposited over European countries
following the Chernobyl accident of April 1987 was radioactive cesium. To study cesium
transfer from contaminated soil to plants, researchers collected soil samples and samples
of mushroom mycelia from 17 wooded locations in Umbria, Central Italy, from August
1986 to November 1989. The researchers measured concentrations (Bq/kg) of cesium in
the soil and in the mushrooms. The researchers’ goal is to predict Y=concentration in
mushrooms based on X=concentration in soil. The output from a simple linear regression
is shown below.
The simple linear regression model with these data is not appropriate for drawing
inferences. Observation 17 has an enormous Cook’s distance of 10.08. This is much
greater than the cutoff of 1 for classifying a point as being influential. Observation 17’s
leverage is 0.755>2*(2/17). Thus, observation 17 is influential and has high leverage.
We cannot draw reliable inferences over the whole range of explanatory variables
(concentration in soil) in the data. We should omit observation 17 from the regression
and consider the model to only be reliable for concentration in soil in the range of 0-500.
Bivariate Fit of MUSHROOM By SOIL
200
17
MUSHROOM
150
100
50
16
0
0
250
500
750
1000
1250
1500
SOIL
Linear Fit
Linear Fit
MUSHROOM = 16.725686 + 0.0959027 SOIL
Summary of Fit
Rsquare
RSquare Adj
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.406386
0.366812
36.56475
44.58824
17
Analysis of Variance
Source
Model
Error
C. Total
DF
1
15
16
Sum of Squares
13729.399
20054.718
33784.118
Mean Square
13729.4
1337.0
F Ratio
10.2690
Prob > F
0.0059
Parameter Estimates
Term
Intercept
SOIL
Estimate
16.725686
0.0959027
Std Error
12.41954
0.029927
t Ratio
1.35
3.20
Prob>|t|
0.1981
0.0059
Residual
75
50
17
25
0
-25
16
-50
0
250
500
750
1000
1250
1500
SOIL
Observations with Largest Cook’s Distances
Observation Number
Cook’s Distance
16
0.081
17
10.08
Leverage
0.082
0.755
6. (7) A study was conducted to test the effects of a nonsteroidal anti-inflammatory drug
on pain. The experiment examines the effects of giving the treatment both before and
after surgery or after surgery only. 30 patients between the ages of 18 and 55 undergoing
elective knee arthroscopy were enrolled in the study two weeks before their surgery and
randomly divided into three groups. Group A received the nonsteroidal antiinflammatory drug (NSAID) both 3 days prior to surgery and 5 days after surgery. Group
B received a placebo before surgery and the NSAID after surgery. Group C received the
placebo both before and after surgery. Post-operatively all patients were given
prescriptions for codeine which could be taken every 4 to 6 hours as needed. Pain scores
were recorded at the time of enrollment in the study (two weeks before surgery), one day
before surgery and one week after surgery (higher pain scores indicate that the patient is
in more pain). Shown below are JMP analyses for three outcomes – (I) pain at time of
enrollment in the study (two weeks before surgery); (II) pain one day before surgery; and
(III) pain one week after surgery.
(a) (2) Is there strong evidence that that not all of the treatments are equally effective one
week after surgery, i.e., that some of the treatments have higher mean pain scores than
other treatments one week after surgery? State this question in terms of a hypothesis test
and answer the question of interest, using a test at the 0.05 level.
Let  A,1 weekafter ,  B ,1 weekafter,  C ,1 weekafter be the mean pain scores for treatments A, B and
C 1-week after surgery respectively. To test H 0 :  A,1 weekafter   B ,1 weekafter   C ,1 weekafter
vs. H a : not all treatments have same mean one week after surgery, we use the one-way
Analysis of Variance F-test for Analysis III. The p-value for this test is <.0001. Thus,
there is strong evidence that not all treatments are equally effective one week after
surgery.
(b) (3) Choose a linear combination to test the hypothesis that NSAID is beneficial as a
preoperative treatment (i.e., that it reduces pain before surgery). Find an approximate
95% confidence interval for the linear combination and test the hypothesis that the linear
combination equals 0 at the 0.05 level.
To investigate whether NSAID is beneficial as a preoperative treatment, we want to look
at pain scores one day before surgery (We could also look at the differences between pain
scores one day before surgery and two weeks before surgery. Although this would be a
more powerful approach, it would require that we take into account the dependence
between a patient’s score one day before surgery and two weeks before surgery as in a
matched pairs t-test.) Because only group A had the treatment before surgery, we would
like to compare group A to the combination of groups B and C. The appropriate linear
 B  C
where  A ,  B ,  C denote the means of
2
groups A, B and C one-day before surgery respectively. The point estimate of 
27.211  27.350
 6.983 . The standard error of g is
is g  20.928 
2
combination for doing that is    A 
C12 C 22 C32
1 (1 / 2) 2 (1 / 2) 2


1.877


 0.727 . Thus an approximate 95%
n1
n2
n3
10
10
10
confidence interval for  is g  2 * SE ( g )  6.983  2 * 0.727  (5.529,8.437) .
Because 0 is not in the confidence interval, we reject H 0 :   0 at the 0.05 level. Since
ˆ
g is negative and H 0 :   0 is rejected, there is strong evidence that NSAID is beneficial
as a preoperative treatment.
(c) (2) There is something in these analyses that calls into question whether patients were
randomly assigned into Groups A, B and C. Explain what feature of these analyses calls
the random assignment into question.
If patients were in fact randomly assigned to Groups A, B and C, then we would not
expect the pain scores of Groups A, B and C to be much different two weeks before
surgery, since none of the groups have received any treatment at that point. However, we
see from Analysis I that the p-value for testing the null hypothesis that the means of
Groups A, B and C two weeks before surgery are all equal is <.0001. Furthermore, every
patient in Group A had a higher pain score than any of the patients in Group B and C.
The differences in pain scores between the groups before any treatment has been applied
calls into question whether the patients were in fact randomly assigned.
Analysis I – Outcome: Pain two weeks before surgery
Oneway Analysis of Pain Two Weeks Before Surgery By Group
Pain Two Weeks Before Surgery
30
27.5
25
22.5
20
17.5
15
A
B
C
Group
Means and Std Deviations
Level
A
B
C
Number
10
10
10
Mean
19.9738
19.9886
25.6040
Std Dev
1.50822
1.82481
1.53550
Std Err Mean
0.47694
0.57706
0.48557
Lower 95%
18.995
18.805
24.608
Oneway Anova
Summary of Fit
Rsquare
Adj Rsquare
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.746273
0.727478
1.629153
21.85545
30
Analysis of Variance
Source
Group
Error
C. Total
DF
2
27
29
Sum of Squares
210.77451
71.66178
282.43629
Mean Square
105.387
2.654
F Ratio
39.7067
Prob > F
<.0001
Upper 95%
20.952
21.173
26.600
Analysis II – Outcome: Pain One Day Before Surgery
Oneway Analysis of Pain One Day Before Surgery By Group
Pain One Day Before Surgery
32.5
30
27.5
25
22.5
20
17.5
A
B
C
Group
Means and Std Deviations
Level
A
B
C
Number
10
10
10
Mean
20.9282
27.2110
27.3495
Std Dev
2.27684
1.84643
1.40720
Std Err Mean
0.72000
0.58389
0.44500
Lower 95%
19.451
26.013
26.436
Oneway Anova
Summary of Fit
Rsquare
Adj Rsquare
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.738744
0.719392
1.877367
25.16289
30
Analysis of Variance
Source
Group
Error
C. Total
DF
2
27
29
Sum of Squares
269.08526
95.16172
364.24698
Mean Square
134.543
3.525
F Ratio
38.1734
Prob > F
<.0001
Upper 95%
22.406
28.409
28.263
Analysis III – Pain One Week After Surgery
Oneway Analysis of Pain One Week After Surgery By Group
Pain One Week After Surgery
22
20
18
16
14
12
A
B
C
Group
Means and Std Deviations
Level
A
B
C
Number
10
10
10
Mean
15.6458
15.6439
19.6755
Std Dev
1.63775
1.65290
1.30114
Std Err Mean
0.51790
0.52269
0.41146
Lower 95%
14.583
14.571
18.831
Oneway Anova
Summary of Fit
Rsquare
Adj Rsquare
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.628702
0.601198
1.539187
16.98841
30
Analysis of Variance
Source
Group
Error
C. Total
DF
2
27
29
Sum of Squares
108.30977
63.96558
172.27535
Mean Square
54.1549
2.3691
F Ratio
22.8589
Prob > F
<.0001
Upper 95%
16.708
16.716
20.520
7. (8) A study in Alachua County, Florida, investigated the relationship between
Y=mental health impairment, X 1 =life events score and X 2  SES. Here is a description
of the variables:
Y  index of mental health impairment, which incorporates various dimensions
of psychiatric symptoms, including aspects of anxiety and depression. Higher
values indicate greater mental impairment. Values ranged from 17 to 41 in the
sample.
X 1 = life events score. Measure of number and severity of major life events the
subject experienced within the past three years. These events range from severe
personal disruptions such as a death in the family, a jail sentence, or an
extramarital affair, to less severe events such as getting a new job, the birth of a
child, moving within the same city, or having a child marry. A high X 1 score
represents a greater number and/or greater severity of these life events. Scores
range from 3 to 97 in the sample.
X 2  socioeconomic status (SES). Composite index based on occupation, income
and education. Measured on a standard scale, it ranges from 0 to 100; the higher
the score, the higher the status.
JMP output from a multiple regression of mental impairment on life events score and
SES is shown below.
Response Mental Impairment
Summary of Fit
RSquare
RSquare Adj
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.339159
0.303438
4.556438
27.3
40
Analysis of Variance
Source
Model
Error
C. Total
DF
Sum of Squares
Mean Square
F Ratio
2
37
39
394.2384
768.1616
1162.4000
197.119
20.761
9.4946
Parameter Estimates
Term
Estimate
Std Error
t Ratio
Prob>|t|
Intercept
28.229813 2.174222 12.98 <.0001
Life Events 0.1032595 0.032499 3.18 0.0030
SES
-0.097476 0.029085 -3.35 0.0019
Prob > F
0.0005
Mental Impairment Residual
Residual by Predicted Plot
10
5
0
-5
-10
15 20 25 30 35 40
Mental Impairment
Predicted
(a) (2) Assuming there are no omitted confounding variables, is there strong evidence
that an increase in life events increases mental impairment? Justify your answer
using a test at the .05 level.
The multiple regression model is
{Mental _ Im pairment | Life _ events  x1 , SES  x2 }   0  1 x1   2 x2 . Assuming
there are no omitted variables, increases in life events cause increases in mean mental
impairment if 1  0 . The point estimate of  1 is ˆ1  0.1032 and the p-value for the ttest of H 0 : 1  0 is 0.0030. Thus, there is strong evidence that increases in life events
cause increases in mean mental impairments, assuming there are no omitted confounding
variables.
(b) There is concern that age is a confounding variable. It is known that age is positively
associated with mental impairment holding fixed life event score and SES fixed.
Shown below is a regression of age on life event score and SES from another study.
The same relationship between age, life event score and SES is believed to hold for
this study. Suppose we were able to obtain the ages of study participants and run the
multiple regression ˆ{Y | X 1 , X 2 , Age}  ˆ0  ˆ1 X 1  ˆ2 X 2  ˆ3 X 3 . How would
you expect ˆ to compare to 0.1033, the estimated coefficient on X for the
1
1
regression of mental impairment on life event score and SES. Would you expect ˆ1
to be larger, smaller or about the same size as 0.1033? Explain your reasoning.
ˆ{Y | X 1 , X 2 }  ˆ0*  ˆ1* X 1  ˆ2* X 2
{ Age | X 1 , X 2 }  ˆ0  ˆ1 X 1  ˆ2 X 2
By the omitted variables bias formula, ˆ1  ˆ1*  ˆ1 ˆ3 . The regression of Age on life
events index and SES from another study shown in the problem indicates that ˆ1 is
probably less than 0 ( ˆ1 =-0.4083 in the regression from the other study). Because it is
Let
known that age is positively associated with mental impairment holding fixed life event
score and SES, ̂ 3 is probably greater than zero. Thus, ˆ1 ˆ3 is probably less than zero
and from the omitted variables bias formula, we conclude that ˆ is probably larger than
1
̂ (=0.1033).
*
1
Response Age
Summary of Fit
RSquare
RSquare Adj
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.596241
0.480881
7.723134
45.7
10
Parameter Estimates
Term
Estimate
Std Error
t Ratio
Prob>|t|
Intercept
58.325987 7.374532 7.91 <.0001
Life Events Index -0.408285 0.127675 -3.20 0.0151
SES
0.0257135 0.093981 0.27 0.7923
Residual by Predicted Plot
Age Residual
15
10
5
0
-5
-10
20 25 30 35 40 45 50 55 60
Age Predicted
(c) (3) Ignore the issue of the potential confounding variable in part (b). A psychologist
hypothesizes that for low levels of life scores, subjects with high SES and low SES
should have about the same mean mental impairment but for high levels of life
scores, subjects with high SES are better able to withstand the mental stress of
potentially traumatic life events than subject with low SES and have lower levels of
mean mental impairment. Write down the formula of a multiple regression model
that could be used to test the psychologist’s hypothesis and state what the
psychologist’s hypothesis is in terms of the parameters of the model you write down.
The psychologist is hypothesizing that there is an interaction between SES and life events
index. A multiple regression model that could be used to test the psychologists’
hypothesis is
{Mental _ Im pairment | Life _ events  x1 , SES  x2 } 
 0  1 x1   2 x2   3 x1 * x2
The psychologists’ hypothesis is that there is a negative interactions between life events
and SES (for low life events, high SES and low SES have about the same mean of mental
impairment but for high life events, high SES has lower mean of mental impairment than
low SES). Thus, the psychologists’ hypothesis is that  3  0 .
Download