Test 2 Math 327
1. The following table shows results of a three-center clinical trial to compare
a drug to a placebo for curing an infection. At each center, the subjects were
randomly assigned to treatment groups.
                       Response
Center  Treatment  Success  Failure
1       Drug            11       25
1       Control         10       27
2       Drug            16        4
2       Control         22       10
3       Drug            14        5
3       Control          7       12
a). Fit a logistic regression model using Center and Treatment as predictors.
Based on your output, controlling for center, how do the odds of having a success
change if a subject switches from the Control treatment to the Drug treatment?
Also obtain a 95% confidence interval for this effect.
> infection <- read.table("http://educ.jmu.edu/~chen3lx/math327/infection.txt",header=TRUE)
> infection
  Center Treatment Success Failure
1      1      Drug      11      25
2      1   Control      10      27
3      2      Drug      16       4
4      2   Control      22      10
5      3      Drug      14       5
6      3   Control       7      12
> infection$Center <- factor(infection$Center)
> out <- glm(cbind(Success,Failure)~Center+Treatment,infection,family=binomial)
> summary(out)
Call:
glm(formula = cbind(Success, Failure) ~ Center + Treatment, family = binomial,
    data = infection)

Deviance Residuals:
       1         2         3         4         5         6
-0.63760   0.69979  -0.08129   0.05481   0.95423  -0.90506

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -1.2578     0.3286  -3.828 0.000129 ***
Center2         2.0254     0.4195   4.828 1.38e-06 ***
Center3         1.1428     0.4226   2.704 0.006841 **
TreatmentDrug   0.6644     0.3529   1.882 0.059786 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 31.7394  on 5  degrees of freedom
Residual deviance:  2.6355  on 2  degrees of freedom
AIC: 31.729
$\exp(0.6644) = 1.94$: controlling for Center, the odds of a success in the
Drug group are 1.94 times the odds in the Control group.
A 95% Wald interval for the log odds ratio is
$0.6644 \pm 1.96 \times 0.3529 = (-0.0273, 1.3561)$, so the interval for the
odds ratio is $(e^{-0.0273}, e^{1.3561}) = (0.97, 3.88)$.
We are 95% confident that the odds of a success in the Drug group are between
0.97 and 3.88 times the odds in the Control group, controlling for Center.
Because the interval contains 1, the treatment effect is not significant at the
0.05 level.
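The same odds ratio and Wald interval can be reproduced in R. The sketch below re-enters the six rows of counts by hand (an assumption, made so the example does not depend on the course URL), sets Control as the reference level explicitly, and uses qnorm(0.975) instead of the rounded 1.96.

```r
# Counts re-entered by hand from the table in question 1 (assumption: these
# match the contents of infection.txt).
infection <- data.frame(
  Center    = factor(rep(1:3, each = 2)),
  Treatment = factor(rep(c("Drug", "Control"), times = 3),
                     levels = c("Control", "Drug")),  # Control is the baseline
  Success   = c(11, 10, 16, 22, 14, 7),
  Failure   = c(25, 27, 4, 10, 5, 12)
)

out <- glm(cbind(Success, Failure) ~ Center + Treatment,
           data = infection, family = binomial)

# Estimate and standard error of the Drug-versus-Control log odds ratio
coefs <- coef(summary(out))
est <- coefs["TreatmentDrug", "Estimate"]
se  <- coefs["TreatmentDrug", "Std. Error"]

odds_ratio <- exp(est)                                  # point estimate
wald_ci    <- exp(est + c(-1, 1) * qnorm(0.975) * se)   # 95% Wald interval

round(odds_ratio, 2)
round(wald_ci, 2)
```

Calling exp(confint.default(out)) should give the same Wald intervals for every coefficient at once.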
b). Based on your output, controlling for treatment, how do the odds of
having a success change if a subject switches from Center 1 to Center 2? Which
center has the highest odds of success controlling for treatment?
$\exp(2.0254) = 7.58$: controlling for treatment, the odds of a success at
Center 2 are 7.58 times the odds at Center 1. Center 1 is the baseline level
(coefficient 0), and the estimates for Center 2 (2.0254) and Center 3 (1.1428)
are both positive, with Center 2 the larger. Center 2 therefore has the highest
odds of success, controlling for treatment.
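The center comparison is only arithmetic on the printed coefficients, which can be sketched in R as follows. The vector name log_or and the label Center1 are introduced here for illustration; the two nonzero values are copied from the summary output.

```r
# Center coefficients copied from the summary(out) table. Center 1 is the
# baseline level, so its log odds ratio is 0 by construction.
log_or <- c(Center1 = 0, Center2 = 2.0254, Center3 = 1.1428)

# Odds of success relative to Center 1, holding treatment fixed
odds_vs_center1 <- exp(log_or)
round(odds_vs_center1, 2)

# Center with the largest odds of success
best_center <- names(which.max(odds_vs_center1))
best_center
```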
c). Compute the sample odds ratios within each center. Based on the results,
is it reasonable to assume a common conditional odds ratio between Response
and Treatment?
The three sample odds ratios within each center are:
\[
\hat{\theta}_1 = \frac{11 \times 27}{25 \times 10} = 1.188, \qquad
\hat{\theta}_2 = \frac{16 \times 10}{22 \times 4} = 1.818, \qquad
\hat{\theta}_3 = \frac{14 \times 12}{7 \times 5} = 4.8.
\]
Center 3 has a much higher odds ratio than Centers 1 and 2, so the assumption
of a common conditional odds ratio looks questionable. It is probably more
reasonable to fit a model that allows a different odds ratio within each
center. This can be done by adding a Center-by-Treatment interaction term to
the model.
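A minimal R sketch of both steps follows: the within-center sample odds ratios, then the model with a Center-by-Treatment interaction. The counts are re-entered by hand (an assumption, to avoid relying on the course URL), and the names or_by_center and out_int are illustrative.

```r
# Counts re-entered by hand from the table in question 1
infection <- data.frame(
  Center    = factor(rep(1:3, each = 2)),
  Treatment = factor(rep(c("Drug", "Control"), times = 3),
                     levels = c("Control", "Drug")),
  Success   = c(11, 10, 16, 22, 14, 7),
  Failure   = c(25, 27, 4, 10, 5, 12)
)

# Sample odds ratio in each center:
# (Drug successes * Control failures) / (Drug failures * Control successes)
or_by_center <- sapply(split(infection, infection$Center), function(d) {
  drug <- d[d$Treatment == "Drug", ]
  ctrl <- d[d$Treatment == "Control", ]
  (drug$Success * ctrl$Failure) / (drug$Failure * ctrl$Success)
})
round(or_by_center, 3)

# Model that lets the treatment effect differ by center. With six rows and
# six parameters, this model fits the observed counts exactly.
out_int <- glm(cbind(Success, Failure) ~ Center * Treatment,
               data = infection, family = binomial)
coef(out_int)
```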
2. A study is performed to identify risk factors associated with giving birth to
a low birth weight baby (weighing less than 2500 grams). The response variable
is low (1 for weight < 2500g, 0 otherwise). The four predictors considered are
race (1=white, 2=black, 3=other), smoke (smoking status of the mother during
pregnancy, 1=yes, 0=no), lwt (weight in pounds right before pregnancy), and ptl
(number of previous premature labors, taking values 0, 1, 2, 3, etc.). The
following is the R output from glm fits of the models
low ∼ lwt + race + smoke + ptl and low ∼ lwt, respectively.
Call:
glm(formula = low ~ lwt + race + smoke + ptl, family = binomial,
    data = birthwt)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.7432  -0.8520  -0.5669   1.1667   2.0614

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.35023    0.90242  -0.388   0.6979
lwt         -0.01194    0.00637  -1.874   0.0609 .
race2        1.29788    0.51263   2.532   0.0113 *
race3        0.94423    0.41716   2.263   0.0236 *
smoke1       0.94017    0.38611   2.435   0.0149 *
ptl          0.60550    0.33103   1.829   0.0674 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 234.67  on 188  degrees of freedom
Residual deviance: 211.55  on 183  degrees of freedom
AIC: 223.55

Call:
glm(formula = low ~ lwt, family = binomial, data = birthwt)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0951  -0.9022  -0.8018   1.3609   1.9821

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.99831    0.78529   1.271   0.2036
lwt         -0.01406    0.00617  -2.279   0.0227 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 234.67  on 188  degrees of freedom
Residual deviance: 228.69  on 187  degrees of freedom
AIC: 232.69

Number of Fisher Scoring iterations: 4

1). Based on the above output, compute the odds ratio between black mothers
and mothers of other races (race = 3), controlling for lwt, smoking status and
ptl. You only need to give a point estimate. No confidence interval is needed.
Both race coefficients are measured against the white baseline, so their
difference compares black with other directly. If other race were the baseline
level, the coefficient for black would be 1.2979 − 0.9442 = 0.3537, and
$\exp(0.3537) = 1.42$. The odds of having a low birth weight baby are 42%
higher for a black mother than for a mother of another race, controlling for
all the other variables in the model.
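The black-versus-other comparison can be checked with a few lines of R using the printed coefficients. The variable names below are illustrative; the two values are copied from the full-model output.

```r
# Race coefficients copied from the full-model output (baseline: race 1, white)
b_race2 <- 1.29788  # black vs white, on the log odds scale
b_race3 <- 0.94423  # other vs white, on the log odds scale

# The white baseline cancels in the difference, leaving black vs other
log_or_black_vs_other <- b_race2 - b_race3
or_black_vs_other <- exp(log_or_black_vs_other)

round(log_or_black_vs_other, 4)
round(or_black_vs_other, 2)
```

Assuming race is stored as a factor, refitting with relevel(race, ref = "3") in the formula should report this contrast directly as the race2 coefficient.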
2). Based on the output, do more previous premature labors increase or decrease
the odds of having a low birth weight baby? Quantify this effect using a 95%
confidence interval.
The coefficient of ptl is positive, so, controlling for all the other
variables, more previous premature labors increase the odds of having a low
birth weight baby. $\exp(0.6055) = 1.83$: the odds increase by 83% with each
additional premature labor.
A 95% CI for the coefficient is
$0.6055 \pm 1.96 \times 0.3310 = (-0.0433, 1.2543)$, so the 95% CI for the odds
ratio is $(e^{-0.0433}, e^{1.2543}) = (0.96, 3.51)$. The interval contains 1,
so the increase is not significant at the 0.05 level.
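As a check on the arithmetic, the interval can be recomputed in R from the printed estimate and standard error of ptl. The variable names are illustrative, and 1.96 is used to match the hand calculation.

```r
# Estimate and standard error of ptl, copied from the full-model output
est <- 0.60550
se  <- 0.33103

odds_ratio <- exp(est)                  # multiplicative change in odds per labor
ci_log     <- est + c(-1, 1) * 1.96 * se  # 95% Wald interval, log odds scale
ci_or      <- exp(ci_log)                 # same interval on the odds ratio scale

round(odds_ratio, 2)
round(ci_log, 4)
round(ci_or, 2)
```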
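3). Estimate the probability of having a low birth weight baby for a black,
nonsmoking mother who weighed 150 pounds before pregnancy and had 2 previous
premature labors.
Here race2 = 1, race3 = 0 and smoke1 = 0, so the race3 and smoke terms drop out:
\[
\hat{\pi} = \frac{\exp(-0.3502 - 0.0119 \times 150 + 1.2979 + 0.6055 \times 2)}
{1 + \exp(-0.3502 - 0.0119 \times 150 + 1.2979 + 0.6055 \times 2)} = 0.59.
\]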
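The fitted probability can be reproduced in R by forming the linear predictor from the printed coefficients and applying the inverse logit (plogis). The vectors beta and x are illustrative names, and the covariate values encode the mother described in the question.

```r
# Coefficients copied from the full-model output
beta <- c(intercept = -0.35023, lwt = -0.01194, race2 = 1.29788,
          race3 = 0.94423, smoke1 = 0.94017, ptl = 0.60550)

# Black (race2 = 1), nonsmoking (smoke1 = 0), 150 lb, 2 premature labors
x <- c(intercept = 1, lwt = 150, race2 = 1,
       race3 = 0, smoke1 = 0, ptl = 2)

eta   <- sum(beta * x)  # linear predictor (log odds)
p_hat <- plogis(eta)    # exp(eta) / (1 + exp(eta))

round(p_hat, 2)
```

With the fitted model object available, predict(fit, newdata, type = "response") should return the same probability; fit and newdata are hypothetical names here.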
4). Is the (conditional) effect of lwt significant after controlling for the effects
of race, smoke and ptl? Perform a test to answer this question.
The Wald test of $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$, where $\beta_1$
is the coefficient of lwt in the full model, has z test statistic −1.874 and
p-value 0.06. The conditional effect of lwt is not significant at the 0.05
level (it is only marginally significant, at the 0.10 level) after controlling
for the effects of the other variables.
5). Is the marginal effect of lwt significant? Perform a test to answer this
question.
The Wald test for the marginal effect of lwt, based on the model low ∼ lwt,
has z test statistic −2.279 and p-value 0.02. The marginal effect is
significant at the 0.05 level.
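Both Wald tests for lwt can be recomputed in R from the printed estimates and standard errors. The helper wald_test is a hypothetical function written for this illustration, not part of any package.

```r
# Two-sided Wald z test from an estimate and its standard error
wald_test <- function(estimate, std_error) {
  z <- estimate / std_error
  p <- 2 * pnorm(-abs(z))  # two-sided p-value from the standard normal
  c(z = z, p_value = p)
}

# lwt in the full model (conditional effect) and in low ~ lwt (marginal effect)
conditional <- wald_test(-0.01194, 0.00637)
marginal    <- wald_test(-0.01406, 0.00617)

round(conditional, 4)
round(marginal, 4)
```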
6). Perform a test to determine whether at least one of the following predictors
is significant after lwt is included in the model: race, smoke and ptl.
$H_0: \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$ (the coefficients of race2,
race3, smoke1 and ptl),
$H_a$: at least one of them is not 0.
For a likelihood ratio test, the test statistic is the difference between the
residual deviances of the models low ∼ lwt and low ∼ lwt + race + smoke + ptl:
$-2(L_0 - L_1) = 228.69 - 211.55 = 17.14$, with d.f. $= 187 - 183 = 4$.
Note: the residual deviance given in the R output for a model $M$ is
$-2(L_M - L_S)$, where $L_S$ is the maximized log-likelihood for the saturated
model, so $L_S$ cancels in the difference.
p-value $= P(\chi^2_4 > 17.14) = 0.002$.
Reject $H_0$: at least one of race, smoke and ptl is significant after lwt is
included in the model.
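The likelihood ratio test can be carried out in R directly from the two residual deviances and their degrees of freedom, as in the sketch below. The variable names are illustrative; the numbers are copied from the two outputs.

```r
# Residual deviances and degrees of freedom copied from the two glm outputs
dev_reduced <- 228.69; df_reduced <- 187  # low ~ lwt
dev_full    <- 211.55; df_full    <- 183  # low ~ lwt + race + smoke + ptl

lr_stat <- dev_reduced - dev_full  # likelihood ratio statistic
lr_df   <- df_reduced - df_full    # number of coefficients set to 0 under H0

# Upper-tail chi-squared probability
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

round(lr_stat, 2)
lr_df
round(p_value, 4)
```

With both fitted model objects available, anova(fit_reduced, fit_full, test = "Chisq") should give the same test; fit_reduced and fit_full are hypothetical object names.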