Fall 2002

STAT 557
EXAM II: SOLUTIONS and COMMENTS
1. A. (8 points) The estimated odds ratio is
\[ \hat{\alpha} = \frac{73/40}{145/107} = 1.35 . \]
To construct a confidence interval, compute $\log(\hat{\alpha}) = .2977$ and its large sample standard error
\[ S_{\log(\hat{\alpha})} = \sqrt{\frac{1}{73} + \frac{1}{40} + \frac{1}{145} + \frac{1}{107}} = 0.2344 . \]
Then, compute an approximate 95% confidence interval for $\log(\alpha)$, e.g.,
\[ \log(\hat{\alpha}) \pm 1.96\, S_{\log(\hat{\alpha})} \Rightarrow .2977 \pm (1.96)(0.2344) \Rightarrow (-.1617,\ .7571) . \]
Then, an approximate 95% confidence interval for the odds ratio is
\[ (\exp(-.1617),\ \exp(.7571)) \Rightarrow (0.85,\ 2.13) \]
and this can be used as an approximate confidence interval for the relative risk of displaying
behavioral problems.
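The arithmetic in part A can be checked with a short Python sketch. The function name `odds_ratio_ci` and the argument names are illustrative choices, not part of the exam; only the four counts and the 1.96 multiplier come from the solution.

```python
import math

def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Large-sample confidence interval for the odds ratio (n11/n12)/(n21/n22)."""
    log_or = math.log((n11 / n12) / (n21 / n22))
    # Standard error of the log odds ratio: square root of the summed reciprocal counts.
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lower, upper = log_or - z * se, log_or + z * se
    # Exponentiate the endpoints to return to the odds-ratio scale.
    return math.exp(log_or), se, (math.exp(lower), math.exp(upper))

or_hat, se, (lower, upper) = odds_ratio_ci(73, 40, 145, 107)
print(round(or_hat, 2), round(se, 4), round(lower, 2), round(upper, 2))
# prints: 1.35 0.2344 0.85 2.13
```

Working on the log scale and transforming back keeps both endpoints positive, which an interval built directly on the odds-ratio scale would not guarantee.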
B. (6 points) Since
\[ RR = \frac{\Pr\{\text{Behavioral problems} \mid \text{death of sibling}\}}{\Pr\{\text{Behavioral problems} \mid \text{no death of sibling}\}} \]
and
\begin{align*}
\text{odds ratio}
&= \frac{\Pr\{\text{Death of sibling} \mid \text{Behavioral problems}\} \,/\, \Pr\{\text{No death of sibling} \mid \text{Behavioral problems}\}}
        {\Pr\{\text{Death of sibling} \mid \text{no behavioral problems}\} \,/\, \Pr\{\text{No death of sibling} \mid \text{no behavioral problems}\}} \\
&= \frac{\Pr\{\text{Death of sibling and Behavioral problems}\} \,/\, \Pr\{\text{No death of sibling and Behavioral problems}\}}
        {\Pr\{\text{Death of sibling and no behavioral problems}\} \,/\, \Pr\{\text{No death of sibling and no behavioral problems}\}} \\
&= \frac{\Pr\{\text{Behavioral problems} \mid \text{Death of sibling}\} \,/\, \Pr\{\text{Behavioral problems} \mid \text{No death of sibling}\}}
        {\Pr\{\text{No behavioral problems} \mid \text{Death of sibling}\} \,/\, \Pr\{\text{No behavioral problems} \mid \text{No death of sibling}\}} ,
\end{align*}
the odds ratio is a good approximation to RR if
\[ \frac{\Pr\{\text{No behavioral problems} \mid \text{death of sibling}\}}{\Pr\{\text{No behavioral problems} \mid \text{no death of sibling}\}} \]
is close to one. This would be the situation, for example, if a high percentage of children did
not display behavioral problems regardless of whether or not their mother had experienced a
loss of a baby.
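A small numerical sketch may make the condition in part B concrete. The probabilities below are hypothetical values chosen only for illustration (they are not estimates from the study), and `rr_and_or` is an illustrative helper name.

```python
def rr_and_or(p_exposed, p_unexposed):
    """Return (relative risk, odds ratio, ratio of the 'no problem' probabilities)."""
    rr = p_exposed / p_unexposed
    odds_ratio = (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))
    # The odds ratio equals RR divided by this ratio, so it should be near 1.
    ratio = (1 - p_exposed) / (1 - p_unexposed)
    return rr, odds_ratio, ratio

# HYPOTHETICAL probabilities of behavioral problems (exposed, unexposed).
rare = rr_and_or(0.02, 0.01)     # problems are uncommon in both groups
common = rr_and_or(0.60, 0.30)   # problems are common in both groups

print(rare)    # RR = 2, odds ratio close to 2
print(common)  # RR = 2, odds ratio = 3.5
```

In both hypothetical cases the relative risk is 2, but the odds ratio only stays close to it when most children in both groups show no behavioral problems.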
C. (6 points) The relative risk of displaying behavioral problems cannot be directly estimated
from the data obtained in this study because one sample was taken from the population of
children who display behavioral problems and the other sample was taken from the
population of children who do not display behavioral problems. Consequently, we can
estimate only conditional probabilities that condition on these populations, e.g.,
Pr{Death of sibling | Behavioral problems}, Pr{No death of sibling | Behavioral problems},
Pr{Death of sibling | No behavioral problems}, and Pr{No death of sibling | No behavioral problems}.
Since relative risk is
\[ RR = \frac{\Pr\{\text{Behavioral problems} \mid \text{death of sibling}\}}{\Pr\{\text{Behavioral problems} \mid \text{no death of sibling}\}}
      = \frac{\Pr\{\text{Death of sibling} \mid \text{Behavioral problems}\} \,/\, \Pr\{\text{death of sibling}\}}{\Pr\{\text{No death of sibling} \mid \text{Behavioral problems}\} \,/\, \Pr\{\text{no death of sibling}\}} , \]
we cannot estimate relative risk from these data without estimating
Pr{no death of sibling} / Pr{death of sibling}, but the data from this study provide no
information for estimating this ratio. You cannot estimate, for example,
\begin{align*}
\Pr\{\text{death of sibling}\} &= \Pr\{\text{death of sibling} \mid \text{Behavioral problems}\}\,\Pr\{\text{Behavioral problems}\} \\
&\quad + \Pr\{\text{death of sibling} \mid \text{no Behavioral problems}\}\,\Pr\{\text{no Behavioral problems}\}
\end{align*}
because the number of children with behavioral problems is fixed by the sampling
procedure.
2. (12 points) The counts for the 400 mother/daughter pairs could be displayed as follows in a
two-way table with one count for each pair.

                          Mother's Response
Daughter's Response       Yes       No        Unsure
Yes                       Y11       Y12       Y13
No                        Y21       Y22       Y23
Unsure                    Y31       Y32       Y33

Since simple random sampling was used, $\mathbf{Y} = (Y_{11}\ Y_{12}\ Y_{13}\ Y_{21}\ Y_{22}\ Y_{23}\ Y_{31}\ Y_{32}\ Y_{33})'$ has
a multinomial distribution with sample size n=400 and category probabilities
$\boldsymbol{\pi} = (\pi_{11}\ \pi_{12}\ \pi_{13}\ \pi_{21}\ \pi_{22}\ \pi_{23}\ \pi_{31}\ \pi_{32}\ \pi_{33})'$ where $\sum_{i=1}^{3}\sum_{j=1}^{3} \pi_{ij} = 1$. The null hypothesis can
be written as $H_0: C\boldsymbol{\pi} = \mathbf{0}$ where
\[ C = \begin{bmatrix} 0 & 1 & 1 & -1 & 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 1 & 0 & -1 & 0 \end{bmatrix} . \]
Let $\mathbf{p} = \mathbf{Y}/n$. Then,
\[ V = \frac{1}{n}\, C \left( \Delta_{\mathbf{p}} - \mathbf{p}\mathbf{p}' \right) C' \]
is a consistent estimator of
\[ Var(C\mathbf{p}) = \frac{1}{n}\, C \left( \Delta_{\boldsymbol{\pi}} - \boldsymbol{\pi}\boldsymbol{\pi}' \right) C' \]
and the Wald test rejects the null hypothesis if $X^2 = (C\mathbf{p})' V^{-1} C\mathbf{p}$ is larger than $\chi^2_{2,\alpha}$, the
upper $\alpha$ percentile of a central chi-square distribution with 2 degrees of freedom. There
are many other C matrices that can be used to express the same null hypothesis and the same
test.
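As a rough illustration of how this Wald statistic could be evaluated, the sketch below uses only the Python standard library. The function name and both 3x3 tables of counts are hypothetical (the exam does not report the counts); the C matrix is the one given above, and the code uses the identity C(Δp − pp′)C′ = CΔpC′ − (Cp)(Cp)′.

```python
def wald_marginal_homogeneity(counts):
    """Wald statistic X^2 = (Cp)' V^{-1} (Cp) for a 3x3 table of paired responses."""
    y = [c for row in counts for c in row]   # Y11, Y12, ..., Y33
    n = sum(y)
    p = [c / n for c in y]
    C = [[0, 1, 1, -1, 0, 0, -1, 0, 0],
         [0, -1, 0, 1, 0, 1, 0, -1, 0]]
    cp = [sum(c * pk for c, pk in zip(row, p)) for row in C]

    def v(r, s):
        # (r, s) entry of V = C (diag(p) - p p') C' / n
        return (sum(C[r][k] * C[s][k] * p[k] for k in range(9)) - cp[r] * cp[s]) / n

    v11, v12, v22 = v(0, 0), v(0, 1), v(1, 1)
    det = v11 * v22 - v12 ** 2
    # Quadratic form using the closed-form inverse of a 2x2 symmetric matrix.
    return (v22 * cp[0] ** 2 - 2 * v12 * cp[0] * cp[1] + v11 * cp[1] ** 2) / det

# HYPOTHETICAL counts for 400 pairs (rows and columns ordered Yes, No, Unsure).
example = [[120, 30, 20], [25, 100, 15], [18, 22, 50]]
symmetric = [[100, 20, 10], [20, 100, 30], [10, 30, 80]]

x2_example = wald_marginal_homogeneity(example)
x2_symmetric = wald_marginal_homogeneity(symmetric)
print(x2_example, x2_symmetric)  # compare with the upper .05 chi-square point, 5.99 for 2 df
```

A perfectly symmetric table has identical row and column margins, so its statistic is zero; this gives a quick sanity check on the C matrix.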
3. A. (4 points) The quasi-independence model is $\log(m_{ij}) = \lambda + \lambda^A_i + \lambda^D_j$ for $i \ge j$.
In order to make the model identifiable and obtain values of parameter estimates, SAS
would impose the constraints $\lambda^A_5 = 0$ and $\lambda^D_5 = 0$.
B. (6 points) 6 degrees of freedom
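One way to arrive at this count is to subtract the number of free parameters from the number of cells that enter the model. The sketch below assumes five disability levels at admission and at discharge, which is suggested by the constraints on the level-5 parameters in part A but is not stated explicitly here.

```python
levels = 5  # ASSUMPTION: five disability levels, as implied by the level-5 constraints

# Cells with i >= j (admission level i, discharge level j) are the ones the model describes.
cells = sum(1 for i in range(1, levels + 1) for j in range(1, levels + 1) if i >= j)

# One intercept plus (levels - 1) free admission effects and (levels - 1) free discharge effects.
free_parameters = 1 + (levels - 1) + (levels - 1)

df = cells - free_parameters
print(cells, free_parameters, df)  # prints: 15 9 6
```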
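C. (6 points) This model implies that within any complete subtable, where patients do not get
worse, the discharge status is conditionally independent of the admission status. For
example, consider patients admitted at disability level 2. From the quasi-independence
model, the conditional probabilities that such patients are discharged at disability levels 1
and 2 are
\[ \frac{\pi_{21}}{\pi_{21} + \pi_{22}} = \frac{m_{21}}{m_{21} + m_{22}}
   = \frac{\exp(\lambda + \lambda^A_2 + \lambda^D_1)}{\exp(\lambda + \lambda^A_2 + \lambda^D_1) + \exp(\lambda + \lambda^A_2 + \lambda^D_2)}
   = \frac{\exp(\lambda^D_1)}{\exp(\lambda^D_1) + \exp(\lambda^D_2)} \]
and
\[ \frac{m_{22}}{m_{21} + m_{22}}
   = \frac{\exp(\lambda + \lambda^A_2 + \lambda^D_2)}{\exp(\lambda + \lambda^A_2 + \lambda^D_1) + \exp(\lambda + \lambda^A_2 + \lambda^D_2)}
   = \frac{\exp(\lambda^D_2)}{\exp(\lambda^D_1) + \exp(\lambda^D_2)} \]
respectively. Conditional on discharge at either level 1 or 2 for a patient admitted at level
of disability i>2, the conditional probability that the patient is discharged at level 1 is
\[ \frac{m_{i1}}{m_{i1} + m_{i2}}
   = \frac{\exp(\lambda + \lambda^A_i + \lambda^D_1)}{\exp(\lambda + \lambda^A_i + \lambda^D_1) + \exp(\lambda + \lambda^A_i + \lambda^D_2)}
   = \frac{\exp(\lambda^D_1)}{\exp(\lambda^D_1) + \exp(\lambda^D_2)} \]
and the conditional probability that the patient is discharged at level 2 is
\[ \frac{m_{i2}}{m_{i1} + m_{i2}}
   = \frac{\exp(\lambda + \lambda^A_i + \lambda^D_2)}{\exp(\lambda + \lambda^A_i + \lambda^D_1) + \exp(\lambda + \lambda^A_i + \lambda^D_2)}
   = \frac{\exp(\lambda^D_2)}{\exp(\lambda^D_1) + \exp(\lambda^D_2)} . \]
Neither probability depends on the admission level i, and both agree with the probabilities
for patients admitted at level 2.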
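The cancellation of the intercept and admission effects in part C can also be checked numerically. In the sketch below every parameter value is hypothetical and chosen only for illustration; the names `lam`, `lam_A`, and `lam_D` stand for the intercept, the admission effects, and the discharge effects of the quasi-independence model.

```python
import math

# HYPOTHETICAL parameter values; level 5 is set to zero as in the SAS constraints.
lam = 1.0
lam_A = {1: 0.8, 2: 0.5, 3: 0.2, 4: -0.3, 5: 0.0}
lam_D = {1: 0.4, 2: 0.9, 3: 0.1, 4: -0.2, 5: 0.0}

def m(i, j):
    """Expected count for admission level i and discharge level j (requires i >= j)."""
    return math.exp(lam + lam_A[i] + lam_D[j])

def prob_discharge_1_given_1_or_2(i):
    return m(i, 1) / (m(i, 1) + m(i, 2))

# The same value should appear for every admission level i >= 2.
probs = [prob_discharge_1_given_1_or_2(i) for i in range(2, 6)]
expected = math.exp(lam_D[1]) / (math.exp(lam_D[1]) + math.exp(lam_D[2]))
print(probs, expected)
```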
4. A. (4 points) Holding constant the year of entry, age, sex, and initial tumor status, β1 is
the natural logarithm of an odds ratio corresponding to the log-odds of one-year
survival for the combined treatment minus the log-odds of one-year survival for the
radiation only treatment.
B. (4 points) Holding constant the year of entry, age, sex, and treatment, $\alpha_1$ is the natural
logarithm of an odds ratio corresponding to the log-odds of one-year survival for the
patients with a single tumor smaller than 2 cm in diameter minus the log-odds of one-year
survival for patients with massive invasive tumors.
C. (8 points) A Wald test of the null hypothesis $H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4$ can be
constructed in the following manner. First write the null hypothesis in matrix form as
$H_0: C\boldsymbol{\alpha} = \mathbf{0}$ where $\boldsymbol{\alpha} = (\alpha_1\ \alpha_2\ \alpha_3\ \alpha_4)'$ and
\[ C = \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{bmatrix} . \]
MLEs of the parameters are
$\hat{\boldsymbol{\alpha}} = (\hat{\alpha}_1\ \hat{\alpha}_2\ \hat{\alpha}_3\ \hat{\alpha}_4)' = (0.5109\ \ 0.5313\ \ 0.3824\ \ 0)'$ and
\[ V = \begin{bmatrix}
 0.232646 & -0.09172 & -0.06540 & 0 \\
-0.09172 & 0.120184 & -0.00617 & 0 \\
-0.06540 & -0.00617 & 0.068344 & 0 \\
 0 & 0 & 0 & 0
\end{bmatrix} \]
is the estimated covariance matrix for $\hat{\boldsymbol{\alpha}}$
obtained by inverting the estimate of the Fisher information matrix. The Wald test
rejects the null hypothesis at the .05 level of significance if
$X^2 = (C\hat{\boldsymbol{\alpha}} - \mathbf{0})' (C V C')^{-1} (C\hat{\boldsymbol{\alpha}} - \mathbf{0})$ is larger than $\chi^2_{3,.05} = 7.81$. There are many other
C matrices that can be used to express the same null hypothesis and the same test.
A second approach would obtain MLEs of expected counts for the model specified in the
problem, say $\hat{\mathbf{m}}_a$, and a second set of MLEs of expected counts when the tumor stage
effects are deleted from the model, say $\hat{\mathbf{m}}_0$. Then, the null hypothesis would be
rejected if $X^2 = (\hat{\mathbf{m}}_a - \hat{\mathbf{m}}_0)' [Var(\hat{\mathbf{m}}_a - \hat{\mathbf{m}}_0)]^{-1} (\hat{\mathbf{m}}_a - \hat{\mathbf{m}}_0)$ is larger than $\chi^2_{3,.05} = 7.81$.
You could not evaluate this test statistic from the information provided in this problem.
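The first (matrix) form of the Wald statistic in part C can be evaluated from the quoted estimates. The sketch below uses only the standard library; the helper `solve` is an illustrative Gaussian-elimination routine written for this example, and the estimates and covariance matrix are the values quoted in the solution.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

alpha_hat = [0.5109, 0.5313, 0.3824, 0.0]
V = [[0.232646, -0.09172, -0.06540, 0.0],
     [-0.09172, 0.120184, -0.00617, 0.0],
     [-0.06540, -0.00617, 0.068344, 0.0],
     [0.0, 0.0, 0.0, 0.0]]
C = [[1, 0, 0, -1], [0, 1, 0, -1], [0, 0, 1, -1]]

# C alpha_hat and C V C'
Ca = [sum(C[r][k] * alpha_hat[k] for k in range(4)) for r in range(3)]
CV = [[sum(C[r][k] * V[k][c] for k in range(4)) for c in range(4)] for r in range(3)]
CVCt = [[sum(CV[r][k] * C[s][k] for k in range(4)) for s in range(3)] for r in range(3)]

# X^2 = (C alpha_hat)' (C V C')^{-1} (C alpha_hat), computed without forming the inverse.
x2 = sum(a * w for a, w in zip(Ca, solve(CVCt, Ca)))
print(round(x2, 2), x2 > 7.81)
```

With the values quoted above this comes to roughly 24.6, well above 7.81, so under this calculation the hypothesis of equal tumor stage effects would be rejected at the .05 level.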
D. (8 points) The estimated probability that a 40 year old female cancer patient with a
single tumor less than 2 cm in diameter would survive at least one year if she started the
combined radiation and chemotherapy treatment in 2002 is
\[ \hat{\pi} = \frac{\exp(\hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_2(26) + \hat{\beta}_3(10) + \hat{\beta}_4 + \hat{\alpha}_1)}{1 + \exp(\hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_2(26) + \hat{\beta}_3(10) + \hat{\beta}_4 + \hat{\alpha}_1)} = 0.999 \]
and a large sample estimate of the standard deviation of $\hat{\pi}$ is obtained as follows.
Evaluate first partial derivatives
\[ G = \left[ \frac{\partial\pi}{\partial\beta_0}\ \ \frac{\partial\pi}{\partial\beta_4}\ \ \frac{\partial\pi}{\partial\beta_1}\ \ \frac{\partial\pi}{\partial\beta_3}\ \ \frac{\partial\pi}{\partial\beta_2}\ \ \frac{\partial\pi}{\partial\alpha_1}\ \ \frac{\partial\pi}{\partial\alpha_2}\ \ \frac{\partial\pi}{\partial\alpha_3} \right]
     = \pi(1 - \pi)\,[1\ \ 1\ \ 1\ \ 10\ \ 26\ \ 1\ \ 0\ \ 0] . \]
Let V denote the inverse of the estimated Fisher information matrix shown in the
middle of page 8. Let $\hat{G}$ denote the value of G evaluated at $\hat{\pi}$ and let
$\mathbf{z}' = [1\ \ 1\ \ 1\ \ 10\ \ 26\ \ 1\ \ 0\ \ 0]$. Then, a large sample estimate of the standard deviation of
$\hat{\pi}$ is $S_{\hat{\pi}} = \hat{\pi}(1 - \hat{\pi}) \sqrt{\mathbf{z}' V \mathbf{z}}$, and an approximate 95% confidence interval for $\pi$ is
$\hat{\pi} \pm 1.96\, S_{\hat{\pi}}$.
A second approach would create the confidence interval for $\log(\pi/(1 - \pi))$, i.e.,
$\log(\hat{\pi}/(1 - \hat{\pi})) \pm (1.96) \sqrt{\mathbf{z}' V \mathbf{z}}$, where $\mathbf{z}' = [1\ \ 1\ \ 1\ \ 10\ \ 26\ \ 1\ \ 0\ \ 0]$. Then, convert
the endpoints of the resulting interval using the inverse-logit transformation.
One could also use a bootstrap procedure to construct a confidence interval.
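The first two interval constructions in part D can be compared in a small sketch. The estimate 0.999 comes from the solution, but the covariance matrix is not reproduced here, so the value used for the quadratic form z′Vz is purely hypothetical; `expit` is an illustrative name for the inverse-logit function.

```python
import math

def expit(x):
    """Inverse-logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))

pi_hat = 0.999   # estimated survival probability quoted in the solution
zVz = 1.0        # HYPOTHETICAL value of z' V z; the real value requires V from page 8
z_crit = 1.96

# First approach: delta-method standard error on the probability scale.
s_pi = pi_hat * (1 - pi_hat) * math.sqrt(zVz)
wald = (pi_hat - z_crit * s_pi, pi_hat + z_crit * s_pi)

# Second approach: interval on the logit scale, then transform the endpoints back.
logit_hat = math.log(pi_hat / (1 - pi_hat))
half_width = z_crit * math.sqrt(zVz)
logit_based = (expit(logit_hat - half_width), expit(logit_hat + half_width))

print(wald)
print(logit_based)
```

With an estimate this close to one, the interval built on the probability scale can extend past 1 (it does for this hypothetical z′Vz), while the transformed logit interval always stays inside (0, 1).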
E. (4 points) The smoothed plot of the Pearson residuals against age suggests a curved
trend that was not described by the linear age effect in the logistic regression model. The
smoothed plots of the residuals against year and tumor stage do not reveal any trends.
One would not expect to see any trend in the smoothed plot of the Pearson residuals
against tumor status in this case because tumor status was used as a classification
variable. Some people noticed the extreme negative residuals in 1978, the one in 1979, and
those for massive invasive tumors. These correspond to cases where the patient did not
survive and the estimated probability of survival was not very small. These are cases
where the model does not provide a good description of the data.
The Hosmer-Lemeshow test and the cumulative local deviance plot also indicate that
the proposed model is inadequate, but this was not part of this question.
F. (4 points) The value of gamma for this model is reported as 0.600. It is computed by
pairing the observed $Y_i = 0$ or $Y_i = 1$ outcome for each case with the estimated
probability of one-year survival $\hat{\pi}_i$. Two cases are concordant if either $\hat{\pi}_i < \hat{\pi}_j$ and
$Y_i < Y_j$, or if $\hat{\pi}_i > \hat{\pi}_j$ and $Y_i > Y_j$. Two cases are discordant if either $\hat{\pi}_i < \hat{\pi}_j$ and
$Y_i > Y_j$, or if $\hat{\pi}_i > \hat{\pi}_j$ and $Y_i < Y_j$. Two cases are tied if either $\hat{\pi}_i = \hat{\pi}_j$ or $Y_i = Y_j$.
Then, gamma is (C-D)/(C+D), where C is the number of concordant pairs and D is the
number of discordant pairs. The value of 0.6 indicates that, among the untied pairs of cases
with $Y_i = 1$ and $Y_j = 0$, the proportion for which the estimated logistic model yields
$\hat{\pi}_i > \hat{\pi}_j$ exceeds the proportion for which it yields $\hat{\pi}_i < \hat{\pi}_j$ by 0.60; that is, 80% of
these pairs are concordant and 20% are discordant.
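The pair counting behind gamma can be sketched directly from the definitions in part F. The five cases below are hypothetical and were chosen only so that the arithmetic is easy to follow; `gamma_statistic` is an illustrative function name, not the SAS routine that produced the reported value.

```python
from itertools import combinations

def gamma_statistic(y, pi_hat):
    """Return (concordant, discordant, gamma) over all pairs of cases."""
    concordant = discordant = 0
    for (yi, pi_i), (yj, pi_j) in combinations(zip(y, pi_hat), 2):
        sign = (yi - yj) * (pi_i - pi_j)
        if sign > 0:
            concordant += 1   # outcome and estimated probability ordered the same way
        elif sign < 0:
            discordant += 1   # ordered in opposite ways
        # sign == 0: tied on the outcome or on the estimate, so the pair is ignored
    return concordant, discordant, (concordant - discordant) / (concordant + discordant)

# HYPOTHETICAL outcomes and estimated survival probabilities for five cases.
y = [1, 1, 1, 0, 0]
pi_hat = [0.9, 0.7, 0.4, 0.6, 0.4]

c, d, gamma = gamma_statistic(y, pi_hat)
print(c, d, gamma)  # prints: 4 1 0.6
```

Here four of the five untied pairs are concordant (80%) and one is discordant (20%), which gives gamma = 0.6.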
G. (i) (4 points) By examining the dfbeta values for β 3 , the coefficient for the age
factor, you can determine if a single case has a large effect on the estimated value
of β 3 .
(ii) (4 points) Since h=.026 is close to the average leverage value of 8/229=.035,
subject 107 is not a high leverage case. Then,
\[ c = (r_i^*)^2 \, \frac{.026}{1 - .026} = 0.871 \]
implies that the absolute value of the adjusted Pearson residual for this subject is
$|r_i^*| = 5.71$. Either this subject survived one year and the estimated model provided
a very small estimated probability of survival, or this subject did not survive one
year and the estimated model provided a large estimate of the survival probability.
This case should be further investigated as a potential outlier, or it may indicate
that a better model is needed.
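The residual in part (ii) follows from solving the displayed relation for the adjusted Pearson residual. A minimal check of that arithmetic, using only the numbers quoted in the solution (the variable names are illustrative):

```python
import math

h = 0.026   # leverage for subject 107
c = 0.871   # reported influence value for subject 107

# Solve c = r^2 * h / (1 - h) for |r|.
r_star = math.sqrt(c * (1 - h) / h)

# Average leverage: 8 model parameters spread over 229 cases.
average_leverage = 8 / 229

print(round(r_star, 2), round(average_leverage, 3))  # prints: 5.71 0.035
```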
5. A. (6 points) Patients under 47 years of age had a high (77.3%) one-year survival rate. One-year survival rates for patients older than 46 with massive invasive tumors were low (13.8%).
For patients over 46 years of age with single tumors, the combined chemotherapy and
radiation treatment (93.5% survival for patients between 46 and 64) appears to be more
effective than the radiation only treatment (30.2% survival for patients over 46). For patients
over 64 with single tumors who received the combined treatment, survival rates were higher
after 1978. You might check if the delivery of the combined treatment was changed near the
end of 1978 so that it could be better tolerated by older patients. Sex of the patient does not
appear to be an important factor.
B. (4 points) After looking at this estimated tree, one might consider some of the following
modifications to the logistic regression model:
i.   Convert tumor stage into a binary factor: single versus massive invasive tumors.
ii.  Include an interaction between tumor stage and treatment to allow the combined
     treatment to be more effective for patients with non-invasive single tumors.
iii. Examine additional interactions between treatment and age and three-way
     interactions between tumor stage, treatment, and age.
iv.  For older patients, allow the combined treatment to become more effective in later
     years.
Alternatively, you could divide the patients into three groups: those under 47, who have a
good chance of survival; those older than 64, who have a poor chance of survival; and
those in between. You could then fit different logistic regression models within each group.
You may find that tumor status and treatment are only important in the middle age group.
There are a total of 100 points on this exam. You should have a score for each part of the
exam recorded on your paper. The scores are shown below.
[Stem-and-leaf display of exam scores, stems 4 through 9; the leaves were not preserved.]