Stat 565 Solutions to Assignment 5 Fall 2005

1. (a) Partial likelihood estimates of the coefficients in the proportional hazards model using
Z1, Z2, Z3, Z4, Z5, Z6, Z7, and Z8 are shown below with standard errors. These results
suggest that, after adjusting for the effects of the other variables in the model, the patient-donor age
interaction (Z3), low risk acute myelocytic leukemia (Z4), and the FAB morphology score
(Z6) have significant associations with disease free survival.
                 Without          With
Criterion        Covariates       Covariates
-2 LOG L         746.591          711.884
AIC              746.591          727.884
SBC              746.591          747.235
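The AIC and SBC values above, and the likelihood ratio chi-square reported in the global test, can be recovered from the two -2 log L values. The sketch below is only an arithmetic check in Python and is not part of the original SAS analysis. The function name `fit_criteria` is hypothetical, and the count of 83 used in the SBC penalty is an assumption, back-solved from the printed SBC value.

```python
import math

def fit_criteria(neg2logl_null, neg2logl_model, n_params, n_for_sbc):
    """Return (LR chi-square, AIC, SBC) from -2 log partial likelihoods."""
    lr_chisq = neg2logl_null - neg2logl_model
    aic = neg2logl_model + 2 * n_params
    sbc = neg2logl_model + n_params * math.log(n_for_sbc)
    return lr_chisq, aic, sbc

# -2 LOG L values from the output; n_for_sbc=83 is an assumption (see lead-in)
lr, aic, sbc = fit_criteria(746.591, 711.884, n_params=8, n_for_sbc=83)
print(f"LR chi-square = {lr:.3f}, AIC = {aic:.3f}, SBC = {sbc:.3f}")
```

Calling the same function with -2 log L = 705.423 and 5 parameters gives the criteria for the reduced model in part (f), so the two models can be compared on a common footing.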
Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square     DF     Pr > ChiSq
Likelihood Ratio        34.7072      8         <.0001
Score                   38.6903      8         <.0001
Wald                    35.6080      8         <.0001
Analysis of Maximum Likelihood Estimates

                  Parameter     Standard                                Hazard
Variable    DF     Estimate        Error    Chi-Square    Pr > ChiSq     Ratio
z1           1      0.00280      0.02002        0.0196        0.8888     1.003
z2           1      0.00370      0.01821        0.0412        0.8392     1.004
z3           1      0.00307    0.0009537       10.3345        0.0013     1.003
z4           1     -1.07378      0.37108        8.3730        0.0038     0.342
z5           1     -0.38601      0.37259        1.0733        0.3002     0.680
z6           1      0.83695      0.27962        8.9590        0.0028     2.309
z7           1     -0.00781      0.01144        0.4666        0.4945     0.992
z8           1      0.30420      0.25298        1.4459        0.2292     1.356
The significant positive coefficient for the patient-donor age interaction suggests that the
hazard of failure increases as both the patient age and the donor age increase. A scatter plot
matrix of the covariates is shown below. It is easily seen that older donors tend to be matched
with older patients, so it is difficult to separate the effects of patient age from donor age. This
is why the interaction was significant while the main effects were not. As patient age,
donor age, or both increase, the risk of failure increases. The estimated coefficient for Z4 is
-1.074, which indicates that the risk of failure for patients with low risk AML is about 1/3 of the risk
of failure for ALL patients with similar values of the other risk factors. The estimated
coefficient for Z6 is 0.837, which indicates that the risk of failure for patients with FAB scores of
4 or 5 is more than double the risk of failure for patients with FAB scores below 4, when the
other risk factors have about the same values. You should also examine confidence intervals
for the hazard ratios.
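As a hedged illustration of that last point, a hazard ratio and an approximate 95% Wald interval can be computed from each estimate and its standard error as exp(b ± 1.96 se). The Python sketch below uses the printed values for z4 and z6; the helper name `hazard_ratio_ci` is hypothetical, and the normal approximation is an assumption rather than something the SAS output reports.

```python
import math

def hazard_ratio_ci(estimate, std_err, z=1.96):
    """Hazard ratio exp(b) with a Wald interval exp(b -/+ z * se)."""
    return (math.exp(estimate),
            math.exp(estimate - z * std_err),
            math.exp(estimate + z * std_err))

# Estimates and standard errors for z4 and z6 from the part (a) output
for name, b, se in [("z4", -1.07378, 0.37108), ("z6", 0.83695, 0.27962)]:
    hr, lower, upper = hazard_ratio_ci(b, se)
    print(f"{name}: HR = {hr:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes 1 agrees with the Wald test rejecting a zero coefficient at the .05 level.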
The scatter plot matrix shown below was made with SAS using its interactive graphics
capability. This is accessed by clicking on the Solutions button at the top of the SAS window,
selecting Analyze, and then selecting Interactive Data Analysis. Then select your data set
from the WORK library and proceed as you would with JMP.
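[Figure: scatter plot matrix of the covariates z1, z2, z4, z5, z6, z7, and z8. Axis ranges: z1 from -21 to 24, z2 from -26 to 28, z7 from 0.7890 to 86.0055; z4, z5, z6, and z8 are 0/1 indicators.]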
(c) The scatter plot matrix shown above also reveals some potentially influential cases. There
are four cases that had very long waiting times for transplants (longer than 40 months). Those
cases were below the median age for patients and near the median age for donors. They were
ALL patients with FAB scores below 4 who were all treated with MTX. They are patients 2, 6,
8 and 26. Patient 26 died 122 days after transplant, while patients 2, 6 and 8 were censored
at least 996 days after transplant. The dfbeta plot for waiting time for transplant (Z7)
identifies patient 26 as an influential case. Including patient 26 increases the value of the
coefficient of Z7 and increases the estimated risk of transplant failure associated with long
waits for a transplant. Patients 2, 6, and 8 were not identified as influential cases; they all had
relatively long censoring times.
Patient 84 was identified as potentially influential in the dfbeta plots for Z1, Z2, and Z4. This
was a 34 year old, low risk AML patient who died or relapsed within 10 days after transplant.
The donor was much older. Including this patient in the data leads to a larger positive estimate
of the coefficient on the donor age (Z2), a smaller positive estimate of the coefficient on Z1,
and a less negative estimate of the coefficient on Z4. The short failure time may be unusual,
but the case should be kept in the data if the recorded values are correct.
Patient 35 was identified as potentially influential in the dfbeta plot for Z5 , high risk AML
patients. The disease free survival time was censored at 845 days for this patient. Including
this patient in the data leads to a more extreme negative estimate of the coefficient for high risk
AML patients.
Patient 137 was identified as potentially influential in the dfbeta plot for Z3. This was a 52
year old patient with a 48 year old donor. This patient died at 366 days. Including this patient
in the data leads to a less positive estimate of the coefficient for Z3, the product of the patient
and donor ages.
The validity of the data for these cases should be checked and you should discuss these cases
with the medical experts in the study.
(d) Plots of the Schoenfeld residuals against time did not reveal any major time dependencies
with the possible exception of Z8 , treatment with MTX. The effect of MTX in reducing
the risk of failure may decline over the first year and then stabilize at essentially no
effect. This can be further investigated by adding an interaction between time and
treatment with MTX to the Cox model.
(e) The plot of martingale residuals obtained when Z3 was deleted from the model revealed
a straight line with a positive slope, indicating that a linear effect of Z3 is the only
effect of Z3 needed in the model. Similar plots of martingale residuals obtained by
deleting Z1, Z2, or Z7 from the model revealed variation about essentially horizontal
lines. This suggests that there is no significant effect for any of those variables after
adjusting for the effects of the other variables in the model. The plots are shown below.
[Figures: martingale residual plots for Z3, Z1, Z2, and Z7, each adjusted for the other variables.]
(f) There was a significant interaction between use of MTX and time. One reasonable model
is the Cox model with Z3, Z4, Z6, Z8, and Z8,time, where

$$
Z_{8,\text{time}} =
\begin{cases}
\text{time} \times Z_8 & \text{if time} < 400 \text{ days} \\
400 \times Z_8 & \text{if time} \geq 400 \text{ days}
\end{cases}
$$

The estimated coefficients and other information are given below. The AIC and SBC
values are improvements over the model in part (a).
                 Without          With
Criterion        Covariates       Covariates
-2 LOG L         746.591          705.423
AIC              746.591          715.423
SBC              746.591          727.517
Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square     DF     Pr > ChiSq
Likelihood Ratio        41.1687      5         <.0001
Score                   45.7147      5         <.0001
Wald                    42.0493      5         <.0001
Analysis of Maximum Likelihood Estimates

                  Parameter     Standard                                Hazard
Variable    DF     Estimate        Error    Chi-Square    Pr > ChiSq     Ratio
z3           1      0.00275    0.0008726        9.9353        0.0016     1.003
z4           1     -0.79732      0.24953       10.2102        0.0014     0.451
z6           1      0.76298      0.22816       11.1825        0.0008     2.145
z8           1      1.34941      0.42796        9.9423        0.0016     3.855
z8time       1     -0.00521      0.00198        6.9344        0.0085     0.995
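The Chi-Square and Pr > ChiSq columns in this output are Wald tests, which can be reproduced from the estimate and standard error alone. The Python sketch below is a check on the printed values under that reading; the function name `wald_test` is hypothetical, and small differences from the SAS output come from rounding in the printed estimates.

```python
import math

def wald_test(estimate, std_err):
    """Wald chi-square (1 df) and its p-value for a single coefficient."""
    chisq = (estimate / std_err) ** 2
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chisq / 2))
    return chisq, p_value

# (estimate, standard error) pairs from the part (f) output
rows = {
    "z3": (0.00275, 0.0008726),
    "z4": (-0.79732, 0.24953),
    "z6": (0.76298, 0.22816),
    "z8": (1.34941, 0.42796),
    "z8time": (-0.00521, 0.00198),
}
for name, (b, se) in rows.items():
    chisq, p = wald_test(b, se)
    print(f"{name}: chi-square = {chisq:.3f}, p = {p:.4f}")
```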
The estimated coefficient for the low risk AML indicator variable is -0.79732 and the
associated hazard ratio, exp(-0.79732)=0.45, indicates that at any time point the risk of
failure is less than half of that for either high risk AML or ALL patients with similar
values for the other covariates in the model. The estimated coefficient for the FAB score
indicator indicates that patients with FAB scores above 3 will have about twice the
risk of failure at any time point as patients with lower FAB scores. With other
covariates held constant, the estimated coefficient
for Z3 = (patient age - 28)(donor age - 28) indicates about a 0.3% increase in the hazard for each one
unit (year^2) increase in Z3. The effect of the age of the donor depends on the age of the
patient. For a 28 year old patient, the donor's age has essentially no effect. For a 31 year
old patient, a one year increase in the age of the donor increases the hazard by a factor of
exp((.00275)(3))=exp(.00825)=1.0083, or about 0.83%. For a 58 year old patient, a one
year increase in the age of the donor increases the hazard by a factor of
exp((.00275)(30))=exp(.0825)=1.086, or about 8.6%. When there is a significant
interaction between two covariates, the effect of a one unit increase in one of the
covariates will be modified by the level of the other covariate involved in the interaction.
The effect of treatment with MTX changes over the first 400 days and then remains constant.
Patients treated with MTX have more than three times the risk of very early failure
(hazard ratio exp(1.34941)=3.86 at time zero), but this declines to the same risk of failure
as those not treated with MTX after about 259 days (1.34941/0.00521). After 400 days,
patients treated with MTX may have about a 52% lower hazard of failure, since
exp(1.34941-(400)(0.00521))=0.48.
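The time-varying MTX effect and the patient-dependent donor age effect can both be traced directly from the tabled estimates for z3, z8, and z8time. The Python sketch below does this under the assumption that the part (f) output (0.00275, 1.34941, and -0.00521) is the fitted model being interpreted; the function and constant names are hypothetical.

```python
import math

# Estimates taken from the part (f) output
B_Z3 = 0.00275       # (patient age - 28) x (donor age - 28)
B_Z8 = 1.34941       # MTX indicator
B_Z8TIME = -0.00521  # MTX x min(time, 400 days)

def mtx_hazard_ratio(days):
    """Hazard ratio, MTX vs. no MTX, at a given number of days after transplant."""
    return math.exp(B_Z8 + B_Z8TIME * min(days, 400))

def donor_age_hazard_ratio(patient_age):
    """Hazard ratio for a one-year increase in donor age at a given patient age."""
    return math.exp(B_Z3 * (patient_age - 28))

# Day on which the log hazard ratio for MTX crosses zero
crossover_days = -B_Z8 / B_Z8TIME

for t in (0, 100, 200, 400, 1000):
    print(f"day {t}: MTX hazard ratio = {mtx_hazard_ratio(t):.2f}")
print(f"MTX hazard ratio equals 1 near day {crossover_days:.0f}")
for age in (28, 31, 58):
    print(f"patient age {age}: donor-age hazard ratio = {donor_age_hazard_ratio(age):.4f}")
```

Capping the time term with `min(days, 400)` mirrors the piecewise definition of Z8,time, so the hazard ratio is flat after day 400.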
The SAS code for fitting this model is
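proc phreg data=set1;
   /* status=0 marks censored times; ties are handled with Efron's method */
   model time*status(0) = z3 z4 z6 z8 z8time / ties=efron;
   /* time-dependent covariate: the MTX effect changes linearly up to 400 days */
   z8time = z8*time;
   if (time > 400) then z8time = z8*400;
run;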
2. (a) Assuming that the two times recorded for the same patient are independent, estimates
of the regression parameters and their standard errors for the Cox proportional hazards
model using AGE10, SEX, AN, GN, and PKD as the covariates (the Independence
Working Model) are as follows:
coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd, data = set1,
method = "efron", robust = F)
n= 76

            coef  exp(coef)  se(coef)        z        p
age10   0.000936      1.001    0.0109   0.0855  9.3e-01
sex    -1.453564      0.234    0.3545  -4.1005  4.1e-05
gn      0.134087      1.143    0.4044   0.3316  7.4e-01
an      0.524409      1.689    0.4137   1.2676  2.0e-01
pkd    -1.357208      0.257    0.6271  -2.1644  3.0e-02

       exp(coef)  exp(-coef)  lower .95  upper .95
age10      1.001       0.999     0.9797      1.023
sex        0.234       4.278     0.1167      0.468
gn         1.143       0.875     0.5176      2.526
an         1.689       0.592     0.7510      3.801
pkd        0.257       3.885     0.0753      0.880

Likelihood ratio test = 18.2 on 5 df, p=0.00269
Wald test             = 20.3 on 5 df, p=0.00111
Score (logrank) test  = 20.4 on 5 df, p=0.00106
(b) The marginal proportional hazards model was fit, and a robust estimate of the
covariance matrix of the parameter estimates was computed to account for correlation among within-patient
recurrence times. The parameter estimates and standard errors are shown below. The
parameter estimates are the same as those for the independence working model (IWM). The robust estimates of the
standard errors for the estimated coefficients increased for the sex and pkd variables and
decreased for the other variables. A strong positive correlation between the two times to
infection measured on each patient would have led to increased standard errors for all of the
estimated coefficients. In this case the association between times to infection measured on the
same patient is not very strong. The increase in the standard error for the estimated
coefficient for the pkd variable was enough that the result is no longer significant at the .05
level. Results are shown below:
coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd + cluster(id),
data = set1, method = "efron", robust = T)
n= 76

            coef  exp(coef)  se(coef)  robust se       z  p-value
age10   0.000936      1.001    0.0109    0.00715   0.131  0.90000
sex    -1.453564      0.234    0.3545    0.40435  -3.595  0.00032
gn      0.134087      1.143    0.4044    0.28782   0.466  0.64000
an      0.524409      1.689    0.4137    0.32399   1.619  0.11000
pkd    -1.357208      0.257    0.6271    0.87450  -1.552  0.12000

       exp(coef)  exp(-coef)  lower .95  upper .95
age10      1.001       0.999     0.9870      1.015
sex        0.234       4.278     0.1058      0.516
gn         1.143       0.875     0.6505      2.010
an         1.689       0.592     0.8953      3.188
pkd        0.257       3.885     0.0464      1.429

Likelihood ratio test = 18.2 on 5 df, p=0.00269
Wald test             = 18.3 on 5 df, p=0.00265
Score (logrank) test  = 20.4 on 5 df, p=0.00106, Robust = 11.7 p=0.0398
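The change in significance for pkd comes entirely from the standard error, since the coefficient is identical in both fits. The Python sketch below recomputes the z statistic and two-sided normal p-value for pkd under the model-based and robust standard errors; the function name `z_and_p` is hypothetical, and the values are copied from the two outputs above.

```python
import math

def z_and_p(coef, std_err):
    """z statistic and two-sided normal p-value for one coefficient."""
    z = coef / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

coef_pkd = -1.357208  # same estimate in parts (a) and (b)
for label, se in [("model-based", 0.6271), ("robust", 0.87450)]:
    z, p = z_and_p(coef_pkd, se)
    print(f"pkd, {label} se = {se}: z = {z:.3f}, p = {p:.3f}")
```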
(c) Using the same set of covariates as in parts (a) and (b), a gamma frailty model was fit to
the data with a separate random effect for each patient. Estimates of coefficients and
standard errors are shown below. For these data the standard errors for the estimated
coefficients provided by the frailty model are more similar to the standard errors for the
independence working model than the robust standard errors. The estimated coefficients
are similar for all three models. The estimate of the variance of the frailty is relatively
small.
coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd
+ frailty(id, distribution = "gamma"), data = set1)
                                      coef  se(coef)     se2  Chisq    DF        p
age10                             0.000943    0.0110  0.0109   0.01  1.00  9.3e-01
sex                              -1.455191    0.3552  0.3544  16.78  1.00  4.2e-05
gn                                0.134773    0.4051  0.4042   0.11  1.00  7.4e-01
an                                0.525173    0.4144  0.4136   1.61  1.00  2.1e-01
pkd                              -1.355156    0.6287  0.6269   4.65  1.00  3.1e-02
frailty(id, distribution=gamma)                                0.13  0.09  4.5e-01

       exp(coef)  exp(-coef)  lower .95  upper .95
age10      1.001       0.999     0.9797      1.023
sex        0.233       4.285     0.1163      0.468
gn         1.144       0.874     0.5173      2.531
an         1.691       0.591     0.7504      3.809
pkd        0.258       3.877     0.0752      0.884

Iterations: 5 outer, 27 Newton-Raphson
Variance of random effect= 0.00224   I-likelihood = -179.1
Likelihood ratio test = 18.5 on 5.07 df, p=0.00254
Wald test             = 20.2 on 5.07 df, p=0.00121
Results for a frailty model with Gaussian random effects are shown below. This model
produces slightly different estimates of coefficients and model-based standard errors are larger.
The robust corrected standard errors (se2), however, are close to those for the gamma frailty
model. This might suggest that the gamma model provides a better description of the data than
the Gaussian frailty model.
coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd
+ frailty(id, distribution = "gaussian"), data = set1)
                                         coef  se(coef)     se2  Chisq    DF       p
age10                                 0.00208    0.0163  0.0102   0.02   1.0  0.9000
sex                                  -1.74289    0.4923  0.3608  12.53   1.0  0.0004
gn                                    0.26049    0.5945  0.3874   0.19   1.0  0.6600
an                                    0.69302    0.5966  0.4227   1.35   1.0  0.2500
pkd                                  -1.00601    0.8802  0.6047   1.31   1.0  0.2500
frailty(id, distribution=Gaussian)                                24.20  14.7  0.0560

       exp(coef)  exp(-coef)  lower .95  upper .95
age10      1.002       0.998     0.9705      1.035
sex        0.175       5.714     0.0667      0.459
gn         1.298       0.771     0.4047      4.161
an         2.000       0.500     0.6211      6.439
pkd        0.366       2.735     0.0651      2.053

Variance of random effect= 0.692
Likelihood ratio test = 57 on 17.1 df, p=3.38e-06
Wald test             = 14.7 on 17.1 df, p=0.622
(d) The results from the gamma frailty model suggest that men are about two to nine times
more likely than women to experience infections. Those with the pkd form of kidney disease
may be less likely to experience infections. Age appears to have almost no association
with risk of infection, after adjusting for gender and type of disease.
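The "two to nine times" range corresponds to the confidence interval for exp(-coef) for sex, which is the reciprocal of the printed interval for exp(coef). The Python sketch below shows the inversion using the gamma frailty estimates. It assumes that sex is coded so that the negative coefficient represents women relative to men (the coding is not shown in the output), and the function name `inverted_hr_ci` is hypothetical.

```python
import math

def inverted_hr_ci(coef, std_err, z=1.96):
    """Hazard ratio and Wald interval for the reference group vs. the coded group."""
    point = math.exp(-coef)
    lower = math.exp(-coef - z * std_err)
    upper = math.exp(-coef + z * std_err)
    return point, lower, upper

# sex coefficient and se(coef) from the gamma frailty output
point, lower, upper = inverted_hr_ci(-1.455191, 0.3552)
print(f"men vs. women: HR = {point:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```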
(e) The p-value for the test of the null hypothesis that the variance of the gamma frailty is
zero is reported as 0.45. The null hypothesis is not rejected. The variance of the gamma frailty
appears to be quite small. This suggests a rather weak correlation between times to infection
measured on the same patient. Results for the Gaussian frailty model are somewhat different.
A p-value of 0.056 is reported for the test of the null hypothesis that the variance of the
Gaussian frailty is zero. All of the models indicate that gender is a significant risk factor and
that women have lower risk of infection than men. The gamma frailty model also suggests a
significantly lower risk of infection for patients with the pkd form of kidney disease. The
latter inference is not supported by the Gaussian frailty model or by the use of robust estimation for
standard errors.
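One way to put the small gamma frailty variance on a correlation scale is Kendall's tau, which for a shared gamma frailty with variance θ equals θ/(θ + 2). This is a standard property of the gamma frailty model and is not part of the original output. The Python sketch below applies it to the reported variance of 0.00224; the function name is hypothetical.

```python
def kendalls_tau_gamma(frailty_variance):
    """Kendall's tau implied by a shared gamma frailty with the given variance."""
    return frailty_variance / (frailty_variance + 2.0)

# "Variance of random effect" from the gamma frailty fit
tau = kendalls_tau_gamma(0.00224)
print(f"Kendall's tau = {tau:.5f}")
```

A tau this close to zero is in line with the conclusion that the two infection times on the same patient are only weakly associated under the gamma model. The formula does not carry over to the Gaussian frailty fit.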