Stat 565 Solutions to Assignment 5 Fall 2005

1. (a) Partial likelihood estimates of the coefficients in the proportional hazards model using Z1, Z2, Z3, Z4, Z5, Z6, Z7, Z8 are shown below with standard errors. These results suggest that, after adjusting for the effects of the other variables in the model, the patient-donor age interaction (Z3), low risk acute myelocytic leukemia (Z4), and the FAB morphology score (Z6) have significant associations with disease free survival.

    Criterion    Without Covariates    With Covariates
    -2 LOG L     746.591               711.884
    AIC          746.591               727.884
    SBC          746.591               747.235

    Testing Global Null Hypothesis: BETA=0

    Test                Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio    34.7072       8     <.0001
    Score               38.6903       8     <.0001
    Wald                35.6080       8     <.0001

    Analysis of Maximum Likelihood Estimates

                    Parameter    Standard                               Hazard
    Variable   DF   Estimate     Error        Chi-Square   Pr > ChiSq   Ratio
    z1         1     0.00280     0.02002       0.0196      0.8888       1.003
    z2         1     0.00370     0.01821       0.0412      0.8392       1.004
    z3         1     0.00307     0.0009537    10.3345      0.0013       1.003
    z4         1    -1.07378     0.37108       8.3730      0.0038       0.342
    z5         1    -0.38601     0.37259       1.0733      0.3002       0.680
    z6         1     0.83695     0.27962       8.9590      0.0028       2.309
    z7         1    -0.00781     0.01144       0.4666      0.4945       0.992
    z8         1     0.30420     0.25298       1.4459      0.2292       1.356

The significant positive coefficient for the patient-donor age interaction suggests that the hazard of failure increases as both the patient age and the donor age increase. A scatter plot matrix of the covariates is shown below. It is easily seen that older donors tend to be matched with older patients, so it is difficult to separate the effects of patient age from donor age. This is why the interaction was significant while the main effects were not. When the patient, the donor, or both are older, the risk of failure increases.

The estimated coefficient for Z4 is -1.074, which indicates that the risk of failure for patients with low risk AML is about 1/3 of the risk of failure for ALL patients with similar values of the other risk factors.
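The "about 1/3" figure is the exponentiated coefficient, exp(-1.07378) = 0.342. As a minimal sketch, the following Python code recomputes the hazard ratios for the three significant covariates from the coefficients and standard errors in the table above and attaches approximate 95% Wald confidence intervals. The function name hazard_ratio_with_ci is hypothetical, and the interval assumes approximate normality of the coefficient estimate on the log-hazard scale.

```python
import math

def hazard_ratio_with_ci(coef, se, z_crit=1.959964):
    """Return (hazard ratio, lower, upper) for a Wald interval.

    The interval is built on the log-hazard scale and then exponentiated.
    z_crit is the 97.5th percentile of the standard normal distribution.
    """
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    return hr, lower, upper

# Coefficients and standard errors copied from the part (a) table.
estimates = {
    "z3": (0.00307, 0.0009537),   # patient-donor age interaction
    "z4": (-1.07378, 0.37108),    # low risk AML indicator
    "z6": (0.83695, 0.27962),     # FAB score indicator
}

for name, (coef, se) in estimates.items():
    hr, lower, upper = hazard_ratio_with_ci(coef, se)
    print(f"{name}: HR = {hr:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes 1 agrees with a Wald test that is significant at the .05 level, so the intervals for z4 and z6 give the same qualitative conclusion as the p-values in the table while also showing how imprecisely the size of each effect is estimated.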
The estimated coefficient for Z6 is 0.837, which indicates that the risk of failure for patients with FAB scores of 4 or 5 is more than double the risk of failure for patients with FAB scores below 4, when the other risk factors have about the same values. You should also examine confidence intervals for the hazard ratios.

The scatter plot matrix shown below was made with SAS using the interactive graphics capability. This is accessed by clicking on the Solutions button at the top of the SAS window, selecting Analysis, and then selecting Interactive Data Analysis. Then select your data set from the WORK library and proceed as you would with JMP.

[Figure: scatter plot matrix of the covariates z1, z2, z4, z5, z6, z7, z8]

(c) The scatter plot matrix shown above also reveals some potentially influential cases. There are four cases that had very long waiting times for transplants (longer than 40 months). Those cases were below the median age for patients and near the median age for donors. They were ALL patients with FAB scores below 4 who were all treated with MTX. They are patients 2, 6, 8 and 26. Patient 26 died at 122 days after transplant, while patients 2, 6 and 8 were censored at 996 days or more after transplant.

The dfbeta plot for waiting time for transplant (Z7) identifies patient 26 as an influential case. Including patient 26 increases the value of the coefficient of Z7 and increases the estimated risk of transplant failure associated with long waits for a transplant. Patients 2, 6, and 8 were not identified as influential cases; they all had relatively long censoring times.

Patient 84 was identified as potentially influential in the dfbeta plots for Z1, Z2, and Z4. This was a 34 year old, low risk AML patient who died or relapsed within 10 days after transplant. The donor was much older.
Including this patient in the data leads to a larger positive estimate of the coefficient for donor age (Z2), a smaller positive estimate of the coefficient for Z1, and a less negative estimate of the coefficient for Z4. The short failure time may be unusual, but the case should be kept in the data if the recorded values are correct.

Patient 35 was identified as potentially influential in the dfbeta plot for Z5, the indicator for high risk AML patients. The disease free survival time was censored at 845 days for this patient. Including this patient in the data leads to a more extreme negative estimate of the coefficient for high risk AML patients.

Patient 137 was identified as potentially influential in the dfbeta plot for Z3. This was a 52 year old patient with a 48 year old donor. This patient died at 366 days. Including this patient in the data leads to a less positive estimate of the coefficient for Z3, the product of the patient and donor ages.

The validity of the data for these cases should be checked, and these cases should be discussed with the medical experts in the study.

(d) Plots of the Schoenfeld residuals against time did not reveal any major time dependencies, with the possible exception of Z8, treatment with MTX. The effect of MTX in reducing the risk of failure may decline over the first year and then stabilize at essentially no effect. This can be further investigated by adding an interaction between time and treatment with MTX to the Cox model.

(e) The plot of martingale residuals obtained when Z3 was deleted from the model revealed a straight line with a positive slope, indicating that a linear effect of Z3 is the only effect of Z3 needed in the model. Similar plots of martingale residuals obtained by deleting Z1, Z2, or Z7 from the model revealed variation about essentially horizontal lines. This suggests that there is no significant effect for any of those variables after adjusting for the effects of the other variables in the model.
The plots are shown below.

[Figures: martingale residual plots for Z3, Z1, Z2, and Z7, each adjusted for the other variables]

(f) There was a significant interaction between use of MTX and time. One reasonable model is the Cox model with Z3, Z4, Z6, Z8 and Z8,time, where

\[
Z_{8,\mathrm{time}} =
\begin{cases}
\mathrm{time} \times Z_8 & \text{if time} < 400 \text{ days} \\
400 \times Z_8 & \text{if time} \ge 400 \text{ days}
\end{cases}
\]

The estimated coefficients and other information are given below. The AIC and SBC values are improvements over the model in part (a).

    Criterion    Without Covariates    With Covariates
    -2 LOG L     746.591               705.423
    AIC          746.591               715.423
    SBC          746.591               727.517

    Testing Global Null Hypothesis: BETA=0

    Test                Chi-Square    DF    Pr > ChiSq
    Likelihood Ratio    41.1687       5     <.0001
    Score               45.7147       5     <.0001
    Wald                42.0493       5     <.0001

    Analysis of Maximum Likelihood Estimates

                    Parameter    Standard                               Hazard
    Variable   DF   Estimate     Error        Chi-Square   Pr > ChiSq   Ratio
    z3         1     0.00275     0.0008726     9.9353      0.0016       1.003
    z4         1    -0.79732     0.24953      10.2102      0.0014       0.451
    z6         1     0.76298     0.22816      11.1825      0.0008       2.145
    z8         1     1.34941     0.42796       9.9423      0.0016       3.855
    z8time     1    -0.00521     0.00198       6.9344      0.0085       0.995

The estimated coefficient for the low risk AML indicator variable is -0.79732, and the associated hazard ratio, exp(-0.79732) = 0.45, indicates that at any time point the risk of failure is less than half of that for either high risk AML or ALL patients with similar values for the other covariates in the model. The estimated coefficient for the FAB score indicator indicates that patients with FAB scores above 3 will have about twice the risk of failure at any time point as patients with lower FAB scores.

With other covariates held constant, the estimated coefficient for Z3 = (patient age - 28)(donor age - 28) indicates about a 0.3% increase in the hazard for each one unit (year^2) increase in Z3. The effect of the age of the donor therefore depends on the age of the patient. For a 28 year old patient, the donor's age has essentially no effect.
For a 31 year old patient, a one year increase in the age of the donor increases the hazard by a factor of exp((0.00275)(3)) = exp(0.00825) = 1.0083, or about 0.83%. For a 58 year old patient, a one year increase in the age of the donor increases the hazard by a factor of exp((0.00275)(30)) = exp(0.0825) = 1.086, or about 8.6%. When there is a significant interaction between two covariates, the effect of a one unit increase in one of the covariates is modified by the level of the other covariate involved in the interaction.

The effect of treatment with MTX tends to diminish over time and then remains constant after 400 days. Patients treated with MTX have more than three times the risk of very early failure (hazard ratio exp(1.34941) = 3.86), but this declines to about the same risk of failure as those not treated with MTX after about 259 days (1.34941/0.00521). After 400 days, patients treated with MTX may have about half the hazard of failure of those not treated, since exp(1.34941 - (0.00521)(400)) = exp(-0.735) = 0.48.

The SAS code for fitting this model is

    proc phreg data=set1;
      model time*status(0) = z3 z4 z6 z8 z8time / ties=efron;
      /* time-dependent covariate: z8*time, held at z8*400 after 400 days */
      z8time = z8*time;
      if (time > 400) then z8time = z8*400;
    run;

2.
(a) Assuming that the two times recorded for the same patient are independent, estimates of the regression parameters and their standard errors for the Cox proportional hazards model using AGE10, SEX, AN, GN, and PKD as the covariates (the independence working model, IWM) are as follows:

    coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd,
          data = set1, method = "efron", robust = F)

    n= 76

               coef       exp(coef)  se(coef)   z         p
    age10      0.000936   1.001      0.0109     0.0855    9.3e-01
    sex       -1.453564   0.234      0.3545    -4.1005    4.1e-05
    gn         0.134087   1.143      0.4044     0.3316    7.4e-01
    an         0.524409   1.689      0.4137     1.2676    2.0e-01
    pkd       -1.357208   0.257      0.6271    -2.1644    3.0e-02

               exp(coef)  exp(-coef)  lower .95  upper .95
    age10      1.001      0.999       0.9797     1.023
    sex        0.234      4.278       0.1167     0.468
    gn         1.143      0.875       0.5176     2.526
    an         1.689      0.592       0.7510     3.801
    pkd        0.257      3.885       0.0753     0.880

    Likelihood ratio test = 18.2 on 5 df, p=0.00269
    Wald test             = 20.3 on 5 df, p=0.00111
    Score (logrank) test  = 20.4 on 5 df, p=0.00106

(b) The marginal proportional hazards model was fit, and a robust estimate of the covariance matrix of the parameter estimates was computed to account for correlation among within-patient recurrence times. The parameter estimates and standard errors are shown below. The parameter estimates are the same as those for the IWM. The robust estimates of the standard errors for the estimated coefficients increased for the sex and pkd variables and decreased for the other variables. A strong positive correlation between the two times to infection measured on each patient would have led to increased standard errors for all of the estimated coefficients. In this case the association between times to infection measured on the same patient is not very strong. The increase in the standard error for the estimated coefficient for the pkd variable was enough that the result is no longer significant at the .05 level.
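The change in significance for pkd can be checked by hand, since the Wald statistic is just the coefficient divided by its standard error. The sketch below uses the pkd coefficient and the two standard errors reported in the outputs for parts (a) and (b); the function name wald_z_and_p is hypothetical, and the two-sided p-value assumes a standard normal reference distribution.

```python
import math

def wald_z_and_p(coef, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = coef / se
    # P(|Z| > |z|) for a standard normal Z, via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

coef_pkd = -1.357208   # same point estimate in parts (a) and (b)
se_model = 0.6271      # model-based standard error, part (a)
se_robust = 0.87450    # robust (sandwich) standard error, part (b)

z_model, p_model = wald_z_and_p(coef_pkd, se_model)
z_robust, p_robust = wald_z_and_p(coef_pkd, se_robust)

print(f"model-based: z = {z_model:.3f}, p = {p_model:.3f}")
print(f"robust:      z = {z_robust:.3f}, p = {p_robust:.3f}")
```

Because the point estimate is unchanged, the entire difference between the two conclusions comes from the larger robust standard error.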
Results are shown below:

    coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd + cluster(id),
          data = set1, method = "efron", robust = T)

    n= 76

               coef       exp(coef)  se(coef)  robust se   z        p-value
    age10      0.000936   1.001      0.0109    0.00715     0.131    0.90000
    sex       -1.453564   0.234      0.3545    0.40435    -3.595    0.00032
    gn         0.134087   1.143      0.4044    0.28782     0.466    0.64000
    an         0.524409   1.689      0.4137    0.32399     1.619    0.11000
    pkd       -1.357208   0.257      0.6271    0.87450    -1.552    0.12000

               exp(coef)  exp(-coef)  lower .95  upper .95
    age10      1.001      0.999       0.9870     1.015
    sex        0.234      4.278       0.1058     0.516
    gn         1.143      0.875       0.6505     2.010
    an         1.689      0.592       0.8953     3.188
    pkd        0.257      3.885       0.0464     1.429

    Likelihood ratio test = 18.2 on 5 df, p=0.00269
    Wald test             = 18.3 on 5 df, p=0.00265
    Score (logrank) test  = 20.4 on 5 df, p=0.00106,   Robust = 11.7  p=0.0398

(c) Using the same set of covariates as in parts (a) and (b), a gamma frailty model was fit to the data with a separate random effect for each patient. Estimates of coefficients and standard errors are shown below. For these data the standard errors for the estimated coefficients provided by the frailty model are more similar to the standard errors for the independence working model than to the robust standard errors. The estimated coefficients are similar for all three models. The estimate of the variance of the frailty is relatively small.
    coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd +
          frailty(id, distribution = "gamma"), data = set1)

                                      coef       se(coef)  se2      Chisq   DF     p
    age10                             0.000943   0.0110    0.0109    0.01   1.00   9.3e-01
    sex                              -1.455191   0.3552    0.3544   16.78   1.00   4.2e-05
    gn                                0.134773   0.4051    0.4042    0.11   1.00   7.4e-01
    an                                0.525173   0.4144    0.4136    1.61   1.00   2.1e-01
    pkd                              -1.355156   0.6287    0.6269    4.65   1.00   3.1e-02
    frailty(id, distribution=gamma)                                  0.13   0.09   4.5e-01

               exp(coef)  exp(-coef)  lower .95  upper .95
    age10      1.001      0.999       0.9797     1.023
    sex        0.233      4.285       0.1163     0.468
    gn         1.144      0.874       0.5173     2.531
    an         1.691      0.591       0.7504     3.809
    pkd        0.258      3.877       0.0752     0.884

    Iterations: 5 outer, 27 Newton-Raphson
    Variance of random effect= 0.00224   I-likelihood = -179.1
    Likelihood ratio test = 18.5 on 5.07 df, p=0.00254
    Wald test             = 20.2 on 5.07 df, p=0.00121

Results for a frailty model with Gaussian random effects are shown below. This model produces slightly different estimates of the coefficients, and the model-based standard errors are larger. The robust corrected standard errors (se2), however, are close to those for the gamma frailty model. This might suggest that the gamma model provides a better description of the data than the Gaussian frailty model.
    coxph(formula = Surv(time, status) ~ age10 + sex + gn + an + pkd +
          frailty(id, distribution = "gaussian"), data = set1)

                                         coef      se(coef)  se2      Chisq   DF     p
    age10                                0.00208   0.0163    0.0102    0.02    1.0   0.9000
    sex                                 -1.74289   0.4923    0.3608   12.53    1.0   0.0004
    gn                                   0.26049   0.5945    0.3874    0.19    1.0   0.6600
    an                                   0.69302   0.5966    0.4227    1.35    1.0   0.2500
    pkd                                 -1.00601   0.8802    0.6047    1.31    1.0   0.2500
    frailty(id, distribution=Gaussian)                                24.20   14.7   0.0560

               exp(coef)  exp(-coef)  lower .95  upper .95
    age10      1.002      0.998       0.9705     1.035
    sex        0.175      5.714       0.0667     0.459
    gn         1.298      0.771       0.4047     4.161
    an         2.000      0.500       0.6211     6.439
    pkd        0.366      2.735       0.0651     2.053

    Variance of random effect= 0.692
    Likelihood ratio test = 57   on 17.1 df, p=3.38e-06
    Wald test             = 14.7 on 17.1 df, p=0.622

(d) The results from the gamma frailty model suggest that men are about two to nine times more likely than women to experience infections. Those with the pkd form of kidney disease may be less likely to experience infections. Age appears to have almost no association with risk of infection, after adjusting for gender and type of disease.

(e) The p-value for the test of the null hypothesis that the variance of the gamma frailty is zero is reported as 0.45. The null hypothesis is not rejected. The variance of the gamma frailty appears to be quite small. This suggests a rather weak correlation between times to infection measured on the same patient. Results for the Gaussian frailty model are somewhat different. A p-value of 0.056 is reported for the test of the null hypothesis that the variance of the Gaussian frailty is zero. All of the models indicate that gender is a significant risk factor and that women have a lower risk of infection than men. The gamma frailty model also suggests a significantly lower risk of infection for patients with the pkd form of kidney disease. The latter inference is not supported by the Gaussian frailty model or by the use of robust estimation for standard errors.
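The "two to nine times" statement in part (d) can be recovered from the gamma frailty output by inverting the hazard ratio for sex and its 95% limits. The sketch below assumes that the sex coefficient compares women to men, which is consistent with the conclusion that women have the lower risk; the function name invert_hazard_ratio is hypothetical.

```python
def invert_hazard_ratio(hr, lower, upper):
    """Re-express a hazard ratio and its interval with the groups swapped.

    Taking reciprocals reverses the order of the limits, so the new lower
    limit is 1/upper and the new upper limit is 1/lower.
    """
    return 1.0 / hr, 1.0 / upper, 1.0 / lower

# exp(coef) and 95% limits for sex from the gamma frailty model output.
hr_mw, lo_mw, hi_mw = invert_hazard_ratio(0.233, 0.1163, 0.468)

print(f"men vs. women: HR = {hr_mw:.2f}, 95% CI = ({lo_mw:.2f}, {hi_mw:.2f})")
```

The reciprocal of the point estimate agrees, up to rounding, with the exp(-coef) column of the same output.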