STA/BST 222, Fall 2011
Nov 22, 2011

K-M estimate
COX MODEL
AFT MODEL


PROC LIFETEST

1. Compute & graph the estimated survival function.
   ---- Kaplan-Meier method
   ---- life-table method
2. Test the null hypothesis that the survival functions are identical for two or more groups.
3. Test association between survival times and sets of quantitative covariates.


KM estimator using PROC LIFETEST

Syntax:

    proc lifetest data=dataname;
      time timevar*censoringindicator(0);
    run;

    proc lifetest data=dataname plots=(s) graphics;
      time timevar*censoringindicator(0);
      symbol1 v=none;
    run;

Here (0) lists the value of the censoring indicator that marks a censored observation.


Life-table estimator using PROC LIFETEST

    proc lifetest data=dataname method=life;
      time timevar*censoringindicator(0);
    run;


PROC PHREG

Advantages:
1. Can incorporate time-dependent variables.
2. Can deal with tied data.

Syntax:

    proc phreg data=dataname;
      model timevar*censoringindicator(0)=predictors;
    run;

Note: There is no intercept estimate; the intercept is absorbed into the baseline hazard.

When ties are present:

    proc phreg data=dataname;
      model timevar*censoringindicator(0)=predictors / ties=exact;  /* or breslow, efron, discrete */
    run;

efron is fast.

When there are no time-dependent covariates, the Cox model can be written as

    $S(t) = [S_0(t)]^{\exp(\beta x)}$

where $S_0(t)$ is the baseline survival function.

Estimate baseline hazard function.

Syntax:

    proc phreg data=dataname;
      model timevar*censorid(0)=predictors / ties=efron;
      baseline out=a survival=s logsurv=ls loglogs=lls;
    run;
    proc print data=a;
    run;

Prediction for a particular set of covariate values.

Syntax:

    data covals;
      input predictors;
      cards;
    **** data ***
    ;
    run;

    proc phreg data=dataname;
      model timevar*censorid(0)=predictors;
      baseline out=a covariates=covals survival=s lower=lcl upper=ucl / nomean;
    run;
    proc print data=a;
    run;


AFT MODEL

    $\log T_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \sigma \varepsilon_i$

1. It's simpler to fix the variance of $\varepsilon$ at some standard value (e.g., 1.0) and let $\sigma$ change in value to accommodate changes in the distribution variance.
2. As for the log transformation of T, its main purpose is to ensure that the predicted values of T are positive, regardless of the values of the x's and the $\beta$'s.
3. The output line labeled SCALE is an estimate of the $\sigma$ parameter, along with its estimated standard error.

If we let the error term be iid but arbitrary and unspecified, we get the (nonparametric) AFT model.

The Buckley-James estimator of the parameter $\beta$ can be thought of as the nonparametric version of the EM algorithm: the censored residuals are replaced by their expected values (E-step), followed by the usual regression M-estimation procedure (M-step). The nonparametric nature of this procedure appears in both the E-step (where you do not have a parametric distribution for the residuals) and the M-step (where you do not maximize a parametric likelihood, but use least squares etc.).

The calculation of the least squares Buckley-James estimator can be found in the R function bj(), inside the Design library. The trustworthiness of the variance estimation from bj() is in doubt. Mai Zhou (Department of Statistics, University of Kentucky) recommends using empirical likelihood instead. See bjtest() etc. inside the emplik package.


PROC LIFEREG

Advantages:
1. Accommodates left censoring and interval censoring. PROC PHREG only allows right censoring.
2. Can test certain hypotheses about the shape of the hazard function. PROC PHREG only gives nonparametric estimates of the survival function.
3. If the shape of the survival distribution is known, PROC LIFEREG produces more efficient estimates (with smaller standard errors) than PROC PHREG.

PROC LIFEREG's greatest limitation is that it does not handle time-dependent covariates.

Syntax:

    proc lifereg data=dataname;
      model timevar*censorid(0)=predictors / dist=distri_T;
    run;

PROC LIFEREG allows for five distributions for the error term, and for each distribution there is a corresponding distribution for T:

    Distribution of error term    Distribution of T
    --------------------------    -----------------
    Extreme value (2 par.)        Weibull
    Extreme value (1 par.)        Exponential
    Log-gamma                     Gamma
    Logistic                      Log-logistic
    Normal                        Log-normal

Note: all AFT models are named for the distribution of T rather than the distribution of the error term or log T.
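
To see where the first two rows of the table come from, the following worked derivation may help. It is a sketch under stated assumptions, not part of the original slides: the error term is taken to follow the standard (minimum) extreme value distribution with survival function $P(\varepsilon > w) = \exp(-e^{w})$, and $\mu$ is shorthand introduced here for the linear predictor of one subject.

```latex
\begin{align*}
% Assumed setup: AFT model for one subject, with mu as shorthand (not in the slides)
\log T &= \mu + \sigma\varepsilon,
  \qquad \mu = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k, \\
% Assumed error law: standard extreme value (minimum) distribution
P(\varepsilon > w) &= \exp\!\left(-e^{w}\right). \\
% Survival function of T obtained by inverting the log transformation
S(t) = P(T > t)
  &= P\!\left(\varepsilon > \frac{\log t - \mu}{\sigma}\right)
   = \exp\!\left\{-\exp\!\left(\frac{\log t - \mu}{\sigma}\right)\right\} \\
  &= \exp\!\left\{-\left(\frac{t}{e^{\mu}}\right)^{1/\sigma}\right\}.
\end{align*}
```

This is a Weibull survival function with shape $1/\sigma$ and scale $e^{\mu}$, which matches the "Extreme value (2 par.)" row. Fixing $\sigma = 1$ gives $S(t) = \exp(-t e^{-\mu})$, an exponential distribution with rate $e^{-\mu}$, which matches the "Extreme value (1 par.)" row.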