SAS LAB TWO, April 27, 2004
Lab Objectives
After today’s lab you should be able to:
1. Use PROC LIFETEST to generate Kaplan-Meier product-limit survival estimates, and
understand the output of the LIFETEST procedure.
2. Generate confidence limits for the Kaplan-Meier curve. Understand upgrades in SAS 9
for obtaining confidence bands for the KM survivor function.
3. Use the LIFETEST procedure to compare survival times of two or more groups.
4. Generate log-survival and log-log survival curves.
5. Run a simple SAS macro.
We will produce Kaplan-Meier, log-survival, and log-log survival plots today.
LAB EXERCISE STEPS:
Follow along with the computer in front…
1. Save the Excel dataset “hmohiv.xls” to the desktop, if it’s not already there, from the hrp262
website: http://www-stat.stanford.edu/~jtaylo//courses/stats262/spring.2004/index.html
2. Open SAS: from the desktop, double-click “Applications”, then double-click the SAS icon.
3. Import the data using point-and-click options. Select from the menu: File > Import
Data… > Next > Browse and find the file on the desktop > Next > name it in the Work library as
member “HmoHiv” > Finish
4. Fix the datetime variables, enddate and startdate, via the following code:
/**Dealing with date-time variables.
REMINDER: YOU MUST CLOSE THE DATASET BEFORE TRYING TO
MODIFY IT**/
data hmohiv;
  set hmohiv;
  format enddate date.;
  format startdate date.;
  enddate=datepart(enddate);
  startdate=datepart(startdate);
  Time=12*(enddate-startdate)/365.25; *gives time in months;
  Time=round(time); *to match Time variable in textbook;
run;
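The step above relies on the difference between SAS datetime values (seconds since January 1, 1960) and SAS date values (days since that date). The following is a minimal sketch of that conversion, not part of the lab data: the two date literals are arbitrary example values chosen only for illustration.

```sas
/* Illustration only: both literals below are arbitrary example values. */
data _null_;
  dt = '27APR2004:00:00:00'dt;               /* datetime: seconds since 01JAN1960 */
  d  = datepart(dt);                         /* date: days since 01JAN1960 */
  months = 12 * (d - '01JAN2004'd) / 365.25; /* same conversion as in step 4 */
  put dt= datetime20. d= date9. months= 6.2; /* written to the SAS log */
run;
```

Subtracting two date values gives a number of days, which is why the lab code divides by 365.25 and multiplies by 12 to get months.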
5. Generate the Kaplan-Meier product limit survival estimates for the hmohiv data:
/**Kaplan-Meier estimates of survivorship function**/
proc lifetest data=hmohiv;
  time time*censor(0);   /* censor=0 marks a censored observation */
  title 'Kaplan-Meier Estimates for HMO HIV data';
run;
6. Examine the “product limit survival estimates” output from the lifetest procedure.
Notice that there are several events that have the same failure times.
Confirm this fact by examining the distribution of the Time variable using point-and-click as follows:
1. From the menu select: Solutions > Analysis > Interactive Data Analysis
2. Double click to open: library “Work”, dataset “HmoHiv”
3. Highlight the “Time” variable and from the menu select: Analyze > Distribution (Y)
4. From the menu select: Tables > Frequency Counts
5. Scroll down the open analysis window to examine the frequency counts for Time.
Notice that there are many repeats.
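The same check can be done in code rather than through the menus. This is a sketch that assumes the hmohiv dataset from step 4 is in the Work library; it is an alternative to the point-and-click route, not the method the lab prescribes.

```sas
/* Frequency count of each distinct Time value; ties appear as counts above 1. */
proc freq data=hmohiv;
  tables time;
  title 'Frequency counts for Time';
run;
```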
Explanation of output from Lifetest procedure:

Kaplan-Meier Estimates for HMO HIV data
The LIFETEST Procedure
Product-Limit Survival Estimates

                                   Survival
                                   Standard Number   Number
  Time       Survival   Failure    Error    Failed     Left
  0.0000     1.0000     0          0             0      100
  1.0000     .          .          .             1       99
  1.0000     .          .          .             2       98
  1.0000     .          .          .             3       97
  1.0000     .          .          .             4       96
  1.0000     .          .          .             5       95
  1.0000     .          .          .             6       94
  1.0000     .          .          .             7       93
  1.0000     .          .          .             8       92
  1.0000     .          .          .             9       91
  1.0000     .          .          .            10       90
  1.0000     .          .          .            11       89
  1.0000     .          .          .            12       88
  1.0000     .          .          .            13       87
  1.0000     .          .          .            14       86
  1.0000     0.8500     0.1500     0.0357       15       85
  1.0000*    .          .          .            15       84
  1.0000*    .          .          .            15       83
  2.0000     .          .          .            16       82
  2.0000     .          .          .            17       81
  2.0000     .          .          .            18       80
  2.0000     .          .          .            19       79
  2.0000     0.7988     0.2012     0.0402       20       78
  2.0000*    .          .          .            20       77
  2.0000*    .          .          .            20       76
  2.0000*    .          .          .            20       75
  2.0000*    .          .          .            20       74
  2.0000*    .          .          .            20       73
  3.0000     .          .          .            21       72
  3.0000     .          .          .            22       71
  3.0000     .          .          .            23       70
  .
  .
  .
NOTE: The marked survival times are censored observations.

Notes on the columns:
Time: Censored observations are starred.
Survival: Gives KM estimate at each failure/event time. Reported for the last of the tied
cases, when ties exist. Note: KM estimate does not change until next failure/event time,
so it’s not written.
Failure: 1-Survival = the estimated probability of death prior to the specified time.
Survival Standard Error: (Pointwise) standard error of KM estimate, calculated with
Greenwood’s formula.
Number Failed: Cumulative # of failures.
Number Left: Size of the risk set for each time point.
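The Survival and Survival Standard Error values in the output above can be checked by hand. The sketch below assumes the counts read off that output: 100 subjects at risk with 15 deaths at time 1, then (after 2 censored observations at time 1) 83 at risk with 5 deaths at time 2. The dataset name km_check and its variable names are placeholders, not part of the lab.

```sas
/* Hand check of the product-limit estimate and Greenwood standard error. */
data km_check;
  input time n_risk n_events;
  retain surv 1 gsum 0;
  surv = surv * (1 - n_events / n_risk);                   /* product-limit step */
  gsum = gsum + n_events / (n_risk * (n_risk - n_events)); /* Greenwood sum */
  se   = surv * sqrt(gsum);                                /* pointwise std error */
  datalines;
1 100 15
2 83 5
;
run;

proc print data=km_check noobs;
  format surv se 6.4;
run;
```

With these inputs the printed values should match the table: 0.8500 (standard error 0.0357) at time 1 and 0.7988 (standard error 0.0402) at time 2.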
Summary Statistics for Time Variable Time

Quartile Estimates
             Point       95% Confidence Interval
Percent      Estimate    [Lower       Upper)
   75        15.0000     10.0000      34.0000
   50         7.0000      5.0000       9.0000
   25         3.0000      2.0000       4.0000

75th percentile: Smallest event time such that the probability of dying earlier is .75.
50th percentile: Estimated median death time and 95% confidence interval.

   Mean      Standard Error
14.5912              1.9598

Mean: Estimated mean survival time.
Note: Median is usually preferred measure of central tendency for survival data.

NOTE: The mean survival time and its standard error were underestimated because the largest
observation was censored and the estimation was restricted to the largest event time.

Summary of the Number of Censored and Uncensored Values
                                 Percent
Total     Failed     Censored    Censored
  100         80           20       20.00
Remark: the confidence interval for the 75th percentile is not symmetric about the point
estimate; the method used to construct it is worth looking into.
7. Plot the Kaplan-Meier curve, as in figure 2.2, p.34:
/*Plot KM curve*/
goptions reset=all;
proc lifetest data=hmohiv plots=(s) graphics censoredsymbol=X;
  time time*censor(0);
  title 'Figure 2.2, p. 34';
  symbol v=none;
run;
Notes on the options:
- plots=(s): “s” asks for survival plot; see reference page for full list of plotting options.
- graphics: Requests high resolution graphics.
- censoredsymbol=X: Or try “none” here, to eliminate censoring marks. If you don’t specify,
it will give you annoying circles as the default.
- symbol v=none: Tells SAS to omit the symbol for each event. You may also specify this above
with the option “eventsymbol=none”.
8. To get pointwise confidence intervals for the survival curve, use the outsurv option:
/*get confidence limits*/
proc lifetest data=hmohiv outsurv=outdata;
  time time*censor(0);
  title 'outputs pointwise confidence limits';
run;
9. Open the new outdata dataset using point-and-click to view the new variables.
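As an alternative to point-and-click, the new dataset can be inspected in code. This sketch assumes the outdata dataset created in step 8; the variable names SURVIVAL, SDF_LCL and SDF_UCL are the ones used in the plotting code of this lab, and _CENSOR_ is assumed to be the censoring indicator that OUTSURV= writes.

```sas
/* List the variables in the OUTSURV= dataset, then print the first rows. */
proc contents data=outdata;
run;

proc print data=outdata(obs=10);
  var time _censor_ survival sdf_lcl sdf_ucl;
run;
```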
10. Plot survival curve with point-wise confidence intervals:
/*plot confidence limits*/
goptions reset=all;
axis1 label=(angle=90);
proc gplot data=outdata;
  title 'Figure 2.5, p.46';
  label survival='Survival Probability';
  label time='Survival Time (Months)';
  plot survival*time SDF_UCL*time SDF_LCL*time / overlay vaxis=axis1;
  /* The symbol statements ask for lines that differ in line type
     (e.g., dashed, solid) rather than color (which is the SAS default). */
  symbol1 v=none i=join c=black line=1;
  symbol2 v=none i=join c=black line=2;
  symbol3 v=none i=join c=black line=2;
run; quit;
Note: In SAS 8, there is no easy way (other than programming a macro) to obtain the
simultaneous 95% confidence bands (Hall and Wellner) for the survivor function or to calculate
confidence intervals based on transformations of the survivor function, such as log-log, but
SAS 9 has these features:
Useful features in SAS 9 that are not available in SAS 8:
The new SURVIVAL statement enables you to create confidence bands (also known as simultaneous
confidence intervals) for the survivor function S(t) and to specify a transformation for computing the
confidence bands and the pointwise confidence intervals. It contains the following options:
- The OUT= option names the output SAS data set that contains survival estimates, as in the
OUTSURV= option in the PROC LIFETEST statement.
- The CONFTYPE= option specifies the transformation applied to S(t) to obtain the pointwise
confidence intervals and the confidence bands. Four transforms are available: the arcsine-square
root transform, the complementary log-log transform, the logarithmic transform, and the logit
transform.
- The CONFBAND= option specifies the confidence bands to add to the OUT= data set. You can
choose the equal precision confidence bands (Nair, 1984), or the Hall-Wellner bands (Hall and
Wellner, 1980), or both.
- The BANDMAX= option specifies the maximum time for the confidence bands.
- The BANDMIN= option specifies the minimum time for the confidence bands.
- The STDERR option adds the column of standard error of the estimated survivor function to the
OUT= data set.
- The ALPHA= option sets the confidence level for pointwise confidence intervals as well as the
confidence bands.
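A minimal sketch of how these options might be combined, using only the options named in the list above. The keyword spellings LOGLOG and ALL and the data set name bands are assumptions to check against the documentation for your release; in later SAS releases these options appear to be available on the PROC LIFETEST statement itself rather than on a SURVIVAL statement.

```sas
/* Sketch for SAS 9.0/9.1: log-log transformed limits plus both band types. */
proc lifetest data=hmohiv;
  time time*censor(0);
  survival out=bands conftype=loglog confband=all stderr;
run;

/* List the variables written to the OUT= data set before plotting them. */
proc contents data=bands;
run;
```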
11. We could also write a “macro” (like a function) to give us a plot of the survivor function
with confidence limits. If we were going to be plotting many survival curves, this would
save time.
/* Variables that will be entered into the function; here:
   dataset, time variable, censoring variable. They will be
   called with &variable. below. */
%macro cl(data, time, censor);
  goptions reset=all;
  axis1 label=(angle=90);
  proc lifetest data=&data. outsurv=outdata;
    time &time.*&censor.(0);
  run;
  proc gplot data=outdata;
    title 'Plot of survivor function with pointwise confidence intervals';
    label survival='Survival Probability';
    label &time.='Survival Time';
    plot survival*&time. SDF_UCL*&time. SDF_LCL*&time. / overlay vaxis=axis1;
    symbol1 v=none i=join c=black line=1;
    symbol2 v=none i=join c=black line=2;
    symbol3 v=none i=join c=black line=2;
  run; quit;
%mend cl;
/* Invoke macro */
%cl(hmohiv, time, censor);
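The macro above hard-codes 0 as the censoring value and outdata as the output dataset. One possible variant, shown only as a sketch, uses keyword parameters so both can be changed at the call; the macro name cl_kw and the parameters cval and out are hypothetical additions, not part of the lab.

```sas
/* Hypothetical variant of the lab macro with keyword parameters. */
%macro cl_kw(data=, time=, censor=, cval=0, out=outdata);
  proc lifetest data=&data. outsurv=&out. noprint;
    time &time.*&censor.(&cval.);   /* cval = value that marks censoring */
  run;

  goptions reset=all;
  axis1 label=(angle=90);
  symbol1 v=none i=join c=black line=1;   /* survival estimate: solid  */
  symbol2 v=none i=join c=black line=2;   /* upper limit: dashed       */
  symbol3 v=none i=join c=black line=2;   /* lower limit: dashed       */
  proc gplot data=&out.;
    label survival='Survival Probability';
    plot survival*&time. SDF_UCL*&time. SDF_LCL*&time. / overlay vaxis=axis1;
  run; quit;
%mend cl_kw;

%cl_kw(data=hmohiv, time=time, censor=censor);
```

Keyword parameters make each call self-documenting and let the defaults (cval=0, out=outdata) be omitted when they apply.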
11. Compare drug groups.
/**Figure 2.7 , p. 58**/
proc lifetest data=hmohiv plots=(s) graphics censoredsymbol=none;
  time time*censor(0);
  title 'Figure 2.7, p. 58';
  strata drug;   /* requests comparison by drug group */
  symbol1 v=none color=black line=1;
  symbol2 v=none color=black line=2;
run;
Explanation of output from Lifetest procedure:

Test of Equality over Strata
                                      Pr >
Test          Chi-Square    DF    Chi-Square
Log-Rank         11.8556     1        0.0006
Wilcoxon         10.9104     1        0.0010
-2Log(LR)        20.9264     1        <.0001

Tests of null hypothesis: S1(t)=S2(t)
Using:
-log-rank test
-Wilcoxon test
-Likelihood ratio test (assumes event times have an exponential distribution)
If the median is a better measure of the center, why not test the equality of medians?
A method for doing so, based on empirical likelihood, has only recently become available.
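The Pr > Chi-Square values in the Test of Equality over Strata output are upper-tail probabilities of a chi-square distribution with 1 degree of freedom. As a sketch, they can be recomputed from the reported chi-square statistics with the PROBCHI function; the variable names below are placeholders.

```sas
/* Upper-tail chi-square probabilities for the three reported statistics. */
data _null_;
  p_logrank  = 1 - probchi(11.8556, 1);
  p_wilcoxon = 1 - probchi(10.9104, 1);
  p_lr       = 1 - probchi(20.9264, 1);
  put p_logrank= 6.4 p_wilcoxon= 6.4 p_lr= 8.6;  /* written to the SAS log */
run;
```

The results should agree with the reported column: about 0.0006, 0.0010, and a value below 0.0001.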
12. Plot the Kaplan-Meier survival curve for the hmohiv data by age group (as in figure 2.8,
p.69) by changing the strata statement (and title) as below:
/*by age group*/
proc lifetest data=hmohiv plots=(s) graphics censoredsymbol=none;
  time time*censor(0);
  title 'Figure 2.8, p. 69';
  strata age(30 35 40 45); *look at survival by age groups;
run;
Asks SAS to divide age into the groups: (-∞, 30), [30, 35), [35, 40), [40, 45), [45, ∞)
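To see how many subjects fall in each of these age intervals, one option is a user-defined format that mirrors the cutpoints 30, 35, 40 and 45. This is a sketch: the format name agegrp and its labels are placeholders, and it assumes the variable age in hmohiv is numeric.

```sas
/* Format mirroring the intervals: below 30, [30,35), [35,40), [40,45), 45 and up. */
proc format;
  value agegrp low -< 30  = 'under 30'
               30  -< 35  = '30 to <35'
               35  -< 40  = '35 to <40'
               40  -< 45  = '40 to <45'
               45  - high = '45 and over';
run;

/* Count subjects per age group using the format. */
proc freq data=hmohiv;
  tables age;
  format age agegrp.;
run;
```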
13. Change plot from “s” (survival) to “ls” (log-survival), which plots -log S(t) versus t.
proc lifetest data=hmohiv plots=(ls) graphics censoredsymbol=none;
  time time*censor(0);
  title '-log survival plot';
  strata drug;
run;
Equivalent to the cumulative hazard function:
$$-\log \hat{S}(t) = \int_0^t h(u)\,du$$
The plot tells us how the hazard changes with time. For example, if the hazard is constant
(no change over time), the plot should be a straight line through the origin.
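The quantity on the vertical axis of this plot can also be computed directly from the Kaplan-Meier estimates. The sketch below assumes the hmohiv dataset from step 4; the dataset name negls and the variable neg_log_s are placeholders.

```sas
/* Write the KM estimates to a dataset, then compute -log S(t) by hand. */
proc lifetest data=hmohiv outsurv=outdata noprint;
  time time*censor(0);
run;

data negls;
  set outdata;
  if survival > 0 then neg_log_s = -log(survival);  /* undefined when S(t)=0 */
run;

proc print data=negls(obs=10);
  var time survival neg_log_s;
run;
```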
14. Change plot from “s” (survival) to “lls” (log-log survival), which plots log(-log S(t))
versus log t.
proc lifetest data=hmohiv plots=(lls) graphics;
  time time*censor(0);
  title 'log log survival plot';
  strata drug;
run;
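Asks for a plot of log(-log S(t)) vs. log(time):
$$\log[-\log \hat{S}(t)] = \log \int_0^t h(u)\,du$$
If the survival times follow a Weibull distribution with $\log h(t) = \alpha + \beta \log t$,
then the log-log survival plot should be a straight line. Integrating $h(u) = e^{\alpha} u^{\beta}$
gives $\log \int_0^t h(u)\,du = \alpha - \log(\beta + 1) + (\beta + 1)\log t$, so the slope of the
line is $\beta + 1$.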
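One rough way to judge how straight the log-log plot is, offered here only as a sketch, is to compute log(-log S(t)) and log t from the Kaplan-Meier estimates and fit an ordinary least squares line. This is an unweighted, informal check rather than a Weibull model fit; the dataset name lls and the variables log_t and lls are placeholders.

```sas
/* KM estimates for the whole sample, written to outdata. */
proc lifetest data=hmohiv outsurv=outdata noprint;
  time time*censor(0);
run;

/* Keep rows where both logs are defined, then transform. */
data lls;
  set outdata;
  where time > 0 and 0 < survival < 1;
  log_t = log(time);
  lls   = log(-log(survival));
run;

/* Informal straight-line fit: the slope estimates the slope of the plot. */
proc reg data=lls;
  model lls = log_t;
run; quit;
```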