SAS_LAB_ONE

SAS LAB ONE, April 20, 2004
Lab Objectives
After today’s lab you should be able to:
1. Import data from Excel into SAS via point-and-click features as well as PROC IMPORT.
2. Manipulate and format date and time variables in SAS.
3. Put data into the correct structure for survival analysis: create time and censor variables.
4. Quickly examine univariate distributions and identify outliers via point-and-click
features.
5. Produce enhanced graphs using PROC GPLOT.
a. Use the TITLE, FOOTNOTE, SYMBOL, LABEL, AXIS, LEGEND statements
(which are global statements).
b. Use different symbols for different values of a classification variable (such as
censored/failed).
c. Overlay two plots.
d. Directly edit and enhance graphs using point and click features.
e. Export graphs as image files.
f. Add an exponential regression line to a plot (using PROC LIFEREG).
g. Know where to go for help on SAS/GRAPH: http://v8doc.sas.com/sashtml/
6. Produce the plots in Chapter 1 of your textbook (Applied Survival Analysis, by Hosmer
and Lemeshow) (see next page).
7. Produce a simple Kaplan-Meier curve (we will continue this in lab next week).
8. If time, complete the computer-based exercises at the end of chapter 1 of your textbook.
We will produce the following plots today:
LAB EXERCISE STEPS:
Follow along with the computer in front…
1. Save (to desktop) the two Excel datasets “hmohiv.xls” and “whas.xls” from the hrp262
website: http://www-stat.stanford.edu/~jtaylo//courses/stats262/spring.2004/index.html
2. Open SAS: from the desktop, double-click “Applications”, then double-click the SAS icon.
3. Use point-and-click features to obtain the folder path for the library (where your datasets
sit); the code below calls this the “file extension”. Copy it.
4. Create the library and import the data via the following code (note: comments, which SAS
ignores, appear between /* and */):
Importing Data
/*Create library hrp262*/
libname hrp262 'PASTE YOUR FILE EXTENSION HERE';
/**Import data**/
proc import out = work.hmohiv
datafile = "M:\HRP 262\SAS datasets\hmohiv.xls"
dbms = Excel2000 replace;
run;
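After the import runs, it can help to confirm that the dataset arrived as expected. The following is a minimal sketch, assuming the import above created work.hmohiv; it lists the variables and their types, then prints the first few rows.

```sas
/* Quick check of the imported dataset (assumes work.hmohiv exists) */
proc contents data=work.hmohiv;        /* variable names, types, formats */
run;

proc print data=work.hmohiv (obs=5);   /* first five rows only */
run;
```

In the PROC CONTENTS output, look at the formats attached to the date variables: this is where the datetime issue handled in a later step shows up.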
5. Import data via point-and-click.
6. Fix datetime variables, enddate and startdate, via the following code:
Dealing with date-time variables
/**Dates are automatically imported from Excel as datetime variables.
We want to remove the time part**/
/*Values of a date variable represent the # of days before or after Jan. 1,
1960*/
data hrp262.hmohiv;
set hmohiv;
format enddate date.;
format startdate date.;
enddate=datepart(enddate);
startdate=datepart(startdate);
Time=12*(enddate-startdate)/365.25; *gives time in months;
Time=round(time); *to match Time variable in textbook;
run;
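To see what DATEPART does on its own, here is a small sketch that does not touch the lab data. The datetime and date values below are made up for illustration; the month calculation mirrors the formula used in the data step above.

```sas
data _null_;
  dt = '20APR2004:14:30:00'dt;   /* hypothetical datetime: seconds since Jan. 1, 1960 */
  d  = datepart(dt);             /* date part only: days since Jan. 1, 1960 */
  put dt= d=;                    /* raw numeric values */
  put dt= datetime18. d= date9.; /* the same values, formatted */

  start  = '15MAY1990'd;         /* hypothetical start and end dates */
  end    = '14OCT1990'd;
  months = round(12*(end - start)/365.25);  /* same formula as Time above */
  put start= date9. end= date9. months=;
run;
```

The results are written to the SAS log, so no dataset is created.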
Reference: alternate date formats (shown for April 20, 2004):
Format      Output
date.       20APR04
date9.      20APR2004
day.        20
dowName.    Tuesday
dowName3.   Tue
monName.    April
monName3.   Apr
month.      4
year2.      04
mmddyy6.    042004
mmddyy8.    04/20/04
mmddyy10.   04/20/2004
weekdate.   Tuesday, April 20, 2004
worddate.   April 20, 2004
year.       2004
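One way to try these formats yourself is to write a single date value to the log several times, each time with a different format. This is only a sketch; the date literal is the lab date and the choice of formats is arbitrary.

```sas
data _null_;
  d = '20APR2004'd;     /* the lab date as a SAS date value */
  put d=;               /* unformatted: number of days since Jan. 1, 1960 */
  put d= date9.;
  put d= mmddyy10.;
  put d= weekdate.;
  put d= worddate.;
run;
```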
7. Use point-and-click features to check univariate distributions and plot age vs. time.
8. Recreate Figure 1.1 of Hosmer and Lemeshow’s Applied Survival Analysis, p. 6. We’ll
start with the simplest version and add features as we go along. Use the following sets of
code:
/** Figure 1.1, Hosmer and Lemeshow's Applied Survival Analysis, p. 6,
version 1: simple.**/
/**Note specification of vertical and horizontal axes scales and use of
title statement**/
goptions reset=all; *resets graphing options;
proc gplot data=hrp262.hmohiv;
title1 'Figure 1.1, page 6: version 1';
plot time*age /
vaxis = 0 10 20 30 40 50 60
haxis = 15 20 25 30 35 40 45 50 55 ;
run; quit;
/** Figure 1.1, Hosmer and Lemeshow's Applied Survival Analysis, p. 6,
version 2: change symbol color, shape, and size; reduce minor tick
marks to match textbook figure**/
symbol1 value=circle color=red w=2 h=2;
proc gplot data=hrp262.hmohiv;
title1 'Figure 1.1, page 6: version 2';
plot time*age /
vaxis = 0 10 20 30 40 50 60 vminor=1
haxis = 15 20 25 30 35 40 45 50 55 hminor=1;
run; quit;
NOTE: Titles stay in effect until they are replaced by new ones or removed by entering a blank
title: title1 ' ' ;
/** Figure 1.1, Hosmer and Lemeshow's Applied Survival Analysis, p. 6,
finished version: Make axis labels match figure in textbook; divide
symbols by classification variable, censor; add a legend**/
goptions reset=all;
axis1 order= (0 to 60 by 10)
label=(height= 4pct font='Times New Roman' angle=90);
axis2 order= (15 to 55 by 10)
label=(height= 4pct font='Times New Roman');
symbol1 v=circle c=blue h=1 w=1;
symbol2 value=X color=red h=1 w=1;
legend1 label=('Censor')
shape=symbol(1,2)
position=(bottom center outside);
proc gplot data=hrp262.hmohiv;
title1 'Figure 1.1, page 6';
label time='Survival Time (Months)';
label Age='Age';
plot time*age=censor /
vaxis = axis1 haxis=axis2 vminor=1 hminor=1 legend=legend1;
run; quit;
NOTE: A LABEL statement placed inside a PROC step is in effect only for the duration of that
PROC.
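The lab objectives also list the FOOTNOTE statement, which works like TITLE: it is global and stays in effect until replaced or cleared. A minimal sketch follows, assuming hrp262.hmohiv exists as created earlier; the footnote text is made up for illustration.

```sas
proc gplot data=hrp262.hmohiv;
  title1 'Figure 1.1, page 6';
  footnote1 justify=left 'Data: hmohiv';   /* hypothetical footnote text */
  plot time*age=censor;
run; quit;

footnote1;   /* a FOOTNOTE statement with no text clears the footnote */
```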
9. Recreate Figure 1.2 of Hosmer and Lemeshow’s Applied Survival Analysis, p. 7, via the
following SAS code:
/** Figure 1.2 in H&L, p.7**/
/*create temporary dataset new with inverse age variable*/
data new;
set hrp262.hmohiv;
ageinv=1000/age;
run;
proc gplot data=new;
title1 'Figure 1.2, page 7';
label time='Survival Time (Months)';
label Ageinv='1000/Age';
plot time*ageinv=censor /
vaxis = axis1 haxis=axis2 vminor=1 hminor=1;
run; quit;
10. Recreate plot Figure 1.4 Hosmer and Lemeshow’s Applied Survival Analysis, p. 19 via
the following SAS code:
/**Generate graph Figure 1.4, p.19. Note the use of the overlay
function to graph two X values for each individual**/
goptions reset=all;
axis1 order= (0 to 4 by 1)
label=(height= 4pct font='Times New Roman' angle=90) minor=none;
axis2 label=(height= 4pct font='Times New Roman') minor=none;
title1 'Figure 1.4, p. 19';
symbol1 v=circle c=blue h=1 w=1 ;
symbol2 value=X color=red h=1 w=1 ;
proc gplot data=hrp262.hmohiv;
title1 'Figure 1.4, page 19';
label ID='Subject';
label StartDate='Calendar Time';
plot ID*StartDate ID*EndDate / overlay vaxis=axis1 haxis=axis2;
where id le 4; *we only want 4 individuals;
run; quit;
11. Right click on the graph. Select edit graph feature. Add horizontal connecting lines to
complete the graph.
12. Exit Edit screen and save changes to graph. Use right click again to save graph as a .jpg
or .gif file. Open a new Word Document and insert graph as a picture into Word.
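As an alternative to right-clicking, a graph can be written straight to an image file with graphics options. The sketch below assumes the GIF device driver is available in your SAS/GRAPH installation; the file path is hypothetical and should be changed to a folder you can write to.

```sas
/* Hypothetical output file; change the path to a folder you can write to */
filename grafout 'C:\temp\figure1_4.gif';

goptions reset=all device=gif gsfname=grafout gsfmode=replace;

proc gplot data=hrp262.hmohiv;
  title1 'Figure 1.4, page 19';
  plot ID*StartDate ID*EndDate / overlay;
  where id le 4;
run; quit;

goptions reset=all;       /* restore the default device */
filename grafout clear;   /* release the file reference */
```

A graph exported this way does not include edits made in the graph editor, so the connecting lines from step 11 would still need to be added by hand.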
13. Recreate Figure 1.5, p. 20 via the following SAS Code:
/*axis1 (vertical axis, 0 to 4) is still in effect from step 10*/
axis2 order=(0 to 12 by 3)
label=(height= 4pct font='Times New Roman');
symbol1 v=circle c=blue h=1 w=1 i=none; *no connecting lines (added by hand in step 14);
symbol2 value=X color=red h=1 w=1 ;
proc gplot data=hrp262.hmohiv;
title1 'Figure 1.5, page 20';
label ID='Subject';
label Time='Time in Months';
plot id*Time / haxis=axis2 vaxis=axis1;
where id le 4;
run; quit;
14. Right click on the graph. Select edit graph feature. Add horizontal connecting lines to
complete the graph.
15. Exit Edit screen and save changes to graph. Use right click again to save graph as a .jpg
or .gif file. Add graph to your current open Word file.
16. Fit an exponential regression model as in Table 1.2, p. 14 of Hosmer and Lemeshow,
using PROC LIFEREG:
/*Fits the exponential curve as on p. 14 of H&L*/
Proc Lifereg data=hrp262.hmohiv outest=a;
title 'Table 1.2, p.14';
model time*censor(0)= age /dist=exponential;
output out=OutData p=median; *gives predicted median survival
times;
run;
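Under the exponential model that PROC LIFEREG fits, the survival function is S(t) = exp(-t/exp(b0 + b1*age)), so the predicted median is log(2)*exp(b0 + b1*age). The sketch below recomputes that median by hand and prints it next to the value from the OUTPUT statement. It assumes the OUTEST= dataset a holds a single row with the estimates stored in variables named Intercept and age; check this with PROC PRINT on a before relying on it. The names b0, b1, check, and median_check are introduced here for illustration.

```sas
data check;
  /* read the one row of estimates once; values are retained for all rows */
  if _n_ = 1 then set a (keep=Intercept age rename=(Intercept=b0 age=b1));
  set OutData;
  median_check = log(2) * exp(b0 + b1*age);   /* median of the fitted exponential */
run;

proc print data=check (obs=5);
  var age median median_check;   /* the last two columns should agree */
run;
```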
17. Plot predicted survival times as a function of age on your scatterplot of Time vs. Age:
goptions reset=all ;
axis1 order= (0 to 60 by 10)
label=(height= 4pct font='Times New Roman' angle=90);
axis2 order= (15 to 55 by 10)
label=(height= 4pct font='Times New Roman');
axis3 order= (0 to 60 by 60) minor=none major=none label=(' '
angle=90);
symbol1 v=circle c=blue h=1 w=1;
symbol2 value=X color=red h=1 w=1;
legend1 frame cframe=ligr cborder=black
position=center;
/*Sorting data by independent variable is critical here*/
proc sort data=OutData; by age; run;
/*Plots the data with the fitted exponential curve overlaid*/
proc gplot data=OutData;
title1 'Figure 1.3, page 16';
label time='Survival Time (Months)';
label Age='Age';
plot time*age=censor /
vaxis = axis1 haxis=axis2 vminor=1 hminor=1 legend=legend1;
plot2 median*age / overlay vaxis=axis3;
symbol3 v=none c=black w=2 i=join line=1;
run; quit;
18. Remove the proc sort statement in the above SAS code and rerun. What happens?
19. Change symbol statements in the above SAS code and rerun. What happens?
symbol1 v=star c=orange h=1 w=1;
symbol2 value=& color=green h=2 w=2;
symbol3 v=none c=black w=2 i=join line=5;
20. Plot the Kaplan-Meier survival curve for the hmohiv data: Figure 2.2, p. 34.
/**Kaplan-Meier estimate of survivorship function**/
proc lifetest data=hrp262.hmohiv;
time time*censor(0);
title 'Kaplan-Meier Estimates for HMO HIV data';
run;
/*Plot this curve*/
goptions reset=all;
proc lifetest data=hrp262.hmohiv plots=(s) graphics
censoredsymbol=none;
time time*censor(0);
title 'Figure 2.2, p. 34';
symbol v=none ;
run;
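To look at the numbers behind the curve, the Kaplan-Meier estimates can also be saved to a dataset with the OUTSURV= option. This is a minimal sketch; the dataset name km is arbitrary.

```sas
proc lifetest data=hrp262.hmohiv outsurv=km noprint;
  time time*censor(0);
run;

proc print data=km (obs=10);   /* first ten rows of the saved estimates */
run;
```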
21. Obtain a cross-tabulation of DRUG and the censoring variable CENSOR and compute
the percent dead and percent censored in each drug group.
/*Proc freq by drug**/
proc freq data=hrp262.hmohiv;
tables drug*censor / chisq;
run;
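For the percent dead and percent censored within each drug group, the row percentages of the cross-tabulation are the relevant numbers. One possible variation of the call above suppresses the other percentages so that only row percentages remain. It relies on the coding implied by censor(0) in the survival code, where 0 marks a censored observation and 1 a death.

```sas
proc freq data=hrp262.hmohiv;
  /* keep counts and row percentages only */
  tables drug*censor / nocol nopercent;
run;
```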
22. Run a logistic regression comparing survival (censor) in the drug vs. placebo groups.
proc logistic data=hrp262.hmohiv descending;
model censor=drug;
run;
23. Plot the Kaplan-Meier survival curve for the hmohiv data by drug group.
/**Figure 2.7 , p. 58**/
proc lifetest data=hrp262.hmohiv plots=(s) graphics
censoredsymbol=none;
time time*censor(0);
title 'Figure 2.7, p. 58';
label s='survival probability';
strata drug;
run;
24. If there’s time (or for optional homework), import the whas data into SAS and complete
exercises 1(a), 1(d), and 1(e) using the coding strategies we learned today.