Help! I need Help with the Help!?!

Kevin P. Delaney, New York State Office of Mental Health, Albany, NY
Abstract
SAS OnlineDOC Documentation provided with
Version 8 is a comprehensive resource, providing
electronic, indexed, and searchable copies of all
official documentation published by the SAS
Institute. However, such a complete reference can
be difficult to navigate. The task can become almost
insurmountable for a new SAS user who isn’t quite
sure whether he or she is looking for a procedure, a
statement, or an option, or even which SAS
component they are working in. This paper is
designed to provide a systematic overview of the
structure of the OnlineDoc. Using an example of the
FREQ procedure, we will identify what types of
items are procedure statements, and procedure
options. By adding SAS statements, formats, global
options and ODS statements, we will gain
experience navigating the Online Doc. Having
completed our tour of the Base SAS Online Doc, we
will briefly describe the documentation for some of
the other more popular SAS components, such as
SAS/GRAPH® software and SAS/STAT® software.
This paper is intended for both New SAS users, and
more experienced SAS users (gurus?) looking for a
guide to teach other novices how to get started in the
world of SAS.
Introduction
This paper grew out of a desire to provide a
starting point for some of my coworkers who were new
to the SAS System. I found that many of the questions I
received were about things that were undoubtedly at the
fingertips of the person asking, if
they could only understand the SAS OnlineDoc well
enough to know where to look. The purpose of this
paper is therefore to provide a guided tour of the
functionality of the OnlineDoc.
Your Mission
I will attempt to do this by working through the
following problem:
The boss has given you a small data set with three
variables. The variable A is exposure to treatment; it
takes the value 1 when a person received the
treatment, and 0 when they did not. The variable B
is an outcome variable, taking the value 1 when a
person developed the disease under study, and 0 when
they did not. The variable count contains the
number of persons with each of the four possible
combinations of A and B. Your boss has given you
the data step to create this data set:
data test;
input a b count;
label a ="Treatment" b ="Disease";
datalines;
0 1 234
1 0 345
1 1 67
0 0 175
;
run;
But he wants you to “learn SAS.” So his little “test”
will be for you to produce a table with frequency
counts and expected values, as well as chi-square
tests of the likelihood of seeing this distribution of
treatment and disease by chance alone, and the
relative risk of developing disease for persons
receiving the treatment compared to those who did
not. Oh, and he wants you to post the results on
your intranet web page by close of business so he
can look at them when he gets home.
So what now? And no, you can’t quit...
Getting Started: Where the HELP!?!
Let’s start with the basics. How do you access the
SAS OnlineDoc documentation?
Well, if SAS has been configured correctly on your
computer or server, you should be able to go to the
Help menu on the SAS System toolbar, then to
Books and Training, then SAS OnlineDoc:
Figure 1: Opening the OnlineDoc
This should then launch your computer’s default
web browser with the “home page” of the online
documentation, which looks like this:
Figure 2: SAS OnlineDoc Start page
An important note:
Unless you disable the option, your browser will ask
you whether or not you want to trust the Java applet
that helps the SAS OnlineDoc run. (This is a form
of virus protection, and can be very important if you
are considering running a program from a
connection to the Internet. However, since you are
usually running the SAS OnlineDoc from your
hard drive or a local area network, it should be safe;
after all, it did come from SAS Institute.)
The moral of the story is that you should check YES
on this security warning, because checking NO will
disable the search menu in the OnlineDoc. If you
don’t believe me, try checking NO and then search
for such “uncommon” SAS words as PROC, DATA,
or INPUT; it can be an amusing way to kill time.
A Guided Tour
Searching
I won’t spend too much time talking about the Index
or Search facilities in the OnlineDoc. Both of these
options are well explained in the Help section of the
OnlineDoc, and if I were to address them here that
would be providing Help with the Help for the Help,
which would just be too confusing.
The Search option will, however, be a great place to
illustrate what a complete, and potentially
cumbersome, reference the SAS OnlineDoc
documentation can be. The FREQ procedure is one
of the most commonly used procedures in the world
of SAS. So let’s use the Search tab to look up the
phrase “frequency tables” to learn more about how
SAS generates tables of counts of variables. Well,
searching for the phrase “frequency tables” produces
48 matches. Figure 3 shows these results.
Figure 3: Searching for “frequency tables”
The first of these is the Frequency Tables and Statistics
page from the SAS/STAT User’s Guide, which may
or may not be what we were looking for. For those
of us who have been using SAS for a while, it may
not be that difficult to page through these search
results until we find the information we need. But
imagine you are a new user, trying to figure out
which page to choose. What if our search was not
for “frequency tables” but for PROC FREQ? Then we
get 126 matches. Hopefully, you see my point. It
seems as though it could be much easier to look up
the information you want by paging through the
Contents of the OnlineDoc.
Sticking with our example of trying to generate
frequency tables, let’s find the SAS guide to the FREQ
procedure. To do this you have to know a little
more about how the OnlineDoc is laid out.
The Contents of the SAS OnlineDoc
documentation is divided by SAS component.
Information on the FREQ procedure is found in both
the Base SAS documentation and the SAS/STAT
module. Figure 4 shows a close-up of the BASE
SAS module of the OnlineDoc.
Figure 4: The BASE SAS topic group from the SAS
OnlineDoc documentation.
The plus (+) sign and graphic of multiple books next
to Base SAS indicate that there are folders contained
within this topic heading. If we double-click on
BASE SAS we will see that there are in fact seven
topic groups within BASE SAS.
BASE SAS >
SAS Language Reference: Dictionary
SAS Macro Language Reference
Moving and Accessing SAS Files across Operating Environments
SAS Procedures Guide
Guide to the SAS Output Delivery System
SAS SQL Query Window User’s Guide, and
Host Specific Information
Now where might we find information on the FREQ
procedure? As a general rule, PROC anything for
Base SAS can be found in the SAS Procedures
Guide, and if the PROC you are looking for is not
found there, it is probably part of a different
component. The notable exception to this rule
(those new to SAS will quickly learn that there is at
least one exception to every rule) is the TEMPLATE
procedure, which is found in the Guide to the Output
Delivery System.
If we examine the SAS Procedures Guide we will
find four chapters:
Changes and Enhancements
Concepts
Procedures
Appendices
The meat of this documentation lies in the
Procedures chapter, which gives detailed
descriptions of 38 procedures available in BASE
SAS. For an excellent overview describing these, see
the Concepts chapter ‘Choosing the Right
Procedure.’ (I recommend this even for more
advanced SAS users; even you may be surprised at
some of the things BASE SAS can do.)
The FREQ Procedure provides a representative
example of what you will find on most of the
procedure pages.
Figure 5: The FREQ Procedure Chapter of the
SAS OnlineDoc > BASE SAS > Procedures Guide
Figure 5 shows the topics included in the FREQ
Procedure chapter of the Procedures Guide. First
you will find a brief overview of the procedure.
This is followed by a detailed description of the
procedure syntax, including the PROC
A_SAS_PROC statement and all other optional
statements. There will usually be a Concepts or
Results section (or both) with advice and special
topics specific to the procedure. Finally, there will
be examples illustrating how to use the procedure.
Let’s take a closer look at what is found in the FREQ
procedure chapter.
The procedure syntax describes the basic statements
used in the procedure in a concise format, in this
case:
PROC FREQ <option(s)>;
BY <DESCENDING> variable-1
<...<DESCENDING> variable-n>
<NOTSORTED>;
EXACT statistic-keyword(s) </ option(s)>;
OUTPUT statistic-keyword(s) <OUT=SAS-dataset>;
TABLES request(s) </ option(s)>;
TEST statistic-keyword(s);
WEIGHT variable;
The PROC FREQ statement is the fundamental
component of the procedure; all of these other
statements are optional.
The sample code provided with this paper contains
the actual SAS code that we will build as we tour the
OnlineDoc. It is not intended to teach you how to
program in SAS, but merely to serve as an example of
how to pull information from the OnlineDoc. That said,
let’s return to the PROC FREQ chapter. By default,
PROC FREQ;
Run;
will produce one-way frequency tables of all the
variables in the most recently created data set, which
in most cases is not what you want.
You change this default procedure output by adding
procedure statements and procedure options. The
handy “To do this > Use this” tables in each section
help you decide which procedure statements and
options you want to use.
To do this                                        Use this statement
Calculate separate frequency or                   BY
  crosstabulation tables for each BY group
Request exact tests for specified statistics      EXACT
Create an output data set that contains           OUTPUT
  specified statistics
Specify frequency or crosstabulation tables       TABLES
  and request tests and measures of association
Request asymptotic tests for measures of          TEST
  association and agreement
Identify a variable whose values weight           WEIGHT
  each observation
Figure 6: To do this > Use this table for PROC
FREQ statements
In this case we want to specify a crosstabulation
table of A (treatment) * B (disease), and we have
the variable count, which will be used as a weight
for the observations.
Since TEST is the only data set we are working with,
we don’t really need to tell SAS which data set to
use for the FREQ procedure, but it is good practice
to document this, so we should add the procedure
option DATA=test to the PROC FREQ statement as
well.
The code that results from these modifications
would be:
Proc freq data=test;
Tables a*b;
Weight count;
Run;
Now, that’s a little better, but there are a few more
things that we have to add. We need to calculate
expected cell frequencies, and statistics, for these
observed relationships between treatment and
disease. And since we weren’t asked for the
percentages in our table, we should probably remove
them, for clarity.
All of these things can be accomplished by adding
options to the TABLES statement. If we select the
TABLES statement we find another of the “To do this
> Use this” tables, which outlines the function of the
different TABLES statement options.
Again, this is a handy, comprehensive list of options
to the TABLES statement. Figure 7 provides a
shortened version of this table, selecting the options
that we will use in our example. As you can see, this
table is divided into sections such as Control
statistical analysis, Control additional table
information, and Control displayed output,
which group options by function.
Adding these options to the TABLES statement will
produce the statistics we are looking for, including
the expected cell frequencies, chi-square statistics,
and calculations of relative risk, as well as
eliminating the unwanted percentages from our table
output.
To do this                                        Use this option
Control statistical analysis
  Request chi-square tests and measures of        CHISQ
    association based on chi-square
  Request relative risk measures for              RELRISK
    2×2 tables
Control additional table information
  Display the expected cell frequency for         EXPECTED
    each cell
Control displayed output
  Suppress the column percentage for each cell    NOCOL
  Suppress the percentage, row percentage, and    NOPERCENT
    column percentage in crosstabulation tables,
    or percentages and cumulative percentages in
    one-way frequency tables and in list format
  Suppress the row percentage for each cell       NOROW
Figure 7: Examples from the To do this > Use this Option
section for the Tables Statement.
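As a minimal sketch of where the tour has brought us so far, the options selected in Figure 7 can be listed after a slash on the TABLES statement. The sketch assumes the TEST data set from the boss's data step already exists; it introduces nothing beyond the options named in the figure.

```sas
/* Sketch: PROC FREQ with the TABLES options chosen from Figure 7.  */
/* Assumes the TEST data set created by the data step shown earlier. */
proc freq data=test;
   tables a*b / expected             /* expected cell frequencies       */
                chisq                /* chi-square tests                */
                relrisk              /* relative risk for the 2x2 table */
                nocol norow nopercent;  /* suppress the percentages     */
   weight count;                     /* each row represents COUNT persons */
run;
```

As a hand check on the EXPECTED option: the expected frequency for a cell is its row total times its column total divided by the grand total. For the cell with A=0 and B=0 in these data, that is 409 × 520 / 821, or about 259.05.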
Venturing Outside the FREQ Procedure:
To this point, our tour of the SAS OnlineDoc has
remained primarily within the confines of the FREQ
Procedure chapter. However, as we venture outside
this section of the documentation we will see that we
already know a lot about how the rest of the
Procedures chapters are arranged. Our problem
also called for value labels for these data. Value
labels in SAS are called formats, and are created
using the FORMAT procedure (PROC FORMAT).
So if we are looking for a procedure we can select
the Contents tab of the OnlineDoc, then look in
BASE SAS > Procedures Guide. If we select The
FORMAT Procedure, we will find ourselves at what
should be a familiar-looking screen. Again, we have
an overview of the procedure, a summary of the
procedure’s syntax, including statements and options,
Concepts and Results specific to the FORMAT
procedure, and examples.
If we select the Procedure Syntax, we will again find
our “To do this > Use this” table. I will let you all
find this one yourselves. For the purposes of our
example, since we want to specify the character
strings ‘Yes’ and ‘No’ to label our values of 1 and
0 respectively, I will tell you the table suggests the
VALUE statement to do this.
Using the syntax guide for the PROC FORMAT
statement, the VALUE statement, and the special
section entitled “Specifying Values or Ranges,” we can
develop the following code to create the format
ynfmt:
proc format;
value ynfmt 0="No" 1="Yes";
run;
But this creates the format ynfmt; it does not apply it
to our variables A and B. How do we assign a
format to a variable?
This information is found in the SAS Language
Reference: Dictionary. Since this is our first foray
outside the Procedures Guide, we will move
deliberately through here.
If we open this book in the SAS OnlineDoc, we see
there are three main topic areas contained within:
Figure 8: SAS Language Reference: Dictionary
Here the meat of the book lies in the Dictionary of
Language Elements. Double-clicking on this chapter
gives us the following sub-chapters:
Introduction
Data Set Options
Formats
Functions and Call Routines
Informats
Statements, and
SAS System Options
Like most dictionaries, the SAS Dictionary of
Language Elements contains a comprehensive listing
of SAS words and their “definitions.” Within the
Dictionary, these SAS words are divided by type.
It seems tempting to go to the Formats section, since
we are looking for a way to use our newly created
format. However, the Formats section consists
primarily of a long list of the SAS-supplied formats.
Since we have already created our own format,
we are not really interested in this list right now, but
it is good to know where it is. (Here’s another
activity for when you have time to kill: look through
this list. Again, even the most experienced SAS user
may find that SAS already provides the format you
needed, but never knew existed.) In addition to this
list, there is a page in the section called Using
Formats, which might be what we want.
Figure 9: SAS Language Reference: Dictionary >
Formats > Using Formats
This page lists all the ways that a format can be
used. Two of the options listed here mention that
they can be used in a PROC step: the FORMAT
statement and the ATTRIB statement. If we page down
to find more information on the FORMAT statement,
we find a short description of how to use the
statement and a link to another page. Follow this
link to the new page, which describes the FORMAT
statement. The question is: where in the OnlineDoc
does this page reside?
Now is a good time to point out a couple more
features of the OnlineDoc. The Contents and Search
pages in the left third of the page can be used to
navigate through the SAS OnlineDoc
documentation. However, if you are following a
link from one page to the next, the Contents view
will not change with you. This is important in our
current example because, although the Contents page
still shows the Formats chapter of the SAS
Language Reference: Dictionary, we are no
longer in that chapter.
At the top of each page in the right two-thirds of the
screen there is a gray line that contains the Book,
Chapter, or Sub-Chapter title, depending on what
level of the documentation you are using. Also at
the top of each page in the right two-thirds of the
screen, there are three buttons to move to Chapter or
Book Contents (in other words, up a level) or to the
Previous or Next page within that level. Note that
clicking on the Previous button will not necessarily
bring you back to the page you were just on, but
back to the previous page in the chapter you are
currently in, or the previous chapter in the book you
are currently using. To be sure you return to
the page you were just on, you should use your
browser’s Back button rather than the Previous
button.
With this said, let’s return to the problem at hand.
We followed a link entitled FORMAT from the
Formats chapter of the SAS Language Reference:
Dictionary to a page entitled FORMAT. A quick
way to figure out where you are within the Contents of
the SAS OnlineDoc can be to look at the title line.
In our case, however, the title line says FORMAT,
which doesn’t really help. If we use the
Chapter Contents button to go up a level, we find that we
are in the Statements chapter of the SAS Language
Reference: Dictionary.
The Statements chapter contains a listing of all the
SAS statements that can be used in a data step or
that can be used anywhere in SAS code (global
statements). These should be distinguished from the
procedure statements we have already seen, such as
TABLES, WEIGHT, and VALUE, which are
procedure specific. FORMAT is an example of a
statement that can be used in a data step as well as
in a procedure, so it is documented here rather than
with any one procedure. The FORMAT
sub-chapter of the Statements chapter will serve as
our example of a typical statement chapter in the
same way as the PROC FREQ chapter was our
example of a procedure chapter. For any given
statement you will find the syntax for using the
statement, as well as any arguments to the statement
(in this case variables and formats). There will also
be a Details section to describe the particulars of
using the statement, a Comparison section with links to
statements with similar uses, and examples
describing common uses of the statement.
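To illustrate that a statement documented in the Statements chapter is not tied to one procedure, here is a hedged sketch that attaches a SAS-supplied format to the count variable in a data step and in a PROC step. The data set name TEST_FMT is made up for this illustration, and COMMA8. is simply one of the SAS-supplied formats from the Formats list; ATTRIB is shown as the kind of related statement the Comparison section points to.

```sas
/* In a data step, FORMAT stores the format with the new data set. */
data test_fmt;               /* TEST_FMT is a hypothetical name */
   set test;
   format count comma8.;
run;

/* In a PROC step, FORMAT applies only while that step runs. */
proc print data=test;
   format count comma8.;
run;

/* ATTRIB can make the same association, here again in a PROC step. */
proc print data=test;
   attrib count format=comma8.;
run;
```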
From this section we can ascertain that the syntax
we need to apply our ynfmt to the variables A and B
would be:
Format a b ynfmt.;
Adding this code to our PROC FREQ does in fact
add the formatted values, and we are almost done.
But wait, the values on our output table say “No”
then “Yes”, rather than the more logical (at least to
me) “Yes” then “No.” How can we fix this? I
will give you two hints: you will need the PROC
FREQ statement option ORDER=DATA as well as
the SORT procedure. Hopefully you now know
where to look up information on these two items in
the SAS OnlineDoc.
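For readers who want to check their answer, one possible way to combine the two hints is sketched below. It is an assumption about how the pieces fit together, not necessarily the code the author used: ORDER=DATA makes PROC FREQ list values in the order they first appear in the data set, so sorting in descending order puts 1 (“Yes”) ahead of 0 (“No”). The data set name TEST_SORTED is made up for this illustration.

```sas
/* Sketch: put the 1 ("Yes") rows first, then keep that order in the table. */
proc sort data=test out=test_sorted;   /* TEST_SORTED is a hypothetical name */
   by descending a descending b;
run;

proc freq data=test_sorted order=data; /* ORDER=DATA keeps the data set order */
   tables a*b;
   weight count;
   format a b ynfmt.;                  /* assumes ynfmt was created by PROC FORMAT */
run;
```

Writing the sorted rows to a new data set with OUT= leaves TEST untouched; sorting TEST in place would work equally well with ORDER=DATA.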
Output Delivery System (ODS)
The original problem stated that your boss wanted
the table and statistics posted to your work group’s
intranet page, and although you can have your buddy
in IT add web pages to the server for you, you still
need to get your output into that form. The Output
Delivery System® in Version 8 of the SAS System
allows you to send results directly to HTML, and the
most basic forms of this require only a few simple
statements.
I know what you’re thinking: “Statements! I know
where they are, in the SAS Language Reference:
Dictionary. I’ll look those up and be done in no
time.” Well, you can look them up and be done in no
time, but you won’t find them in the Dictionary. At
the beginning of our journey, when we first opened
the BASE SAS software documentation, there were
seven books within the BASE SAS software topic
heading. One of these was the Guide to the SAS
Output Delivery System, and that is where we will
find the statements needed to complete our assigned
task.
If you have yet to use the Output Delivery System,
the introduction provides an excellent overview of
what ODS can do, and it is another one of those
sections of the SAS OnlineDoc that even more
advanced users should read in their spare time. But
the heart of the Guide to the SAS Output Delivery
System is the book entitled Reference. Again, I
will not spend time going into detail about ODS; this
paper is about how to use the OnlineDoc. If you
want to learn more about PROC TEMPLATE,
which is used to modify the style of ODS output, the
OnlineDoc for this procedure resembles that of
PROC FREQ and PROC FORMAT, which we
worked through before.
The chapter entitled The ODS Statements contains
all the information you need to use the default
styles that SAS provides for its HTML, PRINTER,
LISTING, and OUTPUT output destinations.
(Documentation on the new RTF destination, as well as
some other new features of ODS available with Versions
8.1 and now 8.2, is available online from the SAS
Technical Support web site, but was not available for the
SAS OnlineDoc Version 8.)
The To do this > Use this statement table for this
Chapter is hidden under the heading “What does
each ODS statement do?” Here we find that to
create HTML output we want to use the ODS
HTML statement, which certainly seems reasonable.
If we then use the Chapter Contents button (or your
Browser’s BACK button) to go back to the ODS
Statements page we can select ODS HTML to view
its syntax.
Figure 10: The ODS Statements: ODS HTML
It is on this page that you will discover that creating
a web page with a table of contents from your output
really is as easy as one, two, three, where these are
BODY= (a.k.a. FILE=), CONTENTS=, and
FRAME=. These three keywords specify different
HTML files. Placing your destination
files to the right of the equals signs and wrapping
ODS HTML around our PROC FREQ does the rest.
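Before adding the table of contents and frame, it may help to see the smallest form of the wrapper. This is a sketch under assumptions: the file name freq.html is made up, and with no path given SAS will write it wherever your session's current directory happens to be.

```sas
/* Sketch: the most basic ODS HTML wrapper, using only BODY=. */
ods html body="freq.html";   /* hypothetical file name */
proc freq data=test;
   tables a*b;
   weight count;
run;
ods html close;              /* close the file so a browser can open it */
```

CONTENTS= and FRAME= are then additions to this same statement rather than separate steps.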
Our final code to create an HTML document
containing a table with observed cell frequencies and
expected cell frequencies for the relationship
between A (Treatment) and B (Disease), Chi-square
statistics, and Relative risks should be:
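ods listing close;   /* turn off the default listing output */
ods html
   contents="\MYPATH\freqycontents.html"   /* table of contents */
   body="\MYPATH\freqybody.html"           /* the PROC FREQ output itself */
   frame="\MYPATH\freqshow.html";          /* frame page that ties the two together */
proc freq data=test order=data;
   tables a*b / nocol norow nopercent expected
                chisq relrisk;
   weight count;
   format a b ynfmt.;
run;
ods html close;      /* close the HTML files so they can be viewed */
ods listing;         /* turn the listing output back on */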
Other Documentation
For most of this paper we have stayed within the
confines of the BASE SAS software documentation.
This comprehensive reference is only one of 18
libraries that comprise the whole of the SAS
OnlineDoc. Other important references include the
SAS/STAT User’s Guide and the SAS/GRAPH
Software: Reference.
The first 12 chapters of the SAS/STAT User’s
Guide contain information on the classes of
statistical analyses that SAS can perform. These
chapters also provide a good review of some of the
statistical theories that underlie these analyses. The
remaining chapters provide descriptions of
each of the SAS/STAT procedures, and follow the
same general layout as the descriptions of
the BASE SAS software procedures.
The SAS/GRAPH Software: Reference contains a
chapter on SAS/GRAPH Statements, as well as
separate chapters for each of the SAS/GRAPH
procedures, which look similar to their BASE SAS
counterparts. These are the best places to start when
you want to delve into the intricacies of the
SAS/GRAPH component.
Conclusion
Hopefully, this paper provided you with an
organized method for understanding the framework
of the SAS OnlineDoc documentation. If nothing
else I hope to have turned you on to the incredible
amount of information available in this reference,
and set you up with a more systematic way of
working through it. For new SAS users, I hope I
have given away at least some of the hiding places
frequented by the syntax you are seeking to clarify.
For experienced SAS users, I think I have shown
that there is always more to learn, and that the SAS
OnlineDoc documentation can provide the first step
on the road to this knowledge. Happy hunting!
Reference
SAS Institute Inc. (1999), SAS OnlineDoc, Version
8, Cary, NC: SAS Institute Inc.
SAS and all other SAS Institute Inc. product or
service names are registered trademarks or
trademarks of SAS Institute Inc. in the USA and
other countries. ® indicates USA registration.
Other brand and product names are registered
trademarks or trademarks of their respective
companies.
Contact Information
Please send questions, comments and suggestions to:
Kevin Delaney
NYS Office of Mental Health
44 Holland Ave
Albany, NY 12229
(518) 473-7868
coevkpd@omh.state.ny.us