Help! I need Help with the Help!?!

Kevin P. Delaney, New York State Office of Mental Health, Albany, NY

Abstract

The SAS OnlineDoc documentation provided with Version 8 is a comprehensive resource, providing electronic, indexed, and searchable copies of all official documentation published by the SAS Institute. However, such a complete reference can be difficult to navigate. The task can become almost insurmountable for a new SAS user who isn’t quite sure whether he or she is looking for a procedure, a statement, or an option, or even which SAS component he or she is working in. This paper is designed to provide a systematic overview of the structure of the OnlineDoc. Using the FREQ procedure as an example, we will identify which items are procedure statements and which are procedure options. By adding SAS statements, formats, global options and ODS statements, we will gain experience navigating the OnlineDoc. Having completed our tour of the Base SAS OnlineDoc, we will briefly describe the documentation for some of the other more popular SAS components, such as SAS/GRAPH® software and SAS/STAT® software. This paper is intended both for new SAS users and for more experienced SAS users (gurus?) looking for a guide to teach other novices how to get started in the world of SAS.

Introduction

This paper grew out of a desire to provide a starting point for some of my coworkers who were new to the SAS System. I found that many of the questions I received concerned things that were undoubtedly at the fingertips of the person asking, if only they understood the SAS OnlineDoc well enough to know where to look. The purpose of this paper is therefore to provide a guided tour of the functionality of the OnlineDoc.

Your Mission

I will attempt to do this by working through the following problem. The boss has given you a small data set with three variables. The variable A is exposure to Treatment; it takes the value 1 when a person received the treatment, and 0 when they did not. The variable B is an outcome variable, taking the value 1 when a person developed the disease under study, and 0 when they did not. The variable COUNT contains the number of persons with each of the four possible combinations of A and B.

Your boss has given you the data step to create this data set:

   data test;
      input a b count;
      label a ="Treatment"
            b ="Disease";
      datalines;
   0 1 234
   1 0 345
   1 1 67
   0 0 175
   ;
   run;

But he wants you to “learn SAS.” So his little “test” will be for you to produce a table with frequency counts and expected values, as well as Chi-square tests of the likelihood of seeing this distribution of treatment and disease by chance alone, and the relative risk of developing disease for persons receiving the treatment compared to those who did not. Oh, and he wants you to furnish the results on your intranet web page by close of business so he can look at them when he gets home. So what now? And no, you can’t quit...

Getting Started: Where the HELP!?!

Let’s start with the basics. How do you access the SAS OnlineDoc documentation? Well, if SAS has been configured correctly on your computer or server, you should be able to go to the Help menu on the SAS System toolbar, then to Books and Training, then to SAS OnlineDoc:

Figure 1: Opening the OnlineDoc

This should then open your computer’s default web browser at the “Home page” of the online documentation, which looks like this:

Figure 2: SAS OnlineDoc Start page

An important note: unless you disable the option, your browser will ask you whether or not you want to trust the Java applet that helps the SAS OnlineDoc run. (This is a form of virus protection, and can be very important if you are considering running a program from a connection to the Internet. However, since you are usually running the SAS OnlineDoc from your hard drive or a local area network, it should be safe; after all, it did come from the SAS Institute.)

The moral of the story is that you should check YES on this security warning, because checking NO will disable the search menu in the OnlineDoc. If you don’t believe me, try checking NO and then search for such “uncommon” SAS words as PROC, DATA or INPUT. It can be an amusing way to kill time.

A Guided Tour

Searching

I won’t spend too much time talking about the Index or Search facilities in the OnlineDoc. Both of these options are well explained in the Help section of the OnlineDoc, and if I were to address them here I would be providing Help with the Help for the Help, which would just be too confusing. The Search option is, however, a great place to illustrate what a complete, and potentially cumbersome, reference the SAS OnlineDoc documentation can be.

The FREQ procedure is one of the most commonly used procedures in the world of SAS. So let’s use the Search tab to look up the phrase frequency tables to learn more about how SAS generates tables of counts of variables. Well, searching for the phrase “frequency tables” produces 48 matches. Figure 3 shows these results.

Figure 3: Searching for “frequency tables”

The first of these is the Frequency Tables and Statistics page from the SAS/STAT User’s Guide, which may or may not be what we were looking for. For those of us who have been using SAS for a while, it may not be that difficult to page through these search results until we find the information we need. But imagine you are a new user, trying to figure out which page to choose. And if our search had been not for frequency tables but for PROC FREQ, it would have returned 126 matches. Hopefully, you see my point. It can be much easier to find the information you want by paging through the Contents of the OnlineDoc.

The Contents of the SAS OnlineDoc documentation is divided by SAS component. Information on the FREQ procedure is found in both the BASE SAS documentation and the SAS/STAT module. Figure 4 shows a close-up of the BASE SAS module of the OnlineDoc. Sticking with our example of trying to generate frequency tables, let’s find the SAS guide to the FREQ procedure. To do this you have to know a little more about how the OnlineDoc is laid out.

Figure 4: The BASE SAS topic group from the SAS OnlineDoc documentation.

The plus (+) sign and the graphic of multiple books next to BASE SAS indicate that there are folders contained within this topic heading. If we double click on BASE SAS we will see that there are in fact seven topic groups within BASE SAS:

   BASE SAS >
      SAS Language Reference: Dictionary
      SAS Macro Language Reference
      Moving and Accessing SAS Files across Operating Environments
      SAS Procedures Guide
      Guide to the SAS Output Delivery System
      SAS SQL Query Window User’s Guide
      Host Specific Information

Now where might we find information on the FREQ procedure? As a general rule, PROC anything for BASE SAS can be found in the SAS Procedures Guide, and if the PROC you are looking for is not found there, it is probably part of a different component. The notable exception to this rule (those new to SAS will quickly learn that there is at least one exception to every rule) is the TEMPLATE procedure, which is found in the Guide to the SAS Output Delivery System. If we examine the SAS Procedures Guide we will find four chapters:

   Changes and Enhancements
   Concepts
   Procedures
   Appendices

The meat of this documentation lies in the Procedures chapter, which gives detailed descriptions of the 38 procedures available in BASE SAS. For an excellent overview of these, see the Concepts chapter “Choosing the Right Procedure.” (I recommend this even for more advanced SAS users; even you may be surprised at some of the things BASE SAS can do.) The FREQ procedure provides a representative example of what you will find on most of the procedure pages.

Figure 5: The FREQ Procedure Chapter of the SAS OnlineDoc > BASE SAS > Procedures Guide

Figure 5 shows the topics included in the FREQ Procedure chapter of the Procedures Guide. First you will find a brief overview of the procedure. This is followed by a detailed description of the procedure syntax, including the PROC A_SAS_PROC statement and all other optional statements. There will usually be a Concepts or Results section (or both) with advice and special topics specific to the procedure. Finally there will be examples illustrating how to use the procedure.

Let’s take a closer look at what is found in the FREQ Procedure chapter. The procedure syntax describes the basic statements used in the procedure in a concise format, in this case:

   PROC FREQ <option(s)>;
      BY <DESCENDING> variable-1 <...<DESCENDING> variable-n> <NOTSORTED>;
      EXACT statistic-keyword(s) </ option(s)>;
      OUTPUT statistic-keyword(s) <OUT=SAS-dataset>;
      TABLES request(s) </ option(s)>;
      TEST statistic-keyword(s);
      WEIGHT variable;

The PROC FREQ statement is the fundamental component of the procedure; all of the other statements are optional.

The sample code provided with this paper contains the actual SAS code that we will build as we tour the OnlineDoc. It is not intended to teach you how to program in SAS, but merely to serve as an example of how to pull information from the OnlineDoc. That said, let’s return to the PROC FREQ chapter.
By default, PROC FREQ; RUN; will produce one-way frequency tables of all the variables in the most recently created data set, which in most cases is not what you want. You change this default procedure output by adding procedure STATEMENTS and procedure OPTIONS. The handy To do this > Use this tables in each section help you decide which procedure statements and options you want to use.

   To do this                                              Use this statement
   ------------------------------------------------------  ------------------
   Calculate separate frequency or crosstabulation         BY
     tables for each BY group
   Request exact tests for specified statistics            EXACT
   Create an output data set that contains specified       OUTPUT
     statistics
   Specify frequency or crosstabulation tables and         TABLES
     request tests and measures of association
   Request asymptotic tests for measures of association    TEST
     and agreement
   Identify a variable whose values weight each            WEIGHT
     observation

Figure 6: To do this > Use this table for PROC FREQ statements

In this case we want to specify a cross-tabulation table of A (treatment) * B (disease), and we have the variable COUNT, which will be used as a weight for the observations. Since TEST is the only data set we are working with, we don’t really need to tell SAS which data set to use for the FREQ procedure, but it is good practice to document this, so we should add the procedure option DATA=TEST to the PROC FREQ statement as well. The code that results from these modifications would be:

   proc freq data=test;
      tables a*b;
      weight count;
   run;

Now, that’s a little better, but there are a few more things that we have to add. We need to calculate expected cell frequencies, and statistics, for the observed relationship between treatment and disease. And since we weren’t asked for the percentages in our table, we should probably remove them, for clarity. All of these things can be accomplished by adding options to the TABLES statement.
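
Before adding those options, it may help to see what two of the requested quantities actually are. The DATA step below is a minimal sketch, not part of the paper's sample code, that works out one expected cell frequency and the relative risk by hand from the four cell counts in TEST. The variable names (n11, exp11, relrisk, and so on) are made up for this illustration, and the formulas are the standard textbook ones: expected count = row total × column total / grand total, and relative risk = risk among the treated / risk among the untreated.

```sas
/* Hand check of two quantities we will ask PROC FREQ for.      */
/* Illustrative only: variable names here are hypothetical.     */
data _null_;
   /* Cell counts copied from the TEST data set */
   n11 = 67;    /* a=1 (treated),   b=1 (disease)    */
   n10 = 345;   /* a=1 (treated),   b=0 (no disease) */
   n01 = 234;   /* a=0 (untreated), b=1 (disease)    */
   n00 = 175;   /* a=0 (untreated), b=0 (no disease) */

   total     = n11 + n10 + n01 + n00;   /* grand total            */
   treated   = n11 + n10;               /* row total for a=1      */
   untreated = n01 + n00;               /* row total for a=0      */
   diseased  = n11 + n01;               /* column total for b=1   */

   /* Expected count for the treated-and-diseased cell if        */
   /* treatment and disease were independent                     */
   exp11 = treated * diseased / total;

   /* Relative risk: risk in treated divided by risk in untreated */
   relrisk = (n11 / treated) / (n01 / untreated);

   /* Write the results to the SAS log */
   put total= treated= untreated= diseased=;
   put exp11= 8.2 relrisk= 6.4;
run;
```

With these counts the expected frequency for the treated-and-diseased cell comes to about 151.05 and the relative risk to about 0.284. Note that the relative risk values PROC FREQ prints depend on how the rows and columns of the table are ordered, so they may be expressed in a different orientation than this hand calculation.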
If we select the TABLES statement we find another of the To do this > Use this tables, which outlines the function of the different TABLES statement options. Again, this is a handy, comprehensive list of options to the TABLES statement. Figure 7 provides a shortened version of this table, showing only the options that we will use in our example. As you can see, the table is divided into sections such as Control statistical analysis, Control additional table information, and Control displayed output, which group the options by function. Adding these options to the TABLES statement will produce the statistics we are looking for, including the expected cell frequencies, Chi-square statistics, and calculations of relative risk, as well as eliminating the unwanted percentages from our table output.

   To do this                                              Use this option
   ------------------------------------------------------  ---------------
   Control statistical analysis
     Request chi-square tests and measures of              CHISQ
       association based on chi-square
     Request relative risk measures for 2×2 tables         RELRISK
   Control additional table information
     Display the expected cell frequency for each cell     EXPECTED
   Control displayed output
     Suppress the column percentage for each cell          NOCOL
     Suppress the percentage, row percentage, and          NOPERCENT
       column percentage in crosstabulation tables, or
       percentages and cumulative percentages in one-way
       frequency tables and in list format
     Suppress the row percentage for each cell             NOROW

Figure 7: Examples from the To do this > Use this option section for the TABLES statement.

Venturing Outside the FREQ Procedure

To this point, our tour of the SAS OnlineDoc has remained primarily within the confines of the FREQ Procedure chapter. However, as we venture outside this section of the documentation we will see that we already know a lot about how the rest of the Procedures chapters are arranged.

Our problem also called for value labels for these data. Value labels in SAS are called formats, and are created using the procedure PROC FORMAT. So if we are looking for a procedure, we can select the Contents tab of the OnlineDoc, then look in BASE SAS > Procedures Guide. If we select The FORMAT Procedure, we will find ourselves at what should be a familiar-looking screen. Again, we have an overview of the procedure, a summary of the procedure’s syntax, including statements and options, Concepts and Results specific to the FORMAT procedure, and examples. If we select the Procedure Syntax, we will again find our To do this > Use this table. I will let you all find this one yourselves. For the purposes of our example, since we want to specify the character strings “Yes” and “No” to label our values of 1 and 0 respectively, I will tell you that the table suggests the VALUE statement. Using the syntax guide for the PROC FORMAT statement, the VALUE statement, and the special section entitled Specifying Values or Ranges, we can develop the following code to create the format ynfmt.:

   proc format;
      value ynfmt 0="No"
                  1="Yes";
   run;

But this only creates the format ynfmt; it does not apply it to our variables A and B. How do we assign a format to a variable? This information is found in the SAS Language Reference: Dictionary. Since this is our first foray outside the Procedures Guide, we will move deliberately through here. If we open this book in the SAS OnlineDoc, we see that there are three main topic areas contained within it:

Figure 8: SAS Language Reference: Dictionary

Here the meat of the book lies in the Dictionary of Language Elements.
Double clicking on this chapter gives us the following sub-chapters:

   Introduction
   Data Set Options
   Formats
   Functions and Call Routines
   Informats
   Statements
   SAS System Options

Like most dictionaries, the SAS Dictionary of Language Elements contains a comprehensive listing of SAS words and their “definitions.” Within the Dictionary, these SAS words are divided by type. It seems tempting to go to the Formats section, since we are looking for a way to use our newly created format. However, the Formats section consists primarily of a long list of the SAS-supplied formats. Since we have already created our own format, we are not really interested in this list right now, but it is good to know where it is. (Here’s another activity for when you have time to kill: look through this list. Again, even the most experienced SAS user may find that SAS already provides the format you needed, but never knew existed.) In addition to this list, there is a page in the section called Using Formats, which might be what we want.

Figure 9: SAS Language Reference: Dictionary > Formats > Using Formats

This page lists all the ways that a format can be used. Two of the options here mention that they can be used in a PROC step: the FORMAT statement and the ATTRIB statement. If we page down to find more information on the FORMAT statement, we find a short description of how to use the statement and a link to another page. Follow this link to the new page, which describes the FORMAT statement. The question is: where in the OnlineDoc does this page reside?

Now is a good time to point out a couple more features of the OnlineDoc. The Contents and Search pages in the left third of the screen can be used to navigate through the SAS OnlineDoc documentation. However, if you are following a link from one page to the next, the Contents view will not change with you.
This is important in our current example because, although the Contents page still shows the Formats chapter of the SAS Language Reference: Dictionary, we are no longer in that chapter. At the top of each page in the right two-thirds of the screen there is a gray line that contains the Book, Chapter or Sub-Chapter title, depending on what level of the documentation you are using. Also at the top of each page in the right two-thirds of the screen there are three buttons, which move to the Chapter or Book Contents (in other words, up a level) or to the Previous or Next page within that level. Again, clicking on the Previous button will not necessarily bring you back to the page you were just on, but to the previous page in the chapter you are currently in, or the previous chapter in the book you are currently using. To be sure you return to the page you were just on, you should use your browser’s Back button rather than the Previous button.

With this said, let’s return to the problem at hand. We followed a link entitled FORMAT from the Formats chapter of the SAS Language Reference: Dictionary to a page entitled FORMAT. A quick way to figure out where you are within the Contents of the SAS OnlineDoc can be to look at the title line. In our case, however, the title line says FORMAT, which doesn’t really help. But if we use the Chapter Contents button to go up a level, we find that we are in the Statements chapter of the SAS Language Reference: Dictionary. The Statements chapter contains a listing of all the SAS statements that can be used in a data step or that can be used anywhere in SAS code (global statements). These should be distinguished from the procedure statements we have already seen, such as TABLES, WEIGHT, and VALUE, which are procedure specific. FORMAT is an example of a statement that can be used in a data step or in any procedure, so it is not tied to a single procedure.
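
Because the FORMAT statement can appear in either kind of step, where you put it changes how long the assignment lasts. The sketch below is an illustration under stated assumptions rather than part of the paper's sample code: it recreates the TEST data set and the ynfmt. format so that it runs on its own, and the data set name test_fmt is hypothetical.

```sas
/* Recreate the paper's data set and format so this runs on its own */
data test;
   input a b count;
   label a ="Treatment" b ="Disease";
   datalines;
0 1 234
1 0 345
1 1 67
0 0 175
;
run;

proc format;
   value ynfmt 0="No" 1="Yes";
run;

/* In a DATA step, the format is stored with the new data set,   */
/* so later steps that read test_fmt pick it up automatically.   */
data test_fmt;              /* hypothetical data set name */
   set test;
   format a b ynfmt.;
run;

/* PROC CONTENTS lists the format stored for each variable */
proc contents data=test_fmt;
run;

/* In a PROC step, the format applies only for this step;        */
/* the TEST data set itself is left unchanged.                   */
proc print data=test;
   format a b ynfmt.;
run;
```

For a one-off table such as ours, placing the FORMAT statement inside the procedure keeps the stored data untouched; assigning it in a data step is the more convenient choice when every later step should display the labels.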
The FORMAT sub-chapter of the Statements chapter will serve as our example of a typical statement chapter, in the same way that the PROC FREQ chapter was our example of a procedure chapter. For any given statement you will find the syntax for using the statement, as well as any arguments to the statement (in this case variables and formats). There will also be a Details section describing the particulars of using the statement, a Comparisons section with links to statements with similar uses, and examples describing common uses of the statement. From this section we can ascertain that the syntax we need to apply our ynfmt. format to the variables A and B would be:

   format a b ynfmt.;

Adding this code to our PROC FREQ does in fact add the formatted values, and we are almost done. But wait: the values on our output table say “No” then “Yes,” rather than the more logical (at least to me) “Yes” then “No.” How can we fix this? I will give you two hints: you will need the PROC FREQ statement option ORDER=DATA as well as the SORT procedure. Hopefully you now know where to look up information on these two items in the SAS OnlineDoc.

Output Delivery System (ODS)

The original problem stated that your boss wanted the table and statistics posted to your work group’s intranet page, and although you can have your buddy in IT add web pages to the server for you, you still need to get your output into that form. The Output Delivery System in Version 8 of the SAS System allows you to output results directly to HTML, and the most basic forms of this require only a few simple statements. I know what you’re thinking: “Statements, I know where they are, in the SAS Language Reference: Dictionary. I’ll look those up and be done in no time.” Well, you can look them up and be done in no time, but you won’t find them in the Dictionary. At the beginning of our journey, when we first opened the BASE SAS software documentation, there were seven books within the BASE SAS software topic heading.
One of these was the Guide to the SAS Output Delivery System, and that is where we will find the statements needed to complete our assigned task. If you have yet to use the Output Delivery System, the introduction provides an excellent overview of what ODS can do, and it is another one of those sections of the SAS OnlineDoc that even more advanced users should read in their spare time. But the heart of the Guide to the SAS Output Delivery System is the book entitled Reference. Again, I will not spend time going into detail about ODS; this paper is about how to use the OnlineDoc. If you want to learn more about PROC TEMPLATE, which is used to modify the style of ODS output, the OnlineDoc for this procedure resembles that of PROC FREQ and PROC FORMAT, which we worked through before.

The chapter entitled The ODS Statements contains all the information you need to use the default styles that SAS provides for its HTML, PRINTER, LISTING and OUTPUT output destinations. (Documentation on the new RTF destination, as well as some other new features of ODS available with Versions 8.1 and now 8.2, is available online from the SAS Technical Support web site, but was not available for the SAS OnlineDoc Version 8.) The To do this > Use this statement table for this chapter is hidden under the heading “What does each ODS statement do?” Here we find that to create HTML output we want to use the ODS HTML statement, which certainly seems reasonable. If we then use the Chapter Contents button (or the browser’s Back button) to go back to the ODS Statements page, we can select ODS HTML to view its syntax.

Figure 10: The ODS Statements: ODS HTML

It is on this page that you will discover that creating a web page with a table of contents from your output really is as easy as one, two, three, where these are BODY= (also known as FILE=), CONTENTS=, and FRAME=. These three keywords describe different HTML file specifications.
Placing your destination files to the right of the equals signs and wrapping ODS HTML around our PROC FREQ does the rest. Our final code to create an HTML document containing a table with observed and expected cell frequencies for the relationship between A (Treatment) and B (Disease), Chi-square statistics, and relative risks should be:

   /* Turn off the default listing output while we write HTML */
   ods listing close;

   /* Open the HTML destination: table of contents, body, and frame files */
   ods html contents="\MYPATH\freqycontents.html"
            body="\MYPATH\freqybody.html"
            frame="\MYPATH\freqshow.html";

   /* order=data displays values in the order they appear in TEST */
   proc freq data=test order=data;
      tables a*b / nocol norow nopercent expected chisq relrisk;
      weight count;
      format a b ynfmt.;
   run;

   /* Close the HTML files and restore the listing output */
   ods html close;
   ods listing;

Other Documentation

For most of this paper we have stayed within the confines of the BASE SAS software documentation. This comprehensive reference is only one of the 18 libraries that make up the whole of the SAS OnlineDoc. Other important references include the SAS/STAT User’s Guide and the SAS/GRAPH Software: Reference.

The first 12 chapters of the SAS/STAT User’s Guide contain information on the classes of statistical analyses that SAS can perform. These chapters also provide a good review of some of the statistical theory that underlies these analyses. The remaining chapters describe each of the SAS/STAT procedures, and follow the same general layout as the descriptions of the BASE SAS software procedures.

The SAS/GRAPH Software: Reference contains a chapter on SAS/GRAPH statements, as well as separate chapters for each of the SAS/GRAPH procedures, which look similar to their BASE SAS counterparts. These are the best places to start when you want to delve into the intricacies of the SAS/GRAPH component.

Conclusion

Hopefully, this paper has provided you with an organized method for understanding the framework of the SAS OnlineDoc documentation. If nothing else, I hope to have turned you on to the incredible amount of information available in this reference, and set you up with a more systematic way of working through it.

Reference

SAS Institute Inc. (1999), SAS OnlineDoc, Version 8, Cary, NC: SAS Institute Inc.