Debugging SAS Programs
Finding and Correcting Errors

Checking the Log
• Always check the log file after you run a program.
• Start at the beginning of the log file and correct the first error. One mistake can produce many error messages.

Errors, Warnings, and Notes
• SAS writes three kinds of messages to the log file: Errors, Warnings, and Notes.
• An Error indicates that there was a problem in the program and SAS could not execute the program.
• A Warning indicates that there was a problem in the program, but SAS figured out how to continue.
• Notes can indicate that a program worked as planned or that a program worked differently than intended. Some Notes are very important!

Backwards Illustrations
• When your SAS programs don’t work the way that you want, you’ll have to figure out what went wrong.
• In this lesson, errors are introduced intentionally so that you can see the results.

Missing Semicolons
• Missing semicolons are the most common mistake.
• From Program 4, if:
    DATA weight;
    INFILE 'C:\SAS_Files\tomhs.dat';
• Is replaced with:
    DATA weight
    INFILE 'C:\SAS_Files\tomhs.dat';

One Missing Semicolon Produced:
    ERROR: No DATALINES or INFILE statement.
    ERROR: Extension for physical file name "C:\SAS_Files\tomhs.data" does not correspond to a valid member type.
    NOTE: The SAS System stopped processing this step because of errors.
    WARNING: The data set WORK.WEIGHT may be incomplete. When this step was stopped there were 0 observations and 8 variables.
    WARNING: The data set WORK.INFILE may be incomplete. When this step was stopped there were 0 observations and 8 variables.

How to figure out what happened:
• The Error said that there wasn’t a DATALINES or INFILE statement, but you know that there was one.
• SAS must not have identified the INFILE statement as an INFILE statement.
• Checking the code shows that SAS thought that the INFILE statement was part of the DATA statement because a semicolon was missing.
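The reasoning in the missing-semicolon example above can be read straight from the two warnings: they name WORK.WEIGHT and WORK.INFILE, which suggests SAS treated everything up to the first semicolon as one DATA statement listing several data sets. The sketch below shows the broken line as a comment, followed by the corrected step. The INPUT statement is copied from this lesson's weight example, and the step assumes the file exists at the path shown.

```sas
/* Submitted by mistake (no semicolon after "weight"):        */
/*     DATA weight INFILE 'C:\SAS_Files\tomhs.dat';            */
/* SAS reads this as a single DATA statement, so "weight" and  */
/* "INFILE" are both taken as data set names and no INFILE     */
/* statement is ever seen.                                     */

/* Corrected: the semicolon ends the DATA statement, so INFILE */
/* is parsed as a statement of its own.                        */
DATA weight;
  INFILE 'C:\SAS_Files\tomhs.dat';
  INPUT @1   ptid   $10.
        @12  clinic $1.
        @27  age    2.
        @30  sex    1.
        @58  height 4.1
        @85  weight 5.1
        @140 cholbl;
RUN;
```

When a log message names something you never wrote as a data set or variable (here, INFILE), look for a missing semicolon on the statement just before it.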
Another Missing Semicolon:
• From Program 4, if:
    PROC FREQ DATA=weight;
    TABLES sex clinic ;
    TITLE 'Frequency Distribution of Clinical Center and Gender';
• Is replaced with:
    PROC FREQ DATA=weight;
    TABLES sex clinic
    TITLE 'Frequency Distribution of Clinical Center and Gender';

The Missing Semicolon Produced:
    ------------------------------------------------------
    22
    200
    ERROR: Variable TITLE not found.
    ERROR 22-322: Syntax error, expecting one of the following: a name, ;, (, *, -, /, :, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.
    ERROR 200-322: The symbol is not recognized and will be ignored.

How to figure out what happened:
• SAS says that the variable TITLE wasn’t found.
• You know that TITLE isn’t a variable.
• SAS must think that TITLE is part of a list of variables.
• There is no semicolon separating TITLE from the variables SEX and CLINIC!

Unbalanced Quotation Marks
• An "unbalanced quotation marks" warning can indicate that a quotation mark is missing.
• From Program 5, if:
    DATA tdata;
    INFILE 'C:\SAS_Files\tomhs.data';
• Is replaced with:
    DATA tdata;
    INFILE 'C:\SAS_Files\tomhs.data;

One Missing Quotation Mark Produced:
    WARNING: The quoted string currently being processed has become more than 262 characters long. You may have unbalanced quotation marks.
    861 ;
    850 INFILE 'C:\SAS_Files\tomhs.data;
    ------------------------
    49
    NOTE 49-169: The meaning of an identifier after a quoted string may change in a future SAS release. Inserting white space between a quoted string and the succeeding identifier is recommended.

What if You Balance the Quotes and Run Again?
• You still get errors!
• SAS interprets your program as a continuation of the program it ran before. Because the earlier submission left a quote open, your quotes are still unbalanced and you get the Note:
    NOTE 49-169: The meaning of an identifier after a quoted string may change in a future SAS release. Inserting white space between a quoted string and the succeeding identifier is recommended.
The Fix: Another Unbalanced Quote
• Run these two lines of code:
    '
    RUN;
• Do this ONCE (so the unbalanced quote becomes balanced).
• Your program should run properly now (as long as it is error-free).

Another Fix
• This may be easier to understand:
• First, correct the unbalanced quote.
• Second, save your SAS program.
• Third, exit SAS.
• Fourth, reopen SAS and run your saved program.

Invalid Data
• If SAS is expecting a number but gets text instead, you can get invalid data notes.
• From Program 5, if:
    @ 12 clinic $1.
• Is replaced with:
    @ 12 clinic 1.

One Missing $ Produced:
    NOTE: The infile 'C:\SAS_Files\tomhs.dat' is:
          File Name=C:\SAS_Files\tomhs.dat,
          RECFM=V,LRECL=256
    NOTE: Invalid data for clinic in line 1 12-12.
    RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+---
    1 C03615 C 11/10/1987 5 51 1 1 06/26/1936 5 4 2 71.5 05/17/1988 11/25/
    80 1988 205.5 199.0 093 084 143 138 36 36 5 260 046 111 159 4.8 063 02213 45.6 9.3 46.4 00471 00711 03611 01906 0.0 00 1 238 1 1 1 2 1 1 1 1 1 1
    ptid=C03615 clinic=. group=5 sex=1 educ=4 evsmoke=2 alcbl=0 sebl_1=1 sebl_6=1 _ERROR_=1 _N_=1

Mixing up PROCs
• Different PROCs have different statements and options.
• From Program 5, if:
    PROC FREQ DATA=tdata;
    TABLES clinic group sex educ sebl_1 sebl_6;
• Is replaced with:
    PROC FREQ DATA=tdata;
    VAR clinic group sex educ sebl_1 sebl_6;

Using the Wrong Syntax Produced:
    1015 PROC FREQ DATA=tdata;
    1016 VAR clinic group sex educ sebl_1 sebl_6;
    --
    180
    ERROR 180-322: Statement is not valid or it is used out of proper order.
• Note: Similar errors can be produced by missing semicolons.

Misspelled Variable in a PROC
• From Program 4, if:
    PROC FREQ DATA=weight;
    TABLES sex clinic ;
• Is replaced with:
    PROC FREQ DATA=weight;
    TABLES sex clinc ;
• You get:
    ERROR: Variable CLINC not found.

Uninitialized Variables
• From Program 4, if:
    bmi = (weight*703.0768)/(height*height);
• Is replaced with:
    bmi = (wieght*703.0768)/(height*height);
• You get:
    NOTE: Variable wieght is uninitialized.
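The uninitialized-variable note is easy to miss because the step still runs. A minimal, self-contained sketch of the same mistake follows. The data set name demo and the two data rows are hypothetical and exist only for illustration; the misspelling wieght is deliberate.

```sas
/* Hypothetical two-row data set, used only to reproduce the note. */
DATA demo;
  INPUT height weight;
  /* "wieght" is misspelled on purpose. SAS does not stop: it creates */
  /* a new numeric variable named wieght and leaves it missing, so    */
  /* bmi is missing on every row.                                     */
  bmi = (wieght*703.0768)/(height*height);
  DATALINES;
70 180
64 130
;
RUN;

/* Printing the data set shows the extra wieght column and the missing bmi. */
PROC PRINT DATA=demo;
RUN;
```

Besides "Variable wieght is uninitialized", the log for this step should also carry a note that missing values were generated as a result of performing an operation on missing values. Either note is a reason to check the spelling of every variable in the assignment.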
What’s an Uninitialized Variable?
• An uninitialized variable is a variable that is used in a program but has never been given a value, so SAS treats it as missing.
• This usually occurs when a variable name on the right-hand side of an assignment is misspelled.
• In the example, the note was caused by a misspelling: SAS had no variable called wieght.

Forgetting the RUN Statement
• If you forget the RUN statement at the end of your program, SAS will not run the last step (on PC).
• You won’t get any output.
• You may not get any errors or warnings.
Fix: Submit a single RUN; statement.

Catching Errors as You Write:
• You don’t have to write an entire program and then run the whole thing.
• Try writing your programs in stages.
  – Write part and run it.
  – If your program works, write the next part, and run it.
  – If your program produced errors or warnings, they must have come from the last part that you wrote.

Multipart Programs
• If you are writing a program in stages, you may have multiple procedures. Running the same procedures over and over produces a lot of output and log text to check.
• Once you get a procedure to work, you can enclose it in a comment (/* . . . */) while you work on other procedures.
• Just remove the comment when you’ve finished the whole program.

Example:
    DATA weight;
    INFILE 'C:\SAS_Files\tomhs.dat';
    INPUT @1   ptid   $10.
          @12  clinic $1.
          @27  age    2.
          @30  sex    1.
          @58  height 4.1
          @85  weight 5.1
          @140 cholbl;
    bmi = (weight*703.0768)/(height*height);
    RUN;

    /*
    PROC FREQ DATA=weight;
    TABLES sex clinic ;
    TITLE 'Frequency Distribution of Clinical Center and Gender';
    RUN;
    */

    PROC FREQ DATA=weight;
    TABLES clinic/ NOCUM ;
    TITLE 'Frequency Distribution of Clinical Center ';
    TITLE2 '(No Cumulative Percentages) ';
    RUN;
    *Now SAS will only perform the second PROC FREQ;

Checking On Your Data Sets
• Sometimes your data steps don’t work the way you want, but there aren’t any clear indications of problems in the log file.
• You can insert a PROC PRINT to see your data:
    PROC PRINT DATA=mydata;
    RUN;
• Then, when you’re sure that your data is OK, you can either delete the PROC PRINT or convert it into a comment:
    *PROC PRINT DATA=mydata;
    *RUN;
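For a large data set, printing every observation can bury the problem you are looking for. A minimal sketch, using the same placeholder data set name mydata: the OBS= data set option limits PROC PRINT to the first few observations, and PROC CONTENTS lists each variable with its type and length. The limit of 10 observations is an arbitrary choice for illustration.

```sas
/* Print only the first 10 observations of mydata (10 is arbitrary). */
PROC PRINT DATA=mydata (OBS=10);
RUN;

/* List variable names, types, and lengths to confirm that each      */
/* variable was read as intended (for example, character vs. numeric). */
PROC CONTENTS DATA=mydata;
RUN;
```

A variable that shows up as numeric in PROC CONTENTS when you expected character (or the reverse) usually points back to a missing or extra $ in the INPUT statement.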