SAS Functions

SAS has approximately 150 functions in the following general areas:

Arithmetic
Array
Character
Date and Time
Financial
Mathematical
Probability
Quantile
Random Numbers
Sample Statistics
Special Functions
State and ZIP Code
Trigonometric and Hyperbolic
Truncation

SAS functions perform a calculation or a transformation on the arguments given in parentheses following the function name.

Function-name(argument, argument, ...)

All functions must have parentheses, even if they do not require any arguments. Arguments are separated by commas and can be variable names, constant values (numbers, or character strings enclosed in quotes), or expressions.

birthday = MDY(monborn, dayborn, yearborn);  /* compute a SAS date value using the MDY function */
newvalue = INT(LOG(10));                     /* obtain the integer portion of the natural log of 10 */
leftphra = LEFT(charstng);                   /* left-align a SAS character expression */
a = 'my date';
x = LENGTH(a);                               /* LENGTH returns the length of a character string; x = 7 */
avrg = MEAN(score1, score2, score3);         /* compute the average of three variables for each individual */
seed = 5;
uninum = RANUNI(seed);                       /* RANUNI returns a random number generated from the
                                                uniform distribution on the interval (0, 1) */

Syntax
ABS(argument)

Description
argument is numeric.
The ABS function returns a nonnegative number equal in magnitude to that of the argument.

Example:
x = abs(2.4);
x = abs(-5);
The values returned are 2.40000 and 5.00000, respectively.

Syntax
BETAINV(p,a,b)

Description
p is a numeric probability, with 0<p<1
a is a numeric shape parameter, with a>0
b is a numeric shape parameter, with b>0
The BETAINV function returns the p-th quantile from the beta distribution with shape parameters a and b. The probability that an observation from a beta distribution is less than or equal to the returned quantile is p. The BETAINV function is the inverse of the PROBBETA function.

Example:
x = betainv(.001,2,4);
The returned value is 0.01010.
The beta distribution is related to many distributions.

Syntax
EXP(argument)

Description
argument is numeric.
The EXP function raises the constant e, approximately 2.71828, to the power supplied by the argument. The result is limited by the largest floating-point value that the computer can represent.

Example:
x = exp(1);
x = exp(0);
The values returned are 2.71828 and 1.00000, respectively.

Syntax
FINV(p,ndf,ddf<,nc>)

Description
p is a numeric probability, with 0<p<1
ndf is a numeric numerator degrees of freedom parameter, with ndf>0
ddf is a numeric denominator degrees of freedom parameter, with ddf>0
nc is an optional numeric noncentrality parameter, with nc>=0
The FINV function returns the p-th quantile from the F distribution with numerator degrees of freedom ndf, denominator degrees of freedom ddf, and noncentrality parameter nc. The probability that an observation from the F distribution is less than the quantile is p. This function accepts noninteger degrees of freedom parameters ndf and ddf. The FINV function is the inverse of the PROBF function. If the optional parameter nc is not specified or has the value 0, the quantile from the central F distribution is returned.

Example:
q1 = finv(.95,2,10);
q2 = finv(.95,2,10.3,2);
The values returned are 4.1028 and 7.5838, respectively.

Syntax
INT(argument)

Description
argument is numeric.
The INT function returns the integer portion of the argument. If the value of argument is positive, INT(argument) has the same result as FLOOR(argument). If the value of argument is negative, INT(argument) has the same result as CEIL(argument).

Example:
x = int(2.4);
x = int(-5);
The values returned are 2 and -5, respectively.

LOG(argument)
MAX(argument1,argument2, ...)
MIN(argument1,argument2, ...)
MEAN(argument1,argument2, ...)

MOD(argument1,argument2)
x = mod(6,3) returns 0; x = mod(10,3) returns 1.

MINUTE(<time | datetime>)
time = '3:19:24't;
m = minute(time);
produces the value 19 for m.
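The relationship between INT, FLOOR, and CEIL described above, and the MOD results, can be checked in a short DATA step. This is only a minimal sketch: the test values 2.4 and -2.4 and all variable names are chosen here for illustration and do not come from the original examples.

```sas
/* Illustrative sketch: compare INT with FLOOR and CEIL for a      */
/* positive and a negative argument, and repeat the MOD example.   */
DATA _NULL_;
    pos = 2.4;                  /* hypothetical positive test value */
    neg = -2.4;                 /* hypothetical negative test value */

    int_pos   = INT(pos);       /* 2, the same result as FLOOR(pos) */
    floor_pos = FLOOR(pos);     /* 2                                */
    int_neg   = INT(neg);       /* -2, the same result as CEIL(neg) */
    ceil_neg  = CEIL(neg);      /* -2                               */
    floor_neg = FLOOR(neg);     /* -3, so INT and FLOOR differ here */

    rem = MOD(10, 3);           /* 1, remainder of 10 divided by 3  */

    /* write the results to the SAS log */
    PUT int_pos= floor_pos= int_neg= ceil_neg= floor_neg= rem=;
RUN;
```

DATA _NULL_ runs the statements without creating a data set, which is convenient when the only goal is to inspect a few function results in the log.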
MONTH(date)
SECOND(<time | datetime>)

NORMAL(seed) returns a standard normal random number. seed is an integer. If seed<=0, the time of day is used to initialize the seed stream.

POISSON(m,n) m is a numeric mean parameter and n is an integer value of the random variable. The POISSON function returns the probability that an observation from a Poisson distribution, with mean m, is less than or equal to n.

GAMINV(p,a) returns the p-th quantile from the gamma distribution with shape parameter a.

PROBBETA(x,a,b) returns the probability that an observation from a beta distribution, with shape parameters a and b, is less than or equal to x.

PROBBNML(p,n,m) returns the probability that an observation from a binomial distribution, with probability of success p, number of trials n, and number of successes m, is less than or equal to m.

PROBCHI(x,df<,nc>) returns the probability that an observation from a chi-square distribution, with degrees of freedom df and noncentrality parameter nc, is less than or equal to x.

PROBF(x,ndf,ddf<,nc>) returns the probability that an observation from an F distribution, with numerator degrees of freedom ndf, denominator degrees of freedom ddf, and noncentrality parameter nc, is less than or equal to x.

PROBGAM(x,a) returns the probability that an observation from a gamma distribution, with shape parameter a, is less than or equal to x.

PROBIT(p) returns the p-th quantile from the standard normal distribution. PROBIT is the inverse of PROBNORM(x).

PROBT(x,df<,nc>) returns the probability that an observation from a Student's t distribution, with degrees of freedom df and noncentrality parameter nc, is less than or equal to x.

RANBIN(seed,n,p) returns a variate generated from a binomial distribution with mean np and variance np(1-p).

RANCAU(seed) returns a variate generated from a Cauchy distribution with location parameter 0 and scale parameter 1.

RANEXP(seed) returns a variate generated from an exponential distribution.

RANGAM(seed,a) returns a variate generated from a gamma distribution with parameter a.

RANK(x) returns an integer representing the position of a character in the ASCII or EBCDIC collating sequence.
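Several of the functions listed so far form quantile/probability pairs, for example PROBIT with PROBNORM and FINV with PROBF. The following minimal sketch shows the round trip from a probability to a quantile and back. The probability 0.975 and the variable names are illustrative assumptions; the FINV arguments repeat the earlier FINV example.

```sas
/* Illustrative sketch: a quantile function followed by the        */
/* matching probability function returns the starting probability. */
DATA _NULL_;
    p  = 0.975;               /* hypothetical probability           */
    z  = PROBIT(p);           /* standard normal quantile, near 1.96 */
    pz = PROBNORM(z);         /* back to p, up to rounding          */

    f  = FINV(0.95, 2, 10);   /* F quantile from the FINV example   */
    pf = PROBF(f, 2, 10);     /* back to 0.95, up to rounding       */

    /* write the results to the SAS log */
    PUT z= pz= f= pf=;
RUN;
```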
RANNOR(seed) returns a variate generated from a standard normal distribution.

RANPOI(seed,m) returns a variate generated from a Poisson distribution with mean m.

RANUNI(seed) returns a number generated from the uniform distribution on the interval (0,1).

SIGN(argument) returns a value of -1 if argument<0, a value of 0 if argument=0, and a value of 1 if argument>0.

SQRT(argument) returns the square root of the argument.

STD(argument, argument, ...)
STDERR(argument, argument, ...)

TINV(p,df<,nc>) returns the p-th quantile from the Student's t distribution with degrees of freedom df and noncentrality parameter nc.

TODAY() returns the current date.

Working with Character Variables

/* assumes the libref AIR has already been assigned with a LIBNAME statement */
DATA air.depart;
    INPUT country $ 1-9 cities 11-12 usgate $ 14-26 othrgate $ 28-48;
    CARDS;
Japan      5 San Francisco Tokyo, Osaka
Italy      8 New York      Rome, Naples
Australia 12 Honolulu      Sydney, Brisbane
;

DATA showchar;
    LENGTH usairpt $ 10;
    SET air.depart;
    schedule = '3-4 tours per season';
    remarks = "See last year's schedule";
    IF usgate = 'San Francisco' THEN usairpt = 'SFO';
    ELSE IF usgate = 'Honolulu' THEN usairpt = 'HNL';
    ELSE IF usgate = 'New York' THEN usairpt = 'JFK or EWR';

PROC PRINT DATA=showchar;
    VAR country schedule remarks usgate usairpt;
    TITLE 'Examples of Some Character Variables';
RUN;

----------------------------------------------------------------------
Examples of Some Character Variables

OBS  COUNTRY    SCHEDULE              REMARKS                   USGATE         USAIRPT
 1   Japan      3-4 tours per season  See last year's schedule  San Francisco  SFO
 2   Italy      3-4 tours per season  See last year's schedule  New York       JFK or EWR
 3   Australia  3-4 tours per season  See last year's schedule  Honolulu       HNL

Extracting a Portion of a Character Value

SCAN(source,n<,list-of-delimiters>)  /* returns the n-th term of source; when no delimiter list
                                        is given, SAS uses a default set that includes the blank */
LEFT(source)                         /* left-alignment */

DATA air.arvdept;   /* a LENGTH statement can be used to set the lengths of the new variables */
    SET air.depart;
    arvgate = SCAN(othrgate,1,',');
    deptgate = LEFT(SCAN(othrgate,2,','));   /* LEFT removes the leading blank after the comma */

PROC PRINT DATA=air.arvdept;
    VAR country othrgate arvgate deptgate;
    TITLE 'Dividing Character Values into Terms';
RUN;

----------------------------------------------------------------------
Dividing Character Values into Terms

OBS  COUNTRY    OTHRGATE          ARVGATE  DEPTGATE
 1   Japan      Tokyo, Osaka      Tokyo    Osaka
 2   Italy      Rome, Naples      Rome     Naples
 3   Australia  Sydney, Brisbane  Sydney   Brisbane

Combining Character Values: Concatenation, Trimming Blanks

DATA all;
    SET air.depart;
    allgate = TRIM(usgate) || ', ' || othrgate;   /* TRIM drops trailing blanks */
    IF country = 'Brazil' THEN allgate = othrgate;

PROC PRINT DATA=all;
    VAR country usgate othrgate allgate;
    TITLE 'Readable Concatenated Values';
RUN;

----------------------------------------------------------------------
Readable Concatenated Values

OBS  COUNTRY    USGATE         OTHRGATE          ALLGATE
 1   Japan      San Francisco  Tokyo, Osaka      San Francisco, Tokyo, Osaka
 2   Italy      New York       Rome, Naples      New York, Rome, Naples
 3   Australia  Honolulu       Sydney, Brisbane  Honolulu, Sydney, Brisbane

Selection of Observations

SAS Output
----------------------------------------------------------------------
Data Set ARTS.ARTTOUR

OBS  CITY   NIGHTS  LANDCOST  EVENTS  DESCRIBE      GUIDE    BACKUP
 1   Rome      3       750       7    4 M, 3 G      D'Amico  Torres
 2   Paris     8      1680       6    5 M, 1 other  Lucas    Lucas
......

DATA revise;
    SET arts.arttour;
    IF city='Rome' THEN landcost=landcost+30;
    IF events>nights THEN calendar='Check schedule';
    ELSE calendar='No Problem';
    IF guide='Lucas' AND nights>7 THEN guide='Torres';
    IF landcost>=1500 THEN price='High  ';   /* padded to 6 characters so that 'Medium' is not truncated */
    ELSE IF landcost>=700 THEN price='Medium';
    ELSE price='Low';

Using More than One Comparison in a Condition (with AND, &, OR, |)

In a SAS condition, AND has higher priority than OR. The first statement below is therefore evaluated as city='Paris' OR (city='Rome' AND guide='Lucas') OR guide="D'Amico"; the second statement uses parentheses to group the two OR comparisons before AND is applied.

IF city='Paris' OR city='Rome' AND guide='Lucas' OR guide="D'Amico" THEN topic='Art history';

IF (city='Paris' OR city='Rome') AND (guide='Lucas' OR guide="D'Amico") THEN topic='Art history';

In computing terms, a value of TRUE is 1 and a value of FALSE is 0.
In the SAS System, any numeric value other than 0 or missing is true; a value of 0 or missing is false.

IF landcost THEN remarks='Ready to budget';
/* is equivalent to */
IF landcost NE . AND landcost NE 0 THEN remarks='Ready to budget';

The SAS System distinguishes between uppercase and lowercase letters in comparisons, so 'Madrid' is not the same as 'MADRID'. Applying UPCASE to the variable converts its value to uppercase before the comparison is made.

IF UPCASE(city) = 'MADRID' THEN guide='Duncan';

Comparing with a Shorter Character String

IF guide =: 'D' THEN chosen = 'Yes';
ELSE chosen = 'No';

Without the colon, guide='D' selects only a record whose guide variable, 8 characters long, contains 'D' followed by seven blanks. With the colon (=:), only as many characters as there are in the shorter string are compared, so guide=:'D' selects every record whose guide value starts with D.

IF guide <=: 'L' THEN group='A-L';
ELSE group='M-Z';

In this example, guide<=:'L' selects the records whose guide value starts with a letter up to and including L.

Finding a Value Anywhere within Another Character Value

INDEX(source, excerpt)

The function returns the position of the first character of excerpt within source, which is a positive number. If the excerpt does not occur in the source, the function returns 0.

Example:
IF INDEX(describe, 'other') THEN otherev='Yes';
ELSE otherev='No';
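To see which positions INDEX actually returns, the example can be run against the two DESCRIBE values shown in the ARTS.ARTTOUR listing. This is a minimal sketch under assumptions: the values are typed in directly rather than read from the data set, and the variable pos and the LENGTH settings are introduced here only for illustration.

```sas
/* Illustrative sketch: INDEX returns a position (true) when the   */
/* excerpt is found and 0 (false) when it is not.                  */
DATA _NULL_;
    LENGTH describe $ 12 otherev $ 3;   /* assumed lengths */
    DO describe = '4 M, 3 G', '5 M, 1 other';
        pos = INDEX(describe, 'other');  /* 0 for the first value, 8 for the second */
        IF pos > 0 THEN otherev = 'Yes';
        ELSE otherev = 'No';
        /* write the results to the SAS log */
        PUT describe= pos= otherev=;
    END;
RUN;
```

Because any nonzero, nonmissing number is true, testing pos > 0 gives the same result as the shorter form IF INDEX(describe, 'other') THEN used in the example.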