Data Step Processing

Reading selected observations or rows:
* Read the input data for the first 6 observations;
Data stocks;
Set cdat.stocks(OBS=6);
Run;
Note: The OBS= data set option can be used only on an input data set.
If the first row of the input text file contains the variable names and the data start on the
second row, use the FIRSTOBS=2 option on the INFILE statement.
Example:
filename bb '/courses/ddbf9765ba27fe300/bill.txt';
data bill;
infile bb FIRSTOBS=2;
input fname $1-13 lname $14-25 ssn1 ssn2 ssn3 areacd phonenum $
@68 bal1 dollar8. @77 duedt yymmdd10. @88 billdt date9.;
run;
Combined use of firstobs and obs options:
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks;
Set cdat.stocks(FIRSTOBS=5 OBS=6);
Run;
Output_dataset_obs = OBS – FIRSTOBS + 1 ;
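As a quick check of this formula, the sketch below builds a small ten-row data set and reads rows 5 through 6 from it. The data set names work.ten and work.slice are hypothetical and are not part of the course library. With FIRSTOBS=5 and OBS=6, 6 - 5 + 1 = 2 observations should be written.

```sas
/* Build a hypothetical ten-row data set: i = 1, 2, ..., 10 */
data work.ten;
  do i = 1 to 10;
    output;
  end;
run;

/* Read rows 5 through 6 only; the result should hold i = 5 and i = 6 */
data work.slice;
  set work.ten(firstobs=5 obs=6);
run;

proc print data=work.slice;
run;
```

OBS= names the last row to read, not the number of rows, which is why the row count depends on both options.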
Reading selected observations with logical comparison:
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks;
Set cdat.stocks2;
If _n_ > 5;
If industry='PHAR';
Run;
Note: _n_ is an automatic variable that SAS creates at run time. Its value is the number of the
current DATA step iteration, which here equals the row number.
Data stocks;
Set cdat.stocks;
If price=. Then delete;
Run;
Data stocks;
Set cdat.stocks2;
Where industry='PHAR';
Run;
Data stocks (Where =(industry='PHAR'));
Set cdat.stocks2;
Run;
Data stocks ;
Set cdat.stocks2(Where =(industry='PHAR'));
Run;
proc print data=cdat.stocks2;
where industry='PHAR';
run;





WHERE is more efficient than IF. With WHERE, only the selected observations are read
into the DATA step. With an IF condition, all observations are read, and only the
selected observations are written to the output data set.
WHERE cannot be used with the POINT= option of the SET statement. In older SAS
releases it also could not be combined with the OBS= and FIRSTOBS= data set options.
WHERE cannot be used with the execution-time automatic variable _N_.
The WHERE statement can be used in both the DATA step and PROC steps. The IF
statement can be used only in the DATA step.
Notice the difference between using the WHERE= data set option on the input data set
and on the output data set.
When the option is used on the input data set, only the observations that meet the
condition are read into the DATA step. When the option is used on the output
data set, all observations are read, but only the observations that meet the
condition are written to the output data set.
Example: The following WHERE statement refers to the variable _n_ and produces an ERROR.
Data stocks;
Set cdat.stocks;
where _n_<5;
Run;
ERROR: Variable _n_ is not on file CDAT.STOCKS.
Example: The following code using IF statement is correct.
Data stocks;
Set cdat.stocks;
if _n_<5;
Run;
Create multiple data sets from one data set
Libname cdat '/courses/ddbf9765ba27fe300';
Data pharm tech;
Set cdat.stocks2;
If industry='PHAR' then output pharm;
Else If industry='TECH' then output tech;
Run;
A simple random sampling program:
Libname cdat '/courses/ddbf9765ba27fe300';
Data sample;
Set cdat.stocks2;
If ranuni(0) <= 0.3 then output;
* random sampling of 30% of
observations;
Run;
Note: This is not an efficient method, because it reads through all observations and
outputs only the selected ones.
===============================================================
Advanced topic: Random (direct) access
POINT=VarName:
SET statement option
Libname cdat '/courses/ddbf9765ba27fe300';
data obs5;
obsn=5;
set cdat.stocks point=obsn;
output;
stop;
run;
Data sample;
Do obsnum=1 to 10 by 2;
Set cdat.stocks point=obsnum;
If _error_ then abort;
Output;
End;
Stop;
Run;



POINT= requires a variable name (POINT=VarName). A constant such as POINT=5 is not allowed.
An explicit “output;” statement has to be specified.
Random access never reaches the end of file, so the DATA step cannot finish by itself. To
end the DATA step, an explicit “STOP;” statement has to be used.
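The three rules above can be seen together in a minimal sketch that reads only the last observation of a data set. The data set names work.ten and work.last_row and the variables total and last are hypothetical, chosen only for illustration. NOBS= is the same SET statement option used by the sampling programs that follow.

```sas
/* Hypothetical ten-row data set */
data work.ten;
  do i = 1 to 10;
    output;
  end;
run;

data work.last_row;
  last = total;                        /* the NOBS= value is set at compile time, so it is usable here */
  set work.ten point=last nobs=total;  /* POINT= takes a variable, not a constant */
  output;                              /* explicit OUTPUT: the step stops before any implicit output */
  stop;                                /* direct access never hits end of file, so stop explicitly */
run;
```

Under these assumptions work.last_row should contain the single row with i = 10.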
Two useful programs:
1. Random sampling with replacement.
Example: Create a fixed-size random sample of observations from a SAS data set.
Duplicate observations are allowed.
Libname cdat '/courses/ddbf9765ba27fe300';
data sample (drop=sampsize);
sampsize=100;
do i=1 to sampsize;
readit=ceil(ranuni(0)*totobs);
set cdat.simulated_card_data point=readit nobs=totobs;
output;
end;
stop;
run;
2. Random sampling without replacement.
data sample2 (drop=sampsize obsleft);
sampsize=100;
obsleft=totobs;
do while(sampsize>0);
readit+1;
if ranuni(0)<sampsize/obsleft then do;
set cdat.simulated_card_data point=readit nobs=totobs;
output;
sampsize=sampsize-1;
end;
obsleft=obsleft-1;
end;
stop;
run;
End of advanced topic.
Reading selected variables using keep= or drop= option.
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks;
Set cdat.stocks2(keep=ticker price);
Run;
* ( or drop=industry);
Note: When the KEEP= option is used on the input data set, variables that are not in the
keep list are not available in the DATA step, because they are not read into it.
This makes it an efficient way to read data: variables that are not needed for the
processing are never read into the DATA step. For large data sets, this can save a
significant amount of time.
Example:
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks;
Set cdat.stocks2(keep=ticker price);
Put industry;
Run;
NOTE: Variable industry is uninitialized.
Note: When the KEEP statement or the KEEP= option on the output data set is used, all
variables, including those not in the keep list, are available in the DATA step.
Example:
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks;
Set cdat.stocks2;
Keep ticker price;
Put industry;
* or drop industry;
Run;
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks(keep=ticker price);
* ( or drop=industry);
Set cdat.stocks2;
Put industry;
Run;

Note the difference between using these options on the input data set and on the output data set.
Using them on the input data set is more efficient, but the variables that are dropped or
not kept are not available in the DATA step for processing.
Variable creation and assignment
Libname cdat '/courses/ddbf9765ba27fe300';
Data stocks;
Length Comment $30
order_num 3;
Set cdat.stocks2;
Order_num = _n_;
If industry='PHAR' then comment='Pharmaceutical';
Else if industry='FINAN' then comment='Financial';
Else if industry='TECH' then comment='Technology';
Price=price+10;
Run;



The LENGTH statement defines variable length in bytes. Because a SAS data set is a rectangular
matrix in which every row stores every variable at its full length, define variable lengths efficiently.
Numeric variable length is 3 to 8 bytes; the default is 8 bytes (floating point).
Character variable length is 1 to 2^15-1 bytes (characters). The default is 8 bytes or the
length of the first assigned value.
data aa;
a='1234'; output; * length is defined as $4 by assignment;
a='12345678910'; output; * Truncated;
run;

To assign a value to a variable, put the variable name on the left of the “=” sign and the
value on the right. A character value has to be enclosed in either single or double
quotes.
Working with numeric variables
Numeric variable length and largest integer value
In SAS, a numeric variable is 3 to 8 bytes long. The following table lists the length in bytes
of a SAS numeric variable and the largest integer it can represent exactly. Notice that the
number of bytes and the number of significant digits are different concepts. An 8-byte
number can have up to 16 significant digits. The values in the table follow from this relation:
1 byte has 8 bits,   max possible number = 2**8  = 256
2 bytes have 16 bits, max possible number = 2**16 = 65,536
3 bytes have 24 bits, max possible number = 2**24 = 16,777,216
......
SAS needs 11 bits for the exponent, etc.
Max integer number = max possible number by bytes * 2**(-11)
For example, for a length of 3 bytes:
Max integer number = (2**(3*8)) * 2**(-11) = 2**24 * 2**(-11) = 2**13 = 8,192

Length (bytes)   Largest Integer Value        Num of Digits
--------------   --------------------------   -------------
3                8,192                        4
4                2,097,152                    7
5                536,870,912                  9
6                137,438,953,472              12
7                35,184,372,088,832           14
8                9,007,199,254,740,992        16
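To see what the 3-byte limit means in practice, the sketch below stores the value 8,193 (one more than the largest integer listed for 3 bytes) in a 3-byte and an 8-byte variable and reads both back. The data set name work.short and the variable names n3 and n8 are hypothetical. On platforms that use IEEE floating point (such as Windows and Unix), n3 would be expected to come back as 8192 while n8 keeps 8193; treat this as an illustration under that assumption rather than a guaranteed result.

```sas
/* The shortened length takes effect when the value is written to the data set */
data work.short;
  length n3 3 n8 8;
  n3 = 8193;   /* one above the 3-byte limit of 8,192 */
  n8 = 8193;   /* full 8-byte storage */
run;

/* Read the stored values back and compare them in the log */
data _null_;
  set work.short;
  put n3= n8=;
run;
```

The loss happens silently, with no message in the log, which is why short numeric lengths are safe only for integers known to stay below the limit.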
Numeric variable’s order:
From smallest to largest, numeric values are ordered as follows: the missing value (.),
then negative numbers, zero (0), and positive numbers.
Order of Evaluation of SAS Operators
(http://v8doc.sas.com/sashtml/lgref/z0208245.htm)
Examples:
data a;
do i=1 to 10;
output;
end;
run;
data num_var;
set a;
a = i + 1;
b = a - 1;
c = i * 2;
d = c / i;
e = i ** 2;
no_val = .;
f = 2 * a**2 + a*b/2 - 1;
f2= (2*(a**2)) + ((a*b)/2) - 1;
* Parentheses "()" have the highest operation order. ;
run;


Operator precedence: Exponentiation is done first, then multiplication and division,
then finally addition and subtraction.
A numeric variable is an 8-byte floating-point value by default. Define a shorter length only
where the values allow it.
Numeric Function
data num_var;
var1 = 1; var2 = 2; var3 = 3;
pi=3.141592654;
log10 = log10(100);
put 'log10 = ' log10;
loge = log(2.7182818); put 'loge = ' loge;
a=sin(pi/6);
put 'a=sin(pi/6)= ' a;
t=tan(pi/4);
put 't= tan(pi/4)= ' t;
c=max(1,2);
put 'c= max(1,2)= ' c;
d=min(1,2);
put 'd= min(1,2)= ' d;
e=sqrt(100);
put 'e= sqrt(100)= ' e;
f=mod(10,4);
put 'f= mod(10,4)=' f;
g=ranuni(0);
put 'g= ranuni(0)=' g;
h=sum(a,b,c); put 'h= sum(a,b,c)= ' h;
h2=sum(of var1-var3); put 'h2 = sum(of var1-var3) =' h2;
i=int(1/3);
put 'i= int(1/3) =' i;
j=ceil(1/3); put 'j= ceil(1/3) =' j;
k=floor(2.718); put 'k= floor(2.718) =' k;
l=round(2.71828, .01);
put 'l= round(2.71828, .01) =' l;
m=abs(-2.718);
put 'm= abs(-2.718) =' m;
output;
run;
For the full list of SAS functions, see:
SAS(R) 9.1.3 Language Reference: Dictionary, Fifth Edition: Functions and CALL Routines
Notice the difference between ADD (+), sum() function and sum(of …) function.
Example:
data a;
a1=1; a2=2; a3=3; a4=.;
add1=a1+a2+a3;
add2=a1+a2+a3+a4;
put 'add1 = ' add1;
put 'add2 = ' add2;
sum1 = sum(a1,a2,a3);
put 'sum1 = ' sum1;
sum2 = sum(a1,a2,a3,a4); put 'sum2 = ' sum2;
sum3 = sum(a1-a3); put 'sum3 = ' sum3; * wrong statement: this computes a1 minus a3;
sum4 = sum(of a1-a4); put 'sum4 = ' sum4;
run;
SAS Date: A SAS date is a numeric variable.

The SAS date for 1/1/1960 is 0. Each day after that date adds 1 to the SAS date;
each day before that date subtracts 1.

Date          SAS Date
----------    --------
12/30/1959    -2
12/31/1959    -1
1/1/1960       0
1/2/1960       1
1/3/1960       2
1/4/1960       3

* The following programs show the SAS date and the actual date value;
data aa;
y=1960;
do m=0 to 60;
sasdt=m;
put 'sas date = ' sasdt ' date is: ' sasdt mmddyy10.;
end;
run;
data aa;
y=1960;
do m=0 to -60 by -1;
sasdt=m;
put 'sas date = ' sasdt ' date is: ' sasdt mmddyy10.;
end;
run;
Because a SAS date is numeric, arithmetic such as addition and subtraction can be applied to it.
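A minimal sketch of such date arithmetic follows. The variable names start, due and days are hypothetical. Adding 30 to a SAS date moves it 30 days forward, and subtracting one date from another gives the number of days between them.

```sas
data _null_;
  start = '01jan2007'd;            /* date constant */
  due   = start + 30;              /* 30 days after the start date */
  days  = '08jan2007'd - start;    /* number of days between the two dates */
  put due= date9. days=;           /* should write due=31JAN2007 days=7 */
run;
```

Without a date format such as date9., due would print as the raw day count since 1/1/1960.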

Create SAS date value
o MDY function: SASDT=MDY(month, day, year);
o TODAY function: SASDT=TODAY();
o INPUT function:
data _null_;
cdt='20030208';
sasdt=input(cdt,yymmdd10.);
put sasdt mmddyy10.;
ym=put(sasdt,yymmn6.);
put 'ym = ' ym;
cdt2='02JAN1960';
sasdt2=input(cdt2,date9.);
put sasdt2 yymmdd10.;
sasdt3='08jan2007'd;
put sasdt3 yymmdd10.;
run;

Other useful DATE functions
YEAR function:     year = YEAR(sasdt);        /* return year of the sas date */
MONTH function:    month = MONTH(sasdt);      /* return month of the sas date */
DAY function:      day = DAY(sasdt);          /* return day of the sas date */
QTR function:      QTR = QTR(sasdt);          /* return quarter of the sas date */
WEEKDAY function:  WKD = WEEKDAY(sasdt);      /* return day of the week of the sas date */
INTNX function:    Nextdt = INTNX('Interval', start, n_interval);
                   /* return the date that is n_interval intervals from the start date */
INTCK function:    Intvl = INTCK('Interval', from, to);
                   /* return the number of time intervals in a given time span */
                   /* 'Interval' can be year, month, day, week, qtr. */
DATEPART function: date = DATEPART(sas_datetime);
                   /* extract the date from a SAS datetime value */
Example:
data _null_;
sasdt='08jan2007'd;
put 'date= ' sasdt yymmdd10.;
year=year(sasdt);
put 'year=year(sasdt)= ' year;
month=month(sasdt);
put 'month=month(sasdt)=' month;
day =day(sasdt);
put 'day = day(sasdt)=' day;
qtr =qtr(sasdt);
put 'QTR = qtr(sasdt)=' qtr;
weekday=weekday(sasdt); put 'Weekday=weekday(sasdt)=' weekday;
put;
today=today();
put 'Today is: ' today date9.; put;
dt1='01jan2006'd;
put 'date 1= ' dt1 date9.;
dt2='01jan2007'd;
put 'date 2= ' dt2 date9.;
intervald=intck('day',dt1, dt2); put "intervald =intck('day',dt1, dt2)=" intervald;
intervalm=intck('month',dt1, dt2); put "intervalm =intck('month',dt1, dt2)=" intervalm;
intervaly=intck('year',dt1, dt2); put "intervaly =intck('year',dt1, dt2)=" intervaly;
put;
datetime= '08JAN2007:09:00:00'dt ; put 'datetime=' datetime datetime20.;
datepart=datepart(datetime); put 'datepart=' datepart date9.;
run;
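The example above exercises INTCK but not INTNX. The following sketch, with hypothetical variable names, moves a date forward by one month. By default INTNX returns the beginning of the target interval; the optional fourth argument 'e' asks for the end of the interval instead.

```sas
data _null_;
  start      = '15jan2007'd;
  next_begin = intnx('month', start, 1);       /* default alignment: first day of next month */
  next_end   = intnx('month', start, 1, 'e');  /* 'e': last day of next month */
  put next_begin= date9. next_end= date9.;     /* should write 01FEB2007 and 28FEB2007 */
run;
```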
Also see date_conversion.sas for various date conversions.
For the complete list of SAS date, time, and datetime functions and formats, see the
following SAS documentation topics:
SAS Date, Time, and Datetime Functions
Date Formats
Datetime and Time Formats
Date and Datetime Informats
Working with Character Variables

Define length and assign value: Character variable length is 1 to 2^15-1 bytes
(characters). Default is 8 bytes or the length of the first assigned value.
libname cdat '/courses/ddbf9765ba27fe300';
data char;
length char1 $3 char2 $40;
/* Assignment */
char1 = '0';
char2 = 'Simple Assignment';
output;
/* A single quote inside a single-quoted value has to be doubled */
char1 = '1';
char2 = 'Smith''s Pizza';
output;
/* Single quote can be quoted by double quote */
char1 = '2';
char2 = "Smith's Pizza";
output;
/* Line up character value */
char1 = '3';
char2 = '     Chapter 1     ';
output;
/* Assign missing value. Missing value can be '' or ' ' */
char1 = '4';
char2 = '';
output;
/* Use missing in if condition */
char1 = '5';
char2 = ''; if char2='' then put 'Char2 is missing';
output;
run;
String manipulating functions
Concatenation, LEFT(), RIGHT(), TRIM(), COMPRESS() and SUBSTR() functions:
data str_func;
length c1 $20
c11 $20;
*** LEFT function: left align character value;
c1 = '      NAME';
c11 = LEFT(c1);
output; /* c11='NAME                ' */
*** RIGHT function: right align character value;
c1 = 'NAME';
c11 = RIGHT(c1);
output; /* c11='                NAME' */
*** Concatenation: combine character values;
c1 = ' A B C ';
c11 = c1||'D';
output; /* c11=' A B C              ' : c1 is padded to 20 bytes, so D is truncated */
*** TRIM function: Remove blanks ONLY on the right side of the character value;
c1 = ' A B C ';
c11 = trim(c1)||'D';
output; /* c11=' A B CD' */
*** COMPRESS function: Remove all blanks in the character value;
c1 = ' A B C ';
c11 = compress(c1);
output; /* c11='ABC' */
c1 = ' A B C ';
c11 = compress(c1)||'D';
output; /* c11='ABCD' */
*** COMPRESS function: By default, it removes all blank spaces ;
c1 = '3.142, 2.728, 6.028';
c11= compress(c1);
output; /* c11='3.142,2.728,6.028' */
*** COMPRESS function: Remove specified characters from the character value ;
c1 = '3.142, 2.728, 6.028';
c11= compress(c1, ',');
output; /* c11='3.142 2.728 6.028' */
/*** SUBSTR function: Extract characters from the character value */
c1 = 'JOHN SMITH';
c11 = SUBSTR(c1,6);
output; /* c11='SMITH' */
c1 = 'JOHN SMITH';
c11 = SUBSTR(c1,6, 1);
output; /* c11='S' */
run;
The SCAN function
The SCAN function returns the nth word of a character value.
It is used to extract words from a character value when the starting positions are
unknown but the words are separated by delimiters.
WORD = SCAN(source, n <,list of delimiters>);
The default delimiters for PC and Unix are:
Blank . < ( + | & ! $ * ) ; - / , % ^


If n is negative, SCAN selects the word in the character string starting from the
end of the string.
If |n| is greater than the number of words in the character string, SCAN
returns a blank value.
Example:
data _null_;
var1 = '3.142, 2.718, 6.028';
pi = SCAN(var1,1, ','); put 'pi= ' pi;
e = SCAN(var1,2, ','); put 'e= ' e ;
afgn = SCAN(var1,3, ','); put 'afgn= ' afgn;
var2='John Smith';
firstname = SCAN(var2,1);
lastname = SCAN(var2,2);
put 'First name= ' firstname;
put 'Last name= ' lastname;
a='c:\risk1\uscc\datapull.sas';
fn=scan(a,-1,'\');
put fn;
run;
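Because SCAN returns a blank value once n exceeds the number of words, it can drive a loop that walks through every word of a delimited list. The sketch below uses hypothetical variable names (list, word, i) and a made-up comma-separated value.

```sas
data _null_;
  length word $20;
  list = 'alpha,beta,gamma';
  i = 1;
  word = scan(list, i, ',');
  do while (word ne '');         /* a blank result means there are no more words */
    put i= word=;
    i + 1;                       /* sum statement: increment the word counter */
    word = scan(list, i, ',');
  end;
run;
```

The loop should write three lines to the log, one per word, and stop when SCAN is asked for a fourth word.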
The INDEX function
The INDEX function searches a character value for a character string and returns the
position of its first occurrence, or 0 if the string is not found.
INDEX = INDEX(source, excerpt);
Source: The character expression to search.
Excerpt: The character string to search for.
If found, INDEX = position of the excerpt.
If not found, INDEX = 0.
data _null_;
length reason $100;
line = 'ERROR 180-322: Statement is not valid or it is out of proper order.';
index = INDEX(line, 'ERROR');
if index > 0 then put line;
put 'index= ' index;
index2 = INDEX(LINE, ':');
put 'index2= ' index2;
if index2 > 0 then reason = substr(line, index2); put 'REASON' reason;
run;
The INDEXC function
UPCASE and LOWCASE function
Convert all letters in the argument to uppercase or lowercase.
VarU = UPCASE(argument);
VarL = LOWCASE(argument);
data _null_;
length lastname $20;
name = 'John Smith';
lastname = scan(name,2);
if lastname = 'SMITH' then put 'lastname= ' lastname;
else put 'SMITH is not found';
if upcase(lastname) = 'SMITH' then put 'lastname= ' lastname;
run;
Note: String comparison is case sensitive. It is always a good habit to apply the UPCASE or
LOWCASE function before a string comparison.
The STRIP function
Remove all leading and trailing blanks, but preserve the blanks in the middle.
data a;
A = 'B';
pi = 3.14;
C = '   pi * 5   ';
D = 'E';
cc = A || pi || c || D;
dd = A || strip(pi) || c || D;
ee = A || strip(pi) || strip(c) ||D;
put cc;
put dd;
put ee;
run;
New String Functions for Version 9
CAT(string1, ... , stringn)
Concatenates string variables without removing leading or trailing blanks
CATT(string1, ... , stringn)
Concatenates string variables and removes trailing blanks (but not leading blanks)
CATS(string1, ... , stringn)
Concatenates string variables and removes leading and trailing blanks
CATX(separator, string1, ... , stringn)
Concatenates string variables, removes leading and trailing blanks, and insert the
specified separator between strings
Function             Equivalent Code
CAT(OF X1-X4)        X1||X2||X3||X4
CATT(OF X1-X4)       TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4)
CATS(OF X1-X4)       TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4))
CATX(SP, OF X1-X4)   TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4))
Example:
data aa;
length a1 a3 $10 a2 a4 8;
a1='aaa'; a2=1234; a3='bbb'; a4=5678;
line1=cat(a1, a2, a3, a4);
line2=cat(of a1-a4);
put line1;
put line2;
line3=catx(',',a1, a2, a3, a4);
line4=catx(',',of a1-a4);
put line3;
put line4;
run;
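The example above covers CAT and CATX. The sketch below contrasts CATT, CATS and CATX on a value that has both leading and trailing blanks, and compares the result lengths. The variable names and values are hypothetical.

```sas
data _null_;
  length first last $10;
  first = '  Ann';             /* two leading blanks; padded with trailing blanks to 10 bytes */
  last  = 'Lee';
  t = catt(first, last);       /* trailing blanks removed, leading blanks kept */
  s = cats(first, last);       /* leading and trailing blanks removed */
  x = catx(' ', first, last);  /* blanks removed, one separator inserted */
  len_t = length(t);
  len_s = length(s);
  len_x = length(x);
  put len_t= len_s= len_x=;    /* should write len_t=8 len_s=6 len_x=7 */
run;
```

Comparing lengths rather than printing the values avoids any doubt about how leading blanks appear in the log.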
FIND Function:
Searches for a specific substring of characters within a character string that you
specify.
FIND(string,substring<,modifiers><,startpos>)
Where:
String is the character string or character variable to search within.
Substring is the string to search for in string.
Modifiers: I ignores case during the search (the default is case sensitive);
T trims trailing blanks from string and substring.
Startpos is the position at which the search starts; a negative value searches to the left.
Example:
data _null_;
string1='1234567890';
substr ='3';
pos=find(string1, substr);
put 'pos=' pos; * pos=3;
string2='c:\project1\analysis\distribution.sas';
* when the start position is negative and larger than the string length, the search runs leftward from the end;
pos2= find(string2, '\', -100);
put 'pos2=' pos2; * pos2=21;
string3='The search is Case Sensitive';
pos3=find(string3, 'case');
put 'pos3=' pos3; * pos3=0, cannot find, case sensitive;
string4='The search is not Case Sensitive';
pos4=find(string4, 'case', 'i');
put 'pos4=' pos4; * ignore case, pos4=19;
*** equivalent using index function;
string5=upcase(string4);
index5=index(string5,'CASE');
put 'index5=' index5; * index5=19;
run;
A string function example:
filename mod '/courses/ddbf9765ba27fe300/model_out.txt';
data mod;
infile mod truncover;
input line $1-256;
if scan(line,1, ' ') in ('if', 'else');
run;
data mod2;
length varname $32 yname $8 yval 8;
retain vn;
set mod;
if scan(line,1) = 'else' then do;
varname=substr(line,21, 32);
yname =substr(line,71, 8);
yval =substr(line,83, 8);
end;
else do;
varname=scan(line,2, ' ');
yname =scan(line,-3, ' ;');
yval =scan(line,-1, ' ;');
end;
varname = scan(varname,1);
if varname ne '' then vn=varname;
else if varname = '' then varname=vn;
run;
proc sort data=mod2 out=mod3;
by varname descending yval;
run;
/* Sample records from model_out.txt (fixed-column layout; exact column spacing not reproduced):
if ALL903 = 0 then YIZQ0015 = -0.2021 ;
else if 0 < ALL903 <= 2 then YIZQ0015 = -0.3925 ;
else if 2 < ALL903 <= 3 then YIZQ0015 = -0.2662 ;
else if 3 < ALL903 <= 4 then YIZQ0015 = -0.2264 ;
*/
Converting Numeric Values to Character Values

Automatic conversion happens when a numeric value is used in one of the following situations:
1. Assigned to a character variable
2. During a concatenation operation
3. When used with a function that takes character arguments
Explicit conversion by using the PUT function:
CharVar = PUT(num_var, format);
Example:
data Num_to_Char;
length score_c $10 score_c2 $10;
score_n = 2.718;
*** Conversion by assignment;
score_c = score_n;
put 'score_c= ' score_c;
*** Conversion by using in function;
Score_c2= LEFT(score_n); put 'score_c2= ' score_c2;
idc = 'A';
id1 = 0034;
id2 = 4825;
*** Leading zeroes and blanks in the numeric value will cause problems;
*** in the automatic conversion during concatenation;
id_all_1 = id1 || id2;
put 'id_all_1= ' id_all_1;
id_all_2 = idc || id1 || id2; put 'id_all_2= ' id_all_2;
*** Solution: Use PUT function if you need the leading zero;
id_all_3 = idc || put(id1,z4.) || put(id2,4.);
put 'id_all_3= ' id_all_3;
*** Solution: Use the Trim(left(Num_var)) function if you do not need the leading zero;
id_all_4 = idc || TRIM(LEFT(id1)) || TRIM(LEFT(id2));
put 'id_all_4= ' id_all_4;
run;
Converting Character Values to Numeric values


Automatic conversion happens when a character value is used in one of the following situations:
1. In an arithmetic operation
2. In a logical comparison with a numeric value
3. When used with a function that takes numeric arguments
4. Assigned to a numeric variable
Explicit conversion by using the INPUT function in a SAS statement.
*** Example;
data char_to_num;
length pi_num 8;
pi_c='3.14159';
* convert by arithmetic operation;
pi_n=pi_c*1;
put 'pi_n=' pi_n;
* Convert by logical comparison;
if pi_c > 2.71 then put 'Comparison successful!';
* Convert by math function;
int_pi_c=int(pi_c); put 'int_pi_c= ' int_pi_c;
* Convert by assignment;
pi_num = pi_c;
put 'pi_num= ' pi_num;
run;


Using the INPUT function to convert character values to numeric
In the following example, the conversion would not work without the INPUT function,
because these values are not valid numeric data for automatic conversion:
data char_to_num;
* Date conversion;
cdate='15FEB2003';
ndate=input(cdate,date9.); put 'ndate= ' ndate ' ' ndate date9.;
cdate2='20030215';
ndate2=input(cdate2,yymmdd8.); put 'ndate2= ' ndate2 ' ' ndate2 yymmdd10.;
* Currency conversion;
csalary='$45,000';
nsalary=input(csalary, comma8.); put 'nsalary= ' nsalary;
nsalary=input(csalary, dollar8.); put 'nsalary= ' nsalary;
run;
Topic related to numeric character conversion
Variable type can’t be redefined
In the following program, the data set cdat.date_time contains the variables bdd and byy, both
numeric. The program defines two character variables with the same names and tries to
convert the numeric variables into them. SAS issues the following error messages:
ERROR: Variable bdd has been defined as both character and numeric.
ERROR: Variable byy has been defined as both character and numeric.
data date_time;
length bdd $2 byy $4;
set cdat.date_time;
bdd=bdd;
byy=byy;
run;
The correct way is to define two character variables with different names, as follows:
data date_time(rename= (bdd2=bdd byy2=byy));
length bdd2 $2 byy2 $4;
set cdat.date_time;
bdd2=bdd;
byy2=byy;
drop bdd byy;
run;
Debugging Techniques
1. Always check the log window or log file.
2. Always search for the following keywords in the log to make sure the program ran correctly:
   ERROR
   WARNING
   UNDEFINED
   UNINITIALIZED
   W.D FORMAT
3. When an ERROR is detected in the log, always start debugging from the first ERROR in the log.
4. Always look into the data to see how the program logic changes it.
   The ways to look into the data include: PUT statement, proc print, FSV and VT, etc.
5. Trace the program DATA step by DATA step (or proc by proc).
6. Occasionally everything looks correct, but the program gives unexpected results.
   This may be because the SAS session is already damaged. Copying your code to a new
   SAS session and running it again might solve the problem.
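As a small illustration of using the PUT statement to look into the data, the sketch below writes variable values to the log while a DATA step runs. The variables i and x and the 'DEBUG:' label are hypothetical. PUT with var= writes each name with its value, and PUT _ALL_ writes every variable, including the automatic variables _N_ and _ERROR_.

```sas
data _null_;
  do i = 1 to 3;
    x = i ** 2;
    put 'DEBUG: ' i= x=;   /* one log line per loop iteration */
  end;
  put _all_;               /* dump all variables, including _N_ and _ERROR_ */
run;
```

Searching the log for the label then shows how the values change step by step.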