SAS Workshop Iowa State University Introduction to SAS Programming

Day 1, Session III
May 9, 2016
Repetitive Computation
Repetitive computation is achieved through the use of do loops.
In the SAS data step language, several forms of do statements are available.
An iterative do loop is generally used to perform the same computation on a sequence of variables.
This requires the sequence of variables to be defined as elements of an array.
The array statement lets the user reference the variables through the matching array elements, using subscripts.
The use of iterative do loops and the array statement in the data step is illustrated in Examples A9 to A12.
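As a rough illustration of the "several forms" mentioned above, the sketch below runs the same three-pass loop as an iterative do, a do while, and a do until. The data set name do_forms and the variables Form, Value, and i are hypothetical and chosen only for this sketch.

```sas
data do_forms;
  length Form $9;

  /* Iterative do: the index variable i takes the values 1, 2, 3 */
  do i = 1 to 3;
    Form = 'iterative';
    Value = i;
    output;
  end;

  /* do while: the condition is tested at the top of each pass */
  i = 1;
  do while (i <= 3);
    Form = 'while';
    Value = i;
    output;
    i = i + 1;
  end;

  /* do until: the condition is tested at the bottom of each pass,
     so the body always runs at least once */
  i = 1;
  do until (i > 3);
    Form = 'until';
    Value = i;
    output;
    i = i + 1;
  end;

  drop i;
run;

proc print data=do_forms;
run;
```

The examples in this session use only the iterative form, since it pairs naturally with array subscripts.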
Processing a Sequence of Variables
Example A9
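data compete;
  input Red Blue Grey Green White;
  /* grade8(1) to grade8(5) refer to Red, Blue, Grey, Green, White */
  array grade8(5) Red Blue Grey Green White;
  Total=0;
  drop Team;
  do Team=1 to 5;
    if grade8(Team)=. then grade8(Team)=0;   /* replace a missing score by zero */
    grade8(Team)=grade8(Team)*10;            /* rescale each score */
    Total + grade8(Team);                    /* sum statement accumulates Total */
  end;
datalines;
4 6 0 1 .
3 2 8 9 12
5 . 4 7 6
7 5 10 4 5
;
proc print; run;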
Writing observations into the data set
When a SAS data step executes, its statements are run once for every line of input data, and an observation is written to the output SAS data set at the end of each pass.
However, the user can insert an output statement at any point in the data step where a new observation should be written to the SAS data set.
When an output statement is encountered, SAS writes a new observation to the SAS data set containing the current values of the variables.
Once a data step contains an output statement, observations are written only where that statement executes, not automatically at the end of each pass.
In Example A10, we create data values internally in the data step (i.e., no external data are input) by doing a calculation, and use the output statement to write the results as new observations.
We use an iterative do loop to repeat the calculation and write each result to the data set as an observation.
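The effect of an explicit output statement can be seen in a small sketch such as the one below, which is not one of the workshop examples. The data set name demo and the variables x and y are hypothetical. Because output appears twice, each data line should produce two observations, each holding the values of x and y at the moment the statement executes.

```sas
data demo;
  input x;
  y = x*2;
  output;        /* first observation for this data line: y = 2x */
  y = x*3;
  output;        /* second observation for the same data line: y = 3x */
datalines;
1
2
;
proc print data=demo;
run;
```

With two data lines and two output statements, the resulting data set would contain four observations rather than two.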
Writing Observations into a SAS Dataset
Example A10
data convert;
  do Celsius=-10 to 40 by 5;
    Fahrenheit=9*Celsius/5+32;
    output;   /* write one observation per Celsius value */
  end;
run;
proc print data=convert;
  title "Celsius to Fahrenheit Converter";
run;
More uses of do loops and arrays
In Example A4 we used do loops and arrays to change or transform data values in a data set.
By inserting an output statement inside a do loop, we can form multiple observations from the data values in a single data line.
This gives us a useful method to perform an operation called transposing.
Transposing uses the data lines in the input data to form columns (or variables) in a SAS data set.
In Example A11, the data values in each data line (quiz scores) become values of the variable Score in the output data set.
The value of the variable Name remains the same for all of the observations formed from the same data line.
More Examples of do loop and array
Example A11
data quizzes;
  input Name $ Quiz1-Quiz5;
  array scores(5) Quiz1-Quiz5;
  drop Quiz1-Quiz5;
  do Test=1 to 5;
    if scores(Test)=. then scores(Test)=0;   /* replace a missing quiz score by zero */
    Score=scores(Test);
    output;   /* one observation per quiz: Name, Test, Score */
  end;
datalines;
Smith 8 7 9 . 3
Jones 4 5 10 8 4
;
proc print data=quizzes;
run;
More uses of do loops and arrays
In Example A12 we use the method discussed in the previous example to read in a data set.
At the same time, we convert it to a form suitable for input to proc anova, proc glm, etc.
In practice, this is most often done by reading in a single value per data line.
The method we use is more intuitive because the data appear in the data lines in the same form as they would in a data table.
Notice that this example has two do loops, an inner do loop nested within an outer do loop.
The index variables of the two loops are Amount and Concentration themselves, so they take on the actual values of those variables.
Example A12
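data reaction;
  length Concentration $4;
  /* outer loop: one pass per data line (row of the table) */
  do Amount=.9 to .6 by -.1;
    /* inner loop: one pass per value within the data line (column) */
    do Concentration='1%','1.5%','2%','2.5%','3%';
      input Time @@;   /* @@ holds the data line so the next input continues on it */
      output;
    end;
  end;
datalines;
10.9 11.5 9.8 12.7 10.6
9.2 10.3 9.0 10.6 9.4
8.7 9.7 8.2 9.4 8.5
7.2 8.6 7.5 9.7 7.7
;
proc print;
  title 'Reaction times for biological substrate';
run;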
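For contrast, the one-value-per-data-line layout mentioned before Example A12 might look like the sketch below. The data set name reaction_long is hypothetical, and only the first row of the table (Amount = .9) is typed in here; the remaining rows would follow the same pattern.

```sas
data reaction_long;
  length Concentration $4;
  /* each data line carries the factor levels and a single response */
  input Amount Concentration $ Time;
datalines;
0.9 1%   10.9
0.9 1.5% 11.5
0.9 2%   9.8
0.9 2.5% 12.7
0.9 3%   10.6
;
proc print data=reaction_long;
run;
```

This layout needs no do loops, but the factor levels must be repeated on every data line, which is what the nested loops in Example A12 avoid.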
Additional Notes on Arrays
Arrays are used for repetitive processing of variables.
The array statement can be used to perform the same task on a group of variables.
array array-name (subscript) <variable-list> <(initial-values)>;
You can then use the array name with parentheses and a subscript, as in the examples.
Notes:
1. All the variables in an array must be of the same type.
2. An array cannot have the same name as a variable in the same data step.
3. The subscript may be a number giving the dimension size, or a range of subscripts such as 1:5.
4. If an asterisk (*) is used as the subscript, SAS will determine the dimension size by counting the variables in the list.
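The notes above can be tried out with a short sketch like the following, which is not part of the workshop examples. The data set name weighted and the variables w1-w3, Weighted, and k are hypothetical. It uses an asterisk subscript together with the dim function (which returns the number of elements in an array) and an initial-value list in a second array.

```sas
data weighted;
  input Quiz1-Quiz3;
  /* (*) lets SAS count the variables in the list: three here */
  array scores(*) Quiz1-Quiz3;
  /* initial values are given in parentheses; w1-w3 are created by this statement */
  array wt(3) w1-w3 (0.2 0.3 0.5);
  Weighted = 0;
  do k = 1 to dim(scores);
    Weighted = Weighted + wt(k)*scores(k);   /* weighted sum of the quiz scores */
  end;
  drop k w1-w3;
datalines;
8 7 9
4 5 10
;
proc print data=weighted;
run;
```

Using dim(scores) as the upper bound means the loop does not need to change if more quizzes are later added to the variable list.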