Chapter 14: Generating Data with Do Loops OBJECTIVES • • • • • Understand iterative DO loops. Construct a DO loop to perform repetitive calculations Use DO loops to eliminate redundant code. Use DO loop processing to conditionally execute code. Generate multiple observations in one iteration of the DATA step • Construct nested DO loops DO Loop Processing • Statements within a DO loop execute for a specific number of iterations or until a specific condition stops the loop. DATA statement SAS statements DO statement iterated SAS statements END statement SAS statements RUN statement 2 Repetitive Coding • Compare the interest for yearly versus quarterly compounding on a $50,000 investment made for one year at 7.5 percent interest. • How much money will a person accrue in each situation? 3 Repetitive Coding data compound; Amount=50000; Rate=.075; Yearly=Amount*Rate; Quarterly+((Quarterly+Amount)*Rate/4); Quarterly+((Quarterly+Amount)*Rate/4); Quarterly+((Quarterly+Amount)*Rate/4); Quarterly+((Quarterly+Amount)*Rate/4); run; 4 Repetitive Coding proc print data=compound noobs; run; PROC PRINT Output Amount Rate Yearly Quarterly 50000 0.075 3750 3856.79 What if you wanted to determine the quarterly compounded interest after a period of 20 years (80 quarters)? 5 DO Loop Processing data compound(drop=Qtr); Amount=50000; Rate=.075; Yearly=Amount*Rate; do Qtr=1 to 4; Quarterly+((Quarterly+Amount)*Rate/4); end; run; 6 The Iterative DO Statement • The iterative DO statement executes statements between DO and END statements repetitively, based on the value of an index variable. DO index-variable=specification-1 <,…specification-n>; <additional SAS statements> END; • specification-1…specification-n can represent a range of values or a list of specific values. 7 General Syntax of a DO Loop DO index-variable = start TO stop BY increment; SAS statements; END; • Index-variable stores the value of the current iteration of the DO loop. • Start, stop, increment values are set upon entry of the DO loop. • Can not be changed during the processing the DO loop • Can be numbers, variables, or SAS expressions. The Iterative DO Statement • The BY clause is optional. If ignored, the increment is 1. • Increment may be negative, which will process the DO loop backwards. For this situation, Start should be larger than Stop. • When completing the DO loop, the value of the indexvariable is STOP_value+Increment, not the STOP_value. Ex: Do month = 1 to 12; pay = pay*1.05; END; Once the DO loop is completed, the value of month = 13, NOT 12. 9 The Iterative DO Statement What are the values of each of the four index variables? Out of range do i=1 to 12; 1 2 3 4 5 6 7 8 9 10 11 12 13 do j=2 to 10 by 2; 2 4 6 8 10 12 do k=14 to 2 by –2; 14 12 10 8 6 4 2 0 do m=3.6 to 3.8 by .05; 3.60 3.65 3.70 3.75 3.80 3.85 10 The Iterative DO Statement DO index-variable=item-1 <,…item-n>; • item-1 through item-n can be either all numeric or all character constants, or they can be variables. • The DO loop is executed once for each value in the list. 11 The Iterative DO Statement How many times will each DO loop execute? do Month='JAN','FEB','MAR'; 3 times. do Fib=1,2,3,5,8,13,21; 7 times. do i=Var1,Var2,Var3; 3 times. do j=BeginDate to Today() by 7; The number of iterations depends on the values of BeginDate and Today(). do k=Test1-Test50; k is determined by subtracting Test50 from Test1. 1 time. A single value of 12 Use ‘TO’ not ‘-’. Performing Repetitive Calculations On January 1 of each year, $5,000 is invested in an account. Determine the value of the account after three years based on a constant annual interest rate of 7.5 percent. data invest; do Year=2001 to 2003; Capital+5000; Capital+(Capital*.075); end; run; 14 Performing Repetitive Calculations proc print data=invest noobs; run; PROC PRINT Output Year Capital 2004 17364.61 NOTE: The Year is 2004, NOT 2003. The Capital is not for 2003. 15 Performing Repetitive Calculations Generate a separate observation for each year. data invest; do Year=2001 to 2003; Capital+5000; Capital+(Capital*.075); output; end; Explicit output run; proc print data=invest noobs; run; 16 Performing Repetitive Calculations PROC PRINT Output Year Capital 2001 2002 2003 5375.00 11153.13 17364.61 Why is the value of Year not equal to 2004 in the last observation? NOTE: The use of OUTPUT, the explicit output, controls the output from 2001 to 2003. 17 Reducing Redundant Code • The example that forecast the growth of each division of an airline. • Partial Listing of the data set: mylib.growth 18 Division Num Emps Increase APTOPS FINACE FLTOPS 205 198 187 0.075 0.040 0.080 Reducing Redundant Code A Forecasting of Airline Growth data forecast; set mylib.growth(rename=(NumEmps=NewTotal)); Year=1; NewTotal=NewTotal*(1+Increase); output; Year=2; NewTotal=NewTotal*(1+Increase); output; Year=3; NewTotal=NewTotal*(1+Increase); output; run; What if you want to forecast growth over the next 30 years? 19 Reducing Redundant Code Use a DO loop to eliminate the redundant code in the previous example. data forecast; set mylib.growth(rename=(NumEmps=NewTotal)); do Year=1 to 3; NewTotal=NewTotal*(1+Increase); output; end; run; 20 Reducing Redundant Code proc print data=forecast noobs; run; Partial PROC PRINT Output Division APTOPS APTOPS APTOPS FINACE New Total Increase Year 220.38 236.90 254.67 205.92 0.075 0.075 0.075 0.040 1 2 3 1 What if you want to forecast the number of years it would take for the size of the Airport Operations Division to exceed 300 people? 21 Conditional Iterative Processing • You can use DO WHILE and DO UNTIL statements to stop the loop when a condition is met rather than when the index variable exceeds a specific value. • To avoid infinite loops, be sure that the condition specified will be met. 22 The DO WHILE Statement The DO WHILE statement executes statements in a DO loop while a condition is true. DO WHILE (expression); <additional SAS statements> END; • expression is evaluated at the top of the loop. • The statements in the loop never execute if the expression is initially false. 23 The DO UNTIL Statement The DO UNTIL statement executes statements in a DO loop until a condition is true. DO UNTIL (expression); <additional SAS statements> END; • expression is evaluated at the bottom of the loop. • The statements in the loop are executed at least once. 24 Conditional Iterative Processing Task: Determine the number of years it would take for an account to exceed $1,000,000 if $5,000 is invested annually at 7.5 percent. 25 Conditional Iterative Processing data invest; do until(Capital>1000000); Year+1; Capital+5000; Capital+(Capital*.075); end; run; proc print data=invest noobs; run; 26 Conditional Iterative Processing PROC PRINT Output Capital 1047355.91 Year 38 How could you generate the same result with a DO WHILE statement? 27 The Iterative DO Statement with a Conditional Clause You can combine DO WHILE and DO UNTIL statements with the iterative DO statement. DO index-variable=start TO stop <BY increment> WHILE | UNTIL (expression); <additional SAS statements> END; This is one method of avoiding an infinite loop in DO WHILE or DO UNTIL statements. 28 The Iterative DO Statement with a Conditional Clause Task: • Determine the return of the account again. • Stop the loop if 25 years is reached or more than $250,000 is accumulated. 29 The Iterative DO Statement with a Conditional Clause data invest; do Year=1 to 25 until(Capital>250000); Capital+5000; Capital+(Capital*.075); end; run; proc print data=invest noobs; run; 30 The Iterative DO Statement with a Conditional Clause PROC PRINT Output 31 Year Capital 21 255594.86 Nested DO Loops Nested DO loops are loops within loops. When you nest DO loops, – use different index variables for each loop – be certain that each DO statement has a corresponding END statement. 32 Nested DO Loops Task: • Create one observation per year for five years and show the earnings if you invest $5,000 per year with 7.5 percent annual interest compounded quarterly. 33 Nested DO Loops 5x data invest(drop=Quarter); do Year=1 to 5; Capital+5000; do Quarter=1 to 4; 4x Capital+(Capital*(.075/4)); end; output; end; run; proc print data=invest noobs; run; 34 Nested DO Loops PROC PRINT Output Year 1 2 3 4 5 Capital 5385.68 11186.79 17435.37 24165.94 31415.68 How could you generate one observation for each quarterly amount? 35 Nested DO Loops Task: • Compare the final results of investing $5,000 a year for five years in three different banks that compound quarterly. Assume each bank has a fixed interest rate. mylib.Banks 36 Name Rate Calhoun Bank and Trust State Savings Bank National Savings and Trust 0.0718 0.0721 0.0728 Nested DO Loops 3x 37 data invest(drop=Quarter Year); set mylib.banks; Capital=0; do Year=1 to 5; Capital+5000; do Quarter=1 to 4; 5x 4x Capital+(Capital*(Rate/4)); end; end; run; Nested DO Loops proc print data=invest noobs; run; PROC PRINT Output Name Calhoun Bank and Trust State Savings Bank National Savings and Trust 38 Rate Capital 0.0718 31106.73 0.0721 31135.55 0.0728 31202.91