A simple way to create a running total

David Fickbohm, Homegain+, Emeryville, CA
ABSTRACT -- This paper explains how to create a running total. It describes how the author
used multiple arrays and the END = FINISHED method to determine when to stop outputting
detail lines and output a total line. This poster is oriented toward beginning SAS
programmers.
PROBLEM –
Create a procedure using SAS that will deal effectively and efficiently with the following situations:
• Clients want a daily report that shows the date, number of visits for each day of the
month, and the total visits for the month to date.
• Clients do not always begin sending visits at the beginning of a time period.
• Clients may or may not send visits each day.
• Clients do not always stop sending visits at the end of a time period.
METHODS USED –
The author used the following methods to deal with the problem above.
• Dynamically determined beginning, ending and prior day dates
• Used Arrays with do loops to repeat steps
RESULTS –
• A process that will deal with multiple clients or a single client.
• A process that will display zero visits from the beginning of a time period until the first day
with data is found.
• A process that will display the proper number of visits each day that visits are found and
recorded.
• A process that will display zero visits on days when a client does not send any visits but
will still display the proper month-to-date total of visits on that day.
• A process that will display zero visits from the last day the client sends visits until the last
day of the time period while still showing the total number of visits for the month on each
of those days.
CODE TO DYNAMICALLY SET BEGINNING AND ENDING DATES
DATA _NULL_;
%LET TODAY = %SYSFUNC(PUTN(%SYSFUNC(TODAY()),DATE9.));
%PUT &TODAY;
%LET BOM =
%SYSFUNC(PUTN(%SYSFUNC(INTNX(month,"&TODAY"D,0,BEG)),DATE9.));
%PUT &BOM;
%LET EOM =
%SYSFUNC(PUTN(%SYSFUNC(INTNX(month,"&TODAY"D,0,END)),DATE9.));
%PUT &EOM;
RUN;
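The %LET statements above are macro statements, so the macro processor executes them as it reads them and the surrounding DATA _NULL_ step does no work of its own. As a minimal sketch of an alternative (not the author's code), the same two macro variables can be set from inside a DATA step with CALL SYMPUTX. It assumes the names BOM and EOM used above and a SAS release that supports CALL SYMPUTX.

```sas
/* Sketch only: set BOM and EOM from within a DATA step.            */
/* INTNX with an increment of 0 stays inside the current month;     */
/* 'BEG' and 'END' align the result to its first and last day.      */
data _null_;
  call symputx('BOM', put(intnx('month', today(), 0, 'BEG'), date9.));
  call symputx('EOM', put(intnx('month', today(), 0, 'END'), date9.));
run;

/* Write both values to the log to confirm them before they are used */
%put BOM=&BOM EOM=&EOM;
```

Either form leaves BOM and EOM holding DATE9. text, so later code refers to them as date literals: "&BOM"D and "&EOM"D.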
INPUT DATA
Missing dates:
Partner 10 - July 1 through July 19, 21 through 24, 27, 28, 30, 31
Partner 11 - July 1 through July 24, 27, 28, 30
data currmm;
input Partner click_dt : mmddyy10. visits;
format click_dt mmddyy10.;
datalines;
10 7/20/2007 5
10 7/25/2007 10
10 7/26/2007 15
10 7/29/2007 5
11 7/25/2007 20
11 7/26/2007 25
11 7/29/2007 55
11 7/31/2007 30
;
run;
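Before any missing dates are filled in, the running total itself rests on two Base SAS features: BY-group processing and the sum statement. The following is a minimal sketch, assuming the CURRMM data set above is already in partner and date order. RUNNING_TOTAL is a hypothetical data set name used only for illustration.

```sas
/* Sketch only: running total per partner, without filling date gaps */
data running_total;
  set currmm;
  by partner;                             /* creates first.partner and last.partner */
  if first.partner then sum_visits = 0;   /* restart the total for each partner */
  sum_visits + visits;                    /* sum statement: sum_visits is retained
                                             across rows and missing visits are ignored */
run;
```

For partner 10 this gives running totals of 5, 15, 30 and 35 on the four dates with data. The numbered code that follows produces the same totals and also writes a row for every date that has no data.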
Numbered SAS code
01  proc sort data=currmm; by partner; run;
02  /* FIND AND OUTPUT DATES MISSING DATA AT BEGINNING OF MONTH */
03  DATA MONTHLY (keep=partner click_dt visits sum_visits);
04    SET CURRMM (rename= (click_dt = dt)) END=FINISHED;
05    by partner;
06    retain running_date;
07    format running_date click_dt mmddyy10.;
08    temp_visits = visits;
09    /* OUTPUT DATES MISSING DATA AT BEGINNING OF EACH PARTNER*/
10    if first.partner then do;
11      sum_visits = 0;
12      do click_dt = "&BOM"D to dt - 1;
13        visits = 0;
14        output;
15      end;
16      running_date = dt - 1;
17    end;
18    else do;
19      /* FIND AND OUTPUT DATES MISSING DATA IN MIDDLE OF MONTH */
20      if dt - 1 ge running_date then do;
21        do click_dt = running_date to dt - 1;
22          visits = 0;
23          output;
24        end;
25      end;
26    end;
27    /* OUTPUT DATA WITH NON MISSING DATES */
28    visits = temp_visits;
29    SUM_VISITS + VISITS;
30    click_dt = dt;
31    if visits = . then
32      visits = 0;
33    output;
34    if last.partner then do;
35      if "&EOM"D gt dt then do;
36        do click_dt = dt + 1 to "&EOM"D;
37          visits = 0;
38          output;
39        end;
40      end;
41    end;
42    running_date = dt + 1;
43  run;
EXPLANATION OF CODE
01 Sorts input data by partner
02 Comment
03 Names the output data set and keeps the variables to be displayed
04 Reads CURRMM and renames the date on the current record to dt
05 Causes breaks by partner (creates first.partner and last.partner)
06 Keeps running date available from one record to the next
07 Formats dates as mm/dd/yyyy
08 Saves the original input value of visits
09 Comment
10 Checks for a change in partner
11 If new partner, resets the running total to zero
12 Repeats the steps below from the first day of the month to dt - 1
13 Assigns zero to visits
14 Outputs to monthly
15 End of inner do
16 Makes running date equal to dt - 1
17 End of outer do
18 Otherwise (not the first record for the partner), does the following
19 Comment
20 If one or more dates are missing before dt, then
21 Repeats the following steps from running date to dt - 1
22 Assigns zero to visits
23 Outputs to monthly
24 End of first do
25 End of second do
26 End of third do
27 Comment
28 Assigns temp_visits to visits
29 Adds visits to the running total sum_visits
30 Restores the original input date to click_dt
31 If visits is missing, then
32 Assigns zero to visits
33 Outputs to monthly
34 Checks for the last observation for the current partner
35 If last partner record, checks for dates missing data from the current date until the end of the month
36 Repeats the following steps from dt + 1 to the end of the month
37 Assigns zero to visits
38 Outputs to monthly
39 End of first do loop
40 End of second do loop
41 End of third do loop
42 Increments running date to the next expected date
43 Ends the DATA step
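One way to cross-check the step explained above is to build the report with a different technique and compare the results. The sketch below is not the author's method. It creates one row per partner per day of the month and match-merges the visits onto those rows. The data set names PARTNERS, CALENDAR, SORTED and MONTHLY_ALT and the macro variables ALT_BOM and ALT_EOM are hypothetical, and the dates are hard-coded to the July 2007 sample month as an assumption.

```sas
/* Sketch only: calendar-and-merge cross-check of the running total.  */
/* ALT_BOM and ALT_EOM are hypothetical; fixed to the sample month.   */
%let ALT_BOM = 01JUL2007;
%let ALT_EOM = 31JUL2007;

/* One row per partner */
proc sort data=currmm out=partners (keep=partner) nodupkey;
  by partner;
run;

/* One row per partner per day of the month */
data calendar;
  set partners;
  format click_dt mmddyy10.;
  do click_dt = "&ALT_BOM"d to "&ALT_EOM"d;
    output;
  end;
run;

/* Visits in partner and date order for the match-merge */
proc sort data=currmm out=sorted;
  by partner click_dt;
run;

data monthly_alt;
  merge calendar (in=in_calendar) sorted;
  by partner click_dt;
  if first.partner then sum_visits = 0;  /* restart the total for each partner */
  if in_calendar;                        /* keep only dates inside the month */
  if visits = . then visits = 0;         /* days with no data show zero visits */
  sum_visits + visits;                   /* retained running total */
run;
```

With the sample data, MONTHLY_ALT should have one row per partner per day, which can be compared with MONTHLY using PROC COMPARE. The single-pass version in the numbered code avoids the extra sorts and the merge, which may matter for large inputs.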
SAS Log
4    DATA _NULL_;
5    %LET TODAY = %SYSFUNC(PUTN(%SYSFUNC(TODAY()),date9.));
6    %PUT &TODAY;
SYMBOLGEN: Macro variable TODAY resolves to 06JUL2007
06JUL2007
7    %LET BOM = %SYSFUNC(PUTN(%SYSFUNC(INTNX(month,"&TODAY"D,0,BEG)),date9.));
SYMBOLGEN: Macro variable TODAY resolves to 06JUL2007
8    %PUT &BOM;
SYMBOLGEN: Macro variable BOM resolves to 01JUL2007
01JUL2007
9    %LET EOM = %SYSFUNC(PUTN(%SYSFUNC(INTNX(month,"&TODAY"D,0,END)),date9.));
SYMBOLGEN: Macro variable TODAY resolves to 06JUL2007
10   %PUT &EOM;
SYMBOLGEN: Macro variable EOM resolves to 31JUL2007
31JUL2007
11   RUN;
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
12
13   data currmm;
14   input partner click_dt : MMDDYY10. visits :; format click_dt MMDDYY10.;
15   datalines;
NOTE: The data set WORK.CURRMM has 8 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
24   ;
25
26   DATA MONTHLY (KEEP = PARTNER CLICK_DT VISITS SUM_VISITS);
27   SET CURRMM (RENAME = (CLICK_DT = DT)) END = FINISHED;
28   BY PARTNER;
29   RETAIN RUNNING_DATE;
30   FORMAT RUNNING_DATE CLICK_DT DT mmddyy10. ;
31   TEMP_VISITS = VISITS;
32   /* find and output missing dates at beginning of MONTH */
33   IF FIRST.PARTNER THEN DO;
34   SUM_VISITS = 0;
35   DO CLICK_DT = "&BOM"D TO DT - 1;
SYMBOLGEN: Macro variable BOM resolves to 01JUL2007
36   VISITS = 0;
37   SUM_VISISTS = 0;
38   OUTPUT;
39   END;
40   RUNNING_DATE = DT - 1;
41   END;
42   ELSE DO;
43   /* FIND AND OUTPUT MISSING DATES IN MIDDLE OF MONTH*/
44   IF DT -1 GE RUNNING_DATE THEN DO;
45   DO CLICK_DT = RUNNING_DATE TO DT -1;
46   VISITS = 0;
47   OUTPUT;
48   END;
49   END;
50   END;
51   /* OUTPUT DATA WITH NON MISSING DATES */
52   VISITS = TEMP_VISITS;
53   SUM_VISITS + VISITS;
54   CLICK_DT = DT;
55   IF VISITS = . THEN VISITS = 0; /* CHECK FOR MISSING VALUES CHANGE TO ZERO
55 ! */
56   OUTPUT;
57   /* FIND AND OUTPUT MISSING DATES AT THE END OF THE MONTH*/
58   IF LAST.PARTNER THEN DO;
SYMBOLGEN: Macro variable EOM resolves to 31JUL2007
59   IF "&EOM"D GT DT THEN DO;
60   DO CLICK_DT = DT + 1 TO "&EOM"D;
SYMBOLGEN: Macro variable EOM resolves to 31JUL2007
61   VISITS = 0;
62   OUTPUT;
63   END;
64   END;
65   END;
66   RUNNING_DATE = DT + 1; /* INCREMENT RUNNING DATE TO NEXT EXPECTED DATE */
67   RUN;
NOTE: There were 8 observations read from the data set WORK.CURRMM.
NOTE: The data set WORK.MONTHLY has 62 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
SAS output
partner  visits  CLICK_DT    SUM_VISITS
10       0       7/1/2007    0
10       0       7/2/2007    0
10       0       7/3/2007    0
10       0       7/4/2007    0
10       0       7/5/2007    0
10       0       7/6/2007    0
10       0       7/7/2007    0
10       0       7/8/2007    0
10       0       7/9/2007    0
10       0       7/10/2007   0
10       0       7/11/2007   0
10       0       7/12/2007   0
10       0       7/13/2007   0
10       0       7/14/2007   0
10       0       7/15/2007   0
10       0       7/16/2007   0
10       0       7/17/2007   0
10       0       7/18/2007   0
10       0       7/19/2007   0
10       5       7/20/2007   5
10       0       7/21/2007   5
10       0       7/22/2007   5
10       0       7/23/2007   5
10       0       7/24/2007   5
10       10      7/25/2007   15
10       15      7/26/2007   30
10       0       7/27/2007   30
10       0       7/28/2007   30
10       5       7/29/2007   35
10       0       7/30/2007   35
10       0       7/31/2007   35
11       0       7/1/2007    0
11       0       7/2/2007    0
11       0       7/3/2007    0
11       0       7/4/2007    0
11       0       7/5/2007    0
11       0       7/6/2007    0
11       0       7/7/2007    0
11       0       7/8/2007    0
11       0       7/9/2007    0
11       0       7/10/2007   0
11       0       7/11/2007   0
11       0       7/12/2007   0
11       0       7/13/2007   0
11       0       7/14/2007   0
11       0       7/15/2007   0
11       0       7/16/2007   0
11       0       7/17/2007   0
11       0       7/18/2007   0
11       0       7/19/2007   0
11       0       7/20/2007   0
11       0       7/21/2007   0
11       0       7/22/2007   0
11       0       7/23/2007   0
11       0       7/24/2007   0
11       20      7/25/2007   20
11       25      7/26/2007   45
11       0       7/27/2007   45
11       0       7/28/2007   45
11       55      7/29/2007   100
11       0       7/30/2007   100
11       30      7/31/2007   130
CONCLUSION
Base SAS is a very powerful programming language. Arrays and DO WHILE loops are effective ways to
repeat steps in a process until a condition changes.
ACKNOWLEDGMENTS
Jan Squillace, SAS Tech Support. Thanks, Jan.
CONTACT INFORMATION
Please feel free to contact me with questions and comments about this paper:
David Fickbohm
510 594 4151 day phone
davidf@homegain.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries.
® indicates USA registration. Other brand and product names are trademarks of their respective companies.