DUMP A SAS DATA SET TO A FLAT FILE TO USE IT WITH ANY POSTALSOFT PRODUCTS FOR BUSINESS AND MARKETING ANALYSIS

Sergey Sianissian, PreVision Marketing, Lincoln, MA

INTRODUCTION

The key to successful Database Marketing programs lies in segmenting customers based on their detailed buying activity, in order to target the "right" audiences for direct mail and other targeted promotions. Nowhere is the rigorous use of data for these applications more critical than for grocers, who have transaction-rich databases but razor-thin margins for targeted marketing.

To implement these complex, challenging targeting assignments for grocers and their packaged goods vendors, advanced analytical software (such as SAS) is often used. In addition, direct mail marketing software (e.g., PostalSoft, Group1) is needed to identify and parse names, titles and address data from multiple input files, to correct and standardize addresses, to eliminate duplicate records, etc. These programs usually operate on files in ASCII or DBF formats. That is why converting SAS data sets into flat files is a common task when developing advanced targeting applications.

A number of techniques have been used to do this: Kretzman (1992), Whitlock (1993) and Carpenter (1998). To determine the data structure, most of these techniques, like this article, use PROC CONTENTS.

The distinguishing features of the code below are:
- It generates the SAS program that actually converts the SAS data set into an ASCII flat file.
- It creates, along with the flat file, a FORMAT file (Appendix 1), which is required by all PostalSoft programs (ACE, Merge/Purge, etc.).
- It handles the different presentations of dates in a SAS file.

CONVERT SAS DATA SET TO A FLAT FILE

/***********************************************/
/** This program runs PROC CONTENTS on        **/
/** a SAS file and uses its results to        **/
/** generate a program that creates FLAT and  **/
/** FORMAT files from this SAS data set.      **/
/** It then executes the generated program.   **/
/***********************************************/
libname sasdirct 'D:\tape_oct98\sas data';   /* ❶ */

%MACRO genrflat(sasdirct, filename, flatfile);

proc contents data=&sasdirct..&filename
     out=soderz(keep=name length label format Npos)
     noprint;
run;

proc sort data=soderz;
  by Npos;
run;

/** Calculate the record's length and **/
/** create the MACRO variable         **/
/** with its value                    **/
data _NULL_;
  set soderz(keep=length name npos format) END=last;
  by Npos;
  retain sumlen;
  if format='DATE' then length+2;   /* ❷ */
  sumlen+length;
  if last then do;
    sumlen=sumlen+1;
    /* store the value without leading blanks */
    call symput('lenrec', trim(left(put(sumlen, 8.))));
  end;
run;

data _NULL_;
  set soderz END=eof;
  by Npos;
  retain posN;
  Npos+1;
  if format='DATE' then length+2;

  /** Generate SAS program which **/
  /** would create FLAT files    **/
  file "&flatfile\CreateFlat&filename..sas";
  if _N_=1 then do;   /* ❸ */
    put "filename FLAT '&flatfile\FLAT&filename..txt';" /
        'data _NULL_;' /
        "set &sasdirct..&filename;" /
        "length EOR $1;" /
        "EOR='X';" /
        "file FLAT lrecl=&lenrec;" /
        'put' /
        '@' @2 Npos @20 name @30 '/** ' length @38 '**/';
  end;
  if _N_ > 1 then do;   /* ❹ */
    put '@' @2 posN @20 name @30 '/** ' length @38 '**/';
  end;
  if eof then do;
    put '@' @2 "&lenrec" @20 'EOR' @30 '/** 1' @38 '**/' /
        ';' /
        'run;';
  end;
  posN=Npos+length;   /* ❺ */

  /** Create FORMAT file for  **/
  /** PostalSoft products     **/
  file "&flatfile\FLAT&filename..FMT";
  put name +(-1) ',' length +(-1) ',c';
  if eof then put 'EOR,3,b';   /* ❻ */
run;

filename IN "&flatfile.\CreateFlat&filename..sas";
%INCLUDE IN;   /* ❼ */

%mend genrflat;

%genrflat(sasdirct, test, D:\tape_oct98)   /* ❽ */
run;

❶ Specify the path to the SAS data set.
❷ If the SAS data set contains numeric "date" variables (maximum length = 8) whose format is DATE9. (for example, 05OCT1997), the last byte (in this example, "7") can be lost in the flat file. To avoid this, the variable's length is extended by 2.
❸ This DO group creates the header of the generated SAS program (Appendix 2).
❹ This DO group creates the rest of the generated SAS program.
❺ To get a "nice" column-specified flat file, use the calculated variable (posN) instead of the variable Npos from PROC CONTENTS.
❻ For PostalSoft programs, the End Of Record (EOR) field should have a binary type.
❼ Call the generated SAS program.
❽ The macro genrflat takes three arguments:
- path to the SAS data set (the libref defined at ❶)
- name of the SAS data set
- destination directory for the flat file

Converting a SAS data set of 1,000,000 records to a flat file takes approximately 2 minutes.

CONCLUSION

This paper presents SAS code that converts a SAS data set to a flat file and creates its FORMAT file. Both files can be used extensively for business and marketing analysis with direct marketing name and address hygiene programs (e.g., PostalSoft, Group1). The code takes into account the different ways dates are presented in SAS data sets.

ACKNOWLEDGMENTS

Song Jungdong contributed extensively to the development of this paper. His support and suggestions are greatly appreciated.

PreVision Marketing is a database marketing agency specializing in the development and implementation of relationship marketing strategies, including comprehensive customer loyalty, upgrade and acquisition programs. PreVision provides strategic, analytic, creative and mail production services, along with support in the selection and efficient use of the newest database technologies.

AUTHOR CONTACT INFORMATION

Sergey Sianissian
PreVision Marketing, Inc
55 Old Bedford Road
Lincoln, MA 01773
Direct: (781) 259-5169
Fax: (781) 259-1548
E-mail: ssianissian@previsionmarketing.com

TRADEMARK INFORMATION

SAS is a registered trademark of SAS Institute Inc.
PostalSoft is a registered trademark of Firstlogic Inc.
APPENDIX 1
EXAMPLE OF A FORMAT FILE

DATEBRTH,10,c
ADDR1,30,c
CITY,20,c
STATE,2,c
ZIPFULL,10,c
LNAME,20,c
FNAME,15,c
EOR,3,b

APPENDIX 2
EXAMPLE OF A GENERATED SAS PROGRAM

filename FLAT 'D:\tape_oct98\FLATtest.txt';
data _NULL_;
set sasdirct.test;
length EOR $1;
EOR='X';
file FLAT lrecl=108;
put
@1                 DATEBRTH  /** 10  **/
@11                ADDR1     /** 30  **/
@39                CITY      /** 20  **/
@59                STATE     /** 2   **/
@61                ZIPFULL   /** 10  **/
@71                LNAME     /** 20  **/
@91                FNAME     /** 15  **/
@108               EOR       /** 1   **/
;
run;
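
In the generated program above, the start columns are derived from Npos, so they do not shift when a date field is widened by two bytes: ADDR1 starts at column 11 with a width of 30, while CITY starts at column 39. If start columns that agree exactly with the widths listed in the FORMAT file are wanted, they can be accumulated from the lengths instead. The following is only a minimal sketch under that assumption, not part of the original macro. It relies on the WORK.SODERZ data set left behind by the macro (already sorted by Npos); the data set name layout and the variables width, startcol and nextcol are illustrative names only.

```sas
/* Sketch: derive start columns by accumulating field widths.   */
/* Assumes WORK.SODERZ from PROC CONTENTS, sorted by Npos.      */
/* LAYOUT, width, startcol and nextcol are hypothetical names.  */
data layout;
  set soderz(keep=name length format npos);
  by npos;
  retain nextcol 1;              /* first field starts in column 1 */
  width = length;
  if format = 'DATE' then width = width + 2;   /* same widening as in the macro */
  startcol = nextcol;            /* start column of this field     */
  nextcol  = nextcol + width;    /* start column of the next field */
  keep name startcol width;
run;

proc print data=layout noobs;
run;
```

For the widths listed in Appendix 1, this accumulation gives start columns 1, 11, 41, 61, 63, 73 and 93, and the next free column is 108, which matches the EOR position and the lrecl value in Appendix 2.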