Appendix C: SAS Software
Mathematical Marketing

Slide C.1

Uses of SAS
• CRM
• data mining
• data warehousing
• linear programming
• forecasting
• econometrics
• nonlinear parameter estimation
• simulation
• marketing models
• statistical analysis

Data Types SAS Can Deal with
• panel data
• relational databases
• scanner data
• Web log data
• questionnaires

Ideal When You Are …
• transforming
• manipulating
• massaging
• sorting
• merging
• lookups
• reporting

Slide C.2: Two Types of SAS Routines

DATA Steps
• Read and Write Data
• Create a SAS dataset
• Manipulate and Transform Data
• Open-Ended - Procedural Language
• Presence of INPUT statement creates a Loop

PROC Steps
• Analyze Data
• Canned or Preprogrammed Input and Output

Slide C.3: A Simple Example

    data my_study ;
    input id gender $ green recycle ;
    cards ;
    001 m 4 2
    002 m 3 1
    003 f 3 2
    ••• ••• ••• •••
    ;
    /* PROC GLM, unlike PROC REG, accepts a CLASS statement */
    proc glm data=my_study ;
    class gender ;
    model recycle = green gender ;
    run ;

Slide C.4: The Sequence Depends on the Need

    data step to read in scanner data ;
    data step to read in panel data ;
    data step to merge scanner and panel records ;
    data step to change the level of analysis to the household ;
    proc step to create covariance matrix ;
    data step to write covariance matrix in LISREL compatible format ;

Slide C.5: The INPUT Statement - Character Data

List input: $ after a variable marks it as a character variable

    input last_name $ first_name $ initial $ ;

Formatted input: $w. after a variable

    input last_name $22. first_name $22. initial $1. ;

Column input: $ start-column - end-column

    input last_name $ 1 - 22 first_name $ 23 - 44 initial $ 45 ;

Slide C.6: The INPUT Statement - Numeric Data

List input

    input score_1 score_2 score_3 ;

Formatted input: w.d (field width and number of digits after an implied decimal point) after a variable

    input score_1 10. score_2 10. score_3 10. ;

Column input: start-column - end-column

    input score_1 1 - 10 score_2 11 - 20 score_3 21 - 30 ;

Slide C.7: Grouped INPUT Statements

    input (var1-var3) (10. 10. 10.) ;
    input (var1-var3) (3*10.) ;
    input (var1-var3) (10.) ;
    input (name var1-var3) ($10. 3*5.1) ;

Slide C.8: The Column Pointer in the INPUT Statement

    input @3 var1 10. ;

    input more @ ;
    if more then input @15 x1 x2 ;

    input @12 x1 5. +3 x2 ;

Slide C.9: Documenting INPUT Statements

    input @4  green1 4.     /* greenness scale first item */
          @9  green2 4.     /* greenness scale 2nd item   */
          @20 aware1 5.     /* awareness scale first item */
          @20 aware2 5. ;   /* awareness scale 2nd item   */

Slide C.10: The Line Pointer

    input x1 x2 x3 / x4 x5 x6 ;
    input x1 x2 x3 #2 x4 x5 x6 ;
    input #2 x1 x2 x3 x4 x5 x6 ;

Slide C.11: The PUT Statement

    put x1 x2 x3 @ ;
    input x4 ;
    put x4 ;

    put _all_ ;
    put a= b= ;
    put x1 #2 x2 ;
    put _infile_ ;
    put x1 / x2 ;
    put _page_ ;

    col1 = 22 ; col2 = 14 ;
    put @col1 var245 @col2 var246 ;

Slide C.12: Copying Raw Data

    filename in 'c:\old.data' ;
    filename out 'c:\new.data' ;
    data _null_ ;
    infile in ;
    file out ;
    input ;
    put _infile_ ;

Slide C.13: SAS Constants

    '21Dec1981'D
    'Charles F. Hofacker'
    492992.1223

Slide C.14: Assignment Statement

    x = a + b ;
    y = x / 2. ;
    prob = 1 - exp(-z**2/2) ;

Slide C.15: The SAS Array Statement

    array y {20} y1-y20 ;
    do i = 1 to 20 ;
        y{i} = 11 - y{i} ;
    end ;

Slide C.16: The Sum Statement

    variable + expression ;

is equivalent to

    retain variable 0 ;
    variable = sum(variable, expression) ;

Examples:

    n + 1 ;
    cumulated + x ;

Slide C.17: IF Statement

    if a >= 45 then a = 45 ;
    if 0 < age < 1 then age = 1 ;
    if a = 2 or b = 3 then c = 1 ;
    if a = 2 and b = 3 then c = 1 ;
    if major = "FIN" ;
    if major = "FIN" then do ;
        a = 1 ;
        b = 2 ;
    end ;

Slide C.18: More IF Statement Expressions

Each row shows two equivalent ways to write the expression in
    if expression then etc ;

    name ne 'smith'          name ~= 'smith'
    x eq 1 or x eq 2         x=1 | x=2
    a le b or a ge c         a <= b | a >= c
    a1 and a2 or a3          (a1 and a2) or a3

Slide C.19: Concatenating Datasets Sequentially

first:
    id  x  y
    1   2  3
    2   1  2
    3   3  1

second:
    id  x  y
    4   3  2
    5   2  1
    6   1  1

    data both ;
    set first second ;

both:
    id  x  y
    1   2  3
    2   1  2
    3   3  1
    4   3  2
    5   2  1
    6   1  1

Slide C.20: Interleaving Two Datasets

    proc sort data=store1 ; by date ;
    proc sort data=store2 ; by date ;
    data both ;
    set store1 store2 ; by date ;

Slide C.21: Concatenating Datasets Horizontally

left:
    id  y1  y2
    1   2   3
    2   1   2
    3   3   1

right:
    id  x1  x2
    1   3   2
    2   2   1
    3   1   1

    data both ;
    merge left right ;

both:
    id  y1  y2  x1  x2
    1   2   3   3   2
    2   1   2   2   1
    3   3   1   1   1

Slide C.22: Table LookUp

table:
    part  desc
    0011  hammer
    0012  nail
    0013  bow

database:
    id  part
    1   0011
    2   0011
    3   0013

    proc sort data=database out=sorted ; by part ;
    data both ;
    merge table sorted ; by part ;

both:
    id  part  desc
    1   0011  hammer
    2   0011  hammer
    3   0013  bow

The last observation is repeated if one of the input data sets has fewer observations in a BY group.

Slide C.23: Update

master:
    part  desc
    0011  hammer
    0012  nail
    0013  bow

transaction:
    part  desc
    0011  jackhammer

    data new_master ;
    update master transaction ; by part ;

new_master:
    part  desc
    0011  jackhammer
    0012  nail
    0013  bow
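
Returning to the Table LookUp slide: as written, the MERGE there would also produce an observation for part 0012 (nail) with a missing id, because that part occurs only in table. The following is a minimal sketch of one way to keep just the records that come from database. It assumes the slide's data typed in with CARDS, and the flag name in_db is a hypothetical choice; the IN= option itself is covered later in this appendix.

```sas
/* Lookup table from the slide, already in part order */
data table ;
   input part $ desc $ ;
   cards ;
0011 hammer
0012 nail
0013 bow
;

/* Records that need a description looked up */
data database ;
   input id part $ ;
   cards ;
1 0011
2 0011
3 0013
;

/* MERGE with BY requires both inputs sorted by the BY variable */
proc sort data=database out=sorted ;
   by part ;
run ;

data both ;
   merge table sorted (in=in_db) ;   /* in_db = 1 when sorted contributes */
   by part ;
   if in_db ;                        /* drop parts found only in table */
run ;

proc print data=both noobs ;
run ;
```

With these six input lines, both should hold three observations (ids 1, 2 and 3), and part 0012 no longer appears.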
Slide C.24: Changing the Level of Analysis 1

Before:
    Subject  Time  Score
    A        1     A1
    A        2     A2
    A        3     A3
    B        1     B1
    B        2     B2
    B        3     B3

After:
    Subject  Score1  Score2  Score3
    A        A1      A2      A3
    B        B1      B2      B3

Slide C.25: Changing the Level of Analysis 1

    data after ;
    keep subject score1 score2 score3 ;
    retain score1 score2 ;
    set before ;
    if time=1 then score1 = score ;
    else if time=2 then score2 = score ;
    else if time=3 then do ;
        score3 = score ;
        output ;
    end ;

Slide C.26: Changing the Level of Analysis 2

Before:
    Day  Student  Score
    1    A        12
    1    B        11
    1    C        13
    2    A        14
    2    B        10
    2    C        9

After:
    Day  Highest  Student
    1    13       C
    2    14       A

Slide C.27: Changing the Level of Analysis 2
FIRST. and LAST. Variable Modifiers

    proc sort data=log ; by day ;
    data find_highest ;
    retain highest ;
    drop score ;
    set log ; by day ;
    if first.day then highest = . ;
    if score > highest then highest = score ;
    if last.day then output ;

Slide C.28: The KEEP and DROP Statements

    keep a b f h ;
    drop x1-x99 ;

    data a(keep = a1 a2) b(keep = b1 b2) ;
    set x ;
    if blah then output a ;
    else output b ;

Slide C.29: Changing the Level of Analysis 3
Spreading Out an Observation

Before:
    Subject  Score1  Score2  Score3
    A        A1      A2      A3
    B        B1      B2      B3

After:
    Subject  Time  Score
    A        1     A1
    A        2     A2
    A        3     A3
    B        1     B1
    B        2     B2
    B        3     B3

Slide C.30: Changing the Level of Analysis 3 – SAS Code

    data spread ;
    drop score1 score2 score3 ;
    set tight ;
    time = 1 ; score = score1 ; output ;
    time = 2 ; score = score2 ; output ;
    time = 3 ; score = score3 ; output ;

Slide C.31: Use of the IN= Dataset Indicator

    data new ;
    set old1 (in=from_old1) old2 (in=from_old2) ;
    if from_old1 then … ;
    if from_old2 then … ;

Slide C.32: Proc Summary for Aggregation

    proc summary data=raw_purchases ;
    by household ;
    class brand ;
    var x1 x2 x3 x4 x5 ;
    output out=household mean=overall ;

Slide C.33: Using SAS for Simulations

    data monte_carlo ;
    keep y1 - y4 ;
    array y{4} y1 - y4 ;
    array loading{4} l1 - l4 ;
    array unique{4} u1 - u4 ;
    l1 = 1 ; l2 = .5 ; l3 = .5 ; l4 = .5 ;
    u1 = .2 ; u2 = .2 ; u3 = .2 ; u4 = .2 ;
    /* Simulation Loop */
    do subject = 1 to 100 ;
        eta = rannor(1921) ;
        do j = 1 to 4 ;
            y{j} = eta*loading{j} + unique{j}*rannor(2917) ;
        end ;
        output ;
    end ;
    proc calis data=monte_carlo ;
    etc. ;

Slide C.34: External Data Sets and Windows/Vista

    filename trans 'C:\Documents\june\transactions.data' ;
    libname clv 'C:\Documents\customer_projects\' ;
    ...
    data clv.june ;
    infile trans ;
    input id 3. purch 2. day 3. month $ ;
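
One detail of the FIRST. and LAST. slide is worth a worked example. That step outputs the last record of each day, so the student value it keeps belongs to whoever is listed last, not to the top scorer. The sketch below is one possible way to carry the top scorer along, assuming the six records from the Before table are typed in with CARDS. The variable name top_student and its length of 8 are hypothetical choices, not part of the original slide.

```sas
/* Before data from the slide: one record per student per day */
data log ;
   input day student $ score ;
   cards ;
1 A 12
1 B 11
1 C 13
2 A 14
2 B 10
2 C 9
;

proc sort data=log ;
   by day ;
run ;

data find_highest ;
   length top_student $ 8 ;          /* hypothetical name and length */
   retain highest top_student ;      /* carry values across records */
   keep day highest top_student ;
   set log ;
   by day ;
   if first.day then do ;            /* reset at the start of each day */
      highest = . ;
      top_student = ' ' ;
   end ;
   if score > highest then do ;      /* a missing value sorts below any number */
      highest = score ;
      top_student = student ;
   end ;
   if last.day then output ;         /* one record per day */
run ;

proc print data=find_highest noobs ;
run ;
```

For the data above this should give 13 and C for day 1 and 14 and A for day 2, matching the After table. Ties go to the first student who reaches the maximum, because the comparison is a strict greater-than.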