STAT 6250 Chapters 4 to 6 Problems

Chapter 4

1.
/* Create a permanent SAS data set */
libname learn 'c:\Users\Yuen\Documents\6250\Homework\HW1';

data learn.perm;
   input ID : $3. Gender : $1. DOB : mmddyy10. Height Weight;
   label DOB    = 'Date of Birth'
         Height = 'Height in inches'
         Weight = 'Weight in pounds';
   format DOB date9.;
datalines;
001 M 10/21/1946 68 150
002 F 5/26/1950 63 122
003 M 5/11/1981 72 175
004 M 7/4/1983 70 128
005 F 12/25/2005 30 40
;

/* Read the data set in a library */
proc contents data=learn.perm varnum;
run;

title "Listing of Data Set Perm";
proc print data=learn.perm;
   format DOB date9.;
run;

The CONTENTS Procedure

Data Set Name        LEARN.PERM                             Observations           5
Member Type          DATA                                   Variables              5
Engine               V9                                     Indexes                0
Created              Thursday, April 09, 2009 01:20:12 AM   Observation Length     32
Last Modified        Thursday, April 09, 2009 01:20:12 AM   Deleted Observations   0
Protection                                                  Compressed             NO
Data Set Type                                               Sorted                 NO
Label
Data Representation  WINDOWS_32
Encoding             wlatin1  Western (Windows)

Engine/Host Dependent Information

Data Set Page Size          4096
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            126
Obs in First Data Page      5
Number of Data Set Repairs  0
File Name                   c:\Users\Yuen\Documents\6250\Homework\HW1\perm.sas7bdat
Release Created             9.0101M3
Host Created                WIN_PRO

Variables in Creation Order

#    Variable    Type    Len    Format    Label
1    ID          Char      3
2    Gender      Char      1
3    DOB         Num       8    DATE9.    Date of Birth
4    Height      Num       8              Height in inches
5    Weight      Num       8              Weight in pounds

2. In the data set Perm, the column headings for DOB, Height, and Weight are the labels assigned by the LABEL statement. In the PROC PRINT output, however, the column headings are the variable names.
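To make the difference described in answer 2 concrete, here is a minimal sketch of the same listing with the LABEL option added to PROC PRINT, which asks the procedure to use variable labels (where they exist) as column headings. It assumes the LEARN libref from problem 1 is still assigned; the title text is only illustrative.

```sas
/* Sketch only: assumes the LEARN libref from problem 1 is still assigned. */
/* The title text is an illustrative choice, not part of the assignment.   */
title "Listing of Data Set Perm (labels as column headings)";
proc print data=learn.perm label;
   /* LABEL option: print variable labels, where defined, instead of names */
run;
```

Variables without a label (ID and Gender here) would still be shown under their variable names.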
Obs    ID     Gender          DOB    Height    Weight
  1    001      M       21OCT1946        68       150
  2    002      F       26MAY1950        63       122
  3    003      M       11MAY1981        72       175
  4    004      M       04JUL1983        70       128
  5    005      F       25DEC2005        30        40

Chapter 5

*5-1;
proc format;
   value agegrp   0 - 30   = '0 to 30'
                 31 - 50   = '31 to 50'
                 51 - 70   = '51 to 70'
                 71 - high = '71 and older';
   value $party  'D' = 'Democrat'
                 'R' = 'Republican';
   value $likert '1' = 'Strongly Disagree'
                 '2' = 'Disagree'
                 '3' = 'No Opinion'
                 '4' = 'Agree'
                 '5' = 'Strongly Agree';
run;

data voter;
   input Age Party : $1. (Ques1-Ques4)($1. + 1);
   label Ques1 = 'The president is doing a good job'
         Ques2 = 'Congress is doing a good job'
         Ques3 = 'Taxes are too high'
         Ques4 = 'Government should cut spending';
   format Age agegrp. Party $party. Ques1-Ques4 $likert.;
datalines;
23 D 1 1 2 2
45 R 5 5 4 1
67 D 2 4 3 3
39 R 4 4 4 4
19 D 2 1 2 1
75 D 3 3 2 3
57 R 4 3 4 4
;

title "Listing of Voter";
proc print data=voter;
   ***Add the option LABEL if you want to use the labels as column headings;
run;

title "Frequencies on the Four Questions";
proc freq data=voter;
   tables Ques1-Ques4;
run;

*5-2;
proc format;
   value $grouped '1','2' = 'Generally Disagree'
                  '3'     = 'No Opinion'
                  '4','5' = 'Generally Agree';
run;

title "Grouped Frequencies";
proc freq data=voter;
   tables Ques1-Ques4 / nocum;
   format Ques1-Ques4 $grouped.;
run;

*5-3;
data colors;
   input Color : $1. @@;
datalines;
R R B G Y Y . .
B G R B G Y P O O V V B
;

proc format;
   value $color 'R','B','G' = 'Group 1'
                'Y','O'     = 'Group 2'
                ' '         = 'Not Given'
                Other       = 'Group 3';
run;

title "Color Frequencies (Grouped)";
proc freq data=colors;
   tables color / nocum missing;
   *The MISSING option places the frequency of missing values in the
    body of the table and causes the percentages to be computed on the
    number of observations, missing or non-missing;
   format color $color.;
run;

*5-4;
*Modify this libname statement;
libname learn 'c:\books\learning';
options fmtsearch=(learn);

proc format library=learn fmtlib;
   value agegrp   0 - 30   = '0 to 30'
                 31 - 50   = '31 to 50'
                 51 - 70   = '51 to 70'
                 71 - high = '71 and older';
   value $party  'D' = 'Democrat'
                 'R' = 'Republican';
   value $likert '1' = 'Strongly Disagree'
                 '2' = 'Disagree'
                 '3' = 'No Opinion'
                 '4' = 'Agree'
                 '5' = 'Strongly Agree';
run;

data learn.voter;
   input Age Party : $1. (Ques1-Ques4)($1. + 1);
   label Ques1 = 'The president is doing a good job'
         Ques2 = 'Congress is doing a good job'
         Ques3 = 'Taxes are too high'
         Ques4 = 'Government should cut spending';
   format Age agegrp. Party $party. Ques1-Ques4 $likert.;
datalines;
23 D 1 1 2 2
45 R 5 5 4 1
67 D 2 4 3 3
39 R 4 4 4 4
19 D 2 1 2 1
75 D 3 3 2 3
57 R 4 3 4 4
;

Chapter 6

*6-2;
data soccer;
   input Team : $20. Wins Losses;
datalines;
Readington 20 3
Raritan 10 10
Branchburg 3 18
Somerville 5 18
;

options nodate nonumber;
ods listing close;
ods csv file='c:\books\learning\soccer.csv';
proc print data=soccer noobs;
run;
ods csv close;
ods listing;

*6-3;
*Modify this libname statement;
libname readit 'c:\books\learning\soccer.xls';

title "Using the Excel Engine to read data";
proc print data=readit.'soccer$'n noobs;
run;