UNC-Wilmington, Department of Economics and Finance
ECN 377, Dr. Chris Dumas
SAS - Proc Import and Proc Export

Importing and Exporting Data Files in SAS (Proc Import and Proc Export)

Several commonly used data file formats can be imported into and exported from SAS, including Microsoft Excel spreadsheet files (.xls files), comma-delimited files (.csv files), space- or tab-delimited text files (.prn or .txt), Microsoft Access database files (.mdb files), and SAS data files (.sas7bdat files). (SAS can also import and export STATA and SPSS format files.) In this handout, we'll consider Microsoft Excel data files.

"Import" means: bring data that are located in a file outside of SAS into SAS's memory.

"Export" means: send data that are in SAS's memory to a file located outside of SAS.

The SAS procedures Proc Import and Proc Export can be used to import and export Microsoft Excel spreadsheet data files. In addition, Proc Contents can be used to obtain summary information about the variables in a dataset that has just been imported into SAS. We'll review Proc Contents in the next handout.

Proc Import--Importing Data from Excel into SAS

First, from inside Excel, convert the Excel data file to the "Excel 97-2003 Workbook" format. You can do this by opening the data file in Excel and then using "Save As" to save the file in the "Excel 97-2003 Workbook" format. A file in this format has a ".xls" filename extension (not ".xlsx" or ".xlsm" or ".xlsb" or any other filename extension).

IMPORTANT: Before you attempt to import an Excel file into SAS, close the file in Excel. SAS will not import an Excel file that is open in Excel; it will give you an error message instead.
For example, suppose you want to import a (hypothetical) Excel data file called "mydata.xls" into SAS, and suppose the file is located on the V: drive in a folder named ECN422. You could use the following Proc Import command in SAS:

    proc import datafile="v:\ECN422\mydata.xls" dbms=xls out=dataset01 replace;
    run;

In the "proc import" command above, the datafile="v:\ECN422\mydata.xls" tells SAS the location of the data file on your computer.

The "dbms" means "database management system" and tells SAS what kind of file you are trying to import; in this case, an Excel file, which is designated an "xls" file in SAS, so we put "dbms=xls".

The "out=dataset01" tells SAS what name to give to the new data set when it is stored in SAS's memory. We are calling the new data set "dataset01", although we could call it whatever we like. The original data file mydata.xls remains unchanged on the V: drive. A copy of the data in mydata.xls has been created and stored in SAS's memory as the dataset named dataset01.

The "replace" tells SAS to replace any data set named "dataset01" that it might already have in its memory with the new "dataset01" that is being created as we import the data from mydata.xls. Using "replace" is not necessary, but it's a good thing to do to ensure that we are creating a new, clean data set.

[Figure: Proc Import copies dataset "mydata.xls" from the V: drive of your computer into SAS's memory as dataset "dataset01".]

By default, Proc Import will look for variable names in the first row of the data set and use them if they are present. If variable names are not present in the first row of the data set, Proc Import will assign the names VAR1, VAR2, VAR3, etc., to the variables as it reads them.

Proc Import scans the first 20 rows of data and assigns a variable type, either numeric or character, to each variable based on the data values in those first 20 rows.
Check to make sure that the first 20 values of each variable are not zero (or unrepresentative in some other way) before using Proc Import. If a data value is missing, Proc Import will assign a period "." value as the missing value.

Proc Export--Exporting Data from SAS to Excel

Now suppose that after working for a while with data in a data set named "dataset02" inside SAS's memory, we are ready to save the data to an Excel file. We can name the Excel file whatever we like, so let's call it "important_data.xls". Also, suppose we want to put this file in folder ECN422 on the V: drive. We could use the following Proc Export command to accomplish this:

    proc export data=dataset02 outfile="v:\ECN422\important_data.xls" dbms=EXCEL5 replace;
    run;

The "data=dataset02" tells SAS which data set in its memory you would like to export (sometimes SAS may be holding more than one data set in its memory).

The "outfile=" tells SAS which drive, folder and filename you want to use when you export the data.

The "dbms" and "replace" command words do the same things they did in the Proc Import command; however, we need to use dbms=EXCEL5 in Proc Export (whereas we used dbms=xls in Proc Import).

In this course, we will work with data files in Excel format; however, SAS commands for importing and exporting other types of data files (non-Excel files) are provided below.
Comma-Delimited ".csv" Data Files

Importing Data

For comma-delimited files, use dbms=csv :

    proc import datafile="v:\ECN422\mydata.csv" dbms=csv out=dataset01 replace;
    run;

Exporting Data

    proc export data=dataset02 outfile="v:\ECN422\important_data.csv" dbms=csv replace;
    run;

Space or Tab-Delimited ".txt" ".prn" ".dat" Data Files

Importing Data

For space-delimited data files, use dbms=dlm :

    proc import datafile="v:\ECN422\mydata.txt" dbms=dlm out=dataset01 replace;
    run;

For tab-delimited data files, use dbms=tab :

    proc import datafile="v:\ECN422\mydata.txt" dbms=tab out=dataset01 replace;
    run;

Exporting Data

For space-delimited data files, use dbms=dlm :

    proc export data=dataset02 outfile="v:\ECN422\important_data.txt" dbms=dlm replace;
    run;

or, for tab-delimited data files, use dbms=tab :

    proc export data=dataset02 outfile="v:\ECN422\important_data.txt" dbms=tab replace;
    run;

Microsoft Access Database Files

Importing and exporting Microsoft Access data files require a little modification of the basic Proc Import and Proc Export commands.

Importing Data

    proc import datatable='table name' dbms=access out=dataset01 replace;
    database="v:\ECN422\mydata.mdb";
    run;

Notice that when importing Access data files, we need to specify the datatable within the Access data file. Also, instead of using a "datafile=..." option inside the Proc Import command line, we need to use a separate "database=..." statement after the Proc Import command line.

The "dbms=access" option works for Access 2000 through Access 2003. If the data files are in the older Access 97 format, then use "dbms=access97" instead of "dbms=access".

Exporting Data

    proc export data=dataset02 outtable='table name' dbms=access replace;
    database="v:\ECN422\important_data.mdb";
    run;

As with importing, the table within the Access file is named with "outtable=..." (rather than "outfile=..."), and the Access file itself is given in a separate "database=..." statement. If the data files are in Access 97 format, then use "dbms=access97" instead of "dbms=access".

SAS Database Files

Importing SAS database files does not use the Proc Import command.
Instead, the Data command and a Set command are used to import data from a SAS database file. The following commands read a SAS database file named CountyRev2 located in the ECN422 folder on the V: drive into SAS's memory and name it dataset01. In the ECN422 folder on the V: drive, the SAS database file actually has the name "CountyRev2.sas7bdat", but you do not need to type the .sas7bdat filename extension, as SAS automatically adds it as it reads the file. (Aside: The ".sas7bdat" refers to SAS version 7; SAS 9.2 uses the same data file format as SAS 7.0, so they kept the filename extension the same.)

Importing Data

    data dataset01;
    set 'v:\ecn422\CountyRev2';
    run;

Exporting Data

The Proc Export command is not used to export SAS database files. Instead, the Data command and a Set command are used to export data to a SAS database file. The following commands create a SAS database file named CountyRev3 in the ECN422 folder on the V: drive and Set the data from dataset01 into the CountyRev3 file. In the ECN422 folder on the V: drive, the SAS database file will have the name "CountyRev3.sas7bdat". SAS automatically adds the ".sas7bdat" filename extension as it creates the file.

    data 'v:\ECN422\CountyRev3';
    set dataset01;
    run;
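
The Data and Set commands above name the SAS database file by its full quoted path. As an alternative that this handout does not otherwise cover, a Libname statement can give the folder a short nickname once, so that later commands refer to files in the folder by nickname.filename. The sketch below assumes the same ECN422 folder on the V: drive; the nickname "mylib" is hypothetical, and any valid name of up to 8 characters could be used instead.

```sas
/* Assumption: "mylib" is a hypothetical nickname (libref) for the folder. */
libname mylib 'v:\ECN422';

/* Read CountyRev2.sas7bdat from the folder into SAS's memory as dataset01. */
data dataset01;
    set mylib.CountyRev2;
run;

/* Write dataset01 back to the folder as CountyRev3.sas7bdat. */
data mylib.CountyRev3;
    set dataset01;
run;
```

This form may be convenient when several files in the same folder are read or written in one SAS session, because the folder path is typed only once.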