SPSS for Windows: Reading SAS Data Sets
Ed Greenberg
College of Nursing
Arizona State University
Revised November 5, 2001
Permission to freely use or adapt any portion of this document is granted, provided that the author is cited.
D:\533570779.doc
Preface
This document contains instructions for converting SAS data sets for use in SPSS. The procedures described
herein apply to SPSS for Windows Version 10 and SAS for Windows Version 8. For different versions or
operating system editions of either package, the procedures may differ.
For “quick and dirty” instructions, see Appendix B at the end of this document.
SPSS Data Files
An SPSS data file contains dictionary information and data. The dictionary information includes variable names,
variable labels, value labels, missing value specifications, and other attributes of the variables in the data file.
The data portion of the SPSS data file contains data values, logically organized into rows and columns containing
cases and variables, respectively. In the Microsoft Windows environment, an SPSS data file is named name.SAV,
where name is any name that conforms to Windows file naming rules and “SAV” is the file extension.
SAS Data Sets
A SAS data set consists of descriptor information and data values. The descriptor information describes the
contents of the data set to SAS. The data values are organized into rows and columns containing observations
(cases) and variables, respectively.
The Structure of SAS Data Libraries
SAS data sets are stored in SAS data libraries and are referred to as members of a library. In the Microsoft
Windows environment, a SAS data library consists of a Windows directory that contains one or more SAS data
sets, each in a separate file. The SAS System identifies SAS files by using unique file extensions. For example, a
file containing a permanent SAS Version 7 or 8 data set has a file name of the form name.SAS7BDAT where
name is used in a SAS statement such as a DATA or PROC statement to refer to the SAS data set, and
“SAS7BDAT” is the file extension. Optionally, a SAS data set can have a shorter, three-character, extension of
SD7.
A “permanent” SAS data set is one that is retained after you end your SAS session. In contrast, a “temporary”
SAS data set exists only for the duration of a SAS session. Temporary SAS data sets are stored in a subdirectory
created by SAS in C:\WINDOWS\TEMP or another temporary Windows directory.
SAS data libraries can contain materials other than SAS data sets. For example, they can contain SAS catalogs. A
SAS catalog is a special type of SAS file that can contain multiple entries. You can keep different types of entries
in the same SAS catalog. For example, catalogs can contain SAS/GRAPH graphs, SAS/IML matrices, SAS
formats, etc.
A SAS format is an instruction that SAS uses to write data values, and is largely equivalent to value labels in
SPSS. In the Microsoft Windows environment, a SAS catalog has a file name of the form name.SAS7BCAT
where name is used in a SAS command to refer to the catalog and SAS7BCAT is the file extension. Optionally, a
SAS catalog can have a three-character extension of SC7.
When converting SAS data sets to SPSS, any SAS formats that are associated with the SAS data set are not
converted. Thus, the variables in the resulting SPSS data file will not have any value labels.
Note that earlier versions of SAS used different extensions for SAS files. For example, a SAS Version 6 data set
has an extension of SD2 and a SAS Version 6 catalog has an extension of SC2.
In the figure below, a SAS data library is contained in the directory C:\PROJECTS, and it contains two SAS data
sets, ONE.SAS7BDAT and TWO.SAS7BDAT, and a SAS catalog, MYFMTS.SAS7BCAT.
C:\
    PROJECTS\
        ONE.SAS7BDAT
        TWO.SAS7BDAT
        MYFMTS.SAS7BCAT
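The naming conventions described in this section can be summarized in a short sketch. The Python below is only an illustration and is not part of SAS or SPSS: the table SAS_EXTENSIONS and the function classify_sas_file are hypothetical names, and the mapping covers just the extensions mentioned in this document.

```python
from pathlib import PureWindowsPath

# Extension -> (kind of SAS file, SAS version), limited to the extensions
# named in this document. SAS_EXTENSIONS is a hypothetical name.
SAS_EXTENSIONS = {
    "SAS7BDAT": ("data set", "7 or 8"),
    "SD7": ("data set", "7 or 8"),
    "SAS7BCAT": ("catalog", "7 or 8"),
    "SC7": ("catalog", "7 or 8"),
    "SD2": ("data set", "6"),
    "SC2": ("catalog", "6"),
}


def classify_sas_file(filename):
    """Return (member name, kind, version), or None if the extension is not listed."""
    path = PureWindowsPath(filename)
    extension = path.suffix.lstrip(".").upper()
    if extension not in SAS_EXTENSIONS:
        return None
    kind, version = SAS_EXTENSIONS[extension]
    return path.stem.upper(), kind, version


# The three files of the example library used in this document.
for filename in ("ONE.SAS7BDAT", "TWO.SAS7BDAT", "MYFMTS.SAS7BCAT"):
    print(filename, classify_sas_file(filename))
```

The member name returned by the sketch (ONE, TWO, MYFMTS) is the name by which the file is referred to in SAS statements.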
Using SAS Data Sets in SAS Programs
To use a permanent SAS data set in a SAS program, you must reference its data library via a LIBNAME statement. The
basic format of the LIBNAME statement is as follows:
LIBNAME libref 'SAS-data-library';
You can choose any name for libref (library reference), as long as it conforms to the rules for SAS names. The
parameter SAS-data-library, enclosed in single quotes, specifies the path to the data library. The libref is used in
subsequent SAS commands to refer to the SAS data library. These references will be of the form libref.name.
In the sample LIBNAME statement below, MYDATA is the libref, which is associated with the SAS data
library C:\PROJECTS. The PROC PRINT statement refers to one of the SAS data sets in this library,
MYDATA.ONE, so the contents of the data set C:\PROJECTS\ONE.SAS7BDAT will be printed.
LIBNAME MYDATA 'C:\PROJECTS';
PROC PRINT DATA=MYDATA.ONE;
RUN;
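How a two-level name such as MYDATA.ONE maps to a file on disk can be sketched as follows. This Python is an illustrative assumption, not a description of how SAS is implemented: resolve_member and LIBREF_RULE are hypothetical names, the default extension assumes a Version 7 or 8 data set, and the check encodes the usual SAS rule that a libref has one to eight characters, begins with a letter or underscore, and otherwise contains only letters, digits, and underscores.

```python
import re
from pathlib import PureWindowsPath

# Usual SAS rule for librefs: 1 to 8 characters, first one a letter or
# underscore, the rest letters, digits, or underscores.
LIBREF_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")


def resolve_member(reference, librefs, extension="SAS7BDAT"):
    """Map a 'libref.name' reference to the path of the file that holds it.

    librefs maps upper-case librefs to library directories, playing the
    role of the LIBNAME statements that have been run so far.
    """
    libref, _, name = reference.partition(".")
    if not name:
        raise ValueError("expected a two-level name of the form libref.name")
    if not LIBREF_RULE.fullmatch(libref):
        raise ValueError("not a valid libref: " + libref)
    directory = librefs[libref.upper()]  # KeyError if the libref is unassigned
    return str(PureWindowsPath(directory) / (name.upper() + "." + extension))


# Equivalent of: LIBNAME MYDATA 'C:\PROJECTS';
assigned = {"MYDATA": r"C:\PROJECTS"}
print(resolve_member("MYDATA.ONE", assigned))
```

An unassigned libref raises KeyError in this sketch, which loosely mirrors the error SAS reports when a program refers to a libref that no LIBNAME statement has defined.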
LIBNAMEs can also be assigned interactively in the SAS Display Manager (the shell which provides an
interactive interface to SAS). See Appendix A of this document for instructions on how to do this.
Conversion Method 1: Reading a SAS XPORT Format Data Set with SPSS
Versions of SPSS through Version 10 cannot read native-format SAS Version 8 data sets. However, SPSS can
read SAS data sets that have been stored in several other formats. One of these is XPORT format, a SAS portable
format that enables SAS data sets to be transferred from one type of system to another, e.g., Windows to Unix.
The translation of a SAS data set from one format to another is accomplished by invoking the appropriate access
method, or “engine” on the LIBNAME statement.
When invoking a SAS library engine, the format of the LIBNAME statement is as follows:
LIBNAME libref engine 'SAS-data-library';
As a first step, the SAS data set must be saved in XPORT format. This can be done using PROC COPY.
LIBNAME ABC 'C:\PROJECTS';
LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';
PROC COPY IN=ABC OUT=DEF;
SELECT ONE;
RUN;
The librefs ABC and DEF on the two LIBNAME statements are arbitrary, although they must conform to the
rules for SAS names. The XPORT parameter on the second LIBNAME statement invokes the SAS XPORT
engine. The extension used for the output data set name, TPT, is the one SPSS expects for SAS files in transport
format. PROC COPY copies the contents of the SAS data library referenced via the libref ABC to the
XPORT format data set referenced by the libref DEF. The SELECT statement specifies that only the data set named
ONE is to be copied. This statement is necessary only if the original SAS data library contains more than one
SAS data set.
The same result could have been accomplished in a SAS Data step, as follows:
LIBNAME ABC 'C:\PROJECTS';
LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';
DATA DEF.ONE;
SET ABC.ONE;
RUN;
The SET statement reads the records from the SAS data set ONE.SAS7BDAT and the DATA statement causes
them to be written to the SAS XPORT file ONE.TPT.
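If several data sets need to be converted, the PROC COPY program shown earlier in this section can be generated rather than typed by hand. The Python sketch below only assembles the text of those statements. The function name xport_program is hypothetical, the librefs ABC and DEF are kept from the example, and writing one transport file per data set, named after the data set, is an assumption rather than a requirement.

```python
from pathlib import PureWindowsPath


def xport_program(library_dir, dataset):
    """Return the text of a SAS program that copies one data set to XPORT format.

    The statements are the ones used in this document. Paths that contain
    a single quote are not handled by this sketch.
    """
    name = dataset.upper()
    transport_file = PureWindowsPath(library_dir) / (name + ".TPT")
    return "\n".join([
        f"LIBNAME ABC '{library_dir}';",
        f"LIBNAME DEF XPORT '{transport_file}';",
        "PROC COPY IN=ABC OUT=DEF;",
        f"SELECT {name};",
        "RUN;",
    ])


# One program per data set in the example library.
for dataset in ("ONE", "TWO"):
    print(xport_program(r"C:\PROJECTS", dataset))
    print()
```

Each generated program can be pasted into the SAS Program Editor and submitted as usual.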
If your data are included in-stream within your SAS program, a permanent SAS data library need not be
referenced. In this case, your program may appear as follows:
LIBNAME DEF XPORT 'C:\PROJECTS\ONE.TPT';
DATA DEF.ONE;
INPUT ID A B C;
CARDS;
1 1 2 3
2 4 5 6
3 7 8 9
4 10 11 12
RUN;
Only one LIBNAME statement is needed, the one that references the SAS XPORT data file to be written via the
SAS Data step. Note the use of the libref “DEF” on the LIBNAME and DATA statements.
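The statement INPUT ID A B C; uses list input, which expects the values on each data line to be separated by at least one blank. The Python sketch below is a deliberately simplified model of that behavior, with the hypothetical function name read_list_input. Real SAS list input does more; for example, by default it continues onto the next data line when a line runs short, whereas this sketch simply reports an error.

```python
def read_list_input(variables, data_lines):
    """Split blank-separated data lines into one observation (dict) per line.

    Simplified model of SAS list input for numeric variables only.
    """
    observations = []
    for line in data_lines:
        values = line.split()  # any run of blanks separates two values
        if len(values) != len(variables):
            raise ValueError(
                f"expected {len(variables)} values but found {len(values)} in {line!r}"
            )
        numbers = [float(value) for value in values]  # SAS numerics are floating point
        observations.append(dict(zip(variables, numbers)))
    return observations


data = ["1 1 2 3", "2 4 5 6", "3 7 8 9", "4 10 11 12"]
for observation in read_list_input(["ID", "A", "B", "C"], data):
    print(observation)
```

A data line typed without blanks, such as 1123, holds a single value rather than four, so the sketch rejects it.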
The next step is to open SPSS and choose Open and then Data... from the File menu.
In the Open File window that pops up, locate the directory containing the SAS portable file, in this case
C:\PROJECTS. In the drop-down list for “Files of Type:” select “SAS portable (*.tpt)”. The SAS portable file
ONE.TPT should be displayed in the window.
Click on the file name to copy it into the “File name:” box. Then click on the “Open” button. SPSS will read
the SAS portable file ONE.TPT and place its contents into the Data Editor window.
You can also accomplish the same result via the SPSS command language. Type the following command in an
SPSS syntax window and run it:
GET SAS DATA='C:\PROJECTS\ONE.TPT' DSET(ONE).
Note that this method will also work on other systems, such as Unix. However, the file specifications (path and
filename) must conform to that system's rules.
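When many transport files are involved, the GET SAS command above can be assembled programmatically. The Python sketch below is a hedged illustration: get_sas_command is a hypothetical name, and it assumes that the data set inside the transport file has the same name as the file itself, as is the case for ONE.TPT in this document.

```python
from pathlib import PureWindowsPath


def get_sas_command(transport_file):
    """Return an SPSS GET SAS command for a transport file.

    Assumes the data set stored in the file is named after the file.
    """
    path = PureWindowsPath(transport_file)
    dataset = path.stem.upper()
    return f"GET SAS DATA='{path}' DSET({dataset})."


print(get_sas_command(r"C:\PROJECTS\ONE.TPT"))
```

If the data set name differs from the file name, the DSET value has to be supplied separately instead of being derived from the path.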
Conversion Method 2: Reading a SAS Version 6 Data Set with SPSS
SPSS for Windows Version 8 can read SAS data sets in Version 6 format. If your SAS data sets are stored in
Version 7 or Version 8 format, they'll first need to be converted to this older format. This can be accomplished
via a short SAS program as in the following example:
LIBNAME SASV8 'C:\PROJECTS';
LIBNAME SASV6 V6 'C:\PROJECTS';
DATA SASV6.ONE;
SET SASV8.ONE;
RUN;
This example assumes that a SAS Version 8 data set ONE.SAS7BDAT is a member of the SAS data library in
C:\PROJECTS. It is referenced via the libref SASV8 on the first LIBNAME statement and the SET statement.
The SAS Version 6 data set ONE.SD2 will be written to the same SAS data library, C:\PROJECTS. This data set
is referenced on the second LIBNAME statement and the DATA statement. Note the inclusion of the V6 engine
specification on this LIBNAME statement, specifying that SAS is to write this data set using the V6 library
engine. The Data step is executed once for each observation in the input data set. The SET statement causes a
record (observation) to be read from the Version 8 data set ONE.SAS7BDAT and the DATA statement causes
the record to be written to the data set ONE.SD2 in Version 6 format.
If your data are included in-stream within your SAS program, a permanent SAS data library need not be
referenced. In this case, your program may appear as follows:
LIBNAME SASV6 V6 'C:\PROJECTS';
DATA SASV6.ONE;
INPUT ID A B C;
CARDS;
1 1 2 3
2 4 5 6
3 7 8 9
4 10 11 12
RUN;
Only one LIBNAME statement is needed, the one that references the SAS Version 6 data set to be written via the
SAS Data step. Note the use of the libref “SASV6” on the LIBNAME and DATA statements and the use of the
V6 engine specification on the LIBNAME statement.
The next step is to open SPSS and choose Open and then Data... from the File menu.
In the Open File window that pops up, locate the directory containing the SAS Version 6 data set, in this case
C:\PROJECTS. In the drop-down list for “Files of Type:” select “SAS for Windows (*.sd2)”. The SAS Version 6
data set ONE.SD2 should be displayed in the window.
Click on the file name to copy it into the “File name:” box. Then click on the “Open” button. SPSS will read
the SAS Version 6 data set ONE.SD2 and place its contents into the Data Editor window.
A concise example of this method is shown in Appendix B of this document.
Good news, we hope!
On their web site, www.spss.com, SPSS, Inc. claims that the forthcoming Version 11 of SPSS for Windows will
be able to “read current versions of SAS data files and SAS portable files.” I'm hoping that this means that SAS
data sets in Version 7 and Version 8 formats can be read without prior conversion.
APPENDIX A
Assigning Libraries and LIBNAMES Interactively in SAS
Start the SAS System. The main SAS window should be displayed.
In the Explorer pane, select the Libraries icon and click on the New tool on the toolbar.
Type the LIBNAME (the libref; in this example, MYDATA) into the Name: box.
Using the drop-down list, select the desired library engine (in this example, V6).
Using the browse button, locate the directory where the SAS library is located, in this case, C:\PROJECTS.
Then, click on the OK button.
When you double-click on the Libraries icon in the Explorer window, you’ll see the libraries that are defined in
the current SAS session. The new library, MYDATA, is among them.
An advantage of this method is that you can see at a glance which libraries are assigned to your SAS session.
You can also browse their contents interactively.
APPENDIX B
FAST TRACK: Exporting Data from SAS Version 8 for Windows to SPSS for Windows Version 10

Run the following in SAS:

PROGRAM                             COMMENTS
LIBNAME SASV6 V6 'C:\PROJECTS';     SASV6 is the libref, also used on the DATA statement.
                                    V6 specifies the Version 6 library engine. A Version 6
                                    SAS data set will be written into the directory
                                    C:\PROJECTS.
DATA SASV6.ONE;                     SASV6 is the libref. ONE is the SAS data set name,
                                    corresponding to the name of the file that will contain
                                    it. This file, ONE.SD2, will be written into the
                                    directory specified on the LIBNAME statement above.
INPUT ID A B C;                     The INPUT statement causes four variables, ID, A, B
                                    and C, to be read from each data line.
CARDS;                              CARDS; signals the start of in-stream data lines.
1 1 2 3                             Four data lines are read.
2 4 5 6
3 7 8 9
4 10 11 12
RUN;                                The RUN statement executes the above SAS
                                    statements.

Do the following in SPSS:
1. Start SPSS for Windows.
2. From the File menu, select Open, and Data...
3. Locate the directory containing the SAS data set, in this example C:\PROJECTS.
4. In the Files of Type: drop-down list, select SAS for Windows (*.SD2).
5. Click on the file name, ONE.SD2, and then click on Open.