[Diagram: a library name (libref) points to the path of a physical location on the hard drive.]
Every SAS file is stored in a SAS library.
A SAS data set is one type of SAS file.
In some operating environments, a library is a physical collection of files.
In others, such as Windows and Unix, a library is a logical name for a group of files stored in one physical location (a folder or directory).
A library can be temporary or permanent.
A SAS library must be defined before a SAS program can reach the directory to read or write a SAS data set.
A SAS program only needs to know the library reference name (libref).
A SAS data set name has two levels:
libref.filename
libref is the SAS library name connected to a physical directory on your computer. filename is a file stored in the directory that the libref refers to.
(A) Temporary SAS library for hosting temporary SAS data sets:
The libref is always WORK, which is already available in the Libraries folder in the Explorer panel of the SAS working environment.
Example: WORK.admit is a temporary SAS data set.
WORK is the libref and admit is the data set name.
NOTE: All SAS data sets stored in the WORK library are deleted when the SAS session ends.
NOTE: You can omit WORK and refer to the data set simply as admit if it is stored in the temporary WORK library.
For example, in the DATA step:
DATA admit2; is the same as DATA work.admit2;
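A minimal sketch of one-level versus two-level naming. The data set contents (variables id and age and their values) are invented purely for illustration:

```sas
/* Hypothetical data: id and age are made up for this sketch */
data admit2;                 /* one-level name, stored in WORK */
   input id $ age;
   datalines;
A01 27
A02 34
;
run;

/* The two-level name refers to the same temporary data set */
proc print data=work.admit2;
run;
```

Both names point to one data set, so the PROC PRINT step lists the two observations created by the DATA step.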
(B) Permanent SAS library:
Data sets in a permanent SAS library do not disappear when the SAS session ends, because the files are physically stored on the hard drive in the folder you specify. The libref is defined by the user.
For example: Mylib.admit refers to the SAS data set admit stored in the library named Mylib.
Mylib is a user-defined SAS library. admit is a file stored in the corresponding physical location on the hard drive.
If you want to use the WORK library to store your file, there is no need to define it. SAS creates it when you log in.
If you want to create your own library, there are two ways:
(1) Use the pull-down menu, as described in the SAS Window Environment document.
(2) Use a LIBNAME statement to define a SAS library:
LIBNAME libref 'path to the physical folder on the hard drive';
NOTE: libref is a logical name for the entire folder. The folder can hold many data sets. Each data set in the folder is referred to as libref.datasetname
Example: you store the data sets admit, budget, and tuition in the folder UNIVERSITY on the C drive. You define a SAS library COST linked to these files with:
LIBNAME cost 'C:\university';
The data sets are then named in your SAS program as:
cost.admit, cost.budget, cost.tuition
NOTE: Libref names are not case sensitive. They
• are limited to 8 characters
• must start with a letter or underscore
• can contain only letters, numbers, or underscores.
Example: s575, _s575, s575_ are valid librefs.
S-575 and sta575_online are not valid.
The LIBNAME statement is a global statement. A global statement remains in effect until you modify it, cancel it, or end your SAS session.
Although we call the library permanent, only the data sets in the SAS library (in physical storage) are permanent, not the libref. You still need to assign a libref to each permanent library in every SAS session before you can access its data sets.
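As a sketch of the libref life cycle, the statements below assign a libref, use it in a two-level name, list its attributes in the log, and then cancel it. The libref demo and the data set demo.scores are hypothetical. The path is borrowed from the WORK library only so that the code runs on any machine; in practice you would give the path of your own folder.

```sas
/* Assumption: reuse the WORK folder as the physical location
   so this sketch runs without creating a new folder */
libname demo "%sysfunc(pathname(work))";

/* Hypothetical data set written with a two-level name */
data demo.scores;
   input name $ score;
   datalines;
Ann 90
Bob 85
;
run;

libname demo list;    /* write the attributes of the libref to the log */
libname demo clear;   /* cancel the libref; the file is not deleted by this statement */
```

After LIBNAME demo CLEAR, a reference to demo.scores fails until the libref is assigned again, which is the situation at the start of every new SAS session.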
NOTE: If you use the pull-down menu to create your permanent library and check 'Enable at startup', the libref will be available when you log in, without a LIBNAME statement.
You can use the LIBNAME statement to reference not only SAS files but also files created by other software products, such as database management systems.
SAS uses an engine designed to connect to each specific software product.
[Diagram: files from non-SAS software pass through an engine and appear as a SAS data library.]
LIBNAME libref engine 'path to the physical location';
Some available engines are BMDP, SPSS, and OSIRIS. They allow read-only access to BMDP, SPSS, and OSIRIS files.
See the Help documentation for more details, if needed.
Where can you find the library you created, the contents of the library, and the contents of each data set?
Once the library is created, it appears in the folder called 'Libraries' in the left panel (Explorer panel) of the SAS working interface.
To see the contents of a SAS data set, click on the data set to open it in the VIEWTABLE window. Close the VIEWTABLE window afterwards.
You can also use SAS statements to view the contents of a SAS library and the detailed descriptor information of any SAS data set.
Write a SAS program to read the following SAS data set located on the class website:
Pilots.sas7bdat
This data set consists of pilots employed at an airline. The variables are:

Variable    Type   Length   Description
ID          char   4        ID number
LastName    char   10       last name
FirstName   char   9        first name
City        char   12       city
State       char   2        state
Gender      char   1        gender
JobCode     char   3        job code
Salary      num    8        current salary
Birth       num    8        birth date
Hired       num    8        date hired
HomePhone   char   12       home phone number

In this program, you will do the following tasks:
(1) Create a SAS library, mylib, that connects to the folder in which the Pilots data set is stored.
(2) Read the SAS data set Pilots.
(3) Create a new SAS data set called Pilotsnew, and store it in another SAS library called mylib1 that connects to the folder DataEx inside the Math707 folder.
(4) Print the data.
Save the SAS program as C2_readSASData on your C drive in a new folder, SASEx, inside Math707.
Libname mylib 'c:\math707\sasdata';
Libname mylib1 'c:\math707\dataex';
Data mylib1.pilotsnew;
   Set mylib.pilots;
Run;
Proc print data = mylib1.pilotsnew;
Run;
In practice, a SAS library often contains many data sets shared by different users. Therefore, it is good practice to check the contents of the library.
SAS has two procedures that display the contents of a library as well as the contents of each SAS data set:
PROC CONTENTS <options>;
RUN;
PROC DATASETS <options> ;
CONTENTS <options>;
QUIT;
/* To display all SAS data sets in the Mylib library */
proc contents data=mylib._all_ nods;
run;
/* Or use the following procedure */
proc datasets;
   contents data=mylib._all_ nods;
quit;
NOTE: _ALL_ is a SAS keyword that refers to all files in the mylib library.
NODS is a keyword meaning "no data descriptor details": only the list of files in the library is shown, without the details of each data set.
NOTE: Text inside /* */ is a comment.
/*view the data descriptor information for the SAS data set admit */
PROC CONTENTS data=mylib.admit; run;
/* One can also use the following procedure */
PROC DATASETS;
CONTENTS data=mylib.admit;
QUIT;
NOTE: The variables are listed in alphabetic order by default.
To list the variables in the order in which they were created in the SAS data set, use the option VARNUM:
PROC CONTENTS data=mylib.admit varnum;
RUN;
Or
PROC DATASETS;
CONTENTS DATA=mylib.admit VARNUM;
QUIT;
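The same options can be tried without any course files. The sketch below assumes only the SASHELP library and its sample data set sashelp.class, which are installed with SAS; they stand in for mylib and admit.

```sas
/* List the data sets in the SASHELP library without descriptor details */
proc contents data=sashelp._all_ nods;
run;

/* Describe sashelp.class with variables in creation order
   instead of the default alphabetic order */
proc contents data=sashelp.class varnum;
run;
```

Running the second step with and without VARNUM shows the difference between the two variable orderings.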
Open the SAS program C2_readSASdata, and use PROC CONTENTS as well as PROC DATASETS to
(1) View only the list of SAS data sets in the mylib library.
(2) View the detailed data descriptor for the SAS data set pilots in mylib.
(3) View the detailed data descriptor for the SAS data set pilots with the variables listed in table column order.
(4) Save the SAS program, name it C2_Contents, to your SASEx folder.
/* use proc contents to display all SAS data sets in mylib */
Libname mylib 'c:\math707\sasdata';
Proc contents data = mylib._all_ nods;
Run;
/* use proc datasets to display all SAS data sets in mylib */
proc datasets;
   contents data=mylib._all_ nods;
Quit;
/* display details of SAS data set pilots with variables in alphabetic order */
Proc contents data = mylib.pilots;
Run;
proc datasets;
   contents data=mylib.pilots;
Quit;
/* display details of SAS data set pilots with variables in table column order */
Proc contents data = mylib.pilots varnum;
Run;
proc datasets;
   contents data=mylib.pilots varnum;
Quit;
SAS system options can be set from the pull-down menu (Tools, Options, System) or with a SAS statement.
NOTE: System options for the SAS Listing output control
• line size, page size, page number, whether the date and time are displayed, and many others. These options do not affect the HTML output format.
The general syntax: OPTIONS options;
Some useful options are:
DATE|NODATE: print the date and time or not (default is DATE).
NUMBER|NONUMBER: print page numbers or not. The default is NUMBER, and page numbers accumulate until they are reset.
PAGENO=n: by default, page numbers accumulate. Use PAGENO=n to reset the starting page number. For example, PAGENO=3 restarts the numbering at page 3 and continues from that point on.
PAGESIZE=n|max: the number of lines per page.
LINESIZE=n|max: the number of characters per line. Note: if an observation needs more than one line, it continues on the next line.
NOTE: The OPTIONS statement is a global statement. It can appear anywhere in your program and changes the setting from that point on.
NOTE: It is good practice to place OPTIONS statements outside DATA and PROC steps.
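A short sketch of these options in use. It assumes the sample data set sashelp.class that is installed with SAS; the option values are arbitrary choices for illustration, and their effect is visible in the Listing output only.

```sas
/* Set several Listing options in one global statement */
options nodate nonumber linesize=80 pagesize=50;

proc print data=sashelp.class;
run;

/* Write the current value of one option to the log */
proc options option=linesize;
run;

/* Turn the date and page number back on */
options date number;
```

Because OPTIONS is global, every step between the two OPTIONS statements runs with NODATE and NONUMBER.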
Open the C2_Contents program and practice the following SAS system options using the OPTIONS statement.
Delete all PROC DATASETS procedures.
Add an OPTIONS statement at the end of the program with the following options:
Change the date option to NODATE
Set PAGENO to start at 1 for the output
Set PAGESIZE to 50
Set LINESIZE to 80
Use PROC CONTENTS to see the descriptor of the admit data set in mylib.
Use PROC PRINT to print the admit data.
Check the results to see the effects of these options.
Add another OPTIONS statement and change the options back to DATE, PAGESIZE=max, LINESIZE=max. Then
use PROC PRINT to print the PILOTS data in mylib.
Check the results to see the effect of the options.
Save the program as C2_SYSOptions in your SASEx folder.
Libname mylib 'c:\math707\sasdata';
Proc contents data = mylib._all_ nods; Run;
Proc contents data = mylib.admit; run;
Options nodate pageno=1 pagesize=50 linesize=80;
Proc print data = mylib.admit; run;
Options date pagesize=max linesize=max;
Proc print data = mylib.pilots; run;
Many data sets store two-digit years, such as 94 for 1994 or 10 for 1910.
Today there is no ambiguity in reading 94 as 1994, but year 10 could be 1910 or 2010. This is the Year 2000 compliance problem.
SAS uses OPTIONS YEARCUTOFF=year; to handle this issue. It specifies the first year of the 100-year span used to interpret two-digit years.
The default noted here is YEARCUTOFF=1920, which interprets two-digit years within the 100-year span from 1920 to 2019. (The default differs between SAS releases.)
OPTIONS YEARCUTOFF=1940; interprets two-digit years within the 100-year span from 1940 to 2039.
OPTIONS YEARCUTOFF=1940;
Interprets two-digit years within the 100 years from 1940 to 2039.

Date in the data set    Interpreted as
8/26/15                 8/26/2015
12/25/65                12/25/1965
5/7/90                  5/7/1990
8/30/48                 8/30/1948
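OPTIONS YEARCUTOFF=1960;
Interprets two-digit years within the 100 years from 1960 to 2059.

Date in the data set    Interpreted as
8/26/15                 8/26/2015
12/25/65                12/25/1965
5/7/90                  5/7/1990
8/30/48                 8/30/2048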
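The effect of YEARCUTOFF can be checked with a small sketch like the one below. The data set names dates1940 and dates1960 and the variable d are hypothetical; the four dates are the ones listed above. The option is applied when the two-digit year is read, so the DATA step is run once under each setting.

```sas
options yearcutoff=1940;
data dates1940;
   input d :mmddyy8.;      /* read a date with a two-digit year */
   format d mmddyy10.;     /* display it with a four-digit year */
   datalines;
8/26/15
12/25/65
5/7/90
8/30/48
;
run;
proc print data=dates1940;
run;

options yearcutoff=1960;
data dates1960;
   input d :mmddyy8.;
   format d mmddyy10.;
   datalines;
8/26/15
12/25/65
5/7/90
8/30/48
;
run;
proc print data=dates1960;
run;
```

Only the last date differs between the two listings: year 48 falls in 1948 under the 1940 cutoff and in 2048 under the 1960 cutoff.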
In many applications, the number of observations (cases) is very large. It is important to confirm that a SAS program is correct before processing the entire data set. To test the program, you can process only a small part of the data.
This can be done with the OPTIONS statement:
OPTIONS FIRSTOBS=n1 OBS=n2;
FIRSTOBS=n1 starts reading the data at the n1th observation.
OBS=n2 stops reading the data at the n2th observation.
Example: OPTIONS FIRSTOBS=5 OBS=15;
reads from the 5th observation through the 15th.
The defaults are FIRSTOBS=1 and OBS=MAX.
To go back to reading the entire data set, use
OPTIONS FIRSTOBS=1 OBS=MAX;
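A brief sketch of the subset-then-reset pattern, assuming the sample data set sashelp.class (19 observations) that is installed with SAS in place of a large data set:

```sas
/* Process only observations 5 through 15 while testing */
options firstobs=5 obs=15;

proc print data=sashelp.class;
run;

/* Reset so that later steps read the entire data set */
options firstobs=1 obs=max;

proc print data=sashelp.class;
run;
```

Forgetting the reset is a common source of confusion, because every later step in the session would silently keep using the reduced range.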
SAS defines some statements, such as the LIBNAME and OPTIONS statements, as global statements. A global statement takes effect as soon as it is executed and stays in effect until a later statement in the same SAS session overrides it.
Most SAS statements are local, meaning they take effect only in the step where they appear. If the same setting is given both globally and locally, the local setting overrides the global one for that step only, and the global setting applies again afterwards.
Write a program to
(1) Read and print the sas data set Admit using the following options:
Pageno=1, firstobs=5 and obs = 15
(2) Add another options statement to the program with the options:
Firstobs=3 and obs=8
And print the data set Admit again.
Observe the output and make sure you understand the reason for getting the output.
(3) Reset the options with Pageno=1, firstobs=1 and obs=max, then print the Admit data.
(4) Save the program as C2_sysoptions2 to SASEx folder
Libname mylib 'c:\math707\sasdata';
Options pageno=1 firstobs=5 obs=15;
Data admitn; set mylib.admit; run;
Proc print data=admitn; run;
Proc print data = mylib.admit; run;
Options firstobs=3 obs=8;
Proc print data = admitn; run;
Proc print data = mylib.admit; run;
Options firstobs=1 obs=max pageno=1;
Proc print data=admitn; run;
Proc print data = mylib.admit; run;
PROC PRINT is the most common procedure for printing data.
The general syntax is:
PROC PRINT <options>; RUN;
The following example uses local (data set) options in PROC PRINT to select observations:
PROC PRINT data=mylib.admit (FIRSTOBS=5 OBS=15); RUN;
This prints the 5th through the 15th observations.
More on Local Options Vs. Global Options in PROC PRINT
OPTIONS FIRSTOBS=10 OBS=18;
/* Uses the global options, since there is no local option */
proc print data = mylib.admit; title 'print 10th to 18th cases'; run;
/* Uses the local option for FIRSTOBS=15 and the global option for OBS=18 */
PROC PRINT data=mylib.admit (firstobs=15); title 'prints cases 15 to 18'; run;
/*uses local option for Firstobs = 12, and obs=16.
Since local options overwrite global option for the specific procedure.*/
PROC PRINT data=mylib.admit (firstobs=12 OBS=16); title 'prints cases 12 to 16 ';
run;
/*Uses local option for Firstobs = 5, and obs=20.
Since local options overwrite global option for the specific procedure.*/
PROC PRINT data=mylib.admit(firstobs=5 obs=20); title 'prints 5 to 20 '; run;
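The interaction between global and local options can be reproduced without the course data. The sketch below assumes the sample data set sashelp.class (19 observations) that is installed with SAS; the option values and titles are chosen only for illustration.

```sas
options firstobs=1 obs=10;                /* global setting */

proc print data=sashelp.class;            /* no local options: global applies */
   title 'observations 1 to 10 (global options)';
run;

proc print data=sashelp.class (firstobs=3 obs=6);   /* local overrides global */
   title 'observations 3 to 6 (local options)';
run;

proc print data=sashelp.class;            /* global applies again */
   title 'observations 1 to 10 again (global options)';
run;

title;                                    /* clear the title */
options firstobs=1 obs=max;               /* reset the global options */
```

The local data set options affect only the step in which they appear, so the third listing matches the first.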
See the SAS Help documentation and the textbook for a few additional options.
Write a SAS program to do the following:
(1) Create the library Mylib to connect to the SASData folder as usual.
(2) Use options: pageno=1 firstobs=5 obs=15
(3) Print data set admit in mylib
(4) Print data set admit using local options (firstobs = 3 obs =12) in proc print statement.
(5) Add system options statement with firstobs =1 and obs =15.
(6) Print data set admit using local options (firstobs = 10 obs =20) in proc print statement.
(7) Add system options statement with firstobs =1 and obs =max.
(8) Print data set admit using local options (firstobs = 3 obs =12) in proc print statement.
Save the program as c2_glob_loc_options to SASEx folder
Libname mylib 'c:\math707\sasdata';
Options pageno=1 firstobs=5 obs=15;
Proc print data = mylib.admit; run;
Proc print data = mylib.admit (firstobs=3 obs=12); run;
Options firstobs=1 obs=15;
Proc print data = mylib.admit (firstobs=10 obs=20); run;
Options firstobs=1 obs=max;
Proc print data = mylib.admit (firstobs=3 obs =12); run;