Optimizing SAS System Performance: A Platform Perspective
Patrick McDonald
Scryer Analytics, LLC
June 3, 2010
Copyright © 2010, Scryer Analytics, LLC. All rights reserved.
Presentation Overview
▪ After this presentation you will know:
• How your SAS code interacts with the hardware it runs on.
• The different hardware configurations SAS may run on in your organization.
• How to help your IT organization diagnose and correct performance problems.
▪ You probably won't gain:
• Any new SAS programming tips
• More than a very brief overview of efficient programming techniques
An Easy Question
proc sql;
   connect to db2 (database=mydatabase);
   create table Table1 as
      select * from connection to db2
         ( select * from db2table );
   disconnect from db2;
quit;

data View1 / view=View1;
   set Table1;
   retain x;
   output;
   x=y;
run;

proc summary data=View1 NWAY;
   var _numeric_;
   class c1 c2 c3;
   output out=p.mymeans mean=M N=COUNT;
run;

What does this program do?
▪ Connects to DB2
▪ SAS table of db2table
▪ Disconnects from DB2
▪ Creates x as previous y
▪ Calculates Mean and N and outputs data
What controls system performance?
[Diagram: Resources and Relationships. Labels: Programmer Time, CPU Time, Memory, Storage, I/O, Hardware]
Efficient Programming Practices
Writing Efficient Code
▪ Necessary Statements
▪ Passes Through Data
▪ Essential Read/Writes
▪ Permanent SAS Data
▪ Necessary Procedures
▪ Sorting, Duplicates, Etc.
Configuring/Tuning Options
▪ Buffer Allocation
▪ Memory Allocation
▪ Multithreading
▪ SAS Views
▪ DBMS Optimization
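The buffer, memory, and multithreading items above map onto SAS system and data set options. A minimal sketch follows; every value in it is an assumption chosen for illustration, not a recommendation, and should be tuned against your own hardware and data.

```sas
/* Illustrative values only; all numbers below are assumptions. */
options threads        /* allow thread-enabled procedures to use threads */
        cpucount=4     /* assumed number of cores available to a session */
        sortsize=1G    /* assumed memory ceiling for sorting             */
        bufno=10;      /* assumed number of page buffers per data set    */

/* BUFSIZE= can also be set per data set at creation time. */
data work.big (bufsize=64k);
   set sashelp.class;
run;
```

MEMSIZE is deliberately left out: it can only be set at SAS invocation or in the configuration file, not in an OPTIONS statement.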
Resource Model – CPU, RAM, I/O, & Disk
CPU
▪ What is a CPU?
• # of Sockets
• # of Chips
• # of Cores
• # of Co-processors
• Clock Speed
• Etc.
▪ SPECfp
▪ SPECint
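To see how many CPUs a SAS session believes it may use, and whether threading is on, the session can be asked directly. This is a small sketch; the values reported depend entirely on your installation.

```sas
/* Report the CPU count SAS assumes for thread-enabled procedures. */
proc options option=cpucount;
run;

/* GETOPTION returns THREADS or NOTHREADS for the current session. */
%put NOTE: threading = %sysfunc(getoption(threads));
```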
RAM
Memory
▪ RAM
▪ RAM per core
▪ RAM per session
▪ RAM for OS
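RAM per session is governed largely by the memory options the session was started with. The sketch below only displays the current settings; it changes nothing, and the output will differ from site to site.

```sas
/* Total memory a single SAS session may allocate. */
proc options option=memsize;
run;

/* Memory ceiling used when sorting. */
proc options option=sortsize;
run;

/* All memory-related options in one listing. */
proc options group=memory;
run;
```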
I/O
▪ Types of Storage
• Network Attached Storage
• Local Disk
• Storage Area Network
▪ The disk is the slowest part of the system: ~10-60 MB/s read/write speeds
▪ Throughput per session
• 15-25 MB/s
• 50-75+ MB/s
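To put the per-session throughput figures in perspective, the sketch below estimates how long one sequential pass over a data set would take at each rate. The 10 GB size is a hypothetical example, and the calculation ignores caching, contention, and CPU time.

```sas
data _null_;
   size_mb = 10 * 1024;            /* hypothetical 10 GB data set, in MB     */
   do rate_mb_s = 15, 25, 50, 75;  /* per-session rates quoted on the slide  */
      seconds = size_mb / rate_mb_s;
      minutes = seconds / 60;
      put rate_mb_s= seconds= 8.1 minutes= 8.1;
   end;
run;
```

Each additional pass through the data (a sort, a second procedure) adds roughly the same amount again, which is why the number of passes matters as much as the raw rate.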
A little more about storage
Storage Options
▪ HBAs
▪ LUNs
▪ RAID
▪ Disks
▪ Disk Speed
▪ Disk Size
File Systems
▪ SAS User
▪ Temporary Work Space
▪ Permanent Data Storage
▪ Utility (UTILLOC)
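Before discussing file system layout with IT, it helps to know where a session's temporary and utility files actually land. A minimal sketch that writes both locations to the log:

```sas
/* Physical path of the temporary WORK library for this session. */
%put NOTE: WORK path = %sysfunc(pathname(work));

/* Location used for utility files (for example, by threaded sorts). */
%put NOTE: UTILLOC   = %sysfunc(getoption(utilloc));
```

If UTILLOC reports WORK, utility files share the same file system as temporary data sets.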
RAID Configurations in SAS Environments
Operating System Limitations
Windows (32 bit)
▪ Enterprise Edition (32 bit)
• ~2 GB of RAM practical limit
• 5 GB data set size practical limit (file cache contention)
Windows (x64)
▪ Enterprise Edition for x64
• Support issues (9.1)
• 5 GB data set size practical limit (file cache contention)
Windows (Itanium)
▪ Enterprise Edition (Itanium)
• 10 GB data set size practical limit (file cache contention)
Unix (64 bit)
▪ HP-UX, Solaris, AIX etc.
• Limited by hardware only
• Access to additional memory
• No file cache contention issues
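Which of these limits applies depends on the platform the session is running on. A small sketch using automatic macro variables to record it in the log:

```sas
/* Operating system abbreviation and more specific platform name. */
%put NOTE: Platform    = &sysscp (&sysscpl);

/* Full SAS release and maintenance level. */
%put NOTE: SAS version = &sysvlong;
```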
Architecture Limitations
Hardware Bottlenecks
▪ CPU (#, speed, etc.)
▪ I/O
▪ RAM
▪ Backplane
▪ Cache
▪ Configuration/Tuning
▪ Hyperthreading
SAN Bottlenecks
• Host Bus Adaptors
• Paths to Disk
• Ethernet (2 GB/s Ethernet)
• Disks
− RAID
− Disk Speed
− # of disks
− Disk Size
• LUNs & File Systems
Redux: what does this program do?
proc sql;
connect to db2
(database=mydatabase);
create table Table1 as
select * from connection to
db2
( select * from db2table);
disconnect from db2;
quit;
data View1 / view=View1;
set Table1;
retain x;
output;
x=y;
run;
proc summary data=View1 NWAY;
var _numeric_;
class c1 c2 c3;
output out =p.mymeans
mean= M N=COUNT;
run;
Think like hardware?
PROC SQL
proc sql;
   connect to db2 (database=mydatabase);
   create table Table1 as
      select * from connection to db2
         ( select * from db2table );
   disconnect from db2;
quit;
What resources are used?
Data Step
data View1 / view=View1;
   set Table1;
   retain x;
   output;
   x=y;
run;
What resources are used?
Proc Step
proc summary data=View1 NWAY;
   var _numeric_;
   class c1 c2 c3;
   output out=p.mymeans mean=M N=COUNT;
run;
What resources are used?
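One way to answer this question empirically is to turn on FULLSTIMER and read the log after each step. The sketch below uses SASHELP.CLASS as a hypothetical stand-in so it runs without DB2 access; the data set names work.demo and work.demo_means are assumptions, not part of the original program.

```sas
/* Log real time, CPU time, and memory for each step. */
options fullstimer;

/* Hypothetical stand-in for Table1. */
data work.demo;
   set sashelp.class;
run;

/* AUTONAME generates output variable names such as height_Mean. */
proc summary data=work.demo nway;
   class sex;
   var height weight;
   output out=work.demo_means mean= n= / autoname;
run;

options nofullstimer;
```

Comparing real time with CPU time per step gives a rough indication of whether a step is waiting on I/O or is CPU bound.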
Typical BI/SAS Solution Architecture
BI Architecture
Web Server Loads
▪ CPU Intensive
▪ Integer Calculations
▪ Rack Servers
▪ Pooled, Load Balanced
▪ ~100 concurrent sessions per core (CPU)
Small Text Files
BI Architecture
Application Server Loads
▪ CPU Intensive
▪ Integer Calculations
▪ Rack Servers
▪ Pooled, Load Balanced
▪ ~100 concurrent sessions per core (CPU)
Small Text Files
BI Architecture
SAS Metadata Server
▪ Memory Intensive
▪ Metadata stored in memory for speed
▪ Generally 2 CPUs, except for very large implementations
Metadata in RAM database
BI Architecture
SAS BI Servers
▪ CPU and/or I/O Intensive
▪ Heavy Floating Point (CPU)
▪ Heavy I/O, depending upon the number of sessions and volume of data
▪ Heavy Memory (type of problem & number of concurrent sessions)
Large Volumes of Data
BI Architecture
SPD Server/RDBMS
▪ I/O Intensive
▪ SAN Storage (75+ MB/s sustained I/O throughput per session)
Large Volumes of Data
Questions
SIMPLICITY BEYOND COMPLEXITY