Forthcoming Changes in SAS
Paul Kent
VP SAS Platform Research & Development
Where do I come from?
New Hill, North Carolina
Johannesburg, South Africa
Fareham, England
R & D :: Loyal Employees
R & D groups, and where I come from
 Platform
 Clients
 Solutions
• With Analytics
What do we programmers do?
Gather Data
Organise Data
Arrange Data for consumption
Facilitate said consumption
Create understanding of Data
Promote understanding of said Data
Who do we programmers do it for?
Audience Continuum
Information Consumers
Domain Experts
Web Report Viewing
Web Reporting
Power Reporting
Information Delivery Framework
Forthcoming Improvements in SAS
ODS (and the new ODS statistical graphics)
SAS Database Storage capabilities
The Data Step and Proc SQL
Grid Computing Capabilities
Bits and Pieces
ODS Statistical Graphics
Copyright © 2004, SAS Institute Inc. All rights reserved.
Survival Plot Using PROC LIFETEST in SAS 8
 J. Zhou, NESUG 2002
 Three-page SAS program with macros
 Uses GPLOT and GREPLAY for graphics
Statistical Metadata
Overlaid Curves
Statistical Graphics
 Essential for modern data analysis
 Difficult to create in SAS prior to SAS 9
• Context lost when statistical procedure terminates
• Programmer must recreate context, metadata
 Statistical procedures should automatically create graphics
 Follow the 80-20 rule – 20% of these might need further tweaking, but for the most part…
Life Is Easier in SAS 9 …
ods graphics on;
ods html file="lifetest.htm";
proc lifetest data=surv;
   time surv*censor(1);
   survival plots=(survival hwb);
   strata trt;
   id patient;
run;
ods html close;
ods graphics off;
LIFETEST Procedure – Survival Plot
LIFETEST Procedure – HWB plot
Usage of ODS Statistical Graphics in SAS 9
 Experimental in 30 SAS/STAT and SAS/ETS procedures in SAS 9.1
 Automates creation of commonly used graphical displays for a particular analysis
 Production in SAS 9.2
GAM Procedure
HPF Procedure
KDE Procedure
LOGISTIC Procedure
MIXED Procedure
PHREG Procedure
PLS Procedure
PRINCOMP Procedure
REG Procedure
UCM Procedure
Integration with ODS Styles
 Over 30 different styles
 New style elements for statistical graphics
• Fitted line
• Confidence lines and bands
• Prediction Lines
• Outliers
• Classification groups
Style Demonstration
ods html file="robustreg.htm" style=journal;
ods graphics on;
title "Journal Style";
proc robustreg data=mydata plot=all;
   model y = x1 x2 x3;
run;
ods html close;
ods graphics off;
(only Summary Statistics and Residual Histogram output shown)
 Goal is to automate creation of graphics by statistical procedures
• Minimum work for user
• Maximum built-in functionality
 Experimental in SAS 9.1
 Production in SAS 9.2
SAS Transactional Storage
(aka SAS Database Capabilities)
 Demo Time
 1. Color_table
• Remember to start your TableServer
 2. Customers
• Remember to start your AppServer (tomcat5)
SAS Transactional Storage
(aka SAS Database Capabilities)
 A more traditional Database Capability
 From SAS (not Oracle, IBM, or Microsoft)
 Based on open-source “Firebird”
Real Datatypes – INT, MONEY, VARCHAR
Real Connectors – JDBC, ODBC, SAS Libname
Real Transactions – Rollback and Commit
MultiUser Server
What’s New in SAS Grid
Cheryl Doninger
R&D Director, Grid Development
Roger Thompson
Relationship Manager
Merry Rabb
Product Manager, Grid
Grid Computing Market Size & Growth
Rapid Adoption of Grid Computing Based on Benefits
Grid Adoption is Increasing
2/3 of firms surveyed are using or considering grid
A high percentage of firms using analytical applications are considering grid
Benefits of Grid Computing
 Faster results
 More executions – more data
 Time to recover from errors
 Better use of resources
 Virtualize resources
 Incremental IT spend
Types of Applications Suitable for Grid
 Long running
 Many replicate runs of the same fundamental task
• simulation (what-if analysis)
• optimization (testing lots of scenarios)
• BY GROUP processing
• data segmentation
 Independent tasks running against large data sources
• scoring – risk analysis
• multiple procedures and data steps
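As a rough illustration of the "data segmentation" and "independent tasks" cases above, the sketch below uses SAS/CONNECT asynchronous RSUBMIT blocks to run two segments of the same analysis side by side. It assumes SAS/CONNECT is licensed; the session names (males, females) and the use of sashelp.class are illustrative only and do not come from the slides.

```sas
/* Sketch only: start helper sessions on this machine automatically at the
   first RSUBMIT; "!sascmd" reuses the command that launched this session. */
options autosignon=yes sascmd="!sascmd";

/* Segment 1: runs asynchronously because of WAIT=NO */
rsubmit males wait=no;
   proc means data=sashelp.class(where=(sex='M'));
      var height weight;
   run;
endrsubmit;

/* Segment 2: same analysis, different slice of the data */
rsubmit females wait=no;
   proc means data=sashelp.class(where=(sex='F'));
      var height weight;
   run;
endrsubmit;

/* Block until both segments have finished, then end the helper sessions */
waitfor _all_ males females;
signoff _all_;
```

The two RSUBMIT blocks share no intermediate data, which is what makes this kind of job a natural candidate for spreading across machines.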
SAS Grid Strategy
 Infrastructure benefits SAS applications
• large data / complex algorithms
 Focus areas
• Development
• Run-time
• System management
 Incremental Releases
SAS Grid Roadmap
Phase I
 SAS 8.2 functionality
• %Distribute
• SAS log
SAS Grid Success Stories
Texas Tech University
Statistics Canada
Large Pharmaceutical Company
SAS Grid Roadmap
Phase II
 SAS 9.1.3 Q3/2005 functionality
• smarter engines for SAS IDEs
• SAS/Platform integration
• SASMC monitoring
Business Analytics - Enterprise Miner on SMP
Business Analytics - Enterprise Miner on Grid
Data Integration – ETL Studio on SMP/Grid
Business Intelligence – Enabled on SMP/Grid
SAS Stored Process
SAS Program
ETL Studio
Enterprise Miner
Grid Manager Plugin – job view
Grid Manager Plugin – host view
SAS 9 Grid Computing Components
SAS 9 Grid Computing (NEW September 2005)
 Grid Manager
• Grid Monitoring
• Grid Management
• Job Termination
 Platform Suite for SAS
• Dynamic Load
• Job, Queue & Host
 SAS Connect
• Session Spawning
 SAS Applications
• Enterprise Miner
• Stored Processes
• Data Integration
 Grid Enabled Code Generation
Multiple Components Working Together to Provide Grid Computing
General Layout of a SAS Grid
[Diagram labels: Client Machine; Metadata Server; Grid Mgr plugin; SAS Grid; Grid Control; Grid Node (three); SAS Foundation; Suite for SAS]
Grid Work Flow
Metadata Server
session resource
wl options
SASMain sas -noobjectserver
Workspace Server
Connect Client
LSF Cluster File
grdsvc_enable(p1, "resource=SASMain");
Node1 ! ! 1 () (SASMain)
Node2 ! ! 1 () ()
Node3 ! ! 1 () (SASMain)
signon p1;
SAS Metadata
SASMain – Server Context
Platform Server Component
sas -noobjectserver
ETL Studio
Enterprise Miner
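The client-side fragments in this work flow (grdsvc_enable followed by signon p1) can be put together roughly as follows. This is a minimal sketch, assuming the metadata server connection options are already set for the session and that a logical server named SASMain is defined as on the slide. The PROC MEANS step and sashelp.class are placeholders, not part of the original.

```sas
/* Sketch only: ask that the session named p1 be started on a grid node
   that offers the SASMain resource. A return code is placed in &rc. */
%let rc = %sysfunc(grdsvc_enable(p1, resource=SASMain));

/* The grid chooses the node; no host name is given here */
signon p1;

rsubmit p1 wait=no;
   /* placeholder for any independent unit of work */
   proc means data=sashelp.class;
   run;
endrsubmit;

/* Wait for the remote work to finish, then release the grid session */
waitfor p1;
signoff p1;
```

Because the node is selected at SIGNON time, the same program can run unchanged whether the grid has three nodes or thirty.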
Partitioning the Grid
session resource
wl options
SASMain sas -noobjectserver ETL
Metadata Server
Workspace Server
Connect Client
EM grid
LSF Cluster File
Node1 ! ! 1 () (SASMain,EM)
Node2 ! ! 1 () (SASMain,EM,ETL)
Node3 ! ! 1 () (SASMain,ETL)
grdsvc_enable(p1, "resource=SASMain,
signon p1;
SAS Metadata
SASMain – Server Context
Platform Server Component
sas -noobjectserver
ETL Studio
Enterprise Miner
ETL grid
Grid Provides: Speed and Efficiency
Analytics are working, so people…
 Build more models
• For successively refined segments of customers
 Use more data in those models
 Integrate the results into operational systems
• <near real time>
 A SAS 9.2 DATA step movie
 More multithreading enablement within SAS
 Yes, even the DATA STEP
 Saved Programs
 Multi Threaded Server Capabilities
• Same model, parallel data for throughput
• Many models, same data – one-off scores in operational systems
 Model management can deploy models to “score servers” without restarting them
Bits and Pieces
Reverse Engineer SAS jobs
Checkpoint and Restart SAS jobs
Encode (and protect) your SAS jobs
ZIP functions
Protect your IP
<expire=> <site=> …
 Send to your customers
 %include '';
• Implies NOSOURCE; your macros can reset NOMPRINT…
Checkpoint/Restart and Parallelization Features in the Core Supervisor
Rick Langston, Core Systems Department
 Craig R.’s request as per user community
 Job fails – want to restart where it left off
 ETL Studio also wanted a restart facility
A simple solution
Record a checkpoint number, save it in WORK
If restarting, skip PROC and DATA steps up to that checkpoint
Tokenize everything
Execute all global statements
To set up for checkpointing
 Have WORK refer to a permanent directory
 Use the CHECKPOINT option
Subsequent restarting
Again point WORK to the permanent directory
Use the RESTART option
Job will restart as of the last successful step
Is this what users want?
 We can’t do this without user being proactive
 Issues with steps that overwrite their own input (data temp; set temp;)
 skipped steps may need to be executed
 Output files (flat files – DISP=MOD,
 Use it for a step that must be executed
 For example, SYMPUT and CALL EXECUTE
 Using options debug='checkpoint-implicit';
 Option names still to be decided
data temp1; x=1; run;
data temp2; x=2; run;
data temp3; x=3; run;
data _null_;
   if "&sysparm."="1" then abort abend 999;
run;
data temp4; x=4; run;
 Invoke once with checkpoint-implicit
 Then reinvoke with restart-implicit
Additional info
 Planned for 9.2
 Option names still being decided
 Wanting additional input
Parallelization Efforts
Reading in arbitrary SAS code
Producing metadata in comments
This could be post-processed by ETL Studio
This could be post-processed by Grid Computing
Parallelization Efforts
Researching so far
Hooks in dependency opens
Catalogs, flat files, SAS data sets, etc.
Emitting info in comments
Example of use
Exposure to User
 New option, such as DEPMETA=fileref
 SAS program with comments written to this file
Ideas for the Future!
 How can the software learn?
 So the user doesn’t have to learn about the software; they can learn the business!
 Some future ETL Studio job
• Remembers data volumes from last week’s run
• Uses that memory to choose a better strategy
Your Turn!!
 You tell me next time SAS forgets something it should have remembered
 And why remembering that would help SAS improve next time
Thanks for listening!
