Uploaded by sharmini.gopinathan

Lecture 2

Lesson 1: Introduction to SAS® Visual Statistics
1.1 Overview
1.2 Managing Reports and Pages
1.3 SAS Viya Architecture
Copyright © SAS Institute Inc. All rights reserved.
Course Description
• This course provides an introduction to SAS Visual Statistics on SAS Viya. SAS Visual Statistics is an add-on to SAS Visual Analytics on SAS Viya.
• SAS Visual Statistics adds modeling functionality to the SAS Visual Analytics web client. It enables users to experience powerful statistical modeling and machine learning techniques via an easy-to-use, drag-and-drop visual interface.
• The course is hands-on and provides direct access to an environment for practice and practical experience.
Course Objectives
• Describe the features and benefits of SAS Visual Statistics.
• Cluster the data based on demographic characteristics.
• Perform linear regression analyses.
• Create logistic regression models, including nonparametric models.
• Build generalized linear models and generalized additive models.
• Create a decision-tree analysis.
• Develop models by groups.
• Compare model performance.
• Export score code.
• Create reports in the SAS Visual Analytics environment.
Lesson Objectives
• Discuss the SAS Viya architecture.
• Describe SAS Cloud Analytic Services (CAS).
• Describe the benefits and functionality of SAS Visual Statistics.
• Access SAS Visual Statistics functionality within the SAS Visual Analytics environment.
• Work with the interface.
SAS Viya Architecture
SAS Cloud Analytic Services (CAS)
Symmetric Multiprocessing (SMP)
Massively Parallel Processing (MPP)
SAS Cloud Analytic Services (CAS)
Resilient
High speed
Scalable
Node-to-node communication
Fault tolerance
SAS Viya Applications
SAS Drive
SAS Visual Analytics
SAS Data Studio
SAS Visual Statistics
SAS Visual Analytics App
SAS Environment Manager
SAS Visual Data Mining and Machine Learning
SAS Cloud Analytic Services (CAS)
SAS Graph Builder
SAS Theme Designer
What Is SAS Visual Statistics?
With the SAS Visual Statistics add-on, you can use SAS Visual Analytics
to rapidly build and modify predictive models.
SAS Visual Statistics models enable you to do the following actions:
• take advantage of the in-memory engine, the CAS server, to quickly
and efficiently analyze in-memory data
• enable concurrent access to data for multiple users to formulate
and refine models
SAS Visual Statistics
Use SAS Visual Statistics to do the following tasks:
• apply best-in-class analytics to data of any size
• create a partition ID
• perform exploratory modeling
• derive predictive outputs
• perform cluster analysis
• evaluate models and put a champion model into production with existing
SAS products
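Creating a partition ID amounts to tagging each row as training, validation, or (optionally) test data. The following is a minimal Python sketch of that idea, not SAS code. The function name `assign_partition_ids`, the 0/1/2 coding, and the default fractions are all assumptions made for illustration.

```python
import random

def assign_partition_ids(n_rows, validation_fraction=0.3, test_fraction=0.0, seed=42):
    """Return one partition ID per row: 1 = training, 0 = validation, 2 = test.

    The ID coding is an assumption made for this sketch.
    """
    if not 0 <= validation_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")
    rng = random.Random(seed)  # fixed seed so the partition is repeatable
    ids = []
    for _ in range(n_rows):
        u = rng.random()
        if u < validation_fraction:
            ids.append(0)
        elif u < validation_fraction + test_fraction:
            ids.append(2)
        else:
            ids.append(1)
    return ids

partition = assign_partition_ids(1000, validation_fraction=0.3)
n_validation = sum(1 for p in partition if p == 0)
print(n_validation, "validation rows out of", len(partition))
```

Because the assignment is random, the validation share is only approximately 30%. Reusing the same seed reproduces the same partition.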
Target Audience
Analytically minded users want to build exploratory predictive models
to solve specific business problems. These groups are included:
• data miners
• statisticians
• data scientists
• engineers
• researchers
• economists
• biostatisticians
• database marketers
Functional Overview
• Use SAS Visual Analytics to prepare and explore data.
• Access the SAS Visual Statistics functionality for advanced model building.
• Use predictive and exploratory modeling techniques.
  • linear and logistic regression analyses (including nonparametric logistic)
  • generalized linear and additive models
  • decision trees
  • clustering
• Work with detailed data at the observational level.
• Add or remove variables, and filter interactively.
• Create interactions spontaneously.
Functional Overview (continued)
• Identify influential outliers interactively.
• Apply group-by processing (that is, create separate models by group-by variables or segments).
• Perform model comparison via lift charts, ROC charts, concordance statistics, misclassification tables, and other model comparison metrics.
• Export model score code.
• Create data partitions for validation or testing, or both.
• Write summary tables to the output in Microsoft Excel.
• Create a secure, multi-user environment for concurrent access to in-memory analytics via the CAS server.
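Two of the comparison metrics listed above, the misclassification table and the concordance statistic, can be sketched in a few lines of Python. This is an illustration of the definitions only, not how SAS Visual Statistics computes them. The function names, the 0.5 cutoff, and the two sets of predicted probabilities are invented for the example.

```python
def misclassification_table(actual, predicted_prob, cutoff=0.5):
    """Count true/false positives and negatives for a binary (0/1) target."""
    table = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for y, p in zip(actual, predicted_prob):
        predicted = 1 if p >= cutoff else 0
        if y == 1:
            table["TP" if predicted == 1 else "FN"] += 1
        else:
            table["FP" if predicted == 1 else "TN"] += 1
    return table

def misclassification_rate(table):
    """Share of rows whose predicted class differs from the actual class."""
    return (table["FP"] + table["FN"]) / sum(table.values())

def concordance(actual, predicted_prob):
    """Fraction of (event, non-event) pairs in which the event row has the
    higher predicted probability; tied pairs count as one half."""
    events = [p for y, p in zip(actual, predicted_prob) if y == 1]
    non_events = [p for y, p in zip(actual, predicted_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))

# Hypothetical validation data: actual outcomes and two models' probabilities.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]
model_b = [0.6, 0.5, 0.4, 0.7, 0.3, 0.8, 0.9, 0.2]

for name, probs in [("A", model_a), ("B", model_b)]:
    table = misclassification_table(actual, probs)
    print(name, table, misclassification_rate(table), concordance(actual, probs))
```

A lower misclassification rate and a higher concordance both favor a model, so on this toy data model A would be the champion under either criterion.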
SAS Drive
Applications menu
New content button
Quick Access area
Folders and Filters
Canvas
The applications available depend
on your assigned permissions and
the products licensed by your site.
Accessing SAS Visual Statistics Functionality
From SAS Drive, you can select New from the Reports tab, select Explore
and Visualize from the Applications menu, or select Open from the Actions
menu.
SAS Visual Analytics Welcome Window
Use the Welcome to SAS Visual Analytics window to do one of the following
tasks:
• start with data, to open a data source or begin importing data
• start a new report
• open an existing report
SAS Visual Analytics Interface
[Figure: the SAS Visual Analytics interface, with callouts for the page tabs, the left pane with the Data pane open, the canvas with an object, More options, and the right pane with the Options pane open.]
SAS Visual Statistics Objects
Specifying Settings
Two types of settings are available.
• Global – settings that are shared between all of the SAS web applications
• Application-specific – preferences that are specific to SAS Visual Statistics
  • fit summary p-value precision
  • default statistic for model comparison
Global Settings
Select NAME > Settings from the menu bar.
SAS Visual Statistics Settings
Select NAME > Settings > Visual Statistics.
Example SAS Visual Statistics Session
1. Import or open a data source.
2. Add a Visual Statistics object to the canvas.
3. Assign variables in the role pane.
4. Customize model with selections in the options pane.
5. Examine the output.
6. Enhance the model using interactive features.
7. Save the report.
[Figure: workflow of Data, Object, Assign, Customize, Examine.]
Idea Exchange
• How many rows does a typical data set (table) that you work with contain?
• How many variables (fields)?
Course Data Introduction
Anonymized and transformed campaign data
• is from the accounts of a large financial-services firm
• includes home equity lines of credit, loans, and other short-term
to medium-term credit instruments
• includes more than a half-year time interval
• focuses on direct and indirect promotions
• has three target variables (B_TGT is the focus)
• has an account identifier
Introduction to the SAS Visual Statistics Environment
In this demonstration, you access and examine the SAS Visual Statistics interface, customize your preferences, and explore relationships among variables in the course data.
Practice
This practice reinforces the concepts
discussed previously.
Questions?
1.2 Managing Reports and Pages
Objectives
• Manage your reports and pages in SAS Visual Analytics.
Managing Reports
Many approaches are available for creating a new report in SAS Visual
Analytics. Some users select a data source before they add the report
objects. Other users add report objects to the canvas and then select a data
source.
Open SAS Visual Analytics. You can do the following actions.
• To create a new report, click (Menu) in the upper right corner of the report and select New.
• To save a report, click (Menu) in the upper right corner of the report and select Save or Save As.
• To view a report, click (Menu) in the upper right corner of the report and select View report.
Managing Pages
Within a report, you can do the following actions:
• To add a page, click (New page) in the upper left corner of the report.
• To duplicate a page, click (Options) inside a page tab and select Duplicate page.
• From the Outline pane, you can easily do the following activities:
  • add a page
  • delete a page (select it first)
  • hide a page
  • move objects and pages
Questions?
1.3 SAS Viya Architecture
Objectives
• Describe the modeling approach.
• Describe SAS Viya architecture.
Modeling Approach Modification
Traditional Approach (SEMMA): Sample → Explore → Modify → Model → Assess
Big Data Approach (EMSMA): Explore → Modify → Segment → Model → Assess
SAS Visual Statistics
Rapid Model Building and Refinement
Big Data and Big Analytics Challenges
• Very computationally intensive model fitting or simulations can cause slow execution.
• Massive amounts of data are stored in databases. Extraction into traditional computing environments requires massive data movement, which results in very poor performance.
• Many data mining algorithms require multiple passes of the data for training models or reloading data into memory across analytical steps. This exacerbates the performance problem.
SAS Viya Functionality
Among the core products in SAS Viya, available functionality is cumulative.
• SAS Visual Analytics provides baseline functionality, including reporting
and basic analytics.
• SAS Visual Statistics provides an additional set of advanced analytic
functions.
• SAS Visual Data Mining and Machine Learning provides a second
additional set of advanced analytic functions.
SAS Viya Functionality
SAS Visual Analytics
• Explore data and discover relationships
• Examine distributions and summary statistics
• Perform post-model analysis and reporting
SAS Visual Statistics
• Build unsupervised and supervised models
• Interactively refine candidate models
• Compare models and generate score code
SAS Visual Data Mining and Machine Learning
• Six additional machine learning models
SAS Viya Interfaces
Multiple Interfaces, Including Visual and Programmatic
SAS Visual Statistics in SAS Viya
Modeling Techniques
(Visual Interface)
• Linear Regression
• Logistic Regression
• Nonparametric Logistic
• GLM Regression
• GAM Regression
• Clustering
• Decision Tree
Analytical Procedures
(SAS Studio Programmatic Interface)
• GENSELECT (Generalized Linear Model)
• KCLUS (k-means and k-modes Clustering)
• NMF (Nonnegative Matrix Factorization)
• SANDWICH (Sandwich Variance Estimator)
• PCA (Principal Component Analysis)
• LOGSELECT (Logistic Regression)
• NLMOD (Nonlinear Regression)
• REGSELECT (Linear Regression)
• TREESPLIT (Decision Trees)
• PLSMOD (Partial Least Squares)
• QTRSELECT (Quantile Regression)
• SPC (Statistical Process Control)
• LMIXED (Linear Mixed Models)
• MBC (Model-Based Clustering)
• SIMSYSTEM (Simulate Univariate Data)
• GAMMOD (Generalized Additive Model)
• GAMSELECT (Model Selection for GAM)
• PHSELECT (Proportional Hazard Model)
• ICA (Independent Component Analysis)
• MODELMATRIX (Matrix of Covariates)
CAS Actions
CAS actions are the most basic units of a CAS server.
Example actions: tableInfo(), help(), topK(), logistic(), addTable(), shutDown(), genmod(), freq(), glm(), summary(), shuffle(), addNode(), ...
CAS Action Sets
CAS actions are grouped into action sets.
• Simple: freq(), summary(), topK(), ...
• Regression: logistic(), genmod(), glm(), ...
• Table: tableInfo(), addTable(), shuffle(), ...
• BuiltIn: help(), shutDown(), addNode(), ...
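The grouping of actions into action sets can be pictured as a lookup from set name to action names. The Python sketch below is only an illustration, not a CAS API. The assignment of each action to a set reflects an understanding of the standard CAS action sets rather than anything stated on the slide, and the helper `qualified_name` is hypothetical.

```python
# Illustrative registry: action set name -> a few of its actions.
# The set membership shown here is an assumption, not taken from the slide.
ACTION_SETS = {
    "simple": ["freq", "summary", "topK"],
    "regression": ["glm", "genmod", "logistic"],
    "table": ["addTable", "tableInfo", "shuffle"],
    "builtins": ["help", "addNode", "shutdown"],
}

def qualified_name(action):
    """Return 'actionSet.action' for an action name (case-insensitive).

    Trailing parentheses, as written on the slide, are ignored.
    """
    wanted = action.rstrip("()").lower()
    for set_name, actions in ACTION_SETS.items():
        for name in actions:
            if name.lower() == wanted:
                return f"{set_name}.{name}"
    raise KeyError(f"unknown action: {action}")

for slide_name in ["glm()", "tableInfo()", "freq()", "shutDown()"]:
    print(slide_name, "->", qualified_name(slide_name))
```

Qualifying an action with its set name keeps two actions with the same short name in different sets from colliding, which is one reason to group actions this way.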
SAS Viya Consistency
Different Interfaces, Same Results
[Figure: SAS Visual Analytics, SAS Visual Statistics, SAS Visual Data Mining and Machine Learning, and CAS-enabled procedures (TREESPLIT, LOGSELECT, GENSELECT, …) each submit a CAS action, and the action results are returned.]
SAS Cloud Analytic Services
Memory Management
• SAS Cloud Analytic Services is a server that provides the cloud-based, run-time environment for data management and analytics with SAS.
• The server can run on a single machine or as a distributed server on multiple machines.
• The distributed server has a communication layer that supports fault tolerance.
• The data on the server is managed in blocks in order to handle large problems and tables that exceed the memory capacity.
• Whenever necessary, the server caches the blocks on disk.
• SAS Viya is a unified, third-generation, high-performance analytics engine.
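The idea of managing data in blocks and caching blocks on disk when memory runs short can be illustrated with a toy Python class. This is a conceptual sketch only and says nothing about how CAS is implemented. The `BlockStore` class, its least-recently-used eviction rule, and the tiny block size are all assumptions chosen for the example.

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class BlockStore:
    """Toy block manager: keeps a few blocks in memory, spills the rest to disk."""

    def __init__(self, max_blocks_in_memory, spill_dir):
        self.max_blocks_in_memory = max_blocks_in_memory
        self.spill_dir = spill_dir
        self.in_memory = OrderedDict()  # block_id -> rows, least recently used first
        self.on_disk = {}               # block_id -> path of the spilled file

    def put(self, block_id, rows):
        self.in_memory[block_id] = rows
        self.in_memory.move_to_end(block_id)
        self._evict_if_needed()

    def get(self, block_id):
        if block_id not in self.in_memory:
            # Reload a spilled block from its file.
            path = self.on_disk.pop(block_id)
            with open(path, "rb") as f:
                self.in_memory[block_id] = pickle.load(f)
            os.remove(path)
        self.in_memory.move_to_end(block_id)
        rows = self.in_memory[block_id]
        self._evict_if_needed()
        return rows

    def _evict_if_needed(self):
        # Spill least recently used blocks until the memory limit is met.
        while len(self.in_memory) > self.max_blocks_in_memory:
            old_id, rows = self.in_memory.popitem(last=False)
            path = os.path.join(self.spill_dir, f"block_{old_id}.pkl")
            with open(path, "wb") as f:
                pickle.dump(rows, f)
            self.on_disk[old_id] = path

with tempfile.TemporaryDirectory() as spill_dir:
    store = BlockStore(max_blocks_in_memory=2, spill_dir=spill_dir)
    values = list(range(10))  # stand-in for a table column
    block_size = 3
    n_blocks = 0
    for start in range(0, len(values), block_size):
        store.put(n_blocks, values[start:start + block_size])
        n_blocks += 1
    # Sum the column one block at a time; spilled blocks are reloaded on demand.
    total = sum(sum(store.get(i)) for i in range(n_blocks))
    spilled = len(store.on_disk)

print("blocks:", n_blocks, "total:", total, "blocks on disk at the end:", spilled)
```

The computation sees every block even though only two fit in memory at once, which is the property that lets a table exceed the available memory.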
Questions?