Lesson 1: Introduction to SAS® Visual Statistics
1.1 Overview
1.2 Managing Reports and Pages
1.3 SAS Viya Architecture

Copyright © SAS Institute Inc. All rights reserved.

1.1 Overview

Course Description
• This course provides an introduction to SAS Visual Statistics on SAS Viya. SAS Visual Statistics is an add-on to SAS Visual Analytics on SAS Viya.
• SAS Visual Statistics adds modeling functionality to the SAS Visual Analytics web client. It enables users to experience powerful statistical modeling and machine learning techniques via an easy-to-use, drag-and-drop visual interface.
• The course is hands-on and provides direct access to an environment for practices and practical exposure.

Course Objectives
• Describe the features and benefits of SAS Visual Statistics.
• Cluster the data based on demographic characteristics.
• Perform linear regression analyses.
• Create logistic regression models, including nonparametric models.
• Build generalized linear models and generalized additive models.
• Create a decision-tree analysis.
• Develop models by groups.
• Compare model performance.
• Export score code.
• Create reports in the SAS Visual Analytics environment.

Lesson Objectives
• Discuss the SAS Viya architecture.
• Describe SAS Cloud Analytic Services (CAS).
• Describe the benefits and functionality of SAS Visual Statistics.
• Access SAS Visual Statistics functionality within the SAS Visual Analytics environment.
• Work with the interface.

SAS Viya Architecture
• SAS Cloud Analytic Services (CAS)
• Symmetric Multiprocessing (SMP)
• Massively Parallel Processing (MPP)

SAS Cloud Analytic Services (CAS)
• Resilient
• High speed
• Scalable
• Node-to-node communication
• Fault tolerance

SAS Viya Applications
• SAS Drive
• SAS Visual Analytics
• SAS Data Studio
• SAS Visual Statistics
• SAS Visual Analytics App
• SAS Environment Manager
• SAS Visual Data Mining and Machine Learning
• SAS Cloud Analytic Services (CAS)
• SAS Graph Builder
• SAS Theme Designer

What Is SAS Visual Statistics?
With the SAS Visual Statistics add-on, you can use SAS Visual Analytics to rapidly build and modify predictive models. SAS Visual Statistics models enable you to do the following actions:
• take advantage of the in-memory engine, the CAS server, to quickly and efficiently analyze in-memory data
• enable concurrent access to data for multiple users to formulate and refine models

SAS Visual Statistics
Use SAS Visual Statistics to do the following tasks:
• apply best-in-class analytics to data of any size
• create a partition ID
• perform exploratory modeling
• derive predictive outputs
• perform cluster analysis
• evaluate models and put a champion model into production with existing SAS products

Target Audience
The target audience is analytically minded users who want to build exploratory predictive models to solve specific business problems. These groups are included:
• data miners
• statisticians
• data scientists
• engineers
• researchers
• economists
• biostatisticians
• database marketers

Functional Overview
• Use SAS Visual Analytics to prepare and explore data.
• Access the SAS Visual Statistics functionality for advanced model building.
• Use predictive and exploratory modeling techniques:
  • linear and logistic regression analyses (including nonparametric logistic)
  • generalized linear and additive models
  • decision trees
  • clustering
• Work with detailed data at the observational level.
• Add or remove variables, and filter interactively.
• Create interactions spontaneously.

Functional Overview (continued)
• Identify influential outliers interactively.
• Apply group-by processing (that is, create separate models by group-by variables or segments).
• Perform model comparison via lift charts, ROC charts, concordance statistics, misclassification tables, and other model comparison metrics.
• Export model score code.
• Create data partitions for validation or testing, or both.
• Export summary tables to Microsoft Excel.
• Create a secure, multi-user environment for concurrent access to in-memory analytics via the CAS server.

SAS Drive
• Applications menu
• New content button
• Quick Access area
• Folders and Filters
• Canvas
The applications available depend on your assigned permissions and the products licensed by your site.

Accessing SAS Visual Statistics Functionality
From SAS Drive, you can select New from the Reports tab, select Explore and Visualize from the Applications menu, or select Open from the Actions menu.

SAS Visual Analytics Welcome Window
Use the Welcome to SAS Visual Analytics window to do one of the following tasks:
• start with data to open or begin import of data
• start a new report
• open an existing report

SAS Visual Analytics Interface
• Page tabs
• Left pane with Data pane open
• Canvas with object
• More options
• Right pane with Options pane open
• SAS Visual Statistics Objects

Specifying Settings
Two types of settings are available.
• Global – settings that are shared between all of the SAS web applications
• Application-specific – preferences that are specific to SAS Visual Statistics
  • fit summary p-value precision
  • default statistic for model comparison

Global Settings
Select NAME > Settings from the menu bar.

SAS Visual Statistics Settings
Select NAME > Settings > Visual Statistics.

Example SAS Visual Statistics Session
1. Import or open a data source.
2. Add a Visual Statistics object to the canvas.
3. Assign variables in the role pane.
4. Customize model with selections in the options pane.
5. Examine the output.
6. Enhance the model using interactive features.
7. Save the report.

Idea Exchange
• How many rows does a typical data set (table) that you work with contain?
• How many variables (fields)?

Course Data Introduction
Anonymized and transformed campaign data
• is from the accounts of a large financial-services firm
• includes home equity lines of credit, loans, and other short-term to medium-term credit instruments
• includes more than a half-year time interval
• focuses on direct and indirect promotions
• has three target variables (B_TGT is the focus)
• has an account identifier
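The partitioning task named in the overview (create a partition ID, then hold out data for validation or testing) can be sketched outside the product. The Python below is only an illustration under stated assumptions: the function name assign_partition, the 60/30/10 split, and the 0/1/2 coding are hypothetical and are not SAS Visual Statistics defaults. B_TGT is the binary target from the course data, and the rows are made up for the example.

```python
import random


def assign_partition(rows, fractions=(0.6, 0.3, 0.1), seed=42):
    """Return copies of rows with a hypothetical 'partition_id' column.

    Coding chosen for this sketch: 0 = training, 1 = validation, 2 = test.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    train_cut = fractions[0]
    valid_cut = fractions[0] + fractions[1]
    partitioned_rows = []
    for row in rows:
        u = rng.random()  # uniform draw in [0, 1)
        if u < train_cut:
            pid = 0
        elif u < valid_cut:
            pid = 1
        else:
            pid = 2
        # Copy the row so the input data is left unchanged.
        partitioned_rows.append({**row, "partition_id": pid})
    return partitioned_rows


# Made-up rows: an account identifier and the binary target B_TGT.
accounts = [{"account_id": i, "B_TGT": int(i % 5 == 0)} for i in range(1000)]
partitioned = assign_partition(accounts)
counts = {pid: sum(r["partition_id"] == pid for r in partitioned) for pid in (0, 1, 2)}
print(counts)
```

Each row gets its partition from an independent random draw, so the realized proportions are close to, but not exactly, the requested fractions.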
Introduction to the SAS Visual Statistics Environment
In this demonstration, you access and examine the SAS Visual Statistics interface, customize your preferences, and explore relationships among variables in the course data.

Practice
This practice reinforces the concepts discussed previously.

Questions?

1.2 Managing Reports and Pages

Objectives
• Manage your reports and pages in SAS Visual Analytics.

Managing Reports
Many approaches are available for creating a new report in SAS Visual Analytics. Some users select a data source before they add the report objects. Other users add report objects to the canvas and then select a data source.
Open SAS Visual Analytics. You can do the following actions.
• To create a new report, click Menu in the upper right corner of the report and select New.
• To save a report, click Menu in the upper right corner of the report and select Save or Save As.
• To view a report, click Menu in the upper right corner of the report and select View report.

Managing Pages
Within a report, you can do these actions:
• To add a page, click New page in the upper left corner of the report.
• To duplicate a page, click Options inside a page tab and select Duplicate page.
• From the Outline pane, you can easily do the following activities:
  • add a page
  • delete a page (select it first)
  • hide a page
  • move objects and pages

Questions?
1.3 SAS Viya Architecture

Objectives
• Describe the modeling approach.
• Describe SAS Viya architecture.

Modeling Approach Modification
• Traditional Approach (SEMMA): Sample, Explore, Modify, Model, Assess
• Big Data Approach (EMSMA): Explore, Modify, Segment, Model, Assess

SAS Visual Statistics
Rapid Model Building and Refinement

Big Data and Big Analytics Challenges
• Very computationally intensive model fitting or simulations can cause slow execution.
• Massive amounts of data are stored in databases. Extraction into traditional computing environments requires massive data movement, which results in very poor performance.
• Many data mining algorithms require multiple passes of the data for training models or reloading data into memory across analytical steps. This exacerbates the performance problem.

SAS Viya Functionality
Among the core products in SAS Viya, available functionality is cumulative.
• SAS Visual Analytics provides baseline functionality, including reporting and basic analytics.
• SAS Visual Statistics provides an additional set of advanced analytic functions.
• SAS Visual Data Mining and Machine Learning provides a second additional set of advanced analytic functions.
SAS Viya Functionality
SAS Visual Analytics
• Explore data and discover relationships
• Examine distributions and summary statistics
• Perform post-model analysis and reporting
SAS Visual Statistics
• Build unsupervised and supervised models
• Interactively refine candidate models
• Compare models and generate score code
SAS Visual Data Mining and Machine Learning
• Six additional machine learning models

SAS Viya Interfaces
Multiple Interfaces, Including Visual and Programmatic

SAS Visual Statistics in SAS Viya
Modeling Techniques (Visual Interface)
• Linear Regression
• Logistic Regression
• Nonparametric Logistic
• GLM Regression
• GAM Regression
• Clustering
• Decision Tree
Analytical Procedures (SAS Studio Programmatic Interface)
• GENSELECT (Generalized Linear Model)
• KCLUS (k-means and k-modes Clustering)
• NMF (Nonnegative Matrix Factorization)
• SANDWICH (Sandwich Variance Estimator)
• PCA (Principal Component Analysis)
• LOGSELECT (Logistic Regression)
• NLMOD (Nonlinear Regression)
• REGSELECT (Linear Regression)
• TREESPLIT (Decision Trees)
• PLSMOD (Partial Least Squares)
• QTRSELECT (Quantile Regression)
• SPC (Statistical Process Control)
• LMIXED (Linear Mixed Models)
• MBC (Model-Based Clustering)
• SIMSYSTEM (Simulate Univariate Data)
• GAMMOD (Generalized Additive Model)
• GAMSELECT (Model Selection for GAM)
• PHSELECT (Proportional Hazard Model)
• ICA (Independent Component Analysis)
• MODELMATRIX (Matrix of Covariates)

CAS Actions
CAS actions are the most basic units of a CAS server.
Example actions: tableInfo(), help(), topK(), logistic(), addTable(), shutDown(), ..., genmod(), freq(), ..., glm(), ..., summary(), shuffle(), addNode()

CAS Action Sets
CAS actions are grouped into action sets.
Actions shown: tableInfo(), help(), topK(), logistic(), addTable(), shutDown(), ..., genmod(), freq(), ..., glm(), ..., summary(), shuffle(), addNode()
Action sets shown: Simple, Regression, Table, BuiltIn

SAS Viya Consistency
Different Interfaces, Same Results
• Submit CAS Action
• SAS Visual Analytics, SAS Visual Statistics, SAS Visual Data Mining and Machine Learning, ……
• Return Action Results
• CAS-Enabled Procedures: TREESPLIT, LOGSELECT, GENSELECT, …

SAS Cloud Analytic Services Memory Management
• SAS Cloud Analytic Services is a server that provides the cloud-based, run-time environment for data management and analytics with SAS.
• The server can run on a single machine or as a distributed server on multiple machines.
• The distributed server has a communication layer that supports fault tolerance.
• The data on the server is managed in blocks in order to handle large problems and tables that exceed the memory capacity.
• Whenever necessary, the server caches the blocks on disk.
• SAS Viya is a unified, third-generation, high-performance analytics engine.

Questions?
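The two slides on CAS actions and action sets can be pictured as a simple lookup structure. The sketch below is plain Python and not a CAS client: the dictionary ACTION_SETS and the helper find_action_set are hypothetical names, and the assignment of each action to a set is an assumption inferred from the action names and the four set labels on the slide (Simple, Regression, Table, BuiltIn), which the slide text itself does not spell out.

```python
# Assumed grouping of the actions named on the slide into the four
# action-set labels shown there. The mapping is an illustration only.
ACTION_SETS = {
    "Simple": ["freq", "summary", "topK"],
    "Regression": ["glm", "genmod", "logistic"],
    "Table": ["tableInfo", "addTable", "shuffle"],
    "BuiltIn": ["help", "shutDown", "addNode"],
}


def find_action_set(action_name):
    """Return the label of the action set containing action_name, or None."""
    for set_name, actions in ACTION_SETS.items():
        if action_name in actions:
            return set_name
    return None


for name in ("glm", "tableInfo", "summary", "help"):
    print(name, "->", find_action_set(name))
```

Grouping actions this way mirrors the idea on the slide: a client asks for an action by name, and the action set is the unit in which related actions are organized.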