Finding Hidden Intelligence with Predictive Analysis of Data Mining

Rafal Lukawiecki
Strategic Consultant, Project Botticelli Ltd
rafal@projectbotticelli.com

Objectives
• Show use of Microsoft SQL Server 2008 Analysis Services Data Mining
• Tantalise you with the power of DM

This seminar is based on a number of sources, including a few dozen Microsoft-owned presentations, used with permission. Thank you to Marin Bezic, Kathy Sabourin, Aydin Gencler, Bryan Bredehoeft, and Chris Dial for all the support. Thank you to Maciej Pilecki for assistance with demos.

The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation.

Portions © 2009 Project Botticelli Ltd & entire material © 2009 Microsoft Corp. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.

Agenda
• Data Mining and Predictive Analytics
• Server and Process Considerations
• Scenarios & Demos

What does Data Mining Do?
• Explores Your Data
• Finds Patterns
• Performs Predictions

Typical Uses
Data Mining:
• Seek Profitable Customers
• Correct Data During ETL
• Understand Customer Needs
• Detect and Prevent Fraud
• Build Effective Marketing Campaigns
• Anticipate Customer Churn
• Predict Sales & Inventory

Server Mining Architecture
Diagram labels: BIDS; Excel; Visio; SSMS; Excel/Visio/SSRS/Your App; OLE DB/ADOMD/XMLA; App Data; Deploy; Analysis Services Server; Mining Model; Data Mining Algorithm; Data Source

Mining Process
Diagram labels: Training data; Mining Model; Data to be predicted; DM Engine; Mining Model; Mining Model; With predictions

SCENARIO: CUSTOMER CLASSIFICATION & SEGMENTATION
Who are our customers? Are there any relationships between their demographics and their buying power?

Microsoft Decision Trees
• Use for:
  • Classification: churn and risk analysis
  • Regression: predict profit or income
  • Association analysis based on multiple predictable variables
• Builds one tree for each predictable attribute
• Fast

Decision Trees for Classification of Customers’ Buying Potential

SCENARIO: PROFITABILITY AND RISK
Who are our most profitable customers? Can I predict profit of a future customer based on demographics? Are they creditworthy? How much should I charge them to give a good loan and protect against losses?

Profitability and Risk
• Finding what makes a customer profitable is also classification or regression
• Typically solved with:
  • Decision Trees (Regression), Linear Regression,
  • and Neural Networks or Logistic Regression
• Often used for prediction
  • Important to predict the probability of the predicted outcome, or the expected profit
• Risk scoring
  • Logistic Regression and Neural Networks

Neural Network & Logistic Regression
• Applied to
  • Classification
  • Regression
• Great for finding complicated relationships among attributes
• Difficult to interpret results
• Gradient Descent method
• LR is NNet with no hidden layers
Diagram labels: Input Layer (Age, Education, Sex, Income); Hidden Layers; Output Layer (Loyalty)

1.
Neural Networks for Profitability Analysis
2. Predicting Lending Risk with Neural Networks

SCENARIO: CUSTOMER NEEDS ANALYSIS
How do they behave? What are they likely to do once they have bought that really expensive car? Should I intervene?

Sequence Clustering
• Analysis of:
  • Customer behaviour
  • Transaction patterns
  • Click stream
  • Customer segmentation
  • Sequence prediction
• Mix of clustering and sequence technologies
• Groups individuals based on their profiles including sequence data

Analysing Customer Behaviour with Sequence Clustering

SCENARIO: FORECASTING
What are my sales going to be like in the next few months? Will I have credit problems? Will my server need an upgrade in the next 3 months?

Time Series
• Uses:
  • Forecast sales
  • Inventory prediction
  • Web hits prediction
  • Stock value estimation
• Regression trees with extras

Forecasting Using Time Series

Summary
• Data Mining is a powerful, predictive technology
• Turns data into valuable, decision-making knowledge
• SQL Server 2008 Analysis Services supports Predictive Analytics
• Mine your mountains of data for gems of intelligence today!

Summary and Q&A
Rafal Lukawiecki
Strategic Consultant, Project Botticelli Ltd
rafal@projectbotticelli.com

BI & PM in an Enterprise
1: Clients need access to data
2: Clients may access data sources directly
3: Data sources can be mirrored/replicated to reduce contention
4: The data warehouse manages data for analyzing and reporting
5: Data warehouse is periodically populated from data sources
6: Staging areas may simplify the data warehouse population
7: Manual cleansing may be required to cleanse dirty data
8: Clients use various tools to query the data warehouse
9: Delivering BI enables a process of continuous business improvement
Diagram labels: Data Sources; Staging Area; Manual Cleansing; Data Warehouse; Data Marts; Client Access

Want Powerful BI Applications?
• You need a well-designed Data Warehouse!
• Want BI Apps quickly with self-service abilities?
• Ensure good dimensional design:
  • Easy to understand for a knowledge worker
  • Flexible
  • Correct and aligned

Three Contexts of BI Use
1. Personal BI: Built by me, for me, used only by me
2. Team BI: Built by someone on the team, for the team’s use
3. Organizational BI: Built and maintained by IT, for use across company

Integrated BI Platform

Resources
• Project Botticelli at your service!
  • Training, mentoring, “do-it-with-you” on-the-job assistance with all BI and SQL needs
  • Email me at rafal@projectbotticelli.com
• Home: www.microsoft.com/bi
• Demos on www.sqlserveranalysisservices.com, www.sqlserverdatamining.com, www.codeplex.com
• More demos and sessions at www.microsoft.com/technetspotlight

Q&A

Thank You!
Please email your comments or requests to rafal@projectbotticelli.com

© 2009 Microsoft Corporation & Project Botticelli Ltd. All rights reserved.
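
A footnote to the Neural Network & Logistic Regression slide, which states that "LR is NNet with no hidden layers" and names the Gradient Descent method: the sketch below illustrates that idea in plain Python. It is not the Microsoft Neural Network or Microsoft Logistic Regression implementation in Analysis Services; it is a minimal sketch that assumes a single sigmoid output unit trained by batch gradient descent on log-loss. The function names (`train_logistic`, `predict_proba`), the learning rate, the epoch count, and the toy data (two scaled inputs standing in for attributes such as Age and Income, with a 0/1 Loyalty label) are all made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation used by the single output unit."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Inputs feed the output unit directly: there is no hidden layer."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on log-loss (hypothetical hyperparameters)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y  # dLoss/dz for log-loss
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Step against the average gradient.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Made-up toy data: (scaled age, scaled income) -> loyal (1) or not (0).
ROWS = [(0.2, 0.1), (0.3, 0.2), (0.4, 0.3), (0.6, 0.7), (0.7, 0.8), (0.9, 0.9)]
LABELS = [0, 0, 0, 1, 1, 1]

WEIGHTS, BIAS = train_logistic(ROWS, LABELS)
HIGH = predict_proba(WEIGHTS, BIAS, (0.9, 0.9))  # resembles the loyal rows
LOW = predict_proba(WEIGHTS, BIAS, (0.2, 0.1))   # resembles the non-loyal rows
```

Inserting one or more layers of such units between the inputs and the output would turn this into the multi-layer network drawn on the slide, which is where the extra modelling power and the harder-to-interpret results come from.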