Introduction to Data Science and Analytics
Stephan Sorger
www.StephanSorger.com
Unit 1. Introduction
Disclaimer:
• All images such as logos, photos, etc. used in this presentation are the property of their respective copyright owners and are used here for educational purposes only
• Some material adapted from: Sorger, Stephan. “Marketing Analytics: Strategic Models and Metrics.” Admiral Press. 2013.
© Stephan Sorger 2016; www.StephanSorger.com; Data Science: Introduction 1
Outline/ Learning Objectives
Topic: Description
Definition: Applications to gain valuable insight
Topics: Specific areas covered in the course
Trends: Timely trends driving adoption of data science
Decision Models: Decision models; Terminology; Forms; Types
Predictive Analytics: Applications; Methods
Data Science: Introduction
Topic: Description
Definition: Application of technologies, techniques, and tools to data to provide actionable insight
Coverage:
Excel 1: Essentials: Formulas, Charts, Tips and Tricks
Excel 2: Tools: Solver, Statistics, etc.
Excel 3: Regression: R-squared, F tests, t tests, p-values
Excel 4: Forecasting: Time series; Multivariate
SQL (and Excel): Dipping into back-end databases
R Basics: Basic commands; Regression
R Applications: Segmentation
Trends Driving Data Science Adoption
Trends feeding Data Science Adoption:
Accountability: Improve productivity; Reduce costs; “What gets measured gets done”
Online Data Availability: Cloud-based data storage; Online = speed; Online = convenience
Data-Driven Presentations: Data to back up proposals; Predict success of plans
Reduced Resources: Do more with less; Scrutinized budgets; Scientists must show outcomes
Massive Data: Initiatives to capture customer information; What to do with all that data?
Data Scientists; Based on “Big Bang Theory” Characters
Dr. Sheldon Cooper, Theoretical Physicist: Theoretical Data Scientist; Machine learning; AI
Dr. Leonard Hofstadter, Experimental Physicist: General Data Scientist; Data mining
Howard Wolowitz, Engineer, Applied Physics: Data Scientist/ Engineer; Software development; Browser technology
Decision Models: Definition
Topic: Description
Model: Simplified representation of reality to solve problems; Evaluate effect of changes in input variables; Models provide guidance on business decisions
Example: Model showing changes in sales as we increase number of features; Add features to product, but too many features -> feature bloat, and sales can actually start to decrease
[Figure: curve of Sales Revenue against number of features, labeled A]
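The feature-bloat example can be sketched as a small model. This is only an illustration: the quadratic form and the parameter values a, b, and c are assumptions chosen so that revenue rises and then falls, not figures from the course.

```python
def sales_revenue(features, a=100.0, b=40.0, c=4.0):
    """Hypothetical model: revenue = a + b*features - c*features**2.

    The -c*features**2 term stands in for feature bloat: each extra
    feature helps less, and eventually hurts. All parameters are made up.
    """
    return a + b * features - c * features ** 2


# Evaluate the model for 0 to 10 features and pick the best count.
revenue_by_features = {n: sales_revenue(n) for n in range(11)}
best_features = max(revenue_by_features, key=revenue_by_features.get)

for n, revenue in revenue_by_features.items():
    print(f"{n:2d} features -> revenue {revenue:6.1f}")
print("Revenue peaks at", best_features, "features")
```

With these assumed parameters, revenue climbs at first and then declines as more features are added, which is the shape the slide describes.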
Decision Models: Styles
Topic: Description
Verbal: Expressed in words; “Sales is influenced by product features”
Pictorial: Expressed in pictures; Chart or graph of phenomenon
Mathematical: Expressed in equation; Sales = a + b * Features
[Figure: the three styles side by side: Verbal, Pictorial, Mathematical (Sales = f(features))]
Decision Models: Forms
Topic: Description
Descriptive: Characterize (describe) phenomenon; Identify causal relationships and relevant variables; Example: Descriptive equation: Sales = a*Features + b*Advertising + c*…
Predictive: Determine likely outcomes given certain inputs; Classic “What If?” spreadsheet exercise; Example: Spreadsheet to test different scenarios; What if we increase budget?
Normative: Decide best course of action to maximize objective, given fixed constraints; “Given X, what should I do?”; Example: Linear programming model
[Figure: illustrations of the Descriptive, Predictive, and Normative forms; labels: Features, Ads, This Way]
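The three forms can be contrasted in one short sketch. Everything numeric here is an assumption for illustration: the coefficients, costs, budget, and limits are invented. The normative step uses brute-force enumeration over a tiny grid as a simple stand-in for the linear programming model named on the slide.

```python
from itertools import product

# Descriptive: a hypothetical equation Sales = a*Features + b*Advertising.
A_PER_FEATURE = 30.0  # assumed sales lift per product feature
B_PER_AD = 20.0       # assumed sales lift per unit of advertising


def predicted_sales(features, ads):
    return A_PER_FEATURE * features + B_PER_AD * ads


# Predictive: classic "What if?" scenarios, as in a spreadsheet.
scenarios = {"current": (2, 3), "more ads": (2, 5), "more features": (4, 3)}
for name, (features, ads) in scenarios.items():
    print(f"{name}: predicted sales = {predicted_sales(features, ads)}")

# Normative: best plan that maximizes sales under fixed constraints.
FEATURE_COST, AD_COST, BUDGET = 5.0, 2.0, 20.0  # assumed costs and budget
MAX_FEATURES, MAX_ADS = 4, 5                    # assumed upper limits

feasible_plans = [
    (features, ads)
    for features, ads in product(range(MAX_FEATURES + 1), range(MAX_ADS + 1))
    if FEATURE_COST * features + AD_COST * ads <= BUDGET
]
best_plan = max(feasible_plans, key=lambda plan: predicted_sales(*plan))
print("Best (features, ads) within budget:", best_plan)
```

Enumeration only works for very small problems like this one; a real normative model would normally hand the same objective and constraints to a solver such as Excel Solver.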
Terminology: Linear Equation
Y = a + b*X
Y = Dependent Variable; Response/Output
X = Independent Variable; Input
a = Parameter: Y-intercept (Y value when X = 0)
b = Parameter: Slope = rise/run = b/1
[Figure: line Y = a + b*X, with X, the Independent Variable (Input), on the horizontal axis and Y, the Dependent Variable (Response), on the vertical axis]
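As a sketch of how the parameters a and b might be estimated from data, the snippet below fits Y = a + b*X by ordinary least squares. The helper name fit_line and the sample numbers are illustrative assumptions; the data is made up so that it lies exactly on a line.

```python
def fit_line(xs, ys):
    """Return (a, b) for Y = a + b*X using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = covariance of X and Y divided by variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept a = value of Y when X = 0.
    a = mean_y - b * mean_x
    return a, b


# Made-up data: X = number of features, Y = sales.
features = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

a, b = fit_line(features, sales)
print(f"Y-intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"Predicted sales at X = 6: {a + b * 6:.2f}")
```

Here the slope b is the rise in sales for each one-unit run in features, matching the rise/run definition above.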
Decision Models: Variables
Topic: Description
Variable: Quantity that can be changed, or varied; Examples: Advertising budget, Sales
Independent Variable: Variable whose value affects dependent variable; Controllable: Product features, Number of emails sent; Non-controllable: Customer age, Interest rates
Dependent Variable: Variable representing response (y, or output); Responds to changes in independent variable; What we want to produce; Our objective: sales, customer adoption, ...
Data Science: Predictive Analytics
Trends Driving Predictive Analytics:
Technology: Cloud computing, Cheap storage
Growth Demands: Looking for growth opportunities
Data Availability: Terabytes of customer data
Competitive Advantage: Powerful tool to target niches
Data Science: Predictive Analytics
Predictive Analytics Applications:
Airlines: Predict maintenance before failure
Banking: FICO scores
Collections: Predict which customers will pay
Customer Profitability: Identify profitable customers
Fraud Detection: Predict fraudulent claims
Healthcare: Predict at-risk patients
Cross-Selling: “Customers who bought X bought Y”
Insurance: Assign prices to policies
Data Science: Data Mining
Step: Description
Selection: Select portion of data to target
Pre-Processing: Data cleansing; Removing duplicate records
Transformation: Sorting; Pivoting; Aggregation; Merging
Data Mining: Find patterns in data
Interpretation: Form judgments based on the patterns
Flow: Data -> Selection -> Target Data -> Pre-Processing -> Pre-Processed Data -> Transformation -> Transformed Data -> Data Mining -> Patterns -> Interpretation -> Actionable Information
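The five steps can be walked through on a toy dataset. This is a minimal sketch: the records, field names, and the choice of "region with the highest total" as the pattern are all made up for illustration, and real pattern finding would be far richer.

```python
from collections import defaultdict

# Made-up raw data; amounts arrive as text, and one record is duplicated.
raw_data = [
    {"customer": "C1", "region": "West", "amount": "120"},
    {"customer": "C2", "region": "East", "amount": "80"},
    {"customer": "C2", "region": "East", "amount": "80"},
    {"customer": "C3", "region": "West", "amount": "200"},
    {"customer": "C4", "region": "South", "amount": "50"},
]

# Selection: choose the portion of data to target (two regions).
target_data = [r for r in raw_data if r["region"] in {"West", "East"}]

# Pre-Processing: data cleansing by removing duplicate records.
seen = set()
preprocessed_data = []
for record in target_data:
    key = tuple(sorted(record.items()))
    if key not in seen:
        seen.add(key)
        preprocessed_data.append(record)

# Transformation: aggregate purchase amounts by region.
totals_by_region = defaultdict(float)
for record in preprocessed_data:
    totals_by_region[record["region"]] += float(record["amount"])

# Data Mining: find a (deliberately trivial) pattern in the data.
top_region = max(totals_by_region, key=totals_by_region.get)

# Interpretation: turn the pattern into actionable information.
print(f"Highest spend is in {top_region}; consider focusing campaigns there.")
```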
Data Science: Data Mining
Data Mining Approaches:
Association Rule Learning: Search for associations in data; Seek products purchased together
Classification: Sorts data into different categories; Have prior knowledge of patterns; Spam filtering
Clustering: Identify patterns in data; No prior knowledge of patterns
Regression: Find relationships between variables
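To make association rule learning concrete, the sketch below counts which products are purchased together and computes two common rule measures, support and confidence. The baskets and the helper names support and confidence are illustrative assumptions; this is simple pair counting, not a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

# Count how often each item and each pair of items appears.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(x, y):
    """Share of all baskets that contain both x and y."""
    return pair_counts[tuple(sorted((x, y)))] / len(baskets)


def confidence(x, y):
    """Share of baskets containing x that also contain y."""
    return pair_counts[tuple(sorted((x, y)))] / item_counts[x]


print("support(bread, butter)    =", support("bread", "butter"))
print("confidence(bread, butter) =", confidence("bread", "butter"))
print("confidence(milk, butter)  =", confidence("milk", "butter"))
```

A rule with high confidence, such as "customers who bought milk also bought butter" in this toy data, is the kind of association that supports cross-selling.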