ANALYTICS AND OPEN DATA THROUGH A
CASE STUDY
SAS MIDDLE EAST
CAREL BADENHORST
HEAD OF INFORMATION TECHNOLOGY PRACTICE
MIDDLE EAST
Copyright © 2013, SAS Institute Inc. All rights reserved.
SAS AGENDA
• Analytics and Open Data
• Analytics example - UN Global Pulse Case Study
ANALYTICS LIFE CYCLE….NOT BI
TEXT ANALYTICS
Discover relevant themes and
relationships in social media, call
notes and email for deeper insights
and improved business
management
FORECASTING
Leveraging historical time
series data to drive better
insight into decision-making
for the future
INFORMATION
MANAGEMENT
OPTIMIZATION
Make appropriate
business decisions by
understanding
dynamics and utilizing
resources in the best way
DATA MINING
Understand and find
relationships in data to make
accurate predictions about
the future
STATISTICS
Copyright © 2011, SAS Institute Inc. All rights reserved.
SAS/ UN GLOBAL
PULSE
BACKGROUND OF THE CASE STUDY
The UN Global Pulse-SAS research set out to answer a few questions:
• Does the sum total of what we say online add up to anything meaningful?
• Do online conversations correlate in any way with official government statistics?
• Specifically, can unemployment patterns be predicted from certain chatter topics and correlated with government open data to derive meaningful statistics?
SAS/ UN GLOBAL
PULSE
METHODOLOGY
• Online social media conversations over a period in the US and Ireland
• Text Analytics: words used in each conversation were mined in order to assign one or more topical categories
• Sentiment Analysis undertaken to classify conversations as happy, sad, anxious, etc.
• Mood Scoring based on conversations
• Dynamic Correlation between mood scores and unemployment scores
• Government Open Data to validate the experiment, i.e. employment history statistics
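The text analytics, sentiment analysis and mood scoring steps above can be illustrated with a minimal Python sketch. It assumes a simple keyword lookup; the lexicons, the function names and the sample conversations are all hypothetical and are not the taxonomy or the SAS tooling used in the study.

```python
import re
from collections import Counter

# Hypothetical keyword lexicons, for illustration only.
TOPIC_KEYWORDS = {
    "groceries": {"groceries", "coupons", "food"},
    "transportation": {"bus", "car", "commute"},
    "housing": {"foreclosure", "rent", "eviction"},
}
MOOD_KEYWORDS = {
    "anxious": {"worried", "nervous", "afraid"},
    "sad": {"sad", "depressed", "hopeless"},
    "happy": {"happy", "glad", "excited"},
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def label(text, lexicon):
    """Return every lexicon category with at least one keyword in the text."""
    tokens = tokenize(text)
    return sorted(name for name, words in lexicon.items() if tokens & words)


def mood_scores(conversations):
    """Share of conversations that carry each mood label."""
    total = len(conversations)
    if total == 0:
        return {}
    counts = Counter()
    for text in conversations:
        counts.update(label(text, MOOD_KEYWORDS))
    return {mood: counts[mood] / total for mood in MOOD_KEYWORDS}


# Made-up sample conversations (not data from the study).
sample = [
    "Worried about rent this month, cutting back on groceries",
    "So depressed, had to sell the car and take the bus",
    "Excited about the weekend",
    "Nothing much going on today",
]
topics = [label(text, TOPIC_KEYWORDS) for text in sample]
scores = mood_scores(sample)
```

Here a conversation can receive more than one topical category, as the methodology describes, and the mood score is simply the fraction of conversations tagged with each mood. A production system would likely use a trained classifier rather than fixed keyword sets.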
SAS SAMPLE INSIGHT GENERATED FROM THE RESEARCH
RESULTS
• An uptick in social media conversation on topics such as cutting back on groceries and other essentials, or downgrading one's mode of transportation, can predict an impending unemployment spike.
• After a spike, an increase in chatter about foreclosures, reduced spending on health care and canceled vacations can offer insights into the effects of a down economy.
• Better understanding of demographic areas, gender, age and income characteristics based on social techniques such as mood scoring.
SAS SAMPLE INSIGHT GENERATED FROM THE RESEARCH
In the US:
• Huge increase in depressed mood conversations four months before a spike in unemployment (calculated and validated at 95 percent confidence).
• Talk about loss of housing increases two months after an unemployment spike (calculated and validated at 95 percent confidence).
• Talk about auto repossession increases three months after an unemployment spike (calculated and validated at 95 percent confidence).
In Ireland:
• Anxious moods increase five months before a spike in unemployment (calculated and validated at 90 percent confidence).
• Talk about travel cancellations increases three months after an unemployment spike (calculated and validated at 95 percent confidence).
• Talk about changing housing situations for the worse increases eight months after unemployment increases (calculated and validated at 90 percent confidence).
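Lead and lag findings of this kind can be explored with a lagged correlation: shift the monthly mood series against the unemployment series and see which shift correlates best. The sketch below is one simple way to do that and is an assumption, not the "dynamic correlation" procedure used in the research. The two series are synthetic, constructed so that unemployment follows mood by four months, and no significance test is included.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def lagged_correlations(signal, target, max_lag):
    """Correlate signal[t] with target[t + lag] for lag = 0..max_lag months."""
    result = {}
    for lag in range(max_lag + 1):
        xs = signal[: len(signal) - lag]
        ys = target[lag:]
        result[lag] = pearson(xs, ys)
    return result


def best_lead(signal, target, max_lag):
    """Lag (in months) at which the signal best anticipates the target."""
    corrs = lagged_correlations(signal, target, max_lag)
    return max(corrs, key=corrs.get)


# Synthetic monthly data (NOT from the study): a mood spike in months 5-7.
mood = [0.10, 0.11, 0.10, 0.12, 0.11, 0.30, 0.35, 0.28, 0.12,
        0.11, 0.10, 0.12, 0.11, 0.10, 0.11, 0.12, 0.10, 0.11]
# Unemployment is built to echo the mood series four months later.
unemployment = [6.1] * 4 + [5.0 + 10 * m for m in mood[:-4]]

lead_months = best_lead(mood, unemployment, max_lag=8)
```

With these constructed series, `lead_months` comes out as 4, because the unemployment series is an exact linear function of the mood series shifted by four months. Real data would be noisier, and a confidence level such as those quoted above would need a proper statistical test.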
SAS TRANSLATING INTO
In the US:
• 95 percent confidence EARLY WARNING SIGN KPI (four months) for unemployment increase
• 95 percent confidence EARLY WARNING SIGN KPI (six months: four plus two months) for mortgage repayment default increase (down to the demographics)
• 95 percent confidence EARLY WARNING SIGN KPI (seven months: four plus three months) for car manufacturers and retail regarding new sales and potential default increase (down to the demographics)
• Increased potential in social welfare needs, down to a specific demographic level
...and the most important value:
• Using further analytics (statistical, data mining, prediction and optimization algorithms) to start predicting pre-emptive actions and their outcomes in case these patterns are detected
• Analytics is amazing if you allow it to tell you stories...
THE STORIES ANALYTICS WILL HELP YOU TELL
USING OPEN DATA ARE ENDLESS…
QUESTIONS?
sas.com