ANALYTICS AND OPEN DATA THROUGH A CASE STUDY
SAS MIDDLE EAST
CAREL BADENHORST
HEAD OF INFORMATION TECHNOLOGY PRACTICE, MIDDLE EAST
Copyright © 2013, SAS Institute Inc. All rights reserved.

AGENDA
• Analytics and Open Data
• Analytics example: UN Global Pulse Case Study

ANALYTICS LIFE CYCLE... NOT BI
• TEXT ANALYTICS: Discover relevant themes and relationships in social media, call notes and email for deeper insights and improved business management
• FORECASTING: Leverage historical time series data to drive better insight into decision-making for the future
• INFORMATION MANAGEMENT
• OPTIMIZATION: Make appropriate business decisions by understanding dynamics and using resources in the best way
• DATA MINING: Understand and find relationships in data to make accurate predictions about the future
• STATISTICS
Copyright © 2011, SAS Institute Inc. All rights reserved.

SAS/UN GLOBAL PULSE
BACKGROUND OF THE CASE STUDY
The UN Global Pulse-SAS research posed a few questions:
• Does the sum total of what we say online add up to anything meaningful?
• Do online conversations correlate in any way with official government statistics?
• Specifically, can unemployment patterns be predicted from certain chatter topics and correlated with government open data to derive meaningful statistics?

SAS/UN GLOBAL PULSE
METHODOLOGY
• Dynamic correlation of mood scores with unemployment scores
• Online social media conversations over a period in the US and Ireland
• Government open data to validate the experiment, i.e.
employment history statistics
• Text analytics: the words used in each conversation were mined to assign one or more topical categories
• Sentiment analysis: conversations were classified as happy, sad, anxious, etc.
• Mood scoring based on the conversations

SAMPLE INSIGHT GENERATED FROM THE RESEARCH RESULTS
• An uptick in social media conversation on topics such as cutting back on groceries and other essentials, or downgrading one's mode of transportation, can predict an impending unemployment spike.
• After a spike, an increase in chatter about foreclosures, reduced spending on health care and canceled vacations can offer insights into the effects of a down economy.
• Better understanding of demographic areas and of gender, age and income characteristics, based on social techniques such as mood scoring.

SAMPLE INSIGHT GENERATED FROM THE RESEARCH
In the US:
• Huge increase in depressed-mood conversations four months before a spike in unemployment (calculated and validated at 95 percent confidence).
• Talk about loss of housing increases two months after an unemployment spike (calculated and validated at 95 percent confidence).
• Talk about auto repossession increases three months after an unemployment spike (calculated and validated at 95 percent confidence).
In Ireland:
• Anxious moods increase five months before a spike in unemployment (calculated and validated at 90 percent confidence).
• Talk about travel cancellations increases three months after an unemployment spike (calculated and validated at 95 percent confidence).
• Talk about changing housing situations for the worse increases eight months after unemployment increases (calculated and validated at 90 percent confidence).
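The lead and lag findings above come from correlating a monthly mood or topic score with the unemployment series at different time offsets. The slides do not describe the SAS implementation, so the following is only a minimal sketch of one common way to do this, a lagged Pearson correlation. The function names (`pearson`, `lagged_correlations`, `best_lead`) and the data are hypothetical and synthetic, not taken from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlations(mood, unemployment, max_lag):
    """Correlate mood[t] with unemployment[t + lag] for each lag in [-max_lag, max_lag].

    A positive lag means the mood series leads unemployment by that many months.
    """
    if len(mood) != len(unemployment):
        raise ValueError("series must cover the same months")
    n = len(mood)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = mood[: n - lag], unemployment[lag:]
        else:
            a, b = mood[-lag:], unemployment[: n + lag]
        result[lag] = pearson(a, b)
    return result


def best_lead(mood, unemployment, max_lag):
    """Return the lag (in months) with the highest correlation."""
    corrs = lagged_correlations(mood, unemployment, max_lag)
    return max(corrs, key=corrs.get)


# Synthetic monthly series (assumption): unemployment echoes mood four months later.
mood = [0, 1, 3, 1, 0, 0, 0, 2, 5, 2, 0, 0, 0, 0, 1, 4, 1, 0, 0, 3, 6, 2, 0, 0]
unemployment = [0] * 4 + mood[:-4]
print(best_lead(mood, unemployment, max_lag=6))
```

On this synthetic pair the strongest correlation falls at the four-month lead that was built into the data. A real analysis would also need a significance test for each lag, which is presumably what the 95 and 90 percent figures on the slides refer to.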
TRANSLATING INTO
In the US:
• 95 percent confidence EARLY WARNING SIGN KPI (four months) for an unemployment increase
• 95 percent confidence EARLY WARNING SIGN KPI (six months: four plus two months) for an increase in mortgage repayment defaults (down to the demographics)
• 95 percent confidence EARLY WARNING SIGN KPI (seven months: four plus three months) for car manufacturers and retailers regarding new sales and a potential increase in defaults (down to the demographics)
• Increased potential social welfare needs, down to a specific demographic level
...and the most important value:
• Using further analytics (statistical, data mining, prediction and optimization algorithms) to start predicting pre-emptive actions and their outcomes when these patterns are detected
• Analytics is amazing if you allow it to tell you stories...

THE STORIES ANALYTICS WILL HELP YOU TELL USING OPEN DATA ARE ENDLESS...

QUESTIONS?

sas.com