Adding Text to the Analytics Mix
How to Blend Insights from Text, Sentiment, and Traditional Data Mining for Better Analytics

Osvaldo Driollet, Ph.D., Principal Scientist - Director, FICO
Colette Glaeser, Product Management Director, FICO

© 2014 Fair Isaac Corporation. Confidential. This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.

Key Messages
Turning unused data into dollars requires a powerful and intuitive approach to unlocking valuable hidden insights from mixed, text-rich data, to enable better modeling strategies and business decisions.
1. An integrated text mining and Natural Language Processing (NLP) approach extracts valuable hidden information from vast amounts of different kinds of data, from inside and outside the enterprise.
2. Turn-key services can automate the analysis and exploration of unstructured data and unleash the power of Visual Analytics to reveal insights.
3. Better business decisions are the natural consequence of better data insights.

The Unstructured Data Opportunity: Billions of Lost Insights
Big Data: 85% unstructured
Data sources around FICO® Analytic Modeler for Text (increasing data variety and complexity):
• Enterprise Data
• Internal Customer Data: Purchase History, Customer Profiles, Loyalty Programs, Call Center Data, Customer Communications
• External Customer Data: Geographic, Demographic, Consumer Reports
• User Generated Content: Ratings and Reviews, Forums
• Social Data: Twitter, Facebook, LinkedIn, Google+

Better Insights for Better Decisions and Improved Performance
From Information to Insights to Decisions to Actions
• FICO® Analytic Modeler for Text provides insights for: What happened? Why did it happen?
• FICO® Solution Stack supports decisions for: What will happen? What should I do? Can I do more?

Analytics type            Question answered
Descriptive Analytics     What happened?
Diagnostic Analytics      Why did it happen?
Predictive Analytics      What will happen?
Prescriptive Analytics    What should I do?
Pre-emptive Analytics     Can I do more?

Example questions across these stages: How many? How often? Where? What exactly is the problem? Who is involved? What if these trends continue? What could happen if…? What is next? What customers should I target? What actions are needed? How should I do it? Which is the best possible outcome? What can I offer the customer before they realize the need?
Better decisions through better data insights

Data Insights from FICO® Insight Miner Can Lead to Better Business Decisions by Helping
• Detect Fraud
• Curb Customer Attrition
• Minimize Risk
• Increase Response Rate for Marketing Campaigns
• Create Better Predictive Models
• Explore Customer Experience Management (CEM) strategies
• Improve Business Assessment
• Add Social Media insights into business intelligence schemes
• Anticipate Resource Demands

FICO® Analytic Modeler for Text
Visualization and Discovery
• Interactive Data Visualizations
• Customizable Dashboards, Queries and Filters
• Interactive Insights
Business Solutions
• Fraud and Risk Mitigation
• Predictive Modeling
• Insurance Assessment
• Marketing Campaigns
• Customer Experience Management
• Social Media Analysis
Analytic Services
• Text Mining
• Statistical Analysis
• Entity Extraction
• Sentiment Analysis
Data Access
• Preprocessing, Indexing and Metadata support
• Multi-format support
• Multi-language support
• Security Layer

Demo of Selected Use Cases
• Getting Started (application overview and data access)
• Core Text Analysis
• Named Entity Recognition
• Sentiment Analysis
• Visualization of Results
• Discovery

FICO® Analytic Modeler for Text: The Power to Discover
o Offers a rich and easy-to-use set of integrated data mining tools
o Automates the analysis and exploration of unstructured data
o Extracts valuable insights through embedded Visual Analytics.
o Discovers hidden patterns and sentiment buried in text with less time and effort.
o Enables more sophisticated descriptive and predictive models when used with the FICO Solution Stack and the FICO Analytic Cloud
o Facilitates actionable intelligence for improved business performance.

Demo Use Case (Public Data)
National Highway Traffic Safety Administration (NHTSA) Database
http://www.safercar.gov/Vehicle+Owners

You will see (with a few clicks)
• How to use the out-of-the-box Predictive Mining Panel to unleash the power of unstructured data
• How to add Interactive Visual Analytics to your business applications
• How to customize and profile your data and find the ‘needle in the haystack’
• How to use the FICO Solution Stack to transform the discovered insights into more profitable decisions
• How easily and quickly Insurance Assessment, Survey Analysis, Brand Reputation, and Predictive Modeling applications can be built.

Data Description
Automotive complaint and defective parts database including car makers, brand names, accident occurrence, injuries, fatalities, drivers’ comments, dealer info, etc.
• About 1 million complaints from 1980 to 2013
• 47 data fields
• 4 categorical indicators (Crash, Fire, Injured, and Fatalities)
• 2 text fields:
  - Failing Component Description
  - Complaint Description
For demo purposes, 3 random samples of about 70,000 data records were generated:
• NHTSA 1980-2013
• NHTSA 1980-1999
• NHTSA 2000-2013

And we are working to add…

FICO® Analytic Modeler for Text: The Power to Discover
FICO® Analytic Modeler for Text integrates Data Mining, Statistical Analysis, NLP and Machine Learning under the same roof.
• Visual Data Exploration
• Dedicated Dashboards
• Normalization
• Preprocessing and Filtering
• Feature Selection
• Text Mining
• Time Series Analysis
• Content Analysis and Extraction
• Automatic Summarization
• Natural Language Processing
• Information Retrieval
• Information Theory
• Statistical Analysis
• Ontology Management
• Topic Identification
• Entity Extraction
• Sentiment Analysis
• Machine Learning for:
  - Classification
  - Clustering
  - Recommendation
  - Topic Modeling

Demo Use Case (Enterprise Data)
AAA Michigan Insurance

You will see (with a few clicks)
• How to extract the following entities using the Natural Language Processing (NLP) features embedded in the product:
  1) Persons
  2) Locations
  3) Organizations
  4) Dates
  5) Time References
  6) Percentages
  7) Money Amounts
• How to evaluate the customer’s sentiment using FICO Interactive Visual Analytics
• How to use the discovered insights to build CEM/CRM, Business Intelligence, Survey Review, and Predictive Modeling applications.

Data Description
Database of insurance claims formed by 20+ relational tables with information about policies, vehicles, losses, claims, and communications with clients (email and voice transcripts).
• About 20,000 complete records, with hundreds of data fields
• For demo purposes, only unique accounts with relevant fields were used
• 2000 records with 28 data columns
• 1 text field:
  - Email communications with client
  - Phone call transcripts
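Three of the entity types listed in the demo above (dates, percentages, and money amounts) follow fairly regular surface patterns. The sketch below shows how such pattern-like entities can be pulled from free text with regular expressions. It is an illustration only and not the NLP method used by FICO® Analytic Modeler for Text; the function name `extract_entities`, the patterns, and the sample sentence are all assumptions made for this example.

```python
import re

# Illustrative patterns only (assumptions, not the product's NLP rules).
PATTERNS = {
    # US-style numeric dates such as 03/14/2013
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    # Integers or decimals followed by a percent sign, e.g. 15% or 2.5 %
    "PERCENTAGE": re.compile(r"\b\d+(?:\.\d+)?\s?%"),
    # Dollar amounts with optional thousands separators and cents
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
}


def extract_entities(text):
    """Return a dict mapping each entity type to the list of matches found."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}


# Hypothetical sentence in the style of a client email.
sample = "On 03/14/2013 the client disputed a $1,250.00 charge and a 15% fee."
entities = extract_entities(sample)
for label, matches in entities.items():
    print(label, matches)
```

Persons, locations, and organizations do not follow fixed surface patterns, so they generally call for dictionary-based or statistical models rather than regular expressions.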
FICO® Analytic Modeler for Text
Accuracy: FICO’s Comparative Advantage

Sentiment Analysis Comparison

Data set                          SemEval 2013 (*) (2218 tweets)    NHTSA (5000 records)
Evaluation criterion              F-measure on average              Classification (Negative - Neutral - Positive)
FICO Analytic Modeler for Text    54% (Twitter Model)               97% - 1% - 2% (Product Model)
Semantria (Lexalytics)            50%                               41% - 53% - 5%
RapidMiner                        24%                               23% - 10% - 67%
AYLIEN                            49%                               81% - 1% - 16%
Textalytics                       51%                               48% - 35% - 17%

(*) http://www.cs.york.ac.uk/semeval-2013/task2/
F-measure: weighted combination (harmonic mean) of precision and recall
Precision: percentage of documents correctly classified over the total found (all documents assigned to the class)
Recall: percentage of documents correctly classified over the actual total (all documents that truly belong to the class)

FICO® Analytic Modeler for Text: The Power to Discover
To drive better business decisions

Insights from almost any data source
• Email and Messages
• Review Sites
• Online Forums
• Blogs and Social Media
• Call Center Notes and Transcripts
• News and Articles

Extracting almost any feature
• Predictive Features
• Named Entities
• Metadata
• Customer Defined Entities
• Sentiment and Emotions
• Hidden Patterns

For virtually any application
• Market and Consumer Reviews
• Risk Management
• Banking and Insurance
• Reputation Management
• Financial Services
• Customer Service
• CEM and CRM

THANK YOU!
Our Most Special Thanks to the FICO Team that made all of this possible:
Ray Ghanbari - Software Engineering V.P.
Carlos Saraiva - Principal Architect
Aritra Chatterjee - Lead Engineer
Mac Belniak - Product Manager Director
Howard Chen - Lead Designer
Aaron Smith - Engineer
Reza Sadoddin - Lead Scientist
Bala Bhat - QA Lead
Vishal Tyagi - QA Lead
Lokesh Pant - Engineer
Jayant Ameta - Engineer
Karthik Balasundaram - Engineer
Sudeshna Sengupta - Sr. Manager
Mark Fields - Sr. Engineer
Jocelyn Qian - Engineer
M Abdul Salam - Manager
Tarun Garg - Engineer
Romil Sandal - Engineer
Sharath Kumar - Engineer
Raghu Subramanyam - Sr. Engineer
Austin

Please rate this session online!
Osvaldo Driollet, Principal Scientist, FICO
Colette Glaeser, Product Management Director, FICO
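
Supplementary note on the sentiment comparison slide: the sketch below shows how precision, recall, and F-measure can be computed per class from true and predicted labels and then averaged across classes. It assumes the balanced form of the F-measure (beta = 1, equal weight on precision and recall) and a plain unweighted average over classes; the slide does not state which weighting or averaging was used. The function names and the six toy labels are made up for illustration and are not taken from the SemEval or NHTSA data.

```python
from collections import Counter


def per_class_scores(y_true, y_pred, beta=1.0):
    """Return {label: (precision, recall, f_beta)} for every label seen."""
    found = Counter(y_pred)    # documents assigned to each class ("total found")
    actual = Counter(y_true)   # documents truly in each class ("actual total")
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = correct[label] / found[label] if found[label] else 0.0
        recall = correct[label] / actual[label] if actual[label] else 0.0
        denom = beta**2 * precision + recall
        # Weighted harmonic mean of precision and recall
        f = (1 + beta**2) * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f)
    return scores


def macro_f(y_true, y_pred, beta=1.0):
    """Unweighted average of the per-class F-measures."""
    scores = per_class_scores(y_true, y_pred, beta)
    return sum(f for _, _, f in scores.values()) / len(scores)


# Toy labels (hypothetical): negative, neutral, positive
y_true = ["neg", "neg", "neu", "pos", "pos", "pos"]
y_pred = ["neg", "neu", "neu", "pos", "pos", "neg"]

scores = per_class_scores(y_true, y_pred)
average = macro_f(y_true, y_pred)
for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
print(f"average F-measure: {average:.2f}")
```

Averaging per class rather than over all documents keeps a rare class, such as neutral in the toy data, from being swamped by the frequent ones.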