FICO® Analytic Modeler for Text

Adding Text to the Analytics Mix
How to Blend Insights from Text, Sentiment, and
Traditional Data Mining for Better Analytics
Osvaldo Driollet, Ph.D.
Principal Scientist - Director
FICO
Colette Glaeser
Product Management Director
FICO
© 2014 Fair Isaac Corporation. Confidential.
This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.
Key Messages
Turning Unused Data into Dollars requires a powerful and intuitive approach
to unlocking valuable hidden insights from mixed, text-rich data, to enable better
modeling strategies and business decisions.
1. An integrated text mining and Natural Language Processing (NLP) approach for
extracting valuable hidden information from vast amounts of different kinds of data, from
inside and outside the enterprise.
2. Turn-key services can automate the analysis and exploration of unstructured data and
unleash the power of Visual Analytics to reveal insights.
3. Better business decisions are the natural consequence of better data insights.
The Unstructured Data Opportunity
Billions of Lost Insights
Big Data: 85% unstructured
FICO® Analytic Modeler for Text
Internal Customer Data
• Purchase History
• Customer Profiles
• Loyalty Programs
• Call Center Data
Customer Communications
Geographic
Demographic
Consumer Reports
Enterprise Data
External Customer Data
User Generated Content
Ratings and Reviews
Social Data: Forums, Twitter, Facebook, LinkedIn, Google+
Better Insights for Better Decisions and Improved Performance
Increasing data variety and complexity
Information, Insights for Decisions, Actions
FICO® Analytic Modeler for Text
• Descriptive Analytics: What happened?
• Diagnostic Analytics: Why did it happen?
FICO® Solution Stack
• Predictive Analytics: What will happen?
• Prescriptive Analytics: What should I do?
• Pre-emptive Analytics: Can I do more?
Example questions
• How many?
• How often?
• Where?
• What exactly is the problem?
• Who is involved?
• What if these trends continue?
• What could happen if…?
• What is next?
• What customers should I target?
• What actions are needed?
• How should I do it?
• Which is the best possible outcome?
• What can I offer the customer before they realize the need?
Better decisions through better data insights
Data Insights from FICO® Insight Miner Can Lead to
Better Business Decisions by Helping
• Detect Fraud
• Curb Customer Attrition
• Minimize Risk
• Increase Response Rate for Marketing Campaigns
• Create Better Predictive Models
• Explore Customer Experience Management (CEM) strategies
• Improve Business Assessment
• Add Social Media insights into business intelligence schemes
• Anticipate Resource Demands
FICO® Analytic Modeler For Text
Visualization and Discovery
• Interactive Data Visualizations
• Customizable Dashboards, Queries and Filters
• Interactive Insights
Business Solutions
• Fraud and Risk Mitigation
• Predictive Modeling
• Insurance Assessment
• Marketing Campaigns
• Customer Experience Management
• Social Media Analysis
Analytic Services
• Text Mining
• Statistical Analysis
• Entity Extraction
• Sentiment Analysis
Data Access
• Preprocessing, Indexing and Metadata support
• Multi-format support
• Multi-language support
• Security Layer
Demo of Selected Use Cases
Getting Started (application overview and data access)
Core Text Analysis
Named Entity Recognition
Sentiment Analysis
Visualization of Results
Discovery
FICO® Analytic Modeler For Text
The Power to Discover
o Offers a rich, easy-to-use set of integrated data mining tools.
o Automates the analysis and exploration of unstructured data.
o Extracts valuable insights through embedded Visual Analytics.
o Discovers hidden patterns and sentiment buried in text with less
time and effort.
o Enables more sophisticated descriptive and predictive models when used
with the FICO Solution Stack and the FICO Analytic Cloud.
o Facilitates actionable intelligence for improved business performance.
Demo Use Case (Public Data)
National Highway Traffic Safety Administration (NHTSA) Database
http://www.safercar.gov/Vehicle+Owners
You will see (with a few clicks)
• How to use the out-of-the-box Predictive Mining Panel to unleash the power of
unstructured data
• How to add Interactive Visual Analytics to your business applications
• How to customize and profile your data and find the ‘needle in the haystack’
• How to use the FICO Solution Stack to transform the discovered insights into
more profitable decisions
• How easily and quickly Insurance Assessment, Survey Analysis, Brand Reputation,
and Predictive Modeling applications can be built
Data Description
Automotive complaint and defective parts database including car makers, brand names,
accident occurrences, injuries, fatalities, drivers’ comments, dealers’ info, etc.
About 1 million complaints from 1980 to 2013
47 data fields
• 4 categorical indicators (Crash, Fire, Injured, and Fatalities)
• 2 text fields:
- Failing Component Description
- Complaint Description
For demo purposes, 3 random samples of about 70,000 data records were generated:
- NHTSA 1980-2013
- NHTSA 1980-1999
- NHTSA 2000-2013
And we are working to add…
FICO® Analytic Modeler For Text
The Power to Discover
FICO® Analytic Modeler for Text integrates Data Mining, Statistical Analysis,
NLP and Machine Learning under the same roof.
• Visual Data Exploration
• Dedicated Dashboards
• Normalization, Preprocessing and Filtering
• Feature Selection
• Text Mining
• Time Series Analysis
• Content Analysis and Extraction
• Automatic Summarization
• Natural Language Processing
• Information Retrieval
• Information Theory
• Statistical Analysis
• Ontology Management
• Topic Identification
• Entity Extraction
• Sentiment Analysis
• Machine Learning for
- Classification
- Clustering
- Recommendation
- Topic Modeling
Demo Use Case (Enterprise Data)
AAA Michigan Insurance
You will see (with a few clicks)
• How to extract
1) Persons
2) Locations
3) Organizations
4) Dates
5) Time References
6) Percentages
7) Money Amounts
using the Natural Language Processing (NLP) features embedded in the product
• How to evaluate the customer’s sentiment using FICO Interactive Visual Analytics
• How to use the discovered insights to build CEM/CRM, Business Intelligence, Survey
Reviews, and Predictive Modeling.
Data Description
Database of Insurance claims formed by 20+ relational tables with
information about policies, vehicles, losses, claims, and communications
with clients (email and voice transcripts).
About 20,000 complete records, with hundreds of data fields
For demo purposes, only unique accounts with relevant fields were used
2000 records with 28 data columns
1 Text Field:
- Email communications with Client
- Phone call transcripts
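Three of the entity types named in this demo (dates, percentages, money amounts) follow regular surface patterns, so a rough idea of what extraction returns can be sketched with plain regular expressions. This is only an illustration: the patterns, the `extract_entities` helper and the sample sentence are assumptions made for this sketch, not the product's NLP features, and persons, locations and organizations generally need a trained model rather than rules.

```python
import re

# Hypothetical rule-based patterns; a simplified stand-in, not the product's method.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),            # e.g. 03/14/2014
    "MONEY": re.compile(r"\$\s?\d+(?:,\d{3})*(?:\.\d{2})?"),      # e.g. $1,250.00
    "PERCENTAGE": re.compile(r"\b\d+(?:\.\d+)?\s?%"),             # e.g. 20%
}


def extract_entities(text):
    """Return (label, matched_text, start_offset) tuples in reading order."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


# Made-up sentence in the style of a claim email.
sample = ("On 03/14/2014 the client reported a $1,250.00 loss "
          "and asked why the premium rose 20%.")

for label, value, offset in extract_entities(sample):
    print(f"{label:<10} {value:<12} at character {offset}")
```

Each extracted value can then be stored as an extra column next to the original record, which is how such features would typically feed a dashboard or a predictive model.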
FICO® Analytic Modeler For Text
Accuracy: FICO’s Comparative Advantage
Sentiment Analysis Comparison

Data set                          SemEval 2013 (*) (2218 Tweets)   NHTSA (5000 records)
Evaluation Criterion              F-Measure on Average             Classification (Negative - Neutral - Positive)
FICO Analytic Modeler for Text    54% (Twitter Model)              97% - 1% - 2% (Product Model)
Semantria (Lexalytics)            50%                              41% - 53% - 5%
RapidMiner                        24%                              23% - 10% - 67%
AYLIEN                            49%                              81% - 1% - 16%
Textalytics                       51%                              48% - 35% - 17%

(*) http://www.cs.york.ac.uk/semeval-2013/task2/
F-Measure: Weighted combination of Precision and Recall
Precision: Percentage of documents correctly classified over the total found
Recall: Percentage of documents correctly classified over the actual total
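The precision, recall and F-measure definitions above can be made concrete with a short sketch. It is illustrative only: the function names, the toy labels and the unweighted average over classes are assumptions for this example, not the scoring procedure used to produce the comparison figures.

```python
def per_class_scores(gold, predicted, label, beta=1.0):
    """Return (precision, recall, f_measure) for a single sentiment label."""
    pairs = list(zip(gold, predicted))
    correct = sum(1 for g, p in pairs if g == label and p == label)
    found = sum(1 for _, p in pairs if p == label)    # documents the model assigned to the label
    actual = sum(1 for g, _ in pairs if g == label)   # documents that truly carry the label
    precision = correct / found if found else 0.0
    recall = correct / actual if actual else 0.0
    # F-measure: weighted harmonic mean of precision and recall (beta=1 weighs both equally).
    denominator = beta ** 2 * precision + recall
    f_measure = (1 + beta ** 2) * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_measure


def average_f_measure(gold, predicted, labels):
    """Unweighted mean of the per-class F-measures."""
    return sum(per_class_scores(gold, predicted, label)[2] for label in labels) / len(labels)


# Toy data, made up for illustration only.
gold = ["neg", "neg", "neu", "pos", "pos", "neg"]
pred = ["neg", "neu", "neu", "pos", "neg", "neg"]

for label in ("neg", "neu", "pos"):
    p, r, f = per_class_scores(gold, pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f={f:.2f}")
print(f"average F-measure: {average_f_measure(gold, pred, ['neg', 'neu', 'pos']):.3f}")
```

Which classes enter the average, and whether they are weighted by frequency, changes the headline number, so the averaging rule should be stated alongside any reported F-measure.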
FICO® Analytic Modeler For Text
The Power to Discover
Insights from almost any data source
• Market and Consumer Reviews
• Email and Messages
• Review Sites
• Online Forums
• Blogs and Social Media
• Call Center Notes and Transcripts
• News and Articles
Extracting almost any feature
• Predictive Features
• Named Entities
• Metadata
• Customer Defined Entities
• Sentiment and Emotions
• Hidden Patterns
For virtually any application
• Risk Management
• Banking and Insurance
• Reputation Management
• Financial Services
• Customer Service
• CEM and CRM
To drive better business decisions
THANK YOU!
Our Most Special Thanks to the FICO Team that made all of this possible:
Ray Ghanbari - Software Engineering V.P.
Carlos Saraiva - Principal Architect
Aritra Chatterjee - Lead Engineer
Mac Belniak - Product Manager Director
Howard Chen - Lead Designer
Aaron Smith - Engineer
Reza Sadoddin - Lead Scientist
Bala Bhat - QA Lead
Vishal Tyagi - QA Lead
Lokesh Pant - Engineer
Jayant Ameta - Engineer
Karthik Balasundaram - Engineer
Sudeshna Sengupta - Sr. Manager
Mark Fields - Sr. Engineer
Jocelyn Qian - Engineer
M Abdul Salam - Manager
Tarun Garg - Engineer
Romil Sandal - Engineer
Sharath Kumar - Engineer
Raghu Subramanyam - Sr. Engineer
Austin
Please rate this session online!
Osvaldo Driollet
Principal Scientist
FICO
Colette Glaeser
Product Management Director
FICO