Craig Rhinehart
Director of ECM Product Strategy
The Next Wave of ECM Innovation … Analyze Your Content with Trusted Content Analytics
© 2009 IBM Corporation

Craig Rhinehart Contact Info
• On my blog this week …
• What happens when we fail to govern enterprise content properly?
• Email me at craigrhinehart@us.ibm.com
• My blog can be found at http://craigrhinehart.wordpress.com/
• Follow me on Twitter at http://twitter.com/craigrhinehart

Agenda
• Introduction to Content Analytics
• How Content Analytics Works
• New Cognos Content Analytics Offering
• Cognos Content Analytics Demo
• New InfoSphere Content Assessment Offering

Trusted Content Analytics Overview
• Know: InfoSphere Content Assessment
  Empower organizations to identify necessary information and decommission the unnecessary
• Trust: InfoSphere Master Content
  Deliver trusted content to empower better decision making about individual customers
• Leverage & Exploit: Cognos Content Analytics
  Deliver insight by visualizing trends, correlations and anomalies about your overall business from your content

The world is changing and becoming more…
• Instrumented
• Interconnected
• Intelligent
The resulting explosion of information creates a need for a new kind of intelligence … to help build a Smarter Planet

Creating New Business Optimization Opportunities...
New Intelligence: Pervasive, Real-Time, Predictive
• What if you could understand what your customers want before they ask?
• What if you could detect fraudulent claims before they’re paid?
• What if you could find crime patterns and apprehend criminals in real-time?
• What if you could make cities smarter by integrating all information about a citizen?

Business Optimization Enabled by Content Analytics
• Smarter Insurance (Large Claims Third-Party Administrator): Analytics over insurance claim files helps detect fraud faster, reducing costs for their clients by millions of dollars and optimizing the claims-handling process.
• Smarter CPG (Kraft Australia): Analytics over online customer postings helps Kraft target and deliver new branding campaigns, increasing sales and customer loyalty.
• Smarter Telecommunications (NTT DoCoMo): Analytics over Voice of Customer data provides insight to drive customer-oriented decision making, boosting loyalty and creating new opportunity.
• Smarter Healthcare Plans (Blue Cross Blue Shield of TN): Analytics over an integrated single view of plans, patients and providers enables better negotiations and improves provider satisfaction to over 90%.

Analytics is Driving the Evolution of ECM
ECM Becomes a Key Enabler for Information-Led Transformation
Smarter Business Outcomes
[Figure labels: Optimization, Trusted Content Analytics, BPM, Advanced Case Management, Content Automation]
• Content Analytics
• Content Assessment
• Master Content

• Advanced Workflow
• Activity Monitoring
• Business Rules

• Image Management
• Office Document Management
• Archiving / Records Management
• Compliance Lifecycle Mgmt

Every single organization:
1. Keeps too much information and spends too much storing content, because there’s too much to sift through
2. Can’t pinpoint the right content when they need it, because it’s unfindable or hidden away in a departmental silo
3. Can’t trust the content they do find about their customers, because the lifecycle is uncontrolled
4. Needs to deliver better customer service for less, because those with the best service are rising above the rest in highly competitive markets
5. Wants to optimize their business by
   • anticipating their customers’ purchasing needs
   • reducing fraud
   • delivering a more complete view of their customers
   • gaining early warning on product quality and customer satisfaction issues
   because the answers exist inside their organization; they’re just buried underneath too much information

Key Enabling Innovation: Content Analytics
Based on UIMA, the open, industry-standard architecture for text analysis pioneered by IBM and now an OASIS standard and Apache open-source project
[Figure: the sentence "John sprained his ankle on the step ..." annotated with grammatical labels (Noun, Verb, Noun Phrase, Prep Phrase) and concept labels (Person, Injury, Body Part, Location), yielding the extracted concept "Claimant: Soft Tissue Injury". Content Analytics produces analyzed documents with identified concepts.]
• From each document you can derive:
  • New business understanding
  • New visibility from content
• Create structure and understanding from a group of words
• Powered by IBM’s unique Dynamic Analysis capability

Content Analytics enables analysis that was previously impractical
Aggregates conclusions & scales out understanding to large data sets
[Figure: the same annotated sentence, "John sprained his ankle on the step ...", with the extracted concept "Claimant: Soft Tissue Injury"]
Automatic Visualization: Concepts and tagged source information are visualized in UI
Source Info (ECM, File, Web, DBMS, ...)
Analyzed Documents with identified concepts
• Content analytics scales out document-by-document content investigation
• Aggregate the conclusions
• Assess volumes of information not otherwise humanly possible (or cost effective)

Dynamic Analysis: Basis for Trusted Content Analytics Solutions
Impractical and overwhelming analyses are now a reality
IBM’s unique Dynamic Analysis capability: Aggregate, Correlate, Visualize, Explore
• Aggregate … form collections from multiple content sources and types unmatched in industry
• Correlate … deep analysis of content that surfaces trends, relationships, patterns, concepts and anomalous associations
• Visualize … easy-to-use, feature-rich views to quickly dissect large corpora of content and zero in on answers
• Explore … freely investigate content with faceted navigation and drill down to surface new insight and understanding
… to enable informed business decisions

Result: A Platform for Uncovering New Insights
• Separate the valuable content from the unnecessary
• Determine what customers will buy
• Find early warnings on product quality concerns
• Tells you something you may not know
• Identify potentially fraudulent insurance claims

Based on UIMA
Unstructured Information Management Architecture
Automated Concept Extraction and Logical Organization
[Figure: the sentence "John sprained his ankle on the step ..." annotated as before, with the extracted concept "Claimant: Soft Tissue Injury". Component labels: Crawlers, UIMA Annotators, Enhanced Metadata, Analytics Index, Visualization UI. Annotator stack labels: Identify Language, Tokenization, Word Analytics, Named Entity Extraction, Multi-word Analytics, Automatic Classifier, Plug-in Custom Analytics.]
It is an open, industrial-strength, scalable and extensible platform for creating, integrating and deploying unstructured information management solutions from combinations of semantic analysis and search components.
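The annotator stack named on this slide (tokenization, entity extraction, concept derivation) can be illustrated with a minimal sketch. This is plain Python, not the UIMA API, which the slides do not show; the pattern table, the label-to-phrase mapping and the rule that derives "Claimant: Soft Tissue Injury" are all hypothetical stand-ins chosen to match the example sentence.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    label: str  # annotation type, e.g. "Token" or "Injury"
    text: str   # the covered text
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


# Hypothetical mini-dictionaries standing in for real annotators.
CONCEPT_PATTERNS = {
    "Person": r"\bJohn\b",
    "Injury": r"\bsprained\b",
    "Body Part": r"\b(?:his|her) ankle\b",
    "Location": r"\bon the step\b",
}


def tokenize(text):
    """Stage 1: split the text into word tokens with offsets."""
    return [Annotation("Token", m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def annotate_concepts(text):
    """Stage 2: tag spans that match a known concept pattern."""
    found = []
    for label, pattern in CONCEPT_PATTERNS.items():
        for m in re.finditer(pattern, text):
            found.append(Annotation(label, m.group(), m.start(), m.end()))
    return sorted(found, key=lambda a: a.start)


def derive_claim_concept(annotations):
    """Stage 3: combine lower-level annotations into a business concept.

    The rule below is an illustrative assumption, not IBM's logic.
    """
    labels = {a.label for a in annotations}
    if {"Person", "Injury", "Body Part"} <= labels:
        return "Claimant: Soft Tissue Injury"
    return None


text = "John sprained his ankle on the step"
tokens = tokenize(text)
concepts = annotate_concepts(text)
claim = derive_claim_concept(concepts)

for a in concepts:
    print(f"{a.label:10s} {a.text!r} [{a.start}:{a.end}]")
print(claim)
```

Each stage only adds annotations over the original text, which is the general idea behind chaining annotators: later stages can build on what earlier ones found without modifying the source document.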
Although UIMA originated at IBM, it is now an OASIS industry standard and an Open Source project which is currently incubating at the Apache Software Foundation.
http://domino.research.ibm.com/comm/research_projects.nsf/pages/uima.index.html

Leverage & Exploit: IBM Cognos Content Analytics
Deliver insight about your overall business from your content
Using Dynamic Analysis, Cognos Content Analytics powers solutions that can:
• Drive new business understanding and visibility leveraging the content & context of unstructured information
• Enable better business decisions by explaining why events are occurring
• Expose patterns and trends to highlight optimization opportunities and create differentiation
• Create cost savings by uncovering process inefficiencies and optimization opportunity
• All without prior knowledge or pre-defined queries or reports
The impact:
• Improved customer satisfaction
• Reduced fraud
• Better understanding of market demand and perception
• Early warning on product quality issues
[Callouts:]
• I need to improve my customer sat metrics
• I need to better anticipate my customers’ needs
• I need better visibility into the marketplace
• I need to optimize my claims process
• I need to fight crime faster
• I need to get ahead of product quality problems
• I need to make my legal team more efficient
• I need to assess my content & take action to better manage it
• I need to reduce fraud
• I need to anticipate compliance violations

IBM Cognos Content Analytics features…
• Analyze and explore structured and unstructured information
• Automatic extraction of meaningful concepts and entities from text
• Open, standard UIMA-based text analysis pipeline
• Integration with Cognos for reporting against unstructured concepts
• Multiple graphical views of the facets (dimensions) of unstructured content
• Automatic highlighting of interesting anomalies and correlations in the data
• Support for analysis of over 30 content sources and over 150 content formats
• Integration with ICM for analysis of document categories, classes, and clusters
• Highly scalable and extensible

Cognos Content Analytics adds value to…
Retail Customer Care
• Analyzing: Call logs, online media
• For: Brand Reputation Management
• Benefits: Improve customer sat, marketing campaigns
Retail Banking Customer Care
• Analyzing: Call logs, online media
• For: Buyer Behavior
• Benefits: Improve customer satisfaction, marketing campaigns, find new revenue opportunities
Crime Analytics
• Analyzing: Police records, 911 calls…
• For: Rapid crime solving & crime trend analysis
• Benefits: Safer communities & optimized force deployment
Healthcare Analytics
• Analyzing: Care records
• For: Clinical analysis; treatment protocol optimization
• Benefits: Better management of chronic diseases; optimized drug formularies; improved patient outcomes
Telco Customer Care
• Analyzing: Call center logs and emails
• For: Churn prediction and FAQ generation
• Benefits: Improved customer retention & customer satisfaction
Automotive Quality Insight
• Analyzing: Tech notes, call logs, online media
• For: Brand Reputation Management
• Benefits: Reduce warranty costs, improve customer satisfaction, marketing campaigns
Insurance Fraud
• Analyzing: Insurance claims
• For: Detecting fraudulent activity & patterns
• Benefits: Reduced losses, faster detection, more efficient claims processes
...and more!

Insurance Case Study for Fraud Detection and Prediction
[Figure: numbered process flow (steps 1 to 6) with the labels "Claims Process", "Historical Cross-Claim Content Analytics", "Content Analytics Based Predictive Fraud Indicators" and "Automatic Routing to Investigations ..."]
Content Analytics Based Predictive Fraud Indicators: Soft Tissue Injury, Unwitnessed Event, Prior Injury, Multiple Claims, …
1. Automatically aggregate structured and unstructured data accumulated over time from the claims process
2. Correlate text analytics to apply meaning and understand patterns and trends … visualize and explore to uncover new insights into claims process
3. Instrument by applying indicators to “in process” claims to identify suspicious claims and type of risk
4. Score suspicious claims to predict probability and impact of fraud and risks
5. Route high-likelihood and/or high-impact claims for investigation based on scoring outcomes
6. Continuously improve outcomes through closed loop optimization

Partner Solution for Healthcare Fraud Analytics

Accelerating Regulatory Review
Environmental Protection Agency
The Customer Problem:
• EPA tracks chemicals being produced
• Chemical producers submit robust reports of effects on environment
• EPA has 3,000 of these reports and no way to analyze the data
The Solution / The Results:
• Convert documents to XML
• Extract complex chemical structures from the documents
• Provided toxicological capability to understand how different chemicals map to “end effects” (e.g. increase in liver weight)
• Provide ability to analyze chemical structures in reports and, using patent data, understand how these chemicals are being used in the environment

Better Business Outcome: NYPD is Solving More Crime Faster with New Insight from Content Analytics
[Figure labels: Identify and Designate Trusted Repositories of Record; Create, Control, Maintain and Supply Trusted Content; Consume, Leverage and Exploit Trusted Information; Govern The Information Lifecycle … Archive, Record and Preserve Information and Evidence of Transactions, Processes and Events]
Challenge
• Search and analyze complaints, police reports, 911 records, arrest records, and data marts … all stuck in silos of information
• All of these forms of text suffer from the common problems of call center text, i.e. abbreviations, misspellings, synonyms (police-specific examples: perp, ML, FM, MO, pistol, gun, etc.)
• Find events that keyword search can never find because they are all described differently – what keyword to use?
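The keyword problem described above can be shown with a small sketch: reports that use different words for the same thing are missed by a literal keyword match but found once terms are mapped to a shared concept. The synonym table, the canonical concept names ("firearm", "suspect") and the sample reports are invented for illustration and are not taken from the NYPD deployment.

```python
import re

# Hypothetical synonym table: surface form -> canonical concept.
CONCEPTS = {
    "pistol": "firearm",
    "gun": "firearm",
    "handgun": "firearm",
    "perp": "suspect",
    "suspect": "suspect",
}


def concepts_in(text):
    """Return the set of canonical concepts mentioned in a report."""
    words = re.findall(r"[a-z]+", text.lower())
    return {CONCEPTS[w] for w in words if w in CONCEPTS}


def concept_search(reports, concept):
    """Return the reports that mention the concept under any wording."""
    return [r for r in reports if concept in concepts_in(r)]


# Made-up sample reports.
reports = [
    "Perp fled on foot carrying a pistol",
    "Witness saw a man with a gun near the store",
    "Vehicle stolen from parking lot",
]

# A literal keyword match finds only the report that says "gun".
keyword_hits = [r for r in reports if "gun" in r.lower().split()]

# A concept match finds both firearm reports.
concept_hits = concept_search(reports, "firearm")

print(len(keyword_hits), len(concept_hits))
```

A production system would derive such mappings from text analysis rather than a hand-written table, and would also need to handle misspellings and abbreviations, which this sketch does not attempt.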
Solution
• IBM OmniFind Enterprise Edition with Content Analytics enables insight and understanding across all silos
The Results
• Text Analytics can describe events, categorize them and allow for concept searches across often unstructured and at times inaccurate descriptions
• Enables aggregated view of information beyond silos
• In the first week of deployment, two old murder cases were solved, a result directly attributed to the ability to analyze trusted data and content
• Customized with NYPD-specific case management analytics

Accelerating Crime Analysis (Law Enforcement)
Europol
• Customer observed “that a too significant part (estimation of 76%) of the analyst’s time is spent in non real analysis tasks with no real added value for their analysis business”
• “Enable the analysts to cope with the increasingly large volumes of intelligence information that they are receiving”
• “Automatically extract and find relevant information (facts, entities, link, etc.) useful for the analysis without having to spend hours to examine and manually parse data collection.”
• Solution based on Content Analytics with search front-end built with IBM OmniFind Enterprise Edition on top of an ECM system

Europol Example
• Dynamic refinement of user query, based on detected concepts
• Concepts such as cars, people, and crime events are extracted from the underlying text by text analysis technology

Cognos Content Analytics Demo
FDA MedWatch incident reports are one source of data for medical device manufacturers to understand problems being reported by consumers about their products. They contain both structured and unstructured information.
A manufacturer could also analyze internal content, such as warranty claims or support incidents.

This view shows Deviations (or anomalies) over time for all values of the selected facet, in this case Generic Device Name.

Here we see an unexpectedly high occurrence of incidents around Infusion Pumps in April 2008, so we drill in.

Switching to the Facets view of key phrases, we see frequent mentions of battery issues in Infusion Pump incidents reported in April 2008. We drill down into these battery issues.

In the documents view, we can see the original source documents about these 154 battery-related infusion pump incidents. Relevant matching text from the original documents is highlighted.

Switching to a Brand Name facet view, we can immediately see a summary, by frequency and correlation, of the devices that are mentioned in these battery-related incidents.

Through the Cognos Content Analytics OLAP/Star Schema export ability, Cognos BI reports and dashboards can be created to monitor and track these issues over time.

When a potential regulatory, legal, or compliance issue is identified, the same Content Analytics interface can be used to identify internal documents that might be relevant, gather them, and export them for archiving into a centralized IBM ECM repository.

The IBM Content Collector provides a graphical interface for coordinating the archiving of these and other relevant items (such as related emails). Emails and documents can be classified, declared as records and even have metadata cleansed prior to becoming a managed or archived item.

Once gathered into a repository, IBM eDiscovery tools can be used to place legal holds on items, and prepare evidence for legal cases, audits, or other compliance events.
Retention and Legal holds can be enforced within the storage infrastructure if using IBM Information Archive.

Specific subsets of evidence can be marked for further review to identify the degree of risk or legal exposure.

Unnecessary Information Eclipses Necessary Information
[Figure labels: Unnecessary Information, Necessary Information; Over-Retained, Irrelevant, Duplicated, Valued, High Risk, Compliant]
How much of your information is unnecessary? 70%? 80%? 90%?

Content Assessment Enables Content Decommissioning
[Figure labels: Bloated Production Systems with Inefficient Storage; Content Based Systems Needing Retirement; Content In The Wild; Unnecessary Information: Decommission; Trusted Content: Keep]
• Semi-automated process separates trusted from suspected
• Efficiently addresses large-scale problems, while incorporating the human element
One customer found 1200 copies of the same policy document across multiple enterprise file servers

IBM InfoSphere Content Assessment
Housekeeping doesn’t have to be a chore.
1. Dynamically Analyze what you have
   Aggregate, Correlate, Visualize and Explore your enterprise information in new ways to understand virtually all content types from multiple sources. Make rapid decisions about business value, relevance and disposition.
2. Decommission what’s unnecessary
   Save cost and reduce risk by eliminating obsolete, over-retained, duplicate, and irrelevant content – and the infrastructure that supports it.
3. Preserve and Exploit the content that matters
   Collect valued content to manage, trust and govern throughout its lifespan in an enterprise-grade ECM platform.
   Uncover new business value and insight by integrating with solutions for eDiscovery, case management, master data management, business intelligence, predictive analytics and more.

Selling Content Assessment via BVA
Content decommissioning, dynamic collection for eDiscovery lead to measurable ROI
(Cost Drivers, with Savings After Deployment)
Production System Tangible Costs (Storage Management Tangible Savings)
• Email / File / SharePoint Storage: 50%-80%
• Production System Servers: 40%-60%
• Backup: Cost of backup media and storage
Production System Productivity Costs (Storage Management Productivity Savings)
• Production System Administration: 20% to 80%
• End-User Administration / Classification: 70% to 90%
eDiscovery Costs (eDiscovery Cost Avoidance)
• Data Spoliation (fines, lost or settled cases): Up to 100%
• Labor costs of providing the information: Hours vs. Days

Trusted Content Analytics Summary
• Know: InfoSphere Content Assessment
  Empower organizations to identify necessary information and decommission the unnecessary
• Trust: InfoSphere Master Content
  Deliver trusted content to empower better decision making about individual customers
• Leverage & Exploit: Cognos Content Analytics
  Deliver insight by visualizing trends, correlations and anomalies about your overall business from your content

Craig Rhinehart
Director of ECM Product Strategy
• Email me at craigrhinehart@us.ibm.com
• My blog can be found at http://craigrhinehart.wordpress.com/
• Follow me on Twitter at http://twitter.com/craigrhinehart