Insurance Fraud Detection: Reducing Loss Payout Using Predictive Modeling

Mark Rusch
Vice President – Sales
mrusch@statsoft.com
708-428-4113

data analysis, data mining, quality control, web-based analytics

U.S. Headquarters: StatSoft, Inc., 2300 E. 14th St., Tulsa, OK 74104 USA
(918) 749-1119, Fax: (918) 749-2217, info@statsoft.com, www.statsoft.com

Australia: StatSoft Pacific Pty Ltd.; Brazil: StatSoft South America; Bulgaria: StatSoft Bulgaria Ltd.; Czech Rep.: StatSoft Czech Rep. s.r.o.; China: StatSoft China; France: StatSoft France; Germany: StatSoft GmbH; Hungary: StatSoft Hungary Ltd.; India: StatSoft India Pvt. Ltd.; Israel: StatSoft Israel Ltd.; Italy: StatSoft Italia srl; Japan: StatSoft Japan Inc.; Korea: StatSoft Korea; Netherlands: StatSoft Benelux BV; Norway: StatSoft Norway AS; Poland: StatSoft Polska Sp. z o.o.; Portugal: StatSoft Ibérica Lda; Russia: StatSoft Russia; Spain: StatSoft Ibérica Lda; S. Africa: StatSoft S. Africa (Pty) Ltd.; Sweden: StatSoft Scandinavia AB; Taiwan: StatSoft Taiwan; UK: StatSoft Ltd.

© Copyright StatSoft, Inc., 1984-2010. StatSoft, StatSoft logo, and STATISTICA are trademarks of StatSoft, Inc.

Overview
■ Introductions
  ■ Customer introductions
  ■ StatSoft introductions
■ A brief overview of analytic approaches, methods, and issues in fraud detection
■ A review of methods useful for detecting underwriter, provider, and claimant fraud
■ A review of a customer benefits example
■ Wrap-up and next steps

Through early identification of high-potential fraud claims, Predictive Fraud Detection can make a significant impact on overall loss cost.
One LOB Claims Predictive Model Savings

Annual Premium                        $136M
Loss Ratio                            52%
Annual Losses Paid (est.)             $71M
% of losses due to fraud claims       5%
Annual losses due to fraud claims     $3.5M
Fraud claims identified by PM         90%
Fraud losses identified by model      $3.2M
Projected reduction in losses         4.5%
Annual fraud reduction                ~$2M
Revised projected loss ratio          49.8%

StatSoft Worldwide Offices & Sample Insurance Customers
English, Italian, French, Chinese, German, Russian, Polish, Korean, Czech, Portuguese, Spanish, Japanese

Why Predictive Modeling
■ To reduce combined ratios through:
  ■ Increased subrogation and recovery dollars by identifying candidate claims early, tagging and tracking them
  ■ Uncovering usual and new types of fraud via text mining of claim notes, PDFs, and adjustor reports for suspicious patterns
  ■ More efficient claim handling by providing the “right level of service”
    ■ Through targeted “In Person Contact”
■ Reducing loss frequency
  ■ By leveraging data mining to uncover previously undetected patterns and applying these patterns in the underwriting process to either:
    ■ Reject risk
    ■ Charge more premium
■ Right Tracking
  ■ Identify claim complexity early, then route to the appropriate resource
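The figures in the One LOB Claims Predictive Model Savings table can be retraced with a few lines of arithmetic. The sketch below is illustrative only: the inputs are the values quoted in that table, the variable names are assumptions, and it assumes every identified fraud dollar is actually avoided. Because the slide rounds intermediate figures (for example, $71M), the computed values can differ slightly from the printed ones. The table's "~$2M" annual fraud reduction is not derived here, since the deck does not state how it was calculated.

```python
# Illustrative arithmetic only; inputs are the figures quoted in the savings table.
annual_premium_m = 136.0   # annual premium, in $ millions
loss_ratio = 0.52          # losses paid divided by premium
fraud_share = 0.05         # share of losses attributed to fraud claims
model_catch_rate = 0.90    # share of fraud claims identified by the predictive model

annual_losses_m = annual_premium_m * loss_ratio          # about $71M
fraud_losses_m = annual_losses_m * fraud_share           # about $3.5M
fraud_identified_m = fraud_losses_m * model_catch_rate   # about $3.2M
reduction_share = fraud_identified_m / annual_losses_m   # 5% x 90% = 4.5% of losses

# Assumption: all identified fraud losses are avoided (an upper bound on savings).
revised_loss_ratio = (annual_losses_m - fraud_identified_m) / annual_premium_m

print(f"Annual losses paid:            ${annual_losses_m:.1f}M")
print(f"Losses due to fraud:           ${fraud_losses_m:.1f}M")
print(f"Fraud losses identified:       ${fraud_identified_m:.1f}M")
print(f"Projected reduction in losses: {reduction_share:.1%}")
print(f"Revised loss ratio:            {revised_loss_ratio:.1%}")
```

Changing `model_catch_rate` or `fraud_share` shows how sensitive the revised loss ratio is to those two estimates.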
Predictive Analytics Enhances Many Insurance Processes

Underwriting
■ Automated underwriting / risk selection
■ Straight-through rate processing
■ Active risk portfolio management
■ Automated discount/credit recommendation
■ Automated renewal processing
■ Underwriting fraud detection
■ Appetite selection management
■ Automated premium audit

Marketing
■ Campaign optimization
■ Customer segmentation
■ 1:1 marketing
■ New product market analysis
■ Outbound Predictive Marketing
■ Inbound intelligent cross sell
■ Optimize leads delivered to Agency Force

Claims
■ Fraud detection
■ Fast tracking of claims
■ Claims assignment automation (by competency)
■ Settlement analysis
■ Accelerated detection of severe claims
■ Predict Complexity
■ Routing optimization
■ Predict Reserves

Sales & Service
■ Field force optimization (marketing & agency)
■ Commission modeling and optimization
■ Cross-sell, up-sell, offer optimization
■ Intelligent call routing
■ Intelligent Recommendations
■ In and outbound Customer Retention offers
■ Enterprise Feedback Optimization
■ Agent/Broker Performance effectiveness

Opportunities
PREDICTIVE MODELING IN ACTION
Current Insurance Environment
■ Increasing deregulation and growing competition in the insurance industry are placing pressure on insurance companies to be more customer-centric for the “right” customers in their operations
■ In particular, focusing on providing the “right” level of service to the “right” customers
  ■ For Claims – Providing real-time scoring across the entire claim process to continuously monitor for
    ■ Reserve Changes
    ■ Subrogation Opportunities
    ■ Fraud
    ■ Right Tracking
    ■ In Person Contact (IPC)
    ■ Claim Complexity
  ■ For Marketing – Identifying and providing the best price to the customer with the highest lifetime customer value
    ■ Smart reasons to contact beyond the renewal
      ■ Cross Selling
      ■ Targeted Retention Offers based on Customer Value Score

Predictive Analytics can drive down claim costs at many points across the claim lifecycle
[Diagram labels: Injury/Accident; First Report; Assign Claim; 3 Point Contact; Low Touch; Supervisor Review & Assignment; New Information/Medical Pharmacy Bill; Manage Claim/Make Payments; Referral; Escalation; Close; Nurse Case Management; Evaluation – Strategy and Reserves; Model Score + Reason Codes]

Leveraging Predictions within Claims Workflow
■ Reducing fraud payouts by catching fraud earlier
■ Increasing subrogation recovery by identifying subro cases earlier, tagging and tracking them
■ Streamlining the processing of non-fraudulent and routine claims
■ Identifying and sending the most complex claims to the right adjustor
■ Reducing working capital requirements through more timely and accurate reserving
Most systems can score a claim on a batch basis; few, if any, can perform this scoring in real time against the latest claim data entered.

Leveraging Predictions within SIU Workflow
■ Look at your current capacity to handle new cases
■ Enter the capacity in the capacity field
■ By streamlining the processing of non-fraudulent and routine claims
■ System will re-score all new and existing claims (existing claims have new text and other information) and create a new list of cases that have the highest Fraud Score based on your current capacity levels
■ Scoring of claims for fraud is no longer a “one-time” event but rather a continuous and virtually automatic one

Opportunities
FRAUD DETECTION

Fraud: Finding the Needle in a Haystack… Without Knowing What a Needle Looks Like
■ There are many ways in which insurance fraud can be perpetrated
  ■ By not being “honest” on the application for insurance (underwriter fraud)
  ■ By not being “honest” about a specific claim (claimant fraud)
  ■ By systematically “manufacturing” a claim, e.g., personal injury-related reimbursements to a provider (provider fraud)
■ A fundamental problem is that, unlike fraud in other domains, fraudulent activity may go undetected for a long time, or may never be detected
■ So the problem is one of “looking for a needle in the haystack”, but not being sure what a needle looks like
■ Not easy!

Challenges in Identifying Fraud
■ Current Challenges
  ■ More fraudulent claims are slipping through the cracks
  ■ Backlogged adjusters are so wrapped up in getting the claim settled that they miss basic fraud issues
  ■ Manual fraud detection approaches cause delays in getting files to the SIU department
■ Resulting in:
  ■ Fraudsters realizing the above and becoming much more sophisticated than the insurance company
  ■ Losses paid due to fraud are on the increase
  ■ New types of fraud are occurring with greater frequency
  ■ Fraud tends to increase in a down economy
  ■ Fraud is now being perpetrated via social networks
There must be a better way…

Categories of Approaches
■ Supervised Learning: where an outcome variable exists in historical data
  ■ Based on the analysis of claims previously investigated by SIU
  ■ Predictive models can make the SIU process more efficient and effective by identifying new types of fraud patterns through:
    ■ The use of new algorithms
    ■ New insights gained through mining of unstructured data such as adjustor notes, faxes, PDFs, letters, and other forms of text-based data
    ■ Leveraging third-party data sources
■ Unsupervised Learning: where an outcome variable does not exist in historical data
  ■ Based on the analysis of all claims filed in the past, unsupervised models can be built to identify claims that are “unusual” or “too usual” (too average)
  ■ Unsupervised learning methods may improve the SIU referral process by identifying more claims and new types of fraud
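The “unusual” versus “too usual” idea from the unsupervised bullet can be illustrated with a very small sketch. It is not the STATISTICA method: it simply standardizes each numeric claim field and measures how far each claim sits from the average claim. The field choice, the sample values, and both cutoffs are hypothetical.

```python
import math
import statistics

def distance_from_typical(claims):
    """Distance of each claim from the 'average' claim, in z-score units."""
    columns = list(zip(*claims))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    spreads = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(row, means, spreads)))
        for row in claims
    ]

def label_claims(claims, unusual_cutoff=2.0, too_usual_cutoff=0.3):
    """Label claims far from the average 'unusual' and suspiciously close ones 'too usual'."""
    labels = []
    for d in distance_from_typical(claims):
        if d >= unusual_cutoff:
            labels.append("unusual")
        elif d <= too_usual_cutoff:
            labels.append("too usual")
        else:
            labels.append("typical")
    return labels

# Hypothetical claims: (claim amount in $, days from loss to report, number of treatments)
claims = [
    (1000, 2, 3),
    (1200, 3, 4),
    (900, 1, 2),
    (1100, 2, 3),
    (1050, 2, 3),
    (9000, 30, 25),
]
labels = label_claims(claims)
for claim, label in zip(claims, labels):
    print(claim, label)
```

No outcome variable is needed here, which is the point of the unsupervised approach; in practice a clustering or density method would replace the single "average claim" used in this sketch.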
What: Your Data and Past Claim Experience Are Leveraged
■ Your data surrounding past fraud cases
  ■ Loss date against policy effective date
  ■ Time and location of loss against type of injury
  ■ Text data (letters, faxes, claim notes, police reports, etc.)
■ Any third-party data available or that you leverage today…
  ■ NICB
  ■ State Insurance Department
  ■ CLUE, ISO
■ Adjustor experience
  ■ Interview: “What do you typically look for, or what makes a claim look suspicious to you?”
  ■ Location, circumstances, timing, injury types, …

How: Predictor Variables are Generated
■ The claims process extends over time; the earlier fraud activity can be discovered, the better (the greater the savings)
  ■ For example, a flag capturing that treatment for back pain by a chiropractor is ongoing 2 years after the claim is not very useful
  ■ A flag capturing that back pain was listed as one of the types of personal injury, and that a chiropractor was engaged to treat the pain within 10 days, can be useful
■ In general, identify variables that, based on the experience of SIU professionals, raise “red flags” (“red flag variables”)
  ■ Immediate involvement of a lawyer, certain types of injuries, etc.
  ■ Derived variables such as unusual geographic distance of medical care provider (e.g., chiropractor) from claimant home address
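The “red flag” variables described above can be sketched as simple derived fields. The example below is a minimal illustration, not a STATISTICA feature: the `Claim` record, its field names, and the 7-day and 50-mile cutoffs are hypothetical, while the 10-day chiropractor window and the comparison of loss date against policy effective date come from the slides (the 30-day inception window appears later in the deck).

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

EARTH_RADIUS_MILES = 3958.8

@dataclass
class Claim:
    # Hypothetical record layout, for illustration only.
    policy_effective: date
    loss_date: date
    back_pain_listed: bool
    first_chiropractor_visit: Optional[date]
    attorney_retained: Optional[date]
    home: Tuple[float, float]       # (latitude, longitude) of claimant home
    provider: Tuple[float, float]   # (latitude, longitude) of care provider

def haversine_miles(a, b):
    """Great-circle distance between two (lat, lon) points, in miles."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def red_flag_variables(c, chiro_days=10, inception_days=30,
                       attorney_days=7, far_miles=50.0):
    """Derive early-warning predictors from structured claim data."""
    days_to_loss = (c.loss_date - c.policy_effective).days
    early_chiro = (
        c.back_pain_listed
        and c.first_chiropractor_visit is not None
        and 0 <= (c.first_chiropractor_visit - c.loss_date).days <= chiro_days
    )
    early_attorney = (
        c.attorney_retained is not None
        and 0 <= (c.attorney_retained - c.loss_date).days <= attorney_days
    )
    distance = haversine_miles(c.home, c.provider)
    return {
        "days_policy_to_loss": days_to_loss,
        "loss_within_30_days": 0 <= days_to_loss <= inception_days,
        "early_chiropractor": early_chiro,
        "early_attorney": early_attorney,
        "provider_distance_miles": distance,
        "distant_provider": distance > far_miles,
    }

claim = Claim(
    policy_effective=date(2010, 9, 20),
    loss_date=date(2010, 10, 5),
    back_pain_listed=True,
    first_chiropractor_visit=date(2010, 10, 12),
    attorney_retained=date(2010, 10, 8),
    home=(36.154, -95.993),       # Tulsa, OK (approximate)
    provider=(35.468, -97.516),   # Oklahoma City, OK (approximate)
)
flags = red_flag_variables(claim)
print(flags)
```

Each flag is computable within days of the loss, which is what makes it useful for early scoring.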
Approach: Different fraud models at first notice of loss, then rescoring the claim each time more information is collected

Why: The Result is the ability to quickly generate an accurate Fraud Prediction on each new and existing claim
■ Once predictor variables are identified, the relationships between them are computed to create a formula. For example, given the following variables:
  ■ A represents distance to therapy office from residence
  ■ B represents attorney involvement (1 = Y, 0 = N)
  ■ C represents soft tissue injury (1 = Y, 0 = N)
  ■ D represents loss date within 30 days of policy inception (value derived from structured data)
  ■ E represents targeted pharmaceuticals involved (1 = Y, 0 = N)
  ■ N represents the remaining predictors
■ The STATISTICA platform will generate a predictive fraud model based on your data and experience that would look something like:
  ■ Fraud Score = 0.1A + 0.3B + 0.21C + 0.50D + 0.21E + (x)N…
■ Then every bit of relevant (predictive) information collected on every claim would be run against this model (i.e., data plugged into A…N) to determine the claim’s probability of being fraudulent
■ A score is generated and then action is taken based on your business rules (e.g., all claims with fraud scores over 80 get referred to the SIU)

LiveScore in Claim Workflow

Predictive Claim Workflow Example
Initially this case looks like a routine claim…

Various Medical Bills Received and other related expenses entered
I stayed home from my job as a teacher for one week. I had follow-up treatment with my family physician, Dr. Harvey Stein, six days later.
He told me to continue icing three times a day, and referred me to a physical therapist for my neck and back. I saw Julie Lyons, RPT, for 4 weeks, twice a week, and then for 4 more weeks, once a week. I am still doing the stretching and strengthening exercises at home. I’ve gone back to see Dr. Stein twice and have another appointment with him next week. I still have quite a bit of pain in my neck and back.

My medical bills totaled $3,450 as follows (copies of bills attached):
Ambulance: $650
Hospital E.R., x-rays, exam, neck brace: $490
Dr. Stein: $225
Julie Lyons, RPT: $1216
Prescriptions: Flexeril, Vicodin: $219

I have lost wages in the amount of $1000. (Documentation attached.) As a result of the accident, I had to cancel reservations for a conference. The nonrefundable fee was $240. (Receipt attached.) As a result of being hit by Mr. Smith’s car, I couldn’t take my children to school and back for a week. I hired someone to help with that for $75. (Receipt attached.) I also had to hire a cleaning person to take care of the house and I will continue to need someone as long as I have pain in my neck and back. So far, this has cost me $600. (Cancelled checks attached.)

Claim Re-Scored after recent payment request
IPC flag triggered from predictive model, given recent letter and related expenses, to reduce propensity to contact lawyer
Text mining also invoked to improve overall predictive model accuracy and optimize service levels
Goal: to provide the “right” level of service given ever-changing circumstances

SOFTWARE PRESENTATION
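The Fraud Score formula from the “Why” slide, the “scores over 80 go to SIU” business rule, and the capacity-based SIU list described earlier can be tied together in a short sketch. This is an assumption-laden illustration, not the model STATISTICA would produce: the weights are the illustrative coefficients from the slide, the remaining predictors N are omitted, distance A is assumed to be capped at 100 miles and scaled to 0–1, and the raw sum is rescaled to 0–100 so the threshold of 80 has a meaning. The claim IDs and values are made up.

```python
# Illustrative coefficients from the slide; a real model would be fitted to data.
WEIGHTS = {"A": 0.10, "B": 0.30, "C": 0.21, "D": 0.50, "E": 0.21}
MAX_RAW = sum(WEIGHTS.values())  # largest possible raw score under the scaling below

def fraud_score(distance_miles, attorney, soft_tissue,
                loss_within_30_days, targeted_rx, distance_cap=100.0):
    """Weighted sum of predictors, rescaled to 0-100 (scaling is an assumption)."""
    a = min(distance_miles, distance_cap) / distance_cap
    raw = (WEIGHTS["A"] * a
           + WEIGHTS["B"] * int(attorney)
           + WEIGHTS["C"] * int(soft_tissue)
           + WEIGHTS["D"] * int(loss_within_30_days)
           + WEIGHTS["E"] * int(targeted_rx))
    return 100.0 * raw / MAX_RAW

def siu_referrals(scores, threshold=80.0, capacity=None):
    """Claim IDs scoring above the threshold, highest first, trimmed to SIU capacity."""
    ranked = sorted(
        (claim_id for claim_id, s in scores.items() if s > threshold),
        key=lambda claim_id: scores[claim_id],
        reverse=True,
    )
    return ranked if capacity is None else ranked[:capacity]

# Hypothetical claims: (distance, attorney, soft tissue, loss within 30 days, targeted Rx)
claims = {
    "C-1": (5, False, True, False, False),
    "C-2": (98, True, True, True, True),
    "C-3": (40, True, True, True, False),
    "C-4": (60, True, True, True, True),
}
scores = {claim_id: fraud_score(*fields) for claim_id, fields in claims.items()}
print(siu_referrals(scores))              # every claim above the threshold
print(siu_referrals(scores, capacity=1))  # only what the SIU can handle today
```

Rescoring a claim after new information arrives is just another call to `fraud_score` with the updated fields, followed by rebuilding the referral list.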
TEXT MINING OVERVIEW
To Identify New Types of Fraud and Increase Fraud Model Accuracy

Text Mining Summary (Statistical Natural Language Processing)
■ Goal is to incorporate unstructured text into predictive modeling
■ Particularly well suited for fraud detection and estimating loss, from
  ■ First notice of loss, accident descriptions, adjustor notes
  ■ Emails, letters, faxes
  ■ Claim description
■ General approach is simple:
  ■ Narratives (PDF, Word, etc.) and adjustor notes are extracted from your claims database
  ■ Notes are pre-processed to correct spelling errors, etc.
  ■ Find and count phrases, words, etc. of a priori interest, or use statistical methods to extract terms or phrases diagnostic of fraud, loss, etc.
    ■ “back pain”, “chiropractor”, “headaches”, “soft tissue”, …
  ■ Include word/phrase counts (incidences, transformed word frequencies) in modeling
  ■ Automate (“deploy”) the text-mining “model” to score new text data (e.g., claims as they are filed)

Text Mining: Illustration where a predictive fraud model is created from the following text file
■ Text or unstructured data extracted from claim document file(s)
■ Raw data and extracted terms, model generated, and fraud scores assigned in batch or real time

Effective Text Mining for Predictive Modeling: Text Mining Details (1)
■ Example: Finding unusual narratives of aircraft accidents
■ Many “text-mining” approaches and solutions are geared towards finding “common phrases”, etc.
■ This is usually not very interesting…
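The “find and count phrases” step from the Text Mining Summary can be sketched in a few lines. This is a minimal illustration rather than how STATISTICA Text Miner works internally: the term list is taken from the slide's examples plus “attorney”, pre-processing is limited to lowercasing and whitespace cleanup (no spelling correction), the log transform is one common choice of “transformed word frequency”, and the sample note is invented.

```python
import math
import re

# Terms of a priori interest; the first four come from the slide, "attorney" is added here.
TERMS = ["back pain", "chiropractor", "headaches", "soft tissue", "attorney"]

def term_counts(note, terms=TERMS):
    """Count whole-word occurrences of each term (optionally pluralized) in a note."""
    text = re.sub(r"\s+", " ", note.lower())
    return {
        term: len(re.findall(r"\b" + re.escape(term) + r"s?\b", text))
        for term in terms
    }

def transformed_frequencies(counts):
    """Dampen raw counts with log(1 + count) so repeated terms do not dominate."""
    return {term: math.log1p(n) for term, n in counts.items()}

note = ("Claimant reports back pain and headaches. Saw a chiropractor twice; "
        "chiropractor recommends more visits for the back pain.")
counts = term_counts(note)
features = transformed_frequencies(counts)
print(counts)
print(features)
```

The resulting counts or transformed frequencies become ordinary predictor columns alongside the structured claim data.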
Effective Text Mining for Predictive Modeling: Text Mining Details (2)
■ STATISTICA Text Miner is optimized for:
  ■ Performance (multithreaded indexing)
  ■ Easy deployment of text models, for efficient scoring
■ For example, building models for automatic detection of “unusual narratives”:
  ■ This can be accomplished through automatic
    ■ Latent semantic indexing of claims
    ■ Identifying unusual or “very-usual” narratives that do not belong to any cluster

Score Claim with new Text Data
Claim Narrative: 10-12-2010 Spoke with claimant and injury seems not to have affected work or daily routine. Will pend for follow-up in 2 weeks.
New claim note added 10-12: Low Fraud Propensity

Two weeks later, new claim note added
Claim Narrative: 10-28-10 Spoke to claimant, told me that after accident back was fine, yesterday went to Chiropractor and learned that more treatments will be needed to relieve the pain. Also mentioned that friend told her that this pain could be chronic and last for years and that she should talk to an attorney.
New claim note added 10-28: Fraud Propensity now changes based on scoring of the new information. Alert generated to refer claim to SIU.

BENEFITS AND CUSTOMER EXAMPLES
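The two claim notes above show the rescoring idea: the same claim moves from low fraud propensity to an SIU alert once new text arrives. The sketch below mimics that behavior with a deliberately simple stand-in for a text model. The term weights and the alert threshold are hypothetical (a real model would learn them from previously investigated claims), and only the note text is taken from the slides.

```python
import re

# Hypothetical term weights and threshold, chosen only to illustrate rescoring.
TERM_WEIGHTS = {"chiropractor": 30, "attorney": 40, "chronic": 15, "more treatments": 15}
ALERT_THRESHOLD = 80

def text_fraud_propensity(notes):
    """Score all notes on a claim so far: sum the weights of terms that appear, capped at 100."""
    text = " ".join(notes).lower()
    score = sum(
        weight
        for term, weight in TERM_WEIGHTS.items()
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )
    return min(score, 100)

incoming = [
    ("10-12-2010",
     "Spoke with claimant and injury seems not to have affected work or daily "
     "routine. Will pend for follow-up in 2 weeks."),
    ("10-28-10",
     "Spoke to claimant, told me that after accident back was fine, yesterday went "
     "to Chiropractor and learned that more treatments will be needed to relieve "
     "the pain. Also mentioned that friend told her that this pain could be chronic "
     "and last for years and that she should talk to an attorney."),
]

notes = []
history = []
for note_date, note in incoming:
    notes.append(note)                     # claim file grows as notes are added
    score = text_fraud_propensity(notes)   # rescore on every new note
    alert = score >= ALERT_THRESHOLD
    history.append((note_date, score, alert))
    print(note_date, score, "refer to SIU" if alert else "no action")
```

The claim is rescored against its full note history each time, so an alert can fire on the note that tips the balance rather than only at first notice of loss.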
Example: Commercial Property & Casualty Insurance Company
■ Background
  ■ Commercial Property and Casualty Insurance Company (auto, disability, property, etc., product lines)
  ■ Predictive modeling in support of underwriting and fraud detection applications
■ Applications
  ■ Underwriting: Actuaries use historical loss data to determine the factors driving claims risks and develop predictive models of loss -> Agents use applications that score policy applicants
  ■ Fraud detection: Actuaries build predictive models to determine characteristics of fraudulent claims -> As new claims are processed, the models flag them for investigators
ROI: For one product line, an 800%+ expected return in the 1st year by using text mining and data mining to uncover fraudulent claims and opportunities for subrogation

Example: Workers Compensation Insurance Company
■ Background
  ■ Disability and Workers’ Compensation Insurance
  ■ Predictive modeling in support of underwriting
■ Applied text mining to their claims analysis
  ■ Text mining of claims reports provides extra accuracy in uncovering the key factors driving historical losses
■ Underwriting Application and STATISTICA Live Score
  ■ Use Web-based application to support agents writing policies
  ■ Models built using STATISTICA Data Miner/STATISTICA Text Miner are deployed for real-time scoring to STATISTICA Live Score
  ■ STATISTICA Live Score integrates with the Web-based application using Web Services
ROI:
• Increased accuracy in policy underwriting
• Decreased IT costs by migrating from in-house scoring application to
• Dramatically reduced false positives to SIU
• Next step: Implementing Predictive Claim flow

Thank you!
■ Overall impression? Can you see the value to Mercury Insurance?
With Predictive Modeling combined with Text Mining, you can now:
■ Catch fraud earlier, before too many claim payments are sent
■ Identify problem claims earlier
■ Identify low-touch, low-complexity claims earlier to provide better service at reduced cost
■ Identify new types of fraud that were previously undetected
■ Any remaining questions?
■ Should we show this to others? Would you like a Software Presentation to reinforce our outrageous claims of cost savings?
■ Next Steps?

STATSOFT
History, Experience, Capabilities

STATISTICA Adoption
STATISTICA Rated Highest in Customer Satisfaction*
* 2010 Rexer Survey: Full report available at www.rexeranalytics.com

Text Mining
STATISTICA Text Miner #1 Across Industries*
* 2010 Rexer Survey