Predictive Modeling: Rules of Thumb for Communicators

Predictive Modeling Seminar
Insurance Marketing Communications Association
Chicago, IL
September 18, 2007

Download at http://www.iii.org/media/presentations/predictivemodeling/

Robert P. Hartwig, Ph.D., CPCU, President
Insurance Information Institute
110 William Street, New York, NY 10038
Tel: (212) 346-5520  Fax: (212) 732-1916
bobh@iii.org  www.iii.org

PREDICTIVE MODELING: The Basics

Predictive Modeling: Communications Challenges

• Predictive Modeling Can Be Complex
    Actuaries/Economists use a variety of statistical techniques
    Understanding how they work requires formal statistical training
    Underwriters apply them, usually as part of an already sophisticated and automated underwriting process
• Use of Some Predictive Factors/Models May Not be Intuitive
• Usage Often Not Explained or Even Revealed to Communicators
• Benefits Not Well Articulated to Communicators or Customers
• Failure to Recognize & Enlist Agents as Communicators
• Communications Obstacles in the Regulatory Context
    Regulators may have difficulty understanding
    Tendency is to react negatively
    May seize on issue for political gain
• Models Maximize for Statistical Accuracy
    Some May Feel Models Are Too Impersonal
    Invasion of Privacy Concerns?

Predictive Modeling: What is It?

• What is Predictive Modeling?
    While people (even within the insurance industry) tend to view it as new, it is in fact quite old, as old as insurance itself.

DEFINITION: Predictive modeling is a process used to create a statistical model of future behavior. In insurance, predictive models are primarily concerned with forecasting probabilities, trends and relativities.*

A predictive model is made up of a number of predictors, variable factors that are likely to influence future behavior or results. In auto insurance, for example, a customer's gender, driving experience, type of vehicle, driving record, miles driven, etc., help predict the likelihood and cost of future claims.
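As a rough illustration of how predictors can feed a rate, the sketch below multiplies a base rate by one relativity per predictor. It is a minimal sketch under stated assumptions: the multiplicative form is assumed for simplicity, and the function name `indicated_premium`, the base rate and every relativity value are hypothetical, invented only for illustration. None of them come from this presentation or from any insurer's rating plan.

```python
from math import prod

# Hypothetical base rate and relativities, invented for illustration only.
BASE_RATE = 600.00  # assumed annual base premium in dollars

RELATIVITIES = {
    "driving_experience": {"under_3_years": 1.40, "3_plus_years": 1.00},
    "vehicle_type": {"sedan": 1.00, "sports_car": 1.25},
    "driving_record": {"clean": 0.90, "one_violation": 1.15},
}


def indicated_premium(profile: dict[str, str]) -> float:
    """Multiply the base rate by the relativity for each predictor value."""
    factors = [RELATIVITIES[name][value] for name, value in profile.items()]
    return round(BASE_RATE * prod(factors), 2)


# Two hypothetical policyholders with different predictor values.
experienced = indicated_premium(
    {"driving_experience": "3_plus_years", "vehicle_type": "sedan", "driving_record": "clean"}
)
novice = indicated_premium(
    {"driving_experience": "under_3_years", "vehicle_type": "sports_car", "driving_record": "one_violation"}
)
print(experienced, novice)  # prints 540.0 1207.5
```

Each predictor contributes one factor, so the example premium differs only because the predictor values differ. A real model would estimate the relativities from loss data rather than assume them.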
To create a predictive model, data is collected for the relevant predictors, a statistical model is formulated, predictions are made and the model is validated (or revised) as additional data becomes available. The models may be simple or extremely complex and employ a wide variety of statistical techniques.

*Adapted and modified by the Insurance Information Institute from www.searchdatamanagement.com, accessed Sept. 16, 2007.

Predictive Modeling: Why Do We Hear So Much About it Today?

• Insurers rewrote their entire auto and homeowners books of business beginning in the late 1990s/early 2000s in response to huge losses in both of these key lines (which together account for nearly 50% of industry premiums)
• This re-underwriting process was effectively a re-evaluation of the risk presented by each policyholder and the adequacy of the premium paid by the policyholder to transfer that risk
• In most cases the premium was inadequate and premiums rose
• The re-underwriting process included the use of sophisticated new models designed to better match price with risk
• By definition, these models included more and better rating factors as well as new statistical methodologies for gauging interactions between these factors
• Policyholders and regulators incorrectly viewed the new factors in the models as being solely responsible for the increase
• Credit-based “Insurance Scores” are the best-known example

Private Passenger Auto (PPA) Combined Ratio

PPA is the profit juggernaut of the p/c insurance industry today
Auto insurers have shown significant improvement in PPA after re-underwriting their entire book of business in the early 2000s
Average Combined Ratio for 1993 to 2005: 101.0

[Chart: PPA combined ratio by year, 1993-2006]
Sources: A.M. Best; III

Predictive Modeling: Why Now?
• Predictive modeling is not new; it is a big issue in most industries
• Some form of it has been around since the earliest days of insurance, used in personal and commercial lines
• In recent years the costs of data storage and acquisition have declined, as has the cost of computing power
• More data is available to insurers today at lower cost
• Powerful computers make analysis (mining) of this data easier, faster and more fruitful
• Public and regulators have pushed for more individualized rates (and less reliance on factors like territory)
• Insurers responded by accelerating the trend toward individual risk rating: smaller pools of increasingly homogeneous individuals
• Consequently, rating systems are becoming fairer & more accurate
• Implies that subsidies are being removed from the system
• Recipients of subsidies don’t like their removal, nor do regulators who view insurance as an extension of the social welfare system

Insurance Scores: The Perfect Example of a Communications Breakdown

• Insurers began to implement the use of credit-based insurance scores in the early/mid-1990s, but not on a large scale until the late 1990s/very early 2000s
• Insurers had found that scores were among the most accurate of all rating factors for predicting future loss
• Roll-out and use of credit was not communicated to most key personnel who come in contact with customers, regulators or media
• Why credit works was not intuitive for most people (e.g., what does credit information have to do with my driving ability?)
• Agents dislike having to explain why premiums rose due to credit factors
• Special cases that warranted special treatment abounded: no credit, life-changing events, identity theft
• Consumer protections formalized only later (e.g., NCOIL)
• Race issue became (and remains) big (but is a red herring)

PREDICTIVE MODELING: JUST PART OF THE RATEMAKING & UNDERWRITING PROCESS

Predictive Data Can Be Historical, Class or Individual Specific

• Historical Information: Used to identify trends in data
    Actuaries use a variety of statistical techniques; get base rate
• Class Rating
    Data are adjusted for geographic, industry-specific or other factors statistically correlated with risk of future loss
    E.g., urban zip codes = greater accident frequency
    E.g., occupation in workers comp
• Individual Risk Rating
    Policyholder-specific risk factors are taken into account
    E.g., model of car; wood frame vs. masonry home; office vs. construction worker
    Credit profile
    “Black box” data; FUTURE: GPS tracking (on voluntary basis)
• Experience Rating
    Adjustments made to premium based on policyholder’s past claim filing activity

UNDERWRITING: Key to Accurate Risk Assessments & Rates

What is Underwriting?
• Underwriting
    Process by which insurer determines whether policy should be issued and on what terms
• Complex Process
    Many market and individual factors considered
    All relate to riskiness/likelihood of loss
• Insurers All Use Underwriting Guidelines
    Helps keep insurers focused, disciplined, profitable, solvent
    E.g., no writing risks within 5 miles of coast, no high-rise construction risks, no limits above $1 million, no sports cars
• Underwriting Tools
    Objective is to improve accuracy of loss forecasts
    Creates a more fair, equitable rating system for all
    Premium is more closely associated with risk

RATING FACTORS
Helping to Match Premium Charged to Risk Assumed

Categories of Typical Auto Insurance Rating Factors/Criteria

• Vehicle Type Factors
• Use of Vehicle Factors
• Location (Territorial) Factors
• Driving History
• Prior Insurance
• Personal Factors
• Other

Typical Auto Insurance Rating Criteria

• Vehicle Type Factors
    Number of vehicles to be insured on policy
    Number of operators in household
    Make, model & body style of each vehicle
    Age of vehicle (model year)
    Safety features (e.g., airbags, anti-lock brakes)
    Anti-theft devices
• Use of Vehicle Factors
    Distance driven annually
    Commuting distance
    Number of days per week used to commute
    Who drives vehicle the most?
    Years of driving experience (youthful operator?)
    Use of vehicle for business purposes
• Location (Territorial) Factors
    Location where vehicle is kept
    Garage or street parking
• Driving History
    Accidents
    Moving violations
    Convictions (e.g., DUIs)
    Personal claims history
• Prior Insurance Factors
    Currently insured?
    Number of years with current insurer?
    Current Bodily Injury limits
• Personal Factors
    Marital status
    Gender
    Occupation
    Education
    Student?
    Homeowner?
• Other Factors
    Information from credit reports
    Drivers education, defensive driving course taken

Examples of Relationships Between Underwriting Criteria & Losses

Example 1: GENDER & AUTO INSURANCE

Sex of Drivers Involved in All Auto Crashes, 1994-2003

Males are involved in 50% more accidents on average

[Chart: millions of accidents per year by sex of driver (male, female), 1994-2003]
Source: National Safety Council; Insurance Information Institute 2005 Fact Book, p. 109.

Fatality Rate by Sex of Drivers Involved in Auto Crashes, 1994-2003

Males are 61% more likely to be killed in an auto accident

[Chart: fatalities per billion miles driven by sex of driver (male, female), 1994-2003]
Source: National Safety Council; Insurance Information Institute 2005 Fact Book, p. 109.

Example 2: DRIVER AGE

Accidents by Age of Driver, 2003

Teens are far more likely to be involved in an accident than the elderly (but the elderly are more likely to die in a crash)
Teens account for just 5% of drivers but 22% of accidents!
People 35-44, by contrast, represent 21% of drivers but just 16% of accidents

[Chart: percent of total drivers and share of accidents by age group (Under 20, 20-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75+)]
Source: National Safety Council; Insurance Information Institute

Example 3: INSURANCE SCORING (CREDIT)

Importance of Rating Factors by Coverage Type

Coverage        Factor 1     Factor 2     Factor 3
BI Liability    Age/Gender   Ins. Score   Geography
PD Liability    Age/Gender   Ins. Score   Geography
PIP             Ins. Score   Geography    Yrs. Insured
Med Pay         Ins. Score   Limit        Age/Gender
Comprehensive   Model Year   Age/Gender   Ins. Score
Collision       Model Year   Age/Gender   Ins. Score

Source: The Relationship of Credit-Based Insurance Scores to Private Passenger Automobile Insurance Loss Propensity, Michael Miller, FCAS and Richard Smith, FCAS (EPIC Actuaries), June 2003 (presented at June 2003 NAIC meeting).

Texas Auto: Average Loss per Policy (by Credit Score Decile, Total Market)

Average Loss = $695
Interpretation: Those with the poorest credit scores generated incurred losses 65% higher than those with the best scores
1st Decile = Lowest Credit Scores; 10th Decile = Highest Credit Scores

[Chart: average incurred loss per policy by score range (No Score, 1st through 10th decile)]
Source: University of Texas, Bureau of Business Research, March 2003.

Indicated Relative Pure Premium by Insurance Score (PD Liability)*

Interpretation: Those with the poorest credit scores had loss experience 33% above average, while those with the best scores had loss experience that was 19% below average

[Chart: relative pure premium by score range (No Hit/Thin File, 607, 659, 693, 722, 748, 774, 802, 837, 894, 997)]
Source: EPIC Actuaries, June 2003

Example: Credit Discount Can Save $100s per Year*

• Credit discount lowered annual premium by 14.7%
• Policyholder saved nearly $300
• Credit was single largest discount
• Opponents of credit will force people to pay more for coverage

Total Annual Savings from Discounts: $820

Discount                      Savings   Share
Credit-Related Discount       $296      36%
Good Driver Discount          $196      24%
Multipolicy Discount          $174      21%
Safety/Anti-Theft Discount    $154      19%

*Annualized savings based on semi-annual data from example
Source: Insurance Information Institute

Example 4: WORKER AGE
(A Workers Comp Example)

THE AGEING WORKFORCE
Age Could be Used as a Predictor of Occupational Injury and Loss, But it is Not

U.S. Workforce is Aging: Significant Implications for Workers Comp

Median Age of U.S. Worker

Older and less healthy workforce
The median age of US workers as the Baby Boomers begin to retire is about 41 years. Immigration will hold this number down and may even lower the figure.

[Chart: median age of U.S. worker by year, 1962-2008]
Source: US Bureau of Labor Statistics, 2004.

Fatal Work Injury Rates Climb Sharply With Age

Fatal Work Injuries per 100,000 Workers (2006)

Fatality rates for workers 65 and older are triple those of workers age 35-44. The workplace of the future will have to be completely redesigned to accommodate the surge in older workers.
Age is not used as an underwriting factor in WC. Should it be?

[Chart: fatal work injuries per 100,000 workers by age group (16-17, 18-19, 20-24, 25-34, 35-44, 45-54, 55-64, 65+)]
Source: US Bureau of Labor Statistics, US Department of Labor; Insurance Information Institute.

Example 5: WORKER WEIGHT
(Another Example Relevant to Workers Comp that is Not Used)

THE OBESITY EPIDEMIC
Major Cost Driver that WC Has Yet to Address

The most obese workers file twice as many WC claims and have 13 times more lost workdays than healthy weight workers
Obesity is not a rating factor, but it is an identifiable cost factor

[Chart: lost workdays and claims by BMI category: <18.5 (Underweight), 18.5-24.9 (Healthy Weight), 25-29.9 (Overweight), 30-34.9 (Obese Class I), 35-39.9 (Obese Class II), 40+ (Obese Class III)]
Source: Ostbye, T., et al, “Obesity and Workers Compensation,” J. of the American Medical Association, April 23, 2007.
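The "twice as many" and "13 times" statements on the obesity slide can be checked with simple division. The sketch below assumes that the chart's data labels 5.80 and 11.65 (claims per 100 FTEs) and 14.19 and 183.63 (lost workdays per 100 FTEs) belong to the healthy-weight and Obese Class III bars respectively. That pairing is an inference from the slide's stated ratios, not something read directly from the extracted chart, and the function name `times_as_many` is illustrative.

```python
def times_as_many(group_value: float, reference_value: float) -> float:
    """Return how many times larger group_value is than reference_value."""
    return group_value / reference_value


# Assumed pairing of chart data labels to BMI categories (an inference, see above).
claims = {"healthy_weight": 5.80, "obese_class_iii": 11.65}
lost_workdays = {"healthy_weight": 14.19, "obese_class_iii": 183.63}

claims_ratio = times_as_many(claims["obese_class_iii"], claims["healthy_weight"])
workdays_ratio = times_as_many(
    lost_workdays["obese_class_iii"], lost_workdays["healthy_weight"]
)

print(f"Claims: {claims_ratio:.1f}x, lost workdays: {workdays_ratio:.1f}x")
# prints "Claims: 2.0x, lost workdays: 12.9x"
```

Under this pairing the ratios come to about 2.0 and 12.9, which round to the slide's "twice" and "13 times".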
[Title and axes of the preceding chart: WC Claims and Lost Workdays by Body Mass Index (BMI); claims per 100 FTEs, lost workdays per 100 FTEs]

Medical & Indemnity WC Claims Costs by BMI

Medical claims costs are 6.8 times higher for the most obese workers and indemnity costs are 11 times higher

[Chart: medical claims costs and indemnity claims costs by BMI category: <18.5 (Underweight), 18.5-24.9 (Healthy Weight), 25-29.9 (Overweight), 30-34.9 (Obese Class I), 35-39.9 (Obese Class II), 40+ (Obese Class III)]
Source: Ostbye, T., et al, “Obesity and Workers Compensation,” J. of the American Medical Association, April 23, 2007.

Example 6: TERRITORY

Baltimore Relativity to State Loss Cost, 2001-2003

BI Liability costs in Baltimore are more than double (2.11 times) the state overall (i.e., 111% higher)
PD Liability costs in Baltimore are 47% higher than the state overall
PIP costs in Baltimore are triple the state overall (200% higher)

[Chart: relativity by coverage (Bodily Injury Liability, Property Damage Liability, Personal Injury Protection)]
*ISO territories 33, 35, 36 and 39.
Source: ISO.

Baltimore Relativity to State Loss Cost, 1988

Costs in Baltimore were well above average back in 1988 too, still are today, and will be in the future. This is a permanent feature of most major urban auto insurance markets.

[Chart: relativity by coverage (Bodily Injury Liability, Property Damage Liability, Personal Injury Protection)]
*ISO territories 33, 35, 36 and 39.
Source: ISO.

Are There Limits to What Predictive Modeling Can or Should Do?
• Predictive Modeling Increases Accuracy, Equity in Rates
    Incumbent on insurers to use this information subject to limits imposed by policymakers
• Advances in Data Storage, Retrieval, Computing Will Advance the Frontier of Predictive Models
• Concern that Individual Risk Rating Will Replace Risk Pooling is Absurd
    No model will ever be 100% accurate
    Some degree of pooling will always exist
• Societal Boundaries Will Always Exist
    Predictive modeling will never be used to its full potential
    Privacy/“Big Brother” concerns

Predictive Modeling: 6 Rules of Thumb for Communicators

1. EDUCATE: Educate Yourself to Develop Understanding of How Products Work
    Get to know actuaries and underwriters in your company
2. PARTICIPATE: Get Communications (not just Marketing) Involved at a Much Earlier Stage of Product Cycle
3. ANTICIPATE: Potential Communications Challenges Before Rollout
4. IDENTIFY: Subject Area Experts as Technical Resources
5. DISSEMINATE: Create Plan to Help Employees with Customer, Regulator & Media Contact Understand How Product Operates
6. COORDINATE: Ensure Marketing, Government Affairs, Customer Service, Agents all Operating from Same Playbook

Insurance Information Institute On-Line

If you would like a copy of this presentation, please give me your business card with e-mail address, or download at: http://www.iii.org/media/presentations/predictivemodeling/