Practical GLM Modeling of Deductibles
David Cummings, State Farm Insurance Companies

Overview
• Traditional deductible analyses
• GLM approaches to deductibles
• Tests on simulated data

Empirical Method
• All losses at the $500 deductible: $1,000,000
• Losses eliminated by a $1,000 deductible: $100,000
• Loss elimination ratio: 10%

Empirical Method
• Pros
– Simple
• Cons
– Needs credible data at the low deductible
– No $1,000-deductible data is used to price the $1,000 deductible

Loss Distribution Method
• Fit a severity distribution to the data
[Figure: fitted severity density, losses 0 to 10,000]

Loss Distribution Method
• Fit a severity distribution to the data
• Calculate the expected value of the truncated distribution
[Figure: fitted severity density, losses 0 to 10,000]

Loss Distribution Method
• Pros
– Provides a framework to relate data at different deductibles
– Direct calculation for any deductible
• Cons
– Needs to reflect other rating factors
– The framework may be too rigid

Complications
• Deductible truncation is not clean
• "Pseudo-deductible" effect
– Due to claims awareness and self-selection
– May be difficult to detect in the severity distribution
[Figure: severity distribution, losses 0 to 10,000]

GLM Modeling Approaches
1. Fit a severity distribution using other rating variables
2. Use deductible as a variable in severity and frequency models
3. Use deductible as a variable in a pure premium model

GLM Approach 1 – Fit Distribution with Variables
• Fit a severity model
• The linear predictor relates to the untruncated mean
• Maximum likelihood estimation adjusted for truncation
• Reference: Guiahi, "Fitting Loss Distributions with Emphasis on Rating Variables," CAS Winter Forum, 2001

GLM Approach 1 – Fit Distribution with Variables
• X = untruncated loss random variable ~ Gamma
• Y = loss data, net of deductible d
• log(μ_X) = β0 + β1·v1 + … + βn·vn
• f_Y(y) = f_X(y + d; θ_X) / (1 − F_X(d; θ_X))

GLM Approach 1 – Fit Distribution with Variables
• Pros
– Applies GLM within this framework
– Directly models the truncation
• Cons
– Non-standard GLM application
– Difficult to adapt to a rate plan
– No frequency data used in the model

Practical Issues
• No standard statistical software
– Complicates the analysis
– Less computationally efficient
– The truncated density f_Y(y) = f_X(y + d; θ_X) / (1 − F_X(d; θ_X)) is not a member of the exponential family of distributions

Practical Issues
• No clear translation into a rate plan
– The deductible effect depends on the mean
– The mean depends on all the other variables
– Hence the deductible effect varies by the other variables

Practical Issues
• No use of frequency information
– Frequency effects are derived from the severity fit, via the survival probability 1 − F_X(y + d; θ_X)
– This loses information

GLM Approach 2 – Frequency/Severity Model
• Standard GLM approach
• Fit separate frequency and severity models
• Use deductible as an independent variable

GLM Approach 2 – Frequency/Severity Model
• Pros
– Uses standard GLM packages
– Incorporates deductible effects on both frequency and severity
– Allows model forms that fit the rate plan
• Cons
– Potential inconsistency between the two models
– Specification of the deductible effects

Test Data
• Simulated data
– 1,000,000 policies
– 80,000 claims
• Risk characteristics
– Amount of insurance
– Deductible
– Construction
– Alarm system
• Gamma severity distribution
• Poisson frequency distribution

Conclusions from Test Data –
Frequency/Severity Models
• Deductible as a categorical variable
– Good overall fit
– Highly variable estimates for higher or less common deductibles
– When the amount-of-insurance effect is misspecified, an interaction term improves model fit
[Figure: severity relativities using the categorical variable, deductibles 0 to 10,000]

Conclusions from Test Data – Frequency/Severity Models
• Deductible as a continuous variable
– Transformations with the best likelihood:
• Ratio of deductible to coverage amount
• Log of deductible
– Interaction terms with the coverage amount improve model fit
– Carefully examine the results for inconsistencies
[Figures: frequency, severity, and pure premium relativities by deductible (0 to 5,000) for $100,000 and $500,000 coverage amounts]

GLM Approach 3 – Pure Premium Model
• Fit a pure premium model using the Tweedie distribution
• Use deductible as an independent variable

GLM Approach 3 – Pure Premium Model
• Pros
– Incorporates frequency and severity effects simultaneously
– Ensures consistency
– Analogous to the empirical LER
• Cons
– Specification of the deductible effects

Conclusions from Test Data – Pure Premium Models
• Deductible as a categorical variable
– Good overall fit
– Some highly variable estimates
• Good fit with some continuous transforms
– Inconsistencies can be avoided with a good choice of transform

Extension of GLM – Dispersion Modeling
• Double GLM
• Iteratively fit two models
– A mean model fit to the data
– A dispersion model fit to the residuals
• Reference: Smyth and Jørgensen, "Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling," ASTIN Bulletin, 32:143–157

Double GLM in Modeling Deductibles
• The gamma distribution assumes that the variance is proportional to µ²
• Deductible effect on severity
– The mean increases
– The variance increases more gradually
• Double GLM significantly improves model fit on the test data
– More significant than the interaction terms
[Figure: pure premium relativities, Tweedie model at $500,000 coverage amount, deductibles 0 to 5,000, constant dispersion vs. Double GLM]

Conclusion
• Deductible modeling is difficult
• A Tweedie model with Double GLM appears to be the best approach
• Categorical vs. continuous: various models need to be compared
• Interaction terms may be important
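
As a closing illustration, the empirical loss-elimination method from the opening slides can be sketched in a few lines. This is a minimal sketch with hypothetical loss amounts, not the presenter's code; the `empirical_ler` helper is invented here, and only the $500/$1,000 deductible pair mirrors the slide example.

```python
# Empirical loss-elimination-ratio (LER) calculation from ground-up losses.
# The loss amounts below are hypothetical; the $500 and $1,000 deductibles
# mirror the example on the slides.

def empirical_ler(ground_up_losses, low_ded, high_ded):
    """LER for moving from low_ded to high_ded, as a share of the
    losses paid at the low deductible."""
    at_low = sum(max(x - low_ded, 0.0) for x in ground_up_losses)
    at_high = sum(max(x - high_ded, 0.0) for x in ground_up_losses)
    return (at_low - at_high) / at_low

losses = [700.0, 1200.0, 2500.0, 5000.0, 800.0]   # hypothetical claims
print(round(empirical_ler(losses, 500.0, 1000.0), 3))   # prints 0.26
```

This also makes the first "con" concrete: the ratio uses only losses observed at the low deductible, so policies already written at $1,000 contribute nothing.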
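Approach 1's truncated likelihood, f_Y(y) = f_X(y + d; θ_X) / (1 − F_X(d; θ_X)) with a log link on the untruncated gamma mean, falls outside the exponential family, so it must be maximized numerically rather than by a standard GLM routine. The sketch below is a hypothetical illustration on simulated data with a single made-up rating variable, assuming NumPy and SciPy are available:

```python
# Sketch of Approach 1: truncated-gamma maximum likelihood with a log link
# on the untruncated mean. All data and parameter values are simulated.
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, shape, ded = 5000, 2.0, 500.0
v = rng.binomial(1, 0.5, size=n)            # one hypothetical rating variable
mu = np.exp(8.0 + 0.3 * v)                  # untruncated mean, log link
x = rng.gamma(shape, mu / shape)            # ground-up losses
keep = x > ded                              # only claims exceeding the deductible
y, v_obs = x[keep] - ded, v[keep]           # losses net of deductible

def neg_loglik(params):
    b0, b1, log_shape = params
    a = np.exp(log_shape)                   # gamma shape parameter
    scale = np.exp(b0 + b1 * v_obs) / a     # scale = mean / shape
    # f_Y(y) = f_X(y + d; theta) / (1 - F_X(d; theta))
    logf = stats.gamma.logpdf(y + ded, a, scale=scale)
    logS = stats.gamma.logsf(ded, a, scale=scale)
    return -np.sum(logf - logS)

fit = minimize(neg_loglik, x0=[7.0, 0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 2000})
print(fit.x)    # b0, b1 estimates should land near 8.0 and 0.3
```

The generic optimizer in place of IRLS is exactly the "less computationally efficient, no standard software" point from the Practical Issues slides.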
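Finally, the Tweedie pure premium model (Approach 3) with a dispersion step in the spirit of the double GLM can be sketched as below. This is a simplified, hypothetical illustration using scikit-learn's `TweedieRegressor` and `GammaRegressor`, not the full Smyth–Jørgensen algorithm (their dispersion model is fit to unit deviances and iterated to convergence; here a crude squared-residual statistic and a single pass stand in for it):

```python
# Sketch of Approach 3 with a double-GLM step: a Tweedie mean model, a gamma
# dispersion model fit to residuals, then a reweighted mean fit. Data are
# simulated and all parameter values are hypothetical.
import numpy as np
from sklearn.linear_model import TweedieRegressor, GammaRegressor

rng = np.random.default_rng(1)
n = 20000
ded = rng.choice([250.0, 500.0, 1000.0], size=n)
X = np.log(ded).reshape(-1, 1)              # deductible as continuous variable

# Compound Poisson pure premiums: frequency falls as the deductible rises.
lam = 0.08 * (500.0 / ded) ** 0.3
counts = rng.poisson(lam)
pp = np.array([rng.gamma(2.0, 1500.0, size=k).sum() for k in counts])

# Step 1: mean model (Tweedie, log link; 1 < power < 2 for pure premiums).
mean_model = TweedieRegressor(power=1.5, alpha=0.0, max_iter=1000).fit(X, pp)
mu = mean_model.predict(X)

# Step 2: dispersion model fit to scaled squared residuals.
d = (pp - mu) ** 2 / mu ** 1.5              # crude per-record dispersion estimate
disp_model = GammaRegressor(alpha=0.0, max_iter=1000).fit(X, d)

# Step 3: refit the mean model weighted by 1 / fitted dispersion.
w = 1.0 / disp_model.predict(X)
mean_model2 = TweedieRegressor(power=1.5, alpha=0.0, max_iter=1000).fit(
    X, pp, sample_weight=w)
print(mean_model2.coef_)    # slope on log(deductible) should be negative
```

Because pure premium is modeled directly, the frequency and severity effects of the deductible cannot contradict each other, which is the consistency advantage claimed for Approach 3.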