Practical GLM Modeling: Deductibles

Practical GLM Modeling of Deductibles
David Cummings
State Farm Insurance Companies
Overview
• Traditional Deductible Analyses
• GLM Approaches to Deductibles
• Tests on simulated data
Empirical Method
All losses at $500 deductible: $1,000,000
Losses eliminated by $1000 deductible: $100,000
Loss Elimination Ratio: 10%
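The arithmetic on this slide can be sketched in a few lines of Python. The function names and the four sample losses are hypothetical; only the $1,000,000 and $100,000 totals come from the slide.

```python
def net_losses(ground_up_losses, deductible):
    """Total paid loss after applying a per-claim deductible."""
    return sum(max(x - deductible, 0.0) for x in ground_up_losses)


def empirical_ler(ground_up_losses, base_deductible, new_deductible):
    """Share of base-deductible losses eliminated by the higher deductible."""
    base = net_losses(ground_up_losses, base_deductible)
    new = net_losses(ground_up_losses, new_deductible)
    return (base - new) / base


# Totals from the slide: $100,000 eliminated out of $1,000,000.
slide_ler = 100_000 / 1_000_000

# Hypothetical ground-up losses on $500-deductible policies.
sample_losses = [700.0, 1500.0, 3000.0, 10500.0]
sample_ler = empirical_ler(sample_losses, 500, 1000)
```

Only losses observed on $500-deductible policies enter the calculation, which is the limitation noted in the cons that follow.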
Empirical Method
• Pros
– Simple
• Cons
– Need credible data at low deductible
– No $1000 deductible data is used to
price the $1000 deductible
Loss Distribution Method
• Fit a severity distribution to data
• Calculate expected value of truncated
distribution
[Figure: fitted severity distribution; horizontal axis from 0 to 10,000]
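As a minimal sketch of the two steps above, the example below swaps in an exponential severity because it has closed forms; the slides do not specify this distribution. For an exponential, the maximum likelihood estimate of the mean from payments net of deductible is simply their average, and the expected payment per ground-up loss at deductible d is theta * exp(-d / theta). All function names are hypothetical.

```python
import math


def fit_exponential_mean(net_payments):
    """MLE of the exponential mean from payments net of deductible.

    By memorylessness, the excess over any deductible is again
    exponential with the same mean, so truncation needs no adjustment.
    """
    return sum(net_payments) / len(net_payments)


def expected_payment_per_loss(theta, deductible):
    """E[(X - d)+] for an exponential ground-up loss X with mean theta."""
    return theta * math.exp(-deductible / theta)


def loss_elimination_ratio(theta, base_deductible, new_deductible):
    """Share of base-deductible losses eliminated by the new deductible."""
    base = expected_payment_per_loss(theta, base_deductible)
    new = expected_payment_per_loss(theta, new_deductible)
    return 1.0 - new / base


# Hypothetical payments net of deductible.
theta_hat = fit_exponential_mean([200.0, 1000.0, 2500.0, 10000.0])
ler_500_to_1000 = loss_elimination_ratio(theta_hat, 500, 1000)
```

Once the distribution is fitted, the same function prices any deductible, which is the "direct calculation" advantage of this method.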
Loss Distribution Method
• Pros
– Provides framework to relate data at
different deductibles
– Direct calculation for any deductible
• Cons
– Need to reflect other rating factors
– Framework may be too rigid
Complications
• Deductible truncation is not clean
• “Pseudo-deductible” effect
– Due to claims awareness/self-selection
– May be difficult to detect in severity
distribution
[Figure: severity distribution; horizontal axis from 0 to 10,000]
GLM Modeling Approaches
1. Fit severity distribution using other
rating variables
2. Use deductible as a variable in
severity/frequency models
3. Use deductible as a variable in
pure premium model
GLM Approach 1
– Fit Distribution w/ variables
• Fit a severity model
• Linear predictor relates to untruncated
mean
• Maximum likelihood estimation adjusted
for truncation
• Reference:
– Guiahi, “Fitting Loss Distributions with
Emphasis on Rating Variables”, CAS Winter
Forum, 2001
GLM Approach 1
– Fit Distribution w/ variables
X = untruncated random variable ~ Gamma
Y = loss data, net of deductible d
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$
$f_Y(y) = \dfrac{f_X(y + d;\, \mu_X)}{1 - F_X(d;\, \mu_X)}$
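The truncated likelihood above can be written out directly. The sketch below is an illustration under stated assumptions, not the procedure from the Guiahi reference: the Gamma is parameterized by a shape and by the mean from the linear predictor, the Gamma CDF uses a power series that is adequate only for moderate values of d relative to the scale, and all function names are hypothetical.

```python
import math


def gamma_logpdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - shape * math.log(scale) - math.lgamma(shape))


def gamma_cdf(x, shape, scale, max_terms=1000):
    """Gamma CDF via the series for the regularized incomplete gamma."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, max_terms):
        term *= z / (shape + n)
        total += term
        if term < 1e-16 * total:
            break
    return math.exp(shape * math.log(z) - z - math.lgamma(shape)) * total


def truncated_gamma_loglik(beta, shape, claims):
    """Sum of log f_Y(y) over claims given as (y, d, v) tuples.

    y is the payment net of deductible d; v holds the rating
    variables, with a leading 1.0 for the intercept beta[0].
    """
    loglik = 0.0
    for y, d, v in claims:
        mu = math.exp(sum(b * x for b, x in zip(beta, v)))  # untruncated mean
        scale = mu / shape
        loglik += (gamma_logpdf(y + d, shape, scale)
                   - math.log(1.0 - gamma_cdf(d, shape, scale)))
    return loglik
```

Maximizing this function over beta and the shape requires a general-purpose optimizer rather than a standard GLM routine.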
GLM Approach 1
– Fit Distribution w/ variables
• Pros
– Applies GLM within framework
– Directly models truncation
• Cons
– Non-standard GLM application
– Difficult to adapt to rate plan
– No frequency data used in model
Practical Issues
• No standard statistical software
– Complicates analysis
– Less computationally efficient
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$
$f_Y(y) = \dfrac{f_X(y + d;\, \mu_X)}{1 - F_X(d;\, \mu_X)}$
Not a member of Exponential Family of distributions
Practical Issues
• No clear translation into a rate plan
– Deductible effect depends on mean
– Mean depends on all other variables
– Deductible effect varies by other variables
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$
$f_Y(y) = \dfrac{f_X(y + d;\, \mu_X)}{1 - F_X(d;\, \mu_X)}$
Practical Issues
• No use of frequency information
– Frequency effects derived from
severity fit
$1 - F_X(y + d;\, \mu_X)$
– Loss of information
GLM Approach 2
– Frequency/Severity Model
• Standard GLM approach
• Fit separate frequency and
severity models
• Use deductible as independent
variable
GLM Approach 2
– Frequency/Severity Model
• Pros
– Utilizes standard GLM packages
– Incorporates deductible effects on
frequency and severity
– Allows model forms that fit rate plan
• Cons
– Potential inconsistency of models
– Specification of deductible effects
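A minimal sketch of Approach 2 in its simplest special case: when deductible is the only predictor and is treated as categorical, a log-link Poisson frequency model with an exposure offset and a log-link Gamma severity model weighted by claim count both reproduce the group averages, so the relativities can be computed without a GLM package. A realistic model would include the other rating variables and be fitted with standard GLM software. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def one_way_relativities(records, base_deductible):
    """Frequency, severity and pure premium relativities by deductible.

    records: iterable of (deductible, exposure, claim_count, total_paid).
    Every deductible level is assumed to have at least one claim.
    """
    exposure = defaultdict(float)
    claims = defaultdict(float)
    paid = defaultdict(float)
    for deductible, expo, count, loss in records:
        exposure[deductible] += expo
        claims[deductible] += count
        paid[deductible] += loss

    frequency = {d: claims[d] / exposure[d] for d in exposure}
    severity = {d: paid[d] / claims[d] for d in exposure}

    result = {}
    for d in exposure:
        freq_rel = frequency[d] / frequency[base_deductible]
        sev_rel = severity[d] / severity[base_deductible]
        result[d] = {
            "frequency": freq_rel,
            "severity": sev_rel,
            # The implied pure premium effect is the product of the two.
            "pure_premium": freq_rel * sev_rel,
        }
    return result
```

The product in the last field is where an inconsistency between the two models would show up, for example a frequency relativity that falls with deductible while the combined pure premium relativity rises.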
Test Data
• Simulated Data
– 1,000,000 policies
– 80,000 claims
• Risk Characteristics
– Amount of Insurance
– Deductible
– Construction
– Alarm System
• Gamma Severity Distribution
• Poisson Frequency Distribution
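A data set with this structure could be simulated along the following lines. This is a sketch only: the slides give no parameter values, so the deductible options, coverage amounts, frequencies, severities, and the Gamma shape are all hypothetical, as are the function names.

```python
import math
import random

DEDUCTIBLES = (500, 1000, 2500, 5000)            # hypothetical options
COVERAGE_AMOUNTS = (100_000, 250_000, 500_000)   # hypothetical options


def poisson_draw(rng, lam):
    """Poisson sample by Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_policy(rng):
    """One policy: rating characteristics plus payments net of deductible."""
    deductible = rng.choice(DEDUCTIBLES)
    amount = rng.choice(COVERAGE_AMOUNTS)
    construction = rng.choice(("frame", "masonry"))
    alarm = rng.random() < 0.3

    # Hypothetical ground-up frequency and mean severity.
    lam = 0.10 * (0.8 if alarm else 1.0)
    mean_severity = 4000.0 * (amount / 100_000) ** 0.5
    if construction == "frame":
        mean_severity *= 1.2
    shape = 2.0

    payments = []
    for _ in range(poisson_draw(rng, lam)):
        loss = rng.gammavariate(shape, mean_severity / shape)
        if loss > deductible:          # losses below the deductible are unseen
            payments.append(loss - deductible)
    return {"deductible": deductible, "amount": amount,
            "construction": construction, "alarm": alarm,
            "payments": payments}


def simulate_portfolio(n_policies, seed=0):
    rng = random.Random(seed)
    return [simulate_policy(rng) for _ in range(n_policies)]
```

Because the true deductible effect is known by construction, fitted relativities from each modeling approach can be compared against it.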
Conclusions from Test Data
– Frequency/Severity Models
• Deductible as categorical variable
– Good overall fit
– Highly variable estimates for higher
or less common deductibles
– When amount effect is incorrect,
interaction term improves model fit
Severity Relativities
Using Categorical Variable
[Figure: severity relativities from the categorical deductible variable; vertical axis from 0 to 3.5, horizontal axis from 0 to 10,000]
Conclusions from Test Data
– Frequency/Severity Models
• Deductible as continuous variable
– Transformations with best likelihood
• Ratio of deductible to coverage amount
• Log of deductible
– Interaction terms with amount
improve model fit
– Carefully examine the results for
inconsistencies
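The two transformations named above, plus one possible interaction with amount, could be built as model inputs as follows. The interaction form (log deductible times log amount) is an assumption chosen for illustration; the slides do not say which form was used, and the function name is hypothetical.

```python
import math


def deductible_features(deductible, coverage_amount):
    """Continuous deductible terms for a frequency or severity model."""
    log_deductible = math.log(deductible)
    log_amount = math.log(coverage_amount)
    return {
        "deductible_ratio": deductible / coverage_amount,
        "log_deductible": log_deductible,
        # Hypothetical interaction: lets the deductible slope vary by amount.
        "log_deductible_x_log_amount": log_deductible * log_amount,
    }


features = deductible_features(1000, 100_000)
```

With an interaction term, fitted relativities should be checked at each coverage amount, since a curve that is sensible at one amount can become non-monotone at another.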
Frequency Relativities
[Figure: frequency relativity (vertical axis, 0 to 1.2) by deductible (horizontal axis, 0 to 5,000), one curve per coverage amount: 100,000 and 500,000]
Severity Relativities
[Figure: severity relativity (vertical axis, 0 to 1.2) by deductible (horizontal axis, 0 to 5,000), one curve per coverage amount: 100,000 and 500,000]
Pure Premium Relativities
[Figure: pure premium relativity (vertical axis, 0 to 1.2) by deductible (horizontal axis, 0 to 5,000), one curve per coverage amount: 100,000 and 500,000]
GLM Approach 3
– Pure Premium Model
• Fit pure premium model using
Tweedie distribution
• Use deductible as independent
variable
GLM Approach 3
– Pure Premium Model
• Pros
– Incorporates frequency and severity
effects simultaneously
– Ensures consistency
– Analogous to Empirical LER
• Cons
– Specification of deductible effects
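For reference, the Tweedie family with variance function V(µ) = µ^p and 1 < p < 2 accommodates policies with zero loss as well as positive pure premiums. The sketch below computes the corresponding unit deviance, which is the quantity a Tweedie GLM fit minimizes in weighted total. The power p = 1.5 is an assumed default; the slides do not state the value used.

```python
def tweedie_unit_deviance(y, mu, p=1.5):
    """Unit deviance of the Tweedie family for 1 < p < 2.

    y is the observed pure premium (zero allowed), mu the fitted mean.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch covers 1 < p < 2 only")
    if y < 0.0 or mu <= 0.0:
        raise ValueError("need y >= 0 and mu > 0")
    term_y = y ** (2.0 - p) / ((1.0 - p) * (2.0 - p))
    term_cross = y * mu ** (1.0 - p) / (1.0 - p)
    term_mu = mu ** (2.0 - p) / (2.0 - p)
    return 2.0 * (term_y - term_cross + term_mu)


def tweedie_deviance(observed, fitted, weights, p=1.5):
    """Exposure-weighted total deviance over a set of policies."""
    return sum(w * tweedie_unit_deviance(y, m, p)
               for y, m, w in zip(observed, fitted, weights))
```

Comparing total deviance between a categorical and a continuous specification of deductible is one way to carry out the model comparisons described in the test-data conclusions.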
Conclusions from Test Data
– Pure Premium Models
• Deductible as categorical variable
– Good overall fit
– Some highly variable estimates
• Good fit with some continuous
transforms
– Can avoid inconsistencies with
good choice of transform
Extension of GLM
– Dispersion Modeling
• Double GLM
• Iteratively fit two models
– Mean model fit to data
– Dispersion model fit to residuals
• Reference
Smyth, Jørgensen, “Fitting Tweedie’s
Compound Poisson Model to Insurance
Claims Data: Dispersion Modeling,”
ASTIN Bulletin, 32:143-157
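The alternating scheme can be illustrated with a deliberately small example. This is a simplified sketch, not the procedure from the reference: it assumes a Gamma response, a mean model with an intercept only, and a dispersion model with a single categorical factor, so that both fits reduce to (weighted) averages. The function names and data are hypothetical.

```python
import math


def gamma_unit_deviance(y, mu):
    """Unit deviance of the Gamma family for y > 0 and mu > 0."""
    return 2.0 * ((y - mu) / mu - math.log(y / mu))


def double_glm_sketch(observations, n_iter=25):
    """Alternate between a mean fit and a dispersion fit.

    observations: list of (group, y) pairs. The mean is common to all
    observations; the dispersion phi varies by group.
    """
    groups = sorted({g for g, _ in observations})
    phi = {g: 1.0 for g in groups}
    mu = sum(y for _, y in observations) / len(observations)
    for _ in range(n_iter):
        # Mean model: intercept-only log-link Gamma fit with weights 1/phi,
        # which reduces to a weighted average of the responses.
        weight_sum = sum(1.0 / phi[g] for g, _ in observations)
        mu = sum(y / phi[g] for g, y in observations) / weight_sum
        # Dispersion model: fit to the unit deviances of the mean model;
        # with one factor, the fitted phi is the group's mean deviance.
        for g in groups:
            devs = [gamma_unit_deviance(y, mu)
                    for gg, y in observations if gg == g]
            phi[g] = sum(devs) / len(devs)
    return mu, phi


data = [("low", 900.0), ("low", 1000.0), ("low", 1100.0),
        ("high", 400.0), ("high", 1000.0), ("high", 1900.0)]
mu_hat, phi_hat = double_glm_sketch(data)
```

Observations from the noisier group receive less weight in the mean fit, which is the mechanism by which dispersion modeling changes the fitted relativities.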
Double GLM in Modeling
Deductibles
• Gamma distribution assumes that
variance is proportional to µ²
• Deductible effect on severity
– Mean increases
– Variance increases more gradually
• Double GLM significantly
improves model fit on Test Data
– More significant than interactions
Pure Premium Relativities
Tweedie Model – $500,000 Coverage Amount
[Figure: pure premium relativity (vertical axis, 0.8 to 1.1) by deductible (horizontal axis, 0 to 5,000), comparing Constant Dispersion and Double GLM]
Conclusion
• Deductible modeling is difficult
• Tweedie model with Double GLM
seems to be the best approach
• Categorical vs. Continuous
– Need to compare various models
• Interaction terms may be
important