Modeling the Loss Process
for Medical Malpractice
Bill Faltas
GE Insurance Solutions
CAS Special Interest Seminar … Predictive Modeling
“GLM and the Medical Malpractice Crisis” Session
October 4, 2004
Chicago, IL
© Employers Reinsurance Corporation - 2004
Patent Pending
"The work of science is to substitute
facts for appearances and
demonstrations for impressions."
John Ruskin
Regression Modeling
• Simply: A functional relationship between one unknown (Y) and one or more knowns (X’s)
  $Y = f(X_1, X_2, \dots, X_n) + \text{error}$
• Statistically: A distribution for Y with parameters that vary with X
Example: Ordinary Least Squares (OLS) (“linear regression”)
  – $Y_i \sim \text{Normal}(\mu_i, \sigma^2)$
  – Estimate both $\mu_i$ and $\sigma^2$ → have a measure of variability ($\sigma^2$)
  – E(Y) is a linear combination of X’s: $\mu_i = a + b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni}$
  – Estimate parameters $a, b_1, b_2, \dots, b_n$
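A minimal sketch of the OLS example with a single X, using only the Python standard library. The data values and the function name fit_ols are made up for illustration; the point is only to show that a, b and the variability measure $\sigma^2$ all come out of the same fit.

```python
# Minimal OLS sketch for mu_i = a + b * x_i (one explanatory variable).
# The data below are invented purely for illustration.

def fit_ols(x, y):
    """Return least-squares estimates (a, b) and an estimate of sigma^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_bar - b * x_bar              # intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual sum of squares over (n - 2): two parameters were estimated.
    sigma2 = sum(r * r for r in residuals) / (n - 2)
    return a, b, sigma2

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, sigma2 = fit_ols(x, y)
print(f"a = {a:.3f}, b = {b:.3f}, sigma^2 = {sigma2:.4f}")
```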
Terminology
X’s (explanatory / covariate / predictor / independent variables) could be:
(1) Numerical:
(a) Continuous [e.g., years of practice, square feet]
(b) Discrete [e.g., # past claims]
(2) Categorical:
(a) Ordinal [e.g., income or state group (H / M / L)]
(b) Nominal [e.g., gender (M/F), state]
Y (response / dependent variable) could be:
(1) Continuous [e.g., total $ losses from an insurance policy]
(2) Discrete [e.g., # of insurance claims]
(3) Binary [e.g., whether an insurance policy is likely to have a claim (Y/N)]
Popular Regression Modeling Choices
Form of Y                   Model
Y Continuous                Ordinary Least Squares Model (OLS)
Y Binary (0,1)              Logistic Model
Y Positive (Y>0)            Exponential Model
Y Discrete {0,1,2,3, …}     Poisson Model
GLM, OLS, and Logistic
Form of Y
  – GLM: Any
  – OLS: Continuous
  – Logistic: Binary (0,1)
Distribution of Y
  – GLM: Y ~ Exponential Family
  – OLS: $Y \sim \text{Normal}(\mu, \sigma^2)$ (in exponential family)
  – Logistic: Y (=1/0) ~ Bernoulli(P) (in exponential family)
Model [E(Y)]
  – GLM: $\text{Mean}(Y) = h(X \cdot \beta)$, i.e. $\text{Mean}(Y_i) = f(a + b_1 X_{1i} + \dots + b_n X_{ni})$, f(linear combination of X’s)
  – OLS: $\mu = X \cdot \beta$, i.e. $\mu_i = a + b_1 X_{1i} + \dots + b_n X_{ni}$ (linear combination of X’s)
  – Logistic: $P = e^{X \cdot \beta} / (1 + e^{X \cdot \beta})$, i.e. $P_i = P(Y_i = 1) = e^{L_i} / (1 + e^{L_i})$ where $L_i = a + b_1 X_{1i} + \dots + b_n X_{ni}$
Method of Estimating a, b1, … , bn
  – GLM: M.L.E.
  – OLS: Method of Least Squares (same as M.L.E. for Normal)
  – Logistic: M.L.E.
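To make the OLS and Logistic comparison concrete, the sketch below evaluates both mean functions on the same linear combination of X’s. The coefficients and covariate values are hypothetical, chosen only to show that the OLS mean is the linear predictor itself while the logistic mean is squeezed into (0, 1).

```python
import math

def linear_predictor(a, b, x):
    """L_i = a + b1*X1i + ... + bn*Xni for one observation."""
    return a + sum(bj * xj for bj, xj in zip(b, x))

def ols_mean(a, b, x):
    """OLS: the mean of Y is the linear predictor itself."""
    return linear_predictor(a, b, x)

def logistic_mean(a, b, x):
    """Logistic: P(Y=1) = e^L / (1 + e^L), always between 0 and 1."""
    L = linear_predictor(a, b, x)
    return math.exp(L) / (1.0 + math.exp(L))

a, b = -3.0, [0.5, 1.0]      # hypothetical coefficients
x = [2.0, 1.0]               # hypothetical covariates for one policy
mu = ols_mean(a, b, x)       # linear predictor: -3 + 1 + 1
p = logistic_mean(a, b, x)   # same linear predictor mapped to a probability
print(mu, round(p, 4))
```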
Loss Process Model for Medical Malpractice
• Line Characteristic: low frequency / high severity
• Objective: Build models to forecast emergence and ultimate values for (Y’s)
  – # notices (a.k.a. incidents)
  – # notices that turn into claims with indemnity payment
  – $ losses
• Based on Four Types of X’s
  – Policyholder attributes … state, specialty, years of practice, etc.
  – Policy attributes … form type, limit, etc.
  – Environmental attributes … lawyers per 1000, births per 1000, etc.
  – Time … e.g., policy age measures time since effective date
Likelihood of Notice
[Figure: two panels plotting the likelihood of notice (y-axis: pdf1) at policy age 2.5 years (mode).
 Panel "Dependence of Likelihood on X1": likelihood increases with X1 (categories 1 to 8).
 Panel "Dependence of Likelihood on X2": likelihood rises and falls with X2 (bands 20-30 through 70-80).
 Some neighboring categories are annotated "Not significantly different".]
• Likelihood is a function of many (X) variables, including policy age
• Likelihood changes with X1 and X2 → include both in model
• Y is binary (1/0), “whether there is a notice or not”
Likelihood of Notice
A Logistic Model (a GLM application)
To model: P = Likelihood of Notice = Pr(Y=1)
$P = P(Y=1) = e^{L} / (1 + e^{L})$
where $L = a + b_1 X_1 + \dots + b_n X_n$
• Transform some of the X variables, including policy age
• Develop model based on 70% of data
• Validate model on remaining 30% of data
• Compare actual vs. modeled triangles of ‘# policies with notices’
• Finalize parameters on 100% of data
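The develop / validate / finalize steps above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the two covariates and the coefficients in true_coef are hypothetical, the fit uses plain gradient ascent on the Bernoulli log-likelihood (one simple way to get the M.L.E., not necessarily the fitting routine used for the presentation), and the check compares a single actual vs. modeled count rather than a full triangle.

```python
import math
import random

def sigmoid(L):
    return 1.0 / (1.0 + math.exp(-L))          # equals e^L / (1 + e^L)

def predict(coef, x):
    """P(Y=1) for one policy; coef = [a, b1, ..., bn]."""
    return sigmoid(coef[0] + sum(c * xi for c, xi in zip(coef[1:], x)))

def fit_logistic(rows, n_iter=1000, lr=1.0):
    """Maximize the Bernoulli log-likelihood by gradient ascent."""
    n_coef = len(rows[0][0]) + 1
    coef = [0.0] * n_coef
    for _ in range(n_iter):
        grad = [0.0] * n_coef
        for x, y in rows:
            resid = y - predict(coef, x)
            grad[0] += resid
            for j, xi in enumerate(x, start=1):
                grad[j] += resid * xi
        coef = [c + lr * g / len(rows) for c, g in zip(coef, grad)]
    return coef

# Simulate policies with two (already transformed) covariates in [0, 1].
random.seed(2004)
true_coef = [-2.0, 1.5, 1.0]                    # hypothetical values
rows = []
for _ in range(600):
    x = (random.random(), random.random())
    y = 1 if random.random() < predict(true_coef, x) else 0
    rows.append((x, y))

# Develop on 70% of the data, validate on the remaining 30%.
random.shuffle(rows)
cut = len(rows) * 7 // 10
train, holdout = rows[:cut], rows[cut:]
coef = fit_logistic(train)

actual = sum(y for _, y in holdout)                  # policies with a notice
modeled = sum(predict(coef, x) for x, _ in holdout)  # expected number
print(f"holdout: actual = {actual}, modeled = {modeled:.1f}")

# Finalize parameters on 100% of the data.
final_coef = fit_logistic(rows)
```

Comparing counts rather than individual predictions mirrors the actual vs. modeled comparison of '# policies with notices' listed above.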
Model Validation Approaches
Sampling
• Set aside sample
• Develop parameters using remaining data
• Verify model works against sample
• Finalize model using all data
Resampling
• Set aside 1st sample
• Develop parameters using remaining data
• Verify model works against 1st sample
• Resample and redo … n times
• Finalize model using all data
Partitioning
• Divide data into n partitions (often 4-6)
• Set aside 1st partition
• Develop parameters using other partitions
• Verify model works against 1st partition
• Repeat process for all other partitions
• Finalize model using all data
Uses all data
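A minimal sketch of the partitioning approach. The helper names (partition_validate, fit_rate, count_gap) and the toy data are hypothetical; the "model" here is just the observed notice rate, standing in for whatever model is actually being validated.

```python
def partition_validate(data, n_parts, fit, score):
    """Hold out each partition in turn, fit on the others, score on it."""
    parts = [data[i::n_parts] for i in range(n_parts)]
    scores = []
    for k, held_out in enumerate(parts):
        training = [row for j, part in enumerate(parts) if j != k
                    for row in part]
        model = fit(training)
        scores.append(score(model, held_out))
    final_model = fit(data)            # finalize model using all data
    return scores, final_model

def fit_rate(ys):
    """Toy model: the observed notice rate."""
    return sum(ys) / len(ys)

def count_gap(rate, ys):
    """Actual minus modeled number of notices in the held-out partition."""
    return sum(ys) - rate * len(ys)

# Invented 0/1 notice indicators for 20 policies.
notices = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
scores, final_rate = partition_validate(notices, 4, fit_rate, count_gap)
print([round(s, 3) for s in scores], final_rate)
```

Every observation is used once for verification and n - 1 times for developing parameters.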
Notice to Claim … Waiting Time Approach
[Figure: Empirical PDF of “Waiting Time” for 6 categories of X2. x-axis: Waiting Time (0.0 to 20.0); y-axis: 0.000 to 0.100.
 Annotation: “Waiting time” varies by different values of attribute X2 → include X2 in notice-to-claim model.
 Annotation: Area represents probability of turning into a claim 1.0 - 2.5 years after receiving notice (no actual data prior to 1.0 year).]
• Waiting time defined as time from notice to claim
• Waiting time approach enables lack of claim data to be used as information
• # Claims = (# notices) x (prob. of notice turning into a claim)
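A small worked example of the "# Claims = (# notices) x (prob.)" relationship, reading the probability as an area under the waiting-time pdf. The bin boundaries, probability masses and notice count are invented for illustration and are not taken from the presentation.

```python
# Hypothetical empirical pdf of waiting time (years from notice to claim),
# stored as (bin_start, bin_end, probability mass). The masses sum to less
# than 1 because some notices never turn into claims.
waiting_pdf = [
    (1.0, 2.5, 0.12),
    (2.5, 5.0, 0.10),
    (5.0, 7.5, 0.05),
    (7.5, 10.0, 0.02),
]

def prob_claim_between(pdf, start, end):
    """Total mass of the bins lying entirely inside [start, end]."""
    return sum(mass for lo, hi, mass in pdf if lo >= start and hi <= end)

n_notices = 200
p_window = prob_claim_between(waiting_pdf, 1.0, 2.5)
p_ultimate = prob_claim_between(waiting_pdf, 0.0, float("inf"))

claims_in_window = n_notices * p_window     # emergence in the 1.0-2.5 window
claims_ultimate = n_notices * p_ultimate    # ultimate claim count
print(claims_in_window, claims_ultimate)
```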
Estimate Claim Closing Values (Claim Sizes)
• Model trended claim sizes using standard actuarial
approaches
– Closed claims, without regard to closing lag
– Closed claims by closing lag
– Closed claims by policyholder attributes
• Compare company data and models with external benchmarks
• Select model(s)
• Test modeled severities against actual severities
– Actual severities in development triangles
– Modeled severities: f(policyholder, policy, closing year)
Claim Size Distribution
[Figure: P.D.F. of log of claim sizes by 8 groups of X1. x-axis: LN(Claim Size); y-axis: Density; modes marked.
 Annotation: Claim size distribution varies by different values of attribute X1 → include X1 in claim size modeling.]
• Claim size model parameters are a function of significant attributes
• Model location and shape vary w/attributes
• A way to introduce distributional variation
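One way to picture "parameters as a function of attributes" is sketched below. It assumes, purely for illustration, that LN(claim size) is Normal within each group of X1 (i.e. claim sizes are lognormal); the presentation does not name the distribution family, and the group labels and (mu, sigma) values are invented.

```python
import math

# Hypothetical (mu, sigma) of LN(claim size) for three groups of X1.
# mu shifts the location (and is the mode on the log scale under the
# Normal assumption); sigma changes the shape.
params_by_group = {"A": (10.0, 1.0), "B": (10.5, 1.2), "C": (11.0, 0.8)}

def lognormal_median(mu, sigma):
    """Median claim size: exp(mu)."""
    return math.exp(mu)

def lognormal_mean(mu, sigma):
    """Mean claim size: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2.0)

summary = {
    group: (lognormal_median(mu, sigma), lognormal_mean(mu, sigma))
    for group, (mu, sigma) in params_by_group.items()
}
for group, (median, mean) in summary.items():
    print(f"group {group}: median = {median:,.0f}, mean = {mean:,.0f}")
```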
Modeling Summary
Inputs (X’s): Policyholder Attributes, Policy Attributes, Environmental Attributes
CLAIM COUNTS
• # Notices (Logistic Model)
• Notices Becoming Claims (Waiting Time)
• # Claims = # Notices x Prob of Notice to Claim
CLAIM SIZES
• Claim Size Distribution
$ LOSSES
• $ Losses = # Claims x Claim Size
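The summary chain can be written out as simple arithmetic on expected values. The function name and all inputs below are hypothetical; in practice each input would come from the corresponding model (logistic, waiting time, claim size), and this sketch ignores variability around the expected values.

```python
def expected_losses(notice_probs, p_notice_to_claim, mean_claim_size):
    """$ Losses = # Claims x Claim Size, with # Claims = # Notices x Prob."""
    expected_notices = sum(notice_probs)       # sum of P_i over policies
    expected_claims = expected_notices * p_notice_to_claim
    return expected_claims * mean_claim_size

# Three hypothetical policies with modeled likelihoods of notice.
notice_probs = [0.02, 0.05, 0.03]
losses = expected_losses(notice_probs,
                         p_notice_to_claim=0.30,
                         mean_claim_size=250_000.0)
print(f"expected $ losses = {losses:,.0f}")
```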
GLM Application Advantages
• Useful for all lines, including low freq / high sev
• Identifies and uses significant variables simultaneously
• Effective in dealing with interacting variables
• Can use time element to model emergence and ultimates
• Variability of modeled estimates can be byproduct and useful for measurements of risk/uncertainty
• Multiple applications
  – Underwriting
  – Pricing
  – Reserving
  – Risk