Glenn Meyers
ISO Innovative Analytics
CARe Seminar
June 6-7, 2005
Problems with
Experience Rating for
Excess of Loss Reinsurance
• Use submission claim severity data
– Relevant, but
– Not credible
– Not developed
• Use industry distributions
– Credible, but
– Not relevant (???)
General Problems with
Fitting Claim Severity Distributions
• Parameter uncertainty
– Fitted parameters of chosen model are estimates subject to sampling error.
• Model uncertainty
– We might choose the wrong model. There is no particular reason that the models we choose are appropriate.
• Loss development
– Complete claim settlement data is not always available.
• Quantifying Parameter Uncertainty
– Likelihood ratio test
• Incorporating Model Uncertainty
– Use Bayesian estimation with likelihood functions
– Uncertainty in excess layer loss estimates
• Bayesian estimation with prior models based on data reported to a statistical agent
– Reflects insurer heterogeneity
– Develops losses
• Start with classical hypothesis testing.
– Likelihood ratio test
• Calculate a confidence region for parameters.
• Calculate a confidence interval for a function of the parameters.
– For example, the expected loss in a layer
• Introduce a prior distribution of parameters.
• Calculate predictive mean for a function of parameters.
Let $p = (p_1, \ldots, p_k)$ be a parameter vector for your chosen loss model.
Let $x = (x_1, \ldots, x_n)$ be a set of observed losses.
Test $H_0: p = p^*$ against $H_1: p \neq p^*$.
Theorem 2.10 in Klugman, Panjer & Willmot
If $H_0$ is true then $\ln LR = 2\left(\ln L(\hat{p}; x) - \ln L(p^*; x)\right)$ has a $\chi^2$ distribution with $k$ degrees of freedom.
Use the $\chi^2$ distribution to find critical values.
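An Example – The Pareto Distribution
$F(x) = 1 - \left(\dfrac{\theta}{x + \theta}\right)^{\alpha}$
• Simulate random sample of size 1000 with $\alpha$ = 2.000, $\theta$ = 10,000
• Maximum log-likelihood = -10034.660 with $\hat{\theta}$ = 8723.04, $\hat{\alpha}$ = 1.80792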
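The simulation and fit on this slide can be sketched as follows. This is a minimal illustration, not the code behind the slide: the seed, the search bounds, and the helper names (`simulate_pareto`, `alpha_given_theta`, `fit_pareto`) are assumptions, so the fitted values will differ from the ones quoted above. It uses the fact that, for a fixed $\theta$, the maximizing $\alpha$ has a closed form, and searches over $\theta$ only.

```python
import math
import random


def pareto_loglik(xs, theta, alpha):
    """Log-likelihood for F(x) = 1 - (theta / (x + theta))**alpha."""
    n = len(xs)
    return (n * (math.log(alpha) + alpha * math.log(theta))
            - (alpha + 1.0) * sum(math.log(x + theta) for x in xs))


def simulate_pareto(n, theta, alpha, rng):
    """Inverse-transform sampling: solve F(x) = u for x."""
    return [theta * ((1.0 - rng.random()) ** (-1.0 / alpha) - 1.0)
            for _ in range(n)]


def alpha_given_theta(xs, theta):
    """For fixed theta, the alpha that maximizes the log-likelihood."""
    return len(xs) / sum(math.log1p(x / theta) for x in xs)


def fit_pareto(xs, lo=100.0, hi=1.0e6, iters=60):
    """Golden-section search on log(theta) over the profile log-likelihood.

    The bounds lo and hi are assumed search limits, not part of the method.
    """
    def profile(log_theta):
        theta = math.exp(log_theta)
        return pareto_loglik(xs, theta, alpha_given_theta(xs, theta))

    a, b = math.log(lo), math.log(hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if profile(c) < profile(d):
            a = c  # the maximum lies to the right of c
        else:
            b = d  # the maximum lies to the left of d
    theta = math.exp((a + b) / 2.0)
    alpha = alpha_given_theta(xs, theta)
    return theta, alpha, pareto_loglik(xs, theta, alpha)


rng = random.Random(2005)  # arbitrary seed, chosen for reproducibility
sample = simulate_pareto(1000, theta=10_000.0, alpha=2.0, rng=rng)
theta_hat, alpha_hat, max_ll = fit_pareto(sample)
print(theta_hat, alpha_hat, max_ll)
```

Searching on log(theta) keeps the search well scaled when the plausible range of theta spans several orders of magnitude.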
• Significance level = 5%, $\chi^2$ critical value = 5.991
• $H_0$: $(\theta, \alpha) = (10000, 2)$
• $H_1$: $(\theta, \alpha) \neq (10000, 2)$
• ln LR = 2(-10034.660 + 10035.623) = 1.207
• Accept $H_0$
• Significance level = 5%, $\chi^2$ critical value = 5.991
• $H_0$: $(\theta, \alpha) = (10000, 1.7)$
• $H_1$: $(\theta, \alpha) \neq (10000, 1.7)$
• ln LR = 2(-10034.660 + 10045.975) = 22.631
• Reject $H_0$
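The test decision above can be sketched in a few lines. The function names are hypothetical. With k = 2 parameters, the chi-square cdf is 1 - exp(-x/2), so the critical value has a closed form and no statistics library is needed. The log-likelihoods are the ones quoted for the (10000, 1.7) hypothesis.

```python
import math


def lr_statistic(loglik_mle, loglik_null):
    """ln LR = 2 * (ln L at the MLE - ln L at the hypothesized p*)."""
    return 2.0 * (loglik_mle - loglik_null)


def chi2_critical_2df(significance):
    """Chi-square critical value for k = 2, where the cdf is 1 - exp(-x/2)."""
    return -2.0 * math.log(significance)


def lr_test(loglik_mle, loglik_null, significance=0.05):
    """Return the statistic and whether H0 is rejected."""
    stat = lr_statistic(loglik_mle, loglik_null)
    return stat, stat > chi2_critical_2df(significance)


# Log-likelihoods quoted on the slide for H0: (theta, alpha) = (10000, 1.7).
stat, reject = lr_test(-10034.660, -10045.975)
print(round(chi2_critical_2df(0.05), 3), round(stat, 2), reject)
```

For models with a different number of parameters the closed form no longer applies and a chi-square quantile function would be needed instead.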
• The X% confidence region corresponds to the (100−X)% level hypothesis test.
• It is the set of all parameters $(\theta, \alpha)$ for which the test fails to reject the corresponding $H_0$.
• For the 95% confidence region:
– (10000, 2.0) is in.
– (10000, 1.7) out.
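One way to trace such a region numerically is to scan a grid of parameter pairs and keep those the test does not reject. The sketch below assumes two parameters, so the chi-square cutoff is again available in closed form. The names `region_threshold`, `in_region`, and `scan_region` are hypothetical, and the quadratic log-likelihood in the demo is a toy stand-in rather than the Pareto likelihood.

```python
import math


def region_threshold(level):
    """Chi-square (2 degrees of freedom) cutoff for a `level` confidence region."""
    return -2.0 * math.log(1.0 - level)


def in_region(loglik_mle, loglik_point, level=0.95):
    """A point is inside when the test of H0 at that point is not rejected."""
    return 2.0 * (loglik_mle - loglik_point) <= region_threshold(level)


def scan_region(loglik, loglik_mle, thetas, alphas, level=0.95):
    """Return the grid points (theta, alpha) that fall inside the region."""
    return [(t, a) for t in thetas for a in alphas
            if in_region(loglik_mle, loglik(t, a), level)]


# Slide values: maximum log-likelihood and log-likelihood at (10000, 1.7).
print(in_region(-10034.660, -10045.975, level=0.95))

# Toy log-likelihood with its maximum of 0 at (1, 2); purely illustrative.
toy = lambda t, a: -((t - 1.0) ** 2 + (a - 2.0) ** 2)
print(scan_region(toy, 0.0, thetas=[1.0, 10.0], alphas=[2.0]))
```

The 50% inner ring uses the same scan with `level=0.50`, which lowers the cutoff to about 1.386.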
[Figure: confidence region. Outer ring 95%, inner ring 50%. Horizontal axis: Theta.]
• Data grouped into four intervals
– 562 under 5000
– 181 between 5000 and 10000
– 134 between 10000 and 20000
– 123 over 20000
• Same data as before, only less information is given.
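With grouped data the likelihood uses interval probabilities instead of densities. A minimal sketch, assuming the Pareto form $F(x) = 1 - (\theta/(x+\theta))^{\alpha}$ from the example and omitting the multinomial constant, which does not depend on the parameters; the function names are hypothetical:

```python
import math


def pareto_cdf(x, theta, alpha):
    return 1.0 - (theta / (x + theta)) ** alpha


def cell_probabilities(bounds, theta, alpha):
    """Probability of each interval; the last interval is open-ended."""
    cdf = [0.0] + [pareto_cdf(b, theta, alpha) for b in bounds] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def grouped_loglik(counts, bounds, theta, alpha):
    """Sum of n_j * ln(P_j), dropping the constant multinomial coefficient."""
    probs = cell_probabilities(bounds, theta, alpha)
    return sum(n * math.log(p) for n, p in zip(counts, probs))


bounds = [5_000.0, 10_000.0, 20_000.0]
counts = [562, 181, 134, 123]  # grouped counts from the slide
probs = cell_probabilities(bounds, theta=10_000.0, alpha=2.0)
print([round(1000 * p, 1) for p in probs])  # expected counts per 1000 claims
print(grouped_loglik(counts, bounds, 10_000.0, 2.0))
```

The same likelihood ratio test applies to this log-likelihood; because the grouped data carry less information, the resulting confidence region is wider.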
[Figure: Confidence Region for Grouped Data. Outer ring 95%, inner ring 50%. Horizontal axis: Theta.]
[Figure: Confidence Region for Ungrouped Data. Outer ring 95%, inner ring 50%. Horizontal axis: Theta.]
Estimation with Model Uncertainty
COTOR Challenge – November 2004
• COTOR published 250 claims
– Distributional form not revealed to participants
• Participants were challenged to estimate the cost of a $5M x $5M layer.
• Estimate confidence interval for pure premium
You want to fit a distribution to 250 claims
• Knee-jerk first reaction: plot a histogram.
[Figure: Histogram of COTOR Data. Horizontal axis: Claim Amount (x 10^6).]
• And fit some standard distributions.
[Figure: lcotor data with fitted lognormal, gamma and Weibull distributions. Horizontal axis: Log of Claim Amounts.]
Still looks skewed. Take double logs.
• And fit some standard distributions.
[Figure: llcotor data with fitted Lognormal, Gamma and Weibull distributions. Horizontal axis: log log of Claim Amounts.]
Still looks skewed. Take triple logs.
• Still some skewness.
• Lognormal and gamma fits look somewhat better.
[Figure: lllcotor data with fitted Lognormal, Gamma and Normal distributions. Horizontal axis: Triple log of Claim Amounts.]
Candidate #1
Quadruple lognormal
Distribution: Lognormal
Log likelihood: 283.496
Domain: 0 < y < Inf
Mean: 0.738351
Variance: 0.006189
Parameter   Estimate    Std. Err.
mu          -0.30898    0.00672
sigma       0.106252    0.004766
Estimated covariance of parameter estimates:
            mu          sigma
mu          4.52E-05    1.31E-19
sigma       1.31E-19    2.27E-05
Candidate #2
Triple loggamma
Distribution: Gamma
Log likelihood: 282.621
Domain: 0 < y < Inf
Mean: 0.738355
Variance: 0.00615
Parameter   Estimate    Std. Err.
a           88.6454     7.91382
b           0.008329    0.000746
Estimated covariance of parameter estimates:
            a           b
a           62.6286     -0.00588
b           -0.00588    5.56E-07
Candidate #3
Triple lognormal
Distribution: Normal
Log likelihood: 279.461
Domain: -Inf < y < Inf
Mean: 0.738355
Variance: 0.006285
Parameter   Estimate    Std. Err.
mu          0.738355    0.005014
sigma       0.079279    0.003556
Estimated covariance of parameter estimates:
            mu          sigma
mu          2.51E-05    -1.14E-19
sigma       -1.14E-19   1.26E-05
All three cdfs are within the confidence bounds for the quadruple lognormal.
[Figure: cdf plot of the lllcotor data with fitted Lognormal (with confidence bounds), Gamma and Normal cdfs. Horizontal axis: Triple log of Claim Amounts.]
• Three candidate models
– Quadruple lognormal
– Triple loggamma
– Triple lognormal
• Parameter uncertainty within each model
• Construct a series of models consisting of
– One of the three models.
– Parameters within a broad confidence interval for each model.
– 7803 possible models
• Calculate likelihood (given the data) for each model.
• Use Bayes’ Theorem to calculate posterior probability for each model
– Each model has equal prior probability.
• Calculate layer pure premium for 5 x 5 layer for each model.
• Expected pure premium is the posterior probability weighted average of the model layer pure premiums.
• Second moment of pure premium is the posterior probability weighted average of the model layer pure premiums squared.
Probability that layer pure premium ≤ x equals
Sum of posterior probabilities for which the model layer pure premium is ≤ x
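The averaging steps above can be sketched on a small discrete set of models. All function names and the three-model inputs are hypothetical, chosen only to show the mechanics; the actual calculation used 7803 models.

```python
import math


def posterior_weights(logliks, priors=None):
    """Bayes' Theorem on a discrete set of models (equal priors by default)."""
    if priors is None:
        priors = [1.0 / len(logliks)] * len(logliks)
    top = max(logliks)  # subtract the max so exp() does not underflow everywhere
    raw = [p * math.exp(ll - top) for ll, p in zip(logliks, priors)]
    total = sum(raw)
    return [r / total for r in raw]


def predictive_summary(weights, premiums):
    """Posterior-weighted mean and standard deviation of the pure premium."""
    mean = sum(w * pp for w, pp in zip(weights, premiums))
    second = sum(w * pp * pp for w, pp in zip(weights, premiums))
    return mean, math.sqrt(max(second - mean * mean, 0.0))


def predictive_cdf(weights, premiums, x):
    """Sum of posterior probabilities of models whose premium is <= x."""
    return sum(w for w, pp in zip(weights, premiums) if pp <= x)


def predictive_quantile(weights, premiums, q):
    """Smallest model premium at which the predictive cdf reaches q."""
    cum = 0.0
    for pp, w in sorted(zip(premiums, weights)):
        cum += w
        if cum >= q:
            return pp
    return max(premiums)


# Hypothetical log-likelihoods and layer pure premiums for three models.
logliks = [-100.0, -101.0, -103.0]
premiums = [5_000.0, 7_000.0, 12_000.0]
weights = posterior_weights(logliks)
print(weights)
print(predictive_summary(weights, premiums))
print(predictive_quantile(weights, premiums, 0.5))
```

Working with log-likelihoods and subtracting the maximum before exponentiating avoids underflow when the likelihoods themselves are extremely small.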
Mean: 6,430
Standard Deviation: 3,370
Median: 5,780
Range: Low at 2.5% = 1,760; High at 97.5% = 14,710
[Figure: Histogram of Predictive Pure Premium (Predictive Distribution of the Layer Pure Premium). Horizontal axis: Low End of Amount (000), 0 to 25.]
• Continue with Bayesian Estimation
• Liability insurance claim severity data
• Prior distributions derived from models based on individual insurer data
• Prior models reflect the maturity of claim data used in the estimation
• Selected 20 insurers
– Claim count in the thousands
• Fit mixed exponential distribution to the data of each insurer
• Initial fits had volatile tails
• Truncation issues
– Do small claims predict likelihood of large claims?
[Figure: curves plotted against Loss Amount - x, from 1,000 to 10,000,000.]
[Figure: horizontal axis: Probability That Loss is Over 5,000.]
[Figure: horizontal axis: Probability That Loss is Over 100,000.]
• Truncation point = $100,000
• Family of cdf’s that has “correct” behavior
– Admittedly the definition of “correct” is debatable, but
– The choices are transparent!
[Figure: curves plotted against Loss Amount - x, from 100,000 to 10,000,000.]
[Figure: horizontal axis: Probability That Loss is Over 100,000.]
1. The claim severity distribution for all claims settled within 1 year
2. The claim severity distribution for all claims settled within 2 years
3. The claim severity distribution for all claims settled within 3 years
4. The ultimate claim severity distribution for all claims
5. The ultimate limited average severity curve
Three Sample Insurers
Small, Medium and Large
• Each has three years of data
• Calculate likelihood functions
– Most recent year with #1 on prior slide
– 2nd most recent year with #2 on prior slide
– 3rd most recent year with #3 on prior slide
• Use Bayes' Theorem to calculate posterior probability of each model
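Formulas for Posterior Probabilities
Model (m) Cell Probabilities: $P^{(m)}_{AY,i} = \dfrac{F^{(m)}_{AY}(x_{i+1}) - F^{(m)}_{AY}(x_i)}{1 - F^{(m)}_{AY}(x_1)}$
Likelihood (m): $l^{(m)} = \prod_{i=1}^{9} \prod_{AY=1}^{3} \left(P^{(m)}_{AY,i}\right)^{n_{AY,i}}$, where $n_{AY,i}$ is the number of claims
Using Bayes' Theorem: $\text{Posterior}(m) \propto l^{(m)}$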
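The posterior calculation on this slide can be sketched as follows, assuming equal prior probabilities and cell probabilities conditioned on a claim exceeding the lowest bound (the truncation point). The exponential cdfs, their means, and the claim counts are hypothetical placeholders for the prior models and insurer data; the function names are assumptions as well.

```python
import math


def cell_probabilities(cdf, bounds):
    """P_i = (F(x_{i+1}) - F(x_i)) / (1 - F(x_1)); the top interval is open."""
    values = [cdf(b) for b in bounds] + [1.0]
    tail = 1.0 - values[0]
    return [(hi - lo) / tail for lo, hi in zip(values, values[1:])]


def model_loglik(cdfs_by_year, counts_by_year, bounds):
    """Sum over accident years and intervals of n * ln(P)."""
    total = 0.0
    for cdf, counts in zip(cdfs_by_year, counts_by_year):
        probs = cell_probabilities(cdf, bounds)
        total += sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)
    return total


def posterior(logliks):
    """Posterior proportional to the likelihood, i.e. equal priors assumed."""
    top = max(logliks)
    raw = [math.exp(ll - top) for ll in logliks]
    total = sum(raw)
    return [r / total for r in raw]


def exponential_cdf(mean):
    """Hypothetical stand-in for a prior model's severity cdf."""
    return lambda x: 1.0 - math.exp(-x / mean)


bounds = [100_000, 200_000, 300_000, 400_000, 500_000,
          750_000, 1_000_000, 1_500_000, 2_000_000]
# Two hypothetical models; each uses the same cdf for all three accident years.
models = [[exponential_cdf(m)] * 3 for m in (150_000.0, 300_000.0)]
# Hypothetical claim counts by interval for three accident years.
counts = [[12, 3, 1, 0, 0, 0, 0, 0, 0],
          [30, 8, 2, 1, 1, 0, 0, 0, 0],
          [60, 20, 9, 3, 2, 1, 1, 0, 0]]
post = posterior([model_loglik(m, counts, bounds) for m in models])
print(post)
```

In the setting described on the slides, each model would supply a different cdf for each accident year, matching the settlement lag of that year's data.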
Results
Taken from paper.
Exhibit 1 – Small Insurer

Prior      Posterior      Layer Pure Premium
Model #    Probability    $500K x $500K    $1M x $1M
1          0.016406       763              541
2          0.041658       911              645
3          0.089063       1,153            682
4          0.130281       1,224            796
5          0.157593       1,281            912
6          0.110614       1,390            978
7          0.075702       1,494            1,040
8          0.053226       1,587            1,095
9          0.080525       1,849            1,328
10         0.104056       2,069            1,523
11         0.129925       2,417            1,828
12         0.010896       2,598            1,916
13         0.000007       2,788            1,922
14         0.000009       3,004            2,124
15         0.000011       3,202            2,309
16         0.000013       3,382            2,477
17         0.000014       3,543            2,628
18         0              4,058            3,211
19         0              4,663            3,784
20         0              5,354            4,440
Posterior Mean            1,572            1,113
Posterior Std. Dev.       463              385

Claim counts by interval lower bound (the exhibit reports one set of counts for each of the lag groups 1, 1-2 and 1-3):

Interval Lower Bound    Claim Count    Claim Count    Claim Count
100,000                 15             40             76
200,000                 2              10             26
300,000                 1              1              11
400,000                 2              0              3
500,000                 0              2              0
750,000                 0              0              0
1,000,000               0              0              8
1,500,000               0              2              0
2,000,000               0              0              0
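Formulas for
Ultimate Layer Pure Premium
• Use #5 on the model slide (3rd previous) to calculate the ultimate layer pure premium $\text{LPP}_m$ for each model $m$
Posterior Mean = $\sum_{m=1}^{20} \text{LPP}_m \cdot \text{Posterior}(m)$
Posterior Standard Deviation = $\sqrt{\sum_{m=1}^{20} \text{LPP}_m^2 \cdot \text{Posterior}(m) - (\text{Posterior Mean})^2}$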
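As a check on these formulas, the sketch below applies them to the posterior probabilities and $500K x $500K layer pure premiums listed in Exhibit 1 for the small insurer. The function name is hypothetical. Because the exhibit values are rounded, the result agrees with the reported 1,572 and 463 only to within about a unit.

```python
import math

# Posterior probabilities for models 1 to 20, as listed in Exhibit 1.
posterior = [0.016406, 0.041658, 0.089063, 0.130281, 0.157593,
             0.110614, 0.075702, 0.053226, 0.080525, 0.104056,
             0.129925, 0.010896, 0.000007, 0.000009, 0.000011,
             0.000013, 0.000014, 0.0, 0.0, 0.0]

# $500K x $500K layer pure premiums for models 1 to 20, as listed in Exhibit 1.
layer_pp = [763, 911, 1153, 1224, 1281, 1390, 1494, 1587, 1849, 2069,
            2417, 2598, 2788, 3004, 3202, 3382, 3543, 4058, 4663, 5354]


def posterior_mean_std(weights, values):
    """Posterior-weighted mean and standard deviation over the models."""
    total = sum(weights)  # renormalize, since rounded weights sum to 0.999999
    mean = sum(w * v for w, v in zip(weights, values)) / total
    second = sum(w * v * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(second - mean * mean)


mean, std = posterior_mean_std(posterior, layer_pp)
print(mean, std)
```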
                       Small Insurer           Medium Insurer          Large Insurer
                       Layer Pure Premium      Layer Pure Premium      Layer Pure Premium
                       $500K x     $1M x       $500K x     $1M x       $500K x     $1M x
                       $500K       $1M         $500K       $1M         $500K       $1M
Posterior Mean         1,572       1,113       1,344       909         1,360       966
Posterior Std. Dev.    463         385         278         245         234         188
• All insurers were simulated from same population.
• Posterior standard deviation decreases with insurer size.
• Obtain model for individual insurers
• Obtain data for insurer of interest
• Calculate likelihood, Pr{data|model}, for each insurer’s model.
• Use Bayes’ Theorem to calculate posterior probability of each model
• Calculate the statistic of choice using models and posterior probabilities
– e.g. Loss reserves