Properties of Chain Ladder Models

Some important properties of chain ladder models
Section I. Mack (chain ladder) is volume-weighted average ratios
When we compute a ratio, y/x, what are we looking at?
Data are incurred losses from Mack (1994).
AY        0       1       2       3       4       5       6       7       8       9
 1    5,012   8,269  10,907  11,805  13,539  16,181  18,009  18,608  18,662  18,834
 2      106   4,285   5,396  10,666  13,782  15,599  15,496  16,169  16,704
 3    3,410   8,992  13,873  16,141  18,735  22,214  22,863  23,466
 4    5,655  11,555  15,766  21,266  23,425  26,083  27,067
 5    1,092   9,565  15,836  22,169  25,955  26,180
 6    1,513   6,445  11,702  12,935  15,852
 7      557   4,020  10,946  12,314
 8    1,351   6,947  13,112
 9    3,133   5,395
10    2,063
Consider the ratio of the cumulative value in AY1, DY1 to the previous value in AY1:
AY        0       1       2       3
 1    5,012   8,269  10,907  11,805
 2      106   4,285   5,396  10,666
 3    3,410   8,992  13,873  16,141
8269/5012 = 1.65
Let’s call the numerator y and the denominator x.
Let’s look at a graph of DY1 vs DY0. What is the value 1.65?
[Figure: Cum DY 1 vs DY 0. The point (5012, 8269) is shown with the line through the origin that passes through it; slope = 1.65.]
If you plot x and y as a point on a plot, the ratio y/x is the slope of the line through the
origin that passes through that point.
When we think of there being a “typical” ratio (which we may want to use for
prediction), we are also talking about a “typical” slope through the origin on that (x,y)
plot.
If our measure of “typical” is some kind of average (such as a weighted average, a
geometric mean, an average of the most recent values, or whatever), we are talking
about both an “average” ratio and at the same time, an “average” slope.
Table of ratios

AY      0-1     1-2     2-3     3-4     4-5     5-6     6-7     7-8     8-9
 1   1.6498  1.3190  1.0823  1.1469  1.1951  1.1130  1.0333  1.0029  1.0092
 2  40.4245  1.2593  1.9766  1.2921  1.1318  0.9934  1.0434  1.0331
 3   2.6370  1.5428  1.1635  1.1607  1.1857  1.0292  1.0264
 4   2.0433  1.3644  1.3489  1.1015  1.1135  1.0377
 5   8.7592  1.6556  1.3999  1.1708  1.0087
 6   4.2597  1.8157  1.1054  1.2255
 7   7.2172  2.7229  1.1250
 8   5.1421  1.8874
 9   1.7220
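The table of ratios can be reproduced directly from the cumulative triangle. The following is a minimal sketch in Python; the function name `age_to_age_ratios` and the list-of-lists layout are illustrative choices, not something taken from the original article.

```python
# Cumulative incurred losses from Mack (1994), as tabulated above.
# Rows are accident years 1..10; columns are development years 0..9.
triangle = [
    [5012, 8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834],
    [106, 4285, 5396, 10666, 13782, 15599, 15496, 16169, 16704],
    [3410, 8992, 13873, 16141, 18735, 22214, 22863, 23466],
    [5655, 11555, 15766, 21266, 23425, 26083, 27067],
    [1092, 9565, 15836, 22169, 25955, 26180],
    [1513, 6445, 11702, 12935, 15852],
    [557, 4020, 10946, 12314],
    [1351, 6947, 13112],
    [3133, 5395],
    [2063],
]


def age_to_age_ratios(rows):
    """For each accident year, return the ratios y/x of successive cumulatives."""
    return [[row[j + 1] / row[j] for j in range(len(row) - 1)] for row in rows]


ratios = age_to_age_ratios(triangle)

# Print one line per accident year, rounded to four decimals as in the table.
for ay, row in enumerate(ratios, start=1):
    print(ay, " ".join(f"{r:.4f}" for r in row))
```

Each printed row should correspond to one accident year in the table; the last accident year has a single observation and therefore no ratio.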
In statistical terms, we’re using sample ratios to estimate the underlying ratio, since
the observed ratios are “noisy”.
We are also saying that, given the previous cumulative, we expect that on average the
next cumulative is a multiple of the previous one. That is:
$E(y/x \mid x) = r \;\equiv\; E(y \mid x) = rx.$
“E()” stands for “expected” (underlying average) value of whatever is in parentheses,
and “|” means “given”. So E(y|x) means “the expected value of y, given the value of
x”.
The left side is a ratio, r; the right side is a line through the origin with slope r. They
both describe the same relationship between one cumulative or incurred and the next.
So if there’s an underlying "ratio", it’s also an underlying slope.
Here’s one such “average” line, an ordinary regression line through the origin:
[Figure: Cum DY 1 vs DY 0, with the ordinary regression line through the origin.]
The slope of this line is around 2.217, which is an estimate of the underlying ratio.
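As a check on that figure, here is a minimal sketch of the least-squares slope for a line constrained to pass through the origin, which is sum(x·y) / sum(x²). The variable and function names are illustrative only; the data are the DY0 and DY1 columns for accident years 1 to 9.

```python
# DY0 (x) and DY1 (y) cumulatives for accident years 1..9 from the Mack data.
x = [5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133]
y = [8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395]


def slope_through_origin(xs, ys):
    """Ordinary least-squares slope for y = r * x (no intercept term)."""
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)


ols_slope = slope_through_origin(x, y)
print(round(ols_slope, 3))  # 2.217
```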
Residuals
The residuals from this fit are the differences between the points and the line.
Residual = data – fit of method ; Residual trend = data trend – method trend
We use residuals to assess ways in which the model assumptions don’t apply.
Let’s calculate the residual for the point with y = 8992 and x = 3410, and for the point
with y = 9565, x = 1092 (these are from AYs 3 and 5 respectively). The observed
value for DY1 for the point circled in red below is 8992.
The fitted value (the height of the line at the x-value 3410) is 3410 × 2.217 = 7560.
[Figure: Cum DY 1 vs DY 0 with the fitted line through the origin. The point (3410, 8992) is marked, with its residual of 1432.]
The residual, 1432, is 8992 – 3410 × 2.217: the observed value in DY 1 minus the
prediction from the ratio times the previous value (residual = data – fit). The
observation with the largest residual is circled in blue. Its observed (y) value is 9565,
while the x value is 1092. Consequently, its residual is 9565 – 1092 × 2.217, which
gives 7144.
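The residual arithmetic can be sketched as follows. The ratio 2.217 is the rounded slope quoted in the text; the helper name `residual` is an illustrative choice. The sketch also lists all nine residuals in order of increasing x.

```python
RATIO = 2.217  # rounded slope of the fitted line through the origin, from the text

# DY0 (x) and DY1 (y) cumulatives for accident years 1..9.
x = [5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133]
y = [8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395]


def residual(x_value, y_value, ratio=RATIO):
    """Residual = data - fit, where the fit is ratio * previous cumulative."""
    return y_value - ratio * x_value


res_ay3 = residual(3410, 8992)  # about 1432
res_ay5 = residual(1092, 9565)  # about 7144

# Residuals sorted by the previous cumulative (x), smallest x first.
by_x = sorted((xi, residual(xi, yi)) for xi, yi in zip(x, y))
for xi, ri in by_x:
    print(f"{xi:>6} {ri:>10.1f}")
```

In this listing the residuals for the smaller x values are all positive and those for the two largest x values are negative.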
[Figure: Residuals vs DY0.]
Above are residuals from the fitted line (ratio). Notice the downward trend!
Something is clearly amiss; the line through the origin doesn’t describe the
relationship well. In fact, the residuals are getting smaller as the previous cumulative
gets larger. Look at the fitted line again, and see how the points on the left are above it
and the points on the right are mostly below it. The relationship is not a line through
the origin:
[Figure: Cum.(1) vs Cum.(0), with a line of best fit that is not constrained to pass through the origin.]
Above is a line of best fit in green. Clearly a line that doesn’t go through the origin is
a better description of the relationship here.
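A rough way to quantify this is to fit both lines by least squares and compare their sums of squared errors. The sketch below does that in plain Python; the helper names `fit_line` and `sse` are hypothetical, and ordinary (unweighted) least squares is assumed for both fits.

```python
# DY0 (x) and DY1 (y) cumulatives for accident years 1..9.
x = [5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133]
y = [8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
    sxx = sum((a - xbar) ** 2 for a in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def sse(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(xs, ys))


intercept, slope = fit_line(x, y)
origin_slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

sse_with_intercept = sse(x, y, intercept, slope)
sse_through_origin = sse(x, y, 0.0, origin_slope)
print(f"with intercept: y = {intercept:.0f} + {slope:.3f} x")
print(f"through origin: y = {origin_slope:.3f} x")
print(sse_with_intercept < sse_through_origin)
```

On these nine points the fitted intercept comes out at roughly 5,100 with a slope of about 0.89, well below the 2.217 obtained when the line is forced through the origin.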
Normally residuals are divided by their (individual) standard deviation, so that they
share a common scale – the result is standardized residuals. Secondly, it’s important
to see whether the residuals are related to other likely predictors of the observations
(in which case we will see non-random trends in the residuals plotted against those
predictors). One obvious thing to do is to look at residuals against the three directions
(accident year, development year and calendar year), as below. The fourth plot,
residuals vs fitted values, is a standard regression diagnostic. Notice that it has exactly
the same appearance as the above plot – only the scale labels are different!
[Figure: Residual display for a ratio model for DY1 on DY0 (first pair of years). Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
The plots against accident and calendar years are the same because we only have a
single pair of years. There’s also not a lot of information in the residuals for a single
pair of years – patterns have to be quite strong for us to be able to pick anything up.
Here’s the plot of y vs x for the second pair of years (DY2 vs DY1), followed by the
corresponding residuals.
[Figure: Cum.(2) vs Cum.(1).]
[Figure: Residual display for a ratio model for DY2 on DY1 (second pair of years). Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
In the model for DY2, we can see an increasing trend against calendar (and accident)
year, and a decreasing trend against fitted values – again, the relationship between
DY2 (y) and DY1 (x this time) is not through the origin.
[Figure: Cum.(3) vs Cum.(2). Plot of y vs x for the third pair of years (DY3 vs DY2).]
[Figure: Residual display for a ratio model for DY3 on DY2 (third pair of years). Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
[Figure: Cum.(4) vs Cum.(3). Plot of y vs x for the fourth pair of years (DY4 vs DY3).]
[Figure: Residual display for a ratio model for DY4 on DY3 (fourth pair of years). Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
By now it’s getting hard to see much of anything going on; there are only 7 and 6
points respectively in the most recent two sets of residual plots above.
We can combine the residuals together into a display against each direction. The plots
against accident and calendar years will no longer be redundant, and we will be able
to pick up trends in those directions. Note that the plot against development years will
not show lack of fit in general, since the fitted line will go through a “weighted
average” value of y at each development, but it will allow us to see what’s going on
with the spread around the line. The plot against fitted values will more clearly show
(by having an overall trend) whether there’s a tendency to need an intercept.
[Figure: Residuals from all pairs of years combined. Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
We can see a strong overall downward trend against fitted values. This strongly
suggests a need for an intercept term!
The standard chain ladder ratio (Mack)
The standard chain ladder ratio (Mack ratio) is a kind of weighted average of the
ratios, where the weight is the previous cumulative – it’s sometimes called a “volume
weighted average”.
An ordinary average of a set of y values (y1, y2, … yn) is (y1 + y2 + … + yn)/n. We can
write that in short form as: $\left(\sum_i y_i\right)/n$.
A weighted average has a weight for each observation, so that an observation with
more weight “affects” the average more than one with less weight. It “pulls” the
average toward it. A weighted average looks like this: $\left[\sum_i w_i y_i\right] / \left[\sum_i w_i\right]$.
A weighted average ratio is written like this: $r = \left[\sum_i w_i (y_i/x_i)\right] / \left[\sum_i w_i\right]$.
The chain ladder/Mack model has $w_i = x_i$.
That is, $r = \left[\sum_i x_i (y_i/x_i)\right] / \left[\sum_i x_i\right] = \left[\sum_i y_i x_i/x_i\right] / \left[\sum_i x_i\right] = \sum_i y_i / \sum_i x_i$.
In other words, if we add up the two columns and take the ratio to get the chain ladder
ratio,
(8269 + 4285 + 8992 + …) / (5012 + 106 + 3410 + …)
it’s the same as weighting the individual ratios by the first column and taking the
average:
(5012 · 1.6498 + 106 · 40.4245 + 3410 · 2.6370 + …) / (5012 + 106 + 3410 + …)
There are three entirely equivalent ways of looking at the same thing: as a ratio of
sums, as a weighted average ratio, and as a weighted average slope (weighted
regression line through the origin).
Because the weights applied to the ratios are the previous value (incurred or
cumulative), this kind of weighted average is often called a volume-weighted average
ratio.
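The equivalence of the three views can be checked numerically on the first pair of development years. This is a sketch under one stated assumption: the weighted regression through the origin is taken to use weights 1/x, which is the choice that reproduces the ratio of sums. The text itself does not spell out the regression weights, and all variable names here are illustrative.

```python
# DY0 (x) and DY1 (y) cumulatives for accident years 1..9.
x = [5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133]
y = [8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395]
ratios = [yi / xi for xi, yi in zip(x, y)]

# View 1: ratio of column sums.
ratio_of_sums = sum(y) / sum(x)

# View 2: average of the individual ratios, weighted by the previous cumulative.
weighted_avg_ratio = sum(xi * ri for xi, ri in zip(x, ratios)) / sum(x)

# View 3: weighted least squares through the origin.
# ASSUMPTION: weights are 1/x, giving slope = sum(w*x*y) / sum(w*x*x).
weights = [1.0 / xi for xi in x]
wls_slope = (sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
             / sum(w * xi * xi for w, xi in zip(weights, x)))

# For contrast: the unweighted (arithmetic) average of the ratios.
simple_average = sum(ratios) / len(ratios)

print(round(ratio_of_sums, 4), round(weighted_avg_ratio, 4), round(wls_slope, 4))
print(round(simple_average, 4))
```

All three volume-weighted figures agree at about 2.999, while the arithmetic average of the ratios is about 8.2 because the ratio of 40.4245 for accident year 2 pulls it up.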
[Figure: Mack (chain ladder) is volume-weighted average ratios. Individual ratios and the standard chain ladder (Mack) ratio (red) for DY1 vs DY0. The arithmetic average of the ratios is in gray.]
Below are the residuals for the Mack model applied to all pairs of years.
[Figure: Residual display for the Mack model applied to all pairs of years. Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
Again, we see a strong downward trend. Note that if a line through the origin is
inadequate because the actual relationship needs an intercept, then no other line
through the origin will fit. That is, no ratio, no matter how you choose it, will
adequately describe the development.
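A short derivation may make this concrete. It assumes, purely for illustration, that the true relationship is a straight line with intercept $\alpha$ and slope $\beta$ (symbols introduced here, not used elsewhere in the article):

```latex
\[
E(y \mid x) = \alpha + \beta x
\quad\Longrightarrow\quad
E\!\left(\frac{y}{x} \,\middle|\, x\right) = \beta + \frac{\alpha}{x},
\]
\[
E(y - r x \mid x) = \alpha + (\beta - r)\,x .
\]
```

Under that assumption the expected ratio depends on x whenever $\alpha \neq 0$, so there is no single underlying ratio to estimate. The expected residual from any ratio r is zero for every x only if $\alpha = 0$ and $r = \beta$.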
Section II. Chain ladder does not distinguish between accident and calendar years
It is a property of the chain ladder that, when it is applied to an incremental triangle, the
incremental forecasts are identical to those from the equivalent procedure where accident
years and development years are interchanged.
Equivalently, you get the same incremental forecasts whether cumulation, calculation
of ratios and projection run to the right (across development years) or down (across
accident years).
This applies to any model whose forecasts reproduce those of the standard chain
ladder, so Mack’s model and the two-way cross-classification quasi-Poisson GLM
both have this property in respect of the mean forecasts.
Consider the following (toy) triangle, which represents incremental paid losses. The
cell marked ? is the value to be forecast.
20   10
30    ?
It turns out that the value to be forecast = 30 × 10 / 20 = 15
Usual calculation (working across to the right)
Equivalent cumulatives:
20   30
30
Ratio: 30/20 = 1.5
Cumulative forecast = 30 × 1.5 = 45
Incremental forecast = 45 – 30 = 15.
Working down rather than across
Cumulating down:
20   10
50
Ratio running down: 50/20 = 2.5
“Cumulative” forecast = 10 × 2.5 = 25
Incremental forecast = 25 – 10 = 15.
In the first case, writing everything in terms of the incrementals, the ratio is
(20+10)/20 = 1 + 10/20 .
The cumulative forecast is 30 × (1 + 10/20) = 30 + 30 × 10/20 .
The incremental forecast is 30 + 30 × 10/20 – 30 = 30 × 10 / 20 .
In the second case, again writing in terms of incrementals, the ratio is
(20+30)/20 = 1 + 30/20 .
The cumulative forecast is 10 × (1 + 30/20) = 10 + 10 × 30/20 .
The incremental forecast is 10 + 10 × 30/20 – 10 = 10 × 30 / 20 .
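The two calculations for the toy triangle can be written out as a small sketch. The function names `forecast_across` and `forecast_down` are illustrative, and `None` is used here to mark the cell to be forecast.

```python
# Incremental toy triangle: rows are accident years, columns are development years.
inc = [[20, 10],
       [30, None]]  # None marks the cell to forecast


def forecast_across(t):
    """Cumulate to the right, take the ratio in the first row, project the second row."""
    ratio = (t[0][0] + t[0][1]) / t[0][0]      # 30 / 20 = 1.5
    cumulative_forecast = t[1][0] * ratio      # 30 * 1.5 = 45
    return cumulative_forecast - t[1][0]       # back to an incremental


def forecast_down(t):
    """Cumulate downward, take the ratio in the first column, project the second column."""
    ratio = (t[0][0] + t[1][0]) / t[0][0]      # 50 / 20 = 2.5
    cumulative_forecast = t[0][1] * ratio      # 10 * 2.5 = 25
    return cumulative_forecast - t[0][1]       # back to an incremental


across = forecast_across(inc)
down = forecast_down(inc)
print(across, down)  # 15.0 15.0
```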
In the general case, let b denote the sum of incrementals in the same development
year (earlier accident years), c the sum of incrementals in the same accident year
(earlier development years), and a the sum of all values that are both above and to
the left (in earlier accident and development years). Then the incremental predicted
(forecast) value, calculated in either direction, is (bc)/a .

                earlier DYs    same DY
earlier AYs          a            b
same AY              c          p̂ij

$\hat{p}_{ij} = bc/a$
Note that if the forecast value is further into the future than the next diagonal
(calendar year), the (unknown) future incremental values required for the formula are
replaced with their own forecasts.
An algebraic proof that the incremental forecasts are of the form (bc)/a is given in
Barnett, Zehnwirth and Dubossarsky (2005).
This formula is symmetric: it is the same if we interchange accident and
development periods (i.e. transpose the incremental array), since bc/a = cb/a.
Consequently, it is always the case that the chain ladder works the same if we treat the
accident years as if they were the development years and vice versa.
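The transpose property and the (bc)/a form can both be checked on a small example. The sketch below is a plain implementation of the standard volume-weighted chain ladder on an incremental triangle; the 3 × 3 triangle values are made up for illustration, and the function names are hypothetical.

```python
import math


def complete_triangle(inc):
    """Fill an n x n incremental triangle (None = not yet observed) with
    standard chain ladder forecasts, cumulating across development years."""
    n = len(inc)
    cum = [[None] * n for _ in range(n)]
    for i in range(n):                       # cumulate the observed part of each row
        running = 0.0
        for j in range(n - i):
            running += inc[i][j]
            cum[i][j] = running
    for j in range(n - 1):
        observed = range(n - 1 - j)          # rows where DY j and DY j+1 are both observed
        f = (sum(cum[i][j + 1] for i in observed)
             / sum(cum[i][j] for i in observed))   # volume-weighted ratio
        for i in range(n - 1 - j, n):        # project the remaining rows one step
            cum[i][j + 1] = cum[i][j] * f
    # Convert the completed cumulative array back to incrementals.
    return [[cum[i][j] - (cum[i][j - 1] if j else 0.0) for j in range(n)]
            for i in range(n)]


def transpose(t):
    return [list(col) for col in zip(*t)]


# Hypothetical incremental triangle (values invented for this example).
inc = [[100, 60, 20],
       [120, 70, None],
       [150, None, None]]

across = complete_triangle(inc)
down = transpose(complete_triangle(transpose(inc)))

# (bc)/a for the cell in the second accident year, third development year.
a = 100 + 60        # above and to the left
b = 20              # same development year, earlier accident years
c = 120 + 70        # same accident year, earlier development years
print(across[1][2], b * c / a)

# For the bottom-right cell, b and c include forecasts of the cells above and to the left.
a2 = 100 + 60 + 120 + 70
b2 = 20 + across[1][2]
c2 = 150 + across[2][1]
print(across[2][2], b2 * c2 / a2)
```

Running the chain ladder on the transposed triangle and transposing the result back gives the same completed array, which is the symmetry described above.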
This is a worrisome property, because we know that the accident and development
year directions are different. It appears to make no sense to cumulate downward and
take ratios running down, but in fact the usual across version makes just as much
sense. Some of the consequences are discussed in detail in Barnett et al (2005).
Example
The data in the following example are from the triangle ABC in Barnett and
Zehnwirth (2000). Note that we do not use the exposures in this example.
Firstly, let’s see what the residuals from a Mack model tell us about the suitability of a
ratio model.
We clearly see below strong changes in trend in the calendar year direction; neither
Mack nor the quasi-Poisson GLM version of the chain ladder
can deal with this. This is not an unusual circumstance.
[Figure: Residual display for the Mack model fitted to the ABC data. Four panels: Wtd Std Res vs Dev. Yr, vs Acc. Yr, vs Cal. Yr, and vs Fitted.]
Let’s now examine standard chain ladder forecasts for this data. These forecasts are
the same for Mack and the quasi-Poisson GLM.
Incremental chain ladder forecasts for the ABC data (not exposure adjusted):
AY          0       1       2       3       4       5       6       7       8       9      10
1977   153638  188412  134534   87456   60348   42404   31238   21252   16622   14440   12200
1978   178536  226412  158894  104686   71448   47990   35576   24818   22662   18000   14455
1979   210172  259168  188388  123074   83380   56086   38496   33768   27400   20590   16918
1980   211448  253482  183370  131040   78994   60232   45568   38000   26102   20758   17056
1981   219810  266304  194650  120098   87582   62750   51000   34286   26997   21469   17640
1982   205654  252746  177506  129522   96786   82400   44925   33853   26656   21198   17418
1983   197716  255408  194648  142328  105600   65149   45697   34435   27114   21562   17717
1984   239784  329242  264802  190400  116193   82949   58182   43843   34522   27453   22558
1985   326304  471744  375400  234612  159737  114035   79986   60273   47459   37742   31011
1986   420778  590400  425805  287301  195611  139645   97950   73809   58118   46218   37976
1987   496200  649327  482379  325473  221600  158199  110964   83616   65840   52359   43021

Observed values are those with accident year + development year no later than 1987; the remaining entries are forecasts.
Incremental chain ladder forecasts for the transpose of the ABC data:
            0       1       2       3       4       5       6       7       8       9      10
1977   153638  178536  210172  211448  219810  205654  197716  239784  326304  420778  496200
1978   188412  226412  259168  253482  266304  252746  255408  329242  471744  590400  649327
1979   134534  158894  188388  183370  194650  177506  194648  264802  375400  425805  482379
1980    87456  104686  123074  131040  120098  129522  142328  190400  234612  287301  325473
1981    60348   71448   83380   78994   87582   96786  105600  116193  159737  195611  221600
1982    42404   47990   56086   60232   62750   82400   65149   82949  114035  139645  158199
1983    31238   35576   38496   45568   51000   44925   45697   58182   79986   97950  110964
1984    21252   24818   33768   38000   34286   33853   34435   43843   60273   73809   83616
1985    16622   22662   27400   26102   26997   26656   27114   34522   47459   58118   65840
1986    14440   18000   20590   20758   21469   21198   21562   27453   37742   46218   52359
1987    12200   14455   16918   17056   17640   17418   17717   22558   31011   37976   43021
It should be noted, however, that the Mack standard deviations of forecasts (or
outstandings or ultimates) and their coefficients of variation do not have the transpose
property. Because of the way the conditioning is set up, the conditioning in the
variance is not symmetric when the array is transposed.
The symmetry in the forecasts (specifically, that the transpose of the forecast and the
forecast of the transpose of the triangle are equal) is not a desirable property. It is not
an indication of “robustness” – forecasts from ratio models are highly sensitive to
particular observations, and completely insensitive to other observations. It is not an
indication of suitability of ratio models in general or of the chain ladder in particular,
nor of either the Mack or quasi-Poisson GLM versions of it.
If we run a structurally similar symmetric model from PTF (a two-way cross-classification model with log-link), of course the model won’t be suitable either:
[Figure: Wtd Std Res vs Cal. Yr.]
However, a more suitable model can be obtained by allowing for some CY trend
changes (though it would still be overparameterized and suffer from some of the other
deficiencies mentioned in Barnett et al, 2005).
This model does have symmetry in forecast standard errors and CVs. For example, the
following table is the same for both the original and transposed array.
Calendar Year   Mean Reserve   Standard Dev.   CV Reserve
1988               1,619,407          87,010         0.05
1989               1,145,583          64,074         0.06
1990                 787,890          44,277         0.06
1991                 549,715          31,161         0.06
1992                 390,448          22,714         0.06
1993                 279,534          16,870         0.06
1994                 203,820          13,195         0.06
1995                 145,476          10,766         0.07
1996                  92,444           8,594         0.09
1997                  44,221           6,252         0.14
Total              5,258,538         240,539         0.05
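The CV column is the standard deviation divided by the mean reserve. As a small consistency check on the table (the dictionary layout and names below are illustrative), the following sketch recomputes the CVs and confirms that the yearly means add up to the total:

```python
# Calendar year: (mean reserve, standard deviation), copied from the table above.
reserves = {
    1988: (1_619_407, 87_010),
    1989: (1_145_583, 64_074),
    1990: (787_890, 44_277),
    1991: (549_715, 31_161),
    1992: (390_448, 22_714),
    1993: (279_534, 16_870),
    1994: (203_820, 13_195),
    1995: (145_476, 10_766),
    1996: (92_444, 8_594),
    1997: (44_221, 6_252),
}
total_mean, total_sd = 5_258_538, 240_539

# Coefficient of variation = standard deviation / mean.
cv = {year: sd / mean for year, (mean, sd) in reserves.items()}
total_cv = total_sd / total_mean

for year, value in cv.items():
    print(year, f"{value:.2f}")
print("Total", f"{total_cv:.2f}")

sum_of_means = sum(mean for mean, _ in reserves.values())
sum_of_sds = sum(sd for _, sd in reserves.values())
print(sum_of_means == total_mean, total_sd < sum_of_sds)
```

The means are additive across calendar years, whereas the total standard deviation in the table is smaller than the sum of the yearly standard deviations.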
The same would be true for the quasi-Poisson GLM version of the chain ladder.
As already mentioned, symmetry is not of itself a desirable property, but if you want
to impose that symmetry it would be interesting to ask why we should expect it to
hold for the mean but not the standard deviation.
An article on the Mack method and bootstrapping is available here.
References
Barnett, G. and B. Zehnwirth (2000) Best Estimates for Reserves, PCAS No 87, p245-303.
Barnett G., B. Zehnwirth and E. Dubossarsky (2005) When Can Accident Years Be
Regarded As Development Years?, PCAS No 92, p249-256.
Mack, Th. (1994), "Which stochastic model is underlying the chain ladder method?"
Insurance Mathematics and Economics, Vol 15 No. 2/3, 1994, pp. 133-138.