© Deloitte Consulting, 2004

Loss Reserving Using Policy-Level Data
James Guszcza, FCAS, MAAA
Jan Lommele, FCAS, MAAA
Frank Zizzamia
CLRS, Las Vegas, September 2004

Agenda
- Motivations for reserving at the policy level
- Outline of one possible modeling framework
- Sample results

Motivation: Why Do Reserving at the Policy Level?

Two Basic Motivations
1. Better reserve estimates for a changing book of business
   - Do triangles "summarize away" important patterns? Could adding predictive variables help?
2. More accurate estimates of reserve variability
   - Summarized data require more sophisticated models to "recover" heterogeneity.
   - Is a loss triangle a "sufficient statistic" for ultimate losses and their variability?

(1) Better Reserve Estimates
- Key idea: use predictive variables to supplement loss development patterns.
- Most reserving approaches analyze summarized loss/claim triangles, which does not allow the use of covariates (other than time indicators) to predict ultimate losses.
- Actuaries already use predictive variables to construct rating plans and underwriting models. Why not in loss reserving too?

Why Use Predictive Variables?
- Suppose a company's book of business has been deteriorating for the past few years.
- This decline might not be reflected in a summarized loss development triangle.
- However, the resulting change in the distributions of certain predictive variables might allow us to refine our ultimate loss estimates.

Examples of Predictive Variables
- Claim detail: type of claim, time between claim occurrence and reporting
- The policy's historical loss experience
- Information about the agent who wrote the policy
- Exposure: premium, number of vehicles, number of buildings/employees
- Other specifics: policy age, business/policyholder age, credit score, ...

More Data Points
- Typical reserving projects use claim data summarized to the year or quarter level.
- This is probably an artifact of the era of pencil-and-paper statistics.
- In certain cases important patterns might be "summarized away."
- In the computer age, why restrict ourselves?
- More data points mean less chance of overfitting the model.

Danger of Overfitting
- One well-known example: an overdispersed Poisson GLM fit to a loss triangle, a stochastic analog of the chain ladder.
- 55 data points and roughly 20 estimated parameters: the estimated parameters have high standard errors.
- How do we know the model will generalize well on future development?
- Policy-level data: thousands of data points.

Out-of-Sample Testing
- A policy-level dataset has thousands of data points, rather than 55.
- This provides more flexibility for various out-of-sample testing strategies: holdout samples and cross-validation.
- Uses: model selection and model evaluation.

(2) Reserve Variability
- Variability components: process risk, parameter risk, and model specification risk.
- Predictive error = process risk + parameter risk. Both are quantifiable, and they are what we will focus on.
- Reserve variability should also consider model risk, which is harder to quantify.

Reserve Variability
- Can policy-level data give us a more accurate view of reserve variability?
- Process risk: we are not summarizing away variability in the data.
- Parameter risk: more data points should lead to less estimation error.
- Prediction variance: brute-force "bootstrapping" easily combines process and parameter variance, leaving us more time to focus on model risk.

Disadvantages
- It is expensive to gather and prepare claim-detail information, and still more expensive to combine it with policy-level covariates.
- A more open-ended universe of modeling options (both good and bad).
- Requires more analyst time, computing power, and specialist software.
- Less interactive than working in Excel.
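The 55-points-versus-20-parameters contrast above can be made concrete with a small illustration (pure Python; the tiny triangle figures are made up for the example, and the parameter count uses the usual row-plus-column parameterization of the overdispersed Poisson chain-ladder GLM):

```python
def triangle_cells(n):
    """Number of observed cells in an n x n run-off triangle."""
    return n * (n + 1) // 2

def odp_glm_params(n):
    """Row plus column effects in the ODP chain-ladder GLM
    (one corner constraint), i.e. 2n - 1 parameters."""
    return 2 * n - 1

def volume_weighted_ldf(col_j, col_j1):
    """Classical link ratio: total losses at age j+1 over total at age j,
    using only the years observed at both ages."""
    m = len(col_j1)
    return sum(col_j1) / sum(col_j[:m])

# A 10 x 10 triangle: 55 data points but 19-20 parameters,
# hence high standard errors on the estimates.
n = 10
print(triangle_cells(n), odp_glm_params(n))  # 55 19

# Tiny illustrative link-ratio calculation (made-up losses):
paid_12 = [100.0, 110.0, 120.0]
paid_24 = [200.0, 231.0]   # third year not yet observed at 24 months
ldf = volume_weighted_ldf(paid_12, paid_24)
```

With thousands of policy-level records instead, the ratio of data points to parameters improves by two orders of magnitude.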
Modeling Approach: Sample Model Design

Philosophy
- Provide an example of how reserving might be done at the policy level.
- To keep things simple, consider a generalization of the chain ladder. This is just one possible model.
- The analysis is suggestive rather than definitive: no consideration of superimposed inflation, no consideration of calendar-year effects, model risk not estimated, etc.

Notation
- L_j = {L_12, L_24, ..., L_120}: losses developed at 12, 24, ..., 120 months, measured from the policy inception date.
- PY_i = {PY_1, PY_2, ..., PY_10}: policy years 1, 2, ..., 10.
- {X_k}: covariates used to predict losses.
- Assumption: covariates are measured at or before policy inception.

Model Design
- Build 9 successive GLM models: regress L_24 on L_12, L_36 on L_24, and so on.
- Each GLM is analogous to a link ratio.
- The L_j -> L_{j+1} model is applied either to actual values at age j or to predicted values from the L_{j-1} -> L_j model.
- Predict L_{j+1} using the covariates along with L_j.

Model Design
- Idea: model each policy's loss development from period j to j+1 as a function of a linear combination of several covariates:

      L_{j+1} / L_j = f(beta_0 + beta_1*X_1 + ... + beta_n*X_n)

- This is a policy-level generalization of the chain-ladder idea: in the case where there are no covariates,

      L_{j+1} / L_j = f(beta_0) = link ratio.

Model Design
- Overdispersed Poisson GLM with a log link function: the variance of L_{j+1} is proportional to its mean.
- Treat log(L_j) as the offset term, which allows us to model the rate of loss development:

      L_{j+1} = L_j * exp(beta_0 + beta_1*X_1 + ... + beta_n*X_n)
              = exp(log(L_j) + beta_0 + beta_1*X_1 + ... + beta_n*X_n)

Using Policy-Level Data
- Note: because we are using policy-level data, the data contain many zeros.
- The Poisson assumption places a point mass at zero.
- How to handle IBNR: include a dummy variable indicating $0 loss at 12 months, and interact this indicator with the other covariates.
- The model will allocate a piece of the IBNR to each policy with $0 loss as of 12 months.
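A minimal sketch of the offset formulation above, in pure NumPy on synthetic data (the paper's actual covariates and fitting software are not specified, so everything here — the IRLS fitter, the made-up covariate, and the simulated losses — is illustrative only):

```python
import numpy as np

def fit_poisson_glm(X, y, offset, n_iter=50, tol=1e-10):
    """Poisson GLM with log link and an offset, fit by IRLS:
    mu = exp(offset + X @ beta).  Overdispersion rescales the
    standard errors but leaves the coefficient estimates unchanged."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(offset + X @ beta)
        z = X @ beta + (y - mu) / mu      # working response (offset removed)
        XtW = X.T * mu                    # IRLS weights = mu for Poisson/log
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

rng = np.random.default_rng(0)
n = 5000
L12 = rng.gamma(2.0, 50.0, n) + 1.0       # hypothetical losses at 12 months
x = rng.normal(size=n)                    # a made-up policy-level covariate
X = np.column_stack([np.ones(n), x])
L24 = rng.poisson(L12 * np.exp(0.5 + 0.3 * x)).astype(float)

beta = fit_poisson_glm(X, L24, offset=np.log(L12))

# Chain-ladder connection: with an intercept only, exp(beta0) is exactly
# the volume-weighted link ratio sum(L24) / sum(L12).
b0 = fit_poisson_glm(np.ones((n, 1)), L24, offset=np.log(L12))
```

The intercept-only check is what makes this a "proper generalization" of the chain ladder: dropping all covariates recovers the classical volume-weighted development factor.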
Sample Results

Data
- Policy years 1991 through the first quarter of 2000; workers compensation.
- One record per policy, per year; each record has multiple loss evaluations.
- "Losses @ j months" means losses evaluated at 12, 24, ..., 120 months from the policy inception date.
- Losses are coded as "missing" where appropriate (e.g., PY 1998 losses at 96 months).

Covariates
- Historical loss-ratio and claim-frequency variables
- $0-loss-at-12-months indicator
- Credit score
- Log premium
- Age of business
- New/renewal indicator
- Selected policy year dummy variables (use sparingly: a PY dummy variable is analogous to leaving that PY out of a link-ratio calculation)

Covariates
- Interaction terms between covariates and the $0 indicator.
- Most covariates are used only for the 12-to-24-month GLM.
- The other GLMs use only selected PY indicators; these GLMs give very similar results to the chain ladder.

Results

Number of policies and premium by policy year ($000):

PY      num_pol      prem
1991     51,854   216,399
1992     39,821   170,851
1993     34,327   156,172
1994     33,431   153,561
1995     35,168   148,788
1996     38,005   140,368
1997     41,291   134,616
1998     44,418   134,316
1999     53,210   166,206
2000     14,588    46,955

Actual cumulative paid losses ($000) by months of development:

PY        12      24      36      48      60      72      84      96     108     120
1991   9,310  48,224  70,688  81,542  88,219  92,492  95,967  97,841  99,280 100,057
1992  26,315  54,071  68,369  75,137  79,427  81,789  83,352  84,783  85,776       -
1993  24,789  50,174  61,896  68,498  72,421  74,925  77,151  77,957       -       -
1994  23,137  44,758  55,375  61,249  64,047  65,759  66,682       -       -       -
1995  23,468  44,315  53,571  60,133  62,999  64,796       -       -       -       -
1996  22,989  45,112  54,755  59,477  62,659       -       -       -       -       -
1997  23,936  50,725  62,585  68,432       -       -       -       -       -       -
1998  26,761  58,536  69,525       -       -       -       -       -       -       -
1999  36,452  72,571       -       -       -       -       -       -       -       -
2000  10,266       -       -       -       -       -       -       -       -       -

Chain-ladder development factors:

Age (months)   12-24  24-36  36-48  48-60  60-72  72-84  84-96  96-108  108-120
Link ratio     2.030  1.215  1.103  1.053  1.034  1.026  1.016   1.013    1.008
LDF to ult     3.153  1.553  1.278  1.159  1.101  1.065  1.038   1.021    1.008

Policy-level model: actual and predicted cumulative paid losses ($000; cells below the latest diagonal are model predictions):

PY        12      24      36      48      60      72      84      96     108     120
1991   9,310  48,224  70,688  81,542  88,219  92,492  95,967  97,841  99,280 100,057
1992  26,315  54,071  68,369  75,137  79,427  81,789  83,352  84,783  85,776  86,440
1993  24,789  50,174  61,896  68,498  72,421  74,925  77,151  77,957  78,989  79,572
1994  23,137  44,758  55,375  61,249  64,047  65,759  66,682  67,750  68,624  69,130
1995  23,468  44,315  53,571  60,133  62,999  64,796  66,176  67,210  68,076  68,578
1996  22,989  45,112  54,755  59,477  62,659  64,551  65,894  66,923  67,785  68,285
1997  23,936  50,725  62,585  68,432  72,034  74,167  75,709  76,891  77,882  78,457
1998  26,761  58,536  69,525  76,672  80,667  83,055  84,782  86,106  87,216  87,860
1999  36,452  72,571  88,071  97,073 102,130 105,154 107,341 109,017 110,422 111,237
2000  10,266  21,114  25,653  28,275  29,749  30,629  31,266  31,755  32,164  32,401

Estimated ultimates and reserves ($000):

            Chain-ladder         Policy-level model
PY          ult     reserve      ult     reserve
1992     86,448        672    86,440        664
1993     79,614      1,657    79,572      1,615
1994     69,191      2,509    69,130      2,448
1995     68,982      4,186    68,578      3,782
1996     69,004      6,346    68,285      5,626
1997     79,325     10,892    78,457     10,024
1998     88,870     19,346    87,860     18,335
1999    112,718     40,147   111,237     38,665
2000     32,372     22,106    32,401     22,135
Total              107,860              103,296

Comments
- The policy-level model produces results very close to the chain ladder; it is a proper generalization of the chain-ladder method.
- The model covariates are all statistically significant and have parameters of the correct sign.
- In this case, the covariates seem to have little influence on the predictions.
- They might play more of a role in a book where the quality of business changes over time.
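As a cross-check, the chain-ladder column of the reserve table follows mechanically from the latest diagonal and the cumulative LDFs. A sketch using the slide's own (rounded) figures; because the published factors carry only three decimals, the results differ from the slide's ultimates by well under 0.1%:

```python
# Latest actual paid losses ($000) and cumulative LDFs from the exhibit.
latest = {1992: 85776, 1993: 77957, 1994: 66682, 1995: 64796,
          1996: 62659, 1997: 68432, 1998: 69525, 1999: 72571, 2000: 10266}
ldf_to_ult = {1992: 1.008, 1993: 1.021, 1994: 1.038, 1995: 1.065,
              1996: 1.101, 1997: 1.159, 1998: 1.278, 1999: 1.553, 2000: 3.153}

# Ultimate = latest diagonal times LDF; reserve = ultimate minus paid to date.
ultimate = {py: latest[py] * ldf_to_ult[py] for py in latest}
reserve = {py: ultimate[py] - latest[py] for py in latest}
total_reserve = sum(reserve.values())   # close to the exhibit's 107,860
```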
Model Evaluation
- Treat recent diagonals as holdout
- 10-fold cross-validation

Test the Model by Holding Out the Most Recent 2 Calendar Years
- Refit the nine GLMs with the two most recent diagonals (calendar years 1999 and 2000) held out, then predict the held-out cells. Actual losses are as in the triangle shown earlier.
- Predicted versus actual losses ($000) on the held-out diagonals:

CY 1999 diagonal
Age (months)       24      36      48      60      72      84      96
Policy year      1998    1997    1996    1995    1994    1993    1992
Predicted      54,215  62,126  60,864  63,373  66,101  76,355  84,971
Actual         58,536  62,585  59,477  62,999  65,759  77,151  84,783
Error           -7.4%   -0.7%   +2.3%   +0.6%   +0.5%   -1.0%   +0.2%

CY 2000 diagonal
Age (months)       24      36      48      60      72      84      96     108
Policy year      1999    1998    1997    1996    1995    1994    1993    1992
Predicted      73,695  66,071  69,017  64,108  65,378  67,339  77,809  86,179
Actual         72,571  69,525  68,432  62,659  64,796  66,682  77,957  85,776
Error           +1.5%   -5.0%   +0.9%   +2.3%   +0.9%   +1.0%   -0.2%   +0.5%

- Projected losses at 120 months under the holdout fit ($000): 1991: 100,057; 1992: 86,815; 1993: 79,498; 1994: 70,111; 1995: 69,344; 1996: 70,149; 1997: 79,545; 1998: 84,597; 1999: 114,942; 2000: 32,537.

Cross-Validation Methodology
- Randomly break the data into 10 pieces.
- Fit the 9 GLM models on pieces 1-9 and apply them to piece 10, so piece 10 is treated as out-of-sample data.
- Now use pieces 1-8 and 10 to fit the nine models, and apply them to piece 9.
- Cycle through the 8 other cases.

Cross-Validation Methodology
- 90 GLMs are fit in all: 10 cross-validation iterations, each involving 9 GLMs.
- Stacking the 10 "predicted" pieces yields a 10x10 matrix consisting entirely of out-of-sample predicted values.
- We can compare actuals to predicteds on the upper half of the matrix: each cell of the triangle is treated as out-of-sample data.

Cross-Validation Results

Out-of-sample predicted cumulative paid losses ($000; the 12-month column is actual, and actual losses are as in the triangle shown earlier):

PY        12      24      36      48      60      72      84      96     108     120
1991   9,310  48,445  70,884  81,792  88,506  92,746  96,186  97,687  98,946  99,674
1992  26,315  54,089  68,418  75,154  79,072  81,412  83,103  84,399  85,487  86,117
1993  24,789  50,209  61,157  67,413  70,921  73,023  74,537  75,699  76,676  77,243
1994  23,137  44,796  54,618  60,194  63,326  65,198  66,559  67,601  68,475  68,977
1995  23,468  44,353  54,064  59,572  62,682  64,534  65,879  66,913  67,775  68,272
1996  22,989  47,767  58,181  64,126  67,464  69,463  70,911  72,016  72,940  73,477
1997  23,936  49,471  60,164  66,295  69,758  71,820  73,318  74,467  75,428  75,981
1998  26,761  55,002  66,781  73,601  77,432  79,725  81,385  82,658  83,724  84,340
1999  36,452  74,740  90,704  99,963 105,172 108,285 110,541 112,272 113,720 114,556
2000  10,266  21,100  25,631  28,256  29,727  30,609  31,243  31,734  32,145  32,383
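The 10-fold scheme above can be sketched in a few lines of pure Python. Here a single volume-weighted 12-to-24-month link ratio stands in for the paper's nine GLMs, and the policy data are made up; the point is only the fold mechanics, in which every policy is scored exactly once by a model that never saw it:

```python
import random

def ten_fold_predictions(policies, k=10, seed=1):
    """Cross-validated 24-month predictions: each policy's L24 is
    predicted by a development factor fit on the other k-1 folds."""
    idx = list(range(len(policies)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]     # k roughly equal pieces
    preds = {}
    for i in range(k):
        train = [policies[j] for f in folds[:i] + folds[i + 1:] for j in f]
        # "Fit" on the other nine folds: volume-weighted 12->24 factor.
        factor = sum(p[1] for p in train) / sum(p[0] for p in train)
        for j in folds[i]:                    # apply to the held-out fold
            preds[j] = policies[j][0] * factor
    return preds

def make_policy(r):
    """Hypothetical (L12, L24) pair for one policy."""
    l12 = r.uniform(1.0, 100.0)
    return (l12, l12 * r.uniform(1.5, 2.5))

rng = random.Random(0)
policies = [make_policy(rng) for _ in range(200)]
preds = ten_fold_predictions(policies)
```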
Out-of-sample prediction errors, predicted versus actual, rounded to the nearest percent (predictions begin at 24 months; the 12-month column is actual):

PY       24   36   48   60   72   84   96  108  120
1991     0%   0%   0%   0%   0%   0%   0%   0%   0%
1992     0%   0%   0%   0%   0%   0%   0%   0%
1993     0%  -1%  -2%  -2%  -3%  -3%  -3%
1994     0%  -1%  -2%  -1%  -1%   0%
1995     0%   1%  -1%  -1%   0%
1996     6%   6%   8%   8%
1997    -2%  -4%  -3%
1998    -6%  -4%
1999     3%

Reserve Variability: Using the Bootstrap to Estimate the Probability Distribution of One's Outstanding-Loss Estimate

The Bootstrap
- The statistician Brad Efron proposed a very simple and clever idea for mechanically estimating confidence intervals: the bootstrap.
- The idea is to take multiple resamples of your original dataset.
- Compute the statistic of interest on each resample, and you thereby estimate the distribution of this statistic.

Motivating Example
- Suppose we take 1,000 draws from the normal(500, 100) distribution.
- The sample mean ≈ 500, a point estimate of the "true" mean, as we expect:

statistic    value
N            1,000
MIN         181.15
MAX         836.87
MEAN        499.23
STD          98.96

- From theory we know that s.d.(X-bar) = sigma / sqrt(N) = 100 / sqrt(1000) ≈ 3.16.

Sampling with Replacement
- Draw a data point at random from the dataset, then throw it back in.
- Draw a second data point, then throw it back in.
- Keep going until we have 1,000 data points. You might call this a "pseudo" dataset.
- This is not merely re-sorting the data: some of the original data points will appear more than once; others won't appear at all.

Sampling with Replacement
- In fact, there is a chance of (1 - 1/1000)^1000 ≈ 1/e ≈ 0.368 that any one of the original data points won't appear at all if we sample with replacement 1,000 times, so any given data point is included with probability ≈ 0.632.
- Intuitively, we treat the original sample as the "true population in the sky". Each resample simulates the process of taking a sample from the "true" distribution.

Resampling
- Sample 1,000 data points with replacement from the original dataset S; call this S*1.
- Now do this 399 more times.
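The resampling scheme above in a few lines of NumPy (synthetic data; 1,000 draws and 400 resamples as on the slides):

```python
import numpy as np

rng = np.random.default_rng(42)
sample = rng.normal(500.0, 100.0, size=1000)   # 1,000 draws from normal(500, 100)

# 400 bootstrap resamples: draw 1,000 points with replacement, take the mean.
boot_means = np.array([rng.choice(sample, size=sample.size, replace=True).mean()
                       for _ in range(400)])

# The spread of the bootstrap means approximates the theoretical
# standard error sigma / sqrt(N) = 100 / sqrt(1000) ≈ 3.16.
print(boot_means.mean(), boot_means.std())
```

The same mechanics carry over to reserves: resample whole policies instead of draws, and replace the sample mean with one's favorite reserve estimator.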
- Call the resamples S*1, S*2, ..., S*400, and compute X-bar on each of these 400 samples.

The Result
- The green bars are a histogram of the sample means of S*1, ..., S*400.
- The blue curve is a normal distribution with the sample mean and s.d.
- The red curve is a kernel density estimate of the distribution underlying the histogram: intuitively, a smoothed histogram.

The Result
- The result is an estimate of the distribution of X-bar: it is approximately normal with mean ≈ 500 and s.d. ≈ 3.2.
- The purely mechanical bootstrapping procedure produces what theory tells us to expect.
- Can we use resampling to estimate the distribution of outstanding liabilities?

Bootstrapping Reserves
- Let S be our database of policies. Sample with replacement all policies in S (same size as S); call this S*1.
- Now do this 199 more times, and estimate outstanding reserves on each sample S*1, S*2, ..., S*200.
- The result is a distribution of reserve estimates.

Bootstrapping Reserves
- Compute your favorite reserve estimate on each S*k.
- These 200 reserve estimates constitute an estimate of the distribution of outstanding losses.
- Notice that we did this by resampling our original dataset S of policies. This differs from other analyses, which bootstrap the residuals of a model.
- Residual bootstrapping is perhaps more theoretically intuitive, but it relies on the assumption that your model is correct.

Bootstrapping Results
- Standard deviation ≈ 5% of total outstanding losses.
- 95% confidence interval ≈ (-10%, +10%).
- This is a tighter interval than typically seen in the literature: a result of not summarizing away variability information?
Bootstrapped reserve estimates ($000):

PY      reserve   stdev       %
1992        651     187   28.8%
1993      1,584     274   17.3%
1994      2,444     360   14.7%
1995      3,747     453   12.1%
1996      5,631     503    8.9%
1997      9,972     720    7.2%
1998     18,273   1,166    6.4%
1999     38,820   2,508    6.5%
2000     22,128   1,676    7.6%
Total   103,250   5,304    5.1%

Reserve Distributions
- Histograms of the bootstrapped reserve distribution: all years combined, and each policy year 1992 through 2000. [Figures omitted.]

Interpretation
- This result suggests that, with 95% confidence, total outstanding losses will be within +/- 10% of our estimate — assuming the model is correctly specified.
- Too good to be true? Yes: it doesn't include model risk!
- This is tighter confidence than often seen in the literature, and bootstrapping a model can be tricky.
- However, we are using thousands of data points, and we're not throwing away heterogeneity information.

Closing Thoughts
- The +/- 10% result is only suggestive: it applies to this specific dataset, and the bootstrapping methodology can be refined.
- Suggestion: using policy-level data can yield tighter confidence intervals, because it doesn't throw away information pertaining to process and parameter risk.
- Bootstrapping is conceptually simple: it requires more computer power than brain power, leaving us more time to focus on model risk.