Aggregate Loss Generator
By Dom Yarnell

Purpose

This tool creates aggregate loss distributions for use in pricing primary insurance policies. While features like occurrence limits are priced easily enough with increased limits factors, aggregate loss limits require additional care, as their impact is a function of both severity and frequency. The aggregate loss distribution enables the actuary to reflect the impact of expected frequency more accurately than factors that don't vary with the size of the risk. In addition to taking occurrence limits into account, the VBA code underlying the model can be edited to include other features, like retentions, to price primary insurance policies on the fly.

How it works

The algorithm underlying the Aggregate Loss Generator (ALG) relies on a frequency distribution and a severity distribution, as does traditional Monte Carlo simulation. However, unlike most simulation tools, the ALG must use a discrete severity distribution, and it evaluates all possible loss scenarios, so there is no sampling error. Let's look at a very simple example: a frequency distribution that considers up to two losses, and a discrete severity distribution with five points.

Claim Count   0     1     2
PDF           70%   20%   10%

Loss Amount   10,000   25,000   50,000   75,000   100,000
PDF           60%      20%      10%      7%       3%

Since a single loss implies five possible outcomes, and as many as two losses are possible, there are 31 = 5^0 + 5^1 + 5^2 loss scenarios. Such a small number of calculations lends itself to direct calculation, but the distributions needed in the pricing process will likely include more points, and therefore require many more calculations. For instance, a frequency distribution allowing up to 10 claims and a severity distribution with 12 points would imply over 67 billion loss scenarios to calculate. Clearly, a direct calculation of all loss scenarios is not feasible on computers commonly found in the workplace.

In order to make the calculations more manageable, the ALG compresses the distribution of each claim scenario to a certain number of points, and then uses this distribution to calculate the subsequent claim scenario. In the simple example above, let's assume we set the maximum number of points to four. If there is a single loss, the five points on the severity distribution would be reduced to at most four points. If the first two points were to be compressed, the Loss Amounts would be combined into a weighted average and the probabilities would be totaled, so that the new, single point would have a Loss Amount of 13,750 = [(10,000 x 60%) + (25,000 x 20%)] / (60% + 20%), and a probability of 80% = 60% + 20%. This four-point, 1-claim scenario would then be used to calculate the 2-claim scenario, which now requires 20 = 4 x 5 calculations instead of the original 25 = 5 x 5. In total, 26 = 5^0 + 5^1 + (4 x 5) scenarios would be calculated instead of the original 31.

If we look at the scenario with a frequency distribution allowing up to 10 claims and a 12-point severity distribution, and limit the number of points to 100, the number of loss scenarios to calculate drops from over 67 billion to 9,757.¹ In addition to compressing the distribution of each claim scenario, the ALG compresses the resulting aggregate loss distribution before recording it in Excel.

¹ The actual number of points may be less than the maximum number of points, which means the number of loss scenarios calculated above represents the maximum the computer might have to calculate.
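To make the compression step concrete, here is a minimal Python sketch (the tool itself is written in VBA; the names here are illustrative, not the ALG's actual code). It uses the bucketing rule described under "Occurrence limit and maximum scenario points" below: the bucket width equals the largest loss in the scenario divided by the maximum number of points, and points sharing a bucket are merged into a probability-weighted average.

    from collections import defaultdict
    from math import ceil

    def compress(points, max_points):
        """Compress a discrete distribution [(amount, prob), ...] to at most
        max_points points by merging points that share an equal-width bucket
        into a probability-weighted average."""
        largest = max(amount for amount, _ in points)
        bucket_size = largest / max_points
        buckets = defaultdict(lambda: [0.0, 0.0])   # bucket -> [sum of amount x prob, sum of prob]
        for amount, prob in points:
            b = ceil(amount / bucket_size)          # bucket 1 holds amounts up to bucket_size, etc.
            buckets[b][0] += amount * prob
            buckets[b][1] += prob
        return sorted((ap / p, p) for ap, p in buckets.values())

    severity = [(10_000, 0.60), (25_000, 0.20), (50_000, 0.10),
                (75_000, 0.07), (100_000, 0.03)]
    print(compress(severity, 4))
    # Four points: 13,750 @ 80%, 50,000 @ 10%, 75,000 @ 7%, 100,000 @ 3%

With the five-point severity distribution above and a maximum of four points, the first two points merge into the 13,750 / 80% point from the example.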
Loss scenario calculations

The calculation of loss and probability for a claim scenario is fairly straightforward.

1-Claim       1-Claim       Second        Second        2-Claim       2-Claim
Loss Amount   Probability   Loss Amount   Probability   Loss Amount   Probability
13,750        80%           10,000        60%           23,750        48.0%
13,750        80%           25,000        20%           38,750        16.0%
13,750        80%           50,000        10%           63,750        8.0%
13,750        80%           75,000        7%            88,750        5.6%
13,750        80%           100,000       3%            113,750       2.4%
50,000        10%           10,000        60%           60,000        6.0%
50,000        10%           25,000        20%           75,000        2.0%
50,000        10%           50,000        10%           100,000       1.0%
50,000        10%           75,000        7%            125,000       0.7%
50,000        10%           100,000       3%            150,000       0.3%
75,000        7%            10,000        60%           85,000        4.2%
75,000        7%            25,000        20%           100,000       1.4%
75,000        7%            50,000        10%           125,000       0.7%
75,000        7%            75,000        7%            150,000       0.5%
75,000        7%            100,000       3%            175,000       0.2%
100,000       3%            10,000        60%           110,000       1.8%
100,000       3%            25,000        20%           125,000       0.6%
100,000       3%            50,000        10%           150,000       0.3%
100,000       3%            75,000        7%            175,000       0.2%
100,000       3%            100,000       3%            200,000       0.1%

Since we have four scenarios (after compression) in the event of one claim, and there are five possible values for the next loss, calculating the distribution of two claims results in 20 loss scenarios. For each scenario, the loss amounts are added together and the probabilities are multiplied together (since we assume claims are independent of each other), and the result is then sorted and compressed.

2-Claim Loss Amount        27,500   74,965   121,331   179,412
2-Claim Loss Probability   64.0%    28.2%    7.3%      0.5%

We've calculated each loss amount and the probability of each loss amount in the event of one-loss and two-loss scenarios, so the next step is to incorporate the probabilities of each claim scenario by looking to the frequency distribution.

Claim   Frequency     Loss      Severity      Loss
Count   Probability   Amount    Probability   Probability
0       70%           0         100.0%        70.00%
1       20%           13,750    80.0%         16.00%
1       20%           50,000    10.0%         2.00%
1       20%           75,000    7.0%          1.40%
1       20%           100,000   3.0%          0.60%
2       10%           27,500    64.0%         6.40%
2       10%           74,965    28.2%         2.82%
2       10%           121,331   7.3%          0.73%
2       10%           179,412   0.5%          0.05%

The Loss Probability is the product of the Frequency Probability and the Severity Probability, since we assume frequency and severity are independent. The aggregate loss distribution is then sorted and compressed.

Loss Amount   Loss Probability
4,286         92.40%
66,945        6.22%
111,701       1.33%
179,412       0.05%

The compressed aggregate loss distribution has an expected value of 9,700, which equals the product of the expected frequency (0.4) and the expected severity (24,250).

Generating the frequency distribution

Although it's common practice to assume one severity distribution for pricing purposes, specific frequency distributions need to be created for each risk when using the ALG, as exposure varies from risk to risk (and year to year). Using a Poisson distribution, the ALG creates a frequency distribution based on the Expected Frequency and the Probability Limit.

Expected Frequency   Probability Limit
5.00                 0.10%

A Probability Limit of 0.10% caps the number of claims in the frequency distribution at the first claim count whose cumulative distribution function (CDF) exceeds 1 - 0.10%. For example, a Poisson distribution with an expected value of 5.00 implies a CDF of 99.80% at 12 claims and 99.93% at 13 claims. Since 99.93% is the first value of the CDF to exceed 1 - 0.10% = 99.90%, the ALG will create a frequency distribution with a maximum of 13 claims. Lowering the Probability Limit increases the likelihood that higher claim count scenarios will be calculated. Since a Poisson distribution includes probabilities for claim counts above the limit imposed by the ALG, the probabilities are further adjusted so that 1) they add up to 100%, and 2) they yield a mean that matches the Expected Frequency.
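Here is a hedged Python sketch of that frequency-generation step. The truncation rule follows the description above; the final mean-matching adjustment (shifting a little probability from the zero-claim point to the top claim count) is one plausible scheme of my own choosing, since the exact rebalancing the ALG uses isn't spelled out here.

    from math import exp, factorial

    def poisson_pmf(k, lam):
        return exp(-lam) * lam**k / factorial(k)

    def frequency_distribution(expected_frequency, probability_limit):
        """Return [p0, p1, ..., pN]: a Poisson distribution truncated at the
        first claim count N whose CDF exceeds 1 - probability_limit, then
        adjusted to sum to 100% with a mean equal to expected_frequency."""
        lam = expected_frequency
        n, cdf = 0, poisson_pmf(0, lam)
        while cdf <= 1 - probability_limit:
            n += 1
            cdf += poisson_pmf(n, lam)
        # Renormalize so the truncated probabilities sum to 100%.
        probs = [poisson_pmf(k, lam) / cdf for k in range(n + 1)]
        # Truncation lowers the mean slightly; move just enough probability
        # from count 0 to count n to restore it (an assumed adjustment).
        mean = sum(k * p for k, p in enumerate(probs))
        delta = (lam - mean) / n
        probs[0] -= delta
        probs[n] += delta
        return probs

    probs = frequency_distribution(5.00, 0.001)
    print(len(probs) - 1)                           # 13 claims maximum
    print(sum(k * p for k, p in enumerate(probs)))  # mean of 5.0

With Expected Frequency = 5.00 and Probability Limit = 0.10%, this reproduces the 13-claim maximum from the example.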
Occurrence limit and maximum scenario points

The ALG also takes into account an Occurrence Limit and an Aggregate Limit when calculating loss scenarios.

Occurrence Limit   Maximum Scenario Points
80,000             4

The Maximum Scenario Points (MSP) selected limits the number of points in 1) the outcome of each claim count scenario, and 2) the final distribution. The higher the MSP, the more refined your distribution, and the longer the ALG will take to produce the aggregate loss distribution.

In order to compress the distribution, the algorithm first determines the largest loss amount in the scenario. For instance, in the one-claim scenario without an occurrence limit, the largest loss in the scenario is going to be the largest loss amount in the severity distribution. Using our simple severity distribution, we see that the maximum loss for a one-claim scenario is 100K. The algorithm then calculates the size of the loss buckets by dividing 100K by the MSP of 4, meaning the loss buckets are 25K wide.

Loss Amount   10,000   25,000   50,000   75,000   100,000
PDF           60%      20%      10%      7%       3%
Loss Bucket   1        1        2        3        4

Since the first two Loss Amounts are less than or equal to 25K, they are assigned to the first bucket, and each of the other points gets its own bucket. If the third point were 55K instead of 50K, it would be assigned to the third bucket, and the second bucket would be empty, so the resulting distribution would have only three points instead of four.

Loss Amount   10,000   25,000   55,000   75,000   100,000
Loss Bucket   1        1        3        3        4

In other words, the more evenly distributed the severity distribution, the more refined the compressed distributions. When we include an Occurrence Limit of 80K, the maximum loss amount for the one-claim scenario is 80K, and the bucket size becomes 20K = 80K / 4.

Limited Loss Amount   10,000   25,000   55,000   75,000   80,000
Limited Loss Bucket   1        2        3        4        4

Now the compressed distribution becomes a four-point distribution: only the last two points fall in the same 20K bucket, and each of the remaining three points gets its own bucket. So the lower the Occurrence Limit, the more refined the compressed distribution.

Aggregate limit

In the initial example, the 70% chance of zero loss (the chance of having no claims) was compressed into other scenarios, such that the smallest Loss Amount in the aggregate loss distribution is 4,286 with a probability of 92.4%. But the ALG actually preserves the probability of a loss amount of zero, excluding it from the compression calculations.

Loss Amount   Loss Probability
0             70.00%
22,597        25.80%
89,480        4.04%
159,375       0.16%

Since the MSP is four, the ALG preserves the probability of zero loss at each scenario and compresses the remaining distribution into three points.

Aggregate Limit
150,000

Likewise, if an Aggregate Limit is included, the ALG preserves the probability of the Loss Amount equal to the Aggregate Limit, so that its probability is not compressed with any other points.

Loss Amount   PDF
0             70.00%
23,295        26.10%
90,921        3.80%
150,000       0.10%

The resulting expected value is 9,685, which reveals that the impact of the Aggregate Limit using the parameters above would be a discount of 15, since the original expected value was 9,700.
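Putting the pieces together, the sketch below shows the scenario recursion with an occurrence limit and an aggregate limit (again in illustrative Python rather than the ALG's VBA). For brevity it omits the per-scenario compression the real tool applies between claim counts; the behaviors shown are that severity points are capped at the occurrence limit, scenario totals are capped at the aggregate limit, and capped amounts keep their own points rather than being averaged with others.

    from collections import defaultdict

    def aggregate_distribution(frequency, severity, occ_limit=None, agg_limit=None):
        """Convolve a frequency distribution {claim_count: prob} with a discrete
        severity distribution [(amount, prob), ...]. A sketch of the ALG's
        recursion; the real tool also compresses the n-claim distribution
        before computing the (n+1)-claim one."""
        if occ_limit is not None:
            capped = defaultdict(float)
            for amount, prob in severity:
                capped[min(amount, occ_limit)] += prob   # cap each loss at the occurrence limit
            severity = sorted(capped.items())
        max_claims = max(frequency)
        aggregate = defaultdict(float)
        scenario = {0.0: 1.0}                            # n-claim loss distribution, starting at n = 0
        for n in range(max_claims + 1):
            for amount, prob in scenario.items():        # weight by the chance of exactly n claims
                aggregate[amount] += prob * frequency.get(n, 0.0)
            if n == max_claims:
                break
            nxt = defaultdict(float)                     # build the (n+1)-claim distribution
            for amount, prob in scenario.items():
                for s_amount, s_prob in severity:
                    total = amount + s_amount
                    if agg_limit is not None:            # totals at the aggregate limit stay a distinct point
                        total = min(total, agg_limit)
                    nxt[total] += prob * s_prob
            scenario = dict(nxt)
        return sorted(aggregate.items())

    frequency = {0: 0.70, 1: 0.20, 2: 0.10}
    severity = [(10_000, 0.60), (25_000, 0.20), (50_000, 0.10),
                (75_000, 0.07), (100_000, 0.03)]
    dist = aggregate_distribution(frequency, severity)
    print(sum(a * p for a, p in dist))                   # 9,700 = 0.4 x 24,250

Without limits, the expected value matches the 9,700 calculated earlier; calling it with agg_limit=150_000 gives an expected value of 9,685, matching the discount of 15 shown above.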
How to use the Aggregate Loss Generator

The ALG is meant to be implemented behind the scenes, either directly in an Excel model or as a highly functional spec for IT to use as a reference. The VBA code that drives the model is not password-protected, and users are highly encouraged to examine the code and customize it to their purposes. Further potential enhancements include:

- Calculating the effects of deductibles and self-insured retentions. This could be achieved in VBA, or the severity distribution itself could be entered net of limits and retentions.
- Calculating the effect of retention caps. This calculation would likely take place in VBA, as it's not possible to calculate from the aggregate loss distribution.
- Implementing an alternative compression algorithm. The ALG does a good job of preserving the tails of aggregate loss distributions, but you may prefer distributions that are more "spread out," with more evenly distributed probabilities per bucket.

Discrete severity distributions

Loss modeling often makes use of continuous severity distributions (lognormal, Pareto, etc.) that are fully defined by a couple of parameters. Since these distributions imply an infinite number of loss amounts, the ALG, which calculates all possible loss scenarios, cannot make use of them.

Some would view having to use discrete severity distributions as a weakness, but there are some clear advantages. Continuous distributions can be altered by changing two or three parameters, but these adjustments may be insufficient to produce the distribution you want. On the other hand, you can always add another point to a discrete distribution and edit the loss amounts and probabilities, so you're more likely to arrive at a distribution you find acceptable, because you have an unlimited number of parameters with which to work. To put it another way, if you're happy with a lognormal distribution, it's not too difficult to turn it into a discrete distribution, and then you can edit the parameters to change the distribution in ways you couldn't change a lognormal distribution (see the sketch at the end of this section).

Discrete distributions are also very easy to understand, even for non-actuaries. If audit or underwriting takes an interest in the pricing assumptions of your model, they are more likely to understand a table of losses and probabilities than the parameters of a continuous distribution. And if aggregate loss generation were included in a rate filing, I'm betting that a table of losses and probabilities would have a much easier time getting approved than would values assigned to Greek letters.

Furthermore, actuaries might be tempted to calculate probabilities for very large losses using continuous distributions that were fit to losses in lower layers. But there's no reason to believe that the tail of a continuous distribution is appropriate for pricing large losses simply because it works well for smaller losses. Discrete distributions can be edited so that the tail fits expectations, which are, by their nature, explicitly stated.

More generally, it's arguably unreasonable to believe that real-life probability distributions conform to continuous distributions. Actuaries just might be spending too much time learning and applying curve-fitting techniques when they could instead simply bucket losses by size, use the claim counts as a proxy for probability, and make some explicit assumptions about missing data that was censored by retentions or truncated by limits.
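As an illustration of how a lognormal can be turned into a discrete distribution, here is a brief Python sketch. The loss points and parameters are hypothetical, and the discretization scheme (assigning each point the lognormal mass between the midpoints of its neighbors, with the tails folded into the end points) is just one simple choice among many.

    from math import erf, log, sqrt

    def lognormal_cdf(x, mu, sigma):
        """CDF of a lognormal with log-mean mu and log-standard-deviation sigma."""
        return 0.5 * (1 + erf((log(x) - mu) / (sigma * sqrt(2))))

    def discretize(points, mu, sigma):
        """Assign each loss point the lognormal probability mass between the
        midpoints of its neighbors; the end points absorb the tails."""
        mids = [(a + b) / 2 for a, b in zip(points, points[1:])]
        cuts = [0.0] + [lognormal_cdf(m, mu, sigma) for m in mids] + [1.0]
        return [(x, cuts[i + 1] - cuts[i]) for i, x in enumerate(points)]

    # Hypothetical parameters: a median loss of about 12,200 and sigma = 1.5.
    points = [10_000, 25_000, 50_000, 75_000, 100_000]
    for amount, prob in discretize(points, mu=log(12_200), sigma=1.5):
        print(f"{amount:>7,}  {prob:6.1%}")

From there, any point's loss amount or probability can be edited directly, which is exactly the flexibility a parametric lognormal doesn't offer.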