Nonparametric statistical analysis of ruin probability under conditions of “small” and “large” claims

Pier Luigi Conti and Esterina Masiello

Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università di Roma “La Sapienza”
pierluigi.conti@uniroma1.it, esterina.masiello@uniroma1.it

Summary. The aim of the present work is to develop a nonparametric statistical analysis of the ruin probability. We introduce an estimator of the ruin probability that can be applied in general, without any special assumption on the claim size distribution or on the parameter of the inter-arrival time distribution. For this estimator, we obtain consistency and weak convergence to a Gaussian process. A different approach is taken when the claims are modeled by heavy-tailed distributions: the idea is to consider the Pareto bound introduced by [Wil94] as a useful approximation of the ruin probability, and to make inference on this approximation.

Key words: Poisson risk model, probability of ruin, nonparametric estimation, asymptotics, heavy-tailed distribution.

1 Introduction

The interest of actuarial mathematics in ruin probabilities dates back to the work of Lundberg (1903). Since then, many contributions have appeared, the main problem being the difficulty of finding a general expression for the probability of ruin $\Psi$ (except for a few special cases).

We consider here the classical risk model, where claims occur according to a Poisson process. The claim sizes $X_1, X_2, \ldots$ are positive independent and identically distributed (i.i.d.) random variables (r.v.s), having common distribution function (d.f.) $F(x) = P(X_1 \le x)$, finite mean $\mu = E(X_1)$ and finite variance $\sigma^2 = Var(X_1)$. The inter-arrival times $T_1, T_2, \ldots$ are i.i.d. exponentially distributed, with finite mean $\mu_T = E(T_1) = 1/\lambda$ and d.f. $A(t) = P(T_1 \le t) = 1 - e^{-\lambda t}$. Moreover, the two sequences $\{X_i\}$ and $\{T_i\}$ are independent.
We further assume that the insurance company has an initial surplus $x > 0$ and that the premium income is linear in time with rate $c$. With no loss of generality, from now on we assume that $c = 1$. The function of interest is $\Psi(x)$, the probability that the company is ruined, that is, that the claims eventually exceed its available resources. If $\lambda\mu < 1$, it is possible to express $\Psi$ as a compound geometric tail probability by means of the so-called Pollaczek-Khinchine formula (see, e.g., [Fel71] or [Asm00] for more detailed discussions):

$$\Psi(x) = \sum_{k=1}^{\infty} (1-\rho)\,\rho^{k}\,\overline{F_I^{*k}}(x), \qquad x \ge 0, \tag{1}$$

where $\rho = \lambda\mu$, $F_I(x) = \frac{1}{\mu}\int_0^x (1 - F(y))\,dy$ is the integrated tail distribution, $F_I^{*k}$ denotes the $k$-fold convolution of the d.f. $F_I$, and $\overline{F_I^{*k}} = 1 - F_I^{*k}$ is its tail.

2 A nonparametric estimator for the ruin probability

In this section, we deal with the construction of a nonparametric estimator $\Psi_n(x)$ of $\Psi(x)$ based on a sample $T_1, \ldots, T_n$ of inter-arrival times and a sample $X_1, \ldots, X_n$ of corresponding claims. This problem has already been studied in [Pit94], in the special case of a known value of $\rho$. We generalize here her results to the more realistic case of an unknown value of $\rho$. Our only assumption is the stability of the model: $\rho < 1$. Note that when $\rho \ge 1$ the model is unstable and the ruin of the insurance company occurs with probability 1. In fact, when $\rho > 1$ the expected claim payments per unit time are greater than the insurer's premium income per unit time. On the other hand, when $\rho = 1$ we have $\lambda\mu = c$, but the insurance company is still almost surely ruined because of the variability of the claims.

The d.f.s of inter-arrival times and claims are both supposed unknown. More specifically, the inter-arrival time distribution is assumed to be exponential with unknown parameter $\lambda$. As far as the claim distribution is concerned, no specific parametric model is assumed.
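As a sanity check on the compound geometric representation (1), the following sketch evaluates it by Monte Carlo in a case where a closed form is standard: exponential claims with mean $\mu$. It takes the integrated tail distribution in its usual form $F_I(x) = \frac{1}{\mu}\int_0^x (1 - F(y))\,dy$, which for exponential claims is again exponential with mean $\mu$, so that $\Psi(x) = \rho\, e^{-(1-\rho)x/\mu}$. The function names, the parameter values and the exponential claim model are illustrative assumptions, not part of the paper.

```python
import math
import random

def ruin_probability_exponential(x, lam, mu):
    """Closed form rho * exp(-(1 - rho) * x / mu) for exponential claims, premium rate c = 1."""
    rho = lam * mu
    return rho * math.exp(-(1.0 - rho) * x / mu)

def ruin_probability_pk_mc(x, lam, mu, n_sim=100_000, seed=0):
    """Monte Carlo evaluation of the compound geometric tail for exponential claims.

    N is geometric with P(N = k) = (1 - rho) * rho**k, k = 0, 1, ...;
    the ruin probability is P(Y_1 + ... + Y_N > x) with Y_i drawn from F_I.
    """
    rng = random.Random(seed)
    rho = lam * mu
    ruined = 0
    for _ in range(n_sim):
        total = 0.0
        # Another term is added with probability rho each time (geometric count).
        while rng.random() < rho:
            # Assumption: exponential claims, so F_I is exponential with mean mu.
            total += rng.expovariate(1.0 / mu)
            if total > x:
                # Partial sums only increase, so the full sum exceeds x as well.
                ruined += 1
                break
    return ruined / n_sim

if __name__ == "__main__":
    print(ruin_probability_exponential(2.0, 0.5, 1.0))
    print(ruin_probability_pk_mc(2.0, 0.5, 1.0))
```

The two printed values should agree up to Monte Carlo error. When $F$ is unknown no such closed form is available, which is the situation the estimator of this section is designed for.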
Such a statistical problem has been considered by several authors, but the literature has mostly been concerned with asymptotic expansions; few results focus on statistical estimation of $\Psi$. Clearly, one could work out a parametric estimation procedure or use nonparametric methods. As remarked in [Pit94], the latter approach is particularly interesting in view of its applicability.

In our approach, the key tool for estimating the ruin probability is the Pollaczek-Khinchine formula. We first estimate $E(T) = \mu_T$ by its maximum likelihood estimator $\overline{T}_n = \frac{1}{n}\sum_{i=1}^{n} T_i$, and $E(X) = \mu$ by $\overline{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. The next step consists in estimating $\rho$ by the ratio $\hat\rho = \overline{X}_n / \overline{T}_n$. Finally, we replace the $k$-fold convolution of $F$ in (1) by the following estimator ([Fre86]):

$$F_n^{*k}(x) = \binom{n}{k}^{-1} \sum_{c} I(X_{i_1} + \cdots + X_{i_k} \le x),$$

where $\{i_1, i_2, \ldots, i_k\}$ is a subset of size $k$ of $\{1, 2, \ldots, n\}$ and $\sum_c$ is the sum over all $\binom{n}{k}$ different combinations of $\{i_1, i_2, \ldots, i_k\}$.

The quantities on the right-hand side of (1) have thus been estimated by the corresponding sample counterparts. It seems therefore natural to estimate $\Psi(x)$ by

$$\Psi_n(x) = \sum_{k=1}^{\infty} (1-\hat\rho)\,\hat\rho^{k}\,\overline{F_n^{*k}}(x).$$

The asymptotic behavior of the proposed estimator has been investigated. Our main results are summarized in the following theorem.

Theorem 1. Under the assumption $\rho < 1$, as $n \to \infty$, we have:

(i) $\sup_x |\Psi_n(x) - \Psi(x)| \xrightarrow{a.s.} 0$;

(ii) the sequence of stochastic processes $\{\sqrt{n}(\Psi_n(x) - \Psi(x));\ x \ge 0\}_{n \ge 1}$ converges weakly (in the Skorokhod topology on $D[0, \infty]$) to a Gaussian process $Z$ with mean function 0 and covariance kernel given by:

$$\begin{aligned}
E[Z(x)Z(y)] = {} & \sum_{k=1}^{\infty}\sum_{j=1}^{\infty} (1-\rho)^2 \rho^{k+j}\, k\, j\, \mathrm{Cov}\!\left(\overline{F^{*(k-1)}}(x - X_1),\ \overline{F^{*(j-1)}}(y - X_1)\right) \\
& + \sum_{k=1}^{\infty}\sum_{j=1}^{\infty} f_k'(\rho)\,\overline{F^{*k}}(x)\, f_j'(\rho)\,\overline{F^{*j}}(y) \left(\frac{1}{\mu_T^2}\,\sigma^2 + \frac{\mu^2}{\mu_T^4}\,\sigma_T^2\right) \\
& + 2\sum_{k=1}^{\infty}\sum_{j=1}^{\infty} (1-\rho)\rho^{k}\, \frac{1}{\mu_T}\, f_j'(\rho)\,\overline{F^{*j}}(y)\, \mathrm{Cov}\!\left(X_1,\ \overline{F^{*(k-1)}}(x - X_1)\right),
\end{aligned}$$

where $\sigma_T^2 = Var(T_1)$.

This result can be used to construct confidence bands for $\Psi(x)$, based on resampling techniques.

3 Ruin probability under “high claims”

Almost all analytical studies of the probability of ruin in risk business have been based on the assumption that the moment generating function (m.g.f.) of the claim size distribution is finite for some positive real argument. What happens to the ruin probability of an insurer who is incurring “large claims” modeled by distributions with heavy tails, which do not have exponential moments?

The estimator proposed in the previous section, which arises naturally for the ruin probability, is essentially based on the empirical distribution function $F_n$, and is therefore inaccurate in the tails. As a consequence, when dealing with heavy-tailed distributions, we are forced to resort to a different approach. The basic idea is to take an appropriate approximation of $\Psi(x)$ (“for $x$ large”) and to estimate this approximation.

The main result in this direction is the celebrated Lundberg inequality. Assume that the Cramér-Lundberg condition holds, i.e. there exists a positive constant $R$ (the so-called adjustment coefficient) such that:

$$\int_0^{\infty} e^{Rx}\,\overline{F}(x)\,dx < \infty,$$

where $\overline{F} = 1 - F$. Then, Lundberg's inequality states a bound which is exponential in the initial capital $x$:

$$\Psi(x) \le e^{-Rx} \qquad \forall x \ge 0. \tag{2}$$

The problem is that for many distributions used in practice, Lundberg's inequality is not available because the corresponding adjustment coefficient does not exist.
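To make the role of the adjustment coefficient concrete, the sketch below takes $R$ to be the positive root of $\lambda \int_0^\infty e^{Rx}(1 - F(x))\,dx = 1$. This is the usual defining equation when $c = 1$, and it is an assumption here, since the text above only states finiteness of the integral. The equation is solved by bisection for exponential claims with mean $\mu$, where the closed form $R = 1/\mu - \lambda$ is available for comparison. Function names and parameter values are hypothetical.

```python
import math

def lundberg_equation_exponential(r, lam, mu):
    """lam * integral_0^inf e^{r x} (1 - F(x)) dx - 1 for exponential claims with mean mu.

    The integral equals 1 / (1/mu - r) and is finite only for r < 1/mu.
    """
    return lam / (1.0 / mu - r) - 1.0

def adjustment_coefficient(lam, mu, tol=1e-12):
    """Bisection for the positive root R on (0, 1/mu); requires rho = lam * mu < 1."""
    if lam * mu >= 1.0:
        raise ValueError("stability condition rho < 1 is violated")
    lo, hi = 0.0, (1.0 / mu) * (1.0 - 1e-12)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The left-hand side is increasing in r and negative at r = 0.
        if lundberg_equation_exponential(mid, lam, mu) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lundberg_bound(x, r):
    """Exponential bound e^{-R x} of inequality (2)."""
    return math.exp(-r * x)

if __name__ == "__main__":
    lam, mu = 0.5, 1.0
    r = adjustment_coefficient(lam, mu)
    print(r, 1.0 / mu - lam)         # numerical root and closed form
    print(lundberg_bound(2.0, r))    # upper bound on the ruin probability at x = 2
```

For a Pareto tail $(\beta/x)^{\alpha}$ the integrand $e^{Rx}(\beta/x)^{\alpha}$ is unbounded for every $R > 0$, so no such root exists. This is the situation in which a bound that does not rely on exponential moments is needed.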
Refinements and generalizations of inequality (2) have been considered by several authors. A simple approach to the problem consists in constructing a bound for the tail of the claim size distribution which is based on the reciprocal of a polynomial of order $m$, when at least $m$ moments of the claim size distribution exist. Following this idea, Willmot ([Wil94]) obtains the so-called Pareto bound which, applied to the ruin probability $\Psi(x)$, becomes:

$$\Psi(x) \le (1 + kx)^{-m}, \qquad x \ge 0, \tag{3}$$

where $k > 0$ satisfies the relationship

$$\int_0^{\infty} (1 + kx)^{m}\,dF(x) = \frac{1}{\rho}$$

and $m > 0$ is the lowest upper bound for the number of moments possessed by the claim size distribution.

The moment-based bound introduced by [Wil94] can be considered as a useful approximation of $\Psi(x)$. We make inference on this approximation. The problem we study is the construction of nonparametric point and interval estimates for the bound $(1 + kx)^{-m}$ on the right-hand side of relation (3). Again, we do not assume any specific parametric model for the distribution of the claims, while the $T_i$'s are assumed to be i.i.d. exponential r.v.s with unknown parameter $\lambda$.

Let $T_1, \ldots, T_n$ and $X_1, \ldots, X_n$ be two samples of inter-arrival times and claims, respectively. Our idea is to estimate $F(x)$ by the corresponding empirical d.f. $F_n(x)$ and $\rho$ by $\hat\rho = \overline{X}_n / \overline{T}_n$, so that it is possible to define $k_n$ as the solution to the equation

$$\int_0^{\infty} (1 + kx)^{m}\,dF_n(x) - \frac{1}{\hat\rho} = 0.$$

It then seems natural to consider $(1 + k_n x)^{-m}$ as an estimate of the Pareto upper bound $(1 + kx)^{-m}$ on the right-hand side of relation (3). Our main results concern the asymptotic behavior of this estimator of the Pareto bound. They are stated in the following theorem.

Theorem 2. Under the assumption $\rho < 1$, as the sample size $n$ goes to infinity, the following results hold true.

(i) $(1 + k_n x)^{-m} \xrightarrow{a.s.} (1 + kx)^{-m}$ for all $x \ge 0$.
(ii) The sequence of r.v.s $\sqrt{n}\,((1 + k_n x)^{-m} - (1 + kx)^{-m})$ converges in law to a normal distribution with mean 0 and variance $\sigma^2 = m^2 (1 + kx)^{-2(m+1)} x^2 \sigma_k^2$.

(iii) Let $\bar\sigma_n^2 = m^2 (1 + k_n x)^{-2(m+1)} x^2 \hat\sigma_k^2$; then $\bar\sigma_n^2 \xrightarrow{a.s.} \sigma^2$.

(iv) Let $q_{\alpha/2}$ be the $\alpha/2$-th quantile of the standard normal distribution. A confidence interval for $(1 + kx)^{-m}$, with asymptotic size $(1 - \alpha)$, is given by

$$\left[(1 + k_n x)^{-m} - q_{\alpha/2}\,\bar\sigma_n/\sqrt{n},\ \ (1 + k_n x)^{-m} + q_{\alpha/2}\,\bar\sigma_n/\sqrt{n}\right].$$

The confidence interval for $(1 + kx)^{-m}$ obtained in the above theorem is based on the asymptotic normality of our estimator. The distribution of $(1 + k_n x)^{-m}$ is usually not symmetric around $(1 + kx)^{-m}$ (as supported by the simulation results reported in the sequel). Hence, we cannot expect confidence intervals that are symmetric around $(1 + k_n x)^{-m}$ to perform well. Bootstrap confidence intervals should be a good alternative to normal intervals because they are in general not symmetric around $(1 + kx)^{-m}$. Hence, the bootstrap distribution of $\sqrt{n}\,((1 + k_n x)^{-m} - (1 + kx)^{-m})$ should recover the asymmetry of the actual distribution.

To construct a bootstrap confidence interval, we first generate $m_n$ bootstrap claim samples $(X^*_{1,j}, \ldots, X^*_{n,j})$, $j = 1, \ldots, m_n$, and $m_n$ independent bootstrap inter-arrival time samples $(T^*_{1,j}, \ldots, T^*_{n,j})$, $j = 1, \ldots, m_n$. Let us denote by $k^*_{n,j}$ the root of the equation $G^*_{n,j}(k, m) = 0$, where $G_n(k, m) = \frac{1}{n}\sum_{i=1}^{n} (1 + kX_i)^m - 1/\hat\rho$ and $G^*_{n,j}$ is its analogue computed from the $j$-th pair of bootstrap samples, and define the $m_n$ quantities $V^*_{n,j} = \sqrt{n}\,((1 + k^*_{n,j} x)^{-m} - (1 + k_n x)^{-m})$, $j = 1, \ldots, m_n$.
We can compute the empirical distribution function $\hat R_n(x) = \frac{1}{m_n}\sum_{j=1}^{m_n} I_{(-\infty, x]}(V^*_{n,j})$ and approximate the $\alpha$-th quantile of the bootstrap distribution $P(V^*_{n,j} \le x)$ by

$$\hat R_n^{-1}(\alpha) = \inf\{x : \hat R_n(x) \ge \alpha\}.$$

Finally, the bootstrap confidence interval is given by

$$\left[(1 + k_n x)^{-m} - \hat R_n^{-1}(1 - \alpha/2)\,n^{-1/2},\ \ (1 + k_n x)^{-m} - \hat R_n^{-1}(\alpha/2)\,n^{-1/2}\right].$$

3.1 A simulation study

The present subsection is devoted to a simulation study performed in order to compare the coverage probabilities of bootstrap and asymptotic confidence intervals for $(1 + kx)^{-m}$. We suppose that the inter-arrival time distribution is exponential with parameter $\lambda$ and d.f. $A(t) = 1 - e^{-\lambda t}$. The claim size distribution is assumed to be Pareto with scale parameter $\beta > 0$, shape parameter $\alpha > 0$ and d.f. $F(t) = 1 - (\beta/t)^{\alpha}$ for $t \ge \beta$. The Pareto distribution seems to describe rather well the claim size distribution in insurance models with many small claims and few large ones (that is, with extreme values among the claims). Moreover, the moment generating function of such a distribution is not defined.

The values of the parameters considered in our study are shown in Table 1 (when $\rho = 0.2$), in Table 2 (when $\rho = 0.5$) and in Table 3 (when $\rho = 0.8$). Obviously, the choice of a particular value for the parameter $m$ depends on the moment assumptions that we want to make about $F$ (the values $m = 1, 1.5, 2, 2.5$ are covered in our study).

Table 1. Values of m, α, β, EX, λ, ET, k0, x1, x2 when ρ = 0.2.

m     1           1.5         2            2.5
α     2.1         3.1         4.1          5.1
β     10.48       27.1        45.37        64.31
EX    20          40          60           80
λ     0.01        0.005       0.0033       0.0025
ET    100         200         300          400
k0    0.2097208   0.0468165   0.01914081   0.01074617
x1    19.1        41.1        64.6         84.1
x2    419.3       456.4       426.6        458.9

For every choice of $(\alpha, \beta, \lambda)$, 500 pairs of independent samples of inter-arrival times and claims, each of size $n$, have been generated by Monte Carlo. The two sample sizes $n = 100$ and $n = 200$ have been considered. The values of the initial surplus $x$ have been chosen in such a way that either $(1 + kx)^{-m} = 0.2$ or $(1 + kx)^{-m} = 0.01$.
The corresponding surplus values have been denoted by $x_1$ and $x_2$, respectively. This particular choice comes from our interest in considering the tail of the distribution rather than the whole distribution.

Table 2. Values of m, α, β, EX, λ, ET, k0, x1, x2 when ρ = 0.5.

m     1            1.5          2             2.5
α     2.1          3.1          4.1           5.1
β     12.57        32.52        54.44         77.17
EX    24           48           72            96
λ     0.020833     0.010416     0.006944      0.005208
ET    48           96           144           192
k0    0.05026794   0.01125278   0.005722792   0.003173154
x1    79.6         171.0        216.0         284.8
x2    2235.1       1587.6       1625.7        1589.4

Table 3. Values of m, α, β, EX, λ, ET, k0, x1, x2 when ρ = 0.8.

m     1             1.5          2             2.5
α     2.1           3.1          4.1           5.1
β     15.71         40.64        68            96.47
EX    30            60           90            120
λ     0.026666      0.01333      0.00888       0.00666
ET    37.5          75           112.5         150
k0    0.009429673   0.00282884   0.001247634   0.000795483
x1    424.2         680.1        990.7         1136.0
x2    11276.2       8598.8       6662.9        6678.9

For every pair of samples of inter-arrival times and claims, we have computed the value $(1 + k_n x)^{-m}$. In this way, we have obtained the simulated distributions of $(1 + k_n x)^{-m}$ for different values of $\rho$, $n$, $m$ and $x$. In Tables 4 and 5, skewness (sk) and kurtosis (ku) measures for the simulated distributions of $(1 + k_n x)^{-m}$ are displayed.

Table 4. Values of skewness and kurtosis when x = x1.

n = 100
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   sk   4.98     0.5       0.3      0.4
      ku   64.7     3.8       3.0      3.5
0.5   sk   1.9      1.0       0.8      0.8
      ku   10.2     4.5       4.0      4.5
0.8   sk   2.2      2.1       2.1      2.0
      ku   9.3      8.3       8.0      7.4

n = 200
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   sk   1.6      0.5       0.3      0.4
      ku   10.3     3.4       3.1      3.2
0.5   sk   1.4      0.6       0.8      0.8
      ku   7.6      3.2       4.7      4.5
0.8   sk   2.5      2.0       2.3      1.6
      ku   11.2     8.0       12.0     6.7

It clearly appears that the distribution of $(1 + k_n x)^{-m}$ is generally asymmetric and far from normal. More specifically, the data suggest that the higher $\rho$ is, the more asymmetric the distribution of $(1 + k_n x)^{-m}$.
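The point estimate and the bootstrap interval of Section 3 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: $m$ is taken as given, the empirical equation for $k_n$ is solved by bisection, claims and inter-arrival times are resampled independently with replacement, and `n_boot` plays the role of $m_n$. When the estimated $\rho$ is at least 1, the sketch returns $k = 0$ (the trivial bound 1), a convention introduced here only to keep the code well defined. All function names are hypothetical, and the example parameters ($\alpha = 2.1$, $\beta = 10.48$, $\lambda = 0.01$, $x = 19.1$) are meant to mimic the $\rho = 0.2$, $m = 1$ setting.

```python
import math
import random

def solve_k(claims, rho_hat, m, tol=1e-10):
    """Root k >= 0 of (1/n) * sum((1 + k*X_i)**m) - 1/rho_hat = 0, by bisection.

    Convention of this sketch: if rho_hat >= 1 there is no positive root,
    so k = 0 is returned and the bound takes the trivial value 1.
    """
    if rho_hat >= 1.0:
        return 0.0
    n = len(claims)

    def g(k):
        return sum((1.0 + k * xi) ** m for xi in claims) / n - 1.0 / rho_hat

    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:          # g is increasing in k and g(0) < 0 when rho_hat < 1
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bound_estimate(claims, times, x, m):
    """Plug-in estimate (1 + k_n x)**(-m), with rho_hat = mean(claims) / mean(times)."""
    rho_hat = (sum(claims) / len(claims)) / (sum(times) / len(times))
    return (1.0 + solve_k(claims, rho_hat, m) * x) ** (-m)

def bootstrap_interval(claims, times, x, m, alpha=0.05, n_boot=500, seed=0):
    """Interval built from the quantiles of V* = sqrt(n) * (est* - est)."""
    rng = random.Random(seed)
    n = len(claims)
    est = bound_estimate(claims, times, x, m)
    v = []
    for _ in range(n_boot):
        # Claims and inter-arrival times are resampled independently, with replacement.
        xs = rng.choices(claims, k=n)
        ts = rng.choices(times, k=len(times))
        v.append(math.sqrt(n) * (bound_estimate(xs, ts, x, m) - est))
    v.sort()

    def quantile(p):
        # inf{t : empirical d.f. of V* at t >= p}: the ceil(p * n_boot)-th order statistic.
        return v[max(0, math.ceil(p * n_boot) - 1)]

    root_n = math.sqrt(n)
    lower = est - quantile(1.0 - alpha / 2.0) / root_n
    upper = est - quantile(alpha / 2.0) / root_n
    return est, lower, upper

def sample_pareto(rng, alpha, beta):
    """Inverse-transform draw from F(t) = 1 - (beta/t)**alpha, t >= beta."""
    return beta * (1.0 - rng.random()) ** (-1.0 / alpha)

if __name__ == "__main__":
    rng = random.Random(1)
    claims = [sample_pareto(rng, 2.1, 10.48) for _ in range(100)]
    times = [rng.expovariate(0.01) for _ in range(100)]
    print(bootstrap_interval(claims, times, x=19.1, m=1.0))
```

Repeating the last three lines over many independently generated pairs of samples, and counting how often the interval contains the true value of the bound, gives a coverage estimate of the kind reported in the tables.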
For each sample size ($n = 100$ and $n = 200$), after having simulated 500 data sets, we have computed the corresponding confidence intervals of nominal level 0.95 ($\alpha = 0.05$) and counted the number of cases in which the true value $(1 + kx)^{-m}$ was covered by the confidence interval. The coverage probabilities for asymptotic and bootstrap confidence intervals (ca and cb, respectively) with nominal level 0.95, obtained on the basis of the simulated samples, are shown in Table 6 for $n = 100$ and $n = 200$ when $x = x_1$, and in Table 7 for $n = 100$ and $n = 200$ when $x = x_2$.

It turns out that the level of the bootstrap confidence intervals, as well as the level of the asymptotic confidence intervals, becomes considerably smaller than $1 - \alpha$ when $\rho$ increases. The same “bad” behavior occurs with $n = 200$ instead of $n = 100$, even if changing from sample size $n = 100$ to $n = 200$ generally increases the coverage probability and improves the performance of the bootstrap confidence intervals: the level 95% confidence intervals have an actual coverage probability between 69% and 97% when $n = 100$ and between 78% and 97% when $n = 200$.

Table 5. Values of skewness and kurtosis when x = x2.

n = 100
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   sk   3.1      0.9       0.8      0.8
      ku   25.5     4.7       4.2      3.9
0.5   sk   1.7      1.9       2.5      3.9
      ku   7.6      10.6      15.9     37.8
0.8   sk   8.9      5.9       9.2      9.4
      ku   87.0     47.1      103.6    126.4

n = 200
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   sk   1.2      0.5       0.3      0.7
      ku   6.5      3.2       2.8      3.5
0.5   sk   16.7     0.7       1.5      1.5
      ku   314.4    3.8       6.8      7.4
0.8   sk   13.8     10.1      10.8     8.3
      ku   235.5    139.4     171.3    88.7

Table 6. Coverage probabilities ca and cb when x = x1.

n = 100
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   ca   0.972    0.98      0.972    0.972
      cb   0.944    0.962     0.97     0.964
0.5   ca   0.944    0.97      0.97     0.944
      cb   0.9      0.944     0.948    0.914
0.8   ca   0.866    0.902     0.92     0.93
      cb   0.722    0.77      0.758    0.794

n = 200
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   ca   0.98     0.984     0.974    0.974
      cb   0.968    0.976     0.97     0.976
0.5   ca   0.964    0.964     0.952    0.97
      cb   0.918    0.942     0.93     0.958
0.8   ca   0.928    0.914     0.914    0.9
      cb   0.826    0.834     0.832    0.834

Table 7. Coverage probabilities ca and cb when x = x2.

n = 100
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   ca   0.958    0.98      0.974    0.966
      cb   0.918    0.96      0.944    0.946
0.5   ca   0.922    0.946     0.96     0.942
      cb   0.862    0.892     0.902    0.874
0.8   ca   0.822    0.878     0.876    0.872
      cb   0.694    0.746     0.746    0.748

n = 200
ρ          m = 1    m = 1.5   m = 2    m = 2.5
0.2   ca   0.972    0.982     0.978    0.97
      cb   0.948    0.972     0.962    0.948
0.5   ca   0.95     0.956     0.964    0.948
      cb   0.888    0.924     0.936    0.916
0.8   ca   0.878    0.904     0.878    0.908
      cb   0.782    0.79      0.792    0.794

On the other hand, 500 Monte Carlo trials is a rather small number and does not give precise information on the actual level of our confidence sets, but a larger-scale Monte Carlo study with 1000 or more trials is computationally very time-consuming. Also observe that the cycle of replications is “double”: we need replications to compute the coverage probability and, within each of them, bootstrap replications to compute the bootstrap quantiles of the bootstrap confidence intervals.

References

[Asm00] Asmussen, S.: Ruin probabilities. World Scientific, Singapore (2000)
[Bil68] Billingsley, P.: Convergence of probability measures. John Wiley, New York (1968)
[EKM97] Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling extremal events for insurance and finance. Springer-Verlag, Berlin (1997)
[Fel71] Feller, W.: An introduction to probability theory and its applications II. Wiley, New York (1971)
[Fre86] Frees, E.W.: Nonparametric renewal function estimation. The Annals of Statistics, 14, 1366–1378 (1986)
[HOS95] Harel, M., O'Cinneide, C.A., Schneider, H.: Asymptotics of the sample renewal function. Journal of Mathematical Analysis and Applications, 189, 240–255 (1995)
[Lun03] Lundberg, F.: Approximerad Framställning av Sannolikhetsfunktionen. Återförsäkring av Kollektivrisker, II, Almqvist & Wiksell, Uppsala (1903)
[Pit94] Pitts, S.M.: Nonparametric estimation of compound distributions with applications in insurance. Ann. Inst. Statist. Math., 46, No. 3, 537–555 (1994)
[VW96] Van der Vaart, A.W., Wellner, J.A.: Weak convergence and empirical processes. Springer-Verlag (1996)
[Wil94] Willmot, G.E.: Refinements and distributional generalizations of Lundberg's inequality. Insurance: Math. Econom., 15, 49–63 (1994)
[WL00] Willmot, G.E., Lin, X.S.: Lundberg approximations for compound distributions with insurance applications. Springer-Verlag, Berlin (2000)