c Applied Mathematics E-Notes, 9(2009), 300-306 Available free at mirror sites of http://www.math.nthu.edu.tw/∼amen/ ISSN 1607-2510 A Tail Bound For Sums Of Independent Random Variables And Application To The Pareto Distribution∗ Christophe Chesneau† Received 11 December 2008 Abstract We prove a bound of the tail probability for a sum of n independent random variables. It can be applied under mild assumptions; the variables are not assumed to be almost surely absolutely bounded, or admit finite moments of all orders. In some cases, it is better than the bound obtained via the Fuk-Nagaev inequality. To illustrate this result, we investigate the bound of the tail probability for a sum of n weighted i.i.d. random variables having the symmetric Pareto distribution. 1 Introduction ∗ Let (Yi )i∈N∗ be a sequence of independent random variables. For any Pnn ∈ N , we wish to determine the smallest sequence of functions pn (t) such that P ( i=1 Yi ≥ t) ≤ pn (t), t ∈ [0, ∞[. This problem is well-known; numerous results exist. The most famous of them is the Markov inequality. Under mild assumptions on the moments of the Xi ’s, it gives a polynomial bound pn (t). Under the same assumptions, this bound can be improved by the Fuk-Nagaev inequality (see [2]). If the Xi ’s are almost surely absolutely bounded, or admit finite moments of all orders (and these moments satisfy some inequalities), the Bernstein inequalities provide better results. See [6] and [7] for further details. In this note, we present a new bound pn (t). It can be applied under mild assumptions on the Xi ’s; only knowledge of the order of a finite moment is required. The main interest of the proposed inequality is that it can be applied when the ‘Bernstein conditions’ are not satisfied, and can give better results than the Fuk-Nagaev inequality (and the Markov inequality). The tail probability for a sum of n weighted i.i.d. random variables having the symmetric Pareto distribution is studied. This is particularly interesting because the exact expression of the distribution of such a sum is really difficult to identify (see [8]). Moreover, there are some applications in economics, actuarial science, survival analysis and queuing networks. ∗ Mathematics Subject Classifications: 60E15 Mathématiques Nicolas Oresme, Université de Caen Basse-Normandie, Campus II, Science 3, 14032 Caen, France † Laboratoire de 300 301 C. Chesneau The note is organized as follows. Section 2 presents a general tail bound. An application of this bound to the Pareto distribution can be found in Section 3. Section 4 provides a comparative study of the considered inequality and the Fuk-Nagaev inequality. 2 A General Tail Bound Theorem 1 below presents a bound of the tail probability for a sum of n independent random variables. As for the Fuk-Nagaev inequality (and the Markov inequality), it requires knowledge only of the order of a finite moment. THEOREM 1. Let (Yi )i∈N∗ be a sequence of independent random variables. We suppose that − for any n ∈ N∗ and any i ∈ {1, . . . , n}, E(Yi ) = 0, − there exists a real number p ≥ 2 such that, for any n ∈ N∗ and any i ∈ {1, . . . , n}, E(|Yi |p ) < ∞. Then, for any t > 0 and any n ∈ N∗ , we have ! n X t2 p/2 P Yi ≥ t ≤ Cp t−p max rn,p(t), (rn,2 (t)) + exp − , 16bn (1) i=1 where, for any u ∈ {2, p}, rn,u (t) = Pn u 2 E |Y | 1 3b n i i=1 i=1 E Yi , {|Yi |≥ } , bn = Pn t Cp = 22p cp , and cp refers to the Rosenthal inequality (see Lemma 1 below). PROOF. Let n ∈ N∗ . For any t > 0, we have ! ! n n X X P Yi ≥ t = P (Yi − E (Yi )) ≥ t ≤ U + V, i=1 where U =P i=1 n X i=1 and V =P n X i=1 t Yi 1{|Yi |≥ 3bn } − E Yi 1{|Yi|≥ 3bn } ≥ t t 2 t Yi 1{|Yi|< 3bn } − E Yi 1{|Yi|< 3bn } ≥ t t 2 ! ! . Let us bound U and V , in turn. The upper bound for U . The Markov inequality yields ! n X p p −p U ≤2 t E Yi 1{|Yi|≥ 3bn } − E Yi 1{|Yi |≥ 3bn } . t t i=1 Now, let us introduce the Rosenthal inequality (see [9]). (2) 302 Tail Bound for Sums of Independent Random Variables LEMMA 1 (Rosenthal’s inequality). Let p ≥ 2 and (Xi )i∈N∗ be a sequence of independent random variables such that, for any n ∈ N∗ and any i ∈ {1, . . . , n}, E(Xi ) = 0 and E(|Xi |p ) < ∞. Then we have p ! !p/2 n n n X X X , E Xi ≤ cp max E (|Xi |p ) , E Xi2 i=1 i=1 i=1 where, for any τ > p/2, cp = 2 max τ p , pτ p/2 eτ R∞ 0 xp/2−1 (1 − x)−τ dx . Under mild assumptions on the Xi ’s, the constant cp of Lemma 1 can be improved. We refer to [1], [4] and [10]. For any i ∈ {1, . . . , n}, set Zi = Yi 1{|Yi |≥ 3bn } − E Yi 1{|Yi |≥ 3bn } . Since E(Zi ) = 0 t t and E (|Zi |p ) ≤ 2p E |Yi |p 1{|Yi |≥ 3bn } t independent variables (Zi )i∈N∗ gives ≤ 2p E (|Yi |p ) < ∞, Lemma 1 applied to the p ! !p/2 n n n X X X . E ≤ cp max Zi E (|Zi |p ) , E Zi2 i=1 i=1 (3) i=1 It follows from (2) and (3) that U ≤ p −p 2 t n X cp max p E (|Zi | ) , i=1 ≤ = 22pt−p cp max n X E i=1 n X i=1 Zi2 !! n X E Yi2 1{|Yi |≥ 3bn } E |Yi |p 1{|Yi|≥ 3bn } , t t i=1 p/2 Cpt−p max rn,p (t), (rn,2(t)) , !p/2 (4) where Cp = 22p cp . The upper bound for V . Let us present one of the Bernstein inequalities. See, for instance, [6]. LEMMA 2 (Bernstein’s inequality). Let (Xi )i∈N∗ be a sequence of independent random variables such that, for any n ∈ N∗ and any i ∈ {1, . . . , n}, E(Xi ) = 0 and |Xi | ≤ M < ∞. Then, for any λ > 0 and any n ∈ N∗ , we have n X P i=1 where d2n = Pn i=1 ! Xi ≥ λ λ2 ≤ exp − 2(d2n + λM 3 ) ! , E(Xi2 ). For any i ∈ {1, . . . , n}, set Zi = Yi 1{|Yi |< 3bn } − E Yi 1{|Yi |< 3bn } . Since E(Zi ) = 0 t t 6bn and |Zi | ≤ |Yi |1{|Yi|< 3bn } + E |Yi |1{|Yi|< 3bn } ≤ t , Lemma 2 applied with the t t 303 C. Chesneau independent variables (Zi )i∈N∗ and the parameters λ = 2t and M = 6btn , gives 2 t V ≤ exp − P . n t 6bn 8 V Y 1 3bn + i i=1 6 t {|Yi|< t } Pn Pn Since i=1 V Yi 1{|Yi |< 3bn } ≤ i=1 E Yi2 = bn , it comes t t2 V ≤ exp − . 16bn (5) Putting (4) and (5) together, we obtain the inequality ! n X t2 p/2 −p . P Yi ≥ t ≤ U + V ≤ Cp t max rn,p(t), (rn,2 (t)) + exp − 16bn i=1 Theorem 1 is proved. Theorem 1 can be applied for a wide class of random variables. However, if the variables are almost surely absolutely bounded, or have finite moments of all orders satisfying some inequalities, the Bernstein inequalities can give more optimal results than (1). When it is hard or not possible to prove that these conditions are satisfied, Theorem 1 becomes of interest. This is illustrated in the example below and in Section 3 for the symmetric Pareto distribution. Other examples can be studied in a similar fashion. EXAMPLE. Let (Xi )i∈N∗ be i.i.d. random variables such that, for any x > 0, γ P(|X1 | ≥ x) ≤ ce−x , γ > 0, c > 0. Taking t = tn = (κ16E(X12 )n log n)1/2 with κ > 0, the inequality (1) implies the existence of a constant Fp > 0 such that, for n and p large enough, ! 1/2 n X 3bn t2n ) + exp − P Xi ≥ tn ≤ Fp t−p n P(|X | ≥ 1 n tn 16bn i=1 0 2 ) 1/2 3 nE(X1 − 12 @ (16κ log n)1/2 ( ) ≤ n−p/2+1 e We used the inequality: rn,u (t) ≤ n E |X1 |2u 3 1/2 1γ A + n−κ ≤ 2n−κ . P(|X1 | ≥ 1/2 3bn tn ) , u ∈ {1, 2}. Application to the Pareto Distribution Proposition 1 below investigates the bound of the tail probability for a sum of n weighted i.i.d. random variables having the symmetric Pareto distribution. 304 Tail Bound for Sums of Independent Random Variables PROPOSITION 1. Let s > 2 and (Xi )i∈N∗ be i.i.d. random variables with the probability density function ( 2−1 s|x|−s−1 , if |x| ≥ 1, f(x) = (6) 0 otherwise. Pn Let (ai )i∈N∗ be a sequence ofnonzero real numbers such that i=1 |ai |s < ∞. Then, P 1/s n for any n ∈ N∗ , any t ∈ 0, 3b with ρn = ( ni=1 |ai |s ) and any p ∈ max( s2 , 2), s , ρn we have ! n n X X t2 −2p+s p−s s P ai Xi ≥ t ≤ Kp t bn |ai | + exp − , (7) 16bn i=1 i=1 p/2 Pn s s s 2 p−s where bn = s−2 a , K = 3 max , 22p cp , p i=1 i s−p s−2 1 + 2p/2 π −1/2 Γ p + 1 , 2 < p < 4, 2 cp = E (|θ − θ |p ) , p ≥ 4, 1 2 R∞ Γ(a) = 0 xa−1 e−x dx, a > 1, and θ1 , θ2 are independent Poisson random variables with parameter 2−1 . PROOF. Let n ∈ N∗ . Set, for any i ∈ {1, . . . , n}, Yi = ai Xi . Clearly, (Yi )i∈N are independent random variablessuch that, for any i ∈ {1, . . . , n}, E(Yi ) = ai E(Xi ) = 0 and, for any p ∈ max( s2 , 2), s , E(|Yi |p ) ≤ supi=1,...,n |ai P |pE(|Xi |p ) < ∞. WeP are now n n s 2 2 in the position to apply Theorem 1. We have bn = E(Y ) = i i=1 i=1 ai . s−2 s For any u ∈ {2, p} and any p ∈ max( 2 , 2), s , let us investigate the bound for Pn P rn,u (t) = i=1 E |Yi |u 1{|Yi |≥ 3bn } = ni=1 |ai |u E |Xi |u 1n|X |≥ 3bn o . Recall that t i |ai |t Pn 1/s 3bn s n ρn = ( i=1 |ai | ) and σn = supi=1,...,n |ai |. Since t ∈ 0, ρn ⊆ 0, 3b σn , for any u−s R∞ 3bn s i ∈ {1, . . . , n}, we have E |Xi |u 1n|X |≥ 3bn o = s 3bn xu−s−1dx = s−u . |ai |t i Therefore, rn,u (t) = s s−u 3bn t and p/2 max rn,p(t), (rn,2 (t)) where Rp = max s s−p , s s−2 ≤ Rp p/2 |ai |t |ai |t 3bn t p u−s X n |ai |s i=1 3bn max tρn −s , 3bn tρn −s !p/2 n . Since t ∈ 0, 3b and p > 2, we have ρn −s −s !p/2 −s 3bn 3bn = 3bn , . max tρn tρn tρn , 305 C. Chesneau Hence, p−s X n 3bn p/2 |ai |s . max rn,p(t), (rn,2 (t)) ≤ Rp t (8) i=1 Theorem 1 and (8) imply that P n X ai Xi ≥ t i=1 s s−2 where bn = ! ≤ Kp t−2p+s bp−s n n X t2 |ai |s + exp − , 16bn p/2 i=1 Pn 2 p−s max i=1 ai , Kp = 3 s s−p , s s−2 Cp , and Cp = 22p cp . Using the optimal form of the Rosenthal constant cp for the symmetric random variables (see [1], [4] and [10]), we complete the proof of Proposition 1. For recent asymptotic results on the approximation of the tail probability of a sum of n i.i.d. random variables having the symmetric Pareto distribution, we refer to [3]. Section 4 below compares the precision between (7) and the bound obtained via the Fuk-Nagaev inequality. 4 Comparison with the Fuk-Nagaev Inequality For the sake of clarity, recall the Fuk-Nagaev inequality. The considered version can be found in [5]. LEMMA 3. (Fuk-Nagaev inequality) Let p ≥ 2 and (Yi )i∈N∗ be a sequence of independent random variables such that, for any n ∈ N∗ and any i ∈ {1, . . . , n}, E(Yi ) = 0 and E(Yi2 ) < ∞. Then, for any t > 0, we have n X P Yi ≥ t i=1 ≤ n X ! −p P(Yi ≥ ηt) + (ηt) i=1 where dn = Pn i=1 n X E i=1 E(Yi2 ), η = p , p+2 and c∗p = Yip 1{0≤Yi≤ηt} t2 + exp − ∗ cp d n , (9) ep (p+2)2 . 2 In some cases, (7) can give better results than the Fuk-Nagaev inequality. For instance, consider the symmetric Pareto distribution (i.e. (Xi )i∈N∗ are i.i.d. with the probability density function (6) with s > 2). Suppose that n is large. For any p ∈ max( 2s , 2), s , if we take t = tn = 23/2 (sn log n)1/2 (∈ 0, 3n1−1/s ), then we can balance the two terms of the bound in (7); there exists a constant Qp > 0 such that P n X i=1 Xi ≥ tn ! ≤ Qp n1−s/2(log n)−p+s/2 + n1−s/2 ≤ 2n1−s/2. (10) 306 Tail Bound for Sums of Independent Random Variables For the same tn , the Fuk-Nagaev inequality (9) implies the existence of a constant Rp > 0 such that, for any p > 0, ! n X 16 ∗ (1−s/2) P Xi ≥ tn ≤ Rp n1−p/2 (log n)−p/2 + n cp i=1 ≤ p 16 ∗ (1−s/2) 2 max n1−p/2 , n cp . (11) 2 Since c∗p = e (p+2) > 8e2 > 16, for n large enough, the rate of convergence in (10) 2 is faster than the one in (11). Therefore, in this case, (7) gives a better result than the Fuk-Nagaev inequality. This superiority can be extended to tn = κ(n log n)1/2 for 0 < κ < 23/2 s1/2 . Acknowledgment. The author wishes to thank the reviewer for his/her valuable comments and suggestions. References [1] T. Figiel, F. Hitczenko, W. B. Johnson, G. Schechtman and J. Zinn, Extremal properties of Rademacher functions with applications to the Khintchine and Rosenthal inequalities, Trans. Amer. Math. Soc., 349(3)(1997), 997–1027. [2] D. H. Fuk and S.V. Nagaev, Probability inequalities for sums of independent random variables, Theor. Probab. Appl., 16(1971), 643–660. [3] M. Goovaerts, R. Kaas, R. Laeven, Q. Tang and R. Vernic, The Tail Probability of Discounted Sums of Pareto-like Losses in Insurance, Scandinavian Actuarial Journal, 6(2005), 446–461. [4] R. Ibragimov and Sh. Sharakhmetov, On an exact constant for the Rosenthal inequality, Theory Probab. Appl., 42(1997), 294–302. [5] S. V. Nagaev, Large deviations of sums of independent random variables, Ann. Probab., 7(5)(1979), 745–789. [6] V. V. Petrov, Limit Theorems of Probability Theory, Clarendon Press, Oxford, 1995. [7] D. Pollard, Convergence of Stochastic Processes, Springer, New York, 1984. [8] C. M. Ramsay, The distribution of sums of certain i.i.d. Pareto variates, Commun. Stat., Theory Methods. 35(1-3) (2006), 395–405. [9] H. P. Rosenthal, On the subspaces of Lp (p ≥ 2) spanned by sequences of independent random variables, Israel Journal of Mathematics, 8(1970), 273–303. [10] S. A. Utev, Extremal problems in moment inequalities (in Russian), Limit Theorems of Probability Theory, ed. A. A. Borovkov, Trudy Inst. Mat. 5, Nauka Sibirsk. Otdel., 1985, 56–75.