Computing confidence bounds for the mean of
a Lévy-stable distribution
Djamel Meraghni and Abdelhakim Necir
Laboratory of Applied Mathematics, University Med. Khider Biskra, PO Box 145
RP, 07000, Biskra, Algeria, dmeraghni@yahoo.fr, necirabdelhakim@yahoo.fr
Summary. When the characteristic exponent of a Lévy-stable distribution is between 1 and 2, its mean exists and is equal to the location parameter. In this paper,
we use Peng’s estimator [P01] to construct confidence intervals for the mean when
the distribution is symmetric.
Key words: Lévy-Stable Distribution; Hill’s Estimator; Peng’s Estimator; Asymptotic Normality; Confidence Bounds; Extreme Values.
1 Introduction
A random variable (r.v.) X is said to have a stable distribution if and only if, for any
integer n ≥ 1 and for any sequence Y1 , Y2 , ..., Yn of independent r.v.’s identically
distributed as X, there exist two real numbers an > 0 and bn such that
$$\frac{(Y_1 + Y_2 + \cdots + Y_n) - b_n}{a_n} \stackrel{d}{=} X,$$
where $\stackrel{d}{=}$ denotes equality in distribution. It is proved in [F71] that there exists a real constant $0 < \alpha \le 2$ such that $a_n = n^{1/\alpha}$.
The stable distribution, introduced in 1924 by Paul Lévy [L25], is characterized
by four parameters: 0 < α ≤ 2 (characteristic exponent or stability index), −1 ≤
β ≤ +1 (skewness parameter), σ > 0 and −∞ < µ < +∞ (scale and position
parameters). It is also called Lévy-stable or α−stable. A random variable of stable
distribution is denoted by X ∼ Sα(σ, β, µ). When α = 2, the stable distribution coincides with the well-known Gaussian one, and when α < 2 the variance is infinite (in general, when α < 2, the kth absolute moment of a stable variable is finite if and only if k < α) and the tails are asymptotically equivalent to those of a Pareto distribution, i.e. they exhibit a power-law behavior. In fact, if X ∼ Sα(σ, β, µ), then it is shown in [ST94] that as x → ∞
$$x^\alpha P(X > x) \to C_\alpha \frac{1+\beta}{2}\,\sigma^\alpha \quad \text{and} \quad x^\alpha P(X < -x) \to C_\alpha \frac{1-\beta}{2}\,\sigma^\alpha, \qquad (1.1)$$
where
$$C_\alpha := \left(\int_0^\infty x^{-\alpha}\sin x \, dx\right)^{-1} = \frac{2}{\pi}\,\Gamma(\alpha)\sin\frac{\pi\alpha}{2},$$
with Γ being the well-known gamma function. For a complete overview of the stable distribution, we refer the reader to [ST94].
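As a quick numerical sanity check of this closed form (our own illustration, not part of the original paper), one can compare the reciprocal of the truncated defining integral with (2/π)Γ(α) sin(πα/2) in R, for instance with α = 1.5:

## Numerical check of C_alpha = (2/pi) * Gamma(alpha) * sin(pi * alpha / 2).
## The integrand x^(-alpha) * sin(x) is absolutely integrable on (0, Inf) for
## 1 < alpha < 2; truncating the tail at 1e4 costs an error of order 1e4^(-alpha).
alpha <- 1.5
I <- integrate(function(x) x^(-alpha) * sin(x),
               lower = 0, upper = 1e4, subdivisions = 10000L)$value
c(numerical = 1 / I,
  closed.form = (2 / pi) * gamma(alpha) * sin(pi * alpha / 2))
## both entries should be close to 0.3989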
If we denote by F and G the respective distribution functions (d.f.’s) of X ∼ Sα(σ, β, µ) and Z = |X|, then relations (1.1) imply that the distribution tails satisfy, for x > 0 (see, for instance, [GH87]),
$$\lim_{t\to\infty}\frac{1 - G(tx)}{1 - G(t)} = x^{-\alpha} \quad \text{(regular variation condition)},$$
and as $t \to \infty$
$$\frac{1 - F(t)}{1 - G(t)} \to \frac{1+\beta}{2} \quad \text{and} \quad \frac{F(-t)}{1 - G(t)} \to \frac{1-\beta}{2} \quad \text{(tail balancing condition)}.$$
The characteristic exponent α is the main parameter: it governs the behavior of the distribution tails (the smaller α, the heavier the tails). Many estimators of α have been proposed via the extreme value approach. The most famous, though not necessarily the best, of these estimators is the one defined by [H75] as follows:
$$\hat\alpha_n = \hat\alpha_n(k) := \left(\frac{1}{k}\sum_{i=1}^{k} \log Z_{n-i+1,n} - \log Z_{n-k,n}\right)^{-1}, \qquad (1.2)$$
where Z1,n ≤ ... ≤ Zn,n are the order statistics pertaining to a sample Z1 , ..., Zn ,
(n ≥ 1) from the r.v. Z and k = k (n) is an integer sequence such that k → ∞ and
k/n → 0 (as n → ∞).
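For concreteness, Hill’s estimator (1.2) can be computed with a few lines of R; the sketch below is our own illustration (not code from [H75]), where z is a sample from Z = |X| and k the chosen number of upper order statistics:

## Hill's estimator (1.2); a minimal sketch assuming 1 <= k < length(z)
## and strictly positive observations z.
hill <- function(z, k) {
  zs <- sort(z)                                         # Z_{1,n} <= ... <= Z_{n,n}
  n  <- length(zs)
  excesses <- log(zs[(n - k + 1):n]) - log(zs[n - k])   # log-excesses over Z_{n-k,n}
  1 / mean(excesses)                                    # reciprocal of their mean
}
## example: hill(abs(x), k = 100) estimates alpha from a sample x of X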
It is essentially for these tail properties that the α-stable distribution is often preferred to the normal distribution when modelling heavy-tailed data such as financial returns.
Finally, it is worth mentioning that the simulations, computations and graphs were carried out with the R statistical software [IG96].
2 Peng’s estimator
When 1 < α < 2, the mean of X exists and is equal to the location parameter µ
but the variance is infinite, so the sample mean $\bar X$ (the natural estimator of µ) is not asymptotically normal. Using extreme value theory, [P01] proposed a consistent estimator $\hat\mu_n$ of µ and proved its asymptotic normality:
$$\hat\mu_n = \hat\mu_n(k) := \hat\mu_n^{(1)} + \hat\mu_n^{(2)} + \hat\mu_n^{(3)}, \qquad (2.1)$$
where
$$\hat\mu_n^{(2)} = \hat\mu_n^{(2)}(k) := \frac{1}{n}\sum_{i=k+1}^{n-k} X_{i,n} \quad \text{(trimmed mean)}, \qquad (2.2)$$
$$\hat\mu_n^{(1)} = \hat\mu_n^{(1)}(k) := \frac{k}{n}\, X_{k,n}\,\frac{\hat\alpha_n^{(1)}}{\hat\alpha_n^{(1)} - 1}, \qquad \hat\mu_n^{(3)} = \hat\mu_n^{(3)}(k) := \frac{k}{n}\, X_{n-k+1,n}\,\frac{\hat\alpha_n^{(3)}}{\hat\alpha_n^{(3)} - 1}, \qquad (2.3)$$
with
$$\hat\alpha_n^{(1)} = \hat\alpha_n^{(1)}(k) := \left(\frac{1}{k}\sum_{i=1}^{k} \log(-X_{i,n}) - \log(-X_{k,n})\right)^{-1},$$
$$\hat\alpha_n^{(3)} = \hat\alpha_n^{(3)}(k) := \left(\frac{1}{k}\sum_{i=1}^{k} \log X_{n-i+1,n} - \log X_{n-k,n}\right)^{-1},$$
where $X_{1,n} \le \dots \le X_{n,n}$ are the order statistics pertaining to a sample $X_1, \dots, X_n$, (n ≥ 1) from X. Notice that $\hat\alpha_n^{(1)}$ and $\hat\alpha_n^{(3)}$ are also consistent estimators of α (see [M82]); their almost sure convergence is established in [N06a]. The strong limiting behavior of $\hat\mu_n$ was studied in [N06b] when constructing a nonparametric sequential test with power 1 for µ.
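To fix ideas, here is a minimal R sketch of Peng’s estimator (2.1)–(2.3); it is our own illustration of the formulas above (not code from [P01]) and assumes k < n/2 and a sample whose k smallest values are negative and k largest values are positive:

## Peng's estimator (2.1)-(2.3) of the mean of a heavy-tailed sample x.
peng_mean <- function(x, k) {
  xs <- sort(x)
  n  <- length(xs)
  a1 <- 1 / mean(log(-xs[1:k]) - log(-xs[k]))               # alpha_hat^(1), lower tail
  a3 <- 1 / mean(log(xs[(n - k + 1):n]) - log(xs[n - k]))   # alpha_hat^(3), upper tail
  mu1 <- (k / n) * xs[k] * a1 / (a1 - 1)                    # lower tail term of (2.3)
  mu2 <- sum(xs[(k + 1):(n - k)]) / n                       # trimmed mean (2.2)
  mu3 <- (k / n) * xs[n - k + 1] * a3 / (a3 - 1)            # upper tail term of (2.3)
  mu1 + mu2 + mu3
}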
Under a second order condition on the d.f. F (see, for instance, [HS96]), [P01] proved that, whenever $k = o\left(n^{-2\rho/(\alpha - 2\rho)}\right)$, with ρ < 0 called the second order parameter, then
$$\frac{\sqrt{n}}{\sigma(k/n)}\,(\hat\mu_n - \mu) \stackrel{d}{\to} N\!\left(0, \delta^2\right) \quad \text{as } n \to \infty, \qquad (2.5)$$
or equivalently
$$\frac{\sqrt{n}}{\delta\,\sigma(k/n)}\,(\hat\mu_n - \mu) \stackrel{d}{\to} N(0, 1) \quad \text{as } n \to \infty, \qquad (2.6)$$
where
$$\delta^2 := 1 + \frac{(2-\alpha)\left(2\alpha^2 - 2\alpha + 1\right)}{2(\alpha - 1)^4} + \frac{2-\alpha}{\alpha - 1},$$
and
$$\sigma^2(s) := \int_s^{1-s}\!\!\int_s^{1-s} (u \wedge v - uv)\, dF^-(u)\, dF^-(v) \quad \text{for } 0 < s < 1,$$
with $F^-$ being the quantile function of X. It is shown in [P01] that as $n \to \infty$
$$\sqrt{k/n}\; F^-(k/n)\big/\sigma(k/n) \stackrel{P}{\to} -\left(\frac{2-\alpha}{2\left(p^{2/\alpha} + (1-p)^{2/\alpha}\right)}\right)^{1/2} (1-p)^{1/\alpha}, \qquad (2.7)$$
where $p := \dfrac{1+\beta}{2}$. In the case of symmetric distributions (β = 0), relation (2.7) may be rewritten as
$$\sigma(k/n) \sim -\frac{2\sqrt{k/n}\; F^-(k/n)}{\sqrt{2-\alpha}} \quad \text{as } n \to \infty. \qquad (2.8)$$
Here $\stackrel{d}{\to}$ and $\stackrel{P}{\to}$ denote convergence in distribution and in probability respectively, and $N(m, \lambda^2)$ stands for the normal distribution with mean m ∈ R and variance λ² > 0.
The behavior of Hill’s estimators, and therefore that of $\hat\mu_n$, is affected by the number k of upper order statistics used in the computations: using too many observations results in a bias, while using too few leads to a substantial variance. For details on how to determine the optimal number k* of extreme values that guarantees the best possible estimates, we refer to [CP01], [DHPV01] and [FV04]. In the next section, we adopt the methodology of [NFA04], who discussed and evaluated the performance of the method proposed by [RT97], which consists of taking as optimal the value of k that minimizes
$$RT(k) := \frac{1}{k}\sum_{i=1}^{k} i^{\theta}\,\left|\hat\alpha_n(i) - \mathrm{med}\left(\hat\alpha_{1,n}, \dots, \hat\alpha_{k,n}\right)\right|, \qquad (2.9)$$
where med stands for the median and 0 ≤ θ ≤ 1/2. In our case, since 0 < 1/2 < 1/α < 1, we choose θ = 0.3.
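As an illustration of this selection rule (our own sketch following (2.9), not code from [RT97] or [NFA04]), suppose alpha_hat is a vector whose i-th entry is the Hill estimate computed with i upper order statistics; the criterion and its minimizer can then be obtained as follows:

## Reiss-Thomas criterion (2.9) for a given k, with weight exponent theta.
rt_criterion <- function(alpha_hat, k, theta = 0.3) {
  i <- seq_len(k)
  mean(i^theta * abs(alpha_hat[i] - median(alpha_hat[i])))
}
## optimal k*: minimize RT(k) over a grid of candidate values, e.g. k = 2, ..., 500
## ks <- 2:500
## k_star <- ks[which.min(sapply(ks, function(k) rt_criterion(alpha_hat, k)))]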
3 Confidence intervals
Once a data set {x₁, ..., x_N}, for a fixed number N, is in hand, we follow the steps below to construct an estimation interval for the mean µ at a given confidence level 0 < 1 − γ < 1, on the basis of relation (2.6); an R sketch implementing these steps is given after the list.
• Select the optimal number $k^* := \arg\min_k RT(k)$.
• Compute $\hat\mu_N^* = \hat\mu_N(k^*)$ (relations (2.1) to (2.3)).
• Estimate δ by
$$\hat\delta^* := \left(1 + \frac{\left(2 - \hat\alpha_N^*\right)\left(2\hat\alpha_N^{*2} - 2\hat\alpha_N^* + 1\right)}{2\left(\hat\alpha_N^* - 1\right)^4} + \frac{2 - \hat\alpha_N^*}{\hat\alpha_N^* - 1}\right)^{1/2},$$
and use relation (2.8) to get an approximation of $\sigma(k^*/N)$:
$$\hat\sigma^* := -\frac{2\sqrt{k^*/N}\; X_{k^*,N}}{\sqrt{2 - \hat\alpha_N^*}},$$
where $\hat\alpha_N^* = \hat\alpha_N(k^*)$.
• Determine the standard normal quantile of order 1 − γ/2, $z_{1-\gamma/2} := \Phi^-(1 - \gamma/2)$, where Φ denotes the Gaussian d.f. and $\Phi^-$ its generalized inverse.
• Finally, the asymptotic confidence interval of level 1 − γ for the stable mean µ is
$$\left(\hat\mu_N^* - z_{1-\gamma/2}\,\frac{\hat\delta^*\hat\sigma^*}{\sqrt{N}},\;\; \hat\mu_N^* + z_{1-\gamma/2}\,\frac{\hat\delta^*\hat\sigma^*}{\sqrt{N}}\right).$$
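The sketch below (our own illustration, under the assumptions of Sections 2 and 3: a symmetric sample x with 1 < α < 2 and a previously selected number k_star of extremes) strings these steps together:

## Asymptotic (1 - gamma)-level confidence bounds for the stable mean,
## following the steps above; a sketch, not the authors' original code.
stable_mean_ci <- function(x, k_star, gamma = 0.05) {
  N  <- length(x)
  xs <- sort(x)
  ## Hill's estimate alpha_hat*_N from the k* upper order statistics
  a  <- 1 / mean(log(xs[(N - k_star + 1):N]) - log(xs[N - k_star]))
  ## Peng's estimator mu_hat*_N, relations (2.1)-(2.3)
  a1 <- 1 / mean(log(-xs[1:k_star]) - log(-xs[k_star]))
  mu <- (k_star / N) * xs[k_star] * a1 / (a1 - 1) +
        sum(xs[(k_star + 1):(N - k_star)]) / N +
        (k_star / N) * xs[N - k_star + 1] * a / (a - 1)
  ## delta_hat* from the expression for delta^2
  delta <- sqrt(1 + (2 - a) * (2 * a^2 - 2 * a + 1) / (2 * (a - 1)^4) +
                  (2 - a) / (a - 1))
  ## sigma_hat* from the approximation (2.8), with X_{k*,N} in place of F^-(k*/N)
  sigma <- -2 * sqrt(k_star / N) * xs[k_star] / sqrt(2 - a)
  z <- qnorm(1 - gamma / 2)
  c(lower = mu - z * delta * sigma / sqrt(N),
    upper = mu + z * delta * sigma / sqrt(N))
}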
4 Illustrative example
We now simulate, using the algorithm of [CMS76], N = 2500 observations from a symmetric Lévy-stable distribution (µ = β = 0) with α = 1.2 and σ = 0.1, and we apply the above result to construct confidence bounds for the distribution mean. We start by plotting Hill’s estimator of α as a function of the number k of extremes. In Figure 1, the horizontal line corresponds to the true value of α while the vertical one shows the optimal number k*. The results of our simulation study are summarized in Table 1.
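Readers wishing to reproduce such an experiment can generate a symmetric α-stable sample with the Chambers–Mallows–Stuck construction of [CMS76]; the R sketch below covers the symmetric case β = 0 only and is our own rendering, not necessarily the exact generator used for the figures:

## Chambers-Mallows-Stuck simulation of S_alpha(sigma, 0, mu), symmetric case.
rstable_sym <- function(n, alpha, sigma = 1, mu = 0) {
  u <- runif(n, -pi / 2, pi / 2)   # uniform angle on (-pi/2, pi/2)
  w <- rexp(n)                     # standard exponential variable
  x <- sin(alpha * u) / cos(u)^(1 / alpha) *
       (cos((1 - alpha) * u) / w)^((1 - alpha) / alpha)
  mu + sigma * x                   # rescale and shift
}
set.seed(1)
x <- rstable_sym(2500, alpha = 1.2, sigma = 0.1)   # as in the illustrative example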
Fig. 1. Hill’s estimator of α and optimal number of extremes
Table 1. 90% and 95% confidence bounds for the stable mean

 k*    α̂*_N   µ̂*_N   1 − γ   lower bound   upper bound
 192   1.19    0.07    0.90    −1.83         1.97
                       0.95    −2.36         2.51
5 Conclusion
At the end of this work, the following points are worth mentioning:
• In [W01], Weron discussed the performance of Hill’s estimator in the case of Lévy-stable distributions and noted that for α ≤ 1.5 the estimation is quite reasonable, but as α approaches 2 there is a significant overestimation for samples of typical size.
• The knowledge of the asymptotic distribution of $\hat\mu_n$ makes confidence interval construction possible, and much faster than obtaining bootstrap-based intervals.
• When performing the simulation trials, we noted that as σ decreases, the estimation of µ gets better and better (see Figure 2).
Fig. 2. Peng’s estimator for µ of S1.2 (σ, 0, 0) with σ = 1 (left), 0.1 (right)
• In case the distribution is not symmetric (β ≠ 0), one has to approximate p in relation (2.7).
References
[CMS76] Chambers, J. M., Mallows, C. L., Stuck, B. W. : A method for simulating
stable random variables. J. Am. Statist. Assoc. 71, 340–344 (1976)
[CP01]
Cheng, S., Peng, L.: Confidence intervals for the tail index. Bernoulli 7,
751–760 (2001)
[DHPV01] Danielsson, J., de Haan, L., Peng, L., de Vries, C. G.: Using a bootstrap
method to choose the sample fraction in tail index estimation. J. Multivariate
Anal. 76, 226–248 (2001)
[F71]
Feller, W.: An introduction to probability theory and its applications. Vol.
II. John Wiley & Sons, Inc., 2nd edition (1971)
[FV04]
Ferreira, A., de Vries, C. G.: Optimal confidence intervals for the tail index
and high quantiles. Tinbergen Institute Discussion Paper 090/2 (2004)
[GH87] Geluk, J., de Haan, L.: Regular variation, extensions and Tauberian theorems. Mathematical Centre Tracts 40 Centre for Mathematics and Computer Science, Amsterdam (1987)
[HS96]
de Haan, L., Stadtmüller, U.: Generalized regular variation of second order. J. Australian Math. Soc. (Series A) 61, 381–395 (1996)
[H75]
Hill, B.: A simple general approach to inference about the tail of a distribution. Ann.
Statist. 3, 1163–1174 (1975)
[IG96]
Ihaka, R., Gentleman, R.: R: A language for data analysis and graphics.
J. Comp. Graph. Statist. 5, 299–314 (1996)
[L25]
Lévy, P.: Calcul des probabilités. Paris, Gauthier-Villars (1925)
[M82]
Mason, D. M.: Laws of large numbers for sums of extreme values. Ann.
Probab. 10, 754–764 (1982)
[N06a]
Necir, A.: A Functional Law of the Iterated Logarithm for Kernel-type
estimators of the Tail Index. J. Statist. Plann. Inference 136, 780–802
(2006a)
[N06b]
Necir, A.: A nonparametric sequential test with power 1 for the mean
of Lévy-stable distributions with infinite variance. Meth. Comp. App.
Probab. (to appear)(2006b)
[NFA04] Neves, C., Fraga Alves, M.I.: Reiss and Thomas’ automatic selection of
the number of extremes. Comp. Statist. Data Anal. 47, 689–704 (2004)
[P01]
Peng, L.: Estimating the mean of a heavy tailed distribution. Statist.
Probab. Lett. 52, 255–264 (2001)
[RT97]
Reiss, R. D., Thomas, M.: Statistical analysis of extreme values with applications to insurance, finance, hydrology and other fields. Birkhäuser,
Basel (1997)
[ST94]
Samorodnitsky, G., Taqqu, M. S.: Stable non-Gaussian random processes.
Stochastic models with infinite variance. Chapman & Hall, New York
(1994)
[W01]
Weron, R.: Lévy-Stable Distributions Revisited: Tail Index >2 Does Not
Exclude the Lévy-Stable Regime. Int. J. Modern Phys. C 12, 209–223
(2001)