Lévy 2016

Estimation of the degree of jump activity, for irregularly sampled processes in presence of noise
Jean Jacod – Viktor Todorov
UPMC (Paris 6) – Northwestern University
The aim
Make inference on the jumps of a 1-dimensional process
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + \text{jumps driven by a Poisson measure},$$
observed at discrete times within the fixed time interval $[0,1]$.
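The dynamics above can be mimicked with a simple Euler scheme. The sketch below (Python) uses constant coefficients and symmetric $\beta$-stable jump increments generated by the Chambers–Mallows–Stuck formula; all parameter values and function names are illustrative choices, not from the slides.

```python
import numpy as np

def simulate_path(n=10_000, beta=1.2, sigma=0.3, b=0.02, gamma=0.1, seed=1):
    """Euler sketch of X_t = X_0 + int b ds + int sigma dW + gamma * (beta-stable jumps) on [0, 1]."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    dW = rng.normal(0.0, np.sqrt(dt), n)
    # Symmetric beta-stable increments via the Chambers-Mallows-Stuck formula;
    # over a step of length dt the stable increment scales like dt**(1/beta).
    U = rng.uniform(-np.pi / 2, np.pi / 2, n)
    E = rng.exponential(1.0, n)
    dS = (np.sin(beta * U) / np.cos(U) ** (1.0 / beta)
          * (np.cos((1.0 - beta) * U) / E) ** ((1.0 - beta) / beta))
    dX = b * dt + sigma * dW + gamma * dt ** (1.0 / beta) * dS
    return np.concatenate(([0.0], np.cumsum(dX)))
```

The Brownian increments are of order $\sqrt{\Delta_n}$ while the stable increments are of order $\Delta_n^{1/\beta}$, so for $\beta < 2$ the Brownian part indeed dominates, as the first feature below emphasizes.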
Three features:
• The Brownian part $\int_0^t \sigma_s\,dW_s$ is typically dominating, but we are interested in the jumps.
• The sampling times $0 = T(n,0) < T(n,1) < \cdots < T(n,i) < \cdots$ may be irregular, possibly random.
• There is a microstructure noise: instead of $X_{T(n,i)}$ we observe
$$Y^n_i = X_{T(n,i)} + \chi^n_i$$
(so tick-by-tick data, or all-transactions data, can be used to do inference).
Notation:
$\Delta(n,i) = T(n,i) - T(n,i-1)$
$\Delta^n_i V = V_{T(n,i)} - V_{T(n,i-1)}$ ($V$: any process)
$\Delta V_t = V_t - V_{t-}$ ($V$: any càdlàg process)
Regular sampling means $\Delta(n,i) = \Delta_n$.
Spot Lévy measures of $X$: the compensator $\nu$ of the jump measure of $X$ is assumed to have the factorization
$$\nu(\omega, dt, dx) = dt\,F_{\omega,t}(dx)$$
(this is the “Itô semimartingale property” for the jumps). The measures $F_t = F_{\omega,t}$ are the spot Lévy measures, and $\int_0^t ds \int (x^2 \wedge 1)\,F_s(dx) < \infty$.
Warning: We do not want to “estimate” the jumps $\Delta X_t$ themselves, since
• the “big” jumps over $[0,1]$ have no predictive value for the model;
• the “small” jumps are infinitely many, hence impossible to “estimate”.
Rather, we will estimate the “degree of activity” of the jumps, and also the “intensity” of the small jumps (as specified below).
Assumptions on X (typically a log-price)
We have a filtered space $(\Omega, \mathcal F, (\mathcal F_t)_{t\ge0}, \mathbb P)$ on which:
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + \int_0^t\!\int_E \delta(s,z)\,(\underline p - \underline q)(ds,dz) + \int_0^t\!\int_E \delta'(s,z)\,\underline p(ds,dz)$$
$$\sigma_t = \sigma_0 + \int_0^t b^\sigma_s\,ds + \int_0^t H^\sigma_s\,dW'_s + \int_0^t\!\int_E \delta^\sigma(s,z)\,(\underline p - \underline q)(ds,dz) + \int_0^t\!\int_E \delta'^\sigma(s,z)\,\underline p(ds,dz)$$
• $W$ and $W'$ are two correlated Brownian motions.
• $\underline p$ is a Poisson measure on $\mathbb R_+ \times E$ with (deterministic) compensator $\underline q(dt,dz) = dt \otimes \eta(dz)$ ($\eta$ is a $\sigma$-finite measure on the Polish space $E$).
• “standard” assumptions on the coefficients $b_t, b^\sigma_t, H^\sigma_t$: locally bounded, and (up to localization) $b_t, H^\sigma_t$ satisfy, for all finite stopping times $T \le S$:
$$\mathbb E\Bigl(\sup_{s\in[T,S]} |V_s - V_T|^2\Bigr) \le K\,\mathbb E(S-T) \qquad (1)$$
• $|\delta(t,z)|^{r'},\ 1_{\{\delta'(t,z)\ne 0\}} \le J(z)$ for some non-random $\eta$-integrable function $J$ and some $r' \in [0,2)$, and the same for $\delta^\sigma, \delta'^\sigma$ (up to localization).
MOREOVER, we need a structural assumption on the high-activity jumps of $X$, expressed in terms of the BG (Blumenthal–Getoor) index, or successive BG indices:
There is an integer $M \ge 0$, numbers $2 > \beta_1 > \cdots > \beta_M > 0$, and nonnegative càdlàg processes $a^1_t, \dots, a^M_t$, such that each $(a^m_t)^{1/\beta_m}$ satisfies (2), and the symmetrized Lévy measures $\breve F_t(A) = F_t(A) + F_t(-A)$ are such that the (signed) measure
$$F'_t(dx) = \breve F_t(dx) - \sum_{m=1}^{M} \frac{\beta_m\,a^m_t}{|x|^{1+\beta_m}}\,1_{\{0<|x|\le 1\}}\,dx$$
satisfies ($|F'_t|$ being the “absolute value” of $F'_t$):
$$|F'_t|([-x,x]^c) \le \frac{\Gamma}{x^r} \qquad \forall x \in (0,1].$$
Example:
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + \sum_{m=1}^{M}\int_0^t \gamma^m_{s-}\,dY^m_s + \int_0^t\!\int_E \delta'(s,z)\,\underline p(ds,dz)$$
with $Y^m$ independent stable or tempered stable processes (with arbitrary dependencies with $\underline p$, and indices $\beta_m$) and the $\gamma^m$'s Itô semimartingales.
We then have $a^m_t = |\gamma^m_t|^{\beta_m}$ (up to a multiplicative constant).
REGULAR OBSERVATIONS – NO NOISE
Choose a sequence $k_n$ of integers ($\to \infty$) and set
$$L(y)^n_j = \frac{1}{k_n}\sum_{l=0}^{k_n-1} \cos\Bigl( y\,\bigl(\Delta^n_{j+1+2l}X - \Delta^n_{j+2+2l}X\bigr)/\sqrt{\Delta_n} \Bigr)$$
We take differences of two successive returns to “symmetrize” the jump measure around zero and kill the drift.
We have approximately, with $\chi(\beta) = \int_0^\infty \frac{\sin y}{y^\beta}\,dy$:
$$\mathbb E\bigl(L(y)^n_j \mid \mathcal F_{j\Delta_n}\bigr) \approx \exp\Bigl( -y^2\,\sigma^2_{j\Delta_n} - 2\sum_{m=1}^{M} a^m_{j\Delta_n}\,\chi(\beta_m)\,y^{\beta_m}\,\Delta_n^{1-\beta_m/2} \Bigr)$$
hence the following is an estimator of the spot (squared) volatility $c_t = \sigma^2_t$ for $t \approx j\Delta_n$:
$$\hat c(y)^n_j = -\frac{1}{y^2}\,\log\Bigl( L(y)^n_j \vee \frac{1}{\log(1/\Delta_n)} \Bigr)$$
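In the regular no-noise setting both statistics are easy to compute. The sketch below (Python, function names mine) forms $L(y)^n_j$ from non-overlapping pairs of successive returns and applies the spot-volatility formula; on a pure Brownian path $L(y)^n_j$ concentrates around $e^{-y^2 c}$, so the estimator recovers $c = \sigma^2$.

```python
import numpy as np

def L_stat(X, j, y, kn, dn):
    """L(y)^n_j: average over kn non-overlapping pairs of
    cos(y * (difference of two successive returns) / sqrt(dn))."""
    d = np.diff(X)                  # d[i] = Delta^n_{i+1} X
    idx = j + 2 * np.arange(kn)     # pair l uses increments d[j+2l], d[j+2l+1]
    return np.mean(np.cos(y * (d[idx] - d[idx + 1]) / np.sqrt(dn)))

def spot_vol(X, j, y, kn, dn):
    """c_hat(y)^n_j = -(1/y^2) log( L(y)^n_j, floored at 1/log(1/dn) )."""
    L = L_stat(X, j, y, kn, dn)
    return -np.log(max(L, 1.0 / np.log(1.0 / dn))) / y**2
```

The floor at $1/\log(1/\Delta_n)$ prevents the logarithm from blowing up when the empirical average happens to be near zero.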
Introducing a de-biasing term, we set
$$\widehat C(y)^n_t = 2k_n\Delta_n \sum_{j=0}^{[t/v_n]-1} \Bigl( \hat c(y)^n_j - \frac{1}{y^2 k_n}\,\sinh\bigl(\tfrac12\,(y^2\,\hat c(y)^n_j)^2\bigr) \Bigr),$$
where $\sinh(x) = \frac12\,(e^x - e^{-x})$ is the hyperbolic sine.
This “estimates”
$$C_t + \sum_{m=1}^{M} A^{m,n}(y)_t, \qquad\text{where}\qquad A^{m,n}(y)_t = y^{\beta_m-2}\,\Delta_n^{1-\beta_m/2}\,A^m_t, \qquad A^m_t = 2\chi(\beta_m)\int_0^t a^m_s\,ds,$$
and the normalized error term is
$$Z(y)^n_t = \frac{1}{\sqrt{\Delta_n}}\Bigl( \widehat C(y)^n_t - C_t - \sum_{m=1}^{M} A^{m,n}(y)_t \Bigr).$$
The CLT: Let $\mathcal Y$ be a finite subset of $(0,\infty)$. Choose $k_n$ and $u_n$ such that
$$k_n\Delta_n \to 0, \qquad k_n\,\Delta_n^{1/2-\varepsilon} \to \infty \quad \forall\varepsilon > 0, \qquad u_n \to 0, \qquad \frac{\sqrt{k_n\Delta_n}}{u_n^2} \to 0.$$
Theorem We have the (functional) stable convergence in law:
$$\Bigl( Z(u_n)^n,\ \bigl(\tfrac{1}{u_n^2}\,(Z(y u_n)^n - Z(u_n)^n)\bigr)_{y\in\mathcal Y} \Bigr) \stackrel{\mathcal L\text{-}s}{\Longrightarrow} \Bigl( Z,\ \bigl((y^2-1)\,\overline Z\bigr)_{y\in\mathcal Y} \Bigr),$$
where the limit is defined on an extension $(\widetilde\Omega, \widetilde{\mathcal F}, (\widetilde{\mathcal F}_t)_{t\ge0}, \widetilde{\mathbb P})$ of the original space $(\Omega, \mathcal F, (\mathcal F_t)_{t\ge0}, \mathbb P)$ and can be written as
$$Z_t = \sqrt 2\,\int_0^t c_s\,dW^{(1)}_s, \qquad \overline Z_t = \frac{2}{\sqrt 3}\int_0^t c^2_s\,dW^{(2)}_s,$$
where $W^{(1)}$ and $W^{(2)}$ are two independent Brownian motions, independent of the $\sigma$-field $\mathcal F$.
Estimation of $\beta_1$ (when $M = 1$ for simplicity): Observe that
$$\overline C(y)^n_t = \widehat C(u_n y)^n_t - \widehat C(u_n)^n_t = A^{1,n}(u_n y)_t - A^{1,n}(u_n)_t + \sqrt{\Delta_n}\,\bigl(Z(y u_n)^n_t - Z(u_n)^n_t\bigr)$$
$$= u_n^{\beta_1-2}\,\bigl(y^{\beta_1-2} - 1\bigr)\,\Delta_n^{1-\beta_1/2}\,A^1_t + \sqrt{\Delta_n}\,\bigl(Z(y u_n)^n_t - Z(u_n)^n_t\bigr).$$
The function $f(x) = \frac{4^{x-2}-1}{2^{x-2}-1}$ is $C^\infty$ on $(1,2)$, with a $C^\infty$ inverse $f^{-1}$.
Then a natural estimator for $\beta_1$ is, for example,
$$\hat\beta^n_t = f^{-1}\Bigl( \frac{\overline C(4)^n_t}{\overline C(2)^n_t} \Bigr)$$
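The inversion of $f$ needs no numerical root-finding: algebraically $\frac{4^{x-2}-1}{2^{x-2}-1} = 2^{x-2}+1$, so $f^{-1}(r) = 2 + \log_2(r-1)$. A minimal Python sketch (function names mine; the inputs of `beta_hat` stand for the two de-biased statistics):

```python
import math

def f(x):
    """f(x) = (4**(x-2) - 1) / (2**(x-2) - 1); this simplifies to 2**(x-2) + 1."""
    return (4**(x - 2) - 1) / (2**(x - 2) - 1)

def f_inverse(r):
    """Inverse of f on (1, 2): since f(x) = 2**(x-2) + 1, f^{-1}(r) = 2 + log2(r - 1),
    defined for ratios r in (3/2, 2)."""
    return 2.0 + math.log2(r - 1.0)

def beta_hat(C4, C2):
    """beta_1 estimate from the ratio of the two statistics (hypothetical inputs)."""
    return f_inverse(C4 / C2)
```

The closed form also makes the monotonicity of $f$ on $(1,2)$ obvious, which is what guarantees the estimator is well defined.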
and we have
$$\frac{u_n^{\beta_1-2}}{\Delta_n^{(\beta_1-1)/2}}\,\bigl( \hat\beta^n_t - \beta_1 \bigr) \stackrel{\mathcal L\text{-}s}{\longrightarrow} Y$$
for some mixed centered normal $Y$ with a known (conditional) variance. The rate is thus almost $1/\Delta_n^{(\beta_1-1)/2}$, better than the rates obtained by other methods when $\beta_1 > 4/3$.
The choice $k_n \approx 1/\sqrt{\Delta_n}$ and $u_n$ going to 0 very slowly is in fact not “optimal” for this problem, so better (but complicated...) rates can actually be achieved.
One can also estimate $A^1_t = 2\chi(\beta_1)\int_0^t a^1_s\,ds$, using for example
$$\widehat A^{n,1}_t = \frac{\overline C(2)^n_t}{u_n^{\hat\beta^n-2}\,\bigl(2^{\hat\beta^n-2} - 1\bigr)\,\Delta_n^{1-\hat\beta^n/2}}$$
(one can similarly estimate $\int_0^t a^1_s\,ds$).
Then we have a CLT for $\widehat A^{n,1}_t$, with the same kind of limit, and the same rate divided by $\log(1/\Delta_n)$ to an appropriate power.
NOISE AND IRREGULAR SAMPLING
Assumptions on the observation scheme
Assumption: The inter-observation lags are
$$\Delta(n, i+1) = \Delta_n\,\lambda_{T(n,i)}\,\Phi^n_{i+1}$$
• $\lambda_t$ positive càdlàg adapted, satisfying $\mathbb E(|\lambda_T - \lambda_S|) \le \mathbb E(T - S)$ for all stopping times $S \le T$, and $1/K \le \lambda_t \le K$ (up to localization).
• The variables $\Phi^n_i$ are positive, $\mathbb E(\Phi^n_i) = 1$, and $\sup_{n,i} \mathbb E((\Phi^n_i)^p) < \infty$ for all $p > 0$.
• The variables $(\Phi^n_i : i \ge 1)$ are mutually independent and independent of $\mathcal F_\infty$.
This implies in particular that $N^n_t = \sum_{i\ge1} 1_{\{T(n,i)\le t\}}$ satisfies
$$\Delta_n\,N^n_t \stackrel{\text{u.c.p.}}{\Longrightarrow} \Lambda_t := \int_0^t \frac{1}{\lambda_s}\,ds.$$
We denote by $\mathcal H^n_\infty$ the $\sigma$-field generated by $\mathcal F_\infty$ and all $\Phi^n_i$, $i \ge 1$.
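The sampling scheme and the law of large numbers $\Delta_n N^n_t \approx \Lambda_t$ are easy to illustrate. In the sketch below (Python) the modulating process $\lambda_t$ is a deterministic U-shaped "intraday" pattern and the $\Phi^n_i$ are shifted uniforms; both are arbitrary choices of mine that satisfy the stated conditions (positive, $\mathbb E(\Phi^n_i)=1$, all moments finite).

```python
import numpy as np

def sample_times(dn, lam, phi, t_max=1.0, rng=None):
    """T(n,0) = 0 and Delta(n,i+1) = dn * lam(T(n,i)) * Phi^n_{i+1}."""
    rng = rng if rng is not None else np.random.default_rng(0)
    T = [0.0]
    while T[-1] <= t_max:
        T.append(T[-1] + dn * lam(T[-1]) * phi(rng))
    return np.array(T)

lam = lambda t: 0.5 + (2.0 * t - 1.0) ** 2      # positive, bounded away from 0 (my choice)
phi = lambda rng: 0.5 + rng.uniform(0.0, 1.0)   # positive, mean 1, bounded (my choice)

dn = 1e-4
T = sample_times(dn, lam, phi)
N1 = np.sum(T[1:] <= 1.0)                        # N^n_1 = #{i >= 1 : T(n,i) <= 1}
s = np.linspace(0.0, 1.0, 200_001)
Lambda1 = np.mean(1.0 / lam(s))                  # Riemann approximation of int_0^1 ds / lam(s)
```

Here `dn * N1` should be close to `Lambda1`: observations arrive faster where $\lambda$ is small, and the limiting "business clock" $\Lambda_t$ records exactly that.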
Two assumptions on the noise
We observe $Y^n_i = X_{T(n,i)} + \gamma'_{T(n,i)}\,\varepsilon^n_i$, where
(N1): The process $\gamma'$ is an Itô semimartingale, and the variables $(\varepsilon^n_i : i \ge 1)$ are i.i.d., independent of $\mathcal H^n_\infty$, with
$$\mathbb E(\varepsilon^n_i) = 0, \qquad \mathbb E((\varepsilon^n_i)^2) = 1, \qquad \mathbb E(|\varepsilon^n_i|^p) < \infty.$$
(N2): The variables $(\varepsilon^n_i : i \ge 1)$ are independent, conditionally on $\mathcal H^n_\infty$, with
$$\mathbb E\bigl((\varepsilon^n_i)^p \mid \mathcal H^n_\infty\bigr) = \gamma^{(p)}_{T(n,i)}, \qquad \mathbb P\bigl(\varepsilon^n_i \in B \mid \mathcal H^n_\infty\bigr) = \mathbb P\bigl(\varepsilon^n_i \in B \mid \mathcal H^n_{T(n,i)}\bigr)$$
(where $(\mathcal H^n_t)$ is the smallest filtration containing $(\mathcal F_t)$ and for which all $T(n,i)$ are stopping times). Moreover $\gamma^{(1)}_t = 0$ and $\gamma^{(2)}_t = 1$; moreover $\gamma'_t$ and all $\gamma^{(p)}_t$ satisfy (2) and are $(\mathcal F_t)$-adapted.
(N1) is much stronger than (N2), and not very realistic. Later on, $\gamma_t = (\gamma'_t)^2$ (this is the “variance” of the noise).
An important example satisfying (N2) but not (N1).
Let $\rho^j_t \ge 0$ be càdlàg adapted nonnegative with $\sum_{j\in\mathbb Z} \rho^j_t = 1$ and $\rho^j_t = \rho^{-j}_t$, and $\sup_t \sum_{j\in\mathbb Z} \rho^j_t\,|j|^p < \infty$. For each $n$ let $(Z^n_i : i \ge 1)$ be i.i.d. conditionally on $\mathcal H^n_\infty$, with density $x \mapsto \sum_{j\in\mathbb Z} \rho^j_{T(n,i)}\,1_{[j,j+1)}(x)$. The observation at time $T(n,i)$ is
$$Y^n_i = \bigl[X_{T(n,i)} + Z^n_i\bigr],$$
so we have an additive (modulated) white noise plus rounding.
Remark: If we have “pure rounding”, i.e. if we observe $[X_{T(n,i)}]$ (or $[X_{T(n,i)}] + \frac12$ to “center” the noise), then no consistent estimator for $C_t$ exists.
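A toy version of this noise-plus-rounding model is instructive. The sketch below (Python) uses constant, truncated, symmetric weights $\rho^j$ (a simplification of the time-varying $\rho^j_{T(n,i)}$ in the slides) and shows that, unlike pure rounding, $Y = [X + Z]$ remains conditionally unbiased for $X$ when $X \in [0,1)$ and $\rho$ is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)
js = np.arange(-3, 4)                    # truncated support for the cells [j, j+1)
w = np.exp(-js.astype(float) ** 2)
rho = w / w.sum()                        # symmetric weights: rho_j = rho_{-j}, sum = 1

def sample_Z(size):
    """Z with density sum_j rho_j * 1_[j, j+1): pick a cell, then a uniform point in it."""
    j = rng.choice(js, p=rho, size=size)
    return j + rng.uniform(0.0, 1.0, size)

X = 0.37                                 # unobserved efficient value at some T(n,i)
Y = np.floor(X + sample_Z(100_000))      # observed values [X + Z]
# floor(X + j + U) = j + 1_{U >= 1 - X}, so E[Y] = sum_j j*rho_j + X = X by symmetry
```

This conditional unbiasedness is the feature that rescues consistency under (N2), while the degenerate case $Z \equiv 0$ (pure rounding) destroys it.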
Pre-averaging
We choose 3 tuning parameters $u_n, h_n, k_n$ all going to $\infty$: here $u_n > 0$ will be the argument of the empirical characteristic function below, and $h_n, k_n$ (two integers) are window sizes.
The de-noising method is pre-averaging, but other methods could probably be used as well. Take a weight (or kernel) function $g$ on $\mathbb R$ with
$$g \text{ continuous, piecewise } C^1 \text{ with a piecewise Lipschitz derivative } g', \qquad s \notin (0,1) \Rightarrow g(s) = 0, \qquad \int_0^1 g(s)^2\,ds > 0,$$
for example $g(x) = (x \wedge (1-x))\,1_{[0,1]}(x)$. With a sequence $h_n \to \infty$ of integers, set
$$g^n_i = g(i/h_n), \qquad \overline g^n_i = g^n_{i+1} - g^n_i,$$
$$\varphi_n = \frac{1}{h_n}\sum_{i\in\mathbb Z} (g^n_i)^2, \quad \varphi = \int g(u)^2\,du, \qquad \overline\varphi_n = h_n\sum_{i\in\mathbb Z} (\overline g^n_i)^2, \quad \overline\varphi = \int g'(u)^2\,du,$$
$$\widetilde\varphi^{(\beta)}_n = \frac{1}{h_n}\sum_{i\in\mathbb Z} |g^n_i|^\beta, \qquad \widetilde\varphi^{(\beta)} = \int |g(u)|^\beta\,du.$$
The pre-averaged returns of the observed values $Y^n_i$ are
$$\widetilde Y^n_i = \sum_{j=1}^{h_n-1} g^n_j\,\bigl(Y^n_{i+j} - Y^n_{i+j-1}\bigr) = -\sum_{j=0}^{h_n-1} \overline g^n_j\,Y^n_{i+j}.$$
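The two expressions for $\widetilde Y^n_i$ coincide by summation by parts, because $g$ vanishes at 0 and 1. The sketch below (Python, names mine) implements both forms with the example kernel and also computes $\varphi_n$, which should be close to $\int g^2 = 1/12$ for this $g$.

```python
import numpy as np

def g(x):
    """The example weight g(x) = min(x, 1 - x) on [0, 1], zero outside."""
    x = np.asarray(x, dtype=float)
    return np.minimum(x, 1.0 - x) * ((x >= 0.0) & (x <= 1.0))

def preaverage(Y, i, hn):
    """Ytilde^n_i = sum_{j=1}^{hn-1} g(j/hn) * (Y_{i+j} - Y_{i+j-1})."""
    j = np.arange(1, hn)
    return float(np.sum(g(j / hn) * (Y[i + j] - Y[i + j - 1])))

def preaverage_alt(Y, i, hn):
    """Equivalent form: -sum_{j=0}^{hn-1} gbar^n_j * Y_{i+j}, gbar^n_j = g((j+1)/hn) - g(j/hn)."""
    j = np.arange(0, hn)
    return float(-np.sum((g((j + 1) / hn) - g(j / hn)) * Y[i + j]))

hn = 50
phi_n = np.sum(g(np.arange(1, hn) / hn) ** 2) / hn   # ~ integral of g^2 = 1/12 here
```

The second form is the one used in practice: it is a plain weighted average of the raw observations, which averages out the noise at rate $1/\sqrt{h_n}$.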
Initial estimators
For any $y > 0$, set
$$L(y)^n_i = \frac{1}{k_n}\sum_{l=0}^{k_n-1} \cos\Bigl( y\,u_n\,\bigl(\widetilde Y^n_{i+2lh_n} - \widetilde Y^n_{i+(2l+1)h_n}\bigr) \Bigr)$$
(a proxy for the real part of the empirical characteristic function of the returns, over a window of $2h_nk_n$ successive observations). Taking a difference above allows us to “symmetrize” the problem.
Then, for any $y > 0$, a natural estimator for the “integrated volatility” over the time interval $[T(n,i), T(n,i+2h_nk_n)]$ is
$$\frac{2k_n}{y^2 u_n^2\,\varphi_n}\,v(y)^n_i, \qquad v(y)^n_i = -\log\Bigl( L(y)^n_i \vee \frac{1}{h_n} \Bigr).$$
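Given an array of pre-averaged values, $L(y)^n_i$ and $v(y)^n_i$ are one-liners. The sketch below (Python, names mine) is sanity-checked on i.i.d. Gaussian $\widetilde Y$'s with standard deviation $s$, for which $L(y)^n_i \approx e^{-y^2 u_n^2 s^2}$ and hence $v(y)^n_i \approx y^2 u_n^2 s^2$.

```python
import numpy as np

def L_noise(Ytil, i, y, un, hn, kn):
    """L(y)^n_i: average of cos(y*un*(Ytilde_{i+2l*hn} - Ytilde_{i+(2l+1)*hn})), l = 0..kn-1."""
    l = np.arange(kn)
    return np.mean(np.cos(y * un * (Ytil[i + 2 * l * hn] - Ytil[i + (2 * l + 1) * hn])))

def v_stat(Ytil, i, y, un, hn, kn):
    """v(y)^n_i = -log( L(y)^n_i, floored at 1/hn )."""
    return -np.log(max(L_noise(Ytil, i, y, un, hn, kn), 1.0 / hn))
```

As in the no-noise case, the floor (here at $1/h_n$) keeps the logarithm finite when the empirical characteristic function is close to zero.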
We need to de-bias these estimators, to account for the noise, and also for some intrinsic distortion present even when there is no noise. With $f(x,y) = \frac12\bigl(e^{2x-y} + e^{2x} - 2\bigr)$, set
$$\widehat C(y)^n_t = \frac{k_n}{y^2 u_n^2\,\varphi_n} \sum_{j=0}^{[N^n_t/2h_nk_n]-1} \Bigl( 2\,v(y)^n_{2jh_nk_n} - \frac{1}{k_n}\,f\bigl(v(y)^n_{2jh_nk_n},\, v(2y)^n_{2jh_nk_n}\bigr) - \frac{\overline\varphi_n\,y^2 u_n^2}{2h_n^2 k_n} \sum_{l=1}^{2h_nk_n} \bigl(\Delta^n_{2jh_nk_n+l} Y\bigr)^2 \Bigr).$$
Finally, we set
$$Z(y)^n_t = \widehat C(y)^n_t - C_t - \frac{2}{\varphi_n}\sum_{m=1}^{M} |y|^{\beta_m-2}\,u_n^{\beta_m-2}\,\widetilde\varphi^{(\beta_m)}_n\,A^m_t.$$
The basic CLT
Below, $\mathcal Y$ is any finite subset of $(0,\infty)$.
Theorem Under appropriate conditions on $u_n, h_n, k_n$, plus
$$\frac{u_n^{\beta_1}\,h_n^3\,\Delta_n}{u_n^4\,(1 + h_n^2\Delta_n)^2 + u_n^{\beta_1}\,h_n^3\,\Delta_n} \to \eta, \qquad \frac{h_n^2\,\Delta_n}{1 + h_n^2\,\Delta_n} \to \eta',$$
and with
$$v_n = k_n\,\sqrt{\frac{h_n^3\,\Delta_n}{u_n^4\,(1 + h_n^2\Delta_n)^2 + u_n^{\beta_1}\,h_n^3\,\Delta_n}},$$
for any $t > 0$ the variables $v_n\,\bigl(Z(y)^n_t\bigr)_{y\in\mathcal Y}$ converge $\mathcal F_\infty$-stably in law to $(Z(y)_t)_{y\in\mathcal Y}$, which is defined on an extension $(\widetilde\Omega, \widetilde{\mathcal F}, \widetilde{\mathbb P})$ and is, conditionally on $\mathcal F$, centered Gaussian with variance-covariance given by
$$\widetilde{\mathbb E}\bigl(Z(y)_t\,Z(y')_t \mid \mathcal F\bigr) = \int_0^t \Bigl( \eta\,\psi(\beta_1, y, y')\,a^1_s\,\lambda_s + (1-\eta)\,y^2 y'^2\,\bigl(\eta'\,\varphi\,\sigma^2_s\,\lambda_s + (1-\eta')\,\overline\varphi\,\gamma_s\bigr)^2 \Bigr)\,\frac{1}{\lambda_s}\,ds.$$
Estimation of $\beta_1$
Assume again $M = 1$. We use estimators analogous to the estimators in the regular no-noise case, that is
$$\hat\beta^n_t = f^{-1}\Bigl( \frac{\overline C(4)^n_t}{\overline C(2)^n_t} \Bigr), \qquad\text{where } f(x) = \frac{4^{x-2}-1}{2^{x-2}-1}$$
and
$$\overline C(y)^n_t = \widehat C(y)^n_t - \widehat C(1)^n_t = \frac{2}{\varphi_n}\,\widetilde\varphi^{(\beta_1)}_n\,u_n^{\beta_1-2}\,\bigl(y^{\beta_1-2} - 1\bigr)\,A^1_t + \bigl(Z(y)^n_t - Z(1)^n_t\bigr)$$
and we have (when the basic CLT holds):
$$u_n^{\beta_1/2}\,\bigl( \hat\beta^n_t - \beta_1 \bigr) \stackrel{\mathcal L\text{-}s}{\longrightarrow} Y$$
for some mixed centered normal $Y$ with a known (conditional) variance. We similarly have estimators with an analogous CLT for $A^1_t$.
Problem: Choose $(u_n, h_n, k_n)$ with the appropriate conditions, plus $u_n$ as large as possible.
Reporting only the rate for $\beta_1$, we find the following (sub-)optimal rates (depending on the value of $\beta_1$, and on an arbitrary $\varepsilon > 0$):
• If $\sigma_t \equiv 0$ and (N1):
$$1/\Delta_n^{\frac{\beta_1}{10-4\beta_1}(1-\varepsilon)} \ \text{ if } \beta_1 \le \tfrac34, \qquad 1/\Delta_n^{\frac{\beta_1}{7}(1-\varepsilon)} \ \text{ if } \tfrac34 \le \beta_1 \le \tfrac32, \qquad 1/\Delta_n^{\frac{\beta_1}{4+2\beta_1}(1-\varepsilon)} \ \text{ if } \beta_1 \ge \tfrac32.$$
• If $\sigma_t \equiv 0$ and (N2):
$$1/\Delta_n^{\frac{3\beta_1}{30-11\beta_1}(1-\varepsilon)} \ \text{ if } \beta_1 \le \tfrac34, \qquad 1/\Delta_n^{\frac{3\beta_1}{21+\beta_1}(1-\varepsilon)} \ \text{ if } \tfrac34 \le \beta_1 \le \tfrac32, \qquad 1/\Delta_n^{\frac{3\beta_1}{12+7\beta_1}(1-\varepsilon)} \ \text{ if } \beta_1 \ge \tfrac32.$$
• If $\sigma_t$ not $\equiv 0$ and (N1):
$$1/\Delta_n^{\frac{\beta_1}{16-5\beta_1}(1-\varepsilon)} \ \text{ if } \beta_1 \le \tfrac{16}{11}, \qquad 1/\Delta_n^{\frac{3\beta_1}{32-4\beta_1}(1-\varepsilon)} \ \text{ if } \beta_1 \ge \tfrac{16}{11}.$$
• If $\sigma_t$ not $\equiv 0$ and (N2): $1/\Delta_n^{\frac{3\beta_1}{48-11\beta_1}(1-\varepsilon)}$.
Some problems:
1 - Optimality concerning $\beta_1$.
2 - When $M \ge 2$, what does the rate for $\beta_1$ become, and can we estimate $\beta_2, \beta_3, \dots$?
3 - How to choose $(u_n, h_n, k_n)$ in practice (so far, only “mathematical” results are known).