Nonparametric estimation in a mixed-effect Ornstein-Uhlenbeck model

advertisement
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Nonparametric estimation in a mixed-effect
Ornstein-Uhlenbeck model
Charlotte Dion(1),(2)
Ph. D. 2013-2016.
Supervisors: Adeline Samson(1) , Fabienne Comte(2)
(1) LJK, UMR CNRS 5224, Université Joseph Fourier, Grenoble 1
(2) MAP5, UMR CNRS 8145, Université Paris Descartes, Paris Cité
11/09/2014
Charlotte Dion
1 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Content
1
Model and motivation
2
Nonparametric estimator built by deconvolution
Construction of the collection of estimators
Data driven selection method
3
Numerical study
Study on simulated data
Study on a neuronal database
4
Conclusions
Charlotte Dion
1 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Mixed-effect Ornstein-Uhlenbeck model
Observations: (Xj (t), 0 ≤ t ≤ T ) j = 1, . . . , N , N processes described by
(
X (t)
dXj (t) = φj − jα
dt + σdWj (t)
Xj (0)
= xj
→ it models the variability along time for each subject.
(Wj )1≤j≤N are N independent standard Wiener processes.
(φj )1≤j≤N are N unobserved i.i.d. r.v. with density f : random effect
of individual j.
(φj )1≤j≤N and (Wj )1≤j≤N are independent.
(x1 , . . . , xN ) are known values.
T in fixed, known.
The positive constants σ and α are supposed to be known.
Charlotte Dion
2 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
→ When t is fixed: due to the independence of the φj and the Wj , the
Xj (t) are N i.i.d. r. v.
Z t
Xj (t) = Xj (0)e−t/α + φj α(1 − e−t/α ) + σe−t/α
es/α dWj (s).
0
→ Differences between observations are due to the realization of both the
Wj and φj .
→ However, the N trajectories (Xj (t), 0 ≤ t ≤ T ), j = 1, . . . , N are i.i.d.
Charlotte Dion
3 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Xj represents the behaviour of one individual and φj describes the
individual specificity.
Goal: to estimate in a nonparametric way the density f of the random
effects.
Parametric approach: Gaussian assumption (c.f. e.g Genon-Catalot
and Larédo, (2013), Donnet and Samson, (2008), Delattre et al.,
(2013)).
Nonparametric: Comte et al., (2013), for large T . Not efficient when T
is small.
Proposal: a new nonparametric estimator, built by deconvolution,
depending on two parameters, selected in a data-driven way.
Charlotte Dion
4 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Notations
Let us consider f and g in L1 (R) ∩ L2 (R):
R
kf k2 = R |f (x)|2 dx.
R
The Fourier transform of f : f ∗ (x) = R eiux f (u)du for all x ∈ R.
R
The convolution product of f and g: f ? g(x) = R f (x − y)g(y)dy.
We assume
(A) f ∈ L2 (R), f ∗ ∈ L1 (R) ∩ L2 (R).
Charlotte Dion
5 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Construction of the estimator: deconvolution steps
dXj (t) =
φj −
Xj (t)
α
dt + σdWj (t), Xj (0) = xj
For j = 1, . . . , N , τ ∈]0, T ], estimators of the φj
Zj,τ
Xj (τ ) − Xj (0) −
:=
τ
Charlotte Dion
Rτ
0
(−
Xj (s)
ds)
α
.
6 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Construction of the estimator: deconvolution steps
dXj (t) =
φj −
Xj (t)
α
dt + σdWj (t), Xj (0) = xj
For j = 1, . . . , N , τ ∈]0, T ], estimators of the φj
Zj,τ
Xj (τ ) − Xj (0) −
:=
τ
Rτ
0
(−
Xj (s)
ds)
α
.
Notice that
σ
Wj (τ ).
τ
When τ is fixed: the two members of the sum are independent, thus
(Zj,τ )j=1,...,N are i.i.d., and
Zj,τ = φj +
fZτ (u) = f ? f στ Wj (τ ) (u).
Charlotte Dion
6 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Construction of the estimator: deconvolution steps
dXj (t) =
φj −
Xj (t)
α
dt + σdWj (t), Xj (0) = xj
For j = 1, . . . , N , τ ∈]0, T ], estimators of the φj
Zj,τ
Xj (τ ) − Xj (0) −
:=
τ
Rτ
0
(−
Xj (s)
ds)
α
.
Notice that
σ
Wj (τ ).
τ
When τ is fixed: the two members of the sum are independent, thus
(Zj,τ )j=1,...,N are i.i.d., and
Zj,τ = φj +
fZτ (u) = f ? f στ Wj (τ ) (u).
Fourier transform under (A)
fZ∗τ (u) = f ∗ (u)f ∗στ Wj (τ ) (u) ⇔ f ∗ (u) = fZ∗τ (u)eu
Charlotte Dion
2
σ 2 /2τ
.
6 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Cut-off choice
Fourier inversion
f (x) =
1
2π
Z
e−iux fZ∗τ (u)e
u2 σ 2
2τ
du.
R
P
iuZj,τ
Estimator of fZ∗τ (u): fbZ∗τ (u) = (1/N ) N
.
j=1 e
2 2
But: integrability of fbZ∗τ (u)eu σ /2τ no more ensured → cut-off.
Charlotte Dion
7 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Cut-off choice
Fourier inversion
f (x) =
1
2π
Z
e−iux fZ∗τ (u)e
u2 σ 2
2τ
du.
R
P
iuZj,τ
Estimator of fZ∗τ (u): fbZ∗τ (u) = (1/N ) N
.
j=1 e
2 2
But: integrability of fbZ∗τ (u)eu σ /2τ no more ensured → cut-off.
Idea due to Comte et al (2013): to link the time of the process τ and the
cut-off
Z √τ
N
X
u2 σ 2
1
−iux 1
fbτ (x) =
e
eiuZj,τ e 2τ du.
√
2π − τ
N j=1
→ Problem when τ is small.
Charlotte Dion
7 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Cut-off choice
Fourier inversion
f (x) =
1
2π
Z
e−iux fZ∗τ (u)e
u2 σ 2
2τ
du.
R
P
iuZj,τ
Estimator of fZ∗τ (u): fbZ∗τ (u) = (1/N ) N
.
j=1 e
2 2
But: integrability of fbZ∗τ (u)eu σ /2τ no more ensured → cut-off.
Idea due to Comte et al (2013): to link the time of the process τ and the
cut-off
Z √τ
N
X
u2 σ 2
1
−iux 1
fbτ (x) =
e
eiuZj,τ e 2τ du.
√
2π − τ
N j=1
→ Problem when τ is small. We introduce a new cut-off parameter s:
Z s√ τ
N
X
u2 σ 2
1
−iux 1
eiuZj,τ e 2τ du.
fbs,τ (x) =
e
√
2π −s τ
N j=1
Charlotte Dion
7 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
√
To simplify the theoretical study, we replace s τ by a new parameter m.
Resulting estimator: fem,s , when m2 /s2 ∈]0, T ],
1
fem,s (x) =
2π
Z
m
e−iux
−m
N
1 X iuZj,m2 /s2 u2 σ22s2
e 2m du
e
N j=1
with m and s in two finite sets M and S.
Charlotte Dion
8 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Study of the mean integrated squared error (MISE):
h
i
h
i
Decomposition E kfem,s − f k2 = kf − E[fem,s ]k2 + E kfem,s − E[fem,s ]k2 .
∗
Definition fm is defined by fm
:= f ∗ 1[−m,m] .
Proposition
Under (A), E[fem,s ] = fm and we have
Z 1
h
i
2 2 2
m
E kfem,s − f k2 ≤ kfm − f k2 +
eσ s v dv.
πN 0
Charlotte Dion
9 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Study of the mean integrated squared error (MISE):
h
i
h
i
Decomposition E kfem,s − f k2 = kf − E[fem,s ]k2 + E kfem,s − E[fem,s ]k2 .
∗
Definition fm is defined by fm
:= f ∗ 1[−m,m] .
Proposition
Under (A), E[fem,s ] = fm and we have
Z 1
h
i
2 2 2
m
E kfem,s − f k2 ≤ kfm − f k2 +
eσ s v dv.
πN 0
Bias term: decreases when m increases, independent of s
Z
1
kfm −f k2 =
|f ∗ (u)|2 du.
2π |u|≥m
Variance term: increases with m
and s
Z 1
2 2 2
m
eσ s v dv.
πN 0
→ Bounded as soon as s is
bounded.
Charlotte Dion
9 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Finite collections
→ We want to choose the best couple (m, s): the one realizing the
bias-variance compromise.
1 2
, 1/2P −1 ≤ σsl ≤ 2, l = 0, . . . , P }
2l σ
√
k∆
M := {m =
, k ∈ N∗ , 0 < m ≤ N }
σ
with 0 < ∆ < 1 a small step to be fixed.
S := {sl =
C := {(m, s) ∈ M × S, m2 /s2 ≤ T }.
Charlotte Dion
10 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
New criterion extended from the Goldenshluger and Lepski’s method
Consider (m, s) ∈ C.
Penalty function
m σ 2 s2
e
,
N
where κ is a numerical constant to be calibrated.
Criterion
Γm,s =
max
kfem0 ,s0 − fe(m0 ,s0 )∧(m,s) k2 − pen(m0 , s0 )
0 0
pen(m, s) = κ
(m ,s )∈C
0
0
0
+
0
where (m , s ) ∧ (m, s) := (m ∧ m, s ∧ s).
Selection
(m,
e se) = arg min {Γm,s + pen(m, s)}.
(m,s)∈C
Charlotte Dion
11 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
New criterion extended from the Goldenshluger and Lepski’s method
Consider (m, s) ∈ C.
Penalty function
m σ 2 s2
e
,
N
where κ is a numerical constant to be calibrated.
Criterion
Γm,s =
max
kfem0 ,s0 − fe(m0 ,s0 )∧(m,s) k2 − pen(m0 , s0 )
0 0
pen(m, s) = κ
(m ,s )∈C
0
0
0
+
0
where (m , s ) ∧ (m, s) := (m ∧ m, s ∧ s).
Selection
(m,
e se) = arg min {Γm,s + pen(m, s)}.
(m,s)∈C
Lemma
There is a constant C 0 depending on kf k, σ, ∆, and P + 1 the cardinality of
S, such that:
C 0 (P + 1)
E[Γm,s ] ≤ 18kf − fm k2 +
.
N
Charlotte Dion
11 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Construction of the collection of estimators
Data driven selection method
Main non-asymptotic result on the final estimator: oracle type inequality
Theorem (D. (2014))
Under (A), consider the estimator fem,e
e s , there exists κ0 a numerical
constant such that, for all penalty constant κ ≥ κ0 ,
2
E[kfem,e
e s − f k ] ≤ C inf
(m,s)∈C
C 0 (P + 1)
kf − fm k2 + pen(m, s) +
N
where C > 0 is a numerical constant as soon as κ is fixed and C 0 is the
previous constant of Lemma.
Automatic realisation of the bias-penalty compromise.
We choose the two parameters in an adaptive way, thus this gives more
flexibility in the choice of the estimator.
Charlotte Dion
12 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Study on simulated data
Study on a neuronal database
Numerical study
Exact simulation of the processes.
Discretization, time step δ, small (500 to 2000 observations).
Choice of parameters and designs (ex: T = 0.3, 10, 100, 300)
Calibration κ = 0.3.
∆ = 0.08.
Charlotte Dion
13 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Study on simulated data
Study on a neuronal database
Study on simulated data with N = 240, T = 0.3, δ = 0.00015,
σ = 0.0135, α = 0.039
Figure : - 25 estimators fem,e
e s , - the true density f : gamma and mixed gamma
Table : Empirical mean integrated squared error, computed from 100 simulated
data sets
fem,e
e s
Oracle
f gamma
0.068
0.041
Charlotte Dion
f mixed-gamma
0.038
0.029
14 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Study on simulated data
Study on a neuronal database
Application on a neuronal database
Interspikes interval (ISI) measures: measurements along time of the
membrane potential in volts [V] of one single neuron, between the spikes.
Figure : Left: Membrane potential, right: the 240 observed trajectories
→ We consider the observations as independent realizations of
our model.
→ Picchini et al. (2010) proves that the Ornstein-Uhlenbeck
model with one random effect fits better the data than without.
Charlotte Dion
15 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Study on simulated data
Study on a neuronal database
Parameters values
T = 0.3, time step δ = 0.00015 [s].
The initial voltage = the resting potential: xj = 0.
√
The diffusion coefficient is fixed σ = 0.0135 [V/ s] (estimated in
Picchini et al (2010) )
α [s]: time constant of the neuron α = 0.039 [s] (estimated in Lansky
et al (2006))
→ φj is the local input that neuron receives during the j th ISI.
→ Estimation of f obtained in Picchini et al (2010) under Gaussian
assumption: N (0.278, 0.0412 ).
Charlotte Dion
16 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Study on simulated data
Study on a neuronal database
Parameters values
T = 0.3, time step δ = 0.00015 [s].
The initial voltage = the resting potential: xj = 0.
√
The diffusion coefficient is fixed σ = 0.0135 [V/ s] (estimated in
Picchini et al (2010) )
α [s]: time constant of the neuron α = 0.039 [s] (estimated in Lansky
et al (2006))
→ φj is the local input that neuron receives during the j th ISI.
→ Estimation of f obtained in Picchini et al (2010) under Gaussian
assumption: N (0.278, 0.0412 ).
We although represent the adaptive kernel estimator associate to the r.v.
Zj,T ,
N
x − Zj,T
1 X1
fbh (x) =
K
N j=1 h
h
where we choose the bandwidth b
h among a collection, with a data-driven
Lepski’s procedure developed in the article.
Charlotte Dion
16 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Study on simulated data
Study on a neuronal database
Estimated density f on a neuronal database
ee s , - - the density from Picchini et al (2010) N (0.278, 0.0412 )
Figure : – fbh
b , – fm,e
and ... the density Γ(46.3, 0.006)
Charlotte Dion
17 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Conclusions
More precise estimation instead of parametric assumption. Can be
used to simulate the φj .
This new parameter s generalizes the results of Comte et al (2013)
even if T is large.
Selection procedure of two parameters which can be adapted in other
cases.
Further works
Remark: the procedure can be written with a drift b(x) + φj with b
satisfying assumption but not necessary linear.
Add a new random effect: α.
Solve the problem when σ(x) 6= σ1 without assuming σ(x) < σ1 .
Charlotte Dion
18 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
References
I Dion, C Nonparametric estimation in a mixed-effect OrnsteinUhlenbeck model Preprint hal-01023300
I Comte, F., Genon-Catalot, V., Samson, A. (2013). Nonparametric
estimation for stochastic differential equation with random effects.
Stochastic Processes and their Applications 7, 2522–2551.
I Goldenshluger, A. and Lepski, O. (2011). Bandwidth selection in
kernel density estimation: oracle inequalities and adaptive minimax
optimality. Ann. Statist. 39, 1608–1632.
I Picchini, U., De Gaetano, A. and Ditlevsen,S. (2010). Stochastic
differential mixed-effects models. Scandinavian Journal of Statistics
Charlotte Dion
19 / 20
Model and motivation
Nonparametric estimator
Numerical study
Conclusions
Thank you for your attention.
Charlotte Dion
20 / 20
Download