Parameter estimation in diffusion processes from observations of first hitting-times Susanne Ditlevsen

advertisement
Parameter estimation in diffusion
processes from observations of
first hitting-times
Susanne Ditlevsen
Department of Mathematical Sciences
University of Copenhagen, Denmark
Email: susanne@math.ku.dk
Joint work with
Petr Lansky, Academy of Sciences of the Czech Republic, Prague
EPSRC Symposium Workshop on Computational Neuroscience,
Warwick, December 8–12, 2008
All characteristics of the neuron are
collapsed into a single point in space
1
All characteristics of the neuron are
collapsed into a single point in space
=⇒
1
•
Data: spiketrains
|
{z
T
}
|
{z
T
}
time
2
Underlying process
-
-
3
The model
dXt = µ(Xt ) dt + σ(Xt ) dW (t) ; X0 = x0
Xt : membrane potential at time t after a spike
x0 : initial voltage (the reset value following a spike)
An action potential (a spike) is produced when the membrane voltage
Xt exceeds a firing threshold
S(t) = S
>
X(0) = x0
After firing the process is reset to x0 . The interspike interval T is
identified with the first-passage time of the threshold,
T = inf{t > 0 : Xt ≥ S}.
4
Data
We observe the spikes: the first-passage-time of Xt through S:
Data: {t1 , t2 , . . . , tn } i.i.d. realizations of the random variable T .
Note: There is only information on the time scale, nothing on the
scale of Xt . Thus, obviously something is not identifiable in the
model from these data, and something has to be assumed known.
5
Estimation
dXt = µ(Xt , θ) dt + σ(Xt , θ) dW (t)
;
θ ∈ Θ ⊆ IRp
y 7→ fθ (t − s, x, y)
Transition density:
Corresponding
distribution function:
Fθ (t − s, x, y) =
Ry
fθ (t − s, x, u)du
T = inf{t > 0 : Xt ≥ S}.
Data: {t1 , t2 , . . . , tn } i.i.d. realizations of the random variable T .
How do we estimate θ?
6
Maximum likelihood estimation
... is possible if we know the distribution of T .
Let pθ (t) be the probability density function of T .
Recall:
Likelihood function:
Qn
Ln (θ)
=
log Ln (θ)
=
Score function(s):
∂θ log Ln (θ)
=
i=1 pθ (ti )
Pn
i=1 log pθ (ti )
Pn
i=1 ∂θ log pθ (ti )
Estimator θ̂ is such that
∂θ log Ln (θ̂)
=
0
Log-likelihood function:
7
Example: Brownian motion with drift
dXt = µ dt + σ dW (t)
;
µ > 0, σ > 0
S
;
X0 = 0 < S
Then
pθ (t)
=
√
2πσ 2 t3
exp −
2
(S − µt)
2σ 2 t
Thus
Ln (θ) =
n
Y
pθ (ti )
=
i=1
log Ln (θ) =
n
X
i=1
log pθ (ti )
n
Y
S
p
i=1
=
n
X
i=1
8
!
log
2πσ 2 t3i
exp −
S
p
2πσ 2 t3i
n
X
i=1
!
−
(S − µti )2
2σ 2 ti
n
X
(S − µti )2
i=1
2σ 2 ti
!
4
µ=1
2
σ=1
1
σ = 0.5
0
probability density
3
σ = 0.1
0.0
0.5
1.0
T
9
1.5
2.0
Score functions:
∂µ log Ln (θ)
n
X
(S − µti )
=
σ2
i=1
n
X (S − µti )2
n
= − 2+
2 )2 t
2σ
2(σ
i
i=1
∂σ2 log Ln (θ)
Maximum likelihood estimators:
S
µ̂ =
t̄
σ̂
2
= S
2
1
n
n
X
i=1
1
1
−
ti
t̄
where
n
t̄
=
1X
ti
n i=1
10
!
Example: The Ornstein-Uhlenbeck model
Consider the Ornstein-Uhlenbeck process as a model for the
membrane potential of a neuron:
dXt
=
Xt
−
+ µ dt + σ dWt ; X0 = x0 = 0.
τ
where
Xt : membrane potential at time t after a spike
τ : membrane time constant, reflects spontaneous voltage decay (>0)
µ: characterizes constant neuronal input
σ: characterizes erratic neuronal input
x0 : initial voltage (the reset value following a spike)
11
The conditional expectation is
E[Xt |X0 = 0] =
µτ (1 − e−t/τ )
The conditional variance is
Var[Xt |X0 = x0 ]
=
−t/τ
Thus (Xt |X0 = 0) ∼ N (µτ (1 − e
τ σ2 1 − e−2t/τ
2
),
τ σ2
2
−2t/τ
1−e
).
Asymptotically (in absence of a threshold) Xt ∼ N (µτ, τ σ 2 /2).
12
Two firing regimes:
Suprathreshold: µτ S (deterministic firing - the neuron is active
also in the absence of noise)
Subthreshold: µτ S (firing is caused only by random fluctuations
(stochastic or Poissonian firing)
6
Suprathreshold
µτ
S
Xt
T
T
T
T
T
T
T
T
time
6
Subthreshold
S
µτ
Xt
T
time
13
T
Model parameters: µ, σ, τ, x0 , S
Assumed known:
Intrinsic or characteristic parameters of the neuron: τ, x0 , S
τ ≈ 5 − 50 msec, S − x0 ≈ 10 mV ; (We set x0 = 0)
To be estimated:
√
Input parameters: µ (in [mV/msec]) and σ (in [mV/ msec])
14
dXt =
Xt
+ µ dt + σ dW (t) ; τ > 0, µ ∈ IR, σ > 0 ; X0 = 0 < S
−
τ
The distribution of T = inf{t > 0 : Xt ≥ S} is only known if S = µτ
(the asymptotic mean of Xt in absence of a threshold):
pθ (t)
=
√
2S exp(2t/τ )
πτ 3 σ 2 (exp(2t/τ ) − 1)3/2
2
S
exp − 2
σ τ (exp(2t/τ ) − 1)
Maximum likelihood estimator (µ = S/τ by assumption):
n
σ̂ 2
=
2S 2
1X
n i=1 τ (exp(2ti /τ ) − 1)
15
µ=2
20
30
40
0.6
Threshold
C
σ = 10
0.0000
0.05
0.00
10
σ=1
µ=3
0
0.15
3000
4000
5000
D
µ=5
σ = 10
σ=5
0.3
σ=1
σ=1
0.0
0.00
0.1
0.05
2000
0.2
0.10
σ=5
1000
Suprathreshold
0.4
0.20
0.25
0
µ=2
0.0004
σ=1
B
0.5
0.10
σ=5
Subthreshold
0.0008
A
σ = 10
0.15
0.0012
Subthreshold
0
5
10
15
20
25
30
0
2
4
6
8
10
Interspike intervals (ms)
Figure 1: Empirical densities from simulated data, each density corresponds to 10.000 ISIs
√
µ =√2 V/s and σ = 1, 5 or 10 mV/ ms. B:
Subthreshold regime, µ = 2 V/s and σ = 1 mV/ ms, enlarged from panel A. C: Threshold
√16
regime, µ = 3 V/s and σ = 1, 5 or 10 mV/ ms. D: Suprathreshold regime, µ = 5 V/s and
√
measured in ms. A: Subthreshold regime,
We reformulate to the equivalent dimensionless form
√ X X
t
Wt µτ
σ τ
t
t
d
d
d √
= −
+
+
S
S
S
τ
S
τ
or
dYs = (−Ys + α) ds + β dWs ,
Y0 = 0
where
√
Xt
Wt
σ τ
µτ
t
√
, Ws =
, β=
s = , Ys =
, α=
τ
S
S
S
τ
and T /τ = inf{s > 0 : Ys ≥ 1}.
17
dYs = (−Ys + α) ds + β dWs ,
Y0 = 0
E[Ys |Y0 = 0] = α(1 − e−s )
1 2
Var[Yt |Y0 = 0] = β (1 − e−2s )
2
Let fT /τ (s) be the density of T /τ .
An exact expression is only known for α = 1:
1
2e2s
exp − 2 2s
fT /τ (s)α=1 = √
2s
3/2
β (e −1)
πβ(e −1)
The maximum likelihood estimator:
α=1:
N
X
1
2
2
β̌ =
N i=1 e2si − 1
18
The Laplace transform of T :
2
kT /τ E e
=
α
exp{ 2β
2 }Dk
√
2α
β
Hk α
β
√
=
(α−1)2
2(α−1)
(α−1)
exp{ 2β 2 }Dk
Hk
β
β
for k < 0, where Dk (·) and Hk (·) are parabolic cylinder and Hermite
functions, respectively.
19
Ricciardi & Sato, 1988 derived series expressions for the moments of
T . In particular
∞
1 X 2n (1 − α)n − (−α)n n E[T /τ ] =
Γ
n
2 n=1 n!
β
2
The expression is difficult to work with, especially if |α| 1
(strongly sub- or suprathreshold) because of the canceling effects in
the alternating series. The expression for the variance includes the
digamma function also.
Inoue, Sato & Ricciardi, 1995, proposed computer intensive methods
of estimation by using the empirical moments of T .
20
Another approach: Martingales (Laplace transform)
dXt
=
Xt
−
+ µ dt + σdWt ; X0 = x0 = 0;
τ
with solution
Xt
− τt
= µτ (1 − e
Z
t
)+σ
−
e
(t−s)
τ
dWs
0
Define the martingale:
Z
t
τ
Yt = (µτ − Xt )e = µτ − σ
s
τ
e dWs
0
21
t
If M (t) is a martingale, then E[M (T ∧ t)] = E[M (0)]
We need more than that:
Doob’s Optional-Stopping Theorem
Let T be a stopping time and let M (t) be a uniformly integrable
martingale. Then E[M (T )] = E[M (0)].
22
Yt is obviously not uniform integrable (UI) (it is equivalent to a
Brownian Motion). CLAIM:
Y T (t) := Y (T ∧ t),
the process stopped at T , is UI in certain part of the parameter
region. We show that
E[|YtT |p ] < K
for all t and some p > 1 and some positive K < ∞.
23
First observe that
YT ∧t = (µτ − XT ∧t )e
(T ∧t)
τ
≥ (µτ − S)e
(T ∧t)
τ
>0
for all t if µτ > S (suprathreshold regime).
Set p = 2. We have
E[|YtT |2 ]
= E[(YtT )2 ]
Z
= E[(µτ − σ
T ∧t
s
e τ dWs )2 ]
0
=
Z
(µτ )2 − 0 + σ 2 E[(
0
24
T ∧t
s
e τ dWs )2 ]
M (t)
=
Z t
Z t
2s
s
2
τ
e τ ds
(
e dWs ) −
0
0
is a martingale due to Itôs isometry:
Z t
Z t
f (s, ω)dWs )2 =
E[f (s, ω)2 ]ds
E(
0
0
such that E[M (T ∧ t)] = E[M (0)] = 0. This yields
Z
E[(
T ∧t
s
τ
2
e dWs ) ]
Z
= E[
0
T ∧t
2s
τ
e ds]
0
τ 2(T ∧t)
= E[ (e τ − 1)]
2
2T
τ
≤
E[e τ ]
2
25
Thus, we have:
2T
τ
E[|YtT |2 ] ≤ (µτ )2 + σ 2 E[e τ ]
2
We need to show that this is finite.
26
Define the martingale (to be trusted):
2
Y2 (t) = (µτ − X(t)) e
2t
τ
2t
τ σ2
(1 − e τ )
+
2
such that
E[Y2 (T ∧ t)] = E[Y2 (0)] = (µτ )2
which yields
2
(µτ )
2(T ∧t)
τ σ2
= E[(µτ − X(T ∧ t)) e
+
(1 − e τ )]
2
2(T ∧t)
τ σ2
τ σ2
2
≥
(µτ − S) −
E[e τ ] +
2
2
2
27
2(T ∧t)
τ
2
If (µτ − S) >
τ σ2
2
then
τ σ2
(µτ ) −
2
2
τ
σ
(µτ − S)2 −
2
2
≥ E[e
2(T ∧t)
τ
].
Taking limits on both sides we obtain
2
τ
σ
(µτ )2 −
2
2
τ
σ
(µτ − S)2 −
2
≥
lim E[e
t→∞
since T is almost surely finite.
28
2(T ∧t)
τ
] = E[e
2T
τ
]
BINGO! Doob is good.
2
If S < µτ (suprathreshold regime) and (µτ − S) >
E[Y T (0)]
τ σ2
2
= E[Y T (T )]
such that
µτ = E[Y T (0)]
= E[Y T (T )]
T
τ
= E[(µτ − X(T ))e ]
T
(µτ − S)E[e τ ].
=
29
then
Beautiful result:
T
τ
E[e ]
µτ
µτ − S
=
With a little more work:
τ σ2
(µτ ) −
2
2
τ
σ
(µτ − S)2 −
2
2
E[e
2T
τ
]
=
Explicit expressions for the parameters:
µ =
SE[e
τ (E[e
TS
τ
TS
τ
]
] − 1)
2
σ2
2S Var[e
=
τ (E[e
2TS
τ
TS
τ
] − 1)(E[e
30
]
TS
τ
] − 1)2
Straightforward estimators:
n
Ê[Z]
=
1 X ti /τ
e
= Z1
n i=1
=
1 X 2ti /τ
e
= Z2
n i=1
n
Ê[Z 2 ]
where ti , i = 1, . . . , n, are the i.i.d. observations of the FPT’s.
Moment estimators of the parameters are then
µ̂ =
SZ1
τ (Z1 − 1)
2
2S 2 (Z2 − Z12 )
.
2
τ (Z2 − 1)(Z1 − 1)
σ̂
=
31
Another approach: The Fortet integral equation
Set S = 1. The probability
Z
P [Xt > 1 | X0 = x0 ] =
fθ (t, x0 , y)dy = 1 − Fθ (t, x0 , 1) = LHS(t)
y>1
can alternatively be calculated by the transition integral
Z t
pθ (u) 1 − Fθ (t − u, 1, 1) du = RHS(t)
P [Xt > 1 | X0 = x0 ] =
0
32
Parameter estimation
Sample t1 , . . . , tn of independent observations of T . Fix θ.
RHS can be estimated at t from the sample by the average
Z t
RHS(t; θ) =
pθ (u) 1 − Fθ (t − u, 1, 1) du
0
≈
n
1X
RHSemp (t; θ) =
1 − Fθ (t − ti , 1, 1) 1{ti ≤t}
n i=1
since for fixed t it is the expected value of
1T ∈[0,t]
1 − Fθ (t − T, 1, 1; θ)
with respect to the distribution of T .
33
Parameter estimation
Error measure:
L(θ) = sup |(RHSemp (t) − LHS(t))/ω|
t>0
Estimator:
θ̃ = arg min L(θ)
θ
34
Fortet integral equation
Let f (s) be the density function for the time t/τ from zero to the
first crossing of the level 1 by Y . The probability
α(1 − e−s ) − 1 √
P [Y (s) > 1] = Φ √
−2s
1−e
β/ 2
can alternatively be calculated by the transition integral
!
Z s
−(s−u)
α−1 1−e
√ √
P [Y (s) > 1] =
f (u) Φ
du
−2(s−u)
β/ 2 1−e
0
35
Parameter estimation
LHS(s) = Φ
−s
α(1−e )−1
√
√
1−e−2s β/ 2
q
Rs
−(s−u)
α−1
1−e
= 0 f (u) Φ β/√2 1+e−(s−u) du = RHS(s)
Sample: s1 ≤ s2 ≤ . . . ≤ sn , iid observations of T /τ .
s
!
n
α − 1 1 − e−(s−si )
1X
√
Φ
RHS(s) ≈
1{si ≤s}
−(s−s
)
i
n i=1
β/ 2 1 + e
since it is the expected value of
α−1
√
1U ∈[0,s] Φ
β/ 2
s
1−e−(s−U )
1+e−(s−U )
!
with respect to the distribution of U = T /τ for given α and β.
36
interpretable quantities of µ and σ through (11).
Simulated data. Trajectories of the OU process were simulated according to the Euler scheme
C
0.8
1.3
0.9
α̂
1.0
1.1
1.9
α̂
2.0
2.1
F
β̂
1.5
0.10
0.15
β̂
5
0
0
0.05
3
H
50
1.0
2
α̂
G
0
0
0.5
1
α̂
5
50
E
D
0
0
0
0
0.3
5
B
50
A
50
5
for four different combinations of parameter values: (α, β) = (0.8,1), (1,0.1), (2,0.1), and
0.05
0.10
β̂
0.15
0.5
1.0
1.5
β̂
FIG. 1: Densities of estimates for simulated data based on 1000 estimates. Upper panels: Estimates
of α. Lower panels: Estimates of β. Solid black line: sample sizes of 500. Dashed black line: sample
sizes of 100. Solid grey line: sample sizes of 50. Dashed grey line: sample sizes of 10. Vertical line:
37
1.0
0.5
1.0
0.5
2
B
0.0
0.0
A
4
0.01
0.02
ISIs (s)
ISIs (s)
uditory neurons. Comparison of the (normalized) right hand side of Eq.
the
Figur 1: Auditory neurons. A: Spontaneous record;
α̂ = 0.85; empirical
β̂ = 0.09. B:Eq.
Stimulated
record; α̂ curves),
= 4.8; β̂ = calculated
0.63. Note
(normalized)
(13) (dashed
different time axes.
with estima
ical axes can be interpreted as a cumulative probability, horizontal axes are
us record; α̂ = 0.85; β̂ = 0.09. B: Stimulated record; α̂ = 4.8; β̂ = 0.63. No
38
Feller process
β p
dYs = (−Ys + α) ds + √
Ys dWs
α
E[Ys |Y0 = y0 ] = α + (y0 − α)e−s
h
2y
i
β2
0
Var[Ys |Y0 = y0 ] =
(1 − e−s ) 1 +
− 1 e−s
2
α
39
S2
S1
µτ
VI
time
40
Ditlevsen & Lansky, 2006 give the moments
α − y0
if α > 1
α−1
2α(α − y0 )2 + β 2 (α − 2y0 )
2T /τ
E[e
]=
2α(α − 1)2 + β 2 (α − 2)
E[eT /τ ] =
Moment estimators:
α̂ =
if
p
1 + 2(α/β)2
Z1 − y0
Z1 − 1
and
2
2
2(1
−
y
)
(Z
−
Z
0
2
1)
2
β̂ =
α̂
2(Z1−1)(Z2−y0 ) − (Z1−y0 )(Z2−1)
41
<1+
2α(α − 1)
β2
µτ
S
α̂ and β̂ 2 are valid
1√
2− 2
g(k)
only α̂ is valid
1
1
k=
42
2µ
σ2
Feller process
β p
dYs = (−Ys + α) ds + √
Ys dWs
α
E[Ys |Y0 = y0 ] = α + (y0 − α)e−s
h
2y
i
β2
0
(1 − e−s ) 1 +
− 1 e−s
Var[Ys |Y0 = y0 ] =
2
α
Chapman-Kolmogorov integral equation:
Z s
f (u){1 − Fχ2 [a(s − u), ν, δ(s − u, 1)]} du
1 − Fχ2 [a(s), ν, δ(s, y0 )] =
0
a(s) = (4α)/β 2 (1 − e−s ), δ(s, y0 ) = (4αy0 /β 2 )[e−s /(1 − e−s )] and
ν = 4(α/β)2 .
43
1.0
0.5
1.0
0.5
1
ISIs (s)
B
0.0
0.0
A
2
1
ISIs (s)
2
C
5
10
15
time (s)
20
0.5
0.5
1.0
1.0
0
2
ISIs (s)
E
0.0
0.0
D
4
2
ISIs (s)
4
F
0
20
40
time (s)
60
FIG. 3: Olfactory neurons. A, B, C: Neuron that fits the models. D, E, F: Neuron that did n
fit the models. A, D: OU model, normalized comparison of the right hand side of Eq. (12) (sol
44
CLE IN PRESS
sen / Probabilistic Engineering Mechanics (
)
5
–
tions, and the
mates of α and
ates:
α̃
0.83 ± 0.14
1.01 ± 0.15
1.97 ± 0.15
2.99 ± 0.17
3.94 ± 0.21
10.76 ± 0.37
β̃
0.97 ± 0.10
0.98 ± 0.11
0.98 ± 0.09
0.98 ± 0.09
1.00 ± 0.09
0.99 ± 0.09
Fig. 2. Normalized empirical distribution functions of the sample of 100 joint
estimates of α and β compared to the standardized normal distribution function.
45
0.97 ± 0.10
0.98 ± 0.11
0.98 ± 0.09
0.98 ± 0.09
1.00 ± 0.09
0.99 ± 0.09
Fig. 2. Normalized empirical distribution functions of the sample of 100 joint
estimates of α and β compared to the standardized normal distribution function.
ervations each.
m (11) and (12)
98 ± 0.0.07 for
ime. In the
rms slightly
n the other
ment and is
rathreshold.
able 1. The
n the strong
rms as well.
tions of the
29) indicate
y distributed
o construct
Fig. 3. Scatterplots of the 996 pairs of estimates of (α, β), each estimated from
a sample of 10 simulated first-passage times corresponding to the true values
α = 2 and β = 1.
separated into 1000 samples with only 10 observations in each
46
sample and (α, β) is estimated for each of these. For four of
ARTICLE IN PRESS
levsen, O. Ditlevsen / Probabilistic Engineering Mechanics (
a Feller process.
elow, a property which is
he action of an inhibitory
its dimensionless form
quation
(30)
)
–
Fig. 5. Comparison of the (normalized) left-hand side of the integral equation
(25) (smooth curves) with the empirical (normalized) right-hand side given by
(26) for five simulated samples of 100 first-passage times of the OU process of
the level 1 corresponding to the true α-values 1, 2, 3, 4, 11, respectively, and the
true β = 1 (right to left). For these samples the estimates of (α, β) according to
(29) are (1.212, 0.926), (1.677, 0.996), (2.657, 1.039), (4.055, 1.029), (10.801,
0.956), respectively.
1 is directly obtained as
47
ARTICLE IN
8
S. Ditlevsen, O. Ditlevsen / Probabilistic Enginee
48
49
Download