Rare Event Simulation and (Interacting) Particle Systems

Adam M. Johansen
a.m.johansen@warwick.ac.uk
http://go.warwick.ac.uk/amjohansen/talks/

MASDOC Statistical Frontiers Seminar
4th February 2016
Context

- Given a probability space (Ω, F, P),
- and a random element X : (Ω, F) → (E, E),
- what is P(X ∈ A) = P ∘ X⁻¹(A) = P({ω ∈ Ω : X(ω) ∈ A}),
- for some A ∈ E such that P(A) ≪ 1?

[Figure: Ω mapped by X into X(Ω) ⊆ E; the event A ⊆ E pulls back to X⁻¹(A) ⊆ Ω, containing the points ω with X(ω) ∈ A.]
Some Simple Examples: Normal Probabilities

1. A really simple problem.
   - Let
       f(x) = (1/√(2π)) exp(−x²/2).
   - What is P(X ∈ A) if A = [a, ∞) for a ≫ 1?
   - Simple semi-analytic solution: 1 − Φ(a).
2. A somewhat harder problem:
   - Let
       f(x) = |2πΣ|^(−1/2) exp(−xᵀΣ⁻¹x / 2).
   - What is P(X ∈ A) if A = ⊗_{i=1}^d [a_i, b_i]?
   - What can we say about Law(X)|_A?
The Monte Carlo Method

- To approximate
    I(ϕ) = E[ϕ(X)]
  with X ∼ π.
- Sample X₁, …, X_n iid ∼ π and use
    Î_n(ϕ) = (1/n) Σ_{i=1}^n ϕ(X_i).
- SLLN:
    lim_{n→∞} Î_n(ϕ) = E[ϕ(X)]  a.s.
- CLT:
    √n (Î_n(ϕ) − I(ϕ)) → Z in distribution as n → ∞,
  with Z ∼ N(0, Var(ϕ(X))), provided Var(ϕ(X)) < ∞.
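As a concrete illustration, Î_n(ϕ) is a one-liner. A minimal sketch in standard library Python; the N(0, 1) target and ϕ(x) = x² are illustrative choices, not from the slides:

```python
import random

def monte_carlo_estimate(phi, sampler, n):
    """I_n(phi): the average of phi over n i.i.d. draws from the sampler."""
    return sum(phi(sampler()) for _ in range(n)) / n

rng = random.Random(42)
# Estimate I(phi) = E[X^2] for X ~ N(0, 1); the true value is Var(X) = 1.
estimate = monte_carlo_estimate(lambda x: x * x,
                                lambda: rng.gauss(0.0, 1.0),
                                100_000)
```

By the CLT above, the error of `estimate` is of order √(Var(X²)/n) = √(2/n) ≈ 0.0045 here.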
The Monte Carlo Method and Rare Events

- Use P(X ∈ A) ≡ E[I_A(X)] = I(I_A).
- Then, directly:
    P(X ∈ A) ≈ Î_n(I_A) = |A ∩ {X₁, …, X_n}| / n.
Simple Monte Carlo and the Toy Problem

log(Î_{10^k}(I_[a,∞))) against log(1 − Φ(a)) ("--": no sample landed in [a, ∞), so the estimate is zero):

a \ k     1       2       3       4       5       6       7    | log(1 − Φ(a))
1      -2.30   -1.66   -1.80   -1.82   -1.83   -1.84   -1.84  |   -1.84
2        --    -3.91   -3.73   -3.76   -3.78   -3.79   -3.79  |   -3.78
3        --      --    -6.91   -6.81   -6.59   -6.60   -6.61  |   -6.61
4        --      --      --      --   -10.12  -10.26  -10.42  |  -10.36
5        --      --      --      --      --      --   -14.73  |  -15.06
6        --      --      --      --      --      --      --   |  -20.74

Simple calculations reveal:
- E[Î_n(I_[a,∞))] = P(X ∈ [a, ∞))
- Var[Î_n(I_[a,∞))] = (1/n) P(X ∈ [a, ∞))(1 − P(X ∈ [a, ∞)))
- So the relative standard deviation is ∼ (n P(X ∈ [a, ∞)))^(−1/2).
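The failure mode in the table is easy to reproduce. This sketch (my own illustration, not code from the talk) estimates P(X ≥ 4) ≈ 3.2 × 10⁻⁵ by naive Monte Carlo:

```python
import random
import math

def naive_tail_estimate(a, n, seed=0):
    """I_n(I_[a,inf)): the fraction of n standard-normal draws landing in [a, inf)."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) >= a for _ in range(n)) / n

true_p = 0.5 * math.erfc(4.0 / math.sqrt(2.0))   # 1 - Phi(4), about 3.17e-5
small = naive_tail_estimate(4.0, 10_000)         # expected hit count: about 0.3
large = naive_tail_estimate(4.0, 200_000)        # expected hit count: about 6
```

With 10⁴ samples the expected number of hits is below one, so the estimate is usually exactly zero; even with 2 × 10⁵ samples the relative standard deviation is of order (np)^(−1/2) ≈ 40%.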
Variance Reduction

- Want p̂_n such that p̂_n ≈ P(X ∈ A) =: p:
  - ideally, with E[p̂_n] = p,
  - such that Var(p̂_n) ≪ p²,
  - for modest n.
- Controlling variance is the key issue. Approaches:
  - Importance sampling.
  - Splitting.
  - Interacting particle systems.
  - Sequential Monte Carlo.
Importance Sampling — A Change of Measure View

- If:
  - X ∼ f,
  - Y ∼ g,
  - f ≪ g,
  - w(x) := (df/dg)(x),
- Then:
  - E[ϕ(X)] ≡ E[w(Y)ϕ(Y)]
- So, if Y₁, … iid ∼ g, then:
    lim_{n→∞} (1/n) Σ_{i=1}^n w(Y_i)ϕ(Y_i) = E[ϕ(X)]  a.s.
  and this is an unbiased estimator for any n.
Importance Sampling Variance

The variance of this estimator is:

  Var[ (1/n) Σ_{i=1}^n w(Y_i)ϕ(Y_i) ]
    = (1/n) Var[w(Y₁)ϕ(Y₁)]
    = (1/n) { E[(w(Y₁)ϕ(Y₁))²] − E[w(Y₁)ϕ(Y₁)]² }
    = (1/n) { ∫ (w(y)ϕ(y))² g(dy) − (∫ w(y)ϕ(y) g(dy))² }
    = (1/n) { ∫ w(y)ϕ²(y) f(dy) − E[ϕ(X)]² }
Optimal Importance Sampling

Proposition. Let X ∼ f, where f(dx) = f(x)dx, with values in (E, E), and let ϕ : E → (0, ∞) be a function of interest. The proposal which minimizes the variance of the importance sampling estimator of E[ϕ(X)] is g(x)dx, where:

  g(x) = f(x)ϕ(x) / ∫ f(y)ϕ(y) dy.

Note: if E ⊃ A ⊃ supp ϕ, it suffices for f|_A ≪ g|_A.
Importance Sampling and the Toy Problem

log(Î_{10^k}(I_[a,∞))) using g(x) = exp(−(x − a)) I_[a,∞)(x):

a \ k     1        2        3        4        5        6        7     | log(1 − Φ(a))
1      -1.72    -1.84    -1.83    -1.84    -1.84    -1.84    -1.84   |   -1.84
2      -3.63    -3.78    -3.79    -3.78    -3.78    -3.78    -3.78   |   -3.78
3      -6.43    -6.59    -6.63    -6.60    -6.61    -6.61    -6.61   |   -6.61
4     -10.16   -10.34   -10.40   -10.35   -10.36   -10.36   -10.36   |  -10.36
5     -14.85   -15.04   -15.12   -15.06   -15.07   -15.06   -15.06   |  -15.06
6     -20.51   -20.72   -20.81   -20.73   -20.73   -20.74   -20.74   |  -20.74
7     -27.16   -27.37   -27.46   -27.38   -27.39   -27.38   -27.38   |  -27.38
8     -34.79   -35.01   -35.10   -35.02   -35.01   -35.01   -35.01   |  -35.01
9     -43.41   -43.64   -43.73   -43.63   -43.63   -43.63   -43.62   |  -43.63
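The proposal on this slide is easy to implement: draw Y = a + E with E ∼ Exp(1), so Y has density g(y) = exp(−(y − a)) on [a, ∞), and weight by w(y) = f(y)/g(y). A sketch in standard library Python; the threshold and sample size are my illustrative choices:

```python
import random
import math

def is_tail_estimate(a, n, seed=0):
    """Importance sampling estimate of P(X >= a) for X ~ N(0, 1),
    using the proposal g(x) = exp(-(x - a)) I_[a,inf)(x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = a + rng.expovariate(1.0)                             # Y ~ g
        f_y = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)  # N(0,1) density at y
        g_y = math.exp(-(y - a))                                 # proposal density at y
        total += f_y / g_y        # w(Y) I_[a,inf)(Y); the indicator is always 1 here
    return total / n

a = 6.0
true_p = 0.5 * math.erfc(a / math.sqrt(2.0))    # 1 - Phi(6), of order 1e-9
est = is_tail_estimate(a, 10_000)
```

Every sample lands in the rare set and the weights are well behaved, so 10⁴ samples suffice for a few-percent relative error on a probability of order 10⁻⁹, which is the behaviour shown in the table.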
So far

- Rare event probabilities are important,
- but not often available analytically.
- Monte Carlo appears to offer a solution;
- naïve approaches fail due to their high variance.
- Importance sampling can help dramatically;
- but can be very difficult (i.e. impossible) to implement effectively.

Next

- Two classes of problems;
- potential solutions in these settings.
Problem Formulation

Here we consider algorithms which are applicable to two types of rare event, both of which are defined in terms of the canonical Markov chain:

  ( Ω = ∏_{n=0}^∞ E_n, F = ∏_{n=0}^∞ F_n, (X_n)_{n∈ℕ}, P_{η₀} ),

where the law P_{η₀} is defined by its finite-dimensional distributions:

  P_{η₀} ∘ X_{0:N}⁻¹(dx_{0:N}) = η₀(dx₀) ∏_{i=1}^N M_i(x_{i−1}, dx_i).
Static Rare Events

We term the first type of rare events which we consider static rare events:

- They are defined as the probability that the first P + 1 elements of the canonical Markov chain lie in a rare set, T.
- That is, we are interested in
    P_{η₀}(X_{0:P} ∈ T)
  and the associated conditional distribution:
    P_{η₀}(X_{0:P} ∈ dx_{0:P} | X_{0:P} ∈ T).
- We assume that the rare event is characterised as a level set of a suitable potential function:
    V : T → [V̂, ∞),  and  V : E_{0:P} \ T → (−∞, V̂).
A Simple Example: Normal Random Walks

- A toy example:
  - η₀(x₀) = N(x₀; 0, 1).
  - M_n(x_{n−1}, x_n) = N(x_n; x_{n−1}, 1).
  - V(x_{0:P}) = x_P.
  - T = V⁻¹([V̂, ∞)).
- So:
  - X_P ∼ N(0, P + 1)
  - P(X_{0:P} ∈ T) = P(X_P ≥ V̂) = 1 − Φ(V̂ / √(P + 1))
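For reference, a sketch of this toy model in Python (my own illustration): it simulates the chain, forms the crude Monte Carlo estimate of P(X_P ≥ V̂), and compares it with the closed form 1 − Φ(V̂/√(P + 1)); a modest threshold is used so that plain Monte Carlo still works.

```python
import random
import math

def simulate_chain(P, rng):
    """One path x_{0:P}: x_0 ~ N(0, 1) and x_n | x_{n-1} ~ N(x_{n-1}, 1)."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(P):
        x.append(rng.gauss(x[-1], 1.0))
    return x

def crude_static_estimate(P, v_hat, n, seed=0):
    """Plain MC estimate of P(X_P >= v_hat) for the toy chain."""
    rng = random.Random(seed)
    return sum(simulate_chain(P, rng)[-1] >= v_hat for _ in range(n)) / n

P, v_hat = 9, 6.0
# X_P ~ N(0, P + 1), so the exact answer is 1 - Phi(v_hat / sqrt(P + 1)).
truth = 0.5 * math.erfc(v_hat / math.sqrt(2.0 * (P + 1)))
est = crude_static_estimate(P, v_hat, 100_000)
```

For larger V̂ the crude estimator fails exactly as on the earlier table, which is what the particle methods below address.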
A Slightly Harder Problem

- η₀(x₀) = N(x₀; 0, 1).
- M_n(x_{n−1}, x_n) = N(x_n; x_{n−1}, 1).
- V(x_{0:P}) = max_{0≤i≤P} x_i.
- T = V⁻¹([V̂, ∞)).
A Real Problem: Polarization Mode Dispersion

This example was considered in (3, 8). We have a sequence of polarization vectors, r_n, which evolve according to the equation:

  r_n = R(θ_n, φ_n) r_{n−1} + (1/2) S(θ_n)

where:
- φ_n ∼ U[−π, π],
- cos(θ_n) ∼ U[−1, 1], s_n = sgn(θ_n) ∼ U{−1, +1},
- S(θ) = (cos(θ), sin(θ), 0) and R(θ, φ) is the matrix which describes a rotation of φ about the axis S(θ).

The rare events of interest are system trajectories for which |r_P| > D, where D is some threshold.
Feynman-Kac Formulæ and Interacting Particle Systems. . . or SMC [Cf. Anthony Lee's Talk]

- Want a black box method.
- Using only:
  - samples from η₀ and M_n(x_{n−1}, ·),
  - pointwise evaluation of V.
- Assume spatial homogeneity and that V : x_{0:P} ↦ V(x_P).
- Recall the core of an SMC algorithm; iteratively, sample:

    X_n^1, …, X_n^N iid ∼ Σ_{i=1}^N G_{n−1}(X_{n−1}^i) M_n(X_{n−1}^i, ·) / Σ_{j=1}^N G_{n−1}(X_{n−1}^j).

- Provides unbiased, consistent estimates of:

    E_{η₀}[ ∏_{p=0}^{P−1} G(X_p) ]   and   E_{η₀}[ ∏_{p=0}^{P−1} G(X_p) ϕ(X_P) ].

- Cf. (4) for more details.
The Approach of Del Moral and Garnier

One SMC approach to the static problem makes use of:
- η₀ and M_n(x_{n−1}, dx_n) from the original model,
- G_n(x_n) = exp(βV(x_n)).

Or the more flexible path space formulation, Ẽ_n = ⊗_{p=0}^n E_p:
- η̃₀ = η₀ as before and
    M̃_n(x_{n−1,0:n−1}, dx_{n,0:n}) = δ_{x_{n−1,0:n−1}}(dx_{n,0:n−1}) M_n(x_{n−1,n−1}, dx_{n,n}),
- and either
    G_n(x_{0:n}) = exp(βV(x_n))
  or
    G_n(x_{0:n}) = exp(α(V(x_n) − V(x_{n−1}))).
- A black box, but with parameters.
- Underlying identity:

    P(X_P ∈ V⁻¹([V̂, ∞))) = P(V(X_P) ≥ V̂)
      = E[ I_[V̂,∞)(V(X_P)) ]
      = E[ ∏_{n=0}^{P−1} G(X_n) · I_[V̂,∞)(V(X_P)) · ∏_{n=0}^{P−1} 1/G(X_n) ]

- Basic algorithm:
  - Sample X_0^1, …, X_0^N iid ∼ η₀. Set Y_0^1 = … = Y_0^N = 1.
  - For p = 1 to P, for i = 1 to N:
    - Compute z_{p−1} = (1/N) Σ_{j=1}^N G_{p−1}(X_{p−1}^j).
    - Sample A_p^i ∼ Σ_{j=1}^N G_{p−1}(X_{p−1}^j) δ_j(·) / (N z_{p−1}).
    - Sample X_p^i ∼ M_p(X_{p−1}^{A_p^i}, ·).
    - Set Y_p^i = Y_{p−1}^{A_p^i} / G_{p−1}(X_{p−1}^{A_p^i}).
  - Report:
      p̂ = ∏_{p=0}^{P−1} z_p · (1/N) Σ_{i=1}^N I_[V̂,∞)(V(X_P^i)) Y_P^i.
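A compact sketch of this basic algorithm for the toy Gaussian random walk (standard library Python; the choices V(x_{0:P}) = x_P, G(x) = exp(βx), and the values of β and N are mine, for illustration):

```python
import random
import math

def ips_rare_event(P, v_hat, beta, N, seed=0):
    """Potential-tilted IPS estimate of P(X_P >= v_hat) for the toy
    Gaussian random walk, with G(x) = exp(beta * x)."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(N)]    # X_0^i iid from eta_0
    Y = [1.0] * N                                  # corrections Y_0^i = 1
    prod_z = 1.0
    for _ in range(P):
        g = [math.exp(beta * x) for x in X]
        z = sum(g) / N                             # z_{p-1}
        prod_z *= z
        A = rng.choices(range(N), weights=g, k=N)  # ancestors, prob. prop. to G
        X = [rng.gauss(X[a], 1.0) for a in A]      # X_p^i ~ M_p(X_{p-1}^{A_p^i}, .)
        Y = [Y[a] / g[a] for a in A]               # Y_p^i = Y_{p-1}^{A_p^i} / G(X_{p-1}^{A_p^i})
    # report: prod_p z_p * (1/N) sum_i I(X_P^i >= v_hat) Y_P^i
    return prod_z * sum(y for x, y in zip(X, Y) if x >= v_hat) / N

P, v_hat = 9, 6.0
truth = 0.5 * math.erfc(v_hat / math.sqrt(2.0 * (P + 1)))  # 1 - Phi(v_hat / sqrt(P+1))
est = ips_rare_event(P, v_hat, beta=0.1, N=50_000)
```

Setting β = 0 recovers plain Monte Carlo (all weights equal, all Y ≡ 1); a positive β pushes particles towards large values, and the Y corrections keep the estimator unbiased. Choosing β well matters: too large a value causes severe weight degeneracy.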
SMC Samplers

Actually, SMC techniques can be used to sample from any sequence of distributions...

- Given a sequence of target distributions, {π_n}, on measurable spaces (E_n, E_n),
- construct a synthetic sequence {π̃_n} on the product spaces ⊗_{p=1}^n (E_p, E_p) by introducing arbitrary auxiliary Markov kernels, L_p : E_{p+1} ⊗ E_p → [0, 1]:

    π̃_n(dx_{1:n}) = π_n(dx_n) ∏_{p=1}^{n−1} L_p(x_{p+1}, dx_p),

  each of which admits the corresponding target distribution as its final time marginal.
SMC Outline — One Iteration

- Given a sample {X_{1:n−1}^(i)}_{i=1}^N targeting π̃_{n−1}, for i = 1 to N:
  - sample X_n^(i) ∼ K_n(X_{n−1}^(i), ·),
  - calculate

      W_n(X_{1:n}^(i)) = π̃_n(X_{1:n}^(i)) / [ π̃_{n−1}(X_{1:n−1}^(i)) K_n(X_{n−1}^(i), X_n^(i)) ]

        = [ π_n(X_n^(i)) ∏_{p=1}^{n−1} L_p(X_{p+1}^(i), X_p^(i)) ] / [ π_{n−1}(X_{n−1}^(i)) ∏_{p=1}^{n−2} L_p(X_{p+1}^(i), X_p^(i)) K_n(X_{n−1}^(i), X_n^(i)) ]

        = π_n(X_n^(i)) L_{n−1}(X_n^(i), X_{n−1}^(i)) / [ π_{n−1}(X_{n−1}^(i)) K_n(X_{n−1}^(i), X_n^(i)) ].

- Resample, yielding {X_{1:n}^(i)}_{i=1}^N targeting π̃_n.
Alternative SMC Summary

At each iteration, given a set of weighted samples {X_{n−1}^(i), W_{n−1}^(i)}_{i=1}^N targeting π_{n−1}:

- Sample X_n^(i) ∼ K_n(X_{n−1}^(i), ·):
    {(X_{n−1}, X_n)^(i), W_{n−1}^(i)}_{i=1}^N ∼ π_{n−1}(x_{n−1}) K_n(x_{n−1}, x_n).
- Set weights
    W_n^(i) = W_{n−1}^(i) π_n(X_n^(i)) L_{n−1}(X_n^(i), X_{n−1}^(i)) / [ π_{n−1}(X_{n−1}^(i)) K_n(X_{n−1}^(i), X_n^(i)) ]:
    {(X_{n−1}, X_n)^(i), W_n^(i)}_{i=1}^N ∼ π_n(x_n) L_{n−1}(x_n, x_{n−1}) and, marginally, {X_n^(i), W_n^(i)}_{i=1}^N ∼ π_n.
- Resample to obtain an unweighted particle set.
- Hints that we'd like
    L_{n−1}(x_n, x_{n−1}) = π_{n−1}(x_{n−1}) K_n(x_{n−1}, x_n) / ∫ π_{n−1}(x′_{n−1}) K_n(x′_{n−1}, x_n) dx′_{n−1}.
Key Points of SMC

- An iterative technique for sampling from a sequence of similar distributions.
- By use of intermediate distributions, we can obtain well-behaved weighted samples from intractable distributions,
- and estimate associated normalising constants.
- Can be interpreted as a mean field approximation of a Feynman-Kac flow (2).
Static Rare Events: Path-space Approach

- Begin by sampling a set of paths from the law of the Markov chain.
- Iteratively obtain samples from a sequence of distributions which moves "smoothly" towards one which places the majority of its mass on the rare set.
- We construct our sequence of distributions via a potential function and a sequence of inverse temperature parameters:

    π_t(dx_{0:P}) ∝ P_{η₀}(dx_{0:P}) g_{t/T}(x_{0:P}),
    g_θ(x_{0:P}) = [ 1 + exp(−α(θ)(V(x_{0:P}) − V̂)) ]⁻¹.

- Estimate the normalising constant of the final distribution and correct via importance sampling.
Path Sampling — An Alternative Approach to Estimating Normalizing Constants

- An integral expression for the log normalising constant of sufficiently regular distributions.
- Given a sequence of densities p(x|θ) = q(x|θ)/z(θ):

    (d/dθ) log z(θ) = E_θ[ (d/dθ) log q(·|θ) ]   (⋆)

  where the expectation is taken with respect to p(·|θ).
- Consequently, we obtain:

    log( z(1)/z(0) ) = ∫₀¹ E_θ[ (d/dθ) log q(·|θ) ] dθ.

- See the Path Sampling Identity slide at the end, or (5), for details; also (8, 12) in the SMC context. In our case, we use our particle system to approximate both integrals.
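The identity (⋆) is easy to exercise on a family with a known normalising constant. The sketch below (my own illustration, not the talk's particle estimate) takes q(x|θ) = exp(−θx²/2), for which z(θ) = √(2π/θ) and (d/dθ) log q(x|θ) = −x²/2, estimates the expectations by exact sampling from p(·|θ) = N(0, 1/θ), and applies the trapezoidal rule in θ:

```python
import random
import math

def path_sampling_log_ratio(theta_grid, n, seed=0):
    """Estimate log z(theta_end) - log z(theta_start) via the path
    sampling identity, for q(x|theta) = exp(-theta * x^2 / 2)."""
    rng = random.Random(seed)
    # E_theta[(d/dtheta) log q] = -E_theta[X^2] / 2, with X ~ N(0, 1/theta)
    expectations = []
    for theta in theta_grid:
        sd = 1.0 / math.sqrt(theta)
        expectations.append(-sum(rng.gauss(0.0, sd) ** 2 for _ in range(n)) / (2 * n))
    # trapezoidal rule for the integral over theta, as in the estimator below
    total = 0.0
    for k in range(1, len(theta_grid)):
        dt = theta_grid[k] - theta_grid[k - 1]
        total += dt * (expectations[k - 1] + expectations[k]) / 2.0
    return total

grid = [1.0 + 0.05 * i for i in range(21)]   # theta from 1 to 2
est = path_sampling_log_ratio(grid, 20_000)
truth = -0.5 * math.log(2.0)                 # log(z(2)/z(1)), since z = sqrt(2*pi/theta)
```

In the rare event setting the exact samples are replaced by the weighted particle approximation of p(·|θ), but the trapezoidal discretisation is the same.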
Static Rare Events: Framework

Initialization proceeds via importance sampling:

At t = 0:
for i = 1 to N do
  Sample X_0^(i) ∼ ν for some importance distribution ν.
  Set W_0^(i) ∝ π_0(X_0^(i)) / ν(X_0^(i)), normalised such that Σ_{j=1}^N W_0^(j) = 1.
end for
Samples are obtained from our sequence of distributions using SMC techniques.

for t = 1 to T do
  If ESS < threshold, then resample {W_{t−1}^(i), X_{t−1}^(i)}_{i=1}^N.
  If desired, apply a Markov kernel, K̃_{t−1}, of invariant distribution π_{t−1} to improve sample diversity.
  for i = 1 to N do
    Sample X_t^(i) ∼ K_t(X_{t−1}^(i), ·).
    Weight W_t^(i) ∝ W_{t−1}^(i) w_t^(i), where the incremental importance weight w_t^(i) is defined through
      w_t^(i) = π_t(X_t^(i)) L_{t−1}(X_t^(i), X_{t−1}^(i)) / [ π_{t−1}(X_{t−1}^(i)) K_t(X_{t−1}^(i), X_t^(i)) ],
    and Σ_{j=1}^N W_t^(j) = 1.
  end for
end for
Finally, an estimate can be obtained:

Approximate the path sampling identity to estimate the normalising constant:

  Ẑ₁ = exp[ Σ_{t=1}^T (α(t/T) − α((t−1)/T)) (Ê_{t−1} + Ê_t)/2 ],

  Ê_t = [ Σ_{j=1}^N W_t^(j) (V(X_t^(j)) − V̂) / (1 + exp(α_t (V(X_t^(j)) − V̂))) ] / [ Σ_{j=1}^N W_t^(j) ].

Estimate the rare event probability using importance sampling:

  p̂ = Ẑ₁ · [ Σ_{j=1}^N W_T^(j) (1 + exp(−α(1)(V(X_T^(j)) − V̂))) I_[V̂,∞)(V(X_T^(j))) ] / [ Σ_{j=1}^N W_T^(j) ].
Example: Gaussian Random Walk

- A toy example: M_n(x_{n−1}, x_n) = N(x_n; x_{n−1}, 1).
- T = V⁻¹([V̂, ∞)).
- Proposal kernel:

    K_n(X_{n−1}, dx_n) = Σ_{j=−S}^S w_n(X_{n−1}, x_n) ∏_{i=1}^P δ_{X_{n−1,i} + ijδ}(dx_{n,i}),

  where the weight of individual moves is given by w_n(X_{n−1}, X_n) ∝ π_n(X_n).
- Linear annealing schedule.
- Number of distributions T ∝ V̂^(3/2) (T = 2500 when V̂ = 25).
Gaussian Random Walk Example Results

[Figure: log probability (0 down to −60) against threshold V̂ (5 to 40), comparing the true values with IPS using 100, 1,000 and 20,000 particles and SMC using 100 particles.]
Typical SMC Run -- All Particles

[Figure: Markov chain state value (−5 to 30) against state number (0 to 16) for every particle in a typical SMC run.]
Typical IPS Run -- Particles Which Hit The Rare Set

[Figure: Markov chain state value (−5 to 30) against state number (0 to 16) for the IPS particles which hit the rare set.]
Example: Polarization Mode Dispersion

This example was considered in (3). We have a sequence of polarization vectors, r_n, which evolve according to the equation:

  r_n = R(θ_n, φ_n) r_{n−1} + (1/2) Ω(θ_n)

where:
- φ_n ∼ U[−π, π],
- cos(θ_n) ∼ U[−1, 1], s_n = sgn(θ_n) ∼ U{−1, +1},
- Ω(θ) = (cos(θ), sin(θ), 0) and R(θ, φ) is the matrix which describes a rotation of φ about the axis Ω(θ).

The rare events of interest are system trajectories for which |r_P| > D, where D is some threshold.
- We use a π_n-invariant MCMC proposal for K_n and the associated time-reversal kernel for L_{n−1}.
- This leads to:

    W_n(X_{n−1}, X_n) = π_n(X_{n−1}) / π_{n−1}(X_{n−1}),

  allowing sampling and resampling to be exchanged (cf. (7)).
- (Precisely, we employed a Metropolis-Hastings kernel with a proposal which randomly selects two indices uniformly between 1 and n and proposes replacing the φ and cos(θ) values between those two indices with values drawn from the uniform distribution over [−π, π] × [−1, 1]. This proposal is then accepted with the usual Metropolis acceptance probability.)
Example: Polarization Mode Dispersion - The PDF

[Figure: PDF estimates for PMD; density (10⁰ down to 10⁻¹⁴, log scale) against DGD (0 to 9), comparing IPS with α = 1, IPS with α = 3, and SMC.]
Dynamic Rare Events

The other class of rare events in which we are interested are termed dynamic rare events:

- They correspond to the probability that a Markov process hits some rare set, T, before its first entrance to some recurrent set, R.
- That is, given the stopping time τ = inf{p : X_p ∈ T ∪ R}, we seek
    P_{η₀}(X_τ ∈ T)
  and the associated conditional distribution:
    P_{η₀}(τ = t, X_{0:t} ∈ dx_{0:t} | X_τ ∈ T).
Dynamic Rare Events: Illustration

[Figure: a trajectory wandering between the rare set T and the recurrent set R.]
Importance Sampling

- Use a second Markov process on the same space:

    ( Ω = ∏_{n=0}^∞ E_n, F = ∏_{n=0}^∞ F_n, (X_n)_{n∈ℕ}, P̃_{η̃₀} ),

  where the law P̃_{η̃₀} is defined by its finite-dimensional distributions:

    P̃_{η̃₀} ∘ X_{0:N}⁻¹(dx_{0:N}) = η̃₀(dx₀) ∏_{i=1}^N M̃_i(x_{i−1}, dx_i).

- So that:

    (dP_{η₀} ∘ X_{0:N}⁻¹ / dP̃_{η̃₀} ∘ X_{0:N}⁻¹)(x_{0:N}) = (dη₀/dη̃₀)(x₀) ∏_{n=1}^N (dM_n(x_{n−1}, ·) / dM̃_n(x_{n−1}, ·))(x_n).
Importance Sampling

- Then:

    P_{η₀}(X_τ ∈ T) = E_{η₀}[I_T(X_τ)] = Σ_{t=0}^∞ E_{η₀}[ I_{{t}}(τ) I_T(X_t) ]
      = Σ_{t=0}^∞ Ẽ_{η̃₀}[ I_{{t}}(τ) I_T(X_t) (dP/dP̃)(X_{0:∞}) ]
      = Σ_{t=0}^∞ Ẽ_{η̃₀}[ Ẽ_{η̃₀}[ I_{{t}}(τ) I_T(X_t) (dP/dP̃)(X_{0:∞}) | F_t ] ]
      = Σ_{t=0}^∞ Ẽ_{η̃₀}[ I_{{t}}(τ) I_T(X_t) (dP ∘ X_{0:t}⁻¹ / dP̃ ∘ X_{0:t}⁻¹)(X_{0:t}) ]
      = Ẽ_{η̃₀}[ I_T(X_τ) (dP ∘ X_{0:τ}⁻¹ / dP̃ ∘ X_{0:τ}⁻¹)(X_{0:τ}) ].

- But how do you choose η̃₀ and M̃_n?
Splitting

- If V increases towards T, could consider:
    T₁ = V⁻¹([V₁, ∞)) ⊃ T₂ = V⁻¹([V₂, ∞)) ⊃ · · · ⊃ T_L = T
- and the non-decreasing first hitting times:
    τ_l = inf{t : X_t ∈ R ∪ T_l}
- which yield the decomposition:
    P(X_τ ∈ T) = P(X_{τ₁} ∈ T₁) ∏_{l=2}^L P(X_{τ_l} ∈ T_l | X_{τ_{l−1}} ∈ T_{l−1})
- and the multilevel splitting method [dating back to (9)].
Multilevel Splitting

[Figure: V(X_t) against t, a trajectory crossing successive levels V₀, V₁, V₂ up to V̂.]
A Simple Splitting Algorithm

- Sample X¹_{1:τ₁¹}, …, X^{N₁}_{1:τ₁^{N₁}} iid from P.
- Compute G(X^i_{1:τ₁^i}) = I_{T₁}(X^i_{τ₁^i}). Let S₁ = Σ_{i=1}^{N₁} G(X^i_{1:τ₁^i}).
- For l = 2 to L:
  - Set N_l = r S_{l−1}.
  - Let (X̂^i_{τ̂_{l−1}})_{i=1:N_l} comprise r copies of each X^i_{τ_{l−1}} ∈ T_{l−1}.
  - For i = 1, …, N_l: sample X^i_{τ̂^i_{l−1}:τ_l^i} ∼ P(· | {X^i_{τ_{l−1}} = X̂^i_{τ̂_{l−1}}}).
  - Compute G(X^i_{τ̂^i_{l−1}:τ_l^i}) = I_{T_l}(X^i_{τ_l^i}). Let S_l = Σ_{i=1}^{N_l} G(X^i_{τ̂^i_{l−1}:τ_l^i}).
- Compute
    P̂(X_τ ∈ T) = ∏_{l=1}^L S_l / N_l.
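To see the mechanics, here is a sketch of the fixed-factor scheme on a chain where the answer is exact: a ±1 random walk with up-probability 0.4, started at 1, with R = {0}, T = {10}, and levels at 2, …, 10 (all of these choices are mine; gambler's ruin gives the true probability). Because the walk is Markov, only the hitting state needs to be copied, not the whole path.

```python
import random

def run_to_level(x, level, rng, p_up=0.4):
    """Run the +/-1 walk from x until it hits 0 (recurrent set) or `level`."""
    while 0 < x < level:
        x += 1 if rng.random() < p_up else -1
    return x

def splitting_estimate(n1, r, levels, rng):
    """Multilevel splitting with r copies of every path reaching each level."""
    prob = 1.0
    survivors = [1] * n1                  # all paths start in state 1
    for level in levels:
        n = len(survivors)
        hits = [x for x in (run_to_level(s, level, rng) for s in survivors)
                if x == level]
        if not hits:
            return 0.0                    # the whole population died out
        prob *= len(hits) / n             # factor S_l / N_l
        survivors = hits * r              # r copies of each successful path
    return prob

rng = random.Random(3)
est = splitting_estimate(n1=2000, r=2, levels=list(range(2, 11)), rng=rng)
ratio = 0.6 / 0.4                          # q/p for the gambler's-ruin formula
truth = (1 - ratio) / (1 - ratio ** 10)    # P(hit 10 before 0 | start at 1)
```

The choice of r governs the population dynamics, which is exactly the branching process issue discussed on the next slide.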
Adaptive Multilevel Splitting 1 — Choosing r_l

- Describes a branching process.
- If E[r S_l] ≠ N_l, then essentially:
  - the particle system dies out eventually, or
  - N_l grows exponentially fast.
- See (6, 10) for some preliminary analysis.
- See (11) for two-stage schemes in which a preliminary run specifies the r_l.
Adaptive Multilevel Splitting 2 — Choosing the Levels

- Let τ₁^i = inf{t : X_t ∈ R}.
- Sample X¹_{1:τ₁¹}, …, X^{N₁}_{1:τ₁^{N₁}} iid from P.
- Compute X̌₁^i = max{X₁^i, …, X^i_{τ₁^i}}. Set V₁ = X̌₁^(⌊αN₁⌋) (the ⌊αN₁⌋-th order statistic).
- While V_l < V̂: l ← l + 1.
  - Set N_l = N₁.
  - Let (X̂^i_{τ̂_{l−1}})_{i=1:N_l} comprise r copies of each X^i which reached V_{l−1}, up to the time when it reached it.
  - For i = 1, …, N_l: sample X^i_{τ̂^i_{l−1}:τ_l^i} ∼ P(· | {X^i_{τ_{l−1}} = X̂^i_{τ̂_{l−1}}}), with τ_l^i = inf{t : X_t ∈ R}.
  - Compute X̌_l^i = max{X^i_{τ̂^i_{l−1}}, …, X^i_{τ_l^i}}. Set V_l = X̌_l^(⌊αN_l⌋).
- Compute
    P̂(X_τ ∈ T) = (1 − α)^{l−1} · (1/N₁) Σ_{i=1}^{N₁} I(τ_T^i < τ_l^i),
  where τ_T^i = inf{t ∈ {τ̂^i_{l−1}, …, τ_l^i} : X_t^i ∈ T}.
- Related adaptive methods can be unbiased (1).
Sequential Monte Carlo: Interacting Particle Systems

- Sample X¹_{1:τ₁¹}, …, X^N_{1:τ₁^N} iid from P.
- Compute G₁(X^i_{τ₁^i}) = I_{T₁}(X^i_{τ₁^i}). Let S₁ = Σ_{i=1}^N G₁(X^i_{τ₁^i}).
- For l = 2 to L:
  - Sample (X̂^i_{τ̂_{l−1}})_{i=1:N} iid from (1/S_{l−1}) Σ_{j=1}^N G_{l−1}(X^j_{τ_{l−1}}) δ_{X^j_{τ_{l−1}}}.
  - For i = 1, …, N: sample X^i_{τ̂^i_{l−1}:τ_l^i} ∼ P(· | {X^i_{τ_{l−1}} = X̂^i_{τ̂_{l−1}}}).
  - Compute G_l(X^i_{τ_l^i}) = I_{T_l}(X^i_{τ_l^i}). Let S_l = Σ_{i=1}^N G_l(X^i_{τ_l^i}).
- Compute
    P̂(X_τ ∈ T) = ∏_{l=1}^L S_l / N.
[1] C.-E. Bréhier, M. Gazeau, L. Goudenège, T. Lelièvre, and
M. Rousset. Unbiasedness of some generalized adaptive
multilevel splitting schemes. Technical Report 1505.02674,
ArXiv, 2015.
[2] P. Del Moral. Feynman-Kac formulae: genealogical and
interacting particle systems with applications. Probability and
Its Applications. Springer Verlag, New York, 2004.
[3] P. Del Moral and J. Garnier. Genealogical particle analysis of
rare events. Annals of Applied Probability, 15(4):2496–2534,
November 2005.
[4] A. Doucet and A. M. Johansen. A tutorial on particle filtering
and smoothing: Fifteen years later. In D. Crisan and
B. Rozovsky, editors, The Oxford Handbook of Nonlinear
Filtering, pages 656–704. Oxford University Press, 2011.
[5] A. Gelman and X.-L. Meng. Simulating normalizing
constants: From importance sampling to bridge sampling to
path sampling. Statistical Science, 13(2):163–185, 1998.
[6] P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic.
Multilevel splitting for estimating rare event probabilities.
Operations Research, 47(4):585–600, 1999.
[7] A. M. Johansen and A. Doucet. Auxiliary variable sequential
Monte Carlo methods. Research Report 07:09, University of
Bristol, Department of Mathematics – Statistics Group,
University Walk, Bristol, BS8 1TW, UK, July 2007. A
substantially shorter version focusing on particle filtering
appeared in Statistics and Probability Letters.
[8] A. M. Johansen, P. Del Moral, and A. Doucet. Sequential
Monte Carlo samplers for rare events. In Proceedings of the
6th International Workshop on Rare Event Simulation, pages
256–267, Bamberg, Germany, October 2006.
[9] H. Kahn and T. E. Harris. Estimation of particle transmission
by random sampling. National Bureau of Standards Applied
Mathematics Series, 12:27–30, 1951.
[10] A. Lagnoux. Rare event simulation. Probability in the
Engineering and Informational Sciences, 20(1):43–66, 2006.
[11] A. Lagnoux. A two-step branching splitting model under cost
constraint. Journal of Applied Probability, 46(2):429–452,
2009.
[12] Y. Zhou, A. M. Johansen, and J. A. D. Aston. Towards
automatic model comparison: An adaptive sequential Monte
Carlo approach. Journal of Computational and Graphical
Statistics, 2015. In press.
Path Sampling Identity

Given a probability density, p(x|θ) = q(x|θ)/z(θ):

  (∂/∂θ) log z(θ) = (1/z(θ)) (∂/∂θ) z(θ)
    = (1/z(θ)) (∂/∂θ) ∫ q(x|θ) dx
    = (1/z(θ)) ∫ (∂/∂θ) q(x|θ) dx      (⋆⋆)
    = ∫ (p(x|θ)/q(x|θ)) (∂/∂θ) q(x|θ) dx
    = ∫ p(x|θ) (∂/∂θ) log q(x|θ) dx = E_{p(·|θ)}[ (∂/∂θ) log q(·|θ) ]

wherever the interchange of differentiation and integration (⋆⋆) is permissible.