Auxiliary Particle Methods: Perspectives & Applications

Adam M. Johansen (adam.johansen@bristol.ac.uk)
Oxford University Man Institute — 29th May 2008
Collaborators include: Arnaud Doucet, Nick Whiteley
Introduction
Outline

- Background: Particle Filters
  - Hidden Markov Models & Filtering
  - Particle Filters & Sequential Importance Resampling
  - Auxiliary Particle Filters
- Interpretation
  - Auxiliary Particle Filters are SIR Algorithms
  - Theoretical Considerations
  - Practical Implications
- Applications
  - Auxiliary SMC Samplers
  - The Probability Hypothesis Density
  - On Stratification
- Conclusions
HMMs & Particle Filters
Hidden Markov Models

[Graphical model: hidden chain x1 → x2 → · · · → x6, each state xn emitting an observation yn.]

- Xn is an X-valued Markov chain with transition density f:
      Xn | {Xn−1 = xn−1} ∼ f(·|xn−1)
- Yn is a Y-valued stochastic process:
      Yn | {Xn = xn} ∼ g(·|xn)
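The generative structure above is easy to simulate. A minimal sketch, assuming for illustration an AR(1) transition f and an additive Gaussian emission g (the model itself leaves f and g general; all parameter values here are hypothetical):

```python
import random

def simulate_hmm(T, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """Draw (x_{1:T}, y_{1:T}) with X_n | x_{n-1} ~ f and Y_n | x_n ~ g."""
    rng = random.Random(seed)
    x, xs, ys = 0.0, [], []
    for _ in range(T):
        x = phi * x + rng.gauss(0.0, sigma_x)   # X_n ~ f(.|x_{n-1}): AR(1) step
        xs.append(x)
        ys.append(x + rng.gauss(0.0, sigma_y))  # Y_n ~ g(.|x_n): noisy observation
    return xs, ys

xs, ys = simulate_hmm(100)
```

The same simulated trajectory can serve as test data for the particle filters discussed below.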
HMMs & Particle Filters
Optimal Filtering

- The filtering distribution may be expressed recursively as:

      p(xn|y1:n−1) = ∫ p(xn−1|y1:n−1) f(xn|xn−1) dxn−1          (Prediction)

      p(xn|y1:n) = p(xn|y1:n−1) g(yn|xn) / ∫ p(xn|y1:n−1) g(yn|xn) dxn          (Update)

- The smoothing distribution may be expressed recursively as:

      p(x1:n|y1:n−1) = p(x1:n−1|y1:n−1) f(xn|xn−1)          (Prediction)

      p(x1:n|y1:n) = p(x1:n|y1:n−1) g(yn|xn) / ∫ p(x1:n|y1:n−1) g(yn|xn) dx1:n          (Update)
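When X is finite, the prediction and update integrals become sums and the recursion can be iterated exactly. A minimal sketch, assuming a hypothetical two-state chain (the transition matrix and likelihood values are purely illustrative):

```python
def predict(filt, trans):
    """Prediction: p(x_n|y_{1:n-1}) = sum_j p(x_{n-1}=j|y_{1:n-1}) f(x_n|j)."""
    K = len(filt)
    return [sum(filt[j] * trans[j][i] for j in range(K)) for i in range(K)]

def update(pred, lik):
    """Update: p(x_n|y_{1:n}) ∝ p(x_n|y_{1:n-1}) g(y_n|x_n)."""
    unnorm = [p * l for p, l in zip(pred, lik)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Illustrative two-state example: one prediction/update step.
trans = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical transition matrix f
filt = [0.5, 0.5]                  # prior p(x_1)
filt = update(predict(filt, trans), lik=[0.7, 0.2])  # g(y|x) values for some y
```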
HMMs & Particle Filters
Particle Filters: Sequential Importance Resampling

At time n = 1:
- Sample X1,1^i ∼ q1(·).
- Weight W1^i ∝ f(X1,1^i) g(y1|X1,1^i) / q1(X1,1^i).
- Resample.

At times n > 1, iterate:
- Sample Xn,n^i ∼ qn(·|Xn−1,n−1^i). Set Xn,1:n−1^i = Xn−1^i.
- Weight
      Wn^i ∝ f(Xn,n^i|Xn,n−1^i) g(yn|Xn,n^i) / qn(Xn,n^i|Xn,1:n−1^i)
- Resample.
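The SIR recursion above can be sketched concretely for the bootstrap choice qn = f (an assumption; the slides leave qn general), on the illustrative AR(1)/Gaussian model, so the f/q ratio in the weight cancels:

```python
import math, random

def sir_filter(ys, N=500, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=1):
    """Bootstrap SIR: propose from f, weight by g, resample multinomially."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, sigma_x) for _ in range(N)]
    means = []
    for y in ys:
        # Sample X_{n,n}^i ~ q_n(.|X_{n-1}^i) = f(.|X_{n-1}^i).
        xs = [phi * x + rng.gauss(0.0, sigma_x) for x in xs]
        # Weight W_n^i ∝ g(y_n|X_{n,n}^i); the f/q ratio cancels for q = f.
        ws = [math.exp(-0.5 * ((y - x) / sigma_y) ** 2) for x in xs]
        z = sum(ws)
        ws = [w / z for w in ws]
        means.append(sum(w * x for w, x in zip(ws, xs)))
        # Multinomial resampling.
        xs = rng.choices(xs, weights=ws, k=N)
    return means

est = sir_filter([0.5, 1.0, 0.8])  # filtering means for a toy observation sequence
```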
Auxiliary Particle Filters
Auxiliary [Variable] Particle Filters (Pitt & Shephard ’99)

If we have access to the next observation before resampling, we could use this structure:
- Pre-weight every particle with λn^(i) ∝ p̂(yn|Xn−1^(i)).
- Propose new states from the mixture distribution
      Σ_{i=1}^N λn^(i) qn(·|Xn−1^(i)) / Σ_{i=1}^N λn^(i).
- Weight samples, correcting for the pre-weighting:
      Wn^i ∝ f(Xn,n^i|Xn,n−1^i) g(yn|Xn,n^i) / (λn^i qn(Xn,n^i|Xn,1:n−1^i))
- Resample particle set.
Auxiliary Particle Filters
Some Well Known Refinements

We can tidy things up a bit:
1. The auxiliary variable step is equivalent to multinomial resampling.
2. So, there’s no need to resample before the pre-weighting.

Now we have:
- Pre-weight every particle with λn^(i) ∝ p̂(yn|Xn−1^(i)).
- Resample.
- Propose new states.
- Weight samples, correcting for the pre-weighting:
      Wn^i ∝ f(Xn,n^i|Xn,n−1^i) g(yn|Xn,n^i) / (λn^i qn(Xn,n^i|Xn,1:n−1^i))
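One step of the refined scheme above can be sketched on the illustrative AR(1)/Gaussian model. Here p̂(yn|xn−1) is taken to be g evaluated at the mean of f — Pitt & Shephard's common choice, and precisely the one the later slides caution about — and q = f; all concrete choices are assumptions for illustration:

```python
import math, random

def apf_step(xs, ws, y, phi=0.9, sigma_x=1.0, sigma_y=0.5, rng=None):
    """One auxiliary-particle-filter step: pre-weight, resample, propose, re-weight."""
    rng = rng or random.Random(0)
    N = len(xs)
    g = lambda y, x: math.exp(-0.5 * ((y - x) / sigma_y) ** 2)
    # Pre-weight: lambda^i ∝ w^i · p_hat(y_n|x_{n-1}^i), with p_hat(y|x) = g(y, E[X_n|x]).
    lam = [w * g(y, phi * x) for w, x in zip(ws, xs)]
    z = sum(lam)
    lam = [l / z for l in lam]
    # Resample according to the pre-weights.
    parents = rng.choices(range(N), weights=lam, k=N)
    # Propose from f; correct for the pre-weighting: W ∝ g(y|x_new) / p_hat(y|x_old).
    new_xs, new_ws = [], []
    for i in parents:
        x_new = phi * xs[i] + rng.gauss(0.0, sigma_x)
        new_xs.append(x_new)
        new_ws.append(g(y, x_new) / g(y, phi * xs[i]))
    z = sum(new_ws)
    return new_xs, [w / z for w in new_ws]

xs, ws = apf_step([0.0] * 100, [0.01] * 100, y=0.5)
```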
Interpretation
APFs without Auxiliary Variables
General SIR Algorithms

The SIR algorithm can be used somewhat more generally. Given {πn} defined on En = X^n:
- Sample Xn,n^i ∼ qn(·|Xn−1^i). Set Xn,1:n−1^i = Xn−1^i.
- Weight
      Wn^i ∝ πn(Xn^i) / (πn−1(Xn,1:n−1^i) qn(Xn,n^i|Xn,1:n−1^i))
- Resample.
APFs without Auxiliary Variables
An Interpretation of the APF

If we move the first step at time n + 1 to the last at time n, we get:
- Resample.
- Propose new states.
- Weight samples, correcting the earlier pre-weighting.
- Pre-weight every particle with λn+1^(i) ∝ p̂(yn+1|Xn^(i)).

This is an SIR algorithm targeting the sequence of distributions
      ηn(x1:n) ∝ p(x1:n|y1:n) p̂(yn+1|xn),
which allows estimation under the distribution of actual interest via importance sampling.
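The final importance-sampling correction is a one-liner: since the particles target ηn ∝ p(x1:n|y1:n) p̂(yn+1|xn), estimating under p just requires dividing each weight by its pre-weight. A hedged sketch (function and argument names are hypothetical):

```python
def correct_apf_estimate(ws, phi_vals, phat_vals):
    """Estimate E_p[phi] from particles weighted under eta_n ∝ p(x|y) p_hat(y_{n+1}|x):
    divide each weight by its pre-weight p_hat(y_{n+1}|x^i), then renormalise."""
    corr = [w / ph for w, ph in zip(ws, phat_vals)]
    z = sum(corr)
    return sum(c * p for c, p in zip(corr, phi_vals)) / z

# Toy check: target p = (0.2, 0.8) on values (0, 1), pre-weights p_hat = (2, 1),
# so the eta-weights are proportional to (0.4, 0.8), i.e. (1/3, 2/3).
m = correct_afp = correct_apf_estimate([1/3, 2/3], [0.0, 1.0], [2.0, 1.0])  # recovers 0.8
```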
Theoretical Considerations

- Direct analysis of the APF is largely unnecessary.
- Results can be obtained by considering the associated SIR algorithm.
- SIR has a (discrete-time) Feynman–Kac interpretation.
Theoretical Considerations
For example. . .

Proposition. Under standard regularity conditions,
      √N (ϕ̂n,APF^N − ϕ̄n) → N(0, σn²(ϕn)),
where
      σ1²(ϕ1) = ∫ [p(x1|y1)² / q1(x1)] (ϕ1(x1) − ϕ̄1)² dx1

      σn²(ϕn) = ∫ [p(x1|y1:n)² / q1(x1)] (∫ ϕn(x1:n) p(x2:n|y2:n, x1) dx2:n − ϕ̄n)² dx1
                + Σ_{k=2}^{n−1} ∫ [p(x1:k|y1:n)² / (p̂(x1:k−1|y1:k) qk(xk|xk−1))] (∫ ϕn(x1:n) p(xk+1:n|yk+1:n, xk) dxk+1:n − ϕ̄n)² dx1:k
                + ∫ [p(x1:n|y1:n)² / (p̂(x1:n−1|y1:n) qn(xn|xn−1))] (ϕn(x1:n) − ϕ̄n)² dx1:n.
Practical Implications

- It means we’re doing importance sampling.
- Choosing p̂(yn|xn−1) = p(yn|xn = E[Xn|xn−1]) is dangerous.
- A safer choice would be to ensure that
      sup_{xn−1,xn} g(yn|xn) f(xn|xn−1) / (p̂(yn|xn−1) q(xn|xn−1)) < ∞.
- Using the APF doesn’t ensure superior performance.
Practical Implications
A Contrived Illustration

Consider the following binary state-space model with common state and observation spaces:
      X = {0, 1},    p(x1 = 0) = 0.5,    p(xn = xn−1) = 1 − δ
      Y = X,    p(yn = xn) = 1 − ε.

- δ controls the ergodicity of the state process.
- ε controls the information contained in the observations.

Consider estimating E(X2|Y1:2 = (0, 1)).
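For a model this small the quantity of interest can be computed exactly by enumerating the four state paths, which is what makes the exact variance comparison below possible. A minimal sketch:

```python
def posterior_mean_x2(delta, eps):
    """E(X2 | Y_{1:2} = (0, 1)) for the binary model, by direct enumeration."""
    def f(x_new, x_old):            # p(x_n = x_new | x_{n-1} = x_old)
        return 1 - delta if x_new == x_old else delta
    def g(y, x):                    # p(y_n = y | x_n = x)
        return 1 - eps if y == x else eps
    num = den = 0.0
    for x1 in (0, 1):
        for x2 in (0, 1):
            p = 0.5 * g(0, x1) * f(x2, x1) * g(1, x2)   # joint p(x1, x2, y1=0, y2=1)
            num += x2 * p
            den += p
    return num / den

m = posterior_mean_x2(delta=0.1, eps=0.25)  # returns 0.5625
```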
Practical Implications
Variance of SIR − Variance of APF

[Figure: surface plot of the difference in estimator variance as a function of δ ∈ (0.1, 1) and ε ∈ (0.05, 0.5); the difference takes both signs.]
Practical Implications
Variance Comparison at ε = 0.25

[Figure: variance as a function of δ for SISR and the APF, showing both empirical and asymptotic variances for each algorithm.]
Applications & Extensions
Auxiliary SMC Samplers
SMC Samplers (Del Moral, Doucet & Jasra, 2006)

SIR techniques can be adapted for any sequence of distributions {πn}.
- Define
      π̃n(x1:n) = πn(xn) Π_{k=1}^{n−1} Lk(xk+1, xk).
- Sample, at time n,
      Xn^i ∼ Kn(Xn−1^i, ·).
- Weight
      Wn^i ∝ πn(Xn^i) Ln−1(Xn^i, Xn−1^i) / (πn−1(Xn−1^i) Kn(Xn−1^i, Xn^i))
- Resample.
Auxiliary SMC Samplers
Auxiliary SMC Samplers

An auxiliary SMC sampler for a sequence of distributions πn comprises:
- An SMC sampler targeting some auxiliary sequence of distributions µn.
- A sequence of importance weight functions
      w̃n(xn) ∝ (dπn/dµn)(xn).

Generic approaches:
- Choose µn(xn) ∝ πn(xn) Vn(xn).
- Incorporate as much information as possible prior to resampling.
Auxiliary SMC Samplers
Resample-Move Approaches

A common approach in SMC samplers:
- Choose Kn(xn−1, xn) to be πn-invariant.
- Set
      Ln−1(xn, xn−1) = πn(xn−1) Kn(xn−1, xn) / πn(xn),
- leading to
      Wn(xn−1, xn) = πn(xn−1) / πn−1(xn−1).
- It would make more sense to resample and then move. . .
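A resample-move sampler with πn-invariant kernels can be sketched on a toy tempering problem. Everything concrete here is an illustrative assumption: the targets πn(x) ∝ exp(−βn x²/2), the temperature ladder, and the random-walk Metropolis choice for Kn; note the weight only involves πn/πn−1 at the current particles, as above:

```python
import math, random

def smc_sampler(betas, N=200, seed=2):
    """Resample-move SMC sampler for pi_n(x) ∝ exp(-beta_n x^2 / 2):
    weight by pi_n/pi_{n-1}, resample, then move with a pi_n-invariant RW-Metropolis kernel."""
    rng = random.Random(seed)
    # Initialise exactly from pi_1, a N(0, 1/beta_1) distribution.
    xs = [rng.gauss(0.0, 1.0 / math.sqrt(betas[0])) for _ in range(N)]
    for b_prev, b in zip(betas, betas[1:]):
        # Weight: W_n = pi_n(x_{n-1}) / pi_{n-1}(x_{n-1}).
        ws = [math.exp(-0.5 * (b - b_prev) * x * x) for x in xs]
        z = sum(ws)
        # Resample, then move with one Metropolis step invariant for pi_n.
        xs = rng.choices(xs, weights=[w / z for w in ws], k=N)
        moved = []
        for x in xs:
            prop = x + rng.gauss(0.0, 0.5)
            log_ratio = -0.5 * b * (prop * prop - x * x)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = prop
            moved.append(x)
        xs = moved
    return xs

xs = smc_sampler([1.0, 2.0, 4.0])  # final particles approximate N(0, 1/4)
```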
Auxiliary SMC Samplers
An Interpretation of Pure Resample-Move

It’s SIR with auxiliary distributions. . .
      µn(xn) = πn+1(xn)
      Ln−1(xn, xn−1) = µn−1(xn−1) Kn(xn−1, xn) / µn−1(xn) = πn(xn−1) Kn(xn−1, xn) / πn(xn)
      wn(xn−1:n) = µn(xn) / µn−1(xn) = πn+1(xn) / πn(xn)
      w̃n(xn) = µn−1(xn) / µn(xn) = πn(xn) / πn+1(xn).
Auxiliary SMC Samplers
Piecewise-Deterministic Processes

[Figure: a sample piecewise-deterministic trajectory in the (x, y) plane.]
Auxiliary SMC Samplers
“Filtering” of Piecewise-Deterministic Processes (Whiteley, Johansen & Godsill, 2007)

- Xn = (kn, τn,1:kn, θn,0:kn) specifies a continuous-time trajectory (ζt)t∈[0,tn].
- ζt is observed in the presence of noise, e.g. Yn = ζtn + Vn.
- Target sequence of distributions {πn} on nested spaces:
      πn(k, τ1:k, θ0:k) ∝ q(θ0) Π_{j=1}^k f(τj|τj−1) q(θj|τj, θj−1, τj−1) × S(tn, τk) Π_{p=1}^n g(yp|ζtp)
Auxiliary SMC Samplers
“Filtering” of Piecewise-Deterministic Processes (Whiteley, Johansen & Godsill, 2007)

- πn yields posterior marginal distributions for the recent history of ζt.
- Employ auxiliary SMC samplers.
- Choose µn(k, τ1:k, θ0:k) ∝ πn(k, τ1:k, θ0:k) Vn(τk, θk).
- The kernel proposes new pairs (τj, θj) and adjusts recent pairs.
- Applications in object tracking and finance.
The Auxiliary SMC-PHD Filter
Probability Hypothesis Density Filtering (Mahler, 2003; Vo et al., 2005)

- Approximates the optimal filter for a class of spatial point-process-valued HMMs.
- A recursion for intensity functions:
      αn(xn) = ∫_E f(xn|xn−1) pS(xn−1) α̂n−1(xn−1) dxn−1 + γ(xn)
      α̂n(xn) = [1 − pD(xn) + Σ_{r=1}^{mn} ψn,r(xn)/Zn,r] αn(xn).
- Approximate the intensity functions {α̂n}n≥0 using SMC.
The Auxiliary SMC-PHD Filter
An ASMC Implementation (Whiteley et al., 2007)

- α̂n−1(dxn−1) ≈ α̂n−1^N(dxn−1) = (1/N) Σ_{i=1}^N Wn−1^i δ_{Xn−1^i}(dxn−1).
- In a simple case, the target integral at the nth iteration is
      ∫_E ∫_E Σ_{r=1}^{mn} ϕ(xn) [ψn,r(xn)/Zn,r] f(xn|xn−1) pS(xn−1) dxn α̂n−1^N(dxn−1).
- The proposal distribution qn(x′n, x′n−1, rn), built from α̂n−1^N and potential functions {Vn,r}, factorises as
      qn(x′n|x′n−1, r) × [Σ_{i=1}^N Vn,r(Xn−1^(i)) Wn−1^(i) δ_{Xn−1^(i)}(dx′n−1) / Σ_{i=1}^N Vn,r(Xn−1^(i)) Wn−1^(i)] × qn(r).
- The normalising constants {Zn,r} are also estimated by IS.
On Stratification
Stratification. . .

- Returning to the HMM setting, let (Ap)_{p=1}^M denote a partition of X.
- Introduce the stratum indicator variable Rn = Σ_{p=1}^M p · I_{Ap}(Xn).
- Define the extended model:
      p(x1:n, r1:n|y1:n) ∝ g(yn|xn) f(xn|rn, xn−1) f(rn|xn−1) × p(x1:n−1, r1:n−1|y1:n−1).
- An auxiliary SMC filter resamples from a distribution with N × M support points, over particles × strata.
- Applications in switching state-space models.
Conclusions
Conclusions

- The APF is a standard SIR algorithm for nonstandard distributions.
- This interpretation. . .
  - allows standard results to be applied directly,
  - provides guidance on implementation,
  - and allows the same technique to be applied in more general settings.

Thanks for listening. . . any questions?
References
[1] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[2] A. M. Johansen and A. Doucet. Auxiliary variable sequential Monte Carlo methods. Technical Report 07:09, University of Bristol, Department of Mathematics – Statistics Group, University Walk, Bristol, BS8 1TW, UK, July 2007.
[3] A. M. Johansen and A. Doucet. A note on the auxiliary particle filter. Statistics and Probability Letters, 2008. To appear.
[4] A. M. Johansen and N. Whiteley. A modern perspective on auxiliary particle filters. In Proceedings of Workshop on Inference and Estimation in Probabilistic Time Series Models. Isaac Newton Institute, June 2008. To appear.
[5] R. P. S. Mahler. Multitarget Bayes filtering via first-order multitarget moments. IEEE Transactions on Aerospace and Electronic Systems, 39(4):1152–1178, October 2003.
[6] M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94(446):590–599, 1999.
[7] B. Vo, S. S. Singh, and A. Doucet. Sequential Monte Carlo methods for multi-target filtering with random finite sets. IEEE Transactions on Aerospace and Electronic Systems, 41(4):1223–1245, October 2005.
[8] N. Whiteley, A. M. Johansen, and S. Godsill. Monte Carlo filtering of piecewise-deterministic processes. Technical Report CUED/F-INFENG/TR-592, University of Cambridge, Department of Engineering, Trumpington Street, Cambridge, CB2 1PZ, UK, December 2007.
[9] N. Whiteley, S. Singh, and S. Godsill. Auxiliary particle implementation of the probability hypothesis density filter. Technical Report CUED/F-INFENG/590, University of Cambridge, Department of Engineering, Trumpington Street, Cambridge, CB2 1PZ, UK, 2007.