A Brief Overview of Particle Filtering
Adam M. Johansen
a.m.johansen@warwick.ac.uk
www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/talks/
May 11th, 2010
Warwick University Centre for Systems Biology
Outline
- Background
  - Monte Carlo Methods
  - Hidden Markov Models / State Space Models
  - Sequential Monte Carlo
- Filtering
  - Sequential Importance Sampling
  - Sequential Importance Resampling
  - Advanced methods
- Smoothing
- Parameter Estimation
Background
Monte Carlo
Estimating π
- Rain falls uniformly on the square.
- A circle is inscribed in the square.
- A_{square} = 4r^2.
- A_{circle} = π r^2.
- p = A_{circle} / A_{square} = π / 4.
- 383 of 500 "successes".
- \hat{π} = 4 \times \frac{383}{500} = 3.06.
- Also obtain confidence intervals.
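A minimal sketch of the rain experiment, written for this overview (not code from the talk); the function name and parameters are illustrative:

```python
import random

def estimate_pi(n=500, seed=1):
    """Drop n points uniformly on the square [-1, 1]^2 and count the
    fraction landing inside the inscribed unit circle; 4 * fraction
    estimates pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            hits += 1
    return 4 * hits / n

print(estimate_pi())   # roughly 3.1 with 500 points; noise shrinks as n grows
```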
The Monte Carlo Method
- Given a probability density p, consider

  I = \int_E φ(x) p(x) dx.

- Simple Monte Carlo solution:
  - Sample X_1, ..., X_N \sim p, i.i.d.
  - Estimate \hat{I} = \frac{1}{N} \sum_{i=1}^{N} φ(X_i).
- Justified by the law of large numbers...
- and the central limit theorem.
- Can also be viewed as approximating π(dx) = p(x) dx with

  \hat{π}^N(dx) = \frac{1}{N} \sum_{i=1}^{N} δ_{X_i}(dx).
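A small illustrative sketch of the plain Monte Carlo estimator; `phi` and `sample_p` are placeholder names introduced here, not notation from the slides:

```python
import numpy as np

def monte_carlo(phi, sample_p, n=10_000, seed=0):
    """Estimate I = E_p[phi(X)] by averaging phi over i.i.d. draws from p."""
    rng = np.random.default_rng(seed)
    x = sample_p(rng, n)                 # X_1, ..., X_N ~ p
    return np.mean(phi(x))

# Example: E[X^2] under a standard normal target (true value 1).
print(monte_carlo(lambda x: x ** 2, lambda rng, n: rng.standard_normal(n)))
```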
Importance Sampling
- Given q such that
  - p(x) > 0 \Rightarrow q(x) > 0,
  - and p(x)/q(x) < \infty,
  define w(x) = p(x)/q(x) and:

  I = \int φ(x) p(x) dx = \int φ(x) w(x) q(x) dx.

- This suggests the importance sampling estimator:
  - Sample X_1, ..., X_N \sim q, i.i.d.
  - Estimate \hat{I} = \frac{1}{N} \sum_{i=1}^{N} w(X_i) φ(X_i).
- Can also be viewed as approximating π(dx) = p(x) dx with

  \hat{π}^N(dx) = \frac{1}{N} \sum_{i=1}^{N} w(X_i) δ_{X_i}(dx).
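A hedged sketch of the importance sampling estimator with a Gaussian target and proposal as a worked example (all names and parameter values are illustrative assumptions):

```python
import numpy as np

def log_normal_pdf(x, mu, sigma):
    """Log density of N(mu, sigma^2), used for both target and proposal below."""
    return -0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2

def importance_sampling(phi, log_p, sample_q, log_q, n=10_000, seed=0):
    """Estimate E_p[phi(X)] with draws from q, weighted by w(x) = p(x)/q(x)."""
    rng = np.random.default_rng(seed)
    x = sample_q(rng, n)
    w = np.exp(log_p(x) - log_q(x))      # importance weights
    return np.mean(w * phi(x))

# Example: mean of a N(1, 1) target estimated with a N(0, 2^2) proposal.
est = importance_sampling(lambda x: x,
                          lambda x: log_normal_pdf(x, 1.0, 1.0),
                          lambda rng, n: rng.normal(0.0, 2.0, n),
                          lambda x: log_normal_pdf(x, 0.0, 2.0))
print(est)   # close to 1
```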
Variance and Importance Sampling
- The variance depends upon the proposal:

  Var(\hat{I}) = E\left[ \left( \frac{1}{N} \sum_{i=1}^{N} w(X_i) φ(X_i) - I \right)^2 \right]
             = \frac{1}{N^2} \sum_{i=1}^{N} \left( E[ w(X_i)^2 φ(X_i)^2 ] - I^2 \right)
             = \frac{1}{N} \left( E_q[ w(X)^2 φ(X)^2 ] - I^2 \right)
             = \frac{1}{N} \left( E_p[ w(X) φ(X)^2 ] - I^2 \right)

- For positive φ, this is minimized by:

  q(x) \propto p(x) φ(x)  \Rightarrow  w(x) φ(x) \propto 1.
Self-Normalised Importance Sampling
- Often, p is known only up to a normalising constant.
- As E_q(C w φ) = C E_p(φ)...
- If v(x) = C w(x), then

  \frac{E_q(vφ)}{E_q(v)} = \frac{E_q(Cwφ)}{E_q(Cw)} = \frac{C E_p(φ)}{C E_p(1)} = E_p(φ).

- Estimate the numerator and denominator with the same sample:

  \hat{I} = \frac{ \sum_{i=1}^{N} v(X_i) φ(X_i) }{ \sum_{i=1}^{N} v(X_i) }.

- Biased for finite samples, but consistent.
- Typically reduces variance.
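A minimal sketch of the self-normalised estimator, assuming only an unnormalised log target; function and argument names are hypothetical:

```python
import numpy as np

def self_normalised_is(phi, log_p_unnorm, sample_q, log_q, n=10_000, seed=0):
    """Self-normalised importance sampling: the target density need only be
    known up to a normalising constant (log_p_unnorm)."""
    rng = np.random.default_rng(seed)
    x = sample_q(rng, n)
    log_w = log_p_unnorm(x) - log_q(x)
    w = np.exp(log_w - np.max(log_w))   # subtract the max for numerical stability
    w /= np.sum(w)                      # normalise: weights sum to one
    return np.sum(w * phi(x))
```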
Hidden Markov Models
Hidden Markov Models / State Space Models
[Graphical model: a hidden chain x_1 -> x_2 -> ... -> x_6, each state x_i emitting an observation y_i.]

- Unobserved Markov chain {X_n} with transition density f.
- Observed process {Y_n} with conditional density g.
- Density:

  p(x_{1:n}, y_{1:n}) = f_1(x_1) g(y_1 | x_1) \prod_{i=2}^{n} f(x_i | x_{i-1}) g(y_i | x_i).
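A toy example of such a model, simulated in a short sketch (the linear-Gaussian form and all parameter values are illustrative assumptions, not part of the talk):

```python
import numpy as np

def simulate_lgssm(n, phi=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """Simulate a toy linear-Gaussian state space model:
    X_1 ~ N(0, sigma_x^2), X_i | X_{i-1} ~ N(phi * X_{i-1}, sigma_x^2),
    Y_i | X_i ~ N(X_i, sigma_y^2)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    x[0] = rng.normal(0.0, sigma_x)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + rng.normal(0.0, sigma_x)
    y = x + rng.normal(0.0, sigma_y, n)
    return x, y
```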
Inference in HMMs
- Given y_{1:n}:
  - What is x_{1:n}?
  - In particular, what is x_n?
  - And what about x_{n+1}?
- f and g may have unknown parameters θ.
- We wish to obtain estimates for n = 1, 2, ....
Some Terminology
Three main problems:
- Given known parameters θ, what are:
  - The filtering and prediction distributions:

    p(x_n | y_{1:n}) = \int p(x_{1:n} | y_{1:n}) dx_{1:n-1}

    p(x_{n+1} | y_{1:n}) = \int p(x_{n+1} | x_n) p(x_n | y_{1:n}) dx_n

  - The smoothing distributions:

    p(x_{1:n} | y_{1:n}) = \frac{ p(x_{1:n}, y_{1:n}) }{ p(y_{1:n}) }

- And how can we estimate static parameters:

  p(θ | y_{1:n}) and p(x_{1:n}, θ | y_{1:n})?
Formal Solutions
- Filtering and prediction recursions:

  p(x_n | y_{1:n}) = \frac{ p(x_n | y_{1:n-1}) g(y_n | x_n) }{ \int p(x'_n | y_{1:n-1}) g(y_n | x'_n) dx'_n }

  p(x_{n+1} | y_{1:n}) = \int p(x_n | y_{1:n}) f(x_{n+1} | x_n) dx_n

- Smoothing:

  p(x_{1:n} | y_{1:n}) = \frac{ p(x_{1:n-1} | y_{1:n-1}) f(x_n | x_{n-1}) g(y_n | x_n) }{ \int g(y_n | x'_n) f(x'_n | x'_{n-1}) p(x'_{n-1} | y_{1:n-1}) dx'_{n-1:n} }
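For intuition, the filtering recursion can be evaluated by brute force on a grid in one dimension. The sketch below is an illustration written for this overview (the argument names `prior`, `f_dens`, `g_dens` are assumptions), not the particle method itself:

```python
import numpy as np

def grid_filter(y, grid, prior, f_dens, g_dens):
    """Evaluate the filtering recursion numerically on a 1-D grid.
    prior(x) is the density of X_1; f_dens(x_new, x_old) and g_dens(y, x)
    are the transition and observation densities evaluated pointwise."""
    dx = grid[1] - grid[0]
    pred = prior(grid)                               # predictive density for X_1
    filters = []
    for yn in y:
        post = pred * g_dens(yn, grid)               # update step (Bayes' rule)
        post /= post.sum() * dx                      # normalise on the grid
        filters.append(post)
        # prediction step: integrate f(x_{n+1} | x_n) against the filter
        pred = (f_dens(grid[:, None], grid[None, :]) * post[None, :]).sum(axis=1) * dx
    return filters
```

The cost of this grid approach grows rapidly with the state dimension, which is one motivation for the particle methods that follow.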
Particle Filtering
Importance Sampling in This Setting
- Given p(x_{1:n} | y_{1:n}) for n = 1, 2, ....
- We could sample from a sequence q_n(x_{1:n}) for each n.
- Or we could let q_n(x_{1:n}) = q_n(x_n | x_{1:n-1}) q_{n-1}(x_{1:n-1}) and re-use our samples.
- The importance weight decomposes:

  w_n(x_{1:n}) \propto \frac{ p(x_{1:n} | y_{1:n}) }{ q_n(x_n | x_{1:n-1}) q_{n-1}(x_{1:n-1}) }
               = \frac{ p(x_{1:n} | y_{1:n}) }{ q_n(x_n | x_{1:n-1}) p(x_{1:n-1} | y_{1:n-1}) } w_{n-1}(x_{1:n-1})
               = \frac{ f(x_n | x_{n-1}) g(y_n | x_n) }{ q_n(x_n | x_{n-1}) p(y_n | y_{1:n-1}) } w_{n-1}(x_{1:n-1})
Sequential Importance Sampling – Prediction & Update
- A first "particle filter":
  - Simple default: q_n(x_n | x_{n-1}) = f(x_n | x_{n-1}).
  - The importance weighting becomes:

    w_n(x_{1:n}) = w_{n-1}(x_{1:n-1}) \times g(y_n | x_n)

- Algorithmically, at iteration n, given {W_{n-1}^i, X_{1:n-1}^i} for i = 1, ..., N:
  - Sample X_n^i \sim f(\cdot | X_{n-1}^i) (prediction).
  - Weight W_n^i \propto W_{n-1}^i g(y_n | X_n^i) (update).
Example: Almost Constant Velocity Model
- States: x_n = [s_n^x, u_n^x, s_n^y, u_n^y]^T.
- Dynamics: x_n = A x_{n-1} + ε_n with

  A = \begin{pmatrix} 1 & Δt & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & Δt \\ 0 & 0 & 0 & 1 \end{pmatrix}

- Observation: y_n = [r_n^x, r_n^y]^T = B x_n + ν_n with

  B = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
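A sketch simulating this model (the isotropic Gaussian noise scales q and r are illustrative assumptions; the slide does not specify the noise covariances):

```python
import numpy as np

def simulate_cv_model(n_steps, dt=1.0, q=0.1, r=0.5, seed=0):
    """Simulate the almost-constant-velocity model x_n = A x_{n-1} + eps_n,
    y_n = B x_n + nu_n, with illustrative Gaussian noise scales q and r."""
    rng = np.random.default_rng(seed)
    A = np.array([[1.0, dt, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, dt],
                  [0.0, 0.0, 0.0, 1.0]])
    B = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    x = np.zeros(4)
    xs, ys = [], []
    for _ in range(n_steps):
        x = A @ x + q * rng.standard_normal(4)          # state evolution
        xs.append(x.copy())
        ys.append(B @ x + r * rng.standard_normal(2))   # noisy position observation
    return np.array(xs), np.array(ys)
```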
Better Proposal Distributions
- The importance weighting is:

  w_n(x_{1:n}) \propto \frac{ f(x_n | x_{n-1}) g(y_n | x_n) }{ q_n(x_n | x_{n-1}) } w_{n-1}(x_{1:n-1})

- Optimally, we'd choose:

  q_n(x_n | x_{n-1}) \propto f(x_n | x_{n-1}) g(y_n | x_n)

- Typically impossible; but good approximation is possible.
- Can obtain much better performance.
- But, eventually the approximation variance is too large.
Resampling
- Given {W_n^i, X_{1:n}^i} targeting p(x_{1:n} | y_{1:n}).
- Sample

  \tilde{X}_{1:n}^i \sim \sum_{j=1}^{N} W_n^j δ_{X_{1:n}^j}.

- Then {1/N, \tilde{X}_{1:n}^i} also targets p(x_{1:n} | y_{1:n}).
- Such steps can be incorporated into the SIS algorithm (a multinomial resampling sketch follows).
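A minimal sketch of multinomial resampling, assuming the particles are stored in a numpy array (names are illustrative):

```python
import numpy as np

def multinomial_resample(particles, weights, rng):
    """Draw N ancestor indices with probabilities proportional to the weights
    and return the selected particles, each carrying equal weight 1/N."""
    n = len(weights)
    w = np.asarray(weights, dtype=float)
    idx = rng.choice(n, size=n, p=w / w.sum())
    return particles[idx], np.full(n, 1.0 / n)
```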
Sequential Importance Resampling
- Algorithmically, at iteration n, given {W_{n-1}^i, X_{n-1}^i} for i = 1, ..., N:
  - Resample, obtaining {1/N, \tilde{X}_{n-1}^i}.
  - Sample X_n^i \sim q(\cdot | \tilde{X}_{n-1}^i).
  - Weight

    W_n^i \propto \frac{ f(X_n^i | \tilde{X}_{n-1}^i) g(y_n | X_n^i) }{ q(X_n^i | \tilde{X}_{n-1}^i) }.

- Actually:
  - You can resample more efficiently.
  - It's only necessary to resample some of the time.
What else can we do?
Some common improvements:
- Incorporate MCMC (7).
- Use the next observation before resampling – auxiliary particle filters (11, 9).
- Sample new values for recent states (6).
- Explicitly approximate the marginal filter (10).
Particle Smoothing
Smoothing:
- Resampling helps filtering.
- But it eliminates unique particle values.
- Approximating p(x_{1:n} | y_{1:n}) or p(x_m | y_{1:n}) fails for large n.
Smoothing Strategies
Some approaches:
- Fixed-lag smoothing:

  p(x_{n-L} | y_{1:n}) \approx p(x_{n-L} | y_{1:n-1})

  for large enough L.
- Particle implementation of the forward-backward algorithm:

  p(x_{1:n} | y_{1:n}) = p(x_n | y_{1:n}) \prod_{m=1}^{n-1} p(x_m | x_{m+1}, y_{1:m})

  Doucet et al. have developed a forward-only variant.
- Particle implementation of the two-filter formula:

  p(x_m | y_{1:n}) \propto p(x_m | y_{1:m}) p(y_{m+1:n} | x_m)
Parameter Estimation
Static Parameters
If f(x_n | x_{n-1}) and g(y_n | x_n) depend upon θ:
- We could set x'_n = (x_n, θ_n) and

  f'(x'_n | x'_{n-1}) = f(x_n | x_{n-1}) δ_{θ_{n-1}}(θ_n)

- Degeneracy causes difficulties:
  - Resampling eliminates values.
  - We never propose new values.
  - MCMC transitions don't mix well.
Approaches to Parameter Estimation
- Artificial dynamics: θ_n = θ_{n-1} + ε_n.
- Iterated filtering.
- Sufficient statistics.
- Various online likelihood approaches.
- Practical filtering.
- ...
Summary
SMCTC: C++ Template Class for SMC Algorithms
- Implementing SMC algorithms in C/C++ isn't hard.
- Software for implementing general SMC algorithms.
- The C++ element is largely confined to the library.
- Available (under a GPL-3 license) from
  www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/smctc/
  or type "smctc" into Google.
- Example code includes a simple particle filter.
In Conclusion (with references)
- Sequential Monte Carlo methods ("particle filters") can approximate filtering and smoothing distributions (5).
- Proposal distributions matter.
- Resampling:
  - is necessary (2),
  - but not every iteration,
  - and need not be multinomial (4).
- Smoothing & parameter estimation are harder.
- Software for easy implementation exists:
  - PFLib (1)
  - SMCTC (8)
- Actually, similar techniques apply elsewhere (3).
References
[1] L. Chen, C. Lee, A. Budhiraja, and R. K. Mehra. PFlib: An object oriented MATLAB toolbox for particle filtering. In Proceedings of SPIE Signal Processing, Sensor Fusion and Target Recognition XVI, volume 6567, 2007.
[2] P. Del Moral. Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Probability and Its Applications. Springer Verlag, New York, 2004.
[3] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[4] R. Douc and O. Cappé. Comparison of resampling schemes for particle filters. In Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, volume I, pages 64–69. IEEE, 2005.
[5] A. Doucet and A. M. Johansen. A tutorial on particle filtering and smoothing: Fifteen years later. In D. Crisan and B. Rozovsky, editors, The Oxford Handbook of Nonlinear Filtering. Oxford University Press, 2010. To appear.
[6] A. Doucet, M. Briers, and S. Sénécal. Efficient block sampling strategies for sequential Monte Carlo methods. Journal of Computational and Graphical Statistics, 15(3):693–711, 2006.
[7] W. R. Gilks and C. Berzuini. Following a moving target – Monte Carlo inference for dynamic Bayesian models. Journal of the Royal Statistical Society B, 63:127–146, 2001.
[8] A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Journal of Statistical Software, 30(6):1–41, April 2009.
[9] A. M. Johansen and A. Doucet. A note on the auxiliary particle filter. Statistics and Probability Letters, 78(12):1498–1504, September 2008. URL http://dx.doi.org/10.1016/j.spl.2008.01.032.
[10] M. Klaas, N. de Freitas, and A. Doucet. Toward practical N² Monte Carlo: The marginal particle filter. In Proceedings of Uncertainty in Artificial Intelligence, 2005.
[11] M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94(446):590–599, 1999.
SMC Sampler Outline
- Given a sample {X_{1:n-1}^{(i)}}_{i=1}^{N} targeting \tilde{η}_{n-1},
- sample X_n^{(i)} \sim K_n(X_{n-1}^{(i)}, \cdot),
- calculate

  W_n(X_{1:n}^{(i)}) = \frac{ η_n(X_n^{(i)}) L_{n-1}(X_n^{(i)}, X_{n-1}^{(i)}) }{ η_{n-1}(X_{n-1}^{(i)}) K_n(X_{n-1}^{(i)}, X_n^{(i)}) }.

- Resample, yielding {X_{1:n}^{(i)}}_{i=1}^{N} targeting \tilde{η}_n.
- Hints that we'd like to use

  L_{n-1}(x_n, x_{n-1}) = \frac{ η_{n-1}(x_{n-1}) K_n(x_{n-1}, x_n) }{ \int η_{n-1}(x'_{n-1}) K_n(x'_{n-1}, x_n) dx'_{n-1} }.
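A toy sketch of one concrete instance of this scheme, a tempered SMC sampler written for this overview (not the talk's own example): the sequence of targets, the random-walk Metropolis kernel K_n, and the simplified incremental weight that results from the usual "reversal" choice of L_{n-1} are all assumptions made here for illustration.

```python
import numpy as np

def tempered_smc_sampler(log_target, n_particles=1000, n_temps=20, seed=0):
    """Toy SMC sampler moving particles from a N(0,1) start to a target
    distribution through tempered intermediate targets eta_b. K_n is a
    random-walk Metropolis kernel; with the 'reversal' choice of L_{n-1},
    the incremental weight reduces to eta_n(x) / eta_{n-1}(x)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)                   # exact draws from eta_0
    log_prior = lambda z: -0.5 * z ** 2                    # unnormalised N(0, 1)
    log_eta = lambda z, b: (1 - b) * log_prior(z) + b * log_target(z)
    betas = np.linspace(0.0, 1.0, n_temps)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        logw = log_eta(x, b) - log_eta(x, b_prev)          # incremental weights
        w = np.exp(logw - logw.max()); w /= w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]   # resample
        prop = x + 0.5 * rng.standard_normal(n_particles)  # RW Metropolis move
        accept = np.log(rng.random(n_particles)) < log_eta(prop, b) - log_eta(x, b)
        x = np.where(accept, prop, x)
    return x
```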