A Brief Overview of Particle Filtering

Adam M. Johansen
a.m.johansen@warwick.ac.uk
www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/talks/
May 11th, 2010
Warwick University Centre for Systems Biology

Outline
- Background
  - Monte Carlo Methods
  - Hidden Markov Models / State Space Models
  - Sequential Monte Carlo
- Filtering
  - Sequential Importance Sampling
  - Sequential Importance Resampling
  - Advanced methods
- Smoothing
- Parameter Estimation

Background

Estimating π
- Rain is uniform.
- Circle is inscribed in square.
- A_square = 4r².
- A_circle = πr².
- p = A_circle / A_square = π/4.
- 383 of 500 "successes".
- π̂ = 4 · 383/500 = 3.06.
- Also obtain confidence intervals.

The Monte Carlo Method
- Given a probability density p,
      I = ∫_E φ(x) p(x) dx.
- Simple Monte Carlo solution:
  - Sample X_1, ..., X_N ~ p, iid.
  - Estimate Î = (1/N) Σ_{i=1}^N φ(X_i).
- Justified by the law of large numbers...
- ...and the central limit theorem.
- Can also be viewed as approximating π(dx) = p(x) dx with
      π̂^N(dx) = (1/N) Σ_{i=1}^N δ_{X_i}(dx).

Importance Sampling
- Given q such that
  - p(x) > 0 ⇒ q(x) > 0 and
  - p(x)/q(x) < ∞,
  define w(x) = p(x)/q(x) and:
      I = ∫ φ(x) p(x) dx = ∫ φ(x) w(x) q(x) dx.
- This suggests the importance sampling estimator:
  - Sample X_1, ..., X_N ~ q, iid.
  - Estimate Î = (1/N) Σ_{i=1}^N w(X_i) φ(X_i).
- Can also be viewed as approximating π(dx) = p(x) dx with
      π̂^N(dx) = (1/N) Σ_{i=1}^N w(X_i) δ_{X_i}(dx).

Variance and Importance Sampling
- Variance depends upon the proposal:
      Var(Î) = E[((1/N) Σ_{i=1}^N w(X_i) φ(X_i) − I)²]
             = (1/N²) Σ_{i=1}^N (E[w(X_i)² φ(X_i)²] − I²)
             = (1/N) (E_q[w(X)² φ(X)²] − I²)
             = (1/N) (E_p[w(X) φ(X)²] − I²).
- For positive φ, minimised by:
      q(x) ∝ p(x) φ(x)  ⇒  w(x) φ(x) ∝ 1.

Self-Normalised Importance Sampling
- Often, p is known only up to a normalising constant.
- As E_q(Cwφ) = C E_p(φ)...
- If v(x) = C w(x), then
      E_q(vφ) / E_q(v · 1) = E_q(Cwφ) / E_q(Cw · 1) = C E_p(φ) / (C E_p(1)) = E_p(φ).
- Estimate the numerator and denominator with the same sample:
      Î = Σ_{i=1}^N v(X_i) φ(X_i) / Σ_{i=1}^N v(X_i).
- Biased for finite samples, but consistent.
- Typically reduces variance.
- (A small numerical sketch of this estimator appears at the end of this section.)

Hidden Markov Models / State Space Models
[Graphical model: a latent chain x_1 → x_2 → ... → x_6, with each x_i emitting an observation y_i.]
- Unobserved Markov chain {X_n} with transition density f.
- Observed process {Y_n} with conditional density g.
- Density:
      p(x_{1:n}, y_{1:n}) = f_1(x_1) g(y_1|x_1) ∏_{i=2}^n f(x_i|x_{i−1}) g(y_i|x_i).

Inference in HMMs
- Given y_{1:n}:
  - What is x_{1:n}; in particular, what is x_n, and what about x_{n+1}?
- f and g may have unknown parameters θ.
- We wish to obtain estimates for n = 1, 2, ...

Some Terminology
Three main problems:
- Given known parameters θ, what are:
  - The filtering and prediction distributions:
        p(x_n|y_{1:n}) = ∫ p(x_{1:n}|y_{1:n}) dx_{1:n−1}
        p(x_{n+1}|y_{1:n}) = ∫ p(x_{n+1}|x_n) p(x_n|y_{1:n}) dx_n
  - The smoothing distributions:
        p(x_{1:n}|y_{1:n}) = p(x_{1:n}, y_{1:n}) / p(y_{1:n})
- And how can we estimate static parameters: p(θ|y_{1:n}) and p(x_{1:n}, θ|y_{1:n})?
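Before turning to the HMM recursions, the self-normalised estimator above is worth seeing numerically. The following is a minimal sketch, not part of the original slides, with a toy setup chosen for illustration: the target p is N(1, 1) known only up to a constant, the proposal q is N(0, 1), and φ(x) = x, so E_p(φ) = 1.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 10_000

    # Target p = N(1, 1), known only up to a constant; proposal q = N(0, 1).
    def log_v(x):          # unnormalised target: v(x) = C p(x)
        return -0.5 * (x - 1.0) ** 2

    def log_q(x):          # proposal density, also up to a constant
        return -0.5 * x ** 2

    def phi(x):            # test function; E_p[phi(X)] = 1 here
        return x

    X = rng.standard_normal(N)               # X_1, ..., X_N ~ q, iid
    v = np.exp(log_v(X) - log_q(X))          # unnormalised weights v(X_i)
    I_hat = np.sum(v * phi(X)) / np.sum(v)   # self-normalised estimate
    print(I_hat)                             # approaches 1 as N grows

Note that both densities are used only up to constants: the unknown factors cancel in the ratio, which is precisely the point of self-normalisation.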
Formal Solutions
- Filtering and prediction recursions:
      p(x_n|y_{1:n}) = p(x_n|y_{1:n−1}) g(y_n|x_n) / ∫ p(x'_n|y_{1:n−1}) g(y_n|x'_n) dx'_n
      p(x_{n+1}|y_{1:n}) = ∫ p(x_n|y_{1:n}) f(x_{n+1}|x_n) dx_n
- Smoothing:
      p(x_{1:n}|y_{1:n}) = p(x_{1:n−1}|y_{1:n−1}) f(x_n|x_{n−1}) g(y_n|x_n) / ∫ g(y_n|x'_n) f(x'_n|x'_{n−1}) p(x'_{n−1}|y_{1:n−1}) dx'_{n−1:n}

Importance Sampling in This Setting
- Given p(x_{1:n}|y_{1:n}) for n = 1, 2, ...
- We could sample from a sequence q_n(x_{1:n}) for each n.
- Or we could let q_n(x_{1:n}) = q_n(x_n|x_{1:n−1}) q_{n−1}(x_{1:n−1}) and re-use our samples.
- The importance weight decomposes:
      w_n(x_{1:n}) ∝ p(x_{1:n}|y_{1:n}) / [q_n(x_n|x_{1:n−1}) q_{n−1}(x_{1:n−1})]
                   = w_{n−1}(x_{1:n−1}) p(x_{1:n}|y_{1:n}) / [q_n(x_n|x_{1:n−1}) p(x_{1:n−1}|y_{1:n−1})]
                   = w_{n−1}(x_{1:n−1}) f(x_n|x_{n−1}) g(y_n|x_n) / [q_n(x_n|x_{n−1}) p(y_n|y_{1:n−1})]

Sequential Importance Sampling – Prediction & Update
- A first "particle filter":
  - Simple default: q_n(x_n|x_{n−1}) = f(x_n|x_{n−1}).
  - The importance weighting becomes:
        w_n(x_{1:n}) = w_{n−1}(x_{1:n−1}) × g(y_n|x_n)
- Algorithmically, at iteration n, given {W^i_{n−1}, X^i_{1:n−1}} for i = 1, ..., N:
  - Sample X^i_n ~ f(·|X^i_{n−1}) (prediction).
  - Weight W^i_n ∝ W^i_{n−1} g(y_n|X^i_n) (update).

Example: Almost Constant Velocity Model
- States: x_n = [s^x_n, u^x_n, s^y_n, u^y_n]^T.
- Dynamics: x_n = A x_{n−1} + ε_n, where
      A = [1 ∆t 0 0; 0 1 0 0; 0 0 1 ∆t; 0 0 0 1]   (rows separated by ";").
- Observation: y_n = [r^x_n, r^y_n]^T = B x_n + ν_n, where
      B = [1 0 0 0; 0 0 1 0].

Better Proposal Distributions
- The importance weighting is:
      w_n(x_{1:n}) ∝ w_{n−1}(x_{1:n−1}) f(x_n|x_{n−1}) g(y_n|x_n) / q_n(x_n|x_{n−1})
- Optimally, we'd choose:
      q_n(x_n|x_{n−1}) ∝ f(x_n|x_{n−1}) g(y_n|x_n)
- Typically impossible; but a good approximation is often possible.
  - Can obtain much better performance.
  - But, eventually, the approximation variance is too large.

Resampling
- Given {W^i_n, X^i_{1:n}} targeting p(x_{1:n}|y_{1:n}).
- Sample
      X̃^i_{1:n} ~ Σ_{j=1}^N W^j_n δ_{X^j_{1:n}}.
- Then {1/N, X̃^i_{1:n}} also targets p(x_{1:n}|y_{1:n}).
- Such steps can be incorporated into the SIS algorithm.

Sequential Importance Resampling
- Algorithmically, at iteration n, given {W^i_{n−1}, X^i_{n−1}} for i = 1, ..., N:
  - Resample, obtaining {1/N, X̃^i_{n−1}}.
  - Sample X^i_n ~ q(·|X̃^i_{n−1}).
  - Weight
        W^i_n ∝ f(X^i_n|X̃^i_{n−1}) g(y_n|X^i_n) / q(X^i_n|X̃^i_{n−1}).
- Actually:
  - You can resample more efficiently.
  - It's only necessary to resample some of the time.
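A minimal sketch of the SIR algorithm above, applied to the almost constant velocity model. The slides do not specify the noise distributions or the initial state; here ε_n and ν_n are assumed zero-mean Gaussian with illustrative covariances Q and R, the chain is assumed to start at the origin, the proposal is the bootstrap choice q = f, and multinomial resampling is applied at every step.

    import numpy as np

    rng = np.random.default_rng(1)
    dt, T, N = 1.0, 50, 1000

    A = np.array([[1.0, dt, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, dt],
                  [0.0, 0.0, 0.0, 1.0]])
    B = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    Q = 0.1 * np.eye(4)          # assumed covariance of the dynamics noise eps_n
    R = 0.5 * np.eye(2)          # assumed covariance of the observation noise nu_n
    R_inv = np.linalg.inv(R)

    def log_g(y, X):
        # log g(y_n | x_n), up to an additive constant, for each particle (row of X)
        d = y - X @ B.T
        return -0.5 * np.einsum('ij,jk,ik->i', d, R_inv, d)

    # Simulate a trajectory and observations from the model itself.
    x_true = np.zeros(4)
    ys = []
    for _ in range(T):
        x_true = A @ x_true + rng.multivariate_normal(np.zeros(4), Q)
        ys.append(B @ x_true + rng.multivariate_normal(np.zeros(2), R))

    # Bootstrap SIR filter: propose from f, weight by g, resample.
    X = np.zeros((N, 4))          # particles; chain assumed to start at the origin
    for n, y in enumerate(ys):
        X = X @ A.T + rng.multivariate_normal(np.zeros(4), Q, size=N)   # prediction
        logw = log_g(y, X)                                              # update
        w = np.exp(logw - logw.max())
        w /= w.sum()
        print(n, "estimated position:", w @ X[:, [0, 2]])   # E[(s_x, s_y) | y_{1:n}]
        X = X[rng.choice(N, size=N, p=w)]                   # multinomial resampling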
What else can we do?
Some common improvements:
- Incorporate MCMC moves (7).
- Use the next observation before resampling – auxiliary particle filters (11, 9).
- Sample new values for recent states (6).
- Explicitly approximate the marginal filter (10).

Smoothing
- Resampling helps filtering.
- But it eliminates unique particle values.
- Approximating p(x_{1:n}|y_{1:n}) or p(x_m|y_{1:n}) fails for large n.

Smoothing Strategies
Some approaches:
- Fixed-lag smoothing:
      p(x_{n−L}|y_{1:n}) ≈ p(x_{n−L}|y_{1:n−1})
  for large enough L.
- Particle implementation of the forward-backward algorithm:
      p(x_{1:n}|y_{1:n}) = p(x_n|y_{1:n}) ∏_{m=1}^{n−1} p(x_m|x_{m+1}, y_{1:m})
  - Doucet et al. have developed a forward-only variant.
- Particle implementation of the two-filter formula:
      p(x_m|y_{1:n}) ∝ p(x_m|y_{1:m}) p(y_{m+1:n}|x_m)

Static Parameters
If f(x_n|x_{n−1}) and g(y_n|x_n) depend upon θ:
- We could set x'_n = (x_n, θ_n) and
      f'(x'_n|x'_{n−1}) = f(x_n|x_{n−1}) δ_{θ_{n−1}}(θ_n).
- Degeneracy causes difficulties:
  - Resampling eliminates values.
  - We never propose new values.
  - MCMC transitions don't mix well.

Approaches to Parameter Estimation
- Artificial dynamics: θ_n = θ_{n−1} + ε_n.
- Iterated filtering.
- Sufficient statistics.
- Various online likelihood approaches.
- Practical filtering.
- ...

Summary

SMCTC: C++ Template Class for SMC Algorithms
- Implementing SMC algorithms in C/C++ isn't hard.
- Software for implementing general SMC algorithms.
- The C++ element is largely confined to the library.
- Available (under a GPL-3 licence) from www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/smctc/ or type "smctc" into Google.
- Example code includes a simple particle filter.

In Conclusion (with references)
- Sequential Monte Carlo methods ("particle filters") can approximate filtering and smoothing distributions (5).
- Proposal distributions matter.
- Resampling:
  - is necessary (2),
  - but not every iteration,
  - and need not be multinomial (4).
- Smoothing & parameter estimation are harder.
- Software for easy implementation exists:
  - PFLib (1)
  - SMCTC (8)
- Actually, similar techniques apply elsewhere (3).
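The resampling remarks in the conclusion (not at every iteration; not necessarily multinomial) can be made concrete with a standard recipe that is not spelled out on the slides: monitor the effective sample size and apply systematic resampling only when it falls below a threshold. A minimal sketch, with the threshold N/2 chosen purely for illustration:

    import numpy as np

    def ess(w):
        # Effective sample size of normalised weights: 1 / sum_i w_i^2.
        return 1.0 / np.sum(w ** 2)

    def systematic_resample(w, rng):
        # Systematic resampling: a single uniform, N evenly spaced points,
        # inverted through the cumulative distribution of the weights.
        N = len(w)
        u = (rng.uniform() + np.arange(N)) / N
        c = np.cumsum(w)
        c[-1] = 1.0                      # guard against floating-point round-off
        return np.searchsorted(c, u)

    rng = np.random.default_rng(2)
    w = rng.exponential(size=1000)       # stand-in for unnormalised particle weights
    w /= w.sum()
    if ess(w) < 0.5 * len(w):            # resample only when the ESS drops too far
        idx = systematic_resample(w, rng)
        # ...the particle array X would then be replaced by X[idx],
        # and all weights reset to 1/N.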
References

[1] L. Chen, C. Lee, A. Budhiraja, and R. K. Mehra. PFLib: An object-oriented MATLAB toolbox for particle filtering. In Proceedings of SPIE Signal Processing, Sensor Fusion and Target Recognition XVI, volume 6567, 2007.
[2] P. Del Moral. Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Probability and Its Applications. Springer Verlag, New York, 2004.
[3] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[4] R. Douc and O. Cappé. Comparison of resampling schemes for particle filters. In Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, volume I, pages 64–69. IEEE, 2005.
[5] A. Doucet and A. M. Johansen. A tutorial on particle filtering and smoothing: Fifteen years later. In D. Crisan and B. Rozovsky, editors, The Oxford Handbook of Nonlinear Filtering. Oxford University Press, 2010. To appear.
[6] A. Doucet, M. Briers, and S. Sénécal. Efficient block sampling strategies for sequential Monte Carlo methods. Journal of Computational and Graphical Statistics, 15(3):693–711, 2006.
[7] W. R. Gilks and C. Berzuini. Following a moving target – Monte Carlo inference for dynamic Bayesian models. Journal of the Royal Statistical Society B, 63:127–146, 2001.
[8] A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Journal of Statistical Software, 30(6):1–41, April 2009.
[9] A. M. Johansen and A. Doucet. A note on the auxiliary particle filter. Statistics and Probability Letters, 78(12):1498–1504, September 2008. URL http://dx.doi.org/10.1016/j.spl.2008.01.032.
[10] M. Klaas, N. de Freitas, and A. Doucet. Towards practical N² Monte Carlo: The marginal particle filter. In Proceedings of Uncertainty in Artificial Intelligence, 2005.
[11] M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94(446):590–599, 1999.

SMC Sampler Outline
- Given a sample {X^{(i)}_{1:n−1}}_{i=1}^N targeting η_{n−1},
- sample X^{(i)}_n ~ K_n(X^{(i)}_{n−1}, ·),
- calculate
      W_n(X^{(i)}_{1:n}) = η_n(X^{(i)}_n) L_{n−1}(X^{(i)}_n, X^{(i)}_{n−1}) / [η_{n−1}(X^{(i)}_{n−1}) K_n(X^{(i)}_{n−1}, X^{(i)}_n)].
- Resample, yielding {X^{(i)}_{1:n}}_{i=1}^N targeting η_n.
- Hints that we'd like to use
      L_{n−1}(x_n, x_{n−1}) = η_{n−1}(x_{n−1}) K_n(x_{n−1}, x_n) / ∫ η_{n−1}(x'_{n−1}) K_n(x'_{n−1}, x_n) dx'_{n−1}.
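Finally, a direct transcription of the incremental weight on the slide above, as a sketch only. All ingredients are toy assumptions rather than choices made in the talk: η_{n−1} and η_n are Gaussians, K_n is a Gaussian random-walk kernel of scale τ, and L_{n−1} is simply the same Gaussian kernel run backwards; with that symmetric choice the weight reduces to η_n(x_n)/η_{n−1}(x_{n−1}), a valid if crude option rather than the (near-)optimal backward kernel suggested on the slide.

    import numpy as np

    def gauss_pdf(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    rng = np.random.default_rng(3)
    N, tau = 1000, 0.5

    # Toy ingredients (illustrative assumptions, not choices made in the talk):
    eta_prev = lambda x: gauss_pdf(x, 0.0, 2.0)                 # eta_{n-1}
    eta_curr = lambda x: gauss_pdf(x, 1.0, 1.0)                 # eta_n
    K = lambda x_old, x_new: gauss_pdf(x_new, x_old, tau)       # forward kernel K_n
    L = lambda x_new, x_old: gauss_pdf(x_old, x_new, tau)       # backward kernel L_{n-1}

    X_prev = rng.normal(0.0, 2.0, size=N)              # sample targeting eta_{n-1}
    X_new = X_prev + tau * rng.standard_normal(N)      # X_n^(i) ~ K_n(X_{n-1}^(i), .)

    # Incremental weight from the slide:
    #   W_n = eta_n(x_n) L_{n-1}(x_n, x_{n-1}) / (eta_{n-1}(x_{n-1}) K_n(x_{n-1}, x_n))
    W = eta_curr(X_new) * L(X_new, X_prev) / (eta_prev(X_prev) * K(X_prev, X_new))
    W /= W.sum()

    X_new = X_new[rng.choice(N, size=N, p=W)]          # resample, targeting eta_n
    print("mean under eta_n (should be near 1):", X_new.mean())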