Exact Approximation of Rao-Blackwellized Particle Filters

Adam M. Johansen*, Nick Whiteley† and Arnaud Doucet‡
* University of Warwick, Department of Statistics
† University of Bristol, School of Mathematics
‡ University of Oxford, Department of Statistics
a.m.johansen@warwick.ac.uk
http://go.warwick.ac.uk/amjohansen/talks
July 11th, 2012, 16th IFAC SysID

Context

Filtering in State-Space Models:
- SIR Particle Filters [GSS93]
- Rao-Blackwellized Particle Filters [AD02, CL00]

Exact Approximation of Monte Carlo Algorithms:
- Particle MCMC [ADH10]
- Local SMC [JD12]

Approximating the RBPF:
- Approximated Rao-Blackwellized Particle Filters [CSOL11]

A (Rather Broad) Class of Hidden Markov Models

[Graphical model: the chain (x_1, z_1) -> (x_2, z_2) -> (x_3, z_3) -> ..., with each (x_n, z_n) emitting an observation y_n.]

- Unobserved Markov chain {(X_n, Z_n)} with transition density f.
- Observed process {Y_n} with conditional density g.
- Joint density:

  p(x_{1:n}, z_{1:n}, y_{1:n}) = f_1(x_1, z_1) g(y_1 | x_1, z_1) \prod_{i=2}^{n} f(x_i, z_i | x_{i-1}, z_{i-1}) g(y_i | x_i, z_i).

Formal Solutions

- Filtering and Prediction Recursions:

  p(x_n, z_n | y_{1:n}) = p(x_n, z_n | y_{1:n-1}) g(y_n | x_n, z_n) / \int p(x'_n, z'_n | y_{1:n-1}) g(y_n | x'_n, z'_n) d(x'_n, z'_n),

  p(x_{n+1}, z_{n+1} | y_{1:n}) = \int p(x_n, z_n | y_{1:n}) f(x_{n+1}, z_{n+1} | x_n, z_n) d(x_n, z_n).

- Smoothing:

  p((x, z)_{1:n} | y_{1:n}) ∝ p((x, z)_{1:n-1} | y_{1:n-1}) f((x, z)_n | (x, z)_{n-1}) g(y_n | (x, z)_n).

A Simple SIR Filter

Algorithmically, at iteration n:
- Given {W^i_{n-1}, (X, Z)^i_{1:n-1}} for i = 1, ..., N:
- Resample, obtaining {1/N, (X̃, Z̃)^i_{1:n-1}}.
- Sample (X, Z)^i_n ~ q_n(· | (X̃, Z̃)^i_{n-1}).
- Weight

  W^i_n ∝ f((X, Z)^i_n | (X̃, Z̃)^i_{n-1}) g(y_n | (X, Z)^i_n) / q_n((X, Z)^i_n | (X̃, Z̃)^i_{n-1}).

Actually:
- Resample efficiently.
- Only resample when necessary.
- ...

A Rao-Blackwellized SIR Filter

Algorithmically, at iteration n:
- Given {W^{x,i}_{n-1}, (X^i_{1:n-1}, p(z_{1:n-1} | X^i_{1:n-1}, y_{1:n-1}))}:
- Resample, obtaining {1/N, (X̃^i_{1:n-1}, p(z_{1:n-1} | X̃^i_{1:n-1}, y_{1:n-1}))}.
- For i = 1, ..., N:
  - Sample X^i_n ~ q_n(· | X̃^i_{n-1}).
  - Set X^i_{1:n} := (X̃^i_{1:n-1}, X^i_n).
  - Weight W^{x,i}_n ∝ p(X^i_n, y_n | X̃^i_{n-1}) / q_n(X^i_n | X̃^i_{n-1}).
  - Compute p(z_{1:n} | y_{1:n}, X^i_{1:n}).

Requires an analytically tractable substructure.

An Approximate Rao-Blackwellized SIR Filter

Algorithmically, at iteration n:
- Given {W^{x,i}_{n-1}, (X^i_{1:n-1}, p̂(z_{1:n-1} | X^i_{1:n-1}, y_{1:n-1}))}:
- Resample, obtaining {1/N, (X̃^i_{1:n-1}, p̂(z_{1:n-1} | X̃^i_{1:n-1}, y_{1:n-1}))}.
- For i = 1, ..., N:
  - Sample X^i_n ~ q_n(· | X̃^i_{n-1}).
  - Set X^i_{1:n} := (X̃^i_{1:n-1}, X^i_n).
  - Weight W^{x,i}_n ∝ p̂(X^i_n, y_n | X̃^i_{n-1}) / q_n(X^i_n | X̃^i_{n-1}).
  - Compute p̂(z_{1:n} | y_{1:n}, X^i_{1:n}).

Is approximate; how does error accumulate?

Exactly Approximated Rao-Blackwellized SIR Filter

At time n = 1:
- Sample X^i_1 ~ q^x(· | y_1).
- Sample Z^{i,j}_1 ~ q^z(· | X^i_1, y_1).
- Compute and normalise the local weights

  w^z_1(X^i_1, Z^{i,j}_1) := p(X^i_1, y_1, Z^{i,j}_1) / q^z(Z^{i,j}_1 | X^i_1, y_1),

  W^{z,i,j}_1 := w^z_1(X^i_1, Z^{i,j}_1) / \sum_{k=1}^{M} w^z_1(X^i_1, Z^{i,k}_1),

  and define

  p̂(X^i_1, y_1) := (1/M) \sum_{j=1}^{M} w^z_1(X^i_1, Z^{i,j}_1).

- Compute and normalise the top-level weights

  w^x_1(X^i_1) := p̂(X^i_1, y_1) / q^x(X^i_1 | y_1),    W^{x,i}_1 := w^x_1(X^i_1) / \sum_{k=1}^{N} w^x_1(X^k_1).

At times n ≥ 2:
- Resample {W^{x,i}_{n-1}, (X^i_{1:n-1}, {W^{z,i,j}_{n-1}, Z^{i,j}_{1:n-1}}_j)}_i to obtain {1/N, (X̃^i_{1:n-1}, {W^{z,i,j}_{n-1}, Z^{i,j}_{1:n-1}}_j)}_i.
- Resample {W^{z,i,j}_{n-1}, Z^{i,j}_{1:n-1}}_j to obtain {1/M, Z̃^{i,j}_{1:n-1}}_j.
- Sample X^i_n ~ q^x(· | X̃^i_{1:n-1}, y_{1:n}); set X^i_{1:n} := (X̃^i_{1:n-1}, X^i_n).
- Sample Z^{i,j}_n ~ q^z(· | X^i_{1:n}, y_{1:n}, Z̃^{i,j}_{1:n-1}); set Z^{i,j}_{1:n} := (Z̃^{i,j}_{1:n-1}, Z^{i,j}_n).
- Compute and normalise the local weights

  w^z_n(X^i_{1:n}, Z^{i,j}_{1:n}) := p(X^i_n, y_n, Z^{i,j}_n | X̃^i_{n-1}, Z̃^{i,j}_{n-1}) / q^z(Z^{i,j}_n | X^i_{1:n}, y_{1:n}, Z̃^{i,j}_{1:n-1}),

  p̂(X^i_n, y_n | X̃^i_{1:n-1}, y_{1:n-1}) := (1/M) \sum_{j=1}^{M} w^z_n(X^i_{1:n}, Z^{i,j}_{1:n}),

  W^{z,i,j}_n := w^z_n(X^i_{1:n}, Z^{i,j}_{1:n}) / \sum_{k=1}^{M} w^z_n(X^i_{1:n}, Z^{i,k}_{1:n}).

- Compute and normalise the top-level weights

  w^x_n(X^i_{1:n}) := p̂(X^i_n, y_n | X̃^i_{1:n-1}, y_{1:n-1}) / q^x(X^i_n | X̃^i_{1:n-1}, y_{1:n}),    W^{x,i}_n := w^x_n(X^i_{1:n}) / \sum_{k=1}^{N} w^x_n(X^k_{1:n}).
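The time-1 weighting can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: it assumes a simple bivariate Gaussian model (X_1 ~ N(0,1), Z_1 ~ N(0,1), with y_1 a noisy observation of the pair, observation variance s2) and bootstrap proposals q^x, q^z equal to the priors, so the prior on Z cancels in the local weight.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy model at time 1: X_1 ~ N(0,1), Z_1 ~ N(0,1),
# y_1 = (X_1, Z_1) + independent N(0, s2) noise; bootstrap proposals.
s2 = 1.0
y1 = np.array([0.5, -0.3])          # a fixed (hypothetical) observation
N, M = 500, 10                      # top-level and local particle counts

def norm_logpdf(v, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (v - mean) ** 2 / var)

X = rng.normal(size=N)              # X_1^i ~ q^x(.|y_1) = N(0,1)
Z = rng.normal(size=(N, M))         # Z_1^{i,j} ~ q^z(.|X_1^i, y_1) = N(0,1)

# Local weights w^z_1 = p(X, y_1, Z) / q^z(Z|X, y_1); the N(0,1) prior on Z
# cancels against the bootstrap proposal, leaving prior(X) * likelihood terms.
log_wz = (norm_logpdf(X, 0.0, 1.0)[:, None]
          + norm_logpdf(y1[0], X, s2)[:, None]
          + norm_logpdf(y1[1], Z, s2))
wz = np.exp(log_wz)

# Unbiased local estimate: p_hat(X^i, y_1) = (1/M) sum_j w^z_1(X^i, Z^{i,j}).
p_hat = wz.mean(axis=1)

# Normalised local weights W^{z,i,j} and top-level weights W^{x,i}.
Wz = wz / wz.sum(axis=1, keepdims=True)
wx = p_hat / np.exp(norm_logpdf(X, 0.0, 1.0))   # divide out q^x = N(0,1)
Wx = wx / wx.sum()

x_est = np.sum(Wx * X)              # filtering estimate of E[X_1 | y_1]
```

With M = 1 the scheme reduces to a plain SIR filter on the joint (x, z) space; increasing M reduces the variance of p̂ and hence of the top-level weights, at cost linear in M.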
How can this be justified?
- As an extended-space SIR algorithm.
- Via unbiased estimation arguments.
Note also the M = 1 and M → ∞ cases (a simple SIR filter on the joint space, and the exact RBPF, respectively).

How does this differ from [CSOL11]? Principally in the local weights, with benefits including:
- Valid (N-consistent) for all M ≥ 1, rather than (M, N)-consistent.
- Computational cost O(MN) rather than O(M²N).
- Only requires knowledge of the joint behaviour of x and z; doesn't require, say, p(x_n | x_{n-1}, z_{n-1}).

Toy Example: Model

We use a simulated sequence of 100 observations from the model defined by the densities:

  μ(x_1, z_1) = N((x_1, z_1); (0, 0), diag(1, 1)),
  f(x_n, z_n | x_{n-1}, z_{n-1}) = N((x_n, z_n); (x_{n-1}, z_{n-1}), diag(1, 1)),
  g(y_n | x_n, z_n) = N(y_n; (x_n, z_n), diag(σ²_x, σ²_z)).

We consider the IMSE (relative to the optimal filter) of the filtering estimate of the first-coordinate marginals.

[Figure: Approximation of the RBPF: mean squared filtering error against the number of lower-level particles, M, for N = 10, 20, 40, 80, 160; σ²_x = σ²_z = 1.]

[Figure: Computational performance: mean squared filtering error against computational cost, N(M+1), for N = 10, 20, 40, 80, 160; σ²_x = σ²_z = 1.]

[Figure: Computational performance: mean squared filtering error against computational cost, N(M+1), for N = 10, 20, 40, 80, 160; σ²_x = 10², σ²_z = 0.1².]

In Conclusion

- SMC can be used hierarchically.
- Software implementation is not difficult [Joh09].
- The Rao-Blackwellized particle filter can be approximated exactly:
  - Can reduce estimator variance at fixed cost.
  - Attractive for distributed/parallel implementation.
  - Allows combination of different sorts of particle filter.
  - Can be combined with other techniques for parameter estimation, etc.

References

[AD02] C. Andrieu and A. Doucet. Particle filtering for partially observed Gaussian state space models. Journal of the Royal Statistical Society B, 64(4):827–836, 2002.

[ADH10] C. Andrieu, A. Doucet, and R. Holenstein.
Particle Markov chain Monte Carlo. Journal of the Royal Statistical Society B, 72(3):269–342, 2010.

[CL00] R. Chen and J. S. Liu. Mixture Kalman filters. Journal of the Royal Statistical Society B, 62(3):493–508, 2000.

[CSOL11] T. Chen, T. Schön, H. Ohlsson, and L. Ljung. Decentralized particle filter with arbitrary state decomposition. IEEE Transactions on Signal Processing, 59(2):465–478, February 2011.

[GSS93] N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140(2):107–113, April 1993.

[JD12] A. M. Johansen and A. Doucet. Hierarchical particle sampling for intractable state-space models. CRiSM working paper, University of Warwick, 2012. In preparation.

[Joh09] A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Journal of Statistical Software, 30(6):1–41, April 2009.
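As a closing illustration, the toy example's model can be simulated and its analytically tractable substructure evaluated exactly: in that model the z-coordinate evolves independently of x, so p(z_n | y_{1:n}) follows from a scalar Kalman filter. This is a sketch under those assumptions, not the authors' code; the exact RBPF would use such a recursion as p(z_{1:n} | X^i_{1:n}, y_{1:n}), and the exactly-approximated filter replaces it with a local particle estimate when no closed form exists.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100                     # length of the simulated observation sequence
sx2, sz2 = 1.0, 1.0         # observation noise variances

# Latent random walk: (X_1, Z_1) ~ N(0, I), then unit-variance increments.
states = np.cumsum(rng.normal(size=(T, 2)), axis=0)

# Observations y_n = (x_n, z_n) + independent Gaussian noise.
ys = states + rng.normal(size=(T, 2)) * np.sqrt([sx2, sz2])

# Tractable substructure: scalar Kalman filter for p(z_n | y_{1:n}).
m, P = 0.0, 1.0             # prior mean and variance of Z_1
filter_means = []
for t in range(T):
    if t > 0:
        P += 1.0            # predict through the random-walk transition
    K = P / (P + sz2)       # Kalman gain for y_n = z_n + N(0, sz2)
    m += K * (ys[t, 1] - m)
    P *= (1.0 - K)
    filter_means.append(m)
```

The filtering variance P converges quickly to the fixed point of the scalar Riccati recursion, (√5 − 1)/2 ≈ 0.618 when sz2 = 1.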