Block-Tempered Particle Filters

Adam M. Johansen
University of Warwick
a.m.johansen@warwick.ac.uk
www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/talks/
SysID 17: October 20th, 2015

Outline
- Background: SMC and PMCMC
- Tempering Blocks in SMC
- Example: proof-of-concept results
- Conclusions

Filtering

Online inference for state space models:

[Graphical model: latent Markov chain x_1 -> x_2 -> ... -> x_6, with each observation y_n depending only on x_n.]

Given the transition f_\theta(x_n | x_{n-1}) and the likelihood g_\theta(y_n | x_n), use p_\theta(x_n | y_{1:n}) to characterise the latent state; but the filtering recursion

    p_\theta(x_n | y_{1:n}) = \frac{\left[ \int p_\theta(x_{n-1} | y_{1:n-1}) f_\theta(x_n | x_{n-1}) \, dx_{n-1} \right] g_\theta(y_n | x_n)}{\int\!\int p_\theta(x_{n-1} | y_{1:n-1}) f_\theta(x'_n | x_{n-1}) \, dx_{n-1} \, g_\theta(y_n | x'_n) \, dx'_n}

isn't often tractable.

Particle Filtering

A (sequential) Monte Carlo (SMC) scheme to approximate the filtering distributions.

A Simple Particle Filter

At n = 1:
- Sample X_1^1, ..., X_1^N ~ \mu_\theta.

For n > 1:
- Sample

    X_n^1, ..., X_n^N ~ \sum_{j=1}^N \frac{g_\theta(y_{n-1} | X_{n-1}^j)}{\sum_{k=1}^N g_\theta(y_{n-1} | X_{n-1}^k)} \, f_\theta(\cdot | X_{n-1}^j).

- Approximate p_\theta(dx_n | y_{1:n}) and p_\theta(y_{1:n}) with

    \hat{p}_\theta(\cdot | y_{1:n}) = \sum_{j=1}^N \frac{g_\theta(y_n | X_n^j)}{\sum_{k=1}^N g_\theta(y_n | X_n^k)} \, \delta_{X_n^j},
    \qquad
    \hat{p}_\theta(y_{1:n}) = \hat{p}_\theta(y_{1:n-1}) \cdot \frac{1}{N} \sum_{k=1}^N g_\theta(y_n | X_n^k).

Online Particle Filters for Offline System Identification

Particle Markov chain Monte Carlo (PMCMC) [ADH10]
- Embeds SMC within MCMC,
- is justified via an explicit auxiliary variable construction,
- and in some simple cases by a pseudo-marginal [AR09] argument.
- Very widely applicable,
- but prone to poor mixing when SMC performs poorly for some \theta [OWG15, Section 4.2.1].
- Valid for very general SMC algorithms.

Block-Sampling and Tempering in Particle Filters

Tempered Transitions [GC01]
- Introduce each likelihood term gradually, targeting

    \pi_{n,m}(x_{1:n}) \propto p(x_{1:n} | y_{1:n-1}) \, p(y_n | x_n)^{\beta_m}

  between p(x_{1:n-1} | y_{1:n-1}) and p(x_{1:n} | y_{1:n}).
- Can improve performance, but only to a limited extent.

Block Sampling [DBS06]
- Essentially uses y_{n:n+L} when proposing x_n.
- Can dramatically improve performance,
- but requires a good analytic approximation of p(x_{n:n+L} | x_{n-1}, y_{n:n+L}).
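The simple particle filter above is short enough to state in code. Below is a minimal NumPy sketch for the univariate linear Gaussian model used in the illustrative example later (f(x' | x) = N(x'; x, 1), g(y | x) = N(y; x, 1)); the initial distribution \mu_\theta = N(0, 1) and the function name are illustrative assumptions, not part of the talk:

```python
import numpy as np

def bootstrap_particle_filter(y, N=1000, rng=None):
    """Simple particle filter for the illustrative model (an assumption here):
    x_1 ~ N(0, 1),  x_n | x_{n-1} ~ N(x_{n-1}, 1),  y_n | x_n ~ N(x_n, 1).
    Returns the log of the unbiased likelihood estimate, log p_hat(y_{1:T})."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(0.0, 1.0, size=N)              # X_1^j ~ mu_theta
    log_Z = 0.0
    for n in range(len(y)):
        # log g(y_n | X_n^j) for a N(x, 1) observation density
        logw = -0.5 * (y[n] - x) ** 2 - 0.5 * np.log(2.0 * np.pi)
        m = logw.max()                            # stabilise the exponentials
        w = np.exp(logw - m)
        log_Z += m + np.log(w.mean())             # log (1/N) sum_k g(y_n | X_n^k)
        if n < len(y) - 1:
            # resample proportionally to the weights, then propagate through f
            idx = rng.choice(N, size=N, p=w / w.sum())
            x = rng.normal(x[idx], 1.0)
    return log_Z
```

Because \hat{p}_\theta(y_{1:T}) is unbiased, this quantity (computed on the log scale for numerical stability) is exactly what a pseudo-marginal PMCMC scheme requires.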
Block-Tempering

- We could combine the blocking and tempering strategies.
- Run a simple SMC sampler [DDJ06] targeting

    \pi^\theta_{t,r}(x_{1:t \wedge T}) = \mu_\theta(x_1)^{\beta^1_{(t,r)}} \, g_\theta(y_1 | x_1)^{\gamma^1_{(t,r)}} \prod_{s=2}^{T \wedge t} f_\theta(x_s | x_{s-1})^{\beta^s_{(t,r)}} \, g_\theta(y_s | x_s)^{\gamma^s_{(t,r)}}    (1)

  where the {\beta^s_{(t,r)}} and {\gamma^s_{(t,r)}} are [0,1]-valued, for s \in [[1, T]], r \in [[1, R]] and t \in [[1, T']], with T' = T + L for some R, L \in \mathbb{N}.
- Can be validly embedded within PMCMC:
  - The terminal likelihood estimate is unbiased.
  - An explicit auxiliary variable construction is possible.

Two Simple Block-Tempering Strategies

Tempering both likelihood and transition probabilities:

    \beta^s_{(t,r)} = \gamma^s_{(t,r)} = 1 \wedge \frac{R(t - s) + r}{RL} \vee 0    (2)

Tempering only the observation density:

    \beta^s_{(t,r)} = \mathbb{I}\{s \le t\}, \qquad \gamma^s_{(t,r)} = 1 \wedge \frac{R(t - s) + r}{RL} \vee 0    (3)

Tempering Only the Observation Density

[Figure: the schedules in (3) as functions of s. \beta^s_{(t,R)} = \beta^s_{(t,0)} is the indicator of s \le t, while \gamma^s_{(t,0)} and \gamma^s_{(t,R)} equal 1 for s sufficiently far below t and decrease linearly over a window of width L ending at s = t.]

Illustrative Example

- Univariate linear Gaussian SSM:
  transition f(x' | x) = N(x'; x, 1), likelihood g(y | x) = N(y; x, 1).
- Artificial jump in the observation sequence at time 75.
- A cartoon of model misspecification, a key difficulty with PMCMC.
- Temper only the likelihood.
- Use single-site Metropolis-within-Gibbs (standard normal proposal) MCMC moves.

[Figure: observations y_t for t = 1, ..., 100, with the jump at t = 75 clearly visible.]
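The clipped linear ramp shared by (2) and (3) is simple to compute; the following one-line Python sketch (the function name is illustrative) evaluates 1 \wedge (R(t - s) + r)/(RL) \vee 0:

```python
def tempering_exponent(s, t, r, R, L):
    """Clipped linear ramp 1 ∧ (R(t-s)+r)/(RL) ∨ 0 used for the gamma
    (and, in strategy (2), beta) schedules: observation y_s is switched
    on gradually over L time steps of R sub-iterations each."""
    return min(1.0, max(0.0, (R * (t - s) + r) / (R * L)))
```

With R = 1 and L = 1 the ramp collapses to the indicator \mathbb{I}\{s \le t\}, recovering the untempered filter; larger R and L spread the introduction of each observation over RL intermediate distributions, which is why the sampler runs for T' = T + L time steps.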
True Filtering and Smoothing Distributions

[Figure: exact filtering and smoothing distributions of x_t for t = 1, ..., 100 (legend: filter, smoother).]

Estimating the Normalizing Constant, \hat{Z}

[Figure: log(estimated Z) against log(cost = N(1 + RL)), for L \in {1, 2, 3, 5} and R \in {0, 1, 5, 10}.]

Relative Error in \hat{Z} Against Computational Effort

[Figure: log(relative error in \hat{Z}) against log(cost = N(1 + RL)), for log(N) \in {3, 4, 5, 6}, L \in {1, 2, 3, 5} and R \in {0, 1, 5, 10}.]

Performance: (R = 1, L = 1) \prec (R > 1, L = 1) \prec (R > 1, L > 1)

Conclusions

- PMCMC is perhaps even more powerful than has yet been widely recognised.
- Exploiting the offline nature of the PMCMC setting allows more flexibility than the filtering framework.
- The particular block-tempered approach developed here warrants further investigation.
- Another approach to similar problems is provided by the iAPF [GJL15, preprint on arXiv RSN].

Thanks for listening... Any Questions?

References

[ADH10] C. Andrieu, A. Doucet, and R. Holenstein. Particle Markov chain Monte Carlo. Journal of the Royal Statistical Society B, 72(3):269–342, 2010.

[AR09] C. Andrieu and G. O. Roberts. The pseudo-marginal approach for efficient Monte Carlo computations. Annals of Statistics, 37(2):697–725, 2009.

[DBS06] A. Doucet, M. Briers, and S. Sénécal. Efficient block sampling strategies for sequential Monte Carlo methods. Journal of Computational and Graphical Statistics, 15(3):693–711, 2006.

[DDJ06] P. Del Moral, A. Doucet, and A. Jasra.
Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.

[GC01] S. Godsill and T. Clapp. Improvement strategies for Monte Carlo particle filters. In A. Doucet, N. de Freitas, and N. Gordon, editors, Sequential Monte Carlo Methods in Practice, Statistics for Engineering and Information Science, pages 139–158. Springer Verlag, New York, 2001.

[GJL15] P. Guarniero, A. M. Johansen, and A. Lee. The iterated auxiliary particle filter. In preparation, 2015.

[OWG15] J. Owen, D. J. Wilkinson, and C. S. Gillespie. Likelihood free inference for Markov processes: a comparison. Statistical Applications in Genetics and Molecular Biology, 2015. In press.