State Estimation and Kalman Filtering
CS B659, Spring 2013
Kris Hauser

Motivation
• Observing a stream of data
  • Monitoring (of people, computer systems, etc.)
  • Surveillance, tracking
  • Finance & economics
  • Science
• Questions:
  • Modeling & forecasting
  • Handling partial and noisy observations

Markov Chains
• Sequence of probabilistic state variables X0, X1, X2, …
• E.g., a robot's position, a target's position and velocity, …
• (Diagram: chain X0 → X1 → X2 → X3)
• Markov property: once X1 is observed, X0 is independent of X2, X3, …
• P(Xt|Xt-1) is known as the transition model

Inference in a Markov Chain
• Prediction: what is the probability of a future state?
• P(Xt) = Σ_{x0,…,xt-1} P(X0,…,Xt)
        = Σ_{x0,…,xt-1} P(X0) P(X1|X0) ⋯ P(Xt|Xt-1)
        = Σ_{xt-1} P(Xt|xt-1) P(xt-1)   [incremental approach]
• The distribution "blurs" over time and approaches a stationary distribution as t grows
• Hence limited prediction power; the rate of blurring is known as the mixing time

Modeling Partial Observability
• Hidden Markov Model (HMM)
• (Diagram: hidden state variables X0 → X1 → X2 → X3, with observed variables O1, O2, O3 attached to X1, X2, X3)
• P(Ot|Xt) is called the observation model (or sensor model)

Filtering
• The name comes from signal processing
• Goal: compute the probability distribution over the current state given the observations up to this point
• (Diagram: query variable Xt; observations o1:t known; earlier states unknown)
• P(Xt|o1:t) = Σ_{xt-1} P(xt-1|o1:t-1) P(Xt|xt-1, ot)
• P(Xt|xt-1, ot) = P(ot|xt-1, Xt) P(Xt|xt-1) / P(ot|xt-1) = α P(ot|Xt) P(Xt|xt-1)

Kalman Filtering
• In a nutshell:
  • Efficient probabilistic filtering in continuous state spaces
  • Linear Gaussian transition and observation models
• Ubiquitous for state tracking with noisy sensors, e.g. radar, GPS, cameras

Hidden Markov Model for Robot Localization
• Use observations + transition dynamics to get a better idea of where the robot is at time t
• Maintain a belief state bt over time: bt(x) = P(Xt=x|z1:t)
• (Diagram: hidden states X0 … X3 with observations z1, z2, z3)
• Predict – observe – predict – observe – …

Bayesian Filtering with Belief States
• Compute bt given the latest observation zt and the prior belief bt-1
• Recursive filtering equation:
  P(xt|z1:t) = Σ_{xt-1} P(xt|xt-1, z1:t) P(xt-1|z1:t-1)
             = (1/Z) P(zt|xt) Σ_{xt-1} P(xt|xt-1) P(xt-1|z1:t-1)
• In belief-state form (see the table-based sketch below):
  bt(x) = (1/Z) P(zt|Xt=x) Σ_{xt-1} P(Xt=x|xt-1) bt-1(xt-1)
  • The sum predicts P(Xt|z1:t-1) using the dynamics alone
  • Multiplying by P(zt|Xt=x) and normalizing by Z updates the prediction via the observation zt

In Continuous State Spaces…
• Continuous filtering equation:
  bt(x) = (1/Z) P(zt|Xt=x) ∫ P(Xt=x|xt-1) bt-1(xt-1) dxt-1
• How to evaluate this integral?
• How to calculate Z?
• How to even represent a belief state?
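As a concrete reference before the representational questions below, here is a minimal sketch of the discrete-state version of the filtering recursion bt(x) = (1/Z) P(zt|x) Σ_{xt-1} P(x|xt-1) bt-1(xt-1), with the belief state stored as a table (a NumPy vector). The transition matrix, sensor likelihoods, and all numbers are illustrative assumptions, not material from the lecture.

```python
import numpy as np

# Discrete Bayes filter step: b_t(x) = (1/Z) P(z_t | x) * sum_{x'} P(x | x') b_{t-1}(x')
# Belief, transition model, and observation likelihoods are plain tables.
def bayes_filter_step(belief, transition, obs_likelihood):
    """One predict + update step of the discrete filtering recursion.

    belief:          b_{t-1}, shape (n_states,)
    transition:      T[i, j] = P(X_t = j | X_{t-1} = i), shape (n_states, n_states)
    obs_likelihood:  P(z_t | X_t = j) for the observation actually received, shape (n_states,)
    """
    predicted = transition.T @ belief          # sum_{x'} P(x | x') b_{t-1}(x')   (predict)
    unnormalized = obs_likelihood * predicted  # multiply in P(z_t | x)            (observe)
    return unnormalized / unnormalized.sum()   # divide by Z                       (normalize)

# Tiny made-up example: a robot on 3 cells that tends to move right.
T = np.array([[0.7, 0.3, 0.0],
              [0.0, 0.7, 0.3],
              [0.3, 0.0, 0.7]])
b = np.array([1.0, 0.0, 0.0])           # start known to be in cell 0
likelihood = np.array([0.1, 0.8, 0.1])  # sensor says "probably cell 1"
b = bayes_filter_step(b, T, likelihood)
print(b)
```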
Key Representational Decisions
• Pick a method for representing distributions
  • Discrete: tables
  • Continuous: fixed parameterized classes vs. particle-based techniques
• Devise methods to perform key calculations (marginalization, conditioning) on the representation
  • Exact or approximate?

Gaussian Distribution
• Mean m, standard deviation s; the distribution is denoted N(m, s)
• If X ~ N(m, s), then
  P(X=x) = (1/Z) exp(-(x − m)^2 / (2 s^2))
  with normalization factor Z = sqrt(2π s^2)

Linear Gaussian Transition Model for a Moving 1D Point
• Consider position and velocity xt, vt, and time step h
• Without noise:
  xt+1 = xt + h vt
  vt+1 = vt
• With Gaussian noise of standard deviation s1:
  P(xt+1|xt) ∝ exp(-(xt+1 − (xt + h vt))^2 / (2 s1^2))
  i.e. xt+1 ~ N(xt + h vt, s1)

Linear Gaussian Transition Model
• If the prior on position is Gaussian, then the predicted (post-transition) distribution is also Gaussian
• A prior N(m, s), shifted by vh with noise s1, becomes N(m + vh, sqrt(s^2 + s1^2)) — the variances add

Linear Gaussian Observation Model
• Position observation zt with Gaussian noise of standard deviation s2: zt ~ N(xt, s2)
• If the prior on position is Gaussian, then the posterior after the observation is also Gaussian
• With position prior N(m, s) and observation z:
  posterior mean = (s^2 z + s2^2 m) / (s^2 + s2^2)
  posterior variance = s^2 s2^2 / (s^2 + s2^2)
• (A small numeric sketch combining these 1D updates appears at the end of these notes)

Multivariate Gaussians
• Multivariate analog in d-dimensional space: X ~ N(m, Σ) with mean (vector) m and covariance (matrix) Σ
• P(X=x) = (1/Z) exp(-(1/2) (x − m)^T Σ^{-1} (x − m))
  with normalization factor Z = (2π)^{d/2} |Σ|^{1/2}

Multivariate Linear Gaussian Process
• A linear transformation plus multivariate Gaussian noise: y = A x + ε, ε ~ N(m, Σ)
• If the prior state distribution is Gaussian, then the posterior state distribution is Gaussian
• If we observe one component of a Gaussian, then the posterior over the remaining components is also Gaussian

Multivariate Computations
• Linear transformations of Gaussians:
  If x ~ N(m, Σ) and y = A x + b, then y ~ N(A m + b, A Σ A^T)
• Consequence (for independent x and y):
  If x ~ N(mx, Σx), y ~ N(my, Σy), and z = x + y, then z ~ N(mx + my, Σx + Σy)
• Conditional of a Gaussian:
  If [x1, x2] ~ N([m1, m2], [Σ11, Σ12; Σ21, Σ22]),
  then on observing x2 = z we have x1 ~ N(m1 + Σ12 Σ22^{-1} (z − m2), Σ11 − Σ12 Σ22^{-1} Σ21)

Next Time
• Presentation
• Principles Ch. 9
• Rekleitis (2004)
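To make the 1D linear-Gaussian formulas above concrete, here is a minimal sketch of a complete predict–observe loop, written in terms of variances (var = s^2). The function names, the simulated sensor, and all numbers are illustrative assumptions, not material from the lecture.

```python
import numpy as np

# 1D Kalman filter for the moving-point model above, using variances
# (var = s^2) rather than standard deviations. Numbers are made up.

def predict(m, var, v, h, var1):
    """Transition x_{t+1} = x_t + h*v + noise, noise ~ N(0, var1): mean shifts, variances add."""
    return m + h * v, var + var1

def update(m, var, z, var2):
    """Observation z_t = x_t + noise, noise ~ N(0, var2): precision-weighted average."""
    post_mean = (var * z + var2 * m) / (var + var2)
    post_var = var * var2 / (var + var2)
    return post_mean, post_var

# Track a point moving at v = 1.0 with time step h = 0.1.
m, var = 0.0, 1.0                         # initial belief N(0, 1)
v, h, var1, var2 = 1.0, 0.1, 0.01, 0.25
np.random.seed(0)
true_x = 0.0
for t in range(20):
    true_x += h * v
    z = true_x + np.random.normal(0.0, np.sqrt(var2))  # simulated noisy sensor
    m, var = predict(m, var, v, h, var1)               # predict step
    m, var = update(m, var, z, var2)                   # observe step
print(m, var, true_x)
```

Each iteration applies the transition formula (mean shifts by h·v, variances add) and then the observation formula (a precision-weighted average of prediction and measurement); this is the scalar special case of the Kalman filter update described in the slides.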