CS 440 / ECE 448 Introduction to Artificial Intelligence
Spring 2010, Lecture #23
Instructor: Eyal Amir
Grad TAs: Wen Pu, Yonatan Bisk
Undergrad TAs: Sam Johnson, Nikhil Johri

Today & Thursday
• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning

Time and Uncertainty
• Standard Bayes net model:
  – Static situation
  – Fixed (finite) random variables
  – Graphical structure and conditional independence
• In many systems, data arrives sequentially
• Dynamic Bayes nets (DBNs) and HMMs model:
  – Processes that evolve over time

Example (Robot Position)
[Figure: DBN with hidden nodes Pos_1, Pos_2, Pos_3 and Vel_1, Vel_2, Vel_3, with a Sensor node attached to each time slice]

Robot Position (With Observations)
[Figure: the same DBN with observation nodes Sens.A_1..3 and Sens.B_1..3 attached to the hidden position/velocity nodes]

Inference Problem
• State of the system at time t: X_t = (Pos_t, Vel_t, Sens.A_t, Sens.B_t)
• Probability distribution over states:
  P(X_1, ..., X_t) = P(X_1) P(X_2 | X_1) ... P(X_t | X_1, ..., X_{t-1})
• A lot of parameters

Solution (Part 1)
• Problem: P(X_t | X_1, ..., X_{t-1})
• Solution: Markov Assumption
  – Assume X_t is independent of X_1, ..., X_{t-2} given X_{t-1}
• State variables are expressive enough to summarize all relevant information about the past
• Therefore:
  P(X_1, ..., X_t) = P(X_1) P(X_2 | X_1) ... P(X_t | X_{t-1})

Solution (Part 2)
• Problem:
  – All the P(X_t | X_{t-1}) could be different
• Solution:
  – Assume all P(X_t | X_{t-1}) are the same
  – The process is time-invariant or stationary

Inference in Robot Position DBN
• Compute the distribution over true position and velocity
  – Given a sequence of sensor values
• Belief state: P(X_t | O_{1:t})
  – Probability distribution over states at each time step
• Update the belief state when a new set of sensor readings arrives: P(X_t | X_{t-1}, O_t)

Example
• The first-order Markov assumption is not exactly true in the real world

Example
• Possible fixes:
  – Increase the order of the Markov process
  – Augment the state, e.g., add Temp, Pressure, or Battery to position and velocity

Today
• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning
• Dynamic Bayesian Networks
  – Model
  – Exact Reasoning

Inference Tasks
• Filtering: P(X_t | e_{1:t})
  – Belief state: probability of the state given the evidence
• Prediction: P(X_{t+k} | e_{1:t}), k > 0
  – Like filtering, without the evidence
• Smoothing: P(X_t | e_{1:t+k}), k > 0
  – Better estimate of past states
• Most likely explanation: argmax_{x_{1:t}} P(x_{1:t} | e_{1:t})
  – Scenario that explains the evidence

Filtering (forward algorithm)
[Figure: chain X_{t-1} → X_t → X_{t+1} with evidence E_{t-1}, E_t, E_{t+1}]
Update:  P(X_t | e_{1:t}) = P(X_t | e_{1:t-1}, e_t) ∝ P(X_t | e_{1:t-1}) P(e_t | X_t)
Predict: P(X_t | e_{1:t-1}) = Σ_{x_{t-1}} P(X_t | x_{t-1}) P(x_{t-1} | e_{1:t-1})
Recursive step

Example
P(R_1 | u_1) ∝ P(u_1 | R_1) Σ_{r_0} P(R_1 | r_0) P(r_0)

Smoothing
P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t}) ∝ P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
Forward–backward

Smoothing: Backward Step
P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t}) ∝ P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1:t} | X_k, x_{k+1}) P(x_{k+1} | X_k)
                   = Σ_{x_{k+1}} P(e_{k+1:t} | x_{k+1}) P(x_{k+1} | X_k)
                   = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k)

Most Likely Explanation
• Finding the most likely path
[Figure: chain X_{t-1} → X_t → X_{t+1} with evidence E_{t-1}, E_t, E_{t+1}]
• Most likely path to x_t, plus one more update

Most Likely Explanation
• Finding the most likely path
  max_{x_1..x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
    ∝ P(e_{t+1} | X_{t+1}) max_{x_t} [ P(X_{t+1} | x_t) max_{x_1..x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) ]
• Called Viterbi
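The recursion above turns directly into code. Below is a minimal sketch of Viterbi for a discrete HMM in the style of the umbrella example; the function and variable names and the toy numbers at the end are illustrative choices, not values from the lecture.

```python
import numpy as np

def viterbi(init, trans, obs_lik):
    """Most likely state sequence for a discrete HMM.

    init:    (S,) prior P(X_1)
    trans:   (S, S) transition matrix, trans[i, j] = P(X_t=j | X_{t-1}=i)
    obs_lik: (T, S) evidence terms, obs_lik[t, i] = P(e_t | X_t=i)
    """
    T, S = obs_lik.shape
    # m[t, j] = max over x_1..x_{t-1} of P(x_1, ..., x_{t-1}, X_t=j, e_1, ..., e_t)
    m = np.zeros((T, S))
    back = np.zeros((T, S), dtype=int)          # argmax (back-pointer) table
    m[0] = init * obs_lik[0]
    for t in range(1, T):
        scores = m[t - 1][:, None] * trans      # scores[i, j] = m[t-1, i] * P(j | i)
        back[t] = scores.argmax(axis=0)
        m[t] = scores.max(axis=0) * obs_lik[t]  # fold in the new evidence, as in the update above
    # Follow the back-pointers from the best final state.
    path = [int(m[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return list(reversed(path))

# Toy umbrella-style run (illustrative numbers): states 0 = rain, 1 = no rain.
init = np.array([0.5, 0.5])
trans = np.array([[0.7, 0.3], [0.3, 0.7]])
obs_lik = np.array([[0.9, 0.2], [0.9, 0.2], [0.1, 0.8]])  # P(e_t | X_t) for each observation
print(viterbi(init, trans, obs_lik))
```

The only difference from filtering is that the sum over x_{t-1} in the predict step is replaced by a max, plus the back-pointers needed to recover the path.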
Viterbi (Example)
[Figures: step-by-step worked example of the Viterbi algorithm, spread over five slides]

Today
• Time and uncertainty
• Inference: filtering, prediction, smoothing, MLE
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning
• Dynamic Bayesian Networks
  – Model
  – Exact Reasoning

Hidden Markov model (HMM)
[Figure: chain X_1 → X_2 → X_3 with observations Y_1, Y_2, Y_3]
• "True" state: e.g., phones/words
• Noisy observations: e.g., the acoustic signal
• Sparse transition matrix ⇒ sparse graph
• Transition matrix: P(X_t = j | X_{t-1} = i) = A(i, j)
• Observation model: P(y_t | X_t = i), used as a diagonal matrix B

Forwards algorithm for HMMs
Predict:
Update:
[Matrix-form filtering equations shown on the slide]

Message passing view of forwards algorithm
[Figure: chain X_{t-1} → X_t → X_{t+1} with observations Y_{t-1}, Y_t, Y_{t+1}; forward message a_{t|t-1} passed along the chain, local evidence messages b_t from the observations]

Forwards-backwards algorithm
[Figure: the same chain, with the forward message a_{t|t-1} and backward messages b_t]

If Have Time…
• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
  – Model
  – Exact Reasoning
• Dynamic Bayesian Networks
  – Model
  – Exact Reasoning

Dynamic Bayesian Network
• A DBN is like a two-time-slice BN
  – Using the first-order Markov assumption
[Figure: Time 0: a standard BN defining P(X_0); Time 1: a standard BN defining P(X_t | X_{t-1})]

Dynamic Bayesian Network
• Basic idea:
  – Copy the state and evidence variables for each time step
  – X_t: set of unobservable (hidden) variables (e.g., Pos, Vel)
  – E_t: set of observable (evidence) variables (e.g., Sens.A, Sens.B)
• Notice: time is discrete

Example
[Figure]

Inference in DBN
• Unroll the network and do inference in the resulting BN
• Not efficient (cost depends on the sequence length)

DBN Representation: DelC
[Figure: two-time-slice network over RHM, M, T, L, CR, RHC, with arcs from slice t to slice t+1]

fRHM(RHM_t, RHM_{t+1}):
  RHM_t   RHM_{t+1}=T   RHM_{t+1}=F
  T       1.0           0.0
  F       0.0           1.0

fT(T_t, T_{t+1}):
  T_t   T_{t+1}=T   T_{t+1}=F
  T     0.91        0.09
  F     0.0         1.0

fCR(L_t, CR_t, RHC_t, CR_{t+1}):
  L   CR   RHC   CR_{t+1}=T   CR_{t+1}=F
  O   T    T     0.2          0.8
  E   T    T     1.0          0.0
  O   F    T     0.0          1.0
  E   F    T     0.0          1.0
  O   T    F     1.0          0.1
  E   T    F     1.0          0.0
  O   F    F     0.0          1.0
  E   F    F     0.0          1.0

Benefits of DBN Representation
Pr(RHM_{t+1}, M_{t+1}, T_{t+1}, L_{t+1}, CR_{t+1}, RHC_{t+1} | RHM_t, M_t, T_t, L_t, CR_t, RHC_t)
  = fRHM(RHM_t, RHM_{t+1}) * fM(M_t, M_{t+1}) * fT(T_t, T_{t+1}) * fL(L_t, L_{t+1}) * fCR(L_t, CR_t, RHC_t, CR_{t+1}) * fRHC(RHC_t, RHC_{t+1})
• Only a few parameters, vs. 25,440 for the explicit transition matrix over states s_1, ..., s_160
  [Table residue: fragment of the full 160 × 160 state-to-state transition matrix]
• Removes the global exponential dependence
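To make the benefit above concrete, here is a small sketch that evaluates the transition probability of a factored, DelC-style model as a product of per-variable factors instead of indexing one 160 × 160 matrix. The helper functions and the dictionary-based state encoding are my own illustrative choices; the factor values loosely echo the tables above but should be treated as placeholders.

```python
# Factored transition model: one small factor per next-slice variable,
# instead of one joint matrix over all states.
# Values loosely follow the fRHM / fT / fCR tables above, but are placeholders.

def f_RHM(rhm_t):                    # P(RHM_{t+1} = True | RHM_t)
    return 1.0 if rhm_t else 0.0

def f_T(t_t):                        # P(T_{t+1} = True | T_t)
    return 0.91 if t_t else 0.0

def f_CR(l_t, cr_t, rhc_t):          # P(CR_{t+1} = True | L_t, CR_t, RHC_t)
    if cr_t and rhc_t and l_t == "O":
        return 0.2                   # a pending request is likely served in the office
    return 1.0 if cr_t else 0.0      # otherwise a request persists; none appears

def transition_prob(prev, nxt):
    """Pr(nxt | prev) as a product of per-variable factors (cf. the product above)."""
    p = 1.0
    for name, prob_true in (("RHM", f_RHM(prev["RHM"])),
                            ("T",   f_T(prev["T"])),
                            ("CR",  f_CR(prev["L"], prev["CR"], prev["RHC"]))):
        p *= prob_true if nxt[name] else 1.0 - prob_true
    return p

prev = {"RHM": True, "T": True, "L": "O", "CR": True, "RHC": True}
nxt  = {"RHM": True, "T": True, "CR": False}
print(transition_prob(prev, nxt))    # 1.0 * 0.91 * 0.8 = 0.728
```

The factored form needs only the handful of entries in these small tables rather than the 25,440 entries of the explicit state-to-state matrix, which is what "removes global exponential dependence" refers to.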
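Finally, a minimal sketch of the forwards–backwards recursions from the HMM slides above, for a discrete HMM with transition matrix A and the evidence terms playing the role of the diagonal B_t. The function and variable names are mine, and the per-step normalization is just one standard way to avoid underflow.

```python
import numpy as np

def forward_backward(init, A, obs_lik):
    """Smoothed marginals P(X_t | e_{1:T}) for a discrete HMM.

    init:    (S,) prior P(X_1)
    A:       (S, S) transition matrix, A[i, j] = P(X_t=j | X_{t-1}=i)
    obs_lik: (T, S) evidence terms, obs_lik[t, i] = P(e_t | X_t=i)
    """
    T, S = obs_lik.shape
    alpha = np.zeros((T, S))                 # filtered: P(X_t | e_{1:t})
    beta = np.ones((T, S))                   # backward: proportional to P(e_{t+1:T} | X_t)

    # Forward pass: predict with A, update with the evidence, normalize.
    alpha[0] = init * obs_lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * obs_lik[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass: P(e_{k+1:T} | X_k) = sum_j P(X_{k+1}=j | X_k) P(e_{k+1} | j) beta_{k+1}[j]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (obs_lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()             # rescale only to avoid numerical underflow

    gamma = alpha * beta                     # combine forward and backward messages
    return gamma / gamma.sum(axis=1, keepdims=True)
```

The forward loop is exactly the predict/update recursion from the filtering slide, and the backward loop is the backward-step sum from the smoothing slide, written with matrix products.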