CS 440 / ECE 448
Introduction to Artificial Intelligence
Spring 2010
Lecture #23
Instructor: Eyal Amir
Grad TAs: Wen Pu, Yonatan Bisk
Undergrad TAs: Sam Johnson, Nikhil Johri
Today & Thursday
• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
– Model
– Exact Reasoning
Time and Uncertainty
• Standard Bayes net model:
– Static situation
– Fixed (finite) random variables
– Graphical structure and conditional independence
• In many systems, data arrives sequentially
• Dynamic Bayes nets (DBNs) and HMMs model:
– Processes that evolve over time
Example (Robot Position)
[Figure: Bayes net over three time slices with position nodes Pos1-3, velocity nodes Vel1-3, and sensor nodes Sensor1-3.]
Robot Position
(With Observations)
[Figure: the same three-slice network with observation nodes Sens.A1-3 and Sens.B1-3 attached to the hidden Pos and Vel nodes.]
Inference Problem
• State of the system at time t:
  X_t = (Pos_t, Vel_t, Sens.A_t, Sens.B_t)
• Probability distribution over states:
  P(X_1, ..., X_t) = P(X_1) P(X_2 | X_1) ... P(X_t | X_1, ..., X_{t-1})
• A lot of parameters
Solution (Part 1)
• Problem: P(X_t | X_1, ..., X_{t-1}) conditions on the entire history
• Solution: the Markov assumption
  – Assume X_t is independent of X_1, ..., X_{t-2} given X_{t-1}
• State variables are expressive enough to summarize all relevant information about the past
• Therefore:
  P(X_1, ..., X_t) = P(X_1) P(X_2 | X_1) ... P(X_t | X_{t-1})
Solution (Part 2)
• Problem:
  – Each P(X_t | X_{t-1}) could still be a different distribution for every t
• Solution:
  – Assume all the P(X_t | X_{t-1}) are the same
  – The process is time-invariant or stationary (a small sketch follows below)
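A minimal sketch of what these two assumptions buy: with one initial distribution and one shared transition table, the joint probability of any state sequence reduces to a product of pairwise terms. The two-state weather chain and all numbers below are made-up for illustration.

# Stationary first-order Markov chain: P(x_1,...,x_T) = P(x_1) * prod_t P(x_t | x_{t-1}).
# States and probabilities are illustrative assumptions.
P_X1 = {"rain": 0.5, "sun": 0.5}                 # P(X_1)
P_trans = {                                      # P(X_t | X_{t-1}), the same table for every t
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun":  {"rain": 0.3, "sun": 0.7},
}

def joint_prob(states):
    """Joint probability of a state sequence under the Markov assumption."""
    p = P_X1[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= P_trans[prev][cur]
    return p

print(joint_prob(["rain", "rain", "sun"]))       # 0.5 * 0.7 * 0.3 = 0.105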
Inference in Robot Position DBN
• Compute distribution over true position and
velocity
– Given a sequence of sensor values
• Belief state: P(X_t | O_{1:t})
  – Probability distribution over possible states at each time step
• Update the belief state when a new set of sensor readings arrives: P(X_t | X_{t-1}, O_t)
Example
• The first-order Markov assumption is not exactly true in the real world
Example
• Possible fixes:
– Increase the order of the Markov process
– Augment the state, e.g., add Temp, Pressure, or Battery to position and velocity
Today
• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
– Model
– Exact Reasoning
• Dynamic Bayesian Networks
– Model
– Exact Reasoning
Inference Tasks
• Filtering: P(X_t | e_{1:t})
  – Belief state: probability of the current state given the evidence so far
• Prediction: P(X_{t+k} | e_{1:t}), k > 0
  – Like filtering, but without evidence for the predicted steps
• Smoothing: P(X_t | e_{1:t+k}), k > 0
  – Better estimate of past states
• Most likely explanation: arg max_{x_{1:t}} P(x_{1:t} | e_{1:t})
  – The scenario that best explains the evidence
Filtering (forward algorithm)
[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with evidence nodes E_{t-1}, E_t, E_{t+1}.]
Update:
  P(X_t | e_{1:t}) = P(X_t | e_{1:t-1}, e_t) ∝ P(X_t | e_{1:t-1}) P(e_t | X_t)
Predict (the recursive step):
  P(X_t | e_{1:t-1}) = Σ_{x_{t-1}} P(X_t | x_{t-1}) P(x_{t-1} | e_{1:t-1})
Example
  Predict: P(R_1) = Σ_{r_0} P(R_1 | r_0) P(r_0)
  Update:  P(R_1 | u_1) ∝ P(u_1 | R_1) Σ_{r_0} P(R_1 | r_0) P(r_0)
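A small Python sketch of the predict/update recursion above, in the spirit of the rain/umbrella example; the transition and sensor probabilities are made-up assumptions, not numbers from the lecture.

# Filtering (forward algorithm) for a two-state rain/umbrella model.
P_trans = {True: {True: 0.7, False: 0.3},    # P(R_t | R_{t-1})
           False: {True: 0.3, False: 0.7}}
P_umbrella = {True: 0.9, False: 0.2}         # P(u_t = True | R_t)
prior = {True: 0.5, False: 0.5}              # P(R_0)

def forward_step(belief, umbrella_seen):
    # Predict: P(R_t | u_{1:t-1}) = sum_{r} P(R_t | r) P(r | u_{1:t-1})
    predicted = {r: sum(P_trans[p][r] * belief[p] for p in belief) for r in (True, False)}
    # Update:  P(R_t | u_{1:t}) is proportional to P(u_t | R_t) P(R_t | u_{1:t-1})
    lik = {r: (P_umbrella[r] if umbrella_seen else 1.0 - P_umbrella[r]) for r in (True, False)}
    unnorm = {r: lik[r] * predicted[r] for r in (True, False)}
    z = sum(unnorm.values())
    return {r: unnorm[r] / z for r in unnorm}

belief = prior
for u in (True, True, False):                # umbrella seen on days 1 and 2, not on day 3
    belief = forward_step(belief, u)
    print(belief)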
Smoothing
P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t})
                 ∝ P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
                     forward        backward
Smoothing
Backward Step
P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t}) ∝ P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1:t} | X_k, x_{k+1}) P(x_{k+1} | X_k)
                   = Σ_{x_{k+1}} P(e_{k+1:t} | x_{k+1}) P(x_{k+1} | X_k)
                   = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k)
Most Likely Explanation
• Finding most likely path
[Figure: HMM chain X_{t-1}, X_t, X_{t+1} with evidence nodes E_{t-1}, E_t, E_{t+1}.]
• The most likely path to X_{t+1} is the most likely path to some x_t, plus one more update
Most Likely Explanation
• Finding most likely path
[Figure: the same HMM chain.]
  max_{x_1..x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
    ∝ P(e_{t+1} | X_{t+1}) max_{x_t} [ P(X_{t+1} | x_t) max_{x_1..x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) ]
  This recursion is called the Viterbi algorithm.
Viterbi (Example)
[Figure: a Viterbi trellis built up step by step across several slides.]
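A sketch of the Viterbi recursion on the same hypothetical rain/umbrella model: m[r] tracks the probability of the best path ending in state r, and back-pointers recover the most likely sequence.

# Viterbi: most likely state sequence for the illustrative rain/umbrella model.
P_trans = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
P_umbrella = {True: 0.9, False: 0.2}         # P(u = True | R)
prior = {True: 0.5, False: 0.5}
STATES = (True, False)

def lik(r, u):
    return P_umbrella[r] if u else 1.0 - P_umbrella[r]

def viterbi(evidence):
    # m[r] = max over paths ending in state r of P(path, evidence so far)
    m = {r: prior[r] * lik(r, evidence[0]) for r in STATES}
    backptr = []
    for u in evidence[1:]:
        prev, m, ptr = m, {}, {}
        for r in STATES:
            best_prev = max(STATES, key=lambda p: prev[p] * P_trans[p][r])
            ptr[r] = best_prev
            m[r] = prev[best_prev] * P_trans[best_prev][r] * lik(r, u)
        backptr.append(ptr)
    # Trace the best final state back through the stored pointers.
    best = max(STATES, key=lambda r: m[r])
    path = [best]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi([True, True, False]))          # [True, True, False] for these numbers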
Today
• Time and uncertainty
• Inference: filtering, prediction, smoothing,
MLE
• Hidden Markov Models (HMMs)
– Model
– Exact Reasoning
• Dynamic Bayesian Networks
– Model
– Exact Reasoning
Hidden Markov model (HMM)
[Figure: HMM with hidden states X_1, X_2, X_3 (the "true" state, e.g. phones/words) and noisy observations Y_1, Y_2, Y_3 (e.g. the acoustic signal).]
• Sparse transition matrix ⇒ sparse graph
• Transition matrix: P(X_t = j | X_{t-1} = i) = A(i, j)
• Observation model: P(y_t | X_t = i) = B(i, y_t); for a fixed observation y_t these likelihoods are used as a diagonal matrix in the matrix form of the updates
Forwards algorithm for HMMs
Predict:
  P(X_t | y_{1:t-1}) = Σ_{x_{t-1}} P(X_t | x_{t-1}) P(x_{t-1} | y_{1:t-1})
Update:
  P(X_t | y_{1:t}) ∝ P(y_t | X_t) P(X_t | y_{1:t-1})
Message passing view of forwards algorithm
[Figure: the forward message a_{t|t-1} is passed along the chain X_{t-1} → X_t → X_{t+1}, and the local evidence term b_t from each Y_t is multiplied in at the corresponding node.]
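The same predict/update step can be written with the A and B matrices from the HMM slide. In the sketch below (illustrative numbers for two hidden states and three observation symbols), the likelihoods B[:, y] sit on a diagonal matrix, which is presumably what the "diagonal matrix" remark refers to.

# One forward (filtering) step in matrix-vector form.
import numpy as np

A = np.array([[0.7, 0.3],        # A[i, j] = P(X_t = j | X_{t-1} = i)
              [0.3, 0.7]])
B = np.array([[0.5, 0.4, 0.1],   # B[i, y] = P(y_t = y | X_t = i)
              [0.1, 0.3, 0.6]])
alpha = np.array([0.5, 0.5])     # current belief P(X_{t-1} | y_{1:t-1})

y = 2                            # newly observed symbol
alpha = np.diag(B[:, y]) @ A.T @ alpha   # predict (A.T @ alpha), then weight by the evidence
alpha /= alpha.sum()                     # normalize to get P(X_t | y_{1:t})
print(alpha)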
Forwards-backwards algorithm
[Figure: the backward message b_t is passed from right to left along the chain X_{t-1}, X_t, X_{t+1} and combined with the forward message a_{t|t-1} and the evidence from Y_{t-1}, Y_t, Y_{t+1} at each node.]
If Have Time…
• Time and uncertainty
• Inference: filtering, prediction, smoothing
• Hidden Markov Models (HMMs)
– Model
– Exact Reasoning
• Dynamic Bayesian Networks
– Model
– Exact Reasoning
Dynamic Bayesian Network
• A DBN is like a two-time-slice BN (2TBN)
  – Uses the first-order Markov assumption
[Figure: Time 0: a standard BN for P(X_0); Time 1: a standard BN for P(X_t | X_{t-1}).]
Dynamic Bayesian Network
• Basic idea:
– Copy state and evidence for each time step
– Xt: set of unobservable (hidden) variables
(e.g.: Pos, Vel)
– Et: set of observable (evidence) variables
(e.g.: Sens.A, Sens.B)
• Notice: Time is discrete
Example
Inference in DBN
Unroll: do inference in the resulting unrolled BN
Not efficient: the cost grows with the sequence length
DBN Representation: DelC
[Figure: two-slice DBN for the DelC action, with variables RHM, M, T, L, CR, RHC at time t and t+1.]

fRHM(RHM_t, RHM_{t+1}):
  RHM_t   RHM_{t+1}=T   RHM_{t+1}=F
  T       1.0           0.0
  F       0.0           1.0

fT(T_t, T_{t+1}):
  T_t   T_{t+1}=T   T_{t+1}=F
  T     0.91        0.09
  F     0.0         1.0

fCR(L_t, CR_t, RHC_t, CR_{t+1}):
  L   CR   RHC   CR_{t+1}=T   CR_{t+1}=F
  O   T    T     0.2          0.8
  E   T    T     1.0          0.0
  O   F    T     0.0          1.0
  E   F    T     0.0          1.0
  O   T    F     1.0          0.0
  E   T    F     1.0          0.0
  O   F    F     0.0          1.0
  E   F    F     0.0          1.0
Benefits of DBN Representation
[Figure: the same two-slice DBN as on the previous slide.]
Pr(RHM_{t+1}, M_{t+1}, T_{t+1}, L_{t+1}, CR_{t+1}, RHC_{t+1} | RHM_t, M_t, T_t, L_t, CR_t, RHC_t)
  = fRHM(RHM_t, RHM_{t+1}) * fM(M_t, M_{t+1}) * fT(T_t, T_{t+1})
    * fL(L_t, L_{t+1}) * fCR(L_t, CR_t, RHC_t, CR_{t+1}) * fRHC(RHC_t, RHC_{t+1})
• Only a few parameters, vs. 25,440 entries for the full transition matrix over the 160 joint states s1, ..., s160
• Removes the global exponential dependence (a sketch of using the factored form follows below)
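A small sketch of how the factored form is used: the transition probability of a full state is just the product of the per-variable factors, so only the small CPTs need to be stored. fRHM, fT, and fCR follow the tables shown two slides back; fM, fL, and fRHC were not given, so they are stubbed out here as deterministic "persist" factors purely for illustration.

def persist(prev, new):                      # stand-in factor: the variable keeps its value
    return 1.0 if prev == new else 0.0

def fRHM(rhm, rhm1):                         # from the fRHM table
    return 1.0 if rhm == rhm1 else 0.0

def fT(t, t1):                               # from the fT table
    p_true = 0.91 if t else 0.0
    return p_true if t1 else 1.0 - p_true

def fCR(l, cr, rhc, cr1):                    # from the fCR table
    p_true = {('O', True, True): 0.2, ('E', True, True): 1.0,
              ('O', False, True): 0.0, ('E', False, True): 0.0,
              ('O', True, False): 1.0, ('E', True, False): 1.0,
              ('O', False, False): 0.0, ('E', False, False): 0.0}[(l, cr, rhc)]
    return p_true if cr1 else 1.0 - p_true

def transition(prev, new):
    # prev and new are dicts over the six state variables.
    return (fRHM(prev['RHM'], new['RHM']) * persist(prev['M'], new['M'])
            * fT(prev['T'], new['T']) * persist(prev['L'], new['L'])
            * fCR(prev['L'], prev['CR'], prev['RHC'], new['CR'])
            * persist(prev['RHC'], new['RHC']))

s = {'RHM': True, 'M': False, 'T': True, 'L': 'O', 'CR': True, 'RHC': True}
s1 = dict(s, CR=False)                       # the coffee request gets satisfied
print(transition(s, s1))                     # 0.8 * 0.91 = 0.728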