CS 182 Sections 103 - 104 slides created by Eva Mok ( )

advertisement
CS 182
Sections 103 - 104
slides created by Eva Mok (emok@icsi.berkeley.edu)
modified by jgm
April 13, 2005
Announcements
• a8 out, due Monday April 19th, 11:59pm
• BBS articles are assigned for the final paper:
– Arbib, Michael A. (2002). The mirror system, imitation,
and the evolution of language.
– Hurford, James R. (2003). The neural basis of predicateargument structure.
– Grush, Rick (2004). The emulation theory of
representation: motor control, imagery, and perception.
• Skim through them and let us know, as part of a8,
which article you plan to use.
Schedule
• Last Week
– Inference in Bayes Net
– Metaphor understanding using KARMA
• This Week
– Formal Grammar and Parsing
– Construction Grammar, ECG
• Next Week
– Psychological model of sentence processing
– Grammar Learning
Quiz
1. How are the source and target domains
represented in KARMA?
2. How does the source domain information enter
KARMA? How should it?
3. What does SHRUTI buy us?
4. How are bindings propagated in a structured
connectionist framework?
Quiz
1. How are the source and target domains
represented in KARMA?
2. How does the source domain information enter
KARMA? How should it?
3. What does SHRUTI buy us?
4. How are bindings propagated in a structured
connectionist framework?
KARMA
• DBN to represent
target domain
knowledge
• Metaphor maps link
target and source
domain
• X-schema to
represent source
domain knowledge
DBNs
• Explicit causal relations + full joint table  Bayes Nets
• Sequence of full joint states over time  HMM
• HMM + BN  DBNs
• DBNs are a generalization of HMMs which capture sparse causal
relationships of full joint
Dynamic Bayes Nets
Metaphor Maps
1. map entities and objects between embodied and
abstract domains
2. invariantly map the aspect of the embodied domain
event onto the target domain
by setting the evidence for the status variable based
on controller state (event structure metaphor)
3. project x-schema parameters onto the target
domain
Where does the domain knowledge
come from?
• Both domains are structured by frames
• Frames have:
– List of roles (participants, frame elements)
– Relations between roles
– Scenario structure
DBN for the target domain
Economic State
[recession,nogrowth,lowgrowth,higrowth]
Policy
[Liberalization, Protectionism]
Goal
[free trade, protection ]
Outcome
[success, failure]
Difficulty
[present, absent]
T0
T1
Let’s try a different domain
• I didn’t quite catch what he was saying
• His slides are packed with information
• He sent the audience a clear message
When we can get a good flow of information from the streets of
our cities across to, whether it is an investigating magistrate in
France or an intelligence operative in the Middle East, and
begin to assemble that kind of information and analyze it and
repackage it and send it back out to users, whether it's a
policeman on the beat or a judge in Italy or a Special Forces
Team in Afghanistan, then we will be getting close to the kind
of capability we need to deal with this kind of problem. That's
going to take a couple, a few years.
9/11 Commission Public Hearing, Monday, March 31, 2003
Target domain belief net (T-1)
Target domain belief net (T) (communication frame)
speaker
addressee
action
outcome
degree of
understanding
Metaphor Map (conduit metaphor)
send
is
talk
receive
is
hear
Ideas
are
objects
Words
are
containers
Senders
are
speakers
Receivers
are
addressees
Source domain f-structs (transfer)
sender
transfer
receiver
means
X-Schema
representation
force
rate
send
receive
pack
Quiz
1. How are the source and target domains
represented in KARMA?
2. How does the source domain information enter
KARMA? How should it?
3. What does SHRUTI buy us?
4. How are bindings propagated in a structured
connectionist framework?
How do the source domain f-structs
get parameterized?
• In the KARMA system, they are hand-coded.
• In general, you need analysis of sentences:
– syntax
– semantics
Syntax captures:
• constraints on word order
• constituency (units of words)
• grammatical relations
(e.g. subject, object)
• subcategorization & dependency
(e.g. transitive, intransitive,
subject-verb agreement)
Quiz
1. How are the source and target domains
represented in KARMA?
2. How does the source domain information enter
KARMA? How should it?
3. What does SHRUTI buy us?
4. How are bindings propagated in a structured
connectionist framework?
SHRUTI
• A connectionist model of reflexive processing
Reflexive reasoning
Reflective reasoning
automatic, extremely fast (~300ms),
ubiquitous
conscious deliberation, slow
overt consideration of alternatives
external props (pencil + paper)
• computation of coherent
explanations and predictions
• gradual learning of causal structure
• episodic memory
• understanding language
• solving logic puzzles
• doing cryptarithmetic
• planning a vacation
SHRUTI
• synchronous activity without using global clock
• An episode of reflexive processing is a transient propagation of
rhythmic activity
• An “entity” is a phase in the above rhythmic activity.
• Bindings are synchronous firings of role and entity cells
• Rules are interconnection patterns mediated by coincidence
detector circuits that allow selective propagation of activity
• Long-term memories are coincidence and coincidence-failure
detector circuits
• An affirmative answer / explanation corresponds to
reverberatory activity around closed loops
focal cluster
• provides locus of coordination, control and decision
making
• enforce sequencing and concurrency
• initiate information seeking actions
• initiate evaluation of conditions
• initiate conditional actions
• link to other schemas, knowledge structures
Quiz
1. How are the source and target domains
represented in KARMA?
2. How does the source domain information enter
KARMA? How should it?
3. What does SHRUTI buy us?
4. How are bindings propagated in a structured
connectionist framework?
dynamic binding example
entity
type
predicate
my-father
+
+e
+v
cup
?e
• asserting that
get(father, cup)
?
• father fires in phase
with agent role
?v
• cup fires in phase
with patient role
+
-
get
?
agt
pat
Active Schemas in SHRUTI
• active schemas require control and coordination,
dynamic role binding and parameter setting
• schemas are interconnected networks of focal clusters
• bindings are encoded and propagated using temporal
synchrony
• scalar parameters are encoded using rate-encoding
Review: Probability
• Random Variables
– Boolean/Discrete
• True/false
• Cloudy/rainy/sunny
– Continuous
• [0,1] (i.e. 0.0 <= x <= 1.0)
Priors/Unconditional Probability
• Probability Distribution
– In absence of any other info
– Sums to 1
– E.g. P(Sunny=T) = .8 (thus, P(Sunny=F) = .2)
• This is a simple probability distribution
• Joint Probability
– P(Sunny, Umbrella, Bike)
• Table 23 in size
– Full Joint is a joint of all variables in model
• Probability Density Function
– Continuous variables
• E.g. Uniform, Gaussian, Poisson…
Conditional Probability
• P(Y | X) is probability of Y given that all we know is
the value of X
– E.g. P(cavity=T | toothache=T) = .8
• thus P(cavity=F | toothache=T) = .2
• Product Rule
– P(Y | X) = P(X Y) / P(X)
Y
X
(normalizer to add up to 1)
Inference
Toothache
Cavity
Catch
Prob
False
False
False
.576
False
False
True
.144
False
True
False
.008
False
True
True
.072
True
False
False
.064
True
False
True
.016
True
True
False
.012
True
True
True
.108
P(Toothache=T)?
P(Toothache=T, Cavity=T)?
P(Toothache=T | Cavity=T)?
Independence
•Rainy
Cloudy
•Sunny
Windy
Bayes Nets
P(B)
Burglary
P(E)
Earthquake
0.001
Alarm
JohnCalls
A
P(J|…)
T
F
0.90
0.05
B
E
P(A|…)
T
T
F
F
T
F
T
F
0.95
0.94
0.29
0.001
MaryCalls
0.002
A
P(M|…)
T
F
0.70
0.01
Independence
X independent of Z?
No
X
Y
X
X conditionally independent of Z given Y?
Z
Z
X
X
Yes
Y
Z
Yes
Z
No
Y
Y
No
Y
X
Y
Or below
Z
X
Yes
Z
Markov Blanket
X
X is independent
of everything else given:
Parents, Children, Parents of Children
Reference: Joints
• Representation of entire network
• P(X1=x1  X2=x2  ... Xn=xn) =
P(x1, ..., xn) = i=1..n P(xi|parents(Xi))
• How? Chain Rule
– P(x1, ..., xn) = P(x1|x2, ..., xn) P(x2, ..., xn) =
... = i=1..n P(xi|xi-1, ..., x1)
– Now use conditional independences to simplify
Reference: Joint, cont.
X4
X2
X6
X1
X3
X5
P(x1, ..., x6) =
P(x1) *
P(x2|x1) *
P(x3|x2, x1)
P(x4|x3, x2,
P(x5|x4, x3,
P(x6|x5, x4,
*
x1) *
x2, x1) *
x3, x2, x1)
Reference: Joint, cont.
X4
X2
X6
X1
X3
X5
P(x1, ..., x6) =
P(x1) *
P(x2|x1) *
P(x3|x2, x1)
P(x4|x3, x2,
P(x5|x4, x3,
P(x6|x5, x4,
*
x1) *
x2, x1) *
x3, x2, x1)
Reference: Inference
• General case
– Variable Eliminate
– P(Q | E) when you have P(R, Q, E)
– P(Q | E) = ∑R P(R, Q, E) / ∑R,Q P(R, Q, E)
• ∑R P(R, Q, E) = P(Q, E)
• ∑Q P(Q, E) = P(E)
• P(Q, E) / P(E) = P(Q | E)
Inference
Toothache
Cavity
Catch
Prob
False
False
False
.576
False
False
True
.144
False
True
False
.008
False
True
True
.072
True
False
False
.064
True
False
True
.016
True
True
False
.012
True
True
True
.108
P(Toothache=T, Cavity=T)?
Inference
Toothache
Cavity
Prob
False
False
.72
False
True
0.08
True
False
0.08
True
True
0.12
Reference: Inference, cont.
X4
X2
X6
X1
X3
Q = {X1}, E = {X6}
R = X \ Q,E
P(x1, ..., x6) =
P(x1) * P(x2|x1) * P(x3|x1) *
P(x4|x2) *
P(x5|x3) * P(x6|x5, x2)
X5
P(x1, x6) = ∑x2 ∑x3 ∑x4 ∑x5 P(x1) P(x2|x1) P(x3|x1) P(x4|x2) P(x5|x3) P(x6|x5, x2)
= P(x1) ∑x2 P(x2|x1) ∑x3 P(x3|x1) ∑x4 P(x4|x2) ∑x5 P(x5|x3) P(x6|x5, x2)
= P(x1) ∑x2 P(x2|x1) ∑x3 P(x3|x1) ∑x4 P(x4|x2) m5(x2, x3)
= P(x1) ∑x2 P(x2|x1) ∑x3 P(x3|x1) m5(x2, x3) ∑x4 P(x4|x2)
= ...
Approximation Methods
• Simple
– no evidence
• Rejection
– just forget about the invalids
• Likelihood Weighting
– only valid, but not necessarily useful
• MCMC
– Best: only valid, useful, in proportion
Stochastic Simulation
P(WetGrass|Cloudy)?
Cloudy
Sprinkler
P(WetGrass|Cloudy)
= P(WetGrass  Cloudy) / P(Cloudy)
Rain
1. Repeat N times:
WetGrass
1.1. Guess Cloudy at random
1.2. For each guess of Cloudy, guess
Sprinkler and Rain, then WetGrass
2. Compute the ratio of the # runs where
WetGrass and Cloudy are True
over the # runs where Cloudy is True
Download