CS 182
Sections 103 - 104
slides created by:
Eva Mok (emok@icsi.berkeley.edu)
modified by jgm
April 26, 2006
Announcements
• a9 due Tuesday, May 2nd, 11:59pm
• final paper due Tuesday, May 9th, 11:59pm
• final exam Tuesday, May 9th, in class
• final review?
Schedule
• Last Week
– Constructions, ECG
– Models of language learning
• This Week
– Embodied Construction Grammar learning
– (Bayesian) psychological model of sentence processing
• Next Week
– Applications
– Wrap-Up
Quiz
1. How does the analyzer use the constructions to
parse a sentence?
2. What is the process by which new constructions are
learned from <utterance, situation> pairs?
3. What are ways to re-organize and consolidate the
current grammar?
4. What metric is used to determine when to form a
new construction?
Quiz
1. How does the analyzer use the constructions to
parse a sentence?
Analyzing “You Throw The Ball”
[Diagram: lexical constructions map FORM (sound) to MEANING (stuff).
Form side: “you”, “throw”, “the”, “ball”, “block”, with ordering
constraints t1 before t2 and t2 before t3.
Meaning side: schema Addressee, subcase of Human; schema Throw with
roles thrower and throwee; schema Ball, subcase of Object; schema Block,
subcase of Object.
The Throw-Object construction binds t2.thrower ↔ t1 and t2.throwee ↔ t3.]
I will get the cup
Here is the SemSpec (the analyzer allows the skipping of unknown
lexical items, e.g. “will”):

CauseMove1:CauseMove
  causer = {Entity4:Speaker}
  mover = {Cup1:Cup}
  direction = {Place2:Place}
  motion = Move1:Move
    mover = {Cup1:Cup}
    direction = Place2:Place
  agent = {Entity4:Speaker}
  patient = {Cup1:Cup}
Referent1:Referent
  category = {Entity4:Speaker}
  resolved-ref = {Entity4:Speaker}
Referent2:Referent
  category = {Cup1:Cup}
  resolved-ref = {Cup1:Cup}

Remember that the transitive construction takes 3 constituents, with
these meaning poles:
agt : Ref-Expr (meaning pole: Referent)
v : Cause-Motion-Verb (meaning pole: CauseMove)
obj : Ref-Expr (meaning pole: Referent)
Another way to think of the SemSpec: the Analyzer Chart

Chart Entry 0:
  Transitive-Cn1  span: [0, 4]
    processed input: i bring cup
    skipped: 1 3
    semantic coherence: 0.56
  I-Cn1  span: [0, 0]
    processed input: i
    skipped: none
    semantic coherence: 0.33
Chart Entry 2:
  Bring-Cn1  span: [2, 2]
    processed input: bring
    skipped: none
    semantic coherence: 0.91
Chart Entry 4:
  Cup-Cn1  span: [4, 4]
    processed input: cup
    skipped: none
    semantic coherence: 0.29
Analyzing in ECG
create a recognizer for each construction in the grammar
for each level j (in ascending order):
    repeat
        for each recognizer r in level j:
            for each position p of the utterance:
                initiate r starting at p
    until the number of states in the chart doesn't change
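Below is a minimal runnable Python sketch of this loop. The Recognizer
class is a toy stand-in invented for this sketch: real ECG recognizers
match constituent constructions against chart states, not literal word
sequences.

class Recognizer:
    def __init__(self, name, level, pattern):
        self.name, self.level, self.pattern = name, level, pattern

    def initiate(self, utterance, pos, chart):
        # Toy matcher: add a chart state if our pattern starts at pos.
        n = len(self.pattern)
        if tuple(utterance[pos:pos + n]) == self.pattern:
            chart.add((self.name, pos, pos + n - 1))

def analyze(utterance, recognizers):
    chart = set()
    for level in sorted({r.level for r in recognizers}):  # ascending levels
        layer = [r for r in recognizers if r.level == level]
        while True:                                       # repeat ...
            before = len(chart)
            for r in layer:
                for pos in range(len(utterance)):         # each position p
                    r.initiate(utterance, pos, chart)
            if len(chart) == before:                      # ... until the chart
                break                                     # stops growing
    return chart

words = ["i", "get", "cup"]
lexical = [Recognizer(w.capitalize() + "-Cn", 0, (w,)) for w in words]
print(sorted(analyze(words, lexical)))   # three level-0 chart states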
Recognizer for the Transitive-Cn
[Diagram: the level-2 Transitive-Cn recognizer looks for its three
constituents agt, v, obj; these are matched by lower-level recognizers,
down to the level-0 word constructions for “I”, “get”, “cup”.]
• an example of a level-1 construction is Red-Ball-Cn
• each recognizer looks for its constituents in order; the ordering
constraints on the constituents can be a partial ordering, as in the
sketch below
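A small Python sketch of checking a recognizer's ordering constraints.
Because the constraints form a partial order, only the stated "before"
pairs are tested; constituents need not be adjacent (skipped words are
allowed). The span encoding is an assumption for this sketch.

def satisfies_ordering(spans, before_pairs):
    """spans maps constituent -> (start, end); a pair (a, b) means a's
    span must end before b's span begins."""
    return all(spans[a][1] < spans[b][0] for a, b in before_pairs)

# Transitive-Cn over "I get cup": agt=[0,0], v=[1,1], obj=[2,2]
spans = {"agt": (0, 0), "v": (1, 1), "obj": (2, 2)}
print(satisfies_ordering(spans, [("agt", "v"), ("v", "obj")]))  # True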
Learning-Analysis Cycle (Chang, 2004)
[Diagram: (Utterance, Situation) pairs are Analyzed with the current
Constructions into a Semantic Specification and Constructional Analysis,
which feed the Hypothesize and Reorganize steps that update the
Constructions.]
1. Learner passes input (Utterance + Situation) and current grammar to
   Analyzer.
2. Analyzer produces SemSpec and Constructional Analysis.
3. Learner updates grammar:
   a. Hypothesize new map.
   b. Reorganize grammar (merge or compose).
   c. Reinforce (based on usage).
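A compact sketch of one pass through this cycle, with the four
components passed in as functions; analyze, hypothesize, reorganize, and
reinforce are all assumptions standing in for the pieces described above.

def learn(corpus, grammar, analyze, hypothesize, reorganize, reinforce):
    for utterance, situation in corpus:                      # step 1
        semspec, analysis = analyze(utterance, grammar)      # step 2
        new_cxn = hypothesize(semspec, analysis, situation)  # step 3a
        if new_cxn is not None:
            grammar.append(new_cxn)
        grammar = reorganize(grammar)                        # step 3b
        reinforce(grammar, analysis)                         # step 3c
    return grammar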
Quiz
2. What is the process by which new constructions are
learned from <utterance, situation> pairs?
Usage-based Language Learning
[Diagram: in comprehension, (Utterance, Situation) pairs are Analyzed
with the current Constructions into a Partial Analysis; in acquisition,
the learner Hypothesizes and Reorganizes constructions from these
analyses; in production, (Comm. Intent, Situation) pairs are used to
Generate utterances with the same Constructions.]
Basic Learning Idea
[Diagram: the SemSpec produced for an utterance accounts for meaning
relations r1 and r2, while the Context also contains a relation r3 that
the analysis leaves unaccounted for.]
• The learner’s current grammar produces a certain
analysis for an input sentence
• The context contains richer information (e.g. bindings)
that is unaccounted for in the analysis
• Find a way to account for these meaning relations by
looking for corresponding form relations (see the sketch below)
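A small sketch of this search, with a made-up encoding: meaning
relations as (schema, role, filler) triples and form relations as
(word, word) "before" pairs.

def candidate_maps(context_rels, semspec_rels, form_rels):
    """Return (form relation, meaning relation) pairs where the context
    supplies a meaning relation the SemSpec lacks and a form relation
    links the same two items."""
    unaccounted = [r for r in context_rels if r not in semspec_rels]
    return [(f, r) for r in unaccounted for f in form_rels
            if f == (r[0], r[2])]

context = [("throw", "throwee", "ball")]
print(candidate_maps(context, [], [("throw", "ball")]))
# -> [(('throw', 'ball'), ('throw', 'throwee', 'ball'))]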
Initial Single-Word Stage
[Diagram: the initial grammar contains only lexical constructions
pairing FORM (sound) with MEANING (stuff): “you” ↔ schema Addressee,
subcase of Human; “throw” ↔ schema Throw with roles thrower and throwee;
“ball” ↔ schema Ball, subcase of Object; “block” ↔ schema Block,
subcase of Object.]
New Data: “You Throw The Ball”
[Diagram: the lexical constructions analyze the new utterance, and the
SITUATION shows a throw-ball scene involving Self. Two relations go
beyond the single-word grammar: a form relation (“throw” before “ball”)
and a role-filler relation (the Throw schema’s throwee role is filled
by ball).]
Relational Mapping Scenarios
[Diagram: three ways a form relation between Af and Bf can map to a
meaning relation between Am and Bm:
1. Direct role-filler: in “throw ball”, throw.throwee ↔ ball.
2. Shared filler X: in “put ball down”, put.mover ↔ ball and
   down.tr ↔ ball.
3. Evoked schema Y: in “Nomi ball”, a possession schema supplies
   possession.possessor ↔ Nomi and possession.possessed ↔ ball.]
Quiz
3. What are ways to re-organize and consolidate the
current grammar?
Merging Similar Constructions
throw the block:    throw before block,  Throw.throwee = Block
throw the ball:     throw before ball,   Throw.throwee = Ball
throw-ing the ball: throw before -ing,   Throw.aspect = ongoing

merged THROW-OBJECT:  throw before Objectf,  THROW.throwee = Objectm
Resulting Construction
construction THROW-OBJECT
  constructional
    constituents
      t : THROW
      o : OBJECT
  form
    tf before of
  meaning
    tm.throwee ↔ om
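A sketch of the merge step in Python: two usages that differ only in the
type of one constituent generalize to the common supertype of those
types. The toy PARENT ontology and the (verb, filler-type) encoding are
assumptions for this sketch.

PARENT = {"Ball": "Object", "Block": "Object", "Object": "Entity"}

def common_supertype(a, b):
    ancestors = set()
    while a is not None:          # collect a and all its ancestors
        ancestors.add(a)
        a = PARENT.get(a)
    while b is not None and b not in ancestors:
        b = PARENT.get(b)         # walk up from b until the chains meet
    return b

def merge(cxn_a, cxn_b):
    verb_a, filler_a = cxn_a
    verb_b, filler_b = cxn_b
    if verb_a == verb_b:          # same verb, differing filler types
        return (verb_a, common_supertype(filler_a, filler_b))

print(merge(("throw", "Ball"), ("throw", "Block")))  # ('throw', 'Object')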
Composing Co-occurring Constructions
throw the ball:  throw before ball,  THROW.throwee = Ball
ball off:        ball before off,    Motion m with m.mover = Ball,
                                     m.path = Off

composed THROW-BALL-OFF:
  throw before ball, ball before off
  THROW.throwee = Ball
  Motion m with m.mover = Ball, m.path = Off
Resulting Construction
construction THROW-BALL-OFF
  constructional
    constituents
      t : THROW
      b : BALL
      o : OFF
  form
    tf before bf
    bf before of
  meaning
    evokes MOTION as m
    tm.throwee ↔ bm
    m.mover ↔ bm
    m.path ↔ om
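A sketch of the compose step, with an invented dict encoding:
constituents are concatenated, form constraints chain them with "before"
(a total order here, though in general only a partial order is
required), and meaning constraints are unioned.

def compose(cxn_a, cxn_b):
    parts = cxn_a["constituents"] + cxn_b["constituents"]
    form = [(parts[i], "before", parts[i + 1])
            for i in range(len(parts) - 1)]   # chain adjacent constituents
    return {"constituents": parts, "form": form,
            "meaning": cxn_a["meaning"] | cxn_b["meaning"]}

throw_ball = {"constituents": ["throw", "ball"],
              "meaning": {"THROW.throwee = Ball"}}
off = {"constituents": ["off"],
       "meaning": {"m.mover = Ball", "m.path = Off"}}
print(compose(throw_ball, off))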
Quiz
4. What metric is used to determine when to form a
new construction?
Minimum Description Length
• Choose grammar G to minimize cost(G|D):
– cost(G|D) = α • size(G) + β • complexity(D|G)
– Approximates Bayesian learning: cost(G|D) ≈ −log P(G|D),
so minimizing cost maximizes the posterior
• Size of grammar: size(G) ≈ −log P(G)
(smaller grammars ↔ higher prior)
– favor fewer/smaller constructions/roles; isomorphic mappings
• Complexity of data given grammar: complexity(D|G) ≈ −log P(D|G)
(simpler analyses ↔ higher likelihood)
– favor simpler analyses
(fewer, more likely constructions)
– based on derivation length + score of derivation
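Read precisely, the approximation treats each cost term as a code
length, i.e. a negative log probability. A short derivation using only
Bayes' rule:

\operatorname{cost}(G \mid D)
  \approx -\log P(G) - \log P(D \mid G)
  = -\log\bigl[P(G)\,P(D \mid G)\bigr]
  = -\log P(G \mid D) - \log P(D)

Since log P(D) does not depend on G, minimizing cost(G|D) approximately
maximizes the posterior P(G|D); α and β absorb how loosely size and
complexity track the true code lengths.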
Size Of Grammar
• Size of the grammar G is the sum of the size of each construction:
size(G) = Σ_{c ∈ G} size(c)
• Size of each construction c is:
size(c) = n_c + m_c + Σ_{e ∈ c} length(e)
where
• n_c = number of constituents in c
• m_c = number of constraints in c
• length(e) = slot chain length of element reference e
Example: The Throw-Ball Cxn
construction THROW-BALL
  constructional
    constituents
      t : THROW
      b : BALL
  form
    tf before bf
  meaning
    tm.throwee ↔ bm

size(c) = n_c + m_c + Σ_{e ∈ c} length(e)
size(THROW-BALL) = 2 + 2 + (2 + 3) = 9
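A quick Python check of this example, assuming constraints are encoded
as pairs of element references (an encoding invented for this sketch):

def length(ref):
    """Slot chain length of an element reference: 'tm.throwee' -> 2."""
    return len(ref.split("."))

def size(constituents, constraints):
    n_c = len(constituents)        # number of constituents in c
    m_c = len(constraints)         # number of constraints in c
    refs = [ref for con in constraints for ref in con]
    return n_c + m_c + sum(length(r) for r in refs)

constraints = [("tf", "bf"),           # form: tf before bf
               ("tm.throwee", "bm")]   # meaning: tm.throwee ↔ bm
print(size(["t", "b"], constraints))   # 2 + 2 + (2 + 3) = 9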
Complexity of Data Given Grammar
• Complexity of the data D given grammar G is the sum of the analysis
score of each input token d:
complexity(D|G) = Σ_{d ∈ D} score(d)
• Analysis score of each input token d is:
score(d) = Σ_{c ∈ d} ( weight_c + Σ_{r ∈ c} |type_r| ) + height_d + semfit_d
where
• c is a construction used in the analysis of d
• weight_c ≈ relative frequency of c
• |type_r| = number of ontology items of type r used
• height_d = height of the derivation graph
• semfit_d = semantic fit provided by the analyzer
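A toy Python reading of this score, with made-up numbers; in the real
system the weights, type counts, derivation height, and semantic fit
all come from the analyzer.

def score(cxn_uses, height, semfit):
    """cxn_uses: list of (weight_c, [|type_r| for each role r]) pairs."""
    return sum(w + sum(types) for w, types in cxn_uses) + height + semfit

def complexity(analyses):
    """analyses: one (cxn_uses, height, semfit) triple per input token d."""
    return sum(score(*a) for a in analyses)

token = ([(0.5, [1]), (0.9, [2])], 3, 0.56)  # two constructions, height 3
print(complexity([token]))                   # 1.5 + 2.9 + 3 + 0.56 = 7.96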
Final Remark
• The goal here is to build a cognitively plausible model of
language learning
• A very different game that one could play is unsupervised /
semi-supervised language learning using shallow or no
semantics:
– statistical NLP
– automatic extraction of syntactic structure
– automatic labeling of frame elements
• These give fairly reasonable results for tasks such as information
retrieval, but the semantic representation is very shallow