CS 182 Sections 103 - 104
Slides created by: Eva Mok (emok@icsi.berkeley.edu), modified by jgm, April 26, 2006

Announcements
• a9 due Tuesday, May 2nd, 11:59pm
• final paper due Tuesday, May 9th, 11:59pm
• final exam Tuesday, May 9th, in class
• final review?

Schedule
• Last Week
– Constructions, ECG
– Models of language learning
• This Week
– Embodied Construction Grammar learning
– (Bayesian) psychological model of sentence processing
• Next Week
– Applications
– Wrap-Up

Quiz
1. How does the analyzer use the constructions to parse a sentence?
2. What is the process by which new constructions are learned from <utterance, situation> pairs?
3. What are ways to re-organize and consolidate the current grammar?
4. What metric is used to determine when to form a new construction?

Analyzing “You Throw The Ball”
[Figure: constructional analysis linking FORM (sound) to MEANING (stuff). The forms “you”, “throw”, “the”, “ball” carry ordering constraints (t1 before t2, t2 before t3); the Throw-Object construction binds t2.thrower ↔ t1 and t2.throwee ↔ t3.]
• schema Addressee subcase of Human
• schema Throw with roles: thrower, throwee
• schema Ball subcase of Object
• schema Block subcase of Object

“I will get the cup”
• the analyzer allows the skipping of unknown lexical items (e.g. “will”)
• here is the SemSpec:
CauseMove1:CauseMove
  mover = {Cup1:Cup}
  direction = {Place2:Place}
  causer = {Entity4:Speaker}
  motion = Move1:Move
    mover = {Cup1:Cup}
    direction = {Place2:Place}
    agent = {Entity4:Speaker}
  agent = {Entity4:Speaker}
  patient = {Cup1:Cup}
Referent1:Referent
  category = {Entity4:Speaker}
  resolved-ref = {Entity4:Speaker}
Referent2:Referent
  category = {Cup1:Cup}
  resolved-ref = {Cup1:Cup}
• Remember that the transitive construction takes 3 constituents:
  agt : Ref-Expr
  v : Cause-Motion-Verb
  obj : Ref-Expr
• their meaning poles are:
  agt : Referent
  v : CauseMove
  obj : Referent

Another Way to Think of the SemSpec
[Figure: the same SemSpec drawn as a graph of schemas and bindings.]

Analyzer Chart
Chart Entry 0:
  Transitive-Cn1: span [0, 4], processed input “i bring cup”, skipped: positions 1 and 3, semantic coherence: 0.56
  I-Cn1: span [0, 0], processed input “i”, skipped: none, semantic coherence: 0.33
Chart Entry 2:
  Bring-Cn1: span [2, 2], processed input “bring”, skipped: none, semantic coherence: 0.91
Chart Entry 4:
  Cup-Cn1: span [4, 4], processed input “cup”, skipped: none, semantic coherence: 0.29

Analyzing in ECG
create a recognizer for each construction in the grammar
for each level j (in ascending order):
  repeat:
    for each recognizer r in level j:
      for each position p of the utterance:
        initiate r starting at p
  until the number of states in the chart doesn't change

Recognizer for the Transitive-Cn
[Figure: the level-2 Transitive-Cn recognizer scans for its constituents agt, v, obj; levels 1 and 0 below it cover the input “I get cup”.]
• an example of a level-1 construction is Red-Ball-Cn
• each recognizer looks for its constituents in order (the ordering constraints on the constituents can be a partial ordering)

Learning-Analysis Cycle (Chang, 2004)
[Figure: (Utterance, Situation) → Analyze → (Semantic Specification, Constructional Analysis) → Hypothesize → Reorganize → Constructions, which feed back into Analyze.]
1. Learner passes input (Utterance + Situation) and current grammar to Analyzer.
2. Analyzer produces SemSpec and Constructional Analysis.
3. Learner updates grammar:
  a. Hypothesize new map.
  b. Reorganize grammar (merge or compose).
  c. Reinforce (based on usage).

Quiz
1. How does the analyzer use the constructions to parse a sentence?
2. What is the process by which new constructions are learned from <utterance, situation> pairs?
3. What are ways to re-organize and consolidate the current grammar?
4. What metric is used to determine when to form a new construction?
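To make quiz question 1 concrete, here is a toy, runnable Python rendering of the leveled-recognizer loop from the "Analyzing in ECG" slide above. The two-level GRAMMAR, the State record, and all names are invented for illustration only; the real analyzer additionally handles partial orderings, skipping of unknown words, and semantic coherence scoring, none of which is modeled here.

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    cxn: str     # construction name
    start: int   # first input position covered
    end: int     # one past the last position covered

# Toy grammar (hypothetical): level 0 holds lexical constructions,
# level 1 holds one phrasal construction whose constituents must
# appear in the listed order.
GRAMMAR = {
    0: {"I-Cn": ["i"], "Get-Cn": ["get"], "Cup-Cn": ["cup"]},
    1: {"Transitive-Cn": ["I-Cn", "Get-Cn", "Cup-Cn"]},
}

def recognize(constituents, utterance, p, chart, level):
    """Match each constituent in order, starting at position p.
    Returns the end position on success, else None."""
    pos = p
    for c in constituents:
        if level == 0:  # lexical: match the word itself
            if pos < len(utterance) and utterance[pos] == c:
                pos += 1
            else:
                return None
        else:           # phrasal: consume a chart state for c
            matches = [s for s in chart if s.cxn == c and s.start == pos]
            if not matches:
                return None
            pos = matches[0].end  # simplification: take any match
    return pos

def analyze(utterance):
    chart = set()
    for level in sorted(GRAMMAR):        # levels in ascending order
        while True:                      # repeat until fixpoint
            before = len(chart)
            for cxn, constituents in GRAMMAR[level].items():
                for p in range(len(utterance)):  # every start position
                    end = recognize(constituents, utterance, p, chart, level)
                    if end is not None:
                        chart.add(State(cxn, p, end))
            if len(chart) == before:     # chart stopped changing
                break
    return chart

print(analyze(["i", "get", "cup"]))
# four states: the three lexical ones plus Transitive-Cn over [0, 3)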
Usage-based Language Learning
[Figure: the comprehension loop (Utterance, Situation) → Analyze → Partial Analysis feeds Acquisition (Hypothesize, Reorganize → Constructions); the production loop (Comm. Intent, Situation) → Generate → Utterance draws on the same constructions.]

Basic Learning Idea
[Figure: a SemSpec containing relations r1 and r2 beside a context containing r1, r2, and r3; relation r3 is unaccounted for.]
• The learner's current grammar produces a certain analysis for an input sentence.
• The context contains richer information (e.g. bindings) that is unaccounted for in the analysis.
• Find a way to account for these meaning relations (by looking for corresponding form relations).

Initial Single-Word Stage
[Figure: lexical constructions pairing FORM (sound) with MEANING (stuff): “you” ↔ Addressee, “throw” ↔ Throw, “ball” ↔ Ball, “block” ↔ Block.]
• schema Addressee subcase of Human
• schema Throw with roles: thrower, throwee
• schema Ball subcase of Object
• schema Block subcase of Object

New Data: “You Throw The Ball”
[Figure: the lexical constructions account for “you”, “throw”, and “ball”; the SITUATION additionally supplies a Throw event whose thrower is the Addressee (Self observes the throw-ball scene) and whose throwee is the Ball. The form relation (throw before ball) co-occurs with the role-filler relation (Throw.throwee = Ball).]

Relational Mapping Scenarios
[Figure: three schematic form-meaning configurations over constructions A = (Af, Am) and B = (Bf, Bm), each with a form relation between Af and Bf and one or more role-filler relations between Am and Bm.]
• “throw ball”: one role-filler relation: throw.throwee ↔ ball
• “put ball down”: two role-filler relations X and Y: put.mover ↔ ball, down.tr ↔ ball
• “Nomi ball”: role-filler relations through an evoked schema: possession.possessor ↔ Nomi, possession.possessed ↔ ball

Quiz
1. How does the analyzer use the constructions to parse a sentence?
2. What is the process by which new constructions are learned from <utterance, situation> pairs?
3. What are ways to re-organize and consolidate the current grammar?
4. What metric is used to determine when to form a new construction?

Merging Similar Constructions
• “throw the block”: throw before block, Throw.throwee = Block
• “throw the ball”: throw before ball, Throw.throwee = Ball
• “throw-ing the ball”: throw before -ing, Throw.aspect = ongoing
• generalization: THROW-OBJECT, with throw before Objectf and THROW.throwee = Objectm

Resulting Construction
construction THROW-OBJECT
  constructional constituents
    t : THROW
    o : OBJECT
  form
    tf before of
  meaning
    tm.throwee ↔ om

Composing Co-occurring Constructions
• “throw the ball”: throw before ball, Throw.throwee = Ball
• “ball off”: ball before off, evoking Motion m with m.mover = Ball and m.path = Off
• composition: THROW-BALL-OFF, with THROW.throwee = Ball and Motion m such that m.mover = Ball, m.path = Off

Resulting Construction
construction THROW-BALL-OFF
  constructional constituents
    t : THROW
    b : BALL
    o : OFF
  form
    tf before bf
    bf before of
  meaning
    evokes MOTION as m
    tm.throwee ↔ bm
    m.mover ↔ bm
    m.path ↔ om

Quiz
1. How does the analyzer use the constructions to parse a sentence?
2. What is the process by which new constructions are learned from <utterance, situation> pairs?
3. What are ways to re-organize and consolidate the current grammar?
4. What metric is used to determine when to form a new construction?
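As a concrete reading of the two reorganization operations above (quiz question 3), here is a small, runnable Python sketch. The Cxn record, the toy ontology, and both helper functions are illustrative stand-ins of my own, not the model's actual data structures; real merging must also align constituents across constructions and generalize their constraints, which is assumed away here.

from dataclasses import dataclass

# Toy ontology (hypothetical): each category points to its supertype.
ONTOLOGY = {"BALL": "OBJECT", "BLOCK": "OBJECT", "OBJECT": "ENTITY"}

@dataclass
class Cxn:
    name: str
    constituents: dict   # alias -> category, e.g. {"t": "THROW"}
    form: list           # ordering constraints, e.g. [("t", "before", "b")]
    meaning: list        # bindings, e.g. [("t.throwee", "b")]

def common_supertype(a, b, ontology):
    """Most specific category subsuming both a and b (None if unrelated)."""
    ancestors = set()
    while a is not None:
        ancestors.add(a)
        a = ontology.get(a)
    while b is not None and b not in ancestors:
        b = ontology.get(b)
    return b

def merge(a, b, ontology):
    """MERGE similar constructions: generalize any constituent on which
    they disagree to a common supertype (THROW-BALL + THROW-BLOCK ->
    a THROW-OBJECT-like construction). Assumes the two constructions
    share aliases and constraints."""
    merged = {
        alias: (cat_a if cat_a == b.constituents[alias]
                else common_supertype(cat_a, b.constituents[alias], ontology))
        for alias, cat_a in a.constituents.items()
    }
    return Cxn(f"MERGE({a.name},{b.name})", merged, a.form, a.meaning)

def compose(a, b):
    """COMPOSE co-occurring constructions: pool constituents and
    concatenate form and meaning constraints (THROW-BALL + BALL-OFF ->
    a THROW-BALL-OFF-like construction)."""
    return Cxn(f"COMPOSE({a.name},{b.name})",
               {**a.constituents, **b.constituents},
               a.form + b.form, a.meaning + b.meaning)

throw_ball = Cxn("THROW-BALL", {"t": "THROW", "b": "BALL"},
                 [("t", "before", "b")], [("t.throwee", "b")])
throw_block = Cxn("THROW-BLOCK", {"t": "THROW", "b": "BLOCK"},
                  [("t", "before", "b")], [("t.throwee", "b")])
print(merge(throw_ball, throw_block, ONTOLOGY).constituents)
# -> {'t': 'THROW', 'b': 'OBJECT'}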
Minimum Description Length
• Choose grammar G to minimize cost(G|D):
– cost(G|D) = α · size(G) + β · complexity(D|G)
– Approximates Bayesian learning: cost(G|D) ≈ 1/posterior probability ≈ 1/P(G|D)
• Size of grammar: size(G) ≈ 1/prior ≈ 1/P(G)
– favors fewer/smaller constructions and roles; isomorphic mappings
• Complexity of data given grammar ≈ 1/likelihood ≈ 1/P(D|G)
– favors simpler analyses (fewer, more likely constructions)
– based on derivation length + score of derivation

Size of Grammar
• Size of the grammar G is the sum of the sizes of its constructions:
  size(G) = Σ_{c ∈ G} size(c)
• Size of each construction c is:
  size(c) = n_c + m_c + Σ_{e ∈ c} length(e)
– n_c = number of constituents in c
– m_c = number of constraints in c
– length(e) = slot chain length of element reference e

Example: The Throw-Ball Cxn
construction THROW-BALL
  constructional constituents
    t : THROW
    b : BALL
  form
    tf before bf
  meaning
    tm.throwee ↔ bm

size(c) = n_c + m_c + Σ_{e ∈ c} length(e)
size(THROW-BALL) = 2 + 2 + (2 + 3) = 9
(2 constituents, 2 constraints; the form constraint references tf and bf for a total slot chain length of 2, and the meaning constraint references tm.throwee and bm for a total of 3)

Complexity of Data Given Grammar
• Complexity of the data D given grammar G is the sum of the analysis scores of the input tokens:
  complexity(D|G) = Σ_{d ∈ D} score(d)
• Analysis score of each input token d is:
  score(d) = Σ_{c ∈ d} ( weight_c + Σ_{r ∈ c} |type_r| ) + height_d + semfit_d
– c is a construction used in the analysis of d
– weight_c ≈ relative frequency of c
– |type_r| = number of ontology items of type r used
– height_d = height of the derivation graph
– semfit_d = semantic fit provided by the analyzer

Final Remark
• The goal here is to build a cognitively plausible model of language learning.
• A very different game one could play is unsupervised / semi-supervised language learning using shallow or no semantics:
– statistical NLP
– automatic extraction of syntactic structure
– automatic labeling of frame elements
• These approaches give fairly reasonable results for tasks such as information retrieval, but the semantic representation is very shallow.
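To tie the MDL slides above together, here is a minimal runnable Python sketch of the cost computation (quiz question 4). The Construction and Analysis records are simplified stand-ins of my own: in particular, the per-role ontology counts Σ_r |type_r| are collapsed into one number per analysis. The formulas follow the slides, and the worked example reproduces size(THROW-BALL) = 9.

from dataclasses import dataclass

@dataclass
class Construction:
    name: str
    constituents: list   # e.g. ["t : THROW", "b : BALL"]
    constraints: list    # element references per constraint,
                         # e.g. [["tf", "bf"], ["tm.throwee", "bm"]]
    weight: float = 1.0  # ~ relative frequency of the construction

@dataclass
class Analysis:              # one analyzed input token d
    used: list               # constructions used in the analysis
    ontology_items: int      # collapses the sum over roles of |type_r|
    height: int              # height of the derivation graph
    semfit: float            # semantic fit from the analyzer

def slot_chain_length(ref):  # "tm.throwee" -> 2, "bf" -> 1
    return len(ref.split("."))

def size_cxn(c):
    # size(c) = n_c + m_c + sum of slot-chain lengths of references
    refs = [r for constraint in c.constraints for r in constraint]
    return (len(c.constituents) + len(c.constraints)
            + sum(slot_chain_length(r) for r in refs))

def size_grammar(G):
    # size(G) = sum over constructions c in G of size(c)
    return sum(size_cxn(c) for c in G)

def score(d):
    # score(d) = sum_c (weight_c + sum_r |type_r|) + height_d + semfit_d
    return (sum(c.weight for c in d.used) + d.ontology_items
            + d.height + d.semfit)

def cost(G, D, alpha=1.0, beta=1.0):
    # cost(G|D) = alpha * size(G) + beta * complexity(D|G)
    return alpha * size_grammar(G) + beta * sum(score(d) for d in D)

throw_ball = Construction(
    "THROW-BALL",
    constituents=["t : THROW", "b : BALL"],
    constraints=[["tf", "bf"], ["tm.throwee", "bm"]],
)
print(size_cxn(throw_ball))  # 2 + 2 + (2 + 3) = 9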