Reconnecting Computational Linguistics to Artificial Intelligence and Cognitive Science

Raymond J. Mooney
Machine Learning Group
Department of Computer Sciences
University of Texas at Austin

Current Focus of Computational Linguistics
• Well-defined problems with large annotated corpora for training and evaluating systems.
• Focus on engineering goals of achieving demonstrable, measurable performance on complex, realistic data.
• Competitions, Competitions, Competitions!
  – CoNLL Shared Tasks
  – Senseval/Semeval
  – DUC
  – MT-Eval
  – BioCreative
• Research driven by the demands of government funding agencies run by bureaucrats, not scientists.
• Research driven by the immediate needs of industry.
• Little focus on using computation to gain insight into interesting and important linguistic, psycholinguistic, cognitive science, or AI issues.

Computational Linguistics is Becoming Boring and Incremental!

CompLing and AI
• Is CompLing a part of Linguistics?
• Is CompLing a part of AI?
• Real language understanding is an AI-complete problem.
  – Requires extensive knowledge of the world to resolve ambiguities.
    • The dog is in the pen.  The ink is in the pen.
    • John ate the spaghetti with a fork.  John ate spaghetti with meatballs.
  – Requires reasoning about beliefs, goals, plans, etc.
    • Today was Jack's birthday. Penny and Janet went to the store. They were going to get presents. Janet decided to get a kite. "Don't do that," said Penny. "Jack has a kite. He will make you take it back."

AI–NLP in the 70's and early 80's
• Focused on knowledge representation and complex inference.
• Focused on story understanding and interpreting characters' goals and plans (Charniak, Schank et al.).
• Focused on dialog understanding that inferred participants' goals and plans.
• Required extensive manual symbolic knowledge engineering.
• Only worked on a small number of specially concocted illustrative examples.
• Brittle and difficult to generalize to robust, reliable performance on unseen examples.

Story Understanding
• My PhD thesis (1987) involved learning plan schemata for story understanding in the "Schankian" tradition.
• Used hand-coded symbolic knowledge to "deeply understand" short, concocted stories by understanding the plans and goals of the characters.
• GENESIS learned new plan schemata from a single example using explanation-based learning to improve its future understanding.

GENESIS Trace: Initial Schema Learning
Input: Fred is Mary's father and is a millionaire. John approached Mary and pointed a gun at her. She was wearing blue jeans. He told her if she did not get in his car then he would shoot her. He drove her to his hotel and locked her in his room. John called Fred and told him John was holding Mary captive. John told Fred if Fred gave him 250000 dollars at Trenos then John would release Mary. Fred paid him the ransom and the kidnapper released Mary. Valerie is Fred's wife and he told her that someone had kidnapped Mary.

Thematic goal achieved: John is happy that John has the $250000.
Explanation suitable for generalization.  Pruning...Generalizing...Packaging...
Creating New Schema: (CaptureBargain ?x55 ?a34 ?b9 ?c4 ?r5 ?y5 ?l11)
?b9 is a person. ?c4 is a location. ?r5 is a room. ?c4 is in ?r5. ?x55 is a character. ?b9 is free. ?x55 captures ?b9 and locks him/her in ?r5. ?a34 is a character. ?x55 contacts ?a34 and tells it that ?b9 is ?x55's captive. ?y5 is a valuable. ?x55 wants to have ?y5 more than it wants ?b9 to be ?x55's captive.
?a34 has a positive relationship with ?b9. ?a34 has ?y5. ?x55 and ?a34 carry out a bargain in which ?x55 releases ?b9 and ?a34 gives ?x55 ?y5 at ?l11.

Unknown word 'ransom' refers to ?y5 in CaptureBargain
Unknown word 'kidnapper' refers to ?x55 in CaptureBargain
Having: ?x55 captured ?b9. suggest CaptureBargain
Having: ?x55 contacted ?a34 and told it that ?b9 was ?x55's captive. suggest CaptureBargain
Having: ?x55 and ?a34 carried out a bargain in which ?x55 released ?b9 and ?a34 gave ?x55 ?y5 at ?l11. suggest CaptureBargain
Unknown word '?x55 kidnap ?b9' refers to CaptureBargain

GENESIS Trace: Question Answering
Ready for questions:
>Summarize
Mary was free. John captured Mary and locked her in a room. John contacted Fred and told him that Mary was John's captive. John wanted to have $250000 more than he wanted Mary to be John's captive. Fred had a positive relationship with Mary. Fred had the $250000. John and Fred carried out a bargain in which John released Mary and Fred gave John the $250000 at Trenos.
>Why did John approach Mary?
So John could aim the gun at Mary.
>Why did John aim the gun at Mary?
So John could threaten to shoot Mary with the gun unless Mary went from John into the car.
>Why did Mary get into the car?
Because Mary wanted not to be shot and because Mary believed that if Mary did not go from John into the car then John would shoot Mary with the gun.
>Why did John threaten to shoot Mary unless she got in the car?
So John could drive Mary to the hotel in the car.
>Why did John drive Mary to the hotel?
So John could move Mary from the hotel into the room.

GENESIS Trace: Question Answering (cont.)
>Why did John move Mary into the room?
So John could confine Mary in the room.
>Why did John confine Mary in the room?
So John could call Fred and could tell him that Mary was John's captive and so John and Fred could carry out a bargain in which John released Mary and Fred gave John the $250000 at Trenos.
>Why did John tell Fred that John was holding Mary captive?
So John and Fred could carry out a bargain in which John released Mary and Fred gave John the $250000 at Trenos.
>Why did Fred give John the money?
Because Fred believed that if Fred gave John the $250000 at Trenos then John would release Mary and because Fred wanted Mary to be free more than he wanted to have the $250000.
>Why did Fred have $250000?
Because Fred had millions of dollars.
>Why did Fred believe that John was holding Mary captive?
Because John called Fred and told him that Mary was John's captive.

GENESIS Trace: Question Answering (cont.)
>Why did Fred want Mary free?
Because Fred was Mary's father.
>Why did John release Mary?
Because John wanted to have the $250000 more than he wanted Mary to be John's captive and because John believed that if John released Mary then Fred would give John the $250000 at Trenos.

Explanation Graph of Story

GENESIS Trace: Using the Learned Schema
Input: Ted is Alice's husband. He won 100000 dollars in the lottery. Bob imprisoned Alice in his basement. Bob got 75000 dollars and released Alice.

Thematic goal achieved: Ted is happy that Ted has the $100000.
Thematic goal achieved: Bob is happy that Bob has the $75000.
Ready for questions:
>Summarize
Alice was free. Bob captured Alice and locked her in a basement. Bob contacted Ted and told him that Alice was Bob's captive. Bob wanted to have $75000 more than he wanted Alice to be Bob's captive. Ted had a positive relationship with Alice. Ted had the $75000.
Bob and Ted carried out a bargain in which Bob released Alice and Ted gave Bob the $75000.
>Why did Bob lock Alice in his basement?
So Bob could contact Ted and could tell him that Alice was Bob's captive and so Bob and Ted could carry out a bargain in which Bob released Alice and Ted gave Bob the $75000.
>Why did Bob release Alice?
Because Bob wanted to have the $75000 more than he wanted Alice to be Bob's captive and because Bob believed that if Bob released Alice then Ted would give Bob the $75000.

GENESIS Trace: Question Answering (cont.)
>How did Bob get the money?
Bob kidnapped Alice.
>Who gave Bob the money?
Ted gave Bob the $75000.
>Why did Ted give him the money?
Because Ted believed that if Ted gave Bob the $75000 then Bob would release Alice and because Ted wanted Alice to be free more than he wanted to have the $75000.
>Why did Ted want Alice free?
Because Ted was Alice's husband.
>Why did Ted believe that Bob was holding Alice captive?
Because Bob contacted Ted and told him that Alice was Bob's captive.

CogSci NLP in the 70's and early 80's
• Computational models of human syntactic parsing that explained phenomena like garden-path sentences.
  – The horse raced past the barn fell.
  – While Anna dressed the baby spit up on the bed.
• Computational models of human word-sense disambiguation that explained phenomena like semantic priming and semantic garden-path sentences.
  – The astronomer married the star.
  – The rabbi was hit on the temple.

Challenge Problem
Can one study the interesting NLP problems of the 70's using 21st century technology and rigorous methodology to develop robust, deep-understanding systems and insightful and accurate computational models of human language processing and learning?

Promising Steps
• SemEval-07 FrameNet task
  – Must identify semantic frames by disambiguating trigger words and then fill their semantic roles.
• Results on modeling human reading time (sentence processing time) using PCFGs (Levy, 2006).
• Computational Psycholinguistics workshops at COLING 2004, ACL 2005 and 2007.

Language Grounding
• The meanings of many words are grounded in our perception of the physical world: red, ball, cup, run, hit, fall, etc.
  – Symbol Grounding: Harnad (1990)
• Even many abstract words and meanings are metaphorical abstractions of terms grounded in the physical world: up, down, over, in, etc.
  – Lakoff and Johnson's (1980) Metaphors We Live By
    • It's difficult to put my ideas into words.
    • Interest in competitions is up.
• Most work in CompLing tries to represent meaning without any connection to perception or the physical world, circularly defining the meanings of words in terms of other words with no firm foundation.

Learning Language from Perceptual Context
• Children do not learn language from reading the newspaper, surfing the web, or listening to the radio.
• The natural way to learn language is to perceive language in the context of its use in the physical and social world.
• Must learn to infer the meanings of utterances from their perceptual context.
  – William James' "blooming, buzzing confusion"

Existing Work
• There is some work on learning grounded language from perceptual context:
  – Deb Roy et al.
  – Yu and Ballard
  – Siskind
• However, these researchers come from computer vision and robotics, not computational linguistics.
  – Only simple language is used, with no linguistic processing beyond finite-state HMMs.
Robots that Learn Language from Narrowly Focused Context
• Deb Roy's Toco Robot

Contextually Ambiguous Sentence Meaning
• In reality, sentences are frequently uttered in complex situations composed of numerous potential meanings.
• Assume each sentence is annotated with multiple possible meanings inferred from context.
  – Siskind (1996) uses this type of "referentially uncertain" training data to learn a semantic lexicon.

Ambiguous Supervision for Semantic Parsing
• NL training sentences annotated with a set of possible MRs, only one of which is correct.
• We artificially obfuscated a normal NL-MR corpus (GeoQuery) by adding extra distractor MRs to each training pair.
• We also created completely artificial ambiguous training data for semantic parsing.

Sample Ambig-ChildWorld Corpus
[Figure: natural-language sentences such as "Daisy gave the clock to the mouse.", "Mommy saw that Mary gave the hammer to the dog.", "The dog broke the box.", "John gave the bag to the mouse.", and "The dog threw the ball.", each linked to several candidate MRs drawn from gave(daisy, clock, mouse), ate(mouse, orange), ate(dog, apple), saw(mother, gave(mary, dog, hammer)), broke(dog, box), gave(woman, toy, mouse), gave(john, bag, mouse), threw(dog, ball), runs(dog), and saw(john, walks(man, dog)); the slide is revisited in successive animation builds.]

KRISPER
• Version of our KRISP semantic parser that learns from ambiguous supervision.
• Uses an iterative EM-like method to gradually converge on a correct meaning for each sentence (a minimal code sketch of this loop appears below, after the "Un-Enumerated Ambiguity" slide):
  1) Assume every possible meaning for a sentence is correct.
  2) Train KRISP on the resulting noisy data.
  3) Test the learned parser on each training sentence and determine the most likely MR for each sentence from the set of potential meanings.
  4) Build a new training set using the most likely meaning for each sentence and iterate.

Results on Ambig-ChildWorld Corpus

Un-Enumerated Ambiguity
• It may not always be possible to narrow the meaning of a sentence down to a relatively small number of explicitly enumerated potential meanings.
• Assuming context meaning is represented as a semantic network, sentence meaning could be assumed to be any connected subgraph of the context.
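The deck describes KRISPER only at the level of the four numbered steps above. The following is a minimal sketch of that iterative re-labeling loop, not the actual KRISP/KRISPER implementation: train_parser and score are hypothetical stand-ins for KRISP's training routine and its parse-confidence measure.

```python
def krisper_style_training(corpus, train_parser, score, max_iterations=10):
    """corpus: list of (sentence, candidate_mrs) pairs, where candidate_mrs
    is the set of possible meaning representations for the sentence."""
    # Step 1: initially assume every candidate meaning is correct.
    training_data = [(sent, mr) for sent, mrs in corpus for mr in mrs]
    parser = None

    for _ in range(max_iterations):
        # Step 2: train the base parser on the (noisy) sentence/MR pairs.
        parser = train_parser(training_data)

        # Step 3: for each training sentence, keep only the candidate MR
        # that the current parser considers most likely.
        new_data = [(sent, max(mrs, key=lambda mr: score(parser, sent, mr)))
                    for sent, mrs in corpus]

        # Step 4: if the selected meanings stopped changing we are done;
        # otherwise iterate with the disambiguated training set.
        if new_data == training_data:
            break
        training_data = new_data

    return parser
```

Committing to a single most likely MR per sentence on each round is what lets the noisy initial training set gradually collapse onto consistent sentence/meaning pairings.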
Sample Ambiguous Context
"Juvenile caresses canine."
[Figure, shown in a sequence of animation builds: a semantic network representing the perceptual context, with nodes such as Spot (isa Dog, HasColor Black), Mary (isa Child, HasColor Blonde), a Petting event, a Chewing event involving a Bone, and a Possess relation involving a Barbie (isa Doll, HasPart Hair), connected by isa/attr/obj/agent/patient/part links.]

Potential Approach
• Can employ an approach similar to WOLFIE (Thompson & Mooney, 1999) to learn a semantic lexicon from such graph-based ambiguous supervision.
• A candidate meaning for a word is the largest isomorphic common subgraph (LICS) of the conceptual context graphs of a set of sentences in which the word appears (a rough code sketch appears after the conclusions below).
• Use the candidate meanings of the words in a sentence to determine the connected subgraph of its conceptual context graph that is its most likely meaning.

Conclusions
• CompLing is becoming disconnected from AI and Cognitive Science and becoming boring.
• Many interesting and important problems that were studied in the 70's and 80's are being ignored.
• Can we start to address these issues again, using the latest CompLing methods and systems to automatically acquire the requisite knowledge and improve robustness?
• Sample challenge tasks:
  – Recognizing the goals and plans of people in narrative text or dialog participants.
  – Learning grounded language from ambiguous perceptual context.
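As referenced on the "Potential Approach" slide, here is a rough sketch of the WOLFIE-style idea of hypothesizing a word's meaning from what is common to the context graphs of the sentences containing it. To stay simple and runnable, it represents each context graph as a set of labeled triples and intersects them, a crude stand-in for a true largest isomorphic common subgraph (which requires expensive subgraph matching); the data structures and function names here are illustrative assumptions, not part of WOLFIE.

```python
from collections import defaultdict

def candidate_meanings(examples):
    """examples: list of (sentence_words, context_triples) pairs, where
    context_triples is a set of (relation, arg1, arg2) tuples extracted
    from the perceptual context of that sentence."""
    contexts_by_word = defaultdict(list)
    for words, triples in examples:
        for word in words:
            contexts_by_word[word].append(triples)

    # A word's candidate meaning is whatever relational structure is shared
    # by every context in which the word was uttered.
    meanings = {}
    for word, contexts in contexts_by_word.items():
        shared = set.intersection(*map(set, contexts))
        if shared:
            meanings[word] = shared
    return meanings

# Tiny usage example with two contexts that share a Petting event:
examples = [
    (["juvenile", "caresses", "canine"],
     {("isa", "Thing1", "Petting"), ("agent", "Thing1", "Child"),
      ("patient", "Thing1", "Dog"), ("isa", "Thing2", "Chewing")}),
    (["girl", "caresses", "puppy"],
     {("isa", "Thing1", "Petting"), ("agent", "Thing1", "Child"),
      ("patient", "Thing1", "Dog"), ("isa", "Thing2", "Possess")}),
]
print(candidate_meanings(examples)["caresses"])
```

A real implementation would keep the full graph structure and score competing subgraphs, but the intersection captures the core intuition: regularities across ambiguous contexts reveal word meanings.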