Introduction to Computational Linguistics
Martha Palmer
April 19, 2006
LING 2000 - 2006

Natural Language Processing
• Machine Translation
• Predicate argument structures
• Syntactic parses
• Producing semantic representations
• Ambiguities in sentence interpretation

Machine Translation
• One of the first applications for computers
  – bilingual dictionary > word-word translation
• Good translation requires understanding!
  – War and Peace, The Sound and The Fury?
• What can we do? Sublanguages.
  – technical domains, static vocabulary
  – Meteo in Canada, Caterpillar Tractor Manuals, Botanical descriptions, Military Messages

Example translation

Translation Issues: Korean to English
- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs morphology

Common Thread
• Predicate-argument structure
  – Basic constituents of the sentence and how they are related to each other
• Constituents
  – John, Mary, the dog, pleasure, the store.
• Relations
  – Loves, feeds, go, to, bring

Abstracting away from surface structure

Transfer lexicons

Machine Translation
Lexical Choice - Word Sense Disambiguation
Iraq lost the battle.
Ilakuka centwey ciessta.
[Iraq] [battle] [lost].
John lost his computer.
John-i computer-lul ilepelyessta.
[John] [computer] [misplaced].
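The lexical choice problem above can be sketched as a toy transfer lexicon that selects the Korean verb from the semantic class of the English object. This is only an illustration under stated assumptions: the two Korean verb forms and their glosses come from the slide, while the class names ("event", "physical-object"), the noun table, and the function name choose_translation are hypothetical and not part of any real MT system.

```python
# Toy transfer lexicon for the English verb "lose" -> Korean.
# The two Korean forms are taken from the slide; the semantic classes
# and the noun table below are illustrative assumptions.

# Hypothetical semantic classes for a few English nouns.
NOUN_CLASS = {
    "battle": "event",
    "war": "event",
    "computer": "physical-object",
    "keys": "physical-object",
}

# Transfer lexicon: (English verb, class of its object) -> Korean verb form.
TRANSFER_LEXICON = {
    ("lose", "event"): "ciessta",                 # glossed [lost] on the slide
    ("lose", "physical-object"): "ilepelyessta",  # glossed [misplaced] on the slide
}


def choose_translation(verb, obj):
    """Pick a Korean verb for `verb` from the semantic class of `obj`.

    Returns None when the noun or the (verb, class) pair is unknown,
    rather than guessing a translation.
    """
    obj_class = NOUN_CLASS.get(obj)
    if obj_class is None:
        return None
    return TRANSFER_LEXICON.get((verb, obj_class))


if __name__ == "__main__":
    for obj in ("battle", "computer", "pleasure"):
        print(obj, "->", choose_translation("lose", obj))
```

Keying the lexicon on the object's class rather than on the object word itself lets one entry cover many nouns, at the cost of needing a reliable class for every noun.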

Natural Language Processing
• Syntax
  – Grammars, parsers, parse trees, dependency structures
• Semantics
  – Subcategorization frames, semantic classes, ontologies, formal semantics
• Pragmatics
  – Pronouns, reference resolution, discourse models

Syntactic Categories
• Nouns, pronouns, Proper nouns
• Verbs, intransitive verbs, transitive verbs, ditransitive verbs (subcategorization frames)
• Modifiers, Adjectives, Adverbs
• Prepositions
• Conjunctions

Syntactic Parsing
• The cat sat on the mat.
  Det Noun Verb Prep Det Noun
• Time flies like an arrow.
  Noun Verb Prep Det Noun
• Fruit flies like a banana.
  Noun Noun Verb Det Noun

Context Free Grammar
• S -> NP VP
• NP -> det (adj) N
• NP -> Proper N
• NP -> N
• VP -> V, VP -> V PP
• VP -> V NP
• VP -> V NP PP, PP -> Prep NP
• VP -> V NP NP

Parses
The cat sat on the mat
[S [NP [Det the] [N cat]] [VP [V sat] [PP [Prep on] [NP [Det the] [N mat]]]]]

Parses
Time flies like an arrow.
[S [NP [N time]] [VP [V flies] [PP [Prep like] [NP [Det an] [N arrow]]]]]

Parses
Time flies like an arrow.
[S [NP [N time] [N flies]] [VP [V like] [NP [Det an] [N arrow]]]]

Features
• C for Case, Subjective/Objective
  – She visited her.
• P for Person agreement (1st, 2nd, 3rd)
  – I like him, You like him, He likes him
• N for Number agreement, Subject/Verb
  – He likes him, They like him.
• G for Gender agreement, Subject/Verb
  – English, reflexive pronouns: He washed himself.
  – Romance languages, det/noun
• T for Tense
  – auxiliaries, sentential complements, etc.
  – * will finished is bad

Probabilistic Context Free Grammars
• Adding probabilities
• Lexicalizing the probabilities

Simple Context Free Grammar in BNF
S  → NP VP
NP → Pronoun | Noun | Det Adj Noun | NP PP
PP → Prep NP
V  → Verb | Aux Verb
VP → V | V NP | V NP NP | V NP PP | VP PP

Simple Probabilistic CFG
S  → NP VP
NP → Pronoun [0.10] | Noun [0.20] | Det Adj Noun [0.50] | NP PP [0.20]
PP → Prep NP [1.00]
V  → Verb [0.33] | Aux Verb [0.67]
VP → V [0.10] | V NP [0.40] | V NP NP [0.10] | V NP PP [0.20] | VP PP [0.20]

Simple Probabilistic Lexicalized CFG
S  → NP VP
NP → Pronoun [0.10] | Noun [0.20] | Det Adj Noun [0.50] | NP PP [0.20]
PP → Prep NP [1.00]
V  → Verb [0.33] | Aux Verb [0.67]
VP → V [0.87] | V NP [0.03] | V NP NP [0.00] | V NP PP [0.00] | VP PP [0.10]   {sleep, cry, laugh}

Simple Probabilistic Lexicalized CFG
VP → V [0.30] | V NP [0.60] | V NP NP [0.00] | V NP PP [0.00] | VP PP [0.10]   {break,split,crack..}
VP → V [0.10] | V NP [0.40] | V NP NP [0.10] | V NP PP [0.20] | VP PP [0.20]   what about leave? leave1, leave2?

Language to Logic
• John went to the book store.
  John store1, go(John, store1)
• John bought a book.
  buy(John,book1)
• John gave the book to Mary.
  give(John,book1,Mary)
• Mary put the book on the table.
  put(Mary,book1,table1)

Semantics
Same event - different sentences
John broke the window with a hammer.
John broke the window with the crack.
The hammer broke the window.
The window broke.

Same event - different syntactic frames
John broke the window with a hammer.
SUBJ VERB OBJ MODIFIER
John broke the window with the crack.
SUBJ VERB OBJ MODIFIER
The hammer broke the window.
SUBJ VERB OBJ
The window broke.
SUBJ VERB

Semantics - predicate arguments
break(AGENT, INSTRUMENT, PATIENT)
AGENT PATIENT INSTRUMENT
John broke the window with a hammer.
INSTRUMENT PATIENT
The hammer broke the window.
PATIENT
The window broke.
Fillmore 68 - The case for case

AGENT PATIENT INSTRUMENT
John broke the window with a hammer.
SUBJ OBJ MODIFIER
INSTRUMENT PATIENT
The hammer broke the window.
SUBJ OBJ
PATIENT
The window broke.
SUBJ

Canonical Representation
break (Agent: animate, Instrument: tool, Patient: physical-object)
Agent <=> subj
Instrument <=> subj, with-pp
Patient <=> obj, subj

Syntax/semantics interaction
• Parsers will produce syntactically valid parses for semantically anomalous sentences
• Lexical semantics can be used to rule them out

Headlines
• Police Begin Campaign To Run Down Jaywalkers
• Iraqi Head Seeks Arms
• Teacher Strikes Idle Kids
• Miners Refuse To Work After Death
• Juvenile Court To Try Shooting Defendant
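
The canonical representation of break, and the idea that lexical semantics can rule out anomalous readings, can be sketched roughly as follows. The frame (role, required type, allowed grammatical functions) is copied from the Canonical Representation slide; everything else is an assumption made for illustration: the small NOUN_TYPES table, the "abstract" type, the function name assign_roles, and the brute-force search over role orderings. It is not a description of any particular parser.

```python
from itertools import permutations

# Frame for "break", following the Canonical Representation slide:
# role -> (required semantic type, grammatical functions that can realize it).
BREAK_FRAME = {
    "Agent": ("animate", {"subj"}),
    "Instrument": ("tool", {"subj", "with-pp"}),
    "Patient": ("physical-object", {"obj", "subj"}),
}

# Hypothetical type lexicon (an assumption, not from the slides).
# A hammer is typed as both a tool and a physical object.
NOUN_TYPES = {
    "John": {"animate"},
    "hammer": {"tool", "physical-object"},
    "window": {"physical-object"},
    "pleasure": {"abstract"},
}


def assign_roles(functions):
    """Return every consistent role assignment for a clause headed by 'break'.

    `functions` maps a grammatical function ('subj', 'obj', 'with-pp') to
    its head noun. Each filled function must receive a distinct role whose
    type and function constraints are met. An empty list means the frame
    rejects the clause as semantically anomalous.
    """
    slots = list(functions.items())
    results = []
    # Try every way of giving the filled slots distinct roles.
    for roles in permutations(BREAK_FRAME, len(slots)):
        consistent = True
        for role, (func, noun) in zip(roles, slots):
            required_type, allowed_funcs = BREAK_FRAME[role]
            noun_types = NOUN_TYPES.get(noun, set())
            if func not in allowed_funcs or required_type not in noun_types:
                consistent = False
                break
        if consistent:
            results.append({role: noun for role, (_, noun) in zip(roles, slots)})
    return results


if __name__ == "__main__":
    clauses = [
        {"subj": "John", "obj": "window", "with-pp": "hammer"},
        {"subj": "hammer", "obj": "window"},
        {"subj": "window"},
        {"subj": "pleasure", "obj": "window"},  # anomalous: no role fits the subject
    ]
    for clause in clauses:
        print(clause, "->", assign_roles(clause))
```

Returning all consistent assignments, rather than the first one found, keeps genuine ambiguity visible: a clause with more than one result would need further context to resolve.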