EECS 595 / LING 541 / SI 661
Natural Language Processing
Fall 2004
Lecture Notes #1

Introduction

Course logistics
• Instructor: Prof. Dragomir Radev (radev@umich.edu)
  Ph.D., Computer Science, Columbia University
  Formerly at IBM TJ Watson Research Center
• Times: Tuesdays 1:10-3:55 PM, in 412, West Hall
• Office hours: TBA, 3080 West Hall Connector
Course home page: http://www.si.umich.edu/~radev/NLP-fall2004

Example (from a famous movie)
Dave Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry Dave. I’m afraid I can’t do that.

Example
I saw her fall
• How many different interpretations does the above sentence have?

What is Natural Language Processing
• Natural Language Processing (NLP) is the study of the computational treatment of natural language.
• NLP draws on research in Linguistics, Theoretical Computer Science, Mathematics and Statistics, Artificial Intelligence, Psychology, etc.

Linguistics
• Knowledge about language:
– Phonetics and phonology - the study of sounds
– Morphology - the study of word components
– Syntax - the study of sentence and phrase structure
– Lexical semantics - the study of the meanings of words
– Compositional semantics - how to combine words
– Pragmatics - how to accomplish goals
– Discourse conventions - how to deal with units larger than utterances

Theoretical Computer Science
• Automata
– Deterministic and non-deterministic finite-state automata
– Push-down automata
• Grammars
– Regular grammars
– Context-free grammars
– Context-sensitive grammars
• Complexity
• Algorithms
– Dynamic programming

Mathematics and Statistics
• Probabilities
• Statistical models
• Hypothesis testing
• Linear algebra
• Optimization
• Numerical methods

Artificial Intelligence
• Logic
– First-order logic
– Predicate calculus
• Agents
– Speech acts
• Planning
• Constraint satisfaction
• Machine learning

Ambiguity
I saw her fall.
• The categories of knowledge of language can be thought of as ambiguity-resolving components
• How many different interpretations does the above sentence have?
• How can each ambiguous piece be resolved?
• Does speech input make the sentence even more ambiguous?

Time flies like an arrow.

http://edition.cnn.com/2004/WEATHER/09/03/hurricane.frances/index.html

Frances churns toward Florida
Hurricane center: Storm 'relentlessly lashing Bahamas'
Friday, September 3, 2004 Posted: 2024 GMT (0424 HKT)

MIAMI, Florida (CNN) -- Hurricane Frances moved slowly toward Florida on Friday, and the National Hurricane Center said it could gain intensity before making landfall, possibly late Saturday.

At 2 p.m. ET, the Category 3 storm was centered near the southern tip of Great Abaco in the Bahamas, 200 miles (321 kilometers) east-southeast of Florida's lower east coast, according to the National Hurricane Center.

The storm was moving toward the west-northwest at about 9 mph (15 kph). Its maximum sustained winds had dropped to 115 mph (185 kph), but forecasters said it still is "a dangerous hurricane."

Hurricanes are classified as categories 1 to 5 on the Saffir-Simpson hurricane scale. A Category 3 storm has sustained winds between 111 and 130 mph (178 and 209 kph).

The advisory said Frances was likely to make landfall in Florida in about 36 hours.

Hurricane-force winds extend 85 miles (140 kilometers) from the center of the storm, and winds of tropical storm strength (39-73 mph) extend outward up to 185 miles (295 kilometers).

Because Frances is the size of Texas -- more than twice as large as Hurricane Charley three weeks ago -- its major winds and heavy rain are expected to batter a large part of Florida well before landfall.

By Friday afternoon, parts of Florida were experiencing wind gusts as high as 39 mph -- the lower end of tropical-storm intensity.

Hurricane warnings are in effect for much of Florida's eastern coastline.
A hurricane warning means hurricane conditions are expected in the warning area within 24 hours.

Storm surge flooding of six to 14 feet above normal has been reported in the storm's path, and the hurricane center warned "rainfall amounts of seven to 12 inches -- locally as high as 20 inches -- are possible in association with Frances."

The hurricane center bulletin said Frances was "relentlessly lashing the central and western Bahamas."

A hurricane center official told CNN the storm could spend two days moving across the Florida Peninsula.

Frances has weakened slightly in the past few days, but the hurricane center advisory warned that as it moves across the warm waters of the Gulf Stream, "this could easily lead to re-intensification."

However, current forecasts predict "a 100-knot hurricane at landfall" -- meaning wind speeds of about 115 mph.

Because steering currents are expected to weaken further, Frances "will likely slow down on its way to Florida. This could delay the landfall a few more hours," the advisory said.

"Numerical guidance continues to bring the hurricane over Florida during the next two to three days."

Florida Gov. Jeb Bush said Friday that the state was taking all necessary steps to prepare for the storm.

"We are staging across -- some outside the state and some inside the state -- a massive response for this storm, and we're going to need it," Bush said in a news conference. "There's going to be a lot of work necessary to make sure that the response is massive and immediate to help people once this storm comes."

He said he has asked the governors of 17 states to waive size and weight restrictions on trucks carrying relief supplies.

His brother, President Bush, also offered support at a campaign rally Friday morning in Pennsylvania.
"Before I begin, I do know you'll join me in offering our prayers and best wishes to those in the path of Hurricane Frances," the president said. A hurricane the size of Texas Florida ordered mandatory evacuations in parts of 16 counties and voluntary evacuations in five other counties. "If you are on a barrier island or a low-lying area, and you haven't left, now is the time to do so," Governor Bush said. Florida officials said the evacuation order covers 2.5 million people. Most of them "are staying in their own community, which is exactly what they should be doing," said Bush, noting that low-lying areas were most at risk. "They've made plans to be with a loved one or a friend and they're not on the roads." People looking to flee the region clogged highways Thursday, but officials said Friday that traffic had died down. "Overall we're very, very pleased with evacuation procedures yesterday and continuing through today," said Col. Chris Knight, director of the Florida Highway Patrol. "We have no problems this morning." The Red Cross opened 82 shelters in Florida on Thursday and about 21,000 people were in them by nightfall, spokeswoman Carol Miller told CNN. The group also set up eight reception centers along the highway to help people who needed information, directions, water and maps, she said. Miller said the Red Cross was launching its largest-ever response effort to a domestic natural disaster. Airlines have canceled flights in and out of some of the major airports in Florida and the Caribbean, and are expected to adjust schedules as weather patterns change throughout the weekend. Military preparations Military officials preparing to evacuate three commands as Frances approaches. At MacDill Air Force Base in Tampa, on Florida's Gulf Coast, a military team is preparing to set up alternative headquarters facilities for the U.S. Central Command and Special Operations Command at the stadium used by the Tampa Bay Buccaneers football team. 
Central Command is responsible for running the wars in Afghanistan and Iraq, while Special Operations Command oversees 50,000 special operations forces.

Patrick Air Force Base, on the eastern coast of Florida near Melbourne, was evacuated Thursday, and the commander of a fighter wing near Miami ordered aircraft moved out of the hurricane's path. The naval air station at Jacksonville also moved aircraft out of the area.

In Miami, the headquarters of the Southern Command has closed. Command-and-control operations are being performed, but they could be moved to Davis-Monthan Air Force Base in Arizona.

The alphabet soup (NLP vs. CL vs. SP vs. HLT vs. NLE)
• NLP (Natural Language Processing)
• CL (Computational Linguistics)
• SP (Speech Processing)
• HLT (Human Language Technology)
• NLE (Natural Language Engineering)
• Other areas of research: Speech and Text Generation, Speech and Text Understanding, Information Extraction, Information Retrieval, Dialogue Processing, Inference
• Related areas: Spelling Correction, Grammar Correction, Text Summarization

Sample applications
• Speech Understanding
• Question Answering
• Machine Translation
• Text-to-speech Generation
• Text Summarization
• Dialogue Systems

Some demos
• AT&T Labs Text-To-Speech (http://www.research.att.com/projects/tts/demo.html)
• Babelfish (babelfish.altavista.com)
• OneAcross (www.oneacross.com)
• AskJeeves (www.ask.com)
• IONaut (http://www.ionaut.com:8400)
• NSIR (http://tangra.si.umich.edu/clair/NSIR/html/nsir.cgi)
• AnswerBus (www.answerbus.com)
• NewsInEssence (www.newsinessence.com)

The Turing Test
• Alan Turing: the Turing test (language as test for intelligence)
• Three participants: a computer and two humans (one is an interrogator)
• Interrogator’s goal: to tell the machine and human apart
• Machine’s goal: to fool the interrogator into believing that a person is responding
• Other human’s goal: to help the interrogator reach his goal

Q: Please write me a sonnet on the topic of the Forth
Bridge.
A: Count me out on this one. I never could write poetry.
Q: Add 34957 to 70764.
A: 105621 (after a pause)

Some brief history
• Foundational insights (40’s and 50’s): automaton (Turing), probabilities, information theory (Shannon), formal languages (Backus and Naur), noisy channel and decoding (Shannon), first systems (Davis et al., Bell Labs)
• Two camps (57-70): symbolic and stochastic. Transformational grammar (Harris, Chomsky), artificial intelligence (Minsky, McCarthy, Shannon, Rochester), automated theorem proving and problem solving (Newell and Simon), Bayesian reasoning (Mosteller and Wallace), corpus work (Kučera and Francis)

Some brief history
• Four paradigms (70-83): stochastic (IBM), logic-based (Colmerauer, Pereira and Warren, Kay, Bresnan), NLU (Winograd, Schank, Fillmore), discourse modelling (Grosz and Sidner)
• Empiricism and finite-state models redux (83-93): Kaplan and Kay (phonology and morphology), Church (syntax)
• Late years (94-03): strong integration of different techniques, different areas (including speech and IR), probabilistic models, machine learning

The state of the art and the near-term future
• World-Wide Web (WWW)
• Sample scenarios:
– generate weather reports in two languages
– teaching deaf people to speak
– translate Web pages into different languages
– speak to your appliances
– find restaurants
– answer questions
– grade essays (?)
– closed-captioning in many languages
– automatic description of a soccer game

Structure of the course
• Three major parts:
– Linguistic, mathematical, and computational background
– Computational models of morphology, syntax, semantics, discourse, pragmatics
– Applications: text generation, machine translation, information extraction, etc.
• Three major goals:
– Learn the basic principles and theoretical issues underlying natural language processing
– Learn techniques and tools used to develop practical, robust systems that can communicate with users in one or more languages
– Gain insight into many open research problems in natural language

Readings
• Speech and Language Processing (Daniel Jurafsky and James Martin), Prentice-Hall, 2000, ISBN: 0-13-095069-6
• Handouts given in class
• 1-2 chapters per week
Optional readings:
Natural Language Understanding by Allen
Foundations of Statistical Natural Language Processing by Manning and Schütze.

Grading
• Four homework assignments (40%)
• Midterm (15%)
• Final project (20%)
• Final exam (25%)
• Additional requirements for SI 761

Assignments
• (subject to change)
– Finite-state modeling, part of speech tagging, and information extraction
  • Fsmtools/lextools/JMX (Bell Labs, Penn)
– Tagging and parsing
  • Brill tagger/Charniak parser (JHU, Brown)
– Machine translation
  • GIZA++/Rewrite decoder (Aachen, JHU, ISI)
– Text generation
  • FUF/Surge (Columbia)

Syllabus
Wk  Date     Topic
1   9/7      Introduction (JM1); Linguistic Fundamentals
2   9/14     Regular Expressions and Automata (JM2)
3   9/21     Morphology and Finite-State Transducers (JM3); Word Classes and Part of Speech Tagging (JM8)
4   9/28     Context-Free Grammars for English (JM9); Parsing with Context-Free Grammars (JM10)
5   10/5     Features and Unification (JM11); Lexicalized and Probabilistic Parsing (JM12)
6   10/12    Natural Language Generation (JM20); Machine Translation (JM 21 + handout)
    10/19    NO CLASS
HW and HW due columns (row alignment not recoverable): #1, #2, #1, #3, #2

Syllabus
Wk  Date     Topic
7   10/26    Midterm
8   11/2     Natural Language Generation (JM20) (Cont’d); The Functional Unification Formalism (Handout)
9   11/9     Language and Complexity (JM13)
10  11/16    Representing Meaning (JM14)
11  11/23    Semantic Analysis (JM15); Discourse (JM18)
12  11/30    Rhetorical Analysis (Handout); Dialogue and Conversational Agents (JM19)
13  12/7&14  Project Presentations
HW and HW due columns (row alignment not recoverable): #4, #3, #4, Project due

Other
meetings
• CLAIR meeting (TBA)
• Artificial Intelligence Seminar (Tuesdays 4-5:30)
• STIET (Thursdays 4-5:30)

Projects
Each student will be responsible for designing and completing a research project that demonstrates the ability to use concepts from the class in addressing a practical problem. A significant part of the final grade will depend on the project assignment. Students can elect to do a project on an assigned topic, or to select a topic of their own. The final version of the project will be put on the World Wide Web, and will be defended in front of the class at the end of the semester (procedure TBA). In some cases (and only with instructor’s approval), students may be allowed to work in pairs when the project’s scope is significant.

Sample projects
• Noun phrase parser
• Paraphrase identification
• Question answering
• NL access to databases
• Named entity tagging
• Rhetorical parsing
• Anaphora resolution, entity cross-reference
• Document and sentence alignment
• Using bioinformatics methods
• Encyclopedia
• Information extraction
• Speech processing
• Sentence normalization
• Text summarization
• Sentence compression
• Definition extraction
• Crossword puzzle generation
• Prepositional phrase attachment
• Machine translation
• Generation
• Semi-structured document parsing
• Semantic analysis of short queries
• User-friendly summarization
• Number classification
• Domain-specific PP attachment
• Time-dependent fact extraction

Main research forums and other pointers
• Conferences: ACL/NAACL, SIGIR, AAAI/IJCAI, ANLP, Coling, HLT, EACL/NAACL, AMTA/MT Summit, ICSLP/Eurospeech
• Journals: Computational Linguistics, Natural Language Engineering, Information Retrieval, Information Processing and Management, ACM Transactions on Information Systems, ACM TALIP, ACM TSLP
• University centers: Columbia, CMU, JHU, Brown, UMass, MIT, UPenn, USC/ISI, NMSU, Michigan, Maryland, Edinburgh, Cambridge, Saarland, Sheffield, and many others
• Industrial research sites: IBM,
SRI, BBN, MITRE, MSR, (AT&T, Bell Labs, PARC)
• Startups: Language Weaver, Ask.com, LCC
• The Anthology: http://www.aclweb.org/anthology

What this course is NOT
• EECS 597 / LING 792 / SI 661 “Language and Information”, last taught in Fall of 2002, essentially an introduction to corpus-based and statistical NLP.
– Topics covered: introduction to computational linguistics, information theory, data compression and coding, N-gram models, clustering, lexicography, collocations, text summarization, information extraction, question answering, word sense disambiguation, analysis of style, and other topics.
• SI 760 “Information Retrieval”, last taught Winter 2003.
– Topics covered: information need, IR models, documents, queries, query languages, relevance, retrieval evaluation, reference collections, query expansion and relevance feedback, indexing and searching, XML retrieval, language modeling approaches, crawling the Web, hyperlink analysis, measuring the Web, similarity and clustering, social network analysis for IR, hubs and authorities, PageRank and HITS, focused crawling, relevance transfer, question answering
• An undergraduate Linguistics course such as Ling 212 “Intro to the Symbolic Analysis of Language” or Ling 320 “Programming for Linguistics and Language Studies”

Linguistic Fundamentals

Syntactic categories
• Substitution test: Nathalie likes {black | Persian | tabby | small} cats.
• Open (lexical) and closed (functional) categories: No-fly-zone, yadda yadda yadda (open); the, in (closed)

Morphology
The dog chased the yellow bird.
• Parts of speech: eight (or so) general types
• Inflection (number, person, tense…)
• Derivation (adjective-adverb, noun-verb)
• Compounding (separate words or single word)
• Part-of-speech tagging
• Morphological analysis (prefix, root, suffix, ending)

Part of speech tags
From Church (1991) - 79 tags
NN    /* singular noun */
IN    /* preposition */
AT    /* article */
NP    /* proper noun */
JJ    /* adjective */
,     /* comma */
NNS   /* plural noun */
CC    /* conjunction */
RB    /* adverb */
VB    /* un-inflected verb */
VBN   /* verb +en (taken, looked (passive,perfect)) */
VBD   /* verb +ed (took, looked (past tense)) */
CS    /* subordinating conjunction */

Jabberwocky (Lewis Carroll)
`Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

Nouns
• Nouns: dog, tree, computer, idea
• Nouns vary in number (singular, plural), gender (masculine, feminine, neuter), case (nominative, genitive, accusative, dative)
• Latin: filius (m), filia (f), filium (object)
  German: Mädchen
• Clitics (‘s)

Pronouns
• Pronouns: she, ourselves, mine
• Pronouns vary in person, gender, number, case (in English: nominative, accusative, possessive, 2nd possessive, reflexive)
Mary saw her in the mirror.
Mary saw herself in the mirror.
• Anaphors: herself, each other

Determiners and adjectives
• Articles: the, a
• Demonstratives: this, that
• Adjectives: describe properties
• Attributive and predicative adjectives
• Agreement: in gender, number
• Comparative and superlative (derivative and periphrastic)
• Positive form

Verbs
• Actions, activities, and states (throw, walk, have)
• English: four verb forms
• tenses: present, past, future
• other inflection: number, person
• gerunds and infinitive
• aspect: progressive, perfective
• voice: active, passive
• participles, auxiliaries
• irregular verbs
• French and Finnish: many more inflections than English

Other parts of speech
• Adverbs, prepositions, particles
• phrasal verbs (the plane took off, take it off)
• particles vs. prepositions (she ran up a bill/hill)
• Coordinating conjunctions: and, or, but
• Subordinating conjunctions: if, because, that, although
• Interjections: Ouch!

Phrase structure
• Constraints on word order
• Constituents: NP, PP, VP, AP
• Phrase structure grammars
[S [NP [PN Spot]] [VP [V chased] [NP [Det a] [N bird]]]]

Phrase structure
• Paradigmatic relationships (e.g., constituency)
• Syntagmatic relationships (e.g., collocations)
[S [NP That man] [VP [VBD caught] [NP the butterfly] [PP [IN with] [NP a net]]]]

Phrase-structure grammars
Peter gave Mary a book.
Mary gave Peter a book.
• Constituent order (SVO, SOV)
• imperative forms
• sentences with auxiliary verbs
• interrogative sentences
• declarative sentences
• start symbol and rewrite rules
• context-free view of language

Sample phrase-structure grammar
S   → NP VP
NP  → AT NNS
NP  → AT NN
NP  → NP PP
VP  → VP PP
VP  → VBD
VP  → VBD NP
PP  → IN NP
AT  → the
NNS → children
NNS → students
NNS → mountains
VBD → slept
VBD → ate
VBD → saw
IN  → in
IN  → of
NN  → cake

Phrase structure grammars
• Local dependencies
• Non-local dependencies
• Subject-verb agreement
The women who found the wallet were given a reward.
• wh-extraction
Should Peter buy a book?
Which book should Peter buy?
• Empty nodes

Dependency: arguments and adjuncts
Sue watched the man at the next table.
• Event + dependents (verb arguments are usually NPs)
• agent, patient, instrument, goal - semantic roles
• subject, direct object, indirect object
• transitive, intransitive, and ditransitive verbs
• active and passive voice

Subcategorization
• Arguments: subject + complements
• adjuncts vs. complements
• adjuncts are optional and describe time, place, manner…
• subordinate clauses
• subcategorization frames

Subcategorization
Subject: The children eat candy.
Object: The children eat candy.
Prepositional phrase: She put the book on the table.
Predicative adjective: We made the man angry.
Bare infinitive: She helped me walk.
To-infinitive: She likes to walk.
Participial phrase: She stopped singing that tune at the end.
That-clause: She thinks that it will rain tomorrow.
Question-form clauses: She asked me what book I was reading.

Subcategorization frames
• Intransitive verbs: The woman walked
• Transitive verbs: John loves Mary
• Ditransitive verbs: Mary gave Peter flowers
• Intransitive with PP: I rent in Paddington
• Transitive with PP: She put the book on the table
• Sentential complement: I know that she likes you
• Transitive with sentential complement: She told me that Gary is coming on Tuesday

Selectional restrictions and preferences
• Subcategorization frames capture syntactic regularities about complements
• Selectional restrictions and preferences capture semantic regularities: bark, eat

Phrase structure ambiguity
• Grammars are used for generating and parsing sentences
• Parses
• Syntactic ambiguity
• Attachment ambiguity: Our company is training workers.
• The children ate the cake with a spoon.
• High vs. low attachment
• Garden path sentences: The horse raced past the barn fell. Is the book on the table red?

Ungrammaticality vs. semantic abnormality
* Slept children the.
# Colorless green ideas sleep furiously.
# The cat barked.
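
The attachment ambiguity in "The children ate the cake with a spoon" can be made concrete by counting parse trees. The sketch below is a toy illustration only: the binary rules follow the sample phrase-structure grammar, while the lexicon entries for "a", "with", and "spoon", the omission of the unary rule VP → VBD, and the function name count_parses are assumptions introduced here.

```python
from functools import lru_cache

# Binary rules (parent, left child, right child), adapted from the sample grammar.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "AT", "NNS"),
    ("NP", "AT", "NN"),
    ("NP", "NP", "PP"),   # low attachment: the PP modifies a noun phrase
    ("VP", "VP", "PP"),   # high attachment: the PP modifies the verb phrase
    ("VP", "VBD", "NP"),
    ("PP", "IN", "NP"),
]

# Hypothetical lexicon; "a", "with", and "spoon" are not in the sample grammar.
LEXICON = {
    "the": {"AT"}, "a": {"AT"},
    "children": {"NNS"},
    "cake": {"NN"}, "spoon": {"NN"},
    "ate": {"VBD"},
    "with": {"IN"},
}


def count_parses(words, root="S"):
    """Count the distinct parse trees rooted in `root` that cover all of `words`."""

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of trees rooted in `symbol` spanning words[i:j].
        if j - i == 1:
            return 1 if symbol in LEXICON.get(words[i], ()) else 0
        total = 0
        for parent, left, right in BINARY_RULES:
            if parent != symbol:
                continue
            for k in range(i + 1, j):  # every split point of the span
                total += count(left, i, k) * count(right, k, j)
        return total

    return count(root, 0, len(words))


ambiguous = count_parses("the children ate the cake with a spoon".split())
unambiguous = count_parses("the children ate the cake".split())
```

With this toy grammar the first sentence receives one tree per attachment site (the PP under NP or under VP), while the second has a single tree. Memoizing on (symbol, i, j) is the same dynamic-programming idea that chart parsers rely on.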
Semantics and pragmatics
• Lexical semantics and compositional semantics
• Hypernyms, hyponyms, antonyms, meronyms and holonyms (part-whole relationship, tire is a meronym of car), synonyms, homonyms
• Senses of words, polysemous words
• Homophony (bass).
• Collocations: white hair, white wine
• Idioms: to kick the bucket

Discourse analysis
• Anaphoric relations:
1. Mary helped Peter get out of the car. He thanked her.
2. Mary helped the other passenger out of the car. The man had asked her for help because of his foot injury.
• Information extraction problems (entity cross-referencing)
Hurricane Hugo destroyed 20,000 Florida homes. At an estimated cost of one billion dollars, the disaster has been the most costly in the state’s history.

Pragmatics
• The study of how knowledge about the world and language conventions interact with literal meaning.
• Speech acts
• Research issues: resolution of anaphoric relations, modeling of speech acts in dialogues

Other areas of NLP
• Linguistics is traditionally divided into phonetics, phonology, morphology, syntax, semantics, and pragmatics.
• Sociolinguistics: interactions of social organization and language.
• Historical linguistics: change over time.
• Linguistic typology
• Language acquisition
• Psycholinguistics: real-time production and perception of language

Other sites
• Johns Hopkins University (Jason Eisner) http://www.cs.jhu.edu/~jason/465/
• Cornell University (Lillian Lee) http://courses.cs.cornell.edu/cs674/2002SP/
• Simon Fraser University (Anoop Sarkar) http://www.sfu.ca/~anoop/courses/CMPT-825-Fall-2003/index.html
• Stanford University (Chris Manning) http://www.stanford.edu/class/cs224n/
• JHU Summer workshop http://www.clsp.jhu.edu/ws2003/calendar/preliminary.shtml

Word classes and part-of-speech tagging

Part of speech tagging
• Problems: transport, object, discount, address
• More problems: content
• French: est, président, fils
• “Book that flight” – what is the part of speech associated with “book”?
• POS tagging: assigning parts of speech to words in a text.
• Three main techniques: rule-based tagging, stochastic tagging, transformation-based tagging

Rule-based POS tagging
• Use dictionary or FST to find all possible parts of speech
• Use disambiguation rules (e.g., ART+V)
• Typically hundreds of constraints can be designed manually

Example in French
Word        Possible tags          Correct category
<S>         ^                      beginning of sentence
La          rf b nms u             article
teneur      nfs nms                noun feminine singular
moyenne     jfs nfs v1s v2s v3s    adjective feminine singular
en          p a b                  preposition
uranium     nms                    noun masculine singular
des         p r                    preposition
rivières    nfp                    noun feminine plural
,           x                      punctuation
bien_que    cs                     subordinating conjunction
délicate    jfs                    adjective feminine singular
à           p                      preposition
calculer    v                      verb

Sample rules
BS3 BI1: A BS3 (3rd person subject personal pronoun) cannot be followed by a BI1 (1st person indirect personal pronoun). In the example “il nous faut” (we need), “il” has the tag BS3MS and “nous” has the tags [BD1P BI1P BJ1P BR1P BS1P]. The negative constraint “BS3 BI1” rules out “BI1P”, and thus leaves only 4 alternatives for the word “nous”.

N K: The tag N (noun) cannot be followed by a tag K (interrogative pronoun); an example in the test corpus would be “... fleuve qui ...” (...river, that...). Since “qui” can be tagged both as an “E” (relative pronoun) and a “K” (interrogative pronoun), the “E” will be chosen by the tagger since an interrogative pronoun cannot follow a noun (“N”).

R V: A word tagged with R (article) cannot be followed by a word tagged with V (verb): for example “l' appelle” (calls him/her). The word “appelle” can only be a verb, but “l'” can be either an article or a personal pronoun. Thus, the rule will eliminate the article tag, giving preference to the pronoun.
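
The three negative constraints above can be sketched as a small pruning loop over candidate tag sets. This is only an illustration under stated assumptions: tags are matched by prefix (so BS3 covers BS3MS), a constraint fires only when one neighbour is unambiguous, the pronoun tag BD3S for “l'” is hypothetical, and the function name apply_constraints is not taken from any real tagger.

```python
# Negative constraints "A B": a tag starting with A may not be followed
# by a tag starting with B (the three sample rules above).
CONSTRAINTS = [("BS3", "BI1"), ("N", "K"), ("R", "V")]


def apply_constraints(candidates):
    """Prune candidate tags; `candidates` is a list of (word, set_of_tags) pairs."""
    pruned = [(word, set(tags)) for word, tags in candidates]
    changed = True
    while changed:  # repeat until no constraint removes anything
        changed = False
        for (_, left), (_, right) in zip(pruned, pruned[1:]):
            for a, b in CONSTRAINTS:
                # Left word is unambiguously A, so B is impossible on the right.
                if all(t.startswith(a) for t in left):
                    drop = {t for t in right if t.startswith(b)}
                    if drop and drop != right:  # never empty a word's tag set
                        right -= drop
                        changed = True
                # Right word is unambiguously B, so A is impossible on the left.
                if all(t.startswith(b) for t in right):
                    drop = {t for t in left if t.startswith(a)}
                    if drop and drop != left:
                        left -= drop
                        changed = True
    return pruned


nous = apply_constraints([
    ("il", {"BS3MS"}),
    ("nous", {"BD1P", "BI1P", "BJ1P", "BR1P", "BS1P"}),
])
qui = apply_constraints([("fleuve", {"N"}), ("qui", {"E", "K"})])
# "BD3S" is a hypothetical tag standing in for the personal-pronoun reading of "l'".
l_apostrophe = apply_constraints([("l'", {"R", "BD3S"}), ("appelle", {"V"})])
```

Applied to the three examples, the loop leaves four tags for “nous”, only E for “qui”, and only the pronoun reading for “l'”. A production tagger would typically compile such constraints into a finite-state transducer rather than loop over word pairs.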
Stochastic POS tagging
• HMM tagger
• Pick the most likely tag for this word
• P(word|tag) * P(tag|previous n tags) – find tag sequence that maximizes the probability formula
• A bigram-based HMM tagger chooses the tag t_i for word w_i that is most probable given the previous tag t_{i-1} and the current word w_i:
• t_i = argmax_j P(t_j | t_{i-1}, w_i)
• t_i = argmax_j P(t_j | t_{i-1}) P(w_i | t_j) : HMM equation for a single tag

Example
• Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/ADV
• People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
• P(VB|TO)P(race|VB)
• P(NN|TO)P(race|NN)
• TO: to+VB (to sleep), to+NN (to school)

Example (cont’d)
• P(NN|TO) = .021
• P(VB|TO) = .34
• P(race|NN) = .00041
• P(race|VB) = .00003
• P(VB|TO)P(race|VB) = .00001
• P(NN|TO)P(race|NN) = .000009

HMM Tagging
• T = argmax_T P(T|W), where T = t_1, t_2, …, t_n
• By Bayes’ rule: P(T|W) = P(T)P(W|T)/P(W)
• Thus we are attempting to choose the sequence of tags that maximizes the rhs of the equation
• P(W) can be ignored, since it is the same for every candidate tag sequence
• P(T)P(W|T) = \prod_{i=1}^{n} P(w_i | w_1 t_1 … w_{i-1} t_{i-1} t_i) P(t_i | w_1 t_1 … w_{i-1} t_{i-1})

Transformation-based learning
• P(NN|race) = .98
• P(VB|race) = .02
• Change NN to VB when the previous tag is TO
• Types of rules:
– The preceding (following) word is tagged z
– The word two before (after) is tagged z
– One of the two preceding (following) words is tagged z
– One of the three preceding (following) words is tagged z
– The preceding word is tagged z and the following word is tagged w

Confusion matrix
       IN    JJ    NN    NNP   RB    VBD   VBN
IN     -     .2                .7
JJ     .2    -     3.3   2.1   1.7   .2    2.7
NN           8.7   -                 .2
NNP    .2    3.3   4.1   -     .2
RB     2.2   2.0   .5          -
VBD          .3    .5                -     4.4
VBN          2.8                     2.6   -
Most confusing: NN vs. NNP vs. JJ, VBD vs. VBN vs. JJ

Readings
• J&M Chapters 1, 2, 3, 8
• “What is Computational Linguistics” by Hans Uszkoreit http://www.coli.uni-sb.de/~hansu/what_is_cl.html
• Lecture notes #1
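
The single-tag HMM decision for "race" and the transformation "Change NN to VB when the previous tag is TO" from the tagging slides above can be sketched in a few lines. This is a minimal illustration rather than a full tagger: the dictionaries hold only the four probabilities quoted in the example, and the function names hmm_scores and apply_transformation are hypothetical.

```python
# Probabilities quoted in the "race" example (bigram HMM).
TRANSITION = {("TO", "VB"): 0.34, ("TO", "NN"): 0.021}         # P(tag | previous tag)
EMISSION = {("race", "VB"): 0.00003, ("race", "NN"): 0.00041}  # P(word | tag)


def hmm_scores(prev_tag, word, candidate_tags):
    """Score each candidate tag as P(tag | prev_tag) * P(word | tag)."""
    return {
        tag: TRANSITION.get((prev_tag, tag), 0.0) * EMISSION.get((word, tag), 0.0)
        for tag in candidate_tags
    }


def apply_transformation(tags, from_tag, to_tag, prev_tag):
    """Rewrite from_tag as to_tag wherever the preceding (original) tag is prev_tag."""
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


# Stochastic tagging: compare the two products for "to race".
scores = hmm_scores("TO", "race", ["VB", "NN"])
best = max(scores, key=scores.get)

# Transformation-based tagging: start from the most likely tag for each word
# ("race" is NN, since P(NN|race) = .98), then apply the learned rule.
baseline = ["TO", "NN"]
retagged = apply_transformation(baseline, "NN", "VB", "TO")
```

Both routes end up tagging "race" as VB after "to": the HMM because the VB product is the larger of the two, the transformation-based tagger because the rule corrects the baseline guess. A full HMM tagger would search over whole tag sequences (for example with the Viterbi algorithm) instead of deciding one word at a time.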