Word Sense Disambiguation
MAS.S60
Catherine Havasi
Rob Speer

Banks?
• The edge of a river – “I fished on the bank of the Mississippi.”
• A financial institution – “Bank of America failed to return my call.”
• The building that houses the financial institution – “The bank burned down last Thursday.”
• A “biological repository” – “I gave blood at the blood bank.”

Word Sense Disambiguation
• Most NLP tasks need WSD
• “Played a lot of pool last night… my bank shot is improving!”
• Usually keying to WordNet
• “I hit the ball with the bat.”

Types
• “All words” – Guess the WN synset
• Lexical subset – A small number of pre-defined words
• Coarse word sense – All words, but more intuitive senses
• IAA is 75-80% for the all-words task with WordNet; 90% for simple binary tasks

What is a Coarse Word Sense?
• How many word senses does the word “bag” have in WordNet?
– 9 noun senses, 5 verb senses
• Coarse WSD: 6 nouns, 2 verbs
• A coarse WordNet: 6,000 words (Navigli and Litkowski 2006)
• These distinctions are hard even for humans (Snyder and Palmer 2004)
– Fine-grained IAA: 72.5%
– Coarse-grained IAA: 86.4%

“Bag”: Noun
• 1. A coarse sense containing:
– bag (a flexible container with a single opening)
– bag, handbag, pocketbook, purse (a container used for carrying money and small personal items or accessories)
– bag, bagful (the quantity that a bag will hold)
– bag, traveling bag, travelling bag, grip, suitcase (a portable rectangular container for carrying clothes)
• 2. bag (the quantity of game taken in a particular period)
• 3. base, bag (a place that the runner must touch before scoring)
• 4. bag, old bag (an ugly or ill-tempered woman)
• 5. udder, bag (mammary gland of bovids (cows and sheep and goats))
• 6. cup of tea, bag, dish (an activity that you like or at which you are superior)

Frequent Ingredients
• Open Mind Word Expert
• WordNet
• eXtended WordNet (XWN)
• SemCor 3.0 (“brown1” and “brown2”)
• ConceptNet

SemCor

No training set, no problem
• Julia Hockenmaier’s “pseudoword” evaluation
• Pick two random words
– Say, “banana” and “door”
• Combine them together
– “BananaDoor”
• Replace all instances of either in your corpora with your new pseudoword
• Evaluate
• A bit easier…

The “Flip-flop” Method
• Brown et al., 1991
• Find a single feature or set of features that disambiguates the senses
– Think of the named entity recognizer

An Example

Standard Techniques
• Naïve Bayes (notice a trend; see the sketch after this slide)
– Bag of words
– Priors are based on word frequencies
• Unsupervised clustering techniques
– Expectation Maximization (EM)
– Yarowsky
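To make the Naïve Bayes bullet concrete, here is a minimal bag-of-words sketch in Python, using only the standard library. The four-sentence “bank” training corpus and the sense labels are invented for illustration; a real system would train on a sense-tagged corpus such as SemCor.

import math
from collections import Counter, defaultdict

# Tiny invented training set: (context sentence, sense label).
TRAIN = [
    ("i deposited my paycheck at the bank downtown", "bank/finance"),
    ("the bank raised its interest rates again", "bank/finance"),
    ("we fished from the muddy bank of the river", "bank/river"),
    ("the canoe drifted toward the grassy bank", "bank/river"),
]

def train(examples):
    """Estimate log-priors and add-one-smoothed log-likelihoods per sense."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, sense in examples:
        sense_counts[sense] += 1
        for word in text.split():
            word_counts[sense][word] += 1
            vocab.add(word)
    total = sum(sense_counts.values())
    model = {}
    for sense, n in sense_counts.items():
        denom = sum(word_counts[sense].values()) + len(vocab)
        model[sense] = (
            math.log(n / total),  # prior from sense frequency, as on the slide
            {w: math.log((word_counts[sense][w] + 1) / denom) for w in vocab},
            math.log(1 / denom),  # score for words unseen with this sense
        )
    return model

def disambiguate(model, context):
    """Return the sense with the highest posterior log-probability."""
    def score(sense):
        prior, likelihoods, unseen = model[sense]
        return prior + sum(likelihoods.get(w, unseen) for w in context.split())
    return max(model, key=score)

model = train(TRAIN)
print(disambiguate(model, "interest rates at the bank"))        # bank/finance
print(disambiguate(model, "sitting on the bank of the river"))  # bank/river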
Yarowsky (slides from Julia Hockenmaier)

Training Yarowsky

Using OMCS
• Created a blend using a large number of resources
• Created an ad hoc category for a word and its surroundings in the sentence
• Find which word sense is most similar to the category
• Keep the system machinery as general as possible

Adding Associations
• ConceptNet was included in two forms:
– Concept vs. feature matrices
– Concept-to-concept associations
• Associations help to represent topic areas
• If the document mentions computer-related words, expect more computer-related word senses

Constructing the Blend

Calculating the Right Sense
“I put my money in the bank”

SemEval Task 7
• 14 different systems were submitted in 2007
• Baseline: most frequent sense
• Spoiler! Our system would have placed 4th
• Top three systems:
– NUS-PT: parallel corpora with SVM (Chan et al., 2007)
– NUS-ML: Bayesian LDA with specialized features (Cai et al., 2007)
– LCC-WSD: multiple-method approach with end-to-end system and corpora (Novischi et al., 2007)

Results

Parallel Corpora
• IMVHO the “right” way to do it
• Different words have different senses in different languages
• Use parallel corpora to find those instances
– Like Europarl or UN proceedings

English and Romanian

Gold standards are overrated
• Rada Mihalcea, 2007: “Using Wikipedia for Automatic Word Sense Disambiguation”

Lab: making a simple supervised WSD classifier
• Big thanks to some guy with a blog (Jim Plush)
• Training data: Wikipedia articles surrounding “Apple” (the fruit) and “Apple Inc.”
• Test data: hand-classified tweets about apples and Apple products
• Use familiar features + Naïve Bayes to get > 90% accuracy
• Optional: use it with tweetstream to show only tweets about apples (the fruit)
• A starter sketch follows the thanks slide

Slide Thanks
• James Pustejovsky, Gerard Bakx, Julia Hockenmaier
• Manning and Schütze
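A starter sketch for the lab, assuming scikit-learn is installed. The file names apple_fruit.txt and apple_inc.txt are placeholders for Wikipedia article text you fetch yourself, the example tweets are invented, and the feature and classifier choices (bag of words + multinomial Naïve Bayes) follow the slide.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def paragraphs(path):
    """Treat each non-empty paragraph of an article as one training example."""
    with open(path, encoding="utf-8") as f:
        return [p.strip() for p in f.read().split("\n\n") if p.strip()]

# Placeholder files: save the text of the two Wikipedia articles here.
fruit = paragraphs("apple_fruit.txt")    # the "Apple" (fruit) article
company = paragraphs("apple_inc.txt")    # the "Apple Inc." article

texts = fruit + company
labels = ["fruit"] * len(fruit) + ["company"] * len(company)

# Bag-of-words features; drop tokens that appear in only one paragraph.
vectorizer = CountVectorizer(lowercase=True, min_df=2)
classifier = MultinomialNB()
classifier.fit(vectorizer.fit_transform(texts), labels)

def classify(tweet):
    """Predict which sense of 'apple' a tweet is using."""
    return classifier.predict(vectorizer.transform([tweet]))[0]

# Invented test tweets, standing in for the hand-classified test set.
print(classify("picked a bushel of apples at the orchard today"))
print(classify("the apple keynote just announced a new iphone"))

Evaluate on the hand-classified tweets to see whether you clear the 90% target; the optional tweetstream step just wraps classify() around a live stream of tweets.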