INTRODUCTION TO ARTIFICIAL INTELLIGENCE
Massimo Poesio
LAB 9: WORDNET

COMMONSENSE KNOWLEDGE SOURCES FOR AI / NLP
• There are now several sources of commonsense knowledge that we can use to study its role in reasoning and to develop systems able to use commonsense knowledge
• The best known is WordNet, a lexical database based on semantic networks, developed by George Miller and his collaborators at Princeton

A LEXICAL RESOURCE BUILT ON SEMANTIC NETWORK PRINCIPLES
• WordNet is a LEXICAL DATABASE created at Princeton
  – Freely available for research from the Princeton site
• It contains information about a variety of SEMANTIC RELATIONS
• Three sub-databases (supported by psychological research as early as (Fillenbaum and Jones, 1965))
  – NOUNS
  – VERBS
  – ADJECTIVES and ADVERBS
• Each database is organized around SYNSETS

2004/05 ANLE

The noun database
• About 90,000 forms, 116,000 senses
• Relations:

  hypernym     breakfast -> meal
  hyponym      meal -> lunch
  has-member   faculty -> professor
  member-of    copilot -> crew
  has-part     table -> leg
  part-of      course -> meal
  antonym      leader -> follower

USING WN ONLINE
• http://wordnetweb.princeton.edu/perl/webwn
• Example: CELL PHONE

EXERCISE 1
• Find the entry for ROBOT

LEXICAL RELATIONS IN WORDNET
• WordNet contains information about lexical relations between MEANINGS (more on meanings below)
• The type of lexical relation depends on the type of lexical entry

LEXICAL RELATIONS FOR NOUNS
• ISA (hypernymy)
• PART-OF (meronymy)

TAXONOMIC INFORMATION IN WORDNET
• WordNet is a very rich source of taxonomic information
• This information can be found by following HYPERNYMY links

Hypernyms
2 senses of robin
Sense 1
robin, redbreast, robin redbreast, Old World robin, Erithacus rubecola -- (small Old World songbird with a reddish breast)
  => thrush -- (songbirds characteristically having brownish upper plumage with a spotted breast)
    => oscine, oscine bird -- (passerine bird having specialized vocal apparatus)
      => passerine, passeriform bird -- (perching birds mostly small and living near the ground with feet having 4 toes arranged to allow for gripping the perch; most are songbirds; hatchlings are helpless)
        => bird -- (warm-blooded egg laying vertebrates characterized by feathers and forelimbs modified as wings)
          => vertebrate, craniate -- (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
            => chordate -- (any animal of the phylum Chordata having a notochord or spinal column)
              => animal, animate being, beast, brute, creature, fauna -- (a living organism characterized by voluntary movement)
                => organism, being -- (a living thing that has (or can develop) the ability to act or function independently)
                  => living thing, animate thing -- (a living (or once living) entity)
                    => object, physical object --
                      => entity, physical thing --

EXERCISE 2
• Find the hypernyms of ROBOT

UPPER ONTOLOGY IN WORDNET
• The noun hierarchy is divided into distinct hierarchies, each with its own top element
• {act,action,activity}
• {animal,fauna}
• {artifact}
• {attribute,property}
• {body,corpus}
• {cognition,knowledge}
• {communication}
• {event,happening}
• {feeling,emotion}
• {food}
• {group,collection}
• {location,place}
• {motive}
• {natural object}
• {natural phenomenon}
• {person,human being}
• {plant,flora}
• {possession}
• {process}
• {quantity,amount}
• {relation}
• {shape}
• {state,condition}
• {substance}
• {time}

MERONYMY IN WORDNET
• WordNet contains information about PARTS
• Stored as information about MERONYMS
• Example: TREE

EXERCISE 3
• Find the parts of house
• Find the parts of building
• Find the parts of car

EXERCISE 4
• Find the entry for bank

THE ORGANIZATION OF THE LEXICON
[Figure: word forms linked to lexemes, and lexemes linked to senses]
  WORD-FORMS   “eat”, “eats”, “ate”, “eaten”
  LEXEMES      EAT-LEX-1
  SENSES       eat0600, eat0700

The organization of the lexicon
[Figure: one word string linked to several lexemes, each with its own senses]
  WORD-STRINGS   “stock”
  LEXEMES        STOCK-LEX-1, STOCK-LEX-2, STOCK-LEX-3
  SENSES         stock0100, stock0200, stock0600, stock0700, stock0900, stock1000

Synonymy
[Figure: two word strings whose lexemes share senses]
  WORD-STRINGS   “cheap”, “inexpensive”
  LEXEMES        CHEAP-LEX-1, CHEAP-LEX-2, …, INEXP-LEX-3
  SENSES         cheap0100, …, cheapXXXX, inexp0900, inexpYYYY

Synsets
• Senses (or ‘lexicalized concepts’) are represented in WordNet by the set of words that can be used in AT LEAST ONE CONTEXT to express that sense / lexicalized concept: the SYNSET
• E.g., {chump, fish, fool, gull, mark, patsy, fall guy, sucker, shlemiel, soft touch, mug} (gloss: person who is gullible and easy to take advantage of)

EXERCISE 5
• Find the senses of hand, palm, slick, and stock.

EXERCISE 6
• Find the hypernyms of LAW

The verb database
• About 10,000 forms, 20,000 senses
• Relations between verb meanings:

  Hypernym   fly -> travel
  Troponym   walk -> stroll
  Entails    snore -> sleep
  Antonym    increase -> decrease

Relations between verbal meanings
• V1 ENTAILS V2 when “Someone V1” (logically) entails “Someone V2”
  – e.g., snore entails sleep
• V1 is a TROPONYM of V2 when “to do V1” is “to do V2 in some manner”
  – e.g., limp is a troponym of walk

EXERCISE 7
• Find the antonyms of accelerate

The adjective and adverb database
• About 20,000 adjective forms, 30,000 senses
• 4,000 adverbs, 5,600 senses
• Relations:

  Antonym (adjective)   heavy <-> light
  Antonym (adverb)      quickly <-> slowly

EXERCISE 8
• Find the antonyms of dangerous

WORDNET FOR OTHER LANGUAGES
• MultiWordNet (multiwordnet.fbk.eu)
  – A multilingual WordNet
  – Italian WordNet
  – Synsets aligned with English WordNet (1.6) whenever possible
  – Compatible versions developed for Hebrew, Portuguese, Romanian and Spanish

OTHER SOURCES OF COMMONSENSE KNOWLEDGE
• OpenCyc:
  – http://www.opencyc.org/
• ConceptNet:
  – http://conceptnet.media.mit.edu/
  – Discussed next week
• DBPedia:
  – http://dbpedia.org/

READINGS
• C. Fellbaum, 1998. WordNet. MIT Press
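
APPENDIX: FOLLOWING HYPERNYMY LINKS IN CODE
The idea of finding taxonomic information by following HYPERNYMY links, as in the robin listing earlier in this lab, can be sketched in a few lines of Python. This is only a toy, hand-built network and not WordNet itself or its API. The HYPERNYM dictionary is an assumption made for illustration: it copies just the first word of each synset from the robin listing and gives every concept a single hypernym. The function names hypernym_chain and is_a are hypothetical.

```python
# Toy hypernym network: each entry maps a concept to its direct hypernym.
# Hand-copied from the robin listing in this lab (first word of each synset only).
# Illustration only: this is NOT the WordNet database or its API.
HYPERNYM = {
    "robin": "thrush",
    "thrush": "oscine",
    "oscine": "passerine",
    "passerine": "bird",
    "bird": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}


def hypernym_chain(concept):
    """Return the concepts reached by following hypernym links upward."""
    chain = []
    while concept in HYPERNYM:
        concept = HYPERNYM[concept]
        chain.append(concept)
    return chain


def is_a(concept, ancestor):
    """True if `ancestor` can be reached from `concept` via hypernym links (ISA)."""
    return ancestor in hypernym_chain(concept)


if __name__ == "__main__":
    # Print the chain in the same "=>" style as the WordNet listing.
    print(" => ".join(["robin"] + hypernym_chain("robin")))
    print(is_a("robin", "bird"))
    print(is_a("bird", "robin"))
```

In the real WordNet a synset may have more than one hypernym, so the chain above becomes a graph rather than a single path. The sketch ignores that, and it also ignores word senses by treating each word as one concept.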