introduction to artificial intelligence - clic

INTRODUCTION TO ARTIFICIAL
INTELLIGENCE
Massimo Poesio
LAB 9: WORDNET
COMMONSENSE KNOWLEDGE
SOURCES FOR AI / NLP
• There are now several sources of
commonsense knowledge that we can use to
study its role in reasoning / develop systems
able to use commonsense knowledge
• The best known is WordNet, a lexical database
based on semantic networks developed by
George Miller and his collaborators at
Princeton
A LEXICAL RESOURCE BUILT ON
SEMANTIC NETWORK PRINCIPLES
• WordNet is a LEXICAL DATABASE created at Princeton
– Freely available for research from the Princeton site
• It contains information about a variety of SEMANTIC
RELATIONS
• Three sub-databases (supported by psychological
research as early as (Fillenbaum and Jones, 1965))
– NOUNs
– VERBS
– ADJECTIVES and ADVERBS
• Each database organized around SYNSETS
2004/05
ANLE
The noun database
• About 90,000 forms, 116,000 senses
• Relations:
  hypernym     breakfast -> meal
  hyponym      meal -> lunch
  has-member   faculty -> professor
  member-of    copilot -> crew
  has-part     table -> leg
  part-of      course -> meal
  antonym      leader -> follower
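The relations listed for nouns come in inverse pairs (hypernym/hyponym, has-member/member-of, has-part/part-of), and antonymy is symmetric. A minimal sketch of storing such links, assuming a plain Python dictionary rather than WordNet's actual database format. Words stand in for senses here, and the names `INVERSE`, `links` and `add_relation` are hypothetical, introduced only for illustration:

```python
from collections import defaultdict

# Each relation name maps to its inverse; antonymy is its own inverse.
INVERSE = {
    "hypernym": "hyponym",
    "hyponym": "hypernym",
    "has-member": "member-of",
    "member-of": "has-member",
    "has-part": "part-of",
    "part-of": "has-part",
    "antonym": "antonym",
}

# links[(word, relation)] is the set of words reached by that relation.
links = defaultdict(set)

def add_relation(source, relation, target):
    """Store source -relation-> target together with the inverse link."""
    links[(source, relation)].add(target)
    links[(target, INVERSE[relation])].add(source)

# Example pairs taken from the noun relations on this slide.
add_relation("breakfast", "hypernym", "meal")
add_relation("meal", "hyponym", "lunch")
add_relation("faculty", "has-member", "professor")
add_relation("copilot", "member-of", "crew")
add_relation("table", "has-part", "leg")
add_relation("course", "part-of", "meal")
add_relation("leader", "antonym", "follower")

print(sorted(links[("meal", "hyponym")]))      # ['breakfast', 'lunch']
print(sorted(links[("meal", "has-part")]))     # ['course']
print(sorted(links[("follower", "antonym")]))  # ['leader']
```

Storing the inverse at insertion time means that a query such as "what are the parts of meal?" works even though the slide only states the link in the part-of direction.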
USING WN ONLINE
• http://wordnetweb.princeton.edu/perl/webwn
• Example: CELL PHONE
EXERCISE 1
• Find the entry for ROBOT
LEXICAL RELATIONS IN WORDNET
• WordNet contains information about lexical
relations between MEANINGS (more on
meanings below)
• The type of lexical relations depends on the
type of lexical entry
LEXICAL RELATIONS FOR NOUNS
• ISA (hypernymy)
• PART-OF (meronymy)
TAXONOMIC INFORMATION IN
WORDNET
• WordNet is a very rich source of taxonomic
information
• This information can be found by following
HYPERNYMY links
Hypernyms
2 senses of robin
Sense 1
robin, redbreast, robin redbreast, Old World robin, Erithacus rubecola -- (small Old World songbird with a reddish breast)
  => thrush -- (songbirds characteristically having brownish upper plumage with a spotted breast)
    => oscine, oscine bird -- (passerine bird having specialized vocal apparatus)
      => passerine, passeriform bird -- (perching birds mostly small and living near the ground with feet having 4 toes arranged to allow for gripping the perch; most are songbirds; hatchlings are helpless)
        => bird -- (warm-blooded egg-laying vertebrates characterized by feathers and forelimbs modified as wings)
          => vertebrate, craniate -- (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
            => chordate -- (any animal of the phylum Chordata having a notochord or spinal column)
              => animal, animate being, beast, brute, creature, fauna -- (a living organism characterized by voluntary movement)
                => organism, being -- (a living thing that has (or can develop) the ability to act or function independently)
                  => living thing, animate thing -- (a living (or once living) entity)
                    => object, physical object --
                      => entity, physical thing --
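Following hypernym links upward, as in the robin listing, can be sketched in a few lines. This is a toy illustration, not WordNet's own lookup code: it keeps only the first word of each synset from the listing and assumes each term has a single hypernym, which real WordNet synsets need not satisfy. The names `HYPERNYM`, `hypernym_chain` and `is_a` are hypothetical:

```python
# Toy hypernym table: each term maps to its immediate hypernym.
# Only the first synonym of each synset in the robin listing is used.
HYPERNYM = {
    "robin": "thrush",
    "thrush": "oscine",
    "oscine": "passerine",
    "passerine": "bird",
    "bird": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}

def hypernym_chain(term):
    """Return the hypernyms of `term`, nearest first, up to the top element."""
    chain = []
    while term in HYPERNYM:
        term = HYPERNYM[term]
        chain.append(term)
    return chain

def is_a(term, ancestor):
    """True if `ancestor` is reachable from `term` through hypernym links."""
    return ancestor in hypernym_chain(term)

print(hypernym_chain("bird"))
# ['vertebrate', 'chordate', 'animal', 'organism', 'living thing', 'object', 'entity']
print(is_a("robin", "animal"))  # True
print(is_a("bird", "thrush"))   # False
```

The ISA test is simply reachability in the hypernym graph, which is what makes the hierarchy usable as taxonomic knowledge.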
EXERCISE 2
• Find the hypernyms of ROBOT
UPPER ONTOLOGY IN WORDNET
• The noun hierarchy is divided into distinct
hierarchies, each with its own top element
• {act,action,activity}
• {animal,fauna}
• {artifact}
• {attribute,property}
• {body,corpus}
• {cognition,knowledge}
• {communication}
• {event,happening}
• {feeling,emotion}
• {food}
• {group,collection}
• {location,place}
• {motive}
• {natural object}
• {natural phenomenon}
• {person,human being}
• {plant,flora}
• {possession}
• {process}
• {quantity,amount}
• {relation}
• {shape}
• {state,condition}
• {substance}
• {time}
MERONYMY IN WORDNET
• WordNet contains information about PARTS
• Stored as information about MERONYMS
• Example: TREE
EXERCISE 3
• Find the parts of house
• Find the parts of building
• Find the parts of car
EXERCISE 4
• Find the entry of bank
THE ORGANIZATION OF THE LEXICON
WORD-FORMS: “eat”, “eats”, “ate”, “eaten”
LEXEMES: EAT-LEX-1
SENSES: eat0600, eat0700
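The three layers in this diagram (word forms, lexemes, senses) can be sketched as two lookup tables. The sketch assumes the reading of the diagram in which all four forms belong to EAT-LEX-1 and that lexeme has the two senses shown. The table and function names are hypothetical, and a form maps to a list of lexemes so that one string can belong to several:

```python
# Layer 1 -> 2: each word form points to the lexeme(s) it realizes.
FORM_TO_LEXEMES = {
    "eat": ["EAT-LEX-1"],
    "eats": ["EAT-LEX-1"],
    "ate": ["EAT-LEX-1"],
    "eaten": ["EAT-LEX-1"],
}

# Layer 2 -> 3: each lexeme points to its senses (identifiers from the diagram).
LEXEME_TO_SENSES = {
    "EAT-LEX-1": ["eat0600", "eat0700"],
}

def senses_of(form):
    """Return all senses reachable from a word form through its lexemes."""
    result = []
    for lexeme in FORM_TO_LEXEMES.get(form, []):
        result.extend(LEXEME_TO_SENSES.get(lexeme, []))
    return result

print(senses_of("ate"))    # ['eat0600', 'eat0700']
print(senses_of("drink"))  # []
```

Inflected forms such as “ate” and “eaten” reach the same senses because the senses hang off the lexeme, not off the individual strings.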
The organization of the lexicon
WORD-STRINGS: “stock”
LEXEMES: STOCK-LEX-1, STOCK-LEX-2, STOCK-LEX-3
SENSES: stock0100, stock0200, stock0600, stock0700, stock0900, stock1000
Synonymy
WORD-STRINGS: “cheap”, “inexpensive”
LEXEMES: CHEAP-LEX-1, CHEAP-LEX-2, …, INEXP-LEX-3
SENSES: cheap0100, …, cheapXXXX, inexp0900, inexpYYYY
Synsets
• Senses (or ‘lexicalized concepts’) are represented in
WordNet by the set of words that can be used in AT
LEAST ONE CONTEXT to express that sense / lexicalized
concept: the SYNSET
• E.g.,
{chump, fish, fool, gull, mark, patsy, fall guy, sucker,
shlemiel, soft touch, mug}
(gloss: person who is gullible and easy to take
advantage of)
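A synset can be modelled as a set of words plus a gloss, with an inverted index from each word to the synsets that contain it. The sketch below is illustrative only: the first synset is the one on this slide, while the second is a hypothetical extra entry (with a made-up gloss) added so that one word appears in more than one synset. `SYNSETS`, `senses` and `synonyms` are assumed names, not WordNet's API:

```python
from collections import defaultdict

# Each synset is (set of words, gloss).
# The second entry is hypothetical, included only to give "fish" two senses.
SYNSETS = [
    (frozenset({"chump", "fish", "fool", "gull", "mark", "patsy",
                "fall guy", "sucker", "shlemiel", "soft touch", "mug"}),
     "person who is gullible and easy to take advantage of"),
    (frozenset({"fish"}),
     "aquatic animal (illustrative gloss, not copied from WordNet)"),
]

# Inverted index: word -> positions of the synsets that contain it.
index = defaultdict(list)
for position, (words, _gloss) in enumerate(SYNSETS):
    for word in words:
        index[word].append(position)

def senses(word):
    """Return the glosses of every synset the word belongs to."""
    return [SYNSETS[i][1] for i in index.get(word, [])]

def synonyms(word):
    """Return all words sharing at least one synset with `word`."""
    found = set()
    for i in index.get(word, []):
        found |= SYNSETS[i][0]
    found.discard(word)
    return found

print(len(senses("fish")))            # 2
print("sucker" in synonyms("chump"))  # True
print(senses("robot"))                # []
```

The number of senses of a word is the number of synsets it appears in, which is the quantity the sense-finding exercises ask about.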
EXERCISE 5
• Find the senses of hand, palm, slick, and stock.
EXERCISE 6
• Find the hypernyms of LAW
The verb database
• About 10,000 forms, 20,000 senses
• Relations between verb meanings:
  hypernym   fly -> travel
  troponym   walk -> stroll
  entails    snore -> sleep
  antonym    increase -> decrease
Relations between verbal meanings
V1 ENTAILS V2
when “Someone V1” (logically) entails “Someone V2”
- e.g., snore entails sleep
V1 is a TROPONYM of V2
when “to do V1” is “to do V2 in some manner”
- e.g., limp is a troponym of walk
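Both definitions license an inference from V1 to V2: if someone snores they sleep, and if someone limps they walk (in a particular manner). A hedged sketch of that inference over the verb examples from these slides, using two small tables. The names `ENTAILS`, `MANNER_OF`, `troponyms` and `follows_from` are hypothetical, and treating a troponym link as licensing the same inference as entailment is an assumption of this sketch:

```python
# V1 -> set of V2 such that "Someone V1" entails "Someone V2".
ENTAILS = {"snore": {"sleep"}}

# V1 -> set of V2 such that to do V1 is to do V2 in some manner
# (V1 is a troponym of V2, V2 is a hypernym of V1).
MANNER_OF = {
    "limp": {"walk"},
    "stroll": {"walk"},
    "fly": {"travel"},
}

def troponyms(verb):
    """Return the verbs that name a manner of doing `verb`."""
    return sorted(v for v, general in MANNER_OF.items() if verb in general)

def follows_from(verb):
    """Return every verb reachable through entailment or manner-of links."""
    reached = set()
    frontier = [verb]
    while frontier:
        current = frontier.pop()
        targets = ENTAILS.get(current, set()) | MANNER_OF.get(current, set())
        for target in targets:
            if target not in reached:
                reached.add(target)
                frontier.append(target)
    return reached

print(troponyms("walk"))      # ['limp', 'stroll']
print(follows_from("snore"))  # {'sleep'}
print(follows_from("limp"))   # {'walk'}
```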
EXERCISE 7
• Find the antonyms of accelerate
The adjective and adverb database
• About 20,000 adjective forms, 30,000 senses
• 4,000 adverbs, 5,600 senses
• Relations:
  antonym (adjective)   heavy <-> light
  antonym (adverb)      quickly <-> slowly
EXERCISE 8
• Find the antonyms of dangerous
WORDNET FOR OTHER LANGUAGES
• MultiWordNet (multiwordnet.fbk.eu)
– A Multilingual WordNet
– Italian WordNet
– Synsets aligned with English WordNet (1.6)
whenever possible
– Compatible versions developed for Hebrew,
Portuguese, Romanian and Spanish
OTHER SOURCES OF COMMONSENSE
KNOWLEDGE
• OpenCyc:
– http://www.opencyc.org/
• ConceptNet
– http://conceptnet.media.mit.edu/
– Discussed next week
• DBpedia:
– http://dbpedia.org/
READINGS
• C. Fellbaum (ed.), 1998. WordNet: An Electronic Lexical Database. MIT Press