Parts of Speech Grammars and Lexicons 11-721 Fall Term, 2003

Categories of Words:
Parts of Speech
• Noun
• Verb
• Adjective
• Adverb
• Preposition
• Determiner (Article)
• Modal ?
Parts of Speech
Det   Noun  Modal  Verb  Adverb      Adjective  Prep.  Det   Noun
This  boy   must   seem  incredibly  stupid     to     that  girl.
A note on scientific method
• Theories must be falsifiable.
• Results must be reproducible.
Reproducible Results: Chomsky, 1957
• The search for rigorous formulation in linguistics has a much more
serious motivation than mere concern for logical niceties or the
desire to purify well-established methods of linguistic analysis.
Precisely constructed models for linguistic structure can play an
important role, both negative and positive, in the process of
discovery itself. By pushing a precise but inadequate formulation to
an unacceptable conclusion, we can often expose the exact source
of the inadequacy and, consequently, gain a deeper understanding
of the linguistic data. More positively a formalized theory may
automatically provide solutions for many problems other than those
for which it was explicitly designed. Obscure and intuition-bound
notions can neither lead to absurd conclusions nor provide new and
correct ones, and hence they fail to be useful in two important
respects.
Notional Definitions of Parts of Speech:
(an example of “obscure and intuition-bound notions”)
• Verbs denote actions
• Nouns denote entities
• Adjectives denote states
• Adverbs denote manner
• Prepositions denote location
• Determiners specify
A Non-falsifiable Theory
• Theory: nouns denote entities
• Counter-example: assassination is a noun that
denotes an event
• Reply: no, it denotes the idea of the event, which
is an entity
• How do you tell the difference between an event
and the idea of an event?
• Without precise definitions, this theory cannot be
disproved.
• (In language technologies, imprecise definitions
lead to poor intercoder reliability, which leads to
poor training, etc.)
A Falsifiable Theory
• Only prepositions can be modified by right in the sense of “completely” or
“directly”.
• Supporting Examples:
– Right up/down/in/on/across the street
– Right in the drawer
– Right down the stairs
– Right from school
– Right across the street
– *He right despaired.
– *She chose right this one.
• Counter-examples:
– She looked at him right strangely.
• (Right modifies an adverb.)
– You look a right clown. (Oxford English Dictionary)
• (Right modifies a noun.)
– The government made a right mess of it. (Oxford English Dictionary)
• (Right modifies another noun.)
• The theory is falsified (if you like the counter-examples). It needs to be
refined (maybe by specifying which dialects it is valid for).
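The falsification step on this slide can be made mechanical. The following is a minimal sketch, assuming each example has been hand-annotated with the category of the word that right modifies and with an acceptability judgment. The names `Example`, `theory_predicts_ok`, and `counter_examples` are illustrative only, and the judgments are the ones given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str
    modified: str     # category of the word that "right" modifies (hand-annotated)
    acceptable: bool  # speaker judgment; False corresponds to a starred example


# Judgments copied from the slide; the category labels are annotations.
EXAMPLES = [
    Example("Right in the drawer", "preposition", True),
    Example("Right down the stairs", "preposition", True),
    Example("He right despaired.", "verb", False),
    Example("She looked at him right strangely.", "adverb", True),
    Example("You look a right clown.", "noun", True),
    Example("The government made a right mess of it.", "noun", True),
]


def theory_predicts_ok(example):
    """The theory under test: 'right' may modify only prepositions."""
    return example.modified == "preposition"


def counter_examples(examples):
    """Return the examples whose judgment disagrees with the theory's prediction."""
    return [e for e in examples if theory_predicts_ok(e) != e.acceptable]


for e in counter_examples(EXAMPLES):
    print(f"counter-example: {e.text} (modified category: {e.modified})")
```

Because the theory makes a definite prediction for every annotated example, a disagreement is easy to detect. Changing the `acceptable` flags to match a particular dialect is one way to explore the refinement suggested above.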
How do you decide the part of
speech of a word?
• Distribution
• Morphology: Prefixes, suffixes, and other
changes to the structure of the word.
Distribution of Parts of Speech
• Great ideas spread quickly.
• Interesting ideas spread quickly.
• Stupid ideas spread quickly.
• Colorless ideas spread quickly.
• Words of the same category have the same distribution. For example,
adjectives can come before nouns.
Discussion:
Distribution of parts of speech
• Great ideas spread quickly.
• The ideas spread quickly.
• Great idea spread quickly.
• Do great and the have the same part of
speech?
• Do idea and ideas have the same part of
speech?
Templates for testing parts of
speech that work most of the time
• Noun can be a pain in the neck.
• Television can be a pain in the neck.
• Linguistics can be a pain in the neck.
• This can be a pain in the neck.
• *Happy can be a pain in the neck.
• *From can be a pain in the neck.
• *The can be a pain in the neck.
• *Breathe can be a pain in the neck.
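When building a lexicon, a template test like this one amounts to filling a frame with a candidate word and recording a speaker's judgment. The sketch below assumes that the judgments are entered by hand (the code does not judge acceptability itself). `NOUN_TEMPLATE`, `JUDGMENTS`, `fill`, and `passes_noun_test` are hypothetical names.

```python
NOUN_TEMPLATE = "{} can be a pain in the neck."

# Judgments as marked on the slide: True = unstarred, False = starred.
JUDGMENTS = {
    "Television": True,
    "Linguistics": True,
    "This": True,
    "Happy": False,
    "From": False,
    "The": False,
    "Breathe": False,
}


def fill(template, word):
    """Insert the candidate word into the test frame."""
    return template.format(word)


def passes_noun_test(word, judgments=JUDGMENTS):
    """Return the recorded judgment, or None if the word has not been judged."""
    return judgments.get(word)


for word in JUDGMENTS:
    mark = "" if passes_noun_test(word) else "*"
    print(mark + fill(NOUN_TEMPLATE, word))
```

A word with no recorded judgment returns `None` rather than a guess, which keeps untested words visibly separate from words that failed the test.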
What is wrong with this
sentence?
• Cat can be a pain in the neck.
Templates for testing parts of
speech that work most of the time
• They/it can verb.
• They/it can stay/leave/die/cry.
• *They/it can gorgeous/cute/trendy.
• *They/it can from/to/in/off/on.
• *They/it can door/bible/gold/camera.
What is wrong here?
• They can handle.
• They can accommodate.
• They can harbor.
Templates for testing parts of
speech that work most of the time
• Modal I be frank?
• Can I be frank?
• Must I be frank?
• Should I be frank?
• Need I be frank?
Templates for testing parts of
speech that work most of the time
• Very adverb or adjective
• Very slow
• Very slowly
• Very badly
• Very happy
Templates for testing parts of
speech that work most of the time
• He treats her adverb.
• He treats her well.
• He treats her arrogantly.
• He treats her nicely.
• He treats her nice.
• He treats her good.
Templates for testing parts of
speech that work most of the time
• They are very adjective.
• They are very nice/gentlemanly/ladylike.
• *They are very gentlemen/ladies/faxes.
• *They are very starve/die.
• *They are very to/at/on.
• They are very in.
• They are very off.
Templates for testing parts of
speech that work most of the time
• Right preposition.
– Right is an intensifier.
• Right up/down/in/on/across the street
• Right down the stairs
• Right in the drawer
• Right from school
• Right across the street
• *He right despaired.
• *She chose right this one.
What about these sentences?
• She looked at him right strangely. (dialect)
• She is right pretty. (dialect)
• You look a right clown. (Oxford English
Dictionary)
• The government made a right mess of it.
(Oxford English Dictionary)
Templates for testing parts of
speech that work most of the time
• He wrote determiner other works.
• He wrote the/all/these/no/few/many other
works.
• *He wrote despair/be/have other works.
• *He wrote student other works.
• ?He wrote successful other works.
Words can have more than one
part of speech
• He needs to see a doctor. (verb)
• Need there be a problem? (modal)
• I feel a need to explore my roots. (noun)
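For lexicon design, this means an entry cannot map a word to a single category. A minimal sketch, assuming a lexicon keyed on the lowercased word with a set of tags per entry (`lexicon`, `add_entry`, and `parts_of_speech` are illustrative names, not part of any course software):

```python
from collections import defaultdict

# Each word maps to the set of parts of speech it has been observed with.
lexicon = defaultdict(set)


def add_entry(word, pos):
    """Record one part of speech for a word; repeated calls accumulate."""
    lexicon[word.lower()].add(pos)


def parts_of_speech(word):
    """Return the word's recorded parts of speech in alphabetical order."""
    # .get avoids creating an empty entry for unknown words.
    return sorted(lexicon.get(word.lower(), set()))


add_entry("need", "verb")   # He needs to see a doctor.
add_entry("need", "modal")  # Need there be a problem?
add_entry("need", "noun")   # I feel a need to explore my roots.

print(parts_of_speech("Need"))  # ['modal', 'noun', 'verb']
```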
Morphology
• The form of words
• Affixes: Prefixes, suffixes, infixes
• Stem changes: swim/swam
• More about morphology in a couple of weeks.
Morphological properties of English
nouns
• Count nouns
– Cup/cups
– Book/books
• Mass nouns
– Attention/?attentions
– Sand/?sands
– Water/?waters
– Coffee/?coffees
Morphological Properties of English
adjectives
• Monosyllabic (one syllable) adjectives
– Tall/taller/tallest
– Fast/faster/fastest
• Multi-syllabic adjectives
– Intelligent/more intelligent/most intelligent
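The two patterns above can be sketched as a small function. This is a simplification under stated assumptions: the caller supplies the syllable count, and spelling changes (for example, consonant doubling) and two-syllable adjectives that take -er are not handled. `comparison_forms` is a hypothetical name.

```python
def comparison_forms(adjective, syllables):
    """Return (comparative, superlative) following the two patterns on the slide.

    Monosyllabic adjectives take the suffixes -er/-est; all others are
    treated as taking more/most. No spelling adjustments are made.
    """
    if syllables == 1:
        return adjective + "er", adjective + "est"
    return "more " + adjective, "most " + adjective


print(comparison_forms("tall", 1))         # ('taller', 'tallest')
print(comparison_forms("fast", 1))         # ('faster', 'fastest')
print(comparison_forms("intelligent", 4))  # ('more intelligent', 'most intelligent')
```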
Morphological Properties of English
Verbs
Base   Participle  Past    Present  Gerund
mow    mown        mowed   mows     mowing
prove  proven      proved  proves   proving
go     gone        went    goes     going
meet   met         met     meets    meeting
cut    cut         cut     cuts     cutting
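One way to store the paradigm table above in a lexicon is as a mapping from the base form to its five forms, with a reverse lookup from a surface string to the forms it could be. This is a sketch only; `FORMS`, `PARADIGMS`, and `analyze` are illustrative names.

```python
# Column order follows the table: base, participle, past, present, gerund.
FORMS = ("base", "participle", "past", "present", "gerund")

PARADIGMS = {
    "mow":   ("mow", "mown", "mowed", "mows", "mowing"),
    "prove": ("prove", "proven", "proved", "proves", "proving"),
    "go":    ("go", "gone", "went", "goes", "going"),
    "meet":  ("meet", "met", "met", "meets", "meeting"),
    "cut":   ("cut", "cut", "cut", "cuts", "cutting"),
}


def analyze(surface):
    """Return every (base form, form name) pair that matches the surface string."""
    return [
        (lemma, name)
        for lemma, forms in PARADIGMS.items()
        for name, form in zip(FORMS, forms)
        if form == surface
    ]


print(analyze("went"))  # [('go', 'past')]
print(analyze("met"))   # [('meet', 'participle'), ('meet', 'past')]
print(analyze("cut"))   # [('cut', 'base'), ('cut', 'participle'), ('cut', 'past')]
```

The lookup makes the ambiguity in the table explicit: met and cut each correspond to more than one cell, so form alone does not always identify the morphological category.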
Invariant words: no prefixes or suffixes
in English
• Prepositions (in, on, at, about, across,
beyond, etc.)
• Modals (may, might, can, could, must,
shall, should, etc.)
The Computational View
• Who cares if it is falsifiable? It just needs
to be implementable.
• Non-falsifiable theories tend to be non-implementable.
Importance to you
• When you are building a lexicon, you will
decide on parts of speech for words by
using template tests and morphological
tests.
Discussion
• Toy house
• Big house
• Hypothesis 1: Toy is an adjective in toy
house. Toy house is just like big house.
• Hypothesis 2: Toy is a noun in toy house.
Toy house is a compound noun.
• Relevant diagnostic tests:
– Adjectives can be made comparative.
– Adjectives can be modified by very.
– Nouns can be made plural.
Discussion
• He is like his brother.
• Hypothesis 1: Like is an adjective.
• Hypothesis 2: Like is a preposition.
• Relevant diagnostic tests:
– Comparatives
– Very
– Right
Part of Speech Tagging
• Input: string of words
• Output: string of words with a part of speech
associated with each word.
• Example:
– This:det boy:N likes:V that:det girl:N
• Use statistical or rule-based knowledge about
distribution.
• Usually use a long list of parts of speech, e.g.,
around 40.
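The input/output format can be illustrated with a deliberately naive tagger. This sketch assumes a hypothetical toy lexicon that assigns each word a single tag and ignores context entirely; the statistical and rule-based taggers mentioned above use distributional information to choose among several possible tags. `TOY_LEXICON` and `tag` are illustrative names.

```python
# Hypothetical toy lexicon: one tag per word, covering only the example sentence.
TOY_LEXICON = {
    "this": "det",
    "that": "det",
    "boy": "N",
    "girl": "N",
    "likes": "V",
}


def tag(sentence, lexicon=TOY_LEXICON, unknown="?"):
    """Attach a tag to each whitespace-separated word, in word:tag format."""
    tokens = sentence.split()
    return " ".join(f"{tok}:{lexicon.get(tok.lower(), unknown)}" for tok in tokens)


print(tag("This boy likes that girl"))  # This:det boy:N likes:V that:det girl:N
```

A lookup like this cannot handle a word such as need, which has more than one part of speech; resolving that kind of ambiguity is what the distributional knowledge is for.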