G22.2590 - Natural Language Processing

G22.2590 - Natural Language Processing - Spring 2001
Lecture 1 Outline
Prof. Grishman
January 17, 2001
Introduction
Centrality of Natural Language
a primary (and natural) mode of human communication
representation for most recorded human knowledge
a very rich and flexible representation (when compared to most formal
representations)
Applications of Natural Language Processing (NLP)
machine translation
interactive systems … data base query, expert systems, help systems
limited appeal with written input: people don't like to type a lot
information extraction and document retrieval
grammar checking
speech applications: dictation; interactive querying
Our Goal
create systems which can perform such applications: an engineering problem
the term "language engineering" has become popular, especially in Europe, to
reflect this orientation
natural language processing systems are complex, and require good design techniques
modular approaches to break the problem up at appropriate points
formal models which reflect aspects of the structure of language
Relation to other Fields
Linguistics
goal of linguistics is to describe language
provide simple models which can predict language behavior
understand what is universal about language
through these formal models, understand how language can be acquired
formal models from linguistics have been of value in NLP
but its goals are not the same as NLP:
a single counterexample can invalidate a model as a linguistic theory, but
would not significantly lessen its value for NLP
NLP must address all phenomena which arise in an application, while
linguistics may focus on select phenomena which give insight into the
language faculty
Psycholinguistics
goal is to understand human performance in generating and analyzing language
little influence on NLP to date
some syntactic analyzers try to mimic human performance, including the
difficulties people have with certain sentences
Artificial Intelligence & Machine Learning
AI is concerned primarily with generic problem solving strategies & suitable
knowledge representations
there is an inherent link between AI and NLP: some NLP problems require the
sort of deep reasoning addressed by AI
but NLP has found increasing success through avoiding deep reasoning, and so
the link has weakened
Statistics
statistical methods and models, originally used in signal processing, information
theory, and physics, have become more widely used in NLP
easily trainable and easily computable models are more attractive now that lots of
training data is available
Analyzing our Needs: setting our agenda
What functionality do we require in order to address NLP applications?
Machine Translation
People have been interested in machine translation since the earliest days of computing.
At first, people imagined that machine translation was mostly a "data processing" task … a
system looks up the words one at a time in a bilingual dictionary, and then maybe has to
fix up the translation a bit. However, there is a lot more to do for machine translation:
word segmentation: for some languages (such as Japanese and Chinese) there are no
spaces between words, so it's not clear what the words are
morphology: words appear in different forms, indicating singular vs. plural (for nouns),
present tense vs. past tense (for verbs), nominative vs. accusative case, etc. English
has only a few morphological forms, so it's possible to put them all in a dictionary.
This isn't true of most Western languages; for example, a Spanish verb could have
over 50 forms.
syntax: word-for-word translation only works if the word order in the two languages is
about the same; if it's not, we need to understand enough about the structure of the
two languages (their syntax) to change from one word order to another. English has a
rather fixed subject-verb-object order ("SVO"), while many more inflected languages
have more variable word order.
lexical semantics: many words are polysemous … they have multiple meanings. A word
will have to be translated differently depending on its meaning in a particular context;
otherwise the translation is likely to make little sense. For example, "bill" means both
a statement of charges (an "invoice") and a part of a duck (its "beak"). It's not likely
that any foreign language has a word with both these senses. If we chose the wrong
sense in translating "bill" into a foreign language, it would be like reading the English
sentence "At the end of the meal, the waiter presented the beak."
discourse: in order to create a proper translation we sometimes have to look beyond the
individual sentence. That can be true in selecting word senses. The need also arises
in translating into English from languages where subject pronouns can be omitted;
we need to figure out what the subject actually is, so that we can supply a "he" or
"she" or "it" in English.
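To make the lexical semantics point concrete, here is a minimal sketch of choosing between the two senses of "bill" before translating. It is only a crude cue-word overlap heuristic, not a method from the lecture: the sense labels come from the example above, while the cue-word lists and the names SENSES and choose_sense are assumptions made up for illustration.

```python
# Toy sense inventory for "bill". The two senses come from the text;
# the cue words for each sense are illustrative assumptions.
SENSES = {
    "bill": {
        "invoice": {"meal", "waiter", "restaurant", "pay", "charges"},
        "beak": {"duck", "bird", "feathers", "quack"},
    }
}


def choose_sense(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    # Lowercase each token and strip surrounding punctuation.
    context = {w.strip('.,"').lower() for w in sentence.split()}
    senses = SENSES[word]
    # max() keeps the first sense on ties, so the order in SENSES is the fallback.
    return max(senses, key=lambda s: len(senses[s] & context))


print(choose_sense("bill", "At the end of the meal, the waiter presented the bill."))
print(choose_sense("bill", "The duck preened its feathers with its bill."))
```

A translation system would then look up the foreign word for the chosen sense rather than for the English word itself. Hand-picked cue words do not scale, which is one reason sense selection is treated as a problem in its own right.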
Information Extraction
An information extraction system processes text and extracts information about a specific
type of event or relationship. For example, one extraction system we've worked on reads
newspaper articles and builds a database of executives who were hired for new
management jobs. If it reads a sentence such as
IBM hired Fred Smith as president.
it would create a table entry
person
Fred Smith
company
IBM
position
president
We can find some of these items by simple pattern matching, looking for something like
<word> hired <word> <word> as <word>
but this won't get us very far. For better performance, we need
name recognition: a company name may be several words ("General Motors"), a
person may have a title or middle name ("Mr. Smith", "Fred X. Smith")
syntax: the information may appear in the passive ("Fred Smith was hired by IBM")
or in a relative clause ("Fred Smith, who was hired by IBM"); also, there may be
extra modifiers ("IBM yesterday hired Fred Smith as president")
lexical semantics: there may be lots of synonyms for hired ("appointed", "named",
…) which the system should recognize
discourse - pronouns: if a pronoun appears in a relevant sentence, the system has to
figure out what the pronoun refers to ("Fred Smith left Compaq last week. IBM
hired him yesterday as president.")
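The limits of simple pattern matching can be seen in a short sketch. It reads the pattern above literally as a regular expression (one-word company, two-word person, one-word position); the names PATTERN and extract_hire are assumptions for illustration, not part of any system described here.

```python
import re

# Literal reading of "<word> hired <word> <word> as <word>":
# one-word company, two-word person, one-word position.
PATTERN = re.compile(r"(\w+) hired (\w+ \w+) as (\w+)")


def extract_hire(sentence):
    """Return a table entry as a dict, or None if the pattern does not match."""
    m = PATTERN.search(sentence)
    if m is None:
        return None
    company, person, position = m.groups()
    return {"person": person, "company": company, "position": position}


for s in [
    "IBM hired Fred Smith as president.",                # matches as intended
    "Fred Smith was hired by IBM.",                      # passive: no match
    "IBM yesterday hired Fred Smith as president.",      # modifier: wrong company
    "General Motors hired Fred X. Smith as president.",  # longer names: no match
]:
    print(s, "->", extract_hire(s))
```

Only the first sentence yields the right entry. The passive and the longer names produce nothing, and the extra modifier makes "yesterday" the company, which is why name recognition and syntactic analysis are needed.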
We will be studying (and testing) information extraction applications over the course of
the semester.
Interactive Command and Query
A number of systems have been constructed to serve as natural language front ends for
data base query. For complex queries, they spare the user the need to learn a formal
query language. On the other hand, they run into difficulty if the user keeps asking
questions the system cannot answer.
The system has to translate a natural language query into a formal data base query. It has
to be able to accept a wide range of queries or it will be worse than useless (it will be
much more frustrating than a formal query language --- it won't really be "natural
language"). To do so it needs to analyze
syntax: it needs to divide the query into phrases which correspond roughly to
different data base attributes: Which customers | bought | green widgets | last
week?
lexical semantics: it needs to figure out how these phrases map into data base
relations
quantifier semantics: if the user says "List all the customers who bought more than
five widgets.", the system has to figure out the quantifier structure
discourse - pronouns and sentence fragments: in asking a sequence of questions,
the user is likely to use pronouns and fragments to keep the queries short: "How
many widgets did General Motors buy? How many kumquats? Did it buy any
tangelos?"
dialog: a good interactive system will be responsive in its replies, pointing out false
assumptions and asking for clarification:
"How many programmers in the child care department make over $50,000?"
"There are no programmers in the child care department."
"How many people live in Washington?"
"Washington, D.C. or the State of Washington?"
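The first dialog exchange above can be sketched in a few lines: before counting, the system checks whether the question's assumption holds. This is only an illustration under stated assumptions; the employee records, the EMPLOYEES table, and the function count_earning_over are all hypothetical, and the query is assumed to have been analyzed already into a job, a department, and a salary threshold.

```python
# Hypothetical in-memory "data base"; the records are made up for illustration.
EMPLOYEES = [
    {"name": "A", "job": "programmer", "dept": "engineering", "salary": 62000},
    {"name": "B", "job": "programmer", "dept": "engineering", "salary": 48000},
    {"name": "C", "job": "teacher", "dept": "child care", "salary": 35000},
]


def count_earning_over(job, dept, threshold, employees=EMPLOYEES):
    """Answer 'How many <job>s in the <dept> department make over <threshold>?'"""
    candidates = [e for e in employees if e["job"] == job and e["dept"] == dept]
    if not candidates:
        # The question presupposes such employees exist; say so instead of "0".
        return f"There are no {job}s in the {dept} department."
    n = sum(1 for e in candidates if e["salary"] > threshold)
    return str(n)


print(count_earning_over("programmer", "child care", 50000))
print(count_earning_over("programmer", "engineering", 50000))
```

Answering "0" to the first question would be literally correct but misleading; reporting the failed assumption is the more responsive reply.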
Summary
Natural language is a very rich and powerful communication medium. If we are to build
systems which can utilize this medium, we must analyze language at several levels:
syntax: what is the structure of a sentence?
semantics: what is the meaning of a sentence (in isolation)?
discourse: how can a sentence be interpreted in context?
dialog: how is language used to exchange information?