Overview of Natural Language Processing

Advanced AI CSCE 976
Amy Davis
amydavis@cse.unl.edu
Outline
• Common Applications
• Dealing with Sentences
(and words)
• Dealing with Discourses
Practical Applications
Machine translation
Database access
Information Retrieval
Query-answering
Text categorization
Summarization
Data extraction
Machine Translation
Proposals for mechanical translators of
languages pre-date the invention of the
digital computer
First was a dictionary look-up system at
Birkbeck College, London 1948
American interest was sparked by Warren Weaver, a
code breaker in WW2; MT was popular during the
cold war, but alas, rather unsuccessful
Machine Translation:
Working Systems
TAUM-METEO – translates weather reports
from English to French in Montreal. Works
because the language used in reports is stylized
and regular.
Xerox Systran – Translates Xerox manuals
from English to all languages that Xerox
deals in. Utilized pre-edited texts
Machine Translation: Difficulties
Need a big Dictionary with Grammar rules in both
(or all) languages, large start-up cost
Direct word translation often ambiguous
Compound words (not in the dictionary themselves,
but built from common parts)
(ex. Lebensversicherungsgesellschaftsangestellter,
a life insurance company employee)
Ambiguity even in primary language
Elements of language are different
Machine Translation: Difficulties
Essentially requires a good understanding of
the text, and finding a corresponding text in
the target language that does a good job of
describing the same (or similar) situation.
Requires computer to “understand”.
Machine Translation: Successes
Limited Domain
allows for limited vocabulary, grammar, easier
disambiguation and understanding
Journal article: Church, K.W. and E.H. Hovy. 1993. Good Applications for
Crummy Machine Translation. Machine Translation 8 (239--258)
MAT
machine-aided translation, where a machine starts,
and a real person proof-reads for clarity.
(Sometimes doesn’t require bilingual people).
Example of MAT (page 692)
The extension of the coverage of the health services to the
underserved or not served population of the countries of
the region was the central goal of the Ten-Year Plan and
probably that of greater scope and transcendence. Almost
all the countries formulated the purpose of extending the
coverage although could be appreciated a diversity of
approaches for its attack, which is understandable in view
of the different national policies that had acted in the
configuration of the health systems of each one of the
countries. (Translated by SPANAM: Vasconcellos and
Leon, 1985).
Database Access
The first major success for NLP was in the
area of database access
Natural Language Interfaces to Databases
were developed to save mainframe
operators the work of accessing data
through complicated programs.
Database Access:
Working Systems
LUNAR (by Woods for NASA, 1973)
allowed queries of chemical analysis data of
lunar rock and soil samples brought back by
Apollo missions
CHAT (Pereira, 1983)
allows queries of a geographical database
Database Access: Difficulties
Limited Vocabulary
User must phrase question correctly –
system doesn’t understand everything
Context detection
allowing questions that implicitly refer to
previous questions
Becomes a text interpretation question
Database Access: Conclusion
Worked well for a time
Now more information is stored in text, not in
databases (ex. email, news, articles, books,
encyclopedias, web pages)
The problem now is not to find information,
it’s to sort through the information that’s
available.
Information Retrieval
Now the main focus of Natural Language
Processing
There are four types:
1. Query answering
2. Text categorization
3. Text summarization
4. Data extraction
Information Retrieval: The task
Choose from some set of documents ones that
are related to my query
Ex. Internet search
Information Retrieval
Methods
Boolean: “(Natural AND Language) OR
(Computational AND Linguistics)”
• too confusing for most users
Vector: Assign different weights to each term
in query. Rank documents by distance from
query and report ones that are close.
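The vector method above can be sketched with term-frequency weights and cosine similarity. This is a minimal sketch: real systems use tf-idf weighting and inverted indexes, which are omitted here, and the documents are invented for illustration.

```python
from collections import Counter
from math import sqrt

def vector(text):
    """Term-frequency vector: each word weighted by its count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity: closer to 1.0 means the texts share more terms."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "natural language processing parses natural language",
    "computational linguistics studies language with computers",
    "the stock market fell sharply today",
]
# Rank documents by distance from the query; report the close ones first.
query = vector("natural language")
ranked = sorted(docs, key=lambda d: cosine(query, vector(d)), reverse=True)
```

The first-ranked document is the one about natural language; the stock-market document, sharing no query terms, scores zero.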
Information Retrieval
Mostly implemented using simple statistical
models on the words only
More advanced NLP techniques have not
yielded significantly better results
Information in a text is mostly in its words
Text Categorization
Once upon a time… this was done by humans
Best success for NLP so far (90+% accuracy)
Computers are much faster and more consistent
than humans; automated systems now perform
most of the work.
NLP works better for TC than IR because categories
are fixed.
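The slides don't name a specific method; one common statistical approach to text categorization is a Naive Bayes classifier over word counts, sketched here on toy data (the training pairs and categories are invented for illustration):

```python
from collections import Counter, defaultdict
from math import log

# Toy training data: (category, text) pairs -- illustrative only.
training = [
    ("sports", "the team won the game"),
    ("sports", "a great goal in the match"),
    ("finance", "the stock price rose"),
    ("finance", "markets and interest rates fell"),
]

counts = defaultdict(Counter)   # word counts per category
totals = Counter()              # document counts per category
for cat, text in training:
    counts[cat].update(text.lower().split())
    totals[cat] += 1

vocab = {w for c in counts.values() for w in c}

def score(cat, text):
    """Log P(cat) plus sum of log P(word | cat), with add-one smoothing."""
    s = log(totals[cat] / sum(totals.values()))
    n = sum(counts[cat].values())
    for w in text.lower().split():
        s += log((counts[cat][w] + 1) / (n + len(vocab)))
    return s

def classify(text):
    """Assign the category with the highest score -- categories are fixed."""
    return max(counts, key=lambda cat: score(cat, text))

label = classify("the goal in the game")
```

Because the category set is fixed in advance, even these simple word statistics separate the classes well, which is one reason TC outperforms open-ended IR.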
Text Summarization
Main task: understand main meaning and
describe in a shorter way
Common Systems: Microsoft
How:
– Sentence/paragraph extraction (find the most
important sentences/paragraphs and string them
together for a summary)
– Statistical methods are more common
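Sentence extraction can be sketched by scoring each sentence by how frequent its words are in the document overall, then keeping the top scorers in their original order. This is a minimal sketch: the naive split on periods is an assumption, not a robust sentence splitter.

```python
from collections import Counter

def summarize(text, k=1):
    """Extractive summary: keep the k sentences whose words are most
    frequent in the document overall, in their original order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(w for s in sentences for w in s.lower().split())
    scored = sorted(range(len(sentences)),
                    key=lambda i: sum(freq[w] for w in sentences[i].lower().split()),
                    reverse=True)
    keep = sorted(scored[:k])   # restore original sentence order
    return ". ".join(sentences[i] for i in keep) + "."

doc = ("Machine translation converts text between languages. "
       "Early machine translation systems used dictionaries. "
       "The weather was pleasant.")
summary = summarize(doc, k=1)
```

The off-topic weather sentence shares no frequent words with the rest of the document, so it is the first to be dropped.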
Data extraction
Goal: Derive from text assertions to store in a
database
Example: SCISOR, Jacobs and Rau 1990
Summarizes Dow Jones News stories, and
adds information to a database.
NLP Goals
Have (or feign) some understanding based on
communication with Natural Language
In order to receive and send information in
ways easily understandable by human users
How to get there
NLP applications are all similar in that they
require some level of understanding.
Understand the query, understand the
document, understand the data being
communicated…
Understanding Sentences:
Overview
Parsing and Grammar
How is a sentence composed?
Lexicons
How is a word composed?
Ambiguity
Parsing Requirements
Requires a defined Grammar
Requires a big dictionary (10K words)
Requires that sentences follow the grammar
defined
Requires ability to deal with words not in
dictionary
Parsing (from Section 22.4)
Goal:
Understand a single sentence by syntax analysis
Methods
– Bottom-up
– Top-down
More efficient (and complicated) algorithm
given in 23.2
A Parsing Example
Rules:
S  NP VP
NP  Article N | Proper
VP  Verb NP
N  home | boy | store
Proper  Betty | John
Verb  go|give|see
Article  the | an | a
The Sentence: The boy went home.
A Parsing Example: The answer
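The answer (a parse tree) can be reconstructed with a small top-down, recursive-descent parser over the rules above. Note that the slide's sentence "The boy went home." would need morphological analysis ("went" → "go") and an extra NP rule for bare nouns, so this sketch parses a sentence the toy rules cover directly:

```python
# Toy grammar and lexicon from the slide.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Article", "N"], ["Proper"]],
    "VP": [["Verb", "NP"]],
}
LEXICON = {
    "N": {"home", "boy", "store"},
    "Proper": {"betty", "john"},
    "Verb": {"go", "give", "see"},
    "Article": {"the", "an", "a"},
}

def parse(symbol, words, i):
    """Top-down parse: return (tree, next_index) or None,
    trying each expansion of symbol in turn."""
    if symbol in LEXICON:                       # pre-terminal: match one word
        if i < len(words) and words[i] in LEXICON[symbol]:
            return (symbol, words[i]), i + 1
        return None
    for expansion in GRAMMAR[symbol]:           # non-terminal: try each rule
        children, j = [], i
        for sub in expansion:
            result = parse(sub, words, j)
            if result is None:
                break
            child, j = result
            children.append(child)
        else:                                   # every sub-symbol matched
            return (symbol, children), j
    return None

tree, end = parse("S", "the boy see the store".split(), 0)
```

The result is the nested tree (S (NP (Article the) (N boy)) (VP (Verb see) (NP (Article the) (N store)))), consuming all five words.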
Lexicons
The current trend in parsing
Goal: figure out this word
Method:
1. Tokenize with morphological analysis
Inflectional, derivational, compound
2. Dictionary lookup on each token
3. Error recovery (spelling correction, domain-dependent cues)
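Steps 1 and 2 can be sketched as inflectional suffix-stripping plus dictionary lookup. The root dictionary and suffix list here are toy assumptions; real morphological analyzers also handle derivation, compounds, and irregular forms.

```python
# Toy root dictionary -- a real lexicon holds 10,000-100,000 root forms.
ROOTS = {"walk": "verb", "talk": "verb", "dog": "noun", "happy": "adj"}

# Common inflectional suffixes to strip, longest first.
SUFFIXES = ["ing", "ed", "es", "s", "er"]

def lookup(word):
    """Dictionary lookup with simple inflectional analysis:
    try the word itself, then each suffix-stripped candidate root.
    Returns (root, part_of_speech, suffix) or None."""
    word = word.lower()
    if word in ROOTS:
        return word, ROOTS[word], None
    for suf in SUFFIXES:
        if word.endswith(suf):
            root = word[: -len(suf)]
            if root in ROOTS:
                return root, ROOTS[root], suf
    return None  # step 3, error recovery (spelling correction), would go here

result = lookup("walking")
```

Here "walking" resolves to the root "walk" plus the suffix "-ing"; a word with no recoverable root falls through to error recovery.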
Lexicons in Practice
10,000 – 100,000 root word forms
Expensive to develop, not readily shared
WordNet (George Miller, Princeton)
clarity.princeton.edu
Ambiguity
More extensive Language  more Ambiguity
Disambiguation:
task of finding correct interpretation
Evidence:
• Syntactic
• Lexical
• Semantic
• Metonymy
• Metaphor
Disambiguation Tools
Syntax
modifiers (prepositions, adverbs) usually
attach to nearest possible place
Lexical
probability of a word having a particular
meaning, or being used in a particular way
Semantic
determine most likely meaning from context
Semantic Disambiguation
Example: “with”
Sentence                         Relation
I ate spaghetti with meatballs.  (ingredient of spaghetti)
I ate spaghetti with salad.      (side dish of spaghetti)
I ate spaghetti with abandon.    (manner of eating)
I ate spaghetti with a fork.     (instrument of eating)
I ate spaghetti with a friend.   (accompanier of eating)
Disambiguation is probabilistic!
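That probabilistic choice can be sketched as picking the relation with the highest estimated probability given the object of "with". The probability table here is invented for illustration, standing in for counts estimated from a tagged corpus.

```python
# Hypothetical probabilities P(relation | object of "with"),
# standing in for counts estimated from a tagged corpus.
P = {
    "meatballs": {"ingredient": 0.7, "accompanier": 0.2, "instrument": 0.1},
    "fork":      {"instrument": 0.8, "accompanier": 0.1, "ingredient": 0.1},
    "friend":    {"accompanier": 0.9, "instrument": 0.05, "ingredient": 0.05},
}

def disambiguate(obj):
    """Pick the most probable relation for 'with <obj>'."""
    return max(P[obj], key=P[obj].get)

relation = disambiguate("fork")
```

Each answer is only the most likely reading, not a certain one, which is exactly the sense in which disambiguation is probabilistic.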
More Disambiguation Tools
Metonymy
“Chrysler announced” doesn’t mean
companies can talk.
Metaphor
more is up: confidence has fallen, prices
have sky-rocketed.
Beyond Sentences:
Discourse understanding
Sentences are nice but…
Most communication takes place in the form of
multiple sentences (discourses)
There’s lots more to the world than parsing and
grammar!
Discourse Understanding: Goals
Correctly interpret sequences of sentences
Increase knowledge about world from
discourse (learn)
– Dependent on facts as well as new knowledge
gained from discourse.
Discourse Understanding:
an example
John went to a fancy restaurant.
He was pleased and gave the waiter a big tip.
He spent $50.
What is a proper understanding of this
discourse?
What is needed to have a proper
understanding of this discourse?
General world knowledge
• Restaurants serve meals, so a reason for
going to a restaurant is to eat.
• Fancy restaurants serve fancy meals, $50 is
a typical price for a fancy meal. Paying and
leaving a tip is customary after eating meals
at restaurants.
• Restaurants employ waiters.
General Structure of Discourse
“John went to a fancy restaurant. He was pleased…”
Describe some steps of a plan for a character
Leave out steps that can be easily inferred from other
steps.
From first sentence: John is in the eat-at-restaurant
plan. Inference: eat-meal step
Syntax and Semantics
“...gave the waiter a big tip.”
“the” used for objects that have been
mentioned before
OR
Have been implicitly alluded to; in this case,
by the eat-at-restaurant plan
Specific knowledge about
situation
“He spent $50”
• “He” is John.
• Recipients of the $50 are the restaurant and
the waiter.
Structure of coherent discourse
Discourses are composed of segments
Relations between segments (coherence relations):
– Enablement
– Evaluation
– Causal
– Elaboration
– Explanation
(more in Mann and Thompson, 1983)
Speaker Goals (Hobbs 1990)
The Speaker does 4 things:
1) wants to convey a message
2) has a motivation or goal
3) wants to make it easy for the hearer to
understand.
4) links new information to what hearer knows.
A Theory of “Attention”
Grosz and Sidner, 1986
Speaker or hearer’s attention is focused
Focus follows a stack model
Explains why order is important.
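The stack model can be sketched as focus spaces pushed when a discourse segment opens and popped when it closes, with references resolving against the most recent spaces first. The class and method names below are my own, not Grosz and Sidner's.

```python
class FocusStack:
    """Sketch of Grosz and Sidner's stack model of attention:
    opening a discourse segment pushes its focus space,
    closing it pops, and references resolve top-down."""

    def __init__(self):
        self.stack = []

    def push_segment(self, entities):
        self.stack.append(set(entities))

    def pop_segment(self):
        self.stack.pop()

    def resolve(self, entity):
        """Return depth (0 = top) of the most recent focus space
        mentioning entity, or None if it is out of focus."""
        for depth, space in enumerate(reversed(self.stack)):
            if entity in space:
                return depth
        return None

focus = FocusStack()
focus.push_segment({"Paris"})      # "I visited Paris." opens a segment
focus.push_segment({"cologne"})    # "I bought ... cologne." opens a subsegment
here = focus.resolve("cologne")    # found at the top: currently in focus
focus.pop_segment()                # "Then I flew home." closes the subsegment
gone = focus.resolve("cologne")    # no longer in focus
```

Once the Paris segment is closed, the cologne is off the stack, which is why mentioning it later (as in the reordered discourse on the next slide) reads so differently.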
Order is important
What’s the difference?

Version 1:
I visited Paris.
I bought you some expensive cologne.
Then I flew home.
I went to Kmart.
I bought some underwear.

Version 2:
I visited Paris.
Then I flew home.
I went to Kmart.
I bought you some expensive cologne.
I bought some underwear.
Summary
• NLP has practical applications, but none does a
great job in an open-ended domain
• Sentences are understood through grammar,
parsing and lexicons
• Choosing a good interpretation of a sentence
requires evidence from many sources
• Most interesting NLP comes in connected
discourse rather than in isolated sentences
Current NLP Crowd
– Originally, mostly mathematicians.
– Now computer scientists (computational
linguists = linguists, statisticians, computer
science folk).
– Big names are Perrault, Hobbs, Pereira, Grosz
and Charniak
Current NLP conferences
Association for Computational Linguistics
Coling
EACL (European Chapter of the Association for
Computational Linguistics)
USA Schools with NLP Grad.
Brown University
Massachusetts at Amherst, University of
Buffalo, SUNY at
Massachusetts Institute of Technology
California at Berkeley, University of
Michigan, University of
California at Los Angeles, University of
New Mexico State University
Carnegie-Mellon University
New York University
Columbia University
Ohio State University
Cornell University
Pennsylvania, University of
Delaware, University of
Rochester, University of
Duke University
Southern California, University of
Georgetown University
Stanford University
Georgia, University of
Utah, University of
Georgia Institute of Technology
Wisconsin - Milwaukee, University of
Harvard University
Yale University
Indiana University
Information Sciences Institute (ISI) at the University
of Southern California
Johns Hopkins University
Current NLP Journals
Computational Linguistics
Natural Language Engineering (JNLE)
Machine Translation
Natural Language and Linguistic Theory
Industrial NLP Research Centers
AT&T Labs - Research
BBN Systems and Technologies Corporation
DFKI (German research center for AI)
General Electric R&D
IRST, Italy
IBM T.J. Watson Research, NY
Lucent Technologies Bell Labs, Murray Hill, NJ
Microsoft Research, Redmond, WA
MITRE
NEC Corporation
SRI International, Menlo Park, CA
SRI International, Cambridge, UK
Xerox, Palo Alto, CA
XRCE, Grenoble, France
Discourse comprehension
The procedure is actually quite simple. First you arrange things into
different groups. Of course, one pile may be sufficient depending
on how much there is to do. If you have to go somewhere else due
to lack of facilities that is the next step, otherwise you are pretty
well set. It is important not to overdo things. That is, it is better to
do too few things at once than too many. In the short run this may
not seem important but complications can easily arise. A mistake is
expensive as well. At first the whole procedure will seem
complicated. Soon however, it will become just another facet of
life. It is difficult to foresee any end to the necessity of this task in
the immediate future, but then one can never tell. After the
procedure is completed one arranges the material into different
groups again. Then they can get put into their appropriate places.
Eventually they will be used once more and the whole cycle will
have to be repeated. However, this is a part of life.
Now: What do you remember?
What are the four steps mentioned?
What step is left out?
What is the “material” mentioned?
What kind of mistake would be expensive?
Is it better to do too few or too many?
Why?
Oh Yeah -The title of the discourse is:
“Washing Clothes”
Now, re-read, and see if the questions are
easier. What does this say about discourse
comprehension?