Overview of NLP

Introduction to Computational Linguistics
Martha Palmer
April 19, 2006
LING 2000 - 2006
NLP
Natural Language Processing
• Machine Translation
• Predicate argument structures
• Syntactic parses
• Producing semantic representations
• Ambiguities in sentence interpretation
Machine Translation
• One of the first applications for computers
– bilingual dictionary > word-word translation
• Good translation requires understanding!
– War and Peace, The Sound and The Fury?
• What can we do? Sublanguages.
– technical domains, static vocabulary
– Meteo in Canada, Caterpillar Tractor manuals, botanical descriptions, military messages
Example translation
Translation Issues:
Korean to English
- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs morphology
Common Thread
• Predicate-argument structure
– Basic constituents of the sentence and how they are related to each other
• Constituents
– John, Mary, the dog, pleasure, the store.
• Relations
– Loves, feeds, go, to, bring
Abstracting away from surface structure
Transfer lexicons
Machine Translation Lexical Choice: Word Sense Disambiguation
Iraq lost the battle.
Ilakuka centwey ciessta.
[Iraq ] [battle] [lost].
John lost his computer.
John-i computer-lul ilepelyessta.
[John] [computer] [misplaced].
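The two glosses above suggest that the choice of target verb depends on what is being lost. A minimal sketch of that idea, assuming a hypothetical transfer lexicon keyed on the semantic class of the object (the class labels and table names are illustrative, not from the slides):

```python
# Hypothetical semantic classes for the objects in the two example sentences.
SEMANTIC_CLASS = {"battle": "event", "computer": "physical-object"}

# Hypothetical transfer lexicon: English verb -> Korean verb, selected by the
# semantic class of the object. The Korean forms are taken from the glosses.
TRANSFER_LEXICON = {
    "lost": {
        "event": "ciessta",                 # glossed [lost] in "Iraq lost the battle."
        "physical-object": "ilepelyessta",  # glossed [misplaced] in "John lost his computer."
    }
}


def choose_translation(verb, obj):
    """Return the target verb for (verb, obj), or None if the lexicon has no entry."""
    obj_class = SEMANTIC_CLASS.get(obj)
    return TRANSFER_LEXICON.get(verb, {}).get(obj_class)


print(choose_translation("lost", "battle"))
print(choose_translation("lost", "computer"))
```

A real system would need a much larger class inventory and a fallback for objects it has never seen; here an unknown object simply yields None.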
Natural Language Processing
• Syntax
– Grammars, parsers, parse trees, dependency structures
• Semantics
– Subcategorization frames, semantic classes, ontologies, formal semantics
• Pragmatics
– Pronouns, reference resolution, discourse models
Syntactic Categories
• Nouns, pronouns, proper nouns
• Verbs: intransitive, transitive, and ditransitive verbs (subcategorization frames)
• Modifiers: adjectives, adverbs
• Prepositions
• Conjunctions
Syntactic Parsing
• The cat sat on the mat.
Det Noun Verb Prep Det Noun
• Time flies like an arrow.
Noun Verb Prep Det Noun
• Fruit flies like a banana.
Noun Noun Verb Det Noun
Context Free Grammar
• S -> NP VP
• NP -> det (adj) N
• NP -> Proper N
• NP -> N
• VP -> V, VP -> V PP
• VP -> V NP
• VP -> V NP PP, PP -> Prep NP
• VP -> V NP NP
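The rules above can be tried out directly. The following is a minimal sketch that stores the grammar as a table and counts how many parses a sentence has. The lexicon, the expansion of the optional (adj) into two rules, and the function name count_parses are assumptions made for illustration.

```python
from functools import lru_cache

# The grammar above; "det (adj) N" is written out as two rules.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "Adj", "N"), ("ProperN",), ("N",)],
    "VP": [("V",), ("V", "PP"), ("V", "NP"), ("V", "NP", "PP"), ("V", "NP", "NP")],
    "PP": [("Prep", "NP")],
}

# Hypothetical lexicon: the categories each word may have.
LEXICON = {
    "the": {"Det"}, "an": {"Det"}, "cat": {"N"}, "mat": {"N"}, "arrow": {"N"},
    "sat": {"V"}, "on": {"Prep"}, "time": {"N"},
    "flies": {"N", "V"}, "like": {"V", "Prep"},
}


def count_parses(words, grammar=GRAMMAR, lexicon=LEXICON):
    """Count the distinct parse trees that derive `words` from S."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def derive(sym, i, j):
        # Number of ways `sym` derives words[i:j].
        if sym not in grammar:  # a word category such as Det or N
            return int(j - i == 1 and sym in lexicon.get(words[i], ()))
        return sum(seq(rhs, i, j) for rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives words[i:j].
        if not rhs:
            return int(i == j)
        first, rest = rhs[0], rhs[1:]
        # Each symbol covers at least one word, so leave room for the rest.
        return sum(derive(first, i, k) * seq(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return derive("S", 0, len(words))


print(count_parses("the cat sat on the mat".split()))
```

With the rules exactly as listed, "time flies like an arrow" gets a single parse. Adding a hypothetical rule NP -> N N, for example `{**GRAMMAR, "NP": GRAMMAR["NP"] + [("N", "N")]}`, admits a second reading in which "time flies" is a noun phrase. Note that this simple recursion assumes the grammar has no left-recursive rules.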
Parses
The cat sat on the mat
S
  NP
    Det the
    N cat
  VP
    V sat
    PP
      Prep on
      NP
        Det the
        N mat
Parses
Time flies like an arrow.
S
  NP
    N time
  VP
    V flies
    PP
      Prep like
      NP
        Det an
        N arrow
Parses
Time flies like an arrow.
S
  NP
    N time
    N flies
  VP
    V like
    NP
      Det an
      N arrow
Features
• C for Case, Subjective/Objective
– She visited her.
• P for Person agreement, (1st, 2nd, 3rd)
– I like him. You like him. He likes him.
• N for Number agreement, Subject/Verb
– He likes him. They like him.
• G for Gender agreement, Subject/Verb
– English, reflexive pronouns: He washed himself.
– Romance languages, det/noun
• T for Tense
– auxiliaries, sentential complements, etc.
– *will finished is ungrammatical
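One common way to enforce features like these is to attach a small feature table to each word and require the tables to be compatible. The sketch below covers only case, person, and number for subject/verb agreement; the feature values and the word lists are assumptions made for illustration.

```python
def unify(a, b):
    """Merge two feature dicts; return None if any shared feature clashes."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged


# Hypothetical feature tables: C = case, P = person, N = number.
PRONOUNS = {
    "I": {"C": "subj", "P": 1, "N": "sg"},
    "you": {"P": 2},  # same form for both cases and both numbers
    "he": {"C": "subj", "P": 3, "N": "sg"},
    "they": {"C": "subj", "P": 3, "N": "pl"},
    "him": {"C": "obj", "P": 3, "N": "sg"},
}

# Each verb form lists the alternative subject features it accepts.
VERBS = {
    "likes": [{"P": 3, "N": "sg"}],
    "like": [{"P": 1}, {"P": 2}, {"N": "pl"}],
}


def agrees(subject, verb):
    """True if `subject` can be the subject of `verb` under the tables above."""
    subj = unify(PRONOUNS[subject], {"C": "subj"})
    if subj is None:  # wrong case, e.g. "him likes"
        return False
    return any(unify(subj, alt) is not None for alt in VERBS[verb])


for s, v in [("he", "likes"), ("he", "like"), ("they", "like"), ("him", "likes")]:
    print(s, v, agrees(s, v))
```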
Probabilistic Context Free Grammars
• Adding probabilities
• Lexicalizing the probabilities
Simple Context Free Grammar in BNF
S  → NP VP
NP → Pronoun
   | Noun
   | Det Adj Noun
   | NP PP
PP → Prep NP
V  → Verb
   | Aux Verb
VP → V
   | V NP
   | V NP NP
   | V NP PP
   | VP PP
Simple Probabilistic CFG
S  → NP VP
NP → Pronoun        [0.10]
   | Noun           [0.20]
   | Det Adj Noun   [0.50]
   | NP PP          [0.20]
PP → Prep NP        [1.00]
V  → Verb           [0.33]
   | Aux Verb       [0.67]
VP → V              [0.10]
   | V NP           [0.40]
   | V NP NP        [0.10]
   | V NP PP        [0.20]
   | VP PP          [0.20]
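In a probabilistic CFG, the probability of a parse is the product of the probabilities of the rules used to build it. The sketch below applies that to the grammar on this slide. It assumes a probability of 1.00 for S → NP VP (the slide lists no value for it) and ignores word-level probabilities, which the slide does not give; the tuple encoding of trees is also an assumption.

```python
from math import prod

RULE_PROB = {
    ("S", ("NP", "VP")): 1.00,  # assumption: the slide gives no value for this rule
    ("NP", ("Pronoun",)): 0.10,
    ("NP", ("Noun",)): 0.20,
    ("NP", ("Det", "Adj", "Noun")): 0.50,
    ("NP", ("NP", "PP")): 0.20,
    ("PP", ("Prep", "NP")): 1.00,
    ("V", ("Verb",)): 0.33,
    ("V", ("Aux", "Verb")): 0.67,
    ("VP", ("V",)): 0.10,
    ("VP", ("V", "NP")): 0.40,
    ("VP", ("V", "NP", "NP")): 0.10,
    ("VP", ("V", "NP", "PP")): 0.20,
    ("VP", ("VP", "PP")): 0.20,
}


def tree_prob(tree):
    """Probability of a tree given as (label, child, ...); leaves are category names."""
    if isinstance(tree, str):  # word category; lexical probabilities are ignored here
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(label, rhs)] * prod(tree_prob(c) for c in children)


# Pronoun Verb Noun, e.g. a sentence shaped like "He likes cats".
tree = ("S", ("NP", "Pronoun"), ("VP", ("V", "Verb"), ("NP", "Noun")))
print(tree_prob(tree))  # 1.00 * 0.10 * 0.40 * 0.33 * 0.20, about 0.00264
```

When a sentence has several parses, a parser built on such a grammar would typically prefer the one with the highest product.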
Simple Probabilistic Lexicalized CFG
S  → NP VP
NP → Pronoun        [0.10]
   | Noun           [0.20]
   | Det Adj Noun   [0.50]
   | NP PP          [0.20]
PP → Prep NP        [1.00]
V  → Verb           [0.33]
   | Aux Verb       [0.67]
VP → V              [0.87] {sleep, cry, laugh}
   | V NP           [0.03]
   | V NP NP        [0.00]
   | V NP PP        [0.00]
   | VP PP          [0.10]
Simple Probabilistic Lexicalized CFG
VP → V              [0.30]
   | V NP           [0.60] {break, split, crack, ...}
   | V NP NP        [0.00]
   | V NP PP        [0.00]
   | VP PP          [0.10]

VP → V              [0.10]
   | V NP           [0.40]
   | V NP NP        [0.10]
   | V NP PP        [0.20]
   | VP PP          [0.20]
What about leave? leave1, leave2?
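Lexicalizing the probabilities amounts to selecting a different VP distribution depending on the head verb. A minimal sketch, assuming hypothetical class names and a fallback to the unlexicalized distribution for verbs that belong to no listed class:

```python
VP_EXPANSIONS = ("V", "V NP", "V NP NP", "V NP PP", "VP PP")

# Verb classes as listed on the slides; the class names are made up here.
CLASS_MEMBERS = {
    "sleep-class": {"sleep", "cry", "laugh"},
    "break-class": {"break", "split", "crack"},
}

# One probability per entry of VP_EXPANSIONS, in the same order.
VP_PROBS = {
    "sleep-class": (0.87, 0.03, 0.00, 0.00, 0.10),
    "break-class": (0.30, 0.60, 0.00, 0.00, 0.10),
    None: (0.10, 0.40, 0.10, 0.20, 0.20),  # unlexicalized fallback (assumption)
}


def vp_prob(head_verb, expansion):
    """Probability of expanding VP as `expansion` when its head is `head_verb`."""
    verb_class = next(
        (name for name, members in CLASS_MEMBERS.items() if head_verb in members),
        None,
    )
    return VP_PROBS[verb_class][VP_EXPANSIONS.index(expansion)]


print(vp_prob("sleep", "V"))
print(vp_prob("break", "V NP"))
print(vp_prob("leave", "V NP"))
```

The sketch does not settle the question raised about leave: a verb with more than one sense may need a separate entry per sense rather than the fallback.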
Language to Logic
• John went to the book store.
∃ John, ∃ store1: go(John, store1)
• John bought a book.
buy(John,book1)
• John gave the book to Mary.
give(John,book1,Mary)
• Mary put the book on the table.
put(Mary,book1,table1)
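Once sentences are in this predicate form, they can be stored and queried mechanically. A minimal sketch, assuming the facts are kept as plain tuples and that None acts as a wildcard in a query (both are illustrative choices):

```python
# The four logical forms above, stored as (predicate, argument, ...) tuples.
FACTS = {
    ("go", "John", "store1"),
    ("buy", "John", "book1"),
    ("give", "John", "book1", "Mary"),
    ("put", "Mary", "book1", "table1"),
}


def query(pattern):
    """Return the facts matching `pattern`; None in the pattern matches anything."""
    return sorted(
        fact for fact in FACTS
        if len(fact) == len(pattern)
        and all(p is None or p == f for p, f in zip(pattern, fact))
    )


# What did John give, and to whom?
print(query(("give", "John", None, None)))
# Which three-place events involve book1 as the second argument?
print(query((None, None, "book1", None)))
```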
Semantics
Same event - different sentences
John broke the window with a hammer.
John broke the window with the crack.
The hammer broke the window.
The window broke.
Same event - different syntactic frames
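John broke the window with a hammer.
John: SUBJ; broke: VERB; the window: OBJ; with a hammer: MODIFIER
John broke the window with the crack.
John: SUBJ; broke: VERB; the window: OBJ; with the crack: MODIFIER
The hammer broke the window.
The hammer: SUBJ; broke: VERB; the window: OBJ
The window broke.
The window: SUBJ; broke: VERB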
Semantics - predicate arguments
break(AGENT, INSTRUMENT, PATIENT)
John broke the window with a hammer.
John: AGENT; the window: PATIENT; a hammer: INSTRUMENT
The hammer broke the window.
The hammer: INSTRUMENT; the window: PATIENT
The window broke.
The window: PATIENT
Fillmore 68 - The case for case
John broke the window with a hammer.
John: AGENT, SUBJ; the window: PATIENT, OBJ; with a hammer: INSTRUMENT, MODIFIER
The hammer broke the window.
The hammer: INSTRUMENT, SUBJ; the window: PATIENT, OBJ
The window broke.
The window: PATIENT, SUBJ
Canonical Representation
break (Agent: animate,
       Instrument: tool,
       Patient: physical-object)
Agent <=> subj
Instrument <=> subj, with-pp
Patient <=> obj, subj
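The canonical representation can be read as a small procedure: for each role, try the grammatical functions it may map to and accept the first filler that meets the role's restriction. The sketch below follows that reading. The semantic types assigned to each noun and the greedy role order are assumptions made for illustration.

```python
# Restrictions and mappings from the canonical representation of break.
RESTRICTION = {"Agent": "animate", "Instrument": "tool", "Patient": "physical-object"}
REALIZATION = {
    "Agent": ["subj"],
    "Instrument": ["subj", "with-pp"],
    "Patient": ["obj", "subj"],
}

# Hypothetical semantic types for the nouns in the example sentences.
TYPES = {
    "John": {"animate", "physical-object"},
    "hammer": {"tool", "physical-object"},
    "window": {"physical-object"},
    "crack": set(),  # not a tool, so it cannot fill Instrument
}


def assign_roles(frame):
    """Map a syntactic frame, e.g. {"subj": "John", "obj": "window"}, to roles."""
    roles, used = {}, set()
    for role, functions in REALIZATION.items():
        for function in functions:
            filler = frame.get(function)
            if filler is None or function in used:
                continue
            if RESTRICTION[role] in TYPES.get(filler, set()):
                roles[role] = filler
                used.add(function)
                break
    return roles


print(assign_roles({"subj": "John", "obj": "window", "with-pp": "hammer"}))
print(assign_roles({"subj": "hammer", "obj": "window"}))
print(assign_roles({"subj": "window"}))
```

With these tables, "with the crack" receives no role, so it stays a plain modifier rather than an Instrument.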
Syntax/semantics interaction
• Parsers will produce syntactically valid parses for semantically anomalous sentences
• Lexical semantics can be used to rule them out
Headlines
• Police Begin Campaign To Run Down Jaywalkers
• Iraqi Head Seeks Arms
• Teacher Strikes Idle Kids
• Miners Refuse To Work After Death
• Juvenile Court To Try Shooting Defendant