Prepositional Phrase Attachment

Chris Brew
Ohio State University
795M, Winter 2000
20/03/2016
Prepositional Phrase Attachment
 Hindle and Rooth: partial parser to get statistics
 Collins and Brooks: back-off estimation from treebank data + attachment decision
 Merlo, Crocker and Berthouzoz: multiple PPs disambiguated
 Ratnaparkhi: entirely unsupervised
The problem
Two competing structures: noun attachment in the first ("the shirt with pockets"), verb attachment in the second ("washed ... with soap"):

(S (NP I) (VP (V bought) (NP (DET the) (NN shirt) (PP (P with) (NP pockets)))))
(S (NP I) (VP (V washed) (NP (DET the) (NN shirt)) (PP (P with) (NP soap))))
Hindle and Rooth
 Whittemore, Ferrara and Brunner
– Structural heuristics (Kimball’s Right Association, Frazier’s Minimal Attachment) account for only 55% of behaviour
– Lexical preferences do much better
 H and R
– note that the preferences for this experiment were provided by human judgement
– ask how to get a good list of lexical preferences automatically
Discovering Lexical Association in text
 Church’s part-of-speech analyser
 Hindle’s Fidditch partial parser
 13 million words of AP newswire
Fidditch
[Fidditch partial parse of “the radical changes in export and customs regulations evidently are aimed at remedying an extreme shortage of consumer goods in the Soviet Union”, with unattached constituents marked “?”]
Extract information about words

ID  Verb     Noun        Prep  Syntax
a            change      in    -V
b            regulation
c   aim      PRO-+       at
d            VING
e   remedy   shortage    of
f            good        in
g            DART-PNP
h            VING
i   assuage  citizen
j            scarcity    of
k            item        as
l            wiper
What the table means
 noun column has the head noun of the noun phrase (or various special cases)
 verb column has the head verb if the noun phrase was its object
 prep column has the following preposition
 Syntax column has -V if there is no preceding verb
Counting attachments
 Parser isn’t reliable, so use a decision procedure to assign nouns and verbs to noun-attach (na) and verb-attach (va)
No preposition
 add a count for <noun,NULL> or <verb,NULL>
Sure Verb Attach 1:
 if the noun phrase head is a pronoun, add a count for <verb,prep>
Sure Verb Attach 2:
 if the verb is passivized, verb attach unless the preposition is “by”
Sure Noun Attach
 if no verb available, then noun attach
Ambiguous Attach 1:
 if LA score > 2.0, verb attach; if < -2.0, noun attach. Use the stats so far for calculating the score. Repeat until stable.
Ambiguous Attach 2:
 Share counts between noun and verb
Unsure Attach:
 attach to noun by default
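The sure-attach rules on the preceding slides, plus the defaults, can be sketched as one decision function. The ordering, the function signature, and the simplified labels (e.g. "PRO" for a pronoun head) are our own reconstruction, not H&R's code:

```python
def classify(verb, noun, prep, passive=False):
    """One plausible ordering of H&R's decision procedure; 'ambiguous'
    cases are then handled by the LA-score iteration, count sharing,
    and the noun-attach default."""
    if prep is None:
        return "null"        # no PP: count <noun,NULL> and <verb,NULL>
    if noun == "PRO":
        return "verb"        # Sure Verb Attach 1: pronoun head
    if passive and prep != "by":
        return "verb"        # Sure Verb Attach 2: passivized verb
    if verb is None:
        return "noun"        # Sure Noun Attach: no verb available
    return "ambiguous"       # left for LA scoring / defaults
```

For example, `classify(None, "change", "in")` returns `"noun"`, while `classify("buy", "shirt", "with")` falls through to `"ambiguous"`.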
LA scores
va: (send (soldier NULL) (into Afghanistan))
na: (send (soldier (into Afghanistan)))

LA = log2( P(va,p | v,n) / P(na,p | v,n) )
   = log2( P(va,into | send,soldier) / P(na,into | send,soldier) )

 and we approximate this using collected counts:
P(va,into | send,soldier) ~ P(into | send) * P(NULL | soldier)
P(na,into | send,soldier) ~ P(into | soldier)
Estimating the counts
P(into | send)    = |send,into| / |send|       = .049
P(NULL | soldier) = |soldier,NULL| / |soldier| = .800
P(into | soldier) = |soldier,into| / |soldier| = .0007
LA = log2(.049 * .800 / .0007) = 5.81
 which is enough to be very sure that verb attach is right
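The arithmetic above can be reproduced in a few lines; the function and argument names are ours:

```python
import math

def la_score(p_prep_given_verb, p_null_given_noun, p_prep_given_noun):
    """LA = log2(P(va,p|v,n) / P(na,p|v,n)), using the slide's
    approximation of each term by collected counts."""
    return math.log2(p_prep_given_verb * p_null_given_noun / p_prep_given_noun)

# the send/soldier/into figures from the slide
score = la_score(0.049, 0.800, 0.0007)   # about 5.81
```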
Smooth the estimates
 using typical association rates of prepositions with the whole classes of nouns and verbs
P(p|n) = ( |n,p| + P(p|NOUN) ) / ( |n| + 1 )
where P(p|NOUN) is |any noun, p| / |any noun|
and similarly for verbs
 Laplace’s M-estimate again
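A minimal sketch of the smoothed estimate, assuming raw counts as inputs (the argument names are ours):

```python
def smoothed_p(n_with_p, n_total, any_noun_with_p, any_noun_total):
    """P(p|n) = (|n,p| + prior) / (|n| + 1), where the prior is the
    association rate of p with the whole noun class."""
    prior = any_noun_with_p / any_noun_total
    return (n_with_p + prior) / (n_total + 1)

# an unseen noun (zero counts) falls back to the class-wide rate
unseen = smoothed_p(0, 0, 5, 100)    # 0.05
```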
Performance
 ~80% correct
 can get better precision by accepting lower recall (useful for exploratory text analysis)
 “good enough to be added to a parser like Fidditch”
Backed-off estimation
 Collins and Brooks
– use N2 as well as N1
[the same two trees as in “The problem”, annotated V N1 P N2: bought/washed = V, shirt = N1, with = P, pockets/soap = N2]
Use treebank data
 similar approaches
– Ratnaparkhi, Reynar and Roukos
– Brill and Resnik
 difficult to compare results with Hindle and Rooth, because the corpora used are different (but raw scores around 80% in both cases)
The data
 20801 training and 3097 test examples
 about 95% of the quadruples in the test data had not been seen in the training set
 compare H&R: 200,000 triples
The backed-off method
 Katz’s approach to n-grams
– If there are enough trigrams:
p(wn | wn-1, wn-2) = |wn-2, wn-1, wn| / |wn-2, wn-1|
– otherwise back off to bigrams:
p(wn | wn-1, wn-2) = α1 * |wn-1, wn| / |wn-1|
– otherwise back off to unigrams:
p(wn | wn-1, wn-2) = α1 * α2 * |wn| / N
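A toy count-based version of this back-off scheme; the α weights here are placeholder constants, not the properly normalized Katz discounts:

```python
def katz_backoff(tri, tri_ctx, bi, bi_ctx, uni, n_tokens, a1=0.4, a2=0.4):
    """Use the trigram estimate when its count exists, otherwise back
    off to the (down-weighted) bigram, then the unigram estimate."""
    if tri > 0:
        return tri / tri_ctx                 # |w-2,w-1,w| / |w-2,w-1|
    if bi > 0:
        return a1 * bi / bi_ctx              # a1 * |w-1,w| / |w-1|
    return a1 * a2 * uni / n_tokens          # a1 * a2 * |w| / N
```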
Take this method and apply to PP data
 Start with full quadruples
 Four possible triples to back off to
 Six possible pairs to back off to
– Restrict attention to those containing P
How to combine counts from triples and pairs

ptriple(1 | v,n1,p,n2) ~ ( p(1,v,n1,p) + p(1,v,p,n2) + p(1,n1,p,n2) )
                         / ( p(v,n1,p) + p(v,p,n2) + p(n1,p,n2) )

ppair(1 | v,n1,p,n2) ~ ( p(1,v,p) + p(1,p,n2) + p(1,n1,p) )
                       / ( p(v,p) + p(p,n2) + p(n1,p) )

 other combinations were tried; this formula is better than simple averaging for this task
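The full back-off chain can be sketched with count dictionaries. The dictionary representation, the function names, and the 0.5 decision threshold (noun attach when the pooled ratio of noun-attach counts is at least half) are our assumptions about how to realize the slide's formulas:

```python
def pooled(num, den, tuples):
    """Pool numerator/denominator counts over the tuples at one
    back-off level, as in the slide's formulas."""
    n = sum(num.get(t, 0) for t in tuples)
    d = sum(den.get(t, 0) for t in tuples)
    return n / d if d > 0 else None

def decide(num, den, v, n1, p, n2):
    """num[t] = count of noun attachments for tuple t; den[t] = total
    count of t. Back off: quadruple -> triples -> pairs -> preposition."""
    levels = [
        [(v, n1, p, n2)],
        [(v, n1, p), (v, p, n2), (n1, p, n2)],   # triples containing p
        [(v, p), (n1, p), (p, n2)],              # pairs containing p
        [(p,)],
    ]
    for tuples in levels:
        prob = pooled(num, den, tuples)
        if prob is not None:
            return "noun" if prob >= 0.5 else "verb"
    return "noun"                                # default: noun attach

# toy counts: the 'pockets' quadruple is always noun-attached; the
# 'soldier into' triple is never noun-attached
num_c = {("buy", "shirt", "with", "pockets"): 3, ("send", "soldier", "into"): 0}
den_c = {("buy", "shirt", "with", "pockets"): 3, ("send", "soldier", "into"): 4}
```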
What was “enough data”?
 In this task it turns out that using a threshold of 0 for the denominator is best: if there is even one instance of the quadruple, trust it.
 For n-grams, it was better to ignore low counts
 the reason for this is not obvious, but in such situations trying things is essential
Results
 84.1% correct without morphological analysis, 84.5% with
 Quadruples more accurate than triples, in turn more accurate than doubles, etc.
 But only 148 quadruples in test data, vs 764 triples, 1965 doubles, 216 singles
Comparison with Hindle and Rooth
 We have 1924 test cases where H&R would have made a decision
 The backoff method using just the |v,p| and |n1,p| counts (86.5%) outscores H&R style (82.1%)
Extra experiments
 Setting the threshold to 5 reduces performance to 81.6%
 Tuples containing the preposition are the most effective
Attaching Multiple PPs
 Merlo, Crocker, Berthouzoz
 For a single PP there are two structures, for 2 PPs there are 5, for 3 PPs 14
 so the problem is harder; a dumb algorithm will do poorly
 Generalization of Collins/Brooks
Five structures for V NP PP PP
 Structure 1 (535)
The agency said it will [keep]v [the debt]np [under review]pp [for possible downgrade]pp
 Structure 2 (1160)
Penney will [extend]v [[its involvement]np [with the service]pp]np [for at least five years]pp
Structure 3 (1394)
[address]v [[budget limits]np [on [credit allocations [for the Federal Housing agency]pp]np]pp]np
Structure 4 (1055)
[abandon] [the everyday pricing approach] [in the face of [the poor results]]
Structure 5 (539)
[answering] [questions [from members of Parliament]] [after his announcement]
Algorithm
 Model of PP1 as in Collins and Brooks, but excluding p2
 Model of 2 PPs is back-off over sextuples (i,v,n1,p1,n2,p2) until we get to tuples that don’t have p1, or that don’t have p2
 then Competitive Back-off
Competitive Back-off
 Do standard back-off for PP1 using v,n1,p1
 Do standard back-off for PP2 using v,n2,p2
 Do back-off for PP2 using n1 instead of n2 (i.e., v,n1,p2)
 Combine these results using a simple procedure, with a tiebreak where they conflict
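The slides leave the combination step unspecified. As a loose sketch, suppose each back-off returns a (decision, level) pair, where a lower level means the decision came from a less backed-off (more specific) tuple, and conflicts are broken in favour of the more specific model. This tiebreak is purely our placeholder, not Merlo et al.'s actual procedure:

```python
def competitive_backoff(pp2_with_n2, pp2_with_n1):
    """pp2_with_n2: (decision, level) from back-off over (v, n2, p2);
    pp2_with_n1: (decision, level) from back-off over (v, n1, p2).
    Lower level = less backed off = more specific evidence."""
    d2, lvl2 = pp2_with_n2
    d1, lvl1 = pp2_with_n1
    if d1 == d2:
        return d2                      # the two back-offs agree
    return d2 if lvl2 <= lvl1 else d1  # prefer the more specific model
```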
Results
 PP1 (2 choices): 84.3%, baseline 61.2% (choose most frequent)
 PP2 (5 choices): 69.6%, baseline 29.8% (choose most frequent)
 PP3 (14 choices): 43.6%, baseline 18.5% (choose most frequent)
Results
 Take-home messages
– Devise a baseline
– Measure performance
– Pick tasks where beating the baseline is
» Impressive
» Useful
Ratnaparkhi (Coling 98)
 970K unannotated sentences of WSJ
 tagger, simple chunker
 heuristic extraction of unambiguous cases
Heuristic extraction
 (v,p,n2) if
» p is a real preposition (not “of”)
» v is the first verb that occurs < K words left of p
» v is not a form of the verb “to be”
» No noun occurs between v and p
» n2 is the first word < K words right of p
» No verb occurs between p and n2
Heuristic extraction 2
 (n,p,n2) if
» p is a real preposition (not “of”)
» n is the first noun that occurs < K words left of p
» No verb occurs between n and p
» n2 is the first word < K words right of p
» No verb occurs between p and n2
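A toy implementation of the (v, p, n2) heuristic on a tagged sentence. The (word, tag) token format and the simplified tag set ('V', 'N', 'P', anything else) are our own assumptions rather than the real tagger's output, and we read “first word right of p” as the first noun-tagged token:

```python
def extract_vpn(tokens, K=10):
    """Collect unambiguous (v, p, n2) triples from a tagged sentence,
    following the heuristics on the slide."""
    results = []
    for i, (word, tag) in enumerate(tokens):
        if tag != "P" or word == "of":       # real prepositions only
            continue
        # first verb < K words left of p, with no intervening noun,
        # and not a form of "to be"
        v = None
        for j in range(i - 1, max(i - K, -1), -1):
            w, t = tokens[j]
            if t == "N":
                break                        # intervening noun: ambiguous
            if t == "V":
                if w not in ("be", "is", "are", "was", "were", "been"):
                    v = w
                break                        # first verb found (or rejected)
        if v is None:
            continue
        # first noun < K words right of p, with no intervening verb
        n2 = None
        for j in range(i + 1, min(i + K, len(tokens))):
            w, t = tokens[j]
            if t == "V":
                break                        # intervening verb: skip
            if t == "N":
                n2 = w
                break
        if n2 is not None:
            results.append((v, word, n2))
    return results
```

For example, “the man ran quickly into the room” yields [("ran", "into", "room")], while “saw the man with a telescope” is rejected because a noun intervenes between the verb and “with”.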
Accuracy of extraction
 Noisy data (c. 69% correct)
 But abundant
Evaluation
 81.91% with a back-off technique
 81.85% with interpolation like H&R
 Baseline for this data: 70.39%
Portability
 Moved to Spanish and got similar performance
 H&R would have had to port Fidditch to Spanish
Where to get more information
 Charniak, ch. 8
 Hindle and Rooth, CL 19(1), pp. 103–120, 1993
 Manning and Schütze, section 8.3
 Original papers