Discourse, coherence and anaphora resolution
Lecture 16

What is discourse?

Any piece of text consisting of more than one sentence.

Until now our lectures revolved mainly around topics concerning word-level or sentence-level analysis.

Discourse phenomena

Anaphora resolution
– The Tin Woodman went to Emerald City to see the Wizard of Oz and ask for a heart. After he asked for it, the Woodman waited for the Wizard’s response.
Types of noun phrases
– Indefinite: Julia has a cat. Some cat entered the house.
– Definite: The cat is brown.
– Pronoun: It doesn’t eat much.

Coherence
– John hid Bill’s car keys. [the reason he did this was that] He was drunk.
– ?? John hid Bill’s car keys. [How are these sentences related?] He likes spinach.
Coherence relations
– explanation or cause
– contrast or concession

Discourse connectives

Cue phrases, discourse markers
– because, although, but, for example, yet, and
– John hid Bill’s car keys because he was drunk.
– [We can’t win] [but we must keep trying] (contrast)
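
A minimal sketch of how explicit connectives could be spotted by simple string matching against the cue phrase list above. The function names `find_cue_phrases` and `split_on_first_cue` are made up for this illustration, and the six-phrase list is only the one from the slide, not a complete inventory.

```python
import re

# Cue phrases from the list above; a real inventory would be much larger.
CUE_PHRASES = ["because", "although", "but", "for example", "yet", "and"]

# Longer phrases come first so that multiword cues are matched as a unit.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(CUE_PHRASES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_cue_phrases(text):
    """Return (cue phrase, character offset) pairs in order of appearance."""
    return [(m.group(1).lower(), m.start()) for m in _PATTERN.finditer(text)]


def split_on_first_cue(text):
    """Split text into (arg1, cue, arg2) at the first cue phrase, or return None."""
    m = _PATTERN.search(text)
    if m is None:
        return None
    return text[: m.start()].strip(), m.group(1).lower(), text[m.end():].strip()


if __name__ == "__main__":
    print(find_cue_phrases("John hid Bill's car keys because he was drunk."))
    print(split_on_first_cue("We can't win but we must keep trying"))
```

String matching of this kind over-generates: a word such as "and" often just joins two noun phrases and does not signal a relation between clauses, so a match is only a candidate connective.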

Implicit and explicit discourse relations

I took my umbrella this morning. [because] The forecast was rain in the afternoon.
She is never late for meetings. [but] He always arrives 10 minutes late.
She woke up early. [afterward] She had breakfast and went for a walk in the park.

Ambiguity of discourse connectives

They have not spoken to each other since they argued last fall. (Temporal)
I assumed you were not coming since you never replied to the invitation. (Causal)

Penn Discourse Treebank

Annotated explicit and implicit discourse relations
Each relation is annotated with its sense
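
To make the annotation idea concrete, here is a small sketch of what one relation record might look like, filled with the example sentences from this lecture. The class `DiscourseRelation`, its field names, and the sense labels are assumptions made for illustration; they are not the actual Penn Discourse Treebank file format or sense tags. The senses given to the three implicit examples are assumed readings based on the bracketed connectives.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscourseRelation:
    arg1: str
    arg2: str
    connective: str  # connective in the text, or the one an annotator would insert
    explicit: bool   # True if the connective actually appears in the text
    sense: str       # simplified label, not an official PDTB sense tag


# The lecture's own examples; sense labels for implicit relations are assumed.
RELATIONS = [
    DiscourseRelation("John hid Bill's car keys", "he was drunk", "because", True, "causal"),
    DiscourseRelation("We can't win", "we must keep trying", "but", True, "contrast"),
    DiscourseRelation("They have not spoken to each other", "they argued last fall",
                      "since", True, "temporal"),
    DiscourseRelation("I assumed you were not coming", "you never replied to the invitation",
                      "since", True, "causal"),
    DiscourseRelation("I took my umbrella this morning.",
                      "The forecast was rain in the afternoon.", "because", False, "causal"),
    DiscourseRelation("She is never late for meetings.",
                      "He always arrives 10 minutes late.", "but", False, "contrast"),
    DiscourseRelation("She woke up early.",
                      "She had breakfast and went for a walk in the park.",
                      "afterward", False, "temporal"),
]


def explicit_share(relations):
    """Fraction of relations whose connective is present in the text."""
    return sum(r.explicit for r in relations) / len(relations)


def senses_by_connective(relations):
    """Map each explicit connective to a Counter of the senses it was annotated with."""
    table = defaultdict(Counter)
    for r in relations:
        if r.explicit:
            table[r.connective][r.sense] += 1
    return dict(table)


if __name__ == "__main__":
    print(explicit_share(RELATIONS))
    print(senses_by_connective(RELATIONS))
```

The share computed here only reflects which examples happened to be picked for the slides; it says nothing about the proportion in a real corpus. A connective that maps to more than one sense in the second table, such as "since", is ambiguous in the sense discussed earlier.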

In a general text, what is the proportion of explicit versus implicit relations?

How ambiguous are discourse connectives?

Are certain sequences of relations more likely?

In order to interpret (understand) discourse automatically, the problem of identification and disambiguation of discourse relations needs to be addressed.

What else?

Reference resolution

Victoria Chen, Chief Financial Officer of Megabucks Banking Corp since 2004, saw her pay jump 20%, to $1.3 million, as the 37-year-old also became the Denver-based financial-services company’s president. It has been ten years since she came to Megabucks from rival Lotsabucks.

Definitions

Reference: use of linguistic expressions (her, Chen) to denote an entity or individual
Reference resolution: the task of determining what entities are referred to by which linguistic expressions
A natural language expression used to perform reference is called a referring expression, and the entity that is referred to is called the referent.

Two referring expressions that are used to refer to the same entity are said to corefer.

Reference to an entity that has been previously introduced into the discourse is called anaphora.

Coreference resolution is the task of finding referring expressions in a text that refer to the same entity (coreference chains).
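
One way to picture coreference chains: if pairwise "these two mentions corefer" decisions are already available, the chains are the groups obtained by merging linked mentions. The sketch below does this with a small union-find over the mentions of the Victoria Chen passage. The function name `build_chains` and the hand-written link list are assumptions for illustration; deciding the links is the hard part and is not shown.

```python
def build_chains(mentions, links):
    """Group mentions into coreference chains from pairwise links (union-find)."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative, shortening the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    chains = {}
    for m in mentions:  # text order, so each chain keeps its mentions in order
        chains.setdefault(find(m), []).append(m)
    return list(chains.values())


# Mentions in text order. Plain strings work here only because each one is unique
# in the passage; a real system would identify mentions by their text spans.
MENTIONS = [
    "Victoria Chen",
    "Megabucks Banking Corp",
    "her",
    "the 37-year-old",
    "the Denver-based financial-services company",
    "she",
    "Megabucks",
    "Lotsabucks",
]

# Hand-written (anaphor, antecedent) links, assumed for this example.
LINKS = [
    ("her", "Victoria Chen"),
    ("the 37-year-old", "her"),
    ("she", "the 37-year-old"),
    ("the Denver-based financial-services company", "Megabucks Banking Corp"),
    ("Megabucks", "the Denver-based financial-services company"),
]

if __name__ == "__main__":
    for chain in build_chains(MENTIONS, LINKS):
        print(chain)
```

A mention with no links, such as "Lotsabucks", ends up in a chain of its own.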

Features for pronominal anaphora resolution

Number agreement
– John has a Ford Falcon. It is red.
– ?? John has a Ford Falcon. They are red.
– John has three cars. They are red.
– ?? John has three cars. It is red.
Person agreement
Gender agreement
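
The agreement features above can be used as a hard filter on candidate antecedents. Below is a minimal sketch, assuming the features of each noun phrase are already known; the `Features` class, the tiny pronoun lexicon and the function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Features:
    number: str  # "sg" or "pl"
    person: int  # 1, 2 or 3
    gender: str  # "masc", "fem", "neut", or "any" when unconstrained


# A deliberately tiny pronoun lexicon, assumed for illustration.
PRONOUNS = {
    "he": Features("sg", 3, "masc"),
    "she": Features("sg", 3, "fem"),
    "it": Features("sg", 3, "neut"),
    "they": Features("pl", 3, "any"),
}


def agrees(pronoun, candidate):
    """True if number, person and gender are all compatible."""
    gender_ok = (
        "any" in (pronoun.gender, candidate.gender)
        or pronoun.gender == candidate.gender
    )
    return (
        pronoun.number == candidate.number
        and pronoun.person == candidate.person
        and gender_ok
    )


def compatible_antecedents(pronoun, candidates):
    """Keep the candidates (name, Features) that agree with the pronoun."""
    feats = PRONOUNS[pronoun.lower()]
    return [name for name, f in candidates if agrees(feats, f)]


JOHN = ("John", Features("sg", 3, "masc"))
FALCON = ("a Ford Falcon", Features("sg", 3, "neut"))
CARS = ("three cars", Features("pl", 3, "neut"))

if __name__ == "__main__":
    print(compatible_antecedents("It", [JOHN, FALCON]))    # "It is red."
    print(compatible_antecedents("They", [JOHN, FALCON]))  # "?? They are red."
    print(compatible_antecedents("They", [JOHN, CARS]))    # "They are red."
```

An empty result corresponds to the examples marked with ?? above: no candidate in the context agrees with the pronoun.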

Preferences in pronoun interpretation

Salience
Recency
– pronoun antecedents have usually been mentioned nearby in the text.
Grammatical role
– typically, entities mentioned in subject position are more salient than those mentioned in object position.
Repeated mention
Selectional restrictions
– John parked his car in the garage after driving it around for hours.
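
Unlike agreement, these are preferences, so one natural way to combine them is a score. The sketch below adds up recency, grammatical role and repeated mention, and applies a selectional restriction as a filter before scoring. All weights, the verb lexicon and the names (`Candidate`, `salience`, `resolve`) are hypothetical choices made for illustration, not a published algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    tokens_back: int    # distance in tokens from the pronoun (smaller = more recent)
    role: str           # "subject", "object" or "other"
    mentions: int       # how many times the entity has been mentioned so far
    semantic_type: str  # coarse type used for selectional restrictions


# Hypothetical weights and lexicon, chosen only for this example.
ROLE_BONUS = {"subject": 2.0, "object": 1.0, "other": 0.0}
VERB_OBJECT_TYPE = {"drive": "vehicle"}


def salience(c):
    """Higher is better: recent, subject-position, often-mentioned candidates win."""
    recency = 1.0 / (1 + c.tokens_back)
    return recency + ROLE_BONUS[c.role] + 0.5 * c.mentions


def resolve(candidates, governing_verb=None):
    """Pick the most salient candidate that satisfies the verb's restriction, if any."""
    required = VERB_OBJECT_TYPE.get(governing_verb)
    if required is not None:
        allowed = [c for c in candidates if c.semantic_type == required]
        candidates = allowed or candidates  # fall back if nothing satisfies it
    return max(candidates, key=salience).text if candidates else None


# "John parked his car in the garage after driving it around for hours."
CANDIDATES = [
    Candidate("John", tokens_back=9, role="subject", mentions=2, semantic_type="person"),
    Candidate("his car", tokens_back=6, role="object", mentions=1, semantic_type="vehicle"),
    Candidate("the garage", tokens_back=3, role="other", mentions=1, semantic_type="location"),
]

if __name__ == "__main__":
    print(resolve(CANDIDATES))           # salience only
    print(resolve(CANDIDATES, "drive"))  # with the selectional restriction
```

Agreement filtering is deliberately left out here, so on salience alone the subject "John" scores highest; once the restriction that the object of "drive" must be a vehicle is applied, only "his car" remains.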

Relation to summarization

Jahna C. Otterbacher, Dragomir R. Radev, Airong Luo (2002). Revisions that improve cohesion in multidocument summaries: a preliminary study. In Proceedings of the Workshop on Automatic Summarization.

Types of problems in manually edited summaries (15 multi-doc summaries)

Discourse – Concerns the relationships between the sentences in a summary, as well as those between individual sentences and the overall summary.
Identification of entities – Involves the resolution of referential expressions such that each entity mentioned in a summary can easily be identified by the reader.
Temporal – Concerns the establishment of the correct temporal relationships between events.
Grammar – Concerns the correction of grammatical problems, which may be the result of juxtaposing sentences from different sources, or due to the previous revisions that were made.
Location/setting – Involves establishing where each event in a summary takes place.