Discourse Structure in Generation Julia Hirschberg CS 4706 7/15/2016

Today
• Models of Discourse Structure
– Do we have them?
– Grosz & Sidner ’86
• What identifies discourse structure to Hearers?
– Textual cues
– Spoken cues
• How can we produce appropriate discourse
structure in TTS systems?
• Can we identify discourse structure
automatically, from speech?
Is there structure in this discourse?
A beautiful mallard spotted the dove I was feeding.
The duck dove supply is small this year.
That dove was history in a minute.
Well, to recover from this horrible scene, I went to
the park snack bar for a cup of cocoa.
To my surprise, I ran into a friend from back home.
When I told her of my recent experience she
questioned my sanity.
Is this a reasonable structure? This? This?
[Three alternative segmentations of the discourse above]
What information do we use in segmenting a
discourse?
• ‘Topic’ coherence?
• Repeated reference?
• ‘Cue’ phrases?
• ????
Structures of Discourse Structure (Grosz &
Sidner ’86)
• A leading theory of discourse structure
– Based upon Speaker intentions and Speaker
and Hearer attentional state
– Identifies a few, general relations that hold
among Speaker intentions
– Identifies a model of attentional state
• Three components:
– Linguistic structure
– Intentional structure
– Attentional structure
Linguistic Structure
• What is actually said or written
• How is the linguistic structure represented?
– Assume discourse is segmented into
Discourse Segments (DS)
• What is the basic unit of analysis?
• Do we all segment alike?
• Do we all use the same cues?
Linguistic Structure of Discourse D
S1: A beautiful mallard spotted the dove I was
feeding.
The duck dove supply is small this year.
That dove was history in a minute.
S2: Well, to recover from this horrible scene, I
went to the park snack bar for a cup of cocoa.
To my surprise, I ran into a friend from back home.
When I told her of my recent experience she
questioned my sanity.
Intentional Structure
• Discourse purpose (DP): basic purpose of the
Speaker in producing the discourse
• Discourse segment purposes (DSPs): the
Speaker’s purpose in producing the segment
• Segments are related to one another by their
purposes:
– Satisfaction-precedence: DSP1 must be
satisfied before DSP2
– Dominance: DSP1 dominates DSP2 if
fulfilling DSP2 constitutes part of fulfilling
DSP1
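The two relations can be sketched as a small data structure. This is only an illustration, not part of Grosz & Sidner's formalism: the class `IntentionalStructure` and its method names are hypothetical, and having the overall DP dominate both DSP1 and DSP2 is an assumption made for the example.

```python
from collections import defaultdict


class IntentionalStructure:
    """Hypothetical container for DSPs and the two relations between them."""

    def __init__(self):
        # parent DSP -> DSPs it directly dominates
        self.dominates = defaultdict(set)
        # DSP -> DSPs that can only be satisfied after it
        self.precedes = defaultdict(set)

    def add_dominance(self, parent, child):
        self.dominates[parent].add(child)

    def add_satisfaction_precedence(self, first, second):
        self.precedes[first].add(second)

    def dominated_by(self, dsp):
        """All DSPs that dsp dominates, directly or transitively."""
        seen = set()
        todo = list(self.dominates[dsp])
        while todo:
            d = todo.pop()
            if d not in seen:
                seen.add(d)
                todo.extend(self.dominates[d])
        return seen


# Assumed example: the discourse purpose DP dominates two segment purposes,
# and DSP1 must be satisfied before DSP2.
structure = IntentionalStructure()
structure.add_dominance("DP", "DSP1")
structure.add_dominance("DP", "DSP2")
structure.add_satisfaction_precedence("DSP1", "DSP2")
print(sorted(structure.dominated_by("DP")))
```

Storing dominance as a parent-to-children map keeps the hierarchy explicit, so embedded segments can be found by walking down from any DSP.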
Linguistic Structure of Discourse D
DSP1: Describe murder of dove by duck.
S1: A beautiful mallard spotted the dove I was feeding.
The duck dove supply is small this year.
That dove was history in a minute.
DSP2: Describe meeting of old friend.
S2: Well, to recover from this horrible scene, I went to the
park snack bar for a cup of cocoa.
To my surprise, I ran into a friend from back home.
When I told her of my recent experience she questioned
my sanity.
DSP2: Describe recovery process.
S2:
DSP3: Describe snack
S3: Well, to recover from this horrible scene, I
went to the park snack bar for a cup of cocoa.
DSP4: Describe meeting old friend.
S4: To my surprise, I ran into a friend from back
home.
DSP5: Describe friend’s reaction
S5: When I told her of my recent experience she
questioned my sanity.
Attentional State: The Focus Stack
• Stack of focus spaces, each containing
objects, properties and relations salient
during each DS, plus the DSP
• State changes: transition rules controlling the
addition/deletion of focus spaces
– Information at lower levels may or may not
be available at higher levels
– Focus spaces are pushed onto the stack
when
• A new DS is begun
• An embedded DS (e.g. a DS dominated by
another DS) is begun
– Focus spaces are popped when they are
completed
• State of focus stack models felicitous reference,
coherence in discourse
S2: DSP2, scene, Speaker, snack_bar, cocoa, friend, home, sanity
S1: DSP1, duck, dove, Speaker, duck_dove_supply
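The push/pop behavior of the focus stack can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class names `FocusSpace` and `FocusStack` are hypothetical, and searching every space from the top down is a simplification, since the theory says information at lower levels may or may not be available.

```python
class FocusSpace:
    """Salient entities for one discourse segment, plus its DSP."""

    def __init__(self, segment, dsp, entities):
        self.segment = segment
        self.dsp = dsp
        self.entities = set(entities)


class FocusStack:
    def __init__(self):
        self._spaces = []

    def push(self, space):
        # Called when a new (possibly embedded) DS begins.
        self._spaces.append(space)

    def pop(self):
        # Called when the DS on top of the stack is completed.
        return self._spaces.pop()

    def find(self, entity):
        """Segment of the topmost focus space containing entity, else None."""
        for space in reversed(self._spaces):
            if entity in space.entities:
                return space.segment
        return None


# Mirrors the stack shown above: S2 on top of S1.
stack = FocusStack()
stack.push(FocusSpace("S1", "DSP1",
                      ["duck", "dove", "Speaker", "duck_dove_supply"]))
stack.push(FocusSpace("S2", "DSP2",
                      ["scene", "Speaker", "snack_bar", "cocoa",
                       "friend", "home", "sanity"]))
print(stack.find("Speaker"))  # resolved in the topmost space first
print(stack.find("dove"))     # only present in the lower space
stack.pop()                   # S2 is completed
print(stack.find("cocoa"))    # no longer salient once S2 is popped
```

Searching from the top of the stack models the preference for the most recently introduced, still-open segment when resolving a reference.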
Limits of the Theory
• Assumes discourses are task-oriented
• Assumes a single, hierarchical structure shared
by S and H
• Questions:
– Do people really build such structures when
they converse?
– Use them in interpreting what others say?
– How could they do it?
How might people recognize discourse
structure?
• Linguistic markers?
– tense and aspect
– cue phrases
• Inference of Speaker intentions?
• Inference from task structure?
• Intonational Information?
Acoustic and Prosodic Cues to Discourse
Structure
• Intuition:
– Speakers vary acoustic and prosodic cues to
convey variation in discourse structure
– Systematic? In read or spontaneous speech?
• Evidence:
– Observations from recorded corpora
– Laboratory experiments
– Machine learning of discourse structure from
acoustic/prosodic features
Prosodic Correlates of Discourse/Topic
Structure
• Pitch range
Lehiste ’75, Brown et al ’83, Silverman ’86,
Avesani & Vayra ’88, Ayers ’92, Swerts et al
’92, Grosz & Hirschberg ’92, Swerts &
Ostendorf ’95, Hirschberg & Nakatani ’96
• Preceding pause
Lehiste ’79, Chafe ’80, Brown et al ’83,
Silverman ’86, Woodbury ’87, Avesani &
Vayra ’88, Grosz & Hirschberg ’92, Passonneau
& Litman ’93, Hirschberg & Nakatani ’96
• Rate
Butterworth ’75, Lehiste ’80, Grosz &
Hirschberg ’92, Hirschberg & Nakatani ’96
• Amplitude
Brown et al ’83, Grosz & Hirschberg ’92,
Hirschberg & Nakatani ’96
• Contour
Brown et al ’83, Woodbury ’87, Swerts et al ’92
Issues
• Do we find significant and reliable cues to
discourse structure in prosodic variation
– When tested against an independent theory of
discourse structure?
– In spontaneous as well as read speech?
• Are Hearers’ interpretations of discourse
structure influenced by intonational variation?
Grosz & Hirschberg ’92
• Small corpus of read AP newswire
– Read by professional speaker
– Labeled for discourse structure from text
alone or from text and speech
– Pre-ToBI labeled
– Acoustic-prosodic features extracted for each
intermediate (level 3) phrase
• Pitch range and change from prior phrase
• Intensity (rms) and change in dB from prior phrase
• Preceding and subsequent pause
• Speaking rate
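The per-phrase features listed above could be computed roughly as follows. This is a sketch, not the procedure used in the study: the `Phrase` record and its fields are hypothetical, peak F0 stands in for pitch range, and speaking rate is assumed to be syllables per second.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phrase:
    start: float     # seconds
    end: float       # seconds
    f0_peak: float   # Hz; assumed stand-in for pitch range
    rms_db: float    # intensity in dB
    syllables: int


def phrase_features(phrases: List[Phrase]) -> List[dict]:
    """One feature row per phrase; None where no neighbor phrase exists."""
    rows = []
    for i, p in enumerate(phrases):
        prev = phrases[i - 1] if i > 0 else None
        nxt = phrases[i + 1] if i + 1 < len(phrases) else None
        has_prev = prev is not None
        has_next = nxt is not None
        rows.append({
            "pitch_range": p.f0_peak,
            "pitch_change": p.f0_peak - prev.f0_peak if has_prev else None,
            "db_change": p.rms_db - prev.rms_db if has_prev else None,
            "preceding_pause": p.start - prev.end if has_prev else None,
            "subsequent_pause": nxt.start - p.end if has_next else None,
            "rate": p.syllables / (p.end - p.start),
        })
    return rows


# Invented values, for illustration only.
phrases = [
    Phrase(start=0.0, end=1.0, f0_peak=200.0, rms_db=60.0, syllables=5),
    Phrase(start=1.5, end=2.5, f0_peak=150.0, rms_db=58.0, syllables=4),
]
for row in phrase_features(phrases):
    print(row)
```

Expressing pitch and intensity as changes from the prior phrase, as the slide does, makes the features relative to local context rather than to a speaker's absolute levels.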
• Analysis of phrases in different segment
positions: SBEG (segment-beginning), SF (segment-final),
parentheticals, quoted speech
– ANOVAs and t-tests on means
• Results:
– Direct quotes: larger pitch range
– Parentheticals: smaller range, negative change
from prior phrase, negative change in dB, faster
rate
– SBEG: larger range, louder, greater preceding
pause, less subsequent pause
– SF: greater subsequent pause
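A comparison of means like the ones reported above can be sketched with a two-sample t statistic. The slides do not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the pause durations are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Invented preceding-pause durations in seconds (not data from the study).
sbeg_pauses = [0.9, 1.1, 1.0, 1.2]
other_pauses = [0.3, 0.4, 0.2, 0.3]

t = welch_t(sbeg_pauses, other_pauses)
print(f"t = {t:.2f}")  # positive when the first group has the larger mean
```

A large positive value would be consistent with the SBEG result above (greater preceding pause); assessing significance additionally requires the degrees of freedom and a t distribution, which this sketch omits.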
• Machine learning experiments identified:
– SBEG with 91.5% estimated accuracy (cross-validation)
– SF, 92.5%
– Attributive tags, 96.9%
– Direct quotations, 86.4%
– Indirect quotations, 88.5%
– Parentheticals, 89.2%
• Conclusion: Acoustic/prosodic information is
available to permit Hearers to identify discourse
structure…
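The idea of a cross-validated accuracy estimate can be sketched as below. The slides do not describe the learner, so a one-feature threshold rule is used as a stand-in; the function names and the toy data are hypothetical and are not taken from the study.

```python
def fit_threshold(xs, ys):
    """Pick a cut point on one feature that best separates the training labels."""
    uniq = sorted(set(xs))
    # Candidate cut points are midpoints between neighboring observed values.
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])] or [uniq[0]]
    best_t, best_correct = None, -1
    for t in candidates:
        correct = sum((x >= t) == y for x, y in zip(xs, ys))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def cross_val_accuracy(xs, ys, k):
    """Estimate accuracy by k-fold cross-validation with interleaved folds."""
    n = len(xs)
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        t = fit_threshold([xs[i] for i in train_idx],
                          [ys[i] for i in train_idx])
        # Score only on phrases held out of training for this fold.
        correct += sum((xs[i] >= t) == ys[i] for i in test_idx)
    return correct / n


# Invented toy data: preceding pause (s) and whether the phrase is SBEG.
pauses = [0.9, 0.2, 1.1, 0.3, 1.0, 0.4, 1.2, 0.3]
is_sbeg = [True, False, True, False, True, False, True, False]
print(cross_val_accuracy(pauses, is_sbeg, k=4))
```

Because every phrase is scored by a model that never saw it during training, the resulting figure estimates accuracy on unseen data rather than fit to the training set.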
Next
• The midterm
– Closed book, no notes or electronic devices
– Will include material through today