High Frequency Word Entrainment in Spoken Dialogue

ACL, June 2008 - Columbus, OH
Ani Nenkova - Agustín Gravano - Julia Hirschberg
Department of Computer and Information Science
University of Pennsylvania - Philadelphia, PA
Department of Computer Science
Columbia University - New York, NY
Entrainment

In conversation, people adapt the way
they speak to match their partners’.

Entrainment, accommodation,
adaptation, alignment, convergence.
Agustín Gravano - ACL - June 2008
Previous Work
Existence of entrainment
In conversation, speakers:
Negotiate common ways of describing things (S.E. Brennan, 1996).
Alter their intensity to match their partners’ (R. Coulston et al., 2002; A. Ward & D. Litman, 2007).
Reuse syntactic constructions (D. Reitter et al., 2006).
Previous Work
Role of entrainment
Entrainment at different levels (lexical, syntactic, semantic):
Is key for both production and understanding, and facilitates interaction (M.J. Pickering & S. Garrod, 2004; D. Goleman, 2006).
Is a good predictor of task success on MapTask (D. Reitter & J. Moore, 2007).
This Work

Novel measures of entrainment based on
usage of high-frequency words (HFW).

Entrainment and…


Perceived naturalness

Task success

Dialogue coordination
Implications for the development of Spoken
Dialogue Systems.
High-Frequency Words



Most common words in a corpus, or in a
conversation.
Typically, function words and cue words.
Entrainment of HFW

Domain-independent
Entrainment & Naturalness

Will a conversation be perceived as more
natural if HFW entrainment occurs?

Switchboard corpus

2430 spontaneous telephone conversations in
American English

Speakers asked to discuss a pre-assigned topic

Annotated for degree of perceived naturalness,
from “1” (Very natural) to “5” (Not natural at all).
Entrainment & Naturalness
Measure of Entrainment
$$\mathrm{entr}(w) = -\left|\, \mathrm{fraction}(w, S_1) - \mathrm{fraction}(w, S_2) \,\right|$$
Where
fraction(w, Si) = fraction of Speaker i's words in the conversation that are w
Examples
entr(‘okay’) = -| 10 / 500 - 8 / 600 | ≈ -0.0067
entr(‘yeah’) = -| 1 / 500 - 30 / 600 | = -0.048
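The per-word measure can be sketched in a few lines of Python. This is a minimal illustration, assuming entr(w) is the negated absolute difference between the two speakers' usage fractions, so that 0 means identical usage. The function name `entr` and its count-based arguments are illustrative and do not come from the slides.

```python
def entr(count_s1: int, total_s1: int, count_s2: int, total_s2: int) -> float:
    """Negated absolute difference of the two speakers' usage fractions.

    Assumption: values closer to 0 mean more similar usage of the word.
    """
    fraction_s1 = count_s1 / total_s1  # fraction(w, S1)
    fraction_s2 = count_s2 / total_s2  # fraction(w, S2)
    return -abs(fraction_s1 - fraction_s2)


# Counts from the 'okay' and 'yeah' examples: S1 spoke 500 words, S2 spoke 600.
okay_score = entr(10, 500, 8, 600)
yeah_score = entr(1, 500, 30, 600)
print(round(okay_score, 4), round(yeah_score, 4))
```

Under this sign convention 'okay' scores higher than 'yeah', meaning the two speakers are better matched in their use of 'okay'.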
Entrainment & Naturalness
Machine Learning Task


Predict the perceived naturalness of
conversations.
Binary decision, over balanced data



250 conversations rated “1” (very natural), and
250 with ratings “3”, “4” or “5”.
Computed entr(w) for the 100 most frequent
words in the entire Switchboard corpus.
Feature selection: 25 most predictive words.

um, how, okay, go, I’ve, all, very, as, or, up, a, no, more,
something, from, this, what, too, got, can, he, in, things, you, and.
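The feature extraction step can be sketched as follows. This is an illustration only: the toy conversations, the whitespace tokenization, and the helper names `top_words` and `entr_features` are assumptions, and the per-word score is assumed to be the negated absolute difference of the speakers' usage fractions.

```python
from collections import Counter


def top_words(conversations, k):
    """Return the k most frequent words across all speakers and conversations."""
    counts = Counter()
    for s1_tokens, s2_tokens in conversations:
        counts.update(s1_tokens)
        counts.update(s2_tokens)
    return [word for word, _ in counts.most_common(k)]


def entr_features(s1_tokens, s2_tokens, vocabulary):
    """One feature per vocabulary word: the assumed entr(w) for this conversation."""
    c1, c2 = Counter(s1_tokens), Counter(s2_tokens)
    n1, n2 = max(len(s1_tokens), 1), max(len(s2_tokens), 1)
    return [-abs(c1[w] / n1 - c2[w] / n2) for w in vocabulary]


# Toy data: each conversation is (speaker 1 tokens, speaker 2 tokens).
conversations = [
    ("um okay so I think okay".split(), "okay yeah I think so".split()),
    ("yeah yeah I know".split(), "um I do not know".split()),
]
vocabulary = top_words(conversations, k=3)  # the slides use the 100 most frequent
features = [entr_features(s1, s2, vocabulary) for s1, s2 in conversations]
```

Each row of `features` is one conversation's input to the classifier; feature selection would then keep only the most predictive columns.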
Entrainment & Naturalness
Results

Logistic regression model (10-fold CV):
63.76% accuracy (significantly better than
50% baseline)

Entrainment in usage of HFW is a good
indicator of perceived naturalness.
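The evaluation setup, a logistic regression classifier scored with 10-fold cross-validation, can be sketched from scratch as below. The slides do not say which implementation was used, so the gradient-descent trainer, its hyperparameters, and the synthetic two-feature data are all assumptions made for illustration; nothing here reproduces the Switchboard experiment.

```python
import math
import random


def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit weights (bias stored last) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, xi):
    """Predict class 1 when the linear score is non-negative."""
    z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
    return 1 if z >= 0 else 0


def cross_val_accuracy(X, y, k=10, seed=0):
    """Accuracy pooled over k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = train_logreg([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(w, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Synthetic, balanced stand-in data: 50 examples per class, two features.
rng = random.Random(1)
X, y = [], []
for label in (0, 1):
    for _ in range(50):
        X.append([rng.gauss(2.0 * label - 1.0, 0.5), rng.gauss(0.0, 1.0)])
        y.append(label)
accuracy = cross_val_accuracy(X, y)
```

With balanced classes, an accuracy of 0.5 is the chance baseline against which the cross-validated figure is compared.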
Entrainment & Task Success


Is a conversation more likely to succeed
when HFW entrainment occurs?
Columbia Games Corpus



12 spontaneous task-oriented dialogues in
American English, with no eye contact.
Each pair of subjects played a series of computer-based matching games.
Subjects received a score after each task.
Entrainment & Task Success
Measures of Entrainment
Where
c = class of words
count_Si(w) = number of times speaker Si used word w in the conversation
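The formulas for the two class-level measures are not reproduced in this text, so the following Python sketch rests on stated assumptions. ENTR1(c) is assumed to be the negated sum, over the words in class c, of the absolute difference in the two speakers' usage fractions. ENTR2(c) is assumed to be the negated sum of absolute count differences divided by the speakers' combined count of class words. Both forms and the helper names `entr1` and `entr2` are illustrative and should be checked against the paper.

```python
from collections import Counter


def entr1(s1_tokens, s2_tokens, word_class):
    """Assumed ENTR1: negated sum of per-word differences in usage fractions.

    Both token lists are expected to be non-empty.
    """
    c1, c2 = Counter(s1_tokens), Counter(s2_tokens)
    n1, n2 = len(s1_tokens), len(s2_tokens)
    return -sum(abs(c1[w] / n1 - c2[w] / n2) for w in word_class)


def entr2(s1_tokens, s2_tokens, word_class):
    """Assumed ENTR2: negated count differences, normalized by total class usage."""
    c1, c2 = Counter(s1_tokens), Counter(s2_tokens)
    total = sum(c1[w] + c2[w] for w in word_class)
    if total == 0:
        return 0.0  # neither speaker used the class; treated here as no mismatch
    return -sum(abs(c1[w] - c2[w]) for w in word_class) / total


# Toy conversation: one token list per speaker.
s1 = "okay so the red one okay".split()
s2 = "yeah the red one right okay".split()
acw = ["alright", "mm-hm", "okay", "right", "uh-huh", "yeah", "yes"]
score1 = entr1(s1, s2, acw)
score2 = entr2(s1, s2, acw)
```

Under these assumed forms both scores are at most 0, and a score of 0 means the two speakers used the class words identically.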
Entrainment & Task Success
Word Classes

25MF-G: 25 most frequent words in the game

25MF-C: 25 most frequent words in the corpus


the, a, okay, and, of, I, on, right, is, it, that, have,…
ACW: Affirmative cue words

alright, mm-hm, okay, right, uh-huh, yeah, yes

7.9% of all words in the Games Corpus
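Building a frequency-based word class and measuring a class's share of all tokens can be sketched as below. The toy token list and the helper names `most_frequent` and `class_share` are assumptions for illustration; the ACW set is the list of affirmative cue words given above.

```python
from collections import Counter

ACW = {"alright", "mm-hm", "okay", "right", "uh-huh", "yeah", "yes"}


def most_frequent(tokens, k=25):
    """The k most frequent word types in a token list (one game or a whole corpus)."""
    return [word for word, _ in Counter(tokens).most_common(k)]


def class_share(tokens, word_class):
    """Fraction of all tokens that belong to the given word class."""
    return sum(1 for t in tokens if t in word_class) / len(tokens)


# Toy token list standing in for a transcribed game.
tokens = "okay the lion is on the right yeah okay the lion".split()
top3 = most_frequent(tokens, k=3)
acw_share = class_share(tokens, ACW)
```

Passing the tokens of a single game to `most_frequent` gives a 25MF-G style class, while passing the tokens of the whole corpus gives a 25MF-C style class.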
Entrainment & Task Success
Results


Correlations with game score:
Word class    ENTR1 cor (p)    ENTR2 cor (p)
25MF-C        0.341 (0.02)     0.187 (0.20)
25MF-G        0.376 (0.01)     0.260 (0.07)
ACW           0.230 (0.12)     0.372 (0.01)
HFW entrainment positively correlated with
task success.
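Correlations of this kind can be computed as sketched below, assuming Pearson's r (the slides do not name the coefficient). The entrainment scores and game scores are made-up toy numbers, not values from the Games Corpus, and the helper name `pearson_r` is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values only: one entrainment score and one game score per game.
entrainment = [-0.30, -0.22, -0.15, -0.10, -0.05]
game_score = [55, 70, 62, 80, 90]
r = pearson_r(entrainment, game_score)
```

A positive r here means that games with entrainment scores closer to 0 (more similar word usage) tended to receive higher scores.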
Entrainment & Coordination


Is dialogue more coordinated when HFW
entrainment occurs?
Columbia Games Corpus

Labeled for type of turn exchanges (Beattie, 1982),
including:
Smooth Switch: S2 starts his turn after S1 has finished hers
Interruption: S2 starts his turn before S1 has finished hers
Overlap: S2 starts his turn just before S1 has finished hers,
but without interrupting.
Entrainment & Coordination
Results

Significant correlations (p<0.05):
Entrainment measure    Coordination measure               cor
ENTR1(ACW)             Prop. of Overlaps                  0.64
ENTR2(ACW)             Prop. of Overlaps                  0.61
ENTR2(25MF-G)          Prop. of Overlaps                  0.60
ENTR1(25MF-C)          Prop. of Interruptions             -0.61
ENTR2(ACW)             Mean Latency of Smooth Switches    -0.76
HFW entrainment is associated with
more overlaps, fewer interruptions, and
shorter inter-turn latencies.
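Given a sequence of labeled turn exchanges, the three coordination statistics can be computed as in the following sketch. The `Exchange` record, its field names, and the toy data are assumptions for illustration. Proportions are taken here over all labeled exchanges, and latency is taken as the gap in seconds between the end of S1's turn and the start of S2's; the slides do not spell out either choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exchange:
    label: str         # "smooth_switch", "interruption", or "overlap"
    prev_end: float    # time S1 stopped speaking, in seconds
    next_start: float  # time S2 started speaking, in seconds


def proportion(exchanges: List[Exchange], label: str) -> float:
    """Share of all exchanges that carry the given label."""
    return sum(e.label == label for e in exchanges) / len(exchanges)


def mean_smooth_switch_latency(exchanges: List[Exchange]) -> float:
    """Average silence between turns, over smooth switches only."""
    gaps = [e.next_start - e.prev_end for e in exchanges if e.label == "smooth_switch"]
    return sum(gaps) / len(gaps)


# Toy labeled exchanges from one hypothetical dialogue.
exchanges = [
    Exchange("smooth_switch", 2.0, 2.4),
    Exchange("overlap", 5.0, 4.8),
    Exchange("smooth_switch", 7.5, 7.7),
    Exchange("interruption", 9.0, 8.1),
]
overlap_share = proportion(exchanges, "overlap")
interruption_share = proportion(exchanges, "interruption")
latency = mean_smooth_switch_latency(exchanges)
```

Computing these three values per dialogue yields the series that would then be correlated with each entrainment score.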
Conclusion


Two novel measures of lexical entrainment,
based on the usage of high-frequency words.
Entrainment in usage of high-frequency words
is correlated with:




Perceived naturalness
Task success
Dialogue coordination
Implications for the development of Spoken Dialogue Systems (SDS).