ACL, June 2008 - Columbus, OH

High Frequency Word Entrainment in Spoken Dialogue

Ani Nenkova - Agustín Gravano - Julia Hirschberg

Department of Computer and Information Science, University of Pennsylvania - Philadelphia, PA
Department of Computer Science, Columbia University - New York, NY


Entrainment

- In conversation, people adapt the way they speak to match their partners’.
- Entrainment, accommodation, adaptation, alignment, convergence.

Agustín Gravano - ACL - June 2008


Previous Work
Existence of entrainment

In conversation, speakers:
- Negotiate common ways of describing things.
- Alter their intensity to match their partners’.
  (S.E. Brennan, 1996; R. Coulston et al., 2002; A. Ward & D. Litman, 2007)
- Reuse syntactic constructions.
  (D. Reitter et al., 2006)


Previous Work
Role of entrainment

Entrainment at different levels (lex, syn, sem):
- Is key for both production and understanding, and facilitates interaction.
  (M.J. Pickering & S. Garrod, 2004; D. Goleman, 2006)
- Is a good predictor of task success (MapTask).
  (D. Reitter & J. Moore, 2007)


This Work

- Novel measures of entrainment based on usage of high-frequency words (HFW).
- Entrainment and…
    - Perceived naturalness
    - Task success
    - Dialogue coordination
- Implications in the development of Spoken Dialogue Systems.


High-Frequency Words

- Most common words in a corpus, or in a conversation.
- Typically, function words and cue words.
- Entrainment of HFW: domain-independent.


Entrainment & Naturalness

Will a conversation be perceived as more natural if HFW entrainment occurs?

Switchboard corpus
- 2430 spontaneous telephone conversations in American English.
- Speakers asked to discuss a pre-assigned topic.
- Annotated for degree of perceived naturalness, from “1” (Very natural) to “5” (Not natural at all).
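The notion of high-frequency words defined in the slides above ("most common words in a corpus, or in a conversation") comes down to a frequency count. A minimal sketch, assuming lowercased, whitespace-separated tokens; the function name `most_frequent_words` and the toy conversation are hypothetical and not taken from the talk or from any corpus:

```python
from collections import Counter


def most_frequent_words(utterances, n):
    """Return the n most frequent lowercase tokens across the utterances."""
    counts = Counter()
    for utterance in utterances:
        # Assumption: whitespace tokenization is good enough for a sketch.
        counts.update(utterance.lower().split())
    return [word for word, _ in counts.most_common(n)]


# Hypothetical toy conversation, for illustration only.
toy_conversation = [
    "okay so the lion is on the left",
    "yeah the lion is on the left okay",
    "okay",
]

top_words = most_frequent_words(toy_conversation, 2)
print(top_words)  # ['the', 'okay']
```

Even in this tiny example the top of the list is a function word and a cue word, which is the kind of vocabulary the slides describe as typical of HFW.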


Entrainment & Naturalness
Measure of Entrainment

entr(w), where:
- fraction(w, Si) = fraction of times Speaker i used word w in the conversation

Examples:
- entr(‘okay’): | 10/500 – 8/600 | ≈ 0.0067
- entr(‘yeah’): | 1/500 – 30/600 | = 0.048


Entrainment & Naturalness
Machine Learning Task

- Predict the perceived naturalness of conversations.
- Binary decision, over balanced data: 250 conversations rated “1” (very natural), and 250 with ratings “3”, “4” or “5”.
- Computed entr(w) for the 100 most frequent words in the entire Switchboard corpus.
- Feature selection: 25 most predictive words.
  um, how, okay, go, I’ve, all, very, as, or, up, a, no, more, something, from, this, what, too, got, can, he, in, things, you, and.


Entrainment & Naturalness
Results

- Logistic regression model (10-fold CV): 63.76% accuracy (significantly better than the 50% baseline).
- Entrainment in usage of HFW is a good indicator of perceived naturalness.


Entrainment & Task Success

Is a conversation more likely to succeed when HFW entrainment occurs?

Columbia Games Corpus
- 12 spontaneous task-oriented dialogues in American English, with no eye contact.
- Each pair of subjects played a series of computer-based matching games.
- Subjects received a score after each task.


Entrainment & Task Success
Measures of Entrainment

Two measures, ENTR1(c) and ENTR2(c), where:
- c = class of words
- count_Si(w) = number of times Si used word w in the conversation


Entrainment & Task Success
Word Classes

- 25MF-G: 25 most frequent words in the game
- 25MF-C: 25 most frequent words in the corpus
  the, a, okay, and, of, I, on, right, is, it, that, have,…
- ACW: Affirmative cue words
  alright, mm-hm, okay, right, uh-huh, yeah, yes
  7.9% of all words in the Games Corpus


Entrainment & Task Success
Results

Correlations with game score:

Word class    ENTR1 cor (p)    ENTR2 cor (p)
25MF-C        0.341 (0.02)     0.187 (0.20)
25MF-G        0.376 (0.01)     0.260 (0.07)
ACW           0.230 (0.12)     0.372 (0.01)

HFW entrainment positively correlated with task success.


Entrainment & Coordination

Is dialogue more coordinated when HFW entrainment occurs?

Columbia Games Corpus
Labeled for type of turn exchanges (Beattie, 1982), including:
- Smooth Switch: S2 starts his turn after S1 has finished hers.
- Interruption: S2 starts his turn before S1 has finished hers.
- Overlap: S2 starts his turn just before S1 has finished hers, but without interrupting.


Entrainment & Coordination
Results

Significant correlations (p<0.05):

ENTR1(ACW)       & Prop. of Overlaps                  (cor = 0.64)
ENTR2(ACW)       & Prop. of Overlaps                  (cor = 0.61)
ENTR2(25MF-G)    & Prop. of Overlaps                  (cor = 0.60)
ENTR1(25MF-C)    & Prop. of Interruptions             (cor = -0.61)
ENTR2(ACW)       & Mean Latency of Smooth Switches    (cor = -0.76)

HFW entrainment positively correlated with more overlaps, fewer interruptions, and shorter inter-turn latencies.


Conclusion

- Two novel measures of lexical entrainment, based on the usage of high-frequency words.
- Entrainment in usage of high-frequency words is correlated with:
    - Perceived naturalness
    - Task success
    - Dialogue coordination
- Implications in the development of SDS.
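
The two worked examples on the "Measure of Entrainment" slide are consistent with an absolute difference between the two speakers' usage fractions of a word. A minimal sketch under that assumption: the slide's formula is not preserved in this text, so any sign or scaling convention of entr(w) is deliberately left out, the function name `fraction_gap` is hypothetical, and reading the denominators (500, 600) as each speaker's total word count is also an assumption.

```python
def fraction_gap(count_s1, total_s1, count_s2, total_s2):
    """Absolute difference between two speakers' usage fractions of one word.

    Assumption: each fraction is (times the speaker used the word) divided by
    (total words the speaker produced in the conversation).
    """
    return abs(count_s1 / total_s1 - count_s2 / total_s2)


# Numbers from the slide: 'okay' -> 10/500 vs. 8/600.
okay_gap = fraction_gap(10, 500, 8, 600)
# Numbers from the slide: 'yeah' -> 1/500 vs. 30/600.
yeah_gap = fraction_gap(1, 500, 30, 600)

print(f"okay: {okay_gap:.4f}")  # okay: 0.0067
print(f"yeah: {yeah_gap:.4f}")  # yeah: 0.0480
```

With these counts, the two speakers use 'okay' at much closer rates than 'yeah', so the gap for 'okay' is roughly seven times smaller.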