TickTock: A non-Goal-Oriented Multimodal Dialog System with Engagement Awareness
Zhou (Jo) Yu, Alex Papangelis and Alex Rudnicky, Carnegie Mellon University

Goals:
◦ 1: Keep the human engaged in the conversation as long as possible
◦ 2: Make the human enjoy the conversation experience

TickTock, Multimodal Dialog System

Natural Language Understanding
Input: Can you tell me something about movie?
1. POS tagger: Can/NNP you/PRP tell/VBP me/PRP something/NN about/IN movie/NN
2. Filter stop words: (tell, VBP), (something, NN), (movie, NN)
3. Reweighted: (tell, VBP, 1), (something, NN, 3), (movie, NN, 3)

Content Retrieval
• Keyword search in question-answer pair database
  ◦ m: number of words in input utterance
  ◦ n: number of words in candidate utterance
• Matches both questions and answers

Database
CNN Interview Transcripts
◦ “Piers Morgan Tonight” show: http://transcripts.cnn.com/TRANSCRIPTS/
◦ 767 interviews, 500 - 1,000 utterances each
◦ Speakers’ names are annotated for each turn
Pre-processing
◦ Sentence segmentation
◦ 2-stage Question-Answer (QA) Pair Extraction
  1. Rule-based Question Identification: “?”, “How”, “WH-”…
  2. Take the turn that directly follows an identified question, spoken by a different speaker, as the answer

Conversation Strategy Selection
score <= threshold: randomly select from the states:
◦ Switch topic
◦ End + open question
score > threshold: randomly select from the states:
◦ Answer only
◦ Continue (ask opinion)

Natural Language Generation
Template for each strategy:
  Template[switch] = ['template_end', 'template_new,topic']
  Template[end] = ['answer', 'template_open']
  Template[continue] = ['answer']
  Template[expand] = ['answer', 'template_expand']
Example Template[switch]: I can’t answer this, what do you think of music?
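
The threshold-based strategy selection and the template lookup described in the slides can be sketched roughly as follows. This is a minimal illustration, not the TickTock implementation: the function names, the slot texts other than the slide's own example sentence, and the pairing of "Answer only" with Template[continue] and "Continue (ask opinion)" with Template[expand] are all assumptions.

```python
import random

# Template table as given on the slide: strategy -> ordered list of slots.
TEMPLATES = {
    "switch": ["template_end", "template_new,topic"],
    "end": ["answer", "template_open"],
    "continue": ["answer"],
    "expand": ["answer", "template_expand"],
}

# Hypothetical surface text for the non-answer slots. Only the two
# "switch" pieces come from the slide's example sentence.
SLOT_TEXT = {
    "template_end": "I can't answer this,",
    "template_new,topic": "what do you think of music?",
    "template_open": "What else would you like to talk about?",
    "template_expand": "What is your opinion?",
}

# Assumed grouping of strategies on either side of the threshold.
LOW_SCORE = ["switch", "end"]        # score <= threshold
HIGH_SCORE = ["continue", "expand"]  # score > threshold


def select_strategy(score, threshold, rng=random):
    """Pick a strategy at random from the pool that matches the score."""
    pool = LOW_SCORE if score <= threshold else HIGH_SCORE
    return rng.choice(pool)


def generate(strategy, answer):
    """Fill the strategy's template, substituting the retrieved answer."""
    parts = [
        answer if slot == "answer" else SLOT_TEXT[slot]
        for slot in TEMPLATES[strategy]
    ]
    return " ".join(parts)
```

For example, generate("switch", "") reproduces the slide's example sentence, while generate("continue", answer) returns the retrieved answer unchanged.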

Talking head

Integrate in Conversation Strategies
Engagement score <= threshold: randomly select from the states:
◦ Switch topic
◦ End + open question
Engagement score > threshold: randomly select from the states:
◦ Answer only
◦ Continue (ask opinion)

Experiment Setting

Example Dialog

Engagement
Definition: Engagement is the process by which two (or more) participants establish, maintain and end their perceived connection during interactions they jointly undertake.
Annotation Unit: Turn
Reference: Sidner, C.L., Kidd, C.D., Lee, D., and Lesh, N. 2004. Where to Look: A Study of Human-Robot Engagement. In Proceedings of IUI, Madeira, Portugal.

Engagement levels (Level, Definition: Description)
◦ 1, Strongly disengaged: Shows no interest in the dialogue system; engaged in other things than talking to the dialogue system
◦ 2, Disengaged: Shows little interest to continue the conversation; passively interacts with the dialogue system
◦ 3, Neither disengaged nor engaged: Interacts with the dialogue system, showing neither interest nor lack of interest to continue the conversation
◦ 4, Engaged: Shows mild interest to continue the conversation
◦ 5, Strongly engaged: Shows a lot of interest to continue the conversation and actively contributes to the conversation

Annotation Analysis
[Figure: Distribution of engagement score for self-reported and third-person annotation. Panels: Interaction 1, Interaction 2, Interaction 3; x-axis: engagement level (1 to 5); y-axis: turn counts.]

Thanks
Zhou (Jo) Yu: zhouyu@cs.cmu.edu
Code available: http://trac.speech.cs.cmu.edu/repos/olympus/branches/actorimpersonator/