User Responses to Prosodic Variation in Fragmentary Grounding Utterances in Dialog
Gabriel Skantze, David House & Jens Edlund

Setting

Errors in dialog
• Dialog not always error free
• Error detection often made by grounding the user utterance using explicit or implicit verification:

User: […] on the right I see a red building.
System (low conf.): Did you say ‘A red building’?
System (high conf.): A red building… ok, take left […]?

Grounding in dialog
• Traditional dialog system grounding
  • Constructed as full propositions
  • Often perceived as tedious
  • Verifies entire user utterances
• Fragmentary grounding
  • Fast
  • Focuses on problem words/concepts
  • Often used in human-human dialog

User: […] on the right I see a red building.
System: red? / red.

The problem
• Fragmentary grounding utterances are potentially ambiguous
  • Little syntax and structure
  • Prosody more critical
• How do prosodic features affect the interpretation of such utterances?
• How do fragmentary grounding utterances and their prosody affect the subsequent user behavior?

Interpretations

User: […] on the right I see a red building.
System: red(?)

Level           Paraphrase
Acceptance      Ok, red.
Understanding   Do you really mean red?
Perception      Did you say red?

Allwood et al.
(1992), Clark (1996)

Experiment I
• Perception study to find out how prosodic features affect the interpretation of fragmentary grounding
• 36 stimuli
  • Parameters: color word, peak position, peak height, vowel duration
  • LUKAS diphone MBROLA synthesis
• 8 subjects
• Task: Listen to each stimulus in dialog context and select an appropriate paraphrase

Experiment I: results
Interpretations:
1. OK, yellow
2. Do you really mean yellow?
3. Did you say yellow?

Experiment II
• Wizard of Oz experiment to find out how fragmentary grounding affects user behaviour
• 8(+2) subjects
• Task: to help the computer model color perception by answering questions about color similarities
• The three prototypes from Experiment I were used to ground the user utterances

Results
• Subjects gave responses ("yes", "mm") to grounding utterances in 243 of 294 cases
• Responses were similar regardless of grounding type
• 2 judges categorized the responses by listening to them together with paraphrases of the grounding utterances
• Judges agreed in 50% of the
cases

Level           Paraphrase
Acceptance      Ok, red.
Understanding   Do you really mean red?
Perception      Did you say red?

Results
• The categories chosen by the judges corresponded significantly (chi-square) with the type of grounding utterance actually preceding the response.

[Figure: percentage of stimuli (0–100%) of each grounding type (Accept, ClarifyUnd, ClarifyPerc), grouped by the annotators' selected paraphrase (Accept, ClarifyUnd, ClarifyPerc). Annotation: significant correspondence.]

Results
• The silences between the end of the grounding utterances and the following user response were measured with /nailon/, software for speech analysis.
• Cognitive load hypothesis: responses to
  • acceptance: fast
  • perception clarification request: slower
  • understanding clarification request: slowest
• The results support the hypothesis (ANOVA)

Grounding type   Silence before user response (ms)
Acceptance       591
Understanding    976
Perception       634

Relation to the field in general and the other contributions in particular
• Important issues not addressed here:
  • Timing
  • Other modalities, e.g. facial gestures
  • Language and socio-cultural differences

Where we want to be in 5-10 years
• Goals:
  • More human-like error handling behavior in spoken dialog systems
  • Ability to generate appropriate grounding prosody for all types of utterances
  • Models for choosing prosody to achieve the desired pragmatic effect
  • Integration with fast and appropriate turn-taking
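
The figures on the Results slides can be sanity-checked with a few lines of Python. This is a minimal sketch, not the authors' analysis: the counts (243 of 294) and the three silence durations (591, 634, 976 ms) are copied from the slides, while the names `response_rate` and `matches_hypothesis` are hypothetical helpers introduced here for illustration. It only checks that the reported durations are ordered as the cognitive load hypothesis predicts; it does not reproduce the ANOVA, which would need the per-response measurements that the slides do not give.

```python
# Figures copied from the Results slides; all names are illustrative.
RESPONDED = 243   # grounding utterances that drew a user response
TOTAL = 294       # grounding utterances in total

# Silence (ms) between the grounding utterance and the user response.
SILENCE_MS = {"acceptance": 591, "perception": 634, "understanding": 976}


def response_rate(responded: int, total: int) -> float:
    """Fraction of grounding utterances that received a response."""
    return responded / total


def matches_hypothesis(silence_ms: dict) -> bool:
    """True if acceptance < perception < understanding, the ordering
    predicted by the cognitive load hypothesis on the slide."""
    return (
        silence_ms["acceptance"]
        < silence_ms["perception"]
        < silence_ms["understanding"]
    )


if __name__ == "__main__":
    print(f"Response rate: {response_rate(RESPONDED, TOTAL):.1%}")
    print("Ordering matches hypothesis:", matches_hypothesis(SILENCE_MS))
```

With the reported numbers, the response rate comes to roughly 83%, and the ordering check passes because 591 < 634 < 976.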