Dialog with Robots: Papers from the AAAI Fall Symposium (FS-10-05)
Do You Really Want to Know?
Display Questions in Human-Robot Dialogues
A Position Paper
Maxim Makatchev and Reid Simmons
{mmakatch, reids}@cs.cmu.edu
Robotics Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213
Abstract

Not all questions are asked with the same intention. Humans tend to address the implicit meaning of a question (the meaning that contributes to its pragmatic force), which requires knowledge of the context and a degree of common ground, more so than its explicit propositional content. Is recognizing this pragmatic force worth the trouble in today's human-robot dialogue systems? We focus on display questions (questions to which the asker already knows the answer) and argue that there are realistic human-robot interaction scenarios in existence today that would benefit from deeper intention recognition. We also propose a method for obtaining display question annotations by embedding an elicitation question into the dialogue. A preliminary study with our robot receptionist shows that at least 16.7% of interactions with the embedded elicitation question include a display question.

Introduction

Humans produce utterances that convey intentions beyond those implied by their apparent propositional meaning. For example, some utterances formed as questions do not seek the information explicitly referred to by their semantic content, but instead are intended as assertions (rhetorical questions such as "Who do you think you are?"), social rituals (phatic questions such as "How are you?"), or tests of the receiver's knowledge (display questions such as "Where is my office?"). As robots become more socially engaging, users may feel more inclined to produce utterances with a range of pragmatic meaning (briefly, a kind of meaning that depends on the context) closer to that of human-human dialogues. Robots, in turn, should be designed to respond to such utterances, either by interpreting only the semantic content, or by going deeper and addressing the pragmatic component, such as whether the question is an assertion, a greeting, or a test of the listener's knowledge.

Most present-day human-robot interaction (HRI) studies are conducted in controlled laboratory environments. It has been argued that studying HRI "in the wild" can uncover patterns that are not otherwise observed in laboratory settings, e.g. (Sabanovic, Michalowski, and Simmons 2006). For example, Hala, the robot receptionist stationed in Qatar, routinely receives relationship status questions and marriage proposals (Makatchev et al. 2010). In the case of the Roboceptionist, a robot receptionist located at the entrance of a university building (Gockley et al. 2005), users exhibit behaviors such as swearing (Lee and Makatchev 2009), joking, and asking display questions, i.e. questions to which they already know the answers.

1 U: Jane Smith
2 R: Jane Smith is in 4001 Wean Hall. Would you like directions?
3 U: no she is not
4 R: OK
5 U: Smith is not where you say she is

Figure 1: A verbatim fragment of a dialogue containing the user's feedback. The person's name and room are modified for privacy, and the labels "U:" and "R:" are added to denote the user and robot turns.

Consider the fragment of a dialogue shown in Figure 1 between a human and the Roboceptionist. Clearly, the user has some knowledge of where Jane's office is (the robot's answer is actually incorrect). Why did the user ask the robot this question in the first place? What was the user's intention? Is an answer that addresses only the semantic content of the query the best possible response? In this paper, we argue that these are important questions to ask, and we address a problem essential for the development of display question detection methods: obtaining ground-truth display question annotations.

Display questions are extensively studied in education research as an important part of canonical Western-style classroom instruction (Boyd and Nauclér 2001) (e.g., "What is the capital of Cuba?"). Outside of the "teacher talk" context, much like rhetorical and phatic questions (Malinowski 1923), display questions are not answered based solely on their semantic content (Koshik 2003). Unlike rhetorical questions, however, they do usually require a recipient's response.

We hypothesize relations between display questions and user intentions, the process of assigning participant roles, and the degree of trust the user attributes to the robot. A major difficulty in evaluating these hypotheses is the lack of an annotated corpus of display questions. As a possible solution to this problem, we propose an interactive approach to labeling display questions that incorporates an explicit elicitation request to the user within the dialogue itself. The advantage of our method, compared to user interviews and questionnaires, is that it neither requires the robot to step out of character nor breaks the impression that the environment is unmonitored. In the following section we describe the method in more detail and provide preliminary results of its application to display question annotation.
1 U: where is hamburg hall
2 R: Sorry. I could not get the information. If you happen to know the answer to your question, can you tell it to me?
3 U: yes
4 R: Ok, what is the answer?
5 U: go out of the double doors behind you and walk to your left, the path will lead you to the back of hamburg hall.

Figure 2: A verbatim dialogue fragment containing the embedded elicitation question in turns 2 and 4.
Eliciting the Display Question Label
The Roboceptionist dialogues occasionally include fragments where users provide explicit feedback, or even answers to their own questions, after the robot fails to answer them properly. For example, dialogues such as the one in Figure 1 suggest that (a) users do ask display questions, and (b) some users are willing to provide detailed feedback after the robot fails to produce a satisfactory answer. We attempt to exploit these tendencies by having the robot purposely fail, at random, to answer a question. After each user question, we randomly choose whether to present an elicitation question (but not more than once per dialogue); hence, longer dialogues are more likely to contain an elicitation question than shorter ones. A fragment of such a dialogue is shown in Figure 2.
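For concreteness, the per-question decision can be sketched as follows. This is only an illustration, not the deployed Roboceptionist code: the names DialogueState, maybe_elicit, and the intervention probability p_elicit are placeholders of ours.

    import random
    from dataclasses import dataclass

    ELICITATION_PROMPT = ("Sorry. I could not get the information. "
                          "If you happen to know the answer to your "
                          "question, can you tell it to me?")

    @dataclass
    class DialogueState:
        elicitation_used: bool = False  # at most one elicitation per dialogue

    def maybe_elicit(state: DialogueState, p_elicit: float = 0.2):
        """Return an elicitation prompt instead of an answer, or None.

        With probability p_elicit (an arbitrary placeholder value) the
        robot purposely "fails" to answer the current user question,
        and it never does so more than once per dialogue. Because the
        check runs after every user question, longer dialogues are more
        likely to receive the intervention.
        """
        if state.elicitation_used:
            return None                # already intervened in this dialogue
        if random.random() < p_elicit:
            state.elicitation_used = True
            return ELICITATION_PROMPT  # deliberate non-answer (cf. Figure 2, turn 2)
        return None                    # let the robot answer normally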
The subset of display questions labeled in this way yields an estimate of a lower bound on the fraction of interactions that contain a display question among all the interactions with an elicitation question. It is a lower bound because (a) the user may choose not to answer the elicitation question even if she knows the answer, and (b) the elicitation intervention may be applied to a question that is not a display question within a dialogue that does contain a display question.
Note that the lower bound holds even though the presence of the elicitation question affects the rest of the interaction. An experiment conducted over a period of three months shows that at least 16.7% (SE = 4.6%) of interactions that included an elicitation question contained a display question.
Conclusion

We have attempted to draw attention to the pragmatic forces behind questions asked of a robot receptionist. We hypothesized relations between display questions and user intentions, role assignments, and levels of trust. These hypotheses are open to experimental verification, provided there is a method for reliable annotation and detection of display questions. We suggested one candidate method for labeling display questions that breaks neither the robot's character nor the impression that the environment is unmonitored. The approach, which involves eliciting the user's answer to her own question, shows that at least 16.7% of interactions that included the robot's elicitation question also contained a display question asked by the user. It remains to be seen how these data can be used to reliably determine user intentions and their effect on user-robot interactions.

If the hypotheses are true, there are a number of ways in which the dialogue can be adapted to a user who has produced a display question. To mention a few: (a) the user's true intention (e.g. exploring the robot's capabilities) can be addressed; (b) the robot can behave in accordance with the role or relative status that the user assigns to it, or try to affect these assignments; and (c) combining the two above, the robot may address the issue of low trust from the human by explaining its own capabilities, the robot's and the user's roles, and the shared task.

Acknowledgements

The authors would like to thank Sanako Mitsugi, Antonio Roque, and Suzanne Wertheim. This publication was made possible by the support of an NPRP grant from the Qatar National Research Fund. The statements made herein are solely the responsibility of the authors.

References

Boyd, S., and Nauclér, K. 2001. Sociocultural aspects of bilingual narrative development in Sweden. In Verhoeven, L. T., and Strömqvist, S., eds., Narrative Development in a Multilingual Context. Amsterdam: John Benjamins B.V.

Gockley, R.; Bruce, A.; Forlizzi, J.; Michalowski, M.; Mundell, A.; Rosenthal, S.; Sellner, B.; Simmons, R.; Snipes, K.; Schultz, A. C.; and Wang, J. 2005. Designing robots for long-term social interaction. In Proc. Int. Conf. on Intelligent Robots and Systems, 2199–2204.

Koshik, I. 2003. Wh-questions used as challenges. Discourse Studies 5:51–77.

Lee, M. K., and Makatchev, M. 2009. How do people talk with a robot? An analysis of human-robot dialogues in the real world. In Proc. of CHI, 3769–3774.

Makatchev, M.; Fanaswala, I. A.; Abdulsalam, A. A.; Browning, B.; Gazzawi, W. M.; Sakr, M.; and Simmons, R. 2010. Dialogue patterns of an Arabic robot receptionist. In Proc. of HRI. Osaka, Japan: ACM.

Malinowski, B. 1923. The problem of meaning in primitive languages. In Ogden, C. K., and Richards, I. A., eds., The Meaning of Meaning: A Study of the Influence of Language Upon Thought and of the Science of Symbolism. New York: Harcourt, Brace and World. 296–336.

Sabanovic, S.; Michalowski, M. P.; and Simmons, R. 2006. Robots in the wild: Observing human-robot social interaction outside the lab. In Proc. of Int. Workshop on Advanced Motion Control.