
Proceedings of the Twenty-First International FLAIRS Conference (2008)
Humanoid Agents as Hosts, Advisors, Companions and Jesters
Candace L. Sidner
Advanced Information Technology BAE Systems
6 New England Executive Drive, Burlington, MA 01803
candy.sidner@alum.mit.edu
Abstract
Humanoid robots are valuable to study not only for how
they can be used, but also as a research platform for
understanding a variety of issues in human interaction:
we gain inspiration from human-to-human interaction and
study human-robot interaction for its comparisons and
contrasts.
Humanoid agents can serve in a variety of activities, ranging
from hosts, to advisors and coaches, to teachers and
humorous companions. This paper explores recent research in
humanoid robots and embodied conversational agents to
understand how both of these types of humanoid agents
might play a role in our lives.
Introduction
In the 1970s, AI leaders believed we would soon see all
manner of intelligent agents, agents that would easily beat
grand masters at chess, speak natural language fluently and
continually learn new things. These problems turned out to
be much harder than they thought. Subfields formed
within AI to pursue specialized research in robotics,
machine vision, natural language and so on. Speech
recognition was joined with natural language research to work
towards spoken language systems.
Happily, in the past 25 years, these subfields have made
significant progress, to the point that there are now some
commercially available robots, speech recognition engines,
and vision systems that can actually find and sometimes
correctly identify faces, gaze, arms, and simple objects.
There are also embodied agents that live in the screen
world. So-called embodied conversational agents (ECAs)
have animated faces and bodies, and can be combined with
speech and vision systems to serve as a screen-based
animated intelligent partner.
What are the capabilities of these various agents for
interaction? In this paper I survey the new types of
humanoid robots being explored in research groups around
the world. While this covers only a sample of the work being
done, it will indicate the breadth of possibilities in this
area. I then turn to ECAs and review them briefly as well,
along with applications of their use and how these might be
applied to robots.
Humanoid Robots: Capabilities and Limitations

Humanoid robots run the gamut from robots built using robot
bases with laptops displaying animated faces, as in the
Naval Research Laboratory robot (Kennedy et al. 2007) in
Figure 1, to robotic faces and robotic arms, as shown in the
JAST robot (Rickert et al. 2007) in Figure 2, to specially made
animal-like ones such as Leonardo (Thomaz and Breazeal
2007) in Figure 3, and special-purpose efforts such as Kidd's
robot head for coaching weight loss (Kidd and Breazeal 2007)
in Figure 4. While these robots are very diverse in form,
they bear many similarities in their ability to produce
communication styles and uses of limbs that mimic those of
human communicators.

Figure 1: George, a Naval Research Labs robot

Figure 2: The EU JAST project robot

Figure 3: Leonardo from the MIT Media Lab

Figure 4: Kidd's weight loss coach robot
One-of-a-kind humanoids now share the research
world with commercially produced humanoids, such
as Melvin, shown in Figure 5, a humanoid robot developed
by Robomotio (www.robomotio.com) and Mitsubishi
Electric Research Labs with 15 degrees of freedom, speech
input and output, and mounted on a mobile base.
Figure 5: Melvin, developed by Robomotio and MERL
Beyond humanoids, it is important to mention work on
"android" robots, which attempt to imitate humans very
realistically. David Hanson (Hanson Robotics) and Hiroshi
Ishiguro (Nishio et al. 2007) are exploring either face-only
androids or full-body androids (see Figure 6 for
Ishiguro's "geminoid" android). Full-body androids cannot
yet walk, because robotics has not progressed to
walking in the manner of humans. However, when seated,
Ishiguro's geminoid can participate in a meeting as a
surrogate for its creator while being controlled by him.
Figure 6: Ishiguro's geminoid robot with Ishiguro
Robots that display human or human-like physical
characteristics offer the possibility of being partners with
human interaction styles. However, unless they are
capable of speaking and understanding, and of using some
sensory means of being aware of their partner, humanoid
robots remain of limited use for interactions with people.
All the robots shown here make use of various sensory
inputs. Speech recognition tools make it possible to utter
sentences and phrases to them. However, unrestricted
natural conversation pushes most speech recognition
engines beyond their limits because conversational speech
is highly unpredictable, varied and disfluent. Good models
of dialogue, of which there are too many to review here,
have been shown to improve speech recognition engines
(Lemon et al. 2002). Yet care must still be taken to limit
the human partner's expectations about what can be said,
so as to reduce error rates to a useful level.
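To make this mitigation concrete, the following minimal sketch constrains interpretation to a small set of expected utterances by fuzzy-matching recognizer output against them; anything too far from the expected set is rejected rather than misunderstood. The utterance list and matching threshold are hypothetical illustrations, not drawn from any of the systems cited here.

```python
import difflib
from typing import Optional

# A small, task-specific "grammar": the only utterances the robot
# expects at this point in the dialogue (hypothetical examples).
EXPECTED_UTTERANCES = [
    "show me the demo",
    "tell me about yourself",
    "what is this device",
    "goodbye",
]

def interpret(hypothesis: str) -> Optional[str]:
    """Map a noisy recognizer hypothesis onto the closest expected
    utterance, or reject it when nothing is close enough."""
    matches = difflib.get_close_matches(
        hypothesis.lower(), EXPECTED_UTTERANCES, n=1, cutoff=0.6
    )
    return matches[0] if matches else None

# A slightly misrecognized phrase still lands on the intended
# utterance, while out-of-grammar speech is rejected.
print(interpret("show me the demon"))       # -> "show me the demo"
print(interpret("quantum chromodynamics"))  # -> None
```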
Vision research has progressed significantly in recent
years, which makes possible the use of vision algorithms
for face tracking (Viola and Jones 2001), face recognition,
gaze recognition (Morency et al. 2006), tracking of limbs
(e.g. Demirdjian 2004) and recognition of objects in a
limited way (e.g. Torralba et al. 2004; Leibe et al. 2007 for
pointers into a vast literature). However, all of this
technology is still in its relative infancy. Furthermore, a
moving robot will have a great deal more trouble tracking
and recognizing a face than one that is stationary at the
time it wishes to use its vision technology.
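As an illustration of how accessible such face tracking has become, the sketch below runs a pretrained Haar-cascade face detector in the spirit of Viola and Jones (2001), using the OpenCV library. This is an illustrative stand-in, not the vision stack of any robot discussed here; the camera index and cascade file are the library's defaults.

```python
import cv2

# Pretrained Haar-cascade face detector shipped with OpenCV, in the
# spirit of Viola and Jones (2001). Illustrative only: the robots
# surveyed here used their own research vision systems.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

cap = cv2.VideoCapture(0)  # assumed index for the robot's camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Each detection is an (x, y, w, h) box a robot could use to
    # orient its head toward its human partner.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```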
Integration of vision technology, along with mobile
movement, is a considerable undertaking in humanoid
robots. How can all this technology be used?
There are two general categories of answer, one based on
work with humanoid robots, and the other on ECAs, which
can have highly natural facial and body movements. Let us
first envision what robots might do by looking at research
that demonstrates what ECAs are now doing in research
labs to interact in intelligent and useful ways with their
human partners. I will then turn to research, including my
own, on using humanoid robots both for research and for
useful applications.

Embodied Conversational Agents
Embodied conversational agents simplify the physical
hardware problem inherent in humanoid robots by
replacing it with animation. Animation, while a challenge,
can be used at varying levels of sophistication to produce a
character that is human-looking, both in form and
movement. Bickmore's Laura ECA
(Bickmore and Picard 2005) and Pelachaud's Greta
(Pelachaud 2005), shown in Figures 7 and 8, illustrate
some of the range of animation forms. Work at the Institute
for Creative Technologies, not just with a single agent but
with an immersive full-wall screen world, includes many ECAs
(Rickel et al. 2002).
Figure 7: Bickmore's Laura ECA

Figure 8: Pelachaud's Greta ECA

Pelachaud's Greta includes a human face as well as a body
with significant facial motion, including lip movement, and
has been used to study the expression of emotion and
communication style during talking (Pelachaud 2005). Greta
is also in use as a research tool by other groups (including
E. Andre's group at the University of Augsburg) to support
their own research efforts in emotion and dialogue.
ECAs, facial or full-body, also support research in social
behavior, including that of children. Cassell's research on
Sam the Castlemate, shown in Figure 9, provides children
with the opportunity to improve their preliteracy skills by
combining the ECA with physical figurines to "share" with
Sam (Ryokai et al. 2002; Cassell 2004). This work illustrates
not only the research-platform side of ECAs but also their
practical application for serving human activities, in this
case, playing and at the same time improving literacy
skills.

Figure 9: Sam the Castlemate and child
The projects associated with these three examples illustrate
a basic point of this paper: ECAs are being used in
significant ways to train people, to assist them in their lives,
and as research platforms for learning about
communication, emotion, and social behavior.
The ICT immersive technology, shown in Figure 10,
including full-body ECAs, is designed to help train soldiers
in tasks needed for modern military in-country duty.
Training is a significant and useful role for ECAs. While
research remains to be done to determine how and why this
technology is more effective than non-ECA efforts, its
potential will be greatly enhanced by tools that allow
researchers and engineers to focus on what must be learned
rather than on building the ECA and environment.

Figure 10: Peace keeping scenario from ICT

Bickmore's work on the Laura ECA is instructive for two
principal reasons: it is both a research platform for
studying social dialogue and a means of creating devices that
offer assistance to users seeking to change their habits in diet
and exercise. As a research platform, Bickmore was able
to show that people interacting with Laura over a several-week
span for exercise advice responded socially to Laura
when she not only talked with them about the task of
exercise but also about social matters (Bickmore and
Picard 2005). Bickmore's more recent research has
focused on the use of ECAs, which he calls relational
agents, in dialogues with users to support their
behavior-change activities. In a pilot study, Bickmore
showed that participants recruited from the Geriatric
Ambulatory Practice at Boston Medical Center who used a
relational agent exercise coach daily for two months
performed significantly more walking than a
non-intervention control group drawn from the same practice
(Bickmore et al. 2005).

Humanoid Robots in Research and Applications

In this section, we will focus on the humanoid robot as a
research platform and as one for supporting humans in many
ways. In our own recent work, Lee and I (Sidner, Lee et al.
2005; Sidner, Lee et al. 2006) developed dialogue and
collaboration capabilities for two humanoids, one a
penguin with a face wearing glasses, a body and wings (see
Figure 11), and the other using Melvin (see Figure 5).
Figure 11: Mel, a quasi-humanoid robot
This research concerned attentional behaviors, especially
the nature of engagement between the robot and its human
interlocutor. Engagement is the process by which
participants start, maintain, and end their perceived
connection to one another. Our robots could hold extended
dialogues with their human counterparts while tracking the
humans' faces and, when dialogue-appropriate, viewing
other objects relevant to the conversation. They could also
interpret nods during dialogue.
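One way to make the engagement process concrete is as a small state machine over perceptual and dialogue events, sketched below. The states and event names are hypothetical simplifications for exposition, not the architecture our robots actually used.

```python
from enum import Enum, auto

class Engagement(Enum):
    IDLE = auto()        # no perceived connection yet
    INITIATING = auto()  # robot has greeted, awaiting a response
    ENGAGED = auto()     # connection established and being maintained
    CLOSING = auto()     # one party is disengaging

def step(state: Engagement, event: str) -> Engagement:
    """Advance the engagement state on a perceptual or dialogue event.
    Event names are hypothetical placeholders for outputs of the
    robot's face-tracking, speech, and nod-recognition components."""
    if state is Engagement.IDLE and event == "face_detected":
        return Engagement.INITIATING      # a person appeared; greet them
    if state is Engagement.INITIATING and event in ("speech_heard", "nod_seen"):
        return Engagement.ENGAGED         # the person responded
    if state is Engagement.ENGAGED and event == "face_lost_long":
        return Engagement.CLOSING         # sustained looking away
    if state is Engagement.CLOSING and event == "farewell_spoken":
        return Engagement.IDLE            # the connection has ended
    return state                          # otherwise, maintain engagement

state = Engagement.IDLE
for ev in ("face_detected", "speech_heard", "face_lost_long", "farewell_spoken"):
    state = step(state, ev)
    print(ev, "->", state.name)
```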
In user studies, we found that as long as the robot
conversed, whether it tracked its human interlocutor or not,
human participants looked back at the robot when they
took their turn (as people often do in human dialogues).
However, people preferentially looked back at the robot
when it tracked them. They also considered the robot
more natural when it moved and tracked them, and they nodded
more at the robot when it indicated through its own nods
that it was interpreting their nods. Our robots could converse
about themselves, engage in collaborative demonstrations
of devices, and locate a person in an environment and
begin an interaction with them.
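To illustrate the kind of signal behind nod interpretation, the toy detector below looks for repeated down-up swings in a short window of head-pitch samples. The thresholds and the swing-counting heuristic are assumptions for exposition; the systems described above relied on trained recognizers over vision-based head tracking (Sidner, Lee et al. 2006).

```python
def detect_nod(pitch_series, threshold_deg=4.0, min_swings=2):
    """Heuristic nod detector over a short window of head-pitch
    samples (in degrees). A nod shows up as repeated down-up swings
    exceeding a small amplitude threshold. This is a toy stand-in
    for the trained head-gesture recognizers cited in the text."""
    swings = 0
    direction = 0  # -1 falling, +1 rising
    for prev, cur in zip(pitch_series, pitch_series[1:]):
        delta = cur - prev
        if abs(delta) < 0.5:          # ignore sensor jitter
            continue
        new_dir = 1 if delta > 0 else -1
        if direction and new_dir != direction:
            swings += 1               # pitch reversed direction
        direction = new_dir
    amplitude = max(pitch_series) - min(pitch_series)
    return swings >= min_swings and amplitude >= threshold_deg

# A synthetic down-up-down pitch trace reads as a nod:
print(detect_nod([0, -3, -6, -2, 1, -4, -6, 0]))  # True
print(detect_nod([0, 0.2, 0.1, -0.1, 0.0]))       # False (jitter only)
```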
Our robot research program has been directed towards the
robot as a host in an environment: a host that can direct a
human or groups of humans in the environment, tell them
about it, and interact with them about devices in that
environment. Hosting is a form of companionship aimed at
information sharing, as opposed to building social
relationships. Humanoid robots, drawing on work in
ECAs, can provide social interaction. However, because
the benefit of physical presence and locomotion comes at
the cost of hardware and its maintenance challenges,
humanoid robots must be useful in their very physicality.
Research so far, both on robots and on ECAs, suggests this
will be so.
Robots and ECAs are now being explored for similar tasks,
for example, in health behavior change, where Kidd's recent
research on a robotic weight loss coach (Kidd and Breazeal
2007) reprises some of the effects shown in Bickmore's work.
Other research (see Wainer et al. 2006 for one such study)
suggests that there are differential effects in social
awareness, believability, and trust with a co-present robot as
opposed to a virtual (ECA) one. However, a better option
at this point in time is to be inspired by the applications of
ECAs when looking for uses of humanoids.
Humanoid robots have had one direct analogue of the
Cassell work in recent years: providing an interactive
partner for autistic children (Feil-Seifer and Matarić 2005).
Matarić and colleagues have identified socially assistive
uses of robots as significant, that is, robots that assist in
social interaction for a purpose other than the interaction
itself. Robots that help autistic children are essentially
teachers, helping them to understand social behaviors they
find impossible to learn on their own. Likewise, in their view,
robots that serve as tutors or assistants in physical tasks are
socially assistive robots.

Obviously, as robots become able to use their arms and
hands safely, many capabilities will manifest themselves.
In the meantime, as advisors and teachers, humanoid
robots can peer into our environment as we perform tasks,
including repairs, or learn new physical skills or seek to
perfect them. As social companions, they may be more
engaging than simply "wiring" an environment to notice
when we forget something or, in the case of elders, fall and
become injured. Evidence already suggests that we
humans feel attuned to and engaged by the physical
presence of a humanoid robot.

Social interaction means having something to be social
about. Advising on activities and tasks, teaching skills,
and hosting in environments are useful steps in that
direction. However, as a field, we have much to learn
about matters not discussed thus far in achieving social
interaction. These matters include emotional connection,
extended dialogue and, yes, even humor as part of our
interactions. We are all aware of the role of these
behaviors in our human-to-human social interaction. It
remains to be seen whether these behaviors, when possible
with robots, will be beneficial in our interactions with them.

Progress on many fronts, both on social matters and
on better robotic and visual technology, will offer
humanoid agents, embodied in the computer display and in
the physical world, who will be worthwhile additions to
our lives.

References

Bickmore, T., Caruso, L., Clough-Gorr, K., and Heeren, T.
2005. "It's just like you talk to a friend": Relational Agents
for Older Adults. Interacting with Computers, 17(6): 711-735.

Bickmore, T. and Picard, R. 2005. Establishing and
Maintaining Long-Term Human-Computer Relationships.
ACM Transactions on Computer-Human Interaction
(ToCHI), 12(2): 293-327.

Cassell, J. 2004. Towards a Model of Technology and
Literacy Development: Story Listening Systems. Journal
of Applied Developmental Psychology, 25(1): 75-105.

Demirdjian, D. 2004. Combining Geometric- and View-Based
Approaches for Articulated Pose Estimation. In
Proceedings of ECCV'04, Prague, Czech Republic, May.

Feil-Seifer, D.J. and Matarić, M.J. 2005. Defining
Socially Assistive Robotics. Poster paper in International
Conference on Rehabilitation Robotics, 465-468,
Chicago, IL.

Kennedy, W.G., Bugajska, M., Marge, M., Adams, W.,
Fransen, B.R., Perzanowski, D., Schultz, A.C., and Trafton,
J.G. 2007. Spatial Representation and Reasoning for
Human-Robot Collaboration. In Proceedings of the
Twenty-Second Conference on Artificial Intelligence,
1554-1559. Vancouver, Canada: AAAI Press.

Kidd, C.D. and Breazeal, C. 2007. A Robotic Weight Loss
Coach. In Proceedings of the Twenty-Second Conference on
Artificial Intelligence. Vancouver, British Columbia,
Canada: AAAI Press.

Leibe, B., Cornelis, N., Cornelis, K., and van Gool, L.
2007. Dynamic 3D Scene Analysis from a Moving
Vehicle. In Proceedings of the 2007 IEEE Computer Society
Conference on Computer Vision and Pattern Recognition.

Lemon, O., Gruenstein, E., and Peters, S. 2002.
Collaborative activities and multi-tasking in dialogue
systems. Traitement Automatique des Langues (TAL),
Special Issue on Dialogue, 43(2): 131-154.

Morency, L.-P., Christoudias, C.M., and Darrell, T. 2006.
Recognizing Gaze Aversion Gestures in Embodied
Conversational Discourse. In Proceedings of the International
Conference on Multimodal Interfaces (ICMI), Banff, Canada.

Nishio, S., Ishiguro, H., and Hagita, N. 2007. Can a
teleoperated android represent personal presence? A case
study with children. Psychologia, 50(4): 330-343.

Pelachaud, C. 2005. Multimodal Expressive Embodied
Conversational Agents. In ACM Multimedia, Brave New
Topics session, Singapore, November.

Rickel, J., Marsella, S., Gratch, J., Hill, R., Traum, D., and
Swartout, W. 2002. Towards a New Generation of Virtual
Humans for Interactive Experiences. IEEE Intelligent
Systems, 17(4): 32-38.

Rickert, M., Foster, M.E., Giuliani, M., By, T., Panin, G.,
and Knoll, A. 2007. Integrating Language, Vision and
Action for Human Robot Dialog Systems. In Proceedings
of HCI International 2007, 987-995, Beijing, July.

Ryokai, K., Vaucelle, C., and Cassell, J. 2002. Literacy
Learning by Storytelling with a Virtual Peer. In
Proceedings of Computer Support for Collaborative
Learning (CSCL).

Sidner, C., Lee, C., Kidd, C., Lesh, N., and Rich, C. 2005.
Explorations in Engagement for Humans and Robots.
Artificial Intelligence, 166(1-2): 140-164, August.

Sidner, C., Lee, C., Morency, L.-P., and Forlines, C. 2006.
The Effect of Head-Nod Recognition in Human-Robot
Conversation. In Proceedings of the 2006 ACM Conference
on Human-Robot Interaction, 290-296.

Thomaz, A.L. and Breazeal, C. 2007. Robot Learning via
Socially Guided Exploration. In Proceedings of the 6th
International Conference on Development and Learning
(ICDL), Imperial College, London.

Torralba, A., Murphy, K.P., and Freeman, W.T. 2004.
Sharing Features: Efficient Boosting Procedures for
Multiclass Object Detection. In Proceedings of the 2004
IEEE Computer Society Conference on Computer Vision
and Pattern Recognition (CVPR), 762-769.

Viola, P. and Jones, M. 2001. Rapid Object Detection Using
a Boosted Cascade of Simple Features. In Proceedings of
the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 905-910, Hawaii.

Wainer, J., Feil-Seifer, D.J., Shell, D.A., and Matarić, M.J.
2006. The Role of Physical Embodiment in Human-Robot
Interaction. In Proceedings of the IEEE International
Workshop on Robot and Human Interactive Communication
(RO-MAN), 117-122, Hatfield, United Kingdom.