Using Anticipation to Create Believable Behaviour ∗
Carlos Martinho
Ana Paiva
Instituto Superior Técnico
Av. Prof. Cavaco Silva, TagusPark
2780-990 Porto Salvo, Portugal
e-mail: carlos.martinho@tagus.ist.utl.pt
INESC-ID
Rua Alves Redol 9, Apartado 13069,
1000-029 Lisboa, Portugal
e-mail: ana.paiva@inesc-id.pt
∗ This work is supported by the EU project MindRACES: from reactive to anticipatory cognitive embodied systems (IST-511931).
Copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Although anticipation is an important part of creating believable behaviour, it has had but a secondary role in the field of life-like characters. In this paper, we show how a simple anticipatory mechanism can be used to control the behaviour of a synthetic character implemented as a software agent, without disrupting the user's suspension of disbelief. We describe the emotivector, an anticipatory mechanism coupled with a sensor, that: (1) uses the history of the sensor to anticipate the next sensor state; (2) interprets the mismatch between the prediction and the sensed value, computing its attention-grabbing potential and associating a basic qualitative sensation with the signal; and (3) sends its interpretation along with the signal. When a signal from the sensor reaches the processing module of the agent, it carries recommendations such as "you should seriously take this signal into consideration, as it is much better than we had expected" or "just forget about this one, it is as bad as we predicted". We delineate several strategies for managing multiple emotivectors at once and show how one of these strategies (meta-anticipation) transparently introduces the concept of uncertainty. Finally, we describe an experiment in which an emotivector-controlled synthetic character interacts with the user in the context of a word-puzzle game, and present the evaluation supporting the adequacy of our approach.
Introduction

Artificial Intelligence has long sought to construct autonomous creatures. The thought of these entities brings special delight when they are imagined to project a sense of being "really there". This sensation gave birth to the concept of believability. Although the notion falls prey to its subjective nature, more than two decades of research have followed the seminal definition forwarded by Bates (1994), outlining a believable character as one able to maintain the user's "suspension of disbelief". Even the most unrealistic behaviour of a synthetic character must be consistent with the expectations created in the user through the character's presentation. As such, anticipation plays an important part in the creation of believability.

The quest for believability has sent researchers down two different paths: a pragmatic approach inspired by arts such as drama and character animation (Thomas & Johnston 1994), and another that strives for higher levels of autonomy by providing the synthetic creation with biologically plausible (Blumberg 2002) or psychologically sound (Gratch & Marsella 2003) behaviour. Both paths emphasize believability as a dimension of synthetic performance closely related to the adequate expression of emotion. As a result, most believable characters have some affective model (Picard 1997) underlying their behaviour.

Several models of emotion have been proposed to help achieve believability. However, none explicitly integrates the concept of anticipation into the creation of lifelike behaviour. Anticipation is usually found diluted in the planning mechanisms of the synthetic character (Gratch & Marsella 2003) or disguised as an emotion in itself, one that involves pleasure in considering some expected or longed-for good event, or irritation at having to wait, as in Plutchik's theory of emotions (Plutchik 1991). Anticipation has had but a secondary role in the creation of believable characters.

In this paper, we show how a simple anticipatory mechanism, which we call the emotivector, can lead to believable behaviour. We describe the emotivector, how it monitors the sensor history of the agent, anticipates the next sensor state, and extracts salience and qualitative information from the mismatch between the prediction and the sensed value. Unlike approaches such as Breazeal and Scassellati's motivation-driven perceptual maps (Breazeal & Scassellati 1999), we compute salience by anticipating the sensor state rather than by merely reacting to it.

We delineate strategies to manage several emotivectors at once and show how meta-anticipation can represent the concept of uncertainty. Finally, we describe an experiment where the information produced by the emotivector was used to control a synthetic character, and show how the evaluation asserted the adequacy of the approach for producing believable behaviour.
Anticipation and Emotion

Rosen (1985) defined an anticipatory system as one that contains a predictive model of itself and/or its environment, which allows it to change state at an instant in accordance with the model's predictions pertaining to a later instant.
Anticipatory systems possess two interesting features (Martinho et al. 2005). First, surprise is an inherent characteristic of anticipatory systems, as purely reactive systems cannot be surprised. Surprise essentially arises as a mismatch between what is expected to be perceived and what is actually perceived. Second, the basal functionality of an anticipatory system relies on its ability to predict whether the system is going in the "desired" or "undesired" direction, which is but the simplest form of valenced affective reaction.

The idea behind the emotivector is to capture these two features in their most basic form and to assess their potential for the creation of believable behaviour.
Emotivector

The emotivector is an anticipatory mechanism attached to a sensor. Each emotivector is associated with a one-dimensional aspect of perception, and all emotivectors are kept together in a salience module, responsible for their management as a whole. Figure 1 shows the agent architecture.

Figure 1: Agent Architecture

When a percept is sent to the processing module, the associated emotivector catches its value and performs the following steps: (1) using the history of percepts, the emotivector computes the next expected value of the sensor, relying on a hybrid algorithm based on the Kalman filter and the generalized recirculation algorithm; (2) by confronting the expectation with the actual sensed value, and using a model inspired by the psychology of attention, the emotivector computes a preliminary salience for the percept; and (3) a sensation is generated according to a model inspired by the psychology of emotion. The combination of both the "attentional" and "emotional" interpretations is added to the percept. When a percept reaches the processing module, it carries recommendations such as "you should seriously take this signal into consideration, as it is much better than we had expected" or "just forget about this one, it is as bad as we predicted".

The following sections describe each step. For the sake of simplicity, all sensor values are normalized to [0, 1].
Percept Salience (Attention Model)

The computation of an a-priori salience for a percept is based on a model of attention inspired by Posner's exogenous (automatic reflexive control) and endogenous (voluntary control) systems (Posner 1980), and by how these systems interact according to Müller's hypothesis (Müller & Rabbitt 1989).

Our model of attention is implemented as follows. At time t−1, the emotivector value is xt−1 ∈ [0, 1]. Using its history at time t−1, the emotivector estimates a value x̂t for the next time t, predicting that its value will change by ∆x̂t = x̂t − xt−1. At time t, a new value xt is sensed, and a variation ∆xt = xt − xt−1 is actually verified (see Figure 2).

Figure 2: Computing Salience

The exogenous component at time t (EXOt) is based on the estimation error and reflects the principle that the least expected (i.e. the most surprising in terms of mismatch) is the most likely to attract attention:

EXOt = (xt − x̂t)²

If no further information is given to the emotivector, the exogenous component is the only factor contributing to salience. However, if a "searched" value st exists at time t, the endogenous component of the percept (ENDt) is computed as:

∆st = (xt − st)²
∆ŝt = (x̂t − st)²
ENDt = ∆ŝt − ∆st

ENDt reflects whether the change in the distance to the "searched" value is better or worse than the expectation, as well as the level of mismatch (see Figure 2). Unlike EXOt, which is always nonnegative, ENDt is valenced: an increase in the expected search distance is assumed negative, while a reduction is modeled by a positive value.

The combination of the exogenous and endogenous components defines the salience of the percept. However, an emotivector with a search value can also provide a qualitative interpretation of the percept, as explained in the next section.
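Before moving on, the two components can be summarized in code. The following is a minimal Python sketch of the formulas above (the names are ours; the paper does not specify how the two components are combined into a single salience value, so we return them separately):

```python
def salience_components(x_hat, x, s=None):
    """Salience components of one percept, following the formulas above.

    x_hat: value predicted for time t
    x:     value actually sensed at time t
    s:     optional "searched" value s_t
    Returns (EXO_t, END_t); END_t is None without a search value.
    """
    exo = (x - x_hat) ** 2        # EXO_t: squared estimation error
    if s is None:
        return exo, None          # exogenous component only
    ds = (x - s) ** 2             # Delta s_t: sensed distance to target
    ds_hat = (x_hat - s) ** 2     # Delta s-hat_t: expected distance
    end = ds_hat - ds             # END_t: valenced mismatch
    return exo, end
```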
Percept Sensation (Emotion Model)

Our affective model is inspired by early behavioural theories and models of emotion. We assume that emotions are conditioned responses to primary sensations (Harlow & Stagner 1933) and concentrate our model on the generation of these sensations. Emotions per se are left to be handled by the processing module of the agent.

Sensations are defined across two dimensions (Young 1961): valence and change. As such, our model considers four basic sensations: positive increase, positive reduction, negative increase and negative reduction. The emotivector estimation is used to anticipate a reward or punishment which, when confronted with the actual reward or punishment (Hammond 1970), triggers one of the four basic sensations. The intensity of each sensation is given by the endogenous component of the percept.

Thus, our model considers the following four primary sensations¹:

S+ (positive increase) If reward is anticipated and the effective reward is stronger than expected, an S+ sensation is thrown.

$+ (positive reduction) If reward is anticipated but the effective reward is weaker than expected, a $+ sensation is thrown.

S− (negative increase) If punishment is anticipated and the effective punishment is stronger than expected, an S− sensation is thrown.

$− (negative reduction) If punishment is anticipated but the effective punishment is weaker than expected, a $− sensation is thrown.

Following our nomenclature, the expected reward R̂t for time t and the sensed reward Rt at time t are computed as follows:

R̂t = ∆st−1 − ∆ŝt
Rt = ∆st−1 − ∆st

Table 1 shows the eliciting conditions for each basic sensation. Each sensation has an intensity defined by the endogenous component ENDt.

predicted   sensed      sensation
R̂t > 0     Rt > R̂t    S+
R̂t > 0     Rt < R̂t    $+
R̂t < 0     Rt < R̂t    S−
R̂t < 0     Rt > R̂t    $−

Table 1: Computing Sensation

¹ We follow Millenson's symbolism to name our sensations, as the symbols are not connoted with an exact word which, by itself, would imply a certain intensity.
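A minimal Python sketch of Table 1 follows (our formulation; ties, where Rt exactly equals R̂t, are not covered by the table and are returned here as None):

```python
def basic_sensation(ds_prev, ds_hat, ds):
    """Classify a percept into one of the four basic sensations.

    ds_prev: distance to the searched value at t-1 (Delta s_{t-1})
    ds_hat:  expected distance at t                (Delta s-hat_t)
    ds:      sensed distance at t                  (Delta s_t)
    """
    r_hat = ds_prev - ds_hat          # expected reward R-hat_t
    r = ds_prev - ds                  # sensed reward R_t
    if r_hat > 0:                     # reward was anticipated
        if r > r_hat: return "S+"     # stronger reward than expected
        if r < r_hat: return "$+"     # weaker reward than expected
    elif r_hat < 0:                   # punishment was anticipated
        if r < r_hat: return "S-"     # stronger punishment
        if r > r_hat: return "$-"     # weaker punishment
    return None                       # tie: not covered by Table 1
```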
Percept Prediction (Anticipatory Model)

The computation of the emotivector salience relies on the capacity of the emotivector to predict its next state. Before anything else, we define the model that we expect the sensed data to follow: if the signals are totally random, no prediction strategy can be evaluated for adequacy.

Model

As there is no a-priori knowledge of the signal, we follow a simple assumption: the intensity i of a signal will change by a random small amount ∆i ∈ [−ε, ε] at each discrete time step (defined by the sensor rate), for a random time slice ∆t ∈ ]0, ∆tmax], before suddenly changing to a random new value in the interval [0, 1]. In other words, the sensed value will tend to remain constant except at certain points in time.

Kalman Filtering

A possible estimator for such a signal is the Kalman filter (Kalman 1960), a set of mathematical equations that provides efficient recursive computational means to estimate the state of a process in a way that minimizes the mean of the squared error.

A Kalman filter estimating a random constant was implemented. The filter is composed of two sets of equations: the time update equations, which project the current state and error covariance estimates forward in time to obtain the a-priori estimates for the next time step; and the measurement update equations, which are responsible for the feedback, i.e. for incorporating a new measurement into the a-priori estimate to obtain an improved a-posteriori estimate. In the following equations, R is the measurement noise covariance and Q is the process noise covariance:

1. Time update equations
(a) Project the state ahead: x̂t− = x̂t−1
(b) Project the error covariance ahead: Pt− = Pt−1 + Q

2. Measurement update equations
(a) Compute the Kalman gain: Kt = Pt− / (Pt− + R)
(b) Update the estimate: x̂t = x̂t− + Kt(xt − x̂t−)   (1)
(c) Update the error covariance: Pt = (1 − Kt)Pt−
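For reference, one cycle of this scalar random-constant filter can be written compactly as follows (a minimal Python rendering of the equations above; the noise covariances given as defaults are illustrative assumptions, not values from the paper):

```python
def kalman_step(x_hat, p, x, q=1e-5, r=0.01):
    """One time/measurement update of the random-constant Kalman filter.

    x_hat, p: previous a-posteriori estimate and error covariance
    x:        new sensed value x_t
    q, r:     process and measurement noise covariances (illustrative)
    """
    # Time update: project the state and error covariance ahead.
    x_hat_minus = x_hat               # state model: random constant
    p_minus = p + q
    # Measurement update: fold the new measurement into the estimate.
    k = p_minus / (p_minus + r)       # Kalman gain K_t
    x_hat = x_hat_minus + k * (x - x_hat_minus)   # Equation (1)
    p = (1.0 - k) * p_minus
    return x_hat, p
```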
The need for the parameters Q and R to be fine-tuned initially suggested another path.

Simple Predictor

We rewrote x̂t from Equation 1 as:

x̂t = x̂t−1 · R / (Pt−1 + Q + R) + xt · (Pt−1 + Q) / (Pt−1 + Q + R)

and then made the following substitutions:

α = (Pt−1 + Q) / (Pt−1 + Q + R),  β = R / (Pt−1 + Q + R)

Noticing that α + β = 1, we rewrote x̂t as:

x̂t = x̂t−1(1 − α) + xt · α
Figure 3: Simple Predictor

Figure 3 shows this simple predictor graphically. The predictor is a function of the estimated value and the sensed value, both competing in their contribution to the prediction. This result resembles the generalized recirculation algorithm (Hinton & McClelland 1988), an algorithm that enabled back-propagation to be implemented in a more biologically plausible manner by adopting two activation phases: the minus phase, where the outputs of the system represent the expectation of the system, as a function of the standard activation settling process in response to a given input pattern; and the plus phase, where the environment is responsible for providing the target output activation. In this algorithm, the learning is essentially the delta rule (Widrow & Hoff 1960) confronting the expectation with the sensed value.

Similarly, we used the delta rule with a learning rate set to the endogenous salience, in such a way that stimuli sensed under strong sensations are more relevant in terms of prediction. Hence, ∆αt is computed as follows:

∆αt = ξt (xt − x̂t) αt

where ξt = ENDt².

Although this predictor is not as optimal as the ones that inspired it, it provides a good efficiency/adaptation trade-off that performs well in real time over unpredictable signal dimensions, and it does not require any prior fine-tuning to work.
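Putting the update rule and the delta rule together, such a predictor might be sketched as follows (our class; the initial value of α and the clamping of α to [0, 1] are assumptions the paper does not state):

```python
class SimplePredictor:
    """Exponential smoothing with a delta-rule-adapted rate alpha."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha            # adaptive rate (assumed start)
        self.x_hat = None             # current prediction x-hat_t

    def update(self, x, end=0.0):
        """Fold in the sensed value x; 'end' is END_t for this percept."""
        if self.x_hat is None:
            self.x_hat = x            # bootstrap on the first sample
            return self.x_hat
        xi = end ** 2                                 # xi_t = END_t^2
        # Delta rule: percepts sensed under strong sensations
        # contribute more to the adaptation of the rate.
        self.alpha += xi * (x - self.x_hat) * self.alpha
        self.alpha = min(max(self.alpha, 0.0), 1.0)   # assumed clamp
        self.x_hat = self.x_hat * (1.0 - self.alpha) + x * self.alpha
        return self.x_hat
```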
Emotivector Management

In this section, we discuss how the salience module selects, from among all percepts, those that are a-priori more relevant to the processing module of the agent.

Winner-Takes-All

The most immediate approach is to use a winner-takes-all strategy and select only the percept with the highest salience. Although this may be a convenient approach when processing power is a crucial factor, the most salient percept can easily hide other percepts which are just "a little less salient", but in fact more important to the processing module.

Salience Ordering

To allow more than one percept to reach the processing module at a time, an alternative strategy is salience ordering: all percepts are ordered by salience and sent to the processing module. The central processing module will then attend to the most relevant percept, then the second, and so on, until no more processing resources are available.

Although this strategy uses the full processing power of the system on which it is built, the agent will usually spend most of its processing time dealing with percepts of very low significance, unnecessarily wasting precious processing power, which is something one wants to avoid so as to allow other, more demanding processes (e.g. graphics and physics) to perform at their best.

Of course, a minimum threshold could be set, below which percepts would not be sent for processing; the remaining percepts would continue to be processed in salience order. However, how is the threshold value chosen? It depends on the context of the setting and may vary dynamically according to the situation. Another approach would be to process only the N most relevant percepts but, again, we would fall into the same fallacy. In the end, this strategy may work, but the fine-tuning it requires makes it inelegant.
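For concreteness, the two strategies just discussed might be sketched as follows (our formulation; the threshold and budget arguments are precisely the kind of hand-tuned parameters the text criticizes):

```python
def winner_takes_all(percepts):
    """Select only the percept with the highest salience."""
    return [max(percepts, key=lambda p: p["salience"])]

def salience_ordering(percepts, budget=None, threshold=0.0):
    """Order percepts by salience; optionally cut by budget/threshold.

    budget and threshold are the hand-tuned parameters whose choice
    the text argues against (context-dependent, may vary dynamically).
    """
    ranked = sorted(percepts, key=lambda p: p["salience"], reverse=True)
    ranked = [p for p in ranked if p["salience"] >= threshold]
    return ranked if budget is None else ranked[:budget]
```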
Meta Anticipation

Our approach is to use a meta-predictor, whose role is to assess the salience of its associated predictor. The meta-predictor is identical to any other predictor; the only difference is that the meta-predictor receives prediction errors, while the other predictors receive percepts.

Each time a new value is sensed, there is usually a mismatch between the prediction and the sensed value. This prediction error is fed to the meta-predictor. Based on the history of prediction errors, the meta-predictor computes the next expected error, using the same mechanism its associated predictor uses to compute its next prediction. When a new value is sensed, the prediction error is computed and sent to the meta-predictor, which compares it with its error expectation. If the prediction error is higher than the error estimated by the meta-predictor, the percept sensed by the emotivector is marked as relevant to the processing module. Otherwise, the percept is marked as non-relevant.

This emotivector management strategy has the advantage of not requiring parameter fine-tuning, and was chosen for our experiment.
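Combined with the simple predictor sketched earlier, meta-anticipation might be rendered as follows (our formulation, reusing the hypothetical SimplePredictor class from above):

```python
class Emotivector:
    """Predictor plus meta-predictor: the latter watches the former's
    errors and flags percepts that fall outside the expected error."""

    def __init__(self):
        self.predictor = SimplePredictor()   # predicts sensor values
        self.meta = SimplePredictor()        # predicts prediction errors

    def sense(self, x):
        """Return True when the percept should be marked as relevant."""
        x_hat = self.predictor.x_hat
        if x_hat is None:                        # no history yet
            self.predictor.update(x)
            return False
        error = abs(x - x_hat)                   # current prediction error
        expected_error = self.meta.x_hat or 0.0  # meta-predictor estimate
        relevant = error > expected_error        # outside the margin
        self.meta.update(error)                  # learn the error history
        self.predictor.update(x)                 # learn the value history
        return relevant
```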
Uncertainty

An interesting side effect of meta-prediction is that it provides an error margin for the estimation. In other words, when a predictor estimates a value, the meta-predictor estimates how trustworthy the prediction is. As such, meta-prediction models the concept of uncertainty.

Uncertainty allows us to add an "as expected" parameter to our model of sensations, transforming our initial set of four sensations (S+, $+, S− and $−) into a full set of nine basic sensations, where sensations such as "this percept is as bad as we expected it to be" enrich the vocabulary of sensations associated with a percept. Table 2 shows the relation between the nine-sensation model and the four-sensation model presented earlier.
expected        sensed: more R     sensed: as expected   sensed: more P
reward (R)      stronger R (S+)    expected R            weaker R ($+)
negligible      unexpected R       negligible            unexpected P
punishment (P)  weaker P ($−)      expected P            stronger P (S−)

Table 2: Extended Sensation Model
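A sketch of Table 2 in Python follows (our formulation: the meta-predictor's expected error serves as the "as expected" band, and both thresholds are assumptions):

```python
def extended_sensation(r_hat, r, margin, eps=1e-6):
    """Nine-way sensation from Table 2.

    r_hat:  expected reward R-hat_t (negative values mean punishment)
    r:      sensed reward R_t
    margin: error expected by the meta-predictor ("as expected" band)
    eps:    below this, the anticipated reward counts as negligible
    """
    expected = "R" if r_hat > eps else ("P" if r_hat < -eps else "0")
    if abs(r - r_hat) <= margin:                  # sensed as expected
        return {"R": "expected R", "0": "negligible",
                "P": "expected P"}[expected]
    if r > r_hat:                                 # sensed more reward
        return {"R": "stronger R (S+)", "0": "unexpected R",
                "P": "weaker P ($-)"}[expected]
    return {"R": "weaker R ($+)", "0": "unexpected P",  # more punishment
            "P": "stronger P (S-)"}[expected]
```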
Experiment

Believability is a difficult concept to assess experimentally. It is usually measured through questionnaires evaluating the satisfaction of the subjects and relating the satisfaction level to the suspension of disbelief provoked by the experiment. Although a questionnaire is invaluable, we designed an interactive task where the user's success depends on the consistent and believable behaviour of a synthetic character, in such a way that the observation of the task itself provides less subjective measurements of believability.

We created a word-puzzle game in which Aini, a synthetic flower built with our emotivector model, helped the user to uncover a four-letter word solely by reacting to her actions. To construct the word, the player must use a set of letter-boxes, each one with an alphabet letter drawn onto it. The word is constructed by placing the boxes onto wooden platforms representing the word letters and their relative positions in the word, a tridimensional version of the hangman dashes. Only some of the letter-boxes are relevant; the others act as distractors. To allow for a more intuitive interaction with the letter-boxes, the simulation is fully tridimensional and physics-based. Figure 4 shows the application setting.

Figure 4: Experiment Application
Emotivector Control

Our first use of the emotivectors was to control Aini's focus of attention, a task that only required the exogenous component. We provided Aini with a 5 × 5 grid of sensors (and associated emotivectors) measuring the distance to the nearest objects in Aini's field of view. The strongest salience designates the target of Aini's gaze, and its intensity controls the speed of movement. This approach automatically implemented the "casual look around" as well as the "quickly look at something new" behaviours.
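As an illustration, gaze selection over such a grid could be sketched as follows (our formulation; the grid size comes from the text, everything else is assumed):

```python
def select_gaze_target(saliences):
    """Pick the gaze target from a 5x5 grid of emotivector saliences.

    saliences: 5x5 nested list, one (exogenous) salience per sensor.
    Returns ((row, col), speed): the most salient cell drives the gaze,
    and its intensity sets the speed of the head movement.
    """
    best = max(
        ((row, col) for row in range(5) for col in range(5)),
        key=lambda rc: saliences[rc[0]][rc[1]],
    )
    return best, saliences[best[0]][best[1]]
```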
Our second use of the emotivector was to monitor the progress of the word-completion task. A single emotivector was regularly fed with a measure of the completeness of the task. A search value, corresponding to the task being completed, activated the endogenous component. When salient, the sensations created by the emotivector triggered different expressive behaviours.

Consider the example of a user trying to uncover the word "LIFE". Initially, Aini is in a neutral state. Because the user has been playing for quite a while without trying any letter, the margin of prediction error has dropped, and Aini is fairly sure nothing is going to happen. Suddenly, the user places her first guess ("L") in a wrong placeholder. As a result, Aini expresses an "unexpected P" sensation: it stops moving and lowers its head. Aini expects more punishment to come but is not sure about it. Seeing the negative reaction of the flower, the user removes the letter "L" from the word. As the completeness increases by a significant amount, Aini expresses a "weaker P" sensation: it raises its head and waves it encouragingly. Notice that if the user placed the "L" in the same wrong place again, Aini would react differently from the first time, as the conditions would have changed.

Now, the user places the "L" in the correct position. As the emotivector is still under great uncertainty, an "expected R" sensation is triggered: the agent now looks confident about the user's capabilities. Afterward, the user places the "I" in the right position. As the margin of error has now diminished, the value falls outside the predictive margin and triggers a "stronger R" sensation, expressing total bewilderment.

This example shows the richness of situations that can be obtained from the emotivector's basic sensations alone, and how perception history and timing trigger different basic sensations in response to the same user action.
Evaluation

To evaluate our model, we asked users to play the word puzzle with four different synthetic characters (presented in different orders), sharing the same appearance but exhibiting distinct behaviours. Two control characters, with idle and random behaviour respectively, confirmed that the task is impossible to complete without consistent help. Two other characters evaluated the adequacy of our model: one used our emotivector control; the other implemented an approach used by the current generation of computer games, such as Fable² and Oblivion³. In our game, it translates to: look at the nearest object or the latest word change, and use a positive or negative expression depending on the appraisal of the action.

After interacting with each character, each subject answered a questionnaire evaluating the following aspects: the smoothness of the animation, the naturalness of the behaviour, the level of expectation of the subject regarding the provided behaviour, the similarity with the behaviour of a person in the same situation, the ease of understanding the displayed attention and emotion, and the general impression left by the synthetic character.

² Fable: The Lost Chapters, Lionhead Studios, 2005.
³ The Elder Scrolls IV: Oblivion, Bethesda Softworks, 2006.
A total of 280 puzzles were played by male and female subjects from 5 to 79 years old, with different levels of familiarity with computers. No subject finished the game with the control characters: although a brute-force approach would have been possible, the presence of an inconsistent character led the user to quit. All subjects managed to finish the task with the emotivector-based character, but only 20.6% of the subjects were successful with the game-based approach. Unexpectedly, the game-based behaviour induced the feeling of "being cheated on" and made the user quit. Only users experimenting with one letter at a time, while keeping all other letters far away, were successful.

To reach a better understanding of the results, we performed a two-dimensional HOMALS on the data from the questionnaires, which accounted for 73.6% of the variance. From the HOMALS quantification, two variables were identified as totally unrelated to believability: "smoothness" and "previous knowledge of the word". A Fisher test (exact sig. 0.678) reinforced the finding that the real challenge is to understand the expression of the synthetic character. Indeed, even the younger subjects were able to discover unknown words: 93.7% of subjects rated "guessing a known word without help" as very difficult (5 on a 5-point Likert scale), while only 6.4% of the subjects considered "finding an unknown word with help" difficult (4 or 5 on a 5-point Likert scale).

The first HOMALS dimension indicated that a subject who succeeded at the task generally enjoyed the experience and found attention and emotion easy to perceive. The second dimension suggests that when a subject evaluates emotion recognition or task difficulty as "extreme" (in the sense of very good or very bad), this evaluation is reflected in a more "extreme" assessment of believability.

During the experiments, two unexpected results were found. First, the synthetic "vision" based on the exogenous component provided the user with a natural way to interact with the agent (e.g. by waving a box to attract attention to it). Second, the endogenous component of the task-completeness emotivector allowed a rich and non-repetitive behaviour of the synthetic character that seemed to account for the past history of interaction in a meaningful manner.

Believability seems to be influenced by all evaluated aspects of the experience except the smoothness of the animation. This is congruent with the fact that more complex animation does not necessarily mean more believable behaviour. Our results confirm that both the focus of attention and the emotional state play a more important role in the definition of believability.
Conclusion and Future Work

In this paper we discussed how a simple anticipatory mechanism, the emotivector, can be used to produce believable behaviour. We described the emotivector mechanism, inspired by theories of attention and emotion, and how it was successfully used in a concrete interactive experiment.

Our model does not intend to substitute any current approach but rather to complement the agent architecture for life-like characters: the salience module exists "outside" the basic agent components and, as such, can be added to or removed from an agent at a low cost. Another approach to emotivector integration in more complex models is to substitute each sensor with a "sensor + emotivector" combination. We are currently using this approach to integrate emotivectors into agents with higher-level cognitive theories of emotion and anticipation implemented at the processing level, to model cautiousness.
References

Bates, J. 1994. The role of emotion in believable agents. Technical report, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.

Blumberg, B. 2002. D-Learning: What learning in dogs tells us about building characters that learn what they ought to learn. In Exploring Artificial Intelligence in the New Millennium. Morgan Kaufmann Publishers.

Breazeal, C., and Scassellati, B. 1999. A context-dependent attention system for a social robot. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence.

Gratch, J., and Marsella, S. 2003. Modeling coping behavior in virtual humans: Don't worry, be happy. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems. ACM Press.

Hammond, L. 1970. Conditioned emotional state. In Physiological Correlates of Emotion. Academic Press.

Harlow, H., and Stagner, R. 1933. Theory of emotions. Psychological Review.

Hinton, G., and McClelland, J. 1988. Learning representations by recirculation. In Anderson, D., ed., Neural Information Processing Systems. American Institute of Physics.

Kalman, R. E. 1960. A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal of Basic Engineering 82(Series D):35–45.

Martinho, C.; Miceli, M.; Dias, J.; Paiva, A.; and Castelfranchi, C. 2005. Anticipation and emotion. Technical report, MindRACES project.

Müller, H., and Rabbitt, P. 1989. Reflexive orienting of visual attention: time course of activation and resistance to interruption. Journal of Experimental Psychology: Human Perception and Performance 15.

Picard, R. 1997. Affective Computing. MIT Press.

Plutchik, R. 1991. Emotion and evolution. In Strongman, K. T., ed., International Review of Studies of Emotions, vol. 1. Chichester: Wiley.

Posner, M. 1980. Orienting of attention. Quarterly Journal of Experimental Psychology 32.

Rosen, R. 1985. Anticipatory Systems. Pergamon Press.

Thomas, F., and Johnston, O. 1994. The Illusion of Life. Hyperion Press.

Widrow, B., and Hoff, M. 1960. Adaptive switching circuits. IRE WESCON Convention Record.

Young, P. 1961. Motivation and Emotion. Wiley.