Dynamic Difficulty Adjustment in Computer Games
David Michael Jordan Chang
School of Electronics and Computer Science
University of Southampton
dmjc1g10@ecs.soton.ac.uk
ABSTRACT
Dynamic difficulty adjustment (DDA) is an emergent technology that seeks to adapt the difficulty level of a computer
game whilst it is being played, in order to cater for the abilities of a specific player. Due to implementation cost and
complexity, DDA has yet to advance from research topic to
accepted convention in the world of game development. For
the first time in the literature, this paper will review current areas of DDA research, and provide a critical analysis
of each technique.
Keywords
Computer Entertainment, Video Games, AI, Dynamic Difficulty Adjustment.
1. INTRODUCTION
In general, video-games follow a ‘difficulty curve’; that is,
the game becomes more difficult as the player progresses, so
that an appropriate level of difficulty is experienced by the
player as they learn and improve at the game. The nature
of the difficulty curve, i.e. the magnitude and frequency
of each difficulty increment, the initial difficulty level,
etc., has historically been dictated by the player's initial
selection of a static difficulty. These difficulty levels
usually conform to the convention of Easy, Medium, Hard,
and sometimes additionally, Very Hard.
The problem with such a paradigm is that both novice and
expert players alike can become frustrated or bored by the
pre-defined difficulty curve. The discrete difficulty levels offered do not consider the minutiae of individual player ability levels, nor the various rates at which people learn. An
alternative approach is that of Dynamic Difficulty Adjustment, where the difficulty level of the game is adjusted dynamically in order to suit the individual player.
The difficulty can be adjusted in many different ways, and the
method chosen will largely be dictated by the type of game
under consideration. For example, in a First Person Shooter
(FPS) game, the number, strength and AI of enemies and
the frequency of beneficial pick-ups such as weapons and
health may be modified; in a platforming game, the structure of the levels that the user has to navigate may be modified; in a strategy game, the tactics and nature of the opposition’s AI may be modified; in a small multiplayer game,
a struggling player may be helped or an outstanding player
handicapped; and in a large online multiplayer game, the
teams may be rebalanced.
In this paper, I will review the current research issues that
are relevant to the various approaches to DDA, as categorised by the method of adjustment. I will review the results of the research that is present in the literature, and
recount the evaluation studies performed. Furthermore, for
each of the approaches discussed, I will provide my own
critique, and consider the future of such technologies. Finally, I will conclude with a discussion of the forefront of
DDA research: the development of a general game adaptation mechanism that is transferable across game genres.
2. TERMINOLOGY
Throughout this paper, we will use the terms online to mean
‘whilst the game is being played’ and offline to mean ‘before/after the game has been played’.
3. TYPES OF DYNAMIC DIFFICULTY ADJUSTMENT, BY ADAPTATION TYPE
3.1 DDA by means of automatic level generation
Platform games are games where the player controls an
avatar, and the overall objective of the game is to get from
point A to point B, usually by performing actions such as
jumping over gaps, avoiding enemies, collecting items etc.
The archetypal platform game is Nintendo’s Super Mario
Bros, where the player controls an avatar known as Mario,
who must get from A to B in order to reach the princess,
collecting coins as he goes. In [16], the authors develop a
system for automatic content generation for platform games.
The authors use Markus Persson’s Infinite Mario Bros as a
base. Infinite Mario Bros is a public domain clone of the
original Super Mario Bros that uses a technology known as
Procedural Content Generation (PCG). PCG is a technology that enables workers in the media industry to generate
content, such as graphical objects, for films or video-games
automatically, thereby saving the time and cost of having
artists design and place the content manually.
Traditionally, PCG has been used in the video-games industry offline; that is, PCG was used to generate content
such as levels before the user begins playing. Over time, the
reasons for doing so have shifted — the earliest computer
games that were developed were extremely constrained by
memory limitations, and so as a means of minimising the
stored size of a game, PCG was used to algorithmically produce content, such as maps, at run-time. An example of
this is The Sentinel, which managed to contain 10,000 different levels within just 48 kilobytes by using pseudo-random
number generators and seed values to create the levels dynamically [10]. In recent times, however, PCG is used offline
to create content for games simply as a means of saving time
and costs for the development team, as there is very little
constraint put on a modern game development company by
storage requirements. A modern example of such a PCG
system is the middleware SpeedTree, which generates large
numbers of trees procedurally, and is used in many modern
computer games including The Elder Scrolls IV: Oblivion
and Batman: Arkham Asylum [4].
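The memory trick behind this early use of PCG is easy to demonstrate. The following is a toy sketch of seed-based generation (the level representation is a hypothetical one for illustration, not The Sentinel's actual algorithm):

```python
import random

def generate_level(seed: int, width: int = 64) -> list[int]:
    """Deterministically derive a level layout from a seed.

    Only the seed needs to be stored on disk; the full level is
    reconstructed identically on every run.
    """
    rng = random.Random(seed)  # a local RNG, so global state is untouched
    # Hypothetical representation: terrain height per screen column.
    return [rng.randint(1, 10) for _ in range(width)]

# Thousands of distinct levels can be 'stored' as a handful of integers.
assert generate_level(42) == generate_level(42)  # same seed, same level
```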
A more recent innovation is the use of PCG online to generate content for video-games whilst they are being played.
In Infinite Mario Bros, levels are created automatically for
the player whilst they play, giving the user a potentially unending game that is different each time that they play it.
In [16], the authors modify Infinite Mario Bros such that
the content generated procedurally is dynamically adapted
to the specific user who is playing the game.
Firstly, the authors collected data from 327 human players
whilst they played Infinite Mario Bros. The data collected
can be broken into the following three categories:
1. Controllable features of the game, such as the number
and average width of gaps.
2. Gameplay statistics, such as the number of jumps,
deaths, kills, sprints etc. that the player performs.
3. The player’s subjective experience, in the form of numerical values attributed to each of fun, challenge and
frustration.
Next, using this data, the gameplay features that most affect the user’s perception of the game were determined using
a machine learning tool known as a single-layer perceptron. Then, multi-layer perceptrons (MLPs) were trained
to learn, given a set of controllable features of a level (number of gaps etc.) and gameplay specific features (number of
jumps/deaths/kills performed etc.), what subjective experience a player of that level would have. So, once trained,
the MLPs could be fed data on any given level, and provide information on how fun, challenging and frustrating a
user was likely to find that level. For more information on
perceptrons, see [15].
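The authors do not publish their implementation, but the training setup can be sketched roughly as follows, here using scikit-learn's MLPRegressor with random placeholder data standing in for the collected feature vectors and fun ratings:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder data in the spirit of [16]: each row holds controllable
# level features (e.g. gap count, mean gap width) followed by gameplay
# statistics (e.g. jumps, deaths, kills) for one play session.
X = np.random.rand(327, 5)    # one row per recorded session
y_fun = np.random.rand(327)   # reported 'fun' rating per session

# One MLP per subjective dimension (fun, challenge, frustration);
# only the 'fun' model is shown here.
fun_model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000)
fun_model.fit(X, y_fun)

# Once trained, the model estimates how fun a player would find a
# level described by a new feature vector.
predicted_fun = fun_model.predict(np.random.rand(1, 5))
```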
Using these MLPs, the authors implemented their adaptive
game as follows. The first level that a user plays is randomly generated; then, the gameplay-specific features gathered from that play session, together with the controllable
features of that level, are used as input to the MLPs. The
MLPs then decide, given the nature of the last level played
and the way that the user played it, what kind of level the
user was likely to find most fun. That level is then generated for the player, and the process repeats. So, the more
levels that the user plays, the better trained the MLPs will
be, and so the better they will be at determining levels that
the user will find fun.

Figure 1: Fun values of optimised vs. random levels for the more human-like agent. (From [16])

Figure 2: Fun values of optimised vs. random levels for the less human-like agent. (From [16])
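The per-level selection step of this loop can be sketched as a simple search over candidate levels (an illustrative stand-in for whatever optimisation procedure the authors actually used; the feature names and counts are assumptions carried over from the previous sketch):

```python
import numpy as np

def choose_next_level(fun_model, gameplay_stats, n_candidates=200):
    """Return the controllable features with the highest predicted fun.

    fun_model      -- a trained regressor (see the previous sketch)
    gameplay_stats -- statistics from the player's last session
                      (assumed here to be three values)
    """
    best_features, best_fun = None, -np.inf
    for _ in range(n_candidates):
        # Hypothetical controllable features: gap count, mean gap width.
        level_features = np.random.rand(2)
        x = np.concatenate([level_features, gameplay_stats]).reshape(1, -1)
        fun = fun_model.predict(x)[0]
        if fun > best_fun:
            best_features, best_fun = level_features, fun
    return best_features  # these parameters are fed to the level generator
```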
To evaluate their mechanism, the authors used two different
AI agents that could play through the levels unaided. Each
AI agent played 100 levels — the first 50 were adapted dynamically by the constructed system in order to maximise
the predicted fun values for the agent; the second 50 were all
generated randomly. A comparison showed that the adaptation mechanism was able to generate levels with higher
predicted fun values than levels that were generated randomly, as demonstrated by Figures 1 and 2.
Interestingly, the AI that plays in a more human-like style
(jumping and firing only when necessary) had levels generated for it that were more fun than the levels generated
for the AI that played in a more mechanistic fashion (constantly jumping and firing). This is likely because the adaptive mechanism used was trained by human players [16].
3.1.1 Critique of the approach of Shaker et al.
The work presented in [16] is an important step towards dynamic level generation for platform games. Their research
does, however, lack an evaluation with human users. Although the evaluation with AI players demonstrates that the
levels that are dynamically generated have higher predicted
fun levels, a human evaluation would be a good confirmation
of their results.
The techniques used are of particular interest, as they rely
very little on domain-specific knowledge. That is, they could
very easily be transferred to games other than Infinite Mario,
by identifying the controllable features of the game levels,
and determining through user testing which features contribute most to the user experience. Then, levels could be
generated using MLPs trained to predict what levels a user
was likely to find most fun, based on the current gameplay
statistics and previous levels played.
Level structure plays a role of paramount importance in the
overall user experience of a platforming game, and so it is
unclear how much adaptive level generation would affect the
user experience of a different type of game, such as a First
Person Shooter (FPS), where level structure is less important. This is a potentially interesting topic for further research.
3.2 DDA by means of AI modification
In many games, the difficulty level that the user perceives
is directly influenced by the manner in which his/her opponents play. This is especially true of strategy games, where
the user is typically pitted against an individual AI agent,
and the sole objective of the game is to advance by defeating
the opposition. Popular examples of such games include the
Civilisation and Command and Conquer series.
In any game, an AI will function according to a pre-defined
algorithm; the relative ‘strength’ of the AI will be determined by the quality of the underlying algorithm. The difficulty
selected by the player will then determine which algorithm
their AI opponent will use during play. The problem of dynamically adjusting the difficulty of an AI opponent, then,
can be formulated as the problem of selecting the appropriate algorithm for the AI to use against a specific player,
online, and according to the abilities of that player. There
have been several different methods employed in the literature to adjust the nature of game AI dynamically, and I will
present them here, together with critiques of the approaches.
3.2.1 Dynamic scripting
In [17], Spronck et al. use what they term “dynamic
scripting”. Dynamic scripting is a form of reinforcement
learning (see [15]) that has been adapted to be efficient
enough for use as an online learning technique in games.
A rulebase of scripts that control the behaviours of in-game
opponents is maintained, and every time a new opponent is
generated, its corresponding script is re-generated according to the rules in the rulebase. As the game progresses, the
weights of the rules in the rulebase are adjusted, according
to the behaviour of the player; as a result, different scripts
will be generated for the various enemy types dynamically.
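In rough outline, such a weighted rulebase might look like the sketch below (the rule names, weight floor and update rule are illustrative assumptions, not Spronck et al.'s implementation):

```python
import random

class RuleBase:
    """Minimal dynamic-scripting sketch: each rule carries a weight,
    scripts are assembled by weighted sampling, and weights are nudged
    up or down according to how the generated opponent performed."""

    def __init__(self, rules):
        self.weights = {rule: 1.0 for rule in rules}

    def generate_script(self, size=3):
        rules = list(self.weights)
        weights = [self.weights[r] for r in rules]
        return random.choices(rules, weights=weights, k=size)

    def update(self, script, reward):
        # reward > 0 when the opponent challenged the player at the
        # right level; reward < 0 when it was too weak or too strong.
        for rule in script:
            self.weights[rule] = max(0.1, self.weights[rule] + reward)

rb = RuleBase(["charge", "flank", "heal_self", "ranged_attack"])
script = rb.generate_script()       # script for a newly spawned opponent
rb.update(script, reward=0.25)      # adjust weights after the encounter
```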
3.2.2 Critique of Spronck’s dynamic scripting approach
A disadvantage of this technique is that, as the game advances, the rulebase will grow very large. This makes the
rules more error-prone, hard to build and hard to maintain.
Furthermore, adversary performance is limited by the design
of the rule that generates the most intelligent agent, which
may still not present a significant challenge to very skilled
players [1].
3.2.3 Genetic algorithms
Other approaches to dynamic AI adjustment include use
of machine learning to build intelligent agents, where genetic algorithm techniques (see [5]) are employed to keep
alive those agents that most closely match the player’s abilities. In [3], Demasi and Cruz use online co-evolution (see
[18]) to speed up the learning process — pre-defined agents
with good features, constructed either with offline training
or manually, are used as parents in the genetic operations,
so that they bias the evolution.
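A minimal sketch of how seeded parents can bias such an evolution is given below (the genome representation, fitness argument and selection scheme are assumptions for illustration, not Demasi and Cruz's code):

```python
import random

def crossover(a, b):
    # Hypothetical genome: a list of numeric behaviour parameters.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(population, seeded_parents, fitness, generations=50):
    """Co-evolution sketch in the spirit of [3]: hand-built or
    offline-trained agents with known-good features are always kept
    in the breeding pool, biasing and speeding up the evolution."""
    for _ in range(generations):
        pool = population + seeded_parents
        pool.sort(key=fitness, reverse=True)          # fittest first
        parents = pool[: max(2, len(pool) // 2)]
        population = [crossover(random.choice(parents),
                                random.choice(parents))
                      for _ in range(len(population))]
    return population
```

Here, fitness would measure how closely an agent's behaviour matches the current player's ability, so that the surviving agents remain challenging but beatable.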
3.2.4 Critique of Demasi and Cruz’s genetic algorithm approach
This is an interesting approach, but it also has its limitations. There will not be pre-existing models designed for very
skilled players, or players with very uncommon behaviour,
so the rates of learning to accommodate such players will be
very slow. Furthermore, the ability of the in-game agents
can only increase — there is no possibility of regression.
So, if a player’s skill decreases (e.g. if they haven’t played
the game for a long time) then the agents will not be able
to correspondingly regress. To ensure that agents cover all
skill levels, they must begin the evolution at the very easiest
level — this also means that it will take a long time for them
to evolve to the level of a skilled player, who could become
bored in the process [1].
3.2.5 Adaptive agents
Much work has been done in the area of dynamic AI adjustment by Olana Missura and Thomas Gartner. This work
began in 2007, when Missura formalised the problem of an
adaptive agent in the context of the popular strategy game
Connect Four [11]. Then, in 2008, Missura and Gartner
created an adaptive agent for the playing of Connect Four,
using an algorithm that they named “Adaptive Mini-Max”
(AMM) [12].
AMM is a modified version of a pre-existing algorithm known
as Mini-Max, that uses a game-tree to make its decisions
about which move to make on any given turn. This means
that a directed tree of available actions is calculated, where
the nodes are possible game states and the edges are possible moves, and the tree is investigated to determine which
available move is optimal. For reasons of efficiency, Mini-Max only investigates a sub-tree of the available game tree,
and then decides which move to make based on a ranking of
the available moves. The value of Mini-Max’s search depth
(i.e. how much of the game tree it investigates at each turn)
determines how well it performs. For the purposes of the
paper, this value was set such that Mini-Max plays as well
as possible while still taking no more than several seconds
per move.
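A depth-limited Mini-Max of this kind can be sketched as follows, assuming a game-state object that exposes moves(), apply(), is_terminal() and evaluate() (all hypothetical names standing in for Connect Four's actual rules):

```python
def minimax(state, depth, maximising):
    """Depth-limited Mini-Max: examine only a sub-tree of the game
    tree and back up a heuristic value for the position."""
    if depth == 0 or state.is_terminal():
        return state.evaluate()  # heuristic value of the position
    values = [minimax(state.apply(m), depth - 1, not maximising)
              for m in state.moves()]
    return max(values) if maximising else min(values)

def best_move(state, depth):
    # Rank the available moves by their backed-up values and play the best.
    return max(state.moves(),
               key=lambda m: minimax(state.apply(m), depth - 1, False))
```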
AMM is a modified version of Mini-Max that, instead of
always taking the optimal move based on its investigation of
the game tree, first evaluates the moves that were available
to its opponent. AMM does this by investigating a sub-tree
of the game tree of choices that its opponent could have
made, just as it would do for itself on its own turn. The
actions available to the opponent are ranked in terms of
optimality, and the actual move that they made is noted.
AMM does this throughout the game and, at each turn,
calculates an average of the rankings of all of the moves
made by its opponent; it is this average that enables AMM
to estimate the ability of its opponent. Then, at each of
AMM’s turns, AMM will choose to make not the optimal
move, but rather the available move whose ranking is closest
to the average ranking of the opposition’s moves.

Figure 3: The results of 10 episodes of 1000 games played by AMM and the non-adaptive algorithms. (From [12])
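Following this description, the core of AMM's move selection might be sketched as below, reusing the minimax routine above (for brevity the sketch uses a single value convention throughout; a real implementation must be careful about whose perspective evaluate() reflects):

```python
def rank_of_move(state, depth, move_made):
    """Rank (0 = best) of `move_made` among the moves available in
    `state`, judged by the same depth-limited search; called after
    each opponent move to build up their rank history."""
    ranked = sorted(state.moves(),
                    key=lambda m: minimax(state.apply(m), depth - 1, False),
                    reverse=True)
    return ranked.index(move_made)

def amm_move(state, depth, opponent_ranks):
    """Play not the best-ranked move, but the move whose rank is
    closest to the average rank of the opponent's observed moves."""
    ranked = sorted(state.moves(),
                    key=lambda m: minimax(state.apply(m), depth - 1, False),
                    reverse=True)
    if not opponent_ranks:       # no information yet: play the best move
        return ranked[0]
    target = sum(opponent_ranks) / len(opponent_ranks)
    index = min(range(len(ranked)), key=lambda i: abs(i - target))
    return ranked[index]
```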
In order to test the quality of AMM’s adaptive mechanism,
AMM was pitted against four other Connect Four playing
agents, which I list here in order of ability: Naive, Simple,
Mini-Max, Optimal.
When playing against each of these agents, AMM demonstrated a good ability to adapt to the different agent ability
levels, as can be seen in Figure 3. It can be observed from
these graphs that AMM won approximately 50 percent of
the time when playing against all but the Optimal agent.
The reason that AMM was not able to adapt to the ability
of the Optimal agent is that AMM is fundamentally based
on Mini-Max, and so cannot perform any better than Mini-Max, which is a weaker agent than Optimal.
Finally, an evaluation of AMM’s “fun-factor” and ability to
adapt against human players was performed. The study
found that users preferred playing against AMM over the
non-adaptive agents; this is most likely because the testers
quickly figured out how easy/difficult a non-adaptive agent
was, and then tired of it. The users chose to spend much
more time playing against the adaptive agent [12].
Additionally, it was seen that AMM adapted very well to
the different skill levels of the users. Figure 4 shows that
there was very little correlation between a player’s skill level
and the rate at which they won when playing against AMM.
Comparing this to the graph of player skill level versus
winning rate against the non-adaptive Mini-Max in Figure 5,
one can see that there is a much stronger correlation present
when users play against a non-adaptive agent. From this we
can infer that AMM successfully adapted to the players’ skill
levels.

Figure 4: Player’s winning rate (X) vs. skill level in the games with AMM. The correlation coefficient is just 0.04. (From [12])

Figure 5: Player’s winning rate (X) vs. skill level in the games with MM. The correlation coefficient is 0.32. (From [12])
3.2.6 Critique of Missura’s adaptive agent approach
The basic principles behind the algorithm produced in this
study, AMM, are of interest to researchers in DDA for AI
in strategy games. The general tactic of ranking the opposition’s move history, and choosing an appropriate next move
based on this information, is one that is transferable across
all games in the Turn-Based Strategy genre. The specific
ranking method is, however, domain-specific, as it is only
applicable to a game of Connect Four. Thus, if the strategy
employed by the authors is to have further application, new
heuristics for ranking need to be devised for the target game(s).
A further disadvantage of the algorithm produced is that
it can only play as well as Mini-Max, which is less than
optimal. As a result, the adaptive properties of AMM break
down when it is pitted against an opponent with greater
ability than Mini-Max. Further work could look at ways to
improve the performance of AMM; perhaps by increasing the
depth to which it investigates the game tree at each move.
3.3 DDA by means of level content adjustment
The next category of DDA that I shall relate is that of dynamic level content adjustment. By level content, I mean
items that can be placed within a virtual game environment
that can interact with the player, or be interacted with by
the player. For example, in the context of an FPS-style
game, such content may include, but is not limited to: enemies, health pickups, ammunition and weapons.
Important work towards achieving DDA systems that implement the adjustment of level content has been performed by
Robin Hunicke [7] [6]. In their 2004 paper, ‘AI for Dynamic
Difficulty Adjustment in Games’, Hunicke and Chapman describe the development of a system that they call “Hamlet”.
Prior to the work of Hunicke and Chapman, two approaches
for generating content-oriented DDA systems had been attempted. Pfeifer describes a system that utilises the manual
annotation of game tasks and obstacles with information
about their difficulty, such that they may be controlled dynamically [14], and Kennerly proposes a system that employs data-mining and offline analysis in order to dynamically regulate level content [9].
Hamlet is a system that is built on top of Valve’s Half-Life
game engine, and uses techniques drawn from Inventory
Theory and Operations Research to do the following:
1. Monitor game data with statistical metrics
2. Predict the player’s future state using this data
3. Intervene when an undesirable but avoidable state is
predicted
Like much of the literature on DDA, the authors employ
the notion of ‘Flow’, as developed by the prominent psychologist Mihaly Csikszentmihalyi [2]. The formal aim of
the DDA system developed is to maintain players in what is
known as the “Flow Channel”. This is the state of complete
immersion that one experiences when engaging in an activity that is neither too difficult, such that it evokes anxiety or
frustration, nor too easy, such that it evokes boredom (see
Figure 6).

Figure 6: Flow diagram, displaying how the challenge that a game provides should be suited to the skill level of the player in order to keep them in the “Flow Channel”. (From [7])

Hamlet seeks to maintain the player’s state of ‘Flow’ by
identifying when they are “flailing”, i.e. when their available
resources are failing to meet the current demands [7], and
then intervening with an appropriate action.
An action can be reactive or proactive. Reactive actions involve the adjustment of parameters that pertain to game
content that the user is currently encountering, such as the
health, strength and accuracy of an attacking enemy; proactive actions are those that adjust game elements that are
not immediately observed by the player, such as the type,
spawning order, health, accuracy and item load of enemies
yet to be encountered.
Game policies are combinations of actions and cost estimations. A cost estimation is a value attributed to the cost of a
given action, calculated by considering observations such as
how much progression the player has made, how often they
have died, where they currently are in the level, how often
they have repeated the encounter and, most importantly,
how often the system has intervened in the past. By composing game policies of both actions and cost estimations,
the authors hoped to produce a system that would dynamically alter the game in a way that is responsive to individual
play styles, without repeatedly intervening in the same way,
such that the player’s suspension of disbelief could remain
intact [7].
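To make the idea of cost-driven intervention concrete, the following is a minimal sketch of such a policy selection (an illustration of the concept rather than Hamlet's actual implementation; the action names, state fields and cost weights are hypothetical):

```python
def choose_intervention(actions, player_state):
    """Apply the cheapest viable intervention, where cost estimates
    penalise recently repeated interventions so the player does not
    notice a pattern and lose their suspension of disbelief."""
    def cost(action):
        c = action["base_cost"]
        c += 2.0 * player_state["times_used"].get(action["name"], 0)
        return c

    # Consider only actions that push the player back towards 'Flow'.
    viable = [a for a in actions
              if a["helps_player"] == player_state["struggling"]]
    return min(viable, key=cost)

actions = [
    {"name": "spawn_health_pack", "base_cost": 1.0, "helps_player": True},   # reactive
    {"name": "weaken_next_wave",  "base_cost": 0.5, "helps_player": True},   # proactive
    {"name": "buff_next_wave",    "base_cost": 0.5, "helps_player": False},
]
player_state = {"struggling": True, "times_used": {"weaken_next_wave": 1}}
print(choose_intervention(actions, player_state)["name"])  # spawn_health_pack
```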
In a follow-up to this paper, Hunicke evaluated the effectiveness of Hamlet using a group of human testers. In this
work, they discovered that their system was able to increase
the enjoyment levels of experienced ‘expert’ game players,
though there was no significant correlation found between
game enjoyment and game adaptation for ‘novice’ players
[6].
3.3.1 Critique of Hunicke and Chapman’s work
Though the work of Hunicke and Chapman has been an important first step in the direction of developing a system for
DDA by means of level content adaptation, the ineffectiveness of their system on players who are not ‘experts’ is, I
believe, a serious flaw, as a large proportion of the game
playing population are not expert players. A system that
is also effective on novice and average players would have
much more commercial value.
Interestingly, it is clear that the Hamlet system is capable of
scaling to game types other than those of the FPS genre; indeed, wherever there is a clearly defined system of adjustable
level content, a system such as Hamlet would be appropriate.
It would be of great interest to see similar DDA techniques
applied to Role Playing or Fantasy games, where the types
of gameplay elements encountered are likely to vary beyond
just enemies and items. For example, variables such as the
nature and type of spells available to the player could be
adjusted in such games.
4. THE FOREFRONT OF DDA RESEARCH: GENERALISATION OF AN ADAPTATION MECHANISM
Building on their previous work, as described in section 3.2,
Missura and Gartner have attempted to address the limitation of domain-specific requirements by formalising DDA
as an abstract learning problem [13]. In their paper, they
aim to make a universal mechanism for DDA that does not
rely on heuristics, thus making it more transferable between
different game types.
The algorithm they produce, POSM, is compared by the
authors, together with another state-of-the-art DDA algorithm, to the ‘best static difficulty system chosen in hindsight’, BSIH. They found that POSM performs almost as
well as BSIH, and sometimes better.
In [8], Missura et al. build further on their generalised DDA
algorithm, POSM, to implement it in the strategy games of
Checkers and Chinese Chess. Implementing the algorithm
in these games and evaluating against both human and autonomous players, the authors discerned that, in most cases,
POSM adjusts difficulty to suit specific players; however, the
mechanism fails when pitted against players who are either
very random in their choices of moves, or very smart.
4.1 Analysis
Based on the studies in [8], it seems likely that POSM could
be used to scale the difficulty level of any Turn-based Strategy game where an AI agent is used. Amongst such games,
it is not unusual to have ‘friendly’ AI agents, as well as
adversarial ones. It would be extremely interesting to see
how POSM could be used to adapt the behaviours of both
friendly and adversarial agents, in order to help to keep the
player engaged.
POSM has yet to be implemented in a genre of games outside
of Turn-based Strategy; however, it will be very interesting
to see how well Missura and Gartner’s algorithm transfers
to alternative game paradigms.
5. CONCLUSION
Dynamic difficulty adjustment is a technology that is in its
infancy; as such, there are many teething problems that need
to be addressed if the technology is to become widespread.
Such problems include the need to overcome the requirements of domain-specific information in DDA techniques,
the need to develop DDA mechanisms that are both efficient and cost-effective to implement, and, importantly, the
need to produce DDA mechanisms that can be shown to
reliably increase a player’s enjoyment of a game.
It is at present unclear whether the fledgling techniques discussed in this paper will act as the genesis of a new era
for computer game development, where the selection of a
game’s difficulty level is no longer a concrete decision that
is in the hands of humans, but rather a dynamic process,
that allows for automatic adjustment to satisfy individual
players’ needs; or whether the costs and limitations of DDA
techniques will leave such ideas confined to the world of research, without taking hold in real-world application. Any
emergent technology must be attended to if it is to develop;
how dynamic difficulty adjustment will develop, only time
will tell.
6. REFERENCES
[1] G. Andrade, G. Ramalho, H. Santana, and V. Corruble. Challenge-sensitive action selection: an application to game balancing. In Intelligent Agent Technology, IEEE/WIC/ACM International Conference on, pages 194–200. IEEE, 2005.
[2] M. Csikszentmihalyi. Flow: The Psychology of Optimal Experience. Harper Collins, New York, 1990.
[3] P. Demasi and A. J. de O. Cruz. On-line coevolution for action games. International Journal of Intelligent Games & Simulation, 2(2), 2003.
[4] D. Fritsch, M. Kada, et al. Visualisation using game engines. Archiwum ISPRS, 35:B5, 2004.
[5] D. Goldberg and J. Holland. Genetic algorithms and machine learning. Machine Learning, 3(2):95–99, 1988.
[6] R. Hunicke. The case for dynamic difficulty adjustment in games. In Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, pages 429–433. ACM, 2005.
[7] R. Hunicke and V. Chapman. AI for dynamic difficulty adjustment in games. In Challenges in Game Artificial Intelligence: AAAI Workshop, pages 91–96, 2004.
[8] L. Ilici, J. Wang, O. Missura, and T. Gartner. Dynamic difficulty for checkers and Chinese chess. In Proceedings of SIG 2012, pages 55–62. SIG, 2012.
[9] D. Kennerly. Better game design through data mining. http://www.gamasutra.com/view/feature/2816/better_game_design_through_data_.php, August 2003.
[10] L. Kranzky. The Sentinel. http://kranzky.rockethands.com/2010/09/08/the-sentinel/, September 2010.
[11] O. Missura. Adaptive agents in the context of Connect Four. In A. Hinneburg, editor, LWA, pages 165–166. Martin-Luther-University Halle-Wittenberg, 2007.
[12] O. Missura and T. Gartner. Online adaptive agent for Connect Four. In Proceedings of the Fourth International Conference on Games Research and Development CyberGames 2008, pages 1–8. AAAI Press, 2008.
[13] O. Missura and T. Gartner. Predicting dynamic difficulty. In Advances in Neural Information Processing Systems 24, pages 2007–2015, 2011.
[14] B. Pfeifer. AI to control pacing in games. In IC2 GameDev Workshop. University of Texas, Austin, 2003.
[15] S. Russell, P. Norvig, J. Canny, J. Malik, and D. Edwards. Artificial Intelligence: A Modern Approach, volume 2. Prentice Hall, Englewood Cliffs, NJ, 1995.
[16] N. Shaker, G. Yannakakis, and J. Togelius. Towards automatic personalized content generation for platform games. In Proceedings of Artificial Intelligence and Interactive Digital Entertainment (AIIDE’10). AAAI Press, October 2010.
[17] P. Spronck, I. Sprinkhuizen-Kuyper, and E. Postma. Difficulty scaling of game AI. In Proceedings of the 5th International Conference on Intelligent Games and Simulation (GAME-ON 2004), pages 33–37, 2004.
[18] R. Wiegand, W. Liles, and K. De Jong. Analyzing cooperative coevolution with evolutionary game theory. In Evolutionary Computation, 2002. CEC’02. Proceedings of the 2002 Congress on, volume 2, pages 1600–1605. IEEE, 2002.