Dynamic Difficulty Adjustment in Computer Games

David Michael Jordan Chang
School of Electronics and Computer Science, University of Southampton
dmjc1g10@ecs.soton.ac.uk

ABSTRACT
Dynamic difficulty adjustment (DDA) is an emerging technology that seeks to adapt the difficulty level of a computer game whilst it is being played, in order to cater for the abilities of the specific player. Due to implementation cost and complexity, DDA has yet to advance from research topic to accepted convention in the world of game development. For the first time in the literature, this paper reviews the current areas of DDA research and provides a critical analysis of each technique.

Keywords
Computer Entertainment, Video Games, AI, Dynamic Difficulty Adjustment.

1. INTRODUCTION
In general, video-games follow a 'difficulty curve': the game becomes more difficult as the player progresses, so that an appropriate level of difficulty is experienced by the player as they learn and improve at the game. The nature of the difficulty curve (the magnitude and frequency of each difficulty increment, the initial difficulty level, and so on) has historically been dictated by the player's initial selection of a static difficulty. These difficulty levels usually conform to the convention of Easy, Medium and Hard, sometimes with the addition of Very Hard.

The problem with such a paradigm is that novice and expert players alike can become frustrated or bored by the pre-defined difficulty curve. The discrete difficulty levels offered consider neither the minutiae of individual player ability, nor the various rates at which people learn. An alternative approach is that of Dynamic Difficulty Adjustment, where the difficulty level of the game is adjusted dynamically in order to suit the individual player. The difficulty can be adjusted in many different ways; the method chosen will largely be dictated by the type of game under consideration. For example, in a First Person Shooter (FPS) game, the number, strength and AI of enemies and the frequency of beneficial pick-ups such as weapons and health may be modified; in a platforming game, the structure of the levels that the user has to navigate may be modified; in a strategy game, the tactics and nature of the opposition's AI may be modified; in a small multiplayer game, a struggling player may be helped or an outstanding player handicapped; and in a large online multiplayer game, the teams may be rebalanced.

In this paper, I review the current research issues that are relevant to the various approaches to DDA, categorised by the method of adjustment. I summarise the results of the research present in the literature, and recount the evaluation studies performed. Furthermore, for each of the approaches discussed, I provide my own critique and consider the future of such technologies. Finally, I conclude with a discussion of the forefront of DDA research: the development of a general game adaptation mechanism that is transferable across game genres.

2. TERMINOLOGY
Throughout this paper, we will use the term online to mean 'whilst the game is being played' and offline to mean 'before/after the game has been played'.
3. TYPES OF DYNAMIC DIFFICULTY ADJUSTMENT, BY ADAPTATION TYPE

3.1 DDA by means of automatic level generation
Platform games are games where the player controls an avatar, and the overall objective is to get from point A to point B, usually by performing actions such as jumping over gaps, avoiding enemies and collecting items. The archetypal platform game is Nintendo's Super Mario Bros, in which the player controls an avatar known as Mario, who must get from A to B in order to reach the princess, collecting coins as he goes.

In [16], the authors develop a system for automatic content generation for platform games, using Markus Persson's Infinite Mario Bros as a base. Infinite Mario Bros is a public domain clone of the original Super Mario Bros that uses a technique known as Procedural Content Generation (PCG). PCG enables workers in the media industry to generate content, such as graphical objects, for films or video-games automatically, thereby saving the time and cost of having artists design and place the content manually.

The motivation for using PCG in the video-games industry has shifted over time. The earliest computer games were extremely constrained by memory limitations, and so PCG was used to produce content, such as maps, algorithmically at run-time, as a means of minimising the stored size of a game. An example of this is The Sentinel, which managed to contain 10,000 different levels within just 48 kilobytes by using pseudo-random number generators and seed values to create the levels dynamically [10]. In recent times, PCG is instead typically used offline, to create content such as levels before the user begins playing, simply as a means of saving time and costs for the development team, as storage requirements place very little constraint on a modern game development company. A modern example of such a PCG system is the middleware SpeedTree, which generates large numbers of trees procedurally and is used in many modern computer games, including The Elder Scrolls IV: Oblivion and Batman: Arkham Asylum [4].

A more recent innovation is the use of PCG online, to generate content for video-games whilst they are being played. In Infinite Mario Bros, levels are created automatically for the player whilst they play, giving the user a potentially unending game that is different each time they play it. In [16], the authors modify Infinite Mario Bros such that the procedurally generated content is dynamically adapted to the specific user who is playing the game.

Firstly, the authors collected data from 327 human players whilst they played Infinite Mario Bros. The data collected can be broken into three categories:
1. Controllable features of the game, such as the number and average width of gaps.
2. Gameplay statistics, such as the number of jumps, deaths, kills, sprints etc. that the player performs.
3. The player's subjective experience, in the form of numerical values attributed to each of fun, challenge and frustration.

Next, using this data, the gameplay features that most affect the user's perception of the game were determined using a single-layer perceptron. Then, multi-layer perceptrons (MLPs) were trained to learn, given a set of controllable features of a level (number of gaps etc.) and gameplay-specific features (number of jumps, deaths, kills performed etc.), what subjective experience a player of that level would have. So, once trained, the MLPs could be fed data on any given level and predict how fun, challenging and frustrating a user was likely to find it. For more information on perceptrons, see [15].
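To make this concrete, the following minimal sketch shows how such a fun-predictor might be trained. It is an illustration only: it assumes scikit-learn's MLPRegressor, and the feature names and data are invented placeholders rather than the actual features or dataset used in [16].

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Each row: controllable level features followed by gameplay statistics.
# Columns (illustrative): [num_gaps, avg_gap_width, num_jumps, num_deaths, num_kills]
X = np.array([
    [12, 2.5, 40, 3, 10],
    [ 5, 1.5, 22, 0, 14],
    [20, 3.0, 55, 7,  6],
    # ... one row per recorded play session
])
# Target: the fun rating each player reported for that session.
y = np.array([0.4, 0.9, 0.2])

# A small multi-layer perceptron, analogous in spirit to the MLPs of [16].
fun_model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
fun_model.fit(X, y)

# Once trained, the model predicts how much fun a player is likely to have
# on a candidate level, given how they played so far.
candidate = np.array([[8, 2.0, 30, 1, 12]])
print(fun_model.predict(candidate))
```

In practice, separate models could be trained in the same way for challenge and frustration, mirroring the three subjective measures collected in the study.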
Using these MLPs, the authors implemented their adaptive game as follows. The first level that a user plays is randomly generated; then, the gameplay-specific features gathered from that play session, together with the controllable features of that level, are used as input to the MLPs. The MLPs then decide, given the nature of the last level played and the way that the user played it, what kind of level the user was likely to find most fun. That level is then generated for the player, and the process repeats. So, the more levels the user plays, the better trained the MLPs will be, and the better they will be at determining levels that the user will find fun.

To evaluate their mechanism, the authors used two different AI agents that could play through the levels unaided. Each AI agent played 100 levels: the first 50 were adapted dynamically by the constructed system in order to maximise the predicted fun values for the agent, and the second 50 were all generated randomly. A comparison showed that the adaptation mechanism was able to generate levels with higher predicted fun values than levels that were generated randomly, as demonstrated by Figures 1 and 2.

Figure 1: Fun values of optimised vs. random levels for the more human-like agent. (From [16])

Figure 2: Fun values of optimised vs. random levels for the less human-like agent. (From [16])

Interestingly, the AI that plays in a more human-like style (jumping and firing only when necessary) had levels generated for it that were more fun than the levels generated for the AI that played in a more mechanistic fashion (constantly jumping and firing). This is likely because the adaptive mechanism used was trained by human players [16].

3.1.1 Critique of the approach of Shaker et al.
The work presented in [16] is an important step towards dynamic level generation for platform games. Their research does, however, lack an evaluation with human users. Although the evaluation with AI players demonstrates that the dynamically generated levels have higher predicted fun values, a human evaluation would be a good confirmation of their results. The techniques used are of particular interest, as they rely very little on domain-specific knowledge. That is, they could very easily be transferred to games other than Infinite Mario Bros, by identifying the controllable features of the game levels, and determining through user testing which features contribute most to the user experience. Then, levels could be generated using MLPs trained to predict which levels a user was likely to find most fun, based on the current gameplay statistics and previous levels played; a sketch of such a generation loop is given below. Level structure plays a role of paramount importance in the overall user experience of a platforming game, and so it is unclear how much adaptive level generation would affect the user experience of a different type of game, such as a First Person Shooter (FPS), where level structure is less important. This is a potential topic for interesting further research.
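The generation loop alluded to above could plausibly take the following shape. This sketch reuses the hypothetical fun_model from the previous example; generate_level and play_session are stand-ins for game-specific code, not functions from [16].

```python
import numpy as np

NUM_CANDIDATES = 100  # candidate feature vectors sampled per level

def next_level_features(gameplay_stats, fun_model, rng):
    """Pick the controllable level features predicted to be most fun, given
    the player's gameplay statistics [num_jumps, num_deaths, num_kills]
    from the previous level."""
    best_features, best_fun = None, -np.inf
    for _ in range(NUM_CANDIDATES):
        # Sample candidate controllable features: [num_gaps, avg_gap_width].
        candidate = [rng.integers(0, 25), rng.uniform(1.0, 4.0)]
        x = np.array([candidate + gameplay_stats])
        predicted_fun = fun_model.predict(x)[0]
        if predicted_fun > best_fun:
            best_features, best_fun = candidate, predicted_fun
    return best_features

# Sketch of the overall loop (game-specific pieces left abstract):
# rng = np.random.default_rng(0)
# stats = play_session(generate_level(random_features))
# while player_still_playing():
#     features = next_level_features(stats, fun_model, rng)
#     stats = play_session(generate_level(features))
```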
3.2 DDA by means of AI modification
In many games, the difficulty level that the user perceives is directly influenced by the manner in which his or her opponents play. This is especially true of strategy games, where the user is typically pitted against an individual AI agent, and the sole objective of the game is to advance by defeating the opposition. Popular examples of such games include the Civilization and Command and Conquer series.

In any game, an AI will function according to a pre-defined algorithm; the relative 'strength' of the AI is determined by the quality of the underlying algorithm. The difficulty selected by the player, then, determines which algorithm their AI opponent will use during play. The problem of dynamically adjusting the difficulty of an AI opponent can thus be formulated as the problem of selecting, online, the appropriate algorithm for the AI to use against a specific player, according to the abilities of that player. Several different methods for adjusting the nature of game AI dynamically have been employed in the literature, and I present them here, together with critiques of each approach.

3.2.1 Dynamic scripting
In [17], Spronck et al. use what they term "dynamic scripting". Dynamic scripting is a form of reinforcement learning (see [15]) that has been adapted to be efficient enough for use as an online learning technique in games. A rulebase of rules that control the behaviours of in-game opponents is maintained, and every time a new opponent is generated, its corresponding script is re-generated from the rules in the rulebase. As the game progresses, the weights of the rules in the rulebase are adjusted according to the behaviour of the player; as a result, different scripts will be generated dynamically for the various enemy types. A sketch of this scheme is given after the critique below.

3.2.2 Critique of Spronck's dynamic scripting approach
A disadvantage of this technique is that, as the game advances, the rulebase will grow very large. This makes the rules more error-prone, and harder to build and maintain. Furthermore, adversary performance is limited by the design of the rules that generate the most intelligent agent, which may still not present a significant challenge to very skilled players [1].
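To ground the idea, the sketch below shows one way a dynamic-scripting rulebase might select rules and update their weights. It is a simplified illustration under assumed reward signals, not Spronck et al.'s actual implementation (which, among other things, redistributes weight so the total stays constant).

```python
import random

# Rulebase: each behaviour rule carries a weight that governs how likely
# it is to be included in a freshly generated opponent script.
rulebase = {
    "charge_player":    10.0,
    "keep_distance":    10.0,
    "use_heavy_attack": 10.0,
    "retreat_low_hp":   10.0,
}

SCRIPT_SIZE = 2       # rules per generated script
LEARNING_RATE = 2.0   # how strongly outcomes shift the weights
MIN_WEIGHT = 1.0      # rules never die out completely

def generate_script():
    """Weighted sampling of rules to build a new opponent's script."""
    rules = list(rulebase)
    weights = [rulebase[r] for r in rules]
    return random.choices(rules, weights=weights, k=SCRIPT_SIZE)

def update_weights(script, opponent_challenged_player):
    """Reinforce rules that produced an appropriately challenging opponent;
    weaken rules whose opponent was beaten too easily."""
    delta = LEARNING_RATE if opponent_challenged_player else -LEARNING_RATE
    for rule in script:
        rulebase[rule] = max(MIN_WEIGHT, rulebase[rule] + delta)

# One encounter: build a script, observe the fight, adapt the rulebase.
script = generate_script()
update_weights(script, opponent_challenged_player=False)
```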
3.2.3 Genetic algorithms
Other approaches to dynamic AI adjustment include the use of machine learning to build intelligent agents, where genetic algorithm techniques (see [5]) are employed to keep alive those agents that most closely match the player's abilities. In [3], Demasi and Cruz use online co-evolution (see [18]) to speed up the learning process: pre-defined agents with good features, constructed either through offline training or manually, are used as parents in the genetic operations, so that they bias the evolution.

3.2.4 Critique of Demasi and Cruz's genetic algorithm approach
This is an interesting approach, but it also has its limitations. There will not be pre-existing models designed for very skilled players, or for players with very uncommon behaviour, so learning to accommodate such players will be very slow. Furthermore, the ability of the in-game agents can only increase; there is no possibility of regression. So, if a player's skill decreases (e.g. because they have not played the game for a long time), the agents will not be able to regress correspondingly. To ensure that agents cover all skill levels, they must begin the evolution at the very easiest level; this also means that it will take a long time for them to evolve to the level of a skilled player, who could become bored in the process [1].

3.2.5 Adaptive agents
Much work has been done in the area of dynamic AI adjustment by Olana Missura and Thomas Gärtner. This work began in 2007, when Missura formalised the problem of an adaptive agent in the context of the popular strategy game Connect Four [11]. Then, in 2008, Missura and Gärtner created an adaptive agent for playing Connect Four, using an algorithm that they named "Adaptive Mini-Max" (AMM) [12].

AMM is a modified version of a pre-existing algorithm known as Mini-Max, which uses a game tree to decide which move to make on any given turn. A directed tree of available actions is calculated, in which the nodes are possible game states and the edges are possible moves, and the tree is searched to determine which available move is optimal. For reasons of efficiency, Mini-Max only investigates a sub-tree of the full game tree, and then decides which move to make based on a ranking of the available moves. The value of Mini-Max's search depth (i.e. how much of the game tree it investigates at each turn) determines how well it performs. For the purposes of the paper, this value was set such that Mini-Max plays as well as possible while still taking no more than a few seconds per move.

Instead of always taking the optimal move based on its investigation of the game tree, AMM first evaluates the moves that were available to its opponent. It does this by investigating a sub-tree of the game tree of choices that its opponent could have made, just as it would do for itself on its own turn. The actions available to the opponent are ranked in terms of optimality, and the actual move that the opponent made is noted. AMM does this throughout the game and, at each turn, calculates the average ranking of all of the moves made by its opponent; it is this average ranking that enables AMM to estimate the ability of its opponent. Then, on each of its turns, AMM chooses to make not the optimal move, but rather the available move whose ranking is closest to the average ranking of the opposition's moves.
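The move-selection rule can be summarised in a few lines. The sketch below is a schematic reading of AMM as described above, not the authors' code; rank_moves stands in for a depth-limited Mini-Max evaluation that a real implementation must supply, and here simply treats the moves as already ordered.

```python
def rank_moves(state):
    """Stand-in for a depth-limited Mini-Max evaluation: return the legal
    moves of `state` ordered from best (rank 0) to worst. This dummy
    assumes the moves are already supplied in that order."""
    return list(state)

class AdaptiveMiniMax:
    def __init__(self):
        self.opponent_ranks = []  # rank of each move the opponent has made

    def observe_opponent_move(self, state, move):
        # Rank the moves the opponent could have chosen; note the one taken.
        self.opponent_ranks.append(rank_moves(state).index(move))

    def choose_move(self, state):
        ranked = rank_moves(state)
        if not self.opponent_ranks:
            return ranked[0]  # no evidence about the opponent yet: play optimally
        avg = sum(self.opponent_ranks) / len(self.opponent_ranks)
        # Not the optimal move, but the one whose rank is closest to the
        # opponent's average move rank.
        return min(ranked, key=lambda m: abs(ranked.index(m) - avg))

amm = AdaptiveMiniMax()
amm.observe_opponent_move(state=["a", "b", "c"], move="b")  # a mid-strength move
print(amm.choose_move(state=["x", "y", "z"]))               # -> "y"
```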
In order to test the quality of AMM's adaptive mechanism, AMM was pitted against several other Connect Four-playing agents, which I list here in order of ability: Naive, Simple, Mini-Max and Optimal. When playing against each of these agents, AMM demonstrated a good ability to adapt to the different agent ability levels, as can be seen in Figure 3.

Figure 3: The results of 10 episodes of 1000 games played by AMM and the non-adaptive algorithms. (From [12])

It can be observed from these graphs that AMM won approximately 50 percent of the time when playing against all but the Optimal agent. The reason that AMM was not able to adapt to the ability of the Optimal agent is that AMM is fundamentally based on Mini-Max, and so cannot perform any better than Mini-Max, which is a weaker agent than Optimal.

Finally, an evaluation of AMM's "fun factor" and ability to adapt against human players was performed. The study found that users preferred playing against AMM over the non-adaptive agents; this is most likely because the testers quickly figured out how easy or difficult a non-adaptive agent was, and then tired of it. The users chose to spend much more time playing against the adaptive agent [12]. Additionally, it was seen that AMM adapted very well to the different skill levels of the users. Figure 4 shows that there was very little correlation between a player's skill level and the rate at which they won when playing against AMM. Comparing this with the graph of player skill level versus winning rate against the non-adaptive Mini-Max in Figure 5, a much stronger correlation is present when users play against a non-adaptive agent. From this we can infer that AMM successfully adapted to the players' skill levels.

Figure 4: Player's winning rate (X) vs. skill level in the games with AMM. The correlation coefficient is just 0.04. (From [12])

Figure 5: Player's winning rate (X) vs. skill level in the games with MM. The correlation coefficient is 0.32. (From [12])

3.2.6 Critique of Missura's adaptive agent approach
The basic principles behind AMM, the algorithm produced in this study, are of interest to researchers in DDA for AI in strategy games. The general tactic of ranking the opposition's move history, and choosing an appropriate next move based on this information, is one that is transferable across all games in the turn-based strategy genre. The specific ranking method is, however, domain-specific, as it is only applicable to a game of Connect Four. Thus, if the strategy employed by the authors is to have further application, new ranking heuristics must be devised for the target game(s). A further disadvantage of the algorithm is that it can only play as well as Mini-Max, which is less than optimal. As a result, the adaptive properties of AMM break down when it is pitted against an opponent with greater ability than Mini-Max. Further work could look at ways to improve the performance of AMM, perhaps by increasing the depth to which it investigates the game tree at each move.
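As an aside on evaluation methodology, the correlation figures quoted above are straightforward to reproduce for any adaptive agent under test: record each player's skill estimate and winning rate, then compute the correlation between them. A minimal sketch, with made-up numbers rather than the data from [12]:

```python
import numpy as np

# One entry per human tester: (estimated skill, winning rate vs. the agent).
skill    = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
win_rate = np.array([0.48, 0.53, 0.47, 0.52, 0.50])

# Pearson correlation coefficient; a value near 0 suggests the agent is
# adapting (win rate independent of skill), a larger value suggests it is not.
r = np.corrcoef(skill, win_rate)[0, 1]
print(f"correlation: {r:.2f}")
```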
3.3 DDA by means of level content adjustment
The next category of DDA that I shall relate is that of dynamic level content adjustment. By level content, I mean items that can be placed within a virtual game environment that can interact with the player, or be interacted with by the player. For example, in the context of an FPS-style game, such content may include, but is not limited to: enemies, health pickups, ammunition and weapons.

Important work towards achieving DDA systems that implement the adjustment of level content has been performed by Robin Hunicke [7] [6]. In their 2004 paper, 'AI for Dynamic Difficulty Adjustment in Games', Hunicke and Chapman describe the development of a system that they call "Hamlet". Prior to the work of Hunicke and Chapman, two approaches for generating content-oriented DDA systems had been attempted: Pfeifer describes a system that utilises the manual annotation of game tasks and obstacles with information about their difficulty, such that they may be controlled dynamically [14], and Kennerly proposes a system that employs data mining and offline analysis in order to dynamically regulate level content [9].

Hamlet is a system built on top of Valve's Half-Life game engine that uses techniques drawn from Inventory Theory and Operations Research to do the following:
1. Monitor game data with statistical metrics.
2. Predict the player's future state using this data.
3. Intervene when an undesirable but avoidable state is predicted.

Like much of the literature on DDA, the authors employ the notion of 'Flow', as developed by the prominent psychologist Mihaly Csikszentmihalyi [2]. The formal aim of the DDA system developed is to maintain players in what is known as the "Flow Channel". This is the state of complete immersion that one experiences when engaging in an activity that is neither too difficult, such that it evokes anxiety or frustration, nor too easy, such that it evokes boredom (see Figure 6).

Figure 6: Flow diagram, displaying how the challenge that a game provides should be suited to the skill level of the player in order to keep them in the "Flow Channel". (From [7])

Hamlet seeks to maintain the player's state of 'Flow' by identifying when they are "flailing", i.e. when their available resources are failing to meet the current demands [7], and then intervening with an appropriate action. An action can be reactive or proactive. Reactive actions involve the adjustment of parameters that pertain to game content that the user is currently encountering, such as the health, strength and accuracy of an attacking enemy; proactive actions adjust game elements that are not immediately observed by the player, such as the type, spawning order, health, accuracy and item load of enemies yet to be encountered.

Game policies are combinations of actions and cost estimations. A cost estimation is a value attributed to the cost of a given action, calculated by considering observations such as how much progress the player has made, how often they have died, where they currently are in the level, how often they have repeated the encounter and, most importantly, how often the system has intervened in the past. By composing game policies of both actions and cost estimations, the authors hoped to produce a system that would dynamically alter the game in a way that is responsive to individual play styles, without repeatedly intervening in the same way, such that the player's suspension of disbelief could remain intact [7].

In a follow-up to this paper, Hunicke evaluated the effectiveness of Hamlet using a group of human testers. This work showed that the system was able to increase the enjoyment levels of experienced 'expert' game players, though no significant correlation was found between game enjoyment and game adaptation for 'novice' players [6].

3.3.1 Critique of Hunicke and Chapman's work
Though the work of Hunicke and Chapman has been an important first step in the direction of developing a system for DDA by means of level content adaptation, the ineffectiveness of their system on players who are not 'experts' is, I believe, a serious flaw, as a large proportion of the game-playing population are not expert players. A system that is also effective on novice and average players would have much more commercial value. Interestingly, it is clear that the Hamlet system is capable of scaling to game types other than those of the FPS genre; indeed, wherever there is a clearly defined system of adjustable level content, a system such as Hamlet would be appropriate. It would be of great interest to see similar DDA techniques applied to role-playing or fantasy games, where the types of gameplay elements encountered are likely to vary beyond just enemies and items. For example, variables such as the nature and type of spells available to the player could be adjusted in such games.
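Before moving on, the monitor/predict/intervene pattern of this section is worth making concrete. The following minimal sketch shows one plausible shape for a Hamlet-style reactive intervention policy; the metrics, thresholds and cost formula are invented for illustration, and are not Hunicke and Chapman's actual values.

```python
from dataclasses import dataclass

@dataclass
class PlayerState:
    health: float          # current health, 0..100
    damage_rate: float     # average damage taken per second (monitored metric)
    interventions: int = 0 # how often the system has already stepped in

def predict_time_to_die(state: PlayerState) -> float:
    """Naive prediction: seconds until death at the current damage rate."""
    return float("inf") if state.damage_rate <= 0 else state.health / state.damage_rate

def intervention_cost(state: PlayerState) -> float:
    """Interventions grow costlier the more often they have occurred, to
    avoid repeatedly breaking the player's suspension of disbelief."""
    return 1.0 + 0.5 * state.interventions

def adjust(state: PlayerState) -> str:
    """Intervene only when an undesirable but avoidable state is predicted
    and the estimated cost of intervening is acceptable."""
    if predict_time_to_die(state) < 10.0 and intervention_cost(state) < 3.0:
        state.interventions += 1
        return "spawn_health_pack"  # reactive action (illustrative)
    return "no_action"

print(adjust(PlayerState(health=30.0, damage_rate=5.0)))
```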
4. THE FOREFRONT OF DDA RESEARCH: GENERALISATION OF AN ADAPTATION MECHANISM
Building on their previous work, as described in section 3.2, Missura and Gärtner have attempted to address the limitation of domain-specific requirements by formalising DDA as an abstract learning problem [13]. In their paper, they aim to create a universal mechanism for DDA that does not rely on heuristics, thus making it more transferable between different game types. The algorithm they produce, POSM, is compared by the authors, together with another state-of-the-art DDA algorithm, against the 'best static difficulty system chosen in hindsight' (BSIH). They found that POSM performs almost as well as BSIH, and sometimes better.

In [8], Missura et al. build further on their generalised DDA algorithm, POSM, implementing it in the strategy games of Checkers and Chinese Chess. Implementing the algorithm in these games and evaluating it against both human and autonomous players, the authors found that, in most cases, POSM adjusts the difficulty to suit specific players; however, the mechanism fails when pitted against players who are either very random in their choices of moves, or very smart.

4.1 Analysis
Based on the studies in [8], it seems likely that POSM could be used to scale the difficulty level of any turn-based strategy game where an AI agent is used. Amongst such games, it is not unusual to have 'friendly' AI agents as well as adversarial ones. It would be extremely interesting to see how POSM could be used to adapt the behaviours of both friendly and adversarial agents, in order to help keep the player engaged. POSM has yet to be implemented in a genre of games outside of turn-based strategy; it will be very interesting to see how well Missura and Gärtner's algorithm transfers to alternative game paradigms.
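The abstract problem formulation can be illustrated independently of POSM's internals, which [13] develops with formal guarantees. Below is a minimal sketch of the online protocol, with an invented placeholder adjustment rule standing where POSM's learning strategy would go; it is not POSM itself.

```python
import random

DIFFICULTIES = [1, 2, 3, 4, 5]  # totally ordered here for simplicity

def player_feedback(difficulty, skill):
    """The environment's (player's) response each round."""
    if difficulty < skill:
        return "too easy"
    if difficulty > skill:
        return "too hard"
    return "just right"

def play(rounds=20, skill=4):
    current = random.choice(DIFFICULTIES)
    for _ in range(rounds):
        feedback = player_feedback(current, skill)
        # Placeholder adjustment rule; POSM replaces this with a learning
        # strategy that provably competes with the best static difficulty
        # chosen in hindsight.
        if feedback == "too easy" and current < max(DIFFICULTIES):
            current += 1
        elif feedback == "too hard" and current > min(DIFFICULTIES):
            current -= 1
    return current

print(play())
```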
5. CONCLUSION
Dynamic difficulty adjustment is a technology in its infancy; as such, there are many teething problems that must be addressed if the technology is to become widespread. Such problems include the need to overcome the requirement for domain-specific information in DDA techniques, the need to develop DDA mechanisms that are both efficient and cost-effective to implement, and, importantly, the need to produce DDA mechanisms that can be shown to reliably increase a player's enjoyment of a game. It is at present unclear whether the fledgling techniques discussed in this paper will act as the genesis of a new era for computer game development, in which the selection of a game's difficulty level is no longer a concrete decision in the hands of humans, but rather a dynamic process that allows for automatic adjustment to satisfy individual players' needs; or whether the costs and limitations of DDA techniques will leave such ideas confined to the world of research, without taking hold in real-world application. Any emerging technology must be attended to if it is to develop; how dynamic difficulty adjustment will develop, only time will tell.

6. REFERENCES
[1] G. Andrade, G. Ramalho, H. Santana, and V. Corruble. Challenge-sensitive action selection: an application to game balancing. In Intelligent Agent Technology, IEEE/WIC/ACM International Conference on, pages 194-200. IEEE, 2005.
[2] M. Csikszentmihalyi. Flow: The Psychology of Optimal Experience. Harper Collins, New York, 1990.
[3] P. Demasi and A. Cruz. On-line coevolution for action games. International Journal of Intelligent Games & Simulation, 2(2), 2003.
[4] D. Fritsch, M. Kada, et al. Visualisation using game engines. Archiwum ISPRS, 35:B5, 2004.
[5] D. Goldberg and J. Holland. Genetic algorithms and machine learning. Machine Learning, 3(2):95-99, 1988.
[6] R. Hunicke. The case for dynamic difficulty adjustment in games. In Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, pages 429-433. ACM, 2005.
[7] R. Hunicke and V. Chapman. AI for dynamic difficulty adjustment in games. In Challenges in Game Artificial Intelligence: AAAI Workshop, pages 91-96, 2004.
[8] L. Ilici, J. Wang, O. Missura, and T. Gärtner. Dynamic difficulty for checkers and Chinese chess. In Proceedings of CIG 2012, pages 55-62. IEEE, 2012.
[9] D. Kennerly. Better game design through data mining. http://www.gamasutra.com/view/feature/2816/better_game_design_through_data_.php, August 2003.
[10] L. Kranzky. The Sentinel. http://kranzky.rockethands.com/2010/09/08/the-sentinel/, September 2010.
[11] O. Missura. Adaptive agents in the context of Connect Four. In A. Hinneburg, editor, LWA, pages 165-166. Martin-Luther-University Halle-Wittenberg, 2007.
[12] O. Missura and T. Gärtner. Online adaptive agent for Connect Four. In Proceedings of the Fourth International Conference on Games Research and Development (CyberGames 2008), pages 1-8. AAAI Press, 2008.
[13] O. Missura and T. Gärtner. Predicting dynamic difficulty. In Advances in Neural Information Processing Systems 24, pages 2007-2015, 2011.
[14] B. Pfeifer. AI to control pacing in games. In IC2 GameDev Workshop, University of Texas, Austin, 2003.
[15] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, NJ, 1995.
[16] N. Shaker, G. Yannakakis, and J. Togelius. Towards automatic personalized content generation for platform games. In Proceedings of Artificial Intelligence and Interactive Digital Entertainment (AIIDE'10). AAAI Press, October 2010.
[17] P. Spronck, I. Sprinkhuizen-Kuyper, and E. Postma. Difficulty scaling of game AI. In Proceedings of the 5th International Conference on Intelligent Games and Simulation (GAME-ON 2004), pages 33-37, 2004.
[18] R. Wiegand, W. Liles, and K. De Jong. Analyzing cooperative coevolution with evolutionary game theory. In Proceedings of the 2002 Congress on Evolutionary Computation (CEC'02), volume 2, pages 1600-1605. IEEE, 2002.