CS510 AI and Games Term Project Design
Juncao Li

Abstract:
Computer games, as entertainment or education media, are becoming more and more popular. The design of AIs in computer games usually focuses on the adversaries that play against human players. However, the ultimate goal of game design is not to make the AIs win against players, but to entertain or educate the players. As a result, it is important to have AIs that can mimic the behaviors of human players and serve as benchmarks in game design. In this project, we design player AIs for the game Advanced Protection (AP) that try to mimic human players and win the game. We first hardcode a static Finite State Machine (FSM) that mimics players' strategies, and then use the FSM to train a Neural Network (NN) as an adaptive player AI. We also design an approach to evaluate the AIs on both sides, i.e., the Human and the Chaos. The evaluation is based on the win/lose ratio of the AIs given random initial treasuries on a fixed map.

Introduction:
As shown in Fig. 1, the game of Advanced Protection (AP) is played between a human player and a computer opponent (known as Chaos) on a 24 x 24 wraparound grid. The game is split into turns, each composed of 50 phases. Before each turn begins, the human player is able to buy, distribute, and salvage as many units as his or her treasury allows. When the turn begins, Chaos's minions are randomly placed on squares not occupied by the human's units. During each phase of the turn, every minion is allowed one or two moves and every human farming unit generates money. Minions can 1) Move Forward, 2) Turn Right, 3) Turn Left, and 4) take a Special Action. (Special Actions vary between minions: the Special Action of Scouts is to broadcast, the Special Action of Scavengers is to farm, and the Special Action of Barbarians is to attack a human unit.) When the turn ends, Chaos's remaining units are removed from the board and salvaged for their full value. Both the human player and Chaos create new units between turns by purchasing them from their respective treasuries. Chaos and the human both start the game with $2000. The human player wins when Chaos surrenders (when the human treasury overwhelms the Chaos treasury). Chaos wins when the human has no units and no money left in his or her treasury.

Fig. 1 The UI of Advanced Protection

AP is an adaptive turn-based strategy game that updates its AI strategy every turn according to the human player's performance: the difficulty scales up or down as the player wins or loses. To do this, AP encodes each minion with a brain (an automaton) represented by a 128-bit string. Each minion has 250 candidate brains, among which 20 are hardcoded and 230 are generated by genetic algorithms. During play, brains are rated based on their performance against the player; the ratings vary from player to player because of the differences between players' strategies. AP uses a fitness function that dynamically chooses the most fit minion brain against the current player.

AI design details:
We design two player AIs in this project: a Finite State Machine (FSM) that hardcodes the player's strategy, and a Neural Network (NN) that is first trained by the FSM and then further improved with randomly generated playing cases. The goal of a player AI is to maximize its treasury income each turn compared to Chaos's income.

The FSM captures game strategies in AP to play as a human player. It gathers the information necessary to make decisions each turn: (1) the treasuries of both the Human and Chaos; (2) the terrain; and (3) the current human nodes on the map. The output of the FSM includes: (1) where to place units; and (2) which units to place. We classify the human units into two types: (1) farming units, such as the drone and the settler, whose main purpose is to make money; and (2) aggressive units, such as the mine and the artillery, whose main purpose is to damage Chaos's units. The FSM places units according to these two categories so as to maximize its own income and minimize Chaos's income.

Fig. 2 The strategy of the FSM on the first game turn
Fig. 3 The strategy of the FSM on the last game turn (before its victory)

Figure 2 shows the strategy of the FSM on the first game turn, where it tries to make money by choosing mostly farming units. Figure 3 shows the strategy of the FSM on the last game turn before its victory, where it has enough treasury to sustain its growth; instead of making money, its top priority is to suppress Chaos.
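This earn-then-suppress behavior can be pictured as a small two-state machine. The listing below is a simplified C++ sketch: the type names, the flat unit cost, and the 2x switching threshold are our illustrative assumptions, not the tuned logic in Player.cpp:

#include <cstdio>
#include <vector>

// Hypothetical types; farming units: Drone, Settler; aggressive: Mine, Artillery.
enum class UnitType { Drone, Settler, Mine, Artillery };
enum class State { Earn, Suppress };
struct Placement { int row, col; UnitType unit; };

// Farm while money is still needed; switch to suppressing Chaos once the
// human treasury clearly dominates (the 2x threshold is an assumption).
State chooseState(int humanTreasury, int chaosTreasury) {
    return humanTreasury > 2 * chaosTreasury ? State::Suppress : State::Earn;
}

std::vector<Placement> planTurn(int humanTreasury, int chaosTreasury,
                                const std::vector<std::vector<int>>& terrain) {
    std::vector<Placement> plan;
    const State s = chooseState(humanTreasury, chaosTreasury);
    const int kUnitCost = 100;  // placeholder price, not AP's real unit costs
    int budget = humanTreasury;
    for (int r = 0; r < 24 && budget >= kUnitCost; ++r)
        for (int c = 0; c < 24 && budget >= kUnitCost; ++c) {
            if (terrain[r][c] != 0) continue;  // skip non-buildable squares
            plan.push_back({r, c, s == State::Earn ? UnitType::Drone
                                                   : UnitType::Mine});
            budget -= kUnitCost;
        }
    return plan;
}

int main() {
    std::vector<std::vector<int>> terrain(24, std::vector<int>(24, 0));
    std::printf("placed %zu units\n", planTurn(2000, 2000, terrain).size());
}

In the actual FSM, placement positions also depend on terrain patterns and on the human nodes already on the map, which is exactly the kind of detail that is hard to enumerate by hand, as discussed next.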
The FSM is static: it hardcodes human knowledge and never learns by itself. Furthermore, hand-coding cannot account for everything that may contribute to game strategy. For example, some small templates containing particular unit placement combinations are too complex to hardcode, and game strategies can also depend on map terrain patterns, which are dynamically generated during the game. These limitations motivate us to design an adaptive player AI.

Fig. 4 The strategy of the NN on the first game turn

We use a Neural Network (NN) to design an adaptive player AI that can evolve given proper training. The input size of our NN is 578: the map terrain information (24*24 cells) plus the initial Human and Chaos treasuries (one input each). The output is the human unit placement on the map (again of size 24*24). To simplify the design, we employ one hidden layer that contains the same number of nodes as the input layer. We normalize the inputs to the range [0.0, 1.0] so that large-valued inputs do not overwhelm the others in the early phase of training. In development, we borrowed two neural network implementations, one from the "AI Game Engine" and one from Tim Jones's book. The latter implementation is simple and proved efficient in our practice.

We design two steps to train our NN player AI. First, we use the static FSM to train the NN: the training inputs are randomly generated maps with random initial Human/Chaos treasuries, and the target outputs are generated by the FSM. This step efficiently helps the NN recognize different maps and map patterns. Second, we train the NN on randomly generated strategies whenever such a strategy performs better than the NN does. In this way, the NN can keep improving itself against the Chaos brains.
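To make the first training step concrete, the following self-contained sketch implements a plain one-hidden-layer network with backpropagation, sized as described above (578 inputs, 578 hidden nodes, 576 outputs). It is an illustration rather than the borrowed backprop.cpp code: biases are omitted for brevity and all names are ours. In the training loop, each call would pair a normalized random map and treasuries with the FSM's placement as the target:

#include <cmath>
#include <cstdlib>
#include <vector>

constexpr int kIn  = 24 * 24 + 2;  // 576 terrain cells + 2 treasuries = 578
constexpr int kHid = kIn;          // one hidden layer, same width as the input
constexpr int kOut = 24 * 24;      // one placement score per map square

struct Net {
    std::vector<double> w1, w2;    // weights: input->hidden, hidden->output
    Net() : w1(kIn * kHid), w2(kHid * kOut) {
        for (double& w : w1) w = (std::rand() / (double)RAND_MAX - 0.5) * 0.1;
        for (double& w : w2) w = (std::rand() / (double)RAND_MAX - 0.5) * 0.1;
    }
};

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One stochastic-gradient step of plain backpropagation; returns the summed
// squared error measured before the weight update.
double trainStep(Net& net, const std::vector<double>& x,
                 const std::vector<double>& target, double lr) {
    std::vector<double> h(kHid), y(kOut);
    for (int j = 0; j < kHid; ++j) {                  // forward: input->hidden
        double s = 0;
        for (int i = 0; i < kIn; ++i) s += net.w1[i * kHid + j] * x[i];
        h[j] = sigmoid(s);
    }
    for (int k = 0; k < kOut; ++k) {                  // forward: hidden->output
        double s = 0;
        for (int j = 0; j < kHid; ++j) s += net.w2[j * kOut + k] * h[j];
        y[k] = sigmoid(s);
    }
    double err = 0;
    std::vector<double> dy(kOut), dh(kHid, 0.0);
    for (int k = 0; k < kOut; ++k) {                  // output-layer deltas
        err += (y[k] - target[k]) * (y[k] - target[k]);
        dy[k] = (y[k] - target[k]) * y[k] * (1 - y[k]);
    }
    for (int j = 0; j < kHid; ++j) {                  // hidden deltas + update
        for (int k = 0; k < kOut; ++k) {
            dh[j] += net.w2[j * kOut + k] * dy[k];
            net.w2[j * kOut + k] -= lr * dy[k] * h[j];
        }
        dh[j] *= h[j] * (1 - h[j]);
    }
    for (int i = 0; i < kIn; ++i)
        for (int j = 0; j < kHid; ++j)
            net.w1[i * kHid + j] -= lr * dh[j] * x[i];
    return err;
}

The returned error value is also what the implementation check described in the conclusion relies on.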
During training, the NN player AI adapts to recognize different maps. The raw outputs of the NN are often not exactly what we expect, so we need to interpret them: we first normalize the outputs to the range [0.0, 1.0], and then match them to the most similar valid answers. Figure 4 shows the NN player AI's first-turn output on the same map the FSM played. At this point the NN has been trained for about 4 hours, over 1 million iterations. The figure shows that the NN is adapting to recognize maps.

Evaluations:
We evaluate the player AIs and the Chaos AI on single turns, because the Chaos brain is only fixed within a single turn. We fix the test map in order to reduce the uncertainty caused by maps. Given a certain amount of initial treasury, we run each player AI against Chaos 50 times to gather statistics, minimizing the influence of the fuzzy logic encoded in the AIs. We consider our AIs to perform well, or to win, if and only if, given a random initial treasury and map, they have an advantage against most of the 250 Chaos brains in terms of money earned. This follows from a simple observation: good performance on each turn leads to the final win of the game.

Fig. 5 Statistics on the FSM: wins out of the 250 brains for given initial treasury

Figure 5 shows the statistics of the hardcoded FSM against the 250 Chaos brains. We can infer from the figure that the FSM does not deal well with treasuries between 4000 and 10000, where it never wins against any Chaos brain. Figure 6 shows the statistics of the NN player AI after about 1 million training iterations. Although trained by the FSM, the NN performs better than the FSM in many cases, especially when the treasury is between 4000 and 10000. The NN does not outperform the FSM when the initial treasury is larger than 10000, because the selection of the training set does not favor cases with large initial treasuries.

Fig. 6 Statistics on the NN AI: wins out of the 250 brains for given initial treasury

Conclusion and Discussion:
In this project, we studied the turn-based game Advanced Protection (AP), especially its adaptive, brain-based AI strategy. We developed a static FSM to encode the human player's game strategy against the AP AI. The FSM deals well with certain brains under certain initial settings, but it does not perform well in general, especially when the initial treasury is large. We designed a Neural Network (NN) that we hope can adapt to perform better than the FSM. We did not have enough time to train the NN on the randomly generated cases, but the results of training the NN with the FSM already show a promising outlook for the second training step.

This project shows an approach to memorizing and mimicking game players' behaviors, which could help game companies improve their games after release: player AIs can be created for free by users during their game play and then used as training cases to improve the game AIs.

Developing NNs can be hard, because the unpredictability of AIs hides bugs deeply. We ran into several bugs that made the NN fail to evolve properly; we found most of them through breakpoint checks on data status and through code review. An efficient way to check whether an NN is implemented correctly is to train it repeatedly on a single fixed training pair and see whether it adapts as expected (a sketch of this check closes this report).

Link to the source code, this report, and the associated presentation slides: http://web.cecs.pdx.edu/~juncao/links/src/
Most of my code is in the files listed below:
The NN player AI class: NNPlayer.h, NNPlayer.cpp
The FSM player AI class: Player.h, Player.cpp
The NN code I borrowed and modified: backprop.h, backprop.cpp
My code of learning: JLLearning.h, JLLearning.cpp
Although I have code in other files, I don't think it's interesting. Please search "JL" for my comments and code.
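For reference, the single-pair check mentioned in the discussion can be as small as the listing below, reusing Net and trainStep from the sketch in the AI design section (again illustrative, not the project's actual test code). If the printed error does not steadily fall toward zero, the backpropagation implementation is suspect:

#include <cstdio>
#include <vector>

int main() {
    Net net;                                // from the earlier sketch
    std::vector<double> x(kIn, 0.5);        // one fixed input...
    std::vector<double> target(kOut, 1.0);  // ...and one fixed target
    for (int i = 0; i < 2000; ++i) {
        double err = trainStep(net, x, target, 0.1);
        if (i % 500 == 0) std::printf("iter %d  error %.4f\n", i, err);
    }
    return 0;
}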