Logan Yarnell, Steven Raines, Dean Antel
Team Zerg Rush: Learning Algorithms Applied to StarCraft

Abstract: The opening, or initial build order, during a StarCraft match is an integral part of a player's strategy. If an opening is chosen naively, it leaves a player vulnerable to tactics that can end the game very quickly. We examine three algorithms for choosing a "good" opener: a Bayesian network, a hill-climbing algorithm (originally planned as a genetic algorithm), and a pathfinding algorithm.

Introduction

Intelligence does not have a singular formal definition, so "artificial intelligence" can be a confusing term to understand. In this paper, the word intelligent will not refer to speed, memory, or even effectiveness. While these are important factors to consider in a program, the final test of intelligence will be whether the program is able to learn from its actions.

Robotic tacticians have still not caught up to us. Deep Blue may have warned the world of how formidable a computer can be, but board games such as Go and many computer games still leave artificial intelligences puzzled [1]. Deep Blue was also a warning to many other programmers: as soon as it won for the first time, it was dismantled, and there was never a rematch [2]. If Deep Blue had played again, its chances of winning would not have been high. The same is seen in the game of Go: a computer playing an opponent for the first time can achieve a high rank, but on a second game it drops to a ninth kyu [1]. Computers are fast, but they have trouble with the concepts of learning and adaptation. The purpose of this project is to remedy that.

StarCraft is a real-time strategy game. It requires both tactics and strategy to ensure victory, and expert-level gameplay often demands hundreds of actions per minute. We chose it as the medium for our learning algorithms because StarCraft AI is still a budding field, and the game is more complex than traditional board or card games.
Our algorithms focused on a narrow aspect of the game: build order. This is an important part of StarCraft strategy, since it determines all other aspects of how the game is played. The order in which buildings and units are made shows your opponent what you are capable of, and predicting your opponent's build order allows you to prepare a counter far in advance. Creating a good build order is a complex process, and we implemented a number of algorithms to help with this task.

Formal Project Statement

Create a program that obeys the rules of StarCraft and, as a player, creates a strategy utilizing a build order file such that, over the course of multiple games, its actions and play style change and adapt to a static opponent (i.e., one whose build order and strategy do not change), with the goal of defeating that opponent. The algorithm filters out useless information, assigns unique identification numbers to each unit seen, and arranges all of the seen units in a practical order. The algorithm will be tested with respect to its computational complexity in generating a build order, and its likelihood of defeating a static opponent.

Context

StarCraft AI development is not new. There are ongoing competitions and open-source libraries for developing your own bot. However, optimal solutions to this problem are elusive, and no single AI has been able to cement itself as superior. While some AIs have incorporated learning algorithms into their play style, using learning algorithms explicitly to develop new build orders and strategies is, to our knowledge, a novel concept. The current competitive AIs choose from preset build orders and strategies, but these are hard-coded. While some programs we found used information from old games to improve their own decision making, we did not see any programs that actively learned and created new strategies and build orders.
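As a concrete illustration of the build order file mentioned in the project statement, a minimal loader might look like the following. The one-name-per-line format and the BWAPI-style unit names are our assumption for illustration, not a fixed specification of the bot's actual file layout:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical build order file format: one unit or structure name per
// line, in the order it should be produced. Names follow BWAPI-style
// identifiers; the exact layout is our illustration.
public class BuildOrderFile {

    // Parse the contents of a build order file into an ordered list,
    // skipping blank lines.
    public static List<String> parse(String fileContents) {
        List<String> build = new ArrayList<>();
        for (String line : fileContents.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                build.add(trimmed);
            }
        }
        return build;
    }

    public static void main(String[] args) {
        String contents = "Terran_SCV\nTerran_Supply_Depot\nTerran_Barracks\nTerran_Marine\n";
        System.out.println(BuildOrderFile.parse(contents));
    }
}
```

Representing the build as an ordered list of strings keeps it easy to read from disk, mutate, and write back out.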
We obtained all of our information about other StarCraft AIs from the SSCAIT website [3].

Algorithm #1: Pathfinding

This algorithm is meant to find a build path from the limited information gained by scouting an enemy base. It remembers what an opponent has done and writes it to a list of observed builds. Compared to the true build order of the opponent, the algorithm was accurate in the order of units but not in the number of units, due to incomplete information and errors in our scouting code. It utilized Java's built-in intersection operation and a master list of all Terran units in order to arrive at a correct order. Due to the small size of n, this was an acceptable substitute for a formal sorting algorithm; Quicksort, or a similar algorithm, would have been preferable in terms of time complexity, but would have been more difficult to implement. In the future, we would like to replace the intersection operation with a formal sorting algorithm.

Algorithm #2: Bayesian Network

A Bayesian network approach to build order selection uses the conditional dependencies of the StarCraft technology tree to infer what types of units the opponent is building. The goal is to counter their units by producing a unit capable of dealing significant damage to them; StarCraft's varied unit types deal different levels of damage to an enemy unit depending on its type. Three networks are required, one corresponding to each race the opponent may be playing: Protoss, Zerg, or Terran.

The purpose of the network structure is to calculate the probability that the opponent has a specific building or unit, given what has been seen so far by scouting the enemy base and how much time has passed since the match started. Each node in the network has two possible values, true or false; a value of 'true' represents that the structure or unit has been seen so far during the match. Using the conditional dependencies in the network, the probability that an opponent possesses a certain unit or structure is the probability that the corresponding node in the network has the value true.

Due to StarCraft's fog of war, an agent only has visibility of the area directly surrounding its units and structures. Therefore a scout must be sent to the enemy base very early, and the enemy must be watched as much as possible throughout the match to determine what types of units they are constructing. This becomes more difficult as the match goes on: in longer games the scout is more likely to be killed before the enemy base is fully visible, due to the larger presence of enemy units.

The network is also unable to make accurate predictions in longer games because, the more time has passed since the match began, the more likely it is that the enemy has unlocked a larger portion of the technology tree, which raises the probability that the enemy has any number of the units in the network. Due to these difficulties, this is an effective method for early countering by build order selection, but later in the match it becomes increasingly ineffective.

Algorithm #3: Hill Climbing

The third step in the optimization of our StarCraft bot was to implement an algorithm that would aid in perfecting a set of build orders. It was originally planned that a mutagenic (genetic) strategy would be used. A solution, in this case a build order read in from a text file and represented as an ArrayList of strings, would be subjected to a mutation function that alters the build by adding, removing, or reordering a single unit in the text file.
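Such a mutation step can be sketched as follows. The unit pool, class name, and method names here are illustrative rather than our bot's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of the mutation step: randomly add, remove, or reorder a
// single unit in a build order. The candidate unit pool is a small
// hypothetical sample, not the full Terran unit list.
public class BuildMutator {
    private static final String[] UNIT_POOL = {
        "Terran_SCV", "Terran_Marine", "Terran_Medic", "Terran_Firebat"
    };
    private final Random rng;

    public BuildMutator(Random rng) { this.rng = rng; }

    // Returns a mutated copy; the original build is left untouched.
    public List<String> mutate(List<String> build) {
        List<String> copy = new ArrayList<>(build);
        switch (rng.nextInt(3)) {
            case 0: // add a random unit at a random position
                copy.add(rng.nextInt(copy.size() + 1),
                         UNIT_POOL[rng.nextInt(UNIT_POOL.length)]);
                break;
            case 1: // remove a randomly chosen unit, if any remain
                if (!copy.isEmpty()) copy.remove(rng.nextInt(copy.size()));
                break;
            default: // reorder: move one unit to a new position
                if (copy.size() > 1) {
                    String unit = copy.remove(rng.nextInt(copy.size()));
                    copy.add(rng.nextInt(copy.size() + 1), unit);
                }
        }
        return copy;
    }
}
```

Each call changes the build's length by at most one, which matches the "single unit" granularity described above.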
A crossover function would then mix sections of two builds based on markers indicating potential for combination; for instance, a Terran factory and machine shop in one build and a siege tank in another. After the completion of a match in which our bot utilized one of our build orders, the fitness value of the solution would be incremented upon a win and left untouched upon a loss. The number of times a build was mutated was also kept on record. The elitism function would eliminate solutions from the current generation of build orders based on the following scheme: where F is the fitness value of the build and N is the number of mutations the build has been subjected to, if F < 5 and N >= 10, the build is discarded from the set of solutions. Due to time constraints, the crossover function was never implemented. The resulting system was closer to a hill climbing algorithm, which improved builds by incrementally altering units.

The mutate function works by generating a random integer to determine whether to add a new unit, remove a unit, or simply reorder a unit in the build. In the event of a new unit being added, random integers are again generated to determine the type of unit and where it will be placed in the build order. Reorders and deletions of units in a build also use random integers in this way. A helper method, isBuildValid, was used to ensure that any proposed mutation to a build did not result in an unachievable strategy. For instance, if the mutate function placed a Terran marine in the build before a barracks, the marine would never be made and our bot would halt, leaving the build order incomplete. The random mutations to a build are repeated inside a while loop until isBuildValid returns true. This essentially makes the hill climbing algorithm a brute force solution. Since all decisions made by the algorithm are random, its complexity is difficult to quantify.
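A minimal sketch of what a validity check like isBuildValid could look like, assuming a small hypothetical prerequisite table (only a sample of StarCraft's real technology tree, with class and table names of our own invention):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the validity check: a unit's prerequisite structure must
// appear earlier in the build order, otherwise the unit can never be
// produced and the bot would stall.
public class BuildValidator {
    // Sample prerequisites; the real technology tree is much larger.
    private static final Map<String, String> PREREQ = new HashMap<>();
    static {
        PREREQ.put("Terran_Marine", "Terran_Barracks");    // marines need a barracks
        PREREQ.put("Terran_Siege_Tank", "Terran_Factory"); // tanks need a factory
    }

    public static boolean isBuildValid(List<String> build) {
        Set<String> seen = new HashSet<>();
        for (String unit : build) {
            String needed = PREREQ.get(unit);
            if (needed != null && !seen.contains(needed)) {
                return false; // prerequisite missing or appears too late
            }
            seen.add(unit);
        }
        return true;
    }
}
```

Because mutation is retried in a loop until this check passes, an invalid mutation simply costs another random attempt rather than producing a broken build.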
Data was gathered by altering the algorithm to mutate a build repeatedly for a set number of times. The runtimes taken to perform the mutations were recorded, as well as the minimum, maximum, and average number of times isBuildValid had to be checked. As expected with a brute force algorithm, the runtimes grew much faster than linearly, going from 86 ms at 100 mutations to 18,867 ms at 1,000 mutations. The minimum number of checks to isBuildValid was no surprise: at best, a build gets an executable mutation on the first try. However, it is interesting to note the steady decrease in the maximum number of checks to isBuildValid as the number of mutations increases. Build orders consistently grew in size and complexity as mutations occurred. Although the mutate function can remove a unit from the build, it may also leave the build's size unchanged or add a unit; because of this, build orders consistently grow in number of units instead of shrinking or staying the same size. As a build becomes more complex, it becomes very unlikely that any prerequisite for a new unit will be unfulfilled. Therefore, the larger the build order, the less likely it is for isBuildValid to be required more than once.

Since our bot was never able to win a match, it is impossible to say whether the hill climbing algorithm is successful at perfecting a strategy. However, there are already several obvious aspects of the algorithm to be addressed in the future. It would be beneficial to add numbers to units in build orders to indicate how many of a particular type of unit our bot should make. For instance, a line in a build order that reads Terran_Marine 20 would task our bot with making 20 marines. This added specificity would give the mutation function the ability to make fine-tuning adjustments to a build.
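The proposed counted-unit syntax could be parsed along these lines; this is a sketch, and the class and field names are hypothetical:

```java
// Sketch of the proposed counted-unit syntax: a line such as
// "Terran_Marine 20" would task the bot with producing 20 marines.
// A line without a count defaults to a single unit.
public class CountedBuildLine {
    public final String unit;
    public final int count;

    public CountedBuildLine(String unit, int count) {
        this.unit = unit;
        this.count = count;
    }

    public static CountedBuildLine parse(String line) {
        String[] parts = line.trim().split("\\s+");
        int count = parts.length > 1 ? Integer.parseInt(parts[1]) : 1;
        return new CountedBuildLine(parts[0], count);
    }
}
```

Defaulting to a count of 1 would keep existing single-unit build files valid under the extended format.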
The inclusion of a crossover function would also improve build order optimization by combining aspects of successful builds, turning the hill climbing solution into a true mutagenic (genetic) algorithm.

Results

Unfortunately, our testing did not give us the results we desired. The complexity of the problem was not fully understood until well into the project. While we succeeded at making three learning algorithms, the StarCraft environment proved so complex that executing actions mundane for a human player was beyond the scope of the project. The majority of our time on this project was spent learning the Java libraries necessary to interact with StarCraft, and implementing code that allowed the AI to accomplish simple tasks such as attacking, scouting, placing buildings, training units, and remembering enemy units. While we did not reach our destination, the journey allowed us to study a wide variety of algorithms. Dijkstra's algorithm was considered as our pathfinding algorithm, and its complexity was analyzed and found to be O(n^2) in the worst case. Many other algorithms, such as genetic and pattern matching algorithms, were also studied and analyzed for potential use in our final program. In the end we settled on intersection, hill climbing, and a Bayesian network to be formally added and tested. The final program was able to learn and adapt to an opponent, copy their techniques, and even make its own strategies.

Future Work

Each algorithm has room to improve, but overall we would like to expand the scope of the AI to include not only build orders, but also attack and command timings and placements. We would also like to implement a system for calculating the possible repercussions of different choices, such as the loss of minerals from choosing not to make an SCV at a certain time. This would increase the complexity of the AI's learning and its testability.
We would also like to improve the basic workings of our program to make it more practical in a tournament setting.

Questions

What is the computational complexity of Dijkstra's algorithm without optimization? O(n^2)
Is this a constraint satisfaction problem, or an optimization problem? Optimization
What probabilistic relationship is there between nodes in a Bayesian network that represents a StarCraft tech tree? Conditional dependence

Sources

[1] http://www.theguardian.com/technology/2006/aug/03/insideit.guardianweeklytechnologysection
[2] http://www.decodedscience.com/deepblue-a-landmark-in-artificialintelligence/23264/2
[3] http://www.sscaitournament.com/