Team Zerg Rush:
Learning Algorithms Applied to StarCraft

Logan Yarnell
Steven Raines
Dean Antel
Abstract:
The opening, or initial build order, during a StarCraft match is an integral part of a player's strategy. If an opening is chosen naively, it leaves a player vulnerable to tactics that can end the game very quickly. We examine three algorithms for choosing a "good" opener: a pathfinding algorithm, a Bayesian network, and a hill-climbing algorithm (originally planned as a genetic algorithm).
Introduction
Intelligence does not have a singular formal definition, so "artificial intelligence" can be a confusing term to understand. In this paper, the word intelligent will not refer to speed, memory, or even effectiveness. While these are important factors to consider in a program, the final test of intelligence will be whether the program is able to learn from its actions. Robotic tacticians have still not caught up to us. Deep Blue may have shown the world how formidable a computer can be, but board games such as Go and other computer games still have many artificial intelligences puzzled [1]. Deep Blue was also a warning to other computer programmers: as soon as it won for the first time, it was dismantled, and there was never a rematch [2]. If Deep Blue had played again, its chances of winning would not have been high. This is also seen in the game of Go, where a program may place at a high rank in its first game but drop to ninth kyu when it plays a second time [1]. Computers are fast, but they have trouble with the concepts of learning and adaptation. The purpose of this project is to remedy that.
StarCraft is a real-time strategy game. It requires both tactics and strategy to ensure victory, and expert-level gameplay often requires hundreds of actions per minute. We chose it as the medium for our learning algorithms because StarCraft AI is still a budding field and the game is more complex than traditional board or card games.
Our algorithms focused on a narrow aspect of the game: build order. This is an important part of StarCraft strategy, since it determines all other aspects of how the game is played. The order in which buildings and units are made shows your opponent what you are capable of, and predicting your opponent's build order allows you to prepare a counter far in advance. Creating a good build order is a complex process, but we have implemented a number of algorithms to help with this task.
Formal Project Statement
Create a program that obeys the rules of StarCraft and, as a player, creates a strategy utilizing a build order file such that, over the course of multiple games, its actions and play style change and adapt to a static opponent (i.e., one whose build order and strategy do not change), with the goal of defeating that opponent.
The algorithm filters out useless information, assigns a unique identification number to each unit seen, and arranges all of the seen units in a practical order.
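To illustrate the bookkeeping this implies, the following is a minimal Java sketch, not our exact implementation, of remembering scouted enemy units: each newly seen unit type receives a unique identification number, and insertion order is preserved so the collection reflects the order in which units were first observed. The class and method names are illustrative assumptions.

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class SeenUnits {
        // Maps each unit type to the ID it was assigned when first seen;
        // LinkedHashMap preserves first-observation order.
        private final Map<String, Integer> seen = new LinkedHashMap<>();
        private int nextId = 0;

        public void record(String unitType) {
            // A repeated sighting of the same type keeps its original ID.
            seen.computeIfAbsent(unitType, t -> nextId++);
        }

        public static void main(String[] args) {
            SeenUnits s = new SeenUnits();
            s.record("Terran_Barracks");
            s.record("Terran_Marine");
            s.record("Terran_Marine"); // duplicate type keeps its first ID
            System.out.println(s.seen); // {Terran_Barracks=0, Terran_Marine=1}
        }
    }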
The algorithm will be tested with respect to its computational complexity in generating a build order, and its likelihood of defeating a static opponent.
Context
StarCraft AI development is not new. There are ongoing competitions and open-source libraries for developing your own. However, optimal solutions to this problem are elusive, and no one AI has been able to cement itself as superior. While some AIs have incorporated learning algorithms into their play style, using learning algorithms explicitly to develop new build orders and strategies is, to our knowledge, a novel concept. The current competitive AIs use preset build orders and strategies that they can choose from, but these are hard-coded. While some programs we found used information from old games to improve their own decision-making capabilities, we did not see any programs that actively learned and created new strategies and build orders. We obtained all of our information about other StarCraft AIs from the SSCAIT website [3].
Algorithm #1: Pathfinding
This algorithm is meant to find a build path from the limited information given by scouting an enemy base. It remembers what an opponent has done and writes it to a list alongside other builds. Compared to the true build order of the opponent, the algorithm was accurate in terms of the order of units, but not the number of units, due to incomplete information and errors in our scouting code.
It utilized Java's built-in intersection operation and a master list of all Terran units in order to achieve a correct ordering. Due to the small size of n, this was an acceptable substitute for a formal sorting algorithm. Quicksort, or a similar algorithm, would have been preferable in terms of computation time, but would have been more difficult to implement.
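As a sketch of the approach, the example below assumes the intersection is taken with Java's Collection.retainAll, which keeps only the observed units while preserving the master list's ordering; the unit names and list contents are illustrative, not our exact master list.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class BuildOrderIntersection {
        public static void main(String[] args) {
            // Hypothetical master list of Terran units in tech-tree order.
            List<String> masterOrder = new ArrayList<>(Arrays.asList(
                "SCV", "Marine", "Medic", "Vulture", "Siege_Tank"));

            // Units observed while scouting, in no particular order.
            List<String> observed = Arrays.asList("Siege_Tank", "Marine", "SCV");

            // retainAll keeps only the observed units while preserving the
            // master list's ordering, yielding a correctly ordered build path.
            masterOrder.retainAll(observed);
            System.out.println(masterOrder); // [SCV, Marine, Siege_Tank]
        }
    }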
In the future, we would like to replace the intersection operation with a formal sorting algorithm.
Algorithm #2: Bayesian Network
A Bayesian network approach to build order selection uses the conditional dependencies of the StarCraft technology tree to make an inference about what types of units the opponent is building. The goal is to counter those units by producing a unit capable of dealing significant damage to them, since StarCraft's varied unit types deal different levels of damage to an enemy unit depending on its type. Three networks are required, one corresponding to each race the opponent may be playing: Protoss, Zerg, or Terran.
The purpose of the network structure is to calculate the probability that the opponent has a specific building or unit, given what has been seen so far by scouting the enemy base and how much time has passed since the match started. Each node in the network has two possible values, true or false. A value of true represents that the structure or unit has been seen so far during the match. Using the conditional dependencies in the network, the probability that an opponent possesses a certain unit or structure is the probability that the corresponding node in the network has the value true.

Due to StarCraft's fog of war, an agent only has visibility of the area directly surrounding its own units and structures. Therefore a scout must be sent to the enemy base very early, and the enemy must be watched as much as possible throughout the match to determine what types of units they are constructing. This becomes more difficult as the match goes on: in longer games the scout is more likely to be killed before the enemy base is fully visible, due to the larger presence of enemy units. The network is also less able to make accurate predictions in longer games because, as more time passes, it is increasingly likely that the enemy has unlocked a larger portion of the technology tree, which gives a higher probability that the enemy has any number of the units in the network. Due to these difficulties, this is an effective method for early countering through build order selection, but later in the match it becomes increasingly ineffective.
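A minimal sketch of the kind of inference involved, assuming a simplified two-node fragment of the Terran tree (a Factory requires a Barracks) with illustrative probability values rather than our actual network:

    public class TechTreeInference {
        public static void main(String[] args) {
            double pBarracks = 0.9;               // prior: P(Barracks = true)
            double pFactoryGivenBarracks = 0.6;   // P(Factory | Barracks)
            double pFactoryGivenNoBarracks = 0.0; // a Factory requires a Barracks

            // Marginal probability the opponent has a Factory,
            // summing over the unobserved Barracks node.
            double pFactory = pFactoryGivenBarracks * pBarracks
                            + pFactoryGivenNoBarracks * (1 - pBarracks);
            System.out.printf("P(Factory) = %.2f%n", pFactory);

            // If a scout actually sees a Barracks, that evidence node is
            // clamped to true and the estimate rises accordingly.
            System.out.printf("P(Factory | Barracks seen) = %.2f%n",
                              pFactoryGivenBarracks);
        }
    }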
Algorithm #3: Hill Climbing
The third step in the optimization of our StarCraft bot was to implement an algorithm which would aid in perfecting a set of build orders. It was originally planned that a mutagenic strategy would be used. A solution, in this case a build order read in from a text file and represented as an array list of strings, would be subjected to some mutation function which would alter the build by adding, removing, or reordering a single unit in the file. A crossover function would then mix sections of two builds based on markers indicating potential for combination, for instance, a Terran factory and machine shop in one build and a siege tank in another. After completion of a match wherein our bot utilized one of our build orders, the fitness value of the solution would be incremented upon a win and left untouched upon a loss. The number of times a build was mutated was also kept on record. The elitism function would eliminate solutions from the current generation of build orders based on the following scheme:
Where F is the fitness value of the build and N is the number of mutations the build has been subjected to, if F < 5 and N >= 10 then the build is discarded from the set of solutions.
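A minimal sketch of this elimination rule, where the Build class and its field names are illustrative assumptions rather than our exact representation:

    import java.util.ArrayList;
    import java.util.List;

    public class Elitism {
        // Illustrative stand-in for a tracked build order.
        static class Build {
            int fitness;   // F: wins recorded for this build
            int mutations; // N: times the build has been mutated
            Build(int fitness, int mutations) {
                this.fitness = fitness;
                this.mutations = mutations;
            }
        }

        public static void main(String[] args) {
            List<Build> generation = new ArrayList<>();
            generation.add(new Build(7, 12)); // kept: fitness high enough
            generation.add(new Build(2, 15)); // culled: F < 5 after N >= 10
            // The elimination rule: discard builds with F < 5 once N >= 10.
            generation.removeIf(b -> b.fitness < 5 && b.mutations >= 10);
            System.out.println(generation.size() + " build(s) survive");
        }
    }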
Due to time constraints, the crossover function was never implemented. The resulting system was more of a hill-climbing algorithm, which improved builds by incrementally altering units.
The mutate function works by
generating a random integer to
determine whether to add a new unit,
remove a unit, or simply reorder a unit in
the build. In the event of a new unit
being added, random integers are again
generated to determine the type of unit
and where it will be placed in the build
order. Reorders and deletions of units in
a build also used random integers in this
way. A helper method, isBuildValid,
was used to ensure that any proposed
mutations to a build did not result in an
unachievable strategy. For instance, if
the mutate function placed a Terran
marine in the build before a barracks, the
marine would never be made and our bot
would halt, leaving the build order
incomplete. The random mutations to a build are repeated inside a while loop until isBuildValid returns true, which essentially makes the hill climbing algorithm a brute-force solution. Since all decisions made by the algorithm are random, the complexity is difficult to quantify. Data was gathered by altering the algorithm to mutate a build repeatedly for a set number of iterations.
The runtimes taken to perform the mutations were recorded, as well as the minimum, maximum, and average number of times isBuildValid had to be checked.
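A minimal sketch of the mutate/validate loop described above, where the unit pool and the simplified prerequisite check inside isBuildValid are illustrative assumptions rather than our exact implementation:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    public class BuildMutator {
        private static final Random RNG = new Random();
        private static final String[] UNIT_POOL = {
            "Terran_SCV", "Terran_Barracks", "Terran_Marine"};

        static List<String> mutate(List<String> build) {
            List<String> candidate;
            do {
                candidate = new ArrayList<>(build);
                int choice = RNG.nextInt(3); // pick a mutation type at random
                if (choice == 0 || candidate.isEmpty()) {
                    // Add a random unit at a random position.
                    candidate.add(RNG.nextInt(candidate.size() + 1),
                                  UNIT_POOL[RNG.nextInt(UNIT_POOL.length)]);
                } else if (choice == 1) {
                    // Remove a randomly chosen unit.
                    candidate.remove(RNG.nextInt(candidate.size()));
                } else {
                    // Reorder: move a randomly chosen unit elsewhere.
                    String unit = candidate.remove(RNG.nextInt(candidate.size()));
                    candidate.add(RNG.nextInt(candidate.size() + 1), unit);
                }
            } while (!isBuildValid(candidate)); // brute-force retry until valid
            return candidate;
        }

        // Simplified prerequisite check: a Marine may not appear
        // before a Barracks in the build.
        static boolean isBuildValid(List<String> build) {
            boolean barracksSeen = false;
            for (String unit : build) {
                if (unit.equals("Terran_Barracks")) barracksSeen = true;
                else if (unit.equals("Terran_Marine") && !barracksSeen) return false;
            }
            return true;
        }

        public static void main(String[] args) {
            List<String> build =
                new ArrayList<>(List.of("Terran_SCV", "Terran_Barracks"));
            System.out.println(mutate(build));
        }
    }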
As expected with a brute-force algorithm, the runtimes grow far faster than linearly, going from 86 ms at 100 mutations to 18,867 ms at 1,000 mutations.
The minimum number of checks to
isBuildValid was no surprise. At best, a
build will get an executable mutation on
the first try. However, what is interesting
to note is the steady decrease in the
maximum number of checks to
isBuildValid as the number of mutations
is increased. Build orders consistently
grew in size and complexity as
mutations occurred. Although the mutate function can remove a unit from a build, it can equally leave the unit count unchanged (a reorder) or add a unit, and in practice the build orders consistently grew in number of units rather than shrinking or staying the same size. As a build becomes more and more complex, it becomes very unlikely that any necessary prerequisites for a new unit will be unfulfilled. Therefore, the larger the build order, the less likely it is that isBuildValid will be required more than once.
Since our bot was never able to win a
match, it is impossible to say whether or
not the hill climbing algorithm is
successful at perfecting a strategy.
However, there are already several
obvious aspects of the algorithm to be
addressed in the future. It would be
beneficial to add numbers to units in
build orders to indicate how many of a
particular type of unit our bot should
make. For instance, a line in a build
order that reads Terran_Marine 20
would task our bot with making 20
marines. This added specificity would give the mutation function the ability to make fine-tuning adjustments to a build.
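A minimal sketch of how such a line might be parsed, with the syntax and the default count of one being assumptions about the proposed format:

    public class BuildLineParser {
        public static void main(String[] args) {
            // Proposed syntax: "Terran_Marine 20" means train 20 marines.
            String line = "Terran_Marine 20";
            String[] parts = line.trim().split("\\s+");
            String unit = parts[0];
            int count = parts.length > 1 ? Integer.parseInt(parts[1]) : 1;
            System.out.println(unit + " x" + count); // Terran_Marine x20
        }
    }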
The inclusion of a crossover function
would also improve build order
optimization by combining aspects of
successful builds, thus making the hill
climbing solution into a true mutagenic
algorithm.
Results
Unfortunately, our testing did not give us the results we desired. The complexity of the problem was not fully understood until well into the project. While we succeeded at making three learning algorithms, the StarCraft environment proved to be complex to the point that executing actions that are mundane for a human was beyond the scope of the project. The majority of our time working on this project was spent learning the Java libraries necessary to interact with StarCraft and implementing code that allowed the AI to accomplish simple tasks such as attacking, scouting, placing and constructing buildings, training units, and remembering enemy units.
While our destination was not reached, the journey allowed us to study a wide variety of algorithms. Dijkstra's algorithm was considered as our pathfinding algorithm, and its complexity was analyzed and found to be O(n^2) in the worst case, since the unoptimized form scans every vertex on each of its n extract-min steps. Many other algorithms, such as genetic and pattern-matching algorithms, were also studied and analyzed for potential in our final program. In the end we settled on intersection, hill climbing, and a Bayesian network to be formally added and tested. The final program was able to learn from and adapt to an opponent, copy its techniques, and even make its own strategies.
Future work
Each algorithm had room to improve,
but overall, we would like to expand the
scope of the AI to include not only build
orders, but attack and command timings
and placements. We would also like to
implement a system for calculating
possible repercussions of different
choices, such as calculating the loss of
minerals from choosing to not make an
SCV at a certain time. This would
increase the complexity of the AI’s
learning and testability. We would also
like to improve the basic workings of
our program to make it more practical in
a tournament setting.
Questions
What is the computational complexity of Dijkstra's algorithm without optimization?
O(n^2)
Is this a constraint satisfaction problem,
or an optimization problem?
Optimization
What probabilistic relationship is there
between nodes in a Bayesian network
that represents a StarCraft tech tree?
Conditional dependence
Sources
[1] http://www.theguardian.com/technology/2006/aug/03/insideit.guardianweeklytechnologysection
[2] http://www.decodedscience.com/deep-blue-a-landmark-in-artificial-intelligence/23264/2
[3] http://www.sscaitournament.com/