Computational Explorations of Mancala Strategies
Daryl Benitez
SUNY Oswego
April 19, 2009
Abstract
Mancala is a simple board game with a very large number of possible outcomes. The complexity of
building a strategically capable game solver makes it fertile ground for explorations in
Artificial Intelligence. Using a rule-based system, an attempt was made to create a machine that
can play a mancala game from any game state.
Keywords: mancala, rule based, machine learning, AI, artificial intelligence.
Introduction
Game playing is an area that is heavily pursued in the field of Artificial Intelligence. Getting
a program to play as if it were human suggests that one day a computer could be taught to
learn and think as if it were human, in more than just a game context. The pursuit of a machine
that can successfully play a game of mancala at a near-human level was the goal of this project.
If the program could perform at a high enough level, then perhaps that would help in the
development of actual artificial life.
Game Background
Mancala is a game that originated in ancient Africa. Mancala is traditionally a two-player
game, although variations exist with more than two players; two is the minimum. The game, as we
know it today, uses a board that has a number of pits cut into it.
The number of pits varies with the style of game played. For the purpose of this project I chose
to keep the number of pits at six per player, which tends to be the typical size of a playing
board that one would buy in the United States. Along with the twelve player pits there is a
larger pit at each end of the board that serves as a goal pit. Each of a player's six pits starts
with four stones.
Play starts with what is considered to be the bottom player. This player chooses a pit to
move stones from. The stones are removed from that pit and are distributed one at a time into
consecutive pits in the direction of that player's goal pit. If the last stone lands in the goal
pit, that player gets to go again. If the chosen pit holds enough stones that the player wraps
around the board, the player skips the opposing player's goal pit. If play wraps all the way
around the board and the player drops a stone into one of his own pits that was empty, that
player may capture the stones in the opposing player's pit directly across from it. This cross
capture does not require circling the board; it also occurs whenever the last stone simply lands
in an empty pit on your own side.
The game ends when either player has no stones left in their pits. The goal is to finish with
more stones in your goal pit, plus any remaining stones on your side, than the opposing player
has. Many variations of the game exist, but these are the rules used when creating my machine
solver.
Related Research
Mancala is a game that involves multiple players. Since mancala uses pits and stones, the game
is considered a sowing game (Erickson 1996). The outcome of the game can be influenced by who
goes first. Initiative is defined as the action of the first player, and having the initiative
is a clear advantage under the condition that the board is large enough (Uiterwijk 2007). Since
one player has to go first and there is a set number of pits and stones, it is possible to
determine the endgame result if the number of stones in each pit is tracked over time (Broline
2008). However, this seems somewhat misleading, as there are so many possible moves that may be
made in the game. So is it really possible to determine the outcome of the game? The fact that a
single move affects pits across the board makes it difficult to foresee the consequences of even
a few moves ahead (Donkers 2001).
Looking ahead at possible outcomes is a large part of what we as humans do, and it seems
necessary for a computer to do the same if the machine is intended to learn. This means the
machine must use prediction theory, which allows statements to be made about future error
(Langford 2005).
Given all of this, it is important to note that game theory and strategy play an important role
in mancala. Game theory is meant to predict what people will do (Camerer 2003). When dealing
with a machine it is impossible to predict what the machine will do; however, it may be possible
for the machine to predict what the other player will do. Game theory is concerned with the
actions of decision makers who are conscious that their actions affect each other (Rasmusen
2006). It seems difficult for a machine to actually learn how to win a mancala game. In order
for this program to be considered a learning machine, it needs to improve its performance based
on the experience it has or that is provided (Mitchell 1997).
Visual Representation
The board was generated in LISP using simple dashes, underscores, and vertical bars. The
following is an example of how the board looks when it is generated.
|--------------------------------------------------------------------------|
|          |-----|  |-----|  |-----|  |-----|  |-----|  |-----|             |
|          |  4  |  |  4  |  |  4  |  |  4  |  |  4  |  |  4  |             |
|          |-----|  |-----|  |-----|  |-----|  |-----|  |-----|             |
|           *T1*     *T2*     *T3*     *T4*     *T5*     *T6*               |
|  |-----|                                                      |-----|     |
|  |  0  |                                                      |  0  |     |
|  |-----|                                                      |-----|     |
|          |-----|  |-----|  |-----|  |-----|  |-----|  |-----|             |
|          |  4  |  |  4  |  |  4  |  |  4  |  |  4  |  |  4  |             |
|          |-----|  |-----|  |-----|  |-----|  |-----|  |-----|             |
|  *TOP*    *B6*     *B5*     *B4*     *B3*     *B2*     *B1*     *BOT*     |
|--------------------------------------------------------------------------|
Process
This project was pursued on a step-by-step basis. The project was broken into various small
parts so that progress could be demonstrated along the way; without that breakdown it would have
seemed as though the project was making little to no progress. The steps are as follows:
Visualization, Human Playing, Random Playing, Analysis, Rule Base Generation, Learning, and a
Final Build.
The visualization step entailed creating the board and populating it with stones. This step,
although basic, was the foundation of the entire project. Human and random playing involved
actually implementing the game so that a human could play against a human and a random player
could play against another random player. A random player is simply one that selects its move at
random from the legal moves. After these two parts were implemented, they were combined so that
a human could play against a random player.
Analysis of the human and random parts meant computing the winning percentage of the bottom
player; in a game between a random player and a human, the bottom player in the program is
always the random player. This was calculated by running many games and computing the
percentages of each outcome. With this analysis it was possible to generate a rule base of
winning moves so that a learning machine could use it to play the game. The final build was a
matter of putting everything together and making small efficiency improvements. Unfortunately
the program is still quite inefficient and takes quite a bit of time to generate larger rule
bases.
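The following is a rough sketch of the kind of analysis performed in this step. It is
illustrative only: the helper PLAY-RANDOM-GAME is a hypothetical name assumed to play one
random-versus-random game and report the result from the bottom player's point of view.
(defun outcome-percentages (runs)
  ;; Sketch only. PLAY-RANDOM-GAME is assumed to return one of the symbols
  ;; W, L, or D for the bottom player; it is not the project's actual function.
  (let ((wins 0) (losses 0) (draws 0))
    (dotimes (i runs)
      (case (play-random-game)
        (w (incf wins))
        (l (incf losses))
        (d (incf draws))))
    ;; Same shape as the output shown later in the Results section.
    (list (list 'w (/ wins runs 1.0))
          (list 'l (/ losses runs 1.0))
          (list 'd (/ draws runs 1.0)))))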
Implementation: How did that Happen?
Using various methods I was able to create a machine that could play out a mancala game state.
LISP was used to implement these methods. At first a class system was used: a board class, one
of whose parameters was its cells, was created and populated with cells. These cells, a separate
class in themselves, were then populated with stones. The cell class held references to the next
cell and to the cell directly across from the selected cell.
After much thought it was decided to scrap the class structure. What was eventually settled on
was the use of association lists. An association list is simply a list in which each element is
itself a two-element list; the two elements are associated with each other.
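For illustration, the board state might be held in an association list keyed by pit name. The
key names below are hypothetical and may differ from the ones actually used in the project.
;; A hypothetical association list for the board. Each element is a
;; two-element list pairing a pit with the number of stones it holds.
(defparameter *board*
  '((b1 4) (b2 4) (b3 4) (b4 4) (b5 4) (b6 4) (bot 0)
    (t1 4) (t2 4) (t3 4) (t4 4) (t5 4) (t6 4) (top 0)))

(second (assoc 'b3 *board*))  ; => 4, the stone count associated with pit B3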
The following are pseudo-code examples of the various methods used to implement the machine.
This method sets the moves for each player, both the learning player and the human player. The
method is slightly different if a random machine is used instead of a human: a random move is
used in place of a human-selected move.
(method make-move ()
(generate the rule base)
(select a rule to use)
(check to make sure the game isn’t over)
(use the rule for learning player)
(have a human select a move for themselves)
)
This method actually moves the stones around the board. It is basically the same for the random,
human, and learning players.
(method move-stones ()
(check whose turn it is then change the next reference accordingly)
(check to see if the stones have all been moved out of the selected pit)
(switch whose turn it is)
(return to the make move method so that the other player can move)
(add one to the next pit)
(advance to the pit after the next one)
(recursively call the method until all the stones have been moved)
)
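To make the sowing step concrete, the following is a minimal sketch over a flat 14-slot board
rather than the association-list representation used in the project; the function names and
indexing are illustrative assumptions only.
;; Illustrative only: indices 0-5 = bottom player's pits, 6 = bottom goal,
;;                     7-12 = top player's pits, 13 = top goal.
(defun make-board ()
  (let ((board (make-array 14 :initial-element 4)))
    (setf (aref board 6) 0
          (aref board 13) 0)
    board))

(defun sow (board pit bottom-player-p)
  "Sow the stones in PIT one at a time toward the mover's goal, skipping the
opponent's goal pit. Returns the index of the pit that received the last stone."
  (let ((stones (aref board pit))
        (skip   (if bottom-player-p 13 6))  ; never drop into the opponent's goal
        (pos    pit))
    (setf (aref board pit) 0)
    (loop while (> stones 0)
          do (setf pos (mod (1+ pos) 14))
             (unless (= pos skip)
               (incf (aref board pos))
               (decf stones)))
    pos))
From the opening position, (sow (make-board) 2 t) drops its last stone into the bottom goal
(index 6), which under the rules above earns the bottom player another turn.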
This method generates the rule base for the “learning” machine to use.
(method generate-rule-base ()
(initialize the association lists)
(play a random game until a winner is found)
(check to see if the bottom player won)
(if the bottom player won then add the rule to the rule base)
(if the rule base has fewer rules in it than wanted then recursively call the method)
)
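A sketch of this generation step follows. PLAY-RANDOM-GAME is again a hypothetical helper,
assumed here to return two values, the winner and the list of pits the bottom player chose;
neither the name nor the interface is taken from the project.
;; Illustrative rule-base generation, mirroring the recursive structure above.
(defun build-rule-base (wanted &optional (rule-base '()))
  (if (>= (length rule-base) wanted)
      rule-base
      (multiple-value-bind (winner bottom-moves) (play-random-game)
        (build-rule-base wanted
                         (if (eq winner 'bottom)
                             (cons bottom-moves rule-base)  ; keep only winning games
                             rule-base)))))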
This method actually uses the rule base. What good is a rule base that never actually gets
used?
(method use-rule-base ()
(make sure the rule base isn’t empty)
(if it is empty call generate rule base method)
(call the select rule base method)
(set the first element in the selected rule base as the move)
(remove the move from the selected rule base)
(call the appropriate move stones method)
)
This method checks for the possibility of a cross capture and if one exists on the next move then
it will take the appropriate action.
(method cross-capture ()
(check to see if there is only one stone left to move)
(if there is then check to see if the pit across from the next pit to receive a stone actually
has stones in it)
(if that pit isn’t empty then it checks to see if the next pit is empty)
(if all the conditions exist then a cross capture happens)
(if these conditions don’t exist then the method is exited)
)
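A minimal sketch of the cross-capture test is below, reusing the flat 14-slot board from the
earlier sowing sketch (bottom pits 0-5, top pits 7-12, so the pit across from bottom pit N is
top pit 12 - N). This is an illustrative assumption, not the project's method.
(defun cross-capture-p (board next-pit stones-left)
  (and (= stones-left 1)                         ; only the final stone remains
       (<= 0 next-pit 5)                         ; it will land on the bottom player's side
       (zerop (aref board next-pit))             ; in a pit that is currently empty
       (plusp (aref board (- 12 next-pit)))))    ; and the opposing pit holds stones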
Rule Based: A Method to the Madness.
The implementation of a learning machine that played mancala was the primary goal of this
project. To accomplish this, a rule base was chosen as the method used to create the machine. A
rule base here is a collection of sequences of legal mancala moves that resulted in a win for
the bottom player. The actual rule base ended up being a list whose elements are themselves
lists containing all of the bottom player's moves from one winning game.
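As a concrete example of this structure, a rule base of three rules might look like the
following; the contents shown here are hypothetical, since the real lists are generated at run
time from random play.
;; Each inner list is the sequence of pits the bottom player chose in one
;; randomly played game that it went on to win.
(defparameter *rule-base*
  '((b3 b1 b6 b2 b5 b4 b3)
    (b1 b4 b6 b2 b3 b5 b1 b2)
    (b5 b2 b6 b1 b3 b4)))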
To generate this rule base, a method was created that runs other methods which play the game
using random players. If the result was a win for the bottom player, then the moves the bottom
player made were extracted from the list of moves, placed into their own list, and added as a
new rule to the rule base.
A rule base was used, in contrast to other methods, because it contains only moves that are
known to have resulted in a win. Using those known moves, the machine could potentially learn to
consistently win mancala games against both random and human opponents.
Results: It Worked...or Did it?
The program was run many times to obtain results that would hopefully show that the program had
achieved some form of learning. Running just one thousand games before learning and one thousand
after learning produced an increase of about nine percentage points in the bottom player's win
rate. The demo below shows this:
[2]> (generate-stats 1000 t)
Players...
Bottom Player: Random Player
Top Player: Random Player
Stats before learning...
((w 0.371) (l 0.338) (d 0.291))
Trying to learn
Bottom Player: Learning Player
Top Player: Random Player
Stats after learning...
((w 0.463) (l 0.302) (d 0.235))
NIL
The program was run again, this time with ten thousand games. Unfortunately the post-learning
win percentage showed no real improvement over the smaller run, and the loss percentage actually
increased, as shown below:
[2]> (generate-stats 10000 t)
Players...
Bottom Player: Random Player
Top Player: Random Player
Stats before learning...
((w 0.329) (l 0.317) (d 0.354))
Trying to learn
Bottom Player: Learning Player
Top Player: Random Player
Stats after learning...
((w 0.459) (l 0.325) (d 0.216))
NIL
For every turn a player takes there are up to six different pits to choose from. So even if the
machine follows the rule base exactly, the opposing player still has up to six options on most
of its turns; over just five opposing turns with six choices each there are 6^5 = 7,776 distinct
sequences, only one of which matches the game a given rule was recorded from. If the top player
changes even one move during play, the outcome of the game can change significantly. It has
become apparent that it may not be possible to create a learning machine using a rule base alone
when the opposing player has so many possible moves.
Perhaps using a rule base in conjunction with another method would give the machine a better
chance to learn. The possibility of the opposing player choosing any pit to move from needs to
be accounted for. Also, once a move is made the board is in a completely new state and the state
before the move is destroyed. This destructive state space means that any move other than the
one in the rule base produces a board state different from the ones the rule base was built
from. If the state of the board is different, it is not possible to reach the same outcome the
rule originally produced unless, by some fluke, the following moves happen to lead to the same
result. Such an outcome is very unlikely and would be entirely due to chance if it happened.
Future Work
With the current build, more efficient code could make the generation and storage of rules and
states much easier. Currently it takes quite a bit of time, sometimes up to a few minutes, to
generate a small rule base of only five rules, although part of this could be due to the server
load experienced at the time. If the rule base were stored persistently, the program would not
necessarily need to generate a rule base before each run, but then the time spent writing to and
loading from the rule base could become a factor.
Since I was unable to complete a machine that demonstrated learning, a future implementation
would add new methods of learning to the game. Perhaps a genetic algorithm could be used to pick
the best course of play, or a neural network of some sort could be used to play the game.
One step that would definitely be looked into is assigning a weight to each type of move. An
exact capture, for example, would get the highest weight because it allows the player to go
again. A cross capture could be next, followed by moves that set up a future exact capture or
cross capture. Defensive moves could come after that: the fewer stones the player allows the
opposing player to capture, the better. Finally, if all else fails, the machine would simply
move from a random pit, although with all of the previous cases implemented a situation where a
random move is needed may never actually arise.
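A sketch of how such a priority ordering might be expressed is below. The helper functions
(EXACT-CAPTURE-MOVES, CROSS-CAPTURE-MOVES, SETUP-MOVES, DEFENSIVE-MOVES, RANDOM-LEGAL-MOVE) are
hypothetical names, each assumed to return candidate pits for PLAYER on BOARD; none of them
exist in the current build.
(defun choose-weighted-move (board player)
  (or (first (exact-capture-moves board player))  ; go-again moves first
      (first (cross-capture-moves board player))  ; then cross captures
      (first (setup-moves board player))          ; then moves that set up a capture
      (first (defensive-moves board player))      ; then moves that limit the opponent
      (random-legal-move board player)))          ; otherwise a random legal move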
Conclusion
In hindsight, perhaps a better method of learning could have been chosen. A rule base does not
seem like a sufficient learning mechanism for a game such as this. The number of possible moves,
combined with a destructive state space, means that a different method may be quite a bit more
successful. Although a machine that learned was not realized, a great deal was learned in terms
of LISP programming and Artificial Intelligence, so this project was by no means a failure.
It is now known that a rule base alone is not necessarily sufficient for building a learning
machine that plays mancala. Without actually attempting this project I may never have known how
to implement mancala. Future work will take place on this project, as it is much too intriguing
to simply set aside and forget.
References
Broline, Duane and Loeb, Daniel. "The Combinatorics of Mancala-Type Games: Ayo, Tchoukaillon."
1:41. February 2008.
Camerer, Colin. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton:
Princeton University Press. 2003.
Donkers, Jeroen, Uiterwijk, Jos, and de Voogt, Alex. "Mancala Games - Topics in Mathematics and
Artificial Intelligence." Journal of Machine Learning. 2001.
Erickson, Jeff. "Sowing Games." Games of No Chance. Volume 29. 1996. 287-297.
Langford, John. "Tutorial on Practical Prediction Theory for Classification." Journal of Machine
Learning Research. March 2005.
Mitchell, Thomas. Machine Learning. McGraw-Hill. 1997.
Rasmusen, Eric. Games and Information: An Introduction to Game Theory. Wiley-Blackwell. 2006.
Sauermann, Henry. "Games of Strategy." Diss. Duke University. 1999.
Uiterwijk, J.W.H.M. and van den Herik, H.J. "The Advantage of the Initiative." Information
Sciences International Journal. 27 February 2007.
Hamilton, Clarence. "Vectorial and Mancala-Like Games, Apparatus and Methods." U.S. Patent
4,569,526. February 1986. Brooklyn, NY. Retrieved 21 February 2009.
<http://www.freepatentsonline.com/4569526.html>