Playing Tic-Tac-Toe with Neural Networks
Zachary McNellis
CPSC 4820
What is a robot?
• Sense, think, act
Sense, Think, React
 Robotics technology consists of mechanisms that can:
    Sense – Feedback devices (sensors) allow information about the environment to be recorded
    Think – Information is processed in some way (simple or complex)
    Act – Most obvious part of a robot. However, it can be anything from outputting a value to making the robot walk
Creating a tic-tac-toe engine
• Board representation
• What move to make?
• Win, Lose, Draw
Board Representation
 Example state: 3 0 2 1 1 1 1 1 1 1 1 1
 First three values: dimension, player 1, player 2
 Remaining values: 9 board positions, labeled 1-9
 Cell values
    Player 1: 0
    Player 2: 2
    Empty: 1
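The encoding above can be unpacked in a few lines. This is a minimal sketch, assuming the values are separated by whitespace and that the x / o / - symbols used later in the slides are an acceptable way to display the board; `parse_state` and `render` are hypothetical helper names, not part of the original engines.

```python
def parse_state(line):
    """Split '3 0 2 1 1 ...' into (dimension, player 1 mark, player 2 mark, cells)."""
    values = [int(v) for v in line.split()]
    dim, p1, p2 = values[:3]
    cells = values[3:]
    if len(cells) != dim * dim:
        raise ValueError("expected %d cells, got %d" % (dim * dim, len(cells)))
    return dim, p1, p2, cells


def render(dim, p1, p2, cells):
    """Draw the board with x for player 1, o for player 2, - for empty."""
    symbols = {p1: "x", p2: "o"}
    rows = []
    for r in range(dim):
        row = cells[r * dim:(r + 1) * dim]
        rows.append("".join(symbols.get(c, "-") for c in row))
    return "\n".join(rows)
```

Keeping the cells as a flat list means position n on the slide is simply index n - 1.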
What Move to Make?
 What does a tic-tac-toe engine do?
 Input: board state
    3 0 2 1 1 1 1 1 1 1 1 1
    Ex. echo "3 0 2 1 1 1 1 1 1 1 1 1" | ./my_engine
 Output: next move
    Avoid collisions (never pick an occupied position)
    Ex. “5”
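The input/output contract above fits in one small function. This is a sketch only: `respond` and `legal_moves` are hypothetical names, and the default policy (take the lowest-numbered empty position) is a placeholder rather than any of the engines described in these slides.

```python
def legal_moves(cells, empty=1):
    """Return the 1-based positions that are still empty."""
    return [i + 1 for i, c in enumerate(cells) if c == empty]


def respond(line, choose=min):
    """Turn one board-state line into the engine's reply (a position 1-9).

    `choose` is a placeholder policy that picks from the legal moves;
    a real engine would substitute its own strategy here.
    """
    values = [int(v) for v in line.split()]
    cells = values[3:]          # skip dimension, player 1, player 2
    moves = legal_moves(cells)
    if not moves:
        raise ValueError("board is full")
    return str(choose(moves))
```

To use it as a command-line filter, a script would end with something like `print(respond(sys.stdin.readline()))` after importing `sys`. Restricting the choice to `legal_moves` is what guarantees the reply never collides with an occupied position.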
Win, Lose, or Draw
 “playtictactoe.py”
    Specify number of games
    Engine 1
    Engine 2
 Output
    Game progression
    Player 1 win ___ times
    Draw ___ times
    Player 2 win ___ times
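A match runner of this kind can be sketched as follows. The slides do not show how playtictactoe.py is written, so this is an assumption-heavy illustration: engines are plain Python callables instead of external programs, player 1 is assumed to move first, and `winner`, `play_game`, and `tally` are hypothetical names.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals
EMPTY, P1, P2 = 1, 0, 2                     # cell values from the board encoding


def winner(cells):
    """Return P1 or P2 if that player holds a full line, else None."""
    for a, b, c in LINES:
        if cells[a] != EMPTY and cells[a] == cells[b] == cells[c]:
            return cells[a]
    return None


def play_game(engine1, engine2):
    """Play one game; engines map (cells, mark) to a 1-based position."""
    cells = [EMPTY] * 9
    players = [(engine1, P1), (engine2, P2)]
    for turn in range(9):
        engine, mark = players[turn % 2]
        pos = engine(list(cells), mark)     # pass a copy so engines cannot cheat
        if cells[pos - 1] != EMPTY:
            raise ValueError("collision at position %d" % pos)
        cells[pos - 1] = mark
        if winner(cells) is not None:
            return mark
    return None  # board full with no winner: draw


def tally(engine1, engine2, games):
    """Count player 1 wins, draws, and player 2 wins over `games` games."""
    counts = {P1: 0, None: 0, P2: 0}
    for _ in range(games):
        counts[play_game(engine1, engine2)] += 1
    return counts[P1], counts[None], counts[P2]
```

Calling `tally(engine_a, engine_b, 100)` returns the three counts that the output section above reports.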
1. Random Engine
• Implementation details
• Results summary
Implementation Details
 Java
    Slow execution
 Internal representation of board state
    x---ox--o
    x: player 1
    o: player 2
    -: empty position
    2-dimensional array
 Polymorphism to easily allow different engine implementations
Player player = new RandomPlayer(board, turn);
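The same idea can be shown in Python for comparison with the Java line above. This is a hedged analogue, not the original source: the `Player` base class, its `next_move` and `empty_positions` methods, and the `to_grid` helper are all hypothetical; only the constructor arguments `board` and `turn` and the class name `RandomPlayer` come from the slide.

```python
import random
from abc import ABC, abstractmethod


class Player(ABC):
    """Common interface: every engine is built from a board and a turn."""

    def __init__(self, board, turn):
        self.board = board   # 2-dimensional list of 'x', 'o', or '-'
        self.turn = turn     # 'x' or 'o': the side this player moves for

    def empty_positions(self):
        """List 1-based positions (row-major) that hold '-'."""
        flat = [cell for row in self.board for cell in row]
        return [i + 1 for i, cell in enumerate(flat) if cell == "-"]

    @abstractmethod
    def next_move(self):
        """Return the position this player wants to play."""


class RandomPlayer(Player):
    def next_move(self):
        # Any empty position is as good as another for the random engine.
        return random.choice(self.empty_positions())


def to_grid(state, dim=3):
    """Convert a flat string such as 'x---ox--o' into a 2-D list."""
    return [list(state[r * dim:(r + 1) * dim]) for r in range(dim)]
```

Code that holds a `Player` reference only ever calls `next_move()`, so swapping in a different engine means changing the one line that constructs it.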
Results Summary
 random_engine vs random_engine
    Player 1 win 49
    Draw 13 times
    Player 2 win 38
 About equal number of wins from player 1 as player 2
2. “Smart” Engine
• Implementation details
• Case based reasoning
• Results summary
Implementation Details
 Java
 Rules were simple and came from hands-on experience
    IF able to get 3 in a row, play winning position
    ELSE IF able to block opponent, play blocking position
    ELSE IF empty, play edge position
    ELSE play random position
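The four rules above translate almost directly into code. This sketch is in Python rather than the Java of the original engine, and it makes two assumptions that the slide does not settle: "edge" is taken to mean the side positions 2, 4, 6, and 8, and "IF empty" is read as "if one of those edge positions is empty". The function names are hypothetical.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]
EDGES = [1, 3, 5, 7]  # 0-based indices of positions 2, 4, 6, 8 (assumed meaning of "edge")


def completing_move(cells, mark, empty=1):
    """Return the index that gives `mark` three in a row, or None."""
    for line in LINES:
        values = [cells[i] for i in line]
        if values.count(mark) == 2 and values.count(empty) == 1:
            return line[values.index(empty)]
    return None


def smart_move(cells, me, opponent, empty=1):
    """Apply the four rules in order; return a 1-based position."""
    win = completing_move(cells, me, empty)
    if win is not None:
        return win + 1                      # rule 1: take the win
    block = completing_move(cells, opponent, empty)
    if block is not None:
        return block + 1                    # rule 2: block the opponent
    edges = [i for i in EDGES if cells[i] == empty]
    if edges:
        return random.choice(edges) + 1     # rule 3: play an open edge
    free = [i for i, c in enumerate(cells) if c == empty]
    return random.choice(free) + 1          # rule 4: anything left
```

Because winning is checked before blocking, the engine prefers to end the game over preventing a loss when both are possible on the same turn.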
Case Based Reasoning
 Use reverse logic to figure out “rules” governing an unknown engine
 Steps
    Retrieve
    Reuse
    Revise
    Retain
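The four steps above can be illustrated with a very small case base. This is a generic sketch of the retrieve/reuse/revise/retain cycle, not the method used in the project: the similarity measure (Hamming distance between boards), the revise rule (fall back to the first empty position), and the names `distance` and `cbr_move` are all assumptions.

```python
def distance(a, b):
    """Hamming distance: number of positions where two boards differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def cbr_move(case_base, cells, empty=1):
    """One retrieve/reuse/revise/retain cycle; returns a 1-based position.

    `case_base` is a list of (cells, move) pairs observed from the engine
    being imitated, and is extended in place.
    """
    # Retrieve: the stored board most similar to the current one.
    _, past_move = min(case_base, key=lambda case: distance(case[0], cells))
    # Reuse: propose the move that was played in that case.
    move = past_move
    # Revise: if that square is taken here, fall back to the first empty one.
    if cells[move - 1] != empty:
        move = cells.index(empty) + 1
    # Retain: remember the solved case for future lookups.
    case_base.append((list(cells), move))
    return move
```

Each call grows the case base, so later lookups can match boards that were not in the original observations.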
Results Summary
 random_engine vs smart_engine
    Player 1 win 8
    Draw 29 times
    Player 2 win 63
 smart_engine vs smart_engine
    Player 1 win 59
    Draw 10 times
    Player 2 win 31
3. Neural Network Engine
• Neural network overview
• Implementation details
• Results summary
Neural Network Overview
 Provides the ability to “learn” how to do tasks based on training data
 Requires a linear and a nonlinear step to produce a set of weights
 Weights map training input to training output
 The learning rate controls the search for a set of weights that results in an error of 0, at which point all inputs are precisely mapped to all outputs
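The linear step, nonlinear step, and learning rate described above can be seen together in a single-neuron example. This is a minimal sketch under stated assumptions: the sigmoid activation, the squared-error update, and the toy OR data in the usage note are illustrative choices, not details taken from the project's trainer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    # Linear step: weighted sum of the inputs plus the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Nonlinear step: squash the sum into the range (0, 1).
    return sigmoid(z)


def train(samples, learning_rate=0.5, epochs=5000):
    """Fit one sigmoid neuron to (inputs, target) pairs by gradient descent."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = predict(weights, bias, inputs)
            # Error signal: (target - out) scaled by the sigmoid's slope.
            delta = (target - out) * out * (1.0 - out)
            # The learning rate sets how far each weight moves per sample.
            weights = [w + learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
    return weights, bias
```

For example, training on the four rows of logical OR, `[([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]`, gives weights whose rounded predictions match the targets. In practice the error approaches 0 without reaching it exactly, which is one reason convergence can be hard to declare.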
Implementation Details
 Goal: train neural network on data produced by previous “smart engine”
    Input: state of the board
    Output: next move
 Neural network trainer
    Python
    Allows user to pass in parameters such as learning rate, bias, input, output, and weight files
    15 pairs of inputs and outputs used
       Difficulty of convergence
 Neural network engine
    Python
    Uses the set of weights produced by the trainer to generate “next move”
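One way an engine could turn trained weights into a "next move" is sketched below. The network shape is an assumption (a single layer with nine inputs and nine outputs, one score per board position), as are the names `scores` and `neural_move`; the slides do not state how the real engine decodes its output.

```python
import math


def scores(weights, biases, cells):
    """One sigmoid output per board position from a 9x9 weight matrix."""
    out = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, cells)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def neural_move(weights, biases, cells, empty=1):
    """Pick the highest-scoring empty position (1-based)."""
    ranked = scores(weights, biases, cells)
    # Only empty squares are candidates, so the engine cannot collide.
    legal = [i for i, c in enumerate(cells) if c == empty]
    best = max(legal, key=lambda i: ranked[i])
    return best + 1
```

Masking out occupied squares before taking the maximum keeps the engine legal even when the network itself has not learned to avoid them.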
Results Summary
 neural_engine vs smart_engine
    Player 1 win 38
    Draw 11 times
    Player 2 win 51
 neural_engine vs random_engine
    Player 1 win 56
    Draw 12 times
    Player 2 win 32
4. PyBrain Engine
• PyBrain neural network library
• Implementation details
• Results summary
Implementation Details
 Goal: Implement same neural network engine using training weights produced from an external library
 PyBrain
    Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library
    http://pybrain.org/
 Used backpropagation method of training values
    Optimization of errors, minimizing loss function
    Allows higher chance of convergence for larger data sets
    25 pairs of input/output compared to 15
Results Summary
 smart_engine vs neurallib_engine
    Player 1 win 51
    Draw 6 times
    Player 2 win 43
 random_engine vs neurallib_engine
    Player 1 win 30
    Draw 16 times
    Player 2 win 54
(5?) Self Organizing Maps
 Another type of neural network
 Using weights in different ways
 Weights are now nodes instead of connections
 Useful for identifying what the inputs should be
 Weights are updated based on geography
 Useful for pattern completion
 Could be used in a tic-tac-toe engine to determine whether a given board state is valid or not
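The idea that weights live in nodes and are updated "based on geography" can be shown with one update step of a small map. This is a sketch under assumptions: the nodes are laid out on a 1-dimensional line, neighbours within a fixed radius move by a reduced step, and the names `best_matching_unit` and `som_update` are hypothetical. It is not tied to the tic-tac-toe engines above.

```python
def best_matching_unit(nodes, sample):
    """Index of the node whose weight vector is closest to the sample."""
    def sq_dist(node):
        return sum((w - x) ** 2 for w, x in zip(node, sample))
    return min(range(len(nodes)), key=lambda i: sq_dist(nodes[i]))


def som_update(nodes, sample, learning_rate=0.5, radius=1):
    """Pull the winning node and its map neighbours toward the sample.

    Nodes sit on a 1-D line, so "geography" here is index distance.
    """
    bmu = best_matching_unit(nodes, sample)
    for i, node in enumerate(nodes):
        gap = abs(i - bmu)
        if gap <= radius:
            # Neighbours move less the further they are from the winner.
            step = learning_rate / (1 + gap)
            nodes[i] = [w + step * (x - w) for w, x in zip(node, sample)]
    return bmu
```

Repeating this step over many samples makes nearby nodes hold similar weight vectors, which is what makes the map usable for pattern completion.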
Running the programs…