Golconda Suresh
www.cacs.ull.edu/~sxg3148

Genetic Algorithm for tic-tac-toe

Project Statement: Use a Genetic Algorithm to develop a program for tic-tac-toe.

Introduction

I had long wondered how capable Genetic Algorithms (GA) really are, given that they appear to work through randomness. To explore their capabilities, I attempt here to develop a GA for tic-tac-toe. Tic-tac-toe was selected primarily because ‘Tic-tac-toe is a well-defined game with fewer rules, yet with enormous game space’ [ERN99]. Secondly, as I have already programmed tic-tac-toe with a traditional method (the min-max algorithm) and with a neural network approach, it gives an opportunity to compare the results of the different approaches.

Genetic algorithms are easily implemented via distributed processing. To execute the program on distributed systems with varying platforms, a platform-independent programming language, Java, was selected.

Game Representation

Though generating raw code for tic-tac-toe directly looks simple, it is nearly impossible to generate raw code with many logical sequences. A better approach is to define the skeleton/structure of the code and use the GA to develop the core code. The code structure is defined similar to how a novice programmer would program tic-tac-toe, i.e. with a set of if-else statements, something like:

If (this is board-configuration) Play this;
Else if (this is board-configuration) Play this;
…
Else if (this is board-configuration) Play this;

But this style of programming leads to a long list of if-else statements with low performance. Multiple ‘if’ conditions need to be combined to form fewer ‘if’ conditions. We use ternary Boolean algebra for this purpose, in which a variable can take 3 possible values, corresponding to the 3 possible states that a particular position on the board can take [refer to Table-1].
Ternary algebraic functions are used to represent the set of board configurations for which the player has to play at a particular board position, say position 0. Restating the problem, we need to find a set of 9 ternary algebraic functions, one for each of the 9 possible board positions to play. The i-th ternary algebraic function returns true if playing at position ‘i’ is the best move for a given board configuration.

Ternary Algebra Basics:

A few basic concepts of ternary algebra are provided in Table-1, alongside ordinary 2-valued Boolean algebra for reference.

(Table-1 description) The first row gives the possible values that a variable in that algebra can take. The second row gives the notation for representing variables taking different values. The third row gives an example algebraic function. The fourth row gives an example function as truth table values. The fifth row gives the functional representation of that truth table.

Table-1: Trilean Algebra compared to Boolean Algebra

2-valued Boolean Algebra

  Variable values:      2 values: x=0, x=1
  States of variable:   X=0            written X’
                        X=1            written X
                        X=0 or X=1     written _ (ignore the variable)
  Example function:     F(x) = (x1 x2’) v (x1’ x2) v (x2 x3 x4)
  Example truth table:  x1   x2   F(x)
                        0    0    0
                        0    1    1
                        1    0    1
                        1    1    0
  Corresponding function representation:
                        F(x) = (x1’ x2) v (x1 x2’)

3-valued Boolean Algebra (Ternary algebra)

  Variable values:      3 values: x=0,  board position empty
                                  x=1,  player-1 played this position
                                  x=-1, player-2 played this position
  Variable representation:
                        X=0                    written X^0
                        X=1                    written X^1
                        X=-1                   written X^2
                        X=0 or X=1             written X^3
                        X=0 or X=-1            written X^4
                        X=1 or X=-1            written X^5
                        X=0 or X=1 or X=-1     written X^6 (ignore the variable)
  Example function:     F(x) = (x1^1 x2^1) v (x1^3 x2^0) v (x2^5 x3^2)
  Example truth table:  x1   x2   F(x)
                        0    0    0
                        0    1    1
                        0    -1   0
                        1    0    0
                        1    1    1
                        1    -1   0
                        -1   0    1
                        -1   1    1
                        -1   -1   1
  Corresponding function representation:
                        F(x) = (x1^2 x2^6) v (x1^6 x2^1)

Note: Superscripts in the above table (written here with ^) do not represent powers; they represent the state of the variable (see the second row of the table).
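The state notation of Table-1 can be sketched in Java as a small lookup table. This is only an illustration under stated assumptions: the class and method names are hypothetical and not taken from the original program, and state 6 is read as "matches any value", which is the reading that reproduces the 3-valued truth table of Table-1.

```java
// Sketch only: TernaryDnf and its method names are illustrative assumptions,
// not the classes of the original program.
final class TernaryDnf {
    // MATCHES[state][k] is true when a variable written with that state (0..6)
    // accepts the k-th cell value: k = 0 is empty (0), k = 1 is player-1 (1),
    // k = 2 is player-2 (-1).
    private static final boolean[][] MATCHES = {
        {true,  false, false}, // state 0: x = 0
        {false, true,  false}, // state 1: x = 1
        {false, false, true }, // state 2: x = -1
        {true,  true,  false}, // state 3: x = 0 or x = 1
        {true,  false, true }, // state 4: x = 0 or x = -1
        {false, true,  true }, // state 5: x = 1 or x = -1
        {true,  true,  true }  // state 6: any value (assumed "ignore the variable")
    };

    // True if a cell value (0, 1 or -1) satisfies the given state code.
    static boolean termMatches(int state, int value) {
        int k = (value == 0) ? 0 : (value == 1) ? 1 : 2;
        return MATCHES[state][k];
    }

    // A conjunction holds one state code per variable; every term must match.
    static boolean conjunctionMatches(int[] states, int[] values) {
        for (int i = 0; i < states.length; i++) {
            if (!termMatches(states[i], values[i])) return false;
        }
        return true;
    }

    // A DNF function is true when at least one of its conjunctions matches.
    static boolean dnfMatches(int[][] dnf, int[] values) {
        for (int[] conjunction : dnf) {
            if (conjunctionMatches(conjunction, values)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // F(x) = (x1^2 x2^6) v (x1^6 x2^1), the last row of Table-1.
        int[][] f = {{2, 6}, {6, 1}};
        int[] domain = {0, 1, -1};
        for (int x1 : domain) {
            for (int x2 : domain) {
                int out = dnfMatches(f, new int[] {x1, x2}) ? 1 : 0;
                System.out.println(x1 + " " + x2 + " -> " + out);
            }
        }
    }
}
```

Running main prints one line per truth-table row, in the same order as the 3-valued truth table of Table-1. The same three methods apply unchanged to 9-variable board states, since nothing in them depends on the number of variables.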
Representation of the Trilean Algebra functions

The function is represented in disjunctive normal form (DNF) notation, which in GA terms is called a chromosome (see Table-2 for the naming convention). Below, xi^j denotes variable xi (board position i) in state j.

Eg: DNF for tic-tac-toe
F(x) = (x0^1 x1^1 x2^6 x3^6 x4^4 x5^3 x6^1 x7^1 x8^1) v (x0^2 x1^2 x2^1 x3^3 x4^4 x5^3 x6^6 x7^6 x8^4) v …

Each subterm, such as (x0^1 x1^1 x2^6 x3^6 x4^4 x5^3 x6^1 x7^1 x8^1), is called a gene, and each xi^j is called a term or variable.

Table-2: Data Naming Convention

  F(x) = (x0^1 x1^1 x2^6 x3^6 x4^4 x5^3 x6^1 x7^1 x8^1) v (-----) …    is called DNF function / Chromosome
  (x0^1 x1^1 x2^6 x3^6 x4^4 x5^3 x6^1 x7^1 x8^1)                       is called Gene
  x2^6                                                                 is called term/variable

We can represent a chromosome element with a string of 9 numbers, each number being the superscript of the corresponding variable.
Eg: the chromosome element (x0^1 x1^1 x2^6 x3^6 x4^4 x5^3 x6^1 x7^1 x8^1) in string notation is 1 1 6 6 4 3 1 1 1

Structure of tic-tac-toe program

Redefining the structure of the tic-tac-toe program:

If (DNF_0(board_state[]) == TRUE) Play_position = 0;
Else if (DNF_1(board_state[]) == TRUE) Play_position = 1;
…
Else if (DNF_8(board_state[]) == TRUE) Play_position = 8;

Where: DNF(board_state[]) = chromosome_element0(board_state[]) v chromosome_element1(board_state[]) v … v chromosome_elementk(board_state[]).
Where: chromosome_element0(board_state[]) returns true if chromosome element string_0 satisfies board_state[].

Implementation issues of Genetic Algorithm

Two vital issues of a genetic algorithm are:
1) Crossing & Mutation
2) Finding the fitness value of the chromosome.

Crossing & Mutation:

Two parent chromosome strings are selected at random, giving more preference to chromosome strings with better fitness. This is implemented using the Roulette Wheel principle [DAV99]. Each chromosome string in the population has a roulette wheel slot sized in proportion to its fitness [refer to Fig-1, Table-3]. The wheel is spun to select one of the elements, and the slot where the wheel stops spinning is chosen as the parent string.
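A minimal Java sketch of roulette-wheel selection follows. It accumulates normalized fitness values and returns the first slot whose running total reaches the random draw, which is one common way to realize a fitness-proportional wheel. The class and method names are hypothetical, and the fitness values in main are the four listed in Table-3.

```java
import java.util.Random;

// Sketch only: RouletteWheel and its method names are illustrative assumptions.
final class RouletteWheel {
    // Returns the index of the slot in which the draw r (0 <= r < 1) falls.
    // Assumes all fitness values are non-negative and not all zero.
    static int select(double[] fitness, double r) {
        double total = 0.0;
        for (double f : fitness) total += f;

        double partialSum = 0.0;
        for (int i = 0; i < fitness.length; i++) {
            partialSum += fitness[i] / total; // normalized slot width
            if (r <= partialSum) return i;
        }
        return fitness.length - 1; // guard against floating-point rounding
    }

    // Spins the wheel once with a random draw.
    static int spin(double[] fitness, Random rng) {
        return select(fitness, rng.nextDouble());
    }

    public static void main(String[] args) {
        double[] fitness = {0.2, 0.6, 0.4, 0.8}; // values from Table-3
        System.out.println("r = 0.31 selects index " + select(fitness, 0.31));
        System.out.println("r = 0.01 selects index " + select(fitness, 0.01));

        // Over many spins, fitter strings should be chosen more often.
        int[] counts = new int[fitness.length];
        Random rng = new Random(42);
        for (int n = 0; n < 10000; n++) counts[spin(fitness, rng)]++;
        for (int i = 0; i < counts.length; i++) {
            System.out.println("index " + i + " chosen " + counts[i] + " times");
        }
    }
}
```

With the Table-3 values, a draw of 0.31 falls in the slot of index 1 and a draw of 0.01 falls in the slot of index 0. Separating select from spin keeps the slot arithmetic testable without depending on the random number generator.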
[Fig-1] Roulette wheel

Implementation of Roulette wheel

The normalized fitness value of each chromosome is calculated using the formula

Normalized_fitness[i] = fitness[i] / (sum_of_all_fitness)

The partial sum of normalized fitness is then calculated:

Partial_sum[i] = Normalized_fitness[0] + … + Normalized_fitness[i]

A random value between 0 and 1 is chosen, and the first index whose partial sum is not less than that value is selected [refer to Table-3].

[Table-3] Implementation of Roulette wheel

  Index                     0     1     2     3
  Fitness                   0.2   0.6   0.4   0.8
  Normalized fitness        0.1   0.3   0.2   0.4
  Partial sum of fitness    0.1   0.4   0.6   1.0

  Example random value = 0.31: index 1 is selected
  Example random value = 0.01: index 0 is selected

The next step after selecting the parents is to select the site of crossing, which is chosen randomly. The parent strings are interchanged after the site of crossing to form new children. The child string elements are then mutated with a probability p_mutation.

Finding fitness values:

To find the fitness of each chromosome in the population, the program can be run against another player [traditional algorithm implementation] for different instances of the game. A better solution is to create an exhaustive list of all possible board configurations in the game, along with the best play position for each board configuration. The chromosome string whose fitness is to be evaluated is applied to each entry in the list and the result is compared with the answer. The fraction of correct results out of the total number of entries gives the fitness value.

Structure of the Genetic algorithm implemented

Below is the genetic algorithm used to find one chromosome string for one particular play position.

BEGIN
- Old_population[N_POP] = random_population();
- Find fitness of the strings of the generation.
- For (generation = 0 to MAX_GENERATIONS)
  {
  - Copy the best ‘K’ strings of the present generation to the next generation.
  - For (rest of the elements in new generation)
    {
    - Randomly select 2 parents, giving preference to higher fitness strings.
    - With a probability of P_cross perform crossing of the parent strings to form 2 new children strings, else copy the parents directly.
    - Mutate each child string element with probability p_mutation.
    - Copy child strings into new generation.
    }
  - Copy new generation into old generation.
  - Calculate fitness of strings of new generation.
  - Note the all-round best fitness string.
  }
END

Results

Successfully generated code for playing tic-tac-toe with a very high winning rate.

Successfully implemented a GA to generate a set of DNF functions which represent a large sample of the possible board configurations for which the player has to play at a particular position (around 350 elements for each board position).

Changing the mutation probability produced different results. An increased mutation probability produced better results when the performance of the chromosome string was low (i.e. when the DNF element did not cover many elements), but it destroyed the best solution when the performance of the chromosome string was high.

Better results were obtained by using a dynamically changing mutation probability, which increases when performance degrades and decreases when performance improves.

Mandatory copying of the best chromosome from each generation to the next generation produced results at a much higher rate.
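The generation loop described above (elitist copy, roulette-wheel parent selection, single-point crossing, per-element mutation) can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the program's actual code: all names and parameter values are hypothetical, only K = 1 string is copied as the elite, the mutation probability is fixed rather than dynamic, and a toy fitness (the fraction of positions equal to an arbitrary target string) stands in for the fraction of board configurations answered correctly.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Sketch only: names, parameters and the toy fitness are illustrative assumptions.
final class GaSketch {
    static final int STATES = 7; // state codes 0..6
    static final int LENGTH = 9; // one state code per board position

    // Single-point crossing: the parents exchange everything from 'site' onward.
    static int[][] cross(int[] p1, int[] p2, int site) {
        int[] c1 = p1.clone();
        int[] c2 = p2.clone();
        for (int i = site; i < p1.length; i++) {
            c1[i] = p2[i];
            c2[i] = p1[i];
        }
        return new int[][] {c1, c2};
    }

    // Each element is replaced by a random state code with probability pMutation.
    static void mutate(int[] s, double pMutation, Random rng) {
        for (int i = 0; i < s.length; i++) {
            if (rng.nextDouble() < pMutation) s[i] = rng.nextInt(STATES);
        }
    }

    // Roulette-wheel pick; assumes non-negative fitness values.
    static int selectParent(double[] fit, Random rng) {
        double total = 0.0;
        for (double f : fit) total += f;
        double r = rng.nextDouble() * total;
        double partial = 0.0;
        for (int i = 0; i < fit.length; i++) {
            partial += fit[i];
            if (r <= partial) return i;
        }
        return fit.length - 1; // guard against rounding
    }

    static double[] evaluate(int[][] pop, ToDoubleFunction<int[]> fitness) {
        double[] fit = new double[pop.length];
        for (int i = 0; i < pop.length; i++) fit[i] = fitness.applyAsDouble(pop[i]);
        return fit;
    }

    static int bestIndex(double[] fit) {
        int best = 0;
        for (int i = 1; i < fit.length; i++) if (fit[i] > fit[best]) best = i;
        return best;
    }

    // Runs the loop and returns the best string of the final generation.
    static int[] evolve(int nPop, int maxGenerations, double pCross, double pMutation,
                        ToDoubleFunction<int[]> fitness, Random rng) {
        int[][] pop = new int[nPop][LENGTH];
        for (int[] s : pop) for (int i = 0; i < LENGTH; i++) s[i] = rng.nextInt(STATES);
        double[] fit = evaluate(pop, fitness);
        for (int gen = 0; gen < maxGenerations; gen++) {
            int[][] next = new int[nPop][];
            next[0] = pop[bestIndex(fit)].clone(); // elite (K = 1), copied unmutated
            for (int n = 1; n < nPop; n += 2) {
                int[] a = pop[selectParent(fit, rng)];
                int[] b = pop[selectParent(fit, rng)];
                int[][] kids = rng.nextDouble() < pCross
                        ? cross(a, b, 1 + rng.nextInt(LENGTH - 1))
                        : new int[][] {a.clone(), b.clone()};
                for (int k = 0; k < 2 && n + k < nPop; k++) {
                    mutate(kids[k], pMutation, rng);
                    next[n + k] = kids[k];
                }
            }
            pop = next;                    // new generation becomes the old one
            fit = evaluate(pop, fitness);
        }
        return pop[bestIndex(fit)];
    }

    public static void main(String[] args) {
        int[] target = {1, 1, 6, 6, 4, 3, 1, 1, 1}; // arbitrary target for the toy fitness
        ToDoubleFunction<int[]> toyFitness = s -> {
            int same = 0;
            for (int i = 0; i < LENGTH; i++) if (s[i] == target[i]) same++;
            return same / (double) LENGTH;
        };
        int[] best = evolve(40, 200, 0.8, 0.02, toyFitness, new Random(1));
        System.out.println(Arrays.toString(best) + " fitness=" + toyFitness.applyAsDouble(best));
    }
}
```

Because the elite string is copied without crossing or mutation, the best fitness in the population can never decrease from one generation to the next, which is consistent with the observation that mandatory copying of the best chromosome helped. A dynamic mutation probability could be added by adjusting pMutation inside the loop according to the change in best fitness.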
Comparing GA code with traditional algorithms and neural network

Traditional algorithm
  Playing the game: Takes a lot of time to make a move.
  Pre-processing or training time: No preprocessing.
  Facility for distributed processing: Not required.
  Understandable by common man: Not easy to understand the logic, for a common man.

Neural Network
  Playing the game: Takes less time to make a move.
  Pre-processing or training time: Requires high training time, around 23 hours on a 500 MHz PC.
  Facility for distributed processing: Distributed training not possible.
  Understandable by common man: Hard to interpret the logic for a common man. How can a network store data?

Genetic Algorithm
  Playing the game: Takes less time to make a move.
  Pre-processing or training time: Requires high initial processing time, around 12 × 9 hours on a 500 MHz PC.
  Facility for distributed processing: Distributed processing possible. Each system can be set to generate a separate chromosome, hence setting 9 systems to generate 9 chromosomes requires just 12 hours of human time.
  Understandable by common man: Natural logic, easily understood by a common man. Derived from nature.

Conclusion

Developing the Genetic Algorithm was a challenging and enjoyable experience. The program developed using GA takes much less processing power and hence makes a move much quicker, within a fraction of a second (on a Sun Ultra 5 system), compared to the traditional min-max algorithm, which takes around 2-3 seconds.

Working on tic-tac-toe using GA helped me to better understand the concepts and capabilities of genetic algorithms. The genetically developed tic-tac-toe model may be applied to develop other genetic programs involving integral space.

The project also shows the advantage of a good program structure, which reduces the complexity and increases the performance of the GA’s attempt to generate code. A simple structure of the tic-tac-toe program gave quick results, which may not have been the case if the structure of the program were more complex.

The code developed by GA is more appropriate than traditional methods when it comes to implementing such games (like tic-tac-toe) on mobile electronic devices (like cell phones and hand-held video games) with very low processing power.
Future Works

A point to notice in this work is the use of GA to store a huge amount of data (supervising data samples, containing board configuration and play position). Around 1067 data samples are stored using 15 (average number of chromosome elements) × 9 (# of DNF) = 135 chromosome strings, where each string can be stored in 3-9 bytes. This shows the possibility of using GA and expanded DNF functions to store large data samples for other applications (such as databases).

Appendix-A

Implementation details: Implemented in Java.

Java Classes

1) dnf_GA
Purpose: To provide basic genetic algorithm methods like finding fitness, performing mutation, performing crossing, etc.
Methods:
  void initialize_pop()
    Initializes the initial generation chromosomes to random values [0-6].
  void Run_dnf_GA(String in_file, String out_file)
    Calls all major steps of the GA, like finding fitness, crossing the parent strings, mutation, etc.
  void Update_to_new_generation()
    Copies the new generation population to the old generation.
  void copy_cnf(cnf_element_class source, cnf_element_target)
    Creates a duplicate copy of source and stores it in target.
  void save_statistics()
    Saves statistics, like best fitness, in a file.
  void copy_N_best_cnf()
    Copies the ‘N’ chromosomes with best fitness directly to the next generation.
  void form_next_generation()
    Selects parents for crossing, and performs crossing to store children.
  void cross_parents(int parent1_index, int parent2_index)
    Performs crossing of parents to form children. The site of crossing is selected randomly.
  int select_parent()
    Selects parents randomly, giving more preference to parents with higher fitness value. Selection is implemented using the roulette wheel.
  void calculate_fitness()
    Invokes methods to calculate the fitness of the chromosome function.
  int return_best_fitness_cnf_index()
  float return_best_fitness()
  void note_best_cnf(int index)

2) dnf_element_class
Purpose: To store the chromosome element strings and provide transparency to other classes by providing all data handling functions.
Methods:
  void set_term(int term, int state)
  void get_term(int term)
  boolean run_cnf_element(int board_state[])
  boolean dnf_of_term(int value, int state)
  void display_cnf()
  void randomize()

3) GA_tic
Purpose: Calls DNF_ga for each of the 9 possible play positions to generate DNF strings.
Methods:
  new_dnf_GA(int chrom_index)

4) Fitness_value
Purpose: Provides methods to calculate the fitness of a chromosome element.
Methods:
  void Find_fitness()
  void get_new_board_state()
  int fitness_of_state()
  int count_correct_work

5) Reproduction
Purpose: Implements methods for performing reproduction and mutation.
Methods:
  int select_parent()
  boolean Flip(int prob): randomly returns true or false, with true having a probability ‘prob’.
  void mutate(int state, float p_mutation)

6) File_handling
7) File_writing
Purpose: Provide methods for using files.

8) N_best_chromosome
Purpose: Implements methods to return the ‘N’ best chromosomes in a generation.
Methods:
  void Copy_first_n()
  Return_cnf_copy(cnf_element_class)
  void insert_chromosome()
  Return_best_cnf()
  Update_lower_best_fitness()
  void display()

9) Game_position
Purpose: Implements methods to check whether, for a specified board state, a player can play at a specified position.
Methods:
  void Add_new_chromosome(int chromosome[])
  boolean can_play(int board[])
  Retrieve_cnf_states(String str)
  Return_next_cnf_states()
  void add_all_cnf_elements()
  void display()

10) tictactoe_ga
Purpose: Provides methods for playing the tic-tac-toe game, like methods for checking if a player has won the game, if the game is a draw, getting user input, etc.
Methods:
  int get_user_input()
  int return_players_move(int player)
  int game_win(int player)
  int game_draw()
  void play(int player, int position)
  int computer_move()

11) tic_gui
Purpose: Provides a graphical user interface to play the tic-tac-toe game.

References:

[ERN99] Ernest M. Post. Genetically optimized N-dimensional tic-tac-toe. 1999. http://www.cs.cornell.edu/boom/1999sp/projects/tictactoe.html
[DAV99] David E. Goldberg. Genetic Algorithms. International Student Edition, 1999.
[ERC03] Eric C.R. Hehner. Unified Algebra. 2003. http://www.cs.toronto.edu/~hehner/UA.pdf
[JRG03] Jorge Pedraza Arpasi. A brief introduction to ternary logic. 2003. http://www.aymara.org/ternary/ternary.pdf
[ENC03] Wikipedia, the free encyclopedia. Disjunctive normal form. http://www.wikipedia.org/wiki/Disjunctive_normal_form