IS53024A: Artificial Intelligence
Genetic Learning Algorithms

1. Genetic Algorithms and Natural Selection

Genetic Algorithms (GAs) are an alternative to the traditional optimization techniques: they use directed random searches to locate optimal solutions. The GA search methods are rooted in the mechanisms of evolution and natural genetics. In nature, the individuals best suited to the competition for scarce resources survive. The various features that uniquely characterize an individual are determined by its genetic content. Only the fittest individuals survive and reproduce, a natural phenomenon called "the survival of the fittest". The reproduction process generates diversity in the gene pool. Evolution is initiated when the genetic chromosomes from two parents recombine during reproduction. New combinations of genes are generated from previous ones and a new gene pool results. Specifically, the exchange of genetic material among the chromosomes is called crossover. Segments of the two parent chromosomes are exchanged during crossover, creating the possibility of the right combination of genes for better individuals.

Genetic algorithms manipulate a population of potential solutions to an optimization (or search) problem. They operate on encoded representations of the solutions, equivalent to the genetic material of individuals in nature, and not directly on the solutions themselves. Usually the solutions are strings of bits from a binary alphabet. As in nature, selection provides the necessary driving mechanism for better solutions to survive. Each solution is associated with a fitness value that reflects how good it is compared with the other solutions in the population. The higher the fitness value of an individual, the higher its chances of survival and reproduction. Recombination of genetic material in genetic algorithms is simulated through a crossover mechanism that exchanges portions between strings. Another operation, called mutation, causes sporadic and random alteration of the bits of strings. Mutation too has a direct analogy in nature and plays the role of regenerating lost genetic material.

2. Genetic Algorithm Structure

The genetic operators, crossover and mutation, generate, promote and juxtapose building blocks to form optimal strings. Crossover tends to conserve the genetic information present in the strings to be crossed. Mutation generates radically new building blocks.

A Simple Genetic Algorithm

procedure simple-GA
begin
    t ← 0
    initialize P(t)
    evaluate P(t)
    while NOT (termination-condition) do
    begin
        t ← t + 1
        select v, v1, v2 ∈ P(t - 1)
        crossover(v1, v2)
        mutate(v)
        evaluate(v, v1, v2)
    end
end.

Selection provides the favorable bias toward building blocks with higher fitness values and ensures that they increase in representation from generation to generation. The crucial operation is the juxtaposing of building blocks achieved during crossover, and this is the cornerstone of GA mechanics.

The Schemata Theorem

A genetic algorithm searches for the optimal string schemata in a competition with the other schemata in the current population, in order to increase the number of instances of those schemata in the next population. The notion that strings with high fitness values can be located by sampling schemata with high fitness values, called building blocks, and further combining these building blocks effectively, is the building block hypothesis. The building block hypothesis assumes that the juxtaposition of good building blocks gives good strings.
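For concreteness, here is a minimal Python sketch of the simple-GA loop above, applied to a toy one-max problem (the fitness of a bit string is the number of ones it contains). The population size, crossover rate, mutation rate and generation limit are illustrative choices, not values prescribed in these notes.

import random

# Minimal generational GA for the one-max problem: maximise the number of 1s
# in a bit string. Parameter values here are illustrative, not prescriptive.
STRING_LEN, POP_SIZE, PC, PM, MAX_GEN = 30, 40, 0.8, 0.01, 100

def fitness(s):
    return sum(s)

def select(pop):
    # fitness-proportionate (roulette-wheel) selection of one parent
    total = sum(fitness(s) for s in pop)
    r = random.uniform(0, total)
    acc = 0.0
    for s in pop:
        acc += fitness(s)
        if r <= acc:
            return s
    return pop[-1]

def crossover(p1, p2):
    # single-point crossover, applied with probability PC
    if random.random() < PC:
        point = random.randint(1, STRING_LEN - 1)
        return p1[:point] + p2[point:], p2[:point] + p1[point:]
    return p1[:], p2[:]

def mutate(s):
    # independent bit-flip mutation with probability PM per bit
    return [bit ^ 1 if random.random() < PM else bit for bit in s]

def simple_ga():
    pop = [[random.randint(0, 1) for _ in range(STRING_LEN)] for _ in range(POP_SIZE)]
    for gen in range(MAX_GEN):
        new_pop = []
        while len(new_pop) < POP_SIZE:
            c1, c2 = crossover(select(pop), select(pop))
            new_pop += [mutate(c1), mutate(c2)]
        pop = new_pop[:POP_SIZE]
        best = max(pop, key=fitness)
        if fitness(best) == STRING_LEN:     # optimum reached
            return gen, best
    return MAX_GEN, max(pop, key=fitness)

if __name__ == "__main__":
    generations, best = simple_ga()
    print(generations, fitness(best))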
When the effects of selection, crossover and mutation on the rate at which instances of a schema increase from generation to generation are considered, one can see that proportionate selection increases or decreases the number of instances in proportion to the ratio of the schema's average fitness to the average fitness of the population. A schema must also have a short defining length. Because crossover is disruptive, the longer the defining length of a schema, the higher the probability that the crossover point will fall between its fixed positions and an instance will be destroyed. Thus schemata with above-average fitness values and small defining lengths grow exponentially with time. This is the essence of the schema theorem, the fundamental theorem of genetic algorithms. The following equation is a formal statement of the schemata theorem:

    N(h, t + 1) >= N(h, t) * [ f(h, t) / f(t) ] * [ 1 - pc * δ(h) / (l - 1) - pm * o(h) ]

where:
    f(h, t) : average fitness value of schema h in generation t;
    f(t)    : average fitness value of the population in generation t;
    pc      : crossover probability;
    pm      : mutation probability;
    δ(h)    : defining length of the schema h;
    o(h)    : order of the schema h;
    N(h, t) : expected number of instances of schema h in generation t;
    l       : number of bit positions in a string.

For example, a schema whose average fitness is 1.25 times the population average, with defining length δ(h) = 1 and order o(h) = 2 in strings of length l = 20, grows by a factor of at least 1.25 * (1 - 0.8 * 1/19 - 0.01 * 2) ≈ 1.17 per generation when pc = 0.8 and pm = 0.01.

3. Components of a Genetic Algorithm

A simple genetic algorithm has the following components:
    - encoding mechanism
    - fitness function
    - selection schemes
    - genetic operators (crossover and mutation)
    - reproduction
    - control parameters

3.1. Encoding Mechanism

The encoding mechanism depends on the nature of the problem variables. A large number of optimization problems have real-valued continuous variables. A common method for encoding them uses their integer representation. Each variable is first linearly mapped to an integer defined in a prespecified range, and then the integer is encoded using a fixed number of binary bits. The binary codes of all the variables are then concatenated to obtain a binary string.

3.2. Fitness Function

The objective function provides the mechanism for evaluating each string. However, its range of values varies from problem to problem. To maintain uniformity over various problem domains, a fitness function is introduced that normalizes the objective function to the convenient range 0 to 1. The normalized value of the objective function is the fitness of the string, which the selection mechanism uses to evaluate the strings of the population.
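As an illustration of 3.1 and 3.2, here is a small Python sketch that encodes a real-valued variable into a fixed-length bit string and normalizes an objective value into a fitness in [0, 1]. The variable range, the 16-bit resolution and the example objective function are hypothetical choices made for the sketch only.

# Sketch of the encoding mechanism (3.1) and fitness normalization (3.2).
# The variable range [-5, 5], the 16-bit resolution and the example objective
# function are illustrative assumptions, not values fixed by the notes.

X_MIN, X_MAX, N_BITS = -5.0, 5.0, 16

def encode(x):
    # linearly map x from [X_MIN, X_MAX] to an integer in [0, 2^N_BITS - 1],
    # then write that integer as a fixed-length binary string
    k = round((x - X_MIN) / (X_MAX - X_MIN) * (2**N_BITS - 1))
    return format(k, f"0{N_BITS}b")

def decode(bits):
    # inverse mapping: binary string -> integer -> real value
    k = int(bits, 2)
    return X_MIN + k * (X_MAX - X_MIN) / (2**N_BITS - 1)

def objective(x):
    # example objective to be maximised
    return -(x - 1.0) ** 2

def fitness(bits, f_min, f_max):
    # normalize the objective value to the range [0, 1]
    f = objective(decode(bits))
    return (f - f_min) / (f_max - f_min) if f_max > f_min else 1.0

if __name__ == "__main__":
    s = encode(1.0)
    print(s, decode(s), fitness(s, f_min=-36.0, f_max=0.0))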
3.3. Selection Schemes

Selection models nature's survival-of-the-fittest mechanism. Fitter solutions survive while weaker ones perish. In a genetic algorithm, a fitter string receives a higher number of offspring and thus has a higher chance of surviving in the subsequent generation. There are two basic types of selection scheme in common use: proportionate selection and ranking (ordinal) selection. Proportionate selection selects individuals based on their fitness values relative to the fitness of the other individuals in the population. Common proportionate selection mechanisms are canonical proportional selection and stochastic universal sampling. Rank-based selection schemes select individuals based on their rank within the population. This means that the selection pressure is independent of the fitness distribution and is based solely on the relative ranking of the population. Common ranking selection schemes are tournament selection, (μ, λ)-selection, truncation selection, and linear ranking.

Canonical Proportional Selection Algorithm

Input:  the population P, N number of individuals    /* M denotes the average fitness */
Output: the population after selection P'

Proportional(I1, I2, ..., IN):
    s0 ← 0
    for i ← 1 to N do
        si ← s(i-1) + fi / M
    for i ← 1 to N do
        r ← random[0, sN]
        Ii' ← Il such that s(l-1) <= r < sl
    return (I1', I2', ..., IN')

Tournament Selection Algorithm

Input:  the population P, N number of individuals, and tournament size t ∈ {1, 2, ..., N}
Output: the population after selection P'

Tournament(t, I1, I2, ..., IN):
    for i ← 1 to N do
        Ii' ← best-fit individual out of t individuals randomly picked from (I1, I2, ..., IN)
    return (I1', I2', ..., IN')

Linear Ranking Selection Algorithm

Input:  the population P, N number of individuals, and the reproduction rate of the worst individual η⁻ ∈ [0, 1]
Output: the population after selection P'

Linear_ranking(η⁻, I1, I2, ..., IN):
    sort (I1, I2, ..., IN) according to fitness
    s0 ← 0
    for i ← 1 to N do
        si ← s(i-1) + pi            /* pi computed with the equation below */
    for i ← 1 to N do
        r ← random[0, sN]
        Ii' ← Il such that s(l-1) <= r < sl
    return (I1', I2', ..., IN')

where the selection probability linearly assigned to the individuals according to their rank is defined as:

    pi = (1/N) * ( η⁻ + (η⁺ - η⁻) * (i - 1) / (N - 1) )

Here η⁻ is the probability of the worst individual being selected and η⁺ is the probability of the best individual being selected. In this way all the individuals get a different rank, i.e. a different selection probability, even if they have the same fitness value.

Stochastic Universal Sampling

Input:  the population P, N number of individuals, and the reproduction rate Ri ∈ [0, N] for each individual Ii
Output: the population after selection P'

SUS(R1, R2, ..., RN, I1, I2, ..., IN):
    sum ← 0
    j ← 1
    ptr ← random[0, 1]
    for i ← 1 to N do
        sum ← sum + Ri              /* Ri is the reproduction rate of individual Ii */
        while sum > ptr do
            Ij' ← Ii
            j ← j + 1
            ptr ← ptr + 1
    return (I1', I2', ..., IN')

3.4. Genetic Operators - Crossover and Mutation

The most crucial operation in genetic algorithms is crossover. Crossover is the process of picking pairs of strings at random from the current population and exchanging portions of them. The simplest approach is single-point crossover. Assuming that the string length is l, a crossover point is chosen randomly from the range between 1 and l - 1. The portions of the two parent strings beyond this crossover point are exchanged to form two new strings. The crossover point may assume any of the l - 1 possible values with equal probability. After choosing a pair of strings, the algorithm invokes crossover only if a randomly generated number in the range 0 to 1 is less than pc, the crossover rate (or probability of crossover); otherwise the strings remain unaltered. The value of pc lies in the range from 0 to 1 and is usually between 0.5 and 1.0. In a large population, pc gives the fraction of strings actually crossed.
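A minimal Python sketch of the single-point crossover just described (the uniform and k-point variants follow as pseudocode below). The pc value used in the demonstration call is only an example.

import random

def single_point_crossover(p1, p2, pc=0.8):
    # p1, p2: parent bit strings of equal length l (as lists of 0/1).
    # With probability pc, pick a cut point in 1..l-1 and swap the tails;
    # otherwise return unchanged copies of the parents.
    l = len(p1)
    if random.random() < pc:
        point = random.randint(1, l - 1)
        return p1[:point] + p2[point:], p2[:point] + p1[point:]
    return p1[:], p2[:]

if __name__ == "__main__":
    a = [0, 0, 0, 0, 0, 0, 0, 0]
    b = [1, 1, 1, 1, 1, 1, 1, 1]
    c1, c2 = single_point_crossover(a, b)
    print(c1, c2)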
Uniform Crossover Algorithm

Uniform_crossover(I1, I2):
    n ← length(I1)
    for i ← 1 to n do
        if random_real(0, 1) <= p then
            a ← I1[i]
            I1[i] ← I2[i]
            I2[i] ← a
    return (I1', I2')

where p is the probability for crossover, applied independently at each position.

k-point Crossover Algorithm

k-point_crossover(k, I1, I2):
    n ← length(I1)
    if k < 1 or k > n + 1 then return Error
    for i ← 0 to n do
        I[i] ← 0                        /* marks the crossover points already chosen */
    i ← 1
    while i <= k do
        K[i] ← random_integer(0, n)
        if I[K[i]] = 0 then             /* accept the point only if it is new */
            I[K[i]] ← 1
            i ← i + 1
    for i ← 1 to k do
        for j ← K[i] to n do            /* swap the tails of the parents from point K[i] on */
            a ← I1[j]
            I1[j] ← I2[j]
            I2[j] ← a
    return (I1', I2')

After crossover, strings are subjected to mutation. Mutation of a bit involves flipping it: changing 0 to 1 and vice versa. Just as pc controls the probability of a crossover, another parameter pm (the mutation rate) gives the probability that a bit will be flipped. The mutation rate is usually from 0.001 to 0.05. The bits of a string are mutated independently -- that is, the mutation of a bit does not affect the probability of mutation of the other bits.

Bitflip Mutation Algorithm (one-point mutation)

Bitflip_mutation(I):
    n ← length(I)
    k ← random_integer(1, n)
    I[k] ← new value from the alphabet
    return (I)

Mutation Algorithm (uniform mutation)

Mutation(I):
    n ← length(I)
    for i ← 1 to n do
        if random_real(0, 1) <= p then      /* p is the probability for mutation */
            I[i] ← new value from the alphabet
    return (I)

Genetic algorithms treat mutation as a secondary operator with the role of restoring lost genetic material.

3.5. Reproduction

Genetic algorithms modify a population of potential solutions during the course of a run, using both the application of operators such as crossover and mutation, and the application of a reproductive technique. There are two main reproductive techniques in general use. The first, which is the most widely used, is called generational reproduction; the second is steady-state reproduction.

Generational reproduction replaces the entire population with a new population. This is done by repeatedly selecting an individual from the old population, according to fitness and with replacement, and adding that individual to the new population. Since selection is biased by fitness, individuals having a fitness value greater than the population average will on average be represented in the new population to a greater extent. Likewise, individuals with below-average fitness will decrease in the population according to the same ratio.

The steady-state technique replaces only a few individuals during a generation, usually whatever number is required for an operator; very often one individual is reproduced per generation. This is done by selecting an individual from the population according to its fitness and making a copy. However, in order to insert the copy into the population, room must be made for it, which also requires that an individual is selected for deletion. Sometimes the members for deletion are selected randomly; more often, the least-fit individual is chosen for deletion. Many recent implementations use steady-state reproduction because of several advantages:
    - First, steady-state reproduction automatically grants elitist status to all good individuals in the population;
    - Second, when a new individual is created in a steady-state algorithm it is ready for immediate use, while in a generational algorithm a good new individual is not used until the next generation, which may be many steps ahead;
    - Third, one can impose the condition that only one copy of an individual exists in the population at any one time. This simple change has shown, for some problems, large improvements in the speed at which the genetic algorithm converges on the correct solution.
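A minimal Python sketch of the steady-state reproduction step described above, in which one offspring is produced per step and the least-fit member of the population is deleted to make room for it. The one-max fitness, the binary tournament parent selection and the operator rates are illustrative assumptions.

import random

# Steady-state reproduction sketch: one offspring per step, the least-fit
# individual is removed to make room for it. One-max fitness, binary
# tournament parent selection and the operator rates are illustrative choices.
STRING_LEN, POP_SIZE, PC, PM = 20, 30, 0.9, 0.02

def fitness(s):
    return sum(s)

def tournament(pop, t=2):
    # pick t individuals at random and return the fittest of them
    return max(random.sample(pop, t), key=fitness)

def offspring(p1, p2):
    # single-point crossover followed by bit-flip mutation, yielding one child
    child = p1[:]
    if random.random() < PC:
        point = random.randint(1, STRING_LEN - 1)
        child = p1[:point] + p2[point:]
    return [b ^ 1 if random.random() < PM else b for b in child]

def steady_state_step(pop):
    child = offspring(tournament(pop), tournament(pop))
    worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))  # index of least-fit member
    pop[worst] = child      # room is made by deleting the least-fit individual
    return pop

if __name__ == "__main__":
    pop = [[random.randint(0, 1) for _ in range(STRING_LEN)] for _ in range(POP_SIZE)]
    for _ in range(2000):
        pop = steady_state_step(pop)
    print(fitness(max(pop, key=fitness)))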
A Hill-climbing Algorithm

procedure hill-climb
begin
    t ← 0
    repeat
        local ← FALSE
        select current individual vc at random
        evaluate vc
        repeat
            select v ← Neighbours(vc) by flipping single bits of vc
            select the vn ∈ v which has the largest value of the objective function f
            if f(vc) < f(vn)
                then vc ← vn
                else local ← TRUE
        until local
        t ← t + 1
    until t = MAX
end.

A Simulated Annealing Algorithm

procedure simulated-annealing
begin
    t ← 0
    initialize temperature T
    select current individual vc at random
    evaluate vc
    repeat
        repeat
            select a vn ∈ Neighbours(vc) by flipping a bit of vc
            if f(vc) < f(vn)
                then vc ← vn
                else if rand[0, 1) < exp{ (f(vn) - f(vc)) / T }
                    then vc ← vn
        until termination-condition
        T ← g(T, t)
        t ← t + 1
    until stop-criterion
end.
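A compact Python sketch of the simulated-annealing loop above, applied to bit strings with a one-max objective. The geometric cooling schedule g(T, t) = 0.95 * T, the starting temperature and the loop lengths are illustrative assumptions, not values given in the notes.

import math
import random

# Simulated annealing on bit strings with a one-max objective. The cooling
# schedule g(T, t) = 0.95 * T, the starting temperature and the loop lengths
# are illustrative assumptions.
STRING_LEN = 30

def f(v):
    return sum(v)                       # objective function to be maximised

def neighbour(v):
    # flip one randomly chosen bit of v
    k = random.randrange(STRING_LEN)
    w = v[:]
    w[k] ^= 1
    return w

def simulated_annealing(T=5.0, outer=100, inner=50):
    vc = [random.randint(0, 1) for _ in range(STRING_LEN)]
    for t in range(outer):
        for _ in range(inner):
            vn = neighbour(vc)
            if f(vc) < f(vn):
                vc = vn                 # always accept an improving move
            elif random.random() < math.exp((f(vn) - f(vc)) / T):
                vc = vn                 # sometimes accept a worsening move
        T *= 0.95                       # cooling schedule g(T, t)
    return vc

if __name__ == "__main__":
    best = simulated_annealing()
    print(f(best), best)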