Seminar on Computational Intelligence

Timetable
Dates: 12.1., 13.1., 19.1., 20.1., 26.1., 27.1., 2.2., 3.2., …
Sessions:
• UBI-Health etc. (Iiro Jantunen)
• Introduction (Pekka Toivanen)
• Positioning (Mikko Asikainen)
• Self-Organization (Pekka Toivanen)
• Genetic algorithms (Pekka Toivanen)
• No seminar (delayed to future)
• Seminar presentations
• Seminar presentations

Presentation topics
Health
1. Automatic context (behavior) analysis of elderly people
2. Technology for self-made measurements of health parameters
3. Ubiquitous health, e-health and e-Medical centre
4. Ubiquitous health architecture
5. Brain imaging and analysis in multiple sclerosis
6. Brain imaging and analysis in dementia
Swarm Intelligence
1. Swarm Intelligence in computer games
2. Basic ideas of swarm intelligence and applications
Positioning
1. Positioning of animals indoors
2. Positioning of animals outdoors
3. Positioning of humans indoors

Genetic Algorithms: A Tutorial
“Genetic Algorithms are good at taking large, potentially huge search spaces and navigating them, looking for optimal combinations of things, solutions you might not otherwise find in a lifetime.”
- Salvatore Mangano, Computer Design, May 1995

Classes of Search Techniques
Search techniques
  Calculus-based techniques
    Direct methods
      Fibonacci
    Indirect methods
      Newton
  Guided random search techniques
    Simulated annealing
    Evolutionary algorithms
      Evolutionary strategies
      Genetic algorithms
        Parallel
          Centralized
          Distributed
        Sequential
          Steady-state
          Generational
  Enumerative techniques
    Dynamic programming

Genetic Algorithms - History
• Pioneered by John Holland in the 1970’s
• Became popular in the late 1980’s
• Based on ideas from Darwinian Evolution
• Can be used to solve a variety of problems that are not easy to solve using other techniques

Evolution in the real world
• Each cell of a living thing contains chromosomes - strings of DNA
• Each chromosome contains a set of genes - blocks of DNA
• Each gene determines some aspect of the organism (like eye
colour)
• A collection of genes is sometimes called a genotype
• A collection of aspects (like eye colour) is sometimes called a phenotype
• Reproduction involves recombination of genes from parents and then small amounts of mutation (errors) in copying
• The fitness of an organism is how much it can reproduce before it dies
• Evolution is based on “survival of the fittest”

Start with a Dream…
• Suppose you have a problem
• You don’t know how to solve it
• What can you do?
• Can you use a computer to somehow find a solution for you?
• This would be nice! Can it be done?

A dumb solution
A “blind generate and test” algorithm:
  Repeat
    Generate a random possible solution
    Test the solution and see how good it is
  Until solution is good enough

Can we use this dumb idea?
• Sometimes - yes:
  – if there are only a few possible solutions
  – and you have enough time
  – then such a method could be used
• For most problems - no:
  – many possible solutions
  – with no time to try them all
  – so this method cannot be used

A “less-dumb” idea (GA)
  Generate a set of random solutions
  Repeat
    Test each solution in the set (rank them)
    Remove some bad solutions from the set
    Duplicate some good solutions
    Make small changes to some of them
  Until best solution is good enough

How do you encode a solution?
• Obviously this depends on the problem!
• GA’s often encode solutions as fixed-length “bitstrings” (e.g. 101110, 111111, 000101)
• Each bit represents some aspect of the proposed solution to the problem
• For GA’s to work, we need to be able to “test” any string and get a “score” indicating how “good” that solution is

Silly Example - Drilling for Oil
• Imagine you had to drill for oil somewhere along a single 1km desert road
• Problem: choose the place on the road that produces the most oil per day
• We could represent each solution as a position on the road
• Say, a whole number in the range [0..1000]

Where to drill for oil?
[Figure: the road from 0 to 1000, with Solution1 = 300 and Solution2 = 900 marked]

Digging for Oil
• The set of all possible solutions [0..1000] is called the search space or state space
• In this case it’s just one number but it could be many numbers or symbols
• Often GA’s code numbers in binary, producing a bitstring representing a solution
• In our example we choose 10 bits, which is enough to represent 0..1000

Convert to binary string
        512  256  128   64   32   16    8    4    2    1
 900      1    1    1    0    0    0    0    1    0    0
 300      0    1    0    0    1    0    1    1    0    0
1023      1    1    1    1    1    1    1    1    1    1
In GA’s these encoded strings are sometimes called “genotypes” or “chromosomes” and the individual bits are sometimes called “genes”

Drilling for Oil
[Figure: oil yield (OIL axis, labelled at 30 and 5) plotted against location along the road (0 to 1000), with Solution1 = 300 (0100101100) and Solution2 = 900 (1110000100) marked]

Summary
We have seen how to:
• represent possible solutions as a number
• encode a number into a binary string
• generate a score for each number given a function of “how good” each solution is - this is often called a fitness function
• Our silly oil example is really optimisation over a function f(x) where we adapt the parameter x

Search Space
• For a simple function f(x) the search space is one dimensional.
• But by encoding several values into the chromosome many dimensions can be searched e.g.
two dimensions f(x,y)
• The search space can be visualised as a surface or fitness landscape in which fitness dictates height
• Each possible genotype is a point in the space
• A GA tries to move the points to better places (higher fitness) in the space

Fitness landscapes
[Figure: example fitness landscapes]

Search Space
• Obviously, the nature of the search space dictates how a GA will perform
• A completely random space would be bad for a GA
• Also GA’s can get stuck in local maxima if search spaces contain lots of these
• Generally, spaces in which small improvements get closer to the global optimum are good

Back to the (GA) Algorithm
  Generate a set of random solutions
  Repeat
    Test each solution in the set (rank them)
    Remove some bad solutions from the set
    Duplicate some good solutions
    Make small changes to some of them
  Until best solution is good enough

Adding Sex - Crossover
• Although it may work for simple search spaces, our algorithm is still very simple
• It relies on random mutation to find a good solution
• It has been found that introducing “sex” into the algorithm gives better results
• This is done by selecting two parents during reproduction and combining their genes to produce offspring

Adding Sex - Crossover
• Two high scoring “parent” bit strings (chromosomes) are selected and, with some probability (the crossover rate), combined
• This produces two new offspring (bit strings)
• Each offspring may then be changed randomly (mutation)

Selecting Parents
• Many schemes are possible so long as better scoring chromosomes are more likely to be selected
• Score is often termed the fitness
• “Roulette Wheel” selection can be used:
  – Add up the fitnesses of all chromosomes
  – Generate a random number R in that range
  – Select the first chromosome in the population that, when all previous fitnesses are added, gives you at least the value R

SGA operators: Selection
• Main idea: better individuals get higher chance
  – Chances proportional to fitness
  – Implementation: roulette wheel technique
Figure: roulette wheel with slices A = 3/6 = 50%, B = 1/6 = 17%,
C = 2/6 = 33% (from fitness(A) = 3, fitness(B) = 1, fitness(C) = 2)
    » Assign to each individual a part of the roulette wheel
    » Spin the wheel n times to select n individuals

Example population
No.   Chromosome    Fitness
1     1010011010    1
2     1111100001    2
3     1011001100    3
4     1010000000    1
5     0000010000    3
6     1001011111    5
7     0101010101    1
8     1011100111    2

Roulette Wheel Selection
[Figure: a wheel running from 0 to 18 with one slice per chromosome, sized by fitness]
The running fitness totals are 1, 3, 6, 7, 10, 15, 16, 18, so:
Rnd[0..18] = 7  selects Chromosome4 as Parent1
Rnd[0..18] = 12 selects Chromosome6 as Parent2

Crossover - Recombination
Parent1    1010000000    Offspring1    1011011111
Parent2    1001011111    Offspring2    1010000000
Crossover: single point, chosen at random
With some high probability (the crossover rate) apply crossover to the parents (typical values are 0.8 to 0.95)

SGA operators: 1-point crossover
• Choose a random point on the two parents
• Split parents at this crossover point
• Create children by exchanging tails
• Pc typically in range (0.6, 0.9)

n-point crossover
• Choose n random crossover points
• Split along those points
• Glue parts, alternating between parents
• Generalisation of 1-point crossover (still some positional bias)

Uniform crossover
• Assign 'heads' to one parent, 'tails' to the other
• Flip a coin for each gene of the first child
• Make an inverse copy of the gene for the second child
• Inheritance is independent of position

Mutation
             Original offspring    Mutated offspring
Offspring1   1011011111            1011001111
Offspring2   1010000000            1000000000
With some small probability (the mutation rate) flip each bit in the offspring (typical values between 0.1 and 0.001)

The GA Cycle of Reproduction
[Figure: the cycle runs from the population to parents, through reproduction to children, through modification to modified children, and through evaluation to evaluated children, which rejoin the population; deleted members are discarded]

Back to the (GA) Algorithm
  Generate a population of random chromosomes
  Repeat (each generation)
    Calculate fitness of each chromosome
    Repeat
      Use roulette selection to select pairs of parents
      Generate offspring with crossover and mutation
    Until a new population has been produced
  Until best
solution is good enough

Many Variants of GA
• Different kinds of selection (not roulette)
  – Tournament
  – Elitism, etc.
• Different recombination
  – Multi-point crossover
  – 3-way crossover etc.
• Different kinds of encoding other than bitstring
  – Integer values
  – Ordered set of symbols
• Different kinds of mutation

Many parameters to set
• Any GA implementation needs to decide on a number of parameters: population size (N), mutation rate (m), crossover rate (c)
• Often these have to be “tuned” based on results obtained - there is no general theory to deduce good values
• Typical values might be: N = 50, m = 0.05, c = 0.9

Why does crossover work?
• There is a lot of theory about this and some controversy
• Holland introduced “Schema” theory
• The idea is that crossover preserves “good bits” from different parents, combining them to produce better solutions
• A good encoding scheme would therefore try to preserve “good bits” during crossover and mutation

Genetic Programming
• When the chromosome encodes an entire program or function itself this is called genetic programming (GP)
• In order to make this work, encoding is often done in the form of a tree representation
• Crossover entails swapping subtrees between parents

Genetic Programming
It is possible to evolve whole programs like this but only small ones. Large programs with complex functions present big problems.

Implicit fitness functions
• Most GA’s use an explicit and static fitness function (as in our “oil” example)
• Some GA’s (such as in Artificial Life or Evolutionary Robotics) use dynamic and implicit fitness functions - like “how many obstacles did I avoid”
• In these latter examples other chromosomes (robots) affect the fitness function

Problem
• In the Travelling Salesman Problem (TSP) a salesman has to find the shortest journey that visits each of a set of cities
• Assume we know the distance between each pair of cities
• This is known to be a hard problem to solve because the number of possible routes is N!
where N = the number of cities
• There is no simple algorithm that gives the best answer quickly

Problem
• Design a chromosome encoding, a mutation operation and a crossover function for the Travelling Salesman Problem (TSP)
• Assume number of cities N = 10
• After all operations the produced chromosomes should always represent valid possible journeys (visit each city once only)
• There is no single answer to this; many different schemes have been used previously

A Simple Example
“The Gene is by far the most sophisticated program around.”
- Bill Gates, Business Week, June 27, 1994

The Traveling Salesman Problem: find a tour of a given set of cities so that
– each city is visited only once
– the total distance traveled is minimized

Representation
The representation is an ordered list of city numbers; a GA using it is known as an order-based GA.
1) London
2) Venice
3) Dunedin
4) Singapore
5) Beijing
6) Phoenix
7) Tokyo
8) Victoria
CityList1 (3 5 7 2 1 6 4 8)
CityList2 (2 5 7 6 8 1 3 4)

Crossover
Crossover combines inversion and recombination:
             *     *
Parent1 (3 5 7 2 1 6 4 8)
Parent2 (2 5 7 6 8 1 3 4)
Child   (5 8 7 2 1 6 3 4)
The segment 7 2 1 6 between the two marked positions is copied from Parent1; the remaining cities (5 8 3 4) fill the open positions in the order they appear in Parent2.
This operator is called the Order1 crossover.
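The Order1 example above can be reproduced with a few lines of Python. This is only a sketch: the function name `order1_crossover` and the zero-based cut indices are assumptions made for illustration, and the left-to-right fill rule is inferred from the slide's example rather than stated in it.

```python
def order1_crossover(parent1, parent2, start, end):
    """Order-based crossover sketch.

    Copies parent1[start:end] into the same positions of the child, then
    fills the open positions from left to right with the remaining cities
    in the order they appear in parent2 (fill rule assumed from the example).
    """
    segment = parent1[start:end]
    in_segment = set(segment)
    # Cities not already placed, kept in parent2's order.
    fill = [city for city in parent2 if city not in in_segment]
    return fill[:start] + segment + fill[start:]


parent1 = [3, 5, 7, 2, 1, 6, 4, 8]
parent2 = [2, 5, 7, 6, 8, 1, 3, 4]

# Cut indices 2 and 6 select the segment 7 2 1 6 from parent1.
child = order1_crossover(parent1, parent2, 2, 6)
print(child)
```

Because every city is taken exactly once, either from the copied segment or from the fill list, the child is always a valid tour, which is the property the TSP exercise asks for.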
Mutation
Mutation involves reordering of the list:
                 *     *
Before: (5 8 7 2 1 6 3 4)
After:  (5 8 6 2 1 7 3 4)
The cities at the two marked positions (7 and 6) are swapped.

TSP Example: 30 Cities
[Figure: the 30 cities plotted on x (0 to 100) and y (0 to 120) axes]

Solution i (Distance = 941)
[Figure: TSP30 tour, Performance = 941]

Solution j (Distance = 800)
[Figure: TSP30 tour, Performance = 800]

Solution k (Distance = 652)
[Figure: TSP30 tour, Performance = 652]

Best Solution (Distance = 420)
[Figure: TSP30 solution, Performance = 420]

Using Genetic Algorithms (GAs) to design both composite materials and aerodynamic shapes for race cars and regular means of transportation (including aviation) can return combinations of best materials and best engineering to provide faster, lighter, more fuel-efficient and safer vehicles for all the things we use vehicles for. Rather than spending years in laboratories working with polymers, wind tunnels and balsa wood shapes, the processes can be done much more quickly and efficiently by computer modeling, using GA searches to return a range of options human designers can then put together however they please.

Getting the most out of a range of materials to optimize the structural and operational design of buildings, factories, machines, etc. is a rapidly expanding application of GAs. These are being created for such uses as optimizing the design of heat exchangers, robot gripping arms, satellite booms, building trusses, flywheels, turbines, and just about any other computer-assisted engineering design application.
There is work to combine GAs optimizing particular aspects of engineering problems to work together, and some of these can not only solve design problems, but also project them forward to analyze weaknesses and possible point failures in the future so these can be avoided.

Robotics involves human designers and engineers trying out all sorts of things in order to create useful machines that can do work for humans. Each robot's design is dependent on the job or jobs it is intended to do, so there are many different designs out there. GAs can be programmed to search for a range of optimal designs and components for each specific use, or to return results for entirely new types of robots that can perform multiple tasks and have more general application. GA-designed robotics just might get us those nifty multi-purpose, learning robots we've been expecting any year now since we watched the Jetsons as kids, who will cook our meals, do our laundry and even clean the bathroom for us.

Evolvable hardware (EH) is a new field about the use of evolutionary algorithms (EA) to create specialized electronics without manual engineering. It brings together reconfigurable hardware, artificial intelligence, fault tolerance and autonomous systems. Evolvable hardware refers to hardware that can change its architecture and behavior dynamically and autonomously by interacting with its environment.

In its most fundamental form an evolutionary algorithm manipulates a population of individuals where each individual describes how to construct a candidate circuit. Each circuit is assigned a fitness, which indicates how well a candidate circuit satisfies the design specification. The evolutionary algorithm uses stochastic operators to evolve new circuit configurations from existing ones. Done properly, over time the evolutionary algorithm will evolve a circuit configuration that exhibits desirable behavior.
Each candidate circuit can either be simulated or physically implemented in a reconfigurable device. Typical reconfigurable devices are field-programmable gate arrays (for digital designs) or field-programmable analog arrays (for analog designs). At a lower level of abstraction are the field-programmable transistor arrays that can implement either digital or analog designs.

In many cases conventional design methods (formulas, etc.) can be used to design a circuit. But in other cases the design specification doesn't provide sufficient information to permit using conventional design methods. For example, the specification may only state desired behavior of the target hardware.

The fitness of an evolved circuit is a measure of how well the circuit matches the design specification. Fitness in evolvable hardware problems is determined via two methods:
• extrinsic evolution: all circuits are simulated to see how they perform
• intrinsic evolution: physical tests are run on actual hardware.
In extrinsic evolution only the final best solution in the final population of the evolutionary algorithm is physically implemented, whereas with intrinsic evolution every individual in every generation of the EA's population is physically realized and tested.

Evolvable hardware problems fall into two categories: original design and adaptive systems. Original design uses evolutionary algorithms to design a system that meets a predefined specification. Adaptive systems reconfigure an existing design to counteract faults or a changed operational environment.

Do you find yourself frustrated by slow LAN performance, inconsistent internet access, a FAX machine that only sends faxes sometimes, or the number of 'ghost' calls on your land line every month? Well, GAs are being developed that will allow for dynamic and anticipatory routing of circuits for telecommunications networks. These could take notice of your system's instability and anticipate your re-routing needs.
Using more than one GA circuit-search at a time, soon your interpersonal communications problems may really be all in your head rather than in your telecommunications system. Other GAs are being developed to optimize placement and routing of cell towers for best coverage and ease of switching, so your cell phone and BlackBerry will be thankful for GAs too.

New applications of GAs to the "Traveling Salesman Problem" (TSP) can be used to plan the most efficient routes and scheduling for travel planners, traffic routers and even shipping companies. They can find the shortest routes for traveling, the timing that avoids traffic tie-ups and rush hours, and the most efficient use of transport for shipping, including pickups and deliveries along the way. The program can be modeling all this in the background while the human agents do other things, improving productivity as well! Chances are increasing steadily that when you get that trip plan packet from the travel agency, a GA contributed more to it than the agent did.

On the security front, GAs can be used both to create encryption for sensitive data and to break those codes. Encrypting data, protecting copyrights and breaking competitors' codes have been important in the computer world ever since there have been computers, so the competition is intense. Every time someone adds more complexity to their encryption algorithms, someone else comes up with a GA that can break the code. It is hoped that one day soon we will have quantum computers that will be able to generate completely indecipherable codes. Of course, by then the 'other guys' will have quantum computers too, so it's a sure bet the spy vs. spy games will go on indefinitely.

The de novo design of new chemical molecules is a burgeoning field of applied chemistry in both industry and medicine.
GAs are used to aid in the understanding of protein folding, to analyze the effects of substitutions on protein function, and to predict the binding affinities of various designed proteins developed by the pharmaceutical industry for treatment of particular diseases. The same sort of GA optimization and analysis is used for designing industrial chemicals for particular uses, and in both cases GAs can also be useful for predicting possible adverse consequences. This application has had and will continue to have great impact on the costs associated with development of new chemicals and drugs.

The development of microarray technology for taking 'snapshots' of the genes being expressed in a cell or group of cells has been a boon to medical research. GAs have been and are being developed to make analysis of gene expression profiles much quicker and easier. This helps to classify what genes play a part in various diseases, and further can help to identify genetic causes for the development of diseases. Being able to do this work quickly and efficiently will allow researchers to focus on individual patients' unique genetic and gene expression profiles, enabling the hoped-for "personalized medicine" we've been hearing about for several years.

In the current unprecedented world economic meltdown one might legitimately wonder if some of those Wall Street gamblers made use of GA-assisted computer modeling of finance and investment strategies to funnel the world's accumulated wealth into what can best be described as dot-dollar black holes. But then again, maybe they were simply all using the same prototype, which hadn't yet been de-bugged. It is possible that a newer generation of GA-assisted financial forecasting would have avoided the black holes and returned something other than bad debts the taxpayers get to repay.
Who knows?

Those who spend some of their time playing computer Sims games (creating their own civilizations and evolving them) will often find themselves playing against sophisticated artificial intelligence GAs instead of against other human players online. These GAs have been programmed to incorporate the most successful strategies from previous games - the programs 'learn' - and usually incorporate data derived from game theory in their design. Game theory is useful in almost all GA applications for seeking solutions to whatever problems they are applied to, even if the application really is a game.

When to Use a GA
• Alternate solutions are too slow or overly complicated
• Need an exploratory tool to examine new approaches
• Problem is similar to one that has already been successfully solved by using a GA
• Want to hybridize with an existing solution
• Benefits of the GA technology meet key problem requirements

Some GA Application Types
Domain                       Application Types
Control                      gas pipeline, pole balancing, missile evasion, pursuit
Design                       semiconductor layout, aircraft design, keyboard configuration, communication networks
Scheduling                   manufacturing, facility scheduling, resource allocation
Robotics                     trajectory planning
Machine Learning             designing neural networks, improving classification algorithms, classifier systems
Signal Processing            filter design
Game Playing                 poker, checkers, prisoner’s dilemma
Combinatorial Optimization   set covering, travelling salesman, routing, bin packing, graph colouring and partitioning

Conclusions
Question: ‘If GAs are so smart, why ain’t they rich?’
Answer: ‘Genetic algorithms are rich - rich in application across a large and growing number of disciplines.’
- David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning
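To tie the tutorial's pieces together, the following Python sketch runs the generational GA (roulette wheel selection, single-point crossover, bit-flip mutation) on the oil-drilling example. It is a minimal illustration under stated assumptions, not a reference implementation: the fitness function `oil_yield` is hypothetical (the slides never give the real oil curve), all function names are invented here, and the parameter values are simply the "typical values" quoted earlier (N = 50, m = 0.05, c = 0.9).

```python
import random

BITS = 10                 # 10 bits cover 0..1023, enough for the road 0..1000
POP_SIZE, MUTATION_RATE, CROSSOVER_RATE = 50, 0.05, 0.9  # "typical" values


def decode(chromosome):
    """Turn a bitstring such as '0100101100' into its integer (300)."""
    return int(chromosome, 2)


def oil_yield(chromosome):
    """Hypothetical fitness: best at position 300, zero beyond the road's end."""
    x = decode(chromosome)
    return 1000 - abs(x - 300) if x <= 1000 else 0


def roulette_select(population, fitnesses, r=None):
    """Return the first chromosome whose running fitness total reaches r."""
    if r is None:
        r = random.uniform(0, sum(fitnesses))
    running = 0
    for chromosome, fitness in zip(population, fitnesses):
        running += fitness
        if running >= r:
            return chromosome
    return population[-1]  # guard against floating-point round-off


def crossover(parent1, parent2, point):
    """Single-point crossover: the two children exchange tails at `point`."""
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]


def mutate(chromosome, rate):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if random.random() < rate else bit
        for bit in chromosome
    )


def evolve(generations=40, seed=0):
    """Run the generational loop and return the best position found."""
    random.seed(seed)
    population = ["".join(random.choice("01") for _ in range(BITS))
                  for _ in range(POP_SIZE)]
    best = max(population, key=oil_yield)
    for _ in range(generations):
        fitnesses = [oil_yield(c) for c in population]
        offspring = []
        while len(offspring) < POP_SIZE:
            p1 = roulette_select(population, fitnesses)
            p2 = roulette_select(population, fitnesses)
            if random.random() < CROSSOVER_RATE:
                p1, p2 = crossover(p1, p2, random.randint(1, BITS - 1))
            offspring += [mutate(p1, MUTATION_RATE), mutate(p2, MUTATION_RATE)]
        population = offspring[:POP_SIZE]
        # Remember the best chromosome seen so far across generations.
        best = max(population + [best], key=oil_yield)
    return decode(best)


print(evolve())
```

Passing an explicit `r` to `roulette_select` makes the selection step reproducible: with the fitness values of the example population (1, 2, 3, 1, 3, 5, 1, 2), `r = 7` picks the fourth chromosome and `r = 12` the sixth, as in the roulette wheel slide. Tracking `best` outside the population is a small convenience added here so that the best solution found is not lost to mutation.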