Introduction to Soft Computing ECE457 Applied Artificial Intelligence Spring 2008 Lecture #12 Outline Overview of soft computing Fuzzy logic Neural networks Russell & Norvig, pages 526-527 Russell & Norvig, sections 20.5 Genetic algorithms Russell & Norvig, pages 116-120 ECE 493 & ECE 750 (Prof. Karray) ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 2 Soft Computing Branch of AI that deals with systems and methodologies that can perform approximate, qualitative, human-like reasoning Humans can make intelligent decisions using incomplete and imprecise information Computer algorithms require complete and precise information “Soft” reasoning “Hard” reasoning Soft computing aims to bridge this gap ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 3 Soft vs. Hard Computing Input Hard Computing Soft Computing Complete and Approximate and exact data incomplete data Output Overly-exact solution Solution that’s “good enough” Reasoning Rational Human-like ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 4 Soft Computing Probabilistic reasoning Artificial neural networks (ANN) Human brain Fuzzy logic (FL) Human uncertainty and randomness Human knowledge Evolutionary computing Genetic algorithms (GA) Biological evolution ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 5 Fuzzy Logic In crisp logic, facts are either true or false Truth value = {0, 1} This is an unnatural way of doing it 1 Cold Warm Hot 0 Temperature ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 6 Fuzzy Logic In fuzzy logic, facts can be partially true and partially false Truth value = [0, 1] This is closer to human knowledge Cold Warm Hot 1 0 Temperature ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 7 Terminology “Cold”, “warm” and “hot” are fuzzy set The triangle function mapping a value of temperature to a value of cold (or warm or hot) is called a fuzzy membership function The value of cold (or warm or hot) to which a temperature is mapped is called a membership degree Fz[t Cold] = μCold(t): [0,1] ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 8 Fuzzy Sets vs. Probabilities Membership degrees are not probabilities Both are measures over the range [0,1] Probabilistic view: x is or is not y, and we have a certain probability of knowing which Fuzzy view: x is more or less y, with a certain degree ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 9 Fuzzy Rules If-then rules like other logics, but with fuzzy sets If Cold then VentilationHigh If Warm then VentilationLow If Hot then VentilationMedium Variables belong partially to antecedent, therefore consequence activated partially ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 10 Fuzzy Controller Typical application of fuzzy logic Set of fuzzy rules Define the behaviour of a system Antecedent: variables that affect the system Consequent: reaction of the system ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 11 Fuzzy Controller Example If Hot and Wet then VentilationLow If Warm and Humid then VentilationMedium Hot 1 0 1 0 Wet 1 T Warm 0 H Humid 1 t T 0 1 h ECE457 Applied Artificial Intelligence VentilationLow 0 S VentilationMedium 1 H 0 R. Khoury (2008) S Page 12 Fuzzy Controller Example Defuzzification using centroid technique Converts fuzzy consequent into crisp value that can be used by system Centroid merges the output fuzzy sets and finds the center of gravity 1 0 ECE457 Applied Artificial Intelligence R. Khoury (2008) s S Page 13 Advantages of Fuzzy Logic Partial activation of multiple rules at once Achieve complex non-linear behaviour with simple IF-THEN rules Use linguistic variables to model words, rules of thumb, human knowledge ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 14 Properties of Fuzzy Controllers The rule base must be The rules must Complete Continuous Be consistent Not interact The rule base system must be Robust Stable ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 15 Limits of Fuzzy Logic Writing the fuzzy rules Designing the fuzzy membership functions Often requires work by a domain expert ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 16 Fuzzy Robot Navigation Fuzzy controllers are very popular for robot navigation Eliminates need for complete world model and complex reasoning rules Basic robot Known current position and target Three sensors (front, left, right) Left and right wheels can turn at different speeds to make robot turn ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 17 Fuzzy Robot Navigation Fuzzy linguistic variables of control system 1 0 Near Medium Distance: near, medium, far Direction angle: negative, zero, positive Speed: slow, medium, fast Far Distance 1 Negative Zero 0 ECE457 Applied Artificial Intelligence Positive Direction 1 Slow 0 R. Khoury (2008) Medium Fast Speed Page 18 Fuzzy Robot Navigation Going to target Avoiding obstacles IF (left obstacle is far) and (front obstacle is near) and (right obstacle is far) and (angle is zero) THEN (left speed is fast) and (right speed is slow) Turning corners IF (left obstacle is far) and (front obstacle is far) and (right obstacle is far) and (angle is zero) THEN (left speed is fast) and (right speed is fast) IF (left obstacle is medium) and (front obstacle is near) and (right obstacle is near) and (angle is any) THEN (left speed is slow) and (right speed is fast) Following edges IF (left obstacle is far) and (front obstacle is far) and (right obstacle is near) and (angle is positive) THEN (left speed is medium) and (right speed is medium) ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 19 Fuzzy Robot Navigation ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 20 Artificial Neural Networks Human Brain Massively parallel network of neurons Each individual neuron is not intelligent Neuron is a simple computing element But the brain is intelligent! ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 21 Human Brain Brain learns by changing and adjusting the connections between neurons Information encoded in many ways Connection patterns of neurons Amplification of signals by dendrites Transfer function and threshold values controlling whether the cell transmits the signal ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 22 Neuron Receives electric impulse from other neurons through dendrites. If impulse strong enough, travels through axon. Reaches synapses and transmits to other neurons ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 23 Artificial Neuron ith neuron sums inputs a1… aj… an, weighted by weights wi1… wij… win Threshold value i controls activation If activated, input is transformed by transfer function fi(.) into output ai wij aj Cell body Axon fi(.) ai Synapse Dendrites i ai = fi( jwijaj - i ) = [0, 1] ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 24 Artificial Neural Networks Large arrangement of inter-connected artificial neurons Different classes of network Topology of the network Transfer function of the neurons Learning algorithm Different networks appropriate for different applications ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 25 Perceptron x1 o1 x2 o2 x3 Simplest and most commonly-used ANN Feed-forward ANN Single-layer Input Hidden Output layer layer layer ECE457 Applied Artificial Intelligence No hidden layer Only linearlyseparable problems Multi-layer Non-linear problems R. Khoury (2008) Page 26 Examples x1 1 x2 o 1 All these neurons use step functions f(<0) = 0 XOR Network x1 1 -1 f(0) = 1 -1.5 1 AND Network -0.5 -1 OR Network x1 1 x2 o 1 -0.5 1 -0.5 o x2 1 -0.5 ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 27 Learning Backpropagation algorithm Train using a set of input-output vectors Modify weights of network to minimize the error Difference between the training output vector and the network’s actual output vector Supervised learning Using the entire training set = 1 epoch ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 28 Backpropagation algorithm Start with random weights For each epoch Forward propagation of each input vector, to get the network’s output vector Compare network output to target output of training set, and compute the network error Starting from output layer and going backwards, back propagate the network error to each layer Update the input weights of each neuron in each layer to minimise the network error Repeat until min network error or max epoch ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 29 Learning Given The weight vector w, where w(l) represents the weights of the connections from layer l-1 to layer l A set of n input-output vectors, each composed of an input a(k) and a target output t(k) A network with q output neurons Algorithm Forward propagation of each input vector k in n, to get the network’s output vector o(k) Compute the network error given each output neuron i in q 1 n q 2 Ec ti (k ) oi (k ) 2 k 1 i 1 Back-propagate and update weight vector w to minimize Ec E w( l ) (cl ) w ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 30 ANN Topologies Feed-forward topology Information can only move forward through the network Recurrent topology Information can loop back to create feedback loop, or network memory ECE457 Applied Artificial Intelligence x1 x2 o1 x3 x1 x2 o1 x3 R. Khoury (2008) Page 31 Feed-Forward Networks Perceptron Basic network Any transfer functions Any number of hidden layers Radial Basis Function (RBF) Network Special class of multi-layer perceptron Single hidden layer with RBF (typically Gaussian) transfer function Hidden neuron transformations are symmetric, bounded and localized Useful for modelling systems with complex nonlinear behaviour, control systems, audio-video signal processing, … ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 32 Feed-Forward Networks Kohonen Self-Organising Map Fully-connected two-layer network Unsupervised learning x1 x2 Output neuron compete to activate Neuron with highest output value wins Winning neuron and its neighbours have their weights updated Creates a 2D topological mapping of the input vectors to the output layer Useful for pattern recognition, image analysis, data mining, … ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 33 Recurrent Networks Hopfield Network x1 o1 x2 o2 o3 x3 ECE457 Applied Artificial Intelligence First kind of recurrent ANN Implements an associative memory Previous-stored patterns “complete” current (noisy or incomplete) pattern Network is attracted to stable pattern in memory Useful for information retrieval, pattern/speech recognition, … R. Khoury (2008) Page 34 Limits of ANN Black box No formal design rules What does each neuron do? What does each weight do? How many hidden layers? How many neurons per layer? Which transfer functions? Prone to overfitting Backpropagation is slow and can converge on local optimum ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 35 ANN Classifier Typical application of ANN Requires A set of crisp, mutually-exclusive classes A set of well-defined, measurable attributes relevant to classification Correctly-classified training data Compared to Naïve Bayes Classifier Does not require any probabilities ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 36 ANN Classifier Design Pick network architecture based on problem One input neuron per attribute One output neuron per class Add hidden layers if classification is non-linear function of input space Or simply pick perceptron Exact number of layers or neurons difficult to discover Train network ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 37 ANN Classifier Example Paper on website Classification of pixels in aerial photograph Classes: lake, forest or land Input: pixel’s RGB value Data ECE457 Applied Artificial Intelligence 180 training pixels 300 testing pixels R. Khoury (2008) Page 38 ANN Classifier Example Perceptron network Good for complex classification problems Two hidden layers Number of hidden layers / neurons per hidden layer discovered by trial-and-error One hidden layer not precise enough First layer neurons input neurons Second layer neurons max(input neurons, output neurons) More hidden neurons give higher precision, need more training time ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 39 ANN Classifier Example Classification precision: 93% Naïve Bayes Classifier: 88% Network later expanded to 7 classes Water, marsh, farmland, woodland, grassland, residential area, salina Still has RGB input, two hidden layers 7 output neurons, more neurons on hidden layers Precision: 97% Naïve Bayes Classifier: 89% ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 40 Genetic Algorithms Biological evolution Species adapt to better survive in environment Over generations Pairs of individuals reproduce Individuals mutate Fittest individuals survive and go on to reproduce Source: David M. Hillis, Derrick Zwickl, and Robin Gutell, University of Texas. ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 41 Evolutionary Computing Introduced by John Holland in “Adaptation in Natural and Artificial Systems”, 1975 Stochastic search technique Like simulated annealing! Divided in four main classes (different representation of individuals) Genetic algorithms Evolution strategies Evolutionary programming Genetic programming ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 42 Genetic Algorithms Simulating biological evolution in AI Biology Individuals Environment Genetic Algorithms States Solutions to a problem State space to explore Problem to solve Fitness Evaluation function Changes from one generation to the next Operators ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 43 Individuals Individuals have a chromosome String of genes (bits) Encodes the solution represented by the individual Often binary representation, but can be anything 1 1 0 0 1 0 1 0 Length, nature of bits, meaning, varies according to problem ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 44 Operators To evolve, the population must change GA typically use three operators to change the population Crossover (sexual reproduction) Mutation (mutation) Selection (natural selection) ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 45 Crossover Select two fittest parents Select (random) splitting point in the chromosomes Recombine the genes to get children Causes slow move of the population around state space 1 1 0 0 1 0 1 0 0 1 0 0 0 1 1 0 ECE457 Applied Artificial Intelligence 1 1 0 0 0 1 1 0 0 1 0 0 1 0 1 0 R. Khoury (2008) Page 46 Mutation Select random child Select random gene Switch gene value according to mutation rule Depends on representation Typically very rare (low mutation rate) Causes large leap across state space 1 1 0 0 0 1 1 0 ECE457 Applied Artificial Intelligence 1 1 0 1 0 1 1 0 R. Khoury (2008) Page 47 Selection GA population have zero population growth We can’t keep them all (computer limits) and we don’t want to (they’re useless) But each crossover operation generates two more, new individuals Survival of the fittest! Evaluate fitness of all individuals Kill off (i.e. delete) least fit ones, keeping only the fittest (best solutions) Each generation, the population improves ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 48 Evolution Algorithm Starts with random population Evaluate fitness of all individuals For each generation Select fittest parents and crossover Mutate children according to mutation rate Evaluate fitness of new individuals Select individuals that survive for next generation Repeat until Generation limit reached An individual achieves the target fitness ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 49 Evolution Algorithm Individuals are random, but population converges slowly towards solution * * * x x * * * * * * * * * *** x* *** * * Population fitness Generation ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 50 Genetic Algorithm Example 8-Queen problem Environment: state space Chromosome: encodes position of queens Fitness: number of attacks 2 6 7 1 5 7 2 5 ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 51 Genetic Algorithm Example Crossover operation 2 6 7 1 5 7 2 5 4 8 5 3 6 5 1 5 ECE457 Applied Artificial Intelligence 2 6 7 3 6 5 1 5 4 8 5 1 5 7 2 5 R. Khoury (2008) Page 52 Genetic Algorithm Example Possible mutation operations Complements (1-8, 2-7, 3-6, 4-5) Swap two random genes 4 8 5 1 5 7 2 5 ECE457 Applied Artificial Intelligence 4 8 5 1 4 7 2 5 4 5 5 1 8 7 2 5 R. Khoury (2008) Page 53 Limits of Genetic Algorithms Not guaranteed to find the optimal solution In large complex state spaces, or with a low mutation rate, might converge to local optimum (premature convergence) High mutation rate can prevent convergence (mutation interference) Interdependence between genes makes it hard to find solution (epistasis) ECE457 Applied Artificial Intelligence R. Khoury (2008) Page 54