Simulated Annealing A Basic Introduction Lecture 10 Olivier de Weck, Ph.D. Massachusetts Institute of Technology 1 Outline • Heuristics • Background in Statistical Mechanics – Atom Configuration Problem – Metropolis Algorithm • Simulated Annealing Algorithm • Sample Problems and Applications • Summary 2 Learning Objectives • Review background in Statistical Mechanics: configuration, ensemble, entropy, heat capacity • Understand the basic assumptions and steps in Simulated Annealing (SA) • Be able to transform design problems into a combinatorial optimization problem suitable to SA • Understand strengths and weaknesses of SA 3 Heuristics 4 What is a Heuristic? • A Heuristic is simply a rule of thumb that hopefully will find a good answer. • Why use a Heuristic? – Heuristics are typically used to solve complex (large, nonlinear, nonconvex (i.e. contain local minima)) multivariate combinatorial optimization problems that are difficult to solve to optimality. • Unlike gradient-based methods in a convex design space, heuristics are NOT guaranteed to find the true global optimal solution in a single objective problem, but should find many good solutions (the mathematician's answer vs. the engineer’s answer) • Heuristics are good at dealing with local optima without getting stuck in them while searching for the global optimum. 5 Types of Heuristics • Heuristics Often Incorporate Randomization • Two Special Cases of Heuristics – Construction Methods • Must first find a feasible solution and then improve it. – Improvement Methods • Start with a feasible solution and just try to improve it. • 3 Most Common Heuristic Techniques – – – – Simulated Annealing Genetic Algorithms Tabu Search New Methods: Particle Swarm Optimization, etc… 6 Origin of Simulated Annealing (SA) • Definition: A heuristic technique that mathematically mirrors the cooling of a set of atoms to a state of minimum energy. • Origin: Applying the field of Statistical Mechanics to the field of Combinatorial Optimization (1983) • Draws an analogy between the cooling of a material (search for minimum energy state) and the solving of an optimization problem. • Original Paper Introducing the Concept – Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671680. 7 MATLAB® “peaks” function Difficult due to plateau at z=0, local maxima SAdemo1 xo=[-2,-2] Maximum Optimum at x*=[ 0.012, 1.524] z*=8.0484 Peaks 3 2 1 y • • • • • • 0 -1 -2 Start Point -3 -3 -2 -1 0 x 1 2 3 8 “peaks” convergence • Initially ~ nearly random search • Later ~ gradient search SA convergence history 2 current configuration new best configuration 0 System Energy -2 -4 -6 -8 -10 0 50 100 150 Iteration Number 200 250 300 9 Statistical Mechanics 10 The Analogy • Statistical Mechanics: The behavior of systems with many degrees of freedom in thermal equilibrium at a finite temperature. • Combinatorial Optimization: Finding the minimum of a given function depending on many variables. • Analogy: If a liquid material cools and anneals too quickly, then the material will solidify into a sub-optimal configuration. If the liquid material cools slowly, the crystals within the material will solidify optimally into a state of minimum energy (i.e. ground state). – This ground state corresponds to the minimum of the cost function in an optimization problem. 11 Sample Atom Configuration Original Configuration Perturbed Configuration Atom Configuration - Sample Problem Atom Configuration - Sample Problem 7 7 6 6 5 5 3 4 4 4 3 2 2 2 1 1 3 2 2 3 1 1 0 -1 -1 4 3 0 0 1 2 3 4 5 6 7 E=133.67 Energy of original (configuration) -1 -1 0 1 4 5 6 7 E=109.04 Perturbing = move a random atom to a new random (unoccupied) slot 12 Configurations • Mathematically describe a configuration – Specify coordinates of each “atom” ri with i=1,..,4 r1 1 , r2 1 3 , r3 2 4 , r4 4 5 3 – Specify a slot for each atom ri with i=1,..,4 R 1 12 19 23 13 Energy of a state • Each state (configuration) has an energy level associated with it H q, q, t qi pi L q, q, t Hamiltonian i H T V kinetic potential Energy Etot Energy of configuration = Objective function value of design 14 Energy sample problem • Define energy function for “atom” sample problem N Ei mgyi potential energy xi xj 2 yi yj 1 2 2 j i kinetic energy N E R Ei ri i 1 Absolute and relative position of each atom contributes to Energy 15 Compute Energy of Config. A • Energy of initial configuration Atom Configuration - Sample Problem 7 E1 1 10 1 5 18 E2 1 10 2 5 5 20 5 20.95 26.71 6 5 3 4 4 3 E3 1 10 4 E4 1 10 3 18 20 5 5 2 2 47.89 38.12 2 2 1 1 0 -1 -1 0 1 2 3 4 5 6 7 Total Energy Configuration A: E({rA})= 133.67 16 Boltzmann Probability NR P! P N ! Number of configurations P=# of slots=25 N=# of atoms =4 NR=6,375,600 What is the likelihood that a particular configuration will exist in a large ensemble of configurations? P {r} exp E ({r}) k BT Boltzmann probability depends on energy and temperature 17 Boltzmann Collapse at low T T=100 T=10 T=10 - E avg=58.9245 T=1 - E avg=38.1017 14000 12000 12000 12000 10000 10000 10000 8000 6000 4000 8000 6000 4000 2000 0 Occurences 14000 Occurences Occurences T=100 - E avg=100.1184 T=1 20 40 60 80 T high 100 Energy 120 140 160 180 200 0 6000 4000 2000 0 8000 2000 0 20 40 60 80 100 Energy 120 140 160 180 200 0 0 20 40 60 80 100 Energy 120 140 160 180 200 T low Boltzmann Distribution collapses to the lowest energy state(s) in the limit of low temperature Basis of search by Simulated Annealing 18 Partition Function Z • Ensemble of configurations can be described statistically • Partition function, Z, generates the ensemble average Z Ei Tr exp kBT NR i Ei exp kBT 1 Initially (at T>>0) equal to the number of possible configurations 19 Free Energy NR Eavg E (T ) Average Energy of all Configurations in an ensemble Ei T i 1 F T kBT ln Z NR exp Eavg E (T ) i 1 E(T ) TS Ei T k BT Z Ei T d ln Z d 1 k BT Relates average Energy at T with Entropy S 20 Specific Heat and Entropy • Specific Heat C T dE (T ) dT 1 NR NR 2 i E (T ) E T i 1 k BT 2 2 2 E (T ) E T k BT 2 2 • Entropy T1 S (T ) S (T1 ) T C (T ) dT T dS (T ) dT C (T ) T Entropy is ~ equal to ln(# of unique configurations) 21 Low Temperature Statistics Specific Heat C and Entropy S 5 10 # of configurations, NT 10 # configs decreases 6 4 2 0 10 0 10 2 10 Temperature 0 4 10 0 2 10 Temperature 4 10 Average Energy, Eavg 150 100 50 0 -50 -100 Approaches -150 E* from below Ht. Capacity 10 150 Free Energy, F Entropy S decreases with lower T 8 0 10 2 10 Temperature 4 10 100 50 0 -50 -100 -150 0 10 2 10 Temperature 4 Approaches E* from above 10 22 Minimum Energy Configurations • Sample Atom Placement Problem Atom Configuration - Sample Problem Atom Configuration - Sample Problem 7 7 6 6 5 5 E*=60 E*=14.26 4 4 3 3 2 2 3 2 1 4 1 3 4 2 1 1 4 3 0 0 -1 -1 Uniqueness? 0 1 2 3 4 5 6 Strong gravity g=10 7 -1 -1 0 1 2 5 6 7 Low gravity g=0.1 23 Simulated Annealing 24 Dilemma • Cannot compute energy of all configurations ! – Design space often too large – Computation time for a single function evaluation can be large • Use Metropolis Algorithm, at successively lower temperatures to find low energy states – Metropolis: Simulate behavior of a set of atoms in thermal equilibrium (1953) – Probability of a configuration existing at T Boltzmann Probability P(r,T)=exp(-E(r)/T) 25 The SA Algorithm • Terminology: – X (or R or ) = Design Vector (i.e. Design, Architecture, Configuration) – E = System Energy (i.e. Objective Function Value) – T = System Temperature – = Difference in System Energy Between Two Design Vectors • The Simulated Annealing Algorithm 1) Choose a random Xi, select the initial system temperature, and specify the cooling (i.e. annealing) schedule 2) Evaluate E(Xi) using a simulation model 3) Perturb Xi to obtain a neighboring Design Vector (Xi+1) 4) Evaluate E(Xi+1) using a simulation model 5) If E(Xi+1)< E(Xi), Xi+1 is the new current solution 6) If E(Xi+1)> E(Xi), then accept Xi+1 as the new current solution with a probability e(- /T) where = E(Xi+1) - E(Xi). 7) Reduce the system temperature according to the cooling schedule. 8) Terminate the algorithm. 26 SA Block Diagram Define initial configuration Ro Start To End Tmin y T<Tmin? Evaluate energy E(Ro) Perturb configuration _ Ri > Ri+1 n Compute energy difference E = E(Ri+1)-E(Ri) Evaluate energy E(Ri+1) n Reduce temperature Ti+1 =Tj - T y Reached equilbrium at Tj ? Accept Ri+1 as new configuration y E < 0? y n Keep Ri as current configuration n exp - E > v? T Create random number v in [0,1] Metropolis step SA Block Diagram Image by MIT OpenCourseWare. 27 MATLAB Function: SA.m SA.m Initial Configuration xo Option Flags options Simulated Annealing Algorithm evaluation function perturbation function file_eval.m file_perturb.m Best Configuration xbest Ebest History xhist 28 Exponential Cooling • Typically (T1/To)~0.7-0.9 Tk Simulated Annealing Evolution 10 C-specific heat S-entropy T/10-temperature 9 1 T1 To k Tk 8 7 Can also do linear or step-wise cooling 6 5 4 … But exponential cooling often works best. 3 2 1 0 0 5 10 15 20 25 Temperature Step 30 35 40 45 29 Key Ingredients for SA 1. A concise description of a configuration (architecture, design, topology) of the system (Design Vector). 2. A random generator of rearrangements of the elements in a configuration (Neighborhoods). This generator encapsulates rules so as to generate only valid configurations. Perturbation function. 3. A quantitative objective function containing the trade-offs that have to be made (Simulation Model and Output Metric(s)). Surrogate for system energy. 4. An annealing schedule of the temperatures and/or the length of times for which the system is to be evolved. 30 Sample Problems 31 Traveling Salesman Problem • • • • • N cities arranged randomly on [-1,1] Choose N=15 SAdemo2.m Minimize “cost” of route (length, time,…) Visit each city once, return to start city N 2 l R x j Ri i 1 1 j 1 x j Ri 2 2 x j R1 x j RN 2 j 1 visit N cities return home to first city 32 TSP Problem (cont.) Initial (Random) Route Length: 17.43 Final (Optimized) Route Length: 8.24 Length of TSP Route: 17.4309 Length of TSP Route: 8.2384 1 7 0.8 1 2 10 7 11 0.8 0.6 Start 1 0.4 8 2 10 11 0.6 1 8 0.4 3 0.2 3 12 14 9 0.2 0 0 -0.2 -0.2 -0.4 -0.4 -0.8 -1 -1 6 5 -0.6 4 -0.8 15 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 -1 1 -1 6 5 -0.6 -0.8 13 Start 12 14 9 4 13 -0.8 15 -0.6 -0.4 -0.2 0 0.2 Shortest Route 0.4 0.6 0.8 1 Result with SA 33 Structural Optimization • Define: – – – – Design Domain Boundary Conditions Loads Mass constraint • Subdivide domain – N x M design “cells” – Cell density =1 or =0 find i 1,..., N i min C f T u( i ) s.t. u K 1f N s.t. Vi i mmax i 1 • Where to put material to minimize compliance? 34 Structural Topology Optimization (II) “Energy” = strain energy = compliance Computed via Finite Element Analysis 41 42 42 28 31 29 32 32 19 21 22 11 23 23 11 12 12 1 Cantilever Boundary condition 33 33 20 10 1 4343 13 13 2 22 33 44 45 46 47 48 49 50 44 45 30 31 32 33 34 35 36 46 47 34 35 36 37 38 48 39 40 34 49 35 50 21 22 23 3624 25 26 27 37 24 25 26 27 2838 29 30 24 39 25 40 12 13 14 26 15 16 17 18 27 28 14 15 16 17 18 19 20 14 29 15 3 4 5 16 6 7 8 9 30 17 188 4 5 6 7 9 10 4 19 5 20 6 7 8 9 10 Applied Load F=-2000 Deformation not drawn to scale 35 Structural Toplogy Optimization Initial Structural Toplogy “Final” Toplogy Initial Structural Configuration: x o “island” 9 Structural Configuration 8 7 6 5 4 3 2 1 0 -1 0 2 4 6 8 Compliance C=593.0 10 Compliance C=236.68 Not “optimal” – often need post-processing 36 Structural Optimization – Convergence Analysis Simulated Annealing Evolution 14 45 T=7.4 C-specific heat S-entropy ln(T)-temperature 12 Don’t terminate when C is still large ! 40 T=36 35 T=19 10 Occurences 30 8 6 25 20 15 4 10 2 0 T=6.7 T=10 T=11 T=13 T=6 T=21 T=9.2 T=8.2 T=16 T=29T=68 T=93 T=24 T=1.8e+002 T=55 T=40 T=3e+002 T=17 T=1.9e+002 T=32 T=2.2e+002 T=2.7e+002 T=14 T=1.3e+002 T=1.6e+002 T=2.4e+002 T=61 T=45T=1e+002 T=49T=84 T=26 T=1.1e+002 T=75 T=1.4e+002 5 0 5 10 15 20 25 Temperature Step 30 35 40 0 200 400 600 800 1000 1200 1400 Energy 1600 1800 2000 2200 Boltzmann Distributions Evolution of - Entropy, Temperature, Specific Heat 37 Premature termination SA convergence history 2200 current configuration new best configuration 2000 1800 System Energy 1600 1400 1200 1000 800 600 400 200 0 500 1000 1500 2000 Iteration Number 2500 3000 Indicator: Best Configuration found only shortly before Simulated Annealing terminated. 38 Final Example: Telescope Array Optimization • Place N=27 stations in xy within a 200 km radius • Minimize UV density metric • Ideally also minimize cable length Cable Length = 1064.924 Initial Configuration UV Density = 0.67236 200 100 100 50 0 0 -100 -50 -200 -200 -100 0 100 200 -100 -100 -50 0 50 100 39 Optimized Solution Cable Length = 1349.609 UV Density = 0.33333 200 100 100 50 0 0 -100 -50 -200 -200 -100 0 100 200 -100 -100 -50 0 50 100 Simulated Annealing Improved UV density from 0.67 to 0.33 • Simulated Annealing transforms the array: – Hub-and-Spoke Circle-with-arms Cohanim B. E., Hewitt J. N., and de Weck O.L., “The Design of Radio Telescope Array Configurations using Multiobjective Optimization: Imaging Performance versus Cable Length”, The Astrophysical Journal, Supplement Series, 154, 705-719, October 2004 40 Summary 41 Summary: Steps of SA • The Simulated Annealing Algorithm 1) Choose a random Xi, select the initial system temperature, and outline the cooling (ie. annealing) schedule 2) Evaluate E(Xi) using a simulation model 3) Perturb Xi to obtain a neighboring Design Vector (Xi+1) 4) Evaluate E(Xi+1) using a simulation model 5) If E(Xi+1)< E(Xi), Xi+1 is the new current solution 6) If E(Xi+1)> E(Xi), then accept Xi+1 as the new current solution with a probability e(- /T) where = E(Xi+1) - E(Xi). 7) Reduce the system temperature according to the cooling schedule. 8) Terminate the algorithm. 42 Recent Research in SA • Alternative Cooling Schedules and Termination criteria • Adaptive Simulated Annealing (ASA) – determines its own cooling schedule • Hybridization with other Heuristic Search Methods (GA, Tabu Search …) • Multiobjective Optimization with SA 43 References • Cerny, V., "Thermodynamical Approach to the Traveling Salesman Problem: An Efficient Simulation Algorithm", J. Opt. Theory Appl., 45, 1, 41-51, 1985 . • de Weck O. “System Optimization with Simulated Annealing (SA), Memorandum SA.pdf • Cohanim B. E., Hewitt J. N., and de Weck O.L., “The Design of Radio Telescope Array Configurations using Multiobjective Optimization: Imaging Performance versus Cable Length”, The Astrophysical Journal, Supplement Series, 154, 705-719, October 2004 • Jilla, C.D., and Miller, D.W., “Assessing the Performance of a Heuristic Simulated Annealing Algorithm for the Design of Distributed Satellite Systems,” Acta Astronautica, Vol. 48, No. 5-12, 2001, pp. 529-543. • Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671-680. • Metropolis,N., A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller, "Equation of State Calculations by Fast Computing Machines", J. Chem. Phys.,21, 6, 1087-1092, 1953. 44 MIT OpenCourseWare http://ocw.mit.edu ESD.77 / 16.888 Multidisciplinary System Design Optimization Spring 2010 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.