Global Optimization And Simulated Annealing Notes (compiled from the Web)

[Source: http://www.cs.sandia.gov/opt/survey/sa.html]

Simulated Annealing Overview

Simulated annealing is a generalization of a Monte Carlo method for examining the equations of state and frozen states of n-body systems [Metropolis et al. 1953]. The concept is based on the manner in which liquids freeze or metals recrystallize in the process of annealing. In an annealing process a melt, initially at high temperature and disordered, is slowly cooled so that the system at any time is approximately in thermodynamic equilibrium. As cooling proceeds, the system becomes more ordered and approaches a "frozen" ground state at T=0. Hence the process can be thought of as an adiabatic approach to the lowest energy state. If the initial temperature of the system is too low, or cooling proceeds too quickly, the system may become quenched, forming defects or freezing out in metastable states (i.e., becoming trapped in a local minimum energy state).

In the original Metropolis scheme, an initial state of a thermodynamic system is chosen at energy E and temperature T. Holding T constant, the configuration is perturbed and the change in energy dE is computed. If the change in energy is negative, the new configuration is accepted. If the change in energy is positive, it is accepted with a probability given by the Boltzmann factor exp(-dE/T). This process is repeated enough times to give good sampling statistics at the current temperature; the temperature is then decremented and the entire process repeated until a frozen state is reached at T=0. (A code sketch of this acceptance rule appears at the end of this excerpt.)

By analogy, the generalization of this Monte Carlo approach to combinatorial problems is straightforward [Kirkpatrick et al. 1983, Cerny 1985]. The current state of the thermodynamic system is analogous to the current solution of the combinatorial problem, the energy equation of the thermodynamic system is analogous to the objective function, and the ground state is analogous to the global minimum. The major difficulty (art) in implementing the algorithm is that there is no obvious analogue of the temperature T among the free parameters of the combinatorial problem. Furthermore, avoiding entrainment in local minima (quenching) depends on the "annealing schedule": the choice of initial temperature, how many iterations are performed at each temperature, and how much the temperature is decremented at each step as cooling proceeds.

Application Domains

Simulated annealing has been used in various combinatorial optimization problems and has been particularly successful in circuit design problems (see Kirkpatrick et al. 1983).

Software

An implementation of simulated annealing applied to the traveling salesman problem can be found in Numerical Recipes, section 10.9.
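The acceptance rule described above is compact enough to show directly. Here is a minimal Python sketch of the Metropolis criterion; the function name and signature are ours, not from the source:

    import math
    import random

    def metropolis_accept(dE, T):
        # Always accept a downhill move (dE <= 0); accept an uphill move
        # with probability given by the Boltzmann factor exp(-dE/T).
        if dE <= 0.0:
            return True
        return random.random() < math.exp(-dE / T)
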
[Source: http://members.aol.com/btluke/simann1.htm]

Simulated Annealing
Brian T. Luke, Ph.D., LearningFromTheWeb.net

A simple Monte Carlo simulation samples the possible states of a system by randomly choosing new parameters. At the end of the simulation, the collection, or ensemble, of randomly chosen points in search space gives you information about this space. For example, the web page Simple Monte Carlo Simulation gives an example of a unit square containing one-quarter of a unit circle whose center is in the lower-left corner. The search space is the unit square, and any point in this space can be in one of two possible states: inside the quarter-circle or outside it. Each point in the search space is determined by the values of two parameters, its x- and y-coordinates. The possible values for each parameter can be any real number in the range [0.0, 1.0]. Each step in the simulation consists of choosing random, allowed values for both parameters. This generates a point in the search space that is associated with one of the two states. At the end of the simulation, there will be an ensemble of N points, of which Nin are inside the quarter-circle. The ratio of Nin to N is just the ratio of the area inside the quarter-circle to the area of the unit square. In other words, a simple Monte Carlo simulation randomly selects points anywhere in the search space, and all of the points are used to find out information about the search space.
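A minimal Python sketch of this quarter-circle example, using the N and Nin of the text (parameter names and the default sample count are our choices):

    import random

    def estimate_quarter_circle_area(N=100_000):
        # Sample N uniform points in the unit square and count how many
        # fall inside the quarter-circle x^2 + y^2 <= 1 centered at the
        # lower-left corner. Nin/N estimates the quarter-circle's area.
        N_in = 0
        for _ in range(N):
            x, y = random.random(), random.random()
            if x * x + y * y <= 1.0:
                N_in += 1
        return N_in / N   # converges to pi/4 ~ 0.785 as N grows
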
This procedure is useful for some problems, like the one described above for finding the area of certain regions, but it does not give physically realistic results when the search space represents an energy surface. For example, assume the simulation studies a collection of M helium atoms in a cube. The position of each atom is described by three parameters that give its coordinates within the cube, and the energy of the system is the sum of all pairwise interaction energies. If you wanted to calculate the average energy of this system, a simple Monte Carlo simulation should not be used. A random placement of the M atoms may, at some point in the simulation, put two atoms so close together that their interaction energy is virtually infinite. This adds an infinite energy to the ensemble of atom distributions and produces an infinite average energy. In the real world, two helium atoms would never get that close together. A modification to the simple Monte Carlo simulation is therefore needed so that unrealistic samples are not placed into the ensemble.

Such a modification was proposed in 1953 by Nicholas Metropolis and coworkers [1]. This modified procedure is known as a Metropolis Monte Carlo simulation. In contrast with the simple Monte Carlo simulation, a new point in search space is sampled by making a slight change to the current point. In the example used here, a new orientation of the helium atoms is created by making a random, small change to each atom's coordinates. If the energy of the new orientation is less than that of the old, the new orientation is added to the ensemble. If the energy rises, a Boltzmann acceptance criterion is used: if the energy rise is small enough, the new orientation is added to the ensemble; if the rise is too large, the new orientation is rejected and the old orientation is again added to the ensemble (see Metropolis Monte Carlo Simulation for more details). Using this acceptance probability, one can prove that, for a sufficiently large ensemble, the ensemble average of any property, such as the energy, equals the Boltzmann average of that property as determined by the Boltzmann Distribution Law. What is distinctive about the Boltzmann acceptance probability is that it depends on the temperature of the system; the Boltzmann average of a property is therefore the expected value of that property at the given temperature.

In 1983, Kirkpatrick and coworkers [2] proposed a method of using a Metropolis Monte Carlo simulation to find the lowest-energy (most stable) orientation of a system. Their method is based upon the procedure used to make the strongest possible glass. This procedure heats the glass to a high temperature so that it becomes a liquid and the atoms can move relatively freely. The temperature is then slowly lowered so that at each temperature the atoms can move enough to begin adopting the most stable orientation. If the glass is cooled slowly enough, the atoms are able to "relax" into the most stable orientation. This slow cooling process is known as annealing, and so their method is known as Simulated Annealing.

A Simulated Annealing optimization starts with a Metropolis Monte Carlo simulation at a high temperature. This means that a relatively large percentage of the random steps that increase the energy will be accepted. After a sufficient number of Monte Carlo steps, or attempts, the temperature is decreased and the Metropolis Monte Carlo simulation is continued. This process is repeated until the final temperature is reached.

A Simulated Annealing program thus consists of a pair of nested DO-loops: the outermost loop sets the temperature, and the innermost loop runs a Metropolis Monte Carlo simulation at that temperature (a Python sketch appears after the references below). The way in which the temperature is decreased is known as the cooling schedule. In practice, two cooling schedules predominate: a linear cooling schedule (Tnew = Told - dT) and a proportional cooling schedule (Tnew = C × Told, where C < 1.0). These are not the only possible cooling schedules, just the ones that appear most often in the literature; other possibilities are shown in Figure 1. To make the whole process clearer, Figure 2 presents a flow chart of a Simulated Annealing run, with explanation.

As described in more detail in the discussion of a Metropolis Monte Carlo simulation, a more difficult aspect is determining how long to run the simulation at each temperature, which depends on the maximum size of the Monte Carlo step at that temperature. While a pure Metropolis Monte Carlo simulation attempts to reproduce the correct Boltzmann distribution at a given temperature, the inner loop of a Simulated Annealing optimization only needs to run long enough to explore the regions of search space that should be reasonably populated. This allows a reduction in the number of Monte Carlo steps at each temperature, but the balance between the maximum step size and the number of Monte Carlo steps is often difficult to achieve and depends very much on the characteristics of the search space or energy landscape.

References:
[1] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller, J. Chem. Phys. 21 (1953) 1087-1092.
[2] S. Kirkpatrick, C.D. Gelatt and M.P. Vecchi, Science 220 (1983) 671-680.
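The nested-loop structure described above can be sketched as follows. This is an illustrative Python skeleton under our own assumptions, not any of the cited implementations; `energy` and `perturb` are problem-specific placeholders, and the default temperatures and step counts are arbitrary:

    import math
    import random

    def simulated_annealing(energy, perturb, x0,
                            T0=10.0, T_final=1e-3,
                            C=0.9, steps_per_T=100):
        x, E = x0, energy(x0)
        best_x, best_E = x, E
        T = T0
        while T > T_final:                 # outer loop: set the temperature
            for _ in range(steps_per_T):   # inner loop: Metropolis MC at T
                x_new = perturb(x, T)
                dE = energy(x_new) - E
                if dE <= 0.0 or random.random() < math.exp(-dE / T):
                    x, E = x_new, E + dE
                    if E < best_E:
                        best_x, best_E = x, E
            T *= C                         # proportional cooling: Tnew = C * Told
            # (a linear schedule would instead use T -= dT)
        return best_x, best_E

    # Example: minimize a one-dimensional function with several local minima.
    f = lambda x: x * x + 10.0 * math.sin(3.0 * x)
    step = lambda x, T: x + random.uniform(-1.0, 1.0) * max(T, 0.1)
    print(simulated_annealing(f, step, x0=5.0))

Tracking the best point seen so far, as done here, is a common practical touch; the bare algorithm only needs the current point.
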
[Source: http://emlab.berkeley.edu/Software/abstracts/goffe895.html]

Simulated Annealing - Global Optimization Method That Distinguishes Between Different Local Optima
Author: William Goffe
Copyright (c) 1995. William Goffe. All Rights Reserved.
Keywords: simulated annealing, optimization

Reference: This implementation of simulated annealing was used in "Global Optimization of Statistical Functions with Simulated Annealing," Goffe, Ferrier and Rogers, Journal of Econometrics, vol. 60, no. 1/2, Jan./Feb. 1994, pp. 65-100. Briefly, we found it competitive, if not superior, to multiple restarts of conventional optimization routines for difficult optimization problems.

Description: Simulated annealing is a global optimization method that distinguishes between different local optima. Starting from an initial point, the algorithm takes a step and the function is evaluated. When minimizing a function, any downhill step is accepted and the process repeats from this new point. An uphill step may also be accepted, so the algorithm can escape from local optima; this uphill decision is made by the Metropolis criterion. As the optimization proceeds, the length of the steps declines and the algorithm closes in on the global optimum. Since the algorithm makes very few assumptions about the function to be optimized, it is quite robust with respect to non-quadratic surfaces. The degree of robustness can be adjusted by the user; in fact, simulated annealing can be used as a local optimizer for difficult functions.

Platforms: Fortran

[Source: http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/anneal/asa/0.html]

ASA: Adaptive Simulated Annealing (areas/anneal/asa/)

ASA (Adaptive Simulated Annealing) is a powerful global-optimization C-code algorithm especially useful for nonlinear and/or stochastic systems. ASA was developed to statistically find the best global fit of a nonlinear, non-convex cost function over a D-dimensional space. The algorithm permits an annealing schedule in which the 'temperature' T decreases exponentially in annealing-time k, T = T_0 exp(-c k^(1/D)). The introduction of re-annealing also permits adaptation to changing sensitivities in the multi-dimensional parameter space. This annealing schedule is faster than fast Cauchy annealing, where T = T_0/k, and much faster than Boltzmann annealing, where T = T_0/ln k (the sketch below compares the three).

Origin: ftp.alumni.caltech.edu:/pub/ingber/ASA.tar.gz [131.215.48.62]
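A quick way to see the relative cooling rates is to tabulate the three schedules. This Python sketch uses arbitrary illustrative values for T_0, c, and D (ASA itself derives these from user-set parameters), so only the qualitative ordering matters:

    import math

    T0, c, D = 100.0, 1.0, 2   # illustrative values only
    print("    k          ASA     Cauchy  Boltzmann")
    for k in (2, 10, 100, 1000):
        asa       = T0 * math.exp(-c * k ** (1.0 / D))  # T = T0 exp(-c k^(1/D))
        cauchy    = T0 / k                              # T = T0 / k
        boltzmann = T0 / math.log(k)                    # T = T0 / ln k
        print(f"{k:5d}  {asa:11.3e} {cauchy:10.3f} {boltzmann:10.3f}")

With these values the exponential ASA schedule drops below the Cauchy schedule, which in turn drops far below the slow logarithmic Boltzmann schedule, matching the abstract's claim.
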
References

Cerny, V., "Thermodynamical Approach to the Traveling Salesman Problem: An Efficient Simulation Algorithm", J. Opt. Theory Appl., 45, 1, 41-51, 1985.
Kirkpatrick, S., C. D. Gelatt Jr., M. P. Vecchi, "Optimization by Simulated Annealing", Science, 220, 4598, 671-680, 1983.
Metropolis, N., A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller, "Equation of State Calculations by Fast Computing Machines", J. Chem. Phys., 21, 6, 1087-1092, 1953.
Press, W. H., B. Flannery, S. Teukolsky, W. Vetterling, Numerical Recipes, 326-334, Cambridge University Press, New York, NY, 1986.

[Source: ftp://ftp.taygeta.com/pub/publications/anneal.refs]

Simulated Annealing References

Caceci, M.S. and W.P. Cacheris, 1984; Fitting Curves to Data, BYTE, May, pp. 340-362.
Kirkpatrick, S., C.D. Gelatt Jr. and M.P. Vecchi, 1983; Optimization by Simulated Annealing, Science, V. 220, No. 4598, pp. 671-680.
MacDougall, M.H., 1987; Simulating Computer Systems, Techniques and Tools, M.I.T. Press, Cambridge, Mass., 284 pages.
McCalla, 1967; Introduction to Numerical Methods and FORTRAN Programming, John Wiley & Sons, New York.
McClelland, J.L. and D.E. Rumelhart, 1988; Explorations in Parallel Distributed Processing, A Handbook of Models, Programs, and Exercises, M.I.T. Press, Cambridge, Mass., 344 pages.
Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller, 1953; Equation of State Calculations by Fast Computing Machines, J. Chem. Phys., V. 21, No. 6, pp. 1087-1092.
Press, W.H., B.P. Flannery, S.A. Teukolsky and W.T. Vetterling, 1986; Numerical Recipes, The Art of Scientific Computing, Cambridge Univ. Press, Cambridge, England, 818 pages.
Rumelhart, D.E. and J.L. McClelland, 1986; Parallel Distributed Processing, Explorations in the Microstructure of Cognition, Volume 1: Foundations, M.I.T. Press, Cambridge, Mass., 547 pages.
Szu, H. and R. Hartley, 1987; Fast Simulated Annealing, Physics Letters A, V. 122, No. 3-4, pp. 157-162.
Wasserman, P.D., 1989; Neural Computing, Theory and Practice, Van Nostrand Reinhold, New York, 230 pages.

[Source: http://www.nutechsolutions.com/technology/genetic_algorithms.asp]

Genetic Algorithms

The entire research area of genetic algorithms, as well as that of evolutionary computing, was inspired by Darwin's theory of natural selection and survival of the fittest. Genetic algorithms are problem-solving programs that try to mimic the way large populations solve problems over a long period of time, through processes such as reproduction, mutation, and natural selection. To emulate the natural phenomenon of evolution, a genetic algorithm creates a population of candidate solutions to a particular problem and, through a process of random selection and variation, improves the quality of the solutions from generation to generation (see the sketch at the end of these notes). In this way, genetic algorithms promote the evolution of solutions using genetically based processes. Unlike natural evolution, the program is usually able to generate and evaluate thousands of generations in seconds.

Useful Links

Global Optimization: http://www.mat.univie.ac.at/~neum/glopt.html
Contains discussions (and code) for a wide variety of global optimization procedures, including simulated annealing.

A Survey of Global Optimization Methods: http://www.cs.sandia.gov/opt/survey/main.html
Just what it says, but more readable than the site listed above.

Global Optimization: http://www.fi.uib.no/~antonych/glob.html
Another site that lists journals, and lots of links.

Simulated Annealing - Global Optimization Method That Distinguishes Between Different Local Optima: http://emlab.berkeley.edu/Software/abstracts/goffe895.html
This is the source code from the Goffe et al. 1994 citation that I used as the basis for the simulated annealing code that I have implemented in Delphi.

Numerical Recipes Homepage: http://www.nr.com/
Homepage for the popular series of programs in C, Pascal, etc.

Numerical Recipes in Pascal: http://archives.math.utk.edu/software/msdos/numerical.analysis/nrpas13/.html
This link will take you to a download of all of the programs from the diskette accompanying the Pascal version of Numerical Recipes.
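For comparison with the simulated annealing sketches above, here is a minimal genetic algorithm in the same spirit as the description in the Genetic Algorithms excerpt. Every design choice here (bit-string genomes, tournament selection, one-point crossover, bit-flip mutation, the parameter defaults) is our own illustrative assumption, not taken from that page:

    import random

    def genetic_algorithm(fitness, n_genes, pop_size=50,
                          generations=100, mutation_rate=0.05):
        # Population of random bit-string candidate solutions.
        pop = [[random.randint(0, 1) for _ in range(n_genes)]
               for _ in range(pop_size)]
        for _ in range(generations):
            new_pop = []
            while len(new_pop) < pop_size:
                # Selection: the fitter of two random candidates is a parent.
                p1 = max(random.sample(pop, 2), key=fitness)
                p2 = max(random.sample(pop, 2), key=fitness)
                # Variation: one-point crossover plus bit-flip mutation.
                cut = random.randrange(1, n_genes)
                child = [g ^ 1 if random.random() < mutation_rate else g
                         for g in p1[:cut] + p2[cut:]]
                new_pop.append(child)
            pop = new_pop
        return max(pop, key=fitness)

    # Example: evolve a 20-bit string toward all ones ("one-max").
    print(genetic_algorithm(fitness=sum, n_genes=20))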