Simulated annealing

Simulated annealing
... an overview
Contents
1. Annealing & statistical mechanics
2. The Method
3. Combinatorial minimization
▫ The traveling salesman problem
4. Continuous minimization
▫ Thermal simplex
5. Applications
So in the first place...
what is simulated annealing?
• According to Wikipedia :
“A probabilistic metaheuristic for the optimization problem of locating a good approximation to the global optimum of a given function in a large search space”
• When you need some “good enough” solution
• For problems with many local minima
• Often used in very large discrete spaces
Annealing
Originally a blacksmithing technique in
which you cool down the metal slowly
Used to improve ductility and allow
further manipulation/shaping
Some notions of statistical mechanics
So why is that, physically?
• Gibbs free energy
• Each configuration is possible, but weighted by a Boltzmann factor : P(E) ∝ exp(-E / kT)
• Slow cooling : minimum-energy configuration
• Fast cooling (quenching) : polycrystals, defects
Some notions of statistical mechanics
Example : spins in a chain of atoms
• Possible states : Si = ±1
• Energy of a link : Eij = -J Si Sj (taking J > 0, the usual ferromagnetic convention)
• Maximum/minimum energy is ±NJ (for N the chain length)
• Distribution of energy states given by the Boltzmann factor...
• Thus at low temperature : all spins align
• But is low temperature enough in physical systems?
How SA works...
So basically, we are going to do the same thing with functions!
• Start by “heating up” the system (high randomization)
• Gradually cool down (structure appears)
• Enjoy the result (a minimal generic sketch follows)
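In code, that recipe is just a loop; here is a minimal generic sketch in Python (the `energy` and `neighbour` callables and the geometric cooling are placeholders to be supplied per problem, not something prescribed by the slides):

```python
import math
import random

def simulated_annealing(state, energy, neighbour, t0=100.0, cooling=0.95, steps=10_000):
    """Generic SA loop: start hot, propose random moves, cool down gradually."""
    current_e = energy(state)
    best, best_e = state, current_e
    t = t0
    for _ in range(steps):
        candidate = neighbour(state)              # random allowed move
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Metropolis acceptance: always go downhill, sometimes uphill
        if delta < 0 or random.random() < math.exp(-delta / t):
            state, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = state, current_e
        t = max(t * cooling, 1e-12)               # simple geometric cooling
    return best, best_e
```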
The Method
(with a big M)
Element 1 : Description
An exact description of the possible configurations (states) of the system
Example : the N-queens problem
Possible representation of the
system : a vector
In this case, {7,5,2,6,3,7,8,4}
The Method
Element 2 : Generator of random changes
Some way of evolving the system → the allowed moves
Requires some insight into the way the system works
In this case : select an attacked queen and move it to some random spot in the same row
The Method
Element 3 : Objective function
Basically, the function to optimize; it might not always be obvious in discrete systems
The analog of energy
In this case : the number of pairs of queens attacking each other
The Method
Element 4 : Acceptance probability function
The probability of taking the step to the new proposed state
Generally : P = 1 if E2 < E1, and P = exp(-(E2 - E1)/T) otherwise (the Metropolis criterion)
Formally, some function of the form P(E1, E2, T)
The Method
Element 5 : Annealing schedule
The specific way in which the temperature will flow from high to low
Will make the difference between a working algorithm and a world of pain
What counts as fast/slow cooling and hot/cold is highly case-specific
Here : T(n) = 100/n
The Method
Further considerations
• Resets
• Specific heat calculation
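A hedged sketch tying the five elements together for the N-queens example above (the T(n) = 100/n schedule is the one quoted on the slide; the board size N = 8, the move details, and the stopping rule are illustrative choices):

```python
import math
import random

N = 8  # board size / number of queens

def random_state():
    """Element 1 -- description: the column of the queen in each row."""
    return [random.randrange(N) for _ in range(N)]

def energy(cols):
    """Element 3 -- objective: number of pairs of queens attacking each other."""
    return sum(1 for i in range(N) for j in range(i + 1, N)
               if cols[i] == cols[j] or abs(cols[i] - cols[j]) == abs(i - j))

def neighbour(cols):
    """Element 2 -- generator: move one attacked queen to a random column in its row."""
    attacked = [i for i in range(N) if any(
        cols[i] == cols[j] or abs(cols[i] - cols[j]) == abs(i - j)
        for j in range(N) if j != i)]
    i = random.choice(attacked) if attacked else random.randrange(N)
    new = cols[:]
    new[i] = random.randrange(N)
    return new

def accept(e_old, e_new, t):
    """Element 4 -- acceptance: Metropolis criterion P(E1, E2, T)."""
    return e_new < e_old or random.random() < math.exp(-(e_new - e_old) / t)

# Element 5 -- annealing schedule: T(n) = 100 / n, as quoted on the slide.
state = random_state()
e = energy(state)
for n in range(1, 20_000):
    t = 100.0 / n
    candidate = neighbour(state)
    e_candidate = energy(candidate)
    if accept(e, e_candidate, t):
        state, e = candidate, e_candidate
    if e == 0:
        break
print(state, "attacking pairs:", e)
```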
Combinatorial minimization
• Minimization where the equivalent of the energy function has no continuous spectrum of values
• Can be hard to conceptualize :
▫ The energy function might not be obvious
▫ The most efficient way to reach neighbouring states might also not be obvious, and can require a lot of thought
Combinatorial minimization
The Traveling Salesman Problem
Given a list of cities and the
distances between each pair of
cities, what is the shortest
possible route that visits each
city exactly once and returns to
the origin city?
This is harder than it looks...
The number of possible tours grows as O(n!)
Exact algorithms still scale exponentially with n
Combinatorial minimization
The Traveling Salesman Problem : Description
• A state can be defined as one distinct possible
route passing through all cities
• Supposing N cities, a vector of length N can
specify the order in which to visit them
For example, with N=6 : {C1, C4, C2, C3, C6, C5}
• Must also specify the position of each city
Supposing 2D, that's {xi, yi}
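In code, that description could look like this minimal sketch (variable names and the random coordinates are illustrative, not from the slides):

```python
import random

N = 6
# Positions of the cities: {x_i, y_i} for each city (random here, just for illustration).
cities = [(random.random(), random.random()) for _ in range(N)]
# A state: the order in which the cities are visited, e.g. {C1, C4, C2, C3, C6, C5},
# encoded as zero-based indices into the `cities` list.
route = [0, 3, 1, 2, 5, 4]
```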
Combinatorial minimization
The Traveling Salesman Problem : Generator
This one can be a bit tricky... (a sketch of moves 1 and 3 follows the list)
1. Take two random cities and swap them
2. Take two consecutive cities and swap them
3. Take two non-consecutive cities and swap the whole segment between them
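A hedged sketch of moves 1 and 3; here the "swap the whole segment" move is implemented as a segment reversal, which is one common reading, though other segment operations are possible:

```python
import random

def swap_two(route):
    """Move 1: swap two randomly chosen cities in the visiting order."""
    i, j = random.sample(range(len(route)), 2)
    new = route[:]
    new[i], new[j] = new[j], new[i]
    return new

def reverse_segment(route):
    """Move 3: pick two non-consecutive positions and reverse the segment between them."""
    n = len(route)
    i, j = sorted(random.sample(range(n), 2))
    while j - i < 2:                      # re-draw until the two cities are non-consecutive
        i, j = sorted(random.sample(range(n), 2))
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]
```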
Combinatorial minimization
The Traveling Salesman Problem : Objective function
Pretty simple: we have the position of each city...
The objective is basically the total distance of the tour.
Note : the route is closed, so point N+1 is the same as the first point (a sketch follows).
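A minimal sketch of that objective in Python, assuming `cities` is a list of (x, y) tuples and `route` a permutation of city indices (names are illustrative):

```python
import math

def tour_length(route, cities):
    """Total Euclidean length of the closed tour (the last city connects back to the first)."""
    total = 0.0
    for k in range(len(route)):
        x1, y1 = cities[route[k]]
        x2, y2 = cities[route[(k + 1) % len(route)]]   # wrap around to close the loop
        total += math.hypot(x2 - x1, y2 - y1)
    return total
```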
Combinatorial minimization
The Traveling Salesman Problem : Annealing schedule
This is the part that requires experimentation... In the literature, some considerations :
▫ If the square in which the cities are located has a side of length √N, then temperatures above √N can be considered hot, and temperatures below 1 are cold;
▫ Every 100 steps OR 10 successful reconfigurations, multiply the temperature by 0.9 (a sketch follows);
▫ There could also be some continuous equivalent...
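In code, that step-based rule could look like the sketch below (the 0.9 factor and the 100-step / 10-success thresholds are the slide's values; the function name and counter handling are illustrative):

```python
def cool(t, steps_at_t, successes_at_t, factor=0.9, max_steps=100, max_successes=10):
    """Multiply T by 0.9 every 100 attempted steps OR 10 successful reconfigurations."""
    if steps_at_t >= max_steps or successes_at_t >= max_successes:
        return t * factor, 0, 0          # lower T and reset both counters
    return t, steps_at_t, successes_at_t
```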
Combinatorial minimization
The Traveling Salesman Problem : Some results
[Figure : intermediate tours at T = 1.2 and T = 0.8]
Combinatorial minimization
The Traveling Salesman Problem : Some results
[Figure : tours at T = 0.4 and at T = 0.0 (final result)]
Combinatorial minimization
The Traveling Salesman Problem : Some results
Constraints can also be added to the objective. [Figure : resulting tours]
Continuous minimization
Somewhat simpler, at least conceptually :
• Description : the system state is some point x
• Generator : x + dx, where dx is generated somewhat randomly
This is where we actually have some room to play... the way dx is specified is entirely up to us
• Objective function : the function being minimized itself
• Annealing schedule : should be gradual once again, but strongly depends on the function being minimized (see the sketch below)
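A minimal continuous-case sketch, assuming a Gaussian dx and a geometric cooling schedule (both illustrative choices, not prescribed by the slides):

```python
import math
import random

def anneal_continuous(f, x0, t0=10.0, t_min=1e-3, cooling=0.99, step=0.5):
    """SA on a continuous variable: propose x + dx with Gaussian dx, Metropolis-accept, cool."""
    x, fx = x0, f(x0)
    t = t0
    while t > t_min:
        dx = random.gauss(0.0, step)          # generator: x + dx with random dx
        x_new = x + dx
        f_new = f(x_new)
        if f_new < fx or random.random() < math.exp(-(f_new - fx) / t):
            x, fx = x_new, f_new
        t *= cooling                           # gradual annealing schedule
    return x, fx

# Example: a function with many local minima
best_x, best_f = anneal_continuous(lambda x: x * x + 10 * math.sin(3 * x), x0=5.0)
```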
Continuous minimization
An example of implementation (NR Webnote 1)
Return of the AMOEBA!
[Figure : the four basic moves of the downhill simplex (amoeba)]
Continuous minimization
An example of implementation
Except here’s a twist...
• A positive thermal fluctuation is added to each of the stored (old) function values of the simplex vertices
• Another fluctuation is subtracted from the function value of the proposal point
• Thus the new point is favoured over the old points at high temperatures
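A hedged illustration of that idea (inspired by the Numerical Recipes simplex-annealing routine; the log-uniform form of the fluctuation is the usual choice there, but this is a sketch, not the actual NR code):

```python
import math
import random

def thermal_value(y, t, sign):
    """Perturb a stored function value y by a positive, temperature-proportional fluctuation.

    -log(u), with u uniform in (0, 1], is a positive random number, so with sign=+1 the
    old simplex vertices are penalized and with sign=-1 the proposal point gets a bonus:
    at high T the new point is favoured, and as T -> 0 this reduces to the plain
    downhill-simplex comparison."""
    return y + sign * t * (-math.log(1.0 - random.random()))

# Usage in the accept/reject comparison of a simplex step (sketch):
# keep the proposal if thermal_value(y_try, T, -1) < thermal_value(y_worst, T, +1)
```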
Applications
• Lenses : the merit function depends on many factors : curvature radii, densities, thicknesses, etc.
• Placement problems : when you have to place a lot of components in a very limited space... how do you arrange them optimally?
Used in logic boards and processors
Conclusion
• Simulated annealing is cool (after some time)
• It allows searching a large parameter space without getting bogged down in local minima
• Very useful for discrete, combinatorial problems, for which there are not many algorithms (since most are gradient-based)
• Questions?
Simulated annealing
Round 2 !
Revisiting...
1. Specific heat calculation
2. Parameter spaces & objective functions
3. Simulated Annealing vs MCMC
Specific heat calculation
Reminder : we can define an equivalent of the specific heat for a given problem through the energy (objective) function :
Cv(T) = (⟨E²⟩ - ⟨E⟩²) / T²
But how many steps do we need to get an acceptable value for Cv?
Answer : it basically depends on E, and more specifically on the variance of the Cv estimate.
Specific heat calculation
Thus this will be case-specific.
All points used to compute ⟨E⟩ and ⟨E²⟩ must be taken at the same temperature; the annealing schedule should therefore be arranged in blocks of constant temperature.
Cv's variance is usually pretty large, so we need a lot of data. As an example, for one test function, at least 10k points per block were needed (a sketch of the block estimate follows).
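A sketch of the block-wise estimate, assuming `energies` holds the objective values sampled during one constant-temperature block (with k_B = 1):

```python
def specific_heat(energies, t):
    """C_v(T) = (<E^2> - <E>^2) / T^2, estimated from samples taken at constant T."""
    n = len(energies)
    mean_e = sum(energies) / n
    mean_e2 = sum(e * e for e in energies) / n
    return (mean_e2 - mean_e ** 2) / (t * t)
```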
Parameter spaces & objective functions
Generally speaking...
• Parameter space :
1. Figure out what parameters you need to describe a state exactly; these can be integers, real numbers, or whatever else you need
2. Write it out in vector form (or possibly even matrix form), i.e. [P1, P2, ..., Pn]
3. The set of all possible vectors (varying the parameters defined in 1) is your parameter space
• Objective function :
1. Write out the objective function in terms of the previously defined parameters
Parameter spaces & objective functions
The traveling salesman problem
• Parameter space :
The set of all possible ways to visit all cities exactly once. We have N cities, and we need three pieces of information for each : the position xi, the position yi, and the rank Ri at which the city will be visited
• Objective function : (a plausible reconstruction follows)
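The formula itself was an image on the slide; a plausible reconstruction of the objective (the total length of the closed tour, with R_{N+1} wrapping around to R_1) is:

```latex
E = \sum_{i=1}^{N} \sqrt{\left(x_{R_{i+1}} - x_{R_i}\right)^2 + \left(y_{R_{i+1}} - y_{R_i}\right)^2},
\qquad R_{N+1} \equiv R_1
```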
Parameter spaces & objective functions
The knapsack problem
Given N types of objects, each with a weight and a value, how should you fill a sack of limited capacity so as to maximize the total value carried?
Parameter spaces & objective functions
The knapsack problem
• Parameter space :
The set of all possible ways to fill the sack without exceeding the weight limit. Suppose we have N types of objects; we need three pieces of information for each type : the number ni of objects of that type in the sack, the weight mi, and some value Vi
• Objective function : (a plausible reconstruction follows)
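The formula was an image on the slide; a plausible reconstruction, writing M_max for the (unnamed) weight capacity and minimizing the negative of the packed value, is:

```latex
E = -\sum_{i=1}^{N} n_i V_i
\qquad \text{subject to} \qquad \sum_{i=1}^{N} n_i m_i \le M_{\max}
```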
Parameter spaces & objective functions
The spanning tree problem
Given a weighted graph (a set of vertices and weighted links between them), what is the minimum-total-weight subgraph that connects all vertices?
Parameter spaces & objective functions
The spanning tree problem
• Parameter space :
The set of all M links, to each of which we attribute an activation value µi = 0 or 1 and a weight xi
For N the number of vertices, there are N - 1 degrees of freedom (a spanning tree activates exactly N - 1 links)
Parameter spaces & objective functions
The spanning tree problem
• Objective function : the total weight of the active links, where µi = 1 or 0 flags an active/inactive link (a plausible reconstruction follows)
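A plausible reconstruction of the missing formula, with the additional constraint that the active links actually connect all vertices:

```latex
E = \sum_{i=1}^{M} \mu_i \, x_i ,
\qquad \mu_i \in \{0, 1\}, \qquad \sum_{i=1}^{M} \mu_i = N - 1
```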
Parameter spaces & objective functions
The N-queens problem
• Parameter space :
The set of all possible ways to place the N queens, supposing there is one per row. For each queen we then need only its column position xi
• Objective function : (a plausible reconstruction follows)
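The formula was an image on the slide; with one queen per row, a plausible reconstruction is the number of attacking pairs, using Iverson brackets [·] that equal 1 when the condition holds (E = 0 for a valid solution):

```latex
E = \sum_{i < j} \Big( \big[\, x_i = x_j \,\big] + \big[\, |x_i - x_j| = |i - j| \,\big] \Big)
```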
Simulated Annealing vs MCMC
Needs of each method : Continuous case
SA :
• Generator function : some way to generate a proposal point x + dx
• Acceptance function (almost always) : the Metropolis criterion, P = 1 if the energy decreases, exp(-ΔE/T) otherwise
• Annealing schedule (how, specifically, the temperature will decrease)
• Usually stops after a set number of steps (usually has no memory of the chain)

MCMC :
• Generator function : some way to generate a proposal point x + dx
• Acceptance function (commonly) : the Metropolis-Hastings ratio, e.g. min(1, P(x+dx)/P(x)) for a symmetric proposal
• Number of chains, starting points, possibly with different temperatures
• Usually stops either after a set number of steps or when some condition of minimal error on the variance of the chain is fulfilled (requires keeping the chain in memory!)
Simulated Annealing vs MCMC
Needs of each method : Combinatorial case
SA :
• Generator function : some way to generate a proposal neighbour state
• Acceptance function (almost always) : the Metropolis criterion, P = 1 if the energy decreases, exp(-ΔE/T) otherwise
• Annealing schedule (how, specifically, the temperature will decrease)
• Usually stops after a set number of steps (usually has no memory of the chain)

MCMC :
• Generator function : some way to generate a proposal neighbour state
• Acceptance function (commonly) : the Metropolis-Hastings ratio over the target distribution
• Number of chains, starting points, possibly with different temperatures
• Usually stops either after a set number of steps or when some condition of minimal error on the variance of the chain is fulfilled (requires keeping the chain in memory!)
Simulated Annealing vs MCMC
Some more considerations...
• The choice of method is highly case-dependent.
▫ Continuous : MCMCs should be able to handle most cases; SA should only be used for particularly badly behaved objective functions
▫ Combinatorial : MCMCs can work... but most objective functions in combinatorial problems tend to have a lot of very deep minima, so SA is usually best
• SA is more demanding computationally, but can search a wider
parameter space
• MCMCs can easily be parallelized (more chains)