Simulated Annealing A Basic Introduction Massachusetts Institute of Technology Olivier de Weck, Ph.D.

Simulated Annealing
A Basic Introduction
Lecture 10
Olivier de Weck, Ph.D.
Massachusetts Institute of Technology
Outline
• Heuristics
• Background in Statistical Mechanics
– Atom Configuration Problem
– Metropolis Algorithm
• Simulated Annealing Algorithm
• Sample Problems and Applications
• Summary
Learning Objectives
• Review background in Statistical Mechanics:
configuration, ensemble, entropy, heat capacity
• Understand the basic assumptions and steps in
Simulated Annealing (SA)
• Be able to transform design problems into
combinatorial optimization problems suitable for
SA
• Understand strengths and weaknesses of SA
Heuristics
What is a Heuristic?
• A Heuristic is simply a rule of thumb that hopefully will find a good
answer.
• Why use a Heuristic?
– Heuristics are typically used to solve complex (large, nonlinear, nonconvex, i.e. containing local minima) multivariate combinatorial optimization
problems that are difficult to solve to optimality.
• Unlike gradient-based methods in a convex design space, heuristics
are NOT guaranteed to find the true global optimum of a
single-objective problem, but should find many good solutions (the
mathematician's answer vs. the engineer’s answer).
• Heuristics are good at dealing with local optima without getting stuck
in them while searching for the global optimum.
Types of Heuristics
• Heuristics Often Incorporate Randomization
• Two Special Cases of Heuristics
– Construction Methods
• Must first find a feasible solution and then improve it.
– Improvement Methods
• Start with a feasible solution and just try to improve it.
• 3 Most Common Heuristic Techniques
– Simulated Annealing
– Genetic Algorithms
– Tabu Search
– New Methods: Particle Swarm Optimization, etc.
Origin of Simulated Annealing (SA)
• Definition: A heuristic technique that mathematically mirrors the
cooling of a set of atoms to a state of minimum energy.
• Origin: Applying the field of Statistical Mechanics to the field of
Combinatorial Optimization (1983)
• Draws an analogy between the cooling of a material (search for
minimum energy state) and the solving of an optimization problem.
• Original Paper Introducing the Concept
– Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated
Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671-680.
MATLAB® “peaks” function
• Difficult due to plateau at z=0 and local maxima
• SAdemo1, start point xo=[-2,-2]
• Optimum (maximum) at x*=[0.012, 1.524], z*=8.0484
[Figure: contour plot of “peaks” over x and y from -3 to 3, marking the start point and the maximum]
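For readers without MATLAB, the surface can be sketched in Python. The expression below is the closed form documented for MATLAB's `peaks`; the Python function names and the sign flip into an "energy" are assumptions made for illustration, not part of SAdemo1.

```python
import math

def peaks(x, y):
    """Closed-form surface documented for MATLAB's peaks function."""
    return (3.0 * (1.0 - x) ** 2 * math.exp(-x ** 2 - (y + 1.0) ** 2)
            - 10.0 * (x / 5.0 - x ** 3 - y ** 5) * math.exp(-x ** 2 - y ** 2)
            - math.exp(-(x + 1.0) ** 2 - y ** 2) / 3.0)

def energy(x, y):
    """SA minimizes energy, so maximizing z is posed as minimizing -z."""
    return -peaks(x, y)

z_start = peaks(-2.0, -2.0)    # start point xo, on the nearly flat region
z_best = peaks(0.012, 1.524)   # optimum reported on the slide
print(f"z(xo) = {z_start:.4f}, z(x*) = {z_best:.4f}")
```

The start point sits on the plateau where z is close to zero, which is why the early part of the search behaves almost like a random walk.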
“peaks” convergence
• Initially ~ nearly random search
• Later ~ gradient search
[Figure: SA convergence history. System Energy vs. Iteration Number (0 to 300), with curves for the current configuration and the new best configuration]
Statistical Mechanics
The Analogy
• Statistical Mechanics: The behavior of systems with many
degrees of freedom in thermal equilibrium at a finite temperature.
• Combinatorial Optimization: Finding the minimum of a given
function depending on many variables.
• Analogy: If a liquid material cools and anneals too quickly, then the
material will solidify into a sub-optimal configuration. If the liquid
material cools slowly, the crystals within the material will solidify
optimally into a state of minimum energy (i.e. ground state).
– This ground state corresponds to the minimum of the cost function in an
optimization problem.
Sample Atom Configuration
• Original Configuration: E=133.67 (energy of the original configuration)
• Perturbed Configuration: E=109.04
• Perturbing = move a random atom to a new random (unoccupied) slot
[Figure: two panels titled “Atom Configuration - Sample Problem”, each showing four numbered atoms on a grid of slots]
Configurations
• Mathematically describe a configuration
– Specify coordinates of each “atom”
$\{r_i\}$ with $i=1,\dots,4$: $r_1=(1,1)^T$, $r_2=(3,2)^T$, $r_3=(4,4)^T$, $r_4=(5,3)^T$
– Specify a slot for each atom
$R = [1\ \ 12\ \ 19\ \ 23]$ for atoms $i=1,\dots,4$
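The two descriptions are interchangeable once a slot numbering is fixed. The sketch below assumes a 5 x 5 grid with slots numbered up each column first; this convention is inferred from the slot list 1, 12, 19, 23 and the atom heights used in the energy calculation, and the function names are hypothetical.

```python
GRID = 5  # assumed 5 x 5 grid of slots

def slot_to_xy(slot, grid=GRID):
    """Map a 1-based slot number to (x, y), counting up each column first."""
    x, y = divmod(slot - 1, grid)
    return x + 1, y + 1

def xy_to_slot(x, y, grid=GRID):
    """Inverse mapping from 1-based grid coordinates to a slot number."""
    return (x - 1) * grid + y

R = [1, 12, 19, 23]
coords = [slot_to_xy(s) for s in R]
print(coords)
```

Storing a configuration as a list of slot numbers makes the perturbation step simple: pick one entry and replace it with a slot number that is not already in the list.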
Energy of a state
• Each state (configuration) has an energy
level associated with it
• Hamiltonian: $H(q,\dot{q},t) = \sum_i \dot{q}_i p_i - L(q,\dot{q},t)$
• Energy: $H = T + V = E_{tot}$ (kinetic plus potential)
• Energy of configuration = Objective function value of design
Energy sample problem
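• Define energy function for “atom” sample problem
$E_i = m g y_i + \sum_{j \neq i} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right]^{1/2}$
– The slide labels the first term “potential energy” and the sum “kinetic energy”
$E(R) = \sum_{i=1}^{N} E_i(r_i)$
• Absolute and relative position of each atom contributes to Energy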
Compute Energy of Config. A
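• Energy of initial configuration (m=1, g=10)
$E_1 = 1 \cdot 10 \cdot 1 + \sqrt{5} + \sqrt{18} + \sqrt{20} = 20.95$
$E_2 = 1 \cdot 10 \cdot 2 + \sqrt{5} + \sqrt{5} + \sqrt{5} = 26.71$
$E_3 = 1 \cdot 10 \cdot 4 + \sqrt{18} + \sqrt{5} + \sqrt{2} = 47.89$
$E_4 = 1 \cdot 10 \cdot 3 + \sqrt{20} + \sqrt{5} + \sqrt{2} = 38.12$
Total Energy Configuration A: E({rA})= 133.67
[Figure: “Atom Configuration - Sample Problem”, showing the four atoms of configuration A on the grid]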
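The energy of configuration A can be checked with a few lines of Python. This is a minimal sketch: the coordinates (1,1), (3,2), (4,4), (5,3) and the values m=1, g=10 are read off the sample problem, and the function names are hypothetical.

```python
import math

def atom_energy(i, coords, m=1.0, g=10.0):
    """E_i = m*g*y_i plus the distances from atom i to every other atom."""
    xi, yi = coords[i]
    pair_term = sum(math.hypot(xi - xj, yi - yj)
                    for j, (xj, yj) in enumerate(coords) if j != i)
    return m * g * yi + pair_term

def total_energy(coords, m=1.0, g=10.0):
    """E(R) is the sum of the per-atom energies."""
    return sum(atom_energy(i, coords, m, g) for i in range(len(coords)))

config_a = [(1, 1), (3, 2), (4, 4), (5, 3)]
per_atom = [round(atom_energy(i, config_a), 2) for i in range(len(config_a))]
total = total_energy(config_a)
print(per_atom, round(total, 2))
```

Because every atom sums over all the others, each pair distance enters the total twice, which is consistent with the per-atom values quoted for this configuration.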
Boltzmann Probability
• Number of configurations: $N_R = \frac{P!}{(P-N)!}$
– P = # of slots = 25
– N = # of atoms = 4
– $N_R = 25 \cdot 24 \cdot 23 \cdot 22 = 303{,}600$
• What is the likelihood that a particular configuration will exist in a large ensemble of configurations?
$P(\{r\}) = \exp\left( -\frac{E(\{r\})}{k_B T} \right)$
• Boltzmann probability depends on energy and temperature
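A short sketch of both quantities follows. It assumes $k_B = 1$ (as in the later use of exp(-E/T)) and compares the two configurations shown earlier, with E=133.67 and E=109.04; the function name is hypothetical.

```python
import math

P, N = 25, 4                   # slots and atoms of the sample problem
n_configs = math.perm(P, N)    # P! / (P - N)!

def boltzmann_weight(energy, temperature, k_b=1.0):
    """Unnormalized Boltzmann factor exp(-E / (k_B * T))."""
    return math.exp(-energy / (k_b * temperature))

print("number of configurations:", n_configs)

# Relative likelihood of the perturbed (E = 109.04) versus the
# original (E = 133.67) configuration at three temperatures.
for T in (100.0, 10.0, 1.0):
    ratio = boltzmann_weight(109.04, T) / boltzmann_weight(133.67, T)
    print(f"T = {T:5.1f}: lower-energy state is {ratio:.3g} times more likely")
```

At high temperature the two states are nearly equally likely; as T drops, the lower-energy state dominates by many orders of magnitude.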
Boltzmann Collapse at low T
[Figure: three histograms of Occurrences vs. Energy (0 to 200), ordered from T high to T low]
• T=100: E avg=100.1184
• T=10: E avg=58.9245
• T=1: E avg=38.1017
Boltzmann Distribution collapses to the lowest energy
state(s) in the limit of low temperature
Basis of search by Simulated Annealing
Partition Function Z
• Ensemble of configurations can be
described statistically
• Partition function, Z, generates the
ensemble average
$Z = \mathrm{Tr} \exp\left( -\frac{E_i}{k_B T} \right) = \sum_{i=1}^{N_R} \exp\left( -\frac{E_i}{k_B T} \right)$
Initially (at T>>0) equal to the number of possible configurations
Free Energy
• Average energy of all configurations in an ensemble:
$E_{avg} = \langle E(T) \rangle = \frac{1}{Z} \sum_{i=1}^{N_R} E_i \exp\left( -\frac{E_i}{k_B T} \right) = -\frac{d \ln Z}{d(1/k_B T)}$
• Free energy:
$F(T) = -k_B T \ln Z = \langle E(T) \rangle - TS$
Relates average Energy at T with Entropy S
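For a problem as small as the four-atom example, the ensemble quantities can be computed by exact enumeration. The sketch below assumes a 5 x 5 grid, m=1, g=10 and $k_B=1$; it enumerates unordered placements and multiplies by 4! to count labelled configurations. Its numbers need not match the histograms on the slides, whose parameters are not stated.

```python
import itertools
import math

def total_energy(coords, m=1.0, g=10.0):
    """Sum of E_i: gravity term plus every pair distance counted twice."""
    gravity = sum(m * g * y for _, y in coords)
    pairs = sum(math.dist(a, b) for a, b in itertools.combinations(coords, 2))
    return gravity + 2.0 * pairs

SLOTS = [(x, y) for x in range(1, 6) for y in range(1, 6)]  # P = 25 slots
N_ATOMS = 4
ORDERINGS = math.factorial(N_ATOMS)  # labelled configurations per placement
ENERGIES = [total_energy(c) for c in itertools.combinations(SLOTS, N_ATOMS)]
E_MIN = min(ENERGIES)

def ensemble(T, k_b=1.0):
    """Return (ln Z, <E>, F, S) at temperature T by exact enumeration."""
    # Shift by E_MIN so the exponentials cannot all underflow at low T.
    weights = [math.exp(-(E - E_MIN) / (k_b * T)) for E in ENERGIES]
    w_sum = sum(weights)
    ln_z = math.log(ORDERINGS * w_sum) - E_MIN / (k_b * T)
    e_avg = sum(E * w for E, w in zip(ENERGIES, weights)) / w_sum
    free = -k_b * T * ln_z
    entropy = (e_avg - free) / T   # from F = <E> - T*S
    return ln_z, e_avg, free, entropy

for T in (1e4, 100.0, 10.0, 1.0):
    ln_z, e_avg, free, entropy = ensemble(T)
    print(f"T={T:8.1f}  lnZ={ln_z:9.3f}  Eavg={e_avg:8.3f}  "
          f"F={free:11.3f}  S={entropy:7.3f}")
```

At very high temperature ln Z tends to the logarithm of the number of configurations, and as T falls the average energy drops toward the minimum energy while the entropy shrinks.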
Specific Heat and Entropy
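• Specific Heat
$C(T) = \frac{d \langle E(T) \rangle}{dT} = \frac{\frac{1}{N_R} \sum_{i=1}^{N_R} E_i(T)^2 - \langle E(T) \rangle^2}{k_B T^2} = \frac{\langle E(T)^2 \rangle - \langle E(T) \rangle^2}{k_B T^2}$
• Entropy
$S(T) = S(T_1) - \int_{T}^{T_1} \frac{C(T')}{T'} \, dT'$, $\quad \frac{dS(T)}{dT} = \frac{C(T)}{T}$
Entropy is ~ equal to ln(# of unique configurations)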
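The fluctuation form of the specific heat follows from the partition function in a few steps. The derivation below is a sketch that introduces the shorthand $\beta = 1/(k_B T)$, which does not appear on the slides.

```latex
\begin{align*}
\beta &\equiv \frac{1}{k_B T}, \qquad
Z = \sum_i e^{-\beta E_i}, \qquad
\langle E \rangle = \frac{1}{Z}\sum_i E_i e^{-\beta E_i} = -\frac{d \ln Z}{d\beta} \\
\frac{d\langle E \rangle}{d\beta}
  &= -\frac{1}{Z}\sum_i E_i^2 e^{-\beta E_i}
     + \left(\frac{1}{Z}\sum_i E_i e^{-\beta E_i}\right)^2
   = -\left(\langle E^2 \rangle - \langle E \rangle^2\right) \\
C(T) &= \frac{d\langle E \rangle}{dT}
   = \frac{d\beta}{dT}\,\frac{d\langle E \rangle}{d\beta}
   = \left(-\frac{1}{k_B T^2}\right)
     \left(-\left(\langle E^2 \rangle - \langle E \rangle^2\right)\right)
   = \frac{\langle E^2 \rangle - \langle E \rangle^2}{k_B T^2}
\end{align*}
```

The specific heat is therefore the variance of the energy at temperature T scaled by $k_B T^2$, which is why it can be estimated directly from the energies sampled during an annealing run.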
Low Temperature Statistics
[Figure: four panels plotted against Temperature on a log scale from 10^0 to 10^4]
• # of configurations, NT: # configs decreases
• Specific Heat C and Entropy S: Entropy S decreases with lower T
• Average Energy, Eavg: approaches E* from above
• Free Energy, F: approaches E* from below
Minimum Energy Configurations
• Sample Atom Placement Problem
– Strong gravity g=10: E*=60
– Low gravity g=0.1: E*=14.26
• Uniqueness?
[Figure: two “Atom Configuration - Sample Problem” panels showing the minimum energy placement of the four atoms for each value of g]
Simulated Annealing
Dilemma
• Cannot compute energy of all configurations!
– Design space often too large
– Computation time for a single function evaluation can
be large
• Use the Metropolis Algorithm at successively lower
temperatures to find low energy states
– Metropolis: Simulate behavior of a set of atoms in
thermal equilibrium (1953)
– Probability of a configuration existing at T →
Boltzmann Probability P(r,T)=exp(-E(r)/T)
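The effect of lowering the temperature on the Metropolis rule can be seen by evaluating the acceptance probability of one fixed uphill move. This is a minimal sketch; the function name and the energy increase of 10 are illustrative assumptions.

```python
import math

def acceptance_probability(delta_e, temperature):
    """Metropolis rule: downhill moves always pass, uphill with exp(-dE/T)."""
    if delta_e <= 0.0:
        return 1.0
    return math.exp(-delta_e / temperature)

# Probability of accepting an uphill move of size 10 at falling temperatures.
for T in (100.0, 10.0, 1.0):
    print(f"T = {T:5.1f}: p_accept = {acceptance_probability(10.0, T):.5f}")
```

At high temperature almost every move is accepted, so the search wanders freely; at low temperature uphill moves are rejected almost always and the search behaves like a descent method.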
The SA Algorithm
• Terminology:
– X (or R) = Design Vector (i.e. Design, Architecture, Configuration)
– E = System Energy (i.e. Objective Function Value)
– T = System Temperature
– Δ = Difference in System Energy Between Two Design Vectors
• The Simulated Annealing Algorithm
1) Choose a random Xi, select the initial system temperature, and specify the
cooling (i.e. annealing) schedule
2) Evaluate E(Xi) using a simulation model
3) Perturb Xi to obtain a neighboring Design Vector (Xi+1)
4) Evaluate E(Xi+1) using a simulation model
5) If E(Xi+1) < E(Xi), Xi+1 is the new current solution
6) If E(Xi+1) > E(Xi), then accept Xi+1 as the new current solution with a
probability e^(-Δ/T), where Δ = E(Xi+1) - E(Xi)
7) Reduce the system temperature according to the cooling schedule and repeat from step 3
8) Terminate the algorithm once the final temperature of the schedule is reached
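The eight steps can be sketched as a short Python function. This is not the course's SA.m: the parameter names, the geometric cooling factor, the fixed number of trials per temperature (standing in for an equilibrium test) and the toy integer problem are all assumptions made for illustration.

```python
import math
import random

def simulated_annealing(x0, evaluate, perturb, t0=100.0, t_min=0.1,
                        cooling=0.9, steps_per_t=50, seed=0):
    """Minimize evaluate(x); returns (x_best, e_best, history)."""
    rng = random.Random(seed)
    x, e = x0, evaluate(x0)              # steps 1-2
    x_best, e_best = x, e
    history = [e]
    t = t0
    while t > t_min:                     # step 8: stop at the final temperature
        for _ in range(steps_per_t):
            x_new = perturb(x, rng)      # step 3
            e_new = evaluate(x_new)      # step 4
            delta = e_new - e
            # steps 5-6: downhill always, uphill with probability exp(-delta/T)
            if delta < 0 or rng.random() < math.exp(-delta / t):
                x, e = x_new, e_new
                if e < e_best:
                    x_best, e_best = x, e
            history.append(e)
        t *= cooling                     # step 7
    return x_best, e_best, history

# Toy usage (hypothetical): integers in [-50, 50] with many local minima.
def toy_energy(k):
    return 0.01 * k * k + 3.0 * math.sin(k)

def toy_perturb(k, rng):
    return max(-50, min(50, k + rng.choice((-3, -2, -1, 1, 2, 3))))

x_best, e_best, hist = simulated_annealing(40, toy_energy, toy_perturb)
print(x_best, round(e_best, 3), len(hist))
```

Keeping the best configuration separately from the current one matters because the current configuration is allowed to get worse; the returned history records the current energy after every trial.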
SA Block Diagram
[Flowchart: SA block diagram. Image by MIT OpenCourseWare.]
1. Start at To: define initial configuration Ro and evaluate energy E(Ro).
2. Perturb configuration Ri -> Ri+1 and evaluate energy E(Ri+1).
3. Compute energy difference ΔE = E(Ri+1)-E(Ri).
4. Metropolis step: if ΔE < 0, accept Ri+1 as new configuration. Otherwise create a random number v in [0,1] and accept Ri+1 if exp(-ΔE/T) > v; if not, keep Ri as current configuration.
5. Reached equilibrium at Tj? If no, return to the perturbation step. If yes, reduce temperature: Tj+1 = Tj - ΔT.
6. T < Tmin? If no, continue perturbing at the new temperature. If yes, end at Tmin.
MATLAB Function: SA.m
[Diagram: inputs and outputs of the Simulated Annealing Algorithm in SA.m]
• Inputs
– Initial Configuration: xo
– Option Flags: options
– Evaluation function: file_eval.m
– Perturbation function: file_perturb.m
• Outputs
– Best Configuration: xbest, Ebest
– History: xhist
Exponential Cooling
• $T_{k+1} = \left( \frac{T_1}{T_o} \right) T_k$, so that $T_k = \left( \frac{T_1}{T_o} \right)^k T_o$
• Typically (T1/To)~0.7-0.9
• Can also do linear or step-wise cooling, but exponential cooling often works best.
[Figure: Simulated Annealing Evolution. C-specific heat, S-entropy and T/10-temperature vs. Temperature Step (0 to 45)]
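A minimal sketch of two schedules follows, assuming the geometric form in which every step multiplies the temperature by the ratio T1/To. The function names, the starting temperature of 300, the ratio 0.9 and the 38 steps are illustrative assumptions.

```python
def exponential_schedule(t0, ratio, n_steps):
    """T_k = ratio**k * t0: each step multiplies T by ratio = T1/To."""
    return [t0 * ratio ** k for k in range(n_steps)]

def linear_schedule(t0, t_end, n_steps):
    """Equal temperature decrements from t0 down to t_end."""
    step = (t0 - t_end) / (n_steps - 1)
    return [t0 - k * step for k in range(n_steps)]

exp_t = exponential_schedule(300.0, 0.9, 38)
lin_t = linear_schedule(300.0, exp_t[-1], 38)
for k in (0, 10, 20, 37):
    print(f"step {k:2d}: exponential T = {exp_t[k]:7.2f}, linear T = {lin_t[k]:7.2f}")
```

With the same start and end temperatures, the exponential schedule leaves the hot, nearly random regime quickly and spends most of its steps at low temperature, where the fine ordering of the configuration takes place.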
Key Ingredients for SA
1. A concise description of a configuration (architecture, design,
topology) of the system (Design Vector).
2. A random generator of rearrangements of the elements in a
configuration (Neighborhoods). This generator encapsulates rules
so as to generate only valid configurations (Perturbation function).
3. A quantitative objective function containing the trade-offs that
have to be made (Simulation Model and Output Metric(s)).
Surrogate for system energy.
4. An annealing schedule of the temperatures and/or the length of
time for which the system is to be evolved.
Sample Problems
Traveling Salesman Problem
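• N cities arranged randomly on [-1,1] × [-1,1]
• Choose N=15
• SAdemo2.m
• Minimize “cost” of route (length, time,…)
• Visit each city once, return to start city
$l(R) = \sum_{i=1}^{N-1} \sqrt{ \sum_{j=1}^{2} \left[ x_j(R_{i+1}) - x_j(R_i) \right]^2 } + \sqrt{ \sum_{j=1}^{2} \left[ x_j(R_1) - x_j(R_N) \right]^2 }$
– First term: visit N cities; second term: return home to first city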
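The route length and one possible perturbation can be sketched as follows. The swap of two cities is an assumed neighborhood chosen for simplicity; the slides do not say which perturbation SAdemo2.m uses, and the function names are hypothetical.

```python
import math
import random

def route_length(cities, route):
    """Closed-tour length: consecutive legs plus the return to the first city."""
    legs = zip(route, route[1:] + route[:1])
    return sum(math.dist(cities[a], cities[b]) for a, b in legs)

def swap_two_cities(route, rng):
    """Perturbation: exchange two randomly chosen positions in the visiting order."""
    i, j = rng.sample(range(len(route)), 2)
    new_route = list(route)
    new_route[i], new_route[j] = new_route[j], new_route[i]
    return new_route

rng = random.Random(1)
N = 15
cities = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(N)]
route = list(range(N))                 # design vector: order of the visits
neighbor = swap_two_cities(route, rng)
print(round(route_length(cities, route), 4),
      round(route_length(cities, neighbor), 4))
```

The design vector is a permutation of the city indices, so a swap always yields another valid tour; this is the kind of rule a perturbation function has to encapsulate.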
TSP Problem (cont.)
• Initial (Random) Route, Length: 17.43
• Final (Optimized) Route, Length: 8.24 (shortest route, result with SA)
[Figure: two panels titled “Length of TSP Route: 17.4309” and “Length of TSP Route: 8.2384”, each showing the 15 numbered cities on [-1,1] × [-1,1] with the start city marked]
Structural Optimization
• Define:
– Design Domain
– Boundary Conditions
– Loads
– Mass constraint
• Subdivide domain
– N x M design “cells”
– Cell density $\rho_i$ = 1 or = 0
• Find $\rho_i$, $i = 1, \dots, N$, to
$\min C = f^T u(\rho_i)$
s.t. $u = K^{-1} f$
s.t. $\sum_{i=1}^{N} V_i \rho_i \le m_{max}$
• Where to put material to minimize compliance?
Structural Topology Optimization (II)
• “Energy” = strain energy = compliance
• Computed via Finite Element Analysis
[Figure: finite element mesh of the design domain with numbered nodes and cells, a cantilever boundary condition and an applied load F=-2000; deformation not drawn to scale]
Structural Topology Optimization
• Initial Structural Topology (configuration xo): Compliance C=593.0
• “Final” Topology: Compliance C=236.68
• Not “optimal” – often need post-processing
[Figure: two “Structural Configuration” panels showing the initial and the “final” material layout, with an “island” marked]
Structural Optimization –
Convergence Analysis
• Evolution of Entropy, Temperature, Specific Heat
[Figure: Simulated Annealing Evolution. C-specific heat, S-entropy and ln(T)-temperature vs. Temperature Step (0 to 40)]
• Don’t terminate when C is still large!
• Boltzmann Distributions
[Figure: histograms of Occurrences vs. Energy (0 to 2200), one per temperature step from T=300 down to T=6]
Premature termination
[Figure: SA convergence history. System Energy (0 to 2200) vs. Iteration Number (0 to 3000), with curves for the current configuration and the new best configuration]
Indicator: Best Configuration found only shortly
before Simulated Annealing terminated.
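This indicator is easy to check automatically from the energy history of a run. The sketch below is one possible test; the function names and the 90% threshold are assumptions, not part of the course code.

```python
def last_improvement_fraction(history):
    """Fraction of the run elapsed when the lowest energy was first reached."""
    best_index = min(range(len(history)), key=history.__getitem__)
    return best_index / (len(history) - 1)

def looks_premature(history, threshold=0.9):
    """Flag runs whose best configuration appeared only in the final stretch."""
    return last_improvement_fraction(history) > threshold

still_improving = [5.0, 4.0, 3.0, 2.0, 1.0]            # best found at the very end
settled = [5.0, 1.0, 2.0, 1.5, 1.2, 1.1, 1.0, 1.0, 1.0, 1.0, 1.0]
print(looks_premature(still_improving), looks_premature(settled))
```

A flagged run suggests rerunning with a lower final temperature or a slower cooling schedule before trusting the result.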
Final Example: Telescope Array
Optimization
• Place N=27 stations in xy within a 200 km radius
• Minimize UV density metric
• Ideally also minimize cable length
• Initial Configuration: Cable Length = 1064.924, UV Density = 0.67236
[Figure: two panels for the initial configuration, one with axes from -200 to 200 and one with axes from -100 to 100]
Optimized Solution
• Optimized Configuration: Cable Length = 1349.609, UV Density = 0.33333
[Figure: two panels for the optimized configuration, one with axes from -200 to 200 and one with axes from -100 to 100]
• Simulated Annealing improved UV density from 0.67 to 0.33
• Simulated Annealing transforms the array:
– Hub-and-Spoke → Circle-with-arms
Cohanim B. E., Hewitt J. N., and de Weck O.L., “The Design of Radio Telescope Array
Configurations using Multiobjective Optimization: Imaging Performance versus Cable
Length”, The Astrophysical Journal, Supplement Series, 154, 705-719, October 2004
Summary
Summary: Steps of SA
• The Simulated Annealing Algorithm
1) Choose a random Xi, select the initial system temperature, and specify the
cooling (i.e. annealing) schedule
2) Evaluate E(Xi) using a simulation model
3) Perturb Xi to obtain a neighboring Design Vector (Xi+1)
4) Evaluate E(Xi+1) using a simulation model
5) If E(Xi+1) < E(Xi), Xi+1 is the new current solution
6) If E(Xi+1) > E(Xi), then accept Xi+1 as the new current solution with a
probability e^(-Δ/T), where Δ = E(Xi+1) - E(Xi)
7) Reduce the system temperature according to the cooling schedule and repeat from step 3
8) Terminate the algorithm once the final temperature of the schedule is reached
Recent Research in SA
• Alternative Cooling Schedules and Termination
criteria
• Adaptive Simulated Annealing (ASA) –
determines its own cooling schedule
• Hybridization with other Heuristic Search
Methods (GA, Tabu Search …)
• Multiobjective Optimization with SA
References
• Cerny, V., “Thermodynamical Approach to the Traveling Salesman Problem: An Efficient
Simulation Algorithm,” J. Opt. Theory Appl., 45, 1, 41-51, 1985.
• de Weck O., “System Optimization with Simulated Annealing (SA),” Memorandum SA.pdf
• Cohanim B. E., Hewitt J. N., and de Weck O.L., “The Design of Radio Telescope Array
Configurations using Multiobjective Optimization: Imaging Performance versus Cable
Length,” The Astrophysical Journal, Supplement Series, 154, 705-719, October 2004.
• Jilla, C.D., and Miller, D.W., “Assessing the Performance of a Heuristic Simulated
Annealing Algorithm for the Design of Distributed Satellite Systems,” Acta Astronautica,
Vol. 48, No. 5-12, 2001, pp. 529-543.
• Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated
Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671-680.
• Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E., “Equation of State
Calculations by Fast Computing Machines,” J. Chem. Phys., 21, 6, 1087-1092, 1953.
MIT OpenCourseWare
http://ocw.mit.edu
ESD.77 / 16.888 Multidisciplinary System Design Optimization
Spring 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.