Optimization Techniques for Design Space Exploration

2012-03-27
Zebo Peng
Embedded Systems Laboratory (ESLAB)
Linköping University
Outline
• Optimization problems in ERT system design
• Heuristic techniques
• Simulated annealing
• Tabu search
Prof. Z. Peng, ESLAB/LiTH, Sweden
Design Space of ERT Systems
• Very large due to many solution parameters:
  - architectures and components
  - hardware/software partitioning
  - mapping and scheduling
  - operating systems and global control
  - communication synthesis
[Figure: a heterogeneous embedded system with a microprocessor, ASIC, DSP, embedded memory, analog circuit, sensor, network, and high-speed electronics, partitioned into hardware and software. Source: S3]
Design Space of ERT Systems
• Very bumpy due to the interdependence between the different parameters and the many design constraints (time, power, partial structure, ...).
• Many embedded RT systems have a very complex design space.
[Figure: a bumpy cost surface illustrating the complex design space]
Design Space Exploration
What is needed in order to explore the complex design space to find a good solution:
• Exploration at higher levels of abstraction.
• Development of high-level analysis and estimation techniques.
• Employment of very fast exploration algorithms.
• Memory-less algorithms: each solution needs a huge data structure to store, so we cannot afford to keep track of all visited solutions.
Design space exploration should be formulated as optimization problems, and powerful optimization techniques are needed!
The Optimization Problem
The majority of design space exploration tasks can be viewed as optimization problems: to find
- the architecture (type and number of processors, memory modules, and communication mechanism, as well as their interconnections),
- the mapping of functionality onto the architecture components, and
- the schedules of basic functions and communications,
such that a cost function (in terms of implementation cost, performance, power, etc.) is minimized and a set of constraints is satisfied.
Mathematical Optimization
• The design optimization problems can be formulated as:

    Minimize    f(x)
    Subject to  g_i(x) ≤ b_i,  i = 1, 2, ..., m;

  where
  - x is a vector of decision variables (x ≥ 0);
  - f is the cost (objective) function;
  - the g_i are a set of constraints.
• If f and the g_i are linear functions, we have a linear programming (LP) problem.
• LP can be solved by the simplex algorithm, which is an exact method:
  - it will always identify the optimal solution if one exists.
Combinatorial Optimization (CO)
• In many design problems, the decision variables are restricted to integer values.
  - The solution is a set, or a sequence, of integers or other discrete objects.
  - We then have an Integer Linear Programming (ILP) problem, which turns out to be more difficult to solve than LP problems.
• Ex. System partitioning can be formulated as:
  - Given a graph with costs on its edges, partition the nodes into k subsets no larger than a given maximum size, to minimize the total cost of the cut edges.
  - A feasible solution (with n nodes) is represented as
      x_i = j,  j ∈ {1, 2, ..., k},  i = 1, 2, ..., n.
• Such problems are called combinatorial optimization problems.
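The partitioning formulation can be made concrete with a small evaluation routine; this is a minimal sketch, and the 4-node graph, edge weights, and size limit below are illustrative, not taken from the slides.

```python
# A k-way partitioning solution is a vector x with x[i] = j, assigning node i
# to subset j in {1, ..., k}. The objective is the total cost of cut edges.

def cut_cost(edges, x):
    """edges: list of (i, j, weight) tuples; x[i] = subset index of node i."""
    return sum(w for i, j, w in edges if x[i] != x[j])

def feasible(x, k, max_size):
    """Every subset index must lie in 1..k and no subset may exceed max_size."""
    sizes = {}
    for j in x:
        sizes[j] = sizes.get(j, 0) + 1
    return all(1 <= j <= k for j in sizes) and all(s <= max_size for s in sizes.values())

# Illustrative 4-node graph, partitioned into k = 2 subsets of at most 2 nodes.
edges = [(0, 1, 5), (0, 2, 1), (1, 3, 2), (2, 3, 4)]
x = [1, 1, 2, 2]                     # nodes 0, 1 in subset 1; nodes 2, 3 in subset 2
print(cut_cost(edges, x))            # cut edges (0,2) and (1,3): 1 + 2 = 3
print(feasible(x, k=2, max_size=2))  # True
```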
Features of CO Problems
• Most CO problems for ERT system design, e.g., system partitioning with constraints, are NP-complete.
• The time needed to solve an NP-complete problem grows exponentially with respect to the problem size n.
• Approaches for solving such problems are usually based on implicit enumeration of the feasible solutions.
• For example, to enumerate all feasible solutions for a scheduling problem (all possible permutations), we have:
  - 20 tasks in 1 hour (assumption);
  - 21 tasks in 20 hours;
  - 22 tasks in 17.5 days;
  - ...
  - 25 tasks in 6 centuries.
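The slide's figures are rounded; under the assumption that all n! task orderings are enumerated and that the 20-task case takes exactly one hour, a quick check gives the same orders of magnitude:

```python
import math

def enumeration_hours(n, base_tasks=20):
    # Enumeration time scales with the number of permutations: n! / base_tasks!
    # (in hours, assuming base_tasks tasks take exactly one hour).
    return math.factorial(n) // math.factorial(base_tasks)

print(enumeration_hours(21))               # 21 hours
print(enumeration_hours(22) / 24)          # ~19 days
print(enumeration_hours(25) / (24 * 365))  # ~728 years, i.e. centuries
```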
Heuristics
• A heuristic seeks near-optimal solutions at a reasonable computational cost, without being able to guarantee either feasibility or optimality.
• Motivations:
  - Many exact algorithms require a huge computational effort.
  - The decision variables have complicated interdependencies.
  - We often have nonlinear cost functions and constraints, or even no mathematical functions at all.
    • Ex. The cost function f can, for example, be defined by a computer program (e.g., for power estimation).
  - The model is an approximation for optimization purposes.
    • A near-optimal solution is usually good enough and could even be better than the theoretical optimum.
Heuristic Approaches to CO
• Constructive:
  - Problem specific: list scheduling, left-edge algorithm, clustering.
  - Generic methods: divide and conquer, branch and bound.
• Transformational (iterative improvement):
  - Problem specific: Kernighan-Lin algorithm for graph partitioning.
  - Generic methods: neighborhood search, simulated annealing, tabu search, genetic algorithms.
Clustering for System Partitioning
• Each node initially belongs to its own cluster, and clusters are gradually merged until the desired partitioning is found.
• Merge operations are selected based on local information (closeness metrics), rather than a global view of the system.
[Figure: a five-node example (v1–v5) with weighted edges; the closest pair of clusters is merged at each step until the desired partitioning remains]
Branch-and-Bound
• Traverse an implicit tree to find the best leaf (solution).
• 4-city TSP example, with symmetric distance matrix:

        0    1    2    3
   0    0    3    6   41
   1    3    0   40    5
   2    6   40    0    4
   3   41    5    4    0

  Total cost of the tour 0 → 1 → 2 → 3 → 0 = 88.
Branch-and-Bound Ex
• Lower bound on the cost function: the accumulated cost of the partial tour.
• Search strategy over partial tours (L = lower bound; leaves show the exact tour cost):

  {0}  L ≥ 0
    {0,1}  L ≥ 3
      {0,1,2}  L ≥ 43  →  {0,1,2,3}  L = 88
      {0,1,3}  L ≥ 8   →  {0,1,3,2}  L = 18
    {0,2}  L ≥ 6
      {0,2,1}  L ≥ 46  →  {0,2,1,3}  L = 92
      {0,2,3}  L ≥ 10  →  {0,2,3,1}  L = 18
    {0,3}  L ≥ 41
      {0,3,1}  L ≥ 46  →  {0,3,1,2}  L = 92
      {0,3,2}  L ≥ 45  →  {0,3,2,1}  L = 88

• A subtree whose lower bound already reaches the best known tour cost can be pruned.
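The search sketched in the tree can be written directly; the code below is a minimal branch-and-bound for the 4-city instance, using the accumulated cost of the partial tour as the lower bound.

```python
# Minimal branch-and-bound for the 4-city TSP example. The lower bound of a
# partial tour is its accumulated cost; a branch is pruned as soon as that
# bound reaches the cost of the best complete tour found so far.

INF = float("inf")

d = [[0, 3, 6, 41],    # symmetric distance matrix of the 4-city example
     [3, 0, 40, 5],
     [6, 40, 0, 4],
     [41, 5, 4, 0]]

def tsp_branch_and_bound(d):
    n = len(d)
    best_cost, best_tour = INF, None

    def branch(tour, cost):
        nonlocal best_cost, best_tour
        if cost >= best_cost:              # prune: bound already too high
            return
        if len(tour) == n:                 # leaf: close the tour back to city 0
            total = cost + d[tour[-1]][0]
            if total < best_cost:
                best_cost, best_tour = total, tour + [0]
            return
        for city in range(1, n):
            if city not in tour:
                branch(tour + [city], cost + d[tour[-1]][city])

    branch([0], 0)
    return best_cost, best_tour

print(tsp_branch_and_bound(d))   # (18, [0, 1, 3, 2, 0])
```

The optimum of 18 for tour {0,1,3,2} matches the leaf values in the search tree; branches such as {0,3} (L ≥ 41) are cut off without ever being completed.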
Neighborhood Search Method
• Step 1 (Initialization)
  (A) Select a starting solution x_now ∈ X.
  (B) x_best = x_now, best_cost = c(x_best).
• Step 2 (Choice and termination)
  Choose a solution x_next ∈ N(x_now), in the neighborhood of x_now.
  If no solution can be selected, or the terminating criteria apply, then the algorithm terminates.
• Step 3 (Update)
  Re-set x_now = x_next.
  If c(x_now) < best_cost, perform Step 1(B).
  Go to Step 2.
Neighborhood Search Method
• Very attractive for many CO problems, as they have a natural neighborhood structure which can be easily defined and evaluated.
• Ex. Graph partitioning: swapping two nodes across the cut.
[Figure: a two-way partitioning of a weighted graph; swapping two nodes reduces the cut cost from 209 to 38]
The Descent Method
• Step 1 (Initialization)
• Step 2 (Choice and termination)
  Choose x_next ∈ N(x_now) such that c(x_next) < c(x_now), and terminate if no such x_next can be found.
• Step 3 (Update)
The descent method can easily get stuck at a local optimum.
[Figure: a cost curve over the solution space with several local optima]
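A minimal sketch of the descent method for two-way graph partitioning, using node swaps as the neighborhood; the graph and weights are illustrative. The search accepts the first improving swap and stops at a local optimum.

```python
import itertools

def cut_cost(edges, side):
    # total weight of edges whose endpoints lie on different sides of the cut
    return sum(w for i, j, w in edges if side[i] != side[j])

def descent(edges, side):
    side = dict(side)
    cost = cut_cost(edges, side)
    improved = True
    while improved:                     # Step 2: terminate when no improving move
        improved = False
        for a, b in itertools.combinations(side, 2):
            if side[a] == side[b]:
                continue                # swapping same-side nodes changes nothing
            side[a], side[b] = side[b], side[a]
            new_cost = cut_cost(edges, side)
            if new_cost < cost:         # Step 3: accept the first improving swap
                cost, improved = new_cost, True
                break
            side[a], side[b] = side[b], side[a]   # undo a non-improving swap
    return side, cost

# Illustrative graph: the heavy edges (0,2) and (1,3) should not be cut.
edges = [(0, 1, 1), (2, 3, 1), (0, 2, 10), (1, 3, 10)]
start = {0: "A", 1: "A", 2: "B", 3: "B"}   # initial cut cost: 20
print(descent(edges, start)[1])            # 2
```

On this tiny instance descent happens to reach the optimum; on bumpier cost surfaces it stops at the first local optimum it encounters, which motivates the uphill-move strategies that follow.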
Dealing with Local Optimality
• Enlarge the neighborhood.
• Start with different initial solutions.
• Allow "uphill moves":
  - Simulated annealing
  - Tabu search
[Figure: a cost curve with a local optimum marked X; escaping it requires uphill moves]
Annealing
• Annealing is the slow cooling of metallic material after heating:
  - A solid material is heated past its melting point.
  - Cooling is then done slowly.
  - When cooling stops, the material usually settles into a low-energy state (e.g., a stable structure).
  - In this way, the properties of the material are improved.
Annealing
• The annealing process can be viewed intuitively as follows:
  - At high temperature, the atoms are randomly oriented due to their high energy caused by heat.
  - When the temperature is reduced, the atoms tend to line up with their neighbors, but different regions may have different directions.
  - If cooling is done slowly, the final frozen state will have a near-minimal energy.
Simulated Annealing
Annealing can be simulated using computer simulation:
• Generate a random perturbation of the atom orientations and calculate the resulting energy change.
• If the energy has decreased, the system moves to this new state.
• If the energy has increased, the new state is accepted according to the laws of thermodynamics:
  At temperature t, the probability of an increase in energy of magnitude ΔE is given by

    p(ΔE) = e^(−ΔE / (k·t))

  where k is called Boltzmann's constant.
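Numerically, the same uphill move becomes rapidly less likely as the temperature falls; a small check (folding Boltzmann's constant into the temperature scale, k = 1, as is common in SA implementations):

```python
import math

def accept_prob(delta_e, t, k=1.0):
    # probability of accepting an energy increase delta_e at temperature t
    return math.exp(-delta_e / (k * t))

for t in (10.0, 1.0, 0.1):
    # the same uphill move (delta_e = 1) at decreasing temperatures
    print(f"t = {t:4}   p = {accept_prob(1.0, t):.5f}")
```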
Probability of Accepting Higher-Energy States
[Figure: p(ΔE) = e^(−ΔE/(k·t)) plotted against ΔE; the probability decays from 1 toward 0 as ΔE grows from 0 to about 4]
Simulated Annealing for CO
• The SA algorithm can be applied to combinatorial optimization via the following analogy:

  Thermodynamic simulation      Combinatorial optimization
  System states                 Feasible solutions
  Energy                        Cost
  Change of state               Moving to a neighboring solution
  Temperature                   "Control parameter"
  Frozen state                  Final solution
The SA Algorithm
  Select an initial solution x_now ∈ X;
  Select an initial temperature t > 0;
  Select a temperature reduction function α;
  Repeat
    Repeat
      Randomly select x_next ∈ N(x_now);
      δ = cost(x_next) − cost(x_now);
      If δ < 0 then x_now = x_next
      else
        generate p uniformly at random in the range (0, 1);
        If p < exp(−δ/t) then x_now = x_next;
    Until iteration_count = nrep;
    Set t = α(t);
  Until stopping condition = true.
  Return x_now as the approximation to the optimal solution.
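The pseudocode above translates almost line for line into Python. The cost function and swap neighborhood below are illustrative stand-ins (sorting a permutation), not the slides' estimator; the parameter values are likewise arbitrary defaults.

```python
import math
import random

def simulated_annealing(cost, x_init, neighbor, t=10.0, alpha=0.95,
                        nrep=100, t_min=1e-3, seed=1):
    rng = random.Random(seed)
    x_now = x_init
    x_best, best_cost = x_now, cost(x_now)
    while t > t_min:                          # stopping condition
        for _ in range(nrep):                 # nrep iterations per temperature
            x_next = neighbor(x_now, rng)
            delta = cost(x_next) - cost(x_now)
            # downhill always accepted; uphill with probability exp(-delta/t)
            if delta < 0 or rng.random() < math.exp(-delta / t):
                x_now = x_next
                if cost(x_now) < best_cost:
                    x_best, best_cost = x_now, cost(x_now)
        t = alpha * t                         # temperature reduction function
    return x_best, best_cost

def cost(perm):
    # illustrative cost: total displacement of each value from its sorted place
    return sum(abs(v - p) for p, v in enumerate(perm))

def swap_neighbor(perm, rng):
    i, j = rng.sample(range(len(perm)), 2)
    nxt = list(perm)
    nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt

best, best_c = simulated_annealing(cost, [6, 5, 4, 3, 2, 1, 0], swap_neighbor)
print(best_c)
```

For this 7-element instance the geometric schedule is long enough that a run typically ends at or very near the optimum cost of 0.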
The SA Algorithm
• Simulated annealing is really an optimization strategy rather than an algorithm, i.e., a metaheuristic.
• To have a working algorithm, the following must be done:
  - Selection of generic parameters.
  - Problem-specific decisions must also be made.
A HW/SW Partitioning Example
[Figure: SA cost trace over ~1400 iterations; the cost function value drops from about 75,000 to about 35,000, with the optimum found at iteration 1006]
The Cooling Schedule I
• Initial temperature (IT):
  - IT must be "hot" enough to allow an almost free exchange of neighborhood solutions, if the final solution is to be independent of the starting one.
  - Simulating the heating process:
    • A system can first be heated rapidly until the proportion of accepted to rejected moves reaches a given value, e.g., until 80% of the moves leading to higher costs are accepted.
    • Cooling will then start.
The Cooling Schedule II
• Temperature reduction scheme:
  - A large number of iterations at few temperatures, or a small number of iterations at many temperatures.
  - Typically α(t) = a · t, where a < 1; a should be large, usually between 0.8 and 0.99.
  - For better results, the reduction rate should be slower in the middle temperature ranges.
• Stopping conditions:
  - Zero temperature (the theoretical requirement).
  - A number of iterations or temperatures has passed without any acceptance of moves.
  - A given total number of iterations has been completed (or a fixed amount of execution time).
Problem-Specific Decisions
• The neighborhood structure should be defined such that:
  - All solutions are reachable from each other.
  - It is easy to randomly generate a neighboring feasible solution.
  - Infeasible solutions are penalized, if the solution space is strongly constrained.
  - The cost difference between a solution and its neighbor can be efficiently calculated.
  - The size of the neighborhood is kept reasonably small.
• Many decision parameters must be fine-tuned based on experimentation with typical data.
Introduction to Tabu Search
• Tabu search (TS) is a neighborhood search method which employs "intelligent" search and flexible memory to avoid being trapped at a local optimum.
  - It de-emphasizes randomization.
  - Moves are selected intelligently in each iteration (the best admissible move is selected).
  - Tabus are used to restrict the search space and avoid cyclic behavior (dead loops).
  - The classification of tabus is based on the history of the search, i.e., it takes advantage of history.
• It emulates the human problem-solving process.
An Illustrative Example
• A set of tests is to be scheduled to check the correctness and other features of a given system.
• The goal is to find an ordering of the tests that maximizes the test performance (fault coverage, time, and power):
  - A feasible solution can simply be represented by a permutation of the given set of tests.
  - A neighborhood move can be defined by swapping two tests.
  - The best move is selected in each step.
  - To avoid repeating or reversing swaps done recently, we classify the most recent swaps as tabu.
TS Example
[Figure: a permutation of 7 tests; a move swaps two tests, e.g., 2 5 7 3 4 6 1 → 2 6 7 3 4 5 1]
• 21 moves are possible in each iteration in this example.
TS Example
• Let a paired-test tabu be valid for only three iterations (the tabu tenure).
[Figure: each performed swap is recorded in the tabu structure and may not be reversed for three iterations]
• When a tabu move would result in a solution better than any visited so far, its tabu classification may be overridden (an aspiration criterion).
A TS Process
A walkthrough of four iterations (tabu tenure = 3; the tabu structure records, for each pair of tests, the number of iterations during which their swap remains tabu):

Iteration 1: current solution 2 5 7 3 4 6 1, performance = 10.
  Top-5 candidates (swap: value): (5,4): 6; (7,4): 4; (3,6): 2; (2,3): 0; (4,1): −1.
  The best move, swapping tests 5 and 4, is applied and becomes tabu for three iterations.

Iteration 2: current solution 2 4 7 3 5 6 1, performance = 16.
  Top-5 candidates: (3,1): 2; (2,3): 1; (3,6): −1; (7,1): −2; (6,1): −4.
  The best move (3,1) is applied.

Iteration 3: current solution 2 4 7 1 5 6 3, performance = 18.
  Top-5 candidates: (1,3): −2; (2,4): −4; (7,6): −6; (4,5): −7; (5,3): −9.
  Every candidate increases the cost, but uphill moves are allowed! (1,3) is tabu (it would reverse the previous move), so the best admissible move (2,4) is applied.

Iteration 4: current solution 4 2 7 1 5 6 3, performance = 14.
  Top-5 candidates: (4,5): 6; (5,3): 2; (7,1): 0; (1,3): −3; (2,6): −6.
  (4,5) is still tabu, but it leads to performance 20, better than any solution visited so far: the aspiration criterion applies, the tabu is overridden, and the move is taken. Best so far!
Tabu Memory
• The paired-test tabu makes use of recency-based memory (short-term memory).
• It should be complemented by frequency-based memory (long-term memory) to diversify the search into new regions.
• Diversification is restricted to operate only on particular occasions.
• For example, we can select those occasions where no admissible improving moves exist.
Tabu Memory Structure
Iteration 26. Current solution: 1 3 6 2 7 5 4, performance = 12.
The memory combines a recency-based part (remaining tabu tenure per swapped pair) and a frequency-based part (how often each pair has been swapped).

Top-5 candidates:

  Swap   Value   Penalized value
  1,4      3       2
  2,4     −1      −6
  3,7     −3      −3
  1,6     −5      −5
  6,5     −4      −6

P.V. = Value − Frequency_count
Effects of Random Diversifications
[Figure: cost function value vs. number of iterations (0 to 3000), optimum at iteration 1941; top: full trace falling from about 2.45e+06 to about 1.95e+06; bottom: zoomed view between 1.99e+06 and 2.004e+06]
The Basic TS Algorithm
• Step 1 (Initialization)
  (A) Select a starting solution x_now ∈ X.
  (B) x_best = x_now, best_cost = c(x_best).
  (C) Set the history record H empty.
• Step 2 (Choice and termination)
  Determine Candidate_N(x_now) as a subset of N(H, x_now).
  Select x_next from Candidate_N(x_now) to minimize c(H, x).
  Terminate by a chosen iteration cut-off rule.
• Step 3 (Update)
  Re-set x_now = x_next.
  If c(x_now) < best_cost, perform Step 1(B).
  Update the history record H.
  Return to Step 2.
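A minimal sketch of the basic TS loop, specialized to permutations with pair-swap moves: the history H is a recency-based tabu list on swapped value pairs, and aspiration by objective overrides a tabu when the move beats the best cost found so far. The cost function is an illustrative stand-in, not the slides' test-performance estimator.

```python
import itertools

def tabu_search(cost, x_init, iterations=50, tenure=3):
    x_now = list(x_init)
    x_best, best_cost = list(x_now), cost(x_now)
    tabu = {}                       # value pair -> last iteration it is tabu
    n = len(x_now)
    for it in range(iterations):
        best_move, best_move_cost = None, float("inf")
        for i, j in itertools.combinations(range(n), 2):
            trial = list(x_now)
            trial[i], trial[j] = trial[j], trial[i]
            c = cost(trial)
            pair = (min(x_now[i], x_now[j]), max(x_now[i], x_now[j]))
            is_tabu = tabu.get(pair, -1) >= it
            if is_tabu and c >= best_cost:
                continue            # tabu and not aspirated: inadmissible
            if c < best_move_cost:  # best admissible move, even if uphill
                best_move, best_move_cost = (i, j, pair), c
        if best_move is None:
            break                   # every move tabu and none aspirated
        i, j, pair = best_move
        x_now[i], x_now[j] = x_now[j], x_now[i]
        tabu[pair] = it + tenure    # the reverse swap is tabu for `tenure` iterations
        if best_move_cost < best_cost:
            x_best, best_cost = list(x_now), best_move_cost
    return x_best, best_cost

def cost(perm):
    # illustrative cost: total displacement of each value from its sorted place
    return sum(abs(v - p) for p, v in enumerate(perm))

print(tabu_search(cost, [6, 5, 4, 3, 2, 1, 0]))   # ([0, 1, 2, 3, 4, 5, 6], 0)
```

With tenure = 3, the reverse of each performed swap stays inadmissible for three iterations, as in the worked example, unless the aspiration criterion overrides it.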
Tabu and Tabu Status
• A tabu is usually specified by some attributes of the moves.
• Typically, when a move containing an attribute is performed, a record is maintained for its reverse attribute.
  => This prevents reversals or repetitions!
• A tabu restriction is typically activated only under certain conditions:
  - Recency-based restriction: its attributes occurred within a limited number of iterations prior to the present iteration;
  - Frequency-based restriction: its attributes occurred with a certain frequency over a longer span of iterations.
• The tabu restrictions and tenure should be selected to achieve cycle prevention and induce vigor into the search.
Tabu Tenure Decision
• The tabu tenure, t, must be carefully selected:
  - For highly restrictive tabus, t should be smaller than for less restrictive tabus.
  - It should be long enough to prevent cycling, but short enough to avoid driving the search away from the global optimum.
• t can be determined using static rules or dynamic rules.
• A static rule chooses a value for t that remains fixed:
  - t = constant (typically between 7 and 20);
  - t = f(n), where n is the problem size (typically between 0.5·n^(1/2) and 2·n^(1/2)).
• Experimentation must be carried out to choose the best tenure!
Aspiration Criteria (AC)
• Used to determine when tabu restrictions can be overridden.
• They contribute significantly to the quality of the algorithm.
• Examples of aspiration criteria:
  - Aspiration by default: if all available moves are classified as tabu, and are not rendered admissible by some other AC, then a "least tabu" move is selected.
    • This is always implemented, e.g., by selecting the tabu move with the shortest time remaining until it becomes inactive.
  - Aspiration by objective:
    • c(x_trial) < best_cost.
    • Subdivide the search space into regions, and let best_cost(R) denote the minimum c(x) found so far in region R. If c(x_trial) < best_cost(R), the move aspiration is satisfied.
Stopping Conditions
TS does not converge naturally; typical stopping conditions are:
• A fixed number of iterations has elapsed in total.
• A fixed number of iterations has elapsed since the last best solution was found.
• A given amount of CPU time has been used.
[Figure: cost function value vs. number of iterations (0 to 3000); the cost falls from about 2.45e+06 to about 1.95e+06, with the optimum at iteration 1941]
TS vs. SA
• Neighborhood space exploration:
  - TS emphasizes complete neighborhood evaluation to identify moves of high quality.
  - SA samples the neighborhood solutions randomly.
• Move evaluation:
  - TS evaluates the relative attractiveness of moves in relation not only to the objective function change, but also to other factors of influence.
  - SA evaluates moves only in terms of their objective function change.
TS vs. SA (Cont'd)
• Search guidance:
  - TS uses multiple thresholds, reflected in the tabu tenures and aspiration criteria, which also vary non-monotonically.
  - SA is based on a single threshold, implicit in the temperature parameter, that only changes monotonically.
• Use of memory:
  - SA is memoryless.
  - TS makes heavy and intelligent use of both short-term and long-term memory.
  - TS can also use mid-term memory for intensification.
Summary
• Design space exploration is basically an optimization problem.
• Due to the complexity of the optimization problem, heuristic algorithms are widely used.
• Many general heuristics are based on neighborhood search principles.
• SA is applicable to almost any combinatorial optimization problem, and is very simple to implement.
• TS has a natural rationale: it emulates intelligent use of memory.
• When properly implemented, TS often outperforms SA (the execution time is often one order of magnitude smaller).