
Parallel simulated annealing
(Midterm Report)
Philippe Giguere
Panfeng Zhou
Feng Zhu
Xiaojian Zeng
1. Simulated Annealing
   1.1 Introduction to Simulated Annealing
   1.2 A generic simulated annealing algorithm
   1.3 Some mathematical notes on simulated annealing
2. Parallel Simulated Annealing
   2.1. MIR: Multiple Independent Runs
   2.2. Simultaneous periodically interacting searchers
   2.3. Multiple Trials
   2.4. Massive parallelization
   2.5. Partitioning of configurations
3. Adapting the GSL (GNU Scientific Library) simulated annealing function
   3.1 Original GSL interface for simulated annealing
   3.2 MIR (Multiple Independent Runs) interface
   3.3 PIMS (Periodically Interacting Multiple Search) interface
   3.4 Parallel Moves (or Clusterization)
4 Implementation of the parallel algorithms with TOP-C
   4.1 Implementation of MIR with TOP-C
       4.1.1 GenerateTaskInput()
       4.1.2 DoTask(void *input)
       4.1.3 CheckTaskResult(...)
       4.1.4 UpdateSharedData(...)
       4.1.5 Returning the best solution
   4.2 Implementation of PIMS with TOP-C
   4.3 Parallel moves implementation with TOP-C
       4.3.1 GenerateTaskInput()
       4.3.2 DoTask(void *input)
       4.3.3 CheckTaskResult(...)
       4.3.4 UpdateSharedData(void *input, void *output)
5. Problems to solve using Parallel Simulated Annealing Algorithm
   5.1 Traveling Salesman Problem
   5.2 Other problems we can apply PSA to solve
       5.2.1 The mincut problem
       5.2.2 Problem of Construction of Random Graphs
   5.3 Why does annealing work on these problems (also on TSP)
   5.4 Conclusions
1. Simulated Annealing
1.1 Introduction to Simulated Annealing
Simulated annealing is a general-purpose optimization method to find an optimal (or
near optimal) solution in various applications. It stochastically simulates the slow
cooling of a physical system.
1.2 A generic simulated annealing algorithm:
1.  L := GetInitialSolution()
2.  T := WarmingUp()
3.  Do
4.      Do
5.          L1 := Neighbor(L)
6.          Cost' := Cost(L1) - Cost(L)
7.          If Accept(Cost', T)
8.              L := L1
9.      Until Equilibrium()
10.     T := DecrementT()
11. Until Frozen()
The algorithm starts with an initial temperature T and an initial configuration L. A random perturbation is then made to the system, changing the configuration L to a new configuration L1. After calculating the change in the cost function, Cost' = Cost(L1) - Cost(L), the algorithm decides whether or not to accept the change by applying the Metropolis algorithm. The algorithm repeats steps 5-8 until the system comes into equilibrium. Then the algorithm lowers the temperature and repeats steps 4-9. The algorithm runs until the temperature T is low enough.
To decide whether or not to accept the change, the Metropolis algorithm is used in the function Accept(Cost', T). The generic function Accept(Cost', T) is (a C sketch follows the pseudocode):
Accept(Cost', T)
1. Generate a random number q, 0 < q < 1.
2. Let p = exp(-Cost' / T).
3. If Cost' < 0, accept L1.
4. Else if Cost' > 0 and q < p, accept L1.
5. Otherwise, reject L1.
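A minimal C sketch of the Accept() test above, assuming a GSL random number generator is available; the function name accept_move and its arguments are illustrative, not part of any library.

#include <math.h>
#include <gsl/gsl_rng.h>

/* Metropolis acceptance test: always take downhill moves, take uphill
   moves with probability exp(-Cost'/T). */
static int accept_move(double delta_cost, double T, const gsl_rng *r)
{
    if (delta_cost < 0.0)
        return 1;                       /* cost decreases: accept L1 */
    double q = gsl_rng_uniform(r);      /* random number q in [0,1) */
    double p = exp(-delta_cost / T);    /* acceptance probability p */
    return q < p;                       /* accept L1 if q < p, otherwise reject */
}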
When the temperature is very low, changes are accepted only if Cost' < 0, in other words, only if the cost decreases. The algorithm then becomes a greedy algorithm. A greedy algorithm can decrease the cost function, but its final result may not be optimal, because some states of the system may not be reachable from the initial state.
If the temperature is very high, all changes are accepted. This means we simply move at random and ignore the cost function, exploring all of the states of the system in search of the global optimum.
In the simulated annealing algorithm, the algorithm generates a large number of configurations, starting at a high temperature and exploring the state space, and then gradually decreasing the temperature so that more changes are rejected. As the temperature decreases to zero, the state of the system hopefully settles on the global optimum. It has been proved that if the temperature decreases slowly enough, the algorithm converges to the global optimum.
To utilize simulated annealing, we need to have the following (a minimal sketch of how these ingredients map onto C callbacks follows the list):
1. A description of the configuration for the system – the state of the system at temperature T.
2. A perturbation mechanism that can generate random changes in the configuration. A crucial requirement for the proposed changes is reachability – there must be a sufficient variety of possible changes that one can always find a sequence of changes so that any system state may be reached from any other.
3. A cost function that simulated annealing will minimize. It associates a cost with each configuration of the system.
4. An annealing schedule that includes a temperature T and a scheme for lowering the temperature as the algorithm progresses.
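A minimal sketch, in C, of how these four ingredients might be collected together; every name here is illustrative and not part of any library (the GSL's own interface is shown in Section 3.1).

#include <gsl/gsl_rng.h>

/* 1. The configuration itself is an opaque state handled through void pointers. */
typedef struct {
    double (*cost)(const void *state);                 /* 3. cost function          */
    void   (*perturb)(void *state, const gsl_rng *r);  /* 2. perturbation mechanism */
    double  t_initial;                                 /* 4. annealing schedule:    */
    double  t_min;                                     /*    start/stop temperature */
    double  cooling_factor;                            /*    and T <- T / cooling_factor */
} annealing_spec;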
When we give the system a very slow decrease in the temperature and a large amount of time at each temperature, simulated annealing will always converge.
The simulated annealing algorithm works by randomly searching an enormous number of configurations. Unlike greedy algorithms, it will not get caught in local minima, because it sometimes accepts uphill moves.
Simulated annealing always converges, but it takes a long time to find the minimum. Thus a parallel simulated annealing algorithm is desirable.
1.3 Some mathematical notes on simulated annealing
1. The configuration space E could be any arbitrary finite set.
2. The cost function Cost(), which maps E to R, could be any function.
3. A generic sequential annealing algorithm on E generates a random sequence Xn of configurations that will tend to concentrate around the minima of Cost().
4. T(n) is called a temperature (cooling) schedule if T(n) > T(n+1) and lim T(n) = 0 as n -> infinity.
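For example, the geometric schedule T(n) = T(0) * mu^n with 0 < mu < 1 (equivalently, dividing T by a constant slightly greater than 1 at each step, as we do later for TSP) satisfies both conditions.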
References:
- Robert Azencott. Simulated Annealing: Parallelization Techniques. New York: Wiley, 1987.
- Aarts, E.H.L., and P.J.M. van Laarhoven. Simulated Annealing: Theory and Applications. New York: Wiley, 1987.
2. Parallel Simulated Annealing
Assumption: we have p processors concurrently available, and each processor has enough memory and computing resources to generate its own sequential annealing sequence.
2.1. MIR: Multiple Independent Runs
1. Principle:
1. Begin with an initial temperature T and an initial configuration E. Give each processor this temperature and configuration.
2. Each processor performs the serial SA algorithm as described in the previous section until a certain number of successes is obtained.
3. Each processor sends its current configuration to a master processor. A decision algorithm is applied to select one of these configurations. (Generally we can choose the configuration with the lowest energy function; see the sketch after this list.)
4. Distribute the configuration chosen by the decision algorithm to each of the processors and repeat steps 2-3 until the prerequisite number of successes can no longer be obtained. The algorithm is then complete.
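A minimal C sketch of the decision step in point 3, assuming the master has already gathered one configuration and its energy from each of the p processors; the names are illustrative.

static int select_best(const double energy[], int p)
{
    int best = 0;
    for (int i = 1; i < p; i++)
        if (energy[i] < energy[best])   /* pick the lowest-energy configuration */
            best = i;
    return best;                        /* index of the configuration to broadcast */
}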
2. Analysis and Conclusions:
2.1 Advantages:
Attains a near-linear speedup. This is because, with p processors, we search a factor of p more possible configurations, so we increase the chances of “stumbling” onto the correct configuration more quickly.
Easy to implement. The only message passing occurs at the synchronization steps.
2.2 Disadvantages:
If one processor obtains the prerequisite number of successes before another, it must wait for that other processor to finish.
A global gathering and re-broadcasting of large configurations can be time-consuming.
3. Reference:
- Robert Azencott. Simulated Annealing: Parallelization Techniques. New York: Wiley, 1987: 37-39.
2.2. Simultaneous periodically interacting searchers
1. Principle:
1. Begin with an initial temperature T and an initial configuration E. Give each processor this temperature and configuration.
2. Each processor performs the serial SA algorithm as described in the previous section until a pre-defined time interval is reached (say 10 seconds).
3. Each processor sends its current configuration to a master processor. A decision algorithm is applied to select one of these configurations. (Generally we can choose the configuration with the lowest energy function.)
4. Distribute the configuration chosen by the decision algorithm to each of the processors and repeat steps 2-3 until the prerequisite number of successes can no longer be obtained. The algorithm is then complete.
2. Analysis and Conclusions:
2.1 Advantages:
Attains a substantial improvement over sequential annealing.
The slowest processor does not determine the overall performance.
2.2 Disadvantages:
More communication cost than MIR.
If the communication cost is high, it will be less efficient than MIR.
3. References:
- Robert Azencott. Simulated Annealing: Parallelization Techniques. New York: Wiley, 1987: 39-41.
- Aarts, E.H.L., and P.J.M. van Laarhoven. Simulated Annealing: Theory and Applications. New York: Wiley, 1987.
2.3. Multiple Trials
1. Principle:
1.1 Definition: Trial. Consider an arbitrary sequential annealing scheme with configuration space E and cooling schedule (Tn). At time n, in order to modify the current configuration Xn, one selects a random neighbor Yn of Xn, and then another random choice is made to decide whether to keep the configuration Xn or to replace it by the new configuration Yn. This sequence of two random choices is called a trial.
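A minimal C sketch of one trial as defined above; accept_move() is the Metropolis sketch from Section 1.2, and random_neighbor(), Cost(), copy_state() and free_state() are illustrative application hooks.

#include <gsl/gsl_rng.h>

/* illustrative application hooks */
extern void  *random_neighbor(const void *x, const gsl_rng *r);
extern double Cost(const void *x);
extern void   copy_state(void *dst, const void *src);
extern void   free_state(void *x);

static void one_trial(void *Xn, double T, const gsl_rng *r)
{
    void *Yn = random_neighbor(Xn, r);      /* first random choice: pick a neighbor Yn */
    double dC = Cost(Yn) - Cost(Xn);
    if (accept_move(dC, T, r))              /* second random choice: keep Xn or take Yn */
        copy_state(Xn, Yn);                 /* replace the current configuration */
    free_state(Yn);
}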
1.2. Procedure:
1. Begin with an initial temperature T and an initial configuration E. Give each processor this temperature and configuration.
2. Each processor performs only one trial of the serial SA algorithm as described in the previous section.
3. Each processor sends its current configuration to a master processor. A decision algorithm is applied to select one of these configurations. (Generally we can choose the configuration with the lowest energy function.)
4. Distribute the configuration chosen by the decision algorithm to each of the processors and repeat steps 2-3 until the prerequisite number of successes can no longer be obtained. The algorithm is then complete.
2. Analysis and Conclusions:
No processor ever sits idle.
No expensive synchronization steps; communications are smaller but more frequent.
At low temperatures the acceleration rate provided by this parallelization scheme is close to the number p of processors.
At high temperatures it is better to use a small number p of processors, whereas at low temperatures p can be much larger.
3. Reference:
- Robert Azencott. Simulated Annealing: Parallelization Techniques. New York: Wiley, 1987: 41-42.
2.4. Massive parallelization
1. Principle:
1.1 Definition: Active set. Assume that N processors are concurrently available. Fix a sequence Tn of decreasing temperatures with lim Tn = 0 as n -> infinity. At time n each processor Pi decides at random, independently of the past and of all other processors, whether it will belong to the active set An, with a fixed probability rate.
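A minimal C sketch of that random decision; the probability parameter (called beta here) stands in for the fixed rate mentioned above and is purely illustrative.

#include <gsl/gsl_rng.h>

/* Each processor calls this independently at every step n. */
static int joins_active_set(const gsl_rng *r, double beta)
{
    return gsl_rng_uniform(r) < beta;   /* belong to An with probability beta */
}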
1.2 Procedure:
1. Begin with an initial temperature T and an initial configuration E. Give each processor in the active set this temperature and configuration.
2. Each processor in the active set performs only one trial of the serial SA algorithm as described in the previous section.
3. Each processor in the active set sends its current configuration to a master processor. A decision algorithm is applied to select one of these configurations. (Generally we can choose the configuration with the lowest energy function.)
4. Distribute the configuration chosen by the decision algorithm to each of the processors and repeat steps 2-3 until the prerequisite number of successes can no longer be obtained. The algorithm is then complete.
2. Analysis and Conclusions:
In some cases, if the communication cost is too high, the parallel algorithm will be even less efficient than the sequential algorithm. We can get better performance by leaving out a small random set of processors in massively parallel annealing.
3. References:
- Robert Azencott. Simulated Annealing: Parallelization Techniques. New York: Wiley, 1987: 42-44.
- Geman, D., and S. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI 6 (1984): 721-741.
- Mezard, M., G. Parisi, and M.A. Virasoro. Spin Glass Theory and Beyond. World Scientific Lecture Notes in Physics, Vol. 9. New Jersey: World Scientific, 1987.
- Sherrington, D., and S. Kirkpatrick. Solvable model of a spin-glass. Phys. Rev. Lett. 35 (1975): 1792-1796.
2.5. Partitioning of configurations
1. Principle:
Consider a configuration space E, and assume that for each plausible configuration x in E we can write x = (x1, x2, ..., xp), where xj ∈ Ej and Ej is a (smaller) set of local configurations. Conversely, assume that whenever (x1, ..., xp) belongs to E1 × ... × Ep and satisfies a set of boundary conditions Bij(xi, xj) for all pairs i, j, then there is an x in E such that x = (x1, x2, ..., xp).
Then define for each j = 1, ..., p a neighborhood system Vj(xj) of xj in Ej. Consider p processors 1, ..., p available simultaneously. Select a common cooling schedule Tn -> 0, and let each processor j perform sequential annealing steps starting with configuration X0. At each step n, define a communication protocol to guarantee that the boundary compatibility conditions remain valid for the new partial configuration Xn+1.
2. Analysis and Conclusions:
The communication rate should be kept high whenever the number of tetrahedrons is small. Working in parallel is only efficient when the domains are of the minimum reasonable size.
Nevertheless, parallel annealing by partitioning of configurations remains an efficient scheme from the point of view of potential acceleration on better-adapted hardware architectures.
3. References:
- Robert Azencott. Simulated Annealing: Parallelization Techniques. New York: Wiley, 1987: 44-46.
- Bonomi, E., and J.L. Lutton. The N-city Traveling Salesman Problem: Statistical mechanics methods and the Metropolis algorithm. SIAM Review 26 (1984).
3. Adapting the GSL (GNU Scientific Library) simulated annealing function
The GSL implements a simple sequential form of the simulated annealing algorithm. Because of its availability, we chose to use it as the underlying code for executing the simulated annealing part of the project. A successful adaptation of this function means that other people already using the same function can use the new parallelized version without significant changes to their programs.
3.1 Original GSL interface for simulated annealing
The original function from the GSL is defined as follows:
void gsl_siman_solve(const gsl_rng * r,
                     void *x0_p,
                     gsl_siman_Efunc_t Ef,
                     gsl_siman_step_t take_step,
                     gsl_siman_metric_t distance,
                     gsl_siman_print_t print_position,
                     gsl_siman_copy_t copyfunc,
                     gsl_siman_copy_construct_t copy_constructor,
                     gsl_siman_destroy_t destructor,
                     size_t element_size,
                     gsl_siman_params_t params);
Important parameters include:
x0_p: the initial state, and the memory location where the final state is written.
params: simulated annealing parameters, such as the initial/final temperature, the temperature decay constant, and the number of iterations at each temperature.
Ef, take_step, distance: callback functions used to compute the cost, generate moves, and measure the distance between states (a hedged usage sketch follows).
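A hedged usage sketch of gsl_siman_solve(), loosely following the one-dimensional example in the GSL manual; the energy, step and metric callbacks and the parameter values below are illustrative, not tuned for any real problem.

#include <math.h>
#include <stdio.h>
#include <gsl/gsl_siman.h>

static double E1(void *xp)                 /* energy of a 1-D state */
{ double x = *(double *) xp; return exp(-pow(x - 1.0, 2.0)) * sin(8 * x); }

static double M1(void *xp, void *yp)       /* distance between two states */
{ return fabs(*(double *) xp - *(double *) yp); }

static void S1(const gsl_rng *r, void *xp, double step_size)
{                                          /* random step in [-step_size, step_size] */
    double u = gsl_rng_uniform(r);
    *(double *) xp += 2 * u * step_size - step_size;
}

int main(void)
{
    const gsl_rng_type *T;
    gsl_rng *r;
    double x0 = 15.5;
    /* n_tries, iters_fixed_T, step_size, k, t_initial, mu_t, t_min */
    gsl_siman_params_t params = {200, 1000, 1.0, 1.0, 0.008, 1.003, 2.0e-6};

    gsl_rng_env_setup();
    T = gsl_rng_default;
    r = gsl_rng_alloc(T);

    gsl_siman_solve(r, &x0, E1, S1, M1, NULL,
                    NULL, NULL, NULL, sizeof(double), params);

    printf("best x found: %g\n", x0);
    gsl_rng_free(r);
    return 0;
}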
In order to facilitate reuse of the parallelized version, we will try to keep the new function definitions similar to the original one.
3.2 MIR (Multiple Independent Runs) interface
For the MIR parallel algorithm, we have the following interface:
void SolveMIR(int nMir,                 /* how many different MIRs */
              const gsl_rng * r,
              void *x0_p[],             /* now an array of states */
              gsl_siman_Efunc_t Ef,
              gsl_siman_step_t take_step,
              gsl_siman_metric_t distance,
              gsl_siman_print_t print_position,
              gsl_siman_copy_t copyfunc,
              gsl_siman_copy_construct_t copy_constructor,
              gsl_siman_destroy_t destructor,
              size_t element_size,
              gsl_siman_params_t params);
The new arguments are:
-The number of independent runs (nMir). We might decide to have more runs than parallel processors. In particular, having significantly more runs than remote processes might help alleviate the problem of a slower CPU slowing down the group: the slow process might only do one run, while the faster ones do multiple runs. By keeping the runs smaller, we reduce the impact of the slower process (smaller task granularity). Alternatively, we might simply decide that once a significant portion of the runs have been completed, we can abort.
-A list of initial states (x0_p[]) instead of a single one. Each independent run needs a different initial state (a short usage sketch follows).
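A short usage sketch of the proposed SolveMIR() interface, with deliberately more runs than remote processes; make_random_initial_state() and the TSP callbacks are illustrative placeholders.

#include <gsl/gsl_siman.h>

#define N_RUNS 16                          /* more runs than processes */

/* illustrative placeholders */
extern void  *make_random_initial_state(const gsl_rng *r);
extern double tsp_energy(void *xp);
extern void   tsp_step(const gsl_rng *r, void *xp, double step_size);
extern double tsp_metric(void *xp, void *yp);

void run_mir_example(const gsl_rng *r, gsl_siman_params_t params, size_t state_size)
{
    void *x0[N_RUNS];
    for (int i = 0; i < N_RUNS; i++)
        x0[i] = make_random_initial_state(r);   /* each run gets its own start state */

    SolveMIR(N_RUNS, r, x0,
             tsp_energy, tsp_step, tsp_metric,  /* same callbacks as gsl_siman_solve */
             NULL, NULL, NULL, NULL,
             state_size, params);
    /* by convention the best solution ends up in x0[0] (see Section 4.1.5) */
}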
3.3 PIMS (Periodically Interacting Multiple Search) interface
void SolvePims(int nPims,               /* how many different PIMS */
               int nInteractions,       /* number of interactions */
               const gsl_rng * r,
               void *x0_p,
               gsl_siman_Efunc_t Ef,
               gsl_siman_step_t take_step,
               gsl_siman_metric_t distance,
               gsl_siman_print_t print_position,
               gsl_siman_copy_t copyfunc,
               gsl_siman_copy_construct_t copy_constructor,
               gsl_siman_destroy_t destructor,
               size_t element_size,
               gsl_siman_params_t params);
The new arguments are:
-The number of parallel searches (nPims). As with MIR, we suffer from the same granularity issue, and therefore it might be desirable to have significantly more searches than processes.
-The number of resynchronizations (nInteractions). This specifies the number of times interaction is required between the short runs, to select the best solution so far and initialize the next round of runs with this best solution.
Optionally, we could add an extra parameter to select some variant of the PIMS algorithm. In particular, we could try the following variants:
-Cascading of the best solution: say we have 10 parallel searches. At synchronization time, search j compares its solution with that of search j-1. If j is better than j-1, it keeps its solution and propagates it to j+1. If j-1 is better, then j adopts this new intermediate solution and propagates it to j+1.
-Selection of the better half: all intermediate solutions are ranked, and the top half is selected and given to the lower half. This reduces the number of distinct intermediate solutions by half, and each solution in the top half is duplicated (a sketch of this step follows the list). Other ratios could be experimented with.
-Any other propagation mechanism we can think of.
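A minimal C sketch of the "selection of the better half" resynchronization step; the copy_state callback and the in-place ranking are illustrative.

/* Rank the n intermediate solutions by energy and overwrite the worst half
   with copies of the best half. */
static void better_half_resync(void *state[], double energy[], int n,
                               void (*copy_state)(void *dst, const void *src))
{
    int idx[n];
    for (int i = 0; i < n; i++)
        idx[i] = i;

    /* simple insertion sort of the indices by increasing energy */
    for (int i = 1; i < n; i++) {
        int k = idx[i], j = i;
        while (j > 0 && energy[idx[j - 1]] > energy[k]) {
            idx[j] = idx[j - 1];
            j--;
        }
        idx[j] = k;
    }

    /* the i-th best solution replaces the i-th worst one */
    for (int i = 0; i < n / 2; i++) {
        copy_state(state[idx[n - 1 - i]], state[idx[i]]);
        energy[idx[n - 1 - i]] = energy[idx[i]];
    }
}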
3.4 Parallel Moves (or Clusterization)
void SolvePM(double BeginTemp,          /* temperature to start P.M. */
             int SyncMethod,            /* SYNCHRONOUS or ASYNCHRONOUS */
             int SyncInterval,          /* intervals between synch */
             const gsl_rng * r,
             void *x0_p,
             gsl_siman_Efunc_t Ef,
             gsl_siman_step_t take_step,
             gsl_siman_metric_t distance,
             gsl_siman_print_t print_position,
             gsl_siman_copy_t copyfunc,
             gsl_siman_copy_construct_t copy_constructor,
             gsl_siman_destroy_t destructor,
             size_t element_size,
             gsl_siman_params_t params);
The new arguments are:
-The temperature at which to start parallel moves (BeginTemp). Since parallel moves are not effective at high temperature (and possibly even detrimental), we need to specify a temperature at which to start them. Before that we will do a sequential run, or possibly MIR (still not decided).
-Synchronous or asynchronous parallel moves (SyncMethod). Synchronous means that each time a slave finds an acceptable move, all other slaves are interrupted in order to receive the new configuration. Asynchronous means that the slaves are not interrupted, but periodically check to see whether the configuration has changed.
-If the asynchronous method is selected, we need to specify the maximum number of local slave trial changes before checking for changes in the configuration (SyncInterval). If we choose a large number for this parameter, a slave may operate on a configuration that is not up-to-date, possibly resulting in wasted time.
4 Implementation of the parallel algorithms with TOP-C
A significant goal of the project is to show how quickly the simulated
annealing algorithm can be parallelized with TOP-C. Each of the following
sections describes how the different versions of the parallel algorithm
would be implemented.
4.1 Implementation of MIR with TOP-C
The implementation of MIR is rather straightforward. It does not require any
change to the basic code of the scientific library. In this model, each task
represents a complete run to be executed on a separate process.
4.1.1 GenerateTaskInput()
It will create a buffer containing one of the initial states, and the
simulated annealing parameters, such as start/stop temperatures, decay
constant, etc.
4.1.2 DoTask(void *input)
The input to DoTask will be an initial state and the simulated annealing parameters. DoTask simply calls the GSL function gsl_siman_solve(...) with the proper arguments. Once a solution has been found, it is returned by the function.
4.1.3 CheckTaskResult(...)
CheckTaskResult will maintain a copy of the best solution found so far. Every time a new result is checked in by a slave, CheckTaskResult will compare it with the current best solution and replace it if the new result is better.
4.1.4 UpdateSharedData(...)
Not used.
4.1.5 Returning the best solution
After TOPC_master_slave() has been invoked in the SolveMIR function, the best result will be copied into the first initial state. This is where the caller of SolveMIR() will retrieve the solution (a hedged TOP-C skeleton of these callbacks follows).
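A hedged TOP-C skeleton of the MIR callbacks described in Sections 4.1.1-4.1.5. The TOP-C calls (TOPC_MSG, NOTASK, NO_ACTION, TOPC_master_slave) reflect our understanding of the TOP-C API; STATE_SIZE, initial_state, run_one_annealing() and state_energy() are illustrative stand-ins for the real GSL-based code.

#include <string.h>
#include "topc.h"

#define STATE_SIZE 1024                        /* illustrative fixed state size   */
#define N_RUNS     8                           /* illustrative number of runs     */

static char   initial_state[N_RUNS][STATE_SIZE];
static char   best_state[STATE_SIZE];          /* best solution found so far      */
static double best_energy = 1e300;
static int    next_run = 0;

extern void   run_one_annealing(void *state);  /* wraps gsl_siman_solve()         */
extern double state_energy(const void *state); /* same cost function as the runs  */

TOPC_BUF GenerateTaskInput(void)
{
    if (next_run >= N_RUNS)
        return NOTASK;                         /* all independent runs dispatched */
    return TOPC_MSG(initial_state[next_run++], STATE_SIZE);
}

TOPC_BUF DoTask(void *input)
{
    static char result[STATE_SIZE];
    memcpy(result, input, STATE_SIZE);
    run_one_annealing(result);                 /* one complete sequential run     */
    return TOPC_MSG(result, STATE_SIZE);
}

TOPC_ACTION CheckTaskResult(void *input, void *output)
{
    double e = state_energy(output);
    if (e < best_energy) {                     /* keep only the best solution     */
        best_energy = e;
        memcpy(best_state, output, STATE_SIZE);
    }
    return NO_ACTION;
}

void UpdateSharedData(void *input, void *output) { /* not used for MIR */ }

/* Inside SolveMIR():
 *   TOPC_master_slave(GenerateTaskInput, DoTask, CheckTaskResult, UpdateSharedData);
 *   memcpy(x0_p[0], best_state, STATE_SIZE);    -- the caller retrieves it there
 */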
4.2 Implementation of PIMS with TOP-C
The PIMS implementation is also straightforward. In our case, we are able to completely reuse the MIR code already developed. All we have to do is run MIR repeatedly, and between runs of MIR select the best state and copy it to the other states. There is no direct call to the TOP-C layer here; that is handled inside SolveMIR.
The pseudo-code for SolvePims would be (structure and argument counts not respected, for simplification):
SolvePims(int nPims, int nInteractions, siman_params params, ...)
{
    StartTemp = params.t_initial;
    do
    {
        FinalTemp = ComputeInterTemp(StartTemp, params, nInteractions);
        SolveMIR(StartTemp, FinalTemp, state[], ...);
        Copy state[0] to state[1, 2, ..., n];   /* state[0] is the best solution */
        StartTemp = FinalTemp;
    }
    while (--nInteractions > 0);
}
4.3 Parallel moves implementation with TOP-C
The parallel moves implementation will force us to break apart the basic GSL simulated annealing function. The main components of the simulated annealing function are:
-Generating valid moves on a state.
-Temperature decrease scheduling.
The master will be responsible for the temperature scheduling, and all the slaves will be responsible for generating valid moves. The master gets the valid moves, applies them to the global state, and then forces an update on the slaves after a valid move has been accepted. The difference between the synchronous and asynchronous methods is simply how quickly we want the changes to be broadcast to the slaves.
4.3.1 GenerateTaskInput()
Since the state is now shared globally, we are not required to send any data to DoTask.
4.3.2 DoTask(void *input)
The main goal of DoTask is to find a valid move on the shared state. Once a valid move has been found, we return it to the master.
If we choose the synchronous approach, we will need to keep polling for an abort between trials. In the asynchronous case, we will instead run for a certain number of trials, and if we cannot find any acceptable move we return to the master. The pseudocode for DoTask would be:
DoTask(void *input)
{
    nTrial = 0;
    while (1)
    {
        Move = GenerateSingleMove(globalState, ...);
        if (ValidMove(Move, Temperature, ...)) {
            return TOPC_BUF(Move);
        }
        else {
            if (synchronous) {
                /* Synchronous: poll for an abort from the master */
                if (TOPC_is_abort_pending()) {
                    return TOPC_BUF(void);
                }
            } else {
                /* Asynchronous: give up after SyncInterval failed trials */
                if (++nTrial > SyncInterval) {
                    return TOPC_BUF(void);
                }
            }
        }
    }
}
4.3.3 CheckTaskResult(...)
CheckTaskResult will be responsible for maintaining the single state that all slaves use to find a move. When a slave returns, it will either have a new valid move, or no move at all. If a new valid move has been returned, CheckTaskResult will apply it to the global state. If two moves are received and conflict with each other, we simply reject the second move.
After a move has been applied, the slaves are now out of date. If the synchronous option has been selected, the master will try to abort the tasks (TOPC_abort_tasks()) to force an update; otherwise it will not. CheckTaskResult will then return UPDATE (a hedged sketch follows).
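A hedged sketch of this CheckTaskResult for the parallel-moves scheme. The Move type, still_valid() and apply_move() are illustrative; globalState and SyncMethod refer to the shared state and the SolvePM parameter described earlier; the TOP-C actions (NO_ACTION, UPDATE) and TOPC_abort_tasks() reflect our understanding of the TOP-C API.

#include "topc.h"

typedef struct move Move;                      /* application-defined move type     */
enum { SYNCHRONOUS, ASYNCHRONOUS };            /* illustrative constants for SyncMethod */
extern void *globalState;                      /* single state shared by all slaves */
extern int   SyncMethod;
extern int   still_valid(const void *state, const Move *m);
extern void  apply_move(void *state, const Move *m);

TOPC_ACTION CheckTaskResult(void *input, void *output)
{
    Move *m = (Move *) output;
    if (m == NULL || !still_valid(globalState, m))
        return NO_ACTION;                      /* no move, or it conflicts with one */
                                               /* that was already applied          */
    apply_move(globalState, m);                /* the master owns the global state  */
    if (SyncMethod == SYNCHRONOUS)
        TOPC_abort_tasks();                    /* interrupt slaves so they resync   */
    return UPDATE;                             /* broadcast the accepted move to    */
}                                              /* the slaves via UpdateSharedData   */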
4.3.4 UpdateSharedData(void *input, void *output)
UpdateSharedData will be responsible for keeping the slaves up-to-date with
the global state. The buffer output contains the valid move accepted by the
master, and the slaves now have to apply this same move to their state.
5. Problems to solve using Parallel Simulated Annealing
Algorithm:
5.1 Traveling Salesman Problem:
5.1.1 The traveling salesman problem, or TSP for short: given a finite number of "cities"
along with the cost of travel between each pair of them, find the cheapest way of visiting
all the cities and returning to the starting point.
5.1.2 There are many ways to solve TSP, and we found that using PSA might be one of them, according to the slides "Parallel Simulated Annealing" by Johnny Appleseed. To apply PSA, we have the following:
Configuration - Each city is assigned a number from 1 to N. The configuration is a listing of the route taken by the salesman.
Perturbation - Switch two of the cities along the salesman's route.
Cost function - Minimize the total length of the salesman's journey.
Annealing schedule - Experimentation is required to determine the best schedule. We can decrease T by dividing it by 1.001 and hold T constant for a fixed number of successes (in our simple example, we have 200 cities and 2000 iterations for each T). A sketch of the corresponding GSL callbacks follows.
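A minimal sketch of GSL callbacks matching the configuration, perturbation, and cost function just described; N_CITIES, the dist matrix, and route_t are illustrative.

#include <gsl/gsl_rng.h>

#define N_CITIES 200
static double dist[N_CITIES][N_CITIES];        /* travel cost between each pair of cities */

typedef struct { int city[N_CITIES]; } route_t; /* configuration: the order of the visit  */

static double tsp_energy(void *xp)             /* cost: total length of the closed tour   */
{
    route_t *rt = (route_t *) xp;
    double len = 0.0;
    for (int i = 0; i < N_CITIES; i++)
        len += dist[rt->city[i]][rt->city[(i + 1) % N_CITIES]];
    return len;
}

static void tsp_step(const gsl_rng *r, void *xp, double step_size)
{
    route_t *rt = (route_t *) xp;
    (void) step_size;                          /* unused: a move is a swap of two cities  */
    int i = (int) gsl_rng_uniform_int(r, N_CITIES);
    int j = (int) gsl_rng_uniform_int(r, N_CITIES);
    int tmp = rt->city[i];                     /* perturbation: switch two cities along   */
    rt->city[i] = rt->city[j];                 /* the salesman's route                    */
    rt->city[j] = tmp;
}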
5.1.3 Basic ideas in the simulated annealing solution to TSP:
1. Begin with an initial temperature T (in our case, 5000) and an initial route C for the salesman to traverse.
2. Switch the locations of 2 cities along the route to get a new route C*. Compute the energy function for each configuration: E and E* are the lengths of the routes specified by C and C*, respectively.
3. Compute the change in the cost function, E* - E.
4. If E* - E < 0, the change is in the right direction, and C* is accepted. Otherwise, accept C* with probability exp[-(E* - E)/T].
5. Repeat steps 2-4. If we obtain our specified number of successes, lower the temperature and begin again. Otherwise, the algorithm is complete.
5.1.4 The above is the sequential solution; what we expect to implement is the parallel one. We have implemented one simple solution roughly based on the parallelization of SA, namely MIR: each processor works on the entire problem, independently of the others. The basic procedure is as follows:
1. Begin with an initial temperature T and an initial configuration C. Give each processor (slave) this T and C.
2. Each slave performs the serial SA algorithm as described above until a certain number of successes is obtained.
3. A synchronization step is performed: each slave sends its current configuration to the master processor. A decision algorithm is applied to select one of these configurations.
4. Distribute the configuration chosen by the decision algorithm to each of the slaves and repeat steps 2-3 until the prerequisite number of successes can no longer be obtained. The algorithm is then complete.
5.1.5 So far we have implemented MIR on TSP. We are also going to implement periodically interacting multiple searches, multiple trials, massive parallelization, and partitioning of configurations if possible, and to compare and analyze the results of these different algorithms.
5.2 Other problems we can apply PSA to solve (we are very interested in further applications of PSA and may implement some of them if possible):
5.2.1 The mincut problem:
The mincut problem involves finding the partition of a graph into two subgraphs, each with the same number of nodes, such that the number of edges between the partitions is minimized. More specifically:
Given a graph G = (V, E) and a subset S of V, the cut δ(S) induced by S is the subset of edges (i, j) ∈ E such that |{i, j} ∩ S| = 1. So δ(S) consists of all those edges with exactly one endpoint in S. Given an undirected graph G = (V, E) and, for each e ∈ E, a nonnegative cost (or capacity) c_e, the cost of a cut δ(S) is the sum of the costs of the edges in the cut, that is, c(δ(S)) = Σ_{e ∈ δ(S)} c_e.
The mincut problem is then to find a cut of minimum cost.
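A minimal sketch of a cost function for annealing this problem: the configuration assigns each vertex to one of the two parts, and the cost is the total weight of the crossing edges (the equal-size constraint would be handled separately, e.g. by a penalty term or by only proposing balanced swaps). Names are illustrative.

typedef struct { int u, v; double cost; } edge_t;

/* side[v] is 0 or 1: the part that vertex v currently belongs to */
static double cut_cost(const int side[], const edge_t edges[], int m)
{
    double total = 0.0;
    for (int e = 0; e < m; e++)
        if (side[edges[e].u] != side[edges[e].v])  /* exactly one endpoint in S */
            total += edges[e].cost;
    return total;
}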
5.2.2 Problem of Construction of Random Graphs
This problem investigates the construction of random topologies for multi-computer networks. Given a number of nodes of fixed degree, annealing is used to find the graph of minimum diameter. The program could be based on a generic parallel annealing package developed with the MPI message-passing interface. More specifically:
Given a graph G = (V, E) with n vertices and m edges, and a family of pairs of vertices in V, we are interested in finding, for each pair (ai, bi), a path connecting ai to bi, such that the set of paths so found is edge-disjoint. (For arbitrary graphs the problem is NP-complete, although it is in P if the number of pairs is fixed.) A polynomial-time randomized algorithm could be presented for finding the optimal number of edge-disjoint paths (up to constant factors) in the random graph Gn,m, for all edge densities above the connectivity threshold. (The graph is chosen first, then an adversary chooses the pairs of endpoints.)
5.3 Why does annealing work on these problems (also on TSP)
-Given a very slow decrease in the temperature and a large amount of time at each temperature, SA will always converge.
-SA works by randomly searching an enormous number of configurations, and because it sometimes accepts uphill moves it does not get caught in local minima.
5.4 Conclusions
PSA is desirable for solving the problems mentioned above, and our approach is to combine TOP-C and the GSL to make the simulated annealing algorithm work in parallel on these problems.