Lecture 07, 4 March 2014

The Traveling Salesman Problem
in Theory & Practice
Lecture 7: Local Optimization
4 March 2014
David S. Johnson
dstiflerj@gmail.com
http://davidsjohnson.net
Seeley Mudd 523, Tuesdays and Fridays
Outline
1. Tour of the DIMACS TSP Challenge Website and
other web resources
2. Basic local optimization heuristics and their
implementations
• 2-Opt
• 3-Opt
Projects and Presentations
Please email me by 3/11:
– The planned subject for your project (survey
paper, theoretical or experimental research
project, etc.) and
– The paper(s)/result(s) you plan to present in class.
– Preferred presentation date.
We have 7 more classes after this one: 3 more for me, 3
for presentations, and the last (4/29) for a wrap-up
from me and 10-minute project descriptions from you.
Final project write-ups are due Friday 5/2.
DIMACS Implementation Challenge
• Initiated in 2000
• Major efforts wound down in 2002
• Still updateable (in theory)
Challenge Testbeds
All provided by means of instance generation code and
specific seeds for the random number generator.
Running Time Normalization
Source code for Greedy and a generator for
random Euclidean instances, provided for download.
Participants reported their running time for
Greedy on the Test Battery instances.
Machine-Specific Correction Factors
[Figure: machine-specific correction factors for N = 10^3, 10^4, 10^5, 10^6, 10^7.]
A Tour of the Website
Local Optimization: 2-Opt
Basic Scheme
• Use a tour construction heuristic to build a starting tour.
Which heuristic?
• While there exists a 2-opt move that yields a shorter tour,
How do we determine this efficiently?
– Choose one.
Which one?
– Perform it.
With what data structure?
Each choice can affect both running time and tour quality.
Determining the existence of an
Improving 2-Opt move
• Naïve approach: Try all N(N-3)/2 possibilities.
• More sophisticated: Observe that one of the following must be
true: d(a,b) > d(b,c) or d(c,d) > d(d,a).
Suppose we consider each ordered pair
(t1,t2) of adjacent tour vertices as
candidates for the first deleted edge
in an improving 2-opt move. Then we
may restrict our attention to
candidates for the new neighbor t3 of
t2 that satisfy d(t2,t3) < d(t1,t2).
If the improving move to the left is not
caught when (t1,t2) = (a,b), it will be
caught when (t1,t2) = (c,d).
Sequential Searching
For t1 going counterclockwise around the tour,
• For t2 a tour neighbor of t1,
• For all t3 with d(t2,t3) < d(t1,t2),
• For the unique t4 that will yield a
legal 2-opt move,
• Test whether d(t1,t4)+d(t2,t3) is
less than d(t1,t2)+d(t3,t4).
• If so, add 〈(t1,t2),(t4,t3)〉 to
the list of improving moves.
• Otherwise, continue.
Note: For geometric instances where k-d
trees have been constructed, we can find
the acceptable t3’s using fixed-radius
searches from t2 with radius d(t1,t2).
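The search loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Bentley's code: cities are the integers 0..N-1, `pts` holds their coordinates, `tour` and `pos` are a hypothetical pair of arrays (tour order and each city's position in it), and candidates for t3 are found by a plain scan rather than a k-d tree search.

```python
import math

def improving_moves(t1, tour, pos, pts):
    """Collect improving 2-opt moves whose first deleted edge is (t1, t2)."""
    n = len(tour)
    d = lambda a, b: math.dist(pts[a], pts[b])
    moves = []
    for step in (+1, -1):                    # t2 = successor, then predecessor
        t2 = tour[(pos[t1] + step) % n]
        d12 = d(t1, t2)
        for t3 in range(n):                  # plain scan stands in for a k-d tree
            if t3 in (t1, t2) or d(t2, t3) >= d12:
                continue                     # pruning: need d(t2,t3) < d(t1,t2)
            t4 = tour[(pos[t3] - step) % n]  # the unique t4 giving a legal move
            if t4 == t2:
                continue                     # (t2,t3) is already a tour edge
            gain = d12 + d(t3, t4) - d(t1, t4) - d(t2, t3)
            if gain > 1e-12:
                moves.append((gain, (t1, t2), (t4, t3)))
    return moves

# Self-crossing tour on the unit square: 0 -> 1 -> 2 -> 3 -> 0.
pts = [(0, 0), (1, 1), (1, 0), (0, 1)]
tour = [0, 1, 2, 3]
pos = [0, 1, 2, 3]                           # pos[c] = index of city c in tour
found = improving_moves(0, tour, pos, pts)
```

On this four-city example the only improving move found from t1 = 0 deletes the two diagonal edges (0,1) and (2,3).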
Which Improving Move to Make?
Best Possible: Time consuming, not necessarily best choice for the long
run.
Best of those for the current choice of t1: Still not necessarily best in
the long run, but significantly faster.
Best of the first 8 new champions for the current choice of t1: Still
faster.
First found for the current choice of t1: Even faster, but not necessarily
best (or fastest) in the long run.
            % Excess over Held-Karp Bound      Running Time in 150 MHz Seconds
Variant     10^3    10^4    10^5    10^6       10^3    10^4    10^5    10^6
Best         4.7     4.6     4.5     4.5       0.21     3.1    54.9    2285
8th          4.9     4.9     4.7     4.7       0.20     2.4    49.1    2344
First        6.1     6.0     5.8     5.7       0.17     2.2    47.5    2754
[Jon Bentley’s Geometric Code]
Don’t-Look-Bits
• One bit associated with each city, initially 0.
• If one fails to find an improving move for a given choice of t1, we set the don’t-look-bit for t1 to 1.
• If we find an improving move, we set the don’t-look-bits for t1, t2, t3, and t4 all to 0.
• If a given city’s don’t-look-bit is 1, we do not consider it for t1.
Costs perhaps 0.1% in tour quality, factor of 2 or greater speedup.
Enables processing in “queue” order:
• Initially all cities are in the queue.
• When a city has its don’t-look-bit set to 1, it is removed from the queue.
• When a city not in the queue has its don’t-look-bit set to 0, it is added to the end of the queue.
• For the next city to try as t1, we pop off the element at the head of the queue.
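The queue discipline can be sketched as a small first-improvement 2-opt in Python. This is an illustrative sketch under assumptions, not the code measured in the lecture: cities are integers 0..N-1, the tour is kept in a plain array, t3 candidates come from a full scan, and the names `two_opt_dlb` and `tour_length` are hypothetical.

```python
import math
from collections import deque

def tour_length(pts, tour):
    return sum(math.dist(pts[tour[i - 1]], pts[tour[i]]) for i in range(len(tour)))

def two_opt_dlb(pts, start):
    """First-improvement 2-opt with don't-look bits, processed in queue order."""
    n = len(start)
    tour = list(start)
    pos = [0] * n
    for i, c in enumerate(tour):
        pos[c] = i
    d = lambda a, b: math.dist(pts[a], pts[b])

    def find_and_apply(t1):
        for step in (+1, -1):
            t2 = tour[(pos[t1] + step) % n]
            d12 = d(t1, t2)
            for t3 in range(n):
                if t3 in (t1, t2) or d(t2, t3) >= d12:
                    continue
                t4 = tour[(pos[t3] - step) % n]
                if t4 == t2 or d(t1, t4) + d(t2, t3) >= d12 + d(t3, t4) - 1e-12:
                    continue
                # The deleted edges leave array slots i and j; reverse what lies between.
                i, j = sorted((pos[t1], pos[t4]) if step == 1 else (pos[t2], pos[t3]))
                lo, hi = i + 1, j
                while lo < hi:
                    tour[lo], tour[hi] = tour[hi], tour[lo]
                    pos[tour[lo]], pos[tour[hi]] = lo, hi
                    lo, hi = lo + 1, hi - 1
                return (t1, t2, t3, t4)
        return None

    dont_look = [False] * n
    queue = deque(tour)                      # initially all cities are queued
    while queue:
        t1 = queue.popleft()
        dont_look[t1] = True                 # tentatively: no move found for t1
        moved = find_and_apply(t1)
        if moved:
            for c in moved:                  # reset bits for t1..t4, requeue them
                if dont_look[c]:
                    dont_look[c] = False
                    queue.append(c)
    return tour
```

In this sketch a city is in the queue exactly when its don't-look bit is 0, so the bit array doubles as the queue-membership test.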
Tour Representations
Must maintain a consistent ordering of the tour so that
the following operations can be correctly performed.
1.
Next(a) and Prev(a): Return the successor/predecessor of city a in
the current ordering of the tour.
2.
Between(a,b,c): Report whether, if one starts at city a and proceeds
forward in the current tour order, one will encounter b before c. (This
will be needed for 3-opt.)
3.
Flip(a,b,c,d): If b = Next(a) and c = Next(d), update the tour to reflect
the 2-opt move in which the tour edges (a,b) and (c,d) are replaced by
(b,c) and (a,d). Otherwise, report “Invalid Move”.
See [Fredman, Johnson, McGeoch, & Ostheimer, “Data structures for
traveling salesmen,” J. Algorithms 18 (1995), 432-479].
Array Representation
Tour: array of city indices, listing the cities in tour order
  a b c d e f g h i j k l m n o p q r s t u v w x y z
City: array of tour indices, giving each city's position in Tour
Next(c_i) = Tour[(City[i] + 1) mod N]
Prev(c_i) = Tour[(City[i] - 1) mod N] (analogous)
Between(c_i, c_j, c_k): (Straightforward)
Array Representation: Flip
Tour:
  a b c d e f g h i j k l m n o p q r s t u v w x y z
After Flip(f,g,p,q):
  a b c d e f p o n m l k j i h g q r s t u v w x y z
After Flip(x,y,c,d):
  a z y d e f p o n m l k j i h g q r s t u v w x c b
Array Representation: Costs
• Next, Prev: O(1)
• Between: Θ(N)
• Flip: Θ(N)
Speed-up trick: If the segment to be flipped is longer than N/2, flip its complement.
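A minimal Python sketch of the array representation, including the flip-the-shorter-side trick, might look as follows. The class name `ArrayTour` and the argument convention for `flip` are assumptions made for illustration: here `flip(a, b, c, d)` expects b = Next(a) and d = Next(c), as in the worked example Flip(f,g,p,q), and replaces the edges (a,b) and (c,d) by (a,c) and (b,d).

```python
class ArrayTour:
    """Tour stored as an array of cities plus each city's array position."""

    def __init__(self, cities):
        self.tour = list(cities)
        self.city = {c: i for i, c in enumerate(self.tour)}

    def next(self, c):
        return self.tour[(self.city[c] + 1) % len(self.tour)]

    def prev(self, c):
        return self.tour[(self.city[c] - 1) % len(self.tour)]

    def _reverse(self, i, length):
        """Reverse `length` consecutive slots starting at slot i, wrapping around."""
        n = len(self.tour)
        j = i + length - 1
        for _ in range(length // 2):
            a, b = i % n, j % n
            self.tour[a], self.tour[b] = self.tour[b], self.tour[a]
            self.city[self.tour[a]] = a
            self.city[self.tour[b]] = b
            i, j = i + 1, j - 1

    def flip(self, a, b, c, d):
        if self.next(a) != b or self.next(c) != d:
            raise ValueError("Invalid Move")
        n = len(self.tour)
        inner = (self.city[c] - self.city[b]) % n + 1   # length of path b..c
        if inner <= n - inner:
            self._reverse(self.city[b], inner)          # reverse b..c
        else:
            self._reverse(self.city[d], n - inner)      # reverse complement d..a

t = ArrayTour("abcdefghijklmnopqrstuvwxyz")
t.flip("f", "g", "p", "q")
after_first = "".join(t.tour)
t.flip("x", "y", "c", "d")
after_second = "".join(t.tour)
```

Reversing the complement yields the same cyclic tour traversed in the opposite direction, which is why either side may be flipped.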
Problem for Arrays
• For random Euclidean instances, 2-opt performs Θ(N) moves and, even if we always flip the shorter segment, the average length of the segment being flipped grows roughly as Θ(N^0.7) [Bentley, 1992].
• Doubly-linked lists suffer from the same problems. Can we do better with other tour representations?
• We can in fact do much better (theoretically).
• By representing the tour using a balanced binary tree, we can reduce the (amortized) time for Between and Flip to Θ(log N) per operation, although the times for Next and Prev increase from constant to that amount. “Splay Trees” are especially useful in this context (and will be described in the next few slides).
• Significant further improvements are unlikely, however:
• Theorem [Fredman et al., 1995]. In the cell-probe model of computation, any tour representation must, in the worst case, take amortized time Ω(log N / log log N) per operation.
Binary Tree Representation
• Cities are contained in a binary tree, with a bit at each internal node to tell whether the subtree rooted at that node should be reversed. (Bits lower down in the tree will locally undo the effect of bits at their ancestors.)
• To determine the tour represented by such a tree, simply push the reversal bits down the tree until they all disappear. An inorder traversal of the tree will then yield the tour.
• (To push a reversal bit at node x down one level, interchange the two children of x, complement their reversal bits, and turn off the reversal bit at x.)
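The push-down rule can be sketched in a few lines of Python. The `Node` class and function names are hypothetical, and for simplicity every node here stores a city and carries a reversal bit.

```python
class Node:
    def __init__(self, city, left=None, right=None):
        self.city, self.left, self.right = city, left, right
        self.rev = False                      # reversal bit for this subtree

def push_down(x):
    """Push the reversal bit at x down one level."""
    if x.rev:
        x.left, x.right = x.right, x.left     # interchange the two children
        for child in (x.left, x.right):
            if child is not None:
                child.rev = not child.rev     # complement their reversal bits
        x.rev = False                         # turn off the bit at x

def tour_order(x, out=None):
    """Clear reversal bits top-down, then read the tour off in inorder."""
    out = [] if out is None else out
    if x is not None:
        push_down(x)
        tour_order(x.left, out)
        out.append(x.city)
        tour_order(x.right, out)
    return out

# Tree whose plain inorder traversal is a b c d e f g.
b = Node("b", Node("a"), Node("c"))
f = Node("f", Node("e"), Node("g"))
root = Node("d", b, f)
root.rev = True        # reverse the whole tour ...
b.rev = True           # ... except that this bit locally undoes it for a b c
order = "".join(tour_order(root))
```

Here the bit at b cancels the bit at the root for the subtree a b c, so only the remaining part of the tour is read in reverse.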
Splay Trees
[Sleator & Tarjan, “Self-adjusting binary search trees,” J. ACM 32
(1985), 652-686]
• Every time a vertex is accessed, it is brought to the root (splayed) by a sequence of rotations (local alterations of the tree that preserve the inorder traversal).
• Each rotation causes the vertex that is accessed to move upward in the tree, until eventually it reaches the root.
• The precise operation of a rotation depends on whether the vertex is the right or left child of its parent and whether the parent is the right or left child of its own parent. The change does not depend on any global properties of the subtrees involved, such as depth, etc.
• All the standard binary tree operations can be implemented to run in amortized worst-case time O(log N) using splays.
• In our Splay Tree tour representation, the process of splaying is made slightly more difficult by the reversal bits. We handle these by preceding each rotation by a step that pushes the reversal bits down out of the affected area. Neither the presence of the reversal bits nor the time needed to clear them affects the amortized time bound for splaying by more than a constant factor.
Splay Tree Tour Operations
Next(a):
1. Splay a to the root of the tree.
2. Traverse down the tree (taking account of reversal bits) to find the successor of a.
3. Splay the successor to the root.
Prev(a): Handled analogously.
Between(a,b,c):
1. Splay b to the root, then a, then c. Note that [Sleator & Tarjan, 1985] shows that no rotation for a vertex x causes any vertex to increase its depth by more than 2. Thus, after these splays, c is the root (level 1), a is no deeper than level 3, and b is no deeper than level 5. They also show that if a is at level 3, then it is either the left child of a left child or the right child of a right child.
2. Clear all the reversal bits from the top 5 levels of the tree.
3. Traverse upward from b in its new position in the tree.
4. The answer is yes if
– we reach a first and arrive from the right, or
– we reach c first and arrive from the left.
Otherwise, it is no.
Splay Tree Flip(a,b,c,d)
• Splay d to the root, then splay b to the root, and push all reversal bits down out of the top three levels.
• There are four possibilities (TiR represents the subtree with the reversal bit at its root complemented):
[Figure: the four possible arrangements of b and d near the root (with x an intermediate vertex) and the resulting trees. Depending on the case, the operation reverses the path from b to d or the path from d to b.]
Speedups
(Lose theoretical guarantees for better performance in practice)
• No splays for Next and Prev – simply do tree traversals, taking
into account the reversal bits.
• No need to splay b in the Between operation. Instead simply
splay a and c, and then traverse up from b until either a or c is
encountered (as before).
• Operation of Flip unchanged.
• Yields about a 30% speedup.
Advantages of Splay Trees
• Ease of implementing Flip compared to other balanced binary
tree implementations.
• “Self-Organizing” properties: Cities most involved in the action
stay relatively close to the root. And since typically most cities
drop out of the action (get their don’t-look-bits set to 1
permanently) fairly early, this can significantly reduce the time
per operation.
• Splay trees start beating arrays for random Euclidean instances
on modern computers somewhere between N = 100,000 and N =
316,000. They are 40% faster when N = 1,000,000.
• For more sophisticated algorithms, like Lin-Kernighan (to be
discussed later), the transition point is much earlier: Splay
trees are 13 times faster when N = 100,000.
Beating Splay Trees in Practice:
The Two-Level-Tree
Approximately √N segments of length √N each
Splay Trees versus Two-Level Trees
• Two-Level Trees 2-3 times faster for N = 10,000 (not counting
preprocessing time), declining to 11% at N = 1,000,000.
• But does this matter?
• In 1995, the time for N = 100,000 was 3 minutes versus 5 (Lin-Kernighan).
• Today it is 2.1 seconds versus 3.8.
• What is this “preprocessing”?
• We switched implementations in order to be able to compare
tour representations – See next slide.
The Neighbor-List Implementation
• Can handle non-geometric instances.
– TSP in graphs
– X-ray crystallography
– Video compression
– Converted versions of asymmetric TSP instances
• Can exploit geometry when it is present.
• Because of the trade-offs it makes, it may be 0.4% worse for 2-opt than Bentley’s purely geometric implementation, but it will be substantially faster for sophisticated algorithms like Lin-Kernighan, which otherwise would perform large numbers of fixed-radius searches.
The Neighbor-List Implementation
• Basic idea: Precompute, for each city, a list of the k closest other cities, ordered by increasing distance, and store the corresponding distances.
• If we set k = N, we should find tours as good as Bentley’s geometric code, but would take Θ(N^2 log N) preprocessing time and Θ(N^2) space.
• Tradeoff: Take much smaller k (default is k = 20).
• For geometric instances, with a k-d tree constructed, we can compute the list for a given city in time “typically” O(log N + k log k).
• No longer need to do a fixed-radius search for t3 candidates. Merely examine cities on the list for city t2 in order until a city x with d(t2,x) > d(t1,t2) is reached.
• As soon as we find an improving move for a given t1, we perform it and go on to the next choice for t1 (first choice of an improving move rather than best, although given our ordering of t3 candidates, it should tend to be better than a random improving move).
• Requires Θ(kN) space, but this is not a problem on modern computers.
• Also allows variants on the make-up of the neighbor-list that might be useful for non-uniform geometric instances.
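As a rough illustration of the neighbor-list idea, the sketch below builds the lists by brute force (the Θ(N^2 log N) route, not the k-d tree route mentioned above) and then scans t2's list for t3 candidates. The function names are hypothetical, and the scan stops at the first city x with d(t2,x) ≥ d(t1,t2), matching the strict inequality used in the pruning rule.

```python
import math

def neighbor_lists(pts, k):
    """For each city, the k closest other cities as (distance, city), nearest first."""
    lists = []
    for i, p in enumerate(pts):
        others = sorted((math.dist(p, q), j) for j, q in enumerate(pts) if j != i)
        lists.append(others[:k])
    return lists

def t3_candidates(t1, t2, pts, lists):
    """Yield cities x on t2's list with d(t2, x) < d(t1, t2), in list order."""
    d12 = math.dist(pts[t1], pts[t2])
    for dist, x in lists[t2]:
        if dist >= d12:
            break                       # every later entry is at least this far
        yield x

pts = [(0, 0), (1, 0), (3, 0), (6, 0), (10, 0)]
lists = neighbor_lists(pts, k=2)
wide = list(t3_candidates(3, 1, pts, lists))     # d(t1,t2) = 5
narrow = list(t3_candidates(2, 1, pts, lists))   # d(t1,t2) = 2
```

Because the stored distances come with the list, the scan needs no distance computations beyond d(t1,t2) itself.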
Problem with Non-Uniform Geometric
Instances
Even if k = 80, the nearest neighbor graph (with an edge between two
cities if either is on the other’s nearest neighbor list) is not connected.
Quad Neighbors
k = 16
• Pick k/4 nearest neighbors in each quadrant centered at our city c.
• If any quadrants have a shortfall, bring the total to k by adding the nearest remaining unselected cities, irrespective of quadrant.
• This guarantees that the graph of nearest neighbors will be connected.
• For N = 10,000 clustered instances, yielded a 1-3% improvement in tours under 2-opt, with no running time penalty (and no tour penalty for uniform data).
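The quadrant rule can be sketched as follows. This is a brute-force illustration with a hypothetical function name; how points lying exactly on a quadrant boundary are assigned is an arbitrary choice made here, not something the slides specify.

```python
import math

def quad_neighbors(c, pts, k):
    """Quadrant-balanced neighbor list for city c, nearest first."""
    cx, cy = pts[c]
    quads = [[], [], [], []]
    for j, (x, y) in enumerate(pts):
        if j == c:
            continue
        q = (0 if x >= cx else 1) + (0 if y >= cy else 2)   # boundary rule: assumption
        quads[q].append((math.dist(pts[c], (x, y)), j))
    chosen, rest = [], []
    for q in quads:
        q.sort()
        chosen += q[:k // 4]            # up to k/4 nearest in each quadrant
        rest += q[k // 4:]
    rest.sort()
    chosen += rest[:k - len(chosen)]    # make up any shortfall, ignoring quadrants
    return [j for _, j in sorted(chosen)]

# City 0 sits beside a diagonal cluster, with one far city in each other quadrant.
pts = [(0, 0)] + [(i, i) for i in range(1, 8)] + [(-50, 50), (-50, -50), (50, -50)]
quad4 = quad_neighbors(0, pts, 4)
quad8 = quad_neighbors(0, pts, 8)
```

With k = 4, a plain nearest-neighbor list for city 0 would contain only cluster cities, whereas the quadrant rule keeps one link to each far city.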
One More Thing… Starting Tours
N = 10,000 [Bentley, 1992]
                        % Excess    2-opt %    Start    2-opt    Total
Starting Tour           over HK     excess     Secs     Secs     Secs
Farthest Insertion        13.0       11.9        76       89      165
Farthest Addition+        13.2       11.8        38       52       90
Random Insertion          14.8       12.3        57       72      129
Random Addition           15.2       11.8        16       31       47
Approx. Christofides      14.9        6.7        24       40       64
Greedy                    15.7        5.8        14       30       44
Nearest Neighbor          24.2        8.7         4       27       31
Similar results for Savings under the neighbor-list implementation:
Savings % Excess over HK: 11.8, 2-Opt % Excess with Savings Start: 8.6
Explanation?
1000 runs on a fixed 1000-city instance using randomized versions of Greedy and
Savings. X-axis is % excess for starting tour. Y-axis is % excess after 2-opting.
Estimating Running-Time Growth Rate for 2-Opt
(Neighbor List Implementation)
[Figure: three panels plotting Microseconds/N, Microseconds/(N log N), and Microseconds/N^1.25.]
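The three normalizations can be reproduced numerically. The sketch below applies them to the 2-Opt [20] running times listed in the Results table of this lecture (150 MHz seconds); whether the plotted figure used exactly these data points is not known, and the base-2 logarithm is an assumption.

```python
import math

# 2-Opt [20] running times in 150 MHz seconds, from the Results table.
times = {10**3: 0.32, 10**4: 3.8, 10**5: 56.7, 10**6: 928.0}

def normalized(n, secs):
    """Return (us/N, us/(N log2 N), us/N^1.25) for one measurement."""
    us = secs * 1e6
    return us / n, us / (n * math.log2(n)), us / n**1.25

rows = {n: normalized(n, s) for n, s in times.items()}
for n, (per_n, per_nlogn, per_n125) in rows.items():
    print(f"N={n:>8}  us/N={per_n:7.1f}  "
          f"us/(N log2 N)={per_nlogn:6.1f}  us/N^1.25={per_n125:6.1f}")
```

A normalization that stays roughly flat as N grows suggests that the chosen function tracks the growth rate; one that rises underestimates it, and one that falls overestimates it.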
Beyond 2-Opt
• 3-Opt: Look for improving 3-opt moves, where three edges are deleted and we choose the cheapest way to reconnect the segments into a tour. [Includes 2-opt moves as a special case. Naïve implementation is O(N^3) to find an improving move or confirm that none exists.]
• 2.5-Opt [Bentley, 1992]: When doing a ball search about t2 to find a potential t3 with d(t2,t3) < d(t1,t2), also consider the following three other possible moves:
– Insert t3 in the middle of edge {t1,t2},
– Insert t1 in the middle of tour edge ending with t3, or
– Insert t1 in the middle of tour edge beginning with t3.
Note that these are degenerate 3-opt moves.
• Or-Opt [Or, 1976]: Special case of 3-opt in which the moves are restricted to simply deleting a chain of 1, 2, or 3 consecutive tour vertices and inserting it elsewhere in the tour, possibly in the reverse direction for chains of 3 vertices. (Time O(N^2) to find an improving move or confirm that none exists.)
But the next theorem suggests that 3-Opt need not take Ω(N^3) in practice.
Partial Sum Theorem
If a sequence $x_1, x_2, \ldots, x_k$ has a positive sum $S > 0$, then there is a cyclic permutation $\pi$ of these numbers, all of whose prefix sums are positive, that is, for all $j$, $1 \le j \le k$, it satisfies
$$\sum_{i=1}^{j} x_{\pi(i)} > 0.$$
Proof: Suppose that our original sequence does not satisfy this constraint. Let $M$ denote the largest value such that $\sum_{i=1}^{j} x_i = -M$ for some $j$, and $h$ be the largest $j$ such that this holds. We claim that the cyclic permutation that starts with $x_{h+1}$ is our desired permutation. By the maximality of $h$, we must have, for all $j$, $h < j \le k$, $\sum_{i=h+1}^{j} x_i > 0$. We also have $\sum_{i=h+1}^{k} x_i = M + S > M$. Since, by definition of $M$, we have $\sum_{i=1}^{j} x_i \ge -M$ for all $j$, $1 \le j \le h$, our chosen permutation will have all its prefix sums positive.
[Figure: prefix sums of the sequence over positions 1, …, h, …, k, lying between −M and +M; π(1) corresponds to position h+1 and π(k) to position h.]
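The proof is constructive, and the construction is short enough to sketch in Python. The function name is hypothetical; the sketch also counts the empty prefix (sum 0) when locating the minimum, which covers the case where the original sequence already satisfies the constraint.

```python
from itertools import accumulate

def positive_rotation(xs):
    """Cyclic rotation of xs (with sum > 0) whose prefix sums are all positive."""
    if sum(xs) <= 0:
        raise ValueError("the sequence must have a positive sum")
    prefix = [0] + list(accumulate(xs))[:-1]        # sums of the first 0..k-1 terms
    low = min(prefix)                               # this is -M in the proof
    h = max(j for j, p in enumerate(prefix) if p == low)
    return xs[h:] + xs[:h]                          # start at x_{h+1} (1-based)

xs = [3, -5, 2, -1, 4]                              # S = 3
rot = positive_rotation(xs)
prefix_sums = list(accumulate(rot))
```

For this example the minimum prefix sum is −2, reached after two terms, so the rotation starts at the third term.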
(G* will be the value of the best move found so far.)
For each t1 in our neighbor-list implementation, we perform the first improving move found unless it is a 2-opt move, in which case we take the first extension found to a better 3-opt move, and if none is found, perform the 2-opt move.
Topological Issues
The choices for t5 are circled, for the cases where t4 precedes t3 (left) or follows t3 (right). [Note: One choice of t6 in the left case, two choices in the right.]
A Between(a,b,c) operation is needed in the second case to tell us which t5’s are valid. (Omitting this case costs about 0.2% in tour quality.)
If G* > 0, this is more restrictive than the Theorem allows, but we’ve already found an improving move for this t1 and so can afford to be aggressive; this is a speed-up trick from [Lin & Kernighan, 1973].
In neighbor-list implementation, perform move and go to next t1.
In neighbor-list implementation, if G* > 0, the current choices of t2, t3, t4 must represent an improving 2-opt move. Perform it and go to next t1.
Results
• Tour quality for Neighbor-List 3-opt with k = 20 is equivalent to
that for Bentley’s geometric 3-opt (as opposed to 0.4% behind
for 2-opt).
• Neighbor List Results (2-Level Tree Tour Representation):
N =                      10^3     10^4     10^5     10^6
2-Opt [20] % Excess       4.9      5.0      4.9      4.9
  150 MHz Secs*          0.32      3.8     56.7      928
3-Opt [20] % Excess       3.1      3.0      3.0      3.0
  150 MHz Secs*           3.8      4.6     66.1     1054
*Roughly half of the time is spent generating neighbor lists and starting tour.
Time on 3.06 GHz Intel Core i3 processor at N = 10^6:
25.4 sec (2-opt), 29.5 sec (3-opt)
Next Up
• 4-Opt
• Lin-Kernighan
• and beyond….