Lecture 01, 21 January 2014

The Traveling Salesman Problem
in Theory & Practice
Lecture 1
21 January 2014
David S. Johnson
dstiflerj@gmail.com
http://davidsjohnson.net
Seeley Mudd 523, Tuesdays and Fridays
Today’s Outline
1. Requirements, References, & Introductions
2. Problem Definition
3. Applications
4. Paths and Cycles
5. Complexity
6. Introduction to Optimization
7. Introduction to Approximation
8. Preview of the Rest of the Course
Requirements and Grading
• Class presentation of results from the
literature.
• Written paper:
– Survey paper on an approved topic
– Report on your own new experimental work
– Theoretical paper on new results of your own
• Regular class participation.
About Me
• Ph.D. in Mathematics from MIT (1973). Thesis: Near-Optimal Bin Packing Algorithms.
• 40 years at AT&T (Bell Labs, AT&T Labs – Research),
with one year off for good behavior (U. Wisconsin,
1980-81).
• Most famous publication: Computers and
Intractability: A Guide to the Theory of NP-Completeness (1979, with Mike Garey).
• Many theoretical and experimental papers on the TSP
with many co-authors, starting with the proof that the
Euclidean version is NP-Hard.
Optional Reference Books
The Traveling Salesman Problem, Lawler, Lenstra,
Rinnooy Kan, and Shmoys (Editors), Wiley (1985).
$377.47 (current amazon.com price, new)
The Traveling Salesman Problem and Its Variations,
Gutin and Punnen (Editors), Kluwer (2002). $152.10
The Traveling Salesman Problem: A Computational
Study, Applegate, Bixby, Chvatal, and Cook, Princeton
University Press (2006). $57.99/$44.99 (Kindle)
In Pursuit of the Traveling Salesman, Cook, Princeton
University Press (2012). $20.64/$15.37 (Kindle)
Web Resources
• http://www.math.uwaterloo.ca/tsp/
“The Traveling Salesman Problem” (Bill Cook)
• http://dimacs.rutgers.edu/Challenges/TSP/
“The 8th DIMACS Implementation Challenge: The
Traveling Salesman Problem” (DSJ)
• http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
“TSPLIB” (Testbed of Instances, Gerd Reinelt)
• http://davidsjohnson.net/papers.html (DSJ’s
downloadable papers on the TSP and other topics)
• http://en.wikipedia.org/wiki/Travelling_salesman_problem
(Wikipedia Entry: Much Improved)
The Traveling Salesman Problem
Given:
Set of cities {c1,c2,…,cN }.
For each pair of cities {ci,cj}, a distance d(ci,cj).
Find:
Permutation $\pi : \{1,2,\ldots,N\} \to \{1,2,\ldots,N\}$ that minimizes
$$\sum_{i=1}^{N-1} d(c_{\pi(i)}, c_{\pi(i+1)}) + d(c_{\pi(N)}, c_{\pi(1)})$$
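The objective in this definition can be sketched in a few lines of Python. This is only an illustration: cities are numbered from 0 rather than 1, `d` is assumed to be a square list-of-lists distance matrix, and the names `tour_length` and `brute_force_tsp` are made up for this sketch. The brute-force search is only usable for very small N.

```python
from itertools import permutations

def tour_length(d, pi):
    """Sum of d over consecutive cities in pi, plus the closing edge back to pi[0]."""
    n = len(pi)
    return sum(d[pi[i]][pi[i + 1]] for i in range(n - 1)) + d[pi[n - 1]][pi[0]]

def brute_force_tsp(d):
    """Try every permutation that starts at city 0 and keep the shortest tour."""
    n = len(d)
    candidates = ((0,) + p for p in permutations(range(1, n)))
    return min(candidates, key=lambda pi: tour_length(d, pi))

# Hypothetical 4-city instance (symmetric distances).
d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
best = brute_force_tsp(d)
```

Fixing city 0 as the starting point loses nothing, since a tour is a cycle and can be read from any of its cities.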
Alternative Definition
Given:
Graph G = (V,E)
Length d(e) for each edge e in E.
Find:
Minimum length Hamiltonian Circuit in the
complete graph G’ on V, where if {u,v} is not in
E, we assume d(e) = ∞.
[Figures: example instances with N = 10, 100, 1000, and 10000 cities. Photo: Jan Karel Lenstra.]
Planar Euclidean Application #1
• Cities:
– Holes to be drilled in printed circuit boards
N = 10000
N = 2392
Planar Euclidean Application #2
• Cities:
– Wires to be cut in a “Laser Logic”
programmable circuit
N = 7397
N = 33,810
N = 85,900
Other Types of Instances
• X-ray crystallography
– Cities: orientations of a crystal
– Distances: time for motors to rotate the crystal
from one orientation to the other
• High-definition video compression
– Cities: binary vectors of length 64 identifying the
summands for a particular function
– Distances: Hamming distance (the number of terms
that need to be added/subtracted to get the next
sum)
Data Storage Layout
Goal: For each row, have as many consecutive entries as possible
(minimizes the number of random accesses)
Asymmetric Applications
• Payphone Money Collection with One-Way Streets
• Stacker-Crane
• No-Wait Flowshop
• Disk Scheduling
• Compiling to Minimize Branching Cost
• Minimum Length Common Superstring
The Stacker Crane Problem
No-Wait Flowshop
Job:
Task on Processor 1
Task on Processor 2
Schedule:
Processor 1
Processor 2
No-Wait Flowshop
[Figure: example no-wait flowshop schedule with numeric task lengths]
Disk Scheduling
Cities: locations of the fragments of a file one wants to retrieve.
Distance between two fragments = time it takes to move the read
head from the end of one to the beginning of the next, taking into
account the spinning of the disk.
Compiling to Minimize Branching Cost
[Figure: code segment A ending in a branch, which goes to segment B with probability $P_B$ and to segment C with probability $P_C$]
In execution, the delay at the end of a segment is much smaller if the
next instruction to be executed is the next one in the code: say a delay
of 1 versus a delay of k.
Based on profiling, one can determine the empirical probability that
each branch is taken.
Following A directly by B causes an expected delay of $P_B + kP_C$.
Following A directly by C causes an expected delay of $P_C + kP_B$.
Following A directly by anything else causes an expected delay of k.
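The three cases above can be written as one small function. This is a sketch only: the function name, the string labels for the successor, and the sample values `p_b = 0.9`, `p_c = 0.1`, `k = 10` are all assumptions chosen for illustration, not data from the lecture.

```python
def expected_delay(successor, p_b, p_c, k):
    """Expected delay after segment A, whose branch goes to B with
    probability p_b and to C with probability p_c (p_b + p_c == 1).
    A fall-through costs 1, a jump costs k."""
    if successor == "B":
        return p_b * 1 + p_c * k
    if successor == "C":
        return p_c * 1 + p_b * k
    return k  # neither branch target follows A, so every execution jumps

# Hypothetical profile: the branch goes to B 90% of the time.
delays = {s: expected_delay(s, p_b=0.9, p_c=0.1, k=10) for s in ("B", "C", "other")}
```

These expected delays are the city-to-city distances of the resulting asymmetric TSP: the distance from A to B need not equal the distance from B to A.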
Shortest Superstring
• Given: Finite set S of strings over some
alphabet.
• Find: Shortest string that contains all
strings in S as substrings.
• Cities: Strings in S.
• Distances: d(x,y) = |y| - maximum overlap
between a suffix of x and a prefix of y.
x = “alphabet”, y = “betrayal”
d(x,y) = 5 (overlap “bet”, 8 - 3 = 5):
    alphabet
         betrayal
d(y,x) = 6 (overlap “al”, 8 - 2 = 6):
    betrayal
          alphabet
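The distance function for this reduction can be sketched as follows. The function names are made up for this illustration, and the quadratic-time overlap search is the simplest possible choice, not an efficient one.

```python
def max_overlap(x, y):
    """Length of the longest suffix of x that is also a prefix of y."""
    for k in range(min(len(x), len(y)), 0, -1):
        if x.endswith(y[:k]):
            return k
    return 0

def superstring_distance(x, y):
    """d(x, y) = |y| minus the maximum suffix/prefix overlap."""
    return len(y) - max_overlap(x, y)

d_xy = superstring_distance("alphabet", "betrayal")
d_yx = superstring_distance("betrayal", "alphabet")
```

The two directions give different values, which is why this application is an asymmetric TSP.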
Hamiltonian Path versus Cycle
• Four variants (both for symmetric and asymmetric TSP):
– Cycle
– Path between fixed endpoints
– Path with fixed starting vertex
– Path with unconstrained endpoints
• A code for any one can be adapted to
handle any of the others.
Path with Fixed Endpoints:
Cycle via Path
t
s
Call Path algorithm once for s and each vertex t in V-{s}.
Return result with best value of Path Length + dist(t,s)
Path with Fixed Endpoints:
Path via Cycle
t
s
Add one new vertex and two new edges. Compute
shortest cycle, then delete the added vertex and edges
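A minimal sketch of this reduction for the symmetric case follows. Several things here are assumptions made for illustration: the two new edges are given length 0, every other edge at the new vertex is treated as missing (length ∞), and the brute-force `shortest_cycle` is only a stand-in for whatever cycle code is actually available.

```python
from itertools import permutations

INF = float("inf")

def shortest_cycle(d):
    """Stand-in cycle solver: brute force over tours starting at vertex 0."""
    n = len(d)
    best_len, best = INF, None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(d[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best = length, tour
    return best_len, list(best)

def shortest_path_via_cycle(d, s, t):
    """Shortest Hamiltonian path from s to t, using only the cycle solver."""
    n = len(d)
    # New vertex n is joined to s and t only; its other edges are "missing".
    ext = [row[:] + [INF] for row in d] + [[INF] * (n + 1)]
    for v in (s, t):
        ext[v][n] = ext[n][v] = 0
    length, tour = shortest_cycle(ext)
    k = tour.index(n)
    path = tour[k + 1:] + tour[:k]  # cut the cycle open at the added vertex
    if path[0] != s:
        path.reverse()
    return length, path

d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
result_a = shortest_path_via_cycle(d, 0, 3)
result_b = shortest_path_via_cycle(d, 1, 2)
```

Any finite-length cycle must use both new edges, so deleting the new vertex leaves a Hamiltonian path between s and t of the same length.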
Path with One Fixed Endpoint via
Path with Two Fixed Endpoints
s
For each t in V – {s}, find shortest Hamiltonian path
from s to t. Return the best.
Path with Two Fixed Endpoints via
Path with One Fixed Endpoint
t
s
t’
Add one new vertex t’ with an edge to t. The shortest
Hamiltonian path starting with s must end at t’.
Path with No Fixed Endpoints via
Path with One Fixed Endpoint
For each s in V, find shortest Hamiltonian path starting
from s. Return the best.
Path with One Fixed Endpoint via
Path with No Fixed Endpoint
s’
s
Add new vertex s’ and an edge from s’ to s.
Directed via Undirected
Replace each vertex $v_i$ by a triplet of vertices $v_i^{in}$, $v_i$,
$v_i^{out}$, and edges $\{v_i^{in}, v_i\}$ and $\{v_i, v_i^{out}\}$.
[Figure: vertices $v_1, v_2, v_3, \ldots, v_N$, each expanded into the path $v_i^{in}$, $v_i$, $v_i^{out}$]
Replace each directed edge $(v_i, v_j)$ by the undirected
edge $\{v_i^{out}, v_j^{in}\}$.
[Figure: four-vertex example of the transformed graph]
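The construction can be sketched as a function that builds the undirected edge set. The vertex labels `(i, "in")`, `(i, "mid")`, `(i, "out")` and the choice of length 0 for the two internal edges are assumptions of this sketch; the slide specifies only which edges exist.

```python
def directed_to_undirected(n, arcs):
    """arcs maps a directed pair (i, j) to its length; vertices are 0..n-1.
    Returns a dict from undirected edges (frozensets) to lengths on 3n vertices."""
    edges = {}
    for i in range(n):
        # Internal edges: assumed length 0 so tour lengths are preserved.
        edges[frozenset({(i, "in"), (i, "mid")})] = 0
        edges[frozenset({(i, "mid"), (i, "out")})] = 0
    for (i, j), length in arcs.items():
        edges[frozenset({(i, "out"), (j, "in")})] = length
    return edges

# Hypothetical 3-vertex directed instance.
arcs = {(0, 1): 5, (1, 2): 7, (2, 0): 4, (1, 0): 9}
edges = directed_to_undirected(3, arcs)
```

Each middle vertex has degree 2, so any Hamiltonian cycle must pass through in, mid, out consecutively at every original vertex. That forces the cycle to follow the original edges in a consistent direction.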
TSP: The Canonical NP-Hard Problem?
• Commonly used in the popular press to explain NP-completeness and exponential time to the layman:
The number of tours grows as N! (actually (N-1)!/2
for symmetric case):
# Tours = (N-1)!/2:
 N   # Tours      N   # Tours
 3   1           12   19,958,400
 4   3           13   239,500,800
 5   12          14   3,113,510,400
 6   60          15   43,589,145,600
 7   360         16   653,837,184,000
 8   2,520       17   10,461,394,944,000
 9   20,160      18   177,843,714,048,000
10   181,440     19   3,201,186,852,864,000
11   1,814,400   20   60,822,550,204,416,000
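The counts for the symmetric case follow directly from the formula (N-1)!/2 and can be recomputed with a short script. The function name is made up for this sketch.

```python
from math import factorial

def symmetric_tour_count(n):
    """Number of distinct tours on n >= 3 cities when neither the
    starting city nor the direction of travel matters."""
    return factorial(n - 1) // 2

for n in range(3, 21):
    print(f"{n:>2}  {symmetric_tour_count(n):,}")
```

The division by 2 accounts for the two directions in which the same cycle can be traversed, and using (N-1)! rather than N! accounts for the N possible starting cities.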
$N! = 2^{\Omega(N \log N)}$ time is not required
$O(N^2 2^N)$ suffices!
[Bellman, 1963][Held & Karp, 1962]
Algorithmic technique: Dynamic Programming
States: Pairs $[U,j]$ with $2 \le j \le N$ and $\{v_1,v_j\} \subseteq U \subseteq V$.
Note: There are $\Theta(N 2^N)$ states $[U,j]$.
Values: $X[U,j]$ is the length of the shortest Hamiltonian
path, starting with $v_1$ and ending with $v_j$, in the subgraph of
$G$ induced by $U$.
Note: The optimal tour length equals
$\min \{X[V,j] + d(v_j,v_1) : 2 \le j \le N\}$.
Computing the Values X[U,j]
$X[\{v_1,v_j\},j] = d(v_1,v_j)$, $2 \le j \le N$.
Now assume we have already computed $X[U,j]$, $2 \le j \le N$,
for all $U$, $\{v_1,v_j\} \subseteq U \subseteq V$, with $|U| = k$.
Let $W$ be such that $v_1 \in W \subseteq V$ and $|W| = k+1$. Suppose
$v_i$, $i > 1$, is in $W$. Then
$X[W,i] = \min \{X[W - \{v_i\},j] + d(v_j,v_i) : v_j \in W - \{v_1,v_i\}\}$
Computation takes $O(N)$ time for each state $[W,i]$.
Since there are $\Theta(N 2^N)$ states overall, this yields an
overall running time of $O(N^2 2^N)$.
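The dynamic program above can be sketched in Python as follows. This is an illustration under some assumptions: city 0 plays the role of v1, subsets U are encoded as bitmasks, the table is a plain dictionary, and `held_karp` is a name chosen for this sketch. It works for asymmetric distance matrices as well.

```python
from itertools import combinations

def held_karp(d):
    """Optimal tour length for the distance matrix d by dynamic programming."""
    n = len(d)
    if n == 1:
        return 0
    # X[(U, j)]: shortest path from city 0 to city j visiting exactly the set U.
    X = {(1 | (1 << j), j): d[0][j] for j in range(1, n)}
    for size in range(3, n + 1):
        for rest in combinations(range(1, n), size - 1):
            W = 1  # bitmask of W, which always contains city 0
            for v in rest:
                W |= 1 << v
            for i in rest:
                prev = W & ~(1 << i)  # W with city i removed
                X[(W, i)] = min(X[(prev, j)] + d[j][i] for j in rest if j != i)
    full = (1 << n) - 1
    return min(X[(full, j)] + d[j][0] for j in range(1, n))

d_sym = [[0, 2, 9, 10],
         [2, 0, 6, 4],
         [9, 6, 0, 3],
         [10, 4, 3, 0]]
d_asym = [[0, 1, 10],
          [10, 0, 1],
          [1, 10, 0]]
```

The dictionary holds one entry per state, so memory use also grows as N·2^N, which in practice limits this approach before running time does.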
Current World Record (2006)
Using a parallelized version of the
Concorde code, Helsgaun’s
sophisticated variant on Iterated
Lin-Kernighan, and 2719.5 cpu-days
N = 85,900
Concorde
• “Branch-and-Cut” approach exploiting linear programming
to determine lower bounds on optimal tour length.
• Based on 30+ years of theoretical developments in the
“Mathematical Programming” community, plus some very
good data structures and heuristics work from computer
science.
• For surprisingly large instances, it finds an optimal tour
and proves its optimality (unless it runs out of
time/space).
• Executables and source code can be downloaded from
http://www.tsp.gatech.edu/
Running times (in seconds)
for 10,000 Concorde runs
on random 1000-city planar
Euclidean instances (2.66
GHz Intel Xeon processor
in dual-processor PC,
purchased late 2002).
Range: 7.1 seconds
to 38.3 hours
Concorde Asymptotics
[Hoos and Stützle, 2009 draft]
• Estimated median running time for
random Euclidean instances.
• Based on
– 1000 samples each for N = 500,600,…,2000
– 100 samples each for N = 2500, 3000,3500,4000,4500
– 2.4 GHz AMD Opteron 2216 processors with 1MB L2 cache
and 4 GB main memory, running Cluster Rocks Linux v4.2.1.
$0.21 \cdot 1.24194^{\sqrt{N}}$
Actual median for N = 2000: ~57 minutes, for N = 4,500: ~96 hours
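The fitted estimate can be evaluated directly to compare it with the observed medians. One assumption is needed: the slide does not state the unit of the formula, and this sketch takes it to be seconds, which is consistent with the N = 2000 figure quoted above. The function name is made up for this illustration.

```python
from math import sqrt

def estimated_median_seconds(n):
    """Fitted model 0.21 * 1.24194**sqrt(N); the unit (seconds) is assumed."""
    return 0.21 * 1.24194 ** sqrt(n)

minutes_2000 = estimated_median_seconds(2000) / 60
hours_4500 = estimated_median_seconds(4500) / 3600
```

Under that assumption the model gives a little under an hour for N = 2000, close to the observed median, and somewhat more than the observed 96 hours for N = 4500.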
For Larger Instances: Fast Heuristics
• Tour construction heuristics like Nearest Neighbor,
Greedy, Christofides.
• Local search heuristics like 2-Opt, 3-Opt, Lin-Kernighan, Iterated Lin-Kernighan, or Helsgaun’s
Algorithm.
• A range of heuristics may be useful, based on
tradeoffs between tour quality and running time.
Necessary Digression: Metrics
• As the TSP is defined, the city-city distances (edge lengths)
are only constrained to satisfy
1. d(c,c’) ≥ 0, for all pairs of cities c,c’ (non-negativity)
2. d(c,c’) = 0 if and only if c = c’
• To be a quasimetric, the distances must also satisfy the
“triangle inequality”
3. d(c,c’) ≤ d(c,c’’) + d(c’’,c’) for all triples of cities
• To be a metric, the distances must also be symmetric:
4. d(c,c’) = d(c’,c), for all pairs of cities c,c’
Shortest Path “Metric”
• Let d be a TSP distance function. For any pair c,c’ of
cities, let dS(c,c’) be the length of shortest path from
c to c’ under d.
• Note that dS will be a quasimetric (and a metric if d is
symmetric)
• For most real-world applications, dS is actually the
distance function of interest, and so the triangle
inequality holds.
• As we shall see shortly, if we have the triangle
inequality, we can obtain good performance
guarantees for certain heuristics.
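The shortest path “metric” can be computed from any distance function with the standard Floyd-Warshall algorithm. This sketch assumes `d` is a square list-of-lists matrix with zeros on the diagonal; the function name is made up for this illustration.

```python
def shortest_path_closure(d):
    """Floyd-Warshall: replace each d[i][j] by the shortest-path length from i to j."""
    n = len(d)
    ds = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if ds[i][k] + ds[k][j] < ds[i][j]:
                    ds[i][j] = ds[i][k] + ds[k][j]
    return ds

# Hypothetical instance that violates the triangle inequality: 5 > 1 + 1.
d = [[0, 1, 5],
     [1, 0, 1],
     [5, 1, 0]]
ds = shortest_path_closure(d)
```

After the closure every direct distance is at most the length of any detour, so the triangle inequality holds by construction.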
Additional Restriction in Practice
• Distances are integers.
– Simplifies codes.
– Yields a definitive optimal solution value.
– Not a real restriction if distances are rational.
– Allows us to cope with the problematic Euclidean
metric.
Euclidean Difficulties
• The length of a TSP tour for points in the plane under
the Euclidean metric is a sum of square roots:
Length = $\sum_i \sqrt{x_i}$
• Given such an expression and a constant B our current
best algorithm for determining whether the length is
less than B takes exponential time.
• Hence, we do not even know whether the decision
problem version of the Euclidean TSP is in NP.
• And if we round the distances to some fixed
precision, then we may get different optimal tours
for different precisions (up to an exponential number
of bits).
Rounding Conventions
1. Round Nearest
dn(x) = floor(x+.5)
– Likely to yield tour lengths closest to the true Euclidean lengths
– Although optimal tours may opportunistically favor the
rounded-down edge lengths
– And triangle inequality may no longer be obeyed
Example: points x, y, z on a line with d(x,y) = d(y,z) = 1.3, so d(x,z) = 2.6:
dn(x,z) = 3 > dn(x,y) + dn(y,z) = 1 + 1 = 2.
Rounding Conventions
2. Round Down
df(x) = floor(x)
– Possibly most efficiently computable.
– But underestimates true tour length.
– Also fails to obey triangle inequality.
floor(3.8) > floor(1.9) + floor(1.9)
3. Round Up
dc(x) = ceiling(x)
– Does obey the triangle inequality.
– But overestimates true tour length.
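The three conventions and their effect on the triangle inequality can be checked with a short sketch. The helper `violates_triangle` is made up for this illustration: it considers two legs of lengths a and b laid end to end on a line, so that the direct distance is exactly a + b.

```python
from math import floor, ceil

def round_nearest(x):
    return floor(x + 0.5)

def round_down(x):
    return floor(x)

def round_up(x):
    return ceil(x)

def violates_triangle(rounding, a, b):
    """True if the rounded direct distance a + b exceeds the sum of the rounded legs."""
    return rounding(a + b) > rounding(a) + rounding(b)

nearest_fails = violates_triangle(round_nearest, 1.3, 1.3)  # 3 > 1 + 1
down_fails = violates_triangle(round_down, 1.9, 1.9)        # 3 > 1 + 1
up_fails = violates_triangle(round_up, 1.9, 1.9)            # 4 > 2 + 2 is false
```

Rounding up never produces a violation because ceiling(a + b) ≤ ceiling(a) + ceiling(b) for all real a and b.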
Exploiting Triangle Inequality
• Observation 1: Any connected graph in which
every vertex has even degree contains an
“Euler Tour” – a cycle that traverses each
edge exactly once, which can be found in
linear time.
• Observation 2: If the Δ-inequality holds, then
traversing an Euler tour but skipping past
previously-visited vertices yields a Traveling
Salesman tour of no greater length.
Obtaining the Initial Graph
• Double MST algorithm (DMST):
– Combine two copies of a Minimum Spanning Tree.
– Theorem [Folklore]: DMST(I) ≤ 2Opt(I).
• Christofides algorithm (CH):
– Combine one copy of an MST with a minimum-length
matching on its odd-degree vertices (there must be
an even number of them since the total sum of
degrees for any graph is even).
– Theorem [Christofides, 1976]: CH(I) ≤ 1.5Opt(I).
Optimal Tour on Odd-Degree Vertices
(No longer than overall Optimal Tour by the
triangle inequality)
Matching M1 + Matching M2 = Optimal Tour
Hence Optimal Matching ≤ min(M1,M2) ≤ OPT(I)/2
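The Double MST algorithm described above can be sketched compactly. Several choices here are assumptions of this sketch rather than anything from the lecture: Prim's algorithm in its simple O(N²) form builds the tree, and a depth-first preorder walk of the tree stands in for "traverse the Euler tour of the doubled tree and skip previously visited vertices", since the two visit the cities in the same order. The matching step of Christofides is not shown.

```python
def double_mst_tour(d):
    """Double-MST heuristic on a symmetric distance matrix d."""
    n = len(d)
    parent = [0] * n
    key = [float("inf")] * n
    in_tree = [False] * n
    key[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):
        # Prim's step: add the cheapest vertex not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=key.__getitem__)
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d[u][v] < key[v]:
                key[v], parent[v] = d[u][v], u
    # Preorder walk = Euler tour of the doubled tree with shortcuts.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_length(d, tour):
    return sum(d[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

# Corners of a 4-by-3 rectangle; the optimal tour is the perimeter, length 14.
d = [[0, 3, 4, 5],
     [3, 0, 5, 4],
     [4, 5, 0, 3],
     [5, 4, 3, 0]]
tour = double_mst_tour(d)
```

On metric instances the resulting tour is at most twice the MST length, and the MST is no longer than the optimal tour, which gives the factor of 2.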
Smart-Shortcut Christofides
2-Opt
3-Opt
1 million cities on my 3.06 GHz iMac: Lin-Kernighan
gets within 2% of optimal in 61 seconds.
The “strip” heuristic gets within 30% in 2 seconds,
compared to 40% for the much slower “double MST”
heuristic.
The Held-Karp Bound and the
Optimal Solution Value
Integer Programming Formulation
for Symmetric TSP
• Minimize $\sum_i d_i x_i$,
where $d_i$ is the length of edge $e_i$
• Subject to
– $x_i \in \{0,1\}$, for all edges $e_i \in C \times C$
– $\sum_{i : c \in e_i} x_i = 2$, for all cities $c \in C$
– $\sum_{i : |e_i \cap U| = 1} x_i \ge 2$, for all nonempty proper subsets $U \subset C$
Linear Programming Relaxation:
“Held-Karp” or “Subtour” Bound
• Minimize $\sum_i d_i x_i$,
where $d_i$ is the length of edge $e_i$
• Subject to
– $x_i \in [0,1]$, for all edges $e_i \in C \times C$
– $\sum_{i : c \in e_i} x_i = 2$, for all cities $c \in C$
– $\sum_{i : |e_i \cap U| = 1} x_i \ge 2$, for all nonempty proper subsets $U \subset C$
Percent by which Optimal Tour exceeds Held-Karp Bound
For “Uniform Points” in the Unit Square (+),
the gap appears to decline to a value of
about 0.44% asymptotically.
Computing the HK Bound
• Major obstacle: exponential number of cut constraints.
$\sum_{i : |e_i \cap U| = 1} x_i \ge 2$, for all nonempty proper subsets $U \subset C$.
• However, one can find violated constraints in
polynomial time by maximum flow techniques (and
other heuristics).
• Concorde has options for computing the bound in
roughly this way (5 hours on my iMac for a million
cities).
• One can also construct an alternative LP formulation
that is of polynomial size, so the HK bound can in
principle be computed in polynomial time.
Topics to Be Covered
• NP-completeness proofs, hardness of approximation results.
• Polynomial-time (and $2^{o(n)}$-time) solvable special cases.
• Branch-and-cut optimization algorithms (Concorde, etc.): theory and engineering.
• Properties of optimal solutions.
• Polynomial-time approximation tour construction heuristics with good worst-case guarantees and/or average case performance.
• Data structures, exploiting geometry, and other speed-up tricks for heuristics.
• Local Optimization heuristics (2-Opt, 3-Opt, Lin-Kernighan).
• Metaheuristics (neural nets, simulated annealing, genetic algorithms, etc.).
• Variants (max TSP, min-latency TSP, prize-collecting TSP, Vehicle routing, …)