• David S. Johnson
• AT&T Labs - Research
[Photo: Don Knuth, Mike Garey, David Johnson]
From M.R. Garey, R.L. Graham, D.S. Johnson, and D.E. Knuth, “Complexity Results for Bandwidth Minimization,” SIAM J. Appl. Math. 34:3 (1978), 477-495.
[Photo: Bob Tarjan, Mike Garey, David Johnson]
1980s:
[Photos: Peter Shor, Ed Coffman, Ron Graham, Mihalis Yannakakis, Dick Karp, Christos Papadimitriou, Endre Szemerédi, Laci Lovász]
• Role Models: Collaboration in Action
– Mike Fischer and Albert Meyer
• Collaborators “Down the Hall”
– Mike Garey, Ron Graham, Ed Coffman, Mihalis Yannakakis, Bob Tarjan, Peter Shor
• Honorary “Down the Hall” Collaborators
– Christos Papadimitriou, Tom Leighton, Richard Weber, Claire Mathieu
• Experimental Inspirations and Collaborators
– Jon Bentley, Shen Lin & Brian Kernighan, Lyle & Cathy McGeoch, David Applegate
David S. Johnson
AT&T Labs – Research
Knuth Prize Lecture
June 7, 2010
1996: Public Key Cryptography (Adleman, Diffie, Hellman, Merkle, Rivest, and Shamir)
1997: Data Compression (Lempel and Ziv)
1998: Model Checking (Bryant, Clarke, Emerson, and McMillan)
1999: Splay Trees (Sleator and Tarjan)
2000: Polynomial-Time Interior Point LP Methods (Karmarkar)
2001: Shotgun Genome Sequencing (Myers)
2002: Constrained Channel Coding (Franaszek)
2003: Randomized Primality Tests (Miller, Rabin, Solovay, and Strassen)
2004: AdaBoost Machine Learning Algorithm (Freund and Schapire)
2005: Formal Verification of Reactive Systems (Holzmann, Kurshan, Vardi, and Wolper)
2006: Logic Synthesis and Simulation of Electronic Systems (Brayton)
2007: Gröbner Bases as a Tool in Computer Algebra (Buchberger)
2008: Support Vector Machines (Cortes and Vapnik)
2009: Practice-Oriented Provable-Security (Bellare and Rogaway)
Part I. The Traveling Salesman Problem
• TSP Applications (Bell Labs):
– “Laser Logic” (programming FPGAs)
– Circuit Board Construction
– Circuit Board Inspection
• Algorithms Used --
– Double Spanning Tree? (worst-case ratio = 2; a minimal sketch appears below)
– Nearest Insertion? (worst-case ratio = 2)
– Christofides? (worst-case ratio = 1.5)
• Answer: None of the Above
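As an aside, here is a minimal Python sketch of the double-spanning-tree heuristic mentioned above (build a minimum spanning tree, then shortcut a preorder walk of it). The function and variable names are illustrative only and are not taken from any of the implementations discussed in this talk.

    import math

    def double_tree_tour(points):
        """2-approximate TSP tour for Euclidean points: build a minimum
        spanning tree (Prim), then shortcut a preorder walk of the tree."""
        n = len(points)
        dist = lambda i, j: math.dist(points[i], points[j])

        # Prim's algorithm, rooted at city 0.
        in_tree = [False] * n
        parent = [0] * n
        best = [math.inf] * n
        best[0] = 0.0
        children = [[] for _ in range(n)]
        for _ in range(n):
            u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
            in_tree[u] = True
            if u != 0:
                children[parent[u]].append(u)
            for v in range(n):
                if not in_tree[v] and dist(u, v) < best[v]:
                    best[v], parent[v] = dist(u, v), u

        # Preorder walk; skipping already-visited cities ("shortcutting")
        # yields a tour of length at most twice the MST weight <= 2 * OPT.
        tour, stack = [], [0]
        while stack:
            u = stack.pop()
            tour.append(u)
            stack.extend(reversed(children[u]))
        return tour

Christofides improves the ratio to 1.5 by replacing the tree-doubling step with a minimum-weight matching on the odd-degree tree vertices.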
Testbed: Random Euclidean Instances
[Plots of sample instances with N = 10, 100, 1000, and 10000 points]
Lin-Kernighan [Johnson-McGeoch Implementation]
1.5% off optimal
1,000,000 cities in 8 minutes at 500 MHz
Iterated Lin-Kernighan [Johnson-McGeoch Implementation]
0.4% off optimal
100,000 cities in 75 minutes at 500 MHz
Concorde Branch-and-Cut Optimization [Applegate-Bixby-Chvatal-Cook]
Optimum
1,000 cities in median time 5 minutes at 2.66 GHz
Running times (in seconds) for 10,000 Concorde runs on random 1000-city planar Euclidean instances (2.66 GHz Intel Xeon processor in a dual-processor PC, purchased late 2002).
Range: 7.1 seconds to 38.3 hours
For more on the state-of-the-TSP-art, see:
http://www2.research.att.com/~dsj/chtsp/index.html/ [DIMACS TSP Challenge]
http://www.tsp.gatech.edu/ [Concorde, with instances]
Part II. Bin Packing
Part III. Access Network Design
[Applegate, Archer, Johnson, Merritt, Phillips, …]
• Problem:
In “out of region” areas, AT&T does not always have direct fiber connections to our business customers, and hence spends a lot of money to lease lines to reach them. Can we save money by laying our own fiber?
• Tradeoff:
Capital cost of fiber installation versus monthly cost savings from dropping leases.
• Our Task:
Identify the most profitable clusters of customers to fiber up.
• Key Observation: This can be modeled as a Prize Collecting Steiner Tree problem, with Prize = Lease Savings and Cost = Annualized Capital Cost.
• The Goemans-Williamson primal-dual approximation PCST algorithm should be applicable.
• Although the Goemans-Williamson algorithm has a worst-case ratio of 2, this is for the objective function (Edge Cost + Amount of Prize Foregone), which isn’t really the correct one here.
• Edge costs are capital dollars, prizes are expense dollars, and the two are not strictly comparable.
• We don’t have accurate estimates of costs.
• By using various multipliers on the prize values, we can generate a range of possible clusters, ranking them, for instance, by the number of years until cumulative lease savings equal the capital cost (a rough sketch of this ranking step follows below).
• Each cluster can itself yield more options if we consider peeling off the least profitable leaves.
• Planners can then take our top suggestions and validate them by obtaining accurate cost estimates.
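As a rough illustration of the ranking step just described (not the actual AT&T tool), one might sweep a multiplier over the prize values, re-solve the prize-collecting Steiner tree instance for each multiplier, and rank all resulting clusters by payback period. All helper names below (pcst_solver, with_prizes_scaled_by, the cluster fields) are hypothetical.

    def payback_years(capital_cost, annual_lease_savings):
        """Years until cumulative lease savings equal the capital cost."""
        if annual_lease_savings <= 0:
            return float('inf')
        return capital_cost / annual_lease_savings

    def rank_clusters(instance, pcst_solver, multipliers=(0.5, 1.0, 2.0, 4.0)):
        """Sweep a multiplier over the prize (lease-savings) values, solve the
        scaled PCST instance each time, and rank all resulting clusters by
        payback period.  `pcst_solver` and `with_prizes_scaled_by` are
        hypothetical stand-ins for the real solver and instance scaling."""
        candidates = []
        for m in multipliers:
            scaled = instance.with_prizes_scaled_by(m)
            for cluster in pcst_solver(scaled):
                years = payback_years(cluster.capital_cost, cluster.annual_savings)
                candidates.append((years, cluster))
        candidates.sort(key=lambda pair: pair[0])   # quickest payback first
        return candidates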
Part IV. The More Typical Approaches
• Adapt a metaheuristic search-based approach (local search, genetic algorithms, tabu search, GRASP, etc.)
• Model the problem as a mixed integer program and use CPLEX, either to solve the MIP if the instance is sufficiently small (often the case), or to solve the LP relaxation, which we then round.
[Breslau, Diakonikolas, Duffield, Gu, Hajiaghayi, Karloff, Johnson, Resende, Sen]
• Special case of the “Cover-by-Pairs” problem [Hassin & Segev, 2005]:
• Given a set A of items, a set C of “cover objects”, and a set T ⊆ A×C×C, find a minimum-size subset C′ ⊆ C such that for all a ∈ A, there exist (not-necessarily-distinct) c, c′ ∈ C′ such that (a, c, c′) ∈ T.
• Here we are given a graph G = (V,E), with both A and C being subsets of V.
[Figure: example network; Cover Object = potential content location, Item = customer for content; examples labeled Yes/No]
(a, c, c′) ∈ T iff no vertex b ≠ a is in both a shortest path from a to c and a shortest path from a to c′.
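A convenient way to test this condition in code uses the standard fact that a vertex b lies on some shortest path from a to c iff d(a,b) + d(b,c) = d(a,c). The Python sketch below is a literal reading of the condition above, assuming precomputed all-pairs shortest-path distances with exact (e.g., integer) edge weights; it is an illustration, not the implementation used in the study.

    def in_T(a, c1, c2, d, vertices):
        """Return True iff (a, c1, c2) is in T: no vertex b != a lies on both
        some shortest path from a to c1 and some shortest path from a to c2.
        d[u][v] holds a precomputed shortest-path distance."""
        for b in vertices:
            if b == a:
                continue
            on_path_to_c1 = d[a][b] + d[b][c1] == d[a][c1]
            on_path_to_c2 = d[a][b] + d[b][c2] == d[a][c2]
            if on_path_to_c1 and on_path_to_c2:
                return False
        return True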
• Our special case is at least as hard to approximate as Cover-by-Pairs.
• Cover-by-Pairs is at least as hard to approximate as Label Cover.
• Assuming NP ⊄ DTIME(n^polylog(n)), no polynomial-time approximation algorithm for Label Cover can be guaranteed to find a solution that is within a ratio of 2^(log^(1−ε) n) of optimal, for any ε > 0.
Algorithms we tried:
– CPLEX applied to the integer programming formulation of the corresponding Cover-by-Pairs instance
– Greedy algorithm for the Cover-by-Pairs instance (a sketch of the natural greedy rule follows below)
– Genetic algorithm for the Cover-by-Pairs instance
– Graph-based “Double Hitting Set” algorithm (HH) that puts together solutions to two specially constructed hitting-set instances, with Greedy-algorithm cleanup
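For concreteness, here is a Python sketch of the natural greedy rule for Cover-by-Pairs referred to above: at each step, add the pair of cover objects (possibly including objects already chosen) that newly covers the most items per newly added object. This illustrates the generic rule only, not necessarily the exact variant implemented for the study.

    from itertools import combinations_with_replacement

    def greedy_cover_by_pairs(items, cover_objects, T):
        """Greedy heuristic for Cover-by-Pairs.  Item a is covered by a chosen
        set C' if some triple (a, c, c2) in T has both c and c2 in C'."""
        items, chosen, covered = set(items), set(), set()

        def newly_covered(c, c2):
            cand = chosen | {c, c2}
            return {a for (a, x, y) in T
                    if a not in covered and x in cand and y in cand}

        while covered != items:
            best_pair, best_ratio, best_gain = None, 0.0, set()
            for c, c2 in combinations_with_replacement(cover_objects, 2):
                num_added = len({c, c2} - chosen)
                if num_added == 0:
                    continue
                gain = newly_covered(c, c2)
                ratio = len(gain) / num_added
                if ratio > best_ratio:
                    best_pair, best_ratio, best_gain = {c, c2} - chosen, ratio, gain
            if best_pair is None:
                raise ValueError("remaining items cannot be covered")
            chosen |= best_pair
            covered |= best_gain
        return chosen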
• Actual ISP networks with 100 to 1000 routers (vertices)
• Synthetic wide-area-network structures with 26 to 556 routers, generated using the Georgia Tech Internet Topology Models package
• CPLEX could, in reasonable time, find optimal integer solutions to instances with |A|, |C| < 150, but its running time was clearly growing exponentially.
• The Double Hitting Set and Genetic algorithms typically found solutions of size no more than 1.05·OPT (where “OPT” was the maximum of the true optimum, where known, and a lower bound equal to the optimal solution value for the second hitting-set instance the Double Hitting Set algorithm considered).
• Only for the largest ISP instance did the results degrade (HH was 46% off the lower bound)
• But is this degradation of the algorithm or the quality of our lower bound?
• And does it matter? The solution was still far better than the naïve solution and well worth obtaining.
• Real-world instances were not as worst-case or asymptotic as our theory is.
• Champion algorithms from the theory world could be outclassed by ad hoc algorithms with much worse (or unknown) worst-case behavior.
• Some algorithms and ideas from the theory world have been successfully applied, often to purposes for which they were not originally designed.
• Algorithms from the Operations Research and Metaheuristic communities have perhaps had more real-world impact on coping with NP-hardness than those from TCS.
1. Study problems people might actually want to solve.
2. Study the algorithms people actually use (or might consider using).
3. Design for users, not adversaries.
4. Complement worst-case results with “realistic” average case results.
5. Implement and experiment.
• Bin packing, greedy set covering, graph coloring (DSJ, 1973)
• 2-Opt algorithm for the TSP (Chandra, Karloff, & Tovey, 1999)
• K-means clustering (Arthur & Vassilvitskii, 2006, 2007)
• Smoothed analysis of linear programming (Spielman & Teng, 2001)
• When (and why) do metaheuristic approaches work well?
• Ditto for belief propagation algorithms, etc.
• Many other questions.
• Some of our most effective techniques for minimizing worst-case behavior essentially guarantee poor performance in practice
– Rounding
– Metric Embedding
…
• For any on-line algorithm A, R∞(A) ≥ 1.540
• First Fit (sketched below): asymptotic worst-case ratio R∞(FF) = 1.7
• Harmonic Algorithm: R∞(H) = 1.691…
• Richey’s Harmonic+1 Algorithm: R∞(H+1) ~ 1.59
• The rounding up of sizes used in the latter two algorithms guarantees wasted space in bins, and First Fit substantially outperforms them in practice and on average (for a variety of distributions)
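For reference, a minimal Python sketch of First Fit (place each arriving item into the lowest-indexed open bin with room; open a new bin otherwise):

    def first_fit(sizes, capacity=1.0):
        """First Fit bin packing.  Returns the list of remaining gaps,
        one entry per bin used (so len(result) is the number of bins)."""
        gaps = []                          # gaps[i] = remaining room in bin i
        for s in sizes:
            for i, g in enumerate(gaps):
                if s <= g:                 # first bin with enough room
                    gaps[i] = g - s
                    break
            else:
                gaps.append(capacity - s)  # no bin fits: open a new one
        return gaps

First Fit Decreasing simply sorts the sizes in decreasing order before running the same routine.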
Drawbacks of standard average-case analysis:
• Results for just one distribution tell a very narrow story
• Many distributions studied are chosen for ease of analysis rather than for modelling reality
– too unstructured
– too much independence
• Reasonably good: Random points in the unit square -- here geometry imposes structure, yielding reasonably good surrogates for real-world geometric instances.
• Not so Good: Random distance matrices (each edge length chosen uniformly and independently from [0,1]).
• The classical distribution first studied has item sizes chosen independently from the uniform distribution on [0,1]. It yields great and surprising theory:
– Theorem [Shor]: For an n-item instance, the expected number of bins in the packing constructed by Best Fit is n/2 + Θ(n^(1/2) log^(3/4) n)
• However, for this distribution, bin packing is essentially reduced to a matching problem.
• Choosing sizes from [0,a], a < 1, captures a wider range of packing behavior.
– Theorem [Johnson, Leighton, Shor, & Weber]: First Fit Decreasing has O(1) waste for 0 < a ≤ ½, and Θ(n^(1/3)) waste for ½ < a < 1.
• Even more generally, one can consider “discrete distributions,” where item sizes are restricted to a fixed set of integers, each having its own probability.
• Random instances of Satisfiability
• G(n,p) random graphs
• G(n,p) random graphs with planted subgraphs
*Questionable for practice only -- Very interesting for theory.
• Given the frequent disconnect between theory and practical performance, the best way to get people to use your algorithm is to provide experimental evidence that it performs well, and, better yet, to provide an implementation itself.
• Side benefit: Experiments can also drive new theory, suggesting new questions and algorithms.
• Jon Bentley’s experiments with the 2-Opt algorithm identified the linked-list representation of the current tour as a bottleneck.
• This led to defining a “tour data structure” with flip, successor, predecessor, and betweenness as operations/queries (a naive array-based illustration follows below).
• For this we obtained a cell-probe lower bound and a near-optimum solution (a representation based on splay trees) [Fredman, Johnson, McGeoch, & Ostheimer, 1995]
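To illustrate the interface (though not the splay-tree representation from the paper), here is a naive array-based Python version of such a tour data structure. Its flip costs Θ(n), which is precisely the bottleneck the splay-tree solution avoids.

    class ArrayTour:
        """Naive array-based tour with successor, predecessor, between, and flip.
        The paper's splay-tree representation supports all four operations in
        O(log n) amortized time; flip here takes Theta(n)."""

        def __init__(self, cities):
            self.order = list(cities)
            self.pos = {c: i for i, c in enumerate(self.order)}

        def successor(self, a):
            return self.order[(self.pos[a] + 1) % len(self.order)]

        def predecessor(self, a):
            return self.order[(self.pos[a] - 1) % len(self.order)]

        def between(self, a, b, c):
            """True iff, traversing the tour forward from a, b is reached no later than c."""
            n = len(self.order)
            ia, ib, ic = self.pos[a], self.pos[b], self.pos[c]
            return (ib - ia) % n <= (ic - ia) % n

        def flip(self, a, b):
            """Reverse the tour segment that runs forward from a to b (inclusive)."""
            n = len(self.order)
            i, j = self.pos[a], self.pos[b]
            seg = [self.order[(i + k) % n] for k in range((j - i) % n + 1)]
            for k, city in enumerate(reversed(seg)):
                self.order[(i + k) % n] = city
            self.pos = {c: idx for idx, c in enumerate(self.order)}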
• The above two average-case results would never even have been conjectured were it not for experimental results that suggested them.
• The observation that First Fit works well on the U[0,1] distribution because it approximately solves a matching problem led to the invention of the “Sum-of-Squares” on-line bin packing algorithm (sketched below).
• Experimental analysis of this algorithm led to a variant that has essentially optimal average-case performance for ALL discrete distributions. [Csirik, Johnson, Kenyon, Orlin, Shor, & Weber, 2006]
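As a sketch (based on my reading of the published description, with tie-breaking and other details simplified), the Sum-of-Squares rule for integer sizes and bin capacity B keeps counts N[g] of open bins with gap g and places each arriving item so as to minimize the resulting sum of the N[g]².

    from collections import defaultdict

    def sum_of_squares_pack(sizes, B):
        """On-line Sum-of-Squares bin packing for integer item sizes and bin
        capacity B.  N[g] counts partially filled bins with remaining gap g
        (1 <= g <= B-1).  Each item goes into a bin with a feasible gap, or a
        new bin, so as to minimize sum_g N[g]**2 afterward.  Returns the
        number of bins used."""
        N = defaultdict(int)
        bins_used = 0

        def change_in_ss(gap, s):
            # Change in the sum of squares if an item of size s goes into a
            # bin with remaining gap `gap` (gap == B means opening a new bin).
            delta = 0
            if gap < B:
                delta += (N[gap] - 1) ** 2 - N[gap] ** 2
            if gap - s > 0:
                delta += (N[gap - s] + 1) ** 2 - N[gap - s] ** 2
            return delta

        for s in sizes:
            choices = [g for g in range(s, B) if N[g] > 0] + [B]
            g = min(choices, key=lambda gap: change_in_ss(gap, s))
            if g == B:
                bins_used += 1
            else:
                N[g] -= 1
            if g - s > 0:
                N[g - s] += 1
        return bins_used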
• Implementation and experiments are certainly not for everyone (or every algorithm).
• Some experience in this area does, however, help to put our theoretical work in perspective, as does knowing more about how our algorithms perform in practice and why.
• More examples of impact (or lack thereof).
• Suggestions for future Kanellakis Prize nominees.
• Suggestions of new problem domains for future DIMACS Implementation Challenges.