Smoothed Analysis of Algorithms
Shang-Hua Teng, Boston University and Akamai Technologies Inc.
Joint work with Daniel Spielman (MIT)

Outline
• Part I: Introduction to Algorithms
• Part II: Smoothed Analysis of Algorithms
• Part III: Geometric Perturbation

Part I: Introduction to Algorithms
• Types of problems
• Complexity of algorithms
• Randomized algorithms
• Approximation
• Worst-case analysis
• Average-case analysis

Algorithmic Problems
• Decision problem:
  – Can we 3-color a given graph G?
• Search problem:
  – Given a matrix A and a vector b, find an x such that Ax = b.
• Optimization problem:
  – Given a matrix A, a vector b, and an objective vector c, find an x that maximizes c^T x subject to Ax ≤ b.

The Size and Family of a Problem
• Instance of a problem
  – Input: for example, a graph, a matrix, a set of points
  – Desired output: {yes, no}, a coloring, a solution vector, a convex hull
  – Input and output size: the amount of memory needed to store the input and output; for example, the number of vertices in a graph, the dimensions of a matrix, the cardinality of a point set
• A problem is a family of instances.

An Example: Median of a Set of Numbers
1 2 3 4 5 6 7 8 9 10
• Input: a set of numbers {a_1, a_2, …, a_n}
• Output: the a_i that maximizes min(|{a_j : a_j ≤ a_i}|, |{a_j : a_j ≥ a_i}|)

Quick Selection
Quick_Selection({a_1, a_2, …, a_n}, k)
1. Choose a_1 in {a_1, a_2, …, a_n}.
2. Divide the set into L = {a_j < a_1}, {a_1}, and R = {a_j > a_1}.
3. Cases:
   1. If k = |L| + 1, return a_1.
   2. If k ≤ |L|, recursively apply Quick_Selection(L, k).
   3. If k > |L| + 1, recursively apply Quick_Selection(R, k − |L| − 1).

Worst-Case Time Complexity
Let T({a_1, a_2, …, a_n}) be the number of basic steps Quick_Selection needs on input {a_1, a_2, …, a_n}.
• We classify inputs by their size.
• Let A_n be the set of all inputs of size n.
  T(n) = max_{{a_1, …, a_n} ∈ A_n} T({a_1, …, a_n})
• T(n) = Θ(n²)

Better Algorithms from the Worst-Case Viewpoint
• Divide and conquer
  – Linear-time algorithm
  – Blum-Floyd-Pratt-Rivest-Tarjan

Average-Case Time Complexity
• Let A_n be the set of all inputs of size n.
• Choose {a_1, a_2, …, a_n} uniformly at random.
  AvgT(n) = E_{{a_1, …, a_n} ∈ A_n} [T({a_1, …, a_n})]
• AvgT(n) = O(n)

Randomized Algorithms
Quick_Selection({a_1, a_2, …, a_n}, k)
1. Choose a random element s in {a_1, a_2, …, a_n}.
2. Divide the set into L = {a_j < s}, {s}, and R = {a_j > s}.
3. Cases:
   1. If k = |L| + 1, return s.
   2. If k ≤ |L|, recursively apply Quick_Selection(L, k).
   3. If k > |L| + 1, recursively apply Quick_Selection(R, k − |L| − 1).

Expected Worst-Case Complexity of Randomized Algorithms
  ET(n) = max_{{a_1, …, a_n} ∈ A_n} E[T({a_1, …, a_n})]
  ET(n) = O(n)

Approximation Algorithms
Sampling_Selection({a_1, a_2, …, a_n})
1. Choose a random element a_i from {a_1, a_2, …, a_n}.
2. Return a_i.
• a_i is a δ-median if min(|{a_j : a_j ≤ a_i}|, |{a_j : a_j ≥ a_i}|) ≥ δn.
• Prob[a_i is a (1/4)-median] = 0.5
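A minimal Python sketch of the two routines above: the randomized Quick_Selection and the one-sample approximation of the median. The uniform test data, the trial count, and the duplicate-handling detail are illustrative choices, not part of the slides.

```python
# Illustrative sketch of the selection routines above: exact randomized
# Quick_Selection and the one-random-sample approximation of the median.
import random

def quick_selection(a, k):
    """Return the k-th smallest element (1-indexed) of the list a."""
    s = random.choice(a)                      # random pivot
    L = [x for x in a if x < s]
    R = [x for x in a if x > s]
    E = len(a) - len(L) - len(R)              # elements equal to the pivot
    if len(L) < k <= len(L) + E:
        return s
    if k <= len(L):
        return quick_selection(L, k)
    return quick_selection(R, k - len(L) - E)

def is_delta_median(a, x, delta):
    """x is a delta-median of a if at least delta*n elements lie on each side."""
    below = sum(1 for y in a if y <= x)
    above = sum(1 for y in a if y >= x)
    return min(below, above) >= delta * len(a)

random.seed(0)
data = random.sample(range(10**6), 1001)      # distinct numbers, illustrative
mid = quick_selection(data, 501)
print("exact median:", mid, "is a 1/2-median:", is_delta_median(data, mid, 0.5))

# A single random element is a (1/4)-median with probability about 1/2.
trials = 10_000
hits = sum(is_delta_median(data, random.choice(data), 0.25) for _ in range(trials))
print("empirical Pr[random element is a (1/4)-median] ~", hits / trials)
```

The empirical probability comes out near 0.5, matching the slide: exactly the middle half of the ranks qualify as (1/4)-medians.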
Approximation Algorithms
Sampling_Selection({a_1, a_2, …, a_n}, k)
1. Choose a set S of k random elements from {a_1, a_2, …, a_n}.
2. Return the median a_i of S.
• Complexity: O(k)
• Pr[a_i is a (1/2 − δ)-median] ≥ 1 − e^{−Ω(δ²k)}

When k = 3: Middle-of-3
a  b  c

Iterative Middle-of-3 (Miller-Teng)
• A tree of mid-of-3 gadgets: randomly assign the elements of {a_1, a_2, …, a_n} to the leaves and repeatedly take the middle element of each group of three.
• Pr[the output is a (1/2 − δ)-median] ≥ 1 − e^{−Ω(δk)}

Summary
• Algorithms and their complexity
• Worst-case complexity
• Average-case complexity
• Design better worst-case algorithms
• Design better algorithms with randomization
• Design faster algorithms with approximation

Sad and Exciting Reality
• Most interesting optimization problems are hard
• P vs NP
• NP-complete problems
  – Coloring, maximum independent set, graph partitioning
  – Scheduling, optimal VLSI layout, optimal web-traffic assignment, data mining and clustering, optimal DNS and TCP/IP protocols, integer programming, …
• Some are of unknown status:
  – Graph isomorphism
  – Factorization

Good News
• Some fundamental problems are solvable in polynomial time
  – Sorting, selection, low-dimensional computational geometry
  – Matrix problems: the eigenvalue problem, linear systems
  – Linear programming (interior-point method)
  – Mathematical programming

Better News I
• Randomization helps
  – Primality testing (essential to RSA)
  – VC-dimension and sampling for computational geometry and machine learning
  – Random walks for various statistical problems
  – Quicksort
  – Random routing on parallel networks
  – Hashing

Better News II
• Approximation algorithms
  – On-line scheduling
  – Lattice basis reduction (e.g., in cryptanalysis)
  – Approximate Euclidean TSP and Steiner trees
  – Graph partitioning
  – Data clustering
  – Divide-and-conquer methods for VLSI layout

Real Stories
• Practical algorithms and heuristics
  – Great performance empirically
  – Used daily by millions and millions of people
  – Work routinely, from chip design to airline scheduling
• Applications
  – Internet routing and searching
  – Scientific simulation
  – Optimization

Part II: Smoothed Analysis of Algorithms
• Introduction to smoothed analysis
• Why smoothed analysis?
• Smoothed analysis of the simplex method for linear programming

Smoothed Analysis of Algorithms: Why the Simplex Method Usually Takes Polynomial Time
Daniel A. Spielman (MIT) and Shang-Hua Teng (Boston University)
[Figure: Gaussian perturbation with variance σ²]

Remarkable Algorithms and Heuristics
• Work well in practice, but
• Worst case: bad, exponential, contrived
• Average case: good, polynomial, meaningful?

Random is not typical.

Smoothed Analysis of Algorithms
• Worst case: max_x T(x)
• Average case: avg_r T(r)
• Smoothed complexity: max_x avg_r T(x + r)

Smoothed Analysis of Algorithms
• Interpolates between worst case and average case.
• Considers the neighborhood of every input instance.
• If the smoothed complexity is low, you have to be unlucky to find a bad input instance.

Classical Example: The Simplex Method for Linear Programming
max z^T x  s.t.  Ax ≤ y
• Worst case: exponential
• Average case: polynomial
• Widely used in practice

The Diet Problem
                      Carbs  Protein  Fat   Iron  Cost
1 slice bread           30      5     1.5    10   30¢
1 cup yogurt            10      9     2.5     0   80¢
2 tsp peanut butter      6      8    18       6   20¢
US RDA minimum         300     50    70     100

Minimize 30x_1 + 80x_2 + 20x_3
s.t.  30x_1 + 10x_2 + 6x_3 ≥ 300
      5x_1 + 9x_2 + 8x_3 ≥ 50
      1.5x_1 + 2.5x_2 + 18x_3 ≥ 70
      10x_1 + 6x_3 ≥ 100
      x_1, x_2, x_3 ≥ 0

Linear Programming
max z^T x  s.t.  Ax ≤ y
Example:
max x_1 + x_2
s.t.  x_1 ≤ 1
      x_2 ≤ 1
      −x_1 − 2x_2 ≤ 1

Smoothed Analysis of the Simplex Method
max z^T x  s.t.  Ax ≤ y   becomes   max z^T x  s.t.  (A + σG)x ≤ y,   where G is Gaussian
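To make the perturbation model concrete, the sketch below sets up the diet problem from the earlier slide and solves both the original LP and a Gaussian-perturbed copy of its constraint matrix with scipy.optimize.linprog. The perturbation scale sigma, the random seed, and the use of SciPy are illustrative assumptions, not part of the slides.

```python
# Illustrative sketch: solve the diet LP and a Gaussian perturbation
# (A + sigma*G) x >= b of its constraint matrix with SciPy.
import numpy as np
from scipy.optimize import linprog

# Diet problem: minimize cost subject to nutrient minimums, x >= 0.
c = np.array([30.0, 80.0, 20.0])             # cost: bread, yogurt, peanut butter
A = np.array([[30.0, 10.0,  6.0],             # carbs
              [ 5.0,  9.0,  8.0],             # protein
              [ 1.5,  2.5, 18.0],             # fat
              [10.0,  0.0,  6.0]])            # iron
b = np.array([300.0, 50.0, 70.0, 100.0])      # US RDA minimums

def solve(A_mat):
    # linprog minimizes c^T x subject to A_ub x <= b_ub, so flip the >= rows.
    return linprog(c, A_ub=-A_mat, b_ub=-b, bounds=[(0, None)] * 3)

sigma = 0.1                                   # illustrative perturbation scale
rng = np.random.default_rng(0)
G = rng.standard_normal(A.shape)              # Gaussian matrix G

original = solve(A)
perturbed = solve(A + sigma * G)
print("original  cost:", round(original.fun, 2), "x =", np.round(original.x, 3))
print("perturbed cost:", round(perturbed.fun, 2), "x =", np.round(perturbed.x, 3))
```

For small σ the perturbed optimum stays close to the original one; smoothed analysis studies the running time over exactly such Gaussian neighborhoods of a worst-case instance, rather than over fully random inputs.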
Smoothed Analysis of the Simplex Method
• Worst case: exponential
• Average case: polynomial
• Smoothed complexity: polynomial
max z^T x  s.t.  a_i^T x ≤ 1, ||a_i|| ≤ 1   becomes   max z^T x  s.t.  (a_i + σg_i)^T x ≤ 1

Perturbation Yields Approximation
For a polytope of good aspect ratio.

But, combinatorially

The Simplex Method

History of Linear Programming
• Simplex method (Dantzig, '47)
• Exponential worst case (Klee-Minty, '72)
• Average-case analysis (Borgwardt '77, Smale '82, Haimovich, Adler, Megiddo, Shamir, Karp, Todd)
• Ellipsoid method (Khachiyan, '79)
• Interior-point method (Karmarkar, '84)
• Randomized simplex method, m^{O(√d)} (Kalai '92, Matousek-Sharir-Welzl '92)

Shadow Vertices

Another Shadow

Shadow Vertex Pivot Rule
[Figure: start vertex and objective z]

Theorem: For every plane, the expected size of the shadow of the perturbed polytope is poly(m, d, 1/σ).

Theorem: For every z, the two-phase algorithm runs in expected time poly(m, d, 1/σ).

A Local Condition for Optimality
The vertex on a_1, …, a_d maximizes z iff z ∈ cone(a_1, …, a_d).

Primal and Polar
• Primal: a_1^T x ≤ 1, a_2^T x ≤ 1, …, a_m^T x ≤ 1
• Polar: ConvexHull(a_1, a_2, …, a_m)

Polar Linear Program
Find the largest multiple of z that lies in ConvexHull(a_1, a_2, …, a_m).

[Figure: initial simplex and optimal simplex]

Shadow Vertex Pivot Rule

Counting Shadow Facets
• Count facets by discretizing to N directions, N → ∞.
• Count pairs of consecutive directions that land in different facets:
  Pr[different facets] < c/N, so we expect at most c facets.

Expect a cone of large angle.

Distance

Isolate on one simplex.

Integral Formulation

Example: For Gaussian distributed points a and b, given that the segment ab intersects the x-axis,
  Prob[angle(ab, x-axis) < ε] = O(ε²).

[Figures: the segment ab crossing the x-axis at various angles]

P_ε = Pr[angle(ab, axis) ≤ ε | ab ∩ axis ≠ ∅]
    = c ∫_{a,b} [angle ≤ ε][ab ∩ axis ≠ ∅] μ_0(a) μ_1(b) da db
Claim: For ε < ε_0, P_ε < ε².

Change of Variables
Parametrize (a, b) by (u, v, z, θ): z is the crossing point, θ the angle with the axis, and u, v the distances from a and b to z.
  da db = (u + v)|sin θ| du dv dz dθ
  μ_0(a) = μ_0(u, z, θ),  μ_1(b) = μ_1(v, z, θ)

Analysis
P_ε = c ∫_{u,v,z} ∫_{θ ≤ ε} (u + v)|sin θ| μ_0(u, z, θ) μ_1(v, z, θ) dθ du dv dz
For ε < ε_0, P_ε < ε²: a slight change in θ has little effect on the μ_i for all but very rare u, v, z.
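The crossing-angle example above is easy to probe numerically. The sketch below is an illustrative check, not part of the slides: it assumes standard 2D Gaussians for a and b, keeps the pairs whose segment crosses the x-axis, and estimates P_ε for a few values of ε. The estimates shrink roughly quadratically in ε, consistent with the claim.

```python
# Illustrative Monte Carlo check (assumed setup: standard 2D Gaussians for a, b).
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
a = rng.standard_normal((n, 2))
b = rng.standard_normal((n, 2))

# Keep pairs whose segment ab crosses the x-axis (endpoints on opposite sides).
crosses = np.sign(a[:, 1]) != np.sign(b[:, 1])
d = b[crosses] - a[crosses]

# Angle between the segment and the x-axis, in [0, pi/2].
theta = np.arctan2(np.abs(d[:, 1]), np.abs(d[:, 0]))

for eps in (0.4, 0.2, 0.1, 0.05):
    p_eps = np.mean(theta <= eps)
    print(f"eps = {eps:0.2f}   P_eps ~ {p_eps:0.5f}   P_eps / eps^2 ~ {p_eps / eps**2:0.2f}")
```

The ratio P_ε / ε² stays roughly constant as ε shrinks, which is the quadratic behavior the claim asserts.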
Distance
[Figure: Gaussian distributed corners a_1, a_2, a_3 and a point p]

Idea: fix by perturbation.

Trickier in 3D.

Future Research – Simplex Method
• Smoothed analysis of other pivot rules
• Analysis under relative perturbations
• Trace solutions as we un-perturb
• Strongly polynomial algorithm for linear programming?

A Theory Closer to Practice
• Optimization algorithms and heuristics, such as Newton's method, conjugate gradient, simulated annealing, differential evolution, etc.
• Computational geometry, scientific computing, and numerical analysis
• Heuristics solving instances of NP-hard problems
• Discrete problems?
• Shrink the intuition gap between theory and practice.

Part III: Geometric Perturbation
Three-Dimensional Mesh Generation

Delaunay Triangulations for Well-Shaped 3D Mesh Generation
Shang-Hua Teng, Boston University and Akamai Technologies Inc.
Collaborators: Siu-Wing Cheng, Tamal Dey, Herbert Edelsbrunner, Michael Facello, Damrong Guoy, Gary Miller, Dafna Talmor, Noel Walkington, Xiang-Yang Li, and Alper Üngör

3D Unstructured Meshes

Surface and 2D Unstructured Meshes
[Figures courtesy of N. Amenta (UT Austin), NASA, and Ghattas (CMU)]

Numerical Methods
• Formulation (math + engineering): domain, boundary, and PDEs
• Mesh generation (geometric structures): point set; triangulation (ad hoc, octree, Delaunay)
• Approximation (numerical analysis): finite element, finite difference, finite volume
• Linear system Ax = b (algorithms and data structures): direct methods, multigrid, iterative methods

Outline
• Mesh generation in 2D
  – Mesh qualities
  – Meshing methods
  – Meshes and circle packings
• Mesh generation in 3D
  – Slivers
  – Numerical solution: the control volume method
  – Geometric solution: sliver removal by weighted Delaunay triangulations
  – Smoothed solution: sliver removal by perturbation

Badly Shaped Triangles
Aspect ratio: R/r

Meshing Methods
The goal of a meshing algorithm is to generate a well-shaped mesh that is as small as possible.
• Advancing front
• Quadtree and octree refinement
• Delaunay based
  – Delaunay refinement
  – Sphere packing
  – Weighted Delaunay triangulation
  – Smoothing by perturbation

Balanced Quadtree Refinement (Bern-Eppstein-Gilbert)
[Figure: quadtree mesh]

Delaunay Triangulations
Why Delaunay?
• Maximizes the smallest angle in 2D.
• Has efficient algorithms and data structures.
• Delaunay refinement: in 2D, it generates optimal-size, natural-looking meshes with angles of at least 20.7° (Jim Ruppert).

Delaunay Refinement (Jim Ruppert)
[Figure: 2D insertion and 1D insertion]

Delaunay Mesh

Local Feature Spacing f
The radius of the smallest sphere centered at a point that intersects or contains two non-incident input features; f: Ω → R.

Well-Shaped Meshes and f

f is 1-Lipschitz and optimal.

Sphere Packing

β-Packing of a Function f
• Packed: the spheres of radius f(p)/2 centered at the points p are disjoint.
• No large empty gap: the radius of the largest empty sphere passing through a point q is at most β·f(q).

The Packing Lemma (2D) (Miller-Talmor-Teng-Walkington)
• The Delaunay triangulation of a β-packing is a well-shaped mesh of optimal size.
• Every well-shaped mesh defines a β-packing.

Part I: Meshes to Packings

Part II: Packings to Meshes

3D Challenges
• Delaunay fails on aspect ratio
• Quadtree becomes octree (Mitchell-Vavasis)
• Meshes become much larger
• Research is more challenging!!!

Badly Shaped Tetrahedra: Slivers
Radius-edge ratio (Miller-Talmor-Teng-Walkington): R/L, where R is the circumradius and L is the shortest edge length.

The Packing Lemma (3D) (Miller-Talmor-Teng-Walkington)
• The Delaunay triangulation of a β-packing is a well-shaped mesh (using the radius-edge ratio) of optimal size.
• Every well-shaped (aspect-ratio or radius-edge ratio) mesh defines a β-packing.

Uniform Ball Packing
In any dimension, if P is a maximal packing of unit balls, then the Delaunay triangulation of P has radius-edge ratio at most 1: every edge length ||e|| is at least 2, and every circumradius is at most 2.

Constant Degree Lemma (3D) (Miller-Talmor-Teng-Walkington)
The vertex degree of a Delaunay triangulation with a constant radius-edge ratio is bounded by a constant.
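To make the radius-edge ratio concrete, the sketch below (illustrative code, not from the slides) computes R/L for a tetrahedron from its four corners and evaluates it on a near-flat "sliver" whose vertices sit almost on a common circle. The sliver passes the radius-edge test even though its volume is nearly zero, which is exactly why slivers survive refinement that only controls R/L.

```python
# Illustrative sketch: radius-edge ratio R/L of a tetrahedron (R = circumradius,
# L = shortest edge). A sliver can have a small R/L despite near-zero volume.
import itertools
import numpy as np

def radius_edge_ratio(points):
    p = np.asarray(points, dtype=float)       # 4 x 3 array of corners
    # Circumcenter c satisfies |c - p0|^2 = |c - pi|^2,
    # i.e. 2 (pi - p0) . c = |pi|^2 - |p0|^2 for i = 1, 2, 3.
    A = 2.0 * (p[1:] - p[0])
    b = np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2)
    c = np.linalg.solve(A, b)
    R = np.linalg.norm(c - p[0])
    L = min(np.linalg.norm(p[i] - p[j])
            for i, j in itertools.combinations(range(4), 2))
    volume = abs(np.linalg.det(p[1:] - p[0])) / 6.0
    return R / L, volume

# A well-shaped (regular) tetrahedron.
regular = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
# A sliver: four points nearly on a common circle, slightly off-plane.
sliver = [(1, 0, 0.01), (0, 1, -0.01), (-1, 0, 0.01), (0, -1, -0.01)]

print("regular tetrahedron: R/L = %.3f, volume = %.4f" % radius_edge_ratio(regular))
print("sliver:              R/L = %.3f, volume = %.4f" % radius_edge_ratio(sliver))
```

Both shapes have R/L below 1, but the sliver is nearly flat, with dihedral angles close to 0° and 180°; removing such elements is the subject of the weighted-Delaunay and perturbation techniques in the slides that follow.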
Delaunay Refinement in 3D (Shewchuk)

Slivers
Sliver: the geo-roach.

Coping with Slivers: the Control Volume Method (Miller-Talmor-Teng-Walkington)

Sliver Removal by Weighted Delaunay (Cheng-Dey-Edelsbrunner-Facello-Teng)

Weighted Points and Distance

Orthogonal Circles and Spheres

Weighted Bisectors

Weighted Delaunay

Weighted Delaunay and Convex Hull

Parametrizing Slivers (parameters D, L, Y)

Interval Lemma
[Figure: weight interval from 0 to N(p)/3]
• Constant degree: the union of all weighted Delaunay triangulations with Property [ρ] and Property [1/3] has constant vertex degree.

Pumping Lemma (Cheng-Dey-Edelsbrunner-Facello-Teng)

Sliver Removal by Flipping
• One by one, in an arbitrary ordering, fix the weight of each point.
• Implementation: flip and keep the best configuration.

Experiments (Damrong Guoy, UIUC)
• Initial tetrahedral mesh: 12,838 vertices, all on the boundary surface
• Tetrahedra with dihedral angle < 5 degrees: 13,471
• Slivers after Delaunay refinement: 881
• Slivers after sliver exudation: 12
• 15,503 slivers, then 1,183 slivers, then 142 slivers, with fewer elements and better distribution
• Triceratops: 5,636 slivers, then 563 slivers, then 18 slivers, with fewer elements and better distribution
• Heart: 4,532 slivers, then 173 slivers, then 1 sliver, with fewer elements and better distribution

Smoothing and Perturbation
• Perturb the mesh vertices.
• Re-compute the Delaunay triangulation.

Well-Shaped Delaunay Refinement (Li and Teng)
• Add a point near the circumcenter of each bad element.
• Avoids creating small slivers.
• Well-shaped meshes.
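The Smoothing and Perturbation slide above describes the simplest version of the geometric-perturbation idea: jiggle the vertices, then rebuild the Delaunay triangulation. The sketch below shows just that mechanical loop with scipy.spatial.Delaunay; the random point set, the noise scale sigma, and the volume-based shape proxy are illustrative assumptions. Choosing the perturbation so that slivers provably disappear is what the analysis in the talk addresses.

```python
# Illustrative sketch: perturb the vertices and recompute the Delaunay
# triangulation, then count nearly flat ("sliver-like") tetrahedra using a
# crude shape proxy: volume / (longest edge)^3.
import itertools
import numpy as np
from scipy.spatial import Delaunay

def flat_tet_count(points, tol=1e-3):
    tri = Delaunay(points)
    count = 0
    for simplex in tri.simplices:             # each simplex: 4 vertex indices
        p = points[simplex]
        vol = abs(np.linalg.det(p[1:] - p[0])) / 6.0
        lmax = max(np.linalg.norm(p[i] - p[j])
                   for i, j in itertools.combinations(range(4), 2))
        if vol / lmax ** 3 < tol:
            count += 1
    return count, len(tri.simplices)

rng = np.random.default_rng(2)
pts = rng.random((400, 3))                    # random points in the unit cube
sigma = 0.01                                  # small vertex perturbation

flat0, n0 = flat_tet_count(pts)
flat1, n1 = flat_tet_count(pts + sigma * rng.standard_normal(pts.shape))
print(f"before perturbation: {flat0} flat tets out of {n0}")
print(f"after  perturbation: {flat1} flat tets out of {n1}")
```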