A survey on approximating graph spanners Guy Kortsarz, Rutgers Camden Spanners: definition for unweighted undirected graphs. Peleg and Upfal. • Input: An undirected graph G(V,E) and an integer k • Required: a subgraph G’ so that for every u and v V: Dist (u,v)/Dist (u,v)≤ k G’ G It is enough to take care of edges • Say we are talking about the lengths and weights that are 1. A.K.A Basic Spanners. • Say that you have a k-spanner G(V,E’) of a graph G(V,E) • Let E-E’ be the omitted edges • What happens when you add an omitted edge is added to a k-spanner G(V,E’)? • Then a cycle of size at most k+1 must be closed. A look on an omitted edge e For example a 4-spanner A look on an omitted edge e For example a 4-spanner: the omitted edge creates a cycle of size 4+1=5 It enough that all edges are “spanned”, to get distance between any two vertices is at most k the original one. An example of a 2-spanners • The original graph: A 2-spanner • A few of the edge that were part of triangles were removed. Applications • In geometry. • Small routing tables: spanners have less edges. Thus smaller tables. But not much larger distance • Synchronizers: make non synchronized distributed computation, synchronized. • Parallel distributed and streaming algorithms. • Distance oracles. Handle queries about distance between two vertices quick by preprocessing. • Property testing • Minimum time broadcast. Directed spanners, applications: • Transitive Closure Spanners See Grigorescu, Jung, Raskhodnikova, Woodruff, SODA’09 • Property testing. See a survey by Raskhodnikova. • Access Control hierarchies. See a paper by De Santis et al. TC spanners • Transitive closure spanners. • Given G. Find H with same TC of G, and has diameter k. See a paper by Bhattacharyya et al. • Studied in access control (but were not given a name). Very long time ago. • In the last 20 years many known properties of transitive closure were rediscovered. TC spanners: remarkable many applications • Circuit construction. See Chandra et al et al STOC 1983. • Data structures. See Alon et al 1998. • Property testing: Raskhodnikova, FOCS 2012. • Lower bounds for stretch (a.k.a Lipshitzness). See JR 2011. And much more. 2-spanners • There is a difficulty. Unlike k≥3 thre are not necessarily 2 spanners with few edges. • The only 2-spanners of a complete bipartite graph is the graph itself. • Also true for TC spanners if all the directed edges go from left to right. • Like in 2-SAT and 2-Coloring and other problems, 2-spanners is different than the rest. For k at least 3 there are spanners with few edges • As we shall see: 3-spanners with O(n*sqrt{n}) edges always exist, and the same goes for 4spanners. And this is tight. • The larger k is, the smaller is the upper bound on the number of edges in the best spanner. • Remarkable fact: maximum number of edges in a graph with girth g not known. • Maybe for 40 years the upper and lower bound are quite far! General lengths • The definition is the same, just that the length is weighted length. Also weights=lengths. • Definition: The weighted distance of every pair of vertices in G’ must be no larger than k times their weighted distance in G. • It is not enough that an omitted edge closes a cycle of length at most k+1. • However, it works if the the omitted edge is the largest in the closed cycle. An edge that can be removed For example a 4-spanner 9 4 7 6 8 Removing the heaviest edge in a cycle of length at most k+1 is OK. For example a 4-spanner 4 7 6 8 A generalization of the Kruskal algorithm: • Sort the edges of the graph in increasing weights. c(e1) ≤ c(e2) ≤ c(e3) ≤……….. ≤ c(em) • Go over all edges from small cost to large. • For the next edge ei, if the edge does not close a cycle of length at most k+1 with previously added edges, add ei to G’ or else i=i+1 • This algorithm is due to I. Althofer, G. Das, D. Dobkin, D. Joseph, and J. Soares. The resulting graph is a k-spanner • If an edge e is missing, then by construction, this edge is the most heavy edge in a cycle of length at most k+1. • This is because we go over edges in non decreasing costs. • If we reach a cycle of size k+1, then it means that previous edges were not removed. • This implies that e is the largest edge in a cycle of length at most k+1 and it is safe to remove it. Girth k+2 • We observe that the resulting graph has girth at least k+2 • The girth is the size of the minimum simple cycle. • Observe that when we reach the largest edge e of a cycles with at most k+1 edges it will be removed. • Therefore, there are no k+1 size or smaller cycles. • Graphs with large girth have “few’’ edges. Example: graphs with girth 5 and 6 • We show that graphs with girth 5 and 6 have O(n*sqrt{n}) edges. • First remove all vertices of degree strictly smaller than m/n. • Here m is the numbers of edges and n is the number of vertices. • Since we have removed at most n vertices and each vertex removes less than m/n edges it is clear that the resulting graph is not empty. Two layers BFS graph • All the vertices seen below are distinct as otherwise there is a cycle of length at most 4. (m/n)-1 (m/n)-1 m/n Number of edges This implies that m/n(m/n-1)≤ n, or m2/n2-m/n ≤n As m/n<n we get that m2/n2<2n or m2<2n3 Thus m=O(n sqrt{n}) A matching lower bound. A graph of girth 6 that has Ω(n*sqrt{n}) edges. • A projective plane for our needs is a bipartite graph with n vetices on each side and degree Ө(sqrt{n}) thus contains Ө(n*sqrt{n}) edges. • The main property: every pair of vertices in the same side share at most 1 neighbor. • • • • Girth 6 • There could not be a cycle of size 4: A cycle of length 4 implies that two vertices on the same side share two neighbors. Contradiction Girth 6 • There could not be a cycle of size 4: Therefore girth 6 General bounds on the minimum number of edges for a given girth • It is known that there is always a 2k-1 spanner with O(n1+1/k) edges. • By the above we know that there is always a 3,4 spanners with O(n*sqrt{n}) edges. But better than sqrt{n} approximation is known (later). • For if we want a 3-spanner put k=2. This indeed gives O(n* sqrt{n}) edges. The number minimum number of edges for k=3 and k=4 is the same as the bad example is a bipartite graph. There are spanners with few edges but what about the cost? • An interesting result by Chandra et al. • The same algorithm described before returns a graph with small weight. • It is roughly of cost n1/t times the cost of the minimum spanning tree (on top of the fact that there are just few edges), • For log n it gives about O(1) times the MST weight, and O(n) edges. The story of the minimum cost 2-spanner problem. • The edges have length 1. We show an approximation for costs 1, even though works for arbitrary costs. • In 1992, K, Peleg: the 2-spanner problem admits an O(log n) ratio approximation in polynomial time • In 1998: Kortsarz: unless P=NP there exists a constant so that c*log n/k approximation is not possible for min cost k-spanners. Ω(log n) for k=2. TIGHT. • I know of one k for which the approximation and the lower bound are equal . Unnatural k. • See a paper by Dinitz and K from ICALP 2012. The idea for the 2-spanner O(logn) ratio approximation algorithm • A covering problem. A universe U of covering items. • A non decreasing function f: 2S---> R+ given indirectly via an oracle. • The cost is sub-additive. Namely, • c(A B)≤ c(A)+c(B). • The cost of a set A is given by an oracle. • We look for a minimum cost S so that f(S)=f(U). The density theorem S is initiated to . At iteration i we add a set Si to S. Let OPT be the optimum solution and opt its cost. The density condition: (f(S Si)-f(S))/c(Si)≤ (f(OPT S)-f(S))/opt • Intuition: f(OPT S)-f(S)=f(OPT)-f(S)=f(U)-f(S). • Therefore the density of adding Si is no worse than the density of adding OPT . • • • • The density theorem • Say that S is initiated to . • Now say that at any iteration we meet the density condition Si so that: (f(Si+S)-f(S))/c(Si)≥(f(U)-f(S))/opt • Then the final set S has cost bounded by (ln(U)+1) opt Proof • Let Si be the LAST set added in iteration i. • We get (f(S+ Sj)-f(S))/c(Sj)≥(f(U)-f(S))/opt • • • • And therefore: f(U)-f(S1)≤ (1-c(S1)/opt) f(U) Similarly: f(U)-f(S1+S2)≤ (1-c(S1)/opt) (1-c(S2)/opt ) f(U) Proof continued • f(U) - j≤ i-1 f(Sj)≥ 1. • We may assume that the cost of every set added is at most opt, therefore c(Si) ≤ opt • Therefore it remains to bound: j≤ i-1 c(Si) Let us concentrate on what happens before Si is added. By the previous claims • 1 ≤ f(U)-f(S1+S2 +……Si-1)≤ Πj≤ i-1(1-c(Si)/opt)· f(U) • 1/f(U) ≤ Πj≤ i-1(1-c(Sj)/opt)· • Take ln and use ln(1+x) ≤ x: -ln( f(U))≤ i≤ j -c(Sj) )/opt j≤ i-1 c(Sj) ≤ opt ln( f(U)) and so the ratio of ln( f(U))+1 follows. One of many ways to get a good density solution for subadditive costs • Decompose OPT=i OPTi . OPTi OPT . • The frequency property: each covering item participates in at most c different OPTi . • By the frequency property and sub-additivity c(i OPTi )≤ c(OPTi) ≤ c*opt. • We need that f(OPT)=f(i OPTi) ≤ f(OPTi) • Density: f(OPTi)/ c(OPTi) ≥f(OPT)/c*opt By Averaging • There exists an i so that f(OPTi)/c(OPTi)≥f(OPT)/c*opt. • The question is if we can find or approximate the type of structure that OPTi is. • The decomposition in the 2-spanner problem: a collection of stars. For each vertex the vertex and all its optimal edges are some OPTi Applying the decomposition scheme • We get c=2 as every edge participated in 2 stars. • For a vertex v that is connected in the optimum to QN(v) let f(v,Q) be the number of edges in the graph induced by Q. • We clearly get that f(OPT)=f(v f(v,Qv) )≤ f(v,Qv) . • Can we find the best star? Namely, the one who minimizes f(v, Qv)/|Qv|? The set Qv N(v) so • • • • Algorithm: check every vertex v For a vertex v look at the graph induced by N(v) Find a desnsest subgraph S(v) in N(v) Return the edges from v to S(v) that is the most dense set over all v and iterate S The problem we need to solve is the densest subgraph • Let e(S), SN(v) be the number of edges in the graph induced by S. • This problem requires finding a subset of the vertices with maximum density e(S)/|S| and can be solved exactly via flow. This implies (by the density claim) an O(log d) ratio follows for d the average degree. • A faster algorithm, approximates the best density by 2 but get O(n) time and not flow time. Adds 2 to the ratio (so negligible). • Was done by K,Peleg in 1992. Also Charikar 1998. • Very extensively cited for social networks. Almost always attribute the result to Charikar. A quick approximation for densest subgraph • Let be e(OPT)/|OPT| • We show that all vertices in the optimum have degree at least . • Otherwise, removing a vertex with degree less than increases the optimum. • Therefore we can iteratively remove vertices of degree strictly less than . We never remove a vertex of OPT and so the remaining graph is not empty. The 2 approximation continued • The degree of all vertices in the final graph is at least . • Say that there are i vertices. The density is the sum of the degrees, over 2, divided by i. • In other words the density is at least i* /(2i)= /2 Thus ratio 2. With a bit of data structure, O(n) running time. A reduction from Set-Cover to the minimum size 2-spanner E h C SC C A S A reduction from Set-Cover E h C SC C A S The edges between A and S are spanned via the edges of h A reduction from Set-Cover E h C SC C A S Few edges in the set cover instance, so add them. Will be negligible in size. A reduction from Set-Cover E h C SC C A S If there is an edge in the optimum between a vertex of A and a vertex of E, replace it by a path of length 2 via S A reduction from Set-Cover E h C SC C A S This means that every vertex in A is joined to a set-cover because the edges of A to E can not span themselves A reduction from Set-Cover E h C SC C A S If the size of the set cover is opt the size of the 2-spanner is about n*opt and the inapproximability is implied. How hard is it to approximate spanners for k≥3? • Strong hardness is exp(log 1- n) • Weak hardness is (log n)/k • K. 98. First hardness. Weak hardness for w(e)=l(e)=1. Later similar methods employed for hardness for Buy at Bulk. • Elkin Peleg: Strong hardness for: • 1) l(e)=w(e). • 2) w=1 arbitrary length • 3)Unit length, arbitrary weights • 4) Basic but directed spanners. A question posed in 1992 • Does the case w=l=1, k=3 admit strong hardness? • In 2000 Elkin and Peleg showed conditional hardness. If Min-Rep (see K. 98) with large super girth is strongly hard, then approximating basic k-spanners for k≥ 3 is strongly hard. • Dinic, Kortsarz and Raz ICALP 2012 . Min-Rep with large supergirth is indeed strongly hard, thus basic spanners as well. Solved after 20 years. A technique employed for directed Steiner Forest • Feldman, K. and Nutov. The following picture: s LP flow at least ¼ between every pair s,t t At most n2/5 vertices in every layer An edge with large xe • Between every two layers there is at most n4/5 edges. • Let xe be the largest capacity. Thus via every edge at most xe flow unit pass from s to t. • The total flow between s and t is at least ¼. • Therefore n4/5* xe≥ ¼ • Therefore there is an edge of value about 1/n4/5 • Iterative rounding gives ratio n4/5 Approximating directed spanners • Krauthgamer and Dinic employed (part of our) techniques to get an n2/3 approximation for unit length directed k-spanners. The ratio does not depend on k. As happens many times, the techniques were invented independently. • Improvement: non iterative but randomized rounding gets about n1/2 ratio. Very clever trick! • Due to Berman, Bhattacharyya, Makarychev, Raskhodnikova, Yaroslavtsev. Other results • For k=3 they get ratio n1/3 ratio for the directed case. Note that even for undirected graphs n1/2 is trivial but n1/3 not. • They also improve the result for Directed Steiner forest about 3 years after it was established. The new best ratio is n2/3. • So far we dealt with weighted spanner. • Consider spanners with w=l=1 but we want minimum maximum degree. The 20 years pattern • In 1993 K., Peleg approximation of 1/4 for minimum maximum degree 2-spanner • A related remark: in 1996 Feige, K, Peleg: a n1/3-1/60 for the dense k-subgraph problem. • Improved by Bhsaskara, Chlamtac, Charikar, Feige and Vijayaraghavan almost 20 years later to about n1/4 ratio. Improvements • The minimization version of the dense k-subgraph problem. • Given a bound B on the edges, find a minimum size collection V’ of vertices so that e(V’)≥B. Its ratio is never worse than the dense k-subgraph ratio. • Improvement 1: Chlamtac, Dinic and Krauthgamer. Ratio 0.172 for minimum maximum degree 2-spanner • Improvement 2: n0.172 ratio for the minimization version of dense k-subgraph. In my opinion great result. Remarks on this algorithm • The improvement for Dense k-subgraph can be thought of: instead of counting walks, count Caterpillars. Still, improves 20 years result. • The new results of CDK uses the SheraliAdams hierarchy. Also a notion of faithful rounding which is new and very interesting. • In 1993 K, Peleg said that Bounded degree 2spanner related to Dense k-Subgraph. • Showed to be indeed the case by CDK. Open problems (in approximation as otherwise too many) • The case k=2 for unit length and arbitrary cost and even on directed graphs is tight. Almost all other problems large gap between ratio and inapproximability. • New variants: Augmenting spanners, Client-server spanners, All server Client-server spanners. All these problems invented by Elkin and Peleg. Also Transitive closure spanners. Tree spanners. Additive spanners. • In addition a slightly new concept is Fault tolerant spanners.