Guy Kortsarz, Rutgers Camden

advertisement
A survey on approximating
graph spanners
Guy Kortsarz, Rutgers
Camden
Spanners: definition for unweighted
undirected graphs. Peleg and Upfal.
• Input: An undirected graph G(V,E)
and an integer k
• Required: a subgraph G’ so that for every u
and v  V:
Dist (u,v)/Dist (u,v)≤ k
G’
G
It is enough to take care of edges
• Say we are talking about the lengths and weights
that are 1. A.K.A Basic Spanners.
• Say that you have a k-spanner G(V,E’) of a
graph G(V,E)
• Let E-E’ be the omitted edges
• What happens when you add an omitted edge is
added to a k-spanner G(V,E’)?
• Then a cycle of size at most k+1 must be
closed.
A look on an omitted edge e
For example a 4-spanner
A look on an omitted edge e
For example a 4-spanner: the omitted edge creates
a cycle of size 4+1=5
It enough that all edges are “spanned”, to get distance
between any two vertices is at most k the original one.
An example of a 2-spanners
• The original graph:
A 2-spanner
• A few of the edge that were part of triangles
were removed.
Applications
• In geometry.
• Small routing tables: spanners have less edges. Thus
smaller tables. But not much larger distance
• Synchronizers: make non synchronized distributed
computation, synchronized.
• Parallel distributed and streaming algorithms.
• Distance oracles. Handle queries about distance
between two vertices quick by preprocessing.
• Property testing
• Minimum time broadcast.
Directed spanners, applications:
• Transitive Closure Spanners
See Grigorescu, Jung, Raskhodnikova,
Woodruff, SODA’09
• Property testing. See a survey by
Raskhodnikova.
• Access Control hierarchies. See a paper
by De Santis et al.
TC spanners
• Transitive closure spanners.
• Given G. Find H with same TC of G, and has
diameter k. See a paper by Bhattacharyya et al.
• Studied in access control (but were not given a
name). Very long time ago.
• In the last 20 years many known properties of
transitive closure were rediscovered.
TC spanners: remarkable many
applications
• Circuit construction. See Chandra et al et al
STOC 1983.
• Data structures. See Alon et al 1998.
• Property testing: Raskhodnikova, FOCS 2012.
• Lower bounds for stretch (a.k.a Lipshitzness).
See JR 2011. And much more.
2-spanners
• There is a difficulty. Unlike k≥3 thre are not
necessarily 2 spanners with few edges.
• The only 2-spanners of a complete bipartite
graph is the graph itself.
• Also true for TC spanners if all the directed
edges go from left to right.
• Like in 2-SAT and 2-Coloring and other
problems, 2-spanners is different than the rest.
For k at least 3 there are spanners
with few edges
• As we shall see: 3-spanners with O(n*sqrt{n})
edges always exist, and the same goes for 4spanners. And this is tight.
• The larger k is, the smaller is the upper bound
on the number of edges in the best spanner.
• Remarkable fact: maximum number of edges in
a graph with girth g not known.
• Maybe for 40 years the upper and lower bound
are quite far!
General lengths
• The definition is the same, just that the length is
weighted length. Also weights=lengths.
• Definition: The weighted distance of every pair
of vertices in G’ must be no larger than k times
their weighted distance in G.
• It is not enough that an omitted edge closes a
cycle of length at most k+1.
• However, it works if the the omitted edge is the
largest in the closed cycle.
An edge that can be removed
For example a 4-spanner
9
4
7
6
8
Removing the heaviest edge in a
cycle of length at most k+1 is OK.
For example a 4-spanner
4
7
6
8
A generalization of the Kruskal
algorithm:
• Sort the edges of the graph in increasing
weights.
c(e1) ≤ c(e2) ≤ c(e3) ≤……….. ≤ c(em)
• Go over all edges from small cost to large.
• For the next edge ei, if the edge does not close a
cycle of length at most k+1 with previously
added edges, add ei to G’ or else i=i+1
• This algorithm is due to I. Althofer, G. Das, D.
Dobkin, D. Joseph, and J. Soares.
The resulting graph is a k-spanner
• If an edge e is missing, then by construction,
this edge is the most heavy edge in a cycle of
length at most k+1.
• This is because we go over edges in non
decreasing costs.
• If we reach a cycle of size k+1, then it means
that previous edges were not removed.
• This implies that e is the largest edge in a cycle
of length at most k+1 and it is safe to remove it.
Girth k+2
• We observe that the resulting graph has girth at least
k+2
• The girth is the size of the minimum simple cycle.
• Observe that when we reach the largest edge e of a
cycles with at most k+1 edges it will be removed.
• Therefore, there are no k+1 size or smaller cycles.
• Graphs with large girth have “few’’ edges.
Example: graphs with girth 5 and 6
• We show that graphs with girth 5 and 6 have
O(n*sqrt{n}) edges.
• First remove all vertices of degree strictly
smaller than m/n.
• Here m is the numbers of edges and n is the
number of vertices.
• Since we have removed at most n vertices and
each vertex removes less than m/n edges it is
clear that the resulting graph is not empty.
Two layers BFS graph
• All the vertices seen below are distinct as
otherwise there is a cycle of length at most 4.
(m/n)-1
(m/n)-1
m/n
Number of edges
This implies that m/n(m/n-1)≤ n, or m2/n2-m/n ≤n
As m/n<n we get that m2/n2<2n or m2<2n3
Thus m=O(n sqrt{n})
A matching lower bound. A graph of girth 6 that has
Ω(n*sqrt{n}) edges.
• A projective plane for our needs is a bipartite graph
with n vetices on each side and degree Ө(sqrt{n})
thus contains Ө(n*sqrt{n}) edges.
• The main property: every pair of vertices in the
same side share at most 1 neighbor.
•
•
•
•
Girth 6
• There could not be a cycle of size 4:
A cycle of length 4 implies that two vertices on
the same side share two neighbors.
Contradiction
Girth 6
• There could not be a cycle of size 4:
Therefore girth 6
General bounds on the minimum
number of edges for a given girth
• It is known that there is always a 2k-1 spanner
with O(n1+1/k) edges.
• By the above we know that there is always a 3,4
spanners with O(n*sqrt{n}) edges. But better
than sqrt{n} approximation is known (later).
• For if we want a 3-spanner put k=2. This indeed
gives O(n* sqrt{n}) edges. The number
minimum number of edges for k=3 and k=4 is
the same as the bad example is a bipartite graph.
There are spanners with few edges
but what about the cost?
• An interesting result by Chandra et al.
• The same algorithm described before returns a
graph with small weight.
• It is roughly of cost n1/t times the cost of the
minimum spanning tree (on top of the fact that
there are just few edges),
• For log n it gives about O(1) times the MST
weight, and O(n) edges.
The story of the minimum cost
2-spanner problem.
• The edges have length 1. We show an approximation
for costs 1, even though works for arbitrary costs.
• In 1992, K, Peleg: the 2-spanner problem admits an
O(log n) ratio approximation in polynomial time
• In 1998: Kortsarz: unless P=NP there exists a constant
so that c*log n/k approximation is not possible for min
cost k-spanners. Ω(log n) for k=2. TIGHT.
• I know of one k for which the approximation and the
lower bound are equal . Unnatural k.
• See a paper by Dinitz and K from ICALP 2012.
The idea for the 2-spanner O(logn)
ratio approximation algorithm
• A covering problem. A universe U of covering items.
• A non decreasing function f: 2S---> R+ given indirectly
via an oracle.
• The cost is sub-additive. Namely,
•
c(A B)≤ c(A)+c(B).
• The cost of a set A is given by an oracle.
• We look for a minimum cost S so that f(S)=f(U).
The density theorem
S is initiated to .
At iteration i we add a set Si to S.
Let OPT be the optimum solution and opt its cost.
The density condition:
(f(S  Si)-f(S))/c(Si)≤ (f(OPT  S)-f(S))/opt
• Intuition: f(OPT  S)-f(S)=f(OPT)-f(S)=f(U)-f(S).
• Therefore the density of adding Si is no worse than the
density of adding OPT .
•
•
•
•
The density theorem
• Say that S is initiated to .
• Now say that at any iteration we meet the density
condition Si so that:
(f(Si+S)-f(S))/c(Si)≥(f(U)-f(S))/opt
• Then the final set S has cost bounded by
(ln(U)+1) opt
Proof
• Let Si be the LAST set added in iteration i.
• We get (f(S+ Sj)-f(S))/c(Sj)≥(f(U)-f(S))/opt
•
•
•
•
And therefore:
f(U)-f(S1)≤ (1-c(S1)/opt) f(U)
Similarly:
f(U)-f(S1+S2)≤
(1-c(S1)/opt) (1-c(S2)/opt ) f(U)
Proof continued
• f(U) -  j≤ i-1 f(Sj)≥ 1.
• We may assume that the cost of every set
added is at most
opt, therefore c(Si) ≤ opt
• Therefore it remains to bound:
 j≤ i-1 c(Si)
Let us concentrate on what happens before
Si is added.
By the previous claims
• 1 ≤ f(U)-f(S1+S2 +……Si-1)≤
Πj≤ i-1(1-c(Si)/opt)· f(U)
• 1/f(U) ≤ Πj≤ i-1(1-c(Sj)/opt)·
• Take ln and use ln(1+x) ≤ x:
-ln( f(U))≤  i≤ j -c(Sj) )/opt
 j≤ i-1 c(Sj) ≤ opt ln( f(U))
and so the ratio of ln( f(U))+1 follows.
One of many ways to get a good
density solution for subadditive costs
• Decompose OPT=i OPTi . OPTi OPT .
• The frequency property: each covering item
participates in at most c different OPTi .
• By the frequency property and sub-additivity
c(i OPTi )≤  c(OPTi) ≤ c*opt.
• We need that f(OPT)=f(i OPTi) ≤  f(OPTi)
• Density:  f(OPTi)/  c(OPTi) ≥f(OPT)/c*opt
By Averaging
• There exists an i so that
f(OPTi)/c(OPTi)≥f(OPT)/c*opt.
• The question is if we can find or approximate
the type of structure that OPTi is.
• The decomposition in the 2-spanner problem: a
collection of stars. For each vertex the vertex
and all its optimal edges are some OPTi
Applying the decomposition scheme
• We get c=2 as every edge participated in 2 stars.
• For a vertex v that is connected in the optimum
to QN(v) let f(v,Q) be the number of edges in
the graph induced by Q.
• We clearly get that f(OPT)=f(v f(v,Qv) )≤
 f(v,Qv) .
• Can we find the best star? Namely, the one who
minimizes f(v, Qv)/|Qv|?
The set Qv N(v) so
•
•
•
•
Algorithm: check every vertex v
For a vertex v look at the graph induced by N(v)
Find a desnsest subgraph S(v) in N(v)
Return the edges from v to S(v) that is the most
dense set over all v and iterate
S
The problem we need to solve is the
densest subgraph
• Let e(S), SN(v) be the number of edges in the graph
induced by S.
• This problem requires finding a subset of the vertices
with maximum density e(S)/|S| and can be solved
exactly via flow. This implies (by the density claim) an
O(log d) ratio follows for d the average degree.
• A faster algorithm, approximates the best density by 2
but get O(n) time and not flow time. Adds 2 to the
ratio (so negligible).
• Was done by K,Peleg in 1992. Also Charikar 1998.
• Very extensively cited for social networks. Almost
always attribute the result to Charikar.
A quick approximation for densest
subgraph
• Let  be e(OPT)/|OPT|
• We show that all vertices in the optimum have
degree at least .
• Otherwise, removing a vertex with degree less
than  increases the optimum.
• Therefore we can iteratively remove vertices of
degree strictly less than . We never remove a
vertex of OPT and so the remaining graph is
not empty.
The 2 approximation continued
• The degree of all vertices in the final graph is at least .
• Say that there are i vertices. The density is the sum of
the degrees, over 2, divided by i.
• In other words the density is at least
i*  /(2i)=  /2
Thus ratio 2. With a bit of data structure,
O(n) running time.
A reduction from Set-Cover to the
minimum size 2-spanner
E
h
C
SC
C
A
S
A reduction from Set-Cover
E
h
C
SC
C
A
S
The edges
between A and
S are spanned
via the edges of
h
A reduction from Set-Cover
E
h
C
SC
C
A
S
Few edges in the set
cover instance, so
add them. Will be
negligible in size.
A reduction from Set-Cover
E
h
C
SC
C
A
S
If there is an edge
in the optimum
between a vertex of
A and a vertex of E,
replace it by a path
of length 2 via S
A reduction from Set-Cover
E
h
C
SC
C
A
S
This means that
every vertex in A is
joined to a set-cover
because the edges
of A to E can not
span themselves
A reduction from Set-Cover
E
h
C
SC
C
A
S
If the size of the set
cover is opt the size
of the 2-spanner is
about n*opt and the
inapproximability is
implied.
How hard is it to approximate
spanners for k≥3?
• Strong hardness is exp(log 1- n)
• Weak hardness is (log n)/k
• K. 98. First hardness. Weak hardness for w(e)=l(e)=1.
Later similar methods employed for hardness for Buy at
Bulk.
• Elkin Peleg: Strong hardness for:
• 1) l(e)=w(e).
• 2) w=1 arbitrary length
• 3)Unit length, arbitrary weights
• 4) Basic but directed spanners.
A question posed in 1992
• Does the case w=l=1, k=3 admit strong hardness?
• In 2000 Elkin and Peleg showed conditional hardness.
If Min-Rep (see K. 98) with large super girth is strongly
hard, then approximating basic k-spanners for k≥ 3 is
strongly hard.
• Dinic, Kortsarz and Raz ICALP 2012 . Min-Rep with
large supergirth is indeed strongly hard, thus basic
spanners as well. Solved after 20 years.
A technique employed for directed
Steiner Forest
• Feldman, K. and Nutov.
The following picture:
s
LP flow at least
¼ between
every pair s,t
t
At most n2/5
vertices in every
layer
An edge with large xe
• Between every two layers there is at most n4/5
edges.
• Let xe be the largest capacity. Thus via every
edge at most xe flow unit pass from s to t.
• The total flow between s and t is at least ¼.
• Therefore n4/5* xe≥ ¼
• Therefore there is an edge of value about 1/n4/5
• Iterative rounding gives ratio n4/5
Approximating directed spanners
• Krauthgamer and Dinic employed (part of our)
techniques to get an n2/3 approximation for unit
length directed k-spanners. The ratio does not
depend on k. As happens many times, the
techniques were invented independently.
• Improvement: non iterative but randomized
rounding gets about n1/2 ratio. Very clever trick!
• Due to Berman, Bhattacharyya, Makarychev,
Raskhodnikova, Yaroslavtsev.
Other results
• For k=3 they get ratio n1/3 ratio for the directed
case. Note that even for undirected graphs n1/2
is trivial but n1/3 not.
• They also improve the result for Directed
Steiner forest about 3 years after it was
established. The new best ratio is n2/3.
• So far we dealt with weighted spanner.
• Consider spanners with w=l=1 but we want
minimum maximum degree.
The 20 years pattern
• In 1993 K., Peleg approximation of 1/4 for
minimum maximum degree 2-spanner
• A related remark: in 1996 Feige, K, Peleg: a
n1/3-1/60 for the dense k-subgraph problem.
• Improved by Bhsaskara, Chlamtac, Charikar,
Feige and Vijayaraghavan almost 20 years later
to about n1/4 ratio.
Improvements
• The minimization version of the dense k-subgraph
problem.
• Given a bound B on the edges, find a minimum size
collection V’ of vertices so that e(V’)≥B. Its ratio is
never worse than the dense k-subgraph ratio.
• Improvement 1: Chlamtac, Dinic and Krauthgamer.
Ratio 0.172 for minimum maximum degree 2-spanner
• Improvement 2: n0.172 ratio for the minimization
version of dense k-subgraph. In my opinion great
result.
Remarks on this algorithm
• The improvement for Dense k-subgraph can be
thought of: instead of counting walks, count
Caterpillars. Still, improves 20 years result.
• The new results of CDK uses the SheraliAdams hierarchy. Also a notion of faithful
rounding which is new and very interesting.
• In 1993 K, Peleg said that Bounded degree 2spanner related to Dense k-Subgraph.
• Showed to be indeed the case by CDK.
Open problems (in approximation as
otherwise too many)
• The case k=2 for unit length and arbitrary cost and
even on directed graphs is tight. Almost all other
problems large gap between ratio and inapproximability.
• New variants: Augmenting spanners, Client-server
spanners, All server Client-server spanners. All these
problems invented by Elkin and Peleg. Also Transitive
closure spanners. Tree spanners. Additive spanners.
• In addition a slightly new concept is Fault tolerant
spanners.
Download