Color-Coding

Presented by Yuval Shimron
Course 236801
1.12.2010

Find solutions to sub-cases of the Subgraph
Isomorphism Problem in polynomial time.

Find more efficient solutions to some sub-cases that already had polynomial-time
solutions.

Find simple paths and cycles of specific
length k.
 This was the initial goal of the authors…

(1) For a fixed k, if G=(V,E) contains a cycle of
length k, it can be found in $O(V^{\omega})$ expected
time or $O(V^{\omega} \log V)$ worst-case time ($\omega < 2.376$
is the exponent of matrix multiplication).

(2) For a fixed k, if a planar graph G=(V,E)
contains a cycle of length k, it can be found in
O(V) expected time or O(V log V) worst-case
time (this also applies to any non-trivial minor-closed family of graphs).

(3) If G=(V,E) contains a subgraph isomorphic
to a bounded tree-width graph $H=(V_H,E_H)$
where $|V_H| = O(\log V)$, then such a subgraph
can be found in polynomial time.
 Was not previously known even if H were just a
simple path of length O(logV).
 Shows that the LOG PATH problem is in NC (and
not just in P).

Randomized method
 Vertices are randomly colored using $k = |V_H|$
colors.
 If $|V_H| = O(\log V)$, then with a small (but only
polynomially small) probability all the vertices of
the subgraph isomorphic to H are colored with
distinct colors.
 Makes the task of finding this ‘color-coded’
subgraph much easier.
▪ Be patient…

De-randomized algorithm?
 Needs a family of colorings of G such that every
subset of k vertices of G is assigned distinct
colors by at least one of these colorings.
▪ In other words, a k-perfect family of hash functions from
{1, 2, …, |V|} to {1, 2, …, k}.
 Only “small” loss of efficiency.

If the directed graph is acyclic, the task is simple:
 A simple algorithm finds a directed path of length k in O(E) time.

So eliminate cycles:
 Choose a random permutation $\pi : V \to \{1, \ldots, |V|\}$.
 Build $G' = (V, E')$ by using $\pi$:
▪ Direct the edges: for every $(u,v) \in E$,
$\pi(u) < \pi(v) \Rightarrow (u,v) \in E'$
$\pi(v) < \pi(u) \Rightarrow (v,u) \in E'$




Every directed path of length k in G’ is a
simple path of length k in G.
Every simple path of length k in G has a
2/(k+1)! chance of becoming a directed path
in G’.
So if no path of length k was found in G’
repeat the process.
The expected number of times this process is
repeated is at most (k+1)!/2.
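
The random-orientation procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `find_path_random_orientation`, the edge-list input (vertices numbered 0..n-1), and the `max_rounds` cap are hypothetical choices made for this example. For brevity it sorts the vertices by rank, so one round costs O(V log V + E) rather than O(E).

```python
import random

def find_path_random_orientation(n, edges, k, max_rounds=10_000, rng=random):
    """Search for a simple path with k edges in an undirected graph.

    Vertices are 0..n-1 and `edges` is a list of pairs (assumed input format).
    Returns the path as a list of k+1 vertices, or None if no round succeeds.
    """
    for _ in range(max_rounds):
        rank = list(range(n))
        rng.shuffle(rank)                    # rank[v] plays the role of pi(v)
        # Orient every edge from the lower-ranked endpoint to the higher one;
        # the resulting directed graph G' is acyclic.
        out = [[] for _ in range(n)]
        for u, v in edges:
            if rank[u] < rank[v]:
                out[u].append(v)
            else:
                out[v].append(u)
        # Longest directed path starting at each vertex. Vertices are handled
        # in decreasing rank, so all out-neighbours are already finished.
        length = [0] * n
        succ = [None] * n
        for v in sorted(range(n), key=lambda x: rank[x], reverse=True):
            for w in out[v]:
                if length[w] + 1 > length[v]:
                    length[v] = length[w] + 1
                    succ[v] = w
        for v in range(n):
            if length[v] >= k:
                path = [v]
                while len(path) < k + 1:
                    path.append(succ[path[-1]])
                return path                  # directed in G', hence simple in G
    return None
```

A possible call: `find_path_random_orientation(4, [(0, 1), (1, 2), (2, 3)], 3)` looks for a path with 3 edges in a 4-vertex path graph.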

So we get O(E(k+1)!) time complexity.
 This is also the result for the directed case.
▪ Delete edges that don’t agree with $\pi$.

Use the following fact + DFS to reduce it to
O(V(k+1)!) for the undirected case:
 Every graph with V vertices and at least k|V| edges
contains a path of length k.


So first run a DFS on the original graph.
Apply the above algorithm only if no vertex of
depth k was found (answered in O(k|V|) time).




Choose random acyclic orientation G’.
Raise the adjacency matrix of G’ to the power
of k-1 using O(logk) matrix multiplications.
This gives all the pairs of vertices connected
by a path of length k-1.
Check whether any of these pairs is also adjacent in G.
 If so, a cycle of length k has been found.
 If not, repeat the process.
▪ The expected number of repetitions is at most k!/2.

Complexity: $O(k!\,(\log k)\,V^{\omega}) = O(V^{\omega})$ for a fixed k.


To find a path of length k-1 in a graph G we
can choose a random coloring of the vertices
of G in k colors.
Every simple path of length k-1 in G has a
chance of $k!/k^k > e^{-k}$ to become colorful.
 Colorful: each vertex on the path is colored with a different color.

We can find it using lemma 3.1.
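
A short justification of the bound on the probability of becoming colorful, added here as a reminder. It assumes nothing beyond the Taylor series of $e^k$: the k vertices of the path can be colored in $k^k$ ways, of which $k!$ are colorful, and

```latex
e^{k} = \sum_{j \ge 0} \frac{k^{j}}{j!} > \frac{k^{k}}{k!}
\quad\Longrightarrow\quad
\frac{k!}{k^{k}} > e^{-k}.
```

So fewer than $e^k$ random colorings are needed, in expectation, before a fixed path becomes colorful.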

Use Color-Coding to find a colorful path of
length k-1 in $2^{O(k)} \cdot E$ worst-case time (if one exists).
 Actually it finds a path of length k that starts at a
specific vertex s.
▪ but we can always add some vertex s to G (with a new
color).


The algorithm uses a given (random) coloring
$c : V \to \{1, 2, \ldots, k\}$
The algorithm uses a dynamic programming
approach.

Suppose we’ve found, for each vertex v, the
sets of colors on colorful paths of length i that
connect s and v.
 A collection of at most $\binom{k}{i}$ color sets.
For that we only need to record the color sets
appearing on i-length paths.
 And not the paths themselves…

We inspect every color set C of that
collection.



We also inspect every edge (v,u) in E.
If $c(u) \notin C$ we add $C \cup \{c(u)\}$ to the collection
of u that corresponds to colorful paths of
length i+1.
The graph G contains a colorful path of
length k-1 starting at s iff the final collection,
the one corresponding to paths of length k-1,
is non-empty for at least one vertex.
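
The dynamic program described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function name `colorful_path_exists`, the 0-based vertex and color numbering, and the use of `frozenset` for color sets are assumptions made for this example. It records only color sets, never the paths themselves.

```python
def colorful_path_exists(n, edges, color, k, s, directed=False):
    """Return True iff a colorful path on k vertices (k-1 edges) starts at s.

    `color[v]` is the color of vertex v, assumed to lie in range(k).
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)

    # collection[v] holds the color sets of colorful paths from s to v
    # with i edges; initially i = 0 and only s is reachable.
    collection = [set() for _ in range(n)]
    collection[s].add(frozenset([color[s]]))

    for _ in range(k - 1):                      # extend paths by one edge
        longer = [set() for _ in range(n)]
        for v in range(n):
            for C in collection[v]:             # inspect every color set C
                for u in adj[v]:                # inspect every edge (v, u)
                    if color[u] not in C:
                        longer[u].add(C | {color[u]})
        collection = longer

    # A colorful path exists iff some final collection is non-empty.
    return any(collection[v] for v in range(n))
```

Since a color set can never contain the same color twice, a vertex cannot be revisited, which is why walks are automatically simple paths here.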

The number of operations is at most
$O\left(\sum_{i=0}^{k} i \binom{k}{i} \cdot E\right) = O(k\,2^k \cdot E)$.


The proof holds for both directed and
undirected graphs.
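
The closed form in the operation count follows from a standard binomial identity. The derivation is spelled out here for convenience, using $i\binom{k}{i} = k\binom{k-1}{i-1}$:

```latex
\sum_{i=0}^{k} i\binom{k}{i}
  = \sum_{i=1}^{k} k\binom{k-1}{i-1}
  = k\,2^{k-1},
\qquad\text{so}\qquad
O\Bigl(\sum_{i=0}^{k} i\binom{k}{i}\cdot E\Bigr) = O\bigl(k\,2^{k}\cdot E\bigr).
```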

We can find all pairs of vertices connected by
colorful paths of length k-1 in $2^{O(k)} \cdot VE$ or $2^{O(k)} \cdot V^{\omega}$
worst-case time.

To get $2^{O(k)} \cdot VE$ time, simply run the algorithm of lemma 3.1
|V| times, once from each vertex of G=(V,E).

Use a recursive approach to get $2^{O(k)} \cdot V^{\omega}$ time.

Enumerate all partitions of the color set {1,2,…,k} into two
subsets C1,C2 of size k/2 each. There are
$\binom{k}{k/2} < 2^k$ such partitions.
For each partition, let V1,V2 be the vertices colored by colors
from C1,C2, and split G into the two subgraphs induced by V1 and V2.
Recursively find the pairs of vertices connected
by colorful paths of length k/2-1 in each of them.
Store the results in Boolean matrices A1,A2.


Define B to be a Boolean matrix of adjacency
relations between V1,V2 vertices.
Compute A1BA2.
 You get all pairs connected by paths of length k-1
▪ First k/2 vertices are colored by colors from C1
▪ Last k/2 vertices are colored by colors from C2


By OR-ing all the matrices obtained from all
the partitions you get your answer.
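
The combination step for one partition can be sketched in Python. The helper names `bool_matmul`, `combine_halves` and `or_matrices` are hypothetical, and the sketch uses a naive cubic Boolean product purely for readability; the $V^{\omega}$ bound requires fast matrix multiplication instead. All matrices are assumed to be indexed by the full vertex set, with rows and columns outside V1 or V2 left all-False.

```python
def bool_matmul(A, B):
    """Naive Boolean matrix product: (A·B)[i][j] = OR_t (A[i][t] AND B[t][j])."""
    inner, cols = len(B), len(B[0])
    return [[any(row[t] and B[t][j] for t in range(inner)) for j in range(cols)]
            for row in A]

def combine_halves(A1, B, A2):
    """Pairs joined by a C1-colored half, one V1-V2 edge, then a C2-colored half."""
    return bool_matmul(bool_matmul(A1, B), A2)

def or_matrices(matrices):
    """Entry-wise OR of the results obtained from all the partitions."""
    result = [row[:] for row in matrices[0]]
    for M in matrices[1:]:
        for i, row in enumerate(M):
            for j, entry in enumerate(row):
                result[i][j] = result[i][j] or entry
    return result
```

For example, on the path 0-1-2-3 with V1 = {0, 1} and V2 = {2, 3}, the product A1·B·A2 marks exactly the pair (0, 3).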
Time complexity?

A simple path of length k-1 in a directed /
undirected graph G=(V,E) can be found (if one exists)
in:
 $2^{O(k)} \cdot V$ expected time for an undirected graph.
▪ DFS…
 $2^{O(k)} \cdot E$ expected time for a directed graph.
A simple cycle of size k in a directed / undirected
graph G=(V,E) can be found (if one exists) in either
$2^{O(k)} \cdot VE$ or $2^{O(k)} \cdot V^{\omega}$ expected time.
 Simply use lemma 3.2.

The previous randomized algorithms can be
derandomized with a loss of efficiency.
 Extra logV factor to the complexity.

What we need is a family of k-perfect hash
functions from {1, 2, …, |V|} to {1, 2, …, k}.
 If we use these hash functions we know that for
every subset of k vertices there exists a coloring
that gives each vertex in it, a distinct color.
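
To make the definition concrete, here is a brute-force Python check of the k-perfect property. It is only a sketch for tiny inputs (it enumerates all k-subsets); the name `is_k_perfect` and the representation of each hash function as a list of n values in range(k) are assumptions made for this example.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """True iff every k-subset of range(n) gets k distinct values
    from at least one function in `family`.

    Each function h is given as a list with h[x] in range(k).
    """
    return all(
        any(len({h[x] for x in subset}) == k for h in family)
        for subset in combinations(range(n), k)
    )
```

For n = 3 and k = 2, the single coloring `[0, 1, 0]` fails on the subset {0, 2}, while adding `[0, 0, 1]` makes the family 2-perfect.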

There exists an algorithm that constructs a
k-perfect family of hash functions from
{1, 2, ..., n} to {1, 2, ..., k}.
 But its size is $2^{O(k)} \log^2 n$.

There also exists an algorithm that constructs
a k-perfect family of hash functions from
{1, 2, ..., n} to {1, 2, ..., $k^2$} whose size is
$k^{O(1)} \log n$.

So we use 2-level hashing:
 Mapping from {1, 2, ..., n} to {1, 2, ..., $k^2$} by using
the second algorithm.
 Mapping from {1, 2, ..., $k^2$} to {1, 2, ..., k} by using
the first algorithm.

And we get just the promised extra O(log V)
factor in the running time.
 The value of each hash function at each element can be evaluated in
O(1) time.

Use k-perfect hash functions as colorings.
 Choose a random coloring (and not a permutation)
$c : V \to \{1, 2, \ldots, k\}$.
Remove edges (u,v) s.t. $c(v) \neq c(u) + 1$.
Direct remaining edges (u,v) from u to v.
Again G’, the obtained graph, is acyclic.
A simple path of length k in G has a probability of
$2k^{-k}$ to become a directed path in G’.
This is different from the Color-Coding method.


An undirected graph G is d-degenerate if
every subgraph of it has a vertex of degree at
most d.
Smallest such d is called the degeneracy or
the max-min degree of G.
 Maximum over the minimum degrees of all
sub-graphs of G.

If G is d-degenerate then clearly $|E| \le d|V|$.


Let G be a connected undirected graph.
An acyclic orientation of G=(V,E) such that for
every v we have $d_{out}(v) \le d(G)$ can be found in
O(E) time.
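
One standard way to obtain such an orientation is to repeatedly remove a vertex of minimum degree and direct its remaining edges away from it. The Python sketch below illustrates this under stated assumptions: the name `degeneracy_orientation` is hypothetical, the graph is assumed simple, and a binary heap is used for brevity, giving O(E log V) rather than the O(E) achievable with bucket queues.

```python
import heapq

def degeneracy_orientation(n, edges):
    """Return (d, out): the degeneracy d and out-neighbour lists of an
    acyclic orientation in which every out-degree is at most d."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    deg = [len(a) for a in adj]
    heap = [(deg[v], v) for v in range(n)]
    heapq.heapify(heap)
    removed = [False] * n
    out = [[] for _ in range(n)]
    d = 0
    while heap:
        dv, v = heapq.heappop(heap)
        if removed[v] or dv != deg[v]:
            continue                      # stale heap entry, skip it
        removed[v] = True
        d = max(d, dv)                    # largest minimum degree seen so far
        for w in adj[v]:
            if not removed[w]:
                out[v].append(w)          # edge points to a later-removed vertex
                deg[w] -= 1
                heapq.heappush(heap, (deg[w], w))
    return d, out
```

Every edge points from an earlier-removed vertex to a later-removed one, so the orientation is acyclic, and a vertex's out-degree equals its degree at removal time.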



A graph H is a minor of undirected graph G if
it can be obtained from G by the removal and
the contraction of edges.
A family C of graphs is minor-closed if a minor
of any graph in it is also a member of the
family.
If such C is non-trivial then all graphs in C are
of bounded degeneracy.
 $\exists d_C$ s.t. $\forall G \in C : d(G) \le d_C$.

Consider the family of planar graphs $C_{planar}$.
 It is minor-closed.
 Each planar graph has a vertex whose degree is at
most 5.
 $\Rightarrow d_{C_{planar}} \le 5$.


Let C be a non-trivial minor-closed family of
graphs and let $k \ge 3$ be a fixed parameter.
There exists a randomized algorithm that, given
an undirected graph in C, finds a $C_k$ (a cycle of size
k) in it, if one exists, in O(V) expected time.
Proof:
 Let G = (V,E) be a graph in C that contains a $C_k$.
 Choose a random coloring $c : V \to \{1, 2, 3, \ldots, k\}$.
 A $C_k$ is considered well-colored if its vertices are colored
consecutively by the colors 1, 2, …, k.


The $C_k$ in G has a chance of $2/k^{k-1}$ to be
well-colored.
Can we find it efficiently?
 Yes, but with some probability…

Assume that the degeneracy of C is d = O(1).
We describe a randomized algorithm that,
given a coloring c, finds $C_k$ with probability of
$1/(2d)^k$.
Combining both gives a probability of at least
$2/(k^{k-1}(2d)^k)$, so the expected time is $O((2dk)^k \cdot V)$.

We can assume all edges of G connect vertices
that are colored by consecutive colors (mod k).
 Edges that don’t may be safely removed.

We orient the graph so that the out-degree of
all the vertices is at most d.
 This takes only O(V) time.

The algorithm tries to find the edge that
connects the vertices in $C_k$ colored by k and k-1:
$v_k, v_{k-1}$. It “flips coins” to guess its orientation
and its index among the at most d edges leaving its tail – 2d possible combinations.

For each guess of such an index i:
 If the orientation is from $v_{k-1}$ to $v_k$:
▪ All edges that leave a vertex colored k-1 toward a vertex colored k, but whose index is not i, are
removed.
 Otherwise do the opposite.
▪ (for edges that leave vertices colored k)

The result is a graph G’ that contains a $C_k$ with a
probability of at least 1/(2d).
 Its edges between the vertices colored k-1 and k form a forest of rooted stars.



Each such star is contracted into a single
vertex and assigned the color k-1.
The obtained graph is denoted by G’’.
G’’ contains a well-colored $C_{k-1}$ iff G’ contains
a well-colored $C_k$.
 Since each edge of G’, and therefore of G’’, connects
consecutively colored vertices.

G’’ is also a graph in the minor-closed family C.
 So we recursively look for a $C_{k-1}$.

It will take us O((k-1)V) expected time.
 And yields a $C_{k-1}$ with a probability of at least
$1/(2d)^{k-1}$.

Obviously it’s easy to reconstruct $C_k$ from $C_{k-1}$.
We can stop the recursion when k=3 and use
an existing algorithm for finding triangles in a
general graph in $O(E \cdot d(G))$ time.
 Any triangle in a three-colored graph is well-colored.
 $O(E \cdot d(G))$ is O(V) in our case.


There exists a deterministic algorithm that, given a
graph in C, finds a $C_k$, if one exists, in O(V log V) worst-case time.
Proof:
 Instead of using a random coloring we exhaust a list of
$k^{O(k)} \log V$ colorings that has this property:
▪ Every sequence of k vertices is consecutively colored by 1,2,…,k
by at least one coloring of the list.
 Instead of guessing the direction and index of each
edge in the $C_k$ we exhaust, for each coloring, all the $(2d)^k$
possible choices.
▪ If G contains a $C_k$ then at least one $C_k$ will be found this way.

A graph G1 is said to be isomorphic to a graph
G2 if there exists a bijection:
f : V(G1) -> V(G2)
such that any two vertices u and v of G1 are
adjacent in G1 iff ƒ(u) and ƒ(v) are adjacent in
G2.
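
The definition can be turned directly into a brute-force test that tries every bijection. The Python sketch below is meant only to illustrate the definition on tiny undirected graphs; the name `are_isomorphic` and the edge-list input (vertices 0..n-1, no self-loops) are assumptions made for this example, and the running time is factorial in the number of vertices.

```python
from itertools import permutations

def are_isomorphic(n1, edges1, n2, edges2):
    """Brute-force isomorphism test for small simple undirected graphs."""
    E1 = {frozenset(e) for e in edges1}
    E2 = {frozenset(e) for e in edges2}
    if n1 != n2 or len(E1) != len(E2):
        return False
    for f in permutations(range(n1)):        # f[u] is the image of vertex u
        # With equally many edges, mapping every edge of G1 onto an edge
        # of G2 already forces non-edges to map onto non-edges.
        if all(frozenset(f[u] for u in e) in E2 for e in E1):
            return True
    return False
```

For instance, the path 0-1-2 is isomorphic to the path 1-0-2 (same shape, different labels) but not to a triangle.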


Let F be a directed/undirected forest on k
vertices. Let G be a directed/undirected
graph.
A sub-graph isomorphic to F can be found, if one
exists, in:
 $2^{O(k)} \cdot E$ expected time in the directed case.
 $2^{O(k)} \cdot V$ expected time in the undirected case.

Proof:
 Start as usual, by choosing a random coloring
$c : V \to \{1, \ldots, k\}$ of G.
 With a probability of at least $e^{-k}$ the copy of F in G
becomes colorful.
▪ Meaning, each vertex is assigned a different color.
 Suppose that F is composed of l (directed) trees
$T_1, T_2, \ldots, T_l$ with $k_1, k_2, \ldots, k_l$ vertices each.
 Let $F_i$ be the (directed) forest composed of
$T_1, T_2, \ldots, T_i$.

For each $1 \le i \le l$ we find the color sets that
appear on colorful copies of $T_i$ in G.
 Note that copies of $T_i, T_j$ with disjoint color sets
are necessarily disjoint.

Then, in $2^{O(k)}$ time we find the color sets that
appear on colorful copies of $F_i$ for $1 \le i \le l$.
If the collection corresponding to $F = F_l$ is not
empty then G contains a colorful copy of F.
 How do we find it…?

How do we find the color sets that appear on
colorful copies of $T_i$ in G?
 Let r be an arbitrary vertex in $T_i = T$.
 For each vertex v in G we find the color sets that
appear on colorful copies of T in which v plays the role of r.
 If T is a single vertex then it’s easily done…
 Otherwise let e=(r,r’) be a (directed) edge in T.
▪ Removing e breaks T into two (directed) sub-trees T’, T’’, containing r and r’ respectively.



We recursively find, for each vertex v in G, the
color sets of colorful copies of T’ in which v plays
the role of r, and of colorful copies of T’’ in which v plays the role of r’.
For every (directed) edge (u,v) we add to u’s collection for T
the union of each color set in u’s collection for T’ with each
disjoint color set in v’s collection for T’’.
The complexity of this recursive algorithm is
$2^{O(k)} \cdot E$ as required.
For the undirected case we use the fact that a
graph with at least k|V| edges contains as a
subgraph any forest on k vertices.


Remember the tree-width of a graph G?
 The minimum width over all possible tree
decompositions (X,T) of G, where the width of (X,T) is $\max_{i \in I} |X_i| - 1$.
 T = (I, F) is a tree.
 $X = \{X_i : i \in I\}$ is a set of subsets of V such that:
▪ The union of all $X_i$ equals V.
▪ For every edge (u,v) of G there exists an i such that u,v
are in $X_i$.
▪ If $i, j, k \in I$, and j is on the path from i to k in T, then
$X_i \cap X_k \subseteq X_j$.
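
The three conditions can be checked mechanically. The Python sketch below is an illustration only: the names `is_tree_decomposition` and `width` and the input format (bags as a list of vertex sets, `tree_edges` as pairs of bag indices assumed to form a tree) are assumptions for this example. The third condition is tested in its equivalent form: for every vertex, the bags containing it induce a connected part of T.

```python
from collections import deque

def is_tree_decomposition(n, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions for a graph on 0..n-1.

    `tree_edges` is assumed (not verified) to describe a tree on the bags.
    """
    bags = [set(b) for b in bags]
    # (1) The union of all bags equals V.
    if set().union(*bags) != set(range(n)):
        return False
    # (2) Every edge of G lies inside some bag.
    for u, v in edges:
        if not any(u in b and v in b for b in bags):
            return False
    # (3) For each vertex, the bags containing it are connected in T.
    tadj = [[] for _ in bags]
    for i, j in tree_edges:
        tadj[i].append(j)
        tadj[j].append(i)
    for v in range(n):
        holders = {i for i, b in enumerate(bags) if v in b}
        start = next(iter(holders))
        seen, queue = {start}, deque([start])
        while queue:
            i = queue.popleft()
            for j in tadj[i]:
                if j in holders and j not in seen:
                    seen.add(j)
                    queue.append(j)
        if seen != holders:
            return False
    return True

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(b) for b in bags) - 1
```

For the path 0-1-2, the bags {0,1} and {1,2} joined by one tree edge form a valid decomposition of width 1.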



Let H be a directed or undirected graph on k
vertices with tree-width t. Let G be a directed
or undirected graph.
A sub-graph of G isomorphic to H, if one
exists, can be found in $2^{O(k)} \cdot V^{t+1}$ expected
time and in $2^{O(k)} \cdot V^{t+1} \log V$ worst-case time.
Proof is similar to that of Theorem 6.1.
 So we will skip it...


In [RS86b] it is shown that if C is a minor-closed
family of graphs that excludes at least
one planar graph G’ then there exists a (huge)
constant $c_{G'}$ such that every graph in C has a
tree-width of at most $c_{G'}$.
So we can use Theorem 6.3 whenever $|V_H| = O(\log V)$ and
H belongs to a minor-closed family that excludes at least one planar graph,
 and decide in polynomial time whether G contains
a subgraph isomorphic to H.

As a very special case of Theorem 6.3 we get
that the LOG PATH problem is in P
 A path of logV vertices is a tree.

In addition, all the algorithms we described
are easily parallelizable.
 So we get that the LOG PATH problem and other
problems are in NC.



The Color-Coding method efficiently finds k-vertex simple paths, k-vertex cycles, and
other small sub-graphs within a given graph
using probabilistic algorithms.
The Color-Coding method is a good
demonstration of de-randomization
techniques.
Algorithms presented can be easily
parallelized.
 Yielding efficient NC algorithms.