Presented by Yuval Shimron
Course 236801
1.12.2010
Find solutions to sub-cases of the Subgraph
Isomorphism Problem in polynomial time.
Find more efficient solutions to some sub-cases that already had polynomial-time
solutions.
Find simple paths and cycles of specific
length k.
This was the initial goal of the authors…
2
(1) For a fixed k, if G=(V,E) contains a cycle of
length k, it can be found in $O(V^\omega)$ expected
time or $O(V^\omega \log V)$ worst-case time ($\omega < 2.376$
is the exponent of matrix multiplication).
(2) For a fixed k, if a planar graph G=(V,E)
contains a cycle of length k, it can be found in
O(V) expected time or O(V log V) worst-case
time (this also applies to any non-trivial minor-closed family of graphs).
3
(3) If G=(V,E) contains a subgraph isomorphic
to a bounded tree-width graph $H=(V_H,E_H)$
where $|V_H| = O(\log V)$, then such a subgraph
can be found in polynomial time.
This was not previously known even when H is just a
simple path of length O(log V).
It shows that the LOG PATH problem is in NC (and
not just in P).
4
Randomized method
Vertices are randomly colored using $k = |V_H|$
colors.
If $|V_H| = O(\log V)$, then with a small (but only
polynomially small) probability all the vertices of
the subgraph isomorphic to H receive
distinct colors.
This makes the task of finding this ‘color-coded’
subgraph much easier.
▪ Be patient…
5
De-randomized algorithm?
It needs a family of colorings of G such that every
subset of k vertices of G is assigned distinct
colors by at least one of these colorings.
▪ In other words, a family of perfect hash functions from
{1, 2, …, |V|} to {1, 2, …, k}.
Only a “small” loss of efficiency.
6
If acyclic – simple
O(E) time for a simple algorithm.
So eliminate cycles:
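Choose a random permutation $\pi : V \to \{1, \ldots, |V|\}$.
Build $G' = (V, E')$ by using $\pi$:
▪ Direct the edges: for every $(u,v) \in E$:
$\pi(u) < \pi(v) \Rightarrow (u,v) \in E'$
$\pi(v) < \pi(u) \Rightarrow (v,u) \in E'$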
7
Every directed path of length k in G’ is a
simple path of length k in G.
Every simple path of length k in G has a
2/(k+1)! chance of becoming a directed path
in G’.
So if no path of length k was found in G’
repeat the process.
The expected number of times this process is
repeated is at most (k+1)!/2.
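The random-orientation loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name, the 0-based vertex numbering, the `rng` argument, and the `max_trials` cutoff are all assumptions made for the example. Each trial draws a permutation, treats every edge as directed toward the endpoint with the larger rank, and computes the longest directed path ending at each vertex in O(E) time.

```python
import random

def find_path_random_orientation(n, edges, k, rng, max_trials):
    """Look for a simple path with k edges in an undirected graph.

    Vertices are 0..n-1 and edges is a list of (u, v) pairs.
    Returns a list of k+1 vertices, or None if every trial failed
    (which does not prove that no such path exists).
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    for _ in range(max_trials):
        order = list(range(n))
        rng.shuffle(order)                 # order[i] is the vertex of rank i
        rank = [0] * n
        for i, v in enumerate(order):
            rank[v] = i
        length = [0] * n                   # longest directed path ending at v
        pred = [None] * n
        for v in order:                    # increasing rank = topological order of G'
            for u in adj[v]:
                if rank[u] < rank[v] and length[u] + 1 > length[v]:
                    length[v] = length[u] + 1
                    pred[v] = u
            if length[v] >= k:
                path = [v]
                while len(path) < k + 1:   # walk back k edges
                    path.append(pred[path[-1]])
                return path[::-1]
    return None
```

Because ranks strictly increase along any directed path, the returned vertices are always distinct, so a returned path is simple in G.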
8
So we get O(E(k+1)!) time complexity.
This is also the result for the directed case.
▪ Delete edges that don’t agree with $\pi$.
Use the following fact + DFS to reduce it to
O(V(k+1)!) for the undirected case:
Every graph with V vertices and at least k|V| edges
contains a path of length k.
So first run a DFS on the original graph.
Apply the above algorithm only if no vertex of
depth k was found (answered in O(k|V|) time).
9
Choose a random acyclic orientation G’ of G.
Raise the adjacency matrix of G’ to the power
of k-1 using O(log k) matrix multiplications.
This gives all the pairs of vertices connected
by a path of length k-1.
Check whether any of these pairs is also connected by an edge.
If so, a cycle of length k has been found.
If not, repeat the process.
▪ The expected number of repetitions is at most k!/2.
Complexity: $O(k!\,(\log k)\,V^\omega) = O(V^\omega)$ for a fixed k.
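A rough Python sketch of this cycle test follows. It is only an illustration under stated assumptions: the helper names are hypothetical, vertices are numbered from 0, and the Boolean matrix products are done naively on rows stored as integer bitmasks, so the sketch does not achieve the $O(V^\omega)$ bound that fast matrix multiplication gives. It does show the structure: random acyclic orientation, power k-1 by repeated squaring, then a check for a closing edge.

```python
import random

def bool_matmul(a, b):
    """Boolean product of two matrices stored as lists of row bitmasks."""
    out = []
    for row in a:
        acc, j = 0, 0
        while row:
            if row & 1:
                acc |= b[j]       # row j of b contributes when a[i][j] is set
            row >>= 1
            j += 1
        out.append(acc)
    return out

def bool_matpow(a, p):
    """a to the power p (p >= 1) over the Boolean semiring, by repeated squaring."""
    result, base = None, a
    while p:
        if p & 1:
            result = base if result is None else bool_matmul(result, base)
        p >>= 1
        if p:
            base = bool_matmul(base, base)
    return result

def has_k_cycle(n, edges, k, rng, trials):
    """One-sided test (k >= 3): True means a cycle of length k exists."""
    for _ in range(trials):
        rank = list(range(n))
        rng.shuffle(rank)
        a = [0] * n               # adjacency rows of the acyclic orientation G'
        for u, v in edges:
            if rank[u] < rank[v]:
                a[u] |= 1 << v
            else:
                a[v] |= 1 << u
        reach = bool_matpow(a, k - 1)   # bit v of reach[u]: directed path with k-1 edges
        for u, v in edges:
            if (reach[u] >> v) & 1 or (reach[v] >> u) & 1:
                return True
    return False
```

Since G’ is acyclic, every directed walk is a simple path, which is why a nonzero entry of the (k-1)-th power plus one closing edge certifies a cycle on exactly k vertices.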
10
To find a path of length k-1 in a graph G we
can choose a random coloring of the vertices
of G in k colors.
Every simple path of length k-1 in G has a
chance of $k!/k^k > e^{-k}$ to become colorful,
i.e., each vertex on it is colored with a different color.
If it does, we can find it using Lemma 3.1.
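The bound on the success probability can be checked in two lines. The derivation below is a standard argument added for illustration: a path on k vertices has $k^k$ possible colorings, $k!$ of which are colorful, and the inequality follows because $k^k/k!$ is a single term of the series for $e^k$.

```latex
\[
\Pr[\text{path is colorful}] = \frac{k!}{k^k},
\qquad
e^{k} = \sum_{j \ge 0} \frac{k^{j}}{j!} > \frac{k^{k}}{k!}
\;\Longrightarrow\;
\frac{k!}{k^{k}} > e^{-k}.
\]
```

So about $e^k$ random colorings suffice in expectation before a given path becomes colorful.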
11
Use Color-Coding to find a colorful path of
length k-1 in $2^{O(k)} \cdot E$ worst-case time (if one exists).
Actually it finds a path of length k that starts at a
specific vertex s.
▪ But we can always add some vertex s to G, adjacent to every vertex (with a new
color).
The algorithm uses a given (random) coloring
$c : V \to \{1, 2, \ldots, k\}$.
The algorithm uses a dynamic programming
approach.
12
Suppose we’ve found for each vertex v the
sets of colors on colorful paths of length i that
connect s and v.
This is a collection of at most $\binom{k}{i}$ color sets.
For that we only need to record the color sets
appearing on paths of length i,
and not the paths themselves…
We inspect every color set C of that
collection.
13
We also inspect every edge (v,u) in E.
If $c(u) \notin C$ we add $C \cup \{c(u)\}$ to the collection
of u that corresponds to colorful paths of
length i+1.
The graph G contains a colorful path of
length k-1 iff the final collection, corresponding
to paths of length k-1, of at least one vertex is
non-empty.
14
The number of operations is at most
$O\left(\sum_{i=0}^{k} i \binom{k}{i} E\right) = O(k 2^k E)$.
The proof holds for both directed and
undirected graphs.
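The dynamic program of the last few slides can be sketched as below. This is a hedged illustration rather than the paper's exact procedure: instead of adding a source vertex s, it starts a path at every vertex simultaneously, which has the same effect; colors are 0..k-1 rather than 1..k; color sets are stored as integer bitmasks; and the function names are made up for the example.

```python
import random

def colorful_path_exists(n, edges, color, k, directed=False):
    """True iff some path on k vertices (length k-1) uses k distinct colors.

    color[v] must lie in range(k). Only color sets are stored, never paths.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    # level[v] = color sets (bitmasks) of colorful paths that end at v
    level = [{1 << color[v]} for v in range(n)]
    for _ in range(k - 1):                 # extend every path by one edge
        nxt = [set() for _ in range(n)]
        for v in range(n):
            for mask in level[v]:
                for u in adj[v]:
                    bit = 1 << color[u]
                    if not mask & bit:     # c(u) is not yet in the set C
                        nxt[u].add(mask | bit)
        level = nxt
    return any(level)

def find_k_vertex_path(n, edges, k, rng, trials):
    """Repeat with fresh random colorings; True certifies a simple path."""
    for _ in range(trials):
        color = [rng.randrange(k) for _ in range(n)]
        if colorful_path_exists(n, edges, color, k):
            return True
    return False
```

A colorful path is automatically simple, because a repeated vertex would repeat a color. That is why tracking color sets alone is enough.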
15
We can find all pairs of vertices connected by
colorful paths of length k-1 in $2^{O(k)} \cdot VE$ or $2^{O(k)} \cdot V^\omega$
worst-case time.
To get $2^{O(k)} \cdot VE$ time, simply run the algorithm of Lemma 3.1
|V| times, once from each vertex of G=(V,E).
Use a recursive approach to get $2^{O(k)} \cdot V^\omega$ time.
16
Enumerate all partitions of {1,2,…,k} into two
subsets $C_1, C_2$ of size k/2 each. There are
$\binom{k}{k/2} < 2^k$ such partitions.
For each partition, split G into two subgraphs $G_1, G_2$
induced by the vertex sets $V_1, V_2$ colored with colors from $C_1, C_2$.
Recursively find the pairs of vertices connected
by colorful paths of length k/2-1 in each of them.
Store the results in Boolean matrices $A_1, A_2$.
17
Define B to be the Boolean matrix of adjacency
relations between the vertices of $V_1$ and $V_2$.
Compute $A_1 B A_2$.
You get all pairs connected by colorful paths of length k-1 in which:
▪ The first k/2 vertices are colored by colors from $C_1$
▪ The last k/2 vertices are colored by colors from $C_2$
By OR-ing all the matrices obtained from all
the partitions you get your answer.
Time complexity?
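One way to answer the question is the recurrence sketched below. It assumes k is a power of two, that each Boolean product costs $O(V^\omega)$, and it writes $t(k)$ for the running time and c for a hypothetical constant; these symbols are introduced here for illustration only.

```latex
\[
\begin{aligned}
t(k) &\le \binom{k}{k/2}\Bigl(2\,t(k/2) + c\,V^{\omega}\Bigr)
      \le 2^{k+1}\,t(k/2) + c\,2^{k}\,V^{\omega}, \\
\prod_{j=0}^{\log_2 k - 1} 2^{\,k/2^{j} + 1}
     &= 2^{(2k-2) + \log_2 k} \le k\,4^{k}
\quad\Longrightarrow\quad
t(k) = 2^{O(k)}\,V^{\omega}.
\end{aligned}
\]
```

The product bounds the total branching over all recursion levels, and each level adds at most $c\,2^k V^\omega$ work per branch, so the exponential factor stays $2^{O(k)}$.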
18
A simple path of length k-1 in a directed /
undirected graph G=(V,E) can be found (if one exists)
in:
$2^{O(k)} \cdot V$ expected time for an undirected graph.
▪ DFS…
$2^{O(k)} \cdot E$ expected time for a directed graph.
A simple cycle of size k in a directed / undirected
graph G=(V,E) can be found (if one exists) in either
$2^{O(k)} \cdot VE$ or $2^{O(k)} \cdot V^\omega$ expected time.
Simply use Lemma 3.2.
19
The previous randomized algorithms can be
derandomized with a loss of efficiency.
Extra logV factor to the complexity.
What we need is a family of k-perfect hash
functions from {1, 2, …, |V|} to {1, 2, …, k}.
If we use these hash functions we know that for
every subset of k vertices there exists a coloring
that gives each vertex in it a distinct color.
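To make the k-perfect property concrete, here is a brute-force checker. It is only a sketch for tiny inputs: the function name is hypothetical, elements and colors are numbered from 0 rather than 1, and each hash function is represented as a plain list. It is not the construction used for derandomization, only a test of the defining property.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """True iff every k-subset of range(n) gets k distinct values
    under at least one function of the family.

    family is a list of lists; h[x] is the color of element x.
    """
    for subset in combinations(range(n), k):
        if not any(len({h[x] for x in subset}) == k for h in family):
            return False
    return True

# Two "bit" functions on {0, 1, 2, 3}: any two distinct numbers
# below 4 differ in bit 0 or in bit 1.
low_bit = [x & 1 for x in range(4)]
high_bit = [(x >> 1) & 1 for x in range(4)]
```

With these two lists, the family `[low_bit, high_bit]` is 2-perfect on four elements, while `[low_bit]` alone is not, since it gives 0 and 2 the same color.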
20
There exists an algorithm that constructs a
k-perfect family of hash functions from
{1, 2, ..., n} to {1, 2, ..., k},
but its size is $2^{O(k)} \log^2 n$.
There also exists an algorithm that constructs
a k-perfect family of hash functions from
{1, 2, ..., n} to {1, 2, ..., $k^2$} whose size is
$k^{O(1)} \log n$.
21
So we use 2-level hashing:
Mapping from {1, 2, ..., n} to {1, 2, ..., $k^2$} by using
the second algorithm.
Mapping from {1, 2, ..., $k^2$} to {1, 2, ..., k} by using
the first algorithm.
And we get just the promised extra O(log V)
factor.
The value of each hash function on each element can be evaluated in
O(1) time.
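The size of the composed family follows by multiplying the two sizes, with the first construction applied to a domain of size $k^2$ instead of n. The short calculation below is added as a check and assumes the sizes $2^{O(k)} \log^2 n$ and $k^{O(1)} \log n$ for the two constructions.

```latex
\[
\underbrace{k^{O(1)} \log n}_{\{1,\dots,n\} \to \{1,\dots,k^2\}}
\;\cdot\;
\underbrace{2^{O(k)} \log^{2}(k^{2})}_{\{1,\dots,k^2\} \to \{1,\dots,k\}}
\;=\; 2^{O(k)} \log n .
\]
```

With n = |V| this is the extra log V factor, up to the $2^{O(k)}$ term already present in the running time.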
22
Use k-perfect hash coloring functions.
Choose a random coloring (and not a permutation)
$c : V \to \{1, 2, \ldots, k\}$.
Remove edges (u,v) s.t. $c(v) \neq c(u) + 1$.
Direct the remaining edges (u,v) from u to v.
Again G’, the obtained graph, is acyclic.
A simple path on k vertices (length k-1) in G has a probability of
$2k^{-k}$ to become a directed path in G’.
Different from the Color-Coding method.
23
An undirected graph G is d-degenerate if
every subgraph of it has a vertex of degree at
most d.
Smallest such d is called the degeneracy or
the max-min degree of G.
Maximum over the minimum degrees of all
sub-graphs of G.
If G is d-degenerate then clearly $|E| \le d\,|V|$.
24
Let G be a connected undirected graph.
An acyclic orientation of G=(V,E) such that for
every v we have $d_{out}(v) \le d(G)$ can be found in
O(E) time.
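One standard way to obtain such an orientation is to repeatedly remove a vertex of minimum remaining degree and direct its remaining edges away from it. The sketch below illustrates that idea under assumptions: the function name is made up, vertices are numbered from 0, the graph is simple, and the bucket structure is a simplified version that runs in O(V + E) time.

```python
def degeneracy_orientation(n, edges):
    """Return (d, out): the degeneracy d and, for each vertex v,
    the list out[v] of its out-neighbours in an acyclic orientation
    where every out-degree is at most d."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    deg = [len(a) for a in adj]
    buckets = [set() for _ in range(n)]    # buckets[j]: remaining vertices of degree j
    for v in range(n):
        buckets[deg[v]].add(v)
    removed = [False] * n
    out = [[] for _ in range(n)]
    d = low = 0
    for _ in range(n):
        low = max(low - 1, 0)              # the minimum degree drops by at most 1
        while not buckets[low]:
            low += 1
        v = buckets[low].pop()             # vertex of minimum remaining degree
        removed[v] = True
        d = max(d, low)
        for u in adj[v]:
            if not removed[u]:
                out[v].append(u)           # orient v -> u (u is removed later)
                buckets[deg[u]].remove(u)
                deg[u] -= 1
                buckets[deg[u]].add(u)
    return d, out
```

Every edge points from the earlier-removed endpoint to the later one, so the orientation is acyclic, and the out-degree of a vertex equals its degree at removal time, which never exceeds d.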
25
A graph H is a minor of undirected graph G if
it can be obtained from G by the removal and
the contraction of edges.
A family C of graphs is minor-closed if a minor
of any graph in it is also a member of the
family.
If such C is non-trivial then all graphs in C are
of bounded degeneracy:
$\exists d_C$ s.t. $\forall G \in C : d(G) \le d_C$.
26
Consider the family of planar graphs $C_{planar}$.
It is minor-closed.
Each planar graph has a vertex whose degree is at
most 5.
$d_{C_{planar}} \le 5$.
27
Let C be a non-trivial minor-closed family of
graphs and let $k \ge 3$ be a fixed parameter.
There exists a randomized algorithm that, given
an undirected graph in C, finds a $C_k$ (a cycle of size
k) in it, if one exists, in O(V) expected time.
Proof:
Let G = (V,E) be a graph in C that contains a $C_k$.
Choose a random coloring c : V -> {1, 2, 3, …, k}.
The $C_k$ is considered well-colored if its vertices are colored
consecutively by the colors 1, 2, …, k.
28
The $C_k$ in G has a chance of $2/k^{k-1}$ to be
well-colored.
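The probability comes from a direct count, spelled out here for illustration: a fixed cycle on k vertices has $k^k$ colorings, and the well-colored ones are determined by which vertex gets color 1 (k choices) and in which direction the colors increase (2 choices).

```latex
\[
\Pr[\,C_k \text{ is well-colored}\,] \;=\; \frac{2k}{k^{k}} \;=\; \frac{2}{k^{\,k-1}} .
\]
```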
Can we find it efficiently?
Yes, but with some probability…
Assume that the degeneracy of C is d = O(1).
We describe a randomized algorithm that,
given a coloring c, finds a well-colored $C_k$ with probability of at least
$1/(2d)^k$.
Combining both gives a probability of at least
$\frac{2}{k^{k-1} (2d)^k}$, so the expected time is $O((2dk)^k \cdot V)$.
29
We can assume all edges of G connect vertices
that are colored by consecutive colors (mod k).
Edges that don’t may be safely removed.
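The edge-filtering step can be sketched in a few lines of Python. The function name and the representation of the coloring as a dictionary or list indexed by vertex are assumptions for this example; colors are 1..k as on the slides, so colors k and 1 count as consecutive.

```python
def keep_consecutive_edges(edges, color, k):
    """Keep only edges whose endpoint colors are consecutive modulo k (k >= 3)."""
    kept = []
    for u, v in edges:
        diff = (color[u] - color[v]) % k
        if diff == 1 or diff == k - 1:     # colors differ by exactly 1 mod k
            kept.append((u, v))
    return kept
```

No edge of a well-colored cycle is discarded by this filter, which is why the removal is safe.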
We orient the graph so that the out-degree of
all the vertices is at most d.
This takes only O(V) time.
The algorithm tries to find the edge that
connects the vertices of the $C_k$ colored by k and k-1:
$v_k, v_{k-1}$. It “flips coins” to guess its orientation
and index: 2d possible combinations.
30
For each guess of such an index i:
If the orientation is from $v_{k-1}$ to $v_k$:
▪ All edges that leave $v_{k-1}$ but whose index is not i are
removed.
Otherwise it does the opposite
▪ (for edges that leave $v_k$).
The result is a graph G’ that contains a $C_k$ with a
probability of at least 1/(2d).
In G’, the edges between colors k-1 and k form a forest of rooted stars.
31
Each such star is contracted into a single
vertex and assigned the color k-1.
The obtained graph is denoted by G’’.
G’’ contains a well-colored $C_{k-1}$ iff G’ contains
a well-colored $C_k$,
since each edge of G’, and therefore of G’’, connects
consecutively colored vertices.
G’’ is also a graph in the minor-closed family C.
So we recursively look for a $C_{k-1}$.
32
It will take us O((k-1)V) expected time
and yields a $C_{k-1}$ with a probability of at least
$1/(2d)^{k-1}$.
It is easy to reconstruct the $C_k$ from the $C_{k-1}$.
We can stop the recursion when k=3 and use
an existing algorithm for finding triangles in a
general graph in $O(E \cdot d(G))$ time.
Any triangle in a three-colored graph is well-colored.
$O(E \cdot d(G))$ is O(V) in our case.
33
There exists a deterministic algorithm that, given a
graph in C, finds a $C_k$, if one exists, in O(V log V) worst-case time.
Proof:
Instead of using a random coloring we exhaust a list of
$k^{O(k)} \log V$ colorings that has this property:
▪ Every sequence of k vertices is consecutively colored by 1,2,…,k
by at least one coloring of the list.
Instead of guessing the direction and index of each
edge in the $C_k$ we exhaust, for each coloring, all the $(2d)^k$
possible choices.
▪ If G contains a $C_k$ then at least one $C_k$ will be found this way.
34
A graph G1 is said to be isomorphic to a graph
G2 if there exists a bijection:
f : V(G1) -> V(G2)
such that any two vertices u and v of G1 are
adjacent in G1 iff ƒ(u) and ƒ(v) are adjacent in
G2.
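The definition can be turned directly into a brute-force test for very small graphs. The sketch below is illustrative only: the function name is hypothetical, both graphs are assumed to be simple, undirected, and on the same vertex set 0..n-1, and the running time is exponential because every bijection is tried.

```python
from itertools import permutations

def are_isomorphic(n, edges1, edges2):
    """True iff some bijection f on range(n) maps the edges of the
    first graph exactly onto the edges of the second."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    if len(e1) != len(e2):
        return False
    for f in permutations(range(n)):       # f[u] is the image of vertex u
        # With equal edge counts, mapping every edge to an edge
        # also maps every non-edge to a non-edge.
        if all(frozenset((f[u], f[v])) in e2 for u, v in e1):
            return True
    return False
```

Subgraph isomorphism asks the same question for H against every subgraph of G, which is why the general problem is hard and why the color-coding shortcuts matter.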
35
Let F be a directed/undirected forest on k
vertices. Let G be a directed/undirected
graph.
A sub-graph isomorphic to F can be found, if one
exists, in:
$2^{O(k)} \cdot E$ expected time in the directed case.
$2^{O(k)} \cdot V$ expected time in the undirected case.
36
Proof:
Start as usual, by choosing a random coloring
c : V -> {1, …, k} of G.
With a probability of at least $e^{-k}$ the copy of F in G
becomes colorful.
▪ Meaning, each vertex is assigned a different color.
Suppose that F is composed of l (directed) trees
$T_1, T_2, \ldots, T_l$ with $k_1, k_2, \ldots, k_l$ vertices each.
Let $F_i$ be the (directed) forest composed of
$T_1, T_2, \ldots, T_i$.
37
For each $1 \le i \le l$ we find the color sets that
appear on colorful copies of $T_i$ in G.
Note that copies of $T_i$, $T_j$ with disjoint color sets
are necessarily disjoint.
Then, in $2^{O(k)}$ time we find the color sets that
appear on colorful copies of $F_i$ for $1 \le i \le l$.
If the collection corresponding to $F = F_l$ is not
empty then G contains a colorful copy of F.
How do we find it…?
38
How do we find the color sets that appear on
colorful copies of $T_i$ in G?
Let r be an arbitrary vertex in $T_i = T$.
For each vertex v in G we find the color sets that
appear on copies of T in which v plays the role of r.
If T is a single vertex then it’s easily done…
Otherwise let e=(r,r’) be a (directed) edge in T.
▪ Removing e breaks T into two (directed) sub-trees T’ (containing r) and T’’ (containing r’).
39
We recursively find, for each vertex v in G, the
color sets in copies of T’ in which v plays
the role of r, and in copies of T’’ in which v plays the role of r’.
For every (directed) edge (u,v) we combine each color set in u’s
collection (for T’) with each color set in v’s collection (for T’’) if they are disjoint,
and add the union to u’s collection for T.
The complexity of this recursive algorithm is
$2^{O(k)} \cdot E$, as required.
For the undirected case we use the fact that a
graph with at least k|V| edges contains as a
subgraph any forest on k vertices.
40
Remember the tree-width of a graph G?
It is the minimum width over all possible tree-decompositions (X,T) of G,
where the width of (X,T) is $\max_i |X_i| - 1$.
T = (I, F) is a tree.
$X = \{X_i : i \in I\}$ is a set of subsets of V such that:
▪ The union of all $X_i$ equals V.
▪ For every edge (u,v) of G there exists an i such that u,v
are in $X_i$.
▪ If $i, j, k \in I$, and j is on the path from i to k in T, then
$X_i \cap X_k \subseteq X_j$.
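The three conditions can be checked mechanically. The sketch below is an illustration with hypothetical names: bags are given as a dictionary from tree nodes to vertex sets, and T is assumed to already be a tree. The third condition is tested in its equivalent form, namely that the tree nodes whose bags contain a given vertex form a connected subtree.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions.

    bags: dict mapping each tree node i to the set X_i.
    tree_edges: the edges of T (T is assumed to be a tree).
    """
    # 1. The union of all bags equals V.
    if set().union(*bags.values()) != set(vertices):
        return False
    # 2. Every edge of G lies inside some bag.
    for u, v in edges:
        if not any(u in b and v in b for b in bags.values()):
            return False
    # 3. For every vertex, the nodes containing it are connected in T.
    tadj = {i: set() for i in bags}
    for i, j in tree_edges:
        tadj[i].add(j)
        tadj[j].add(i)
    for x in vertices:
        nodes = {i for i, b in bags.items() if x in b}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in tadj[i]:
                if j in nodes and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True

def decomposition_width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(b) for b in bags.values()) - 1
```

For example, a path on four vertices has a decomposition into three bags of size two arranged along a path, which has width 1, matching the fact that trees have tree-width 1.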
41
Let H be a directed or undirected graph on k
vertices with tree-width t. Let G be a directed
or undirected graph.
A sub-graph of G isomorphic to H, if one
exists, can be found in $2^{O(k)} \cdot V^{t+1}$ expected
time and in $2^{O(k)} \cdot V^{t+1} \log V$ worst-case time.
Proof is similar to that of Theorem 6.1.
So we will skip it...
42
In [RS86b] it is shown that if C is a minor-closed
family of graphs that excludes at least
one planar graph G’ then there exists a (huge)
constant $c_{G'}$ such that every graph in C has a
tree-width of at most $c_{G'}$.
So we can use Theorem 6.3 whenever $|V_H| = O(\log V)$ and
H belongs to a minor-closed family that excludes at least one planar graph,
and decide in polynomial time whether G contains
a subgraph isomorphic to H.
43
As a very special case of Theorem 6.3 we get
that the LOG PATH problem is in P:
a path on log V vertices is a tree.
In addition, all the algorithms we described
are easily parallelizable.
So we get that the LOG PATH problem and other
problems are in NC.
44
The Color-Coding method efficiently finds k-vertex simple paths, k-vertex cycles, and
other small sub-graphs within a given graph
using probabilistic algorithms.
The Color-Coding method is a good demonstration
of de-randomization
techniques.
The algorithms presented can be easily
parallelized,
yielding efficient NC algorithms.
45