CS 592

SHEN’S CLASS NOTES
Chapter 34
NP-Completeness
34.1 Basic Concepts
After long study, researchers have found that some problems
can be solved efficiently, while others are so difficult that
only exponential-time algorithms are known for them, or no
algorithm exists at all.
In order to study the intrinsic complexity of computational
problems, we would like to classify problems into different
classes based on their difficulty levels.
If a problem can be solved in time that is a polynomial
function of n, where n is the input size, we say that this problem
belongs to class P. Class P problems are called tractable
because they can be solved in polynomial time. A problem that
requires super-polynomial time is called intractable.
There is a class of problems whose tractability is not yet
known. We call this class NP-complete (NPC). A problem in the
NPC class is considered hard because nobody has found a
polynomial-time algorithm for it so far. However, it has not
been proved that no such algorithm exists.
In order to study the intrinsic complexity of problems, we
introduce the NP class. A problem belongs to the NP class if it
can be solved by an NP algorithm in polynomial time. An NP
algorithm is an algorithm run on a powerful but hypothetical
computing model called non-deterministic machine model. All
NPC problems can be solved by an NP algorithm in polynomial
time. Moreover, it has been proved that if any NPC problem can
be solved by a (deterministic) polynomial-time algorithm, then
all NP problems can be solved in polynomial time too. This is
why we use the name NP-Complete for such a problem. On the
other hand, if any NPC problem is proved to be intractable,
then all NPC problems are intractable.
Encodings
The complexity of a problem is closely related to the input
size, which depends on the encoding of the problem. For
example, suppose we want to encode the number 99. Its decimal,
binary, and unary representations are:
(99)_10 (uses 2 digits)
(1100011)_2 (uses 7 digits)
(111…1)_1 (uses 99 digits)
So, we need to say a few words about encodings.
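The effect of the encoding on the input size can be checked directly. The following is a minimal sketch; the helper name encoding_lengths is our own and does not come from the notes. It counts the symbols needed to write a positive integer in decimal, binary, and unary.

```python
def encoding_lengths(n: int) -> dict:
    """Return the number of symbols needed to write n >= 1 in three encodings."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return {
        "decimal": len(str(n)),    # number of base-10 digits
        "binary": n.bit_length(),  # number of base-2 digits
        "unary": n,                # one mark per unit
    }


print(encoding_lengths(99))  # {'decimal': 2, 'binary': 7, 'unary': 99}
```

The decimal and binary lengths differ only by a constant factor, while the unary length is exponentially larger than either.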
Two different encodings e1 and e2 are called polynomially
related for a problem I if there are polynomial-time computable
functions f and g such that f(e1(i)) = e2(i) and g(e2(i)) = e1(i)
for any problem instance i. If encodings e1 and e2 are
polynomially related for problem I, then I can be solved in
polynomial time under encoding e1 if and only if it can be
solved in polynomial time under encoding e2.
Almost all encoding methods are polynomially related; the
exception is an “expensive” encoding such as unary encoding.
If we use binary encoding for a number N, we need
log2N bits. Now, if we use base b system to represent N, then
we need logbN digits. However,
log2N = (log2b) (logbN) = k logbN,
where k = log2b is a constant.
Therefore, using base two encoding or base b encoding will
affect the input size only by a constant factor.
In general, if an encoding e uses an alphabet of b symbols, we
can always use two symbols {0, 1} to encode each symbol with
log2b bits. Thus, any encoding can be translated into a binary
encoding without affect the complexity.
We can assume that any “reasonable” encoding, particularly,
binary encoding, can be used in our discussion of NPCompleteness theory,
Decision Problems vs. Optimization Problems
Many problems are optimization problems, for which we wish to
find the best solution. Examples include finding a shortest path
between two vertices of a graph, finding the largest compatible
set of activities, and finding a minimum spanning tree (MST) of
a graph.
The NP-completeness theory does not apply directly to
optimization problems. It is based on “decision problems.”
Definition 1 Any problem for which the answer is either yes
or no is called a decision problem.
Although the discussion of NP-Completeness is restricted to
decision problems, it can usually be applied indirectly to
optimization problems. For example, finding a path from vertex
u to vertex v with the minimum number of edges is an
optimization problem. We can cast this problem as a decision
problem as follows:
Given a graph G(V, E) and two vertices u, v ∈ V, does there
exist a path from u to v of length k or less? We denote the
encoding of this problem by <G, u, v, k>.
If this decision problem can be solved by an algorithm
A(G, u, v, k), then the optimization problem can be solved in
the following way (assuming u ≠ v and that v is reachable
from u):
Shortest-path(G, u, v)
    k ← 1
    while A(G, u, v, k) = ‘no’
        do k ← k + 1
    return k
End
The above algorithm runs in polynomial time if A(G, u, v, k) is
a polynomial-time algorithm, because a shortest path has at most
|V| – 1 edges and so the loop calls A at most |V| – 1 times.
This algorithm finds the length k of the shortest path; it does
not actually produce the path. However, it is not hard to design
a simple algorithm that produces the path. We leave this to
students.
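The idea of driving an optimization problem with a decision procedure can be sketched in Python. This is only an illustration under our own assumptions: the graph is an adjacency dictionary, has_path_within is a hypothetical stand-in for the algorithm A (implemented here with a depth-limited breadth-first search), and the loop starts at k = 0 and stops at |V| – 1 so that the cases u = v and "no path" are also handled.

```python
from collections import deque


def has_path_within(graph, u, v, k):
    """Stand-in for A(G, u, v, k): is there a u-v path with at most k edges?

    graph is an adjacency dict {vertex: iterable of neighbours}.
    """
    frontier, seen = deque([(u, 0)]), {u}
    while frontier:
        w, d = frontier.popleft()
        if w == v:
            return True
        if d < k:  # only expand vertices that are fewer than k edges away
            for x in graph[w]:
                if x not in seen:
                    seen.add(x)
                    frontier.append((x, d + 1))
    return False


def shortest_path_length(graph, u, v):
    """Smallest k with A(G, u, v, k) = yes, or None if v is unreachable from u."""
    for k in range(len(graph)):  # a simple path has at most |V| - 1 edges
        if has_path_within(graph, u, v, k):
            return k
    return None
```

The outer loop makes at most |V| calls to the decision procedure, so the whole computation is polynomial whenever the decision procedure is.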
From now on, we only discuss decision problems unless
specified otherwise.
Polynomial reductions
Definition 2 Let A and B be two problems. We say that A
polynomially reduces to B, denoted by A ≤p B, if there is a
procedure, called a reduction algorithm, that transforms any
instance α of A into an instance β of B with the following
characteristics:
(1) The transformation takes polynomial time.
(2) The answer for α is yes if and only if the answer for β is
yes.
Obviously, if A ≤p B and B is polynomial-time solvable, then A
is also polynomial-time solvable.
A Formal-Language Framework
Since formal-language theory is a powerful tool in establishing
NPC theory, we review some basic notions about formal languages
here.
Definition 3 An alphabet Σ is a finite set of symbols.
Examples are: Σ = {0, 1}, Σ = {a, b, c}, Σ = {a, b, …, z}.
Definition 4 A language L over Σ is any set of strings made up
of symbols from Σ.
For example, if Σ = {0, 1}, then L = {10, 11, 101, 111, 1011, …}
is the language that contains the binary representations of all
prime numbers.
The special symbols ε and ∅ are used to represent the empty
string and the empty language respectively. Moreover, for
Σ = {0, 1}, Σ* is used to represent the set of all binary
strings. That is,
Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, …}.
Every language L over Σ is a subset of Σ*. Σ* itself is also a
language.
Definition 5 Let L1 and L2 be two languages. Their
concatenation is the language defined by
L1 · L2 = {x1 x2 | x1 ∈ L1, x2 ∈ L2}.
For example,
L1 = {10, 1100, 111000, …} = {1^n 0^n | n ≥ 1},
L2 = {01, 0011, 000111, …} = {0^n 1^n | n ≥ 1},
L1 · L2 = {1001, 100011, …} = {1^n 0^(n+m) 1^m | m, n ≥ 1}.
Definition 6 Let L be a language. Its complement and Kleene
star are the languages defined by:
L̄ = Σ* – L, and
L* = {ε} ∪ L ∪ L^2 ∪ L^3 ∪ …,
where L^k is the concatenation of L with itself k times.
Let Q be a decision problem, x be an instance of the problem,
Q(x) be the answer to the instance x. Moreover, we use Q(x) = 1
and Q(x) = 0 to represent that the answer is yes or no
respectively. Then, the problem Q can be characterized by the
following language:
L = {x ∈ Σ* | Q(x) = 1}.
Definition 7 An algorithm A is said to accept a string
x ∈ {0, 1}* if, given input x, the algorithm’s output is
A(x) = 1. The algorithm A is said to reject a string x if
A(x) = 0.
Definition 8 Given an algorithm A, we define the language
accepted by A to be the set of strings accepted by A:
L = {x | x ∈ {0, 1}* and A(x) = 1}.
Definition 9 A language L is decided by an algorithm A if
every string in L is accepted by A and every string not in L is
rejected by A.
Now, we are ready to define class P.
Class P
The class P is the set of decision problems that can be solved
(decided) in polynomial time.
P = {L | L ⊆ {0, 1}* and there exists an algorithm A that
decides L in polynomial time}.
The following theorem shows that accepting a language in
polynomial time means deciding a language in polynomial time.
So, we only need to show there is a polynomial time algorithm
to accept a language to prove that it is in class P.
Theorem 34.2
P = {L | L is accepted by a polynomial time algorithm}.
Proof. We need only show that if L is accepted by a
polynomial-time algorithm A, then we can find an algorithm A’
that decides L in polynomial time.
Assume algorithm A accepts L in time O(n^k) for a fixed k.
This means that, for any string x of size n in L, algorithm A
produces A(x) = 1 within T = c n^k steps, where c is a constant.
Now, we can design an algorithm A’ in this way:
Let A’ simulate the actions of A on an input x until A
stops or reaches T steps. Then, A’ checks the result of A. If A
has accepted x, A’ accepts x by outputting 1. If A has not
accepted x, then A’ rejects x by outputting 0. Because every
string in L is accepted within T steps, A’ correctly decides L
in polynomial time.
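The simulation argument can be illustrated with a small Python sketch. The modelling choices are our own assumptions: an acceptor is written as a generator that yields once per simulated step and yields True at the moment it accepts, and decide_with_budget plays the role of A’. The toy acceptor accepts_even_ones is hypothetical and is chosen because it never halts on strings outside its language.

```python
def decide_with_budget(acceptor, x, c, k):
    """Run acceptor(x) for at most c * n**k steps; output 1 only if it accepts in time."""
    budget = c * max(len(x), 1) ** k
    steps = acceptor(x)
    for _ in range(budget):
        try:
            if next(steps) is True:
                return 1  # the acceptor accepted within the step budget
        except StopIteration:
            break  # the acceptor halted without accepting
    return 0  # budget exhausted or halted without accepting: reject


def accepts_even_ones(x):
    """Toy acceptor: accepts strings with an even number of 1s, loops forever otherwise."""
    parity = 0
    for ch in x:
        parity ^= (ch == "1")
        yield False  # one simulated step per input symbol
    while parity:  # never halts on strings outside the language
        yield False
    yield True
```

With c = 2 and k = 1 the toy acceptor always accepts members of its language within the budget, so decide_with_budget(accepts_even_ones, x, 2, 1) decides the language even though the acceptor alone does not.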
34.2 Polynomial Time Verification
Suppose we are given a decision problem together with
additional information that proves the answer to the problem is
yes. Can we design a polynomial-time algorithm to verify that
this information indeed proves the answer is yes?
If we can, then this algorithm is called a polynomial (time)
verification algorithm.
For example, if we are given the problem <G, u, v, k> and
also a path from u to v of length k or less, then we can easily
design a polynomial-time algorithm that verifies that the path
is indeed a path in the graph from u to v and that its length
is k or less.
The additional information is called a certificate. The length
of the certificate must also be a polynomial function of the
input size.
We assume that if the instance of the problem has a “no”
answer, then no certificate exists. In this case, the
verification algorithm may either give a “no” answer or give no
answer at all. In other words, the verification algorithm is
only responsible for the cases where the instance has a “yes”
answer and a correct polynomial-length certificate is provided.
Usually, when we could not solve a hard problem in
polynomial time, we try to design a polynomial verification
algorithm for it. We call a polynomial verification algorithm an
NP-algorithm.
In the following, we look at another example.
Hamiltonian Cycles
A Hamiltonian cycle of a graph G(V, E) is a simple cycle
that goes through every vertex in V exactly once. A graph that
has a Hamiltonian cycle is called a Hamiltonian graph. The
Hamiltonian cycle problem is a decision problem that asks
whether a given graph G has a Hamiltonian cycle or not. In
terms of formal language, this problem corresponds to the
following language.
HAM-CYCLE = {<G> | G is a Hamiltonian graph}.
This is a difficult problem. So, we consider the verification
algorithm.
Suppose we are given a graph G and a sequence p of vertices.
We want a polynomial-time algorithm that verifies whether p
represents a Hamiltonian cycle of G.
This verification problem can easily be solved in O(n^2) time:
HAM-CYCLE(G(V, E), p)
    Check that every vertex in p belongs to set V.
    Check that the starting and ending vertices of p are identical.
    Check that every other vertex in p occurs exactly once.
    Check that every vertex in G occurs in p.
    Check that every two adjacent vertices u, v in p are also
    adjacent in G.
    If all the above checks pass, return yes; otherwise return no.
End
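The checks above can be written as a short Python function. This is a sketch under our own conventions: the graph is undirected and given as a vertex list and an edge list, and the certificate p repeats the starting vertex at the end. The function name verify_ham_cycle is ours.

```python
def verify_ham_cycle(vertices, edges, p):
    """Return True iff the vertex sequence p is a Hamiltonian cycle of the graph.

    vertices: iterable of vertex names; edges: iterable of (u, v) pairs;
    p: sequence that should start and end at the same vertex.
    """
    V = set(vertices)
    E = {frozenset(e) for e in edges}
    if len(V) < 3 or len(p) != len(V) + 1:
        return False
    if p[0] != p[-1]:  # the cycle must return to its start
        return False
    if set(p[:-1]) != V:  # with the length check: every vertex exactly once
        return False
    # every two consecutive vertices of p must be adjacent in G
    return all(frozenset((a, b)) in E for a, b in zip(p, p[1:]))
```

Each check is a single pass over p or over pairs of consecutive vertices, so the verifier runs in time polynomial in the size of the graph.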
Definition 10 A verification algorithm A(x, y) is a
two-argument algorithm that takes an instance x of problem Q
and a certificate y for x. This algorithm produces A(x, y) = 1
if y proves Q(x) = 1.
Note. The verification algorithm only needs to be responsible
for those certificates y that prove Q(x) = 1. If Q(x) = 0, the
algorithm is allowed to produce nothing or run forever, but it
must never output 1.
The class NP
A language L belongs to class NP if and only if there exists a
two-input polynomial-time algorithm A and a constant c such
that
L = {x ∈ {0, 1}* | there exists a certificate y with
|y| = O(|x|^c) such that A(x, y) = 1}.
We say that algorithm A verifies language L in polynomial time.
Obviously, HAM-CYCLE ∈ NP.
Note that P ⊆ NP because if L ∈ P, then L can be accepted by
an algorithm A in polynomial time. So, a certificate exists for
any x ∈ L and A can be used as a verification algorithm. For
example, the problem <G, u, v, k> is in P. If the answer for an
instance is yes, then a path from u to v of length k or less
exists, and such a path can serve as a certificate.
Also, note that a certificate is not unique. Any string that can
be used to prove that the answer is yes can be used as a
certificate. In particular, for a language in P, x itself can be
used as a certificate, because the verification algorithm can
decide x directly.
Example 1
The set-partition problem takes as input a set S of numbers. The
question is whether the numbers can be partitioned into two
sets, A and Ā = S – A, such that
Σ_{x ∈ A} x = Σ_{x ∈ Ā} x.
Show that the set-partition problem is in NP.
Solution:
Let the certificate y be a subset Y of the set S. The
verification algorithm A takes the following steps:
(1) Check that every number in Y is a number in S.
(2) Compute the sum of all numbers in the set Y.
(3) Compute the set S – Y.
(4) Compute the sum of all numbers in the set S – Y.
(5) Compare the two sums obtained in (2) and (4). If they are
identical, return yes; otherwise return no.
Each step takes polynomial time, so the set-partition problem
is in NP.
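The five verification steps translate almost line by line into Python. The sketch below is illustrative only; the function name verify_partition is ours, and a Counter is used so that the check also behaves sensibly if the input happens to contain repeated numbers.

```python
from collections import Counter


def verify_partition(S, Y):
    """Return True iff Y is a sub-collection of S whose sum equals the sum of the rest."""
    rest = Counter(S)
    rest.subtract(Counter(Y))  # steps (1) and (3): remove Y from S
    if any(count < 0 for count in rest.values()):
        return False  # Y uses a number that is not available in S
    # steps (2), (4), (5): compare the two sums
    return sum(Y) == sum(rest.elements())
```

For instance, with S = [1, 5, 11, 5] the certificate Y = [11] is accepted, because the remaining numbers 1, 5, 5 also sum to 11.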
34.3 NP-Completeness and Reducibility
Definition 11 A language L1 is polynomial-time reducible to a
language L2, written L1 ≤p L2, if there exists a polynomial-time
computable function f : {0, 1}* → {0, 1}* such that for all
x ∈ {0, 1}*, x ∈ L1 if and only if f(x) ∈ L2. The function f is
called the reduction function. A polynomial-time algorithm F
that computes f is called a reduction algorithm.
Figure 34-1 illustrates the reduction function f.
[Fig. 34-1: the reduction function f maps {0, 1}* to {0, 1}*, sending strings in L1 to strings in L2.]
Note that the reduction function need not be one-to-one or
onto. It may be a many-to-one function, and some instances in
L2 may be left unmapped.
Lemma 34.3
If L1, L2 ⊆ {0, 1}* are languages such that L1 ≤p L2, then
L2 ∈ P implies L1 ∈ P.
Proof. Let A2 be a polynomial-time algorithm that decides L2,
and let F be a polynomial-time reduction algorithm that
computes the reduction function f. We shall show how to design
a polynomial-time algorithm A1 that decides L1.
Fig. 34-2 illustrates the design of A1.
[Fig. 34-2: algorithm A1 feeds x to F to obtain f(x) and then runs A2 on f(x); A2 answers yes (f(x) ∈ L2) exactly when x ∈ L1, and no (f(x) ∉ L2) exactly when x ∉ L1.]
Algorithm A1(x)
    call algorithm F to transform x into f(x)
    call algorithm A2(f(x)) to test if f(x) ∈ L2
    if A2(f(x)) = 1            // this means f(x) ∈ L2
        then return A1(x) = 1  // x ∈ L1
        else return A1(x) = 0  // x ∉ L1
End
The algorithm correctly decides L1. Its running time is
polynomial: F runs in time polynomial in |x|, so |f(x)| is
polynomial in |x|, and A2 then runs in time polynomial in
|f(x)|.
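The construction of A1 is just function composition, which a few lines of Python make concrete. The two toy languages below are our own choice for illustration and are not from the notes: L1 is the set of binary strings with an even number of 1s, and L2 is the set of even-length strings consisting only of 0s.

```python
def make_decider(reduce_f, decide_l2):
    """Build A1 from a reduction F and a decider A2 for L2: A1(x) = A2(f(x))."""
    def decide_l1(x):
        return decide_l2(reduce_f(x))
    return decide_l1


def f(x):
    """Toy reduction: x is in L1 if and only if f(x) is in L2."""
    return "0" * x.count("1")


def A2(y):
    """Toy decider for L2: even-length strings made up only of 0s."""
    return int(set(y) <= {"0"} and len(y) % 2 == 0)


A1 = make_decider(f, A2)
```

Here A1("1010") returns 1 and A1("100") returns 0, and the running time of A1 is the time of f plus the time of A2 on an input of length at most |x|.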
The class NPC
Definition 12 A language L ⊆ {0, 1}* is called NP-Complete if
the following two conditions hold:
(1) L ∈ NP, and
(2) L’ ≤p L for every L’ ∈ NP.
If a language L satisfies condition (2), but not necessarily (1),
then we say that L is NP-hard.
From the definition, any NP-Complete problem is also an NP-hard problem.
Definition 13 The set of all NP-Complete problems is called
the NP-Complete class or the NPC class. That is NPC = {L | L is
NP-Complete}.
Theorem 34.4
If any NP-Complete problem is polynomial-time solvable,
then P = NP. Equivalently, if any problem in NP is not
polynomial-time solvable, then no NP-Complete problem is
polynomial-time solvable.
Proof. Suppose L ∈ P and also L ∈ NPC. By the definition
of NP-Completeness, L’ ≤p L for every L’ ∈ NP. From Lemma
34.3, we then have L’ ∈ P for every L’ ∈ NP. Therefore,
NP ⊆ P, and since P ⊆ NP, P = NP. The second statement is the
contrapositive of the first, so it is also true. ∎
So far it is not known whether P = NP or P ≠ NP, although most
people believe P ≠ NP. This is the most famous open problem in
computer science. If P = NP, then every language in P other
than ∅ and {0, 1}* is also NP-Complete. Otherwise, P ⊂ NP,
NPC ⊂ NP, and P ∩ NPC = ∅, as illustrated by Fig. 34-3.
[Fig. 34-3: NP drawn as a region containing two disjoint regions, P and NPC.]
Circuit Satisfiability
We now show that the NPC class is not empty by showing that
the circuit satisfiability problem is NP-Complete.
Definition 14 A Boolean combinational circuit composed of
AND, OR, and NOT gates is satisfiable if a set of input values
can be found such that the output of the circuit is 1.
Example 2
Fig. 34-4 shows two circuits, one is satisfiable and the other is
not.
[Fig. 34-4: two circuits with inputs x1, x2, x3. (a) A satisfiable circuit. (b) An unsatisfiable circuit.]
Suppose a combinational circuit C is encoded as a binary
sequence <C>. Then the circuit satisfiability problem
corresponds to the following language:
CIRCUIT-SAT = {<C> | C is a satisfiable circuit}.
Lemma 34.5
CIRCUIT-SAT ∈ NP.
Proof. We design a two-input polynomial-time algorithm A
that verifies CIRCUIT-SAT. One input is the circuit C and
the other is a certificate consisting of an assignment of
Boolean values to the wires of C.
The algorithm A is constructed as follows. For each logic
gate, it checks that the value the certificate gives to the
gate’s output wire is correctly computed from the values on its
input wires. Then, if the final output of the entire circuit
is 1, the algorithm outputs 1. Otherwise A outputs 0. When the
circuit is satisfiable, a certificate exists, and its length is
linear in the size of the circuit. The time to verify is linear
in the number of gates and wires. Thus A is a polynomial-time
algorithm. Therefore, CIRCUIT-SAT ∈ NP. ∎
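The verification algorithm from the proof can be sketched as follows. The circuit representation is an assumption of ours: each gate is a tuple (kind, input wires, output wire), and the certificate is a dictionary giving a 0/1 value to every wire, including the circuit inputs.

```python
def verify_circuit(gates, output_wire, wires):
    """Check a CIRCUIT-SAT certificate.

    gates: list of (kind, input_wires, out_wire), kind in {"AND", "OR", "NOT"};
    wires: dict mapping every wire name to 0 or 1 (the certificate).
    Returns 1 iff every gate is consistent with `wires` and the output wire is 1.
    """
    for kind, ins, out in gates:
        vals = [wires[w] for w in ins]
        if kind == "AND":
            expected = int(all(vals))
        elif kind == "OR":
            expected = int(any(vals))
        elif kind == "NOT":
            expected = 1 - vals[0]
        else:
            raise ValueError(f"unknown gate kind: {kind}")
        if wires[out] != expected:
            return 0  # the certificate misreports this gate's output
    return int(wires[output_wire] == 1)
```

Every gate is inspected once and each inspection reads only that gate's wires, so the running time is linear in the size of the circuit description.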
Lemma 34.6
CIRCUIT-SAT is NP-hard.
Proof. Omitted.
Theorem 34.7
CIRCUIT-SAT ∈ NPC.
Proof. This follows directly from Lemmas 34.5 and 34.6. ∎
34.4 NP-Completeness Proofs
Lemma 34.8
If L is a language such that L’ ≤p L for some L’ ∈ NPC,
then L is NP-hard. Moreover, if L ∈ NP, then L ∈ NPC.
Proof. Since L’ is NP-Complete, for any L’’ ∈ NP, we have
L’’ ≤p L’. Because L’ ≤p L, we have L’’ ≤p L by transitivity.
(See Exercise 34.3-2.) Therefore, L is NP-hard. Moreover, if
L ∈ NP, then L ∈ NPC by definition. ∎
A Method for Proving that a language L is NP-Complete
Based on Lemma 34.8, we often use the following steps to prove
that a language L is NP-Complete.
(1) Prove L ∈ NP.
(2) Select a known NP-Complete language L’.
(3) Describe an algorithm F that transforms every instance
x ∈ {0, 1}* of L’ into an instance f(x) of L.
(4) Prove that x ∈ L’ if and only if f(x) ∈ L for all
x ∈ {0, 1}*.
(5) Prove that the algorithm F runs in polynomial time.
In the following, we study an NPC problem.
Formula Satisfiability
We define the formula satisfiability problem in terms of the
language SAT.
An instance of SAT is a Boolean formula φ which consists
of:
(1) n Boolean variables: x1, x2, …, xn;
(2) m Boolean connectives. Each connective has one or two
inputs and one output. Possible connectives are:
∧ (AND), ∨ (OR), ¬ (NOT), → (implication), ↔ (if and only if);
(3) Parentheses used to define the order of connectives. We
assume there are no redundant parentheses.
A truth assignment for a formula φ is a set of values for the
variables of φ.
A satisfying assignment is a truth assignment such that φ = 1.
A formula is satisfiable if it has a satisfying assignment.
SAT = {<φ> | φ is a satisfiable Boolean formula}.
Example 3
φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2 is satisfiable.
A satisfying assignment is <x1 = 0, x2 = 0, x3 = 1, x4 = 1>.
φ = ((0 → 0) ∨ ¬((¬0 ↔ 1) ∨ 1)) ∧ ¬0
= (1 ∨ ¬(1 ∨ 1)) ∧ 1
= (1 ∨ 0) ∧ 1
= 1.
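The evaluation in Example 3 can be replayed mechanically. The sketch below assumes the formula reads φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2; the helper names implies and phi are ours. It also enumerates all 16 truth assignments, which is the brute-force (exponential-time) way to test satisfiability.

```python
from itertools import product


def implies(a, b):
    """Truth value of a -> b."""
    return (not a) or b


def phi(x1, x2, x3, x4):
    """phi = ((x1 -> x2) or not((not x1 <-> x3) or x4)) and not x2."""
    return (implies(x1, x2) or not (((not x1) == x3) or x4)) and (not x2)


# all satisfying assignments, found by exhaustive search over 2**4 cases
satisfying = [a for a in product([False, True], repeat=4) if phi(*a)]
```

Checking one given assignment, as the certificate-based verifier does, takes a single call to phi; finding an assignment by enumeration takes 2^n calls for n variables.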
Theorem 34.9
SAT ∈ NPC.
Proof. We first prove that SAT ∈ NP. Given a certificate
consisting of a satisfying assignment for a formula φ, the
verifying algorithm simply replaces each variable in the formula
with its corresponding value and then evaluates the expression.
This can be done in polynomial time. So, SAT ∈ NP.
Now, we prove that SAT is NP-hard. We will show that
CIRCUIT-SAT ≤p SAT by showing how to transform a circuit into
a formula.
The transformation takes the following steps:
(1) Create n variables x1, x2, …, xn for the n input lines of
the circuit.
(2) Let m be the number of gates in the circuit. Create a new
variable xn+i for the output wire of gate i, 1 ≤ i ≤ m.
(3) For gate i, 1 ≤ i ≤ m, create a simple formula fi that
establishes an “if and only if” relation between its input
variables and its output variable. Specifically,
(3.1) if gate i is a NOT gate and the input variable is xj,
then fi = (xn+i ↔ ¬xj);
(3.2) if gate i is an OR gate and the input variables are
xa, xb, …, xc, then fi = (xn+i ↔ (xa ∨ xb ∨ … ∨ xc));
(3.3) if gate i is an AND gate and the input variables are
xa, xb, …, xc, then fi = (xn+i ↔ (xa ∧ xb ∧ … ∧ xc)).
(4) Let xn+m be the variable corresponding to the output
wire of the circuit. Then, the formula is
φ = xn+m ∧ f1 ∧ f2 ∧ … ∧ fm.
Fig. 34-5 shows an example.
[Fig. 34-5: a circuit with inputs x1, x2, x3, seven gates, internal wires x4 through x9, and output wire x10.]
f1 = (x4 ↔ ¬x3), f2 = (x5 ↔ (x1 ∨ x2)), f3 = (x6 ↔ ¬x4),
f4 = (x7 ↔ (x1 ∧ x2 ∧ x4)), f5 = (x8 ↔ (x5 ∨ x6)),
f6 = (x9 ↔ (x6 ∨ x7)), f7 = (x10 ↔ (x7 ∧ x8 ∧ x9)).
φ = x10 ∧ (x4 ↔ ¬x3)
∧ (x5 ↔ (x1 ∨ x2))
∧ (x6 ↔ ¬x4)
∧ (x7 ↔ (x1 ∧ x2 ∧ x4))
∧ (x8 ↔ (x5 ∨ x6))
∧ (x9 ↔ (x6 ∨ x7))
∧ (x10 ↔ (x7 ∧ x8 ∧ x9)).
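The gate-by-gate transformation is easy to mechanize. The following sketch is illustrative and rests on our own assumptions: a circuit is a list of (kind, input variables, output variable) tuples, the formula is produced as a plain string with ~, &, | and <-> standing for NOT, AND, OR and "if and only if", and the gate types listed for the example circuit are the ones assumed in f1 through f7 above (x4 = NOT x3, x5 = x1 OR x2, and so on).

```python
def circuit_to_formula(gates, output_var):
    """Translate a circuit into a formula string, one 'iff' sub-formula per gate.

    gates: list of (kind, input_vars, out_var), kind in {"AND", "OR", "NOT"}.
    """
    ops = {"AND": " & ", "OR": " | "}
    parts = [output_var]  # the output wire itself must be 1
    for kind, ins, out in gates:
        if kind == "NOT":
            rhs = "~" + ins[0]
        else:
            rhs = "(" + ops[kind].join(ins) + ")"
        parts.append(f"({out} <-> {rhs})")
    return " & ".join(parts)


# The example circuit, with wires named x1..x10 (gate types are assumed).
example_circuit = [
    ("NOT", ["x3"], "x4"),
    ("OR", ["x1", "x2"], "x5"),
    ("NOT", ["x4"], "x6"),
    ("AND", ["x1", "x2", "x4"], "x7"),
    ("OR", ["x5", "x6"], "x8"),
    ("OR", ["x6", "x7"], "x9"),
    ("AND", ["x7", "x8", "x9"], "x10"),
]
print(circuit_to_formula(example_circuit, "x10"))
```

The function makes one pass over the gates and writes a sub-formula whose length is proportional to the gate's number of inputs, which is why the transformation takes linear time.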
Now we prove that the circuit is satisfiable if and only if φ is
satisfiable.
(1) Suppose the circuit is satisfiable.
Let x1, x2, …, xn be input values that satisfy the circuit.
We use the same values for the variables x1, x2, …, xn in
the formula φ. Moreover, we use the value output by gate i
for the variable xn+i in φ. Because each formula fi
correctly defines the function of gate i, the value of each
fi will be 1. Since the circuit outputs 1, xn+m = 1 as well.
So, if we evaluate φ, we get 1, which means φ is satisfiable.
(2) Suppose φ is satisfiable.
Let a set of values of x1, x2, …, xn, xn+1, …, xn+m
satisfy φ. We use the same values of x1, x2, …, xn as the
input values to the circuit. Because φ = 1, each formula fi
must equal one, and xn+m = 1. Because fi correctly defines
the function of gate i, the value on the output wire of
gate i must equal the value of xn+i. In particular, the
output of the circuit equals xn+m, which is one. Therefore,
the circuit is satisfiable.
The transformation takes linear time. Thus,
CIRCUIT-SAT ≤p SAT, which completes the proof that
SAT ∈ NPC. ∎
3-SAT
3-SAT is a short name for the 3-CNF satisfiability problem.
Definition 15 A literal in a Boolean formula is an occurrence of
a variable x or its negation ¬x.
Definition 16 A Boolean formula is in conjunctive normal form
(CNF) if it is expressed as an AND of clauses, where a clause is
the OR of one or more literals.
Definition 17 A 3-CNF is a CNF in which each clause has
exactly three distinct literals.
Example 4
The following formula is a 3-CNF.
φ = (x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4).
Theorem 34.10
3-SAT ∈ NPC.
Proof. Omitted.
34.5 NP-Complete Problems
In this section, we study several of the best-known NPC
problems. We will learn some proof skills and techniques
from these examples. We prove that a new problem is NPC by
polynomially reducing a known NPC problem to it. Fig. 34-6
shows the NPC problems to be studied and the reductions
between them. The first two reductions have already been
discussed.
[Fig. 34-6: the structure of the reductions among CIRCUIT-SAT, SAT, 3-SAT, SUBSET-SUM, CLIQUE, VERTEX-COVER, HAM-CYCLE, and TSP.]
The Clique Problem
A clique in an undirected graph G(V, E) is a subset V’ ⊆ V
of vertices such that every two of them are adjacent. So, a
clique induces a complete subgraph of G.
The clique problem is to find a clique of maximum size.
This is an optimization problem. A corresponding decision
problem is to decide whether graph G has a clique of size k.
This problem can be defined as the following language.
CLIQUE = {<G, k> | G is a graph with a clique of size k}.
Theorem 34.11
CLIQUE ∈ NPC.
Proof. First, we prove that CLIQUE ∈ NP. Given a
certificate that consists of k vertices, it is easy to check
whether the k vertices form a k-clique. Checking whether two
vertices are adjacent needs at most O(n) time, by scanning the
input once, where n is the input size. So, the verification can
be done in O(k^2 n) time.
Now, we prove that CLIQUE is NP-hard by proving
3-SAT ≤p CLIQUE. Let φ = C1 ∧ C2 ∧ … ∧ Ck be the input for the
3-SAT problem, where each Cr is a clause with three literals.
Let Cr = (l_1^r ∨ l_2^r ∨ l_3^r), 1 ≤ r ≤ k.
We construct a graph G(V, E) from φ. The vertex set V
contains 3k vertices:
V = {v_1^r, v_2^r, v_3^r | 1 ≤ r ≤ k}.
For edges, (v_i^r, v_j^s) ∈ E if the following two conditions
hold:
(1) r ≠ s
(2) l_i^r ≠ ¬l_j^s
The first condition means v_i^r and v_j^s are in different
triples. The second means the corresponding literals are not
complements of each other.
Fig. 34-7 shows the graph constructed from the formula
φ = (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3).
[Fig. 34-7: the graph G, with one triple of vertices for each of the clauses C1, C2, and C3.]
We will show that φ is satisfiable if and only if the constructed
graph G(V, E) has a k-clique.
Suppose φ has a satisfying assignment. Then, each clause
Cr has at least one literal l_i^r = 1. Its corresponding vertex
in G is v_i^r. Selecting one such literal from each clause, we
get k corresponding vertices in G. Any two of these k vertices
are adjacent because they belong to different triples and their
corresponding literals are not complements of each other; a
literal and its complement cannot both equal 1. Therefore,
these k vertices form a k-clique.
Now, suppose G has a clique V’ of size k. Then, any two
vertices in V’ must belong to different triples, so V’ contains
exactly one vertex from each triple. We assign one to the k
corresponding literals in φ. That is, we assign l_i^r = 1 if
v_i^r ∈ V’.
If v_i^r ∈ V’, then any vertex u corresponding to the
complement of l_i^r is not in V’ because (u, v_i^r) ∉ E. Thus,
this assignment never assigns one to both a variable and its
negation. After this, we assign 0 to the literals that are
negations of the assigned literals. If any variables remain
unassigned, we arbitrarily assign each of them one and its
negation zero. This assignment satisfies the formula φ because
every clause contains a literal that is assigned one.
Because the construction of the graph takes polynomial time,
3-SAT ≤p CLIQUE, which proves CLIQUE ∈ NPC. ∎
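The graph construction in this proof can be sketched in Python. The encoding is an assumption of ours: a literal is a nonzero integer (i for xi, -i for ¬xi), a clause is a tuple of three literals, and vertex (r, i) stands for the i-th literal of clause r. The helper has_clique is a brute-force check included only so that tiny instances can be tested; it is exponential and is not part of the reduction.

```python
from itertools import combinations


def formula_to_clique_instance(clauses):
    """Build (vertices, edges, k) from a 3-CNF formula."""
    vertices = [(r, i) for r, clause in enumerate(clauses) for i in range(len(clause))]
    edges = set()
    for (r, i), (s, j) in combinations(vertices, 2):
        different_triple = r != s                      # condition (1)
        consistent = clauses[r][i] != -clauses[s][j]   # condition (2)
        if different_triple and consistent:
            edges.add(frozenset([(r, i), (s, j)]))
    return vertices, edges, len(clauses)


def has_clique(vertices, edges, k):
    """Brute-force clique test (exponential; only for checking tiny examples)."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(vertices, k)
    )
```

The construction examines each pair of the 3k vertices once, so it runs in O(k^2) time, in line with the claim that the reduction is polynomial.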
The Vertex-Cover Problem
A vertex cover of a graph G(V, E) is a vertex subset V’ ⊆ V
such that if (u, v) ∈ E, then u ∈ V’ or v ∈ V’ or both.
The vertex cover problem is an optimization problem: find a
vertex cover of minimum size. Its decision problem can be
defined by the following language:
VERTEX-COVER = {<G, k> | G has a vertex cover of size k}.
Theorem 34.12
VERTEX-COVER ∈ NPC.
Proof. We prove VERTEX-COVER ∈ NP first. Let the
certificate be a set of vertices V’ ⊆ V. The verification
algorithm checks that the following are true: (1) |V’| = k.
(2) For every edge (u, v) ∈ E, either u ∈ V’ or v ∈ V’.
Obviously, this verification can be done in polynomial time.
Now, we prove that VERTEX-COVER is NP-hard by showing
CLIQUE ≤p VERTEX-COVER. Let G(V, E) be the graph for
the CLIQUE problem. We construct a new graph for the
VERTEX-COVER problem. The construction is easy: the new graph
is the complement graph of G, that is, Ḡ = (V, E’), where
E’ = {(u, v) | u, v ∈ V, u ≠ v, and (u, v) ∉ E}.
[Fig. 34-8: (a) a graph G on vertices u, v, w, x, y, z; (b) its complement Ḡ.]
Let |V| = n and k’ = n – k.
We shall show that G has a k-clique if and only if Ḡ has a
vertex cover of size k’.
Suppose G has a k-clique V’ ⊆ V. We claim that V – V’ is
a vertex cover of Ḡ. To see this, look at any edge (u, v) ∈ E’.
Then (u, v) ∉ E, so u and v cannot both belong to the clique V’.
Hence u or v must belong to V – V’. So, V – V’ is a vertex cover
of Ḡ with size |V – V’| = n – k = k’.
Conversely, suppose Ḡ has a vertex cover V’ ⊆ V, where
|V’| = n – k = k’. Then, for any u, v ∈ V, if (u, v) ∈ E’, then
u ∈ V’ or v ∈ V’ or both. This implies that if u ∉ V’ and
v ∉ V’ with u ≠ v, then (u, v) ∉ E’, that is, (u, v) ∈ E.
Therefore, V – V’ is a clique of G with size
|V – V’| = n – k’ = k.
Thus, we have proved CLIQUE ≤p VERTEX-COVER.
So, VERTEX-COVER ∈ NPC. ∎
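The relationship between cliques of G and vertex covers of its complement can be checked on small graphs. The sketch below uses helper names of our own (complement_edges, is_clique, is_vertex_cover) and represents an undirected edge as a pair of vertices.

```python
from itertools import combinations


def complement_edges(vertices, edges):
    """Edges of the complement graph: all unordered pairs that are not edges of G."""
    E = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - E


def is_clique(edges, subset):
    """True iff every two distinct vertices of subset are adjacent."""
    E = {frozenset(e) for e in edges}
    return all(frozenset(p) in E for p in combinations(subset, 2))


def is_vertex_cover(edges, subset):
    """True iff every edge has at least one endpoint in subset."""
    chosen = set(subset)
    return all(chosen & set(e) for e in edges)
```

For any graph and any vertex subset C, is_clique(E, C) holds exactly when is_vertex_cover(complement_edges(V, E), set(V) - set(C)) holds, which is the equivalence used in the proof.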
The Hamiltonian Cycle Problem
We have defined this problem before. Now we prove its
NP-Completeness.
Theorem 34.13
The Hamiltonian Cycle problem is NP-Complete.
Proof. The proof is given in the book. Because it is too
lengthy, we omit it here.
The Traveling-Salesman Problem
A traveling salesman wishes to make a tour, visiting each
city exactly once and returning to the starting city. Suppose
there is a direct connection between any two cities, so finding
such a tour is easy. The problem is that there is a cost
associated with each connection, and the traveling salesman
wants to minimize the total cost. We formalize this
optimization problem in graph terminology as follows:
Given a weighted complete graph G(V, E), find a
Hamiltonian cycle whose total weight (cost) is minimized.
A corresponding decision problem can be defined as:
Given a weighted complete graph G(V, E) and a
number k, does G have a Hamiltonian cycle whose total weight
is k or less? We assume all weights are integers.
We can also define this problem by the following language:
TSP = {<G, c, k> | G(V, E) is a complete graph, c is a function
V × V → Z, k ∈ Z, and G has a Hamiltonian cycle
with cost ≤ k}.
Theorem 34.14
TSP ∈ NPC.
Proof. It is easy to see that TSP ∈ NP. We will show that
TSP is NP-hard by showing HAM-CYCLE ≤p TSP. Let G(V, E)
be an instance of HAM-CYCLE. We construct an instance of
TSP as follows:
The instance for TSP is a graph G’(V’, E’), where V’ = V and
G’ is a complete graph. So, E’ = {(i, j) | i, j ∈ V and i ≠ j}.
The weight (cost) of each edge is defined in this way:
c(i, j) = 0 if (i, j) ∈ E,
c(i, j) = 1 if (i, j) ∉ E.
Then, <G’, c, 0> is the instance for TSP. This reduction takes
polynomial time. A tour in G’ has total cost 0 exactly when it
uses only edges of E, so G has a Hamiltonian cycle if and only
if G’ has a salesman tour of total cost at most 0. Therefore
TSP ∈ NPC. ∎
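The reduction from HAM-CYCLE to TSP can be sketched as follows. The function names are ours, the graph is assumed to be undirected with at least three vertices, and tsp_decision is a brute-force check (factorial time) included only so that the reduction can be tried on tiny inputs.

```python
from itertools import combinations, permutations


def ham_cycle_to_tsp(vertices, edges):
    """Return (cost, k): cost 0 on edges of G, 1 on non-edges, and bound k = 0."""
    E = {frozenset(e) for e in edges}
    cost = {
        frozenset(p): (0 if frozenset(p) in E else 1)
        for p in combinations(vertices, 2)
    }
    return cost, 0


def tsp_decision(vertices, cost, k):
    """Brute-force TSP decision: is there a tour of total cost at most k?"""
    first, *rest = vertices
    for order in permutations(rest):
        tour = (first, *order, first)
        total = sum(cost[frozenset((a, b))] for a, b in zip(tour, tour[1:]))
        if total <= k:
            return True
    return False
```

The reduction itself only fills in one cost per pair of vertices, so it takes O(|V|^2) time; all of the difficulty remains in deciding the resulting TSP instance.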
The Subset-Sum Problem
In the subset-sum problem, we are given a finite set S ⊂ N
and a target number t ∈ N. We ask whether there is a subset
S’ ⊆ S whose elements sum to t. For example,
if S = {1, 2, 7, 8, 14} and t = 15, then S’ = {7, 8} is a solution.
Formally, we can define
SUBSET-SUM = {<S, t> | there exists a subset S’ ⊆ S such that
Σ_{s ∈ S’} s = t}.
Theorem 34.15
SUBSET-SUM ∈ NPC.
Proof. First we prove that SUBSET-SUM ∈ NP. Let the
certificate be a subset S’ of S; then checking Σ_{s ∈ S’} s = t
can easily be done in polynomial time. Now we prove that
SUBSET-SUM is NP-hard by showing 3-SAT ≤p SUBSET-SUM.
Let formula φ be the input to the 3-SAT problem. We will
construct an instance <S, t> of SUBSET-SUM.
Without loss of generality, we assume
(1) No clause contains both x and ¬x. Such a clause is always
true and can be deleted.
(2) Each variable appears in at least one clause.
Suppose φ has n variables x1, x2, …, xn and k clauses C1,
C2, …, Ck.
The instance <S, t> will have 2(n + k) decimal numbers in the
set S, two for each variable and two for each clause. Each
number has (n+k) digits, labeled by the n variables and k
clauses as illustrated by Fig. 34-8. The number t is also an
(n+k)-digit number.
x1  x2  …  xn  C1  C2  …  Ck
Fig. 34-8 The structure of the (n+k) digits.
Specifically, we do the following.
(1) For each variable xi, generate two numbers vi and vi’,
one for xi itself and the other for its complement ¬xi.
The (n+k) digits of vi are determined as follows:
the digit under xi is 1. If xi appears in Cj, then the digit
under Cj is 1. All other digits are 0.
The (n+k) digits of vi’ are determined as follows:
the digit under xi is 1. If ¬xi appears in Cj, then the
digit under Cj is 1. All other digits are 0.
(2) For each clause Cj, generate two numbers sj and sj’. The
number sj has a zero in all digits except the digit
under Cj, which is 1. The number sj’ has a zero in all
digits except the digit under Cj, which is 2.
(3) The number t has a one in each of the first n digits,
corresponding to the n variables x1, x2, …, xn. It has a 4
in each of the last k digits, corresponding to the k clauses
C1, C2, …, Ck.
Example 5
Figure 34-9 shows how the number t and the set S of 14 numbers
are generated from the formula φ = C1 ∧ C2 ∧ C3 ∧ C4, where
C1 = (x1 ∨ ¬x2 ∨ ¬x3)
C2 = (¬x1 ∨ ¬x2 ∨ ¬x3)
C3 = (¬x1 ∨ ¬x2 ∨ x3)
C4 = (x1 ∨ x2 ∨ x3)
       x1  x2  x3  C1  C2  C3  C4
v1  =   1   0   0   1   0   0   1
v1’ =   1   0   0   0   1   1   0
v2  =   0   1   0   0   0   0   1
v2’ =   0   1   0   1   1   1   0
v3  =   0   0   1   0   0   1   1
v3’ =   0   0   1   1   1   0   0
s1  =   0   0   0   1   0   0   0
s1’ =   0   0   0   2   0   0   0
s2  =   0   0   0   0   1   0   0
s2’ =   0   0   0   0   2   0   0
s3  =   0   0   0   0   0   1   0
s3’ =   0   0   0   0   0   2   0
s4  =   0   0   0   0   0   0   1
s4’ =   0   0   0   0   0   0   2
t   =   1   1   1   4   4   4   4
Fig. 34-9
This construction of <S, t> clearly takes polynomial time.
Now, we show that φ is satisfiable if and only if <S, t> has a
yes answer.
(1) Suppose φ has a satisfying assignment. We select
numbers from the set S as follows.
Check each xi. If xi = 1, we include the number vi
in set S’; otherwise we include vi’. After the n variables
have been checked, we add up the numbers selected so far
in set S’. Let this sum be r. From the construction, it
is easy to see that the number r has a one in each of the
first n digits. For example, in Example 5, φ has the
satisfying assignment x1 = 0, x2 = 0, x3 = 1. So, v1’, v2’,
and v3 are selected, and r = 1111231. The number r ≠ t
yet. We notice that each of the last k digits in r must be
1, 2, or 3. This is because every clause has at least one
and at most three literals that equal one, and we have
selected exactly those numbers whose corresponding
literals equal one.
Now, we check each of the last k digits in the
number r. If the digit under Cj is one, we include both
numbers sj and sj’ in the set S’. If it is two, we include
the number sj’. If it is three, we include the number sj.
Now, the sum of all numbers in set S’ is equal to t. This
is because adding sj to r increases the digit under Cj by
one without changing other digits, adding sj’ increases
the digit under Cj by two, and adding both sj and sj’
increases it by three. Therefore, the way we select the
numbers makes each of the last k digits equal to 4 in the
sum of all numbers in set S’, and the instance we
constructed has a yes answer. In Example 5,
S’ = {v1’, v2’, v3, s1, s1’, s2’, s3, s4, s4’}, and the sum
of these numbers equals t.
(2) Suppose the instance <S, t> we have constructed has a
yes answer. That is, there is a subset S’ ⊆ S such that
Σ_{s ∈ S’} s = t. We will show a satisfying assignment for
the formula φ.
Each digit position sums to at most 6 over all numbers
in S (three literals plus 1 + 2 in a clause position), so
no carries occur and the sum can be examined digit by
digit. From the construction of <S, t>, S’ must include
either vi or vi’, but not both, so that the sum has a one
in each of the first n digits. We assign xi = 1 if vi ∈ S’,
and xi = 0 otherwise, 1 ≤ i ≤ n. Now, we show this
assignment satisfies φ. The sum t has a 4 in each of the
last k digits, corresponding to the k clauses Cj,
1 ≤ j ≤ k, while sj and sj’ together contribute at most 3
to the digit under Cj. So S’ must include some vi or vi’
whose literal appears in Cj. This means that some literal
in Cj is assigned one. Therefore, every Cj, 1 ≤ j ≤ k, is
satisfied and hence φ is satisfied too. ∎
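The construction of <S, t> can be reproduced with a short Python sketch and checked against Example 5. The encoding is our own assumption: a literal is a nonzero integer (i for xi, -i for ¬xi), and S is returned as a dictionary from row names to numbers so that the rows can be compared with Fig. 34-9. The clauses passed in below are those of Example 5 as read off the digit table.

```python
def formula_to_subset_sum(n, clauses):
    """Build (S, t) from a 3-CNF formula over x_1..x_n."""
    k = len(clauses)

    def number(var_digit, clause_digits):
        # assemble an (n + k)-digit decimal number from its digits
        digits = [0] * (n + k)
        if var_digit is not None:
            digits[var_digit] = 1
        for j, d in clause_digits.items():
            digits[n + j] = d
        return int("".join(map(str, digits)))

    S = {}
    for i in range(1, n + 1):
        S[f"v{i}"] = number(i - 1, {j: 1 for j, c in enumerate(clauses) if i in c})
        S[f"v{i}'"] = number(i - 1, {j: 1 for j, c in enumerate(clauses) if -i in c})
    for j in range(k):
        S[f"s{j + 1}"] = number(None, {j: 1})
        S[f"s{j + 1}'"] = number(None, {j: 2})
    t = int("1" * n + "4" * k)
    return S, t


# Example 5, with the satisfying assignment x1 = 0, x2 = 0, x3 = 1
S, t = formula_to_subset_sum(3, [(1, -2, -3), (-1, -2, -3), (-1, -2, 3), (1, 2, 3)])
chosen = ["v1'", "v2'", "v3", "s1", "s1'", "s2'", "s3", "s4", "s4'"]
print(sum(S[name] for name in chosen) == t)  # True
```

The three variable rows chosen here sum to 1111231, and the six slack rows add 3213, giving t = 1114444.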
End of Chapter 34.