CSE596 Problem Set 9 Answer Key Fall 2015

(A) Discussion Problem: Harking back to the discussion problem on Problem Set 2, show that the
following problem is NP-hard. More precisely, show that the following problem is complete for co-NP
under polynomial-time many-one reductions (3 pts. checkoff):
Instance: A regular expression R with no Kleene stars and an integer n > 0.
Question: Does R match all strings in {0, 1}^n ?
Answer: The problem belongs to co-NP because there is an n^{O(1)}-time procedure to test whether
a given string y ∈ {0, 1}^n matches R (this is true even in the presence of *s by problem 1(a) on Prelim
II) and the question involves a “for all” quantifier on y.
It is complete for co-NP under ≤^p_m because we have reduced DNF-TAUT to it, and DNF-TAUT
being complete for co-NP follows from CNF-SAT being complete for NP under ≤^p_m.
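The reduction from DNF-TAUT is only cited above, not spelled out. As a minimal sketch, assuming the usual encoding (each DNF term becomes a length-n pattern with 1 for a positive literal, 0 for a negated one, and (0|1) for an unmentioned variable), it might look like the following. The names dnf_to_regex and matches_all and the signed-integer encoding of literals are illustrative assumptions, and the exhaustive check is exponential, so it is only a sanity test for small n.

```python
import re
from itertools import product

def dnf_to_regex(terms, n):
    """Map a DNF formula to a star-free regex over {0,1}.

    terms: list of terms; each term is a list of nonzero ints,
           +i meaning x_i and -i meaning the negation of x_i (assumed encoding).
    """
    pieces = []
    for term in terms:
        lits = set(term)
        if any(-l in lits for l in lits):
            continue  # a term containing x_i and its negation matches no assignment
        pieces.append("".join(
            "1" if i in lits else "0" if -i in lits else "(0|1)"
            for i in range(1, n + 1)))
    return "|".join(pieces)

def matches_all(regex, n):
    """Brute-force check that regex matches every string in {0,1}^n."""
    return all(re.fullmatch(regex, "".join(bits))
               for bits in product("01", repeat=n))

# x1 OR (NOT x1) is a tautology; (x1 AND x2) OR (NOT x1) is not.
print(matches_all(dnf_to_regex([[1], [-1]], 2), 2))
print(matches_all(dnf_to_regex([[1, 2], [-1]], 2), 2))
```

Under this encoding the regex has length O(nm) for m terms, and it matches all of {0, 1}^n exactly when the DNF formula is a tautology.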
(1) Prove that the problem (c) on question (1d) of Prelim II is NP-hard. More precisely, prove
that the complement of the language {(N, n) : {0, 1}^n ⊆ L(N)} is NP-complete under ≤^p_m. Use a
reduction from 3SAT by transforming a 3CNF formula φ(x1 , . . . , xn ) into an NFA Nφ such that, for
each y ∈ {0, 1}^n, Nφ fails to accept y iff y satisfies all clauses. Explain your construction of Nφ , why it is
computable in polynomial time given φ in 3CNF, and why your reduction is correct. (18 pts.)
Answer: As a first step, given a clause C such as C = (x1 ∨ x̄2 ∨ x̄n ), let’s describe in general
how to build an NFA (or DFA) NC that accepts precisely the binary strings of length n that do not
satisfy C.
Allocate n states labeled x1 , . . . , xn , with x1 the start state, and add an accept state “x_{n+1}” and
a dead state, to make n + 2 states. For i = 1 to n, connect state xi to state x_{i+1} by both an arc on 0
and an arc on 1, unless xi appears in the clause C, in which case the arc on 1 goes to the dead state,
or x̄i appears in C, in which case the arc on 0 goes to the dead state. Then binary strings a of length
n (representing truth assignments to x1 , . . . , xn in order) cause a computation that sidetracks into the
dead state iff a satisfies C, and that passes through to the accept state iff a does not satisfy clause C.
This describes a DFA NC with n + 2 states. For C = (x1 ∨ x̄2 ∨ x̄n ), we get
      0         1                 1
(1)--->---(2)--->---(3) ... (n)--->---((accept))
 |         |                 |
 | 1       | 0               | 0
 V         V                 V
dead      dead              dead
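The clause automaton can be checked mechanically. The following is a minimal sketch of the construction just described, assuming a clause is given as a set of signed integers (+i for xi, -i for x̄i); the function names clause_dfa and run are illustrative only.

```python
from itertools import product

DEAD = "dead"

def clause_dfa(clause, n):
    """Transition table of N_C: states 1..n, accept state n+1, and a dead state.

    clause: set of nonzero ints, +i for x_i and -i for its negation (assumed encoding).
    """
    delta = {}
    for i in range(1, n + 1):
        delta[(i, "0")] = i + 1
        delta[(i, "1")] = i + 1
        if i in clause:        # x_i occurs: the bit 1 satisfies C, so divert to dead
            delta[(i, "1")] = DEAD
        if -i in clause:       # negated x_i occurs: the bit 0 satisfies C
            delta[(i, "0")] = DEAD
    return delta

def run(delta, n, a):
    """Return True iff the DFA accepts the string a."""
    state = 1
    for bit in a:
        # the accept and dead states have no listed arcs; any further input is rejected
        state = delta.get((state, bit), DEAD)
    return state == n + 1

# For C = (x1 OR not-x2 OR not-x3) with n = 3, only the assignment 011 falsifies C.
d = clause_dfa({1, -2, -3}, 3)
print(["".join(b) for b in product("01", repeat=3) if run(d, 3, "".join(b))])
```

The table has at most 2n entries per clause, which is where the n + 2 state bound per clause automaton comes from.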
Now given a CNF formula φ = C1 ∧ C2 ∧ . . . ∧ Cm , we want to show how to build an NFA Nφ such
that L(Nφ ) = L(NC1 ) ∪ L(NC2 ) ∪ · · · ∪ L(NCm ).
Make a start state s, and connect s by a λ-arc to the start state of every NC , for all clauses C in
the formula φ. This creates an NFA N with 1 + (n + 2)m states, where m is the number of clauses in
φ. Note that N has size polynomial in the size of the formula φ. Then L(N ) = L(NC1 ) ∪ · · · ∪ L(NCm ),
which equals the set of all assignment strings a that fail to satisfy some clause. This is the same as the
set of strings a that fail to satisfy the formula φ. Thus L(N ) = {0, 1}^n if and only if φ is unsatisfiable.
Finally, turning that last assertion around, there exists a string a ∈ {0, 1}^n \ L(N ) iff there exists
a string a ∈ {0, 1}^n that satisfies φ, i.e., iff φ is satisfiable. The construction of N described above can be
carried out in polynomial time given φ. Thus the construction defines a polynomial-time reduction
from 3SAT to the stated problem, so the problem is NP-hard.
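To see the whole reduction at work on small formulas, here is a hedged sketch that builds the union NFA and lists the strings it rejects, which should be exactly the satisfying assignments. The representation (states as tuples, clauses as sets of signed integers) and the names build_nfa, accepts and rejected_strings are assumptions made for illustration; the final enumeration is exponential and serves only as a check.

```python
from itertools import product

def build_nfa(clauses, n):
    """Union NFA: a start state s with a λ-arc into a copy of N_C for each clause.

    Returns (lam, delta, accepting), where lam is the set of λ-successors of s.
    States inside copy j are (j, i) for i = 1..n+1, plus (j, "dead").
    """
    lam, delta, accepting = set(), {}, set()
    for j, clause in enumerate(clauses):
        lam.add((j, 1))
        accepting.add((j, n + 1))
        for i in range(1, n + 1):
            # the arc on 0 dies if -i is in the clause, the arc on 1 dies if +i is
            for bit, killer in (("0", -i), ("1", i)):
                target = (j, "dead") if killer in clause else (j, i + 1)
                delta[((j, i), bit)] = {target}
    return lam, delta, accepting

def accepts(nfa, a):
    lam, delta, accepting = nfa
    current = set(lam)  # states reachable from s by its λ-arcs
    for bit in a:
        current = set().union(*(delta.get((q, bit), set()) for q in current))
    return bool(current & accepting)

def rejected_strings(clauses, n):
    """Strings of length n outside L(N); these should be the satisfying assignments."""
    nfa = build_nfa(clauses, n)
    strings = ("".join(bits) for bits in product("01", repeat=n))
    return [a for a in strings if not accepts(nfa, a)]

# (x1 OR x2 OR x3) AND (not-x1 OR not-x2 OR not-x3): all but 000 and 111 satisfy it.
print(rejected_strings([{1, 2, 3}, {-1, -2, -3}], 3))
```

An unsatisfiable formula, such as the conjunction of all eight sign patterns over three variables, should yield an empty list, matching the claim that L(N) = {0, 1}^n iff φ is unsatisfiable.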
(2) Show that the following decision problem is NP-complete:
Dominating Set
Instance: An undirected graph G = (V, E) and an integer k, 1 ≤ k ≤ |V |.
Question: Does there exist a subset U ⊆ V of size at most k such that every other vertex
in V is adjacent to one in U (that is, (∀v ∈ V \ U )(∃u ∈ U ) : (u, v) ∈ E)?
Needless to say, it is forbidden to look this up on the Internet.
For some helpful examples, if G is a “triangle,” any one node forms a dominating set that is not a
vertex cover. If G has 6 vertices and edges in the form of the letter ‘H’ then the only dominating set
of size 2 is obtained by choosing the two middle vertices, and these do not form an independent set.
(24 pts.)
Answer: This one is almost the simplest example of the “ladder/clause-gadget” architecture of a
reduction from 3SAT that I know. Given φ with n variables and m clauses, set k = n and build G with
3n + m nodes as follows: For each variable xi , in place of a simple “rung” connecting xi to its opposite
x̄i we add one more node ti connected to both xi and x̄i in a triangle. And G has just one node cj
for each clause Cj in φ. Node cj is connected to the (up to) 3 literals that belong to the clause Cj .
That finishes the description of what is clearly a polynomial time (in fact, linear time and log space)
computable function f (φ) = ⟨G, n⟩.
For correctness, note that since each ti is connected only to xi and x̄i , there is never any reason to
prefer choosing it when either xi or x̄i will dominate the triangle equally well. No node can dominate
more than one ti , so every dominating set in G has size at least n. Thus without loss
of generality, a size-n dominating set U contains exactly one of xi and x̄i for each i. These choices
correspond to a truth assignment. The set U needs to dominate all nodes cj as well, and this happens
if and only if the assignment U represents satisfies φ. Hence f reduces 3SAT to Dominating Set.
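The construction is short enough to sketch in code. The version below assumes clauses are tuples of signed integers and uses hypothetical node names ("lit", ±i), ("t", i) and ("c", j); the brute-force search over k-subsets is only a way to test the reduction on tiny inputs, not part of the reduction itself.

```python
from itertools import combinations, product

def reduce_to_dominating_set(clauses, n):
    """Return (adjacency dict, k) for the graph G built from a 3CNF formula."""
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for i in range(1, n + 1):
        # triangle on x_i, its negation, and the extra node t_i
        add_edge(("lit", i), ("lit", -i))
        add_edge(("lit", i), ("t", i))
        add_edge(("lit", -i), ("t", i))
    for j, clause in enumerate(clauses):
        adj.setdefault(("c", j), set())
        for lit in clause:
            add_edge(("c", j), ("lit", lit))
    return adj, n

def has_dominating_set(adj, k):
    """Exhaustive check. Trying size exactly k suffices when k <= |V|,
    since any superset of a dominating set is also dominating."""
    nodes = list(adj)
    for chosen in combinations(nodes, k):
        U = set(chosen)
        if all(v in U or adj[v] & U for v in nodes):
            return True
    return False

sat = [(1, 2, 3), (-1, -2, -3)]
unsat = list(product((1, -1), (2, -2), (3, -3)))
print(has_dominating_set(*reduce_to_dominating_set(sat, 3)))
print(has_dominating_set(*reduce_to_dominating_set(unsat, 3)))
```

The first formula is satisfiable and the second is not, so the two checks should disagree in the way the correctness argument predicts.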
(3) Tied s-t Path. You are given a directed acyclic graph G = (V, E) in which each node has
one “left” out-arc and one “right” out-arc, with a distinguished source node s and sink node t. You
are also given a list of “ties” (u, v) which say that if you take the left [right] edge out of u, then you
must also take the left [right] edge out of v. Is there a path from s to t subject to the ties? Show that
this decision problem is NP-complete. (30 pts., for 75 regular-credit points on the set, including the
checkoff credit for A)
[18 pts. extra credit for doing this with the extra condition that no node is tied more than
once, i.e., the ties are disjoint pairs of nodes—you may wish to lean on special properties of the 3CNF
formulas that translate circuits in my Cook-Levin theorem proof.]
Answer: The regular-credit answer avoids headaches caused by the extra condition. “In NP” is
immediate either way, since the path has smaller size than the graph and it is easy to check for each
tie that the path takes the same left/right turn at the two nodes in the tie. Given a 3CNF formula
φ with n variables and m clauses, we design the DAG Gφ to be a chain of m “clause gadgets,” with
some nodes labeled by variable names.
Each clause gadget Cj has an entry node sj , an exit node tj which can be identified with the entry
node sj+1 for the next clause, and two other nodes. The entry node s1 for the first gadget is “s,” and
“t” is taken to be tm . Finally, G has one more sink node r for “reject.”
Within each clause gadget, sj is labeled by one of the three variables appearing in the clause, and
the two other nodes besides tj —call them uj and vj —by the other two variables. All nodes labeled by
a variable xi in the whole graph are tied to each other (or equivalently, successive pairs of them are
tied). Let us designate the left arc out of any node labeled xi as standing for the assignment xi = 0,
and the right arc for xi = 1. If sj is labeled xi and xi occurs positively in Cj , then the right arc goes
to tj —signifying that the clause has already been satisfied—while the left arc goes to uj to poll the
second member of the clause. If xi occurs negatively (i.e., as x̄i ), then the left arc goes to tj while the
right arc goes to uj . Node uj is coded similarly with the satisfying arc going to tj and the unsatisfying
arc going to vj . At vj , however, the unsatisfying arc goes to r. This finishes the description of G. The
function f (φ) = G is computed in one pass through the clauses of φ, hence clearly in polynomial time.
Since every node of G other than t and r (recall tj is the same as sj+1 for j < m) is labeled by a
variable, and all occurrences of a variable are tied, maximal paths in G that respect the ties are in 1-1
correspondence with assignments a ∈ {0, 1}^n . If a satisfies φ, then the corresponding path goes to tj
in every clause gadget and so ends up at t. Conversely, if a path goes from s to t then it must take
a satisfying arc in each clause, which is possible only if the clause is satisfied by the corresponding
assignment. Thus the reduction f is correct.
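As a sanity check on the regular-credit construction, the sketch below builds the chain of clause gadgets and searches for a tie-respecting path by backtracking. It assumes each clause is a 3-tuple of signed integers over distinct variables, ties every pair of nodes carrying the same variable label, and treats a tie as binding only when both nodes lie on the path; the names build_tied_dag, all_pair_ties and tied_path_exists are illustrative.

```python
from itertools import product

def build_tied_dag(clauses):
    """Return (arcs, label): arcs[node] = (left target, right target)."""
    m = len(clauses)
    arcs, label = {}, {}
    for j, clause in enumerate(clauses):
        exit_node = ("s", j + 1) if j + 1 < m else "t"  # t_j doubles as s_{j+1}
        chain = [("s", j), ("u", j), ("v", j), "r"]      # failing v_j leads to reject
        for pos, lit in enumerate(clause):
            node, fail = chain[pos], chain[pos + 1]
            # left arc means variable = 0, right arc means variable = 1
            arcs[node] = (fail, exit_node) if lit > 0 else (exit_node, fail)
            label[node] = abs(lit)
    return arcs, label

def all_pair_ties(label):
    """Tie every pair of distinct nodes labeled by the same variable."""
    return {(a, b) for a in label for b in label
            if a != b and label[a] == label[b]}

def tied_path_exists(arcs, ties, s, t):
    """Backtracking search for an s-t path that respects the ties."""
    def dfs(node, chosen):
        if node == t:
            return True
        if node not in arcs:  # a sink other than t, i.e. the reject node
            return False
        for d in (0, 1):      # 0 = left arc, 1 = right arc
            if all(chosen[p] == d for p in chosen if (p, node) in ties):
                chosen[node] = d
                if dfs(arcs[node][d], chosen):
                    return True
                del chosen[node]
        return False
    return dfs(s, {})

def reduction_says_satisfiable(clauses):
    arcs, label = build_tied_dag(clauses)
    return tied_path_exists(arcs, all_pair_ties(label), ("s", 0), "t")

print(reduction_says_satisfiable([(1, 2, 3), (-1, -2, -3)]))
print(reduction_says_satisfiable(list(product((1, -1), (2, -2), (3, -3)))))
```

The search is exponential in the worst case, as expected for an NP-complete problem; it is here only to confirm on small formulas that a tied s-t path exists exactly when φ is satisfiable.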
Extra-credit answer: To comply with the extra(-credit) condition on ties, we replace each node
labeled xi in a clause gadget by a “sub-gadget.” Each sub-gadget has two “exit nodes” (in parallel),
and ones for the second and later occurrences of xi or x̄i in φ have two “entry nodes” (in sequence).
The first time xi occurs, the 0-arc goes to an exit node labeled yi1 , and the 1-arc to one labeled zi1 .
The 1-arc out of yi1 and the 0-arc out of zi1 then go to r, since they represent internal contradictions.
The 0-arc out of yi1 and the 1-arc out of zi1 go to the (first entry nodes of the subgadgets for the)
same places the 0-arc and 1-arc out of xi went in the original G described above. Note that these
destinations still depend only on whether xi occurs positively in the clause or as x̄i .
The next time xi occurs, the first entry node is also labeled yi1 and is tied to the previous yi1 .
The 1-arc out of this does not go to r. Instead it means xi = 1, since the only way it can be legally
taken in a path is for the earlier part of the path to have gone through zi1 . It goes to the new exit
node zi2 . The 0-arc out of this yi1 does not go immediately to yi2 , but instead to the second entry
node, which is labeled zi1 and tied to the previous zi1 . The 1-arc out of this node goes to r, while the
0-arc continues to yi2 . The meaning is this: If the earlier part of the path chose xi = 0 then it went
through the exit node yi1 , and so must have taken the 0-arc out of the entry node yi1 . It may then
legally take the 0-arc out of the entry node zi1 here, since that choice is not tied. If the earlier part
chose xi = 1, then it is not tied at the entry node yi1 here, but choosing the 0-arc out of yi1 leads to
the zi1 entry node where the tie to the earlier zi1 forces it to oblivion at r. Hence to survive, the path
must exploit the fact that it’s not tied to yi1 by taking the 1-arc there.
Finally, the new exit nodes yi2 and zi2 are coded similarly to the first exit nodes yi1 and zi1 . If
there is a next occurrence of xi , they are tied to the entry nodes for it. This completes the description
of the modified graph G′ , and the reduction f ′ that computes it is clearly still linear-time computable.
And G′ abides by the extra condition on ties. Note that we kept the nodes of G labeled by the first
occurrence of each variable but replaced the others by the entry nodes of sub-gadgets. The correctness
of f ′ follows by the correctness of f and the argument of the previous paragraph. (This problem was
posed to me by German researcher Thomas Thierauf in 1998. I solved it within the week, and then
found that another German named Detlef Seese had solved it in a paper earlier that year.)