TDDD65 Introduction to the Theory of Computation Gustav Nordh

TDDD65
Introduction to the Theory of Computation
Gustav Nordh
Department of Computer and Information Science
gustav.nordh@liu.se
2012-09-19
Complexity
What can be computed efficiently?
Efficient in terms of what?
Energy? Space? Time?
Time Complexity
Definition
The running time or time complexity of a Turing machine M
(which halts on all inputs) is the function f : N → N where f (n)
is the maximum number of steps that M uses on any input of
length n.
Worst-case complexity
Time Complexity
Given an algorithm, we would like to have a measure for the
running time that
is simple (the exact running time is often a very
complicated expression)
is independent of the machine model we implement the
algorithm in (Turing machine/Java/Pascal ...)
We use the asymptotic running time as our measure
Time Complexity: Asymptotics
In asymptotic analysis of the running time we try to understand
the running time of the algorithm when it is run on large inputs
Example
If the running time of an algorithm/Turing machine is
$3n^2 + 10n + 200$ (where n is the length of the input), then for
large n the running time is “similar” to $n^2$.
Only consider the highest order term (here $3n^2$)
Ignore constant factors
$3n^2 + 10n + 200 = O(n^2)$
Time Complexity: Asymptotics, big-O notation
Definition
For functions f and g we say that f(n) = O(g(n)) if positive
integers c and $n_0$ exist such that for every integer $n \ge n_0$
$f(n) \le c \cdot g(n)$
When f(n) = O(g(n)) we say that g(n) is an
asymptotic upper bound for f(n)
Recall: Ignore constant factors and only consider the
highest order term
$3n^2 + 10n + 200 = O(n^2)$
For example, by taking c = 100 and $n_0 = 3$
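The choice of constants can be spot-checked numerically. The Python sketch below tests the inequality f(n) ≤ c·g(n) over a finite range of n. The helper names `f`, `g`, and `witnesses_hold` and the cutoff `upto` are illustrative assumptions, and a finite check is only a sanity test, not a proof.

```python
def f(n: int) -> int:
    """Running time from the example: 3n^2 + 10n + 200."""
    return 3 * n**2 + 10 * n + 200


def g(n: int) -> int:
    """Candidate asymptotic upper bound: n^2."""
    return n**2


def witnesses_hold(c: int, n0: int, upto: int = 10_000) -> bool:
    """Check f(n) <= c * g(n) for every n with n0 <= n <= upto."""
    return all(f(n) <= c * g(n) for n in range(n0, upto + 1))


if __name__ == "__main__":
    print(witnesses_hold(100, 3))  # constants from the slide
    print(witnesses_hold(3, 3))    # c = 3 is too small: 3n^2 + 10n + 200 > 3n^2
```

Many other pairs (c, n0) work equally well; the definition only asks that some pair exists.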
Time Complexity: Asymptotics, big-O notation
Example
Describe a Turing machine that recognizes the language
$L = \{0^k 1^k 2^k \mid k \ge 0\}$.
1. Scan the input from left to right and make sure it is of the
   form $0^* 1^* 2^*$ (if it is not, then reject)
2. Repeat the following as long as 0s, 1s, and 2s all remain on the tape:
3.    Return the head to the left end of the tape
4.    Cross off the first 0 and continue to the right, crossing off
      the first 1 and the first 2 that are found
5. Scan the tape and check that no 0s, 1s, or 2s remain on the tape,
   and accept (should a 0, 1, or 2 remain on the tape, then reject)
$t(n) = O(n) + \frac{n}{3}(O(n) + O(n) + O(n)) + O(n) = O(n^2)$
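The crossing-off procedure can be imitated on an ordinary list. The Python sketch below is not a Turing machine: the function name `accepts` and the use of a list as the tape are assumptions made for illustration. It follows the same steps, though, and each pass over the tape costs time linear in the input length.

```python
def accepts(w: str) -> bool:
    """Decide whether w is in {0^k 1^k 2^k | k >= 0} by crossing off symbols."""
    tape = list(w)
    # Step 1: reject unless the input has the form 0*1*2*.
    if any(ch not in "012" for ch in tape) or sorted(tape) != tape:
        return False
    # Steps 2-4: while 0s, 1s, and 2s all remain, cross off one of each.
    while all(sym in tape for sym in "012"):
        for sym in "012":
            tape[tape.index(sym)] = "x"  # cross off the first remaining sym
    # Step 5: accept only if no 0, 1, or 2 is left on the tape.
    return not any(ch in "012" for ch in tape)


if __name__ == "__main__":
    for word in ["", "012", "001122", "0112", "210"]:
        print(repr(word), accepts(word))
```

The loop runs about n/3 times and each iteration scans the list a constant number of times, which mirrors the quadratic bound.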
Time Complexity: Asymptotics, big-O notation
T [1, . . . , n] is an ordered list (increasing order) and we want to
determine whether the key k is in the list.
function LOOKUPTABLE(table T[1, . . . , n], key k)
    for i from 1 to n do
        if T[i] = k then return true
        if T[i] > k then return false
    return false
$t(n) = n \cdot (O(1) + O(1)) = O(n)$
Time Complexity: Asymptotics, big-O notation
function bubblesort (A : list[1..n]) {
    var int i, j;
    for i from n downto 1 {
        for j from 1 to i-1 {
            if (A[j] > A[j+1])
                swap(A[j], A[j+1])
        }
    }
}
$t(n) = n(n - 1)(O(1) + O(1)) = O(n^2)$
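To see the quadratic growth concretely, the pseudocode can be transcribed to Python with a comparison counter added. This is a sketch: the counter and the 0-based indexing are additions for illustration and are not part of the slide's pseudocode.

```python
def bubblesort(a: list[int]) -> int:
    """Sort a in place and return the number of comparisons performed."""
    comparisons = 0
    n = len(a)
    for i in range(n, 0, -1):      # i from n downto 1
        for j in range(i - 1):     # 0-based j, compares a[j] with a[j + 1]
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return comparisons


if __name__ == "__main__":
    data = [5, 2, 9, 1, 3]
    print(bubblesort(data), data)
```

The inner loop runs i - 1 times for each i, so the exact count is n(n - 1)/2, which is within the n(n - 1) upper bound used above.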
Time Complexity: Asymptotics, big-O notation
int summation (int m) {
    int sum = 0;
    for (int i = 1; i <= m; i++) {
        sum = sum + i;
    }
    return sum;
}
t(n) is not O(n)
Time Complexity: Asymptotics, big-O notation
Recall:
Definition
The running time or time complexity of a Turing machine M
(which halts on all inputs) is the function f : N → N where f (n)
is the maximum number of steps that M uses on any input of
length n.
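The input m is written in binary, so its length is $n = \log_2 m$ and
the loop runs $m = 2^n$ times: $m \cdot O(1) = O(2^n)$. Assuming a linear
time algorithm for addition, we get
$t(n) = m \cdot O(\log_2 m) = 2^n \cdot O(n) = O(2^n n)$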
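The gap between the value m and the input length n can be observed directly. The Python sketch below (the function name `summation_steps` is an illustrative assumption) reports the number of bits needed to write m alongside the number of loop iterations.

```python
def summation_steps(m: int) -> tuple[int, int]:
    """Return (input length in bits, loop iterations) for summing 1..m."""
    n = m.bit_length()  # length of m written in binary
    total = 0
    iterations = 0
    for i in range(1, m + 1):
        total += i
        iterations += 1
    return n, iterations


if __name__ == "__main__":
    for m in [15, 255, 4095]:
        print(m, summation_steps(m))
```

Adding four bits to the input multiplies the number of iterations by about sixteen, which is the exponential dependence on n described above.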
Time Complexity
Definition
Let t : N → R+ be a function. The time complexity class
TIME(t(n)) is the collection of all languages that are decidable
by an O(t(n)) time Turing machine.
Time Complexity
What can be computed efficiently?
In terms of the time required
We use an asymptotic measure for the worst-case running
time
In terms of this measure, what do we consider to be efficient?
Time Complexity
n     n     $n \log_2 n$   $n^2$   $n^3$              $2^n$
2     2     2              4       8                  4
16    16    64             256     4096               $6.5 \cdot 10^4$
64    64    384            4096    $2.6 \cdot 10^5$   $1.84 \cdot 10^{19}$
$1.84 \cdot 10^{19}$ nanoseconds = $2.14 \cdot 10^5$ days (about 584 years)
Algorithms having running times of the form $2^{cn}$, c > 0
(exponential time) are rarely considered to be efficient!
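The entries of the table and the conversion to years can be recomputed with a few lines of Python. This is a sketch: the function names are illustrative, and a year is assumed to be 365.25 days.

```python
import math


def growth_row(n: int) -> dict[str, float]:
    """Values of the functions compared in the table for one input size n."""
    return {
        "n": n,
        "n log2 n": n * math.log2(n),
        "n^2": n**2,
        "n^3": n**3,
        "2^n": 2**n,
    }


def nanoseconds_to_years(ns: int) -> float:
    """Convert nanoseconds to years, assuming 365.25 days per year."""
    seconds = ns / 1e9
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    for n in [2, 16, 64]:
        print(growth_row(n))
    print(nanoseconds_to_years(2**64))
```

For n = 64 the n log2 n entry is 64 · 6 = 384, and one operation per nanosecond for 2^64 operations comes to a little over 584 years.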
Time Complexity: P
Definition
P is the class of all languages that are decidable in polynomial
time on a deterministic Turing machine. So,
$P = \bigcup_k \mathrm{TIME}(n^k)$
efficient computation = P
Real-world problems in P seem to be practically solvable
on computers
All (reasonable) deterministic computational models can
simulate each other with only a polynomial increase in
running time
Time Complexity: P
Example (PATH)
Given a directed graph G and two vertices s and t, is there a
path in the graph from s to t?
Algorithm I.
Generate each sequence of at most n vertices (where n is the
number of vertices in G) and check whether the sequence is a
directed path from s to t.
The number of such sequences is roughly $n^n$, so the running time of
the algorithm is exponential in n.
Algorithm II.
Place a mark on s. Repeat the following until no new vertices
get marked: Scan all edges of G and for all edges (a, b) where
a is marked and b is not marked, place a mark on b.
Finally, if t is marked then accept, otherwise reject.
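Every round except the last marks at least one new vertex, so there
are at most n rounds, and each round scans at most $n^2$ edges:
$t(n) = n \cdot O(n^2) = O(n^3)$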
PATH is in P
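Algorithm II translates almost line by line into Python. In the sketch below the graph is assumed to be given as a list of directed edges, and the function name `path_exists` is illustrative.

```python
def path_exists(edges: list[tuple[str, str]], s: str, t: str) -> bool:
    """Marking algorithm for PATH: is there a directed path from s to t?"""
    marked = {s}                  # place a mark on s
    changed = True
    while changed:                # repeat until no new vertices get marked
        changed = False
        for a, b in edges:        # scan all edges of G
            if a in marked and b not in marked:
                marked.add(b)
                changed = True
    return t in marked            # accept iff t is marked


if __name__ == "__main__":
    g = [("a", "b"), ("b", "c"), ("d", "a")]
    print(path_exists(g, "a", "c"), path_exists(g, "c", "a"))
```

A breadth-first search would reach the same answer faster, but the repeated edge scan is the version analysed above.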
Time Complexity: P
Theorem
Any context-free language L is in P
See Sipser Theorem 7.16 for the proof
Time Complexity: P
Example
Consider the following algorithm where the input is an
undirected graph and we define the size of the input to be the
number of vertices of the graph. We assume a representation
of the graph such that the existence of an edge between two
vertices can be determined in constant time.
TRIANGLE:
For each set of 3 distinct vertices from the graph, check
whether the set forms a triangle, and in that case output true. If
none of the 3-element sets forms a triangle, then output false.
There are $n(n - 1)(n - 2)/6$ distinct sets of 3 vertices
$t(n) = \frac{n(n - 1)(n - 2)}{6}(O(1) + O(1) + O(1)) = O(n^3)$
Polynomial time
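A direct Python rendering of TRIANGLE is sketched below. The representation is an assumption chosen to match the constant-time edge test: vertices are numbered 0 to n - 1 and edges are stored as a set of two-element frozensets.

```python
from itertools import combinations


def has_triangle(n: int, edges: set[frozenset[int]]) -> bool:
    """Check every 3-element vertex set; each edge lookup is a set membership test."""
    for u, v, w in combinations(range(n), 3):
        if (frozenset((u, v)) in edges
                and frozenset((v, w)) in edges
                and frozenset((u, w)) in edges):
            return True
    return False


if __name__ == "__main__":
    g = {frozenset(e) for e in [(0, 1), (1, 2), (0, 2), (2, 3)]}
    print(has_triangle(4, g))
```

`combinations(range(n), 3)` yields exactly the n(n - 1)(n - 2)/6 vertex sets counted above, with three edge lookups per set.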
Time Complexity: P
Example
3COL: This algorithm outputs true if the graph can be properly
colored by 3 colors (i.e., if every vertex can be assigned a color
(say Red, Blue, or Green) such that no two adjacent vertices
get the same color) and false otherwise.
For each possible assignment of colors (i.e., Red, Blue, or
Green) to all the vertices in the graph, check whether the graph
is properly 3-colored, and in that case output true. If none of the
assignments is a proper 3-coloring, then output false.
There are $3^n$ possible assignments of 3 colors to the n vertices
in the graph. Checking whether an assignment of colors is a
proper 3-coloring can be done in time $O(n^2)$.
$t(n) = O(3^n n^2)$ (or $O(3^n)$ if we ignore polynomial factors)
Exponential time
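The exhaustive search for 3COL can be sketched in Python as follows. Vertices are assumed to be numbered 0 to n - 1 and the graph is assumed to be given as an edge list; the function name `three_colorable` is illustrative.

```python
from itertools import product


def three_colorable(n: int, edges: list[tuple[int, int]]) -> bool:
    """Try all 3^n colorings and accept if one of them is proper."""
    for coloring in product("RGB", repeat=n):  # every assignment of Red, Green, Blue
        # A coloring is proper if no edge joins two vertices of the same color.
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
    print(three_colorable(3, triangle), three_colorable(4, k4))
```

Checking one coloring only needs a pass over the edges; the exponential cost comes entirely from the number of colorings tried.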
Time Complexity: Nondeterministic Turing machines
Definition
A nondeterministic Turing machine is a decider if all its
computation branches halt on all inputs
Definition
The running time of a nondeterministic Turing machine (which
is a decider) is the function f : N → N where f (n) is the
maximum number of steps that the machine uses on any
branch of its computation on any input of length n
Time Complexity
Definition
Let t : N → R+ be a function. The nondeterministic time
complexity class NTIME(t(n)) is the collection of all languages
that are decidable by an O(t(n)) time nondeterministic Turing
machine.
Time Complexity: NP
Definition
NP is the class of problems solvable in polynomial time on a
nondeterministic Turing machine, so
$NP = \bigcup_k \mathrm{NTIME}(n^k)$
Time Complexity: NP
Definition
A verifier for a language L is an algorithm V that can verify that
w ∈ L with the help of a certificate c. A polynomial time verifier
runs in polynomial time in the length of w.
Time Complexity: NP
Definition
NP is the class of languages that have polynomial time verifiers
Definition
$NP = \bigcup_k \mathrm{NTIME}(n^k)$
Theorem
The class of languages that have polynomial time verifiers
equals $\bigcup_k \mathrm{NTIME}(n^k)$
See Sipser, Theorem 7.20 for the proof
Time Complexity: NP
Definition
NP is the class of problems for which the correctness of
solutions can be verified in polynomial time
LINEQ: Given a system of linear equations over N, is there
a solution?
LINEQ is in NP: let the certificate c be a solution. To verify
that c is a solution, check that all equations are satisfied by
c (which can be done in polynomial time).
HAMPATH: Given a directed graph G and two vertices s
and t, is there a path from s to t that passes through all the
vertices of G exactly once?
HAMPATH is in NP: let the certificate c be a solution (a
Hamiltonian path). To verify that c is a solution, check that c is
a path from s to t that passes through all vertices of G exactly
once (which can be done in polynomial time).
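The verification step for HAMPATH is short enough to write out. The Python sketch below assumes the graph is given as a vertex set and a set of directed edge pairs, and that the certificate is a list of vertices; the name `verify_hampath` is illustrative.

```python
def verify_hampath(vertices: set[str], edges: set[tuple[str, str]],
                   s: str, t: str, certificate: list[str]) -> bool:
    """Polynomial time check that certificate is a Hamiltonian path from s to t."""
    # The path must start at s and end at t.
    if not certificate or certificate[0] != s or certificate[-1] != t:
        return False
    # Every vertex of G must appear exactly once.
    if len(certificate) != len(vertices) or set(certificate) != vertices:
        return False
    # Consecutive vertices must be joined by a directed edge.
    return all((a, b) in edges for a, b in zip(certificate, certificate[1:]))


if __name__ == "__main__":
    v = {"a", "b", "c"}
    e = {("a", "b"), ("b", "c"), ("a", "c")}
    print(verify_hampath(v, e, "a", "c", ["a", "b", "c"]))
```

The verifier never searches for a path; it only checks the one it is handed, which is why it runs in polynomial time even though no polynomial time algorithm for finding such a path is known.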
Time Complexity: NP
By definition P ⊆ NP
Is P ≠ NP?
Time Complexity: NP
Is P ≠ NP?
Why do we care?
Most of the problems we need to solve are in NP but not
known to be in P
Scheduling (university classes, processor instructions ...)
Planning (traveling salesperson, circuit design ...)
Solving systems of linear equations over the natural
numbers
...
Factoring and Crypto
Concerns a fundamental property of computation and the
world we live in
Time Complexity: NP
Is P ≠ NP?
Why do most researchers believe that P ≠ NP?
We have failed to prove P = NP
If P = NP, then the world would be a profoundly different
place than we usually assume it to be. There would be no
special value in “creative leaps”, no fundamental gap
between solving a problem and recognizing the solution
once it’s found. Everyone who could appreciate a
symphony would be Mozart; everyone who could follow a
step-by-step argument would be Gauss...
-Scott Aaronson
Time Complexity: NP, trying to prove P = NP
Recall the proof/algorithm that showed that any
nondeterministic Turing machine can be simulated by a
deterministic Turing machine:
Theorem
Every t(n) time nondeterministic Turing machine has an
equivalent $2^{O(t(n))}$ time deterministic Turing machine
So, given a polynomial time nondeterministic Turing machine
this only gives us an exponential time deterministic Turing
machine :-(
NP-completeness
The general belief is that P ≠ NP (there are problems in
NP that cannot be solved efficiently (i.e., not in P))
It would be good to know which are the most difficult
problems in NP, and in particular which problems in NP
are not in P, assuming P ≠ NP
The answer to this is the theory of NP-completeness
NP-complete problems are problems in NP that have the
remarkable property that if any single one of them is in P,
then P = NP.
NP-completeness
Stephen Cook (1939-)
NP-completeness
Leonid Levin (1948-)
NP-completeness: Reductions
Definition
A function f : Σ∗ → Σ∗ is a polynomial time computable function
if there is some polynomial time Turing machine M that on input
w halts with f (w) on its tape
Definition
Language A is polynomial time mapping reducible to language
B if there is a polynomial time computable function f : Σ∗ → Σ∗
such that for every w
w ∈ A iff f (w) ∈ B
The function f is called a polynomial time reduction from A to B
If A is polynomial time mapping reducible to B then we write
$A \le^P_m B$
NP-completeness: Definition
Definition
We say that a problem B is NP-complete if
1. B ∈ NP
2. For every problem A in NP, $A \le^P_m B$
Corollary
If B is NP-complete and solvable in polynomial time (in P), then
P = NP
So, to prove that P = NP it is sufficient to find a polynomial time
algorithm to solve a single NP-complete problem.
NP-completeness: SAT problem
Recall:
Variables that can take the values TRUE (1) and FALSE (0)
are called Boolean variables
The Boolean operations are: AND (∧), OR (∨), and NOT
(¬)
A Boolean formula is an expression involving Boolean
variables and operations
Definition
A Boolean formula is satisfiable if some assignment of 0’s and
1’s to the variables makes the formula evaluate to 1 (TRUE)
NP-completeness: SAT problem
The satisfiability problem (SAT) is to test whether a Boolean
formula is satisfiable.
SAT = {⟨ϕ⟩ | ϕ is a satisfiable Boolean formula}
Theorem (Cook-Levin Theorem)
SAT is NP-complete
Corollary
SAT ∈ P iff P=NP
NP-completeness: SAT problem
SAT = {⟨ϕ⟩ | ϕ is a satisfiable Boolean formula}
Lemma
SAT ∈ NP
Proof.
Given a Boolean formula ϕ and an assignment c of values to the
variables, we can verify in polynomial time that the formula
evaluates to 1 (TRUE) by substituting the values given by c for
the variables and simplifying.
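The substitute-and-simplify check can be sketched in Python for formulas in conjunctive normal form. The encoding is a hypothetical one chosen for brevity: a formula is a list of clauses, and the integer k stands for the variable x_k while -k stands for its negation. General Boolean formulas would need an expression tree instead.

```python
Clause = list[int]  # hypothetical encoding: k means x_k, -k means NOT x_k


def verify_sat(cnf: list[Clause], c: dict[int, bool]) -> bool:
    """Return True iff the assignment c makes every clause of cnf true."""
    return all(
        any(c[abs(lit)] == (lit > 0) for lit in clause)  # some literal is true
        for clause in cnf
    )


if __name__ == "__main__":
    phi = [[1, -2], [2, 3]]  # (x1 OR NOT x2) AND (x2 OR x3)
    print(verify_sat(phi, {1: True, 2: True, 3: False}))
    print(verify_sat(phi, {1: False, 2: True, 3: False}))
```

The check reads each literal once, so it runs in time linear in the size of the formula.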
NP-completeness: SAT problem
Theorem
SAT is NP-complete
Proof idea.
We need to show that for any L ∈ NP, $L \le^P_m \mathrm{SAT}$
L ∈ NP implies that it is decided in polynomial time by some
NTM N
We need to simulate N by a Boolean formula
Given N and a string w we construct a Boolean formula $\varphi_{N,w}$
that is satisfiable iff N accepts w.
NP-completeness: SAT problem
Theorem
SAT is NP-complete
Now, to prove that a problem B is NP-complete, it is sufficient to
prove that
1. B is in NP
2. $\mathrm{SAT} \le^P_m B$
NP-completeness: SAT problem
A Boolean formula is in conjunctive normal form (CNF) if it is a
conjunction of disjunctive clauses, for example
(x ∨ y ∨ z) ∧ (y ∨ z ∨ y ) ∧ (w ∨ y ∨ x)
A formula is in kCNF if each clause contains exactly k literals
3SAT = {⟨ϕ⟩ | ϕ is a satisfiable 3CNF formula}
Theorem
3SAT is NP-complete
Proof.
$\mathrm{SAT} \le^P_m 3\mathrm{SAT}$
NP-completeness: SAT problem
DOUBLESAT = {⟨ϕ⟩ | ϕ has at least two satisfying assignments}
Theorem
DOUBLESAT is NP-complete
Proof.
We give a reduction from 3SAT, i.e., $3\mathrm{SAT} \le^P_m \mathrm{DOUBLESAT}$
First note that DOUBLESAT is in NP, since given ϕ′ and
two distinct variable assignments c1 and c2 we can verify in
polynomial time that c1 and c2 both satisfy ϕ′ (substitute
the variables in ϕ′ by the values given by c1 (c2), simplify
and check that ϕ′ evaluates to TRUE).
Given a 3SAT instance/formula ϕ, reduce it to
ϕ′ = ϕ ∧ (x ∨ ¬x), where x is a new variable not used in ϕ.
Since x can be set freely, ϕ is satisfiable iff ϕ′ has at least
two satisfying assignments.
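The reduction can be tried out on small formulas by brute force. The Python sketch below reads the added clause as x ∨ ¬x, so that the new variable x is unconstrained, and uses a hypothetical integer encoding of literals (k for x_k, -k for its negation). The exponential counting routine is only there to check the reduction on tiny inputs; the reduction itself is the one-line `reduce_to_doublesat`.

```python
from itertools import product

Clause = list[int]  # hypothetical encoding: k means x_k, -k means NOT x_k


def count_satisfying(cnf: list[Clause], num_vars: int) -> int:
    """Count satisfying assignments by trying all 2^num_vars of them."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in cl) for cl in cnf):
            count += 1
    return count


def reduce_to_doublesat(cnf: list[Clause], num_vars: int) -> tuple[list[Clause], int]:
    """Map phi to phi AND (x OR NOT x), where x is a fresh variable."""
    x = num_vars + 1
    return cnf + [[x, -x]], x


if __name__ == "__main__":
    phi = [[1, 2, 3], [-1, -2, -3]]
    phi_prime, n_prime = reduce_to_doublesat(phi, 3)
    print(count_satisfying(phi, 3), count_satisfying(phi_prime, n_prime))
```

Each satisfying assignment of ϕ extends to two satisfying assignments of the new formula, one for each value of x, and an unsatisfiable ϕ stays unsatisfiable.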
NP-completeness: HAMPATH
HAMPATH: Given a directed graph G and two vertices s and t,
is there a path from s to t that passes through all the vertices of G
exactly once?
HAMPATH is in NP: let the certificate c be a solution (a Hamiltonian
path). To verify that c is a solution, check that c is a path from s
to t that passes through all vertices of G exactly once (which can
be done in polynomial time).
HAMPATH = {⟨G, s, t⟩ | there is a Hamiltonian path from s to t in G}
Theorem
HAMPATH is NP-complete
Proof.
Reduction from 3SAT, i.e., $3\mathrm{SAT} \le^P_m \mathrm{HAMPATH}$
NP-completeness
Why is it important to know that the problem you want to solve
is NP-complete?
Because it tells you something about what kind of
algorithm you should try to come up with
In particular, trying to come up with a fast (polynomial time)
algorithm that always works is probably not a good idea...
NP-completeness
What to do when you have to solve an NP-complete problem?
Large inputs?
Modify the problem
Approximation algorithms
Heuristics
NP
Is P ≠ NP?
Why has it not been resolved yet?
To prove P ≠ NP, we need to exclude all possible
polynomial time algorithms!
We do not have a very good understanding of the internal
structure of NP
Summary of (time) complexity
The time complexity of a Turing machine is the number of
steps the machine takes (in the worst-case) as a function
of the size of the input
We use big-O notation for running times since we are
interested in large inputs and need something robust
P is the class of problems solvable in polynomial time on a
deterministic Turing machine
NP is the class of problems solvable in polynomial time on
a nondeterministic Turing machine
NP is the class of problems for which we can verify a
solution in polynomial time
We believe that P ≠ NP but have no idea how to prove it
NP-complete problems are the most difficult problems in
NP
SAT is NP-complete, other problems can be proved
NP-complete by polynomial time mapping reductions
Summary of (time) complexity
Unlike the other two parts of this
course, complexity theory is a
very active research area!
Summary of regular languages
Regular languages are the languages recognized by DFAs
For every NFA there is an equivalent DFA
The subset construction
A language can be described by a regular expression if
and only if it can be recognized by a DFA
GNFA construction, closure properties
There are simple non-regular languages
Pumping lemma
Summary of context-free languages
The context-free languages are the languages generated
by context-free grammars
A context-free grammar is ambiguous if the same string
can be derived using two different left-most derivations
A language is context-free iff it is recognized by a PDA
There are simple languages that are not context-free
Summary of computability
A Turing machine is a mathematical model of a general
computer
Church-Turing thesis (anything that can be computed can
be computed by a Turing machine)
There are “simple” and important algorithmic problems that
cannot be solved on computers (undecidability)
$A_{TM}$ is undecidable (proof by diagonalization)
Other problems can be shown to be undecidable by
(mapping) reductions