doc - Brown University

advertisement
P, NP and Logic
Eric Rachlin
GS19
5-14-2001
Our Group Independent Study Project, GS19, looked at many aspects of the
theory of computation. Throughout this study however, we relied on a somewhat intuitive
model of computation, the Turing Machine, whose computational abilities, and
limitations, provided the basis for classifying languages. In this paper we explore how the
mathematical field of logic can form a basis for a similar classification of languages. In
particular, we focus on the classes of languages P and NP and their relation to Boolean
and existential second-order logic.
Examples of NP and P from Boolean Logic
We begin with a well-known cross over between Boolean logic and complexity
classes NP, SAT. That is the collection of satisfiable Boolean formulas. We define a
Boolean formula as follows. It is either a Boolean variable (a variable which may take
only the values “T” or “F”), a Boolean expression of the form -, where  is a Boolean
formula, a Boolean expression of the form (1  2), where 1, 2 are both Boolean
expressions, or a Boolean expression of the form (1  2), where 1, 2 are both Boolean
expressions. A Boolean formula is called “satifiable” if there is some assignment to its
variables which makes it evaluate to true.
The evaluation of a Boolean formula is defined as follows. A Boolean formula
consisting of a single value evaluates to true if the variable has the value T, and evaluates
to false if the variable has the value F. A Boolean formula of the form - evaluates to true
if the Boolean formula  evaluates to false, and evaluates to false if the Boolean formula
 evaluates to true. A Boolean formula of the form (1  2) evaluates to true if either
formula 1 or 2 evaluate to true, and evaluates to false if both 1 evaluate 2 to false.
Finally a Boolean formula of the form (1  2) evaluates to true if both formulas 1 and
2 evaluate to true, and evaluates to false if either formula 1 or 2 evaluate to false.
As was stated before, SAT is the set of satisfiable Boolean expressions. Cook’s
Theorem tells us that SAT is NP-complete. That is, given a Boolean formula, determining
whether some assignment of T and F to the variables in that formula will cause the
formula to evaluate to true can be done in polynomial time by a nondeterministic Turing
machine. Also, if SAT could be solved in polynomial time by a deterministic Turing
machine, every language in NP could be solved in polynomial time by a deterministic
Turing machine (every language in NP is polynomial-time reducible to SAT).
Determining whether a given Boolean expression is satisfiable is NP-complete,
and thus if P  NP, cannot be done in polynomial time by a deterministic machine.
Suppose however, instead of trying to determine whether an arbitrary Boolean expression
is satisfiable, we only we only want to determine whether a certain limited class of
Boolean expression is satisfiable. It turns out that when this limited class of expression is
what we will call Horn expressions, determining whether or not one is satisfiable can be
done in polynomial time.
A Horn expression is simply the conjunction a series of Horn clauses. The
conjunction of n Boolean expressions 1, 2, 3, …, n is 1  2  3  … n. A clause
is simply a Boolean expression composed of literals with ‘’s between them. A literal is
either a Boolean variable, or the negation of a Boolean variable. Finally a Horn clause is
a clause in which all the literals, or all but one of the literals, are negations of a Boolean
variable. Now that Horn expressions have been defined, we give a deterministic,
polynomial time algorithm with which to determine if one is satisfiable.
We determine the satisfiablity of a Horn expression as follows. First we consider
only the Horn clauses in the Horn expression that contain a literal which is not the
negation of a Boolean variable. Call the conjunction of just these Horn clauses H’. Set
every variable in the Horn expression H’ to the value F and see if H’ evaluates to true
(evaluating a Boolean formula can be done by a deterministic Turing machine in
polynomial running time). If H’ does not evaluate to true, it means that one of its Horn
clauses is not evaluating to true. Select one of the Horn clauses that evaluated to false and
set its non-negated literal to true. If H’ still does not evaluate to true, repeat the
assignment procedure. Note that once a Boolean variable to true, it is never set back to
false, and hence this algorithm will eventually either find a satisfying assignment of
values to the variables or assign all variable to true, which will necessarily satisfy H’
(each clause in H’ contains a non-negated literal). Finally, when a satisfying assignment
of variables for H’ is found, check if the original Horn expression is satisfied by them.
The above algorithm clearly runs in polynomial time with regard to the length of
the Horn expression. The natural question to ask of it however is how we know that the
algorithm will find a satisfying assignment of variable if one exists. The reason we know
this is as follows. Our algorithm eventually finds some assignment of variables that
satisfies H’. Let A be the set the variables assigned “T” in this assignment. Any other
assignment of values, V, that satisfies H’ must assign all the variables in A true. To see
this, consider the step in our algorithm in which a variable was first added to A that is not
assigned true by V. The clause that caused this variable to be added to A cannot be
satisfied by V if all the variables that were already in A are set to “T”. If this truly was
the first step to put a variable in A that was not assigned “T” in V, then V cannot satisfy
H’. Finally, any assignment of variables that satisfies the entire Horn expression must
satisfy H’.
We now see that if the assignment of values to variables that is generated by our
algorithm to satisfy H’ does not satisfy the entire Horn expression, assigning additional
variables the value T won’t help, since the unsatisfied Horn Clauses only contain negated
literals. Also, assigning additional variables the value F won’t help, since we just showed
it will cause H’ to no longer be satisfied. Thus given a Horn expression, the above
deterministic algorithm succeeds in determining whether or not some assignment of
values to variables will satisfy it, and it does so in polynomial time. The language
HORNSAT, the set of satisfiable Horn expressions, is in P. In fact HORNSAT is not just
in P, but it is P-complete, which means that every language in P is log space reducible to
HORNSAT.
Examples of NP and P from Existential Second Order Logic
We have now described in detail the languages SAT and HORNSAT using
Boolean logic. These languages are in NP and P respectively. We will now attempt to
give examples, as well as characterizations of the languages in NP and P using existential
second order logic, a slight extension of first order logic. By existential second order
logic we refer to either expressions in first order logic with some vocabulary V, which we
denote , or expression of the form P, where to ’s vocabulary we add the additional
relational symbol P which takes some fixed number of arguments. Thus we interpret to
P to mean, “there is some relation P for which  holds”. More precisely, a model M,
which assigns an interpretation to a given vocabulary V is said to satisfy P if some
relation can be assigned to P so that M assigns an interpretation to V + P (V with relation
symbol P added) which satisfies . Essentially this is the same as existentially
quantifying over a variable in expressions from first order logic. P may or may not appear
in .
In this paper we will in fact concern ourselves only with a subset of all possible
expressions  (the expressions of first order logic), which we denote -graphs, which is
limited to the vocabulary VG. This vocabulary consists of the standard logical symbols ,
, , , -,  and , an infinite number of variables, the binary equality relation =, and
the binary relation G. As such, any interpretation of an expression of -graphs can be
though of as a finite graph. The quantified variables in the expression are free to range
over all possible nodes in the graph. The free variables in the expression are assigned to
specific nodes in the graph (which nodes they are assigned to is defined by the graph
itself). Furthermore the relation G(x, y) is interpreted as true if there is a directed edge
going from the node represented by x to the node represented by y in the graph, and the
equality relation = is interpreted as true if x and y represent the same node in G. As
always, the standard logical rules of interpretation apply. Each syntactically correct
expression in our new subset of second order existential logic, which we will denote Pgraphs, can be though of as asserting a certain property about graphs.
We now discuss the ability of expressions of the form P-graphs to characterize
languages in NP and P. Below is a sentence of the appropriate form which will only be
satisfied by graphs which are in the language HAMILTONIAN PATH. That is, it is only
satisfied when interpreted with a graph that has a path that visits each node exactly once.
Our sentence is: P(xy(P(x,y)  P(y,x)  (x = y)) 
xyz(-P(x,x)  ((P(x,y)  P(y,z))  P(x,z))) 
xy((P(x,y)  z(-P(x,z)  -P(z,y)))  G(x,y)))
This sentence asserts existence of a relation P which meets certain specific criteria. The
first underlined expression asserts that P must hold of either x and y, or y and x, if x and y
are assigned to distinct nodes. The second underlined expression asserts that P is
transitive, but is not reflexive. Finally the third underlined expression asserts that if
P(x,y), and for all other nodes z not both P(x,z) and P(z,y), then there is an edge going
from x to y. This sentence will only be satisfied by graphs for which a Hamiltonian path
exists. If one does exists the path defines P as follows. P(x,y) holds if the Hamiltonian
path goes from x to y. Furthermore if a relation P(x,y) exists for a given graph which
satisfies the above sentence, it can be used to construct a Hamiltonian path.
The language HAMILTONIAN PATH is in fact NP-complete, so we have
succeed in characterizing not only an NP language, but a NP-complete language using
this special subset of existential second order logic (which is itself a subset of second
order logic in general). In fact all expressions of the form P-graphs characterize
languages in NP. The proof of this fact is simple. For any graph with n nodes, a
nondeterministic Turing machine could guess the assignment of nodes to free variables in
-graphs, as well as guess the relation P. By guessing P, we mean assigning true/false
values to every possible assignment of values to P’s arguments. The number of
arguments P could possibly take is limited by the length, l, of the -graphs expression.
Since each argument can take on n possible values, the total number of values to P assign
will be less than or equal to nl, which is polynomial with respect to n. Once the proper
assignments
have
been
made
nondeterministically,
the
machine
need
only
deterministically verify that the formula has in fact been satisfied. As with Boolean
formulas, evaluating an expression takes only polynomial time with respect to the
number of nodes in the graph. This is easy to see, since even with the presence of
existential and universal quantifies, expressions in -graph will not take more than O(nl)
evaluations of what are essentially Boolean expression which can themselves be
evaluated in polynomial time. Once again, n is the number of nodes in the graph and l is
the length of the expression.
When Boolean logic was being discussed earlier, we defined the notion of a Horn
expression. We can extend our definition of a Horn expression to our graph based subset
of second order existential logic as follows. An expression is in Horn form if it in prenex
from (all quantifiers are in the front) with only universal first-order quantifiers, and the
expression to the right of these quantifiers is a Horn expression as defined previously,
with a small change. This time the term “literal” will refer to an atomic expression, or the
negation of an atomic expression. By atomic expression we mean either the relations
G(x,y), the relation x = y, or the relation P(…), where “…” represents some fixed number
of arguments. Also we now only require that only one or zero non-negated P(…)’s be in
each clause, negations of the other atomic expression are unrestricted. Clearly our
sentence which represented HAMILTONIAN PATH is not anywhere close to being in
Horn form (Although any first order expression can be converted into Prenex normal
form, doing so may cause existential quantifiers to appear).
Certain graph characteristics like UNREACHABILITY, described by the
following sentence with free variables x and y, can be represented by a Horn expressions.
To assert that there is no path from the node assigned to x to the node assigned to y we
can write: P(uvw((P(u,u))(G(u,v)P(u,v))((P(u,v)P(v,w))P(u,w))-P(x,y))).
This sentence asserts that the nodes represented by x and y are not connected by a path by
asserting -P(x,y) and establishing certain properties of P that are make it equivalent to
“there is a path from x to y”. That is, P is reflexive, P is transitive, and P(u, v) holds
whenever there is an edge going from u to v. If a path does exist from node x to node y in
a graph, then we will have P(x,y) for that graph, and only graphs in which x and y are not
connected will be able to satisfy the sentence.
The language UNREACHABILITY, (that is given a graph and two nodes, does a
path go from the first node to the second) is in P, and in fact all Horn expressions in
existential second-order logic characterize a language in P. The proof of this fact is
simple. Suppose we have such an expression and a graph, we start by expanding the first
order portion (everything to the left of the “P”) as follows. First assign the free variables
to the appropriate nodes in the graph (remember, this assignment is a property of the
graph itself, which plays the role of assigning a specific interpretation to the symbols in a
-graphs expression). Next eliminate the universal quantifiers by writing out the portion
of the Horn expression to the left of the quantifiers with every possible combination of
nodes assigned to the quantified variables. This will involve writing at most n l
expression, where n is the number of nodes in the graph and l is the length of the Horn
expression. Finally assign the appropriate true false values to the atomic expression
G(x,y), -G(x,y), x = y, - x = y.
Now notice that P is free to take any value (true or false) for a given combination
of arguments. As such, each appearance of P with a given set of arguments can be
replaced by a separate Boolean variable (appearances of P with the same arguments
should be replaced with the same variable). We have now successfully reduced the
existential second order Horn expression to a Boolean Horn expression of polynomial
length in a polynomial amount of time. Since we have already described an algorithm for
solving Boolean Horn clauses deterministically in polynomial time, we have succeeded in
showing that satisfiability of an existential second order Horn expression is in P. Any
existential second order Horn expression characterizes a language in P.
Complete charaterizations of NP and P using Existential Second Order Logic
We have just shown that certain expression in second order logic characterize
languages in NP and P. Specifically, we looked at expressions of the from P-graphs, as
well as Horn expressions of this form. In fact these forms of expressions almost have the
power to characterize all languages in NP and P respectively. The class of Horn
expressions of the form P-graphs must be extended slightly to characterize P, but the
class P-graphs remains unchanged.
We all ready know that every existential second order logic expression of the
form P-graphs characterizes a language in NP. If we could show that every language in
NP is characterized by an expression of this form, we would then have shown the second
order logic expression of the form P-graphs are exactly NP. In fact every language in
NP can be characterized by a P-graphs expression.
The proof of this is rather long but essentially relies on the ability to use -graphs
to represent a successor function. This means that a sentence with free variables x and y
can be defined which is true if and only if node x is the successor of node y. The function
will behave as if each node in the graph has been assigned a unique integer 1…n, where n
is the total number of nodes in the graph. Using this representation of the basic successor
function, successor expressions that have an arbitrary number of variables, k, can be
represented. This means that the successor function for values ranging from 1 to nk can be
represented.
The goal of the proof is to show how for a given language in NP can expressed as
an expressions of the from P-graphs. We already know that every language in NP is
solved by a nondeterminstic Turing machine in polynomial time, so the question the
becomes how can the computation of nondeterministic Turing machine with a given
polynomial running time be represented in P-graphs. Once the k-sized successor
function is represented, statements can be made about relations that define things like
“the (i,j)th symbol on the machines computation tape is a 1” or, “the ith nondeterminstic
choice made by the machine was a 0” can also be defined. The existentially quantified
relation P ends up playing the role of a number of simpler relations like the two examples
given above.
It is in fact simple to represent multiple existentially quantified relations by the
single relation P. One strategy is as follows. Suppose you have the expression AB,
where A and B both take k arguments. Let P take an additional argument x which will
take a certain value when P is to behave like B (with respect to its other k arguments),
and a certain value when P is to behave like A. Every time A or B appear in , substitute
for them P with argument x fixed at the appropriate value. If A and B have a take a
different number of arguments, just fix other arguments when P is substiuted for the
relation that takes less arguments. This method can be extended to any number of
relations and allow expressions of the from R1R2…RN to be written as P.
By the end of the proof, the P-graphs expression constructed ends up
translating roughly as “There exists an assignment of choices and symbols to the
machines computational history that allow for a valid computational history which ends
in an except state.” It is interesting to note that in the construction of the -graphs portion
of the sentence, which simply asserts the validity of a given computational, only contains
one non-Horn clause. Appropriately enough, this clause deals only with the assertion that
a single non-deterministic input is chosen at each step.
Although the full proof was not given, the idea behind the characterization of NP
using existential second order logic has been described. The natural question to ask next
is, how can the same be done with P. We already know that every Horn expression of the
form P-graphs characterizes a language in P, and ideally every language in P would be
characterized by such a Horn expression. If this were the cases, the two classes would be
equal, but unfortunately this is not so. Horn existential second-order logic has the
interesting property that if all graphs with n nodes are equally likely, then the probability
that a random graph with n nodes will satisfy a given existential second order Horn
expression approaches either 1 or 0 as n goes to infinity. With this being true it becomes
immediately clear that Horn expressions alone will never allow for the categorization of
the simple language IS-EVEN, defined as all graphs with an even number of nodes. Such
a language is clearly in P (just count the number of nodes in a given graph), but any
expression which characterizes it would need to have an equal probability of being
satisfied or not satisfied by a randomly chosen graph, regardless of the number of nodes.
This cannot be the case if the expression is a Horn expression.
To overcome this difficulty we say that every language in P can be characterized
by a Horn expression of the form P-graphs if we add to our vocabulary the successor
function S, described above. The proof of this mimics the construction given to allow
unrestricted P-graphs expressions to characterize all languages in NP. It is interesting
that besides the use of successor, the only part of the construction that required a nonHorn clause was the step expressing the selection of exactly one nondeterministic of
choice at each step of the computation.
To summarize, we have two valuable theorems. The first, known as Fagin’s
Theorem, is that the class of all properties (or languages) expressible by symantically
correct second order existential sentences using the vocabulary VG, which we have been
denoting P-graphs, is precisely NP. The second, a similar theorem, is that the class of
all properties expressible by second order existential Horn expressions using the
vocabulary VG along with the successor relation is precisely P.
Sources:
Papadimitiriou, Christos. Computational Complexity. Addison-Wesley Publishing
Company: Reading, Massachusetts. 1995.
Sipser, Micheal. Introduction to the Theory of Computation. PWS Publishing Company:
Boston, Massachusetts. 1997.
Boolos, George and Jeffrey, Richard. Computability and Logic. Cambridge University
Press: Cambridge, England. 1996.
Download