Mohit Singh: Hello everyone. So pleasure to have Uriel Feige back with us. And he's a regular visitor.
And he will tell us today about separation between estimation and approximation.
Uriel Feige: So good afternoon. So this is a joint work with a former student of mine Shlomo Jozeph.
So we all know of an NP-hard combinatorial optimization problems. Here are some examples like
Max 3SAT, minimum [indiscernible], maximum independent set, and so on. And for each one of them
would be felt as a problem like Max SAT is a problem, and it has problem instances. So problem is
really a class of instances of a similar type. For NP-hard optimization problems, standard way of
coping with them is through approximation algorithms.
So let's just go over the usual definitions. An approximation algorithm is just a polynomial time algorithm
A, let's say, that given an instance, produces a feasible solution, not necessarily an optimal solution.
Then we want to measure its performance. One way we measure it is through the
approximation ratio, and here it is defined for maximization problems; for minimization problems there are
similar definitions.
So on a given instance, the approximation ratio of algorithm A is the value of the solution that it found
over the value of the optimal solution. Then the approximation ratio of the
algorithm as a whole is the minimum over all instances of its approximation ratio. And the best
approximation ratio possible for a problem is the maximum over all algorithms of their
approximation ratio. Of course this should be [indiscernible], but let's not worry about that. For
example max [indiscernible], as is well known, can be approximated within a ratio of 7/8.
So the other side of approximation algorithms is that of proving hardness of approximation, which
limits how well you can approximate the problem. And here's a typical example of how a
hardness of approximation result looks. This is the famous result by Hastad. So the proof looks
the following way. You have a polynomial time reduction that starts, let's say, from some source
problem and gives you an instance of a target problem. Here both the source problem and the target
problem can be 3SAT. So you can have a reduction that takes a 3CNF formula phi to a 3CNF formula theta,
with the property that if the original formula is satisfiable, then so is the target formula. And if the original
formula was not satisfiable, then you build in some gap: in the target formula you can satisfy
at most 7/8 of its clauses.
And if you have such a reduction, then if you can distinguish between the two types of theta, you
can distinguish between the two types of phi. So it means that it's NP-hard to distinguish between
satisfiable 3SAT instances and those which are at most 7/8 satisfiable. And from this of course we
derive that Max 3SAT cannot be approximated within a ratio better than 7/8. So this is a well-known
reduction.
So the other notion we discuss here is that of estimation algorithms, which is a notion similar to
approximation algorithms. So here, for maximization problems, for me an estimation algorithm E is a
polynomial time algorithm that, given an instance, outputs just a value. It doesn't need to output a
solution to the optimization problem, just the value. And the value has to be, for maximization
problems, not larger than opt; for minimization problems, it has to be not smaller than opt. In
a sense it says: I guarantee to you that there is some solution with at least this value. So this is
really what it guarantees, and it had better be right; by the definition, the value indeed has to be not larger
than opt.
So it doesn't find a solution, but it just tells you that there should be a solution of such a value.
>>: Is there a proof along with this?
Uriel Feige: So the fact that it is an estimation algorithm guarantees that the value it outputs is correct;
otherwise, it's not an estimation algorithm. Of course, usually you need to prove that
a certain algorithm is an estimation algorithm, but this
you do externally. The algorithm itself does not need to show you the proof.
So the estimation ratio is defined similarly to the approximation ratio. For every given instance you have
the estimation ratio for that instance; then for a given algorithm you have the estimation ratio for
that algorithm; and for a given problem you have the best estimation ratio achievable by polynomial
time algorithms.
So if we tried to compare these two tasks of estimation and approximation, then estimation is easier.
Because every approximation algorithm, let's say with approximation ratio Rho, serves also as an
estimation algorithm. The estimation algorithm can look at the solution output by the approximation
algorithm and output the value of that solution, and that would be a valid estimate within the same
ratio that the approximation algorithm has. In this sense, if you compare the possible
estimation ratios that you can get to the possible approximation ratios, estimation is no
harder than approximation.
Now, the interesting point is that in many cases it's also not easier. So for example, if we look at the
hardness of approximation result of Hastad that we just saw, it is also a hardness of estimation result.
It says nothing about actually finding a solution. It is about distinguishing between instances that have a
solution of high value and instances that only have solutions of low value, without the need to
actually output any solution of any value. So there are also hardness of estimation results, and this is
typical. It's not special to Hastad's result: the known techniques that we have for deriving hardness
of approximation results also provide hardness of estimation results.
So if we look, for example, at cases where we have tight results: we have approximation
algorithms, we [indiscernible] an approximation ratio that we can prove for them, and we also have
matching hardness of approximation results. Then these hardness of approximation results all turn
out to be hardness of estimation results. They match, sometimes up to low order terms
[indiscernible]. So for Max 3SAT, we know that the best approximation ratio and the best estimation
ratio is 7/8; for Max XOR it is 1/2, for max coverage 1 - 1/e, for min set cover ln n, and so on. So
there's no difference between approximation and estimation for these problems.
In some cases we have tight results under an assumption different from P not equal NP. A common
assumption used in this context is the unique games conjecture of [indiscernible]. If this
conjecture is true, then based on it you can prove hardness of approximation results,
and these hardness of approximation results are also hardness of estimation results; it's the same thing.
So for example, under this assumption, min vertex cover is a problem which can be approximated
within a ratio of 2 and can neither be approximated nor estimated within any better ratio, up to low
order terms. And for max cut, the best approximation ratio is whatever Goemans and Williamson
proved, this number. And in general, for large classes of problems we have tight results, tight
hardness of approximation results, based on the unique games conjecture.
So we can ask ourselves: are there any combinatorial optimization problems for which estimation is
easier than approximation? Can the estimation ratio be better than the approximation ratio for some
problem? Like [indiscernible], there's no reason why they should be the same, but somehow a lot of
the evidence points toward them typically being the same.
Okay. So one place to look for problems for which estimation may be better than approximation is
problems for which we have estimation algorithms that are not approximation algorithms: they give you an
estimate without actually finding a solution. So maybe they are doing something different from what
approximation algorithms need to do. And here are some examples. For Max 3SAT,
I'm looking at [indiscernible] instances in which each clause has exactly 3 variables. If you have m
clauses, an estimation algorithm can just count the number of clauses, multiply by 7/8, and say: I
guarantee to you that you can satisfy 7m/8 clauses. This is true, because a random assignment, in
expectation, satisfies this many clauses in instances of 3SAT. So this is a valid estimate without actually
exhibiting any solution.
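The trivial estimation algorithm just described fits in a few lines. This is a minimal sketch of my own (the function name and clause encoding are illustrative, not from the talk):

```python
# Sketch of the trivial Max E3SAT estimation algorithm from the talk:
# count the clauses and output 7m/8. A uniformly random assignment
# satisfies each clause of 3 distinct variables with probability
# 1 - (1/2)**3 = 7/8, so some assignment satisfies at least 7m/8
# clauses. Note the algorithm never exhibits such an assignment.
def estimate_max_e3sat(clauses):
    m = len(clauses)
    return 7 * m / 8

# 8 clauses (positive/negative integers encode literals): value 7 guaranteed.
clauses = [(1, 2, 3), (-1, 2, 4), (1, -2, -3), (-4, 5, 6),
           (2, -5, 6), (-1, -4, -6), (3, 4, -5), (-2, -3, 5)]
print(estimate_max_e3sat(clauses))  # 7.0
```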
Another example -
>>: You just said the solution, the randomized -
Uriel Feige: No, but the algorithm does not give you one. The estimation algorithm does not give you
one. Our examples are estimation algorithms that do not go through finding a solution: they do not
find a solution and look at its value, they rather compute the number in some other way. And these
algorithms may be a source of differences between estimation and approximation.
Another example would be min dominating set. If you don't know the exact definitions of the
problems, it doesn't matter that much. Given a d-regular graph, you just
output a number, which is the number of vertices divided by d+1, multiplied by ln(d+1).
Every such graph in fact has a dominating set of this size or smaller. So this is a valid estimate.
And no such graph has a dominating set smaller than n/(d+1), so the estimation ratio is ln(d+1);
up to this logarithmic factor we are optimal here. And again, you didn't find any dominating set
explicitly; you just output a number.
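As a sketch (my code, assuming the standard probabilistic bound that every d-regular graph on n vertices has a dominating set of size at most n(1 + ln(d+1))/(d+1)):

```python
import math

# Trivial min dominating set estimation for d-regular graphs: output
# n*(1 + ln(d+1))/(d+1). Every d-regular graph on n vertices has a
# dominating set at most this large, and none has one smaller than
# n/(d+1), so the estimate is valid and the estimation ratio is about
# ln(d+1). No dominating set is ever exhibited.
def estimate_min_dominating_set(n, d):
    return n * (1 + math.log(d + 1)) / (d + 1)

print(estimate_min_dominating_set(1000, 9))  # about 330.26
```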
The min bandwidth problem for graphs is the problem of numbering the vertices of the graph 1
to n; you look at the edge for which the difference in the numbers of its endpoints is
maximal, and that's your bandwidth. So for circular [indiscernible] graphs, these are graphs like in this
picture: each vertex is an arc on a circle, and two vertices share an edge if they
intersect. If you want to know the bandwidth of such a graph, you can just look at the maximum
clique size, that is, a point where as many arcs as possible intersect. So here it would be 3.
Multiply it by 2 and say that's the bandwidth. And that's an estimate within a ratio of 2.
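A sketch of this estimate (my encoding, not from the talk: each arc is a clockwise (start, end) pair of angles on a [0, 360) circle, and, following the talk's picture, the clique size is taken as the largest number of arcs through a single point, checked at arc endpoints):

```python
def covers(arc, p):
    """Does arc (s, e), going clockwise from s to e, cover point p?
    Arcs that wrap past 0 (with s > e) are allowed."""
    s, e = arc
    return s <= p <= e if s <= e else (p >= s or p <= e)

# Estimate the bandwidth of a circular-arc graph as in the talk: find
# a point where the most arcs intersect and output twice that count,
# an estimate within a ratio of 2. No numbering is produced.
def estimate_bandwidth(arcs):
    points = [p for arc in arcs for p in arc]
    max_overlap = max(sum(covers(a, p) for a in arcs) for p in points)
    return 2 * max_overlap

# Three arcs crossing the top of the circle, plus one disjoint arc.
arcs = [(350, 20), (355, 30), (0, 40), (100, 150)]
print(estimate_bandwidth(arcs))  # 6
```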
So these are examples of estimation algorithms that are not approximation algorithms. If we
look at these examples, first, all of their estimation algorithms are trivial. What else do we know?
In the examples that I gave, the estimation ratios are in fact best possible. It's NP-hard to do any
better, so these trivial algorithms are optimal estimation algorithms. You cannot do any better. And
moreover, there is an approximation algorithm achieving the same ratio. For example, for Max 3SAT,
either a random assignment or a greedy assignment would achieve the same
approximation ratio. So this, so far, was not a good place to look if we want to separate between
estimation and approximation. Those estimation algorithms that we know, in many cases, don't
give such a separation.
So let's also visit some common approximation algorithms. One such class of algorithms, so that's
not just a single algorithm, is greedy algorithms, and they come in certain variations. For
certain problems that are solvable in polynomial time, greedy algorithms are known to produce the
optimal solution. For example, for minimum spanning tree, or more generally problems that have a
matroid structure, the greedy algorithm is known to be exactly optimal. And for many
optimization problems that are NP-hard, the greedy algorithm gives the optimal
approximation ratio, in the sense that it's matched by hardness of approximation results. Often
these are covering problems such as max coverage, min set cover, min-sum set cover, and so on. And
for all these problems the approximation ratio achieved by the greedy algorithm,
1 - 1/e for max coverage, ln n for set cover, 4 for min-sum set cover, is optimal: it is matched
by a hardness of approximation result and a hardness of estimation result. So for problems that we
solve using the greedy algorithm, estimation and approximation match each other.
Another common methodology for obtaining approximation algorithms is the use of linear
programming relaxations or semidefinite programming relaxations, or sometimes hierarchies of relaxations.
And in general the template is as follows. You first formulate the combinatorial optimization problem
as an integer program. Then you relax the integer program by removing the integrality constraints,
allowing for fractional solutions. Then you find an optimal fractional solution to the
linear program; linear programs can be solved in polynomial time, so this step you can actually
do in polynomial time. Integer programs in general cannot be solved in polynomial time; that's why
we needed the relaxation.
And then there's a procedure that is often known as rounding the LP solution, which is specific to each
problem; for each there might be a different rounding procedure that gets you back a feasible solution to the
original integer program, or to the original problem. And then you assess how good the approach is
by comparing the value that you get from the rounded solution to the value of the LP solution: how
much did you lose in the process between the LP solution and the rounded solution?
So let's see a picture of how this looks. Let's normalize, for a certain instance of
the problem, say a maximization problem, the optimal value to be 1. The integer programming
formulation of the problem would in general formulate the problem exactly, so its optimal value would
also be 1. It represents the problem exactly.
Once you relax the integer program to a linear program, you have
potentially added feasible fractional solutions, so the value goes up. And then when you
round, you get a feasible solution to the IP. So after rounding your value cannot be larger than 1, and
typically it would be even smaller.
So the way you usually analyze this scheme is to compare the value of the rounded
solution to the LP solution. But in fact, this is not necessarily the approximation ratio of such
algorithms. The approximation ratio should be a comparison between the rounded value and the
optimal value, which can be a better ratio.
And another ratio that is interesting here is the ratio between the value of the LP solution and the IP
solution, which is always at least as good as the ratio between the rounded solution and the LP
solution. This ratio is referred to as the integrality gap.
So the true approximation ratio is really the rounded value compared to the optimal value; the one
which is usually analyzed is the rounded value compared to the LP value; and another value of
interest is the integrality gap.
If we look at this through the eyes of approximation versus estimation, then from this linear
programming relaxation we can perhaps get two estimation algorithms, which are
not necessarily the same. One: the approximation algorithm itself is an estimation algorithm.
You just take the rounded solution and output its value; its ratio is the rounded value over the LP
value, or over the optimal value, exactly the same as for the approximation algorithm.
But there's another estimation algorithm, which is to look just at the value of the LP. If you know
what the worst-case integrality gap over all instances is, you can multiply by
this integrality gap, and then you necessarily get a value which is smaller than the IP value, as we wanted for
maximization problems. This is your estimate for the value of the solution.
And either one of them might be a better estimation algorithm, because maybe the gap between these
two might always be smaller, let's say, than the gap between those two, or the other way around.
So we have two choices of estimation algorithms. There's essentially one approximation
algorithm coming out of the linear programming relaxation, but two estimation algorithms, which might have
different estimation ratios. So we may hope that maybe the second of these estimation algorithms, the
one based on the integrality gap, has an estimation ratio better than the approximation ratio. And then
we'll have some separation.
So let's look at some evidence. As an example, let's look at min vertex cover, a well-known
problem. In the graph you want to select, here the green vertices, the smallest set of vertices that
covers all the edges.
So how does the framework from before apply? When we want to write an integer program for
it, we have one variable Xi for each vertex i; essentially you think of it as 1 if
the vertex is selected into the vertex cover and 0 otherwise. You want to minimize the number of vertices selected such
that every edge is covered and every vertex is either selected or not, so either 0 or 1. Then the LP
relaxation says: okay, a vertex can have a fractional value. It doesn't have to be 0/1, so you might get
a fractional solution. A well-known rounding for this LP is: every vertex that has fractional value at least
half you round up to 1, and below half you round down to 0. It's easy to see that this gives a vertex
cover. And the approximation ratio, measured as the rounded value compared to the LP, is no worse
than 2, because at most we doubled the value of each variable. So this is well known.
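The scheme can be illustrated end to end on a tiny graph. This is my own sketch, not code from the talk: instead of a real LP solver it brute-forces over values in {0, 1/2, 1}, which suffices here because this particular LP is known to always have a half-integral optimal solution.

```python
from itertools import product

def vc_lp_optimum(n, edges):
    """Optimal fractional vertex cover, by brute force over half-integral
    solutions (valid only because the vertex cover LP always has a
    half-integral optimum; a real LP solver would be used in practice)."""
    best = None
    for x in product((0.0, 0.5, 1.0), repeat=n):
        if all(x[u] + x[v] >= 1 for u, v in edges):
            if best is None or sum(x) < sum(best):
                best = x
    return best

def round_lp(x):
    """The rounding from the talk: round up everything >= 1/2."""
    return [v for v, xv in enumerate(x) if xv >= 0.5]

# Triangle: the LP puts 1/2 everywhere (value 1.5), rounding selects
# all 3 vertices, while the optimal integral cover has 2 vertices.
x = vc_lp_optimum(3, [(0, 1), (1, 2), (0, 2)])
print(sum(x), round_lp(x))  # 1.5 [0, 1, 2]
```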
So the approximation ratio, measured as the rounded value over the LP, is no worse than 2, and this is tight
even if we measure it as the rounded value over the optimal solution; the example
would be just a single edge. On a single edge the LP might give each endpoint the value half;
we round both of them to 1, so we found a solution of value 2 while the optimal solution has value 1. The
integrality gap is slightly better than 2: it can be shown to be at most 2 minus 2/n. So it's
slightly better. So the estimation ratio we get here is slightly better, but these are low order terms,
which we ignore in this talk. So practically they are the same. And often you can have other
approximation algorithms that also save you low order terms. So this is not really an advantage here.
Another example would be the famous max-cut algorithm of Goemans and Williamson, which takes a
semidefinite [indiscernible] of the max-cut problem, rounds it using a random hyperplane, and
gets an approximation ratio that I will call here alpha. This ratio, 0.878-something, comes from analyzing the ratio
between the value of the rounded solution and the SDP solution. So we can ask what the true
approximation ratio is, namely the ratio between the rounded solution and the true optimal
solution. Karloff showed examples of graphs in which it's no better than alpha. You can ask about
the integrality gap; there was work of Schechtman and myself, and we showed that the
integrality gap is also no better than alpha. So in this case the estimation ratio and approximation ratio
provided by this algorithm are the same. And in fact they are best possible, assuming the unique
games conjecture.
More generally, we can look at other constraint satisfaction problems; Max 3SAT and max-cut
are examples of constraint satisfaction problems, and max [indiscernible] SAT and so on are
additional ones. Assuming the unique games conjecture, for every Boolean constraint satisfaction
problem there is a certain semidefinite program whose integrality gap matches the best possible
approximation ratio and estimation ratio. Estimating any better would refute the unique games
conjecture, and moreover the SDP can be rounded to actually give a solution of that value. So also in this
case the approximation ratios and estimation ratios match each other.
So do we know of examples in which there appear to be gaps between what you can do with
approximation and estimation? I'll give you here an example where there might be such a gap.
This is from work of Leighton, Maggs, and Rao. It talks about universal packet routing
algorithms, but another way to call this problem is acyclic job shop scheduling with unit operations. So
the problem is defined as follows. You have jobs and you have machines. Every job has a sequence of
unit operations. It has to perform them in this particular sequence, but perhaps with gaps in the middle;
that's okay. It is a sequence of unit operations that have to be done on different machines. For example, job
1 may have to do one operation on machine 2. After it finishes this operation it can do one operation
on machine 4, then on machine 9 and machine 3. This is related to packet routing: a packet
has to traverse [indiscernible] first on link 2, then on link 4, then on link 9, and so on.
And every machine can process only one job at each unit of time. This is again like packet routing:
on each link, you can send one packet in one unit of time. And what you want to minimize is the
makespan, the time the last job completes. So this is the problem. What do we know about it?
So Leighton, Maggs, and Rao gave an estimation algorithm for it, not an approximation algorithm,
which is as follows. They say that the dilation, the length of the longest job, is an obvious lower bound on
the makespan, the completion time, because the longest job has to complete. Likewise, the
congestion, the load on the most loaded machine, is also a lower bound, because that machine has to
process every job that wants to go through it. So these are two lower bounds, and the lower bound
that you have is the maximum of the two. Not the minimum, the maximum.
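The LMR estimate itself is easy to compute. Here is a minimal sketch of mine (the representation is illustrative: each job is the list of machines it visits, in order):

```python
from collections import Counter

def makespan_lower_bound(jobs):
    """The Leighton-Maggs-Rao estimate: max(dilation, congestion),
    where dilation is the length of the longest job and congestion is
    the number of operations on the most loaded machine. Both are
    lower bounds on the makespan; LMR prove the true makespan is
    within a constant factor of this value, but no schedule is built."""
    dilation = max(len(job) for job in jobs)
    congestion = max(Counter(m for job in jobs for m in job).values())
    return max(dilation, congestion)

# Job 1 visits machines 2, 4, 9, 3 in order (the talk's example).
jobs = [[2, 4, 9, 3], [2, 9], [2, 3], [4, 2, 9]]
print(makespan_lower_bound(jobs))  # 4
```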
And what they proved is that this lower bound is in fact always tight up to a constant factor. So there
always exists a schedule for these problems whose makespan is within a constant factor of this lower bound.
The constant that they had maybe was pretty large, I don't know, a hundred or so; probably the true
constant is much smaller. I don't think that examples are known in which the gap is more than maybe
a factor of 3 or something like that.
So this lower bound provides an estimation ratio. It doesn't show you how to actually schedule the jobs
on the machines, but it guarantees that you can do it within some ratio. The proof that they had used
repeated applications of the Lovasz local lemma. At the time they had this proof, this lemma was
not constructive; there were no algorithmic versions of that lemma. So at the time it did not give
polynomial time approximation algorithms.
So what happened later? People started to come up with algorithmic versions of the local lemma.
Joseph Beck had one in 1990, and a few years later it was used to get an algorithmic version
of this LMR result. But the constants were worse, because that algorithmic version of the local lemma
loses in the constants. Later Robin Moser got a tighter algorithmic version of the local lemma without losing any
constants; it gets exactly the same bounds that you get in the existential
statement of the local lemma. And with further extensions to the local lemma, you can use it now to
redo the [indiscernible] and to get an actual schedule with an approximation guarantee similar to what
the existential analysis gives.
But this is not completely satisfactory, because we don't know if the current analysis using repeated
applications of the local lemma for this problem is tight. So we don't know what the exact
estimation ratio that this lower bound provides is. In fact it might be much
better than what the proof using the local lemma gives, and then we don't have an algorithm matching it. So
it's possible that for this particular problem, we already have an estimation algorithm with a
better ratio than the best approximation algorithm that we have. The estimation
algorithm just outputs essentially the lower bound itself, and it may provide a better estimate
than what the approximation algorithms that we have can give. So this is, let's say, one candidate for a
place where there's a gap between estimation and approximation. There are not so many such candidates.
Another well-known one is the asymmetric traveling salesman problem. For that there's a well-known
linear programming relaxation by Held and Karp. It's known that the integrality gap of this linear
programming relaxation is no better than a factor of 2. In terms of actually rounding it
algorithmically, the known approximation ratio is log n over log log n, whereas there is a
nonconstructive proof showing that the integrality gap is in fact no worse than some power of
log log n. So the value of the solution of the Held-Karp relaxation is an estimate
within this ratio, but we don't currently have an approximation algorithm within this ratio.
And it's an open problem both what the true estimation ratio provided by the Held-Karp
relaxation is, and whether the estimation ratio and approximation ratio will eventually match for this
problem.
It's also an open problem for symmetric TSP. For symmetric TSP, the known approximation ratio is
3/2. Again, one approach can go through the Held-Karp relaxation. All we know about the
integrality gap of the Held-Karp relaxation there is that it's no better than 4/3, so maybe the Held-Karp
relaxation provides an estimation ratio of 4/3, and we don't have a matching approximation
algorithm. So again, this is a problem for which maybe there's a gap between estimation and
approximation. We don't know.
There are a few other such problems for which currently there are gaps, but not many. These kinds of
problems tend to be rare.
>>: By what technique is the log log n to a constant shown? Is it the local [indiscernible]?
Uriel Feige: No, it's not the local lemma. It's rather complicated. It has many stages, [indiscernible]
which has the non-constructive component but uses a different technique.
So there are problems for which maybe there's a gap between estimation and approximation. Are there
any problems for which there is provably a gap? Are there any linear programming relaxations
whose integrality gap is strictly better than the best possible approximation ratio for the problem? If so,
then estimation would be better than approximation; it would have a better ratio.
So this is the topic here. It relates also to a well-known question, the relation between decision and
search in optimization problems. The decision problem for vertex cover
would be: does the graph have a vertex cover of size k? You just have to answer yes or no; you don't
have to actually exhibit it. Whereas the search problem would be to find a vertex cover of size k. For
NP-complete problems, it's known that search is reducible to decision: if you can always answer
the decision problem correctly, there's a [indiscernible] for NP-complete problems that allows you to
solve the search problem exactly. So in this respect, decision and search are the same for
NP-complete problems.
Now we can think of estimation as sort of a relaxation of decision. When you ask, is there a vertex
cover of size k, then if the minimum vertex cover is smaller than k/2, you had better answer yes; let's
say you want an estimation ratio of 2. But if the size of the minimum vertex cover is, let's say,
3k/4, you are allowed to say I don't know, or maybe, or maybe not, because it's only an estimate.
Likewise, approximation is a relaxation of search. If the minimum vertex cover is smaller than k/2, you had better
output a vertex cover not larger than k; if it's larger than k/2, maybe you're allowed to output
something larger than k. So approximation is a relaxation of search and estimation is a relaxation
of decision. And we can ask, for the NP-complete problems that we are looking at, is
approximation reducible to estimation, similar to the way search is reducible to decision? So this is another
way of phrasing the question.
So in this context let's mention some problems for which the decision version is very easy. These are
total functions in NP, the class known as TFNP. These are problems for which a solution is
guaranteed to exist. Rather than define the class formally, let me just give some
examples. For example, factoring: you are given a positive integer n and you
are asked to output some prime factor of n. n itself can be the output if n is prime. So there's
always a solution. The problem is that we don't know how to find it; factoring is difficult.
Another problem would be Nash equilibrium: given a 2-player game, output a Nash equilibrium.
By Nash's theorem a Nash equilibrium always exists. We don't know a polynomial time algorithm to find it.
Another example: given a graph with edge weights, output a maximal cut. A maximal cut is one in
which no vertex can switch sides. Each vertex wants most of its neighbors, most of the weight of its
neighbors, to be on the other side, and a maximal cut is one in which you cannot
gain by having any one vertex switch sides. The cut value would not grow.
>>: [Indiscernible].
Uriel Feige: So this also always exists; it's easy to show. You can continue to do local improvements
until you stop. But we don't know how to find one in polynomial time.
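A sketch of that local improvement procedure (my code, not from the talk; it terminates because each move strictly increases the cut weight, though no polynomial bound on the number of moves is known in general):

```python
def maximal_cut(n, weighted_edges):
    """Local search for a maximal cut: while some vertex would gain by
    switching sides (it has more incident weight on its own side than
    across the cut), switch it. The result is maximal, not maximum."""
    side = [0] * n                       # start with everyone on side 0
    improved = True
    while improved:
        improved = False
        for v in range(n):
            gain = 0
            for a, b, w in weighted_edges:
                if v in (a, b):
                    u = b if a == v else a
                    # switching v gains w if u is on v's side, loses w otherwise
                    gain += w if side[u] == side[v] else -w
            if gain > 0:
                side[v] = 1 - side[v]
                improved = True
    return side

# Triangle with one light edge: the two heavy edges end up cut.
print(maximal_cut(3, [(0, 1, 1), (1, 2, 5), (0, 2, 5)]))  # [1, 1, 0]
```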
Another example I call pigeonhole sum, but maybe it has a different name. We are
given n integers in the range between 1 and 2^n/n; output two subsets that have the same sum.
By the pigeonhole principle, you have 2^n subsets to choose from, and the sums are
at most 2^n, so two must have the same sum. Again, we don't know how to find them.
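A brute-force sketch (my code; the instance and names are illustrative). It checks subsets in increasing size until two sums collide, which the pigeonhole argument guarantees, but it takes exponential time in the worst case:

```python
from itertools import combinations

def equal_sum_subsets(nums):
    """Find two different subsets (as index tuples) with equal sums.
    For n numbers in [1, 2**n / n] the pigeonhole principle guarantees
    a collision: 2**n subsets, but every subset sum is at most 2**n."""
    seen = {}
    for r in range(len(nums) + 1):
        for subset in combinations(range(len(nums)), r):
            s = sum(nums[i] for i in subset)
            if s in seen:
                return seen[s], subset
            seen[s] = subset
    return None

nums = [3, 5, 7, 8, 9, 10]   # n = 6, all values at most 2**6 / 6
a, b = equal_sum_subsets(nums)
print(a, b)                  # (3,) (0, 1): both subsets sum to 8
```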
So what does complexity theory say about TFNP? The decision version is always trivial: whenever you
are asked, is there a solution, you say yes. Now, an NP-hard problem is also hard for TFNP, in the sense
that if you had a polynomial time algorithm for it, you could solve every problem in NP, and hence
every problem in TFNP, because TFNP is contained in NP.
However, NP-hard problems will not be in the class TFNP unless P equals NP, because for NP-hard
problems the decision problem is already hard, and here the decision problem is easy.
And in fact there are reasons to believe that there are no TFNP-complete problems, problems within the
class TFNP which are hardest for TFNP.
So there are problems that appear to be difficult in TFNP, but not in the sense of being
TFNP-hard. Instead, what we have in TFNP is subclasses, defined based on the argument
by which you show that a solution must exist. For example, if it's based on local search, then the
class is called PLS, polynomial local search. And if it's based on directed parity arguments, then it is the
class PPAD. These subclasses do have complete problems: for example, Nash equilibrium is
complete for PPAD, and maximal cut is complete for PLS. So it seems that problems in TFNP, unlike
problems in NP, are not all difficult for exactly the same reason. The NP-complete problems are all
inter-reducible to one another, so it's the same reason why they are all
hard. Here it appears that there are many different reasons why problems in TFNP may be hard, and
for each reason you have a complete problem. So there are many sources of hardness; there is not
one unifying concept that explains hardness of problems in this class.
So how do you relate estimation, approximation, and TFNP? There is an obvious relation. Suppose
that for a certain problem, say max Pi, whatever Pi is, estimation within a ratio of rho is in FP. I write
FP instead of P because it is the functional version; it is not a decision problem, you have to output a
number. So if the estimate is polynomial time computable, then approximation within a ratio of rho is
reducible to a problem in TFNP, because you are guaranteed that a solution of value at least the
estimate exists. So you are in a situation where a solution is guaranteed to exist and the problem is to
find it, and that is TFNP.
So this proposition is intuitively trivial. There are some fine points that one has to observe if you
actually want to write a formal proof, because I did not give you the formal definition of what the
class TFNP means. If you think about the formal definition, then you may notice that in the proof, as I
wrote it here, it is important that I said it is reducible to a problem in TFNP and not in TFNP itself.
But let's not bother about it.
So the consequence: if we want to prove that estimation is different from approximation, we need to
show that there is a certain problem for which you can achieve some estimation ratio in polynomial
time, while achieving the same ratio as an approximation ratio is hard. But it will not be NP-hard; it
will be in TFNP. So at most you want to say it is TFNP-hard. And TFNP-hardness is not just one
concept; TFNP breaks into a lot of subclasses, so maybe you would like to say it is PPAD-hard or
PLS-hard. These are the kinds of results that we can hope to achieve if we want to separate estimation
from approximation.
And indeed we can very easily achieve it; this part is completely trivial. Let me show how we achieve
such a result. Define the following optimization problem. You get an integer n, and a feasible solution
is, by definition, any prime in the range between 2 and n. The value of the solution is 1 if this prime is
a divisor of n, and 0 otherwise. So we just cast a search problem as an optimization problem in some
artificial way. Now an optimal solution is a prime that divides n, so finding such a solution is as hard
as factoring, which we believe to be hard. Any nontrivial approximation, getting any solution of value
1 rather than 0, is as hard as factoring, so you cannot do approximation here. But estimation is trivial:
you simply output the estimate 1. The value of the objective function is always 1, because there is
always a prime that divides n, and I allow any prime to be the output.
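A minimal Python sketch of this artificial problem (the function names are mine): the value oracle and the trivial estimator are both easy to write, while an approximation algorithm would have to exhibit an actual prime divisor, which is as hard as factoring.

```python
def solution_value(n, p):
    """Value of the feasible solution p (a prime in [2, n]) on instance n:
    1 if p divides n, 0 otherwise."""
    return 1 if n % p == 0 else 0

def estimate(n):
    """Trivial estimation algorithm: every integer n >= 2 has some prime
    divisor, so the optimum value is always 1 -- no factoring needed."""
    return 1

assert solution_value(15, 3) == 1   # 3 divides 15: value 1
assert solution_value(15, 7) == 0   # 7 does not divide 15: value 0
assert estimate(15) == 1            # the estimator never looks at factors
```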
So the same kind of trivial reduction applies to any problem in TFNP. Instead of factoring, plug in
[indiscernible] Nash equilibrium or whatever you want; it is the same kind of thing.
So are we done? Okay. We designed an artificial optimization problem for which estimation is easy
but approximation is hard. So we obtained our goal, but there was a price to be paid: the optimization
problem is artificial.
So now we only negotiate the price: how do we make this problem look less artificial and more
natural? That is what remains to be done.
So the first extension is just to make it nicer in another respect: to control the gaps. We can prove the
following lemma, again quite easy. We will have an alpha for the approximation ratio and a larger
beta for the estimation ratio, because estimation should be easier than approximation, and pick your
favorite epsilon. For every choice of alpha, which is at least 0 (it can be equal to 0), and beta, which is
strictly larger than alpha and can be at most 1, and any epsilon, we can design a family of optimization
problems with the following properties. For every instance of that family, achieving a beta estimation
can be done in polynomial time, and achieving an alpha approximation can also be done in polynomial
time. However, there is no beta plus epsilon estimation algorithm, so beta is the best estimation ratio
for that class of problems, unless P equals NP.
So here we can say unless P equals NP. And alpha is the best approximation ratio, but here we cannot
say unless P equals NP. Here it is really unless TFNP is in FP, or rather, choose your favorite problem
in TFNP and plug it in here. Like I said, there is no one universal problem for TFNP. So you can say
unless factoring is easy, or unless PPAD is in FP, or PLS is in FP, and so on.
So how do we do it? Here is a proof by example, again pretty simple. Let's fix alpha to be 7/16 and
beta to be 7/8. The 7/8 comes from Max 3SAT, and 7/16 is just half of it. The input to our artificial
problem, still artificial, is composed of two components. One is a 3CNF formula with m clauses, and
the other is an integer n, as in the factoring problem. The output is two things: an assignment to the
3CNF formula, and a prime p that supposedly divides n but might not. To compute the value of the
output, you look at the two components. For the assignment, you count how many clauses it satisfies.
For the prime that you output, you ask whether it divides n. If it divides n, you get a bonus: a factor of
2 on the number of clauses that you satisfy. If it does not divide n, you don't get this bonus.
Now let's see what happens. The estimation algorithm just outputs 7m/4, because in any formula with
m clauses at least 7m/8 of the clauses are satisfiable, as we know, and n does have a prime factor, so
you can take the bonus factor of 2, and you can output 7m/4 as the output of the estimation algorithm.
Since the maximum possible value is 2m, that gives a 7/8 estimation ratio. An approximation
algorithm will be able to find an assignment that satisfies 7/8 of the clauses, but will not be able to
factor n, so it only gets 7m/8, and the approximation ratio is 7/16. So this is how you can control
where you put these thresholds. Using tricks like this, you can literally put them wherever you want; I
showed just one example.
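As a sketch, here is how the value of a solution to this combined problem could be computed (the clause encoding and function names are my own choices), together with the trivial estimator that outputs 7m/4:

```python
def combined_value(formula, n, assignment, p):
    """Value of a solution: (# satisfied clauses) times a bonus factor
    of 2 if the prime p divides n, and 1 otherwise.

    formula: list of clauses; each clause is a list of signed literals,
    where literal i > 0 means variable i and i < 0 means its negation.
    """
    def satisfied(clause):
        return any((lit > 0) == assignment[abs(lit)] for lit in clause)

    bonus = 2 if n % p == 0 else 1
    return sum(satisfied(c) for c in formula) * bonus

def estimate(formula, n):
    """Estimation: at least 7m/8 clauses of a 3CNF formula are
    satisfiable, and n surely has some prime factor, so output 7m/4
    without factoring anything."""
    return 7 * len(formula) / 4

# One clause (x1 OR x2 OR NOT x3), instance n = 15.
formula = [[1, 2, -3]]
assignment = {1: True, 2: False, 3: True}
assert combined_value(formula, 15, assignment, 3) == 2  # bonus: 3 divides 15
assert combined_value(formula, 15, assignment, 7) == 1  # no bonus
```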
So still there are things that we don't like about this reduction. The objective function is a product of
two terms; usually we prefer it to be linear. The term with the factoring is extremely sensitive to small
changes: the prime will sometimes divide n, you change the prime [indiscernible], it does not divide n,
and each time you jump by a factor of 2. We like smoother objective functions. There may be other
things [indiscernible] that we don't like about this instance.
So now the main theorem is something that looks more natural. It is derived and designed from things
which are artificial, but the output looks more natural. Again, as before: for every alpha, beta, and
epsilon, there is a class of integer programs, objects that we are familiar with. The objective function
is always non-negative; this helps us talk about approximation ratios, because if the objective function
is sometimes 0 or negative, then it is difficult to talk about approximation ratios. And the class has
properties very similar to what we had before. For every instance of this class of integer programs, its
LP relaxation has an integrality gap no worse than beta, and the LP relaxation is what gives you the
beta estimation: just output the value of the LP relaxation. You can always find a feasible solution to
the integer program whose value is within a ratio of alpha of the optimal value, and you can do this
algorithmically. And these are best possible: you don't have a beta plus epsilon estimation unless P
equals NP, and you don't have an alpha plus epsilon approximation unless, again I am writing unless
TFNP is in FP, but here you use your favorite TFNP problem and plug it in.
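To make "integrality gap" and "value of the LP relaxation" concrete, here is a standard toy example (vertex cover on a triangle, my own illustration, not from the talk): the fractional solution beats every integer one, and the ratio between the two optima is the integrality gap on this instance.

```python
from itertools import product

# Vertex cover on a triangle: minimize x0 + x1 + x2
# subject to xi + xj >= 1 for every edge (i, j), with x in {0,1}^3.
edges = [(0, 1), (1, 2), (0, 2)]

def feasible(x):
    """Check the covering constraints for a (possibly fractional) x."""
    return all(x[i] + x[j] >= 1 for i, j in edges)

# Integer optimum by brute force over all 0/1 assignments.
ip_opt = min(sum(x) for x in product((0, 1), repeat=3) if feasible(x))

# The LP relaxation allows x = (1/2, 1/2, 1/2), which is feasible with
# value 3/2 and is in fact LP-optimal for this instance.
lp_value = 1.5
assert feasible((0.5, 0.5, 0.5))
assert ip_opt == 2
# Integrality gap on this instance: ip_opt / lp_value == 4/3.
```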
So what's the proof? How do we prove something like that? The basic observation is that integer
programming is NP-complete, meaning that every problem in NP can be reduced to an integer
program. So we can just start from an artificial looking problem like the ones I showed before and
reduce it to an integer program. There are two aspects to watch for in the reduction. One, the
reduction should be approximation preserving: we don't just want yes instances to go to yes instances
and no instances to no instances, but also the approximation ratios to be preserved.
And two, we had an extra condition which specifies what the estimation algorithm is: it has to coincide
with the natural LP relaxation of the integer program. The estimation algorithm has to take the natural
LP relaxation of the IP and output its value; you cannot choose what the estimation algorithm is. So
this is what you need to achieve.
And how would you prove it? Basically, you just use standard reductions; there is nothing difficult.
You start with a relation in TFNP, look at the Turing machine that verifies that solutions are valid,
that the relation is satisfied by the solution, then you change it into a circuit, and the circuit you
encode as an integer program. There are a lot of details and they are not very interesting, so I will not
go over them. It is just standard techniques that are used in reductions, and if you do it carefully, you
can maintain all the properties that you want. So nothing fancy is going on here.
So you can ask what the integer program that we get looks like. If I showed it to you, you would see
constraints on a few variables, like x1 + x2 + 2*x3 >= 2, constraints like this. That is what you would
see.
So is it natural? Natural is not a well-defined notion. You can maybe list some properties that you
want the integer program to have, and if it has these properties, it is a nice integer program. For
example, you can ask that all variables be 0/1 variables, that all coefficients are small, that the
objective function is always non-negative, and that the integer program is always feasible, so the
question is only finding the optimal solution, and it is always easy to find some feasible solution. And
so on; you can make a list of properties. We made a list of our own, including these properties and
maybe some more, and there is no problem in modifying the reduction such that all these properties
hold. So if you would like to define what a natural integer program is, I would say we can meet your
definition.
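Such a checklist of "niceness" properties could be sketched as code; the specific checks and the coefficient threshold below are my own illustrative choices, not the authors' list.

```python
def looks_natural(c, A, b, max_coef=10):
    """Heuristic check of some 'niceness' properties for a 0/1 integer
    program  max c.x  subject to  A x <= b,  x in {0,1}^n.
    max_coef is an illustrative threshold, not from the talk."""
    nonneg_objective = all(ci >= 0 for ci in c)
    small_coefficients = (all(abs(a) <= max_coef for row in A for a in row)
                          and all(abs(ci) <= max_coef for ci in c))
    # "Always feasible": for a <=-constraint system over 0/1 variables,
    # the all-zeros vector is one easy feasibility witness.
    trivially_feasible = all(bi >= 0 for bi in b)
    return nonneg_objective and small_coefficients and trivially_feasible

assert looks_natural([1, 2], [[1, 1]], [1])        # nice tiny program
assert not looks_natural([-1, 2], [[1, 1]], [1])   # negative objective
```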
So here's another property; maybe it is obvious at first, maybe not, but you would like this property.
How does the proof of the main theorem go? You start with alpha, beta, and a TFNP instance, and you
derive from it an integer program. But really we would like to think of it as producing a class of
integer programs, so that when we look at an integer program, it is easy to say whether it is in this
class or not.
For example, it would be easiest for us if the fact that this is an integer program derived by choosing
alpha and beta, and what alpha and beta were, would be encoded somewhere in the integer program.
And we also want the class to have certain closure properties: we want the class to be closed under
certain operations. If you rename a variable in the integer program, it remains in the class; if you
reorder the constraints, it still remains in the class; and so on.
So you can do everything in such a way that this class is well defined: when you look at an integer
program, I can tell you yes, it is from this class. And once I know that it is in this class, I know that
indeed the LP relaxation of this integer program has a value that is a beta approximation of the optimal
value of the integer program. So in this respect it is also a nice class of integer programs: you can
easily tell whether you are in the class or not.
So this is sort of an aside. As I said, there are no known complete problems for TFNP. But I am
saying we may think of the problem of LP rounding as a sort of complete problem for TFNP. It is not
really a complete problem, because it is a family of different problems; there are certain classes within
this family. One member of this family would be some class of integer programs, and another member
would be another class of integer programs. Each class would have the property that the LP
relaxations have integrality gaps no worse than beta, and moreover, you would be able to tell, by
looking at members of the class, whether they belong to the class or not.
So for such a class, we define a new problem. An instance would be an integer program in the class,
and a feasible solution would be a feasible solution to the integer program of value no worse than beta
times the value of the LP relaxation. We know that such a solution must exist because the integrality
gap is no worse than beta, so this puts the problem in TFNP. And also, as we have seen, every TFNP
problem can be reduced to have a form like this. So in a sense, the problem of LP rounding, doing
rounding that matches the integrality gap, captures TFNP. It is not exactly the same as a complete
problem, because it is really a family of problems, but morally it captures it.
So to summarize. The point was to show that you can design combinatorial optimization problems
that look natural, okay, you have to define what natural is, for which estimation is easier than
approximation. You can do that under the assumption that TFNP does not have polynomial time
algorithms, and this assumption is necessary: if it does not hold, you cannot do it.
Another interesting thing to note is that so far NP-hardness has been very successful in explaining why
we cannot get better approximation ratios for many problems. And really what we see here is that
maybe all estimation problems, maybe, we don't know, maybe all of them can be explained by
NP-hardness: the best estimation ratio that we have can be explained because achieving a better
estimation ratio is NP-hard. But for approximation ratios, unlike estimation ratios, we should expect
that some of them will not be explained by NP-hardness, since they can encode this TFNP class, but by
various other reasons, and there should be several different reasons why we don't get better
approximation ratios. For different problems there can be different reasons, because TFNP has
different subclasses.
So I'll end here.
[applause]
Mohit Singh: Questions?
>>: So you said [indiscernible] if I give you an integer program C, you can tell me if it came from a
reduction? Is that correct?
Uriel Feige: Not quite. I can design the reduction in such a way that it gives me a class of integer
programs. For this particular class that I build using my reduction, I can tell you for every integer
program whether it is in this class or not. But it doesn't mean that some other reduction, you can have
your own reduction to an integer program, might not leave something outside my class.
Yes?
>>: How hard is it to approximate [indiscernible] in planar graphs?
Uriel Feige: In what sense? The result is algorithmic.
>>: What do you mean it's algorithmic?
Uriel Feige: The proof is algorithmic.
>>: The proof that there exists [indiscernible].
Uriel Feige: Yeah, the proof that it exists is an algorithm; think of it this way.
>>: [Indiscernible].
Uriel Feige: No, it's polynomial time. It has a lot of cases, but that's just a constant.
>>: [Indiscernible] reduced to 500 configurations. For each one [indiscernible].
>>: 500 graphs?
>>: If I give you a planar graph, there's no cases --
Uriel Feige: So the point is, yes, it's algorithmic. But writing down the algorithm is very complicated.
Once you write it down you can actually run it; I'm not sure that people actually do it. But there is
some discrepancy between what complexity theory would say is an algorithm and what people would
intuitively think of as a good algorithm. So, huge constants [indiscernible].
>>: [Indiscernible].
>>: [Indiscernible].
>>: You can check for the configuration whether a graph has it, [indiscernible].
>>: [Indiscernible].
>>: There are problems which are not constructive, like those coming from the Robertson and
Seymour theory. Then you have to design something where the cases are constants.
>>: [Indiscernible].
Uriel Feige: So there are papers that explicitly write down things like the algorithm that you get from
that [indiscernible].
>>: And what does it say about the configuration LP for algorithms?
Uriel Feige: So there are additional examples, not described here, for which we know things about the
integrality gap and we don't have a rounding technique that matches those approximation ratios, and
some of them will be through the configuration [indiscernible].
>>: [Indiscernible].
Uriel Feige: No.
>>: Do you believe that for scheduling, there's actually a difference [indiscernible]?
Uriel Feige: So there could be. It could be that it just happens that the estimation ratio is 3.1, let's
say, just happens, but somehow no algorithm actually --
>>: [Indiscernible].
Uriel Feige: No, I mean in the sense of what we currently know. [Indiscernible] I don't know if it's the
same approximation ratio or different. So I'm not sure.
>>: [Indiscernible]
Uriel Feige: For which problem would there be a gap? I feel more comfortable if I design the problem
myself.
[Laughter]
Mohit Singh: Any more questions?
Okay. Let's thank our guest.
[Applause]