Mohit Singh: Hello everyone. It's a pleasure to have Uriel Feige back with us; he's a regular visitor. He will tell us today about separations between estimation and approximation.

Uriel Feige: Good afternoon. This is joint work with a former student of mine, Shlomo Jozeph. We all know of NP-hard combinatorial optimization problems. Here are some examples: Max 3SAT, minimum [indiscernible], maximum independent set, and so on. Each of these is a problem; Max 3SAT, say, is a problem, and it has problem instances. So a problem is really a class of instances of a similar type. For NP-hard optimization problems, the standard way of coping is through approximation algorithms. Let me just go over the usual definitions. An approximation algorithm is a polynomial time algorithm A, say, that given an instance produces a feasible solution, not necessarily an optimal solution. Then we want to measure its performance. One way to measure it is the approximation ratio; here is the definition for maximization problems, and for minimization problems there are similar definitions. On a given instance, the approximation ratio of algorithm A is the value of the solution that it found divided by the value of the optimal solution. The approximation ratio of the algorithm as a whole is the minimum over all instances of its approximation ratio. And the best approximation ratio possible for a problem is the maximum over all polynomial time algorithms of their approximation ratios; of course this should be [indiscernible], but let's not worry about that. For example, max [indiscernible], as is well known, can be approximated within a ratio of 7/8. The other side of approximation algorithms is proving hardness of approximation, which limits how well you can approximate the problem. And here is a typical example of what a hardness of approximation result looks like.
So this is the famous result by Hastad. The proof looks the following way. You have a polynomial time reduction that starts from some source problem and gives you an instance of a target problem. Here both the source problem and the target problem are 3SAT. So you have a reduction that maps a 3CNF formula phi to a 3CNF formula psi with the property that if the original formula is satisfiable, then so is the target formula. And if the original formula was not satisfiable, then you get a gap: in the target formula you can satisfy at most 7/8 of the clauses. If you have such a reduction, then if you could distinguish between the two types of psi, you could distinguish between the two types of phi. So it means that it is NP-hard to distinguish between satisfiable 3SAT instances and those which are at most 7/8 satisfiable. From this, of course, we derive that Max 3SAT cannot be approximated within a ratio better than 7/8. So this is a well-known reduction.

The other notion we discuss here is that of estimation algorithms, which is a notion similar to approximation algorithms. For maximization problems, an estimation algorithm E is a polynomial time algorithm that, given an instance, outputs just a value. It does not need to output a solution to the optimization problem, just a value. And the value has to be not larger than opt for maximization problems; for minimization problems it has to be not smaller than opt. In a sense it says: I guarantee to you that there is some solution with at least this value. This is really what it guarantees, and it had better be right; by the definition, the value indeed has to be not larger than opt. So it does not find a solution, but it tells you that there is a solution of at least this value.

>>: Is there a proof along with this?
Uriel Feige: The fact that it is an estimation algorithm, that the value it outputs is a valid estimate, serves as the proof; otherwise it is not an estimation algorithm. Of course, usually you need to prove that a certain algorithm is an estimation algorithm, but this you do externally. The algorithm itself does not need to show you a proof. The estimation ratio is defined similarly to the approximation ratio: for every given instance you have the estimation ratio on that instance, then for a given algorithm you have the estimation ratio of that algorithm, and for a given problem you have the best estimation ratio achievable by polynomial time algorithms.

If we compare these two tasks of estimation and approximation, then estimation is easier, because every approximation algorithm, say with approximation ratio rho, also serves as an estimation algorithm. The estimation algorithm can look at the solution output by the approximation algorithm and output the value of that solution, and that is a valid estimate within the same ratio rho that the approximation algorithm achieves. In this sense, if you compare the estimation ratios you can achieve to the approximation ratios you can achieve, estimation is no harder than approximation. Now, the interesting point is that in many cases it is also not easier. For example, the hardness of approximation result of Hastad that we just saw is also a hardness of estimation result. It says nothing about actually finding a solution; it is about distinguishing instances that have a solution of high value from instances in which every solution has low value, without the need to output any solution of any value. So there are also hardness of estimation results, and this is typical; it is not special to Hastad's result.
So the known techniques that we have for deriving hardness of approximation results also provide hardness of estimation results. If we look at situations where we have tight results, where we have approximation algorithms with a proven approximation ratio and matching hardness of approximation results, then these hardness of approximation results all turn out to be hardness of estimation results, matching up to low order terms. So for Max 3SAT, we know that the best approximation ratio and the best estimation ratio is 7/8; for Max XOR it is 1/2; for max coverage, 1 - 1/e; for min set cover, ln n; and so on. There is no difference between approximation and estimation for these problems. In some cases we have tight results under an assumption different from P = NP. A common assumption used in this context is the unique games conjecture of Khot. Based on this conjecture you can also prove hardness of approximation results, and again these are also hardness of estimation results; it is the same thing. For example, under this assumption, min vertex cover is a problem which can be approximated within a ratio of 2 and can neither be approximated nor estimated within any better ratio, up to low order terms. And for max cut, the best approximation ratio is whatever Goemans and Williamson proved, this number. In general, for large classes of problems we have tight approximability results based on the unique games conjecture. So we can ask ourselves: are there any combinatorial optimization problems for which estimation is easier than approximation? Can the estimation ratio be better than the approximation ratio for some problem?
A priori there is no reason why they should be the same, but somehow a lot of the evidence points toward them typically being the same. Okay. So one place to look for problems for which estimation may be better than approximation is problems for which we have estimation algorithms that are not approximation algorithms: they give you an estimate without actually finding a solution. So maybe they are doing something different from what approximation algorithms need to do. Here are some examples. For Max 3SAT, I am looking at instances in which each clause has exactly 3 variables. If you have m clauses, an estimation algorithm can just count the number of clauses, multiply it by 7/8, and say: I guarantee to you that you can satisfy 7m/8 clauses. This is valid, because a random assignment satisfies this many clauses in expectation on such instances of 3SAT. So this is a valid estimate without actually exhibiting any solution. Another example ...

>>: But you just said there is a solution, the randomized ...

Uriel Feige: No, but the algorithm does not give you one. The estimation algorithm does not give you one. Our examples are of estimation algorithms that do not go through finding a solution; they do not find a solution and look at its value, they rather compute the number in some other way. And these algorithms may be a source of differences between estimation and approximation. Another example would be min dominating set. If you do not know the exact definitions of the problems, it does not matter that much. Given a d-regular graph, you just output a number: the number of vertices divided by d+1, multiplied by ln(d+1). Every such graph in fact has a dominating set of this size or smaller. So this is a valid estimate, and no such graph has a dominating set smaller than n/(d+1), so the estimation ratio is ln(d+1); up to a logarithmic factor we are optimal here.
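To make the counting estimator concrete, here is a minimal Python sketch. The clause encoding (triples of signed integers) is my own convention, not from the talk, and the brute-force check at the end is only there to confirm on a tiny instance that the estimate is a valid lower bound on opt.

```python
def estimate_max_3sat(num_clauses):
    # Trivial estimator from the talk: a uniformly random assignment
    # satisfies each exactly-3-literal clause with probability 7/8, so
    # some assignment satisfies at least 7m/8 clauses.
    return 7 * num_clauses / 8

def satisfied(clauses, assignment):
    # A clause is a triple of literals; literal v > 0 means variable v
    # is true, v < 0 means variable |v| is false.
    return sum(any((lit > 0) == assignment[abs(lit)] for lit in clause)
               for clause in clauses)

# All 8 sign patterns over 3 variables: every assignment falsifies
# exactly one clause, so opt is 7, matching the estimate exactly.
clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3),
           (-1, -2, 3), (-1, 2, -3), (1, -2, -3), (-1, -2, -3)]
estimate = estimate_max_3sat(len(clauses))
opt = max(satisfied(clauses, {1: a, 2: b, 3: c})
          for a in (True, False) for b in (True, False) for c in (True, False))
```

Note that the estimator only reads the number of clauses; it never touches an assignment.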
And again, you did not find any dominating set explicitly; you just output a number. Or the min bandwidth problem: for a graph, you look for a numbering of the vertices from 1 to n, you look at the edge for which the difference between the numbers of its endpoints is maximal, and that difference is your bandwidth, which you want to minimize. For circular-arc graphs, which are graphs like in this picture, each vertex is an arc on a circle, and two vertices share an edge if the arcs intersect. If you want to estimate the bandwidth of such a graph, you can just look at the maximum clique size, a point where as many arcs as possible intersect; here it would be 3. Multiply it by 2 and say that is the bandwidth. And that is an estimate within a ratio of 2. So these are examples of estimation algorithms that are not approximation algorithms.

If we look at these examples: first, all of these estimation algorithms are trivial. What else do we know? In the examples that I gave, the estimation ratios are in fact best possible; it is NP-hard to do any better, so these trivial algorithms are optimal estimation algorithms. You cannot do any better. And moreover, there is an approximation algorithm achieving the same ratio. For Max 3SAT, for example, either a random assignment or a greedy assignment achieves the same approximation ratio. So this, so far, was not a good place to look if we want to separate estimation from approximation; the estimation algorithms that we know mostly do not give such a separation.

So let's also visit some common approximation algorithms. One such class, not just a single algorithm, is greedy algorithms, and they come in certain variations. For certain problems that are solvable in polynomial time, greedy algorithms are known to produce the optimal solution.
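The bandwidth estimate for circular-arc graphs can be sketched the same way. This is a hypothetical encoding (not from the talk): each arc is a (start, end) pair of positions on a circle, possibly wrapping, and the "clique size" is taken, as in the talk's description, to be the largest number of arcs crossing a single point.

```python
def contains(arc, point):
    s, e = arc
    # Arcs may wrap around the circle (start > end).
    return s <= point <= e if s <= e else point >= s or point <= e

def max_point_overlap(arcs):
    # The overlap count changes only at arc endpoints, so its maximum
    # is attained at some arc's start point; checking those suffices.
    return max(sum(contains(a, p) for a in arcs) for p, _ in arcs)

def estimate_bandwidth(arcs):
    # As in the talk: twice the largest overlap is a valid estimate of
    # the minimum bandwidth, within a ratio of 2.
    return 2 * max_point_overlap(arcs)

# Three arcs sharing the point 110, as in the picture: overlap 3.
arcs = [(0, 120), (100, 200), (110, 250)]
estimate = estimate_bandwidth(arcs)  # 2 * 3 = 6
```

Again the estimator outputs only a number; no vertex numbering is ever constructed.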
For example, for minimum spanning tree, or more generally for problems that have a matroid structure, the greedy algorithm is known to be exactly optimal. And for many optimization problems that are NP-hard, it is known that the greedy algorithm gives the optimal approximation ratio, in the sense that it is matched by hardness of approximation results. Often these are covering problems such as max coverage, min set cover, min-sum set cover, and so on. For all these problems the approximation ratio achieved by the greedy algorithm (1 - 1/e for max coverage, ln n for set cover, 4 for min-sum set cover) is optimal: it is matched by a hardness of approximation result, and by a hardness of estimation result. So for problems that we solve using the greedy algorithm, estimation and approximation match each other.

Another common methodology for designing approximation algorithms is the use of linear programming relaxations, or semidefinite programming relaxations, or sometimes hierarchies of relaxations. In general the template is as follows. You first formulate the combinatorial optimization problem as an integer program. Then you relax the integer program by removing the integrality constraints, allowing fractional solutions. Then you find an optimal fractional solution to the linear program; linear programs can be solved in polynomial time, so this step you can actually do efficiently. Integer programs in general cannot be solved in polynomial time; that is why we needed the relaxation. Then there is a procedure, often known as rounding the LP solution, which is specific to each problem; there might be a different rounding procedure for each problem, and it gets you back a feasible solution to the original integer program, that is, to the original problem. And then you evaluate how good that approach is by comparing the value that you get from the rounded solution to the value of the LP solution.
So how much did you lose in the process between the LP solution and the rounded solution? Let's see a picture of how this looks. For a certain instance of a maximization problem, let's normalize the optimal value to 1. The integer programming formulation represents the problem exactly, so its optimal value is also 1. Once you relax the integer program to a linear program, you have potentially added feasible fractional solutions, so the value can go up. And when you round, you get back a feasible solution to the IP, so after rounding your value cannot be larger than 1, and typically it will be even smaller. The way you usually analyze this scheme is to compare the value of the rounded solution to the LP solution. But in fact this is not necessarily the approximation ratio of such an algorithm: the approximation ratio is the comparison between the rounded value and the optimal value, which can be a better ratio. Another ratio of interest here is the ratio between the value of the LP solution and the value of the IP solution; this ratio is always at least as good as the ratio between the rounded solution and the LP solution, and it is referred to as the integrality gap. So there are really two approximation ratios, the rounded value compared to the optimal value and the one which is usually analyzed, the rounded value compared to the LP value; and then another quantity of interest, the integrality gap. If we look at this through the eyes of approximation versus estimation, then from such a linear programming relaxation we can perhaps get two estimation algorithms which are not necessarily the same. One is just the approximation algorithm itself, used as an estimation algorithm.
Its ratio is the rounded value over the LP value, or over the optimal value; it is exactly the same as the approximation algorithm. You just take the rounded solution and output its value. But there is another estimation algorithm, which is to look only at the value of the LP. If you know the worst-case integrality gap over all instances (this is just a number), you can multiply the LP value by this integrality gap; then you necessarily get a value not larger than the IP optimum, as we wanted for maximization problems, and this is your estimate. Either one of these might be the better estimation algorithm, because the gap between the rounded value and the optimum might always be smaller than the integrality gap, or the other way around. So we have two choices of estimation algorithms: there is essentially one approximation algorithm coming out of the linear programming relaxation, but two estimation algorithms, which might have different estimation ratios. So we may hope that the second of these estimation algorithms, the one based on the integrality gap, has an estimation ratio better than the approximation ratio; then we would have some separation.

So let's look at some evidence. As an example, take min vertex cover, a well-known problem: in the graph you want to select the smallest set of vertices, here the green vertices, that covers all the edges. How do we apply the framework from before? To write an integer program for it, we have one variable x_i for each vertex i, which you think of as 1 if vertex i is selected into the vertex cover and 0 otherwise. You want to minimize the number of vertices selected such that every edge is covered and every vertex is either selected or not, so each x_i is either 0 or 1. And then the LP relaxation says: okay, a vertex can take a fractional value.
It does not have to be 0/1, so you might get a fractional solution. A well-known rounding for this LP: every vertex with fractional value at least half is rounded up to 1, and below half is rounded down to 0. It is easy to see that this gives a vertex cover. The approximation ratio, measured as the rounded value compared to the LP value, is no worse than 2, because at most we doubled the value at each vertex. So this is well known. And this is tight even if we measure the approximation ratio as the rounded value compared to the optimal solution; the example is just a single edge. On a single edge, the LP might give each endpoint value half; we round both of them to 1, so we find a solution of value 2, while the optimal solution has value 1. The integrality gap is slightly better than 2: it can be shown to be at most 2 - 2/n. So the estimation ratio we get here is slightly better, but these are low order terms of the kind we ignore in these talks; practically they are the same. And often you have other approximation algorithms that also save you low order terms, so this is not really an advantage here.

Another example is the famous max cut algorithm of Goemans and Williamson, which takes a semidefinite programming relaxation of the max cut problem, rounds it using a random hyperplane, and gets an approximation ratio that I will call alpha, this 0.878-something number; the analysis bounds the ratio between the value of the rounded solution and the SDP value. We can ask about the true approximation ratio, namely the ratio between the rounded solution and the optimal solution rather than the SDP value, and Karloff showed examples of graphs on which it is no better than alpha. You can also ask about the integrality gap, and in work of Schechtman and myself we showed that the integrality gap is also no better than alpha.
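The whole vertex cover pipeline fits in a few lines if we sidestep an actual LP solver. This sketch relies on the known fact that the vertex cover LP always has a half-integral optimal solution, so on tiny graphs brute force over the values {0, 1/2, 1} finds an LP optimum. The triangle example below exhibits the integrality gap 2 - 2/n at n = 3 (LP value 1.5 versus integral optimum 2).

```python
from itertools import product

def vc_lp_and_rounding(n, edges):
    # Brute-force LP optimum over half-integral values: valid only
    # because the vertex cover LP has a half-integral optimal solution.
    best = None
    for x in product((0.0, 0.5, 1.0), repeat=n):
        if all(x[u] + x[v] >= 1 for u, v in edges):
            if best is None or sum(x) < sum(best):
                best = x
    # Standard rounding: keep every vertex with LP value at least 1/2.
    cover = [i for i in range(n) if best[i] >= 0.5]
    return sum(best), cover

# Triangle: the LP puts 1/2 everywhere (value 1.5), rounding takes all
# three vertices, and the integral optimum is 2, so the integrality gap
# on this instance is 2 / 1.5 = 2 - 2/3.
lp_value, cover = vc_lp_and_rounding(3, [(0, 1), (0, 2), (1, 2)])
```

The exponential brute force is only a stand-in for a polynomial time LP solver; the rounding step is exactly the one from the talk.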
So in this case the estimation ratio and approximation ratio provided by this algorithm are the same, and in fact best possible assuming the unique games conjecture. More generally, we can look at other constraint satisfaction problems; Max 3SAT and max cut are examples of constraint satisfaction problems, and max [indiscernible] SAT and so on are additional ones. Assuming the unique games conjecture, for every Boolean constraint satisfaction problem there is a certain semidefinite program whose integrality gap matches the best possible approximation ratio and the best possible estimation ratio: estimating any better would refute the unique games conjecture, and moreover the SDP can be rounded to actually give a solution of that value. So also in this case the approximation ratios and estimation ratios match each other.

So do we know of examples in which there appear to be gaps between what you can do with approximation and with estimation? I will give you an example where there might be such a gap. This is from the work of Leighton, Maggs, and Rao. It talks about universal packet routing, but another way to describe this problem is acyclic job shop scheduling with unit operations. The problem is defined as follows. You have jobs and you have machines. Every job has a sequence of unit operations; it has to perform them in this particular sequence, but perhaps with gaps in the middle, and that is okay. It is a sequence of unit operations that have to be done on different machines. For example, job 1 may have to do one operation on machine 2; after it finishes this operation, it can do one operation on machine 4, then on machine 9, then on machine 3. This is related to packet routing: a packet has to traverse first link 2, then link 4, then link 9, and so on. And every machine can process only one job at each unit of time.
So this is again like packet routing: on each link, you can send one packet per unit of time. What you want to minimize is the makespan, the time at which the last job completes. So this is the problem. What do we know about it? Leighton, Maggs, and Rao gave an estimation algorithm for it, not an approximation algorithm, which is as follows. The dilation, the length of the longest job, is an obvious lower bound on the makespan, because the longest job has to complete. Likewise, the congestion, the load on the most loaded machine, is also a lower bound, because that machine has to process every job that wants to go through it. So these are two lower bounds, and the combined lower bound is the maximum of the two; not the minimum, the maximum. And what they prove is that this lower bound is in fact always tight up to a constant factor: there always exists a schedule for these problems whose makespan is within a constant factor of this lower bound. The constant that they had was maybe pretty large, I don't know, a hundred or so; probably the true constant is much smaller. I don't think examples are known in which the gap is more than maybe a factor of 3 or so. So this lower bound provides an estimate. It does not show you how to schedule the jobs on the machines, but it guarantees that you can do so within some ratio. Their proof used repeated applications of the Lovasz local lemma. At the time they had this proof, the lemma was not constructive; there were no algorithmic versions of it. So at the time it did not give a polynomial time approximation algorithm. What happened later? Algorithmic versions of the local lemma started to appear. Joseph Beck had one in 1990, and a few years later it was used to get an algorithmic version of this LMR result.
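The Leighton-Maggs-Rao estimate itself is trivial to compute. In this sketch a job is just the list of machines its unit operations visit; only job 1's route is from the talk, the other two jobs are made up to put extra load on machine 4.

```python
def makespan_lower_bound(jobs):
    # dilation: length of the longest job (it has to complete).
    # congestion: operations on the busiest machine (one per time step).
    # Their maximum lower-bounds the makespan; LMR show that a schedule
    # within a constant factor of this bound always exists.
    dilation = max(len(ops) for ops in jobs)
    load = {}
    for ops in jobs:
        for machine in ops:
            load[machine] = load.get(machine, 0) + 1
    congestion = max(load.values())
    return max(dilation, congestion)

# Job 1 visits machines 2, 4, 9, 3 as in the talk; two hypothetical
# jobs add load to machine 4.
jobs = [[2, 4, 9, 3], [4, 1], [5, 4]]
bound = makespan_lower_bound(jobs)  # dilation 4, congestion 3, so 4
```

As the talk stresses, this outputs a number only; no schedule is constructed.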
But the constants were worse, because Beck's algorithmic version of the local lemma loses in the constants. Later, Robin Moser got a tighter algorithmic version of the local lemma that does not lose any constants; the algorithm achieves exactly the same bounds as the existential statement of the local lemma, and there are further extensions. You can now use it to redo the LMR argument and get an actual schedule with an approximation guarantee similar to what the existential analysis gives. But this is not completely satisfactory, because we do not know whether the current analysis of this problem, using repeated applications of the local lemma, is tight. We do not know the exact approximation or estimation ratio that this lower bound provides. In fact it might be much better than what the proof using the local lemma gives, and then we would not have an algorithm matching it. So it is possible that for this particular problem we already have an estimation algorithm with a better ratio than the best approximation algorithm that we have, because the estimation algorithm just outputs essentially the lower bound itself, and that may provide a better estimate than what our approximation algorithms can give. So this is, let's say, one candidate for a place where there is a gap between estimation and approximation. There are not many such candidates. Another well-known one is the asymmetric traveling salesman problem. For that there is a well-known linear programming relaxation by Held and Karp. It is known that the integrality gap of this linear programming relaxation is no better than a factor of 2. In terms of actually rounding it algorithmically, the best known approximation ratio is log n over log log n, whereas there is a nonconstructive proof showing that the integrality gap is in fact no worse than some power of log log n.
So the value of the Held-Karp relaxation gives an estimation algorithm within this ratio, but we currently do not have an approximation algorithm within this ratio, and it is an open problem both what the true estimation ratio provided by the Held-Karp relaxation is and whether the estimation ratio and approximation ratio will eventually match for this problem. It is also an open problem for symmetric TSP. For symmetric TSP, the known approximation ratio is 3/2. One approach also goes through the Held-Karp relaxation, and all we know about its integrality gap is that it is no better than 4/3. So maybe the Held-Karp relaxation provides an estimation ratio of 4/3, and we do not have a matching approximation algorithm. Again, this is a problem for which maybe there is a gap between estimation and approximation; we do not know. There are a few other such problems for which there are currently gaps, but not many; these kinds of problems tend to be rare.

>>: By what technique is the log log n to the constant shown? Is it the local lemma?

Uriel Feige: No, it is not the local lemma. It is rather complicated; it has many stages, [indiscernible] which has a nonconstructive component but uses a different technique.

So there are problems for which maybe there is a gap between estimation and approximation. Are there any problems for which there provably is a gap? Are there any linear programming relaxations whose integrality gap is strictly better than the best possible approximation ratio for the problem? If so, then estimation would be better than approximation; it would have a better ratio. So this is the topic here. It relates also to a well-known question, the relation between decision and search in optimization problems. The decision problem for vertex cover would be: does the graph have a vertex cover of size k? You just have to answer yes or no.
You do not have to actually exhibit it, whereas the search problem would be to find a vertex cover of size k. For NP-complete problems, it is known that search is reducible to decision: if you can always answer the decision problem correctly, there is a reduction for NP-complete problems that allows you to solve the search problem exactly. So in this respect, decision and search are the same for NP-complete problems. Now we can think of estimation as sort of a relaxation of decision. When you ask whether there is a vertex cover of size k, then if the minimum vertex cover is smaller than k/2, you had better answer yes; say you want an estimation ratio of 2. But if the size of the minimum vertex cover is, say, 3k/4, you are allowed to say "I don't know" or "maybe", because it is only an estimate. Likewise, approximation is a relaxation of search: if the minimum vertex cover is smaller than k/2, you had better output a vertex cover not larger than k; if it is larger than k/2, maybe you are allowed to output something larger than k. So approximation is a relaxation of search, and estimation is a relaxation of decision. And we can ask, for the problems we are looking at: is approximation reducible to estimation, in the way that search is reducible to decision for NP-complete problems? So this is another way of phrasing the question.

In this context, let me mention some problems for which the decision version is very easy. These are total functions in NP, the class known as TFNP: problems for which a solution is guaranteed to exist. Rather than defining the class formally, let me just give some examples. For example, factoring: you are given a positive integer n, and you want to output some prime factor of n. Any such prime can be the output, if there are several. So there is always a solution; the problem is that we do not know how to find it. Factoring is difficult.
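Factoring as a total search problem can be illustrated with trial division, a sketch of my own: it shows that a solution always exists, while its running time is exponential in the bit length of n, which is why it does not contradict the believed hardness of factoring.

```python
def some_prime_factor(n):
    # Every integer n >= 2 has a prime factor, so this search problem
    # is total. Trial division finds the smallest one, but it takes
    # time exponential in the number of bits of n.
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n itself is prime

factor = some_prime_factor(91)  # 91 = 7 * 13, smallest prime factor 7
```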
Another problem would be Nash equilibrium: given a 2-player game, output a Nash equilibrium. By Nash's theorem, a Nash equilibrium always exists, but we do not know an efficient algorithm to find one. Another example: given a graph with edge weights, output a maximal cut. A maximal cut is one in which no vertex can gain by switching sides; each vertex wants most of the weight of its neighbors to be on the other side, and a maximal cut is one in which the cut weight cannot grow by having one vertex switch sides.

>>: [Indiscernible].

Uriel Feige: This also always exists; it is easy to show, because you can keep making local improvements until you stop. But we do not know how to find one in polynomial time. Another example I call pigeonhole sum, though maybe it has a different name. You are given n integers in the range between 1 and 2^n/n; output two subsets that have the same sum. By the pigeonhole principle, you have 2^n subsets to choose from and the sums are at most 2^n, so two must have the same sum. Again, we do not know how to find them efficiently.

So what does complexity theory say about TFNP? The decision version is always trivial: whenever you are asked whether there is a solution, you say yes. What about hardness? There are problems which are hard for TFNP in the sense of being NP-hard: if you had a polynomial time algorithm for an NP-hard problem, you could solve every problem in NP, and hence every problem in TFNP, because TFNP is contained in NP. However, NP-hard problems cannot themselves be in the class TFNP unless P = NP, because for NP-hard problems the decision problem is already hard, while here the decision problem is easy.
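The pigeonhole-sum problem can likewise be demonstrated by brute force on a tiny instance. This sketch is exponential in n, which is exactly the point: a colliding pair is guaranteed to exist, but no polynomial time algorithm for finding one is known.

```python
from itertools import combinations

def equal_sum_subsets(nums):
    # With n integers in [1, 2**n / n], the 2**n subset sums are all at
    # most 2**n, so by pigeonhole two distinct subsets collide. Brute
    # force over all subsets finds such a pair for tiny n.
    seen = {}
    for r in range(len(nums) + 1):
        for subset in combinations(range(len(nums)), r):
            total = sum(nums[i] for i in subset)
            if total in seen:
                return seen[total], subset
            seen[total] = subset
    return None  # cannot happen when the promise holds

# n = 4, values at most 2**4 / 4 = 4: a collision is guaranteed.
nums = [1, 2, 3, 4]
a, b = equal_sum_subsets(nums)
```

The two returned subsets are index sets with equal sums, for example {3} and {1, 2}.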
And in fact there are reasons to believe that there are no TFNP-complete problems, that is, problems within the class TFNP that are hardest for TFNP. So there are problems in TFNP that appear to be difficult, but not in the sense of being TFNP-hard. Instead, what we have in TFNP is subclasses, defined based on the argument by which you show that a solution must exist. For example, if it is based on local search, the class is called PLS, polynomial local search; if it is based on directed parity arguments, the class is PPAD. And these subclasses do have complete problems: Nash equilibrium is complete for PPAD, and maximal cut is complete for PLS. So it seems that problems in TFNP, unlike problems in NP, where the NP-complete problems are all difficult for exactly the same reason because they are all inter-reducible to one another, may be hard for many different reasons, and for each reason you have a complete problem. There are many sources of hardness; there is not one unifying concept that explains the hardness of problems in this class.

So how do estimation, approximation, and TFNP relate? There is an obvious relation. For a certain problem, say max Pi, whatever Pi is: if estimation within a ratio of rho is in FP (I write FP instead of P because it is functional P, not a decision problem; you have to output a number), so if it is polynomial time computable, then approximation within a ratio of rho is reducible to a problem in TFNP. You are guaranteed to have a solution of value at least the estimate, so you are in a situation where a solution is guaranteed to exist and the problem is to find it, and that is TFNP. So this proposition is intuitively trivial.
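The existence argument behind the PLS-complete maximal cut problem is plain local search, which can be sketched directly. The weighted triangle at the end is a made-up example; on large instances this loop can take exponentially many improvement steps, which is why totality does not yield a polynomial time algorithm.

```python
def locally_maximal_cut(weights):
    # weights: {(u, v): w} with each undirected edge listed once.
    # Switch any vertex that strictly gains; each switch increases the
    # cut weight, so the loop terminates at a locally maximal cut.
    vertices = sorted({u for edge in weights for u in edge})
    side = {v: 0 for v in vertices}

    def gain(v):
        # Change in cut weight if v switches sides.
        g = 0
        for (a, b), w in weights.items():
            if v in (a, b):
                other = b if a == v else a
                g += w if side[other] == side[v] else -w
        return g

    improved = True
    while improved:
        improved = False
        for v in vertices:
            if gain(v) > 0:
                side[v] = 1 - side[v]
                improved = True
    return side

weights = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
side = locally_maximal_cut(weights)
cut_weight = sum(w for (u, v), w in weights.items() if side[u] != side[v])
```

On this triangle the local optimum happens to be globally optimal; in general local search only promises local maximality.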
There are some fine points that one has to observe if you actually want to write a formal proof, because I did not give you the formal definition of the class TFNP. If you look at the formal definition, you may notice that in the proof as I wrote it here, it's important that I said the problem is reducible to a problem in TFNP and not that it is in TFNP itself. Let's not bother about that. So the consequence: if we want to show that estimation is different from approximation, we need to show a certain problem for which you can achieve some estimation ratio in polynomial time, while achieving the same ratio by an approximation algorithm is hard. And it cannot be NP-hard — it will be in TFNP — so what you want to say is that it's TFNP-hard. But TFNP-hard is not just one concept; TFNP breaks into a lot of subclasses, so maybe you would like to say it's PPAD-hard or PLS-hard. These are the kinds of results that we can hope to achieve if we want to separate estimation from approximation. And indeed we can achieve this very easily; it's completely trivial. Let's define the following optimization problem. You are given an integer n, and a feasible solution is, by definition, any prime in the range between 2 and n. The value of the solution is 1 if this prime is a divisor of n, and 0 otherwise. So we have just cast a search problem as an optimization problem in some artificial way. Now an optimal solution is a prime that divides n. Finding such a solution is as hard as factoring, which we believe to be hard. Any non-trivial approximation — finding any solution of value 1 rather than 0 — is as hard as factoring, so you cannot approximate this problem. But estimation is trivial: you simply output the estimate 1.
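The artificial problem just described can be written down directly. This is a toy sketch (the function names are mine); trial division is used only to implement the easy value oracle, not the hard search for a divisor.

```python
# Sketch of the artificial problem above: an instance is an integer n >= 2,
# a feasible solution is any prime p in [2, n], and the value is 1 if p
# divides n and 0 otherwise. Trial division implements only the (easy)
# value oracle, not the hard search.

def is_prime(p):
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def value(n, p):
    """Objective value of solution p on instance n."""
    assert is_prime(p) and 2 <= p <= n, "feasible solutions are primes in [2, n]"
    return 1 if n % p == 0 else 0

def estimate(n):
    """Trivial estimation algorithm: every n >= 2 has a prime divisor, so
    opt = 1 always; we may certify this without exhibiting a divisor."""
    return 1
```

Any algorithm that actually outputs a solution of value 1 factors n, while `estimate` certifies that such a solution exists without finding it — exactly the estimation/approximation gap from the talk.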
The value of the objective function is 1 because there is always a prime that divides n; I allow any prime to be output as a solution. And the same kind of trivial reduction applies to any problem in TFNP: instead of factoring, plug in Nash equilibrium or whatever you want, and do the same kind of thing. So are we done? Okay. We designed an artificial optimization problem for which estimation is easy but approximation is hard, so that would be it. We obtained our goal, but there was a price to be paid: the optimization problem is artificial. So now we only negotiate the price: how do we make this problem look less artificial and more natural? That's what remains to be done. The first extension just makes it nicer in another respect, namely controlling the gaps. We have the following lemma, again quite easy. We will have an alpha for the approximation ratio and a larger beta for the estimation ratio, because estimation should be easier than approximation, and pick your favorite epsilon. For every choice of alpha, which is at least 0 — it can even equal 0 — and beta strictly larger than alpha and at most 1, and any epsilon, we can design a family of optimization problems such that for every instance of that family, achieving a beta estimation can be done in polynomial time, and achieving an alpha approximation can also be done in polynomial time. However, there is no (beta + epsilon)-estimation algorithm — beta is the best estimation ratio for this class of problems — unless P equals NP. So here we can have unless P equals NP. And alpha is the best approximation ratio, but here we cannot say unless P equals NP; here it's really unless TFNP is in FP — choose your favorite problem in TFNP and plug it in here.
So like I said, there is no one universal complete problem for TFNP, so you can say unless factoring is easy, or unless PPAD is in FP, or PLS is in FP, and so on. How do we do it? Here is a proof by example; again, pretty simple. Let's fix alpha to be 7/16 and beta to be 7/8. The 7/8 comes from 3CNF, and 7/16 is just half of it. The input to our artificial problem — it's still artificial — is composed of two components: a 3CNF formula with m clauses, and an integer n, which is like the factoring problem. The output is two things: an assignment to the 3CNF formula, and a prime p, which is supposed to divide n but might not. For the value of the output, you look at the two components. For the assignment you count how many clauses it satisfies, and for the prime you output you ask whether it divides n. If it divides n, you get a bonus of a factor of 2 on the number of clauses that you satisfy; if it does not divide n, you don't get the bonus. Now let's see what happens. The estimation algorithm just outputs 7m/4, because in a 3CNF formula with m clauses, 7m/8 of them are satisfiable — we know that — and n does have a prime factor, so you can take the bonus factor of 2 and output this as the value of the estimation algorithm. Since the maximum possible value you can have here is 2m, that's a 7/8 estimation ratio. An approximation algorithm will be able to find an assignment that satisfies 7/8 of the clauses but will not be able to factor n, so it would get just 7m/8, and the approximation ratio would be 7/16. So this is how you can control where you put these gaps. Using tricks like this, you can literally put them wherever you want; I showed just one example. Still, there are things we don't like about this reduction. The objective function is a product of two terms.
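The value function and the trivial estimator of this 7/16 vs 7/8 example can be sketched as follows. The clause encoding (DIMACS-style signed integers) and the names are illustrative choices; the 7/8 bound used by `estimate` holds for 3CNF formulas with three distinct variables per clause.

```python
# Sketch of the alpha = 7/16, beta = 7/8 example above. A clause is a list
# of three nonzero ints in DIMACS style (literal k means variable |k|,
# negative for negation); representation and names are illustrative.

def num_satisfied(clauses, assignment):
    """assignment maps variable -> bool."""
    return sum(
        any(assignment[abs(l)] == (l > 0) for l in clause)
        for clause in clauses
    )

def value(clauses, n, assignment, p):
    """Number of satisfied clauses, doubled when the prime p divides n."""
    sat = num_satisfied(clauses, assignment)
    return 2 * sat if n % p == 0 else sat

def estimate(clauses, n):
    """beta-estimation: some assignment satisfies at least 7/8 of the
    clauses (3CNF, three distinct variables per clause), and n has a
    prime factor, so opt >= 2 * (7/8) * m = 7m/4."""
    return 7 * len(clauses) / 4
```

An approximation algorithm can reach 7m/8 of the satisfied-clause term but, without factoring n, never collects the factor-2 bonus; since opt can be as large as 2m, its ratio drops to 7/16 while the estimate stays within 7/8 of opt.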
Usually we prefer the objective to be linear, not a product of terms. Also, the factoring component is extremely sensitive to small changes: a prime either divides n or, after a small change to the prime, it does not, and each time the value jumps by a factor of 2. We like smoother objective functions. So there may be other things we don't like about this instance. Now the main theorem is something that looks more natural. It is derived from things which are artificial by design, but the output looks more natural. As before, for every alpha, beta, and epsilon, there is a class of integer programs — these are objects we are familiar with. The objective function is always non-negative; this helps us talk about approximation ratios, because if the objective function is sometimes 0 or negative, it is difficult to talk about approximation ratios. And it has properties very similar to what we had before. For every instance of this class of integer programs, the LP relaxation has an integrality gap no worse than beta. So the LP relaxation is what gives you the beta estimation: just output the value of the LP relaxation. And you can always find a feasible solution to the integer program whose value is within a ratio of alpha of the optimal value, and you can do this algorithmically — just output that solution to the integer program. And these are best possible: there is no (beta + epsilon)-estimation unless P equals NP, and there is no (alpha + epsilon)-approximation unless — again I'm writing unless TFNP is in FP, but here you take your favorite TFNP problem and plug it in. So what's the proof? How do we prove something like that? The basic observation is that integer programming is NP-complete, meaning that every problem in NP can be reduced to an integer program.
So we start from an artificial-looking problem like the ones I showed before and reduce it to an integer program. There are two aspects to watch for in the reduction. One, the reduction should be approximation preserving: we don't just want yes instances to map to yes instances and no to no, but also the approximation ratios to be preserved. Two, we have an extra condition saying what the estimation algorithm is: it has to coincide with the LP relaxation. The estimation algorithm has to take the natural LP relaxation of the integer program and output its value; you cannot choose what the estimation algorithm is. So this is what you need to achieve. And how would you prove it? Basically you just use standard reductions; there is nothing difficult. You start with a relation in TFNP, look at the Turing machine that verifies that the relation is satisfied by a solution, turn it into a circuit, and encode the circuit in the integer program. There are a lot of details and they are not very interesting, so I will not go over them; it's just standard techniques that are used in reductions, and if you do it carefully you can maintain all the properties that you want. Nothing fancy is going on here. You can ask what the integer program that we get looks like. If I showed it to you, you would see constraints on a few variables, like x1 + x2 + 2x3 >= 2, constraints like this. So is it natural? "Natural" is not a well-defined question. You can maybe list some properties that you want the integer program to have, and if it has these properties, it's a nice integer program.
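The talk's reduction encodes TFNP relations into integer programs; as a stand-in, here is a standard miniature (not the paper's construction) of the pattern "LP value gives the estimate, LP rounding gives the matching approximation", using vertex cover. Since vertex cover is a minimization problem, the certified estimate here is twice the LP value. Half-integrality of the vertex-cover LP (Nemhauser-Trotter) lets us solve the LP exactly on a tiny graph by brute force; all names are illustrative.

```python
# A standard miniature (not the paper's construction) of the pattern "LP
# value gives the estimate, LP rounding gives the matching approximation",
# using vertex cover. The vertex cover LP always has a half-integral
# optimum (Nemhauser-Trotter), so on a tiny graph we can solve the LP
# exactly by brute force over x in {0, 1/2, 1}^n.

from itertools import product
from fractions import Fraction

HALF = Fraction(1, 2)

def vc_lp_value(n, edges):
    """Minimize sum(x) over x in {0, 1/2, 1}^n with x_u + x_v >= 1 for
    every edge; by half-integrality this equals the true LP optimum."""
    best, best_x = None, None
    for x in product((Fraction(0), HALF, Fraction(1)), repeat=n):
        if all(x[u] + x[v] >= 1 for u, v in edges):
            s = sum(x)
            if best is None or s < best:
                best, best_x = s, x
    return best, best_x

def round_half(x):
    """Round x_v >= 1/2 up to 1: a vertex cover of size at most twice the
    LP value, matching the integrality gap of 2 for this relaxation."""
    return [v for v, xv in enumerate(x) if xv >= HALF]
```

On the triangle, the LP optimum is 3/2 (all variables 1/2), the rounded cover has size 3 = 2 * (3/2), and the integral optimum is 2: the LP value certifies the estimate, and the rounding step is the piece whose hardness, in the talk's framework, can be as hard as any TFNP problem.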
For example, you can ask that the variables be 0/1 variables, that all coefficients are small, that the objective function is always non-negative, and that the integer program is always feasible — so the question is only finding the optimal solution, and it is always easy to find some feasible solution. You can make a list of properties. We made some list of our own, including these properties and maybe some more, and there is no problem in modifying the reduction so that all these properties hold. So if you would like to define what a natural integer program is, I would say we can meet your definition. Here is another property that maybe is not so obvious, but you would like to have it. How does the proof of the main theorem go? You start with alpha, beta, and a TFNP instance, and you derive from them an integer program. But really we would like to think of it as producing a class of integer programs, so that when we look at an integer program, it is easy to say whether it is in this class or not. For example, the fact that this is an integer program derived by choosing a particular alpha and beta — and we know what alpha was, what beta was, and everything — would be encoded somewhere in the integer program. And we would also like the class to have closure properties: we want the class to be closed under certain operations. If you rename variables in an integer program that was in the class, it remains in the class; if you reorder the constraints, it still remains in the class; and so on. You can do everything in such a way that this class is well defined, so when you look at an integer program I can tell you, yes, it is from this class.
And then once I know that it is in this class, I know that the LP relaxation of this integer program gives a beta approximation to the optimal value of the integer program. So in this respect it is also a nice class of integer programs: you can easily tell whether you are in the class or not. Now, as an aside: as I said, there are no known complete problems for TFNP, but we may think of the problem of LP rounding as a sort of complete problem for TFNP. It is not really a complete problem, because it is a family of different problems. One member of this family would be some class of integer programs, and another would be another class of integer programs. Each class would have the property that the LP relaxations have integrality gaps no worse than beta, and moreover, you would be able to tell, by looking at members of the class, whether they belong to the class or not. For such a class, define the following problem: an instance is an integer program in the class, and a feasible solution is a feasible solution to the integer program of value no worse than beta times the value of the LP relaxation. We know that such a solution must exist, because the integrality gap is no worse than beta, so this puts the problem in TFNP. And also, as we have seen, every TFNP problem can be reduced to a problem of this form. So in a sense, the problem of LP rounding — doing a rounding that matches the integrality gap — captures TFNP. It is not exactly the same as a complete problem, because it is really a family of problems, but morally it captures it. So to summarize: the point was to show that you can design combinatorial optimization problems that look natural — okay, you have to define what natural is —
for which estimation is easier than approximation. You can do that under the assumption that TFNP does not have polynomial-time algorithms, and this assumption is necessary: if it does not hold, you cannot do it. Another interesting thing to note is that so far, NP-hardness has been very successful in explaining why we cannot get better approximation ratios for many problems. And really what we are saying here is that for estimation problems — maybe all of them, we don't know — the best estimation ratio can perhaps be explained by NP-hardness: achieving a better estimation ratio is NP-hard. But for approximation ratios, unlike estimation ratios, we should expect that some of them will not be explained by NP-hardness but, since they can encode this TFNP class, by various other reasons — and there may be several different reasons why we don't get better approximation ratios, different reasons for different problems, because TFNP has different subclasses. So I'll end here. [applause] Mohit Singh: Questions? >>: So you said [indiscernible] if I give you an integer program C, you can tell me if it came from a reduction? Is that correct? Uriel Feige: Not quite. I can design the reduction in such a way that it gives you a class of integer programs, and for this particular class that I build using my reductions, I can tell you for every integer program whether it is in the class or not. But it doesn't mean that some other reduction — you can have your own reduction to an integer program — won't produce something outside my class. Yes? >>: How hard is it to approximate [indiscernible] in planar graphs? Uriel Feige: In what sense? The result is algorithmic. >>: What do you mean it's algorithmic? Uriel Feige: The proof is algorithmic. >>: The proof that there exists [indiscernible]. Uriel Feige: Yeah, the same proof shows that there exists an algorithm; think of it this way. >>: [Indiscernible].
Uriel Feige: No, it's polynomial time. It has a lot of cases, but that's just a constant. >>: [Indiscernible] reduced to 500 configurations. For each one [indiscernible]. >>: 500 graphs? >>: If I give you a planar graph, there's no cases -- Uriel Feige: So the point is, yes, it's algorithmic, but writing down the algorithm is very complicated. Once you write it down you can actually run it; I'm not sure that people actually do. But there is some discrepancy between what complexity theory says an algorithm is and what one would intuitively think of as a good algorithm — say, huge constants [indiscernible]. >>: [Indiscernible]. >>: [Indiscernible]. >>: You can check for each configuration whether a graph has it, [indiscernible]. >>: [Indiscernible]. >>: There are problems which are not constant, like those coming from the Robertson and Seymour theory; there you have to design something in the cases that are constant. >>: [Indiscernible]. Uriel Feige: So there are papers that explicitly write down things like the algorithm that you get from that [indiscernible]. >>: And what does it say about configuration LPs for algorithms? Uriel Feige: So our examples are additional examples, not described here, for which we know things about the integrality gap and we don't have a rounding technique that matches those approximation ratios, and some of them would be through the configuration [indiscernible]. >>: [Indiscernible]. Uriel Feige: No. >>: Do you believe that for problems like scheduling, there's actually a difference [indiscernible]? Uriel Feige: So there could be. It could be that the estimation ratio just happens to be 3.1, say, but somehow no algorithm actually ->>: [Indiscernible]. Uriel Feige: No, I mean in the sense of what we currently know. [Indiscernible] I don't know if it's the same approximation ratio or different, so I'm not sure. >>: [Indiscernible] Uriel Feige: For which problem there would be a gap?
I feel more comfortable if I design the problem myself. [Laughter] Mohit Singh: Any more questions? Okay. Let's thank our guest. [Applause]