>> Nikhil Devanur Rangarajan: So it's my pleasure to introduce Balasubramanian Sivan. I call
him Balu for short. Balu is a PhD student at University of Wisconsin-Madison advised by Shuchi
Chawla. Balu has been an intern here; during his internship, he worked on the so-called [inaudible] algorithm, which has had a big impact. So over to Balu.
>> Balasubramanian Sivan: Thanks, Nikhil. Thanks for the invitation. I'm very happy to be
back. So I'll be talking about optimization with uncertain inputs in this talk. And in particular,
I'll be focusing on two kinds of uncertain inputs. One is online inputs, where the input comes piece by piece and the algorithm has to make a decision as soon as one piece arrives; it cannot wait for the rest of the input. So optimization subject to an uncertain future is the challenge there. The second is mechanism design, where the input is distributed across several selfish participants, each of whom may have their own well-defined goals, which often conflict with the optimization goal of the algorithm designer, and the challenge is to do optimization respecting the incentives of the [inaudible].
So I'll begin by asking how we formally model and analyze these problems in theory. There are several approaches to doing this; two main approaches have gained currency in the literature. One is competitive analysis, where the algorithm you design faces the input uncertainty, but the benchmark against which you compare yourself is omniscient: it knows the entire input to begin with. The performance metric is what's called the competitive ratio, which is the worst case over all inputs of the ratio of the performance of the algorithm to the performance of the benchmark. As you can see there, the OPT carries a subscript I, which means it is instance-wise optimal, whereas the algorithm is the same for every instance. So it's an explicitly more powerful benchmark than your algorithm, and for this reason, being such a robust benchmark, any positive result in competitive analysis is great. A good example of a robust [inaudible] is the celebrated VCG mechanism to maximize social welfare. For the same reason that it is such a robust benchmark, it often leads to, basically, [inaudible], and we'll see this in the two examples that I'm going to talk about.
Now, a frequent alternative, in particular to step around these [inaudible] in competitive analysis, is to perform stochastic analysis, where the idea is that the input is drawn from a known distribution. The algorithm knows the distribution and tries to optimize with respect to that distribution. The benchmark against which you compare is the expected optimal for the same distribution, and the performance metric is basically the ratio of the expected performance of the algorithm to that of the benchmark. As you can see, both of them have a subscript F, which means the OPT is not explicitly more powerful, and you can shoot for a one approximation. Because of this, there are several success stories in stochastic analysis. I'll just give one example: Myerson's revenue-optimal mechanism is a great example of stochastic analysis. You know the distribution.
But the biggest criticism of stochastic analysis is that you need to know the exact distribution in order to perform this optimization. Often you only have noisy data about the distribution, and that could render your algorithm really suboptimal if there is noise. So given these two extremes, a possible middle ground would be to say that the input may be drawn from some distribution, but I, the algorithm designer, do not know the exact distribution; I only know a possibly huge universe of distributions from which the input arrives. What I'm asking for is a single algorithm which, for every distribution in this universe, performs approximately as well as the optimal algorithm tailored specifically to that distribution. Okay. So basically we are being robust over the whole universe of distributions: even if you don't know the exact distribution, you can use the same algorithm for all the distributions. It is in this sense that the algorithms are prior-robust, because you're blind to the exact prior distribution on inputs. As you can see there, the OPT has a subscript F, which means it's distribution-wise optimal, tailored to the distribution, while the algorithm is the same for all the distributions.
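For concreteness, here is one way to write the three performance metrics just described, with ALG the algorithm, I an input instance, and F a distribution from a universe U (the notation is mine, purely illustrative):

```latex
% Competitive ratio: worst case over instances, against the instance-wise optimum.
\[
\min_{I} \frac{\mathrm{ALG}(I)}{\mathrm{OPT}_I(I)}
\]
% Stochastic ratio: both algorithm and benchmark are tailored to the known F.
\[
\frac{\mathbb{E}_{I \sim F}[\mathrm{ALG}_F(I)]}{\mathbb{E}_{I \sim F}[\mathrm{OPT}_F(I)]}
\]
% Prior-robust ratio: one algorithm, measured against the F-tailored optimum,
% for every distribution F in the universe U.
\[
\min_{F \in \mathcal{U}} \frac{\mathbb{E}_{I \sim F}[\mathrm{ALG}(I)]}{\mathbb{E}_{I \sim F}[\mathrm{OPT}_F(I)]}
\]
```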
>>: Does the algorithm see the whole [inaudible] at once? Or-
>> Balasubramanian Sivan: It depends. For online input, it's basically one by one; in mechanism design you have to elicit the inputs, and there, typically, you ask all the people at once.
Actually, for both examples I'm going to give in this talk, I use an even stronger metric: the OPT is not just distribution-wise optimal, it is instance-wise optimal. So we are even closer to competitive analysis; just the presence of the distribution allows us to step around the negative results of competitive analysis. Now, this can be pushed to both extremes. If you shrink the universe to size one, that is exactly stochastic analysis. If you allow the universe to include arbitrary distributions, every distribution possible, then that is competitive analysis; and the goal in prior-robust optimization is to develop algorithms which can handle as large a universe as possible. But typically the universe has some structure; it is not an arbitrary bag of distributions. For example, all possible product distributions over inputs, and things like that.
The question is: do we have any nontrivial prior-robust algorithms? There are some nice examples: for example, the classic mechanism design result of Bulow and Klemperer and its generalizations, later extended by Yan; and, on the algorithmic side, the result due to Devanur and Hayes. I'll talk more about this later. But these are examples of prior-robust optimization.
>>: So when you talk about optimization [inaudible] errors [inaudible] Or are you talking about
the performance of errors?
>> Balasubramanian Sivan: I'm talking about the performance of the algorithm. There is an
objective function which you have, and I'm asking how close you can get to that objective
function as compared to the optimal value of the objective function.
>>: How do you quantify it? Is performance [inaudible] computing speed?
>> Balasubramanian Sivan: I mean, it's not a question of runtime. You want it to be polynomial, but by performance I mean the [inaudible] objective function. Okay. So that is the prior-robust model. The plan for this talk is to present prior-robust algorithms for two [inaudible] problems, and the take-away message is that several interesting problems lend themselves to this kind of prior-robust algorithm, and you should look for them and design them. So in part one of the talk, I'll present a prior-robust algorithm for a resource allocation problem, and in part two, I'll do mechanism design: I'll present a prior-robust truthful machine scheduling mechanism. I'll conclude with some future research directions.
So part one is based on a couple of joint works: one with Nikhil Devanur, Kamal Jain, and Chris Wilkens, the other with Nikhil Devanur and Yossi Azar. The formal resource allocation framework I'll present in a bit, but this framework is very [inaudible] and captures a lot of problems motivated by internet advertising. I'll quickly go through a couple of examples. One is the display ads example: if you visit any website, any proper website, you'll see at least one advertisement like that. At a very high level, you can think of display ad serving as proceeding in four phases. In phase one, the advertiser, in this case the University of Phoenix, signs a contract with the publisher, MSN, saying I want so many impressions in this period, shown to people [inaudible], and so on. Now MSN signs contracts like this with several advertisers, and once those contracts are signed, user page views arrive, and MSN has to decide which advertisement to put in the vacant spot. Then you deliver these ads along with the content. So this third phase is basically a resource allocation phase: you have to decide which advertisement to put in the vacant spot. The next example is something you've all seen a ton of times: if you search for a query, apart from the organic results that you get, you also see some paid search results. This can also be thought of, at a very high level, as proceeding in four phases, except that there are no contracts here. Advertisers submit bids and budgets, then user queries arrive, the search engine has to decide which advertiser to match each query to, and again phase three is a resource allocation phase.
So now, the formal model of this phase three. The model we are going to talk about is due to MSVV; it's basically an online model of repeated auctions. There are no incentives here; it's purely algorithmic. There's a bunch of advertisers. Each specifies a budget; these budgets are the maximum amount you can extract from them over a day or some time period. They also specify bids on various queries, and as soon as a query arrives, the search engine has to decide whom to allocate it to; the allocated bid is charged against that advertiser's budget. And you keep proceeding like this. So the formal model is that you have n advertisers, advertiser i has budget B_i, and queries arrive as follows. There is this huge bipartite graph; the right-hand side of the graph is all possible queries, and at every step you draw a query independently and identically according to an unknown distribution. So the universe of distributions here is basically the universe of all possible i.i.d. distributions over queries. I, the algorithm designer, do not know the distribution. I'm going to call p_j the probability with which query j is drawn; I do not know p_j. And b_{i,j} is the bid of advertiser i for query j. Your goal is to design an algorithm which maximizes the revenue, which is the sum total of all the allocated bids, respecting the budget constraints. So that's the model. Is the model clear, how the queries are drawn?
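To make the interface concrete, here is a minimal Python sketch of this online model together with a naive greedy baseline (all names are illustrative; this is not the algorithm from the talk):

```python
import random

class GreedyAdwords:
    """Naive online baseline: give each query to the highest bidder with budget left.
    (Purely illustrative; this is not the algorithm from the talk.)"""
    def __init__(self, bids, budgets):
        self.bids = bids              # bids[i][j]: bid of advertiser i on query j
        self.budgets = budgets        # budgets[i]: budget B_i of advertiser i
        self.spent = [0.0] * len(budgets)

    def allocate(self, j):
        """Must decide immediately, knowing only the queries seen so far."""
        return max(range(len(self.budgets)),
                   key=lambda i: min(self.bids[i][j], self.budgets[i] - self.spent[i]))

def run_online(alg, query_probs, m, rng=random):
    """Draw m queries i.i.d. from a distribution the algorithm never sees."""
    revenue = 0.0
    for _ in range(m):
        j = rng.choices(range(len(query_probs)), weights=query_probs)[0]
        i = alg.allocate(j)
        pay = min(alg.bids[i][j], alg.budgets[i] - alg.spent[i])  # never exceed B_i
        alg.spent[i] += pay
        revenue += pay
    return revenue
```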
So it turns out that all the results for this problem are parameterized by a significant parameter called the maximum bid-to-budget ratio. It's not difficult to see why this is the case, and I'll illustrate it through a simple example. Consider a plain matching problem: all the bids are one, all the budgets are one. You have just two advertisers and three queries, and say two queries are going to be drawn online from the probability distribution shown. Suppose query one arrives first, or query two arrives first; irrespective of what you do, there is a constant probability that the second query cannot be allotted to anybody, right? You have already used up this guy's budget. So you get a constant-factor gap between the optimal online and offline revenue. The point is that if, instead of the bids being one, they were one over a thousand, then each mistake you make is not nearly as costly: with bids equal to one, a single mistake wastes a whole budget, whereas with tiny bids a single misallocated bid is a small mistake. So obviously, with a smaller and smaller bid-to-budget ratio, you can get better and better approximations. So all the approximation ratios here have to be parameterized by this bid-to-budget ratio, and I'm going to call it gamma hereafter. Okay?
So I'll present what is known. MSVV, who introduced this problem, studied it in the competitive setting. What they show is that you can get a one minus one over e approximation in the limit where the bid-to-budget ratio goes to zero, right? So this is the best case you can ask for: the bids are infinitesimally small compared to the budgets. Even then, you can only get one minus one over e; no randomized algorithm can go beyond one minus one over e. Okay. So that is the competitive setting. In a recent result, concurrent with our work, [inaudible] show that in the stochastic setting, which means you know the distribution, you can get a one minus square root gamma approximation, where gamma is the bid-to-budget ratio. What this means is that if gamma goes to zero, you can get arbitrarily close to one, circumventing the bound that you cannot go beyond one minus one over e. The question we asked in this work is: can you get the same one minus square root gamma approximation through a prior-robust algorithm, which means you use the same algorithm irrespective of what the distribution is?
Okay. So here are our results. This is the same i.i.d. model. We don't ask for the full distribution, but we do ask for n parameters of the distribution; I'll explain what these n parameters are later. The whole distribution itself could have infinite support, but we only ask for a few parameters of it, and given those, we get the same one minus square root gamma approximation. We also show that you cannot go beyond one minus square root gamma, even if you knew the distribution.
>>: [inaudible]?
>> Balasubramanian Sivan: n doesn't appear in the approximations at all.
>>: So whatever n is-
>> Balasubramanian Sivan: Oh, n is the number of advertisers, yes, and m is the number of queries, but the approximation ratio is independent of n. Yeah. So that's the result. Now, this same problem has been studied in the prior-robust model with a slightly different [inaudible] than the i.i.d. universe of distributions, and in order to put our results in context, I'll briefly go through that model before giving the proof of this result. The model is the same as this one, except that instead of queries being drawn i.i.d., an adversary initially picks the set of m queries to arrive; after that, these queries are presented in a uniformly random order. Now in i.i.d., if you condition on the set of queries that arrive, they arrive in uniformly random order, correct? So you can think of the i.i.d. model as a distribution over random permutation instances, because in i.i.d. nobody is conditioning on which queries arrive. So i.i.d. is a distribution over random permutations, and for that reason, any approximation ratio that holds for the random permutation model also holds for the i.i.d. model, but the reverse is not known. And there's no separation [inaudible] for this. The distributions are unknown, so basically the unknown i.i.d. model is really close to the random permutation model, though one of them is stronger than the other.
>>: [inaudible]? The queries, they must be [inaudible]?
>> Balasubramanian Sivan: It could be possible that there are duplications. So here is what I already showed. For the random permutation model, [inaudible] was actually the first to show that you could get arbitrarily close to one; then there is the result by Devanur and Hayes, and this is the dependence on n and m they show, and after this result there are several other results with different dependences. Basically, the point is that all of these depend on n and other parameters. For the i.i.d. model we completely get rid of the [inaudible] and get this one minus square root gamma approximation, and it's an open question whether our algorithms actually extend to the random permutation model also. The first question is whether random permutations even permit this kind of approximation without any dependence on n, and then you could ask whether this algorithm itself extends to random permutations.
>>: So there are no [inaudible] and lower bounds for the permutation model?
>> Balasubramanian Sivan: There is no known lower bound except for this [inaudible], which is for i.i.d. So that is the comparison between i.i.d. and random permutations. I'll now prove this result, the one minus square root gamma. Here I make some [inaudible] assumptions just for the purposes of this talk. First, the bids are binary: zero or one. Second, the distribution is the uniform distribution over queries, so every query arrives with the same probability 1/m, where m is the total number of queries. For the third assumption, I'll need a quick definition. I'm going to define what an expected instance is. For every distribution, this is a single offline instance in which everything happens as per expectation, okay? You have a query j which arrives with probability p_j in the online model, which means the expected number of times it arrives is m times p_j, right? So in this offline instance, the expected instance, you have exactly m·p_j units of query j. So for every query, you have the expected number of units of that query in this offline instance. For this example, m times p_j is exactly one, so you have one unit of every query in the support of the distribution. That is a single offline instance, and you cannot compute it if you do not know the distribution, right? If you know the distribution, that is the offline instance, and the third assumption is that you have a perfect matching in the expected instance: the optimal solution to the expected instance is a perfect B-matching, which means that every advertiser's budget is fully consumed. I'll examine all these assumptions later, but for now, let's say that every advertiser's full budget is consumed.
Okay. So given these assumptions, what can you do? If you knew the distribution, you could run the following algorithm. I'm calling it the hypothetical algorithm, because you do not know the distribution. Here's what the algorithm does. It first computes the expected instance, for which you need the distribution and the number of queries that will arrive. It finds the optimal matching for the expected instance, tabulates it as a table saying which query goes to which advertiser, stores this table, and uses that solution for the online problem. What does that mean? Say a sequence of queries arrives like this: when query three arrives, the algorithm goes to the table, sees it should be given to advertiser seven, and gives it there. When query one arrives, you do the same thing: go to the table and see. When query three arrives again, the table says advertiser seven, so it is given to advertiser seven, even if by this time advertiser seven's budget is exhausted. It could be that seven's budget is exhausted, but that is why it is an oblivious algorithm; it doesn't depend on what has happened earlier. It is a waste of money to give it to seven, but I'll still give it to seven, okay? It's a simple non-adaptive algorithm.
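A rough Python sketch of that oblivious, table-driven rule for the simplified setting (binary bids, one unit of each query in the expected instance); the helper names are mine, and computing the optimal B-matching of the expected instance is left abstract:

```python
def hypothetical_run(table, query_sequence, budgets):
    """Oblivious rule: always follow the precomputed table, even if the chosen
    advertiser's budget is already exhausted (such wasted allocations earn nothing).

    table[j] is the advertiser that the optimal B-matching of the expected instance
    assigns (the one unit of) query j to; computing that matching offline is the
    only step that needs the distribution. Bids are all 1 in this special case."""
    spent = {i: 0 for i in budgets}
    revenue = 0
    for j in query_sequence:
        i = table[j]                       # ignore history entirely
        if spent[i] < budgets[i]:
            spent[i] += 1
            revenue += 1
    return revenue
```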
>>: [inaudible]?
>> Balasubramanian Sivan: I mean, there is no feasibility issue. You can keep on giving queries to the advertiser, but he won't pay beyond B_i; you won't get the money. So it's a waste after B_i, that's all.
>>: Only a theorist would consider that feasible.
>>: [inaudible].
>> Balasubramanian Sivan: The claim is that this simple non-adaptive algorithm actually gets a one minus square root gamma approximation; for the assumptions I made, gamma is just bid over budget, and the bids are all one, so gamma is one over the minimum budget, right? So I'm shooting for a one minus one over square root of the minimum budget approximation. Okay. Here is the proof. Let's analyze this step by step. In any given step, what is the probability that advertiser i gets a query? Well, I said that in the expected instance you have full budget consumption, which means advertiser i had B_i queries going to him. So if any of those B_i queries arrives, then advertiser i is given the query. As there are m queries in total, the probability that advertiser i gets a query in a given step is B_i/m. So this algorithm is basically a balls-and-bins process: at every step it throws a ball to advertiser i with probability B_i/m. And what is the revenue at the end of all the steps? Because I've assumed binary bids, the revenue from advertiser i is basically the number of queries matched to him, truncated at B_i; as you asked, there is no use in giving anything beyond B_i. And this is a truncated binomial sum. We know that a truncated binomial sum has a square root loss. You sum this over all the advertisers, and that's the total revenue. So if you know the distribution, you can run this simple algorithm.
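To see the square-root loss concretely, here is a small self-contained Python computation of the truncated binomial expectation E[min(Binomial(m, B_i/m), B_i)], which is the expected revenue from advertiser i under the oblivious rule (a numerical illustration, not part of the formal proof):

```python
import math

def truncated_binomial_mean(m, p, cap):
    """E[min(X, cap)] where X ~ Binomial(m, p), computed exactly."""
    total, prob = 0.0, (1 - p) ** m      # prob starts at P[X = 0]
    for k in range(m + 1):
        total += min(k, cap) * prob
        if k < m:                        # advance P[X = k] -> P[X = k + 1]
            prob *= (m - k) / (k + 1) * p / (1 - p)
    return total

m, B_i = 10000, 100                      # advertiser with budget B_i, bids of 1
expected_revenue = truncated_binomial_mean(m, B_i / m, B_i)
print(B_i - expected_revenue, math.sqrt(B_i))
# The expected shortfall from the full budget B_i is on the order of sqrt(B_i),
# i.e. the oblivious rule recovers roughly a (1 - 1/sqrt(B_i)) fraction of B_i.
```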
I want to point out two things about this algorithm. One is that its computational burden increases with the size of the support: to compute the expected instance you need to know the full support, and as the support grows larger and larger the computational burden grows with it; if the support is infinite, this is rendered infeasible. Secondly, this could be a randomized algorithm, because while the table you get for the example I showed is integral, in general it could be fractional.
>>: So I'm a bit confused about something. This ratio you're computing is the ratio of what the algorithm achieves to what, exactly?
>> Balasubramanian Sivan: What the algorithm achieves to what you can achieve through an
off-line fractional solution. Optimal off-line fractional solution. So as you-
>>: And that's essentially times this thing that your computer [inaudible]?
>> Balasubramanian Sivan: Yes. Exactly.
>>: Expectation of the optimal-
>> Balasubramanian Sivan: It's basically the OPT of the expected instance, which is larger than the expectation of OPT, because the expectation of OPT would be feasible [inaudible] here, so the OPT of the expected instance can only be larger. Okay. That is why I said at the beginning that the benchmark is [inaudible] an even stronger metric: it is instance-wise optimal, since we compare against offline optimal solutions. Okay. So this is what you do if you know the distribution. When you do not know the distribution, what do you do? Consider running the following hybrid algorithm, which is a hybrid of two algorithms. One is the hypothetical algorithm, which I'll call B, that I just presented. The other is an algorithm A, which is going to be inductively defined now. H_i is the hybrid of A and B, which runs A for the first i steps and B for the remaining m minus i steps; A followed by B.
So assume that A has been defined for the first i steps; I'm now going to define what it does at step i plus one. Once the query arrives, it picks the following advertiser: the advertiser which maximizes the current step's revenue plus the expected residual revenue you would get by running the hypothetical algorithm for the remaining m minus i minus one steps. So you look ahead and decide the best thing to do now. And by definition, the hybrid H_{i+1} gets at least as much revenue as the hybrid H_i, because H_{i+1} and H_i differ only at step i plus one: there H_{i+1} runs algorithm A, which looks ahead and makes the best decision, whereas H_i runs the hypothetical algorithm, which just looks at the table and does something. Clearly, looking ahead and optimizing is at least as good as looking at the table and doing something.
>>: [inaudible] hypothetical algorithm, it’s basically that you know the distribution [inaudible]?
>> Balasubramanian Sivan: I mean, I want to show that you can do this without knowing the distribution. But for now, suppose you could do this; suppose you could look ahead and make this decision. Then I'm saying H_{i+1} is better than H_i; just step i plus one is different, but by definition it is better. Now you can slowly change this step by step, just like a hybrid argument for distinguishers in cryptography. At the end, the point is that running algorithm A all the way is better than running algorithm B all the way, by definition. And I already showed that B gets only this square root loss, which is what we wanted, so A also gets it. The question is whether you can actually run this algorithm A, because I said you have to look ahead into the future and compute this expected residual revenue. The key point is that implementing the hypothetical algorithm requires knowledge of the distribution, but just estimating its revenue doesn't, and all we need is to estimate the revenue. Why is that? Because, as I said, the hypothetical algorithm is a simple balls-and-bins process which throws a query to advertiser i with probability B_i/m. I know the budgets B_i, I know the number of queries m, so at any given point in time, given the number of remaining queries, I can compute the residual revenue from just these numbers: it is a sum of truncated binomial expectations, one for each i. So I can estimate this exactly, and the upshot is that you can run the hybrid algorithm without knowledge of the distribution. All I ask for is these numbers.
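Here is a rough Python sketch of that residual-revenue estimate and the resulting look-ahead rule, under the binary-bid assumptions above (function names are mine; the truncated binomial expectation is the same quantity computed in the earlier snippet):

```python
def truncated_binomial_mean(steps, p, cap):
    """E[min(X, cap)] where X ~ Binomial(steps, p); same quantity as before."""
    total, prob = 0.0, (1 - p) ** steps
    for k in range(steps + 1):
        total += min(k, cap) * prob
        if k < steps:
            prob *= (steps - k) / (k + 1) * p / (1 - p)
    return total

def residual_revenue(remaining_steps, budgets, spent, m):
    """Expected revenue of the hypothetical algorithm B over the remaining steps:
    it sends each remaining query to advertiser i with probability B_i/m, and only
    the unspent part of the budget can still earn money. Needs only the B_i's and m."""
    return sum(truncated_binomial_mean(remaining_steps, budgets[i] / m,
                                       budgets[i] - spent[i])
               for i in budgets)

def lookahead_choice(j, bids, budgets, spent, remaining_steps, m):
    """Algorithm A's rule for the current query j: pick the advertiser maximizing
    this step's revenue plus the estimated residual revenue of running B afterwards."""
    best_i, best_val = None, float("-inf")
    for i in budgets:
        gain = min(bids[i][j], budgets[i] - spent[i])        # revenue earned right now
        trial = dict(spent)
        trial[i] += gain
        value = gain + residual_revenue(remaining_steps, budgets, trial, m)
        if value > best_val:
            best_i, best_val = i, value
    return best_i
```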
One point is that it is not enough to recognize that B can be interpreted as a balls-and-bins process with probability B_i/m. Even after knowing that, you cannot implement it, because you're not given a bunch of balls and asked to play a balls-and-bins process here: as a query arrives, you have to decide whom to assign it to. So you can't run this algorithm B, but because you can estimate the revenue of B, you can [inaudible] some other algorithm which does as well as B. So that's the point. I will wrap up this analysis by examining some assumptions I made. One is that I said it's the uniform distribution. What if it is not? This turns out to be a rather mild assumption. The expected instance will now have a fractional optimal solution instead of an integral optimal solution. That is fine; it only changes what the hypothetical algorithm is, which may now be a randomized algorithm, but it doesn't change anything about how the algorithm itself is defined inductively.
>>: So in a sense, all these algorithms really depend on the [inaudible] distribution, right? I mean, if the distribution were very easy, say, had very low [inaudible], could you learn it [inaudible]?
>> Balasubramanian Sivan: Yeah. So the algorithm I presented doesn't depend on the complexity; it doesn't depend on-
>>: So you could imagine that for certain distributions you can do better than [inaudible]?
>> Balasubramanian Sivan: No, no. Even if you know the distribution, you can't go beyond
[inaudible] because I'm comparing it with the off-line optimal solution. It's not the online
optimal.
>>: I guess what I'm asking is whether the off-line solution, is it conceivable that off-line
solution can do better than the square root for some distributions?
>> Balasubramanian Sivan: What does it mean, you say off-line will do better than square root?
>>: You have, suppose you know the distribution.
>> Balasubramanian Sivan: Yes.
>>: It's, so is the square root, the one minus the square root-
>> Balasubramanian Sivan: Oh yeah. It doesn't hold for every distribution, yes.
>>: [inaudible].
>> Balasubramanian Sivan: Yeah. You could; it's only for the [inaudible] distributions that you can't go beyond [inaudible]. Okay. The second assumption I want to examine is that the bids are binary, and this is a more serious assumption. What happens if they are not binary? Well, they could be anywhere between zero and one. You can't interpret this hypothetical algorithm as a balls-and-bins process anymore, because you are throwing balls of different sizes; it's like a fractional balls-and-bins process. This complicates the hybrid argument, and you basically have to introduce a third algorithm F and do a two-level hybrid argument to give the proof, which I'm not going to go through. But you can get the same approximation for arbitrary bids. [inaudible].
The third assumption is that I said budgets are fully consumed in the expected instance. What if you don't get full budget consumption? You consume some amount C_i, which is less than B_i; the only difference is that the balls-and-bins probability is now C_i/m instead of B_i/m. Okay? You do not know the C_i's, and this is what our algorithm asks for: these are the n parameters. If you give me the optimal consumption in the expected instance for every advertiser, the C_i's, then I can use them to estimate the residual revenue and run the hybrid algorithm. So those are the n parameters we need, and if you give them to me, I will deliver what I promised. And here is an interesting open question that is immediate from this. What if you simply make the wrong assumption, just say that C_i is equal to B_i, and then run your algorithm? Does it perform well? We are able to prove that for some special cases this already does well. If you could prove that it does well in all cases, then you would have completely eliminated all dependence on the distribution; you would be doing as well as the [inaudible] case. I think it would be great to prove that.
>>: So I have a question. If you [inaudible], you wouldn't expect to actually know the C_i's, right? But maybe you could estimate them.
>> Balasubramanian Sivan: Sure, you can estimate a stand-in for the C_i's.
>>: But only if you know the C_i's within epsilon-
>> Balasubramanian Sivan: Yes, yes. If you know the C_i's within epsilon, these results will also be within epsilon. Yes. Okay. So that is the immediate open question from this.
Now I'm going to generalize this to a much more general resource allocation framework, which I said I'd present. This basically captures all the special cases I mentioned, online [inaudible] and so on. The model is that you have m requests, like the m queries, and n resources, like the n advertisers; resource i has capacity B_i. The major difference is that now every request, instead of consuming from just one resource, can simultaneously consume from all the resources available. I'm going to introduce a third index, the option used to serve a given request: if you serve request j with option k, you get a profit of w_{j,k}, and it consumes some amount of every resource; it consumes b_{i,j,k} of resource i. To have a concrete example in mind, think of a graph where each request asks to route from some source to some sink in the graph; the resources are the edge capacities in the graph, and the option is which path you choose from the source to the sink. So you have an exponential number of paths to choose from, the number of options is exponential in n, and obviously different paths consume different edges. That is the consumption information: in general, option k for request j consumes b_{i,j,k} of resource i. So this is the general resource allocation framework. And here are our results. Again, the requests are drawn i.i.d. We do not know the distribution in this case; it's a completely unknown distribution, and we don't even ask for any n parameters. In other words, you could use this algorithm for the previous problem I described. The only thing is that the results are slightly worse here: you have a dependence on n, unlike the one minus square root gamma I presented earlier; the approximation is one minus square root of gamma log n. We also show a matching lower bound, which says that you cannot do better than one minus square root of gamma log n for this general framework with completely unknown distributions. Also, these results hold with high probability, not just in expectation; they hold with probability one minus square root of gamma log n.
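For concreteness, the offline "expected instance" of this general framework can be written as the following linear program, where x_{j,k} is the fraction of request j served with option k (my notation, a sketch consistent with the quantities defined above):

```latex
\[
\max \; m \sum_{j,k} p_j \, w_{j,k} \, x_{j,k}
\quad \text{s.t.} \quad
m \sum_{j,k} p_j \, b_{i,j,k} \, x_{j,k} \le B_i \;\;\forall i,
\qquad
\sum_{k} x_{j,k} \le 1 \;\;\forall j,
\qquad
x_{j,k} \ge 0.
\]
```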
The high-level idea is to use a hybrid argument again, but the algorithm is fundamentally different. We use the hybrid argument not on the expected revenue, as I presented before, but on the exponential [inaudible] functions used in [inaudible] bounds. To compare with what's known, the previous best was of the form one minus square root of gamma n log(Kn), and K is typically exponential in n, like the exponential number of paths in the routing example. So if you plug that in, there is really an extra factor of n sitting inside the square root, and n is typically large in these examples. But that result is for random permutations, which, as I explained, is slightly stronger than the unknown i.i.d. model, [inaudible] so the question here is whether this algorithm extends to the random permutation model with the same approximation factor.
So that's the summary of part one. For the special case of adwords, we get matching upper and lower bounds, but we need the n parameters of the distribution. For the more general resource allocation framework, we again get matching upper and lower bounds, but with completely unknown distributions. This work was done while I was an intern here in 2010 with Nikhil, and thanks to Nikhil, we were able to speak to the product groups, and they basically use this algorithm now for MSN's display ad serving engine, for pretty much the exact display ads problem I described. It has been globally operational since the summer of 2011. These are the open questions, which I already presented, but I'll repeat them. One is to see whether the i.i.d. and random permutation models have any separation; no separation is known. My own guess is that [inaudible] our algorithms actually work for random permutations, but we don't know. The other is removing the dependence on the n parameters, the C_i's. So that's it for part one. If you have questions, you can ask them about part one now; then I'll move on to mechanism design.
So I'll now talk about a prior-robust truthful machine scheduling mechanism. This is joint work with Shuchi Chawla, Jason Hartline and David Malec. The problem is the very well-studied makespan minimization problem in computer science: you have n jobs and m machines, the n jobs are different and so are the m machines, and your goal is to find a schedule of jobs on machines that minimizes the completion time of the last job. So if you have these jobs and machines, machine two is basically the makespan-defining machine for this schedule. And this is called an unrelated machines instance, because the runtimes of any given job on different machines are completely unrelated. This is the problem we are studying, with the twist that we are going to study it in a strategic setting. The problem was introduced in the seminal paper on algorithmic mechanism design by Nisan and Ronen. They say that these machines are operated by selfish agents, and the runtimes of jobs on machines are privately held by the machines; we do not know these runtimes to begin with. What we ask for is a mechanism, which is not just a schedule of jobs on machines but, along with that, some payments which you transfer to these machines to incentivize them to truthfully report their runtimes. This pair of schedule plus payments is the mechanism, and the machines have their own selfish objective, which is to maximize the payment they receive from you minus the work they are forced to do, which is the sum total of the runtimes of all the jobs they're asked to run.
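In symbols (my notation, for concreteness), if machine i is paid P_i and is assigned the set S_i of jobs, with t_{i,j} the true runtime of job j on machine i, its selfish objective is the utility

```latex
\[
u_i \;=\; P_i \;-\; \sum_{j \in S_i} t_{i,j}.
\]
```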
The solution concept I ask for is dominant-strategy truthfulness, which means a mechanism in which, irrespective of how the other machines behave, each machine is incentivized to truthfully report its runtimes. This is the strongest notion of truthfulness you could ask for in mechanisms. Okay? So what's the motivation to study makespan in a strategic setting? There are several. One is that it is a central computer science problem, specifically in the context of resource allocation. Even from an economic point of view, you could think of makespan as enforcing some kind of [inaudible]; after all, makespan is about load balancing between machines, so it's like min-max fairness. What makes it really interesting to me is that it is a nonlinear objective, unlike the other traditionally studied objectives in mechanism design, like revenue and welfare, which are all linear, and for this reason, what is possible for makespan is very different from welfare or revenue. Here are some differences, for example. For the social welfare objective, which is the most well-studied objective in mechanism design, you can get truthfulness plus optimality through the celebrated VCG mechanism: if you didn't care about computational considerations for a minute, you could get a truthful mechanism which optimizes social welfare all the time. But truthfulness plus optimality is impossible for makespan, as Nisan and Ronen showed in their paper. This means that even if I give you unbounded computational time, you cannot give me a truthful scheduling mechanism which will optimize the makespan. There are some [inaudible], and I'll trace through the entire [inaudible] of results in the next slide.
Another striking difference is the kind of impressive reductions that are possible for social welfare, where you take an arbitrary approximation algorithm and inject truthfulness into it: you can morph the algorithm into a truthful mechanism that preserves the approximation guarantee of the original algorithm. Such impressive general reductions are not possible even in the simplest of settings for makespan. This, again, is a big difference. This problem was basically introduced as a challenge problem for mechanism design by Nisan and Ronen in their seminal paper, and since then it has become a hot area to work on. Let's see what is known. Nisan and Ronen gave a simple m approximation in their original paper, but much of the activity has been on lower bounds for the problem. The older papers showed that no deterministic mechanism can get anything better than a 2 approximation, even with unbounded computational time; there are no computational restrictions here. This was later extended to randomized mechanisms by [inaudible], and then the bound was improved several times by [inaudible]. Recently, Ashlagi, Dobzinski, and Lavi showed that the upper and lower bounds match, basically, so there is a very strong hardness [inaudible], although this is for a restricted class of mechanisms, anonymous mechanisms. Anonymous means that you should not use the name of a machine to make your decision, which means that if two machines swap their runtimes, then how you treat those two machines should also be swapped. Okay?
So given this backdrop of [inaudible] mechanisms, the question we ask in this work is: if you make stochastic assumptions, can you get a prior-robust truthful mechanism with a good approximation for the makespan, given all these hardness results in the competitive setting? That's the question in this work. I'll now formally present the distributional model for scheduling. You have n jobs and m machines, the runtime of job j on machine i is X_{j,i}, and the X_{j,i}'s are independent random variables. Here is the assumption on the distributions: the runtime distributions are identical across machines. For a given job, the runtimes of that job on the different machines are i.i.d. random variables, but the jobs themselves could be non-identical. So basically, the machines are a priori identical, but if you draw the runtimes from these distributions, any specific instantiation of the runtimes will give you an unrelated machines instance.
So I'm going to use a short form for machines being i.i.d. and jobs being non-i.i.d., but this is what I mean. For some results, I also need the jobs to be i.i.d., but I'll present that when I present the results. The goal is to minimize the expected makespan.
>>: And do you know the distribution?
>> Balasubramanian Sivan: No. I want a prior-robust mechanism, which means for all possible distributions; we just have to satisfy these conditions: independence and machines being i.i.d. So this is what I showed you already: the upper and lower bounds match in the competitive setting, for a restricted class of mechanisms in the case of the lower bounds. And the stochastic setting is completely open, which means that whether you can do something better if I gave you the distribution is not known. Here are our results. We give a prior-robust truthful mechanism which gets an O(n/m) approximation to OPT-half, where n is the number of jobs and m is the number of machines. In particular, when n is equal to m, this is a constant factor approximation. And the lower bound there also holds for n equal to m, so basically you can get a constant factor for this case. The benchmark is OPT-half: OPT is allowed to be non-truthful. It is the non-truthful expected optimal, but it is allowed to use at most half the machines, so it's like a resource augmentation result; I'll explain this benchmark OPT-half in a bit. But this is our first result. Later we show that you can improve this to sub-logarithmic factors if you assume that the jobs are also i.i.d. and make further distributional assumptions.
So those are our results. I'll now go through this OPT-half benchmark and explain it a bit. As I said, OPT-half means that OPT is allowed to use at most half the number of machines that you are allowed to use. So if you get a good approximation to OPT-half, it means that you can do well with resource augmentation: if you double the number of machines, you can perform approximately as well as OPT. In general, this OPT-half could be much larger than OPT, but we show that for a large family of distributions, OPT-half and OPT are within constant factors, which means there is really no resource augmentation needed for these distributions, because there is only a constant factor of loss for these [inaudible] distributions. Intuitively, these are distributions whose tails are no heavier than the exponential distribution's, like the uniform [inaudible] exponential. This is basically true even for a much larger class of distributions: OPT-half and OPT are within constant factors. I won't talk about that class now. So that is the benchmark OPT-half.
So I will now sketch the proof of how we get this O(n/m) approximation. Basically, I'm going to assume that n is equal to m, and then I'm going to shoot for a constant factor approximation in the proof sketch. To get a flavor of what is truthful and what is possible in this multi-parameter setting, let's begin with a very simple mechanism which is truthful; it was introduced by Nisan and Ronen and is called the Min-work mechanism. It just minimizes the total work done by all the machines, which means it tries to minimize the sum total of the runtimes over all the machines; this has nothing to do with makespan. So it takes each job and schedules it on the machine which has reported the smallest runtime for that job. That's what it means to minimize the sum total of all the runtimes. And it pays the machine the job's runtime on the second quickest machine. This results in a truthful mechanism; just take my word for it, I won't prove it.
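A minimal Python sketch of that Min-work mechanism, with its per-job second-price payments (the data layout is my own, purely illustrative):

```python
def min_work_mechanism(reported_runtimes):
    """Min-work: each job goes to the machine reporting the smallest runtime for it,
    and that machine is paid the second-smallest reported runtime for the job
    (a per-job second-price rule, which is what makes truthful reporting dominant).

    reported_runtimes[i][j] is the runtime machine i reports for job j; needs m >= 2."""
    m, n = len(reported_runtimes), len(reported_runtimes[0])
    schedule = {}                  # job j -> machine i
    payments = [0.0] * m           # total payment to each machine
    for j in range(n):
        order = sorted(range(m), key=lambda i: reported_runtimes[i][j])
        winner, runner_up = order[0], order[1]
        schedule[j] = winner
        payments[winner] += reported_runtimes[runner_up][j]
    return schedule, payments
```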
>>: [inaudible]?
>> Balasubramanian Sivan: Exactly. Yes. And it also minimizes the total global runtime.
>>: [inaudible]?
>> Balasubramanian Sivan: No, [inaudible] each job, but that's exactly what you'll do if you want to minimize the total runtime. You're not doing anything global, actually; you can act locally to minimize the total runtime.
>>: But if you were [inaudible]?
>> Balasubramanian Sivan: This is exactly VCG; basically it's VCG, but we are running a reverse auction here. So instead of maximizing welfare, you are minimizing the burden, basically; you're minimizing the total runtime.
>>: One more question. So the revelation principle doesn't apply to certain objective functions, that's why you can [inaudible]?
>> Balasubramanian Sivan: No. The revelation principle applies, but it just says that whatever you can do with a non-truthful mechanism you can also do with a truthful mechanism. Here, this is a different objective: I was minimizing the total work, while my goal is to minimize the makespan. I'm simply presenting a mechanism which is truthful and has some nontrivial approximation for makespan, that's all. You can use this mechanism for makespan, but it will be bad. That's what I'm going to talk about now.
So this simple mechanism already gives an m approximation, where m is the number of machines. The argument is simple, because the makespan is just a max, and the max is obviously at most the sum; the sum of the quickest runtimes is at most the sum of the runtimes of all jobs in OPT, which is at most m times the makespan of OPT. Okay?
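Written out, that chain of inequalities (with t_{i,j} the runtime of job j on machine i and OPT(j) the machine OPT uses for job j):

```latex
\[
\mathrm{makespan}(\text{Min-work})
\;\le\; \sum_{j} \min_i t_{i,j}
\;\le\; \sum_{j} t_{\mathrm{OPT}(j),\,j}
\;\le\; m \cdot \mathrm{makespan}(\mathrm{OPT}).
\]
```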
>>: This was an example?
>> Balasubramanian Sivan: This was an example. You get m approximation. It is also truthful.
So what else is truthful, to get a sense of what is possible? It turns out that if you minimize the total work over a constrained, restricted domain of schedules, that is also truthful, okay? You put a pre-specified restriction on what kinds of schedules are possible before you ask for the runtimes, then minimize the total work over this restricted domain of schedules, and that is also truthful. So the natural question is: why not try the simple Min-work mechanism first? Maybe it does better in a stochastic setting, because the m approximation of Nisan and Ronen I showed was for the worst case. So let's look at this simple example: you have m jobs and m machines, everything is i.i.d., and all runtimes lie between one and one plus epsilon. Now the quickest machine for a job is a uniformly random machine, right? Which means that the Min-work mechanism, by assigning each job to its quickest machine, is basically assigning it to a random machine. So you're basically throwing balls randomly into bins, and the fundamental fact in balls-and-bins analysis is that the heaviest loaded bin has about log m over log log m balls. This implies a logarithmic approximation, which is trivial, but we're looking for much better approximations, like constant factor or at least sub-logarithmic in m. Okay?
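A quick, purely illustrative simulation of that balls-and-bins fact (m balls thrown into m bins uniformly at random):

```python
import math
import random

def max_load(m, rng=random):
    """Throw m balls into m bins uniformly at random; return the heaviest bin's load."""
    loads = [0] * m
    for _ in range(m):
        loads[rng.randrange(m)] += 1
    return max(loads)

m = 100000
print(max_load(m), round(math.log(m) / math.log(math.log(m)), 2))
# The observed maximum load concentrates around log m / log log m.
```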
So what is the problem with this Min-work mechanism? It is basically overcrowding machines, like what is happening here. It gives too much importance to putting a job on the quickest machine possible. So a natural next step would be to explicitly prevent this overcrowding. What if you say that you should not schedule more than K jobs on any machine? That is a restricted domain of schedules. You run the same Min-work mechanism, trying to minimize the total work over this restricted domain of schedules, with at most K jobs per machine. That's what we call the Min-work(K) mechanism: minimize the total work with at most K jobs per machine. This is truthful because it minimizes work over a restricted domain of schedules, and it is also polynomial time because it's a minimum-cost matching problem where each machine has a capacity of K. The claim is that this Min-work(K) mechanism gives a constant factor approximation to OPT-half. The proof is roughly these two steps. First, the Min-work(K) mechanism results in a roughly balanced schedule, unlike the lopsided schedules that plain Min-work can produce. Secondly, in spite of the fact that I put a restriction on the number of jobs that go to each machine, so that a job no longer necessarily goes to its quickest machine, still a job roughly goes to the quickest machine.
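A rough sketch of the Min-work(K) allocation rule in Python, implemented by duplicating each machine into K slots and solving a min-cost assignment (uses scipy, assumes n <= m*K; the payments, which are what make the mechanism truthful, are omitted):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def min_work_k_schedule(reported_runtimes, K):
    """Min-work(K) allocation: minimize total reported work subject to at most K
    jobs per machine. reported_runtimes[i][j] is machine i's reported runtime
    for job j; returns a dict mapping each job to a machine."""
    times = np.asarray(reported_runtimes, dtype=float)   # shape (m, n)
    cost = np.repeat(times, K, axis=0)                   # K consecutive slots per machine
    job_idx, slot_idx = linear_sum_assignment(cost.T)    # rows = jobs, cols = slots
    return {int(j): int(s) // K for j, s in zip(job_idx, slot_idx)}
```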
>>: What is K, again?
>> Balasubramanian Sivan: K is the restriction I'm going to put [inaudible]: at most K jobs can go to any particular machine. You cannot crowd beyond that. You're optimizing-
>>: [inaudible]?
>> Balasubramanian Sivan: I'll choose K; make K ten, basically. I mean, K is a constant, and I can choose K to get a constant [inaudible] approximation; for concreteness, I'll just use K equal to ten.
>>: [inaudible]?
>> Balasubramanian Sivan: So here m is equal to n, but if you have more jobs than machines, then a constant times n/m jobs per machine is the restriction I put; it is the average load, basically, n/m. Here that is ten, and that's where the O(n/m) factor comes from. Okay. So the second point is that although a job no longer necessarily goes to its quickest machine, it still roughly goes to its quickest machine.
So here's a proof sketch; I'll just present the two key lemmas. The first lemma is that the probability that a job goes to its i-th favorite machine, ranking the machines according to runtimes, is basically exponentially decaying in the rank i of the machine; so a job doesn't go too far down its preference order. The second lemma basically complements the first: placing a job on its i-th quickest machine is no worse than placing roughly five-to-the-i independent copies of the job on their quickest machines. So the only advantage OPT has is that it puts every job on its quickest machine; that's the best thing possible for OPT, but it is the quickest among only m/2 machines, okay? So this can be phrased purely as a probabilistic result about order statistics: the i-th order statistic among m independent samples is almost stochastically dominated by five-to-the-i independent copies of the first order statistic, where the first order statistic is handicapped with only m/2 independent samples. That is the purely probabilistic version of what we're showing. And you can see that these two lemmas combine to give a constant approximation, because the probability of going to rank i is exponentially decaying in i, while here we have an exponential number of copies to charge against.
So I'll just summarize part two. When the machines are i.i.d. and the jobs are non-i.i.d., we give a prior-robust mechanism which, being blind to the distributions, gives an O(n/m) approximation, which is a constant factor when n equals m. And as I said, OPT-half and OPT are comparable for a large class of distributions, so it is not a resource augmentation result for those distributions at least. And when the jobs are also i.i.d., you can further improve this to sub-logarithmic approximations, again with a prior-robust mechanism.
So I'll wrap up with open questions. One obvious open question is how broad a class of distributions you can tackle with prior-robust mechanisms. In particular, can you relax the jobs being i.i.d. here, and still get the sub-logarithmic approximations with non-i.i.d. jobs? Another question, maybe beyond the scope of this work: what if you relax the i.i.d. assumption on the machines? Everything I said breaks down there, and you need completely fresh ideas.
The second open question concerns what is possible in the competitive setting. [inaudible] the upper and lower bounds match, at least for anonymous mechanisms. Can you slightly relax the model and get positive results? Here is one possible relaxation. Computer jobs, you know, computer programs, need not necessarily run on one machine; you can run the same program on multiple machines and take the completion time to be when the first copy of every job completes. It's not a [inaudible] constraint. This helps with incentive compatibility. You might ask what is the use of scheduling a job on [inaudible] machines, but it helps with incentive compatibility; for example, you can schedule all jobs on all machines. That is a possible schedule, and it is truthful; the only thing is it does too much work. If you also put a restriction on the maximum amount of work that can be done while allowing a job to be scheduled on multiple machines, can you do something better?
>>: You’d have to create each machine [inaudible]?
>> Balasubramanian Sivan: Yes. It could be possible that-
>>: [inaudible]?
>> Balasubramanian Sivan: But you want to put the restriction on the total work to be done. If you simply schedule all jobs on all machines, then it is possible to get the best makespan [inaudible]. That is trivially truthful for every machine, because it doesn't depend on the reported runtimes of the machines at all, but then every machine does too much work. So you put a restriction on the maximum work that can be done. That is one way to try to circumvent the lower bounds of the competitive setting.
So I'll just briefly mention a selection of other research I have done, before getting to future research directions. One problem that I recently worked on is the design of optimal crowdsourcing contests. In a crowdsourcing contest, a principal or a firm has a task to be completed; it advertises this task to a crowd and puts up a reward, and the users in the crowd make submissions for this task. You evaluate the submissions according to some pre-specified criterion and spread the reward among the winner or some set of winners. So the obvious question which arises is what contest format to use to incentivize users to give good submissions. In particular, the question we ask in this work is how to optimize the quality of the best submission you receive. A good example to have in mind is the contest, which many of you may know, that Netflix ran recently. Netflix ran a contest to improve the prediction accuracy of their algorithm: they wanted to predict how much a given user would like a particular movie, based on how much he liked a given set of previous movies. And they promised that the first user to improve the prediction accuracy of their current algorithm by ten percent would get one million dollars. That contest is over now. So that is one model. What is the best contest format? For the model we use, our result is basically that there is a very simple contest format which optimizes the quality of the best submission: you take all the submissions and segregate them into buckets. To go back to the Netflix example, you put all improvements between 10 and 12 percent in one bucket, improvements between 12 and 14 percent in another bucket, and all the users who fall in the highest bucket share the reward equally. That is the optimal format; that's what we prove.
>>: Why is it better to [inaudible] than handing it all out to the winner?
>> Balasubramanian Sivan: Okay. So there are two things. One is: what is the best thing to do if you run a static contest? A static contest means you do not get to decide how much to give to which person after you have received the submissions. This is what TopCoder does, for example; it runs these architecture competitions where it says two thirds of the reward will go to the best submission and one third of the reward to the second best submission. Among all such static contests, we prove that it is best to give everything to the winner; winner-take-all is the best. But obviously a dynamic contest, where you are allowed to see the submissions and then decide what to do, can do better, right? For example, when I said improvements of 10 to 12 percent and 12 to 14 percent go into different buckets, which bucket the very best of the submissions falls into, and how many users fall into it, is something you only learn after seeing the submissions; a dynamic contest of that kind is always at least as good as a static one.
>>: [inaudible] you're saying this is better than giving everything to the highest [inaudible]?
Why is that?
>> Balasubramanian Sivan: Well, I mean, the buckets we design are basically determined by the distribution, and for some distributions it will turn out that there are no buckets at all, which means that everything goes to the winner; that is also possible. For many so-called regular distributions it might turn out that there is no bucketing. But if the distribution isn't regular, there is a non-monotone allocation function, and we do what is called ironing of the allocation function; that results in bucketing. But it suffices to say that it will often happen that you give everything to the highest bidder. The highest-
>>: [inaudible]. There are some situations where you don't want to do that, and that's what we are more interested in. Can you explain [inaudible]-
>> Balasubramanian Sivan: It depends on the distribution, basically, why that arises-
>>: [inaudible]?
>> Balasubramanian Sivan: So the agents have skills, basically. You can think of the skill as the rate at which an agent converts effort into quality. The quality of the submission is the product of the skill and the effort of the agent. Okay, so it's a linear model: skill v times effort e gives the quality of the submission. And you have a distribution over these skills. So you can map it to auction theory; we basically extend auction theory to model this. Now, the only difference is that in auction theory, for revenue, you study the sum of the payments; here it is the maximum payment that matters. And you could implement the same thing differently: instead of dividing all the reward in the highest bucket equally, you can also break ties, but you should break ties in a consistent manner. Instead of dividing things equally, you can always say that if these people fall in this bucket, I'll always give everything to agent one if he falls in this bucket, and then to agent two. So that is a way to run these things without [inaudible].
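To make the bucketed reward rule concrete, here is a minimal sketch of how the payout could be computed once submissions have been scored. The participant names, bucket boundaries, and the equal-split tie-breaking are illustrative assumptions, not details from the paper.

```python
import bisect

def award_bucketed_contest(qualities, bucket_edges, reward):
    """Split a fixed reward among everyone in the highest occupied bucket.

    qualities:    dict mapping participant -> submission quality
    bucket_edges: ascending bucket boundaries, e.g. [0.10, 0.12, 0.14]
                  (qualities below the first edge win nothing)
    reward:       total prize money
    """
    # Assign each participant to a bucket index (-1 = below every edge).
    bucket_of = {p: bisect.bisect_right(bucket_edges, q) - 1
                 for p, q in qualities.items()}

    top = max(bucket_of.values())
    if top < 0:
        return {}  # no submission cleared the lowest bar

    winners = [p for p, b in bucket_of.items() if b == top]
    # Everyone in the highest occupied bucket shares the reward equally.
    return {p: reward / len(winners) for p in winners}


# Example with Netflix-style buckets: improvements of 10-12% and 12-14%
# land in different buckets; the two people in the top bucket split the prize.
print(award_bucketed_contest(
    {"alice": 0.11, "bob": 0.13, "carol": 0.135},
    bucket_edges=[0.10, 0.12, 0.14],
    reward=1_000_000))
```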
>>: [inaudible]?
>> Balasubramanian Sivan: No, even for these distributions, it could turn out that you have to split rewards equally between the players who fall in the highest band. [inaudible]. As for an example of why, I can't think of a good example for why this happens.
>>: [inaudible] than just giving it to the best?
>>: So even with your distribution, it could be that some people don't even want to buy [inaudible]; they don't have a real chance of winning. But you're giving [inaudible]?
>> Balasubramanian Sivan: Yeah. So I have a mathematical answer, but I don't have a good intuition to give. Let me describe, from auction theory, what is going on, basically. Truthfulness in single-parameter settings, where each agent has only one private parameter, can basically be characterized in terms of the allocation. So if this is v and this is x(v), then the allocation function has to be basically monotone. If the optimization results in a monotone allocation function, then it is truthful. But if you have a non-monotone allocation function like this, and the goal is to do optimization subject to the incentive compatibility condition, then you basically iron this allocation function, which means you flatten it out in this region, so that all values which fall in a particular region are treated equally. Whenever you iron an allocation function, there is going to be a discontinuity in how you allocate. The reason is: so far you are competing only with this set of people, people with values here. If your value slightly increases, then you're on par with all of these people. Okay? So clearly, the probability with which you're going to get served will increase. Similarly here, if your value slightly increases, then you suddenly bypass all these people, because you only have a [inaudible] competition here. You're not treated at par with these people, so there's going to be a discontinuity in the allocation function whenever you try to flatten it. This discontinuity in the allocation function will ultimately result in a discontinuity in payments. The question is how do you ensure this discontinuity in payments? The answer is you basically explicitly incentivize people to bid in certain ranges. Discontinuity in payments means that there are some forbidden zones of possible payments. So if I say that all improvements between 10 percent and 12 percent will be treated equally for Netflix, nobody will try to give 11 percent accuracy. Everybody will go to 10. Which means the region between 10 and 12 is a forbidden region of payment; nobody will land in that region. So to mimic that in this contest setting, you basically put in these restrictions; it flows out of this theory. But I'm not able to come up with an example where it's more intuitive to do this. The theory is there, though; the discontinuity has to be modeled. And I hope you can see that this basically ensures the discontinuity: nobody will give anywhere between 10 and 12. People will always keep to the lower end of any bucket. Right? Anything more is a waste of effort.
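As an illustration of the flattening step just described, here is a minimal sketch of ironing a discretized allocation curve using the pool-adjacent-violators idea: any stretch where the curve decreases is replaced by its average, so all values in that stretch are treated equally. This is a generic sketch of the technique on made-up numbers, not the specific construction from the paper.

```python
def iron(allocation):
    """Flatten a (possibly non-monotone) allocation curve into a
    non-decreasing one by pooling adjacent violating segments and
    replacing each pooled segment with its average value.

    allocation: list of allocation probabilities x(v) on an increasing
                grid of values v.
    """
    # Each block is (total, count); block means stay non-decreasing.
    blocks = []
    for x in allocation:
        total, count = x, 1
        # Merge backwards while the previous block's mean exceeds ours.
        while blocks and blocks[-1][0] / blocks[-1][1] > total / count:
            t, c = blocks.pop()
            total, count = total + t, count + c
        blocks.append((total, count))

    ironed = []
    for total, count in blocks:
        ironed.extend([total / count] * count)
    return ironed


# A non-monotone allocation: the dip after 0.7 gets flattened, so the
# values in that region are all served with the same probability.
print(iron([0.1, 0.3, 0.7, 0.4, 0.4, 0.9]))
# -> [0.1, 0.3, 0.5, 0.5, 0.5, 0.9]
```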
The other work is something I did, again in 2010, when I was an intern. The motivating question for this work is that computing truthful payments is often more difficult than computing the original allocation function itself. Oftentimes, truthful payments are only a secondary consideration: the original problem is the allocation problem, but this secondary thing turns out to be more difficult than the primary problem itself. This is a striking observation made in an [inaudible] seminal paper, where, for many problems, it seems that computing truthful payments is much harder than computing the original allocation. Can you compute truthful payments as fast as the original allocation? That's the question we asked. The answer is yes, you can, if you're willing to go ahead with a more relaxed notion of truthfulness, namely truthfulness in expectation. And you can do this both for single-parameter domains, which answers the question asked by [inaudible], and we also do it for multi-parameter settings. And this has applications to [inaudible] auctions where you do not know the parameters.
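For context on why this relaxation helps: in single-parameter settings the truthful payment is pinned down by Myerson's identity, p(v) = v*x(v) - (integral of x(z) dz from 0 to v), and one folklore way to avoid computing the integral exactly is to estimate it with a single extra allocation call at a random point, which yields truthfulness in expectation for risk-neutral agents. The sketch below illustrates that idea only; it is not necessarily the exact construction from the work being described, and the toy allocation rule is an assumption.

```python
import random

def payment_estimate(alloc, v):
    """Unbiased one-sample estimate of the Myerson payment
        p(v) = v * x(v) - integral_0^v x(z) dz
    for a monotone allocation rule x(.), other bids held fixed.

    alloc: function mapping this agent's reported value to its
           allocation probability x(v) in [0, 1]
    v:     the agent's reported value
    """
    z = random.uniform(0.0, v)          # one extra allocation call
    return v * alloc(v) - v * alloc(z)  # expectation equals the exact payment


# Toy monotone allocation rule (illustrative only): x(v) = min(v, 1).
x = lambda v: min(v, 1.0)
samples = [payment_estimate(x, 0.8) for _ in range(100_000)]
print(sum(samples) / len(samples))      # ~ 0.8*0.8 - 0.8**2/2 = 0.32
```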
And I'll just mention one other result. This has to do with revenue maximization in multi-parameter settings, which has been an open question for a long time in mechanism design. If you just have one parameter, then you know how to optimize revenue. But if agents are interested in multiple items and have different values for different items, then this question has been open for long. And we basically showed that you can get very simple mechanisms which approximate the optimal revenue by just posting prices on items and allowing agents to take their favorite items. Because it had been open for long, there have been several follow-ups to this work in several directions. For example, we make some interesting connections to prophet inequalities in statistics. There was an open question in our paper that depended on improving a prophet inequality; this created new interest in prophet inequalities, the state-of-the-art prophet inequalities have since been improved, and so the open question was answered.
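As a hedged illustration of the prophet-inequality connection, here is a small numerical sketch of the classic single-item threshold rule: posting a price equal to half the expected maximum of independent values recovers, in expectation, at least half of what an omniscient "prophet" would get. The distributions below are made up for illustration; this is not the multi-item mechanism from the paper.

```python
import random

def prophet_threshold_demo(dists, trials=100_000):
    """Compare a fixed posted price T = E[max]/2 against the expected
    maximum, for buyers arriving in a fixed order with independent values."""
    # Estimate E[max] (the prophet's value) by simulation.
    e_max = sum(max(d() for d in dists) for _ in range(trials)) / trials
    price = e_max / 2.0

    # Sell to the first arriving buyer whose value meets the price.
    obtained = 0.0
    for _ in range(trials):
        values = [d() for d in dists]
        obtained += next((val for val in values if val >= price), 0.0)
    return e_max, obtained / trials


# Illustrative distributions: three uniform buyers and one long-shot buyer.
dists = [lambda: random.uniform(0, 1)] * 3 + [lambda: 10.0 * (random.random() < 0.05)]
e_max, alg = prophet_threshold_demo(dists)
print(f"prophet: {e_max:.2f}, threshold rule: {alg:.2f}")  # alg >= e_max / 2 up to noise
```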
Okay. So I'll conclude with future research directions. One thing which is obvious, if you look at what has been going on in resource allocation settings, is that all these problems have been studied under [inaudible] assumptions [inaudible]. Game theory has been completely ignored in these problems, for the obvious reason that the non-game-theory problem was already nontrivial. But the picture is finally fairly complete now for the non-game-theory setting. So the goal is to bring game theory back into online algorithms, online [inaudible] problems. I'll leave you with one simple problem where you could try bringing game theory back in. You have the same display ad setting: m requests and n advertisers, and advertiser i wants at most B_i impressions. The only difference is that these B_i's are privately held. And let's assume that the queries are drawn i.i.d. from an unknown distribution. If you just look at a purely online setting without any game theory, the algorithms I presented basically solve this problem with arbitrarily-close-to-one approximations. If you look at a purely offline setting with the game theory in it, then the VCG mechanism gets a one approximation. But if you mix these two, online and game-theoretic constraints together, we don't know how this works. Can you get arbitrarily close to one? Or maybe not; then you have a hardness of being truthful. This is just like many problems where there are two constraints and either constraint alone can be handled, but if you mix them, you cannot. This has been the prime agenda in mechanism design, where computational and incentive constraints can each be handled well individually, but when they are combined, you don't know how to solve them. There's a tension. Similarly here, you have online and incentive constraints. Is there a separation here? Is there a hardness of being truthful? What can you do if the distribution is known versus unknown? Is there a separation between i.i.d. and [inaudible]? There's a whole list of questions here. That's it.
Thanks for visiting. [applause]
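To make the open problem concrete, here is a minimal sketch of the naive non-game-theoretic baseline for it: impressions arrive online, advertiser i can receive at most B_i of them, and a greedy rule gives each impression to the highest-value advertiser with remaining capacity. The per-impression values and the tiny instance are illustrative assumptions; this is only the trivial baseline, not the near-optimal online algorithms from the talk, and it ignores incentives entirely.

```python
def greedy_online_allocation(impressions, capacities, value):
    """Assign each arriving impression to the advertiser with the highest
    value for it among those that still have capacity left.

    impressions: iterable of impression identifiers, arriving online
    capacities:  dict advertiser -> max impressions B_i it can receive
    value:       function (advertiser, impression) -> value of the match
    """
    remaining = dict(capacities)
    assignment, total_value = [], 0.0

    for imp in impressions:
        candidates = [a for a, cap in remaining.items() if cap > 0]
        if not candidates:
            assignment.append((imp, None))  # nobody has capacity left
            continue
        best = max(candidates, key=lambda a: value(a, imp))
        remaining[best] -= 1
        assignment.append((imp, best))
        total_value += value(best, imp)

    return assignment, total_value


# Tiny illustrative instance: two advertisers, three impressions.
caps = {"adv1": 1, "adv2": 2}
vals = {("adv1", "q1"): 0.9, ("adv2", "q1"): 0.5,
        ("adv1", "q2"): 0.8, ("adv2", "q2"): 0.6,
        ("adv1", "q3"): 0.7, ("adv2", "q3"): 0.4}
print(greedy_online_allocation(["q1", "q2", "q3"], caps,
                               lambda a, q: vals[(a, q)]))
```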
>> Nikhil Devanur Rangarajan: Any more questions?
>>: So like from the standard i.i.d. model [inaudible] which cannot [inaudible]
>> Balasubramanian Sivan: No. It's not clear that you can get that in theory, because all the [inaudible] in the theory is purely algorithmic.
>>: [inaudible] main question is: in the [inaudible] setting, are you trying to optimize the expected value of the algorithm against the expected optimal value over the distribution? What if you take the expected value of the ratio instead of the ratio of expectations? If, for example, the optimum values of your instances can differ a lot, then you would take the ratio on every instance and only then take expectations [inaudible]
>> Balasubramanian Sivan: That's true. That's a stronger thing, yeah. There is, for example, a high-probability [inaudible] result, which is better than the ratio of expectations, you know: with high probability you get this one minus square root of gamma again, and that is a stronger benchmark. And yes, whether the one minus square root of gamma I presented depends on using the ratio of expectations, I'm not sure, yeah.
>>: In comparable [inaudible]
>> Balasubramanian Sivan: Yes, but I mean, at least for the kind of instances he described, where [inaudible] can be very different, it could be that that is stronger.
>> Nikhil Devanur Rangarajan: Anymore questions? Let's thank the speaker again. [applause]