>> Yuval Peres: All right. We're happy to welcome both Costis and Jason to give us two game theory talks today. >> Costis Daskalakis: Hi, I'm glad to be back. So today I'm going to talk about some interactions -- what I call probabilistic approximation theorems -- to solve certain game theory and optimization problems. And the talk is modular, so it has separate modules and you can stop at any time. So I'm not sure how much I'll cover in 45 minutes. So, you know, physicists tell us that there is uncertainty that's inherent in the physical world. But as this picture may convince you, there's also uncertainty in the social environment, okay? So this is an interesting T-shirt. So that's rock, paper, scissors, right? So what I'm interested in in this talk is to look at uncertainty in optimization problems, game theoretic problems, auction problems, and I'm going to look at stochastic uncertainty, right? So Bayesian uncertainty is used in these settings. So in the game theoretic setting, it's randomization that enables Nash's theorem. In the auction setting, it's randomization that enables Myerson's theorem, because the design of the revenue-optimal auction is enabled by modeling the uncertainty in the environment in a Bayesian way. And in optimization, generically, modeling uncertainty in a stochastic way allows us to go from worst-case to average-case analysis, okay, and get, you know, stronger results. On the other hand, this comes with difficulties. So in these settings uncertainty makes the problem harder to solve; the hardness is in the form of PPAD-completeness. Over here we don't know how to generalize this important result to multi-dimensional settings, and generically Bayesian uncertainty introduces nonlinearity in the underlying optimization [inaudible]. So it comes with benefits and it comes with difficulties. 
And what I want to do today is to show examples where we can somehow harness this Bayesian uncertainty to get rid of some of these difficulties. And the way we're going to do that is we're going to use tools from probability theory. All right. So what I want to do is I want to connect these two things. In particular, I'm going to look at game theoretic and combinatorial optimization problems. Okay. So my first module is symmetries in games, which I've talked about before in this group, so I'm going to go through it faster; I just want to illustrate a few things with this example. So the goal of this module is to study game theoretic settings where there is a lot of symmetry. Okay? In particular, in this environment, there's no way a player can keep track of the identities and actions of every other player in the game. Presumably a player is keeping track of aggregates of what the other players are doing. And here is a mathematical model to capture such settings. So I'll call a game an anonymous game if every player only cares about what he's doing and what the aggregate behavior of the other players is. So more precisely, every player is supposed to have the same strategy set, and then the payoff function of each player is anonymous; that is, it looks at what the player is doing and then how many of the other players are choosing each of the available strategies in the game. As an example, congestion games can be written in this form, because when I'm driving in a road network I only care about what route I'm choosing and then, you know, what routes the other players are choosing, in a sort of aggregate way. I don't care about the identities of the other players. Or, you know, if you want to be more elaborate I can partition players into types, slow drivers, fast drivers [inaudible], but in any event, you know, there is a lot of anonymity in the way I'm modeling the game. 
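To make the anonymity condition concrete, here is a minimal sketch of my own (not from the talk; the congestion-style payoff is illustrative) of a two-action anonymous game: a player's payoff reads only his own action and the count of other players choosing each action, so permuting the identities of the other players never changes it.

```python
# Sketch of an anonymous game with two actions (0 and 1) and a
# congestion-style payoff: each action gets worse as more players use it.
# The payoff depends only on (own action, #others choosing action 1).

def congestion_payoff(own_action, others_on_1, n_players):
    # Illustrative cost: the load on the action I chose, negated.
    if own_action == 1:
        load = others_on_1 + 1
    else:
        load = (n_players - 1 - others_on_1) + 1
    return -load

def payoff(actions, i):
    # Anonymity: only the aggregate of the other players' actions matters.
    others = actions[:i] + actions[i + 1:]
    return congestion_payoff(actions[i], sum(others), len(actions))

# Permuting the identities of the *other* players leaves the payoff fixed.
a = [1, 0, 1, 1, 0]
b = [1, 1, 1, 0, 0]   # same multiset of the others' actions
assert payoff(a, 0) == payoff(b, 0)
```

The same structure extends to the typed variant mentioned above by keeping one count per (type, action) pair.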
But, you know, examples can be drawn from other settings, social phenomena, auctions and so on, so forth. And, you know, just a couple of references in the game theoretic literature; over here the study of social phenomena uses this model, and these are congestion games. So what I'm interested in is how players reason in these settings. There's a big population of other players, and really what affects my payoff is the aggregate behavior of those players. All right? So, you know, how would Nash play this game? Okay. That's a joke referring to the [inaudible] okay. So how would Nash play this game? So here's, you know, some results I want to talk about. So in joint work with Christos Papadimitriou we solved these games in the following sense: we compute, for any given precision, in polynomial time, an approximate Nash equilibrium of these games. Okay? And to do that, we have to understand how players reason about what they have to do given the aggregate behavior of the other players, and, you know, these results are enabled by approximation theorems that approximate the aggregate behavior of the other guys. And I'm going to show examples of what I mean by that. But in reality these reduce to theorems about sums of indicator random variables, or multidimensional versions of indicators. So let me be more precise. So in an anonymous game, the Nash equilibrium is some vector of -- say there are two actions per player; every player is just choosing a probability. So the Nash equilibrium lies in this space, okay? So every player is using a probability and there are N players. All right? So the Nash equilibrium is over here, and I have to search this space. It's high-dimensional. These are independent indicators because in a Nash equilibrium players randomize independently. And really, if I look at this as an optimization problem, you know, what is the objective function? 
Or more generally, if this is the real Nash equilibrium and instead of that I look at this point, how good of a Nash equilibrium is some other point that's far from the actual equilibrium in this space? So what's the right notion of distance, if you want, in this space that I'm interested in? So it turns out that the relevant distance function is the total variation distance between the sum of these indicators and the sum of those indicators. In other words, if this is the real Nash equilibrium and this other point is close to the Nash equilibrium under this distance, then that's an excellent approximate Nash equilibrium. We'll take this as given; it's pretty straightforward. So what I want to do to find an epsilon-Nash equilibrium is to come epsilon close to the actual Nash equilibrium, where the distance is measured by the total variation distance between the sum of these indicators and the sum of those indicators. And, you know, generically the way I want to solve this problem is to cover this space with, you know, epsilon balls in this distance. So I want an epsilon cover under this type of distance. >>: [inaudible] the numbers are zero one or [inaudible]. >> Costis Daskalakis: They're indicator variables. >>: [inaudible]. >> Costis Daskalakis: Right. It's isomorphic to this space. So the space of all mixed strategy profiles is isomorphic to the hypercube. So I'm looking at this space, the space of all vectors of indicators. And what I said in the previous slide is, if I have a mixed strategy profile, so a vector of indicators, and a different mixed strategy profile, how close are they in the -- >>: [inaudible]. >> Costis Daskalakis: You're asking about this? [inaudible] So for this example, suppose every player has two strategies. So a mixed strategy is just an indicator [inaudible]. >>: [inaudible]. >> Costis Daskalakis: Two strategies. 
And more generally this is [inaudible] categorical [inaudible]. >>: [inaudible]. >> Costis Daskalakis: [inaudible]. All right. So my point was that if this is a Nash equilibrium and I come epsilon close to it under this distance, then that is an epsilon-Nash equilibrium. All right? And the generic way to find the Nash equilibrium would be to cover this space with epsilon balls under this notion of distance, right, and then I'm guaranteed that one of these points is going to be epsilon close to the Nash equilibrium. So I'm going to exhaustively go over each of these representatives, one representative per ball in my cover, and check if it's an epsilon-Nash equilibrium. Okay? And obviously the problem is that this is, you know, high-dimensional and potentially the number of balls could be huge. And what we showed with Christos is that there is a polynomial number of balls that covers this space, which we didn't expect originally. But just to relate it to the types of approximation theorems I want to construct, here's an example approximation theorem that we needed to come up with to design such a cover. So here's a theorem. For any collection of indicator random variables with [inaudible] expectations -- these are independent, right -- and any given constant delta, not depending on N or anything else, it's an absolute constant, there is a way to construct another collection of indicators YI with expectations QI, so that the expectations of these guys are restricted to be multiples of delta. So these are of some finite precision, if you want. And at the same time the total variation distance between the original given sum of indicators and the constructed sum of indicators is a function of delta. So you don't get a penalty of N, as one would expect from this restriction. Okay? 
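The quantity in this theorem, the total variation distance between two sums of independent indicators (two Poisson binomial distributions), can be computed exactly by a simple dynamic program over the two PMFs. Here is a sketch of my own, with an illustrative collection whose expectations are naively rounded to multiples of delta = 0.05; note this only computes the distance, it does not reproduce the paper's construction, which chooses the rounded expectations much more carefully.

```python
def poisson_binomial_pmf(ps):
    """PMF of the sum of independent Bernoulli(p_i) indicators, by DP."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # this indicator came up 0
            nxt[k + 1] += mass * p     # this indicator came up 1
        pmf = nxt
    return pmf

def tv_distance(ps, qs):
    """Total variation distance between sum(X_i) and sum(Y_i)."""
    f, g = poisson_binomial_pmf(ps), poisson_binomial_pmf(qs)
    width = max(len(f), len(g))
    f += [0.0] * (width - len(f))
    g += [0.0] * (width - len(g))
    return 0.5 * sum(abs(a - b) for a, b in zip(f, g))

ps = [0.23, 0.61, 0.48, 0.07]     # given expectations
qs = [0.25, 0.60, 0.50, 0.05]     # rounded to multiples of 0.05
assert tv_distance(ps, ps) == 0.0
# Coupling bound: the distance never exceeds sum |p_i - q_i|.
assert tv_distance(ps, qs) <= sum(abs(p - q) for p, q in zip(ps, qs)) + 1e-12
```

The coupling bound above scales with N, which is exactly the penalty the theorem avoids by constructing the Y_i's jointly rather than rounding each expectation independently.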
So generically, if you are restricted to use multiples of delta, you would expect an N coming into this bound; what we show is that there's no N coming into it, and that enables polynomial-size covers. Okay? >>: [inaudible]. >> Costis Daskalakis: Huh? >>: [inaudible]. >> Costis Daskalakis: So the best I can do is delta log one over delta. I don't know if that's optimal. It's almost optimal for N equals one. But potentially if you have a bigger number of indicators you might be able to use that somehow. All right. So in any event this enables constructing covers of this size. And, you know, then you can exhaustively go over them and find the Nash equilibrium. Okay? So then it boils down to: if you want a faster algorithm, you need a better cover. Okay? And the best we can do right now -- I'm not going to go over, you know, this other theorem -- is something of this form. So I can cover this space, under this notion of distance, with a number of balls that is polynomial in N, the number of indicators, times some function of epsilon that's quasi-polynomial in one over epsilon. All right? So I can convert looking for an epsilon-Nash equilibrium into searching over a space of this size. And as an open problem, whose answer I don't know -- I don't have a strong belief about whether a better, stronger cover exists or not -- is there a cover of this space that has size polynomial in both N and one over epsilon? That's an interesting question. Yes? >>: [inaudible]. >> Costis Daskalakis: Excuse me? >>: [inaudible]. >> Costis Daskalakis: It's -- >>: So the existence of these [inaudible]. >> Costis Daskalakis: Right. So you're asking how do you go from the theorem to bringing down the number of balls? How do you use the theorem to -- >>: [inaudible]. >> Costis Daskalakis: Right. How do you prove the theorem. Yeah. 
>>: [inaudible]. >> Costis Daskalakis: So it's not exactly two lines. But you use -- okay. So one important ingredient is that this function is symmetric over the indicators, so to begin with, there are already some reductions you can do. That's not going to give you a polynomial reduction, but the problem is symmetric under permutations. And then, you know, using that and using the theorem I've presented, you can bring down the dimension. Okay? So because this is symmetric, and because I can take each of these guys to choose from a finite set, there are only polynomially many possible permutation-invariance classes of collections I can make. Okay? All right. That was my first application. So what I want to talk about now is a different setting, an auction setting. I want to talk about what is called multi-dimensional pricing; I'm going to explain what that means. But some case studies first. So this falls under the more general class of problems called optimal [inaudible] design, where basically the goal is to come up with an auction that optimizes the revenue of the auctioneer. And such auctions are known for certain settings. And maybe the most celebrated result in this realm is Myerson's auction, which basically does the following. So it's a revenue-optimal auction, under the assumption that the [inaudible] are quote/unquote single-parameter Bayesian. I'm going to clarify what single-parameter Bayesian means and what closed form means. So single-parameter Bayesian is a strong assumption. Basically what it says is that every bidder who is participating in the auction has the same value for each of the available items. All right? So if, you know, the auctioneer is selling many paintings, then every bidder has the same value for all the paintings. Okay? The value's random, but it's a single value for every painting. Okay? 
No matter which painting it is, the Picasso or any other, my value's the same, okay, and that value is a random number. >>: [inaudible]. >> Costis Daskalakis: So every bidder has a distribution, and his value for every painting is a single draw from that distribution. All right? But different bidders may have different distributions. It's a pretty restricted setting if the items are heterogeneous. All right? >>: [inaudible]. >> Costis Daskalakis: No, no, no. One [inaudible]. So that is a restriction of this theorem. What is good about this theorem is that it's not generically searching over everything -- it's not a theorem saying that there exists an optimum, all right, it's a theorem that says that the optimum has a very precise, elegant form. In particular it says that the auction that optimizes revenue is really the auction that optimizes, you know, the virtual welfare of people. Okay? I'm not going to get into the details of that, but it's a very precise characterization of what the optimal auction is doing in this setting. All right? So that's what Myerson says. And following Myerson, you know, one of the most important problems in this area is the generalization of this theorem to the more natural setting of heterogeneous items, where an agent has different values for the different paintings in the auction. So the multidimensional setting. All right? So we want to have the same kind of theorem for the multi-dimensional case. So there is a large body of work in economics looking at this problem, and recently computer science has also been interested in it. So, just to mention a couple of recent papers on this problem: basically, as far as constant-factor approximations to the optimal revenue go, the problem is solved, okay? So we have very elegant auctions that achieve a constant factor of the best revenue. All right? So in terms of constant-factor approximations the problem is solved. 
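The virtual-welfare characterization just mentioned can be sketched in code for the single-item, single-parameter case. This is my own toy illustration (the exponential value distributions are my choice, not the talk's): the virtual value is phi(v) = v - (1 - F(v))/f(v), which for Exp(lambda) simplifies to v - 1/lambda, and the revenue-optimal auction awards the item to the bidder with the highest nonnegative virtual value.

```python
# Toy single-item Myerson allocation for bidders with exponential values.
# phi_i(v) = v - (1 - F_i(v)) / f_i(v); for Exp(lam) this is v - 1/lam,
# since (1 - F(v)) / f(v) = exp(-lam*v) / (lam * exp(-lam*v)) = 1/lam.

def virtual_value_exp(v, lam):
    return v - 1.0 / lam

def myerson_winner(values, rates):
    """Index of the winning bidder, or None if the seller keeps the item."""
    phis = [virtual_value_exp(v, lam) for v, lam in zip(values, rates)]
    j = max(range(len(values)), key=lambda i: phis[i])
    return j if phis[j] >= 0 else None

# Bidder 1 has the lower value but wins: her distribution (rate 10 vs 0.5)
# makes her virtual value higher.
assert myerson_winner([2.0, 1.5], [0.5, 10.0]) == 1
# If every virtual value is negative, the item goes unsold.
assert myerson_winner([0.3, 0.05], [0.5, 10.0]) is None
```

This is exactly the sense in which the optimal auction "optimizes virtual welfare" rather than welfare; the multi-dimensional analogue of this closed form is the open question the talk turns to next.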
What I want to talk about here is if we are interested not in getting a constant-factor approximation but almost all the revenue that we can possibly get, okay? So I'm presenting here joint work with my student Yang Cai on a closed-form, efficiently computable [inaudible] optimal revenue mechanism for the case where there's a single bidder. Okay? So I'm looking at this problem but for a single bidder. All right? And I'm going to talk about some different work for generalizing [inaudible]. Okay? So maybe this all seems abstract, so let me make it more specific. So basically the item pricing problem is the following, okay? There is one customer, and I have a gallery, okay? I have a gallery with many paintings, and there's one customer. I have posted prices on my paintings, right, and the customer comes in and wants to decide what painting to buy, right? So how is he going to make that decision? He's going to basically say: oh, you know, this painting is worth V1 to me, this other painting is worth V2 to me, and this one is worth VN, right? And I'm going to pick the painting that maximizes the difference between my value and the price it has, and I'm going to pay the seller the price of that painting. Okay? So he picks the arg max, over the gallery, of the difference between the value and the price. >>: [inaudible]. >> Costis Daskalakis: That's right. Okay. And, you know, [inaudible] a big problem because I can talk [inaudible], but in principle it could be an arbitrary function, okay? So the economists usually look at that. Okay? So I can handle logs, I can handle issues, but I'm not sure if I can handle [inaudible]. So the point is that I have the gallery, and I want to decide on the prices to optimize this objective. And in principle, if I have no idea about the values of the buyer for my paintings, I cannot solve this problem. All right? But what if I have Bayesian information about the values of the buyer for the paintings? 
So this is the setting I want to look at, right? So I'm going to assume that the values are independent random variables. So I have a distribution for the value of this guy for each of the paintings, and these are independent draws. If they're not independent draws, there's not much I can do, okay? The problem's highly inapproximable if there are correlations. So now what is the optimization problem I need to solve? Well, there is a closed-form expression for what I need to optimize. So let me walk you through it. Okay? So -- >>: Optimization [inaudible]. >> Costis Daskalakis: [inaudible] that can be handled in some cases. Okay? But [inaudible] all right. But let me walk you through the expected revenue, okay? So if I have posted prices P1 through PN for my paintings and the values are random variables, then really my revenue is: the buyer is going to pay price PJ with the probability that the gap for the Jth item dominates the gaps for all the other items. That's what this guy says. And really what I want to do is optimize this objective function. So I want to find the prices that optimize this function, which has a closed-form expression, right? It's not computed by [inaudible] or anything. It has a closed-form expression, but it's highly nonlinear. You know, these guys are drawn from this distribution. So. >>: All this is subject [inaudible]. >> Costis Daskalakis: That's right. Right. So I want to add a positivity constraint. All right. So -- >>: [inaudible]. >> Costis Daskalakis: So I'm looking at the unit-demand case. We can talk about the additive buyer case or other settings, but I'm going to restrict my discussion to the unit-demand setting. So this guy wants to buy one painting for his living room, and this is the way he's going to decide about it, and this is the expected revenue I'll get. Okay? So what do we know about this optimization problem? Well, we do have results. 
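The expected-revenue expression just walked through can be estimated by simulation: the unit-demand buyer purchases the item whose value-minus-price gap is largest, provided that gap is nonnegative, and pays its posted price. A Monte Carlo sketch of my own (the uniform distributions and the prices are illustrative, not from the talk):

```python
import random

def expected_revenue(prices, samplers, trials=100_000, seed=0):
    """Monte Carlo estimate of the seller's expected revenue."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        values = [draw(rng) for draw in samplers]
        gaps = [v - p for v, p in zip(values, prices)]
        j = max(range(len(prices)), key=lambda i: gaps[i])
        if gaps[j] >= 0:        # the buyer walks away if every gap is negative
            total += prices[j]  # he pays the posted price of the best item
    return total / trials

# Two paintings with independent uniform values (a toy instance of mine).
samplers = [lambda r: r.uniform(0, 1), lambda r: r.uniform(0, 2)]
rev = expected_revenue([0.4, 0.9], samplers)
assert 0.0 < rev <= 0.9   # revenue never exceeds the highest posted price
```

Optimizing this estimate over the price vector is exactly the nonlinear objective under discussion; the simulation makes the nonlinearity visible but says nothing about how to search the price space, which is the hard part.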
Again, in terms of constant-factor approximations, the problem is solved. There is a paper by Jason, Shuchi, and Bobby giving a poly-time, constant-factor approximation for this problem. So the result that I was talking about with Yang Cai is that we can get arbitrarily close to the optimal revenue under the assumption that the distributions are [inaudible] distributions, all right, or each of them is supported on a balanced set -- because in some sense the input complexity of the problem is the support of these distributions, which could be exponential in the [inaudible]. That's the case I want to avoid, okay? Even the IID case, or other restrictions, is not easy to solve, okay? Even, I don't know, the support-3 case: if every distribution has support 3, it's not easy to solve. So how do I go about solving it? So this is again stating my optimization problem. In general, the way one goes about solving such a problem is to establish some smoothness properties of the objective function. So, you know, what I would like to do is for instance to argue that if I change this distribution by epsilon in total variation distance, it's not going to, you know, destroy my revenue. Or if I restrict my prices to a discrete set, this is not going to kill my revenue. Or if I restrict the prices further to be supported where the value distributions are supported, which is a natural thing to do, I don't lose a lot of revenue. So the problem with this problem is that it's highly non-smooth, okay? So you can construct examples where each of these first attempts is going to fail, okay? So it's hard to, you know, establish any smoothness properties on these guys or these guys. All right? So the way we go about this problem is this. All right? 
So suppose this is a circuit that has hard-wired the value distributions of the buyer for the paintings and, for every selection of prices, outputs the revenue distribution, okay, of the seller. So it's a distribution because the prices are deterministic but the values are distributions. So for any choice of prices this circuit is going to sample these distributions and output some revenue, so there is a revenue distribution coming out of the circuit. And what I just said in the previous slide is that, you know, the usual way one goes about approximating such a problem is to establish smoothness properties for this side and that side and [inaudible] that to get a, you know, polynomial cover of the space of all possible choices on this side. Okay? So instead what we're going to do is look at this guy directly. So look at all possible revenue distributions that can arise by selecting prices for my problem and try to argue that I can reduce the dimension of that space. So there is a scalar random variable on the wire -- that's the important thing, okay? So what we do with Yang is we try to cover a space of scalar random variables rather than covering, you know, an N-dimensional space of prices. So again, in the same framework as before, suppose that this is the space of all possible revenue distributions that can arise by choosing price vectors. Okay. So each price vector is going to induce a different distribution in this space. And again the question is: what is the natural notion of distance between a collection of prices and another collection of prices? Because I'm going to be covering this space under, you know, the appropriate notion of distance. So it turns out that if you look at this problem this way, the distance function that is relevant is some sort of dual of the total variation distance of these guys, which is the smallest delta such that I can couple these two random variables so that they're never apart by more than delta. 
It turns out that this is the right -- this is a sufficient -- this is the maximum distance you can play with. And the result is that again there is a small cover of this space, and this time it's an implicit cover. So we cannot provide a closed-form description of this cover, but we can argue that there is an algorithm which, given the F_i's, is going to generate this cover. Okay. So this is what we do. And we optimize, find the best price vector. So if you have this in place, you can get almost optimal revenue. But, you know, this does not go through unless you use two more ideas that are the structural parts of our approach, the structural parts of the theorem. And here are some interesting things we can show about this problem. So the first theorem I want to show says that a constant number of prices suffices to get a one minus epsilon [inaudible], and the other one is that a single price suffices in the [inaudible] variety. So the first theorem says that for any desired epsilon, a constant number of distinct price levels suffices if you're shooting for a one minus epsilon fraction of the revenue. So basically what this theorem says is that the epsilon you want to achieve dictates how many different price levels you're going to need to achieve that revenue. So this function is independent of anything that has to do with the problem. >>: [inaudible]. >> Costis Daskalakis: That's right. All right. So this theorem says that no matter what your problem instance is, you're guaranteed that if you're shooting for a one minus epsilon fraction of the revenue, you're only ever going to use a number of distinct price levels that depends only on one over epsilon. So that's a strong theorem. The number of different price levels has nothing to do with the instance. >>: [inaudible]. 
>> Costis Daskalakis: No, it's more like one over epsilon log one over epsilon. It's almost linear. So the other structural theorem we established is for the case where the F_i's are IID. This theorem says that if there are enough items, a single price suffices for getting a one minus epsilon fraction of the revenue. And the number of items you need depends on the epsilon you're shooting for. Okay? So if you're shooting for a [inaudible] approximation, okay, then even a few items suffice. All right? If you're shooting for something better -- you know, this function again doesn't depend on anything else that has to do with the instance. It just tells you that if you have many identical items, a single price will suffice to give you the revenue. [inaudible] structural theorems [inaudible] so, you know, looking at the problem on the revenue side plus these two structural theorems imply the theorem I showed, generalizing Myerson for the case of [inaudible] distributions. Okay? So -- >>: [inaudible]. >> Costis Daskalakis: Yeah. So intuitively what this says is that the extreme value distribution -- this works for, I don't know, for [inaudible] -- so the extreme value distribution, for the right distributions, behaves nicely. So there is a point that's going to give you most of, you know, most of the revenue. >>: [inaudible] so you're pricing paintings and the guy has the same distribution for every painting; it wasn't clear [inaudible] different prices. Why should you post different prices on these paintings when they have the same distribution? >>: You might want to hedge, right? You may want to give some low price to [inaudible] and a different price to hit the tail, right? So that's what you [inaudible] the tail behaves nicely. So for the extreme value distribution, the maximum value drops off sufficiently fast so that you can [inaudible]. >>: [inaudible]. 
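The single-price structural theorem can be illustrated by simulation. In this sketch of mine (exponential values, which are MHR; the instance, sample sizes, and price grid are all my own), posting one price on every one of m IID items already extracts a large share of E[max_j v_j], and E[max_j v_j] upper-bounds the revenue of any pricing because the buyer never pays more than the value of the item he takes.

```python
import random
import statistics

def single_price_revenue(price, maxima):
    # With one price on every item, the unit-demand buyer pays `price`
    # exactly when his best value clears it.
    return price * sum(m >= price for m in maxima) / len(maxima)

rng = random.Random(1)
m = 200                                    # number of IID items
# Empirical samples of max_j v_j with v_j ~ Exp(1), an MHR distribution.
maxima = [max(rng.expovariate(1.0) for _ in range(m)) for _ in range(20_000)]
upper_bound = statistics.mean(maxima)      # estimate of E[max_j v_j]

# Search a coarse grid for the best single price.
best = max(single_price_revenue(0.1 * k, maxima) for k in range(1, 120))
assert 0.5 * upper_bound < best <= upper_bound
```

The fraction captured here improves as m grows, matching the theorem's message that the number of items needed is dictated by the epsilon you are shooting for; for heavy-tailed, non-MHR distributions no such single-price guarantee should be expected, which is what the exchange above is getting at.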
>>: I don't expect that this would be true for, say, regular distributions. >> Costis Daskalakis: So, multidimensional [inaudible] design: trying to generalize the previous results to auctions with many bidders rather than a single bidder. And that's joint work with my other student Matt Weinberg. So recall the broad question is generalizing Myerson's result to the multi-dimensional setting where I have different values for different items. Recall also that this problem is solved as far as constant-factor approximations go. And here again with Matt what we show is that, using this Bayesian viewpoint, we can get a one minus epsilon fraction of the optimal revenue for certain scenarios. So I'm going to give you what we have. That's fairly recent, so it's not even written up. >>: [inaudible]. >> Costis Daskalakis: Huh? >>: [inaudible]. >> Costis Daskalakis: What? >>: [inaudible]. >>: Is it written down? >> Costis Daskalakis: Okay. So what we have is an efficient randomized [inaudible] fraction of the optimal revenue in the following two scenarios, okay. So the first one is this: the number of items is constant, but the bidders can be many. The valuation functions of the bidders are IID, so the bidders are identical. Okay? Their valuation functions come from the same distribution independently. Now, what is the valuation function of each bidder, okay? It can be one of two kinds. So we're looking at the valuation function of a particular bidder: either his values for the items are independent from MHR distributions, or the values of a bidder for the items can be arbitrarily correlated. Except, again, I need the balance conditions. So every distribution is supported on an interval whose endpoint ratio is bounded. And in general, if it's not bounded, I get a result of this kind. 
I guess I have to pay the log of the ratio in the exponent of the running time. All right? So that's the first case: a constant number of items, identical bidders, IID bidders, and the valuation functions of the bidders can be either arbitrarily correlated under this balance condition, or independent, not necessarily identical, from MHR distributions. So that's the first result we have. And the other one is the flip side of that: a constant number of bidders and many items. Two minutes. Yeah. In this case I assume that basically everything is IID. So the value of every bidder for every item is IID from either an [inaudible] or a balanced distribution. And again I can generalize it in this way. But just to conclude: where does the randomness come into play? All right. So to get this result we're going to use randomness to do some kind of dimensionality reduction in our problem, and for those of you who have read Nash's original paper, what I'm going to say is going to ring a bell, okay? So in Nash's paper, the paper where he showed the existence of Nash equilibria, he established another interesting result: that randomization can also buy you some structural symmetry in the solution concept of Nash equilibrium. So for example, if all the players are identical, there's a Nash equilibrium where everybody is using the same mixed strategy. More generally, if the payoff functions have any type of symmetry, there is a Nash equilibrium that respects that symmetry. In fact there is always a Nash equilibrium that respects any kind of symmetry that the game has. This is a fairly general result. So we show the same thing in this setting. So suppose that D is the distribution of the bidders' values for the items. So there are N bidders and M items, and that distribution is supported in this set. 
So now let S be a set of permutations on the bidders and on the items such that this distribution satisfies this symmetry. So if I permute the output of the distribution using a permutation sigma from that set, the distribution remains the same. All right? So let S be all the symmetries that the value distribution satisfies. What we show is that there exists an optimal randomized mechanism that respects the symmetries. So what this means is that if the mechanism is given as input a permutation of a vector of values of all the bidders for the items, then the behavior of the mechanism is the same as running the mechanism on the unpermuted vector of values and then permuting the results. Okay? And that's only true for randomized mechanisms; it's not true for deterministic ones. The [inaudible] mechanisms are not going to have these symmetries. Randomized mechanisms do. And this is what randomness is buying. Okay? And, you know, as I said, recall Nash's symmetry theorem. All right. So I'm pretty much done. I had a last application which I'm going to skip. Coming to conclusions. You know, so problems in economics involve uncertainty. Right. That usually comes in, you know, randomized strategies or beliefs -- >>: [inaudible]. >> Costis Daskalakis: Which one? >>: The last [inaudible]. >> Costis Daskalakis: Okay. Okay. Mechanisms. >>: [inaudible]. >> Costis Daskalakis: Yeah, yeah. >>: So for item pricing you post different prices. You've got one bidder who is IID, and you post different prices on the different items. >> Costis Daskalakis: That's right. >>: What would be the symmetric version of this you're talking about? Are you posting lottery prices? Is that what you're doing? >> Costis Daskalakis: You can do -- yeah, you can give a single lottery for every -- you can even post symmetric [inaudible]. >>: So you're going from item pricing to lottery pricing and you're adding randomization? 
>> Costis Daskalakis: Yeah. >>: Got it. >> Costis Daskalakis: So, right, to conclude. Problems in economics involve uncertainty, which is usually [inaudible] in a Bayesian way. It produces complexity in the form of PPAD-hardness for computing Nash equilibria. But it also enables Nash's theorem, the symmetry in [inaudible] paper, Myerson's result, and so on. And you know, we need probabilistic techniques to understand this Bayesian uncertainty and use it constructively in applications. And today I talked about anonymous games and auctions. I didn't talk about that. Thanks a lot. [applause]. >> Jason Hartline: All right. I want to thank Costis for preceding me and solving a problem that I had thought about for several years. So thanks, Costis. Good. We're going to switch gears, still talking about mechanism design, but we're going to talk about some topics related to crowdsourcing. And I want to start with a sort of motivating scenario. So I'm a theoretician, an academic. I give talks and stuff, and occasionally I get e-mails from people saying, you know, send me your title, your abstract, your bio. Okay, all good to go. But very rarely -- but occasionally -- I have to send these people a talk-related graphic about my talk. And where am I going to get a talk-related graphic? I don't know. I can't draw. So fortunately there's the Internet. And fortunately there's this website taskcn, which is unfortunately in Chinese but fortunately exists. And this is a crowdsourcing website. And this crowdsourcing website specializes in the design of graphic images. All right. So if you have a graphic image that you need made and you know someone who speaks Chinese, then you can use this website to get your image made. And so here is how it works. Basically you post a description of the task you want and a reward. So I said design a graphic depicting crowdsourcing, because this was for this talk -- I needed the graphic. And the reward: the best design wins $120. Okay? 
Then the hard part is you wait a week. And then you have a bunch of submissions and you pick the one you like the best and you pay the guy 120 bucks. Okay? So -- >>: [inaudible]. >> Jason Hartline: There are various semantics. I think so -- I did pick one when I did this for getting a graphic for describing this talk. With the help of a visiting student, actually -- Qiang Zhang, a student visiting me for the summer. And so you're probably dying to know what submissions I got when I said draw me a picture -- >>: [inaudible]. >> Jason Hartline: I believe so. I believe there are, you know, ways to weasel out somehow. You can say none of them meet some minimal requirements, right? Because you can post requirements, and if you say none of them meet the requirements, then you don't have to pick one. Good. So I received a bunch of submissions. Some of them looked like flow charts, which I didn't really like too much. I guess this flow chart still looked better than the other flow chart. Then you get some kind of artistic pictures, which don't really make much sense to me. And some horrible, horrible pictures, which neither make much sense nor have much artistic quality. And then some reasonable diagrams. Okay. There are a lot of people contributing ideas, and there's one idea that really worked. Okay. This is getting better. More flow charts, but more graphically oriented. Okay. So what was the best one that we got? The best one that I got was this picture, which has, I guess, someone posting a task to the crowdsourcing tree with multicolored apples, and a bunch of people who are completing tasks and getting money. Good. That's what we're doing. So that enabled me to make this really attractive-looking banner that was posted all over Northwestern for my talk, which otherwise I wouldn't have been able to do. So crowdsourcing is a great thing. So I'm going to now speak about the theory of crowdsourcing. 
And my main motivation in coming at this was the following observation: I just made a lot of people draw pictures, and I basically threw all the pictures out except the one I liked, and posted that picture all over Northwestern. >>: [inaudible]. >> Jason Hartline: What's that? I got 18. I didn't show you some of the worst ones. So I wasted a lot of work. And this of course made me nervous, because I'm, you know, into optimization and I just did something that wasted a bunch of work. And I was curious about two questions, right? Can I show that I didn't waste too much work? And the other question is, well, what if I had just gone and hired someone the traditional way and said, you're an expert in drawing stuff, I'll pay you your hourly rate -- $120, however much time that is, so you work for two minutes -- and then you draw me a picture, right? Do I get something better or worse? I mean, how does that compare to crowdsourcing? Okay? So I want to answer these two questions. So my talk is in two parts. In the first part I'm going to review some auction theory, which I'm going to use to describe the theory of crowdsourcing, and the auction theory is going to be actually pretty quick. And then we're going to talk about crowdsourcing, okay? So that's the agenda. Good. So here's the single-item auction problem. One item for sale. N bidders with unknown private values for the item. And each bidder wants to maximize their utility, which is the value they get for receiving the item if they receive it, minus the price they pay. And I want to design an auction to solicit bids and choose the winner and payments. If I were doing auction theory, I could have two possible objectives: either to maximize social surplus, meaning I want to give the item to the person who values it the most, or to maximize the seller's profit, meaning the total payments I get. Okay? 
So let's see what happens in maybe the first auction we might think of. The first auction you might think of is maybe the first-price auction. So solicit sealed bids. The winner is the bidder who bids the most, and charge the winner their bid. Okay? So as an example, if I got these bids, then the person who bids six wins and they pay me six. Good. Another auction you might have thought about when I said, well, what's an auction, is the second-price auction, which is almost the same thing: solicit sealed bids, the winner is the highest bidder, but then we charge the winner the second-highest bid. Okay. Another auction rule. So again, here, the winner is the person who bids six, but now they pay me four, the second-highest bid. Okay? Good. So as an auction theorist, here are the kinds of questions I want to be able to answer about these auctions. I want to know what should happen in equilibrium in these auctions. And I want to know, if people were playing an equilibrium, then what is the outcome -- can I predict the outcome of the auction? Okay. And once I've done these two things, then I can start comparing auctions. If I wanted to maximize social surplus, or if I wanted to maximize profit, I could ask which auction has higher social surplus in equilibrium, or which auction has higher profit in equilibrium. And then I could sort of optimize, once I've solved for what the equilibrium is. Okay? So let's talk about doing this equilibrium analysis. And the first thing that you might do -- let's take the second-price auction and look at Vickrey's work, which shows that bidding your value in the second-price auction is a dominant strategy. Okay? 
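[Editor's note: to make the two auction rules concrete, here is a minimal sketch in Python -- my own illustration, not part of the talk. The bid vector is made up, except that the winning bid of six and the second-highest bid of four match the example above.]

```python
def first_price(bids):
    """Winner is the highest bidder; the winner is charged their own bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    return winner, bids[winner]

def second_price(bids):
    """Winner is the highest bidder; the winner is charged the second-highest bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price

bids = [2, 4, 6, 3]
print(first_price(bids))   # (2, 6): the bid of six wins and pays six
print(second_price(bids))  # (2, 4): same winner, but pays the second-highest bid, four
```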
And so you can assume the bidders bid their value, and so when you select as the winner the bidder with the highest bid, that's also the bidder with the highest value, because bidding your value is a dominant strategy. So you've given the item to the agent with the highest value and you've maximized social surplus. Good. So that's equilibrium analysis for the second-price auction. That was easy. Okay? Let's now do the more challenging case, the first-price auction. Okay? So we're doing this auction. I want to know, how should you bid in this auction? Okay. Of course this auction does not have a dominant strategy. Of course you don't want to bid your value in this auction, because you always get charged your bid, and if you bid your value and get charged your bid, then if you win you get zero utility, right, because your bid is equal to your value. So you never want to do that. And, in fact, if you knew the second-highest bid, you would always want to bid epsilon above it whenever your value was above the second-highest bid. But you don't know it. Okay? So there isn't a dominant strategy. Okay. So I want to have a quick review of probability so that I can talk about what's going on. And I'm going to do all of my examples with the uniform distribution so things are nice and easy. Okay? So if I draw a random variable V -- and I'll do random variables in red in this talk -- from the uniform distribution on the interval [0, 1], then I have a cumulative distribution function, which I'll call capital F: the probability this random variable V is less than Z, and that's just Z. Right? And the density function is of course the derivative of the distribution function, which is one. Okay? The expectation of V is just the integral of the value times the density function, which is one here, so it's one-half, which is this picture. And I'll do most of my proofs by picture. So what is the expectation of G of V? 
It's the integral of G of V, which, if I have the picture, is the green area minus the red area. So that would be positive in this picture, because the green area is bigger than the red area. Good. Okay. And the last fact about expectations of uniform random variables that I'm going to use is that uniform random variables evenly divide the interval in expectation. So if I had two uniform random variables and I sorted them so V1 was bigger than V2, then the expected value of V1 is two-thirds and the expected value of V2 is one-third. All right. Good. So that's my review of that. Now, back to first-price auctions. So you and I are bidding in a first-price auction. Let's assume that our values are drawn from the uniform distribution. Okay? So I want to figure out how you should bid. Okay? And I'm going to tell you how I'm bidding. Let's suppose I bid half of my value, okay? And now you know that my value is uniform on [0, 1] and I just told you I'm bidding half my value. So you know my bid is uniform on [0, one-half], right? And now, given your value, you can solve for what you should be bidding. All right. So let's just do that really quick. So your utility is a function of your value and your bid. Right? And if you win, then what's your utility? It's your value minus your bid, right? Because if you win you get your value and you pay your bid. So your utility for bidding B when your value is V is your value minus your bid -- neither of these is random -- times the probability you win, which is random, because it depends on what I do. Okay. So what's the probability you win? Well, that's really easy. That's the probability that your bid is bigger than my bid. What's my bid? My bid is half my value. Okay. So let's rearrange. That's the probability that twice your bid is at least my value. My value is uniform on [0, 1], so this is just the CDF at 2B, which is 2B. Okay. So I get a formula, which is V minus B times 2B. Good. Which is this. 
How do I maximize that? How do you maximize that if you want to figure out your optimal bid? Well, you take the derivative, set it equal to zero, and solve, and you get that you should bid V over 2. So I just showed you what you should do given what I was doing. But notice -- so what's the conclusion? You bid half your value. But I was bidding half my value. So if I bid half my value, then you want to bid half your value. But if I know you're bidding half your value, then I also want to bid half my value. So that's an equilibrium. Okay? That's a Bayes-Nash equilibrium, and that's what we're going to talk about in this talk: bidding half your value is a Bayes-Nash equilibrium. What can we conclude from this equilibrium analysis? Well, who wins? If everyone in the auction bids half their value, who wins? The person with the highest value wins. So again it maximizes social surplus in equilibrium, just like the second-price auction did. Okay. Good. I now want to talk about -- >>: [inaudible]. >> Jason Hartline: Is that the only equilibrium? Yes. I'm not going to prove that in this talk, although I could if you gave me another 30 minutes. If you all gave me another 30 minutes. Okay. Good. So now I want to talk about -- we talked about social surplus; first-price and second-price are the same. I now want to quickly talk about profit. What's the profit of the second-price auction? Again, two bidders, uniform values. Well, that's easy, right? Draw values from the interval, sorted so V1 is bigger than V2. What is my profit? Well, it's the expected value of V2. I get as payment the second-highest value -- the highest bidder pays the second-highest value. So what's the expected value of V2 here? It's one-third, by the assumption they're uniform. Okay? What about the first-price auction? What? Half? One-quarter? One-third. Thank you. It's one-third, right? Because your expected profit is equal to the expected highest bid. What's the highest bid? Well, the highest bidder bids half of their value, right? 
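[Editor's note: the best-response calculation above can be double-checked numerically. A small sketch of my own, assuming two bidders with uniform [0, 1] values and an opponent bidding half their value, so the probability of winning at bid b is min(2b, 1).]

```python
def utility(v, b):
    # (value - bid) times the probability of winning against an
    # opponent whose bid is uniform on [0, 1/2].
    return (v - b) * min(2 * b, 1.0)

def best_response(v, grid=10_000):
    # Brute-force maximization of utility over a fine grid of bids in [0, 1].
    return max(range(grid + 1), key=lambda k: utility(v, k / grid)) / grid

for v in (0.2, 0.5, 0.9):
    print(v, best_response(v))  # the best response is (about) v / 2 every time
```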
So the highest value is two-thirds in expectation, right? But he bids half his value, which is one-third in expectation. Okay? So they actually have the exact same expected profit, too. They have the same surplus, they have the same profit. This is not a coincidence. In fact, it's one of the most important theorems in auction theory: two auctions with the same equilibrium outcome have the exact same expected revenue. And, in fact, the theorem is more precise than just this. What actually happens is that once I know my value, in either auction I have the same expected payment. Okay? So in the first- and second-price auctions, if my value is V, then my expected payment in the auction is the exact same. And so I'm just going to use this notation, payment of V, to denote the payment that I have, in expectation, in the first- or second-price auction if my value is V. Because they're the same. Okay? Good. I now want to turn to another auction. I want to talk about the all-pay auction. So an all-pay auction is the following. I again want to sell a single item to a bunch of bidders, so I solicit sealed bids. The winner is the person who bids the most. Everyone pays their bid. Okay? So in my example, the bid of six wins, and everyone pays. Clear? Good. I want to now solve for the equilibrium in this game. And the way I'm going to do that is I'm first going to guess. I'm going to guess that probably the bidder with the highest value is going to win in equilibrium if the bidders are IID, right, because they should be bidding the most. If the person with the highest value wins, then the equilibrium outcome is the exact same as the outcome in the second-price auction, and I can use the second-price auction to figure out what my payment should have been. Okay? So what should the equilibrium strategy be to get this equilibrium where the highest-valued agent wins? Well, in the all-pay auction our bid is equal to our payment, right? So my bid as a function of my value is equal to my payment. 
But remember, payment is just a formula -- the expected payment -- and it's the same for the all-pay, the first-price, and the second-price auctions, right? So what is that formula? Well, it's just the second-price auction payment, okay? So let's break that down. What do you pay in a second-price auction? Well, you only pay if you win. And if you win, you pay the second-highest value given that you won. Okay? So let's do an example. Two bidders, uniform on [0, 1]. If your value is V, what do you expect to pay in the second-price auction? Well, the second-highest value given that you win -- that's the expected value of a uniform random variable conditioned on being less than your value. So that's half your value. Okay? So this first term is V over two. And the probability you win given your value is V -- well, that's the probability he's below you, which is just the CDF, which is V. Okay? So V squared over 2: that's what you should bid in a two-player all-pay auction. Okay? And of course if you do that, that verifies my guess that the highest-value bidder wins, right? If everyone does that, then the highest-value bidder will win. Good. So my guess was right. Yes? >>: [inaudible]. >> Jason Hartline: This last example assumed that. But everything else assumes IID. Okay? Good. Okay. So that's my overview of auction theory. I now want to relate auction theory to crowdsourcing. So here's a first attempt at modelling crowdsourcing. I post a reward that I normalize to one dollar, and then contestant I comes along with some skill, which is how good they are at producing images. And they make some effort -- they work. And their effort times their skill is the quality of the picture they make, okay? So the quality they make, which I'll call P, is just their effort times their skill. Good. 
And this agent receives some reward XI for doing the work, which is zero if he doesn't get the reward or one if he gets it. Okay? And so what's his utility? His utility is the reward he gets minus the effort he puts in. Okay? Good. And what's our goal in crowdsourcing? Well, I want to maximize the best-quality submission that I get. Okay? So I want to optimize the maximum payment. Okay? I want to compare this setting to all-pay auctions. Okay? So what happens in an all-pay auction? I have one item for sale. And I have a bidder I who has a value VI for receiving the object. They bid and pay the same amount, so I'll just use the same variable for their bid and their payment, which is just PI. Good. I then choose the outcome -- who wins -- XI in 0, 1. And what's his utility? Well, his utility is his value times whether he wins or not, minus his payment. Okay? That's your utility in the all-pay auction. And your goal in the all-pay auction is usually to maximize your revenue, which is the sum of the payments. Okay? So I want to compare these two things, because the utilities look kind of different. And I want to note that from this bidder's perspective his value is a constant, so I can multiply this equation on both sides by his value and say he's just trying to optimize this thing. Right? If he optimizes his value times his utility, that's the same as optimizing his utility. Right? And now the objectives are the same, okay, because VI times EI is just PI by definition. >>: [inaudible]. >> Jason Hartline: He controls EI. That's his effort. Okay? So basically the utilities are the exact same thing. And so what do we know? If the utilities are the same, then what happens? Well, the equilibria are exactly the same. Okay? So that analysis we just did solving for equilibrium -- we're done. We know what goes on in crowdsourcing contests. Okay? 
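[Editor's note: this can all be sanity-checked with a quick Monte Carlo -- my own sketch, assuming two IID uniform [0, 1] bidders. With the equilibrium bids v²/2 in the all-pay format, the highest-value bidder always wins, and the expected revenue matches the second-price and first-price auctions at one-third.]

```python
import random

rng = random.Random(0)
trials = 100_000
all_pay = second = first = 0.0
monotone = True
for _ in range(trials):
    v1, v2 = rng.random(), rng.random()
    b1, b2 = v1 ** 2 / 2, v2 ** 2 / 2   # equilibrium all-pay bids
    all_pay += b1 + b2                   # everyone pays their bid
    second += min(v1, v2)                # second-price: winner pays the other value
    first += max(v1, v2) / 2             # first-price: winner bids half their value
    monotone &= (v1 > v2) == (b1 > b2)   # the highest-value bidder should always win

print(round(all_pay / trials, 2), round(second / trials, 2),
      round(first / trials, 2), monotone)  # all three revenues come out about 1/3
```

This is revenue equivalence in miniature: three different auction formats, the same equilibrium outcome, the same expected revenue.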
The difference -- so we have the same equilibrium, but the difference between crowdsourcing contests and all-pay auctions is that there the revenue is the sum of the payments, while here the payments of the losers are wasted. Okay? That's the big difference. And the focus of my talk is understanding what's wasted, and how much. Good. So in crowdsourcing contests everyone works, and the work of the losers is wasted. It's like an all-pay auction. Otherwise, I want to compare crowdsourcing contests to conventional procurement, right? What if I just hired the best guy? Or what if I was the government and I ran some auction where people put in proposals, and I pick the best proposal and only they do the work? Right? So this proposal system is kind of more like a second-price auction or a first-price auction, right? And if you're trying to get good revenue in the second-price auction, you get the full sum of payments, right? And we know from revenue equivalence that all-pay and first- and second-price get the same total revenue. Right? So my only question is how much work is wasted. Right? Because that wasted work is work I could have gotten in the second-price auction doing regular procurement. >>: [inaudible]. >> Jason Hartline: By the PI. >>: [inaudible]. >> Jason Hartline: Yes. Yes. I value wasted work by the PI, and I note that had I run a second-price or first-price auction to do this procurement, I could have gotten the full sum of the PIs as payment, because only the highest guy pays, right, by revenue equivalence. So there's no wasted work. And it is the payments I'm talking about. Okay? >>: [inaudible]. >> Jason Hartline: Yes. Yes. You only know PI. >>: [inaudible]. >> Jason Hartline: No, I cheated by showing them to you. But pretend you didn't see them. Okay? Because otherwise -- I mean, so, yeah. 
So the assumption going in here is that you're trying to crowdsource something that you can't combine. If the contributions were money, you could just take the sum of the payments, right? But I'm imagining that with these images there's no way to combine the images I got into the one image that I want to post on my flier. >>: [inaudible]. >> Jason Hartline: For -- yes. For the illustration for the talk we kind of went around the model. But let's ignore that. Good. So this is what I was just saying. Basically I just want to quantify this loss from wasted work. And it turns out it's not that hard to do. So this is the main theorem that I wanted to tell you. The main theorem is that the highest payment is at least half of the total revenue that you would have gotten if you had collected all the payments. Or in other words, the highest payment is at least the sum of all the losers' payments. Okay? Good. So that's what I want to show. I'm going to give you a proof of this on this one slide. It's pretty concise. So I'll have two variables, W and L. W is the expected winner's payment, L is the expected losers' payments, and I just want to show that W is bigger than L. Okay? So what is the winner's payment? Well, remember payment of V -- if your value is V, we know what you bid, right? You bid your expected payment. Okay? So the winner's payment is payment of V times the probability you win if your value is V. And the losers' payments are payment of V times the probability you lose if your value is V, which is one minus the probability you win. Good. So what is W minus L? Well, it's payment of V times (two times the probability you win, minus one). Right? Just regrouping. Okay. So let's go back to earlier in the talk, where we solved for what the payment was. And remember, that was the expected second-highest value conditioned on V winning, times the probability of V winning. Okay? 
And to avoid writing "V winning" all the time: this probability, for the uniform distribution, is easy. The probability V wins is just V to the N minus one, right? Because you have to beat each of the N minus one other guys, and you beat each one with probability V, so the probability you beat all of them is V to the N minus one. Okay? This seems not very general -- like it's only for uniform distributions -- but you can get the same decomposition just by changing variables into probability space instead of value space. Okay? So the same proof is general, but let's just do it for uniform so we don't get confused. Good. And I'm just going to call this first thing G of V, so I don't have to write it out again. Okay. So here's my new equation. This is again my payment, G of V times V to the N minus one, times (two times V to the N minus one, minus one), which is this. Okay? So I want to calculate the expected value of this. G of V is complicated. But this other term is easy. So let's do the easy thing first, which is to calculate the expected value of this term without the G of V and see what we get. Okay? So I claim that this is at least zero. And you can work this out for any N. But here's a picture which shows you that it's at least zero, right: the green area is bigger than the red area, so it's at least zero. Good. But furthermore, look at this picture. There's more positive area on this side than that side. Right? And if we know one thing about G of V, we know it's monotone increasing. Right? So it's got bigger values towards this side. Okay? So what is the expectation? It's the weighted sum of G of V against this thing that puts increasing weight on this side. Good. G of V is a monotone function, so I get more weight here. So the whole thing is positive. All right. That's it. So some conclusions. We modelled crowdsourcing contests as all-pay auctions. We showed the loss from wasted work is small. 
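[Editor's note: the theorem can also be checked by simulation. A sketch of my own, assuming n IID uniform [0, 1] bidders and the equilibrium all-pay bid b(v) = (n-1)/n · vⁿ, which is the expected second-price payment in the uniform case; for n = 2 it reduces to the v²/2 from earlier.]

```python
import random

def winner_vs_losers(n, trials=100_000, seed=0):
    """Expected winner's payment W and expected total losers' payment L
    in the all-pay auction with n IID uniform [0, 1] bidders."""
    rng = random.Random(seed)
    w = l = 0.0
    for _ in range(trials):
        bids = sorted((n - 1) / n * rng.random() ** n for _ in range(n))
        w += bids[-1]        # the highest bid is the winner's payment
        l += sum(bids[:-1])  # everything else is wasted work
    return w / trials, l / trials

for n in (2, 5, 10):
    w, l = winner_vs_losers(n)
    print(n, round(w, 3), round(l, 3), w >= l)  # W >= L at every n
```

The gap narrows as n grows but, per the theorem, the winner's payment never drops below the total wasted work.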
A major workhorse of this theorem was the revenue equivalence theorem, which made it all easy. Okay? There are two other results in the paper that I'm referring to here which I didn't have time to talk about. One is solving for optimal crowdsourcing contests. So let me just say quickly what one of the questions you could answer is. So we gave the whole prize to the best participant. Well, I don't have to do that, right? So for instance TopCoder is a crowdsourcing website for programming, and they give two-thirds of the prize to the best guy and one-third to the second-best guy. Is that a good idea? Is that optimal for some distribution of values, or is it not? Remember, we still only get the value of the best contribution, but somehow we reward the second guy for his effort. Thanks for playing, we'll give you a third of the prize. Right. Is that a good idea? So the answer is no. Among static allocation rules -- meaning you fix in advance that you allocate to the highest guy, or some static rule like you allocate some fraction to the highest, some fraction to the second highest, some other fraction to the third highest -- amongst all fixed-in-advance rules, the rule you want is to always give everything to the highest guy. Okay? So TopCoder is doing something stupid. And another question you could ask is, well, maybe after seeing the qualities I can adjust the amount I pay. Right? And if you try to do this, it takes a form a lot closer to what Costis was referring to as maximizing virtual values. So you can derive virtual values for the maximum payment, and then optimize virtual value, and that optimizes the maximum payment. And you get similar results there, except that, unlike in the auction case, these virtual values depend on the number of players. So with more players, you do different things. Okay? 
And the last thing that we did -- I don't really know how to sell this, because I don't know the economics [inaudible] that well, but I am told by economists that they didn't know that for N-bidder IID distributions the first-price auctions and all-pay auctions have unique equilibria. And we've proved that, basically using revenue equivalence. Whereas the proofs I've seen in the economics literature for analyzing first-price auctions and all-pay auctions use differential equations, which is why they do it for N equals 2. So they know equilibrium uniqueness results for N equals 2, but for general N they don't, because differential equations with N functions are complicated. Right? And so we do it with revenue equivalence for N players in IID settings, and I'm assuming they don't know it, because people I've talked to are like, whoa, really? But it is also really easy. So I'm -- I'm surprised. All right. That's that. [applause]