>> Yuval Peres: All right. We're happy to welcome both Costis and Jason to
give us two game theory talks today.
>> Costis Daskalakis: Hi, I'm glad to be back. So today I'm going to talk about
some interactions between what I call probabilistic approximation theorems and
certain game theory and optimization problems. The talk is modular, so it has
modules and you can stop at any time. I'm not sure how much I'll cover in 45
minutes.
So physicists tell us that there is uncertainty that's inherent in the physical
world. But as this picture may convince you, there's also uncertainty in the
social environment, okay? So this is an interesting T-shirt. That's rock, paper,
scissors, right? What I'm interested in in this talk is to look at uncertainty in
optimization problems, game theoretic problems, auction problems, and I'm going
to look at stochastic uncertainty, right? So Bayesian uncertainty is useful in
these settings. In the game theoretic setting, it's the randomization that
enables Nash's Theorem.
In the auction setting, it's randomization that enables Myerson's Theorem,
because the design of the revenue-optimal auction is enabled by modeling the
uncertainty in the environment in a Bayesian way. And in optimization,
generically, modeling uncertainty in a stochastic way allows us to go from
worst-case to average-case analysis. Okay? And get, you know, stronger results.
On the other hand, this comes with difficulties. In these settings uncertainty
makes the problem harder to solve: here the hardness takes the form of
PPAD-completeness; over here we don't know how to generalize this important
result to multi-dimensional settings; and generically, Bayesian uncertainty
introduces nonlinearity in the underlying optimization [inaudible]. So it comes
with its benefits and it comes with difficulties. What I want to do today is to
show examples where we can somehow harness this Bayesian uncertainty to get
rid of some of these difficulties.
And the way we're going to do that is by using tools from probability theory.
All right. So what I want to do is connect these two things. In particular, I'm
going to look at game theoretic and combinatorial optimization problems. Okay.
So my first module is symmetries in games, which I've talked about before in this
group, so I'm going to go through it faster just to -- I'm just going to -- I just want
to illustrate a few things with this example. So the goal of this module is to study
settings, game theoretic settings where there is a lot of symmetry. Okay?
In particular, in these environments there's no way a player can keep track of
the identities and actions of every other player in the game. Probably a player
is keeping track of aggregates of what the other players are doing. And here is
a model, a mathematical model, to capture such settings. I'll call a game an
anonymous game if every player only cares about what he's doing and what the
aggregate behavior of the other players is. More precisely, every player is
supposed to have the same strategy set, and the payoff function of each player
is anonymous; that is, it looks at what the player is doing and then how many of
the other players are choosing each of the available strategies in the game.
As an example, congestion games can be written in this form, because when I'm
driving in the road network I only care about what route I'm choosing and then
what routes the other players are choosing, in a sort of aggregate way. I don't
care about the identities of the other players. Or, if you want to be more
elaborate, I can partition players into types, slow drivers, fast drivers
[inaudible], but in any event there is a lot of anonymity in the way I'm
modeling the game. But examples are drawn from other settings, social
phenomena, auctions, and so on and so forth. And here are just a couple of
references in the game theoretic literature. So over here social phenomena are
studied using this model; these are congestion games.
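As a minimal sketch of the definition (my own illustration; the congestion-style cost function below is hypothetical, not from the talk), an anonymous payoff function only ever sees the player's own strategy and a histogram of what the others chose:

```python
from collections import Counter

def anonymous_payoff(i, strategies, payoff_fn):
    """Payoff in an anonymous game: player i's payoff depends only on
    i's own strategy and on HOW MANY other players chose each strategy,
    not on WHICH players did. `payoff_fn` maps (own_strategy, counts)
    to a payoff, where counts is a histogram over the other players."""
    own = strategies[i]
    others = strategies[:i] + strategies[i + 1:]
    counts = Counter(others)          # aggregate behavior of the others
    return payoff_fn(own, counts)

# Toy congestion-style example: two routes, 0 and 1; the cost of a route
# grows with the number of players on it (hypothetical cost function).
def congestion_payoff(own, counts):
    load = counts[own] + 1            # me plus the others on my route
    return -load                      # more congestion, lower payoff

profile = [0, 0, 1, 0, 1]
print(anonymous_payoff(0, profile, congestion_payoff))   # route 0 has load 3
```

Note that permuting the other players' entries leaves the payoff unchanged, which is exactly the anonymity in the model.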
So what I'm interested in is how players reason in these settings. There's a big
population of other players, and really what affects my payoff is the aggregate
behavior of those players. All right? So how would Nash play this game? Okay.
That's a joke referring to the [inaudible], okay. So how would Nash play this
game?
So here are some results I want to talk about. In joint work with Christos
Papadimitriou we solved these games, in the following sense: we compute, for any
given precision, in polynomial time, an approximate Nash equilibrium of these
games. Okay?
And to do that, we have to understand how players reason about what they have
to do given the aggregate behavior of the other players. These results are
enabled by approximation theorems that approximate the aggregate behavior of
the other guys, and I'm going to show examples of what I mean by that. But in
reality they reduce to theorems about sums of indicator random variables, or
multidimensional versions of indicators.
Let me be more precise. In an anonymous game, the Nash equilibrium is some
vector of -- say there are two actions per player, so every player is just
choosing a probability. So the Nash equilibrium lies in this space, okay: every
player is using a probability and there are N players. All right? So the Nash
equilibrium is over here, and I have to search this space; it's high
dimensional. These are independent indicators, because in Nash equilibria
players randomize independently. And really, if I look at this as an
optimization problem, what is the objective function? Or more generally, if this
is the real Nash equilibrium and instead of that I look at this point, how good
of a Nash equilibrium is some other point that's far from the actual equilibrium
in this space? So what's the right notion of distance, if you want, in this
space that I'm interested in?
It turns out that the relevant distance function is the total variation distance
between the sum of these indicators and the sum of those indicators. In other
words, if this is the real Nash equilibrium and this other point is close to the
Nash equilibrium under this distance, then that point is an excellent
approximate Nash equilibrium. We'll take this as given; it's pretty
straightforward. But what I want to do to find an epsilon-Nash equilibrium is to
come epsilon-close to the actual Nash equilibrium, where the distance is
measured by the total variation distance between the sum of these indicators and
the sum of those indicators.
And generically, the way I want to solve this problem is to cover this space
with epsilon balls in this distance. So I want an epsilon cover under this type
of distance.
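To make this distance concrete, here is a small sketch (my own illustration, not code from the talk) computing the exact distribution of a sum of independent indicators and the total variation distance between two such sums:

```python
def poisson_binomial_pmf(ps):
    """Exact pmf of X = sum of independent indicators with P[X_i = 1] = p_i,
    built by convolving one indicator at a time."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1 - p)      # indicator comes up 0
            new[k + 1] += mass * p        # indicator comes up 1
        pmf = new
    return pmf

def tv_distance(ps, qs):
    """Total variation distance between the two sums of indicators --
    the notion of distance on mixed strategy profiles used in the talk."""
    f, g = poisson_binomial_pmf(ps), poisson_binomial_pmf(qs)
    return 0.5 * sum(abs(a - b) for a, b in zip(f, g))

print(tv_distance([0.5] * 10, [0.5] * 10))   # identical profiles → 0.0
```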
>>: [inaudible] the number are zero one or [inaudible].
>> Costis Daskalakis: They're indicator variables.
>>: [inaudible].
>> Costis Daskalakis: Right. It's isomorphic to this space. So the space of all
mixed strategy profiles is isomorphic to the hypercube. So I'm looking at this
space, the space of all vectors of indicators. And what I said in the previous
slide is, if I have a mixed strategy profile, so a vector of indicators, and a
different mixed strategy profile, how close are they in the -
>>: [inaudible].
>> Costis Daskalakis: You're asking about this? [inaudible] So for this example,
suppose every player has two strategies. So a mixed strategy is just an
indicator [inaudible].
>>: [inaudible].
>> Costis Daskalakis: Two strategies. And more generally is this [inaudible]
categorical [inaudible].
>>: [inaudible].
>> Costis Daskalakis: [inaudible]. All right. So, but [inaudible], right. So my
point was that if this is a Nash equilibrium and I come epsilon-close to the
Nash equilibrium under this distance, then that is an epsilon-Nash equilibrium.
All right? And the generic way to find the Nash equilibrium would be to cover
this space with epsilon balls under this notion of distance, right? Then I'm
guaranteed that one of these points is going to be epsilon-close to the Nash
equilibrium, so I'm going to exhaustively go over each of these representatives,
one representative per ball in my cover, and check if it is an epsilon-Nash
equilibrium. Okay? And obviously the problem is that this is high dimensional
and potentially the number of balls could be huge. What we showed with Christos
is that there is a polynomial number of balls that covers this space, which we
didn't expect originally.
But just to relate it to the types of approximation theorems I want to
construct, here's an example approximation theorem that we needed to come up
with to design such a cover. So here's the theorem. For any collection of
indicator random variables with [inaudible] expectations -- these are
independent, right -- and any given constant delta, independent of N or anything
else, it's an absolute constant, there is a way to construct another collection
of indicators Y_i with expectations Q_i, so that the expectations of these guys
are restricted to be multiples of delta. So these are of some finite precision,
if you want.
And at the same time, the total variation distance between the original given
sum of indicators and the constructed sum of indicators is a function of delta
alone. So you don't pay a penalty of N, as one would expect from this
restriction. Okay? Generically, if you are restricted to use multiples of delta,
you would expect an N coming into this bound; what we show is that there's no N
coming into that, and that enables polynomial-size covers. Okay?
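The following sketch (my own illustration; the theorem's actual construction is more delicate than this) shows why the statement is nontrivial: rounding every expectation to the nearest multiple of delta lets the error grow with N, while choosing the rounded values so that the total mean is preserved keeps the two sums close:

```python
def pmf(ps):
    # exact distribution of a sum of independent indicators
    out = [1.0]
    for p in ps:
        nxt = [0.0] * (len(out) + 1)
        for k, m in enumerate(out):
            nxt[k] += m * (1 - p)
            nxt[k + 1] += m * p
        out = nxt
    return out

def tv(ps, qs):
    # total variation distance between the two sums
    return 0.5 * sum(abs(a - b) for a, b in zip(pmf(ps), pmf(qs)))

delta, N, p = 0.1, 100, 0.123
naive = [round(p / delta) * delta] * N        # every p_i rounds down to 0.1
k = round((p * N - 0.1 * N) / delta)          # lift k of them to 0.2 instead,
smart = [0.2] * k + [0.1] * (N - k)           # preserving the total mean

# Naive rounding shifts the mean by 2.3, so the sums drift apart as N grows;
# the mean-preserving choice keeps the total variation distance small.
print(tv([p] * N, naive), tv([p] * N, smart))
```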
>>: [inaudible].
>> Costis Daskalakis: Huh?
>>: [inaudible].
>> Costis Daskalakis: So the best I can do is delta log one over delta. I don't
know if that's optimal. It's almost optimal for N equals one. But potentially,
if you have a bigger number of indicators, you might be able to exploit that
somehow. All right.
So in any event this enables constructing covers of this size, and then you can
exhaustively go over them and find the Nash equilibrium. Okay? So it boils down
to this: if you want a faster algorithm, you need a better cover. Okay? And the
best we can do -- I'm not going to go over this other theorem -- but the best we
can do right now is something of this form. I can cover this space, under this
notion of distance, with a number of balls that is polynomial in N, the number
of indicators, times some function of epsilon that's quasi-polynomial in one
over epsilon. All right?
So I can convert looking for an epsilon-Nash equilibrium into searching over a
space of this size. And as an open problem, whose answer I don't know -- I don't
have a strong belief about whether a better cover exists or not -- is there a
cover of this space that has size polynomial in both N and one over epsilon?
That's an interesting question. Yes?
>>: [inaudible].
>> Costis Daskalakis: Is it easy to see how to go -- excuse me?
>>: [inaudible].
>> Costis Daskalakis: It's.
>>: So the existence of these [inaudible].
>> Costis Daskalakis: Right. So you're asking how do you go from the theorem
to bounding the number of balls? How do you use the theorem to -
>>: [inaudible].
>> Costis Daskalakis: Right. How do you prove a theorem. Yeah.
>>: [inaudible].
>> Costis Daskalakis: So it's not exactly two lines. But one important
ingredient is that this function is symmetric over the indicators, so to begin
with there are already some reductions you can do. That's not going to give you
a polynomial reduction, but the problem is symmetric under permutations. And
then, using that and using the theorem I've presented, you can bring down the
dimension. Okay? Because this is symmetric, and because I can take each of
these guys to choose from a finite set, there are only polynomially many
possible permutation-equivalence classes of collections I can make. Okay? All
right.
That was my first application. All right. What I want to talk about now is a
different setting, an auction setting. I want to talk about what is called
multi-dimensional pricing. I'm going to explain what that means, but some case
studies first. So this falls under the more general class of problems called
optimal [inaudible] design, where basically the goal is to come up with an
auction that optimizes the revenue of the auctioneer. And such auctions are
known for certain settings. Maybe the most celebrated result in this realm is
Myerson's auction, which basically does the following: it's a revenue-optimal
auction, under the assumption that the [inaudible] are, quote/unquote,
single-parameter Bayesian. I'm going to clarify what single-parameter Bayesian
means and what closed form means.
So single-parameter Bayesian is a strong assumption. Basically what it says is
that every bidder who is participating in the auction has the same value for
each of the available items. All right? So if the auctioneer is selling many
paintings, then every bidder has the same value for all the paintings. Okay?
The value's random, but it's a single value for every painting. Okay? No matter
which painting it is, my value's the same, okay, and that value is a random
number.
>>: [inaudible].
>> Costis Daskalakis: So every bidder has a distribution, and his value for
every painting is a single draw from that distribution. All right? But different
bidders may have different distributions. It's a pretty restricted setting if
the items are heterogeneous. All right?
>>: [inaudible].
>> Costis Daskalakis: No, no, no. One [inaudible].
So that is a restriction of this theorem. What is good about this theorem is
that it's not generically searching over everything; it's not a theorem saying
that there exists an optimum, all right, it's a theorem that says that the
optimum has a very precise, elegant form. In particular, it says that the
auction that optimizes revenue is really the auction that optimizes the virtual
welfare of people. Okay?
I'm not going to get into the details of that, but it's a very precise
characterization of what the optimal auction is doing in this setting. All
right? So that's what Myerson says. And following Myerson, one of the most
important problems in this area is the generalization of this theorem to the
more natural setting of heterogeneous items, where an agent has different
values for the different paintings in the auction. So the multidimensional
setting. All right? So we want to have the same kind of theorem for the
multi-dimensional case.
There is a large body of work in economics looking at this problem, and
recently computer science has also been interested in it. Just to mention a
couple of recent papers on this problem: basically, as far as constant factor
approximations to the optimal revenue go, the problem is solved, okay? We have
very elegant auctions that achieve a constant factor of the best revenue. All
right? So in terms of constant factor approximations the problem is solved.
What I want to talk about here is if we are interested not in a constant factor
approximation but in almost all the revenue that we can possibly get, okay? So
I'm presenting here joint work with my student Yang Cai on a closed-form,
efficiently computable, [inaudible] near-optimal revenue mechanism for the case
where there's a single bidder. Okay? So I'm looking at this problem but for a
single bidder. All right?
And I'm going to talk about some different work for generalizing [inaudible].
Okay? So maybe this all seems abstract, so let me make it more specific. The
item pricing problem is the following, okay? There is one customer, and I have
a gallery, okay? I have a gallery with many paintings, and there's one
customer. I have posted prices on my paintings, right, and the customer comes
in and wants to decide what painting to buy, right? So how is he going to make
that decision? He's going to basically say: oh, this painting is worth V1 to
me, this other painting's worth V2 to me, and this one is VN, right? And I'm
going to pick the painting that maximizes the difference between my value and
the price it has, and I'm going to pay the seller the price of that painting.
Okay? The price of the argmax, over the paintings in the gallery, of the gap
between the value and the price.
>>: [inaudible].
>> Costis Daskalakis: That's right. Okay. And the [inaudible] is a big problem,
because I can talk [inaudible], but in principle there could be an arbitrary
function, okay? Economists usually look at that. Okay? So I can handle logs, I
can handle issues, but I'm not sure if I can handle [inaudible].
So the point is that I have the gallery, and I want to decide on the prices to
optimize this objective. And in principle, if I have no idea about the values of
the buyer for my paintings, I cannot solve this problem. All right?
But what if I have Bayesian information about the values of the buyer for the
paintings? This is the setting I want to look at, right? And I'm going to assume
that the values are independent random variables. So I have a distribution for
the value of this guy for each of the paintings, and these are independent
draws. If they're not independent draws, there's not much I can do, okay? The
problem's highly inapproximable if there are correlations.
So now, what is the optimization problem I need to solve? Well, there is a
closed-form expression for what I need to optimize, so let me walk you through
it. Okay? So -
>>: Optimization [inaudible].
>> Costis Daskalakis: [inaudible] that can usually be handled in some cases.
Okay? But [inaudible], all right. But let me walk you through the expected
revenue, okay? If I have posted prices P1 through PN for my paintings, and the
values are random variables, then my revenue is: the buyer is going to pay
price PJ with the probability that the gap for the Jth item dominates the gaps
for all the other items. That's what this guy says. And really, what I want to
do is optimize this objective function.
So I want to find the prices to optimize this function, which has a closed-form
expression, right? It's not computed by [inaudible] or anything; it has a
closed-form expression. But it's highly nonlinear; these guys are drawn from
this distribution.
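As a quick numerical illustration (my own sketch, with a hypothetical uniform-value instance, not from the talk), the expected revenue of a posted-price vector can be estimated by simulating the buyer's argmax rule:

```python
import random

def expected_revenue(prices, value_samplers, trials=100_000, seed=0):
    """Monte Carlo estimate of the seller's expected revenue from posted
    prices, for a unit-demand buyer: the buyer draws a value v_j for each
    item, buys the item maximizing v_j - p_j if that gap is >= 0, and
    pays its price. `value_samplers[j](rng)` draws v_j independently."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        gaps = [s(rng) - p for s, p in zip(value_samplers, prices)]
        j = max(range(len(prices)), key=lambda k: gaps[k])
        if gaps[j] >= 0:              # buyer walks away if every gap < 0
            total += prices[j]
    return total / trials

# Hypothetical instance: two items with independent Uniform[0,1] values.
samplers = [lambda rng: rng.random(), lambda rng: rng.random()]
print(expected_revenue([0.5, 0.5], samplers))   # ≈ 0.5 · P[max(v1,v2) ≥ 0.5] = 0.375
```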
>>: All this is subject [inaudible].
>> Costis Daskalakis: That's right. So I want to add a positivity constraint as
well. All right. So -
>>: [inaudible].
>> Costis Daskalakis: So I'm looking at the unit-demand case. We can talk about
the additive buyer case or other settings, but I'm going to restrict my
discussion to the unit-demand setting.
So this guy wants to buy one painting for his living room, and this is the way he's
going to decide about it and this is the expected revenue I'll get. Okay? So what
do we know about this optimization problem? Well, we do have results. Again,
in terms of constant factor approximations, the problem is solved. So there is a
paper by Jason, Sue and Bobby giving poly time, constant factor approximation
for this problem.
So the result I was talking about with Yang Cai is that we can get arbitrarily
close to the optimal revenue, under the assumption that the distributions are
[inaudible] distributions, all right, or that each of them is supported in a
balanced set -- not spanning too wide a range -- because in some sense the
support of these distributions could be exponential in the input complexity
[inaudible]. That's the case I want to avoid, okay?
Even the IID case, or with other restrictions, is not easy to solve, okay?
Even, I don't know, the support-3 case: if every distribution has support 3,
it's not easy to solve. So how do I go about solving it? This again is stating
my optimization problem. In general, the way one goes about solving such a
problem is to establish some smoothness properties of the objective function.
So, for instance, I would like to argue that if I change this distribution by
epsilon in total variation distance, it's not going to destroy my revenue. Or,
if I restrict my prices to a discrete set, this is not going to kill my
revenue. Or, if I restrict the prices further to be supported where the value
distributions are supported, which is a natural thing to do, I don't lose a lot
of revenue. The problem with this problem is that it's highly non-smooth, okay?
For all of these you can construct examples where each of these first attempts
is going to fail, okay? So it's hard to establish any smoothness properties on
these guys or those guys. All right?
So the way we go about this problem is this. All right? Suppose this is a
circuit that hard-wires the value distributions of the buyer for the paintings
and, for every selection of prices, outputs the revenue distribution, okay, of
the seller. It's a distribution because these inputs are deterministic but
those are distributions. So for any choice of prices, this circuit is going to
sample these distributions and output some revenue; so there is a revenue
distribution coming out of the circuit.
And what I just said in the previous slide is that the usual way one goes about
approximating such a problem is to establish smoothness properties for this
side and that side, and [inaudible] that to get a polynomial cover of the space
of all possible choices on this side. Okay? So instead, what we're going to do
is look at this guy directly. Look at all possible revenue distributions that
can arise by selecting prices for my problem, and try to argue that I can
reduce the dimension of that space. This is a scalar random variable on the
wire -- that's the important thing, okay? So what we do with Yang is we try to
cover a space of scalar random variables, rather than covering an N-dimensional
space of prices.
So again, in the same framework as before, suppose that this is the space of
all possible revenue distributions that can arise by choosing price vectors.
Okay. Each price vector is going to induce a different distribution in this
space. And again, the question is: what is the natural notion of distance
between one collection of prices and another collection of prices? Because I'm
going to be covering this space under the appropriate notion of distance. It
turns out that, if you look at the problem this way, the distance function that
is relevant is some sort of dual of the total variation distance of these guys:
it is the smallest delta such that I can perfectly couple these two random
variables so that they're never apart by more than delta. It turns out that
this is the right -- this is a sufficient -- this is the maximum distance you
can play with.
And the result is that, again, there is a small cover of this space, and this
time it's an implicit cover. We cannot provide a closed-form description of
this cover, but we can argue that there is an algorithm which, given the F_i's,
is going to generate this cover. Okay.
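To make this coupling distance concrete: for scalar random variables on the real line, the smallest such delta is achieved by the monotone coupling, which matches sorted order statistics -- a standard fact about bottleneck matching on the line. A sketch for equal-size empirical distributions (my own illustration, not from the talk):

```python
def coupling_distance(xs, ys):
    """Smallest delta such that two equal-size empirical distributions on
    the real line can be perfectly coupled with |X - Y| <= delta always.
    On the line, the monotone (sorted / quantile) coupling achieves this,
    so the answer is the largest gap between sorted order statistics."""
    assert len(xs) == len(ys)
    return max(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))

print(coupling_distance([0, 1, 2], [0.5, 1.1, 1.9]))   # → 0.5
```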
So this is what we do, and then we optimize: find the best price vector. If you
have this in place, you can get almost-optimal revenue. But this does not go
through unless you use two more ideas, which are the structural parts of our
approach, the structural parts of the theorem. And here are some interesting
things we can show about this problem.
So the first theorem I want to show is that a constant number of price levels
suffices to get one minus epsilon [inaudible], and at the other end there is
the single-price [inaudible] variety. So the first theorem says that, for any
desired epsilon, a constant number of distinct price levels suffices if you're
shooting for a one minus epsilon fraction of the revenue. Basically, what this
theorem says is that the epsilon you want to achieve dictates how many
different price levels you're going to need to achieve that revenue. And this
function is independent of anything that has to do with the problem instance.
>>: [inaudible].
>> Costis Daskalakis: That's right. All right. So this theorem says that no
matter what your problem instance is, you're guaranteed that if you're shooting
for a one minus epsilon fraction of the revenue, you will only ever need a
number of distinct price levels that is a function of one over epsilon alone.
So that's a strong theorem: the number of different price levels has nothing to
do with the instance.
>>: [inaudible].
>> Costis Daskalakis: No, it's more like one over epsilon log one over epsilon.
It's almost linear.
So the other structural theorem we established concerns the case where the
F_i's are IID. This theorem says that if there are enough items, a single price
suffices for getting a one minus epsilon fraction of the revenue, and the
number of items you need depends on the epsilon you're shooting for. Okay?
So if you're shooting for a [inaudible] approximation, okay, then even a few
items suffice. All right? If you're shooting for something better, you need
more. This function again doesn't depend on anything else that has to do with
the instance; it just tells you that if you have many identical items, a single
price will suffice to give you the revenue. [inaudible] structural theorems
[inaudible] so looking at the problem on the revenue side, plus these two
structural theorems, imply the theorem I showed: generalizing Myerson for the
case of [inaudible] distributions. Okay?
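The single-price claim is easy to sanity-check numerically in a toy instance (my own sketch; uniform values are an assumption for illustration, not the distributional setting of the theorem):

```python
def single_price_revenue(p, cdf, n):
    """Expected revenue from posting the same price p on n items with IID
    values, for a unit-demand buyer: she buys (and pays p) iff her highest
    value clears the price, which happens with probability 1 - F(p)^n."""
    return p * (1 - cdf(p) ** n)

uniform_cdf = lambda v: min(max(v, 0.0), 1.0)    # Uniform[0,1] values

# With many IID items the maximum value concentrates near 1, so a single
# price close to 1 already extracts most of the surplus.
grid = [k / 1000 for k in range(1000)]
best = max(single_price_revenue(p, uniform_cdf, 50) for p in grid)
print(best)   # ≈ 0.9 for 50 items, versus 0.25 for a single item
```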
So -
>>: [inaudible].
>> Costis Daskalakis: Yeah. So intuitively, what this says is that if you look
at the extreme value distribution -- this works for, I don't know, for
[inaudible] -- so the extreme value distribution, for distributions with the
right tails, behaves nicely. So there is a point that's going to give you most
of the revenue.
>>: [inaudible] so you're pricing paintings, and the values for the paintings
have the same distribution; it wasn't clear [inaudible] different prices. Why
should you post different prices on these paintings when they have the same
distribution?
>>: You might want to hedge, right? You may want to give some low price to
[inaudible] and a different price to hit the tail, right? So that's when
[inaudible] the tail behaves nicely. So for the extreme value distribution, the
maximum value drops sufficiently fast so that you can [inaudible].
>>: [inaudible].
>>: I don't expect that this would be true for, say, regular distributions.
>> Costis Daskalakis: So, multidimensional [inaudible] design: trying to
generalize the previous results to auctions with many bidders rather than a
single bidder. That's joint work with my other student Matt Weinberg.
So recall, the broad question is generalizing Myerson's result to the
multi-dimensional setting where bidders have different values for different
items. Recall also that this problem is solved as far as constant factor
approximations go. And here again, what Matt and I show is that, using this
Bayesian viewpoint, we can get a one minus epsilon fraction of the optimal
revenue for certain scenarios. So I'm going to give you what we have. That's
fairly recent, so it's not even written up.
>>: [inaudible].
>> Costis Daskalakis: Huh?
>>: [inaudible].
>> Costis Daskalakis: What?
>>: [inaudible].
>>: Is it written down?
>> Costis Daskalakis: Okay. So what we have is an efficient random [inaudible]
fraction of the optimal revenue in the following scenarios. Okay. In two
scenarios, okay. So the first one is this: the number of items is constant, but
the bidders can be many. The valuation functions of the bidders are IID, so the
bidders are identical. Okay? Their valuation functions come from the same
distribution independently. Now, what is the valuation function of each bidder,
okay? It can be one of two kinds. So we're looking at the valuation function of
a particular bidder: either his values for the items are independent draws from
MHR distributions, or the values of a bidder for the items can be arbitrarily
correlated. Except, again, I need the balance conditions: every distribution is
supported on an interval whose endpoint ratio is bounded.
And in general, if it's not bounded, I get a result of this kind: I have to pay
the log of the ratio in the exponent of the running time. All right? So that's
the first case: a constant number of items, identical, IID bidders, and the
valuation function of each bidder can be either arbitrarily correlated with
this balance condition, or independent, not necessarily identical, from MHR
distributions. So that's the first result we have.
And the other one is the flip side of that: a constant number of bidders and
many items. Two minutes. Yeah.
In this case I assume that basically everything is IID. So the value of every
bidder for every item is IID from either an [inaudible] or a balanced
distribution. And again, I can generalize it in this way. But just to conclude:
where does the randomness come into play? All right. To get this result, we're
going to use randomness to do some kind of dimension reduction in our problem,
and for those of you who have read Nash's original paper, what I'm going to say
is going to ring a bell, okay?
In Nash's paper, the paper where he showed the existence of Nash equilibria, he
established another interesting result: that randomization can also buy you
some structural symmetry in the solution concept of Nash equilibrium. For
example, if all the players are identical, there's a Nash equilibrium where
everybody is using the same mixed strategy. More generally, if the utility
functions have any type of symmetry, there is a Nash equilibrium that respects
that symmetry. In fact, there is always a Nash equilibrium that respects any
kind of symmetry that the game has. This is a fairly general result.
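As a tiny concrete instance of Nash's symmetry result (my own illustration), rock-paper-scissors is a symmetric game, and the uniform mixed strategy is a symmetric equilibrium: against uniform play, every pure strategy earns the same expected payoff, so uniform is a best response to itself:

```python
# Rock-paper-scissors payoff matrix for the row player (win = 1, loss = -1).
A = [[0, -1, 1],     # rock     vs rock / paper / scissors
     [1, 0, -1],     # paper
     [-1, 1, 0]]     # scissors

p = [1 / 3] * 3                                   # uniform mixed strategy

# Expected payoff of each pure strategy against an opponent playing p.
payoffs = [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]
print(payoffs)   # → [0.0, 0.0, 0.0]: every pure strategy is a best response
```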
We show the same thing in this setting. Suppose that D is the distribution of
the bidders' values for the items; there are N bidders and M items, so that
distribution is supported in this set.
Now let S be a set of permutations on the bidders and on the items such that
this distribution satisfies this symmetry: if I permute the output of the
distribution using a permutation sigma from that set, the distribution remains
the same. All right? So let S be all the symmetries that the value distribution
satisfies.
What we show is that there exists an optimal randomized mechanism that respects
the symmetries. What this means is that if the mechanism is given as input a
permutation of a vector of values of all the bidders for the items, then the
behavior of the mechanism is the same as running the mechanism on the
unpermuted vector of values and then permuting the results. Okay? And that's
only true for randomized mechanisms; it's not true for deterministic ones. The
[inaudible] mechanisms are not going to have these symmetries; randomized
mechanisms do. And this is what randomness is buying. Okay?
And, you know, as I said, recall, you know, Nash's symmetric -- symmetry
theorem. All right.
So I'm pretty much done. I had a different -- a last application which I'm going to
skip. Coming to conclusions. You know, so problems in economics involve
uncertainty. Right. That usually comes in, you know, randomized strategies or
beliefs ->>: [inaudible].
>> Costis Daskalakis: Which one?
>>: The last [inaudible].
>> Costis Daskalakis: Okay. Okay. Mechanisms.
>>: [inaudible].
>> Costis Daskalakis: Yeah, yeah.
>>: So for item pricing you post different prices. You've got a bidder who's IID,
one bidder who is IID and you post different prices on the different items.
>> Costis Daskalakis: That's right.
>>: What would be the symmetric version of this you're talking about? Are you
posting lottery prices? Is that what you're doing?
>> Costis Daskalakis: You can do -- yeah, you can give a single lottery for every
-- you can even post symmetric [inaudible].
>>: So you're going from item pricing to lottery pricing and you're adding
randomization?
>> Costis Daskalakis: Yeah.
>>: Got it.
>> Costis Daskalakis: So, right, to conclude. Problems in economics involve
uncertainty, which is usually [inaudible] in a Bayesian way. But it produces
complexity in the form of PPAD for computing Nash equilibria. But it also
enables Nash's theorem. The symmetry is [inaudible] paper, Myerson's result is
for support. And you know, we need probabilistic techniques to understand this
Bayesian uncertainty and use it constructively in applications.
And today I talked about anonymous games, auctions. I didn't talk about that.
Thanks a lot.
[applause].
>> Jason Hartline: All right. I want to thank you, Costis, for preceding me and
solving a problem that I had thought about for several years. So thanks, Costis.
Good. We're going to switch gears, still talking about mechanism design, but
we're going to talk about some talks related to crowdsourcing. And I want to start
with sort of motivating scenario. So I'm a theoretician academic. I give talks and
stuff and occasionally I get e-mails from people saying, you know, send me your
title, your abstract, your bio. Okay, all good to go. But very rarely, but
occasionally, I have to send in a talk-related graphic to these people about my
talk. And where am I going to get a talk-related graphic? I don't know. Like I
can't draw. So fortunately there's the Internet. And fortunately there's this
website taskcn which is unfortunately in Chinese but fortunately exists. And this
is a crowdsourcing Website. And this crowdsourcing website specializes in the
design of graphic images. All right. So if you have a graphic image that you
need made and you know someone who speaks Chinese then you can use this
website to get your image made. And so here is how it works. Basically you
post a description of the task you want and a reward. So I said design a graphic
depicting crowdsourcing because it was -- this was for this talk I needed the
graphic. And the reward, the best design wins $120. Okay? Then the hard part
is you wait a week. And then you have a bunch of submissions and you pick the
one you like the best and you pay the guy 120 bucks. Okay?
So --
>>: [inaudible].
>> Jason Hartline: There are various semantics. I think so I did pick one when I
did this for getting a graphic for describing this talk. With the help of visiting
student, actually Qiang Zhang, a student visiting me for the summer. And so
you're probably dying to know like what submissions I got when I said draw me a
picture --
>>: [inaudible].
>> Jason Hartline: I believe so. I believe there are, you know, ways to weasel
out somehow. You can say none of them meet some minimal requirements,
right? Because you can post requirements and if you say none of them meet the
requirements then you don't have to pick one, because none of them met their
requirements.
A good -- so I received a bunch of submissions. So some of them looked like
flow charts which I didn't really like too much. I guess this flow chart still looked
better than the other flow chart. Then you get some kind of artistic pictures which
don't really make much sense to me. And some horrible, horrible pictures which
make neither much sense nor have much artistic quality. And then some reasonable
diagrams. Okay. There are a lot of people contributing ideas and there's one
idea that really worked. Okay. This is getting better. More flow charts but more
graphically oriented. Okay.
So what was the best one that we got? So the best one that I got was this
picture. Which has, I guess, someone posting a task to the crowdsourcing tree of -- with multicolored apples. And a bunch of people who are completing tasks and
getting money. Good. That's what we're doing. So that enabled me to make
this really attractive looking banner that was posted all over Northwestern for my
talk, which otherwise I wouldn't have been able to do. So crowdsourcing is a great
thing.
So what's the -- so I'm going to now speak about the theory of crowdsourcing.
And my main motivation in coming at this was to -- was the following observation.
I just made a lot of people draw pictures and I basically threw all the pictures out
except one of them, which I liked and posted all over Northwestern.
>>: [inaudible].
>> Jason Hartline: What's that? I got 18. I didn't show you some of the worst
ones.
So I wasted a lot of work. And this of course made me nervous because I don't
like -- I'm, you know, into optimization and I just did something that wasted a
bunch of work. And I kind of was curious as to, A, so there are two questions,
right? Can I show that I didn't waste too much work and the other question is well
what if I just had gone and hired someone the traditional way and said I'll -- you know, you're an expert in drawing stuff, I'll pay you your hourly rate, $120,
whatever -- how much time that is, so you work for two minutes and then you
draw me a picture, right? And do I get something better or worse? I mean, how
does that compare to crowdsourcing okay? So I want to answer these two
questions.
So my talk is in two parts. In the first part I'm going to review some auction theory
which is -- I'm going to use to describe the theory of crowdsourcing and the
auction theory is going to be actually pretty quick. And then we're going to talk
about crowdsourcing, okay? So that's the agenda.
Good. So here's a single-item auction problem. One item for sale. N bidders
with unknown private values for the item. And each bidder wants to maximize
their utility which is the value they get for receiving the item if they receive it,
minus the price they pay. And I want to design an auction to solicit bids and
choose the winner and payments. If I'm doing auction theory, I could have
two possible objectives: either to maximize the social surplus, meaning I
want to give the item to the person who values it the most, or I could be trying to
maximize the seller profit, meaning the total payments I get. Okay?
So let's see what happens in maybe the first auction we might think of. The first
auction you might think of maybe is first-price auction. So solicit sealed bids.
The winner is the bidder who bids the most and charge the bidder their bid.
Okay?
So as an example if I got these bids, then the person who bids six wins and they
pay me six. Good. Another auction you might have thought about when I said
well what's an auction, you might have thought about the second-price auction,
which is almost the same thing: solicit sealed bids, the winner is the highest bidder.
But then we charge the winner the second highest bid. Okay. Another auction
rule.
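As an aside from me (not part of the talk), the two sealed-bid rules are simple enough to sketch in a few lines of Python; the function names here are my own:

```python
# Minimal sketch (helper names mine) of the two sealed-bid auction rules.
def first_price(bids):
    """Winner is the highest bidder; winner pays their own bid."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    return winner, bids[winner]

def second_price(bids):
    """Winner is the highest bidder; winner pays the second-highest bid."""
    ranked = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    return ranked[0], bids[ranked[1]]

print(first_price([2, 6, 4]))   # (1, 6): bidder 1 wins and pays 6
print(second_price([2, 6, 4]))  # (1, 4): bidder 1 wins and pays 4
```

With the talk's example bids, the same bidder wins either way; only the payment rule differs.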
So again, here, the winner is the bidder, the person who bids six but now they
pay me four the second highest bid. Okay? Good. So as an auction theorist
here are the kinds of questions I want to be able to answer about these auctions.
So I want to know what should happen in equilibrium in these auctions. And I
want to know if people were playing an equilibrium then what is the outcome, can
I predict the outcome of the auction? Okay. And once I've done these two
things, then I can start comparing auctions. If I want to maximize social surplus, or if I
wanted to maximize profit, I could say well, which auction has higher social
surplus in equilibrium, or which auction has higher profit in equilibrium? And
then I could sort of optimize this once I've solved for what the equilibrium is.
Okay?
So let's talk about doing this equilibrium analysis. And the first thing that you
might do, let's say let's take the second-price auction and let's look at Vickrey's
work which shows that bidding your value in the second-price auction is a
dominant strategy. Okay? And so you can assume the bidders bid their value
and so that when you select as a winner the bidder with the highest bid, that's
also the bidder with the highest value, because it's a dominant strategy to bid your
value, and so you've given the item to the agent with the highest value so you've
maximized social surplus. Good. So that's equilibrium analysis for the
second-price auction. That was easy. Okay?
Let's now do the more challenging case, the first-price auction. Okay? So I -- we're doing this auction. I want to know, how should you bid in this auction?
Okay. Of course this auction does not have a dominant strategy, meaning you
never want to bid your -- of course you don't want to bid your value in this auction
because you always get charged your bid and if you bid your value and you get
charged your bid, then if you win you get zero utility, right, because your bid is
equal to your value. So you never want to do that.
And, in fact, if you knew the second highest bid, you would always want to bid
epsilon above that if your value was above the second highest bid. But you
don't. Okay? So there isn't a dominant strategy.
Okay. So I want to have a little quick review of probability so that I can talk about
what's going on. And I'm going to use all of my examples with the uniform
distribution so things are nice and easy. Okay?
So if I draw random variable V and I'll do random variables in red in this talk from
the uniform distribution on the interval 0, 1, then I have a cumulative
distribution function, which I'll call capital F: the probability this random variable V
is less than Z and that's just Z. Right? And the density function is of course the
derivative of the distribution function, which is one. Okay?
The expectation, the expectation of V is just the integral of the value times a
density function which is one here, so it's one-half which is this picture. And I'll
do most of my proofs by picture. So what is the expectation of G of V? It's
the integral of G of V, which, if I have the picture, is the green area minus the red
area. So that would be positive in this picture because the green area is bigger
than the red area. Good. Okay.
And the last thing about expectations, uniform random variables that I'm going to
use is uniform random variables evenly divide the interval in expectation. So if I
had two uniform random variables and I sorted so V1 was bigger than V2, then
the expected value of V1 is two-thirds, expected value of V2 is one-third. All
right. Good. So that's my review of that.
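That evenly-dividing property is easy to sanity-check by simulation; here is my own quick Monte Carlo sketch, not from the talk:

```python
import random

# Monte Carlo check (not from the talk) that two i.i.d. Uniform(0, 1) draws,
# sorted so v1 >= v2, have E[v1] = 2/3 and E[v2] = 1/3.
random.seed(0)
n_samples = 200_000
sum_v1 = sum_v2 = 0.0
for _ in range(n_samples):
    a, b = random.random(), random.random()
    sum_v1 += max(a, b)  # larger of the two draws
    sum_v2 += min(a, b)  # smaller of the two draws

print(round(sum_v1 / n_samples, 2))  # ~0.67
print(round(sum_v2 / n_samples, 2))  # ~0.33
```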
Now, back to first-price auctions. So you and I are bidding in a first-price auction.
Let's assume that our values are drawn from the uniform distribution. Okay? So
I want to figure out how you should bid. Okay? And I'm going to tell you how I'm
bidding. Let's suppose I bid half of my value, okay? And now you know that my
value is uniform 0, 1 and I just told you I'm bidding half my value. So you know
I'm bidding uniform 0 one-half, right? And now given your value, you can solve
for what you should be bidding.
All right. So let's just do that really quick. So your utility is a function of your
value and your bid. Right? And if you win, then what's your utility? It's your
value minus your bid, right? Because if you win you get your value and you pay
your bid. So if you win it's your value minus your bid. So your utility for
bidding B when your value is V is your value minus your bid. Neither of these are
random. Times the probability you win which is random, because it depends on
what I do.
Okay. So what's the probability you win? Well, that's really easy. That's a
probability that your bid is bigger than my bid. What's my bid? My bid is half my
value. Okay. So let's rearrange. That's the probability that twice your bid is at
least my value. My value is uniform 0, 1. So this is just the CDF at 2B,
which is 2B. Okay.
So I get a formula which is V minus B times 2B. Good. Which is this. How do I
maximize that? How do you maximize that if you want to figure out your optimal
bid? Well, you take the derivative, set it equal to zero, and solve it, and you get
you should bid V over 2.
So I just showed you what you should do given what I was doing. But notice --
so what's the conclusion? You bid half your value. But I was bidding half my
value. So if I bid half my value, then you want to bid half your value. But if I
know you're bidding half your value, then I also want to bid half my value. So
that's an equilibrium. Okay? So that's a Bayes-Nash equilibrium and that's what
we're going to talk about in this talk; bidding half your value is a Bayes-Nash
equilibrium. What can we conclude from this equilibrium analysis? Well, who
wins?
If everyone in the auction bids half their value, who wins? The person with the
highest value wins. So again it maximizes social surplus in equilibrium. Just like
the second-price auction did. Okay. Good.
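That best-response calculation can also be checked by brute force. This is a sketch of my own (the value v = 0.8 is an arbitrary choice, not from the talk):

```python
# Brute-force check (my sketch): fix your value v, assume my bid is half my
# Uniform(0, 1) value, so your win probability at bid b is min(2b, 1).
# Search a grid of bids for the one maximizing (v - b) * Pr[win].
def utility(v, b):
    prob_win = min(2 * b, 1.0)  # Pr[b beats a Uniform(0, 1/2) bid]
    return (v - b) * prob_win

v = 0.8
grid = [i / 10_000 for i in range(10_001)]
best_bid = max(grid, key=lambda b: utility(v, b))
print(round(best_bid, 2))  # 0.4, i.e. half of v
```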
I now want to talk about ->>: [inaudible].
>> Jason Hartline: Is that the only equilibrium? Yes. I'm not going to prove that
in this talk, although I could if you gave me another 30 minutes. If you all gave
me another 30 minutes. Okay.
Good. So now I want to talk about -- we talked about social surplus, first-price
and second-price are the same. I want to now quickly talk about profit. What's
the profit of the second-price auction? Again, two bidders, uniform values. Well,
that's easy, right. Draw two values from the interval, sorted so V1 is bigger than V2. What is my
profit? Well, it's the expected value of V2, because as payment the highest bidder
pays the second highest value. So what's the expected value of V2 here? It's
one-third, by the assumption they're uniform. Okay?
What about the first-price auction? What? Half, one-quarter? One-third. Thank
you. It's one-third, right? Because the -- your expected profit is equal to the
expected highest bid. What's the highest bid? Well, the highest bidder pays --
bids -- half of their value, right? So the highest value is two-thirds, right? But he
bids half his value, which is one-third in expectation. Okay? So they actually have the
exact same expected profit, too. They have the same surplus, they have the
same profit. This is not a coincidence. In fact, it's one of the most important
theorems in auction theory, which is: two auctions with the same equilibrium
outcome have the exact same expected revenue. And, in fact, the theorem is
more precise than just this. It actually happens that once I know my value in the
auction I have the same expected payment. Okay? So in the first and
second-price auctions if my value is V then my expected payment in the auction
is the exact same. And so I'm just going to use this notion payment of V to
denote the payment that I have in the first or second-price auction if my value is
V in expectation. Because they're the same. Okay?
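Revenue equivalence here is easy to check by simulation. This is my own sketch, assuming the equilibrium bids derived above (bid your value in second-price, half your value in first-price):

```python
import random

# Simulation sketch (mine, not from the talk): two bidders with i.i.d.
# Uniform(0, 1) values. Second-price revenue is the second-highest value;
# first-price revenue is the winning equilibrium bid, half the highest value.
random.seed(1)
n_samples = 200_000
second_price_rev = first_price_rev = 0.0
for _ in range(n_samples):
    hi, lo = sorted((random.random(), random.random()), reverse=True)
    second_price_rev += lo      # winner pays the second-highest value
    first_price_rev += hi / 2   # winner pays own bid, half their value

print(round(second_price_rev / n_samples, 2))  # ~0.33
print(round(first_price_rev / n_samples, 2))   # ~0.33
```

Both come out to one-third, as the talk claims.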
Good. I want -- now want to turn to one -- another auction. I want to talk about
the all-pay auction. So an all-pay auction is the following. I again want to sell
a single item to a bunch of bidders so I solicit sealed bids. The winner is the
person who bids the most. Everyone pays their bid. Okay? So as my example,
the bid of six wins, everyone pays. Clear? Good.
I want to now solve for the equilibrium in this game. And the way I'm going to do
that is I'm first going to guess. I'm first going to guess that probably the bidder
with the highest value is going to win in equilibrium if the bidders are IID, right,
because they should be bidding the most.
If the person with the highest value wins then the equilibrium is the exact same
as the equilibrium in the second-price auction and I can use the second-price
auction to figure out what my payment should have been. Okay? So what
should the equilibrium strategy be to get this equilibrium where the highest
valued agent wins? Well, in the all-pay auction our bid is equal to our payment,
right? Now, so my bid as a function of my value is equal to my payment. But
remember payment is just a formula. It's the same for all-pay -- this is the
expected payment. This is the formula that's the same for the all-pay, the
first-price, the second-price auction, right?
So what is that formula? Well, that formula is it's just the second-price auction
payment, okay? So let's break that down. What do you pay a second-price
auction? Well, you only pay if you win. So if you win you pay the second highest
value given that you won. Okay? So let's do an example. Two bidders uniform
0, 1. If your value is V, what do you expect to pay in the second-price auction?
Well, the second highest value given that you win well that's the expected value
of a uniform random variable conditioned on being less than your value. So
that's half your value. Okay? So this first term is V over two. And the probability
you win given your value is V, well that's the probability he's below you, which is
just the CDF, which is V. Okay? So V squared over 2, that's what you should
bid in a two-player all-pay auction. Okay?
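The all-pay case can be checked the same way. In my own sketch below, assuming the b(v) = v squared over 2 equilibrium bid just derived, the seller's revenue is the sum of both bids, since everyone pays, and by revenue equivalence it should again come out to one-third:

```python
import random

# Sketch (mine): two-bidder all-pay auction with i.i.d. Uniform(0, 1) values
# and equilibrium bids b(v) = v**2 / 2. Everyone pays, so revenue is the sum
# of both bids.
random.seed(2)
n_samples = 200_000
revenue = 0.0
for _ in range(n_samples):
    v1, v2 = random.random(), random.random()
    revenue += v1 ** 2 / 2 + v2 ** 2 / 2  # both bidders pay their bids

print(round(revenue / n_samples, 2))  # ~0.33
```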
And of course if you do that, that verifies my guess that the highest bidder wins,
right? If you do that, then the highest bidder will win. Good. So my guess was
right. Yes?
>>: [inaudible].
>> Jason Hartline: This last example assumed that. But everything else
assumes the IID. Okay? Good. Okay. So that's my overview of auction theory.
I now want to relate auction theory to crowdsourcing. So here's sort of a first
attempt at modelling crowdsourcing.
I post a reward that I normalize to one dollar and then contestant I comes and
they have some skill, which is how good they are at producing images. And they
make some effort, they work. And their effort times their skill is the quality of the
picture they make, okay? Okay. So the quality they make, which I'll call P, is just
their effort times their skill. Good. And this agent receives some reward XI for
doing the work, which I'll say is either zero or one: it's zero if he doesn't
get the reward or it's one if he gets the reward. Okay?
And so what's his utility? His utility is the reward he gets minus the effort he puts
in. Okay? Good. And what's our goal in crowdsourcing? Well, I want to
maximize the best quality submission that I get. Okay? So I want to maximize -- I want to optimize the maximum payment. Okay? I want to compare this setting
to all-pay auctions. Okay? So what happens in an all-pay auction? I have one
item for sale. And I have a bidder I who has a value VI for receiving the object.
They bid and pay the same amount, so I'll just use the same variable for their bid
and their payment, which is just PI. Good. I then choose the outcome, who
wins: XI in 0, 1. And what's his utility? Well, his utility is his value times the
probability he wins or times whether he wins or not minus his payment. Okay?
That's your utility in the all-pay auction.
And your goal usually maybe in the all-pay auction is to maximize your revenue
which is the sum of the payments. Okay?
So I want to compare these two things because the utilities look kind of different
and I want to note that from this bidder's perspective his value is a constant so I
can multiply this equation on both sides by his value and say he's just trying to
optimize this thing. Right? If he optimizes his value times his utility, that's the
same as optimizes his utility. Right?
And now the objectives are the same okay because VI times EI is just PI by
definition.
>>: [inaudible].
>> Jason Hartline: It controls EI. That's his effort. Okay? So basically the
utilities are the exact same thing. And so what do we know? We know the -- if
the utilities are the same, then what happens? Well, the equilibriums are exactly
the same. Okay? So that analysis we just did solving for equilibrium, we're
done. We know what goes on in crowdsourcing auctions. Okay? The difference
-- so we have the same equilibrium, but the difference between crowdsourcing
contests and all-pay auctions is: here our revenue is the sum of the payments and
here the payments of the losers are wasted. Okay? That's the big difference.
And that's the focus of my talk: understanding what's wasted, and how much.
Good.
So in crowdsourcing contests everyone works. Work of losers is wasted. It's like
an all-pay auction.
Otherwise I want to compare crowdsourcing contests to conventional
procurement, right? What if I just hired the best guy, or what if I was the
government and I did some auction where people sort of put in proposals and I
pick the best proposal and only they did the work? Right? So this proposal
system is kind of more like a second-price auction or a first-price auction, right?
And so if you're trying to get good revenue in the second-price auction you just
get the full set of payments, right? And we know from revenue equivalence
that all pay and first and second price get the same total revenue. Right? So my
only question is how much work is wasted. Right? Because that wasted work is
work I could have gotten in the second-price auction doing regular procurement.
>>: [inaudible].
>> Jason Hartline: By the PI.
>>: [inaudible].
>> Jason Hartline: Yes. Yes. I value wasted work by the PI and I note that had I
run a second-price auction or first-price auction to do this procurement I could
have gotten the full sum of the PIs as payment because I just -- only the first guy
pays. Only the highest guy pays, right, by revenue equivalence. So there's no
wasted work. And it is the payments I'm talking about. Okay?
>>: [inaudible].
>> Jason Hartline: Yes. Yes. You only know PI.
>>: [inaudible].
>> Jason Hartline: No, I cheated by showing them to you. But pretend you didn't
see them. Okay? Because otherwise -- I mean, so, yeah. So usually -- so --
and the assumption going in here is that you're trying to crowdsource
something that you can't combine. You got a bunch of payments; if they were
money you could just take the sum of the payments, right? But I'm imagining
these images there's no way to combine the images I got to get one image that I
want to post on my flier.
>>: [inaudible].
>> Jason Hartline: For -- yes. For the illustration to talk we kind of went around
the model. But let's ignore that. Good. So this is what I was just saying. So
basically I just want to quantify this loss from wasted work. And it turns out it's
not that hard to do. So this is the main theorem that I wanted to tell you. The
main theorem is that the highest payment is at least half of the total revenue that
you would have gotten if you collected all the payments. Or in other words, the
highest payment is at least the sum of all the losers' payments. Okay? Good.
So that's what I want to show.
I'm going to give you a proof of this on this one slide. It's pretty concise. So I'll
have two variables, W and L. W is the expected winner's payment, L is the
expected losers' payment, and I just want to show that W is bigger than L. Okay?
So what is the winner's payment? Well, remember payment of V is the sort of --
we know if your value is V we know what you bid, right? You bid your payment.
Okay? So your payment is payment of V times a probability you win if your value
is V. That's the winner's payment. And the losers' payment is payment of V
times the probability you lose if your value is V which is one minus the probability
you win. Good.
So what is W minus L? Well, it's payment of V times (two times the
probability you win, minus one). Right? Just regrouping.
Okay. So let's go back to previously in the talk where we solved for what
payment was. And remember, that was the expected second highest value
conditioned on V winning, times the probability of V winning. Okay? And so to
avoid writing V winning all the time, for this probability I'm going to remember
that for the uniform distribution that's easy. Probability V wins is just V to the N
minus one, right? Because it's the probability you beat each of the N minus one
other guys, and you beat each one with probability V, and so the probability you
beat all of them is V to the N minus one. Okay?
So this seems not very general like it's only for uniform distributions but you can
get the same decomposition just by changing variables into probability space
instead of value space. Okay? So the same proof is general but let's just do it
for uniform so we don't get confused. Good. And I'm just going to call this thing,
so I don't have to write it out again, G of V. Okay. So here's my new equation.
This is again my payment, G of V times V to the N minus one, times this
two-times-the-probability-of-winning-minus-one term, which is this. Okay?
So I want to calculate the expected value of this. G of V is complicated. But this
term is easy. So let's -- let's do the easy thing first which is calculate the
expected value of this term without the G of V and see what we get. Okay? So I
claim that this is at least zero. And you can work this out for any N. But here's a
picture which shows you that that's at least zero, right, the green area is bigger
than the red area so it's at least zero. Good.
But furthermore look at this picture. There's more positive area to this side than
that side. Right? And if we know one thing about G of V we know it's monotone
increasing. Right? So it's got bigger values towards this side. Okay?
So I'm taking -- what is the expectation? It's the weighted sum of G of V with this
thing that has increasing weight on this side. Good. G of V is a monotone
function. So it's got more weight here. So the whole thing is positive. All right.
That's it.
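The W versus L comparison can also be checked numerically for the uniform case. This is my own sketch, using the formulas from the proof: payment(v) = G(v) times v to the N minus one, with G(v) = v(N-1)/N the expected second-highest value given that v wins, and Pr[win] = v to the N minus one:

```python
import random

# Monte Carlo check (my sketch, not from the talk) that the expected winner's
# payment W is at least the expected losers' payment L, for N i.i.d.
# Uniform(0, 1) bidders.
random.seed(3)
n_samples = 200_000
results = {}
for N in (2, 3, 5, 10):
    W = L = 0.0
    for _ in range(n_samples):
        v = random.random()
        g = v * (N - 1) / N          # expected second-highest value given v wins
        payment = g * v ** (N - 1)   # expected payment of a value-v bidder
        p_win = v ** (N - 1)         # probability a value-v bidder wins
        W += payment * p_win         # winner's share of the payments
        L += payment * (1 - p_win)   # losers' share of the payments
    results[N] = (W / n_samples, L / n_samples)

print(all(w >= l for w, l in results.values()))  # True for every N tried
```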
So some conclusions. We modelled crowdsourcing contests as all-pay auctions.
We showed the loss from wasted work is small.
A major workhorse of this theorem was the revenue equivalence theorem, which
made it all easy. Okay?
There are two other results in the paper that I'm referring to here which I didn't
have time to talk about. And one is solving for optimal crowdsourcing contests.
And so let me just say quickly what one of the questions you could answer is. So
we gave all the prize to the best participant. Well, I don't have to do that, right?
So for instance TopCoder is a crowdsourcing website for programming and
they give two-thirds of the prize to the best guy and one-third to the second
best guy. Is that a good idea? Is that optimal for some distribution of values or is
it not? Remember we still only get the value of the best contribution but
somehow we reward the second guy for his effort. Thanks for playing, we'll give
you a third of the prize. Right. Is that a good idea? So the answer is no. Among
static allocation rules, meaning if you fix in advance that you allocate to the
highest guy or some static rule like you allocate some fraction to the highest,
some fraction to the second highest, some other fraction to the third highest and
you fix that in advance, amongst all fixed in advance rules which rule do you
want, you want to always go with the highest guy. Okay?
So TopCoder is doing something stupid. And another question you could ask is
well, maybe after seeing the qualities I can adjust the amount I pay. Right?
And then if you try to do this, this takes a form a lot closer to what Costis was
referring to as maximizing virtual values. And so you can derive virtual values for
the maximum payment and then optimize virtual value and thereby optimize the
maximum payment. And you get similar results there. Except that unlike in the
auction case, these virtual values depend on the number of players. So the more
players you get, you do different things. Okay?
And the last thing that we did, which is -- I don't really know how to sell this
because I don't know the economics that [inaudible] that well, but I am told by
economists that they didn't know that for N bidder IID distributions that the
first-price auctions and all-pay auctions have a unique equilibrium. And we've
proved that. And we prove it basically using revenue equivalence. Whereas the
proofs in the literature -- in the economic literature I've seen for analyzing
first-price auctions and all-pay auctions use differential equations which is why
they do it for N equals 2. And so they know equilibrium results for N equals 2 but
for general N they don't know, because differential equations with N functions
are complicated. Right? And so we do it with revenue equivalence for N players
in IID settings and I'm assuming they don't know it because people I've talked to
are like, whoa, really? But it is also really easy. So I'm not -- I'm surprised. All
right. That's that.
[applause]