>> Nikhil Devanur Rangarajan: Hello. Welcome to the MSR talk series. Today we have Tanmoy
Chakraborty from Harvard speaking on mechanism design for a risk averse seller. Tanmoy
completed his Ph.D. from UPenn and [inaudible] and is currently a postdoc at Harvard with
David [inaudible]. So Tanmoy.
>> Tanmoy Chakraborty: Thanks, Nikhil. So I'll be talking about a mechanism design problem.
And the key difference from classical economic theory would be that I'll be considering a risk
averse seller.
So what is risk aversion? Agents do not like uncertainty in what's going to happen. For
example, suppose an agent has two strategies that he can employ. The first one, strategy A,
gives him a randomized payoff: it gives him $20 with probability half; otherwise [inaudible] he
ends up with zero dollars.
In comparison, there's another strategy which gives him a guaranteed $10. Think of buying a
lottery versus somebody just giving you the $10. Would you actually buy the lottery? Suppose
there is a lottery paying $20 with probability half; would you buy it for $10?
Our premise is that an agent would prefer strategy A. That would essentially be --
>>: B.
>> Tanmoy Chakraborty: Sorry, strategy B. Yeah. And that would be equivalent to risk
aversion.
So any risk averse seller or agent should essentially prefer strategy B.
But the difficulty comes when you are comparing cases where there is no such easy choice.
For example, what if strategy B did not pay $10 but paid a little less, say $9? Would you go
for the $9 or would you bet on the $20 with probability half?
What if -- suppose you say, okay, I would still go for $9, I don't want to take the risk. Would you
go for $5? If strategy B paid $5, would you still go for it?
And at an extreme level, if strategy B pays $1, at that point you might say I don't care about the
$1, I might as well bet on the 20.
And so at some threshold A becomes more preferable than B. And the threshold really
depends on the person, right, how risk averse a person feels.
So the way we model this is that a player has diminishing marginal utility, so essentially a
concave utility function. So when an agent is getting $20, he thinks of it as a utility of U of 20,
where U can be some concave function, like log or square root.
When he's thinking, say, on a logarithmic scale, say log of 20, he actually doesn't value the $20
that much above the $10. $20 is no longer twice $10, right? Because log of 20 is just one unit
more than log of 10, essentially.
So what will happen now is the agent wants to maximize his expected utility. So what is the
expected utility of strategy A? He ends up with $20, in which case his utility is U of 20,
with probability half -- so U of 20 over 2 -- plus U of 0 over 2, for the case where he ends up with 0,
which also happens with probability half.
For concave functions, U of 20 plus U of 0, all over 2, will be less than U of 10. This is simply
basic -- it's a property of any concave function. And how concave the function is will reflect what the
threshold is going to be. Is U of 20 over 2 going to be less than U of 9,
for instance? Okay.
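A minimal numeric sketch of this comparison, assuming a few illustrative concave utilities (the specific functions are assumptions, not from the slides):

    import math

    # Strategy A: $20 with probability 1/2, else $0. Strategy B: guaranteed cash.
    utilities = {
        "linear (risk neutral)": lambda x: x,
        "sqrt": lambda x: math.sqrt(x),
        "log(1+x)": lambda x: math.log1p(x),
    }

    for name, U in utilities.items():
        eu_lottery = 0.5 * U(20) + 0.5 * U(0)
        for sure in [10, 9, 5, 1]:
            choice = "B (take the cash)" if U(sure) > eu_lottery else "A (bet on the $20)"
            print(f"{name}: guaranteed ${sure} -> prefer {choice}")

For log(1+x), for instance, the guaranteed $5 still beats the lottery while the guaranteed $1 does not; that crossover is exactly the threshold the choice of U encodes.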
So different agents will have different utility functions, which will determine these thresholds.
And at the extreme, somebody with a linear utility is
risk neutral, in the sense that he is just maximizing his expectation. He is not risk averse.
Commonly used functions would be square root of X, or log X kinds of things. I just wrote log of 1
plus X to make sure that U of 0 is 0. That's a nice boundary condition to have.
And a common function that can capture a lot of instances is the capped-additive function: a
utility function that grows linearly to some point and then just flattens. So once I reach $100,
that's my goal. I don't really care about getting more. I just want to make sure that I get
something like $100.
There are other measures too, fairly intuitive ones in fact. Think of maximizing expectation
minus the standard deviation or the variance. We just say, oh, I want to pick
a strategy that maximizes my expectation, but I don't like uncertainty, so I'll put a negative
weight on variance. I will not go for strategies which have a lot of variance in their payoff.
But we'll be handling this expected concave utility problem in this talk.
And the context in which we study this is Bayesian mechanism design. So there's a single
seller, a monopolistic seller, and N buyers. The seller has limited inventory, and buyers have
values for the inventory that are private. The seller has some items to sell,
each buyer has a value for those items, and the seller doesn't know what the values are.
So multi-unit auctions. We'll focus mainly on multi-unit auctions where the seller has K
identical items, K identical indivisible items, and each buyer -- there are N buyers -- each buyer
has a value for the item, and the buyer wants only one item. They're identical copies, so imagine
buying K TVs or something, a buyer wants only one TV, and he has a value VI -- buyer I has a
value of VI for the item. So different buyers can have different values.
And VI is drawn -- and the Bayesian aspect of this model is that VI is drawn from some
distribution FI. FI is a buyer specific distribution, and VI is drawn from it. The distribution FI is
known to the seller. So essentially the seller is using priors, some beliefs about the values.
And we'll stick to the standard notion. So the seller is going to set up some mechanism, an
auction if you will, and he would like to sell these items. The standard notion is
dominant strategy incentive compatible mechanisms, where the seller asks the buyers to submit
bids. So think of the seller asking for the values. Once he receives all the bids, he's
going to allocate the items and charge some payments from the buyers as a function of these
bids.
And that is what's called the auction rule. And the auction rule should be such that each buyer's
best strategy is to truthfully report his value.
Whatever other buyers are doing, it should be in my best interest as a buyer to just report my
value truthfully to the seller. That would make the -- if that is true, then the mechanism is called
dominant strategy incentive compatible.
So in terms of maximizing expected revenue, so -- yeah?
>>: [Inaudible] use the simple word truthful.
>> Tanmoy Chakraborty: Sorry? Yeah, truthful or incentive compatible. They are, yeah,
completely interchangeable.
And so Myerson characterized the revenue-optimal auction for this kind of setting, for multi-unit
auctions. It only maximizes expected revenue, so it assumes a risk neutral seller. Okay.
And one of the characterizations that we'll use from his work is that any deterministic
dominant strategy incentive compatible mechanism can only do the following. It takes all the
bids, and from the point of view of a particular buyer, that buyer is
being offered a price as a function of everybody else's bids.
His price, the price that he's getting or being offered, does not depend on his own bid.
Because -- so this is essentially -- if you think about it, [inaudible] is necessary so that he cannot
strategize with his bid. Otherwise he would have an incentive to change his bid, if his price
depended on his own bid.
But if his price is a function of everybody else's bid, then he can't do anything to that price. He
can either accept that price if his value is above that price, or he's going to reject that price. He
cannot do anything else. His price does not depend on his own actions.
Okay. And Myerson's techniques actually apply to a slightly more general setting called
single-parameter settings. Multi-unit auction is pretty much the most prominent in that class.
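As a side note, here is a tiny sketch of the virtual value map that Myerson's auction uses in this single-parameter setting; the exponential prior is an assumption chosen purely for illustration:

    import math

    def virtual_value(v, F, f):
        # phi(v) = v - (1 - F(v)) / f(v); Myerson allocates the K items to the
        # K highest nonnegative virtual values and charges threshold payments.
        return v - (1.0 - F(v)) / f(v)

    # Illustrative prior: value ~ Exponential(1), so F(v) = 1 - e^{-v}, f(v) = e^{-v}.
    F = lambda v: 1.0 - math.exp(-v)
    f = lambda v: math.exp(-v)
    for v in [0.5, 1.0, 2.0]:
        print(v, virtual_value(v, F, f))  # phi(v) = v - 1 for this particular prior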
So the expected revenue in this case is measured with respect to these distributions. So this is a
Bayesian model. The revenue is a random variable: imagine the VIs being drawn from the
distributions FI independently; for each realized value vector there is some
revenue that comes in. And this mechanism is maximizing
expected revenue.
So Myerson actually also showed -- and this goes into a slightly more economic
concept -- that Myerson's mechanism is not just optimal among the dominant strategy incentive
compatible mechanisms. He designed a DSIC mechanism which is optimal among all
mechanisms. So even if a mechanism is not incentive compatible, is not truthful, and buyers
strategize and reach some equilibrium of the game, even then no equilibrium can make
more money than what Myerson promises.
But this characterization requires analysis of the Bayes-Nash equilibria. If I am
talking of a mechanism that is not truthful, okay, buyers are going to strategize. How do they
strategize and what strategy do they play? Think of playing a mixed Nash equilibrium.
In that case buyers are maximizing their expectation, right? Each chooses the strategy that, given
his beliefs about other players' strategies, maximizes his expectation. And the whole
premise of this talk is that people may not be maximizing their expectation.
People might be maximizing their expected utility. People may be risk averse. Only risk
neutral players maximize their expectation.
So there is a little bit of an issue of how to consider Bayes-Nash equilibrium. Yeah.
>>: Did you say that this Myerson scheme beats collaborative efforts by the --
>> Tanmoy Chakraborty: No.
>>: No. Just individual?
>> Tanmoy Chakraborty: No. It's not collusion. But if people collude -- yeah. Actually if
people collude, then, yeah, Myerson's doesn't remain robust.
>>: [inaudible] they won't be happy about [inaudible] getting the same price.
>> Tanmoy Chakraborty: Yeah.
>>: Okay. Sorry.
>> Tanmoy Chakraborty: So in this work we are designing efficient mechanisms to
approximately maximize expected utility for the seller. And we'll essentially stick to these DSIC
mechanisms. Okay? So let's formally define what we mean by expected utility.
In a particular realization, there is some particular number R, the revenue, obtained as
the sum of the payments from all the buyers. That's the revenue for the seller in that
particular realization. We say U of R is the utility in that realization, and we want to
maximize the expectation of the utility. Okay. E of U of R.
And now, intuitively, when is risk aversion important? Why would the seller care much about
risk aversion? It should happen when priors are very wide, when priors have very high variance.
So imagine that there is some buyer and the seller knows that this buyer once in a while puts a
very high bid. He usually bids around $1 or $2, that is standard, and once in a while he just bids
$100, and the seller doesn't really know why. He has seen this in the past, but he hasn't been able
to figure out why, so he cannot predict it.
So this $100 bid is part of the -- is going to be part of his prior if you think of creating these
priors from a past sample, past history.
This $100 bid happens, but maybe with 1 percent probability. It happens once every 100
auctions. And he cannot figure out why, so he cannot actually predict exactly when it's going
to happen. So the question is, should the seller actually hold out for it? Essentially, should the
seller offer a price of 100 to try to get the $100 from him, whenever that high bid comes, or
should he just stick to offering a price of $1 and take the $1?
So does this setup make sense? Either he can just offer a price of $1 and get the $1, or he
can put a price of $100 and once in a while get $100. The remaining time he's going to get zero.
So it is especially when the priors have these kinds of outliers that risk aversion becomes
important. And it really depends on the choice of utility function how far one goes: okay, for $100
maybe I won't wait. But what if it is $1 million, should I wait for that? That depends on the
risk aversion of the seller.
So this is what I mean: when priors have very high variance, possibly due to not
understanding the world as well as one might, or simply because there isn't enough sample,
risk aversion becomes important.
>>: That's different, right, that is not [inaudible] accurately?
>> Tanmoy Chakraborty: Yes.
>>: And understanding the world or not having enough samples means you cannot
predict well enough. But you're talking about something slightly different.
>> Tanmoy Chakraborty: So a prior is in some sense a belief; for that particular auction
it's a belief of where the value is going to be. And the worse my prediction or my confidence
is, the broader, wider the distribution I'll end up keeping.
And if I think my prediction of a value is more accurate, then I'm going to say, oh,
it is going to be $10 plus/minus 10 percent. That's a very high confidence. Versus a low
confidence would say, oh, it can go from $10 to $100, I don't know.
>>: Philosophical issues here.
>>: [inaudible] for example, knowing the number between 0 and 1 is going to be drawn at
random. Is that different from not knowing anything about it at all?
>> Tanmoy Chakraborty: Yeah. Yeah. It's not so obvious.
>>: So your prior is really -- when you do the expectation over the prior,
you're assuming the prior is accurate.
>> Tanmoy Chakraborty: Technically, yes.
>>: [inaudible] accurate [inaudible] issues of not being able to get the accurate.
>> Tanmoy Chakraborty: Yes. So the reason I make this point is, imagine that you have
started with some prior, think of uniform on 0, 1, and you update it according to some samples.
You update the posterior. The posterior starts having less and less variance as your
sample size increases.
Okay. And you would need less risk aversion. These risk aversion techniques won't be
needed with those posteriors, because they'll have far less variance. And this is just the point I'm
trying to make: you often start with very broad priors, and if you update
to posteriors from only a small sample, you'll be stuck with
priors for which risk aversion is going to give you different results from risk neutrality; risk
aversion will matter. If your distributions were very narrow, the risk averse solution would
be the same as the risk neutral solution.
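As a small illustration of that point, here is a posterior-variance calculation under an assumed Beta-Bernoulli model (the model and the numbers are just for illustration):

    # Beta(1,1) uniform prior on an unknown probability, updated with n samples.
    def beta_posterior_variance(successes, failures):
        a, b = 1 + successes, 1 + failures
        return a * b / ((a + b) ** 2 * (a + b + 1))

    for n in [0, 10, 100, 1000]:
        print(n, beta_posterior_variance(n // 2, n - n // 2))
    # Variance shrinks roughly like 1/n: with few samples the posterior stays
    # wide, and that is the regime where risk aversion changes the answer.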
>>: Yeah, maybe we can [inaudible].
>> Tanmoy Chakraborty: And we restrict our choice to only DSIC mechanisms. And this is the
other point I was making, again a little bit philosophical. So Myerson said, okay, I'm
willing to consider any mechanism and look at the equilibrium of that mechanism and look at its
revenue.
Okay. We're not going to do that. We're going to say that the class of mechanisms I'm
going to consider has to be dominant strategy truthful. Why? Because [inaudible]
equilibrium assumes risk neutral buyers who are maximizing only their expectation.
Now, if buyers themselves are risk averse, then they will have their own utility function which
will affect the equilibrium. It will end up in some other equilibrium, which will depend on the
risk aversion functions of the buyers.
And as a seller, I don't know how risk averse the buyers are. So I can't really argue about it --
it will not be robust. I cannot measure my utility because I cannot know for certain what the
equilibrium is going to be.
But if I stick to a dominant strategy incentive compatible mechanism, the truth telling equilibrium
is actually an equilibrium even if buyers are risk averse. Because instance by instance it is a
better strategy for the buyer to tell the truth.
So even in expected utility -- this is like stochastic
dominance: the truth telling strategy [inaudible] stochastically dominates any other strategy. And
so even for expected utility, it's going to be preferable.
So these are the results. And I will mainly -- I will just talk about the first one. The second one
goes along similar principles.
So we obtain an efficient, polynomial time computable, 1 minus 1 over e minus
epsilon approximation to the utility-optimal DSIC mechanism for multi-unit auctions. It applies to
arbitrary buyer distributions; the buyer distributions are allowed to be nonidentical, and we do not
make any regularity assumptions.
So the previous result on this same problem assumed that buyers' valuations were drawn from
identical distributions, so all buyers were identical, drawn from the same distribution. And
moreover the value distributions satisfied regularity.
So what does regularity mean? Regularity essentially -- think of a Gaussian. A Gaussian is a
regular distribution. And those distributions actually have less variance. So the
problem kind of gets closer to one where Myerson itself is a pretty
good mechanism, under the assumption that the distributions are narrow, that they don't have
high variance.
So under this regularity assumption, when the distributions were essentially normal, you
could run Myerson. Myerson [inaudible] won't be as good as 1 minus 1 over e, though, but it
will be reasonable. And you won't have to take risk aversion into account at all.
But the key point of our work is that risk aversion is especially important when your
distributions are spread out and you have outliers. And we design a mechanism for that.
And in comparison, our mechanism is going to be a sequential posted pricing. What does
that mean? Myerson or any truthful mechanism sets a price as a function of other people's
bids. Okay. What if you were asked to set prices just as a function of the distributions? You
don't look at any bids. You just set a price for every buyer by just looking at the distributions.
If you did that and just offered these prices in a sequence, as take it or leave it offers -- do you want it
at this price or not, and I'll give you the item immediately -- that is called a sequential posted
pricing mechanism.
So these are posted prices. The prices are posted independently to each buyer; a price is not a
function of other buyers' bids. And these are the kind of mechanisms that are
interesting in their own right. They're studied as pricing problems, because for consumer
goods we don't really participate in an auction. What we see is typically posted
pricing.
So our mechanism does end up being a sequential posted pricing. And sequential posted
pricings are relevant -- they work for revenue. Sequential posted pricing does give a 1 minus 1
over e approximation to -- yeah.
>>: [inaudible] price goes down until somebody picks up the purchase?
>> Tanmoy Chakraborty: Actually, in sequential posted pricing the buyers need
not be present at the same time. They're being offered prices -- think of people
coming into a shop where I have just set a posted price. That's pretty much it.
>>: So the comparison within [inaudible].
>> Tanmoy Chakraborty: Only for SPM, yes.
>>: You have some concave function.
>> Tanmoy Chakraborty: Yes.
>>: And you get [inaudible].
>> Tanmoy Chakraborty: Via SPM. Of course, here the goal is to do really well -- if I could
beat 1 minus 1 over e by doing something other than an SPM, that would be an even more interesting,
even better result. Because the utility-optimal mechanism has not been characterized.
>>: [inaudible] extending the [inaudible].
>> Tanmoy Chakraborty: Yes.
>>: [inaudible]
>> Tanmoy Chakraborty: Yes. Essentially.
And, similarly, in the other part, when there are multiple distinct items,
Chawla, et al., give a 6.75 approximation. We can essentially get a 1 minus 1 over e over 6.75.
So to handle the concavity, we are losing a factor of 1 minus 1 over e. Yes.
>> Tanmoy Chakraborty: So let's start by looking -- and I'll mainly cover the first result,
the multi-unit auctions. So let's start by looking at some simple cases. If you have just a
single buyer, what would you do? If you have a single buyer who comes in with a distribution given
by the CDF F, all I'm going to do is set a price that maximizes my revenue,
if I were maximizing expected revenue.
So it will be simply the price times the success probability, the price times the probability that
his value exceeds my price.
Right? So 1 minus F of P is the success probability, or the sale probability, and if I sell it, I get
P, the price I offered.
So I'm just going to maximize P times 1 minus F of P. That's it. If I'm maximizing utility and
there's just one buyer, it's very easy to modify that: just replace any price P with U of P. Just think
of it as a transformation of the price.
And now I am going to maximize U of P times the sale probability of P. I'm going to optimize over
this one-dimensional curve and I can find my optimal [inaudible].
>>: [inaudible] worthless to you.
>> Tanmoy Chakraborty: Yes. It's normalized to zero, yes. Otherwise, if you have a known cost
to yourself, you could just translate things. Yeah. It's normalized to zero.
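A minimal sketch of this single-buyer computation, assuming a discretized price grid and an illustrative uniform prior and concave utility (all of these are assumptions):

    import math

    def best_price(prices, F, U):
        # Maximize U(p) * (1 - F(p)) over the candidate prices.
        return max(prices, key=lambda p: U(p) * (1.0 - F(p)))

    F = lambda v: min(v / 100.0, 1.0)   # value uniform on [0, 100]
    U = lambda x: math.log1p(x)         # concave utility with U(0) = 0
    prices = [i / 10.0 for i in range(1, 1000)]

    print(best_price(prices, F, lambda x: x))  # risk neutral price, 50 here
    print(best_price(prices, F, U))            # risk averse price, lower here

In this example risk aversion pushes the posted price down: the seller trades some expected revenue for a higher sale probability.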
So the second observation is that it is actually easy to maximize a slightly different looking
function. If buyer 1 was paying P1, buyer 2 paid P2, and buyer N paid
PN, what you really want to maximize is U of P1 plus P2 plus ... plus PN. Right? Because P1
plus P2 plus ... plus PN is your revenue. You take the utility of the sum, the concave function of the
sum, right?
Instead, as we've done here, we have separated it up into a sum of U of Pi. So it's a different function,
right, from what we want. But this function is easy to optimize. Why? Because the
contribution of each buyer just separates. Whenever a buyer is going to pay me P, I just pretend
that it's U of P, because that's his contribution to the objective: whenever he pays P, his
contribution is U of P.
But that is only true as long as I'm selling just one
item, because then in any realization only one buyer pays. The moment I have more than one
guy paying me in a particular realization, if one pays me $5, I cannot say my
utility from him is U of 5. I cannot say that, because it really depends on whether I got $2 from
the other guy. Right? It can be either U of 5, or, if I got $2 from the other guy, it is U
of 7 minus U of 2. One could think of it that way.
So now these contributions to the objective have become completely dependent on the
realization. The correlation of these payments becomes very important. And this is really the source
of the technical difficulty.
So let's warm up with an existential result. I'm going to show you that
there exists -- without being able to compute it -- a sequential posted pricing that will
be within 1 minus 1 over e of the optimum. I'll give you an existential result.
So let's say that I have the utility-optimal mechanism in a black box. Of course, if I had it, this talk
would be meaningless. But I don't. But suppose I had it as an oracle. Then I could construct a
sequential posted pricing from it.
How would I do it? This utility-optimal mechanism,
being dominant strategy truthful, offers each buyer a price as a function of other people's bids. Right?
But now go to one of those buyers and look from his view. He's being offered various prices. He's
essentially being offered a randomized price, a random price that depends on other people's bids.
But if he doesn't care about that, he's just seeing a random price.
Okay. Now, suppose I just offered him that random price independently of other people, and I do that for
everybody: I offer the same distribution of prices to each guy, but I offer these things
independently. You know, in the auction these prices get correlated, right, each is a function of other
people's bids. If I offered them independently, what is the good thing and what is the bad thing?
The good thing is that, well, the revenue contribution -- just forget utility for now -- the revenue
contribution is going to remain the same. He saw the same set of prices, his value was realized
to something independently, so from his perspective nothing else changed. He paid the exact
same amount. So the total revenue would remain the same.
So this seems very simple. Why would I then run an auction? I could just give these
independent prices. The problem is, once I've given these independent prices, they get accepted
independently, each with some probability, so there will be
realizations when I end up selling more than K items, more than the number of items I have.
And the auction -- Myerson's auction -- essentially correlates these things to ensure that it
never sells more than K items.
So making them independent is going to violate the inventory at times. But
there are two things. One, in expectation the inventory is still maintained,
because the sale probabilities remain the same: the sums of the sale probabilities of this
independent price projection and of Myerson are the same.
So the expected number of items sold is the same. However, this is not really a
mechanism, because it is not actually maintaining the hard inventory constraint.
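A toy simulation of that overselling effect, with illustrative numbers (each of N buyers accepting independently with probability K/N, so exactly K items are sold in expectation):

    import random

    random.seed(1)
    N, K, TRIALS = 10, 3, 100000
    oversold = sum(
        1 for _ in range(TRIALS)
        if sum(1 for _ in range(N) if random.random() < K / N) > K
    ) / TRIALS
    print(oversold)  # a constant fraction of realizations sell more than K items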
So the question is -- okay, this analysis says the expected revenue remains
the same if there were no inventory constraints. We will have to handle the
inventory constraints, but the expected revenue remains the same otherwise.
That's not true for utility. Utility can get seriously affected by this. I just said
that the revenue contribution of a particular buyer, looking from his
angle, is the same given my independent prices.
But now imagine a utility-optimal mechanism that was creating correlations
in such a way as to ensure that either I get a dollar from you or I get a dollar from this
other guy, so that in every realization it gets a dollar.
Now, if I make them independent, I've actually lost that sort of correlation that was helping
utility. I won't [inaudible] get one from both of them -- I'm getting them
independently, so I'll get two in some realizations, and sometimes I'll get zero and none of them will
succeed. I'm not correlating them anymore.
So even though the revenue stays fixed, the utility may
still drop, and it will drop, because I've removed these correlations. And the other thing is
we have to reintroduce the inventory constraint that I just removed from the argument.
So these are the two factors to handle. And here is a technical result that we'll use, what
is called the correlation gap. Let F be any submodular real-valued function defined on
R to the N: F of the coordinate-wise maximum of X and Y, plus F of the coordinate-wise minimum,
is at most F of X plus F of Y. This is the standard definition of submodularity when you are
talking not about sets but about real numbers. Right?
And the main thing to keep in mind is that a concave function of a sum -- if U is a concave
function, then U of X1 plus X2 plus ... plus XN -- is submodular. Okay? Because submodularity in some
sense is really concavity in multiple dimensions, and this is one specific
instance of that: a concave function of the sum is submodular.
But we'll need another one that is slightly nontrivial: take not the concave function of the sum of all
the variables, but the concave function of the sum of the K largest variables. So if you consider X1,
X2, ..., XN, take the K largest numbers among them, take their sum, and take the concave U of
that. Even that function -- this is also a real-valued function, right -- even that function is
submodular. Okay?
By the way, just a heads up, why did I consider this U of the sum of the K largest variables?
Because I have K items, I am going to get only K positive
payments. So this is sort of the heads up of why I would need such a function. But these are the
two functions that need to be submodular, and they are, for this work.
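A numeric sanity check of that claim -- not a proof -- under assumed sizes and an assumed concave U:

    import math, random

    # f(x) = U(sum of the K largest coordinates); check the lattice
    # submodularity inequality f(x v y) + f(x ^ y) <= f(x) + f(y),
    # where v and ^ are coordinate-wise max and min.
    def f(x, K, U):
        return U(sum(sorted(x, reverse=True)[:K]))

    N, K = 6, 3
    U = math.sqrt
    random.seed(0)
    for _ in range(10000):
        x = [random.uniform(0, 10) for _ in range(N)]
        y = [random.uniform(0, 10) for _ in range(N)]
        join = [max(a, b) for a, b in zip(x, y)]
        meet = [min(a, b) for a, b in zip(x, y)]
        assert f(join, K, U) + f(meet, K, U) <= f(x, K, U) + f(y, K, U) + 1e-9
    print("no violations found")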
So I talked about the independent projection of the prices; how do they fit in? Consider the
next part. Let X1, X2, ..., XN be random variables correlated in any way you like.
Imagine these are the payments of the buyers; in an auction they are
correlated anyway.
And what I'm doing is making them independent and yet the same. Right? They are
independently sampled, but their marginal distributions are exactly
the same.
So let Y1, Y2, ..., YN be their independent projections. They're just
being drawn from a product distribution, while the first ones, X1, X2, ..., XN, were being drawn from
an arbitrarily complex joint distribution.
And submodularity -- in our case, essentially concavity -- satisfies the following: even if I
make these variables independent, my expected utility
does not go down by much. The expectation of F of Y1 through YN is at least 1 minus 1 over e
times the expectation of F of X1 through XN.
So to begin with, why would it go down at all? Imagine that X1, X2, ..., XN were
very negatively correlated. For example, whenever X1 is
positive, everyone else is 0; whenever X2 is positive, everyone else is zero. Okay?
So this is negative correlation, and negative correlation enhances expected utility. Okay. On the
other hand, positive correlation -- think of everybody being 1 together and 0 together -- that's a
horrible scenario for expected utility.
So positive correlation is bad. And the tight example for this particular
inequality is: suppose you have N Bernoulli variables, 0 or 1, each 1 with probability 1 over N,
some very small probability. You could have them perfectly negatively correlated;
that is, with 1 over N probability X1 is 1 and everything else is 0; with 1 over N probability X2 is
1 and everything else is 0. What that gives you is that you always get a revenue of 1, because
in each realization exactly one of them is 1.
But now imagine that they are made independent. Then there will be realizations with multiple
variables turning out to be 1 at the same time. And if my utility function is capped at 1 -- so
whenever I get 1, I get utility 1, but U of 2 is also 1, U of 3 is also 1; it's
concave, it just flattens at 1 -- okay,
in that case I do not get any extra value out of multiple simultaneous occurrences.
Right? So independence is going to be bad. However, what is the probability in this particular
example that I get at least one positive variable? For this utility function, that is really going to be
my expected utility. Right? It is 1 minus 1 over e.
And this gives the tight bound. Notice one thing, though. You can have horrendous
results for other distributions. All this says is that the independent distribution is a pretty good
distribution as far as all distributions go. The independent distribution is
close to the best distribution, right? That's what it says. There are other distributions which
are far, far worse.
In particular, with 1 over N probability all these variables are 1 together. Then you get U of N times
1 over N. If U of N is also 1 -- it got capped at 1, right -- all you got was 1 over N. So you could
be a factor N away if the things are [inaudible] positively correlated.
But what this says is that the independent distribution has enough
negative correlation in it. It actually has zero correlation, but zero correlation is kind of
enough negative correlation. And positive correlation is bad.
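A quick simulation of that tight example, assuming the capped utility U(x) = min(x, 1) and illustrative numbers:

    import random

    random.seed(0)
    N, TRIALS = 20, 100000
    U = lambda x: min(x, 1.0)

    # Perfectly negatively correlated: exactly one variable is 1 in each round.
    neg = sum(U(1.0) for _ in range(TRIALS)) / TRIALS

    # Independent projection: each variable is 1 with probability 1/N.
    ind = sum(
        U(sum(1.0 for i in range(N) if random.random() < 1.0 / N))
        for _ in range(TRIALS)
    ) / TRIALS

    print(neg, ind)  # ind is about 1 - (1 - 1/N)^N, approaching 1 - 1/e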
Okay. So with this one slide of technical background, we are back to our existential result.
We are doing this independent projection of the prices. The auction was giving correlated prices; I
give independent prices.
I can simply apply this correlation gap result -- I just need a suitable F. What is my F? My F is
U of the sum of the K largest variables.
So I create these random prices and then offer them in decreasing order of
prices -- that's one more thing I'm doing, I'm choosing the order. What that does is, among the
people who would accept their offered prices, I get the best possible scenario. So I would always
offer them in decreasing order of prices.
But if I do that, this function remains submodular, and all I've done is
introduce independence, so I'm in good shape.
So this is a 1 minus 1 over e approximation -- if only I could get these independent prices. The
point is these independent prices that I offered were constructed by using the utility-optimal
mechanism as an oracle, which to begin with I don't have.
So this proves existence, but it's not constructive. For expected revenue it is constructive,
actually in two ways, and I'll tell you why neither of the two ways works here. So this is my
buildup for why we need more complicated things.
So, number 1, you could construct this sequential posted pricing by using Myerson's mechanism
as an oracle. Because for revenue you actually have characterized -- we actually have
characterized the optimal mechanism. Right? For utility we don't have that. That's the first
reason.
The second one is that, now that we are giving independent prices, for the pricing problem we can
write a very simple linear program with just one constraint: maximize expected revenue, subject
to expected number of items sold less than or equal to K.
So the objective is going to be simply this. Let's
say prices are discrete, coming from some set, and PJ is one of the prices. 1 minus FI of PJ
is the sale probability, and XIJ is the probability that I offer price PJ to buyer I.
I can just write this as a plain summation, XIJ times PJ times 1 minus FI of PJ, summed over I and
J, and that captures expected revenue. I add just one more constraint, selling at most K items in
expectation, and do the rounding by decreasing order of prices; that gives 1 minus 1 over e.
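A sketch of that one-constraint LP, assuming two buyers with uniform priors, a small illustrative price grid, and scipy as the solver (all illustrative assumptions; the rounding step by decreasing price is omitted):

    import numpy as np
    from scipy.optimize import linprog

    prices = [2.0, 5.0, 8.0]
    cdfs = [lambda v: min(v / 10.0, 1.0),   # buyer 1: value ~ Uniform[0, 10]
            lambda v: min(v / 20.0, 1.0)]   # buyer 2: value ~ Uniform[0, 20]
    K = 1  # inventory, enforced only in expectation

    n, m = len(cdfs), len(prices)
    sale = np.array([[1.0 - cdfs[i](p) for p in prices] for i in range(n)])
    c = -(sale * np.array(prices)).flatten()   # linprog minimizes, so negate

    # Expected items sold <= K, plus at most one price offer per buyer.
    A_ub, b_ub = [sale.flatten()], [float(K)]
    for i in range(n):
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_ub.append(row)
        b_ub.append(1.0)

    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=(0, 1))
    print(res.x.reshape(n, m), -res.fun)  # offer probabilities, expected revenue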
However, for expected utility, I can still write that constraint -- sell less than K items in
expectation; that's a linear constraint. What's not linear, or even concave, is the objective. The
objective is nothing like this. It depends on whether XIJ happens -- whether I offer
price PJ to buyer I -- but also on whether simultaneously some (I prime, J prime) happens.
Because in that case I'll be getting PJ and PJ prime together, or not.
So it really depends on the realizations. Even though the utility
function is concave, the expected utility is neither convex nor concave. It ends up having
products of the form XIJ times XI prime J prime -- it's not good.
So now there are two ways to go from this point. One is, now that we know there exists a 1 minus
1 over e approximate SPM, let's look at the pricing problem as the goal, and
just go for an approximation scheme, a PTAS.
We have done that. That one essentially goes through suitably discretizing the
distributions and going through sequences, but it's significantly messier, I would say.
But here I would actually give some more structural results that will give me a
simpler process, a simpler algorithm.
So what we have said is that if every price distribution was preserved from the optimal
mechanism, the utility-optimal mechanism -- if I had a sequential pricing with exactly the
same price distribution for each buyer -- I'd be good. That does give me a 1 minus 1 over e
approximation.
I cannot be guessing all the distributions precisely. That's too many. Right? That's not by any
means an approximation scheme or anything.
But I can still guess a few numbers. And this is what the technical [inaudible] will go
towards. What I'm going to guess is the total sale probability at each price.
So what is the total sale probability? Imagine that I am offering price PJ to buyer I with some
probability, and he accepts with some probability [inaudible] the price. That is
going to be my sale probability from buyer I, the probability that I actually extract PJ from
buyer I.
Now I sum it over all I. Okay. What is the probability that buyer I pays me PJ? What is the
probability that buyer I prime pays me PJ?
I'm going to sum this up over all buyers for a particular given price. And this is the sale
probability for that particular price.
Okay. Now suppose there are two mechanisms which match these numbers.
They don't match in the exact distributions, but they match in this coarser description.
That is, for each price, the total sale probability over all buyers for that price is the same. It's a
significantly coarser description of the mechanism. There will be many distinct mechanisms, or
even independent pricing mechanisms, which will satisfy it.
But let's say I have two mechanisms which just satisfy this coarse description. Now, again,
obviously the expected revenue is going to be the same, because each price, summed over all
buyers, is realized with the same probability. So the revenue is going to be the same.
The question is how close the utilities are. And our first immediate technical insight
here is that they are close. They are within a factor of 1 minus 1 over e of each other.
And the proof goes through what we call split and merge operations on the random variables.
It starts with the optimal mechanism, splits variables up into very tiny
variables, and then merges them back up to the second mechanism M prime. It starts with
M-OPT, splits things up, and then merges things back up to M prime.
Okay. And it can do that as long as M prime has the same total sale probability at each price,
losing only a factor of 1 minus 1 over e in the whole process.
So before proving this proximity result, how are we going to use it? Well,
we could guess. If we could guess the total sale probability of each price, then
we could essentially write those as feasibility constraints and get one of those mechanisms that
satisfies those sale probabilities.
I won't have any objective to optimize, simply because I can't write it, but I'm
going to guess those sale probabilities of the utility-optimal mechanism. If I guessed right, I
would find some mechanism that is feasible, that has the same sale probabilities. And
this proximity result would apply: I'd have a 1 minus 1 over e approximation.
However, again, that's still too much guesswork to do. Why? There are still too many
prices. I have to guess only one number for each price, not for each buyer and
price. But I still have to guess one number for each price, and there are too many prices. Still can't
do that. And I'll come back to it later.
So now we are going to bound the gap between M prime and M-OPT: the
utility-optimal mechanism and one that just matches it in the total sale probabilities.
So these are the two operations that we define -- and they may be useful for other stochastic
optimization problems as well -- split and merge. Again, as I said, negative correlation is
good, positive correlation is bad.
So here is a simple example of a split. Let
X be a random variable that takes the value 0, 1, or 2. Now let Y be a variable that is 1
with probability 1/3, matching essentially the probability that X is 1, and let Z capture the
instances where X was 2. That's the idea of it.
I just broke X into two variables, one for each value. If X had other values, say 3, 4, 5,
I would create a variable for each of them, matching the probability of that event.
Okay.
What is the difference? The variables that I'm creating, the children of this first variable,
are all two-valued, right, zero comma something. And all the children are now
independent.
So I've introduced independence. Before, they were negatively correlated: the children would
never happen together, because X can be either 1 or 2, never both. Things were not
getting mixed up. The moment you make them independent, there will be occasions where Y
turns out to be 1 and Z also turns out to be 2.
So what we have done is we've gone from perfect negative correlation
to [inaudible] independence. And this is called a split. The opposite of this
is called a merge. If I had variables like Y and Z, whose success
probabilities add up to at most one, I could just merge them. I
could just say, oh, I'll have only one variable: if I were given Y and Z, I would
just say my merged variable is X. So that's the opposite way to go.
And, again, the expectation is preserved by these operations. What happens is the utility
increases when variables are merged -- as when Y and Z went back to X, your utility goes
up -- and otherwise it goes down.
And slightly more complex, which I won't get into, is that this holds even with the hard constraint:
instead of U of the sum -- I'm just writing them as U of Y plus Z -- if I had the
constraint that I can't have more than K nonzero variables, [inaudible] I take the sum of only the
K largest variables, merging is still a positive thing to do. Merging still increases utility.
And I've just given an example with two variables, but imagine that you had plenty of other
variables in the sum: instead of E of U of X, you have X1 plus X2 plus X3 plus X4. Imagine
keeping the others the same, just splitting one variable and looking at the effect. Okay. So
splitting decreases utility, merging increases utility.
But how bad can splitting get? Splitting essentially creates independent variables. The correlation
gap is going to tell us, well, it can't get too bad: even if things are split into extremely
small independent pieces, you can see the parent variables as a
correlated version of the split variables, and the independent version can't be much worse. That's
the correlation gap.
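A small Monte Carlo illustration of a split, assuming the capped utility U(x) = min(x, 2) and the 0/1/2 variable from the example:

    import random

    random.seed(0)
    U = lambda x: min(x, 2.0)
    TRIALS = 200000

    def draw_X():  # parent: 1 w.p. 1/3, 2 w.p. 1/3, else 0
        r = random.random()
        return 1.0 if r < 1 / 3 else (2.0 if r < 2 / 3 else 0.0)

    parent = sum(U(draw_X()) for _ in range(TRIALS)) / TRIALS

    # Split children Y and Z, drawn independently with the same marginals.
    split = 0.0
    for _ in range(TRIALS):
        y = 1.0 if random.random() < 1 / 3 else 0.0
        z = 2.0 if random.random() < 1 / 3 else 0.0
        split += U(y + z)
    split /= TRIALS

    print(parent, split)  # E[Y + Z] matches E[X], but E[U] drops after the split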
So here is my morphing argument. Start with N random variables, which are the payments
from the buyers in M-OPT. In the optimal mechanism, what does buyer I pay me? That's my
variable XI. It's a random variable.
These variables are of course correlated. But now what I do is split these variables, as the first
step, as I did for X above. I split each into essentially Bernoullis -- not 0-1 Bernoullis but
two-valued variables, zero comma some positive value, with some probability.
I do the splitting. Okay. And I don't only split them by value. Imagine that I ended up
with Y being 1 with probability 1/3. I could do one more split: I could break Y into Y1
and Y2, both being 1 with probability 1/6 each. Right? I still maintain the expectation.
This is also an example of a split. What I've done is reduce the probability of
each event. Right? And this also goes in the same direction: split decreases the utility, merge
increases the utility.
So I keep doing that as well. What I end up with is these infinitesimally small Bernoullis:
0 comma V Bernoullis, two-valued things, where the success probability, the
probability of the value being V, is infinitesimally small. I can make them as small as I like. I keep
making them smaller and smaller. This of course is a [inaudible].
But why am I doing this? The idea is that now that I've split them so much -- I started from
M-OPT and kept splitting -- the system has lost all texture, essentially.
If some price PJ happened with probability QJ,
all it means is that in my system of Bernoullis there are just these
0 comma PJ Bernoullis whose success probabilities essentially add up to
that QJ. That's the only thing that's remaining. All other description of the optimal
mechanism has gone away. Things are all independent.
Okay. And the expected utility of this mass of infinitesimal
Bernoullis is at least 1 minus 1 over e times that of the initial optimal mechanism. Okay?
So I've brought it down to this infinitesimal mass, and now I'm going to merge it as I feel like. I
can merge back in whatever way I please. So give me any M prime that matches those total
sale probabilities -- essentially the probability of getting PJ should be QJ. That part should
be preserved.
But as long as that is preserved, I can merge them back up to individual buyers as I wish. If
buyer I was giving PJ with some probability RIJ, I can mimic that. I can merge
them up in any way I like. And the entire merging process only increases my utility. So I
will stay within a factor of 1 minus 1 over e of OPT.
So this is the entire morphing process. Bring it down to a Bernoulli mass and merge it back up to
whatever you like.
Okay. So this proves that as long as you match the total sale
probabilities, you are within 1 minus 1 over e. Still, I have too many prices. I can't guess the
total sale probability of each price.
So this brings in the second insight, which is, well, guess the value of OPT. Let's say we know
OPT, the value of OPT; let's say that's guessed. There are some ways to sort of fairly
accurately guess it; a factor of 2 would be fine. And now classify the prices.
So what I want to say next is I'm not going to guess all these probabilities. I'm going to do
something simpler. I'm going to classify the set of prices as huge prices, large prices, and
small prices. Maybe it should have been large, medium, small, but: huge, large, and small. Okay?
So whatever --
>>: [inaudible]
>> Tanmoy Chakraborty: Hmm?
>>: [inaudible] grande.
>> Tanmoy Chakraborty: Yes, yes. Where the smallest one's name would mean large in Latin -- or
Spanish, right? Yeah.
So what are huge prices? Huge prices are those above U inverse of OPT over epsilon. What that
means is that if I hit one of those, they're so big that I immediately hit a utility of OPT over
epsilon in that particular realization. I don't care what I get from the other people in that
realization. I just leave the market and count myself happy. I can do that.
This happens with only epsilon probability, by Markov's inequality: I can't be getting
more than OPT over epsilon with probability more than epsilon. Right?
So essentially what I can say is that whenever one of these big guys is contributing,
only that one big guy is contributing, nobody else is. I can say that. Right? And what is the
good thing about only one guy contributing? I can essentially say whatever he contributes, if he
contributes P, I get U of P, because nobody else contributed. So that's the good thing about
saying only one guy contributed, and that's my intuition behind why I want to define huge prices.
And then there are large prices, which are less than huge, obviously bigger than small, and
their definition is a little tricky. The large prices start from
wherever the huge prices end -- U inverse of OPT over epsilon -- anything below that, going down
until the sum of their sale probabilities has reached 1 over epsilon to the 4.
So poly 1 over epsilon.
Why am I defining it that way? It essentially means that if the sum of sale probabilities of these
prices has added up to some poly 1 over epsilon to the 4, then it's almost guaranteed
that I'm getting a good number of sales here. I have already got a fair number of
sales at large prices. Essentially, with probability 1 minus epsilon, I will get a large
contribution from the large prices. And if I got a lot of contribution from the large prices,
what I can say is that my contribution from the small prices doesn't really depend
on its details. Only the expectation matters.
So essentially, if I already got a big number, concentration bounds kick in. If the small
prices still have to make a significant contribution -- the large prices have already given me pretty
good money, so if the small prices still have to give me a significant
contribution that I cannot neglect -- that means I'm getting a lot of small prices. And if I'm
getting a lot of small prices, I should have concentration. I should always be pretty close to the
expectation.
And this is what the equation says. For the large prices, as for the huge, I can essentially
consider it as getting that utility U of PI. For the small prices, I'm just going to look at the
expectation -- it has moved into the summation, because the summation is pretty robust. This is
almost a constant; it is highly concentrated around its expectation, for the small prices.
And so this is essentially what it says. The contribution of huge prices can be linearized -- the first
term -- and for the small prices only the expectation matters, because they're concentrated.
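A rough sketch of this classification step; the function names and the exact thresholds here are illustrative assumptions, not the paper's definitions:

    def classify_prices(prices, sale_prob, U_inverse, opt_guess, eps):
        # sale_prob(p): guessed total sale probability of price p over all buyers.
        huge, large, small = [], [], []
        mass = 0.0
        for p in sorted(prices, reverse=True):
            if p >= U_inverse(opt_guess / eps):
                huge.append(p)       # contribution linearizes: one huge sale a day
            elif mass < eps ** -4:
                large.append(p)      # guess these sale probabilities explicitly
                mass += sale_prob(p)
            else:
                small.append(p)      # concentrated: only the expectation matters
        return huge, large, small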
So the only things that I need to guess are R and H. R is the expected revenue from the
small prices; H is the expected utility from the huge prices. I guess these two numbers.
But other than that, I only need to guess the sale probabilities of the large prices. And the
large prices came from a pretty small set. They were essentially multiples at epsilon to the 5
resolution, so again some poly 1 over epsilon choices.
And they can be guessed at poly 1 over epsilon resolution, which now suffices. So
this essentially becomes sort of a 2 to the 1 over
epsilon to the 7 kind of algorithm.
And for each of them there's an [inaudible] that's getting solved for feasibility. Once I find a
feasible solution for the [inaudible] QJ, that is going to be my
mechanism.
This principle also extends to unit demand, where the items are essentially nonidentical. It's the
same as multi-unit except the items are now nonidentical. And what we're essentially solving is a
stochastic packing problem, so we hope that these
techniques can extend to risk aversion in other stochastic packing problems.
And ideally -- something we are still thinking about and working on -- we have really cut the
guessing space quite a bit. We are now guessing only the total
sale probabilities of the large prices, but that's still guesswork. We are still unable to write a
concave -- convex optimization problem. And that would be the ideal goal: to somehow morph this
nonlinear stochastic optimization problem into a concave
optimization problem that gives a good approximation.
Thank you.
[applause]
>> Nikhil Devanur Rangarajan: Questions?
>>: So this Myerson algorithm, is this a sequential?
>> Tanmoy Chakraborty: No.
>>: You look at all the prices in Myerson.
>> Tanmoy Chakraborty: Yes.
>>: And is this one where you have each --
>> Tanmoy Chakraborty: You do --
>>: The item at the largest price [inaudible]?
>> Tanmoy Chakraborty: Yeah. Except that would be something called
VCG, which maximizes welfare.
>>: Right.
>> Tanmoy Chakraborty: For revenue, Myerson designs a different auction where they end up with
something called virtual value. So every value V gets mapped to a virtual
value, as a function of V and the buyer's distribution.
And then with the virtual values he does exactly the same: give it to the guy whose virtual value
qualifies him to get it.
>>: This also requires the [inaudible] this doesn't require a prior, right?
>> Tanmoy Chakraborty: It needs it. For revenue it needs a prior.
>>: [inaudible] depends on the prior.
>> Tanmoy Chakraborty: Yes. For revenue you cannot do it without -- you cannot optimize
without priors.
>>: You could do this VCG thing.
>> Tanmoy Chakraborty: Yeah. That only maximizes welfare. It can be far away from
revenue. Yeah.
>>: [inaudible] know the people in the airline business who have huge prices for first class and
then some large prices for business class and then small prices for coach, and they try to [inaudible]
approach breaking even.
>> Tanmoy Chakraborty: Hmm. No, I have not. But, yeah, there may be some
similar ideas going on there. Small prices will likely have concentration. That's, I guess, the idea
here.
>>: So more natural is when buyers have [inaudible].
>> Tanmoy Chakraborty: Yes. That's one of the things. Our mechanisms are okay
with that. That's --
>>: [inaudible]
>> Tanmoy Chakraborty: Dominant strategy incentive compatibility handles such buyers. Maskin
and Riley have a very old paper where they consider risk averse buyers. There they assume that
every buyer has the same risk aversion function, and they essentially analyze it with regard to that.
The trouble with that is of course even if you know that buyers are risk averse, how does a seller
know how risk averse the buyers are. So that is a -- that seems to me like a major assumption of
knowledge.
>>: So is there any reason you can't provide the buyers with a truthful mechanism and not
commit yourself to a particular prior? Because, in reality, in many situations, you want to update
your priors as you see what happens.
>> Tanmoy Chakraborty: So this is actually one round of a game, and incentive compatibility
is being offered for just one auction. And if you update --
>>: I understand that. I understand that. If you're playing many rounds, you can [inaudible] a
prior each time.
>> Tanmoy Chakraborty: Yes.
>>: Nonetheless, in many situations in real life there aren't [inaudible] there's one round.
>> Tanmoy Chakraborty: Yeah.
>>: And the information you get from the bids is very valuable information which you're
throwing away.
>> Tanmoy Chakraborty: Yes.
>>: [inaudible] my question is do you have to throw them away, or is it just that [inaudible].
>> Tanmoy Chakraborty: Here the priors are different for different buyers. This is
nonhierarchical. They are different distributions [inaudible].
>>: Why do you say [inaudible] you could use priors kind of use [inaudible]?
>>: Yeah, I mean, you don't have to make that assumption. You don't have to assume they're all
i.i.d., you just have to assume you're gaining information from one bid to the next
[inaudible].
>> Tanmoy Chakraborty: And your intuition is pretty spot on. Except that the math has often
ended up saying that you cannot do much better in expected revenue by using the values as
opposed to just using the prior.
So this is true even for expected revenue. Myerson maximizes expected revenue, but
essentially one of the prior results says, well, in a multi-unit auction, if you are selling K items and
K is pretty large, then throw away all the values and just create prices as a function
of the priors and offer them, and you're essentially within a very close factor of Myerson, like
1 minus 1 over square root of K of Myerson. So it simply approaches Myerson, and the values do
not matter much.
>>: Of course in real life in your problem you could -- you can presumably always check with
Myerson [inaudible] and then go with Myerson if you don't [inaudible].
>> Tanmoy Chakraborty: Yes. Well, Myerson also needs priors. But, yeah, you could run
Myerson and look at its expected utility.
>>: Because, as you point out, you need kind of wild priors for -- to make it worth sacrificing
[inaudible].
>> Tanmoy Chakraborty: Exactly. Exactly. Exactly.
>>: Going back to this predictability, that would be I think part [inaudible] robust [inaudible].
>> Tanmoy Chakraborty: Yes. [inaudible] technically this is not addressing the exact same
question, but it's coming from the same angle. Because priors being very
wide -- it's a standard qualitative [inaudible] but it has a relation to predictability.
Having a very wide prior means I have very little idea of what's actually about to happen. That's
why essentially you need to hedge against everything.
But, yeah, a robust [inaudible] more direct definition of that. Of course the definition there
also varies, as we have discussed.
>> Nikhil Devanur Rangarajan: Okay. Let's thank [inaudible].
[applause]