
>> Nikhil Devanur Rangarajan: Hi. It is my pleasure to introduce Zhiyi Huang from the University of Pennsylvania. Zhiyi may be familiar to some of you since he's interned here with the theory group twice, actually. And Zhiyi is also a recipient of the Simons Graduate Fellowship, which, believe it or not, is somewhat more prestigious than being an intern here twice.
So he's going to tell us about how to compute over private data.
>> Zhiyi Huang: Okay. Thanks, Nikhil, and it's a pleasure to be back.
So, yeah, what I'm going to talk about today is about computation over private data. And I
realize there's some talk about data and privacy in the building today, so part of my talk will fit
well into that theme.
So the plan is to first go over the background, define what I mean by computation over private data and what specific challenges we are facing in this problem. Then I'll talk about what my coauthors and I have done on this topic. And finally I'll wrap up by listing a few directions that I'm keen on pursuing in the future.
Okay. So let me get started. I would like to start with a brief revisit of what is computation and
what is algorithm. So abstractly, each computational problem can be defined with a set of
feasible outcomes, some input data, and the objective function which we use to measure how
good each outcome is, right?
So, for example, if you think about max-weight matching, then the set of feasible outcomes are
the set of matchings in a graph, the input data are the weight of the edges, and the objective
function is simply the total weight of all the edges in the matching.
Okay. Now, given such a computational problem, an algorithm can be viewed as an input/output
interface where the algorithm designer carefully chooses some machinery in the middle which takes the data as input and then chooses the outcome from the feasible range accordingly to optimize the objective.
However, in this model there's a big assumption, that is, all the necessary input data are given to the algorithm designer for free. Is that really the case? In fact, for many computational problems in the modern world, especially those that arise from the Internet or electronic commerce,
that's quite often not the case because these problems usually rely on the private data held by
self-interested agents as their input.
So the picture looks more like this where the algorithm designer needs to first gather the
information from a bunch of agents, say, Alice, Bob, and Charlie, as in this graph. And then
depending on what these agents report, which may or may not be the true underlying data, the algorithm designer needs to choose the outcome from the feasible range.
And as a result it's natural to ask: can we design the algorithm in some specific way, maybe sometimes with the help of an appropriate payment scheme, such that we can convince the agents to share their true data?
So to make a distinction with algorithms in the more traditional environment, I would like to refer to algorithms in the presence of private data and self-interested agents as mechanisms, which is the standard terminology in the literature.
So in order to design a good mechanism, one needs to take into account not only the usual limitations on computational power but also some new challenges imposed by the social, economic, or personal considerations of the agents.
So to motivate what specific new challenges we are facing, I'd like to talk about two illustrative problems. The first illustrative problem is about allocating one resource to a set of agents, say allocating a new iPhone 5 to one of Alice, Bob, and Charlie, and we would like to allocate the iPhone to the agent with the highest value for the phone.
So what we might want to do is to run the famous Vickrey auction or the second-price auction.
We first let the agents submit bids; different agents might have different values for the iPhone. Then, depending on what their bids are, we allocate the iPhone to the highest bidder, Alice in this case, and let the winner, Alice, pay the second highest bid, $199.
And the Vickrey auction has many nice properties. First of all, it encourages the agents to share their true valuations, in the sense that doing so always maximizes their utility, defined to be the value for the iPhone if the agent gets it minus the payment they need to make.
So clearly Alice has no reason to lie, because she's getting the iPhone and there's nothing she can do in terms of lowering the price, since that depends on the second highest bid. And Bob and Charlie also have no incentive to lie to try to win the iPhone, because in order to do so they would need to pay a price that's higher than their value.
And, secondly, the Vickrey auction maximizes the social welfare, which is defined to be the sum of the agents' values for the outcome. And in this specific case it's simply the value of the agent who gets the iPhone.
So by definition we are allocating to the highest bidder, and by the fact that we are encouraging the agents to tell the truth, we are maximizing social welfare.
And finally the Vickrey auction has a very simple format. It can be implemented in essentially linear time, because all we need to do is find the highest bid and the second highest bid.
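Just to make that concrete, here is a minimal sketch of the second-price rule in Python; the specific bid amounts are hypothetical, chosen to match the example where Alice wins and pays $199.

```python
def vickrey_auction(bids):
    """Second-price (Vickrey) auction: the highest bidder wins and pays
    the second-highest bid; everyone else pays nothing."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

# Hypothetical bids: Alice wins and pays Bob's bid of $199.
print(vickrey_auction({"Alice": 250, "Bob": 199, "Charlie": 150}))  # ('Alice', 199)
```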
Now, so far so good. But that's only for allocating one resource. More generally we would like to be able to handle multiple resources, and maybe more complicated scenarios where agents have combinatorial valuations over subsets of resources, in the sense that the value for a subset of items may not simply be the sum of the values for the individual items, right?
So this is called the combinatorial auction problem, arguably the most studied problem in the literature. And it also captures many resource allocation problems that arise in practice, for example,
the auctions for advertisement slots or the FCC spectrum auction between the government and
the companies.
And of course this is a very classic economic problem. It's very well studied by economists. One of the solutions they propose is that we should always allocate the resources so as to maximize social welfare. And if we do so, there exists a payment scheme that will encourage the agents to share their true values. This is called the VCG auction.
Unfortunately we also know that maximizing social welfare is NP-hard in general and therefore
implementing VCG is NP-hard in general. There's work left to be done.
So for these kinds of resource allocation problems, other than the limitation on computational power, we also face a new challenge of the game-theoretic constraint, in the sense that each agent has some utility that is directly generated from the outcome of the mechanism, and therefore if lying or deciding not to participate improves their utility, they may very well do so.
And this game-theoretic constraint has received a lot of attention from the theory community over the past decade and led to this very exciting field of algorithmic mechanism design, which studies how to design mechanisms that run in polynomial time and take care of this game-theoretic constraint.
And the solution concept in this literature is to restrict our attention to truthful mechanisms, which always encourage the agents to tell the truth by making sure that telling the truth maximizes their expected utility.
And the central question in algorithmic mechanism design is how to design computationally efficient and truthful mechanisms with good social welfare guarantees.
Now, as a quick remark, there also exist other interesting objectives in mechanism design, such as maximizing revenue or ensuring some notion of fairness. But for the purpose of this talk, I'll focus on social welfare maximization, which is the most studied objective in the literature.
Okay. So that's the first constraint, the game-theoretic constraint. Now let me move on to the second illustrative problem. Suppose we want to release the average salary of all the employees in a company, say Microsoft. Of course it is easy to compute the average.
But if we release the exact answer, that might be problematic because by comparing the average
salary before and after Bob joined the company, the adversary will be able to learn exactly what
Bob's salary is, which is considered to be sensitive personal information that should not be revealed to the public.
So what we may want to do is to release a noisy answer by running the Laplace mechanism. So
we will first let a trusted third party called the curator solicit all the salaries of the agents, and then we compute the average, sample noise from the Laplace distribution, which is essentially the exponential distribution mirrored about the Y axis, add the noise, and release the noisy answer. Okay?
Now, the point is, if the adversary again tries to compare the noisy answers before and after Bob joined the company, then the estimate he gets has an error that is roughly N times the noise we added to the average answer, where N is the number of agents. So when the number of agents is large, there's hope that we can add very little noise in terms of the average answer while effectively putting a lot of noise into that estimate.
So, in other words, we protect the privacy of each individual agent's salary while providing
reasonably accurate answer in terms of this average salary.
And finally it also has a very simple form and can be implemented in a computationally efficient
manner. Okay?
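As a rough sketch of what the curator could run, assuming salaries lie in a known range [0, max_salary] (so the average has sensitivity max_salary / N), the Laplace mechanism for this query might look like:

```python
import numpy as np

def noisy_average_salary(salaries, epsilon, max_salary):
    """Laplace mechanism for the average: changing one person's salary
    (bounded by max_salary) moves the average by at most max_salary / n,
    so Laplace noise with scale (max_salary / n) / epsilon hides that change."""
    n = len(salaries)
    sensitivity = max_salary / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(np.mean(salaries)) + noise
```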
>>: So in the example, when you're trying to [inaudible] this is the error you would add to the average [inaudible] or to --
>> Zhiyi Huang: To the average [inaudible].
>>: Seems that -- seems that in order to get -- to hide most salary you would just need to add the error of -- scaled down by the number of people, so this --
>> Zhiyi Huang: Yes. So --
>>: So this is like [inaudible] huge error [inaudible].
>> Zhiyi Huang: So in this illustrative example, I'm only writing like three agents here. But in general we will assume there are a lot of agents --
>>: All right, [inaudible] describe an example of a large company, okay, I understand that.
>> Zhiyi Huang: Right. Yeah. Yes. Yes.
>>: Okay.
>> Zhiyi Huang: Yes. You are making a very good point that if there are only three agents in the picture, then probably it won't be very private for the agents, because essentially each agent has a substantial contribution to the average. But in general we would like to consider a large number of agents. Okay? Very good point.
So okay. Now, again, that's only for releasing one numerical query, and in general we would like to be able to answer many, many queries about the same database. So we would like to consider a data universe which specifies the possible sensitive types of the agents, which might be their demographic information or their medical records or their salary and so on. And we'd like to consider a database that contains the sensitive types of N agents.
What we'd like to do is answer the predicate queries asked by data analysts, where each predicate query is specified by a predicate function mapping from the data universe to a real number between 0 and 1, and given such a query, the exact answer is the average value of this predicate function taken over all the elements in the database. Okay?
So here I'm abusing notation a little bit, because when we talk about a predicate function usually we mean a Boolean function, but here I'm allowing it to map to any real number in [0, 1].
So predicate queries capture many useful queries in practice. To give you some examples, if the data universe consists of numbers between 0 and 1, then we can ask about the mean or higher moments of the numbers in the database. If the data universe consists of Boolean strings, then we can ask what fraction of agents have sensitive types satisfying some conjunctive Boolean formula or general Boolean formula.
And if the data universe consists of all the points in some metric space, for example, the D-dimensional unit cube with the Euclidean distance, then we may ask a distance query, where each query is specified by a point in the metric space, and what we would like to learn is the average distance from the query point to the points in the database. And this kind of query might be useful if we want to pick a location for building a new facility and would like to learn how convenient this new location is in terms of the average distance to all the citizens in the database. Okay?
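Just to pin down the definition, here is a small sketch of the exact, non-private answer to a distance query under the Euclidean metric; the private mechanisms discussed later release a noisy or proxy version of this value rather than the exact one.

```python
import numpy as np

def distance_query(database, query_point):
    """Exact answer to a distance query: the average Euclidean distance
    from the query point to all points in the database."""
    data = np.asarray(database, dtype=float)
    q = np.asarray(query_point, dtype=float)
    return float(np.mean(np.linalg.norm(data - q, axis=1)))
```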
Now, for these kinds of query release problems, the agents have little or no utility directly generated from the outcome of the mechanism. But, nonetheless, they may still decide to lie about their private type or decide not to participate in the survey if the curator's answer will leak too much information about their sensitive type, because they may worry that the leaking of such information might hurt their utility in the future.
So this is the privacy constraint, and it leads to the very fruitful field of differential privacy, which studies how to design mechanisms that run in polynomial time and take care of this privacy constraint. And the solution concept is to consider differentially private mechanisms, whose outcome distribution is, roughly speaking, insensitive to the change of one agent's private type. Okay?
And more formally, by insensitive what we mean is the following: suppose we fix the types of all other agents except Agent I; then no matter what Agent I reports, the probability that we choose any specific outcome changes by no more than an e-to-the-epsilon factor, for some small constant epsilon. Okay?
Essentially it's saying that the infinity divergence between the outcome distributions for two neighboring databases is bounded by epsilon. Okay?
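In the standard notation, this condition is usually written as follows: for every outcome o and every pair of databases D and D' differing in one agent's type, Pr[M(D) = o] <= e^epsilon * Pr[M(D') = o], where M is the mechanism.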
So given the definition, the essential question in differential privacy is to study how to design computationally efficient and private mechanisms that can provide accurate answers for the queries asked by the analysts about the database. Okay?
Okay. Now, so far I have defined what is computation over private data, and I have specified
two challenges, the game-theoretic constraint and the privacy constraint. So before I move on to
the technical part, is there any question about...
Okay. So now let me talk about what my coauthors and I have done. It will consist of three parts. In the first part I'll focus purely on the game-theoretic constraint. I'll talk about how to solve the social welfare maximization problem via the black box reduction technique. In the second part I'll focus purely on the privacy constraint, and I'll talk about how to design mechanisms for answering distance queries. And finally I'll briefly go over our results on how to design mechanisms that can handle both constraints simultaneously.
So for the first part I'll get into a little more detail, and for the other two parts I'll be a little brief. Okay?
Okay. Now let me move on to the first part, the game-theoretic constraint. And let me remind
you the central question is how to design computationally efficient and truthful mechanisms with good social welfare guarantees.
Now, suppose we take away the truthfulness requirement for a second; then this is a well-defined optimization problem. And as computer scientists, we are trained to design fast algorithms for solving optimization problems either exactly or approximately.
And indeed, computer scientists have developed many powerful tools for solving optimization problems in polynomial time.
Now, if we put back the truthfulness requirement, then the problem of designing computationally efficient and truthful mechanisms, despite much exciting progress over the past few years, remains a much, much less understood topic. So it is natural to ask: can we reduce the less understood mechanism design problem to the better understood algorithm design problem?
Now, this question has motivated me, among many other researchers, to look into the following sort of Holy Grail for algorithm and mechanism design, which asks: can we convert any algorithm into a truthful mechanism with essentially the same social welfare guarantee, while the running time of the mechanism is no more than a polynomial times the running time of the algorithm?
If we can give an affirmative answer to this question, that would be great, because that means there's
some machinery that will automatically take care of the game-theoretic part and we can simply
focus on the optimization part, which is a much more familiar terrain for computer scientists.
And our main contribution is that for many problems the answer is yes, there exists such a reduction. In particular we show that for all problems in the Bayesian setting, which is the standard economic setting, the answer is yes. This is joint work with Xiaohui Bei.
And of course as computer scientists we are also interested in the prior-free setting and worst-case analysis, and we looked into that as well.
So we show that for one subclass, namely all single-dimensional and symmetric problems in the prior-free setting, the answer is yes, there exists such a reduction.
So here single-dimensional means the private valuation of an agent can be written in the simple form of a single real number, and symmetric means the set of feasible outcomes is symmetric: pick any feasible outcome, then no matter how we permute the identities of the agents, it will remain feasible. So these are arguably relatively natural restrictions to add.
So that's our result. Due to time constraints, I won't be able to talk about the technical details of the prior-free setting, but I will go over how we design the black box reduction for the Bayesian setting. Okay?
Okay. Now let me move on to the Bayesian setting. Let me be more specific about our setting.
So, again, there's a set of feasible outcomes and N agents, and each agent has a private valuation function mapping from the set of feasible outcomes to real numbers between 0 and 1, which specifies how much each agent likes each outcome.
And we will make the Bayesian assumption, in the sense that we assume this valuation VI is drawn from some distribution FI which is publicly known and agreed upon across the different agents. Okay?
And for the sake of presentation, I'll think about a somewhat simplified setting. Imagine there's a finite list of possible valuations each agent could have, and the prior distribution simply says that each of these valuations is equally likely to be realized as the true valuation of the agent. Okay? It's a uniform distribution.
Now, for this setting, what we show is that any algorithm can be converted into a truthful mechanism by using payments, with no loss in social welfare and a polynomial overhead in running time. And the notion of truthfulness we consider here is that telling the truth maximizes each agent's expected utility, where the expectation is over the randomness of the mechanism and also the random realization of the other agents' types, assuming the other agents tell the truth. Okay. It's a Bayes-Nash equilibrium.
Okay? So now let me first tell you about the basic framework of our black box reduction. So,
again, the algorithm can be viewed as an input/output interface where in this case the inputs are simply the reported valuations of the agents.
What we would like to do is decouple the valuations reported by the agents and the valuations we use as input for the algorithm, by using some carefully designed perturbation algorithms sigma I, one for each agent. And what we will do is use the perturbed valuations as input for the algorithm and then use what the algorithm outputs as the output of the mechanism.
So in this basic framework I haven't talked about payments yet. And that's because, by a relatively standard technique in mechanism design, once we fix the outcome of the mechanism, the payments can be derived automatically. So I ignore the payments in this framework, but I'll precisely define what the payments are when I get to the particulars.
So given this framework, it remains to decide how to design these sigma I's, right? In particular I would like to design the sigmas such that they satisfy three properties. The first property I would like to impose is stationarity, in the sense that if the input to some sigma I follows the uniform distribution over the support, then the output sigma I of VI also follows the uniform distribution over the same support.
So, in other words, distribution-wise, this perturbation algorithm is not doing anything, okay? And the reason we'd like to impose the stationarity constraint is that it will allow us to decouple the correlation across different agents, in the sense that from Agent 1's viewpoint, now it is as if the perturbations for the other agents do not exist, right? So that will allow us to focus on the problem of designing one perturbation algorithm subject to this stationarity constraint.
Now, subject to the stationarity constraint, I would like to make sure the social welfare does not decrease, and in particular we will make sure the expected valuation of each individual agent does not decrease.
And finally we would like to impose the truthfulness into the picture. So these are the three
goals in designing the sigma Is.
Now, let me go to the particulars of the sigma Is. So the first observation is that there's a natural
correspondence between stationary perturbation algorithms and bipartite perfect matching in the
following graph.
I would like to associate every possible valuation in the support with two vertices in the bipartite graph: one on the left-hand side, which we call the replicas, and one on the right-hand side, which we call the surrogates.
Essentially the replicas correspond to the reported types of the agents and the surrogates correspond to the perturbed types output by the perturbation algorithm.
Now, suppose we have a perfect matching in this bipartite graph. I claim that it naturally corresponds to a deterministic and stationary perturbation algorithm, in the sense that given any reported type I can look at the replicas, find the corresponding replica, see which surrogate it gets matched to, and output that surrogate type, right?
And since this is a perfect matching, if the input is the uniform distribution over the left-hand side vertices, then the output is the uniform distribution over the right-hand side vertices, so it will be stationary.
And it is actually not difficult to show this is a 1-to-1 correspondence. So it remains to decide which perfect matching to use.
And in order to do that, I would like to introduce the following interpretation. I would like to interpret the replicas as virtual agents and the surrogates as virtual items, in the sense that, from Agent I's viewpoint, each surrogate essentially corresponds to a distribution over outcomes, sort of a lottery, because once we fix the perturbed type we use for Agent I, that gives a well-defined distribution over feasible outcomes, over the randomness of the algorithm and also over the random realization of the other agents' types.
So given this interpretation, we can talk about the expected valuation of a type-T virtual agent for a type-T-prime virtual item. We'll define this value, or the weight of the edge between T and T prime, to be the expected value for Agent I if Agent I has type T and we use T prime as the perturbed type. Okay?
So essentially what we are doing is using the Bayesian assumption to create a virtual interface, a virtual market for the agent, such that from each agent's viewpoint it is as if the agent is really competing in this virtual market, in which instead of competing with the other agents, the agent is competing with other possible realizations of his own type. Okay?
And moreover, in this virtual market, we have the simple structure of a matching market, for which we know how to solve the social welfare optimization problem exactly. So we can simply run VCG in this virtual market.
>>: I'm sorry, what's A?
>> Zhiyi Huang: A is the algorithm that we are given as a black box. So yes. So this expectation will be over the randomness of A, the random coin flips of A, and the random realization of V minus I. Okay?
>>: [inaudible].
>> Zhiyi Huang: Okay. So --
>>: [inaudible] okay.
>> Zhiyi Huang: Right.
>>: So [inaudible] the T's here are using --
>> Zhiyi Huang: The T is the --
>>: T -- so the T's -- the T is T1, T2 --
>> Zhiyi Huang: The T's are the possible types or possible valuations of Agent I. And T1, T2, ..., TK are all the possible valuations in the discrete support.
>>: Are these -- these are the numbers, so the TIs are --
>>: Valuation functions.
>> Zhiyi Huang: Valuation functions. It could be a multidimensional vector.
>>: [inaudible] the number.
>>: Okay.
>>: But what about VI, are they functions or are they not?
>> Zhiyi Huang: The VIs are also functions. V minus I is the collection of the valuation functions of all agents except Agent I.
>>: So there's -- that is [inaudible] as well.
>> Zhiyi Huang: Yes. Yes.
>>: So V [inaudible] different set of the V and T are the same space --
>> Zhiyi Huang: Yes. I'm using T to specify sort of all the possible valuations of one specific agent, Agent I. But, yeah, they are from the same space, yes. Okay?
So, again, so we have created this virtual market that has the matching market structure and we
would like to run VCG.
So precisely what we mean is that we'll find the max-weight matching with respect to the weight
I just defined. We will use the stationary perturbation algorithm corresponding to that
max-weight matching. And then we will charge the agent a price that's equal to the VCG price
in this virtual market. Okay?
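As a rough sketch of this replica-surrogate step for a single agent with a uniform prior over K types (the VCG payments from the virtual market are omitted), one could estimate the edge weights by sampling and then compute the max-weight perfect matching. The names and parameters below are hypothetical, and the black-box algorithm and the sampler for the other agents' types are assumed to be given.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def surrogate_type(i, reported_type, types, algorithm, sample_others, num_samples=100):
    """Replica-surrogate reduction for Agent i with a uniform prior over `types`.
    Estimates the weight of edge (t, s): the expected value of a type-t replica
    for the lottery induced by feeding type s into the black-box algorithm,
    then uses the max-weight perfect matching as the stationary perturbation."""
    K = len(types)
    weight = np.zeros((K, K))
    for s in range(K):
        for _ in range(num_samples):
            others = sample_others()                      # valuations of agents != i
            outcome = algorithm(others[:i] + [types[s]] + others[i:])
            for t in range(K):
                weight[t, s] += types[t](outcome) / num_samples
    rows, cols = linear_sum_assignment(weight, maximize=True)
    matching = dict(zip(rows, cols))
    return types[matching[types.index(reported_type)]]
```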
Now, let me show that we have achieved all three properties that we're aiming for. So stationarity is easy. Again, since we are picking one perfect matching and using the corresponding perturbation algorithm, that's stationary: the uniform distribution will be mapped to the uniform distribution.
In terms of social welfare, it's actually not difficult to show that the expected value of the agents subject to this perturbation is proportional to the social welfare in this virtual market, because if you think about the weight of the edge that we picked incident on type T, that equals the expected value for Agent I conditioned on his true value being T.
So when we sum them up, it's really the expectation, the expected value of the agent scaled by K.
Okay?
Now, the algorithm is essentially using the naive identity matching: what's in equals what's out. And the mechanism is doing something more clever by using the max-weight matching, and therefore the social welfare in this virtual market can only increase. As a result the expected value of the agent in the original market can only increase. And since that holds for all agents, the social welfare in the original market can only increase.
And finally we have imposed truthfulness because from the agent's viewpoint it is as if he is
really participating in this virtual market. And since we are running VCG in the virtual market,
we ensure truthfulness for the agent.
Okay? So that's essentially the whole proof idea modulo some details that will allow us to
generalize to more general distributions. Okay?
All right. Now, let me wrap up the black box reduction part by giving you a summary of what
has happened and where our result stands.
So for the Bayesian setting, our work is motivated by this very nice work by Hartline and Lucier that solves the problem in the restricted single-dimensional setting, and we have extended it to the multidimensional setting. And our result has also been independently discovered by Hartline, Kleinberg, and Malekian.
In the prior-free setting, progress is more incremental, in the sense that the positive results are also for restricted subclasses of problems, and we solve one relatively general subclass, the symmetric and single-dimensional problems.
And moreover, things have to be incremental, in the sense that for both the single-dimensional and the multidimensional settings, there are impossibility results showing that it's impossible to get a general black box reduction that works for all problems. So we have to utilize specific structures of the problems to get positive results. Okay?
All right. So that wraps up the game-theoretic part. Now let me move on to the privacy part.
So for the privacy part, again the goal is to design efficient and private mechanisms that can provide accurate answers for the queries asked by the data analysts.
In this regard, there exists a very general positive result that allows us to answer K predicate queries with this error. So let me say a little bit about how to interpret this error.
So, first of all, the dependency on N is roughly 1 over root N, so the larger the number of agents is, the more accurate we can get. And since it's 1 over root N, it's roughly the same as the sampling error.
And, secondly, the dependency on the number of queries K is polylogarithmic, and therefore we can answer exponentially many queries while still having nontrivial, little-o of 1 accuracy.
Unfortunately this very positive result is inefficient: the best running time per query is linear in the size of the data universe, which can be exponential in the dimension of the data.
And moreover, that's not just because we're not creative enough; there exists strong evidence showing no efficient algorithm can privately and accurately answer more than quadratically many queries, if we insist on answering general predicate queries.
So given this very general but inefficient positive result and this very strong lower-bound result, it's natural to ask: are there interesting subclasses of predicate queries for which we can design efficient mechanisms that can privately answer much more than quadratically many queries, right?
And our main contribution is, again, to provide an affirmative answer to this question by showing that distance queries form one such subclass. And let me remind you, for distance queries the database consists of a bunch of points in a metric space, each query is again a point in the metric space, and what we would like to learn is the average distance from the query point to all the data points in the database. Okay?
So specifically, what we show is that there exists a query release mechanism whose running time per query is nearly linear in the size of the database, and we can privately answer an arbitrary number of queries with the following error. If the metric is L1 or L2, we can answer with a small, little-o of 1 additive error.
And if we are talking about an arbitrary metric, then in addition to the little-o of 1 additive error, we also lose a log K multiplicative distortion, where K is the number of queries.
Okay? So that's our result. Now, let me briefly sketch our approach. Yep.
>>: Additive error [inaudible] number of queries?
>> Zhiyi Huang: Yes. The additive error is independent of the number of queries. Essentially we will come up with sort of a proxy function, and then we will answer all the queries using the proxy function without further access to the database.
>>: And even if you knew the entire proxy function, you would still --
>> Zhiyi Huang: Exactly.
>>: [inaudible].
>> Zhiyi Huang: But, yeah, that's for the first result. For the second result we are in this batch model where we are given all the queries, and then we will design our mechanism utilizing the structure of the queries. For L1 and L2, we are in this online as well as offline model: we can come up with an offline proxy function that can answer all queries about a database. Okay?
So let me briefly sketch our approach. At a high level our approach depends on a nice relation between queries and learning algorithms that was established in a series of works. So essentially we can view the database as a function mapping from queries to answers. Okay? And then we can first use some learning algorithm to learn an approximate version of this function, called the proxy function, and then we will answer all the queries asked by the data analysts using the proxy function.
Now, in this picture, the only place we need to access the true database is in this learning algorithm. So suppose we have a very good learning algorithm that can learn this approximate function using only a few updates. That means we only need to access the true database a few times, and therefore the total privacy loss will be small. So in sum, a good learning algorithm with few updates implies a good query release mechanism with small privacy loss. Okay?
And in particular we will design such a learning algorithm directly for some privacy-friendly
metric, namely the L1 metric. So what we utilize is that for L1, the function that we need to
learn can be decomposed into a bunch of single-dimensional functions, and also these
single-dimensional functions are convex and Lipschitz continuous. And we will utilize all three
properties to design an efficient learning algorithm that only uses a few updates.
So let me skip the details of how to design this learning algorithm, but let me talk about how to handle arbitrary metrics.
Our approach is to reduce the problem to the problem for L1 via the metric embedding technique. So the high-level approach is that we'll first pick a low-distortion metric embedding from the given metric space to some L1 metric space, and then we will embed all the data points in the original database into a proxy database with respect to the L1 metric, and then we will run our mechanism for L1 over the proxy database, in the sense that we will embed any query using the same embedding, ask the embedded query of the proxy database, and get back the answer. Okay?
>>: So how many points -- so this is a general question, how many points [inaudible]?
>> Zhiyi Huang: So I'm embedding all the points that are in the original database. Yes.
>>: So say the [inaudible] grow within the number of points.
>> Zhiyi Huang: It's a case-by-case thing. In some nice cases, like L2 to L1, the embedding is universal. But yeah.
Okay. So -- okay. In this picture the accuracy analysis is easy. Essentially we lose a small
additive error due to running the L1 private mechanism and also a multiplicative distortion that
equals the distortion of the metric embedding. The tricky part is the privacy analysis because
although we are running a private mechanism over the proxy database, the embedding step itself
might leak information. So we want to avoid that. Right?
So what we observe is that in order to ensure privacy, it suffices to focus on low-sensitivity embeddings, in the sense that changing one data point only changes its own embedding and does not affect the embeddings of the other points. Okay?
So if we can ensure that, then changing one point in the original database will only change one point in the proxy database. Now, by the fact that we are running a private mechanism over the proxy database, that will not change the outcome distribution by too much. So that will ensure privacy.
But now it remains to show that there exist interesting low-distortion embeddings that have low sensitivity, right? So briefly speaking, from L2 to L1 there's a classic result based on random projection with distortion essentially arbitrarily close to 1, and since the embedding is independent of the data points, we get low sensitivity for free.
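A minimal sketch of that data-independent embedding from L2 into L1: for a standard Gaussian vector g, E|<x, g>| = sqrt(2/pi) * ||x||_2, so averaging |<x, g_j>| over many independent random directions approximates the L2 norm, and the projection matrix does not depend on the data points at all, which is exactly the low-sensitivity property. The number of directions m below is a hypothetical accuracy parameter.

```python
import numpy as np

def l2_to_l1_embedding(dim, m=200, seed=0):
    """Embed (R^dim, L2) into (R^m, L1) with a data-independent random projection.
    The scaling is chosen so that ||phi(x)||_1 approximates ||x||_2 in expectation."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((m, dim))
    scale = np.sqrt(np.pi / 2.0) / m
    return lambda x: scale * (G @ np.asarray(x, dtype=float))
```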
Now, when we go to an arbitrary metric, there's another classic result, Bourgain's theorem. But Bourgain's embedding heavily relies on the structure of the data points, and thus does not have low sensitivity. So the way we get around this problem is by observing that we don't need to preserve all pairwise distances; we only need to preserve the distances between query points and data points.
And in order to do so, it suffices to use a Bourgain embedding only with respect to the query points, and that's enough to ensure preserving all the distances between query points and data points, and therefore we get low sensitivity.
>>: And that's why you [inaudible] log K?
>> Zhiyi Huang: Exactly. Okay.
>>: [inaudible] what do you do with the data?
>> Zhiyi Huang: Sorry?
>>: What do you do with the data points? So you're interested in distances from the --
>> Zhiyi Huang: So I'm only interested in preserving the distances between query points and data points. It's okay for the distances among data points or the distances among query points to be highly distorted. And for this weaker notion of distortion guarantee, it suffices to only utilize the information about the queries. I -- I can just --
>>: You're embedding the data points also.
>> Zhiyi Huang: I'm embedding the data points also, but the design of the embedding function only depends on the query points.
>>: [inaudible] produce something as --
>> Zhiyi Huang: Yes. It's a variant -- it's a variant of Bourgain's theorem, but essentially we can follow the same proof structure with some minor technical twists that allow us to show this low --
>>: It's not a black box reduction.
>> Zhiyi Huang: It's not a black box reduction, but from the high level it's -- that's the idea.
Yeah.
Okay. So okay. Now let me move on to the final part, how to handle both constraints
simultaneously. So this line of approach -- this line of work is motivated by the fact that for
many mechanism design problems, not only the game-theoretic constraint is important, the
privacy constraint is also important.
For some problems, this is because the private valuations of the agents or the companies might be regarded as business secrets; they have devoted a lot of research into the market and so on to arrive at these secrets, and they don't want to reveal them to their competitors. Right?
And in some other settings, maybe we would like to protect the privacy of the valuation function
because this valuation might depend on other sensitive information about the agents.
For example, think about the combinatorial public project problem, where a government wants to choose a subset of public projects to invest in subject to some feasibility constraints, say we can invest in no more than K projects; then the agents' values might depend on their sensitive data. For example, if the projects are locations for building new hospitals, then these valuations might depend on their medical records. So there's a natural need to protect the privacy of the agents' values.
So the open question in this field before our work is: is it possible to design truthful and differentially private mechanisms with good social welfare guarantees? Again, we are focusing on maximizing social welfare.
And there's some previous work that gave some positive results either for achieving approximate
truthfulness or for getting both exact truthfulness and differential privacy for special cases.
What's been lacking is a general technique for achieving truthfulness and privacy for any
problems.
And our main contribution is, again, an affirmative answer to this question: yes, there's a general technique for doing so. And the way we do it is by showing that the well-known exponential mechanism from the privacy literature, which chooses an outcome from the feasible range with probability proportional to the exponential of the social welfare of this outcome scaled by the privacy parameter epsilon divided by 2, is truthful when we couple it with an appropriate payment scheme. Okay?
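For a finite outcome set, a minimal sketch of that sampling rule, assuming the social welfare function has sensitivity 1 and leaving out the payment scheme that makes it truthful, could look like this:

```python
import numpy as np

def exponential_mechanism(outcomes, welfare, epsilon):
    """Sample an outcome with probability proportional to
    exp(epsilon * welfare(outcome) / 2), assuming sensitivity 1."""
    scores = np.array([welfare(o) for o in outcomes], dtype=float)
    logits = epsilon * scores / 2.0
    probs = np.exp(logits - logits.max())   # shift for numerical stability
    probs /= probs.sum()
    return outcomes[np.random.choice(len(outcomes), p=probs)]
```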
And let me give you a one-slide sketch of the proof. Many of you are familiar with this maybe. The exponential mechanism can be characterized as maximizing the following free social welfare, which is defined to be the expected social welfare over the distribution of outcomes plus the Shannon entropy of the outcome distribution scaled by 2 over epsilon.
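Written out, with w the social welfare function and H the Shannon entropy, the claim is that the exponential mechanism's distribution p(o) proportional to exp(epsilon * w(o) / 2) is exactly the maximizer, over all distributions p on outcomes, of E_{o ~ p}[w(o)] + (2 / epsilon) * H(p).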
This fact is known under different names in different fields. For example, in statistical physics, it's the statement that nature, that is, the Gibbs measure, minimizes free energy. Or in learning, it's known as regularization with Shannon entropy.
The way we interpret this is that, instead of picking one outcome, we pick a lottery over outcomes, a distribution over the outcomes, and all distributions are available on the market. Then the exponential mechanism is essentially running the VCG mechanism with respect to the original agents plus one additional agent who is a pure risk lover, whose value equals the Shannon entropy of the outcome distribution scaled properly by the privacy parameter.
And therefore since VCG is truthful, we get that the exponential mechanism is truthful by
translating the VCG payments back to the exponential mechanism setting. Okay?
Okay. So now I want to wrap up the technical part by giving a brief overview of my research.
So what I talked about today is my thesis topic on algorithmic mechanism design and differential privacy, and specifically on social welfare maximization via the black box reduction technique, private mechanisms for releasing distance queries, and truthful and differentially private mechanism design.
I have also done some work on mechanism design for revenue maximization, which I do not have time to cover in this talk, but feel free to ask me more about it offline.
Outside my thesis topic I've also done some work on online algorithms during my internships here. Specifically, I've worked on online scheduling problems and online matching problems.
And, finally, outside these two topics, I've also worked on a wide range of problems, for example, in property testing and generalizations of the sorting problem and so on.
So, again, I would like to talk more about this offline. Okay. Now let me wrap up with a few
future directions.
So for mechanism design, our black box reduction technique, along with the results by others, and also the recent similar result by Cai, Daskalakis, and Weinberg on black box reductions for revenue maximization, indicate that algorithmic mechanism design in the Bayesian setting is easy, in the sense that it's as easy as the algorithm design problem.
Now, on the other hand, there are strong negative results showing that algorithmic mechanism
design in the prior-free setting is hard. It's much harder than the algorithm design problem.
So it seems that if we want to get positive results, the Bayesian setting is the right setting to look
into. However, getting exact prior knowledge is very difficult. I mean, in my opinion, it's
unrealistic to get exact prior knowledge.
>>: If it's so easy, why are there so many papers on it?
>> Zhiyi Huang: So, yeah, so there is a series of papers showing stronger and stronger negative results.
>>: [inaudible].
>> Zhiyi Huang: Yeah. Essentially they are showing the hardness for stronger and stronger notions of truthfulness. The original result is for deterministic truthful mechanisms and later for universally truthful mechanisms and so on. Okay?
Okay. Back to my point. Although it seems that we should look into the Bayesian setting, getting exact prior knowledge is difficult. So I think it's interesting and important to explore the intermediate domain between the Bayesian and prior-free settings. There are many interesting possibilities in between; let me talk about one of them.
So maybe we should look into prior-robust mechanisms, in the sense that truthfulness is independent of the correctness of the prior, while the performance in terms of social welfare or revenue scales smoothly when we have small errors in the prior estimation.
So if we can get such prior-robust mechanisms, then we have more reason to believe they will perform well in practice, right? And there are other possibilities for exploring this intermediate domain. Let me skip that. Now for the privacy part, the theme I would like to pursue is to design computationally efficient and private mechanisms, building upon the exciting progress on the information-theoretic side over the past few years.
And in particular the result I talked about in this talk shows that by utilizing the structure of the queries, we can answer distance queries in a computationally efficient manner.
So it will be interesting to classify which subclasses of predicate queries can be answered efficiently and which cannot.
For example, convex predicate queries might be one candidate to look into, where the predicate functions satisfy some convexity constraint, or we may look into conjunctions, which are another very well studied type of query in the literature.
And, on the other hand, I feel it would be important to develop differentially private versions of important algorithmic tools such as linear programming and semidefinite programming, which can serve as important building blocks for designing private mechanisms in the future. Okay?
And, finally, for differentially private mechanism design, this is a much more open area. So, again, the first theme is to bring computational efficiency into the picture, because the general positive result that I just talked about is not computationally efficient in general.
In this regard I have recently made some progress. I realized that by combining the convex rounding technique from the mechanism design literature and the objective perturbation technique from the privacy literature, we can solve the combinatorial public project problem in a computationally efficient, truthful, and private manner.
And here convex rounding, roughly speaking, is a technique in mechanism design that uses convex programming to design truthful mechanisms. And objective perturbation is, roughly speaking, a differentially private way of solving convex programs, some specific convex programs. Okay?
And it will also be interesting to look into other settings, for example, mechanism design without payments, because in many settings, such as voting, where both the game theory part and the privacy part matter, it is inappropriate to use payments. And our technique crucially relies on the use of payments. So this is another interesting direction.
And, finally, I'm interested in this very open-ended question, what's the right model to capture
both the game-theoretic and the privacy constraint.
So our approach is essentially a bi-criteria one, where we would like to have a mechanism that's truthful with respect to the usual notion of utility while we ensure the outcome distribution is insensitive to the agents' types. But some may argue it's more natural to model the privacy constraint into the utility function, in the sense that we assume some term of the utility captures how much the agent gets hurt by the information leaked by the mechanism, right?
But if we take this approach, at least so far there's not a very satisfying form of this utility function that everyone is happy about, so this remains a very interesting open-ended question.
So okay. Although there are many, many other interesting directions I could keep talking about, let me take questions here. And thank you.
[applause]
>> Nikhil Devanur Rangarajan: Any questions?
>>: So for this last thing that you said about the right model for privacy [inaudible] in general
maybe, you know, this is hard to come up with the distributive function for privacy, but a lot of
times the privacy [inaudible] privacy precisely because I don't want to [inaudible] me. So I have
some value for [inaudible] again and again and again, so if I reveal my value in a [inaudible] or
maybe you can use it [inaudible] so maybe for this subclass, we can call it a reasonable definition
of privacy. So has this been considered? Do you know anything?
>> Zhiyi Huang: I see. So that's a very good question. So basically the point you raise is that maybe for some specific settings we have more reason to come up with a precise form of this utility function, because, for example, in this repeated auction where your value will play a role over and over again, maybe we can come up with a better closed form. Yes. I think that's an interesting direction. And to my knowledge I'm not aware of any work that utilizes this structure to define the utility function. Yeah. That would be an interesting direction to look into.
>>: In general, I mean, coming back to your very first slide, this is an issue with the truthfulness in the Vickrey auction, truthful only when used in isolation --
>> Zhiyi Huang: Exactly. Exactly. Exactly.
>>: [inaudible].
>> Zhiyi Huang: Yes. It's truthfulness in this one-shot game, but if you think about future utility --
>>: And this is something that the English auction doesn't suffer from, so in the English auction the people who [inaudible] have exact value of [inaudible].
>> Zhiyi Huang: Oh, I'm not sure that in the English auction you don't get your value revealed, which might hurt your utility in the future --
>>: You might. But if you're way above the --
>> Zhiyi Huang: Oh, yes. Yes. If you're way above the second highest bid, then you'll sort of protect your value very well. Yeah, I agree.
Yeah. I think differential privacy is just one way of putting a closed form sort of damage bound
on how much you can get hurt in the future, in the sense that your utility cannot get hurt by more
than a one-plus-epsilon factor or E to the epsilon factor.
But I agree, if you have some better closed form for what your future utility is depending on the outcome of the current mechanism, maybe you should take that into account in your utility function, and then -- yeah.
>>: So all these sort of proxies for utility have some [inaudible] because if you are claiming they
can have only epsilon [inaudible] and this proxy, well, how do you know the real valuations in
any way continue [inaudible]?
>> Zhiyi Huang: I'm not sure I get that question.
>>: So whenever you have a proxy for your utility --
>> Zhiyi Huang: Uh-huh.
>>: -- so really utility is perhaps a function of it --
>> Zhiyi Huang: Right.
>>: -- maybe it's not just a function, but --
>> Zhiyi Huang: Right.
>>: -- if it isn't, then you need to know how continuous is that function.
>> Zhiyi Huang: I see. Yes. Yes. I agree.
>>: [inaudible] have an approximate, but ultimately that --
>> Zhiyi Huang: Right, right, right. Yes, I agree.
>>: [inaudible]
>>: [inaudible]
>>: But I think the utility of [inaudible].
>> Zhiyi Huang: Yes.
>>: Okay. Thank you.
[applause]