
>> Nikhil Devanur Rangarajan: Hi. It is my pleasure to introduce Zhiyi Huang from the University of Pennsylvania. Zhiyi may be familiar to some of you since he's interned here with the theory group twice, actually. And Zhiyi is also a recipient of the Simons Graduate Fellowship, which, believe it or not, is somewhat more prestigious than being an intern here twice.
So he's going to tell us about how to compute over private data.
>> Zhiyi Huang: Okay. Thanks, Nikhil, and it's a pleasure to be back.
So, yeah, what I'm going to talk about today is about computation over private data. And I
realize there's some talk about data and privacy in the building today, so part of my talk will fit
well into that theme.
So the plan is to first go over the background, define what I mean by computation over private data and what specific challenges we are facing in this problem. Then I'll talk about what my coauthors and I have done on this topic. And finally I'll wrap up by listing a few directions that I'm keen on pursuing in the future.
Okay. So let me get started. I would like to start with a brief revisit of what is computation and
what is algorithm. So abstractly, each computational problem can be defined with a set of
feasible outcomes, some input data, and the objective function which we use to measure how
good each outcome is, right?
So, for example, if you think about max-weight matching, then the set of feasible outcomes are
the set of matchings in a graph, the input data are the weight of the edges, and the objective
function is simply the total weight of all the edges in the matching.
Okay. Now, given such a computational problem, an algorithm can be viewed as an input/output
interface where the algorithm designer carefully chooses some machinery in the middle which takes the data as input and then chooses the outcome from the feasible range accordingly to optimize the objective.
However, in this model there's a big assumption, that is, all the necessary input data are given to the algorithm designer for free. Is that really the case? In fact, for many computational problems in the modern world, especially those that arise from the Internet or electronic commerce,
that's quite often not the case because these problems usually rely on the private data held by
self-interested agents as their input.
So the picture looks more like this where the algorithm designer needs to first gather the
information from a bunch of agents, say, Alice, Bob, and Charlie, as in this graph. And then
depending on what these agents report, which may or may not be the true underlying data, the algorithm designer needs to choose the outcome from the feasible range.
And as a result it's natural to ask: can we design the algorithm in some specific way, maybe sometimes with the help of an appropriate payment scheme, such that we can convince the agents to share their true data?
So to make a distinction with algorithms in the more traditional environment, I would like to refer to algorithms in the presence of private data and self-interested agents as mechanisms, which is the standard terminology in the literature.
So in order to design a good mechanism, one needs to take into account not only the usual limitations on computational power but also some new challenges imposed by the social, economic, or personal considerations of the agents.
So to motivate what specific new challenges we are facing, I'd like to talk about two illustrative problems. The first illustrative problem is about allocating one resource to a set of agents, say allocating a new iPhone 5 to one of Alice, Bob, and Charlie, and we would like to allocate the iPhone to the agent with the highest value for the phone.
So what we might want to do is to run the famous Vickrey auction or the second-price auction.
We first let the agents submit bids; different agents might have different values for the iPhone. Then, depending on what their bids are, we allocate the iPhone to the highest bidder, Alice in this case, and let the winner, Alice, pay the second highest bid, $199.
And the Vickrey auction has many nice properties. First of all, it encourages the agents to share their true valuations, in the sense that doing so always maximizes their utility, defined to be the value for the iPhone if the agent gets it minus the payment they need to make.
So clearly Alice has no reason to lie, because she's getting the iPhone and there's nothing she can do in terms of lowering the price, since that depends on the second highest bid. And Bob and Charlie also have no incentive to lie to try to win the iPhone, because in order to do so they would need to pay a price that's higher than their value.
And, secondly, the Vickrey auction maximizes the social welfare, which is defined to be the sum of the agents' values for the outcome. And in this specific case it's simply the value of the agent who gets the iPhone.
So by definition we are allocating to the highest bidder, and by the fact that we are encouraging the agents to tell the truth, we are maximizing social welfare.
And finally the Vickrey auction has a very simple format. It can be implemented in essentially linear time, because all we need to do is find the highest bid and the second highest bid.
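Just to make that concrete, here is a minimal sketch of the second-price rule in Python; the specific bid amounts are hypothetical, chosen to match the example where Alice wins and pays $199.

```python
def vickrey_auction(bids):
    """Second-price (Vickrey) auction: the highest bidder wins and pays
    the second-highest bid; everyone else pays nothing."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

# Hypothetical bids: Alice wins and pays Bob's bid of $199.
print(vickrey_auction({"Alice": 250, "Bob": 199, "Charlie": 150}))  # ('Alice', 199)
```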
Now, so far so good. But that's only for allocating one resource. More generally we would like to be able to handle multiple resources, and maybe more complicated scenarios where agents have combinatorial valuations over subsets of resources, in the sense that the value for a subset of items may not simply be the sum of the values for the individual items, right?
So this is called the combinatorial auction problem, arguably the most studied problem in the literature. And it also captures many resource allocation problems that arise in practice, for example,
the auctions for advertisement slots or the FCC spectrum auction between the government and
the companies.
And of course this is a very classic economic problem. It's very well studied by economists. One of the solutions they propose is that we should always allocate the resources so as to maximize social welfare. And if we do so, there exists a payment scheme that will encourage the agents to share their true values. This is called the VCG auction.
Unfortunately we also know that maximizing social welfare is NP-hard in general and therefore
implementing VCG is NP-hard in general. There's work left to be done.
So for these kinds of resource allocation problems, other than the limitation on computational power, we also face a new challenge of the game-theoretic constraint, in the sense that each agent has some utility that is directly generated from the outcome of the mechanism, and therefore if lying or deciding not to participate improves their utility, they may very well do so.
And this game-theoretic constraint has received a lot of attention from the theory community over the past decade and led to this very exciting field of algorithmic mechanism design, which studies how to design mechanisms that run in polynomial time and take care of this game-theoretic constraint.
And the solution concept in this literature is to restrict our attention to truthful mechanisms, which always encourage the agents to tell the truth by making sure that telling the truth maximizes their expected utility.
And the central question in algorithmic mechanism design is how to design computationally efficient and truthful mechanisms with good social welfare guarantees.
Now, as a quick remark, there also exist other interesting objectives in mechanism design, such as maximizing revenue or ensuring some notion of fairness. But for the purpose of this talk, I'll focus on social welfare maximization, which is the most studied objective in the literature.
Okay. So that's the first constraint, the game-theoretic constraint. Now let me move on to the second illustrative problem. Suppose we want to release the average salary of all the employees in a company, say Microsoft. Of course it is easy to compute the average.
But if we release the exact answer, that might be problematic because by comparing the average
salary before and after Bob joined the company, the adversary will be able to learn exactly what
Bob's salary is, which is considered to be sensitive personal information that should not be revealed to the public.
So what we may want to do is to release a noisy answer by running the Laplace mechanism. So
we will first let a trusted third party called the curator solicit all the salaries of the agents, and then we compute the average, sample noise from the Laplace distribution, which is essentially the exponential distribution mirrored about the Y axis, add the noise, and release the noisy answer. Okay?
Now, the point is, if the adversary again tries to compare the noisy answers before and after Bob joined the company, then the estimate he gets has an error that is roughly N times the noise we added to the average answer, where N is the number of agents. So when the number of agents is large, there's hope that we can add very little noise in terms of the average answer while effectively putting a lot of noise into that estimate.
So, in other words, we protect the privacy of each individual agent's salary while providing
reasonably accurate answer in terms of this average salary.
And finally it also has a very simple form and can be implemented in a computationally efficient
manner. Okay?
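As a rough sketch of what the curator could run, assuming salaries lie in a known range [0, max_salary] (so the average has sensitivity max_salary / N), the Laplace mechanism for this query might look like:

```python
import numpy as np

def noisy_average_salary(salaries, epsilon, max_salary):
    """Laplace mechanism for the average: changing one person's salary
    (bounded by max_salary) moves the average by at most max_salary / n,
    so Laplace noise with scale (max_salary / n) / epsilon hides that change."""
    n = len(salaries)
    sensitivity = max_salary / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(np.mean(salaries)) + noise
```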
>>: So in the example, when you're trying to [inaudible] this is the error you would add to the average [inaudible] or to --
>> Zhiyi Huang: To the average [inaudible].
>>: Seems that -- seems that in order to get -- to hide most salary you would just need to add the error of -- scaled down by the number of people, so this --
>> Zhiyi Huang: Yes. So --
>>: So this is like [inaudible] huge error [inaudible].
>> Zhiyi Huang: So in this illustrative example, I'm only writing like three agents here. But in general we will assume there are a lot of agents --
>>: All right, [inaudible] describe an example of a large company, okay, I understand that.
>> Zhiyi Huang: Right. Yeah. Yes. Yes.
>>: Okay.
>> Zhiyi Huang: Yes. You are making a very good point that if there are only three agents in the picture, then probably it won't be very private for the agents, because essentially each agent has a substantial contribution to the average. But in general we would like to consider a large number of agents. Okay? Very good point.
So okay. Now, again, that's only for releasing one numerical query, and in general we would like to be able to answer many, many queries about the same database. So we would like to consider a data universe which specifies the possible sensitive types of the agents, which might be their demographic information or their medical records or their salary and so on. And we'd like to consider a database that contains the sensitive types of N agents.
What we'd like to do is answer the predicate queries asked by data analysts, where each predicate query is specified by a predicate function mapping from the data universe to a real number between 0 and 1, and given such a query, the exact answer is the average value of this predicate function taken over all the elements in the database. Okay?
So here I'm abusing notation a little bit, because when we talk about a predicate function usually we mean a Boolean function, but here I'm allowing it to map to any real number in [0, 1].
So predicate queries capture many useful queries in practice. To give you some examples, if the data universe consists of numbers between 0 and 1, then we can ask about the mean or higher moments of the numbers in the database. If the data universe consists of Boolean strings, then we can ask what fraction of agents have sensitive types satisfying some conjunctive Boolean formula or general Boolean formula.
And if the data universe consists of all the points in some metric space, for example, the D-dimensional unit cube with the Euclidean distance, then we may ask a distance query, where each query is specified by a point in the metric space, and what we would like to learn is the average distance from the query point to the points in the database. And this kind of query might be useful if we want to pick a location for building a new facility and would like to learn how convenient this new location is in terms of the average distance to all the citizens in the database. Okay?
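Just to pin down the definition, here is a small sketch of the exact, non-private answer to a distance query under the Euclidean metric; the private mechanisms discussed later release a noisy or proxy version of this value rather than the exact one.

```python
import numpy as np

def distance_query(database, query_point):
    """Exact answer to a distance query: the average Euclidean distance
    from the query point to all points in the database."""
    data = np.asarray(database, dtype=float)
    q = np.asarray(query_point, dtype=float)
    return float(np.mean(np.linalg.norm(data - q, axis=1)))
```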
Now, for these kinds of query release problems, the agents have little or no utility directly generated from the outcome of the mechanism. But, nonetheless, they may still decide to lie about their private type or decide not to participate in the survey if the curator's answer will leak too much information about their sensitive type, because they may worry that the leaking of such information might hurt their utility in the future.
So this is the privacy constraint, and it leads to the very fruitful field of differential privacy, which studies how to design mechanisms that run in polynomial time and take care of this privacy constraint. And the solution concept is to consider differentially private mechanisms, whose outcome distribution is, roughly speaking, insensitive to the change of one agent's private type. Okay?
And more formally, by insensitive what we mean is the following: suppose we fix the types of all other agents except Agent I; then no matter what Agent I reports, the probability that we choose any specific outcome changes by no more than an e-to-the-epsilon factor, for some small constant epsilon. Okay?
Essentially it's saying that the infinity divergence between the outcome distributions for two neighboring databases is bounded by epsilon. Okay?
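In the standard notation, this condition is usually written as follows: for every outcome o and every pair of databases D and D' differing in one agent's type, Pr[M(D) = o] <= e^epsilon * Pr[M(D') = o], where M is the mechanism.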
So given the definition, the essential question in differential privacy is to study how to design computationally efficient and private mechanisms that can provide accurate answers for the queries asked by the analysts about the database. Okay?
Okay. Now, so far I have defined what is computation over private data, and I have specified
two challenges, the game-theoretic constraint and the privacy constraint. So before I move on to
the technical part, is there any question about...
Okay. So now let me talk about what my coauthors and I have done. It will consist of three parts. In the first part I'll focus purely on the game-theoretic constraint. I'll talk about how to solve the social welfare maximization problem via the black box reduction technique. In the second part I'll focus purely on the privacy constraint, and I'll talk about how to design mechanisms for answering distance queries. And finally I'll briefly go over our results on how to design mechanisms that can handle both constraints simultaneously.
So for the first part I'll get into a little more detail, and for the other two parts I'll be a little brief. Okay?
Okay. Now let me move on to the first part, the game-theoretic constraint. And let me remind
you the central question is how to design computationally efficient and truthful mechanisms with good social welfare guarantees.
Now, suppose we take away the truthfulness requirement for a second; then this is a well-defined optimization problem. And as computer scientists, we are trained to design fast algorithms for solving optimization problems either exactly or approximately.
And indeed, computer scientists have developed many powerful tools for solving optimization problems in polynomial time.
Now, if we put back the truthfulness requirement, then the problem of designing computationally efficient and truthful mechanisms, despite much exciting progress over the past few years, remains a much, much less understood topic. So it is natural to ask: can we reduce the less understood mechanism design problem to the better understood algorithm design problem?
Now, this question has motivated me, among many other researchers, to look into the following sort of Holy Grail for algorithm and mechanism design, which asks: can we convert any algorithm into a truthful mechanism with essentially the same social welfare guarantee, while the running time of the mechanism is no more than a polynomial times the running time of the algorithm?
If we can give an affirmative answer to this question, that would be great, because that means there's
some machinery that will automatically take care of the game-theoretic part and we can simply
focus on the optimization part, which is a much more familiar terrain for computer scientists.
And our main contribution is that for many problems the answer is yes, there exists such a reduction. In particular we show that for all problems in the Bayesian setting, which is the standard economic setting, the answer is yes. This is joint work with Xiaohui Bei.
And of course as computer scientists we are also interested in the prior-free setting and worst-case analysis, and we looked into that as well.
So we show that for one subclass, namely all single-dimensional and symmetric problems in the prior-free setting, the answer is yes, there exists such a reduction.
So here single-dimensional means the private valuation of an agent can be written in the simple form of a single real number, and symmetric means the set of feasible outcomes is symmetric: pick any feasible outcome, then no matter how we permute the identities of the agents, it will remain feasible. So these are arguably relatively natural restrictions to add.
So that's our result. Due to time constraints, I won't be able to talk about the technical details of the prior-free setting, but I will go over how we design the black box reduction for the Bayesian setting. Okay?
Okay. Now let me move on to the Bayesian setting. Let me be more specific about our setting.
So, again, there's a set of feasible outcomes and N agents, and each agent has a private valuation function mapping from the set of feasible outcomes to real numbers between 0 and 1, which specifies how much each agent likes each outcome.
And we will make the Bayesian assumption, in the sense that we assume this valuation VI is drawn from some distribution FI which is publicly known and agreed upon across the different agents. Okay?
And for the sake of presentation, I'll think about a somewhat simplified setting. Imagine there's a finite list of possible valuations each agent could have, and the prior distribution simply says that each of these valuations is equally likely to be realized as the true valuation of the agent. Okay? It's a uniform distribution.
Now, for this setting, what we show is that any algorithm can be converted into a truthful mechanism by using payments, with no loss in social welfare and a polynomial overhead in running time. And the notion of truthfulness we consider here is that telling the truth maximizes each agent's expected utility, where the expectation is over the randomness of the mechanism and also the random realization of the other agents' types, assuming the other agents tell the truth. Okay. It's a Bayes-Nash equilibrium.
Okay? So now let me first tell you about the basic framework of our black box reduction. So,
again, the algorithm can be viewed as an input/output interface where in this case the inputs are simply the reported valuations of the agents.
What we would like to do is decouple the valuations reported by the agents and the valuations we use as input for the algorithm, by using some carefully designed perturbation algorithms sigma I, one for each agent. And what we will do is use the perturbed valuations as input for the algorithm and then use what the algorithm outputs as the output of the mechanism.
So in this basic framework I haven't talked about payments yet. And that's because, by a relatively standard technique in mechanism design, once we fix the outcome of the mechanism, the payments can be derived automatically. So I ignore the payments in this framework, but I'll precisely define what the payments are when I get to the particulars.
So given this framework, it remains to decide how to design these sigma I's, right? In particular I would like to design the sigmas such that they satisfy three properties. The first property I would like to impose is stationarity, in the sense that if the input to some sigma I follows the uniform distribution over the support, then the output sigma I of VI also follows the uniform distribution over the same support.
So, in other words, distribution-wise, this perturbation algorithm is not doing anything, okay? And the reason we'd like to impose the stationarity constraint is that it will allow us to decouple the correlation across different agents, in the sense that from Agent 1's viewpoint, now it is as if the perturbations for the other agents do not exist, right? So that will allow us to focus on the problem of designing one perturbation algorithm subject to this stationarity constraint.
Now, subject to the stationarity constraint, I would like to make sure the social welfare does not decrease, and in particular we will make sure the expected valuation of each individual agent does not decrease.
And finally we would like to impose the truthfulness into the picture. So these are the three
goals in designing the sigma Is.
Now, let me go to the particulars of the sigma Is. So the first observation is that there's a natural
correspondence between stationary perturbation algorithms and bipartite perfect matching in the
following graph.
I would like to associate every possible valuation in the support with two vertices in the bipartite graph: one on the left-hand side, which we call the replicas, and one on the right-hand side, which we call the surrogates.
Essentially the replicas correspond to the reported types of the agents and the surrogates correspond to the perturbed types output by the perturbation algorithm.
Now, suppose we have a perfect matching in this bipartite graph. I claim that it naturally corresponds to a deterministic and stationary perturbation algorithm, in the sense that given any reported type I can look at the replicas, find the corresponding replica, see which surrogate it gets matched to, and output that surrogate type, right?
And since this is a perfect matching, if the input is the uniform distribution over the left-hand side vertices, then the output is the uniform distribution over the right-hand side vertices, so it will be stationary.
And it is actually not difficult to show this is a 1-to-1 correspondence. So it remains to decide which perfect matching to use.
And in order to do that, I would like to introduce the following interpretation. I would like to interpret the replicas as virtual agents and the surrogates as virtual items, in the sense that, from Agent I's viewpoint, each surrogate essentially corresponds to a distribution over outcomes, sort of a lottery, because once we fix the perturbed type we use for Agent I, that gives a well-defined distribution over feasible outcomes, over the randomness of the algorithm and also over the random realization of the other agents' types.
So given this interpretation, we can talk about the expected valuation of a type-T virtual agent for a type-T-prime virtual item. We'll define this value, or the weight of the edge between T and T prime, to be the expected value for Agent I if Agent I has type T and we use T prime as the perturbed type. Okay?
So essentially what we are doing is using the Bayesian assumption to create a virtual interface, a virtual market for the agent, such that from each agent's viewpoint it is as if the agent is really competing in this virtual market, in which instead of competing with the other agents, the agent is competing with other possible realizations of his own type. Okay?
And moreover, in this virtual market, we have the simple structure of a matching market, for which we know how to solve the social welfare optimization problem exactly. So we can simply run VCG in this virtual market.
>>: I'm sorry, what's A?
>> Zhiyi Huang: A is the algorithm that we are given as a black box. So yes. So this expectation will be over the randomness of A, the random coin flips of A, and the random realization of V minus I. Okay?
>>: [inaudible].
>> Zhiyi Huang: Okay. So --
>>: [inaudible] okay.
>> Zhiyi Huang: Right.
>>: So [inaudible] the T's here are using --
>> Zhiyi Huang: The T is the --
>>: T -- so the T's -- the T is T1, T2 --
>> Zhiyi Huang: The T's are the possible types or possible valuations of Agent I. And T1, T2, ..., TK are all the possible valuations in the discrete support.
>>: Are these -- these are the numbers, so the TIs are --
>>: Valuation functions.
>> Zhiyi Huang: Valuation functions. It could be a multidimensional vector.
>>: [inaudible] the number.
>>: Okay.
>>: But what about VI, are they functions or are they not?
>> Zhiyi Huang: The VIs are also functions. V minus I is the collection of the valuation functions of all agents except Agent I.
>>: So there's -- that is [inaudible] as well.
>> Zhiyi Huang: Yes. Yes.
>>: So V [inaudible] different set of the V and T are the same space --
>> Zhiyi Huang: Yes. I'm using T to specify sort of all the possible valuations of one specific agent, Agent I. But, yeah, they are from the same space, yes. Okay?
So, again, so we have created this virtual market that has the matching market structure and we
would like to run VCG.
So precisely what we mean is that we'll find the max-weight matching with respect to the weight
I just defined. We will use the stationary perturbation algorithm corresponding to that
max-weight matching. And then we will charge the agent a price that's equal to the VCG price
in this virtual market. Okay?
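As a rough sketch of this replica-surrogate step for a single agent with a uniform prior over K types (the VCG payments from the virtual market are omitted), one could estimate the edge weights by sampling and then compute the max-weight perfect matching. The names and parameters below are hypothetical, and the black-box algorithm and the sampler for the other agents' types are assumed to be given.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def surrogate_type(i, reported_type, types, algorithm, sample_others, num_samples=100):
    """Replica-surrogate reduction for Agent i with a uniform prior over `types`.
    Estimates the weight of edge (t, s): the expected value of a type-t replica
    for the lottery induced by feeding type s into the black-box algorithm,
    then uses the max-weight perfect matching as the stationary perturbation."""
    K = len(types)
    weight = np.zeros((K, K))
    for s in range(K):
        for _ in range(num_samples):
            others = sample_others()                      # valuations of agents != i
            outcome = algorithm(others[:i] + [types[s]] + others[i:])
            for t in range(K):
                weight[t, s] += types[t](outcome) / num_samples
    rows, cols = linear_sum_assignment(weight, maximize=True)
    matching = dict(zip(rows, cols))
    return types[matching[types.index(reported_type)]]
```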
Now, let me show that we have achieved all three properties that we're aiming for. So stationarity is easy. Again, since we are picking one perfect matching and using the corresponding perturbation algorithm, that's stationary: the uniform distribution will be mapped to the uniform distribution.
In terms of social welfare, it's actually not difficult to show that the expected value of the agents subject to this perturbation is proportional to the social welfare in this virtual market, because if you think about the weight of the edge that we picked incident on type T, that equals the expected value for Agent I conditioned on his true value being T.
So when we sum them up, it's really the expectation, the expected value of the agent scaled by K.
Okay?
Now, the algorithm is essentially using the naive identity matching: what's in equals what's out. And the mechanism is doing something more clever by using the max-weight matching, and therefore the social welfare in this virtual market can only increase. As a result the expected value of the agent in the original market can only increase. And since that holds for all agents, the social welfare in the original market can only increase.
And finally we have imposed truthfulness because from the agent's viewpoint it is as if he is
really participating in this virtual market. And since we are running VCG in the virtual market,
we ensure truthfulness for the agent.
Okay? So that's essentially the whole proof idea modulo some details that will allow us to
generalize to more general distributions. Okay?
All right. Now, let me wrap up the black box reduction part by giving you a summary of what
has happened and where our result stands.
So for the Bayesian setting, our work is motivated by this very nice work by Hartline and Lucier that solves the problem in the restricted single-dimensional setting, and we have extended it to the multidimensional setting. And our result has also been independently discovered by Hartline, Kleinberg, and Malekian.
In the prior-free setting, progress is more incremental, in the sense that the positive results are also for restricted subclasses of problems, and we solve one relatively general subclass, the symmetric and single-dimensional problems.
And moreover, things have to be incremental, in the sense that for both the single-dimensional and the multidimensional settings, there are impossibility results showing that it's impossible to get a general black box reduction that works for all problems. So we have to utilize specific structures of the problems to get positive results. Okay?
All right. So that wraps up the game-theoretic part. Now let me move on to the privacy part.
So for the privacy part, again the goal is to design efficient and private mechanisms that can provide accurate answers for the queries asked by the data analysts.
In this regard, there exists a very general positive result that allows us to answer K predicate queries with this error. So let me say a little bit about how to interpret this error.
So, first of all, the dependency on N is roughly 1 over root N, so the larger the number of agents is, the more accurate we can get. And since it's 1 over root N, it's roughly the same as the sampling error.
And, secondly, the dependency on the number of queries K is polylogarithmic, and therefore we can answer exponentially many queries while still having nontrivial, little-o of 1 accuracy.
Unfortunately this very positive result is inefficient: the best running time per query is linear in the size of the data universe, which can be exponential in the dimension of the data.
And moreover, that's not just because we're not creative enough; there exists strong evidence showing no efficient algorithm can privately and accurately answer more than quadratically many queries, if we insist on answering general predicate queries.
So given this very general but inefficient positive result and this very strong lower-bound result, it's natural to ask: are there interesting subclasses of predicate queries for which we can design efficient mechanisms that can privately answer much more than quadratically many queries, right?
And our main contribution is, again, to provide an affirmative answer to this question by showing that distance queries form one such subclass. And let me remind you, for distance queries the database consists of a bunch of points in a metric space, each query is again a point in the metric space, and what we would like to learn is the average distance from the query point to all the data points in the database. Okay?
So specifically, what we show is that there exists a query release mechanism whose running time per query is nearly linear in the size of the database, and we can privately answer an arbitrary number of queries with the following error. If the metric is L1 or L2, we can answer with a small, little-o of 1 additive error.
And if we are talking about an arbitrary metric, then in addition to the little-o of 1 additive error, we also lose a log K multiplicative distortion, where K is the number of queries.
Okay? So that's our result. Now, let me briefly sketch our approach. Yep.
>>: Additive error [inaudible] number of queries?
>> Zhiyi Huang: Yes. The additive error is independent of the number of queries. Essentially we will come up with sort of a proxy function, and then we will answer all the queries using the proxy function without further access to the database.
>>: And even if you knew the entire proxy function, you would still --
>> Zhiyi Huang: Exactly.
>>: [inaudible].
>> Zhiyi Huang: But, yeah, that's for the first result. For the second result we are in this batch model where we are given all the queries, and then we will design our mechanism utilizing the structure of the queries. For L1 and L2, we are in this online as well as offline model: we can come up with an offline proxy function that can answer all queries about a database. Okay?
So let me briefly sketch our approach. At a high level our approach depends on a nice relation between queries and learning algorithms that was established in a series of works. So essentially we can view the database as a function mapping from queries to answers. Okay? And then we can first use some learning algorithm to learn an approximate version of this function, called the proxy function, and then we will answer all the queries asked by the data analysts using the proxy function.
Now, in this picture, the only place we need to access the true database is in this learning algorithm. So suppose we have a very good learning algorithm that can learn this approximate function using only a few updates. That means we only need to access the true database a few times, and therefore the total privacy loss will be small. So in sum, a good learning algorithm with few updates implies a good query release mechanism with small privacy loss. Okay?
And in particular we will design such a learning algorithm directly for some privacy-friendly
metric, namely the L1 metric. So what we utilize is that for L1, the function that we need to
learn can be decomposed into a bunch of single-dimensional functions, and also these
single-dimensional functions are convex and Lipschitz continuous. And we will utilize all three
properties to design an efficient learning algorithm that only uses a few updates.
So let me skip the details of how to design this learning algorithm, but let me talk about how to handle arbitrary metrics.
Our approach is to reduce the problem to the problem for L1 via the metric embedding technique. So the high-level approach is that we'll first pick a low-distortion metric embedding from the given metric space to some L1 metric space, and then we will embed all the data points in the original database into a proxy database with respect to the L1 metric, and then we will run our mechanism for L1 over the proxy database, in the sense that we will embed any query using the same embedding, ask the embedded query of the proxy database, and get back the answer. Okay?
>>: So how many points -- so this is a general question, how many points [inaudible]?
>> Zhiyi Huang: So I'm embedding all the points that are in the original database. Yes.
>>: So say the [inaudible] grow within the number of points.
>> Zhiyi Huang: It's a case-by-case thing. In some nice cases, like L2 to L1, the embedding is universal. But yeah.
Okay. So -- okay. In this picture the accuracy analysis is easy. Essentially we lose a small
additive error due to running the L1 private mechanism and also a multiplicative distortion that
equals the distortion of the metric embedding. The tricky part is the privacy analysis because
although we are running a private mechanism over the proxy database, the embedding step itself
might leak information. So we want to avoid that. Right?
So what we observe is that in order to ensure privacy, it suffices to focus on low-sensitivity embeddings, in the sense that changing one data point only changes its own embedding and does not affect the embeddings of the other points. Okay?
So if we can ensure that, then changing one point in the original database will only change one point in the proxy database. Now, by the fact that we are running a private mechanism over the proxy database, that will not change the outcome distribution by too much. So that will ensure privacy.
But now it remains to show that there exist interesting low-distortion embeddings that have low sensitivity, right? So briefly speaking, from L2 to L1 there's a classic result based on random projection with distortion essentially arbitrarily close to 1, and since the embedding is independent of the data points, we get low sensitivity for free.
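A minimal sketch of that data-independent embedding from L2 into L1: for a standard Gaussian vector g, E|<x, g>| = sqrt(2/pi) * ||x||_2, so averaging |<x, g_j>| over many independent random directions approximates the L2 norm, and the projection matrix does not depend on the data points at all, which is exactly the low-sensitivity property. The number of directions m below is a hypothetical accuracy parameter.

```python
import numpy as np

def l2_to_l1_embedding(dim, m=200, seed=0):
    """Embed (R^dim, L2) into (R^m, L1) with a data-independent random projection.
    The scaling is chosen so that ||phi(x)||_1 approximates ||x||_2 in expectation."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((m, dim))
    scale = np.sqrt(np.pi / 2.0) / m
    return lambda x: scale * (G @ np.asarray(x, dtype=float))
```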
Now, when we go to an arbitrary metric, there's another classic result, Bourgain's theorem. But Bourgain's embedding heavily relies on the structure of the data points, and thus does not have low sensitivity. So the way we get around this problem is by observing that we don't need to preserve all pairwise distances; we only need to preserve the distances between query points and data points.
And in order to do so, it suffices to use a Bourgain embedding only with respect to the query points, and that's enough to ensure preserving all the distances between query points and data points, and therefore we get low sensitivity.
>>: And that's why you [inaudible] log K?
>> Zhiyi Huang: Exactly. Okay.
>>: [inaudible] what do you do with the data?
>> Zhiyi Huang: Sorry?
>>: What do you do with the data points? So you're interested in distances from the --
>> Zhiyi Huang: So I'm only interested in preserving the distances between query points and data points. It's okay for the distances among data points or the distances among query points to be highly distorted. And for this weaker notion of distortion guarantee, it suffices to only utilize the information about the queries. I -- I can just --
>>: You're embedding the data points also.
>> Zhiyi Huang: I'm embedding the data points also, but the design of the embedding function only depends on the query points.
>>: [inaudible] produce something as --
>> Zhiyi Huang: Yes. It's a variant -- it's a variant of Bourgain's theorem, but essentially we can follow the same proof structure with some minor technical twists that allow us to show this low --
>>: It's not a black box reduction.
>> Zhiyi Huang: It's not a black box reduction, but from the high level it's -- that's the idea.
Yeah.
Okay. So okay. Now let me move on to the final part, how to handle both constraints
simultaneously. So this line of approach -- this line of work is motivated by the fact that for
many mechanism design problems, not only the game-theoretic constraint is important, the
privacy constraint is also important.
For some problems, this is because the private valuations of the agents or the companies might be regarded as business secrets; they have devoted a lot of research into the market and so on to arrive at these secrets, and they don't want to reveal them to their competitors. Right?
And in some other settings, maybe we would like to protect the privacy of the valuation function
because this valuation might depend on other sensitive information about the agents.
For example, think about the combinatorial public project problem, where a government wants to choose a subset of public projects to invest in subject to some feasibility constraints, say we can invest in no more than K projects; then the agents' values might depend on their sensitive data. For example, if the projects are locations for building new hospitals, then these valuations might depend on their medical records. So there's a natural need to protect the privacy of the agents' values.
So the open question in this field before our work is: is it possible to design truthful and differentially private mechanisms with good social welfare guarantees? Again, we are focusing on maximizing social welfare.
And there's some previous work that gave some positive results either for achieving approximate
truthfulness or for getting both exact truthfulness and differential privacy for special cases.
What's been lacking is a general technique for achieving truthfulness and privacy for any
problems.
And our main contribution is, again, an affirmative answer to this question: yes, there's a general technique for doing so. And the way we do it is by showing that the well-known exponential mechanism from the privacy literature, which chooses an outcome from the feasible range with probability proportional to the exponential of the social welfare of this outcome scaled by the privacy parameter epsilon divided by 2, is truthful when we couple it with an appropriate payment scheme. Okay?
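For a finite outcome set, a minimal sketch of that sampling rule, assuming the social welfare function has sensitivity 1 and leaving out the payment scheme that makes it truthful, could look like this:

```python
import numpy as np

def exponential_mechanism(outcomes, welfare, epsilon):
    """Sample an outcome with probability proportional to
    exp(epsilon * welfare(outcome) / 2), assuming sensitivity 1."""
    scores = np.array([welfare(o) for o in outcomes], dtype=float)
    logits = epsilon * scores / 2.0
    probs = np.exp(logits - logits.max())   # shift for numerical stability
    probs /= probs.sum()
    return outcomes[np.random.choice(len(outcomes), p=probs)]
```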
And let me give you a one-slide sketch of the proof. Many of you are familiar with this maybe. The exponential mechanism can be characterized as maximizing the following free social welfare, which is defined to be the expected social welfare over the distribution of outcomes plus the Shannon entropy of the outcome distribution scaled by 2 over epsilon.
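Written out, with w the social welfare function and H the Shannon entropy, the claim is that the exponential mechanism's distribution p(o) proportional to exp(epsilon * w(o) / 2) is exactly the maximizer, over all distributions p on outcomes, of E_{o ~ p}[w(o)] + (2 / epsilon) * H(p).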
This fact is known under different names in different fields. For example, in statistical physics, it's the statement that nature, that is, the Gibbs measure, minimizes free energy. Or in learning, it's known as regularization with Shannon entropy.
The way we interpret this is that, instead of picking one outcome, we pick a lottery over outcomes, a distribution over the outcomes, and all distributions are available on the market. Then the exponential mechanism is essentially running the VCG mechanism with respect to the original agents plus one additional agent who is a pure risk lover, whose value equals the Shannon entropy of the outcome distribution scaled properly by the privacy parameter.
And therefore since VCG is truthful, we get that the exponential mechanism is truthful by
translating the VCG payments back to the exponential mechanism setting. Okay?
Okay. So now I want to wrap up the technical part by giving a brief overview of my research.
So what I talked about today is my thesis topic on algorithmic mechanism design and differential privacy, and specifically on social welfare maximization via the black box reduction technique, private mechanisms for releasing distance queries, and truthful and differentially private mechanism design.
I have also done some work on mechanism design for revenue maximization, which I do not have time to cover in this talk, but feel free to ask me more about it offline.
Outside my thesis topic I've also done some work on online algorithms during my internships here. Specifically, I've worked on online scheduling problems and online matching problems.
And, finally, outside these two topics, I've also worked on a wide range of problems, for example, in property testing and generalizations of the sorting problem and so on.
So, again, I would like to talk more about this offline. Okay. Now let me wrap up with a few
future directions.
So for mechanism design, our black box reduction technique, along with the results by others, and also the recent similar result by Cai, Daskalakis, and Weinberg on black box reductions for revenue maximization, indicate that algorithmic mechanism design in the Bayesian setting is easy, in the sense that it's as easy as the algorithm design problem.
Now, on the other hand, there are strong negative results showing that algorithmic mechanism
design in the prior-free setting is hard. It's much harder than the algorithm design problem.
So it seems that if we want to get positive results, the Bayesian setting is the right setting to look
into. However, getting exact prior knowledge is very difficult. I mean, in my opinion, it's
unrealistic to get exact prior knowledge.
>>: If it's so easy, why are there so many papers on it?
>> Zhiyi Huang: So, yeah, so there is a series of papers showing stronger and stronger negative results.
>>: [inaudible].
>> Zhiyi Huang: Yeah. Essentially they are showing the hardness for stronger and stronger notions of truthfulness. The original result is for deterministic truthful mechanisms and later for universally truthful mechanisms and so on. Okay?
Okay. Back to my point. Although it seems that we should look into the Bayesian setting, getting exact prior knowledge is difficult. So I think it's interesting and important to explore the intermediate domain between the Bayesian and prior-free settings. There are many interesting possibilities in between; let me talk about one of them.
So maybe we should look into prior-robust mechanisms, in the sense that truthfulness is independent of the correctness of the prior, while the performance in terms of social welfare or revenue scales smoothly when we have small errors in the prior estimation.
So if we can get such prior-robust mechanisms, then we have more reason to believe they will perform well in practice, right? And there are other possibilities for exploring this intermediate domain. Let me skip that. Now for the privacy part, the theme I would like to pursue is to design computationally efficient and private mechanisms, building upon the exciting progress on the information-theoretic side over the past few years.
And in particular the result I talked about in this talk shows that by utilizing the structure of the queries, we can answer distance queries in a computationally efficient manner.
So it will be interesting to classify which subclasses of predicate queries can be answered efficiently and which cannot.
For example, convex predicate queries might be one candidate to look into, where the predicate functions satisfy some convexity constraint, or we may look into conjunctions, which are another very well studied type of query in the literature.
And, on the other hand, I feel it would be important to develop differentially private versions of important algorithmic tools such as linear programming and semidefinite programming, which can serve as important building blocks for designing private mechanisms in the future. Okay?
And, finally, for differentially private mechanism design, this is a much more open area. So, again, the first theme is to bring computational efficiency into the picture, because the general positive result that I just talked about is not computationally efficient in general.
In this regard I have recently made some progress. I realized that by combining the convex rounding technique from the mechanism design literature and the objective perturbation technique from the privacy literature, we can solve the combinatorial public project problem in a computationally efficient, truthful, and private manner.
And here convex rounding, roughly speaking, is a technique in mechanism design that uses convex programming to design truthful mechanisms. And objective perturbation is, roughly speaking, a differentially private way of solving convex programs, some specific convex programs. Okay?
And it will also be interesting to look into other settings, for example, mechanism design without payments, because in many settings, such as voting, where both the game theory part and the privacy part matter, it is inappropriate to use payments. And our technique crucially relies on the use of payments. So this is another interesting direction.
And, finally, I'm interested in this very open-ended question, what's the right model to capture
both the game-theoretic and the privacy constraint.
So our approach is essentially a bi-criteria one, where we would like to have a mechanism that's truthful with respect to the usual notion of utility while we ensure the outcome distribution is insensitive to the agents' types. But some may argue it's more natural to model the privacy constraint into the utility function, in the sense that we assume some term of the utility captures how much the agent gets hurt by the information leaked by the mechanism, right?
But if we take this approach, at least so far there's not a very satisfying form of this utility function that everyone is happy about, so this remains a very interesting open-ended question.
So okay. Although there are many, many other interesting directions I could keep talking about, let me take questions here. And thank you.
[applause]
>> Nikhil Devanur Rangarajan: Any questions?
>>: So for this last thing that you said about the right model for privacy [inaudible] in general
maybe, you know, this is hard to come up with the distributive function for privacy, but a lot of
times the privacy [inaudible] privacy precisely because I don't want to [inaudible] me. So I have
some value for [inaudible] again and again and again, so if I reveal my value in a [inaudible] or
maybe you can use it [inaudible] so maybe for this subclass, we can call it a reasonable definition
of privacy. So has this been considered? Do you know anything?
>> Zhiyi Huang: I see. So that's a very good question. So basically the point you raise is that maybe for some specific settings we have more reason to come up with a precise form of this utility function, because, for example, in this repeated auction where your value will play a role over and over again, maybe we can come up with a better closed form. Yes. I think that's an interesting direction. And to my knowledge I'm not aware of any work that utilizes this structure to define the utility function. Yeah. That would be an interesting direction to look into.
>>: In general, I mean, coming back to your very first slide, this is an issue with the truthfulness in the Vickrey auction, truthful only when used in isolation --
>> Zhiyi Huang: Exactly. Exactly. Exactly.
>>: [inaudible].
>> Zhiyi Huang: Yes. It's truthfulness in this one-shot game, but if you think about future utility --
>>: And this is something that the English auction doesn't suffer from, so in the English auction the people who [inaudible] have exact value of [inaudible].
>> Zhiyi Huang: Oh, I'm not sure that in the English auction you don't get your value revealed, which might hurt your utility in the future --
>>: You might. But if you're way above the --
>> Zhiyi Huang: Oh, yes. Yes. If you're way above the second highest bid, then you'll sort of protect your value very well. Yeah, I agree.
Yeah. I think differential privacy is just one way of putting a closed form sort of damage bound
on how much you can get hurt in the future, in the sense that your utility cannot get hurt by more
than a one-plus-epsilon factor or E to the epsilon factor.
But I agree, if you have some better closed form for what your future utility is depending on the outcome of the current mechanism, maybe you should take that into account in your utility function, and then -- yeah.
>>: So all these sort of proxies for utility have some [inaudible] because if you are claiming they
can have only epsilon [inaudible] and this proxy, well, how do you know the real valuations in
any way continue [inaudible]?
>> Zhiyi Huang: I'm not sure I get that question.
>>: So whenever you have a proxy for your utility --
>> Zhiyi Huang: Uh-huh.
>>: -- so really utility is perhaps a function of it --
>> Zhiyi Huang: Right.
>>: -- maybe it's not just a function, but --
>> Zhiyi Huang: Right.
>>: -- if it isn't, then you need to know how continuous is that function.
>> Zhiyi Huang: I see. Yes. Yes. I agree.
>>: [inaudible] have an approximate, but ultimately that --
>> Zhiyi Huang: Right, right, right. Yes, I agree.
>>: [inaudible]
>>: [inaudible]
>>: But I think the utility of [inaudible].
>> Zhiyi Huang: Yes.
>>: Okay. Thank you.
[applause]