Fairness, Privacy, and Social Norms
Omer Reingold, MSR-SVC
"Fairness through awareness" with Cynthia Dwork, Moritz Hardt, Toni Pitassi, Rich Zemel
+ Musings with Cynthia Dwork, Guy Rothblum and Salil Vadhan

In This Talk
• Fairness in classification (an individual-based notion)
 – Connection between fairness and privacy
 – DP beyond Hamming distance
• A notion of privacy beyond the DB setting.
• Empowering society to make choices on privacy.

Fairness in Classification
• Examples: health care, advertising, financial aid.

Concern: Discrimination
• Population includes minorities
 – Ethnic, religious, medical, geographic
 – Protected by law, policy, ethics
• A catalog of evils: redlining, reverse tokenism, self-fulfilling prophecy, …
• Discrimination may be subtle!

Credit Application (WSJ 8/4/10)
• User visits capitalone.com.
• Capital One uses tracking information provided by the tracking network [x+1] to personalize offers.
• Concern: steering minorities into higher rates (illegal)*

Here: A CS Perspective
• An individual-based notion of fairness – fairness through awareness.
• A versatile framework for obtaining and understanding fairness.
• Lots of open problems/directions
 – Fairness vs. privacy

Other notions of "fairness" in CS
• Fair scheduling
• Distributed computing
• Envy-freeness
• Cake cutting
• Stable matching
• More closely related notions outside of CS (Economics, Political Studies, …)
 – Rawls, Roemer, Fleurbaey, Young, Calsamiglia

Fairness and Privacy (1)
• [Dwork & Mulligan 2012]: objections to online behavioral targeting are often expressed in terms of privacy; in many cases the underlying concern is better described in terms of fairness (e.g., price discrimination, being mistreated).
 – Other major concern: feeling of "ickiness" [Tene]
• Privacy does not imply fairness
 – Definitions and techniques are still useful.
 – Can fairness imply privacy (beyond the DB setting)?

[Diagram: individuals x ∈ V are mapped by the ad network (x+1) to outcomes M(x) ∈ O, which the vendor (Capital One) turns into actions in A.]
Our goal: achieve fairness in the first step (the mapping).

[Diagram: individuals x ∈ V are mapped to outcomes M(x) ∈ O; assume an unknown, untrusted, un-auditable vendor downstream.]

First attempt… Fairness through Blindness

Fairness through Blindness
• Ignore all irrelevant/protected attributes
 – e.g., Facebook "sex" & "interested in men/women"
• Point of failure: redundant encodings
 – Machine learning: you don't need to see the label to be able to predict it
 – E.g., redlining

Second attempt… Group Fairness (Statistical Parity)

Group Fairness (Statistical Parity)
• Equalize minority S with the general population T at the level of outcomes
 – Pr[outcome o | S] = Pr[outcome o | T]
• Insufficient as a notion of fairness
 – Has some merit, but can be abused
 – Example: advertise a burger joint to carnivores in T and vegans in S.
 – Example: self-fulfilling prophecy
 – Example: multiculturalism …

Lesson: Fairness is task-specific
• Fairness requires understanding of the classification task (this is where utility and fairness are in accord)
 – Cultural understanding of protected groups
 – Awareness!

Our approach… Individual Fairness
• Treat similar individuals similarly: individuals who are similar for the purpose of the classification task should receive similar distributions over outcomes (fairness in the classification task).
• (Both parity and individual fairness are stated formally right after these slides.)

Metric – Who Decides?
• Assume a task-specific similarity metric
 – The extent to which two individuals are similar w.r.t. the classification task at hand
• Possibly captures some ground truth or society's best approximation
 – Or instead: society's norms
• Open to public discussion, refinement
• Our framework is agnostic to the choice of metric
• User control?
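The slides state the two fairness notions informally; for reference, here is one way to write them down, following the Lipschitz formulation of Dwork, Hardt, Pitassi, Reingold, and Zemel. Taking total variation as the distance D between output distributions is just one concrete choice (the paper also works with a relative l∞ metric).

```latex
% Group fairness (statistical parity up to bias eps), for protected S ⊆ V and T = V \ S:
\forall o \in O:\qquad
  \bigl|\,\Pr_{x \in S}[\,M(x)=o\,] \;-\; \Pr_{x \in T}[\,M(x)=o\,]\,\bigr| \;\le\; \varepsilon

% Individual fairness (Lipschitz condition), for M : V \to \Delta(O) and metric d : V \times V \to \mathbb{R}:
\forall x, y \in V:\qquad
  D\bigl(M(x),\,M(y)\bigr) \;\le\; d(x,\,y)

% One concrete choice for D is the total variation distance:
  D_{\mathrm{tv}}(\mu,\nu) \;=\; \tfrac{1}{2} \sum_{o \in O} \bigl|\,\mu(o)-\nu(o)\,\bigr|
```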
Metric – Starting Points
• Financial/insurance risk metrics
 – Already widely used (though secret)
• IBM's AALIM health-care metric
 – A health metric for treating similar patients similarly
• Roemer's relative-effort metric
 – A well-known approach in Economics/Political theory
• Machine learning
• Maybe not so much science fiction after all…

Randomized Mapping
• [Diagram: classification as a randomized mapping taking each individual x ∈ V to an outcome M(x) ∈ O.]

Towards Formal Definition
• Close individuals according to a metric d: V × V → R are mapped to close distributions.
• [Diagram: individuals x and y that are close under d are mapped to nearby distributions M(x), M(y) over outcomes O.]

Fairness and D-Privacy (2)
• The same picture gives differential privacy: replace individuals by databases, the task metric d: V × V → R by the Hamming distance, and outcomes by sanitizations – close databases must be mapped to close distributions over sanitizations.

Key elements of our approach… Efficiency (with utility maximization)
• Given a loss function L: V × O → R and a metric d: V × V → R, an efficient procedure constructs a d-fair mapping M from individuals to outcomes.
• Minimize the vendor's expected loss subject to the fairness condition (a sketch of this optimization as a linear program follows these slides).

More Specific Questions We Address
• How to efficiently construct the mapping M: V → Δ(O)?
• When does individual fairness imply group fairness (statistical parity)?
 – For a specific metric, which sub-communities are treated similarly?
• A framework for achieving "fair affirmative action" (ensuring minimal violation of the fairness condition).

Fairness vs. Privacy
• Privacy does not imply fairness.
• Can (our definition of) fairness imply privacy?
• Differential privacy [Dwork-McSherry-Nissim-Smith '06]: privacy for individuals whose information is part of a database.

Privacy on the Web?
• No longer protected by the data of others – my traces can be used directly to compromise my privacy.
• Can fairness be viewed as a measure of privacy?
 – Can fairness "blend me in with the (surrounding) crowd"?

Relation to K-Anonymity
• Critique of k-anonymity: blending in with others that have the same sensitive property X is small consolation.
• "Our" notion of privacy is as good as the metric!
• If your surroundings are "normative", this may imply meaningful protection (and substantiate users' currently unjustified sense of security).

Simple Observation: Who Are You, Mr. Reingold?
• If all new information on me obeys our fairness definition, with metrics under which the two possible Omers are very close, then your confidence won't increase by much…

Do We Like It?
Challenge – accumulated leakage:
• Different applications require different metrics.
• Less of an issue for fairness…

D-Privacy with Other Metrics
• This work gives additional motivation to study differential privacy beyond the Hamming distance.
• Well motivated even in the context of database privacy (there since the original paper).
• Example: privacy of social networks [Kifer-Machanavajjhala SIGMOD '11]
 – Privacy depends on context
• Privacy is a matter of social norms.
• Our burden: give tools to decision makers.

What is the Privacy in DP?
• The original motivation is mainly given in terms of opt-out/opt-in incentives: worry about an individual deciding whether to participate.
• A different point of view: a committee that needs to approve a proposed study in the first place.
 – Does the study incur only a tolerable amount of privacy loss for any particular individual?
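The "Efficiency" slide above obtains the fair mapping by minimizing the vendor's expected loss subject to the fairness condition; with total variation as the distance between output distributions, that optimization can be written as a linear program. Below is a minimal sketch under assumptions that are mine rather than the talk's: a small synthetic loss matrix and dissimilarity matrix standing in for L and d, total variation as D, and cvxpy as the modeling interface.

```python
# Minimal sketch: fair classification as a linear program.
# Assumptions (not from the talk): total variation as the distance between output
# distributions, synthetic loss/dissimilarity data, cvxpy as the solver interface.
import numpy as np
import cvxpy as cp

n, k = 5, 3                          # |V| individuals, |O| outcomes
rng = np.random.default_rng(0)
loss = rng.random((n, k))            # L[x, o]: vendor's loss for giving x outcome o
d = rng.random((n, n))               # toy symmetric dissimilarity standing in for the task metric
d = (d + d.T) / 2
np.fill_diagonal(d, 0.0)

mu = cp.Variable((n, k), nonneg=True)          # mu[x, o] = Pr[M(x) = o]
constraints = [cp.sum(mu, axis=1) == 1]        # each row is a distribution over outcomes
for x in range(n):                             # Lipschitz / fairness condition:
    for y in range(x + 1, n):                  #   D_tv(M(x), M(y)) <= d(x, y)
        constraints.append(0.5 * cp.sum(cp.abs(mu[x] - mu[y])) <= d[x, y])

objective = cp.Minimize(cp.sum(cp.multiply(mu, loss)) / n)   # vendor's expected loss
problem = cp.Problem(objective, constraints)
problem.solve()
print("optimal expected loss:", problem.value)
print("fair mapping (rows are distributions over outcomes):")
print(mu.value.round(3))
```

With total variation, the objective and the Lipschitz constraints are linear (after the standard reformulation of the absolute values), so the program has size polynomial in the number of individuals and outcomes.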
On Correlations and Priors
• Assume that rows are selected independently and that there is no prior information on the database:
 – DP protects the privacy of each individual.
• But in the presence of prior information, privacy can be grossly violated [Dwork-Naor '10].
• Pufferfish [Kifer-Machanavajjhala]: a semantic approach to the privacy of correlated data
 – Protect privacy in the presence of pre-specified adversaries
 – The interesting case may be when there is a conflict between privacy and utility

Individual-Oriented Sanitization
• Assume you only care about the privacy of Alice.
• Further assume that Alice's data is correlated with the data of at most 10 others.
• Then it is enough to erase these 11 rows from the database.
• Even if her data is correlated with more, expunging more than 11 rows may exceed the (society-defined) legitimate expectation of privacy (e.g., in a health study).
• Differential privacy simultaneously gives a "comparable" level of privacy to everyone.

Other variants of DP
• This suggests and interprets other variants of DP – defined by the sanitization we allow individuals.
• For example, in social networks, what is the reasonable expectation of privacy for an individual?
 – Erase your neighborhood?
 – Erase information originating from you?
• Another variant: change a few entries in each column.

Objections
• Adam Smith: this informal interpretation may lose too much; for example, the distance in the definition of DP is subtle.
• Jonathan Katz: how do you set epsilon?
• Omer Reingold: how do you incorporate input from machine learning into the decision process of policy makers?

Lots of open problems/directions
• Metric
 – Social aspects: who will define them?
 – How to generate a metric (semi-)automatically; a metric oracle?
• Connection to Econ literature/problems
 – Rawls, Roemer, Fleurbaey, Young, Calsamiglia
 – Local vs. global distributive fairness? Composition?
• Case study (e.g., in health care)
 – Start from AALIM?
• Quantitative trade-offs in concrete settings

Lots of open problems/directions
• Further explore the connection and implications to privacy.
• Additional study of DP with other metrics.
• Completely different definitions of privacy?
• …

Thank you. Questions?