Privacy-Utility Tradeoffs in the Smart Grid
H. Vincent Poor
Princeton University
Joint work with Lalitha Sankar
Supported by NSF under Grant CCF-1016671
12/3/2011 H. V. Poor SG: Privacy and Utility

Talk Outline
• Motivation
• Database privacy problem
• Smart grid privacy problems
• Summary

Cyber-Physical Systems (Smart Grid)
• Collection of physically interconnected agents which:
  – obtain measurements from a network of sensors (and related devices)
  – interact via a communication network to jointly monitor/control their subnetwork
• Multi-level connectivity architecture: agents ranging from system operators at higher layers to consumers at lower layers

Privacy in Cyber-Physical Systems
• Competitive privacy: competitive agents need to cooperate for joint system state estimation/control while keeping measurements private
• Consumer privacy: guaranteed privacy of consumers being monitored by smart devices (water/gas/electricity meters, mobile health devices, etc.)
[Figure: competitive privacy; consumer privacy]

Privacy vs. Secrecy!
• Privacy: the ability to prevent unwanted transfer of information (via inference or correlation) when legitimate transfers happen.
• But privacy is not secrecy!
• Secrecy problem: protocols and primitives clearly distinguish a malicious adversary vs. intended user and secret vs. non-secret data.
  – Encryption may be a solution.

Privacy is not Secrecy!
• Privacy problem: disclosing data provides informational utility while also enabling potential loss of privacy
  – Every user is potentially an adversary
  – Encryption is not a solution!
[Figure: user ?= Eve]

Utility vs. Privacy
• Data sources exist to be used, but the utility of a data source can be degraded by privacy requirements.
• Maximum utility of a data source is achieved at minimum privacy and vice versa.
• Need a framework that quantifies the utility-privacy tradeoffs for any data source.
[Figure: privacy vs. utility curve running from (max. privacy, min. utility) to (max. utility, min. privacy), illustrated with the attributes Ethnicity, Visit Date, Zip Code, Gender, Birth Date, Diagnosis, Procedure, Medication, Total Charge]

Existing Approaches
• Privacy problem lies at the intersection of multiple communities:
  – Statistics: de-identification of census release
  – Database: sanitizing databases while maintaining query accuracy
  – Data mining: robust classification without identification
  – Application-specific approaches without universal guarantees
• CS theory: differential privacy
  – cryptography-motivated definition
  – how to guarantee non-identification
  – privacy paramount
• Utility vs. privacy tradeoff remains unsolved.

Privacy Problem: A New Insight
• Any data source has public and private attributes
• Want to reveal public attributes maximally without revealing the private attributes
[Figure: health care database with attributes Name, Address, SSN, Zip Code, Gender, Birth Date, Ethnicity, Visit Date, Diagnosis, Procedure, Medication, Total Charge]
L. Sankar, S. R. Rajagopalan, and H. V. Poor, “Utility and privacy of data sources: Can Shannon help conceal and reveal information?,” ITA Workshop, La Jolla, CA, Feb. 2010.

Privacy Problem: A New Insight
• But… private and public attributes are correlated.
• Controlling privacy leakage amounts to controlling the correlation.
• Correlation can be controlled via perturbation of public attributes.
• Best U-P tradeoff: finding the minimal perturbation that achieves a desired correlation.
• Our contribution: a framework based on rate-distortion theory with universal metrics for utility and privacy.

The Database Privacy Problem
• A database is a table
  – entries (rows); attributes (columns)
  – $n$ entries, indexed $1, 2, \ldots, n$; $K$ total attributes (e.g., Gender, Medication, Diagnosis, …, Payment)
  – $\mathcal{K}_h$, $X_h$: hidden (private) attributes; $\mathcal{K}_r$, $X_r$: revealed (public) attributes
  – entry $i$: $X_i = (X_{r,i}, X_{h,i})$
• Our model: the database is a sequence of $n$ i.i.d. observations of a vector random variable $X = (X_1, X_2, \ldots, X_K)$ with the distribution $p_X(x) = p_{X_1 X_2 \cdots X_K}(x_1, x_2, \ldots, x_K)$ [SRP, ISIT ’10]
L. Sankar, S. R. Rajagopalan, and H. V. Poor, “A theory of utility and privacy of data sources,” Proc. of IEEE Intl. Symp. Inform. Theory, Austin, TX, Jun. 13-18 2010.

Database: Utility vs. Privacy
• The utility-privacy problem: rate-distortion theory with privacy is a natural fit!
• Encoder: maps the database $(X_r^n, X_h^n)$ to a “sanitized” database (SDB), identified by an index $W \in \{1, 2, \ldots, M\}$
  – $M$: number of revealed (“quantized”) databases
• Decoder: uses $W$ to obtain a “reconstructed” database $\hat{X}_r^n$ (for query processing)
[Figure: source $(X_r^n, X_h^n)$ → encoder → $W$ → decoder → $\hat{X}_r^n$]

Utility and Privacy Metrics
• Utility: measure of closeness of $X_r^n$ and $\hat{X}_r^n$.
• Map utility to fidelity (distortion)
  – bound on avg. distortion per entry: $\mathbb{E}\left[\frac{1}{n} \sum_{i=1}^{n} d(X_{r,i}, \hat{X}_{r,i})\right] \le D$
  – $d$: distance-based function (e.g., Hamming, Euclidean, K-L)
• Privacy: measure of “uncertainty” about hidden data given revealed data.
• Map privacy to equivocation
  – equivocation on average per entry: $\frac{1}{n} H(X_h^n \mid W) \ge E$
  – $E$: lower bound on the avg. privacy per entry
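
The two metrics above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes a small categorical database, uses Hamming distance as the distortion function, and estimates the equivocation with a single-letter plug-in estimate of H(hidden | released) rather than the block quantity on the slide. The function names and the toy columns (zip_code, diagnosis, coarse_zip) are hypothetical.

```python
import math
from collections import Counter


def hamming_distortion(revealed, reconstructed):
    """Fraction of entries whose reconstructed value differs from the original."""
    if len(revealed) != len(reconstructed) or not revealed:
        raise ValueError("columns must be non-empty and of equal length")
    return sum(a != b for a, b in zip(revealed, reconstructed)) / len(revealed)


def empirical_equivocation(hidden, released):
    """Plug-in estimate of H(hidden | released), in bits per entry."""
    if len(hidden) != len(released) or not hidden:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(hidden)
    joint = Counter(zip(hidden, released))   # counts of (hidden, released) pairs
    marginal = Counter(released)             # counts of released values
    # H(H|R) = sum over pairs of p(h, r) * log2( p(r) / p(h, r) )
    return sum((c / n) * math.log2(marginal[r] / c) for (_, r), c in joint.items())


# Hypothetical toy database: zip code is public, diagnosis is private.
zip_code = ["08540", "08540", "08544", "08544"]
diagnosis = ["flu", "cold", "flu", "flu"]

# One possible sanitization (an assumption): keep only the 3-digit zip prefix.
coarse_zip = [z[:3] for z in zip_code]

if __name__ == "__main__":
    print("distortion:", hamming_distortion(zip_code, coarse_zip))
    print("equivocation, exact zip:", empirical_equivocation(diagnosis, zip_code))
    print("equivocation, coarse zip:", empirical_equivocation(diagnosis, coarse_zip))
```

In this toy case, coarsening the public attribute raises the distortion and also raises the uncertainty about the private attribute, which is the tension the two metrics are meant to capture.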

The Utility-Privacy Tradeoff
• Utility-privacy tradeoff region $\mathcal{T}$ is $\{(D, E): (D, E) \text{ is feasible}\}$
• How do we compute $\mathcal{T}$?
• Add an additional rate constraint and map it to a rate-distortion-equivocation (RDE) problem

A Source Coding Problem with Privacy
[Figure: source $(X_r^n, X_h^n)$ (revealed and hidden data) → encoder → $W$ → decoder → $\hat{X}_r^n$]
• Distortion: $\mathbb{E}\left[\frac{1}{n} \sum_{i=1}^{n} d(X_{r,i}, \hat{X}_{r,i})\right] \le D$
• Equivocation: $\frac{1}{n} H(X_h^n \mid W) \ge E$
• Rate constraint: $M \le 2^{nR}$
• Simplified version of the database privacy problem with additional rate constraint
  – Rate constraint bounds the number of “quantized” sequences
  – For U-P tradeoff this seems superfluous

Utility-Privacy/RDE Regions
[Figure: (a) Rate-Distortion-Equivocation Region; (b) Utility-Privacy Tradeoff Region. Axes: equivocation (privacy) vs. distortion (utility). Marked: the feasible distortion-equivocation region, the privacy-exclusive region (current art), the privacy-indifferent region, and a point $(D, E)$. Our approach: the utility-privacy tradeoff region.]
• For a database with utility and privacy constraints, the utility-privacy tradeoff region equals the feasible distortion-equivocation region of the RDE problem: $\mathcal{T} = \mathcal{R}_{D\text{-}E}$. [SRP, ISIT ‘10]
L. Sankar, S. Raj Rajagopalan, H. V. Poor, “A theory of privacy and utility in databases,” submitted to the IEEE Trans. Inform. Theory, Feb. 2011.
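
A small worked example may help make a distortion-equivocation pair concrete. The sketch below is an illustration under stated assumptions, not a result from the talk: the hidden attribute is a fair bit, the public attribute is that bit flipped with probability p, and the sanitizer flips the public bit independently with probability d. The expected Hamming distortion is then d, and the hidden bit reaches the released bit through a binary symmetric channel with crossover p(1-d) + d(1-p), so the equivocation is the binary entropy of that crossover. Each pair is one achievable point; the sketch does not claim these points lie on the boundary of the tradeoff region. All function names are hypothetical.

```python
import math


def binary_entropy(q):
    """Binary entropy function h(q) in bits."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def achievable_point(p, d):
    """(distortion, equivocation) when the public bit is flipped with probability d.

    Assumes: hidden bit ~ Bernoulli(1/2); public bit = hidden bit flipped w.p. p;
    the sanitizer's flips are independent of everything else.
    """
    crossover = p * (1.0 - d) + d * (1.0 - p)  # hidden bit -> released bit
    return d, binary_entropy(crossover)


def tradeoff_curve(p, steps=5):
    """Sweep the flip probability d over [0, 1/2] and collect (D, E) pairs."""
    return [achievable_point(p, 0.5 * k / steps) for k in range(steps + 1)]


if __name__ == "__main__":
    for distortion, equivocation in tradeoff_curve(p=0.1):
        print(f"D = {distortion:.2f}  E = {equivocation:.4f} bits")
```

With no perturbation (d = 0) the equivocation is h(p), the uncertainty already present because the attributes are only partially correlated; at d = 1/2 the released bit is independent of the hidden bit and the equivocation reaches its maximum of 1 bit.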

Related and New Results
• The side information problem: model and U-P tradeoff for decoder side information
  [Figure: voter database (Name, Address, Affiliation, Date last voted) as side information for a health care database (Zip Code, Ethnicity, Visit Date, Diagnosis, Gender, Birth Date, Procedure, Medication)]
• The successive disclosure problem: conditions for no privacy leaks over successive queries relative to one-shot
  [Figure: multi-round query-response interaction between a user and the database]
  L. Sankar, S. Raj Rajagopalan, H. V. Poor, “A theory of privacy and utility in databases,” submitted to the IEEE Trans. Inform. Theory, Feb. 2011.
• Multi-user privacy
  R. Tandon, L. Sankar, H. V. Poor, “Multiuser Privacy and Common Information,” ISIT 2011.
• Discriminatory coding and privacy
  R. Tandon, L. Sankar, H. V. Poor, “Discriminatory Lossy Source Coding,” Globecom, Nov. 2011.

What is a Smart Grid?
• Smart grid: overlay the electrical grid with sensors (phasor measurement units, PMUs) and control systems (SCADA) to enable:
  – reliable and secure network monitoring, load balancing, energy efficiency via smart meters, and integration of new energy sources

Smart Grid: Competitive Privacy
• North American grid: interconnected regional transmission organizations (RTOs) which:
  – need to share measurements for state estimation, for reliability (utility)
  – wish to withhold information for economic competitive reasons (privacy)
• Leads to a new problem of competitive privacy
L. Sankar, S. Kar, R. Tandon, and H. V. Poor, “Competitive privacy in the smart grid: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.

Our Contributions
• A linear model for the network measurements and interconnections
• A two-RTO protocol for distributed communications
• Notion of competitive privacy and a utility-privacy tradeoff framework that:
  – includes metrics for utility and privacy
  – determines leakage-minimizing operating points for every choice of utility measure
• Two new problems in distributed source coding
  – distributed state estimation from noisy measurements (distributed CEO)
  – rate-distortion-leakage tradeoff: estimating the state with fidelity vs. minimizing the resulting state information leakage
L. Sankar, S. Kar, R. Tandon, and H. V. Poor, “Competitive privacy in the smart grid: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.

System Model
• Noisy measurements at RTO $k$ with interference from other RTOs: $Y_k = \sum_{m=1}^{M} H_{k,m} X_m + Z_k$, $k = 1, 2, \ldots, M$, where $X_m$ is the system state at RTO $m$
• Utility: mean squared error between the state and its estimate
• Privacy: leakage of state from measurements and messages
• Cooperation leads to inevitable leakage of state information
• [SKTP ’11]: For a two-RTO network, one-shot Wyner-Ziv coding maximizes privacy for a desired utility at each RTO.

Smart Meter Privacy
• Smart metering is a critical enabler of the smart grid
• Advantage (utility) for consumers:
  – tariff- and network-load-aware appliance usage
• Advantages (utility) for power supplier/data collector:
  – continuous load monitoring and balancing
  – data mining (analytics) for marketing (energy audit vendors, appliance manufacturers, insurance companies)
• Data mining: tremendous utility to the data collectors and a huge privacy risk to the consumer
  – an immediate concern due to meter rollouts in the US/EU

Smart Meter: Appliance Signatures
[Figure: appliance signatures]

Our Contributions
• Our insight: privacy leakage comes predominantly from intermittently used appliances
  – e.g., kettles and TVs reveal more than continuously running A/C units and heaters
• Our utility-privacy tradeoff framework consists of:
  – Load model: colored Gaussian mixture of an intermittent load and a continuously on load
  – Inference model: inference sequence correlated with intermittent appliance processes
  – Utility: Euclidean distance between measured continuous-valued meter data and revealed data
  – Privacy: mutual information between a possible inference sequence (intermittent appliance sequence) and revealed data

Meter Privacy: Main Result
• [RSMP ’11]: Privacy leakage is minimized by a spectral “interference-aware reverse water-filling” solution.
[Figure: power spectrum vs. angular frequency (radians) for distortion = 4, showing the water level and the distorted spectrum]
• Privacy preservation is a result of:
  i) noisy interference (zero-distortion case)
  ii) the distortion-induced water level
S. Rajagopalan, L. Sankar, S. Mohajer, and H. V. Poor, “Smart meter privacy: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.

SG Privacy: Remarks
Competitive privacy for distributed state estimation (SE):
• Pricing-based incentive mechanisms for collaboration [BSPD ’11]
• Generalization to multiple RTOs to be addressed
Smart meter privacy:
• A dynamic analysis-synthesis framework is required for streaming data
• Can privacy be appliance-agnostic?
E. V. Belmega, L. Sankar, H. V. Poor, and M. Debbah, “Distributed State Estimation for Smart Grids: Competition vs. Cooperation,” (invited) ISCCSP, May 2012.
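
For readers unfamiliar with the term, the “reverse water-filling” named in the meter privacy result is a classical rate-distortion allocation. The sketch below shows only the classical version for independent Gaussian components under a total mean-squared-error budget; it is not the interference-aware spectral solution of [RSMP ’11], and the function name, the bisection search, and the example variances are assumptions made for illustration. Each component receives distortion min(water level, variance), and the water level is chosen so that the distortions sum to the budget.

```python
import math


def reverse_water_filling(variances, total_distortion, iterations=200):
    """Classical reverse water-filling for independent Gaussian components.

    Returns (water_level, per_component_distortions, rate_in_bits).
    """
    if not variances or any(v <= 0 for v in variances):
        raise ValueError("variances must be a non-empty list of positive numbers")
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")

    # Bisection on the water level: the total distortion sum(min(level, v))
    # is non-decreasing in the level.
    low, high = 0.0, max(variances)
    for _ in range(iterations):
        level = (low + high) / 2.0
        if sum(min(level, v) for v in variances) < total_distortion:
            low = level
        else:
            high = level
    level = (low + high) / 2.0

    distortions = [min(level, v) for v in variances]
    # Components below the water level are not described at all (zero rate).
    rate = sum(0.5 * math.log2(v / d) for v, d in zip(variances, distortions))
    return level, distortions, rate


if __name__ == "__main__":
    level, distortions, rate = reverse_water_filling([4.0, 1.0], 3.0)
    print("water level:", level)
    print("distortions:", distortions)
    print("rate (bits):", rate)
```

In the spectral setting of the slide, the components would correspond to frequencies of the load's power spectrum, with the water level set by the allowed distortion.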

Summary
• The privacy problem is pervasive … in all cyber-physical systems
• One solution will not fit all applications…
• But a framework provides the much-needed abstraction
• More needs to be done…
[Figure: medical cyber-physical system]

For more: … http://www.arxiv.org
Thank you!