Privacy-Utility Tradeoffs in the Smart Grid H. Vincent Poor Princeton University

Joint work with Lalitha Sankar
Supported by NSF under Grant CCF-1016671
12/3/2011
H. V. Poor
SG: Privacy and Utility
1
Talk Outline
• Motivation
• Database privacy problem
• Smart grid privacy problems
• Summary
Cyber-Physical Systems (Smart Grid)
• Collection of physically interconnected agents which:
– obtain measurements from a network of sensors (and related devices)
– interact via a communication network to jointly monitor/control their subnetwork
• Multi-level connectivity architecture: agents ranging from system operators at higher layers to consumers at lower layers
Privacy in Cyber-Physical Systems
• Competitive Privacy: competitive agents need to cooperate for joint system state estimation/control while keeping measurements private
• Consumer Privacy: guaranteed privacy of consumers being monitored by smart devices (water/gas/electricity meters, mobile health devices, etc.)
Privacy vs. Secrecy!
• Privacy: the ability to prevent unwanted transfer of information (via inference or correlation) when legitimate transfers happen.
• But privacy is not secrecy!
• Secrecy problem: protocols and primitives clearly distinguish a malicious adversary from an intended user, and secret from non-secret data.
– Encryption may be a solution.
Privacy is not Secrecy!
• Privacy: the ability to prevent unwanted transfer of information (via inference or correlation) when legitimate transfers happen.
• But privacy is not secrecy!
• Privacy problem: disclosing data provides informational utility while also enabling potential loss of privacy
– Every user is potentially an adversary (user ?= Eve)
– Encryption is not a solution!
Utility vs. Privacy
• Data sources exist to be used, but the utility of a data source can be degraded by privacy requirements.
• Maximum utility of a data source is achieved at minimum privacy, and vice versa.
• Need a framework that quantifies the utility-privacy tradeoffs for any data source.
[Figure: privacy-utility scale running from max. privacy / min. utility to max. utility / min. privacy, illustrated with health care database attributes: Ethnicity, Visit Date, Zip Code, Gender, Birth Date, Diagnosis, Procedure, Medication, Total Charge]
Talk Outline
• Motivation
• Database privacy problem
• Smart grid privacy problems
• Summary
Existing Approaches
• Privacy problem lies at the intersection of multiple communities:
– Statistics: de-identification of census releases
– Databases: sanitizing databases while maintaining query accuracy
– Data mining: robust classification without identification
– Application-specific approaches without universal guarantees
• CS theory: differential privacy, a cryptography-motivated definition
– How to guarantee non-identification
– Privacy paramount
• Utility vs. privacy tradeoff remains unsolved.
Privacy Problem: A New Insight
• Any data source has public and private attributes
• Want to reveal public attributes maximally without revealing the private attributes
[Figure: health care database with attributes Name, Address, SSN, Zip Code, Gender, Birth Date, Ethnicity, Visit Date, Diagnosis, Procedure, Medication, Total Charge]
L. Sankar, S. R. Rajagopalan, and H. V. Poor. “Utility and privacy of data sources: Can Shannon help conceal and reveal information?,” ITA Workshop, La Jolla, CA, Feb. 2010.
Privacy Problem: A New Insight
• But… private and public attributes are correlated.
• Controlling privacy leakage amounts to controlling the correlation.
• Correlation can be controlled via perturbation of public attributes.
• Best U-P tradeoff: finding the minimal perturbation that achieves a desired correlation.
• Our contribution: a framework based on rate-distortion theory with universal metrics for utility and privacy.
L. Sankar, S. R. Rajagopalan, and H. V. Poor. “Utility and privacy of data sources: Can Shannon help conceal and reveal information?,” ITA Workshop, La Jolla, CA, Feb. 2010.
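The link between perturbation and leakage can be illustrated with a small numerical sketch. It assumes a hypothetical binary private attribute and a binary public attribute that agree 90% of the time, and it perturbs the public bit by flipping it with a chosen probability; the joint distribution, the flip mechanism, and the function names are illustrative assumptions, not taken from the cited work. Leakage is measured here as the mutual information between the private bit and the released bit.

```python
from math import log2


def mutual_information(joint):
    """Mutual information in bits of a joint pmf given as {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def perturbed_joint(joint, flip):
    """Joint pmf of (private, released) when the binary public bit is
    flipped independently with probability `flip` before release."""
    out = {}
    for (priv, pub), p in joint.items():
        for released, q in ((pub, 1 - flip), (1 - pub, flip)):
            out[(priv, released)] = out.get((priv, released), 0.0) + p * q
    return out


# Hypothetical joint pmf: private and public bits agree 90% of the time.
JOINT = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

if __name__ == "__main__":
    for flip in (0.0, 0.1, 0.25, 0.5):
        leak = mutual_information(perturbed_joint(JOINT, flip))
        print(f"flip={flip:.2f}  leakage={leak:.4f} bits")
```

In this toy setting a larger flip probability means more distortion of the public attribute and less leakage about the private one, which is the tradeoff the slide describes.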
The Database Privacy Problem
• A database $d$ is a table with $n$ entries (rows) and $K$ attributes (columns)
[Table sketch: entries 1, 2, …, $n$ as rows; the $K$ attributes (Gender, Medication, Diagnosis, …, Payment) as columns, partitioned into private and public attributes]
• Private attributes $X_h$ are hidden; public attributes $X_r$ are revealed
• $i$-th entry: $X_i = (X_{1,i}, X_{2,i}, \ldots, X_{K,i})$
• Our model: $d$ is a sequence of $n$ i.i.d. observations of a vector random variable $X = (X_1, X_2, \ldots, X_K)$ with the distribution [SRP, ISIT ’10]
$p_X(x) = p_{X_1 X_2 \cdots X_K}(x_1, x_2, \ldots, x_K)$
L. Sankar, S. R. Rajagopalan, and H. V. Poor. “A theory of utility and privacy of data sources,” Proc. of IEEE Intl. Symp. Inform. Theory, Austin, TX, Jun. 13-18 2010.
Database: Utility vs. Privacy
• The Utility-Privacy Problem: rate-distortion theory with privacy is a natural fit!
• Encoder maps the database $d$ ($X^n$) to a “sanitized” database (SDB) $W$
Encoder: $F_E : X^n \to W \in \{\mathrm{SDB}_1, \mathrm{SDB}_2, \ldots, \mathrm{SDB}_M\}$
– $M$: number of revealed (“quantized”) databases
• Decoder: uses $W$ to obtain a “reconstructed” database (for query processing)
Decoder: $F_D : W \to \hat{X}_r^n$
[Block diagram: Source $\{X_{r,i}, X_{h,i}\}_{i=1}^{n}$ → Encoder → $W$ → Decoder → $\{\hat{X}_{r,i}\}_{i=1}^{n}$]
L. Sankar, S. R. Rajagopalan, and H. V. Poor. “A theory of utility and privacy of data sources,” Proc. of IEEE Intl. Symp. Inform. Theory, Austin, TX, Jun. 13-18 2010.
Utility and Privacy Metrics
• Utility: measure of closeness of $X_r$ and $\hat{X}_r$.
• Map utility to fidelity (distortion): $D$ is a bound on the avg. distortion per entry
$\Delta \equiv \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} \rho(X_{r,i}, \hat{X}_{r,i})\right] \le D + \epsilon$
– $\rho(\cdot,\cdot)$: distance-based function (e.g., Hamming, Euclidean, K-L)
• Privacy: measure of ‘uncertainty’ about hidden data given revealed data.
• Map privacy to equivocation: $\Delta_p$ is the equivocation on average per entry
$\Delta_p \equiv \frac{1}{n} H(X_h^n \mid W) \ge E - \epsilon$
– $E$: lower bound on the avg. privacy per entry
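As a rough illustration of the two metrics, the sketch below computes an average per-entry Hamming distortion and a plug-in (empirical, single-letter) estimate of the conditional entropy of a hidden column given a released column. The toy table, the column values, and the function names are hypothetical and chosen only for illustration; the sanitization shown (merging category "C" into "B") is one arbitrary choice of release, not a scheme from the cited papers.

```python
from collections import Counter
from math import log2


def hamming_distortion(original, released):
    """Average per-entry Hamming distortion between two equal-length columns."""
    return sum(a != b for a, b in zip(original, released)) / len(original)


def empirical_equivocation(hidden, released):
    """Plug-in estimate of H(hidden | released) in bits from paired samples."""
    n = len(hidden)
    joint = Counter(zip(hidden, released))
    marginal = Counter(released)
    return -sum(c / n * log2(c / marginal[r]) for (_, r), c in joint.items())


# Hypothetical toy table: one public column, one hidden column,
# and a sanitized release in which category "C" is merged into "B".
public = ["A", "A", "B", "B", "C", "C", "C", "A"]
hidden = [1, 1, 0, 0, 1, 0, 1, 1]
released = ["A", "A", "B", "B", "B", "B", "B", "A"]

if __name__ == "__main__":
    print("distortion:", hamming_distortion(public, released))
    print("equivocation, raw public column:", empirical_equivocation(hidden, public))
    print("equivocation, sanitized column: ", empirical_equivocation(hidden, released))
```

In this example the coarser release raises the uncertainty about the hidden column at the cost of a nonzero distortion on the public column.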
The Utility-Privacy Tradeoff
• Utility-privacy tradeoff region ($\mathcal{T}$) is
$\mathcal{T} \equiv \{(D, E) : (D, E) \text{ is feasible}\}$
• How do we compute $\mathcal{T}$?
• Add an additional rate constraint and map it to a rate-distortion-equivocation problem
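To make the notion of feasible distortion-equivocation pairs concrete, here is a minimal sketch for a hypothetical binary source. It assumes a uniform hidden bit, a public bit equal to the hidden bit passed through a binary symmetric channel with crossover probability p, and a release obtained by flipping the public bit with probability d. Under these assumptions the Hamming distortion is d and the equivocation about the hidden bit is the binary entropy of p(1-d) + d(1-p). The points printed are achievable pairs for this particular release mechanism only; the sketch does not claim they trace the boundary of the region.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def equivocation(p, d):
    """H(hidden | released) for a uniform hidden bit, public = hidden xor Bern(p),
    released = public xor Bern(d), with all noise bits independent."""
    crossover = p * (1 - d) + d * (1 - p)
    return binary_entropy(crossover)


P = 0.1  # hypothetical correlation between hidden and public bits

if __name__ == "__main__":
    for d in (0.0, 0.1, 0.2, 0.3, 0.5):
        print(f"D={d:.1f}  E={equivocation(P, d):.4f} bits")
```

At d = 0 the release is the public bit itself and the equivocation is the binary entropy of p; at d = 0.5 the release is independent of the data and the equivocation reaches one bit.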
A Source Coding Problem with Privacy
• Distortion: $\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} \rho(X_{r,i}, \hat{X}_{r,i})\right] \le D + \epsilon$
• Equivocation: $\frac{1}{n} H(X_h^n \mid W) \ge E - \epsilon$
• Rate constraint: $M \le 2^{n(R + \epsilon)}$
[Block diagram: Source $\{X_{r,i}, X_{h,i}\}_{i=1}^{n}$ → Encoder → $W$ → Decoder → $\{\hat{X}_{r,i}\}_{i=1}^{n}$]
• Simplified version of the database privacy problem with additional rate constraint
– Rate constraint bounds the number of “quantized” sequences
– For U-P tradeoff this seems superfluous
Utility-Privacy/RDE Regions
[Figure (a): Rate-Distortion-Equivocation Region $\mathcal{R}(D, E)$, plotted against distortion $D$ and equivocation $E$]
[Figure (b): Utility-Privacy Tradeoff Region, with utility (distortion $D$) and privacy (equivocation $E$) as axes, showing the privacy-indifferent region, the privacy-exclusive region (current art), and the feasible distortion-equivocation region $\mathcal{R}_{D-E}$ (our approach: the utility-privacy tradeoff region)]
• For a database with utility and privacy constraints, $\mathcal{T} = \mathcal{R}_{D-E}$. [SRP, ISIT ’10]
L. Sankar, S. Raj Rajagopalan, H. V. Poor, “A theory of privacy and utility in databases,” submitted to the IEEE Trans. Inform. Theory, Feb. 2011.
Related and New Results
• The Side Information Problem: model and U-P tradeoff for decoder side information
[Figure: voter database (Name, Address, Affiliation, Date last voted) alongside a health care database (Zip Code, Ethnicity, Visit Date, Diagnosis, Gender, Birth Date, Procedure, Medication)]
• The Successive Disclosure Problem: conditions for no privacy leaks over successive queries relative to one-shot
[Figure: multi-round query/response interaction between a user and the database]
L. Sankar, S. Raj Rajagopalan, H. V. Poor, “A theory of privacy and utility in databases,” submitted to the IEEE Trans. Inform. Theory, Feb. 2011.
• Multi-user Privacy
R. Tandon, L. Sankar, H. V. Poor, “Multiuser Privacy and Common Information,” ISIT 2011.
• Discriminatory Coding and Privacy
R. Tandon, L. Sankar, H. V. Poor, “Discriminatory Lossy Source Coding,” Globecom, Nov. 2011.
Talk Outline
• Motivation
• Database privacy problems
• Smart grid privacy problems
• Summary
What is a Smart Grid?
• Smart grid: overlay the electrical grid with sensors (phasor measurement units, PMUs) and control systems (SCADA) to enable:
– reliable and secure network monitoring, load balancing, energy efficiency via smart meters, and integration of new energy sources
Smart Grid: Competitive Privacy
• North American (N.A.) grid: interconnected regional transmission organizations (RTOs) which:
– need to share measurements for state estimation for reliability (utility)
– wish to withhold information for economic competitive reasons (privacy)
• Leads to a new problem of competitive privacy
L. Sankar, S. Kar, R. Tandon, and H. V. Poor, “Competitive privacy in the smart grid: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.
Our Contributions
• A linear model for the network measurements and interconnections
• A two-RTO protocol for distributed communications
• Notion of competitive privacy and a utility-privacy tradeoff framework that:
– includes metrics for utility and privacy
– determines privacy-maximizing (leakage-minimizing) operating points for every choice of utility measure
• Two new problems in distributed source coding
– distributed state estimation from noisy measurements (distributed CEO)
– rate-distortion-leakage tradeoff: estimating state with fidelity vs. minimizing the resulting state information leakage
L. Sankar, S. Kar, R. Tandon, and H. V. Poor, “Competitive privacy in the smart grid: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.
System Model
• Noisy measurements $Y_k$ at RTO $k$ with interference from other RTOs:
$Y_k = \sum_{m=1}^{M} H_{k,m} X_m + Z_k, \quad k = 1, 2, \ldots, M$
– $X_k$: system state
• Utility: mean squared error between the state and its estimate
• Privacy: leakage of state from measurements and messages
• Cooperation leads to inevitable leakage of state information
• [SKTP ’11]: For a two-RTO network, one-shot Wyner-Ziv coding maximizes privacy for a desired utility at each RTO.
L. Sankar, S. Kar, R. Tandon, and H. V. Poor, “Competitive privacy in the smart grid: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.
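The tension between cooperation and leakage in a linear measurement model can be sketched numerically. The example below assumes two RTOs with independent unit-variance Gaussian states, scalar measurements of the form "own state plus scaled neighbour state plus Gaussian noise", and hypothetical gain and noise values. It compares what RTO 1 knows from its own measurement with what it knows when RTO 2 shares its raw measurement in full. Full sharing is used only as a simple extreme case; it is not the Wyner-Ziv scheme referred to in the slide.

```python
from math import log2


def posterior_covariance(h, noise_var):
    """Error covariance of the MMSE estimate of X ~ N(0, I_2) from Y = H X + Z,
    with Z ~ N(0, noise_var * I). Equals the inverse of (I + H^T H / noise_var)."""
    a11 = 1 + (h[0][0] ** 2 + h[1][0] ** 2) / noise_var
    a22 = 1 + (h[0][1] ** 2 + h[1][1] ** 2) / noise_var
    a12 = (h[0][0] * h[0][1] + h[1][0] * h[1][1]) / noise_var
    det = a11 * a22 - a12 ** 2
    return [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]


def leakage_bits(error_variance):
    """Mutual information (bits) between a unit-variance Gaussian state and the
    observations that leave it with the given MMSE error variance."""
    return 0.5 * log2(1.0 / error_variance)


# Hypothetical gains and noise variance (not taken from the cited paper).
H = [[1.0, 0.5], [0.5, 1.0]]
NOISE_VAR = 0.1

# RTO 1 alone sees only the first measurement: zero out the second row of H.
local = posterior_covariance([H[0], [0.0, 0.0]], NOISE_VAR)
shared = posterior_covariance(H, NOISE_VAR)

if __name__ == "__main__":
    print("MSE of own state at RTO 1, local only:", local[0][0])
    print("MSE of own state at RTO 1, full sharing:", shared[0][0])
    print("leakage about RTO 2 state, local only:", leakage_bits(local[1][1]))
    print("leakage about RTO 2 state, full sharing:", leakage_bits(shared[1][1]))
```

With these numbers, sharing lowers RTO 1's estimation error for its own state but also sharply increases what RTO 1 learns about RTO 2's state, which is the utility-privacy tension the model is meant to capture.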
Smart Meter Privacy
• Smart metering is a critical enabler of the smart grid
• Advantage (utility) for consumers:
– Tariff- and network-load-aware appliance usage
• Advantages (utility) for power supplier/data collector:
– Continuous load monitoring and balancing
– Data mining (analytics) for marketing (energy audit vendors, appliance manufacturers, insurance companies)
• Data mining: tremendous utility to the data collectors and a huge privacy risk to the consumer
– Immediate due to meter rollouts in US/EU
Smart Meter: Appliance Signatures
Our Contributions
• Our insight: privacy leakage comes predominantly from intermittently used appliances
– e.g., kettles and TVs reveal more than continuously running A/C or heaters
Our utility-privacy tradeoff framework consists of:
• Load model: colored Gaussian mixture of an intermittent load and a continuously on load
• Inference model: inference sequence correlated with intermittent appliance processes
• Utility: Euclidean distance between measured continuous-valued meter data and revealed data
• Privacy: mutual information between a possible inference sequence (intermittent appliance sequence) and revealed data
Meter Privacy: Main Result
[RSMP ’11]: Privacy leakage is minimized by a spectral ‘interference-aware reverse water-filling’ solution.
[Figure: power spectrum vs. angular frequency (radians), for distortion $D = 4$, showing the waterlevel and the distorted spectrum. Privacy preservation is a result of: i) noisy interference (zero distortion case); ii) the distortion-induced waterlevel.]
S. Rajagopalan, L. Sankar, S. Mohajer, and H. V. Poor, “Smart meter privacy: An information-theoretic approach,” Proc. IEEE SmartGridComm, Oct. 2011.
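For readers unfamiliar with water-filling, the sketch below implements the classical reverse water-filling allocation for independent Gaussian components, which can be read as a discretized stand-in for a power spectrum. This is the textbook rate-distortion version, not the interference-aware variant of [RSMP ’11]; the variances, the distortion budget, and the function names are illustrative assumptions. A water level is found by bisection so that components below the level are fully distorted and components above it are distorted down to the level.

```python
from math import log2


def reverse_water_filling(variances, distortion, tol=1e-12):
    """Return (water_level, per_component_distortions) such that
    sum(min(water_level, var)) matches the total distortion budget."""
    if not 0 < distortion <= sum(variances):
        raise ValueError("distortion must lie in (0, sum of variances]")
    lo, hi = 0.0, max(variances)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(min(mid, v) for v in variances) < distortion:
            lo = mid
        else:
            hi = mid
    level = (lo + hi) / 2
    return level, [min(level, v) for v in variances]


def rate_bits(variances, per_component):
    """Total rate in bits for Gaussian components at the given distortions."""
    return sum(0.5 * log2(v / d) for v, d in zip(variances, per_component))


if __name__ == "__main__":
    variances = [4.0, 1.0]  # hypothetical component variances
    level, dist = reverse_water_filling(variances, 3.0)
    print("water level:", level)
    print("per-component distortion:", dist)
    print("rate (bits):", rate_bits(variances, dist))
```

With variances 4 and 1 and a budget of 3, the level settles at about 2: the weaker component is not described at all, and the stronger one is described at half a bit.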
SG Privacy: Remarks
Competitive privacy for distributed state estimation (SE):
• Pricing-based incentive mechanisms for collaboration [BSPD ’11]
• Generalization to multiple RTOs to be addressed
Smart meter privacy:
• A dynamic analysis-synthesis framework required for streaming data
• Can privacy be appliance agnostic?
E. V. Belmega, L. Sankar, H. V. Poor, and M. Debbah, “Distributed State Estimation for Smart Grids: Competition vs. Cooperation,” (invited) ISCCSP, May 2012.
Summary
• The privacy problem is pervasive … in all cyber-physical systems
• One solution will not fit all applications…
• But a framework provides the much-needed abstraction
• More needs to be done…
[Figure: medical cyber-physical system]
For more: … http://www.arxiv.org
Thank you!