Privacy-Preserving Computation
and Verification of Aggregate
Queries on Outsourced Databases
Brian Thompson¹, Stuart Haber², William G. Horne², Tomas Sander², and Danfeng Yao¹
¹ Rutgers University, Dept. of Computer Science, Piscataway, NJ
² Hewlett-Packard Labs, 5 Vaughn Dr., Suite 301, Princeton, NJ
PETS 2009
Contributions
• An efficient, distributed architecture for
outsourcing databases
• A privacy-preserving protocol for computing
aggregate queries that is resistant to
collusion of dishonest service providers
• A mechanism that allows users to verify the
integrity and correctness of aggregate
query responses
PDAS: Privacy-Preserving Database-As-a-Service
Outline
• Motivation
• PDAS Architecture and Protocol
• Secure Computation of Aggregate Queries
• Correctness Verification
• Conclusions and Future Work
Simple Client-Server Model
[Figure: Several Clients each send a query to the Data Owner and receive a response.]
What if data owner has insufficient time or
resources to answer all queries?
Database-As-a-Service
• Outsource database to a trusted third-party
service provider (SP).
• SP supports and maintains DBMS infrastructure,
stores data and responds to queries.
• Applications: Census data, medical records,
network monitoring, recommendation systems.
• Data may be private or sensitive.
– Only answer queries that follow a pre-defined
inference control policy (outside the scope of our work).
Database-As-a-Service
[Figure: The Data Owner sends its sensitive data and inference control policy to the Service Provider. The Client sends query Q to the Service Provider, which either returns result AQ or rejects the query.]
Security threat! What if server is compromised or SP is malicious?
Integrity issue! How does Client know that results are correct?
Database-As-a-Service
• Encryption [HIM02, MT06]
– When client is the original data owner.
• Publish only statistics
– Limits utility for complex data mining apps.
• Publish representative subset
– Good for approximate query results.
– No privacy for individuals in released dataset.
Our Solution: Privacy-Preserving
Database-As-a-Service (PDAS)
• Outsource database to m service providers.
• Each SP gets a “share” of each data item.
• Each share gives zero information, but the
shares can be combined to reconstruct the
original data. [Shamir ’79]
• A homomorphic commitment scheme is used
to guarantee correctness. [Pedersen ’91]
Outline
• Motivation
• PDAS Architecture and Protocol
• Secure Computation of Aggregate Queries
• Correctness Verification
• Conclusions and Future Work
PDAS Architecture
[Figure: The Data Owner distributes data shares to service providers SP1, SP2, and SP3. The Client sends aggregate query Q to a service provider, which requests shares of AQ; each SPi calculates its share AQi. The Client receives result AQ and a proof of correctness.]
PDAS Protocol
1. COMMIT: Data owner generates commitment
values, signs root of Merkle hash tree.
2. DISTRIBUTE: Shares of each data item are
distributed to SPs using Shamir secret-sharing.
3. QUERY: Client submits aggregate query to SP.
4. RESPOND: SP requests shares of aggregate
from other SPs, recovers result, returns to Client.
5. VERIFY: Client checks commitments against
signed root hash, verifies commitment for result.
Outline
• Motivation
• PDAS Architecture and Protocol
• Secure Computation of Aggregate Queries
• Correctness Verification
• Conclusions and Future Work
Secret Sharing with Polynomials
[Shamir ’79]
• Construct a random (k-1)-degree
polynomial P with P(0) = S.
• Each share is a point on the curve.
• k points are both necessary and sufficient
to uniquely determine the polynomial.
Note: Computation in the field Fq
Note: Allows for threshold scheme
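The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the modulus `Q`, the share x-coordinates 1..m, and the function names `make_shares` and `reconstruct` are all assumptions made for the example.

```python
import secrets

Q = 2**61 - 1  # a Mersenne prime standing in for the field F_q (assumed choice)

def make_shares(secret, k, m, q=Q):
    """Split `secret` into m shares so that any k of them reconstruct it."""
    # Random polynomial of degree at most k-1 whose constant term is the secret.
    coeffs = [secret % q] + [secrets.randbelow(q) for _ in range(k - 1)]

    def evaluate(x):
        # Horner evaluation of the polynomial modulo q.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % q
        return acc

    # Each share is a point (x, P(x)) on the curve, with x = 1..m.
    return [(x, evaluate(x)) for x in range(1, m + 1)]

def reconstruct(shares, q=Q):
    """Recover P(0) from k shares by Lagrange interpolation over F_q."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % q
                den = (den * (xi - xj)) % q
        # pow(den, -1, q) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, q)) % q
    return total
```

For example, `reconstruct(make_shares(1234, k=3, m=5)[:3])` recovers 1234 from the first three of five shares; any other three shares work equally well.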
Secret Sharing with Polynomials
[Figure: Polynomial PA(x) passing through the secret point (0, A) and the share points (x1, PA(x1)), (x2, PA(x2)), (x3, PA(x3)).]
Secret Sharing with Polynomials
[Figure: Polynomial PB(x) passing through the secret point (0, B) and the share points (x1, PB(x1)), (x2, PB(x2)), (x3, PB(x3)).]
Secret Sharing with Polynomials
Task: secure computation of A+B
[Figure: Polynomials PA(x) and PB(x) plotted together, with secret points (0, A) and (0, B) and share points (xi, PA(xi)) and (xi, PB(xi)) for i = 1, 2, 3.]
Secret Sharing with Polynomials
[Figure: Player i (i = 1, 2, 3) calculates PA(xi) + PB(xi), which is the point (xi, PA+B(xi)) on the sum polynomial PA+B(x). These points determine PA+B(x), which passes through (0, A+B).]
Determined the sum A+B without revealing A or B!
Secret Sharing in PDAS
• A secret-sharing polynomial $P_j$ is constructed
for each data element $D_j$, i.e. $P_j(0) = D_j$
• The share of data $D_j$ for $\mathrm{SP}_i$ is $(i, P_j(i))$
• Suppose client queries for $\mathrm{SUM}(D_1, \ldots, D_n)$
• $\mathrm{SP}_i$ computes and broadcasts $\hat{P}(i) = \sum_j P_j(i)$
• Using polynomial interpolation, the SPs can
derive the polynomial $\hat{P}(x) = \sum_j P_j(x)$
• $\hat{P}(0) = \sum_j P_j(0) = \mathrm{SUM}(D_1, \ldots, D_n)$
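The SUM computation above can be simulated in one process as a rough sketch. The modulus `Q` and the helper names `share`, `interpolate_at_zero`, and `sum_query` are assumptions for illustration; a real deployment would run each SP on a separate machine and would also check the query against the owner's policy.

```python
import secrets

Q = 2**61 - 1  # prime modulus standing in for F_q (assumed choice)

def share(value, k, m):
    """Shares (i, P(i)) for SP_1..SP_m, where P(0) = value and deg P <= k-1."""
    coeffs = [value % Q] + [secrets.randbelow(Q) for _ in range(k - 1)]
    return [(i, sum(c * pow(i, e, Q) for e, c in enumerate(coeffs)) % Q)
            for i in range(1, m + 1)]

def interpolate_at_zero(points):
    """Lagrange interpolation of the shared polynomial, evaluated at x = 0."""
    result = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % Q
                den = den * (xi - xj) % Q
        result = (result + yi * num * pow(den, -1, Q)) % Q
    return result

def sum_query(data, k, m):
    """Simulate SUM(D_1, ..., D_n) computed from shares only."""
    # DISTRIBUTE: per_sp[i] holds the share of every D_j given to SP_{i+1}.
    all_shares = [share(d, k, m) for d in data]
    per_sp = [[s[i] for s in all_shares] for i in range(m)]
    # Each SP adds its own share values locally: P_hat(i) = sum_j P_j(i).
    p_hat = [(i + 1, sum(y for _, y in per_sp[i]) % Q) for i in range(m)]
    # Any k broadcast points determine P_hat, and P_hat(0) is the sum.
    return interpolate_at_zero(p_hat[:k])
```

Calling `sum_query([10, 20, 30, 40], k=3, m=5)` returns 100, and no SP ever sees an individual D_j in the clear.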
Secret Sharing in PDAS
• Honest SPs only contribute to a computation
if the query follows the data owner’s policy.
• PDAS allows for a (k,m) threshold scheme,
where any k of m SPs can answer a query.
If fewer than k collaborate, they learn nothing.
• If there are fewer than k dishonest SPs, the
system has information-theoretic security.
• Privacy is preserved* – no information is
leaked besides the query results!
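The claim that fewer than k collaborating SPs learn nothing can be checked exhaustively in a toy setting. The sketch below assumes k = 2 and a deliberately tiny field (`Q = 11`) so that every random coefficient can be enumerated; the function names are hypothetical. It confirms that a single share is uniformly distributed whatever the secret is.

```python
from collections import Counter

Q = 11  # tiny prime field F_q, chosen only so everything can be enumerated

def share_distribution(secret, x):
    """Distribution of the share P(x) for P(t) = secret + a*t, over all a in F_q."""
    return Counter((secret + a * x) % Q for a in range(Q))

def single_share_is_uniform(x):
    """True if the share held at position x looks identical for every secret."""
    uniform = Counter({v: 1 for v in range(Q)})
    return all(share_distribution(s, x) == uniform for s in range(Q))
```

Because the distribution of one share does not depend on the secret, a lone SP holding it gains no information about the underlying data value.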
Outline
• Motivation
• PDAS Architecture and Protocol
• Secure Computation of Aggregate Queries
• Correctness Verification
• Conclusions and Future Work
Verification in PDAS
The Pedersen Commitment Scheme [’91]
Prover: COMMIT($x$)
• Publish generators $g, h$ of group $G_p$
• Choose random $r$
• Calculate commitment value: $C_r(x) = g^x h^r$
Verifier: VERIFY($x, r, c$)
• Check commitment: $c = C_r(x) = g^x h^r$
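A minimal sketch of COMMIT and VERIFY follows. The group parameters are toy values assumed for illustration (a safe prime p = 1019 with subgroup order q = 509, generators 4 and 9) and are far too small to be secure; in practice nobody may know the discrete logarithm of h with respect to g.

```python
import secrets

# Toy parameters (assumptions, insecure): p = 2q + 1 is a safe prime, and
# g, h are squares mod p, so both generate the subgroup of prime order q.
P_MOD, Q_ORD = 1019, 509
G, H = 4, 9

def commit(x, r=None):
    """Return (c, r) with c = g^x * h^r mod p; r is drawn at random if omitted."""
    if r is None:
        r = secrets.randbelow(Q_ORD)
    c = (pow(G, x, P_MOD) * pow(H, r, P_MOD)) % P_MOD
    return c, r

def verify(x, r, c):
    """Check that commitment c opens to value x under randomness r."""
    return c == (pow(G, x, P_MOD) * pow(H, r, P_MOD)) % P_MOD
```

A prover would call `c, r = commit(42)`, publish `c`, and later reveal `(42, r)` so that the verifier can run `verify(42, r, c)`.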
Verification in PDAS
• Owner computes commitment to each data
entry $C_{r_j}(D_j)$ and signs to authenticate.
• Given $D_j, r_j, C_j$, the client verifies the
commitment: $C_j = C_{r_j}(D_j) = g^{D_j} h^{r_j}$.
• This requires access to sensitive data $D_j$!
• Problem: How to verify an aggregate query
result without access to individual entries?
Use a homomorphic commitment scheme!
Verification in PDAS
Pedersen commitment scheme is homomorphic:
$C_{r_1}(x_1) \cdot C_{r_2}(x_2) = g^{x_1 + x_2} h^{r_1 + r_2} = C_{r_1 + r_2}(x_1 + x_2)$
[Figure: The client asks "What is x1 + x2?" The Service Provider computes $\hat{x} = x_1 + x_2$ and $\hat{r} = r_1 + r_2$ and returns $\hat{x}, \hat{r}$ together with the commitments $C_{r_1}(x_1) = g^{x_1} h^{r_1}$ and $C_{r_2}(x_2) = g^{x_2} h^{r_2}$ signed by data owner.]
Verify: $C_{\hat{r}}(\hat{x}) = C_{r_1}(x_1) \cdot C_{r_2}(x_2)$
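The homomorphic check can be sketched as follows, again with toy group parameters that are assumptions and not secure. The sketch omits the data owner's signature and does not model how the service provider obtains the aggregate randomness; it only shows that the product of the individual commitments equals a commitment to the sum.

```python
import secrets

P_MOD, Q_ORD, G, H = 1019, 509, 4, 9  # toy group parameters (assumed, insecure)

def commit(x, r):
    """Pedersen commitment C_r(x) = g^x * h^r mod p."""
    return pow(G, x, P_MOD) * pow(H, r, P_MOD) % P_MOD

# Data owner: commits to each entry (and, in PDAS, signs the commitments).
data = [12, 30, 7]
rand = [secrets.randbelow(Q_ORD) for _ in data]
commitments = [commit(x, r) for x, r in zip(data, rand)]

# Service provider: returns the aggregate value and the aggregate randomness.
x_hat = sum(data)
r_hat = sum(rand) % Q_ORD

def client_verify(x_hat, r_hat, commitments):
    """Client: multiply the commitments and compare with C_{r_hat}(x_hat)."""
    product = 1
    for c in commitments:
        product = product * c % P_MOD
    return commit(x_hat, r_hat) == product
```

The client never sees the individual entries in `data`; it needs only `x_hat`, `r_hat`, and the authenticated commitments.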
Verification in PDAS
• Use Merkle hash tree to improve efficiency.
• Data owner only signs once: the root hash.
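[Figure: Merkle hash tree over the commitments. The root hroot has children h0 and h1; h0 has children h00 and h01; h1 has children h10 and h11. The leaves are the commitments: $C_{r_1}(x_1)$ and $C_{r_2}(x_2)$ under h00, $C_{r_3}(x_3)$ and $C_{r_4}(x_4)$ under h01, $C_{r_5}(x_5)$ and $C_{r_6}(x_6)$ under h10, $C_{r_7}(x_7)$ and $C_{r_8}(x_8)$ under h11.]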
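A Merkle hash tree over the commitments might be sketched as below. The choice of SHA-256, the layout (each commitment is hashed individually before pairing, which may differ from the tree on the slide), and the function names are assumptions; signing the root is omitted.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash function for tree nodes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return all levels, from hashed leaves up to [root].

    Assumes len(leaves) is a power of two.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def auth_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # the sibling differs in the lowest bit
        index //= 2
    return path

def verify_leaf(leaf, index, path, root):
    """Recompute the root from one leaf and its authentication path."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

With n commitments, the data owner signs only `build_tree(...)[-1][0]`, and a client checks any single commitment using log2(n) sibling hashes.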
Outline
• Motivation
• PDAS Architecture and Protocol
• Secure Computation of Aggregate Queries
• Correctness Verification
• Conclusions and Future Work
Security Properties of PDAS
• Secrecy: Only query results are revealed.
• Security: Commitments are computationally
binding and unconditionally hiding.
• Correctness: Accuracy, integrity guaranteed.
• Collusion resistance: Privacy is protected
against k-1 collaborating adversaries.
• Accountability: Malicious SPs will be caught.
In practice, may relax some properties to achieve greater
functionality. Details in corrected version of paper.
Efficiency of PDAS
• Setup cost is O(nm) time* for data owner,
but there is no maintenance cost.
• Space required is O(n) for each SP.
• Time complexity to compute a query over
subset S is only O(|S|) for each SP, plus
O(|S| log n) communication cost.
• Verification has computational and
communication cost O(min(|S| log n, n)).
Extensions
• Dynamic databases
– Support efficient addition/deletion
• Multiple data owners
• Load balancing
• Selection over insensitive attributes
– “Mixed” databases
– Guaranteeing completeness
Future Work
• Complex queries
– Nested queries
– Selection over sensitive attributes
– MAX, MIN
• Inference control
– Differential privacy [Dwork06]
• Private Information Retrieval
– [Chor, Goldreich, Kushilevitz, Sudan ’95]
Conclusions
PDAS accomplishes the following goals:
• A distributed architecture for computing
aggregate queries over sensitive data in
outsourced databases.
• An efficient protocol for verifying the
accuracy and integrity of query results.
• A secure system that is robust against a
network of k-1 collaborating adversaries.
Thank you!
Corrected version to be available soon:
http://www.cs.rutgers.edu/~danfeng/
Extra Slides
Our Solution: Secret Sharing
• How to enforce a query response policy?
"Please give me your share of Σ Dj!" (SUM = ?)
"Okay, sure!"
Our Solution: Secret Sharing
• How to enforce a query response policy?
"Please give me your share of x!"
"No, I’m not supposed to..."
Secret Sharing
Related Work
• H. Hacigümüs, B. Iyer, S. Mehrotra. “Efficient Execution
of Aggregation Queries over Encrypted Relational
Databases.” DASFAA, 2004.
• F. Chin. “Security Problems on Inference Control for
SUM, MAX, and MIN Queries.” Journal of ACM, 1986.
• G. Jagannathan, R. Wright. “Private Inference Control for
Aggregate Database Queries.” PADM, 2007.