Kobbi Nissim
Microsoft
Based on a position paper with Nina Mishra
q = (f ,i
1
,…,i k
) f (d i1
,…,d ik
)
Statistical database
• Dataset: {d
1
,…,d n
}
– Entries d i
: Real, Integer, Boolean
• Query: q = (f ,i
1
,…,i k
)
– f : Min, Max, Median, Sum, Average, Count…
• Some users are bad…
2
Here’s a new query: q i+1
Here’s the answer
OR
Query denied (as the answer would cause privacy loss)
Auditor
Query log q
1
,…,q i
Statistical database
3
• [Adam, Wortmann 89] classify auditing as a query restriction method
– Such methods limit the queries users may post, usually imposing some structure (e.g. combinatorial, algebraic)
– “Auditing of an SDB involves keeping up-to-date logs of all queries made by each user (not the data involved) and constantly checking for possible compromise whenever a new query is issued”
• Partial motivation:
May allow for more queries to be posed, if no privacy threat occurs
• Early work: Hofmann 1977, Schlorer 1976, Chin,
Ozsoyoglu 1981, 1986
• Recent interest:
Kleinberg, Papadimitriou, Raghavan 2000,
Li, Wang, Wang, Jajodia 2002, Jonsson, Krokhin 2003
4
• Out of the scope for this talk (but important):
– Very weak privacy guarantee: Privacy breached
(only) when a database entry may be uniquely deduced
– Exact answers given
• Important for this talk:
– Data taken into account in decision procedure
• Answers to q
1
,…,q i
• Denials ignored and q i+1 taken into account
5
Sum/Max
[Chin]
Boolean
[KPR00]
Max [KPR00]
Interval based
[LWWJ02]
Generalized results [JK03]
Data Queries real Sum/max d i
0/1 Sum
Breach Complexity learned
--”--
NP-hard
NP-hard
Real Max --”-d i
[a,b] sum d i within accuracy
.
PTIME
PTIME
NP-hard /
PTIME
6
d i real, sum/max queries q1 = sum(d1,d2,d3) sum(d1,d2,d3) = 15 q2 = max(d1,d2,d3)
Denied (the answer would cause privacy loss)
Oh well… d1=d2=d3 = 5
I win!
Auditor
7
d i
[0,100], sum queries,
=1 (PTIME) q1 = sum(d1,d2)
Sorry, denied q2 = sum(d2,d3) sum(d2,d3) = 50 d1,d2
[0,1] d3
[49,50]
Auditor
8
Colonel Oliver North, on the Iran-Contra Arms Deal:
On the advice of my counsel I respectfully and regretfully decline to answer the question based on my constitutional rights.
David Duncan, Former auditor for Enron and partner in Andersen:
Mr. Chairman, I would like to answer the committee's questions, but on the advice of my counsel I respectfully decline to answer the question based on the protection afforded me under the Constitution of the United
States.
9
d1 d2
d3 d4 d5 d6 d7 d8 … d n-1 d n d i real q1 = max(d1,d2,d3,d4)
M
1234 q2 = max(d1,d2,d3)
If denied: d4=M
1234 q2 = max(d1,d2)
M
123
/ denied
M
12
/ denied
If denied: d3=M
123
Recover 1/8 of the database!
Auditor
10
d1 d2 d3 d4 d5 d6 d7 d8 … d n-1 d n d i
Boolean q1 = sum(d1,d2)
1 / denied q i q2=sum(d2,d3) denied iff d i
…
1 / denied
= d i+1
learn database/complement
Let d i
,d j
,d k not all equal, where q i-1
, q i
, q j-1
, q j
, q k-1
, q k all denied q2=sum(d i
,d j
,d k
)
1 / 2
Recover the entire database!
Auditor
11
• Obvious problem: denied queries ignored
– Algorithmic problem: not clear how to incorporate denials in the deicion
• Subtle problem:
– Query denials leak (potentially sensitive) information
• Users cannot decide denials by themselves
Possible assignments to {d
1
,…,d n
}
Assignments consistent with (q
1
,…q i
) q i+1 denied
12
Decision data q
1
,…,q i
, q i+1 i i
, q i i
, a i+1
Examples
• Size overlap restriction
• Algebraic structure
• Sum/Max, Interval based,
Boolean, Max
• Cell suppression
• k-anonimity
“safe”
“unsafe”
*Note: can work in “unsafe” region, but need to prove denials do not
13 leak crucial information
An auditor is simulatable if a simulator exists s.t.: q
1
,…,q i q i+1 q i+1
Statistical database
Auditor q
1
,…,q i a
1
,…,a i
Simulator
Deny/answer
Deny/answer
Simulation
denials do not leak information
* `self auditors’ in [DN03]
14
• Subtleties in current definition of auditors allow for information leakage, and potentially, privacy breaches
– Denials are not taken into account
– Auditor uses information not available to user
• Simulatable auditors provably don’t leak information in decision
– New starting point for research on auditors
15