
A Privacy-Preserving Index
for Range Queries
Paper By: Bijit Hore, Sharad Mehrotra, Gene Tsudik
Presented By: Akshay Phadke
What this paper is about




Database as a Service (DAS)
Improving the existing Bucketization Technique
Identification of privacy measures in DAS.
Development of a novel privacy-preserving rebucketization technique.
DAS and its implications



Database-as-a-service: organizations outsource
data management to a service provider.
Privacy is a concern because the data is stored at
the service provider.
One possible solution: Q = Qsec + Qunsec
Previous Solutions


Bucketization for range queries
Attribute domain is partitioned into a set of
buckets, each identified by a tag.
Deterministic encryption for join queries.
Drawbacks:
- Lacks in-depth analysis of privacy scenarios.
- Privacy is subjective: no clear specification.
Before we proceed



Etuple: a tuple stored in encrypted form.
Crypto-indices: indices created on sensitive
attributes.
Bucket_id: the unique random tag assigned to each
set (bucket) that is created.
Example
Allocating a large number of buckets to crypto-indices increases query precision but reduces
privacy. On the other hand, a small number of buckets increases privacy but adversely affects
performance.
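The trade-off above can be illustrated with a minimal sketch. It assumes equi-width buckets over an integer domain and measures precision as the fraction of returned values that actually satisfy the range query. The function names (make_buckets, assign_tags, server_query, precision) are hypothetical and are not taken from the paper.

```python
import random

def make_buckets(lo, hi, m):
    """Split the integer domain [lo, hi] into m contiguous buckets (assumes m <= hi - lo + 1)."""
    n = hi - lo + 1
    return [(lo + i * n // m, lo + (i + 1) * n // m - 1) for i in range(m)]

def assign_tags(bounds, seed=0):
    """Give every bucket a distinct random tag (the Bucket_id)."""
    rng = random.Random(seed)
    return rng.sample(range(10**6), len(bounds))

def server_query(q_lo, q_hi, bounds, tags):
    """Translate the range [q_lo, q_hi] into the set of bucket tags sent to the server."""
    return {tags[i] for i, (b_lo, b_hi) in enumerate(bounds)
            if b_lo <= q_hi and q_lo <= b_hi}

def precision(values, q_lo, q_hi, bounds):
    """Fraction of the values returned by the server that truly match the query."""
    hit = [(b_lo, b_hi) for b_lo, b_hi in bounds if b_lo <= q_hi and q_lo <= b_hi]
    returned = [v for v in values if any(b_lo <= v <= b_hi for b_lo, b_hi in hit)]
    wanted = [v for v in returned if q_lo <= v <= q_hi]
    return len(wanted) / len(returned) if returned else 1.0

values = list(range(1, 101))
coarse = make_buckets(1, 100, 2)    # few buckets: more privacy, more false positives
fine = make_buckets(1, 100, 10)     # many buckets: higher precision, less privacy
print(precision(values, 10, 19, coarse), precision(values, 10, 19, fine))
```

With only two buckets the server has to return half of the data for this query, while ten buckets narrow the result considerably; the price is that each tag reveals more about the values it covers.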
Uniform Query Distribution

Total False Positives:

Average Query Precision:
Goal: Minimize the total number of false positives.
Algorithm Basics


The number of false positives depends on the width
of the bucket (i.e. its minimum and maximum
values) and the sum of the frequencies.
To solve the problem, use the Optimal Substructure
property: split the problem into two smaller
subproblems.
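A minimal dynamic-programming sketch of this idea follows. It assumes the cost of a bucket is its width (max - min + 1) multiplied by the sum of the frequencies it holds, which matches the description above but may differ in detail from the paper's exact cost function. The names bucket_cost and optimal_buckets are hypothetical.

```python
def bucket_cost(values, freqs, i, j):
    """Assumed cost of one bucket covering values[i..j] inclusive: width times total frequency."""
    return (values[j] - values[i] + 1) * sum(freqs[i:j + 1])

def optimal_buckets(values, freqs, m):
    """Partition sorted distinct values into m contiguous buckets of minimum total cost.

    Optimal substructure: the best split of the first j values into k buckets
    is the best split of the first i values into k - 1 buckets plus one more
    bucket covering values[i..j-1]. Assumes 1 <= m <= len(values).
    """
    n = len(values)
    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    cut = [[0] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0
    for k in range(1, m + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + bucket_cost(values, freqs, i, j - 1)
                if cost < best[k][j]:
                    best[k][j] = cost
                    cut[k][j] = i
    # Walk the recorded cut points backwards to recover the buckets.
    buckets, j = [], n
    for k in range(m, 0, -1):
        i = cut[k][j]
        buckets.append(values[i:j])
        j = i
    return best[m][n], buckets[::-1]

cost, buckets = optimal_buckets([1, 2, 3, 10, 11, 12], [1, 1, 1, 1, 1, 1], 2)
print(cost, buckets)
```

On this toy input the split falls in the gap between 3 and 10, since a bucket spanning the gap would be wide without holding any extra data.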
Algorithm
Variance, ASEE and Entropy

Maximize Var(x)
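To make two of these measures concrete, the sketch below computes the variance and the entropy of the value distribution inside a single bucket, given as a mapping from value to frequency. The helper names are hypothetical, and ASEE is left out because the slides do not give its definition.

```python
import math

def bucket_distribution(freqs):
    """Turn {value: frequency} into {value: probability} within one bucket."""
    total = sum(freqs.values())
    return {v: f / total for v, f in freqs.items()}

def variance(freqs):
    """Variance of the values in the bucket: a larger spread makes a guess less accurate."""
    p = bucket_distribution(freqs)
    mean = sum(v * pv for v, pv in p.items())
    return sum(pv * (v - mean) ** 2 for v, pv in p.items())

def entropy(freqs):
    """Shannon entropy in bits of the value distribution in the bucket."""
    p = bucket_distribution(freqs)
    return -sum(pv * math.log2(pv) for pv in p.values() if pv > 0)

narrow = {10: 5, 11: 5}   # two adjacent values
wide = {1: 5, 100: 5}     # two values far apart
print(variance(narrow), variance(wide), entropy(narrow), entropy(wide))
```

Both buckets have the same entropy, since each holds two equally likely values, but the wide bucket has a far larger variance. This is one way to see why the two measures are tracked separately.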
Controlled Diffusion(CDf)






QoS is the maximum allowed performance
degradation factor (K).
CDf algorithm increases privacy of buckets.
Diffusion carried out in a controlled manner.
Elements diffused into composite buckets.
d = K · |Bi| / fCB
Composite buckets can overlap, whereas optimal
buckets do not.
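The diffusion step can be sketched roughly as follows. The code makes several assumptions that the slides do not state: |Bi| is read as the number of elements in optimal bucket Bi, fCB as the target size of a composite bucket (total elements divided by the number of buckets), d is rounded and clamped to a valid count, and the elements of Bi are spread uniformly at random over d randomly chosen composite buckets. The names diffusion_width and controlled_diffusion are hypothetical, and the sketch does not enforce equal-sized composite buckets.

```python
import random

def diffusion_width(bucket_size, k, f_cb, m):
    """d = K * |Bi| / fCB, rounded and clamped to [1, m] (the clamping is an assumption)."""
    return max(1, min(m, round(k * bucket_size / f_cb)))

def controlled_diffusion(optimal_buckets, k, seed=0):
    """Spread each optimal bucket over d composite buckets chosen at random."""
    rng = random.Random(seed)
    m = len(optimal_buckets)
    total = sum(len(b) for b in optimal_buckets)
    f_cb = total / m                      # assumed target size of a composite bucket
    composite = [[] for _ in range(m)]
    for bucket in optimal_buckets:
        d = diffusion_width(len(bucket), k, f_cb, m)
        targets = rng.sample(range(m), d)  # d distinct composite buckets
        for element in bucket:
            composite[rng.choice(targets)].append(element)
    return composite

optimal = [[1, 2, 3, 4], [10, 11], [20, 21, 22, 23, 24, 25]]
print(controlled_diffusion(optimal, k=2))
```

A larger K spreads each optimal bucket over more composite buckets, which widens the value range behind each tag at the cost of more false positives per query.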
Experiments


Data Set
- Synthetic Data Set
- Real Data Set
- Benchmark Query Set
Measurements
- Decrease in Precision
- Privacy Measure
- Performance-Privacy Trade Off
- Time taken
Results


Observed decrease in query precision was less than 3
For privacy measure: standard deviation increases by
a large factor. Entropy grows more slowly.
Critique



Although it starts out promising, the paper becomes a
mathematics paper and seems to lose focus on its
actual intent.
Examples mentioned just have the first step and the
final solution, no intermediate steps.
The paper doesn’t explain the results.
Thank you