A Privacy-Preserving Index for Range Queries
Paper by: Bijit Hore, Sharad Mehrotra, Gene Tsudik
Presented by: Akshay Phadke

What this paper is about
- Database as a Service (DAS)
- Improving the existing bucketization technique
- Identification of privacy measures in DAS
- Development of a novel privacy-preserving rebucketization technique

DAS and its implications
- Database-as-a-service: organizations outsource data management to a service provider.
- Privacy is a concern because the data is stored at the service provider.
- One possible solution: Q = Qsec + Qunsec

Previous Solutions
- Bucketization for range queries: the attribute domain is partitioned into a set of buckets, each identified by a tag.
- Deterministic encryption for join queries.
- Drawbacks:
  - Lacks in-depth treatment of privacy scenarios.
  - Privacy is subjective: no clear specification.

Before we proceed
- Etuple: a tuple stored in encrypted form.
- Crypto-indices: indices created on sensitive attributes.
- Bucket_id: each set (bucket) created is assigned a unique random tag.

Example
- Allocating a large number of buckets to crypto-indices increases query precision but reduces privacy.
- On the other hand, a small number of buckets increases privacy but adversely affects performance.

Uniform Query Distribution
- Total False Positives:
- Average Query Precision:
- Goal: Minimize the total number of false positives.

Algorithm Basics
- The number of false positives depends on the width of the bucket (i.e. its minimum and maximum values) and the sum of the frequencies of the values in it.
- To solve the problem, use the optimal substructure property: split the problem into two smaller subproblems.

Algorithm

Variance, ASEE and Entropy
- Maximize Var(x)

Controlled Diffusion (CDf)
- QoS is specified as the maximum allowed performance degradation factor (K).
- The CDf algorithm increases the privacy of buckets.
- Diffusion is carried out in a controlled manner.
- Elements are diffused into composite buckets.
- d = K · |Bi| / fCB
- Composite buckets overlap, whereas optimal buckets do not.
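
The optimal substructure idea from the Algorithm Basics slide can be sketched as a small dynamic program. This is only an illustration, not the paper's algorithm as published: it assumes the cost of a bucket is its width times the total frequency of the values in it, which is one reading of the statement that false positives depend on bucket width and the sum of the frequencies. The function name `optimal_buckets` and its parameters are hypothetical.

```python
def optimal_buckets(values, freqs, m):
    """Split sorted distinct `values` into `m` contiguous buckets.

    ASSUMED cost model (not taken from the paper):
        cost(bucket) = (max - min + 1) * (sum of frequencies in the bucket)
    Returns (total_cost, [(low, high), ...]).
    """
    n = len(values)
    if n != len(freqs) or not 1 <= m <= n:
        raise ValueError("need len(values) == len(freqs) and 1 <= m <= n")

    # Prefix sums give the frequency total of any contiguous run in O(1).
    prefix = [0]
    for f in freqs:
        prefix.append(prefix[-1] + f)

    def bucket_cost(i, j):
        """Cost of one bucket holding values[i..j], both ends inclusive."""
        width = values[j] - values[i] + 1
        return width * (prefix[j + 1] - prefix[i])

    inf = float("inf")
    # best[k][j]: minimum cost of covering the first j values with k buckets.
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    # split[k][j]: index where the last of those k buckets starts.
    split = [[0] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0

    for k in range(1, m + 1):
        for j in range(k, n + 1):
            # Optimal substructure: the best k-bucket solution is a best
            # (k-1)-bucket solution on a prefix plus one final bucket.
            for i in range(k - 1, j):
                cost = best[k - 1][i] + bucket_cost(i, j - 1)
                if cost < best[k][j]:
                    best[k][j] = cost
                    split[k][j] = i

    # Walk the split table backwards to recover the bucket boundaries.
    buckets = []
    j = n
    for k in range(m, 0, -1):
        i = split[k][j]
        buckets.append((values[i], values[j - 1]))
        j = i
    buckets.reverse()
    return best[m][n], buckets


# Two clusters of values, two buckets: the split falls in the gap.
print(optimal_buckets([1, 2, 3, 10, 11], [1, 1, 1, 1, 1], 2))
# → (13, [(1, 3), (10, 11)])
```

Under this assumed cost, wide buckets spanning a gap in the data are penalised, so the program places bucket boundaries at the gaps. The triple loop runs in O(m · n²) time, which is fine for a sketch but may need pruning on large attribute domains.
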

Experiments
Data Set
- Synthetic Data Set
- Real Data Set
- Benchmark Query Set
Measurements
- Decrease in Precision
- Privacy Measure
- Performance-Privacy Trade Off
- Time taken

Results
- Observed decrease in query precision was less than 3
- For the privacy measure: standard deviation increases by a large factor. Entropy grows more slowly.

Critique
- Although it starts out promising, the paper becomes a mathematics paper and seems to lose focus on its actual intent.
- The examples show only the first step and the final solution, with no intermediate steps.
- The paper doesn't explain the results.

Thank you