Location Privacy

Context
  - Better localization technology + pervasive wireless connectivity = location-based applications

Location-Based Apps
  - For example:
    - GeoLife shows grocery list near WalMart
    - Micro-Blog allows location-scoped querying
    - Location-based ad: coffee coupon at Starbucks
    - …
  - Location expresses the context of the user, facilitating content delivery
  - It's as if location is the IP address for content

Double-Edged Sword
  - While location drives this new class of applications, it also violates the user's privacy
  - The sharper the location, the richer the app, the deeper the violation

The Location Based Service Workflow
  - Request (client to server): "Retrieve all available services in client's location"
  - Forward to local service (server to LBS database): "Retrieve all available services in location"
  - Reply (LBS database to server), then reply (server to client)
  [Figure: Client, Server, LBS (Location Based Service) Database]

The Location Anonymity Problem
  - Request: "Retrieve all bus lines from location to address"
  - Privacy violated
  [Figure: Client, Server, LBS (Location Based Service) Database]

Double-Edged Sword
  - Moreover, a range of apps are PUSH based and require continuous location information
    - Phone detected at Starbucks: PUSH a coffee coupon
    - Phone located on highway: query traffic congestion

Location Privacy
  - Problem: continuous location exposure is a serious threat to privacy
  - Research: preserve privacy without sacrificing the quality of continuous location-based apps

Just Call Yourself "Freddy"
  - Pseudonyms [Gruteser04]
  - Effective only when location exposure is infrequent
  - Else, spatio-temporal patterns are enough to deanonymize … think breadcrumbs
  [Figure: users John, Leslie, Jack, Susan, Alex; Romit's Office]

A Customizable k-Anonymity Model for Protecting Location Privacy
  Paper by: B. Gedik, L. Liu (Georgia Tech)
  Slides adapted from: Tal Shoseyov

Location Anonymity
  - "A message from a client to a database is called location anonymous if the client's identity cannot be distinguished from other users based on the client's location information."

k-Anonymity
  - "A message from a client to a database is called location k-anonymous if the client cannot be identified by the database based on the client's location from other k-1 clients."

Implementation of Location Anonymity
  1. Client sends a plain request to the server
  2. Server transforms the message by "anonymizing" the location data in the message
  3. Server sends the "anonymized" message to the database
  4. Database executes the request according to the received anonymous data
  5. Database replies to the server with the compiled data
  6. Server forwards the data to the client

Implementation of Location k-Anonymity
  - Spatial Cloaking: setting a range of space to be a single box, where all clients located within the range are said to be in the "same location"
  - Temporal Cloaking: setting a time interval, where all the clients in a specific location sending a message in that time interval are said to have sent the message in the "same time"
  - Spatial-Temporal Cloaking: setting a range of space and a time interval, where all the messages sent by clients inside the range in that time interval are said to share the same location and time. This spatial and temporal area is called a "cloaking box".
  [Figure: cloaking boxes drawn on axes x, y, and t]

Previous solutions
  - M. Gruteser, D. Grunwald (2003): for a fixed k value, the server finds the smallest area around the client's location that potentially contains k-1 other clients, and monitors that area over time until such k-1 clients are found.
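To make the fixed-k spatial cloaking idea just described concrete, here is a minimal sketch. It is not the algorithm from Gruteser and Grunwald or from Gedik and Liu; it is a simplified stand-in that doubles the size of a grid-aligned cell until the cell holding the client also holds at least k-1 other users. The names `Box`, `cloak`, `min_cell`, and `max_cell` are hypothetical, and coordinates are assumed to be planar metres.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned cloaking box; half-open so that grid cells tile the plane."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, p: Point) -> bool:
        return self.x_min <= p[0] < self.x_max and self.y_min <= p[1] < self.y_max


def cloak(client: Point, others: List[Point], k: int,
          min_cell: float = 10.0, max_cell: float = 10240.0) -> Optional[Box]:
    """Return the smallest grid-aligned cell (side doubling from min_cell)
    that contains the client and at least k-1 other users, or None if no
    cell up to max_cell is populated enough (e.g. a sparse region)."""
    size = min_cell
    while size <= max_cell:
        # Snap to the grid so the box is not centred on the client,
        # which would reveal the exact position.
        x0 = math.floor(client[0] / size) * size
        y0 = math.floor(client[1] / size) * size
        box = Box(x0, y0, x0 + size, y0 + size)
        neighbours = sum(1 for p in others if box.contains(p))
        if neighbours >= k - 1:
            return box
        size *= 2
    return None
```

Calling `cloak((15.0, 15.0), [(12.0, 18.0), (35.0, 5.0), (70.0, 70.0)], k=3)` grows the cell from 10 m to 40 m before two other users fall inside it. With a larger k the box grows further or the call returns `None`, which shows how the reported location gets coarser as the population thins out.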
  - Drawback: fixed anonymity value for all clients (service dependent)

Add Noise
  - K-anonymity [Gedik05]
    - Convert location to a space-time bounding box
    - Ensure K users in the box
    - Location apps reply to the boxed region
  - Issues
    - Poor quality of location
    - Degrades in sparse regions
    - Not real-time
  [Figure: bounding box containing "You" and other users, K=4]

Confuse Via Mixing
  - Path intersections are an opportunity for privacy
  - If users intersect in space-time, an observer cannot say who is who later
  - Unfortunately, users may not intersect in both space and time
  [Figure: paths toward Hospital and Airport, each marked "?"]

Hiding Until Mixed
  - Partially hide locations until users are mixed [Gruteser07]
  - Expose after a delay
  - But delays are unacceptable to real-time apps
  [Figure: paths toward Hospital and Airport]

Existing solutions seem to suggest:
  - Privacy and Quality of Localization (QoL) is a zero-sum game
  - Need to sacrifice one to gain the other

Hiding Stars with Fireworks: Location Privacy through Camouflage

Goal
  - Break away from this tradeoff
  - Target:
    - Spatial accuracy
    - Real-time updates
    - Privacy guarantees
    - Even in sparse populations
  - New proposal: CacheCloak

The Intuition
  - Predict until paths intersect
  - Expose the predicted intersection to the application
  - Cache the information on each predicted location
  [Figure: predicted paths toward Hospital and Airport]

CacheCloak System Design and Evaluation

Architecture
  - Assume a trusted privacy provider
  - Reveal location to CacheCloak
  - CacheCloak exposes anonymized location to the location app
  [Figure: CacheCloak in front of Loc. App1, Loc. App2, Loc. App3, Loc. App4]

CacheCloak Walkthrough
  [Figure sequence: Location Based Application and CacheCloak]
  1. In Steady State …
  2. Prediction: backward prediction and forward prediction
  3. Predicted Intersection: the predicted path
  4. Query: the Location Based Application is queried along the predicted path
  5. LBA Responds: array of responses
  6. Cached: cached responses hold the location based information
  7. Cached Response: cached responses are used along the predicted path

Benefits
  - Real-time
    - Response ready when user arrives at predicted location
  - High QoL
    - Responses can be specific to location
    - Overhead on the wired backbone (caching helps)
  - Entropy guarantees
    - Entropy increases at traffic intersections
  - Sparse population
    - Can be handled with dummy users, false branching

Quantifying Privacy
  - City converted into a grid of small squares (pixels)
  - Users are located at a pixel at a given time
  - Each pixel is associated with an 8x8 matrix
    - Element (x, y) = probability that the user enters at x and exits at y
  - Probabilities diffuse
    - At intersections
    - Over time
  - Privacy = entropy: $E_{\text{user}} = -\sum_{i \in \text{pixels}} p_i \log p_i$
  [Figure: a pixel with entry side x and exit side y]

Diffusion
  - Probability of the user's presence diffuses
  - Diffusion gradient computed based on history
    - i.e., what fraction of users take a right turn at this intersection
  [Figure: road intersection at Time t1, Time t2, Time t3]

Evaluation
  - Trace based simulation
    - VanetMobiSim + US Census Bureau trace data
    - Durham map with traffic lights, speed limits, etc.
    - 6km x 6km area, 10m x 10m pixels, 1000 cars
    - Vehicles follow Google map paths
    - Performs collision avoidance

Results
  - High average entropy
  - Quite insensitive to user density (good for sparse regions)
  - Minimum entropy reasonably high
  [Figures: bits of mean entropy, with max. and min.; axes: time (minutes) and number of users (N)]

Results: Peak Counting
  - Peaks = # of places where the attacker's confidence is > threshold
  [Figures: mean # of peaks against time (seconds) and against number of users (N)]

Limitations, Discussions …
  - CacheCloak overhead
    - Application replies to a lot of queries
    - However, the overhead is on wired infrastructure
    - Caching reduces this overhead significantly
  - CacheCloak assumes the same, indistinguishable query
    - Different queries can deanonymize
    - Possible through query combination … future work
  - Per-user privacy guarantee not yet supported
    - Adaptive branching & dummy users
  - CacheCloak: a central trusted entity
    - Distributed version proposed in the paper

Closing Thoughts
  - Two nodes may intersect in space but not in time
  - Mixing is not possible without sacrificing timeliness
  - Mobility prediction creates space-time intersections
  - Enables virtual mixing in future
  - CacheCloak
    - Implements the prediction and caching function
    - High entropy possible even under sparse population
    - Spatio-temporal accuracy remains uncompromised

Thank You
  For more related work, visit: http://synrg.ee.duke.edu
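
Supplementary sketch for the Quantifying Privacy, Diffusion, and Peak Counting slides: the snippet below shows how an entropy-based privacy score and a peak count could be computed from a per-pixel probability map, and how one intersection spreads probability according to historical turn fractions. It is an illustration under stated assumptions, not the evaluation code behind the reported results. The function names (`entropy_bits`, `count_peaks`, `diffuse`), the dictionary layout, and the turn fractions are hypothetical, and the logarithm is taken base 2 so the score is in bits.

```python
import math
from typing import Dict, Tuple

Pixel = Tuple[int, int]
Distribution = Dict[Pixel, float]


def entropy_bits(dist: Distribution) -> float:
    """Entropy -sum(p_i * log2(p_i)) over pixels, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


def count_peaks(dist: Distribution, threshold: float) -> int:
    """Number of pixels where the attacker's confidence exceeds threshold."""
    return sum(1 for p in dist.values() if p > threshold)


def diffuse(dist: Distribution,
            turn_probs: Dict[Pixel, Dict[Pixel, float]]) -> Distribution:
    """Move probability one step: each pixel splits its mass among the
    next pixels according to (assumed) historical turn fractions.
    Pixels without an entry keep their mass in place."""
    out: Distribution = {}
    for pixel, p in dist.items():
        for nxt, q in turn_probs.get(pixel, {pixel: 1.0}).items():
            out[nxt] = out.get(nxt, 0.0) + p * q
    return out


# The attacker knows exactly where the user is: zero entropy, one peak.
before = {(0, 0): 1.0}

# Hypothetical history: half go straight, a quarter turn each way.
turns = {(0, 0): {(0, 1): 0.5, (1, 0): 0.25, (-1, 0): 0.25}}
after = diffuse(before, turns)

print(entropy_bits(before), entropy_bits(after))
print(count_peaks(before, 0.4), count_peaks(after, 0.4))
```

In this toy case the entropy rises from 0 to 1.5 bits after a single intersection, which matches the claim on the Benefits slide that entropy increases at traffic intersections.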