Quantifying Location Privacy
Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, Jean-Pierre Hubaux
May 2011

A location trace is not only a set of positions on a map. The contextual information attached to a trace tells much about our habits, interests, activities, and relationships.

[Image: map visualization, envisioningdevelopment.net/map]

Location-Privacy Protection
Distort location information before exposing it to others.

Location-Privacy Protection
• Anonymization (pseudonymization) – replacing the actual username with a random identity
• Location obfuscation – hiding the location, adding noise, reducing precision
A common formal framework is MISSING. How to evaluate/compare various protection mechanisms? Which metric to use?
[Images: an original trace next to a low-accuracy and a low-precision version. Pictures from Krumm 2007.]

Location Privacy: A Probabilistic Framework
[Diagram: users u1 … uN produce actual traces (vectors of actual events) over the timeline 1, 2, …, T. The LPPM obfuscates and anonymizes them into observed traces (vectors of observed events) under pseudonyms 1 … N. The attacker uses past traces (vectors of noisy/missing events) for knowledge construction (KC), yielding the users' mobility profiles (Markov-chain transition matrices with entries Pij between regions ri and rj), and then runs an attack on the observed traces to output reconstructed traces.]

Location-Privacy Preserving Mechanism
Location-obfuscation function: hiding, reducing precision, adding noise, location generalization, …
A probabilistic mapping of a location to a set of locations.

Location-Privacy Preserving Mechanism
Anonymization function: replace real usernames with random pseudonyms (e.g., integers 1 … N).
A random permutation of usernames.

Location-Privacy Preserving Mechanism
The actual trace of user u is anonymized and location-obfuscated into the observed trace of user u, under pseudonym u'.
Spatiotemporal event: <Who, When, Where>

Adversary Model
• Observation: anonymized and obfuscated traces
• Knowledge: the users' mobility profiles, and the LPPM's PDFs (PDF of anonymization, PDF of obfuscation)

Learning Users' Mobility Profiles (adversary knowledge construction)
From prior knowledge (past traces: vectors of noisy/missing past events), the attacker creates a mobility profile for each user.
Mobility profile: a Markov chain on the set of locations.
Task: estimate the MC transition probabilities P^u.

Example – Simple Knowledge Construction
Prior knowledge for this example: 100 training traces for Alice, each listing her region per hour.

Day –100:  12   7   14   20   …
Day –99:   13   20  19   25   …
…
Day –1:    12   13  19   …
Time:      8am  9am 10am 11am …

Mobility profile for Alice: empirical transition frequencies; e.g., from region 12 she moves to each of the successor regions observed in training (7, 13, 19 on the slide) with probability ⅓. (A minimal sketch of this complete-trace case follows.)
But how to consider noisy/partial traces, e.g., knowing only the user's location in the morning (her workplace) and her location in the evening (her home)?
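As an aside (not part of the deck), the complete-trace case above amounts to counting transitions. A minimal Python sketch, with function names and toy traces of our own choosing (the numbers echo the slide's table):

```python
def estimate_mobility_profile(traces, num_regions):
    """Maximum-likelihood Markov-chain mobility profile from complete traces.

    traces: list of location sequences (region indices), one per training day.
    Returns P, where P[r][s] = Pr(next region = s | current region = r).
    """
    counts = [[0] * num_regions for _ in range(num_regions)]
    for trace in traces:
        for r, s in zip(trace, trace[1:]):   # consecutive (current, next) pairs
            counts[r][s] += 1
    profile = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row for regions never seen in training.
        profile.append([c / total if total else 1.0 / num_regions for c in row])
    return profile

# Toy traces echoing the slide's table (one row per day, one column per hour):
alice = [
    [12, 7, 14, 20],   # Day -100
    [13, 20, 19, 25],  # Day -99
    [12, 13, 19],      # Day -1
]
P = estimate_mobility_profile(alice, num_regions=40)
# 0.5 each here (two observed successors of region 12); the slide's 1/3 arises
# once a third successor of region 12 appears among the 100 training traces.
print(P[12][7], P[12][13])
```

This maximum-likelihood estimate breaks down exactly in the noisy/partial-trace case raised above, which is what the Gibbs-sampling solution on the next slide addresses.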
Learning Users' Mobility Profiles (adversary knowledge construction) – cont.
Our solution: use a Monte-Carlo method, Gibbs sampling, to estimate the probability distribution of the users' mobility profiles from such noisy/partial traces.

Adversary Model: Inference Attack Examples
• Localization attack: "Where was Alice at 8pm?" – What is the probability distribution over the locations for user 'Alice' at time '8pm'?
• Tracking attack: "Where did Alice go yesterday?" – What is the most probable trace (trajectory) for user 'Alice' for the time period 'yesterday'?
• Meeting disclosure attack: "How many times did Alice and Bob meet?"
• Aggregate presence disclosure: "How many users were present at restaurant x, at 9pm?"

Inference Attacks
Attacking everything jointly is computationally infeasible: the anonymization permutation σ alone can take N! values.
Our solution: decoupling de-anonymization from de-obfuscation.

De-anonymization
1 – Compute the likelihood of observing trace 'i' from user 'u', for all 'i' and 'u', using a hidden Markov process (HMP): the forward-backward algorithm. O(R²N²T)
2 – Compute the most likely user-to-pseudonym assignment using a maximum-weight assignment algorithm (e.g., the Hungarian algorithm), as in the sketch below. O(N⁴)
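A minimal sketch of step 2 (not the paper's code), assuming the per-pair log-likelihoods from step 1 are already available. SciPy's linear_sum_assignment solves the same maximum-weight bipartite matching problem as the Hungarian algorithm named on the slide; the toy numbers are ours:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def de_anonymize(log_likelihood):
    """Step 2: most likely user-to-pseudonym assignment sigma*.

    log_likelihood[u, i] = log Pr(observed trace i | user u's mobility profile),
    as produced by the forward-backward step. Maximizing the joint likelihood
    over all permutations = maximizing the summed log-likelihood of a perfect
    bipartite matching, so an assignment solver replaces the N! search.
    """
    users, nyms = linear_sum_assignment(-log_likelihood)  # minimize negated weights
    return dict(zip(users.tolist(), nyms.tolist()))

# Toy example: 3 users, 3 anonymized traces.
ll = np.log(np.array([
    [0.70, 0.20, 0.10],   # user 0 best explains trace 0
    [0.10, 0.10, 0.80],   # user 1 best explains trace 2
    [0.30, 0.60, 0.10],   # user 2 best explains trace 1
]))
print(de_anonymize(ll))   # {0: 0, 1: 2, 2: 1}
```

Decoupling pays off here: instead of scoring all N! permutations, the attack scores N² (user, pseudonym) pairs and lets the assignment solver pick the best matching.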
De-obfuscation
• Localization attack: given the most likely assignment σ*, the localization probability can be computed using the hidden Markov model: the forward-backward algorithm. O(R²T)
• Tracking attack: given the most likely assignment σ*, the most likely trace for each user can be computed using the Viterbi algorithm. O(R²T)

Location-Privacy Metric: Assessment of Inference Attacks
In an inference attack, the adversary estimates the true value of some random variable X (e.g., the location of a user at a given time instant). Let x_c (unknown to the adversary) be the actual value of X.
Three properties of the estimation's performance:
• Accuracy: how accurate is the estimate? (confidence level and confidence interval)
• Certainty: how focused is the estimate on a single value? (the entropy of the estimated random variable)
• Correctness: how close is the estimate to the true value (the real outcome)?

Location-Privacy Metric
The true outcome of a random variable is what users want to hide from the adversary. Hence, the incorrectness of the adversary's inference attack is the metric that defines the privacy of users.
Location privacy of user 'u' at time 't' with respect to the localization attack = incorrectness of the adversary (the expected estimation error), i.e., the posterior-weighted distance to the true location: LP_u(t) = Σ_x̂ Pr(x̂ | observed traces) · d(x̂, x_c), where d is a distance over regions.

Location-Privacy Meter
A tool to quantify location privacy: http://lca.epfl.ch/projects/quantifyingprivacy
• You provide the tool with
– some traces to learn the users' mobility profiles,
– the PDF associated with the protection mechanism,
– some traces to run the tool on.
• LPM provides you with
– the location privacy of users with respect to various attacks: localization, tracking, meeting disclosure, aggregate presence disclosure, …

LPM: An Example
CRAWDAD dataset
• N = 20 users
• R = 40 regions
• T = 96 time instants
• Protection mechanism:
– anonymization
– location obfuscation: hiding the location; precision reduction (dropping low-order bits from the x, y coordinates of the location)

LPM: Results – Localization Attack
[Plot: users' location privacy under the localization attack; the 'no obfuscation' setting is the baseline.]

Assessment of Other Metrics
[Plots: k-anonymity and entropy, assessed against the correctness metric.]

Conclusion
• A unified formal framework to describe and evaluate a variety of location-privacy preserving mechanisms with respect to various inference attacks
• Modeling LPPM evaluation as an estimation problem – throw attacks at the LPPM
• The right metric: expected estimation error
• An object-oriented tool (Location-Privacy Meter) to evaluate/compare location-privacy preserving mechanisms
http://people.epfl.ch/reza.shokri

Hidden Markov Model
[Diagram: Alice's trace as a hidden Markov model. The hidden states are her actual regions (11, 6, 14, 18); the observations O_i are the obfuscated sets {11,12,13}, {6,7,8}, {14,15,16}, {18,19,20}. P_Alice(11) is an initial-state probability, P_Alice(11→6) and P_Alice(6→14) are transition probabilities from her mobility profile, and P_LPPM(6→{6,7,8}) is the obfuscation (emission) probability. A sketch combining this HMM with the expected-error metric follows.]
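To tie the backup slide to the metric, here is a minimal sketch (ours, not the Location-Privacy Meter's code) that runs the forward-backward algorithm on such an HMM to obtain the adversary's posterior over a user's actual regions, then computes location privacy as the expected estimation error; the mobility profile, obfuscation probabilities, and distance function are toy assumptions:

```python
import numpy as np

def localization_posterior(pi, P, emis):
    """Adversary's posterior Pr(actual region at t | observed trace), per time t,
    via the forward-backward algorithm on the HMM of the backup slide.

    pi:   initial distribution over the R regions
    P:    R x R transition matrix (the user's mobility profile)
    emis: T x R matrix; emis[t, r] = Pr(observed event at t | actual region r),
          given by the LPPM's obfuscation PDF.
    """
    T, R = emis.shape
    alpha = np.zeros((T, R))
    beta = np.ones((T, R))
    alpha[0] = pi * emis[0]
    for t in range(1, T):                        # forward pass
        alpha[t] = emis[t] * (alpha[t - 1] @ P)
    for t in range(T - 2, -1, -1):               # backward pass
        beta[t] = P @ (emis[t + 1] * beta[t + 1])
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

def location_privacy(posterior_t, true_region, dist):
    """Expected estimation error at one time instant: sum_x Pr(x) * d(x, x_c)."""
    return sum(p * dist(r, true_region) for r, p in enumerate(posterior_t))

# Toy run: R = 4 regions on a line (distance = index difference), T = 3 events,
# a deliberately uninformative mobility profile, and made-up obfuscation sets.
R = 4
pi = np.full(R, 1 / R)
P = np.full((R, R), 1 / R)
emis = np.array([[0.0, 0.5, 0.5, 0.0],   # time 0: obfuscated to the set {1, 2}
                 [0.5, 0.5, 0.0, 0.0],   # time 1: obfuscated to the set {0, 1}
                 [0.0, 0.0, 0.5, 0.5]])  # time 2: obfuscated to the set {2, 3}
post = localization_posterior(pi, P, emis)
print(location_privacy(post[1], true_region=1, dist=lambda a, b: abs(a - b)))
```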