Negotiated Privacy
CS551/851 Cryptography Applications Bistro
Mike McNett
30 March 2004

• Stanislaw Jarecki, Pat Lincoln, Vitaly Shmatikov. Negotiated Privacy.
• Dawn Xiaodong Song, David Wagner, Adrian Perrig. Practical Techniques for Searches on Encrypted Data.
• Brent R. Waters, Dirk Balfanz, Glenn Durfee, and D. K. Smetters. Building an Encrypted and Searchable Audit Log.

Negotiated Privacy Necessary?
• World Wide Web Consortium (W3C) Platform for Privacy Preferences (P3P) Project (http://www.w3.org/P3P/)
"The Platform for Privacy Preferences Project (P3P) … is emerging as an industry standard providing a simple, automated way for users to gain more control over the use of personal information on Web sites they visit. … P3P enhances user control by putting privacy policies where users can find them, in a form users can understand, and, most importantly, enables users to act on what they see."
• NOTE: 10 February 2004, W3C published the first public Working Draft of P3P 1.1

Why is it Really Necessary?
"The way to have good and safe government, is not to trust it all to one, but to divide it among the many... [It is] by placing under every one what his own eye may superintend, that all will be done for the best." – Thomas Jefferson to Joseph Cabell (Feb. 2, 1816)
It's necessary because Mr. Jefferson said so!

Outline
• Application Areas
• Options for Privacy Management
• What Negotiated Privacy Is
• What Negotiated Privacy Is Not
• Implementation Details
• Limitations
• Conclusion

Application Areas
• Health data (diseases, bio-warfare, epidemics, drug interactions, etc.)
• Banking (money laundering, tax avoidance, etc.)
• National security (terrorist tracking, money transfers, etc.)
• Digital media (copies, access rights, etc.)
• Note: Many applications require
– Security
– Guarantees of privacy

Options for Privacy Management
• Trust the collectors / analysts (the people / organizations accessing the data)? IRS, DMV, WalMart
• Trust the users whom the data is about? P3P
• A combination of the above?
– Negotiate what is reportable and what isn't

What Negotiated Privacy Is
• Personal data escrow: private data is escrowed by the subjects of monitoring themselves
• Thresholds are pre-negotiated with the interested parties
• Conditional release: once a threshold is met, the private data is "unlocked" (a minimal sketch of this loop follows below)
• Ensures both accuracy and privacy
• Only allows authorized queries (i.e., has a threshold been met?)
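The conditional-release model in the last slide can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the `Analyst` class, record layout, and return values are made up, escrows are treated as opaque blobs linkable only through their tag, and the threshold rule mirrors the "|tag| < k - 1" check that appears in the implementation details later.

```python
from collections import defaultdict

class Analyst:
    """Illustrative model of threshold-based conditional release.

    Escrows are opaque blobs the analyst can only count, grouped by their
    linkage tag; opening them requires the user's cooperation."""

    def __init__(self):
        self.escrows = defaultdict(list)          # linkage tag -> escrowed blobs

    def submit(self, tag, escrow_blob, k):
        """Store one escrow; issue a receipt below the threshold, otherwise
        ask the user to disclose every escrow carrying the same tag."""
        already_stored = len(self.escrows[tag])   # mirrors the |tag| < k - 1 rule
        self.escrows[tag].append(escrow_blob)
        if already_stored < k - 1:
            return ("receipt", tag)
        return ("request_disclosure", list(self.escrows[tag]))

# Example: the negotiated policy tolerates two escrowed copies (k = 3);
# the third submission with the same tag triggers a disclosure request.
analyst = Analyst()
print(analyst.submit("tag-of-core(t)", b"opaque-escrow-1", k=3))   # receipt
print(analyst.submit("tag-of-core(t)", b"opaque-escrow-2", k=3))   # receipt
print(analyst.submit("tag-of-core(t)", b"opaque-escrow-3", k=3))   # disclosure requested
```

The accuracy and privacy guarantees come from how the escrow blobs and tags are built cryptographically, which the implementation details below cover.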
What Negotiated Privacy Is Not
• Private Information Retrieval (PIR)
– Enforces privacy when data is retrieved
• Digital Cash
– Enforces privacy of multiple "digital coins"
– Can't verify that a user has "too many" coins
• Privacy Preserving Datamining
– Sanitizes or splits data
– Can't control the conditions for exposing information
• Searching on Encrypted Data
– Allows efficient (secure, but not private) searches
– Paper by Song, Wagner, Perrig, "Practical Techniques for Searches on Encrypted Data"

Song, Wagner, Perrig
• Several schemes – the last one supports:
– Provable Secrecy (the untrusted server cannot learn anything about the plaintext given only the ciphertext)
– Controlled Searching (the untrusted server cannot search for a word without the user's authorization)
– Hidden Queries (the user may ask the untrusted server to search for a secret word without revealing the word to the server)
– Query Isolation (the untrusted server learns nothing more than the search result about the plaintext)
• Note – Negotiated Privacy has "Provable Secrecy" and is only slightly related to "Controlled Searching"

Basic Idea (details later)
Example database: one record per copied song, per user (columns: User, Artist, Song). The parties are the User, the Service Provider, the Analyst with its escrow database, and a PKI / Magistrate.
1. User creates an escrow (e.g., to make one copy of a song) and submits it to the Analyst
2. Analyst validates the escrow
3. Analyst issues a receipt, or requests disclosure
4. User reports the activity t to the Service Provider
5. If P(t) holds, the User hands over the receipt
6. Service Provider validates and provides service, or denies service

Details
• Reference: http://www.math.clemson.edu/faculty/Gao/crypto_mod/node4.html

Details
• Required "tools" / data:
– Asymmetric key system ($x$ = private key; public key $y = g^x$ for a generator $g$)
– Activity $t$ (plaintext)
– Predicate $P(t)$
– $core(t)$ = the part of the data that determines the value of $P(t)$
– $s$ = fresh random element of $G_q$
– Personal data escrow $[t]_x = (tag, c, Enc_s\{t\}, k)$, where
• $tag = h^x$, with $h = hash(core(t))$ and $hash$ a deterministic hash into $G_q$
• $c = s^x$
• $Enc_s\{t\}$ = encryption of $t$ under a key derived from $s$
• $k$ = threshold value
– $Sig_{KM}(U, y)$ – protects against a malicious user
– $Sig_{KA}([t]_x)$ – protects against a malicious analyst / provider

Details (protocol, steps 1–7: registration, escrow, validation)
1. User registers the public key $y = g^x$ with the PKI / Magistrate
2. Magistrate verifies that U knows $x$ (e.g., Schnorr authentication)
3. Magistrate issues $Sig_{KM}(U, y)$
4. User generates the escrow $[t]_x$ (a toy sketch of this construction follows below):
– $tag = h^x$, where $h = hash(core(t))$
– $s$ = fresh random element of $G_q$
– hash $s$ into the keyspace and compute $Enc_s\{t\}$
– $c = s^x$
– $k$ = threshold value
5. User sends $[t]_x$ to the Analyst
6. Analyst validates the escrow:
– check escrow freshness
– if the number of escrows with this tag, $|tag|$, is less than $k - 1$, issue a receipt
– else the User must disclose the other records with the same tag
7. Analyst sends $Sig_{KA}([t]_x)$ (the receipt), or requests disclosure
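The escrow construction in step 4, and the opening used later during disclosure, can be sketched as follows. This is a toy under loud assumptions: a tiny Schnorr-style subgroup (p = 2q + 1 with demonstration-sized parameters, nowhere near a secure size), SHA-256 standing in for the deterministic hash into G_q and for key derivation, a keystream-XOR placeholder for Enc_s{t}, and the zero-knowledge proof that log_h(tag) = log_s(c) = log_g(y) omitted entirely.

```python
import hashlib, secrets

# Toy subgroup of order q inside Z_p^* (p = 2q + 1); a real deployment needs ~2048-bit p.
P, Q, G = 10007, 5003, 4          # G = 2^2 generates the order-q subgroup of squares

def hash_to_group(data):
    """Deterministic hash into G_q: hash, reduce mod p, then square into the subgroup."""
    u = int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1) + 1
    return pow(u, 2, P)

def xor_encrypt(key_elem, plaintext):
    """Placeholder for Enc_s{t}: derive a key by hashing s, then XOR (toy cipher only)."""
    stream = hashlib.sha256(key_elem.to_bytes(4, "big")).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(plaintext))

def make_escrow(x, core, t, k):
    """Personal data escrow [t]_x = (tag, c, Enc_s{t}, k) with tag = h^x, c = s^x."""
    h = hash_to_group(core)
    s = pow(G, secrets.randbelow(Q - 1) + 1, P)   # fresh random element of G_q
    return {"tag": pow(h, x, P), "c": pow(s, x, P),
            "enc_t": xor_encrypt(s, t), "k": k}

def open_escrow(x, escrow):
    """Disclosure step: s = c^(1/x mod q), then decrypt t (XOR is its own inverse)."""
    s = pow(escrow["c"], pow(x, -1, Q), P)        # modular inverse needs Python 3.8+
    return xor_encrypt(s, escrow["enc_t"])

x = secrets.randbelow(Q - 1) + 1                  # user's private key; y = G^x is public
e1 = make_escrow(x, b"user=dave,song=Toxic", b"copy #1", k=3)
e2 = make_escrow(x, b"user=dave,song=Toxic", b"copy #2", k=3)
print(e1["tag"] == e2["tag"])                     # True: same core(t) -> same linkage tag
print(open_escrow(x, e1))                         # b'copy #1' recovered at disclosure time
```

The demo shows the linkage property the Analyst relies on: two escrows with the same core(t) under the same private key x carry the same tag, while t itself stays hidden until the user opens the escrow.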
Details (protocol, steps 8–10: reporting an activity)
8. User reports activity $t$ to the Service Provider; if $P(t)$ holds, the User gives $s$, $Sig_{KA}([t]_x)$, $Sig_{KM}(U, y)$, and a proof that $tag = h^x$, $c = s^x$, and $y = g^x$
9. Service Provider validates:
– verify the signatures
– verify that the identity is U
– verify that $t$ matches the activity
– verify that the reported $k$ is correct for this activity
– compute $h = hash(core(t))$
– verify the proof information ($tag = h^x$, $c = s^x$, $y = g^x$)
10. Provide service, or deny service

More Details
(Example database: four escrowed records of user D. Evans copying "Toxic" by Britney Spears.)
• Disclosure:
– Triggered when $count(tag) \geq k - 1$
– Not automatic – A must request U to disclose
– Only escrows with the same relevant tag are disclosed
– A gives U all relevant escrows for U to open
– U opens each $[t]_x$ by computing:
• $s = c^{1/x}$
• $t = Dec_s\{Enc_s\{t\}\}$
• $h = hash(core(t))$
– For each $[t]_x$, U sends to A: $h$, $s$, $Sig_{KM}(U, y)$, and a proof that $tag = h^x$, $c = s^x$, and $y = g^x$
– A learns U and $t$
• Lemma 4:
– A learns the number of other reportable activities by U
– But the plaintext of U's other activities is not leaked to A

Limitations
• Social, legal, etc. questions
• Upfront threshold & query negotiations are required
• Query limitations – dynamic queries are difficult (impossible??)
• Can't do "group" thresholds (since all escrows must carry the same tag)
• No automatic disclosure of records (but one could go to the magistrate, if necessary)
• U may obtain an escrow but then decide not to get served
• Can't completely stop impersonations (use biometrics??)
• Doesn't stop threats due to collusion among entities

Conclusion
• A good initial move towards supporting reasonable negotiated privacy
• Provides unique functionality for niche applications
• Don't ask Dave for copies of his music

Searching on Encrypted Data
Presented by Leonid Bolotnyy
March 30, 2004 @UVA

Outline
• Practical Techniques for Searches on Encrypted Data
• Building an Encrypted and Searchable Audit Log

Practical Techniques for Searches on Encrypted Data

Goals
• Provable Security – the untrusted server learns nothing about the plaintext given only the ciphertext
• Controlled Searching – the untrusted server cannot perform a search without the user's authorization
• Hidden Queries – the untrusted server does not learn the query
• Query Isolation – the untrusted server does not learn more than the search results

Basic Scheme – Encryption
We want to encrypt the words $W_1, W_2, \ldots, W_l$, each $n$ bits long.
• $S_1, S_2, \ldots, S_l$ are pseudorandom values of $n - m$ bits each, generated using a stream cipher
• $T_i = \langle S_i, F_{k_i}(S_i) \rangle$, where $F$ is a pseudorandom function with a range of $m$ bits and $k_i$ is a secret key stored on a trusted server
• $C_i = W_i \oplus T_i$ (a runnable toy sketch of this basic scheme appears below, after the Issues slide)

Basic Scheme – Search and Decryption
• To search: send $W$ and $k_i$ to the untrusted server. For each entry the server computes $C_i \oplus W = T_i$ and checks whether $F_{k_i}(T_i[1 \ldots n-m]) = T_i[n-m+1 \ldots n]$. If they are equal, a match occurs and the document is sent to the requester. False positives are possible, but they can be reduced by increasing $m$.
• To decrypt: determine $S_i$, compute $F_{k_i}(S_i)$, and recover $W_i = C_i \oplus \langle S_i, F_{k_i}(S_i) \rangle$

Basic Scheme – Issues
Bad:
1. The problem with the basic scheme lies in revealing $k_i$, which gives the untrusted server the opportunity to search for any keyword, violating the controlled-searching criterion.
2. The untrusted server knows the search query.
Good:
3. The time to perform a search is linear in the number of keywords, requiring $O(n)$ stream cipher and block cipher operations.
4. At the positions where the untrusted server does not know $k_i$, it learns nothing about the keyword.
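Here is a runnable toy rendering of the basic scheme above, with stand-ins flagged: HMAC-SHA256 truncated to m bytes plays the pseudorandom function F, SHA-256 in counter mode plays the stream cipher producing the S_i, the parameters are chosen in bytes rather than bits, and a single key k_i is reused across positions for brevity.

```python
import hashlib, hmac, secrets

M = 8    # bytes of PRF output appended to each S_i (the paper's m, in bytes here)
N = 16   # total word length in bytes (the paper's n)

def F(key, data):
    """Pseudorandom function F with an M-byte range (HMAC-SHA256, truncated)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:M]

def S(seed, i):
    """Toy stream cipher: the i-th pseudorandom (N - M)-byte value S_i."""
    return hashlib.sha256(seed + i.to_bytes(4, "big")).digest()[:N - M]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_word(word, k_i, seed, i):
    """C_i = W_i XOR T_i, with T_i = <S_i, F_{k_i}(S_i)>."""
    s_i = S(seed, i)
    t_i = s_i + F(k_i, s_i)
    return xor(word, t_i)

def server_matches(c_i, word, k_i):
    """Untrusted server's test: XOR the candidate word out, then check the PRF tag."""
    t_i = xor(c_i, word)
    return hmac.compare_digest(F(k_i, t_i[:N - M]), t_i[N - M:])

# Tiny demo with hypothetical keys; words are padded to the fixed length N.
seed, k_i = secrets.token_bytes(16), secrets.token_bytes(16)
w = b"privacy".ljust(N, b"\x00")
c = encrypt_word(w, k_i, seed, i=0)
print(server_matches(c, w, k_i))                            # True: match found
print(server_matches(c, b"other".ljust(N, b"\x00"), k_i))   # False (with high probability)
```

Handing the server `word` and `k_i` is exactly what lets it run `server_matches`; the controlled-searching and hidden-queries refinements that follow change only how `k_i` and the searched word are derived.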
Controlled Searching
To perform controlled searching, we tie the key $k_i$ to the word $W_i$. To do that, we introduce a new pseudorandom function $f : \mathcal{K}_F \times \{0,1\}^* \to \mathcal{K}_F$ keyed with a secret key $k'$ chosen uniformly at random. Now $k_i = f_{k'}(W_i)$.
To search: the untrusted server is given $W$ and $f_{k'}(W)$.
• How do we decrypt now?
• The issue of hiding search queries is still unresolved.

Hidden Searches
To allow hidden searches, we encrypt the word $W_i$: $X_i = E_{k''}(W_i)$ and $C_i = X_i \oplus T_i$, where $T_i = \langle S_i, F_{k_i}(S_i) \rangle$.
To search: send $E_{k''}(W)$ and $k_i = f_{k'}(E_{k''}(W))$ to the untrusted server.
• The problem with decryption still remains.

Solving the Decryption Problem
To solve the decryption problem, we split $X_i = E_{k''}(W_i)$ into two parts $\langle L_i, R_i \rangle$, where $L_i$ has $n - m$ bits and $R_i$ has $m$ bits. Then we compute the key $k_i$ as a function of the first part only: $k_i = f_{k'}(L_i)$.
Making this change does not reduce the security of the scheme, but it allows easy decryption: we can find $S_i$, XOR it with the ciphertext to retrieve $L_i$, and compute $k_i = f_{k'}(L_i)$.
To search, send $X_i = E_{k''}(W_i) = \langle L_i, R_i \rangle$ and $k_i = f_{k'}(L_i)$.
To decrypt: determine $S_i$, compute $L_i = C_i[1 \ldots n-m] \oplus S_i$ and $k_i = f_{k'}(L_i)$; then $T_i = \langle S_i, F_{k_i}(S_i) \rangle$, $X_i = C_i \oplus T_i$, and $W_i = D_{k''}(X_i)$.

Scheme Conclusions
• "Efficient" encryption, decryption, and search, taking $O(n)$ block cipher and stream cipher operations
• Provable security with controlled searching, hidden queries, and query isolation
• Possible support for composed queries
• Possible support for variable-length words
– Padding with fixed-length blocks
– Variable-length words (store the length)

Building an Encrypted and Searchable Audit Log

Reasons to Encrypt Audit Logs
• The log may be stored at a not completely trusted (secure) site
• To prevent tampering with the log
• To restrict access to the log
– Allow access only to certain parts of the log
– Allow only certain entities to access the log

Characteristics of a Secure Audit Log
• Tamper Resistant – guarantee that only the authorized entity can create entries, and that once created, entries cannot be altered
• Verifiable – allow verification that all entries are present and have not been altered
• Searchable with data access control – allow the log to be "efficiently" searched, and only by authorized entities

Notation and Setup
• $t$ audit logs
• Audit Escrow Agent – creates $t$ secrets $S_1, S_2, \ldots, S_t$
• Investigator
• Audit records $R_0, R_1, \ldots, R_n$; each audit record $R_i$ contains:
– $E_{k_i}(m_i)$ – encryption of the string $m_i$ with a key $k_i$
– $H(R_{i-1})$ – hash of the previous record, to prevent tampering
– $c_{w_a}, c_{w_b}, \ldots$ – information about the keywords $w_a, w_b, \ldots$ to be used for searching
– $V_i$ – verification information ($H(R_i)$)

Symmetric Key Scheme
• $H$ – pseudorandom function keyed with $S$
• $S$ – secret key for this log, chosen by the escrow agent
• flag – constant bit string of length $l$
Want: encrypt $m$ with keywords $w_1, w_2, \ldots, w_n$.
1. The server chooses a random key $k$ and computes $E_k(m)$
2. The server chooses a random string $r$ of fixed length and computes $a_i = H_S(w_i)$, $b_i = H_{a_i}(r)$, $c_i = b_i \oplus \langle flag, k \rangle$ (a toy sketch of this construction and its search check follows below)
3. The server writes $E_k(m), r, c_1, c_2, \ldots, c_n$

Search and Decryption
• To search for all entries with keyword $w$: the escrow agent constructs a search capability for an investigator, $D_w = H_{S_1}(w), H_{S_2}(w), \ldots, H_{S_t}(w)$, where $H_{S_j}(w)$ is the search capability for log server $j$. For each entry in log $j$, the investigator computes $b_i = H_{H_{S_j}(w)}(r)$ and $value_i = b_i \oplus c_i$. If the first $l$ bits match the flag, the investigator extracts the key $k$ and decrypts $m$. What if we encounter a false positive?
• To decrypt: ???
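A toy sketch of the symmetric audit-log construction (steps 1–3 and the investigator's check), with assumptions flagged: HMAC-SHA256 stands in for the keyed pseudorandom function H, the flag length l and key sizes are arbitrary toy choices, E_k(m) itself is omitted, and a single log server is modelled, so the capability is just H_S(w) rather than the full D_w list.

```python
import hashlib, hmac, secrets

L = 8                    # flag length in bytes (the paper's l; toy choice)
FLAG = b"\x00" * L       # constant flag bit string (all zeros here, an assumption)

def H(key, data):
    """Keyed pseudorandom function H (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def make_entry(S, keywords, k):
    """Steps 1-3 for one entry: a_i = H_S(w_i), b_i = H_{a_i}(r), c_i = b_i XOR <flag, k>.
    (E_k(m) itself would be stored alongside and is omitted here.)"""
    r = secrets.token_bytes(16)
    payload = (FLAG + k).ljust(32, b"\x00")          # <flag, k>, padded to the PRF width
    cs = []
    for w in keywords:
        a = H(S, w)
        b = H(a, r)
        cs.append(bytes(x ^ y for x, y in zip(b, payload)))
    return r, cs

def search_entry(capability, r, cs):
    """Investigator holds the capability H_S(w); recover k if some c_i was made for w."""
    b = H(capability, r)
    for c in cs:
        value = bytes(x ^ y for x, y in zip(b, c))
        if value[:L] == FLAG:                        # flag matched (false positives possible)
            return value[L:L + 16]                   # the extracted entry key k
    return None

# Tiny demo: one log server with secret S, one entry keyed by k.
S_secret = secrets.token_bytes(16)   # secret chosen by the escrow agent for this log
k = secrets.token_bytes(16)          # per-entry key protecting E_k(m)
r, cs = make_entry(S_secret, [b"login", b"alice"], k)
print(search_entry(H(S_secret, b"alice"), r, cs) == k)   # True
print(search_entry(H(S_secret, b"bob"), r, cs))          # None (with high probability)
```

Note that anyone who learns S can build capabilities for arbitrary keywords, which is exactly the weakness the following slides address.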
Issues and Problems
• Flag size and the possibility of false positives
• Capabilities for different keywords appear random
• An adversary may be able to learn $S$, which is known to the server
• Updating keys requires a constant connection to the escrow agent + numerous key-management problems + high search time
• STORE AS LITTLE SECRET INFORMATION ON THE SERVER AS POSSIBLE

Identity Based Encryption
• Identity Based Encryption allows arbitrary strings to be used as public keys
• A master secret key stored with a trusted escrow agent allows generation of a private key after the public key has been selected

IBE Setup and Key Generation
• Setup:
– $G_1, G_2$: two groups of large prime orders $p$ and $q$
– $P_0$: an arbitrary generator of $G_1$
– $e : G_1 \times G_1 \to G_2$: an "admissible" bilinear map
– $H_1 : \{0,1\}^* \to G_1$ and $H_2 : G_2 \to \{0,1\}^n$: two cryptographic hash functions
– $s \in \mathbb{Z}_q$: master secret
– System parameters: $P = (p, q, G_1, G_2, e, P_0, P_1)$, where $P_1 = sP_0$
• Key generation: $d_w = sH_1(w)$ is the private key corresponding to the public key $w$

IBE Encryption and Decryption
• Encryption: to encrypt $m \in \{0,1\}^n$ with public key $w$, compute
1) $Q_w = H_1(w) \in G_1$
2) $g_w = e(Q_w, P_1)$
3) $c = \langle rP_0,\ m \oplus H_2(g_w^r) \rangle$ for a random $r \in \mathbb{Z}_q$
• Decryption: to decrypt $c = \langle U, V \rangle$ using $d_w$ as the private key, compute $m = V \oplus H_2(e(d_w, U))$
Note: with $c = \langle rP_0,\ m \oplus H_2(g_w^r) \rangle$,
$(m \oplus H_2(g_w^r)) \oplus H_2(e(d_w, rP_0)) = (m \oplus H_2(g_w^r)) \oplus H_2(e(sH_1(w), rP_0)) = (m \oplus H_2(g_w^r)) \oplus H_2(e(H_1(w), sP_0)^r) = m$

Asymmetric Scheme Using IBE
• To encrypt: Want: encrypt $m$ with keywords $w_1, w_2, \ldots, w_n$.
1. The server chooses a random key $k$ and computes $E_k(m)$
2. For each $w_i$, the server computes $c_i$, the IBE encryption of $\langle flag, k \rangle$ with $w_i$ as the public key
3. The server writes $E_k(m), c_1, c_2, \ldots, c_n$
• To search: the investigator gets $d_w$ from the escrow agent to search for $w$. The investigator decrypts each $c_i$ and checks whether the first $l$ bits are the flag. If they are, he extracts $k$ and decrypts $m$.
• To decrypt: ???

Comments on the IBE Scheme
• Note: an investigator holding $d_w$ cannot derive $d_{w'}$
• Each server stores only public parameters
• Compromising the server does not allow an attacker to search the data
• It is possible to separate search and decryption by encrypting the key under some other public key (this requires an extra access to the escrow agent for decryption)
• A drawback: a tremendous increase in computation time

Scheme Optimizations
• Pairing Reuse: by caching $g_w$ for every $w$, we do not need to compute the pairing twice, producing a considerable speedup for future searches for $w$
• Indexing: we can group log entries into blocks, reducing the number of keywords we need to encrypt. For each keyword $w$, we encrypt the entry numbers under the corresponding key, reducing computation time (a structural sketch follows below)
• Randomness Reuse: we can also use one random $r$ for each entry, which substantially reduces decryption time because we then need to calculate only one pairing
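The indexing optimization can be sketched independently of the pairing machinery. In the sketch below, `ibe_encrypt` is only a placeholder that records the structure (a real implementation would perform Boneh-Franklin-style IBE encryption under the keyword as the public key, as on the slides above); the block size and index layout are likewise assumptions.

```python
from collections import defaultdict

def ibe_encrypt(keyword, payload):
    # Placeholder only: stands in for IBE encryption of `payload` under the keyword
    # used as the public key. The returned dict just marks the structure; it is
    # NOT a real ciphertext.
    return {"ibe_public_key": keyword, "ciphertext": f"<IBE({len(payload)} bytes)>"}

def build_block_index(entries, block_size=100):
    """Indexing optimization (structural sketch): group log entries into blocks and,
    per block, encrypt each keyword's list of entry numbers once, instead of
    attaching one IBE ciphertext per keyword to every single entry."""
    blocks = []
    for start in range(0, len(entries), block_size):
        block = entries[start:start + block_size]
        positions = defaultdict(list)             # keyword -> entry numbers in this block
        for offset, (_message, keywords) in enumerate(block):
            for w in keywords:
                positions[w].append(start + offset)
        index = {w: ibe_encrypt(w, repr(nums).encode())   # one expensive encryption per keyword
                 for w, nums in positions.items()}
        blocks.append({"entries": [m for m, _ in block], "index": index})
    return blocks

# Tiny demo: three entries, one block, four distinct keywords.
log = [(b"entry-0", ["login", "alice"]),
       (b"entry-1", ["logout", "alice"]),
       (b"entry-2", ["login", "bob"])]
for w, c in build_block_index(log)[0]["index"].items():
    print(w, c)
```

With three entries and four distinct keywords the block needs four expensive keyword encryptions instead of six, and the saving grows with the block size.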