Vanish: Increasing Data Privacy with Self-Destructing Data
Written by Roxana Geambasu, Tadayoshi Kohno, Amit A. Levy, and Henry M. Levy
USENIX Security Symposium (USENIX), 2009
Presented by Xinghuang Leon Xu

Outline
• Part 1: Motivation & Introduction
• Part 2: Vanish Architecture and Implementation
• Part 3: Evaluation and Applications

Part 1: Motivation & Introduction

Introduction
• What is Vanish? Vanish is a system that gives users control over the lifespan of their data.
• Sensitive data can persist in the cloud even after the user's account is terminated. With a self-destructing data framework, users regain control over confidential data such as emails, Facebook messages, or any web content they create or post.

Introduction (2)
• Vanish protects the privacy of past, archived data, such as copies of emails kept by an email provider, against legal, malicious, and accidental attacks.
• All copies of the data, including the pristine copy, become unreadable after a specified period, without any action by the user and without any third party performing the deletion.

Example Scenario
• How can Alice be sure that sensitive data sent over an email system is secure?
• Services may retain data long after the user tries to delete it.

Motivating Problem: Data Lives Forever
• Archived data can be retrieved months or years later.
• Emails are frequently cached or archived by email providers, on local backup systems, by ISPs, and so on.
• The data therefore risks future exposure to unintended parties.
• Can we empower users with control of data lifetime?

Design Goals
• Data is available until expiration
• Data automatically becomes unreadable, even without any action by the user
• No secure hardware required of either user
• No centralized system (unlike Hushmail) that could be compromised by the government or hackers

Other Approaches
The most obvious approach is explicit deletion, for example by installing a cron job.
PGP encryption alone does not help if an adversary later obtains the keys.
Forward-secrecy encryption can be defeated by caching, backup archives, or court orders.

Other Approaches (2)
The Ephemerizer solution relies on centralized third-party services, which are untrustworthy.

Self-Destructing Data Approach
The file or document is destroyed after a specific timeout period, making all copies of the data unreadable, including the pristine copy.

Part 2: Vanish Architecture and Implementation

Vanish Data Object (VDO)
A VDO encapsulates the user's data and prevents its content from persisting at intermediate hops and becoming a target of retroactive attacks.
It becomes unreadable even if the storage site is disconnected from the network.
When encapsulating data in a VDO, the user knows the approximate lifetime to set for it.

Vanish Implementation
• Vanish leverages existing, decentralized, large-scale Distributed Hash Tables (DHTs).
• Encrypt the data with a key and store the key in a high-churn, globally distributed DHT system.
• Once the timeout is reached, the key is erased from the DHT and lost forever. The data cannot be read without the key.

DHT 101
A DHT is a peer-to-peer (P2P) storage network with multiple nodes.
A DHT exposes a put/get interface for storing and reading data.
It implements three operations: lookup, get, and store.
The data itself consists of an (index, value) pair.
Each node in the DHT manages part of an astronomically large index name space (e.g., 2^160 values for Vuze).
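
The lookup, store, and get operations and the timeout behaviour described above can be sketched with a toy in-memory table. This is a minimal illustration, not the Vuze or OpenDHT protocol: the class name `ToyDHT`, the use of SHA-1 to derive 160-bit indices and node IDs, the XOR-distance rule for choosing the responsible node, and the injectable clock are all assumptions made for the example. It models only the fixed timeout, not churn or replication.

```python
import hashlib
import time


class ToyDHT:
    """In-memory stand-in for a DHT (hypothetical; not a real DHT client)."""

    INDEX_BITS = 160  # size of the index space cited for Vuze

    def __init__(self, node_count, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        # Assumed node IDs: hash a counter into the 160-bit index space.
        # Each node keeps its own local {index: (value, expiry)} database.
        self.nodes = {
            self.hash_index(f"node-{i}".encode()): {} for i in range(node_count)
        }

    @staticmethod
    def hash_index(data):
        """Map bytes to a 160-bit integer index using SHA-1."""
        return int.from_bytes(hashlib.sha1(data).digest(), "big")

    def lookup(self, index):
        """Return the ID of the node responsible for index (closest by XOR)."""
        return min(self.nodes, key=lambda node_id: node_id ^ index)

    def store(self, index, value):
        """Save the (index, value) pair on the responsible node."""
        node_id = self.lookup(index)
        self.nodes[node_id][index] = (value, self.clock() + self.timeout)

    def get(self, index):
        """Return the value at index, or None if it is absent or expired."""
        node_id = self.lookup(index)
        entry = self.nodes[node_id].get(index)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.nodes[node_id][index]  # the node cleanses expired data
            return None
        return value


# Usage with a fake clock so that expiry can be shown without waiting.
now = [0.0]
dht = ToyDHT(node_count=8, timeout_seconds=8 * 3600, clock=lambda: now[0])
index = ToyDHT.hash_index(b"share-1")
dht.store(index, b"key share")
before_timeout = dht.get(index)  # still available before the timeout
now[0] += 8 * 3600               # advance the fake clock by 8 hours
after_timeout = dht.get(index)   # the entry has expired
```

In a real deployment the value would also disappear as nodes leave the network; the sketch keeps every node alive so that the timeout is the only cause of loss.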

DHT 101 (2)
• STEP 1 - LOOKUP NODE: A user performs a lookup to determine the nodes responsible for the index.
• STEP 2 - STORE DATA: A user issues a store to the responsible node, which saves that (index, value) pair in its local DHT database.
• STEP 3 - RETRIEVE DATA: A user looks up the nodes responsible for the index and then issues get requests to those nodes to retrieve the value at that index.
A DHT may replicate data on multiple nodes to increase availability.

DHT 101 (3): Important Properties
1. Availability: DHTs provide good availability of data prior to a specific timeout (e.g., Vuze has a fixed 8-hour timeout; OpenDHT has a timeout of up to one week).
2. Scale, geographic distribution, and decentralization: Measurement studies of the Vuze and Mainline DHTs estimate in excess of one million simultaneously active nodes in each of the two networks.
3. Churn: DHTs evolve naturally and dynamically over time as new nodes constantly join and old nodes leave. The average lifetime of a node in the DHT varies across networks and has been measured at anywhere from minutes to hours.

How Does Vanish Leverage DHTs?
Vanish takes data content D and encapsulates it into a VDO V.
It encrypts D with a random key K and produces ciphertext C.
It then splits the key into N shares K1, K2, ..., KN.
After computing the shares, it picks a random access key L and uses it as the seed of a pseudorandom generator to generate the indices I1, I2, ..., IN at which the shares are stored in the DHT.
The final VDO comprises (L, C, N, threshold).

Vuze DHT vs OpenDHT
Vuze DHT
• Open for any user to join
• More than a million nodes, geographically distributed
• High churn: users constantly join and leave the network
• Fixed 8-hour timeout
OpenDHT
• Restricted membership
• Variable timeout of up to 1 week

How Data Timeout Works
• The DHT nodes churn or internally cleanse themselves, thereby rendering the protected data unavailable over time.
• It would be difficult to determine retroactively which nodes were responsible for storing a given piece of data in the past.
• Key loss makes all copies of the data permanently unreadable.

Part 3: Evaluation & Application

Attacks and Defenses
• Setting
• Attack Strategy 1: Decapsulate the VDO prior to expiration
• Attack Strategy 2: Sniff the user's Internet connection
• Attack Strategy 3: Attack the DHT ("store" sniffing and "lookup" sniffing)

Setting
(User, email client, Internet, DHT nodes)

Attack Strategy 1: Decapsulate the VDO Prior to Expiration
Defense: encrypt the VDO with an additional encryption scheme such as PGP or GPG.

Attack Strategy 2: Sniff the User's Internet Connection
Defense: use Tor.

Attack DHT

Attack DHT: "Store" Sniffing
• Join the network and collect as many index and key pairs as possible
• Periodic push from neighbors

Cost to Attackers Using Store Sniffing
• Using a 3-hour churn model, N = 50, and a 90% threshold, compromising 25% of VDOs on Vuze is estimated to require:
• 87,000 nodes
• = $860K per year

Attack DHT: "Lookup" Sniffing
• Attackers do not know which keys are valid in the 160-bit keyspace
• They use the "lookup" requests that arrive at their nodes
• Defense: change the local Vuze node so that it obfuscates the key

Attack DHT: Sybil Attack
• To be continued...

Performance Evaluation
• Measurements use an Intel T2500 DUO with 2 GB of RAM, Java 1.6, and a broadband network.
• A single Vuze DHT store of 50 shares took 4 minutes; with several optimizations, the time could be lowered to 32 seconds for 50 shares.
• Measurements show that getting DHT shares is relatively fast compared with storing VDOs.

Vanish Application
• Firefox plug-in (included in the release of Vanish)
• Thunderbird plug-in (developed by the community two weeks after release)
• Self-destructing files
• Self-destructing trash bin

Vanish Application (2)

Conclusion
Disadvantages of Vanish:
The fixed timeout of the Vuze-based DHT poses challenges.
For much larger data sizes, encryption and decryption become complicated.
No defense is provided against certain attacks, such as denial of service, which could prevent the data from being read during its lifetime.

Questions?
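
Backup: the encapsulation step from Part 2 (a random key K split into N shares with a threshold, and share indices derived from an access key L) can be sketched as follows. This is a hedged illustration under stated assumptions, not the Vanish code: it uses Shamir secret sharing over a prime field as the threshold scheme, Python's `random.Random` seeded with L as the pseudorandom generator, and hypothetical names (`split_key`, `recover_key`, `share_indices`, `PRIME`). Encrypting D under K to produce C is omitted because it would need a cryptography library.

```python
import random
import secrets

# Assumed field modulus: the Mersenne prime 2^521 - 1, larger than any 256-bit key.
PRIME = 2**521 - 1


def split_key(key, n, threshold):
    """Split integer key into n Shamir shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold - 1 whose constant term is the key.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_key(shares):
    """Recover the key by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def share_indices(access_key, n):
    """Derive n 160-bit indices deterministically from the access key L."""
    rng = random.Random(access_key)  # illustration only, not a cryptographic PRNG
    return [rng.getrandbits(160) for _ in range(n)]


# Usage: N = 10 shares with a threshold of 7.
K = secrets.randbits(128)            # random data key K
L = secrets.token_bytes(16)          # random access key L
shares = split_key(K, n=10, threshold=7)
indices = share_indices(L, n=10)     # where share i would be stored in the DHT
recovered = recover_key(random.sample(shares, 7))
```

Anyone holding L can regenerate the same indices and fetch the shares; once fewer than the threshold number of shares survive in the DHT, K can no longer be reconstructed.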