Steganographic File Systems
Claudia Diaz
ESAT/COSIC (K.U.Leuven)

Talk Outline
– Motivation
– Related Work
– Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
– Countering Traffic Analysis
– Conclusions
Slides credit: the slides on traffic analysis attacks were created by Carmela Troncoso.

Motivation
Problem: we want to keep stored information secure (confidential).
– Encryption protects against the unwanted disclosure of information, but it reveals the fact that hidden information exists!
– The user can be threatened, tortured, or coerced into disclosing the decryption keys:
  – Soldiers, intelligence agents
  – Criminals forcing victims to hand over bank access keys
  – Journalists and human rights activists in countries where freedom of information is not guaranteed

Motivation
We need to hide the existence of files.
Solution: plausible deniability
– Allow users to deny believably that any further encrypted data is located on the storage device
– If the password is not known, no information about the existence of files is revealed
Example
– A soldier reveals manuals
– But keeps secret the information on targets, plans, etc.

Talk Outline
– Motivation
– Related Work
– Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
– Countering Traffic Analysis
– Conclusions

The Steganographic File System (I)
Anderson, Needham and Shamir (1998)
First SFS, two approaches:
1. Use cover files such that a linear combination (XOR) of them reveals the information
   – Password: the subset of files to combine
   – Drawbacks:
     – Needs a lot of cover files to be secure
     – Writing/reading operations have a high cost

The Steganographic File System (II)
2. Real files hidden in encrypted form in pseudo-random locations amongst random data
   – Location derived from the name of the file and a password.
   – Collisions (birthday paradox) overwrite data
     – Use only a small part of the storage capacity
     – Replication (needs a mechanism to detect overwriting)

StegFS: A Steganographic File System for Linux
McDonald and Kuhn (1999)
– 15 default security levels (initialized with random keys), so that it is not possible to know whether the access keys to all levels in use have been revealed
– The user can show lower levels but hide the existence of high-security ones
– Block allocation table (BAT) with the file system content (instead of locations derived from file name/password); opening one level opens its entries in the BAT and those of all lower levels
– Pure replication to protect against loss of information
– Free implementation (v1.1.4 at http://www.mcdonald.org.uk/StegFS/)

StegFS: A Steganographic file system (I)
Pang, Tan and Zhou (2003)
– Bitmap for blocks: free (0) or allocated (1)
– Allows multiple users
– Trusted (tamper-resistant) user agent
Types of blocks:
– File blocks (1): contain encrypted user data
– Dummy blocks (0): free. Contain random data
– Abandoned blocks (1): unused.
  Hide the number of file blocks

StegFS: A Steganographic file system (II)
– FAK (file access key): encrypts each file individually
  – Makes it easy to share files
– UAK (user access key): encrypts a hidden file that contains a directory of all the (file name, FAK) pairs for that access level
  – Easy access for one user
– UAK -> FAK + file name -> file header with the locations of the blocks
– Implementation (http://xena1.ddns.comp.nus.edu.sg/SecureDBMS)

Mnemosyne: Peer-to-Peer Steganographic Storage
Hand and Roscoe (2002)
– Distributed steganographic file system
– Block oriented
– Location based on file name + location key
– Two operations:
  – putblock(blockID, data)
  – getblock(blockID)
– Uses IDA (Information Dispersal Algorithm) for replication (n out of m)
  – Enhances resilience
  – Makes traffic analysis more difficult

Mojitos: A Distributed Steganographic File System
Giefer and Letchner (2002)
– Combines StegFS (levels, BAT) and Mnemosyne (distributed, block-level storage)
– Client-server architecture
  – Client knows file name + access key
  – Servers hold the BAT and are trusted with user keys (vulnerable to server hacks)
  – Client requests the inode (after authenticating) and then operates directly on the file blocks
– Uses cover traffic to hide access patterns (no details given)

Continuously Observable Steganographic File Systems
– Previous schemes resist one or two snapshots
– What if the attacker monitors the raw storage?
  – Remote or shared store
  – Multiple snapshots (arbitrarily close in time) prior to coercion
– Assumption: the adversary can continuously record the contents of the storage and monitor all accesses

Hiding Data Accesses in Steganographic File System
– Only one proposal considers this attacker (Zhou, Pang & Tan, 2004)
  – Based on StegFS [PTZ03]
– Types of blocks:
  – File blocks: contain encrypted user data
  – Dummy blocks: free.
    Contain random data
– Update operations:
  – Data update: change the content of a block
  – Dummy update: change the IV of the encryption (block cipher in CBC mode)
  – Relocation of blocks
  – Goal: data updates and dummy updates cannot be distinguished
– Read operations:
  – Use oblivious storage to combine dummy and real reads

Talk Outline
– Motivation
– Related Work
– Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
– Countering Traffic Analysis
– Conclusions

Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
Troncoso, Diaz, Dunkelman and Preneel (2007)
– Attack on the update algorithm of StegFS [ZPT04]
– Exploits patterns in location accesses to:
  – Distinguish between user-active and idle periods
  – Find files in the system (prior to coercion)

StegFS: Update Algorithm
[Flowchart]
– Update request on B1?
  – No: perform a dummy update
  – Yes: pick B2 at random
    – B2 = B1? Yes: update B1 in place
    – B2 = B1? No: is B2 a dummy block?
      – Yes: rewrite B1 with random data and substitute B2 for the updated B1
      – No: dummy update on B2, then pick a new B2 at random

StegFS: proof of security
"For a data update, each block in the storage space has the same probability of being selected to hold the new data. Hence the data updates produce random block I/Os, and follow exactly the same pattern as the dummy updates. Therefore, whether there is any data update or not, the updates on the raw storage follow the same probability distribution as that of dummy updates."
All locations have the same probability of being selected, BUT:
– Location accesses produced by file updates follow different patterns than dummy updates.
Traffic analysis attacks on accessed locations!

Attacking multi-block files: Update one block
Pattern (updating B1):
1. As many dummy updates on data blocks as times a data block is chosen as B2
2. Overwrite file block B1 with random data
3. Overwrite dummy block B2 with the updated data

Example: update block B1 = 3 (F = file block, D = dummy block)

Block selected   Access list   Operation performed
290 (F)          290           Dummy update on data block B2 = 290
47 (F)           47            Dummy update on data block B2 = 47
127 (D)          3             Overwrite file block B1 = 3 with random data
-                127           Write B2 = 127 with the updated content of B1

Attacking multi-block files: Update a file
[Figure: access list for three successive updates of a three-block file F1, whose blocks move from locations 1, 2, 3 to 34, 345, 127 and then to 90, 333, 76]

Attacking multi-block files: Algorithm
[Figure: three steps of the search over the access list, comparing a fixed group GF(A) against a moving group GM(A)]

Attacking multi-block files: False positives
The attacker thinks he has found a file, but the patterns have been randomly produced.
– B = size of storage
– b = size of file searched
– T = total accesses
[Plot; axis label: number of file accesses]

Attacking one-block files
Assumption: file blocks are updated more frequently than dummy blocks.
– Distance between two dummy updates on the same location: $P_D(i) = \left(1 - \frac{1}{B}\right)^{i-1} \frac{1}{B}$
– Distance between two updates of a file: P_F(i), an expression in f, i and B that is not recoverable from the extracted slide
– Need more than one update (hops)
Example access list:
123 175 213 1 479 356 290 47 479 231 146 216 367 100 23 231 437 201
– Near repetition: 479 (positions 5 and 9)
– Near repetition: 231 (positions 10 and 16)

Attacking one-block files: Algorithm (h=5)
[Figure: access list in which a chain of h = 5 near repetitions is followed from location to location]
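The hop-following idea behind the one-block attack can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: it assumes that a "near repetition" means the same location reappearing within `max_gap` accesses (the role played by D_C), and that the location accessed immediately after the repetition is the block's new home, as in the "Update one block" pattern. The function name `follow_hops` and its parameters are hypothetical.

```python
def follow_hops(accesses, start, max_gap, hops):
    """Follow a chain of near repetitions starting at accesses[start].

    Returns the list of locations visited by the suspected one-block file,
    or None if any hop has no repetition within `max_gap` accesses.
    Assumption (not from the slides): the access right after a repetition
    is taken to be the block's new location.
    """
    chain = [accesses[start]]
    pos = start
    for _ in range(hops):
        location = accesses[pos]
        # Look for the same location again in the next `max_gap` accesses.
        window = accesses[pos + 1 : pos + 1 + max_gap]
        if location not in window:
            return None
        repeat = pos + 1 + window.index(location)
        if repeat + 1 >= len(accesses):
            return None  # the list ends before the new location is written
        pos = repeat + 1  # the block's suspected new location
        chain.append(accesses[pos])
    return chain


# Access list taken from the "Need more than one update (hops)" example.
accesses = [123, 175, 213, 1, 479, 356, 290, 47, 479,
            231, 146, 216, 367, 100, 23, 231, 437, 201]
chain = follow_hops(accesses, start=4, max_gap=10, hops=2)
print(chain)  # [479, 231, 437]
```

A smaller `max_gap` reduces false positives from dummy updates that happen to repeat, at the cost of missing file updates that occur further apart, which is the trade-off the false-positive and false-negative slides address.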
Attacking one-block files: False positives
– Dummy block updates appear near (closer than D_C, with D_C such that P_F(D_C) < C) in the h hops considered
– [Equation: false-positive probability over h hops; not recoverable from the extracted slide]

Attacking one-block files: False negatives
– A file update happens far (further than D_C) in one of the h hops
– [Equation: false-negative probability over h hops; not recoverable from the extracted slide]

Simulation results: Multi-block files

Size of files (blocks)   Number of files per size   File update frequency   Size of storage space
2-3                      10                         3%                      10,000
4-10                     10                         3%                      10,000

Size of files   Files found   Wrong size   False positives
2-3             >99%          <2%          <2%
4-10            >99%          <1%          0

Simulation results: One-block files
– Parameters: number of files; hops; accesses to each file; threshold C; size of storage space
– Values, in the order extracted from the slide: 10; 12; 10; 10^-8; 100,000
– Rate of success is a function of f
– False positives also depend on f

Attack Conclusions
Security claims unsubstantiated:
1. The algorithms do not produce the same pattern for dummy and user updates
2. The distribution of updated locations differs depending on whether there is user activity or not
– Blocks are rarely relocated, and when they are, their new location is known
– Multi-block file updates generate correlations between accessed locations
– Very easy to find multi-block files, and easy to find one-block files
– A bit of randomness is not enough

Talk Outline
– Motivation
– Related Work
– Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
– Countering Traffic Analysis
– Conclusions

Requirements
– Different levels of security
– Forward and backward security after coercion
– Data persistence
– Counter traffic analysis

Attack model
– Continuously monitors the contents of, and accesses to, the storage
– Records all past states of the storage
– Performs real-time traffic analysis on the accessed block locations
– Ability to coerce the user at any point:
  – User produces some low-level keys
  – Attacker inspects the status of the user's computer
– Attacker should not learn about higher levels

System architecture

Table
– A password per level
  decrypts all the level entries
– Block key for forward and backward security
– H(·) to detect active attacks
– Metadata to manage the file system

Data persistence
– High-security file blocks appear as empty (while in low-security mode), so data may be lost
– Erasure codes:
  – Convert n blocks into m such that any n of them suffice to recover the information
  – Regeneration of higher levels
  – Make traffic analysis of file read operations more difficult

Traffic analysis resistance: Pool mix
– Technique for anonymizing email traffic, used here to obscure relocation
– Cycle:
  – Collect a block from the raw storage
  – Change its appearance
  – Store it in a pool (which contains P blocks from previous cycles)
  – Randomly flush a block out
– More randomness in relocations

Traffic analysis resistance: Dummy traffic
– Provides unobservability (idle / non-idle periods)
– Automatically generated accesses to the storage
– The pattern of dummy accesses must be statistically indistinguishable from user requests
– The system chooses blocks at random to be read and put in the pool
  – Works if files are small
  – More sophisticated dummy selection strategies are possible

Access cycle

Additional work (not presented)
– Definition of metrics for unobservability and plausible deniability
– Probabilistic function qψ(t)(t) to detect correlations generated by repeated access to files
– Pattern recognition algorithm for gathering evidence prior to coercion
– Test for detection of file accesses after coercion
– Results for unobservability and deniability

Conclusions
– Different levels of security: Table
– Forward security after coercion: one-time block keys
– Data persistence: erasure codes, redundancy
– Counter traffic analysis:
  – Dummy traffic to conceal user activity
  – High-entropy relocation of data to hide new locations and access patterns
– Not trivial to achieve deniability for multi-block files

Thank you
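
Backup: pool mix cycle sketch
The four-step cycle from the "Traffic analysis resistance: Pool mix" slide could look roughly like the sketch below. It is an illustration under stated assumptions, not the system's actual design: blocks are modelled as (label, nonce) pairs, a fresh nonce stands in for re-encryption, and the flushed block is assumed to be written into the location that was just read. The name `pool_mix_cycle` and the sizes used are hypothetical.

```python
import random


def pool_mix_cycle(storage, pool, location, rng):
    """Run one pool-mix cycle on the block stored at `location`.

    Assumptions (not from the slides): a block is a (label, nonce) pair,
    a new nonce models "changing its appearance", and the flushed block
    fills the location that was just vacated.
    """
    label, _old_nonce = storage[location]         # 1. collect a block
    refreshed = (label, rng.getrandbits(64))      # 2. change its appearance
    pool.append(refreshed)                        # 3. store it in the pool
    flushed = pool.pop(rng.randrange(len(pool)))  # 4. randomly flush a block out
    storage[location] = flushed
    return flushed


rng = random.Random(0)
storage = [(f"block-{i}", rng.getrandbits(64)) for i in range(20)]
pool = [(f"pooled-{i}", rng.getrandbits(64)) for i in range(4)]  # P = 4
labels_before = sorted(label for label, _ in storage + pool)

# Dummy traffic: the system picks locations at random and runs the cycle.
for _ in range(200):
    pool_mix_cycle(storage, pool, rng.randrange(len(storage)), rng)

labels_after = sorted(label for label, _ in storage + pool)
```

No block is lost or duplicated by the cycle and the pool keeps its size P, while an observer who sees only the storage cannot tell which pooled block was written back at each step.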