Traffic Analysis on a Continuously-Observable Steganographic File

Steganographic File Systems
Claudia Diaz
ESAT/COSIC
(K.U.Leuven)
Talk Outline
• Motivation
• Related Work
• Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
• Countering Traffic Analysis
• Conclusions
Slides credit: the slides on traffic analysis attacks were created by Carmela Troncoso
Steganographic File Systems
Motivation

• Problem: we want to keep stored information secure (confidential)
• Encryption protects against the unwanted disclosure of information
  – but it reveals the fact that hidden information exists!
• The user can be threatened, tortured or coerced into disclosing the decryption keys
  – Soldiers, intelligence agents
  – Criminals forcing victims to hand over bank access keys
  – Journalists and human rights activists in countries where freedom of information is not guaranteed
Steganographic File Systems
Motivation



• We need to hide the existence of files
• Solution: plausible deniability
  – Allow users to deny believably that any further encrypted data is located on the storage device
  – If the password is not known, nothing is revealed about the existence of files
• Example
  – A soldier reveals manuals
  – but keeps information on targets, plans, etc. secret
Talk Outline





• Motivation
• Related Work
• Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
• Countering Traffic Analysis
• Conclusions
The Steganographic File System (I)

Anderson, Needham and Shamir (1998)
First SFS, two approaches:
1. Use cover files such that a linear combination (XOR) of them reveals the information
  – Password: subset of files to combine
  – Drawbacks:
    – Needs a lot of cover files to be secure
    – Writing/reading operations have a high cost
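The cover-file idea can be illustrated with a minimal sketch. It hides a single file only; the function names, the 16-byte block size and the choice of which cover to adjust are assumptions made for illustration, not details taken from the original scheme.

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def extract(covers, subset):
    """XOR the covers selected by the password-derived subset."""
    return reduce(xor_bytes, (covers[i] for i in subset))

def embed(covers, subset, secret):
    """Return new covers in which the XOR of `subset` equals `secret`.

    Only the last cover of the subset is modified (an illustrative choice).
    """
    covers = list(covers)
    others = [covers[i] for i in subset[:-1]]
    covers[subset[-1]] = reduce(xor_bytes, others, secret)
    return covers

SIZE = 16                                   # hypothetical block size in bytes
covers = [secrets.token_bytes(SIZE) for _ in range(8)]
subset = [1, 4, 6]                          # stands in for the password
secret = b"attack at dawn!!"                # exactly SIZE bytes
covers = embed(covers, subset, secret)
recovered = extract(covers, subset)
```

Every read touches all covers in the subset and every write changes a cover, which is one way to see the cost mentioned under the drawbacks.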
The Steganographic File System (II)
2. Real files hidden in encrypted form in pseudo-random locations amongst random data
  – Location derived from the name of the file and a password
  – Collisions (birthday paradox) overwrite data
    – Use only a small part of the storage capacity
    – Replication (needs a mechanism to detect overwriting)
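A rough sketch of the second approach follows. The slides do not say how a location is derived, so SHA-256 over password, file name and block index is an assumption used as a stand-in, and both function names are hypothetical. The second function gives the standard birthday-paradox estimate for independent uniform placements.

```python
import hashlib

def block_location(filename: str, password: str, index: int, num_blocks: int) -> int:
    """Pseudo-random location for block `index` of a file (illustrative derivation)."""
    material = f"{password}\0{filename}\0{index}".encode()
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest[:8], "big") % num_blocks

def collision_probability(num_written: int, num_blocks: int) -> float:
    """Probability that at least two of `num_written` uniform placements collide."""
    p_no_collision = 1.0
    for k in range(num_written):
        p_no_collision *= (num_blocks - k) / num_blocks
    return 1.0 - p_no_collision

loc = block_location("plans.txt", "correct horse", 0, 10_000)
p = collision_probability(120, 10_000)      # chance of a collision after 120 writes
```

The collision probability rises quickly with the number of blocks written, which is why only a small part of the capacity can be used unless blocks are replicated.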
StegFS: A Steganographic File System for Linux

McDonald and Kuhn (1999)
• 15 default security levels (initialized with random keys), so that it is not possible to know whether the access keys to all levels in use have been revealed
• The user can show lower levels but hide the existence of high-security ones
• Block allocation table (BAT) with the file system content (instead of locations derived from file name/password); opening one level opens its entries in the BAT and those of all lower levels
• Pure replication to protect against loss of information
• Free implementation (v1.1.4 at http://www.mcdonald.org.uk/StegFS/)
StegFS: A Steganographic File System (I)

Zhou, Pang and Tan (2003)
– Bitmap for blocks: free (0) or allocated (1)
– Allows multiple users
– Trusted (tamper-resistant) user agent

Types of blocks:
– File blocks (1): contain encrypted user data
– Dummy blocks (0): free; contain random data
– Abandoned blocks (1): unused; hide the number of file blocks
StegFS: A Steganographic File System (II)

• FAK (file access key): encrypts each file individually
  – Files are easy to share
• UAK (user access key): encrypts a hidden file that contains a directory of all the (file name, FAK) pairs for that access level
  – Easy access for one user
• UAK -> FAK + file name -> file header with the locations of the blocks
• Implementation (http://xena1.ddns.comp.nus.edu.sg/SecureDBMS)
Mnemosyne: Peer-to-Peer Steganographic Storage

Hand and Roscoe (2002)
– Distributed steganographic file system
– Block oriented
– Location based on file name + location key

Two operations:
– putblock(blockID, data)
– getblock(blockID)

Use IDA (Information Dispersal Algorithm) for replication (n out of m)
– Enhances resilience
– Makes traffic analysis more difficult
Mojitos: A Distributed Steganographic File System

Giefer and Letchner (2002)
– Combines StegFS (levels, BAT) and Mnemosyne (distributed, block-level storage)
– Client-server architecture
• Client knows file name + access key
• Servers hold the BAT and are trusted with user keys (vulnerable to server hacks)
• Client requests the inode (after prior authentication) and then operates directly on the file blocks
• Uses cover traffic to hide access patterns (no details given)
Continuously Observable Steganographic File Systems

• Previous schemes resist one or two snapshots
• What if the attacker monitors the raw storage?
  – Remote or shared store
  – Multiple snapshots (arbitrarily close in time) prior to coercion
• Assumption: the adversary can continuously record the contents of the storage and monitor all accesses
Hiding Data Accesses in Steganographic File System

• Only one proposal considers this attacker (Zhou, Pang and Tan, 2004)
  – Based on StegFS [PTZ03]
• Types of blocks:
  – File blocks: contain encrypted user data
  – Dummy blocks: free; contain random data
• Update operations:
  – Data update: change the content of a block
  – Dummy update: change the IV of the encryption (block cipher in CBC mode)
  – Relocation of blocks
  – Goal: data updates and dummy updates must be indistinguishable
• Read operations:
  – Use oblivious storage to combine dummy and real reads
Talk Outline





• Motivation
• Related Work
• Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
• Countering Traffic Analysis
• Conclusions
Traffic Analysis Attacks on Continuously Observable Steganographic File Systems

• Troncoso, Diaz, Dunkelman and Preneel (2007)
• Attack on the update algorithm of StegFS [ZPT04]
• Exploit patterns in location accesses:
  – Distinguish between user active and idle periods
  – Find files in the system (prior to coercion)
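StegFS: Update Algorithm

[Flowchart, read as steps]
1. Is there a request to update block B1?
   – No: perform a dummy update
   – Yes: continue with step 2
2. Pick a block B2 at random
3. If B2 = B1: perform the update on B1 in place
4. Otherwise, if B2 is a dummy block: rewrite B1 with random data and substitute B2 for the updated B1
5. Otherwise (B2 is a file block): perform a dummy update on B2 and return to step 2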
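The flowchart can be sketched as a small simulation that returns the block locations an observer of the raw storage would record. The function name, the 'F'/'D' encoding and the assumption that an idle dummy update hits a uniformly random block are illustrative choices, not details from [ZPT04].

```python
import random

def update_cycle(kind, request, pick):
    """One run of the update flowchart.

    kind    -- list mapping block location to 'F' (file) or 'D' (dummy)
    request -- location B1 of the block to update, or None if the user is idle
    pick    -- zero-argument callable returning a random block location
    Returns the list of locations accessed, in order.
    """
    if request is None:
        # No pending request: one dummy update (assumed to hit a random block).
        return [pick()]
    b1, accesses = request, []
    while True:
        b2 = pick()
        if b2 == b1:
            accesses.append(b1)          # update B1 in place
            return accesses
        if kind[b2] == 'D':
            accesses.append(b1)          # rewrite B1 with random data
            accesses.append(b2)          # B2 now holds the updated content
            kind[b1], kind[b2] = 'D', 'F'
            return accesses
        accesses.append(b2)              # B2 is a file block: dummy update, pick again

# Replay with scripted draws: B1 = 3; 290 and 47 are file blocks, 127 is a dummy block.
kind = ['D'] * 500
for loc in (3, 47, 290):
    kind[loc] = 'F'
draws = iter([290, 47, 127])
trace = update_cycle(kind, 3, lambda: next(draws))
print(trace)  # [290, 47, 3, 127]

# With a real random source the same function is driven like this:
rng = random.Random(1)
idle_trace = update_cycle(kind, None, lambda: rng.randrange(len(kind)))
```

A user update always ends with an access to the old location B1 followed by a write to a dummy block, while an idle cycle touches a single random block; this structural difference is what the attack exploits.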
StegFS: proof of security
“For a data update, each block in the storage space has the same probability of being selected to hold the new data. Hence the data updates produce random block I/Os, and follow exactly the same pattern as the dummy updates. Therefore, whether there is any data update or not, the updates on the raw storage follow the same probability distribution as that of dummy updates.”

• All locations have the same probability of being selected
• BUT:
  – Location accesses produced by file updates follow different patterns than dummy updates
• Traffic analysis attacks on accessed locations!
Attacking multi-block files: Update one block

Pattern (updating B1):
1. One dummy update for every file block that is drawn as B2
2. Overwrite file block B1 with random data
3. Overwrite dummy block B2 with the updated data

Example: update block B1 = 3 (F = file block, D = dummy block)

Block selected | Access list | Operation performed
290 (F)        | 290         | Dummy update on data block B2=290
47 (F)         | 47          | Dummy update on data block B2=47
127 (D)        | 3           | Overwrite file block B1=3 with random data
-              | 127         | Write B2=127 with updated content of B1
Attacking multi-block files:
Update a file
[Figure: trace of accessed block locations over successive updates of file F1. Its blocks move from locations 1, 2, 3 to 34, 345, 127 and then to 90, 333, 76; the location sets 34, 345, 127 and 90, 333, 76 each appear twice in the trace.]
Attacking multi-block files:
Algorithm
[Figure: access trace partitioned into a fixed group GF(A) and successive moving groups GM(A), shown in three animation steps.]
Attacking multi-block files:
False positives

• The attacker thinks he has found a file, but the patterns have been produced at random
• Quantities involved:
  – B = size of storage
  – b = size of file searched
  – T = total accesses
  – Number of file accesses
Attacking one-block files:

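• Assumption: file blocks are updated more frequently than dummy blocks
• Probability that the next access to a block comes exactly i accesses later, for a dummy block (D) and for a file block (F) updated with frequency f:

  P_D(i) = \left(1 - \frac{1}{B}\right)^{i-1} \frac{1}{B}

  P_F(i) = (1 - f)^{i-1} f, \qquad f > \frac{1}{B}

• More than one update (hop) is needed

[Figure: access trace in which locations 479 and 231 each reappear after a short distance; both are marked "near repetition".]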
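A short numerical sketch shows why near repetitions point to file blocks. The values B = 10,000, f = 0.01 and the window of 100 accesses are assumptions picked for illustration; the function names are hypothetical.

```python
def p_dummy(i: int, B: int) -> float:
    """P_D(i): a dummy block is next accessed exactly i accesses later."""
    return (1 - 1 / B) ** (i - 1) * (1 / B)

def p_file(i: int, f: float) -> float:
    """P_F(i): a file block updated with frequency f is next accessed i accesses later."""
    return (1 - f) ** (i - 1) * f

def within(p, n: int) -> float:
    """Probability that the next access happens within n accesses."""
    return sum(p(i) for i in range(1, n + 1))

B = 10_000      # assumed storage size in blocks
f = 0.01        # assumed file update frequency, larger than 1/B
WINDOW = 100    # assumed notion of "near"

near_dummy = within(lambda i: p_dummy(i, B), WINDOW)
near_file = within(lambda i: p_file(i, f), WINDOW)
print(f"dummy: {near_dummy:.4f}  file: {near_file:.4f}")
```

Under these assumed values a repetition within the window is far more likely for a file block than for a dummy block, and requiring several consecutive hops widens the gap further.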
Attacking one-block files:
Algorithm (h=5)
[Figure: worked example of the search for chains of near repetitions over h = 5 hops in an access trace. Labels in the figure: 479, 231, 431, 347, 278, 222, 67, 274, end, end.]
Attacking one-block files:
False positives
• Dummy block updates appear near (within the distance D_C fixed by P_F and the threshold C) in the h hops considered:

  P_{f+}(1) = \sum_{i=0}^{D_C} P_D(i) \;\Rightarrow\; P_{f+}(h) = P_{f+}(1)^h

• P_{f+} grows with D_C, which depends on the threshold C and on the file update frequency f
Attacking one-block files:
False negatives

• A file update happens far (at a distance greater than D_C) in one of the h hops:

  P_{f-}(1) = \sum_{i > D_C} P_F(i) \;\Rightarrow\; P_{f-}(h) = \sum_{i=1}^{h} \binom{h}{i} P_{f-}(1)^i \left(1 - P_{f-}(1)\right)^{h-i}
Simulation results
Multi-block files
Simulation parameters:

Size of files (blocks) | Number of files per size | File update frequency | Size of storage space
2-3                    | 10                       | 3%                    | 10,000
4-10                   | 10                       | 3%                    | 10,000

Results:

Size of files | Files found | Wrong size | False positives
2-3           | >99%        | <2%        | <2%
4-10          | >99%        | <1%        | 0
Simulation results
One-block files
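Simulation parameters: number of files, hops, accesses to each file, threshold C = 10^-8, size of storage space = 100,000
[Table residue: the values 10, 12 and 10 belong to the number of files, the hops and the accesses to each file; the column assignment does not survive in the text.]
• The rate of success is a function of f
• f also affects the number of false positives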
Attack Conclusions

• Security claims unsubstantiated
  1. The algorithms do not produce the same pattern for dummy and user updates
  2. The distribution of updated locations differs depending on whether there is user activity or not
• Blocks are rarely relocated, and when they are, their new location is known
• Multi-block file updates generate correlations between accessed locations
• Multi-block files are very easy to find; one-block files are easy to find
• A bit of randomness is not enough
Talk Outline





• Motivation
• Related Work
• Traffic Analysis Attacks on Continuously Observable Steganographic File Systems
• Countering Traffic Analysis
• Conclusions
Requirements




• Different levels of security
• Forward and backward security after coercion
• Data persistence
• Counter traffic analysis
Attack model




• Continuously monitors the contents of the storage and the accesses to it
• Records all past states of the storage
• Performs real-time traffic analysis on the accessed block locations
• Is able to coerce the user at any point
  – User produces some low-level keys
  – Attacker inspects user computer status
• The attacker should not learn about higher levels
System architecture
Table




• One password per level decrypts all the entries of that level
• Block key for forward and backward security
• H(·) to detect active attacks
• Metadata to manage the file system
Data persistence


• High-security file blocks appear as empty (while in low-security mode), so data may be lost
• Erasure codes:
  – Convert n blocks into m such that n of them suffice to recover the information
• Regeneration of higher levels
• Makes traffic analysis of file read operations more difficult
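The erasure-code property can be illustrated with the simplest possible instance, a single XOR parity block (n = 2, m = 3). The slides do not say which code the system uses, so this particular code and the function names are assumptions for illustration only.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(d0: bytes, d1: bytes):
    """Turn n = 2 data blocks into m = 3 stored blocks (two data, one parity)."""
    return [d0, d1, xor_bytes(d0, d1)]

def decode(shares):
    """Recover both data blocks from three shares, at most one of which is None."""
    s0, s1, parity = shares
    if s0 is None:
        s0 = xor_bytes(s1, parity)
    elif s1 is None:
        s1 = xor_bytes(s0, parity)
    return s0, s1

stored = encode(b"level-2 ", b"secrets!")
# Suppose the first stored block was overwritten while in low-security mode.
damaged = [None, stored[1], stored[2]]
restored = decode(damaged)
```

After recovery the lost block can be re-encoded and written back, which corresponds to the regeneration of higher levels mentioned on the slide.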
Traffic analysis resistance:
Pool mix


• Technique used to anonymize email traffic, used here to obscure relocation
• Cycle:
  – Collect a block from the raw storage,
  – Change its appearance,
  – Store it in a pool (which contains P blocks from previous cycles), and
  – Randomly flush a block out
• More randomness in relocations
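The cycle can be sketched as follows. Blocks are modelled as (payload, nonce) pairs, and "changing the appearance" is modelled by bumping the nonce as a stand-in for re-encryption. Writing the flushed block into the location that was just read is an assumption of this sketch; the slides do not specify where flushed blocks go.

```python
import random

def reencrypt(block):
    """Stand-in for re-encryption: same payload, different appearance."""
    payload, nonce = block
    return (payload, nonce + 1)

def pool_mix_cycle(storage, pool, rng):
    """One access cycle; returns the storage location that was touched."""
    loc = rng.randrange(len(storage))            # collect a block from the raw storage
    pool.append(reencrypt(storage[loc]))         # change its appearance, store it in the pool
    flushed = pool.pop(rng.randrange(len(pool))) # randomly flush a block out
    storage[loc] = flushed                       # assumed: it fills the location just read
    return loc

rng = random.Random(7)
storage = [(f"blk{i}", 0) for i in range(20)]
pool = [(f"pool{j}", 0) for j in range(4)]       # P = 4 blocks from earlier cycles
before = sorted(p for p, _ in storage + pool)

touched = [pool_mix_cycle(storage, pool, rng) for _ in range(1000)]
after = sorted(p for p, _ in storage + pool)
```

Because the block leaving the pool is chosen at random among P + 1 candidates, an observer who sees the written location cannot tell which of the recently read blocks now lives there.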
Traffic analysis resistance
Dummy traffic




• Provides unobservability (of idle and non-idle periods)
• Automatically generated accesses to the storage
• The pattern of dummy accesses must be statistically indistinguishable from user requests
• The system chooses blocks at random to be read and put in the pool
  – Works if files are small
• More sophisticated dummy selection strategies are possible
Access cycle
Additional work (not presented)





• Definition of metrics for unobservability and plausible deniability
• Probabilistic function qψ(t)(t) to detect correlations generated by repeated access to files
• Pattern recognition algorithm for gathering evidence prior to coercion
• Test for the detection of file accesses after coercion
• Results for unobservability and deniability
Conclusions




• Different levels of security: Table
• Forward security after coercion: one-time block keys
• Data persistence: erasure codes, redundancy
• Counter traffic analysis
  – Dummy traffic to conceal user activity
  – High-entropy relocation of data to hide new locations and access patterns
  – Not trivial to achieve deniability for multi-block files
Thank you