Messaging anonymity & the traffic analysis of hardened systems
FOSAD 2010 – Bertinoro, Italy
Networking
Relation between identity and efficient routing
Identifiers: MAC, IP, email, screen name
No network privacy = no privacy!
The identification spectrum today: from full anonymity, through pseudonymity, to strong identification — and “The Mess” we are in!
NO ANONYMITY — weak identifiers everywhere:
IP, MAC
Logging at all levels
Login names / authentication
PK certificates in clear
Application data leakage
NO IDENTIFICATION:
Expensive / unreliable logs
IP / MAC address changes
Open wifi access points
Botnets
Partial solution: authentication
Also: location data leaked; weak identifiers easy to modulate
Open issues: DoS and network level attacks
Weak identifiers — no integrity or authenticity.
The MAC address and the IP Identification field link different packets together. The same holds for TCP, SMTP, IRC, HTTP, ...

From RFC 791, 3.1. Internet Header Format:
“A summary of the contents of the internet header follows:”

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Example Internet Datagram Header (Figure 4)
Objective: foundations of anonymity & traffic analysis
Why anonymity?
Lecture 1 – Unconditional anonymity
DC-nets in detail, Broadcast & their discontents
Lecture 2 – Black box, long term attacks
Statistical disclosure & its Bayesian formulation
Lecture 3 – Mix networks in detail
Sphinx format, mix-networks
Lecture 4 – Bayesian traffic analysis of mix networks
Specialized applications
Electronic voting
Auctions / bidding / stock market
Incident reporting
Witness protection / whistle blowing
Showing anonymous credentials!
General applications
Freedom of speech
Profiling / price discrimination
Spam avoidance
Investigation / market research
Censorship resistance
Sender anonymity
Alice sends a message to Bob. Bob cannot know
who Alice is.
Receiver anonymity
Alice can send a message to Bob, but cannot find
out who Bob is.
Bi-directional anonymity
Alice and Bob can talk to each other, but neither
of them know the identity of the other.
3rd party anonymity
Alice and Bob converse and know each other, but
no third party can find this out.
Unobservability
Alice and Bob take part in some communication,
but no one can tell if they are transmitting or
receiving messages.
Unlinkability
Two messages sent (received) by Alice (Bob)
cannot be linked to the same sender (receiver).
Pseudonymity
All actions are linkable to a pseudonym, which is
unlinkable to a principal (Alice)
Simple receiver anonymity: broadcast. The sender broadcasts E(Message); to everyone but the intended receiver it is indistinguishable from E(Junk).
Point 1: Do not re-invent this
Point 2: Many ways to do broadcast — rings, trees; it has all been done (Buses)
Point 3: Is your anonymity system better than this?
Point 4: What are the problems here? Coordination, sender anonymity, latency, bandwidth
DC-nets
Dining Cryptographers (David Chaum 1985)
Multi-party computation resulting in a message
being broadcast anonymously
Two twists:
How to implement DC-nets through broadcast trees
How to achieve reliability?
Communication cost...
“Three cryptographers are sitting down to
dinner at their favourite three-star restaurant.
Their waiter informs them that arrangements
have been made with the maitre d'hotel for the
bill to be paid anonymously.
One of the cryptographers might be paying for
the dinner, or it might have been NSA (U.S.
National Security Agency).
The three cryptographers respect each other's
right to make an anonymous payment, but they
wonder if NSA is paying.”
Each pair of cryptographers tosses a shared coin: c_ar (Adi–Ron), c_rw (Ron–Wit), c_aw (Adi–Wit).
Ron paid (m_r = 1, “I paid”); Adi and Wit did not (m_a = m_w = 0, “I didn’t”).
Each announces his message combined with his two shared coins (mod 2):
  b_a = m_a + c_ar + c_aw
  b_r = m_r + c_ar + c_rw
  b_w = m_w + c_rw + c_aw
Combine: B = b_a + b_r + b_w = m_a + m_r + m_w = m_r = 1 (mod 2)
Did the NSA pay? No — every coin appears twice and cancels, so the combined bit reveals only that one of the diners paid, not which one.
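The three-party round above can be sketched in a few lines of Python (a toy sketch: each pair shares one fresh random coin, whereas deployed DC-nets derive coins from shared keys):

```python
import secrets

def dc_net_round(messages):
    """One DC-net round over GF(2): each pair of participants shares a
    random coin; each participant announces message XOR both coins.
    The XOR of all announcements reveals the single message sent."""
    n = len(messages)
    # pairwise shared coins (single bits)
    coins = {(i, j): secrets.randbits(1)
             for i in range(n) for j in range(i + 1, n)}
    announcements = []
    for i, m in enumerate(messages):
        b = m
        for (u, v), coin in coins.items():
            if i in (u, v):       # each coin enters exactly two announcements
                b ^= coin
        announcements.append(b)
    result = 0
    for b in announcements:
        result ^= b               # coins cancel pairwise
    return result

# Ron paid (m = 1), Adi and Wit did not (m = 0)
assert dc_net_round([0, 1, 0]) == 1
```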
Generalise
Many participants
Larger message size
▪ Conceptually many coins in parallel (xor)
▪ Or: use +/− arithmetic (mod 2^|m|)
Arbitrary key (coin) sharing
▪ Graph G: nodes — participants; edges — keys shared
What security?
Same protocol with m-bit messages: each pair tosses m coins (c_ar, c_rw, c_aw) and arithmetic is mod 2^m.
Ron wants to send: m_r = message; Adi and Wit send nothing (m_a = m_w = 0).
  b_a = m_a − c_ar + c_aw (mod 2^m)
  b_r = m_r + c_ar − c_rw (mod 2^m)
  b_w = m_w + c_rw − c_aw (mod 2^m)
Combine: B = b_a + b_r + b_w = m_a + m_r + m_w = m_r (mod 2^m)
Derive coins from shared keys: for round i, c_ab^i = H[K_ab, i] — effectively a stream cipher keyed with K_ab.
A shares key K_ab with B, and K_ac with C.
Alice broadcasts b_a = c_ab + c_ac + m_a.
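A minimal sketch of key-derived pads, with SHA-256 standing in for the stream cipher and three hypothetical pairwise keys; each pad enters the sum once with +1 and once with −1, so all pads cancel (an illustration, not the exact construction from the lecture):

```python
import hashlib

def coin(key: bytes, i: int, nbits: int = 32) -> int:
    """Derive the round-i shared pad from key K_ab as H[K_ab, i]."""
    digest = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % (1 << nbits)

def announce(m: int, my_keys, round_i: int, nbits: int = 32) -> int:
    """b = m + sum of signed pads (mod 2^nbits); the +1/-1 signs ensure
    each pad cancels between the two holders of the key."""
    b = m
    for key, sign in my_keys:
        b = (b + sign * coin(key, round_i, nbits)) % (1 << nbits)
    return b

k_ab, k_ac, k_bc = b"K_ab", b"K_ac", b"K_bc"   # hypothetical shared keys
i = 7                                           # round number
# Alice sends message 42; Bob and Carol send 0
ba = announce(42, [(k_ab, +1), (k_ac, +1)], i)
bb = announce(0, [(k_ab, -1), (k_bc, +1)], i)
bc = announce(0, [(k_ac, -1), (k_bc, -1)], i)
assert (ba + bb + bc) % (1 << 32) == 42
```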
If B and C are corrupt, the adversary’s view of Alice’s broadcast
  b_a = c_ab + c_ac + m_a
includes both pads c_ab and c_ac — so m_a is exposed: no anonymity.
More generally, adversary nodes partition the key-sharing graph into a blue and a green sub-graph. Calculate:
  B_blue = ∑ b_j, j is blue
  B_green = ∑ b_i, i is green
Subtract the keys shared with the corrupt (red) nodes:
  B_blue + K_red-blue = ∑ m_j
  B_green + K_red-green = ∑ m_i
This discovers the originating sub-graph — a reduction in anonymity.
Example: anonymity set size = 4 (not 11 or 8!)
The b_i broadcast graph is a tree, independent of the key-sharing graph (aggregator, ring, tree, peer-to-peer?).
No DoS unless there is a split in the graph.
Communication in 2 phases:
1) Key sharing (off-line)
2) Round synchronisation & broadcast
An aggregator combines: B = ∑ b_i = m_r (mod 2^m)
What if two participants send at once? Ron sends m_r = message1 and Wit sends m_w = message2 (Adi’s m_a = 0); each pair tosses m coins as before:
  b_a = m_a − c_ar + c_aw (mod 2^m)
  b_r = m_r + c_ar − c_rw (mod 2^m)
  b_w = m_w + c_rw − c_aw (mod 2^m)
Combine: B = b_a + b_r + b_w = m_a + m_r + m_w = a collision (mod 2^m)
Ethernet analogy:
Detect collisions (redundancy); exponential back-off with random re-transmission.
Collisions do not destroy all information:
B = b_a + b_r + b_w = m_a + m_r + m_w = message1 + message2 (mod 2^m) — a collision
N collisions can be decoded in N transmissions.
Trick 1: run 2 DC-net rounds in parallel, transmitting pairs (number of messages, sum of messages).
Trick 2: retransmit in a sub-round only if your message is greater than the average of the colliding sum.
Example: resolving 5 colliding messages by repeated splitting:
Round 1: (5, m1+m2+m4+m5+m6) — (0, 0)
Round 2: (3, m1+m2+m4) — (2, m5+m6)
Round 3: (1, m1) — (2, m2+m4) ⇒ m1 recovered
Round 4: (1, m2) — (1, m4) ⇒ recovered
Round 5: (1, m5) — (1, m6) ⇒ recovered
Reliability & DoS resistance
How to protect a DC-network against disruption? A solved problem?
DC in a Disco — “trap & trace”
Modern crypto: commitments, secret sharing, and banning
Note the homomorphism: collisions = addition
Can tally votes / histograms through DC-nets
Security is great!
Full key-sharing graph ⇒ perfect anonymity
Communication cost — BAD (N broadcasts for each message!)
Naive: O(N²) cost, O(1) latency
Not so naive: O(N) messages, O(N) latency
▪ Ring structure for broadcast
Expander graph: O(N) messages, O(log N) latency?
Centralized: O(N) messages, O(1) latency
Not practical for large(r) N! Local wireless communications?
Perfect Anonymity
“In the long run we are all dead” – Keynes
DC-nets – setting
Single message from Alice to Bob, or between a set
group
Real communications
Alice has a few friends that she messages often
Interactive stream between Alice and Bob (TCP)
Emails from Alice to Bob (SMTP)
Alice is not always on-line (network churn)
Repetition – patterns -> Attacks
Even perfect anonymity systems leak
information when participants change
Remember: DC-nets do not scale well
Setting:
N senders / receivers – Alice is one of them
Alice messages a small number of friends:
▪ RA in {Bob, Charlie, Debbie}
▪ Through a MIX / DC-net
▪ Perfect anonymity of size K
Can we infer Alice’s friends?
rA in RA= {Bob, Charlie, Debbie}
Alice
K-1 Senders
out of N-1
others
Anonymity
System
K-1 Receivers
out of N
others
(Model as random receivers)
Alice sends a single message to one of her friends
Anonymity set size = K
Entropy metric EA = log K
Perfect!
Observe many rounds, in some of which Alice participates:
Round T1: Alice + others → anonymity system → r_A1 + others
Round T2: others only → anonymity system → others
Round T3: Alice + others → anonymity system → r_A2 + others
Round T4: Alice + others → anonymity system → r_A3 + others
... and so on up to round Tt (r_A4, ...).
Rounds in which Alice participates will output a message to one of her friends!
Infer the set of friends!
Guess the set of friends of Alice (RA’)
Constraint |RA’| = m
Accept if an element is in the output of each
round
Downside: Cost
N receivers, m size – (N choose m) options
Exponential – Bad
Good approximations...
Note that the friends of Alice will be in the sets
more often than random receivers
How often? Expected number of messages per
receiver:
μother = (1 / N) ∙ (K-1) ∙ t
μAlice = (1 / m) ∙ t + μother
Just count the number of messages per receiver
when Alice is sending!
μAlice > μother
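The counting attack can be simulated directly. This sketch assumes the simple model above (Alice picks a friend uniformly; the K−1 other senders are modelled as uniformly random receivers); the friend set and round count are illustrative:

```python
import random
from collections import Counter

random.seed(1)
N, m, K, t = 20, 3, 5, 500      # population, #friends, batch size, rounds
friends = [0, 13, 19]           # Alice's (hidden) friend set

counts = Counter()
for _ in range(t):              # rounds in which Alice participates
    counts[random.choice(friends)] += 1        # Alice's message
    for _ in range(K - 1):                     # K-1 other senders,
        counts[random.choice(range(N))] += 1   # modelled as random receivers

# The m most frequent receivers are the statistical guess for the friend set:
# mu_Alice = t/m + mu_other clearly exceeds mu_other = (K-1) * t / N.
guess = sorted(r for r, _ in counts.most_common(m))
assert guess == friends
```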
Worked example — parameters: N=20, m=3, K=5, t=45.
[Table omitted: per round, the K observed receivers, the current hitting-set estimate, the SDA estimate, and the number of remaining hitting sets (685 initially).]
True friend set: KA = {0, 13, 19}.
The hitting-set estimate narrows down and converges on {0, 13, 19}; the SDA estimate oscillates in early rounds (e.g. between {0, 13, 18} and {0, 13, 19}).
Round 16: both attacks give the correct result.
SDA can give wrong results — it needs more evidence.
Counter-intuitive: the larger N, the easier the attack.
Hitting-set attacks
More accurate, need less information
Slower to implement
Sensitive to Model
▪ E.g. Alice sends dummy messages with probability p.
Statistical disclosure attacks
Need more data
Very efficient to implement (vectorised) – Faster partial results
Can be extended to more complex models (pool mix, replies, ...)
Bayesian modelling of the problem
A systematic approach to traffic analysis
One round as a probabilistic model:
Profiles Θ_Alice and Θ_Other; senders of round T1 (Alice + others) enter the anonymity system; receivers r_A1 ~ Θ_Alice and r_other ~ Θ_Other; matching M1 ~ Θ_M.
Observation O1 = senders (uniform) + receivers (as above).
Full generative model:
Pick a profile for Alice, Θ_Alice, and a profile for the others, Θ_Other, from prior distributions (once!)
For each round i:
▪ Pick a multi-set of senders in Oi uniformly from the multi-sets of size K of possible senders
▪ Pick a matching Mi of the K messages uniformly at random
▪ For each sender pick a receiver (r_Alice, r_other) according to the probability in their respective profiles
The Disclosure attack inference problem:
“Given all known and observed variables (priors, Θ_M, observations Oi) determine the distribution over the hidden random variables (profiles Θ_Alice, Θ_Other and matchings Mi)”
Using information from all rounds, assuming profiles are stable.
How? (a) maths = Bayesian inference; (b) computation = Gibbs sampling.
What are profiles?
Keep it simple: a multinomial distribution over all users. E.g.
Θ_Alice = [Bob: 0.4, Charlie: 0.3, Debbie: 0.3]
Θ_Other = [Uniform: 1/(N−1) each]
Prior distribution on multinomials: the Dirichlet distribution — samples from a Dirichlet distribution are multinomial distributions.
Other profile models are also possible: restrictions on the number of contacts, cliques, mutual friends, …
Other quantities have known conditional probabilities.
What are those conditional probabilities? Each quantity in the model has one:
Pr[Θ_Alice | prior], Pr[Θ_Other | prior]
Pr[O1 | K]
Pr[M1 | Θ_M]
Pr[r_A1 | Θ_Alice, M1, O1], Pr[r_other | Θ_Other, M1, O1]
What we have (the joint / forward probability):
Pr[Θ_Alice, Θ_Other, Mi, r_iA, r_iother | Oi, Θ_M, K] =
  Pr[r_iA | Θ_Alice, Mi, Oi] × Pr[r_iother | Θ_Other, Mi, Oi] ×
  Pr[Mi | Θ_M] × Pr[Θ_Alice | prior] × Pr[Θ_Other | prior]
What we really want (the posterior):
Pr[Θ_Alice, Θ_Other, Mi | r_iA, r_iother, Oi, Θ_M, K]
Derivation — inverse probability:
Pr[A=a, B=b] = Pr[A=a | B=b] × Pr[B=b] = Pr[B=b | A=a] × Pr[A=a]
⇒ Pr[A=a | B=b] = Pr[B=b | A=a] × Pr[A=a] / Pr[B=b]
  (inverse probability = forward probability × prior / normalising factor)
Pr[B=b] = ∑_a Pr[A=a, B=b] = ∑_a Pr[B=b | A=a] × Pr[A=a]
Large spaces ⇒ no direct computation of the normalising factor.
Applying Bayes to our model:
Pr[Θ_Alice, Θ_Other, Mi | r_iA, r_iother, Oi, Θ_M, K] =
  Pr[r_iA, r_iother | Θ_Alice, Θ_Other, Mi, Oi, Θ_M, K] × Pr[Θ_Alice, Θ_Other, Mi | Oi, Θ_M, K] / Z
where the normalising factor Z sums the numerator over all hidden states Θ_Alice, Θ_Other, Mi — large spaces, no direct computation.
Good: can compute the probability sought up to a multiplicative constant:
Pr[Θ_Alice, Θ_Other, Mi | r_iA, r_iother, Oi, Θ_M, K] ∝
  Pr[r_iA, r_iother | Θ_Alice, Θ_Other, Mi, Oi, Θ_M, K] × Pr[Θ_Alice, Θ_Other, Mi | Oi, Θ_M, K]
Bad: forget about computing the constant — in general.
Ugly: use computational methods to estimate the quantities of interest for traffic analysis.
Full hidden state: Θ_Alice, Θ_Other, Mi
Marginal probabilities & distributions:
Pr[Alice→Bob] — are Alice and Bob friends?
M_x — who is talking to whom at round x?
How can we get information about those, without computing the exact probability?
Consider a distribution Pr[A=a].
Direct computation of the mean: E[A] = ∑_a Pr[A=a] × a
Computation of the mean through sampling: draw samples A1, A2, …, An ~ A; then E[A] ≈ (1/n) ∑_i Ai
More: estimates of variance; estimates of percentiles — confidence intervals (0.5%, 2.5%, 50%, 97.5%, 99.5%)
Back to the Disclosure attack inference problem — solve it by sampling:
“Given all known and observed variables (priors, Θ_M, observations Oi) determine the distribution over the hidden random variables (profiles Θ_Alice, Θ_Other and matchings Mi)” — by drawing samples from that distribution rather than enumerating it.
Using information from all rounds, assuming profiles are stable.
How? (a) maths = Bayesian inference; (b) computation = Gibbs sampling.
Typical approaches
Direct sampling: draw elements according to their probabilities.
Rejection sampling: draw elements according to another distribution; keep each with probability proportional to the target.
Markov-Chain Monte Carlo (MCMC):
Gibbs sampling (simpler)
Metropolis-Hastings (more flexible)
Gibbs sampling allows sampling from complex distributions when their conditional distributions are easy to sample from.
Example: sample Pr[A, B | O]
For sample s in (0, SAMPLES):
For iteration j in (0, ITERATIONS):
▪ a_j ← A with Pr[A | B=b_{j−1}, O]
▪ b_j ← B with Pr[B | A=a_j, O]
Sample_s = (a_ITERATIONS, b_ITERATIONS)
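A toy Gibbs sampler for a target whose conditionals are known exactly — a bivariate standard normal with correlation ρ, standing in for the profile/matching conditionals used in the attack (an illustrative sketch, not the attack itself):

```python
import random

random.seed(0)
rho = 0.6   # target: standard bivariate normal with correlation rho

def gibbs(iterations=20000, burn_in=1000):
    """Alternately draw each variable from its exact conditional
    given the current value of the other variable."""
    a, b = 0.0, 0.0
    sd = (1 - rho * rho) ** 0.5
    samples = []
    for j in range(iterations):
        a = random.gauss(rho * b, sd)   # A | B=b  ~  N(rho*b, 1-rho^2)
        b = random.gauss(rho * a, sd)   # B | A=a  ~  N(rho*a, 1-rho^2)
        if j >= burn_in:
            samples.append((a, b))
    return samples

samples = gibbs()
mean_a = sum(a for a, _ in samples) / len(samples)
corr = sum(a * b for a, b in samples) / len(samples)
assert abs(mean_a) < 0.1          # true mean is 0
assert abs(corr - rho) < 0.1      # sample correlation recovers rho
```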
Our distribution:
Pr[Θ_Alice, Θ_Other, Mi | r_iA, r_iother, Oi, Θ_M, K]
Conditional distributions to Gibbs-sample from:
Profiles: Pr[Θ_Alice, Θ_Other | Mi, r_iA, r_iother, Oi, Θ_M, K]
(direct sampling, by sampling a Dirichlet distribution)
Matchings: Pr[Mi | Θ_Alice, Θ_Other, r_iA, r_iother, Oi, Θ_M, K]
(direct sampling of the matching, link by link)
Two uses of Pr[Θ_Alice, Θ_Other, Mi | r_iA, r_iother, Oi, Θ_M, K]:
SAMPLING — get typical values of all hidden variables.
Pros: determine the distribution and confidence intervals of each quantity of interest.
Cons: not the “best solution”.
OPTIMIZING — get the values of the hidden variables that maximize the probability.
Pros: the “best solution”.
Cons: information about other solutions is lost!
Near-perfect anonymity is not perfect enough!
High-level patterns cannot be hidden forever
Unobservability / maximal anonymity set size needed
Flavours of attacks
Very exact attacks — expensive to compute
▪ Model inexact anyway
Statistical variants — very fast!
Bayesian approach — systematic
▪ Define a “forward” generative probability model
▪ “Invert” the model using Bayes theorem
▪ Use sampling to estimate quantities of interest
▪ Find their confidence intervals
References
Limits of Anonymity in Open Environments
by Dogan Kesdogan, Dakshi Agrawal, and Stefan Penz. In the Proceedings of Information Hiding Workshop (IH 2002),
October 2002.
Statistical Disclosure Attacks: Traffic Confirmation in Open Environments
by George Danezis. In the Proceedings of Security and Privacy in the Age of Uncertainty, (SEC2003), Athens, May
2003, pages 421-426.
The Hitting Set Attack on Anonymity Protocols
by Dogan Kesdogan and Lexi Pimenidis. In the Proceedings of 6th Information Hiding Workshop (IH 2004), Toronto,
May 2004.
Statistical Disclosure or Intersection Attacks on Anonymity Systems
by George Danezis and Andrei Serjantov. In the Proceedings of 6th Information Hiding Workshop (IH 2004), Toronto,
May 2004.
Practical Traffic Analysis: Extending and Resisting Statistical Disclosure
by Nick Mathewson and Roger Dingledine. In the Proceedings of Privacy Enhancing Technologies workshop (PET
2004), May 2004, pages 17-34.
Two-Sided Statistical Disclosure Attack
by George Danezis, Claudia Diaz, and Carmela Troncoso. In the Proceedings of the Seventh Workshop on Privacy
Enhancing Technologies (PET 2007), Ottawa, Canada, June 2007.
Perfect Matching Statistical Disclosure Attacks
by Carmela Troncoso, Benedikt Gierlichs, Bart Preneel, and Ingrid Verbauwhede. In the Proceedings of the Eighth
International Symposium on Privacy Enhancing Technologies (PETS 2008), Leuven, Belgium, July 2008, pages 2-23.
Vida: How to Use Bayesian Inference to De-anonymize Persistent Communications
by George Danezis and Carmela Troncoso. In the Proceedings of Privacy Enhancing Technologies, 9th International
Symposium, PETS 2009, Seattle, WA, USA, August 5-7, 2009, 2009, pages 56-72.
Traffic analysis exercise
http://www.cl.cam.ac.uk/~sjm217/projects/anon/ta-exercise.html
Anonymity, cryptography, network security and attacks
FINANCIAL CRYPTOGRAPHY
& DATA SECURITY 2011
St. Lucia – Caribbean
Submission deadline: October 1, 2010
Event: Feb-Mar 2011.
PRIVACY ENHANCING
TECHNOLOGIES 2011
Waterloo – Canada
Submissions deadline: March 2011
Event: July 2011
David Chaum (concept 1979 — published 1981)
The reference is a marker in the anonymity bibliography
Makes use of cryptographic relays
Break the link between sender and receiver
Cost
O(1) – O(logN) messages
O(1) – O(logN) latency
Security
Computational (public key primitives must be secure)
Threshold of honest participants
The mix: Alice sends A→M: {B, Msg}_Mix; the mix outputs M→B: Msg to Bob.
The adversary cannot see inside the mix. Two required properties:
1) Bitwise unlinkability
2) Traffic analysis resistance
Bitwise unlinkability
Ensure adversary cannot link messages in and out
of the mix from their bit pattern
Cryptographic problem
Traffic analysis resistance
Ensure the messages in and out of the mix cannot
be linked using any meta-data (timing, ...)
Two tools: delay or inject traffic – both add cost!
Broken bitwise unlinkability: the ‘stream cipher’ mix (Design 1)
{M}_Mix = {fresh k}_PKMix , M ⊕ Stream_k
A→M: {B, Msg}_Mix ; M→B: Msg
Active attack? The tagging attack:
The adversary intercepts {B, Msg}_Mix and injects {B, Msg}_Mix xor (0, Y).
The mix outputs the message M→B: Msg xor Y.
And the attacker can link them.
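The XOR malleability behind the tagging attack is easy to demonstrate; here a SHA-256 counter-mode keystream stands in for the stream cipher (an illustration only, not the actual mix format):

```python
import hashlib

def stream(key: bytes, n: int) -> bytes:
    """Toy stream cipher: SHA-256 in counter mode (illustration only)."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

k = b"fresh session key"     # in the design, sent encrypted under PK_Mix
plaintext = b"B, Hello Bob!"
ciphertext = xor(plaintext, stream(k, len(plaintext)))

# The active adversary XORs a tag Y into the ciphertext without knowing k...
tag = bytes(len(plaintext) - 4) + b"XXXX"
tampered = xor(ciphertext, tag)
# ...and the mix faithfully outputs the message XOR the recognisable tag:
assert xor(tampered, stream(k, len(plaintext))) == xor(plaintext, tag)
```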
k
Mix acts as a service
Everyone can send messages to it; it will apply an
algorithm and output the result.
That includes the attacker – decryption oracle,
routing oracle, ...
(Active) Tagging attacks
Defence 1: detect modifications (CCA2)
Defence 2: lose all information (Mixminion, Minx)
[Message format: input → processing inside mix → output]
George Danezis & Ian Goldberg. Sphinx: A Compact and Provably Secure Mix Format. IEEE S&P ‘09.
Broken traffic analysis resistance: the ‘FIFO*’ mix (Design 2)
The mix sends messages out in the order they came in!
A→M: {B, Msg}_Mix ; M→B: Msg
(* FIFO = first in, first out)
Passive attack? The adversary simply counts the number of messages, and assigns to each input the corresponding output.
Mix strategies — ‘mix’ messages together:
Threshold mix: wait for N messages and output them in a random order.
Pool mix: pool of n messages; wait for N inputs; output N out of N+n; keep the remaining n in the pool.
Timed, random delay, ...
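The two batching strategies can be sketched as small simulations (thresholds and pool sizes are illustrative):

```python
import random

random.seed(42)

class ThresholdMix:
    """Wait for `threshold` messages, then flush all in random order."""
    def __init__(self, threshold):
        self.threshold, self.batch = threshold, []

    def receive(self, msg):
        self.batch.append(msg)
        if len(self.batch) == self.threshold:
            out, self.batch = self.batch, []
            random.shuffle(out)
            return out              # flush the whole batch
        return []

class PoolMix:
    """Keep n messages back: on every N inputs, output N of the N+n
    buffered messages and retain a random n in the pool."""
    def __init__(self, threshold, pool):
        self.threshold = threshold
        self.pool = list(pool)      # initial dummy pool of size n
        self.batch = []

    def receive(self, msg):
        self.batch.append(msg)
        if len(self.batch) == self.threshold:
            candidates = self.pool + self.batch
            random.shuffle(candidates)
            n = len(self.pool)
            out, self.pool = candidates[n:], candidates[:n]
            self.batch = []
            return out
        return []

tm = ThresholdMix(3)
assert tm.receive("a") == [] and tm.receive("b") == []
assert sorted(tm.receive("c")) == ["a", "b", "c"]

pm = PoolMix(3, pool=["p1", "p2"])
out = [m for msg in ["a", "b", "c"] for m in pm.receive(msg)]
assert len(out) == 3 and len(pm.pool) == 2   # 3 flushed, 2 held back
```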
“Hell is other people” – J.P. Sartre
Anonymity security relies on others
Problem 1: Mix must be honest
Problem 2: Other sender-receiver pairs to hide amongst
Rely on more mixes – good idea
Distributing trust – some could be dishonest
Distributing load – fewer messages per mix
Two extremes
Mix Cascades
▪ All messages are routed through a preset mix sequence
▪ Good for anonymity – poor load balancing
Free routing
▪ Each message is routed through a random sequence of mixes
▪ Security parameter: L, the length of the sequence
Free route mix network:
A->M2: {M4, {M1, {B, Msg}M1}M4}M2
Alice’s message traverses M2 → M4 → M1 → Bob through a network of mixes M1 ... M7.
(The adversary should get no more information than before!)
Bitwise unlinkability
Length invariance
Replay prevention
How to find mixes?
Lists need to be authoritative, comprehensive & common
Additional requirements – corrupt mixes
Hide the total length of the route
Hide the step number
(From the mix itself!)
Length of paths?
Good mixing in O(log(|Mix|)) steps = log(|Mix|) cost
Cascades: O(|Mix|)
We can manage “Problem 1 – trusting a mix”
The (n-1) attack — active attack
Wait for, or flush, the mix. Block all incoming messages (trickle) and inject own messages (flood) until Alice’s message is out.
[The attacker surrounds Alice’s single message with n−1 of its own.]
Strong identification to ensure distinct identities
Problem: user adoption
Message expiry
Messages are discarded after a deadline
Prevents the adversary from flushing the mix, and injecting
messages unnoticed
Heartbeat traffic
Mixes route messages in a loop back to themselves
Detect whether an adversary is blocking messages
Forces adversary to subvert everyone, all the time
General instance of the “Sybil Attack”
Malicious mixes may be dropping messages
Special problem in elections
Original idea: receipts (unworkable)
Two key strategies to prevent DoS
Provable shuffles
Randomized partial checking
Bitwise unlinkability: El-Gamal re-encryption
El-Gamal public key (g, g^x) for private x
El-Gamal encryption (g^k, g^(kx) ∙ M)
El-Gamal re-encryption (g^k' ∙ g^k, g^(k'x) ∙ g^(kx) ∙ M)
▪ No need to know x to re-encrypt
▪ Encryption and re-encryption unlinkable
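A schoolbook-sized sketch of El-Gamal re-encryption (tiny toy parameters for illustration only — never use such a group in practice):

```python
import random

random.seed(0)
# Toy group: order-q subgroup of Z_p*, with p = 2q + 1 a safe prime
q = 1019
p = 2 * q + 1            # 2039, prime
g = 4                    # 2^2 generates the order-q subgroup

x = random.randrange(1, q)   # private key
h = pow(g, x, p)             # public key g^x

def encrypt(m):
    k = random.randrange(1, q)
    return (pow(g, k, p), (pow(h, k, p) * m) % p)

def reencrypt(c):
    """Re-randomise (g^k, h^k m) -> (g^(k+k'), h^(k+k') m): no x needed."""
    k2 = random.randrange(1, q)
    a, b = c
    return ((a * pow(g, k2, p)) % p, (b * pow(h, k2, p)) % p)

def decrypt(c):
    a, b = c
    return (b * pow(pow(a, x, p), -1, p)) % p

m = pow(g, 7, p)         # message encoded as a subgroup element
c1 = encrypt(m)
c2 = reencrypt(c1)
assert c1 != c2                          # ciphertexts look unrelated...
assert decrypt(c1) == decrypt(c2) == m   # ...yet decrypt identically
```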
Architecture — re-encryption cascade, with a proof of correct shuffle output at each step:
Alice’s input (El-Gamal encryption) → Mix 1 (re-encrypt + proof) → Mix 2 (re-encrypt + proof) → Mix 3 (re-encrypt + proof) → threshold decryption (+ proof)
Proof of correct shuffle
Outputs are a permutation of the decrypted inputs
(Nothing was inserted, dropped, otherwise modified!)
Upside: Publicly verifiable – Downside: expensive
Applicable to any mix system
Two round protocol
Mix commits to inputs and outputs
Gets challenge
Reveals half of correspondences at random
Everyone checks correctness
Pair mixes to ensure messages get some anonymity:
Mix i reveals one half of its correspondences; Mix i+1 reveals the other half.
Rogue mix can cheat with probability at most ½
Messages are anonymous with overwhelming
probability in the length L
Even if no pairing is used – safe for L = O(logN)
Cryptographic reply address
Alice sends to Bob: M1, {M2, k1, {A, {K}A}M2}M1
▪ Memory-less:
k1 = H(K, 1)
k2 = H(K, 2)
Bob replies:
▪ B->M1: {M2, k1, {A,{K}A}M2}M1, Msg
▪ M1->M2: {A,{K}A}M2 , {Msg}k1
▪ M2->A: {K}A, {{Msg}k1}k2
Security: indistinguishable from other messages
Anonymity requires a crowd
Difficult to ensure it is not simulated – (n-1) attack
DC-nets – Unconditional anonymity at high
communication cost
Collision resolution possible
Mix networks – Practical anonymous messaging
Bitwise unlinkability / traffic analysis resistance
Crypto: Decryption vs. Re-encryption mixes
Distribution: Cascades vs. Free route networks
Robustness: Partial checking
The anonymity set (size)
Dining cryptographers
▪ Full key sharing graph = (N - |Adversary|)
▪ Non-full graph – size of graph partition
Assumption: all equally likely
Mix network context
Threshold mix with N inputs: Anonymity = N
Example: a threshold mix with N = 4 inputs ⇒ anonymity = 4.
Example: 2-stage mix — Alice and Bob enter Mix 1; Charlie enters Mix 2; Mix 1 feeds Mix 2, whose output we observe.
Option 1: count possible participants — 3 possible senders ⇒ N = 3.
But note the probabilities: Alice ¼, Bob ¼, Charlie ½!
Option 2: arbitrary minimum probability.
Problem: ad-hoc.
Same 2-stage mix example: define the distribution of senders (Alice ¼, Bob ¼, Charlie ½).
The entropy of the distribution is the anonymity:
E = −∑ p_i log2 p_i
Example: E = −2 ∙ ¼ ∙ (−2) − ½ ∙ (−1) = 1 + ½ = 1.5 bits
(NOT N = 3 ⇒ E = log2 3 = 1.58 bits)
Intuition: entropy measures the missing information for full identification!
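The entropy metric is a one-liner; the worked example above checks out:

```python
from math import log2

def anonymity(probs):
    """Effective anonymity = Shannon entropy of the sender distribution,
    E = -sum p_i log2 p_i, measured in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

# 2-stage mix example: Alice 1/4, Bob 1/4, Charlie 1/2
assert anonymity([0.25, 0.25, 0.5]) == 1.5
# a uniform set of 3 senders would give log2(3) ~ 1.58 bits
assert abs(anonymity([1 / 3] * 3) - log2(3)) < 1e-12
```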
Only the attacker can measure the anonymity of a system:
one needs to know which inputs, outputs and mixes are controlled.
This defines the anonymity of single messages — how to combine them into the anonymity of a system? Min-anonymity of messages?
How do you derive the probabilities? (Hard!) Complex systems — not just examples.
Core:
The Dining Cryptographers Problem: Unconditional Sender and Recipient
Untraceability by David Chaum.
In Journal of Cryptology 1, 1988, pages 65-75.
Mixminion: Design of a Type III Anonymous Remailer Protocol by George
Danezis, Roger Dingledine, and Nick Mathewson.
In the Proceedings of the 2003 IEEE Symposium on Security and Privacy, May
2003, pages 2-15.
Sphinx: A Compact and Provably Secure Mix Format by George Danezis and Ian
Goldberg. In the Proceedings of the 30th IEEE Symposium on Security and Privacy
(S&P 2009), 17-20 May, Oakland, California, USA, 2009, pages 269-282.
More
A survey of anonymous communication channels by George Danezis, Claudia
Diaz and Paul Syverson
http://homes.esat.kuleuven.be/~gdanezis/anonSurvey.pdf
The anonymity bibliography
http://www.freehaven.net/anonbib/
A systematic approach to measuring anonymity
Remixed slides contributed by Carmela Troncoso
Uncover who speaks to whom
Observe all links (Global Passive Adversary)
Long term disclosure attacks:
Exploit persistent patterns
Disclosure Attack [Kes03], Statistical Disclosure Attack [Dan03], Perfect Matching
Disclosure Attacks [Tron-et-al08]
Restricted routes [Dan03]
Messages cannot follow any route
Bridging and Fingerprinting [DanSyv08]
Users have partial knowledge of the network
Based on heuristics and specific models, not generic
Determine the probability distribution linking inputs to outputs.
Example: senders (A, B, C) route through threshold mixes MIX 1, MIX 2, MIX 3 to receivers Q, R, S. Each output of MIX 3 is A or B with probability ½ each; following the traffic through the network, the receiver distributions over senders (A, B, C) are:
  Q ← (3/8, 3/8, 1/4)
  R ← (3/8, 3/8, 1/4)
  S ← (1/4, 1/4, 1/2)
Threshold mix: collect t messages, and output them changing their appearance and in a random order.
Constraints change the distributions — e.g. requiring path length = 2, the same example gives:
  Q ← (1/4, 1/4, 1/2)
  R ← (1/4, 1/4, 1/2)
  S ← (1/2, 1/2, 0)
Non-trivial given the observation!!
How to compute the probabilities systematically?
Senders → mixes (threshold = 3) → receivers.
Find the “hidden state” of the mixes: Pr[HS | O, C]
(Observation O: A, B, C enter via MIX 1 / MIX 2; Q, R, S exit via MIX 3 / MIX 2. Prior information C.)
Pr[HS | O, C] = Pr[O | HS, C] ∙ Pr[HS | C] / ∑_HS' Pr[HS', O | C]
The denominator is too large to enumerate!!
“Hidden state” + observation = paths.
Assigning each message a path through the mixes (e.g. P1: A → M1 → M3 → Q; P2: B → M1 → M3 → R; P3: C → M2 → S) fixes the hidden state, so
Pr[HS | O, C] ∝ Pr[O | HS, C] ∙ Pr[Paths | C]
Actually… we want marginal probabilities, e.g. Pr[A→Q | O, C] — the per-receiver distributions such as Q ← (3/8, 3/8, 1/4) in the example above:
Pr[A→Q | O, C] = ∑_j I_{A→Q}(HS_j) ∙ Pr[HS_j | O, C]
But… we cannot obtain a full distribution over HS directly, and cannot enumerate all hidden states.
Using sampling: draw HS1, HS2, HS3, HS4, …, HSN ~ Pr[HS | O, C]
For each sample ask (A → Q)? — e.g. 0, 1, 0, 1, …, 1
Pr[A→Q | O, C] ≈ (1/N) ∑_j I_{A→Q}(HS_j)
Markov Chain Monte Carlo methods let us sample Pr[HS | O, C] ∝ Pr[Paths | C].
How does Pr[Paths | C] look?
How does Pr[Paths|C] look like?
Evaluation: information theoretic metrics for
anonymity
H Pr[ A Ri | O, C ] log Pr[ A Ri | O, C ]
Ri
Or min-Entropy, max-probability, (k-anonymity), …
Operations: estimating probability of arbitrary
events
Input message to output message?
Alice speaking to Bob ever?
Two messages having the same sender?
How do we define Pr[Paths | C]?
Context C: the senders and the time / round at which they send.
Basic generative model — users decide on their paths independently:
Pr[Paths | C] = ∏_x Pr[P_x | C]
Path length sampled from a path-length distribution, e.g. uniform between L_min and L_max:
Pr[L = l | C] = 1 / (L_max − L_min + 1)
Set of mixes chosen as a random selection of distinct nodes, e.g. a random permutation of length l:
Pr[M_x | L = l, C] = (N_mix − l)! / N_mix!
I_set(M_x) = 1 if all nodes are distinct, 0 otherwise
Per path: Pr[P_x | C] = Pr[L = l | C] ∙ Pr[M_x | L = l, C] ∙ I_set(M_x)
The problem of unknown destinations: the observer has to stop at some point (end of observation), while some paths continue to unseen destinations.
Problem of imputed observation — Bayesian approach: fill in the missing data by summing over the possible continuations:
Pr[P_x | C] = ∑_{l = L_obs}^{L_max} Pr[L = l | C] ∙ Pr[M_x | L = l, C] ∙ I_set(M_x)
Bridging: users can only send through mixes they know:
Pr[P_x | C] = Pr[L = l | C] ∙ Pr[M_x | L = l, C] ∙ I_set(M_x) ∙ I_bridging(M_x | x)
Non-compliant clients (with probability p_cp):
Do not respect length restrictions (L_min,cp, L_max,cp)
Choose l nodes out of the N_mix available, allowing repetitions:
Pr[M_x | L = l, C, I_cp(Path)] = 1 / N_mix^l
As before, paths are independent: Pr[Paths | C] = ∏_x Pr[P_x | C]
Decide whether each client is compliant or not when calculating the probability:
Pr[Paths | C] = ∏_{i ∈ P_cp} p_cp ∙ Pr[P_i | C, I_cp(P_i)] ∙ ∏_{j ∉ P_cp} (1 − p_cp) ∙ Pr[P_j | C]
Social network information — assuming we know the sending profiles Pr(Sen_x → Rec_x):
Pr[P_x | C] = Pr[L = l | C] ∙ Pr[M_x | L = l, C] ∙ I_set(M_x) ∙ Pr[Sen_x → Rec_x]
This augments the traffic analysis of mix networks with long-term profiles.
Other constraints: unknown origin, dummies, other mixing strategies, …
Take-home lesson:
• Integrate more constraints / more complexity by augmenting the generative model
• More constraints = less anonymity
How do we sample Pr[ Paths | C ]?
Sample from a distribution that is difficult to sample directly:
  Pr[ HS | O, C ] = Pr[ O | HS, C ] ∙ Pr[ HS | C ] / Σ_HS Pr[ HS, O | C ]
Up to a constant K:
  Pr[ HS | O, C ] = K ∙ Pr[ Paths | C ], where Pr[ Paths | C ] = ∏x Pr[ Px | C ]
Best paths vs. path samples – 3 key points:
- Requires only a generative model / likelihood
- Provides full & true distributions – good error estimates, rather than bare false positives and negatives
- Systematic & extensible
We want to sample Pr[A], which is hard to sample from directly, but we can calculate Pr[A] / Z up to a multiplicative constant Z.

Metropolis–Hastings algorithm:
- Define a random walk Q on A that is easy to sample, and for which we can calculate Q(A = ai | A = ai−1) and Q(A = ai−1 | A = ai)
- Select a random initial sample A = a0
- Given ai−1, draw a sample ai from Q(ai | ai−1)
- Calculate
    α = ( Pr[A = ai] ∙ Q(ai−1 | ai) ) / ( Pr[A = ai−1] ∙ Q(ai | ai−1) )
- Accept ai with probability min(1, α)
Constructs a Markov Chain with stationary distribution Pr( HS | O, C )
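The algorithm above can be sketched in a few lines of Python. This is a generic illustration on a toy discrete target known only up to its constant (with a uniform, hence symmetric, proposal so the Q terms cancel); the names metropolis_hastings and weights are hypothetical:

```python
import random

def metropolis_hastings(unnorm_pr, propose, a0, n_steps, rng):
    """Generic Metropolis-Hastings: needs the target only up to the constant Z."""
    a, samples = a0, []
    for _ in range(n_steps):
        cand = propose(rng)                    # uniform proposal: Q(cand|a) = Q(a|cand)
        alpha = unnorm_pr(cand) / unnorm_pr(a) # Q terms cancel for a symmetric Q
        if rng.random() < min(1.0, alpha):     # accept with probability min(1, alpha)
            a = cand
        samples.append(a)
    return samples

# toy target over {0,...,4}, known only up to Z = 9
weights = [1, 2, 3, 2, 1]
rng = random.Random(1)
samples = metropolis_hastings(lambda a: weights[a],
                              lambda rng: rng.randrange(5),
                              0, 30000, rng)
freq = samples.count(2) / len(samples)         # stationary probability of state 2 is 3/9
```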
From the current state HS_current, draw a candidate state HS_candidate from Q(HS_candidate | HS_current):
1. Compute
   α = ( Pr(HS_candidate) ∙ Q(HS_current | HS_candidate) ) / ( Pr(HS_current) ∙ Q(HS_candidate | HS_current) )
2. If α ≥ 1: HS_current ← HS_candidate (accept)
   else draw u ~ U(0,1):
     if u < α: HS_current ← HS_candidate
     else: HS_current ← HS_current
i.e. accept with probability min(1, α).
Applied to paths, with Pr[ HS | O, C ] = Pr[ Paths | C ] / Z, draw Paths_candidate from Q(Paths_candidate | Paths_current) and compute:
  α = ( Pr[ Paths_candidate | C ] ∙ Q(Paths_current | Paths_candidate) ) / ( Pr[ Paths_current | C ] ∙ Q(Paths_candidate | Paths_current) )
Transition Q: a swap operation.
[Figure: senders A, B, C route to receivers Q, R, S through mixes M1, M2, M3; the swap transition exchanges the mix assignments of two paths.]
More complicated transitions are needed for non-compliant clients.
[Figure: the chain moves between path-set states Paths_i, Paths_j, … via Q.]
Consecutive path samples are dependent; taking sufficiently separated samples gives (approximately) independent draws.
It works!
C. Troncoso - UTA - Nov 2009
Events should happen with the predicted probability:
1. Create an instance of a network
2. Run the sampler and obtain Paths_1, Paths_2, …
3. Choose a target sender Sen and receiver Rec
4. Predict the probability
     Pr(Sen → Rec) = Σ_j I_{Sen→Rec}(Paths_j) / N
5. Check whether Sen actually chose Rec as receiver: I_{Sen→Rec}(network)
6. Choose a new network and go to 2

Empirical probability, E( I_{Sen→Rec}(network) ), estimated as:
  Pr_empirical(X → Y) ~ Beta( Σ I_{X→Y} + 1, Σ (1 − I_{X→Y}) + 1 )
[Figure: predicted vs. empirical probabilities, for 100 and 1 000 messages.]
- It scales well as networks get larger
- As expected, mix networks offer good protection

Size of network and population (results are kept in memory during the simulation):

  Nmix | t  | Nmsg   | Samples | RAM (MB)
  3    | 3  | 10     | 500     | 16
  3    | 3  | 50     | 500     | 18
  10   | 20 | 50     | 500     | 18
  10   | 20 | 1 000  | 500     | 24
  10   | 20 | 10 000 | 500     | 125
  Nmix | t  | Nmsg  | iter | Full analysis (min) | One sample (ms)
  3    | 3  | 10    | 6011 | 4.24                | 509.12
  3    | 3  | 50    | 6011 | 4.80                | 576.42
  5    | 10 | 100   | 7011 | 5.34                | 641.28
  10   | 20 | 1 000 | 7011 | 5.97                | 706.12

- Operations should be O(1)
- Includes writing of the results to a file
- Different numbers of iterations
Conclusions:
- Traffic analysis is non-trivial when there are constraints
- Probabilistic model: incorporates most attacks (e.g. non-compliant clients)
- Integrates with other inferences: long-term attacks & others
- Markov chain Monte Carlo methods extract the marginal probabilities
- Systematic: only a generative model is needed
Future work:
- Models with partial adversaries
- Onion routing – low-latency models
- Location privacy
- Use the models to derive anonymity systems
- Model more constraints / less information for the attacker – added value?
More info:
- Vida: How to Use Bayesian Inference to De-anonymize Persistent Communications. George Danezis and Carmela Troncoso. Privacy Enhancing Technologies Symposium, 2009.
- The Bayesian Analysis of Mix Networks. Carmela Troncoso and George Danezis. 16th ACM Conference on Computer and Communications Security, 2009.
- The Application of Bayesian Inference to Traffic Analysis. Carmela Troncoso and George Danezis. Microsoft Technical Report.
Sampling error, the Beta distribution, …

Joint probability:
  Pr(X, Y) = Pr(X | Y) ∙ Pr(Y) = Pr(Y | X) ∙ Pr(X)
Applied to the observation O and hidden state HS:
  Pr(O, HS | C) = Pr(HS | O, C) ∙ Pr(O | C) = Pr(O | HS, C) ∙ Pr(HS | C)
Hence:
  Pr(HS | O, C) = Pr(O | HS, C) ∙ Pr(HS | C) / Pr(O | C)
                = Pr(O | HS, C) ∙ Pr(HS | C) / Σ_HS Pr(HS, O | C)
We need to specify the prior knowledge (Pr(θ) = Pr(HS | C)). The prior:
- expresses our uncertainty
- conforms to the nature of the parameter, i.e. θ is continuous but bounded between 0 and 1

A convenient choice is the Beta distribution:
  P(θ) = Beta(a, b) = Γ(a + b) / ( Γ(a) ∙ Γ(b) ) ∙ θ^(a−1) ∙ (1 − θ)^(b−1)

[Figure: Beta densities over θ ∈ [0, 1] for Beta(0.5, 0.5), Beta(1, 1), Beta(5, 5), Beta(5, 1), Beta(5, 20) and Beta(50, 20).]

Combining a Beta prior with the binomial likelihood gives a posterior distribution:
  p(θ | total, successes) ∝ p(successes | θ, total) ∙ p(θ)
                          ∝ θ^(successes + a − 1) ∙ (1 − θ)^(total − successes + b − 1)
                          = Beta(successes + a, total − successes + b)
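The conjugate update above is just parameter arithmetic, as this small sketch shows (beta_pdf and posterior_params are illustrative names):

```python
from math import gamma

def beta_pdf(theta, a, b):
    # Beta(a, b) density: Gamma(a+b) / (Gamma(a) Gamma(b)) * theta^(a-1) * (1-theta)^(b-1)
    return gamma(a + b) / (gamma(a) * gamma(b)) * theta ** (a - 1) * (1 - theta) ** (b - 1)

def posterior_params(a, b, successes, total):
    # conjugacy: Beta(a, b) prior x binomial likelihood -> Beta posterior
    return successes + a, total - successes + b

a_post, b_post = posterior_params(1, 1, 4, 10)   # flat Beta(1,1) prior, 4 successes in 10 trials
post_mean = a_post / (a_post + b_post)           # posterior mean (4 + 1) / (10 + 2)
```

Note that Beta(1, 1) is the uniform density, so with a flat prior the posterior is driven entirely by the counts.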
Prior knowledge and error estimation:

             Paths_1  Paths_2  Paths_3  …
  (A → Q)?   1        0        1        …

  Pr(A → Q) = Σ_i I_{A→Q}(Paths_i) / (number of samples)

Error estimation: the indicators are Bernoulli distributed,
  Pr[ I_{A→Q}(P1), I_{A→Q}(P2), I_{A→Q}(P3), … | Pr(A → Q) ]
With a uniform prior Beta(1, 1), the posterior on the probability is:
  Pr(A → Q) ~ Beta( Σ_i I_{A→Q}(Pi) + 1, Σ_i (1 − I_{A→Q}(Pi)) + 1 )
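A minimal sketch of this error estimate from indicator samples (the function name beta_from_indicators is illustrative):

```python
def beta_from_indicators(indicators):
    """Beta posterior on Pr(A -> Q) from Bernoulli indicator samples, Beta(1,1) prior."""
    s = sum(indicators)
    return s + 1, len(indicators) - s + 1

# five indicator samples: I(P1)=0, I(P2)=1, I(P3)=0, I(P4)=0, I(P5)=1
a, b = beta_from_indicators([0, 1, 0, 0, 1])
point = 2 / 5                 # raw estimate sum(I) / N
post_mean = a / (a + b)       # posterior mean 3/7, shrunk slightly towards 1/2 by the prior
```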
Example with Pr(Sen → Rec) = 0.4, studying events with
  I_{A→B}(P1) = 0; I_{A→B}(P2) = 1; I_{A→B}(P3) = 0; I_{A→B}(P4) = 0; I_{A→B}(P5) = 1
  Pr(A → B) = Σ_j I_{A→B}(Pj) / 5 = 2/5 = 0.4

Repeat over several networks, each predicting Pr(X → Y) = 0.4, recording whether X actually chose Y, I_{X→Y}(Network_k) ∈ {0, 1}. With 2 of the 5 networks having X → Y:
  Pr(X → Y) ~ Beta(2 + 1, 3 + 1)
[Figure: X% confidence intervals, sampled vs. empirical.]
Anonymous web browsing and peer-to-peer: anonymising streams of messages. Example: Tor.
As for mix networks, Alice chooses a (short) path, which relays a bi-directional stream of traffic, in cells, to Bob.
[Figure: Alice → Onion Router → Onion Router → Onion Router → Bob, bi-directional.]
- Set up the route once per connection; use it for many cells – saves on public-key operations
- No time for delaying: usable web latency is a 1–2 second round trip
- Short routes – Tor default is 3 hops
- No batching (no threshold, ...)
Passive attacks!
The adversary observes all inputs and outputs of an onion router. Objective: link the incoming and outgoing connections (to trace from Alice to Bob). Key: the timings of packets are correlated.
Two techniques:
- Correlation
- Template matching
[Figure: cells counted per time interval on either side of an onion router, e.g. IN_i = (1, 3, 2, 1, 2) and OUT_i = (1, 2, 3, 0, 3) from T = 0.]
Quantise the input and output load in time, then compute:
  Corr = ∑i INi∙OUTi
Downside: precision is lost by quantising.
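The correlation score is a single dot product over the quantised intervals. A sketch with hypothetical cell counts (the matched output roughly follows the input; the other stream is unrelated):

```python
def correlate(in_counts, out_counts):
    # Corr = sum_i IN_i * OUT_i over quantised time intervals
    return sum(i * o for i, o in zip(in_counts, out_counts))

in_a  = [1, 3, 2, 1, 2]   # cells per interval on the input side
out_1 = [1, 2, 3, 0, 3]   # candidate output that follows in_a (with delay smearing)
out_2 = [3, 0, 1, 3, 0]   # unrelated candidate output
match = correlate(in_a, out_1) > correlate(in_a, out_2)
```

The attacker pairs the input with whichever output maximises the score.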
[Figure: an input stream enters the onion router; the output stream is compared against a template with values v_i.]
- Use the input and the delay curve to make a template: a prediction of what the output will be
- Assign to each output cell the template value v_i for its output time
- Multiply them together to get a score: ∏i vi
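Template matching scoring can be sketched as follows; the template values and output times here are made up for illustration (a real template would come from convolving the input stream with a delay curve):

```python
def template_score(template, out_times):
    # score = prod_i v_i, where v_i is the template value at each cell's output time
    score = 1.0
    for t in out_times:
        score *= template[t]
    return score

# hypothetical template: high where output cells are expected, low elsewhere
template = [0.9, 0.7, 0.1, 0.05, 0.8]
likely   = template_score(template, [0, 1, 4])   # cells emitted at expected times
unlikely = template_score(template, [2, 3, 3])   # cells emitted at unexpected times
```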
Cannot withstand a global passive adversary (tracing attacks are too expensive to foil).
Partial adversary:
- Can see some of the network
- Can control some of the nodes
Secure if the adversary cannot see the first and last node of the connection. If c is the fraction of corrupt servers:
  Compromise probability = c²
No point making routes too long.
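A quick simulation makes the c² figure concrete. This is a sketch under simplifying assumptions (the corrupt set is a fixed fraction of routers, circuits are 3 distinct uniformly chosen routers); with sampling without replacement the exact value is c ∙ (cN − 1)/(N − 1), close to c² for large N:

```python
import random

def compromise_rate(c, n_routers, n_trials, rng):
    """Fraction of circuits whose first AND last hop are both corrupt."""
    corrupt = set(range(int(c * n_routers)))       # fraction c of routers corrupt
    hits = 0
    for _ in range(n_trials):
        path = rng.sample(range(n_routers), 3)     # 3-hop circuit, distinct routers
        if path[0] in corrupt and path[-1] in corrupt:
            hits += 1
    return hits / n_trials

rate = compromise_rate(0.2, 100, 50000, random.Random(2))   # close to 0.2**2 = 0.04
```

Longer routes do not change this: only the endpoints matter.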
Forward secrecy: in mix networks Alice uses long-term keys:
  A → M2: {M4, {M1, {B, Msg}M1}M4}M2
In onion routing a bi-directional channel is available, so Alice can perform authenticated Diffie–Hellman to extend the anonymous channel. OR therefore provides better security against compulsion.
[Figure: telescoping circuit setup – Alice, OR1, OR2, OR3, Bob.]
- Authenticated DH, Alice – OR1 → K1
- Authenticated DH, Alice – OR2, encrypted with K1 → K2
- Authenticated DH, Alice – OR3, encrypted with K1, K2 → K3
- TCP connection with Bob, encrypted with K1, K2, K3
Encryption of the input and output streams under different keys provides bitwise unlinkability, as for mix networks.
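The layered encryption that results from the telescoping keys can be sketched with a toy XOR keystream cipher. This is purely illustrative – the keystream here is SHA-256-derived and the keys are hard-coded, whereas real onion routers use proper symmetric encryption (Tor uses AES in counter mode) with DH-negotiated keys:

```python
import hashlib

def keystream_xor(key, data):
    # toy stream cipher: XOR with a SHA-256-derived keystream (illustration only)
    out, counter = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

k1, k2, k3 = b"K1", b"K2", b"K3"   # keys Alice shares with OR1, OR2, OR3
cell = b"GET / HTTP/1.1"
onion = keystream_xor(k1, keystream_xor(k2, keystream_xor(k3, cell)))  # Alice: K3, K2, K1
hop1 = keystream_xor(k1, onion)    # OR1 strips its layer
hop2 = keystream_xor(k2, hop1)     # OR2 strips its layer
plain = keystream_xor(k3, hop2)    # OR3 recovers the cell for Bob
```

Each hop sees different bytes on the wire, which is exactly the bitwise unlinkability property.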
Is it really necessary?
Authenticated Diffie–Hellman is one-sided: Alice remains anonymous, but she needs to know the signature keys of the onion routers. Scalability issue – 1000 routers × 2048-bit keys.
Exercise – show that if Alice knows only a small subset of all onion routers, the paths she creates using them are not anonymous. Assume the adversary knows Alice's subset of nodes. Hint: consider collusion between a corrupt middle and last node – then a corrupt last node only.
Real problem: we need to ensure all clients know the full, most up-to-date list of routers.
Anonymous routing immune to tracing, at reasonable latency? Yes, we can!
Tracing is possible because of input–output correlations:
- Strategy 1: fixed sending of cells (e.g. 1 every 20–30 ms)
- Strategy 2: fix any sending schedule independently of the input streams
Mixes and OR are heavy on cryptography. Lighter threat model:
- No network adversary
- A small fraction of corrupt nodes
Anonymity of web access. Crowds: a group of nodes cooperates to provide anonymous web-browsing.
[Figure: Alice's request travels through the crowd of jondos; at each hop it is sent out to Bob (the website) with probability p, or relayed to another crowd member with probability 1 − p. Example: p = 1/4.]
The final website (Bob) or a corrupt node does not know who the initiator is: it could be the node that passed on the request, or one before.
How long do we expect paths to be? The path length follows a geometric distribution with mean L = 1/p (example: p = 1/4 gives L = 4), which determines the latency of request / reply.
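The geometric path length is easy to check by simulation; this sketch (crowds_path_length is an illustrative name) counts crowd hops until a relay submits the request:

```python
import random

def crowds_path_length(p, rng):
    """Hops inside the crowd before the request exits: geometric, mean 1/p."""
    length = 1                      # the initiator always forwards to one jondo first
    while rng.random() > p:         # with probability 1 - p the jondo relays onwards
        length += 1
    return length

rng = random.Random(3)
p = 0.25
mean_len = sum(crowds_path_length(p, rng) for _ in range(20000)) / 20000   # ~ 1/p = 4
```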
Consider the case of a corrupt insider: a fraction c of nodes are in fact corrupt. When they see a request they have to decide whether the predecessor is the initiator, or merely a relay. Note: corrupt insiders will never pass the request on to an honest node again!
[Figure: a corrupt node in the crowd receives a relayed request and asks: what is the probability my predecessor is the initiator?]
[Figure: probability tree – at each hop the next node is corrupt with probability c; an honest node sends the request out with probability p, or relays it with probability 1 − p.]
Probability that the predecessor of the first corrupt node on the path is the initiator:
  pI = c / ( c ∙ ∑i=1..inf [(1−p)(1−c)]^(i−1) ) = 1 − (1−p)(1−c)
pI grows as (1) c grows and (2) p grows.
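The closed form can be cross-checked against a direct simulation of the forwarding process; the function names here are illustrative:

```python
import random

def predecessor_prob(p, c):
    # closed form from the derivation above: p_I = 1 - (1 - p)(1 - c)
    return 1 - (1 - p) * (1 - c)

def simulate_predecessor(p, c, n_trials, rng):
    """Pr[first corrupt node on the path sees the initiator as its predecessor]."""
    first_hop, seen = 0, 0
    for _ in range(n_trials):
        position = 1
        while rng.random() > c:        # current recipient is honest
            if rng.random() < p:       # honest node submits to the web server
                position = None        # no corrupt node ever saw the request
                break
            position += 1              # honest node relays onwards
        if position is not None:       # a corrupt node at this position saw it
            seen += 1
            if position == 1:          # its predecessor was the initiator
                first_hop += 1
    return first_hop / seen

p, c = 0.25, 0.2
est = simulate_predecessor(p, c, 100000, random.Random(4))   # vs 1 - 0.75*0.8 = 0.4
```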
Exercise: what is the information-theoretic amount of anonymity of Crowds in this context?
What about repeated requests? Alice always visits Bob, e.g. a repeated SMTP connection to microsoft.com. The adversary can observe n times the tuple (Alice, Bob).
Probability Alice is identified as the initiator (at least once):
  P = 1 − [(1−p)(1−c)]^n
The probability of compromise reaches 1 very fast!
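Plugging in the example parameters shows how quickly the formula approaches 1 (the function name is illustrative):

```python
def p_identified(p, c, n):
    # P = 1 - [(1 - p)(1 - c)]^n : initiator exposed at least once over n observations
    return 1 - ((1 - p) * (1 - c)) ** n

# with p = 0.25, c = 0.2: one observation gives 0.4, ten give ~0.994
probs = [p_identified(0.25, 0.2, n) for n in (1, 5, 10, 20)]
```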
- Fast routing = no mixing = traffic analysis attacks
- Weaker threat models: onion routing – a partial observer; Crowds – insiders and remote sites
- Repeated patterns: onion routing – streams vs. time; Crowds – initiator–request tuples
- PKI overheads are a barrier to p2p anonymity
Core:
- Tor: The Second-Generation Onion Router by Roger Dingledine, Nick Mathewson, and Paul Syverson. In the Proceedings of the 13th USENIX Security Symposium, August 2004.
- Crowds: Anonymity for Web Transactions by Michael Reiter and Aviel Rubin. In ACM Transactions on Information and System Security 1(1), June 1998.
More:
- An Introduction to Traffic Analysis by George Danezis and Richard Clayton. http://homes.esat.kuleuven.be/~gdanezis/TAIntro-book.pdf
- The anonymity bibliography: http://www.freehaven.net/anonbib/