Defending Against Sybil Attacks Paul Parker Advisor: Shouhuai Xu

Talk Outline





Intro and Motivation
Problem Definition
Existing Work
Intended Approach
Results So Far
P2P and Other Self-Organizing Networks
Backup
File Sharing
Distributed Computation
 Organic GRID
Distributed File Systems
 Farsite
 GFS
Sybil Attack
Why Use Sybil Attack?


disruption
 RIAA [drop?]
for-profit motives:
 disproportionate access to resources (computation, storage)
 control network
Problem Definition
Detect creation of multiple node
identities from a single physical
node without a central certifying
authority
Existing Work:
Is Preventing Sybil Attacks Possible?
John Douceur, Microsoft Research
“The Sybil Attack”, IPTPS '02 (First International Workshop on Peer-to-Peer Systems; revised paper 2002)
named and introduced problem
strong negative theoretical results for networks without a centralized authority
Douceur’s Assumptions


set of entities (i.e., nodes)
synchronous broadcast cloud
 message recv’d by all entities w/i bounded time
 no direct links between entities (“form of centrally supplied authentication”)
message
 finite length bit string
identity – abstraction that persists across multiple communication events
Assumptions meant to be extremely general
Douceur’s Model
Entity behavior:
 correct entities will present 1 legitimate identity

 faulty entities will present 1 legitimate identity and ≥ 1 counterfeit identity
How could we possibly verify identities?
 Assume attacker has limited resources
 Distinguish identities via resource-consumption challenge:
  CPU
  storage
  network bandwidth
 Example:
  simultaneously issue puzzle to all claimed identities that takes 1 second for 1 GHz computer to solve
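The puzzle in the example above can be sketched as a hash-based challenge. The slides do not say what kind of puzzle is meant, so the SHA-256 construction and the names `make_challenge`, `solve`, `verify`, and `difficulty_bits` are all assumptions made for illustration. A real deployment would have to calibrate `difficulty_bits` so that solving takes about 1 second on the minimal machine; no such calibration is attempted here.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def make_challenge() -> bytes:
    """Fresh random challenge, so answers cannot be precomputed (assumed design)."""
    return os.urandom(16)


def solve(challenge: bytes, difficulty_bits: int) -> bytes:
    """Brute-force a suffix whose hash has enough leading zero bits."""
    counter = 0
    while True:
        candidate = counter.to_bytes(8, "big")
        digest = hashlib.sha256(challenge + candidate).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return candidate
        counter += 1


def verify(challenge: bytes, solution: bytes, difficulty_bits: int) -> bool:
    """Checking costs one hash, far less than solving."""
    digest = hashlib.sha256(challenge + solution).digest()
    return leading_zero_bits(digest) >= difficulty_bits


# Hypothetical usage: a verifier issues a distinct challenge to every
# claimed identity at the same moment and accepts only timely answers.
challenge = make_challenge()
answer = solve(challenge, difficulty_bits=8)
accepted = verify(challenge, answer, difficulty_bits=8)
```

Issuing the challenges simultaneously matters: an attacker with one minimal machine can answer only one of them within the deadline.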
Douceur’s Lemmas
Direct validation:
1. Any faulty entity f can present as many distinct identities as the ratio of its power to minimal power
 e.g., 3 GHz CPU could present 3 identities at 1 GHz minimum
2. If an entity l accepts identities that are not validated simultaneously, a single f can present arbitrarily many distinct identities to l
 e.g., 1 GHz computer could present 3 identities over 3 seconds
Indirect validation:
3. If an entity l accepts identities vouched for by q accepted identities, then a set of faulty entities F can present arbitrarily many identities to l if |F| > q or F collectively has the resources of at least q + |F| minimally capable entities
4. Without simultaneous challenges, even a minimally capable entity f can present |C|/q distinct identities to l (C = set of correct entities)
Possibly not actual proofs, but very closely reasoned
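The arithmetic behind the two direct-validation examples can be written out as a small sketch. The function names and the idea of counting challenge "rounds" are illustrative assumptions, not notation from Douceur's paper.

```python
def simultaneous_identities(attacker_power: float, minimal_power: float) -> int:
    """Lemma 1 example: identities one entity can sustain when all
    challenges are issued at the same time."""
    return int(attacker_power // minimal_power)


def sequential_identities(attacker_power: float, minimal_power: float,
                          rounds: int) -> int:
    """Lemma 2 example: if challenges are spread over several rounds,
    the same hardware answers a fresh batch in every round."""
    return simultaneous_identities(attacker_power, minimal_power) * rounds


# 3 GHz CPU against a 1 GHz minimum, challenged simultaneously
three_ghz = simultaneous_identities(3.0, 1.0)
# 1 GHz CPU, one challenge per second over 3 seconds
one_ghz_over_time = sequential_identities(1.0, 1.0, rounds=3)
```

Because `rounds` is unbounded when validation is not simultaneous, the count in the second case grows without limit, which is the point of lemma 2.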
Douceur’s Conclusion


“attacks always possible except under extreme and unrealistic assumptions of resource parity and coordination among entities”
i.e., to prevent attacks must assume:
 all entities have nearly identical capabilities
 all presented identities are simultaneously checked by all entities across the entire system
therefore in heterogeneous real systems such as Internet, Sybil attacks always possible
Existing Work:
New Ideas
On the Establishment of Distinct Identities in Overlay Networks, Bazzi & Konjevod, PODC 2005
establishing pairwise distinctness often helpful
 distinctness test yields true or unknown
Douceur abstracted out potentially helpful details
 real networks physically embedded in geometric spaces
BK2005 Assumptions

actual distance between 2 entities approx. satisfies metric properties
 symmetry (bc=cb)
 definiteness (ab exists)
 triangle inequality (ab+bc≥ac)
Round-Trip Time (sending message between 2 entities and back) no faster than function of the actual distance
[Figure: triangle with vertices a, b, c and labeled side lengths]
BK2005 Example:
Using Latency to Distinguish Nodes
[Figure: trusted nodes A and B and unidentified nodes C and D (each marked “?”), with RTT labels 100 ms, 30 ms, and 30 ms]
A and B sign certificates for C and D
Practical technique
Assumptions:
 triangle inequality holds (c ≤ a + b)
 occasional network quiescence
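A minimal sketch of the latency test, assuming the 100 ms RTT is between the trusted nodes A and B and the two 30 ms RTTs are A to C and B to D. The figure itself is lost, so that assignment is an assumption, as are the function name and the `margin_ms` parameter. If C and D were one node X, the triangle inequality would require RTT(A,B) ≤ RTT(A,X) + RTT(X,B), so a sum below RTT(A,B) certifies two distinct nodes.

```python
from typing import Optional


def certifiably_distinct(rtt_ab_ms: float, rtt_a_to_c_ms: float,
                         rtt_b_to_d_ms: float,
                         margin_ms: float = 0.0) -> Optional[bool]:
    """Return True when C and D cannot be the same node, else None.

    The test is one-sided, matching a distinctness test that yields
    "true or unknown": it never claims two identities are the same.
    margin_ms is a hypothetical allowance for measurement error.
    """
    if rtt_a_to_c_ms + rtt_b_to_d_ms + margin_ms < rtt_ab_ms:
        return True
    return None


# Values from the slide, under the assumed assignment of labels
result = certifiably_distinct(100.0, 30.0, 30.0)
```

Under the assumption that a reply can be delayed but never arrive faster than distance allows, a single node posing as both C and D can only push the measured sum up, which leaves the result at unknown rather than producing a false certificate.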
More BK2005 Assumptions

Euclidean or Spherical Geometry can model RTT distances:
 i.e., nodes can be embedded into Euclidean space R^d or spherical space S^d with little or no error on RTT distance
 Hence have metric properties
 Note similar to assuming efficient routing
Limited number of corrupt beacons
Asynchronous unreliable network
 over long periods of time, occasional quiescence will allow synchrony and reliability
 these allow computing distance between beacons
Broadcast or point-to-point message models
BK2005 Theorems
Can certify distinctness in presence of:
 trusted beacons:




corrupt applicant (in convex hull in R^d, or anywhere in S^d)
multiple colluding entities for broadcast
up to d colluding entities for point-to-point (d=dimensionality of space)
up to f corrupt beacons

at least f+d+1 correct ones, one corrupt applicant
or multiple colluding corrupt applicants
BK2005 Conclusions



can prevent Sybil attacks via geometric distinctness certification (given assumptions)
nice theoretical results
translation to real world requires significant investigation
 “a lot more work” to make this of “more practical value”
 generalization of first example “has a good chance of leading to solutions that can be used in practice”
Existing Work:
Another Idea

Remote physical device fingerprinting
 Kohno, Broido, and claffy, UCSD, IEEE S&P 2005 (“Oakland”)
Computers have clocks
 quartz crystal
 resonant frequency function of size
 frequency varies slightly between typical crystals
 First derivative of clock offset is skew (“fast” or “slow”ness of clock)
 Time reported by OS varies with hardware skew and OS factors
Thus, particular skew distinguishes computer
Kohno et al Details

TCP spec includes TCP Timestamping option
 TCP stack inserts a timestamp when sending packet
 Clock skew can be estimated by observing these over time
 Thus, fingerprint remote physical device by observing TCP streams
 5-6 bits of entropy (“distinctness”)
 TSOpt field can be disabled or scrubbed
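A rough sketch of estimating skew from observed timestamps. The slides do not say how the estimate is computed; ordinary least squares is used here as a simple stand-in. The function name, the `(local_time, tsval)` sample format, and the `hz` tick-rate parameter are assumptions for illustration.

```python
def estimate_skew_ppm(samples, hz):
    """Estimate a remote clock's skew in parts per million.

    samples: list of (local_receive_time_seconds, remote_tsval_ticks)
    hz: assumed tick rate of the remote timestamp clock
    Needs at least two samples taken at different local times.
    """
    t0, ts0 = samples[0]
    # Elapsed time according to the observer
    xs = [t - t0 for t, _ in samples]
    # Offset: elapsed time per the remote clock minus elapsed local time
    ys = [(ts - ts0) / hz - x for (_, ts), x in zip(samples, xs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Slope of offset over time is the skew
    return (sxy / sxx) * 1e6


# Synthetic stream: a 1000 Hz remote clock running 50 ppm fast,
# sampled every 10 seconds for one hour.
synthetic = [(float(t), int(t * (1 + 50e-6) * 1000)) for t in range(0, 3601, 10)]
skew = estimate_skew_ppm(synthetic, hz=1000)
```

With only a handful of samples the tick quantization dominates the slope, so short streams give unreliable estimates.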
Our Intended Approach
Provisional:
 Do BK experiments
 Combine multiple approaches (intelligently)
 (some of mine are new proposals)
  BK2005 approach
  Kohno et al approach
  neighborhood
  memory latency computational puzzles
  OS fingerprinting
PlanetLab
(to use for BK testbed)



Worldwide research overlay network
More than 600 nodes at 300 sites
www.planet-lab.org
Planned BK Experiments
(so far)
Data-based experiments:
 Test triangle inequality (to w/i a margin)
 Test technique applicability
Actual experiments:
 Testbed for trying technique
Results So Far

Analyzing Triangle Inequality w/ PlanetLab
 Nodes: 481
 Theoretical possible triangles: 110,591,520
 Number with 3 sides: 50,963,180

                               % of theoretical   % of 3-sided
Obeying triangle inequality    8.94               19.50
Almost obeying (within 15%)    23.73              51.49
Approximately obeying          32.67              70.89
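A sketch of how such a count could be made over a table of pairwise RTTs. The theoretical figure equals 481 × 480 × 479 = 110,591,520, which suggests ordered triples of nodes were counted; the code follows that reading as an assumption. The meaning of "within 15%" is also assumed here (the far side may exceed the sum of the other two by up to 15%), and the function and key names are illustrative.

```python
from itertools import permutations


def classify_triangles(rtt, nodes, tolerance=0.15):
    """Count ordered triples (a, b, c) and test ab + bc >= ac.

    rtt maps frozenset({x, y}) to a measured RTT; pairs with no
    measurement are simply absent from the mapping.
    """
    counts = {"theoretical": 0, "three_sided": 0, "obeying": 0, "almost": 0}
    for a, b, c in permutations(nodes, 3):
        counts["theoretical"] += 1
        ab = rtt.get(frozenset((a, b)))
        bc = rtt.get(frozenset((b, c)))
        ac = rtt.get(frozenset((a, c)))
        if ab is None or bc is None or ac is None:
            continue  # at least one side was never measured
        counts["three_sided"] += 1
        if ac <= ab + bc:
            counts["obeying"] += 1
        elif ac <= (1 + tolerance) * (ab + bc):
            counts["almost"] += 1
    return counts


# Tiny hypothetical data set: a 3-4-5 triangle plus an unmeasured node "d"
rtts = {
    frozenset(("a", "b")): 3.0,
    frozenset(("b", "c")): 4.0,
    frozenset(("a", "c")): 5.0,
}
summary = classify_triangles(rtts, ["a", "b", "c", "d"])
```

Adding the "obeying" and "almost" counts gives the analogue of the "approximately obeying" row.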
Results So Far: Timestamping
and OS Fingerprinting
                              Web trace   Abilene III dataset
Streams with Timestamp        31636       14050
w/ Timestamp and p0f          23151       480
w/ Round clock skew           8858        3360
w/ Reasonable skew and p0f    4649        110

p0f – passive OS fingerprinting tool
Issues:
 p0f ran in less accurate SYN+ACK mode for Web trace, because host initiated all connections
 intersection of p0f result and reasonable clock skew low (15% even on Web trace)
 good timestamping data hard to obtain
  most traces truncate TCP header before TSOpt
Implications:


OS fingerprinting probably a secondary technique (also because can be faked)
Timestamping didn’t work well on brief streams (not enough data?)
Directions for Future Work


Can a self-organizing network automatically
defend against Sybil attacks, without
starting from a set of trusted nodes?
Can we provide identities for nodes, or
merely distinguish them? If not, how much
distinguishability can we provide?
Questions?