Samsara: Honor Among Thieves in Peer-to-Peer Storage

Introduction
Peer-to-Peer Paradigm
A node stores some data in remote
nodes
Agree to do the same in return
Replication → fault tolerance
Decentralized
Self-administering
Scalable
Problem with the P2P Model
The tragedy of the commons
Consume without contributing
Some existing solutions
Third parties → centralized administration
Currency → trusted infrastructure
Certified evidence of storage consumption → centralized authority
Observation
Problem is simplified if we have
Symmetric exchange of resources
Guarantees consumption <= contribution
However, symmetric relationships are
rare
Replica A needs 1 GB; replica B needs 1 MB
Samsara
An infrastructure to enforce fairness in
peer-to-peer systems
No trusted third parties
No monetary models
No certified identities
Another Observation
Symmetric storage relationships can
be manufactured
A claim-based system
Based on incompressible storage claims
Storage Claims
A node periodically checks its peer
Make sure that the peer is adhering to
the contract
If the peer breaches the contract
The node is free to drop the peer’s data
Each node can now act selfishly
Collectively, all nodes need to play
fair
Some Questions
How to reduce the storage overhead
How to punish cheaters
How to tell failures from cheating
Background
Pastiche
Peer-to-peer, cooperative backup
system
Unsolved problem: unchecked storage
consumption
[Figure: layered stack, as drawn from top to bottom: Pastiche, Samsara, Pastry, OS and disk]
Design
Goal: Ensure that nodes consume no
more resources than they contribute
Manufacture symmetric storage-exchange relationships
Through storage claims
A claim can be passed along to form a
dependency chain
A claim can be removed if it forms a
circular dependency
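The circular-dependency rule above can be illustrated with a small sketch. It assumes a hypothetical `stores_on` map in which each node stores its data on exactly one other node; the map, the function name, and the single-successor simplification are illustrative assumptions, not details from the slides.

```python
def claim_is_removable(stores_on, start):
    """Return True if following the storage chain from `start` leads back to `start`.

    `stores_on` is a hypothetical map: node -> the node that holds its data.
    When the chain closes into a cycle through `start`, every node in the
    cycle already stores real data for another node, so (per the slide)
    the placeholder claim is no longer needed.
    """
    seen = set()
    node = stores_on.get(start)
    while node is not None and node not in seen:
        if node == start:
            return True
        seen.add(node)
        node = stores_on.get(node)
    return False


# A stores on B, B stores on C, C stores on A: the chain is circular.
cycle = {"A": "B", "B": "C", "C": "A"}
# A stores on B, B stores on C, C stores on nobody: a claim is still needed.
chain = {"A": "B", "B": "C"}
print(claim_is_removable(cycle, "A"), claim_is_removable(chain, "A"))
```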
Design
Punish cheaters by deleting their data
probabilistically
After a short outage, a node can recover from surviving copies
Cheaters will eventually lose data
Claim Construction
Requires three values
A secret pass phrase
A private, symmetric key
A location in the storage space
Claim Construction
Storage space initially filled with hash
values
SHA( Pass phrase , 0)
SHA( Pass phrase , 1)
SHA( Pass phrase , 2)
…
[Diagram label: Key]
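A minimal sketch of the hash-filling step, assuming SHA-1 as the concrete hash and an 8-byte big-endian encoding of the index (the slides say only "SHA" and do not specify how the pass phrase and index are combined). The function name `claim_hashes` is hypothetical, and the step involving the key is left out because the slides do not spell it out.

```python
import hashlib

DIGEST_SIZE = 20  # SHA-1 digest length in bytes (SHA-1 itself is an assumption)


def claim_hashes(pass_phrase: bytes, start: int, count: int) -> bytes:
    """Concatenate SHA(pass_phrase, i) for i = start .. start + count - 1.

    `start` plays the role of the "location in the storage space": any
    region of the claim can be regenerated from the pass phrase alone,
    so the owner never has to keep a copy of the claim.
    """
    out = bytearray()
    for i in range(start, start + count):
        # Assumed encoding: pass phrase followed by the index as 8 bytes.
        out += hashlib.sha1(pass_phrase + i.to_bytes(8, "big")).digest()
    return bytes(out)


region = claim_hashes(b"secret pass phrase", 0, 3)
print(len(region))  # 3 digests of 20 bytes each
```

Because the values look random to anyone without the pass phrase, the holder cannot compress them, while the owner can recompute any part on demand.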
Querying Nodes
Queries
Monitor remote storage
Once every few hours
Need not be answered immediately
Querying Nodes
Query sends h_0 to verify data_1 .. data_n
Remote site computes
h_1 = SHA(data_1, h_0)
h_2 = SHA(data_2, h_1)
…
h_n = SHA(data_n, h_{n-1})
Remote site returns h_n
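The chained computation above can be sketched as follows. SHA-1 and the concatenation order (block bytes followed by the previous hash) are assumptions; the slides give only the form SHA(data, h). The function name `answer_query` is hypothetical.

```python
import hashlib


def answer_query(h0: bytes, blocks: list[bytes]) -> bytes:
    """Fold the challenge h0 through every block and return the final hash.

    Each step hashes the current block together with the previous hash,
    so the final value depends on h0 and on every block in order.
    """
    h = h0
    for data in blocks:
        h = hashlib.sha1(data + h).digest()
    return h


blocks = [b"block-1", b"block-2", b"block-3"]
challenge = b"fresh random value"
expected = answer_query(challenge, blocks)            # computed by the querying node
reply = answer_query(challenge, blocks)               # computed by an honest holder
tampered = answer_query(challenge, [b"block-1", b"gone", b"block-3"])
print(reply == expected, tampered == expected)
```

Since h_0 is chosen fresh for each query, a holder cannot precompute the answer and then discard the data.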
Transient Failure
Difficult to discern cheating from
transient failures
One solution:
Grace periods before deletion
Problem: revolving credits…
Samsara Solution
Replication + independent
probabilistic deletion
Deletion rate is an exponentially growing function of the number of failed queries
A cheater (> 32GB) cannot replicate fast
enough to get a free ride
Need to replicate 10 times in 3 days
Samsara Solution
A node should only lose all of its data
if it fails queries for an entire grace
period
Most outages are within 3 days
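A minimal sketch of independent probabilistic deletion. The slides state only that the deletion rate grows exponentially with the number of failed queries; the base probability, the growth factor, and both function names below are illustrative assumptions, not values from Samsara.

```python
import random


def deletion_probability(failed_queries, base=0.001, growth=2.0):
    """Per-block deletion probability after `failed_queries` consecutive failures.

    `base` and `growth` are hypothetical parameters chosen for illustration.
    The probability is zero with no failures and is capped at 1.0.
    """
    if failed_queries <= 0:
        return 0.0
    return min(1.0, base * growth ** (failed_queries - 1))


def surviving_blocks(blocks, failed_queries, rng):
    """Independently discard each block with the current deletion probability."""
    p = deletion_probability(failed_queries)
    return [b for b in blocks if rng.random() >= p]


rng = random.Random(42)  # seeded only to make the sketch repeatable
for failures in (1, 5, 10, 20):
    kept = surviving_blocks(list(range(1000)), failures, rng)
    print(failures, deletion_probability(failures), len(kept))
```

Because each replica holder discards blocks independently, a briefly unreachable node is likely to find each block still intact on some replica, while a node that keeps failing queries eventually loses everything.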
Probabilistic Discard Example
[Figure: replicas of a six-block data set after independent probabilistic discard; discarded blocks are marked with X. Failed queries = 2]
Overhead Reduction
Storage claims can be forwarded
However, if something goes wrong
The forwarding replica is responsible
Increase the incentive for not forwarding
Diffie-Hellman Key Exchange
Need a prime number p
Need a base integer g between 1 and p – 1
Site A picks x between 1 and p – 2
Site B picks y between 1 and p – 2
Example: p = 13, g = 7, A picks x = 3, B picks y = 5
Diffie-Hellman Key Exchange
Site A computes g^x mod p
A: 7^3 mod 13 = 5
Site B computes g^y mod p
B: 7^5 mod 13 = 11
Site A and B exchange public values
A: 3, 11 (from B)
B: 5, 5 (from A)
Diffie-Hellman Key Exchange
Site A computes (g^y mod p)^x mod p
A: 3, 11 (from B)
A: 11^3 mod 13 = 5
Site B computes (g^x mod p)^y mod p
B: 5, 5 (from A)
B: 5^5 mod 13 = 5
Now A and B have a shared secret
Problem: Prone to man-in-the-middle attacks
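The worked example on these slides can be reproduced with Python's built-in three-argument `pow`, which performs modular exponentiation. The helper names are hypothetical, and the tiny parameters come from the slides; they are for illustration only and offer no security.

```python
def dh_public(g: int, secret: int, p: int) -> int:
    """Public value g^secret mod p, sent to the other site."""
    return pow(g, secret, p)


def dh_shared(other_public: int, secret: int, p: int) -> int:
    """Shared secret (other_public)^secret mod p, computed locally."""
    return pow(other_public, secret, p)


p, g = 13, 7   # prime and base from the slides
x, y = 3, 5    # private values picked by site A and site B

a_public = dh_public(g, x, p)   # 7^3 mod 13 = 5
b_public = dh_public(g, y, p)   # 7^5 mod 13 = 11

a_secret = dh_shared(b_public, x, p)   # 11^3 mod 13 = 5
b_secret = dh_shared(a_public, y, p)   # 5^5 mod 13 = 5

print(a_public, b_public, a_secret, b_secret)  # prints: 5 11 5 5
```

Nothing in the exchange authenticates either side, which is why the slide flags man-in-the-middle attacks: an interceptor can run one exchange with each site.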
Forwarding and Reliability
Longer forwarding chain → lower reliability
Cyclic chains are okay, because the
accountability is wrapped around
Unfortunately, cycles are rarely found
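To see why longer chains hurt, consider a deliberately simplified model, assumed here for illustration and not taken from the slides: every node in a forwarding chain fails independently with the same probability, and the chain holds only if all of its nodes survive. The function name is hypothetical.

```python
def chain_survival(failure_rate: float, chain_length: int) -> float:
    """Probability that every node in a chain of `chain_length` nodes survives.

    Assumes independent node failures with identical probability
    `failure_rate`; real failures may be correlated.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if chain_length < 1:
        raise ValueError("chain_length must be at least 1")
    return (1.0 - failure_rate) ** chain_length


for length in (1, 2, 4, 8):
    print(length, chain_survival(0.08, length))
```

Under this model the survival probability shrinks geometrically with chain length, which is one reason to limit how far a claim may be forwarded.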
Limitations
Cannot handle malicious nodes
Cannot force nodes to store data for
others
Cannot create place holders for
bandwidth and processing power
Implementation
Written in C
Three layers
Messaging layer
Replica manager
Storage layer
– A single flat file
– Linked list of free space
File copy benchmark
[Figure: bar chart of elapsed time in seconds (axis 0 to 10) for NFS and Samsarad, broken down into store data, fetch claims, and store claims]
13MB file copied between two nodes
Query benchmark
[Figure: bar chart of elapsed time in seconds (axis 0 to 6) for Query File and Query Claims, broken down into answer query and verify]
2 hours to verify 32GB claims @550MHz
Reliability simulations
Examine chain length and reliability
What percentage of files lost?
Simulate the absolute worst case
Limit chain length
Transfer as much as possible within the limit
All failures occur
Permanently
Simultaneously
Before new replicas can be created
Reliability results
[Figure: percent lost objects (axis 0% to 5%) versus failure rate (1%, 2%, 4%, 8%, 16%), with one series per chain-length limit: 1, 2, 4, 8, unlimited]