Consistent Hashing:
Load Balancing
in a Changing World
David Karger, Eric Lehman,
Tom Leighton, Matt Levine,
Daniel Lewin, Rina Panigrahy
MIT
Caches can Load Balance
[Figure: server; items distributed among caches; users get items from caches]

Numerous items in central server.
Requests can swamp server.
Distribute items among caches.
Clients get items from caches.
Server gets only 1 request per item.
Who Caches What?

Each cache should hold few items
» else cache gets swamped by clients

Each item should be in few caches
» else server gets swamped by caches
» and cache invalidations/updates expensive

Browser must know right cache
» fast, local computation
A Solution: Hashing
[Figure: server, caches, and users]
Items assigned to caches by hash function.
Users use hash to compute cache for item.

Example:
y = ax+b (mod n)
Intuition: Assigns items to
“random” caches
» few items per cache
Easy to compute which
cache holds an item
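
A minimal sketch of the linear hash above, assuming integer item ids and illustrative constants a = 31 and b = 7 (neither is given in the slides). Any client that knows a, b, and n can compute an item's cache locally:

```python
from collections import Counter

def cache_for(item_id: int, n_caches: int, a: int = 31, b: int = 7) -> int:
    """Index of the cache for item_id under y = a*x + b (mod n)."""
    return (a * item_id + b) % n_caches

# Count how many of 1000 consecutive item ids land on each of 5 caches.
load = Counter(cache_for(x, n_caches=5) for x in range(1000))
print(sorted(load.items()))
```

For consecutive ids, with a coprime to n, the counts come out equal. Real item names would first have to be mapped to integers, which this sketch leaves out.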
Problem: Adding Caches
Suppose a new cache arrives.
How do we work it into the hash function?
Natural change:

y = ax+b (mod n+1)

Problem: changes bucket for nearly every item
» every cache will be flushed
» servers get swamped with new requests
Goal: when a bucket is added, few items move
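
To see how much moves under the natural change, the sketch below counts the items whose bucket differs between mod n and mod n+1. The constants a = 31, b = 7, the choice n = 10, and the integer item ids are assumptions made for illustration only.

```python
def cache_for(item_id: int, n_caches: int, a: int = 31, b: int = 7) -> int:
    """Bucket for item_id under y = a*x + b (mod n)."""
    return (a * item_id + b) % n_caches

def fraction_moved(n_caches: int, item_ids) -> float:
    """Fraction of items whose bucket changes when one cache is added."""
    item_ids = list(item_ids)
    moved = sum(
        1 for x in item_ids
        if cache_for(x, n_caches) != cache_for(x, n_caches + 1)
    )
    return moved / len(item_ids)

frac = fraction_moved(10, range(10_000))
print(f"{frac:.1%} of items changed cache")
```

In this setup roughly nine items in ten change bucket, since an item stays put only when its hash value leaves the same remainder modulo both 10 and 11.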
Problem: Inconsistent Views
Each client knows about a different set of caches: its view
View affects choice of cache for item

» Same item may hash to many places: caches swamp server with requests for item
» Many items may hash to same place: clients swamp cache

Goal: despite views, items evenly distributed into a few caches each
Solution: Consistent Hashing


Use standard hash function to map caches and items to points in unit interval.
» “random” points spread uniformly
Item assigned to nearest cache in view
[Figure: points on the unit interval marked as item or cache (bucket)]
Computation as easy as with a standard hash function
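
The scheme can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: SHA-256 stands in for the "standard hash function", the "item:" and "cache:" prefixes and all names are hypothetical, and the nearest cache is found by a plain linear scan.

```python
import hashlib
from typing import Sequence

def point(key: str) -> float:
    """Map a key to a pseudo-random point in the unit interval."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def cache_for(item: str, view: Sequence[str]) -> str:
    """Return the cache in this view whose point is nearest the item's point."""
    p = point("item:" + item)
    return min(view, key=lambda cache: abs(point("cache:" + cache) - p))

view = ["cache-a", "cache-b", "cache-c"]
print(cache_for("index.html", view))
```

A client needs only its own view to evaluate `cache_for`. A sorted list of cache points with binary search would avoid the linear scan.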
Properties
All buckets get roughly the same number of items (like standard hashing).
When the kth bucket is added, only a 1/k fraction of items move.
» and only from a few caches
When a cache is added, minimal reshuffling of cached items is required.
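
These properties can be checked empirically with a small sketch. It assumes SHA-256 as the hash, one point per cache, nearest-point assignment on the unit interval, and made-up item and cache names; here a 10th cache is added to 9 existing ones.

```python
import hashlib

def point(key: str) -> float:
    """Map a key to a pseudo-random point in the unit interval."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def nearest(item: str, caches: list) -> str:
    """Cache whose point is nearest the item's point."""
    p = point("item:" + item)
    return min(caches, key=lambda c: abs(point("cache:" + c) - p))

items = [f"item-{i}" for i in range(5000)]
old_caches = [f"cache-{i}" for i in range(9)]
new_cache = "cache-9"  # the k-th bucket, with k = 10

before = {it: nearest(it, old_caches) for it in items}
after = {it: nearest(it, old_caches + [new_cache]) for it in items}

moved = [it for it in items if before[it] != after[it]]
donors = {before[it] for it in moved}  # caches that lost items

print(f"moved {len(moved) / len(items):.1%} of items, from {len(donors)} cache(s)")
```

Every moved item goes to the new cache, and only the new cache's neighbors on the interval give items up. The 1/k figure is an expectation; with a single point per cache, one run can land noticeably above or below it.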
Multiple View Properties

Despite multiple views, each cache gets few items
» no cache overloaded

Despite multiple views, each item is in only a few caches.
» server protected, cache updates easy
System tolerates multiple, inconsistent views of caches (also fault tolerant).
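
One simple case of inconsistent views can be worked through in code. The sketch assumes SHA-256 as the hash, nearest-point assignment on the unit interval, and hypothetical views in which each client is unaware of one different cache; it shows only this special case, not the general bound.

```python
import hashlib

def point(key: str) -> float:
    """Map a key to a pseudo-random point in the unit interval."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def nearest(item: str, view: list) -> str:
    """Cache in this view whose point is nearest the item's point."""
    p = point("item:" + item)
    return min(view, key=lambda c: abs(point("cache:" + c) - p))

all_caches = [f"cache-{i}" for i in range(8)]
# Hypothetical views: each one is missing a different single cache.
views = [[c for c in all_caches if c != missing] for missing in all_caches]

def spread(item: str) -> set:
    """Set of caches the item is assigned to across all views."""
    return {nearest(item, view) for view in views}

items = [f"item-{i}" for i in range(1000)]
worst = max(len(spread(it)) for it in items)
print("most caches holding any one item:", worst)
```

Here every item ends up in exactly two caches: its nearest cache in the views that know it, and its second-nearest in the one view that does not.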
Load Balancing

Task: distribute items into buckets
» Data to memory locations
» Files to disks
» Tasks to processors
» Web pages to caches (our motivation)

Goal: even distribution
Problem: No Synchronization



Each user knows about a different set of caches: a view
View affects assignment of items to caches
Problems when there are multiple views:

[Figure: items assigned to one cache over 4 views (View 1 through View 4)]

The items assigned to a specific cache are different in each view.
These sets could be essentially disjoint for standard hash functions.
Over all views, cache is responsible for too many items.
Cache not large enough to contain active set of items
Multiple Views (cont.)

[Figure: item assigned to different caches in each of 4 views (View 1 through View 4)]

Item may be assigned to different caches in different views.
Standard hash function may assign item to a different cache in every view.
Result: item requested from many caches
» Server swamped with requests for copies of the item.
» Hard to update cached copies
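
A small sketch of this effect with a standard hash function. It assumes the linear hash y = ax+b (mod n) with illustrative constants a = 31 and b = 7, integer item ids, and four hypothetical views that know the first 4, 5, 6, and 7 caches respectively.

```python
def cache_for(item_id: int, view: list, a: int = 31, b: int = 7) -> str:
    """Standard hashing: index into the client's own list of caches."""
    return view[(a * item_id + b) % len(view)]

all_caches = [f"cache-{i}" for i in range(7)]
# Hypothetical views of different sizes, one per client.
views = [all_caches[:n] for n in (4, 5, 6, 7)]

def spread(item_id: int) -> set:
    """Set of caches the item is assigned to across all views."""
    return {cache_for(item_id, view) for view in views}

print(sorted(spread(2)))
# → ['cache-1', 'cache-3', 'cache-4', 'cache-6']
```

Item 2 lands on a different cache in each of the four views, so four caches would each request their own copy from the server.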
Problem: Adding Caches


New cache means new hash function
» natural change: y = ax+b (mod n+1)
Standard hash functions completely redistribute items when the range of the function changes:
» Every cache will be flushed
» Server is swamped with requests since items are reshuffled between caches.
» Need to broadcast the new hash function to all users at the same time
– some kind of global synchronization?...