Mathematics-Inspired Protocols for Distributed Systems

Steven Y. Ko*, Indranil Gupta
Dept. of Computer Science
University of Illinois at Urbana-Champaign
{sko@cs.uiuc.edu; indy@cs.uiuc.edu}
* Currently PhD student at UIUC. In the audience here.
Yookyung Jo**
Dept. of Computer Science
Cornell University
{ykjo@cs.cornell.edu}
** Work done during M.S. at UIUC.
An Open Design Challenge
Phenomena and Results from Biological, Physical, Social Worlds, etc.
Self-adaptive Protocols for Distributed Computing Problems
Lost in Translation
Two popular ways today for this translation:
I. Sit down with social scientists, biologists etc.
– Time consuming, terminology different
– But good to talk!
II. Read textbooks written by them and derive protocols that “somewhat” model the phenomenon
Both above approaches often lead to:
• Hand-wavy design: non-rigorous translation leads to unpredictable protocols
• Difficulty of analysis of derived protocol
• Lack of generality of translation
The Third Way
Phenomena and Results from Biological, Physical, Social Worlds, etc.
Mathematical Models
Self-adaptive Protocols for Distributed Computing Problems
Why Mathematical Models?
For long, a popular language for representing phenomena and ideas
• Scientists from fields of Biology, Sociology, Physics, etc. have used
these models to represent phenomena, results, and ideas
• Many decades (or centuries) of equations available in these fields
• E.g., sequence equations such as $x_{t+1} = \frac{1}{2} x_t$
Translation is Systematic and Rigorous
• Derive Protocol from Mathematical model (equation)
• Translation is not hand-wavy
Derived Protocol is easy to understand
• Rigorous analysis and provable properties of derived protocol
• Generality of translation
• Amenable to augmentation with topology-awareness, etc.
Story of this Paper
• Consider a popular class of mathematical models
– E.g., sequence equations such as $x_{t+1} = r \cdot x_t$
• Develop techniques for translating any mathematical
equation belonging to this class into  a distributed
protocol
– Key idea: Emergent behavior of the protocol across the
distributed system = the mathematical equation
• We are not simulating the mathematical equation at each process
• Challenge: going from global (equation) → local (protocol)
• Use these techniques to design adaptive protocols for
P2P computing
Roadmap
I. Related Work and System Model
II. Translation of Sequence Equations into Sequence Protocols
III. Adaptive Protocols for P2P Computing
– HoneyAdapt system for Grid computing
Related Work
• Simulation of mathematical equation at each process
[Uresin90]
– Instead, our focus is on running a protocol that obtains equation
as emergent behavior
• Nature-inspired research, e.g., [Mute,AntNet], etc., and
population protocols [Merritt00,Angluin04]
– But not derived from mathematical models
• [Gupta04] considered translation of continuous
differential equations into equivalent distributed protocols
– Current SASO 07 paper considers translation of sequence
equations.
Main differences from [Gupta04]:
– Sequence equations discrete and not continuous
– Require completely different translation techniques
– Adaptive and phase-change behavior more pronounced
System Model
• Static group of N non-faulty processes (N large)
– Can be relaxed for most sequence protocols
• Reliable unicast communication
– TCP
– Sequence protocols resilient to message losses
• Coarse-grained time synchronization (O(minutes))
– allows processes to move synchronously
– allows notion of “rounds”
– Provided by NTP, TIME, or DAYTIME (e.g., NIST servers)
– Many sequence protocols have asynchronous variants
• Any process can randomly sample another process
– Use CYCLON [Voulgaris05], Peer sampling service [Jelasity04], etc.
Creating Sequence Protocols
• Canonical Sequence Equation: $x_{t+1} = f(x_t, x_{t-1}, x_{t-2}, \ldots, x_{t-k})$
– x is a variable in [0,1]
– $x_t$ is its value at time t
– k is constant
[All our discussion extends to multi-variable sequence equations]
• Challenge: global → local
– Assign each process p a binary state variable Xp, representing whether it is in state X or not. Xp=1 means the process is in state X. Xp=0 means the process is not in state X.
– Let x = fraction of processes (system-wide) with Xp=1. So, $x \in [0,1]$
– Derive a distributed protocol so that the time-variation of x is predicted by the sequence equation. That is:
Goal: $x_{t+1} \to x$
Case Study I (Constant Term)
$x_{t+1} = r,\ r \in [0,1]$
Each round at process p:
    Flip a coin with heads probability r
    if heads
        Xp=1
    else Xp=0
=> Value of x, the fraction of processes in state X, is predicted by: $x = r$
Goal: $x_{t+1} \to x$
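The coin-flip round above can be simulated directly. The following is a minimal sketch rather than the authors' implementation; the function names `constant_round` and `fraction_in_state`, the group size, and the seed are assumptions chosen for illustration.

```python
import random


def constant_round(n, r, rng):
    """One round of the constant-term protocol at n processes.

    Each process independently flips a coin with heads probability r
    and sets Xp = 1 on heads, Xp = 0 on tails.
    """
    return [1 if rng.random() < r else 0 for _ in range(n)]


def fraction_in_state(states):
    """x = fraction of processes (system-wide) with Xp = 1."""
    return sum(states) / len(states)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    states = constant_round(10_000, 0.3, rng)
    print("x after one round:", fraction_in_state(states))
```

With a large group the measured fraction stays close to r, which is the sense in which x is predicted by the equation.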
Case Study II (Linear Term): Multiplicative Protocol
$x_{t+1} = r \cdot x_t,\ r \geq 0$
Each round at process p:
    remember last round’s Xp value
    // Token Generation
    if last round’s Xp was 1
        generate an expected r tokens
        relay token to random process
    // Token Relay
    hold at most one token at any time
    if receive any additional tokens
        relay it to a random process
    // Token Apply
    at end of round
        if have > 0 tokens
            set Xp=1
        else Xp=0
• Number of tokens generated is $r \cdot x_t \cdot N$
• Value of x, the fraction of processes in state X, is predicted by: $x_{t+1} = r \cdot x_t$
This Protocol can be extended to:
• Arbitrary memory (k in sequence equation)
• Multi-variable equations
Goal: $x_{t+1} \to x$
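A single-machine simulation can illustrate the token generation, relay, and apply steps of the Multiplicative Protocol. This is a sketch under stated assumptions: "an expected r tokens" is realized here as floor(r) tokens plus one more with probability equal to the fractional part of r, relaying is modeled as re-drawing a random target until a token-free process is found, and the names `expected_tokens` and `multiplicative_round` are hypothetical.

```python
import random


def expected_tokens(r, rng):
    """Return an integer token count whose expectation is r (assumed scheme)."""
    whole = int(r)
    return whole + (1 if rng.random() < r - whole else 0)


def multiplicative_round(states, r, rng):
    """One round of the linear-term protocol; returns the new Xp values."""
    n = len(states)
    has_token = [False] * n
    held = 0
    for xp in states:
        if xp != 1:
            continue  # only processes that were in state X generate tokens
        for _ in range(expected_tokens(r, rng)):
            if held == n:
                break  # every process already holds a token; x is capped at 1
            q = rng.randrange(n)
            while has_token[q]:
                # q already holds a token, so it relays this one onward
                q = rng.randrange(n)
            has_token[q] = True
            held += 1
    # Token apply: a process holding a token sets Xp = 1, otherwise Xp = 0
    return [1 if t else 0 for t in has_token]


if __name__ == "__main__":
    rng = random.Random(0)
    states = [1] * 1000 + [0] * 4000  # x_t = 0.2
    nxt = multiplicative_round(states, 1.5, rng)
    print("x_{t+1}:", sum(nxt) / len(nxt))  # close to r * x_t = 0.3
```

Because each process holds at most one token, the fraction can never exceed 1 even when r is greater than 1.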
Stepping Back – General Methodology
For the sequence equation: $x_{t+1} = f(x_t, x_{t-1}, x_{t-2}, \ldots, x_{t-k})$
– Take each term on the right hand side
• Term is minimal unit separated by + and – signs
– Translate each term according to appropriate case studies (Term Translation = Case Study)
– Generate positive tokens for + terms, and negative tokens for – terms
– A positive token destroys a negative token
– Relay and Apply tokens as usual
Theorem: If for each term T, number of tokens generated is T × N, then $x_{t+1} \to x$
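To make the positive/negative token idea concrete, the sketch below simulates the hypothetical two-term equation $x_{t+1} = r - c \cdot x_t$, which combines a constant term and a linear term. Several simplifications are assumptions of this sketch and not taken from the slides: the constant term generates one positive token per process with probability r, token annihilation is modeled by subtracting counts rather than by tokens meeting in the network, and surviving tokens are placed on distinct random processes in place of the relay step.

```python
import random


def stochastic_count(mean, rng):
    """Integer whose expectation is `mean` (assumed rounding scheme)."""
    whole = int(mean)
    return whole + (1 if rng.random() < mean - whole else 0)


def signed_round(states, r, c, rng):
    """One round for the illustrative equation x_{t+1} = r - c * x_t."""
    n = len(states)
    # + term (constant r): about r * N positive tokens system-wide
    positive = sum(1 for _ in range(n) if rng.random() < r)
    # - term (c * x_t): each process in state X generates an expected c negative tokens
    negative = sum(stochastic_count(c, rng) for xp in states if xp == 1)
    # A positive token destroys a negative token; clamp to a valid count
    surviving = max(0, min(n, positive - negative))
    # Relay and apply: each surviving token ends up at a distinct process
    holders = set(rng.sample(range(n), surviving))
    return [1 if p in holders else 0 for p in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    states = [0] * 10_000
    for t in range(10):
        states = signed_round(states, 0.6, 0.5, rng)
        print(t + 1, sum(states) / len(states))
```

For r = 0.6 and c = 0.5 the equation has the fixed point r / (1 + c) = 0.4, and the simulated fraction settles near that value.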
What other Terms can we Translate?
I. Polynomial Terms:
1. Constant – Case Study I: $x_{t+1} = r,\ r \in [0,1]$
2. Linear – Case Study II: $x_{t+1} = r \cdot x_t,\ r \geq 0$
3. Multiplicative Polynomial (multi-variable equations) - see paper
II. Non-polynomial Terms:
1. Division Terms - see paper
2. Fractional Terms – next
III. Recursive Translation – next
Translation of Fractional Term
$x_{t+1}, \quad T = \frac{\sum_{j=1}^{L} b_j \cdot a_j \cdot x_j}{\sum_{j=1}^{L} a_j \cdot x_j}$, $a_j$'s positive, $b_j$'s binary
Each round at process p:
    remember last k rounds’ Xp values
    divide round into two equi-long subrounds
    // Subround 1
    // Token Generation
    for each j = 1 to L
        if Xj(p)=1
            generate aj tokens tagged with j
    // Token Relay and Apply
    multicast tokens to all other processes
    // Subround 2
    // Token Generation
    select random token among those received
    suppose tag is j’
    if (bj’=1)
        generate a token for subround 2
    // Token Relay and Apply
    apply as usual
• Subround 1: E[Number of tokens generated at p] is $\sum_{j=1}^{L} a_j \cdot x_j$
• Subround 2: E[Number of tokens generated at p] is $\frac{\sum_{j=1}^{L} b_j \cdot a_j \cdot x_j}{\sum_{j=1}^{L} a_j \cdot x_j}$
• Value of x, the fraction of processes in state X, is predicted by: $x = T$
Goal: $T \to x$
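The two-subround structure of the fractional-term protocol can be checked with a small simulation. This is a sketch under assumptions: the $a_j$ are taken to be positive integers, the multicast is modeled by letting every process draw from the global pool of tagged tokens (including its own), and the final relay is replaced by placing each generated token on a distinct random process. The name `fractional_round` is hypothetical.

```python
import random


def fractional_round(states, a, b, rng):
    """One round of the fractional-term protocol.

    states[p][j] is X_j(p); a[j] is a positive integer; b[j] is 0 or 1.
    Returns the new Xp value for every process.
    """
    n = len(states)
    num_terms = len(a)
    # Subround 1: a process with X_j(p) = 1 generates a[j] tokens tagged j.
    # Multicasting them means every process sees the same pool of tags.
    tag_counts = [a[j] * sum(s[j] for s in states) for j in range(num_terms)]
    if sum(tag_counts) == 0:
        return [0] * n  # no tokens were generated anywhere
    # Subround 2: pick one received token at random; generate a token if b[tag] = 1.
    generated = 0
    for _ in range(n):
        tag = rng.choices(range(num_terms), weights=tag_counts)[0]
        if b[tag] == 1:
            generated += 1
    # Relay and apply: each generated token ends up at a distinct process.
    holders = set(rng.sample(range(n), generated))
    return [1 if p in holders else 0 for p in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 4000
    # x_1 = 0.5 and x_2 = 0.25, with a = [1, 2] and b = [1, 0]
    states = [[1 if p % 2 == 0 else 0, 1 if p % 4 == 0 else 0] for p in range(n)]
    new = fractional_round(states, [1, 2], [1, 0], rng)
    print("x:", sum(new) / n)  # T = (1 * 0.5) / (1 * 0.5 + 2 * 0.25) = 0.5
```

A process generates a subround-2 token with probability equal to the share of the token pool whose tag has $b_j = 1$, which is exactly the ratio T.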
Recursive Translation
• Any term that consists of sub-terms that are translatable, can itself be translated
– Split a round into two subrounds
– In subround 1, run the derived protocols for the subterms
– In subround 2, run the derived protocol for the overall term
– Subround division is also recursive
EXAMPLE
$x_{t+1}, \quad T = \frac{x_{t-1}}{x_t^2 + x_{t-1}}$
STEP 1 (Subround 1): $x' = x_t^2$
STEP 2 (Subround 2): $\frac{x_{t-1}}{x' + x_{t-1}}$
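As a worked check of the example, the two steps can be written out against the fractional-term form. The coefficient assignment below ($a_1 = a_2 = 1$, $b_1 = 0$, $b_2 = 1$) is one reading that reproduces the example, stated here as an assumption rather than taken from the slides.

```latex
\begin{align*}
\text{Subround 1:}\quad & x' = x_t^{2} \\
\text{Subround 2:}\quad & T = \frac{b_1 a_1 x' + b_2 a_2 x_{t-1}}{a_1 x' + a_2 x_{t-1}},
    \qquad a_1 = a_2 = 1,\ b_1 = 0,\ b_2 = 1 \\
\text{Substituting:}\quad & T = \frac{x_{t-1}}{x' + x_{t-1}} = \frac{x_{t-1}}{x_t^{2} + x_{t-1}}
\end{align*}
```

For instance, with $x_t = 0.5$ and $x_{t-1} = 0.25$, subround 1 yields $x' = 0.25$ and subround 2 yields $T = 0.25 / (0.25 + 0.25) = 0.5$.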
Roadmap
I. Related Work and System Model
II. Translation of Sequence Equations into Sequence Protocols
III. Adaptive Protocols for P2P Computing
• Multiplicative Protocol (based on $x_{t+1} = r \cdot x_t$)
– For detecting global thresholds in a distributed fashion
– see paper for details
• HoneyAdapt system for Grid computing
– next
HoneyAdapt - Motivation
Challenge: how do clients
choose “best” algorithm (A,B,…L)
adaptively at run time in a black-box manner?
e.g., for parallel sorting problem,
A=quicksort, B=insertion sort,…
Grid Server (master)
-Partitions large data set
into chunks
-Serves out chunks
on-demand to clients
-Collates results in the end
-E.g., parallel sorting
problem, graphics
rendering, etc.
Typical
Client
2. Process data chunk using
one of algorithms A,B,C,D,…L
Grid Clients
(workers)
Connected in
an overlay
HoneyAdapt – Inspiration
Nectar Source A
Nectar Source B
Honeybees (Apis mellifera)
-need to decide which is the
“better” nectar source
-in a distributed fashion
HoneyAdapt – Inspiration
Nectar Source A
Nectar Source B
1. (time t) Forage a nectar source
2. With probability (1-pf), use the same nectar source for time (t+1)
pf=following probability
3. Execute honeybee dance of 8’s
-Duration of dance proportional to quality of advertised nectar source
-Direction of dance points towards source
4. After dance, if not reusing the same source (so with probability pf), decide next source to forage by picking a dancing bee at random
HoneyAdapt – Mathematical Model
Source A = Algorithm A
Source B = Algorithm B
$a_i(t+1) = a_i(t) \cdot (1 - pf) + \frac{sq_i \cdot a_i(t)}{\sum_{j=A}^{L} sq_j \cdot a_j(t)} \cdot \sum_{j=A}^{L} pf \cdot a_j(t)$
– $a_i(t+1)$: fraction of nodes (bees/clients) foraging source (algorithm) i at time (t+1)
– $pf$: following probability
– $sq_j$: quality of source (algorithm) j
– First term on the right: Linear Term
– Second term on the right: Fractional Term (+Recursive)
Bees converge quickly towards better source (proof in paper)
(See paper for general model. From [Seeley96].)
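Iterating the model numerically shows the convergence behavior it describes. This is a minimal sketch; the function name `honeyadapt_step` and the example qualities are assumptions chosen for illustration, and the sketch assumes at least one source has a nonzero share and positive quality.

```python
def honeyadapt_step(a, sq, pf):
    """One step of the HoneyAdapt sequence equation.

    a[i]  : fraction of nodes currently using source/algorithm i
    sq[i] : quality of source/algorithm i (positive)
    pf    : following probability
    """
    weighted = [q * frac for q, frac in zip(sq, a)]
    total_weighted = sum(weighted)      # sum_j sq_j * a_j(t)
    followers = pf * sum(a)             # sum_j pf * a_j(t)
    return [
        frac * (1 - pf) + (w / total_weighted) * followers
        for frac, w in zip(a, weighted)
    ]


if __name__ == "__main__":
    a = [0.5, 0.5]                      # two sources, equal initial shares
    for t in range(10):
        a = honeyadapt_step(a, sq=[2.0, 1.0], pf=0.9)
        print(t + 1, [round(v, 4) for v in a])
```

The shares always sum to the same total, and the share of the higher-quality source grows every step.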
HoneyAdapt –Model and Derived Protocol
2A. Choose algorithm i (initially, random) for this chunk
2B. With probability (1-pf), use same algorithm for next chunk
2C. Dance: create a number of advertisement messages for algo i. Number of adv. msgs. proportional to the quality of sorting (inversely proportional to running time of chunk with algorithm i)
2D. Send advertisement messages to immediate neighbors in overlay
2E. If follow (prob. pf), decide algorithm i for next chunk by picking an advertisement message at random
Algorithm’s emergent behavior
= Sequence equation
(proof in paper)
$a_i(t+1) = a_i(t) \cdot (1 - pf) + \frac{sq_i \cdot a_i(t)}{\sum_{j=A}^{L} sq_j \cdot a_j(t)} \cdot \sum_{j=A}^{L} pf \cdot a_j(t)$
– $a_i(t+1)$: fraction of nodes (bees/clients) foraging source (algorithm) i at time (t+1)
– $pf$: following probability
– $sq_j$: quality of algorithm j
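The derived client protocol can be sketched as a round-based simulation over an overlay. The details below are assumptions for illustration and are not the authors' code: every neighbor receives a full copy of a client's advertisements, quality is given as an integer number of advertisements per algorithm, and the overlay is a random directed graph built by the hypothetical helper `random_overlay`.

```python
import random


def random_overlay(n, degree, rng):
    """Hypothetical overlay: each client sends to `degree` random other clients."""
    return [rng.sample([q for q in range(n) if q != p], degree) for p in range(n)]


def honeyadapt_round(choice, adverts_per_algo, neighbors, pf, rng):
    """One chunk-processing round; returns each client's algorithm for the next chunk."""
    n = len(choice)
    inbox = [[] for _ in range(n)]
    # Dance: advertise the algorithm just used, more copies for higher quality,
    # and send the advertisements to the immediate neighbors in the overlay.
    for p in range(n):
        algo = choice[p]
        for q in neighbors[p]:
            inbox[q].extend([algo] * adverts_per_algo[algo])
    # Follow with probability pf by picking a received advertisement at random;
    # otherwise keep the same algorithm for the next chunk.
    next_choice = list(choice)
    for p in range(n):
        if inbox[p] and rng.random() < pf:
            next_choice[p] = rng.choice(inbox[p])
    return next_choice


if __name__ == "__main__":
    rng = random.Random(0)
    n = 300
    overlay = random_overlay(n, 10, rng)
    choice = [p % 2 for p in range(n)]  # half start with algorithm 0, half with 1
    for t in range(10):
        choice = honeyadapt_round(choice, {0: 1, 1: 3}, overlay, 0.9, rng)
        print(t + 1, sum(choice) / n)   # share of the better algorithm (1)
```

In a real deployment the advertisement count would come from measured running time per chunk; here it is fixed per algorithm to keep the sketch short.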
HoneyAdapt - Simulation
Setup:
* Random graph overlay of ~1000 clients
* Dataset consists of 100K chunks of 10 different types
* Each type has 10 algorithms assigned randomly in terms of quality
* “Cluster”=consecutive chunks of same type (with same “best” algo.)
* pf=0.9
Adaptivity: HoneyAdapt takes only
2x time compared to optimal,
and beats non-adaptive strategies
Scalability up to and beyond 4000 nodes:
-Running time: Only 85% worse than optimal
-Bandwidth: 0.04 messages/node/chunk
HoneySort – Deployment
Setup:
* Up to 30 COTS PC clients (Linux)
* Complete graph overlay with TCP links
* Clients choose between quicksort and insertion sort
* Sort a database of 1 million 8-byte entries
* Server pre-partitions data into 333 chunks
Results:
* Sorted arrays: HoneySort as good as insertion sort
* Randomized arrays: HoneySort as good as quicksort
* Part-sorted part-randomized arrays: HoneySort beats both quicksort and insertion sort!
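The results above depend on the fact that the better algorithm changes with the input. The sketch below illustrates this by counting work steps instead of measuring wall-clock time. It is an illustration only: the step-counting functions are hypothetical, the quicksort is a simple middle-pivot variant, and the deployed system measures running time per chunk.

```python
import random


def insertion_sort_steps(data):
    """Insertion sort on a copy; returns (sorted_list, comparisons)."""
    items = list(data)
    comparisons = 0
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if items[j] <= key:
                break
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items, comparisons


def quicksort_steps(data):
    """Middle-pivot quicksort; returns (sorted_list, elements visited while partitioning)."""
    items = list(data)
    if len(items) <= 1:
        return items, 0
    pivot = items[len(items) // 2]
    less = [v for v in items if v < pivot]
    equal = [v for v in items if v == pivot]
    more = [v for v in items if v > pivot]
    left, left_steps = quicksort_steps(less)
    right, right_steps = quicksort_steps(more)
    return left + equal + right, left_steps + right_steps + len(items)


if __name__ == "__main__":
    ordered = list(range(1000))
    shuffled = ordered[:]
    random.Random(0).shuffle(shuffled)
    for name, chunk in (("sorted", ordered), ("randomized", shuffled)):
        _, ins = insertion_sort_steps(chunk)
        _, qs = quicksort_steps(chunk)
        # A client could advertise quality as the inverse of the measured cost.
        print(name, "insertion:", ins, "quicksort:", qs)
```

On already-sorted chunks insertion sort does far less work, while on randomized chunks quicksort does, so a client that adapts per chunk type can track whichever is cheaper.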
Summary
Phenomena and Results from Biological, Physical, Social Worlds, etc.
Mathematical Models
Self-adaptive Protocols for Distributed Computing Problems
This paper:
• Model=Sequence Equations
• Translation techniques for polynomial/non-polynomial terms
• Derived Sequence Protocol so its emergent behavior = Sequence equation
• HoneyAdapt for adaptive Grid computing
• HoneySort beats traditional parallel sorting algorithms
Distributed Protocols Research Group (DPRG): http://kepler.cs.uiuc.edu
Backup Slides
Translation of Division Term
(usually a sub-term in a larger term)
$T = \frac{r}{x_{t-k}}$, k is an integer
Each round at process p:
    remember last k rounds’ Xp values
    divide round into two equi-long subrounds
    // Token Generation [subround 1]
    integer i=0
    do
        select a random process q
        query the value of Xq k rounds ago
        i=i+1
    until (Xq=1)
    generate i token messages
    // Token Relay [subround 2]
    relay token to random process
    hold at most one token at any time
    if receive any additional tokens
        relay it to a random process
    // Token Apply [subround 2]
    at end of round
        use tokens for next subround
• E[Number of tokens generated at p] is $\frac{1}{x_{t-k}}$
• Total number of tokens generated is $N \cdot \frac{1}{x_{t-k}}$
• Value of x, the fraction of processes in state X, is predicted by: $x = T$
This Protocol can be extended to:
• Arbitrary memory (k in sequence equation)
• Multi-variable equations
Goal: $T \to x$
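The token-generation loop of the division term is a geometric trial: sampling continues until a process that was in state X is hit, so the expected number of samples is $1 / x_{t-k}$. The sketch below checks that expectation empirically. It covers only the sampling loop, not the factor r or the relay step, and the function name `division_tokens` is hypothetical.

```python
import random


def division_tokens(states_k_rounds_ago, rng):
    """Number of tokens one process generates in subround 1.

    Samples random processes until one had Xq = 1 k rounds ago and
    returns how many samples were needed.
    """
    if not any(states_k_rounds_ago):
        raise ValueError("no process was in state X; the loop would never end")
    n = len(states_k_rounds_ago)
    i = 0
    while True:
        q = rng.randrange(n)
        i += 1
        if states_k_rounds_ago[q] == 1:
            return i


if __name__ == "__main__":
    rng = random.Random(0)
    states = [1] * 250 + [0] * 750      # x_{t-k} = 0.25
    samples = [division_tokens(states, rng) for _ in range(20_000)]
    print("mean tokens per process:", sum(samples) / len(samples))  # near 1 / 0.25 = 4
```

Because $1 / x_{t-k}$ is at least 1, the token count per process is not itself a valid fraction, which fits the remark that this term usually appears as a sub-term of a larger term.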
Big Picture
• Self-adaptive and self-organizing
distributed protocols
• Protocol design
• Biological, Physical, Social phenomena as
a source of ideas for protocol design
Need: Systematic Translation of
phenomena into distributed protocols
• Use mathematical models as a conduit