Server Farms
Q: Why would someone choose M/M/k over M/M/1 then?
(M/M/1, a single fast server of speed kμ, is better under low load and just as good under high load.)
A: Server farms (and "clouds")!
• k slow servers of speed μ are much cheaper than one fast server of speed kμ!
• But many more servers are needed (100s or 1000s)
Capacity Provisioning for Server Farms
Problem: How many servers do I need?
Fact: The more servers, the better the response time
Fact: An idle server still consumes about 60% of its peak power
• Power for running server farms is among the biggest costs/concerns of a company → "greening" the Internet is a "hot" research topic!
[New problem]: What is the minimum number of servers that guarantees a low E[T] or a low PQ?
• We can get this from the M/M/k equations (but it is not easy)
Building up Intuition
• M/M/1 Rule of Thumb: utilization ρ should stay below 0.8
• ρ = 0.8 → E[N] = 4
• ρ = 0.95 → E[N] = 19 (delays explode!)
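These numbers are just the M/M/1 mean formula worked out:

```latex
% M/M/1 mean number in system
E[N] = \frac{\rho}{1-\rho}
\quad\Rightarrow\quad
\rho = 0.8:\ \frac{0.8}{0.2} = 4,
\qquad
\rho = 0.95:\ \frac{0.95}{0.05} = 19 .
```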
• E[TQ]^{M/M/k} = (1/λ) · PQ · ρ/(1 − ρ)
• Not as easy to tell (depends on both ρ and PQ)
Q: How about E[TQ]/PQ?
A: Expected waiting time only for delayed customers
E[TQ | delayed] = E[TQ] / PQ = (1/λ) · ρ/(1 − ρ) = 1/(kμ(1 − ρ))
Q: What does this equation imply for high ρ (e.g. ρ = 0.95)?
A: High ρ does not imply high delay → just add more servers!
• 5 servers → delay = 4/μ  |  100 servers → delay = 1/(5μ)
Q: Why?
A: Even if all servers have high ρ, Prob{all busy at the same time} is lower with more servers
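A small computational sketch (not from the slides; the rates are illustrative) that computes PQ via the Erlang-C formula and E[TQ | delayed] = 1/(kμ(1−ρ)), reproducing the 5-server vs. 100-server comparison above:

```python
import math

def erlang_c(k: int, rho: float) -> float:
    """P_Q for an M/M/k queue (probability an arrival has to wait).
    rho = lambda / (k * mu) is the per-server utilization."""
    a = k * rho                                        # offered load in Erlangs
    s = sum(a**i / math.factorial(i) for i in range(k))
    last = a**k / (math.factorial(k) * (1 - rho))
    return last / (s + last)

def delay_given_delayed(k: int, mu: float, rho: float) -> float:
    """E[T_Q | delayed] = 1 / (k * mu * (1 - rho))."""
    return 1.0 / (k * mu * (1 - rho))

mu, rho = 1.0, 0.95                                    # illustrative values
for k in (5, 100):
    print(f"k={k:4d}  P_Q={erlang_c(k, rho):.3f}  "
          f"E[T_Q|delayed]={delay_given_delayed(k, mu, rho):.2f}/mu")
# k=5   -> E[T_Q|delayed] = 1/(5*0.05)   = 4/mu
# k=100 -> E[T_Q|delayed] = 1/(100*0.05) = 1/(5 mu)
```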
M/M/∞
Goal: find the minimum k such that PQ < X% (e.g. 20%)
• Bounding PQ is equivalent to bounding E[TQ], etc.
• It is easier to consider M/M/∞ first
πi = (λ/μ)^i · (1/i!) · π0
Q: Local Balance equations?
Q: What is this?
A: The number of customers in an M/M/∞ is Poisson(λ/μ)
Q: What is E[N] and E[T]? Does it confirm your intuition?
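Worked answer to the last question (standard M/M/∞ facts):

```latex
N \sim \mathrm{Poisson}(\lambda/\mu)
\;\Rightarrow\;
E[N] = \frac{\lambda}{\mu},
\qquad
E[T] = \frac{E[N]}{\lambda} = \frac{1}{\mu}
\quad \text{(Little's law: with infinite servers a job never waits, it only sees its own service time).}
```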
Square-Root Staffing Rule
Def: R = λ/μ (assume R is large)
Main result: with only k = R + √R servers → PQ < 20%
Q: What is the probability of having more than R + √R jobs in an M/M/∞?
A1: Prob{Poisson(R) > R + √R}
A2: (for large R) Poisson(R) ≈ Normal(R, R) ⇒
Final answer: P{Normal exceeds its mean by > 1 std. dev.} ≈ 16%
Q: Is this probability higher or lower for M/M/k?
A: Higher! M/M/∞ has more resources (servers) to "clear" extra work
M/M/k: it turns out that R + √R servers are enough for PQ < 20%
• See Ch. 16 (Th. 48) for a more detailed rule and proof
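A quick numerical sanity check, as a sketch (R = 50 is an arbitrary illustrative value): the exact Poisson tail, its normal approximation, and the Erlang-C PQ when k = ⌈R + √R⌉ servers are provisioned.

```python
import math

def poisson_tail(R: float, x: float) -> float:
    """P{Poisson(R) > x}, by summing the pmf up to floor(x)."""
    pmf = cdf = math.exp(-R)
    for i in range(1, int(math.floor(x)) + 1):
        pmf *= R / i
        cdf += pmf
    return 1.0 - cdf

def erlang_c(k: int, a: float) -> float:
    """M/M/k probability of queueing; a = lambda/mu is the offered load (a < k)."""
    rho = a / k
    s = sum(a**i / math.factorial(i) for i in range(k))
    last = a**k / (math.factorial(k) * (1 - rho))
    return last / (s + last)

R = 50.0
thresh = R + math.sqrt(R)
k = math.ceil(thresh)
print("P{Poisson(R) > R + sqrt(R)} :", poisson_tail(R, thresh))                     # roughly 0.15
print("Normal approx, P{Z > 1}     :", 1 - 0.5 * (1 + math.erf(1 / math.sqrt(2))))  # about 0.16
print(f"Erlang-C P_Q with k={k}     :", erlang_c(k, R))                             # around 0.2 or below
```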
Bulk Arrivals Systems: M[K]/M/2/6
Consider:
• Two-server system (memoryless)
• Service rate μ (for each server)
• System size = 6 (2 in service and 4 in the queue)
• Batch/bulk arrivals
• Batches of jobs arrive as Poisson(λ)
• Batch size distribution: each batch contains 1, 2, or 3 jobs
  Pr{X=1} = 0.5
  Pr{X=2} = 0.3
  Pr{X=3} = 0.2
• Average batch size = E[X] = 1(0.5) + 2(0.3) + 3(0.2) = 1.7
Q: How do we solve this queueing system?
Q: Can we still use a CTMC to solve it?
Solving the Bulk Arrival System
Instead of simple local balance on a cut, we write the global balance equations (state = number of jobs in the system):

State 0:  π0 (0.5λ + 0.3λ + 0.2λ) = π1 μ   ⇒  π1 = (λ/μ) π0
State 1:  π1 (0.5λ + 0.3λ + 0.2λ + μ) = π2 · 2μ + π0 · 0.5λ   ⇒  π2 = (λ/(2μ)) (λ/μ + 0.5) π0
State 2:  π2 (λ + 2μ) = π3 · 2μ + π0 · 0.3λ + π1 · 0.5λ
(and similarly for states 3 to 6, taking the finite capacity into account)

Bulk/Batch Departures: M/M[K]/n
• Jobs are served in batches
• Time between batch services is exponential(μ)
• Fixed batch size k (example below: k = 3)

Balance equations (for k = 3):
λπ0 = μπ1 + μπ2 + μπ3
(λ+μ) π1 = λπ0 + μπ4
(λ+μ) π2 = λπ1 + μπ5
⋮
(λ+μ) πn = λπn−1 + μπn+k
Solving Bulk Arrival/Departure Systems
• Define the transition (rate) matrix
• Solve the balance equations: π·(I − P) = 0 for a DTMC with transition matrix P, or πQ = 0 for a CTMC with rate matrix Q
• Together with Σi πi = 1
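A minimal sketch of this recipe for the M[K]/M/2/6 example from the previous slides (numpy assumed; λ and μ are arbitrary here, and batches that do not fully fit are assumed to be partially accepted, since the slides do not specify the blocking rule):

```python
import numpy as np

lam, mu = 1.0, 1.0                     # illustrative rates
batch = {1: 0.5, 2: 0.3, 3: 0.2}       # batch-size distribution from the slide
N = 6                                  # capacity: 2 in service + 4 waiting

Q = np.zeros((N + 1, N + 1))           # CTMC generator (rate) matrix
for n in range(N + 1):
    for b, p in batch.items():         # batch arrival of size b, rate lam*p
        m = min(n + b, N)              # jobs that do not fit are dropped (assumption)
        if m != n:
            Q[n, m] += lam * p
    if n >= 1:                         # one departure at a time, min(n,2) busy servers
        Q[n, n - 1] += min(n, 2) * mu
    Q[n, n] = -Q[n].sum()              # diagonal: rows of a generator sum to zero

# solve pi Q = 0 together with sum(pi) = 1
A = np.vstack([Q.T, np.ones(N + 1)])
rhs = np.zeros(N + 2); rhs[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
print(np.round(pi, 4))                 # stationary distribution pi_0 ... pi_6
```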
Examples of Batch/Bulk Arrivals/Service
Batch Arrivals
• Customers arrive at a server by bus
  Bus arrivals are Poisson
  The number of customers in each bus is a random variable X
• The number of files requested at a web/file server is random
  Requests arrive as Poisson
Batch Departures
• Multicasting popular files
  Some of the nodes in the queue might be asking for the same file
  If the file requested by the node at the head of the queue was also requested by another k nodes in the queue, all k nodes are served with a single broadcast message (batch size k)
  The batch size k depends on the popularity of the requested file and on the number of customers in the queue.
PASTA Property
• Assume you are simulating or observing a queueing system
• pn: (limiting) probability of being in state n (n jobs in the system)
  Ergodicity → pn = long-run fraction of time spent in state n
• an: probability that an arrival finds n jobs
• dn: probability that a departure leaves n jobs behind
Goal: we are interested in measuring pn (the fraction of time in state n)
Q: How?
Method 1: let the system run for a long time → measure the state → repeat many times and take the average
Method 2: measure the number of jobs at the times of arrivals
Q: Does this work? Is an = pn?
PASTA Property (2)
Q: Is an = dn?
A: Yes, if arrivals and departures happen one at a time (no batch)
Q: Is an = pn?
A: No, not necessarily!
Example:
• Consider a single-queue system
• Arrivals: inter-arrival times Uniform(1, 2)
• Service times: deterministic, duration 1
Q: What is a0?
A: a0 = 1 → every customer completes service (duration 1) before the next arrival (which comes at least 1 time unit later), so each arrival finds the system empty
Q: What is p0?
A: p0 ≠ 1! The server is busy for 1 time unit out of every inter-arrival period (mean 1.5), so p0 = 1 − 1/1.5 = 1/3
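A short simulation sketch of exactly this example (purely illustrative), confirming that every arrival finds the system empty (a0 = 1) while the time-average empty fraction p0 is about 1/3:

```python
import random

random.seed(0)
HORIZON = 1_000_000.0
t = busy_until = busy_time = 0.0
arrivals = arrivals_finding_empty = 0

while t < HORIZON:
    t += random.uniform(1.0, 2.0)       # next arrival: inter-arrival ~ Uniform(1,2)
    arrivals += 1
    if t >= busy_until:                 # server idle when the arrival occurs
        arrivals_finding_empty += 1
    start = max(t, busy_until)          # deterministic service of duration 1
    busy_until = start + 1.0
    busy_time += 1.0

print("a0 ~", arrivals_finding_empty / arrivals)   # -> 1.0
print("p0 ~", 1 - busy_time / t)                   # -> about 1/3
```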
PASTA: (P)oisson (A)rrivals (S)ee (T)ime (A)verage
Theorem: If arrivals are Poisson, then an = pn = dn
Proof (sketch):
pn = lim t→∞ P(N(t) = n)
an = lim t→∞ P(N(t) = n | an arrival occurs just after time t)
Define A(t, t+δ): the event that an arrival occurs in (t, t+δ)
For a Poisson process, A(t, t+δ) is independent of the history up to time t, and N(t) depends only on that history → P(N(t) = n | A(t, t+δ)) = P(N(t) = n), so an = pn (and dn = an since jobs arrive and depart one at a time).
PASTA Property in Simulations/Experiments
[Figure: measurement samples S1, S2, S3, S4, S5 taken at random time instants]
• Assume you are simulating a system
• or observing a real system (e.g. a backbone router)
• The system moves randomly from state to state (e.g. a Markov chain, a queueing system)
• PASTA → we can sample the system state at exponentially distributed times
  e.g. send measurement ("probe") packets with exponential inter-packet times
Q: Do the sampling times need to be exponentially distributed?
A: No! They just need to be independent of N(t)
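A sketch of this sampling idea (an M/M/1 simulation with illustrative rates; by PASTA the probe samples, taken at exponentially spaced instants independent of the queue, estimate pn):

```python
import random

random.seed(1)
lam, mu, probe_rate = 0.8, 1.0, 0.5      # illustrative rates (rho = 0.8)
HORIZON = 200_000.0

t, n = 0.0, 0                            # current time, jobs in system
next_probe = random.expovariate(probe_rate)
samples = []

while t < HORIZON:
    rate = lam + (mu if n > 0 else 0.0)  # total transition rate in state n
    dt = random.expovariate(rate)
    while next_probe < t + dt:           # record probes falling in this holding time
        samples.append(n)
        next_probe += random.expovariate(probe_rate)
    t += dt
    if random.random() < lam / rate:     # which event occurred?
        n += 1                           # arrival
    else:
        n -= 1                           # departure

print("estimated p0:", samples.count(0) / len(samples), " exact 1-rho:", 1 - lam / mu)
```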
Queueing Networks: A Simple Tandem Queue
• Normal approach: define a CTMC
• Finite CTMC: can be solved in Matlab (for numerical rates)
• Infinite CTMC in multiple (here 2) dimensions: VERY hard!
Time-reversibility of Markov Chains
Forward chain: … 3 → 5 → 1 → 2 → 1 → 3 → 4 → 1 → …
Reverse chain: … 3 ← 5 ← 1 ← 2 ← 1 ← 3 ← 4 ← 1 ← …
Q: Is the reverse chain a CTMC?
A: Yes! (It can be shown to satisfy VIEW 1 of a CTMC)
• The time spent in state i is exponentially distributed
Q: What is the probability p*ik of going from i to k in the reverse chain?
A: It is the probability that the forward chain entered i from k (and not from another state)
Q: What is Σkp*ik ?
A: Σkp*ik = Σk P(reverse moves from state i to k) = 1
Time-Reversibility (2)
Q: How does π*i relate to πi?
A: π*i = πi
Time-Reversibility Theorem:
If πi qik = πk qki for all i, k → the forward chain and the reverse chain are statistically identical!
• Example: consider an M/M/1 system
• The theorem says that the rate of going from n to n+1 in the forward chain (i.e. a queue increase, given n) is equal to the rate of going from n to n+1 in the reverse chain (i.e. a queue decrease in the forward chain, given n+1).
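For example, for the M/M/1 chain the condition is easy to verify directly (assuming ρ = λ/μ < 1):

```latex
\pi_n = (1-\rho)\rho^n
\;\Rightarrow\;
\pi_n\, q_{n,n+1} = (1-\rho)\rho^n\,\lambda
                  = (1-\rho)\rho^{n+1}\,\mu
                  = \pi_{n+1}\, q_{n+1,n},
```

so the M/M/1 queue is time-reversible.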
Time-reversibility examples
[Figure: birth-death chain with states 0, 1, 2, …, arrival rates λ0, λ1, λ2, … and service rates μ1, μ2, μ3, …]
Q: Are birth-death processes time-reversible?
A: Yes, they are: the only way from n to n+1 is the direct transition, and the only way back is n+1 → n, so the flows across each cut must balance
Q: What about batch arrival systems?
A: No, they are not necessarily time-reversible!
Burke’s Theorem
[Figure: M/M/1 queue with Poisson(λ) arrivals (e.g. 3 jobs/sec) and exp(μ) service (e.g. 5 jobs/sec)]
Burke’s Theorem (holds also for M/M/k)
Q1: What is the departure process from an M/M/1?
A: It is Poisson (λ)
Q2: How does N(t) (the number of jobs in the system at time t) depend on the sequence of departure times prior to t?
A: It does not!
Proof:
Q1: Departures in the M/M/1 are arrivals in the reverse process → but the reverse process is an identical M/M/1 → departures are Poisson(λ)
Q2: The sequence of departures prior to t → the sequence of arrivals (in the reverse chain) after t → clearly independent of N(t)
Tandem Queue: Solution Using Burke’s Theorem
• 1st queue: M/M/1 → P(n1 jobs) = ρ1^n1 (1 − ρ1)
Q: What about the 2nd queue?
A: It looks like an M/M/1 as well → P(n2 jobs) = ρ2^n2 (1 − ρ2)
Q: But isn't N2(t) dependent on N1(t)?
A: Departures from queue 1 (before t) are arrivals to queue 2 before t
• departures before t are independent of N1(t) (Burke)
• arrivals (to queue 2) before t completely determine N2(t)
→ N1(t), N2(t) independent → πn1,n2 = ρ1^n1 (1 − ρ1) · ρ2^n2 (1 − ρ2)
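One immediate consequence (assuming external arrival rate λ and service rates μ1, μ2, with ρi = λ/μi < 1): each queue behaves as an independent M/M/1, so

```latex
E[T] \;=\; \frac{E[N_1] + E[N_2]}{\lambda}
     \;=\; \frac{1}{\lambda}\left(\frac{\rho_1}{1-\rho_1} + \frac{\rho_2}{1-\rho_2}\right)
     \;=\; \frac{1}{\mu_1-\lambda} + \frac{1}{\mu_2-\lambda}.
```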
Example of Tandem Queues
Q: Which of the two systems has better performance?
A: Neither! They both have the same mean response time
Q: How can you quickly prove it?
A: Use Little’s Law
An Acyclic Network with Probabilistic Routing
Q: How can we solve this queueing system?
Q: Can we still treat each individual queue as an M/M/1?
A: Yes. Use Burke’s Theorem and Poisson Splitting
πn1,n2,…,nk = ρ1^n1 (1 − ρ1) · ρ2^n2 (1 − ρ2) ⋯ ρk^nk (1 − ρk)
Queueing Networks: Jackson Network
• Exponential servers
• FCFS queues
• Probabilistic routing
• Allows loops (cycles)
Q: Is each queue still an M/M/1?
Jackson Network: A Counter-example
Q: Is the total arrival process into the server (i.e. external plus feedback traffic) a Poisson process? Does it even look Poisson?
A: No! The feedback traffic and the external arrivals are dependent
Jackson Network: Is a Product-Form Network
P{network state is (n1, n2, …, nk)} = ∏i=1..k P{ni jobs at server i} = ∏i=1..k ρi^ni (1 − ρi)

• This is a very important result!
• It transforms an infinite k-dimensional Markov chain into a simple closed form
• We can still treat each queue independently!
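A small sketch of how to use this in practice (a hypothetical 3-queue open Jackson network; the external rates, routing matrix and service rates below are made up for illustration): first solve the traffic equations λ = r + λP, then apply the product form queue by queue.

```python
import numpy as np

# hypothetical network: external arrival rates r_i, routing probabilities P[i, j]
# (a job leaving server i exits the network with probability 1 - sum_j P[i, j]),
# and service rates mu_i
r  = np.array([1.0, 0.5, 0.0])
P  = np.array([[0.0, 0.6, 0.2],
               [0.0, 0.0, 0.5],
               [0.1, 0.0, 0.0]])
mu = np.array([3.0, 2.0, 2.0])

# traffic equations: lambda = r + lambda P  <=>  (I - P)^T lambda = r
lam = np.linalg.solve((np.eye(3) - P).T, r)
rho = lam / mu
assert (rho < 1).all(), "the network must be stable"

EN = rho / (1 - rho)                 # each queue behaves like an M/M/1
ET = EN.sum() / r.sum()              # mean time in the network (Little's law)
print("lambda_i:", lam, " rho_i:", rho)
print("E[N] total:", EN.sum(), " E[T]:", ET)

# product-form probability of one particular network state, e.g. (2, 1, 0)
n = np.array([2, 1, 0])
print("P{(n1,n2,n3) = (2,1,0)} =", np.prod(rho**n * (1 - rho)))
```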
Jackson Network Example: a Web Server