Switching

Lecture 1

Switch Architecture
Interconnection between N input links and N output links.
[Figure: input adapters and output adapters connected through a switch fabric controlled by an arbiter; the fabric shown is a crossbar joining the input lines to the output lines.]

Switch Types
Single staged or multi staged.

Input Queued
Packets are stored at the input. Switch speed = line speed.
Any combination of packets can be transferred across the switch as long as the input and output terminals do not overlap (a matching): 1-3 and 2-1 together are allowed; 1-2 and 3-2 together are not; 1-4 and 1-5 together are not.

Output Queued
Packets are stored at the output. Switch speed = N × line speed (speedup = N).
Any N packets intended for an output can be transferred to that output simultaneously, e.g., 1-2 and 3-2 together are allowed.

Combined Input Output Queued
Packets are stored at both input and output. Speedup between 1 and N.

Shared Memory Switch
Packets are stored in the switch fabric; every output line reads out its packets as and when they are to be transmitted.

Switch Performance Metrics
Throughput: rate vector comprising the rate of transmission of packets across each output, (r_1, ..., r_N).
Delay vector: delay experienced by packets in going from input to output.

Input Queued Switch
Let r_ij be the average number of packets arriving at input line i for output line j per slot, and let an output line serve one packet each slot. Then
Σ_i r_ij ≤ 1 for each j (output contention),
Σ_j r_ij ≤ 1 for each i (input contention).

Output Queued Switch
Σ_j r_ij ≤ 1 for each i (input contention),
Σ_i r_ij ≤ N for each j (output contention); this constraint is redundant.

Combined Input Output Queued Switch
Σ_j r_ij ≤ 1 for each i (input contention),
Σ_i r_ij ≤ k for each j (output contention), where k is the speedup.

Shared Memory Switch
Constraints are just like those of the output queued switch.
Less packet loss, due to statistical multiplexing of the buffer memory: an output queued switch loses a packet if any single output buffer overflows, even if there is space in other output buffers, whereas a shared memory switch pools all the buffers and overflows only if the combined buffer overflows.

Scheduling for Input Queued Switches
The choice of matchings determines whether the throughput permitted by the rate constraints can actually be attained.

FIFO scheduling
Every input maintains a single queue of packets. Transfer the first packet of each queue, subject to the matching constraints.
HOL contention. [Figure: two example sets of queue contents; in the first, the head-of-line packets collide at an output and only one packet can be transferred across the switch in the slot, while in the second, two packets can be transferred.]
The input constraints allow (Σ_i Σ_j r_ij)/N, the rate of transfer of packets across the switch per line, to be close to 1. Under FIFO scheduling this number is upper bounded by 0.586, the penalty of HOL blocking (proof in Lecture 2).

Lookahead scheduling
First schedule among the head-of-line packets of the input queues. If an input has a packet but does not transfer one, check whether its second packet can be scheduled. This generalizes to considering the first w packets of each queue.
[Figure: in the example, 2-4 is scheduled in the first pass; looking at the second packet of queue 1, 1-2 is then scheduled as well.]
Lookahead performance: the achievable (Σ_i Σ_j r_ij)/N increases with w, but its limiting value as w grows is strictly less than 1.

Reference: M. Hluchyj and M. Karol, "Queueing in High-Performance Packet Switching," IEEE JSAC, Vol. 6, No. 9, December 1988.
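The 0.586 figure can be reproduced with a short saturation simulation. This is a minimal sketch of my own (not from the lecture), assuming every input is always backlogged, head-of-line destinations are uniform and independent, and each output serves a uniformly chosen contending input:

```python
import random

def fifo_hol_throughput(N=32, slots=20000, seed=0):
    """Monte Carlo estimate of the per-line throughput of a saturated
    FIFO input-queued switch with uniform random destinations.
    Each input always has a head-of-line (HOL) packet; every slot each
    output serves one of the HOL packets destined to it (random tie-break),
    and each served input draws a fresh uniform destination."""
    rng = random.Random(seed)
    hol = [rng.randrange(N) for _ in range(N)]   # HOL destination of each input
    transferred = 0
    for _ in range(slots):
        # Group inputs by the output their HOL packet wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each contended output serves one input; that HOL packet departs.
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)
            transferred += 1
            hol[winner] = rng.randrange(N)       # next packet reaches HOL
    return transferred / (slots * N)

print(fifo_hol_throughput())   # close to 0.586 = 2 - sqrt(2) for large N
```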
Achieving 100% Throughput in an Input Queued Switch
100% throughput is said to be attained if the switch can sustain any arrival process as long as the rate constraints are satisfied; this allows (Σ_i Σ_j r_ij)/N to be close to 1.
References:
N. McKeown et al., "Achieving 100% Throughput in an Input-Queued Switch," IEEE INFOCOM '95.
L. Tassiulas and A. Ephremides, "Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks," IEEE Trans. on Automatic Control, Vol. 37, No. 12, pp. 1936-1949, 1992.

Maximum weighted matching based scheduling
Every input maintains a separate logical queue for each output.
The weight of a queue is its queue length.
Serve the packets from the queues that form a maximum weighted matching, where the weight of a matching is the sum of the weights of the queues in it.
[Example: queue (1,4) holds 3 packets, and queues (1,2) and (2,4) hold 1 packet each. The possible matchings are {1-2, 2-4} with weight 2, {1-4} with weight 3, {1-2} with weight 1, and {2-4} with weight 1; maximum weighted matching schedules 1-4.]
Maximum weighted matching gives priority to long queues and also tries to schedule a large number of queues, but, as the example shows, it may not always schedule the largest possible number of queues. A sketch of the scheduler is given below; the proof of optimality is in Lecture 2.
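As an illustration (my own sketch, not from the lecture), the following brute-force routine enumerates matchings of a small N × N switch and reproduces the example above, with ports renumbered from 0:

```python
from itertools import permutations

def max_weight_matching(Q):
    """Brute-force maximum weighted matching for an N x N virtual-output-queue
    matrix Q (Q[i][j] = backlog from input i to output j). Returns the set of
    (input, output) pairs to serve. Exponential in N; only meant to make the
    definition of the schedule concrete."""
    N = len(Q)
    best, best_w = [], -1
    for perm in permutations(range(N)):          # perm[i] = output matched to input i
        pairs = [(i, perm[i]) for i in range(N) if Q[i][perm[i]] > 0]
        w = sum(Q[i][j] for i, j in pairs)       # weight = sum of matched queue lengths
        if w > best_w:
            best, best_w = pairs, w
    return best, best_w

# The example above: queue (1,4) holds 3 packets, queues (1,2) and (2,4) hold 1 each.
Q = [[0, 0, 0, 0] for _ in range(4)]
Q[0][1], Q[0][3], Q[1][3] = 1, 3, 1
print(max_weight_matching(Q))   # serves (0, 3): weight 3 beats the size-2 matching of weight 2
```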
Lecture 2

Proof of the FIFO throughput bound
(Hluchyj and Karol, "Queueing in High-Performance Packet Switching," IEEE JSAC, Vol. 6, No. 9, December 1988.)
Assumptions:
- The input queues are saturated, that is, packets are always waiting at every input queue.
- Each output selects among the inputs contending for it with equal probability.
- When a new packet reaches the HOL position of an input queue, its destination is any given output with equal probability 1/N.
Notation:
B_m^i = number of packets at the heads of the input queues destined for output i in slot m but not selected for it.
A_m^i = number of packets moving to the heads of the input queues destined for output i in slot m.
F_m = number of packets transmitted through the switch in slot m (expected value F).
The expected number of packets passing through the switch per slot per output line is ρ = F/N.
B_m^i = max(0, B_{m-1}^i + A_m^i − 1).   (Eqn 1)
A_m^i is binomial(F_{m-1}, 1/N); for large N, binomial(F_{m-1}, 1/N) is approximately Poisson(F/N).
Thus the dynamics of Eqn 1 resemble an M/D/1 queue in which A_m^i is the Poisson(ρ) arrival process and B_m^i is the number of packets in the queue. From the standard M/D/1 result, E[B^i] = ρ² / (2(1 − ρ)).
Also F_{m-1} = N − Σ_i B_{m-1}^i. Dividing both sides by N and taking expectations,
ρ = 1 − E[B^i] = 1 − ρ² / (2(1 − ρ)).
Solving for ρ gives ρ = 2 − √2 ≈ 0.586.

Proof of optimality of the maximum weighted matching algorithm for input queues

Preview of Markov processes
• A sequence of random variables X_1, X_2, ..., X_n, ... such that
  – X_{i+1} is independent of X_1, ..., X_{i-1} given X_i, i.e.,
  – Pr(X_{i+1} = a_{i+1} | X_i = a_i, ..., X_1 = a_1) = Pr(X_{i+1} = a_{i+1} | X_i = a_i).
• This is a discrete time, discrete state Markov chain.
• A Markov chain's evolution can therefore be specified by
  – the initial state, and
  – the transition probabilities Pr(X_{i+1} = a_{i+1} | X_i = a_i).
[Figure: a three-state chain on {0, 1, 2} with transition probabilities 0.5 and 1.0, used as the running example.]
State A communicates with state B if there is a positive-probability path from A to B.
A set of states is closed if all states in the set communicate with each other and no state in the set communicates with any state outside the set, e.g., {0, 1} in the example.
A state a is open if it communicates with some state that does not communicate back with a, e.g., state 2 in the example.
A state has period d if the process can return to it only at times that are multiples of d. More precisely, d_i is the g.c.d. of {k : p_ii^k > 0}, where p_ii^k = Pr(X_k = i | X_0 = i).
[Figure: in the example chain d_0 = 1; in a two-state chain that alternates deterministically between 0 and 1, d_0 = 2.]
The periods of any two communicating states are equal. A state is aperiodic if its period is 1.
For a function f(x) of the state, E f(X) = Σ_x p(x) f(x).

Consider the input queueing system with random arrivals, i.e., the number of arrivals for the different sessions is random:
- The arrival process for each input-output pair is i.i.d. (independent and identically distributed).
- The number of arrivals of pair i-j in slot t is independent of the number of arrivals of any other pair in past or future slots, and also of the arrivals of any other pair in the same slot.
- The probability that a packet arrives for pair i-j in any slot t is q_ij.
Under these assumptions the queue length process (the values B_ij(t) for all pairs i-j) constitutes a Markov chain. Under maximum weighted matching, this Markov chain consists of one closed set with period 1 (check it!).
Let f(x) be the total number of packets in the input queues in state x, with E f(X) = Σ_x p(x) f(x).
A scheduling strategy is said to be optimal if it attains a finite expected queue length whenever any other strategy does.
Suppose the system is Markov with one aperiodic closed set, plus possibly some open states, and suppose we can find a scalar function V(x) ≥ 0 of the state vector x such that
E[V(B(t+1)) − V(B(t)) | B(t) = x] ≤ −ε < 0 for some fixed ε and all states x except a finite number (the negative drift condition; see the book by Meyn and Tweedie, Markov Chains and Stochastic Stability).
Then the distribution of the Markov chain converges to a probability distribution with finite expectation.
Assuming the expected arrival rates satisfy the rate constraints of the input queued switch, we show that the negative drift condition is satisfied.
Choose V(x) = Σ_i x_i².
Notation: B(t) = vector of queue lengths in slot t (column vector), A(t) = vector of arrivals in slot t, D(t) = vector of departures in slot t, so that
B(t+1) = B(t) + A(t+1) − D(t+1).
Then
V(B(t+1)) − V(B(t)) = (B(t+1) − B(t))^T (B(t+1) + B(t))
  = (A(t+1) − D(t+1))^T (A(t+1) − D(t+1) + 2B(t))
  = (A(t+1) − D(t+1))^T (A(t+1) − D(t+1)) + 2 (A(t+1) − D(t+1))^T B(t),
E[V(B(t+1)) − V(B(t)) | B(t) = x]
  = E[(A(t+1) − D(t+1))^T (A(t+1) − D(t+1)) | B(t) = x] + 2 E[(A(t+1) − D(t+1))^T B(t) | B(t) = x].
The first term can be upper bounded by a positive constant, since the expected arrivals in slot t+1 are finite and independent of the queue lengths in slot t, and the departures from each pair are at most 1 and at least 0.
For the second term,
E[(A(t+1))^T B(t) | B(t) = x] = a^T x, where a is the vector of expected arrival rates.
Using the rate constraints, a = Σ_i σ_i M_i, where the M_i are matching vectors, σ_i ≥ 0, and Σ_i σ_i < 1 when the constraints hold with some slack. Hence
a^T x = Σ_i σ_i (M_i^T x) ≤ (Σ_i σ_i) × (weight of the maximum weight matching under x).
E[(D(t+1))^T B(t) | B(t) = x] = weight of the maximum weight matching under x.
Therefore
E[(A(t+1) − D(t+1))^T B(t) | B(t) = x] ≤ ((Σ_i σ_i) − 1) × (weight of the maximum weight matching under x).
Since the maximum matching weight grows without bound with the state, this term can be made as negative as desired by taking x large enough. Thus the negative drift condition holds except for a finite number of states.
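The stability claim can also be seen numerically. The following rough simulation sketch is my own illustration (the switch size, rate matrix, and horizon are arbitrary choices); it uses SciPy's assignment solver as the maximum weight matching and serves one packet from each matched non-empty queue:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

N, slots = 8, 20000
rng = np.random.default_rng(1)
rates = np.full((N, N), 0.9 / N)             # row sums = column sums = 0.9 < 1
Q = np.zeros((N, N), dtype=int)              # Q[i, j] = backlog of pair (i, j)
max_backlog = 0
for _ in range(slots):
    Q += (rng.random((N, N)) < rates)        # Bernoulli arrivals
    rows, cols = linear_sum_assignment(Q, maximize=True)   # max weight matching
    for i, j in zip(rows, cols):
        if Q[i, j] > 0:
            Q[i, j] -= 1                     # serve one packet per matched pair
    max_backlog = max(max_backlog, Q.sum())
print("largest total backlog seen:", max_backlog)   # stays bounded under admissible load
```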
Performance Difference Between Input and Output Queues
Note that the input queued switch has additional rate constraints at the outputs compared with the output queued switch; for output line stability those constraints eventually arise in output queued switches as well. Throughput-wise, under maximum weighted matching, input and output queued switches therefore perform similarly. The delay performances can differ because of the different nature of the constraints.
Can an input queued switch emulate an output queued switch? (See Lecture 12, TCOM 799, Fall 01.)

Lecture 13

Computationally Simple Algorithms for Maximum Possible Throughput in Input Queued Switches

Implementational complexity
Maximum weighted matching requires the arbiter to know the instantaneous queue lengths and then to compute the maximum weighted matching. Will there be a loss in throughput if the scheduling follows the maximum weighted matching computed in a previous slot? No, as long as the delay is finite and the maximum number of arrivals per slot is upper bounded.
The proof of throughput optimality uses the properties of the maximum weight matching only in evaluating
E[(D(t+1))^T B(t) | B(t) = x] = weight of the maximum weight matching under x.
If D(t+1) is not a maximum weight matching, then (D(t+1))^T B(t) differs from the weight of the maximum weight matching by at most a constant, provided the delay is finite and the maximum number of arrivals in any slot is bounded. The result follows.

Computational complexity
A maximum weighted matching can therefore be computed once in a certain interval and used throughout the interval without any loss in throughput.
Maximum weighted matchings in bipartite graphs can be computed in O(N³ log N) time.

Other low complexity matchings
Maximum size matchings: can give low throughput for nonuniform arrival rates.
Computation can become simpler for maximum weight matchings whose weights are no longer the queue lengths.
Reference: A. Mekkittikul and N. McKeown, "A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches," IEEE INFOCOM '98, Vol. 2, pp. 792-799, April 1998, San Francisco.
Choose a maximum weight matching where the weight of a pair is now the sum of the queue lengths at the input and the output ports of the pair. Such a matching is also a maximum size matching, and simpler algorithms for computing maximum size matchings can be used to compute this maximum weight matching in O(N^2.5).

Proof of optimality
Consider a symmetric N² × N² matrix T such that the maximum over matchings M of M^T T B(t) is not upper bounded by a constant as the queue lengths grow. Then the scheduling policy that, in every slot, schedules the matching M maximizing M^T T B(t) is also throughput optimal.
T can be chosen suitably to reduce the complexity of computing such an optimal matching. Examples of T:
- T = identity matrix: maximum weight matching with weight = queue length.
- T_kl = 2 if k = l; 1 if ⌊k/N⌋ = ⌊l/N⌋ (same input port); 1 if k mod N = l mod N (same output port); 0 otherwise: maximum weight matching where the weight of a pair is the sum of the queue lengths at its input and output ports.
Can there be a T that represents maximum size matching?
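A quick numerical check of the second example of T (my own sketch; the switch size, queue matrix, and matching below are arbitrary): with that T, M^T T B equals the sum, over the matched pairs, of the corresponding input-port and output-port backlogs.

```python
import numpy as np

N = 4
rng = np.random.default_rng(0)
B = rng.integers(0, 5, size=(N, N))          # B[i, j] = backlog of pair (i, j)
b = B.reshape(-1).astype(float)              # flattened queue-length vector

# Build the N^2 x N^2 matrix T described above.
T = np.zeros((N * N, N * N))
for k in range(N * N):
    for l in range(N * N):
        if k == l:
            T[k, l] = 2
        elif k // N == l // N:               # same input port
            T[k, l] = 1
        elif k % N == l % N:                 # same output port
            T[k, l] = 1

perm = rng.permutation(N)                    # a matching: input i -> output perm[i]
m = np.zeros(N * N)
for i in range(N):
    m[i * N + perm[i]] = 1

lhs = m @ T @ b
rhs = sum(B[i, :].sum() + B[:, perm[i]].sum() for i in range(N))
print(lhs, rhs)                              # the two values agree
```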
Proof of optimality
The function V(x) we choose is x^T T x. Since T is symmetric,
V(B(t+1)) − V(B(t)) = (B(t+1) − B(t))^T T (B(t+1) + B(t))
  = (A(t+1) − D(t+1))^T T (A(t+1) − D(t+1) + 2B(t))
  = (A(t+1) − D(t+1))^T T (A(t+1) − D(t+1)) + 2 (A(t+1) − D(t+1))^T T B(t),
E[V(B(t+1)) − V(B(t)) | B(t) = x]
  = E[(A(t+1) − D(t+1))^T T (A(t+1) − D(t+1)) | B(t) = x] + 2 E[(A(t+1) − D(t+1))^T T B(t) | B(t) = x].
The first term can be upper bounded by a positive constant, since the expected arrivals in slot t+1 are finite and independent of the queue lengths in slot t, and the departures from each pair are at most 1 and at least 0.
For the second term,
E[(A(t+1))^T T B(t) | B(t) = x] = a^T T x, with a = Σ_i σ_i M_i (using the rate constraints), where the M_i are matching vectors, σ_i ≥ 0 and Σ_i σ_i < 1.
a^T T x = Σ_i σ_i (M_i^T T x) ≤ (Σ_i σ_i) × max over matchings M of M^T T x.
E[(D(t+1))^T T B(t) | B(t) = x] = max over matchings M of M^T T x.
Hence E[(A(t+1) − D(t+1))^T T B(t) | B(t) = x] ≤ ((Σ_i σ_i) − 1) × max over matchings M of M^T T x, which can be made as negative as desired for suitable x, and the result holds.

Linear complexity algorithms
Reference: L. Tassiulas, "Linear complexity algorithms for maximum throughput in radio networks and input queued switches," IEEE INFOCOM '98, Vol. 2, April 1998, San Francisco.

Randomized algorithms
Choose the schedule randomly, with a probability distribution that depends on the queue lengths. Maximum weighted matching chooses the matching M that attains the maximum over matchings M of M^T x when the queue length vector is x. The randomized algorithm instead chooses schedules at random, with a distribution under which the probability of picking the maximum weighted matching is at least ε, for some ε between 0 and 1.
Example choice: include each input-output pair independently with probability 0.5; if the resulting set of pairs is not a matching, do not include any edge (use the empty schedule). The probability of picking any particular matching is then 2^(−N²).
Let this randomly generated schedule be I, and set
I(t) = I if B(t)^T I ≥ B(t)^T I(t−1), and I(t) = I(t−1) otherwise.
The policy schedules I(t) in each slot. The Markov process representation is Y(t) = (B(t), I(t)), with B(t+1) = B(t) + A(t+1) − I(t).
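A minimal sketch of this randomized rule (my own illustration, using the "include each pair with probability 0.5" candidate generator described above; Q is the matrix of queue lengths):

```python
import random

def random_candidate(N, rng):
    """Include each (input, output) pair independently with probability 0.5;
    if the resulting set of pairs is not a matching, return the empty schedule."""
    edges = [(i, j) for i in range(N) for j in range(N) if rng.random() < 0.5]
    if len({i for i, _ in edges}) < len(edges) or len({j for _, j in edges}) < len(edges):
        return []
    return edges

def weight(schedule, Q):
    """Weight of a schedule under the current queue lengths Q (B(t)^T I)."""
    return sum(Q[i][j] for i, j in schedule)

def next_schedule(prev, Q, rng):
    """Keep the previous matching unless the fresh random candidate has at
    least as large a weight under the current queue lengths."""
    cand = random_candidate(len(Q), rng)
    return cand if weight(cand, Q) >= weight(prev, Q) else prev
```

Each slot the switch would apply arrivals, call next_schedule once (constant work per edge, no matching computation), and serve one packet from every matched non-empty queue.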
Proof of optimality
Let I_B(t) denote the maximum weighted matching for the queue lengths B(t). Take
V(Y) = V1(Y) + V2(Y), with V1(Y) = Σ_i b_i² and V2(Y) = ((I_B − I)^T B)².
For V1,
V1(Y(t+1)) − V1(Y(t)) = (B(t+1) − B(t))^T (B(t+1) + B(t))
  = (A(t+1) − I(t))^T (A(t+1) − I(t) + 2B(t))
  = (A(t+1) − I(t))^T (A(t+1) − I(t)) + 2 (A(t+1) − I(t))^T B(t),
and the cross term splits as
(A(t+1) − I(t))^T B(t) = (A(t+1) − I_B(t))^T B(t) + (I_B(t) − I(t))^T B(t).
As before, E[(A(t+1) − I_B(t))^T B(t) | Y(t) = Y] ≤ ((Σ_i σ_i) − 1) × (weight of the maximum weight matching under B(t)) ≤ (1/N)((Σ_i σ_i) − 1) √V1(Y),
(I_B(t) − I(t))^T B(t) = √V2(Y), and
E[(A(t+1) − I(t))^T (A(t+1) − I(t)) | Y(t)] ≤ constant.
Therefore
E[V1(Y(t+1)) − V1(Y(t)) | Y(t)] ≤ 2(1/N)((Σ_i σ_i) − 1) √V1(Y) + 2 √V2(Y) + constant.
For V2,
E[V2(Y(t+1)) | Y(t) = Y] = 0 · P[I(t+1) = I_B(t+1)] + E[V2(Y(t+1)) | Y(t) = Y, I(t+1) ≠ I_B(t+1)] · P[I(t+1) ≠ I_B(t+1)]
  ≤ (1 − ε) E[((I_B(t+1) − I(t+1))^T B(t+1))² | Y(t), I(t+1) ≠ I_B(t+1)].
Expanding B(t+1) = B(t) + A(t+1) − I(t),
E[((I_B(t+1) − I(t+1))^T B(t+1))² | Y(t), I(t+1) ≠ I_B(t+1)]
  = E[((I_B(t+1))^T B(t) − (I(t+1))^T B(t) + (I_B(t+1) − I(t+1))^T (A(t+1) − I(t)))² | Y(t), I(t+1) ≠ I_B(t+1)].
Since (I_B(t+1))^T B(t) ≤ (I_B(t))^T B(t) and (I(t+1))^T B(t) ≥ (I(t))^T B(t), this is at most
E[((I_B(t))^T B(t) − (I(t))^T B(t) + (I_B(t+1) − I(t+1))^T (A(t+1) − I(t)))² | Y(t), I(t+1) ≠ I_B(t+1)]
  = E[((I_B(t) − I(t))^T B(t))² | ·] + E[((I_B(t+1) − I(t+1))^T (A(t+1) − I(t)))² | ·]
    + 2 E[((I_B(t) − I(t))^T B(t)) × ((I_B(t+1) − I(t+1))^T (A(t+1) − I(t))) | ·],
where
E[((I_B(t) − I(t))^T B(t))² | ·] = ((I_B(t) − I(t))^T B(t))² = V2(Y),
E[((I_B(t+1) − I(t+1))^T (A(t+1) − I(t)))² | ·] ≤ constant, and
E[((I_B(t) − I(t))^T B(t)) × ((I_B(t+1) − I(t+1))^T (A(t+1) − I(t))) | ·] ≤ constant × ((I_B(t) − I(t))^T B(t)) = constant × √V2(Y).
Hence
E[((I_B(t+1) − I(t+1))^T B(t+1))² | Y(t), I(t+1) ≠ I_B(t+1)] ≤ V2(Y) + constant + constant × √V2(Y),
E[V2(Y(t+1)) | Y(t) = Y] ≤ (1 − ε)(V2(Y) + constant + constant × √V2(Y)),
E[V2(Y(t+1)) − V2(Y(t)) | Y(t) = Y] ≤ −ε V2(Y) + (1 − ε)(constant + constant × √V2(Y)).
Combining the two drifts,
E[V(Y(t+1)) − V(Y(t)) | Y(t)] ≤ 2(1/N)((Σ_i σ_i) − 1) √V1(Y) − ε V2(Y) + constant × √V2(Y) + constant,
which is negative outside a finite set of states, so the negative drift condition holds for the randomized policy as well.

Shared Memory Switches: resource management constraints
Packets are stored in the switch fabric: as soon as packets arrive at the input, they are read into the switch memory and stored there. Each output line serves the packets intended for it as and when it is available.
There are N logical queues, and the output lines serve their packets independently of each other, so scheduling among the outputs is not an issue. The different queues, however, share the same physical memory, so memory management rather than scheduling is the issue.
How does memory management affect performance?
Performance metric: throughput, or equivalently packet drops.
Let N = 2, and suppose the entire switch memory consists of packets for output 1. Packets are then cleared from the memory at rate 1 per slot; if the memory held packets for both outputs, packets would be cleared at rate 2. Load balancing reduces packet drops, and the memory can be managed to balance the load.

Memory management options
When a packet arrives: (a) it can be accepted; (b) it can be rejected; (c) it can be accepted while dropping some other packet (pushout).
The objective is to choose the course of action that minimizes packet drops. Note that some architectures do not allow pushout.

Optimal memory management in the presence of pushout
Reference: I. Cidon, L. Georgiadis, R. Guerin, A. Khamisy, "Optimal Buffer Sharing," IEEE JSAC, Vol. 13, No. 7, September 1995.

Optimal strategy for N = 2
The optimal memory management strategy accepts packets whenever the buffer has space, without any pushouts. If the buffer is full, the arrival for output port j is accepted, pushing out a packet of the other port, if the number of packets for port j is below a certain threshold, j = 1, 2.
The thresholds of the two ports can differ, but their sum equals the total memory B. (Can the queue lengths of both ports be above their respective thresholds?)
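A small sketch of this two-port pushout rule (my own illustration; the threshold values are parameters, with the lecture taking their sum equal to B):

```python
def admit_with_pushout(x, arrival_port, B, thresholds):
    """x = [x1, x2] are the current queue lengths, arrival_port is 0 or 1.
    Accept whenever the shared buffer has space; if it is full, accept by
    pushing out a packet of the other port only when the arrival's queue is
    below its threshold. Returns the updated queue lengths."""
    other = 1 - arrival_port
    if x[0] + x[1] < B:                          # buffer has space: always accept
        x[arrival_port] += 1
    elif x[arrival_port] < thresholds[arrival_port] and x[other] > 0:
        x[other] -= 1                            # push out a packet of the other port
        x[arrival_port] += 1
    # otherwise the arrival is dropped
    return x
```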
If the service rates are equal, the threshold is lower than B/2 for the port with the higher arrival rate.

Optimal strategy for arbitrary N
Known for identical arrival and transmission rates at all ports: accept a packet if the buffer is not full; if the buffer is full, reject any arrival for the largest queue, and accept arrivals for any other port while dropping packets from the largest queue. The proofs involve Markov decision processes.

Lecture 4

Optimal policy when pushout is not allowed
Reference: G. Foschini and B. Gopinath, "Sharing Memory Optimally," IEEE Transactions on Communications, Vol. 31, No. 3, March 1983.
Load balancing still reduces packet drops. Since pushout is not allowed, the load can be balanced by rejecting new arrivals to overloaded queues even when the buffer has space.

Optimal strategy for N = 2
Accept a packet for port j if (a) the buffer has space and (b) the number of packets waiting for port j is less than a threshold m_j. The policy thus reserves B − m_1 memory units for output 2 and B − m_2 memory units for output 1.

Optimal strategy for N = 3
Accept a packet for port j if (a) the buffer has space, (b) the number of packets waiting for port j is less than a threshold m_j, and (c) for every pair of ports i and j, the total number of packets waiting for the pair stays at most a threshold m_ij even after accepting the packet. The policy reserves memory units for individual outputs as well as for their combinations.

Proof of optimality
In general, every such policy has a set of admissible states and admits a packet only if doing so does not move the system out of the set. The objective is to find the set that minimizes blocking.
The system can be modeled by a continuous-time Markov chain if arrivals are Poisson and service times exponential, with the state vector consisting of the queue lengths of the individual queues.
System model: queue j has Poisson arrivals at rate λ_j and exponential service at rate μ_j; its utilization is ρ_j = λ_j / μ_j.
Steady state distribution over the admissible set S (by reversibility):
π(a_1, ..., a_N) = ρ_1^{a_1} ⋯ ρ_N^{a_N} / Σ_{(b_1, ..., b_N) ∈ S} ρ_1^{b_1} ⋯ ρ_N^{b_N}.

Lecture 8

Optimal memory management in the presence of pushout
Queue j has Poisson arrivals at rate λ_j and exponential service at rate μ_j. Loss minimization is equivalent to throughput maximization: whenever a packet is served, the system earns a reward of 1 unit, and the objective is to maximize the total reward.
System state: the queue length vector x = (x_1, x_2, ..., x_N). Control action: the memory management decision. Let e_j denote the vector with 1 in the jth position and 0 elsewhere, and let I(x_j > 0) be 1 if x_j > 0 and 0 otherwise.
The value function satisfies a dynamic programming recursion of the form
J(x) = Σ_j μ_j I(x_j > 0) + Σ_i λ_i max( J(x), J(x + e_i), max over nonempty queues j of J(x + e_i − e_j) )  if Σ_i x_i < B,
J(x) = Σ_j μ_j I(x_j > 0) + Σ_i λ_i max( J(x), max over nonempty queues j of J(x + e_i − e_j) )  if Σ_i x_i = B,
where the inner maximum reflects the options of rejecting the arrival, accepting it, or accepting it while pushing out a packet of another queue.

Optimal strategy for N = 2
Properties of the cost function J(x, y):
Monotonicity and boundedness in x: 0 ≤ J(x+1, y) − J(x, y) ≤ 1, 0 ≤ x ≤ B − 1.
Monotonicity and boundedness in y: 0 ≤ J(x, y+1) − J(x, y) ≤ 1, 0 ≤ y ≤ B − 1.
Concavity along x: J(x+1, y) − J(x, y) ≤ J(x, y) − J(x−1, y), 1 ≤ x ≤ B − 1.
Concavity along y: J(x, y+1) − J(x, y) ≤ J(x, y) − J(x, y−1), 1 ≤ y ≤ B − 1.
Concavity along the line x + y = b, 2 ≤ b ≤ B: J(x+1, y−1) − J(x, y) ≤ J(x, y) − J(x−1, y+1), 1 ≤ x ≤ B − 1, 1 ≤ y ≤ B − 1.

Derivation of the optimal strategy from the properties
Monotonicity and boundedness imply that the optimal decision is to accept a packet for a port whenever the buffer has space. The concavity property along the line x + y = B implies that J(x, y) has either a unique maximum on that line or two consecutive maxima; these correspond to the replacement (pushout) thresholds.

Nature of the thresholds
If λ_1 ≥ λ_2 and μ_1 = μ_2, then J(x, y) ≥ J(y, x) for x ≤ y.

Approximate computation of the thresholds
J(x, y) ≈ f(x, y) = d − c_1 σ_1^x − c_2 σ_2^y, where d and the c_j are constants determined by the loads and the σ_j are roots of a quadratic equation (the expressions are not reproduced here). The maximizer of f(x, B − x) gives the thresholds.
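As a sketch of that last step (my own illustration; d, c_1, c_2, σ_1, σ_2 below are placeholder values, not the expressions from the paper):

```python
B = 20
d, c1, c2 = 2.0, 1.0, 1.0          # assumed constants, for illustration only
sigma1, sigma2 = 0.8, 0.6          # assumed roots of the quadratic

def f(x, y):
    """Approximation J(x, y) ~ d - c1*sigma1**x - c2*sigma2**y from above."""
    return d - c1 * sigma1 ** x - c2 * sigma2 ** y

# Maximize f along the full-buffer line x + y = B to get the two thresholds.
x_star = max(range(B + 1), key=lambda x: f(x, B - x))
print("thresholds:", x_star, B - x_star)
```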
Optimal strategy for general N with symmetric arrival rates
Properties of the cost function J(x):
Monotonicity and boundedness: 0 ≤ J(x + e_j) − J(x) ≤ 1.
Symmetry: J(x) = J(y) whenever y is a permutation of x.
Balancing: J(x) ≤ J(x + e_i − e_j) if the ith component of x is less than the jth component.
Hence the optimal rule is to drop from the longest queue.
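A small sketch of the resulting rule (my own illustration):

```python
def admit_symmetric(queues, port, B):
    """Symmetric-rate rule sketched above: accept if the shared buffer has
    space; if it is full, admit the arrival only by pushing out a packet from
    a longest queue, so arrivals to a longest queue itself are rejected."""
    if sum(queues) < B:
        queues[port] += 1
    else:
        longest = max(range(len(queues)), key=lambda j: queues[j])
        if queues[port] < queues[longest]:
            queues[longest] -= 1             # drop from the longest queue
            queues[port] += 1
    return queues
```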