A Study of Bandwidth-sharing Mechanisms in Connection-oriented Networks Ph.D. Dissertation presented by Xiangfei Zhu Department of Computer Science University of Virginia Feb 19, 2008 Outline Quick overview Motivation Proposed mechanisms Hypothesis and Metrics Contributions and Publications BA-n BA-First VBDS Immediate-request Related work Summary 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 2 Hypothesis Well-designed algorithms employing immediate-request and book-ahead bandwidth-sharing mechanisms will lead to efficient utilization of modern high-speed connection-oriented networks 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 3 Metrics Service provider-oriented metrics User-oriented metrics Utilization Always possible to achieve high utilization if there are no user-oriented performance requirements Call blocking probability: book-ahead mechanisms for session-type requests Delay: book-ahead mechanisms for data-type requests Combined metrics 2/19/2008 Session type: express call blocking probability as a function of utilization Data type: express mean transfer delay as a function of utilization Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 4 Key contributions Two book-ahead mechanisms for session-type requests A book-ahead mechanism for data-type requests Overcomes a disadvantage of using circuit-switched networks for file transfers (when compared to packet switching) Design and deployment of a wide-area, high-speed, optical dynamic circuit network Analytical and simulation models for these two schemes Models can be used as tools to test design choices and parameter values Demonstrated the readiness of off-the-shelf switches for actual service offerings Measurements of actual end-to-end call setup delays and per-switch processing delays 2/19/2008 Useful to other researchers for modeling purposes Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 5 Publications X. Zhu and M. Veeraraghavan, " Analysis and Design of Book-ahead Bandwidth-Sharing Mechanisms," accepted by the IEEE Transactions on Communications (TCOM). X. Zhu, M. E. McGinley, T. Li, and M. Veeraraghavan, "An Analytical Model for a Bookahead Bandwidth Scheduler," Proc. of IEEE Global Telecommunications Conference (Globecom) 2007, Washington, DC, Nov. 2007. X. Zhu, X. Zheng, and M. Veeraraghava, "Experiences in implementing an experimental wide-area GMPLS network," IEEE Journal on Selected Areas in Communications (JSAC), vol. 25, pp. 82-92, Apr. 2007. X. Zhu, X. Zheng, M. Veeraraghavan, Z. Li, Q. Song, I. Habib, and N. S. V. Rao, “Implementation of a GMPLS-based network with end host initiated signaling,” in Proc. Of IEEE International Conference on Communications (ICC) 2006, Istanbul, Turkey, Jun. 2006. 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 6 Why the renewed interest in connectionoriented networks? Internet – connectionless packet-switching Pros: efficient (high utilization) Cons: low quality of service (bandwidth, delay, jitter, etc. ) Resurgence of interests in connection-oriented networks: Top-down driver: large-team scientific projects require predictable high-speed network services Bottom-up driver: advances in optical circuit-switching technologies Terascale Supernova Initiative (TSI) http://www.phy.ornl.gov/tsi/ Various connection-oriented testbeds are being deployed around the world NSF Experimental Infrastructure Network (EIN) program ESnet4 (US), CA*net4 (Canada), UKLight (UK), SURFnet (Netherlands), JGN2 (Japan) Internet2 Dynamic Circuit network Large Hadron Collider (LHC) http://www.phys.ufl.edu/~matchev/LHCJC/lhc.html 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 7 Internet2 deployment of Dynamic Circuit network IP Network Dynamic Circuit Network 2/19/2008 Backbone picture reprinted from http://www.internet2.edu/pubs/networkmap.pdf Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 8 Why revisit the topic of bandwidth sharing in connection-oriented networks? Immediate request Leased lines Better service quality Existing mechanisms Better utilization Immediate-request (IR) mode: used in the telephone network Leased-line mode: used in high-speed connection-oriented networks, such as SONET and WDM Can these mechanisms be used in connection-oriented networks in new context (high-speed + new apps)? IR mode: cannot achieve high utilization with low call blocking probability when channel density is low Channel density in the telephone network is on the order of 100 or more Channel density in high-speed testbeds is on the order of 10 Leased-line mode: poor temporal sharing, expensive and inefficient Cannot be used because the number of universities involved in these projects is large New bandwidth-sharing mechanisms are needed! 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 9 What mechanisms exist for sharing resources in other contexts? Reservation systems Reservation phase before resource usage e.g., book flight tickets, make medical appointments, etc. Queueing systems On-demand service e.g., bank teller, grocery store checkout, etc. Two types of queueing system based on waiting space Bufferless queueing – no waiting space Buffered queueing – has waiting space 2/19/2008 e.g., street parking e.g., bank teller, grocery store checkout Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 10 Are these mechanisms suitable for bandwidth sharing? Reservation systems Yes, book-ahead mode Queueing systems Bufferless queueing – Yes, immediate-request call-blocking mode Buffered queueing – No H1 H3 H4 H5 X1 X2 X3 H7 H8 idle idle H6 2/19/2008 H2 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 11 Two types of book-ahead systems Classification based on request type Session-type requests Specify desired bandwidth and duration e.g., remote visualization and remote instrument control Data-type requests Specify size of data to be transferred e.g., file transfers 2/19/2008 File size known at the sending end Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 12 Proposed mechanisms Bandwidth Sharing in high-speed connection-oriented networks Book-ahead Immediate-request high per-channel rate Low-to-moderate per-channel rate Deployed a testbed VBDS BA-n/BA-First (Varying-Bandwidth Delayed Start) session-type requests data-type requests Simulation model BA-n BA-First Users specify a set of call-initiation time options Users accept any callinitiation time Analytical model Analytical model Simulation model Simulation model Comparison with IR Comparison with IR Published in TCOM 2/19/2008 Comparison with packet switching Implemented software Measured call-setup delays Published in JSAC Published in ICC Published in Globecom Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 13 Analytical model for the BA-n scheme A call specifies: - Bandwidth: 1 channel - Holding time: H timeslots Assumptions: Call arrival process is Poisson - Set of n call-initiation times: {s1, s2,…, sn} scheduler X Switch1 m channels X Switch2 Channel available for H timeslots starting at any one of the n call-initiation times? Yes, accept request No, reject request 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 14 Discrete-time Markov Chain model (x1, x2, …xK) xi: number of reserved channels in the ith interval 0≤xi ≤ m Challenges K: reservation window in timeslots System state: vector X with K components (x1, x2, …xK) m: link capacity in channels Non-homogeneous system Transition rates at time interval boundaries are infinite, but finite at other times Mixed system Call arrival process: continuous Call holding time: discrete A user can reserve any timeslots in the reservation window Key insights Embedded DTMC at time interval boundaries Discretize time into very “small” timeslots to use geometric distribution to approximate (exponential) call interarrival time distribution 2/19/2008 Timeslots should be small enough to make the probability of more than 1 call arriving in a timeslot negligible Any call arrival rate can be downgraded to a small call arrival rate by changing the time unit e.g., 36 call/hour -> 0.01 call/second Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 15 Simulation model Limitation of the analytical model – does not scale with m Recall that the state space is defined as Size of the state space: (m+1)K Simulation model Support larger values of m Relax assumptions used in the analytical model 2/19/2008 Call-initiation time options: uniform distribution → bell-shaped distribution Per-call bandwidth: single channel → multi channels Path length: single link → multi links Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 16 Model validation and verification Model validation Our models are for an initial design and implementation of BA systems Therefore, no real-world measurements Model validation technique – peer/expert reviews Real system measurements “available” for input parameters Example: Real-system measurements for telephony applications - Poisson call arrival process Same pattern likely in video-conference calls “Three aspects of model validation Assumptions Input parameter values and distributions Output values and conclusions” [Jain91] “Three validation techniques Expert intuition Real system measurements Theoretical results” [Jain91] “Qualitative validation has to be used when adequate acceptable real world data do not exist to permit quantitative validation and is based mainly on SME (Subject Matter Expert) and peer view” [Pace02] Model verification Compare analytical model results with simulation model results [Jain91] R. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, New York, Wiley-Interscience, 1991. [Pace02] D. K. Pace and J. Sheehan, “Subject matter expert (SME)/peer use in m&s v&v,” in Proc. of the Foundations, Lauarel, MD, Oct. 2002. 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 17 Key results from the BA-n study BA-n scheme outperforms IR scheme when per-channel rate is high e.g., when m=10 With the IR mode, high utilization achievable but at a cost With the BA-3 mode, high utilization achievable with low call blocking probability 0.1% call blocking probability at 80% utilization 2% call blocking probability at 90% utilization Reservation window size (K) dependence on call holding time (H) 23% call blocking probability at 80% utilization 46% call blocking probability at 90% utilization K/H does not need to be large e.g., when m=10, to achieve 90% utilization with 2% call blocking probability, K=4H. Multi-link scenario BA-n scheme outperforms IR Fairness achieved with “trunk reservation” 2/19/2008 Between long-path and short-path calls Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 18 Roadmap Bandwidth Sharing in high-speed connection-oriented networks Book-ahead Immediate-request high per-channel rate Low-to-moderate per-channel rate Deployed a testbed BA-n/BA-First VBDS session-type requests data-type requests Simulation model BA-n BA-First calls specify a set of callinitiation time options calls accept any callinitiation time Analytical model Analytical model Simulation model Simulation model Comparison with IR Comparison with IR Published in TCOM Published in Globecom 2/19/2008 Comparison with packet switching Ph.D. Dissertation Defense Department of Computer Science, University of Virginia Implemented software Measured call-setup delays Published in JSAC Published in ICC 19 Analytical model for the BA-First scheme A call specifies: - Bandwidth: 1 channel - Holding time: H timeslots 1 timeslot Assumptions: Call arrival process is Poisson - Set of n call-initiation times: {s1, s2,…, sn} scheduler X Switch1 - Any call-initiation time m channels X Switch2 Is a channel available in the entire reservation window? Yes, accept request No, reject request 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 20 System state m Call arrivals n Use “bins” to represent reservation intervals If the ith bin is not full, all bins after it must be empty The system state is expressed as a 2-tuple (i, n) 2/19/2008 i – index of the first bin that is not full n – number of reserved channels in the ith bin A special case is (K, m), which denotes the state in which all bins are full Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 21 CTMC m Call Call arrivalsarrivals n The state of the system changes in two cases A call arrives: A time-interval boundary is encountered e.g., (i, n)->(i-1, n) if i>1 The model is a CTMC but it is non-homogeneous e.g., (i, n)->(i, n+1) if n<m-1; (i, n)->(i+1, 0) if n=m-1 and i<K The system behavior at the timer-interval boundaries is different from its behavior at other times There is an embedded time-homogeneous DTMC if we only look at the system at the time-interval boundaries 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 22 Embedded DTMC j Call Call arrivalsarrivals m n q The transition probability can be calculated by counting the number of calls (denoted by a) that arrived in the past time interval, and calculating the probability that a calls arrive in a interval A: number of call arrivals in the current interval FA(a) is the Cumulative Distribution Function of A GA(a) is the Probability Mass Function of A The transition probability from state (i, n) to state (j, q) is 2/19/2008 1-FA(mK-1) GA(m(j-1)+q) 1-FA(m(K-i+1)+m-n-1) GA(m(j-i+1)+q-n) if i=1 & (j,q)=(K,m), i.e., mK or more calls arrived if i=1 & (j,q)≠(K,m), i.e., m(j-1)+q calls arrived if i≠1 & (j,q)=(K,m), i.e., m(K-i+1)+m-n or more calls arrived if i≠1 & (j,q)≠(K,m), i.e., m(j-i+1)+q-n calls arrived Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 23 Performance metrics m Call arrivals n fractional part integral part Call blocking probability Link utilization Mean scheduling delay - two parts 2/19/2008 Integral part: number of intervals before scheduled service interval Fractional part: delay within the arrival interval Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 24 Use of the model - Test design choices and parameter values IR v.s. BA schemes Example To achieve a 90% utilization with a call blocking probability less than 10% To achieve a 90% utilization with a call blocking probability less than 20% 2/19/2008 BA-First schemes are needed when m<59 BA-First schemes are needed when m<32 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 25 Use of the model - Select an appropriate reservation window size Parameters: To run a system at 100% offered load with a 4% or less call blocking probability 1) link capacity in channels, m = 2 or 8 2) reservation window size, K = 2, 4, 8, or 16 If m=2, K should be 8 time units If m=8, the number is only 4 time units Is larger value of K always better? If m=8, call blocking probability and utilization plots for K=4, 8 and 16 overlap But mean scheduling delay increases significantly as K increases Increasing reservation window size beyond a certain level is actually detrimental to system performance! 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 26 Use of the model - An approximate solution for M/D/m/p system Solutions exist for M/D/1, M/D/m (approximation) systems No existing solution for M/D/m/p system BA-First model (m, K) ≈ M/D/m/m(K+1) queuing model at moderate-to-high loads Why? Call-arrival process: both Poisson Call holding time: both deterministic Reservation window is effectively “waiting space” 1/2 fractional part 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 27 Key results from the BA-First study We modeled the BA-First mechanism using a non-homogeneous CTMC We extracted an embedded DTMC and solved it for steady-state probabilities We obtained solutions for metrics such as call blocking probability, link utilization, and mean scheduling delay We demonstrated the use of the model as a design tool for bookahead systems We demonstrated the use of the model as a solution for M/D/m/p queueing system at moderate-to-high loads 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 28 Roadmap Bandwidth Sharing in high-speed connection-oriented networks Book-ahead Immediate-request high per-channel rate Low-to-moderate per-channel rate Deployed a testbed BA-n/BA-First VBDS session-type requests data-type requests Simulation model BA-n BA-First calls specify a set of callinitiation time options calls accept any callinitiation time Analytical model Analytical model Simulation model Simulation model Comparison with IR Comparison with IR Published in TCOM Published in Globecom 2/19/2008 Comparison with packet switching Ph.D. Dissertation Defense Department of Computer Science, University of Virginia Implemented software Measured call-setup delays Published in JSAC Published in ICC 29 Book-ahead scheme for data-type requests Data-type requests: specify size of data to be transferred Drawback of using circuits for file transfers With fixed-bandwidth allocation, file transfers cannot take advantage of bandwidth freed by the completion of other transfers File transfer request =(File Size, Maximum rate, [Requested start time]) Can be provided by file server Limited by various constraints at end hosts, such as disk-access speed Fixed-Bandwidth Delayed Start (FBDS) Fixed-bandwidth allocation with rate set to maximum rate Varying-Bandwidth Delayed Start (VBDS) 2/19/2008 Assign different bandwidth levels for different time ranges Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 30 VBDS Idea of VBDS Upon receiving a reservation request, VBDS scheduler returns a TimeRange-Channel (TRC) vector {(Bk, Ek, Ck, k=1,2,…)} Bk: start time of the kth time range Ek: end time of the kth time range Ck: set of channels allocated to the transfer in the kth time range Scheduler maintains channel-availability function γ(t) Cost of VBDS Switches need to be reprogrammed multiple times within a transfer Switch programming time is considered in the analysis Switches need to maintain channel availability function Reduce the number of changes in channel-availability function 2/19/2008 Discretize time Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 31 Channel availability γ(t) VBDS example 4 3 2 1 0 20 30 40 50 60 70 80 … ∞ Time Assumptions: 10 4-channel link with per-channel rate 10Gbps Unit of time discretization: 100ms Switch programming time: 1 unit A file transfer request specifies (5GB, 20Gbps, 50) 2/19/2008 (50, 60, {4}) – 1.125GB (60, 70, {2, 4}) – 2.375GB (70, 75, {2, 4}) – 1.5GB (File Size, Maximum rate, Requested start time) Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 32 Numeric results Compare VBDS, FBDS, and Packet switching (PS) Normalized Delay Average throughput 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 33 Key results from the VBDS study Circuit-switched network with VBDS achieves similar performance as packetswitched networks for moderate-to-large files VBDS favors large files when compared to packet switching Packet switching: newly arriving transfers “cut in” VBDS: Not so. Allocated bandwidth remains dedicated to ongoing transfers We do not recommend using circuit-switched network for small files Significant: at high speeds, circuit switching cost << packet switching cost Scheduling and circuit setup overheads Cost 2/19/2008 Circuit switching: setup overhead (unsuitable for small files) Packet switching: congestion control algorithm (lower throughput for moderate-to-large files) Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 34 Roadmap Bandwidth Sharing in high-speed connection-oriented networks Book-ahead Immediate-request high per-channel rate Low-to-moderate per-channel rate Deployed a testbed BA-n/BA-First VBDS session-type requests data-type requests Simulation model BA-n BA-First calls specify a set of callinitiation time options calls accept any callinitiation time Analytical model Analytical model Simulation model Simulation model Comparison with IR Comparison with IR Published in TCOM Published in Globecom 2/19/2008 Comparison with packet switching Ph.D. Dissertation Defense Department of Computer Science, University of Virginia Implemented software Measured call-setup delays Published in JSAC Published in ICC 35 Immediate-request bandwidth sharing Deployed a wide-area experimental network with immediaterequest mode of bandwidth sharing - CHEETAH State-of-the-art in 2004 2/19/2008 Control-plane protocols are standardized by IETF - GMPLS protocol suite Vendors have implemented these protocols in high-speed optical circuit switches No deployed network uses these functions No signaling protocol client for end hosts to enable the creation of end-to-end high-speed circuits Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 36 CHEETAH network Switches: Sycamore SN16000 Intelligent Optical switch Robust implementation of GMPLS control-plane protocols Support standardized Ethernet-SONET mapping Oak Ridge, TN Raleigh, NC SN16000 GbE/ OC192 Control 10GbE card Card card To Cray X1 H zelda4 H zelda5 SN16000 GbE/ OC192 Control 10GbE card Card card H wukong Atlanta, GA OC-192 OC-192 SN16000 GbE/ OC192 Control 10GbE card Card card H zelda1 H zelda2 H zelda3 End hosts: general-purpose Linux PCs with two NICs and CHEETAH end-host software 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 37 IR mode of sharing in CHEETAH Designed and implemented an end-host software package based on the GMPLS architecture Stand-alone circuit request tools Integrated into applications such as Squid (an open-source web proxy software) Ran experiments of IR mode call setups and releases Measured end-to-end circuit setup delays and per-switch signaling message processing delays Measurements useful to other researchers for modeling purposes Demonstrated the readiness of off-the-shelf switches for actual service offerings 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 38 Related work Research papers on book-ahead bandwidth sharing File transfers Most of these papers use simulations None of them considers book-ahead calls with multiple acceptable options Our results show that a book-ahead mechanism that specifies only one callinitiation time may perform worse than an immediate-request mechanism List scheduling: all proposed algorithms use fixed allocations Bin packing: cannot break a block into pieces to fit into bins TCP improvements: determine fair share for a flow faster and more accurately, while we determine share for a flow during setup Optical connection-oriented testbeds 2/19/2008 e.g.: ESnet4, NSF DRAGON, CA*net4, UKLight, JGN2, etc. Focus: implementation & inter-domain usage Our work: mixed study of IR and BA; theoretical modeling + implementation Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 39 Summary High-speed connection-oriented networks should support a combination of bandwidth-sharing services For reservations that specify file size (large file transfers) For reservations that specify desired bandwidth and duration For serving as “wires” between switches to create networks that offer other services For video telephony, transfers of moderate-sized files New services: Existing services: BA-n/BA-First Leased lines VBDS Immediate request Better service quality 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia IP services Greater sharing 40 Future work Routing issue in reservation phase Currently assume a linear topology in multi-link scenarios Multiple route options should be exploited Distributed implementation Necessary for inter-domain scheduling Service providers do not share network topology information with each other Validate models against real measurements 2/19/2008 A long-term future work item after deployment & user base build-up Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 41 Questions from Form G111 Questions from Form G111 - Defining the problem In the context of new optical circuit-switched technologies and new application requirements, what bandwidth-sharing mechanisms can lead to efficient utilization of modern high-speed connection-oriented networks? 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 43 Questions from Form G111 - Analysis of previous and related work Research papers on book-ahead bandwidth sharing File transfers Most of these papers use simulations None of them considers book-ahead calls with multiple acceptable options Our results show that a book-ahead mechanism that specifies only one callinitiation time may perform worse than an immediate-request mechanism List scheduling: all proposed algorithms use fixed allocations Bin packing: cannot break a block into pieces to fit into bins TCP improvements: determine fair share for a flow faster and more accurately, while we determine share for a flow during setup Optical connection-oriented testbeds 2/19/2008 e.g.: ESnet4, NSF DRAGON, CA*net4, UKLight, JGN2, etc. Focus: implementation & inter-domain usage Our work: mixed study of IR and BA; theoretical modeling + implementation Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 44 Questions from Form G111 - Success criteria Has the student adequately defined the measure(s) of success to be used to evaluate the work? Is there a well defined metric with a goal? Does the metric adequately represent the desired success criteria? Success criteria Session-type BA requests Data-type BA requests At least the same performance as packet switching IR mode BA-n: better performance than IR BA-First: a model that scales to m>100 Stable network deployment and software implementation Metrics 2/19/2008 Session type: express call blocking probability as a function of utilization Data type: express mean transfer delay as a function of utilization Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 45 Questions from Form G111 - Solution Is the approach taken well executed? Does it appear to be correct? Is the work technically challenging? Does the student utilize appropriate professional standards? A combination of analytical, simulation, and experimental methods. Two book-ahead mechanisms for session-type requests One book-ahead mechanism for data-type requests A simulation model for this mechanism A wide-area testbed for experimental study of the immediate-request mechanism 2/19/2008 Analytical and simulation models for these mechanisms Testbed deployment Software implementation Experimentation and measurements Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 46 Questions from Form G111 - Innovation and risk Two new Markov chain models for book-ahead bandwidth-sharing schemes (first Markov chain models for book-ahead schemes) An approximate solution for the M/D/m/p queueing system One of the first deployments of a wide-area high-speed dynamic circuit network 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 47 Questions from Form G111 - Broader implications (Social, economic, political, technical, ethical, business, etc.) Demonstrated the readiness of off-the-shelf circuit switches for actual service offering (business and technical) Designed efficient bandwidth-sharing algorithms for highspeed connection-oriented networks Circuit switches are less complex than packet switches, which means 2/19/2008 Less expensive (economic) Consume less power (environmental) Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 48 Backup slides (Backup slides) BA-n - Example of the analytical model for the BA-n scheme Assumptions Link capacity m = 1 Advance-reservation horizon K = 3 Number of classes L = 2 Holding time for class-1 calls h1 = 1 Holding time for class-2 calls h2 = 2 Number of options n = 1 System transition happens at the end of each timeslot Example: state (0, 0, 1) # of reserved channels 1 Current time t 2/19/2008 t+k Time A call arrives and reserves the third timeslot -> state (0, 1, 1) Pr=(1/3)pr1 A call arrives and reserves the first timeslot -> state (1, 1, 0) Pr=(1/3)pr1 No call arrives or the arrived call is blocked -> state(0, 1, 0) Pr=1-(2/3)pr1 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 50 (Backup slides) BA-n – DTMC model – Transition probability matrix Define a left shift operator : If , Define a K-component vector The transition probability from state x to state y is 2/19/2008 . , where p: the probability that a call arrives during a time slot rj: the probability that an incoming call belongs to class j qi,j: the probability that a class-j call is admitted with a initiation time of the ith timeslot Bx: the probability that an incoming call is blocked when the system is in state x First row: a class-j call is admitted with an initiation time of the ith timeslot Second row: no call arrives or a call arrives but is blocked Third row: all other states Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 51 (Backup slides) BA-n – To compute qij and Bx Use Hypergeometric probability mass function calculate Bx and qi,j A large set of N elements, known to have d defective elements The probability of having k defective units in a random batch of n elements, drawn without replacement from the large set Define dj: number of “ineligible” timeslots for class-j calls Mapping: to A total of (K+1-hj) candidate timeslots: corresponds to N dj ineligible timeslots: corresponds to d defective units e.g.: the first t options are all rejected: corresponds to a batch of n elements are all defective (k=n) After we obtain the transition matrix 2/19/2008 Calculate the steady-state probabilities Calculate average call blocking probability and utilization Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 52 (Backup slides) BA-n – Comparison of blocking probability and utilization (a) Call blocking probability (b) Utilization BA-all clearly outperforms IR BA-1 is worse than IR Reason: “gaps” are caused by advance reservations Analogy: if a doctor spends exactly 1 hour with each patient, patients arriving in the middle of an hour will cause gaps (time period shorter than 1 hour) Restricted call-initiation times 2/19/2008 Call-initiation time options are restricted to fall on call holding time boundaries Restricted BA-n mechanisms clearly outperform IR Performance of restricted BA-n is almost as good as BA-all Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 53 (Backup slides) BA-n – Dependence of reservation window size on number of channels and call holding time (1) (number of class L=1, call holding time H = 200, offered load = 100%) Provide insight into how to select the advance-reservation horizon (K ) Longer K means better performance Longer K also means greater storage and computation needs The performance improvement is small after K reaches a certain value 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 54 (Backup slides) BA-n – Dependence of reservation window size on number of channels and call holding time (2) BA-all with Number of channels m =2 Call holding time H =300 Offered load = 100% The ratio K/H instead of K determines the call blocking probability. K/H values for different values of m corresponding to 3 values of call blocking probability 2/19/2008 Call blocking probability 2% 5% 10% m=2 14 6 4 m=5 5 4 3 m=10 4 3 2 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 55 (Backup slides) BA-First – Numerical results – system design m=4 Consider a system designer who wants to know the payoff by increasing the reservation window size 2/19/2008 Assume we want to run the system with 1% call blocking probability e.g., m=4, by increasing K from 2 to 4, the system load/channel can be increased from 75% to 93% This is quite significant in that it allows for a 24% increase in the number of endpoints multiplexed on to the link Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 56 (Backup slides) VBDS – Why prefer using connection-oriented networks for file transfers The cost of high-speed circuit switches are lower than high-speed packet switches High-speed memories for route table lookup and packet buffering Rich set of features such as policing and shaping Item Sycamore SN16000 Base system $130,000 $183,500 10x1GbE card $169,830 $63,500 1x10GbE $125,000 $65,500 1xOC192 $225,000 $37,500 $2,573,970 ($21,000/Gbps) $1,084,700 ($9,000/Gpbs) A system with 6 pieces 10xGbE +6 pieces 1x10GbE + 3 pieces OC192 (Total 120Gbps client data rate) Cisco 12416 The simulated PS system is an idealized system in which buffers are assumed to be infinitely large In reality packet loss will occur due to congestions Mechanisms, such as TCP’s congestion control schemes, are required to recover from these packet losses with retransmissions and rate adjustments Transport protocols designed for circuits, such as Circuit-TCP (C-TCP) are more efficient 2/19/2008 Take advantage of the information on the fair share of a flow Disabling TCP’s Slow Start and AIMD algorithm Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 57 (Backup slides) VBDS – VBDS favors large files when compared to PS Packet switching: newly arriving transfers “cut in” VBDS: Not so. Allocated bandwidth remains dedicated to ongoing transfers 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 58 (Backup slides) IR - Erlang-B formula Call blocking probability (PB) against the link capacity expressed in channels (m) Cannot achieve high utilization with low call blocking probability when m is small 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 59 (Backup slides) IR - GMPLS control plane Purpose of Generalized Multi-Protocol Label Switching (GMPLS) control plane Three components Dynamic bandwidth sharing (distributed) Provisioning (configure the switches for the circuit/VC) Link management protocol OSPF-TE routing protocol RSVP-TE signaling protocol Bandwidth sharing mode 2/19/2008 Immediate-request (cannot specify a future call-initiation time or call holding time in protocols) Calls are accepted or rejected - “call blocking" Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 60 (Backup slides) IR - CHEETAH concept 2/19/2008 Provide on-demand circuit service as an add-on to the connectionless service provided by the Internet Hybrid circuits: GbEthernet-SONET-GbEthernet Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 61 (Backup slides) IR - CHEETAH end-host software development Architecture End Host CHEETAH software Internet DNS client RSVP-TE module Application DNS client RSVP-TE module SONET circuitswitched network TCP/IP C-TCP/IP Application TCP/IP NIC 1 NIC 2 Circuit Gateway Circuit Gateway NIC 1 NIC 2 C-TCP/IP Based on the RSVP-TE code from KOM/DRAGON End Host CHEETAH software About 40K lines of C++ code What I did: 2/19/2008 Modified the code to inter-operate with the Sycamore SN16000 Added admission control, session management, user interface, etc. Integrated code for DNS lookup from our partner CUNY Designed and implemented APIs for general applications About 4K lines of new code Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 62 (Backup slides) IR - CHEETAH end-host software architecture DNS server CHEETAH software End host CHEETAH daemon (CD) Circuit-requestor DNS client CAC CD API DNS lookup socket Route/ARP table update DNS lookup – to support our scalability goal Five steps of circuit setup Message parsing RSVPD Route determination RSVPD API socket RSVP-TE RSVP-TE Daemon (RSVPD) messages User space Kernel space Left to the edge switch CAC Date-plane configuration Route/ARP table update Message construction RSVPD C-TCP API C-TCP CD API can be integrated into web servers, FTP servers, etc., so that “elephant” flows are automatically handled via a dynamically created dedicated circuit/VC 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 63 (Backup slides) IR - End-to-end signaling delay measurements Signaling delays incurred in setting up a circuit between zelda1 (in Atlanta, GA) and wuneng (in Raleigh, NC) across the CHEETAH network. Circuit type End-tend circuit setup delay (s) Processing delay for Path message at the NC SN16000 (s) Processing delay for Resv message at the NC SN16000 (s) OC-1 0.166103 0.091119 0.008689 OC-3 0.165450 0.090852 0.008650 1Gb/s EoS 1.645673 1.566932 0.008697 Round-trip signaling message propagation plus emission delay between GA SN16000 and NC SN16000: 0.025s Observations: Delays for setting up SONET circuits for rates in the original SONET hierarchy are small (166ms) Delays for hybrid Ethernet-SONET circuits are much higher (1.6s) (vendor implementation) The measured delay can be used for analytical and simulation models for related research 2/19/2008 Ph.D. Dissertation Defense Department of Computer Science, University of Virginia 64