A Study of Bandwidth-sharing Mechanisms in
Connection-oriented Networks
Ph.D. Dissertation presented by
Xiangfei Zhu
Department of Computer Science
University of Virginia
Feb 19, 2008
Outline

- Quick overview
  - Hypothesis and Metrics
  - Contributions and Publications
- Motivation
- Proposed mechanisms
  - BA-n
  - BA-First
  - VBDS
  - Immediate-request
- Related work
- Summary

2/19/2008
Ph.D. Dissertation Defense
Department of Computer Science, University of Virginia
Hypothesis

Well-designed algorithms employing
immediate-request and book-ahead
bandwidth-sharing mechanisms will lead to
efficient utilization of modern high-speed
connection-oriented networks
Metrics

- Service provider-oriented metrics
  - Utilization
  - It is always possible to achieve high utilization if there are no user-oriented performance requirements
- User-oriented metrics
  - Call blocking probability: book-ahead mechanisms for session-type requests
  - Delay: book-ahead mechanisms for data-type requests
- Combined metrics
  - Session type: express call blocking probability as a function of utilization
  - Data type: express mean transfer delay as a function of utilization
Key contributions

- Two book-ahead mechanisms for session-type requests
  - Analytical and simulation models for these two schemes
  - Models can be used as tools to test design choices and parameter values
- A book-ahead mechanism for data-type requests
  - Overcomes a disadvantage of using circuit-switched networks for file transfers (when compared to packet switching)
- Design and deployment of a wide-area, high-speed, optical dynamic circuit network
  - Demonstrated the readiness of off-the-shelf switches for actual service offerings
  - Measurements of actual end-to-end call setup delays and per-switch processing delays
    - Useful to other researchers for modeling purposes
Publications

X. Zhu and M. Veeraraghavan, "Analysis and Design of Book-ahead Bandwidth-Sharing Mechanisms," accepted by the IEEE Transactions on Communications (TCOM).

X. Zhu, M. E. McGinley, T. Li, and M. Veeraraghavan, "An Analytical Model for a Book-ahead Bandwidth Scheduler," Proc. of IEEE Global Telecommunications Conference (Globecom) 2007, Washington, DC, Nov. 2007.

X. Zhu, X. Zheng, and M. Veeraraghavan, "Experiences in implementing an experimental wide-area GMPLS network," IEEE Journal on Selected Areas in Communications (JSAC), vol. 25, pp. 82-92, Apr. 2007.

X. Zhu, X. Zheng, M. Veeraraghavan, Z. Li, Q. Song, I. Habib, and N. S. V. Rao, "Implementation of a GMPLS-based network with end host initiated signaling," in Proc. of IEEE International Conference on Communications (ICC) 2006, Istanbul, Turkey, Jun. 2006.
Why the renewed interest in connection-oriented networks?

- Internet: connectionless packet switching
  - Pros: efficient (high utilization)
  - Cons: low quality of service (bandwidth, delay, jitter, etc.)
- Resurgence of interest in connection-oriented networks:
  - Top-down driver: large-team scientific projects require predictable high-speed network services
  - Bottom-up driver: advances in optical circuit-switching technologies
- Various connection-oriented testbeds are being deployed around the world
  - NSF Experimental Infrastructure Network (EIN) program
  - ESnet4 (US), CA*net4 (Canada), UKLight (UK), SURFnet (Netherlands), JGN2 (Japan)
  - Internet2 Dynamic Circuit network

[Images: Terascale Supernova Initiative (TSI), http://www.phy.ornl.gov/tsi/; Large Hadron Collider (LHC), http://www.phys.ufl.edu/~matchev/LHCJC/lhc.html]
Internet2 deployment of Dynamic Circuit network

[Figure: Internet2 backbone map showing the IP Network and the Dynamic Circuit Network. Backbone picture reprinted from http://www.internet2.edu/pubs/networkmap.pdf]
Why revisit the topic of bandwidth sharing in connection-oriented networks?

[Figure: immediate request and leased lines placed on an axis between better utilization and better service quality]

- Existing mechanisms
  - Immediate-request (IR) mode: used in the telephone network
  - Leased-line mode: used in high-speed connection-oriented networks, such as SONET and WDM
- Can these mechanisms be used in connection-oriented networks in the new context (high speed + new applications)?
  - IR mode: cannot achieve high utilization with low call blocking probability when channel density is low
    - Channel density in the telephone network is on the order of 100 or more
    - Channel density in high-speed testbeds is on the order of 10
  - Leased-line mode: poor temporal sharing, expensive and inefficient
    - Cannot be used because the number of universities involved in these projects is large
- New bandwidth-sharing mechanisms are needed!
What mechanisms exist for sharing resources in other contexts?

- Reservation systems
  - Reservation phase before resource usage
  - e.g., booking flight tickets, making medical appointments
- Queueing systems
  - On-demand service
  - e.g., bank teller, grocery store checkout
  - Two types of queueing system based on waiting space
    - Bufferless queueing: no waiting space
      - e.g., street parking
    - Buffered queueing: has waiting space
      - e.g., bank teller, grocery store checkout
Are these mechanisms suitable for bandwidth sharing?

- Reservation systems
  - Yes, book-ahead mode
- Queueing systems
  - Bufferless queueing: Yes, immediate-request call-blocking mode
  - Buffered queueing: No

[Figure: example network with hosts H1 to H8 and switches X1 to X3; two elements are marked idle]
Two types of book-ahead systems

- Classification based on request type
  - Session-type requests
    - Specify desired bandwidth and duration
    - e.g., remote visualization and remote instrument control
  - Data-type requests
    - Specify size of data to be transferred
    - e.g., file transfers
      - File size known at the sending end
Proposed mechanisms

Bandwidth sharing in high-speed connection-oriented networks
- Book-ahead (high per-channel rate)
  - BA-n/BA-First: session-type requests
    - BA-n: users specify a set of call-initiation time options
      - Analytical model
      - Simulation model
      - Comparison with IR
      - Published in TCOM
    - BA-First: users accept any call-initiation time
      - Analytical model
      - Simulation model
      - Comparison with IR
      - Published in Globecom
  - VBDS (Varying-Bandwidth Delayed Start): data-type requests
    - Simulation model
    - Comparison with packet switching
- Immediate-request (low-to-moderate per-channel rate)
  - Deployed a testbed
  - Implemented software
  - Measured call-setup delays
  - Published in JSAC
  - Published in ICC
Analytical model for the BA-n scheme

A call specifies:
- Bandwidth: 1 channel
- Holding time: H timeslots
- Set of n call-initiation times: {s1, s2, …, sn}

Assumptions:
- Call arrival process is Poisson

[Figure: scheduler controlling a link of m channels between Switch1 and Switch2]

Is a channel available for H timeslots starting at any one of the n call-initiation times?
- Yes, accept request
- No, reject request
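The BA-n admission decision on a single link can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `admit_ba_n`, the list-based reservation table, and the rule of trying the options in the order given are hypothetical choices, not details taken from the dissertation.

```python
def admit_ba_n(reserved, m, H, options):
    """Try to book one channel for H consecutive timeslots.

    reserved[t] is the number of channels already reserved in timeslot t
    of the reservation window (hypothetical data layout).
    Returns the accepted start slot, or None if the call is blocked.
    """
    K = len(reserved)
    for s in options:                      # assumption: options tried in listed order
        if s < 0 or s + H > K:
            continue                       # option falls outside the window
        if all(reserved[t] < m for t in range(s, s + H)):
            for t in range(s, s + H):
                reserved[t] += 1           # book one channel in each slot
            return s
    return None                            # no option fits: reject request


window = [2, 2, 1, 0, 0, 0]                # m = 2 channels, K = 6 timeslots
start = admit_ba_n(window, m=2, H=2, options=[0, 2, 4])
```

With this input the first option is rejected because slot 0 is full, and the call is booked at slot 2.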
Discrete-time Markov Chain model

- System state: vector X with K components (x1, x2, …, xK)
  - xi: number of reserved channels in the ith interval, 0 ≤ xi ≤ m
  - m: link capacity in channels
  - K: reservation window in timeslots
- Challenges
  - Non-homogeneous system
    - Transition rates at time interval boundaries are infinite, but finite at other times
  - Mixed system
    - Call arrival process: continuous
    - Call holding time: discrete
  - A user can reserve any timeslots in the reservation window
- Key insights
  - Embedded DTMC at time interval boundaries
  - Discretize time into very "small" timeslots so that a geometric distribution approximates the (exponential) call interarrival time distribution
    - Timeslots should be small enough to make the probability of more than 1 call arriving in a timeslot negligible
    - Any call arrival rate can be downgraded to a small call arrival rate by changing the time unit
      - e.g., 36 calls/hour -> 0.01 calls/second
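The time-unit argument above can be checked numerically. The sketch below assumes Poisson arrivals and simply evaluates the probability of 0, 1, and 2 or more arrivals in one timeslot; the function name and the two slot lengths are illustrative choices, not values from the dissertation.

```python
import math


def slot_arrival_probs(rate_per_second, slot_seconds):
    """Probabilities of 0, 1, and >= 2 Poisson arrivals in one timeslot."""
    mean = rate_per_second * slot_seconds
    p0 = math.exp(-mean)
    p1 = mean * p0
    return p0, p1, 1.0 - p0 - p1


rate = 36 / 3600                                # 36 calls/hour = 0.01 calls/second
_, _, multi_1s = slot_arrival_probs(rate, 1)    # 1-second timeslots
_, _, multi_1h = slot_arrival_probs(rate, 3600) # 1-hour timeslots
```

With 1-second slots the chance of two or more arrivals in a slot is on the order of 5e-5, so a geometric interarrival model is a reasonable approximation; with 1-hour slots multiple arrivals per slot are almost certain and the approximation breaks down.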
Simulation model

- Limitation of the analytical model: it does not scale with m
  - Recall that the state space is defined as the set of vectors (x1, x2, …, xK) with 0 ≤ xi ≤ m
  - Size of the state space: (m+1)^K
- Simulation model
  - Supports larger values of m
  - Relaxes assumptions used in the analytical model
    - Call-initiation time options: uniform distribution → bell-shaped distribution
    - Per-call bandwidth: single channel → multiple channels
    - Path length: single link → multiple links
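To see why the analytical BA-n model does not scale, it helps to evaluate the state-space size (m+1)^K for a few parameter values. The values of m and K below are illustrative assumptions only.

```python
def state_space_size(m, K):
    """Number of vectors (x1, ..., xK) with 0 <= xi <= m."""
    return (m + 1) ** K


small = state_space_size(10, 4)      # 11**4 = 14641 states
large = state_space_size(100, 4)     # 101**4 = 104060401 states
```

Even with a short window of K = 4 timeslots, moving from m = 10 to m = 100 channels grows the transition matrix from about 1.5e4 to about 1e8 rows, which is the motivation for the simulation model.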
Model validation and verification

- Model validation
  - Our models are for an initial design and implementation of BA systems
    - Therefore, there are no real-world measurements
    - Model validation technique: peer/expert reviews
  - Real system measurements are "available" for input parameters
    - Example: real-system measurements for telephony applications show a Poisson call arrival process
    - The same pattern is likely in video-conference calls
- Model verification
  - Compare analytical model results with simulation model results

"Three aspects of model validation: assumptions; input parameter values and distributions; output values and conclusions" [Jain91]
"Three validation techniques: expert intuition; real system measurements; theoretical results" [Jain91]
"Qualitative validation has to be used when adequate acceptable real world data do not exist to permit quantitative validation and is based mainly on SME (Subject Matter Expert) and peer view" [Pace02]

[Jain91] R. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, New York, Wiley-Interscience, 1991.
[Pace02] D. K. Pace and J. Sheehan, "Subject matter expert (SME)/peer use in m&s v&v," in Proc. of the Foundations, Laurel, MD, Oct. 2002.
Key results from the BA-n study

- BA-n scheme outperforms IR scheme when per-channel rate is high
  - e.g., when m=10
    - With the IR mode, high utilization is achievable but at a cost
      - 23% call blocking probability at 80% utilization
      - 46% call blocking probability at 90% utilization
    - With the BA-3 mode, high utilization is achievable with low call blocking probability
      - 0.1% call blocking probability at 80% utilization
      - 2% call blocking probability at 90% utilization
- Reservation window size (K) dependence on call holding time (H)
  - K/H does not need to be large
  - e.g., when m=10, to achieve 90% utilization with 2% call blocking probability, K=4H
- Multi-link scenario
  - BA-n scheme outperforms IR
  - Fairness achieved with "trunk reservation"
    - Between long-path and short-path calls
Analytical model for the BA-First scheme

A call specifies:
- Bandwidth: 1 channel
- Holding time: 1 timeslot (instead of the H timeslots used in BA-n)
- Any call-initiation time (instead of the set of n call-initiation times {s1, s2, …, sn} used in BA-n)

Assumptions:
- Call arrival process is Poisson

[Figure: scheduler controlling a link of m channels between Switch1 and Switch2]

Is a channel available in the entire reservation window?
- Yes, accept request
- No, reject request
System state

[Figure: call arrivals filling bins of m channels; n channels are reserved in the first bin that is not full]

- Use "bins" to represent reservation intervals
  - If the ith bin is not full, all bins after it must be empty
- The system state is expressed as a 2-tuple (i, n)
  - i: index of the first bin that is not full
  - n: number of reserved channels in the ith bin
  - A special case is (K, m), which denotes the state in which all bins are full
CTMC

[Figure: call arrivals filling bins of m channels; n channels are reserved in the current bin]

- The state of the system changes in two cases
  - A call arrives
    - e.g., (i, n)->(i, n+1) if n<m-1; (i, n)->(i+1, 0) if n=m-1 and i<K
  - A time-interval boundary is encountered
    - e.g., (i, n)->(i-1, n) if i>1
- The model is a CTMC but it is non-homogeneous
  - The system behavior at the time-interval boundaries is different from its behavior at other times
- There is an embedded time-homogeneous DTMC if we only look at the system at the time-interval boundaries
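The two kinds of BA-First state change can be written as small Python functions. This is a sketch, not the dissertation's code: the function names are hypothetical, and three cases that the slide does not spell out are filled in as labeled assumptions (a call arriving in state (K, m) is blocked; filling the last free channel of bin K leads to (K, m); at a boundary, state (1, n) becomes (1, 0) and state (K, m) becomes (K, 0)).

```python
def on_arrival(state, m, K):
    """Return (next_state, accepted) for one call arrival."""
    i, n = state
    if state == (K, m):
        return state, False          # assumption: all bins full, call is blocked
    if n < m - 1:
        return (i, n + 1), True      # from the slide
    if i < K:
        return (i + 1, 0), True      # from the slide: bin i is now full
    return (K, m), True              # assumption: last bin filled, all bins full


def on_boundary(state, m):
    """Return the next state when a time-interval boundary is crossed."""
    i, n = state
    if n == m:
        return (i, 0)                # assumption: (K, m) leaves the last bin empty
    if i > 1:
        return (i - 1, n)            # from the slide: every bin shifts by one
    return (1, 0)                    # assumption: the only used bin expires


state = (1, 0)
for _ in range(3):                   # three arrivals with m = 2, K = 3
    state, _accepted = on_arrival(state, m=2, K=3)
```

After three arrivals the first bin is full and the second holds one reservation, so `state` is (2, 1); crossing a boundary then gives (1, 1).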
Embedded DTMC

[Figure: call arrivals filling bins of m channels; labels n, j, and q mark the states before and after the interval]

- The transition probability can be calculated by counting the number of calls (denoted by a) that arrived in the past time interval, and calculating the probability that a calls arrive in an interval
- A: number of call arrivals in the current interval
  - F_A(a) is the cumulative distribution function of A
  - G_A(a) is the probability mass function of A
- The transition probability from state (i, n) to state (j, q) is

\[
P_{(i,n)\to(j,q)} =
\begin{cases}
1 - F_A(mK-1) & \text{if } i=1 \text{ and } (j,q)=(K,m), \text{ i.e., } mK \text{ or more calls arrived} \\
G_A(m(j-1)+q) & \text{if } i=1 \text{ and } (j,q)\neq(K,m), \text{ i.e., } m(j-1)+q \text{ calls arrived} \\
1 - F_A(m(K-i+1)+m-n-1) & \text{if } i\neq 1 \text{ and } (j,q)=(K,m), \text{ i.e., } m(K-i+1)+m-n \text{ or more calls arrived} \\
G_A(m(j-i+1)+q-n) & \text{if } i\neq 1 \text{ and } (j,q)\neq(K,m), \text{ i.e., } m(j-i+1)+q-n \text{ calls arrived}
\end{cases}
\]
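The four-case transition probability can be evaluated directly once a distribution for A is fixed. The sketch below assumes A is Poisson with a given mean number of arrivals per interval (consistent with the Poisson call-arrival assumption); the function names and the parameter values m = 2, K = 3, mean = 1.5 are illustrative, and a negative arrival count is treated as probability zero.

```python
import math


def pmf(a, mean):
    """G_A(a): Poisson probability of exactly a arrivals (0 for a < 0)."""
    if a < 0:
        return 0.0
    return math.exp(-mean) * mean ** a / math.factorial(a)


def cdf(a, mean):
    """F_A(a): Poisson probability of at most a arrivals."""
    return sum(pmf(k, mean) for k in range(a + 1))


def transition_prob(src, dst, m, K, mean):
    """Transition probability from (i, n) to (j, q) in the embedded DTMC."""
    i, n = src
    j, q = dst
    full = (K, m)
    if i == 1:
        if dst == full:
            return 1.0 - cdf(m * K - 1, mean)
        return pmf(m * (j - 1) + q, mean)
    if dst == full:
        return 1.0 - cdf(m * (K - i + 1) + m - n - 1, mean)
    return pmf(m * (j - i + 1) + q - n, mean)


m, K, mean = 2, 3, 1.5
states = [(i, n) for i in range(1, K + 1) for n in range(m)] + [(K, m)]
row_sums = [sum(transition_prob(s, d, m, K, mean) for d in states) for s in states]
```

Each entry of `row_sums` should equal 1 up to rounding, which is a quick sanity check that the four cases cover every destination state exactly once.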
Performance metrics

[Figure: call arrivals filling bins of m channels; the scheduling delay is split into an integral part and a fractional part]

- Call blocking probability
- Link utilization
- Mean scheduling delay, in two parts
  - Integral part: number of intervals before the scheduled service interval
  - Fractional part: delay within the arrival interval
Use of the model - Test design choices and parameter values

- IR vs. BA schemes
- Example
  - To achieve 90% utilization with a call blocking probability of less than 10%
    - BA-First schemes are needed when m<59
  - To achieve 90% utilization with a call blocking probability of less than 20%
    - BA-First schemes are needed when m<32
Use of the model - Select an appropriate reservation window size

- Parameters:
  1) link capacity in channels, m = 2 or 8
  2) reservation window size, K = 2, 4, 8, or 16
- To run a system at 100% offered load with a call blocking probability of 4% or less
  - If m=2, K should be 8 time units
  - If m=8, K needs to be only 4 time units
- Is a larger value of K always better?
  - If m=8, the call blocking probability and utilization plots for K=4, 8, and 16 overlap
  - But mean scheduling delay increases significantly as K increases
- Increasing the reservation window size beyond a certain level is actually detrimental to system performance!
Use of the model - An approximate solution for M/D/m/p system

- Solutions exist for M/D/1 and M/D/m (approximation) systems
- No existing solution for the M/D/m/p system
- BA-First model (m, K) ≈ M/D/m/m(K+1) queueing model at moderate-to-high loads
- Why?
  - Call-arrival process: both Poisson
  - Call holding time: both deterministic
  - Reservation window is effectively "waiting space"

[Figure: diagram labeled "fractional part" and "1/2"]
Key results from the BA-First study

We modeled the BA-First mechanism using a non-homogeneous
CTMC

We extracted an embedded DTMC and solved it for steady-state
probabilities

We obtained solutions for metrics such as call blocking
probability, link utilization, and mean scheduling delay

We demonstrated the use of the model as a design tool for book-ahead systems

We demonstrated the use of the model as a solution for M/D/m/p
queueing system at moderate-to-high loads
Book-ahead scheme for data-type requests

- Data-type requests: specify size of data to be transferred
- Drawback of using circuits for file transfers
  - With fixed-bandwidth allocation, file transfers cannot take advantage of bandwidth freed by the completion of other transfers
- File transfer request = (File Size, Maximum rate, [Requested start time])
  - File size: can be provided by the file server
  - Maximum rate: limited by various constraints at end hosts, such as disk-access speed
- Fixed-Bandwidth Delayed Start (FBDS)
  - Fixed-bandwidth allocation with rate set to maximum rate
- Varying-Bandwidth Delayed Start (VBDS)
  - Assign different bandwidth levels for different time ranges
VBDS

- Idea of VBDS
  - Upon receiving a reservation request, the VBDS scheduler returns a Time-Range-Channel (TRC) vector {(Bk, Ek, Ck), k = 1, 2, …}
    - Bk: start time of the kth time range
    - Ek: end time of the kth time range
    - Ck: set of channels allocated to the transfer in the kth time range
  - Scheduler maintains channel-availability function γ(t)
- Cost of VBDS
  - Switches need to be reprogrammed multiple times within a transfer
    - Switch programming time is considered in the analysis
  - Switches need to maintain channel availability function
    - Reduce the number of changes in channel-availability function
      - Discretize time
VBDS example

[Figure: channel availability γ(t), from 0 to 4 channels, plotted against time (10, 20, …, 80, …, ∞)]

- Assumptions:
  - 4-channel link with per-channel rate 10 Gbps
  - Unit of time discretization: 100 ms
  - Switch programming time: 1 unit
- A file transfer request specifies (5 GB, 20 Gbps, 50), i.e., (File Size, Maximum rate, Requested start time)
- TRC vector entries, with the amount of data carried in each time range:
  - (50, 60, {4}): 1.125 GB
  - (60, 70, {2, 4}): 2.375 GB
  - (70, 75, {2, 4}): 1.5 GB
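The data volumes of the first two time ranges in the VBDS example can be reproduced with a short calculation. The sketch assumes that a channel newly added in a time range loses one time unit to switch programming while a channel carried over from the previous range does not; that rule and the function name are assumptions made for illustration, chosen because they match the 1.125 GB and 2.375 GB figures.

```python
UNIT_SECONDS = 0.1        # time discretization unit: 100 ms
CHANNEL_GBPS = 10         # per-channel rate
SETUP_UNITS = 1           # switch programming time, in time units


def gigabytes_in_range(begin, end, continuing, new):
    """Data carried in [begin, end) by `continuing` already-programmed
    channels plus `new` channels that first need to be programmed."""
    units = end - begin
    gb_per_unit = CHANNEL_GBPS * UNIT_SECONDS / 8      # 0.125 GB per channel per unit
    return gb_per_unit * (continuing * units + new * (units - SETUP_UNITS))


first = gigabytes_in_range(50, 60, continuing=0, new=1)    # channel 4 is set up
second = gigabytes_in_range(60, 70, continuing=1, new=1)   # channel 2 joins channel 4
```

Here `first` evaluates to 1.125 GB and `second` to 2.375 GB, matching the first two TRC entries.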
Numeric results
Compare VBDS, FBDS, and Packet switching (PS)
[Figures: normalized delay and average throughput]
Key results from the VBDS study

- Circuit-switched network with VBDS achieves similar performance as packet-switched networks for moderate-to-large files
  - Significant: at high speeds, circuit switching cost << packet switching cost
- VBDS favors large files when compared to packet switching
  - Packet switching: newly arriving transfers "cut in"
  - VBDS: Not so. Allocated bandwidth remains dedicated to ongoing transfers
- We do not recommend using circuit-switched network for small files
  - Scheduling and circuit setup overheads
- Cost
  - Circuit switching: setup overhead (unsuitable for small files)
  - Packet switching: congestion control algorithm (lower throughput for moderate-to-large files)
Immediate-request bandwidth sharing

- Deployed a wide-area experimental network with immediate-request mode of bandwidth sharing: CHEETAH
- State of the art in 2004
  - Control-plane protocols are standardized by IETF: GMPLS protocol suite
  - Vendors have implemented these protocols in high-speed optical circuit switches
  - No deployed network uses these functions
  - No signaling protocol client for end hosts to enable the creation of end-to-end high-speed circuits
CHEETAH network

- Switches: Sycamore SN16000 Intelligent Optical switch
  - Robust implementation of GMPLS control-plane protocols
  - Support standardized Ethernet-SONET mapping
- End hosts: general-purpose Linux PCs with two NICs and CHEETAH end-host software

[Figure: CHEETAH topology with SN16000 switches at Oak Ridge, TN; Raleigh, NC; and Atlanta, GA, interconnected by OC-192 links. Each switch has a GbE/10GbE card, an OC192 card, and a control card. End hosts zelda1 through zelda5 and wukong attach to the switches, and one link leads to the Cray X1.]
IR mode of sharing in CHEETAH

Designed and implemented an end-host software package based on
the GMPLS architecture


Stand-alone circuit request tools
Integrated into applications such as Squid (an open-source web proxy
software)

Ran experiments of IR mode call setups and releases

Measured end-to-end circuit setup delays and per-switch signaling
message processing delays


Measurements useful to other researchers for modeling purposes
Demonstrated the readiness of off-the-shelf switches for actual service
offerings
Related work

- Research papers on book-ahead bandwidth sharing
  - Most of these papers use simulations
  - None of them considers book-ahead calls with multiple acceptable options
  - Our results show that a book-ahead mechanism that specifies only one call-initiation time may perform worse than an immediate-request mechanism
- File transfers
  - List scheduling: all proposed algorithms use fixed allocations
  - Bin packing: cannot break a block into pieces to fit into bins
  - TCP improvements: determine fair share for a flow faster and more accurately, while we determine share for a flow during setup
- Optical connection-oriented testbeds
  - e.g., ESnet4, NSF DRAGON, CA*net4, UKLight, JGN2, etc.
  - Focus: implementation and inter-domain usage
  - Our work: mixed study of IR and BA; theoretical modeling + implementation
Summary

High-speed connection-oriented networks should support a combination of bandwidth-sharing services

- Existing services:
  - Leased lines: for serving as "wires" between switches to create networks that offer other services
  - Immediate request: for video telephony, transfers of moderate-sized files
- New services:
  - BA-n/BA-First: for reservations that specify desired bandwidth and duration
  - VBDS: for reservations that specify file size (large file transfers)

[Figure: leased lines, BA-n/BA-First, VBDS, immediate request, and IP services placed on an axis running between better service quality and greater sharing]
Future work

- Routing issue in reservation phase
  - Currently assume a linear topology in multi-link scenarios
  - Multiple route options should be exploited
- Distributed implementation
  - Necessary for inter-domain scheduling
    - Service providers do not share network topology information with each other
- Validate models against real measurements
  - A long-term future work item after deployment and user base build-up
Questions from Form G111
Questions from Form G111 -
Defining the problem

In the context of new optical circuit-switched
technologies and new application requirements,
what bandwidth-sharing mechanisms can lead to
efficient utilization of modern high-speed
connection-oriented networks?
Questions from Form G111 - Analysis of previous and related work

- Research papers on book-ahead bandwidth sharing
  - Most of these papers use simulations
  - None of them considers book-ahead calls with multiple acceptable options
  - Our results show that a book-ahead mechanism that specifies only one call-initiation time may perform worse than an immediate-request mechanism
- File transfers
  - List scheduling: all proposed algorithms use fixed allocations
  - Bin packing: cannot break a block into pieces to fit into bins
  - TCP improvements: determine fair share for a flow faster and more accurately, while we determine share for a flow during setup
- Optical connection-oriented testbeds
  - e.g., ESnet4, NSF DRAGON, CA*net4, UKLight, JGN2, etc.
  - Focus: implementation and inter-domain usage
  - Our work: mixed study of IR and BA; theoretical modeling + implementation
Questions from Form G111 - Success criteria

- Has the student adequately defined the measure(s) of success to be used to evaluate the work? Is there a well defined metric with a goal? Does the metric adequately represent the desired success criteria?
- Success criteria
  - Session-type BA requests
    - BA-n: better performance than IR
    - BA-First: a model that scales to m>100
  - Data-type BA requests
    - At least the same performance as packet switching
  - IR mode
    - Stable network deployment and software implementation
- Metrics
  - Session type: express call blocking probability as a function of utilization
  - Data type: express mean transfer delay as a function of utilization
Questions from Form G111 - Solution

- Is the approach taken well executed? Does it appear to be correct? Is the work technically challenging? Does the student utilize appropriate professional standards?
- A combination of analytical, simulation, and experimental methods.
  - Two book-ahead mechanisms for session-type requests
    - Analytical and simulation models for these mechanisms
  - One book-ahead mechanism for data-type requests
    - A simulation model for this mechanism
  - A wide-area testbed for experimental study of the immediate-request mechanism
    - Testbed deployment
    - Software implementation
    - Experimentation and measurements
Questions from Form G111 -
Innovation and risk

Two new Markov chain models for book-ahead bandwidth-sharing
schemes (first Markov chain models for book-ahead schemes)

An approximate solution for the M/D/m/p queueing system

One of the first deployments of a wide-area high-speed dynamic
circuit network
Questions from Form G111 -
Broader implications
(Social, economic, political, technical, ethical, business, etc.)

Demonstrated the readiness of off-the-shelf circuit switches
for actual service offering (business and technical)

Designed efficient bandwidth-sharing algorithms for high-speed connection-oriented networks

Circuit switches are less complex than packet switches, which means they
- Are less expensive (economic)
- Consume less power (environmental)
Backup slides
(Backup slides) BA-n - Example of the analytical model for the BA-n scheme

Assumptions
- Link capacity m = 1
- Advance-reservation horizon K = 3
- Number of classes L = 2
- Holding time for class-1 calls h1 = 1
- Holding time for class-2 calls h2 = 2
- Number of options n = 1
- System transition happens at the end of each timeslot

[Figure: number of reserved channels versus time, from the current time t to t+k]

Example: state (0, 0, 1)
- A call arrives and reserves the third timeslot -> state (0, 1, 1), Pr = (1/3)·p·r1
- A call arrives and reserves the first timeslot -> state (1, 1, 0), Pr = (1/3)·p·r1
- No call arrives or the arrived call is blocked -> state (0, 1, 0), Pr = 1 - (2/3)·p·r1
(Backup slides) BA-n - DTMC model - Transition probability matrix

- Define a left shift operator [equation omitted]
- Define a K-component vector [equation omitted]
- The transition probability from state x to state y is [three-row piecewise equation omitted], where
  - p: the probability that a call arrives during a timeslot
  - rj: the probability that an incoming call belongs to class j
  - qi,j: the probability that a class-j call is admitted with an initiation time of the ith timeslot
  - Bx: the probability that an incoming call is blocked when the system is in state x
  - First row: a class-j call is admitted with an initiation time of the ith timeslot
  - Second row: no call arrives or a call arrives but is blocked
  - Third row: all other states
(Backup slides) BA-n - To compute qi,j and Bx

- Use the hypergeometric probability mass function to calculate Bx and qi,j
  - A large set of N elements, known to have d defective elements
  - The probability of having k defective units in a random batch of n elements, drawn without replacement from the large set
- Define dj: number of "ineligible" timeslots for class-j calls
- Mapping:
  - A total of (K+1-hj) candidate timeslots: corresponds to N
  - dj ineligible timeslots: corresponds to the d defective units
  - e.g., the first t options are all rejected: corresponds to a batch of n elements that are all defective (k=n)
- After we obtain the transition matrix
  - Calculate the steady-state probabilities
  - Calculate average call blocking probability and utilization
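The hypergeometric probability mass function used in this mapping is short enough to write out. The sketch below is a generic implementation with Python's `math.comb`; the sample numbers (N = 10 candidate timeslots, d = 4 ineligible, n = 2 options) are made up for illustration and are not taken from the dissertation.

```python
from math import comb


def hypergeom_pmf(k, N, d, n):
    """Probability that a batch of n elements drawn without replacement
    from N elements, d of which are defective, contains exactly k defective."""
    if k < 0 or k > n or k > d or n - k > N - d:
        return 0.0
    return comb(d, k) * comb(N - d, n - k) / comb(N, n)


# Probability that both of a call's n = 2 options land on ineligible
# timeslots (k = n), i.e., that the call is blocked in this toy setting.
p_all_ineligible = hypergeom_pmf(2, N=10, d=4, n=2)
```

For these numbers the result is C(4,2)/C(10,2) = 6/45, roughly 0.133.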
(Backup slides) BA-n – Comparison of blocking probability and utilization

[Figure: (a) Call blocking probability; (b) Utilization]
BA-all clearly outperforms IR
BA-1 is worse than IR
Reason: “gaps” are caused by advance reservations
Analogy: if a doctor spends exactly 1 hour with each patient, patients arriving in the middle of an hour will cause gaps (time periods shorter than 1 hour)
Restricted call-initiation times
Call-initiation time options are restricted to fall on call holding time boundaries
Restricted BA-n mechanisms clearly outperform IR
Performance of restricted BA-n is almost as good as BA-all
(Backup slides) BA-n – Dependence of reservation window size on number of channels and call holding time (1)
(number of classes L = 1, call holding time H = 200, offered load = 100%)

Provides insight into how to select the advance-reservation horizon (K)
A longer K means better performance
A longer K also means greater storage and computation needs
The performance improvement is small after K reaches a certain value
(Backup slides) BA-n – Dependence of reservation window size on number of channels and call holding time (2)

BA-all with
Number of channels m = 2
Call holding time H = 300
Offered load = 100%
The ratio K/H, rather than K itself, determines the call blocking probability.
K/H values for different values of m corresponding to 3 values of call blocking probability:

Call blocking probability    2%     5%     10%
m=2                          14     6      4
m=5                          5      4      3
m=10                         4      3      2
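As a rough illustration of how the K/H table could be used, the sketch below looks up the ratio for a given number of channels and target blocking probability and scales it by the call holding time. It assumes the tabulated ratios apply at 100% offered load, as on the slide. The function and variable names are hypothetical.

```python
# K/H ratios from the table, keyed by number of channels m and then by
# target call blocking probability in percent.
K_OVER_H = {
    2: {2: 14, 5: 6, 10: 4},
    5: {2: 5, 5: 4, 10: 3},
    10: {2: 4, 5: 3, 10: 2},
}


def reservation_window(m: int, blocking_percent: int, H: int) -> int:
    """Reservation window K (in timeslots) implied by the tabulated
    K/H ratio for the given m and blocking target."""
    return K_OVER_H[m][blocking_percent] * H


# Example: m = 2, 2% blocking target, call holding time H = 300 timeslots.
K_example = reservation_window(m=2, blocking_percent=2, H=300)
```

For m = 2 and H = 300, a 2% blocking target corresponds to K = 14 x 300 = 4200 timeslots.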
(Backup slides) BA-First – Numerical results – system design

[Figure: m = 4]
Consider a system designer who wants to know the payoff of increasing the reservation window size
Assume we want to run the system with 1% call blocking probability
e.g., with m=4, by increasing K from 2 to 4, the system load per channel can be increased from 75% to 93%
This is quite significant in that it allows for a 24% increase in the number of endpoints multiplexed onto the link
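The 24% figure follows from the ratio of the two loads, assuming the number of endpoints that can be multiplexed scales in proportion to the per-channel load carried at the target blocking probability:

```latex
\frac{0.93}{0.75} = 1.24
\quad\Longrightarrow\quad
\text{a } 24\% \text{ increase in supportable load.}
```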
(Backup slides) VBDS – Why prefer using connection-oriented networks for file transfers

The cost of high-speed circuit switches is lower than that of high-speed packet switches
Packet switches require high-speed memories for route table lookup and packet buffering
Packet switches carry a rich set of features such as policing and shaping
Item             Cisco 12416                    Sycamore SN16000
Base system      $130,000                       $183,500
10x1GbE card     $169,830                       $63,500
1x10GbE          $125,000                       $65,500
1xOC192          $225,000                       $37,500
Total*           $2,573,970 ($21,000/Gbps)      $1,084,700 ($9,000/Gbps)
* A system with 6 pieces 10x1GbE + 6 pieces 1x10GbE + 3 pieces OC192 (total 120 Gbps client data rate)
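The per-Gbps figures can be checked with a short calculation. The sketch below takes the two system totals as given on the slide and assumes the higher total belongs to the packet switch (Cisco 12416) and the lower to the circuit switch (Sycamore SN16000). It also assumes the 120 Gbps client data rate counts only the Ethernet ports. Variable names are hypothetical.

```python
# Client data rate: six 10x1GbE cards plus six 1x10GbE cards.
# Assumption: the OC192 cards are not counted as client-side capacity,
# which matches the 120 Gbps stated on the slide.
CLIENT_RATE_GBPS = 6 * 10 + 6 * 10

# System totals as listed on the slide (column assignment is an assumption).
totals_usd = {
    "Cisco 12416": 2_573_970,
    "Sycamore SN16000": 1_084_700,
}

cost_per_gbps = {
    name: total / CLIENT_RATE_GBPS for name, total in totals_usd.items()
}
cost_ratio = totals_usd["Cisco 12416"] / totals_usd["Sycamore SN16000"]

for name, cost in cost_per_gbps.items():
    print(f"{name}: ${cost:,.0f} per Gbps")
```

The results round to the slide's $21,000/Gbps and $9,000/Gbps, so the packet-switch configuration costs a little more than twice as much per Gbps.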
The simulated PS system is an idealized system in which buffers are assumed to be infinitely large
In reality, packet loss will occur due to congestion
Mechanisms such as TCP’s congestion control schemes are required to recover from these packet losses with retransmissions and rate adjustments
Transport protocols designed for circuits, such as Circuit-TCP (C-TCP), are more efficient
Take advantage of the information on the fair share of a flow
Disable TCP’s Slow Start and AIMD algorithms
(Backup slides) VBDS – VBDS favors large files when compared to PS

Packet switching: newly arriving transfers “cut in”
VBDS: Not so. Allocated bandwidth remains dedicated to ongoing transfers
(Backup slides) IR - Erlang-B formula

Call blocking probability (PB) against the link capacity expressed in channels (m)
Cannot achieve high utilization with low call blocking probability when m is small
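The effect can be reproduced with the standard Erlang-B recursion, B(0) = 1 and B(k) = a·B(k-1) / (k + a·B(k-1)), where a is the offered load in Erlangs. The sketch below is illustrative only. The helper that searches for the highest load meeting a blocking target, and all names in it, are assumptions rather than anything taken from the slides.

```python
def erlang_b(load_erlangs: float, m: int) -> float:
    """Erlang-B blocking probability for m channels, computed with the
    numerically stable recursion instead of factorials."""
    b = 1.0
    for k in range(1, m + 1):
        b = load_erlangs * b / (k + load_erlangs * b)
    return b


def utilization_at_blocking(m: int, target: float) -> float:
    """Per-channel utilization at the largest offered load whose
    blocking probability does not exceed `target` (bisection)."""
    lo, hi = 0.0, 1.0
    while erlang_b(hi, m) < target:  # grow the bracket until it blocks enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if erlang_b(mid, m) > target:
            hi = mid
        else:
            lo = mid
    carried = lo * (1.0 - erlang_b(lo, m))  # carried load in Erlangs
    return carried / m


small_link = utilization_at_blocking(m=2, target=0.01)
large_link = utilization_at_blocking(m=100, target=0.01)
```

At a 1% blocking target, the 2-channel link stays below 10% utilization while the 100-channel link exceeds 80%, which is the small-m limitation the slide points to.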
(Backup slides) IR - GMPLS control plane

Purpose of Generalized Multi-Protocol Label Switching (GMPLS) control plane
Dynamic bandwidth sharing (distributed)
Provisioning (configure the switches for the circuit/VC)
Three components
Link management protocol
OSPF-TE routing protocol
RSVP-TE signaling protocol
Bandwidth sharing mode
Immediate-request (cannot specify a future call-initiation time or call holding time in protocols)
Calls are accepted or rejected - “call blocking”
(Backup slides) IR - CHEETAH concept

Provide on-demand circuit service as an add-on to the connectionless service provided by the Internet
Hybrid circuits: GbEthernet-SONET-GbEthernet
(Backup slides) IR - CHEETAH end-host software development

Architecture
[Figure: architecture diagram showing two end hosts (each with an application, CHEETAH software comprising a DNS client and an RSVP-TE module, TCP/IP and C-TCP/IP stacks, and NIC 1 and NIC 2), the Internet, circuit gateways, and the SONET circuit-switched network]
Based on the RSVP-TE code from KOM/DRAGON
About 40K lines of C++ code
What I did:
Modified the code to inter-operate with the Sycamore SN16000
Added admission control, session management, user interface, etc.
Integrated code for DNS lookup from our partner CUNY
Designed and implemented APIs for general applications
About 4K lines of new code
(Backup slides) IR - CHEETAH end-host software architecture

[Figure: CHEETAH end-host software architecture. Labels shown: DNS server; end host; CHEETAH daemon (CD); circuit-requestor; DNS client; CAC; route/ARP table update; CD API (socket); DNS lookup; RSVP-TE daemon (RSVPD); RSVPD API (socket); RSVP-TE messages; C-TCP API; C-TCP; user space; kernel space]
DNS lookup: to support our scalability goal
Five steps of circuit setup
Message parsing: RSVPD
Route determination: left to the edge switch
CAC
Data-plane configuration: route/ARP table update
Message construction: RSVPD
CD API can be integrated into web servers, FTP servers, etc., so that “elephant” flows are automatically handled via a dynamically created dedicated circuit/VC
(Backup slides) IR - End-to-end signaling delay measurements

Signaling delays incurred in setting up a circuit between zelda1 (in Atlanta, GA) and wuneng (in Raleigh, NC) across the CHEETAH network.

Circuit type   End-to-end circuit   Path message processing      Resv message processing
               setup delay (s)      delay at NC SN16000 (s)      delay at NC SN16000 (s)
OC-1           0.166103             0.091119                     0.008689
OC-3           0.165450             0.090852                     0.008650
1Gb/s EoS      1.645673             1.566932                     0.008697

Round-trip signaling message propagation plus emission delay between GA SN16000 and NC SN16000: 0.025 s

Observations:
Delays for setting up SONET circuits for rates in the original SONET hierarchy are small (166 ms)
Delays for hybrid Ethernet-SONET circuits are much higher (1.6 s) (vendor implementation)
The measured delays can be used in analytical and simulation models for related research
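As one example of feeding these measurements into a model, the sketch below stores the three table rows and computes the share of the end-to-end setup delay spent processing the Path message at the NC SN16000. The data structure and function name are hypothetical. The numbers are copied from the table.

```python
# (end-to-end setup delay, Path processing delay, Resv processing delay),
# all in seconds, as listed in the measurement table.
measurements = {
    "OC-1": (0.166103, 0.091119, 0.008689),
    "OC-3": (0.165450, 0.090852, 0.008650),
    "1Gb/s EoS": (1.645673, 1.566932, 0.008697),
}


def path_share(circuit_type: str) -> float:
    """Fraction of the end-to-end setup delay spent on Path message
    processing at the NC SN16000."""
    total, path, _resv = measurements[circuit_type]
    return path / total


for circuit_type in measurements:
    print(f"{circuit_type}: {path_share(circuit_type):.1%} of setup delay")
```

Path message processing accounts for a little over half of the setup delay for the OC-1 and OC-3 circuits and for more than 95% of it for the 1 Gb/s Ethernet-over-SONET circuit, while Resv processing stays near 9 ms in all three cases.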