Achieving Per-Flow Weighted Rate Fairness in a Core Stateless Network
Raghupathy Sivakumar Tae-Eun Kim
Narayanan Venkitaraman Jia-Ru Li Vaduvur Bharghavan
University of Illinois at Urbana-Champaign
Email: {sivakumr,tkim,murali,juru,bharghav}@timely.crhc.uiuc.edu
Abstract
Corelite is a Quality of Service architecture that provides weighted max-min fairness for rate among flows in a
network without maintaining any per-flow state in the core
routers. There are three key mechanisms that work in concert to achieve the service model of Corelite: (a) the introduction of markers in a packet flow by the edge routers
to reflect the normalized rate of the flow, (b) weighted fair
marker feedback at the core routers upon incipient congestion detection, and (c) linear increase/multiplicative decrease based rate adaptation of packet flows at the edge
routers in response to marker feedback.
1. Introduction
The current Internet supports a single service model: simple best-effort service. However, the increasing diversity of applications using the Internet has led to the emergence of Quality of Service (QoS) as a critical issue to
be addressed in the future Internet. Two broad paradigms
that have been proposed in the last few years for supporting quality of service in the Internet are Integrated services
(Intserv) and Differentiated services (Diffserv). The Intserv
approach supports absolute per-flow quality of service measures but requires a substantial amount of per-flow state to
be maintained in the routers of the network [10]. Since high
speed routers in the core of backbone networks typically
serve hundreds of thousands of flows simultaneously, it has
been argued that Intserv is not a scalable solution for providing QoS support in the Internet. Diffserv, on the other hand,
proposes a scalable service discrimination model without
requiring any per-flow state management at the routers in
the network core [8, 13, 14]. Diffserv is gaining some popularity as the QoS paradigm of the future Internet, in large
part because it moves the complexity of providing quality
of service out of the core and into the edges of the network,
where it is feasible to maintain a restricted amount of per-flow state (Footnote 1).
Although Diffserv scores over the Intserv approach in terms of scalability, the key question is: what kind of service model can the Diffserv approach support? Existing instantiations of the Diffserv model support coarse service differentiation, focusing primarily on aggregates of flows, and
differentiating between service classes rather than providing
per-flow QoS measures. However, recent work on core-stateless networks [4, 5, 11] has proposed approaches to achieve finer-grained service differentiation in an attempt to emulate the richer Intserv service model within the framework of the scalable Diffserv network model.
In this paper, we present the Corelite QoS architecture
with a focus on the set of edge router and core router mechanisms for achieving per-flow weighted rate fairness in a
core-stateless network. Weighted rate fairness achieves fine-grained service discrimination on the end-to-end rate allocated to flows, and has been previously used in state-intensive Intserv-like networks as a sophisticated service model for providing QoS. However, we believe that Corelite is among the first approaches to achieve weighted rate fairness in a core-stateless network [5, 6]. Corelite is broadly based on the Diffserv approach, and the key tenets that form the basis of the Corelite design are: (i) no per-flow state in the core of the network, (ii) a simple forwarding behavior for the core routers, (iii) a low overhead (weighted) fair feedback scheme at the core routers to provide early congestion feedback to the edge routers, and (iv) rate adaptation (at the edges) without any packet loss in the network.
Guided by the above principles, Corelite supports weighted
rate fairness among flows in the network. Briefly, each flow
in the network can choose to belong to one of many rate
classes (each rate class has an associated rate weight), and the rate allotted to a flow is in accordance with a weighted version of the max-min fairness algorithm [1], given the flows in the network and their respective rate weights.
The rest of the paper is organized as follows. Section 2 describes the weighted rate fairness service model, and presents the high level Corelite approach for achieving weighted rate fairness. Section 3 explores the key core router functionality in Corelite in greater detail. Section 4 evaluates the performance of Corelite and compares it quantitatively with related work. Section 5 discusses some related work. Section 6 concludes the paper.

Footnote 1: In this paper, we borrow from the terminology used in [5] and refer to networks conforming to the network model with no state maintenance in the core as core-stateless networks.

2. The Corelite Architecture

The Internet can be viewed as an agglomeration of autonomous heterogeneous network clouds. Each network cloud consists of core routers at the center and edge routers at the fringes. An end host to end host connection can potentially flow through multiple network clouds. However, the mechanisms proposed in Corelite are for a single network cloud, and hence can be deployed in a network cloud independently of other network clouds. Further, since Corelite proposes mechanisms for a single network cloud as opposed to an inter-network in general, its mechanisms are edge-to-edge mechanisms and not end-to-end mechanisms. Thus, any reference to a flow in the rest of this paper signifies an edge-to-edge flow that can potentially comprise several end-to-end micro-flows.

There are two key components to be addressed in such a setup: (a) edge router-core router interaction within a network cloud, and (b) edge router-edge router interaction across neighboring network clouds. In this paper we will only focus on the first component: interaction between the edge router and core router in a network cloud. Towards this end, we will consider a single network cloud, and show how we can achieve weighted rate fairness among the flows that traverse the cloud without maintaining any per-flow state in the core routers. We now define weighted rate fairness formally.

2.1 Weighted Rate Fairness

In Corelite, each flow is assigned a rate weight, and the network bandwidth is distributed among competing flows in accordance with their rate weights in order to achieve weighted rate fairness. We define weighted rate fairness as a weighted version of max-min fairness, where two flows that share the same bottleneck link are allocated the link bandwidth in the ratio of their rate weights. Let b = <b(i) | i ∈ F> denote a rate allocation vector, where b(i) is the rate allocated to flow i, and let w(i) denote the rate weight of flow i. A weighted fair rate allocation vector b thus satisfies the following condition: for any other distinct rate allocation vector b′,

    ∀i, b′(i) > b(i) ⟹ ∃j such that b(j)/w(j) ≤ b(i)/w(i) and b′(j) < b(j).

We define b(i)/w(i) to be the normalized rate of flow i. The goal of weighted rate fairness is thus to achieve max-min fairness of the normalized rates rather than the actual rates of flows. While Corelite does not place any bounds on the number or range of the distinct rate weights that can be supported, we expect that a network administrator will typically provide a small number of rate classes for a network, and associate a rate weight with each class. Each flow will then select a rate class.

Max-min fairness is a well known concept [1], and weighted max-min fairness is a fairly straightforward extension. Unfortunately, achieving weighted max-min fairness without maintaining per-flow state in the routers is no easy task. In a Diffserv-like network, no core router knows which flows are traversing through it, let alone their rate weights; no edge router knows which other flows are sharing the path traversed by a flow that it controls; there is no centralized knowledge; and decisions for rate adaptation are made for each flow based purely on the feedback received for that flow.

2.2 Achieving Weighted Rate Fairness in Corelite

Let us now look at the basic operation of Corelite. There are three main steps, as illustrated in Figure 1:

1. Shaping and Marking at the Edge Router: Each (ingress) edge router maintains the allowed transmission rate bg(f) for every flow passing through it (into the network cloud), and shapes the flow's traffic according to its current bg(f).

In addition to shaping, the edge router periodically introduces marker packets in the flow, transparent to the sender and receiver of the packet flow, such that the rate of transmission of the marker packets reflects the normalized rate of the packet flow. Specifically, an edge router introduces a marker packet after every Nw data packets (or bytes) of the packet flow, where Nw = K1·w, K1 is a constant and w is the weight class of the flow. Thus, a flow f that transmits at the rate of bg(f) has a marker packet rate of bg(f)/(K1·w(f)). Recall that bg(f)/w(f) is the normalized rate of the flow. In Figure 1.(1), K1 = 1. Thus flow A has a marker packet inserted after each data packet, and flow B has a marker packet inserted after every alternate data packet. Note that the marker rates reflect the normalized flow rates.

The marker packet is logically distinct, though it may be physically piggybacked to a data packet. The source address of the marker is the edge router that generated it, and the contents of the marker identify the packet flow to which it corresponds uniquely within the edge router.

2. Marker Caching and Feedback at the Core Router: When a core router receives a data packet, it forwards the packet according to its standard forwarding behavior. When it receives a marker packet, it forwards the packet likewise, but also copies the marker packet into a local marker cache (which is a circular queue). The marker cache thus contains the recent history of packet transmissions, and the number of markers of a packet flow in the marker cache is proportional to the normalized rate of the flow. The marker cache in Figure 1 illustrates this fact. It is important to note that the core router does not inspect the contents of the marker packets and performs no per-flow processing or state management at all (Footnote 2).

Periodically, each core router detects incipient congestion by checking for queue buildup in the packet queue(s). Upon detecting incipient congestion, the core router does not drop queued packets. Instead, it computes how many marker notifications it must send back, randomly selects markers from the marker cache, and sends each selected marker back to the edge router that generated the marker (based on the source address of the marker). The expected number of markers selected for a flow is proportional to its normalized transmission rate. In Figure 1.(2), flows A and B transmit at the same absolute rate; thus the normalized rate of flow A is twice that of flow B, and A receives twice as many marker feedbacks as B does. Of course, the core router does not know or care which flows it generates markers for.

Note that the feedback is generated on the basis of markers in the cache rather than packets currently in the queue or incoming packets in the next epoch. Thus the feedback mechanism in Corelite is designed to be independent of the scheduling discipline at the core router and fairly insensitive to bursty flows.

3. Rate Adaptation at the Edge Router: Periodically (once every fixed size "epoch"), each edge router checks for marker feedback from core routers. For each flow that traverses through it, if the edge router received markers for the flow in the last period, it throttles back the rate for the flow proportional to the number of received markers; otherwise it increases the rate for the flow by a constant (to probe for additional rate along the flow path):

    bg(f) = bg(f) + α                     if m(f) = 0
    bg(f) = max{0, bg(f) − β·m(f)}        if m(f) > 0

where α and β are increase and decrease constants respectively, and m(f) is the number of markers received in the last period. The edge router reacts to the maximum of the markers received from any core router for each flow, rather than the sum of the markers received for the flow, because the goal is to throttle the rate in response to the bottleneck link. We know from the core router behavior that m(f) is proportional to bg(f)/w(f). Thus, the decrease function in the rate adaptation algorithm is effectively a weighted variant of the well known linear-increase multiplicative-decrease (LIMD) rate adaptation algorithm that is known to converge to fairness [3]. Figure 1.(3) shows the updated rate after flows A and B receive feedback, with β = 1. The rates for the flows evolve as shown in Figure 1.(4) and asymptotically oscillate around the intersection of the fairness and efficiency lines.

Figure 1. Illustration of the Operation of Corelite.

What makes Corelite achieve weighted rate fairness without maintaining any per-flow state? There are two features that work in concert: first, the core router generates feedback in proportion to the normalized rate of a flow (i.e. m(f) = k·bg(f)/w(f)) because edge routers insert markers to reflect the normalized rate of the flow, and second, the edge router throttles the rate in proportion to the number of received markers (i.e. bg(f) = bg(f)·(1 − β·k/w(f))). In effect, we execute an enhanced LIMD algorithm at the edge routers where the feedback is known to be fair. This leads to weighted rate fairness, as we show through both simulations and analysis.

Footnote 2: It can be argued that the marker cache implicitly maintains some per-flow state in the form of markers even if the core router does not explicitly maintain per-flow state. In fact, we have only used the marker cache-based mechanism as a simple way of introducing the Corelite approach. In Section 3, we will describe an equivalent mechanism for generating weighted fair marker feedback that does not require the maintenance of marker caches, and is thus truly flow stateless.
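The per-epoch edge-router rate adaptation described above can be sketched in a few lines (a minimal sketch, not the paper's implementation; the function name and default constants are illustrative):

```python
def adapt_rate(b_g, m_f, alpha=1.0, beta=1.0):
    """One epoch of edge-router rate adaptation for a single flow.

    b_g: current allowed transmission rate bg(f).
    m_f: markers received for the flow in the last epoch (the maximum
         over core routers, so the reaction tracks the bottleneck).
    """
    if m_f == 0:
        return b_g + alpha               # linear increase: probe for spare rate
    return max(0.0, b_g - beta * m_f)    # decrease proportional to feedback
```

Since m(f) is proportional to bg(f)/w(f), the decrease branch behaves multiplicatively, which is what makes this a weighted LIMD variant.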
Thus far, we have presented a high level overview of
Corelite. Several important details remain to be addressed:
(a) how is incipient congestion detected, (b) how many
markers are selected upon detection of congestion, (c) how
big does the marker cache need to be, and (d) can we get rid of the marker cache altogether and come up with a much simpler scheme for fair marker selection? In the next section, we address the issues raised above. Specifically, we
explore an alternative approach for marker feedback selection at the core router that does not require the maintenance
of marker caches, thereby reducing the memory overhead
of Corelite and making it truly flow stateless.
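Before turning to marker management, the edge-side marking rule of Section 2.2 (one marker per Nw = K1·w data packets) can be illustrated with a small sketch (illustrative names, not from the paper):

```python
def marker_positions(num_packets, w, k1=1):
    """1-based indices of data packets after which the edge router
    inserts a marker: one marker per N_w = k1 * w data packets."""
    n_w = k1 * w
    return [i for i in range(1, num_packets + 1) if i % n_w == 0]

# With k1 = 1, a flow with weight 1 gets a marker after every packet
# and a flow with weight 2 after every alternate packet, so marker
# counts reflect the normalized rates bg(f)/w(f).
```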
3. Marker Management in Corelite

In Corelite, the edge router is responsible for two functions: (a) shaping and marker injection, and (b) rate adaptation, and the core router is responsible for two functions: (a) incipient congestion detection, and (b) marker management and weighted fair marker feedback. Since the unique feature of Corelite is the ability of core routers to provide weighted fair feedback without maintaining per-flow state, we now focus on the two core router mechanisms.

3.1 Incipient Congestion Detection

Each core router monitors its packet queues, and upon detecting incipient congestion, sends back a sufficient number of markers to the edge routers, so that the consequent rate throttling performed by the edge routers will reduce the aggregate input traffic into the core router in the future, and thus alleviate congestion before queues become full and packets are dropped.

In Corelite, the core router detects incipient congestion by monitoring the length of its packet queues. A core router may have multiple packet queues depending on its forwarding behavior. For purposes of congestion detection, we only care about the aggregate queue size over all the queues corresponding to a link. The congestion detection function is performed periodically, once every congestion epoch. The core router maintains the average of the aggregate queue size, qavg, during the epoch. At the end of every epoch, the core router checks to see if qavg exceeds a predefined congestion threshold queue size qthresh. If qavg > qthresh, then the core router concludes that there is incipient congestion and that it must send back markers to initiate rate throttling at the edge routers.

The remaining question is how many markers to send back. Consider that the sending of each marker causes a rate throttling of at least β. Then, the number of markers sent back, Fn, is computed by the following equation:

    Fn = μ·{ qavg/(1 + qavg) − qthresh/(1 + qthresh) } + k·(qavg − qthresh)³

where μ is the service rate of the outgoing link in packets per congestion epoch.

The first term represents the estimated amount by which the input rate must be throttled under the assumption of M/M/1 queues, and the second term represents a self-correcting factor if the M/M/1 assumption is wrong. If k = 0, the second term vanishes. Then Fn represents the difference between the estimated packet arrival rate that corresponds to the average queue size of qavg (i.e. μ·qavg/(qavg + 1)) and the desired expected arrival rate that corresponds to the average queue size of qthresh (i.e. μ·qthresh/(qthresh + 1)), for the case of a Poisson arrival process with exponentially distributed packet service time. Thus, Fn is the amount by which the aggregate traffic rate must be throttled. Since each marker packet causes the edge router to throttle the rate for the corresponding flow by at least β, the core router needs to send back Fn markers to ensure the required drop in aggregate input traffic rate.

Of course, the traffic assumptions of Poisson arrivals and exponential service times are not always true. In particular, it may turn out that fewer markers are selected according to this formula than required. In this case, if k = 0, the rate of increase of the marker selection rate, dFn/dqavg ∝ 1/(qavg + 1)², diminishes as the queue grows, and may lead to progressively larger queue buildups and subsequent packet drops. This is where the second term comes in. With a small but non-zero k, after the queue size becomes sufficiently large, d²Fn/dq²avg starts to be dominated by the k·qavg contribution of the second term, and thus a larger average qavg causes the core router to send back sufficient markers that the input traffic will be throttled effectively. This keeps the queues from overflowing. Additionally, since k is small, for small queue sizes the second term has no significant impact in terms of generating markers too conservatively.

Our initial simulations on measuring the sensitivity of Corelite to input traffic patterns indicate that the computation for Fn works reasonably well even if the Poisson traffic assumptions do not hold. However, a more detailed sensitivity analysis of this function is ongoing work. Further, the congestion estimation module can be replaced with no impact on the rest of the Corelite mechanisms.
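The congestion-feedback computation of Section 3.1 can be sketched as follows (illustrative names; rounding to a whole marker count and the default value of k are assumptions, not taken from the paper):

```python
def markers_to_send(q_avg, q_thresh, mu, k=0.001):
    """Number of marker feedbacks F_n to generate in one epoch.

    mu: service rate of the outgoing link, in packets per epoch.
    First term: M/M/1 estimate of the excess arrival rate for the
    observed average queue q_avg versus the target q_thresh.
    Second term: cubic self-correction for non-Poisson traffic.
    """
    if q_avg <= q_thresh:
        return 0  # no incipient congestion this epoch
    f_n = mu * (q_avg / (1 + q_avg) - q_thresh / (1 + q_thresh)) \
          + k * (q_avg - q_thresh) ** 3
    return max(0, round(f_n))
```

With k small, the cubic term is negligible near the threshold and dominates only once the queue grows well past it, which is exactly the self-correcting behavior described above.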
3.2 Marker Selection

In this section we present a marker maintenance and selection mechanism to generate weighted fair feedback. "Weighted fair feedback" means that when a link detects incipient congestion and decides to send back Fn markers, it sends

    Fn · (bg(f)/w(f)) / Σi (bg(i)/w(i))

markers to flow f. The scheme requires no marker caches, and additionally sends feedback selectively only for those flows whose input traffic is larger than their weighted fair share; i.e., flow f receives no feedback if bg(f) ≤ weighted fair share, and receives

    Fn · (bg(f)/w(f)) / Σ{i | bg(i) > weighted fair share} (bg(i)/w(i))

markers otherwise. Further, the scheme does not require per-flow processing or maintaining per-flow state at the core routers.
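The weighted fair feedback rule above amounts to splitting Fn among flows in proportion to their normalized rates bg(f)/w(f); a minimal sketch with illustrative names:

```python
def fair_feedback_share(flows, weights, rates, f_n):
    """Split f_n marker feedbacks among flows in proportion to their
    normalized rates b_g(f)/w(f) (the weighted fair feedback rule)."""
    norm = {f: rates[f] / weights[f] for f in flows}
    total = sum(norm.values())
    return {f: f_n * norm[f] / total for f in flows}
```

For two flows with equal absolute rates but weights 1 and 2, the weight-1 flow receives twice the feedback, mirroring the marker-cache example of Section 2.2.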
The selective marker feedback approach presented in this section is motivated by CSFQ [5], in which the goal is to select markers corresponding to only those flows that are transmitting more than their weighted fair share. When the edge router sends a marker packet, it also puts the normalized packet transmission rate, rn = bg/w, for the flow in the marker packet. The core router maintains a running average, rav, of the labelled rate rn over all the markers that traverse through it. This is the only additional state variable that is required at the core router to achieve a weighted fair marker selection.

The core router determines the number, Fn, of markers to send as feedback to throttle flow rates. Further, the core router selects only markers whose normalized rate rn is greater than or equal to the running average rav maintained at the core. Note that rav is the computed average of the normalized rate, and rav overestimates the average since more markers are encountered corresponding to flows with larger normalized rates. In other words, selecting flows with rn ≥ rav isolates only those flows that are over-utilizing the link unfairly.

What remains is to describe the precise marker selection mechanism without requiring core router per-flow state. The probability of selecting a marker is computed as pw = Fn/wav, where wav is the running average of the number of markers observed in each epoch. Additionally, a deficit variable is maintained, reset to 0 at the start of each epoch. When the core router sees a marker, it first selects the marker for feedback with a probability of pw: (a) if the marker is selected and its labelled rate rn ≥ rav, the marker is sent back to the edge router that generated it; (b) otherwise, if the marker is selected but its labelled rate rn < rav, the marker is not sent back as feedback, but the deficit variable is incremented; (c) otherwise, if the marker is not selected, but the deficit variable is positive and the labelled rate rn ≥ rav, then the marker is sent as feedback and the deficit variable is decremented. In summary, the deficit variable ensures that if a marker corresponding to a flow with normalized rate lower than the running average happens to be selected, then it is swapped with a future marker corresponding to a flow whose normalized rate is at or above the average.
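The probabilistic selection with a deficit variable can be sketched as follows (illustrative names; capping pw at 1 and representing each marker as an (edge router, rn) pair are assumptions made for the sketch):

```python
import random

def select_markers(markers, f_n, w_av, r_av):
    """One epoch of weighted fair marker selection at a core router.

    markers: sequence of (edge_router, r_n) labels seen this epoch.
    f_n:     number of marker feedbacks to generate.
    w_av:    running average of markers observed per epoch.
    r_av:    running average of the labelled normalized rates.
    """
    p_w = min(1.0, f_n / w_av)  # per-marker selection probability
    deficit = 0                 # reset at the start of each epoch
    feedback = []
    for edge, r_n in markers:
        if random.random() < p_w:
            if r_n >= r_av:
                feedback.append((edge, r_n))  # over-average flow: send back
            else:
                deficit += 1                  # defer to a future marker
        elif deficit > 0 and r_n >= r_av:
            feedback.append((edge, r_n))      # swap in an over-average marker
            deficit -= 1
    return feedback
```

By construction, only markers with rn ≥ rav are ever returned, so flows below the running average receive no congestion notifications.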
The above approach has the advantage of selectively throttling misbehaving flows without maintaining per-flow state, which is a very powerful feature. However, there are some issues associated with the approach. First, we only consider the incoming markers in the current epoch when selecting markers for feedback. This makes the approach susceptible to bursts in packet/marker arrival. Second, there is no guarantee that the required number of markers will in fact be selected in the current epoch.

Figure 2. Network Topology (sources S1-S20, edge routers R1-R20, core routers C1-C4; links of 4 Mbps capacity and 40 ms delay).
In summary, the algorithm is similar to CSFQ, but improves on it: unlike CSFQ, it does not depend on the accuracy of an explicit fair share estimate. In the performance evaluation section, we show how and why this approach performs better than CSFQ.
Over the last two sections, we have described the key
mechanisms that enable Corelite to achieve weighted rate
fairness in a core-stateless network. These include traffic
shaping, marking and rate adaptation mechanisms of edge
routers (discussed in Section 2) and the congestion estimation and marker selection mechanisms of core routers
(discussed in Sections 2 and 3). In the next section, we will
evaluate the performance of Corelite.
4. Performance Evaluation
In this section, we use simulations to evaluate our model and compare the weighted rate fairness obtained in the context of Corelite against the weighted version of CSFQ [5]. The simulations (see Footnote 3), performed using the ns-2.1b4a simulator [12], serve to validate the mechanisms used in Corelite. We present two sets of results. In the first set, we illustrate the efficacy of the mechanisms used to provide minimum rate contracts and weighted rate fairness in Corelite by computing the expected rate values and comparing them with the actual rates allocated (bg) to flows. In the second set, we compare the weighted rate fairness obtained in the context of Corelite against the weighted version of CSFQ (see Footnote 4). In this case we compare their behavior in steady state and when flows dynamically enter and exit the network.
The topology used for the simulations is shown in Figure
3 The ns implementation of the new objects and the simulation
scripts used to obtain the results in this section can be obtained from
http://timely.crhc.uiuc.edu/Projects/Corelite/
4 The
CSFQ implementation for ns was obtained from
http://www.cs.cmu.edu/˜istoica/csfq/software.html
5
Alloted rate
Number of packets successfully sent
120
60000
flow1
flow2
flow3
flow4
flow5
flow6
flow7
flow8
flow9
flow10
flow11
flow12
flow13
flow14
flow15
flow16
flow17
flow18
flow19
flow20
alloted_rate
80
60
40
50000
40000
total_sent
100
flow1
flow2
flow3
flow4
flow5
flow6
flow7
flow8
flow9
flow10
flow11
flow12
flow13
flow14
flow15
flow16
flow17
flow18
flow19
flow20
30000
20000
20
10000
0
0
0
100
200
300
400
500
600
700
800
0
time in seconds
100
200
300
400
500
600
700
800
time in seconds
Figure 3. Instantaneous Rate
Figure 4. Cumulative Service
2. Flows 1, 9, 10, 11 and 16 start at time t=250 seconds and
stop at time t=500 seconds. All other flows start at time t=0
seconds and stop at time t=750 seconds.
2. It consists of three congested links and has flows that traverse different number of congested links. Flows in topology 1 also have high and varying round trip times ranging
from 240ms to 400ms. The source agents that we have used
to obtain the results for Corelite and CSFQ use similar rate
adaptation schemes viz. decrease the sending rate proportional to the number of congestion indication messages received (losses in case of CSFQ) or increase the sending rate
by one every epoch. After startup, the agents remain in the
slow-start phase (doubling the sending rate every second)
until they receive the first congestion notification or until
the out-of-profile rate exceeds ss-thresh (set to 32 packets
per second) at which point they reduce their rate by half
and switch to the linear increase phase. All the simulations
presented in this section use a fixed packet size of 1KB,
K 1 (used for generating markers) of 1, and (used for
rate adaptation) of 1, a queue size of 40 packets, congestion detection threshold of 8 packets, and an epoch size of
100ms at the core router. All the links have a bandwidth of
4Mbps(500 packets per second) and a latency of 2ms. For
CSFQ, K (used in estimating flow rate) and Klink (average
interval for computing rateTotal) were set to 100ms. In all
the cases, we assume that the flows always have packets to
send.
The results from this scenario are shown in Figures 3 and
4. We will first calculate the expected rate for the flows and
then compare it with the rates obtained by the flows.
To calculate the expected rates for the flows 1 to 20, observe that all the links have a bandwidth of 500 packets per
second. Initially, when flows 1, 9, 10, 11 and 16 are not
in the network, each flow should get a rate of 33.33 packets per second per unit weight. In Figure 3 flows 5 and 15
have an alloted rate of 99.99 packets per second (33.33 * 3)
since they have a rate weight of 3. The other flows in the
network have rate weights of 2 and hence have an alloted
rate of 66.66 packets per second. At time t=250 seconds,
when flows 1, 9, 10, 11 and 16 are introduced, the fair share
per unit weight drops down to 25 packets per second. Consequently, flows 5 and 15 have an alloted rate of 75 packets
per second, flows 1, 11 and 16 have an alloted rate of 25
packets per second and all other flows have an alloted rate
of 50 packets per second. Finally, when flows 1, 9, 10, 11
and 16 stop, the other flows climb up to their original rate
allocations.
Figures 3 and 4 show the results of the simulations and
they conform to the expected values. In Figure 3 we observe that all flows except 1,9,10,11 and 16 start at time
0, and converge rapidly to their fair share. When flows
1,9,10,11,16 start at time 250seonds, other flows fall back
almost instantaneously. The new flows receive no congestion notifications until they reach a point close to its fairshare. The 3 flows at the bottom of Figure 3 are flows 1,
11 and 16, having a weight of 1. Although these flows traverse different paths they all approximately get their fair
share of 25 packets per second. The largest bunch of flows
right above these three flows are the flows with a weight of
2. They receive approximately twice the amount of excess
bandwidth compared to flows with weight 1. This set again
has flows traversing different number of congested links and
4.1 Weighted Rate Fairness with Network Dynamics
In this scenario, we illustrate how Corelite can effectively support weighted rate fairness in a core stateless network. We consider a total of 20 flows with flows 1 to 5,
11 to 12 and 16 to 20 passing through only a single congested link between C1-C2, C2-C3 and C3-C4 respectively
and have a round trip time of 240 ms. Flows 6 to 8 and 13 to
15 traverse two congested links and have a round trip time
of 320 ms while flows 9 and 10 traverse three congested
links and have a round trip time of 400 ms. Flows 5 and
15 have a rate weight of 3, and flows 1, 11 and 16 have a
weight of 1 each. All other flows have their weights set to
6
Alloted rate
Alloted rate
90
90
flow1
flow2
flow3
flow4
flow5
flow6
flow7
flow8
flow9
flow10
80
70
70
60
alloted_rate
alloted_rate
60
flow1
flow2
flow3
flow4
flow5
flow6
flow7
flow8
flow9
flow10
80
50
40
50
40
30
30
20
20
10
10
0
0
0
10
20
30
40
time in seconds
50
60
70
80
Figure 5. Corelite Instantaneous rate
0
10
20
30
40
time in seconds
50
60
70
80
hence with different round trip times.
The closely spaced parallel lines in the cumulative service graph shown in Figure 4 show that the total service obtained by flows having the same weight is the same irrespective of their round trip times and the number of congested links they traverse (recall that our fairness model is max-min rather than proportional or minimum potential delay).

4.2 Weighted Fair Rate Allocation (Corelite vs CSFQ)

In the remaining sections, we compare the performance of Corelite against the weighted version of CSFQ. In this section we start multiple flows with different weights at the same time, and compare the startup and steady-state behavior of Corelite and CSFQ. We use topology 1, with 10 flows having five different weights such that flow i has a weight of ⌈i/2⌉. The results for this scenario obtained with Corelite and CSFQ are shown in Figures 5 and 6 respectively.

Figure 6. CSFQ Instantaneous rate

Both mechanisms achieve results that closely approximate the ideal values in steady state. However, their startup behaviors differ, with Corelite converging faster than CSFQ. In this scenario, with Corelite, no flow experienced packet drops, and a flow sending at a rate lower than its fair share never received congestion notifications. With CSFQ, however, when many flows start up simultaneously, the estimated fair rate deviates from its correct value because it does not track the rapidly changing fair share correctly. If the fair share is underestimated, then packets from flows that are sending below the actual fair share can be dropped. On the other hand, if the fair share is overestimated, then more packets will be accepted than the router can transmit, resulting in queue buildups and potential overflows. This results in flows observing losses even before they reach their fair share. Thus, in CSFQ, the drop behavior degenerates into tail-drop behavior when the buffer overflows; this occurs when the estimated fair share at the core router is higher than the correct value.

In Figure 6, flows 7 and 8 move into the linear decrease phase when their rates are only 30, though their weighted fair share rate is around 70 packets per second. Also, note that these flows experience more packet drops along the way, between times 30 and 50 seconds, before they reach their fair rate. The selective marker feedback mechanism used with Corelite does not try to estimate the fair rate. Instead, it only computes an average of the normalized rates observed in the w marked packets and throttles only those flows that send more than this computed average. In Figure 5, flows 7 to 10, with weights 4 and 5, complete their slow-start phase and move into the linear increase phase at time 7 seconds. They receive congestion notifications only after they are close to their respective fair share rates. This results in Corelite converging more than 30 seconds faster than CSFQ.
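The selective marker feedback decision described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name, the fixed-size window over the last w markers, and the direct rate comparison are assumptions; the paper states only that the core averages the normalized rates carried by the w marked packets and throttles flows exceeding that average.

```python
# Hypothetical sketch of Corelite-style selective marker feedback at a
# core router. On incipient congestion, only flows whose normalized rate
# exceeds the average of the last w observed markers are throttled.
from collections import deque


class SelectiveMarkerFeedback:
    def __init__(self, w):
        self.w = w                     # number of marked packets averaged over
        self.window = deque(maxlen=w)  # normalized rates seen in recent markers

    def observe_marker(self, normalized_rate):
        """Record the normalized rate carried by a marked packet."""
        self.window.append(normalized_rate)

    def should_throttle(self, normalized_rate):
        """On incipient congestion, throttle a flow only if it sends
        above the average normalized rate of the recent markers."""
        if not self.window:
            return False
        avg = sum(self.window) / len(self.window)
        return normalized_rate > avg
```

Note that no per-flow state is needed: the router keeps only the small marker window, and a flow below the average never triggers feedback, matching the observation that under-rate flows received no congestion notifications.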
4.3 Weighted Fairness with Network Dynamics (Corelite vs CSFQ)

In this section we compare the behavior of the two schemes when flows with different weights enter the network one after another in rapid succession. We use the topology in Figure 2 with 20 flows, with flows 1, 11, and 16 having a weight of 1 and flows 5, 10, and 15 having a weight of 3. All other flows have a weight of 2.

Figures 7 and 8 correspond to Corelite and CSFQ respectively, when flows start one second apart in ascending order of flow number. Clearly, convergence is faster in Corelite than in CSFQ. This is because, unlike Corelite, where all flows move to the linear increase phase only after reaching a point close to their final rate, in CSFQ flows observe losses early in their lifetime, resulting in slower convergence. As we mentioned in the previous section, when flows enter the network in rapid succession, the estimated fair share in CSFQ does not converge to the correct value instantaneously, and the core router can degenerate into tail dropping. In Corelite, however, even if there are packet drops, the feedback generation is still fair. Since the edges react only to congestion indications, the rates allocated to flows remain fair.
Figures 9 and 10 show the results obtained with Corelite
Figure 7. Corelite Instantaneous rate
Figure 8. CSFQ Instantaneous rate
and CSFQ respectively, when flows 1 to 20 start one second apart in ascending order and, after a lifetime of 60 seconds, stop one second apart in the same order. The flows then restart 5 seconds after they had stopped. Thus there are flows simultaneously entering and leaving the system between times 65 and 80 seconds. The figures clearly show that Corelite adapts gracefully to the dynamics of the network, whereas with CSFQ the performance degradation is significant, especially for short-lived flows with higher weights, because such flows have a greater chance of exiting their slow-start phase prematurely. Corelite avoids this and provides improved fairness even for short-lived flows.
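The convergence behavior discussed in these sections stems from the edge routers' linear increase/multiplicative decrease rate adaptation in response to marker feedback. The following is a minimal sketch under stated assumptions: the class name, the constants, and in particular the weight-scaled increase step are illustrative choices, not the paper's exact parameters.

```python
# Hypothetical sketch of edge-router rate adaptation: additive (linear)
# increase each epoch without congestion feedback, multiplicative
# decrease on a congestion notification. Scaling the increase step by
# the flow's weight is an assumed realization of weighted adaptation.
class EdgeRateController:
    def __init__(self, weight, increase_step=1.0, decrease_factor=0.5,
                 initial_rate=1.0):
        self.weight = weight
        self.rate = initial_rate
        self.increase_step = increase_step
        self.decrease_factor = decrease_factor

    def on_epoch(self, congestion_notified):
        """Update the allowed transmission rate once per epoch."""
        if congestion_notified:
            self.rate *= self.decrease_factor               # multiplicative decrease
        else:
            self.rate += self.weight * self.increase_step   # linear increase
        return self.rate
```

As the Chiu-Jain analysis [3] shows, additive-increase/multiplicative-decrease dynamics of this form converge toward a fair operating point, which is why flows that exit slow-start near their fair rate converge quickly.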
Figure 9. Corelite Instantaneous rate
Figure 10. CSFQ Instantaneous rate

4.4 Summary of Performance Evaluation

The results that we have presented here serve as a proof of concept for the Corelite mechanisms and justify to some extent the claims made in this paper. Our initial results show that Corelite provides per-flow rate contracts and weighted fair allocation of bandwidth without any per-flow state in the core, and that the system is stable and adapts gracefully to network dynamics. Comparisons with CSFQ indicate that though both mechanisms perform well in steady state, Corelite performs significantly better than CSFQ when the fair share at the core router varies rapidly. In the multiple-hop case, flows traversing multiple congested links in CSFQ experience more losses and hence get a lower cumulative throughput than the ones traversing a single congested link. This is not the case in Corelite, because the edge router can distinguish the congestion indications generated by different core routers. Our simulations with different core router epoch sizes, different marking thresholds, and channels with large latencies indicate that Corelite is not very sensitive to these parameters. The simulations are, however, by no means comprehensive. Simulations using different adaptation schemes at the edge router and different congestion estimation schemes at the core router, and using agents like TCP that involve interaction between the edge router and the end-host, are part of ongoing work.

5. Related Work

Although Corelite is based on the Diffserv philosophy of maintaining no per-flow state in the core, it differs significantly from the existing approaches in terms of the semantics of marking, the specific functionalities of the core and edge routers, and the service profiles it offers to users. A service in Diffserv is typically for traffic aggregates, not individual flows [8]. Most existing Diffserv approaches [13, 14] use a mechanism of marking where packets are marked based on whether they are in-profile or out-of-profile. In particular, packets belonging to in-profile traffic are marked while the others are left unmarked. Marked packets receive preferential treatment as they are forwarded in the network, in terms of a lower drop priority and/or a higher scheduling priority. Core routers drop best-effort traffic before dropping marked packets. However, there is no explicit support for fairness between flows.

There has been a lot of work on incipient congestion detection mechanisms [7, 9]. In [7], when a packet arrives, the router calculates the average queue length over the last busy+idle period and the current busy period; when the average queue length exceeds one, it sets the congestion indication bit in arriving packets. In RED [9], the router maintains an exponentially weighted moving average of the queue length, which is used to detect congestion, along with two thresholds. If the average queue length is less than min_thresh, no packet is dropped, and when it is greater than max_thresh, all packets are dropped. When the average queue length is between these two values, packets are dropped with a probability that is a function of the average queue length. However, RED provides no fairness guarantees. FRED [2] extends RED to provide some degree of fair bandwidth allocation. However, it maintains state for all flows that have at least one packet in the buffer. It also deviates from the ideal case in a number of scenarios, as pointed out in [5]. In CSFQ [5], the core router dynamically calculates the fair share for flows and, on congestion, probabilistically drops packets only from flows whose current utilization of bandwidth is greater than the estimated fair share. CSFQ thereby achieves bandwidth allocations that closely approximate the fair share without maintaining per-flow state. We have compared Corelite and CSFQ in the previous section to discuss some of the trade-offs.
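CSFQ's probabilistic drop rule described above can be sketched as follows. This is a simplified illustration of the unweighted rule from [5]: a packet from a flow sending at rate r is dropped with probability max(0, 1 - α/r), where α is the estimated fair share. The real scheme estimates α and the per-flow rates from packet labels; here both are taken as inputs, which is an assumption made for brevity.

```python
# Sketch of CSFQ-style probabilistic dropping. Rate estimation from
# packet labels is omitted; estimated rates are passed in directly.
import random


def csfq_drop_probability(flow_rate, fair_share):
    """Drop probability for a packet of a flow sending at flow_rate
    when the router's estimated fair share is fair_share."""
    if flow_rate <= 0:
        return 0.0
    return max(0.0, 1.0 - fair_share / flow_rate)


def forward_packet(flow_rate, fair_share, rng=random.random):
    """Return True if the packet is forwarded, False if dropped."""
    return rng() >= csfq_drop_probability(flow_rate, fair_share)
```

This makes the failure mode discussed in Section 4.2 concrete: if the estimated fair share is too low, flows below their true fair rate get a nonzero drop probability; if it is too high, too few packets are dropped and the queue builds up until tail drops occur.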
6. Conclusion

A key design challenge for scalable QoS architectures has been whether it is possible to provide per-flow metrics for rate without maintaining per-flow state in the core of the network. In this paper, we have described the fundamental mechanisms in Corelite that enable us to support weighted rate fairness, which provides per-flow end-to-end relative service classes for rate, as well as minimum rate contracts, with routers that have a simple forwarding behavior and maintain no per-flow state in the core of the network. In Corelite, markers are used to (a) normalize the rate of the flow according to the rate weight of the flow, and (b) enable the core router to directly generate a congestion notification to the edge router, thereby enabling it to maintain the allowed transmission rate of individual flows and drop packets from ill-behaved flows at the edges of the network. We are still investigating the interactions required between the edge routers of different autonomous domains, the interactions between the end-host and the edge router, and the aggregation of flows at the edge router.

References

[1] D. Bertsekas and R. Gallager. Data Networks. Prentice-Hall, second edition.
[2] D. Lin and R. Morris. Dynamics of random early detection. Proceedings of ACM SIGCOMM, September 1997.
[3] D-M. Chiu and R. Jain. Analysis of the increase and decrease algorithms for congestion avoidance in computer networks. Journal of Computer Networks and ISDN Systems, 17(1), June 1989.
[4] I. Stoica and H. Zhang. Providing guaranteed services without per-flow management. Proceedings of ACM SIGCOMM, September 1999.
[5] I. Stoica, S. Shenker, and H. Zhang. Core-stateless fair queueing: Achieving approximately fair bandwidth allocations in high speed networks. Proceedings of ACM SIGCOMM, September 1998.
[6] N. Venkitaraman, R. Sivakumar, T. Kim, S. Lu, and V. Bharghavan. The Corelite QoS architecture: Providing a flexible service model with a stateless core. TIMELY Research Report, February 1999.
[7] R. Jain and K. K. Ramakrishnan. Congestion avoidance in computer networks with a connectionless network layer: Concepts, goals and methodology. Proceedings of the IEEE Computer Networking Symposium, April 1988.
[8] S. Blake, et al. A framework for differentiated services. Internet Draft, October 1998.
[9] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking, 1(4), August 1993.
[10] S. Shenker and C. Partridge. Specification of guaranteed quality of service. RFC 2212, September 1997.
[11] T. Kim, R. Sivakumar, K-W. Lee, and V. Bharghavan. Multicast service differentiation in core-stateless networks. Proceedings of the International Workshop on Networked Group Communication, November 1999.
[12] UCB/LBNL/VINT Network Simulator - ns (version 2). http://www-mash.cs.berkeley.edu/ns/.
[13] V. Jacobson. Differentiated services architecture. Talk in the Int-serv WG at the Munich IETF, August 1997.
[14] W. Feng, D. Kandlur, D. Saha, and K. Shin. Adaptive packet marking for providing differentiated services in the Internet. Proceedings of the International Conference on Network Protocols, October 1998.