Scalable Distributed Router Mechanisms to
Encourage Network Congestion Avoidance
by
Rena Whei-Ming Yang
Submitted to the Department of Electrical Engineering and
Computer Science
in partial fulfillment of the requirements for the degree of
Master of Engineering in Electrical Engineering and Computer
Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 1998
© Massachusetts Institute of Technology 1998. All rights reserved.
Author ..........
Department of Electrical Engineering and Computer Science
February 9, 1998
Certified by
John T. Wroclawski
Research Scientist
Thesis Supervisor
Accepted by
Arthur C. Smith
Chairman, Department Committee on Graduate Theses
Scalable Distributed Router Mechanisms to Encourage
Network Congestion Avoidance
by
Rena Whei-Ming Yang
Submitted to the Department of Electrical Engineering and Computer Science
on February 9, 1998, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
Abstract
Many forces have led to an increase in the amount of network traffic which is nonadaptive in the presence of congestion. This non-congestion-controlled network traffic
is difficult to constrain because of the aggregating process which occurs within the
network. Previous attempts have been made to enforce compliance with network feedback by isolating and regulating individual flows, but with these come considerable
development and/or computational costs.
However, several new designs are emerging which allow less costly access to information to identify traffic flows which could be considered non-congestion-controlled
and which provide ways to penalize these flows locally. This thesis studies a means of
using such mechanisms to identify nonadaptive network flows, and proposes a protocol to push this information, along with penalization responsibility, towards the flows'
sources. This reduces the negative effects that these flows have on adaptive network
traffic competing for the same resources. We propose a design for such a pushback
protocol, build a network simulation of this pushback protocol integrated with an
identification and penalization scheme, and analyze its effectiveness in constraining
nonadaptive network traffic in several scenarios.
Thesis Supervisor: John T. Wroclawski
Title: Research Scientist
Acknowledgments
Many people have provided valuable contributions and support during this thesis'
execution and writing. I would like to thank these people, in particular, for their
help:
John Wroclawski, my thesis advisor: for his patience, and insightful comments
and conversations.
The conceivers and writers of the Advanced Network Architecture's latest DARPA
proposal [4]: for the idea of this pushback protocol.
The real and virtual inhabitants of NE43-537: for their feedback, conversation,
and ideas. I especially want to thank Wenjia Fang, Rob Cheng, Dina Katabi, Elliot
Schwartz, and Dan Lee: for their passionate interest in everything they do and their
keen evaluations and questions about the design of this protocol.
My family: for their trust and belief in me, and their guidance throughout my
life.
All of my friends, especially Art Housinger: for their support, their words of
encouragement, and their belief that I could finish this thesis, even when I had none.
The developers of RED and its penalty box extensions, FRED, and ns: for the
infrastructure and ideas without which I could not have developed any of this.
The Defense Advanced Research Projects Agency contract #DADT63-94-C-0072:
which provided financial sustenance whilst I was doing all of this research.
Anne Hunter: for her vast knowledge of the answers to all questions, her patience
and tolerance, and her belief in the students at MIT.
Contents

1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Goals
  1.4 Thesis Overview

2 Existing Measures
  2.1 Per-Flow or Quasi-Per-Flow Constraint
  2.2 Lightweight Identification with Penalization
    2.2.1 RED with Penalty Box
    2.2.2 FRED
  2.3 Reserved Bandwidth
    2.3.1 RSVP
    2.3.2 Expected Capacity Framework

3 Design Issues
  3.1 Protocol Expressiveness
  3.2 Propagation
    3.2.1 Path Determination
    3.2.2 Flow Granularity
  3.3 Timeouts/Time Scales
    3.3.1 Pushback Timeouts
    3.3.2 Penalty Timeouts
  3.4 Penalization
  3.5 Trust Issues
  3.6 Contiguous Propagation

4 Design Description
  4.1 High Level Operation of Integrated System
  4.2 Implementation
    4.2.1 Granularity
    4.2.2 Pushback Packet
    4.2.3 Nonadaptive Identification Test
    4.2.4 Pushback Sender
    4.2.5 Penalization
    4.2.6 Pushback Receiver
    4.2.7 Trust

5 Simulations/Evaluation
  5.1 Simple Pushback
  5.2 Robustness in Adverse Environments
    5.2.1 Lossy Environments
    5.2.2 Routing Dynamics
  5.3 Cost
    5.3.1 Overhead/processing
    5.3.2 Effective Time Scale of Protocol

6 Conclusion
  6.1 Evaluation
  6.2 Remaining Issues/Future Work
List of Figures

3-1 Local Identification and Penalization
3-2 Identification + Pushed Back Penalization
4-1 Pushback Packet Format
5-1 Simple Pushback Network Topology
5-2 No nonadaptive constraint, on links L1 and L2
5-3 FRED nonadaptive constraint + local penalization, on links L1 and L2
5-4 FRED nonadaptive constraint + 1 pushback, on links L1 and L2
5-5 Lossy Network Topology
5-6 Loss rates 0-20%, measured on links L1, L2, and L3
5-7 Loss rates 30-50%, measured on links L1, L2, and L3
5-8 Loss rates 60-80%, measured on links L1, L2, and L3
5-9 Routing Dynamics Topology
5-10 Short term route oscillations: uptime mean=1s, downtime mean=0.1s
5-11 Medium term route oscillations: uptime mean=10s, downtime mean=1s
5-12 Longer term route oscillations: uptime mean=20s, downtime mean=20s
5-13 Longer term route oscillations: uptime mean=50s, downtime mean=5s
Chapter 1
Introduction
1.1
Background
In 1986, the Internet suffered a series of congestion collapses, or periods when the
network was fully or nearly fully utilized, yet little of the network traffic sent was actually arriving at its destination. These collapses occurred because the transport protocols in use at the time lacked sufficient adaptive mechanisms to deal with congestion.
In 1988, several algorithms were suggested to increase Internet stability through the use of congestion control mechanisms in end-to-end transport protocols [11]. These modifications by Jacobson for congestion avoidance and control, which included guidelines for
adapting host sending rates and timing behavior to create a more stable system during times of congestion, corrected certain features of the original TCP specification
which were found to be inadequate.
These modifications were distributed and adopted widely with the Berkeley UNIX
operating system, BSD. Indeed, RFCs 1122 and 1123: "Requirements for Internet
Hosts" [1, 2] actually demand that all hosts implement Jacobson's modifications to
the original TCP specification. Although hosts complied with these demands, it was
obvious that the network architecture had no mechanism to enforce such network
standards and policies.
1.2
Motivation
More recently, the expansion and commercialization of the network, poor implementations of transport protocols, new types of network services, and the ever-increasing
pressure for performance of network products, have led to an increase in the amount
of network traffic which is non-congestion-controlled.
Floyd and Fall [8] identify two dangers with this trend. The first of these dangers
is that by ignoring congestion indications by the network, these nonadaptive network
flows may bring about conditions which could lead to congestion collapse. The second
danger is that these network flows, by being nonadaptive, capture a disproportionately large fraction of the available bandwidth during times of congestion, relative
to adaptive network flows. This may cause acute unfairness and disadvantage to
adaptive network flows, and forms one of the incentives for being nonadaptive.
To conserve resources, the network aggregates traffic, losing its ability to extract
details about the behavior of any of the individual flows that it contains. This lost
detail is exactly the information necessary for any nonadaptive identification and constraint mechanism to operate, and so non-congestion-controlled network traffic which
is being deployed remains unconstrained. Given the increase in this new generation of
non-congestion-controlled traffic and its dangers, there is an immediate, urgent need
for moderation [3].
Previous attempts have been made to enforce compliance with network feedback by
isolating and regulating individual flows. Unfortunately, although this design limits
the effects of misbehavior within the network, it also incurs considerable development
and/or computational cost.
The two goals of enforcing network feedback and minimizing network cost may
seem at odds with one another. However, several new designs are emerging which
appear to address both of them. Initial work has been done to use less costly means of
extracting information to identify traffic flows which could be considered misbehaving
and to penalize these locally. This thesis continues in such a direction, but attempts
to increase efficiency in dealing with misbehaving traffic by using the resources of
upstream neighbors to push penalization closer to the flow's origin and constrain the
flow before it can cause damage, and to free the resources of heavily loaded central
nodes.
We propose to use mechanisms to help identify network flows which may be nonadaptive in the face of congestion, then to integrate such identification tests with a pushback protocol which pushes the identity of nonadaptive network flows towards the source of these flows, and to penalize these flows at some node closer to that source, decreasing the congestion in the intervening network nodes and possibly reducing the disadvantage to some of the adaptive network flows competing for the same resources.
1.3
Goals
In designing such a system, we must keep several goals in mind. There are several
layers to such goals. The global goal is to encourage end-to-end congestion control
by creating incentives to do so, in the form of penalties against network flows which
are non-congestion-controlled.
Another goal is to minimize the negative effects of
existing nonadaptive network flows.
More specific to our protocol:
* In the case when no other routers respond to the pushback protocol, our system
should produce network behavior similar to a system which locally identifies
and penalizes nonadaptive network flows. This implies no significant delays in
identification or penalization, and also, no changes to the regulated flow set.
* It should also be backwards compatible with both routers which do not run
a nonadaptive test, and routers which do not understand this protocol. At its
heart, this protocol is an optional addition to the current capabilities of the network. Because of this, its use should not noticeably degrade the functioning of a
router, regardless of the presence or absence of other routers which understand
the protocol.
* In the case when other routers respond to the pushback protocol, the system should
produce network behavior at least as efficient as a system which locally identifies
and penalizes nonadaptive network flows, and more efficient network behavior
in the common case.
* The system should be robust. The protocol should be able to adapt to a considerable amount of failure in routers or links over a reasonable period of time.
It should also adapt to routing changes over time.
* The system should require minimal network and processing resources. It should
provide a benefit commensurate with its overhead. Cost of the protocol should
be measured according to the amount of scarce resources it requires.
1.4
Thesis Overview
In the next chapter, we discuss different approaches that have been suggested in the
past to address the existence of nonadaptive network flows. Chapter 3 discusses the
design space for the integrated identification, pushback protocol, and penalization
system. Chapter 4 describes the system that we have designed and built in the ns
environment [16].
Chapter 5 attempts to illustrate the actual effectiveness of our
system in the presence of nonadaptive network flows in several network scenarios
through simulation, and evaluates the design that we have created. Chapter 6 draws
conclusions from the simulations we have run and discusses areas that require further
study.
Chapter 2
Existing Measures
The changing climate of the network and the observation that the network architecture has no mechanism to enforce network congestion avoidance and control policies have led to a number of proposals for dealing with nonadaptive flows. Instead of assuming that network users universally comply with congestion notification and trusting that endpoints will be congestion controlled, these proposals assume that users may be selfish, and that some means of enforcing or encouraging congestion control
should exist within the network. In this chapter, we cover three categories of such
measures.
2.1
Per-Flow or Quasi-Per-Flow Constraint
The first category of such measures is based on fair queuing. The fair queuing designs
attempt to partition and isolate network flows from each other so nonadaptive flows
are placed into their own queue [7, 18]. This enforces "fairness" in the system by limiting the amount of queue resources a given flow can consume. In a busy router using
fair queuing, a traffic flow which attempts to gain a larger portion of the bandwidth
would only increase its own delay. Although this design enforces good behavior of the traffic flows, it is accompanied by a high cost to the network, either in processing cost to the router, or in the development of specialized hardware to optimize performance.
This approach also does not scale well, because the resources to execute fair queuing
increase linearly with the number of flows passing through a node.
A number of efforts have been made to decrease the resources necessary to maintain the good behavior of fair queuing. Stochastic fairness queuing is one such effort
[17]. In stochastic fairness queuing, flows are hashed into queues, instead of being mapped to a particular queue in a strict one-to-one manner. These queues are periodically rehashed, so any unfairness which exists because adaptive flows are hashed into the same queue with nonadaptive flows is transient. This design requires a constant set of resources, but as the number of flows increases, its ability to guarantee good behavior degrades.
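A minimal Python sketch of this hashing idea follows; the queue count and perturbation interval are illustrative values, and this is a sketch of the concept rather than the exact algorithm of [17]:

    import time
    import zlib

    NUM_QUEUES = 64            # fixed, regardless of the number of flows
    PERTURB_INTERVAL = 30.0    # seconds between rehashes (illustrative)

    perturbation = 0
    last_perturb = time.monotonic()

    def queue_index(src, dst):
        """Map a flow to one of NUM_QUEUES queues; periodically changing
        the perturbation value makes any collision between an adaptive
        and a nonadaptive flow transient."""
        global perturbation, last_perturb
        now = time.monotonic()
        if now - last_perturb > PERTURB_INTERVAL:
            perturbation += 1
            last_perturb = now
        key = ("%s|%s|%d" % (src, dst, perturbation)).encode()
        return zlib.crc32(key) % NUM_QUEUES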
2.2
Lightweight Identification with Penalization
The following two approaches are router mechanisms to extract information which
has been lost by the aggregation process without having to resort to costly per flow
scrutiny. These lighter weight mechanisms provide a means to identify network flows
which exhibit nonadaptive characteristics and separate them from the aggregated flow
for closer scrutiny and/or penalization.
Random Early Drop (RED) [9] forms the basis for both of the lightweight identification methods covered in the following sections. In and of itself, it is a queue management algorithm for congestion avoidance in packet switched networks. It executes this queue management function by probabilistically dropping packets depending on its estimate of a time-averaged queue length. If the time-averaged queue length lies below a certain threshold, it drops no packets. In its congestion avoidance phase, it preemptively drops packets with a probability which is proportional to the time-averaged queue length. If the time-averaged queue length rises above a certain threshold, the RED node begins to drop all packets. Although its purpose is not to detect non-congestion-controlled network flows, information gathered from its queue maintenance activities can be used for this purpose.
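A Python sketch of this drop decision, with illustrative constants; the full algorithm of [9] also spaces drops apart by counting packets since the last drop, which is omitted here:

    import random

    MIN_TH, MAX_TH = 5.0, 15.0   # thresholds on the average queue (packets)
    MAX_P = 0.02                 # drop probability reached at MAX_TH
    W_Q = 0.002                  # weight of the time-averaged estimate

    avg = 0.0                    # time-averaged queue length

    def drop_on_arrival(current_qlen):
        """Return True if the arriving packet should be dropped."""
        global avg
        avg = (1 - W_Q) * avg + W_Q * current_qlen
        if avg < MIN_TH:
            return False         # below threshold: drop nothing
        if avg >= MAX_TH:
            return True          # above threshold: drop everything
        # congestion avoidance phase: probability proportional to avg
        p = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)
        return random.random() < p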
2.2.1
RED with Penalty Box
The Penalty Box extension [8] to RED uses RED's congestion avoidance and control
packet drops to sample the network traffic. One very useful characteristic of the packets that the RED algorithm drops is that they are a representative sampling of the
traffic flowing through that gateway; the number of dropped/marked packets from
each flow is proportional to the bandwidth that these packets are consuming. RED
statistical sampling works on the assumption that there is some amount of congestion
in the network, and because RED drops packets with a probability which is proportional to the time averaged queue length, its "sampling" frequency is proportionate
to the congestion level at the router while RED is in its congestion avoidance phase.
The penalty box extension to the RED queue management design utilizes this
characteristic to sample the network traffic flowing through a node. From a limited
number of samples, it is able to identify candidates for further investigation. From
a window of information defined by a history of RED drops, the router calculates a
normalized drop metric which has been shown to accurately estimate the arrival rate
for high-bandwidth flows.
The high-bandwidth flow candidates are then tested for the presence or absence
of different characteristics of adaptive network flows.
The first two tests, TCP-unfriendliness and unresponsiveness, estimate the arrival rate of particular suspicious
flows based upon the RED packet drop history, then compare this arrival rate to what
it should be for an adaptive network flow.
In the TCP-unfriendliness case, a comparison is made to a computed maximum
sending rate of any TCP that is conforming to the required TCP congestion avoidance and control algorithms. This test requires preconfiguration of an estimate of
the minimum round trip time and maximum packet size for all flows passing through
that node, and focuses specifically on detecting flows which do not behave like TCP, a
widely used adaptive network protocol. The unresponsiveness test is a bit more general. It operates over time, using the fact that adaptive flows' sending rate estimates
should decrease in response to increases in the long term drop rate. A comparison is made to the bandwidth an adaptive flow should have, given the changes in drop rate in
the previous epoch. The very-high-bandwidth test uses the observation that adaptive
network flows generally do not occupy disproportionate fractions of the bandwidth.
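For concreteness, the TCP-unfriendliness comparison can be sketched in Python using the rough bound T <= 1.22 * B / (R * sqrt(p)) on the rate of a conforming TCP, where B is the preconfigured maximum packet size, R the preconfigured minimum round trip time, and p the drop rate estimated from the RED history. The constant follows the spirit of the test in [8], but treat the exact form here as illustrative:

    from math import sqrt

    def tcp_friendly_limit(max_pkt_bytes, min_rtt_s, drop_rate):
        """Upper bound (bytes/s) on a conforming TCP's sending rate:
        T <= 1.22 * B / (R * sqrt(p)). Requires drop_rate > 0."""
        return 1.22 * max_pkt_bytes / (min_rtt_s * sqrt(drop_rate))

    def is_tcp_unfriendly(arrival_rate, max_pkt_bytes, min_rtt_s, drop_rate):
        return arrival_rate > tcp_friendly_limit(max_pkt_bytes,
                                                 min_rtt_s, drop_rate)

    # Example: with a 1500-byte packet size, a 100 ms minimum round trip
    # time, and a 1% drop rate, the limit is about 183,000 bytes/s.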
After a group of misbehaving flows has been identified, they must be regulated.
The "penalty box" portion of the mechanism partitions network flows into a group
which has failed the misbehavior tests and group which has not and schedules constrained flows in penalty. In the simulations of [8], penalized flows are placed into a
separate class-based queue, where the congestion level is designed to be higher than
the congestion level of the partition containing unpenalized network flows. And, like
the fair queuing mechanism, in times of congestion, the more a nonadaptive flow
sends, the more it will congest its own limited resources along with the limited resources of other penalized flows.
2.2.2
FRED
The FRED algorithm is another mechanism to identify and constrain nonadaptive
network flows using information gathered from the RED queue management system
[15]. It proposes one conceptual addition to the original RED mechanism in order to
deal with nonadaptive network flows. It observes that a flow's output share is proportional to the input queue share it is able to capture. Even with RED, input queue
representation is proportionate to a network flow's arrival rate. Because nonadaptive
network flows tend to consume a larger proportion of the input queues, they also consume a larger proportion of the bandwidth exiting a node. From these observations,
FRED proposes that the node should constrain any network flow which is consuming
a disproportionate share of the node's buffer space.
It accomplishes this by keeping records of all the active flows. It defines active
flows as those flows which have packets buffered at the local node. By only tracking
active flows, it limits the amount of information that it must maintain to a constant
level based on the number of buffers it has. Per-active-flow, it tracks the instantaneous number of packets queued. If a flow attempts to queue more than a maximum
threshold of packets, its strike factor is incremented. After a certain number of strikes,
the flow is marked for constraint. Under constraint, if this flow's packets cause it to
surpass the per-flow packet average, those packets will be dropped. Its estimate of
the per flow average queue length is RED's time averaged queue estimate divided by
its count of the current number of active flows. Like RED with Penalty Box, its use
of buffer constraint assumes a certain level of congestion at that point in the network.
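A Python sketch of this per-active-flow accounting; the thresholds are illustrative, and note that this sketch forgets a flow's strikes as soon as its buffers drain, a point on which the original algorithm of [15] differs, as discussed in Section 4.2.3:

    MAX_Q_PER_FLOW = 8    # per-flow buffering threshold (illustrative)
    STRIKE_LIMIT = 3      # strikes before a flow is constrained

    flows = {}            # state kept only for active flows

    def on_enqueue(flow_id, avg_per_flow_qlen):
        """Return True if the arriving packet may be queued."""
        f = flows.setdefault(flow_id, {"qlen": 0, "strikes": 0})
        if f["qlen"] >= MAX_Q_PER_FLOW:
            f["strikes"] += 1        # flow tried to exceed its share
            return False
        if f["strikes"] >= STRIKE_LIMIT and f["qlen"] >= avg_per_flow_qlen:
            return False             # constrained: held to the average
        f["qlen"] += 1
        return True

    def on_dequeue(flow_id):
        flows[flow_id]["qlen"] -= 1
        if flows[flow_id]["qlen"] == 0:
            del flows[flow_id]       # no buffered packets: not "active"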
2.3
Reserved Bandwidth
The next category of measures observes that the current network architecture is
inadequate to support network flows which require guarantees beyond the best-effort
service of the existing network. This category of measures offers a means for nonadaptive flows to request the resources they require and provides a way for the network to
control and partition its own bandwidth and resources, but still suffers from the basic
problem of nonadaptive flows which do not provision for themselves or flows which
provision for a certain amount of resources and send beyond this amount.
2.3.1
RSVP
The ReSerVation Protocol (RSVP) [23] is an IP-based approach that provides end
users with a means of requesting the network resources that they require. RSVP
provides a way to describe any flow specification that might require a reservation of
network resources and a way to describe many different styles of resources that may
be reserved within that network. It also provides a way of installing state into the
network which will be robust and dynamic enough to allow for changing conditions in
the network, as well as a means of propagating this information so that it may form
a path between a sender and a receiver.
2.3.2
Expected Capacity Framework
The Expected Capacity Framework [6] is another approach which provides end users a
means of requesting a specific level of service from the network. Instead of the costly
RSVP approach to reserving resources with absolute guarantees, this mechanism
provides a means for networks to offer known levels of resources to end users with
high assurance of availability, and it deals with end users which send excessive traffic
by aggressively dropping their packets.
It does this through the efforts of two processes, a tagger located at the edge of
the network, and a dropper, located more centrally within the network. The tagger's
function is to tag packets with a bit indicating whether that packet is "in" or "out" of
a previously agreed service profile. During times of congestion, the dropper discards
packets. It drops "out" packets more severely than "in" packets.
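As an illustration of the tagger's role, here is a minimal Python sketch which treats the agreed service profile as a simple token bucket; the framework of [6] does not prescribe this particular mechanism, and the class and parameter names are ours:

    import time

    class ProfileTagger:
        """Tag packets "in" or "out" of a contracted profile (sketch)."""

        def __init__(self, rate, depth):
            self.rate = rate        # contracted profile rate, bytes/second
            self.depth = depth      # allowed burst, bytes
            self.tokens = depth
            self.last = time.monotonic()

        def tag(self, pkt_bytes):
            now = time.monotonic()
            self.tokens = min(self.depth,
                              self.tokens + self.rate * (now - self.last))
            self.last = now
            if self.tokens >= pkt_bytes:
                self.tokens -= pkt_bytes
                return "in"    # within the agreed service profile
            return "out"       # the dropper discards these first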
Chapter 3
Design Issues
Existing mechanisms suggested in earlier work show that routers may be able to
provide a scalable means to identify misbehaving users, and to penalize these users
at the local router. Although identifying and penalizing misbehavior locally leads to
a very self-sufficient system, a certain amount of overall efficiency is lost.
Figure 3-1: Local Identification and Penalization. (Figure: a misbehaving traffic source sends through intermediate routers to an identifying and penalizing router R3; bandwidth is wasted carrying the misbehaving flow's traffic to that router, and only nodes beyond it benefit from the penalization.)
It is lost in bandwidth spent carrying the nonadaptive flow's traffic through the
network to a point of congestion, and is lost in penalizing these flows in congested
central nodes, whose processing time could be more wisely spent forwarding packets.
This thesis suggests a mechanism through which we may use such local algorithms
to test for nonadaptive network flows, but through which we may use the resources
from upstream routers to add efficiency to the overall system. Instead of waiting until
a point of congestion to penalize a flow, we use the resources of upstream neighbors
to push penalization closer to the flow's entry point into the network and constrain
that flow before it can cause congestion or unfairness. This also frees the resources
Figure 3-2: Identification + Pushed Back Penalization. (Figure: the misbehaving traffic source sends towards an identifying router; pushback messages propagate upstream to a pushback penalizing router near the source, so the nodes beyond the penalizing router benefit from the penalization.)
of heavily loaded central nodes by pushing the responsibility of penalization towards
the edges of the network.
This mechanism will be provided through a pushback protocol which will act as
a means to transfer information from a sensor to an actuator, from the identifier's
identification tests to the penalizer's penalization policy.
Several issues will be covered in this chapter concerning the design of such a
protocol.
3.1
Protocol Expressiveness
In order to be effective, the information carried by the pushback protocol must be
sufficient to effectively transfer control of penalization from the identifying node to
any potential penalizing nodes.
This information may be divided into two parts:
* Flow identifier
* Penalization information
The flow identification portion must provide enough information for unique identification at the smallest flow granularity that could reasonably be penalized. Because
most systems will likely use classification of packet headers as a means of identifying
those flows which require penalization, this should probably include at least the flow's
source address, source port, destination address, and destination port.
Penalization information includes any information that a remote node would need
to have in order to determine a suitable penalty. The penalization information ex-
pressed by the protocol must be general enough for a majority of nodes that might
accept this protocol to map into a policy that they can execute locally. Exactly what
is expressed in this penalization information depends on who controls what is placed
in these parameters.
Control of this information may lie with the initiator of the protocol (the identifier), with the receiver of the protocol (the eventual penalizer), with some mixture of these two, or with some outside source.
If this responsibility lies with the identifier, the initiator of the protocol must suggest a penalization action to the penalizer. The penalizer then requires no processing
in order to figure out what penalization to execute. This may seem attractive at first,
but with the great diversity of receivers for the protocol, expressing penalization goals
in a way that all nodes are capable of executing, or mapping to a capability that they
do have, becomes complex.
If the responsibility lies with the penalizer, the initiator of the protocol must
express information about its internal state, and the penalizer will process this information in order to form a suitable local penalty. The strength in this is that the
penalizer may have more information about its local environment than the identifier.
With the combined information about the identifier's internal state, its own capabilities, and its local environment, a penalizer node may be able to generate a more
suitable penalty than the identifier.
Any information expressed by the protocol must not rely on relative state at the
identifier's location. For instance, a scheduling penalty, which relies on local congestion
levels, may have little effect at a remote router if the congestion level at that remote
router is negligible.
3.2 Propagation

3.2.1 Path Determination
In order to propagate penalization information to a node that will be able to have
an effect on a nonadaptive network flow, we may choose to send messages upstream,
towards the source of that flow, or downstream, along with the nonadaptive traffic.
If we send the message upstream towards the flow's source, penalization may be
installed directly into one of these upstream nodes; if we send messages downstream
towards the flow's destination, a higher level mechanism must exist to penalize that
flow from downstream.
Complications exist in either direction. In the upstream direction, there is no
established way to determine the previous hop for a particular flow, and not all
network paths are bi-directional.
In the downstream direction, although routing
tables provide an easy and established means of propagating messages through the
network, some sort of higher level mechanism or policy must exist to penalize a flow
from a downstream location.
Because we do not wish to focus on building such a higher level policy to penalize
network flows using the downstream method, and because we currently cannot rely
upon the existence of such a policy, we explore in this section a number of methods
for determining and propagating information in the upstream direction for a network
flow, given a packet from that flow.
What we require is an inverse flowpath function, a one-to-one mapping between
information available in a packet and its flow's previous hop. Routing protocols
today specify ways to access next hop information for packets, given the packet's
destination. Some people discussing next generation routing protocols suggest that
these protocols should also include ways to get previous hop information. In the
absence of such protocols providing previous hop information, we must form our own
ad hoc mechanisms for finding upstream routers.
If one assumes symmetric paths, one can use the same routing tables which are
used to forward packets to propagate them backwards. Research shows [19], however, that symmetric routing is often far from the reality of the current network. And if penalization does not occur on the same path as data is forwarded, our protocol is useless.
Many link layers in the network today, like Ethernet and point-to-point links, have
packet headers which include previous hop information. If we assume that a majority
of link layers within the network provide access to previous hop information, we may
be able to extract previous hop addresses directly from the link layer information of
a flow's packet. One of this approach's assets is its low bandwidth consumption. Its
negatives include its dependence upon link layer information which may or may not be
accessible, its inflexibility, and its violation of layer abstractions. Also, in some more
complex routing/bridging instances, the previous hop information available from the
link layer may be incorrect.
On the other end of the spectrum, one can use some form of broadcast, multicast,
or constrained broadcast to find an upstream path. Using plain broadcast, one could
imagine a pushback message being propagated to all possible previous hop nodes for
a flow, but constraining this broadcast might more wisely limit the number of nodes
unnecessarily affected by the pushback protocol.
This constraint on the broadcast might require testing whether a node is receiving a certain number of packets from the suspect network flow, or might relate to packet
forwarding information. This approach, although requiring less bandwidth, incurs a
significant amount of latency in testing the suspect flow. Its assets are that it doesn't
need to know about link layer details, but its negatives are its latency and processing
cost or its bandwidth consumption.
3.2.2
Flow Granularity
Flow granularity is the size chosen to define a flow, such as a TCP connection or all
the packets from a particular host. It dictates how network traffic will be aggregated
in the different stages of our protocol: identification, propagation, and penalization.
The granularity of identification defines the size of flow that is tested in a nonadaptive
identification test, the granularity of propagation affects the number of messages that
are sent through the network, and the granularity of penalization helps to define
which packets are grouped together in a penalty.
If these granularities are too small, large amounts of resources will be necessary to
keep track of each of these small flows, or the number of limited resources allocated
for handling nonadaptive flows will not have enough detail to control these flows;
if the granularity is too large, there is a danger that adaptive flows may be co-categorized with one or more nonadaptive flows and thereby doubly penalized, once by the penalization scheme, and once more because they will suffer excess congestion by being categorized along with any nonadaptive network flows.
Identification, propagation, and penalization can all operate at many granularities in this system. One can take several approaches to forming the appropriate granularity: one can start with a small granularity and aggregate upwards, or one can start with a
large granularity and refine. One can also alter or maintain granularity at different
points in the pushback process.
One approach is to predefine a constant granularity and impose it upon the whole network. This has the advantage that it is easiest to implement, but different regions or levels of hierarchy may operate at different granularities, and the predefined granularity may be too large for some environments or too small for others.
Another approach is to control aggregation in a distributed manner, changing
granularities as needed between borders within the network.
Each region in the
network may then have the freedom to define its own granularity, and border nodes
will be responsible for translating between the different granularities of neighboring
networks. This is a very versatile system, but its complexity and latency may become
issues. Complexity may be an issue because of the difficult task of designing such a
translation, and latency may be an issue due to time spent translating granularities.
A final approach may include defining a certain granularity for identification and
propagation within the middle of the network, but using information available at the
network's edge to help guide granularity decisions for penalization. This information
comes in the form of "profiles", or sizes of aggregation that have been negotiated
with edge nodes. Since these profiles are negotiated and retained at the edges of
the network, the central portion of the network should not have to reason about
granularities.
The granularity in the central portion of the network must necessarily be small in
order to be able to support any granularity of a profile at the edge. The disadvantage
of this small granularity is that numerous messages will have to be sent propagating
information about each of the numerous small nonadaptive flows, causing higher
bandwidth consumption for the protocol.
3.3 Timeouts/Time Scales

3.3.1 Pushback Timeouts
Because we cannot assume that this mechanism will be ubiquitously deployed and
because we are not guaranteed that our pushback messages will arrive at their destination, we must use some form of acknowledgment mechanism in deciding whether
the previous hop that we have determined has taken up the responsibility to either
push penalization back farther, or to penalize the specified flow itself. The absence
of an acknowledgment along with a timeout will be our feedback. Such a pushback
timeout should be set conservatively; it must be long enough for most packets to be
queued locally, transmitted to the previous hop router, queued there, processed, and
the acknowledgment to arrive locally and be processed.
Several scenarios may have occurred if the local node does not receive acknowledgment of its pushback message.
The previous hop node may have received the pushback message but may not support the protocol. Another case is that the previous hop node does understand the pushback protocol, but the pushback message that
we sent may have gotten lost before it was received at that node. A last scenario is
that the previous hop does understand the protocol, has sent an acknowledgment to
us, but this acknowledgment has gotten lost on its way back to us.
Because the network can generally be considered best effort only, particularly in a
congested state, we must provision for these last two scenarios by retransmitting our
pushback messages. The number of times we retransmit will depend upon the network
resources we are willing to consume and the latencies we are willing to endure.
3.3.2
Penalty Timeouts
There are also several reasons to provide a timeout to penalization. One constraint on
our control system is our high level goal to encourage congestion control by penalizing
nonadaptive network flows in such a way that their throughput is worse than if they
were congestion controlled. The execution of this goal has the following effect: Once
a penalization process is running at its destination, neither the original identifier, nor in fact any node in the network downstream from the penalizer node, can sense the behavior or misbehavior of a constrained network flow.
Because our goal is to penalize nonadaptive network flows beyond the point of
being able to sense them, dynamically searching for a penalization operating point will
work poorly; searching will only allow us to approach a point where these penalized
nonadaptive network flows no longer exhibit nonadaptive behavior, but we require
an operating point more severe than this. To encourage congestion control, we must
punish severely enough that the allowed sending rate of a nonadaptive network flow
is less than that of an adaptive network flow.
Also, because identification of a nonadaptive network flow is merely a hypothesis,
the network cannot punish a network flow indefinitely. This would require unnecessary
network resources in many cases, and the network flow should be given a chance to amend its behavior and be released from its penalization.
The combined goals of penalizing well past the point where we can reason about
the behavior of a nonadaptive flow, robustness in the face of routing changes and node
failures, and fairness to penalized network flows, warrant a soft state approach with a
timeout mechanism on penalization and a refreshing of state if the nonadaptive flow
is sensed once again. We do this by installing penalties into penalization nodes, then
timing out after a particular period of time.
This "penalization timeout" should be small enough that it will be sufficiently
reactive to network changes and failures, as well as improved host behavior, but not
so small that refreshing of the pushback protocol requires a significant portion of
nodes' processing time or links' bandwidth.
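A minimal Python sketch of such a soft-state penalty table; the ten-second constant anticipates the value chosen in Chapter 4, and the function names are illustrative:

    import time

    PENALTY_TIMEOUT = 10.0   # seconds; see Section 4.2.4

    penalties = {}           # flow_id -> expiry time

    def install_or_refresh(flow_id):
        """Called when a pushback installs a penalty, or when the
        nonadaptive flow is sensed again and the state is refreshed."""
        penalties[flow_id] = time.monotonic() + PENALTY_TIMEOUT

    def is_penalized(flow_id):
        expiry = penalties.get(flow_id)
        if expiry is None:
            return False
        if time.monotonic() > expiry:
            del penalties[flow_id]   # soft state: expires unless refreshed
            return False
        return True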
3.4
Penalization
A balance between resource use and penalization latency must be found in timing
our penalization. Local penalization may start before the pushback protocol is even
initialized, or as late as after one or more pushback timeouts. If local penalization
occurs before the pushback, there is a danger that numerous nodes along the flow's
path will be simultaneously penalizing the network flow, using up several nodes' processing time. Alternatively, if pushback timeouts are lengthy enough and penalization
doesn't start until a number of these timeouts, the latency for penalization of a flow
may become quite large.
Penalization can also take on different qualities. Initiation and termination of
these penalties can occur over different scales of time. One might imagine that a
penalty could commence gradually or it might take effect immediately and absolutely.
Penalization severity may range from allowing none of the specified network flow's
packets to pass, to allowing all of its packets to pass through a penalizing node. The
node controlling penalization also controls the severity of this penalization. The most useful way of setting such a severity is to make it proportional to the severity of the flow's misbehavior, as determined by the information available.
3.5
Trust Issues
In transferring control of penalization that could be done locally, we must be concerned with issues of trust within this system.
A certain amount of trust is already inherently present inside of the network in
the form of routing information. Routers must trust each other's updates to a certain
degree, in order to carry out the distributed routing algorithms of the present network.
In this system, trust is required for efficiency of the protocol in addition to correct functionality. If nodes do not trust any of their neighbors and must verify all
information relating to propagation and penalization, the time scale over which the
protocol will operate and the processing overhead required to acquire and maintain
the necessary information may be prohibitive.
In the forward direction, an identifying node relies upon the cooperation and
correct functioning of its neighboring nodes when it releases its penalization responsibilities. If its neighbors have incorrectly notified it that the penalization responsibility
has been taken by another node, and it releases its own penalization of that flow, then
a nonadaptive flow may remain unconstrained. On the other hand, the processing
of numerous nodes may be temporarily devoted to penalizing a single flow if the
identifying node releases this penalty only when it can no longer sense that flow's
misbehavior.
In the backward direction, a node must also trust the information that a supposed
identifier has sent it. An abuse of this protocol or a malfunctioning of a node may
constitute a denial of service attack to a network flow or to nodes whose processing
resources must be used to penalize a network flow. If those nodes must test the
penalization judgments of all supposed identifiers, however, the latency and processing
power required to decide whether a flow is, indeed, nonadaptive, could cause problems.
In some parts of the network, it may even be impossible to make such a judgment.
3.6
Contiguous Propagation
A possible problem with this protocol is that propagation can only progress as far as
the farthest contiguous node which understands the pushback protocol. This could
pose a problem with deployment. A possible enhancement to this work is to find a means of skipping nodes which do not understand the protocol, to rendezvous with prospective upstream points, or to have protocol-compliant nodes advertise themselves; if an identifier can determine that such a node is on the path towards the source of a nonadaptive flow, it could send to that node directly.
Chapter 4
Design Description
4.1
High Level Operation of Integrated System
We have implemented a particular design for an integrated identification, propagation,
and penalization system within the ns network simulator environment.
At a high level, the system operates in the following manner: on particular links
in our network, we monitor network traffic with a slightly modified version of the
FRED algorithm. If the packets queued by a particular active flow cause it to fail an
adaptive test, we initiate our pushback protocol. We determine a previous hop, and
send a pushback message containing penalization information and an identifier for the
misbehaving flow, to an upstream router. If this upstream router node implements
the pushback protocol, that router will acknowledge our pushback message, then
continue to push this message upstream towards the misbehaving traffic's source.
This recursive process will terminate at the last router on this upstream path which
is knowledgeable about the pushback protocol; this router will take responsibility for
penalizing the misbehaving traffic source.
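The following Python sketch summarizes this recursive process as it runs at the identifier and again at each pushback-aware upstream node. The helper functions are hypothetical stand-ins for mechanisms described in the rest of this chapter, and the stubs given here only print what a real node would do:

    def find_previous_hop(flow):
        """Stub: the real node scans link-layer headers of the flow's
        packets for the previous hop's address (Section 4.2.4)."""
        return None

    def penalize_locally(flow):
        print("penalizing", flow, "at this node")

    def send_pushback(prev_hop, flow):
        print("pushback for", flow, "sent to", prev_hop)

    def handle_detection_or_pushback(flow):
        """Run when the local nonadaptive test fails, or when a pushback
        message arrives from downstream (after ACKing it)."""
        prev_hop = find_previous_hop(flow)
        if prev_hop is None:
            penalize_locally(flow)     # end of the known upstream path
        else:
            send_pushback(prev_hop, flow)
            # ...then await an ACK; on timeout, retransmit or penalize
            # locally (Sections 4.2.4 and 4.2.5)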
4.2 Implementation

4.2.1 Granularity
The original granularity of all of our nonadaptive identification tests was that of a
TCP connection. We will maintain this TCP granularity, although different nonadaptive identification tests may allow this to be set to other granularities.
For
propagation and penalization, we've chosen an IP source/destination address pair
granularity. This choice has the advantage that it will not be fooled by applications
which divide their network transactions into multiple TCP connections to increase
throughput. No granularity changes occur within the network because of the uniform
granularity used for propagation and penalization.
4.2.2
Pushback Packet
Our Pushback Packet, however, contains parameters to express a flow's source IP
address, source port, destination IP address, and destination port. This allows it to express flows as small as a single TCP connection. The Pushback Packet
also contains fields for expressing penalization parameters. A type field allows for
different kinds of penalization, and a variable amount of space is given for different
penalization parameters.
Currently our Pushback Packet can only support a penalization parameter set
which includes a type descriptor, a numerical measure of the identifier's current state
for a flow, a numerical measure of the identifier's target state for a flow, and a
penalization timeout.
Figure 4-1: Pushback Packet Format. The packet carries a FlowID (Source IP Address, Source Port, Destination IP Address, Destination Port) and a Penalization Description (Type, followed by the Penalization Parameter Space).
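As a sketch, the same information might be represented as follows in Python; the field names follow Figure 4-1, while the types are illustrative rather than a wire format:

    from dataclasses import dataclass, field

    @dataclass
    class PushbackPacket:
        # FlowID: identifies flows down to a single TCP connection
        src_addr: str
        src_port: int
        dst_addr: str
        dst_port: int
        # Penalization description
        penalty_type: int        # selects the kind of penalization
        current_state: float     # identifier's current measure of the flow
        target_state: float      # identifier's target measure for the flow
        penalty_timeout: float   # seconds until the penalty expires
        params: list = field(default_factory=list)  # variable parameter space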
4.2.3
Nonadaptive Identification Test
The test that we use for identification of nonadaptive flows is what we call the FRED strike test.* The tests developed in [8] can also be substituted, as well as any decisive and reliable nonadaptive tests which may be developed in the future.

* In this way, we are dependent upon the correct operation of the FRED algorithm for accurate identification of high bandwidth nonadaptive flows.
As explained in the Existing Measures chapter, the FRED test tracks the instantaneous queue lengths of all active flows. If this instantaneous queue length exceeds
a certain threshold, the packet which pushed it over this threshold will be dropped
and that flow's strike value will be incremented. After a certain number of strikes,
the flow enters a constrained state. In its constrained state, the flow is allowed to
queue no more packets than the average per active flow.
We have modified the original FRED algorithm slightly in our implementation.
This modification was made to solve a problem which does not actually occur with
any frequency. The algorithm for FRED, as specified in [15], only allows flows to rid
themselves of their misbehaving status within that node when all of their packets have
been dequeued. Originally, it was thought that this would target both high bandwidth
nonadaptive flows and long-lived adaptive flows which maintained a queue of packets
at a node, when in fact, it is quite unusual for an adaptive flow to maintain a queue
at a node for any extended period of time.
We made the following modification to the original FRED algorithm to allow flows to rid themselves of their misbehaving status more quickly: if an active flow does not misbehave for an iteration, it will be rewarded through a decrementing of its strike factor. This causes the nonadaptive constraint and pushback tests to be more lenient towards flows.
This may create results slightly different from the original FRED
algorithm, but since it is more lenient towards suspect flows, it should not invoke
excessive penalization behavior and should not invalidate the results we present in
this thesis.
The actual criterion for pushback is stricter than that for dropping a packet; it is based upon the strikes that a flow has received over a period of time. We observe that at a given congestion level, high bandwidth nonadaptive network flows tend to accumulate strikes at a relatively swift rate. TCP flows, after an initial burst of strikes from slow start overestimation, tend to accumulate strikes at a much more gradual rate while in TCP's linear congestion avoidance phase. We push back on flows which accumulate strikes several times faster than their peers.
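A Python sketch of this trigger; the multiple used for "several times faster" is an illustrative value, not the one used in our simulations:

    PUSHBACK_MULTIPLE = 4.0    # "several times faster"; illustrative

    def should_push_back(flow_strike_rate, peer_strike_rates):
        """Trigger pushback when a flow accumulates strikes much faster
        than its peers over the observation period."""
        if not peer_strike_rates:
            return False
        mean_peer = sum(peer_strike_rates) / len(peer_strike_rates)
        return flow_strike_rate > PUSHBACK_MULTIPLE * max(mean_peer, 1e-9)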
4.2.4
Pushback Sender
The Pushback Sender is the workhorse of the pushback portion of this system. It
can be triggered either by the nonadaptive identification test in the identifying node
or by the arrival of a pushback message from a downstream node. It resides in a
Pushback Node, any node that contains a Pushback Sender/Pushback Receiver pair
and a penalization mechanism.
The Pushback Sender agent has many functions. Its first is to determine a recipient
of a pushback message. We use available link layer information to find the previous
hop's identity. A node finds this by scanning incoming traffic for packets which fit a
flow's profile. When found, that packet's link layer address identifies the previous hop
for the next stage of the pushback protocol. Since any suspect flow is, by the nature
of the nonadaptive test, a high bandwidth nonadaptive flow, the latency between
initiating the pushback process and finding its previous hop should be small.
This approach assumes that there is only one previous hop for any given suspect
flow. It may be expanded to find multiple previous hops by caching all of the previous
hops found for a flow over a period of time, and sending to these.
If a previous hop cannot be determined, then the node may choose to become
the penalizer for a flow, or it may decide that it is not the correct recipient of that
flow's pushback message. This may occur because of routing changes or failures in
the network. There are several responses to this case. One is to penalize the flow
locally. Another one is to refrain from ACKing so a downstream node's pushback
times out and it becomes the penalizer. A third, which we have not explored, is to
send a negative ACK, informing the downstream node that this node is unable or
unwilling to perform the responsibilities implied by ACKing.
Next, if a previous hop has been found, a Pushback Packet is constructed with that previous hop as its destination. A flow identifier, an appropriate
penalization timeout, and some internal state information about the identifier are
inserted into the packet.
This penalization timeout was chosen to be a constant value of ten seconds, but this parameter's value has admittedly not been explored thoroughly. Further
research is necessary to find an optimum value for most network conditions, or a means
of finding such an optimum value dependent on measures of network conditions at
the time of identification or penalization.
The packet is then forwarded to the previous hop node. Meanwhile, a timer is set.
This pushback timer is sufficiently long that the packet should have plenty of time to
be transmitted, received, and acknowledged.
Like the penalization timeout, this pushback timeout was set to a constant value.
Again, this approach was taken for simplicity. One can certainly imagine a pushback
timeout set by an estimate of a flow's arrival rate or some other measure. We also
leave this as an area for future research.
There is a decent chance that a flow, once it fails the nonadaptive test, will fail it
once again sometime before either we receive an acknowledgment from a neighboring
pushback-aware node or our pushback timeout expires. We do not wish to flood the
network with repetitive information about a particular misbehaving network flow.
For this reason, although a flow may fail the nonadaptive test repeatedly over a short span of time locally, we disable sending further pushbacks for this flow until the pushback timeout.
After sending a Pushback Packet, the Pushback Sender must wait until the first
of two events happen:
* Pushback Acknowledgment
* Pushback Timeout
If the first event that occurs is the Pushback Acknowledgment, the local node
cancels the pending pushback timeout and any penalties which have been initiated
against that flow. If it trusts the acknowledgment, it may assume that its penalizing
responsibilities are over.
If the timeout is the first event that occurs, we may choose to retransmit the pushback message or give up trying to push back. Because we will be operating in a best effort environment, we choose to retransmit several times before giving up.
Each node in the pushback path which understands the pushback protocol will carry out a similar pushback process. Pushback should terminate before reaching the actual source of the traffic.
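The sender's wait-for-ACK logic might be sketched as follows; the retransmission bound is illustrative, the hook functions are hypothetical, and the real agent is event-driven rather than a blocking loop:

    MAX_RETRANSMITS = 3    # illustrative bound on retransmissions

    def push_back(flow, send_once, await_ack, penalize_locally, release_penalty):
        """Send a pushback and wait for an ACK, retransmitting on timeout.

        send_once(flow) transmits one Pushback Packet; await_ack(flow)
        returns True only if an ACK arrives before the pushback timer
        expires.
        """
        for attempt in range(1 + MAX_RETRANSMITS):
            send_once(flow)
            if await_ack(flow):
                release_penalty(flow)   # responsibility now lies upstream
                return True
            if attempt == 0:
                penalize_locally(flow)  # begin local penalization after one
                                        # failed attempt (Section 4.2.5)
        return False                    # give up; remain the penalizer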
4.2.5
Penalization
One node will be the final node in the pushback path. That node must initiate
penalization. In our system, the penalizer controls the penalization severity; it computes this based on information which the identifier has sent it and some long term
information which it has stored about suspect nonadaptive network flows.
In our detection/penalization method, the penalty severity is computed from a
ratio of the time average per-active-flow queue length maintained by FRED and the
number of packets the penalized flow has in the queue, along with the number of
offenses which a given flow has committed within the time span of our long term
memory. For every offense that a particular flow has committed, the severity of the
penalization is doubled. The timeout for ending the penalization is also doubled, up
to a certain point. This limitation is set in order to remain responsive to network
changes and failures. In this way, a repetitively sensed nonadaptive flow will require
progressively less network bandwidth and less overhead in terms of the pushback
protocol pushbacks.
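A Python sketch of this computation; the base severity formula and the caps are an illustrative reading of the description above, not the exact code of our simulator:

    BASE_TIMEOUT = 10.0    # seconds; the constant chosen in Section 4.2.4
    MAX_TIMEOUT = 80.0     # cap, to stay responsive to network changes

    def penalty_for(flow_qlen, avg_per_flow_qlen, prior_offenses):
        """Return (drop probability, penalty timeout) for a penalized flow.

        Base severity grows with the flow's queue occupancy relative to
        FRED's per-active-flow average; severity and timeout double for
        each offense recorded in the long term memory, up to the caps.
        """
        base_p = max(0.0, 1.0 - avg_per_flow_qlen / max(flow_qlen, 1))
        drop_p = min(1.0, base_p * (2 ** prior_offenses))
        timeout = min(MAX_TIMEOUT, BASE_TIMEOUT * (2 ** prior_offenses))
        return drop_p, timeout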
We have chosen the following style of penalization: since we will be dealing with flows which are nonadaptive in the face of congestion, packet drops will not affect these flows in the same severe manner that they would affect an adaptive network flow. For this reason, we have chosen a moderately aggressive probabilistic packet drop approach for penalization; assuming a constant sending rate, dropping a certain percentage of a flow's packets will reduce its bandwidth consumption by that percentage.
In the design section, we mentioned that initiation and termination of penalization
can occur at different time scales. Because we base our penalization severity on the
arrival rate of a flow at the time of the pushback, the first estimation of a nonadaptive
flow's arrival rate should be accurate.
Also, because we send periodic pushback
messages while sensing a nonadaptive flow, we do not install our penalties gradually;
this might cause multiple pushbacks for the same flow. If we were to terminate our
penalties gradually, the first sensing of the nonadaptive flow may be an inaccurate
measurement of its unpenalized behavior. For this reason, we choose to install and
terminate our penalties to an absolute degree, instead of gradually.
In timing our penalization, if one attempt to push back upstream fails, we begin local penalization.
This is a middle ground between penalizing immediately and
penalizing after the maximum number of timeouts/failures. It attempts to minimize the number of nodes simultaneously penalizing a flow in the network, without incurring the excessive latency of waiting for several pushback timeouts.
The penalty itself is executed by a fragment of code included in the forwarding
path of the penalizing node. A packet matcher will identify packets which belong to
constrained flows, and these packets will be subjected to a probabilistic drop.
The penalizing node is also responsible for updating a long-term data structure
which records the behavior of suspect misbehaving flows. The primary use of this
information is in reducing the system resources spent on continually misbehaving
network flows. Although information within this data structure times out after a
period of time, this timeout is on a time scale much longer than the pushback or
penalty timeouts in this system. Ideally, if a nonadaptive network flow continues
to misbehave, this offense information should be maintained until the flow's current
penalization has expired and updated information about the nonadaptive flow has
been propagated to the penalizing node. We choose this lengthy timeout to be several
multiples of a flow's penalty timeout.
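Such a record might look as follows (a sketch; the field names and the expiry
multiple of four are assumptions):

    #include <map>
    #include <tuple>

    using FlowKey = std::tuple<unsigned, unsigned, int, int>;  // as above

    struct OffenseRecord {
        int    offenses;  // offenses within the long term memory window
        double expiry;    // several multiples of the flow's penalty timeout
    };

    static std::map<FlowKey, OffenseRecord> long_term_memory;

    // On each new identification of a flow, bump its offense count and push
    // the record's expiry past the current penalization cycle.
    void record_offense(const FlowKey& key, double now, double penalty_timeout)
    {
        OffenseRecord& r = long_term_memory[key];  // value-initialized on first use
        ++r.offenses;
        r.expiry = now + 4.0 * penalty_timeout;    // "several multiples" (4 assumed)
    }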
This long term information is stored only in the penalizing router. Ideally, this
offense information would record the number of offenses a nonadaptive network
flow has incurred throughout the entire network, but dynamic conditions in the
network may make this impossible. Such information could be stored in the
identifying node, in all nodes on the upstream path towards the penalizer, or in
the penalizing node. If it resides in the identifying node, changing congestion
conditions in the network may change which node identifies the nonadaptive flow.
If it is located in the penalizing node, routing changes may cause the penalizer to
change. If it is located in all nodes along the current path of the flow, several
nodes must be updated, and the offense information may still be globally inaccurate
because of routing changes. We choose to keep this long term information at the
penalizer, because doing so minimizes the overall memory that must be updated,
and the penalizer is more stable than the identifier in the presence of changing
congestion conditions.
4.2.6
Pushback Receiver
Pushback Nodes also contain a Pushback Receiver. We use explicit acknowledgment
of pushback messages. The Receiver's most fundamental purpose is to create and
send these acknowledgment packets with the appropriate information, and to contact
the Pushback Sender at the same node to initiate a further pushback.
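A sketch of the Receiver's control flow, with hypothetical message fields and
stubbed helpers standing in for the real protocol machinery:

    #include <cstdio>
    #include <tuple>

    using FlowKey = std::tuple<unsigned, unsigned, int, int>;

    struct PushbackMsg {
        FlowKey flow;      // the suspect nonadaptive flow
        double  severity;  // requested penalty severity
        int     seqno;     // lets the acknowledgment name this message
    };

    // Stubs standing in for the real protocol machinery.
    void send_ack(int seqno)                     { std::printf("ack %d\n", seqno); }
    bool have_upstream_hop(const FlowKey&)       { return true; }
    void sender_push_back(const PushbackMsg&)    { /* hand off to Pushback Sender */ }
    void install_penalty(const FlowKey&, double) { /* become the penalizer */ }

    // Acknowledge first, then either continue the pushback via the local
    // Pushback Sender or, at the final node, take on penalization locally.
    void on_pushback(const PushbackMsg& msg)
    {
        send_ack(msg.seqno);
        if (have_upstream_hop(msg.flow))
            sender_push_back(msg);
        else
            install_penalty(msg.flow, msg.severity);
    }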
4.2.7
Trust
As we have for other components of this system, we choose a moderate stance on
trust. We divide the network into two logical parts: that which lies within the same
administrative domain (AD), and that which lies outside. Within the borders of an
AD, we assume that nodes generally trust each other's judgments on misbehavior.
Therefore, within a region of trust, for efficiency purposes, we propagate the judgment
of misbehaving network sources with little verification. Between borders, however,
we may choose to verify these opinions by using local means to observe and test
the behavior of suspect flows over time, before propagating them through one's own
network and incurring bandwidth and processing cost.
This provides a means of
propagating quickly and with little processing within an administrative domain, while
protecting one's internal network from the opinions and malfunctions of the outside
world.
In the backward direction, border gateways may check that claims of misbehaving
flows are real by initiating their own tests of a particular flow. If they cannot sense
the misbehavior, or are unwilling to run such a test, they may choose not to
acknowledge the peer border router which sent them the pushback message, to send
a negative acknowledgment to a downstream gateway, or to continue the pushback's
propagation anyway. In the other direction, if a border gateway senses that the
upstream path has not sufficiently penalized the misbehaving network flow, despite
the acknowledgments it receives, it may assume this responsibility itself.
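One possible encoding of these border-gateway choices (the policy shown is only
illustrative; the design deliberately leaves the choice to local policy):

    // Responses available to a border gateway receiving a cross-domain pushback.
    enum class Response {
        Withhold,   // do not acknowledge the peer border router
        Nack,       // send a negative acknowledgment downstream
        Propagate   // accept the claim and continue the pushback inward
    };

    Response border_policy(bool can_run_local_test, bool test_confirms_claim)
    {
        if (!can_run_local_test)
            return Response::Withhold;  // unwilling or unable to verify
        return test_confirms_claim ? Response::Propagate : Response::Nack;
    }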
Chapter 5
Simulations/Evaluation
We have run numerous simulations on our integrated pushback system to help guide
our design decisions, to gather evidence that our system operates in an expected
manner, and to get an idea of what this system is capable of and what its limitations
are in various network conditions.
5.1
Simple Pushback
In this section, we attempt to show that our system is able to accurately identify high-bandwidth misbehaving flows. We show evidence that it produces network behavior
at least as efficient as a system which locally identifies and penalizes nonadaptive
network flows, and more efficient network behavior in the common case. We also
show that the protocol we have developed is able to operate correctly in the presence
of routers which do not run a nonadaptive test, as well as routers which do not
understand this protocol. Most importantly, we show that our system is effective
in pushing congestion towards a flow's source, and that it helps to constrain the
effects of existing nonadaptive network flows, as well as penalizing severely enough
to encourage congestion control.
The network we use in this section is shown in Figure 5-1. In the upper network
is a TCP flow which is running the new Reno TCP algorithm (TCPnewreno) and
a constant bit rate (CBR) network flow which is sending at 800Kbps. In the lower
network are two TCPnewreno flows which share a 1Mbps bandwidth link. The upper
and lower networks converge on what may quickly become a highly congested 1Mbps
link.

Figure 5-1: Simple Pushback Network Topology
Figure 5-2 shows simulations of the network traffic on links without any sort
of regulating mechanisms for nonadaptive network flows. As one might expect in
this environment, a TCP which shares any congested links with the high bandwidth
constant bit rate source can manage to gain only the remaining bandwidth that the
CBR does not occupy; in our network, this is very little. The TCP which shares the
upper network with the CBR flow manages to gain 200Kbps, while the TCPs from
the lower network are able to get almost none of their fair share of the bandwidth.
Figure 5-3 shows network traffic with the FRED algorithm and nonadaptive network test running on the congested link between routers R3 and R4, along with a
pushback-aware node R3. Note that no pushback is occurring in this case, only local
penalization. Even with local penalization, we can already see some benefit to the
adaptive flows. With local penalization, the two TCPs on the lower network benefit
greatly, and are able to take advantage of the bandwidth that the penalized CBR is
no longer consuming. The TCP which shares another congested link with the CBR,
however, is still only able to achieve the remaining 200Kbps that the CBR is not
consuming.
Figure 5-4 shows simulations with the FRED algorithm and nonadaptive network
test running on the congested link between R3 and R4, and two pushback-aware
nodes, at R3 and R1. In this network, penalization is pushed back once. At this
point, the CBR allows the TCP of the upper network to find an operating point
which is nearly an equal share with the other two TCPs on the congested link. With
the CBR interfering minimally with any adaptive flows, each of the TCPs is able to
gain approximately one-third of the bandwidth on the congested link.
One may note that, with the FRED mechanism running on the congested link,
even when the CBR's penalization times out, it does not regain the full 800Kbps that
it might, because the FRED mechanism constrains the queue space that it can acquire.
Figures 5-3 and 5-4 show the soft state penalization and timeout cycle through
the bandwidth measurements of the CBR flow. Initially, the CBR flow is sensed as
high bandwidth nonadaptive by the FRED test on the congested link. This
information is pushed back to a penalization node, and the flow is penalized for a
period at a certain severity. After a period of time, this penalization times out, and
the oscillation begins again, with increasing penalization severity and timeouts for
repetitively sensed nonadaptive flows.
Figure 5-2: No nonadaptive constraint, on links L1 and L2

Figure 5-3: FRED nonadaptive constraint + local penalization, on links L1 and L2

Figure 5-4: FRED nonadaptive constraint + 1 pushback, on links L1 and L2
5.2
Robustness in Adverse Environments
In this section, we show that our pushback system can handle a considerable amount
of network failure in the form of lossy links, as well as adapt to network topology
changes due to routing dynamics over a period of time.
5.2.1
Lossy Environments
Figure 5-5: Lossy Network Topology
Our lossy network is shown in Figure 5-5. On the lower network is a CBR network
flow attached to a chain of links, ending in a congested link where this CBR flow and
two TCP flows merge. These two TCP flows come from the upper network. Running
at the merging link is FRED with a pushback test process. All the nodes on the
lower network are pushback-aware. The links in the CBR's upstream direction have
been given increasing amounts of lossiness, in the form of a uniform-probability
packet drop.
Figures 5-6 through 5-8 show the effects of increasing packet loss on the ability of
the pushback protocol to operate. The effects of this lossiness become clearly evident
only once each lossy link drops more than 20% of the pushback messages; before
that point, retransmissions are able to deal with packet loss sufficiently. Our decision
to limit our protocol to three retransmissions, however, causes some of the pushbacks
to be lost at higher packet loss rates.
We can also see the effects of our decision to penalize after one pushback timeout.
Following the sensing of a nonadaptive network flow, the penalized flow's bandwidth
dips slightly before settling into an extended penalization level; the dip marks the
point where multiple nodes are penalizing that flow simultaneously.
Figure 5-6: Loss rates 0-20%, measured on links L1, L2, and L3
Figure 5-7: Loss rates 30-50%, measured on links L1, L2, and L3
Figure 5-8: Loss rates 60-80%, measured on links L1, L2, and L3
5.2.2
Routing Dynamics
Figure 5-9: Routing Dynamics Topology
Figure 5-9 shows the network we used to test our protocol's responsiveness to routing
changes. A questionable link in our network, Lq, toggles between an up and a down
state, with "uptimes" and "downtimes" exponentially distributed about a mean value.
This causes the constant bit rate traffic to be sent on the upper network when that
link is down, and on the shorter-path lower network when the link is operational.
FRED is running on the first link after the upper, lower, and TCP links converge.
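The questionable link's behavior can be modeled as an on/off process with
exponentially distributed durations. The standalone C++ sketch below (not our
ns script) illustrates the toggling for the 20s/20s scenario of Figure 5-12:

    #include <cstdio>
    #include <random>

    int main()
    {
        std::mt19937 rng(1);
        // Exponentially distributed up/down durations about their means.
        std::exponential_distribution<double> up_time(1.0 / 20.0);    // mean 20s
        std::exponential_distribution<double> down_time(1.0 / 20.0);  // mean 20s

        double t = 0.0;
        bool up = true;
        while (t < 200.0) {  // 200 simulated seconds
            double d = up ? up_time(rng) : down_time(rng);
            // When Lq is up, the CBR takes the shorter lower path; when it
            // is down, the traffic reroutes onto the upper network.
            std::printf("t=%7.2fs  Lq %-4s for %6.2fs\n",
                        t, up ? "up" : "down", d);
            t += d;
            up = !up;
        }
        return 0;
    }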
We experiment with the questionable link up and down times in order to understand our system more thoroughly in the presence of different rates of routing
dynamics.
As a control case, we push back to the upper and lower networks' common node,
R5, where the CBR's path does not affect its penalization. We use this control
for comparison with the next, more interesting set of measurements, where we push
back further, to routers R3 and R6.
We take these measurements on links L1, L2, and L3. The measurements on links
L1 and L2 tell us that our protocol is able to correctly find the current path of the
nonadaptive traffic and push back in that direction: whichever path the CBR flow
is taking when our nonadaptive test senses it is the path that receives the pushback
and the penalization responsibility.
The measurements on link L3 tell us what net effect pushing back has for the
two competing TCPs which share a congested link with this constant bit rate traffic.
If the protocol were instantaneously reactive to any routing changes, the
measurements for the multiple pushbacks and for our control case would be the same.
Because we have chosen a timeout-and-refresh soft state approach, which has lower
overhead, we cannot guarantee instantaneous reactions, but our approach seems to
operate quite well in all cases but one.
This case, as may be expected, is the one where the CBR traffic alternates routes
at a rate much faster than our timeout/refresh interval; it may be likened to route
flutter in a real network. Although our system is not as effective in penalizing the
CBR traffic at so dynamic a rate, penalizing half of the paths that the nonadaptive
traffic travels still maintains a significant level of constraint on that traffic.
5.3
Cost
Certainly, a system such as ours is not without operating costs. In this section, we
evaluate a few dimensions of the cost that our system incurs.
5.3.1
Overhead/Processing
In our goals, we proposed that our system should require minimal network and processing resources. In this section, we begin to reason about the resources that our
proposed pushback system would require.
Each of the lightweight identification mechanisms covered in this paper requires
a constant amount of memory to identify a nonadaptive flow. By using such an
identification mechanism, we inherit this memory overhead. In penalization, our
system requires memory proportional to the number of identified nonadaptive flows,
as does the local identification and penalization system. In addition, however, our
system keeps track of longer term information about misbehaving flows.
The major source of overhead in our system, however, is exacted from the
intermediate points which participate in the pushback process. Each intermediate node
must provide processing power to acknowledge pushbacks, determine previous hops,
form packets, and forward them to the appropriate location. Intermediate links must
provide bandwidth for forwarding pushback messages and acknowledgments. All of
these resources are proportional to the number of nonadaptive flows identified in the
network.
Throughout our design, we have attempted to minimize the amount of both
bandwidth and processing time that our protocol requires. In our simulations,
repetitively sensed nonadaptive flows are given progressively smaller amounts of network
resources. We see the effects of this design when we compare a scenario in which
N different nonadaptive flows are identified over a short period of time against one
in which a single nonadaptive flow is identified N times. In the first case, the amount
of long term information installed at the penalization nodes is linearly proportional
to N; in the second, it should be constant. The intermediate nodes of the first case
also pay a much higher price per time period, because a repetitively identified
nonadaptive flow requires exponentially decreasing resources per unit time, up to the
point where the penalization timeout reaches its maximum.
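Concretely, under the doubling behavior of Section 4.2.5 (a sketch; the symbols
are ours): if $T_0$ is the initial penalty timeout and $T_{\max}$ the cap, the
$k$-th penalization of the same flow lasts

    T_k = \min\left(2^{k} T_0,\; T_{\max}\right),

so the rate of pushback messages that flow induces falls off roughly as $1/T_k$,
halving with each repetition until the cap is reached, whereas $N$ distinct flows
each begin a fresh cycle at rate $1/T_0$ and each install their own long term record.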
5.3.2
Effective Time Scale of Protocol
A portion of the cost of this protocol is the opportunity cost of being unable to
penalize shorter term nonadaptive network flows. In this section, we address this
concern briefly.
The latencies of processing, queuing, and transmission of the pushback message
may define a control difference between a pushed-back penalization system and a
system which identifies and penalizes locally. They may also define and constrain the
time scale over which we can be effective against nonadaptive flows.

The original local penalization mechanism can sense and constrain flows on time
scales longer than the combined time to identify a nonadaptive flow and initiate a
penalization mechanism. Because we have chosen to begin our penalization after one
pushback timeout, the system that we have designed can affect a smaller set of flows:
those which last longer than our integrated identification, propagation, and
penalization setup.
The difference in time-to-penalization between a system which penalizes locally
and our system lies in the number of hops our protocol is able to push back, the
network conditions in which we operate, and the timeouts that we choose for our
protocol. The more hops our protocol propagates a pushback message, the greater
the latency between identification and penalization.
In the most ideal case, each hop that our protocol propagates adds only the time
to process, acknowledge, and propagate to our time overhead, along with the
pushback timeout used to decide whether penalization responsibility must be taken at
the local node. In the worst case, retransmission timeouts for lost pushback packets
and acknowledgments must be added as well.
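As a rough sketch (the symbols are ours, not from the design): with $H$ pushback
hops, per-hop processing and propagation time $t_i$, the local pushback timeout
$T_{pb}$ after which penalization begins, and at most three retransmissions per
hop at interval $T_{rto}$, the identification-to-penalization latency $L$ falls
roughly in the range

    \sum_{i=1}^{H} t_i + T_{pb} \;\leq\; L \;\leq\; \sum_{i=1}^{H} \left( t_i + 3\,T_{rto} \right) + T_{pb}.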
The actual value of the pushback timeout governs the latencies one can expect
between identification and penalization, and the actual opportunity cost will depend
on the time scales on which nonadaptive flows operate in our network. In our system,
under ideal conditions, our nonadaptive test operates on the order of seconds, and
our propagation times and timeout values are small in comparison, so pushing back
adds only a small fraction of latency to the entire system.
Figure 5-10: Short term route oscillations: uptime mean=1s, downtime mean=0.1s
Figure 5-11: Medium term route oscillations: uptime mean=10s, downtime mean=1s
Figure 5-12: Longer term route oscillations: uptime mean=20s, downtime mean=20s
Figure 5-13: Longer term route oscillations: uptime mean=50s, downtime mean=5s
Chapter 6
Conclusion
6.1
Evaluation
The expansion and commercialization of the network, new protocol implementations
and services, and the intense pressure for increased network performance have led to
an increase in the amount of network traffic which is non-congestion-controlled. This
trend will certainly not reverse in the future. With this non-congestion-controlled
network traffic come the dangers of congestion collapse and severe unfairness to
adaptive network flows.
In the past, several attempts have been made to enforce compliance to network
feedback by isolating and regulating individual flows, but only at great cost. More
recently, several designs have emerged which allow less costly means to identify high
bandwidth traffic flows which could be considered nonadaptive and to penalize these
traffic flows locally. This thesis has attempted to take this work one step further.
It proposes a protocol to push information about high bandwidth nonadaptive
network flows back towards the source of this traffic, reducing its adverse effects
on traffic competing for the same network resources and reducing the chance for
congestion collapse.
In this thesis, we have attempted to investigate the integrated identification, propagation, and penalization components of this pushback system. We have explored the
design space for such a system, simulated this system to help reason about the param-
eters of its design, and through these simulations, shown evidence of this protocol's
effectiveness and appropriateness in reaching our stated goals.
We have shown evidence of the protocol's ability to use neighboring nodes' resources
to reduce the adverse effects of high bandwidth nonadaptive network flows on
adaptive traffic in several network topologies and network conditions.
We have also shown the protocol's ability to operate at least as effectively as a
local identification and penalization mechanism, even in the presence of high packet
loss. We have further shown that although our system is not as responsive as one
which reacts immediately to routing changes, our protocol is quite effective in
networks whose routing dynamics change at a rate slower than our timeout/refresh
intervals, and moderately effective in networks whose routing dynamics change faster
than those intervals.
6.2
Remaining Issues/Future Work
We have by no means fully explored the design space that we originally mapped out.
Several issues remain for further exploration:
* Setting penalization policy at different locations and in the presence of different
network conditions.
* Path determination, particularly the problem of propagation to upstream or
entry nodes for a network flow.
* The use of different granularities for identification, propagation, and penalization.
* The possibilities for deployment which do not rely upon contiguous propagation
of messages.
* Generalizing to the case where high bandwidth nonadaptive flows are not unusual, but numerous in the network.
* Finding optimal values, or approaches for finding the values, of the various parameters of the pushback protocol.
* Investigating the following effect: releasing a nonadaptive flow from penalization
may cause adaptive flows to greatly reduce their sending rates, in turn reducing
the congestion level at a node. Since our identification methods all depend
on a certain level of congestion in order to operate, a nonadaptive flow may
go unsensed and/or unpenalized for long periods of time, until a sufficient
congestion level is again reached.
Perhaps the most important future work, however, is to implement the protocol
that we propose, or a mechanism which is able to accomplish the same goals, into a
real network, and monitor its effectiveness in the presence of actual high bandwidth
nonadaptive network flows, failures, and other adversities.
Bibliography
[1] Bob Braden et al. Requirements for Internet hosts - application and support.
RFC-1123, IETF, 1989.
[2] Bob Braden et al. Requirements for Internet hosts - communication layers.
RFC-1122, IETF, 1989.
[3] Bob Braden et al. Recommendations on queue management and congestion
avoidance. Internet Draft, September 1997.
[4] D. Clark, Karen R. Sollins, and John T. Wroclawski. Robust, multi-scalable
networks. DARPA Proposal, MIT Laboratory for Computer Science, 1997.
[5] D. D. Clark. The design philosophy of the DARPA Internet protocols. In SIGCOMM '88 Symposium on Communications Architectures and Protocols, Computer Communication Review, volume 18, pages 106-114, Stanford, CA, USA,
August 1988.
[6] David D. Clark and Wenjia Fang. Explicit allocation of best effort packet delivery
service. Unpublished manuscript, 1997.
[7] Alan Demers, Srinivasan Keshav, and Scott Shenker. Analysis and simulation
of a fair queuing algorithm. In SIGCOMM Symposium on Communications
Architectures and Protocols, ACM, pages 1-12, Austin, Texas, September 1989.
[8] S. Floyd and K. Fall. Router mechanisms to support end-to-end congestion control. Technical report, Information and Computing Sciences Division, Lawrence
Berkeley National Laboratory, February 1997.
[9] S. Floyd and Van Jacobson. Random early detection gateways for congestion
avoidance. IEEE/ACM Transactions on Networking, 1:397-413, August 1993.
[10] S. Floyd and Van Jacobson. Link-sharing and resource management models for
packet networks. IEEE/ACM Transactions on Networking, 3(4):365-386, August
1995.
[11] Van Jacobson. Congestion avoidance and control. ACM Computer Communication Review, 18:314-329, August 1988.
[12] Raj Jain. Congestion control in computer networks: Issues and trends. IEEE
Network, 4:24-30, May 1990.
[13] Yannis A. Korilis and Aurel A. Lazar. Why is flow control hard: Optimality,
fairness, partial and delayed information. Technical report, Center for Telecommunications Research, Columbia University, New York, NY, 1992.
[14] H. T. Kung and R. Morris. Credit-based flow control for ATM networks. IEEE
Network Magazine, 9(2):40-48, March 1995.
[15] D. Lin and R. Morris. Dynamics of random early detection. In SIGCOMM '97
Symposium on Communications Architectures and Protocols, Computer Communication Review, Palais des Festivals, Cannes, France, September 1997.
[16] S. McCanne and S. Floyd. ns (network simulator). URL http://www-nrg.ee.lbl.gov/ns, 1995.
[17] P. E. McKenney. Stochastic fairness queuing. In Proceedings of the Conference
on Computer Communications (IEEE Infocom), San Francisco, California, June
1990.
[18] John Nagle. Congestion control in IP/TCP internetworks. RFC-896, Ford
Aerospace and Communications Corporation, 1984.
[19] V. Paxson. End-to-end routing behavior in the internet. IEEE/ACM Transactions on Networking, 5(5):601-615, October 1997.
[20] Larry L. Peterson and Bruce S. Davie. Computer Networks: A Systems Approach.
Morgan Kaufmann, San Francisco, CA, 1996.
[21] J. H. Saltzer, D. P. Reed, and D. D. Clark. End-to-end arguments in system
design. ACM Transactions on Computer Systems, 2:277-288, November 1984.
[22] S. Shenker. Making greed work in networks: A game-theoretic analysis of switch
service disciplines. In SIGCOMM Symposium on Communications Architectures
and Protocols, pages 47-57, London, UK, September 1994.
[23] Lixia Zhang et al. RSVP: A new resource reservation protocol. IEEE Network,
7:8-18, September 1993.