Usage Profiles: Allocation of Network Capacity to Internet Users

by

Pierre Arthur Elysee

Bachelor in Computer Systems Engineering
University of Massachusetts at Amherst, 1997

Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Master of Science in Electrical Engineering and Computer Science

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

January 2001

© 2001 Massachusetts Institute of Technology. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, January 2001

Certified by: David D. Clark, Senior Research Scientist, Thesis Supervisor

Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Students
Usage Profiles: Allocation of Network Capacity to Internet Users
by
Pierre Arthur Elysee
Submitted to the
Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Master of Science in Electrical Engineering and Computer Science.
Abstract
In the Internet of today, only very crude controls are placed on the amount of network capacity that any one user can consume. All users are expected to slow down when they encounter congestion, but there is little verification that they actually do, and there are no controls that permit a relative allocation of capacity to one user over another. The research in this thesis describes a method to impose a usage limit, or "usage profile", on the behavior of individual users. In particular, this thesis explores the design of usage profiles that allow bursty traffic patterns, as opposed to continuous rate limits, and describes an effective usage profile algorithm for web traffic, which has a very bursty character. The approach taken here studies the characteristics of web traffic and introduces the fundamental concepts needed to establish the necessary framework. Through simulations, it analyzes an existing usage profile, the leaky-bucket scheme, for different token rates and different data sets, and points out its limitations in the context of web traffic. It then proposes a new usage profile, the Average Rate Control Usage Profile (ARCUP) algorithm, that better regulates web traffic. Several variants of this algorithm are presented, and the characteristics of a good profile are discussed in order to facilitate the choice of a specific variant. The selected variant of the ARCUP algorithm is simulated for different target rates and different data sets. The results show that the algorithm works for any data set that is heavy-tail distributed, and for different target rates representing different usage profiles. This thesis concludes with a summary of findings and suggests possible applications.
Thesis Supervisor: David D. Clark
Title: Senior Research Scientist
Acknowledgements
I'd like to express my sincere gratitude to Professor Dave Clark for his insights, motivation, and encouragement throughout this thesis. I would like to extend my gratitude as well to Professor Al Drake, who guided me through my dark days here. I would also like to thank my friends who have been instrumental to my success at MIT, in particular Amit Sinha and Eric Brittain, and all my professors and advisors who have contributed to my education and success. Finally, my thanks go to those who are dearest to me: my mother, Odilia Nazaire, and my father, Wilner Elysee, for their everlasting love and support.
Table of Contents

1  Introduction
   1.1  Introduction
   1.2  Web overview
   1.3  Characteristics of web traffic
   1.4  Genesis of data used in simulations
   1.5  Methodology

2  Theoretical background
   2.1  Definition of self-similarity
   2.2  Definition of heavy-tailed
   2.3  Definition of ON/OFF sources
   2.4  Examining ON-times or file transfer sizes
   2.5  Examining OFF-times
   2.6  Fractional Brownian Motion
   2.7  Feedback Control System

3  Leaky bucket algorithm
   3.1  Definition of a leaky bucket
   3.2  Analyzing the leaky bucket algorithm for different token rates
   3.3  Token rate equal to 10,000 bits per second
   3.4  Token rate equal to 20,000 bits per second
   3.5  Token rate equal to 25,000 bits per second
   3.6  Token rate equal to 8,000 bits per second
   3.7  Performance of the leaky bucket algorithm on different data sets
   3.8  Summary

4  Average Rate Control Usage Profile Algorithm
   4.1  Introduction
   4.2  Uncontrolled average rate
   4.3  Maximum data control
   4.4  Decrease and increase the peak rate by a fixed factor
   4.5  Varying the peak rate and maximum data control
   4.6  Performance of the ARCUP algorithm for different target rates
   4.7  Effect of maximum data size on obtained average rate
   4.8  Effect of maximum data on transmission duration
   4.9  Algorithm performance for different data sets
   4.10 Running the algorithm at the peak, target, and average rates

5  Conclusion, Applications, and Future Work
   5.1  Conclusion
   5.2  Application
   5.3  Future work
   5.4  References

A  Source codes and Data sample
   A.1  Generating OFF times
   A.2  This module represents the leaky bucket scheme's source codes
   A.3  This module represents the uncontrolled ARCUP algorithm
   A.4  This module represents the uncontrolled ARCUP algorithm
   A.5  Sample of data used in simulations
Chapter 1
Introduction
The goal of this chapter is to introduce our work and to establish the context for what follows. It contains the following information: an introduction, a brief overview of the Web, a description of the characteristics of web traffic, an analysis of the data used in simulations, and a summary of our methodology.
1.1 Introduction
The Internet today uses a service model called "best effort". In this service model, the network allocates bandwidth amongst all the instantaneous users as best it can, and attempts to serve all of them without making any explicit commitment as to the quality of service (QoS) offered [9]. For years, there have been heated debates in the Internet community regarding the types of services that should be provided in the future. Some researchers argue that the existing service has been working fairly well so far; others argue that users who are willing to pay more money in order to have better service should be given the option. Previous work in [9] has explored the issue of extending the Internet by adding features that permit allocating different service levels to different users. A number of schemes have been proposed to accomplish this goal: fair allocation service, priority scheduling, expected capacity allocation, and guaranteed minimum capacity. Closely related to the work presented in this paper is the guaranteed minimum capacity scheme. This service provides a guaranteed worst-case rate along any path from source to destination (for more details, see [9]). A drawback of this scheme is that it assumes that the traffic offered by the user is a steady flow, which is not the case for Internet traffic.
The majority of Web users are not interested in a capacity profile that allows them to go at a continuous steady rate. While cruising the Web, for instance, the normal usage pattern is short on-periods separated by long off-periods. The ideal profile from the user's perspective would allow bursts to occur at high speed. It should reward a user that is not sending continuously at the high rate, and constrain a user that is. Appropriate profiles are needed to match bursty usage patterns. The problem of defining such a profile is made harder by the fact that the sizes of web transfers are "heavy-tailed", which means that while almost all Web transfers are very small, most of the bytes transferred are in a few very large transfers. In a previous study [8], a leaky-bucket scheme was proposed to regulate web traffic. This paper confirms that the leaky-bucket scheme penalizes web traffic by introducing excessive delays, and then proposes a scheme that better regulates this type of traffic.
The rest of this paper is organized as follows: the remainder of Chapter 1 emphasizes the need for usage profiles, defines the characteristics of web traffic, and analyzes the genesis of the data used in simulations; Chapter 2 contains theoretical background information; Chapter 3 presents and analyzes the "leaky bucket" algorithm; Chapter 4 presents and analyzes the "average rate control usage profile" (ARCUP) algorithm; Chapter 5 proposes possible applications of the usage profile algorithms and summarizes our findings. The appendix contains references, source code, and a sample of the data sets used in simulations.
1.2 Web overview
The eminent role of the World Wide Web as a medium for information dissemination
has made it important to understand its properties. In recent years, the Web has been used,
among other applications, to trade stocks on-line, to conduct electronic commerce, and to
publish and deliver information in a variety of ways such as raw data, formatted text,
graphics, audio, video, and software [5]. An easy way to think of the Web, however, is as a set of cooperating clients and servers. We interact with the Web through web browsers; Netscape and Internet Explorer currently dominate the market. A browser is a graphical client program that allows users to access remotely located files or objects [11]. In order to access a file, we need to know its URL (uniform resource locator), which resembles the following:
http://www.haitiglobalvillage.com/
http://www.soccer.com/
http://www.amazon.com/
Since the Web is organized as a client-server model, each file is stored on a specific host
machine. Upon a request from the user, the requested file is transferred to the user's local
machine. These exchanges generate traffic over the Internet.
Today, the Web generates more data traffic on the Internet than any other application.
In an attempt to control the amount of traffic generated by each user on the Internet, one
proposal is to impose a usage profile on each user to regulate their traffic pattern. A usage
profile is a mechanism that shapes traffic and limits the length of bursts; further, it defines
a limit on the maximum traffic generated by each user and discourages users from abusing
the network by introducing additional delays on their transfers should they violate their
profile.
1.3 Characteristics of web traffic
The design and implementation of an attractive "usage profile" algorithm for Web
browsing requires a thorough understanding of Web traffic (in this paper, we use the terms
web traffic and Internet traffic interchangeably since the Web is currently the main contributor of network traffic). For years, researchers assumed that traffic on the Internet followed the model of a Poisson process. Poisson arrival processes have a bursty length characteristic which tends to be smoothed by averaging over long periods of time [1]. To the contrary, recent studies of LANs and wide-area networks have provided ample evidence that Internet traffic is self-similar [1]. As a result, such traffic can be modeled using the notion of self-similarity. Self-similarity is the trait we attribute to any object whose appearance is scale-invariant; that is, an object whose appearance does not alter regardless of the scale at which it is viewed [13]. [2] shows that Internet traffic exhibits long-range dependence as well. Long-range dependence involves the tail behavior of the autocorrelation function of a stationary time series, while self-similarity refers to the scaling behavior of the finite-dimensional distributions of a discrete- or continuous-time process [2]. Further, a process is considered to be long-range dependent or heavy-tailed if its autocorrelation function decays hyperbolically rather than exponentially [2]. Since self-similar processes display similar features, they are often referred to as heavy-tailed [8]. The difference between self-similarity and heavy-tailed behavior will be addressed in the next chapter.
1.4 Genesis of data used in simulations
To understand the nature of web traffic, we made use of real users' traces. These traces
were used to evaluate our proposed control algorithms. They were collected at Boston
University's Computer Science Department. Researchers at BU added a measurement
apparatus to the Web browser NCSA Mosaic, the preferred browser at the time (November
1994 through February 1995). These researchers were able to monitor the transactions
between each individual user on their LAN and the Internet. These traces contain records
of the HTTP request and user behavior occurring between November 1994 and May 1995.
During that period a total of 9,633 Mosaic sessions were traced, corresponding to a population of 762 users, and resulting in 1,143,839 requests for data transfer. Here is a sample
trace:
gonzo "http://www.careermosaic.com/cm/cml.html" 4776 0.806263
gonzo "http://www.careermosaic.com/cm/images/cmHome.gif" 90539 7.832752
gonzo "http://www.careermosaic.com/cm/cml.html" 0 0.0
Each line corresponds to a single URL request by the user and can be read as follows:
User name: gonzo
URL: http://www.careermosaic.com/cm/
File name: cml.html
Size of the document in bytes: 4776 bytes
Object retrieving time in seconds: 0.806263
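Reading one of these records is a matter of splitting the line into its four fields. The following C sketch (our own, not the thesis's source code; the struct and function names are hypothetical) parses a single trace line of the form shown above:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One record from a BU Mosaic trace: user, URL, bytes, seconds.
   The trace files themselves are whitespace-separated columns
   with the URL enclosed in double quotes. */
struct trace_record {
    char   user[64];
    char   url[256];
    long   bytes;    /* document size; 0 for a repeat (cached) request */
    double seconds;  /* retrieval time; 0.0 for a repeat request */
};

/* Parse one line such as:
   gonzo "http://www.careermosaic.com/cm/cml.html" 4776 0.806263
   Returns 1 on success, 0 on a malformed line. */
int parse_trace_line(const char *line, struct trace_record *r)
{
    if (sscanf(line, "%63s \"%255[^\"]\" %ld %lf",
               r->user, r->url, &r->bytes, &r->seconds) != 4)
        return 0;
    return 1;
}
```

Note that the third sample line above, with size 0 and time 0.0, corresponds to a repeated request served from the local cache.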
These data were collected at the application level. They were thoroughly examined in [1], which showed that the heavy-tailed nature of transmission and idle times is not primarily due to network protocols or user preference; rather, the heavy-tailed property can be attributed to information storage and processing. The results of these studies show that files transmitted through the Internet and files stored on servers obey heavy-tailed distributions. Figure 1-1 shows that data set gonzo is indeed heavy-tailed: most of the files have sizes less than 20,000 bytes, while some are as big as 160,000 bytes. With confidence, we use these data sets to simulate our algorithms.

[Figure 1-1: Distribution of data set gonzo (histogram of file sizes, in bytes)]
1.5 Methodology
In this paper, ON-times are represented by data collected by researchers at BU from
the monitoring of transactions between each individual user on their LAN and the Internet.
When not specified, data set gonzo will be the one considered in our analyses to represent
ON-times. OFF-times are generated by a model based on a fractional Brownian motion
algorithm. The leaky-bucket and ARCUP algorithms are the two schemes evaluated and
contrasted in this paper; they are both written in C. The leaky-bucket scheme is simulated
for different token rates in order to determine the most appropriate rate for a given user.
Moreover, it is simulated for different data sets, which illustrate its performance and its limitations in each case. More precisely, this shows that the leaky bucket is not the preferred scheme to regulate heavy-tail distributed traffic. To cope with the shortcomings of the leaky-bucket scheme, the ARCUP algorithm is proposed.

I construct a hypothetical user model by combining the ON times derived from BU user data and the OFF times derived from our fBm model. I picked a target value of average usage for each user of 10,000 bits/second, which is a not unreasonable overall usage rate for a user exploring the Web today. To adjust each data set to achieve this long-term average rate, I scale the OFF times produced by the model appropriately. The result of this scaling operation is that for each user trace, if no control is applied (that is, if there is no usage profile), the average rate over the whole trace will be 10,000 bits/sec.

In general terms, the method used to evaluate each proposed usage profile is as fair as possible. The profile is initially set so that it also has a long-term average permitted rate of 10,000 b/s. We observe the extent to which the bursty behavior of the user is passed through the profile unchanged, and then adjust the average rate. We then develop several variants of this algorithm. The best variant is simulated for different target rates and data sets; it embodies the most effective usage profile algorithm for Web traffic that emerges from this study. On the one hand, this algorithm permits bursty traffic if the user's average rate is less than the contracted target rate. On the other hand, it prevents the user from abusing his/her profile.
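The scaling step described above can be sketched in C (the function names are our own; this is an illustration of the arithmetic, not the thesis's simulation code). The long-term average rate of an alternating ON/OFF trace is the total number of bits transferred divided by the total elapsed time, and the OFF times are multiplied by whatever factor makes that ratio equal the target:

```c
#include <assert.h>
#include <stddef.h>

/* Long-term average rate of an alternating ON/OFF trace:
   total bits transferred divided by total elapsed time
   (sum of ON durations plus sum of OFF durations). */
double average_rate(const double *on_bits, const double *on_secs,
                    const double *off_secs, size_t n)
{
    double bits = 0.0, secs = 0.0;
    for (size_t i = 0; i < n; i++) {
        bits += on_bits[i];
        secs += on_secs[i] + off_secs[i];
    }
    return bits / secs;
}

/* Factor by which to multiply every OFF time so that the
   uncontrolled trace averages exactly `target` bits/second. */
double off_scale_factor(const double *on_bits, const double *on_secs,
                        const double *off_secs, size_t n, double target)
{
    double bits = 0.0, on = 0.0, off = 0.0;
    for (size_t i = 0; i < n; i++) {
        bits += on_bits[i];
        on   += on_secs[i];
        off  += off_secs[i];
    }
    /* Solve bits / (on + k*off) = target for k. */
    return (bits / target - on) / off;
}
```

For example, a trace that transfers 80,000 bits in 2 seconds of ON time followed by 6 seconds of OFF time already averages exactly 10,000 b/s, so its scale factor for that target is 1.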
Chapter 2
Theoretical background
The background information presented in this chapter is aimed at putting our work into context. It is designed to ease the reader's understanding and will help the reader to appreciate the merit of our results. This chapter is organized as follows: Sections 2.1 and 2.2 define self-similarity and heavy-tailed behavior; Section 2.3 defines and examines the nature of ON/OFF sources; Sections 2.4 and 2.5 examine ON-times and OFF-times; Section 2.6 presents a definition of fractional Brownian motion; and finally, Section 2.7 introduces the notion of feedback control, which is essential for the understanding of the ARCUP algorithm.
2.1 Definition of self-similarity
Self-similarity (this definition closely follows the one given in [1]): let X(t) be a stationary time series with zero mean (i.e., μ = 0). The m-aggregated series X^(m) is defined by averaging the original series X over non-overlapping blocks of size m:

    X^(m)_t = (1/m) · Σ_{i=(t-1)m+1}^{tm} X_i                    (2-1)

for all natural m. X is said to be H-self-similar if, for all m ≥ 1, X^(m) has the same distribution as X rescaled. When X is H-self-similar, its autocorrelation function

    r(k) = E[X_t X_{t+k}] / σ²                                   (2-2)

is the same for the series X^(m) for all m. Therefore, the distribution of the aggregated series is the same as that of the original except for a change in scale. In general, a process with long-range dependence has an autocorrelation function

    r(k) ~ k^(-b)                                                (2-3)

as k goes to infinity, where 0 < b < 1. The series X(t) is then said to be asymptotically second-order self-similar with Hurst parameter H = 1 - b/2 [3]. The Hurst parameter gives the degree of long-range dependence present in a self-similar time series.
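Equation (2-1) amounts to block-averaging, which is straightforward to compute. A small C sketch (function name ours):

```c
#include <assert.h>
#include <stddef.h>

/* m-aggregated series of equation (2-1): average the series x
   over consecutive non-overlapping blocks of size m.  Writes
   n/m values into xm and returns how many were written. */
size_t aggregate(const double *x, size_t n, size_t m, double *xm)
{
    size_t blocks = n / m;
    for (size_t t = 0; t < blocks; t++) {
        double sum = 0.0;
        for (size_t i = 0; i < m; i++)
            sum += x[t * m + i];
        xm[t] = sum / (double)m;
    }
    return blocks;
}
```

For an H-self-similar series, the sample variance of X^(m) decays as m^(2H-2); plotting log Var(X^(m)) against log m and measuring the slope is the classical variance-time method for estimating H.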
2.2 Definition of heavy-tailed
The understanding of heavy-tailed distributions is very important in network engineering due to their relationship to traffic self-similarity. Unlike the exponential, Poisson, or normal distributions, heavy-tailed distributions exhibit uncommon properties. To date, the simplest heavy-tailed distribution known is the Pareto distribution. Its probability density function and cumulative distribution function are given by:

    p(x) = α k^α x^(-(α+1))                                      (2-4)

    F(x) = P[X ≤ x] = 1 - (k/x)^α                                (2-5)

for α, k > 0 and x ≥ k. In this distribution, if α ≤ 1, the distribution is unstable with infinite mean and infinite variance. If 1 < α ≤ 2, on the other hand, the distribution is stable with finite mean but infinite variance [1]. Consequently, the degree of self-similarity largely depends on the parameter α. Regardless of the behavior of the distribution, the process is heavy-tailed if the asymptotic shape of the distribution is hyperbolic.

The Pareto distribution has an attractive trait: its distribution is hyperbolic over its entire range. It is widely used to generate self-similar traffic. In fact, it is shown in [2] that superposing a number of Pareto-distributed ON/OFF sources produces a time series sequence that is asymptotically self-similar.
2.3 Definition of ON/OFF sources
Using Web traffic, user preference, and file size data, [5] explains why the transmission times and quiet times for a given Web session are heavy-tailed. Web traffic can be modeled using heavy-tailed distributions; in the literature, these are referred to as the ON-time and OFF-time distributions. From a user's point of view, ON times correspond to the transmission duration of Web files, and OFF times represent times when the browser is not actively transferring data [5]. It is imperative to mention that transmission times depend on network conditions at the time as well. Technically, ON times represent periods during which packets (or cells, depending on the architecture in use) arrive at regular intervals, while OFF times exemplify periods with no packet arrivals. Some systems have an ON/OFF characteristic where an ON-period can be followed by other ON-periods and OFF-periods by other OFF-periods [2]. In contrast, the model used to mimic ON/OFF sources in Internet traffic is said to be strictly alternating [5]; that is, an ON-period is always followed by an OFF-period, and vice versa. At the level of individual source-destination pairs, the lengths of the ON-periods are independent and identically distributed, as are the lengths of the OFF-periods; the ON and OFF periods are also independent of one another, and their distributions need not be the same. In our simulations, we use actual data from Web traces to represent ON-times, and generate synthetic OFF-times using fractional Brownian motion.
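The strictly alternating structure can be made concrete with a small C sketch (names ours): given independently drawn ON and OFF durations, the timeline is built by laying them down in strict alternation, so interval i is ON when i is even and OFF when i is odd.

```c
#include <assert.h>
#include <stddef.h>

/* Build a strictly alternating ON/OFF timeline: the start time of
   each interval is the sum of all earlier durations.  ON and OFF
   durations come from independent, separately distributed draws,
   supplied here as pre-generated arrays. */
void alternate(const double *on, const double *off, size_t pairs,
               double *start /* must hold 2*pairs entries */)
{
    double t = 0.0;
    for (size_t i = 0; i < pairs; i++) {
        start[2 * i] = t;        /* ON interval begins  */
        t += on[i];
        start[2 * i + 1] = t;    /* OFF interval begins */
        t += off[i];
    }
}
```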
2.4 Examining ON-times or file transfer sizes
Researchers in [1] showed that the distribution of transfer sizes for files greater than 10,000 bytes can be modeled using a heavy-tailed distribution. It follows that the self-similarity, or heavy-tailed property, exhibited by Internet traffic is mainly determined by the set of available files in the Web. Today, most of the data transferred over the Internet represent multimedia files (i.e., images, text, video, and audio). Researchers in [5] report that although multimedia may not be the primary factor in determining the heavy-tailed nature of transferred files, it does increase the distribution's tail weight. The tail weight for the distribution of file sizes between 1,000 and 30,000 bytes is primarily due to images; for file sizes between 30,000 and 300,000 bytes, the tail weight is caused mainly by audio files. Finally, from 300,000 bytes onwards, the weight is attributed to video files.

To show that multimedia is not solely responsible for the heavy-tailed nature of Internet traffic, the authors in [5] compare the distribution of available files in the Web with the distribution of Unix files. They conclude that the distribution of Unix files is much heavier-tailed than the distribution of Web files. This remark suggests that even with added multimedia content, Web files do not dominate the heaviness of the tail distribution of transferred files: file sizes in general follow a heavy-tailed distribution [5].
2.5 Examining OFF-times
As mentioned above, ON-times are the result of the transmission durations of individual Web files; OFF-times, on the other hand, represent periods when a browser is not actively transferring data. While ON-times are mainly due to the transmission duration of Web data, OFF-times can be the result of a number of different causes. The client's machine (or workstation) may be idle because it has just finished receiving the content of a Web page; before requesting the next component, it will interpret, format, and display the first one. In other cases, the client machine may be idle because the user is processing the information last received, or the user may not be using his/her machine at all.

These two phenomena are called "active OFF" and "inactive OFF" times by the authors in [1]. The understanding of the OFF-times distribution lies in their differences. Active OFF-times represent the time required by the client machine to format, interpret, and display the content of a Web page. Such machine processing times tend to be in the range of 1 ms to 1 second [1]; it is unlikely for an embedded Web document to require more than 30 seconds of processing time. The researchers in [1] assumed that inactive OFF-times are in general due to user inactivity. They therefore concluded that inactive OFF-times resulting from user inactivity are mainly responsible for the heavy-tailed nature of OFF-times.
2.6 Fractional Brownian Motion
Fractional Brownian motion is a natural extension of ordinary Brownian motion. It is
a Gaussian zero-mean stochastic process, indexed by a single scalar parameter H ranging between zero and one [4]. Fractional Brownian motion (fBm) has very useful properties: fractal dimension, scale-invariance, and self-similarity [3]. As such, fBm represents a good model for nonstationary stochastic processes with long-range dependence [4]. As a nonstationary process, fBm does not admit a spectrum in the usual sense; however, it is possible to attach to it an average spectrum [4].

Although nonstationary, fBm does have stationary increments, which means that the probabilistic properties of the increment process

    B_H(t + s) - B_H(t)                                          (2-6)

depend only on the variable s. Moreover, this increment process is self-similar because, for any a > 0, the following is true:

    B_H(at) = a^H B_H(t)                                         (2-7)

where '=' represents equality in distribution. A standard fractional Brownian motion has the integral representation:

    B_H(t2) - B_H(t1)
        = (1 / Γ(H + 0.5)) [ ∫_{-∞}^{t2} (t2 - s)^(H-0.5) dB(s)
                             - ∫_{-∞}^{t1} (t1 - s)^(H-0.5) dB(s) ]   (2-8)

Ordinary Brownian motion is obtained from the standard fractional Brownian motion when the Hurst parameter H = 0.5. The nonstationary property of fBm is manifested in its covariance structure [4], given by:

    E[B_H(t) B_H(s)] = (σ²/2) ( |t|^2H + |s|^2H - |t - s|^2H )   (2-9)

From the previous equation, the variance is

    Var(B_H(t)) = σ² |t|^2H                                      (2-10)

The self-similarity characteristic of fBm can be deduced from the previous equation with a scale transformation [3]:

    σ²(rt) = r^2H σ²(t)                                          (2-11)

This result shows that fractional Brownian motion is scale-invariant and statistically indistinguishable under a scale transformation [3].
2.7 Feedback Control System
This section gives a brief overview of control systems; the reader may wish to revisit it for a better understanding of the ARCUP algorithm. A control system can be viewed as a system in which the manipulation of the input element(s) produces a desired output(s). Two main features define a control system: first, a mathematical model that expresses the characteristics of the original system; second, the design stage, in which an appropriate control mechanism is selected and implemented in order to achieve a desired system performance [6]. The control of a system can be performed either in open loop or in closed loop. An open-loop system is one in which the control input to the system is independent of its output; that is, the control input is not influenced by the system output. Open-loop systems are often simple and inexpensive to design. In some applications, however, it is important to feed the output back in order to better control the system. In this case, the loop is said to be closed. In a closed-loop system, therefore, the control input is influenced by the system output: the system output value is compared with a reference input value, and the result is used to modify the control input [6]. The mathematical modeling of feedback systems was introduced by Nyquist in 1932: he observed the behavior of open-loop systems under sinusoidal inputs to deduce the behavior of the associated closed-loop system. His work was later extended by Bode in 1938 and Evans in 1948; together, their work constitutes the basis of classical control theory [6].

A basic closed-loop system contains an input device, an output measuring device, an error measuring device, and an amplifier and control element; the latter manipulates the measured error in order to drive the output toward the desired value [6]. It is worth noting that there are two types of closed-loop (feedback) systems:

(a) A regulator is a closed-loop system that maintains an output equal to a pre-determined value regardless of changes in system parameters; and

(b) A servomechanism is a closed-loop system that produces an output that tracks some reference input regardless of changes in parameter values.

The former is better suited to model the system presented in this paper.
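The regulator idea can be illustrated with a minimal discrete-time proportional controller (a toy sketch of our own, not the ARCUP algorithm itself; the plant model and gain are hypothetical): each step, the output is measured, the error against a fixed set-point is formed, and the control input is adjusted in proportion to that error.

```c
#include <assert.h>

/* A minimal discrete-time regulator.  The "plant" is a toy
   first-order lag whose output drifts halfway toward the control
   input each step; the controller nudges the input proportionally
   to the error between a fixed set-point and the measured output. */
double regulate(double setpoint, double gain, int steps)
{
    double output = 0.0, input = 0.0;
    for (int i = 0; i < steps; i++) {
        double error = setpoint - output;   /* compare with reference */
        input += gain * error;              /* adjust control input   */
        output += 0.5 * (input - output);   /* plant responds slowly  */
    }
    return output;
}
```

With a modest gain the output settles at the set-point regardless of where it started, which is exactly the regulator behavior described in (a) above.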
Chapter 3
Leaky bucket algorithm
Having established the framework that allows the reader to understand our work, we are now in a position to introduce the first usage profile algorithm. The study of the leaky bucket assumes that the traffic source behaves as a two-state on/off arrival process with an arbitrary distribution for the time spent in each state [9]. A source is considered to be on (or busy) when it is transmitting/receiving and off (or idle) when it is not. Technically, ON-times represent periods during which packets (or cells, depending on the architecture in use) arrive at regular intervals, while OFF-times represent periods with no packet arrivals. In this chapter we introduce the concept of a leaky bucket, and analyze the results of the algorithm for different token rates and on different data sets. The chapter is organized as follows: Section 3.1 gives a definition of the leaky bucket concept; Sections 3.2 through 3.6 analyze the algorithm for different token rates; Section 3.7 evaluates the performance of the leaky bucket algorithm on different data sets; and finally, Section 3.8 summarizes our findings.
3.1 Definition of a leaky bucket
This mechanism has been utilized as a congestion control strategy in high-speed networks. It is often used to control traffic flows; ATM is a prime example. The ATM (Asynchronous Transfer Mode) technology is based on the transmission of a fixed size data unit.
When an ATM connection is established, the traffic characteristics of the source and its
quality of service are guaranteed by the network. The network enforces the admission control policies by using a usage parameter control [13]. The leaky bucket algorithm serves
this purpose. The leaky bucket algorithm is used to control bursty traffic as well. An Internet Service Provider (ISP), for instance, can use a leaky bucket profile to shape its incoming traffic [8]. A simple leaky bucket is characterized by two components: the size of the
bucket and the token replenishing rate. Tokens are generated at a constant rate and are
stored in the bucket which has a finite capacity. A token which arrives when the bucket is
full is automatically discarded. If the bucket contains enough tokens when packets arrive,
the mechanism allows them to pass through. That is, the user can burst traffic into the network at his/her allowed peak rate, which is usually the physical link capacity. After each transfer, the bucket is decremented accordingly. When the number of
tokens is insufficient, only a portion of the arriving traffic is immediately sent while the rest
is queued. If the bucket is empty, all arriving packets (traffic) are queued or discarded
according to the policy in place. The queued packets are serviced upon token arrival in the
bucket. If the token replenishing rate is constant, the service rate of the queued packets is
constant as well. Note that with a large bucket, a user can send a large burst of traffic in a
short time period. The token replenishing rate, on the other hand, allows a user to send data
at a constant bit rate for arbitrarily long periods.
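The transfer-time split described above can be sketched in a few lines of Python. This is a minimal illustration rather than the simulator used in this thesis; the function name is mine, and token arrivals during the burst itself are ignored for simplicity.

```python
def leaky_bucket_transfer(file_bits, tokens, peak_rate, token_rate):
    """Split one file transfer into a peak-rate burst and a token-rate tail.

    tokens     -- bits' worth of tokens accumulated during the preceding OFF period
    peak_rate  -- physical link capacity in bits/sec
    token_rate -- bucket replenishing rate in bits/sec
    """
    burst = min(file_bits, tokens)                  # portion passed immediately
    peak_time = burst / peak_rate                   # sent at the allowed peak rate
    token_time = (file_bits - burst) / token_rate   # remainder drains at the token rate
    return peak_time, token_time

# A small file arriving at a full bucket goes out entirely at the peak rate;
# a large file after a short OFF period is dominated by the token time.
small = leaky_bucket_transfer(80_000, 500_000, 1_000_000, 10_000)
large = leaky_bucket_transfer(1_300_000, 100_000, 1_000_000, 10_000)
```

With a large bucket (a long OFF period) the burst dominates; for small files the token time vanishes, which matches the behavior analyzed in the following sections.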
3.2 Analyzing the leaky bucket algorithm for different token rates
The leaky-bucket scheme has been utilized to monitor and enforce usage parameter
control (UPC). Researchers in [8] show that a leaky-bucket scheme can be imposed on
each on-off source as a usage profile. A profile is, therefore, defined by an initial number
of tokens in the bucket and a token replenishing rate. The profile determines when the
on-off sources (or users) can burst traffic into the network and when they must send at the
token rate. In this thesis, the bucket has an infinite capacity (i.e., no token is discarded). The
number of tokens accumulated in the bucket, however, is finite; it depends on the OFF-periods.
Indeed, the lengths of the OFF-periods are bounded, so the number of tokens in
a bucket is bounded as well. In general, the transfer time for each file has a peak time and
a token time component. The larger the file (i.e., the longer the on-period) the bigger the
token component is likely to be and the slower the transfer. The longer the off-period the
more tokens the bucket accumulates. The arriving file following such a period can be sent
at the peak rate. The larger the fraction of the file sent at the peak rate, the faster the transfer.
After a long ON-time, representing the transfer of a large file, the bucket contains few tokens.
The total transfer time required to send an arriving file depends more generally on the history
of previous ON-OFF times. If the arriving file is small, there is a high probability
that it will be sent at the peak rate. Conversely, if the arriving file is large, there is a low
probability that the bucket will have enough tokens: it will be processed at a speed near the
token rate [8]. A drawback of this scheme is that the more heavy-tailed the distribution
of the ON-times, the more traffic has to be sent at the token rate. [8] shows that
increasing the peak rate is not an effective way to solve this problem. In the following
subsections, we analyze the leaky bucket algorithm for different token rate values.
We will consider the normalized average rate (i.e., 10,000 bits/sec) and two of its multiples:
token rates of 20,000 and 25,000 bits per second. These rates will constitute the focus of
our attention. This procedure will allow us to determine the required token rate that will
provide the least total transfer time for the user under consideration.
3.3 Token rate equal to 10,000 bits per second
The initial value of 10,000 bits per second is chosen since this is the actual long-term
average rate of the traffic to be controlled. When applying this token rate, most transfer
times have two components: a peak time and a token time. The first fraction of the data is
received at the peak rate, and the rest, if any, at the token rate. The smaller the transmission
duration, the more dominant the peak time. As mentioned earlier, the goal of the
usage profile algorithm is to regulate users to their long-term contracted rate (token rate or
target rate), allow burst traffic to be sent rapidly, and prevent abuse. In the scatter plot of
Figure 3-1, the points that are lying near the X-axis represent small files that are sent at the
peak rate. On the other hand, the transfers that required thirty-seven, forty-two, and
forty-nine seconds are three examples of very large transfers.

[Figure 3-1: Leaky bucket with token rate 10,000 bits/sec. Scatter plot of transmission duration in seconds versus data transferred in bytes (x 10^4).]

As a result, the token time
dominates in each case. For example, for the thirty-seven second transfer, the file size is
58,852 bytes with a peak time of 0.0928 second and a token time of 37.7 seconds. For the
forty-two second transfer, the file has a size of 90,539 bytes with a resulting
peak time of 0.3 second and a token time of 42.12 seconds. Finally, for the forty-nine
second event, the file has a size of 163,711 bytes, resulting in a peak time of
0.81 second and a token time of 48.32 seconds. In all three cases the results are much better
than if the user were receiving strictly at the token rate. The reader can verify that for
the largest transfer (i.e., file size = 163,711 bytes), the total transmission time would have
been 131 seconds had the whole file been sent at the token rate; since the actual token time
is 48.32 seconds, over one half of the data was received at the peak rate, and the rest at the
token rate. Even so, the transmission duration is dominated by the token time.
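The arithmetic behind these figures can be checked directly; this is a quick sanity check on the quoted numbers, with the 1 Mbit/sec peak rate assumed from the simulation setup.

```python
file_bits = 163_711 * 8                    # largest file: 1,309,688 bits
token_only = file_bits / 10_000            # whole file at the token rate: ~131 seconds
peak_bits = 0.81 * 1_000_000               # bits delivered during the 0.81-second burst
fraction_at_peak = peak_bits / file_bits   # ~0.62, i.e., over one half at the peak rate
```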
3.4 Token rate equal to 20,000 bits per second
As mentioned earlier, the worst case transfer time is file size in bits divided by the
token rate. In this case, we decrease the transfer time for two reasons: a faster token rate,
and a larger number of tokens after a given OFF time which increases the number of bytes
sent at the peak rate. By increasing the token rate, we decrease the transfer time: they are
inversely proportional. However, increasing the token rate has the disadvantage that it permits a user to "abuse" the profile by sending at a long-term steady-state rate greater than
the intended target rate. The duration of the transfer still has a peak time component and a
token time component. In the scatter plot of Figure 3-2, the peak time dominates in most
of the transactions. As an illustration, the two outliers occurring at 0.5 and 4.5 seconds
have greater token times (see Figure 3-2).

[Figure 3-2: Leaky bucket with token rate 20,000 bits/sec. Scatter plot of transmission duration in seconds versus data transferred in bytes (x 10^4).]

The first represents a file transferred over a
duration of 0.5 second, having a size of 16,031 bytes with a peak time of 0.117 second and a
token time of 0.4 second. The second represents the file transferred for the duration of 4.5
seconds, and having a size of 15,429 bytes with a peak time of 0.04 second and a token
time of 4.17 seconds.
3.5 Token rate equal to 25,000 bits per second
A token rate of 25,000 bits per second is chosen here based on simulation results. This
rate is sufficient to allow this user to receive all files at the peak rate (i.e., achieves the lowest possible transmission duration time). That is, the user always accumulates enough
tokens to receive data at the peak rate; the token time is zero. The transmission duration
depends only on the peak rate. As an illustration, if we consider the 80,000 bytes file, we
find the duration time to be 0.64 seconds (640,000 bits at 1 Mbit/sec). This result is consistent with the results in Figure 3-3.

[Figure 3-3: Leaky bucket with token rate 25,000 bits/sec. Scatter plot of transmission duration in seconds versus data transferred in bytes (x 10^4).]

Further increasing the token rate for this data set will not improve the transmission delay.
3.6 Token rate equal to 8,000 bits per second
In case of overload, that is when the normalized average rate of a user is greater than
the token rate, the replenishing rate of the bucket is slower than the arriving data rate. The
bucket is unable to accumulate enough tokens. Therefore, most of the files are received at
the token rate. The transmission duration of the largest file goes from fifty seconds at
10,000 bits per second to 160 seconds at 8,000 bits per second. The leaky bucket algorithm severely penalizes overloaded traffic with heavy-tailed distributions. In the next
chapter, we present a different algorithm that yields better results in a similar situation.
[Figure 3-4: Leaky bucket with target rate 8,000 bits/sec. Scatter plot of transmission duration in seconds versus data transferred in bytes (x 10^4).]
3.7 Performance of the leaky bucket algorithm on different data sets
Studying the algorithm for one data set is insufficient to draw a conclusion about performance: data set distribution varies from user to user. This section evaluates the performance of the leaky bucket algorithm for different users (or data sets). The objective in the
following analysis is to determine the minimum token rate required by each data set in
order to achieve the lowest possible transfer time. For simplicity, we conducted this study
of the algorithm on only four sets of data; we analyzed data sets goofy, daffy, gonzo, and
pooh. From the scatter plots of Figure 3-5, we can infer that data set goofy attains its lowest
transfer duration of ten seconds at a token rate of 25,000 bits per second; data set daffy
reaches its lowest transfer duration of five seconds at a token rate of 35,000 bits per second;
and data set gonzo reaches its lowest transfer duration of 1.4 seconds at a token rate of
25,000 bits per second. Finally, data set pooh reaches its lowest transfer duration of 0.7
second at a token rate of 55,000 bits per second. If we consider the peak rate (1 Mbit per
second), our calculations yield transfer durations corresponding to the results exhibited in
the scatter plots of Figure 3-5. The discrepancy in the required token rates, in turn, can be
attributed to the differences in the distributions of these data sets.

[Figure 3-5: Maximum token rate required for each data set. Four scatter plots of transmission duration versus data transferred: goofy at 25,000 bits/sec, daffy at 35,000 bits/sec, gonzo at 25,000 bits/sec, and pooh at 55,000 bits/sec.]
Data set goofy reaches its required token rate sooner because the tail of its distribution is lighter (see Figure 3-6).

[Figure 3-6: Data set distributions. Histograms of file sizes in bytes for data sets goofy, daffy, gonzo, and pooh.]
As a result, data set goofy accumulates enough tokens to receive data at the peak rate with a
smaller token rate. The opposite holds for data set pooh, which has a very heavy-tailed
distribution; consequently, its required token rate is much larger.
3.8 Summary
To summarize, the transmission duration time decreases as we increase the token rate.
Though increasing the token rate (i.e., the usage profile) results in lowering the transfer
delays, such a scheme is neither efficient nor economically appealing. To achieve a desirable delay, a user will eventually need a higher profile, which will be more expensive.
After all, this higher usage profile may be needed just because of a few large transfers.
Running this scheme for different data sets shows that the heavier the tail distribution of
the data set, the greater the token rate required to achieve the desired minimum delay. As
pointed out in [1], the leaky-bucket scheme performs well with Poisson-like distributions but
penalizes heavy-tailed traffic (i.e., web traffic). The need for a suitable algorithm to regulate
web traffic is evident: the next chapter proposes and evaluates such an
algorithm.
Chapter 4
Average Rate Control Usage Profile Algorithm
4.1 Introduction
This algorithm is built on the Time Sliding Window Tagging algorithm presented in
[12]. There are two parts to this profile scheme. The first is a rate estimator, based on the
Time Sliding Window estimator, which measures the average usage over some period. The
second is a control or limit that is applied to the traffic so long as the actual sending rate, as
measured by the estimator, exceeds the target rate. In contrast to the leaky bucket scheme,
the controls look at each ON period in isolation; in other words, what happens in each ON
period does not depend on the length of the immediately preceding OFF period, or any other short-term history. The control is related only to the output of the estimator.
The estimator allows the algorithm to determine when a user has exceeded its target
rate. In this case, the algorithm will enforce some pre-defined policy to discourage such a
behavior. I propose a number of different controls, which involve reducing the allowed
peak rate, and limiting the number of bytes in any one ON period that can be sent at the
peak rate. Further, instead of using packets to calculate the total number of bytes sent, I
use file sizes. In this simulation, all control adjustments are made at the beginning of each
file (each ON period) since the size of the file is known in advance. A practical scheme
must make the decision incrementally since the duration of the ON period is not known.
This algorithm maintains three local variables: Win_length, which determines how much
the past "history" of a user affects its current rate; average_rate, which estimates the
current rate of the user upon each ON period; and T_front, which records the time of the
current file arrival. While T_front and average_rate are computed throughout the simulation,
Win_length must be configured. [12] shows that this algorithm performs well for Win_length
between 0.6 and 1 second. The core of the ARCUP algorithm is presented below:

Initially:
    Win_length = constant;
    average_rate = 0;
    T_front = 0;

Upon each file arrival, TSW updates its variables:
    Files_in_TSW = average_rate * Win_length;
    New_files = Files_in_TSW + workload;
    average_rate = New_files / (total_time - T_front + Win_length);
    T_front = total_time;  // time of the current file arrival
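The pseudocode translates directly into a runnable sketch. The class below is my Python rendering of the estimator, with units assumed to be bits and seconds; it is not the original simulation code.

```python
class TSWEstimator:
    """Time Sliding Window rate estimator, updated once per file arrival."""

    def __init__(self, win_length):
        self.win_length = win_length   # how much past history affects the rate
        self.average_rate = 0.0        # current rate estimate (bits/sec)
        self.t_front = 0.0             # time of the last file arrival (sec)

    def update(self, workload_bits, total_time):
        """Fold a newly arrived file of workload_bits into the estimate."""
        bits_in_tsw = self.average_rate * self.win_length
        new_bits = bits_in_tsw + workload_bits
        self.average_rate = new_bits / (total_time - self.t_front + self.win_length)
        self.t_front = total_time
        return self.average_rate

# Two 10,000-bit files arriving one second apart, with a one-second window:
est = TSWEstimator(win_length=1.0)
est.update(10_000, total_time=1.0)   # estimate becomes 5,000 bits/sec
est.update(10_000, total_time=2.0)   # estimate becomes 7,500 bits/sec
```

After each update, ARCUP compares average_rate to the target rate to decide whether a control must be applied.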
The goal of this algorithm is to hold the user to his/her long-term average rate and to
impose limitations on bursts. Our challenge is to let bursts through with little distortion
while keeping the achieved average rate close to the nominal average rate. We accomplish
these goals by keeping the current average rate as close as possible to the target
rate; both average and target rate are expressed in bits per second. For each arriving file,
the algorithm compares the current average rate with the target rate. If the average rate is
smaller than the target rate, no action is taken; if the average rate is greater than the target
rate, then some control is imposed to reduce the average rate. A typical average rate function is depicted in Figure 4-2. Since we want to achieve this goal regardless of changes in
system parameters, we use a feedback closed-loop to control the system. In other words,
the average rate is continuously measured and compared to the target rate. The difference
(or error value) between them is used to vary the peak rate, which for this simulation has a
maximum value of 1 Mbit per second.
4.2 Uncontrolled average rate
The uncontrolled average rate represents the average rate of each user if no control mechanism is imposed. For each user, the corresponding average rate is determined by allowing the user to receive data strictly at the peak rate. A simple division between the total
bits transferred and the total transmission time yields the uncontrolled average rate. The
OFF-times for each data set are scaled to adjust the uncontrolled average rate to the desired
value. Under these conditions, the user obtains its highest average peak rate (the highest
value that the average rate can reach) and its lowest transmission duration delay. In the
following subsections, we will present several schemes that aim at controlling the average
rate and, by doing so, increase the transmission delay.
This chapter defines and analyzes the same estimator scheme with different controls.
[Figure 4-1: Uncontrolled average rate for data set gonzo. Average rate in bits/sec versus time in minutes.]
Data set gonzo is utilized as the benchmark to evaluate the merit of each applied control
technique. We will present the following methods: in section I, we analyze an uncontrolled average rate; in section II, we show a maximum data control method; in section III,
we study a decrease and increase of the peak rate by a fixed factor approach; in section IV,
we combine the two techniques used in sections II and III; in section V, we study the performance of the algorithm for different target rates; in section VI, we evaluate the performance of the algorithm for different data sets; and in section VII, we compare the results
for running the algorithm at the peak, average, and target rates.
As an example, we consider a user with a target rate of 10,000 bits/sec which is not
unreasonable for a user surfing the Web today. In this case, no control algorithm is applied
on the user who completed all transfers at the peak rate. The resulting average rate function (Figure 4-1) stays above the target rate between six and seven minutes into simulation, and reaches a maximum value which is about twenty-five times the target rate. To
reiterate, our objective is to keep the average rate as close as possible to the target rate.
This control will become apparent to the user in terms of elapsed time when downloading
information. The smaller the target rate and the tighter the control, the more time it will
take to receive Web data. The straight lines in Figure 4-1 represent very long off-periods
while the spikes represent large file transfers.
4.3 Maximum data control
The motivation behind this scheme is to prevent the user from receiving large files
regardless of its current average rate. This variant does not use the rate estimator. It
receives data at the peak rate except if the size of the Web data is greater than a pre-defined
value. Files as big as 900,000 bits are suitable to evaluate the performance of different data
sets in the context of this algorithm. The bigger the maximum data, the smaller the overall
transmission time for a given data set. Here, we evaluate the algorithm for two rather
small maximum data values: 300,000 bits and 150,000 bits. With a maximum data of 300,000
bits, the user is less restricted. Therefore, the total transmission time to complete his/her
transfers is less than in the other case. However, the long-term average rate is better controlled with a maximum data of 150,000 bits while the increase in the total transmission
time is tolerable. In both cases, any file greater than the pre-defined maximum data size is
processed as follows: the pre-defined maximum data is serviced at the peak rate, and the rest
at the target rate. If the current file size is less than the pre-defined maximum data, the user
can always receive at the peak rate. In Figure 4-2, the long-term average rate in both cases
is well below the value obtained in the uncontrolled case (Figure 4-1). The improvement
in controlling the long-term average rate is at the expense of additional delay. The time
required by the same user to complete his/her entire transaction is increased by about four
minutes.
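A sketch of the maximum data control, under the same simplification used throughout (file sizes are known at the start of each ON period; the function name is mine):

```python
def max_data_transfer_time(file_bits, max_data_bits, peak_rate, target_rate):
    """At most max_data_bits are serviced at the peak rate; any excess
    is serviced at the target rate."""
    at_peak = min(file_bits, max_data_bits)
    at_target = file_bits - at_peak
    return at_peak / peak_rate + at_target / target_rate

# A 150,000-bit file passes untouched; a 600,000-bit file pays the penalty.
fast = max_data_transfer_time(150_000, 300_000, 1_000_000, 10_000)   # 0.15 s
slow = max_data_transfer_time(600_000, 300_000, 1_000_000, 10_000)   # 30.3 s
```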
[Figure 4-2: Maximum data effect. Average rate over time for maximum data values of 300,000 bits and 150,000 bits.]
4.4 Decrease and increase the peak rate by a fixed factor
For this version of the ARCUP algorithm, the user is allowed to receive data at the
peak rate until its average rate surpasses its target rate. At the occurrence of such an event,
the peak rate is decreased by a pre-defined factor (in this case, we set pk_rate =
0.8 * pk_rate). This decrease continues until the average rate drops below the target rate. At
each packet arrival, the ARCUP algorithm determines how to adjust its peak rate. We have
chosen this approach instead of a per time interval criterion since the ON periods represent
when the user is active. Figure 4-3 shows all the cases where the peak rate has to be
dropped. Figure 4-4 depicts the changes occurring in the peak rate; it also shows the
interplay among the average, the target, and the peak rates. A spike in Figure 4-3 corresponds to a drop in the peak rate in Figure 4-4. In this instance, the longest period of
drop occurs between twelve and seventeen minutes into simulation; between twelve and
twenty-two minutes approximately, the peak rate is kept below its maximum value.
[Figure 4-3: Data set gonzo with target rate 10,000 bits/sec. Average rate in bits/sec versus time in minutes.]
The peak rate is increased by a predefined factor as well (in this example, we set peak
rate = 1.2*peak rate). After a drop around five minutes into simulation (Figure 4-4), the
peak rate increases and remains constant until about twelve minutes into simulation. The
behavior of the peak rate within that interval is due to the fact that the average rate remains
below the target rate over that same interval (Figure 4-3).
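The adjustment rule can be stated compactly. The 0.8 and 1.2 factors are those used in this simulation; capping the increase at the nominal 1 Mbit/sec peak rate is my reading of the plots, not a stated rule.

```python
NOMINAL_PEAK = 1_000_000  # bits/sec

def adjust_peak_rate(peak_rate, average_rate, target_rate, down=0.8, up=1.2):
    """Multiplicatively decrease the peak rate while the average rate exceeds
    the target rate; otherwise increase it, never beyond the nominal peak."""
    if average_rate > target_rate:
        return down * peak_rate
    return min(up * peak_rate, NOMINAL_PEAK)
```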
[Figure 4-4: Change in the peak rate. Peak rate in bits/sec versus time in minutes.]
4.5 Varying the peak rate and maximum data control
This section is a hybrid of sections II and III. This variant of the
ARCUP algorithm provides the best results; therefore, it is utilized for the remainder of
this thesis to evaluate different case studies. In addition to the maximum data control, the
peak rate is dropped whenever the average rate is greater than the target rate. This control
occurs regardless of files' sizes. Analyzing this variant of the ARCUP algorithm for the
same maximum data used above, we observed a better control in the average rate with a
slight increase in the total delay. The maximum value reached by the average rate in this
case is about 80,000 bits/sec compared to 200,000 bits/sec in the previous case (Figure 4-3).
In return, the total delay is about three minutes extra.
We present the results for two maximum data values: 300,000 and 150,000 bits.

[Figure 4-5: Maximum data and peak rate control. Average rate over time for maximum data values of 300,000 bits and 150,000 bits.]

In both cases, the algorithm restricts the user from receiving data at the peak rate
once the file size exceeds those values. This restriction takes effect even when the current
average rate is less than the target rate. With the maximum data of 300,000 bits, the long-term average rate is slightly larger, with a total delay of 40 minutes. The long-term average rate is more restricted with the smaller maximum data, while the incurred delay is
about the same. Figure 4-5 shows the result for these two cases, while Figure 4-6 displays
their relationship with the peak rate. The reader can observe that the peak rate drops whenever the average rate is greater than the target rate; it increases when the target rate is
smaller, or remains constant. In Figure 4-5, for instance, when the maximum data size
allowed equals 150,000 bits, the peak rate momentarily drops from 1 Mega bits per second
to 10,000 bits per second after seven minutes into simulation. This drop occurs because
the average rate is greater than the target rate within that interval. Consequently, it forces
the average rate to go below the target rate.

[Figure 4-6: Variation in the peak rates. Peak rate over time for maximum data values of 150,000 bits and 300,000 bits.]

The peak rate remains below its maximum
value during much of the simulation. It momentarily climbs back to its nominal value after
thirty minutes into simulation due to the re-adjustment in the average rate. The peak rate
remains at its maximum value (i.e., 1 Mbit/sec) for about two minutes thereafter.
This
observation can be explained by the fact that the average rate remains below the target rate
within that interval. On the other hand, when the maximum data size allowed equals
300,000 bits, the result is a little bit different. The reader should notice that the drop in the
peak rate is more acute when the maximum data equals 300,000 bits. By relaxing the constraint on the maximum data size, the user obtains a greater average rate (see Figure 4-5).
Since the peak rate is dynamically controlled, a greater increase in the average rate results
in a greater decrease in the peak rate (see Figure 4-6).
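Putting the pieces together, one ON-period step of this hybrid variant can be sketched as follows. This is a simplification with hypothetical names: the average rate is assumed to come from the TSW estimator of Section 4.1, and the constants are those used in this chapter.

```python
def arcup_step(average_rate, peak_rate, file_bits,
               target_rate=10_000, max_data=300_000,
               nominal_peak=1_000_000, down=0.8, up=1.2):
    """One file arrival: adjust the peak rate from the current average-rate
    estimate, then send at most max_data bits at the (possibly reduced)
    peak rate and the remainder at the target rate."""
    if average_rate > target_rate:
        peak_rate = down * peak_rate                 # throttle the burst rate
    else:
        peak_rate = min(up * peak_rate, nominal_peak)
    at_peak = min(file_bits, max_data)
    transfer_time = at_peak / peak_rate + (file_bits - at_peak) / target_rate
    return peak_rate, transfer_time
```

A large file arriving while the user is over its target rate thus pays twice: its burst rate is reduced, and everything beyond the maximum data drains at the target rate.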
4.6 Performance of the ARCUP algorithm for different target rates
Is the ARCUP algorithm sensitive to the target rate? Obviously, different users are
bound to have different profile requests. In a quest to determine the best profile for a user
given a sample of his/her Web transactions over a period of time, I evaluate the ARCUP
algorithm for different target rate values. For simplicity, we restrict ourselves to just three
values, namely: 12,000 bits/sec, 10,000 bits/sec, and 8,000 bits/sec.
For this analysis, I concentrate on the transfer time required for two files when different target rates are applied. As an illustration, I focus on the scatter plots of Figures 4-7 to 4-9 to
analyze the transmission times resulting for a file of 20,000 bytes and a file of 60,000
bytes.
First, I analyze the results for the case when the target rate equals 12,000 bits/sec. For
the first file, the resulting transmission duration time is approximately two seconds using
the ARCUP algorithm; if it were processed at the nominal average rate (10,000 bits/sec),
the resulting transmission time would have been sixteen seconds. Further, this file size is
less than the pre-defined maximum data of 300,000 bits. As a result,
most of the transfers are done at a rate that is close to the peak rate. For the second file, the
nominal average rate (i.e, 10,000 bits/sec) would yield a transmission duration time of
forty-eight seconds while the ARCUP algorithm yields approximately eight seconds.
Since the second file size is greater than the pre-defined maximum data, its transmission
transfer time contains a peak rate component and a target rate component. In the scatter
plot of Figure 4-7, the points near the x-axis represent the file sizes less than
300,000 bits. All those files are processed at a rate that is near the peak rate. The remaining points represent the larger files that are processed at a slower rate.

[Figure 4-7: Data set gonzo at target rate 12,000 bits/sec. Scatter plot of transmission duration in seconds versus transfer size in bytes (x 10^4).]
Second, I analyze the results for the case when the target rate equals 10,000 bits per
second. By lowering the target rate, we obtain an increase in the maximum delay per
transaction. In this analysis, I consider the same two files mentioned in the previous case.
The same amount of time is required for the smaller file (see Figure 4-7) since it is less
than the pre-defined maximum data allowed. However, there is an increase of approximately three minutes for the second file.

[Figure 4-8: Data set gonzo at target rate 10,000 bits/sec, varying the peak rate with maximum data of 300,000 bits. Scatter plot of transmission duration in seconds versus transfer size in bits (x 10^5).]

Third, by lowering the target rate to 8,000 bits/sec, the ARCUP algorithm becomes more restrictive (Figure 4-9). The individual transfer
time for each file increases, as does the overall transmission time. We notice that the
20,000-byte file still requires approximately the same amount of time, while the 60,000-byte
file requires twenty seconds. In addition, there is a slight increase in the transmission
time of the files that are smaller than the maximum data. These are the files with both a
peak-rate and an average-rate component. In this case, the average rate reaches the target
rate much sooner; this has the effect of reducing the obtained average rate, though it
increases the transmission times.
To summarize, variations in the target rate affect the obtained average rate (total data
transferred over total transmission duration) as well as the transmission duration time.

[Figure 4-9: Data set gonzo at target rate 8,000 bits/sec. Scatter plot of transmission duration in seconds versus data transferred in bytes (x 10^4).]

If the
target rate is set to be greater than the nominal average rate, the user obtains a lower
response time per transfer. On the other hand, if the target rate is set lower than the nominal average rate (which is 10,000 bits/sec for data set gonzo), the user obtains a greater
response time. Below is a table of the obtained average rates for different target rates and different maximum data values.
4.7 Effect of maximum data size on obtained average rate
Imposing a maximum data size can be viewed as limiting the type of files that a
Table 4-1: Effect of maximum data size on obtained average rate

  Maximum data   Obtained average rate   Nominal average rate   Target rate
                 (bits/sec)              (bits/sec)             (bits/sec)
  100,000 bits   8,780                   10,000                 8,000
  300,000 bits   9,084                   10,000                 8,000
  600,000 bits   9,248                   10,000                 8,000
  900,000 bits   9,460                   10,000                 8,000
  100,000 bits   8,933                   10,000                 10,000
  300,000 bits   9,220                   10,000                 10,000
  600,000 bits   9,470.8                 10,000                 10,000
  900,000 bits   9,652                   10,000                 10,000
  100,000 bits   9,045                   10,000                 12,000
  300,000 bits   9,304                   10,000                 12,000
  600,000 bits   9,525.5                 10,000                 12,000
  900,000 bits   9,675.6                 10,000                 12,000
user can download with no additional delay. A typical case can be to allow the user to
download, for instance, hypertext, images, videos, but not MP-3 files. Table 4-1 and Figure 4-10 illustrate the case of a fixed target rate but different maximum data. The total
transmission duration time improves with an increase in the maximum data. Figure 4-10
shows that by increasing the maximum data allowed, more files are received near or
at the peak rate. This illustrates the tendency of the ARCUP algorithm to restrict large
files.
[Figure 4-10: Transfers for data set gonzo with different maximum data values. Four scatter plots of transmission duration versus transfer size in bytes, for maximum data of 100,000, 300,000, 600,000, and 900,000 bits.]
4.8 Effect of maximum data on transmission duration
Table 4-2 shows the results of running the ARCUP algorithm at different maximum data values. These results do not take OFF periods into consideration. In other words, they solely represent the activity (i.e., ON) periods of data set gonzo at a target
Table 4-2: Effect of maximum data on transmission duration

Maximum data    Total transmission    Uncontrolled
(bits)          duration (secs)       duration (secs)
100,000         290                   20
300,000         221                   20
600,000         150                   20
900,000         115                   20
rate of 10,000 bits/second. As observed, the total transmission duration decreases as we increase the maximum data value. This feature will provide ISPs (Internet Service Providers) and/or network administrators the flexibility to adjust a user's profile as desired.
4.9 Algorithm performance for different data sets
Studying the algorithm for one data set is insufficient to draw a performance conclusion: data set distribution varies from user to user. This section therefore evaluates the performance of the ARCUP algorithm for several different users (or data sets). This study involves four additional sets of data: goofy, taz, daffy, and pooh (the names of the actual users were removed for privacy reasons).
For each data set, we first consider the uncontrolled average rate followed by the controlled average rate. The target rate and the nominal average rate in all cases are 10,000 bits/sec, and the maximum data is 300,000 bits. The ARCUP algorithm performs relatively well in each case. Table 4-3 shows the achieved average rate for each data set.
In each case, we give the user's uncontrolled average rate followed by his/her controlled average rate.

Table 4-3: Achieved average rate

User name    Long-term average    Achieved average
             rate (bits/sec)      rate (bits/sec)
Daffy        9,993                9,155
Gonzo        10,122               9,048
Goofy        10,406               9,467
Pooh         10,430               9,260
Taz          9,883                8,864

In the uncontrolled case, the average rate has the highest spikes and
the smallest total transmission delays. In the controlled case, the long-term average rate is much lower, but there is an increase in total transmission delay. For instance, data set pooh's average rate swings in the vicinity of the target rate while the transmission delay increases considerably in the controlled case (Figure 4-16). In the other cases, the algorithm controls the average rates with far less additional transmission delay (Figure 4-8). As an illustration, the achieved average rate for data set daffy is reduced considerably in the controlled case with only a small increase in the total transmission delay (Figure 4-12).
[Figures omitted: each plots the transfer rate in bits/sec against time in minutes]
Figure 4-11: Uncontrolled daffy
Figure 4-12: Controlled daffy
Figure 4-13: Uncontrolled goofy
Figure 4-14: Controlled goofy
Figure 4-15: Uncontrolled pooh
Figure 4-16: Controlled pooh
Figure 4-17: Uncontrolled taz
Figure 4-18: Controlled taz
4.10 Running the algorithm at the peak, target, and average rates
The transmission time is given by the ratio between the size of the file being received and the actual transfer rate. The transmission time for each file received has two bounds. Considering the scatter plot of Figure 4-19, the transmission time is limited above by the target rate and below by the peak rate. This section investigates the duration of transferred files when running the algorithm: 1) at the peak rate, 2) at the target rate, and 3) at the average rate. An illustration should facilitate its understanding: when allowed to continuously receive files at the peak rate, the maximum delay observed by the user is 1.4
seconds per transaction. This result is accurate because the user's largest transaction is
[Figure omitted: scatter plot of transmission time against transfer size in bytes for the peak, target, and average rates]
Figure 4-19: Average, peak, and target rates
180,000 bytes at a speed of 1 megabit per second. When constrained to continuously receive at the target rate, the maximum delay is 144 seconds per transaction. Again, we can corroborate this result because the largest transaction is still 180,000 bytes at a target rate of 10,000 bits per second. For a user receiving data at the average rate, the observed transmission delay falls between the two previous cases. In this case, each transmission time has a peak-time as well as a target-time component. As a result, transfers are not processed any faster than the peak rate nor any slower than the target rate.
Chapter 5
Conclusion, Applications, and Future Work
This chapter contains a summary of our findings, suggests a possible application of the ARCUP algorithm, and proposes ideas for future work. It is followed by the references, the source code, and a sample of the data set used in simulations.
Since the two schemes presented in the preceding chapters are different, we presented the graphical results for each of them in the most comprehensive way. However, one is mostly interested in knowing how much the flows are delayed by each scheme. We answer this question by evaluating both algorithms against a common metric. We first look at the amount of time required for a user to complete his/her transactions when no control is applied. We then compute the amount of time required for the same set of data when either scheme is applied. The results are presented in the following section.
5.1 Conclusion
This thesis analyzed the leaky-bucket algorithm and showed that it is inefficient in controlling heavy-tailed traffic. This analysis is supported by [8] as well. Table 5-1 shows the results of running the leaky bucket versus the proposed algorithm for different data sets at the standard usage profile (i.e., target rate = 10,000 bits/sec). The target rate in the case of the ARCUP algorithm is equivalent to the token rate in the leaky-bucket algorithm. From this table, one case deserves some elaboration: data set goofy. Data set goofy (Figure 1-1) is approximately Poisson distributed; therefore the leaky-bucket scheme provides a transmission delay similar to the ARCUP algorithm's. The leaky-bucket profile works well for short-range-dependent traffic -- predictable traffic such as the Poisson model. In all the other cases, the ARCUP algorithm outperforms its counterpart.

Table 5-1: Leaky bucket vs. ARCUP algorithm (time in mn and data in megabits)

User name or    Total data     Uncontrolled    Leaky-bucket    TSW algorithm
data set        transferred
Taz             67             113             140.5           127
Gonzo           19.7           32.4            40              34.6
Goofy           22.7           36              40.3            40
Pooh            22.6           20.8            24              21.7
Daffy           13             37.7            43              42
These results are obtained by running the ARCUP algorithm with a maximum data equal to 600,000 bits and a target rate of 10,000 bits/sec. They can be further improved by increasing the maximum data value. In closing, the simple leaky-bucket profile works well for short-range-dependent traffic -- predictable traffic such as the Poisson model. The ARCUP algorithm is more suitable for long-range-dependent or self-similar traffic. It outperforms the leaky-bucket algorithm especially when the total amount of data transferred is considerable. Further studies could confirm whether or not the ARCUP algorithm penalizes Markovian traffic.
5.2 Application
In the current Internet model, ISPs (Internet Service Providers) lease their bandwidth from the backbone suppliers. The latter own high-capacity links such as OC-1, OC-3, and OC-12, and are considered the source of bandwidth capacity. Corporations, institutions, and individual users, in turn, lease their bandwidth from the ISPs. To deter end users from using more than their contracted bandwidth, Internet Service Providers can build a usage profile for each source. They will have to provision enough bandwidth to carry traffic for all users. For Web (bursty) traffic, they assume that not all users send at peak rate at the same time. In a setting like a university, ISPs can provide a profile to the LAN administrator. The administrator will then repartition profiles locally according to users' needs. In a computer laboratory, like the Athena clusters here at MIT, all workstations would be assigned the same profile. Businesses, for example, will be able to discourage their employees from making excessive use of the Internet by assigning them lower usage profiles. Say an employee's profile allows him/her to download five files every ten minutes. If this employee tries to download seven files in ten minutes, the algorithm can be calibrated to complete the transfer in twenty minutes. Such a policy will encourage the user to stay within his assigned profile. In addition, the usage profile algorithm can be built with added features that can warn the user when he/she exceeds the profile. As an incentive, the user will not be penalized if he/she voluntarily slows down. The usage profile mechanism will help avoid congestion over the Internet.
5.3 Future work
The two algorithms presented in this thesis can work in tandem with mechanisms that differentiate the type of traffic beforehand. In future work, one can explore the possibility of designing a hybrid algorithm that properly regulates all types of traffic. Further, for both algorithms, we assumed that the lengths of the ON periods are known a priori, which is not the case in practice. In future work, one can explore the merit of a scheme that determines the proper adjustment to the peak and the average rate based on the instantaneous length of the ON period. An inquisitive mind can even look into a different algorithm that produces better results than the one presented in this thesis.
5.4 References
[1] M. E. Crovella and A. Bestavros, "Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes", IEEE/ACM Transactions on Networking, Vol. 5, No. 6, December 1997.
[2] Walter Willinger, Murad S. Taqqu, Robert Sherman, and Daniel V. Wilson, "Self-Similarity Through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level", IEEE/ACM Transactions on Networking, Vol. 5, No. 1, February 1997.
[3] "Modeling Mammographic Images Using Fractional Brownian Motion", Signal Processing Research Centre, Queensland University of Technology.
[4] Patrick Flandrin, "Wavelet Analysis and Synthesis of Fractional Brownian Motion", IEEE Transactions on Information Theory, Vol. 38, No. 2, March 1992.
[5] W. Willinger, V. Paxson, and M. S. Taqqu, "Self-Similarity and Heavy Tails: Structural Modeling of Network Traffic", in A Practical Guide to Heavy Tails: Statistical Techniques and Applications, R. Adler, R. Feldman, and M. S. Taqqu, editors, Birkhauser, 1998.
[6] S. A. Marshall, "Introduction to Control Theory", 1978.
[7] Gennady Samorodnitsky and Murad S. Taqqu, "Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance", pp. 318-320.
[8] Ian Je Hun Liu, "Bandwidth Provisioning for an IP Network Using User Profiles", S.M. Thesis, Technology and Policy, and Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 1999.
[9] David D. Clark, "An Internet Model for Cost Allocation and Pricing", in Internet Economics, McKnight, L. and Bailey, J., editors, MIT Press, 1997.
[10] Kevin Warwick, "An Introduction to Control Systems", Advanced Series in Electrical and Computer Engineering, Vol. 8, 2nd edition.
[11] Larry L. Peterson and Bruce S. Davie, "Computer Networks: A Systems Approach", 2nd edition.
[12] Wenjia Fang, "Differentiated Services: Architecture, Mechanisms and an Evaluation", Ph.D. Dissertation, Princeton University, November 2000.
[13] V. F. Nicola, Gertjan A. Hagesteijn, and Byung G. Kim, "Fast Simulation of the Leaky Bucket Algorithm".
Appendix A
Source codes and Data sample
A.1 Generating OFF times
/* Author: Pierre Arthur Elysee */
/* Date: 6/4/99 */
/* Description: This module generates the OFF times used in our simulations */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>

/* all macros */
#define seps " "               /* separator symbols for parsing */
#define MAXNUMBERCHARS 200     /* number of characters expected per line */

/* Generate a uniformly distributed random number from [0,1] */
double rand_gen(unsigned long *num)
{
    *num = (*num * 16807UL) % 4294967295UL;
    return ((double) *num) / 4.294967295e9;
}

int main(void)
{
    /* variables */
    FILE *fp_in, *fp_out;
    char input_name[20], output_name[20];
    char buff[MAXNUMBERCHARS]; /* holds one line of data at a time */
    float word[30], duration;
    int nw, on_sources, off_sources, fbm = 1;
    double y = 0.0, time_on = 0.0, time_off = 0.0, save_time_off = 0.0;
    double start_time = 0.0, end_time = 0.0;
    char *w;
    char buff1[80];
    unsigned long seed = 2345677;

    printf("Enter input file name:\n");
    scanf("%s", input_name);
    strcpy(output_name, input_name);
    strcat(output_name, "_out");
    printf("input file name:  %s\n", input_name);
    printf("output file name: %s\n\n", output_name);

    /* duration value is entered here */
    printf("Enter duration value:\n");
    scanf("%f", &duration);

    start_time = clock(); /* beginning of simulation */

    /* testing input and output files for existence */
    if ((fp_in = fopen(input_name, "r")) == NULL) {
        printf("Couldn't open %s for reading.\n", input_name);
        exit(1);
    }
    if ((fp_out = fopen(output_name, "w")) == NULL) {
        printf("Couldn't open %s for writing.\n", output_name);
        exit(1);
    }

    /* output header */
    sprintf(buff1, "%s  %s  %s  %s", "Time_on", "Time_off", "Source_index", "Fbm_index");
    fputs(buff1, fp_out);  /* write line to file */
    fputs("\n", fp_out);   /* add return character to line */

    /* Read input component list file stream */
    while (fgets(buff, MAXNUMBERCHARS, fp_in) != 0) {
        nw = 0;
        w = strtok(buff, seps);        /* find first word */
        while (w) {
            word[nw++] = atof(w);      /* parsing input line */
            w = strtok(NULL, seps);    /* find next word */
        }

        /* calculating number of sources on and off for each FBM */
        on_sources  = (int)(word[7] / word[1]);
        off_sources = (int)(word[3] - on_sources);

        /* testing Beta and alpha values */
        if (word[21] <= 0.0 || word[23] <= 0.0 || word[13] <= 0.0) {
            sprintf(buff1, "Invalid Beta_high1 or Beta_high2 or Alpha1 for FBM: %d", fbm++);
            fputs(buff1, fp_out);
            fputs("\n", fp_out);
            continue;
        }

        /* calculating time_on and time_off for each source within an FBM */
        while (on_sources) {
            while (duration > time_on) {
                /* y is a uniformly distributed random variable in (0,1) */
                do {
                    y = rand_gen(&seed); /* y will never be either zero or one */
                } while (y == 0.0 || y == 1.0);

                /* calculating on/off times (Pareto inverse transform) */
                time_on  = word[21] * pow(1/y - 1, 1.0/word[13]); /* time in seconds */
                time_off = word[21] * pow(1/y - 1, 1.0/word[13]); /* time in seconds */
                time_on  = time_on + save_time_off;
                time_off = time_off + time_on;
                save_time_off = time_off;
                if (time_on > duration) continue; /* time_on has exceeded simulation time */

                /* writing results to output file */
                sprintf(buff1, "%f %f %d A%dZ", time_on, time_off, on_sources, fbm);
                fputs(buff1, fp_out); /* write line to file */
                fputs("\n", fp_out);  /* add return character to line */
            } /* end while duration */
            time_off = 0.0;
            time_on = 0.0;
            save_time_off = 0.0;
            on_sources--; /* get next source */
        } /* end while on_sources */

        while (off_sources) {
            while (duration > time_on) {
                /* y is a uniformly distributed random variable in (0,1) */
                do {
                    y = rand_gen(&seed); /* y will never be either zero or one */
                } while (y == 0.0 || y == 1.0);

                /* calculating on/off times */
                time_on  = word[23] * pow(1/y - 1, 1.0/word[13]); /* time in seconds */
                time_off = word[23] * pow(1/y - 1, 1.0/word[13]); /* time in seconds */
                time_on  = time_on + save_time_off;
                time_off = time_on + time_off;
                save_time_off = time_off;
                if (time_on > duration) continue; /* time_on has exceeded simulation time */

                /* writing results to output file */
                sprintf(buff1, "%f %f %d A%dZ", time_on, time_off, off_sources, fbm);
                fputs(buff1, fp_out); /* write line to file */
                fputs("\n", fp_out);  /* add return character to line */
            } /* end while duration */
            off_sources--; /* get next source */
            time_off = 0.0;
            time_on = 0.0;
            save_time_off = 0.0;
        } /* end while off_sources */

        fbm++; /* get a new FBM from file */
    } /* end outer while */

    fclose(fp_in);
    fclose(fp_out);

    end_time = clock();
    printf("-------------------------------------\n");
    printf("Simulation has been completed successfully.\n");
    printf("The total running time is: %e seconds\n",
           ((double)(end_time - start_time)) / CLOCKS_PER_SEC);
    return 0;
} /* end main */
A.2 This module represents the leaky bucket scheme
/* Author: Pierre Arthur Elysee */
/* Date: 2/4/2000 */
/* Description: This module represents the leaky bucket scheme. It calculates
   the total time required by each source to complete its data transfer. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* all macros */
#define seps " "               /* separator symbols for parsing */
#define MAXNUMBERCHARS 200     /* number of characters expected per line */

int main(void)
{
    /* variables */
    FILE *fp_in, *fp_out;
    char input_name[20], output_name[20];
    char buff[MAXNUMBERCHARS]; /* holds one line of data from the input file */
    char buff1[200];
    float pk_time = 0.0;
    float tkn_time = 0.0;        /* peak time and token time */
    float total_pk_time = 0.0;
    float total_tkn_time = 0.0;
    float tkn_rate = 10000.0;    /* replenishing rate of tokens in bits/sec */
    float pk_rate = 1000000.0;   /* bits/sec */
    float total_time = 0.0;      /* time required by each source to complete its transfer */
    float bucket = 0.0;          /* current bucket content in bits */
    float total_data = 0.0;
    float data_in_bits = 0.0;
    int work_load = 0;
    int typ_off_time = 0;        /* in sec */

    printf("Enter input file name:\n");
    scanf("%s", input_name);
    strcpy(output_name, input_name);
    strcat(output_name, "_out");
    printf("input file name:  %s\n", input_name);
    printf("output file name: %s\n\n", output_name);
    printf("please enter token rate:");
    scanf("%f", &tkn_rate);

    /* testing input and output files for existence */
    if ((fp_in = fopen(input_name, "r")) == NULL) {
        printf("Couldn't open %s for reading.\n", input_name);
        exit(1);
    }
    if ((fp_out = fopen(output_name, "w")) == NULL) {
        printf("Couldn't open %s for writing.\n", output_name);
        exit(1);
    }

    /* output header */
    sprintf(buff1, "%s  %s  %s  %s  %s", "typ_off_time", "total_data",
            "total_pk_time", "total_tkn_time", "total_transfer_time");
    fputs(buff1, fp_out); /* write line to file */
    fputs("\n", fp_out);  /* add return character to line */

    while (fgets(buff, MAXNUMBERCHARS, fp_in) != 0) {
        typ_off_time = atoi(strtok(buff, seps));
        work_load = atoi(strtok(NULL, seps));
        data_in_bits = 1.0 * work_load;

        if (typ_off_time == 0) typ_off_time = 4; /* the off time cannot be zero */
        bucket = bucket + tkn_rate * typ_off_time;
        total_data = total_data + data_in_bits;

        /* calculating peak time and token time */
        if (data_in_bits > bucket) {
            pk_time = bucket / pk_rate;
            data_in_bits = data_in_bits - bucket;
            bucket = pk_time * tkn_rate; /* replenishing bucket for transfer duration */
            tkn_time = data_in_bits / tkn_rate - bucket / tkn_rate;
            bucket = 0.0;
        } else {
            pk_time = data_in_bits / pk_rate;
            tkn_time = 0.0;
            bucket = bucket - data_in_bits;
        }

        /* calculating total transfer time */
        total_pk_time = total_pk_time + pk_time;
        total_tkn_time = total_tkn_time + tkn_time;
        total_time = total_pk_time + total_tkn_time + typ_off_time + total_time;

        sprintf(buff1, " %d  %d  %f  %f  %f", typ_off_time, work_load,
                pk_time, tkn_time, total_time);
        fputs(buff1, fp_out); /* write line to file */
        fputs("\n", fp_out);  /* add return character to line */
    } /* end while */

    fclose(fp_in);
    fclose(fp_out);
    return 0;
} /* end main */
A.3 This module represents the uncontrolled ARCUP algorithm
/* Author: Pierre Arthur Elysee */
/* Date: 9/24/00 */
/* Description: This module calculates the total time required by each source
   to complete its data transfer when no control is applied: every file is
   received at the peak rate. A time-sliding-window (TSW) estimator tracks
   the average transfer rate over the real web traces, and the transfer time
   of each transaction is reported. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* all macros */
#define seps " "               /* separator symbols for parsing */
#define MAXNUMBERCHARS 200     /* number of characters expected per line */

int main(void)
{
    /* variables */
    FILE *fp_in2, *fp_out;
    char input_name2[20], output_name[20];
    char buff[MAXNUMBERCHARS]; /* holds one line of data from the input file */
    char buff1[200];
    char *w;
    float pk_time = 0.0;
    float total_pk_time = 0.0;
    float pk_rate = 1000000.0;   /* initial peak rate */
    float target_rate = 10000.0; /* initial target rate */
    float av_rate = 0.0;         /* TSW estimate of the average rate */
    float total_time = 0.0;      /* time required to complete the transfers */
    float total_data = 0.0;
    float data_in_bits = 0.0;
    float Win_length = 0.6;      /* TSW window length in seconds */
    float T_front = 0.0;         /* time of the last transaction arrival */
    float Files_in_TSW = 0.0;
    float New_files = 0.0;
    float word[30];              /* contains data from the input file */
    int work_load = 0;
    int typ_off_time = 0;        /* in sec */
    int nw = 0;

    printf("Enter input file name of data:\n");
    scanf("%s", input_name2);
    strcpy(output_name, input_name2);
    strcat(output_name, "_out");
    printf("input file name:  %s\n", input_name2);
    printf("output file name: %s\n\n", output_name);

    if ((fp_in2 = fopen(input_name2, "r")) == NULL) {
        printf("Couldn't open %s for reading.\n", input_name2);
        exit(1);
    }
    if ((fp_out = fopen(output_name, "w")) == NULL) {
        printf("Couldn't open %s for writing.\n", output_name);
        exit(1);
    }

    /* output header */
    sprintf(buff1, "%s  %s  %s  %s  %s  %s", "typ_off_time", "work_load",
            "pk_rate", "av_rate", "target_rate", "total_transfer_time");
    fputs(buff1, fp_out); /* write line to file */
    fputs("\n", fp_out);  /* add return character to line */

    /* reading data from input file */
    while (fgets(buff, MAXNUMBERCHARS, fp_in2) != 0) {
        nw = 0;
        w = strtok(buff, seps);        /* find first word */
        while (w) {
            word[nw++] = atof(w);      /* parsing input line */
            w = strtok(NULL, seps);    /* find next word */
        }
        typ_off_time = (int)word[0];
        work_load = (int)word[1];
        data_in_bits = 8 * work_load;  /* data to transfer in bits */
        total_data = total_data + data_in_bits;
        pk_time = data_in_bits / pk_rate; /* uncontrolled: peak rate only */

        total_time = T_front + typ_off_time + pk_time;

        /* calculating the average transfer rate (time sliding window) */
        Files_in_TSW = av_rate * Win_length;
        New_files = Files_in_TSW + work_load * 8;
        av_rate = New_files / (total_time - T_front + Win_length);
        T_front = total_time; /* time of the last packet arrival */
        total_pk_time = total_pk_time + pk_time;

        sprintf(buff1, " %d  %d  %d  %d  %d  %f  %f",
                typ_off_time, work_load, (int)pk_rate, (int)av_rate,
                (int)target_rate, total_time / 60.0, total_data);
        fputs(buff1, fp_out); /* write line to file */
        fputs("\n", fp_out);  /* add return character to line */
    } /* end while */

    fclose(fp_in2);
    fclose(fp_out);
    return 0;
} /* end main */
A.4 This module represents the controlled ARCUP algorithm
/* Author: Pierre Arthur Elysee */
/* Date: 9/24/00 */
/* Description: This module calculates the total time required by each source
   to complete its data transfer. It uses an average estimator scheme along
   with real web traces. Files larger than max_data have 20% of their bits
   sent at the target rate and 80% at the peak rate; when the estimated
   average rate exceeds the target rate, the peak rate is decreased,
   otherwise it is increased. The transfer time of each transaction is
   reported. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* all macros */
#define seps " "               /* separator symbols for parsing */
#define MAXNUMBERCHARS 200     /* number of characters expected per line */

int main(void)
{
    /* variables */
    FILE *fp_in2, *fp_out;
    char input_name2[20], output_name[20];
    char buff[MAXNUMBERCHARS]; /* holds one line of data from the input file */
    char buff1[200];
    char *w;
    float pk_time = 0.0;
    float total_pk_time = 0.0;
    float pk_rate = 1000000.0;   /* initial peak rate */
    float target_rate = 10000.0; /* initial target rate */
    float av_rate = 0.0;         /* TSW estimate of the average rate */
    float total_time = 0.0;      /* time required to complete the transfers */
    float total_data = 0.0;
    float data_in_bits = 0.0;
    float Win_length = 0.6;      /* TSW window length in seconds */
    float T_front = 0.0;         /* time of the last transaction arrival */
    float Files_in_TSW = 0.0;
    float New_files = 0.0;
    float word[30];              /* contains data from the input file */
    int work_load = 0;
    int max_data = 300000;       /* maximum allowed file size in bits */
    int typ_off_time = 0;        /* in sec */
    int nw = 0;

    printf("Enter input file name of data:\n");
    scanf("%s", input_name2);
    strcpy(output_name, input_name2);
    strcat(output_name, "_out");
    printf("input file name:  %s\n", input_name2);
    printf("output file name: %s\n\n", output_name);

    if ((fp_in2 = fopen(input_name2, "r")) == NULL) {
        printf("Couldn't open %s for reading.\n", input_name2);
        exit(1);
    }
    if ((fp_out = fopen(output_name, "w")) == NULL) {
        printf("Couldn't open %s for writing.\n", output_name);
        exit(1);
    }

    /* output header */
    sprintf(buff1, "%s  %s  %s  %s  %s  %s", "typ_off_time", "work_load",
            "pk_rate", "av_rate", "target_rate", "total_transfer_time");
    fputs(buff1, fp_out); /* write line to file */
    fputs("\n", fp_out);  /* add return character to line */

    /* reading data from input file */
    while (fgets(buff, MAXNUMBERCHARS, fp_in2) != 0) {
        nw = 0;
        w = strtok(buff, seps);        /* find first word */
        while (w) {
            word[nw++] = atof(w);      /* parsing input line */
            w = strtok(NULL, seps);    /* find next word */
        }
        typ_off_time = (int)word[0];
        work_load = (int)word[1];
        data_in_bits = 8 * work_load;  /* data to transfer in bits */
        total_data = total_data + data_in_bits;
        if (typ_off_time == 0) typ_off_time = 3;

        if (data_in_bits > max_data) {
            /* oversized file: 20% of the bits at the target rate,
               80% at the peak rate */
            pk_time = 0.2 * data_in_bits / target_rate;
            pk_time = pk_time + 0.8 * data_in_bits / pk_rate;
            data_in_bits = 0;
        }
        /* readjusting peak rate */
        if (av_rate > target_rate && data_in_bits != 0) {
            pk_rate = 0.8 * pk_rate;   /* above the profile: slow down */
            pk_time = data_in_bits / pk_rate;
        } else if (data_in_bits != 0) {
            pk_rate = 1.2 * pk_rate;   /* below the profile: speed up */
            if (pk_rate > 1000000.0) pk_rate = 1000000.0;
            pk_time = data_in_bits / pk_rate;
        }

        total_time = T_front + typ_off_time + pk_time;

        /* calculating the average transfer rate (time sliding window) */
        Files_in_TSW = av_rate * Win_length;
        New_files = Files_in_TSW + work_load * 8;
        av_rate = New_files / (total_time - T_front + Win_length);
        T_front = total_time; /* time of the last packet arrival */
        total_pk_time = total_pk_time + pk_time;

        sprintf(buff1, " %d  %d  %d  %d  %d  %f  %f",
                typ_off_time, work_load, (int)pk_rate, (int)av_rate,
                (int)target_rate, total_time / 60.0, total_data);
        fputs(buff1, fp_out); /* write line to file */
        fputs("\n", fp_out);  /* add return character to line */
    } /* end while */

    fclose(fp_in2);
    fclose(fp_out);
    return 0;
} /* end main */
A.5 Sample of data used in simulations
taz 797447407 13352 "http://cs-www.bu.edu/" 2299 0.969024
taz 797447408 940773 "http://cs-www.bu.edu/lib/pics/bu-logo.gif" 1803 0.629453
taz 797447409 941884 "http://cs-www.bu.edu/lib/pics/bu-label.gif" 715 0.326586
taz 797447476 527498 "http://cs-www.bu.edu/students/grads/Home.html" 4734 0.494357
taz 797447611 924579 "http://cs-www.bu.edu/lib/icons/rball.gif" 0 0.0
taz 797447639 206997 "http://www.cts.com/cts/market/" 9103 1.751035
taz 797447641 152490 "http://www.cts.com/cts/market/marketplace.gif" 20886 1.143875
taz 797447643 428176 "http://www.cts.com/cts/market/dirsite-icon.gif" 2511 0.590501
taz 797447644 349043 "http://www.cts.com/art/cts.gif" 1826 0.613857
taz 797447650 507322 "http://www.cts.com/~flowers" 318 0.599902
taz 797447651 164916 "http://www.cts.com:80/~flowers/" 3044 1.556657
taz 797447652 850779 "http://www.cts.com/~flowers/thumb.gif" 9256 2.654269
taz 797447679 477564 "http://www.cts.com/~flowers/order.html" 2865 0.743227
taz 797447680 342368 "http://www.cts.com/~flowers/thumb.gif" 0 0.0
taz 797447723 595449 "http://www.cts.com:80/~flowers/" 0 0.0
taz 797447727 341012 "http://www.cts.com/cts/market/dirsite-icon.gif" 0 0.0
taz 797447727 348313 "http://www.cts.com/art/cts.gif" 0 0.0
taz 797447735 567056 "http://www.cts.com/~vacation" 320 1.312128
taz 797447736 930528 "http://www.cts.com:80/~vacation/" 1168 1.020013
taz 797447738 92121 "http://www.cts.com/~vacation/logo2.gif" 5485 2.669898
taz 797447741 441969 "http://www.cts.com/~vacation/boxindx1.gif" 3828 0.682333
taz 797447742 561155 "http://www.cts.com/~vacation/boxindx2.gif" 3786 0.701705
taz 797447743 760975 "http://www.cts.com/~vacation/boxindx4.gif" 3936 0.757089
taz 797448806 36925 "http://worldweb.net/~stoneji/tattoo/mytats.html" 2898 0.181871
taz 797448806 314167 "http://worldweb.net/~stoneji/tattoo/my-tats.gif" 7806 0.254429
taz 797448807 108445 "http://worldweb.net/~stoneji/tattoo/dragon.gif" 10878 0.313795
taz 797448808 675031 "http://worldweb.net/~stoneji/tattoo/cavedraw.gif" 3710 0.231060
taz 797448809 849560 "http://worldweb.net/~stoneji/tattoo/sm-arm.gif" 3198 0.297034
taz 797448823 624434 "http://worldweb.net/~stoneji/tattoo.html" 0 0.0
taz 797454701 532335 "http://www.hollywood.com/rocknroll/" 1837 0.880347
taz 797454705 926469 "http://www.hollywood.com/rocknroll/buzz.gif" 3841 0.400525
taz 797454706 750284 "http://www.hollywood.com/rocknroll/quote.gif" 3888 0.410293
taz 797454707 571245 "http://www.hollywood.com/rocknroll/sound.gif" 4181 0.425392
taz 797454708 420418 "http://www.hollywood.com/rocknroll/video.gif" 4000 0.437737
taz 797454709 257421 "http://www.hollywood.com/rocknroll/sight.gif" 3512 0.419571