Assignment 1: Network Emulation of TCP CUBIC and SACK (Draft #1)
Naeem Khademi
Networks and Distributed Systems Group
Department of Informatics
University of Oslo, Norway
naeemk@ifi.uio.no
ABSTRACT
The goal of this report is to experimentally evaluate the performance of two TCP congestion control mechanisms, namely SACK and CUBIC, as implemented in the Linux kernel TCP suite, using network emulation techniques. The performance of these TCP variants has been evaluated using real-life measurements under different parameter settings, varying the bottleneck buffer size and the number of concurrent download flows.
Keywords
TCP, Measurement, Network emulation
1. INTRODUCTION
TCP has been the dominant reliable data transfer protocol on the Internet for over a decade and is expected to remain so in the future. One of the most challenging issues for researchers has been to maximize TCP's performance (e.g., goodput) under the diverse scenarios found on the Internet. Several congestion control mechanisms have been proposed to achieve this goal, with CUBIC currently being the default congestion control mechanism in the Linux kernel. Moreover, the TCP suite implemented in the Linux kernel provides the functionality to select congestion control mechanisms other than the default and also makes their source code openly available. This gives researchers the opportunity to use passive measurements to evaluate the performance of these mechanisms under varying conditions and to enhance or modify the existing congestion control mechanisms to optimize performance.
Buffer size plays an important role in determining TCP's performance. The common assumption about the buffer sizing requirement for a single TCP flow follows the rule-of-thumb, which sets the required buffer size at the bottleneck to the bandwidth×delay product of the flow's path. This amount of buffering is associated with the sawtooth behavior of the TCP congestion window (cwnd) in NewReno or SACK. TCP is designed to saturate the maximum available bandwidth over the long term, and it eventually fills up the total buffer space provided along the path. Larger buffer sizes introduce higher queuing delays and are therefore unfavorable for delay-sensitive applications; they also inflate TCP's round-trip time (RTT).
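As a rough illustration of the rule-of-thumb, the short Python sketch below computes the bandwidth×delay product for a setup like ours; the 10 Mbps rate and 1448-byte MSS match our configuration, while the 200 ms round-trip time is only an assumed, illustrative value.

    # Illustrative bandwidth x delay product (BDP) calculation.
    # Link rate and MSS match our setup; the RTT value is an assumption.
    LINK_RATE_BPS = 10e6   # bottleneck rate: 10 Mbps
    RTT_S = 0.2            # assumed round-trip time (200 ms), illustrative only
    MSS_BYTES = 1448       # maximum segment size used in the tests

    bdp_bytes = LINK_RATE_BPS * RTT_S / 8    # rule-of-thumb buffer size in bytes
    bdp_packets = bdp_bytes / MSS_BYTES      # the same size in MSS-sized packets
    print(f"BDP = {bdp_bytes:.0f} bytes, about {bdp_packets:.0f} packets")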
It has been shown that, under a realistic traffic scenario, such as when multiple TCP flows coexist along the path between sender and receiver, the cwnds of desynchronized TCP flows cancel each other out and produce an almost flat aggregate cwnd, relaxing the buffer sizing requirement to a value smaller than the bandwidth×delay product. Research has shown that this value can be related to the square root of the number of flows. This phenomenon allows core routers at the Internet backbone to be built with smaller buffers that provide almost the same utilization level while requiring fewer RAM chipsets, lowering the products' cost.
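A minimal sketch of this small-buffer rule, reusing the illustrative BDP from the previous snippet and applying the square-root scaling reported in the buffer-sizing literature:

    import math

    def small_buffer_packets(bdp_packets: float, n_flows: int) -> float:
        # Buffer size suggested by the BDP / sqrt(n) rule for n desynchronized flows.
        return bdp_packets / math.sqrt(n_flows)

    for n in (1, 2, 5, 10, 20):
        # 173 packets is the illustrative BDP computed above
        print(n, round(small_buffer_packets(173, n)))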
In this paper, we study the impact of different buffer sizes at the bottleneck router (here, a network emulator node) jointly with various numbers of coexisting flows between a sender and a receiver, for both TCP CUBIC and SACK. We study how these scenario settings affect TCP's throughput and RTT and compare the behavior of CUBIC and SACK in these respects. The rest of this paper is organized as follows: Section 2 presents the experimental setup and methodology used in the measurements. Section 3 demonstrates the measurement results and evaluates the performance of TCP CUBIC and SACK under various parameter settings, and finally Section 4 discusses the results and concludes the paper.
2. EXPERIMENTAL METHODOLOGY
This section describes the experimental setup and methodology used in this paper. To passively measure the performance of different TCP variants, we set up a testbed of three computers acting as sender, receiver, and an emulator/router node in between. Each TCP peer is physically connected to the emulator node by a 100 Mbps wired link (however, we used an emulation tool to limit the available bandwidth to 10 Mbps). Figure 1 is a schematic diagram of the network topology used in the tests. We used Emulab to set up this testbed and conduct our experiments. Emulab is an open network testbed at the University of Utah which provides a platform for researchers to conduct real-life network experiments (Figure 2). The specifications of this testbed are presented in Table 1.
Figure 1: Network topology (node A (sender) -- 10 Mbps -- emulator node (delay=100ms) -- 10 Mbps -- node B (receiver))

Figure 2: Emulab testbed

Figure 3: Aggregate throughput of parallel flows (aggregate throughput (b/s) vs. number of parallel flows, buffer size = 20 pkts; CUBIC and SACK)
The PCs employed for the measurements are located in clusters of nodes stacked in a network lab environment.
Table 1: Test-bed setup
Test-bed: Emulab
PC: Pentium III 600 MHz
Memory: 256 MB
Operating System: Fedora Core 4
Linux kernel: 2.6.18.6
Default interface queue size: 1000 pkts
Number of nodes: 3
Each experiment was repeated for 10 runs, each lasting 50 seconds, with a 20-second gap between consecutive runs. The presented results are averages over all runs. The iperf traffic generation tool was used to generate TCP traffic during each test. TCP's send and receive buffers were set large enough (256 KB) to allow cwnd to grow without restriction. The Maximum Segment Size (MSS) was set to 1448 bytes (MTU = 1500 bytes). Two performance metrics are measured: 1) TCP throughput, and 2) Round-Trip Time (RTT). To gather traffic statistics, we used tcpdump to monitor the traffic traversing the path between sender and receiver. To calculate the throughput and the average RTT, we employed tcptrace, a tool which parses the dumped traffic files in pcap format and reports various statistics about the TCP flows.
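The sketch below shows, in Python, how one such measurement run can be driven end to end; the interface name, receiver address, and file name are placeholders, and the iperf, tcpdump, and tcptrace options shown are the commonly used ones rather than the exact invocations of our scripts.

    # Sketch of one measurement run: capture with tcpdump, generate traffic
    # with iperf, then post-process the trace with tcptrace.
    # Interface, receiver address and file name are placeholders.
    import subprocess, time

    IFACE = "eth0"           # placeholder: capture interface on the sender
    RECEIVER = "10.0.0.2"    # placeholder: address of node B
    TRACE = "run1.pcap"

    dump = subprocess.Popen(["tcpdump", "-i", IFACE, "-w", TRACE, "tcp"])
    time.sleep(1)            # give tcpdump time to start

    # 50 s of TCP traffic with, e.g., 5 parallel flows and 256 KB socket buffers
    subprocess.run(["iperf", "-c", RECEIVER, "-t", "50", "-P", "5", "-w", "256K"],
                   check=True)

    dump.terminate()
    dump.wait()

    # tcptrace: -l for long per-connection output, -r for RTT statistics
    subprocess.run(["tcptrace", "-l", "-r", TRACE], check=True)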
Each experiment consists of 50 seconds of iperf TCP traffic from node A to node B. The number of coexisting flows between these nodes varies from 1 to 20. The network emulator's role (as the intermediate node) is to add a fixed amount of delay (100 ms) to each arriving TCP packet and to limit the maximum available bandwidth on the link to 10 Mbps. In addition, the emulator buffers the arriving TCP packets in both its ingress and egress buffers, which are of equal size (ranging from 20 to 100 packets). To provide a fine-grained distribution of the packets serviced by the emulator node among the different TCP flows, we set the maximum burst size of a TCP flow at the emulator node to 10 consecutive packets. We are able to change the TCP congestion control mechanism in the Linux TCP suite using the /proc file system or the sysctl command.
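As a small example, switching the congestion control algorithm between runs can be done as sketched below; the two methods correspond to the /proc entry and the sysctl command mentioned above, and the Python wrapper is just illustrative glue (both require root privileges).

    # Select the TCP congestion control algorithm for the next run.
    import subprocess

    def set_cc_via_proc(algo: str) -> None:
        # method 1: write directly to the /proc file system
        with open("/proc/sys/net/ipv4/tcp_congestion_control", "w") as f:
            f.write(algo)

    def set_cc_via_sysctl(algo: str) -> None:
        # method 2: use the sysctl command
        subprocess.run(["sysctl", "-w",
                        f"net.ipv4.tcp_congestion_control={algo}"], check=True)

    # "cubic" selects CUBIC; the SACK variant corresponds to "reno" with
    # net.ipv4.tcp_sack enabled.
    set_cc_via_proc("cubic")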
3. MEASUREMENT RESULTS
This section presents the performance measurement results of TCP CUBIC and SACK for different buffer sizes and various numbers of parallel flows.
3.1 The Impact of the Number of Parallel TCP Flows
Figure 3 shows the impact of the number of coexisting flows on the overall TCP performance when the buffer size at the bottleneck node is 20 packets. We can observe that as the number of concurrent TCP flows grows from 1 to 20, the total aggregate throughput of the system increases from 2.5 and 2.9 Mbps to 4.5 and 4.9 Mbps for SACK and CUBIC respectively, with CUBIC achieving a slightly higher TCP throughput than SACK. This almost two-fold increase in throughput is due to the fact that as the number of coexisting flows increases, the flows become more desynchronized over time, providing a better utilization level than the single-flow scenario. In this scenario, CUBIC's performance stands at a higher level than SACK's due to the aggressive nature of CUBIC's cwnd growth (a cubic function of the time elapsed since the last congestion event).
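To make this "cubic function of time" concrete, the sketch below implements the standard CUBIC window curve W(t) = C(t - K)^3 + W_max with C = 0.4 and a multiplicative decrease to 0.7 W_max, the constants later standardized for CUBIC; the exact constants in the kernel version we used may differ slightly.

    # CUBIC congestion window growth after a loss event:
    #   W(t) = C * (t - K)^3 + W_max,   K = ((W_max * (1 - beta)) / C) ** (1/3)
    # W_max is the window at the last loss, t is the time in seconds since that
    # loss, and C = 0.4, beta = 0.7 are the commonly cited defaults.
    def cubic_window(t: float, w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
        k = ((w_max * (1.0 - beta)) / c) ** (1.0 / 3.0)  # time to climb back to W_max
        return c * (t - k) ** 3 + w_max

    # Example: starting from W_max = 100 segments, the window first re-approaches
    # W_max and then grows past it, independently of the RTT.
    for t in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(f"t = {t:4.1f} s  cwnd = {cubic_window(t, 100):6.1f} segments")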
Figure 4 demonstrates the impact of various numbers of coexisting TCP flows on the average RTT of all flows. The TCP flows' average RTT remains almost fixed at a minimal level for various numbers of parallel flows when the bottleneck's buffer size is set to the small value of 20 packets, for both CUBIC and SACK. This is because small buffer sizes incur very little queuing delay; instead, packet drop events become more frequent.
Figure 4: RTT vs. number of parallel flows (average RTT in ms for CUBIC and SACK with bottleneck buffer sizes of 20, 50, and 100 packets)

Figure 5: Throughput vs. Buffer Size (aggregate throughput in b/s for CUBIC and SACK with 1, 2, 5, 10, and 20 parallel flows)

Figure 6: RTT vs. Buffer Size (average RTT in ms for CUBIC and SACK with 1, 2, 5, 10, and 20 parallel flows)
However, the difference between CUBIC's and SACK's RTT becomes more significant as the buffer size grows, with CUBIC's RTT being higher than SACK's for various numbers of parallel flows. This is due to the aggressive behavior of CUBIC after a loss event (most probably caused by a buffer overflow), which leads to a higher number of TCP packets sitting in the bottleneck buffer at any instant of time and therefore increases the queuing delay. In contrast, SACK behaves conservatively by halving the cwnd (similar to NewReno), so fewer packets traverse the path, the bottleneck buffer drains after a loss event, and consequently the queuing delay is smaller.
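A back-of-the-envelope calculation illustrates why the larger post-loss window translates into a higher RTT: packets in flight beyond the bandwidth×delay product sit in the bottleneck buffer, and each queued packet adds its transmission time to the RTT. The sketch below uses an assumed 100 ms base RTT, our 10 Mbps bottleneck, an illustrative W_max, and a CUBIC-style reduction to 0.7 W_max (the commonly cited value), and compares it with SACK-style halving.

    # Rough queuing delay right after a loss, assuming the post-loss window
    # stays in flight. Base RTT and W_max are illustrative values.
    LINK_RATE_BPS = 10e6
    BASE_RTT_S = 0.1        # assumed base (propagation/emulation) RTT
    PKT_BYTES = 1500

    bdp_pkts = LINK_RATE_BPS * BASE_RTT_S / (8 * PKT_BYTES)  # packets the "pipe" holds
    w_max = 150                                              # window at the loss event

    for name, factor in (("SACK (halving)", 0.5), ("CUBIC (beta = 0.7)", 0.7)):
        w = factor * w_max
        backlog = max(0.0, w - bdp_pkts)                   # packets left in the buffer
        q_delay_ms = backlog * PKT_BYTES * 8 / LINK_RATE_BPS * 1000
        print(f"{name:18s} cwnd = {w:5.1f} pkts, backlog = {backlog:5.1f} pkts, "
              f"queuing delay = {q_delay_ms:5.1f} ms")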
3.2 The Impact of Buffer Sizing
Figure 5 shows the impact of buffer sizing on the aggregate TCP throughput of the system (this graph will be explained in the next draft/presentation). Figure 6 demonstrates the average RTT of the TCP flows for varying buffer sizes (this graph will be explained in the next draft/presentation).
4. CONCLUSIVE REMARKS
In this paper, we evaluated the performance of two congestion control mechanisms, namely CUBIC and SACK, using passive measurements and emulation techniques under various scenarios. The impact of buffer sizing at the bottleneck on the throughput and RTT of these TCP variants has been studied. Furthermore, the impact of the number of coexisting TCP flows has been studied as well. While a higher number of parallel flows leads to an increase in aggregate throughput for both CUBIC and SACK (with CUBIC performing slightly better most of the time), it also increases the average RTT of CUBIC compared to SACK. On the other hand, considering various buffer sizes, an increase in buffer size leads to an increase in both throughput and RTT for SACK and CUBIC. However, TCP throughput remains constant after a certain threshold while the queuing delay (and therefore the RTT) keeps increasing; in this case, CUBIC is more exposed to the increase in RTT.
5. REMARKS
This draft will be modified accordingly to include more results and figures. We are already continuing the experiments for buffer sizes larger than 100 packets. Each test is repeated for 10 runs in Emulab, and the final averaged results will be presented with 95% confidence intervals.