Assignment 2: Comparison of Simulation and Emulation Results of TCP Performance Evaluation (Draft #1)

Naeem Khademi
Networks and Distributed Systems Group
Department of Informatics
University of Oslo, Norway
naeemk@ifi.uio.no

Kjetil Raaen
Norwegian School of Information Technology
Schweigaardsgate 14, 0185 Oslo, Norway
raakje@nith.no

ABSTRACT
The goal of this report is to compare the results of two approaches to evaluating the TCP performance of two congestion control mechanisms (CUBIC and SACK) sharing a bottleneck. The first approach is an experimental evaluation of these TCP variants, as implemented in the TCP suite of the Linux kernel, using network emulation techniques. The performance of these TCP variants has been evaluated through real-life measurements under different parameter settings, namely varying bottleneck buffer sizes and numbers of concurrent flows. The second approach is simulation, where the same TCP variants have been evaluated using the ns-2 network simulator.

Keywords
TCP, Measurement, Network emulation, Network simulation

1. INTRODUCTION
TCP has been the dominant reliable data transfer protocol on the Internet for over a decade and is expected to remain so in the future. One of the most challenging issues for researchers has been to maximize TCP's performance (e.g. throughput, goodput) under the diverse scenarios found on the Internet. Several congestion control mechanisms have been proposed to achieve this goal, with CUBIC currently being the default congestion control mechanism in the Linux kernel. The TCP suite implemented in the Linux kernel allows selecting congestion control mechanisms other than the default, and their source code is openly available. This gives researchers the opportunity to use passive measurements to evaluate the performance of these mechanisms under varying conditions and to enhance or modify the existing mechanisms to optimize performance. The Linux TCP implementation has also been ported to ns-2 (ns-2.31 and newer), making it possible to evaluate the performance of the Linux TCP implementation in simulation and to compare it against real-life measurements.

Buffer size plays an important role in determining TCP's performance. The common assumption about the buffer sizing requirement for a single TCP flow follows the rule-of-thumb, which identifies the required buffer size at the bottleneck as the bandwidth×delay product of the flow's path. This buffer size is associated with the sawtooth behavior of the TCP congestion window (cwnd) in NewReno or SACK. TCP is designed to saturate the maximum available bandwidth over the long term, and it eventually fills up the entire buffer provided along the path. Larger buffer sizes introduce higher queueing delays; they are therefore not favorable for delay-sensitive applications and inflate TCP's round-trip time (RTT). It has been shown that, under a realistic traffic scenario in which multiple TCP flows coexist along the path between sender and receiver, the cwnd sawtooths of desynchronized TCP flows offset each other, producing an almost flat aggregate cwnd and relaxing the buffer sizing requirement to a value smaller than the bandwidth×delay product. Research has shown that the required buffer size can be reduced in proportion to the square root of the number of flows.
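For reference, these two buffer sizing guidelines can be written as follows, where C is the bottleneck capacity, RTT the round-trip time of the path, and n the number of long-lived, desynchronized flows (this is simply a restatement of the rule-of-thumb cited above, not a new result of this report):

\[ B = C \times RTT \qquad \text{(single flow, rule-of-thumb)} \]
\[ B \approx \frac{C \times RTT}{\sqrt{n}} \qquad \text{(n desynchronized flows)} \]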
This phenomenon facilitates building core routers for the Internet backbone with smaller buffers that provide almost the same utilization level while requiring less RAM, lowering product cost. In this paper, we study the impact of different buffer sizes at the bottleneck router (here, a network emulator node or a simulated wired node) jointly with various numbers of coexisting flows between a sender and a receiver, for both TCP CUBIC and SACK. We study how these scenario settings affect TCP's throughput and RTT, and compare the behavior of CUBIC and SACK in this respect using both emulation and simulation methodologies.

The rest of this paper is organized as follows: Section 2 presents the experimental setup and methodology used in the measurements and simulations. Section 3 demonstrates the measurement results and evaluates the performance of TCP CUBIC and SACK under various parameter settings, and finally Section 4 discusses the results and concludes the paper.

2. EXPERIMENTAL METHODOLOGY
This section describes the experimental setups and methodology used in this paper. The experiments consist of two different setups (emulation vs. simulation) that are compared in the results section.

[Figure 1: Network topology (simulation and emulation): node A (sender) -- 10 Mbps -- emulator node (delay=100 ms) -- 10 Mbps -- node B (receiver)]
[Figure 2: Emulab testbed]

2.1 Emulation setup
To passively measure the performance of different TCP variants, we set up a testbed using three computers acting as sender, receiver and an emulator/router node in between. Each of the TCP peers is physically connected to the emulator node by a 100 Mbps wired link (however, an emulation tool is used to limit the available bandwidth to 10 Mbps). Figure 1 is a schematic diagram of the network topology used in the tests. We used Emulab to set up this testbed and conduct our experiments. Emulab is an open network testbed at the University of Utah which provides a platform for researchers to conduct real-life network experiments (Figure 2). The specifications of this testbed are presented in Table 1. The PCs employed for the measurements are located in clusters of nodes stacked in a network lab environment.

Table 1: Test-bed setup
  Test-bed: Emulab
  PC: Pentium III 600 MHz
  Memory: 256 MB
  Operating System: Fedora Core 4
  Linux kernel: 2.6.18.6
  Default interface queue size: 1000 pkts
  Number of nodes: 3

Each experiment was repeated for 10 runs, each lasting 50 seconds, with a 20-second gap between consecutive runs. The presented results are averages over all runs. The iperf traffic generation tool was used to generate and measure TCP traffic during each test. TCP's send and receive buffers were set large enough (256 KB) to allow cwnd to grow without restriction. The Maximum Segment Size (MSS) was set to 1448 bytes (MTU=1500 bytes). Two performance metrics are measured: 1) TCP throughput and 2) round-trip time (RTT). To gather traffic statistics we used tcpdump to monitor the traffic traversing the path between sender and receiver. To calculate the throughput and the average RTT, we employed tcptrace, a tool which parses the dumped traffic files in pcap format and reports various per-connection statistics. Each experiment consists of 50 seconds of iperf TCP traffic from node A to node B. The number of coexisting flows between these nodes varies from 1 to 20.
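The following Python sketch illustrates how such a measurement loop could be driven. It is not the authors' actual tooling: the host name, capture interface, flow counts, and the mapping of the SACK variant to the Linux "reno" congestion control module (with SACK enabled by default) are assumptions made for illustration; the congestion control selection via /proc is described at the end of this subsection.

    import subprocess
    import time

    RECEIVER = "nodeB"   # assumed hostname of the iperf server (node B)
    IFACE = "eth0"       # assumed capture interface on the sender (node A)
    RUNS = 10            # repetitions per parameter setting
    DURATION = 50        # seconds of iperf traffic per run
    GAP = 20             # idle seconds between consecutive runs

    def set_congestion_control(variant):
        # Select the TCP congestion control mechanism (e.g. "cubic" or "reno")
        # via the /proc file system; requires root privileges.
        with open("/proc/sys/net/ipv4/tcp_congestion_control", "w") as f:
            f.write(variant)

    def run_once(variant, flows, run_id):
        # Capture one run with tcpdump while iperf generates the TCP traffic.
        # The resulting pcap file is later parsed with tcptrace to extract
        # per-connection throughput and average RTT.
        pcap = f"{variant}_{flows}flows_run{run_id}.pcap"
        capture = subprocess.Popen(
            ["tcpdump", "-i", IFACE, "-w", pcap, "tcp"],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        try:
            # iperf 2 options: -c <server>, -t <seconds>, -P <parallel flows>
            subprocess.run(["iperf", "-c", RECEIVER,
                            "-t", str(DURATION), "-P", str(flows)], check=True)
        finally:
            capture.terminate()
            capture.wait()

    if __name__ == "__main__":
        for variant in ("cubic", "reno"):        # SACK is on by default in Linux
            set_congestion_control(variant)
            for flows in (1, 2, 5, 10, 20):      # assumed sample flow counts
                for run in range(RUNS):
                    run_once(variant, flows, run)
                    time.sleep(GAP)              # 20-second gap between runs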
The network emulator (the intermediate node) incurs a fixed delay of 100 ms on each arriving TCP packet and limits the maximum available bandwidth on the link to 10 Mbps. In addition, the emulator buffers the arriving TCP packets in both its ingress and egress buffers, which are of equal size (ranging from 20 to 100 packets). To provide a fine-grained distribution of the packets serviced by the emulator node among the different TCP flows, the maximum burst size of a TCP flow at the emulator node is limited to 10 consecutive packets. The TCP congestion control mechanism of the Linux TCP suite is selected using the /proc file system or the sysctl command.

2.2 Simulation setup
The Network Simulator, ns-2, is a discrete event-based network simulator. It supports various TCP variants, including the ones tested in this report, and allows configuring a significant number of the parameters controlling the simulation environment and the network. The simulation tool used was ns-2.34. First, a topology identical to the one described in Figure 1 is set up. The links are configured as 10 Mbps duplex lines, each introducing a 25 ms one-way delay, which yields a total RTT of 100 ms. Then an FTP application transfers TCP data from the sender to the receiver, simulating from 1 to 20 simultaneous streams with different buffer sizes. All simulated streams start simultaneously and last for 100 s. The simulation parameters are summarized in Table 2.

Table 2: Simulation setup
  Simulator: ns-2.34
  Queue type: DropTail/PriQueue
  Queue size: 10∼200 pkts
  Simulation time: 100 s
  TCP receiver window: ∞ (30000 pkts)
  Number of nodes: 3
  Application: FTP
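As a point of reference for the RTT results that follow (a back-of-the-envelope estimate of ours, not a measured quantity), a full DropTail buffer of B packets of size S on a link of capacity C adds a queueing delay of roughly

\[ d_{queue} = \frac{B \cdot S \cdot 8}{C}, \qquad \text{e.g.}\quad \frac{100 \times 1500 \times 8\,\mathrm{bit}}{10\,\mathrm{Mbit/s}} = 120\,\mathrm{ms}, \]

so a 100-packet buffer at 10 Mbps can add on the order of 120 ms on top of the 100 ms base RTT.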
3. RESULTS
This section presents the performance measurement results of TCP CUBIC and SACK for different buffer sizes and various numbers of parallel flows, and compares them to the simulated results.

3.1 The Impact of the Number of Parallel TCP Flows
Figure 3 shows the impact of the number of coexisting flows on the overall TCP performance when the buffer size at the bottleneck node is 20 packets. We can observe that in emulation (Figure 3(a)), as the number of concurrent TCP flows grows from 1 to 20, the total aggregate throughput of the system increases from 2.5 and 2.9 Mbps to 4.5 and 4.9 Mbps for SACK and CUBIC respectively, with CUBIC achieving a slightly higher TCP throughput than SACK. This almost two-fold increase in throughput is due to the fact that, as the number of coexisting flows increases, the flows become more desynchronized over time, providing a better utilization level than the single-flow scenario. In this scenario, CUBIC's performance stands at a higher level than SACK's due to the aggressive nature of CUBIC's cwnd growth (a cubic function of the time elapsed since the last congestion event). (Explanations of the simulation behavior will be added in the next draft.)

[Figure 3: Aggregate throughput of parallel flows (buffer size = 20 pkts); (a) emulation, (b) simulation]

Figure 4 demonstrates the impact of various numbers of coexisting TCP flows on the average RTT of all flows. The flows' average RTT remains almost fixed at a minimal level across various numbers of parallel flows when the bottleneck buffer size is set to the small value of 20 packets, for both CUBIC and SACK. This is because small buffer sizes incur very little queueing delay, and packet drop events become more frequent instead. However, the difference between CUBIC's and SACK's RTT becomes more significant as the buffer size grows, with CUBIC's RTT being higher than SACK's under various numbers of parallel flows. This is due to the aggressive behavior of CUBIC after a loss event, most probably caused by a buffer overflow, which keeps a higher number of TCP packets in the bottleneck buffer at any instant of time and therefore increases the queueing delay. In contrast, SACK behaves conservatively by halving the cwnd (similar to NewReno), so fewer packets traverse the path and the bottleneck buffer drains after a loss event, resulting in smaller queueing delay values. (Further remarks about the average RTT will be added in the next draft.)

[Figure 4: RTT vs. number of parallel flows; (a) emulation, (b) simulation-SACK, (c) simulation-CUBIC]

3.2 The Impact of Buffer Sizing
Figure 5 compares the impact of buffer sizing on the aggregate TCP throughput of the system in simulation and in the real-life experiments. Real-life experiments and simulation results match for buffer sizes above 20 packets. However, the simulation results do not reproduce the sharp performance degradation of both TCP CUBIC and SACK for buffer sizes equal to or smaller than 20 packets. The simulated results instead show that any number of streams fully utilizes the 10 Mbps link regardless of the buffer size, with only a slight deterioration of utilization at very low buffer sizes, down to as little as 10∼20 packets. Figure 6 shows the corresponding average RTT for different buffer sizes.

[Figure 5: Throughput vs. buffer size; (a) emulation, (b) simulation-SACK, (c) simulation-CUBIC]
[Figure 6: RTT vs. buffer size; (a) emulation, (b) simulation-SACK, (c) simulation-CUBIC]

3.3 Simulating on a 100 Mbps link
Figure 7 shows the aggregate TCP throughput of the system when the link bandwidth is set to 100 Mbps, for various buffer sizes.

[Figure 7: Throughput vs. buffer size on a 100 Mbps link; (a) simulation-SACK, (b) simulation-CUBIC]
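To put these buffer sizes in perspective (our own arithmetic based on the rule-of-thumb from Section 1, assuming 1500-byte packets), the single-flow bandwidth×delay product of the two link speeds is

\[ \frac{10\,\mathrm{Mbit/s} \times 0.1\,\mathrm{s}}{1500 \times 8\,\mathrm{bit}} \approx 83\ \text{packets}, \qquad \frac{100\,\mathrm{Mbit/s} \times 0.1\,\mathrm{s}}{1500 \times 8\,\mathrm{bit}} \approx 833\ \text{packets}, \]

so the simulated buffer range of 10∼200 packets spans the single-flow rule-of-thumb value on the 10 Mbps link but remains well below it on the 100 Mbps link.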
4. CONCLUSIVE REMARKS
In this paper, we evaluated and compared the performance of two congestion control mechanisms, namely CUBIC and SACK, using passive measurements and emulation techniques as well as ns-2 simulations under various scenarios. (Further explanatory text will be added in the next draft.)