TCP Analysis

1. Objective
The objective of this practical is to understand the behavior of a real implementation of TCP and to compare it with its theoretical behavior.

2. Terminology
Congestion window: Number of packets that can be outstanding (sent but not yet acknowledged) at any time [1].
MSS (Maximum Segment Size): Largest amount of data, specified in bytes, that a computer or communications device can handle in a single, unfragmented piece [2].
RTT (Round-trip delay time): Time elapsed for a message to travel to a remote place and back again [3].
TCP window size: Amount of received data (in bytes) that can be buffered during a connection [4].

3. Background knowledge
This practical requires previous knowledge about TCP [5]. You should understand the TCP congestion control mechanisms [6] in order to anticipate what the results of the analysis should be and how they differ from the theoretical behavior. For the set-up of the dedicated routers, some knowledge of Quagga [7] is recommended. The practical also uses the protocol analyzer Wireshark [8], so it is recommended to become familiar with it before starting.
To make the practical easier to follow, the user only needs to execute generic commands. The exact commands that are executed in the virtual machines can be seen in the XML specification file. Therefore, if you would like to go through the practical more slowly, step by step, you only need to follow the commands specified in the XML file for each generic command.

4. Scenario description
The scenario is illustrated in Figure 1. It is made up of:
pc1: A client that requests a file from a server.
r1 and r2: Two routers between pc1 and s1.
s1: A server that holds the file that the client requests.

Figure 1: Scenario topology. pc1 (192.168.1.2) - network pc1r1 - r1 (192.168.1.1, 192.168.2.1) - network r1r2 - r2 (192.168.2.2, 192.168.3.1) - network r2s1 - s1 (192.168.3.2).

The following entries have been added to the /etc/hosts file of each virtual machine of the scenario. This allows the user to type symbolic names for the interfaces instead of having to write their IP addresses.

192.168.1.1 r11
192.168.1.2 pc1
192.168.2.1 r12
192.168.2.2 r22
192.168.3.1 r21
192.168.3.2 s1

5. Configuring the scenario
Build the virtual machines of the scenario by starting the tcp.xml file with the VNUML tool [9]:

cd /usr/share/vnuml/examples
vnumlparser.pl -t tcp.xml -v -u root

Note: Depending on your Linux distribution, the vnuml directory might be under /usr/share or /usr/local/share.

This instruction boots the virtual machines and prompts a shell for each of them. The scenario is built as user root since root privileges are needed to do so. The user for the virtual machines is root and the password is xxxx.
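Before continuing, you may want to check that pc1 can already reach its directly connected router (no routing protocol is needed for this). A minimal check, run from the pc1 shell and using r1's address on pc1's network as shown in Figure 1:

# from pc1: r1's directly connected interface should answer
ping -c 3 192.168.1.1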
6. Quagga
A system with Quagga installed acts as a dedicated router. We want to launch OSPF between the two routers of the scenario (r1 and r2) so that they can exchange routing information. For this purpose, the configuration files for the zebra daemon and the ospfd daemon have already been prepared. The daemons are launched with the following instruction on the host:

vnumlparser.pl -x start@tcp.xml -v -u root

The above instruction does the following:
Copies the zebra and ospfd configuration files to the routers (r1 and r2).
Copies the files index.txt and index2.txt to the server (s1).
Modifies the /etc/hosts file of each virtual machine to include the entries described in section 4.
Launches first the zebra daemon and then the ospfd daemon in the virtual machines that act as routers (r1 and r2). Thus, these virtual machines start behaving like dedicated routers that run OSPF.
Launches the apache2 server on the server (s1).

Because ospfd needs to acquire interface information from zebra in order to function, zebra must be running before ospfd is started. Also, if zebra is restarted, then ospfd must be restarted too.

Wait about 40 seconds for OSPF to converge. Then check that r1 has learnt the route to the server's network:

Before OSPF converges:

r1:~# route
Kernel IP routing table
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
192.168.2.0     *               255.255.255.252  U     0      0        0 eth2
10.0.0.8        *               255.255.255.252  U     0      0        0 eth0
192.168.1.0     *               255.255.255.0    U     0      0        0 eth1

After OSPF converges:

r1:~# route
Kernel IP routing table
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
10.0.0.12       r22             255.255.255.252  UG    20     0        0 eth2
192.168.2.0     *               255.255.255.252  U     0      0        0 eth2
10.0.0.8        *               255.255.255.252  U     0      0        0 eth0
192.168.3.0     r22             255.255.255.0    UG    20     0        0 eth2
192.168.1.0     *               255.255.255.0    U     0      0        0 eth1

Check also that there is connectivity between pc1 and s1:

ping s1

7. Capture the packets
Log into s1 and run tshark [10] to capture packets on interface eth1. Use the -w option to save the trace into a file named trace_1:

tshark -i eth1 -w trace_1

Does it matter at which interface along the path between pc1 and s1 you capture packets, or will the trace be the same on any interface the connection goes through?
The packets that flow between pc1 and s1 are the same in every network along the path, but their order is not. For example, if you capture packets on network r2s1, packets coming from s1 are captured before packets that leave pc1 at the same time. Also, you cannot be sure whether a packet reached s1 before s1 sent another one. For this reason, in order to correctly see the behavior of the server when a segment is lost, the capture must be made on the interface of s1.

From the client (pc1), request a file from the server (s1):

wget s1/index.txt

Stop the capture on s1 by pressing Ctrl+C. Make sure that the capture was done correctly and no packets were dropped.

Copy the capture from s1 to the host, so that you can analyze the trace with Wireshark in a graphical interface. To do this, use the scp command [11] in a shell of the host:

scp root@s1:/root/trace_1 /usr/share/vnuml/tcp/
Password:
trace_1 100% 162KB 162.0KB/s 00:00

8. Analyzing the capture
We are going to prepare the capture in order to analyze the TCP connection more easily [12]. Open the captured trace with Wireshark. Then filter the packets displayed in the Wireshark window by entering "tcp" into the display filter specification field towards the top of the Wireshark window. Now you can see several TCP and HTTP messages. Since we are interested in analyzing TCP behavior, change Wireshark's listing of captured packets so that it shows information about the TCP segments containing the HTTP messages, rather than about the HTTP messages themselves. To have Wireshark do this, select Analyze->Enabled Protocols, then uncheck the HTTP box and select OK. You should now see a Wireshark window that looks like Figure 2.

Figure 2: Wireshark window after filtering by "tcp" and disabling the HTTP dissector.
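If you prefer to take a first look at the trace from the command line, tshark can also read the saved file. A minimal sketch, assuming trace_1 is in the current directory (older tshark versions use -R instead of -Y for display filters):

# list only the TCP segments of the trace
tshark -r trace_1 -Y tcp
# count the segments that Wireshark's TCP analysis flags as retransmissions
tshark -r trace_1 -Y tcp.analysis.retransmission | wc -l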
9. Establishing a connection
In the capture you can see the initial three-way handshake, which starts with a SYN message.

What is the sequence number of the TCP SYN segment that is used to initiate the TCP connection between pc1 and s1?
The relative sequence number of the segment is 0, since it is where the connection begins. The absolute sequence number can be seen in the sequence number field of the segment; it has a different value in each connection. In the example trace, the absolute sequence number in hexadecimal is 31 1A 73 8C.

Identify where that segment is marked as a SYN segment.
The segment is marked as a SYN segment in the flags field of the TCP header (it has the SYN bit set).

If the initial SYN segment that pc1 sends does not include data, why does the server reply with a SYN+ACK segment with a relative ACK=1?
Even though the initial SYN segment does not include data, the SYN flag consumes one sequence number, so the server acknowledges the next byte it expects, which corresponds to a relative ACK of 1.

10. General view of the trace
Identify in which segment the client requests the file from the server.
It is the first segment after the initial three-way handshake. In the data of the segment you can see the GET method of the HTTP protocol.

How much data does the receiver typically acknowledge in an ACK?
Initially the receiver sends an ACK for every segment it receives; this is called quick acknowledgements [13] and it increases the sender's congestion window faster, since the window growth depends on the incoming acknowledgements. Later, the receiver starts sending one ACK for groups of 10, 11 or 12 segments.

Is a segment ever lost during the connection? Why?
Yes. The server begins sending data according to TCP's slow start phase. Then it enters the congestion avoidance phase, incrementing the congestion window by one segment every RTT. It therefore keeps trying to send data faster and faster, so a loss will eventually occur if the file is big enough.

When losses occur, how many duplicate TCP ACK segments does the client send to indicate that a segment has been lost?
The example trace shows the fast recovery mechanism [14]. The sender detects the loss of a segment by three duplicate ACKs: when the third duplicate ACK arrives, the sender retransmits just the lost segment (fast retransmission). Nevertheless, more duplicate ACKs can be seen in the trace; this is because the client keeps sending them until the retransmission reaches it.

Is a segment ever received out of order during the connection? Why could this happen?
Sometimes a segment is received out of order. This happens when, at some point on the way to the client, a segment suffers a larger delay than the following segment, so they arrive in the wrong order.

Does the receiver window ever become so small that it prevents the sender from sending a packet? Why could this happen?
Sometimes the receiver cannot retrieve data from the TCP buffer as fast as the sender is introducing it. This is a resource problem on the receiver: the application is not retrieving data from the TCP buffer in a timely manner. Therefore the receiver window may get so small that the sender has to wait. When the receiver consumes some data from its reception buffer, thereby freeing up space, it notifies the sender by sending a TCP window update [15] that advertises the current window size.

11. TCP congestion control in action
Select a segment that comes from the server. Then select the menu Statistics->TCP Stream Graph->Time-Sequence-Graph (Stevens). You should see a plot similar to the following one, created from the packets captured in the trace included with this practical:

Figure 3: Time-sequence graph (Stevens) of the example trace.

Each dot represents a TCP segment sent by the server, plotting the segment's sequence number versus the time at which it was sent.
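If you want to reproduce this plot with a tool other than Wireshark, the same sequence-number-versus-time data can be exported with tshark. A sketch, assuming the trace file trace_1, the server address 192.168.3.2 from Figure 1, and a hypothetical output file name seq_vs_time.dat (the tcp.len > 0 condition keeps only segments that carry data; again, older tshark versions use -R instead of -Y):

# one line per data segment sent by the server: relative capture time and relative sequence number
tshark -r trace_1 -Y "ip.src == 192.168.3.2 && tcp.len > 0" \
  -T fields -e frame.time_relative -e tcp.seq > seq_vs_time.dat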
Looking at this plot, can you identify where TCP's slow start phase begins and ends, and where congestion avoidance takes over?
The slow start phase begins right after the connection is established, at segment 7 of the example trace, and ends at segment 42. This is the moment when the receiver stops ACKing every segment and starts acknowledging groups, usually of 11 segments. This change can be seen clearly in the graph. After the loss of a packet, the slow start phase begins again.

How does the measured data differ from the theoretical TCP behavior?

Identify on the plot where the loss of a segment occurs and which point corresponds to the retransmitted segment. If you click on that point, the corresponding segment in the trace gets selected. In the example trace the retransmitted segment can be seen clearly; clicking on it shows that it is segment number 127.

In the capture made in section 7, the sender finishes sending the file a few segments after the loss occurs. To better analyze the behavior of TCP after a loss, we are going to request a longer file. Repeat the steps described in section 7, this time requesting the file index2.txt instead of index.txt. Name the trace trace_2.

Analyze the behavior of TCP after the loss of a segment.
The sender enters the slow start phase again. In this case, the receiver alternates between sending an ACK for each segment and sending one ACK for a small group of them. After some time, the sender enters the congestion avoidance phase, and the receiver then sends ACKs for groups, usually of 8 segments. In the example trace the congestion avoidance phase starts at segment 189.

12. Closing a connection
Observe how the release of the connection is made. As you can see, in this case it is pc1 that initiates the closing of the connection.

How does pc1 know that the server has finished sending all the packets of the file?
The client knows that the server has sent all the data because the length of the last packet it received from the server was not equal to the MSS of the connection.

13. References
[1] Congestion window. http://en.wikipedia.org/wiki/Congestion_window
[2] MSS. http://en.wikipedia.org/wiki/Maximum_segment_size
[3] RTT. http://en.wikipedia.org/wiki/Round-trip_delay_time
[4] TCP window size. http://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_window_size
[5] TCP RFC. http://www.faqs.org/rfcs/rfc793.html
[6] TCP Congestion Control RFC. http://www.ietf.org/rfc/rfc2581.txt
[7] Quagga. http://www.quagga.net/docs.php
[8] Wireshark home page. http://www.wireshark.org/
[9] VNUML home page. http://www.dit.upm.es/vnumlwiki
[10] tshark man page.
[11] scp man page.
[12] Kurose ethereal lab: TCP. http://gaia.cs.umass.edu/ethereal-labs/
[13] Quick acknowledgements. Pasi Sarolahti, Congestion Control in Linux TCP, page 9.
[14] Fast recovery. J. F. Kurose, K. W. Ross, Computer Networking, 3rd edition, 2005, page 269.
[15] Wireshark TCP analysis messages. http://wiki.wireshark.org/TCP_Analyze_Sequence_Numbers