Low-rate TCP-targeted Denial of Service Attack Defense Johnny Tsao Petros Efstathopoulos University of California, Los Angeles, Computer Science Department Los Angeles, CA E-mail: {johnny5t, pefstath}@cs.ucla.edu ABSTRACT Low-rate TCP targeted denial of service attacks are a subset of DoS attacks that exploit the retransmission timeout (RTO) mechanism of TCP. In doing so, such an attack can drastically reduce throughput while producing little traffic relative to traditional DoS attacks. Since it produces only periodic traffic, a low-rate attack is difficult to detect and prevent. Another property of the periodic traffic, however, is that a low-rate attack’s success depends on synchronization with the victim’s RTO. A proposed defense to this attack is to randomize the RTO. In doing so, information can still be transmitted while the attacker is waiting and a connection will be able to avoid timing out successively. In this paper, we evaluate the effectiveness of randomizing the retransmission timeout in defending against low-rate TCP targeted denial of service attacks. Through experiments we show that such a defense can prevent a TCP flow from being throttled by a low-rate attack and still achieve respectable throughput. In addition, we will analyze the effectiveness of a low-rate DoS attack on the Linux implementation of TCP. 1. INTRODUCTION As the prevalence and use of networks grow, so too will the number of people attempting to exploit them, whether for personal gain or simply malicious intent. Therefore, the issue of network security and protection will only gain importance in today’s society and the future. A Denial-of-Service (DoS) attack is one such exploit/attack present in networks today. A DoS generally floods a network or link with traffic with the goal of denying other users access to specific nodes, resources, or services. Such resources include memory, bandwidth, or CPU cycles. The problem with traditional DoS attacks is that their network behavior of flooding links does not correspond with regular network traffic. Therefore, one can often detect a DoS attacker by analyzing the network traffic. In addition to detection, mechanisms can be added to routers to prevent DoS attacks from affecting a network by blocking such traffic. A different type of DoS attack was introduced in [1]. A low-rate TCP targeted DoS attack produces the negative effects of a typical DoS attack, except it targets TCP flows and can often elude detection since it sends traffic at a relatively low average rate. It does so by exploiting TCP’s retransmission timeout mechanism. TCP’s retransmission timeout mechanism is intended to deal with cases of severe congestion and multiple losses. If a sender fails to receive an ACK after the timeout period, it reduces its congestion window to one and retransmits the packet. In correspondence with the recommendation made by [3], many systems implement a minimum RTO of 1 second. It is this implementation that puts these systems at risk since a low-rate DoS attack can exploit it. A low-rate DoS attack will flood the network with an initial burst of packets. This burst fills up the queues and results in packet loss for legitimate TCP connections. These connections, after waiting the timeout for an ACK, reduce their windows and retransmit. While the TCP connections wait for ACKs that aren’t coming, the low-rate attacker does not need to attack. It can simply wait the RTO (one second) and start flooding the network again, exactly when the TCP connections try to retransmit their packets. In doing so, a low-rate attack can throttle a TCP connection’s throughput to nearly zero. In this paper, we will study the effectiveness of randomizing the RTO in defending against a low-rate TCP targeted DoS attack. We will first give an overview of related work on this topic. In section 3, we shall take a closer look at TCP’s timeout mechanism. We will analyze the timeout mechanism to find out why it exists and how it works. This will give us insight as to how it is exploited. Section 4 will go over the low-rate DoS attack. We will study the properties of such an attack and see how it takes advantage of the timeout mechanism. We will present the proposed solution of randomizing the RTO in section 5. We will then look at the changes we had to make to the Linux kernel in section 6. After, section 7 will present and analyze our experimental results showing the effectiveness of the randomized RTO defense. Section 8 will present our conclusion and a summary of our results. Lastly, section 9 will discuss future work. 2. RELATED WORKS Our work is mostly based on two papers. The first one is Low-Rate TCP-Targeted Denial of Service Attacks by Kuzmanovic and Knightly. This paper introduced the low rate attack and proposed randomizing the RTO as a solution. The second paper is Randomization: Defense against Low-Rate TCP-Targeted Denial-of-Service Attacks by Yang, Gerla, and Sanadidi. In this paper, they researched both randomization of RTO and probing as defenses against a low rate attack. Both papers do simulations to back up their work. For our paper, we have implemented the randomized RTO defense in Linux. We then run experiments to confirm the results seen in these other works. In addition, we see how the regular Linux kernel behaves under a low rate attack. 3. TCP TIMEOUT MECHANISM 3.1 TCP Congestion control: Timeout Mechanism Congestion control is essential for TCP. TCP congestion control operates in two timescales depending on the degree of the link’s congestion. On smaller timescales of round trip times (RTT) TCP performs additive-increase multiplicative-decrease (AIMD) control with the objective of having each floe transmit at the fair rate of its bottleneck link. At times of severe congestion in which multiple losses occur, TCP operates on longer timescales of Retransmission Time Out (RTO). In this study we are interested in the timeout mechanism of TCP which what the Low-Rate DoS attack tries to exploit. The TCP congestion control mechanism uses the notion of “congestion window” (cwnd). Each TCP sender uses the congestion window to regulate the rate of transmission based on the feedback it gets from the network. The congestion window is the TCP sender’s estimate of how much data can be outstanding in the network without packets being lost. Each receiver can advertise a window size. The sender takes into account this advertised window size and chooses as window size the minimum of the two: advertised window and congestion window. This mechanism helps avoid congestion since both the receiver’s capabilities and the networks characteristics are taken into account. After a period of congestion TCP enters the “slow-start” phase: it shrinks its congestion window to 1 segment and increases it by one every time an acknowledgement is received. If on the other had less than three duplicate ACKs are received before RTO expires, TCP shrinks its congestion window to 1 segment (slo start), doubles the value of the RTO (exponential backoff) and retransmits the lost packet. Under heavy congestion, TCP reduces its congestion window size to 1 segment and sets the RTO to its minimum value. The recommended minimum value for the RTO is 1sec, as proposed by the study presented in [3]. This mechanism was chosen for dealing with cases of heavy congestion since it is the most conservative sender behavior. If the RTO expires and the packet is lost again the exponential backoff continues and the sender doubles the value of RTO (RTO=2) and retransmits the packet. The sender then waits for the 2sec RTO to expire and if the packet still hasn’t been transmitted successfully the exponential backoff continues (RTO is set to 4sec and so on). 3.2 Vulnerability Although this mechanism is well suited for dealing with high congestion, it has a fundamental flaw. The values of RTO are predefined constants. The minimum value is 1sec and the possible RTO values during exponential backoff are multiples of 1sec. This property of the algorithm makes the system vulnerable to attacks that use a short high rate burst of packets to fill the bottleneck buffers, right before the RTO expires. An attacker that knows the timing of the sender can use a “square wave” attack (high rate, short duration bursts) and force the sender to repeatedly enter the retransmission timeout state, while the throughput achieved by the sender will be nearzero. rate burst length inter-burst period Figure 1: a low-rate attack can be approximated as a square wave 4. THE LOW-RATE ATTACK 5. RANDOMIZED RTO DEFENSE A low-rate TCP-Targeted denial of service attack has three properties: a send rate, a burst length, and an inter-burst period. The send rate is obviously the rate at which the attacker sends packets into the network. This rate must be higher than the bandwidth of the bottleneck link in order to produce packet loss and cause TCP flows to timeout. One possible defense against low-rate TCP-targeted DoS attacks is proposed by [1]. If the minimum value of the RTO is not set to 1sec, but instead it is randomized around 1sec, we could expect the effect of the attack on the throughput to be less severe. Simulation results by [2] show that randomization of the RTO is indeed a possible solution. When the value of RTOmin is randomized around 1sec, it is more difficult for the attacker to synchronize its square wave attack with the RTO expiration intervals. Thus the attacker can no longer predict the time the new packets will arrive and flood the buffers with a properly timed burst. The burst length is how long the attacker floods packets for each burst. This depends on the send rate, RTT of the flows, and bandwidth of the bottleneck. If the burst length is too long, then the attacker risks being detected. It was suggested in [1] that a burst length less than 300ms would allow the low-rate attacker to go relatively undetected. Our own experiments show that a burst length of 200ms is sufficient to produce zero throughput with the proper interburst period, so we do not have to worry about going over this threshold. The inter-burst period dictates the frequency at which the attacker floods packets. An attacker would want to pick an inter-burst period that corresponds exactly to the RTO. In this way, the attacker can throttle the throughput while transmitting the least amount of data possible to reduce the chances of detection. The low-rate DoS attack can be viewed as a square wave like in Figure 1. It bursts packets at a set rate for a set time and then waits out the inter-burst period before bursting again. It does so in the hopes that subsequent bursts will occur just as victim flows begin retransmission. Although TCP uses predefined backoff values for the RTO (1,2,4,8,…,64), randomized backoff intervals are used in other protocols, especially link layer protocols that use similar exponential backoff schemes (e.g. Ethernet). Such protocols use a random timeout value within a certain range. There are three different ways one could choose to randomize RTOmin. Each one uses a different range of possible values. The most conservative approach would be to pick a number in the range [t, t+1). This will overestimate the RTO. Alternatively, we could choose a value within the ranges [t-0.5, t+0.5) and [0.5t, 1.5t). Simulations and analysis done by [2] show that the range of randomization does not affect the results significantly. For the rest of this paper we assume that any randomization of the timeout is done using a value within the [t0.5, t+0.5) range. 6. THE LINUX KERNEL 6.1 Linux TCP congestion control The Linux kernel currently implements TCP New Reno. Although many optimizations have been incorporated in the Linux TCP stack, the congestion control mechanism is essentially as defined by TCP New Reno. One issue of great interest to our study is the fact that Linux allows a minimum value of 200ms for the RTO. This deviates from the standard approach of 1sec RTOmin and makes it harder for potential Low-Rate attacks to succeed. This is because in order to attack Linux effectively using the Low-Rate attack model, one would have to generate an attack square wave with inter-burst period of 200ms. In order to fill the bottleneck buffers using such a short IBP, the attacker would have to use a very high rate. Such an attack is difficult to achieve and the average attack rate would be high enough for DoS defense mechanisms to detect. The second version of the Linux kernel is based on the “1sec RTO” kernel with the addition of randomization. We modified the previous version so that the value of RTO is randomized within the range [t-0.5, t+0.5). The minimum value of the RTO is first adjusted so that it is not below 1sec. Afterwards a random number is generated using the Linux kernel’s random number generator. This number is then combined with the value of RTO to produce the effective timeout that is randomly selected from the desired range. We must note that the values of RTO, RTT, rttvar etc are measured in “jiffies”. One jiffy is the resolution of the software timer and for the x86 architecture it is 10ms. The minimum value of RTO was bounded to 100 jiffies (1sec) and it was randomized in the range of [50,150) jiffies. The randomized Linux kernel was used to measure the effectiveness of the attack and compare it with the results from the measurements taken with the “1sec RTO” kernel. 6.2 Linux kernel changes Although Linux is less vulnerable because of its 200ms RTOmin, the vulnerability in the design of the TCP timeout mechanisms is existent and other systems may be susceptible to this attack. We wanted to test the effectiveness of the attack on real systems and try to reproduce the results of the simulations. In order to do this we had to do a series of modifications to the Linux kernel. We used two versions of the Linux kernel. The first version was modified so that the minimum value of RTO was bounded to 1sec. Bounding RTOmin to 1sec was necessary to test the effectiveness of the attack. RTOmin is defined in the Linux kernel as 200ms. Changing this definition triggers a series of unwanted changes, the most important of which being the improper initialization of rttvar (round trip time variance). In order to bound the minimum value of RTO we had to make sure that it adjusted every time RTO is calculated from the values of srtt (smoothened RTT) and rttvar. Also the function that is used by Linux to check the upper bound of the RTO (tcp_bound_rto()) had to be modified so as to included lower bound checking as well. 7. EXPERIMENTAL RESULTS 7.1 Experimental Setup We conducted our experiments on a TCP test-bed. We had a sender, receiver, and attacker. We connected the sender and attacker to the receiver through an intermediate node running DummyNet. We used DummyNet to simulate the internet. The setup looks like Figure 2. Figure 2: Test-bed setup The sender and receiver are implemented by running Iperf as a client and server in order to generate TCP traffic. The pipes connecting the sender, attacker, and receiver have a bandwidth of 1.5Mbits/s and a queue of 50 slots. In addition, the RTT was 40ms. We programmed a custom attacker to produce a low-rate attack. It generates 3Mbits/s of traffic in periodic bursts of specified lengths and inter-burst periods. produce an attacker that attacks with an interburst period of 200ms, but it would have to have a much shorter burst length and much higher burst rate. With such a short inter-burst period, such an attacker may not be stealthy. The next test we ran was on our modified kernel with a 1 second RTO. Figure 4 shows the results of this test. 1 The first test we ran tested how the regular Linux kernel reacted to a low-rate attack. We expect the attack to be less effective since the attacker is specifically aimed at systems with an RTO of 1 second while Linux has an RTO of 200ms. It is interesting to see how it will react under an attack. The throughput graph is presented in Figure 3. 1 0.9 Throughput 0.8 0.7 0.6 0.5 0.4 0.3 0.2 Attack No Attack 0.1 0 1 0.8 0.7 0.6 0.5 0.4 0.3 0.2 Attack No Attack 0.1 0 0 1 2 3 4 5 6 Inter-burst Period (seconds) Figure 4: 1s RTO Kernel Throughput 7.2 Results 0 0.9 Throughput We ran a set of tests for each kernel variation: regular, 1 second RTO, and 1 second randomized RTO. We ran each test for 20 seconds. We allowed the sender to transmit for 2 seconds before starting the attacker. The attacker transmitted at 3 Mbits/s for 200ms while we varied the inter-burst periods. The important inter-burst period values are at 0.5 and 1 second. This is because they coincide with the RTO of 1 second that was proposed in [3]. 2 3 4 5 6 With this kernel, we see that there is a major drop in throughput for inter-burst periods of 0.5 and 1 second. These results agree with the simulation results reported by [1] and [2]. Since the retransmission timeout is set to 1 second, each retransmission encounters packet loss because the attacker is transmitting at the same time. If the inter-burst period is not synchronized with the RTO, however, we see that throughput can be achieved. This is evident in the throughput values for inter-burst periods that are not even divisors of 1 second. One thing we must clarify is the fact that we did see a very small throughput at the 0.5 and 1s inter-burst periods. This is simply due to the fact that we allowed the sender to transmit for 2 seconds before engaging the attacker. Once the attack began, throughput was throttled to zero for both of these cases. Inter-burst Period (seconds) Figure 3: Linux Kernel Throughput We see that the throughput is similar to what we would expect to see for a regular DoS attack. As the burst frequency gets higher (lower inter-burst period), the throughput drops. The reason that the throughput never drops to zero is because our attacker is geared toward a system with a minimum RTO of 1 second. We could After, we tested our modified Linux kernel which randomizes the RTO between t-0.5 and t+0.5 seconds. This produces RTOs that average to t seconds, yet do not allow a low-rate attack to know when it will try retransmitting. The results are shown in figure 5. burst rate at 3Mbits/s for each test run. results can be seen in figures 7 through 9. 1 0.9 The 0.7 0.6 0.6 0.5 Linux 1s Kernel Randomized 0.5 0.4 0.3 0.2 Attack 0.1 No Attack 0 0 1 2 3 4 5 6 Throughput Throughput 0.8 Inter-burst Period (seconds) 0.4 0.3 0.2 0.1 Figure 5: Randomized RTO Kernel Throughput 0 0 50 100 150 200 250 Burst Length (ms) Figure 7: 0.5s Inter-burst Period Throughput 1 Linux 1s Kernel Randomized 0.9 0.8 0.7 Throughput As you can see, the throughput for the randomized RTO kernel is dramatically better than the 1s RTO kernel for attacker inter-burst periods of 0.5 and 1 second. In fact, at the 1 second inter-burst period, we see that the randomized RTO kernel produced a throughput of nearly 40%. This is very good considering that the attacker is flooding the network 20% of the time. In addition, it is a significant improvement from the 1s RTO kernel which had zero throughput at this point. 0.6 0.5 0.4 0.3 0.2 0.1 Figure 6 shows how the different RTO implementations stack up against each other. 0 0 50 100 150 200 250 Burst Length (ms) Figure 8: 1s Inter-burst Period Throughput 1 0.9 0.8 0.7 Linux 1s Kernel Randomized 0.7 0.6 0.5 0.6 0.4 Throughput Throughput 0.8 Linux 1s RTO Random RTO 0.3 0.2 0.1 0 0 1 2 3 4 5 6 Inter-burst Period (seconds) Figure 6: Throughput Comparison 0.5 0.4 0.3 0.2 0.1 0 0 50 100 150 200 250 Burst Length (ms) Again we see that by using a randomized RTO, we do not suffer throughput throttling. In order to throttle the throughput of our modified kernel, an attacker would simply have to resort to a standard DoS attack that constantly transmits. This would not be stealthy since it could be detected relatively easily. Therefore, we see that randomization is a very effective defense against a low-rate DoS attack. We conducted one last test. We looked at inter-burst periods of 0.5 and 1s and varied the burst lengths from 50 to 200ms. We kept the Figure 9: Averaged Throughput of 0.5s and 1s From these tests we see that using a burst length of 200ms is just about right to throttle throughput in the 1s RTO kernel. In addition, we see that the randomized RTO performs better than a 1s RTO in these zero throughput cases. 8. CONCLUSION The design of TCP congestion control mechanism incorporates a statically defined timeout mechanism. Although other protocols use randomization techniques when choosing the value of the retransmission timeout, TCP uses multiples of 1sec. during its exponential backoff stage. This design makes TCP vulnerable to a Low-Rate attack. Current DoS attack defense mechanisms can not detect this kind of attack since the average attack rate is very low. Using our implementation we were able to test the effectiveness of this attack for three different platforms: the standard Linux kernel, the modified “1sec RTOmin” kernel and the “randomized RTO” kernel. Our experimental results confirm the simulation results. Systems using a minimum value of 1sec for the RTO can be heavily affected by a Low-Rate attack. When the attacker is synchronized with the sender (Inter-burst periods of 0.5s and 1s), the throughput is throttled. Experiments using the “randomized RTO” Linux kernel show that the proposed defense against this type of attack can greatly improve performance. Randomizing the value of RTO eliminates the throughput throttling problem for IBPs of 0.5s and 1s. In the 1s IBP case in particular we achieve 40% of the link’s maximum throughout, while under attack. 9. FUTURE WORK Our first experiments showed that randomizing the value of RTO provides defense against the Low-Rate attack. For our experiments we had to modify the Linux kernel so that the lower bound for the value of RTO is 1s, instead of 200ms. We would like to test our approach with other operating systems and verify whether any of the popular TCP stack implementations uses RTOmin of 1s and thus is vulnerable to the attack. Experiments with BSD, Solaris and Microsoft Windows would show to what extend this attack is a threat for real systems. One issue that we need to look into, is to whom exactly is this attack targeted to. We need to consider what are the real-life scenarios where the Low-Rate attack can actually be launched, what is the target going to be and to what extent will it affect network performance. Our next set of experiments will focus on the fairness of the randomized Linux kernel. Changing the value of RTO may affect the fairness of the system when multiple connections are present. We should also test the effectiveness of the “randomized kernel” defense against the attack with multiple long-term TCP flows as well as dynamic highly variable flows such as HTTP traffic. Finally, we would like to implement an alternative solution to this attack, using probing. A comparison between the effectiveness of the two defense mechanisms would be of great interest. 10. REFERENCES [1] A. Kuzmanovic and E. Knightly. Low-Rate TCP-Targeted Denial of Service Attacks. In Proceedings of ACM SIGCOMM ’03, Karlsruhe, Germany, August 2003. [2] G. Yang, M. Gerla, and M.Y. Sanadidi. Randomization: Defense Against Low-Rate TCP-Targeted Denial of Service Attacks. Internal draft, 2003. [3] M. Allman and V. Paxson. On Estimating End-to-End Network Path Properties. In Proceedings of ACM SIGCOMM ’99, Vancouver, British Columbia, September 1999. [4] P. Sarolathi and A. Kuznetsov. Congestion Control in Linux TCP. [5] D. Bovet and M. Cesati. Understanding the Linux kernel, O’Reilly Press, 2003.