By: Stefan Savage, David Wetherall, Anna Karlin & Tom Anderson
Affiliation: Department of Computer Science and Engineering at the University of Washington
Published: ACM SIGCOMM Conference, 2000
Presented by: Andrew Mantel
Presentation date: March 19, 2009
Class: CAP6135 – Malware and Software Vulnerability Analysis (Spring 2009)
Professor: Dr. Cliff Zou

Outline:
◦ Goal / Motivation
◦ Background
◦ Previous work
◦ Terminology
◦ Marking algorithms
◦ Experiment
◦ Authors’ limitations / extensions
◦ My review

Goal:
◦ Describe “a technique for tracing anonymous packet flooding attacks in the Internet back towards their source” (Savage et al, 295)
Motivation:
◦ Increase in denial of service (DoS) flood attacks
    ◦ From 1989 to 1995, CERT reported a 50% increase per year
◦ Eliminate the problem by not allowing the attacking computer to hide

Savage, Stefan; Wetherall, David; Karlin, Anna and Anderson, Tom. "Practical Network Support for IP Traceback". In Proceedings of the 2000 ACM Conference. Pg 295-306, 2000.

This paper isn’t:
◦ A perfect solution
    ◦ It cannot find the real attacker
    ◦ That is a very difficult problem (examples: compromised host attack, reflector attack)
This paper is:
◦ A practical solution
    ◦ Traceback problem: find the direct attacker
    ◦ Still a difficult problem due to IP spoofing

Background

Sample IP Packet (IPv4) (Modified from source: http://en.wikipedia.org/wiki/IPv4#Header)
◦ The source address is specified by the sender
◦ The sender can spoof it as long as the attack doesn’t rely on return traffic
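To make the background point concrete, here is a minimal sketch of how the IPv4 source address is just four bytes written by the sender. The helper names `build_ipv4_header` and `parse_ipv4_header` are hypothetical, the addresses come from documentation ranges, and the checksum is left at zero for brevity. Nothing is sent on the network; the header is only built and parsed in memory.

```python
import socket
import struct

# Fixed 20-byte IPv4 header layout (no options), network byte order.
IPV4_FORMAT = "!BBHHHBBH4s4s"

def build_ipv4_header(src, dst, ident=0, payload_len=0):
    """Build a minimal IPv4 header in memory (hypothetical helper).

    The checksum is left as 0 for brevity, so this is not a valid
    on-the-wire header; it only illustrates the field layout.
    """
    version_ihl = (4 << 4) | 5            # version 4, header length 5 words
    total_length = 20 + payload_len
    return struct.pack(
        IPV4_FORMAT,
        version_ihl, 0, total_length,
        ident, 0,                          # identification, flags + fragment offset
        64, 17, 0,                         # TTL, protocol (17 = UDP), checksum
        socket.inet_aton(src), socket.inet_aton(dst),
    )

def parse_ipv4_header(raw):
    """Return the fields a receiver sees; none of them prove who sent the packet."""
    fields = struct.unpack(IPV4_FORMAT, raw[:20])
    return {
        "version": fields[0] >> 4,
        "identification": fields[3],
        "source": socket.inet_ntoa(fields[8]),
        "destination": socket.inet_ntoa(fields[9]),
    }

header = build_ipv4_header("192.0.2.1", "198.51.100.7", ident=0x1234)
print(parse_ipv4_header(header))
```

The receiver only recovers whatever source bytes the sender chose, which is why traceback needs information added by routers along the path.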
Previous Work

Ingress filtering
Goal:
◦ Prevent attackers from spoofing the source IP address
Approach:
◦ Routers block packets with illegitimate source addresses
Cost:
◦ Computational cost to verify the source address
◦ The farther a router is from the source, the harder it is to verify the address
Problems:
◦ Requires universal deployment
◦ Attacker could still spoof a valid address

Link testing: input debugging
Input debugging: router feature to “filter particular packets on some egress port and determine which ingress port they arrived on” (Savage et al, 296)
Approach:
1) Victim identifies the signature of an attacker packet
2) Victim calls a network operator, who uses input debugging to trace the packet one port upstream
3) Process continues, possibly across ISP borders, until the source site is found
Problems:
◦ Assumes the attack is in progress
◦ Requires too much cooperation / coordination

Link testing: controlled flooding
Approach:
1) Using a map of Internet topology, the victim asks some hosts to iteratively flood each incoming link on the router closest to the victim
2) The network buffer fills up, and attacker packets start dropping
3) Monitor the change in the rate of incoming attacker packets to determine which link they came from
4) Repeat recursively until the source is found
Problems:
◦ Controlled flooding is itself a DoS attack
◦ Requires a topological map of the Internet and willing flooding hosts
◦ Not effective against a DDoS attack
◦ Can’t be used post-mortem
Logging
Approach:
◦ “Log packets at key routers and then use data mining techniques to determine the path that the packets traversed” (Savage et al, 297)
Strength:
◦ Can be used post-mortem
Problems:
◦ Large resource requirements
◦ Requires large-scale inter-provider database integration

ICMP traceback
Approach:
◦ Every router randomly selects (with low probability) a packet it is forwarding
◦ Copies this packet to a special ICMP traceback message
    ◦ Includes info about the adjacent routers the packet travelled through
◦ If an attack occurs, use these ICMP traceback messages to reconstruct the path back to the attacker
Problems:
◦ ICMP traffic gets limited/dropped
◦ Not all routers can debug the message
◦ Doesn’t work well without universal deployment
◦ Attacker could send fake ICMP traceback messages

(Figure: Savage et al, 297)

Terminology

Symbols:
◦ V: Victim
◦ Ai: Attack origin i
◦ Ri: Router i
Attack path: unique ordered list of routers from Ai to V
◦ Example: {R6, R3, R2, R1}
(Savage et al, 298)
Exact traceback is very difficult
◦ MAC address spoofing of the source
◦ Attacker can insert false routers
Focus on approximate traceback
◦ Find an attack path with a valid suffix
◦ Valid suffix: the reconstructed path ends with the true attack path
◦ Example: {R5, R6, R3, R2, R1}
◦ Robust: the attacker can’t stop the victim from finding the valid suffix
FRi = Fake Router i inserted by an attacker
(Modified from: Figure 1, Savage et al, 298)

Assumptions:
◦ An attacker may generate any packet
◦ Multiple attackers may conspire
◦ Attackers may be aware they are being traced
◦ Packets may be lost or reordered
◦ Attackers may send numerous packets
◦ The route between attacker and victim is fairly stable
◦ Routers are both CPU and memory limited
◦ Routers are not widely compromised
Slide annotations grouping these assumptions: intelligent attacker; nature of modern networks; solution can’t be costly; would destroy traceback

Marking Algorithms

Node append
Simplest marking algorithm
Algorithm:
1) Each router appends its address to the end of the packet
2) Victim can reconstruct the path from the ordered list
Strengths:
◦ Reconstructs the path from a single packet
Weaknesses:
◦ High overhead to append data in flight
◦ May not have enough space for all addresses
    ◦ Attacker could just fill this space beforehand
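The append-style algorithm above can be sketched in a few lines. This is an illustrative model only: packets are plain dictionaries, and `MAX_PATH_ENTRIES`, `router_forward`, `send_along` and `victim_reconstruct` are hypothetical names, with the space limit chosen arbitrarily. It shows both the single-packet reconstruction and the weakness where an attacker fills the space beforehand.

```python
MAX_PATH_ENTRIES = 8  # hypothetical limit on how many addresses fit in a packet

def router_forward(packet, router_addr):
    """Append this router's address if the packet still has room."""
    if len(packet["path"]) < MAX_PATH_ENTRIES:
        packet["path"].append(router_addr)
    return packet

def send_along(packet, routers):
    """Forward the packet through each router on the attack path in order."""
    for router_addr in routers:
        router_forward(packet, router_addr)
    return packet

def victim_reconstruct(packet):
    """The ordered list in a single packet is the claimed path."""
    return list(packet["path"])

attack_path = ["R6", "R3", "R2", "R1"]

# Honest case: one packet carries the full ordered path.
clean = send_along({"path": []}, attack_path)
print(victim_reconstruct(clean))

# Weakness: the attacker pre-fills the space, so real routers cannot append.
forged = {"path": ["FR%d" % i for i in range(MAX_PATH_ENTRIES)]}
send_along(forged, attack_path)
print(victim_reconstruct(forged))
```

In the forged case the victim sees only attacker-chosen entries, which is the space-exhaustion problem listed under Weaknesses.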
Node sampling
Algorithm:
◦ Add a “node” field to the packet header
◦ Each router has the same probability p of overwriting this node field with its address
◦ Path reconstruction:
    ◦ Same p used by every router
    ◦ The victim receives more packets marked by routers closer to it, so the number of samples per router indicates its distance
(Figure modified from source: http://www.loriotpro.com/)

Strengths:
◦ Low cost to implement
Weaknesses:
◦ Takes too long to reconstruct the full path
    ◦ Must wait for a marked packet from routers far away from the victim
◦ Doesn’t work against multiple attackers
    ◦ Multiple routers may appear to have the same distance

Edge sampling
(Figure: Illustration of Edge Sampling Algorithm)

Attack path reconstruction:
◦ Bounded by the furthest router, d hops away
◦ Also a small chance of receiving a sample from the furthest router, but not from one closer
◦ Expected number of packets X required by the victim to reconstruct an attack path of length d: $E(X) < \frac{\ln(d)}{p(1-p)^{d-1}}$
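The edge sampling idea can be sketched as follows, based on my reading of the cited paper: each packet carries `start`, `end` and `distance` fields, a router either starts a new edge with probability p or completes and ages the existing one, and the victim orders edges by distance. The dictionary-based packet model and the function names are assumptions made for illustration, and the reconstruction handles a single attack path only.

```python
import random

def mark(packet, router, p, rng):
    """Per-router marking step for one packet."""
    if rng.random() < p:
        # Start a new edge at this router.
        packet["start"], packet["distance"] = router, 0
    else:
        # The router right after the marker records itself as the edge end.
        if packet["distance"] == 0:
            packet["end"] = router
        packet["distance"] += 1

def send_packet(routers, p, rng):
    """Pass one packet through the routers, attacker side first."""
    packet = {"start": None, "end": None, "distance": 0}
    for router in routers:
        mark(packet, router, p, rng)
    return packet["start"], packet["end"], packet["distance"]

def collect_marks(routers, p, n_packets, seed=0):
    rng = random.Random(seed)
    return [send_packet(routers, p, rng) for _ in range(n_packets)]

def reconstruct_path(marks):
    """Rebuild a single attack path from (start, end, distance) samples."""
    edges = {}
    for start, end, distance in marks:
        if start is not None:          # skip packets no router marked
            edges[distance] = (start, end)
    path = []                          # built from the victim outwards
    distance = 0
    while distance in edges:
        start, end = edges[distance]
        # Beyond distance 0, an edge must attach to the previous router.
        if distance > 0 and end != path[-1]:
            break
        path.append(start)
        distance += 1
    return list(reversed(path))        # attacker side first

marks = collect_marks(["R6", "R3", "R2", "R1"], p=1 / 25, n_packets=2000)
print(reconstruct_path(marks))
```

Because every router downstream of the marker increments the distance, an attacker cannot forge an edge that appears closer to the victim than the attacker actually is.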
Edge sampling
Strengths:
◦ Distance field prevents an attacker from spoofing a router within the valid suffix
◦ Can identify multiple attackers
Weaknesses:
◦ With multiple attackers, can’t trust paths longer than the path to the closest attacker (requires additional precautions)
◦ Needs additional space in the IP packet header (we’ll address this next)
    ◦ 32 bits for start + 32 bits for end + 8 bits for distance = 72 bits
    ◦ So not backwards compatible

Compressed edge fragment sampling
Modification 1:
◦ Edge-id: XOR the two router addresses
◦ The address of the router nearest the victim arrives intact
◦ Reconstruct the path starting near the victim using: $b \oplus (a \oplus b) = a$
(Savage et al, 301)
Reduces space requirements to 32 bits (XOR addresses) + 8 bits (distance) = 40 bits

Modification 2:
◦ Split the XOR’d address into k chunks
◦ Include the offset of the chunk
Problem:
◦ Edge-id is no longer unique
(Modified from: Figure 6, Savage et al, 301)
For this example, reduces space requirements to 8 bits (chunk of the XOR address) + 2 bits (offset) + 8 bits (distance) = 18 bits

Modification 3:
◦ Bit-interleave the router address with the hash of the address
◦ Increases the length of the edge-id
(Savage et al, 301)
For this example: 8 + 3 + 8 = 19 bits

Address reconstruction:
(Figure: Savage et al, 301)

How many packets do we need?
◦ Need kd edge fragments
◦ Each comes with probability $p(1-p)^{d-1}$
◦ Example: k=8, d=10, p=1/25, then E(X) ≈ 1300
Ensuring the path can be reconstructed with certainty c takes more packets:
◦ Example: c=95% requires about 2150 packets

Summary:
1) Bit-interleave the address with the hash of the address
2) Split into k fragments
3) XOR fragments
4) Reconstruct the address using $b \oplus (a \oplus b) = a$, starting at router b nearest the victim, validating it using the hash
5) Reconstruct path(s) by graphing
(Savage et al, 298)

Strengths:
◦ Requires fewer bits than Edge Sampling
◦ Number of packets required to reassemble the path is within the range of a DoS attack
◦ Good against multiple attackers (get divergent paths)
◦ Can be used post-mortem
Weaknesses:
◦ Can take a long time to reconstruct paths when there are multiple attackers
◦ We still need a place in the IP header to store this data (we’ll address this next)

IP header encoding
Use the IP identification field
◦ 3 bits for offset (k=8) + 5 bits for distance + 8 bits for edge fragment = 16 bits
Strengths:
◦ Header checksum doesn’t change
◦ Low overhead (usually just increment distance)
(Savage et al, 302)

But what about the IP identification field?
◦ Original purpose: IP fragments
◦ Fragmentation is rare (0.25%), but we still want some backwards compatibility
If fragmentation occurs before a marking router:
◦ Based on probability q, prepend a new ICMP “echo reply” header with full edge data
If fragmentation occurs after a marking router:
◦ Don’t allow fragmentation (degrades performance)

Experiment

Implemented their algorithm on a simulator
Simulator generates random paths and originates attacks
Settings:
◦ Marking probability p = 1/25
◦ Path length [1,31]
◦ 1,000 random tests per path length

Results:
(Figure: Savage et al, 303)
◦ Most paths resolved using 1000-2000 packets
◦ Usually need fewer than 4000 packets
◦ Flood DoS attacks send hundreds or thousands of packets per second

Authors’ Limitations / Extensions

Overwriting the IP identification field limits backwards compatibility
◦ Solution: Enable traceback only by request
There is no IP identification field in IPv6
◦ Solution: Use the flow label field instead
Sample IPv6 Packet (Modified from source: http://en.wikipedia.org/wiki/IPv6)
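As a rough companion to the 16-bit encoding and the experiment settings above, the sketch below packs an (offset, distance, fragment) triple into one 16-bit value and runs a small Monte Carlo count of how many packets a victim needs before it has seen all k fragments at every distance. The bit order (offset in the top 3 bits, then distance, then fragment) follows the order in which the fields are listed and is an assumption, as are all function names. Fragment values are random placeholders, so the sketch counts packets but does not rebuild real addresses. The bound function uses the $k\ln(kd) / (p(1-p)^{d-1})$ form, which reproduces the roughly 1300-packet example for k=8, d=10, p=1/25.

```python
import math
import random

def pack_mark(offset, distance, fragment):
    """Pack 3-bit offset, 5-bit distance, 8-bit fragment into 16 bits."""
    if not (0 <= offset < 8 and 0 <= distance < 32 and 0 <= fragment < 256):
        raise ValueError("field out of range")
    return (offset << 13) | (distance << 8) | fragment

def unpack_mark(mark):
    """Inverse of pack_mark: returns (offset, distance, fragment)."""
    return (mark >> 13) & 0x7, (mark >> 8) & 0x1F, mark & 0xFF

def send_packet(d, p, k, rng):
    """Pass one packet through d routers (farthest first); return its mark."""
    mark = None
    for _ in range(d):
        if rng.random() < p:
            # This router marks: random fragment offset, distance reset to 0.
            # The fragment byte is a placeholder, not a real address chunk.
            mark = pack_mark(rng.randrange(k), 0, rng.randrange(256))
        elif mark is not None:
            offset, distance, fragment = unpack_mark(mark)
            mark = pack_mark(offset, distance + 1, fragment)
    return mark

def packets_until_complete(d, p, k, rng):
    """Count packets until every (distance, offset) pair has been seen."""
    needed = {(dist, off) for dist in range(d) for off in range(k)}
    packets = 0
    while needed:
        packets += 1
        mark = send_packet(d, p, k, rng)
        if mark is not None:
            offset, distance, _ = unpack_mark(mark)
            needed.discard((distance, offset))
    return packets

def packet_bound(d, p, k):
    """Upper bound on the expected packet count for k fragments per edge."""
    return k * math.log(k * d) / (p * (1 - p) ** (d - 1))

rng = random.Random(1)
runs = [packets_until_complete(10, 1 / 25, 8, rng) for _ in range(50)]
print("simulated mean:", sum(runs) / len(runs))
print("bound:", round(packet_bound(10, 1 / 25, 8)))
```

At the attack rates quoted in the results (hundreds or thousands of packets per second), a packet count of this size corresponds to seconds of attack traffic.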
Some packets can arrive unmarked
Attacker could mark these with bogus edges
◦ Valid suffix unaffected
◦ Attacker can spoof edges past the end of the true attack path
Solution 1:
◦ Use traceroute to determine valid network connections
Solution 2:
◦ Include a secret with each marked packet
◦ Contact the associated network to determine the secret
◦ Disregard packets that don’t have a valid secret

For large distributed attacks, it is very hard to reconstruct paths (may misattribute an edge)
Still not a perfect solution
◦ Can find the source of attack traffic
◦ But can’t find the real attacker

My Review

Presented a traceback algorithm that:
◦ Is easy to understand
◦ Requires little router overhead
◦ Can be used post-mortem
◦ Can be implemented in IPv4 with little loss of functionality
Focus on valid suffix
Theoretically estimated applicability to flooding-style DoS attacks
Experimentally demonstrated that their algorithm works
Fair comparison with other existing traceback techniques

Weak explanation of experimental design
◦ Did they test single or multiple sources of attacks?
◦ What % of the network were marking routers?
◦ Did simulated attackers try anything sneaky?
Didn’t report the time it takes to reconstruct the path
Didn’t report the performance cost of dealing with fragmentation
Authors note that Node Sampling would take too long, but it seems similar to Edge Sampling
Not a perfect solution (the authors admit this)

Test on IPv6 and determine countermeasures for dealing with the loss of the flow label field
Test on real networks
Combine with automated attack detection

[1] Savage, Stefan; Wetherall, David; Karlin, Anna and Anderson, Tom. "Practical Network Support for IP Traceback". In Proceedings of the 2000 ACM Conference. Pg 295-306, 2000.
[2] “IPv4”. Wikipedia. <http://en.wikipedia.org/wiki/IPv4>.
[3] “IPv6”. Wikipedia. <http://en.wikipedia.org/wiki/IPv6>.