Detecting Stepping-Stone Intruders with Long Connection Chains
Wei Ding

Contents
Introduction
Measuring Upstream RTT
Comparison of uRTT Distributions
Validation
Conclusion

World with serious Internet crime threats
According to the IC3 (Internet Crime Complaint Center) Internet crime report for 2009, there were 336,655 complaint submissions, a 22.3% increase over 2008. Total dollar loss from referred cases was $559.7 million. This is just the tip of the iceberg: many more cases are undetected and/or unreported. It is very important to prevent hackers from intruding into our systems and stealing our information.

Intruders don't want to be caught
[Figure: Attacker and Victim]
To steal information from a host, an intruder must log in to that host remotely. To avoid being detected, most intruders reach the victim host through a long connection chain of stepping-stones.

Stepping-Stone Attack
[Figure: Attacker, Stepping-Stone, Victim]

Stepping-Stone Detection

End-of-Chain Protection
It is much more important for a host to protect itself from being a victim.

End-of-Chain Protection
[Figure: Connection Chain; labels: Attacker, Visible Hosts, Victim]

Hypothesis
There is no valid reason for normal users to use a long connection chain for remote login, such as an SSH connection. If we can discriminate long connection chains from short connection chains, then we can distinguish intruders from normal users.

Round-trip Time Can Be Used
If we can compute the round-trip time (RTT) of packets, we can estimate the length of the connection chain. Computing downstream RTT is possible, but computing upstream RTT is very difficult.

Downstream RTT
[Figure: timing diagram across Host 1, Host 2, Host 3, Host 4; labels: Request, Reply, RTT, Time]
Measuring downstream RTT is feasible, but measuring upstream RTT is very difficult.
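As a rough illustration of why downstream RTT is measurable, the sketch below pairs each request a host sends downstream with the first reply it sees afterwards and takes the time difference. The function name `downstream_rtts`, the timestamp lists, and the one-reply-per-request pairing rule are assumptions made for this example; the slides do not specify a matching algorithm.

```python
from typing import List


def downstream_rtts(request_times: List[float], reply_times: List[float]) -> List[float]:
    """Estimate downstream RTTs from packet timestamps seen at one host.

    ASSUMPTION (not from the slides): each request is answered by exactly
    one reply, and replies arrive in the same order as the requests.
    """
    requests = sorted(request_times)
    replies = sorted(reply_times)
    rtts = []
    j = 0
    for sent in requests:
        # Skip replies that arrived before this request was sent.
        while j < len(replies) and replies[j] <= sent:
            j += 1
        if j == len(replies):
            break  # no reply left to pair with this request
        rtts.append(replies[j] - sent)
        j += 1  # each reply is used for one request only
    return rtts


if __name__ == "__main__":
    # Hypothetical timestamps in seconds.
    print(downstream_rtts([0.0, 1.0, 2.0], [0.25, 1.5, 2.75]))
```

The same pairing does not carry over to the upstream direction, because the host sees the reply it sends first and the next request only after an unknown delay.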
Upstream RTT
[Figure: timing diagram across Host 1, Host 2, Host 3, Host 4 (Client, Server); labels: Request, Reply, Te, Ts, Time; the gap between reply and next request is marked "?"]
One problem is the unknown time gap between the previous reply and the next request.

Another Problem of Upstream RTT
[Figure: timing diagram across Host 1, Host 2, Host 3, Host 4 (Client, Server); labels: Reply, Request, Cross over, Gap1 < RTT, Gap2 < RTT, Time]
Cross-over of reply and request packets is another problem.

What else can we use?
Is there any difference between short connection chains and long connection chains?

Sorted Short and Long Connection
[Figure: uRTT (seconds, 0 to 5) versus packets (sorted, 1 to 701) for Short and Long chains]

Two Types of Packet Time Gaps
[Figure: keystroke timelines; panels: (a) Inter-command gaps, (b) Intra-command gaps]

Comparison Between Short and Long Connection Distribution
[Figure: two panels of uRTT (seconds, 0 to 5) versus packets (sorted) for Short and Long chains. One panel: distribution of intra-command gaps only (packets 1 to 501). Other panel: distribution of inter-command gaps only (packets 1 to 101).]

Using uRTTs of Short Chains to Build a Profile
[Figure: Profile curve, uRTT (seconds, 0 to 5) versus packets (sorted, 1 to 101)]
Any curve extracted from a newly collected connection packet stream is compared with this profile distribution to quantify the difference.

Absolute Difference
$$D(g, g_p) = \frac{1}{N} \sum_{i=1}^{N} \left| g[i] - g_p[i] \right|$$
$$g = \{\, g[i] \mid i : 1 \ldots N \,\}, \qquad g_p = \{\, g_p[i] \mid i : 1 \ldots N \,\}$$
$g_p$ is the distribution of uRTT gaps of the profile chain; $g$ is the test connection's distribution. This distance measure takes the absolute distance between the profile distribution and any test connection distribution, based on inter-command time gaps.

Median of Ratio Adjustment
$$D_R(g, g_p) = \frac{1}{N} \sum_{i=1}^{N} \left| g[i] \cdot R - g_p[i] \right|$$
$$R = \operatorname{Median}\left\{ \frac{g_p[i]}{g[i]} \;\middle|\; i = 1, 2, \ldots, N \right\}$$
A ratio $R$ is used to adjust the test distribution and compensate for differences in average typing speed.
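The two distance measures defined so far can be sketched in a few lines. This is a minimal illustration, assuming both curves are already sorted, have the same length N, and contain strictly positive gaps; the function names and the sample values are hypothetical and not taken from the experiments.

```python
from statistics import median
from typing import Sequence


def _check(g: Sequence[float], g_p: Sequence[float]) -> None:
    # ASSUMPTION: both curves have the same number of points N.
    if len(g) == 0 or len(g) != len(g_p):
        raise ValueError("g and g_p must be non-empty and of equal length")


def absolute_distance(g: Sequence[float], g_p: Sequence[float]) -> float:
    """D(g, g_p): mean absolute difference between test and profile curves."""
    _check(g, g_p)
    return sum(abs(a - b) for a, b in zip(g, g_p)) / len(g)


def median_ratio(g: Sequence[float], g_p: Sequence[float]) -> float:
    """R: median of the point-wise ratios g_p[i] / g[i]."""
    _check(g, g_p)
    return median(b / a for a, b in zip(g, g_p))


def ratio_distance(g: Sequence[float], g_p: Sequence[float]) -> float:
    """D_R(g, g_p): mean absolute difference after scaling g by R."""
    r = median_ratio(g, g_p)
    return sum(abs(a * r - b) for a, b in zip(g, g_p)) / len(g)


if __name__ == "__main__":
    profile = [1.0, 2.0, 3.0, 4.0]   # hypothetical profile gaps (seconds)
    test = [0.5, 1.0, 1.5, 2.0]      # hypothetical test curve below the profile
    print(absolute_distance(test, profile))  # unadjusted distance
    print(median_ratio(test, profile))       # R for a curve below the profile
    print(ratio_distance(test, profile))     # distance after ratio adjustment
```

In this made-up example the test curve has the same shape as the profile at half the scale, so scaling by R removes the difference entirely.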
Short connection curves that lie under the profile curve get a ratio $R$ greater than one, which decreases their distance from the profile curve when $D_R$ is calculated. But a long chain may also get a decreased distance, with $R$ less than one.

Weighted Ratio Adjustment
$$W = \begin{cases} 1 - \dfrac{S_p}{S}, & S_p < S \\ 0, & S_p \ge S \end{cases}$$
$$R_w = (1 \;?\; W) \cdot R \qquad \text{[the operator between 1 and } W \text{ is not legible in the source]}$$
$$D_w(g, g_p) = \frac{1}{N} \sum_{i=1}^{N} \left| g[i] \cdot R_w - g_p[i] \right|$$
$S$ and $S_p$ are the slopes of the test and profile uRTT distribution curves, obtained by linear regression ($y = S \cdot x + c$). Most long connection chains get a weight larger than 0, which gives an increased distance $D_w$. With this adjustment, most long chains are more likely to keep an increased distance.

Validation: Classifying 4-hop Chains
[Figure: accuracy rate (TP, 0% to 100%) versus false positive rate (0% to 25%) for D, D-Ratio, and D-Weighted]
20 sessions of 1-hop connection chains and 20 sessions of 4-hop connection chains are compared. For each false positive rate, leave-one-out cross validation is used to select the threshold and calculate the true positive rate.

Classifying 4-hop and 6-hop Chains with Weighted Ratio Distance
[Figure: accuracy rate (TP, 0% to 100%) versus false positive rate (0% to 25%) for 4-hop and 6-hop chains]
Using the weighted ratio adjustment, all 4-hop and 6-hop chains are successfully classified when the false positive rate reaches 15%.

Conclusion
Our detection method uses the packet stream of incoming connections to build an inter-command gap curve. By comparing a new connection's distribution with a profile of short connection chains, long connection chains can be detected with a suitable threshold. Our experiments show that by tolerating a false positive rate of 15%, 100% of the test cases (4-hop and 6-hop) are correctly detected with our weighted ratio distance measure.
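The weighted ratio adjustment from the slides can be sketched as follows. Two points are assumptions rather than facts from the slides: the regression uses the sorted packet index 1..N as the x-axis, and the operator between 1 and W in the definition of R_w is not legible in the source, so the code exposes it as a `sign` parameter and defaults to `+`. All function names and sample values are hypothetical.

```python
from statistics import median
from typing import Sequence


def slope(curve: Sequence[float]) -> float:
    """Least-squares slope S of y = S*x + c.

    ASSUMPTION: x is the sorted packet index 1..N.
    """
    n = len(curve)
    if n < 2:
        raise ValueError("need at least two points for a regression line")
    mean_x = (n + 1) / 2
    mean_y = sum(curve) / n
    sxx = sum((x - mean_x) ** 2 for x in range(1, n + 1))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(range(1, n + 1), curve))
    return sxy / sxx


def weight(s: float, s_p: float) -> float:
    """W = 1 - S_p/S when the test curve is steeper than the profile, else 0."""
    return 1.0 - s_p / s if s_p < s else 0.0


def weighted_distance(g: Sequence[float], g_p: Sequence[float], sign: float = 1.0) -> float:
    """D_w(g, g_p) with R_w = (1 + sign*W) * R.

    ASSUMPTION: sign=+1 is a guess; the source does not show the operator.
    Curves are assumed sorted ascending, equal length, strictly positive.
    """
    if len(g) != len(g_p):
        raise ValueError("g and g_p must have equal length")
    r = median(b / a for a, b in zip(g, g_p))
    w = weight(slope(g), slope(g_p))
    r_w = (1.0 + sign * w) * r
    return sum(abs(a * r_w - b) for a, b in zip(g, g_p)) / len(g)


if __name__ == "__main__":
    profile = [1.0, 2.0, 3.0, 4.0]     # hypothetical short-chain profile
    long_chain = [2.0, 4.0, 6.0, 8.0]  # hypothetical steeper curve
    print(weight(slope(long_chain), slope(profile)))
    print(weighted_distance(long_chain, profile))
```

With these made-up curves the plain median ratio would scale the steeper curve exactly onto the profile, while the weighted variant keeps a nonzero distance under either choice of sign.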