PROFILING HACKERS' SKILL LEVEL BY STATISTICALLY CORRELATING THE RELATIONSHIP BETWEEN TCP CONNECTIONS AND SNORT ALERTS
Khiem Lam

Challenges to Troubleshooting a Compromised Network
• Finding vulnerabilities is time-consuming
• Planted exploits are difficult to identify
• The degree of damage is uncertain

Motivation for Profiling Hackers
Can profiling the attacker’s skill level assist with risk management?
• Understand the level of threat
• Know which vulnerabilities are possible
• Reduce the time and resources spent investigating “what if” scenarios

Approach – Hypothesis of a Skilled Attacker’s Behavior
• Avoids IDS detection if the rule set is known in advance
• Avoids common techniques to reduce the chance of detection
• Establishes many short connections
If these hypotheses are true, then there must be patterns that group attackers by their behavior!

Exploratory Approach
• Phase 1: Data Acquisition/Separation
• Phase 2: Data Standardization/Formatting
• Phase 3: Cluster Analysis

Phase 1 – Data Acquisition/Separation
• TCP connection data: competition PCAP captures → Team A’s pcap, Team B’s pcap → Team A connection info, Team B connection info
• IDS alerts data: competition PCAP captures → Snort application → updated Snort alerts logs, alongside the competition Snort alerts logs

Phase 2 – Data Standardization
• Inputs: Team A connection info, competition Snort alerts logs, updated Snort alerts logs
• Convert to CSV format
• Aggregate the data using the R statistical tool
• Output: Team A’s aggregated data by time period

Phase 2 – Example of Actual Aggregated Data
This is the aggregated data for two teams connecting to one service.

Results – Graph of the Aggregated Data
[Figure: graph of the aggregated data]

Phase 3 – Cluster Analysis Using R
• Inputs: Team A’s, Team B’s, and Team C’s aggregated data by time period
• Find correlations between attributes
• Add weights
• Cluster the data with Euclidean distance cluster analysis
• Output: results and graphs

Phase 3 – Example of Actual Cluster Data
This is the cluster data of all teams connecting to one service.

Results – Euclidean Cluster Graph
[Figure: Euclidean cluster graph]

Team    # flags submitted
3       51
4       40
8       29
2       28
6       8
9       7
10      7
7       2
1       0
5       0

Results – K-Means Cluster
[Figure: K-means cluster plot]
Shown against the same flags-submitted table as the Euclidean cluster graph.

Limitations of Current Approach
• Relies on competition data (time period, team subnet info)
• Assumes attackers know the competition alerts in advance
• Assumes the number of submitted flags is a reliable measure of an attacker’s skill
• Results are inconsistent between different services

Future Work for Improvement
• Experiment with varying time periods (5 minutes, 15 minutes, 30 minutes)
• Expand the updated alert rules to capture more events
• Add additional features (Andrew and Nikunj’s TCP stream distance)
• Weight the correlations between attributes
• Explore other analyses available in R

Questions?
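
Backup – Illustrative Sketch of Phases 2 and 3
The aggregation and clustering steps can be sketched roughly as follows. This is a minimal illustration in Python, not the R workflow used for the results above. The record layout (team, timestamp in seconds), the 15-minute window, the two features (mean connections and mean alerts per window), the team names, and every sample value are assumptions made up for the example. They are not competition data.

```python
from collections import defaultdict
from math import sqrt

WINDOW_SECONDS = 15 * 60  # assumed time period (5 and 30 minutes are other candidates)


def aggregate(events, window=WINDOW_SECONDS):
    """Count events per (team, window index); events are (team, timestamp_seconds)."""
    counts = defaultdict(int)
    for team, ts in events:
        counts[(team, ts // window)] += 1
    return counts


def team_features(connections, alerts, window=WINDOW_SECONDS):
    """Return {team: (mean connections per window, mean alerts per window)}."""
    conn = aggregate(connections, window)
    alert = aggregate(alerts, window)
    windows = defaultdict(set)
    for team, w in list(conn) + list(alert):
        windows[team].add(w)  # every window in which the team was active
    features = {}
    for team, ws in windows.items():
        n = len(ws)
        features[team] = (
            sum(conn.get((team, w), 0) for w in ws) / n,
            sum(alert.get((team, w), 0) for w in ws) / n,
        )
    return features


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm), seeded with the first k points."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment


# Hypothetical sample records: (team, seconds since the start of the capture).
connections = [("A", 10), ("A", 20), ("A", 30), ("A", 950), ("B", 15),
               ("C", 12), ("C", 40), ("C", 70), ("C", 1000)]
alerts = [("A", 25), ("B", 16), ("B", 17), ("B", 18), ("B", 910)]

features = team_features(connections, alerts)
teams = sorted(features)
labels = kmeans([features[t] for t in teams], k=2)
clusters = dict(zip(teams, labels))
print(clusters)
```

With this made-up sample, teams A and C (more connections, few alerts) land in one cluster and team B (few connections, more alerts) in the other. Seeding k-means with the first k points keeps the sketch repeatable. R's kmeans picks random starting centers by default, so real runs may differ between executions.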