Kargus: A Highly-scalable software-based network intrusion detection
4906520 awoo100 Anthony Wood

Contents
• Background – What is Network Intrusion Detection
• Motivation
• Related work – hardware-based systems
• Limitations of hardware-based systems
• Related work – software-based systems
• Contribution
• Kargus – Methodology
• Kargus – Architecture
• Results
• Critical and Appreciative Analysis

Background – What is Network Intrusion Detection?
• Intrusion detection is the process of identifying and responding to malicious activities targeted at computing and network resources
• The goal of intrusion detection is to identify potential network intrusions and report them
• A typical workflow for NID is outlined below:

What is Kargus?
• Software-based Network Intrusion Detection system
• Based on signature matching (i.e. misuse detection)
• Uses batch processing and data parallelism

Motivation
• High-bandwidth networks are becoming more commonplace, with many large enterprises and campus networks adopting 10 Gbps+ connections
• This presents a number of challenges for intrusion detection:
  • The system must monitor traffic at line rate, identifying potential intrusion attempts by pattern matching against a large database of attack patterns
  • Reassembling segmented packets, reconstructing flow-level payloads and handling a large number of concurrent flows must also be implemented efficiently
  • The system needs to protect itself against network attacks (e.g. DoS)

Related work – hardware based
• A common approach is to meet these challenges with dedicated network processors:
  • [1] Cavium is an off-the-shelf product which uses between 4 and 16 specialist cores and other hardware including:
    • Hyper Access Memory Controller
    • I/O Bus
  • [2] McAfee Network Security Platform uses purpose-built hardware
  • Others use special pattern-matching memory and regular expression matching on FPGAs

Limitations of hardware-based systems
• Maintenance and upgrades on such hardware can be expensive
• Operational flexibility is reduced, as low-level code is required to configure the hardware to integrate with other systems
• As organisations move towards cloud infrastructure, they no longer want or need to support dedicated hardware

Related work – software-based
Snort
• [3] A libpcap-based packet sniffer and logger that can be used as a cross-platform lightweight network intrusion detection system
• [4] High performance, but started to experience packet loss at 1.0 Gbps. However, this did not impact the ability of the system to detect network intrusions
• Snort is one of the best-performing open source NIDS solutions available, but it is unable to cope with the line rates available in modern networks

Contributions
• The authors present a highly scalable software-based intrusion detection system (IDS) architecture that fully utilizes modern hardware innovations including:
  1. Multiple CPU cores
  2. Non-uniform memory access (NUMA) architecture
  3. Multiqueue 10 Gbps network interface cards (NICs)
  4. Heterogeneous processors like graphics processing units (GPUs)
• Two techniques are used to get optimum performance:
  1. Batch processing
  2. Parallel execution with an intelligent load balancing algorithm

Kargus – Architecture
An Overview

Kargus – Architecture
EMPLOYING GPU FOR PATTERN MATCHING
• One thread is started and affinitised to each CPU core. This reduces the overhead of thread switching.
• Threads are divided into IDS engine threads and GPU dispatcher threads.
• Each IDS engine thread reads incoming traffic from the Network Interface Controller (NIC) queues and is responsible for all IDS tasks for that traffic
• IDS engine threads:
  1. Pre-process the network traffic data
  2. Perform multi-string matching to determine whether a packet is likely to be an attack
  3. If necessary, perform rule option evaluation
• If the CPU of the thread is overloaded, it hands off the pattern matching workload to the GPU dispatcher thread

Kargus – Methodology
PACKET ACQUISITION AND PRE-PROCESSING
• To reduce allocation and deallocation overheads, Kargus batch-processes multiple packets at a time, allocating large buffers for packet payloads and metadata, using the PacketShader I/O Engine (PSIO). The large buffers are recycled for subsequent packet reading
• Each RX queue is affinitised to a CPU core, removing thread switching overheads
• Receive-side scaling (RSS) distributes incoming packets by hashing the 5-tuple (Source IP, Source Port, Destination IP, Destination Port, Protocol).
• This allows traffic that is part of the same flow to be enqueued to the same NIC queue in order
• Flows can be processed completely in parallel with one another, so RSS reduces the impact of locking and thread safety on performance

Kargus – Methodology
PREPARATION FOR MATCHING
• IP packet fragments are reassembled and the checksum of the TCP packet is verified
• Manages the flow content for each TCP connection
• Identifies the application protocol that each packet belongs to
• Extracts the pattern rules to match against the packet payload

Kargus – Methodology
ATTACK SIGNATURE MATCHING
• Precondition: each packet payload has been reassembled and normalised
• The packet payload is forwarded to the attack signature detection engine
• The detection engine takes a two-phase approach to attack signature matching
• Step 1: Multi-string pattern matching
  • Matches a set of simple strings (e.g. Snort is organised into port groups based on the source and destination port numbers of the packet)
  • The port group is used to reduce the pattern matching space: only attack signatures in the relevant port group are matched against the packet content
  • Pattern matching is implemented using Aho-Corasick, which has the same average-case and worst-case time complexity
• Step 2: Rule option evaluation
  • If packets are caught in the string matching phase, they are evaluated further against a full attack signature

Results
• Analysis of the packet capture methods shows that PSIO can handle much greater throughput (Gbps) whilst maintaining low CPU utilisation through batch processing

Results
• Processing speed for innocent traffic has almost doubled, whilst the speed-up for malicious traffic is significantly smaller. This is because Kargus’ architecture is geared towards efficient processing of innocent traffic

Results
• Throughput and performance for Kargus are significantly greater across all packet sizes.
• For small packet sizes, the CPU-only version of Kargus is more effective
• For large packet sizes, the cost of initialisation and the overhead of maintaining the GPU thread no longer outweigh its usefulness

Conclusion
• Robust architecture that aims to optimise performance using a number of strategies at each stage of network intrusion detection
• Kargus appears to dramatically improve performance for normal traffic using batch processing. However, the speed-up for attack traffic is much smaller. This does not discount the validity of the solution, as it is reasonable to expect most network traffic to be normal
• GPU Kargus is more effective on larger packet sizes due to the overhead involved in allocating tasks to the GPU

Appreciative and Critical Analysis
Appreciative
• Seemingly robust architecture for systems with large CPU power
• Investigated the possibility of task parallelisation or pipelining in addition to data parallelisation; however, pipelining gave suboptimal performance
Critical
• It would be interesting to compare performance with hardware-based systems on the same infrastructure
• Other performance measures such as detection accuracy and packet loss would be useful, as packet loss erodes system effectiveness
• Although most traffic is likely to be normal, the solution does not focus on reducing the complexity associated with malicious traffic. Heavily attacked networks using Kargus may experience less than ideal results as the CPU and GPU become overloaded.
• May be susceptible to Denial-of-Service (DoS) attacks

References
[1] Intelligent Networks Powered by Cavium Octeon and Nitrox Processors: IDS/IPS Software Toolkit. http://www.cavium.com/css_ids_ips_stk.html.
[2] McAfee Network Security Platform. http://www.mcafee.com/us/products/network-security-platform.aspx.
[3] M. Roesch. Snort - Lightweight Intrusion Detection for Networks. In Proceedings of the USENIX Systems Administration Conference (LISA), 1999.
[4] A. Alhomoud, R. Munir, J. Pagna Disso, I. Awan, and A. Al-Dhelaan. Performance evaluation study of intrusion detection systems. Procedia Computer Science, Vol. 5, pp. 173-180, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1877050911003498
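
Supplementary sketch: multi-string pattern matching
Step 1 of attack signature matching relies on Aho-Corasick to check a payload against many strings in a single pass. The following is a minimal Python sketch of that algorithm, not the Kargus or Snort implementation. The function names (build_automaton, search) and the example patterns are hypothetical and chosen only for illustration; a real IDS would select the pattern set from the relevant port group and run a compiled state table on the CPU or GPU.

```python
from collections import deque


def build_automaton(patterns):
    """Build Aho-Corasick goto, failure and output tables for byte patterns."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> patterns that end at this state
    for pat in patterns:
        state = 0
        for byte in pat:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].append(pat)
    # Breadth-first pass fills in failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(byte, 0)
            # Inherit matches that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(payload, automaton):
    """Return (start_offset, pattern) for every match, in scan order."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for offset, byte in enumerate(payload):
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        for pat in out[state]:
            hits.append((offset + 1 - len(pat), pat))
    return hits


if __name__ == "__main__":
    # Hypothetical patterns, not real Snort signatures.
    automaton = build_automaton([b"/etc/passwd", b"cmd.exe"])
    print(search(b"GET /../../etc/passwd HTTP/1.1", automaton))
```

Each payload byte is consumed once and failure links are only followed as far as the current match depth allows, so the scan time does not depend on whether the traffic is benign or crafted to trigger partial matches. Packets that produce any hit would then go on to Step 2 (rule option evaluation).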