Foundation for Research and Technology – Hellas
Institute of Computer Science
Distributed Computing Systems Lab
16 June 2006

1. Network Intrusion Detection Systems

Network Intrusion Detection Systems (NIDSes) provide a powerful mechanism to defend against well-known attacks on a computer network. A NIDS is usually designed as a passive monitoring system that reads packets from a network interface through standard system facilities, such as libpcap [1]. The term "passive monitoring" means that the NIDS only receives a copy of the packets, unlike firewalls, which are located in the packets' path. NIDSes are thus characterized as fail-safe systems because, even if they stop working, the operation of the network is not interrupted.

Most deployed NIDSes are signature-based. The detection mechanism of a signature-based NIDS relies on a set of signatures, each of which is a short description of an attack. After a set of normalization passes (e.g., IP fragment reassembly and TCP stream reconstruction), each packet is checked against the NIDS ruleset (the set of signatures is also referred to as the ruleset). A signature may inspect only the packet header (for example, packets with destination port 0 are suspicious) or may require full packet inspection. Full packet inspection involves matching patterns against the payload of packets. As an example, a signature taken from the latest Snort ruleset is

alert tcp any any -> HTTP_SERVER 80 (content:"/default.ida?NNNNNN"; nocase; msg:"WEB-IIS CodeRed v2 root.exe access";)

This signature states that if "/default.ida?NNNNNN" is found inside the payload of a TCP packet that originates from any host and any source port and is destined to an HTTP server on port 80, then the CodeRed attack is taking place.

The majority of signatures require full packet inspection (94% of the rules in the latest Snort ruleset). As expected, the number of signatures is increasing over time. Figure 1 shows this growth.
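To make the reading of the CodeRed signature above concrete, the following Python fragment is a minimal sketch of how one content rule could be evaluated against a decoded packet. The ContentRule class and the rule_matches function are hypothetical names introduced only for illustration; this is not Snort's implementation, and the destination host check (HTTP_SERVER) is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ContentRule:
    """Hypothetical, heavily simplified stand-in for one signature."""
    proto: str
    dst_port: int
    content: bytes
    nocase: bool
    msg: str


def rule_matches(rule, proto, dst_port, payload):
    """Return True if both the header fields and the payload pattern match."""
    if proto != rule.proto or dst_port != rule.dst_port:
        return False
    if rule.nocase:
        # Case-insensitive search: compare lower-cased copies.
        return rule.content.lower() in payload.lower()
    return rule.content in payload


# The example signature from the text, expressed with the sketch's fields.
CODERED = ContentRule("tcp", 80, b"/default.ida?NNNNNN", True,
                      "WEB-IIS CodeRed v2 root.exe access")

request = b"GET /DEFAULT.IDA?NNNNNNNNNNNN HTTP/1.0\r\n"
if rule_matches(CODERED, "tcp", 80, request):
    print("alert:", CODERED.msg)
```

Because the rule carries the nocase option, the upper-case request still matches. A real NIDS would run this check only after the normalization passes described above.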
Note that the current Snort ruleset contains around 3500 signatures.

[1] www.tcpdump.org

Network Intrusion Prevention Systems (NIPSes) are the same as NIDSes, except that they are located in the packets' path and decide whether packets are dropped or not, based on the decision taken by the detection engine.

Figure 1: Growth of Snort signatures over the last 6 years

a. Snort

Snort [2] is the most well-known open-source IDS available. It is a signature-based intrusion detection system and its signatures are updated almost daily. Snort outperforms existing NIDSes and has a large community of users that maintain and support it. The way Snort works is representative of how most signature-based NIDSes work, although internal mechanisms and algorithms may differ.

Snort captures packets through the libpcap library. Before each packet is checked against the ruleset, a set of preprocessors is applied. Each preprocessor is responsible for a normalization operation. By default, Snort performs IP and TCP reassembly. In some cases, IP packets are fragmented, that is, one IP packet is split into two or more packets. Reassembly takes these fragments and joins them into a single packet. Similarly, a TCP stream is split across many packets. For example, an FTP transfer of a large file is contained in hundreds or thousands of packets that have to be reassembled to provide the full transfer. Reassembly is essential for preventing evasion attacks that try to hide the attack across packet boundaries: the first half of the attack is in one packet and the second half is in the next. Other Snort preprocessors perform telnet and RPC decoding and also keep flow state (whether a packet is part of an established connection or not).

[2] http://www.snort.org

After the normalization passes, the packet is decoded (decoding means separating its protocol fields, e.g. source and destination IP addresses) and checked against the ruleset. Most attacks target a specific service and that service listens to a predefined port.
Thus, signatures can be efficiently clustered based on the destination port of incoming packets. For example, signatures that detect attacks on a web server have port 80 as destination port, and each incoming packet with destination port 80 is checked only against these signatures. This clustering avoids unnecessary checks on rules that are out of the packet's context (checking telnet rules for a web packet would be pure overhead). A small subset of signatures describes attacks that do not target a specific destination port. These rules are checked separately for every incoming packet.

As stated before, most signatures require searching for a pattern inside the packet payload. To avoid searching for each pattern serially, multi-pattern search algorithms are used. The original distribution of Snort comes with implementations of Wu-Manber and Aho-Corasick. For more details on pattern matching algorithms, refer to Section 3. The main idea behind multi-pattern algorithms is that they search for an arbitrarily large number of patterns inside the payload in a single pass. For performance reasons, Snort performs pattern matching first and header checking last, because in most cases the header check matches anyway, while pattern matching is the most heavyweight operation. When a pattern is matched, the rest of the rule is checked (e.g. flags) and, if the whole rule matches, an alert is triggered. Snort also supports regular expression matching for more advanced checks. Regular expressions, however, are checked serially. An alert can be either a log entry or, in more advanced setups, a record in a MySQL database.

b. Bro

Bro is another popular open-source, Unix-based NIDS and can be found at http://bro-ids.org/. Bro detects intrusions by comparing network traffic against a customizable set of rules describing events that are deemed troublesome.
These rules might describe specific attacks (including those defined by signatures, as in the case of Snort) or unusual activities, e.g., certain hosts connecting to certain services or patterns of failed connection attempts. Bro uses a specialized policy language that allows a site to tailor Bro's operation, both as site policies evolve and as new attacks are discovered. Bro comes with a rich set of policy scripts designed to detect the most common Internet attacks while limiting the number of false positives, i.e., alerts that confuse uninteresting activity with important attack activity. These supplied policy scripts run "out of the box" and do not require knowledge of the Bro language or policy script mechanics. Bro also comes with utilities that can convert Snort rules to the Bro language.

c. Other open-source NIDSes

Firestorm [3] is similar to Snort and can reassemble IP fragments and perform TCP connection tracking, TCP stream reassembly and application layer stateful analysis. It has the capability to fully decode application layer protocols and also the ability to rate-limit alert output to protect itself from DoS attacks.

[3] http://www.scaramanga.co.uk/firestorm/

Prelude [4] is a Hybrid IDS framework, that is, a product that enables all available security applications, be they open-source or proprietary, to report to a centralized system. To achieve this, Prelude relies on the IDMEF (Intrusion Detection Message Exchange Format) IETF standard, which enables different kinds of sensors to generate events using a unified language. Prelude benefits from its ability to find traces of malicious activity from different sensors (Snort, honeyd, Nessus Vulnerability Scanner, Samhain, over 30 types of system logs, and many others) in order to better verify an attack and, in the end, to perform automatic correlation between the various events.
Prelude is committed to providing a Hybrid IDS that offers the ability to unify currently available tools into one powerful, distributed application.

As a deployment solution, Lambic OS [5] is a fully functional open-source intrusion detection system built from open-source tools, including Snort, ACID (a web-based management console for Snort), SnortCenter and MySQL.

d. Commercial NIDSes and NIPSes

Most available commercial products are intrusion prevention systems. Snort_inline [6] is basically a modified version of Snort that adds the ability to drop packets. Bro can also be used as an IPS, as it can respond to attacks and execute arbitrary programs that will block attacking packets.

Sourcefire's Intrusion Sensor [7] is a hardware sensor based on the technology developed in Snort. Sourcefire products offer anomaly detection capabilities not found in the open-source version of Snort. For example, anomaly detection engines can spot if a machine begins broadcasting spam e-mail messages. The most powerful model of the Intrusion Sensor series can handle an aggregate throughput of up to 4 Gbps [8].

Cisco Intrusion Prevention System (IPS) [9] is an inline, network-based solution designed to identify, classify, and stop malicious traffic, including worms, spyware/adware, network viruses, and application abuse. Through the IPS Sensor software, and based on Trend Micro technology, the Cisco IPS can provide enhanced virus/malware protection. Other characteristics include traffic anomaly detection and protocol anomaly detection. Traffic anomaly detection targets attacks that may span multiple sessions and connections, using techniques based on identifying changes in normal network traffic patterns. An example would be an ICMP flood, detected as more than a predefined number of ICMP packets within a certain amount of time.
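The ICMP flood example can be illustrated with a sliding-window counter. The Python fragment below is a minimal sketch under stated assumptions: the class name IcmpFloodDetector and the threshold and window values are hypothetical, and nothing here reflects how the Cisco product actually implements traffic anomaly detection.

```python
from collections import deque


class IcmpFloodDetector:
    """Flags a flood when more than max_packets ICMP packets fall inside
    a sliding window of window_seconds (both values are illustrative)."""

    def __init__(self, max_packets, window_seconds):
        self.max_packets = max_packets
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def observe(self, timestamp):
        """Record one ICMP packet; return True if the threshold is exceeded."""
        self.timestamps.append(timestamp)
        # Discard packets that have fallen out of the time window.
        while (self.timestamps
               and timestamp - self.timestamps[0] >= self.window_seconds):
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_packets


detector = IcmpFloodDetector(max_packets=3, window_seconds=1.0)
for t in (0.0, 0.1, 0.2, 0.3):
    if detector.observe(t):
        print("possible ICMP flood at t =", t)
```

The sketch keeps a single global counter. A realistic detector would more likely keep one window per source or destination address, which is omitted here to keep the example short.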
Protocol anomaly detection identifies attacks based on observed deviations from the normal RFC behavior of a protocol or service (an HTTP response without an HTTP request, for example). The most advanced model of the series, the IDS 4250 XL Sensor, can process traffic up to 1 Gbps.

[4] http://www.prelude-ids.org/
[5] http://www.ids.belbone.be/
[6] http://snort-inline.sourceforge.net/
[7] http://www.sourcefire.com/products/is.html
[8] http://www.sourcefire.com/products/downloads/public/sf_IS5800.pdf?a=1&b=2#go
[9] http://www.cisco.com/en/US/products/hw/vpndevc/ps4077/products_data_sheet0900aecd801eeea5.html

2. Firewalls

Firewalls are security components that are located at the edge of the network and inspect all incoming and outgoing traffic. If suspicious activity is detected, they can prevent the communication between the attacker and the victim service. In their simplest form, firewalls maintain a list of suspicious IP addresses that are considered to be attack sources. The source of each packet is checked against this list and, if a match occurs, the packet is dropped. Firewalls also prevent communication to specific services, e.g. they can forbid access to the telnet service (drop packets going to port 23).

Firewall functionality is integrated in most widely deployed routers. Most routers offer the ability to blacklist IP addresses or block specific ports (such rules are also called Access Control Lists, or ACLs). Even home routers, like ADSL modems, have such abilities but can only support a small number of blocks. Alternative solutions use commodity PCs as firewalls, with two network interfaces and the iptables facility of the Linux operating system. Iptables allows an administrator to add rules to the kernel that drop or forward packets. A simple example of such a rule is the following:

iptables -A INPUT -s 123.0.0.0/8 -p icmp -j DROP

which instructs the firewall to drop any ICMP packets that come from the subnet 123.0.0.0/8.
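The decision that such a rule encodes (drop ICMP packets whose source lies in 123.0.0.0/8, accept everything else) can be mimicked in a few lines of Python using the standard ipaddress module. This is only an illustrative sketch: the RULES table and the decide function are hypothetical names, and the real filtering is done inside the kernel, not in user-space code like this.

```python
import ipaddress

# Hypothetical rule table: (source network, protocol, action).
RULES = [
    (ipaddress.ip_network("123.0.0.0/8"), "icmp", "DROP"),
]


def decide(src_ip, proto, default="ACCEPT"):
    """Return the action of the first rule matching the packet, else the default."""
    addr = ipaddress.ip_address(src_ip)
    for network, rule_proto, action in RULES:
        if proto == rule_proto and addr in network:
            return action
    return default


print(decide("123.4.5.6", "icmp"))
print(decide("123.4.5.6", "tcp"))
print(decide("10.0.0.1", "icmp"))
```

Rules are evaluated in order and the first match wins, which is the usual convention for access control lists. Only the first call matches the rule; the other two fall through to the default action.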
However, PC-based firewalls can only handle a few hundred Mbps, as they are limited by the processing power and bus throughput of the hosting PC.

a. Commercial products

The Cisco PIX security appliance [10] offers application-aware firewall services by integrating a broad range of features beyond traditional ACLs. PIX appliances can detect popular forms of attack, including denial-of-service (DoS) attacks, fragmented attacks, replay attacks, and malformed packet attacks. They can perform TCP stream reassembly and traffic normalization, and can track the state of all network communications. PIX also supports advanced application and protocol inspection, including protocols like Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), Extended Simple Mail Transfer Protocol (ESMTP), Domain Name System (DNS) and Simple Network Management Protocol (SNMP). The throughput of the high-end model (PIX 535) can reach up to 1.7 Gbps. From the same company, the Cisco ASA 5500 Series IPS Edition [11] is a firewall based on the technology deployed in the PIX product series, and its performance reaches 450 Mbps.

[10] http://www.cisco.com/en/US/products/hw/vpndevc/ps2030/products_data_sheet09186a008007d05d.html

Global Technology Associates' GB-2000X [12] provides characteristics similar to those of the Cisco PIX firewall. It offers stateful packet inspection, a built-in IPSec VPN and gateway failover, and supports up to ten 100 Mbps Ethernet ports. A less powerful product is the SG580 from SnapGear, which can operate at up to 200 Mbps but offers similar capabilities, like stateful inspection and intrusion protection by incorporating Snort.

3. Pattern matching algorithms

A number of algorithms have been proposed for pattern matching in a NIDS. The performance of each algorithm may vary according to the case in which it is applied. The multi-pattern approach of Boyer-Moore is fast for a few rules, but does not perform well when used for a large set.
In contrast, Wu-Manber behaves well on large sets, but its performance degrades when short patterns appear in the rules. E2xB is based on the idea that in most cases there is a mismatch, and it tries to quickly filter out patterns that do not match. However, E2xB introduces additional preprocessing cost per packet, which is amortized only after a certain number of rules. The following subsections describe each algorithm in more detail.

a. The Boyer-Moore algorithm

The most well-known algorithm for matching a single pattern against an input was proposed by Boyer and Moore [13]. The Boyer-Moore algorithm compares the search pattern with the input, starting from the rightmost character of the search pattern. This allows the use of two heuristics that may reduce the number of comparisons needed for pattern matching (compared to the naive algorithm). Both heuristics are triggered on a mismatch. The first heuristic, called the bad character heuristic, works as follows: if the mismatching character appears in the search pattern, the search pattern is shifted so that the mismatching character is aligned with the rightmost position at which it appears in the search pattern. If the mismatching character does not appear in the search pattern, the search pattern is shifted so that the first character of the pattern is one position past the mismatching character in the input.

[11] http://www.cisco.com/en/US/products/ps6120/prod_brochure0900aecd80402ef4.html
[12] http://www.gta.com/products/gb2000x/
[13] R. Boyer and J. Moore. A fast string searching algorithm. Commun. ACM, 20(10):762–772, October 1977.

The second heuristic, called the good suffix heuristic, is also triggered on a mismatch. If the mismatch occurs in the middle of the search pattern, then there is a non-empty suffix that matches.
The heuristic then shifts the search pattern up to the next occurrence of the suffix in the pattern. Horspool [14] improved the Boyer-Moore algorithm with a simpler and more efficient implementation that uses only the bad character heuristic. Fisk and Varghese [15] developed Set-Wise Boyer-Moore (SWBM), an algorithm based on Boyer-Moore concepts that operates on a set of patterns. SWBM was integrated in older versions of Snort and tested using a single traffic trace from an enterprise Internet connection.

b. The E2xB algorithm

E2xB is a pattern matching algorithm designed to provide quick negatives when the search pattern does not exist in the packet payload, assuming a relatively small input size (on the order of the packet size) [16]. As mismatches are far more common than matches, pattern matching can be sped up by first testing the input (i.e., the payload of each packet) for missing fixed-size sub-strings of the original signature pattern, called elements. The collisions induced by E2xB, i.e., cases where all fixed-size sub-strings of the signature pattern show up in arbitrary positions within the input, can then be separated from the actual matches using standard pattern matching algorithms, such as Boyer-Moore. The small input assumption ensures that the rate of collisions is reasonably small: experiments have shown collision rates of 10% in the worst case. In the common case, negative responses can be obtained without resorting to general-purpose pattern matching algorithms. The E2xB algorithm was evaluated with traffic traces from diverse environments, including traces containing attacks, traces with normal web traffic, and WAN traffic traces from a local ISP.

c. The Wu-Manber algorithm

The most recent implementation of Snort uses a simplified variant of the Wu-Manber multi-pattern matching algorithm [17], as discussed in the Snort architecture manual [18]. The "MWM" algorithm is based on a bad character heuristic similar to Boyer-Moore, but uses a one- or two-byte bad shift table constructed by pre-processing all the patterns instead of only one. MWM performs a hash on the two-character prefix of the current input to index into a group of patterns, which are then checked starting from the last character, as in Boyer-Moore. The performance of MWM was originally measured using text files and various sets of patterns. The first attempt to measure MWM as the basic pattern matching algorithm in a NIDS was made in [18]. The results show that Snort with MWM is much faster than previous versions that used Set-Wise Boyer-Moore and Aho-Corasick.

[14] R. Horspool. Practical fast searching in strings. Software - Practice and Experience, 10(6):501–506, 1980.
[15] M. Fisk and G. Varghese. An analysis of fast string matching applied to content-based forwarding and intrusion detection. Technical Report CS2001-0670 (updated version), University of California - San Diego, 2002.
[16] K. G. Anagnostakis, E. P. Markatos, S. Antonatos, and M. Polychronakis. E2xB: A domain-specific string matching algorithm for intrusion detection. In Proceedings of the 18th IFIP International Information Security Conference (SEC2003), May 2003.
[17] S. Wu and U. Manber. A fast algorithm for multipattern searching. Technical Report TR-94-17, University of Arizona, 1994.

d. The Piranha algorithm

The Piranha algorithm [19] is based on the idea that if the rarest 4-byte substring of a pattern is found inside the packet payload, then this pattern is assumed to match. Each pattern is therefore represented by its least popular 4-byte sequence, where popularity reflects the number of times that a specific substring occurs across all patterns. For every occurrence of the rare substring, the intrusion detection system is instructed to check the corresponding rule. Piranha itself can only handle patterns of length greater than or equal to 4. For completeness, patterns shorter than 4 bytes are handled separately. Piranha treats every byte-aligned pattern as a set of 32-bit sub-patterns.
For example, the pattern “/admin.exe” (R1) is considered as the set of its 32-bit byte-aligned sub-patterns, i.e., “/adm”, “admi”, “dmin”, “min.”, “in.e”, “n.ex” and “.exe”. To optimize memory footprint and execution speed, only the least popular sub-pattern is kept, where popularity reflects the number of times that a specific substring occurs across all patterns. All least popular sub-patterns are put in an index table (a hash table) so they can be easily located during the search phase.

The search phase of Piranha is straightforward. For each 4-byte sequence of the packet payload, the index table is consulted in order to find the patterns that contain this sequence. All these patterns are then sent to the intrusion detection system for further inspection.

[18] Sourcefire. Snort 2.0 - Detection Revisited. October 2002. http://www.snort.org/docs/Snort_20_v4.pdf.
[19] S. Antonatos, M. Polychronakis, P. Akritidis, K. G. Anagnostakis, E. P. Markatos. Piranha: Fast and memory-efficient Pattern Matching for Intrusion Detection. Proceedings of the 20th IFIP International Information Security Conference (SEC2005), May 2005.
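The preprocessing and search phases described for Piranha can be sketched in Python as follows. This is an illustrative reading of the description above, not the authors' implementation: the function names are hypothetical, ties between equally rare sub-patterns are broken by taking the first one, and the "further inspection" step is reduced to a plain substring test.

```python
from collections import Counter, defaultdict


def subpatterns(pattern):
    """All 4-byte substrings of a pattern, taken at every byte offset."""
    return [pattern[i:i + 4] for i in range(len(pattern) - 3)]


def build_index(patterns):
    """Map each pattern's least popular 4-byte sub-pattern to that pattern."""
    if any(len(p) < 4 for p in patterns):
        raise ValueError("patterns shorter than 4 bytes must be handled separately")
    # Popularity: how often a sub-pattern occurs across all patterns.
    popularity = Counter(sub for p in patterns for sub in subpatterns(p))
    index = defaultdict(list)
    for p in patterns:
        rarest = min(subpatterns(p), key=lambda s: popularity[s])
        index[rarest].append(p)
    return index


def candidates(index, payload):
    """Patterns whose representative sub-pattern occurs in the payload."""
    found = set()
    for i in range(len(payload) - 3):
        found.update(index.get(payload[i:i + 4], ()))
    return found


def search(index, payload):
    """Confirm each candidate with a full comparison against the payload."""
    return {p for p in candidates(index, payload) if p in payload}


index = build_index([b"/admin.exe", b"/admin.php", b"cmd.exe"])
print(search(index, b"GET /admin.exe HTTP/1.0"))
```

With these three patterns, “.exe” and the sub-patterns shared with “/admin.php” are the popular ones, so “/admin.exe” ends up represented by “in.e”. A payload that contains “in.e” but not the whole pattern becomes a candidate and is then rejected by the confirmation step.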