Critical Features
• Sufficient network capacity
• Frequent updates
• Support level
• Power redundancy
• High availability

Unified Threat Management (UTM) devices are a useful tool for reducing the risks present in a company’s infrastructure at a lower cost than standalone devices. Based on the risks you need to address, you must select the appropriate features to help mitigate them. This presentation will cover the most important aspects of selecting a UTM device to minimize risks inside the organization:

• Critical Features: UTM devices have a broad range of possible features. The most important ones will be discussed so you can select which features you need to address your company’s risks, or find out if any are missing in a device you are looking at.
• Selection criteria: Technical specifications can solve many problems, but there are other variables to consider, such as the market position of the company, partner-manufacturer relationships, and product roadmaps. These criteria will help you select the most suitable UTM device for your needs.
• Test plan: Features needed by the company must be tested and proven to see if the device is really capable of meeting the company’s needs.

The following are the critical features for a UTM device:

• Sufficient network capacity: Any device that your traffic passes through has the potential to become a bottleneck, but few standalone devices try to perform as many checks on your traffic as a UTM device. In order to accomplish the dual requirements of protection and speed, vendors have had to go to extreme lengths such as creating custom ASICs to keep up [1]. To verify that the device you are looking at has sufficient capacity, double check specifications such as maximum throughput, number and types of ports, memory size and disk space. When you are looking at the specifications, make sure you know what they refer to.
Many vendors report throughput in the most favorable light, such as with only the firewall enabled and no IPS. In addition, attempt to adjust the numbers for issues such as IPS rule tuning and the number of modules you will be enabling. Any device picked for testing or further review should have capacity not just for today’s traffic, but for future traffic loads as well. The UTM device must integrate with your network and the way you administer it, and must not affect network performance.

[1] Tippingpoint IPS product description, visited 9/8/2007, http://www.tippingpoint.com/products_ips.html

Mason Pokladnik and Manuel Santander SANS Technology Institute

• Frequent updates: Security is an ongoing process, and if the risks from new and evolving threats are not addressed, then you are not just standing still; you are actually in a constant state of decreasing protection. While some modules of a UTM device, like the firewall, may not be updated frequently, other items such as the signatures for the IPS, the list of websites to be blocked by the web filter, and antivirus protection need frequent updates. Just as important as frequent updates is the speed with which vendors respond to new threats. Reducing the length of the “vulnerability window” between when a new threat is detected and when protection updates are distributed is another way a vendor may distinguish itself from the pack. These emerging threats are also known as “zero day” vulnerabilities.

• Support level: In addition to getting out signatures and other updates quickly, the company behind the device you are considering should be able to respond to any other technical issues you may have in a timely fashion. Issues with licensing and bugs in the product may just be an inconvenience, but other issues can bring your network to a standstill. One of the risks of using a network IPS has always been that you may block legitimate traffic.
The best prevention for this is for the vendor to write high quality signatures in the first place, but in the rush to get updated protection out to clients, mistakes will inevitably happen. When they do arise, it is important that the support team be able to help quickly isolate the rules causing problems. Another consideration is the support team’s geographic distribution. Do they follow the sun with teams around the world? Are those teams fully staffed with upper level support and development support, or do you have to wait 24 hours for a response because they are all on the other side of the planet?

• Power redundancy: A device at the core of your network should support redundant power supplies. If this feature is missing, it might indicate a lack of maturity in the product design.

• High availability: If you plan to install your device in a high availability configuration, the device you are considering needs to be able to support protocols such as HSRP, VRRP and GLBP.

Critical Features (2)
• IPS failure mode open/closed
• Alerting
– Forensic evidence
– Detailed alerts for further investigation
– Useful reports for statistics and security level improvements

• IPS failure mode fail open/fail closed: All network equipment is susceptible to hardware failures. Add to that the extreme care that must be taken when parsing protocols, as an IPS has to do, and you may find yourself with a non-working IPS someday. Snort, Ethereal, and ISS’s Realsecure IDS have all seen buffer overflows that can be exploited by packets passing through the device for analysis. You need to ask your organization what it would prefer to happen when the IPS dies. If the device is equipped with the correct type of network cards it can fail “open”, allowing network traffic to pass through, but with your network losing an important defensive control.
This situation is even worse if you configure your network to operate this way and have no way of knowing that the IPS has stopped running. On the other hand, most IPS devices can stop passing traffic and fail “closed.” This keeps downstream machines from being attacked, but could cost you a lot of money if you are depending on that network to generate income. If you think this situation is avoidable by having someone there to reboot the device, consider what happened with the Witty worm. ISS’s Realsecure IDS machines had a remotely exploitable vulnerability that was used to cause the machines to slowly destroy the data on their hard drives, eventually rendering the system unusable [2]. Now imagine this situation: you have just implemented two brand new converged security devices from the same vendor in a high availability, failover configuration, and they share an unknown buffer overflow that can render them both useless without a total reinstall of the system. There is a good chance that this is the case, no matter whose IPS engine you are using. You are just lucky that no one has found that vulnerability and released a worm to exploit it. For more information on the Witty worm, you can visit http://www.caida.org/research/security/witty.

• Alerting / forensic evidence: The devices you consider should have excellent logging capabilities for their various modules. Specifically, look for the ability to send alerts by severity rating to different destinations such as email, pager, database and a syslog server. The device should also generate logs for administrative activities such as configuration changes and normal system maintenance. It should also be able to turn on debugging for each device feature to check for issues such as the inability to bring up a VPN tunnel, or to figure out the unintended consequences of a poorly written signature. The IPS module should also include the ability to capture the relevant packets for alerts that are logged but not blocked.
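
When evaluating the severity-based alert routing described above, it can help to write down the routing you expect before checking whether a device can express it. The following is a minimal sketch only: the severity names, the thresholds and the `destinations_for` helper are assumptions made for illustration and do not come from any particular UTM product.

```python
from dataclasses import dataclass

# Hypothetical severity scale, lowest to highest (illustration only).
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]

# Hypothetical policy: the minimum severity each destination should receive.
ROUTES = {
    "syslog": "info",
    "database": "low",
    "email": "high",
    "pager": "critical",
}


@dataclass
class Alert:
    module: str    # e.g. "ips", "firewall", "vpn"
    severity: str  # one of SEVERITY_ORDER
    message: str


def destinations_for(alert, routes=ROUTES):
    """Return, in sorted order, every destination whose threshold the alert meets."""
    level = SEVERITY_ORDER.index(alert.severity)
    return sorted(
        dest
        for dest, minimum in routes.items()
        if level >= SEVERITY_ORDER.index(minimum)
    )
```

A table like `ROUTES` doubles as a checklist during testing: generate one alert per severity and confirm that each one arrives at the expected destinations and nowhere else.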
A device’s forensic ability should allow an analyst not only to view the details of an alert, but to drill down and see the network traffic that caused it. It may even allow them to verify whether or not a compromise has taken place without checking the system in question. This can greatly speed up the IPS analyst’s job compared to having to verify the logs and system state of an attack target to see what they can tell you. You should also verify the reporting features of the device by checking whether the reports it generates are good enough for daily monitoring and periodic management reports. Real time reports are also greatly appreciated. You may find that you need a separate reporting package, at additional expense, or a separate logging system to create the metrics you need to manage the device over its service life.

[2] Witty Worm analysis, visited 9/1/2007, http://www.caida.org/research/security/witty/

Critical Features (3)
• Integration with network architecture
• Auditability
– Can it support separation of duties?
– Does the device log all important changes?
– Failure notifications for failed module

• Integration with network architecture: A UTM device is just another technological control that you use to mitigate risk. Verifying how that device fits into your existing IT architecture before trying to implement it can save many implementation headaches and IT personnel costs. Some example areas to look at:
– If this device is going to act as a router on your network, does it support your enterprise routing protocol?
You could plug a device that only supports RIP into an OSPF network, but you should be getting something in return that makes up for the increased network configuration complexity, which means more work for network technicians and increased memory and processor requirements for routers that must speak multiple routing protocols.
– Will the device talk to your log management or security information management system? If not, you have just committed your organization to writing custom translation tools from the UTM device’s log and alert formats to your management system’s formats in order to audit and correlate those events.
– Make sure the device works with your authentication technology. If you use Active Directory or Novell eDirectory, you will want to be able to authenticate administrators or users against it.

• Auditability: Current business regulations not only require the existence of an IDS or IPS system and a firewall, they also require that you be able to demonstrate you are protecting business assets and processes such as the financial reporting process. Best practices that auditors, both internal and external, will want to see include:
– Separation of duties – the device should support some sort of role based security that can limit what an administrator or auditor can see and change on the device. For example, you should be able to have an administrator who can make changes to only one module on the device and who cannot modify any log files.
– Appropriate logging – In order to verify that the controls are being managed properly, the device must be able to log items such as configuration changes both locally and to log management or SIM systems so that they can be matched up with configuration management approvals.
– Failure warnings – A firewall or IPS is a preventative and/or detective control. One requirement of an effective security control is that you be able to tell when it is no longer working.
The device should be able to raise some sort of alert upon detecting a process failure, whether or not it is capable of remediating the problem on its own.

Critical Features (4)
• IPS detection models
– Signature
• Only detects known threats
– Behavioral
• Might detect zero-day attack
– Hybrid

• IPS detection models: There are three major models used in IPS detection engines. You should understand which model your device uses and the benefits and drawbacks that entails for your network.

The first type of model is called signature based detection. In this model the vendor, or in some cases the end user, writes a rule that analyzes parts of the packet header or payload for telltale signs of an attack. When the “signature” of the attack is located, the packet can be dropped instead of being forwarded on to the end host. In the case of a TCP connection, a reset packet can be sent as well to tear down the connection and preserve system resources.

The benefits of the signature model include:
– A well written signature will very often find the specific attack or class of attack it is looking for without blocking legitimate traffic.
– If you are able to write signatures for your device, you may be able to put in an emergency detection rule until the vendor can release their signatures. This is even easier if your system uses the Snort engine or rules syntax, as there are places like bleedingsnort.com where users are constantly posting rules to detect the latest threats and refining them as they get used on more systems.
– You should be able to tune the signatures the detection engine is using to match the systems it is protecting and thereby reduce the load on the device.

The drawbacks include:
– Signature based systems cannot protect against unknown threats. By their very design, they look only for what they are told to look for.
– Once a vulnerability is discovered, there is a period of time in which you are not protected while the threat is being analyzed and a rule is created, tested and distributed. This is often referred to as the window of vulnerability.
– Signatures are often written to detect a specific attack or payload within the attack, and by slightly altering it an attacker may be able to bypass the signature.
– As signature databases grow over time, the performance of the device is reduced.

The second model is called the behavioral model, although it is also called anomaly detection or heuristic detection. In the behavioral model, the detection engine goes through a learning period where it observes traffic patterns on your network and uses that information to build a baseline of what legitimate traffic should look like [3]. Then, when the detection engine enters blocking mode, it uses that baseline to identify traffic flows that are out of place using a myriad of possible indicators, such as services never seen on a host before being accessed, or uncharacteristic levels of traffic to many computers that might indicate the scanning phase or attack phase of a worm.

[3] IDS: Signature versus anomaly detection, visited 9/2/2007, http://searchsecurity.techtarget.com/tip/1,289483,sid14_gci1092691,00.html

Benefits include:
– Behavioral detection can possibly detect threats that have never been seen before. This means that if a zero day threat trips a behavioral rule, it might be blocked even though no one had to analyze it and write a signature for it. Even if it does not block the threat, it may detect that a desktop in your environment just started sending spam out to the Internet and raise an alert so that your incident response team can look into the situation.
– Behavioral engines are often more capable of looking into higher level application protocols.
If the device you are looking at supports deep inspection of traffic for a critical application on your network, it will be able to detect things a signature only engine will miss.

Drawbacks include:
– You will most likely have less ability to see the rules and tune a behavioral detection engine. This means the likelihood of blocking legitimate traffic is increased, and when that does happen, figuring out what went wrong is that much harder. Instead of just changing or disabling a signature, you may have to send packet captures to the device’s development team and wait for them to figure out what went wrong.
– The more complex detection engine may require you to purchase a device with more memory and processor power to decode and keep track of all of those protocols.
– If a threat complies with the normal behavior of a protocol, it may never be detected.

The third major type of detection engine is the hybrid model. The hybrid model attempts to use both signature and behavioral methods to offset the limitations of each model on its own. While you gain the benefits of having both types of protection, you also get the drawbacks of both. This especially applies to capacity and scalability issues, as you must now process the packets in two different ways while still meeting low latency requirements.

Selection Criteria
• Business fit
– Does the device match how you manage your network?
• Other unique features
• Feature comparison to standalone
• Vendor history/reputation
• References

Deciding which UTM features are needed to address the critical risks inside an organization is very important, but there are other variables that also determine whether the UTM device will work as part of the organization’s infrastructure. The following list will further help you decide which devices to consider.
Business fit: Earlier we discussed how a UTM device should fit into your IT architecture; you should also evaluate how the device matches your organizational chart and processes. Information Security Magazine published a comparison of the administrative interfaces and features of several UTM devices in the June 2007 issue [4]. The article brought up two relevant points. First, does the device match your security department’s organization? For example, does one group manage all the features of your UTM device, or does one group handle the firewall while others handle IPS or VPN? If the device is not capable of supporting multiple role based administrators, this may cause issues with your separation of duties. Second, how many people can administer the device at the same time? If you are going to have multiple people changing the device at the same time, you need to know if that is possible and what the rules are. Does the last change submitted win? Additional problems may arise from the administrative interface. Is it web based or Windows only? If you are going to be buying several of these devices for branch offices, is there a scalable management system, or are you forced to make changes on each device?

Unique features: In addition to the critical features needed, does the device have any unique features that give it a competitive advantage and would be useful in your environment? An example might be the ability to offload SSL processing from your web servers, giving you the dual benefits of increased performance on the server and the ability to monitor previously encrypted traffic for attacks.

Feature comparison: While UTM devices are capable machines, they usually began life as one thing, such as a firewall or IPS system, and added features over time to become an integrated security device. There is a good chance that the function the product started out with is comparable to a standalone device feature-wise.
However, as additional modules were developed, they may or may not have had to compete on their own merits. You should attempt to compare the features of each module you plan on using in the UTM device to standalone devices that perform the same functions, and see if the UTM device is missing any typical features that might indicate a lack of maturity.

[4] Universal Control, visited 8/30/2007, http://searchsecurity.techtarget.com/magazineFeature/0,296894,sid14_gci1259436,00.html

Vendor history and reputation: New technologies and companies enter the marketplace all the time. While many companies prefer to buy from an established company, that does not necessarily mean a UTM device that just hit the market is not the best one for your company. If it fills a real niche in your environment, you may be able to overlook the fact that a device has not been in as many networks or tested against both current and older vulnerabilities. On the other hand, security practitioners are regularly amazed by how old bugs come back to haunt you, just as when the Land attack, first discovered in 1997, turned up again in the IPv6 stacks of Windows XP service pack 2 and Windows 2003 server [5]. As we transition to an IPv6 world over the next several years, you can expect to see old vulnerabilities come back to life in new ways. You should try to verify whether a new device includes detection for such “old” attacks.

A vendor’s reputation in the marketplace may also serve as a good indicator of a device’s overall quality. Any company that has been around for a few years or more will have comments on its products out on the Internet. Go spend some time looking up stories on Google and newsgroups for your candidates.
You will probably turn up support horror stories, hardware failures and more bad news than good, but if one of the devices you are looking at has had real problems, it will tend to rise above the normal complaint level, such as Symantec’s recent support issues resulting from several large acquisitions [6]. You should also ask friends in the industry if they are familiar with any of the products you are considering.

References: You should ask your vendor for a list of three or more clients with similar network architectures who are using the device. Prepare a list of questions about how they are using the device, how it has performed, and any support issues they may have had. Extra weight should be given to references in your industry. If possible, go visit those clients and ask questions about performance, support or any aspect about which you have doubts.

[5] Wikipedia article, visited 9/3/2007, http://en.wikipedia.org/wiki/LAND
[6] When security firms merge, some users are losers, visited 9/8/2007, http://searchsecurity.techtarget.com/originalContent/0,289142,sid14_gci1244396,00.html

Selection Criteria (2)
• Risk analysis
• Modules for use in future
• Operating costs
• Necessary training for staff
• Signatures – proprietary?
• Roadmaps for future development

Risk analysis: In a perfect world you would have already performed a risk analysis for the processes and assets that your proposed device is supposed to be protecting. Now is the time to go back to that analysis and make sure that the device you are considering really fulfills the control functions for which it was intended.

Modules for future use: A UTM device may be the sole source of protection for an entire location. It may have modules for firewall, IPS, web filtering, email gateway antivirus, anti spam, VPN termination, and others.
Make sure you know which modules you will want to use now, as well as which modules you might implement in the future, including possible risks associated with them. You will also want to look at the ability to scan non HTTP traffic such as instant messaging and peer to peer traffic for threats.

Operating costs: Make sure you get the whole picture of what the ongoing costs of the device will be for the modules you are purchasing now and the ones you may use in the future. Find out if those costs include both software version upgrades and technical support. Verify that the technical support covers how you run your business. If you operate 24 hours a day, 365 days a year, then your support package should cover those hours. Attackers do not take holidays off. Sometimes they like to take advantage of the fact that many people will not be in the office to make their job easier.

Training: What kind of training is going to be necessary to bring your administrators up to speed on the new device? If they are already familiar with products from that company, then there may be less of a learning curve in implementing the new device, thereby reducing training expenses. Ask your vendor if they plan to provide onsite training and implementation support and, if so, how much it costs.

Signatures: Some vendors keep their signature databases proprietary and secret. The only control you have over the signatures is whether they are set to disabled, log or drop. Make sure the status of the signature database for potential devices is acceptable to your organization. You should be aware of whether or not you can add custom rules to the signature database for emergency protection while the vendor’s rules are being created and distributed for the latest threats. The lack of ability to do so may cause some real problems should you find yourself dealing with a zero day or targeted attack in the future. This also applies to all other filters on the machine such as email and web filtering.
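
To make the three per-signature settings (disabled, log, drop) and the value of a custom emergency rule concrete, here is a minimal sketch. It is not how any vendor’s engine works: real engines decode protocols and reassemble streams, whereas this example uses plain substring matching as a stand-in, and the `Signature` class, the `evaluate` function and the rule names are all hypothetical.

```python
from dataclasses import dataclass

ACTIONS = ("disabled", "log", "drop")


@dataclass
class Signature:
    name: str
    pattern: bytes         # simplistic stand-in for real rule logic
    action: str = "log"    # one of ACTIONS
    custom: bool = False   # True for a locally written emergency rule

    def __post_init__(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")


def evaluate(payload, signatures):
    """Return (verdict, matched_names); verdict is 'drop', 'log' or 'pass'."""
    verdict = "pass"
    matched = []
    for sig in signatures:
        if sig.action == "disabled":
            continue  # disabled signatures are never evaluated
        if sig.pattern in payload:
            matched.append(sig.name)
            if sig.action == "drop":
                verdict = "drop"   # drop outranks log
            elif verdict == "pass":
                verdict = "log"
    return verdict, matched
```

If a device does not accept custom rules, the `custom=True` case has no equivalent, and you depend entirely on the vendor’s release schedule during a zero day or targeted attack.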
Roadmaps: In security you are either keeping up with new threats or you are constantly falling further behind. Ask your vendor for short term and strategic roadmaps that show what threats they see as an issue. If they are not thinking about IPv6, you should be very worried about the service life of the device.

Testing Tools
• Fragroute
• Firewalk
• Port scanners/OS fingerprinting
– Nmap, p0f (p zero f), Xprobe2
• Password auditing

Before you can perform any testing, you must gather a variety of attack tools capable of testing your device and group them for use in the test procedures, which will be discussed later. The following are some of the most commonly used tools:

Fragroute: This tool comes in two variants. Fragrouter acts like a router: you pass your traffic through it to apply evasion techniques. Fragroute runs on the same machine as the one generating the test traffic. In either format, it can apply IDS/IPS evasion techniques based on the way different operating systems reassemble packet fragments. Especially successful are “tiny” fragments, where the header information is split into multiple packets, and fragment offset attacks, which may allow you to overwrite information from a previous packet due to differences in IP stack implementations. Please be aware there are some trojaned versions of this software out there masquerading as new releases.
Fragroute http://monkey.org/~dugsong/fragroute/ v1.3
Fragrouter http://www.w00w00.org/files/sectools/fragrouter/ v1.6

Firewalk: Firewall scanning tool. By running Firewalk against a known host behind a firewall, you may be able to reconstruct the ports that are allowed through the firewall. Special firewall configurations that do not allow any ICMP responses can hamper this tool’s effectiveness.
Firewalk http://www.packetfactory.net/projects/firewalk/

Port scanning and operating system fingerprinting tools: These tools allow you to find out an amazing amount of information about nearly anything with an IP address. A quick Nmap scan can reveal any TCP or UDP ports open on a host that are not blocked by some sort of firewall. Using some odd combinations of TCP flags and other information from a host’s response packets, Nmap may even be able to tell you what operating system a remote system is running by “fingerprinting” its IP stack. Nmap is also one of the first tools an attacker will use against your network (after Google), so you will want to know what they are going to see. Since firewalls can make the OS fingerprinting process somewhat unreliable, you will want to correlate the results with banner information from services and with other programs such as Xprobe2 or the passive OS fingerprinting tool p0f (P zero F).
Good introduction to Nmap: http://www.ethicalhacker.net/content/view/155/1/
Nmap http://insecure.org/nmap/
XProbe2 http://xprobe.sourceforge.net/
P0f http://lcamtuf.coredump.cx/p0f.shtml

Password auditing: It is an unfortunate fact that many systems can be logged into using the default credentials that shipped from the factory or with an easily guessed password. In order to check for these issues on your device, you will need to run a password auditing tool like THC-hydra, which can try to log in to multiple services including SSH, FTP and basic web authentication. You should also go through the manual, find the default passwords for your device, and add them to the password list to make sure that the device’s administrators have changed them. A version of THC-hydra is also included as the password guesser in the Nessus vulnerability scanner discussed later.
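
Adding the documented defaults to the password list is easy to script. The sketch below is only an illustration: `build_password_list` is a hypothetical helper, the example passwords are placeholders rather than any real device’s defaults, and how the resulting list is passed to a password auditing tool is deliberately left out.

```python
def build_password_list(default_passwords, wordlist):
    """Combine device defaults with a general wordlist.

    Defaults come first so they are tried early. Surrounding whitespace
    (such as newlines read from a file) is stripped, and blank entries
    and duplicates are dropped while the original order is preserved.
    """
    seen = set()
    combined = []
    for candidate in list(default_passwords) + list(wordlist):
        candidate = candidate.strip()
        if candidate and candidate not in seen:
            seen.add(candidate)
            combined.append(candidate)
    return combined
```

For example, `build_password_list(["admin", "changeme"], ["password", "admin"])` keeps a single copy of `admin` and places both defaults ahead of the general wordlist entries.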
THC-hydra http://freeworld.thc.org/thc-hydra/

Testing Tools (2)
• Vulnerability assessment tools
– Nessus, Core Impact, Metasploit, Nikto, Wikto
• Application testing
– SPI webinspect, Cenzic Hailstorm
• Traffic capture/generation
– Tcpdump, Snort, Wireshark
– Tcpreplay, Tomahawk, Netdude
– Karalon, Spirent Smartbits

Vulnerability assessment tools: There are many open source and commercial vulnerability assessment tools out there that will scan a host for services and applications that have known bugs or configuration issues that can be exploited to gain access to your system. In this case, the system we are concerned about is your new UTM system. Many UTM devices run a hardened version of BSD or Linux; a smaller number even run Windows. They may also use the Apache web server or PHP scripting engine for the administrative interface. All of these programs have had remotely exploitable security issues in the past, so you should run some tests to verify that you are not leaving a major security control open to attack. A few tools to consider when designing your tests:

C = Commercial, OS = Open Source

Nessus (C, OS) – One of the original vulnerability scanners, Nessus has both a free version and a commercial version. It is good at bringing up potential issues, but by design many tests do not try to actually break into the system to verify that the vulnerability is real. Just be aware that this means not everything on a Nessus scan report is something that has to be fixed. It also has a really useful feature: when scanning a production network you can disable certain checks that may cause running services to fail.
Nessus http://www.nessus.org/

Metasploit (OS) – As a framework, Metasploit was developed to make developing exploits easier by allowing you to separate the code that takes advantage of a vulnerability (the exploit) from the payload code that you run when you gain access to the system.
It also includes features such as shellcode polymorphism with several different encoding schemes to evade IDS/IPS detection [7]. Since these modules are all exploits, when you manage to successfully launch an attack with Metasploit, you know you have a problem.
Metasploit http://www.metasploit.com/

[7] Metasploit framework encoders, visited 9/7/2007, http://framework.metasploit.com/encoders/list

Core Impact (C) – This is an expensive, but very good tool for any sort of penetration testing. Like Metasploit, this product actually tries to exploit the system to verify the vulnerability actually exists. The interface and reduced collateral damage on the target system are two of the main reasons people pay for a tool like Core Impact instead of using a free one.
Core Impact http://www.coresecurity.com/

Wikto (OS)/Nikto (OS) – These tools are focused on finding web server vulnerabilities and configuration mistakes.
Wikto http://www.sensepost.com/research/wikto/
Nikto http://www.cirt.net/code/nikto.shtml

Application testing – As administrators have gotten better with firewall configurations and patching, attackers have spent more time attacking applications. Vulnerabilities like SQL injection, cross site scripting, and session hijacking can go straight through your carefully configured firewall to valuable data. If you have access to a testing tool such as SPI Webinspect or Cenzic Hailstorm, you could run it against your device to see if a poorly designed administrative interface is putting your UTM device at risk [8].

Real traffic capture/traffic generation: One of the most effective tests you can run on your new device is to run real captured traffic through it. Whether you are testing the throughput of the device, or its logging and alerting capabilities, real traffic flows from a production network are the best indicator of how a device will perform.
It is a bad idea to put a device under test into a production network just to expose it to production traffic, so you will need some tools to capture that traffic and replay it into your test network. One of the most important reasons for this is that an IPS module is configured to drop some malicious traffic. If the device misidentifies normal traffic as malicious, it will create a self-inflicted denial of service attack that can be hard to track down without actually examining the traffic at various points along the network path.

The first step in collecting real traffic is to use a packet sniffer to save the traffic to disk. By far the most universal capture format is pcap, and you can collect pcap files using several tools such as Tcpdump and Snort on the various Unix-like operating systems or Wireshark on Windows. Once you have captured the traffic, you may want to change the source or destination IP addresses using a tool like Netdude. Finally, you will need to put that traffic back into the test network using a tool such as Tcpreplay. Tcpreplay is actually a suite of tools that will split the captured traffic into client and server parts and then rewrite the parts of the packets necessary to properly pass through your test device.

Another traffic generator specific to IPS testing is Tomahawk. It can generate a large amount of traffic by running several attacks from different capture files in parallel or by using multiple machines to saturate a network. This ability to mix good traffic captures and attack traffic captures at will allows you to verify both the performance and the consistency of the device.

Snort http://www.snort.org
Wireshark http://www.wireshark.org
Netdude http://netdude.sourceforge.net
Tcpdump http://www.tcpdump.org
Tcpreplay http://tcpreplay.synfin.net/trac

[8] While using your port scanners and vulnerability tools, make sure you are monitoring logs and alerts from the firewall and IPS.
Since these are real attack tools, you will want to know what it looks like when your defenses are being probed.

Tomahawk http://www.tomahawktesttool.org/

Commercial traffic generators – In addition to Tcpreplay and real clients generating traffic, there are commercial programs and appliances such as Karalon Traffic IQ and Spirent Smartbits that allow you to generate test traffic loads. The Smartbits hardware is usually used for network testing, but if you have access to one, it can be used to generate clean traffic while attack traffic is generated on another device. The Karalon products are designed specifically to test security devices and include add-ons for features such as IDS/IPS evasion.

For additional tool ideas you may want to check the website http://sectools.org/. While not all of the tools on this website will be useful in testing your device, it will show you what kinds of tools people are using for generic security testing.

Testing Procedures
• Goal – validate features
• Phase 1 – Firewall only
  – Firewall, port scanners, vulnerability scanners
• Phase 2 – Firewall + IPS (defaults)
  – Add IPS testing tools, look for consistency
• Phase 3 – Firewall + IPS (tuned)
  – Same tools, look for DoS and rules issues

A successful test procedure should validate that the critical features you expect to be in the device are actually there and working as expected. While many features, such as the number of ports and power supplies, are easy to verify, others, such as throughput numbers, device hardening, and alerting and logging abilities, need a designed test procedure so that when the test device is put in the production network you have a reasonable expectation of how it will, or will not, perform. Just as you would when testing separate devices, introduce the modules to the test network one at a time.
Your best chance at isolating issues as they arise is to configure only one module at a time and add further modules as you continue the testing. In the next section we will start by describing the test plan for a UTM device running a firewall and IPS module, but you could continue by adding VPN, web filtering, spam and email antivirus scanning modules, etc., along with related tests, if you are planning to use them.

Test phase 1 – Firewall only
Many vendors publish a firewall-only specification for their devices, making this the easiest number to compare among them. Unfortunately, the device’s performance will likely decrease as more modules are enabled. In the first phase, your UTM device should be configured with the production firewall ruleset in place and all other modules disabled. Then you should use a variety of the tools discussed, including the port scanners, vulnerability scanners, Firewalk, and traffic generators, to test that the underlying OS and firewall module are working as expected. In all test phases you will want to aim appropriate traffic both at the device itself (to verify that the device OS is properly hardened) and through the device (to test throughput, alerting and logging). At a minimum you will want to verify:
• Firewall access control lists are configured appropriately from each subnet to each subnet
• The device does not pass malicious packets when it becomes congested
• The device handles normal traffic loads with low latency
• The device appropriately logs dropped traffic and failed logins
• No vulnerabilities are found from internal or external networks that would allow an internal or external threat to take control of the device

Test phase 2 – Firewall + IPS in default configuration
In the second phase of testing, you should start by enabling the IPS module of the device in its default configuration.
Most vendors are relatively conservative with their default configurations, as blocking their customers’ traffic is generally bad for business. This means the rules that are configured to drop traffic will be the ones that are very specific to attack traffic observed in the wild, and all other rules will be set to log or alert. You will now add the IPS-specific tools to the test plan, such as Tcpreplay or Tomahawk, but it would be best not to add the complication of IDS evasion tactics yet. In addition to all of the previous phase 1 test items you should now verify:
• The effect on latency of enabling a new module
• The new throughput capacity of the device
• That the IPS module is properly dropping, alerting on, or logging attack traffic that makes it past the firewall. This means not only that it recognizes all of the attacks you throw at it, but that it acts consistently. If you send 10 Metasploit RPC DCOM attacks into the traffic, you should get 10 drops, alerts, or log events, consistent with the way the detection rule is configured.

Test phase 3 – Firewall + IPS in tuned configuration
In the third phase of testing, you will implement your proposed production IPS ruleset. Again, this will increase the processing load on the device, so keep monitoring the latency and throughput numbers to make sure they stay in an acceptable range. In this test plan, you should make sure you are now including real, known good traffic from your production network. This is where you will find out whether your IPS rules are too strict, and what threat is posed by false positive rule matches creating the dreaded self-inflicted DoS. You may need to run this phase a few times, tweaking your ruleset each time to balance performance, security, and reliability. Now would also be a good time to research current threats and see how long it will take you to develop an emergency rule to alert on or drop traffic.
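The consistency check from phase 2 (10 attacks in, 10 events out) and the false positive check from phase 3 both come down to comparing what you replayed with what the device reported. A minimal sketch of that comparison follows. It assumes you have already parsed the device's logs into (flow label, action) pairs; that parsing is device specific and is not shown. The labels, counts, and the function name check_consistency are hypothetical and only illustrate the idea.

```python
from collections import Counter


def check_consistency(sent, observed):
    """Compare replayed traffic with the events the IPS reported.

    sent:     dict mapping a flow label to (count_sent, expected_action),
              where expected_action is "drop", "alert", "log", or "pass".
    observed: iterable of (flow_label, action) pairs taken from the
              device's logs (hypothetical, pre-parsed input).
    Returns a list of human-readable discrepancies (empty if none).
    """
    seen = Counter(observed)
    problems = []
    for label, (count, expected) in sent.items():
        if expected == "pass":
            # Known good traffic: any event at all is a false positive.
            hits = sum(n for (lbl, _), n in seen.items() if lbl == label)
            if hits:
                problems.append(f"{label}: {hits} false positive event(s)")
        else:
            # Attack traffic: every instance should trigger the configured action.
            got = seen[(label, expected)]
            if got != count:
                problems.append(
                    f"{label}: sent {count}, saw {got} '{expected}' event(s)"
                )
    return problems


if __name__ == "__main__":
    # Hypothetical test run: 10 attacks that should be dropped,
    # plus 500 known good flows that should pass silently.
    sent = {"rpc_dcom": (10, "drop"), "web_browsing": (500, "pass")}
    observed = [("rpc_dcom", "drop")] * 9 + [("web_browsing", "alert")]
    for line in check_consistency(sent, observed):
        print(line)
```

An empty result means the device acted consistently on the attack traffic and stayed silent on the known good traffic for that run; anything else is a candidate for rule tuning or a question for the vendor.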
You could also make sure that your incident handling team is ready to work with the new device, including receiving alerts and investigating them using the device logs. If the device provides a forensic logging capability, you should be able to view the traffic associated with an alert, and possibly see whether or not the attack was successful. In addition to all of the previous test items, you should now verify:
• Known good traffic is passing through the device with no false positive detections
• The performance of the device is acceptable, not only for external to internal traffic, but also for subnet to subnet local traffic such as web server to database traffic
• The device is able to block malicious traffic once it is detected
• You have tuned your ruleset down to the correct set of rules for the services and operating systems that will be located in each subnet

Testing Procedures (2)
• Phase 4 – Firewall + IPS (evasion)
  – See how your device copes with evasion techniques
• Phase 5 – Local traffic special cases
  – Will a backup kill your network?
• Additional phases if necessary
  – Continue adding things in a logical fashion

Test phase 4 – Firewall + IPS in tuned configuration with evasion tactics
In phase 4 it is time to enable any evasion tactics that you have elected to use, such as Fragrouter or a commercial tool, to attempt to hide your attacks from the IPS detection engine. There are many different types of evasion tactics, including fragment attacks, URL encoding, exploit polymorphism, and others. These tactics will show you how well your IPS handles tasks such as in-memory fragment reassembly and URL decoding, and whether it can do so in a timely fashion. During this phase you may notice some transactions having increased latency or packets being dropped for lack of resources.
It is one thing for only attack traffic to be affected, but your customers could be using browsers with foreign language sets and sending Unicode URL strings. If you want to be really thorough, you could use a packet creation tool such as Scapy, Rubyforger, or ISIC to send some truly nonstandard packets at a behavioral IPS and see what happens. You should continue to monitor all of the previous criteria at this point, paying special attention to the latency of good traffic as well as the detection rate of your attacks. No IPS is perfect given the performance requirements. You need to decide at this point whether your test device is performing well enough. If not, consider whether additional controls, such as a separate IDS, are justified.

ISIC http://isic.sourceforge.net
Scapy http://www.secdev.org/projects/scapy/
Rubyforger http://rubyforger.rubyforge.org

Test phase 5 – Special case local traffic tests
At this point your testing should have considered both external traffic and normal local subnet to local subnet traffic. In phase 5, you should monitor your production network for periodic high-bandwidth flows such as backup traffic. If your backup system is fast enough, it can nearly saturate a gigabit Ethernet connection. That kind of traffic will have a definite impact on the performance of all traffic flowing through the device. So, in this test, you should generate a normal traffic load with mostly known good traffic and then start a backup or any similar process you can identify. You may need to set up IPS exception rules for that traffic, or do other tuning, to maintain overall performance.

Additional phases if necessary
If you have additional UTM modules that you plan to implement, then you will want to continue adding them one at a time until you have reached the production state.
Even if these are all the modules you plan to use, you might want to find some tools to test your IPS’s ability to detect the outgoing traffic patterns of an internal infection.

Additional reading for developing tests
UTM device test from NSS http://www.nss.co.uk/certification/utm/nss-utm-v20-testproc.pdf
Firewall tests http://searchwindowssecurity.techtarget.com/generic/0,295582,sid45_gci1262328,00.html
Testing an IPS using Tcpdump as the traffic generator http://www.iv2-technologies.com/~rbidou/HowToTestAnIPS.pdf
An example test using Fragrouter http://searchsecurity.techtarget.com/magazineFeature/0,296894,sid14_gci1257037,00.html

Summary
• UTM devices are complex
  – Define your critical features
  – They need to fit your business
• Many UTM devices in the market
  – Define selection criteria
  – Does the UTM accomplish needed functionality? Are tests required?

UTM devices are extremely complex and have many features that need to be handled very carefully. For example, if the UTM device is placed as a distribution device and handles, inline, all the traffic between the core and access layers, it becomes a new point of failure. A single wrong firewall rule or traffic blocking directive could cause a disaster that requires recovery plans to be activated.

All critical features of a UTM device must be analyzed, and then a plan created for how to implement those features inside the network. Security analysts must also determine whether the device fits the company by checking that it addresses risks without causing collateral damage such as process delays, equipment faults, or usability problems for the company’s employees.

Once the required features are determined, you need to select the device that provides your company the best cost/benefit ratio for reducing risk. This must be determined by carrying out a test plan that reveals, as accurately as possible, the real capabilities of the device.
There are many tools available (open source and commercial) that can provide the required information.

In spite of the complexities, a properly chosen and implemented UTM device can replace two or more standalone machines along with their associated acquisition, maintenance and manpower requirements. These potential savings are driving many companies to investigate a UTM device for their network. With the information discussed, you should be able to successfully choose and validate a UTM device that fits in the only network that matters – yours.
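Every test phase described earlier asks you to keep watching latency and throughput as modules are enabled. A minimal sketch of that bookkeeping follows. It assumes you record one throughput and one latency measurement per phase and compare each against the firewall-only baseline and your own acceptance limits. All numbers, limits, and names (PhaseResult, review_phases) are hypothetical placeholders, not measurements from any real device.

```python
from dataclasses import dataclass


@dataclass
class PhaseResult:
    name: str               # e.g. "Phase 1 (firewall only)"
    throughput_mbps: float  # measured through the device
    latency_ms: float       # measured for known good traffic


def review_phases(results, min_mbps, max_latency_ms):
    """Return one report line per phase, relative to the first phase.

    The first entry in `results` is treated as the baseline
    (normally the firewall-only phase).
    """
    baseline = results[0]
    report = []
    for r in results:
        drop_pct = (
            100.0
            * (baseline.throughput_mbps - r.throughput_mbps)
            / baseline.throughput_mbps
        )
        ok = r.throughput_mbps >= min_mbps and r.latency_ms <= max_latency_ms
        verdict = "acceptable" if ok else "OUT OF RANGE"
        report.append(
            f"{r.name}: {drop_pct:.0f}% below baseline throughput, {verdict}"
        )
    return report


if __name__ == "__main__":
    # Hypothetical measurements and acceptance limits.
    results = [
        PhaseResult("Phase 1 (firewall only)", 900.0, 0.4),
        PhaseResult("Phase 2 (IPS defaults)", 600.0, 1.2),
        PhaseResult("Phase 3 (IPS tuned)", 400.0, 2.5),
    ]
    for line in review_phases(results, min_mbps=500.0, max_latency_ms=5.0):
        print(line)
```

Keeping the baseline explicit makes it easy to see how much capacity each additional module costs, which is the number to compare against the vendor's firewall-only specification.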