Troubleshooting Guide
Revision G

McAfee Network Security Platform 8.1

COPYRIGHT
Copyright © 2015 McAfee, Inc., 2821 Mission College Boulevard, Santa Clara, CA 95054, 1.888.847.8766, www.intelsecurity.com

TRADEMARK ATTRIBUTIONS
Intel and the Intel logo are registered trademarks of the Intel Corporation in the US and/or other countries. McAfee and the McAfee logo, McAfee Active Protection, McAfee DeepSAFE, ePolicy Orchestrator, McAfee ePO, McAfee EMM, McAfee Evader, Foundscore, Foundstone, Global Threat Intelligence, McAfee LiveSafe, Policy Lab, McAfee QuickClean, Safe Eyes, McAfee SECURE, McAfee Shredder, SiteAdvisor, McAfee Stinger, McAfee TechMaster, McAfee Total Protection, TrustedSource, VirusScan are registered trademarks or trademarks of McAfee, Inc. or its subsidiaries in the US and other countries. Other marks and brands may be claimed as the property of others.

LICENSE INFORMATION
License Agreement
NOTICE TO ALL USERS: CAREFULLY READ THE APPROPRIATE LEGAL AGREEMENT CORRESPONDING TO THE LICENSE YOU PURCHASED, WHICH SETS FORTH THE GENERAL TERMS AND CONDITIONS FOR THE USE OF THE LICENSED SOFTWARE. IF YOU DO NOT KNOW WHICH TYPE OF LICENSE YOU HAVE ACQUIRED, PLEASE CONSULT THE SALES AND OTHER RELATED LICENSE GRANT OR PURCHASE ORDER DOCUMENTS THAT ACCOMPANY YOUR SOFTWARE PACKAGING OR THAT YOU HAVE RECEIVED SEPARATELY AS PART OF THE PURCHASE (AS A BOOKLET, A FILE ON THE PRODUCT CD, OR A FILE AVAILABLE ON THE WEBSITE FROM WHICH YOU DOWNLOADED THE SOFTWARE PACKAGE). IF YOU DO NOT AGREE TO ALL OF THE TERMS SET FORTH IN THE AGREEMENT, DO NOT INSTALL THE SOFTWARE. IF APPLICABLE, YOU MAY RETURN THE PRODUCT TO MCAFEE OR THE PLACE OF PURCHASE FOR A FULL REFUND.

Contents

Preface
  About this guide
    Audience
    Conventions
  Find product documentation

1 Troubleshooting Network Security Platform
  Before you start troubleshooting
  Simplifying troubleshooting
  Issues and status checks for the Sensor
    Health check of a Sensor
    Failover status check of a Sensor
    Signature or software update status
    Download or upload status
    Check the traffic status of a Sensor
    Conditions requiring a Sensor reboot
    Sensor does not boot
    Sensor stays in bad health
    Debugging critical Sensor issues
    Sensor response if its throughput is exceeded
    Sensor latency monitor management
    Management of different types of traffic
    Sensor failover issues
    XC cable connection issues for M8000 Sensors
    XC cable connection issues for NS9300 Sensors
    External fail-open kit issues in connecting to the monitoring port
    Fail-open kit related issues
    Debugging issues with Connection Limiting policies
    Issues with Quarantine
  Issues and status checks for the Manager
    The Manager connectivity to the database
    MySQL issues
    Sensor not displayed in the resource tree
    The Manager fails to start
    The Manager interface does not work after JRE update
    Message on loading the Manager does not disappear
    Unable to log on to the Manager after typing credentials
    Portions of the interface do not load properly
    Prompt appears in Threat Analyzer to open or save a JNLP file
    Login button does not work
    When using Internet Explorer 9 Real Time Threat Analyzer file download gets into a loop
    The Manager client is unable to contact the Manager server when launching the Real Time Threat Analyzer
    Real Time Threat Analyzer has strange behavior
    Real Time Threat Analyzer security warning box keeps popping up
    Threat Analyzer UI stuck at downloading maps
    Many options are grayed out in Threat Analyzer menu
    Unable to get alerts in Historical Threat Analyzer
  Issues and status checks for the Sensor and Manager in combination
    Difficulties connecting Sensor and Manager
    Loss of connectivity between the Sensor and Manager
    DoS troubleshooting
  Issues and status checks for the Sensor and other devices in combination
    Connectivity issues between the Sensor and other network devices
  Integration Scenarios
    Global Threat Intelligence - API Overload
    ePO - Connection failure
    Vulnerability Manager - Connectivity issues
    Vulnerability Manager - Certificate Sync and FC Agent issues
    Logon Collector - Integration issues

2 Performance issues
  Sniffer trace
  Data link errors
    Half-duplex setting
    Full-duplex setting

3 Determine false positives
  Reduce false positives
  Tune your policies
    False positives and noise
    Determine a false positive versus noise

4 System fault messages
  Manager faults
    Manager critical faults
    Manager error faults
    Manager warning faults
    Manager informational faults
  Sensor faults
    Sensor critical faults
    Sensor error faults
    Sensor warning faults
    Sensor informational faults
  NTBA faults
    NTBA critical faults
    NTBA error faults
    NTBA warning faults
    NTBA informational faults

5 Error messages
  Error messages for RADIUS servers
  Error messages for LDAP server

6 Troubleshooting scenarios
  Network outage due to unresolved ARP traffic
  Delay in alerts between the Sensor and Manager
  Sensor-Manager Connectivity Issues
  Wrong country name in IPS alerts
  Wrong country name in ACL alerts

7 Using the InfoCollector tool
  Introduction
  How to run the InfoCollector tool
  Using InfoCollector tool
  Using the Log Analyzer tool
    Introduction
    Running the Log Analyzer
    Add a new customer case
    View summary of the Manager
    Create an Event Chart
    Search for a log file
    Managing log files in repository

8 Automatically restarting a failed Manager with Manager Watchdog
  Introduction
  How the Manager Watchdog works
  Install the Manager Watchdog
  Start the Manager Watchdog
  Use the Manager Watchdog with Manager in an MDR configuration
  Track the Manager Watchdog activities

9 Utilize of the McAfee KnowledgeBase

Index

Preface

This guide provides the information you need to configure, use, and maintain your McAfee product.

Contents
  About this guide
  Find product documentation

About this guide

This information describes the guide's target audience, the typographical conventions and icons used in this guide, and how the guide is organized.

Audience

McAfee documentation is carefully researched and written for the target audience. The information in this guide is intended primarily for:
• Administrators: People who implement and enforce the company's security program.
• Users: People who use the computer where the software is running and can access some or all of its features.

Conventions

This guide uses these typographical conventions and icons.

Book title, term, emphasis: Title of a book, chapter, or topic; a new term; emphasis.
Bold: Text that is strongly emphasized.
User input, code, message: Commands and other text that the user types; a code sample; a displayed message.
Interface text: Words from the product interface like options, menus, buttons, and dialog boxes.
Hypertext blue: A link to a topic or to an external website.
Note: Additional information, like an alternate method of accessing an option.
Tip: Suggestions and recommendations.
Important/Caution: Valuable advice to protect your computer system, software installation, network, business, or data.
Warning: Critical advice to prevent bodily harm when using a hardware product.

Find product documentation

After a product is released, information about the product is entered into the McAfee online Knowledge Center.

Task
1 Go to the McAfee ServicePortal at http://support.mcafee.com and click Knowledge Center.
2 Enter a product name, select a version, then click Search to display a list of documents.

1 Troubleshooting Network Security Platform

This section lists some troubleshooting tips for McAfee® Network Security Platform.
Contents
  Before you start troubleshooting
  Simplifying troubleshooting
  Issues and status checks for the Sensor
  Issues and status checks for the Manager
  Issues and status checks for the Sensor and Manager in combination
  Issues and status checks for the Sensor and other devices in combination
  Integration Scenarios

Before you start troubleshooting

Before you get too deep into troubleshooting techniques, it is good practice to consider the following questions:
• Were there physical changes to your network that occurred recently?
• If another device is placed in the Sensor's position, does that device receive traffic?
• If the Sensor is in L2 mode, are your network's services still affected?
• Are you using approved McAfee GBICs, SFPs, or XFPs with your Sensor? (For a list of approved hardware, see McAfee KnowledgeBase article KB56364. Go to http://mysupport.mcafee.com/Eservice/ and click Search the KnowledgeBase.)

Simplifying troubleshooting

When an in-line device experiences problems, most people's instinct is to physically pull it out of the path: to disconnect the cables and let traffic flow unimpeded while the device is examined elsewhere. McAfee recommends you first try the following techniques to troubleshoot a McAfee Network Security Sensor (Sensor) issue:
• All Sensors have a Layer2 Passthru feature. If you feel your Sensor is causing network disruption, before you remove it from the network, issue the following command:
    layer2 mode assert
  This pushes the Sensor into Layer2 Passthru (L2) mode, causing traffic to flow through the Sensor while bypassing the detection engine. Check to see whether your services are still affected; if they are, then you have eliminated certain Sensor hardware issues; the problem could instead be a network issue or a configuration issue.
  (The layer2 mode deassert command pushes the Sensor back to detection mode.)
• McAfee recommends that you configure Layer2 Passthru Mode on each Sensor. This enables you to set a threshold on the Sensor that pushes the Sensor into L2 bypass mode if the Sensor experiences a specified number of errors within a specified time frame. Traffic then continues to flow directly through the Sensor without passing to the detection engine.
• Connect a fail-open kit, which consists of a bypass switch and a controller, to any GE monitoring port pairs on the Sensor. If a kit is attached to the Sensor, disabling the Sensor ports forces traffic to flow through the bypass switch, effectively pulling the Sensor out of the path.
• For FE monitoring ports, there is no need for the external kit. Sensors with FE ports contain an internal tap; disabling the ports sends traffic through the internal tap, providing fail-open functionality.

Note that the Sensor needs to reboot to move out of L2 mode only if it entered L2 mode because of internal errors. (It does not need a reboot if the layer2 mode assert command was used to put the Sensor into L2 mode.) A Sensor reboot breaks the link connecting the devices on either side of the Sensor and requires the renegotiation of the network link between the two devices surrounding the Sensor. Depending on the network equipment, this disruption can range from a couple of seconds to more than a minute with certain vendors' devices. A very brief link disruption might occur while the links are renegotiated to place the Sensor back in in-line mode.

Issues and status checks for the Sensor

This section describes all issues and status checks specific to the Sensor.
Contents
  Health check of a Sensor
  Failover status check of a Sensor
  Signature or software update status
  Download or upload status
  Check the traffic status of a Sensor
  Conditions requiring a Sensor reboot
  Sensor does not boot
  Sensor stays in bad health
  Debugging critical Sensor issues
  Sensor response if its throughput is exceeded
  Sensor latency monitor management
  Management of different types of traffic
  Sensor failover issues
  XC cable connection issues for M8000 Sensors
  XC cable connection issues for NS9300 Sensors
  External fail-open kit issues in connecting to the monitoring port
  Fail-open kit related issues
  Debugging issues with Connection Limiting policies
  Issues with Quarantine

Health check of a Sensor

To see if your Sensor is functioning correctly, do one of the following:

On the Sensor:
• At the command prompt, type status. This displays system status (such as Operational Status, system initialization, signature version, trust, channel status, alert counts, and so on). The Sensor should be initialized and in good health.
• At the command prompt, type show. This displays configuration information (such as Sensor image version, type, name, Manager and Sensor IP addresses, and so on).

On the Manager:
• In the Manager Home page, view the Operational Status section. Manager status should be UP, and Sensor status should be ACTIVE. If you see system faults indicating that the Manager is down, see System Fault Messages to interpret the fault and, if necessary, take action to clear the fault.

Pinging a Sensor

The Sensor Management port responds to at most 20 pings per second. This rate limit protects the Sensor against ping floods. To ping a Sensor Management port from multiple hosts, increase the time interval between pings.
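The rate limit above implies a simple budget when several hosts ping the same Management port. The following Python sketch is illustrative arithmetic only. It assumes the 20-pings-per-second limit is shared across all hosts and that every host pings at the same fixed interval; the function name is hypothetical and is not part of any McAfee tool.

```python
SENSOR_PING_LIMIT_PER_SECOND = 20  # limit stated in this guide


def min_ping_interval_seconds(host_count: int,
                              limit_per_second: int = SENSOR_PING_LIMIT_PER_SECOND) -> float:
    """Return the smallest per-host ping interval (in seconds) that keeps the
    combined ping rate of all hosts at or below the limit (hypothetical helper)."""
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    if limit_per_second < 1:
        raise ValueError("limit_per_second must be at least 1")
    # Each host sends 1/interval pings per second, so
    # host_count / interval <= limit  =>  interval >= host_count / limit.
    return host_count / limit_per_second


if __name__ == "__main__":
    for hosts in (1, 20, 50):
        interval = min_ping_interval_seconds(hosts)
        print(f"{hosts} hosts: at least {interval} s between pings per host")
```

For example, under these assumptions 50 monitoring hosts would each need to wait at least 2.5 seconds between pings.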
Failover status check of a Sensor

To ensure that two Sensors comprising a failover pair are communicating via their interconnection cable, go to each Sensor's CLI and type show failover-status. Failover should display as enabled (YES), and the peer Sensor should display as UP.

Cable failover through a network device

Do not connect the heartbeat cable through an external network device. To keep overhead low and throughput high, the Sensors do not include layer 2 or 3 headers on the packets they pass over the heartbeat connection, and they pass data larger than the standard Ethernet maximum frame size (1518 bytes). If you attempt to place a network device, such as a switch or router, between the heartbeat ports, the heartbeat connection will fail.

Signature or software update status

To see if your Sensor successfully received a signature update or software upgrade, you can use the status command as shown in the following procedure, or the downloadstatus command, described later in this chapter.

To use the status command:

Task
1 On the Sensor, type status at the command prompt before updating the signature set on the Sensor. Note the signature version.
2 Update the signature set on the Sensor using the Manager screens.
3 On the Sensor, type status at the command prompt again after the update from the Manager is complete. Verify that the signature version number has incremented. The new signature version should match the signature set version that was updated from the Manager and applied to the Sensor.

Download or upload status

To see the progress of an upload or download, use the downloadstatus command.
The downloadstatus command displays the status of various download/upload operations: signature, software image, and DoS profile downloads (from Manager to Sensor) and DoS profile and debug trace uploads (from Sensor to Manager). It also lists the number of times you have performed the operation, the status of your previous attempt to perform the operation (including the cause of failure, if the operation failed), and the time the command was executed.

Do the following: On the Sensor, type downloadstatus at the command prompt.

Check the traffic status of a Sensor

Sensor statistics can be viewed in the Threat Analyzer by creating a new dashboard and choosing monitors that display different types of Sensor statistics. The available monitors are Sensor Flow Statistics, IP Spoofing Statistics, Packet Drop Statistics, Port Packet Drop Statistics, and Rate Limiting Statistics.

Task
1 Click Options | Dashboard | New to open the Create New Dashboard dialog.
2 Enter a name for the new dashboard in the Dashboard Dialog.
3 Click Assign Monitor to view the Assign Monitor Dialog.
4 Select the Assign an existing Monitor radio button.
5 Select Default Monitors against Category (these are the default choices).
6 Select Sensor Performance against Type to view the choice of Monitors for Sensor Performance in the Monitor choices box.
7 Select Statistics - Flows and click OK.
8 Select the Sensor for which you want to view flow statistics.
9 Click Refresh to view the flow statistics for the selected Sensor.
10 Follow a similar procedure and select other Monitors for Sensor Performance to view the relevant Sensor Statistics.

List of Monitors for Sensor Statistics
• Sensor Flow Statistics: Statistical view of the TCP and UDP flow data processed by a Network Security Sensor.
  Checking your flow rates can help you determine if your Sensor is processing traffic normally, while also providing you with a view of statistics such as the maximum number of flows supported as well as the number of active TCP and UDP flows.
• IP Spoofing Statistics: Statistics on the number of IP spoofing attacks detected by McAfee® Network Security Platform. Statistics are displayed per direction.
• Packet Drop Statistics: Packet drop rate on a Sensor. The statistics are displayed on a per-Sensor basis and include the number of packets dropped by the Sensor due to rate limiting set on the Sensor and due to sanity check failures.
• Port Packet Drop Statistics: Packet drop rate on a port.
• Rate Limiting Statistics: The estimated number of packets and bytes dropped by the McAfee Network Security Sensor (Sensor). You can view rate limiting statistics for each Sensor (per port) listed in the resource tree of the Manager.

Conditions requiring a Sensor reboot

The following situations either cause or require a Sensor reboot. You have two options for rebooting the Sensor: you can reboot it from the Manager interface, or you can issue the reboot CLI command. A Sensor reboot can take up to five minutes.
• Issuing the following CLI commands causes an automatic reboot of the Sensor:
  • resetconfig
  • deletesignatures
  • factorydefaults
  For more information on the Sensor CLI commands, see McAfee Network Security Platform CLI Guide.
• Changing the Sensor's management port IP address (IPv4 or IPv6) requires a manual reboot of the Sensor before the change takes effect.
• Certain internal software errors can cause the Sensor to reboot itself. See a description of Sensor fault messages later in this chapter. For more information on Operational Status Viewer, see McAfee Network Security Platform Manager Administration Guide.
• Enabling/disabling SSL requires a Sensor reboot.
• Enabling/disabling parsing and detection of attacks in IPv6 traffic passing through the Sensor monitoring port requires a manual reboot of the Sensor. In the Manager user interface, you can enable/disable parsing and detection of attacks in IPv6 traffic with the Scan IPv6 traffic for attacks option from the IP Settings tab (IPS Settings/<Device_Name> | Advanced Scanning | IP Settings). For more information, see Configuring IP Settings for IPv4 and IPv6 traffic, McAfee Network Security Platform IPS Administration Guide.
• Upgrading Sensor software requires a manual reboot of the Sensor.

Reboot a Sensor using the Manager

The Reboot Sensor action restarts a Sensor. You perform this action in the Manager interface.

To reboot a Sensor, do the following:

Task
1 Select <Admin_Domain_Name> | Device List | <Device_Name> | Physical Device | Reboot.
2 Click Reboot Now.

Reboot a Sensor using the reboot command

The reboot command restarts a Sensor. You perform this action in the Sensor CLI:

Task
1 At the prompt, type: reboot
2 Confirm reboot.

Sensor does not boot

If you cannot get the Sensor to boot, try the following:
• Check to ensure that the Sensor is powered on. Check the LEDs on the front of the Sensor.
• Check the front panel LEDs to ensure that the Sensor temperature is normal. For more information on Sensor LEDs, see the McAfee Network Security Platform Sensor Product Guide for your Sensor model.
• If you receive an error message in the CLI: "OS not found," you might have a corrupted internal flash. If you see this error, contact Technical Support to obtain help in recovering the Sensor.

Sensor stays in bad health

In certain instances, the Sensor stays in a bad or uninitialized health state indefinitely. The bad health of the Sensor could be due to a signature file download failure or an error while starting the Sensor.
You can perform the following high-level troubleshooting steps to trace the error:

1 Execute the following commands and check the output for any errors:
  • show
  • status
  • show sensor health
  • show startup stats
2 Check if the hardware is connected correctly.
3 Check the InfoCollector tool for logs and the configuration backup.
4 Check if the issue is due to signature file download failure. If it is due to the aforementioned error, contact McAfee Support for further assistance.
5 Execute the show startup stats debug CLI command and check the output for any errors.

    IntruDbg#> show startup stats
    Controller not ready to send INIT_ACKs to datapaths and dos.
    initial READY msg : not yet received from datapaths and dos
    dos has sent INIT_DONE.
    datapath0 has not sent INIT_DONE.
    datapath1 has sent INIT_DONE.
    datapath2 has not sent INIT_DONE.
    datapath3 has not sent INIT_DONE.
    datapath4 has sent INIT_DONE.
    datapath5 has sent INIT_DONE.
    datapath6 has sent INIT_DONE.
    datapath7 has sent INIT_DONE.
    dos has not sent READY.
    datapath0 has not sent READY.
    datapath1 has not sent READY.
    sb1cpu0 has not sent READY.
    sb1cpu1 has not sent READY.
    sb2cpu0 has not sent READY.
    sb2cpu1 has not sent READY.
    sb3cpu0 has not sent READY.
    sb3cpu1 has not sent READY.

6 Try to power cycle or netboot or reload the Sensor image.
7 Check if the issue is due to corrupt flash. Execute the flashcheck debug CLI command. Confirm that the output does not have any errors.

    Checking Flash may take more than 15 minutes and Sensor will go into Layer2 during command execution.
    Please enter Y to confirm:
    Checking Flash....
    Flash check successful. No errors in Flash

If the problem still persists, contact McAfee Support for further assistance.
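When the show startup stats output is long, it can help to list only the components that are still pending. The following Python sketch is one possible way to do that offline. It assumes the CLI output has been saved as text with one statement per line, in the "<component> has [not] sent <MESSAGE>." form shown in this guide; the function name and the sample variable are hypothetical and not part of the Sensor software.

```python
import re

# Sample lines copied from the example output in this guide.
SAMPLE_OUTPUT = """\
dos has sent INIT_DONE.
datapath0 has not sent INIT_DONE.
datapath1 has sent INIT_DONE.
datapath2 has not sent INIT_DONE.
dos has not sent READY.
sb1cpu0 has not sent READY.
"""

# Matches "<component> has [not] sent <MESSAGE>."
LINE_PATTERN = re.compile(r"^(\S+) has (not )?sent (INIT_DONE|READY)\.$")


def pending_components(output: str) -> dict[str, list[str]]:
    """Map each message type to the components that have not sent it yet."""
    pending: dict[str, list[str]] = {"INIT_DONE": [], "READY": []}
    for line in output.splitlines():
        match = LINE_PATTERN.match(line.strip())
        # Group 2 is "not " only when the component has not sent the message.
        if match and match.group(2):
            pending[match.group(3)].append(match.group(1))
    return pending


if __name__ == "__main__":
    for message, components in pending_components(SAMPLE_OUTPUT).items():
        print(f"{message} pending from: {', '.join(components) or 'none'}")
```

Lines that do not match the assumed pattern, such as the Controller status line, are ignored rather than treated as errors.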
Debugging critical Sensor issues

CLI commands in the debug mode improve the supportability of the Sensor and help in debugging critical issues. For more information on the CLI debugging commands, see the McAfee Network Security Platform CLI Guide.

Sensor response if its throughput is exceeded

Each Sensor model has a limited throughput. For example, the Network Security Platform M-2950 Sensor is rated at 1Gbps performance. With the Gigabit interfaces it is theoretically possible to cross the limit. What happens in this situation? Will it throttle the throughput to 1Gbps or will you just lose the IPS functionality for everything more than 1Gbps?

The answer is that the Sensor will drop packets irrespective of the TCP flow violation settings. There is also the latency monitor feature, with which the Sensor can inline-forward traffic without IPS inspection if it crosses the limit. There could also be false negatives, and the traffic might experience high latency.

It is very important that you stay within the operating parameters of the device you deploy. If you are actually running at gigabit speeds, you should probably be running an M-3050, M-4050, M-6050, M-8000, NS9100, NS9200, or NS9300 Sensor, all of which have a much higher throughput.

Sensor latency monitor management

All networks working from layer 2 through layer 7 experience some amount of latency. Latency monitor provides a means to reduce latency introduced by the Sensor when the amount of traffic seen on the network substantially exceeds the Sensor capacity.

Sensor latency can be due to various factors such as the policies configured, protocols, content, applications, type of traffic flowing through the Sensor, and so on. The Inspection Options Policies configured also add to the latency.
The following features consume Sensor resources, which results in latency:
• HTTP Response Traffic Scanning
• Advanced Malware Policies
• Traffic Inspection
• SSL decryption
• Advanced Botnet Detection

Latency can be reduced if the Sensor detects the latency condition. Whenever there is latency in the network, the Sensor performs the following functions:
• Raises an alert in the Manager whenever there is latency in processing the packets
• Mitigates latency by switching to layer 2 mode

Latency monitor is available in all M-series and NS-series Sensor models. When configured, the latency monitor feature tracks the time taken to process packets. If the number of packets with a high processing time exceeds the threshold, the Sensor treats this as a latency condition.

You can configure latency monitor in alert-only mode or layer 2 mode. When latency is detected in alert-only mode, an alert is raised in the Manager. If the feature is configured for mitigation, the latency is mitigated before an alert is raised in the Manager.

Latency monitor is disabled by default. Enable the feature only when the Sensor is introducing latency in the network. If the feature is left enabled, some attacks might not be detected by the Sensor.

To mitigate latency, the Sensor switches to layer 2 mode based on the sensitivity level configured. This takes less than a second after latency is detected. After latency is mitigated, the Sensor switches back to inline mode, depending on the time configured using the CLI command latency-monitor restore-inline. For example, if the latency-monitor restore-inline command is configured for 10 minutes, then the Sensor tries to switch back to inline mode (from layer 2) after 10 minutes.
If the Sensor is not configured to return to inline mode automatically, then it has to be manually restored to inline mode from layer 2 mode using the CLI command latency-monitor restore-inline.
Network Security Platform provides latency monitoring at three different sensitivity levels. The sensitivity level configured in latency monitor checks for latency in two stages:
Stage 1
• High sensitivity: Checks for latency in every incoming packet before processing.
• Medium sensitivity: Checks for latency in every alternate packet before processing.
• Low sensitivity: Does not check for latency.
In the above scenarios, if latency is not detected, the packets are forwarded for further processing to stage 2.
Stage 2
Once latency is detected, the packets are processed through multiple phases that take optimized measures internally to handle high latency. If latency is mitigated by this process, then the Sensor returns to normal processing. If latency is not mitigated, then the Sensor switches to layer 2 mode, if configured.
The time consumed for processing each packet is calculated when the packet is being processed by the Sensor. The calculations are based on the following parameters:
• Number of packets for which the latency is high
• Duration for which this latency condition persists
The duration for which the latency condition is monitored depends on the configured sensitivity level.
Latency is detected based on the following thresholds for the configured sensitivity level:
• High: if latency persists for 1/6th of a second for every 50 packets
• Medium: if latency persists for 2/6th of a second for every 100 packets
• Low: if latency persists for 3/6th of a second for every 150 packets
When latency is detected, the Sensor switches to latency management mode and tries to mitigate latency by optimizing processes. In this mode, the situation is continuously monitored to check if the latency is mitigated. Optimization of processes may include turning off attack detection, so that packets are forwarded without attack detection. If latency is not mitigated even after running the optimization processes, the Sensor switches to layer 2 mode, if enabled.
The following CLI commands for Oversubscription are deprecated:
• set oversubscription enable
• set oversubscription disable
• show oversubscription status
McAfee recommends that you use latency monitoring instead.

Enable latency monitor
You can use the following CLI commands to enable latency monitor, set its sensitivity level, and check its status:

latency-monitor enable action
Enables latency monitoring in the Sensor and also specifies the action to be performed if high latency is observed in the Sensor.
The following are the actions that can be specified in this command:
• alert-only (generates an alert when a high latency is observed in the Sensor)
• put-in-layer2 (generates an alert and also forwards the traffic to layer 2)
Alerts that are generated can be seen in the System faults page in the Manager.
Syntax:
latency-monitor enable action <alert-only | put-in-layer2>
This command must be executed with a parameter value; otherwise, the command is treated as invalid.
If layer2-forward is enabled, the layer 2 mode must be set to on. Otherwise, the layer2-forward action is not executed.
Example:
latency-monitor enable action alert-only

latency-monitor sensitivity-level
Configures the sensitivity level for latency management.
Syntax:
latency-monitor sensitivity-level high
latency-monitor sensitivity-level medium
latency-monitor sensitivity-level low

latency-monitor restore-inline
When a high latency is observed on the Sensor and the latency monitor is configured, the Sensor remains in layer 2 until a layer 2 mode deassert is invoked or the Sensor reboots. This command allows the Sensor to come out of layer 2 mode without a layer 2 deassert.
The Sensor restores to inline from layer 2 if the following conditions are met:
• The latency monitor has put the Sensor in layer 2 mode.
• The Sensor is in good health. If the Sensor is in bad health, a deassert cannot be performed and the Sensor reboots.
• The time configured using this command has elapsed since the Sensor went into layer 2 due to latency. The default time to trigger an automatic layer 2 deassert is 10 minutes.
If the latency continues to exist after the Sensor is restored to inline mode, the Sensor behaves according to the current setting of the latency monitor.
Syntax:
latency-monitor restore-inline enable <10-60>
latency-monitor restore-inline disable
Parameter    Description
<10-60>      The time in minutes to trigger the restore inline from layer 2. It is counted from the time the Sensor moved into layer 2 state due to high latency.
The latency-monitor status command displays the status of the latency monitor feature, and the status of the restore-inline feature of the latency monitor.

latency-monitor
Disables the latency monitoring feature or displays the status of the latency monitoring feature.
Syntax:
latency-monitor <disable | status>
Default Value: Latency monitoring feature is disabled by default. If disabled, the latency monitoring feature neither generates an alert nor forwards the traffic to layer 2 when high latency is observed.
If latency monitoring is enabled, the following information is displayed:
• latency monitoring status (enable or disable)
• configured action (alert-only or layer2-forward)

Management of different types of traffic
Non-ethernet frames are forwarded without inspection. The following are the types of special traffic:
• Jumbo Ethernet frames
• ISL frames
See also
Jumbo ethernet frames
ISL frames

Jumbo ethernet frames
Sensors respond differently to jumbo frames based on which ports receive them. Inspection is available for jumbo frames only for M-3050, M-4050, M-6050, and M-8000 Sensors.
• 10/100 (FE) ports: Jumbo frames are not supported. When a 10/100 port receives a jumbo frame, the frame is dropped.
• 1000 (GE) port: The frame is passed through the Sensor, but is not subjected to IPS inspection.

ISL frames
All McAfee Network Security Sensor (Sensor) models (running all Sensor software versions) pass ISL frames through the Sensor without IPS inspection.

Sensor failover issues
Checking the following connections and settings might resolve Sensor failover issues.
• The Sensor model and Sensor image version on both the peer Sensors should be the same.
• The Sensor license and IPv6 status should be identical on the peer Sensors.
• Identify the interconnect port for the selected model, because the interconnect ports vary for different models.
• Check the FO type setting on the Sensor. The failover creation fails if the FO type is set on the primary Sensor.
• The Sensor health status should be good and normal.
XC cable connection issues for M8000 Sensors
XC cable connection issues can occur in the M8000 Sensors due to improper cabling of the XFP interconnect ports (XC2, XC3, XC5, and XC6). If you face such issues, check the following connections in the M8000 Sensors.
• One end of an LC-LC fiber-optic cable should be plugged into the XC2 port of the primary Sensor and the other end into the XC5 port of the secondary Sensor.
• One end of an LC-LC fiber-optic cable should be plugged into the XC3 port of the primary Sensor and the other end into the XC6 port of the secondary Sensor.

XC cable connection issues for NS9300 Sensors
XC cable connection issues can occur in the NS9300 Sensors due to improper cabling of the interconnect ports (G0/1, G0/2, G4/1, and G4/2). If you face such issues, check the following connections in the NS9300 Sensors.
• One end of an LC-LC fiber-optic cable should be plugged into the G0/1 port of the primary Sensor and the other end into the G4/1 port of the secondary Sensor.
• One end of an LC-LC fiber-optic cable should be plugged into the G0/2 port of the primary Sensor and the other end into the G4/2 port of the secondary Sensor.

External fail-open kit issues in connecting to the monitoring port
External fail-open kit issues can occur due to disconnection of network device cables, improper cabling, or improper port configuration. Checking the following connections might resolve the issue.
• Ensure that the cables are properly connected to both the network devices and the Bypass Switch.
• Ensure that the transmit and receive cables are properly connected to the Bypass Switch.
Fail-open kit related issues
Issues related to the fail-open kit in the customer's environment
Applicable to Sensor models: M-series, NS-series

Problem scenarios
1 Passive fail-open does not bypass even though the fail-open kit Sensor is down/Sensor is rebooted
2 Passive fail-open does not come up and continuously flaps
3 Active fail-open does not come up and continuously flaps
4 Active fail-open to Sensor link flaps continuously

Data/Information Collection
1 Execute the following commands in the Sensor:
• show
• show inlinepktdropstat <port>
• status
• show sensor-load
• show intfport <port> (multiple times)
2 Check the following details:
• Active fail-open type (model) and configuration
• Cables and SFP type
• Physical connection details (network topology)
• Peer device port configuration
3 Trace the Sensor files.
4 Check the infoCollector tool for the logs, including the configuration backup. (This is optional, and is needed only if the issue has to be reproduced locally.)

Following are the troubleshooting steps for the various problem scenarios:

Problem 1: Passive fail-open does not bypass even though the fail-open kit Sensor is down/Sensor is rebooted
1 Check if the Sensor is up and in good state.
2 In the Physical Ports page of the Manager, check the following configurations:
• Port is configured to Inline Fail-Open Passive.
• Appropriate media is selected, Copper/Fiber.
• Auto-Negotiate is selected.
3 If the peer device port does not support MDIX, use an appropriate cable to bring up the link during the Sensor bypass. If it does not work, check the Passive Fail-Open Kit for any hardware issues.
4 While using the Passive Fail-Open Kit, make sure to disable STP on the peer device ports to avoid auto renegotiation.
While using the Passive Fail-Open Kit, each Sensor port individually negotiates with the peer port initially when the Sensor is in inline mode. When the Sensor goes to bypass mode, the peer device ports renegotiate with each other. Make sure to enable Portfast on the peer devices to minimize network outage.

Problem 2: Passive fail-open does not come up and continuously flaps
1 Check if the Sensor is up and in good state.
2 In the Physical Ports page of the Manager, check the following configurations:
• Port is configured to Inline Fail-Open Passive.
• Appropriate media is selected, Copper/Fiber.
• Auto-Negotiate is selected.
• Appropriate cable is used. The cable type should be Cat5e or above for copper, and single-mode or multi-mode for fiber, depending on the SFP used.
3 Check the control cable connection and the right controller port.
4 Check if the SFPs are according to McAfee's recommendations.
5 Check for bad/defective cables and SFPs.
6 Check if the peer device port is working and if the port settings are set to Auto-Negotiate.
7 Perform local port testing (by connecting monitoring ports back to back).
8 Swap in the working SFP and cables from another port pair.
9 If all the above steps fail, RMA the Sensor.

Problem 3: Active fail-open does not come up and continuously flaps
1 Check if the Sensor is up and in good state.
2 Use McAfee recommended transceivers (normal SFP for 1G, SFP+ for 10G, and QSFP for 10G ports).
3 Check the Active Fail-Open Kit monitoring port settings (specifically the Auto-Negotiate and speed settings). They should be the same as on the Sensor monitoring ports and the peer device.
4 Perform local loopback port testing (by connecting monitoring ports back to back).
5 Swap in the working SFP and cables from another port pair.
6 Check the load on the Sensor.
7 If all the above steps fail, RMA the Sensor.
Steps to Configure and Debug active fail-open
When configuring the Active Fail-Open Kit, in case of flapping issues, the configuration on the network peer ports must match the one on the Active Fail-Open Kit-Sensor monitoring port pair.
1 Ensure that the power to the Optical Bypass Switch is on.
2 Using a DB-9 RS232 programming cable, connect a PC that is running HyperTerminal to the Optical Bypass Switch.
3 Launch terminal emulation software such as HyperTerminal, and set the following communication parameters:
• Bits per second: 19200
• Flow control: None
• Stop bit: 1
• Parity: None
• Data bits: 8
4 Click OK. The CLI banner and login prompt are displayed.
5 Type the default username and password. (The default username and password are both McAfee, and are case sensitive.)
6 Once you are logged in, use the commands in the following table to configure and troubleshoot the Active Fail-Open Kit:

Command    Description
a          Set the timeout value. To set the Timeout value, do the following:
           • Type a and press Enter.
           • TimeOut period (1-254 sec). Type the number of seconds between each heartbeat (1-254 seconds) and press Enter. Default = 1.
           • Retry Count (1-254). Type the number of missed heartbeats allowed before the Bypass Switch enters the On mode. Default = 3.
           The Retry Count must be greater than or equal to the Timeout period.
b          Set Switch parameters. To set speed duplex and auto-negotiation, LFD, bypass detect:
           • 1 = turn On.
           • 0 = turn Off.
           • Fail Mode Open/Close = 1
           The LFD and Bypass detecting mode settings cannot be changed.
c          Set TAP mode.
           • Type c and press Enter.
           • Type 1 to set the tap mode On or 0 to set the tap mode Off. Default = Off.
d          Show configuration. Type d and press Enter.
           The following is displayed:
           • LFD = On
           • Fail Mode = Open
           • Timeout Period = 1
           • Bypass State = Off
           • Bypass Detect = Off
           • TAP Mode = Off
           • Retry Count = 3
e          Show port status. Type e and press Enter. The following is displayed:
           • Port A = Up/Down
           • Port B = Up/Down
           • Port 1 = Up/Down
           • Port 2 = Up/Down
f          Set Switch name.
           • Type f and press Enter.
           • At the prompt, type the Switch name, which can be 8 characters long.
z          Reset to Factory Defaults.

Problem 4: Active fail-open to Sensor link flaps continuously
1 Check if the Sensor is up and in a good state.
2 Use McAfee recommended transceivers (normal SFP for 1G, SFP+ for 10G, and QSFP for 10G ports).
3 Check the Active Fail-Open Kit monitoring port settings (specifically the Auto-Negotiate and speed settings). They should be the same as on the Sensor monitoring ports and the peer device.
4 Check the Sensor ports (by connecting monitoring ports back to back).
5 Swap in the working SFPs and cables from another working port pair.
6 Swap in a working Active Fail-Open Kit to confirm whether a hardware problem exists.
7 Check the load on the Sensor to determine whether Sitera is dropping the heartbeat (HB) packets from the Active Fail-Open Kit. To test if Sitera is dropping the HB packets, contact McAfee Support for further assistance.

Debugging issues with Connection Limiting policies
Connection Limiting policies consist of a set of rules that enable the Sensors to limit the number of connections a host can establish, or the connection rate. This section provides troubleshooting steps to resolve a few issues with Connection Limiting policies.

Before you begin
Check that the Connection Limiting policy is correctly configured.
• You can configure the Connection Limiting policy with the monitoring ports in SPAN, tap, or inline modes. The response actions differ for SPAN and tap modes.
In these modes, the Sensor cannot block the connections or quarantine the hosts.
• The connections are limited based on the predefined threshold value. The threshold value is defined as connections per second or active connections. For example, if you define 1 connection per second as the threshold value, then 10 connections are allowed per 10 seconds. So, if there are 10 connections in the first second, all other connections from the second to the tenth second are dropped. On the other hand, if you have 1 connection in each second, all the 10 connections until the tenth second are allowed.
• A Connection Limiting rule based on Protocol applies to both IPv4 and IPv6 traffic. A Connection Limiting rule based on McAfee GTI applies only to IPv4 traffic. GTI does not support IPv6 traffic.
• The Connection Limiting alert raised is IP: Too many TCP/UDP/ICMP sessions. This alert is present in the IPS Policies.
Perform these steps to configure a basic Connection Limiting policy.

Task
1 Go to Policy | Intrusion Prevention | Connection Limiting Policies and select Connection Limiting Rules.
2 Click New and configure the rule properties.
3 In Connection Limiting Rules, set parameters such as state, direction, and response.
Figure 1-1 Connection Limiting Rule
4 Go to Devices | Devices | <Device_name> | IPS Interfaces.
5 From the interfaces, select <interface_name> | <subinterface_name> | Protection Profile | Connection Limiting Policy.
6 Select the Assign a Connection Limiting Policy? checkbox.
7 Select the required Connection Limiting policy on the Sensor interface and click Save.
Make sure the IP: Too many TCP/UDP/ICMP sessions alert is enabled in the IPS policy that is applied on the Sensor interface.
8 Deploy the configuration changes to the Sensor.
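The threshold example in the Before you begin notes (1 connection per second, accounted as 10 connections per 10 seconds) can be illustrated with a short Python sketch. It assumes a single fixed 10-second window and a simple counter. This only illustrates the arithmetic of the example; it is not the Sensor's actual algorithm, and the function and constant names are hypothetical.

```python
# Illustration of the documented example only. The fixed 10-second window
# and the counter-based accounting are assumptions, not Sensor internals.
WINDOW_SECONDS = 10


def classify_connections(arrival_times, rate_per_second, window=WINDOW_SECONDS):
    """Return (time, allowed) pairs for connections inside one window.

    arrival_times: seconds since the start of the window.
    rate_per_second: configured threshold in connections per second.
    """
    budget = rate_per_second * window  # for example, 1/sec -> 10 per 10 sec
    used = 0
    results = []
    for t in sorted(arrival_times):
        allowed = used < budget
        if allowed:
            used += 1
        results.append((t, allowed))
    return results


if __name__ == "__main__":
    # Ten connections inside the first second, then three later attempts.
    burst = [0.05 * i for i in range(10)] + [2.0, 5.0, 9.0]
    for t, allowed in classify_connections(burst, rate_per_second=1):
        print(f"t={t:4.2f}s  {'allowed' if allowed else 'dropped'}")
```

With the burst input, the first ten connections use up the window's budget and the later three are dropped. With one connection in each second, all ten fit within the budget, which matches the two cases described in the example.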
Troubleshooting Connection Limiting issues
After Connection Limiting policies are configured, you might see issues like:
• No alerts are raised in the Manager
• Excess packets are not dropped or denied
• Hosts are not quarantined
Connection Limiting rules can be configured with the response types Alert only, Alert & Drop Excess Connections, Alert & Deny Excess Connections, and Alert & Quarantine.
Perform these steps to troubleshoot issues like alerts not raised in the Manager, excess packets not dropped or denied, or hosts not quarantined after reaching the threshold value.
1 Make sure that the Connection Limiting policy rules are configured and applied to the Sensor interface.
2 From the Sensor CLI, run the show inlinepktdropstat all command and check the Conn Limiting Pkt Drop Count. If the count is 0, the configured threshold value has not been reached. Alerts are triggered in the Manager only when the threshold value is reached.
3 Check whether the incoming traffic rate to the Sensor meets the Connection Limiting rule's threshold value. If it does not meet the threshold value, send traffic at the corresponding rate.
4 Set a lower threshold value and check the active connections or connections per second.
5 Check if there is any firewall ignore rule for the source IP address configured in the Connection Limiting rule. Go to Policy | Firewall Policies | <Firewall Policy> | Access Rules to check if the Response for the source IP address is set to Stateless Ignore or Ignore.
6 Check if the source IP address configured in the Connection Limiting rule is part of the Quarantine Exceptions list. Go to Devices | Global | IPS Device Settings | Quarantine | Default Port Settings to check if the source IP address is quarantined.
Considerations for GTI connection limiting and XFF feature
When you configure GTI and XFF for a connection limiting rule:
• The Sensor cannot perform GTI lookup on the XFF IP address. That is, GTI-based connection limiting does not work when the XFF feature is enabled.
• When the XFF feature is enabled, the Sensor expects all HTTP flows to have XFF data in the HTTP header.
• The Sensor supports connection limiting on XFF only for protocol-based connection limiting.

Alert Detection Matrix
The table summarizes how alerts are detected based on the connection limiting type and the XFF feature configuration.

Connection      XFF             XFF or Non XFF tag       Proxy IP      XFF IP        Alert
limiting type   configuration   traffic sent to Sensor   reputation    reputation    detection
Protocol        Disabled        Without XFF              -             -             Yes
Protocol        Enabled         With XFF                 -             -             Yes
Protocol        Enabled         Without XFF              -             -             No
GTI             Disabled        Without XFF              -             -             Yes
GTI             Enabled         With XFF                 Low risk      High risk     No
GTI             Enabled         With XFF                 High risk     Low risk      No
GTI             Enabled         Without XFF              -             -             No

Issues with Quarantine
Network Security Platform enables you to quarantine your network hosts when required. There are two ways to quarantine hosts:
• Configure the Sensor to quarantine hosts automatically when they generate specific attacks.
• Manually quarantine specific hosts that are listed in the Real-time Threat Analyzer.
You might see these issues while quarantining:
• The Real-time Threat Analyzer quarantine list does not have a host entry, but the host is stuck.
• The Real-time Threat Analyzer has a host that is not deleted after the expiry time. You might also see an error when manually deleting a host from the Threat Analyzer.
To confirm if it is a quarantine issue, put the Sensor in Layer 2 or add the host IP address to the Quarantine Exceptions list and check if the issue is resolved. If the issue is not resolved, contact McAfee Support.
Issues and status checks for the Manager
This section describes issues and status checks specific to the Manager.
Contents
The Manager connectivity to the database
MySQL issues
Sensor not displayed in the resource tree
The Manager fails to start
The Manager interface does not work after JRE update
Message on loading the Manager does not disappear
Unable to log on to the Manager after typing credentials
Portions of the interface do not load properly
Prompt appears in Threat Analyzer to open or save a JNLP file
Login button does not work
When using Internet Explorer 9 Real Time Threat Analyzer file download gets into a loop
The Manager client is unable to contact the Manager server when launching the Real Time Threat Analyzer
Real Time Threat Analyzer has strange behavior
Real Time Threat Analyzer security warning box keeps popping up
Threat Analyzer UI stuck at downloading maps
Many options are grayed out in Threat Analyzer menu
Unable to get alerts in Historical Threat Analyzer

The Manager connectivity to the database
If the Manager loses connectivity to the database (that is, the database goes down), the alerts are stored in a flat file on the Manager server. When the database connectivity is restored, the alerts are stored in the database.

The Manager database is full
We recommend that you monitor the disk space on a continuous basis to prevent this from happening. If the Manager database or disk space is full, the Manager will be unable to process any new alerts or packet logs. In addition, the Manager might not be able to process any configuration changes, including policy changes and alert acknowledgement. In fact, the Manager might stop functioning completely.
To rectify this situation, perform maintenance operations on the database, including deleting unnecessary alerts and packet logs.
Furthermore, reevaluate database capacity planning and sizing, and monitor free space proactively.
The Manager is designed with various file and disk maintenance functions. You can archive alert and packet log data and then delete the data to free up disk space. The Manager also provides a standalone tool for creating database backups that can be archived for emergency restoration.
The Manager also provides disk maintenance alerts, which send proactive system fault messages when the Manager disk space reaches a threshold of 51%. The Manager generates the disk space warning fault for disk space utilization. The severity of this fault changes as the percentage of disk space utilization increases.

The Manager database fails to start
Below are some of the reasons for, and checks on, the Manager database failing to start.
• The Manager database process is already running. This can be checked by opening Windows Task Manager and looking for mysqld.exe with a memory footprint of hundreds of MB.
• Start the service "McAfee Network Security Manager Database" from the services window. If the service has not started, check for the reason of failure in the <DBInstalldir>\data\<hostname>.err file.
• In the command prompt, navigate to <DBInstalldir>\bin and run "mysqld --console" manually. For a successful startup, a message similar to the following is displayed:
  130626 12:05:04 [Note] mysqld: ready for connections.
  Version: '5.5.31-enterprise-commercial-advanced-log' socket: '' port: 3306 MySQL Enterprise Server - Advanced Edition (Commercial)
  The version number and commercial license definition vary across Manager versions.
  To close the successful startup session, press CTRL-C.
  For an unsuccessful startup, the process shuts down abruptly and reports the error.
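The check of the database startup output can be sketched in Python as follows. This is a minimal sketch, not a McAfee tool: it looks for the "ready for connections" message quoted above and collects lines tagged [ERROR]. The [ERROR] tag is an assumption based on the [Note] tag style in the sample output, the function names are hypothetical, and the second sample line is made up for illustration.

```python
def summarize_err_log(lines):
    """Return (ready, errors) for an iterable of database log lines.

    ready:  True if a "ready for connections" message was seen.
    errors: lines carrying an [ERROR] tag (assumed tag format).
    """
    ready = False
    errors = []
    for line in lines:
        if "[ERROR]" in line:
            errors.append(line.strip())
        elif "ready for connections" in line:
            ready = True
    return ready, errors


def summarize_err_file(path):
    """Read a <hostname>.err file and summarize it."""
    with open(path, "r", errors="replace") as handle:
        return summarize_err_log(handle)


# First line is the message quoted in this guide; the second is hypothetical.
SAMPLE = [
    "130626 12:05:04 [Note] mysqld: ready for connections.",
    "130626 12:09:41 [ERROR] example failure text (made up for illustration)",
]

if __name__ == "__main__":
    ready, errors = summarize_err_log(SAMPLE)
    print("ready for connections seen:", ready)
    for entry in errors:
        print("error:", entry)
```

To use it against a real installation, pass the path of the <hostname>.err file in the database data directory to summarize_err_file.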
If an unexpected database service shutdown occurs, check the <hostname>.err file for the possible reason. Also, during an unexpected shutdown, MySQL creates a minidump (mysqld.dmp) in the data directory. If required, this file can be used for further analysis.

MySQL issues
The following are common symptoms of corrupt database tables:
• .MYI or .MYD errors reported in the ems.log file.
• Inability to acknowledge or delete faults in Operational Status.
• When trying to view the packet log for an alert in the Threat Analyzer, you receive the error message: No Packet log available for this alert at this time
If you think that your MySQL database tables have become corrupt, follow the instructions on verifying your tables, which are available in McAfee KnowledgeBase article KB60660. (Go to http://mysupport.mcafee.com, and click Search the KnowledgeBase.)

Sensor not displayed in the resource tree
After adding the Sensor and establishing trust, if the Sensor is not displayed in the resource tree, perform the following steps for troubleshooting:
Task
1 Capture traffic using Wireshark on the Manager.
2 Check if the Manager is receiving UDP response packets from the Sensor.
3 Configure the firewall to allow UDP traffic if response packets are not arriving.
4 Check if the Manager machine has multiple NICs. If yes, open <NSM_INSTALL_DIR>/bin/tms.bat and modify the following line to assign a relevant IP address that is also used in the Sensor configuration:
5 Set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPIPaddress=""
6 Restart the Manager.
You can enable detailed debugging messages by modifying the <NSM_INSTALL_DIR>/config/log4j_ism.xml file. Add the following lines, or change them if they already exist:
• <category name="iv.core.DiscoveryService"> <priority value="DEBUG"/></category>
• <category name="iv.core.SensorConfiguration"> <priority value="DEBUG"/></category>

The Manager fails to start
Below are some of the common reasons for the Manager failing to start:
• The Manager Java process is already running. This can be checked by opening Windows Task Manager and looking for a java.exe with a memory footprint of hundreds of MB. Alternatively, install Sysinternals' Process Explorer from HTTP://TECHNET.MICROSOFT.COM/EN-US/SYSINTERNALS/BB896653 to locate the Java process. If found, as indicated in the following image, the process should be ended.
Figure 1-2 Check if Manager Java process is already running
• In the command prompt, navigate to <NSMInstallationDirectory>/bin and run tms.bat manually. Then check for the conditions below.
• One of the TCP ports that the Manager binds to is in use. Use netstat -nab to list all ports in use. These netstat options also identify the executable that is bound to the port; that executable should be stopped.
Figure 1-3 Check netstats
• Check whether the logged-on user on the server has permissions to launch the McAfee Network Security Manager service. This can be found by right-clicking the service, selecting Properties, and then the Log On tab. If the logged-on user does not have permission to run a local service, the Manager does not start.
Figure 1-4 Check user permissions
• The server does not have enough RAM. The tms.bat file has a -Xmx<MaxHeap> setting that specifies, in MB, the Java heap to be allocated to the process. If the server does not have that much RAM, the process does not start.
• On 32-bit machines in particular, when heap exhaustion occurs, you might try to increase the maximum heap setting on the assumption that the full 2000 MB is available. However, stack space and native libraries share memory in the same 2000 MB space, so the Java heap cannot be higher than 1170 MB. On a 32-bit machine, check that the -Xmx setting is not greater than 1170 MB.
• The process fails to start with a classloader exception such as ClassNotFound. This typically indicates issues with the Manager installation. A fresh installation or upgrade, as appropriate, should resolve the issue.
Tasks
• Analyze memory-related issues

Analyze memory-related issues
Memory-related issues occur in the Manager when the amount of heap space allocated by the operating system, based on the JVM options (-Xms, -Xmx) specified in tms.bat, is not enough for the application to continue to behave in the desired manner.
Typical symptoms include:
• Application not being responsive: CPU usage of the Java process is high.
• Application crashing or terminating.
• Communication channels flapping between the Manager and the device: channel connections are reset frequently.
• Application not being able to start.
The following logs are required for analysis:
• InfoCollector logs (mainly ems, emsmem, aqcount, slowquery, and the DB err file).
• Thread stack traces and CPU usage collected using stack trace, and live objects in heap memory space collected using the heap dump tool.
Collect these logs before restarting the application (a restart is usually done to restore the application), unless it is a recurring issue. The heap dump tool and stack trace do not require a restart. In most cases the memory leak might not be reproducible, and without these logs a root cause analysis (RCA) would be extremely challenging.
Task
1 Establish that the JVM has experienced memory overload. To do this, search the InfoCollector logs for the string OutOfMemoryError. The preferred way is to perform a global search in all the InfoCollector files whose names start with ems (ems* with a wildcard), which can be done using text editors like TextPad. If there are no search results, the JVM has not experienced a memory issue because of the Manager application. The issue could be caused by other applications or by operating system DLLs; check the JVM crash files.
2 If the above exception is present, check the emsmem logs to determine the time and frequency of the memory decrease. Most cases exhibit either a slow decrease in memory over a period of days or months, or a sudden decrease in memory.
3 After establishing the time of the memory leak, check the alert rate in the aqcount logs. The recommended maximum is 60 alerts/sec; any value above this over a period of time can cause memory issues. The alert rate can be calculated from the aqcount logs using the following method:
• Look for an entry similar to "2012-07-31 13:27:52,012 AltQ:EPR-RCD: 6178500 0 112". Three important pieces of information need to be extracted:
  • Timestamp t1 (2012-07-31 13:27:52,012)
  • Alert received string (AltQ:EPR-RCD)
  • Alert count (6178500)
• Now look for the next entry that contains "AltQ:EPR-RCD". This entry has an alert count greater by 300 (6178800 in the above example). Note the timestamp t2 of this entry.
Alert Rate = 300 / (t2 - t1), where t2 - t1 is the elapsed time in seconds.
4 Check the MySQL error logs to find if there are any error messages.
5 Check the slowquery logs for any queries that are called repeatedly and take a considerable amount of time to execute (more than 5-10 minutes).
6 Search for all error messages in the ems logs using the string "error", as in the first step. Note the error messages that occurred during the time interval of the memory leak.
7 If a heap dump (.bin file with the prefix 850heap or 1500heap) is available, load it into a heap dump analyzer tool such as MAT or VisualVM, which will identify the suspects causing the memory leak.

The Manager interface does not work after JRE update
Problem/Symptom: The JRE on the client workstation was updated from version 1.6 to version 1.7 and now portions of the Manager interface do not work.
Potential Cause:
• Manager versions prior to 7.1.3.5 and 6.1.1.34 did not support JRE 1.7.
• If you want to run JRE 1.7, you must install a Manager version that supports Java version 1.7.
• If you cannot upgrade the Manager to a version that supports 1.7, you must re-install Java 1.6 on your client system.
Remedy: To verify the version of the Manager, look for the version in the main menu.
If the Manager is version 6.1.1.33 or below, upgrade to version 6.1.1.34 or above. Refer to the release notes:
• Network Security Platform 6.1.1.34-6.1.1.154 I-series Release Notes
• Network Security Platform 6.1.1.34-6.1.1.154 M-series Release Notes
If the Manager is a 7.1 version below 7.1.3.5, upgrade to version 7.1.3.5 or higher. Refer to the release notes:
• Network Security Platform 7.1.3.5-7.1.3.6 M-series Release Notes
If you cannot upgrade the Manager to a version that supports Java 1.7, re-install JRE 6.x:
1 Uninstall Java 7 using Add or Remove Programs.
2 Reconnect to the Manager and install the Java version when prompted.
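The version rule in the remedy above can be expressed as a small check. The following is an illustrative sketch in Python, not a McAfee tool: the function names are hypothetical, and it assumes the Manager version is available as a dotted string of integers, as shown in the main menu.

```python
# Minimum Manager versions that support JRE 1.7, keyed by release line.
# The two thresholds are the ones stated in this guide.
MIN_JRE17_VERSIONS = {
    (6, 1): (6, 1, 1, 34),
    (7, 1): (7, 1, 3, 5),
}


def parse_version(text):
    """Convert a dotted version string such as '7.1.3.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_jre17(manager_version):
    """Return True or False for the 6.1 and 7.1 release lines.

    Returns None for any other release line, because this guide gives
    no rule for it.
    """
    version = parse_version(manager_version)
    minimum = MIN_JRE17_VERSIONS.get(version[:2])
    if minimum is None:
        return None
    # Tuples compare element by element, which matches version ordering.
    return version >= minimum
```

A result of False means the Manager should be upgraded, or JRE 6.x re-installed on the client, before JRE 1.7 is used.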
Message on loading the Manager does not disappear
Problem/Symptom: A message is displayed stating "NSM is currently loading" but the message does not go away even after several minutes.
Potential Cause:
• The Manager server (Java) tries to establish connections to the web server. If any of the server communications are not established, the Manager will not start up properly. The problem might be due to:
  • Java process not running on the server.
  • The client cannot talk to the server (blocked ports).
  • Database not running on the server.
• The Manager server process is not running on the appliance or on the Manager software.
Remedy: Verify that the service is started and running properly.
1 From the Start menu search bar, type cmd and open the command prompt with elevated privileges.
2 Run the command tasklist /FI "IMAGENAME eq java.exe" to verify that Java is running on the server.
3 Check the output for java.exe on the server to ensure that the Mem Usage is above 500 MB. If nothing is listed, the Manager service is not running.
4 Run the following commands on a command prompt to verify that ports 8501 to 8505 are open and actively listening.
• netstat -an | find "LISTENING" | find "8501"
• netstat -an | find "LISTENING" | find "8502"
• netstat -an | find "LISTENING" | find "8503"
• netstat -an | find "LISTENING" | find "8504"
• netstat -an | find "LISTENING" | find "8505"
5 Verify that MySQL is running by executing the command netstat -an | find "LISTENING" | find "3306".
6 Try to start the Manager manually by running tms.bat from <install path>/App/bin/. Look for error messages at the bottom of the output.
7 Check the bottom of the emsout.log file in <install path>/App/ for errors.
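The port checks in steps 4 and 5 can also be done in one pass over saved netstat output. The following Python sketch is only an illustration and not part of the product: the function names are hypothetical, and it assumes the usual Windows netstat -an layout (protocol, local address, foreign address, state). Unlike find "8501", which matches any line containing that text, it compares the local port exactly.

```python
# Ports named in the steps above: 8501-8505 for the Manager, 3306 for MySQL.
REQUIRED_PORTS = (8501, 8502, 8503, 8504, 8505, 3306)


def listening_ports(netstat_output):
    """Return the set of local TCP ports that are in the LISTENING state.

    Expects text produced by `netstat -an` on Windows, where a listening
    socket looks like:  TCP    0.0.0.0:8501    0.0.0.0:0    LISTENING
    """
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "TCP" and fields[3] == "LISTENING":
            # The local address is the second field; the port follows the last colon.
            ports.add(int(fields[1].rsplit(":", 1)[1]))
    return ports


def missing_ports(netstat_output, required=REQUIRED_PORTS):
    """Return the required ports that are not listening, in ascending order."""
    return sorted(set(required) - listening_ports(netstat_output))
```

One way to use it is to run netstat -an > netstat.txt on the Manager server and pass the file contents to missing_ports; an empty list means all six ports are listening.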
Unable to log on to the Manager after typing credentials
Problem/Symptom: From the logon page, after typing the user name and password, the Manager application does not open. It displays only a blank page.
Potential Cause: The Manager requires the browser's pop-up blocker to be disabled or to have an exception configured for the Manager.
Remedy: Disable the pop-up blocker functionality, or create an exception for the Manager server IP addresses.

Table 1-1 Internet Explorer
To disable pop-up blocker:
1 From the command prompt, execute the command Inetcpl.cpl. The Internet Properties window is displayed.
2 In the Privacy tab, deselect the checkbox option Turn on Pop-up Blocker.
To add exception to pop-up blocker list:
1 From the command prompt, execute the command Inetcpl.cpl. The Internet Properties window is displayed.
2 In the Privacy tab, select the Turn on Pop-up Blocker checkbox.
3 Click Settings. The Pop-up Blocker Settings window is displayed.
4 In the Address of website to allow field, add the IP address or host name of the Manager to the list of websites to be allowed.

Table 1-2 Mozilla Firefox
To disable pop-up blocker:
1 In the Firefox browser, select Tools | Options and click the Content tab.
2 Deselect the Block pop-up windows checkbox.
To add exception to pop-up blocker list:
1 In the Firefox browser, select Tools | Options and click the Content tab.
2 Select the Block pop-up windows checkbox.
3 Click Exceptions. The Allowed sites Pop-ups window is displayed.
4 In the Address of website text field, add the IP address or host name of the Manager to the list of websites to be allowed.
Table 1-3 Google Chrome
To disable pop-up blocker:
1 In the Google Chrome browser, type the following in the address bar: chrome://chrome/settings/content. The Content Settings window is displayed.
2 Select Allow all sites to show pop-ups.
To add exception to pop-up blocker list:
1 In the Google Chrome browser, type the following in the address bar: chrome://settings/contentExceptions#popups. The Content Settings window is displayed.
2 In the Hostname pattern field, add the IP address or host name of the Manager to the list of exceptions.

Portions of the interface do not load properly
Problem/Symptom: Portions of the interface do not load properly or a Java logo is displayed instead of the normal interface.
Potential Cause:
• There might be a conflict with the version of Java running on the client machine. This can happen during an upgrade to the Manager, to Java, or to any application that uses Java. An older or different version of Java might be loaded, causing the Manager to behave inconsistently.
• The Manager supports all minor versions of Java, either version 1.6 or 1.7.
• If you need to run Java version 1.7, you must run Manager version 6.1.1.35 or higher, or Manager version 7.1.3.5 or higher.
• If the base Java version is supported (either version 1.6 or 1.7), there might be a version mismatch on your client machine. Clearing the cache will ensure there is only one version on the endpoint. Also verify that there is only one version of Java running on the client workstation.
Remedy:
• Check which JRE version is installed on your client machine by accessing the link http://www.java.com/en/download/installed.jsp or by navigating to the Java Control Panel window in the Control Panel. Refer to KB55469 at kc.mcafee.com to determine which Java version shipped with your version of the Manager.
• Try clearing temporary files using the Java Control Panel by performing the following steps.
1 In the Java Control Panel click the Settings tab.
2 Click Delete Files.
3 Select the files to be deleted and click OK.
• Uninstalling the currently installed client JRE allows the Manager to push the default shipped JRE back to the client and ensures that it is installed properly.
• Uninstall the currently installed version by closing all browser windows and using the Add/Remove Programs function to uninstall Java.

Prompt appears in Threat Analyzer to open or save a JNLP file
Problem/Symptom: Clicking the launch button for Real Time Threat Analyzer or Historical Threat Analyzer prompts you to open or save a JNLP file.
Potential Cause:
• The browser is configured with 'Do not save encrypted page to disk'.
• The JNLP association is incorrect.
Remedy: Verify the browser settings by performing the following steps:
1 From the command prompt, execute the command Inetcpl.cpl.
2 Click the Advanced tab and scroll down to the Security section.
3 Ensure that the option Do not save encrypted page to disk is deselected.
Verify the .jnlp file association in the Windows configuration. The first time you are prompted to open a .jnlp file, you can select the program Java Web Start Launcher. Ensure that you save the setting Always use the selected program to open this kind of file.

Table 1-4 Verifying .jnlp file association
For Windows 2003 and Windows XP:
1 From the command prompt, execute the command control folders, or use the Folder Options function in the Control Panel.
2 Click the File Types tab and scroll down to .jnlp.
3 Click Change and select Java Web Start Launcher from the list. If it is not in the list, browse to the Java install location, which is usually C:\Program Files\Java\jre<version>\bin\javaws.exe.
For Windows 2008 and Windows 7:
1 Navigate to Control Panel | Default Programs.
2 Click the link Associate a file type or protocol with a specific program.
3 Scroll down to the .jnlp entry and click Change program, then select Java Web Start Launcher. If it is not in the list, browse to the Java install location, which is usually C:\Program Files (x86)\Java\jre<version>\bin\javaws.exe.

Login button does not work
Problem/Symptom: The Login button does not work.
Potential Cause: Internet Options are too restrictive.
Remedy: Verify the following Internet Explorer browser settings by executing the Inetcpl.cpl command from the command prompt. Either add the Manager IP address or host name to the trusted sites, or modify the security zone's settings to allow the required changes.
To modify the settings:
1 Click the Security tab.
2 Click Custom Level and enable the following entries:
• Run ActiveX Controls & Plugins
• Script ActiveX Controls marked safe for scripting
• Downloads: File Download
• Scripting: Active Scripting
3 Click the Advanced tab and scroll down to the Security section.
4 Verify that the option Do not save encrypted page to disk is deselected.

When using Internet Explorer 9 Real Time Threat Analyzer file download gets into a loop
Potential Cause: Real Time Threat Analyzer relies on the Java Web Start functionality, which downloads the JRE application and data to the client workstation. The JNLP (Java Network Launching Protocol) file association for javaws (Java Web Start) has been overwritten.
Remedy: Verify that the file type association points to javaws.exe.

Verifying .jnlp file association
For Windows 2003 and Windows XP:
1 From the command prompt, execute the command control folders, or navigate to Control Panel | Folder Options.
2 Select the File Types tab and scroll down to .jnlp.
3 Click the Change button and select Java Web Start Launcher from the list. If it is not in the list, browse to the Java install location, which is usually C:\Program Files\Java\jre<version>\bin\javaws.exe.
For Windows 2008 and Windows 7:
1 Navigate to Control Panel | Default Programs.
2 Click the link Associate a file type or protocol with a specific program.
3 Scroll down to the .jnlp entry, click the Change program button, and select Java Web Start Launcher. If it is not in the list, browse to the Java install location, which is usually C:\Program Files (x86)\Java\jre<version>\bin\javaws.exe.

The Manager client is unable to contact the Manager server when launching the Real Time Threat Analyzer
Potential Cause: Client traffic is blocked from getting to the server. Most likely a firewall is blocking the connection. The Manager client communicates with the server via port 8555 for the Real Time Threat Analyzer.
Remedy: Verify that the port is open through the firewall by running the following command:
telnet nsm-server-ip 8555
If telnet is not installed, use the following command to install the utility in Windows 7:
pkgmgr /iu:"TelnetClient"
Check the firewall settings if the telnet command displays 'Could not open connection to the host, on port 8555: Connect failed'.

Real Time Threat Analyzer has strange behavior
Potential Cause: Real Time Threat Analyzer is sensitive to communication timing. It requires a certain operating window (Real Time Threat Analyzer to backend).
Remedy: Verify the communication between client and server. Ping the Manager server. The average response time should be less than 200 ms. If the response time is greater, increase the time-out value (ta.timeout) specified in ems.properties in <install path>/App/config/. The period value is set to 20 seconds. Change it to 60 if there is latency in the ping test. Restart the Manager service for the change to take effect.
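The latency rule in the remedy above can be applied to saved ping output. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes the English-language Windows ping summary line, which contains the text "Average = <n>ms". It only suggests a value; the change itself is still made in ems.properties as described above.

```python
import re

LATENCY_LIMIT_MS = 200   # average response time threshold given in this guide
CURRENT_TIMEOUT_S = 20   # period value stated in this guide
RAISED_TIMEOUT_S = 60    # value to use when the ping test shows latency


def average_ping_ms(ping_output):
    """Extract the average round-trip time from Windows ping output.

    Returns None if no summary line is present, for example when every
    request timed out.
    """
    match = re.search(r"Average = (\d+)ms", ping_output)
    return int(match.group(1)) if match else None


def recommended_timeout_s(ping_output):
    """Return the time-out in seconds suggested by the rule above, or None if unknown."""
    average = average_ping_ms(ping_output)
    if average is None:
        return None
    # Assumption: only an average above the limit counts as latency.
    return RAISED_TIMEOUT_S if average > LATENCY_LIMIT_MS else CURRENT_TIMEOUT_S
```

A result of None means the ping test produced no average, which points to a connectivity problem rather than a timing problem.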
Real Time Threat Analyzer security warning box keeps popping up
Potential Cause: The publisher of the Manager certificate is not a trusted entity. The browser needs to trust the certificate publisher to avoid the security warning. To trust the certificate, the browser must use the host name of the Manager so that the certificate and URL match.
Remedy:
• Verify that the Manager is accessible by host name by executing the command ping <hostname>. If it is not accessible, add the Manager name and IP address to the internal DNS servers exactly as the name appears in the certificate.
• Trust the publisher of the Manager certificate by installing the certificate as a trusted root certification authority:
1 In Internet Explorer, view the certificate by selecting it in the address bar and clicking View Certificates.
2 Move the certificate into the trusted certificate authorities by performing the following steps.
a Run certmgr.msc from the Start menu.
b Navigate to the Intermediate Certification Authorities folder and then to the Certificates folder.
c Find the Manager's host name in the list of certificates.
d Copy the certificate by right-clicking it and selecting Copy.
e Navigate to Trusted Root Certification Authorities.
f Right-click in the right pane and paste the certificate. Accept any prompts that come up.
Navigate to the Manager with the browser using the host name; the certificate should now be shown as trusted.
Or
If you get an untrusted page message when accessing the Manager login page:
• In Internet Explorer, click the link Continue to this website (not recommended).
• In Mozilla Firefox, click I Understand the Risks and then click Add Exception to confirm the security exception.
• In Google Chrome, click Proceed Anyway.
Threat Analyzer UI stuck at downloading maps
Potential Cause: Threat Analyzer and Manager communication has some issues.
Remedy: Verify that the Threat Analyzer has not timed out.
1 From the command prompt, type cd %USERPROFILE%/McAfee/NetworSecurityManager/<NSM_IP>/ThreatAnalyzer
2 Search for "failed" in the file threatanalyzer.log and filter the result for "2013-05-05". Replace this date with the current date. This shows only the messages with today's date.
3 Check whether the output has a line matching the following:
com.intruvert.acm.ui.test.ConnectionTask - failed
If this output is found, increase the timeout for the Threat Analyzer by adding the line MAX_TIMEOUT_PERIOD=300000 to the ta.prop file.
4 Save the file and re-launch Threat Analyzer.

Many options are grayed out in Threat Analyzer menu
Potential Cause:
• You are logged on as a user without the Super User role.
• There might be a communication problem between the Threat Analyzer applet and the Network Security Manager.
Remedy: Verify that you are logged on as a user with Super User privileges. In the Manager, select My Company | Users | Role Assignment and check whether you have the Super User role.
It is also possible that Threat Analyzer is unable to get permission details from the Manager. Try increasing the timeout using ta.prop, following the steps in Threat Analyzer UI stuck at downloading maps.

Unable to get alerts in Historical Threat Analyzer
Potential Cause: The iv_alert table may have missing indexes.
Remedy:
1 From the command prompt, type cd <MYSQL_INSTALL_DIR>\bin and then mysql -uroot -p. Type the root password when prompted.
2 At the mysql prompt, run the following statement: show index from iv_alert; The output should list the iv_alert_creation_ix index.
3 Stop the Manager service.
4 If the index was not listed in step 2, create it with: create index iv_alert_creation_ix on iv_alert(creationTime);
If you still have problems, recreate the iv_alert table by referring to KnowledgeBase article KB69132 and restart the Manager service.

Issues and status checks for the Sensor and Manager in combination
This section describes issues and status checks when the Sensor and Manager are connected.
Contents
Difficulties connecting Sensor and Manager
Loss of connectivity between the Sensor and Manager
DoS troubleshooting

Difficulties connecting Sensor and Manager
If you experience problems getting the McAfee Network Security Manager (Manager) and Sensor to communicate, see if one of the following situations might be the cause.

Network connectivity
• Ensure that the Sensor and Manager server have power and are appropriately connected to the network.
• Verify that the link indicator lights on both devices show an active link.
• Ping the Sensor and Manager server to ensure that they are available on the network.

Inconsistency in Sensor and Manager configuration
• Verify that the Sensor name that was entered in the CLI is identical to that entered in the Manager. Ensure the same for the shared secret key value. If these values do not match, the two cannot communicate. The Sensor name is case sensitive.
• Check the network addresses for the Manager, the Manager's gateway, and the Sensor to ensure everything is configured correctly by typing show at the Sensor CLI command prompt.

Software or signature set incompatibility
Verify that the Sensor software image, Manager software version, and signature set version are compatible.
• A compatibility matrix is provided in the release notes that accompany each product release.
Firewall between the devices
If there is a firewall between the Sensor and the Manager server, make sure the devices are able to communicate by opening the appropriate ports. Ports used by the Manager server are listed in the McAfee Network Security Platform Installation Guide.

Management port configuration
If you experience problems getting your Sensor and Manager to communicate, it might be a communication issue between the Sensor's Management port and the network device to which it is connected. Check the Management Port Link indicator lights on the Sensor; if the link is down, see if any of the following suggestions enable connectivity.
• Check that the network device is online.
• Check the cable connecting the Sensor to the network device.
• Ensure that the port on the device to which the Management port is connected is enabled and active.
• The port speed and duplex mode of the two devices must match. For example, if the device connecting to the Sensor is not set to auto-negotiate, you must configure the Management port to use the same settings as those of the device connecting to the Management port. To troubleshoot this, use the set mgmtport command. Check the link LEDs on the devices to see if communication is established, or use the show mgmtport command to show the link's status.

Try each of these configuration options to see if one establishes a link:
1 If possible, set the other device's port configuration to auto-negotiate. (The Sensor is set to auto-negotiate by default.)
2 Using the set mgmtport command as described below in Set the management port speed and the duplex mode, try setting the Sensor to speed 100 and duplex half or full.
3 If no link is established, try speed 10 and duplex half or full.
4 If none of these attempts creates a link, try setting the port on the other device to a speed of 100, duplex half or full, and try step 2 again.
5 If this does not establish a link, do the same with the other device set to a speed of 10, duplex half or full, and try step 3 again.
6 If you are still experiencing difficulties, contact McAfee technical support.

The Management port on M-series Sensors also supports 1000 Mbps (1 Gbps). Use the set mgmtport auto command to establish a link to the connecting device. Before doing so, make sure that the other device's port speed is fixed to 1000 and that the port is also set to auto-negotiate.

Set the management port speed and the duplex mode
Task
• Set the speed of the Management port and whether the port should be set to half- or full-duplex. At the prompt, type:
set mgmtport speed <10 | 100 | 1000> duplex <half | full>
where <10> indicates 10 Mbps, <100> indicates 100 Mbps, <1000> indicates 1000 Mbps, <half> indicates half-duplex, and <full> indicates full-duplex. 1000 Mbps is applicable only for M-series Sensors. I-series Sensors support only 10/100 Mbps for the Management port.
Example: set mgmtport speed 100 duplex half

Loss of connectivity between the Sensor and Manager
If you have previously established a connection between the Sensor and the Manager and the connection fails, try the following:
• Check network connectivity.
• View the system status on both the Manager and the Sensor.
• Check that the Management port on the Sensor is configured with the proper speed and duplex mode as described in Management port configuration.
• Check whether the time has been reset on the Manager server. The connection between the Sensor and Manager server is secure, and this secure communication is time-sensitive, so the time on the devices should remain synchronized. You must set the time on the Manager server before you install the Manager software and never change the time on that machine.
If the time changes on the Manager server, the Manager will lose its connectivity with the Sensor and the Update Server. A time change could ultimately cause serious database errors. For more information, see the KnowledgeBase article KB55587. (Go to http://mysupport.mcafee.com, and click Search the KnowledgeBase.)

How Sensor handles new alerts during connectivity loss
The Sensor stores alerts internally until the connection is restored. Network Security Platform classifies and prioritizes events to ensure that the buffer is filled with the events most meaningful to an analyst. The following table lists the number of alerts that can be stored locally on the Sensor.

Number   Alert Type
100000   Signature based alerts
2500     Throttled alerts (with source and destination IP information)
2500     Compressed throttled alerts (alerts with no source and destination IP information)
2500     Statistical or anomaly DoS
2500     Throttled DoS alerts
1000     Host sweep alerts
1000     Port scan alerts

Once the connection from the Sensor to the Manager has been re-established, the queued alerts are forwarded to the Manager, so they are retained even if connectivity is disrupted for some time. If the buffer fills up before connectivity is restored, the Sensor drops new alerts. If blocking is enabled, however, the Sensor continues to block irrespective of its connectivity with the Manager.

DoS troubleshooting
Issues related to DoS alerts.
Applicable to Sensor models: M-series, NS-series

Problem scenario
DoS alerts raised in Network Security Manager.

Data/Information Collection
1 Execute show dospreventionprofile <dos-measure-name> <inbound/outbound> in the Sensor.
2 Trace the Sensor files.

Troubleshooting Steps
1 Check for the source IP of the profile learning each of the packet types.
Execute the following commands:
• show dospreventionprofile tcp-syn inbound/outbound
• show dospreventionprofile tcp-syn-ack inbound/outbound
• show dospreventionprofile tcp-rst inbound/outbound
• show dospreventionprofile udp inbound/outbound
• show dospreventionprofile icmp-echo inbound/outbound
• show dospreventionprofile icmp-echo-reply inbound/outbound
• show dospreventionprofile icmp-non-echo-echoreply inbound/outbound
• show dospreventionprofile ip-fragment inbound/outbound
• show dospreventionprofile non-tcp-udp-icmp inbound/outbound
Check the bins for long-term average traffic rate and short-term average traffic rate values. An alert is raised when the short-term traffic rate is higher than the long-term traffic rate.
2 Check bins that are blocked. A sample of the source IP profile during the detection stage, which indicates the blocked bins, is shown in the figure.
3 If DoS alerts are raised frequently for a particular IP, they could be false positives. A possible reason is that the profile of that IP was not learned properly.
4 For volume-related alerts (for example, if the inbound UDP volume is too high), check whether the IP is missing in the alert details. To check the alert details, navigate to Analysis | <Admin Domain Name> | Threat Analyzer | Real-Time | Start the Real-Time Threat Analyzer.

Solution
Relearn the profile to resolve the issue.
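The commands in step 1 of the troubleshooting steps above follow a single pattern, so the full list for both directions can be generated rather than typed by hand. The following Python sketch is only a convenience illustration: the function name is hypothetical, the measure names and directions are copied from the list above, and the output is plain text to paste into the Sensor CLI.

```python
# DoS measure names and directions, exactly as listed in step 1 above.
DOS_MEASURES = (
    "tcp-syn",
    "tcp-syn-ack",
    "tcp-rst",
    "udp",
    "icmp-echo",
    "icmp-echo-reply",
    "icmp-non-echo-echoreply",
    "ip-fragment",
    "non-tcp-udp-icmp",
)
DIRECTIONS = ("inbound", "outbound")


def dos_profile_commands(measures=DOS_MEASURES, directions=DIRECTIONS):
    """Return one 'show dospreventionprofile' command per measure and direction."""
    return [
        f"show dospreventionprofile {measure} {direction}"
        for measure in measures
        for direction in directions
    ]


if __name__ == "__main__":
    for command in dos_profile_commands():
        print(command)
```

Running the script prints 18 commands, one inbound and one outbound command for each of the nine measures.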
DoS scenarios
• Observed value is calculated based on the following formula:
Observed value = (collected count * (threshold duration/collected duration))
• When there is a burst of traffic, and the threshold is reached, the Sensor starts collecting the DoS IP information. This results in showing the packet count as zero, whereas the actual observed value is very high. This works in accordance with the design.
• Similarly, in some scenarios the packet count is a non-zero value, whereas the actual observed value is zero. This happens when the traffic has stopped but the DoS IP collection and attack detection are still in progress.

Issues and status checks for the Sensor and other devices in combination
This section describes issues and status checks that involve a Sensor and any other devices, including third-party devices, that can be added.

Connectivity issues between the Sensor and other network devices
The most common Sensor problems relate to configuration of the speed and duplex settings. Speed determination issues can result in no connectivity between the Sensor and the switch.

Duplex mismatches
A duplex mismatch (for example, one end of the link in full-duplex and the other in half-duplex) can result in performance issues, intermittent connectivity, and loss of communication. It can also create subtle problems in applications. For example, if a Web server is talking to a database server through an Ethernet switch with a duplex mismatch, small database queries might succeed, while large ones fail due to a timeout.
Manually setting the speed and duplex to full-duplex on only one link partner generally results in a mismatch. This common issue results from disabling auto-negotiation on one link partner and having the other link partner default to a half-duplex configuration, creating a mismatch. This is the reason why speed and duplex cannot be hard-coded on only one link partner. If your intent is not to use auto-negotiation, you must manually set both link partners' speed and duplex settings to full-duplex. Valid auto-negotiation and speed configurations The table below summarizes all possible settings of speed and duplex for Sensors and Cisco catalyst switch ports. Table 1-5 Speed Configurations Network Security Platform Configuration 10/100/1000 port (Speed/ Duplex) Configuration of Switch Resulting Sensor Resulting Catalyst (Speed/Duplex) (Speed/ Duplex) (Speed/ Duplex) 100 Mbps 1000 Mbps No Link No Link Full-duplex Full-duplex Neither side establishes link, due to speed mismatch 100 Mbps AUTO 100 Mbps 100 Mbps Correct configuration Full-duplex Full-duplex Full-duplex 100 Mbps 1000 Mbps 100 Mbps 100 Mbps Full-duplex Full-duplex Full-duplex Full-duplex 100 Mbps AUTO 100 Mbps 100 Mbps Half-duplex Half-duplex 100 Mbps 100 Mbps Half-duplex Half-duplex No Link No Link Half-duplex 10 Mbps AUTO Half-duplex 10 Mbps 1000 Mbps Half-duplex Half-duplex Comments Correct Manual Configuration Link is established, but switch does not see any auto-negotiation information from McAfee Network Security Platform and defaults to half-duplex when operating at 10/100 Mbps. Link is established, but switch does not see Fast Link Pulse (FLP) and defaults to 10 Mbps half-duplex. Neither side establishes link, due to speed mismatch. Gigabit auto-negotiation (no link to connected device) Gigabit Ethernet has an auto-negotiation procedure that is more extensive than that which is used for 10/100 Mbps Ethernet (per Gigabit auto-negotiation specification IEEE 802.3z-1998). 
The Gigabit auto-negotiation negotiates flow control, duplex mode, and remote fault information. You must either enable or disable link negotiation on both ends of the link. Both ends of the link must be set to the same value or the link will not connect.
If either device does not support Gigabit auto-negotiation, disabling Gigabit auto-negotiation forces the link up.

Troubleshooting a Duplex Mismatch with Cisco Devices
When troubleshooting connectivity issues with Cisco switches or routers, verify that the Sensor and the switches or routers are using a valid configuration. The show intfport <port> command on the Sensor CLI will help reveal errors.
Sometimes there are duplex inconsistencies between Network Security Platform and the switch port. Symptoms include poor port performance and frame check sequence (FCS) errors that increment on the switch port. To troubleshoot this issue, manually configure the switch port to 100 Mbps, half-duplex. If this action resolves the connectivity problems, you might be running into this issue. Contact Cisco's TAC for assistance.
Use the following commands to verify fixed interface settings on some Cisco devices that connect to Sensors:
Cisco PIX® Firewall
• interface ethernet0 100full
Cisco CSS 11000
• interface ethernet-3
• phy 100Mbits-FD
Cisco Catalyst 4000, 5000, 6000 series (native)
• set port speed 1/1 100
• set port duplex 1/1 full

Connectivity issues with Cisco 3750-12S switch
Use the following ports when connecting a Cisco 3750-12S switch to your Sensor: 3, 4, 7, 8, 11, or 12. Connections using ports 1, 2, 5, 6, 9, or 10 might cause network issues in the form of an inconsistent delay of packets.
Explanation of CatOS show port command counters

Counter: Alignment Errors
Description: Alignment errors are a count of the number of frames received that do not end with an even number of octets and have a bad CRC.
Possible causes: These are the result of collisions at half-duplex, duplex mismatch, bad hardware (NIC, cable, or port), or a connected device generating frames that do not end on an octet and have a bad FCS.

Counter: FCS
Description: FCS error count is the number of frames that were transmitted or received with a bad checksum (CRC value) in the Ethernet frame. These frames are dropped and not propagated onto other ports.
Possible causes: These are the result of collisions at half-duplex, duplex mismatch, bad hardware (NIC, cable, or port), or a connected device generating frames with bad FCS.

Counter: Xmit-Err
Description: This is an indication that the internal transmit buffer is full.
Possible causes: This is an indication of excessive input rates of traffic. This is also an indication of the transmit buffer being full. The counter should only increment in situations in which the switch is unable to forward out the port at a desired rate. Situations such as excessive collisions and 10 Mb ports cause the transmit buffer to become full. Increasing speed and moving the link partner to full-duplex should minimize this occurrence.

Counter: Rcv-Err
Description: This is an indication that the receive buffer is full.
Possible causes: This is an indication of excessive output rates of traffic. This is also an indication of the receive buffer being full. This counter should be zero unless there is excessive traffic through the switch. In some switches, the Out-Lost counter has a direct correlation to the Rcv-Err.

Counter: UnderSize
Description: These are frames that are smaller than 64 bytes (including FCS) and have a good FCS value.
Possible causes: This is an indication of a bad frame generated by the connected device.

Counter: Single Collisions
Description: Single collisions are the number of times the transmitting port had one collision before successfully transmitting the frame to the media.
Possible causes: This is an indication of a half-duplex configuration.

Counter: Multiple Collisions
Description: Multiple collisions are the number of times the transmitting port had more than one collision before successfully transmitting the frame to the media.
Possible causes: This is an indication of a half-duplex configuration.

Counter: Late Collisions
Description: A late collision occurs when two devices transmit at the same time and neither side of the connection detects a collision. The reason for this occurrence is that the time to propagate the signal from one end of the network to another is longer than the time to put the entire packet on the network. The two devices that cause the late collision never see that the other is sending until after it puts the entire packet on the network. Late collisions are detected by the transmitter after the first time slot of the 64-byte transmit time occurs. They are only detected during transmissions of packets longer than 64 bytes. Its detection is exactly the same as it is for a normal collision; it just happens later than it does for a normal collision.
Possible causes: This is an indication of faulty hardware (NIC, cable, or switch port) or a duplex mismatch.

Counter: Excessive Collisions
Description: Excessive collisions are the number of frames that are dropped after 16 attempts to send the packet resulted in 16 collisions.
Possible causes: This is an indication of over utilization of the switch port at half-duplex or duplex mismatch.

Counter: Carrier Sense
Description: Carrier sense occurs every time an Ethernet controller wants to send data and the counter is incremented when there is an error in the process.
Possible causes: This is an indication of faulty hardware (NIC, cable, or switch port).
Counter: Runts
Description: These are frames smaller than 64 bytes with a bad FCS value.
Possible causes: This is an indication of the result of collisions, duplex mismatch, IEEE 802.1Q (dot1q), or an Inter-Switch Link Protocol (ISL) configuration issue.

Counter: Giants
Description: These are frames that are greater than 1518 bytes and have a bad FCS value.
Possible causes: This is an indication of faulty hardware, dot1q, or an ISL configuration issue.

Auto-negotiation

Auto-negotiation issues typically do not result in link establishment issues. Instead, auto-negotiation issues mainly result in a loss of performance. When auto-negotiation leaves one end of the link in, for example, full-duplex mode and the other in half-duplex (also known as a duplex mismatch), errors and re-transmissions can cause unpredictable behavior in the network, leading to performance issues, intermittent connectivity, and loss of communication. Generally these errors are not fatal (traffic still makes it through), but locating and fixing them takes time.

Situations that might lead to auto-negotiation issues

Auto-negotiation issues with the Sensor might result from nonconforming implementation, hardware incapability, or software defects. Generally, if the switch used with the Sensor adheres to IEEE 802.3u auto-negotiation specifications and all additional features are disabled, auto-negotiation should properly negotiate speed and duplex, and no operational issues should exist.

• Problems might arise when vendor switches/routers do not conform exactly to the IEEE specification 802.3u.
• Vendor-specific advanced features that are not described in IEEE 802.3u for 10/100 Mbps auto-negotiation (such as auto-polarity or cabling integrity) can also lead to hardware incompatibility and other issues.
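When several CatOS counters increment at once, the "Possible causes" entries in the counter table above can be cross-referenced to see which cause they have in common. The following Python sketch shows one way to do that. The cause labels are abbreviated from the table, and the names CAUSES and rank_causes are hypothetical; this is a reading aid, not a McAfee or Cisco tool.

```python
from collections import Counter

# Possible causes per counter, abbreviated from the counter table above.
# Tuples (not sets) keep the output order deterministic.
CAUSES = {
    "Alignment Errors": ("collisions", "duplex mismatch", "bad hardware",
                         "bad frames from connected device"),
    "FCS": ("collisions", "duplex mismatch", "bad hardware",
            "bad frames from connected device"),
    "Xmit-Err": ("excessive input rate",),
    "Rcv-Err": ("excessive output rate",),
    "UnderSize": ("bad frames from connected device",),
    "Single Collisions": ("half-duplex configuration",),
    "Multiple Collisions": ("half-duplex configuration",),
    "Late Collisions": ("bad hardware", "duplex mismatch"),
    "Excessive Collisions": ("over-utilized half-duplex port", "duplex mismatch"),
    "Carrier Sense": ("bad hardware",),
    "Runts": ("collisions", "duplex mismatch", "dot1q or ISL configuration issue"),
    "Giants": ("bad hardware", "dot1q or ISL configuration issue"),
}


def rank_causes(incrementing_counters):
    """Return (cause, hits) pairs, most frequently implicated cause first."""
    tally = Counter()
    for name in incrementing_counters:
        # Counters that are not in the table contribute nothing.
        tally.update(CAUSES.get(name, ()))
    return tally.most_common()


if __name__ == "__main__":
    for cause, hits in rank_causes(["FCS", "Late Collisions", "Runts"]):
        print(f"{hits} counter(s) point to: {cause}")
```

A cause that appears under every incrementing counter is a reasonable first thing to check, but the ranking is only a hint; the table's descriptions remain the reference.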
DNS connectivity and reputation issues

DNS connectivity

DNS connectivity to the Sensor sometimes has issues due to incorrect configuration or an incorrect DNS server IP address. You can view the DNS connectivity fault in the System Faults page in the Manager. The Device DNS server connectivity status faults are generated by the Sensor whenever there is an issue in DNS connectivity.

Figure 1-5 DNS server connectivity warning fault

Figure 1-6 GTI server connectivity critical fault

You can perform the following high-level troubleshooting steps to solve the connectivity problem:

1 Check the Devices | <Admin Domain Name> | Global | Default Device Settings | Common | Name Resolution global-level setting in the Manager to see if the parent domain has the primary and secondary DNS server information entered correctly.

Figure 1-7 Global level DNS server setting

2 If the global setting has the correct information, check the Devices | <Admin Domain Name> | Devices | <Device Name> | Setup | Name Resolution device-level setting to see if it inherits the global settings. Make sure that Inherit Settings? is selected, and check that the inherited information is correct.

Figure 1-8 Device level DNS server setting

If the connectivity problem persists, contact McAfee Support for further assistance.
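Before opening the Name Resolution pages, it can help to sanity-check the DNS server addresses you intend to enter. The following Python sketch uses only the standard ipaddress module; the function name check_dns_settings is hypothetical, and treating 0.0.0.0 as a misconfiguration is an assumption of this example rather than a rule from the Manager. It validates the values only and does not contact the Sensor, the Manager, or any DNS server.

```python
import ipaddress


def check_dns_settings(primary, secondary=""):
    """Return a list of problems found in the DNS server address values."""
    problems = []
    for label, value in (("primary", primary), ("secondary", secondary)):
        value = (value or "").strip()
        if not value:
            # Only the primary server is treated as mandatory in this sketch.
            if label == "primary":
                problems.append("primary DNS server is not set")
            continue
        try:
            address = ipaddress.ip_address(value)
        except ValueError:
            problems.append(f"{label} DNS server {value!r} is not a valid IP address")
            continue
        # Assumption: an unspecified address (0.0.0.0 or ::) is a placeholder.
        if address.is_unspecified:
            problems.append(f"{label} DNS server {value!r} is an unspecified address")
    return problems


if __name__ == "__main__":
    for problem in check_dns_settings("10.1.1.300", "0.0.0.0"):
        print(problem)
```

An empty list means the values are at least well-formed; whether the servers are reachable from the Sensor still has to be confirmed through the fault status in the Manager.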
GTI file reputation

In case of any errors for file reputation analysis, you can perform the following high-level troubleshooting steps:

1 Check if malware detection is enabled in Policy | <Admin Domain Name> | Intrusion Prevention | Advanced Malware | Advanced Malware Policies.

2 In case of file reputation, the request is sent for bad file reputation. The file is sent as an MD5 checksum in DNS requests. If there is no response from the DNS, check the DNS connectivity.

If the DNS connectivity has any issues, perform the high-level steps mentioned under DNS connectivity to solve the problem.

If the DNS connectivity is working correctly, there will be a response for the file reputation request. Confirm the connectivity by executing the show malwareenginestats CLI command and checking its output. Check the malware statistics for the GTI file reputation engine. The Number of files sent and Number of response Received values should show an increase in comparison with the values before the reputation request was sent.

Malware Statistics for GTI File Reputation Engine
Number of files sent: 11132
Number of response Received: 9377
Number of files ignored: 1755
Number of files with malware score clean: 0
Number of alerts with malware score very low: 37
Number of alerts with malware score low: 0
Number of alerts with malware score medium: 0
Number of alerts with malware score high: 0
Number of alerts with malware score very high: 1233
Number of alerts with malware score unknown: 8051
Total number of alerts sent: 1233
Total number of attacks blocked: 1233
Total number of TCP resets sent: 1233

If the connectivity problem persists, contact McAfee Support for further assistance.
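The before-and-after comparison of the show malwareenginestats counters can be scripted once the CLI output has been saved as text. The following Python sketch assumes the output lists one "name: value" counter per line, as in the sample above; the function names parse_engine_stats and reputation_counters_increased are hypothetical, and how you capture the CLI output is left open.

```python
import re

# The two counters this guide says should increase after a reputation request.
WATCHED = ("Number of files sent", "Number of response Received")


def parse_engine_stats(text):
    """Parse 'name: value' lines into a dict; other lines are ignored."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def reputation_counters_increased(before_text, after_text):
    """True when both watched counters are higher in the second capture."""
    before = parse_engine_stats(before_text)
    after = parse_engine_stats(after_text)
    return all(after.get(name, 0) > before.get(name, 0) for name in WATCHED)


if __name__ == "__main__":
    before = "Number of files sent: 11132\nNumber of response Received: 9377"
    after = "Number of files sent: 11140\nNumber of response Received: 9380"
    print(reputation_counters_increased(before, after))
```

If the files-sent counter rises but the responses-received counter does not, the requests are leaving the Sensor without an answer, which points back to the DNS connectivity steps.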
GTI IP reputation

When a SYN packet is seen, the Sensor checks to see if IP reputation is enabled for that port/protocol. When enabled, the Sensor sends a query to the management process. The first flow is always allowed to pass through since the reputation score is not yet available. After a reputation score is assigned, the score is updated to the Sensor. Subsequent flows from the same IP address are marked with the reputation score in the header for lookup in the datapath processor. The source IP is checked for inbound flows, and the destination IP is checked for outbound flows, even though the entire 5-tuple is passed in the query.

The Sensor connectivity status with GTI server critical fault is generated by the Sensor in the Manager whenever the GTI server has connectivity issues to the Sensor.

Figure 1-9 Sensor connectivity fault

You can perform the following high-level troubleshooting steps to solve the connectivity problem:

1 Check if proxy configuration is required. If the organization has a firewall/proxy between the Sensor management port and the cloud, the proxy has to be configured, with a user name and password if required. You can configure the proxy server under Manage | <Admin Domain Name> | Setup | Proxy Server.

2 Make sure that port 443 is not blocked on the management port network.

3 Check the Devices | <Admin Domain Name> | Global | Default Device Settings | Common | Name Resolution global-level setting in the Manager to see if the parent domain has the primary and secondary DNS server information entered correctly.

If the connectivity problem persists, contact McAfee Support for further assistance.

Integration Scenarios

This section describes the troubleshooting steps for integration scenarios.
Tasks
• Global Threat Intelligence - API Overload
• ePO - Connection failure
• Vulnerability Manager - Connectivity issues
• Vulnerability Manager - Certificate Sync and FC Agent issues
• Logon Collector - Integration issues

Global Threat Intelligence - API Overload

When the Manager integrates with Global Threat Intelligence to obtain the reputation scores on hosts and geo-locations, the API is used to send the feature usage data back to McAfee, and there is a possibility of the API getting overloaded.

Perform the following steps for troubleshooting:

1 If the proxy server is enabled, verify that "tunnel.web.trustedsource.org" is allowed by the proxy server ACLs.

2 In the Manager, select Manage | Integration | Global Threat Intelligence and check if the Alert Data Details option is enabled.

3 Check whether the SDK bootstraps to the Global Threat Intelligence cloud successfully by looking for entries such as the following in ems.log.
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - Major version:
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - 2
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - Minor version:
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - 0
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - Version description:
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - TrustedSource SDK 2.0.5.02 (Build 1117)
• 2011-12-06 15:55:01,510 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - Version:
• 2011-12-06 15:55:01,511 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - 2.0.5.02-1117
• 2011-12-06 15:55:01,672 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - Using Proxy Server:1.1.1.1, port: 20
• 2011-12-06 15:55:01,780 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - Device Id: 9b11e1c4-069e-4195-8dd1-c2842ba338f6
• 2011-12-06 15:55:01,780 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - MIICZjCCAc+gAwIBAgICEFIwDQYJKoZIhvcNAQEFBQAwNjEZMBcGA1UEAxQQVHJ1c3RlZFNvdXJjZV9DQTEMMAoGA1UEChMDU0NDMQswCQYDVQQGEwJVUzAe
• 2011-12-06 15:55:01,780 INFO [http-0.0.0.0-9999-3] com.intruvert.ts.helper.TSRatingLookupHelper - MIICXQIBAAKBgQDegOtxL2JHaGLwU6RTQKPfGtzMp3zxiKRc4yPqgPtIgZqReQj7yw6pqvpBmpcx/OobEjs0hA8v0abE3BFwEX0Mezre2B9NpPhuJnNHhe4c/cGdxtC53

ePO - Connection failure

If there is a connection failure between the Manager and the ePO server, perform the following steps for troubleshooting.
In the Manager:

1 Ensure that the configured IP address, port numbers, user name, and password for the ePO server are correct.

2 Ping or try to access the ePO server directly from the Manager server. If it is not accessible, check the firewall configuration and follow other regular network troubleshooting steps.

3 Ensure that the required permissions are given to the configured user name. To isolate a permission issue, use the global administrator user name and password to test the connection. If the connection is successful with global administrator credentials, the problem could be with the configured user name.

4 Check these log files for any errors:
• For Manager versions below 7.5.5: Check the ems.log file for any errors
• For Manager version 7.5.5 and above: Check the epo.log file for any errors

5 Manager uses the following URLs. Try accessing them from the Manager server through a browser.
https://EPO_SERVER_IP:8443/remote/ISExtension.HostForensicsCommand.do?command=getHostDetails&ip=[specify_IP]

Check these log files. The following denotes a successful "TestConnection":

011-11-22 15:09:51,500 INFO [ajp-127.0.0.1-8009-3] iv.common.HttpClient.ApacheGetImpl doGET(), succesfully made the request to http client, url is https://172.16.101.37/remote/ISExtension.HostForensicsCommand.do?command=getHostDetails&ip=127.0.0.1&orion.user.security.token=tpc5pvsNVHxo3fiS

The following denotes an error in connection:

ems.log.3:2011-11-17 12:15:10,914 ERROR [ajp-127.0.0.1-8009-5] iv.common.HttpClient.ApacheGetImpl - doGET:Error while doing the http get function for the url https://172.17.94.80/remote/ISExtension.HostForensicsCommand.do?
command=getHostDetails&ip=127.0.0.1&orion.user.security.token=kSffjTChbZRcE0IJ the error isjava.net.SocketTimeoutException: Read timed out

ems.log.3:2011-11-17 12:48:21,435 ERROR [ajp-127.0.0.1-8009-4] iv.common.HttpClient.ApacheGetImpl - doGET:Error while doing the http get function for the url

In the ePO:

1 Ensure that the ePO server has the latest NSPExtension installed.

2 Ensure that the required permissions are given to the configured user name. Check if the user has sufficient permission to access the NSP Extension.
• In Menu | User Management | Users | desired User, note down the "Permissions Sets".
• In Menu | User Management | Permission sets, select the permission set that is assigned to this user. Check if Network Security Platform has view and change settings.

3 To test the connection to the Manager server, manually run the NSP:Dashboard Data Pull Task. If the connection fails, ping or try to access the Manager server directly from the ePO server. If that also fails, check the firewall and follow regular network troubleshooting steps.

4 Check the orion.log file for any error messages at C:\Program Files\McAfee\ePolicy Orchestrator\Server\Logs\orion.log.

If the test connection is carried out from a child admin domain, also test the connection for the parent admin domain by following the above troubleshooting steps.

Vulnerability Manager - Connectivity issues

When you run through the integration wizard to connect to the Vulnerability Manager database, the following error is displayed:

The attempt to confirm connectivity with the McAfee Vulnerability Manager database has failed for the following reason: Internal Server Error

Perform the following steps for troubleshooting:

1 Stop the Manager service.

2 Disable CBC protection mode in App/bin/tms.bat.

3 Open the tms.bat file and set the following Java option to turn off CBCProtection.
• set JAVA_OPTS=%JAVA_OPTS% -server -Xms768m -Xmx768m -Xss128K
• set JAVA_OPTS=%JAVA_OPTS% -XX:NewRatio=4 -XX:PermSize=128m -XX:MaxPermSize=256m -XX:+UseParallelOldGC
• set JAVA_OPTS=%JAVA_OPTS% -Dapp.home.dir="%APPROOT%"
• set JAVA_OPTS=%JAVA_OPTS% -Dapp.install.root="%APPROOT%"
• set JAVA_OPTS=%JAVA_OPTS% -Dapp.home.dir.url="%APPROOT%"
• set JAVA_OPTS=%JAVA_OPTS% -Dwin.dir="%WINDIR%"
• set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPUDPPort="4167"
• set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPIPaddress=""
• set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPIPv6address=""
• set JAVA_OPTS=%JAVA_OPTS% -Dpython.path="%JYTHONLIB%"
• set JAVA_OPTS=%JAVA_OPTS% -Div.policymgmt.RuleEngine.compiler.netl7antlr.strictCheckEnabled="FALSE"
• set JAVA_OPTS=%JAVA_OPTS% -Div.compiler.snort.dumpPCRE="TRUE"
• rem set JAVA_OPTS=%JAVA_OPTS% -Div.policymgmt.RuleEngine.compiler.enableAPforSPM="FALSE"
• set JAVA_OPTS=%JAVA_OPTS% -Div.compiler.snort.dumpSSIDandStates="TRUE"
• set JAVA_OPTS=%JAVA_OPTS% -Div.controlchannel.snmpv3.useLocalizedKeys="FALSE"
• set JAVA_OPTS=%JAVA_OPTS% -Dsun.lang.ClassLoader.allowArraySyntax=true
• set JAVA_OPTS=%JAVA_OPTS% -Djava.rmi.server.hostname="localhost"
• set JAVA_OPTS=%JAVA_OPTS% -Dcatalina.home="%CATALINA_HOME%"
• set JAVA_OPTS=%JAVA_OPTS% -Djsse.enableCBCProtection=false

4 Restart the Manager service.

After performing these steps, run through the integration wizard again to try to connect to the Vulnerability Manager database.

Vulnerability Manager - Certificate Sync and FC Agent issues

Table 1-6 Issue 1
Problem: FC Agent service doesn't get installed while installing the Manager
Solution: To install FCAgent service:
1 Download the software vcredist_x86.exe and run it in that host.
2 Download link http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=5638.
3 At the command prompt, go to c:\Program Files (x86)\foundstone\FCM and run the command fcagent -i to install the service.

Table 1-7 Issue 2
Problem: When you click on the API tab in the Manager, an internal server error is displayed
Solution: This issue might be seen in some systems when the command sc query FCAgent is executed internally in the Manager. The server on which the Manager is deployed might not have the right permission settings to run this command. The administrator has to provide permission to run sc.exe. To change permission settings for sc.exe:
1 Go to //windows/system32/sc.exe.
2 Right-click sc.exe and select Properties.
3 Click the Security tab.
4 Add a local service and provide full permission.

Table 1-8 Issue 3
Problem: FCAgent service doesn't start in Manager server
Solution: To integrate with Vulnerability Manager, the Manager must update the Windows registry. However, the user account used to run the Manager service will not have permissions to write to the Windows registry if the Manager is fully locked down. To give that user account the required permissions, follow these steps:
1 On the server running the Manager, run regedit.exe.
2 Change the permissions on the registry and allow Full Control to 'Local Service' for the keys:
• HKLM
• HKLM\Software
• HKLM\Software\Foundstone
3 Right-click on these keys and choose Permissions.
4 Add the user account used to run the Manager service (likely LOCAL SERVICE).
5 Give that user account Full Control over the key.
6 Click OK. Changes take effect immediately. A reboot is not required.
7 In the API Server page, click Save.
If the operating system is 64-bit, perform this procedure for these keys:
• HKLM
• HKLM\Software
• HKLM\Software\wow6432Node
• HKLM\Software\wow6432Node\Foundstone
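The two key lists in Issue 3 differ only by operating system bitness, which the following Python sketch makes explicit. The function name registry_keys_for_full_control is hypothetical; the sketch only returns the key names listed above so they can be worked through as a checklist in regedit.exe, and it does not read or modify the registry.

```python
def registry_keys_for_full_control(is_64bit):
    """Return the keys from Issue 3 that need Full Control for the service account."""
    keys = ["HKLM", r"HKLM\Software"]
    if is_64bit:
        # On 64-bit systems the Foundstone key sits under wow6432Node.
        keys.append(r"HKLM\Software\wow6432Node")
        keys.append(r"HKLM\Software\wow6432Node\Foundstone")
    else:
        keys.append(r"HKLM\Software\Foundstone")
    return keys


if __name__ == "__main__":
    for key in registry_keys_for_full_control(is_64bit=True):
        print(key)
```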
Table 1-9 Issue 4
Problem: You are able to start the FC Agent service, but clicking 'Retrieve MVM Certificate' returns an error message.
Solution: It might be because port 3801 is not enabled in the API server. Check if port 3801 has been enabled. Vulnerability Manager could be deployed in distributed mode, where the FCM Server could be on one server and the API Server, DB, Enterprise Manager, and Scan Engines could be on another server. In the API server page, try configuring the FCM Server IP address and port 3801. Try clicking the Retrieve Certificates button. If the OnDemand scan fails, try changing the port back to 3800.

Table 1-10 Issue 5
Problem: Retrieve MVM certificate is failing even though the SSHStauscache and Statuscache keys are present in the registry
Solution: This might occur if C:\Program Files\Foundstone or C:\Program Files (x86)\Foundstone does not have write permission for Local Service.
1 Add Local Service and give it full permission.
2 Click Retrieve MVM Certificate again after giving the required permissions.

Logon Collector - Integration issues

To ensure connectivity between the McAfee® Logon Collector and Manager, the following configurations are mandatory.

• Ensure that the Active Directory services are up and running. If the Active Directory (AD) is not configured correctly or is down, the Manager does not receive Logon Collector updates and test connectivity does not get verified.
• Add the domain that needs to be monitored in the Logon Collector server. If the domain is not added, the test connection fails and the Manager does not receive Logon Collector updates.
• Ensure that all Logon Collector components of the Logon Collector server are running.
• While exchanging the Logon Collector certificate with the Manager by pasting, ensure that you copy the certificate content to Notepad to remove any inadvertent spaces that might cause certificate exchange failure during connectivity.
• To verify that the Manager is receiving Logon Collector updates, create a Firewall then double-click the Source User field to verify that the Groups are configured in the AD.

As a part of the Manager-Sensor Logon Collector integration, the Manager sends IP-user mapping and user-group mapping periodically on certain well-defined events. The Sensor receives the Logon Collector updates from the Manager only when user-based Firewall policies are assigned to Sensors.

The Manager raises the following two faults related to this integration, which are available in the System Faults page:

• The number of users configured in AD is more than 75,000, or the number of IP-user mappings is more than 100,000.
• The MLC bulk update file exceeds the 25 MB limit. This is a critical fault and user intervention is needed.

2 Performance issues

Most performance issues are related to switch port configuration, duplex mismatches, link up/down situations, and data link errors.

Contents
Sniffer trace
Data link errors

Sniffer trace

A Sniffer details packet transfer, and thus a Sniffer trace analysis can help pinpoint switch and McAfee® Network Security Platform performance or connectivity issues when the issues persist after you have exhausted the other suggestions in this document. Sniffer trace analysis reveals every packet on the wire, which helps pinpoint the exact problem. Note that it may be important to obtain several Sniffer traces from different ports on different switches, and that it is useful to monitor ("span") ports rather than spanning VLANs when troubleshooting switch connectivity issues.

Data link errors

Many performance issues may be related to data link errors. Excessive errors usually indicate a problem.
For more information, see also Configuration of Speed and Duplex settings.

Half-duplex setting

When operating with a duplex setting of half-duplex, some data link errors such as FCS, alignment, runts, and collisions are normal. Generally, a one percent ratio of errors to total traffic is acceptable for half-duplex connections. If the ratio of errors to input packets is greater than two or three percent, performance degradation may be noticeable.

In half-duplex environments, it is possible for both the switch and the connected device to sense the wire and transmit at exactly the same time, resulting in a collision. Collisions can cause runts, FCS, and alignment errors, which are caused when the frame is not completely copied to the wire, resulting in fragmented frames.

Full-duplex setting

When operating at full-duplex, FCS, cyclic redundancy checks (CRC), alignment errors, and runt counters should be minimal. If the link is operating at full-duplex, the collision counter is not active. If the FCS, CRC, alignment, or runt counters are incrementing, check for a duplex mismatch. Duplex mismatch is a situation in which the switch is operating at full-duplex and the connected device is operating at half-duplex, or vice versa. The result of a duplex mismatch is extremely slow performance, intermittent connectivity, and loss of connection. Other possible causes of data link errors at full-duplex are bad cables, a faulty switch port, or software or hardware issues.

3 Determine false positives

This section lists methods for determining and reducing false positives.

Contents
Reduce false positives
Tune your policies

Reduce false positives

Your policy determines what traffic analysis your McAfee® Network Security Sensor (Sensor) will perform.
McAfee® Network Security Platform provides a number of policy templates to get you started toward your ultimate goal: prevent attacks from damaging your network, and limit the alerts displayed in the Threat Analyzer to those which are valid and useful for your analysis.

There are two stages to this process: initial policy configuration and policy tuning. Though these are tedious tasks, McAfee has extended its blocking options to include SmartBlocking, which only activates blocking when high-confidence signatures are matched, thus minimizing the possibility of false positives. Network Security Platform is replacing its present Recommended for Blocking (RFB) designation with Recommended for SmartBlocking (RFSB) because this new level of granularity enables McAfee to recommend many more attacks; the list of RFB attacks is a subset of the list of RFSB attacks.

The ultimate goal of policy tuning is to eliminate false positives and noise and avoid overwhelming quantities of legitimate, but anticipated alerts.

Tune your policies

The default McAfee Network Security Platform policy templates are provided as a generic starting point; you will want to customize one of these policies for your needs. So the first step in tuning is to clone the most appropriate policy for your network and your goals, and then customize it. (You can also modify a policy directly rather than modifying a copy.)

Some things to remember when tuning your policies:

• We ask that you set your expectations appropriately regarding the elimination of false positives and noise. A proper Network Security Platform implementation includes multiple tuning phases. False positives and excess noise are routine for the first 3 to 4 weeks. Once properly tuned, however, they can be reduced to a rare occurrence.
• When initially deployed, Network Security Platform frequently exposes unexpected conditions in the existing network and application configuration.
What may at first seem like a false positive might actually be the manifestation of a misconfigured router or Web application, for example.
• Before you begin, be aware of the network topology and the hosts in your network, so you can enable the policy to detect the correct set of attacks for your environment.
• Take steps to reduce false positives and noise from the start. If you allow a large number of "noisy" alerts to continue to sound on a very busy network, parsing and pruning the database can quickly become cumbersome tasks. It is preferable for all parties involved to put energy into preventing false positives rather than into working around them. Exception objects are also an option, where you can have custom rule sets specific to your environment. You can disable all alerts that are obviously not applicable to the hosts that you protect. For example, if you use only Apache Web servers, you can disable IIS-related attacks.

False positives and noise

The mere mention of false positives always causes concern in the mind of any security analyst. However, false positives may mean quite different things to different people. In order to better manage the security risks using any IDS/IPS devices, it's very important to understand the exact meanings of different types of alerts so that an appropriate response can be applied.

With Network Security Platform, there are three types of alerts which are often taken as "false positives":

• incorrectly identified events
• correctly identified events subject to interpretation by usage policy
• correctly identified events uninteresting to the user

Incorrect identification

These alerts typically result from overly aggressive signature design, special characteristics of the user environment, or system bugs.
For example, typical users will never use nested file folders with a path more than 256 characters long; however, a particular user may push the Windows' free-style naming to the extreme and create files with path names more than 1024 characters. Issues in this category are rare. They can be fixed by signature modifications or software bug fixes.

Correct identification: significance subject to usage policy

Events of this type include those alerting on activities associated with Instant Messaging (IM), Internet Relay Chat (IRC), and Peer to Peer programs (P2P). Some security policies forbid such traffic on their network, for example, within a corporate common operation environment (COE); others may allow it to various degrees. Universities, for example, typically have a totally open policy for running these applications.

Network Security Platform provides two means by which to tune out such events if your policies deem these events uninteresting. First, you can define a customized policy in which these events are disabled. In doing so, the Sensor will not even look for these events in the traffic stream to which the policy is applied. Second, if these events are of interest for most of the hosts except a few, creating exception objects to suppress alerts for the few hosts is an alternative approach.

Correct identification: significance subject to user sensitivity (also known as noise)

There is another type of event which you may not be interested in, due to the perceived severity of the event. For example, Network Security Platform will detect a UDP-based host sweep when a given host sends UDP packets to a certain number of distinct destinations within a given time interval. Although you can tune this detection by configuring the threshold and the interval according to your sensitivity, it's still possible that some or all of the host IPs being scanned are actually not live.
Some users will consider these alerts noise; others will take notice because they indicate possible reconnaissance activity. Another example of noise would be if someone attempted an IIS-based attack against your Apache Web server. This is a hostile act, but it will not actually harm anything except wasting some network bandwidth. Again, a would-be attacker learns something he can use against your network: the fact that the attack failed can help in zeroing in on the type of Web server you use. Users can also better manage this type of event through policy customization or by installing attack filters.

Relevance analysis involves the analysis of the vulnerability relevance of real-time alerts, using the vulnerability data imported to the Manager database. The imported vulnerability data can be from Vulnerability Manager or other supported vulnerability scanners such as Nessus.

The noise-to-incorrect-identification ratio can be fairly high, particularly in the following conditions:
• the configured policy includes a lot of Informational alerts, or scan alerts that are based on request activities (such as the All Inclusive policy)
• deployment links where there is a lot of hostile traffic, such as in front of a firewall
• overly coarse traffic VIDS definition that contains very disparate applications, for example, a highly aggregated link in dedicated interface mode

Users can effectively manage the noise level by defining appropriate VIDS and customizing the policy accordingly. For dealing with exceptional hosts, such as a dedicated pentest machine, alert filters can also be used.

Determine a false positive versus noise

Some troubleshooting tips for gathering the proper data to determine whether you are dealing with a false positive or an uninteresting event:
• What did you expect to see?
What is the vulnerability, if applicable, that the attack indicated by the alert is supposed to exploit?
• Ensure that you have valid traffic dumps captured from the attack attempt (for example, enable packet logging so that you can view the resulting packet log).
• Determine whether any applications are suspected of triggering the alert: which ones, which versions, and in what specific configurations.

If you intend to work with McAfee Technical Support on the issue, we ask that you provide the following information to assist in troubleshooting:
• If this occurred in a lab using testing tools rather than live traffic, please provide detailed information about the attack/test tool used, including its name, version, configuration, and where the traffic originated.
• If this is a testing environment using a traffic dump relay, make sure that the traffic dumps are valid, TCP traffic follows a proper 3-way handshake, and so on.
• Also, please provide detailed information about the test configuration in the form of a network diagram.
• Create an Evidence Report (within Threat Analyzer) with the packet log.
• Be ready to tell Technical Support how often you are seeing the alerts and whether they are ongoing.

System fault messages

This section lists the system fault messages visible in the Manager Operational Status viewer, organized by severity, with Critical messages first, then Errors, then Warnings, then Informational messages. You can view the faults from the Operational Status menu in Manager. For more information, see fault messages for Vulnerability Manager Scheduler and Automatic report import using Scheduler, McAfee Network Security Platform Integration Guide.
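If you export or transcribe faults for your own triage notes, the severity ordering described above can be reproduced with a small Python sketch. The Fault record and the sort_by_severity helper are hypothetical names used for illustration; they are not part of any Manager API.

```python
from dataclasses import dataclass

# Severity ranking as described in this section: Critical first, Informational last.
SEVERITY_ORDER = {"Critical": 0, "Error": 1, "Warning": 2, "Informational": 3}


@dataclass
class Fault:
    """Hypothetical record for a fault copied from the Operational Status viewer."""
    name: str
    severity: str


def sort_by_severity(faults):
    """Return the faults ordered Critical, Error, Warning, Informational.

    sorted() is stable, so faults of equal severity keep their original order.
    """
    return sorted(faults, key=lambda fault: SEVERITY_ORDER[fault.severity])


faults = [
    Fault("Device in bad health", "Error"),
    Fault("Fan error", "Critical"),
    Fault("Database backup failed", "Critical"),
]
for fault in sort_by_severity(faults):
    print(fault.severity, "-", fault.name)
```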
This section describes the fault messages you might encounter, their severity, and what action clears each fault. In many cases, the fault clears itself if the condition causing the fault is resolved. In cases where the fault does not clear, you must acknowledge or delete it to dismiss it. For Sensor issues, go through the Manager faults and the Sensor faults. Similarly, for NTBA issues, refer to the Manager faults and the NTBA faults.

Contents
• Manager faults
• Sensor faults
• NTBA faults

Manager faults

The Manager faults can be classified into critical, error, warning, and informational. The Action entry for each fault provides you with troubleshooting tips.

Manager critical faults

These are the critical faults for a Manager and Central Manager.

Fault: AD groups size exceeded
Severity: Critical
Description/Cause: Currently, Manager-MLC integration supports only 2,000 AD groups for NS-series and Virtual IPS and 10,000 AD groups for M-series, and this limit has now been exceeded. Sensor behavior cannot be guaranteed if these numbers are not brought down.
Action: Reduce the number of admin domain user groups to be within the specified limit.

Fault: Approaching max allowable table size
Severity: Critical
Description/Cause: <Percentage value>% capacity. Current largest table size: <Table size value>. To ensure successful database tuning, Manager begins to drop alerts and packet logs.
Action: Please perform maintenance operations to clean and tune the database.

Fault: AD groups size limitation
Severity: Critical
Description/Cause: Currently Manager-MLC integration supports only {0} AD groups. Sensor version {1} cannot accommodate {2} AD groups.
Action: Reduce the number of groups in Active Directory.

Fault: Audit failed and Manager shutting down
Severity: Critical
Description/Cause: The Manager is not able to log an audit and is shutting down.
Action: Check the ems log to determine the reason for the audit failure.

Fault: Botnet detectors deployment failure
Severity: Critical
Description/Cause: Cannot deploy the botnet detectors to device <Sensor_name>. See system log for details.
Action: Occurs when the Manager cannot push the BOT DAT file to the Sensor. This can result from a network connectivity issue.

Fault: Cannot push down persisted Device configuration information
Severity: Critical
Description/Cause: The attempt by the Manager to deploy the configuration to device {0} failed during device re-initialization. The device configuration is now out of sync with the Manager settings. The device may be down. See the system log for details.
Action: The Manager cannot deploy the original device configuration during device re-initialization. This can also occur when a failed device is replaced with a new unit, and the new unit is unable to discover its configuration information.

Fault: Cannot pull up Sensor configuration MIB information from the Sensor again during a state transition from disconnected to active
Severity: Critical
Description/Cause: Device re-discovery failure. The upload of device configuration information for device {0} failed again after being triggered by the status polling thread. The device is not properly initialized.
Action: This fault occurs as a second part to the “device discovery failure” fault. If the condition of the device changes such that the Manager can again communicate with it, the Manager again checks to see if the device discovery was successful. This fault is issued if discovery fails, thus the device is still not properly initialized. Check to ensure that the device has the latest software image compatible with the Manager software image. If the images are incompatible, update the device image via a TFTP server.

Fault: Cannot start control channel service (key store)
Severity: Critical
Description/Cause: The Manager's key file is unavailable and possibly corrupted.
Action: This fault could indicate a database corruption. If you have a database backup file (and think it is not corrupted) you can attempt a Restore. If this does not work, you may need to manually repair the database. Contact McAfee Technical Support.

Fault: Cannot start control channel service (EMS certificate)
Severity: Critical
Description/Cause: Can't obtain the Manager certificate.
Action: If you have a database backup file (and think it is not corrupted), you can attempt a Restore. If this does not work, try executing the Database Maintenance action.

Fault: Cannot generate the SNMP association for the specified Sensor
Severity: Critical
Description/Cause: Failed to create command channel association. The device is not properly initialized.
Action: This error indicates a failure to create a secure connection between the Manager and the device, which can be caused by loss of time synchronization between the Manager and the device, or by the device not being completely online after a reboot. Restart the Manager and check the device operating status to ensure that the device's health and status are good.

Fault: Cluster software mismatch status
Severity: Critical
Description/Cause: The software versions on the cluster primary and cluster secondary are not the same.
Action: Check for errors in software image download to cluster.

Fault: Database backup failed
Severity: Critical
Description/Cause: The Manager was unable to back up its database. Error Message: <exception string>.
Action: This message indicates that an attempt to manually back up the database has failed. The most likely cause of failure is insufficient disk space on the Manager server; the backup file may be too big. Check your disk capacity to ensure there is sufficient disk space, and try the operation again.

Fault: Disk space warning
Severity: Critical
Description/Cause: Raised when the utilized disk space on the Manager server exceeds 89% of the capacity. Example: Disk space used = 90% invokes a critical fault.
Action: Make sure that the drive where the Manager is installed has sufficient disk space. Please prune and tune the database.

Fault: Dropping alerts and packet logs
Severity: Critical
Description/Cause: <Percentage value>% capacity. Dropping alerts and packet logs.
Action: Please perform maintenance operations to clean and tune the database.

Fault: DXLService is down
Severity: Critical
Description/Cause: The DXLService is down due to:
• Failure to connect to the ePolicy Orchestrator Server.
• Failure to connect to the Data eXchange Layer.
• Failure to start the McAfee Agent service.
• Failure to start the Data eXchange Layer service.
Action (in the same order as the causes):
• Check the connectivity between IPS and ePO, or check the logs.
• Check the connectivity between IPS and Data eXchange Layer, or check the logs.
• Check the logs.
• Check the logs.

Fault: Fan error
Severity: Critical
Description/Cause: The fan has failed.
Action: Check the fan LEDs on the front of the device to ensure all internal fans are functioning. The fault clears when the temperature falls below its internal ‘low’ temperature threshold.

Fault: Firewall connectivity failure
Severity: Critical
Description/Cause: The connectivity between the device and the firewall is down. Check Packet Capture configuration is down.
Action: This fault can occur in situations where, for example, the firewall machine is down, or the network is experiencing problems. Ping the firewall to see if the firewall is available. Contact your IT department to troubleshoot connectivity issues.

Fault: Gateway Anti-Malware engine initialization failed
Severity: Critical
Description/Cause: Gateway Anti-Malware Engine Initialization failed due to some internal error.
Action: Check the logs. Try enabling automatic signature update option or downloading signatures manually using CLI.

Fault: Gateway Anti-Malware signature download failure
Severity: Critical
Description/Cause: One of the following:
• Gateway Anti-Malware Engine could not be initialized as the required signature files are not available.
• Gateway Anti-Malware signature download failed because of signature update failed.
• Gateway Anti-Malware signature download failed because of signature is not available.
• Gateway Anti-Malware signature could not be downloaded because of update server connection issue.
• Gateway Anti-Malware signature validation failed.
• Gateway Anti-Malware signature could not be downloaded as update server is not reachable.
• Gateway Anti-Malware signature could not be downloaded as DNS resolution failed for Anti-Malware update server.
• Gateway Anti-Malware signature could not be downloaded because proxy server is not reachable.
• Gateway Anti-Malware signature could not be downloaded because proxy authentication failed.
Action:
• Check the logs. Try enabling automatic signature update option or downloading signatures manually using CLI.
• Check the network connection.
• Configure appropriate credentials for proxy.

Fault: Geo IP location file download failure
Severity: Critical
Description/Cause: Cannot push Geo IP location file to device <Sensor_name>. See system log for details.
Action: Occurs when the Manager cannot push the Geo IP Location file to a Sensor. Could result from a network connectivity issue.

Fault: GTI File Reputation DNS Error
Severity: Critical
Description/Cause: One of the following messages:
• Connectivity to Artemis server is restored.
• Error connecting to Artemis local DNS server
• Malformed DNS response from Artemis server
• Error connecting to Artemis server
• Information not available in Artemis server
• Sensor internal memory error on connecting to Artemis server
• Sensor internal query error on connecting to Artemis server
• Unknown internal error on connecting to Artemis server
Action: You may need to correct the DNS configuration.

Fault: Hardware error
Severity: Critical
Description/Cause: This is a generic hardware-related error in the device.
Action: Check the device to know more.

Fault: Incompatible custom attack
Severity: Critical
Description/Cause: One or more custom attack definitions are incompatible with the current signature set. Error message: <exception string>.
Action: The Custom Attack Editor indicates which definitions are incompatible. (Incompatibility could result from attack or signature overlap.) Update the definition in the Custom Attack Editor and try again.

Fault: Incompatible UDS signature
Severity: Critical
Description/Cause: A user-defined signature (UDS) is incompatible with the current signature set.
Action: You will need to edit your existing UDS attacks to make them conform to the new signature set definitions. Open the Custom Attack Editor (IPS Settings > Advanced Policies > Custom Attack Editor) and manually perform the edit and validation. This fault clears when a subsequent UDS compilation succeeds.

Fault: Link failure of <Sensor>
Severity: Critical
Description/Cause: The link between this port and the external device to which it is connected is down.
Action: This is a connectivity issue. Contact your IT department to troubleshoot network connectivity. This fault clears when communication is re-established.

Fault: Low JVM Memory
Severity: Critical
Description/Cause: The Manager is experiencing high memory usage. Available system memory is low.
Action: Reboot the Manager server.

Fault: Low Tomcat JVM Memory
Severity: Critical
Description/Cause: The Manager is experiencing high memory usage. Available system memory is low.
Action: Reboot the Manager server.

Fault: Packet log save failed
Severity: Critical
Description/Cause: The Manager was unable to access the packet log tables in the database. Error Message: <exception string>.
Action: An attempt to save packet log data to the database failed, most likely due to insufficient database capacity. Please ensure that the disk space allocated to the database is sufficient, and try the operation again.

Fault: Power supply error
Severity: Critical
Description/Cause: There is a power supply error to the device.
Action: Restore the power supply to clear this fault. Check power to the outlet providing power to the power supply; if a power interruption is not the cause, replace the failed power supply.

Fault: <Sensor_name> configuration update failure
Severity: Critical
Description/Cause: The attempt by the Manager to deploy the configuration to device <Sensor_name> failed during device re-initialization. The device configuration is now out of sync with the Manager settings. The device may be down. See the system log for details.
Action: The Manager cannot push the original device configuration during device re-initialization. This can also occur when a failed device is replaced with a new unit, and the new unit is unable to discover its configuration information.

Fault: Sensor attack detection error
Severity: Critical
Description/Cause: The Sensor attack detection stopped on one or more engines. Device reboot may be required to resolve the issue.
Action: Message generated based on the Sensor attack detection error. A device reboot may be required.

Fault: Simultaneous FIPS role logon
Severity: Critical
Description/Cause: Users from all three FIPS mode roles (Audit Administrator, Crypto Administrator and Security Administrator) have logged onto the Manager at the same time.
Action: This message is informational.

Fault: Software error
Severity: Critical
Description/Cause: A recoverable software error has occurred within the device. A device reboot may be required.
Action: This error may require a reboot of the device, which may then resolve the issue causing the fault.

Fault: Temperature error
Severity: Critical
Description/Cause: Device temperature is outside its normal range.
Action: Check the fan LEDs on the front of the Sensor to ensure all internal device fans are functioning. This fault will clear when the temperature returns to its normal range.

Fault: SNMP query Device reboot required
Severity: Critical
Description/Cause: This fault can be due to two reasons: SNMPD process restart exceeded the maximum threshold, or communication failure in the management processor.
Action: Manually reboot the Sensor, which may then resolve the issue causing the fault.

Signature set

Fault: IPS signature set import failure
Severity: Critical
Description/Cause: The attempt to import the IPS signature set into the Manager was not successful.
Action: A valid signature set must be present before any action can be taken in Network Security Platform.

Fault: Memory Error
Severity: Critical
Description/Cause: This is a generic memory-related error in the device.
Action: Check the device to know more.

Fault: Signature set import failed
Severity: Critical
Description/Cause: The attempt to import the signature set into the Manager was not successful. (A valid signature set must be present on the Manager for it to work as expected.)
Action: A valid signature set must be present before any action can be taken in Network Security Platform.

Fault: Signature set download failure
Severity: Critical
Description/Cause: The attempt by the Manager to deploy the signature set to device <Sensor_name> failed. See the system log for details. (The Manager will continue to attempt deployment.)
Action: Occurs when the Manager cannot push the signature set file to a Sensor. Could result from a network connectivity issue.

Server communication

Fault: Communication failure with the Network Security Platform Update Server
Severity: Critical
Description/Cause: The Manager is unable to communicate with the Update Server. This fault clears when communication with the Update Server succeeds.
Action: Any connectivity issues with the Update Server will generate this fault, including DNS name resolution failure, Update Server failure, proxy server connectivity failure, network connectivity failure, and even situations where the network cable is detached from the Manager server. If your Manager is connected to the Internet, ensure it has connectivity to the Internet.

Fault: Communication failure with the proxy server
Severity: Critical
Description/Cause: The Manager is unable to communicate with the proxy server. (This fault can occur only when the Manager is configured to communicate with a proxy server.)
Action: This fault clears when communication to the Update Server through the proxy succeeds.

Fault: Communication failure with the McAfee Update Server
Severity: Critical
Description/Cause: The Manager is unable to establish network connectivity with the Update Server. See system log for details.
Action: Any connectivity issues with the Update Server will generate this fault, including DNS name resolution failure, Update Server failure, proxy server connectivity failure, network connectivity failure, and even situations where the network cable is detached from the Manager server.
This fault clears when communication with the Update Server is restored.

Manager Disaster Recovery (MDR)

Fault: Conflict in MDR IP address type
Severity: Critical
Description/Cause: Device detected a conflict with MDR IP Address type as <IPv4/IPv6> instead of type <IPv6/IPv4>.
Action: You may need to correct the MDR configuration.

Fault: Conflict in MDR Mode
Severity: Critical
Description/Cause: MDR mode: Manager IP address / MDR status.
Action: There is a problem with MDR configuration. Check your MDR settings.

Fault: Conflict in MDR Pair IP address
Severity: Critical
Description/Cause: Device detected a conflict with MDR-Pair IP Address: Manager-IP address / MDR action.
Action: You may need to correct the MDR configuration.

Fault: Conflict in MDR Status
Severity: Critical
Description/Cause: Sensor found a conflict with MDR-Status; ISM-IPAddress / MDR-Status as <ISMAddress> / Up/Down and <PeerISMAddress> / Up/Down.
Action: There is a problem with MDR configuration. Check your MDR settings.

Fault: Generic device error
Severity: Critical
Action: Review device status.

Fault: MDR - system time synchronization error
Severity: Critical
Description/Cause: The two Managers in an MDR pair must have the same operating system time. Ensure both Managers are in sync with the same time source. (Otherwise, the device communication channels will experience disconnects.)
Action: Ensure both Managers are in sync with current time.

Fault: MDR pair changed <NSM Name or NSCM Name>
Severity: Critical
Description/Cause: The <NSM Name or NSCM Name> Manager is <previousPrimaryIpAddr/previousSecIpAddr> and now primary and secondary are <presentPrimaryIpAddr/presentSecIpAddr>.
Action: Correct the MDR pair.

Fault: The Manager <Manager_name> has switched to MDR mode, and this Manager cannot handle the change
Severity: Critical
Description/Cause: The Manager is found inactive (standby) for now; the peer Manager is either not reachable or does not have data.
Action: If the Manager that has moved to MDR mode is a Network Security Central Manager, make the Central Manager that has all the Network Security Manager data Active, or re-form the MDR pair. If the Manager that has moved to MDR mode is a Network Security Manager, make the Manager that has the Central Manager data Active, or make sure that the active Manager has the Central Manager configuration data.

Fault: The Manager_name has moved to MDR mode, and this Manager cannot handle the change
Severity: Critical
Description/Cause: The Central Manager server is in Standby mode. The Manager server that is configured by Central Manager goes into secondary Standby mode after MDR creation or before the data dump from primary to secondary takes place.
Action: If the Central Manager server has moved to Standby, move the Central Manager with the latest Manager information to Active mode, or recreate the MDR pair.
Description/Cause: The Manager server configured by Central Manager is in Active mode but is in a disconnected state and therefore cannot communicate with Central Manager. If the Manager is reconnected and Central Manager is in Standby mode, then the peer Central Manager does not have the Manager configuration.
Action: If the Manager has moved to Standby, make the Manager with the Central Manager information Active, or make sure that the active Central Manager or Manager has the latest configuration data.

Fault: The Manager has moved to MDR mode, and this Manager cannot handle the change
Severity: Critical
Description/Cause: The Manager server is in Standby mode (MDR action) and the active peer Manager does not have Central Manager information.
Action: If the Manager server has moved to Standby, make the Central Manager with the latest Manager information Active, or re-form the MDR pair; if the Manager has moved to Standby, make the Manager with the Central Manager information Active, or make sure that the active Central Manager or Manager has the latest configuration data.

Fault: There is conflict in the MDR configuration for the Manager <Manager_name>
Severity: Critical
Description/Cause: The configuration between an existing MDR pair (Manager 1 and Manager 2, both Managers being Central Manager configured) is disabled and a new MDR pair configuration has been created with Manager 2 and Manager 3. Manager 2 is in Standby mode and Manager 3 does not have the Central Manager configuration.
Action: Dissolve and recreate an MDR pair.

Fault: The MDR connection is down.
Severity: Critical
Description/Cause: The communication from <Primary/Secondary> to <Secondary/Primary> is down.
Action: Please look into the connection statuses of the systems and the Manager logs.

Vulnerability Manager configuration

Fault: Scheduled Vulnerability Manager vulnerability data import failed
Severity: Critical
Description/Cause: This message indicates that the vulnerability data import by the Scheduler from the Vulnerability Manager database has failed.
Action: Refer to the error logs for details.

Fault: Vulnerability data import from Vulnerability Manager failed
Severity: Critical
Description/Cause: Scheduled import of vulnerability data failed from FoundStone database server into ISM database table.
Action: This message is informational.

Fault: On demand scan failed
Severity: Critical
Action: See the fault message.
Description/Cause: Scan failed because the connection to Vulnerability Manager Scan Engine was refused. <Connection has been reset by Foundstone Server. Unable to communicate with Foundstone Server. FoundScan Engine may not be reachable or Failed to resolve Fully Qualified Domain Name SSL Handshake with FoundScan Engine Failed.>, <Please check if the FS API Service port has been blocked by Firewall or if valid port has been specified. Please check the ems log for more details. Try adding the engine host name entry to the DNS Server or Try adding an entry for engine IP and host name in hosts file located in windows\system32\drivers\etc. No Trusted Certificate found, Please check the Foundstone version and certificates used for communication.
Please check if the FS API Service port has been blocked by Firewall or if valid port has been specified.>

Advanced Threat Defense connectivity

Fault: Communication failure with the Advanced Threat Defense device
Severity: Critical
Description/Cause: The Manager is unable to establish connectivity with the Advanced Threat Defense (ATD) device. See system log for details. This fault will be cleared when connection is restored.
Action: Any connectivity issues with the Advanced Threat Defense (ATD) will generate this fault, including ATD device failure, network connectivity failure, and even situations where the network cable is detached from the Manager server. This fault clears when communication with the ATD is restored.

Fault: Valid Edge certificate download failure
Severity: Critical
Description/Cause: Cannot push Valid Edge certificate to device <Sensor_name>. See system log for details.
Action: Occurs when the Manager cannot push the Valid Edge Certificate to a device. Could result from a network connectivity issue.

Central Manager

Fault: Central Manager custom attack synchronization failed
Severity: Critical
Description/Cause: Port conflict in Central Manager custom attack definition synchronization. Port <port_name> is already in use. Free this port for Central Manager synchronization to succeed.
Action: Free this port for McAfee® Network Security Central Manager synchronization to succeed.

Fault: Deleted Manager information
Severity: Critical
Description/Cause: The Manager information <mgr_ip_address> has been deleted. Reason: <The action Stand alone to MDR is received where the peer is already having configured <standby_manager> and hence deleting, mgr info of <standby_managers> this LM will be no longer trusted>.
Action: See the fault message.

Fault: Manager <Manager_name> unreachable
Severity: Critical
Description/Cause: Connectivity with Manager <Manager_name> has been lost.
Action: Indicates that the Network Security Central Manager and the Network Security Managers cannot communicate with each other; the connection between the two may be down, or the Manager has been administratively disconnected. Troubleshoot connectivity issues: 1) check that a connection route exists between the Network Security Central Manager and the Network Security Manager; 2) access the Network Security Manager/Network Security Central Manager directly. This fault clears when the Manager detects the Sensor again.

Fault: Manager <Manager_name> MDR error
Severity: Critical
Description/Cause: Manager <Manager_name> detected in standby mode. The peer Manager <peer_Manager_name> is either not reachable or does not have <configuration> data. The Manager <Manager_name> used to be the <previousIp>/<previousPeerIp> MDR configuration and is now the <currentIp>/<currentIpsPeer> MDR configuration, and the primary Manager <currentIp> is not active and its peer <currentIpsPeer> does not have <ICC> configured.
Action: If the Manager that has moved to MDR mode is a Network Security Central Manager, make the Central Manager that has all the Network Security Managers' data Active, or re-form the MDR pair. If the Manager that has moved to MDR mode is a Network Security Manager, make the Manager that has the Central Manager data Active, or make sure that the active Manager has the Central Manager configuration data.

Fault: MDR configuration conflict for Manager <Manager_name>
Severity: Critical
Description/Cause: Manager <primary_mgr_ip> is in <standalone/MDR pair> mode, and its peer Manager <secondary_mgr_ip> is in <standalone/MDR pair> mode.
Action: Correct the MDR pair.

Fault: MDR pair changed
Severity: Critical
Description/Cause: This fault reports a change of MDR configuration for a Local Manager or Central Manager. It indicates that, for this Manager, the IP addresses of the underlying MDR pair have changed. The fault gives the old and new IP addresses of the primary and secondary Manager.
Action: Correct the MDR pair.

Fault: The Manager <Manager_name> is not reachable
Severity: Critical
Description/Cause: Indicates that the Network Security Central Manager and the Manager cannot communicate with each other; the connection between the two may be down, or the Manager has been administratively disconnected.
Action: 1 Check that a connection route exists between the Network Security Central Manager and the Manager. 2 Access the Manager/Network Security Central Manager directly. This fault clears when the Manager detects the Sensor again.
Description/Cause: No communication exists between Central Manager and Manager. Indicates that the Central Manager server and Manager cannot communicate with each other. The connection between these two may be down, or Central Manager has been administratively disconnected.
Action: 1 Check that a connection route exists between the Central Manager and Manager. 2 Access the Manager directly. This fault clears when the Manager detects the Sensor again.

Fault: Network Security Central Manager UDS signature synchronization failed
Severity: Critical
Description/Cause: Port conflict in Network Security Central Manager UDS synchronization. Port already in use by UDS. Free this port for Central Manager synchronization to succeed.
Action: Free this port for Network Security Central Manager synchronization to succeed.

Fault: Trust request failure
Severity: Critical
Action: See additional text information.
Description/Cause: The trust request has failed. Error message: <exception string>. The additional text is one of the following:
• The trust request has failed because Manager <Network Security Central Manager> may not be reachable. Please confirm the Manager IP address and that its service is up and running.
• The trust request has failed because manager <Network Security Central Manager> has not yet configured.
• The trust request has failed because the <Network Security Central Manager> already has a trust using the configured name. The previous trusted with <Network Security Central Manager> may represent Manager or another. The solution is to delete and re-add the configuration with <Network Security Central Manager>.
• The trust request has failed because the configured Manager is in MDR mode, and no active <Network Security Central Manager> Manager has been detected with which to establish the trust.
• The trust request failed due to an internal error.

Alert queue threshold alarms

Fault: Alert save failed
Severity: Critical
Description/Cause: The Manager was unable to access the alert tables in the database. Error Message: <exception string>.
Action: An attempt to save alerts to the database failed, most likely due to insufficient database capacity. Please ensure that the disk space allocated to the database is sufficient, and try the operation again.

Fault: Alert capacity threshold exceeded
Severity: Critical
Description/Cause: <Percentage value>% capacity. Number of alerts: <Number of alerts> (Database maintenance and tuning is required.)
Action: Please perform maintenance operations to clean and tune the database.

Fault: Database connectivity problems
Severity: Critical
Description/Cause: The Manager is having problems communicating with its database. Error Message: <exception string>.
Action: Please check if the database service is running and connectivity is present.

Fault: Database connectivity lost
Severity: Critical
Description/Cause: The Manager has lost connectivity with its database. Error Message: <exception string>
Action: Please check the database connectivity.

Fault: Database integrity error
Severity: Critical
Description/Cause: Unable to locate index file for table: <index_file_name>.
Action: Repair the corrupt database tables.

Fault: Exceeding alert capacity threshold
Severity: Critical
Description/Cause: As with the "Approaching alert capacity threshold" fault message, this message indicates the percentage of space occupied by alerts in the database.
This message appears once you have exceeded the alert threshold specified in Manager | Maintenance.
Action: Perform maintenance operations to clean the database. Delete unnecessary alerts, such as alerts older than a specific number of days. Failure to create additional space could cause undesirable behavior in the Manager.

Licensing

Fault: License expires soon
Severity: Critical
Description/Cause: Indicates that your Network Security Platform license is about to expire; this fault first appears 7 days prior to expiration.
Action: Contact licensing@mcafee.com for a current license. This fault clears when the license is current. Please contact Technical Support or your local reseller.

Fault: License expired
Severity: Critical
Description/Cause: Indicates that your Network Security Platform license has expired.
Action: Contact licensing@mcafee.com for a current license. This fault clears when the license is current.

Fault: Virtual IPS Sensor License non-compliance
Severity: Critical
Description/Cause: When the number of virtual IPS Sensors installed crosses the licenses purchased, this fault appears in the Manager.
Action: Import the required licenses to the Manager before installation, or please contact Technical Support or your local reseller.

Fault: Manager does not have enough licenses to manage the current number of virtual IPS Sensors
Severity: Critical
Description/Cause: The number of licenses needed to become compliant.
Action: Contact Technical Support or your local reseller to obtain a license.

Manager error faults

These are the error faults for a Manager and Central Manager.

Fault: Anti-virus DAT file error
Severity: Error
Description/Cause: A Device is detecting an error on av-dat file segment <segment_id>. The segment error cause is <unknown cause>, and the download type is <init/update>.
Action: Make sure that the Sensor is online and in good health. The Manager will make another attempt to push the file to the Sensor. This fault will clear when the av-dat file is successfully pushed to the Sensor.

Fault: Device in bad health
Severity: Error
Description/Cause: Please check the running status of device <device_name>.
This fault occurs with any type of device software failure. (It usually occurs in conjunction with a software error fault.)
Action: If this fault persists, we recommend that you perform a Diagnostic Trace and submit the trace file to Technical Support for troubleshooting.

Fault: ePO Server Connection Error
Severity: Error
Description/Cause: The Manager has no connection to the configured ePO server.
Action: Indicates that the Manager has no connection to the configured ePO server. This can be due to network connectivity issues, incorrect credentials, or incorrect configuration. Refer to the ePO integration documentation for more information.

Fault: Firewall filter application error
Severity: Error
Description/Cause: Error applying firewall filter <FILTER: [AttackID=<attackId>] [VidsID=<vidsId>] [SrcIP=<srcIP>] [DstIP=<dstIP>] [Port=<port>] [Protocol=<protocol>] [type=<typeString>]>
An attempt to apply this firewall filter from the device to the firewall has failed. Failure reason: <Exceed Max Number of Filters / Error Applying Filter / Timeout During Adding Filter / Unknown Host Isolation Error#>
Action: Check your firewall configuration. If possible, increase the maximum number of available filters. Ensure connectivity between the sensor and the firewall.

Fault: IP: IPS quarantine block nodes exhausted
Severity: Error
Description/Cause: When the number of quarantine rules exceeds the maximum permitted limit, the Central Manager raises a fault message to the Manager. This can be viewed as an alert in the Threat Analyzer.
Action: For more information on quarantine and remediation functionality, see Quarantine settings.

Fault: MLC Server Connection Error
Severity: Error
Description/Cause: Manager has no connection to configured MLC server. Indicates that the Manager has no connection to the configured MLC server. This can be due to incorrect certificate import, network connectivity issues, or issues internal to the MLC server.
Action: Refer to the MLC integration documentation for more information.

Quarantine rule limits (IP: IPS quarantine block nodes exhausted): You can have up to 1000 Quarantine rules for IPv4 addresses, and up to 500 Quarantine rules for IPv6 addresses.

Mail server and queue

Fault: Alert queue full
Severity: Error
Description/Cause: The Manager has reached its limit <queue_size_limit> for alerts that can be queued for storage in the database. (<no_of_alerts> alerts dropped)
Indicates that the Manager has reached the limit (default of 100,000) of alerts that can be queued for storage in the database. Alerts are being detected by your sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the alerts you are receiving to see what is causing the heavy traffic on the sensor(s).

Fault: E-mail server unreachable
Severity: Error
Description/Cause: Connection attempt to e-mail server <mail server> failed. Error: <Messaging Exception String>.
Action: This fault indicates that the SMTP mailer host is unreachable, and occurs when the Manager fails to send an email notification or a scheduled report. This fault clears when an attempt to send the email is successful.

Fault: Packet log queue full
Severity: Error
Description/Cause: The Manager packet log queue has reached its maximum size of <pktlog_queue_size_limit>. (<no_of_pktlogs_dropped> packets)
The Manager packet log queue has reached its maximum size (default 200,000 packets), and is unable to process packets until there is space in the queue. Packets are being detected by your sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the packets you are receiving to see what is causing the heavy traffic on the sensor(s).

Severity: Error
Description/Cause: The Manager packet log queue has reached its maximum size (default 200,000 alerts), and is unable to process packet logs until there is space in the queue. This is evidence of extremely heavy activity.
Action: Check the packet logs you are receiving to see what is causing the heavy traffic on the Sensor. Also see the suggested actions for the alert Unarchived, queued alert count full.

Fault: Packet capturing error
Severity: Error
Description/Cause: One of the following:
The device detected an error connecting to the SCP server while attempting to transfer a packet capture file.
The device is unable to send the packet capture file via SCP.
The device has stopped capturing packets due to insufficient internal memory.
The device experienced an internal error while performing the packet capture.
The device is unable to authenticate with target server to transfer a packet capture file.
Action: Device shall attempt to automatically recover. Check Packet Capture configuration.

Fault: Queue size full
Severity: Error
Description/Cause: The Manager alert queue has reached its maximum size (default 200,000 alerts), and is unable to process alerts until there is space in the queue. Alerts are being detected by your sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the alerts you are receiving to see what is causing the heavy traffic on the sensor(s).

Description/Cause: The Manager alert slow consumer (SNMP Trap forwarder) queue has reached its maximum size of ... (... alerts dropped)
The Manager alert slow consumer (SNMP Trap forwarder) queue has reached its maximum size, and is unable to forward alerts until there is space in the queue. Alerts are being detected by your sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the alerts you are receiving to see what is causing the heavy traffic on the sensor(s).

Fault: Syslog Server unreachable
Severity: Error
Description/Cause: Connection attempt to Syslog server <server address> failed. Error: <Syslog TCP connection failed>.
Action: This fault indicates that the Syslog Server is
unreachable, and occurs when the Manager fails to send a syslog notification. This fault clears when an attempt to send the syslog is successful.

Fault: Unarchived, queued packet log count full
Severity: Error
Description/Cause: Indicates that the Manager has reached the limit (default of 100,000) of packet logs that can be queued for storage in the database. Also indicates the number of dropped packet logs.
Indicates that the Manager has reached the limit (default of 100,000) of packets that can be queued for storage in the database. Packets are being detected by your sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the packets you are receiving to see what is causing the heavy traffic on the sensor(s).

Update device configuration

Fault: Device configuration update failed
Severity: Error
Description/Cause: A Device configuration update failed to be pushed from the Manager server to the sensor.
Action: Please see ems.log file to isolate reason for failure.

Alert capacity monitor

Fault: Approaching alert capacity threshold
Severity: Error
Description/Cause: <Percentage_value>% capacity. Number of alerts: <number_of_alerts>. (Database maintenance and tuning is recommended.)
Action: Please perform maintenance operations to clean and tune the database.

Fault: Approaching alert capacity
Severity: Error
Description/Cause: Current database size is <x> GB and disk capacity is <y>.

Incident Manager

Fault: Incident update failed
Severity: Error
Description/Cause: The Manager is unable to accept more incidents from the Incident Generator. Error message: <exception string>.
Action: You have reached the maximum number of incidents that can be accepted by the Manager. Delete old incidents to provide room for incoming incidents.

Alert queue threshold alarms

Fault: Alert pruning failure
Severity: Error
Description/Cause: The Manager was unable to prune alerts and packet logs during normal maintenance. Error Message: <exception string>.
Action: Check your database connections.
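The queue-full faults above (Alert queue full, Packet log queue full, Unarchived, queued packet log count full) describe the same situation: items arrive faster than they are processed, a fixed-size queue fills, and the overflow is dropped and counted. The following Python sketch illustrates that behavior only. It is not the Manager's implementation; the names BoundedQueue and simulate, and the small limits used, are hypothetical.

```python
from collections import deque


class BoundedQueue:
    """Fixed-capacity queue that counts the items it must drop when full."""

    def __init__(self, limit):
        self.limit = limit
        self.items = deque()
        self.dropped = 0

    def offer(self, item):
        """Enqueue item; if the queue is full, count a drop and return False."""
        if len(self.items) >= self.limit:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

    def poll(self):
        """Remove and return the oldest item, or None if the queue is empty."""
        return self.items.popleft() if self.items else None


def simulate(limit, arrivals_per_tick, drains_per_tick, ticks):
    """Run a producer and a consumer for a number of ticks.

    Returns (items still queued, items dropped).
    """
    queue = BoundedQueue(limit)
    for tick in range(ticks):
        for n in range(arrivals_per_tick):
            queue.offer((tick, n))
        for _ in range(drains_per_tick):
            queue.poll()
    return len(queue.items), queue.dropped


if __name__ == "__main__":
    # Producer outpaces consumer: the queue fills and drops begin.
    print(simulate(limit=10, arrivals_per_tick=5, drains_per_tick=3, ticks=10))
    # Consumer keeps up: nothing is dropped.
    print(simulate(limit=10, arrivals_per_tick=3, drains_per_tick=5, ticks=10))
```

Once arrivals exceed the drain rate for long enough, the drop counter grows on every tick, which is why the suggested action is to find the source of the heavy traffic rather than to wait for the queue to empty.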
Device upload scheduler

Fault: Scheduled botnet detector deployment failure
Severity: Error
Description/Cause: The Manager was unable to perform the scheduled Bot DAT deployment to the device <Sensor_name>.
Action: Indicates that the Manager was unable to perform the scheduled BOT DAT deployment to the Sensor. This can be caused by network connectivity problems between the Manager and the Sensor, or by an invalid DAT file. This fault clears when an update is sent to the Sensor successfully.

Fault: Scheduled IPS signature set deployment failure
Severity: Error
Description/Cause: The Manager was unable to perform the scheduled signature set deployment to the device. Error Message: <exception string>.
Action: This fault can indicate problems with network connectivity between the Manager and the sensor, incompatibility between the update set and the Manager software, compilation problems with the signature update set, or an invalid update set. This fault clears when an update is sent to the sensor successfully.

Real-time update scheduler

Fault: Real-time Scheduler - signature set update from Manager to Sensor failed
Severity: Error
Description/Cause: Unable to make scheduled signature set update from the Manager to Sensor.
Action: This fault can indicate problems with network connectivity between the Manager and the Sensor. This fault clears when a signature update is applied successfully.

Fault: Scheduled real-time update from Update Server to Manager failed
Severity: Error
Description/Cause: Unable to make scheduled update of Manager signature sets. This fault can indicate, for example, problems with network connectivity between the Update Server and the Manager or between the Manager and the Sensor; invalid update sets; or update sets that were not properly signed.
Action: This fault clears when a signature update is applied successfully.

Fault: Scheduled BOT DAT signature set download failure
Severity: Error
Description/Cause: The Manager is unable to perform the scheduled BOT DAT signature set download from the GTI Server.
Error Message: <exception string>.
Action: This fault can indicate problems with network connectivity between the GTI Server and the Manager, or an invalid BOT DAT file. This fault clears automatically once a new signature set update is successfully installed.

Fault: Scheduled IPS signature set download failure
Severity: Error
Description/Cause: The Manager is unable to perform the scheduled signature set download from the Update Server. Error Message: <exception string>.
Action: This fault can indicate problems with network connectivity between the Update Server and the Manager; invalid update sets; or update sets that were not properly signed. This fault clears when a signature update is applied successfully.

Fault: Queue size full
Severity: Error
Description/Cause: The Manager alert queue has reached its maximum size (default 200,000 alerts), and is unable to process alerts until there is space in the queue. Alerts are being detected by your sensor(s) faster than the Manager can process them. This is evidence of extremely heavy activity.
Action: Check the alerts you are receiving to see what is causing the heavy traffic on the Sensor(s).

Manager warning faults

These are the warning faults for a Manager and Central Manager.

Fault: Disk Space Warning
Severity: Warning
Description/Cause: Raised when the utilized disk space on the Manager server is between 80% and 89%. Example:
• Used disk space = 80% invokes a warning.
• Used disk space = 79% does not result in any fault.
Action: Make sure that the drive where the Manager is installed has sufficient disk space.

Fault: Failed to backup IDS Policy
Severity: Warning
Description/Cause: Failed to backup Policy.
Action: Delete previous versions.

Severity: Warning
Description/Cause: Failed to backup Policy.
Action: Please contact technical support or local reseller.

Fault: Failed to backup Recon Policy
Severity: Warning
Description/Cause: Failed to backup Policy.
Action: Please contact technical support or local reseller.

Severity: Warning
Description/Cause: Failed to backup Policy.
Action: Delete previous version.
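The Disk Space Warning band described above (80% to 89% used) can be checked ahead of time on the Manager server. The following Python sketch is an illustration, not part of Network Security Platform: the function names are hypothetical, rounding down to a whole percent is an assumption, and this section does not say what happens above 89%, so the sketch only labels that range as being above the warning band.

```python
import shutil


def used_percent(path):
    """Return the used share of the volume that holds path, as a whole percent."""
    usage = shutil.disk_usage(path)
    return int(usage.used * 100 / usage.total)


def disk_space_state(percent_used):
    """Classify a used-space percentage against the 80% to 89% warning band."""
    if percent_used < 80:
        return "no fault"
    if percent_used <= 89:
        return "warning"
    # This section of the guide does not describe behavior above 89%.
    return "above warning band"


if __name__ == "__main__":
    # "." is a placeholder; point this at the drive where the Manager is installed.
    percent = used_percent(".")
    print(percent, disk_space_state(percent))
```

Running the check on a schedule gives advance notice before the Manager raises the fault.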
Fault: Initiating Audit Log file rotation
Severity: Warning
Description/Cause: The Audit Log capacity of the Manager was reached, and the Manager will begin overwriting the oldest records with the newest records (i.e. first in first out). This fault will be raised after a configured number of records written.
Action: No action is required. The fault indicates the number of records that have been written to the audit log; an equal number of audit log records are now being overwritten. The capacity is configured in the iv_emsproperties table in MySQL; this option can be turned off. If this feature is enabled, when disk capacity is reached or audit log capacity is reached, then Audit Log rotation is initiated.

Fault: Invalid Malware File Archive Storage Settings
Severity: Warning
Description/Cause: The available free disk space on the Manager is less than the disk space required to support the current malware storage settings.
Action: Reduce the maximum disk space allowed for one or more file types.

Fault: MLC IP - User mapping/User count exceeds limit
Severity: Warning
Description/Cause: Currently, NSM-MLC integration supports only 100000 IP-user mappings and 75000 users. One of these limits has been exceeded, so the device behavior cannot be guaranteed until these numbers are brought down.
Action: Check the MLC server configured with this Manager. Consider reducing the number of users/computers monitored by MLC.

Fault: Packet capture complete
Severity: Warning
Description/Cause: The device is near capacity. Packet captures might not capture all packets.
Action: Check Packet Capture configuration and restart if required.

Fault: Policy Update Failed
Severity: Warning
Description/Cause: Failed to update following policies during Signature Set import. Please edit the policy to fix the issue.
Action: Please edit the policy to fix the issue.

Fault: System startup in progress; alerts being restored
Severity: Warning
Description/Cause: System startup restored alerts from the archive file. Threat Analyzer may not show all alerts.
Action: Threat Analyzer may not show all alerts.

Vulnerability Manager configuration

Fault: IPS policy backup failure
Severity: Warning
Description/Cause: Failed to back up policy <policy_name>.
Action: See ems logs.
Severity: Warning
Description/Cause: Failed to back up policy <policy_name>. The maximum limit of <value> has been reached.
Action: Delete previous versions.

Fault: Reconnaissance policy backup failure
Severity: Warning
Description/Cause: Failed to back up policy <policy_name>.
Action: See ems logs.

Severity: Warning
Description/Cause: Failed to back up policy <policy_name>. The maximum limit of <value> has been reached.
Action: Delete previous versions.

Policy synchronization

Fault: Policy synchronization aborted
Severity: Warning
Description/Cause: Policy synchronization has aborted because concurrent processes are running on the Manager.
Policy Synchronization aborted because concurrent processes are running on the Network Security Manager.

Fault: Policy Synchronization aborted because concurrent processes are running on the Manager Server
Severity: Warning
Description/Cause: Unable to synchronize policy because concurrent processes are running on the Manager Server.
Action: Try again later.

Scheduled configuration report

Fault: Scheduled reports error
Severity: Warning
Description/Cause: Report generation failed for report template <report_template_name> because one or more of the selected resources is no longer available.
Action: Edit and save the disabled template in Report Generation.

Manager Disaster Recovery (MDR)

Fault: MDR - IPv4 and IPv6 address configuration
Severity: Warning
Description/Cause: You have specified only the peer Manager <IPv4/IPv6> address. So you cannot add any <IPv4/IPv6> devices to the current Manager, nor will the existing <IPv4/IPv6> devices be able to communicate to the peer Manager.
Action: If a Device needs to communicate over IPv6 with the Manager and the Manager is in MDR mode, then MDR has to be reconfigured to include the IPv6 version of the peer Manager.

Severity: Warning
Description/Cause: The Manager was not shut down gracefully. (Database tuning is recommended.)
Action: Perform database tuning (dbtuning) to fix possible database inconsistencies that may have resulted. Tuning may take a while, depending on the amount of data currently in the database.
Manager Reboot
Fault (for the Warning entry immediately above): Manager shutdown was not graceful

Manager informational faults

These are the informational faults for a Manager and Central Manager.

Fault: Alert Archival state has changed
Severity: Informational
Description/Cause: The alert archival process has started.
Action: This message is for user information. No action required.

Fault: Command to invoke upload internal hosts process to NSM
Severity: Informational
Description/Cause: The internal host information is sent to the Manager.
Action: This message is for user information. No action required.

Fault: Cluster software initialization status
Severity: Informational
Description/Cause: Device software has been initialized.
Action: On initialization failure, check if cluster cross-connects are present as documented.

Fault: Custom attacks are being saved to the Manager
Severity: Informational
Description/Cause: One or more custom attack definitions are in the process of being saved from the Custom Attack Editor to the Manager.
Action: This message is for user information. No action required.

Fault: Database backup in progress
Severity: Informational
Description/Cause: A database backup is in progress.
Action: This message is informational.

Fault: Data dump retrieval from peer has been completed successfully
Severity: Informational
Description/Cause: The data dump retrieval from peer has been completed successfully.
Action: This message is for user information. No action required.

Fault: Data dump retrieval from peer is in progress
Severity: Informational
Description/Cause: The data dump retrieval from peer is in progress.
Action: This message is for user information. No action required.

Fault: Database backup failure
Severity: Informational
Description/Cause: Unable to backup database tables.
Action: This message indicates that an attempt to manually back up the database has failed. The most likely cause of failure is insufficient disk space on the Manager server; the backup file may be too big. Check your disk capacity to ensure there is sufficient disk space, and try the operation again.
Fault: Manager Request is not from Trusted IP Address
Severity: Informational
Description/Cause: The Manager Request is not from Trusted IP Address.
Action: Ensure the Peer Manager is not already in MDR with other Manager.

Fault: Network Security Platform-defined UDS overridden by signature set.
Severity: Informational
Description/Cause: A Network Security Platform-defined UDS has been incorporated in a new signature set and has been removed from the Custom Attack Editor.
Action: This message is informational and indicates that an emergency McAfee-provided UDS signature has been appropriately overwritten as part of a signature set upgrade.

Fault: Packet capture file transfer status
Severity: Informational
Description/Cause: One of the following:
The device has started sending the packet capture file via SCP.
The device has completed sending the packet capture file via SCP.
The device has stopped capturing packets because it has reached the configured maximum capture file size.
The device has stopped capturing packets because it has reached the configured maximum duration.
The device is ready to transfer the packet capture file to Manager.
Action: This message is informational.

Fault: Packet Log Archival state has changed
Severity: Informational
Description/Cause: Indicates that the packet log archival state has changed.
Action: This message is for user information. No action required.

Fault: Scheduler - Signature download from Manager to Sensor failed
Severity: Informational
Description/Cause: Scheduler - Signature download from Manager to Sensor has failed.
Action: This message is for user information. No action required.

Fault: Sensor software image or signature set import in progress
Severity: Informational
Description/Cause: A Sensor software image or signature set file is in the process of being imported from the Network Security Platform Update Server to the Manager server.
Action: This message is for user information. No action required.

Severity: Informational
Action: This message is for user information. No action required.
Fault: Signature set update failed
Severity: Informational
Description/Cause: Signature set update failed while transferring from the Manager server to the Sensor.
Action: This message is for user information. No action required.

Fault: Signature set update not successful
Severity: Informational
Description/Cause: The attempt to update the signature set on the Manager was not successful, and thus no signature set is available on the Manager. You must re-import a signature set before performing any action on the Manager.
Action: A valid signature set must be present before any action can be taken in Network Security Platform.

Fault: Switchback has been completed, the primary Manager has got the control of Sensors now
Severity: Informational
Description/Cause: N/A
Action: This message is for user information. No action required.

Fault: System startup in process - alerts being restored
Severity: Informational
Description/Cause: The Manager is starting up and restoring alerts from the device archive file. Threat Analyzer may not show all alerts until the Manager is fully online.
Action: You need to restart Manager, to view the restored alerts in Threat Analyzer.

Fault: Syslog Forwarder is not configured for the Admin Domain: <Admin Domain Name> to accept the ACL logs.
Severity: Informational
Description/Cause: ACL logging is enabled, but no Syslog server has been configured to accept the log messages.
Action: Configure a Syslog server to receive forwarded ACL logs.

Fault: Successful scheduled DAT file download
Severity: Informational
Description/Cause: The scheduled DAT file download from the McAfee GTI Server to the Manager was successful.

Fault: UDS export to the Manager in progress
Severity: Informational
Description/Cause: One or more UDS is in the process of being exported from the Custom Attack Editor to the Manager server.
Action: This message is for user information. No action required.
Action (Successful scheduled DAT file download): This message is for user information, no action required.

Vulnerability Manager configuration

Fault: Successful vulnerability data import from Vulnerability Manager
Severity: Informational
Description/Cause: One of the following:
Vulnerability data successfully imported from FoundStone database server into ISM database table.
No vulnerability records found for import from FoundStone database.
Action: This message is informational.

Fault: Scheduled Vulnerability Manager vulnerability data import failed
Severity: Informational
Description/Cause: Scheduled Vulnerability Manager vulnerability data import has failed.
Action: Refer to error logs for details.

Fault: Vulnerability data import from McAfee Vulnerability Manager database was successful
Severity: Informational
Description/Cause: This message indicates that the vulnerability data import from McAfee Vulnerability Manager database is successful.
Action: For more information on importing vulnerability data reports in Manager, see Importing Vulnerability Scanner Reports, McAfee Network Security Platform Integration Guide.

Policy synchronization

Fault: Deleted NSCM rule set in use
Severity: Informational
Description/Cause: Rule set is currently assigned to one or more resource. Create a clone before deletion.
Action: Remove the reference and try again.

Fault: Deleted NSCM attack filter in use
Severity: Informational
Description/Cause: Attack filter is currently assigned to one or more resource. Create a clone before deletion.
Action: Remove the reference and try again.

Fault: Deleted NSCM policy in use
Severity: Informational
Description/Cause: Policy is currently assigned to one or more resource. Create clone before deletion.
Action: Remove the reference and try again.

Central Manager

Fault: Deleted Network Security Central Manager Exception object is applied on resource
Severity: Informational
Description/Cause: Exception object is applied on resource(s). Creating a clone before delete.
Action: Deleted Network Security Central Manager Exception object is applied on resource(s)

Fault: Deleted Central Manager policy is applied on resources
Severity: Informational
Description/Cause: Deleted Central Manager policy is in use
Action: Remove the reference and try again
Description/Cause: Policy <policy name> is applied on resources. Creating clone <policy name> before delete.
Action: Remove the reference and try again.

Fault: Reset to standalone has been invoked; the Primary <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: A "Reset to Standalone" has been invoked; the Primary Manager is standalone and is in control of Sensors
Action: This message is for user information, no action required.

Fault: Reset to standalone is invoked; the Secondary <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: A "Reset to Standalone" has been invoked; the Secondary Manager is standalone and is in control of Sensors
Action: This message is for user information, no action required.

Fault: Reset to standalone is invoked; the <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: A "Reset to Standalone" has been invoked; the current Manager is standalone and in control of Sensors.
Action: This message is for user information. No action required.

Fault: Reset to standalone has been invoked; the peer <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: A "Reset to Standalone" has been invoked; the Peer Manager is standalone and in control of Sensors.
Action: This message is for user information. No action required.

Alert queue threshold alarms

Fault: Alert archival in progress
Severity: Informational
Description/Cause: The Manager is archiving alerts
Action: Wait for the Alert archival to complete

Fault: Packet log archival in progress
Severity: Informational
Description/Cause: The Manager is archiving packet logs
Action: Kindly wait for the Packet Log archival to complete.

Manager Disaster Recovery (MDR)

Fault: Manager version mismatch. Primary Manager has latest version
Severity: Informational
Description/Cause: The two Managers in an MDR configuration must have the same Manager software version installed. The Primary Manager software is more recent than that of the Secondary Manager.
Action: Ensure the two Managers run the same software version.

Fault: Manager version mismatch.
Secondary Manager has latest version
Severity: Informational
Description/Cause: The two Managers in an MDR configuration must have the same Manager software version installed. The Secondary Manager software is more recent than that of the Primary Manager.
Action: Ensure the two Managers run the same software version.

Fault: MDR synchronization in progress
Severity: Informational
Description/Cause: The synchronization from the peer Manager is in progress.
Action: This message is for user information. No action required.

Fault: MDR synchronization failure
Severity: Informational
Description/Cause: There was a problem while retrieving data from the peer Manager - aborting the synchronization process.
Action: Check whether the peer Manager machine is reachable from this machine

Fault: MDR - Manager <Central Manager/Manager> switched from <Standalone/MDR> to <MDR/Standalone> mode
Severity: Informational
Description/Cause: Manager <(mgr_name) OR (ICC) (mgr_name)> is taking the control.
Action: See the fault message.

Fault: MDR manual switch over successful; the Secondary <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: Manager Disaster Recovery initiated via a manual switchover, is successfully completed. Secondary Manager is now in control of Sensors.
Action: This message is for user information. No action required.

The Manager <mngr_name> is <Primary/Secondary> and its peer Manager, <peer_mgr_ip_addr> is <Primary/Secondary>

Fault: MDR automatic switchover has been completed; the Secondary <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: Manager Disaster Recovery switchover has been completed; the Secondary Manager is in control of Sensors.

Fault: MDR configuration information retrieval from Primary Manager successful
Severity: Informational
Description/Cause: Manager Disaster Recovery Secondary Manager has successfully retrieved configuration information from the Primary Manager.
Action: This message is for user information. No action required.
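For faults such as MDR synchronization failure, where the suggested action is to check whether the peer Manager machine is reachable, a TCP connection test from the Manager host can help separate network problems from application problems. The following Python sketch is a generic check, not a Network Security Platform tool. The function name can_connect is hypothetical, and the host and port must come from your own deployment; this guide section does not state which port Manager-to-Manager communication uses.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A call such as can_connect(peer_address, peer_port), with both values taken from your MDR configuration, returns False when no route exists, when a firewall blocks the port, or when nothing is listening on the peer.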
Fault: MDR forced switch over has been completed; the Secondary <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: Manager Disaster Recovery is completed via a manual switchover. Secondary Manager is now in control of Sensors.
Action: This message is for user information, no action required.

Fault: MDR operations have been resumed
Severity: Informational
Description/Cause: Manager Disaster Recovery functionality has been resumed. Failover functionality is again available.
Action: This message is for user information, no action required.

Fault: MDR operations have been suspended
Severity: Informational
Description/Cause: Manager Disaster Recovery functionality has been suspended. No failover will take place while MDR is suspended.
Action: This message is for user information, no action required.

Fault: MDR switchback has been completed; the Primary <Manager/Central Manager> is in control of <Sensors/Manager>
Severity: Informational
Description/Cause: Manager Disaster Recovery switchback has been completed; the Primary Manager has regained control of Sensors.
Action: This message is for user information, no action required.

Fault: MDR pair is changed
Severity: Informational
Description/Cause: McAfee® Network Security Central Manager (Central Manager) has an MDR pair created and the Manager is in disconnected mode. If Central Manager MDR pair is dissolved, and recreated, making the existing primary Manager as secondary Manager and existing secondary Manager as primary Manager, the fault is raised.
Action: Dissolve and re-create an MDR pair.

Action (MDR automatic switchover has been completed): Failover has occurred; the Secondary Manager is now in control of the Sensors. Troubleshoot problems with the Primary Manager and attempt to bring it online again. Once it is online again, you can switch control back to the Primary.

Fault: Network Security Manager Type mismatch
Severity: Informational
Description/Cause: The two Managers in an MDR configuration must have the same Manager Type.
Action: Ensure both Managers are of same Type (Network Security Central Manager or Network Security Manager)

Fault: Successful MDR synchronization from <Network Security Central Manager/Network Security Manager>
Action: This message is informational.
Severity: Informational
Description/Cause: The secondary <Central Manager/Manager> has successfully retrieved configuration information from the primary <Central Manager/Manager>.

Fault: Successful MDR switchback. (Primary <Central Manager/Manager> will take control of the <Managers/Sensors>)
Severity: Informational
Description/Cause: The MDR switchback has completed without error. (The primary <Central Manager/Manager> will take control of the <Managers/Sensors>.)
Action: This message is informational.

Fault: Successful MDR manual switchover. (Secondary <Central Manager/Manager> will take control of the <Managers/Sensors>)
Severity: Informational
Description/Cause: The administrator-initiated MDR switchover has completed without error. (The secondary <Central Manager/Manager> will take control of the <Managers/Sensors>)
Action: This message is informational.

Fault: MDR - Reset to standalone invoked (This <Central Manager/Manager> will take control of the <Managers/Sensors>)
Severity: Informational
Description/Cause: The MDR pair has been reset to standalone Managers. This <Central Manager/Manager> is standalone and will take control of the <Managers/Sensors>.
Action: This message is informational.

Severity: Informational
Description/Cause: The MDR pair has been reset to standalone Managers. The peer <Central Manager/Manager> is standalone and will take control of the <Managers/Sensors>.

Fault: MDR has been canceled
Severity: Informational
Description/Cause: Manager Disaster Recovery has been cancelled
Action: This message is informational.

Fault: MDR automatic switchover detected. (Secondary <Central Manager/Manager> will take control of the <Managers/Sensors>)
Severity: Informational
Description/Cause: An automatic MDR switchover has completed without error. (The secondary <Central Manager/Manager> will take control of the <Managers/Sensors>.)
Action: This message is informational.

Fault: MDR manual switchover in progress. (Secondary
Severity: Informational
Description/Cause: The administrator has initiated an MDR switchover.
(The secondary <Central Manager/ <Central Manager/Manager> will Manager> will take take control of the <Managers/ control of the Sensors>) <Managers/Sensors>) This message is informational. Successful MDR pair creation Informational Manager Disaster Recovery (MDR) has been successfully configured. This message is for user information, no action required. Successful MDR synchronization in progress Informational Synchronization from the peer Manager has been completed successfully. This message is for user information. No action required. MDR suspended Informational Manager Disaster Recovery has been administratively suspended. (No switchover will take place while MDR is suspended.) This message is informational. MDR resumed Informational Manager Disaster Recovery functionality has been resumed by the administrator. Failover functionality is again available. This message is informational. McAfee Network Security Platform 8.1 Troubleshooting Guide 93 4 System fault messages Manager faults Fault Severity Description/Cause MDR Device-to-Manager IP mismatch Informational The device-to-Manager communication IP <Manager_ip> does not match with the peer Manager IP <peer_Manager_ip>. Action Ensure that the Sensor- Manager communication IP matches with the peer Manager's peer IP in MDR configuration. MDR - <Network Informational The two <Central Manager/ Security Central Manager>s in an MDR configuration Manager/Network must have the same <Network Security Manager> Security Central Manager/Network version mismatch. (Peer Security Manager> software version <Central Manager/ installed. The peer <Network Manager> has newer Security Central Manager/Network version) Security Manager> server software is more recent than that of the current <Central Manager/ Manager>. Ensure both Managers are running the same version of the Manager software. 
Fault | Severity | Description/Cause | Action
MDR - Manager type mismatch | Informational | The two Managers in an MDR pair must be of the same type (Manager versus Central Manager). | Ensure both Managers are of same Type (Network Security Central Manager or Network Security Manager).
MDR - <Central Manager/Manager> request is not from a trusted IP address | Informational | The <Central Manager/Manager> request is not from a trusted IP address. | Ensure the Peer Manager is not already in MDR with other Manager.
MDR - system time synchronization error | Informational | The two Managers in an MDR pair must have the same operating system time. Ensure both Managers are in sync with the same time source. (Otherwise, the device communication channels will experience disconnects.) | Ensure both Managers are in sync with current time.

Database archival
Alert archival in progress | Informational | Alerts are currently being archived. | Do not attempt to tune the database or perform any other database activity such as a backup or restore until the archival process successfully completes.
Successful alert archival | Informational | The alert archival successfully completed. | This message is for user information. No action required.

Database tuning
Database tuning in progress | Informational | The Manager database is currently being tuned. | The user cannot do the following operations during the tuning process: (1) Viewing / Modifying alerts from Threat Analyzer (2) Generating IDS reports on alerts (3) Backing up / Restoration of all tables OR alert and packet log tables (4) Archiving alerts and packet logs into files
Database tuning recommended | Informational | Database tuning is recommended. <no_of_days> days have passed since the last database tuning. | Shut down the Manager and execute the Database Tuning Utility at the earliest opportunity.
Successful database tuning | Informational | The Manager database was tuned without error. | This message is for user information. No action required.

ACL logging
Required syslog forwarder missing | Informational | Firewall logging has been enabled, yet no syslog server is currently defined/enabled for admin domain <admin_domain_name>. | This message will appear until a Syslog server has been configured for use in forwarding ACL logs.

Update scheduler
Automatic botnet detectors deployment in progress | Informational | A new botnet detector has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices. | This message is informational.
Automatic signature set deployment in progress | Informational | A new signature set has recently been downloaded from the Update Server to the Manager and is now being deployed to the devices. | This message is informational.
Botnet detectors deployment in progress | Informational | A new botnet detectors version has recently been downloaded from the McAfee update server to the Manager and is being deployed to the devices. | This message is informational.
Connecting to McAfee update server for updates | Informational | Connecting to McAfee update server for updates. | This message is informational.
Failed connection attempt to McAfee GTI Server. | Informational | Failed to connect to the McAfee GTI Server. | This message is informational.
Scheduled signature set deployment in progress | Informational | A new signature set has recently been downloaded from the Update Server to the Manager and is now being deployed to the devices, as scheduled. | This message is informational.
Scheduled signature set download in progress | Informational | A scheduled signature set update is in the process of downloading from the McAfee Update Server to the Manager server. | This message is informational.
Fault | Severity | Description/Cause | Action
Scheduled botnet detectors download is in progress | Informational | The scheduled botnet detectors download from the McAfee update server to the Manager is in progress. | This message is informational.
Successful scheduled signature set deployment | Informational | A new signature set has recently been downloaded from the Update Server to the Manager and successfully deployed to the devices, as scheduled. | This message is informational.
Successful scheduled signature set download | Informational | The scheduled signature set download from the McAfee Update Server to the Manager was successful. | This message is informational.
Successful scheduled botnet detectors download | Informational | The scheduled botnet detectors download from the McAfee update server to the Manager was successful. | This message is informational.
Successful scheduled botnet detectors deployment | Informational | A new botnet detectors version has recently been downloaded from the McAfee update server to the Manager and is being deployed to the devices. | This message is informational.
Successful automatic botnet detectors deployment | Informational | A new botnet detectors version has recently been downloaded from the McAfee Update Server to the Manager and successfully deployed to the devices. | This message is informational.
Successful automatic signature set deployment | Informational | A new signature set has recently been downloaded from the Update Server to the Manager and successfully deployed to the devices. | This message is informational.
Update Scheduler in progress | Informational | This message indicates that the update scheduler is in progress. | This message is informational.

Signature download from Update Server to Manager
Signature set deployment in progress | Informational | A signature set is in the process of being deployed from the Manager to the device. | This message is informational.
Successful signature set download from Update Server | Informational | The signature set was successfully downloaded from the McAfee Update Server to the Manager. | This message is informational.

Update device configuration
Device configuration update in progress | Informational | The Manager is in the process of pushing the configuration (and signature set, as applicable) to the device. | This message is informational.

Signature set
DAT file import is in progress | Informational | A DAT file is being imported into the Manager. | This message is for user information. No action required.
Device software, IPS signature set, or botnet detectors import in progress | Informational | A device software, IPS signature set, or botnet detectors file is being imported into the Manager. | This message is informational.
Device software, IPS signature set, or botnet detectors download in progress | Informational | A device software, IPS signature set, or botnet detectors file is being downloaded from the McAfee Update Server to the Manager. | This message is informational.

Audit logger
Rotating audit logs | Informational | The audit log capacity on the Manager is <value taken from ems property iv.policymgmt.RuleEngine.CircularAuditLogMax> records. After this number of records is reached, the Manager will overwrite the oldest records with the newest records (i.e. first in, first out). This fault indicates that <value taken from ems property iv.policymgmt.RuleEngine.CircularAuditLogMax> records have been written to the audit log and that the oldest audit log records are now being overwritten. This fault will be raised every <value taken from ems property iv.policymgmt.RuleEngine.CircularAuditLogMax> records written. | No action is required. This is an informational fault that indicates the audit log is being overwritten.

User defined signature
Custom attack overridden by signature set | Informational | One or more custom attack definitions have been incorporated into the current signature set and therefore removed as custom attacks. Removed custom attacks: <list of removed custom attacks> | This message is for user information. No action required.
Custom attack save in progress | Informational | One or more custom attack definitions are in the process of being saved to the Manager. | This message is informational.
Custom attack save successful | Informational | One or more custom attack definitions have been successfully saved to the Manager. | This message is for user information. No action required.

Backup Manager
Database backup is in progress | Informational | A manual or scheduled database backup process is in progress. | Do not attempt to tune the database or perform any other database activity such as an archive or restore until the backup process successfully completes.
Database backup successful | Informational | The database backup was successful. | This message is for user information. No action required.

Backup scheduler
Scheduled backup failed | Informational | Unable to create backup for scheduled database | This fault indicates problems such as SQL exceptions, database connectivity problems, or out-of-disk space errors. Check your backup configuration settings. This fault clears when a successful backup is made.

Mail server and queue
System startup in process - alerts being restored | Informational | The Manager is starting up and restoring alerts from the device archive file. Threat Analyzer may not show all alerts until the Manager is fully online. | Threat Analyzer may not show all alerts. Restarting the Manager is required to show the restored alerts in Threat Analyzer.
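The first in, first out overwrite behavior described for the Rotating audit logs fault can be pictured with a short sketch. This is not Manager code: the class name `CircularAuditLog`, its attributes, and the capacity used in the note below are hypothetical, and the real capacity comes from the ems property named in the fault description. The sketch keeps only the newest records and counts one rotation fault each time a further batch of `capacity` records has been written.

```python
from collections import deque


class CircularAuditLog:
    """Illustrative FIFO audit log (hypothetical, not the Manager implementation)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be a positive number of records")
        self.capacity = capacity
        # A deque with maxlen silently drops the oldest entry when it is full.
        self._records = deque(maxlen=capacity)
        self.total_written = 0
        self.rotation_faults = 0

    def write(self, record):
        """Store a record; count a rotation fault every `capacity` records written."""
        self._records.append(record)
        self.total_written += 1
        if self.total_written % self.capacity == 0:
            self.rotation_faults += 1

    def records(self):
        """Return the retained records, oldest first."""
        return list(self._records)
```

With a capacity of 3, writing seven records leaves only the three newest in the log and counts two rotation faults, one after the third record and one after the sixth.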
Sensor faults
The Sensor faults can be classified into critical, error, warning, and informational. The Action column provides you with troubleshooting tips.

Sensor critical faults
These are the critical faults for a Sensor device.

Fault | Severity | Description/Cause | Action
BOT DAT file download failure | Critical | The Manager cannot push the BOT DAT file to device <Sensor_name> | Occurs when the Manager cannot push the BOT DAT file to the Sensor. Could result from a network connectivity issue.
Bootloader upgrade failure | Critical | The firmware upgrade has failed on the Sensor. | Debug or reload the firmware on the Sensor.
Conflict in MDR Status | Critical | Sensor found a conflict with MDR status; Manager IP address / MDR status as ... | There is a problem with MDR configuration. Check your MDR settings.
CRC Errors | Critical | A recoverable CRC error has occurred within the Sensor. | Reboot the Sensor, which may then resolve the issue causing the fault.
Cluster software mismatch status | Critical | The software versions on the cluster primary and cluster secondary are not the same. | Check for errors in software image download to cluster.
Device re-discovery failure | Critical | The upload of device configuration information for device <Sensor_name> failed again after being triggered by the status polling thread. The device is not properly initialized. | This fault occurs as a second part to the “device discovery failure” fault. If the condition of the Sensor changes such that the Manager can again communicate with it, the Manager again checks to see if the Sensor discovery was successful. This fault is issued if discovery fails, thus the Sensor is still not properly initialized. Check to ensure that the Sensor has the latest software image compatible with the Manager software image. If the images are incompatible, update the Sensor image via a tftp server.
Fault | Severity | Description/Cause | Action
Device is unreachable | Critical | SNMP ping failed: Device <Sensor_name> is unreachable through its command channel. | Indicates that the device cannot communicate with the Manager: the connection between the device and the Manager is down, or the device has been administratively disconnected. Troubleshoot connectivity issues: 1) check that a connection route exists between the Manager and the device; 2) check the device's status using the <status> command in the device command line interface, or ping the device or the device's gateway to ensure connectivity. This fault clears when the Manager detects the device again.
Device dropping packets internally | Critical | Device capacity has been reached. Device front end is overloaded. | Reduce the amount of traffic passing through the Sensor as there is an overload of traffic on the Sensor.
Device model change detected | Critical | Device <Sensor_name> has been replaced by a different model <model_name>, which does not match the original model. The alert channel will not be able to establish a connection. | Make sure you replace the model with the same Sensor model (e.g., replace an I-2700 with an I-2700, not an I-4010).
Device switched to Layer 2 bypass mode | Critical | Device is now operating in Layer 2 bypass mode. (Inspection has been disabled.) | The Sensor has experienced multiple errors, surpassing the configured Layer2 mode threshold. Check the Sensor's status.
Device reboot required | Critical | The SSL decryption state or supported flow count on device <Sensor_name> has been changed (new value = <value>). A device reboot is required to make the change take effect. | Reboot the Sensor to cause the SSL change to take effect.
Dropping alerts and packet logs | Critical | Manager is not communicating with the database; the alert and packet logs are overflowing the queues. | Perform maintenance operations to clean and tune the database or disable dropping option.
Fail Open Control Module Timeout | Critical | Communication has timed out between the Fail Open Controller in the Sensor's Compact Flash port and the Fail Open Bypass Switch. This situation has caused the Sensor to move to Bypass mode and traffic to bypass the Sensor. | The fault could be the result of a cable being disconnected, or removal of the Bypass Switch. This fault clears automatically when communication resumes between the Fail Open Controller and Fail Open Bypass Switch.
Failed to create command channel association | Critical | Command channel association creation failed for device <Sensor_name>. The device is not properly initialized. | This error indicates a failure to create a secure connection between the Manager and the device, which can be caused by loss of time synchronization between the Manager and device or by the device not being completely online after a reboot. Restart the Manager and/or check the Sensor's operating status to ensure that the Sensor's health and status are good.
Failed to update the failover Sensor configuration | Critical | Monitoring port IP settings are not configured for the ports that require it. | Either configure the Monitoring Port IPs for all the above ports (or) Disable those features. For example, monitoring port IP settings are required for a monitoring port to export NetFlow data to NTBA and to implement require-authentication Firewall access rules.
Failover peer status | Critical | This fault indicates whether the Sensor peer is up or down. | This fault clears automatically when the Sensor peer is up.
Fan error | Critical | One or more of the fans inside the Sensor have failed. | On the I-4000, you can also check the Sensor's front panel LEDs to see which fan has failed. For the I-4000 and 4010, the Manager indicates which fan has failed. If a fan is not operational, McAfee strongly recommends powering down the Sensor and contacting Technical Support to schedule a replacement unit. In the meantime, you can use an external fan (blowing into the front of the Sensor) to prevent the Sensor from overheating until the replacement is completed.
Fail-open bypass switch timeout | Critical | The device is not able to communicate with the fail-open bypass switch. | Check external FailOpen kit connections or port pair configuration to restore Inline FailOpen mode.
Firewall connectivity failure | Critical | The connectivity between the device and the firewall is down. | This fault can occur in situations where, for example, the firewall machine is down, or the network is experiencing problems. Ping the firewall to see if the firewall is available. Contact your IT department to troubleshoot connectivity issues.
Hardware error | Critical | There is an error in the hardware component on the Sensor. | Debug or replace the hardware component.
Sensor connectivity status with GTI server | Critical | Sensor is unable to communicate with GTI server. This fault will be cleared when connection is restored. | Message generated based on Sensor Connectivity with GTI Server.
Illegal In-line, fail-open configuration of <port_name>. | Critical | The Sensor is configured to operate with an external Fail-Open Module hardware component, but cannot detect the hardware. | This error applies only to Sensors running in in-line mode with a gigabit port in fail-open mode (using the external Fail Open Module). When this fault is triggered, the port will be in bypass mode and will send another fault of that nature to the Manager. When appropriate configuration is sent to the Sensor (either the hardware is discovered or the configuration changes), the Sensor begins to operate in in-line, fail-open mode.
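The ping checks suggested for the Device is unreachable and Firewall connectivity failure faults can be scripted from any workstation on the management network. The sketch below is a generic illustration, not a Network Security Platform tool: it calls the operating system's `ping` utility (ICMP), which is not the SNMP ping the Manager performs, and the function names and the example addresses in the note below are hypothetical. It probes each hop in order and reports the first one that does not answer.

```python
import platform
import subprocess


def build_ping_command(host, count=3):
    """Build the argv for the system ping utility (-n on Windows, -c elsewhere)."""
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    return ["ping", count_flag, str(count), host]


def ping(host, count=3, timeout_s=15):
    """Return True if the system ping utility exits with status 0."""
    try:
        result = subprocess.run(
            build_ping_command(host, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping is missing or did not finish in time: treat the hop as unreachable.
        return False
    return result.returncode == 0


def first_unreachable(hops, probe=ping):
    """Probe (label, address) pairs in order; return the first label that fails, or None."""
    for label, address in hops:
        if not probe(address):
            return label
    return None
```

For example, `first_unreachable([("device gateway", "192.0.2.1"), ("device", "192.0.2.10")])` returns the label of the first hop that does not reply, or `None` when every hop answers. The addresses come from a documentation-only range and should be replaced with your own. The `probe` parameter exists so that the ordering logic can be exercised without sending traffic.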
Fault | Severity | Description/Cause | Action
Image downgrade detected | Critical | Unsupported configuration upgrade/downgrade, default configurations are used. | This is an internal error. Check the Sensor status to see that the Sensor is online and in good health.
Internal configuration error | Critical | An internal application communication error occurred on the device during <handling signature segments file, SNMP configuration request, or other Sensor internal communication>. | This is an internal error. Check the Sensor status to see that the Sensor is online and in good health.
 | | Image downgrade, Please do a resetconfig. Unsupported configuration upgrades, default configurations are used. Image downgrade detected. Please execute <resetconfig> on the device CLI to complete the downgrade. Unsupported BOT DAT configuration detected after upgrade/downgrade. The default configuration will be used. |
Interface/sub-interface creation failure | Critical | Device <Sensor_name> could not generate an interface or sub-interface. See the system log for details. | This fault generally occurs in situations where the port in question is configured incorrectly. For example, a pair of ports is configured to be in different operating modes (1A is In-line while 1B is in SPAN). Check the configuration of the port pair for inconsistencies, then configure the port pair to run in the same operating mode.
Invalid fail-open configuration: <port_pair_name> | Critical | An invalid configuration has been applied to <port_pair_name> | The Sensor requires appropriate hardware to support in-line, fail-open configuration on its gigabit ports. Ensure that the hardware is available and that the correct ports are in-line and configured to run in this mode.
Invalid SSL decryption key | Critical | Device has detected invalid SSL decryption key: <SSL decryption key> | User may need to re-import the server SSL decryption key.
Late Collision of <count Up/Down> | Critical | This fault can indicate a problem with the setup or configuration of the 10/100 Ethernet ports or devices connected to those ports. It can also indicate a compatibility issue between the Sensor and the device to which it is connected. | Check the speed and duplex settings on the Sensor ports and the peer device ports and ensure that they are the same.
Link failure of Port <port_name> | Critical | The link between a Monitoring port on the Sensor and the device to which it is connected is down, and communication is unavailable. The fault indicates which port is affected. The link on port <port_name> is <up/down>. The link between port "<port_name>" and the device to which it is connected is down, and communication is unavailable. | Contact your IT department to troubleshoot connectivity issues: check the cabling of the specified Monitoring port and the device connected to it; check the speed and duplex mode of the connection to the switch or router to ensure parameters such as port speed and duplex mode are set correctly; check power to the switch or router. This fault clears when communication is re-established.
 | | Users from all three FIPS mode roles (Audit Administrator, Crypto Administrator and Security Administrator) have logged onto the Manager at the same time. |
License expires soon | Critical | Your license is going to expire in less than 7 days. | Please contact Technical Support or your local reseller.
Load Balancer fail-over configuration mismatch | Critical | Load Balancer <Load_Balancer_name> reports fail-over peer configuration is not matching. | Verify Load Balancer configuration. Both Load Balancers in a fail-over pair are expected to have the same configuration.
Load Balancer is unreachable | Critical | SNMP ping failed; load balancer <load_balancer_name> is unreachable through its command channel. | Indicates that the load balancer cannot communicate with the Manager: the connection between the load balancer and the Manager is down, or the load balancer has been administratively disconnected. Troubleshoot connectivity issues: 1) check that a connection route exists between the Manager and the load balancer; 2) check the load balancer status using the status command in the load balancer command line interface, or ping the load balancer or the load balancer gateway to ensure connectivity to the load balancer. This fault clears when the Manager detects the load balancer again.
Malware File Archive Disk Usage (Compressed files) | Critical | The disk usage for archived compressed files has reached the user-defined threshold of the maximum allowed. New files of this type will no longer be saved to the disk once usage reaches 100%. | Prune/delete unwanted files, or increase the maximum disk space or both.
Malware File Archive Disk Usage (Executables) | Critical | The disk usage for archived executables has reached the user-defined threshold of the maximum allowed. New files of this type will no longer be saved to the disk once usage reaches 100%. | Prune/delete unwanted files, or increase the maximum disk space or both.
Malware File Archive Disk Usage (Office Files) | Critical | The disk usage for archived office files has reached the user-defined threshold of the maximum allowed. New files of this type will no longer be saved to the disk once usage reaches 100%. | Prune/delete unwanted files, or increase the maximum disk space or both.
Malware File Archive Disk Usage (PDFs) | Critical | The disk usage for archived PDFs has reached the user-defined threshold of the maximum allowed. New files of this type will no longer be saved to the disk once usage reaches 100%. | Prune/delete unwanted files, or increase the maximum disk space or both.
Manual Sensor Reboot Required | Critical | Sensor requires manual reboot due to an issue. Please reboot the Sensor. | Please reboot the Sensor.
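The Malware File Archive Disk Usage faults describe two levels: a user-defined threshold at which the fault is raised, and 100% usage, after which new files of that type are no longer saved. The following sketch only illustrates that two-level logic. The function name, parameters, and return labels are hypothetical and do not correspond to any Manager setting or API.

```python
def archive_usage_state(used_bytes, max_bytes, threshold_percent):
    """Classify archive disk usage for one file type.

    Returns "ok" below the threshold, "fault" at or above the user-defined
    threshold, and "full" at 100% (new files would no longer be saved).
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    percent_used = 100.0 * used_bytes / max_bytes
    if percent_used >= 100.0:
        return "full"
    if percent_used >= threshold_percent:
        return "fault"
    return "ok"
```

With a threshold of 80 percent and a maximum of 100 units, a usage of 50 is `"ok"`, 80 is `"fault"`, and 100 is `"full"`. The Action column covers the `"fault"` state: prune files or raise the maximum before usage reaches 100%.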
Fault | Severity | Description/Cause | Action
Memory error | Critical | A recoverable software memory error has occurred within the Sensor. | Reboot the Sensor, which may then resolve the issue causing the fault.
MLC Group Size fault | Critical | Sensor version 8.0 or lower not supported for this group size. | Fault is raised when the admin domain user group exceeds 2,000 in an 8.0 or lower M-series model. The 10,000 admin domain user group is supported only in the 8.1 Manager for M-series model. Reduce the number of admin domain user groups to a value that is supported by your Sensor.
MPE certificate download failure | Critical | Cannot push MPE certificate to device <Sensor_name>. See system log for details. | Occurs when the Manager cannot push the MPE Certificate to a Sensor. Could result from a network connectivity issue.
NTBA IPS connection failure | Critical | Device can't communicate to NTBA over management port on TCP protocol. | If any of devices are uninstalled, this problem may exist initially for a few minutes and should go away. If the fault still appears, then check the firewall rules and connections and connectivity from IPS Management port to NTBA management port.
Ondemand scan failed because connection was refused to FoundScan engine | Critical | This fault can be due to two reasons: the user has not specified the Fully Qualified Domain Name, or the FoundScan engine is shut down. | For more information on using Fully Qualified Domain Name, see McAfee Network Security Platform Integration Guide.
Packet capture rules download | Critical | Cannot push packet capture rules to device <Sensor_name>. See system log for details. | Occurs when the Manager cannot push the packet capture rules to a Sensor. Could result from a network connectivity issue.
Packet overflow | Critical | A recoverable software buffer overflow error has occurred within the Sensor. | Reboot the Sensor, which may then resolve the issue causing the fault.
Port late collision | Critical | This fault could indicate a problem with the setup or configuration of the 10/100 Ethernet ports or devices connected to those ports. It could also indicate a compatibility issue between the Sensor and the device to which it is connected. | The Sensor may be detecting an issue with another device located on the same network link. Check to see if there is a problem with one of the other devices on the same link as the Sensor. This situation could cause traffic to cease flowing on the Sensor and may require a Sensor reboot.
Port pair <port_name> is back to In-line, Fail-Open Mode | Critical | Sensor is back to In-line, Fail-Open Mode. | This message indicates that the ports have gone from Bypass mode back to normal.
Port pair <port_name> is in Bypass Mode | Critical | This fault indicates that the indicated GBIC ports are unable to remain in In-line Mode as configured. This has caused fail-open control to initiate and the Sensor is now operating in Bypass Mode. Bypass mode indicates that traffic is flowing through the Fail Open Bypass Switch, bypassing the Sensor completely. | Check the health of the Sensor and the indicated ports. Check the connectivity of the Fail Open Control Cable to ensure that the Fail Open Control Module can communicate with the Fail Open Controller in the Sensor's Compact Flash port.
Port pair <port_pair_name> in bypass mode | Critical | Device <Sensor_name> is configured to run in-line and to fail open, but it is in bypass mode. | This fault indicates that some failure has occurred, causing the fail-open control module to switch operation to Bypass Mode. No traffic is flowing through the Sensor.
Port pair <port_pair_name> in in-line, fail-open mode | Critical | Device <Sensor_name> has returned to in-line, fail-open mode. | This message indicates that the ports have gone from Bypass Mode back to normal.
Port pair <port_pair_name> fail-open kit status | Critical | Device <Sensor_name> is configured to run in-line and to fail open, but it is in <Bypass, Tap, Absent, Unknown, L2Bypass, Timeout, IllegalConfig, Restore> Mode. | This fault indicates that some failure has occurred, causing the fail-open control module to switch operation to <Bypass, Tap, Absent, Unknown, L2Bypass, Timeout, IllegalConfig, Restore> Mode. No traffic is flowing through the Sensor.
Port media type mismatch | Critical | <Port_name>: Configured media type is <none/optical/copper/unknown>. Inserted media type is <optical/copper/unknown> | Check if pluggable connector matched user configuration. Example: Copper SFP inserted in cage configured for Fiber. Replace the media according to the configured value.
Port certification mismatch | Critical | <Port_name>: McAfee Certified pluggable interface. McAfee certification status is <not matching/matching>. | Check if pluggable interface is McAfee certified. Replace with McAfee certified connector or disable check-box to use non certified connector (recommended to use McAfee certified).
Power supply error | Critical | The <primary/secondary> power supply to the device <was inserted/was removed/is Operational/is non-operational>. | Restore the power supply to clear this fault. Check power to the outlet providing power to the power supply; if a power interruption is not the cause, replace the failed power supply.
Sensor changes to a different model | Critical | A Sensor was replaced with a different model type (for example, an I-1200 was replaced with an I-1200-FO (failover only) Sensor). The alert channel will be unable to make a connection. | When replacing a Sensor, ensure that you replace it with an identical model (for example, replace an I-1200 with an I-1200, do not attempt to replace a regular Sensor with a failover-only model, and vice-versa).
Fault | Severity | Description/Cause | Action
Sensor configuration download failure | Critical | The Manager cannot push original Sensor configuration to Sensor during Sensor re-initialization, possibly because the trust relationship is lost between Manager and Sensor. | The link between Manager and Sensor may be down, or you may need to re-establish the trust relationship between Sensor and Manager by resetting the shared key values. This can also occur when a failed Sensor is replaced with a new unit, and the new unit is unable to discover its configuration information. It happens if the Sensor's health is bad.
<Sensor_name> configuration update failure | Critical | The attempt by the Manager to deploy the configuration to device <Sensor_name> failed during device re-initialization. The device configuration is now out of sync with the Manager settings. The device may be down. See the system log for details. | The Manager cannot push the original device configuration during device re-initialization. This can also occur when a failed device is replaced with a new unit, and the new unit is unable to discover its configuration information.
Sensor reboot required for SSL decryption configuration change | Critical | User-configured SSL decryption settings for a particular Sensor changed, requiring a Sensor reboot. | Reboot the Sensor to cause the changes to take effect.
Signature set error | Critical | The device has detected an error on signature segment <segment_id>. The segment error cause is <unknown cause>, and the download type is <init/update/unknown signature download type>. | Ensure that the Sensor is online and in good health. The Manager will make another attempt to push the file to the Sensor. This fault will clear when the signature segments are successfully pushed to the Sensor.
Fault: Solid State Drive <drive 0> Error
Severity: Critical
Description/Cause: The solid state drive <drive 0> is <drive 1>.
Action: Check the respective SSD status; on failure, replace the SSD.

Fault: Sensor switched to Layer 2 mode
Severity: Critical
Description/Cause: The Sensor has moved from detection mode to Layer 2 (Passthru) mode.
Action: This indicates that the Sensor has experienced the specified number of errors within the specified timeframe and Layer 2 mode has triggered. The Sensor will remain in Layer 2 mode until it is rebooted.

Fault: Sensor switched to Layer 2 Bypass mode
Severity: Critical
Description/Cause: Sensor is now operating in Layer2 Bypass mode. Intrusion detection/prevention is not functioning.
Action: The Sensor has experienced multiple errors, surpassing the configured Layer2 mode threshold. Check the Sensor's status.

Fault: Software error
Severity: Critical
Description/Cause: A recoverable software error has occurred within the device. A device reboot may be required.
Action: This error may require a reboot of the Sensor, which may then resolve the issue causing the fault.

Fault: SSL decryption key download failure
Severity: Critical
Description/Cause: Cannot push SSL decryption keys to device <Sensor_name>. See system log for details.
Action: Occurs when the Manager cannot push the SSL decryption keys to a Sensor. Could result from a network connectivity issue.

Fault: Temperature status
Severity: Critical
Description/Cause: Inlet Temperature value increased above 50.
Action: Check the Fan LEDs in front of the chassis to ensure all internal chassis fans are functioning. This fault will clear when the temperature returns to its normal range.

Fault: User login via console after Sensor initialization
Severity: Critical
Description/Cause: Sensor reports user <user_name> login via console after Sensor initialization. This is a FIPS 140-2 Level 3 violation.
Action: This message is informational.

Advanced Threat Defense connectivity

Fault: Sensor connectivity status with Advanced Threat Defense device
Severity: Critical
Description/Cause: Sensor is unable to communicate with Advanced Threat Defense (ATD) device due to . This fault will be cleared when connection is restored.
Action: Message generated based on Sensor Connectivity with Advanced Threat Defense (ATD) device.

Licensing

Fault: Device discovered without license
Severity: Critical
Description/Cause: Device <Sensor_name> discovered without license, and may not detect attacks.
Action: To obtain a permanent license now, contact Technical Support or your local reseller.

Fault: Device discovered with cluster secondary license
Severity: Critical
Description/Cause: Device <Sensor_name> was discovered with a cluster secondary license. This device should not be connected to the Manager directly.

Fault: Device license expired
Severity: Critical
Description/Cause: Device license expired. The device may not detect attacks.

Fault: Device support license expired
Severity: Critical
Description/Cause: Device support license expired. The device may not detect attacks.

Fault: Expired device license
Severity: Critical
Description/Cause: Device license expired. The device may not detect attacks.

Fault: Expired device support license
Severity: Critical
Description/Cause: Device support license expired. The device may not detect attacks.

Fault: Expired license for device of type <device_type>
Severity: Critical
Description/Cause: The device may not detect attacks.

Fault: Expired support license for device of type <device_type>
Severity: Critical
Description/Cause: The device may not detect attacks.

Fault: No valid license detected for device of type <device_type>
Severity: Critical
Description/Cause: The discovered device may not detect attacks.
Action: Please contact technical support or your local reseller to obtain a License.

Fault: Pending support license expiration for device of type <device_type>
Severity: Critical
Description/Cause: Support license for this device expires in <x> days.
Action: Please contact technical support or your local reseller to renew the support License.

Sensor error faults
These are the error faults for a Sensor device.

Fault: Alert channel down
Severity: Error
Description/Cause: The alert channel for device <Sensor_name> is down. Reason: <"Channel connection failed reason unknown", "Channel is up", "Sensor unable to sync time with NSM (error 2)", "Sensor unable to generate valid certificate (error 3)", "Sensor unable to persist Sensor certificate (error 4)", "Sensor fail connecting to NSM (error 5)", "Sensor in untrusted connection mode (error 6)", "Sensor install connection failed (error 7)", "Sensor unable to persist NSM certificate (error 8)", "Mutual trust mismatch between Sensor and NSM (error 9)", "Error in SNMPv3 key exchange (error 10)", "Error in initial protocol message exchange (error 11)", "Sensor install in progress", "Opening alert channel in progress", "Link error. Attempting to reconnect (error 14)", "Alert channel reconnect failed (error 15)", "Closing alert channel in progress", "Closing alert channel failed (error 17)", "Send alert warning (error 18)", "Keep alive warning (error 19)", "Sensor unable to delete certificate (error 20)", "Sensor unable to create SNMP user (error 21)", "Sensor unable to change SNMP user key (error 22)">
Action: This fault clears when the alert channel is back up. The Manager cannot communicate with the device via the channel on which the Manager listens for Sensor alerts.

Fault: Device in bad health
Severity: Error
Description/Cause: Please check the running status of device <device_name>.
Action: This fault occurs with any type of device software failure. (It usually occurs in conjunction with a software error fault.)

Fault: Game error
Severity: Error
Description/Cause: Indicates that the engine could not be initialized or downloaded, or that the DAT file could not be downloaded.
Action: This fault clears when the engine can be initialized or downloaded and the DAT file can be downloaded.

Fault: Internal packet drop error
Severity: Error
Description/Cause: Device is dropping packets due to traffic load.
Action: Reduce the amount of traffic passing through the Sensor; this fault indicates that the Sensor is overloaded with traffic.
Fault: MLC Bulk update file size exceeds limit
Severity: Error
Description/Cause: The device has a limit on the MLC Bulk Update file size that it can process. Because this limit has been exceeded, the update to the device <Sensor_name> is aborted.
Action: Check the MLC server configured in this Manager for the number of users, groups, and IP user mappings. Make sure they do not exceed the limits specified in the MLC Integration documentation.

Fault: Out-of-range configuration
Severity: Error
Description/Cause: Device <Sensor_name> has detected an out-of-range configuration value.
Action: Contact McAfee Technical Support for assistance. If this fault persists, we recommend that you perform a Diagnostic Trace and submit the trace file to Technical Support for troubleshooting.

Fault: Packet log channel down
Severity: Error
Description/Cause: The packet log channel for device <Sensor_name> is down. Reason: <"Channel is up", "Sensor unable to sync time with NSM (error 2)", "Sensor unable to generate valid certificate (error 3)", "Sensor unable to persist Sensor certificate (error 4)", "Sensor fail connecting to NSM (error 5)", "Sensor in untrusted connection mode (error 6)", "Sensor install connection failed (error 7)", "Sensor unable to persist NSM certificate (error 8)", "Mutual trust mismatch between Sensor and NSM (error 9)", "Error in SNMPv3 key exchange (error 10)", "Error in initial protocol message exchange (error 11)", "Sensor install in progress", "Opening packet-log channel in progress", "Link error. Attempting to reconnect (error 14)", "Packet-log channel reconnect failed (error 15)", "Closing packet-log channel in progress", "Closing packet-log channel failed (error 17)", "Send alert warning (error 18)", "Keep alive warning (error 19)">
Action: This fault clears when the packet log channel is back up. The Manager cannot communicate with the device via the channel on which the Manager receives packet logs.
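Most of the reason strings listed for the alert channel and packet log channel faults above end in a marker of the form "(error N)". When triaging many fault messages, it can help to pull that number out so that faults can be grouped by cause. The following Python sketch is illustrative only: the function name `reason_error_code` and the sample list are hypothetical, not part of the product, and the code assumes the reason text arrives exactly as printed in this guide.

```python
import re
from typing import Optional

# Matches the "(error N)" marker that most channel-down reason strings carry.
_ERROR_CODE = re.compile(r"\(error (\d+)\)")


def reason_error_code(reason: str) -> Optional[int]:
    """Return the numeric code embedded in a reason string, or None if absent."""
    match = _ERROR_CODE.search(reason)
    return int(match.group(1)) if match else None


# A few reason strings copied from the fault descriptions in this guide.
SAMPLE_REASONS = [
    "Channel is up",
    "Sensor unable to sync time with NSM (error 2)",
    "Mutual trust mismatch between Sensor and NSM (error 9)",
    "Link error. Attempting to reconnect (error 14)",
]

if __name__ == "__main__":
    for reason in SAMPLE_REASONS:
        print(reason_error_code(reason), "-", reason)
```

Status strings without a code, such as "Channel is up" or "Sensor install in progress", yield `None`, which separates progress states from actual error conditions.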
Fault: Put peer DoS profile failure
Severity: Error
Description/Cause: The Sensor was unable to push a requested profile to the Manager.
Action: See the ems.log file for details on why the error is occurring. The fault will clear when the Sensor is able to push a valid DoS profile.

Fault: Peer DoS profile retrieval failure
Severity: Error
Description/Cause: Peer DoS profile retrieval request from device <Sensor_name> failed. No DoS profile for peer <peer_Sensor_name> is available.
Action: The Manager cannot obtain the requested profile from the peer Sensor, nor can it obtain a saved valid profile. See log for details.
Description/Cause: Peer DoS profile retrieval request from device <Sensor_name> failed because the profile cannot be pushed to the device that requested it. See system log for details.
Action: Check Manager connection to Network Security Platform.

Fault: <Sensor> discovery failure
Severity: Error
Description/Cause: <Sensor>, <Sensor_name> failed to discover configuration information. The device is not properly initialized.
Action: Typically, the Manager will be unable to display the Sensor in this situation, which could indicate an old software image on the Sensor. If this fault is triggered because the Sensor is temporarily unavailable, the Manager will clear this fault when the Sensor is back online. If the fault persists, check to ensure that the Sensor has the latest software image compatible with the Manager software image. If the images are incompatible, update the Sensor image via a tftp server.

Fault: Sensor reports an out-of-range configuration
Severity: Error
Description/Cause: The Manager received a value from the Sensor that is invalid. The additional text of the message contains details. This fault does not clear automatically; it must be cleared manually.
Action: Contact McAfee Technical Support for assistance.

Fault: Sensor reports NMS user privacy key decrypt failure
Severity: Error
Description/Cause: NMS user privacy key decryption failed for user <user_name>.
Action: Delete the NMS user and add it again with valid credentials.

Fault: Sensor reports NMS user authentication key decrypt failure
Severity: Error
Description/Cause: NMS user authentication key decryption failed for user <user_name>.
Action: Delete the NMS user and add it again with valid credentials.

Fault: Sensor configuration update failed
Severity: Error
Description/Cause: The Sensor configuration update failed to be pushed from the Manager Server to the Sensor.
Action: See the ems.log file to isolate the reason for failure.

Fault: Sensor discovery failure
Severity: Error
Description/Cause: The Sensor failed to discover its configuration information, and thus is not properly initialized. Typically, the Manager will be unable to display the Sensor. Could indicate an old Sensor image on the Sensor.
Action: Check the Manager connection to Network Security Platform. Check to ensure that the Network Security Platform has the latest software image compatible with the Manager software image. If the images are incompatible, update the image via a tftp server.

The Manager has reached its limit (<queue_size_limit>) for alerts that can be queued for storage in the database. (no_of_alerts alerts dropped)

Fault: Sensor reports that the alert channel is down
Severity: Error
Description/Cause: This fault indicates that the Sensor is reporting that the alert channel is down, but the physical channel is actually up. <"Channel is up", "Sensor unable to sync time with NSM (error 2)", "Sensor unable to generate valid certificate (error 3)", "Sensor unable to persist Sensor certificate (error 4)", "Sensor fail connecting to NSM (error 5)", "Sensor in untrusted connection mode (error 6)", "Sensor install connection failed (error 7)", "Sensor unable to persist NSM certificate (error 8)", "Mutual trust mismatch between Sensor and NSM (error 9)", "Error in SNMPv3 key exchange (error 10)", "Error in initial protocol message exchange (error 11)", "Sensor install in progress", "Opening packet-log channel in progress", "Link error. Attempting to reconnect (error 14)", "Packet-log channel reconnect failed (error 15)", "Closing packet-log channel in progress", "Closing packet-log channel failed (error 17)", "Send alert warning (error 18)", "Keep alive warning (error 19)">
Action: The Sensor will typically recover on its own. If you are receiving alerts with packet logs and your Sensor is otherwise behaving normally, you can ignore this message. Check whether trust is established between the Sensor and the Manager by issuing a show command in the Sensor CLI. If this fault persists, contact McAfee Technical Support.

Fault: SSL decryption key invalid
Severity: Error
Description/Cause: The Manager detects that a particular SSL decryption key is no longer valid. The detailed reason why the fault is occurring is shown in the fault message. These reasons can range from the Sensor re-initializing itself with a different certificate to an inconsistency between the decryption key residing on a primary Sensor and its failover peer Sensor.
Action: Re-import the key (which is identified within the error message). The fault will clear itself when the key is determined to be valid.

Fault: Trust Establishment Error – Bad Shared Secret
Severity: Error
Description/Cause: Device <Sensor_name> could not be added to the Manager because the shared secret it provided does not match what was defined for it on the Manager.
Action: Make sure the shared secret entered on the device CLI matches the one defined within the Manager GUI. (Note: The shared secret is case sensitive.)
Fault: Trust Establishment Error – Unknown Device
Severity: Error
Description/Cause: Device <Sensor_name> could not be added to the Manager because it has not been defined on the Manager.
Action: Make sure the device you would like to add to the Manager has been defined within the Manager GUI before trying to add it via the device CLI. (Note: The device name is case sensitive.)

Update device configuration

Fault: Device Configuration update failed
Severity: Error
Description/Cause: Device configuration update failed to be pushed from the Manager server to the Sensor.
Action: See the ems.log file to isolate reason for failure.

Device upload scheduler

Fault: Scheduled botnet detector deployment failure
Severity: Error
Description/Cause: The Manager was unable to perform the scheduled BOT DAT deployment to the device <Sensor_name>.
Action: Indicates that the Manager was unable to perform the scheduled BOT DAT deployment to the Sensor. This is because of network connectivity between the Manager and the Sensor, or an invalid DAT file. This fault clears when an update is sent to the Sensor successfully.

Sensor warning faults
These are the warning faults for a Sensor device.

Fault: DAT Config is out of sync
Severity: Warning
Description/Cause: The DAT Segments Config update to the device <Sensor_name> failed. The Bot DAT Config file on the failover pair is out of sync as a result. (The Manager will automatically make another attempt to deploy the BOT DAT Config file.)
Action: Ensure that the Sensor is online and is in good health. The Manager will make another attempt to push the file. The fault will be cleared when the Manager is successful.

Fault: Device configuration update is in progress
Severity: Warning
Description/Cause: Device configuration update is in progress.
Action: Device configuration update is in progress.

Fault: Device power up
Severity: Warning
Description/Cause: The device has completed booting and is online.
Action: This message is informational. Acknowledge or delete the fault to clear it.
Fault: Device performance <CPU Utilization, TCP/UDP Flow Utilization, Port Throughput Utilization, Sensor Throughput Utilization, L2 Error Drop, L3/L4 Error Drop>
Severity: Warning
Description/Cause: Network Security Device Performance Monitoring <CPU Utilization, TCP/UDP Flow Utilization, Port Throughput Utilization, Sensor Throughput Utilization, L2 Error Drop, L3/L4 Error Drop> triggered since the <% or empty string> crossed the threshold value with <fallen/risen/been> for <metric_value> band on <Sensor_name>.

Fault: Device in high latency mode
Severity: Warning
Description/Cause: <Sensor_name> has <fallen/risen/been> to <above/below> <% or empty string> on <Sensor_name>, which is <above/below> the configured <alarm_name_as_configured_by_the_user> threshold of <threshold_value> <% or empty string>.
Description/Cause: Device high latency mode is currently <LatencyConflict/LatencyConflictCleared>. (The device will attempt to automatically recover from the high latency condition.)
Description/Cause: Device high latency mode and Layer 2 bypass mode are currently <LatencyConflict/LatencyConflictCleared>. (The device will attempt to automatically recover from the high latency condition.)
Action: The device will attempt to automatically recover from the high latency condition.

Fault: Device latency monitoring configuration is conflicting with Layer 2 monitoring configuration
Severity: Warning
Description/Cause: Device latency monitoring configuration requires Layer 2 pass-through monitoring to be enabled.
Action: Disable moving Sensor to Layer 2 bypass mode on high latency or enable Layer 2 pass-through monitoring.

Fault: Device login failure
Severity: Warning
Description/Cause: <Console/SSHD> login failure threshold of 3 attempts is exceeded for user name <user_name> from remote IP Address <remote_ip> on remote port <remote_port>.

Fault: Device packet capturing terminated
Severity: Warning
Description/Cause: Packet capturing has been stopped during device re-initialization. Please explicitly restart packet capturing, as required.
Action: Restart Packet Capture if required.

Fault: Device DNS server connectivity status
Severity: Warning
Description/Cause: DNS server is <Up and Reachable/Down or Unreachable> from the device.

Fault: Physical configuration change
Severity: Warning
Description/Cause: The physical configuration for device <Sensor_name> has changed. A new physical configuration has been discovered.
Action: Occurs when the Sensor connects to the Manager with a different physical configuration.

Fault: Pluggable interface is absent
Severity: Warning
Description/Cause: Indicates that the Pluggable interface is absent.
Action: Indicates if the pluggable connector is absent in the cage.

Fault: Pluggable interface certification status
Severity: Warning
Description/Cause: Indicates if pluggable connector is McAfee certified or not.
Action: Indicates if pluggable connector is McAfee certified or not.

Fault: Sensor resetting due to FIPS mode change
Severity: Warning
Action: This message is informational.

Fault: SNMP trap received from load balancer
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reported trap type <oid_of_the_mib_object_reported>.
Action: Message generated based on SNMP trap received from device.

Fault: Uninitialized device
Severity: Warning
Description/Cause: Device <Sensor_name> is not properly initialized.
Action: The Sensor may have just been rebooted and is not up yet. Wait a few minutes to see if this is the issue; if not, check to ensure that a signature set is present on the Sensor. A resetconfig command may have been issued, and the Sensor may not yet have been reconfigured.

Fault: Up
Severity: Warning
Description/Cause: The Sensor has just completed booting and is on-line.
Action: This message is informational. Acknowledge the fault.

Fault: Load balancer port mode change for <port_pair>
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reports operating mode for port <port_pair> changed to <Fail-open/Span/Tap/Fail-close>.
Action: Message generated based on SNMP trap received from load balancer device.
Fault: Load balancer power up
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> has completed booting and is online.
Action: This message is informational. Acknowledge or delete the fault to clear it.

XC Cluster

Fault: Load balancer port fail-over mode change for <port_pair>
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reports port <port_name> fail-over mode changed.
Action: Message generated based on SNMP trap received from load balancer device.

Fault: Load balancer system fail-over mode change
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reports fail-over mode change to <Unknown/Hunting for peer/Stand-alone/Primary/Secondary/Peer device software mismatch>.
Action: Message generated based on SNMP trap received from load balancer device.

Fault: Load balancer system fail-over status change
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reports fail-over status change to <Unknown/Hunting for peer/Stand-alone/Primary/Secondary/Peer device software mismatch>.
Action: Message generated based on SNMP trap received from load balancer device.

Fault: Load balancer system peer fail-over status change
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reports peer fail-over status change to <Unknown/Hunting for peer/Stand-alone/Primary/Secondary/Peer device software mismatch>.
Action: Message generated based on SNMP trap received from load balancer device.

Fault: Load balancer port load balancing mode change for <port_name>
Severity: Warning
Description/Cause: Load balancer <load_balancer_name> reports port <port_name> load balancing mode changed to <Good/Bad/Active/Inactive/Loopback/Rebalance/Spare/Standby/Standby Failure/Spare Active/Spare Inactive/Spare Failure>.
Action: Message generated based on SNMP trap received from load balancer device.

Device IP settings

Fault: Device reboot required
Severity: Warning
Description/Cause: The jumbo frame parsing setting on this device has been updated and a reboot is required for the change to take effect.
Action: Please reboot the device to effect the change.

Vulnerability Manager configuration

Fault: Offline device download in progress
Severity: Warning
Description/Cause: Offline device download has been initiated from the device command line interface.
Action: Please wait for offline Sensor download to complete.

Fault: Successful offline device download
Severity: Warning
Description/Cause: Offline device download has completed with status <successful/failed>. Download type=<sigfile/software/software sigfile combo>, Time=<timestamp>, Filename=<downloaded_file_name>
Action: Please see log messages if download has failed, status code=<Successful/Failed>.

Licensing

Fault: Pending device license expiration
Severity: Warning
Description/Cause: Device license expires in less than <x> days.

Fault: Pending device support license expiration
Severity: Warning
Description/Cause: Device support license expires in less than <x> days.
Action: Please contact Technical Support or your local reseller.

Fault: Pending device add-on license expiration
Severity: Warning
Description/Cause: Device license expires in less than <x> days.

Fault: Pending device support add-on license expiration
Severity: Warning
Description/Cause: Device license expires in less than <x> days.

Fault: Pending license expiration for device of type <device_type>
Severity: Warning
Description/Cause: License for this device expires in <x> days.
Action: Please contact technical support or your local reseller to renew the License.

Device failover

Fault: Attempt to disable failover failed
Severity: Warning
Description/Cause: Cannot disable failover on device <Sensor_name>. The device is offline. (The Manager will make another attempt when the device comes back online.)
Action: Make sure that the Sensor is on-line. The Manager will make another attempt to disable failover when it detects that the Sensor is up. The fault will clear when the Manager is successful.

Fault: Botnet detectors out of sync
Severity: Warning
Description/Cause: The deployment of botnet detectors to the device <Sensor_name> failed. The botnet detectors on the failover pair <Sensor_name1> are out of sync as a result. (The Manager will automatically make another attempt to deploy them.)
Action: Make sure that the device is online and is in good health. The Manager will automatically make another attempt to deploy the botnet detectors. The fault will be cleared once the deployment is complete.

Fault: Firewall connection status inconsistent on failover Sensor pair
Severity: Warning
Description/Cause: The firewall connection status on the failover pair <Sensor_peer_name> is inconsistent. This may cause the firewall function to be inconsistent for the pair.
Action: Ensure that both Sensors of the failover pair are connected to the firewall and that both Sensors are online and in good health.

Fault: Signature segments out of sync
Severity: Warning
Description/Cause: An attempt to update the signature set on both Sensors of a failover pair was unsuccessful for one of the pair, causing the signature sets to be out of sync on the two Sensors.
Action: The Manager will make another attempt to automatically push the signature file down to the Sensor on which the update operation failed. Ensure that the Sensor in question is on-line and in good health. The fault will clear when the Manager is successful. If the operation fails a second time, a Critical Signature set download failure fault will be shown as well. Both faults will clear when the signature set is successfully pushed to the Sensor.
Description/Cause: Signature deployment to device <Sensor_name> failed. The signature segments on failover pair <Sensor_peer_name> are out of sync. (The Manager will automatically make another attempt to deploy the signature.)
Action: Ensure that the Sensor is online and in good health. The Manager will make another attempt to push the file down. The fault will clear when the Manager is successful.

Fault: SSL decryption keys out of sync
Severity: Warning
Description/Cause: SSL decryption keys update to device <Sensor_name> failed, and the SSL decryption keys on failover pair <Sensor_peer_name> are out of sync as a result. (The Manager will automatically make another attempt to deploy the new keys.)
Action: Ensure that the Sensor is online and in good health. The Manager will make another attempt to push the file down. The fault will clear when the Manager is successful.

Fault: Temperature Status
Severity: Warning
Description/Cause: Inlet Temperature value increased above 44.
Action: Check the Fan LEDs in front of the chassis to ensure all internal chassis fans are functioning. This fault will clear when the temperature returns to its normal range.

Signature set

Fault: Deprecated applications detected in firewall policies
Severity: Warning
Description/Cause: The Manager has detected the following use of deprecated applications in firewall policies: <Deprecated Application <app_name> used in Policy <policy_name>/Rule#<ruleOrderNum>; Deprecated Application <app_name> used in Rule Element (of type Application Group) <rule_name>@<policy_name>/Rule#<ruleOrderNum>>
Action: These applications must be removed from the firewall policies.

Sensor informational faults
These are the informational faults for a Sensor device.

Fault: Automatic BOT DAT set deployment in progress
Severity: Informational
Description/Cause: A new BOT DAT set has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: BOT DAT deployment in progress
Severity: Informational
Description/Cause: A new BOT DAT file has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: Cluster software initialization status
Severity: Informational
Description/Cause: Device software has been initialized.
Action: On initialization failure, check if cluster cross-connects are present as documented.
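The two Temperature status faults in this guide fire at different inlet temperature values: the Warning fault when the value rises above 44, and the Critical fault when it rises above 50. The Python sketch below shows how those two thresholds relate. It is a hypothetical illustration, not Sensor code: the function name is invented, "above" is assumed to mean strictly greater than, and the guide does not state the unit of the value.

```python
from typing import Optional

# Thresholds as stated in the fault descriptions; the unit is not given there.
WARNING_THRESHOLD = 44   # "Inlet Temperature value increased above 44"
CRITICAL_THRESHOLD = 50  # "Inlet Temperature value increased above 50"


def temperature_fault_severity(inlet_temperature: float) -> Optional[str]:
    """Return the highest severity whose threshold is exceeded, or None."""
    if inlet_temperature > CRITICAL_THRESHOLD:
        return "Critical"
    if inlet_temperature > WARNING_THRESHOLD:
        return "Warning"
    return None  # within the normal range: no temperature fault expected


if __name__ == "__main__":
    for reading in (40, 45, 50, 51):
        print(reading, temperature_fault_severity(reading))
```

Under these assumptions, a reading between the two thresholds maps to the Warning fault only, which leaves time to check the fan LEDs before the Critical threshold is crossed.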
Fault: Device software or signature set import in progress
Severity: Informational
Description/Cause: A device software image or signature set file is being imported into the Manager.
Action: This message is for user information. No action required.

Fault: Device software or signature set download in progress
Severity: Informational
Description/Cause: A device software image or signature set file is being downloaded from the McAfee Update Server to the Manager.
Action: This message is for user information. No action required.

Fault: Port pair <port name> is back to In-line Fail-Open Mode
Severity: Informational
Description/Cause: Indicates that the ports have gone from Bypass Mode back to normal.
Action: This message is for user information. No action required.

Fault: Resource mismatch
Severity: Informational
Description/Cause: The configured memory or CPU is less than the optimal number.
Action: This message is for user information. No action required.

Fault: Sensor configuration update in progress
Severity: Informational
Description/Cause: A Sensor configuration update is in the process of being pushed from the Manager server to the Sensor.
Action: This message is for user information. No action required.

Fault: Sensor configuration update successful
Severity: Informational
Description/Cause: Sensor configuration update successfully pushed from the Manager server to the Sensor.
Action: This message is for user information. No action required.

Fault: Sensor discovery is in progress
Severity: Informational
Description/Cause: The Manager is attempting to discover the Sensor.
Action: This message is for user information. No action required.

Fault: Sensor resetting due to FIPS mode change
Severity: Informational
Description/Cause: An upgrade or downgrade between FIPS and non-FIPS software images has been detected. This resets the Sensor configuration and restores the default login password.
Action: This message is informational.

Fault: Sensor software image download failed
Severity: Informational
Description/Cause: Sensor software image failed to download from the McAfee Update Server to the Manager server.
Action: This message is for user information. No action required.
Fault: Sensor swappable port module status for group <G0/G1/G2/G3>
Severity: Informational
Description/Cause: Sensor reports port module <removed/added> for group <G0/G1/G2/G3>.
Description/Cause: Sensor reports port module is removed from slot for group <G0/G1/G2/G3>.
Description/Cause: Sensor reports <NULL/QSFP/SFP> port module inserted into slot for group <G0/G1/G2/G3>.
Action: This message is generated when a user removes a port module from, or inserts a port module into, a Sensor slot.

Fault: Successful automatic botnet detectors deployment
Severity: Informational
Description/Cause: A new botnet detector set has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: User login via console after sensor initialization
Severity: Informational
Description/Cause: Sensor reports user login via console after sensor initialization. This is a FIPS 140-2 Level 3 violation.
Action: This message is informational.

Licensing

Fault: Device discovered with license
Severity: Informational
Description/Cause: Device <Sensor_name> was discovered with a license that will expire on <date>.
Action: Renew the license before it expires.

Fault: License detected for <Sensor_name> of type
Severity: Informational
Description/Cause: License valid until <date>.
Action: Renew the license before it expires.

Device discovery

Fault: The <NTBA Appliance/Sensor>, <device_name> discovery in progress
Severity: Informational
Description/Cause: The Manager is in the process of discovering the device.
Action: Wait for the discovery of the device to complete.

Download software

Fault: Device software image download in progress
Severity: Informational
Description/Cause: Device software image is in the process of downloading from the McAfee Update Server to the Manager server.
Action: This message is for user information. No action required.

Fault: Device software image download successful
Severity: Informational
Description/Cause: Device software image successfully downloaded from the McAfee Update Server to the Manager server.
Action: This message is for user information. No action required.
Update device software Device software update is in progress Informational A Sensor software update is in the process of being pushed from the Manager Server to the Sensor. This message is for user information. No action required. Device software update successful Informational Device software update successfully pushed from the Manager server to sensor. This message is for user information. No action required. Update device configuration Device configuration deployment successful Informational The Manager successfully deployed This message is the latest configuration to device informational. <Sensor_name>. This includes new IPS signature sets, botnet detectors, and SSL keys, as applicable. Signature set Device software, IPS signature set, or botnet detectors import in progress Informational A device software, IPS signature set, or botnet detectors file is being imported into the Manager. This message is informational. Device software, IPS signature set, or botnet detectors download in progress Informational A device software, IPS signature This message is set, or botnet detectors file is informational. being downloaded from the McAfee Update Server to the Manager. NTBA faults The NTBA faults can be classified into critical, error, warning, and informational. The Action column provides you with troubleshooting tips. NTBA critical faults These are the critical faults for a NTBA device. Fault Severity Description/Cause Action BOT DAT file download failure Critical The Manager cannot push the BOT DAT file to device <Sensor_name> Occurs when the Manager cannot push the BOT DAT file to the Sensor. Could result from the network connectivity issue. Endpoint Intelligence Service is down Critical Endpoint Intelligence Service has not started as the ePO server is not reachable. Please make sure that the ePO server is up and running and is reachable to NTBA. Endpoint Intelligence Service has not started as the ePO extension does not support auto-signing service. 
Action: Make sure that the ePO server supports ePO Auto Signing functionality (Change on Name confirmation).
Description/Cause: Endpoint Intelligence Service has not started because of authentication error connecting to the ePO server.
Action: Please provide valid ePO Server credentials.
Description/Cause: Endpoint Intelligence Service has not started because of an internal error from the ePO server.
Action: The ePO server responded with an error; please look at the ePO logs.
Description/Cause: Endpoint Intelligence Service has not started because of unexpected errors.
Action: Please look at the ePO server and NTBA logs for the error. Please try again.
Description/Cause: Endpoint Intelligence Service has not started due to corrupt certificate.
Action: Certificate invalid, please retry saving again.
Description/Cause: Endpoint Intelligence Service has not started because the configured port for Endpoint Intelligence Service is already in use.
Action: This port is already in use; please configure an unused port.

Fault: Link failure of <Appliance name>
Severity: Critical
Description/Cause: The link between this port and the device to which it is connected is down, and communication is unavailable.
Action: This is a connectivity issue. Contact your IT department to troubleshoot network connectivity. This fault clears when communication is re-established.

Fault: NTBA Public key download failure
Severity: Critical
Description/Cause: Cannot push NTBA Public key file to device <Sensor_name>.
Action: Occurs when the Manager cannot push the NTBA Public key file to the Sensor. Could result from a network connectivity issue.

Fault: NTBA Appliance unreachable
Severity: Critical
Description/Cause: A command channel ping to NTBA Appliance <Appliance name> failed. The device is unreachable through its command channel.
Action: Indicates that the NTBA cannot communicate with the Manager: the connection between the NTBA and the Manager is down, or the NTBA has been administratively disconnected.
Troubleshoot connectivity issues: 1) check that a connection route exists between the Manager and the NTBA; 2) check the NTBA’s status using the status command in the NTBA command line interface, or ping the NTBA or the NTBA gateway to ensure connectivity to the NTBA. This fault clears when the Manager detects the NTBA again.

NTBA error faults

These are the error faults for an NTBA device.

Fault: Device Configuration update failed
Severity: Error
Description/Cause: Device configuration update failed to be pushed from the Manager server to the Sensor.
Action: See the ems.log file to isolate the reason for the failure.

Fault: Scheduled BOT DAT file deployment failed
Severity: Error
Description/Cause: The Manager was unable to perform the scheduled Bot DAT deployment to the device <Sensor_name>.
Action: Indicates that the Manager was unable to perform the scheduled Bot DAT deployment to the Sensor. This is because of a network connectivity problem between the Manager and the Sensor, or an invalid DAT file. This fault clears when an update is sent to the Sensor successfully.

GAME configuration

Fault: NTBA <GAME Error>
Severity: Error
Description/Cause: <GAME Error>
Action: Please re-check the NTBA GAME configuration.

System related

Fault: NTBA Configuration Update Error
Severity: Error
Description/Cause: One of the following messages:
• Sigfile parsing failed.
• Sigfile parsing failed in zone segment.
• Sigfile parsing failed in communication rules segment.
• Sigfile parsing failed in service segment.
• Sigfile parsing failed in anomaly segment.
• Sigfile parsing failed in reconnaissance segment.
• Sigfile parsing failed in FFT segment.
• Sigfile parsing failed in NBA segment.
• Sigfile parsing failed in worm segment.
• Sigfile parsing failed in policy segment.
• Sigfile parsing failed in pre-processing segment.
• Sigfile parsing failed in application profile segment.
• Sigfile parsing error.
Action: Please retry the NTBA configuration update.

Fault: NTBA Sigset Mismatch Error
Severity: Error
Description/Cause: There has been a mismatch between the NTBA version <tba_sw_version> and the sigset version <sigset_version>. NSM will now try to automatically push the appropriate matching sigset.
Action: Please check for the status of the follow-up NTBA configuration update.

Fault: NTBA Zone Configuration Event
Severity: Error
Description/Cause: Invalid interface or zone configuration. All the zones configured are <Outside/Inside>. <Netflow processing will not work till this configuration is fixed. GTI reputation is not retrieved for internal hosts>.
Action: Please verify the zone configuration in NTBA.

Storage server

Fault: NTBA <Storage Server Error>
Severity: Error
Description/Cause: <Storage Server Error>, which is one of the following:
• Storage Server Not Reachable
• Storage Server Permission Denied
• Storage Server Limit Reached 50%
• Storage Server Limit Reached 75%
• Backup Storage File Corrupted
• Storage Server Limit Exhausted
Action: Please re-check the Storage Service Configuration.

TrustedSource

Fault: NTBA <TrustedSource Error>
Severity: Error
Description/Cause: <TrustedSource Error>
Action: Please re-check the TrustedSource configuration.
NTBA warning faults

These are the warning faults for an NTBA device.

Fault: DAT Config is out of sync
Severity: Warning
Description/Cause: The DAT Segments Config update to the device <Sensor_name> failed. The Bot DAT Config file on the failover pair is out of sync as a result. (The Manager will automatically make another attempt to deploy the BOT DAT Config file.)
Action: Ensure that the Sensor is online and is in good health. The Manager will make another attempt to push the file. The fault will be cleared when the Manager is successful.

Fault: This Release of NSM supports only one instance of NTBA vm.
Severity: Warning
Description/Cause: The NTBA <NTBA_Appliance_name> is not discovered because the maximum number of supported NTBA virtual machine instances has been exceeded.
Action: Please delete the device from ism GUI.

Fault: Uninitialized device
Severity: Warning
Description/Cause: Device <Sensor_name> is not properly initialized.
Action: The Sensor may have just been rebooted and is not up yet. Wait a few minutes to see if this is the issue; if not, check to ensure that a signature set is present on the Sensor. A resetconfig command may have been issued, and the Sensor may not yet have been reconfigured.

NTBA informational faults

These are the informational faults for an NTBA device.

Fault: Automatic BOT DAT set deployment in progress
Severity: Informational
Description/Cause: A new BOT DAT set has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: BOT DAT deployment in progress
Severity: Informational
Description/Cause: A new BOT DAT file has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information. No action required.

Fault: Interface change
Severity: Informational
Description/Cause: During startup, the NTBA identifies changes (addition or removal) in the interface count.
Action: This message is for user information. No action required.
Fault: NTBA database pruning
Severity: Informational
Description/Cause: Current database usage: <percentage_value>%
Action: NTBA Database Pruning threshold notification.

Fault: Successful automatic BOT DAT set deployment
Severity: Informational
Description/Cause: A new BOT DAT set has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information, no action required.

Fault: Successful scheduled BOT DAT set deployment
Severity: Informational
Description/Cause: A new BOT DAT file has recently been downloaded from the GTI Server to the Manager and is being deployed to the devices.
Action: This message is for user information, no action required.

Fault: The <NTBA Appliance/Sensor>, <device_name> discovery in progress
Severity: Informational
Description/Cause: The Manager is in the process of discovering the device.
Action: Wait for the discovery of the device to complete.

5 Error messages

This section lists the error messages displayed in McAfee Network Security Manager (Manager).

Contents
Error messages for RADIUS servers
Error messages for LDAP server

Error messages for RADIUS servers

The table lists the error messages displayed in the Manager.
Error Name: RADIUS Connection Successful
Description/Cause: RADIUS server is up and running
Action: RADIUS server is up and running

Error Name: RADIUS Connection Failed
Description/Cause: Network failure, congestion at servers, or RADIUS server not available
Action: Try again after some time; check the IP address and Shared Secret key

Error Name: No RADIUS server configured
Description/Cause: No server available
Action: Configure at least one RADIUS server

Error Name: Server with IP address and port already exists for RADIUS server
Description/Cause: IP address and port connection not unique
Action: Use a different IP address and port number

Error Name: RADIUS server host IP address/host name is required
Description/Cause: Field cannot be blank
Action: Enter a valid host name/IP address

Error Name: Shared Secret key is unique in case of RADIUS server
Description/Cause: Field cannot be blank
Action: Enter a valid host name/IP address

Error Name: RADIUS server host IP address/host name cannot be resolved as entered
Description/Cause: Invalid host name/IP address
Action: Enter a valid host name/IP address

The table lists the error messages displayed in the User Activity Audit report.

Error Name: RADIUS Authentication
Description/Cause: User <user name> with login Id <login Id> failed to authenticate to RADIUS server <RADIUS server host name/IP address> on port <port number> due to server timeout/network failure
Error Type: User

Error Name: Add RADIUS server
Description/Cause: Added RADIUS server IP Address/Host <IP address or host name>, port <port number>, enable <Yes/No>
Error Type: Manager

Error Name: Edit RADIUS server
Description/Cause: IP Address/Host <IP address or host name> set port <port number>, set Enabled <Yes/No>
Error Type: Manager

Error Name: Delete RADIUS server
Description/Cause: Deleted RADIUS Server IP Address/Host <IP address or host name>, port <port number>
Error Type: Manager

Error messages for LDAP server

The table lists the error messages displayed in the Manager.
Error Name: Server with IP address and port already exists for LDAP server
Description/Cause: IP address and port connection not unique
Action: Use a different IP address and port number

Error Name: LDAP server host IP address/host name is required
Description/Cause: Field cannot be blank
Action: Enter a valid host name/IP address

Error Name: LDAP server host IP address/host name cannot be resolved as entered
Description/Cause: Invalid host name/IP address
Action: Enter a valid host name/IP address

Error Name: LDAP Connection Successful
Description/Cause: LDAP server is up and running
Action: LDAP server is up and running

Error Name: LDAP Connection Failed
Description/Cause: Network failure, congestion at servers, or LDAP server not available
Action: Try again after some time; check the IP address

Error Name: No LDAP server configured
Description/Cause: No server available
Action: Configure at least one LDAP server

The table lists the error messages displayed in the User Activity Audit report.

Error Name: LDAP Authentication
Description/Cause: User <user name> with login Id <login Id> failed to authenticate to LDAP server <LDAP server host name/IP address> on port <port number> due to server timeout/network failure.
Error Type: User

Error Name: Add LDAP server
Description/Cause: Added LDAP server IP Address/Host <IP address or host name>, port <port number>, enable <Yes/No>
Error Type: Manager

Error Name: Edit LDAP server
Description/Cause: IP Address/Host <IP address or host name> set port <port number>, set Enabled <Yes/No>
Error Type: Manager

Error Name: Delete LDAP server
Description/Cause: Deleted LDAP Server IP Address/Host <IP address or host name>, port <port number>
Error Type: Manager

6 Troubleshooting scenarios

Contents
Network outage due to unresolved ARP traffic
Delay in alerts between the Sensor and Manager
Sensor-Manager Connectivity Issues
Wrong country name in IPS alerts
Wrong country name in ACL alerts

Network outage due to unresolved ARP traffic

Scenario
Sudden outage in the network due to unresolved ARP traffic.
Applicable to
Sensor models: M-series, NS-series
Sensor software version: 7.1, 7.5, 8.1

Problem type to be solved
Resolve the ARP traffic that the Sensor drops because of the heuristic web application server protection configuration setting.

Data/Information Collection
1 Check if the attack ARP MAC Address Flip-Flop is disabled in the policy. Go to Policies | IPS Policies | Customized Active Policy. Click Edit. Check the policy on all interfaces of the device and make sure that the ARP MAC Address Flip-Flop alert is either disabled or not included in the policy on any of them.
2 Check if the Heuristic Web Application Server Protection is enabled. Go to Devices | Devices | <Device Interface> | Protection Profile. Check each interface of the device individually.
3 Check if ARP spoofing is enabled on the Sensor. Use the command show arp spoof status.

Explanation
When heuristic web application server protection is enabled, the Manager caching is disabled and only selected attacks are pushed to the Sensor. If the MAC Flip-Flop attack is not part of the attacks chosen by the user, the Sensor drops the ARP packets. This happens in scenarios such as:
• Assignment of a dynamic MAC address in the network (vmac)
• A firewall in failover mode that uses a virtual MAC address: the IP address remains the same but the MAC address changes

Troubleshooting Steps
1 Disable ARP spoofing on the Sensor. Use the command arp spoof to disable ARP spoofing.
2 Disable Heuristic Web Application Server Protection on the device’s individual interfaces.

If the problem still persists, contact McAfee Support for further assistance.

Delay in alerts between the Sensor and Manager

Scenario
Delay in receiving the Sensor alerts on the Manager.
Applicable to
Sensor models: M-series, NS-series
Sensor software versions: 7.1, 7.5, 8.0, 8.1

Problem type to be solved
• Delay in the Sensor alerts being sent to the Manager
• Sensor alerts are not seen in real time on the Manager
• Time lag in sending the Sensor alerts to the Manager

Data/Information Collection
1 Execute the following commands on the Sensor:
• status (execute 5 times within 10 seconds)
• show sensor-load (execute 5 times within 10 seconds)
• getccstats (execute 5 times within 10 seconds)
Also execute the same commands on a Sensor of a similar model that does not have the issue.
2 Collect graphs for Sensor throughput utilization and port utilization.
3 Collect the attack csv file for this Sensor from the Threat Analyzer.
4 Collect the alert archival for the last 24 hours.
5 Retrieve the configuration backup of the Manager.
6 Create/collect the network diagram that clearly indicates where the Sensor and the Manager are located.

Troubleshooting steps
1 Check if there are any network connectivity issues or any delay in the network. A delay in the network between the Sensor and the Manager can lead to low alert rates.
2 Verify that the entire link between the Sensor management port and the Manager is 1G auto, and that the correct CAT6 cables are used.
3 Check if the other Sensors connected to the same Manager are also facing this issue. If yes, then it is a Manager issue.
4 Check the Sensor policy being used. If the All Inclusive with Audit or All Inclusive without Audit policy is used, the Sensor processes more alerts and hence the alert generation rate increases. Switching to the Default Inline policy can sometimes help resolve the delay.
5 Check if there are any saved alerts/packetlogs on the Sensor.
Command: show savedalertinfo
6 Check whether only a specific category of alerts is delayed or all alerts are delayed. Also check whether the system events being raised are delayed.
7 Check if the alerts are seen in the Historical Threat Analyzer. The Real Time Threat Analyzer shows alerts from the cache, but the Historical Threat Analyzer shows alerts from the database. This check confirms whether the issue is in the database or in the cache. Check the database size and, if it is very high, purge and tune the database.
8 Check the time on the Sensor and whether it matches the Manager system time. If there is an issue with the time stamp, the Manager may show the wrong timestamp in the Threat Analyzer, which can incorrectly appear as alerts being delayed.
9 Check the rate of alerts generated/detected by the Sensor using the getccstats command. Use the output to check:
• The status of the control/alert channel (to the Manager)
• The alert suppression/throttling configuration status and suppression intervals
• The Sensor failover action (1 = Enabled, 2 = Disabled), failover status (1 = Active, 2 = Standby, 3 = Init/Not Applicable), failover peer status (1 = Up, 2 = Down, 3 = Incompatible, 4 = Compatible, 5 = Init/Not Applicable), and fail-open status (1 = Enabled, 2 = Disabled)
• The count of detected alerts (signature-based, scan/recon, DoS) sent to the management port and the peer Manager (in case of MDR)
• The count of throttled alerts
• The count of alerts sent to and received from the Correlation Engine, and the alert correlation counts
• The count of alerts in the ring buffer, queued to be sent to the Manager
• The ACL alerts’ throttling configuration status (throttling interval and threshold)
• The count of throttled ACL alerts (both IPS and NAC)
• The Sensor reboot count and/or alert wrap count
The following statistics indicate that many alerts are still pending in the ring buffer:
AlertsInRngBufPriCount = 83621
AlertsInRngBufSecCount = 83606
PutAlertInRngBufErrCount = 6499317
The alert rate can be higher than the Manager is able to handle. A delay similar to a backoff is then introduced (reaching a maximum of 30 seconds per alert), which causes the alerts to queue up in the ring buffer. Once this condition is reached, the alert delay increases with time. To recover, check the type of attacks, create an exception rule to filter the attack, and see if the Manager recovers.
10 Take packet captures on the Sensor side and the Manager side to identify whether the issue is on the Sensor/Manager side or the network side. On the Manager, use Wireshark or an equivalent tool to take packet captures on the Manager port 8502.
Sample packet capture on the Sensor: (figure not reproduced)
Sample packet capture on the Manager: (figure not reproduced)
Using packet captures taken simultaneously from the Sensor and the Manager, you can identify whether there is a delay in the Sensor sending the alert to the Manager, a delay in the Manager sending the alert acknowledgment to the Sensor, or both (pointing to a network issue).
11 Check if Layer 7 Data Collection is enabled on the Sensor. There is a known issue when Layer 7 Data Collection is enabled, where the alerts in the Real-Time Threat Analyzer are no longer received in real time.
IntruDbg#> show l7dcap-usage
Layer-7 Dcap Buffers Allocated at Init 16000
Layer-7 Dcap Buffers Available now 16000
Layer-7 Dcap Buffers Alloc Errors 0
Layer-7 Dcap Alert Buffers Allocated 40960
Layer-7 Dcap Alert Buffers Available 40960
Layer-7 Dcap Alert Buffers Allocate Error 0
Layer-7 Dcap Regular Alert's Sent 0
Layer-7 Dcap Special Alert's sent 0
Layer-7 Dcap Context End Alert's Sent 0
Layer-7 Dcap CB InActive when DCAP Called 0
Layer-7 Dcap Ring Buffer Errors 0
Alert Ring Buffer Full Cnt 0
Num Alerts Dropped at Sensors 0
Layer-7 Dcap Fifo Check Seen 0
12 On the Manager database, use the output of SQL queries to check the frequency of alerts going to the Manager. Log in to MySQL on the Manager server and execute the following commands:
a Get the Sensor ID from the database:
select sensor_id, name from iv_sensor;
b Input the time range for which the alert generation rate needs to be checked:
SELECT "2014-05-29 18:39:47", "2014-05-30 18:39:47" INTO @stdate, @enddate;
c Total attacks for the Sensor ID and the time range:
SELECT sensorid,COUNT(*) atcount FROM iv_alert WHERE creationtime BETWEEN @stdate AND @enddate GROUP BY sensorid ORDER BY atcount;
d Total packetlog for the Sensor ID and the time range:
SELECT sensorid,COUNT(*) pktcount FROM iv_packetlog WHERE (creationtime BETWEEN @stdate AND @enddate) AND sensorid=<id of problematic sensor> GROUP BY sensorid ORDER BY pktcount;

If the problem still persists, contact McAfee Support for further assistance.

Sensor-Manager Connectivity Issues

Scenario
Connectivity issues between the Sensor and Manager.

Applicable to
Sensor models: M-series, NS-series
Sensor software versions: 7.1, 7.5, 8.1

Problem type to be solved
Sensor is not detected on the Manager. Trust establishment does not happen between the Sensor and Manager.
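Note on step 12 of the preceding alert-delay scenario: the counts returned by queries c and d are totals for the chosen window, so dividing them by the window length gives a rate that is easier to compare between Sensors. The following is a minimal sketch in Python, assuming the counts are copied by hand from the MySQL output. The function name, the Sensor IDs, and the counts are hypothetical illustrations, not values from a real deployment.

```python
from datetime import datetime

# Same timestamp layout as the @stdate / @enddate values in step 12b.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def alerts_per_minute(count, start, end):
    """Return the average number of alerts per minute over [start, end]."""
    window = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    seconds = window.total_seconds()
    if seconds <= 0:
        raise ValueError("end must be later than start")
    return count * 60.0 / seconds


if __name__ == "__main__":
    # Hypothetical atcount values keyed by sensorid (made-up numbers).
    counts = {1001: 144000, 1002: 7200}
    for sensor_id, count in sorted(counts.items()):
        rate = alerts_per_minute(count, "2014-05-29 18:39:47", "2014-05-30 18:39:47")
        print(f"sensor {sensor_id}: {rate:.1f} alerts/min")
```

A Sensor whose rate is far above that of comparable Sensors is a candidate for the policy and exception-rule checks described in that scenario.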
Data/Information Collection
1 Execute the following commands on the Sensor:
• status
• show
• show sbcfg
• show mgmtcfg
• show doscfg
• show mgmtport
• getccstats
• show netstat
• checkmanagerconnectivity (applicable only to Sensor software 8.1 and above)
2 Collect the Manager InfoCollector logs. If possible, enable detailed debugging messages by modifying the <Manager_INSTALL_DIR>/config/log4j_ism.xml file, adding/changing the following lines:
<category name="iv.core.DiscoveryService"> <priority value="DEBUG"/></category>
<category name="iv.core.SensorConfiguration"> <priority value="DEBUG"/></category>
3 Collect the Sensor trace files.
4 Collect a packet capture at the Manager (for the problematic Sensor).
5 Create/collect a network diagram that clearly indicates where the Sensor and Manager are located.

Troubleshooting Steps
1 Check if there is any network connectivity issue, such as a conflicting IP address for the Sensor. This can result in alert/pktlog channel flaps.
2 Verify that the Management Interface speed and duplex settings are configured correctly on the Manager and Sensor and that they are hard-coded. If this fails, change one link to auto and change the other side's duplex and speed settings until communication is established or the combinations are exhausted.
3 Ping from the Sensor to the Manager and from the Manager to the Sensor, and make sure the pings succeed.
4 Check if the other Sensors connected to the same Manager are also facing this issue. If yes, then it is a Manager issue.
5 Check the IP address of the system on which the Manager is installed. Make sure the correct IP address is provided in the Sensor command set manager ip.
6 Try a deinstall and establish the trust again with the Manager.
7 Check if the Manager machine has multiple NIC cards.
If yes, open the following file:
<Manager_INSTALL_DIR>/bin/tms.bat
Modify the following line to assign the relevant IP address that is also used in the Sensor configuration:
set JAVA_OPTS=%JAVA_OPTS% -Dlumos.fixedManagerSNMPIPaddress=""
Then restart the Manager.
8 Check the Sensor name given on the Manager while adding the Sensor using the Add New Device wizard. The Sensor name is case sensitive, so make sure it exactly matches the one given on the Manager.
9 Check that the device type is selected as IPS or NAC sensor while adding the Sensor using Add New Device. Selecting an incorrect device type can also lead to connectivity issues.
10 Make sure that the firewall is not blocking traffic between the Manager and Sensor for the following ports:
Manager:4167 -> Sensor:8500 (UDP)
Sensor:Any -> Manager:8501-8504,8510 (TCP) for 1024-bit trusts
Sensor:Any -> Manager:8504,8506-8509 (TCP) for 2048-bit trusts
11 If using the malware policy, check if the file save option is enabled. Make sure the firewall is not blocking ports 8509 and 8510, which are used for saving malware files.
12 Check that UDP port 8500 is open and allows the Manager to Sensor SNMP communication.
13 Use the netstat -na command to verify that ports 8501 - 8505 are listening on the Manager. Click Start | Run, type cmd, press ENTER, then type netstat -na.
14 Make sure large UDP and/or fragmented UDP packets are not dropped between the Sensor and Manager. Dropped packets can lead to SNMP timeouts.
Look for the following logs in ems.log:
2014-06-27 15:47:29,150 INFO [Thread-135] iv.core.SensorConfiguration - M1450 Experience a SNMP error during set/get, Change the STATUS to DISCCONECTED
2014-06-27 15:47:29,163 ERROR [Thread-135] iv.core.SensorConfiguration - Fail to process SNMP return node: com.intruvert.ext.sensorconfig.leap.SensorConfigException: Time Out
15 Capture UDP traffic using Wireshark on the Manager. Check if the Manager is receiving UDP response packets from the Sensor.
Sample capture on the Manager: (figure not reproduced)
16 Check the time on the Sensor and whether it matches the Manager system time.
17 Check if there are any Out Of Memory related logs in the Manager. This can lead to connectivity issues between the Sensor and Manager.
18 Check if the Manager is an MDR pair. If yes, verify that the IP address of the primary Manager configured on the Sensor matches the IP address of the active Manager. Also check whether the Sensor is treating the standby Manager as the primary Manager. This may lead to connectivity issues.

If the problem still persists, contact McAfee Support for further assistance.

Wrong country name in IPS alerts

Scenario
Find the root cause when IPS alerts in the Threat Analyzer show the wrong country name for source or destination IP addresses.

Applicable to
Sensor models: M-series, NS-series
Sensor software versions: 7.1, 7.5, 8.1 and 8.2

Problem type to be solved
Threat Analyzer displays the wrong country name for the source or destination IP address of an IPS alert.

Troubleshooting Steps
1 Look up the IP address on maxmind.com to find its geographic location. If the IP address does not match the geographic location, then it is an issue with the Manager or the geographic database in the cloud.
2 Log in to the Sensor with the “admin” ID. In the Sensor CLI, type the debug command and then enter the following command:
set loglevel mgmt (all | <0-12>) <0-15>
To disable logging, execute set loglevel mgmt 0 0.
Sample log output:
Aug 28 06:36:16 localhost tL: DBG2 ctrlch|postAlertDataToSyslogViewer: syslog msg len 174, data <36>Aug 28 06:36:16 GMT mil-ips-01 AlertLog: mil-ips-01 detected Outbound attack HTTP: IIS3 ASP dot2e (severity = Medium). 1.2.0.2:43058 -> 1.2.0.4:80 (result = Inconclusive)
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|alertTransmittedCountUpdate: IN
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|alertTransmittedCountUpdate: msgId is (335)
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|alertTransmittedCountUpdate: EXIT
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|CCout(0) processCtrlChanAlerts Id:335 (baseId:83886415)
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| -out-BEGIN Mobile SIGNATURE(335), size(565)
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Attack Id = 4202651
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Syslog Attack Id = 1438464
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Time Stamp = 1409207775
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Alert Count = 1
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| VIDS Id = 2030
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Syslog VIDS Id = 4
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| VLAN Id = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Alert Duration = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Log ID = 6052501239499929418
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Slot Id = 2
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Port Id = 25
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Protocol Id = 16
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Qualifier 1 = 1
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Qualifier 2 = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Src IP = 0x1020002
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Dstn IP = 0x1020004
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Request LastByte Offset = ffffffff
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Response LastByte Offset = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Attack Pkt Search Num = 1
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| SrcPort = 43058
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| DstnPort = 80
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Protocol = 6
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Signature Id = 226
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| PP State = 14
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Prev Stream Flag = 1
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Frag Flag = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Corr Flag = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| Inside = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| SuppressedSigId Bits = 1
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| inline Drop = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| ReCfg Firewall = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| flags = 40
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| mpeFlags = 8
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| appId = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| normalize reputation = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| normalize geoLocation = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| xff ip direction= 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| mobileFlags = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src deviceInfo = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src confLevel = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src osInfo = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Src detectSrcType = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst deviceInfo = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst confLevel = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst osInfo = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|devProf Dst detectSrcType = 0
Aug 28 06:36:16 localhost tL: DBG0 ctrlch| -------------------
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|64-bit Uid = a a0 50 8 be 8a d3 57.
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|id: 335, msgType: 1
Aug 28 06:36:16 localhost tL: DBG0 ctrlch|processSigAlertMsg - reCfgFw mask = 0x0
Here, a geographic ID of 0 means that the Sensor does not send any geographic information for the corresponding source or destination IP addresses.
3 Execute step 2 and wait for the IPS alert to be raised again. This time the Sensor prints the country code that it sends for the corresponding IPS alert. If the Sensor sends the geographic location ID as 0, then the issue is with the geographic database in the cloud, which the Manager queries to find the geographic location matching an IP address. Typically, for an IPS alert, the Sensor does not send any geographic location ID value.

If the problem still persists, contact McAfee Support for further assistance.

When a wrong country name is displayed for the source or destination IP address of an IPS alert, it is an issue with the Manager.

Wrong country name in ACL alerts

Scenario
Wrong country name appears in ACL alerts/ACL logs.

Applicable to
Sensor models: M-series
Sensor software version: 7.1, 7.5, 8.1, 8.2

Problem type to be solved
Wrong country name is displayed in the ACL alerts/ACL logs when forwarded to third-party software either from the Sensor or from the Manager.

Data/Information Collection
Execute show acl stats in the Sensor CLI.

Troubleshooting Steps
Execute the show acl stats command in the Sensor CLI to fetch the following data from the management process:
• Number of ACL alerts sent by the datapath processor to the management processor
• Number of ACL alerts sent from the management processor to the Manager or third-party software tool
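The comparison between these two counts can be scripted when the output is collected repeatedly. The following is a minimal sketch in Python, assuming the show acl stats output has been saved as text with one "name : value" counter per line, and assuming the gap of interest is Received minus the sum of Sent and Sent Direct. The function names and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
import re

# One counter per line, for example "Sent Direct : 4000" (assumed layout).
COUNTER_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(\d+)\s*$")


def parse_acl_stats(text):
    """Collect the 'name : value' counters from saved show acl stats output."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def unsent_gap(counters):
    """Received minus (Sent + Sent Direct); this interpretation is an assumption."""
    return counters["Received"] - (counters["Sent"] + counters["Sent Direct"])


# Made-up numbers for illustration only.
SAMPLE = """\
[Acl Alerts]
Received : 25000
Suppressed : 0
Sent : 12000
Sent Direct : 4000
Stateless ACL Fwd count : 0
"""

if __name__ == "__main__":
    stats = parse_acl_stats(SAMPLE)
    print("gap between received and sent alerts:", unsent_gap(stats))
```

Taking the counters several times and watching whether the gap keeps growing gives a rough indication of whether the management processor is keeping up with the datapath processor.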
If the received count differs from the sent or sent direct count by a large value that is still within 10,000, then the buffer that holds the ACL alerts at the management processor is full. This might be the cause of the issue.

intruShell@mil-ips-01> show acl stats
[Acl Alerts]
Received : 0
Suppressed : 0
Sent : 0
Sent Direct : 0
Stateless ACL Fwd count : 0

The buffer that receives the ACL alerts from the datapath processor is full, and it is not flushed on an event such as ACL alert suppression being disabled or enabled. In this scenario, if the ACL alert buffer is not flushed, the country name of an old ACL alert is mixed with a new ACL alert, which results in the wrong country name in the ACL logs.

If the wrong country name is displayed in the ACL alert, for either the source or the destination IP address, then the issue is with the Sensor.

If you cannot solve the problem even after repeating the troubleshooting steps, or if the problem is not understood, contact McAfee Support for further assistance.

7 Using the InfoCollector tool

This section describes the following aspects of using the InfoCollector tool.

Contents
• Introduction
• How to run the InfoCollector tool
• Using InfoCollector tool
• Using the Log Analyzer tool

Introduction

InfoCollector is an information collection tool, bundled with the Manager, that allows you to easily provide McAfee with McAfee® Network Security Platform-related log information. McAfee can use this information to investigate and diagnose issues you may be experiencing with the Manager.

InfoCollector can collect information from the following sources within McAfee Network Security Platform:

Information Type        Description
Ems.log files           Configurable logs containing information from various components of the Manager. The current ems.log file is renamed, using the current timestamp, when its size reaches 1 MB. Another ems.log is created to collect the latest log information.
Configuration backup    A collection of database information containing all Network Security Platform configuration information.
Configuration files     XML and property files within the Network Security Platform config directory.
Fault log               A table in the Network Security Platform database that contains generated fault log messages.
Sensor Trace            A file containing various McAfee® Network Security Sensor (Sensor)-related log files.
Compiled Signature      A file containing signature information and policy configuration for a given Sensor.

InfoCollector is a tool that can be used both by you and by McAfee. McAfee systems engineers can use the InfoCollector tool to provide you with a definition (.def) file via email. McAfee configures this file to automatically choose the information that it needs from your installation of Network Security Platform. You simply open the definition file within InfoCollector, and it automatically selects the information that McAfee needs from your installation of the Manager.

Alternatively, you can use InfoCollector manually and select the information to provide to McAfee yourself. For example, McAfee may ask you to select checkboxes that correspond to different sets of information available within Network Security Platform.
How to run the InfoCollector tool

To run InfoCollector, follow these steps:

1 If you do not already have InfoCollector installed, download the InfoCollector.zip file from the McAfee website and extract it to a specific location on a specific drive.
Example
C:\[Network Security Manager_INSTALL_DIR]\App\diag
Files related to InfoCollector, such as infocollector.bat, should then be in a specific location.
Example
C:\[Network Security Manager_INSTALL_DIR]\App\diag\InfoCollector
2 Run the following batch file:
C:\[Network Security Manager_INSTALL_DIR]\App\diag\InfoCollector\infocollector.bat

Using InfoCollector tool

To use InfoCollector, follow these steps:

Task
1 After you run InfoCollector, do one of the following:
• If McAfee provides you with a definition file:
a Open the File menu and click Open Definition.
Figure 7-1 Navigating to Open Definition option
b Select the definition file that McAfee sent you via email and click Select.
• If McAfee instructs you to select InfoCollector checkboxes:
a Select the checkboxes as instructed by McAfee.
b Select a Duration. Select Date to specify a start and end date, or select Last X Days.
c Select the number of days from which InfoCollector should gather information.
d Click Browse and select the path and filename of the output ZIP file.
2 Click Run.
Figure 7-2 Running selected files
3 Provide the output ZIP file to McAfee as recommended by McAfee Technical Support. You can send the file via email or through FTP.
The output ZIP file contains the toolconfig.txt file, which lists the information that you have chosen to provide to McAfee.

Using the Log Analyzer tool

This section describes the functions of the Log Analyzer tool and the tasks that can be performed in the various tabs that are available in the tool.
Tasks
• Introduction
• Running the Log Analyzer
• Add a new customer case
• View summary of the Manager
• Create an Event Chart
• Search for a log file
• Managing log files in repository

Introduction

Log Analyzer is a troubleshooting tool used to track and analyze log files. The InfoCollector logs are uploaded to the Log Analyzer tool, which provides the following information:
• A high-level summary of the Manager installation.
• Charts based on events, time slider, alerts, packet logs, and memory.
• Advanced search options to search only specific log files and time ranges.
• A repository of logs.

Log Analyzer can analyze information from the following log files within McAfee Network Security Platform:
• Ems.log
• Acm.log
• Aqcount.log
• Pqcount.log

Running the Log Analyzer

To run Log Analyzer:

Task
1 Extract the new build of Log Analyzer and save it in a preferred location on your local system.
2 Open the folder and click start.bat to start the server.
3 Open the browser and type the connection path. For example, localhost:8983/la/.
The Log Analyzer web tool is displayed.
To stop the server, click start.bat.

Add a new customer case

Before you begin
You can add a new case to manage log sets for a customer.

Task
1 Click on the icon at the bottom of the left panel.
The Add New Case window is displayed.
Figure 7-3 Add New Case window
2 Enter the following fields:

Fields          Description
Case            Type the name of the case.
Customer        Type the name of the customer.
Description     Type a description for the new case.

3 Click Save to add the new case.
To delete the case, click on the icon at the bottom of the left panel.
Tasks
• Add a new log set

Add a new log set

This section explains how to add new log set files.

Task
1 Click on the Add button in the Case Details window.
The Add/Edit Log Set window is displayed.
Figure 7-4 Add/Edit Log Set window
2 Type the name of the log set in the Name field.
3 In the Upload Path field, click Browse and select the log set zip file to be uploaded.
4 Click Submit to add the log set.
The log set is added in the Customer Cases panel.

View summary of the Manager

In the Summary tab, you can view the following details of the Manager.

Field name                  Description
CPU                         Specifies the CPU details.
Installed By                Displays the name of the person who installed the Manager.
Installed On                Displays the date and time of installation.
Installed Directory         Specifies the location of the directory where the Manager is installed.
Manager Type                Specifies the type of the Manager.
MySQL Install Directory     Specifies the location of the directory where MySQL is installed.
Regional Language           Specifies the regional language.

The summary details are generated from the "installer_debug.txt" file in "NetworkSecurityManager\App\UninstallerData\Logs". This file is automatically included when you upload a log set into the Log Analyzer and is used to create the details for the Summary tab. On versions prior to 8.0, however, you must add this file to the log set manually: in the Repository tab, click Add File. The details are then displayed on the Summary tab.
Figure 7-5 Summary tab

Create an Event Chart

In the Charts tab, you can view the following charts:
• Event Chart
• Rate Chart
• Memory Chart

The event chart displays the chart for the selected event(s).
Do the following steps to create an event chart.

Task
1 From the Available Events list, select the event(s) and click on the button. The selected event(s) are included in the Selected Events list.
To remove an event from the Selected Events list, select the event and click on the button. The selected event(s) are removed from the Selected Events list.
2 Click on the Create Chart button.
The Event chart is displayed on the page.
Figure 7-6 Event chart
It also displays the following information.

Field name              Description
Start Time              Specifies the start time of the event(s).
End Time                Specifies the end time of the event(s).
Select Time Range       Click and drag the slider(s) to select the start time and end time of the event(s).

To analyze the information for a more specific time, click and drag the mouse pointer on the event for that time.

Search for a log file

Before you begin
You can search for a specific log file or a list of files for a specific file type in the repository.

To search for a specific log file:

Task
1 Type the name of the log file in the Find text area.
2 Click Submit.
The log file found in the repository is displayed in the Search Results section.
Figure 7-7 Search Results section
3 Under Search Options, click on the type of log file and click Submit. To select multiple file types, press Shift and click on the file types.
4 Select the start and end time of the log files by moving the slider(s) in the Select Time Range field.
The list of log files found in the repository is displayed in the Search Results section.

In the Search textbox under the "Customer Cases" pane, you can search for the log sets pertaining to a particular customer or a particular case.
The text entered in the textbox is matched against the case and customer name fields. All the related cases, along with the log sets added to them, are displayed.

Managing log files in repository

In the Repository tab, you can view, download, and process the log files for analysis.
Figure 7-8 Repository tab

The following table shows the options available in the Repository tab.

Options                     Description
Search                      Type the name of the file in the Search text field. The specified file name, if found in the repository, is displayed on the page.
File Name                   Click on the file name link in the File Name column.
Last Modified               Displays the date and time when the file was last modified.
Processed for Analysis      A tick mark signifies that the file is processed for analysis. A cross mark signifies that the file is not processed for analysis.
Add File                    Click on the Add File button to search for and upload a new file to the log set in the repository.
Process                     Click on the Process button to process the log file for analysis.

8 Automatically restarting a failed Manager with Manager Watchdog

This section provides information on how the Manager Watchdog works, installing the Manager Watchdog, starting the Manager Watchdog, using the Manager Watchdog in an MDR configuration, and tracking the Manager Watchdog activities.

Contents
• Introduction
• How the Manager Watchdog works
• Install the Manager Watchdog
• Start the Manager Watchdog
• Use the Manager Watchdog with Manager in an MDR configuration
• Track the Manager Watchdog activities

Introduction

The Manager Watchdog feature is designed to restart the Manager if the Manager crashes, potentially bringing the Manager back online before MDR failover occurs. The Manager Watchdog periodically monitors the Manager process on the Manager server for availability.
If Manager Watchdog detects that the Manager has gone down unexpectedly, it restarts the service automatically. (It does not restart the Manager if the Manager has been shut down intentionally.)

How the Manager Watchdog works

Manager Watchdog runs as a separate process and monitors the Manager through the Windows OS Services model. Manager Watchdog polls the Manager every 10 seconds. If the Manager Watchdog does not detect the Manager during a polling period, it waits 30 seconds and then restarts the Manager service automatically. Manager Watchdog makes five attempts to restart the Manager and exits if it has not succeeded.

Manager Watchdog is, by default, a manual service; you must explicitly start it. You can change this setting to automatic if you want the service to start automatically after a system reboot.

If you have changed the Manager service setting from its default (Auto) to "Manual" (during a troubleshooting session, for example), consider doing the same for Manager Watchdog. This prevents the Manager Watchdog from restarting the Manager automatically.

Install the Manager Watchdog

Manager Watchdog is installed automatically during Manager installation, and a new OS service called "Network Security Platform Watchdog Service" is created to enable you to start and stop the Manager Watchdog service. When you first install the Manager, this service is started automatically. However, the default Windows Startup Type for this service is manual.

Manager Watchdog monitors only the "Network Security PlatformMgr" service; it does not monitor services like MySQL or Apache.

Start the Manager Watchdog

The Manager Watchdog process is, by default, not started after installation; you must start the Manager Watchdog process manually.
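To make the timing described under "How the Manager Watchdog works" concrete, the following Python sketch models the poll, wait, and restart cycle. It is a simplified model under stated assumptions, not the Watchdog's actual implementation. The 10-second poll, 30-second wait, and five-attempt limit come from this guide; the function names, the injected is_running and restart callbacks, the max_polls limit, and the choice to reset the attempt counter once the Manager is seen running again are all hypothetical. The sketch also does not model how an intentional shutdown is recognized.

```python
import time

POLL_INTERVAL_S = 10   # from the guide: the Manager is polled every 10 seconds
RESTART_DELAY_S = 30   # from the guide: wait 30 seconds before restarting
MAX_ATTEMPTS = 5       # from the guide: exit after five unsuccessful attempts


def run_watchdog(is_running, restart, sleep=time.sleep, max_polls=None):
    """Poll is_running(); when it fails, wait and then call restart().

    is_running and restart are caller-supplied callbacks (hypothetical).
    Returns "exited" after MAX_ATTEMPTS consecutive failed restarts, or
    "stopped" when max_polls (a limit added for testing) is reached.
    """
    failed_attempts = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if is_running():
            # Assumption: the attempt counter resets once the Manager is up.
            failed_attempts = 0
            sleep(POLL_INTERVAL_S)
            continue
        # Manager not detected during this polling period.
        sleep(RESTART_DELAY_S)
        restart()
        if is_running():
            failed_attempts = 0
        else:
            failed_attempts += 1
            if failed_attempts >= MAX_ATTEMPTS:
                return "exited"
    return "stopped"


if __name__ == "__main__":
    # Simulate a Manager that never comes back; no real waiting is done.
    attempts = []
    outcome = run_watchdog(lambda: False, lambda: attempts.append(1),
                           sleep=lambda seconds: None)
    print(outcome, len(attempts))
```

Passing the sleep function as a parameter keeps the model testable without real delays. In this model, five failed restarts in a row end the loop, which corresponds to the Watchdog exiting.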
To start/stop Manager Watchdog:

Task
1 Select Start | Settings | Control Panel. Double-click Administrative Tools, and then double-click Services.
2 Click Network Security Platform Watchdog Service.
3 Do one of the following:
• To start the service, select Action | Start.
• To stop the service, select Action | Stop.

Alternatively, you can use the Manager icon in the Windows system tray to start or stop Manager Watchdog. Right-click on the Manager icon at the bottom-right corner of your server and select Start Watchdog or Stop Watchdog as required.

Use the Manager Watchdog with Manager in an MDR configuration

When using Manager Watchdog on a Manager that is part of an MDR configuration, consider whether you want the Manager Watchdog to restart the Manager before failover can occur. If so, you must ensure that the value set for the MDR setting "Downtime Before Switchover" is greater than the Manager Watchdog setting of 30 seconds. This prevents the initiation of MDR, wherein the peer Manager takes over if the primary Manager fails. McAfee suggests retaining the default value of 5 minutes or greater to allow the Manager Watchdog time to restart the Manager.

If the Manager Watchdog brings up a primary Manager after MDR has initiated, note that the primary Manager does not come back as Active; it first checks whether the secondary is Active and, if so, remains on standby.

Track the Manager Watchdog activities

The Manager Watchdog logs all controlled activities in a log file.
Log files can be found at /<Network Security Platform install directory>/ and are named with the filename convention wdout_<<time stamp>>.log.

A sample log file entry follows:

Sample Manager Watchdog Log
----------------------------------------------------------------------------------------------------------------------------
Restarting server at Mon Jun 09 14:48:53 GMT+05:30 2006
SERVER STDOUT: The Network Security Platform Manager Service is starting.
SERVER STDOUT: The Network Security Platform Manager Service was started successfully.
SERVER STDOUT:
SERVER STDOUT:
----------------------------------------------------------------------------------------------------------------------------

If the Manager Watchdog fails after five attempts to restart the Manager, the following line appears in the log file:

SERVER STDOUT: Failed to restart Manager after five attempts. Exiting.

9 Utilize of the McAfee KnowledgeBase

The McAfee KnowledgeBase (KB) contains a large number of useful articles designed to answer specific questions that might not have been addressed elsewhere in the documentation set. We suggest checking to see if a question you have is answered in a KB article.

To access the McAfee KnowledgeBase:
Go to http://mysupport.mcafee.com, and click Search the KnowledgeBase.

The following list contains some of the more commonly accessed KB articles.
New Number      Topic
KB55446         All signature set releases with links to signature set release notes
KB55447         All UDS releases and release notes of the UDSs (this is a restricted article and requires the user to log into the service portal or be internal)
KB55448         Table displaying the current versions for McAfee® Network Security Platform
KB55449         Listing of McAfee Network Security Platform's response to high profile public vulnerabilities
KB55450         How to request coverage for a threat that isn't already covered
KB55451         List of all McAfee Recommended for Blocking (RFB) attacks
KB55318         Sensor heat dissipation rates (BTUs per hour)
KB60660         Verifying MySQL Database Tables
KB55470         Network Security Platform maximum number of CIDR blocks using VIDS
KB55549         Collecting a diagnostics trace from the McAfee Network Security Sensor (Sensor)
KB55568         VLAN limitations for Network Security Platform
KB55723         Maximum number of SSL keys for McAfee Network Security Manager (Manager) or Sensor
KB55743         How to submit Network Security Platform false positives and incorrect detections to McAfee Support
KB55908         Support for legacy versions
KB55364         Asymmetric traffic
KB56069         "Login failed: Unable to get the McAfee Network Security Manager (Manager) license information"
KB56071         Configuring authentication on the Manager for the update server
KB56364         3rd Party Recommended Hardware for Sensors
                Error: Download Failed: Reason 42: Sensor fails to apply new updates internally (Sensor signature updates fails)
                Network Security Platform Release Notes (Master List)
KB59347         Sensor is reporting false DOS attacks / New network device is added and Sensor is now reporting DOS attacks
KB59344         Recover the password for the Manager