Monitoring Microsoft Endpoint Protection for Windows Azure Events

Microsoft Corporation
Published: March 2012

Send suggestions and comments about this document to eppazurefb@microsoft.com.

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted in examples herein are fictitious. No association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2012 Microsoft Corporation. All rights reserved.

Microsoft, MS-DOS, Windows, Windows Server, System Center Operations Manager, and Active Directory are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Contents

1 Introduction
2 Monitoring goals
2.1 Functionality recap
2.2 What to monitor for
2.3 Windows Azure Monitoring Paradigm
3 Pre-requisites for Monitoring
4 Monitoring solutions
5 Monitoring scenarios
5.1 Scenario A: malware detected and removed
5.2 Scenario B: reimage role instance after malware detection
5.3 Scenario C: critical malware removal failure
5.4 Scenario D: signature update failures
6 Specific Antimalware Events
Appendix A: list of antimalware events

1 Introduction

This document provides an overview of how to approach monitoring antimalware events for services running in Windows Azure. While the information in this document is specifically tailored to implementations running Microsoft Endpoint Protection for Windows Azure, the information applies to other versions of the Microsoft antimalware client running in Windows Azure virtual machines.
For example, System Center Endpoint Protection 2012 uses the same events and monitoring approach. For more information on running antimalware in Windows Azure, see:

Microsoft Endpoint Protection for Windows Azure
http://go.microsoft.com/fwlink/?LinkID=244362

2 Monitoring goals

Once you have deployed antimalware in your VM roles, you need a way to tell if something is not right with your antimalware protection. This can range from signatures not updating properly to an active infection on a machine that cannot be cleaned.

2.1 Functionality recap

When you deploy the antimalware solution as part of your Windows Azure service, the following core functionality is enabled:

Real-time protection: monitors activity on the system to detect and block malware from executing.
Scheduled scanning: periodically performs targeted scanning to detect malware on the system, including actively running malicious programs.
Malware remediation: takes action on detected malware resources, such as deleting or quarantining malicious files and cleaning up malicious registry entries.
Signature updates: installs the latest protection signatures (aka “virus definitions”) to ensure protection is up-to-date.
Telemetry service: reports threat data and suspicious files to Microsoft to ensure rapid response to the evolving threat landscape, and enables real-time signature delivery through the Dynamic Signature Service (DSS).

Microsoft’s antimalware client solutions are designed to run quietly in the background without requiring human intervention. Even if malware is detected, the client will automatically take action to remove the detected threat. Monitoring should focus on flagging VM role instances that are in a “bad state” and may require action to be taken.
2.2 What to monitor for

Some of the specific goals of monitoring the antimalware (AM) client running in Azure are to flag VM role instances where:

Real-time protection is disabled: if real-time protection is disabled, the system is not being protected and is at risk of infection. Monitoring should ensure that real-time protection is running.
Signatures are out of date: if signatures are not being updated on a frequent basis, the system may be vulnerable to infection by newly released malware.
Removing a detected threat failed: if malware is detected, the antimalware client will take remediation actions to clean the system. In some cases the malware cannot be removed successfully and may still be active on the system even after detection.
Functionality issues: functional issues, such as failing to load the antimalware engine, may indicate a malware infection or a problem with the antimalware installation.

2.3 Windows Azure Monitoring Paradigm

One useful advantage of virtual machines over traditional “bare metal” systems is the ability to revert them to a known good state if something goes wrong. This is especially true in the Windows Azure environment, where applications must be designed to be stateless and continue running when the underlying OS is replaced. If something goes wrong in the OS, the particular role instance can simply be reimaged and will pick up whatever work is waiting for it.

This paradigm has some unique advantages for antimalware. Traditionally, malware remediation is a “best effort” attempt to rid an infected system of malicious resources and get it back into a healthy state. However, there is always the chance that some component of the malware was missed or that an active component may be lurking undetected on the computer. “Wipe and reload” has always been a theoretical best practice, but it is difficult to put into practice in a production environment given the costs and downtime involved.
Windows Azure, on the other hand, is designed explicitly for the “wipe and reload” approach to failures. From an antimalware standpoint, if something goes wrong, it is fairly straightforward to reimage the machine, return it to a known clean state, and let it continue running its workload. For this reason, many of the recommendations for specific monitoring scenarios lean towards simply reimaging the Azure role instance in order to get your service back to doing work at full capacity as soon as possible.

3 Pre-requisites for Monitoring

Monitoring antimalware in Windows Azure requires a different approach from traditional antimalware management solutions such as System Center Configuration Manager. Rather than working with diagnostics data on the virtual machine instance directly (event log entries, performance counters, etc.), Azure monitoring solutions work with diagnostics data that is persisted in Windows Azure Storage. This is done by enabling Windows Azure Diagnostics (http://msdn.microsoft.com/en-us/library/windowsazure/gg433048.aspx) in your service.

The first step to monitoring antimalware is to configure Windows Azure Diagnostics to put the antimalware-related data into the desired storage account. Once this is configured, you can run a monitoring solution designed to work with Azure Diagnostics data to actually monitor for antimalware-related issues. Antimalware monitoring uses event log entries to monitor the protection state of the system.

To configure Azure Diagnostics for antimalware

1. Ensure that Azure Diagnostics is enabled for your service.
2. Configure Azure Diagnostics to use the desired storage account to persist the antimalware event data.
3. Configure your service to collect the antimalware data (refer to the MSDN documentation for Azure Diagnostics for details).
   a. In the code for each role (worker role, web role, etc.), add support for logging events with the following criteria:
      Event Log: SYSTEM
      Source: Microsoft Antimalware
      For example, you’ll add a line similar to this:
      config.WindowsEventLog.DataSources.Add("System!*[System[Provider[@Name='Microsoft Antimalware']]]");

Once you deploy your service with the antimalware events configured for collection, the data will be stored in your storage account and be ready for monitoring. To verify that Azure Diagnostics is working correctly to capture antimalware events, you can view the raw data in Visual Studio by opening the EventLog table in the storage account.

4 Monitoring solutions

There are various solutions available for monitoring Windows Azure, such as:

Azure Diagnostics Manager
http://www.cerebrata.com/Products/AzureDiagnosticsManager/

System Center Monitoring Pack for Windows Azure
http://www.microsoft.com/download/en/details.aspx?id=11324

Refer to the specific solution for details on how to configure it for monitoring. You can use the monitoring solution to create rules to determine the antimalware health state on the system. Guidance on what to monitor for is provided in the “Specific Antimalware Events” section.

The following screenshot shows an example of a basic antimalware health monitoring rule created for use with the Azure Monitoring Pack:

[Screenshot: basic antimalware health monitoring rule in the Azure Monitoring Pack]

In this example the event monitor is configured to flip the health state of the Azure role instance to “unhealthy” if a 2001 event (“signature update failed”) is raised, and flip it back to “healthy” if a subsequent 2000 event (“signature update succeeded”) is raised. In this case the 2001 event has not been followed by a 2000 event, so this worker role instance is flagged as unhealthy: the signature update failed and has not subsequently succeeded.
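The health rule described above can be sketched outside any particular monitoring product. The following Python is a minimal illustration, not the Monitoring Pack's implementation: it assumes the event IDs for one role instance have already been read from diagnostics storage in chronological order, and the names SignatureHealthMonitor, evaluate, and failure_threshold are hypothetical.

```python
from dataclasses import dataclass

SIGNATURE_UPDATED = 2000        # "signature update succeeded"
SIGNATURE_UPDATE_FAILED = 2001  # "signature update failed"


@dataclass
class SignatureHealthMonitor:
    """Tracks signature-update health for a single role instance."""
    # Hypothetical knob: 1 reproduces the rule described above, where a
    # single 2001 event flips the instance to "unhealthy".
    failure_threshold: int = 1
    consecutive_failures: int = 0

    def observe(self, event_id: int) -> str:
        """Feed one event ID (oldest first) and return the resulting state."""
        if event_id == SIGNATURE_UPDATE_FAILED:
            self.consecutive_failures += 1
        elif event_id == SIGNATURE_UPDATED:
            # A successful update clears earlier failures.
            self.consecutive_failures = 0
        # Any other event ID leaves the signature health state unchanged.
        return self.state

    @property
    def state(self) -> str:
        if self.consecutive_failures >= self.failure_threshold:
            return "unhealthy"
        return "healthy"


def evaluate(event_ids, failure_threshold=1):
    """Replay a chronological list of event IDs and return the final state."""
    monitor = SignatureHealthMonitor(failure_threshold=failure_threshold)
    for event_id in event_ids:
        monitor.observe(event_id)
    return monitor.state
```

With the default threshold, evaluate([2001]) reports "unhealthy" and evaluate([2001, 2000]) reports "healthy". Counting consecutive failures is an assumption of this sketch; a monitoring product may count failures within a time window instead.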
This is a simple example. In practice you would want more fine-grained rules, such as going to an unhealthy state only after multiple signature install failures rather than triggering on the first one. See the “Specific Antimalware Events” section for detailed guidance on what to monitor for.

5 Monitoring scenarios

Here are some examples of monitoring scenarios using antimalware events and Windows Azure diagnostics.

5.1 Scenario A: malware detected and removed

Healthy -> 1117 Malware Removed

This is the most basic threat-related scenario. Consider an Azure worker role that processes files as input. If a malicious executable is fed into the service, antimalware real-time protection will detect the malicious binary and then automatically quarantine it (raising a 1117 event). No further action is required, but you may want to monitor for this sequence of events to understand how malware is getting into your deployment, as well as to ensure that malware is being cleaned successfully.

5.2 Scenario B: reimage role instance after malware detection

Healthy -> 1117 Malware Removed -> Role instance reimaged

In this scenario, rather than letting the role continue to run after malware is detected and removed, the VM is reimaged in order to “take no chances” that malware may have compromised the role instance. This is a more paranoid, but ultimately safer approach. The downside is degraded service capability while the role instance is being reimaged.

5.3 Scenario C: critical malware removal failure

Healthy -> 1119 Malware Removal Failed -> Role instance reimaged

In this scenario, malware is detected but cannot be removed. For example, a sophisticated rootkit has infected the system and is using hardening techniques to prevent removal. The role instance is actively infected and should be reimaged to help ensure data is not compromised.
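Scenarios A through C reduce to a small decision rule keyed on the event ID. The sketch below is one way to express it in Python. The Action enum, the decide_action function, and the reimage_on_detection flag are hypothetical names introduced for illustration, and actually triggering a reimage is left to whatever management tooling you use.

```python
from enum import Enum

MALWARE_REMOVED = 1117         # malware detected and removed
MALWARE_REMOVAL_FAILED = 1119  # critical malware removal failure


class Action(Enum):
    NONE = "none"           # event is not relevant to these scenarios
    RECORD_ONLY = "record"  # instance stays healthy; keep the event for review
    REIMAGE = "reimage"     # return the role instance to a known clean state


def decide_action(event_id: int, reimage_on_detection: bool = False) -> Action:
    """Map one antimalware event ID to the response described in Scenarios A to C."""
    if event_id == MALWARE_REMOVAL_FAILED:
        # Scenario C: the instance is presumed infected regardless of policy.
        return Action.REIMAGE
    if event_id == MALWARE_REMOVED:
        # Scenario B when the stricter policy is enabled, otherwise Scenario A.
        return Action.REIMAGE if reimage_on_detection else Action.RECORD_ONLY
    return Action.NONE
```

Setting reimage_on_detection to True selects the Scenario B policy; it trades some service capacity during the reimage for a lower risk of running a compromised instance.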
5.4 Scenario D: signature update failures

[Diagram: Healthy; 2001 signature update failed; 2004 signatures reverted; 2000 signatures updated; Role instance reimaged; Firewall configuration reverted; Connectivity investigated; 2001 signature update failed]

In this scenario, a change to firewall policy intended to lock down Internet connectivity from the worker role results in the antimalware client no longer being able to download the latest signatures. In a real-world case, the failure and reversion might happen several times before the instance is reimaged, but continued failure indicates that something is likely configured incorrectly or otherwise broken. In this example, once the change to the firewall policy is reverted, the antimalware client is able to download signatures once again and is back to a healthy and protected state.

Your own monitoring needs and real-life scenarios will depend on your particular environment, risk assessment, and a variety of other factors. The following section provides some additional suggestions for how to use the antimalware events to monitor your Azure deployment.

6 Specific Antimalware Events

The following antimalware events are particularly useful for monitoring the health of your Windows Azure antimalware deployment. The endpoint protection client logs a number of additional events (see Appendix A), but the events described in this section are the most interesting from a monitoring standpoint:

Event ID: 1005

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 1005
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_SCAN_FAILED
Message: Endpoint Protection client scan has encountered an error and stopped.
    Scan ID: <ID number>
    Scan Type: Antivirus, Antispyware, Antimalware
    Scan Parameters: Full Scan, Quick Scan, Custom Scan
    User: <Domain>\<User>
    Error Code: <Error code>
    Error description: <Error description>
Explanation: The Endpoint Protection client encountered an error, and the current scan has been stopped. This error record includes the scan ID, type of scan (antivirus, antispyware, antimalware), scan parameters, the user that started the scan, the error code, and a description of the error.
User Action: Try to run the scan again. If it fails in the same way, look up the error code by accessing the Microsoft Support Site (http://go.microsoft.com/fwlink/?LinkId=215163) and entering the error number in the Search box.
Azure Monitoring Guidance: This error indicates a problem with the antimalware client on-demand scanning functionality. If this error occurs three or more times in a role instance, it indicates that something is not working correctly with the antimalware solution, including the possibility of a malware infection preventing the scan from completing properly. Monitor for three or more instances of this error being logged for a given role instance. If this occurs, reimage the instance. If it happens again, manually investigate what is happening on the system.

Event ID: 1117

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 1117
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_MALWARE_ACTION_TAKEN
Message: Endpoint Protection has taken action to protect this machine from malware or other potentially unwanted software.
    For more information, see the following:
    Name: <Threat name>
    ID: <Threat ID>
    Severity: Low, Medium, High, Severe
    Category: <Category description>
    Path: <Path>
    Detection Origin: Unknown, Local machine, Network share, Internet, Incoming traffic, Outgoing traffic
    Detection Type: Heuristics, Generic, Concrete, Dynamic Signature
    Detection Source: User, System, Real-time protection, IE Downloads and Outlook Express Attachments, Network Inspection System, Browser Help Object
    User: <Remediation User Name>
    Process Name: <Process in the PID>
    Action: Remove, Clean, Quarantine, Allow, Not Applicable
    Action Status: <Description of additional actions>
    Signature Version: <Definition version>
    Engine Version: <Antimalware Engine version>
Explanation: Endpoint Protection took action on a virus. This event is logged after action is taken within Endpoint Protection.
User Action: No user action is necessary.
Azure Monitoring Guidance: This event indicates that the antimalware client succeeded in remediating the detected threat. Monitor for this event to understand when malware is being encountered in your Azure environment. The role instance can be considered healthy and no further action is required. If this event continues to reoccur at regular intervals, manually investigate the source of the threat and remove it.

Event ID: 1118

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 1118
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_MALWARE_ACTION_FAILED
Message: The Endpoint Protection client has encountered a non-critical error when taking action on malware or other potentially unwanted software. For more information, see Microsoft Malware Protection Center (http://go.microsoft.com/fwlink/?linkid=158117&threatid=4294967289).
    Name: <Threat name>
    ID: <Threat ID>
    Severity: Low, Medium, High, Severe
    Category: <Category description>
    Path: <Path>
    Detection Origin: Unknown, Local machine, Network share, Internet, Incoming traffic, Outgoing traffic
    Detection Type: Heuristics, Generic, Concrete, Dynamic Signature
    Detection Source: User, System, Real-time protection, IE Downloads and Outlook Express Attachments, Network Inspection System, Browser Help Object
    User: <Remediation User Name>
    Process Name: <Process in the PID>
    Action: Remove, Clean, Quarantine, Allow, Not Applicable
    Action Status: <Description of additional actions>
    Error Code: <Error code>
    Error Description: <Error description>
    Signature Version: <Definition version>
    Engine Version: <Antimalware Engine version>
Explanation: The Endpoint Protection client failed to complete a task related to the malware remediation; however, it was not considered a critical failure.
Azure Monitoring Guidance: This event indicates that the antimalware client failed to remediate a detected threat, but that the failure was not considered critical. The system state is considered healthy and no further action is needed. Examples of where this can happen include attempts to remove malware from a read-only network location.
Monitor for this event to understand when malware is encountered in your environment. The role instance can be considered healthy and no further action is required. If this event continues to reoccur at regular intervals, manually investigate the source of the threat resource and remove it, for example removing malicious files on a read-only location that are being accessed by your Azure service.

Event ID: 1119

Details
Product: Microsoft Malware Protection
ID: 1119
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_MALWARE_ACTION_FAILED
Message: Endpoint Protection client has encountered a critical error when taking action on malware or other potentially unwanted software.
    For more information please see the following: http://go.microsoft.com/fwlink/?linkid=158117&threatid=4294967289
    Name: <Threat name>
    ID: <Threat ID>
    Severity: Low, Medium, High, Severe
    Category: Exploit, Test, Vulnerability, Policy
    Path: <Path>
    Detection Origin: Unknown, Local machine, Network share, Internet, Incoming traffic, Outgoing traffic
    Detection Type: Heuristics, Generic, Concrete, Dynamic Signature
    Detection Source: User, System, Real-time protection, IE Downloads and Outlook Express Attachments, Network Inspection System, Browser Help Object
    User: <Remediation User Name>
    Process Name: <Process in the PID>
    Action: Remove, Clean, Quarantine, Allow, Not Applicable
    Action Status: <Description of additional actions>
    Error Code: <Error code>
    Error Description: <Error description>
    Signature Version: <Signature version>
    Engine Version: <Antimalware Engine version>
Explanation: Endpoint Protection client has received this error due to critical issues encountered while trying to remove detected malware. The computer is unprotected.
User Action: Review the Action Status field in the 1119 event for information on additional actions to take to remove the detected malware. For example, the Action Status field may indicate the need to perform manual removal steps as documented in the Microsoft Malware Protection Center Encyclopedia.
Azure Monitoring Guidance: This event indicates that the antimalware client encountered a critical failure while attempting to remediate a detected threat. The system state is considered unprotected and presumed to be infected. Monitor for this event and reimage the role instance if it occurs in order to ensure data is not compromised by an active infection. The event details will provide information about the source of the infection. If the same threat detection and removal failure reoccurs, manually investigate to understand how the threat is entering your service.

Event ID: 2000

This event is logged in the System log.
Details
Product: Microsoft Malware Protection
ID: 2000
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_SIGNATURE_UPDATED
Message: Microsoft Antimalware signature version has been updated.
    Current Signature Version: <Current signature version>
    Previous Signature Version: <Previous signature version>
    Signature Type: Antivirus, Antispyware, Software Restriction, Antimalware, Network Inspection System
    Update Type: Full
    User: <Domain>\<User>
    Current Engine Version: <Current engine version>
    Previous Engine Version: <Previous engine version>
Explanation: This event occurs when definitions are successfully updated.
User Action: No user action is necessary.
Azure Monitoring Guidance: This event indicates that the antimalware client updated to the latest published signatures. Monitor for this event to occur once per day. If this event is not logged on a regular basis (daily by default), signatures may become out of date and the role instance will no longer be adequately protected. If this event is not being logged daily, look for event ID 2001, which indicates a signature update failure and may provide information about the source of the failure. If 2001 events are not being logged, signature updates may not be configured correctly.

Event ID: 2001

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 2001
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_SIGNATURE_UPDATE_FAILED
Message: Endpoint Protection client has encountered an error trying to update signatures.
    New Signature Version: <New version number>
    Previous Signature Version: <Previous version number>
    Update Source: Signature Update Folder, Internal Definition Update Server, Microsoft Update Server, File share, Microsoft Malware Protection Center
    Update Stage: Search, Download, Install
    Source Path: <File share name for UNC, server name for WSUS/MU/ADL>
    Signature Type: Antivirus, Antispyware, Software Restriction, Antimalware, Network Inspection System
    Update Type: Full, Delta
    User: <Domain>\<User>
    Current Engine Version: <Current engine version>
    Previous Engine Version: <Previous engine version>
    Error code: <Error code>
    Error description: <Error description>
Explanation: This error occurs if there is a problem while trying to update definitions.
User Action: If you are having problems updating definitions, the following steps can help:
1. Ensure your configuration for definition updates is correct. For more information, see Configuring definition updates (http://go.microsoft.com/fwlink/?LinkId=214996).
2. Try to update the definitions manually by downloading the full definitions files. To download the definitions, see the alternative download location (http://go.microsoft.com/fwlink/?LinkId=214316).
3. For more information about this error, review the entries in the %Windir%\WindowsUpdate.log log file.
Azure Monitoring Guidance: In Windows Azure, the antimalware client gets signature updates via HTTP download from the Microsoft Download Center, as Windows Update is not available in Azure VM instances. This event indicates that the antimalware client experienced an error when trying to update to the latest published signatures. A single failure may indicate a transient issue such as a network issue. However, persistent failures indicate a more serious problem. Monitor for three or more instances of this error being logged for a given role instance. If the error is not followed by Event ID 2000 (successful signature update event), reimage the role instance.
If failures continue, manually investigate to determine if there is a connectivity issue or other problem.

Event ID: 2002

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 2002
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_ENGINE_UPDATED
Message: Microsoft Antimalware engine version has been updated.
    Current Engine Version: <Current engine version>
    Previous Engine Version: <Previous engine version>
    Engine Type: Antimalware
    User: <Domain>\<User>
Explanation: The antimalware protection engine has been updated. This event occurs when the antimalware engine is updated. This audit record includes the current engine version, the engine version before the update, the update source (Schedule, User Request or Signature Update Folder), and the user that started the application. This event occurs when a software update is available and installed.
User Action: No user action is necessary.
Azure Monitoring Guidance: The antimalware engine is typically updated monthly (generally with the exception of December). This event indicates that the antimalware engine was successfully updated. Monitor for this event if you are seeing event ID 2003 (engine update failure). If this event occurs after event ID 2003, the system state can be considered healthy. However, if you are not seeing this event after a 2003 event, the engine is not being successfully updated and the system is in an unhealthy state and at risk of infection.

Event ID: 2003

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 2003
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_ENGINE_UPDATE_FAILED
Message: Endpoint Protection client has encountered an error trying to update the engine.
    New Engine Version: <New engine version>
    Previous Engine Version: <Previous engine version>
    Engine Type: Antivirus, Antispyware, Software restriction, Antimalware, Network inspection system
    User: <Domain>\<User>
    Error Code: <Error code>
    Error description: <Error description>
Explanation: The antimalware client update has failed. This event occurs when the antimalware engine tries to update itself but fails. This error record includes the current engine version, the engine version before the update, the update source (Schedule, User Request or Signature Update Folder), the user that started the application, the error code, and a description of the error. This event commonly occurs due to a network connectivity break in the middle of an update.
User Action: To troubleshoot this event, use the following steps:
1. Restart the computer and try again.
2. Check Configuring definition updates (http://go.microsoft.com/fwlink/?LinkId=214996).
3. Manually download the latest definitions from the Microsoft Malware Protection Center (http://go.microsoft.com/fwlink/?LinkID=200965). Note that the size of the definitions file downloaded from the Microsoft Malware Protection Center can exceed 60 MB and should not be used as a long-term solution for updating definitions.
Azure Monitoring Guidance: In Windows Azure, the antimalware client gets engine updates via HTTP download from the Microsoft Download Center, as Windows Update is not available in Azure VM instances. The engine comes down as part of regular signature updates, with a new engine typically being released monthly. This event indicates that the antimalware client experienced an error when trying to update to the latest version of the antimalware engine. A single failure may indicate a transient issue such as a network issue. However, persistent failures indicate a more serious problem. Monitor for three or more instances of this error being logged for a given role instance.
If the error is not followed by Event ID 2002 (successful engine update event), reimage the role instance. If failures continue, manually investigate to determine if there is a connectivity issue or other problem.

Event ID: 2004

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 2004
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_SIGNATURE_REVERSION
Message: Endpoint Protection client has encountered an error trying to load signatures and will attempt reverting back to a known-good set of signatures.
    Signatures Attempted: Current, Backup
    Engine Type: <Engine type>
    Error Code: <Error code>
    Error description: <Error description>
    Signature version: <Definition version>
    Engine version: <Engine version>
Explanation: The Endpoint Protection client attempted to download and install the latest definitions file and failed. This error can occur if Endpoint Protection client has encountered an error while trying to load the definitions or if the file is corrupt. Endpoint Protection client will attempt to revert back to a known-good set of definitions.
User Action: To troubleshoot this event, use the following steps:
1. Restart the computer and try again.
2. Check Configuring definition updates (http://go.microsoft.com/fwlink/?LinkId=214996).
3. Manually download the latest definitions from the Microsoft Malware Protection Center (http://go.microsoft.com/fwlink/?LinkID=200965). Note that the size of the definitions file downloaded from the Microsoft Malware Protection Center can exceed 60 MB and should not be used as a long-term solution for updating definitions.
Azure Monitoring Guidance: This event can occur in conjunction with signature install failures (event ID 2001). Monitor for three or more instances of this error being logged for a given role instance. If this occurs, reimage the instance. If it happens again, manually investigate what is happening on the system.

Event ID: 2012

This event is logged in the System log.
Details
Product: Microsoft Malware Protection
ID: 2012
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_SIGNATURE_FASTPATH_UPDATE_FAILED
Message: Endpoint Protection client has encountered an error trying to use Dynamic Signature Service.
    Current Signature Version: <Current version>
    Signature Type: Antivirus, Antispyware, Software Restriction
    Current Engine Version: <Version>
    Error code: <Error code>
    Error description: <Description>
    Dynamic Signature Type: Signature update, Signature disable notification
    Persistence Path: <Path>
    Dynamic Signature Version: <Version number>
    Dynamic Signature Compilation Timestamp: <Timestamp>
    Type: Version, Timestamp, No limit, Duration
    Persistence Limit: <Persistence limit>
Explanation: The antimalware client encountered an error when using the Dynamic Signature Service to download the latest definitions for a specific threat. This error is likely caused by a network connectivity issue.
User Action: Check your Internet connectivity settings.
Azure Monitoring Guidance: The Dynamic Signature Service delivers signatures from the cloud in real time when new threats are detected on the client. This error indicates that a suspected threat was detected but real-time signature delivery failed. Monitor for this event. If it occurs, there may be active malware on the system that is preventing reliable signature delivery. You should reimage the instance to prevent data from being compromised. If it happens again, manually investigate what is happening on the system to determine the source of the infection.

Event ID: 3002

This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 3002
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_RTP_AGENT_FAILURE
Message: Endpoint Protection client Real-Time Protection feature has encountered an error and failed.
    Feature: On Access, IE Downloads and Outlook Express Attachments, Behavior monitoring, Network Inspection System
    Error Code: <Error code>
    Error description: <Error description>
Explanation: The Endpoint Protection client’s real-time protection feature encountered an error because one of the services failed to start.
User Action: To troubleshoot this event, use the following steps:
    1. Try to restart both services.
        a. At an elevated command prompt, type net stop msmpsvc, and then type net start msmpsvc to restart the Antimalware engine.
        b. At an elevated command prompt, type net stop nissrv, and then type net start nissrv to restart the NIS engine by using the NiSSRV.exe file.
    2. If it fails in the same way, look up the error code by accessing the Microsoft Support site (http://go.microsoft.com/fwlink/?LinkId=215163) and entering the error number in the Search box, or contact eppazurefb@microsoft.com.

Azure Monitoring Guidance
Real-time protection provides the core functionality that prevents the system from being actively infected. If a component of real-time protection fails, the system will be vulnerable to malware that attempts to infect it.
Monitor for this event. If it is followed by a 3007 event ID, the failure was temporary and the antimalware client recovered from the failure. However, there may have been a period of time during which the system was exposed to attack and may have been compromised. Reimage the role instance if this event is not followed by a 3007 recovery event.

Event ID: 3007
This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 3007
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_RTP_FEATURE_RECOVERED
Message: Real-time Protection has restarted a feature. It is recommended that you run a full system scan to detect any items that may have been missed while this agent was down.
Explanation: If a real-time protection feature fails, the antimalware client will attempt to restart it.
This event indicates the feature was successfully restarted.
User Action: No user action is necessary.

Azure Monitoring Guidance
This event indicates that real-time protection recovered from a failure and is back in a healthy state. However, because an RTP feature failed, there was a window of time when the system was not fully protected.
Monitor for this event to come after event ID 3002 (RTP failure). The system protection state is considered healthy if 3007 follows 3002. Alternatively, because there was a protection gap, monitor for this event and reimage the role instance whenever it occurs. This is the safest option in terms of ensuring no data is compromised by the malware, at the cost of some service degradation while the instance is being reimaged.

Event ID: 5008
This event is logged in the System log.

Details
Product: Microsoft Malware Protection
ID: 5008
Source: Microsoft Antimalware
Symbolic Name: MALWAREPROTECTION_ENGINE_FAILURE
Message: Endpoint Protection client engine has been terminated due to an unexpected error.
    Failure Type: Hang, Crash
    Engine Type: Antivirus, Antispyware, Software Restriction, Antimalware, Network Inspection System
    Exception code: <Error code>
    Resource: <Resource>
Explanation: The Endpoint Protection client engine stopped due to an unexpected error.
User Action: To troubleshoot this event, use the following steps:
    1. Try to restart the service.
        a. For antimalware, antivirus and spyware, at an elevated command prompt, type net stop msmpsvc, and then type net start msmpsvc to restart the Antimalware engine.
        b. For the Network Inspection System, at an elevated command prompt, type net stop nissrv, and then type net start nissrv to restart the NIS engine by using the NiSSRV.exe file.
    2. If it fails in the same way, look up the error code by accessing the Microsoft Support Site (http://go.microsoft.com/fwlink/?LinkId=215163) and entering the error number in the Search box, and contact eppazurefb@microsoft.com.
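The restart steps in the User Action above could be scripted. The following is a minimal Python sketch and not part of the product: the names `restart_commands` and `restart_service` are hypothetical, and only the service names `msmpsvc` and `nissrv` and the `net stop` / `net start` commands come from this document. It assumes the script runs in an elevated session on the role instance.

```python
import subprocess

# Service names taken from the User Action steps in this document.
ANTIMALWARE_SERVICE = "msmpsvc"
NIS_SERVICE = "nissrv"


def restart_commands(service):
    """Return the stop command and the start command for a Windows service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service, run=subprocess.run):
    """Stop and then start a service; return True if the start succeeded.

    `run` is injectable so the logic can be exercised without touching a
    real service. Assumes an elevated session on the role instance.
    """
    stop_command, start_command = restart_commands(service)
    # The stop may fail if the service already exited after a crash,
    # so its result is deliberately ignored.
    run(stop_command)
    return run(start_command).returncode == 0
```

Calling `restart_service(ANTIMALWARE_SERVICE)` covers step 1a and `restart_service(NIS_SERVICE)` covers step 1b. A `False` result corresponds to step 2, where the error code needs to be looked up manually.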

Azure Monitoring Guidance
This event indicates that the antimalware protection engine encountered a critical failure such as a crash or hang.
Monitor for this event and treat it as an indication that the protection state of the system is unhealthy. Reboot the role instance and monitor to see if the error recurs. Alternatively, because there was a protection gap while the engine was not functioning, monitor for this event and reimage the role instance whenever it occurs. This is the safest option in terms of ensuring no data is compromised by the malware, at the cost of some service degradation while the instance is being reimaged.

Appendix A: list of antimalware events

Event ID
1000 1001 1002 1003 1004 1005 1009 1010 1011 1012 1013 1014 1015 1100 1101 1116 1117 1118 1119 2000 2001 2002 2003 2004 2010 2011 2012 2013 3002 3007 5000 5001 5004 5007 5008 5009 5010 5011 5012 5100 5101

Hex ID
3E8 3E9 3EA 3EB 3EC 3ED 3F1 3F2 3F3 3F4 3F5 3F6 3F7 44C 44D 45C 45D 45E 45F 7D0 7D1 7D2 7D3 7D4 7D0 7D0 7D1
BBA BBF 1388 1389 138C 138F 1390 1391 1392 1393 1394 13EC 13ED

Type
Success Success Success Success Success Failure Success Failure
Success Success Success Success Success Failure Failure Success Failure Success Failure Failure Success Success Failure
Failure Success Success Success Success Success Failure Success Success Success Success Success Failure

Level
Information Information Warning Information Information Error Information Error
Information Warning Warning Warning Information Warning Error Information Error Information Error Error Information Information Error
Error Information Information Information Information Information Error Information Information Information Information Warning Error

SymbolicName
MALWAREPROTECTION_SCAN_STARTED
MALWAREPROTECTION_SCAN_COMPLETED
MALWAREPROTECTION_SCAN_CANCELLED
MALWAREPROTECTION_SCAN_PAUSED
MALWAREPROTECTION_SCAN_RESUMED
MALWAREPROTECTION_SCAN_FAILED
MALWAREPROTECTION_QUARANTINE_RESTORE
MALWAREPROTECTION_QUARANTINE_RESTORE_FAILED
MALWAREPROTECTION_QUARANTINE_DELETE
MALWAREPROTECTION_QUARANTINE_DELETE_FAILED
MALWAREPROTECTION_MALWARE_HISTORY_DELETE
MALWAREPROTECTION_MALWARE_HISTORY_DELETE_FAILED
MALWAREPROTECTION_BEHAVIOR_DETECTED
MALWAREPROTECTION_RESTRICTION_ACTION_TAKEN
MALWAREPROTECTION_RESTRICTION_ACTION_TAKEN_FAILED
MALWAREPROTECTION_MALWARE_DETECTED
MALWAREPROTECTION_MALWARE_ACTION_TAKEN
MALWAREPROTECTION_MALWARE_ACTION_FAILED
MALWAREPROTECTION_MALWARE_ACTION_FAILED
MALWAREPROTECTION_SIGNATURE_UPDATED
MALWAREPROTECTION_SIGNATURE_UPDATE_FAILED
MALWAREPROTECTION_ENGINE_UPDATED
MALWAREPROTECTION_ENGINE_UPDATE_FAILED
MALWAREPROTECTION_SIGNATURE_REVERSION
MALWAREPROTECTION_SIGNATURE_FASTPATH_UPDATED
MALWAREPROTECTION_SIGNATURE_FASTPATH_DELETED
MALWAREPROTECTION_SIGNATURE_FASTPATH_UPDATE_FAILED
MALWAREPROTECTION_SIGNATURE_FASTPATH_DELETED_ALL
MALWAREPROTECTION_RTP_FEATURE_FAILURE
MALWAREPROTECTION_RTP_FEATURE_RECOVERED
MALWAREPROTECTION_RTP_ENABLED
MALWAREPROTECTION_RTP_DISABLED
MALWAREPROTECTION_RTP_FEATURE_CONFIGURED
MALWAREPROTECTION_CONFIG_CHANGED
MALWAREPROTECTION_ENGINE_FAILURE
MALWAREPROTECTION_ANTISPYWARE_ENABLED
MALWAREPROTECTION_ANTISPYWARE_DISABLED
MALWAREPROTECTION_ANTIVIRUS_ENABLED
MALWAREPROTECTION_ANTIVIRUS_DISABLED
MALWAREPROTECTION_ENABLED_WARNING_STATE
MALWAREPROTECTION_DISABLED_EXPIRED_STATE
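
The Azure Monitoring Guidance given earlier for event IDs 2004, 2012, 3002, 3007, and 5008 can be read as a small set of rules over the events logged by one role instance. The following Python sketch illustrates those rules under stated assumptions; it is not a monitoring solution. The function name `assess_instance` and the action strings are hypothetical, the caller is assumed to supply event IDs in the order they were logged, and collecting the events from the System log is out of scope.

```python
from collections import Counter

# Event IDs as listed in Appendix A.
SIGNATURE_REVERSION = 2004
FASTPATH_UPDATE_FAILED = 2012
RTP_FAILURE = 3002
RTP_RECOVERED = 3007
ENGINE_FAILURE = 5008

# "Three or more instances" per the guidance for event ID 2004.
REVERSION_THRESHOLD = 3


def assess_instance(event_ids):
    """Map one role instance's event IDs (oldest first) to suggested actions."""
    counts = Counter(event_ids)
    actions = []

    if counts[SIGNATURE_REVERSION] >= REVERSION_THRESHOLD:
        actions.append("reimage: repeated signature reversion (2004)")
    if counts[FASTPATH_UPDATE_FAILED] > 0:
        actions.append("reimage: dynamic signature delivery failed (2012)")

    # Track whether the most recent RTP failure was followed by a recovery.
    rtp_failed = False
    rtp_down = False
    for event_id in event_ids:
        if event_id == RTP_FAILURE:
            rtp_failed = rtp_down = True
        elif event_id == RTP_RECOVERED:
            rtp_down = False
    if rtp_down:
        actions.append("reimage: RTP failure without recovery (3002)")
    elif rtp_failed:
        actions.append("review: RTP failed and recovered (3002 then 3007)")

    if counts[ENGINE_FAILURE] > 0:
        actions.append("unhealthy: engine failure (5008), reboot or reimage")
    return actions
```

For example, `assess_instance([3002, 3007, 3002])` reports an RTP failure without recovery, because the last 3002 has no later 3007. The sketch follows the first option in the 3007 guidance (healthy if 3007 follows 3002); a stricter policy that reimages on every 3007 would replace the "review" branch with a reimage action.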