Troubleshooting AEM and DEM

Troubleshooting Agentless Exception Monitoring and Desktop Error
Monitoring features
1.1 Introduction
Agentless Exception Monitoring (AEM) in System Center Operations Manager and System Center
Desktop Error Monitoring (DEM) are identical features; the only difference is that AEM ships
with Operations Manager 2007 and DEM ships with the Microsoft Desktop Optimization Pack
(MDOP) SKUs. Both features rely on the Microsoft Error Reporting (formerly known as Dr. Watson)
or Windows Error Reporting client applications to report a crash or hang. By default, these client
applications forward error reports to the Microsoft Error Reporting Service. Using AEM or DEM,
they can be configured to forward error reports to the DEM or Operations Manager server
instead.
This document describes the steps needed to verify that Microsoft Error Reporting and
Windows Error Reporting are configured correctly. After you have run the ‘Configure Client
Monitoring’ configuration wizard, follow these steps to ensure that client application errors
are forwarded to the file share or the HTTP Listener created on the Management Server.
This configuration does not require applications to be specifically written to use Microsoft
Error Reporting or Windows Error Reporting. See section 1.3, “Watsonized Applications”,
for further details.
1.2 Configuring Client computers using Group policy
Verifying the Group Policy ADM file is imported correctly
After importing the Group Policy ADM file that the AEM configuration wizard provides, go to
Start, then Run, and type gpedit.msc. In the Group Policy Object Editor, under
Computer Configuration > Administrative Templates > Microsoft Applications, there should be a
node for System Center Operations Manager (SCOM). If that node is not present, the Group Policy
file was not imported correctly.
Under System Center Operations Manager (SCOM), select the SCOM Client Monitoring node. In the
settings view, double-click “Configure Error Reporting for Windows Operating Systems older
than Windows Vista”. When the properties dialogue opens, ensure on the Setting tab that the GP
setting is set to “Enabled” and that an appropriate corporate upload file path is provided, as
shown below.
If you are also collecting crashes from Windows Vista and Windows Server 2008 (Longhorn), open
the settings dialogue for “Configure Error Reporting for Windows Vista or later operating systems”
and ensure that the GP setting is set to “Enabled” and that the Error Listener is the name of the
Management Server where the data will be processed. Remember that the Management Server and
the file share can be on two different servers.
If the customer is going to use SSL and integrated authentication, then those two check boxes
need to be checked.
Note: If SSL is checked here, make sure that the Operations Manager server is also
configured to use SSL (using certificates). The same applies to Windows authentication.
In the Advanced Error Reporting settings node you can configure the following settings:
a) Application reporting settings (all or none): This setting controls whether or not errors in
general applications are included when error reporting is enabled. When this setting is enabled,
you choose whether all application errors or no application errors are reported by default. You
can further choose to report all application errors, errors only in Microsoft applications, or all
errors in Windows components. When the 'Report all errors in Microsoft applications' checkbox is
checked, all errors in Microsoft applications are reported, regardless of the setting specified
for other kinds of application.
b) Report operating system errors: This setting controls whether or not errors in the operating
system are included when error reporting is enabled.
c) Report on unplanned shutdown events: This setting controls whether or not unplanned shutdown
events can be reported when error reporting is enabled.
Verifying registry keys are configured correctly for Win2K3 and XP
Microsoft Error Reporting registry key settings follow a hierarchy. If you set a Microsoft
Error Reporting registry key to allow application crash reporting and a policy that disables it
is applied higher in the hierarchy, the setting at the top of the hierarchy takes precedence.
Another caveat: if you set the Microsoft Error Reporting registry keys to allow application
crash reporting and your system administrator sends down a policy update that overwrites those
keys, your settings will be overwritten.
Order of precedence:
1) GP Setting for User:
HKEY_CURRENT_USER\Software\Policies\Microsoft\PCHealth\ErrorReporting\DW
2) GP Setting for Machine:
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\PCHealth\ErrorReporting\DW
Select the Error Reporting node and verify that the value 1 is set on the following registry keys,
as shown in the image below.
Note: It is recommended *not* to turn off ShowUI during troubleshooting, because a visible
error dialogue makes problems easier to diagnose. To use ‘silent reporting’ (with no UI and
also NO queue), it is recommended that the user ‘opts in’ using the ‘Consent’ registry keys on
Vista and later operating systems, and the ‘DWAllowHeadLess’, ‘DWAlwaysReport’, ‘DoReport’ and
‘ForceQueueMode’ registry keys on XP and older operating systems.
Select the DW node and verify that the value 1 is set on the following registry keys, as shown in
the image below. The share to which the crash dumps are sent should also be listed correctly in
the registry, as shown below. The values of dwReporteeName and dwFileTreeRoot depend on the
customer environment and on how the file share has been configured.
3) Setting for User:
HKEY_CURRENT_USER\Software\Microsoft\PCHealth\ErrorReporting\DW
4) Setting for Machine:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PCHealth\ErrorReporting\DW
Select the Error Reporting node and verify that the value 1 is set on the following registry keys,
as shown in the image below.
The Operations Manager 2007 ADM file sets its policy settings in the HKLM hive. Therefore, if
there are GP settings in HKCU, they take precedence over the HKLM settings set by OpsMgr.
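The order of precedence above can be sketched as a small lookup. This is a minimal illustration, not part of AEM or DEM: the function name and the `snapshot` dictionary (filled in by hand or from a registry export) are assumptions made for the example, and nothing is read from the live registry.

```python
# Registry locations in the order of precedence listed above (highest first).
PRECEDENCE = [
    r"HKEY_CURRENT_USER\Software\Policies\Microsoft\PCHealth\ErrorReporting\DW",
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\PCHealth\ErrorReporting\DW",
    r"HKEY_CURRENT_USER\Software\Microsoft\PCHealth\ErrorReporting\DW",
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PCHealth\ErrorReporting\DW",
]


def effective_setting(name, snapshot):
    """Return (key_path, data) from the highest-precedence key that defines
    `name`, or None if no key defines it.

    `snapshot` is a hypothetical dict of the form
    {key_path: {value_name: data}}; it is an assumption of this sketch.
    """
    wanted = name.lower()  # registry value names are not case-sensitive
    for key_path in PRECEDENCE:
        for value_name, data in snapshot.get(key_path, {}).items():
            if value_name.lower() == wanted:
                return key_path, data
    return None
```

Calling effective_setting("DWAlwaysReport", snapshot) shows which key "wins" when the same value is set in more than one place, which is usually the first thing to check when a client ignores the OpsMgr policy.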
Verifying if registry keys are set correctly on Vista and Win2K8
If you use a proxy to access the internet in your corporate environment, make sure proxy
resolution is turned off for local addresses. You can reach this setting from Control Panel >
Internet Options > Connections > LAN settings; check the “Bypass proxy server for local
addresses” checkbox. About 90 percent of the time this is the reason data is not sent to the
server.
In Vista and Windows Server 2008 (Longhorn), the Windows Error Reporting registry key is
different from the one used for Windows Server 2003 and XP, but the HKEY hierarchy applies on
Vista and Windows Server 2008 machines as well.
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\Windows Error Reporting
The complete list of Vista and Windows Server 2008 registry keys can be found here:
http://msdn.microsoft.com/en-us/library/bb513638(VS.85).aspx
To enable silent reporting in Vista, set the registry keys described in the above article as well
as an appropriate ‘DefaultConsent’ under the ‘Consent’ subkey. To hide the UI in Vista, use
‘DontShowUI’.
Vista and Windows Server 2008 do not create or update crash.log, so do not look for the names of
these machines in crash.log; they will not be listed.
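As a rough aid for the silent-reporting check above, the sketch below inspects values that you have already read from the Windows Error Reporting policy key and its ‘Consent’ subkey. The function name and the two input dictionaries are assumptions made for this example. It only checks that ‘DontShowUI’ is 1 and that ‘DefaultConsent’ is present; which ‘DefaultConsent’ value is appropriate should be taken from the MSDN article referenced above.

```python
def silent_reporting_issues(wer_values, consent_values):
    """Return a list of problems that would prevent silent reporting.

    wer_values     : hypothetical dict of values read from the
                     'Windows Error Reporting' policy key
    consent_values : hypothetical dict of values read from its
                     'Consent' subkey
    """
    issues = []
    if wer_values.get("DontShowUI") != 1:
        issues.append("DontShowUI is not 1, so the error UI may still be shown")
    if "DefaultConsent" not in consent_values:
        issues.append("DefaultConsent is not set under the Consent subkey")
    return issues
```

An empty list means both values named in this section are in place; it does not prove that reports reach the server.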
To see whether a crash is being sent to the Management Server from Vista and later operating
systems, download and run a tool called Fiddler, which shows whether the crash report reached
the server. The tool can be downloaded from http://fiddlertool.com/
After the tool is installed and started, simulate a dummy crash to see whether the data is
reaching the server and whether the server is responding. An HTTP response code of 200 in
Fiddler indicates that the packet was transferred to the Management Server successfully.
Enabling Microsoft Error Reporting logging
To troubleshoot Microsoft Error Reporting configuration issues, the first thing to do is to
enable client-side logging by setting the registry value DWVerboseLog to 1. On most client
machines this value does not exist and needs to be created; if it already exists, set it to 1.
HKEY_CURRENT_USER\Software\Microsoft\PCHealth\ErrorReporting\DW\DWVerboseLog=1
Note: This creates a client-side log file ‘dw.log’ under %temp%. Again, this is applicable only
to XP or older clients.
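One way to roll this value out to several test clients is to generate a small .reg file and import it on each machine. The sketch below builds the text of such a file in the standard "Windows Registry Editor Version 5.00" format; the helper name is an assumption made for this example and it only handles DWORD values.

```python
def reg_file_text(key_path, dword_values):
    """Build the text of a .reg file that sets DWORD values under key_path.

    dword_values maps value names to non-negative integers.
    """
    lines = ["Windows Registry Editor Version 5.00", "", "[%s]" % key_path]
    for name, number in dword_values.items():
        # .reg files store DWORDs as eight hexadecimal digits
        lines.append('"%s"=dword:%08x' % (name, number))
    return "\n".join(lines) + "\n"


DW_KEY = r"HKEY_CURRENT_USER\Software\Microsoft\PCHealth\ErrorReporting\DW"
VERBOSE_LOG_REG = reg_file_text(DW_KEY, {"DWVerboseLog": 1})
```

Write VERBOSE_LOG_REG to a file with a .reg extension (saving it as UTF-16 is the usual choice for version 5.00 files) and review it before importing, since the key is per-user and applies only to the account that imports it.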
Logging
Diagnostic logging is much richer in DW 2.0, and is available in ship builds. When you are
getting started with integration, the log can save you a great deal of time. For example, the log
records the stage1 and stage2 URLs, requests from the server, the iBucket and the full response
URL.
The log is written to %TEMP%\dw.log. Each new DW event is appended to the log.
Logging is on by default in debug builds. In ship builds, logging can be enabled with this registry
setting:
Key:
HKEY_CURRENT_USER\Software\Microsoft\PCHealth\ErrorReporting\DW
Value: DWVerboseLog=1
Return Codes
DW 2.0 returns the following codes:
0   Success. DW exited normally. This includes the user clicking Don’t Send or Cancel, or the
    server not requesting a CAB.
1   Failure. Any failure, including being unable to connect to the server or an invalid manifest
    file.
16  The user clicked the Debug button in manifest mode.
(In manifest mode the Debug button is shown if the DwuManifestDebug flag is set. In exception mode
the Debug button is shown if msoctdsOffer includes msoctdsDebug, or if there is a debugger
registered in the AeDebug key and fDweIgnoreAeDebug is not set. If the user clicks Debug in
exception mode, DW sets msoctdsDebug in the msoctdsResult field in the shared memory block and
returns 1.)
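If you script around DW and capture its return code, a small lookup like the one below keeps the meanings from the list above next to the number. The names are assumptions made for this sketch; how the return code is obtained depends on how DW is launched and is not covered here.

```python
# Meanings taken from the return-code list in this document.
DW_RETURN_CODES = {
    0: "Success: DW exited normally (includes Don't Send, Cancel, or no CAB requested)",
    1: "Failure: any failure, such as no server connection or an invalid manifest file",
    16: "User clicked the Debug button in manifest mode",
}


def describe_dw_return_code(code):
    """Translate a DW 2.0 return code into the description listed above."""
    return DW_RETURN_CODES.get(
        code, "Unknown return code %d (not listed in this document)" % code
    )
```

Note that 1 is ambiguous on its own: it is also returned when the user clicks Debug in exception mode, so check the msoctdsResult field in that case.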
Simulate an application crash
Simulate an application crash using the tool found at the link below. If you have another tool
that can simulate a crash, you can use that instead.
http://www.microsoft.com/downloads/details.aspx?familyid=DB979EB5-B423-4B14-8664-B16DC8157E8D&displaylang=en
Note: This tool depends on .NET Framework 2.0. Make sure you download and install it from
Microsoft before trying this tool.
After the crash, a DW.log file should have been created in the user temp directory (%TEMP%).
On the machine where the crash file share is located, check whether a Crash.log file has been
created. If there is no Crash.log file, none of the application crashes are reaching the file
share. If the Crash.log file exists, check whether the machine on which you simulated the crash
is listed in the log file.
Note: The log file contains an entry for every crash *only* for crashes coming from XP and
older clients, and only IF the DWTracking key is set/enabled for those clients.
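Checking Crash.log by eye becomes tedious once many clients report to the same share. The sketch below simply returns every line that mentions a given machine name. It makes no assumption about the layout of Crash.log beyond the machine name appearing somewhere in the line, and the function name is made up for this example.

```python
def lines_mentioning_machine(lines, machine_name):
    """Return the log lines that contain machine_name (case-insensitive).

    `lines` is any iterable of strings, for example an open Crash.log file.
    """
    needle = machine_name.lower()
    return [line.rstrip("\r\n") for line in lines if needle in line.lower()]
```

Open Crash.log from the file share, pass the file object and the client's computer name to the function, and treat an empty result as "no crash from this client was logged". Remember the note above: Vista and later clients, and XP clients without DWTracking, never appear in this log.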
Debuggers Enabled
If a debugger is enabled on a client machine, the debugger will catch the exception and not let
it reach Microsoft Error Reporting. You can verify whether a debugger is enabled by checking the
following registry value:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug\Auto
The value should be 0. If the value is 1, the debugger will catch the exception and Microsoft
Error Reporting will not be able to get to it. Any native application that is not “Watsonized”
falls back on the operating system: if Microsoft Error Reporting is the default registered
debugger, crash data is sent, but if the default debugger is some other application, crash data
is not sent. You can see whether Microsoft Error Reporting is the default debugger by looking at
the Debugger value under the same key, as shown in the above dialogue, and checking that the data
entry matches what is shown above.
To register Microsoft Error Reporting as the debugger, open a command prompt and run the
command drwtsn32 -i
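The two AeDebug checks described above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are made up, the rule "Auto should be 0" is taken from this document, and recognising the registered debugger by the substring "drwtsn32" is inferred from the registration command above. The reader function uses Python's standard winreg module and therefore runs on Windows only.

```python
AEDEBUG_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug"


def aedebug_issues(auto, debugger):
    """Return a list of problems with the AeDebug values, per this document."""
    issues = []
    if str(auto).strip() != "0":
        issues.append("Auto is %r; this document expects 0" % (auto,))
    # Assumption: the expected debugger command line contains 'drwtsn32'.
    if "drwtsn32" not in (debugger or "").lower():
        issues.append("Debugger does not reference drwtsn32: %r" % (debugger,))
    return issues


def read_aedebug():
    """Windows only: read Auto and Debugger from HKLM (None if a value is missing)."""
    import winreg  # standard library, available only on Windows

    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, AEDEBUG_KEY) as key:
        for name in ("Auto", "Debugger"):
            try:
                result[name] = winreg.QueryValueEx(key, name)[0]
            except FileNotFoundError:
                result[name] = None
    return result
```

On a client, values = read_aedebug() followed by aedebug_issues(values["Auto"], values["Debugger"]) gives a quick yes/no answer before you start simulating crashes.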
1.3 “Watsonized Applications”
In general, managed applications built on .NET Framework 2.0 will automatically use Microsoft
Error Reporting without requiring the application to be Microsoft Error Reporting aware. When a
.NET Framework application throws an unhandled exception, the Framework runtime redirects the
exception to Microsoft Error Reporting or Windows Error Reporting to create the error report.
However, crashes in applications built on .NET Framework 1.1 and below will not be picked up.
If an application is built on native code (Win32) then, starting with Windows Server 2003 and
Windows XP, Microsoft Error Reporting will pick up the crash because Microsoft Error Reporting is
present on those machines. In Vista and Windows Server 2008, Windows Error Reporting (WER) is
part of the operating system.
Some applications are instrumented to customize Microsoft Error Reporting and Windows Error
Reporting crash reporting behavior. Such applications call directly into the error reporting
client application and may change the handling of a crash, hang or exception. Such “Watsonized”
applications may not follow the configuration settings as recommended. You will have to contact
the application vendor to understand the steps to capture error reports generated by these
applications.
1.4 References
Office Microsoft Error Reporting configuration with AEM/DEM:
http://office.microsoft.com/en-us/ork2003/HA011402421033.aspx
Windows Error Reporting configuration:
http://msdn.microsoft.com/en-us/library/bb513638(VS.85).aspx
Dr. Watson (Microsoft Error Reporting) 2.0 Reference:
http://msdn.microsoft.com/en-us/library/bb219076.aspx
System Center Desktop Error Monitoring:
<add link>
System Center Operations Manager:
<add link>