Advanced Troubleshooting Strategies For Microsoft Exchange Server 2007
Scott Schnoll
Principal Technical Writer
Exchange Server Product Group
Microsoft Corporation

Agenda
• Troubleshooting Methodology
• Exchange Troubleshooting Tools
• Diagnostic Logging in Exchange
• Area-specific Troubleshooting
  – Setup
  – Performance
  – Transport

Troubleshooting Methodology
• Knowledge
  – How components work
  – How components interact
  – How components depend on other elements
• Monitoring
  – Start with a baseline
  – Without one, you have no comparisons
  – With one, you can spot problems
• Tools
  – Built-in tools
  – Operating system tools
  – Advanced tools
• Notification, corrective action, trend analysis

Exchange Troubleshooting Tools
• Troubleshooting
  – Best Practices Analyzer
  – Database Troubleshooter
  – Mail Flow Troubleshooter
  – Performance Troubleshooter
• Monitoring
  – Message Tracking
  – Queue Viewer
  – Routing Log Viewer
  – Performance Monitor

Exchange Troubleshooting Tools
• Client Access Cmdlets
  – Test-MAPIConnectivity
  – Test-ActiveSyncConnectivity
  – Test-IMAPConnectivity
  – Test-POPConnectivity
  – Test-OWAConnectivity
  – Test-UMConnectivity
  – Test-WebServicesConnectivity
  – Test-OutlookWebServices
• General Cmdlets
  – Test-SystemHealth
  – Test-ServiceHealth

Exchange Troubleshooting Tools
• Transport Cmdlets
  – Test-MailFlow
  – Test-SenderID
  – Test-IPBlockListProvider
  – Test-IPAllowListProvider
  – Test-EdgeSynchronization
• CI and CR Cmdlets
  – Test-ExchangeSearch
  – Test-ReplicationHealth

Diagnostic Logging In Exchange
• Exchange logging is quite extensive
  – Starts with Setup
  – Continues through life of Exchange server
• Transport Logs
  – Message Tracking Logs
  – Protocol Logs (SMTP)
  – Agent Logs
  – Connectivity Logs
  – Routing Logs
  – Pipeline Tracing Logs

Diagnostic Logging In Exchange
• Mailbox Logs
  – Messaging Records Management Logs
  – Cluster Logs
• Client Access Logs
  – Protocol Logs (POP3, IMAP4)
  – IIS Logs
• General Logs
  – Event Logs
  – Certificate Logs

Diagnostic Logging In Exchange
  Get-EventLogLevel <Process>
  Set-EventLogLevel <Process> -Level <Level>

Logging Level: Description
• Lowest: Only critical events, error events, and events with a logging level of zero are logged; default level for all processes except MSExchange ADAccess\Topology and MSExchange ADAccess\Validation
• Low: Events with a logging level of 1 or lower are logged; default level for MSExchange ADAccess\Topology and MSExchange ADAccess\Validation
• Medium: Events with a logging level of 3 or lower are logged
• High: Events with a logging level of 5 or lower are logged
• Expert: Events with a logging level of 7 or lower are logged

Diagnostic Logging In Exchange
• Best Practices
  – Be aware of impact to monitoring/event log collection agents
  – Set EventLogLevel back to original level when finished troubleshooting
• Using wildcards
  – Asterisks are only for EventSource part of syntax
  – Get-EventLogLevel MSExchangeIS\9000*\*
  – Get-EventLogLevel "MSExchangeIS\9000 Private\*"
• Research events at Errors and Events Message Center

Troubleshooting Exchange Setup
• Use Setup logs to troubleshoot errors that occur during setup or block installation

  Get-SetupLog.ps1 C:\ExchangeSetupLogs\ExchangeSetup.log -error -tree

  Get-SetupLog -tree:$false -error:$false | Where { $_.status -eq "Error" } | select datetime, depth, description, status | Out-HTML | Out-IE

Log name and path: Description
• <system drive>\ExchangeSetupLogs\ExchangeSetup.log: Tracks progress of every task performed during Setup; contains details on pre-req checks, installation progress, and config changes made by Setup
• <system drive>\ExchangeSetupLogs\ExchangeSetup.msilog: Windows Installer log file that contains details on extraction of Exchange code from installer file (ExchangeServer.msi)

Troubleshooting Exchange Setup
• ExchangeSetup.log is most relevant/useful when troubleshooting
• Several documented resolutions for Setup failures at http://technet.microsoft.com/en-us/library/bb232206(EXCHG.80).aspx
• Task levels denoted by [X]
  – [0]: Begin main run of a particular task
  – [1]: High level run of a specific task
  – [2]: Subset of a particular task

Troubleshooting Exchange Setup
  [1/27/2008 3:46:26 PM] [0] **********************************************
  [1/27/2008 3:46:26 PM] [0] Starting Microsoft Exchange 2007 Setup
  [1/27/2008 3:46:26 PM] [0] **********************************************
  ...
  [1/27/2008 4:11:12 PM] [0] End of Setup
  [1/27/2008 4:11:12 PM] [0] **********************************************
  [1/27/2008 4:11:57 PM] [1] Executing '$RoleTargetVersion = "8.1.240.06"', handleError = False
  [1/27/2008 4:11:57 PM] [2] Launching sub-task '$error.Clear(); $RoleFqdnOrName = "exmbx1.contoso.com"'.
  [1/27/2008 3:52:31 PM] [0] ExSetupUI was started with the following command: '-mode:install -sourcedir:D:\amd64 /FromSetup'.

Troubleshooting Exchange Setup
• In which phase of Setup did failure occur?
  – Bootstrap phase displays canopener and pre-req links for .NET Framework 2.0, Microsoft Management Console 3.0, and Windows PowerShell 1.0
  – File copy phase copies core install files to %TEMP%\ExchangeServerSetup and sets Best Practices Analyzer XML file into culture-specific folder (e.g., ‘EN’ for English)
  – Setup wizard phase walks admin through GUI-based setup (license agreement, error reporting, paths, type, roles, etc.)

Troubleshooting Exchange Setup
• In which phase of Setup did failure occur?
  – Readiness check phase (Test-SetupHealth) uses Best Practices Analyzer engine and XML rules file to verify system and organizational readiness for selected install type
  – Installation phase deletes temporary files and proceeds with Org and domain prep (if not already done) and installation and configuration of specified role(s)

Troubleshooting Exchange Setup
• Recovering from Failed Setup
  – Setup creates ‘Watermark’ entry in registry to resume at point of failure
      HKLM\Software\Microsoft\Exchange\v8.0\<Role>\
  – The value for Watermark can be mapped to an install task in a *.PS1 file in <SystemDrive>\ExchangeSetupLogs
  – If a Watermark is present, note for which role, then run the following to resume and complete installation:
      Setup.com /roles:<RoleWithWatermark>

Troubleshooting Exchange 2007 Performance
• Significant changes in architecture change the ways in which you troubleshoot and what you troubleshoot
• Scoping
  – How many servers affected?
  – Which servers are affected?
  – What are the current queue states?
  – Are queues growing?
  – Are performance counters spiking?
  – Are external dependencies healthy?
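
The scoping question "Are queues growing?" is easier to answer from several queue-length readings than from a single snapshot. The following is a minimal sketch, not part of the original deck: it assumes you have already collected a handful of readings at a fixed interval (for example, from Queue Viewer), and the function names, the sample values, and the `min_slope` cutoff are all hypothetical choices made for illustration.

```python
from statistics import mean


def queue_trend(samples):
    """Least-squares slope of queue length, in messages per sample interval.

    `samples` is a list of queue lengths taken at a fixed interval
    (hypothetical input; how the readings are collected is up to you).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def is_growing(samples, min_slope=1.0):
    """True when the fitted trend rises by at least `min_slope` per interval.

    `min_slope` is an assumed cutoff, not an Exchange-documented threshold;
    tune it against your own baseline.
    """
    return queue_trend(samples) >= min_slope


if __name__ == "__main__":
    steady = [12, 9, 13, 10, 11]     # fluctuates around a baseline
    backing_up = [10, 14, 19, 25, 40]  # climbs with every reading
    print("steady queue growing:", is_growing(steady))
    print("backing-up queue growing:", is_growing(backing_up))
```

Fitting a slope rather than comparing the first and last reading keeps one noisy sample from deciding the outcome, which matters when queues naturally fluctuate around a baseline.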

Troubleshooting Exchange 2007 Performance
• Consider the performance impact of
  – Antivirus (file system and Exchange-based)
  – Backup applications
  – Archiving and compliance, including MRM
  – Monitoring agents and tools
  – Desktop tools that integrate with Outlook

Troubleshooting Exchange 2007 Performance
• Isolate cause of resource issues using
  – Windows Task Manager
  – Performance Monitor
  – Process Monitor
  – Network Monitor
  – Exchange Profile Analyzer
  – Event Viewer
  – Performance Troubleshooting Analyzer
• Watch out for renamed objects in SP1
  – Exchange ‘Database’ object renamed to ‘MSExchange Database’

Troubleshooting Exchange 2007 Performance
• Check for counter values over thresholds
  – Processor\% Processor Time (_Total)
      Description: Percentage of time the processor is running non-idle threads
      Threshold: 90% (peak), 75% (ongoing)
  – System\Processor Queue Length
      Description: Number of threads in processor queue
      Threshold: 2
  – Network Interface\Bytes Total/sec
      Description: Rate at which network adapter is processing data bytes
      Threshold: 6-7 MB/sec (100 Mbps), 60-70 MB/sec (1000 Mbps)
  – Network Interface\Packets Outbound Errors
      Description: Number of outbound packets that could not be transmitted due to errors
      Threshold: 0
  – LogicalDisk\Avg. Disk sec/Read (log disks)
      Description: Average time of a read of data from disk
      Threshold: 50 ms (logs, peak), 20 ms (logs, ongoing)
  – LogicalDisk\Avg. Disk sec/Read (database disks)
      Description: Average time of a read of data from disk
      Threshold: 50 ms (database, peak), 20 ms (database, ongoing)
  – LogicalDisk\Avg. Disk sec/Write (log disks)
      Description: Average time of a write of data to disk
      Threshold: 50 ms (logs, peak), 10 ms (logs, ongoing)
  – LogicalDisk\Avg. Disk sec/Read and LogicalDisk\Avg. Disk sec/Write
      Description: Average time of reads/writes on disk
      Threshold: 10 ms (TEMP/TMP, Pagefile disk, SMTP queue disk)

Troubleshooting Exchange 2007 Performance
• Check for counter values over thresholds
  – MSExchangeIS\RPC Averaged Latency
      Description: RPC latency averaged for last 1024 packets
      Threshold: 25 ms
  – MSExchangeIS\RPC Requests
      Description: Number of client requests being processed by IS
      Threshold: 30
  – MSExchange ADAccess Domain Controllers\Long running LDAP operations/Min
      Description: Number of LDAP operations on DC that took longer than 15 seconds/Min
      Threshold: 50
  – MSExchange Database\Version buckets allocated (Information Store instance)
      Description: Number of version buckets (16K chunks of version store) allocated
      Threshold: 1,800
  – MSExchangeTransport Queues\Largest Delivery Queue Length
      Description: Number of messages in largest delivery queue
      Threshold: 200
  – MSExchange Database ==> Instances\Log Bytes Write/sec
      Description: Rate at which bytes are written to log
      Threshold: 512,000
  – .NET CLR Memory\% Time in GC
      Description: Percentage of elapsed time spent in garbage collection since last garbage collection cycle
      Threshold: 10%

Troubleshooting Exchange 2007 Performance
• Performance Analysis of Logs (PAL)
  – http://www.codeplex.com/pal
• Generates HTML reports from performance monitor counter log file (.blg file)
• Uses XML configuration files to analyze the most important counters for Exchange performance issues and raises alerts when thresholds are exceeded for those counters

Troubleshooting Exchange 2007 Performance
• Windows Server 2008 (and Vista) include new TCP auto-tuning features
• Not all network devices (routers, switches, firewalls, etc.)
  support these features, and some can actually make things much slower
  – Cisco PIX 500 Series Firewall, Cisco PIX 10000 Firewall, Cisco PIX Classic Firewall, Cisco IOS Firewall, Sonicwall Firewall, Check Point Firewall, some NG R55 routers, some Netgear routers
• Disable auto-tuning on Windows Server 2008/Vista:
  – netsh interface tcp set global autotuninglevel=disabled

Troubleshooting Exchange 2007 Transport

Troubleshooting Tools
• ExTRA: Exchange Troubleshooting Assistant
  – Internal/External DSN received
  – Issues with Queue (size, status)
• Message Tracking
  – Lost Messages
• Routing Log Viewer (SP1)
  – Routing and Topology issues
• Advanced: ETW Tracing, Pipeline Tracing
  – Typically as part of a CSS escalation

ExTRA Basics
• A “sibling” tool to the Microsoft Exchange Server Best Practices Analyzer (ExBPA)
• Union of troubleshooting tools and other related functionality
  – ExPTA: Exchange Performance Troubleshooting Analyzer
  – ExDRA: Exchange Disaster Recovery Analyzer
  – ExMFA: Exchange Mail Flow Analyzer

ExTRA Prerequisites
• ExTRA 1.1 (Downlevel version)
  – .NET Framework version 1.1
  – IIS Common Files (to allow remote metabase access)
• ExTRA 2007 (in Toolbox)
  – Installed with Exchange Management Tools
  – IIS Common Files
  – Fix for SmtpClient issue in .NET 2.0 SP1
• For both versions
  – Need sufficient credentials to gather data from both Active Directory and Exchange servers

Symptom-Based Analysis
Choose the right symptom
• NDR
  – Choose this when you see: User gets an NDR; DSN code is known
  – Troubleshooting includes: DSN-based analysis; DNS check; message tracking for specific DSN
• Inbound
  – Choose this when you see: Messages not arriving from the Internet; intra-org messages not arriving
• Outbound
  – Choose this when you see: Messages not going out to the Internet
• Inbound and Outbound
  – Troubleshooting includes: Network test (DNS, firewall); SMTP configuration; sending test mail; search message and track
• Queue
  – Choose this when you see: Messages are stuck in one of the queues on a server
  – Troubleshooting includes: Analysis based on the type of queues (remote delivery, directory lookup, local delivery)
• Mailbox Submission
  – Choose this when you see: Messages not going from Mailbox to Hub Transport
  – Troubleshooting includes: MAPI connectivity check; Hub Transport health check
• EdgeSync
  – Choose this when you see: Edge Subscription not working
  – Troubleshooting includes: Configuration check; network test (DNS, firewall); Active Directory Application Mode (ADAM) checks

Root Cause Analysis
• Choice of correct symptom is critical to success
• High-level symptom validation is performed in first step of analysis
• Server operating state and configuration are collected; additional steps are executed when variance from known good condition is found
• Branching to new steps continues until root cause is identified
• Not all root causes are currently identified, but the most common ones are covered
• Web updates for ExTRA will fill gaps over time

Message Tracking
• Message Tracking tool in the Exchange Management Console Toolbox
• Based on ExTRA
• Constructs cmdlet filters used by Get-MessageTrackingLog
• Basic server-to-server tracking
• PowerShell scripts can relate events together to track messages end-to-end

Message Tracking Log
• Enabled by default
• Default values
  – MessageTrackingLogEnabled: True
  – MessageTrackingLogMaxAge: 30 days
  – MessageTrackingLogMaxDirectorySize: 250 MB
  – MessageTrackingLogMaxFileSize: 10 MB
  – MessageTrackingLogSubjectLoggingEnabled: True
• EventID describes tracking event action
  – BADMAIL, DEFER, DELIVER, DSN, EXPAND, FAIL, POISONMESSAGE, RECEIVE, REDIRECT, RESOLVE, SEND, SUBMIT, TRANSFER
• Source describes component involved
  – ADMIN, AGENT, DSN, GATEWAY, PICKUP, ROUTING, SMTP, STOREDRIVER

Message Tracking Log: cmdlet
• Get-MessageTrackingLog
  – EventID (Receive, Send, Deliver, Fail, etc.)
      Get-MessageTrackingLog -EventID fail -Server exht1
  – Time Range (start, end)
      Get-MessageTrackingLog -Start "03/01/2008 09:00 AM" -End "03/01/2008 09:30 AM"
  – Sender Address
      Get-MessageTrackingLog -Sender "nino@hypervlabs.com"
  – MSExchangeTransportLogSearch service on server performs search and server-side filtering
  – FAIL event for every NDR the server generates
  – RecipientStatus field displays reason FAIL
    occurred

Message Tracking Log: Event
  Timestamp               : 3/16/2008 2:50:03 PM
  ClientIp                :
  ClientHostname          : exht1
  ServerIp                :
  ServerHostname          : exmbx1
  SourceContext           :
  ConnectorId             :
  Source                  : STOREDRIVER
  EventId                 : DELIVER
  InternalMessageId       : 36308614
  MessageId               : <2A9FABB3664AF8459CBADA1CE4E4024617A9F2A76F@exht1.hypervlabs.com>
  Recipients              : {smes@hypervlabs.com}
  RecipientStatus         : {}
  TotalBytes              : 15682
  RecipientCount          : 1
  RelatedRecipientAddress :
  Reference               :
  MessageSubject          : Troubleshooting Decks
  Sender                  : scotts@hypervlabs.com
  ReturnPath              : scotts@hypervlabs.com
  MessageInfo             : 3/16/2008 2:51:59 PM

Routing Log Viewer
• Introduced in Service Pack 1
• Equivalent to WinRoute
• Displays routing table
• Provides comparison of topology at two points in time, identifies differences
• Useful in determining transport topology
  – Route to remote Active Directory Site
  – Route to connector with external address space

Routing Log Viewer: Backoff Path

Routing Log Viewer: Comparing Logs

Event Tracing For Windows (ETW)
• ExTRA “Trace Control” enables ETW traces
  – Start…Run…Extra.exe
  – Select a task
  – Select “Trace Control”
• Trace components useful in diagnosing transport issues
  – Transport
  – StoreDriver
  – AD Driver
  – Data.Storage
• Common scenarios defined that enable correct components/tags
• Filtering reduces the number of events logged in trace session, but must know sender or recipient before reproduction of issue

ETW: Configure Trace File

ETW: Types, Components, Tags

ETW: Set Tags Manually (optional)

Pipeline Tracing
• Used to capture copies of messages before/after agent execution
• Configuration (both parameters mandatory)
  – PipelineTracingPath: <path>
  – PipelineTracingSenderAddress: SMTP address
• Enable Pipeline Tracing
    Set-TransportServer <Server> -PipelineTracingEnabled:$true
• Warning: one or more copies of every message matching PipelineTracingSenderAddress will be saved in PipelineTracingPath
• Entire message content logged to disk, so set appropriate
  ACL on folder specified in PipelineTracingPath

Pipeline Tracing: Example
• Enable:
    Set-TransportServer EXHUB1 -PipelineTracingEnabled:$True -PipelineTracingPath:C:\Trace -PipelineTracingSenderAddress:scott@contoso.com
• Monitor Trace Folder:
  – \MessageSnapshots\<GUID>
      Contains original message, plus pipeline tracing for routing and SMTP receive
  – \RulesTracking
• Disable:
    Set-TransportServer EXHUB1 -PipelineTracingEnabled:$False

Pipeline Tracing: Directory

Key Takeaways
• Knowledge of how components interact and depend on one another is critical to success of troubleshooting
• Exchange Server 2007 includes built-in instrumentation that provides rich diagnostic information for troubleshooting purposes
• A variety of tools from Windows Server and Exchange Server can provide workflow steps around the troubleshooting process

Resources
• Troubleshooting OWA 2007 Publishing Rules on ISA Server 2006
• Troubleshooting Outlook RPC dialog boxes
• Configuration tips and common troubleshooting steps for multiple forest deployment of Autodiscover service

Want To Be An Expert?
• Get in-depth and up-to-date technical resources from TechNet
  – Leverage the variety of Webcasts and Virtual Labs available
  – Be part of the Exchange Product Dialogue
  – Join the Exchange Community
• http://technet.microsoft.com/exchange/

Track Resources
• Exchange Team Blog (You Had Me at EHLO): http://msexchangeteam.com
• Exchange Server TechCenter: http://technet.microsoft.com/exchange
• Exchange Newsgroups: microsoft.public.exchange*
• Exchange Forums: http://forums.microsoft.com/TechNet/default.aspx?ForumGroupID=235&SiteID=17

© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation.
Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.