Microsoft Exchange Server Best Practices Analyzer Tool
Paul Bowden
Program Manager
Exchange Server Development
Microsoft Corporation

What is it?
• The Exchange Server Best Practices Analyzer 'encodes' the top product support issues into a tool which can be run against a live deployment.
  – Step-by-step documentation tells you how to resolve each problem
• The tool can be run as part of a proactive 'health check' which can expose availability or scalability problems. Additionally, the tool can be run as a reactive troubleshooting step for problem diagnosis and identification.
  – The tool will report issues currently causing problems within the topology, and discrepancies which may cause future outages.
• The tool can be used to actively document the design and configuration of the Exchange topology. This data can be used to track the history of a deployment, or provide a ‘quick-start’ to administrators and product support staff who need to analyze the history and configuration of an unfamiliar deployment.
Why we developed it
• Administrators are finding it difficult to keep up with the documentation that we produce
  – Urgency
  – Relevance
• Customers find it difficult to keep track of whether they are conforming to all the best practices
• Exchange has many options and finding root cause for a problem can be a long process
  – ~60% of Exchange problems are mis-configurations
• We have many tools for collecting information, but not many provide auto-analysis

Design Principles
• Concentrate on Performance, Scalability and Availability of Exchange Servers
  – ExBPA does not check security configuration
• Make it easy to run
  – No complex configuration settings
  – Auto-detect everything
  – Allow multiple credentials to be entered
  – No server-side components to install
  – No impact on Exchange performance, even at peak periods
• Don’t leave me hanging
  – Every Error | Warning | NonDefault rule has a specific article which tells you more about the problem and how we detected it
• Keep it up-to-date
  – Provide best practice updates every month
  – Make the tool auto-download the updates
• Work in all environments
  – From single server SBS implementations through to the largest enterprise
  – Make the tool work seamlessly in both open and closed networks

Similar Tools
• MBSA – Microsoft Baseline Security Analyzer
• SQLBPA – Microsoft SQL Server Best Practices Analyzer
• The ExBPA engine has now been mandated as part of the WSS 2006 Common Engineering Criteria
  – BPAs for other Microsoft products are forthcoming

Architecture
• One tool runs against all versions of Exchange
  – No support for pure Exchange 5.5 topologies
• You generally install the tool on a Windows XP workstation, and it remotely collects the data
  – Don’t need to install any components on the server
• ExBPA is written in managed code (C#)
• Input/output data model is XML based
• Analysis engine is based on XPath

Where do we look?
• We look for data in…
  – Active Directory
  – DNS
  – WMI
  – Registry
  – Metabase
  – Performance Monitor
  – Files on disk
  – TCP/IP ports
• First pass of execution – collection
  – ExBPA collects the data and places it in the same namespace
• Second pass of execution – analysis
  – Individual settings are analyzed against the defined rules. Cross-checking between data sources is possible as the data is in the same hierarchy

How it works
[Diagram. Components shown: Active Directory, Exchange Servers, ExBPA Dispatcher, collectors, Output Data, ExBPA Analyzer, XML Rules, ExBPA Interface, XML Import, XML Export]

Demonstration…

What does ExBPA check today?
The following is not an exhaustive list of the checks that the tool performs, but it should give you a general idea!

Exchange Roles
• ExBPA detects and understands the difference between…
  – Small mailbox servers
  – Large mailbox servers
  – Clustered Exchange servers
  – Front-end servers
  – Bridgehead servers
• Rules are conditioned for their roles (e.g. circular logging needs to be disabled on mailbox servers, but should be enabled on bridgehead servers)

Rule Types
• Error
  – We found something that is causing, or will cause, a problem
  – Example: No maximum message size set for the organization
• Warning
  – We found something that looks suspicious
  – Example: An ADC connection agreement is scheduled to ‘Never’
• NonDefault
  – We found a setting which has been changed
  – Example: One of the many store parameters has been tuned/tweaked
• Time
  – We found something that was changed during the past 5 days
  – Example: The cost on an SMTP connector was changed
• BestPractice
  – We found that a best practice is not being followed
  – Example: Dr. Watson crashes are not being uploaded to Microsoft for analysis
• Info
  – We found something of interest
  – Example: Your server has 8 processors installed

Active Directory
• Forest-wide
  – Forest functionality level
  – Exchange schema extensions
  – Default policy changes
• Per-domain
  – Domain functionality level
  – Domains which have been renamed
  – Check availability of FSMO servers
  – EDS/EES group renamed/deleted/moved
  – MESO container renamed/deleted/moved

Active Directory Connector
• ADC Server
  – Server is overloaded
  – Server is idle (i.e. no connection agreements)
  – There’s a newer version of the ADC available
  – Server is running the latest OS Service Pack
• Connection Agreements
  – Orphaned agreements
  – Schedule set to never
  – Nominated server is missing
  – One-way agreements
  – Out-of-date agreements

Exchange Organization
• Check
  – Global message size limits are enforced
  – Stray Exchange objects in LostAndFound container
  – More than 10 administrators defined
  – ForestPrep version
  – Mixed/native mode
  – OMA/EAS options
  – UCE thresholds
  – Recipient Update Service definitions
  – Address List and OAB definitions

Admin Groups
• Check
  – Validity of legacyExchangeDN
  – Policy containers intact
• Routing Groups
  – Check for valid routing master
  – Enumerate all connectors
  – Check for connectors that have recently changed

Exchange Server object
• Check
  – Validity of server name
  – FQDN/NetBIOS name resolution
  – Latest Exchange Service Pack / Roll-up
  – Time synchronization with the Active Directory

Cluster Configuration
• Checks both Active and Passive nodes
• Cluster-specific checks
  – Number of nodes in the cluster
  – Configuration discrepancies between nodes
  – Cluster account TEMP/TMP path
  – Quorum configuration
  – Heartbeat configuration
  – DNS/WINS configuration
  – Enumerates all resources and parameters
  – Kerberos configuration

Directory Access
• Check
  – DSAccess cache configuration and non-default parameters, e.g.
    • MaxMemoryUser | MaxMemoryConfig
    • LdapKeepAliveSecs, DisableNetLogonCheck
    • MinUserDC
  – DSAccess cache efficiency
• DSAccess topology
  – Round-trip times between Exchange and each DC/GC in the topology
  – Hardware/OS configuration of each DC/GC
  – Calculates the GC to Exchange processor ratio

Information Store
• Check
  – ESE cache configuration
  – Current state of virtual memory
  – Online maintenance window
  – Checkpoint depth
  – Circular logging state
  – Log buffer configuration
  – Log generation level
  – File system characteristics (NTFS/Compression/Encryption)
  – Validity of legacyExchangeDN
  – Database and logs on the same LUN
  – Content Indexing state
  – Non-default parameters in Private|Public-GUID registry
  – Database size
  – E-mail address on Public Folder stores
  – RPC Compression / Buffer Packing settings
  – Hard-coded TCP/IP ports, and clashes with other Exchange ports

Store Process Parameters
• Check for non-default settings and bad values
• Examples
  – Disable MAPI Clients
  – Enable Tracing
  – Initial Memory Percentage
  – Initial Reserve Size KB
  – Ignore Zombie Users
  – Logon Only As
  – Mailbox Cache Age Limit
  – Mailbox Cache Idle Limit
  – Mailbox Cache Size
  – MaxOpenMessagesPerLogon
  – Reserve Increment KB
  – SuppressOOFsToDistributionLists
  – Trace User LegacyDN
  – VM Warning|Error Level
  – objtAttachment
  – objtFolder
  – objtFolderView
  – objtMessage
  – ProrateFactor
  – ProrateStart
  – ProrateMax
  – IMAIL settings
  – ExIFS drive

Transport
• Check
  – Main configuration parameters in the AD
  – Cross-check AD and metabase for consistency
  – Non-default settings
  – File system characteristics for ‘mailroot’ folders (NTFS/Compression/Encryption)
  – SMTP stack verb validation (e.g. X-LINK2STATE)
  – SMTP mail submission test
  – Enumeration of transport event sinks
  – Enumeration of MTA settings, calling out any non-defaults
  – Detection of Archive Sink and configuration
  – Non-default routing parameters (e.g. SuppressStateChanges)

System Attendant
• Check
  – Service state
  – File system characteristics for message tracking folder (NTFS/Compression/Encryption)
  – RFR service
  – RFR / NSPI Target Server configuration
  – Hard-coded TCP/IP ports

Anti-virus Support
• CA eTrust 6/7 file-level AV configuration and exclusions
• Trend Micro ScanMail
  – Patch level
  – Performance tuning configuration (threads/thresholds/debug settings)
• Product detection and configuration settings for
  – McAfee GroupShield
  – Symantec Mail Security for Exchange
  – Sybari Antigen
• VS API configuration settings
  – Warn if number of threads is not appropriate for underlying hardware

Other Installed Applications
• Check
  – RPC Client|Server binding order configuration
  – Presence of LeakDiag
  – For old versions of Simpler-Webb ERM
  – ISA 2000 Service Pack level
  – Presence of MOM Agent

Hardware Configuration
• Check
  – System BIOS is not over a year old
  – Specific support for HP, Dell and IBM servers
  – Processor configuration
  – Physical memory installed

Disk Storage System
• Check
  – Performance counters are enabled
  – Enumeration of physical and logical disks
  – Enumeration and identification of mount points
  – Enumeration of disk controllers and driver levels
  – Configuration of Host Bus Adaptors
  – Version of multi-pathing software (e.g. SecurePath, PowerPath)

File Versions
• Verify 29 key Exchange binaries
  – Physical presence
  – Make sure that they’re not too old
  – Identify binaries which are hotfixes
• Check
  – Server MAPI subsystem
  – Presence of old Roll-ups
  – Presence of ESE API virus scanners

Hotfixes
• Detect all hotfixes and Service Packs installed for
  – Windows 2000
  – Windows 2003
  – Exchange 5.5
  – Exchange 2000
  – Exchange 2003
• Call out any updates that were installed during the past 5 days, and the logon name of the user that performed the installation

Network Subsystem
• Enumerate all network cards
• Check
  – NIC connection status
  – DNS/WINS configuration
  – IP Gateway settings
  – Primary DNS is alive
  – Domain suffix

Operating System
• Check
  – Page Table Entry (PTE) levels
  – Paged|NonPaged pool configuration
  – CrashOnAuditFail configuration
  – HeapDeCommitFreeBlockThreshold
  – TEMP/TMP paths
  – SystemPages configuration
  – /3GB /USERVA configuration
  – Physical Address Extensions (PAE) detection
  – OS Version and SKU (e.g. Standard, Enterprise, etc.)
  – Dr. Watson configuration
  – Debug settings (including GlobalFlag, PageHeapFlags)
  – Virtual PC / Virtual Server / VMware detection

Success Stories
• Identified that circular logging was enabled on a 12,000 user Exchange cluster
  – Was a potential time-bomb
• Identified incorrect memory configuration that required the Exchange server to be restarted every two weeks
• Identified a case where database files were being stored on a compressed volume
  – Root cause of the performance problems

ExBPA Timeline
• V1.0 – September 21st
  – 1200 point collection / 800 rules
• V1.1 – December 6th
  – Usability improvements
  – 1300 point collection / 900 rules
• V2.0 – Early March
  – Localized in all Exchange Server languages
  – Performance sampling and root cause analysis infrastructure
  – Admin API support (e.g. find out time of last backup)
  – Optional integration with MOM 2005
  – Export to XML / HTM / CSV
  – New baseline logic
• V3.0 – Later on in the year
  – More rules and refinements
  – MAPI.NET collector

Appendix: Screen Shots
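
Appendix: Illustrative Rule Evaluation Sketch
The deck describes a two-pass design: collection places all data in one XML hierarchy, and analysis evaluates XPath-based rules that are conditioned on server role. The sketch below illustrates that idea only. It is not ExBPA code (the tool itself is written in C#), and every element name, attribute, and the `Rule` structure are assumptions invented for this example rather than the real ExBPA schema. It uses Python's standard-library ElementTree, which supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

# Hypothetical output of the collection pass: every setting, whatever its
# source (AD, registry, metabase...), sits in one hierarchy.
COLLECTED = """
<Topology>
  <Organization name="Contoso">
    <Setting name="MaxMessageSize" value="unlimited"/>
  </Organization>
  <Server name="MBX01" role="mailbox">
    <Setting name="CircularLogging" value="enabled"/>
  </Server>
  <Server name="BH01" role="bridgehead">
    <Setting name="CircularLogging" value="enabled"/>
  </Server>
</Topology>
"""


class Rule(NamedTuple):
    """Hypothetical rule shape, not the real ExBPA rule format."""
    rule_type: str  # e.g. Error, Warning, NonDefault, BestPractice, Info
    scope: str      # XPath selecting the objects the rule applies to
    test: str       # XPath, relative to each object, that flags a problem
    message: str


RULES = [
    Rule("Error", ".//Organization",
         "Setting[@name='MaxMessageSize'][@value='unlimited']",
         "No maximum message size set for the organization"),
    # Role conditioning: the same setting is judged differently per role.
    Rule("Error", ".//Server[@role='mailbox']",
         "Setting[@name='CircularLogging'][@value='enabled']",
         "Circular logging should be disabled on mailbox servers"),
    Rule("Warning", ".//Server[@role='bridgehead']",
         "Setting[@name='CircularLogging'][@value='disabled']",
         "Circular logging should be enabled on bridgehead servers"),
]


def analyze(root, rules):
    """Analysis pass: return (rule_type, object name, message) per hit."""
    findings = []
    for rule in rules:
        for obj in root.findall(rule.scope):
            if obj.find(rule.test) is not None:
                findings.append((rule.rule_type, obj.get("name"), rule.message))
    return findings


for rule_type, name, message in analyze(ET.fromstring(COLLECTED), RULES):
    print(f"[{rule_type}] {name}: {message}")
```

With this sample data, the mailbox server is flagged for circular logging while the bridgehead server with the same setting is not, which is the role-conditioning behavior described under Exchange Roles. Splitting each rule into a scope and a test is a choice made here so that a finding can name the object it applies to.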