UAG Trace Analyzer User’s Guide
Version 1.3, 27 March 2013
Erez Ben-Ari, CSS MSD, Issaquah

General Description

The UAG Trace Analyzer is designed to help analyze and interpret data collected in a UAG trace, and to locate the cause of an issue as quickly as possible. The tool’s primary function is parsing a trace into the individual requests posted to the UAG server, so that the problematic request can be analyzed quickly, with the least amount of interference from other concurrent processes or activity on the server.

Download, install and first-time usage

The tool is available on \\newprints\Tools and Scripts\UAGTraceAnalyzer, currently version 1.0, and will be updated there with future versions. The tool does not require installation; simply copy it to a local folder on your hard drive (it won’t run from a network share) and launch it. If the location is a protected folder (such as the root folder, the Windows folder or the Program Files folder), the program cannot operate correctly, because it won’t be able to store its configuration. In such a case, it will notify you of this and exit.

When the program is launched for the first time, it creates a local configuration file with the default locations for:

1. The TMF files, which are used to decode UAG traces from binary to text
2. The UAG source tree*
3. Your preferred text editor

On first use, the program opens the settings window so you can modify the paths if needed; when finished, click Save Settings. For example, if you prefer an editor other than Notepad (like MetaPad or Note Tab), you can specify the path to its executable, and the program will use it when opening text files such as reports.

I strongly advise NOT to specify a network path for TMF files, as those are heavily utilized during a trace decode. Decoding over the network, even within your own building, will be a lot slower than decoding from a local drive.
The most efficient storage is to place the TMF files on a RAM drive (it would be 25% faster than a regular physical platter-based drive). Using an SSD drive is also a speedy option. A RAM drive reserves a piece of your RAM as a virtual hard drive, and it’s the speediest drive possible. The UAG TMF files consume about 180 MB of data, so this should not hurt the computer’s performance, but it will improve the speed of decoding a trace significantly.

To create a TMF RAM disk, follow these steps:

1. Download SoftPerfect’s RAMDisk tool from http://www.softperfect.com/products/ramdisk/
2. Open the application
3. Click Image/Create Image
4. Choose a location for your image, and specify a size of 180 MB
5. Set the file system as NTFS, and choose a volume label (whatever you want)
6. Click on the Plus icon
7. Choose your created image file
8. Select a drive letter to your liking
9. Copy the TMF files to your new RAM drive
10. Close the application (it will stay in your System Tray)
11. Upon a reboot, the application will load and create the drive with the image content automatically
12. Modify the TMF location setting in the UAG Trace Analyzer to use the RAM drive you created
13. Note that Windows will issue a “low disk space” balloon tip for the drive you created, because it’s designed to show that for drives with less than 200 MB free. You can either set the image size to 480 MB (which would eat up more memory), or disable the notification as described in http://support.microsoft.com/kb/555622

* If you do not have access to the UAG source, leave this option as-is. It’s only used when viewing extracted traces using the built-in viewer, and you can still perform most UAG trace analysis without it. If you are an SEE, you can ask for access to the UAG source. To do so:

1. Go to http://RAMWEB
2. Select Requests -> Request Access
3. Type 13154 in the Project ID (Number) field and click Apply
4. Fill in the following in the “Access Request” form:
   a. Desired permissions level: CSS Engineer (unless your role is other)
   b. Confirm (or edit) that the correct user name is listed in the “User Accounts” field
   c. Type a justification in the appropriate field (for example “I require access to perform UAG troubleshooting and debug analysis”)
   d. Select “Team Vendor” in Custom Justification
   e. Check the “terms and conditions” checkbox and click “Submit”
5. Your manager will get a notification to approve your request. Once it is approved, you need to sync the source to your computer.
6. To sync, designate a drive with at least 10 GB of free space (do NOT create a folder for it!)
7. Run the sync script from \\Gate-FS\Public\enlist\UAGEnlist.cmd (if you don’t have permissions to run it, contact Eli Sagie for help)
8. Follow the prompts of the script to sync the source tree. The operation takes several hours.

Opening a UAG Trace

To open a trace, click Browse, or drag and drop the trace file onto the blank line.

If you drag a non-textual file into the tool, it will validate whether it’s an actual UAG trace (as opposed to other traces). If so, it will offer to decode it. Naturally, you need to have a valid TMF path set up; otherwise the decoded file will be full of the nasty "no format information found" text.

Once the decode is complete, you can press the Analyze button to have the program extract a list of requests received by UAG during the trace. You can also run the other functions, which are discussed below.

After running the analysis, the tool will display a list of requests, showing Id, time, PFC and URL for each request. You can use the search box on the top right to search in the requests. By default, the search goes through all the columns, but you can uncheck that setting to search in the URL only. Note that the table supports select and copy, so you can copy its content to Excel or an email. On the far right, you can click on Extract to pull out just that request to a text file.
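The search behaviour described above (all columns by default, URL only when the option is unchecked) can be illustrated with a minimal Python sketch. This is not the tool’s actual code: the RequestRow type, its field names and the sample rows are hypothetical, and case-insensitive substring matching is an assumption.

```python
from dataclasses import dataclass, astuple


@dataclass
class RequestRow:
    """One row of the request list (hypothetical field names)."""
    req_id: str
    time: str
    pfc: str
    url: str


def search_requests(rows, term, all_columns=True):
    """Return the rows that contain `term`, ignoring case.

    With all_columns=True every column is searched; otherwise only the URL.
    """
    needle = term.lower()
    matches = []
    for row in rows:
        fields = astuple(row) if all_columns else (row.url,)
        if any(needle in str(field).lower() for field in fields):
            matches.append(row)
    return matches


# Hypothetical sample rows, for illustration only.
rows = [
    RequestRow("1", "10:15:01.120", "0", "/owa/auth/logon.aspx"),
    RequestRow("2", "10:15:02.450", "0", "/InternalSite/Validate.asp"),
]
```

Searching for a timestamp fragment such as "10:15" matches both rows when all columns are searched, and none when the search is restricted to the URL.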
Note that some requests are different, most notably ActiveSync. ActiveSync, by design, runs extended connections over many minutes, making them problematic to extract. The tool will warn of this if you attempt to extract an ActiveSync request. Sometimes, other requests may not extract completely, if the tool is unable to find a normal request end. The tool will alert you if it goes through over 10,000 lines (a normal request is rarely more than 4,000 lines). You can elect to continue, but that could end up in a file that’s almost as large as the original trace, making it almost useless.

The extraction process will save the extracted request to disk, and when done, will offer to open it. By default, the tool will open the request with the built-in viewer. If you uncheck the option, it will open with your selected text editor instead. By default, the extraction will clean up common trash errors, such as “CookieDecryptwas has failed” and “pCtx->m_nError [0]”. If you uncheck “Clean Extraction”, it will keep those in the extracted file.

In addition, above the search box you can see trace statistics, which calculate how many requests the UAG server receives per second (RPS). This is a good indicator of the server load. Most of our customers are in the 2-5 RPS range, but occasionally you can see 30 or more RPS, and that would be a very busy server. At around 50 RPS or above, you can expect performance issues to come about (*** this is not public information, so avoid discussing this number or expectations with customers!!! ***)

The built-in viewer

The built-in viewer is not as advanced as commercial-grade editors, but offers the following features:

1. It detects UAG source file references, and highlights them in blue.
2. It detects errors and highlights them in bold.
3. It detects common and potentially serious errors such as “CSP_SSL_FAIL” and “WFSR_ERR_INTERNAL” and highlights them in bold.
4. If you have access to UAG’s source tree, double-clicking a trace line will open the matching source file in your selected text editor.

When you open an extracted request with the built-in viewer, it will build the view. The viewer will match your screen resolution and fill the screen automatically, to reduce the need for scrolling. You can type a search string in the search box, and it will highlight all occurrences of it. If you have defined a valid path for your UAG sources, double-clicking a line that references a source file (.CPP file) will open that source file so you can view the function.

Text Utils

As you work with UAG, you might find yourself needing to decode certain types of info, such as Base64-encoded strings or HTML data. On the right side of the analyzer, the Text Utils button will open a text window. This tool supports Base64 decoding, URL decoding, Remove Linebreaks and HTML decoding.

The Responses report

Clicking on Responses will parse out and create a report that shows the various responses UAG received from backend servers during the trace. Usually, these will be 200 OK and 401 Authenticate for the most part, but the report allows you to easily find unusual ones (like a 400 or 500 error). As usual, the report is saved to disk automatically, so you can open it directly later, and the tool will offer to open it with your selected text editor.

The cookies report

Clicking on Cookies will extract a list of all incoming and outgoing cookies. Incoming cookies are provided to UAG as part of client requests, and could reveal situations like failed Office integration, signed cookie issues and the like. Outgoing cookies are the session cookies that UAG creates for new clients, and seeing them may reveal session-related issues. The report shows the incoming cookies first, and then the outgoing cookies and their details (path, domain etc.)
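For readers who need the same conversions outside the tool, the four Text Utils operations described earlier map onto Python’s standard library roughly as follows. This is a sketch under assumptions, not the tool’s implementation: the function names are hypothetical, and the automatic re-padding of Base64 input and the UTF-8 decoding are choices made here for convenience.

```python
import base64
import html
from urllib.parse import unquote


def decode_base64(text):
    """Decode a Base64 string to text, restoring any missing '=' padding."""
    padded = text.strip()
    padded += "=" * (-len(padded) % 4)
    return base64.b64decode(padded).decode("utf-8", errors="replace")


def decode_url(text):
    """Decode percent-encoded characters such as %2F."""
    return unquote(text)


def decode_html(text):
    """Turn HTML entities such as &lt; back into characters."""
    return html.unescape(text)


def remove_linebreaks(text):
    """Join wrapped text into a single line."""
    return text.replace("\r", "").replace("\n", "")
```

A value copied from a trace is often wrapped across lines, so running remove_linebreaks before decode_base64 is usually the safer order.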
The common errors report

Clicking on Common Errors looks through the trace for any of 30 pre-defined serious issues you might run into. These include:

error 10054 HSE_REQ_ASYNC_READ_CLIENT ERROR:Failed to connect ERROR:Failed to init connection to the GC error 64 CMRT_FAIL_READ Possibly closed connection by server CMRT_FAIL_CLOSE_SOCET_EN error occured on QueueAsyncRead tWFEContext.pvWFESessionContext for WhlFiltAppWrap.dll is set to 0000000000000000 WFE_STATUS_SEC_VIOLATION ---[HTTP/1.1 100 Continue ---[HTTP/1.1 301 Moved Permanently ---[HTTP/1.1 302 Found ---[HTTP/1.1 302 Object Moved ---[HTTP/1.1 304 Not Modified ---[HTTP/1.1 400 Bad Request ---[HTTP/1.1 401 Unauthorized ---[HTTP/1.1 404 Not Found ---[HTTP/1.1 440 Login Timeout ---[HTTP/1.1 449 Retry after sending a PROVISION command unable to reply to an HTTP 401 EN_S_RESEND_AFTER_401 ERROR:Failed in ADsOpenObject 500 internal server error 400 Bad request ERROR:Failed to initialize security context CSP_SSL_FAIL No available servers for farm ERROR:Failed to get destination server for farm Required memory size exceed MAX_CBUFFER_SIZE EXCEPTION_ACCESS_VIOLATION WFSR_ERR_INTERNAL called with error code 0x80004005 EN_C_ERROR_OCCURRED ERROR:Fatal Error WFE_STATUS_FATAL_ERROR

The resulting report file will contain all occurrences of any of these, and you can then alert the customer, or track down the specific request and analyze it.

The all errors report

Clicking on All Errors will extract a list of all trace lines that have the keywords Error or Fail in them, except the trash/noise errors that are false positives.

The signatures report

Clicking on Signatures will produce a report of all the server names that are used within the trace. It will show the external hostname (either in clear text for application-specific apps or as the HAT-signed URL for portal-hostname apps) and the actual name the server is tied to internally.
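The All Errors filter described above can be approximated with a short Python sketch. It is only an illustration: the function name is hypothetical, the case-insensitive match is an assumption, and the noise list reuses the two “trash” strings this guide mentions for Clean Extraction, which may not be the list the tool actually applies here.

```python
import re

# Keywords named in the guide; matching them case-insensitively is an assumption.
KEYWORDS = re.compile(r"error|fail", re.IGNORECASE)

# Noise strings quoted elsewhere in this guide (assumed to apply to this report too).
NOISE = ("CookieDecryptwas has failed", "pCtx->m_nError [0]")


def filter_error_lines(lines, noise=NOISE):
    """Return lines mentioning Error or Fail, skipping known noise lines."""
    hits = []
    for line in lines:
        if not KEYWORDS.search(line):
            continue  # no keyword on this line
        if any(marker in line for marker in noise):
            continue  # known false positive
        hits.append(line.rstrip("\n"))
    return hits
```

Because the function accepts any iterable of lines, it can be fed an open file object directly, which avoids loading a large decoded trace into memory.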
The IPs report

Clicking on IPs will produce a report of all the unique IP addresses used within the trace. It produces two files. One file lists all the unique IP addresses, and the other lists all the unique IP and port combinations (which would typically be a longer list). This is an easy way to confirm whether a certain client connected to UAG during the trace.

The req stream report

The Request Stream report filters out just the major “milestone” data from the trace for the 4-stage request processing:

a. Client request to UAG
b. UAG request to backend server
c. Backend server response
d. UAG response to client

The request stream makes it easy to track down an unusual pattern in the stream, like a request that doesn’t end or one that’s too slow.

The Req Time and Resp Time reports

The request time and response time reports provide important information about the performance of the server. The request time report calculates the average time a request takes from the time UAG receives it to when it’s completed. A high average (above 300 ms) indicates that the UAG server is handling requests more slowly than it should. The response time report calculates the average time it takes the backend server to respond to a request forwarded by UAG. A high average (above 200 ms) indicates that the backend servers are slow to respond, which would jeopardize UAG’s ability to handle the user load (in such a case, if there are noticeable performance issues, the backend server needs to be analyzed for the cause of the slowness).

The split trace function

Sometimes, extracting requests is not possible due to them being ActiveSync, or due to performance bottlenecks in trace collection. Other scenarios involve troubleshooting things that are not request-specific, such as activation issues or configuration problems. The trace-split function automatically splits the trace file into 10 MB chunks.
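For cases where a similar split is needed outside the tool, the idea can be sketched in Python as below. This is not the tool’s implementation: the function name and the output naming scheme (.part001 and so on) are hypothetical, and splitting only at line boundaries, so that no trace line is cut in half, is an assumption about what is useful rather than a description of what the tool does.

```python
from pathlib import Path


def split_trace(path, chunk_bytes=10 * 1024 * 1024):
    """Split a text trace into chunks of at most ~chunk_bytes, on line boundaries.

    Returns the list of chunk paths, written next to the source file.
    A single line longer than chunk_bytes still goes into one chunk.
    """
    src = Path(path)
    parts = []
    out = None
    written = 0
    with src.open("rb") as trace:
        for line in trace:
            # Start a new chunk when this line would overflow the current one.
            if out is None or (written > 0 and written + len(line) > chunk_bytes):
                if out is not None:
                    out.close()
                part = src.with_name(f"{src.stem}.part{len(parts) + 1:03d}{src.suffix}")
                out = part.open("wb")
                parts.append(part)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

Concatenating the returned chunks in order reproduces the original file byte for byte.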
The time cut function

Similar to the above, Time-Cut allows you to split out a piece of the trace covering a specific time region. It also shows the projected file size.

The split component function

Sometimes, an issue needs to be tracked down to a specific component, such as the RuleSet or AppWrap, and those can be hard to find in a large trace, or even in a single request. The Split Component function splits the trace into individual pieces, each containing only lines pertaining to one component. You can then load just one of the pieces into the tool and perform additional parsing on it, like time-cutting or splitting. Do note, though, that this operation is CPU-intensive and could take a while on large traces.

Error reporting

Like any tool that was developed without a test framework, the tool may still contain bugs or may crash in certain situations. If you encounter such issues and wish to report them, please collect the tool’s log file UAGTraceAnalyzer.txt from %temp%, which should reveal the problem. Try to provide repro steps, if possible, and the trace file that led to the issue as well. Send the reports to Erez Ben-Ari (benari@microsoft.com)

Support for UAG SP3

With UAG SP3, the tracing format changes a bit, to include request-context info. It will be a while until the SP is adopted by all customers, but the UAG Trace Analyzer is fully compliant with SP3 and all the functions work with SP3 traces.