User Guide - Erez Ben Ari

UAG Trace Analyzer User’s guide
Version 1.3, 27 March 2013
Erez Ben-Ari, CSS MSD, Issaquah
General Description
The UAG Trace Analyzer is designed to help analyze and interpret data collected in a UAG trace, and
locate the cause of the issue as fast as possible. The tool’s primary function is parsing a trace into the
individual requests posted to the UAG server, so that the problematic request can be analyzed quickly,
with the least amount of interference from other concurrent processes or activity on the server.
Download, install and first-time usage
The tool is available on \\newprints\Tools and Scripts\UAGTraceAnalyzer, currently version 1.0, and will
be updated there with future versions. The tool does not require an installation – simply copy it to a
local folder on your hard-drive (it won’t run from a network share) and launch it. If the location is a
protected folder (such as the Root folder, the Windows folder or the Program Files folder), the program
cannot operate correctly, because it won’t be able to store its configuration. In such a case, it will notify
you of this and exit.
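If you want to check a folder before copying the tool into it, a writability test is easy to script. The following is a minimal Python sketch, assuming that "protected" simply means the current user cannot create files there; the function name is hypothetical, and this is not how the tool itself performs the check:

```python
import tempfile


def can_store_config(folder):
    """Return True if a file can be created in the given folder."""
    try:
        # Create (and immediately discard) a temporary file in the folder.
        with tempfile.TemporaryFile(dir=folder):
            return True
    except OSError:
        # Covers missing folders and permission errors alike.
        return False
```

A folder that passes this check should be able to hold the tool's configuration file.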
When the program is launched for the first time, it creates a local configuration file with the default
locations for:
1. The location of TMF files, which would be used to decode UAG traces from Binary to text
2. The location of the UAG Source tree*
3. The path to your preferred text editor
If this is indeed the first time you use the tool, the program will open the settings window, so you can modify the
paths if needed, and when finished, click Save Settings. For example, if you prefer an editor other than
Notepad (like MetaPad or Note Tab), you can specify the path to the executable, and the program will
use it when opening text files such as reports. I strongly advise NOT specifying a network path for TMF
files, as those are heavily utilized during a trace decode. Doing this over the network, even if it’s local to
your building, will be a lot slower than local. The most efficient storage is to place the TMF files in a RAM
drive (it would be 25% faster than a regular physical platter-based drive). Using an SSD drive is also a
speedy option.
Using a RAM drive reserves a piece of your RAM as a virtual hard drive, and it’s the speediest drive
possible. The UAG TMF files consume about 180 MB of data, so it should not hurt the computer’s
performance, but improve the speed of decoding a trace significantly.
To create a TMF ram disk, follow these steps:
1. Download SoftPerfect’s RAM Disk tool from http://www.softperfect.com/products/ramdisk/
2. Open the application
3. Click Image/Create Image
4. Choose a location for your image, and specify a size of 180 MB
5. Set the file system as NTFS, and choose a volume label (whatever you want)
6. Click on the Plus icon
7. Choose your created image file
8. Select a drive letter to your liking
9. Copy the TMF files to your new ram drive
10. Close the application (it will stay in your System Tray)
11. Upon a reboot, the application would load, and create the drive with the image content
automatically
12. Modify the TMF location settings in the UAG Trace Analyzer to use the RAM drive you created
13. Note that Windows will issue a “low disk space” balloon tip for the drive you created, because
it’s designed to show that for drives with less than 200 MB free. You can either set the image
size to 480 MB (which would eat up more memory), or use this article to disable the notification:
http://support.microsoft.com/kb/555622
* If you do not have access to the UAG Source, leave this option as-is. It’s only used when viewing
extracted traces using the built-in viewer, and you can still perform most UAG trace analysis without
it. If you are an SEE, you can ask for access to the UAG source. To do so:
1. Go to http://RAMWEB
2. Select Requests -> Request Access
3. Type 13154 in Project ID(Number) field and click Apply
4. Fill the following in “Access Request” form:
a. Desired permissions level – CSS Engineer (unless your role is other)
b. Confirm (or edit) that the correct user name is listed in the “User Accounts” field
c. Type a justification in the appropriate field (for example “I require access to perform
UAG troubleshooting and debug analysis”)
d. Select “Team Vendor” in Custom Justification
e. Check “terms and conditions” checkbox and click “Submit”
5. Your manager will get a notification to approve your request. Once the request is approved, you need to sync
the source to your computer.
6. To Sync, designate a drive with at least 10 GB of free space (do NOT create a folder for it!)
7. Run the Sync script from \\Gate-FS\Public\enlist\UAGEnlist.cmd (If you don’t have permissions
to run it, contact Eli Sagie for help)
8. Follow the prompts of the script, to sync the source tree. The operation takes several hours.
Opening a UAG Trace
To open a trace, click Browse, or drag-and-drop the trace file into the blank line:
If you drag a non-textual file into the tool, it will validate whether it’s an actual UAG trace (as opposed to
other traces). If so, it will offer to decode it. Naturally, you need to have a valid TMF path set up,
otherwise the file will decode to a file with a lot of the nasty "no format information found" messages.
Once the decode is complete, you can press the Analyze button to have the program extract a list of
requests received by UAG during the trace. You can also run the other functions, which we will discuss
soon.
After running the analysis, the tool will display a list of requests, showing Id, time, PFC and URL for each
request. You can use the search box on the top right to search in the requests. By default, the search goes
through all the columns, but you can uncheck that setting to search in the URL only. Note that the table
is open to select and copy, so you can copy its contents to Excel or an email.
On the far right, you can click on Extract to pull out just that request to a text file. Note that some
requests are different, most notably ActiveSync. ActiveSync, by design, runs extended connections over
many minutes, making them problematic to extract. The tool will warn of this if you attempt to extract
an ActiveSync request.
Sometimes, other requests may not extract completely, if the tool is unable to find a normal request
end. The tool will alert you to this if it goes through over 10,000 lines (a normal request is rarely more than
4,000 lines). You can elect to continue, but that could end up in a file that's almost as large as the
original trace, making it almost useless.
The extraction process will save the extracted request to disk, and when done, will offer to open it. By default,
the tool will open the request with the built in viewer. If you uncheck the option, it will open with your
selected notepad instead.
By default, the extraction will clean up common trash errors, such as “CookieDecryptwas has failed” and
“pCtx->m_nError [0]”. If you uncheck “Clean Extraction”, it will keep those in the extracted file.
In addition, above the search box you can see trace statistics, which calculate how many requests the
UAG server receives per second (RPS). This is a good indicator of the server load. Most of our customers
are in the 2-5 RPS range, but occasionally, you can see 30 or more RPS, and that would be a very busy
server. At around 50 RPS or above, you can expect performance issues to come about (*** this is not
public information, so avoid discussing this number or expectations with customers!!!***)
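The RPS figure is simple arithmetic: the number of requests divided by the time span they cover. Here is a minimal Python sketch of that calculation, assuming the request timestamps have already been parsed into datetime objects. The function name is hypothetical and the tool's own calculation may differ in detail:

```python
from datetime import datetime, timedelta


def requests_per_second(timestamps):
    """Average number of requests per second over the span of the trace."""
    if len(timestamps) < 2:
        return 0.0
    span = (max(timestamps) - min(timestamps)).total_seconds()
    if span <= 0:
        return 0.0
    return len(timestamps) / span


# Illustrative data: 10 requests spread over 5 seconds.
start = datetime(2013, 3, 27, 10, 0, 0)
sample = [start + timedelta(seconds=s) for s in (0, 0, 1, 1, 2, 2, 3, 4, 5, 5)]
print(requests_per_second(sample))
```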
The built in viewer
The built-in viewer is not as advanced as commercial-grade editors, but offers the following features:
1. It detects UAG source file references, and highlights them in blue.
2. It detects errors and highlights them in Bold
3. It detects common and potentially serious errors such as “CSP_SSL_FAIL” and
“WFSR_ERR_INTERNAL” and highlights them in Bold.
4. If you have access to UAG’s source tree, double-clicking a trace line will open the matching
source file with your default notepad.
When you open an extracted request with the built-in viewer, it will build the view, which looks like this:
The viewer will match your screen resolution, and fill the screen automatically, to reduce the need for
scrolling. You can type a search string in the search box, and it will highlight all occurrences of it. If you
have defined a valid path for your UAG sources, double-clicking a line that references a source file (.CPP
file) will open the source file so you can view the function.
Text Utils
As you work with UAG, you might find yourself needing to decode certain types of info, such as Base-64
encoded strings or HTML data. On the right-side of the analyzer, the Text Utils button will open a text
window:
As you can see from the screenshot, this tool supports Base64 decoding, URL decoding, Remove Linebreaks and HTML decoding.
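If you need the same conversions in a script, outside the tool, they are all available in the Python standard library. The sketch below is only an equivalent, not the tool's implementation; the function names are hypothetical, and UTF-8 is an assumed encoding for the Base64 payload:

```python
import base64
import html
from urllib.parse import unquote


def decode_base64(text):
    """Decode a Base64 string, assuming the payload is UTF-8 text."""
    return base64.b64decode(text).decode("utf-8", errors="replace")


def decode_url(text):
    """Decode percent-encoded (URL-encoded) text."""
    return unquote(text)


def remove_linebreaks(text):
    """Join wrapped text back into a single line."""
    return text.replace("\r", "").replace("\n", "")


def decode_html(text):
    """Convert HTML entities such as &lt; back to plain characters."""
    return html.unescape(text)
```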
The Responses report
Clicking on Responses will parse out and create a report that shows the various responses UAG received
from backend servers during the trace. Usually, these will be 200 OK and 401 Authenticate for the most
part, but the report allows you to easily find unusual ones (like a 400 or 500 error). As usual, the report
is saved to disk automatically, so you can open it directly later, and the tool will offer to open it with
your notepad.
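To illustrate the idea behind this report, the Python sketch below tallies HTTP status codes from decoded trace lines and singles out the unusual ones. It assumes a status line contains "HTTP/1.x" followed by a three-digit code, as in the patterns listed under the common errors report; the function names are hypothetical, and the tool's actual parsing may differ:

```python
import re
from collections import Counter

# Assumed shape of a response status line, e.g. "---[HTTP/1.1 200 OK"
STATUS_RE = re.compile(r"HTTP/1\.[01] (\d{3})")


def count_statuses(lines):
    """Count how many times each HTTP status code appears."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def unusual_statuses(counts, expected=("200", "401")):
    """Return only the status codes outside the expected set."""
    return {code: n for code, n in counts.items() if code not in expected}
```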
The cookies report
Clicking on Cookies will extract a list of all incoming and outgoing cookies. Incoming cookies are
provided to UAG as part of client requests, and could reveal situations like failed office integration,
signed cookie issues and the like. Outgoing cookies are the session cookies that UAG creates for new
clients, and seeing them may reveal session-related issues. The report shows the incoming cookies first,
and then the outgoing cookies and their details (path, domain etc.)
The common errors report
Clicking on Common Errors looks through the trace for any of 30 pre-defined serious issues you might
run into. These include:
error 10054
HSE_REQ_ASYNC_READ_CLIENT
ERROR:Failed to connect
ERROR:Failed to init connection to the GC
error 64
CMRT_FAIL_READ
Possibly closed connection by server
CMRT_FAIL_CLOSE_SOCET_EN
error occured on QueueAsyncRead
tWFEContext.pvWFESessionContext for WhlFiltAppWrap.dll is set to 0000000000000000
WFE_STATUS_SEC_VIOLATION
---[HTTP/1.1 100 Continue
---[HTTP/1.1 301 Moved Permanently
---[HTTP/1.1 302 Found
---[HTTP/1.1 302 Object Moved
---[HTTP/1.1 304 Not Modified
---[HTTP/1.1 400 Bad Request
---[HTTP/1.1 401 Unauthorized
---[HTTP/1.1 404 Not Found
---[HTTP/1.1 440 Login Timeout
---[HTTP/1.1 449 Retry after sending a PROVISION command
unable to reply to an HTTP 401
EN_S_RESEND_AFTER_401
ERROR:Failed in ADsOpenObject
500 internal server error
400 Bad request
ERROR:Failed to initialize security context
CSP_SSL_FAIL
No available servers for farm
ERROR:Failed to get destination server for farm
Required memory size exceed MAX_CBUFFER_SIZE
EXCEPTION_ACCESS_VIOLATION
WFSR_ERR_INTERNAL
called with error code
0x80004005
EN_C_ERROR_OCCURRED
ERROR:Fatal Error
WFE_STATUS_FATAL_ERROR
The resulting report file will contain all occurrences of any of these, and you can then alert the
customer, or track down the specific request and analyze it.
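Conceptually, this report is a substring search over the decoded trace. Here is a minimal Python sketch that uses a handful of the patterns from the list above; it assumes a case-sensitive match, and the function name is hypothetical:

```python
# A few of the patterns from the list above; extend as needed.
COMMON_ERRORS = [
    "error 10054",
    "CSP_SSL_FAIL",
    "WFSR_ERR_INTERNAL",
    "EXCEPTION_ACCESS_VIOLATION",
]


def find_common_errors(lines, patterns=COMMON_ERRORS):
    """Return (line number, pattern, line) for each line containing a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern in patterns:
            if pattern in line:
                hits.append((number, pattern, line.rstrip("\n")))
                break  # report each line once, under the first matching pattern
    return hits
```

The line numbers in the result make it easier to jump to the right spot in the trace and find the request involved.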
The all errors report
Clicking on All Errors will extract a list of all trace lines that have the keywords Error or Fail in them,
except the trash/noise errors that are false positives.
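The filter can be pictured as a keyword match with an exclusion list. The Python sketch below assumes the noise entries are the two "trash" errors mentioned earlier for Clean Extraction, and that the keyword match is case-insensitive; both are assumptions, and the function name is hypothetical:

```python
# Assumed noise list: the two trash errors named for Clean Extraction.
NOISE = ("CookieDecryptwas has failed", "pCtx->m_nError [0]")
KEYWORDS = ("error", "fail")


def all_errors(lines):
    """Keep lines mentioning Error or Fail, skipping known noise lines."""
    result = []
    for line in lines:
        lowered = line.lower()
        has_keyword = any(keyword in lowered for keyword in KEYWORDS)
        is_noise = any(noise in line for noise in NOISE)
        if has_keyword and not is_noise:
            result.append(line)
    return result
```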
The signatures report
Clicking on Signatures will produce a report of all the server names that are used within the trace. It will
show the external hostname (either in clear text for application-specific apps or as the HAT-signed URL
for portal-hostname apps) and the actual name the server is tied to internally.
The IPs report
Clicking on IPs will produce a report of all the unique IP addresses used within the trace. It produces two
files. One file lists all the unique IP addresses, and the other lists all IPs and Ports that are unique (which
would typically be a longer list). This is an easy way to confirm if a certain client connected to UAG
during the trace.
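The two lists can be approximated with a regular expression over the decoded trace. This is a minimal Python sketch, assuming IPv4 addresses written in dotted notation with an optional ":port" suffix; the function name is hypothetical and the tool's own matching rules are not documented here:

```python
import re

# Dotted IPv4 address with an optional ":port" suffix.
IP_PORT_RE = re.compile(r"\b((?:\d{1,3}\.){3}\d{1,3})(?::(\d{1,5}))?\b")


def unique_ips(lines):
    """Return (sorted unique IPs, sorted unique IP:port pairs)."""
    ips, endpoints = set(), set()
    for line in lines:
        for ip, port in IP_PORT_RE.findall(line):
            # Discard matches whose octets are out of range (e.g. 999.1.1.1).
            if all(int(octet) <= 255 for octet in ip.split(".")):
                ips.add(ip)
                if port:
                    endpoints.add(f"{ip}:{port}")
    return sorted(ips), sorted(endpoints)
```

Because clients typically connect from many source ports, the IP:port list is usually the longer of the two.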
The req stream report
The Request Stream report filters out just the major “milestone” data from the trace for the 4-stage
request processing:
a. Client request to UAG
b. UAG request to backend server
c. Backend server response
d. UAG response to client
The request stream makes it easy to track down an unusual pattern in the stream, like a request that
doesn’t end or one that’s too slow. The report would look like this:
The Req Time and Resp Time reports
The request time and response time reports provide important information about the performance of
the server. The request time report calculates the average time a request takes from the time UAG
receives it to when it’s completed. A high average (above 300 ms) indicates that the UAG server is
taking longer than it should to handle requests. The response time report calculates the average time
it takes the backend server to respond to a request forwarded by UAG. A high average (above 200 ms)
indicates that the backend servers are slow to respond, which would jeopardize UAG’s ability to handle
the user load (in such a case, if there are noticeable performance issues, the backend server needs to be
analyzed for the cause of the slowness).
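To show how the two thresholds are applied, here is a minimal Python sketch. It assumes the per-request durations have already been extracted in milliseconds; the function and constant names are hypothetical:

```python
from statistics import mean

REQUEST_THRESHOLD_MS = 300   # average UAG request handling time
RESPONSE_THRESHOLD_MS = 200  # average backend response time


def assess(durations_ms, threshold_ms):
    """Return (average, is_high) for a list of durations in milliseconds."""
    average = mean(durations_ms)
    return average, average > threshold_ms


# Illustrative numbers only.
request_avg, request_slow = assess([250, 350, 450], REQUEST_THRESHOLD_MS)
response_avg, response_slow = assess([120, 180], RESPONSE_THRESHOLD_MS)
```

In this made-up sample the request average is above its threshold while the backend average is not, which would point at the UAG server rather than the backend.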
The split trace function
Sometimes, extracting requests is not possible due to them being ActiveSync, or due to performance
bottlenecks in trace collection. Other scenarios are troubleshooting things that are not request-specific,
such as activation issues or configuration problems. The trace-split function automatically splits the
trace file to 10 MB chunks.
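The splitting logic can be sketched in a few lines of Python. This version assumes the split happens on line boundaries so that no trace line is cut in half, which is an assumption about the tool's behavior; the function name is hypothetical:

```python
def split_lines(lines, max_bytes=10 * 1024 * 1024):
    """Group lines into chunks of at most max_bytes each.

    A single line larger than max_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        # Start a new chunk when adding this line would exceed the limit.
        if current and size + line_size > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk can then be written to its own file and opened in the tool or in a text editor.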
The time cut function
Similarly to the above, Time-Cut allows you to split out a piece of the trace, covering a specific time region. It also shows the projected file size:
The split component function
Sometimes, an issue needs to be tracked down to a specific component, such as the RuleSet or
AppWrap, and those can be hard to find in a large trace, or even in a single request. The Split
Component function splits the trace to individual pieces, each containing only lines pertaining to that
component. You can then load just one of the pieces into the tool and perform additional parsing on it,
like Time-cutting or splitting. Do note, though, that this operation is CPU-intensive and could take a
while on large traces.
Error reporting
As with any tool that was developed without a test framework, the tool may still contain bugs or may crash in
certain situations. If you encounter such issues and wish to report them, please collect the tool’s log file
UAGTraceAnalyzer.txt from %temp%, which should reveal the problem. Try to provide repro steps, if
possible, and the trace file that led to the issue as well. Send the reports to Erez Ben-Ari
(benari@microsoft.com).
Support for UAG SP3
With UAG SP3, the tracing format changes a bit, to include request-context info. It will be a while until
the SP is adopted by all customers, but the UAG Trace Analyzer is fully compliant with SP3 and all the
functions work with SP3 traces.