Topic: User Agent Strings
Created by Jonathan Tomek, Senior Threat Analyst, iSIGHT Partners

“The User-Agent request-header field contains information about the user agent originating the request. This is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations. User agents SHOULD include this field with requests. The field can contain multiple product tokens and comments identifying the agent and any subproducts which form a significant part of the user agent. By convention, the product tokens are listed in order of their significance for identifying the application.” – RFC 2616

In many cases, a user agent acts as a client in a network protocol used in communications within a client–server distributed computing system. In particular, the Hypertext Transfer Protocol identifies the client software originating the request, using a "User-Agent" header, even when the client is not operated by a user. The SIP protocol (based on HTTP) followed this usage.
http://en.wikipedia.org/wiki/User_agent

Examples:
These strings are collected in the machine logs and reviewed for statistics.
Many applications trust a User-Agent to be exactly what it specifies.
Servers can present data differently based on the User-Agent.
◦ E.g., if the User-Agent says the client is on Linux and it downloads a program, the server will deliver a .deb or .rpm instead of an .exe or .dmg
This string can be modified to just about anything, as long as the header line ends with CRLF (hex 0D 0A).

The best way to know what is anomalous is by reviewing what is normal.
Review RFC 2616 by the IETF to truly understand how the User-Agent string works:
◦ http://www.ietf.org/rfc/rfc2616.txt
◦ (search for section 14.43)
If a User-Agent looks suspicious because it deviates from the normal standard, find out why it is behaving that way or find other suspicious triggers to correlate against.
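The two points above (the header can carry almost any value, and servers may tailor a download to it) can be illustrated with a minimal sketch. The function names `build_request` and `pick_installer`, the installer file names, and the substring rules are all hypothetical and chosen only for illustration; real servers use their own logic.

```python
def build_request(host: str, path: str, user_agent: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request carrying an arbitrary User-Agent."""
    # The value is free-form, but it cannot contain CR or LF,
    # because CRLF is what terminates the header line.
    if "\r" in user_agent or "\n" in user_agent:
        raise ValueError("a header value cannot contain CR or LF")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Connection: close",
    ]
    # Every header line ends with CRLF (0x0D 0x0A); an empty line ends the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")


def pick_installer(user_agent: str) -> str:
    """Naive server-side choice of a download based only on the User-Agent."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "setup.exe"
    if "macintosh" in ua or "mac os x" in ua:
        return "setup.dmg"
    if "ubuntu" in ua or "debian" in ua:
        return "setup.deb"
    if "linux" in ua:
        return "setup.rpm"
    return "setup.tar.gz"
```

Because `pick_installer` trusts the string completely, a client on any platform can obtain any of the installers simply by sending a different User-Agent, which is why the header should be treated as a claim and not as proof.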
The User-Agent starts with Mozilla/5.0. This has no real meaning anymore; it is kept for historical purposes.
Chrome/19.0.1084.56 specifies the browser (Chrome) and the version that is running
Windows NT 6.1 specifies they are on Windows 7
WOW64 means it is a 32-bit application running on 64-bit Windows

The User-Agent starts with Mozilla/5.0. This basically means "compatible with Mozilla".
Macintosh platform running OS X 10.6 Snow Leopard
Build date: 2010 Jan 01
Browser is Firefox version 13

There are other products out there with custom strings
Understand how they are being used
Legitimate products usually include an easy-to-recognize identifier

This user agent is curl running on a Solaris machine
curl is a command-line tool for transferring data with URL syntax
This tool can be used in an automated way to download a redirected file

This user agent is Wget running on a Linux machine
Wget is a command-line tool used for downloading files without processing them
Note: This session raises the level of suspicion because it checks the external IP address of the host machine against a known good site, but that alone does not make it malicious.
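The token-by-token breakdowns above can be reproduced with a small parser. This is a rough sketch, not a complete RFC 2616 grammar: it assumes comments are not nested, and the names `parse_user_agent`, `describe`, and `WINDOWS_NT` are hypothetical. The Chrome string is an assumed reconstruction built around the tokens discussed above, since the original example string is not preserved in the text.

```python
import re

# Comments are the parenthesised parts; products look like Name/Version.
COMMENT_RE = re.compile(r"\(([^()]*)\)")
PRODUCT_RE = re.compile(r"([^\s/()]+)/([^\s()]+)")

# Windows NT version numbers seen in User-Agent strings (client editions only;
# server editions share some of these numbers).
WINDOWS_NT = {"5.1": "Windows XP", "6.0": "Windows Vista", "6.1": "Windows 7"}

# Assumed example string containing the tokens discussed above.
CHROME_19_UA = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 "
    "(KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5"
)


def parse_user_agent(ua):
    """Split a User-Agent into (name, version) products and comment fields."""
    comments = [
        field.strip()
        for comment in COMMENT_RE.findall(ua)
        for field in comment.split(";")
    ]
    # Remove the comments first so their contents are not read as products.
    products = PRODUCT_RE.findall(COMMENT_RE.sub(" ", ua))
    return {"products": products, "comments": comments}


def describe(ua):
    """Return human-readable notes for a few well-known tokens."""
    parsed = parse_user_agent(ua)
    notes = []
    for field in parsed["comments"]:
        if field.startswith("Windows NT "):
            version = field[len("Windows NT "):]
            notes.append(WINDOWS_NT.get(version, "Windows NT " + version))
        elif field == "WOW64":
            notes.append("32-bit application on 64-bit Windows")
    for name, _version in parsed["products"]:
        if name.lower() in ("curl", "wget"):
            notes.append("command-line tool: " + name)
    return notes
```

Calling `describe(CHROME_19_UA)` yields the Windows 7 and WOW64 notes, while a string such as `Wget/1.12 (linux-gnu)` is flagged as a command-line tool.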
These user agents were created by the coding libraries for these scripting languages
In these cases it is Python and Perl
Depending on the environment, this could raise the level of suspicion

This is the JVM being used to crawl a site
Typically not too suspicious in nature
If there are lots of 404 error responses, this could be a scanning attack
This raises the level of suspicion when it is vague
If possible, check with the owner

These are used to index web pages for search engines
Another bot used to index web pages for search engines

Some vendors will dismiss RFC 2616 completely and use the User-Agent string for their own needs
This is a good example to show that the string can be modified to anything
Be mindful that this User-Agent would be suspicious if it was not from a well-known service

Nessus, Qualys, and other vulnerability scanning tools often keep their product name in the user agent
This could be malicious if the scanning host is not approved

These user agents are typically used by SIP vulnerability scanning
The User-Agent string is more vague in recent versions
As with any scanning, make sure you verify the source IP address

Knowing how normal User-Agents are supposed to behave shows what they should not be doing unless specified
What happens if there is a deviation from the normal? Does it mean it is bad? Not really.
Look at these examples to see what stands out in comparison to the other User-Agent strings
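The categories above (scripting libraries, the Java runtime, command-line tools, crawlers, scanners) and the "lots of 404 responses" heuristic can be sketched as a simple log triage helper. The substrings in `CATEGORIES` are assumptions based on commonly seen default identifiers, not the strings from the original slides, and the thresholds in `noisy_agents` are arbitrary starting points. Records are assumed to be `(user_agent, status_code)` pairs already extracted from a log.

```python
from collections import Counter

# Assumed identifying substrings, matched case-insensitively.
CATEGORIES = [
    ("scripting library", ("python-urllib", "python-requests", "libwww-perl")),
    ("java runtime", ("java/",)),
    ("command-line tool", ("curl/", "wget/")),
    ("search crawler", ("googlebot", "bingbot")),
    ("vulnerability scanner", ("nessus", "qualys", "friendly-scanner")),
]


def categorize(user_agent):
    """Return the first matching category label, or 'unclassified'."""
    lowered = user_agent.lower()
    for label, needles in CATEGORIES:
        if any(needle in lowered for needle in needles):
            return label
    return "unclassified"


def noisy_agents(records, min_requests=20, min_404_ratio=0.5):
    """List User-Agents whose requests mostly end in 404 (possible scanning)."""
    total = Counter()
    not_found = Counter()
    for user_agent, status in records:
        total[user_agent] += 1
        if status == 404:
            not_found[user_agent] += 1
    return sorted(
        ua
        for ua, count in total.items()
        if count >= min_requests and not_found[ua] / count >= min_404_ratio
    )
```

A match only labels the claimed identity. As noted above, the source IP address still has to be verified, because any client can send a crawler's or scanner's string.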
This could mean numerous things but does raise the level of suspicion
It is missing many other HTTP header values

This User-Agent includes JavaScript, which is highly suspicious
User-Agent strings are written to a log file on the remote machine
If a successful exploit occurs, that string can be called from the log file and used to aid the attacker

Other known malicious tags in User-Agent strings:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; AntivirXP08; .NET CLR 1.1.4322)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; PeoplePal 7.0; .NET CLR 2.0.50727)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
(AntivirXP08 is a form of fake antivirus and is malicious. PeoplePal and FunWebProducts are adware/spyware, which sends data back to a remote host.)

After knowing what is good, we can see what looks anomalous and suspicious. These steps will help you understand what to look for and find the low-hanging fruit.
Good malware authors will deceive by abiding by the rules
◦ They want to avoid detection by blending in with other normal traffic
◦ The malware could be used to download a special version of malware

Analyze User Agent Strings
◦ http://www.useragentstring.com/index.php
◦ Copy and paste a user agent string to break it down
List of User-Agent Strings
◦ http://www.useragentstring.com/pages/useragentstring.php
◦ Summaries of what each User-Agent string means
Microsoft.com
◦ http://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx
◦ Information on how Microsoft’s user agent works

Be familiar with the current browsers that are being used in the wild
◦ Common: Opera, Firefox, Chrome
◦ Uncommon: Lynx, Gecko, AOL
Note the difference between mobile, desktop, and command-line User-Agents
◦ Mobile: Fennec, Blackberry, Android
◦ CLI: Curl, Wget, BinGet

User-Agent strings can contain useful information to determine a browser or system
User-Agents can be visibly suspicious
A User-Agent alone is *not* enough to determine if something is malware, but it may be important during a pivot in an investigation
Knowing how User-Agents function can help determine how certain malware is delivered
If you don’t understand a value, google it ;-)
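The visibly suspicious cases discussed earlier (an empty value, embedded JavaScript, and the known malicious tags) can be checked mechanically as a first pass. This is a minimal sketch: the function name `suspicion_flags` is hypothetical, the script patterns in `SCRIPT_RE` are assumptions that cover only a few obvious forms, and the tag list contains just the three tags named above. An empty result does not mean the client is benign.

```python
import re

# Tags named earlier in this document, with the meaning given there.
KNOWN_BAD_TAGS = {
    "AntivirXP08": "fake antivirus",
    "PeoplePal": "adware/spyware",
    "FunWebProducts": "adware/spyware",
}

# Assumed patterns for code embedded in a header that will be written to a log.
SCRIPT_RE = re.compile(r"<\s*script|javascript:|<\?php", re.IGNORECASE)


def suspicion_flags(user_agent):
    """Return a list of reasons a User-Agent deserves a closer look."""
    if user_agent is None or not user_agent.strip():
        return ["missing or empty User-Agent"]
    flags = []
    if SCRIPT_RE.search(user_agent):
        flags.append("embedded script or code")
    lowered = user_agent.lower()
    for tag, meaning in KNOWN_BAD_TAGS.items():
        if tag.lower() in lowered:
            flags.append(f"known tag {tag} ({meaning})")
    # Legitimate strings are normally printable ASCII.
    if any(ord(ch) < 32 or ord(ch) > 126 for ch in user_agent):
        flags.append("non-printable or non-ASCII characters")
    return flags
```

Each flag is a lead to correlate against other evidence, in line with the point that the User-Agent alone is not enough to determine whether something is malware.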