Research on Malware and its Analysis

Alex Finkelstein, Kevin Hao, Dom Amos, Josh Suess, Mike Hite

The following report outlines a brief history of malware development from the early 1970s through 2010 and discusses notable malware of the last five years. It then covers the types of malware and the propagation techniques they use, along with prevention techniques for standard users and for enterprises. The final section discusses creating a safe environment for malware analysis, techniques for analyzing malware samples, and the difference between static and dynamic analysis.

History of Malware Development

1971 – The “Creeper Virus” is created. An infected computer displays a message daring the user to “capture the creeper.” It was an experiment and not meant to be malicious, but it foreshadowed a future of malicious attacks. The “Reaper” was created to find and “destroy” the Creeper and can be considered one of the first anti-virus programs.
1978 – The first “Trojan,” known as ANIMAL, is released. This Trojan did not destroy a system, but it spread to other computers through copies of itself that it created while the user played a game.
1981 – “Elk Cloner” is developed for the Apple II. It spreads through floppy disks and displays a poem to the user.
1983 – Frederick Cohen coins the term “virus” to describe a computer program capable of self-replicating.
1986 – The first virus for the IBM PC is released.
1987 – The “Jerusalem virus” is released. It is designed to destroy files every time the date is Friday the 13th, making it one of the first time-triggered viruses, a design that has reappeared many times since.
1988 – The “Morris Worm” is created and becomes the first worm to spread extensively through the Internet.
1992 – The “Michelangelo virus” threatens to wipe machines around the world on March 6th and creates a media frenzy. The actual damage was minimal, but the scare raised concern among the general public.
1994 – The Canadian virus “MONKEY” uses self-concealment to evade detection.
1999 – Significantly more advanced malware is released, including the first email virus (the “Happy99 virus”), the “Melissa worm,” and the “Kak worm.” These spread quickly through the Microsoft environments used by many Internet users.
2000 – A 15-year-old Canadian boy crashes Yahoo.com with a DDoS attack. This is even more significant because Yahoo was the number one search engine at the time.
2000 – The “ILOVEYOU” VBScript worm infects millions of Windows machines within hours of being released.
2001 – Worms like “Nimda” are released and take advantage of backdoors created by earlier worms.
2004 – “Santy” becomes the first webworm. It spreads through phpBB and uses Google to find new targets.
2007 – A deliberate DDoS attack on Estonia crashes the prime minister’s site as well as the sites of other organizations, including government bodies, schools, and banks.
2008 – “Conficker” becomes one of the most notorious and widespread pieces of malware ever created. It infects around 10 million Microsoft Windows systems, including government and military machines. This event heightens public concern about and awareness of the need for network security.
2008 – 2009 – The number of “scareware” programs increases rapidly. These programs appear to be free anti-malware applications, but are actually a form of malware.
2010 – “Stuxnet” allegedly targets Iranian nuclear facilities and is widely considered to be the most advanced malware ever created.

The dates above mark significant advances in the field of malware and show some interesting trends. In its earliest stages, malware was developed largely for research purposes and at a much slower rate. Over the past decade we see swift development and deployment, and exponentially more victims. This makes sense when we consider how few people owned a computer in the 1980s compared to the billions of systems that exist today.
We also see a change in the motive of attackers over time. The earliest forms of malware appear to have been written to prove that it could be done. They caused minor inconveniences for users, but did not render their systems useless. More recent malware shows much more malicious intent. Programs are designed to delete all of a user’s files, to deny service completely, or even to access military weapons. These attacks have financial, political, and sinister motives and are far more technically advanced. The evolution of motive appears to run from fun and research, to financial gain, to espionage and warfare.

Radware. Web. 15 Sep. 2014.

Infamous Malware (last 5 years)

For each example below we consider its characteristics, its impact, how it infected users’ computers, and what it has in common with other attacks.

Heartbleed (2014)
The Heartbleed vulnerability was a flaw in the OpenSSL cryptography library, which is widely used to implement the TLS protocol. It allowed theft of affected servers’ private keys and of users’ session cookies and passwords. Around the time of its discovery, an estimated 17% of all secure web servers certified by trusted authorities, approximately half a million servers, were vulnerable. Unlike the other examples here, Heartbleed is a vulnerability in software used all over the Internet rather than a piece of malware, so a normal user could not avoid becoming a victim if their information was stored on a vulnerable server. The commonality we see is that Heartbleed took advantage of an existing weakness.

Target Credit Card Breach (2013)
In the days before Thanksgiving 2013, someone installed malware on Target’s security and payment systems that was designed to steal every credit card used at any of the company’s 1,797 stores in the United States. This was the largest breach in United States retail history until just recently, when Home Depot was compromised.
Approximately 40 million card accounts and the personal data of over 70 million customers were stolen in the breach. Even a year later Target is still feeling the effects, as its stock value has consistently dropped. The malware was installed on Target’s point-of-sale (POS) machines days before Thanksgiving. The twist is that Target had spent $1.6 million on a highly advanced anti-malware system from FireEye just six months before the heist. This system could have caught and removed the malware with no human interaction, but that capability was turned off. The financial motive appears yet again in this attack. Also notable is that prevention was possible in this case; the breach succeeded because of poor communication and mistakes made by security personnel. The attackers went after the POS system, which was undoubtedly not being monitored as closely as Target’s computers and servers.

Stuxnet (2010)
Stuxnet is a Windows worm approximately 1000% larger than a typical worm. It infects a system, hides itself with a rootkit, and checks whether the system is connected to a “Siemens Simatic factory system.” If the connection is found, the worm changes the commands sent from the Windows computer to the PLC (programmable logic controller). F-Secure Labs estimates that the worm would have taken more than 10 man-years of work to complete. Its complexity, and the fact that it impairs the ability of a centrifuge to enrich uranium with no monetary gain, suggest that it was developed by a government (possibly the United States and Israel). The worm damaged Iran’s centrifuges and delayed its uranium enrichment efforts. It most likely spread through USB devices. This attack was uncommon in having a specific military, espionage, or sabotage intent, and it was far more advanced than any worm created before it.

CryptoLocker (2013)
CryptoLocker is a Trojan that targets computers running Microsoft Windows. It encrypts files on a user’s hard drive, then prompts the user to pay a ransom in order to receive the decryption key.
This is often described as the first true “ransomware,” although earlier extortion malware existed. In December 2013 the bitcoin addresses associated with CryptoLocker showed movements of approximately 42,000 bitcoins, worth about $27 million at the time. Calculations made after the shutdown suggest that the operator extorted around $3 million in total. On top of this, tens of thousands of users lost all of their files. CryptoLocker propagated through infected email attachments and through an existing botnet. Email and botnets are common distribution techniques because they infect the most users in the least time. We also see the financial motive, a recurring theme in recent attacks, with bitcoin as the preferred method of payment due to its anonymity.

SpyEye (2009 - 2011)
SpyEye is a Trojan horse that steals money from bank accounts while simultaneously creating fake statements to show that the money is still there. It has infected more than 1.4 million computers. The main developer, a Russian named Aleksandr Andreevich Panin, is believed to have sold it to at least 150 clients, who used it to set up command-and-control (C2) servers. These clients compromised more than 10,000 bank accounts. One of the clients is reported to have made over $3.2 million in a six-month period. SpyEye is sold as a “kit” to cybercriminals, who receive a graphical interface for setting up “drop zones,” which are servers that receive stolen online banking credentials. The kit also includes configuration files customized for attacking most online banking websites. These allow the attacker to inject extra fields into a bank’s web page that ask for other login information and passwords. The fields appear to be part of the legitimate site, but send all of the information to the drop zone. Again we see the financial motive: the developer built the malware to sell to criminals for profit, and the criminals used it to steal money from unsuspecting victims.

Abrams, Lawrence. bleepingcomputers. 14 Oct.
2013. 17 Sep. 2014.
FBI. Atlanta Division. 28 Jan. 2014. 22 Sep. 2014.
Kirk, Jeremy. pcworld. IDG News Service. 14 Oct. 2011. 22 Sep. 2014.
NationalInterest. Center for the National Interest. 2014. 16 Sep. 2014.
Rainie, Lee; Duggan, Maeve. PewResearch Internet Project. Heartbleed’s Impact. 30 April 2014. 22 Sep. 2014.
Sandra. Safe&Savvy. F-Secure, WordPress. 2014. 16 Sep. 2014.
Smith, Chris. BGR. 13 Mar. 2014. 22 Sep. 2014.

Malware Propagation Techniques

Virus – Replicates by attaching its program instructions to an ordinary host program or document, so that the virus’ instructions are executed when the host program executes. Examples include file, boot sector, email, RFID, and macro viruses.
Network Worms – Self-propagating programs that spread over a network, typically the Internet, but that can also propagate over a private network that is not connected to the Internet. Unlike viruses, they do not depend on other programs or user actions for replication or execution. Examples include email, IM, IRC, Internet, and P2P worms.
Trojan Horse and Spyware – A destructive program that appears harmless and seems to serve some function, thus tricking the user into downloading or executing it. Examples include backdoors, data collectors, downloaders, rootkits, bots, and tracking cookies.
Blended Attacks – These combine attack vectors, for example by using a worm as the delivery mechanism for another type of malware.
Embedded Malicious Code – A logic bomb or time bomb that causes harm to the system once the code is executed.
Crimeware – Clickers, session hijackers, and email redirectors.
DoS and DDoS Tools – Flood the system with requests or pings, rendering it useless for any other purpose while it deals with the large amount of incoming traffic.
Malware Constructors – Tools designed to create malware.
Exploits – Take advantage of known vulnerabilities in a system.
Traditional Propagation Techniques

Social Engineering – The oldest and most effective method. Uses stories and ambiguous filenames to entice victims into clicking or taking some other action that begins the infection process.
File Execution – A straightforward method and the foundation for all malware. Popular file types such as .flv, .doc, .ppt, .xls, .exe, .pdf, and .bat are delivered through social engineering, P2P networking, file sharing, email, or memory device transfer, and contain malware that is then executed.

Metamorphic Propagation Techniques

Metamorphic Malware – Changes as it reproduces or propagates, making it difficult to identify with signature-based antivirus or software removal tools, but does not completely alter its code.
Polymorphism – Self-replicating malware that changes its structure from the original. It is able to create an unlimited number of new decryptors, each of which can use a different encryption method, making detection extremely difficult.
Oligomorphic – The “poor man’s polymorphic engine.” Selects a decryptor from a fixed set of predefined alternatives. It cannot change the encrypted base code.

Obfuscation Propagation Techniques

Archivers – ZIP, RAR, CAB, and TAR archives that are unpacked and installed on the victim’s host. These are easily detected by modern antivirus scanners.
Encryptors – The core code is encrypted and compressed, making it hard to analyze. Recent implementations use public key encryption.
Packers – An encryption module used to obfuscate the main body of code that carries out the true functionality of the malware. Many packers are available, both publicly and privately. Packed code can be very difficult to detect and analyze.
Network Encoding – Sneaks past boundary protection systems through an HTTP or HTTPS channel.

DDNS and Fast Flux Propagation Techniques

Dynamic Domain Name Services (DDNS) – DNS where the domain name to IP resolution can be updated in real time.
The IP address of the compromised host could be anywhere and could move at any moment.
Fast Flux – Used by various botnets, malware, and phishing schemes. It delivers content and command/control through a constantly changing network of proxied compromised hosts. It is much faster and harder to detect than DDNS.
Single-Flux – Registers and deregisters many addresses under the DNS address record for a single domain name, producing a fluctuating list of destination addresses that can have thousands of entries.
Double-Flux – Similar to single-flux, but the rotating hosts are also name servers that register and deregister the NS records for the DNS zone. It is more difficult to implement.

Malware Prevention Techniques

The following malware prevention policy considerations are outlined in NIST SP 800-83 and are intended for organizational compliance:

Requiring the scanning of media from outside of the organization for malware before they can be used.
Requiring that email file attachments, including compressed files, be saved to local drives or media and scanned before they are opened.
Forbidding the sending or receipt of certain types of files (.exe, for example) via email, and allowing certain additional file types to be blocked for a period of time in response to an impending malware threat.
Restricting or forbidding the use of unnecessary software, such as user applications that are often used to transfer malware, and services that are not needed or that duplicate the organization-provided equivalents and might contain additional vulnerabilities that could be exploited by malware.
Restricting the use of administrator-level privileges by users, which helps to limit the privileges available to malware introduced to systems by users.
Requiring that systems be kept up to date with OS and application upgrades and patches.
Restricting the use of removable media, particularly on systems that are at high risk of infection, such as publicly accessible kiosks.
Specifying which types of preventive software are required for each type of system and application, and listing the high-level requirements for configuring and maintaining the software.
Permitting access to other networks only through organization-approved and secured mechanisms.
Requiring firewall configuration changes to be approved through a formal process.
Specifying which types of mobile code may be used from various sources.
Restricting the use of mobile devices on trusted networks.

For standard users these guidelines may seem too strict, but NIST is the U.S. government agency that publishes security standards and guidelines for networks and organizations, so its recommendations are a sound basis for prevention. Prevention steps for home users include the following:

Install a firewall
Install anti-virus software
Keep the system updated and patched
Back up data frequently in case of a malware attack
Use a secure browser
Use a secure email client
Activate real-time spyware protection
Use a HOSTS file

A further prevention technique to consider is being vendor agnostic. By this we mean that the company whose firewall you use should not be the same company whose anti-virus and spyware protection you use. Any weakness that a company has is likely to be present in all of its products, so using software from different vendors increases overall effectiveness and protects against a wider range of potential attacks.

Bugeja, Joseph. Malware and Modern Propagation Techniques. 1 Dec. 2013. 23 Sep. 2014.
Mell, Peter; Kent, Karen; Nusbaum, Joseph. Guide to Malware Incident Prevention and Handling. NIST SP 800-83. Nov. 2005. 23 Sep. 2014.
Shanmuga. Ten Steps to Malware Prevention. MalwareHelp. 23 Sep. 2014.

Analysis Techniques – how to analyze samples

Before discussing techniques to analyze malware it is important to understand why we would want to do this.
Malware analysis takes time and resources, sometimes a significant amount of each, so it is important to consider the reasons behind it. Some potential reasons for analysis include:

Assessing damage from an intrusion.
Discovering and cataloguing indicators of compromise to compare against other potentially compromised machines.
Determining the sophistication level of the malware author.
Identifying which vulnerability was exploited.
Identifying the intruder or responsible party.
Furthering understanding.

After we have considered these reasons and identified our purpose, we must also consider what we are dealing with. Because malware is designed to cause harm, we do not want to analyze it on our own computer in a normal environment, as the side effects on our own system could be catastrophic. There are multiple solutions to this problem, outlined below.

First, we can create an “analysis lab” by placing computers on their own physically partitioned network and giving them an easily restorable, standardized software build. This way we can test malware, let it destroy the system, and then simply restore from a backup image.

Another solution, which is easier and less expensive, is to use virtual machines to create a simulated lab environment. As in the first solution, this allows us to infect the system with malware and then revert to an earlier build. It is not as safe as the first solution, however, and there are several things to consider when implementing it. First, virtualization software is not perfect and could allow information to “leak” from the virtual machine onto the host machine in an unexpected way. Second, malicious code can often detect that it is running in a virtual environment and may modify its behavior accordingly. Finally, a 0-day worm that can exploit a listening service on the host OS could escape the virtual sandbox.
These are important considerations when dealing with particularly nasty malware.

After deciding between a virtual lab and a physical lab, the next item to consider is the level of network access the machines will have. Using a machine connected to the Internet is faster and easier, but has several drawbacks:

An attacker may change their behavior if they see connections from a machine they did not infect.
If you allow malware to connect to a controlling server, you could be entering a real-time battle with a human for control of your machine.
The external IP address used by your machine may become the target of additional attacks.
If the malware spreads automatically or conducts DDoS attacks, you may be unintentionally attacking others.

Considering the gravity of these possibilities, it may be better to use a closed network with virtualized services, but this method requires more time and significantly more effort to set up.

There are two main classifications of analysis: static and dynamic. Static analysis is the process of inspecting the code and examining its external features. The code does not actually run, which makes static analysis safer than dynamic analysis. Dynamic analysis involves running the code and examining its actions. Recommended steps for both static and dynamic analysis are discussed below.

Static Analysis

File Fingerprinting
The first step is to create a cryptographic hash value for every file being investigated. The hashes make it possible to verify later whether the program has been modified or has modified itself.

Virus Scanning
If the file being investigated is a known piece of malware, it may be recognized by anti-virus software. If it is, the vendor will typically have posted an analysis of the malware, which can assist in your own analysis.
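The file fingerprinting step above can be sketched in a few lines of Python using the standard hashlib module. This is only an illustrative sketch, not a tool referenced by the sources: the function names `fingerprint` and `has_changed`, the choice of MD5, SHA-1, and SHA-256, and the chunk size are all assumptions made for the example. The sample is only read as bytes and never executed, which keeps the step within static analysis.

```python
import hashlib


def fingerprint(path, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Return a dict mapping each algorithm name to the file's hex digest.

    The function name, algorithm list, and chunk size are illustrative choices.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        # Read in chunks so that large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def has_changed(path, baseline):
    """Return True if the file's current SHA-256 differs from the recorded baseline."""
    current = fingerprint(path, algorithms=("sha256",))
    return current["sha256"] != baseline["sha256"]
```

A typical use would be to call `fingerprint` on the sample before any other step, record the result in the analysis notes, and call `has_changed` again after dynamic analysis to see whether the sample rewrote itself. The recorded hashes can also be useful when searching for existing vendor write-ups of a known sample.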
Packer Detection
A complicating factor when analyzing malware is that many programs modify an executable file to obfuscate its contents and hide the actual program logic from a reverse engineer. Programs that modify other program files to compress or disguise their content are referred to as “executable packers.” A packed executable runs as it did before, but looks much different from a static analysis perspective. Once a program has been packed, the original logic and other metadata are difficult to recover through static analysis. There are free tools, such as PEiD, that assist in determining whether a program has been packed.

Examine Strings
There is no instruction manual to walk through malicious code, but much can be learned from the strings of readable text embedded within the program. Be aware that the author of the malicious code may have embedded information deliberately to mislead an analyst.

Examining PE-Formatted Files
Portable Executable (PE) is the format used by executable files on Windows systems. PE files contain useful information that can be extracted by examining their metadata, including the date and time of compilation, the functions imported and exported by the program, icons, menus, version information, and strings embedded in resources. Again, there are many tools available to aid in this task.

Disassembly
After completing the steps above, it is common practice to disassemble the file and analyze the assembly code instructions that make up the program. IDA Pro is widely regarded as the industry-standard tool for this.

Dynamic Analysis

For dynamic analysis you must have a secured, safe environment to avoid infecting your own system or others’. Malware running on a Windows machine will interact with the file system, the registry, other processes, and the network, so it is important to be able to capture information from those sources.
Process Monitor and Wireshark are two widely used tools for collecting such information.

Process Monitor
Process Monitor is a SysInternals tool that allows users to monitor all file, registry, and process activity on Windows systems. It installs a device driver that captures information about activity happening inside the kernel and presents it to the user in a graphical interface. The user must know what they are looking for and filter accordingly.

Wireshark
Wireshark is a protocol analyzer that captures, analyzes, and filters network traffic. Its main limitation is that it does not know which process generated each packet of captured network data, making it difficult to determine whether a packet was generated by the malicious program. Similar tools, such as Port Explorer, do offer this feature.

Using these two tools, or similar ones, in conjunction gives a good picture of how the malware works. These techniques will answer the most important questions about a piece of malware. However, they will not be sufficient for analyzing full-featured backdoors or botnet clients that may use custom encoding methods, complicated command sets, and multiple layers of obfuscated or encrypted data. For these more complicated pieces of malware the best approach is static analysis with IDA and dynamic analysis with a debugger like OllyDbg or WinDbg.

Egele, Manuel; Scholte, Theodoor; Kirda, Engin; Kruegel, Christopher. A Survey on Automated Dynamic Malware Analysis Techniques and Tools. SBA. 1 Oct. 2014.
Kendall, Kris. Practical Malware Analysis. Mandiant. 1 Oct. 2014.