Polymorphism in Computer Viruses

Term Paper, Spring 2003
CS265 Security Engineering
Puneet Mishra

Table of Contents
I. Introduction
II. History
III. Polymorphism Techniques and Virus Detection
IV. Conclusion
V. References

I. Introduction

“A computer virus is a program with malicious intent to cause abnormal disruption of the operation of a computer”.

“Polymorphism: The occurrence of different forms, stages, or types in individual organisms or in organisms of the same species, independent of sexual variations”.[1]

Combined, the two definitions describe a potentially very dangerous attack on computer systems. A polymorphic virus is one that replicates itself to produce working clones, each with different code, in order to avoid detection by anti-virus applications.

Polymorphism can be treated as one of the key ideas behind Darwin’s theory of “Survival of the Fittest” in nature. Most systems follow the conventional laws of nature, and these laws overlap across various fields of science. Thus, biology and computer science intersect in the area of polymorphic computer viruses.

Polymorphism is totally absent from the traditional computer applications that act as hosts to computer viruses. In nature, mutation ensures that in most cases at least some hosts of a viral or bacterial organism survive the disease and carry the species into the next generation with stronger genes, while the bacterium or virus may evolve further to attack even more effectively in the next generation and keep its own species alive. This seesaw is a constant process in nature. The current style of software development, aligned with “good” software engineering practices, runs against this natural process that ensures the continuity of a species. Developers are asked to build their code on pre-existing frameworks.
If these frameworks contain flaws, every future program that inherits features from them will contain the same flaws. In biological terms, this equates to a weak gene in a parent making all future generations susceptible to a disorder. Virus writers have accepted and exploited this fact, while programmers have ignored it in order to stay in line with writing good, readable code.

In an inactive state, polymorphic computer viruses defeat detection by simple pattern-matching scanners. The code of a polymorphic virus incorporates randomness[3], code obfuscation[4] and complexity to counter these scanners, and it produces varied copies of itself.

The following sections discuss the history of polymorphism in computer viruses, the techniques employed to achieve polymorphism, and some of the countermeasures that can be used to contain polymorphic viruses.

II. History

The first polymorphic virus, named Chameleon, came into existence in 1990. The first real impact of polymorphic viruses was felt with the virus called Tequila[6]. It is a memory-resident master boot sector (partition table) and .EXE file infector that uses a complex encryption method and garbling to avoid detection. This led to the advent of a new kind of code-transforming application called the polymorphic generator. The best known is the Mutation Engine by a Bulgarian virus writer called “Dark Avenger”. Other mutation engines that have appeared are VME (Visible Mutation Engine), TPE (TridenT Polymorphic Engine), NED (Nuke Encryption Device) and DAME (Dark Angel's Multiple Encryptor)[2].

III. Polymorphism Techniques and Virus Detection

This section can be treated like a game of hide and seek played between virus and anti-virus software.
The virus tries its best to hide from the anti-virus software using the camouflaging techniques (polymorphism) discussed below, while the anti-virus software looks for the virus and tries to strip away the masks to reveal its true identity.

One way of making polymorphic viruses is to use polymorphic generators. Polymorphic virus generators are advertised on bulletin boards and distributed as object modules with complete documentation and examples. A polymorphic generator is basically an object module: to obtain a polymorphic mutant from a conventional non-encrypting virus, the object files of the generator and of the virus are linked together. The virus then simply calls the polymorphic generator from its own code to generate more distinct copies of itself.

Polymorphic viruses can be classified[2] by the complexity of the code in the decryption part of the virus. They were first classified by Dr. Alan Solomon, and the classification was enhanced by Vesselin Bontchev. Its drawback is that it is centered on signature-based scanners.

Level 1: To generate a polymorphic virus, one scheme is chosen from a fixed set of encryption/decryption schemes. An instance of the virus carries one of these schemes in plaintext form at any given time (e.g. the Whale virus). A signature-driven scanner, that is, a scanner that looks for the code corresponding to the scheme, has to use several signatures (one for each possible encryption method) to identify a virus of this kind reliably. Such viruses are called "semi-polymorphic" or "oligomorphic".

Level 2: The virus decryptor contains one or several constant instructions; the rest of it is changeable. These viruses can be detected by using signatures with wildcards.

Level 3: The virus decryptor contains unused functions or junk instructions such as NOP, CLI, and STI. Signature-based scanners can be used once these junk instructions have been removed.
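
The scanner-side ideas described for Level 2 and Level 3 can be illustrated with a minimal Python sketch. It is not taken from any real scanner. It assumes a toy byte buffer in which every byte is a one-byte instruction, so junk can be dropped byte by byte; a real scanner would have to disassemble the code first, since the same byte values can also occur as operands. The signature syntax (hex bytes with "??" as a wildcard), the function names, and the sample bytes are all hypothetical.

```python
import re

# Single-byte x86 opcodes for the junk instructions named in the text:
# NOP (0x90), CLI (0xFA), STI (0xFB).
JUNK_OPCODES = {0x90, 0xFA, 0xFB}


def strip_junk(code: bytes) -> bytes:
    """Level 3 preprocessing: drop junk opcodes.

    Toy assumption: every byte in `code` is a one-byte instruction.
    """
    return bytes(b for b in code if b not in JUNK_OPCODES)


def compile_signature(pattern: str) -> re.Pattern:
    """Level 2: turn a hex signature such as 'B8 ?? ?? 31 C0' into a bytes regex.

    '??' is a hypothetical wildcard token that matches any single byte.
    """
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL lets '.' also match the byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)


def scan(code: bytes, signature: re.Pattern) -> bool:
    """Remove junk first, then look for the wildcard signature."""
    return signature.search(strip_junk(code)) is not None


# Hypothetical signature: two constant parts around two changeable bytes.
sig = compile_signature("B8 ?? ?? 31 C0")

# Variant A differs only in the changeable bytes.
sample_a = bytes.fromhex("b8 01 02 31 c0")
# Variant B is additionally padded with NOP/CLI/STI junk.
sample_b = bytes.fromhex("90 b8 fa 07 08 90 fb 31 90 c0")

print(sig.search(sample_a) is not None)  # True: the wildcard signature is enough
print(sig.search(sample_b) is not None)  # False: the junk breaks the raw match
print(scan(sample_b, sig))               # True: matches once junk is stripped
```

The wildcard signature alone handles the changeable bytes of the first sample, while the second sample is only caught after the junk bytes have been removed, which is the extra preprocessing step a Level 3 virus forces on the scanner.
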
Hence, the code needs to be preprocessed before being fed into the scanner.

Level 4: The virus decryptor uses interchangeable instructions and changes their order (instruction mixing). The decryption algorithm remains unchanged. The scanner has to look for algorithms rather than for patterns in the code.

Level 5: All the above-mentioned techniques are used. The decryption algorithm is changeable, and repeated encryption of the virus code and even partial encryption of the decryptor code are possible. A virus of this level is impossible to detect using a signature-based scanner because of the enormous number of possible permutations and the complexity of the techniques used.

Level 6: Permutating viruses. The main code of the virus is subject to change: it is divided into blocks that are positioned in random order while infecting. Despite that, the virus continues to work. Such viruses may be unencrypted. This again is impossible to detect with a signature-based technique, since no matter how many search strings (signatures) are added to the scanner, the virus will modify itself to avoid detection.

Anti-emulation is another technique used by viruses to prevent detection. Emulation automates the decryption of a virus, so a virus can employ some mechanism that prevents this automated decryption by virus scanners.

Anti-virus writers have resorted to running potential viruses in a virtual computer environment. Since a virus can cause havoc if run in an unprotected environment, this sort of sandboxing lets the anti-virus software test quarantined files safely. It is much stronger than pattern-matching scanners but may take too long to execute if the virus uses a strong decryption technique.

Another method of countering viruses in general is software uniqueness. Since some viruses attach themselves to pre-existing computer applications, we can prevent this by generating unique instances of the application.
This prevents the virus from attaching itself at a predetermined position in the code, because each instance of the application has unique object code, and so prevents a single common virus from being written for the entire application.

IV. Conclusion

Polymorphic viruses pose a significant threat to computer systems today. Although most anti-virus scanners can detect and remove viruses produced by existing polymorphic generators, they can still be susceptible to new and more potent polymorphic virus generators. A new form of virus that has started to emerge is the metamorphic virus[5]. Metamorphic viruses resemble polymorphic ones; the main difference is that they remove the old junk code, insert new junk code and recompile, giving the object file a totally different look.

Thus it can be seen that virus writers have kept anti-virus writers busy by designing and incorporating interesting and effective ideas that keep detection a challenging process.

V. References

[1] Dictionary.com at [http://dictionary.reference.com/search?q=polymorphism]
[2] Computer Viruses by Eugene Kaspersky at [http://www.viruslist.com/eng/viruslistbooks.html?id=50]
[3] Design Ideas for a Future Computer Virus by François-René Rideau at [http://fare.tunes.org/articles/virus_design.html]
[4] A Taxonomy of Obfuscating Transformations by Collberg, Thomborson, and Low; Tech. Report #148, Dept. of Comp. Sc., University of Auckland, 92019, New Zealand
[5] Computer Security Update; Toronto Star Fast Forward column for June 21 and 28, 2001 at [http://www.computerwriter.com/Star/2001/jul/computer_security_update.htm]
[6] The Living Polymorphic Virus or Contacts for Nearsighted Watchmakers by Matt Waddell, STS 129, Prof. Michael John Gorman at [http://www.stanford.edu/group/STS/techne/Fall2002/waddell1.html]