UNIVERSITY OF ARIZONA MIS DEPARTMENT

Newborn and Polymorphic P2P Bot Detection – Design Science Approach
Research Proposal - Initial

University of Arizona
MIS 696A
Dr. Jay Nunamaker Jr.
John Gastreich
10/13/2010

1st: This was my initial research proposal, but I decided to change it completely to another research question and approach. Therefore, this paper does not match my slides.
2nd: Changing the research question was a good exercise for me because I was able to improve upon my first attempt. The second attempt can be seen in the slides.
3rd: I would like to change my research question completely again to get more experience and build on what I have gained from this exercise.

I. Introduction

The research problem is that of understanding black hats in order to better design countermeasures that “should lead to lessening the damage caused by criminals” (Mahmood, Siponen, Straub, Rao, & Raghu, 2010). Current literature has emphasized studies that draw on white hats, who are well known to be reluctant to divulge the most critical information security breaches due to confidentiality. This paper proposes a study focused on the efficacy of honeypots in detecting P2P newborn bots and P2P polymorphic bots, which belong to a newer form of botnet.

Botnets are used for a variety of attacks such as denial of service attacks, spam, adware, spyware, click fraud, and cyber scamming (see Appendix). One notorious botnet was the Mariposa Botnet, summarized as follows:

Name: Mariposa Botnet
No. of Bots: 8 – 12 million
Computer-Assisted Crimes: Stole computer users’ credit card and bank account information; launched denial of service attacks; spread viruses
*Source: The FBI (The FBI, 2010)

Empirical data will be gathered to better understand the effective use of honeypots to trap and trace P2P botnets. P2P botnets can survive even when command and control is lost. Figure 1 below illustrates that P2P bots do not need a command center as traditional botnets do. Therefore, there is no central point of failure.
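The absence of a central point of failure can be illustrated with a small connectivity check. The following is only a minimal sketch under stated assumptions: the node names, the star layout for the centralized botnet, and the ring layout for the P2P botnet are hypothetical simplifications chosen for illustration and are not taken from any real botnet.

```python
from collections import deque


def reachable(adjacency, start, removed):
    """Return the nodes reachable from start when the removed nodes are offline."""
    if start in removed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def star_topology(bots, hub="c2"):
    """Centralized layout (assumed): every bot talks only to the hub."""
    adjacency = {hub: set(bots)}
    for bot in bots:
        adjacency[bot] = {hub}
    return adjacency


def ring_topology(bots):
    """Simplified P2P layout (assumed): every bot talks to two neighboring bots."""
    n = len(bots)
    return {bots[i]: {bots[(i - 1) % n], bots[(i + 1) % n]} for i in range(n)}


bots = ["bot1", "bot2", "bot3", "bot4", "bot5"]
centralized = star_topology(bots)
peer_to_peer = ring_topology(bots)

# Take the command center offline: bot1 can no longer reach any other bot.
star_after = reachable(centralized, "bot1", removed={"c2"})

# Take one peer offline: bot1 still reaches every remaining peer.
ring_after = reachable(peer_to_peer, "bot1", removed={"bot3"})

print(sorted(star_after))
print(sorted(ring_after))
```

In this toy model, removing the hub isolates every bot, whereas removing a single peer leaves the remaining peers connected, which is the property that makes P2P botnets harder to disrupt.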
Figure 1. Command & Control vs. P2P Botnet (Wang, Aslam, & Zou, n.d.)

II. The Dependent Variable

Empirical data will be used to determine the efficacy of honeypots in trapping, tracing, and providing information to disrupt peer-to-peer botnets. The study will record and measure the efficacy of the honeypots’ detection techniques against various known and unknown attacks in a simulated laboratory environment. The ability to identify malware at an earlier stage will also be measured.

III. Literature Review

Although honeypots are widely used to monitor botnets (Wang, Aslam, & Zou, n.d.), botnet herders are becoming more and more adept at detecting honeypots to avoid being trapped and traced. In addition, traditional signature-based models do not detect unknown malware. Stateful models, however, do have the ability to capture data on newborn bots (Wurzinger, Bilge, & Holz, 2009). This capability could prove very useful for detecting bots at an early stage.

Even if the command and control of a botnet is lost, P2P bots, such as those in Stuxnet, can still update themselves to the latest version (Stuxnet P2P component, 2010). Some efforts have shown that P2P botnets can be disrupted (Holz, Steiner, Dahl, Biersack, & Freiling, 2008); however, new strains of malware, including P2P botnets, continue to survive in the wild.

IV. Proposed Explanations

In a laboratory environment, independent variables will be put through the model to determine the effectiveness of detection methodologies. Known and unknown bot activity will be used as input to determine the detection ability of various methods. Those inputs will be manipulated to reproduce the various types of activity that known bots perform, such as Internet Relay Chat (IRC) activity. False positives and false negatives will be measured to determine the effectiveness of the various methods.

Some possible hypotheses are:
Newer methodologies are effective at detecting bots and botnets at earlier stages.
Stateful algorithms effectively detect P2P bots in infancy.
Black hat trends are predictable.
Prediction of black hat trends allows for countermeasures to be implemented earlier and more effectively.
Businesses and governments have new opportunities to share data for early detection and disruption of botnets.
Polymorphic P2P bots are vulnerable to detection and disruption.
Non-polymorphic P2P botnets have vulnerabilities that are exploitable.

V. Methodology

Previous methods that attempted to extract knowledge of black hat behavior often used rather unethical means, such as asking subjects to imagine that they were involved in some deviant behavior. However, these types of experiments have limited success in replicating the minds of actual criminals (Garberg & Libkuman, 2009). To avoid this limitation, honeypots will be used to gather data from the tools of the actual criminals: the black hats. Other possible sources of data include the U.S. Securities and Exchange Commission (SEC) and the Computer Emergency Response Team (CERT) at Carnegie Mellon University. Lastly, another possible source of data is honeypots in production at real places of business. This would give insight into the attack methods and trends of malware and black hats, and into the challenges real businesses face.

Another approach, and possibly the most promising one, may be to use the design-science paradigm to create an effective means of using honeypots to detect, trap, and track various botnets, with tools to disrupt their propagation in the early stages of their lives. “The design-science paradigm seeks to create what is effective” (Hevner, 2004).

Participants

One possible method of gathering data would be to contact businesses currently using honeypots as a diversion technique. Data could be gathered on the types of attacks they are encountering. Persuading various organizations to share their data could become a very arduous task. Therefore, a wiser choice may be to focus on one organization as a source of data with which to build a simulation in a laboratory.
The purpose would be to set up an environment similar to an actual business in order to understand the challenges a typical business faces and how P2P botnets could penetrate such businesses. If this method were chosen, a local business with relations to the University of Arizona would be a convenient choice. On the other hand, if a number of businesses were to become participants, criteria would need to be set for choosing them. Criteria could include the business’s industry, size, type of confidential information, and geographic location. Compensation to those businesses would be the shared knowledge produced from the studies, leading to better information security practices.

Apparatus/Materials

If participants were used in the study, a means of gathering the data would be required. Due to the sensitivity of such data, non-disclosure agreements would most likely need to be prepared and signed to outline the protection, storage, and use of the data. A secure server could be set up to allow automatic transfer of data on a regular basis over an encrypted tunnel.

Procedure

Experiments could be conducted on P2P botnet malware in a laboratory setting to provide control and allow for various models and/or detection techniques. Trapping and tracking techniques could also be explored. Experiments will be designed to discover opportunities for new and better methods of P2P botnet detection, trapping, tracking, and/or disruption. The lab should be set up at the University of Arizona to provide ease of access. Any data collected would be stored securely in the lab.

VI. Results

If data from various businesses were collected, the criteria for the types of data to include would need to be clearly planned. Due to the large volume of data that one honeypot can collect, only the most relevant data would need to be kept. Statistical analysis of the data collected could be used to help determine the relevance of that data in the early stages of the study.
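As a rough illustration of such an early-stage analysis, the sketch below tallies how often each type of event appears in a set of honeypot records. It is only a sketch under stated assumptions: the record layout (source address, event type), the event type names, and the addresses (taken from the reserved documentation range) are hypothetical and do not come from any collected data.

```python
from collections import Counter

# Hypothetical honeypot records: (source address, event type).
events = [
    ("192.0.2.5", "port_scan"),
    ("192.0.2.7", "p2p_peer_exchange"),
    ("192.0.2.5", "port_scan"),
    ("192.0.2.9", "binary_download"),
    ("192.0.2.7", "p2p_peer_exchange"),
    ("192.0.2.5", "p2p_peer_exchange"),
]


def summarize(records):
    """Map each event type to (count, share of all records), most frequent first."""
    counts = Counter(event_type for _, event_type in records)
    total = sum(counts.values())
    return {etype: (n, n / total) for etype, n in counts.most_common()}


summary = summarize(events)
for event_type, (count, share) in summary.items():
    print(f"{event_type}: {count} events ({share:.0%})")
```

A summary of this kind could help decide which categories of honeypot data are frequent enough, or relevant enough, to retain for the later stages of the study.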
At first, general descriptive statistics could be compiled to characterize the various types of data collected. The criteria for the data should be examined carefully before contacting any businesses about participating, to eliminate any confusion or uncertainty when managing the relationship with the participants.

VII. Implications

The implications will differ greatly depending on the approach chosen. Using design science to develop a model for early detection of malware based on stateful analysis would result in the creation of new approaches to understanding black hat methods, tools, and trends. This knowledge would lead to a better understanding of methods to disrupt their efforts in earlier stages. However, if the approach of simulating a business after gathering data from businesses were implemented, the realities of a typical business could be examined to provide a better understanding of the challenges businesses face when dealing with malware.

The broader relevance of the research would be to provide insights into black hat methodology and tools in order to lessen the damage caused by botnets and computer-assisted criminal activity.

Appendix

Cyber Scamming

“Claims of Being Stranded Swindle Consumers Out of Thousands of Dollars

07/01/10 - The IC3 continues to receive reports of individuals' e-mail or social networking accounts being compromised and used in a social engineering scam to swindle consumers out of thousands of dollars. Portraying to be the victim, the hacker uses the victim's account to send a notice to their contacts. The notice claims the victim is in immediate need of money due to being robbed of their credit cards, passport, money, and cell phone; leaving them stranded in London or some other location. Some claim they only have a few days to pay their hotel bill and promise to reimburse upon their return home.
A sense of urgency to help their friend/contact may cause the recipient to fail to validate the claim, increasing the likelihood of them falling for this scam. If you receive a similar notice and are not sure it is a scam, you should always verify the information before sending any money. If you have been a victim of this type of scam or any other Cyber crime, you can report it to the IC3 website at www.IC3.gov. The IC3 complaint database links complaints for potential referral to the appropriate law enforcement agency for case consideration. Complaint information is also used to identify emerging trends and patterns” (The FBI, 2010).

Bibliography

Garberg, N. M., & Libkuman, T. M. (2009). Community sentiment and the juvenile offender: should juveniles charged with felony murder be waived into the adult criminal justice system? Behavioral Sciences & the Law, 27(4), 553-575.

Hevner, A. R. (2004). Design Science in Information Systems Research. MIS Quarterly, 28(1), 75-105.

Holz, T., Steiner, M., Dahl, F., Biersack, E., & Freiling, F. (2008). Measurements and Mitigation of Peer-to-Peer-based Botnets: A Case Study on Storm Worm.

Mahmood, M. A., Siponen, M., Straub, D., Rao, H. R., & Raghu, T. S. (2010). Moving Toward Black Hat Research in Information Systems Security: An Editorial Introduction to the Special Issue. MIS Quarterly, 431-433.

Stuxnet P2P component. (2010, September 17). Retrieved October 13, 2010, from Symantec: http://www.symantec.com/connect/blogs/stuxnet-p2p-component

The FBI. (2010, July 28). FBI, Slovenian and Spanish Police Arrest Mariposa Botnet Creator, Operators. Retrieved October 12, 2010, from The FBI: http://www.fbi.gov/news/pressrel/press-releases/fbislovenian-and-spanish-police-arrest-mariposa-botnet-creator-operators

The FBI. (2010, July 1). New E-Scams & Warnings. Retrieved October 12, 2010, from The FBI: http://www.fbi.gov/scams-safety/e-scams/e-scams

Wang, P., Aslam, B., & Zou, C. C. (n.d.). Peer-to-Peer Botnets: The Next Generation of Botnet Attacks. Retrieved October 13, 2010, from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6675&rep=rep1&type=pdf

Wurzinger, P., Bilge, L., & Holz, T. (2009). Automatically Generating Models for Botnet Detection. Lecture Notes in Computer Science.