Analyzing and Profiling Attacker Behavior in Multistage Intrusions

Contents
 Introduction and Background
 Literature Review
 Methodology
 Implementation
 Evaluation
 Contribution
 Conclusion
 References
Introduction
 Advances in technology have brought more sophisticated intrusions, which make network security more challenging.
 Attackers may have different intentions, and each attack may be at a different level.
 Understanding attacker behavior is important for understanding possible risks.
 A government study [7] classifies attackers into nine groups: Amateurs, Criminals, Insiders, Phishers, Nations, Hackers, Terrorists, Bot-network operators, and Spyware/malware authors.
Introduction (Contd..)
 Amateurs: have little technical knowledge and attack for fun.
 Criminals: seek to attack systems for monetary gain. They use spam, phishing, and spyware/malware to commit identity theft and online fraud.
 Phishers: execute phishing schemes in an attempt to steal identities or information for monetary gain.
 Terrorists: seek to destroy, incapacitate, or exploit critical infrastructures in order to threaten national security.
Attacker Groups
 Hackers: break into networks by gaining unauthorized access, which requires a fair amount of skill or computer knowledge.
 Insiders: their knowledge of a target system often allows them to gain unrestricted access to cause damage to the system or to steal system data.
 Nations: use cyber tools as part of their information-gathering and espionage activities.
 Spyware/malware authors: carry out attacks against users by producing and distributing spyware and malware.
 Bot-network operators: use a network (bot-net) of remotely controlled systems to coordinate attacks.
Problems with IDS (Contd..)
 Profiling and predicting attacker intentions is very important for protecting the network accordingly.
 There is a need for an efficient way to identify the type of attacker.
 An IDS such as Snort [8] helps detect single-step intrusions, but not multistage attacks or attacker behavior, due to:
 The huge number of alerts
 The lack of a proper model that can detect multistage attacks
 The lack of a method that can link multistage attacks to attacker behavior
Objective
Develop a system that can
• Detect multistage attacks
• Analyze the attacker behavior by classifying the activity
• Discover the attacker behavior patterns
• Predict and profile the type of attacker based on behavior.
Literature Review
 Multilevel alert clustering and intelligent alert clustering models [2] are well-formed techniques for reducing the number of alerts.
 The complexity of these models could degrade the performance of the system.
 Mathew et al. [1] present a technique for understanding multistage attacks using attack-track-based visualization of heterogeneous event streams.
 They use event correlation based on attack tracks to determine the temporal relationships between heterogeneous events.
Literature Review (Contd..)
 The above approach is useful for understanding the stages of a multistage attack, but not for predicting user behavior.
 A user behavior perception model based on a Markov process [7] was presented for intelligent mobile terminals.
 The model is based on the Markov process and also introduces ideas from machine learning and context-awareness.
 User behavior histories are used to discover users' preferences, and information gathered from users is used to perceive user behavior.
Methodology
 Processing the raw data
 Alert grouping
 Attacker behavior analysis
 Preparation of semi-automatic HMM training
 Training the Hidden Markov Model
 Profiling and predicting attacker behavior
Collection and Generalization of Alerts
 The raw data was provided by ORNL. It was in pcap format.
 The generated alerts have a lot of insignificant information, which needs to be eliminated.
 Essential details in each alert, such as the IP addresses of the source and destination hosts, the alert type, and the classification, are extracted.
[**] [1:2000537:6] ET SCAN NMAP -sS [**]
[Classification: Attempted Information Leak] [Priority: 2]
07/17-09:30:09.298097 192.168.101.66:33966 -> 192.168.101.53:175
TCP TTL:49 TOS:0x0 ID:27814 IpLen:20 DgmLen:44
S* Seq: 0x3C25204F Ack: 0x0 Win: 0x800 TcpLen: 24
TCP Options (1) => MSS: 1460
Generalized alert (fields: alert type, time stamp, source, destination):
(portscan)TCPPortscan, 07/17-10:03:27.114495, 192.168.101.66,3387, 192.168.101.54,4497
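The extraction step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation (the slides report that Java was used). The function name `generalize_alert`, the regular expressions, and the output field names are assumptions made for this example, and the patterns only target the multi-line alert layout shown in the sample.

```python
import re

# Illustrative patterns for the multi-line Snort alert text shown in the slides.
# These are assumptions for this sketch, not the authors' parsing rules.
HEADER_RE = re.compile(r"\[\*\*\]\s*\[\d+:\d+:\d+\]\s*(?P<alert_type>.+?)\s*\[\*\*\]")
CLASS_RE = re.compile(r"\[Classification:\s*(?P<classification>[^\]]+)\]")
FLOW_RE = re.compile(
    r"(?P<timestamp>\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<src_ip>\d+(?:\.\d+){3})(?::(?P<src_port>\d+))?\s*->\s*"
    r"(?P<dst_ip>\d+(?:\.\d+){3})(?::(?P<dst_port>\d+))?"
)


def generalize_alert(raw):
    """Return the essential fields of one raw alert, or None if it does not parse."""
    header = HEADER_RE.search(raw)
    flow = FLOW_RE.search(raw)
    if header is None or flow is None:
        return None
    classification = CLASS_RE.search(raw)
    record = flow.groupdict()  # timestamp, src_ip, src_port, dst_ip, dst_port
    record["alert_type"] = header.group("alert_type")
    record["classification"] = (
        classification.group("classification") if classification else None
    )
    return record


SAMPLE = """[**] [1:2000537:6] ET SCAN NMAP -sS [**]
[Classification: Attempted Information Leak] [Priority: 2]
07/17-09:30:09.298097 192.168.101.66:33966 -> 192.168.101.53:175
TCP TTL:49 TOS:0x0 ID:27814 IpLen:20 DgmLen:44"""

record = generalize_alert(SAMPLE)
print(record["alert_type"], record["timestamp"], record["src_ip"], record["dst_ip"])
```

Everything after the address line (TTL, sequence numbers, TCP options) is simply ignored, which corresponds to discarding the insignificant information.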
Alert Grouping
 Snort [8] generates thousands of alerts each day, many of which may be false alarms. With such a large number of alerts, it is not possible to profile and predict attacker behavior.
 On average, an alert is generated every 2 milliseconds, so the alerts need to be grouped.
 Alerts that come from the same source, target the same destination, have the same purpose (i.e. the same alert type), and are generated within one second of each other are grouped together.
Grouped alert (fields: source, destination, time stamp, alert type, count, behavior code):
192.168.101.56, 192.168.72.1, 07/17-10:59:06, ETSCANNMAP, 70, 50
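The grouping rule can be sketched as follows. This is a hedged illustration rather than the authors' code. The function `group_alerts` and its record layout are hypothetical, and "within one second" is interpreted here as within one second of the previous alert in the same group, which is one possible reading of the rule.

```python
from datetime import datetime, timedelta


def group_alerts(alerts, window=timedelta(seconds=1)):
    """Group (source, destination, alert_type, timestamp) tuples.

    Assumption: an alert joins the latest group with the same source,
    destination, and alert type if it arrives within `window` of that
    group's most recent alert; otherwise it starts a new group.
    """
    groups = []
    latest = {}  # (source, destination, alert_type) -> index of latest group
    for src, dst, alert_type, ts in sorted(alerts, key=lambda a: a[3]):
        key = (src, dst, alert_type)
        idx = latest.get(key)
        if idx is not None and ts - groups[idx]["last_seen"] <= window:
            groups[idx]["count"] += 1
            groups[idx]["last_seen"] = ts
        else:
            latest[key] = len(groups)
            groups.append({
                "source": src,
                "destination": dst,
                "alert_type": alert_type,
                "first_seen": ts,
                "last_seen": ts,
                "count": 1,
            })
    return groups


# Made-up timestamps for illustration (the year is arbitrary).
t0 = datetime(2010, 7, 17, 10, 59, 6)
alerts = [
    ("192.168.101.56", "192.168.72.1", "ETSCANNMAP", t0),
    ("192.168.101.56", "192.168.72.1", "ETSCANNMAP", t0 + timedelta(milliseconds=400)),
    ("192.168.101.56", "192.168.72.1", "ICMP PING", t0 + timedelta(milliseconds=500)),
    ("192.168.101.56", "192.168.72.1", "ETSCANNMAP", t0 + timedelta(seconds=3)),
]
grouped = group_alerts(alerts)
for g in grouped:
    print(g["source"], g["destination"], g["alert_type"], g["count"])
```

The first two ETSCANNMAP alerts collapse into one group with a count of 2, while the one arriving three seconds later starts a new group.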
Attacker Behavior Analysis
 Based on a government study in 2010 [7], attackers are divided into different groups such as amateurs, criminals, insiders, terrorists, and hackers.
 To predict attacker behavior, we use a Hidden Markov Model (HMM) [9], a machine learning technique, and analyze the behavior of these attackers by defining rules for each type of attacker.
Hidden Markov Model
 λ = (A, B, π, N), where N is the number of states
 π: initial state probabilities
 A: state transition probabilities
 B: emission (observation) probabilities
Attacker Behavior Analysis (Contd..)
 We have defined five stages, which are also the hidden states of the HMM:
 Scanning
 Enumeration
 Access attempt
 Malware attempt
 Denial of service
Stages in the multistage attack
 Scanning: the attacker tries to gather information about the target system.
 Observation: ICMP PING
 Enumeration: the attacker tries to find the vulnerabilities of the target system.
 Observation: CHAT_MSN
 Access attempt: the attacker tries to gain access to the target system's resources.
 Observation: SQL version overflow attempt
 Denial of service: the attacker tries to deny service to other users.
 Observation: NETBIOS SMB-DS Trans Max Param DOS attempt
 Malware attempt: the attacker tries to execute their own code on the target system.
 Observation: SHELLCODE_x86_NOOP
Preparation of Semi-Automatic HMM Training
 Alerts (observations) are mapped to one of the five hidden states.
 For example, an alert of type ICMP PING is usually considered scanning, and an alert of type SHELLCODE X86 INC ECX NOOP is considered a malware attempt.
 We currently have around 88 rules to train our model. Once the rule set is defined, we map a state name to each alert by applying the rules.
Example alert (fields: time stamp, alert type, attacker, victim):
07/14-13:12:54.775367 [**] [1:384:5] ICMP PING [**] [Classification: Misc activity] [Priority: 3] {ICMP} 192.168.1.24 -> 192.168.1.1
Preparation of Semi-Automatic HMM Training (Contd..)
 All alerts are classified into five sets, matching the states of our model, depending on the type of alert.
Labeled alert (fields: state, time stamp, classification, attacker, victim):
Scanning, 07/14-13:12:54.775367, Misc activity, 192.168.1.24, 192.168.1.1
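The rule-based labeling can be sketched as below. The actual rule set of around 88 rules is not shown in the slides, so the `RULES` list here is a hypothetical stand-in built only from the five example observations given for the stages. The function name `label_alert` and the case-insensitive substring matching are likewise assumptions.

```python
# Hypothetical rules: (keyword in the alert type, hidden state).
# Derived only from the example observations in the slides; the authors'
# real rule set is not reproduced here.
RULES = [
    ("ICMP PING", "Scanning"),
    ("CHAT_MSN", "Enumeration"),
    ("OVERFLOW", "Access attempt"),
    ("DOS", "Denial of service"),
    ("SHELLCODE", "Malware attempt"),
]


def label_alert(alert_type):
    """Return the state of the first matching rule, or None if no rule applies."""
    upper = alert_type.upper()
    for keyword, state in RULES:
        if keyword in upper:
            return state
    return None


examples = [
    "ICMP PING",
    "CHAT_MSN",
    "SQL version overflow attempt",
    "NETBIOS SMB-DS Trans Max Param DOS attempt",
    "SHELLCODE_x86_NOOP",
]
labels = [label_alert(a) for a in examples]
for alert_type, state in zip(examples, labels):
    print(f"{state}, {alert_type}")
```

Alerts that match no rule return `None`, which leaves room for the manual review implied by the "semi-automatic" description.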
Training the Hidden Markov Model
 Steps
 Initialization: initializes the state, transition, and observation probabilities.
 Forward algorithm: computes, for each state and time step, the probability of the observations seen so far.
 Backward algorithm: computes, for each state and time step, the probability of the remaining observations.
 Re-estimation of probabilities: re-estimates the state, transition, and observation probabilities from the forward and backward values. These steps are repeated a number of times.
Training the Hidden Markov Model (Contd..)
Table 3.1 Behavior Classification
Attacker Groups | Behavior
Amateur | Scanning + Enumeration
Insider, Phisher, Spyware/Malware, Botnet (ISBN) | Access attempt + Denial of service + Malware attempt
Criminal groups, Terrorists, Hackers, Nations (CTHN) | Scanning + Enumeration + Access attempt + Malware attempt
Terrorists, Hackers (TH) | Scanning + Enumeration + Denial of service
Terrorists, Hackers, Criminal groups (THC) | Scanning + Enumeration + Access attempt + Denial of service + Malware attempt
Prediction of Attacker Behavior
 Once the system is trained and the probabilities are stored in our database, the next step is to match a set of incoming alerts against the stored behaviors.
 To find the closest behavior for a set of alerts, we use the Kullback-Leibler distance calculator [3].
 The Kullback-Leibler (K-L) distance [3] is a measure of the similarity between two completely determined probability distributions.
Attacker Behavior Analysis
The Kullback-Leibler distance (K-L)
 Definition: Let p1(x) and p2(x) be two continuous probability distributions. By definition, the K-L distance D(p1, p2) between p1(x) and p2(x) is:
D(p1, p2) = ∫ p1(x) log [p1(x) / p2(x)] dx
Basic Properties
 D (p1, p2) is the mean of the quantity log [p1(x)/p2(x)], with
p1(x) being the reference distribution.
 The K-L distance is always nonnegative. It is zero only when the
two distributions are identical.
 It is common to encounter the symmetric version of the K-L
distance between p1 and p2:
Ds(p1, p2) = [D(p1, p2) + D(p2, p1)] / 2
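A discrete analogue of these definitions can be sketched as follows. The slides define the distance for continuous distributions and report using the Jahmm calculator on trained HMMs, so this is only an illustration of the idea on plain probability vectors. The function names and the two stage profiles are hypothetical, with made-up numbers.

```python
import math


def kl_distance(p, q):
    """Discrete K-L distance D(p, q) = sum of p[i] * log(p[i] / q[i])."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # terms with p[i] = 0 contribute nothing
        if q_i == 0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


def symmetric_kl(p, q):
    """Symmetric version: Ds(p, q) = [D(p, q) + D(q, p)] / 2."""
    return (kl_distance(p, q) + kl_distance(q, p)) / 2


def closest_profile(observed, profiles):
    """Name of the stored profile with the smallest symmetric K-L distance."""
    return min(profiles, key=lambda name: symmetric_kl(observed, profiles[name]))


# Made-up distributions over the five stages (scanning, enumeration,
# access attempt, malware attempt, denial of service).
profiles = {
    "Amateur": [0.50, 0.40, 0.05, 0.03, 0.02],
    "THC": [0.20, 0.20, 0.20, 0.20, 0.20],
}
observed = [0.45, 0.45, 0.05, 0.03, 0.02]
best = closest_profile(observed, profiles)
print(best)
```

Choosing the smallest distance is equivalent to choosing the largest value of 1/KL distance.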
Implementation
 Technologies we used:
Clustering and Generalization -- Java
Attacker Behavior Analysis -- Java, C#.NET
 API used:
Hidden Markov Model -- Jahmm [10]
KL-distance calculator -- Jahmm [10]
Implementation (Contd..)
[Figure 1: Probability Distribution, showing state, transition, and observation probabilities]
Implementation (Contd..)
[Figure 2: Behavior Description]
Evaluation - Experimentation
[Figure: experimental setup showing attacker hosts, grouped as "Amateur type" and "Serious threats", and victim hosts. Addresses shown: 192.168.0.192, 192.168.0.139, 192.168.0.1, 192.169.10.11, 192.168.0.10, 192.168.179.1, 192.168.0.191, 192.168.133.1]
Evaluation - Results
[Figure 4: Behavior comparison. Vertical axis: 1/KL distance (0 to 7). Series: Amateur, ISBN, THC, TC, CTHN]
Contribution
 Grouping alerts.
 Building an HMM for each of the attacker groups.
 Profiling the 5 HMMs.
 Predicting attacker behavior by calculating the KL distance [3].
Conclusion
 In our study we achieved most of the expected results.
 Overall, more than 300 types of alerts were generated through this process, which enabled our system to detect most of the known attacks.
 Attacker behavior analysis is an efficient way of finding the likely behavior of an attacker, which allows us to take action according to the attacker's intentions.
Demo
References
1. S. Mathew, D. Britt, R. Giomundo, S. Upadhyaya, S. Sudit, Real-time Multistage Attack Awareness Through Enhanced Intrusion Alert Clustering, In Situation Management Workshop (SIMA 2005), MILCOM 2005, Atlantic City, NJ, October, 2005.
2. Siraj, Vaughn, Multilevel Alert Clustering for Intrusion Detection Sensor Data, Fuzzy Information Processing Society, USA, 2005.
3. Kullback-Leibler distance, http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_kullbak.htm
4. Yang, Gasior, Katipally, Cui, Alerts Analysis and Visualization in Network-based Intrusion Detection Systems, The Second IEEE International Conference on Information Privacy, Security, Risk and Trust (PASSAT2010), 2010, USA.
5. Yang, Katipally, Gasior, Cui, Multistage attack detection system for network administrators, CSIIRW-6, 2010, USA.
6. Manavoglu, Pavlov, Giles, Probabilistic User Behavior Models, Proceedings of the Third IEEE International Conference on Data Mining (ICDM'03), 2003, USA.
7. CYBERSPACE: United States Faces Challenges in Addressing Global Cybersecurity and Governance, July 2010.
8. http://www.snort.org
9. Mark Stamp, A Revealing Introduction to Hidden Markov Models, 2008.