IDS Analysis Scheme

IDS ANALYSIS SCHEME
OBJECTIVES
• Compare anomaly-based and signature-based detection
• Explain the hybrid characteristics of an IDS
• Explain the benefits and drawbacks of an IDS

IDS
• A burglar alarm on the doors and windows of your home is an IDS for your home.
• An IDS used to protect your network operates in a similar manner.
• It locates intrusive activity by examining:
  – Network traffic
  – Host logs
  – System calls
  – Other areas that can signal an attack against your network

IDS TERMINOLOGY

                 Alarm raised   No alarm
Attack exists    T+             F-
No attack        F+             T-

• Alert/Alarm: A signal suggesting that a system has been or is being attacked.
• True Positive: A legitimate attack which triggers an IDS to produce an alarm.
• False Positive: An event signaling an IDS to produce an alarm when no attack has taken place.
• False Negative: A failure of an IDS to detect an actual attack.
• True Negative: When no attack has taken place and no alarm is raised.
• Noise: Data or interference that can trigger a false positive.
Scarfone, Karen; Mell, Peter (February 2007). "Guide to Intrusion Detection and Prevention Systems
(IDPS)". Computer Security Resource Center (National Institute of Standards and Technology) (800-94).
http://csrc.ncsl.nist.gov/publications/nistpubs/800-94/SP800-94.pdf. Retrieved 1 January 2010.
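
The four outcomes defined above can be tallied mechanically. The following is a minimal sketch, not part of the original slides: the event list and the names `classify_outcome`, `detection_rate` and `false_alarm_rate` are assumptions chosen for illustration.

```python
from collections import Counter

def classify_outcome(attack_exists: bool, alarm_raised: bool) -> str:
    """Map one event to a cell of the table: T+, F-, F+ or T-."""
    if attack_exists:
        return "T+" if alarm_raised else "F-"
    return "F+" if alarm_raised else "T-"

# Hypothetical (attack_exists, alarm_raised) pairs, invented for illustration.
events = [(True, True), (True, False), (False, True),
          (False, False), (False, False), (True, True)]

counts = Counter(classify_outcome(attack, alarm) for attack, alarm in events)

# Share of real attacks that raised an alarm.
detection_rate = counts["T+"] / (counts["T+"] + counts["F-"])
# Share of attack-free events that still raised an alarm.
false_alarm_rate = counts["F+"] / (counts["F+"] + counts["T-"])

print(dict(counts), detection_rate, false_alarm_rate)
```

Counting outcomes this way is one simple input to the confidence value an organization places on an IDS.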
IDS TERMINOLOGY
• Site policy: Guidelines within an organization that control the rules and configurations of an IDS.
• Site policy awareness: The ability an IDS has to dynamically change its rules and configurations in response to changing environmental activity.
• Confidence value: A value an organization places on an IDS based on past performance and analysis to help determine its ability to effectively identify an attack.
• Alarm filtering: The process of categorizing attack alerts produced from an IDS in order to distinguish false positives from actual attacks.
EVALUATE IDS
An IDS can be evaluated by looking at the following:
• Triggers
• Monitoring locations
• Hybrid characteristics

1. IDS TRIGGERS
IDS TRIGGERS
• Current IDSs use two major triggering mechanisms to generate intrusion alarms:
  – Anomaly detection
  – Signature-based detection
• A trigger mechanism is the action that causes the IDS to generate an alarm.
IDS TRIGGERS
• NIDS: generates an alarm if it sees a packet sent to a certain port with certain data in it.
• HIDS: generates an alarm if a certain system call executes.
  – A system call is a request made by a program to the OS to perform a task.
  – Improper use of system calls can easily cause a system crash.

ANOMALY DETECTION
• We are drowning in the overflow of data being collected worldwide, while starving for knowledge at the same time.
• Anomalous events occur relatively infrequently.
• However, when they do occur, their consequences can be quite dramatic, and quite often negative.

WHAT ARE ANOMALIES?
• An anomaly is a pattern in the data that does not conform to the expected behaviour.
• Anomalies are also referred to as outliers, exceptions, peculiarities, surprises, etc.
• Anomalies translate to significant (often critical) real-life entities:
  – Cyber intrusions: a web server involved in FTP traffic
  – Credit card fraud: an abnormally high purchase made on a credit card
WHAT ARE ANOMALIES?
• N1 and N2 are regions of normal behaviour.
• Points O1 and O2 are anomalies.
• Points in region O3 are anomalies.

KEY CHALLENGES
• Defining a representative normal region is challenging.
• The boundary between normal and outlying behaviour is often not precise.
• The exact notion of an outlier is different for different application domains.
• Availability of labelled data for training/validation
• Malicious adversaries
• Data might contain noise.
• Normal behaviour keeps evolving.

INPUT DATA
• The most common form of data handled by anomaly detection techniques is record data:
  – Univariate
  – Multivariate

Univariate example:

Engine Temperature
192
195
180
199
19
177
172
285
195
163
INPUT DATA
• The most common form of data handled by anomaly detection techniques is record data:
  – Univariate
  – Multivariate

Multivariate example:

Tid  SrcIP          Start time  Dest IP         Dest Port  Number of bytes  Attack
1    206.135.38.95  11:07:20    160.94.179.223  139        192              No
2    206.163.37.95  11:13:56    160.94.179.219  139        195              No
3    206.163.37.95  11:14:29    160.94.179.217  139        180              No
4    206.163.37.95  11:14:30    160.94.179.255  139        199              No
5    206.163.37.95  11:14:32    160.94.179.254  139        19               Yes
6    206.163.37.95  11:14:35    160.94.179.253  139        177              No
7    206.163.37.95  11:14:36    160.94.179.252  139        172              No
8    206.163.37.95  11:14:38    160.94.179.251  139        285              Yes
9    206.163.37.95  11:14:41    160.94.179.250  139        195              No
10   206.163.37.95  11:14:44    160.94.179.249  139        163              Yes

Number of bytes, sorted:
19 … 163 … 172, 177, 180 … 192, 195, 195, 199 … 285
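
To make the univariate view concrete, here is a minimal sketch that scores the ten "number of bytes" values with a modified z-score built from the median and the median absolute deviation (MAD). The technique, the 0.6745 scaling constant and the 3.5 cut-off are assumptions introduced for illustration; the slides do not prescribe a particular test.

```python
from statistics import median

# "Number of bytes" column from the ten records above.
bytes_sent = [192, 195, 180, 199, 19, 177, 172, 285, 195, 163]

def modified_z_scores(values):
    """Robust z-scores: distance from the median in units of scaled MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [0.6745 * (v - med) / mad for v in values]

CUTOFF = 3.5  # assumed threshold, a common rule of thumb

flagged = [v for v, z in zip(bytes_sent, modified_z_scores(bytes_sent))
           if abs(z) > CUTOFF]
print(flagged)
```

Under these assumptions the values 19 and 285 stand out, while 163 does not, even though the table labels that record as an attack. A single attribute is therefore not always enough to separate attacks from normal traffic.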
INPUT DATA – NATURE OF ATTRIBUTES
• Nature of attributes:
  – Binary
  – Categorical
  – Continuous
  – Hybrid

Tid  SrcIP          Duration  Dest IP         Number of bytes  Internal
1    206.163.37.81  0.10      160.94.179.208  150              No
2    206.163.37.99  0.27      160.94.179.235  208              No
3    160.94.123.45  1.23      160.94.179.221  195              Yes
4    206.163.37.37  112.03    160.94.179.253  199              No
5    206.163.37.41  0.32      160.94.179.244  181              No
TYPES OF ANOMALY
• Point anomalies
• Contextual anomalies
• Collective anomalies
* Varun Chandola, Arindam Banerjee, and Vipin Kumar, Anomaly Detection - A Survey, in ACM Computing Surveys 2008.
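POINT ANOMALIES
• An individual data instance is anomalous with respect to the rest of the data.
[Figure: scatter plot with axes X and Y showing normal regions N1 and N2, points o1 and o2, and region O3]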
CONTEXTUAL ANOMALIES
• An individual data instance is anomalous within a context.
• Requires a notion of context.
• Also referred to as conditional anomalies*.
[Figure: plot with one point marked "Anomaly" and a similar point marked "Normal"]
* Xiuyao Song, Mingxi Wu, Christopher Jermaine, Sanjay Ranka, Conditional Anomaly Detection, IEEE Transactions on Data and Knowledge Engineering, 2006.
COLLECTIVE ANOMALIES
• A collection of related data instances is anomalous.
• Requires a relationship among data instances:
  – Sequential data
  – Spatial data
  – Graph data
• The individual instances within a collective anomaly are not anomalous by themselves.
[Figure: anomalous subsequence]
OUTPUT OF ANOMALY DETECTION
• Label
  – Each test instance is given a normal or anomaly label.
  – This is especially true of classification-based approaches.
• Score
  – Each test instance is assigned an anomaly score.
  – Allows the output to be ranked.
  – Requires an additional threshold parameter.
APPLICATIONS OF ANOMALY DETECTION
• Network intrusion detection
• Insurance / credit card fraud detection
• Healthcare informatics / medical diagnostics
• Industrial damage detection
• Image processing / video surveillance
• Novel topic detection in text mining
• … etc.
INTRUSION DETECTION
• Intrusion detection:
  – The process of monitoring the events occurring in a computer system or network and analyzing them for intrusions.
  – Intrusions are defined as attempts to bypass the security mechanisms of a computer or network.
• Challenges:
  – Traditional signature-based intrusion detection systems are based on signatures of known attacks and cannot detect emerging cyber threats.
  – There is substantial latency in deploying newly created signatures across the computer system.
• Anomaly detection can alleviate these limitations.
FRAUD DETECTION
• Fraud detection refers to the detection of criminal activities occurring in commercial organizations.
  – Malicious users might be actual customers of the organization or might be posing as a customer (also known as identity theft).
• Types of fraud:
  – Credit card fraud
  – Insurance claim fraud
  – Mobile / cell phone fraud
  – Insider trading
• Challenges:
  – Fast and accurate real-time detection
  – Misclassification cost is very high
CLASSIFICATION BASED TECHNIQUES
Main idea: build a classification model for normal (and anomalous (rare)) events based on labelled training data, and use it to classify each new unseen event.
Classification models must be able to handle skewed (imbalanced) class distributions.
Categories:
• Supervised classification techniques
  – Require knowledge of both the normal and the anomaly class
  – Build a classifier to distinguish between normal and known anomalies
• Semi-supervised classification techniques
  – Require knowledge of the normal class only
  – Use a modified classification model to learn the normal behavior and then detect any deviations from normal behavior as anomalous
CLASSIFICATION BASED TECHNIQUES
Advantages:
• Supervised classification techniques
  – Models that can be easily understood
  – High accuracy in detecting many kinds of known anomalies
• Semi-supervised classification techniques
  – Models that can be easily understood
  – Normal behaviour can be accurately learned
Drawbacks:
• Supervised classification techniques
  – Require labels from both the normal and the anomaly class
  – Cannot detect unknown and emerging anomalies
• Semi-supervised classification techniques
  – Require labels from the normal class
  – Possible high false alarm rate: previously unseen (yet legitimate) data records may be recognized as anomalies
RULE BASED TECHNIQUES
• Creating new rule-based algorithms (PN-rule, CREDOS)
• Adapting existing rule-based techniques
  – Robust C4.5 algorithm [John95]
  – Adapting multi-class classification methods to the single-class classification problem
• Association rules
  – Rules with support higher than a pre-specified threshold may characterize normal behaviour [Barbara01, Otey03]
  – An anomalous data record occurs in fewer frequent item sets than a normal data record [He04]
  – Frequent episodes for describing temporal normal behaviour [Lee00, Qin04]
• Case-specific feature/rule weighting
  – Case-specific feature weighting [Cardey97]: decision tree learning where, for each rare-class test example, the global weight vector is replaced with a dynamically generated weight vector that depends on the path taken by that example
  – Case-specific rule weighting [Grzymala00]: the LERS (Learning from Examples based on Rough Sets) algorithm increases the rule strength for all rules describing the rare class
CONTEXTUAL ANOMALY DETECTION
• Detect contextual anomalies.
• Key assumption: all normal instances within a context will be similar (in terms of behavioural attributes), while the anomalies will be different from other instances within the context.
• General approach:
  – Identify a context around a data instance (using a set of contextual attributes).
  – Determine if the test data instance is anomalous within the context (using a set of behavioural attributes).
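
The two-step approach can be sketched as follows, assuming a single contextual attribute (time of day) and a single behavioural attribute (bytes transferred). The data, the function names and the three-standard-deviation rule are all hypothetical choices for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical history of (context, bytes transferred) observations.
history = [("day", 900), ("day", 1000), ("day", 1100), ("day", 1000),
           ("night", 90), ("night", 100), ("night", 110), ("night", 100)]

def build_profiles(records):
    """Step 1: group behaviour by context and summarise each group."""
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)
    return {c: (mean(v), pstdev(v)) for c, v in by_context.items()}

def is_contextual_anomaly(profiles, context, value, k=3.0):
    """Step 2: test the value only against its own context."""
    mu, sigma = profiles[context]
    return abs(value - mu) > k * sigma

profiles = build_profiles(history)
print(is_contextual_anomaly(profiles, "day", 1000))    # typical daytime volume
print(is_contextual_anomaly(profiles, "night", 1000))  # same volume at night
```

The same value of 1000 bytes is unremarkable in the "day" context and anomalous in the "night" context, which is what separates a contextual anomaly from a point anomaly.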
COLLECTIVE ANOMALY DETECTION
• Detect collective anomalies.
• Exploit the relationship among data instances.
• Sequential anomaly detection
  – Detect anomalous sequences
• Spatial anomaly detection
  – Detect anomalous sub-regions within a spatial data set
• Graph anomaly detection
  – Detect anomalous sub-graphs in graph data
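
For sequential data, one simple way to look for anomalous subsequences is to slide a fixed-length window over a trace and flag windows never seen in normal traces. The sketch below assumes system-call names as events and a window length of 3; the traces are invented and the slides do not specify this particular method.

```python
def ngrams(seq, n):
    """All contiguous windows of length n in seq."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

# Hypothetical trace of normal behaviour.
normal_trace = ["open", "read", "read", "write", "close",
                "open", "read", "write", "close"]
known_windows = set(ngrams(normal_trace, 3))

def anomalous_windows(trace, known, n=3):
    """Return (position, window) for every window absent from the normal set."""
    return [(i, w) for i, w in enumerate(ngrams(trace, n)) if w not in known]

# Every call below also occurs in the normal trace; only the order is new.
test_trace = ["open", "read", "write", "write", "close"]
print(anomalous_windows(test_trace, known_windows))
```

Each individual call in `test_trace` is normal on its own; only the combination is unusual, which matches the definition of a collective anomaly.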
WHAT ARE INTRUSIONS?
• Intrusions are actions that attempt to bypass the security mechanisms of computer systems. They are usually caused by:
  – Attackers accessing the system from the Internet
  – Insider attackers: authorized users attempting to gain and misuse unauthorized privileges
• Typical intrusion scenario
[Figure: typical intrusion scenario. Labels: attacker, scanning activity, computer network, machine with vulnerability, compromised machine]
IDS ANALYSIS STRATEGY
• Misuse/signature detection is based on extensive knowledge of patterns associated with known attacks, provided by human experts.
  – Existing approaches: pattern (signature) matching, expert systems, state transition analysis, data mining
  – Major limitations: unable to detect novel and unanticipated attacks; the signature database has to be revised for each new type of discovered attack
• Anomaly detection is based on profiles that represent the normal behaviour of users, hosts, or networks, and detects attacks as significant deviations from this profile.
  – Major benefit: potentially able to recognize unforeseen attacks
  – Major limitation: possible high false alarm rate, since detected deviations do not necessarily represent actual attacks
  – Major approaches: statistical methods, expert systems, clustering, neural networks, support vector machines, outlier detection schemes
INTRUSION DETECTION
• Intrusion Detection System
  – A combination of software and hardware that attempts to perform intrusion detection
  – Raises the alarm when a possible intrusion happens
• Traditional IDS tools (e.g. SNORT) are based on signatures of known attacks
  – Example of a SNORT rule (MS-SQL “Slammer” worm):
    any -> udp port 1434 (content:"|81 F1 03 01 04 9B 81 F1 01|";
    content:"sock"; content:"send")
• Limitations
  – The signature database has to be manually revised for each new type of discovered intrusion
  – They cannot detect emerging cyber threats
  – Substantial latency in deployment of newly created signatures across the computer system
• Data mining can alleviate these limitations
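
The idea behind the rule shown for the “Slammer” worm can be sketched as a plain byte-pattern check. This is an illustrative simplification, not how SNORT evaluates rules: the function name `matches_signature` and both payloads are assumptions, and only the port number and the three content patterns come from the slide.

```python
# Content patterns copied from the rule on the slide.
SLAMMER_CONTENTS = [bytes.fromhex("81F10301049B81F101"), b"sock", b"send"]

def matches_signature(protocol: str, dst_port: int, payload: bytes) -> bool:
    """True if the packet is UDP to port 1434 and contains every pattern."""
    if protocol != "udp" or dst_port != 1434:
        return False
    return all(pattern in payload for pattern in SLAMMER_CONTENTS)

# Synthetic payloads built for this example; they are not captured traffic.
suspicious = b"\x04" + SLAMMER_CONTENTS[0] + b"..sock..send"
benign = b"\x02 ordinary resolution request"

print(matches_signature("udp", 1434, suspicious))
print(matches_signature("udp", 1434, benign))
```

A check like this fires only on traffic that contains the exact byte patterns, which is why a new attack needs a new signature before it can be detected.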
DATA MINING FOR INTRUSION DETECTION
• Increased interest in data mining based intrusion detection
  – Attacks for which it is difficult to build signatures
  – Attack stealthiness
  – Unforeseen/unknown/emerging attacks
  – Distributed/coordinated attacks
• Data mining approaches for intrusion detection
  – Misuse detection: building predictive models from labelled data sets (instances are labelled as “normal” or “intrusive”) to identify known intrusions. High accuracy in detecting many kinds of known attacks, but cannot detect unknown and emerging attacks.
  – Anomaly detection: detect novel attacks as deviations from “normal” behaviour. Potential high false alarm rate, since previously unseen (yet legitimate) system behaviours may also be recognized as anomalies.
  – Summarization of network traffic
DATA MINING FOR INTRUSION DETECTION

Misuse detection – building predictive models
[Figure: a classifier is learned from the training set, and the resulting model is applied to the test set]

Training set:

Tid  SrcIP          Start time  Dest IP         Dest Port  Number of bytes  Attack
1    206.135.38.95  11:07:20    160.94.179.223  139        192              No
2    206.163.37.95  11:13:56    160.94.179.219  139        195              No
3    206.163.37.95  11:14:29    160.94.179.217  139        180              No
4    206.163.37.95  11:14:30    160.94.179.255  139        199              No
5    206.163.37.95  11:14:32    160.94.179.254  139        19               Yes
6    206.163.37.95  11:14:35    160.94.179.253  139        177              No
7    206.163.37.95  11:14:36    160.94.179.252  139        172              No
8    206.163.37.95  11:14:38    160.94.179.251  139        285              Yes
9    206.163.37.95  11:14:41    160.94.179.250  139        195              No
10   206.163.37.95  11:14:44    160.94.179.249  139        163              Yes

Test set (the Attack column is shown as "?" on the slide before a label appears):

Tid  SrcIP          Start time  Dest IP         Number of bytes  Attack
1    206.163.37.81  11:17:51    160.94.179.208  150              No
2    206.163.37.99  11:18:10    160.94.179.235  208              No
3    206.163.37.55  11:34:35    160.94.179.221  195              Yes
4    206.163.37.37  11:41:37    160.94.179.253  199              No
5    206.163.37.41  11:55:19    160.94.179.244  181              Yes

Anomaly detection

Summarization of attacks using association rules
Rules discovered:
{Src IP = 206.163.37.95, Dest Port = 139, Bytes ∈ [150, 200]} --> {ATTACK}
ANOMALY DETECTION ON REAL NETWORK DATA
• Anomaly detection was used at the University of Minnesota and the Army Research Lab to detect various intrusive/suspicious activities.
• Many of these could not be detected using widely used intrusion detection tools like SNORT.
• Anomalies/attacks picked up by MINDS:
  – Scanning activities
  – Non-standard behavior
  – Policy violations
  – Worms

MINDS – Minnesota Intrusion Detection System
[Figure: MINDS architecture. Labelled components: network, data capturing device (net flow tools, tcpdump), filtering, feature extraction, known attack detection (detected known attacks), anomaly detection (anomaly scores), association pattern analysis (summary and characterization of attacks), detected novel attacks, human analyst, labels, MINDSAT]
Feature Extraction
Three groups of features:
• Basic features of individual TCP connections
  – Source & destination IP: Features 1 & 2
  – Source & destination port: Features 3 & 4
  – Protocol: Feature 5
  – Duration: Feature 6
  – Bytes per packet: Feature 7
  – Number of bytes: Feature 8

[Figure: with the existing features alone (dst, service, flag) the connections look the same, so the existing features are useless; a constructed feature with high information gain (%S0) separates the SYN flood from normal traffic]

dst  service  flag  %S0
h1   http     S0    70    (syn flood)
h1   http     S0    72    (syn flood)
h1   http     S0    75    (syn flood)
h2   http     S0    0     (normal)
h4   http     S0    0     (normal)
h2   ftp      S0    0     (normal)

• Time-based features
  – For the same source (destination) IP address, number of unique destination (source) IP addresses inside the network in the last T seconds: Features 9 (13)
  – Number of connections from the source (destination) IP to the same destination (source) port in the last T seconds: Features 11 (15)
• Connection-based features
  – For the same source (destination) IP address, number of unique destination (source) IP addresses inside the network in the last N connections: Features 10 (14)
  – Number of connections from the source (destination) IP to the same destination (source) port in the last N connections: Features 12 (16)
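
As an illustration of the time-based features, the sketch below computes raw counts for Features 9 and 11 for one connection. The record layout, the function name, the choice to count the current connection itself, and the sample data are all assumptions; the slides do not say how the values are normalised.

```python
def time_window_features(connections, index, window_seconds):
    """Return (unique destination IPs, connections to the same destination port)
    for the source IP of connections[index], over the last window_seconds.

    connections: list of (timestamp, src_ip, dst_ip, dst_port), sorted by time.
    """
    t, src, _, dport = connections[index]
    recent = [c for c in connections[:index + 1]
              if c[1] == src and t - c[0] <= window_seconds]
    unique_dsts = len({c[2] for c in recent})            # Feature 9 (raw count)
    same_port = sum(1 for c in recent if c[3] == dport)  # Feature 11 (raw count)
    return unique_dsts, same_port

# Hypothetical connection log, timestamps in seconds.
conns = [(0, "10.0.0.5", "10.0.1.1", 1434),
         (1, "10.0.0.5", "10.0.1.2", 1434),
         (2, "10.0.0.9", "10.0.1.1", 80),
         (3, "10.0.0.5", "10.0.1.3", 1434),
         (20, "10.0.0.5", "10.0.1.4", 1434)]

print(time_window_features(conns, 3, 5))
print(time_window_features(conns, 4, 5))
```

A source that touches many distinct destinations on one port within a short window, as a scanner or worm does, gets high values for both counts. The connection-based features follow the same pattern with "last N connections" in place of "last T seconds".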
TYPICAL ANOMALY DETECTION OUTPUT

48 hours after the “slammer” worm

Columns 9 to 16 are features 9 to 16. Features 1 to 8 are 0 for all 31 connections and are omitted from the table.

score     srcIP          sPort  dstIP          dPort  protocol  flags  packets  bytes     9     10  11    12  13  14  15  16
37674.69  63.150.X.253   1161   128.101.X.29   1434   17        16     [0,2)    [0,1829)  0.81  0   0.59  0   0   0   0   0
26676.62  63.150.X.253   1161   160.94.X.134   1434   17        16     [0,2)    [0,1829)  0.81  0   0.59  0   0   0   0   0
24323.55  63.150.X.253   1161   128.101.X.185  1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
21169.49  63.150.X.253   1161   160.94.X.71    1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
19525.31  63.150.X.253   1161   160.94.X.19    1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
19235.39  63.150.X.253   1161   160.94.X.80    1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
17679.1   63.150.X.253   1161   160.94.X.220   1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
8183.58   63.150.X.253   1161   128.101.X.108  1434   17        16     [0,2)    [0,1829)  0.82  0   0.58  0   0   0   0   0
7142.98   63.150.X.253   1161   128.101.X.223  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
5139.01   63.150.X.253   1161   128.101.X.142  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
4048.49   142.150.Y.101  0      128.101.X.127  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
4008.35   200.250.Z.20   27016  128.101.X.116  4629   17        16     [2,4)    [0,1829)  0     0   0     0   0   0   1   0
3657.23   202.175.Z.237  27016  128.101.X.116  4148   17        16     [2,4)    [0,1829)  0     0   0     0   0   0   1   0
3450.9    63.150.X.253   1161   128.101.X.62   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
3327.98   63.150.X.253   1161   160.94.X.223   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2796.13   63.150.X.253   1161   128.101.X.241  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2693.88   142.150.Y.101  0      128.101.X.168  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
2683.05   63.150.X.253   1161   160.94.X.43    1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2444.16   142.150.Y.236  0      128.101.X.240  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
2385.42   142.150.Y.101  0      128.101.X.45   2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
2114.41   63.150.X.253   1161   160.94.X.183   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2057.15   142.150.Y.101  0      128.101.X.161  2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1919.54   142.150.Y.101  0      128.101.X.99   2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1634.38   142.150.Y.101  0      128.101.X.219  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1596.26   63.150.X.253   1161   128.101.X.160  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
1513.96   142.150.Y.107  0      128.101.X.2    2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1389.09   63.150.X.253   1161   128.101.X.30   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
1315.88   63.150.X.253   1161   128.101.X.40   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
1279.75   142.150.Y.103  0      128.101.X.202  2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1237.97   63.150.X.253   1161   160.94.X.32    1434   17        16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1180.82   63.150.X.253   1161   128.101.X.61   1434   17        16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0

Legend:
• Anomalous connections that correspond to the “slammer” worm
• Anomalous connections that correspond to the ping scan
• Connections corresponding to UM machines connecting to “half-life” game servers
“SLAMMER” WORM

SQL Slammer is a computer worm that caused a denial of service on some
Internet hosts and dramatically slowed down general Internet traffic.

It spread rapidly, infecting most of its 75,000 victims within ten minutes.

Discovered: January 24, 2003

Also Known As: SQL Slammer Worm [ISS], DDOS.SQLP1434.A [Trend],
W32/SQLSlammer [McAfee], Slammer [F-Secure], Sapphire [eEye],
W32/SQLSlam-A [Sophos]

Type: Worm

Systems Affected: Windows 2000, Windows 95, Windows 98, Windows Me,
Windows NT, Windows XP

W32.SQLExp.Worm is a worm that targets the systems running Microsoft SQL
Server 2000, as well as Microsoft Desktop Engine (MSDE) 2000. The worm sends
376 bytes to UDP port 1434, the SQL Server Resolution Service Port.

Although titled "SQL slammer worm", the program did not use the SQL language; it
exploited a buffer overflow bug in Microsoft's flagship SQL Server and Desktop
Engine database products, for which a patch had been released six months earlier
in MS02-039.

The worm has the unintended payload of performing a denial-of-service attack due
to the large number of packets it sends.
IDS TRIGGERS – ANOMALY DETECTION
• Also referred to as profile-based detection: you must build profiles for each user group on the system.
• Other systems might automatically build profiles for individual users.
• A profile incorporates:
  – A typical user’s habits
  – The services he normally uses
  – An established baseline for the activities that a normal user routinely does to perform his job
• A user group:
  – Represents a group of users who perform similar functions on the network
  – Can be built based on job classification, such as engineers or clerks
  – How you assign the groups is not important, as long as the users in a group perform similar functions on the network
IDS TRIGGERS – ANOMALY DETECTION
• Building and updating these profiles represents a significant portion of the work required to deploy an anomaly-based IDS.
• The quality of your profiles is directly related to how successful your IDS is at detecting attacks against your network.
• The most common approaches to building user profiles include the following:
  – Statistical sampling
  – Rule-based approach
  – Neural networks
ANOMALY DETECTION – STATISTICAL SAMPLING
• For profile creation:
  – Alarms are based on deviations from your defined normal state.
  – You measure deviation from normal by calculating the standard deviation.
• You control the sensitivity of your IDS by varying the number of standard deviations required to generate an alarm.
  – This roughly regulates the number of false positives that your IDS generates: the more standard deviations you require, the less likely small user deviations are to raise a false alarm.
ANOMALY DETECTION – STATISTICAL SAMPLING
• Standard deviation measures how far the values in a data set spread from the mean (average).
• When your data follows a well-defined distribution, each standard deviation defines a percentage of data that falls within it.
• For example, for a normal distribution:
  – About 68 percent of all data falls within one standard deviation of the mean, about 95 percent within two standard deviations, and about 99.7 percent within three standard deviations. Only about 0.3 percent of the data falls outside three standard deviations from the mean.
• By using this process, you can define statistically how abnormal specific data is.
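
The sensitivity control described for statistical sampling can be sketched as a threshold on the number of standard deviations. The baseline values (daily login counts for one user group), the function names and the chosen values of `k` are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical baseline: logins per day observed for one user group.
baseline = [48, 52, 50, 47, 53, 51, 49, 50]

def deviations_from_mean(value, sample):
    """How many sample standard deviations the value lies from the sample mean."""
    return abs(value - mean(sample)) / stdev(sample)

def raises_alarm(value, sample, k):
    """Alarm only when the value is more than k standard deviations from normal."""
    return deviations_from_mean(value, sample) > k

observed = 55
print(deviations_from_mean(observed, baseline))
print(raises_alarm(observed, baseline, k=2))  # more sensitive setting
print(raises_alarm(observed, baseline, k=3))  # less sensitive setting
```

With this baseline the observation lies 2.5 standard deviations from the mean, so it raises an alarm at `k=2` but not at `k=3`. Raising `k` trades fewer false positives for a greater chance of missing a real deviation.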
STATISTICS BASED TECHNIQUES
• Key assumption: normal data instances occur in high-probability regions of a statistical distribution, while anomalies occur in the low-probability regions of the statistical distribution.
• General approach: estimate a statistical distribution using the given data, and then apply a statistical inference test to determine whether a test instance belongs to this distribution or not.
• If an observation is more than 3 standard deviations away from the sample mean, it is an anomaly.
• Anomalies have large value for [expression missing in source]
STATISTICS BASED TECHNIQUES
• Advantages
  – Utilize existing statistical modeling techniques to model various types of distributions.
  – Provide a statistically justifiable solution to detect anomalies.
• Drawbacks
  – With high dimensions, it is difficult to estimate parameters and to construct hypothesis tests.
  – Parametric assumptions might not hold true for real data sets.
TYPES OF STATISTICAL TECHNIQUES
• Parametric techniques
  – Assume that the normal (and possibly anomalous) data is generated from an underlying parametric distribution.
  – Learn the parameters from the training sample.
• Non-parametric techniques
  – Do not assume any knowledge of parameters.
  – Use non-parametric techniques to estimate the density of the distribution, e.g., histograms or Parzen window estimation.
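
A histogram is the simplest of the non-parametric density estimates mentioned above. The sketch below is illustrative only: the training values, the bin width and the scoring rule (one minus the fraction of training data in the same bin) are assumptions, not something the slides define.

```python
from collections import Counter

def histogram_scores(training, test, bin_width):
    """Score each test value by how rarely its bin occurs in the training data."""
    bins = Counter(int(v // bin_width) for v in training)
    total = len(training)
    # A bin never seen in training has count 0, which gives the maximum score 1.0.
    return [1.0 - bins[int(v // bin_width)] / total for v in test]

# Hypothetical training sample of "number of bytes" values seen in normal traffic.
training = [172, 177, 180, 192, 195, 195, 199, 163, 150, 208]

scores = histogram_scores(training, test=[190, 208, 19], bin_width=25)
print(scores)
```

No distribution is assumed here; the data alone defines which regions are dense. The price is an extra parameter, the bin width, which strongly affects the scores.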
ANOMALY DETECTION – RULE-BASED APPROACH
• Analyze the normal traffic for different users over a period of time and then create rules that model this behavior.
• Any other behavior can then be considered abnormal and generate an alarm.
• Creating the rules that define normal behavior can be a complicated task.
ANOMALY DETECTION – NEURAL NETWORKS
• Neural networks are a form of artificial intelligence in which you attempt to approximate the working of biological neurons, such as those found in the human brain.
• You train these systems by presenting them with a large amount of data and rules about data relationships.
• This information is used to adjust the connections between the neurons. After the system is trained, network traffic is used as a stimulus to the neural network to determine whether the traffic is considered normal.
ANOMALY DETECTION – NEURAL NETWORKS
• Multi-layer perceptrons
  – Measuring the activation of output nodes [Augusteijn02]
  – Extending the learning beyond decision boundaries
  – Equivalent error bars as a measure of confidence for classification [Sykacek97]
  – Creating hyper-planes for separating between various classes, but also having flexible boundaries where points far from them are outliers [Vasconcelos95]
• Auto-associative neural networks
  – Replicator NNs [Hawkins02]
  – Hopfield networks [Jagota91, Crook01]
• Adaptive Resonance Theory based [Dasgupta00, Caudel93]
• Radial Basis Functions based
  – Adding reverse connections from the output to the central layer allows each neuron to have an associated normal distribution, and any new instance that does not fit any of these distributions is an anomaly [Albrecht00, Li02]
• Oscillatory networks
  – The relaxation time of oscillatory NNs is used as a criterion for novelty detection when a new instance is presented [Ho98, Borisyuk00]
ANOMALY DETECTION – ISSUES
• The user profiles form the heart of an anomaly-based IDS.
• Some systems use an initial training period that monitors the network for a predetermined period of time.
• This traffic is then used to create a user baseline.
• This baseline determines what normal traffic on the network looks like.
• The disadvantages of this approach:
  – If users’ jobs change over time, they start generating false alarms.
  – A determined attacker can train the system incrementally until his actual attack traffic appears as normal traffic on the network.
ANOMALY DETECTION – BENEFITS
• It can easily detect many insider attacks or account theft.
  – If a particular account belonging to an office clerk starts attempting network administration functions, for example, this probably triggers an alarm.
• An attacker is not quite sure what activity generates an alarm.
  – With a signature-based IDS, an attacker can test which traffic generates alarms in a lab environment. By using this information, he can then craft tools that bypass the signature-based IDS.
  – With an anomaly detection system, the attacker does not know the training data that has been used; therefore, he cannot assume any particular action will go undetected.
• It is not based on signatures for specific, known attacks.
  – It is based on a profile.
  – It can generate alarms for previously unpublished attacks, as long as the new attack deviates from normal user activity.
  – It can detect new attacks the first time they are used.
ANOMALY DETECTION – DRAWBACKS
• High initial training time
• No protection of the network during training
• Difficult to define normal
• Must update user profiles as habits change
• Generates false negatives if attack traffic appears normal
• Alarms are difficult to understand
• Complicated and hard to understand
FALSE NEGATIVE
• When an IDS fails to generate an alarm for known intrusive activity, it is called a false negative.
• False negatives represent actual attacks that the IDS missed even though it is programmed to detect the attack.
• Most IDS developers tend to design their systems to prevent false negatives. It is difficult, however, to totally eliminate false negatives.
• Furthermore, as you sensitize your system to report fewer false negatives, you tend to increase the number of false positives that get reported. It is a constant trade-off.
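
The trade-off between false negatives and false positives can be seen by sweeping an alarm threshold over scored events. The scores, labels and thresholds below are invented for illustration, and `error_counts` is a hypothetical helper.

```python
# Hypothetical (anomaly score, is_real_attack) pairs.
scored_events = [(0.95, True), (0.80, True), (0.60, False),
                 (0.55, True), (0.30, False), (0.10, False)]

def error_counts(events, threshold):
    """Return (false positives, false negatives) when alarming at score >= threshold."""
    false_pos = sum(1 for score, attack in events if score >= threshold and not attack)
    false_neg = sum(1 for score, attack in events if score < threshold and attack)
    return false_pos, false_neg

# Lowering the threshold makes the system more sensitive.
for threshold in (0.9, 0.5, 0.2):
    print(threshold, error_counts(scored_events, threshold))
```

In this toy data set, lowering the threshold from 0.9 to 0.5 removes both false negatives but introduces a false positive, and lowering it further to 0.2 only adds more false positives.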

SIGNATURE-BASED DETECTION
• It looks for intrusive activity that matches specific signatures.
• These signatures are based on a set of rules that match typical patterns and exploits used by attackers to gain access to your network.
• Highly skilled network engineers study known attacks and vulnerabilities to develop the rules for each signature.

SIGNATURE-BASED DETECTION: BENEFITS
• Signatures are based on known intrusive activity
• Detected attacks are well-defined
• The system is easy to understand
• Attacks are detected immediately after installation

SIGNATURE-BASED DETECTION: DRAWBACKS
• Maintaining state information (event horizon*)
• Updating the signature database
• Attacks that circumvent the IDS (false negatives)
• Inability to detect unknown attacks

*EVENT HORIZON
• The maximum amount of time over which an attack signature can be successfully detected (from the initial data to the final data needed to complete the attack signature) is known as the event horizon.
• The IDS must maintain state information during this event horizon.
• The important point is that your IDS cannot maintain state information indefinitely; therefore, it uses the event horizon to limit the amount of time that it stores the state information.
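
A minimal sketch of the event horizon, assuming a signature made of an ordered list of events and a per-source state table. The class name `StatefulMatcher`, the event names and the time units are hypothetical; real IDS state tracking is considerably more involved.

```python
class StatefulMatcher:
    """Tracks progress through a multi-step signature, per source,
    and forgets partial matches older than the event horizon."""

    def __init__(self, steps, event_horizon):
        self.steps = steps
        self.event_horizon = event_horizon
        self.state = {}  # source -> (index of next expected step, start time)

    def observe(self, source, event, now):
        index, started = self.state.get(source, (0, now))
        if now - started > self.event_horizon:
            self.state.pop(source, None)  # state expired: start over
            index, started = 0, now
        if event == self.steps[index]:
            index += 1
            if index == len(self.steps):
                self.state.pop(source, None)
                return True               # full signature seen in time
            self.state[source] = (index, started)
        return False

fast = StatefulMatcher(["A", "B", "C"], event_horizon=60)
fast_result = [fast.observe("host1", e, t) for e, t in [("A", 0), ("B", 10), ("C", 20)]]

slow = StatefulMatcher(["A", "B", "C"], event_horizon=60)
slow_result = [slow.observe("host1", e, t) for e, t in [("A", 0), ("B", 10), ("C", 100)]]
print(fast_result, slow_result)
```

The second run shows the drawback listed for signature-based detection: an attacker who spreads the steps of an attack beyond the event horizon is not matched, because the partial state has already been discarded.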
2. IDS MONITORING LOCATIONS
IDS MONITORING LOCATIONS
• Examine where an IDS watches for intrusive traffic.
• An IDS typically monitors one of two locations:
  – The host
  – The network
HIDS
• Checks for intrusions by examining information at the host or OS level.
• These IDSs examine many aspects of your host, such as system calls, audit logs, and error messages.
[Figure: hosts running IDS agents. * Agent = IDS agent]
HIDS: BENEFITS
• It has first-hand information on the success of the attack.
  – A host-based IDS examines traffic after it reaches the target of the attack (assuming the host is the target).
  – With a network-based IDS, the alarms are generated on known intrusive activity.
  – Only a HIDS can determine the actual success or failure of an attack.
• A HIDS can use the host's own IP stack to easily deal with variable Time-To-Live (TTL)* attacks.
  – Variable TTL attacks are difficult to detect using a network-based IDS.
*VARIABLE TIME-TO-LIVE ATTACKS





All packets traveling across the network have a TTL value.
Each router that handles the packet decreases the TTL value
by one.
If the TTL value reaches zero, the packet is discarded.
An attacker can launch an attack that includes bogus packets
with smaller TTL values than the packets that make up the
real attack.
If the network-based sensor sees all the packets, but the
target host sees only the actual attack packets, the attacker
has managed to distort the information that the sensor used,
causing the sensor to potentially miss the attack.
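
The effect can be reproduced with a small simulation. The sketch below is only a model: the hop counts (sensor 2 routers from the attacker, target 5 routers away) and the packet payloads are assumptions made up for the example, while the TTL values of 3 and 7 are the ones used in the illustration for this attack.

```python
def arrives(initial_ttl: int, routers_on_path: int) -> bool:
    """Return True if a packet survives every router on the path."""
    ttl = initial_ttl
    for _ in range(routers_on_path):
        ttl -= 1            # each router decreases the TTL value by one
        if ttl <= 0:        # a packet whose TTL reaches zero is discarded
            return False
    return True


def view(packets, routers_on_path: int) -> str:
    """Concatenate, in order, the payloads an observer on this path receives."""
    return "".join(data for data, ttl in packets if arrives(ttl, routers_on_path))


# Hypothetical topology: the sensor sits closer to the attacker than the target.
SENSOR_HOPS, TARGET_HOPS = 2, 5

# Real attack packets use TTL 7; bogus filler packets use TTL 3.
packets = [("AT", 7), ("xx", 3), ("TA", 7), ("yy", 3), ("CK", 7)]

sensor_view = view(packets, SENSOR_HOPS)   # includes the bogus packets
target_view = view(packets, TARGET_HOPS)   # bogus packets expired on the way

print("sensor sees:", sensor_view)
print("target sees:", target_view)
```

The sensor reconstructs a stream polluted by the bogus packets, so a signature looking for the string "ATTACK" would not match there, while the target host receives exactly that string. A host-based IDS is not affected because it only ever sees what the host's own IP stack delivers.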
VARIABLE TIME-TO-LIVE ATTACKS

The picture illustrates this
attack.

The fake packets start with a
TTL of 3, whereas the real
attack packets start with a TTL
of 7.

The sensor sees both sets of
packets, but the target host
sees only the real attack
packets. Although this attack
is possible, it is not easy to
use in practice because it
requires a detailed
understanding of the network
topology and location of IDS
sensors.
[Figure: variable TTL attack. Legend: real attack packet, fake packet]
HIDS: DRAWBACKS

Limited network view



Most host-based IDSs, for example, do not detect port
scans against the host.
It is almost impossible for a host-based IDS to detect
reconnaissance (“spy”) scans against your network. These
scans are a key indicator of further attacks against
your network.
Must operate on every OS on the network


A HIDS must communicate the information it collects to some type of
central management facility.
An attack might take a host's network communication
offline. This host then cannot communicate any information
to the central management facility.
NIDS: BENEFITS
A network-based IDS examines packets to locate
attacks against the network. The IDS sniffs the
network packets and compares the traffic against
signatures for known intrusive activity.
 Benefits:

Overall network perspective
 Does not have to run on every OS on the network

NIDS: DRAWBACKS

Bandwidth
As network pipes grow larger and larger, it becomes difficult to
successfully monitor all the traffic going across the network at a
single point in real time without missing packets.
More sensors need to be installed at locations throughout the network.


Fragment reassembly
Network packets have a maximum size.
If a connection needs to send data that exceeds this maximum
bound, the data must be sent in multiple packets.
 This is known as fragmentation. When the receiving host gets the
fragmented packets, it must reassemble the data.
 Not all hosts perform the reassembly process in the same order.
Some OSs start with the last fragment and work toward the first.
Others start at the first and work toward the last. The order does
not matter if the fragments do not overlap. If they overlap, the
results differ for each reassembly process.



Encryption
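
To illustrate the fragment reassembly drawback listed above: the sketch below writes fragments into a buffer either first-to-last or last-to-first, so that a fragment processed later overwrites an earlier one wherever they overlap. This is a deliberately simplified model with made-up payloads; real IP stacks follow more varied overlap policies. The point is only that a network-based IDS that reassembles differently from the target host may analyze different data than the host actually receives.

```python
def reassemble(fragments, last_first: bool = False) -> bytes:
    """Rebuild a payload from (offset, data) fragments.

    Fragments are written in processing order, so where two fragments
    overlap, the one processed later overwrites the earlier one.
    """
    ordered = sorted(fragments, key=lambda frag: frag[0], reverse=last_first)
    size = max(offset + len(data) for offset, data in fragments)
    buffer = bytearray(size)
    for offset, data in ordered:
        buffer[offset:offset + len(data)] = data
    return bytes(buffer)


# Non-overlapping fragments: the processing order does not matter.
clean = [(0, b"ABCD"), (4, b"EFGH")]
print(reassemble(clean), reassemble(clean, last_first=True))

# Overlapping fragments (bytes 4-5 are sent twice with different content).
overlap = [(0, b"ABCDEF"), (4, b"xyGH")]
print(reassemble(overlap))                    # first-to-last
print(reassemble(overlap, last_first=True))   # last-to-first
```

For the overlapping case, the two processing orders yield different payloads even though the fragments on the wire are identical.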
HYBRID CHARACTERISTICS
Hybrid systems combine the functionality from
several different IDS categories to create a system
that provides more functionality than a traditional
IDS.
 Some hybrid systems might incorporate multiple
triggering techniques, such as anomaly and
signature-based detection.
 Other hybrid IDSs might combine multiple
monitoring locations, such as host-based and
network-based monitoring.
 The major hurdle to constructing a hybrid IDS is
getting the various components to operate in
harmony, and presenting the information to the end
user in a user-friendly manner.

HYBRID CHARACTERISTICS: BENEFITS

Different IDS technologies are combined.

A combined host-based and network-based system, for
example, provides the overall network visibility of a
network-based IDS, as well as detailed host-level
visibility.
Combining anomaly detection with misuse
detection can produce a signature-based IDS that
can detect previously unknown attacks.
 Each hybrid system needs to be analyzed on its
unique strengths.
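
A minimal sketch of combining the two triggering techniques is shown here. Everything in it is an assumption made for illustration: the two signature strings, the use of request length as the only anomaly feature, the baseline values, and the three-standard-deviation threshold. It is not how any particular product works.

```python
from statistics import mean, stdev

# Hypothetical signature patterns, for illustration only.
SIGNATURES = ("/etc/passwd", "cmd.exe")

# Hypothetical request lengths observed during normal operation.
BASELINE = [20, 22, 19, 21, 23, 20, 22, 21]


def signature_alert(request: str) -> bool:
    """Signature (misuse) detection: match against known intrusive patterns."""
    return any(pattern in request for pattern in SIGNATURES)


def anomaly_alert(request: str, baseline, threshold: float = 3.0) -> bool:
    """Anomaly detection: flag lengths far from the baseline's normal range."""
    mu, sigma = mean(baseline), stdev(baseline)
    return sigma > 0 and abs(len(request) - mu) / sigma > threshold


def hybrid_alert(request: str, baseline=BASELINE) -> list:
    """Return which triggering mechanisms, if any, raised an alarm."""
    reasons = []
    if signature_alert(request):
        reasons.append("signature")
    if anomaly_alert(request, baseline):
        reasons.append("anomaly")
    return reasons


print(hybrid_alert("GET /products/list.js"))   # ordinary request
print(hybrid_alert("GET /../../etc/passwd"))   # known pattern, normal length
print(hybrid_alert("GET /" + "A" * 200))       # no known pattern, unusual length
```

The third request matches no signature but is still flagged, which is the sense in which adding anomaly detection lets the combined system catch previously unknown attacks. The price is the anomaly detector's exposure to false positives whenever legitimate traffic drifts from the baseline.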

HYBRID CHARACTERISTICS: DRAWBACKS
Getting these different technologies to work
together in a single IDS can be difficult.

Normally, hybrid systems attempt to merge multiple
diverse intrusion detection technologies. Combining
these technologies can produce a stronger IDS.
Presenting the information from these multiple
technologies to the end user in a coordinated
fashion can also be a challenge.
Each hybrid system needs to be examined to
understand its strengths and weaknesses.
SUMMARY

The common triggering mechanisms are as follows:
Anomaly detection
Signature-based detection

Anomaly detection is more complex than signature-based
detection, but it provides the capability to
detect previously unpublished attacks.
Each type of IDS has its own
strengths and weaknesses.

Anomaly detection can detect critical information in
data.
It is highly applicable in various application domains.
The nature of the anomaly detection problem depends
on the application domain.
Different approaches are needed to solve each
problem formulation.

EXERCISE - DISCUSS
What are the two major types of IDS monitoring?
 What are the two types of IDS triggering?
 What are some drawbacks to anomaly detection?
 What is the difference between a false positive and
a false negative?
