Boundary Detection in Tokenizing Network Application Payload for Anomaly Detection

Rachna Vargiya and Philip Chan
Department of Computer Sciences
Florida Institute of Technology
Motivation
• Existing anomaly detection techniques rely on information derived only from the packet headers
• More sophisticated attacks involve the application payload
  • Example: Code Red II worm
    GET /default.ida?NNNNNNNNN…
• Parsing the payload is required!
• Problems in hand-coded parsing:
  • Large number of application protocols
  • Frequent introduction of new protocols
Problem Statement
To parse application payload into tokens
without explicit knowledge of the
application protocols
These tokens are later used as features
for anomaly detection
Related Work
• Pattern Detection - Important Tokens
  • Fixed Length: Forrest et al. (1998)
  • Variable Length: Wespi et al. (2000), Jiang et al. (2002)
• Boundary Detection - All Tokens
  • VOTING EXPERTS by Cohen et al. (2002)
    • Boundary Entropy
    • Frequency
    • Binary Votes
Approach
Boundary Finding Algorithms:
Boundary Entropy
Frequency
Augmented Expected Mutual Information
Minimum Description Length
Approach is domain independent (no prior
domain knowledge)
Combining Boundary Finding Algorithms
Combine all of the techniques, or a subset of them (e.g. Frequency + Minimum Description Length)
Each algorithm can cast multiple votes, depending on its confidence measure
Boundary Entropy (Cohen et al.)
Entropy at the end of each possible window is calculated
High entropy means more variation
[Figure: a window w over the string "Itisarainyday", followed by the next byte x]
BE(w) = -\sum_{x} P(x \mid w) \log P(x \mid w)
‘x’ is the byte following the current window w
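The entropy above can be estimated directly from byte counts. The following is a minimal sketch, not the authors' implementation: the function name `boundary_entropy`, the use of base-2 logarithms, and the estimate of P(x | w) as a relative frequency over all occurrences of the window are assumptions made for illustration.

```python
import math
from collections import Counter

def boundary_entropy(data: bytes, window: bytes) -> float:
    """Entropy of the byte that follows each occurrence of `window` in `data`.

    Assumption: P(x | w) is estimated as the relative frequency of byte x
    among all bytes observed immediately after w.
    """
    followers = Counter()
    start = data.find(window)
    while start != -1:
        nxt = start + len(window)
        if nxt < len(data):
            followers[data[nxt]] += 1
        # Step by one so overlapping occurrences are also counted.
        start = data.find(window, start + 1)
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in followers.values())

# After "a" the next byte is b, c, b, d: more variation, higher entropy.
print(boundary_entropy(b"abacabad", b"a"))   # 1.5
```

A window that is always followed by the same byte yields zero entropy, which is why low values suggest the middle of a token.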
Voting using Boundary Entropy
[Figure: boundary entropy at each position of the string "Itisarainyday"]
Entropy in meaningful tokens starts with a
high value, drops, and peaks at the end
Vote for positions with the peak entropy
Threshold suppresses votes for low
entropy values
Threshold = Average BE
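The voting rule can be sketched as follows. This is an illustration under assumptions: a "peak" is taken to be a local maximum of the entropy sequence, and `peak_votes` is a hypothetical helper name. Only the threshold (the average boundary entropy) comes from the slide.

```python
def peak_votes(entropies):
    """Return the positions that receive a vote.

    A position votes if its entropy is a local maximum (assumed meaning
    of "peak") and exceeds the average entropy (the slide's threshold).
    """
    threshold = sum(entropies) / len(entropies)
    votes = []
    for i, h in enumerate(entropies):
        left = entropies[i - 1] if i > 0 else float("-inf")
        right = entropies[i + 1] if i + 1 < len(entropies) else float("-inf")
        if h > threshold and h >= left and h >= right:
            votes.append(i)
    return votes

# Average is 1.0; positions 0 and 2 are peaks above it, position 5 is not.
print(peak_votes([2.0, 0.5, 1.8, 0.4, 0.3, 1.0]))   # [0, 2]
```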
Frequency (Cohen et al.)
• The most frequent tokens are assumed to be meaningful tokens
• Frequencies are counted for tokens of length 1, 2, 3, …, 6
• Shorter tokens are inherently more frequent than longer tokens
• Frequencies are normalized among tokens of the same length using the standard deviation
• A boundary is assigned at the end of the most frequent token in the window
[Figure: the window "Itis" at the start of the string "Itisarainyday"]
Frequency in window:
(1) "I" = 3
(2) "It" = 5
(3) "Iti" = 2
(4) "Itis" = 3
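The frequency expert can be sketched as below. This is a rough illustration rather than the authors' code: the names `ngram_zscores` and `frequency_boundary` are hypothetical, the population standard deviation is an assumed choice, and candidate tokens are assumed to begin at the start of the window.

```python
from collections import Counter
from statistics import mean, pstdev

def ngram_zscores(data: str, max_len: int = 6) -> dict:
    """Score each n-gram (n = 1..max_len) by how many standard deviations
    its count lies from the mean count of n-grams of the same length."""
    scores = {}
    for n in range(1, max_len + 1):
        counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
        if not counts:
            continue
        values = list(counts.values())
        mu, sigma = mean(values), pstdev(values)
        for gram, c in counts.items():
            scores[gram] = (c - mu) / sigma if sigma > 0 else 0.0
    return scores

def frequency_boundary(data: str, start: int, scores: dict, max_len: int = 6) -> int:
    """Offset just past the best-scoring token that begins at `start`
    (requires start < len(data))."""
    candidates = [data[start:start + n]
                  for n in range(1, max_len + 1) if start + n <= len(data)]
    best = max(candidates, key=lambda g: scores[g])
    return start + len(best)

scores = ngram_zscores("ababab", max_len=2)
print(frequency_boundary("ababab", 0, scores, max_len=2))   # 2, i.e. after "ab"
```

Normalizing within each length is what lets a length-2 token such as "ab" compete with the single bytes, which would otherwise always win on raw counts.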
Mutual Information (MI)
Mutual Information is given by:
MI(a, b) = \lg \frac{P(a, b)}{P(a)\,P(b)}
MI measures the reduction in uncertainty about event ‘b’ once event ‘a’ is observed
MI does not incorporate the counter evidence, when ‘a’ occurs without ‘b’ and vice versa
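A small numeric illustration of the MI formula follows. The probabilities are made-up inputs chosen for easy arithmetic, and `mutual_information` is a hypothetical helper name.

```python
import math

def mutual_information(p_ab: float, p_a: float, p_b: float) -> float:
    """MI(a, b) = lg[ P(a, b) / (P(a) P(b)) ]; requires p_ab > 0."""
    return math.log2(p_ab / (p_a * p_b))

# Independent events: P(a, b) = P(a) P(b), so MI is 0.
print(mutual_information(0.25, 0.5, 0.5))   # 0.0
# 'a' and 'b' always occur together: MI is 1 bit.
print(mutual_information(0.5, 0.5, 0.5))    # 1.0
```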
Augmented Expected Mutual Information (AEMI)
AEMI(A, B) = P(a, b)\,MI(a, b) - P(a, \bar{b})\,MI(a, \bar{b}) - P(\bar{a}, b)\,MI(\bar{a}, b)
• AEMI sums the supporting evidence and subtracts the counter evidence
• For each window, the location with the minimum AEMI value suggests a boundary
[Figure: adjacent segments a and b within the string "Itisarainyday"]
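AEMI can be computed from three probabilities, as sketched below. This is a sketch under assumptions: P(a, b), P(a) and P(b) are taken as given, the counter-evidence probabilities are derived as P(a, not b) = P(a) - P(a, b) and P(not a, b) = P(b) - P(a, b), and terms with zero joint probability are treated as contributing 0. The helper names are hypothetical.

```python
import math

def _weighted_mi(p_joint: float, p_x: float, p_y: float) -> float:
    """p_joint * lg(p_joint / (p_x * p_y)); taken as 0 when p_joint is 0."""
    if p_joint <= 0.0:
        return 0.0
    return p_joint * math.log2(p_joint / (p_x * p_y))

def aemi(p_ab: float, p_a: float, p_b: float) -> float:
    """Supporting evidence (a with b) minus counter evidence
    (a without b, and b without a)."""
    p_a_not_b = p_a - p_ab
    p_not_a_b = p_b - p_ab
    return (_weighted_mi(p_ab, p_a, p_b)
            - _weighted_mi(p_a_not_b, p_a, 1.0 - p_b)
            - _weighted_mi(p_not_a_b, 1.0 - p_a, p_b))

print(aemi(0.5, 0.5, 0.5))    # 0.5  (a and b always together)
print(aemi(0.25, 0.5, 0.5))   # 0.0  (independent)
print(aemi(0.0, 0.5, 0.5))    # -1.0 (never together: likely boundary)
```

The last case shows why the minimum AEMI within a window is read as a boundary: the bytes on either side rarely belong together.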
Minimum Description Length (MDL)
• Shorter codes are assigned to frequent tokens to minimize the overall coding length
• The boundary yielding the shortest coding length is assigned votes
• Coding length per byte:
MDL = -\sum_{i \in \{left, right\}} \frac{\lg P(t_i)}{|t_i|}
  • -\lg P(t_i): number of bits to encode t_i
  • |t_i|: length of t_i
[Figure: a candidate boundary splits the window on "Itisarainyday" into t_left and t_right]
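The per-byte coding length can be sketched as below. This is a sketch under assumptions, not the authors' implementation: P(t) is estimated as the relative frequency of t among all substrings of the same length in the data, the window is assumed to be taken from the data (so every token has nonzero probability), and the function names are hypothetical.

```python
import math

def token_prob(data: str, token: str) -> float:
    """Relative frequency of `token` among substrings of the same length."""
    n = len(token)
    total = len(data) - n + 1
    count = sum(1 for i in range(total) if data[i:i + n] == token)
    return count / total

def mdl_score(data: str, window: str, split: int) -> float:
    """Coding length per byte for t_left = window[:split], t_right = window[split:]."""
    left, right = window[:split], window[split:]
    return sum(-math.log2(token_prob(data, t)) / len(t) for t in (left, right))

def best_split(data: str, window: str) -> int:
    """Split position inside the window with the shortest coding length."""
    return min(range(1, len(window)), key=lambda s: mdl_score(data, window, s))

print(best_split("ababab", "abab"))   # 2, i.e. "ab" | "ab"
```

Dividing by the token length keeps long tokens from being penalized simply for being long, which is the point of measuring the cost per byte.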
Normalize scores of each algorithm
Each algorithm produces a list of scores
Since the number of votes is proportional to the score, the scores must be normalized
Each score is replaced by the number of standard deviations that it lies from the mean score
Normalize votes of each algorithm
Each algorithm produces a list of votes that depends on its scores
Normalization ensures that every algorithm votes with the same weight
Each vote count is replaced by the number of standard deviations that it lies from the mean vote count
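Both normalization steps replace a value by its distance from the mean in standard deviations. A minimal sketch follows; the name `zscores` and the choice of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev

def zscores(values):
    """Replace each value by the number of standard deviations it lies from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Mean 5, standard deviation 2.
print(zscores([2, 4, 4, 4, 5, 5, 7, 9]))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The same function can be applied first to an algorithm's scores and later to its votes, so that no single algorithm dominates only because its raw numbers are larger.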
Normalizing Scores and Votes
[Figure: for the example string "ItIs", each algorithm's scores (s1 to s4) are normalized (ns1 to ns4) and turned into votes (v1 to v4); the votes are normalized and then summed into the combined normalized votes]
Combined Approach with Weighted Voting
A list of votes from all the experts is gathered
For each boundary, the final votes are summed
A boundary is placed at a position if the votes at that position exceed the threshold
Threshold = average number of votes
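The combination step can be sketched as follows. It assumes each expert supplies one (already normalized) vote value per candidate position; `combine_votes` is a hypothetical name.

```python
def combine_votes(expert_votes):
    """Sum the votes of all experts per position and keep the positions
    whose total exceeds the average total (the slide's threshold)."""
    totals = [sum(column) for column in zip(*expert_votes)]
    threshold = sum(totals) / len(totals)
    return [i for i, total in enumerate(totals) if total > threshold]

# Two experts, four candidate positions: totals are [1, 5, 0, 1], average 1.75.
print(combine_votes([[0, 2, 0, 1],
                     [1, 3, 0, 0]]))   # [1]
```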
Evaluation Criteria
• Evaluation A: % of space-separated words retrieved
• Evaluation B: % of keywords in the protocol specification that were retrieved
• Evaluation C: entropy of the tokens in the output file (lower is better)
• Evaluation D: number of detected attacks in network traffic
Evaluations A and B apply only to text-based protocols
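Evaluations A and C can be illustrated with two small metrics. The slides do not spell out the exact definitions, so this is a sketch under assumptions: a reference word counts as retrieved only on an exact match with a produced token, distinct words are counted once, and the entropy is the Shannon entropy (base 2) of the token distribution. Both function names are hypothetical.

```python
import math
from collections import Counter

def percent_words_retrieved(tokens, reference_words) -> float:
    """Evaluation A style metric: % of distinct reference words found among the tokens."""
    produced = set(tokens)
    reference = set(reference_words)
    found = sum(1 for word in reference if word in produced)
    return 100.0 * found / len(reference)

def token_entropy(tokens) -> float:
    """Evaluation C style metric: entropy of the token distribution (lower is better)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

reference = "GET /index.html HTTP/1.0".split()
tokens = ["GET", "/index.html", "HTTP", "/1.0"]
print(percent_words_retrieved(tokens, reference))      # two of three words found
print(token_entropy(["GET", "GET", "POST", "HEAD"]))   # 1.5
```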
Anomaly Detection Algorithm: LERAD (Mahoney and Chan)
LERAD forms rules based on 23 attributes
• First 15 attributes: from the packet header
• Next 8 attributes: from the payload
• Example rule:
  If port = 80 then word1 = “GET”
Original payload attributes: space-separated tokens
Our payload attributes: boundary-separated tokens
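The way tokens become payload attributes can be sketched as below. This is not LERAD itself: it only shows the first 8 tokens being used as attributes and a toy check of the example rule. Padding short payloads with empty strings is an assumption, and both function names are hypothetical.

```python
def payload_attributes(tokens, n: int = 8, pad: str = ""):
    """First n tokens of the payload, padded so every record has n attributes."""
    head = list(tokens[:n])
    return head + [pad] * (n - len(head))

def violates_example_rule(port: int, attributes) -> bool:
    """Toy check of the example rule: if port = 80 then word1 = "GET"."""
    return port == 80 and attributes[0] != "GET"

attrs = payload_attributes(["GET", "/", "HTTP/1.0"])
print(attrs)                               # 3 tokens followed by 5 empty strings
print(violates_example_rule(80, attrs))    # False
print(violates_example_rule(80, payload_attributes(["XYZ"])))   # True
```

The same function serves both variants on the slide: pass space-separated tokens for the original attributes, or boundary-separated tokens for the proposed ones.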
Experimental Data
• 1999 DARPA Intrusion Detection Evaluation Data Set
  • Week 3: attack-free (training) data
  • Weeks 4 and 5: attack-containing (test) data
• Evaluations A, B, C (known boundaries): Week 3
  • Trained: days 1-4
  • Tested: days 5-7
  • Prevents gaining knowledge from Weeks 4 and 5
• Evaluation D (detected attacks)
  • Trained: Week 3
  • Tested: Weeks 4 and 5
Evaluation A: % of Space-Separated Tokens Recovered
Method               Port 25   Port 80   Port 21   Port 79   Avg
Freq+MDL                52        26        21        81     45.0
Frequency               15        16        13        99     36.0
BE+AEMI+MDL+Freq        21        14         5        12     13.0
AEMI                     5         9         4        32     12.5
MDL                      6         7         3        25     10.3
BE                       3         3         1         9      4.0
Evaluation B: % of Keywords in RFCs Recovered
Method               Port 25   Port 80   Port 21   Avg
Freq+MDL                40        36        59     45.0
Frequency               31        28        40     33.0
BE+AEMI+MDL+Freq        12        13        21     15.3
AEMI                     9         5         2      5.3
MDL                      7         6         1      4.7
BE                       3         2         2      2.3
Evaluation C: Entropy of Output (Lower is Better)
Average across 6 ports
Method                     Average Value
Frequency                  5.0
MDL                        5.03
Freq+MDL                   5.06
BE                         5.25
BE + AEMI + Freq + MDL     5.56
AEMI                       6.38
Ranking of Algorithms
Method               Evaluation A   Evaluation B   Evaluation C
Freq+MDL                  1              1              3
Frequency                 2              2              1
BE+AEMI+MDL+Freq          3              3              5
AEMI                      4              4              6
MDL                       5              5              2
BE                        6              6              4
Detection Rate for Space Separated vs. Boundary Separated (Freq + MDL)
                  10 FP/day            100 FP/day
Port #            Space   Boundary     Space   Boundary
20                  2        2           4        5
21                 14       16          14       17
22                  3        3           3        3
23                 13       14          13       14
25                 15       16          16       16
79                  3        3           3        3
80                 10       10          11       13
113                 2        2           2        2
Overall            59       62          63       68
% Improvement      --        5          --        8
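The % Improvement figures are consistent with a relative gain of the boundary-separated overall count over the space-separated one. A short check follows, assuming the space-separated result is the baseline; `percent_improvement` is a hypothetical name.

```python
def percent_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline

print(round(percent_improvement(59, 62)))   # 5  (10 FP/day: 59 -> 62 detections)
print(round(percent_improvement(63, 68)))   # 8  (100 FP/day: 63 -> 68 detections)
```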
Summary of Contributions
• Used payload information, while most IDSs concentrate on header information
• Proposed AEMI + MDL for boundary detection
• Combined all and subsets of the algorithms
• Used weighted voting to indicate confidence
• Proposed techniques find boundaries better than spaces
• Achieved higher detection rates in an anomaly detection system
Future Work
• Further evaluation on other ports
• Pick more useful tokens instead of the first 8
• The DARPA data set is partially synthetic; further evaluation on real traffic is needed
• Evaluation with other anomaly detection algorithms
Thank you
Experimental Results
Table 4.3.4 Results from Additional Ports for Freq + MDL and ALL
         Evaluation A           Evaluation B            Evaluation C
         (% Words Found)        (% Keywords Found)      (Entropy)
Port     Freq+MDL   ALL         Freq+MDL   ALL          Freq+MDL   ALL
23          13        7             5        3            7.88     8.08
115         43       20                                   4.45     5.18
515         38       14                                   7.66     7.27