Identifying Malicious Web Requests through Changes in Locality and Temporal Sequence

DIMACS Workshop on Security
of Web Services and E-Commerce
Li-Chiou Chen
lchen@pace.edu
School of Computer Science and Information Systems
Pace University
May 4th, 2005
Needs for anomaly detection in distributed network traces

Fast-spreading Internet worms and other malicious programs interrupt web services
Early detection and response are vital
These attacks are usually launched from distributed locations
Network traces left at distributed locations are invaluable when searching for clues to potential future attacks
E.g. DShield, the Honeynet Project
© Li-Chiou Chen, 5/6/2005
Types of IDS

Based on data
  Network-based IDS: monitors and inspects network traffic
  Host-based IDS: runs on a single host
Based on detection techniques
  Signature-based IDS: uses pattern matching to identify known attacks
  Anomaly-based IDS: uses statistical, data mining, or other techniques to distinguish normal from abnormal activities
Outline
Toolkits for inferring anomaly patterns from distributed network traces
Previous works
Changes of locality over time
Markov chain analysis
Preliminary results
Summary
Future works
TIAP: Toolkits for inferring anomalous patterns in distributed network traces
[Figure: TIAP architecture. Components shown: network traces (web log, tcpdump, etc.), data conversion, locality pattern analysis, sequence pattern analysis, alerts from other IDS or TIAP peers (using IDMEF), response module, alerts to other IDS or TIAP peers (using IDMEF), alerts to administrators.]
Web level IDS
Anomaly detection
  Structure of an HTTP request (Kruegel and Vigna 03)
  Normality on streams of data access patterns (Sion et al. 03)
Misuse detection
  State transition analysis of HTTP requests (Vigna et al. 03)
  Look for attack signatures (Almgren et al. 01)
Changes in locality patterns and temporal sequence patterns
Locality
  where the web request is sent from, such as the source IP address
  which web server is requested, such as the destination IP address
Temporal sequence
  the order of requested objects during a given period of time
Locality pattern analysis in distributed network traces
[Figure: example traces ABAA, ABCD, KIKL, ABPO from distributed locations; time steps t1: AB, with t2 to t4 elided in the source.]
An example: web traces in common log format from 6 web servers
Servers: S1, S2, S3, S4, S5, S6
tstamp, ip, server, doc_tpe, user_agent
62978, 38.0.69.1, 1, 2, 3
62979, 38.0.69.1, 1, 2, 3
62979, 38.0.69.1, 2, 2, 3
63001, 38.0.69.1, 1, 2, 3
……..
(figure label on the rows above: a session)
Data profiles
6 web servers (2 of them link to each other, 4 are independent)
One-day web trace
One session: requests from a distinct IP within a 10-minute interval
193,070 HTTP requests
11,177 sessions
HTTP requests from outside the organization
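The session definition above can be sketched in code. This is a minimal sketch, not the author's implementation: it assumes `tstamp` is in seconds and reads the 10-minute interval as the maximum gap between consecutive requests from the same IP (a fixed 10-minute bucket would be another reading). The `Request` tuple, the `sessionize` function, and the last sample row are hypothetical; the other rows come from the example trace.

```python
from collections import namedtuple

# Field names mirror the trace header; doc_type stands in for "doc_tpe".
Request = namedtuple("Request", "tstamp ip server doc_type user_agent")

SESSION_GAP = 600  # assumption: timestamps are seconds, so 10 minutes = 600

def sessionize(requests, gap=SESSION_GAP):
    """Group requests into sessions: same IP, consecutive requests <= gap apart."""
    sessions = []
    latest = {}  # ip -> index of that IP's most recent session
    for req in sorted(requests, key=lambda r: r.tstamp):
        idx = latest.get(req.ip)
        if idx is not None and req.tstamp - sessions[idx][-1].tstamp <= gap:
            sessions[idx].append(req)
        else:
            latest[req.ip] = len(sessions)
            sessions.append([req])
    return sessions

rows = [
    Request(62978, "38.0.69.1", 1, 2, 3),
    Request(62979, "38.0.69.1", 1, 2, 3),
    Request(62979, "38.0.69.1", 2, 2, 3),
    Request(63001, "38.0.69.1", 1, 2, 3),
    Request(64000, "38.0.69.1", 1, 2, 3),  # hypothetical row, 999 s later
]
sessions = sessionize(rows)
```

Under these assumptions the four rows from the example trace fall into one session, and the hypothetical fifth row starts a second one.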
Locality pattern analysis

Number of web     Number of document
sites accessed    types accessed        % browser    % web bot
1                 1                     99%          1%
1                 2                     22%          78%
1                 3                     94%          6%
1                 4                     93%          7%
1                 5                     0%           100%
1                 6                     100%         0%
1                 7                     100%         0%
2                 1                     0%           100%
2                 2                     12%          88%
2                 3                     0%           100%
2                 4                     0%           100%
3                 2                     0%           100%
3                 3                     0%           100%
3                 4                     0%           100%
4                 2                     0%           100%
4                 3                     0%           100%
4                 4                     0%           100%

Figure annotation: 86 sessions by only two web bots
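A table of this shape can be derived from sessionized traces by counting, per session, the distinct web sites and distinct document types requested. The sketch below is illustrative only: the `locality_profile` function and the sample sessions are hypothetical, and it assumes each session already carries a "browser" or "web bot" label (for instance derived from the user agent field, which the slides do not spell out).

```python
from collections import Counter, defaultdict

def locality_profile(sessions):
    """sessions: iterable of (client_kind, [(server, doc_type), ...]).

    Returns {(n_sites, n_doc_types): {"browser": pct, "web bot": pct}}.
    """
    counts = defaultdict(Counter)
    for kind, reqs in sessions:
        n_sites = len({server for server, _ in reqs})
        n_types = len({doc_type for _, doc_type in reqs})
        counts[(n_sites, n_types)][kind] += 1
    profile = {}
    for key, c in counts.items():
        total = sum(c.values())
        profile[key] = {k: 100.0 * c[k] / total for k in ("browser", "web bot")}
    return profile

# Hypothetical sessions, not data from the talk.
sample = [
    ("browser", [(1, 2), (1, 2)]),
    ("browser", [(1, 1)]),
    ("web bot", [(2, 1)]),
    ("web bot", [(1, 1), (2, 2), (3, 3)]),
]
profile = locality_profile(sample)
```

Here `profile[(3, 3)]` is entirely web bot traffic, which mirrors the pattern in the table where sessions touching several sites are dominated by bots.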
Markov chain analysis
[Figure: a request stream (X X Y Y Y, X X Z Z X X, Z Z Z, W W W) over times t1 to t16 is split into sampling windows (sampling window 1, sampling window 2); each request is labelled N, S, or O, e.g. N S N S S, O S N S O S, N S S, N S S.]
Data profiles
1 web server
One-week web trace
Window size 30
Reference list 30
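The labelling and transition estimate behind the Markov chain analysis might look like the following. This is a sketch under stated assumptions, since the slides do not define the states precisely: a request is labelled S (same) if its IP equals the previous request's IP, O (old) if the IP is in a reference list of the most recent distinct IPs, and N (new) otherwise. That reading reproduces the figure's example, where X X Y Y Y maps to N S N S S. The function names, the interpretation of "window size 30" as 30 consecutive requests, and the non-overlapping windows are all assumptions.

```python
from collections import Counter, deque

STATES = ("N", "S", "O")

def label_requests(ips, ref_size=30):
    """Label each request N (new), S (same as previous), or O (old)."""
    labels = []
    ref = deque(maxlen=ref_size)  # most recently seen distinct IPs
    prev = None
    for ip in ips:
        if ip == prev:
            labels.append("S")
        elif ip in ref:
            labels.append("O")
        else:
            labels.append("N")
        if ip in ref:
            ref.remove(ip)  # move the IP to the most-recent end
        ref.append(ip)
        prev = ip
    return labels

def transition_probs(labels):
    """Maximum-likelihood transition probabilities between consecutive labels."""
    counts = Counter(zip(labels, labels[1:]))
    probs = {}
    for a in STATES:
        row_total = sum(counts[(a, b)] for b in STATES)
        for b in STATES:
            probs[(a, b)] = counts[(a, b)] / row_total if row_total else 0.0
    return probs

def windowed_probs(labels, window=30):
    """One transition table per non-overlapping window of labels."""
    return [transition_probs(labels[i:i + window])
            for i in range(0, len(labels), window)]

labels = label_requests(list("XXYYYXXZZXX"))
probs = transition_probs(labels)
print("".join(labels))  # -> NSNSSOSNSOS
```

Tracking a table such as `probs` per window over time gives curves like S->S, S->O, and S->N; a burst of requests from random IP addresses would be expected to shift probability mass toward transitions into N.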
Change of distinct IPs over time: browsers
[Figure: number of unique IPs per five minutes (y-axis, 0 to 100) against hours since 09/30/2004 0:00 AM (x-axis, 0 to 192).]
Change of distinct IPs over time: web bots
[Figure: number of unique IPs per five minutes (y-axis, 0 to 25) against hours since 09/30/2004 0:00 AM (x-axis, 0 to 192).]
Markov chain results
[Figure: state transition diagram over the states Old (O), New (N), and Same (S). Edge labels as extracted, in source order: 0.43 (0.14), 0.42 (0.21), 0.43 (0.17), 0.13 (0.10), 0.18 (0.16), 0.13 (0.08), 0.40 (0.22), 0.06 (0.04), 0.83 (0.10).]
Illustration of the state transition probability
[Figure: probability (y-axis, 0 to 1) of the transitions S->S, S->O, and S->N against hours since 09/30/04, 0:00 AM (x-axis, 0 to 168).]
Summary
The preliminary locality pattern analysis works well for identifying distinct web bot access patterns
The Markov chain analysis provides a way to infer attacks that use random IP addresses
A combination of the two approaches is needed
Ongoing works
Incorporate the analytical results into malware or intrusion detection
A distributed framework for data collection and information sharing to infer malware or intrusion attempts across servers, platforms, and geographical locations
Collection of attack logs for analysis
Use of the Intrusion Detection Message Exchange Format (IDMEF) for message exchange among servers
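To make the IDMEF item concrete, here is a hedged sketch of building a minimal alert message. The element and attribute names follow RFC 4765, which was published after this talk, so whether they match the format the author had in mind is an assumption. The function names, analyzer ID, message ID, and classification text are hypothetical, and the sketch makes no claim to full schema validity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

IDMEF_NS = "http://iana.org/idmef"
ET.register_namespace("idmef", IDMEF_NS)

def q(tag):
    """Qualify a tag name with the IDMEF namespace."""
    return f"{{{IDMEF_NS}}}{tag}"

def ntpstamp(dt):
    """NTP-style timestamp: seconds since 1900-01-01 plus a 32-bit fraction."""
    unix = dt.timestamp()
    secs = int(unix) + 2208988800  # offset between the 1900 and 1970 epochs
    frac = int((unix - int(unix)) * 2**32)
    return f"0x{secs:08x}.0x{frac:08x}"

def build_alert(message_id, analyzer_id, source_ip, text, when):
    """Serialize a minimal IDMEF alert for one suspicious source IP."""
    msg = ET.Element(q("IDMEF-Message"), {"version": "1.0"})
    alert = ET.SubElement(msg, q("Alert"), {"messageid": message_id})
    ET.SubElement(alert, q("Analyzer"), {"analyzerid": analyzer_id})
    created = ET.SubElement(alert, q("CreateTime"), {"ntpstamp": ntpstamp(when)})
    created.text = when.isoformat()
    source = ET.SubElement(alert, q("Source"))
    node = ET.SubElement(source, q("Node"))
    addr = ET.SubElement(node, q("Address"), {"category": "ipv4-addr"})
    ET.SubElement(addr, q("address")).text = source_ip
    ET.SubElement(alert, q("Classification"), {"text": text})
    return ET.tostring(msg, encoding="unicode")

# Hypothetical identifiers; the IP is the one from the example trace.
when = datetime(2004, 9, 30, 0, 5, tzinfo=timezone.utc)
xml_text = build_alert("tiap-0001", "tiap-peer-1", "38.0.69.1",
                       "Anomalous locality pattern", when)
```

A peer receiving `xml_text` could parse it with `ET.fromstring` and read the source address before deciding how to respond.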