ADWICE Overview

ADWICE
Anomaly Detection with Real-time Incremental Clustering
Kalle Burbeck and Simin Nadjm-Tehrani
Real-time Systems Laboratory
www.ida.liu.se/~rtslab

Overview
• Basic notions:
– Anomaly Detection
– Clustering
• The Safeguard context
• ADWICE (Anomaly Detection With fast Incremental ClustEring)
– Training and detection
– Evaluation
ADWICE
Kalle Burbeck & Simin Nadjm-Tehrani
Autumn 2005
Intrusion detection
[Figure: Attacker, Protected System and Intrusion Detection System]
[Figure: Misuse Detection and Anomaly Detection, each drawn with the regions Normal Behaviour, Bad Behaviour and Model]

Clustering
• ADWICE uses clusters to represent normality
• Adaptation of an existing data mining algorithm
[Figure: a data point and its Closest Cluster; labels c, r and d]
Overview of the project
• Goal: to enhance survivability of Large Complex Critical Infrastructures (LCCIs)
• Electricity and telecommunications networks as practical examples
• Pre 9/11!

Basic ADWICE concepts
• CF (Cluster Feature)
– Summary of cluster
• Maximal number of clusters (M)
• Threshold requirement (TR)
• CF Tree:
[Figure: CF Tree with a Non-leaf node above Leaf nodes, each node holding CF entries]
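The Basic ADWICE concepts slide describes a CF (Cluster Feature) only as a "summary of cluster". A minimal sketch of such a summary follows, assuming the BIRCH-style triple of point count, linear sum and sum of squares; the slides do not spell out the fields, so the class and attribute names here are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusterFeature:
    """Hypothetical CF triple: point count, linear sum, sum of squared norms."""
    n: int
    linear_sum: List[float]
    square_sum: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusterFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other: "ClusterFeature") -> "ClusterFeature":
        # CFs are additive, so two summaries combine without revisiting the data.
        return ClusterFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root mean squared distance of the summarised points from the centroid.
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.square_sum / self.n - c2, 0.0))


cf = ClusterFeature.from_point([0.0, 0.0]).merge(ClusterFeature.from_point([2.0, 0.0]))
print(cf.n, cf.centroid(), cf.radius())
```

Because the triple is additive, a cluster can absorb new points or be merged with another cluster in constant time per dimension, which is what makes an incremental model practical.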
ADWICE training
Threshold:
Max Number of Clusters: 3
Branching factor: 2
[Figure sequence: Data Points 1 to 5 are inserted one at a time into the CF Tree]
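The ADWICE training slides walk through inserting data points under a threshold and a maximum number of clusters. The sketch below is a flat simplification without the CF tree index or branching factor. It reads the threshold requirement as "distance from the point to the closest centroid is at most `threshold`", and it merges the two closest clusters when the maximum is exceeded; both choices are assumptions of this sketch, not details taken from the slides.

```python
import math
from typing import Dict, List, Sequence


def centroid(cluster: Dict) -> List[float]:
    return [s / cluster["n"] for s in cluster["sum"]]


def merge_closest_pair(clusters: List[Dict]) -> None:
    # Find the pair of clusters whose centroids are nearest and merge them.
    pairs = [
        (math.dist(centroid(a), centroid(b)), i, j)
        for i, a in enumerate(clusters)
        for j, b in enumerate(clusters)
        if i < j
    ]
    _, i, j = min(pairs)
    absorbed = clusters.pop(j)  # j > i, so index i stays valid
    clusters[i]["n"] += absorbed["n"]
    clusters[i]["sum"] = [x + y for x, y in zip(clusters[i]["sum"], absorbed["sum"])]


def train(points: Sequence[Sequence[float]], threshold: float, max_clusters: int) -> List[Dict]:
    clusters: List[Dict] = []
    for p in points:
        closest = min(clusters, key=lambda c: math.dist(centroid(c), p), default=None)
        if closest is not None and math.dist(centroid(closest), p) <= threshold:
            # Point fits the closest cluster: update its summary in place.
            closest["n"] += 1
            closest["sum"] = [s + x for s, x in zip(closest["sum"], p)]
        else:
            clusters.append({"n": 1, "sum": list(p)})
            if len(clusters) > max_clusters:
                merge_closest_pair(clusters)
    return clusters


model = train([(0, 0), (0.5, 0), (10, 0), (20, 0), (30, 0)], threshold=1.0, max_clusters=3)
print([(c["n"], centroid(c)) for c in model])
```

With five points and at most three clusters, as on the slides, the fifth point forces a merge, so the model stays within its size bound at the cost of coarser clusters.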
Evaluation
• KDD99 Data
• General properties
– Session records (TCP/UDP summaries)
– 41 features (flags, service, traffic stats ...)
• Training data
– 4 898 431 session records
– 972 781 normal, the rest (attacks) not used
• Testing data
– 311 029 session records
– normal data and 37 different attack types
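The results in this deck are reported as detection rate against false positives rate. A minimal sketch of how the two rates could be computed from labelled test records follows, assuming the usual definitions (detected attacks over all attacks, and flagged normal records over all normal records); the function name and input format are hypothetical.

```python
from typing import Sequence, Tuple


def detection_and_false_positive_rate(
    is_attack: Sequence[bool], is_alarm: Sequence[bool]
) -> Tuple[float, float]:
    # Count attacks that raised an alarm and normal records that raised one.
    detected = sum(1 for a, f in zip(is_attack, is_alarm) if a and f)
    false_alarms = sum(1 for a, f in zip(is_attack, is_alarm) if not a and f)
    attacks = sum(1 for a in is_attack if a)
    normals = len(is_attack) - attacks
    detection_rate = detected / attacks if attacks else 0.0
    false_positive_rate = false_alarms / normals if normals else 0.0
    return detection_rate, false_positive_rate


labels = [True, True, False, False, False, False]
alarms = [True, False, True, False, False, False]
dr, fpr = detection_and_false_positive_rate(labels, alarms)
print(dr, fpr)
```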
ADWICE detection
[Figure: Data Points and the CF Tree]
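For the ADWICE detection step, a new data point is judged against the clusters that represent normality. The sketch below assumes the simplest form of that test: the point is an anomaly if its distance to the closest cluster centroid exceeds a threshold. The function name, the use of plain centroids and the linear search (instead of the CF tree) are assumptions of this sketch.

```python
import math
from typing import Sequence


def is_anomaly(
    point: Sequence[float],
    centroids: Sequence[Sequence[float]],
    threshold: float,
) -> bool:
    # Distance from the point to the nearest cluster centroid.
    nearest = min(math.dist(c, point) for c in centroids)
    return nearest > threshold


normal_model = [(0.0, 0.0), (10.0, 0.0)]
print(is_anomaly((0.5, 0.0), normal_model, threshold=1.5))
print(is_anomaly((5.0, 0.0), normal_model, threshold=1.5))
```

Raising the threshold makes the detector more tolerant, which trades a lower false positives rate for a lower detection rate.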
Detection rate vs. false positives
[Figure: Detection rate (y-axis, 0.75 to 1) against False positives rate (x-axis, ticks 0.015, 0.035, 0.055); series ADWICEDT-TRR and ADWICEDT-TRD]

Alarm aggregation
• Anomaly detection may produce many similar alarms (e.g. DoS, Probes, False positives)
• Similar alarms can be aggregated without losing accuracy
[Figure: on a Time line from Start to End, alarms <t1,HTTP, ...> and <t2,HTTP, ...> are aggregated into <Start, End, Count = 2, HTTP, ...>]

Alarm aggregation results
[Figure: Number of aggregated alarms against Size of time window]

A Safeguard scenario
[Figure: Number of alarms against Period number (1 minute per period); labels Malicious, User, Scripts]
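The alarm aggregation slide shows two alarms, <t1,HTTP, ...> and <t2,HTTP, ...>, collapsing into one record <Start, End, Count = 2, HTTP, ...>. A minimal sketch of that idea follows, assuming that alarms are "similar" when they share a signature string and that a window opens at the first alarm of a group; both rules, and all names, are assumptions of this sketch.

```python
from typing import Dict, List, Sequence, Tuple

Alarm = Tuple[float, str]                  # (timestamp, signature)
Aggregate = Tuple[float, float, int, str]  # (start, end, count, signature)


def aggregate(alarms: Sequence[Alarm], window: float) -> List[Aggregate]:
    open_groups: Dict[str, List] = {}  # signature -> [start, end, count]
    result: List[Aggregate] = []
    for t, sig in sorted(alarms):
        group = open_groups.get(sig)
        if group is not None and t - group[0] <= window:
            # Same signature inside the window: extend the open group.
            group[1] = t
            group[2] += 1
        else:
            if group is not None:
                result.append((group[0], group[1], group[2], sig))
            open_groups[sig] = [t, t, 1]
    # Flush the groups that are still open at the end of the stream.
    for sig, group in open_groups.items():
        result.append((group[0], group[1], group[2], sig))
    return sorted(result)


raw = [(1.0, "HTTP"), (2.0, "HTTP"), (30.0, "HTTP"), (3.0, "FTP")]
print(aggregate(raw, window=10.0))
```

A larger window groups more alarms into each record, which matches the slide's plot of the number of aggregated alarms against the size of the time window.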
Need for normality adaptation
• Normality is not static!
[Figure: Model, Normal behaviour and Malicious behaviour]

New cases of normality
• Normality changes
– New type of normal behaviour
• Old model incomplete
– Evaluation using KDD data gives ~300 false positives for new normality
Evaluation of normality adaptation
• Admin or system reacts
– Recognizes new false positives
– Tells ADWICE to learn this behaviour
• Normality model adapted
– 300 -> 3 false positives
[Figure: Model, Normal Behaviour and Bad Behaviour]

Forgetting
• System keeps track of model usage
– If time since last usage is very long for subset of clusters
– Decrease size (influence) of those clusters and finally remove them if not used
[Figure: Model, Normal Behaviour and Bad Behaviour]
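The Forgetting slide describes tracking when each cluster was last used, shrinking long-unused clusters and finally removing them. A minimal sketch of one such pass follows. The decay factor, the idle limit, the minimum size and all names are hypothetical; the slides give no concrete values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackedCluster:
    name: str
    size: float       # influence of the cluster, e.g. its point count
    last_used: float  # time of the last detection that matched this cluster


def forget(
    clusters: List[TrackedCluster],
    now: float,
    max_idle: float,
    decay: float = 0.5,
    min_size: float = 1.0,
) -> List[TrackedCluster]:
    kept = []
    for c in clusters:
        if now - c.last_used > max_idle:
            c.size *= decay  # unused for too long: reduce its influence
        if c.size >= min_size:
            kept.append(c)   # clusters that shrink below min_size are dropped
    return kept


model = [
    TrackedCluster("A", size=4.0, last_used=95.0),
    TrackedCluster("B", size=1.5, last_used=0.0),
    TrackedCluster("C", size=8.0, last_used=0.0),
]
remaining = forget(model, now=100.0, max_idle=50.0)
print([(c.name, c.size) for c in remaining])
```

Running the pass periodically removes stale clusters gradually rather than at once, so behaviour that reappears soon after going quiet is still covered by the model.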
Recent work
• New grid-based index
– False positives decreased by 0.5-1.0%
– Better on-line performance
• Autonomous adaptation
– Incremental increase
– Forgetting
• Work with Safeguard test network continues
– Now 100+ machines
Questions?