A Data Mining Approach for Building
Cost-Sensitive and Light Intrusion
Detection Models
Quarterly Review – November 2000
North Carolina State University
Columbia University
Florida Institute of Technology
Outline
• Project description
• Progress report:
– Cost-sensitive modeling (NCSU/Columbia/FIT).
– Automated feature and model construction (NCSU).
– Anomaly detection (NCSU/Columbia/FIT).
– Attack “clustering” and light modeling (FIT).
– Real-time architecture and systems (NCSU/Columbia).
– Correlation (NCSU).
– Collaboration with industry (NCSU/Columbia).
– Publications and software distribution.
– Effort and budget.
• Plan of work for next quarter
New Ideas and Hypotheses (1/2)
• High-volume automated attacks can
overwhelm a real-time IDS and its staff
– IDS needs to consider cost factors:
• Damage cost, response cost, operational cost, etc.
• Pure statistical accuracy not ideal:
– Base-rate fallacy of anomaly detection.
– Alternative: the cost (saving) of an IDS.
New Ideas and Hypotheses (2/2)
• Thorough analysis cannot always
be done in real-time by one sensor:
– Correlation of multiple sensor outputs.
– Trend or scenario analysis.
• Need better theories and tools for
building misuse and anomaly
detection models:
– Characteristics of normal data and attack
signatures can be measured and utilized.
Main Approaches (1/2)
• Cost-sensitive models and architecture:
– Optimized for the cost metrics defined by users.
• Cost-sensitive machine learning algorithms.
– Multiple specialized and light sensors
dynamically activated/configured at run time.
• “Load balancing” of models and data
• Aggregation and correlation.
• Cost-effectiveness as the guiding principle and
multi-model correlation as the architectural
approach.
Main Approaches (2/2)
• Theories and tools for more effective
anomaly and misuse detection:
– Information-theoretic measures for anomaly
detection
• “Regularity” of normal data is used to build model.
– New algorithms, e.g.
• Unsupervised learning using “noisy” data.
• Using “artificial anomalies”
– An automated system that integrates all these
algorithms/tools.
Project Impacts (1/2)
• A better understanding of the cost factors, cost
models, and cost metrics related to intrusion
detection.
• Modeling techniques and deployment
strategies for cost-effective IDSs
– Provide the “best-valued” protection.
• “Clustering” techniques for grouping
intrusions and building specialized and light
sensors.
• An architecture for dynamically activating,
configuring, and correlating sensors.
Project Impacts (2/2)
• More effective misuse and anomaly detection
models
– With sound theoretical foundations and
automation tools.
• Analysis/correlation techniques for
understanding/recognizing and predicting
complex attack scenarios.
Cost-Sensitive Modeling
• In previous quarters:
– Cost factors and metrics definition and analysis.
– Cost model definition.
– Cost-sensitive modeling with machine learning.
– Evaluation using DARPA off-line data.
• Current quarter:
– Real-time architecture.
– Dynamic cost-sensitive deployment and
correlation of sensors.
A Multi-Layer/Component Architecture
[Architecture diagram with components: ID Model Builder (produces models), Real-time IDS, Backend IDS, Remote IDS/Sensor, Dynamic Cost-sensitive Decision Making, and the firewall (FW).]
Next Steps
• Study “realistic” cost metrics in
the real world.
• Implement a prototype system
– Demonstrate the advantage of cost-sensitive
modeling and dynamic cost-effective deployment.
• Use representative scenarios for evaluation.
An Automated System for
Feature and Model
Construction
The Data Mining Process of Building ID Models
[Process diagram: raw audit data → packets/events (ASCII) → connection/session records → models.]
Feature Construction From Patterns
[Process diagram: mining over new intrusion records yields intrusion patterns, which are compared with patterns mined from normal and historical intrusion records; the comparison yields features, which are added to the training data for learning detection models.]
Status and Next Steps
• The effectiveness of the algorithms/tools
(process steps) has been validated
– 1998 DARPA Evaluation.
• Automating the process:
– Process steps “chained” together.
– Process iteration: under development.
• Field test:
– Advanced Technology Systems, General Dynamics.
– Planned public release 2Q-2001.
• Dealing with “unlabeled” data
– Integrate “anomaly detection over noisy data
(Columbia)” algorithms.
Information-Theoretic Measures
for Anomaly Detection
• Motivations:
– Need a formal understanding.
• Hypothesis:
– Anomaly detection is based on “regularity” of
normal data.
• Approach:
– Entropy and conditional entropy: regularity
• Determine how to build a model.
– Relative (conditional) entropy: how the regularities
between training and test datasets relate
• Determine the performance of a model on test data.
Case Studies
• Anomaly detection for Unix processes
– “Short sequences” as normal profile.
– A classification approach:
• Given the first k system calls, predict the k+1st system call
– How to determine the “sequence length”, k? Will
including other information help?
– UNM sendmail system call traces.
– MIT Lincoln Lab BSM data.
• Anomaly detection for network traffic
– How to partition the data (a complex subject
being refined).
– MIT Lincoln Lab tcpdump data.
Entropy and Conditional Entropy
$$H(X) = -\sum_{x} P(x)\log P(x)$$
• “Impurity” of the dataset
• The smaller (the more regular), the better.

$$H(X \mid Y) = -\sum_{x,y} P(x,y)\log P(x \mid y)$$
• “Irregularity” of sequential dependencies
• “Uncertainty” of a sequence after seeing
its prefix (subsequences)
• The smaller (the more regular), the better.
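For concreteness, here is a minimal sketch of estimating these two measures from an event trace (sample estimates over sliding windows; an illustration, not the project's tool):

```python
from collections import Counter
from math import log2

def entropy(items):
    # H(X) = -sum_x P(x) log P(x), estimated from the sample.
    counts, n = Counter(items), len(items)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def conditional_entropy(seq, k):
    # H(X|Y): uncertainty of the next event given the preceding
    # length-k prefix, estimated over sliding windows.
    pairs = [(tuple(seq[i:i + k]), seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    joint = Counter(pairs)                     # counts of (prefix, next event)
    prefix = Counter(p for p, _ in pairs)      # counts of prefix alone
    return -sum((c / n) * log2(c / prefix[p])  # -P(x,y) log P(x|y)
                for (p, _), c in joint.items())

# Toy system-call trace (illustrative, not UNM data):
trace = ["open", "read", "read", "write", "close", "open", "read", "write"]
print(entropy(trace), conditional_entropy(trace, k=2))
```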
Relative (Conditional) Entropy
$$\mathrm{relEntropy}(p \mid q) = \sum_{x} p(x)\log\frac{p(x)}{q(x)}$$

$$\mathrm{relCondEntropy}(p \mid q) = \sum_{x,y} p(x,y)\log\frac{p(x \mid y)}{q(x \mid y)}$$
• How different p is from q:
• How different the regularity of the test data is
from that of the training data
• The smaller, the better.
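A companion sketch for the relative entropy between two empirical distributions (it assumes the training distribution covers the test distribution's support; smoothing is omitted for brevity):

```python
from collections import Counter
from math import log2

def relative_entropy(p_counts, q_counts):
    # relEntropy(p|q) = sum_x p(x) log(p(x)/q(x)).
    # Assumes q covers p's support; no smoothing.
    n_p, n_q = sum(p_counts.values()), sum(q_counts.values())
    return sum((c / n_p) * log2((c / n_p) / (q_counts[x] / n_q))
               for x, c in p_counts.items())

# E.g., compare the event distribution of test vs. training data:
train, test = Counter("aaab"), Counter("aab")
print(relative_entropy(test, train))
```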
Information Gain and Classification
$$\mathrm{Gain}(X, A) = H(X) - \sum_{v \in \mathrm{Values}(A)} \frac{|X_v|}{|X|}\, H(X_v)$$
• How much attribute/feature A contributes to the
classification process:
• The reduction of entropy when the dataset is
partitioned according to the values of A.
• The larger, the better.
• If A = the first k events in a sequence (i.e., Y) and the
class label is the (k+1)st event:
• The conditional entropy H(X|Y) is just the second
term of Gain(X, A).
• The smaller the conditional entropy, the better the
performance of the classifier.
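To make the link explicit (a standard information-theoretic identity, restated here): when A is the length-k prefix Y, the weighted sum over prefix values is exactly the conditional entropy,

$$\mathrm{Gain}(X, Y) = H(X) - \sum_{y} P(y)\, H(X \mid Y = y) = H(X) - H(X \mid Y)$$

Since H(X) is fixed for a given dataset, choosing the window length k that minimizes H(X|Y) is the same as maximizing the gain.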
Conditional Entropy of Training Data (UNM)
[Line chart: conditional entropy (y-axis, 0–0.6) vs. sliding window size (x-axis, 1–17) for the traces bounce-1.int, bounce.int, queue.int, plus.int, sendmail.int, plus total and mean.]
Misclassification Rate: Training Data
[Line chart: misclassification rate (y-axis, 0–50) vs. sliding window size (x-axis, 1–17) for bounce-1.int, bounce.int, queue.int, plus.int, sendmail.int, plus total and mean.]
Conditional Entropy vs. Misclassification Rate
[Line chart: conditional entropy and misclassification rate (y-axis, 0–1.2) vs. sliding window size (x-axis, 1–17); series: total-CondEnt, total-MisClass, mean-CondEnt, mean-MisClass.]
Misclassification Rate of Testing Data and Intrusion Data
[Line chart: misclassification rate (y-axis, 0–50) vs. sliding window size (x-axis, 1–17) for the normal traces (bounce-1.int, bounce.int, queue.int, plus.int, sendmail.int, total) and the intrusion traces (sm-10763.int, syslog-local-1.int, fwd-loops-1.int through fwd-loops-5.int).]
Relative Conditional Entropy btw. Training and Testing Normal Data
[Line chart: relative conditional entropy (y-axis, 0–0.06) vs. sliding window size (x-axis, 1–17) for bounce-1.int, bounce.int, queue.int, plus.int, sendmail.int, total, and mean.]
(Real and Estimated) Accuracy/Cost (Time) Trade-off
[Line chart: accuracy/cost (y-axis, 0–0.0009) vs. sliding window size (x-axis, 1–17); series: estimated accur/cost (total), accur/cost (total), estimated accur/cost (mean), accur/cost (mean).]
Conditional Entropy of In- and Out-bound Email (MIT/LL BSM)
[Line chart: conditional entropy (y-axis, 0–0.7) vs. sliding window size (x-axis, 1–17); series: s-o-in0, s-in0, so-in0, s-o-out0, s-out0, so-out0.]
Relative Conditional Entropy
[Line chart: relative conditional entropy (y-axis, 0–0.025) vs. sliding window size (x-axis, 1–17); series: s-o-in0, s-in0, so-in0, s-o-out0, s-out0, so-out0.]
Misclassification Rate of Inbound Email
[Line chart: misclassification rate (y-axis, 0–35) vs. sliding window size (x-axis, 1–17); series: s-o-in0, s-in0, and so-in0, each at 80% and 20%.]
Misclassification Rate of Outbound Email
[Line chart: misclassification rate (y-axis, 0–40) vs. sliding window size (x-axis, 1–17); series: s-o-out0, s-out0, and so-out0, each at 80% and 20%.]
Accuracy/Cost Trade-off
[Line chart: accuracy/cost (y-axis, 0–0.001) vs. sliding window size (x-axis, 1–17); series: s-o-in0, s-in0, so-in0, s-o-out0, s-out0, so-out0, and mean.]
Estimated Accuracy/Cost Trade-off
[Line chart: estimated accuracy/cost (y-axis, 0–0.0012) vs. sliding window size (x-axis, 1–17); series: s-o-in0, s-in0, so-in0, s-o-out0, s-out0, so-out0, and mean.]
Key Findings
• “Regularity” of data can guide how to
build a model
– For sequential data, conditional entropy directly
influences the detection performance
• Determines the (best) sequence length and whether to
include more information, before building a model.
• When cost is also considered, determines the
“optimal” model.
• Detection performance on test data can be
attained only if its regularity is similar to that
of the training data.
Next Steps
• Study how to measure more complex
environments
– Network topology/configuration/traffic, etc.
• Extend the principle/approach for
misuse detection:
– Measure normal, attack, and their
relationship
• “Parameter adjustment”, performance
prediction.
New Anomaly Detection
Approaches
• Unsupervised training methods
– Build models over noisy (not clean) data
• Artificial anomalies
– Improve the performance of misuse and
anomaly detection methods.
• Network traffic anomaly detection
AD over Noisy Data
• Builds normal models over data containing
some anomalies.
• Motivating assumptions:
– Intrusions are extremely rare compared to
normal behavior.
– Intrusions are quantitatively different.
Approach Overview
• Mixture model
– Normal component
– Anomalous component
• Build probabilistic model of data
• Max likelihood test for detection.
Mixture Model of Anomalies
• Assume a generative model:
– The data is generated with a probability
distribution D.
• Each element originates from one of two
components:
– M, the Majority Distribution (x ∈ M).
– A, the Anomalous Distribution (x ∈ A).
• Thus: D = (1 − λ)·M + λ·A, where λ is the
(small) proportion of anomalies.
Modeling Probability
Distributions
• Train Probability Distributions over current
sets of M and A.
• PM(X) = probability distribution for
Majority.
• PA(X) = probability distribution for
Anomaly.
• Any probability modeling method can be
used:
– Naïve Bayes, Max Entropy, etc.
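A toy sketch of the likelihood test under these definitions, assuming a simple categorical maximum-likelihood estimate for each component; λ (lam) and the likelihood-gain threshold c are illustrative parameters, not the project's settings:

```python
import math
from collections import Counter

def dist_loglik(counts):
    # Sum of log P(x) over a component's elements, with P the categorical
    # maximum-likelihood estimate from the component itself.
    n = sum(counts.values())
    return sum(c * math.log(c / n) for c in counts.values() if c > 0)

def log_likelihood(M, A, lam):
    # LL(D) = |M| log(1-lam) + LL_M + |A| log(lam) + LL_A
    n_m, n_a = sum(M.values()), sum(A.values())
    ll = n_m * math.log(1.0 - lam) + dist_loglik(M)
    if n_a:
        ll += n_a * math.log(lam) + dist_loglik(A)
    return ll

def detect_anomalies(data, lam=0.05, c=0.0):
    M, A = Counter(data), Counter()   # start with everything in the majority
    anomalies = []
    for x in data:
        before = log_likelihood(M, A, lam)
        M[x] -= 1                     # tentatively move one occurrence to A
        if M[x] == 0:
            del M[x]
        A[x] += 1
        if log_likelihood(M, A, lam) - before > c:
            anomalies.append(x)       # the move raised the likelihood enough
        else:
            A[x] -= 1                 # revert: x stays in the majority
            if A[x] == 0:
                del A[x]
            M[x] += 1
    return anomalies

print(detect_anomalies(["a"] * 50 + ["b"]))  # the rare "b" should surface
```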
Experiments
• Two sets of experiments:
– Measured performance against comparison
methods over noisy data.
– Measured performance when trained over noisy
data against comparison methods trained
over clean data.
• The method was robust in both comparisons.
AD Using Artificial Anomalies
• Generate abnormal behavior artificially
– Assume the given normal data are representative.
– “Near misses” of normal behavior are considered
abnormal.
– Change the value of only one feature in an instance
of normal behavior.
– Sparsely represented values are sampled more
frequently.
– “Near misses" help define a tight boundary
enclosing the normal behavior.
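A minimal sketch of this generation scheme for categorical features (the function and the exact sampling weights are illustrative assumptions, not the project's algorithm):

```python
import random
from collections import Counter

def artificial_anomalies(records, n_samples, seed=0):
    # Copy a normal record and change the value of exactly one feature,
    # preferring replacement values that are sparsely represented
    # (weight inversely proportional to frequency).
    rng = random.Random(seed)
    n_features = len(records[0])
    freqs = [Counter(r[i] for r in records) for i in range(n_features)]
    out = []
    while len(out) < n_samples:
        base = list(rng.choice(records))
        i = rng.randrange(n_features)
        candidates = [v for v in freqs[i] if v != base[i]]
        if not candidates:
            continue                      # single-valued feature: skip
        weights = [1.0 / freqs[i][v] for v in candidates]
        base[i] = rng.choices(candidates, weights=weights, k=1)[0]
        out.append(tuple(base))           # a "near miss" of normal behavior
    return out
```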
Experimental Results
• Learning algorithm: RIPPER
• Data: 1998 DARPA evaluation
– U2R, R2L, DOS, PRB: 22 “clusters”
• Training data: normal and artificial anomalies
• Results
– Overall detection rate: 94.26%
– Overall false alarm rate: 2.02%
– 100% detection: buffer_overflow, guess_passwd, phf, back
– 0% detection: perl, spy, teardrop, ipsweep, nmap
– 50+% detection: 13 out of 22 intrusion subclasses
Combining Anomaly and Misuse
Detection
• Training data: normal data, artificially
generated anomalies, known intrusion data
• The learned model can predict normal,
anomaly, or known intrusion subclass
• Experiments were performed on increasing
subsets of known intrusion subclasses in the
training data (simulating intrusions being
identified over time).
Combining Anomaly and Misuse
Detection (continued)
• Consider phf, pod, teardrop, spy, and smurf
to be unknown (absent from the training data)
• Anomaly detection rate: phf=25%,
pod=100%, teardrop=93.91%, spy=50%,
smurf=100%
• Overall false alarm rate: 0.20%
• The false alarm rate dropped from 2.02%
to 0.20% when some known attacks were
included for training
Adaptive Combined Anomaly and
Misuse Detection
• Completely re-training the model whenever a
new intrusion is found is a very expensive and
slow process.
• An effective and fast remedy is very important
for thwarting these attacks.
• Full re-training is still necessary when time and
resources are sufficient.
Multiple Model Adaptive Approach
• Generate an additional detection module that is
specialized for detecting the newly discovered
intrusion.
– Method 1: trained from normal and new intrusion data
– Method 2: trained from new intrusion and artificial
anomaly data
• When the old classifier predicts “anomaly”, the
instance is passed to the new classifier to examine
whether it is the new intrusion (sketched below).
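A minimal sketch of the two-stage prediction; the model objects (anything with a `predict` method) and the label strings are assumptions for illustration, not project code:

```python
def ensemble_predict(old_model, new_model, instance):
    # Stage 1: the old model covers normal + the n known intrusion subclasses.
    label = old_model.predict(instance)
    if label != "anomaly":
        return label
    # Stage 2: only "anomaly" predictions reach the light new-intrusion model.
    refined = new_model.predict(instance)
    return refined if refined == "new_intrusion" else "anomaly"
```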
Multiple Model Adaptive Experiment
• The “old model” is trained from n
intrusions.
• A lightweight model is trained from the one
new intrusion type.
• They are combined as an ensemble.
• Their accuracy and training time are compared
with a single model trained from n + 1
intrusions.
Multiple Model Adaptive Experiment
Result
• The accuracy difference is very small
– recall: +3.4%
– precision: -16%
– In other words, the ensemble approach detects more
of the new intrusion, but also misidentifies more
anomalies as the new intrusion.
• Training time: a 150-fold difference, i.e.,
a cup of coffee versus one or two days.
Detecting Anomalies in Network
Traffic (1/2)
• Can we detect intrusions by identifying
novel values in network packets?
• Anomaly detection is potentially useful in
detecting novel attacks.
• Our model is trained on attack-free tcpdump
data.
• Fields in the Transport layer or below are
considered.
Detecting Anomalies in Network
Traffic (2/2)
• Normal field values are learned.
• During evaluation, a function scores a
packet based on the likelihood of
encountering novel field values.
• Initial results indicate our learned model
compares favorably with other systems on
the 1999 DARPA evaluation data.
Packet Fields
• Fields in Data-link, Network, and Transport layers.
– (Application layer will be considered later)
• Ethernet: source, destination, protocol.
• IP: header length, TOS, fragment ID, TTL,
transport protocol …
• TCP: header length, UAPRSF flags, URG pointer
…
• UDP: length …
• ICMP: type, code…
Anomaly Scoring Function (1/2)
• N1 = Number of unique values in a field in
the training data
• N = Number of packets in the training data
• Likelihood of observing a novel value in a
field is:
N1 / N
(escape probability, Witten and Bell,
1991)
Anomaly Scoring Function (2/2)
• Non-stationary model: consider the last
occurrence of novel values
• t = number of seconds since the last novel
value in the same field
• Likelihood of observing an anomaly:
P = (N1 / N) * (1 / t)
• Field anomaly score: Sf = 1 / P
• Packet anomaly score = Σf Sf, summed over
the packet’s fields (sketched below)
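A minimal sketch of this scoring scheme for categorical packet fields; the class and function names are illustrative, and whether the novelty clock keeps updating during detection is an assumption here:

```python
class FieldModel:
    # Per-field non-stationary novelty model: P = (N1 / N) * (1 / t).
    def __init__(self):
        self.seen = set()        # unique training values (N1 = len(seen))
        self.n = 0               # training packets observed (N)
        self.last_novel = 0.0    # time of the most recent novel value

    def train(self, value, now):
        self.n += 1
        if value not in self.seen:
            self.seen.add(value)
            self.last_novel = now

    def score(self, value, now):
        if value in self.seen:
            return 0.0                          # known value: no anomaly
        t = max(now - self.last_novel, 1.0)     # seconds since last novelty
        p = (len(self.seen) / self.n) * (1.0 / t)
        self.last_novel = now                   # the novelty clock resets
        return 1.0 / p                          # Sf = 1 / P

def packet_score(field_models, packet, now):
    # Packet anomaly score: the sum of the per-field scores.
    return sum(m.score(packet[f], now) for f, m in field_models.items())
```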
Experiments
• 1999 DARPA evaluation data (from Lincoln Lab).
• Same mechanism as DARPA in determining
detection (correct IP address of the victim, 60
seconds before and after an attack).
• Score thresholds of our system and others are
lowered to produce no more than 100 false alarms.
• Some of the other systems use binary scoring.
Initial Results

IDS         TP/FP (All)   TP/FP (Network)   IDS Type
Oracle      200/0         72/0              ideal
FIT         64/100        51/100            anomaly
GMU         51/22         27/22             anomaly+signature
NYU         20/80         14/80             signature
SUNY        24/9          19/9              signature
NetSTAT     70/995        35/995            signature
EmeraldTCP  83/23         35/23             signature
Discussion
• All attacks: more detections with 100 or fewer
false alarms than most systems except Emerald
and NetSTAT.
• Our initial experiments did not look at fields in the
Application protocol layer.
• Network attacks: more detections with 100 or
fewer false alarms than the other systems.
• 57 out of 72 attacks were detected with 100 false
alarms.
Summary of Progress
• Florida Tech’s official start date: August 30, 2000.
• Near-term objective: using learning techniques to
build anomaly detection models that can identify
intrusions.
• Progress: initial experimental results on the 1999
DARPA evaluation data indicate that our
techniques compare favorably with the other
systems in detecting network attacks.
Plans for the Next Quarter
• Investigate an entropy approach to detecting
anomalies.
• Study methods that incorporate more information
from packets prior to the current packet.
• Examine how effective our techniques are with
respect to individual attack types.
• Devise techniques to catch attack types that are
undetected.
• Incorporate fields in the Application protocol layer
into our model.
Anomaly Detection: Summary
and Plans
• Anomaly detection is a main focus.
• Both theories and new approaches.
• Will integrate:
– Theories applied to develop new AD sensors.
– Cost-sensitive measures.
– Real-time architecture/performance studies.
– The automated feature and model construction
system.
Correlation Analysis of Attack
Scenario
• Motivations:
– Detecting individual attack actions not adequate
• Damage assessment, trend prediction, etc.
• Hypothesis:
– Attacks are related and such correlation can be
learned.
• Approach:
– Start with crude knowledge models.
– Use data mining to validate/refine the models.
– An IETF/IDWG architecture/system.
Objectives (1/2)
• Local/low layer correlations in an IDS
– Multiple sources of raw (audit) data
• Raw information: tcpdump data, BSM records…
• Based on specific attack signatures, system/user
normal profiles …
– Benefits:
• Better accuracy: higher TP, lower FP
• More alarm information for higher level and global
analysis
Objectives (2/2)
• Global / High Layer Correlations
– Multiple sources of alarms by IDSs
– The bigger picture
• What really happened in our networks?
• What can we learn from these cases?
– Benefits:
• What is the intention of the attacks?
• What will happen next? When? Where?
• What can we do to prevent it from happening?
Architecture of Global Correlation System
[Architecture diagram: alarms from IDSs flow into the Alarm Collection Center, then through the Alarm Pre-Processor to the Correlation Engine, and on to the Alarm Post-Processor and the Report Center; a Knowledge Base and Knowledge Controller support the Correlation Engine.]
Correlation Techniques from
Network Management System (1/2)
• Rule-Based Reasoning (RBR)
– If–then rules based on domain knowledge
and expertise.
– Sufficient for small, non-changing, and
well-understood systems.
• Model-Based Reasoning (MBR)
– Model both physical and logical entities, such
as hubs, routers …
– Correlation is a result of the collaboration
among models.
Correlation Techniques from
Network Management Systems (2/2)
• State-Transition Graph (STG)
– Logical connections via state-transition.
– May lead to unexpected behavior if the
collaborating STGs are not carefully defined.
• Case-Based Reasoning (CBR)
– Learns from experience and offers solutions
to novel problems based on past cases.
– Needs a similarity metric to retrieve
useful cases from the library.
Correlation Techniques for IDS
• Combination of different correlation
techniques
– Network complexity.
– Wide variety of attack motives and tools.
• Adaptation of different correlation
techniques
– Different perspectives between NMS and IDS.
Challenges of Correlation (1/2)
• Knowledge representation
– How to represent the objects such as
alarms, log files, network entities?
– How to model the knowledge such as
network topology, network history,
intrusion library, previous cases?
Challenges of Correlation (2/2)
• Knowledge base construction
– What kind of knowledge base do we need?
– How to construct the knowledge base?
• Case library
• Network Knowledge
• Intrusion Knowledge
– Pattern discovery (domain knowledge/expert
systems, data mining …)
A Case Study: DDoS
• An attack scenario from MIT/LL
– Phase 1: IPSweep of the AFB from a remote
site.
– Phase 2: Probe of live IPs to look for the
‘sadmind’ daemon running on Solaris hosts.
– Phase 3: Break-ins via the ‘sadmind’
vulnerability.
– Phase 4: Installation of the trojan program—
’mstream’ DDoS software on three hosts at the
AFB.
– Phase 5: Launching the DDoS.
Alarm Model
• Object-Oriented
• Alarm A: {feature1, feature2, …}
• Features of Alarm
– Attack type
– Time stamp
– Service
– Source IP / domain
– Target IP / domain
– Target number
– Source type (router, host, server …)
– Target type (router, host, server …)
– Duration
– Frequency within time window
Alarm Model
• Example:
– IP sweep 09:51:51 ICMP ppp5-23.iawhk.com
172.16.115.x 20 hosts servers 9 1
• Attack type: IP sweep
• Time stamp: 09:51:51
• Service: ICMP
• Source IP: ppp5-23.iawhk.com
• Target IP: 172.16.115.x
• Target number: 20
• Source type: n/a
• Target type: hosts and servers
• Duration: 9 seconds
• Frequency: 1
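As an illustration, the alarm above could be encoded as a simple record type; the class and field names are a hypothetical sketch mirroring the feature list, not project code:

```python
from dataclasses import dataclass

@dataclass
class Alarm:
    attack_type: str      # e.g., "IP sweep"
    timestamp: str        # e.g., "09:51:51"
    service: str          # e.g., "ICMP"
    source_ip: str        # source IP / domain
    target_ip: str        # target IP / domain
    target_number: int    # number of targets
    source_type: str      # router, host, server, ...
    target_type: str      # router, host, server, ...
    duration_sec: int
    frequency: int        # within the time window

a1 = Alarm("IP sweep", "09:51:51", "ICMP", "ppp5-23.iawhk.com",
           "172.16.115.x", 20, "n/a", "hosts and servers", 9, 1)
```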
Scenario Representation (1/2)
• Attack scenario graph
– Constructed by domain knowledge
• Can be validated/augmented via data mining.
– Describing attack scenarios via state
transition.
– Each transition with probability P.
– Modifiable by experts.
– Adaptive to new cases.
Scenario Representation (2/2)
• Example of attack scenario graph
[State-transition graph with nodes IP Sweep, Port Scan, Buffer Overflow, Trojan Installation, and DDoS variants (TFN2K DDoS, Trinoo DDoS, Mstream DDoS; e.g., SMURF, Syn Flood, UDP Flood), connected by probabilistic transitions.]
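One possible encoding of such a graph; the states follow the example above, while the edges and transition probabilities shown are placeholders to be set by experts or learned from data:

```python
# Each state maps to candidate next states with transition probabilities.
SCENARIO_GRAPH = {
    "IP Sweep":            {"Port Scan": 0.7},
    "Port Scan":           {"Buffer Overflow": 0.6},
    "Buffer Overflow":     {"Trojan Installation": 0.8},
    "Trojan Installation": {"TFN2K DDoS": 0.3, "Trinoo DDoS": 0.3,
                            "Mstream DDoS": 0.4},
}

def predict_next(state):
    # Rank the likely next attack stages after an observed stage.
    return sorted(SCENARIO_GRAPH.get(state, {}).items(),
                  key=lambda kv: -kv[1])

print(predict_next("Trojan Installation"))
```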
Correlation Rule Sets
• Based on
– Attack scenario graph.
– Domain knowledge and expertise.
– Case library.
• Two Layers of Rule Sets
– Lower layer for matching/correlating specific
alarms.
– Higher layer for trend prediction.
– Probability assigned.
Correlation Rule Sets
• Example of low-layer rule sets
– If (A1.type = “IP Sweep” & A2.type = “Port Scan”)
& (A1.time < A2.time) & (A1.domain = A2.domain)
& (A2.target# > 10), then A1 & A2
…
– If (A2.type = “Port Scan” & A3.type = “Buffer Overflow”)
& (A2.time < A3.time) & (A3.DestIP belongs to A2.domain)
& (A3.target# >= 2), then A2 & A3
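For illustration, the first rule could be expressed as a predicate over the Alarm records sketched earlier; this is a hypothetical encoding in which the target IP/domain field stands in for the alarms' domain:

```python
def rule_sweep_then_scan(a1, a2):
    # Low-layer rule: an IP Sweep followed by a Port Scan in the same
    # domain that touched more than 10 targets. Timestamps are HH:MM:SS
    # strings, so lexicographic comparison works within one day.
    return (a1.attack_type == "IP sweep"
            and a2.attack_type == "Port Scan"
            and a1.timestamp < a2.timestamp
            and a1.target_ip == a2.target_ip
            and a2.target_number > 10)
```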
Correlation Rule Sets
• Example of high-layer rule sets
– If (A1 & A2) and (A2 & A3), then A1 & A2 & A3
– If (A1 & A2 & A3), then the attack scenario is
A1 -> A2 -> A3 -> A4 with probability P1, or
A1 -> A2 -> A3 -> A4 -> A5 with probability P2
– E.g.,
If (“IP Sweep” & “Port Scan” & “Buffer Overflow”)
Then next1 = “Trojan Installation” with P1
next2 = “DDoS” with P2
Status and Next Steps
• At the very beginning of this research.
• Attack Scenario Graph
– How to construct it automatically?
– How to model the statistical properties of
attack scenario state transition?
• How to automatically generate the
correlation rule sets?
• Collaboration with other groups:
– Alarm formats, architecture, IETF/IDWG.
Real-time System Implementation
• Motivations
– Validate our algorithms and models in the
real world.
– Faster technology transfer and greater impact.
• Approach
– Collaboration with industries
• Reuse available “building blocks” as much as
possible.
Conceptual Architecture
[Diagram: Sensors send data to the Data Warehouse; the Adaptive Model Generator exchanges models with the warehouse; Detectors receive models and sensor data.]
System Architecture
[Diagram: model generation components (Supervised Machine Learning, Unsupervised Machine Learning, Real Time Data Mining) feed the Adaptive Model Generation, which exchanges data and models with the Data Warehouse. Sensors include NT, Linux, and Solaris host-based IDSs, the Malicious Email Filter, a “Meta” IDS, File System Wrappers, the NFR network-based IDS, and Software Wrappers.]
Sensor: Host Based IDS System
• Generic Interface to Sensors
– BAM (Basic Auditing Module)
– Sends data to data warehouse
– Receives models from data warehouse
• NT System
– Fully Operational
• Linux System & BSM (Solaris) System
– Sensor Operational
– Under Construction
• Plan to finish construction by end of
semester
Sensor: Network IDS System
• NFR Based Sensor
– Data Mining based
• Efficient Evaluation Architecture
– Multiple Models
• System operational and integrated
with larger system
Sensor: Malicious Email Filter
• Monitors Email (sendmail)
– Detects malicious emails entering domain
• Key Features:
– Model Based
– Generalizes to unknown malicious attachments
– Models distributed automatically to filters
• Status:
– Prototype operational
– Open source release by end of semester
Sensor: Advanced IDS Sensors
• File Wrappers
• Software Wrappers
• Monitor other aspects of the system
• Status:
– File Wrappers almost finished
– Software Wrappers under development
Data Warehouse
• Stores data collected from sensors
– Generic IDS data format
– Data can be manipulated in database
– Cross reference data from attacks
• Stores generated models
• Status:
– Currently Operational
– Refining Interface and Data Transfer Protocol
– Completed by end of Semester
Adaptive Model Generator
• Builds models from data in the data warehouse
• Uses both supervised and unsupervised data
• Can build models based on the data collected
• XML-based data exchange format
• Status:
– Exchange formats defined
– Prototype developed
– Completion by end of semester
Collaboration with Industries
• NFR.
• Cigital (RST).
• SAS.
• General Dynamics.
• Aprisma/Cabletron.
• HRL.
Publications and Software, etc.
• 4 journal and 10+ conference papers
– One best paper and two runners-up.
• JAM.
• MADAMID.
• PhDs: two graduated, one
graduating, five in the pipeline …
• More to come …
Efforts: Current Tasks
• Cost-sensitive modeling (NCSU/Columbia/FIT).
• Automated feature and model construction
(NCSU/Columbia/FIT)
– Integration of all algorithms and tools.
• Anomaly detection (NCSU/Columbia/FIT).
• Attack “clustering” and light modeling (FIT).
• Real-time architecture and systems
(NCSU/Columbia).
• Correlation (NCSU).
• Collaboration with industry (NCSU/Columbia/FIT).