Report on Multi-agent Data Fusion System: Design and

advertisement
Report on statistical Intrusion
Detection systems
By Ganesh Godavari
Outline of the talk
• Intrusion Detection
• Motivation
• Approaches for intrusion detection
Intrusion Detection & Data Fusion
• Intrusion Detection System
– Protect and provide availability, confidentiality
and integrity of critical information
infrastructures
• Data Fusion : task of data processing
aiming at making decisions on the basis of
distributed data sources specifying an
object
Motivation & challenges
• Threat analysis
– Known & unknown Pattern templates, traffic
analysis, statistical-anomaly detection and
state based detection
• Provide Reliability
– Reduce false alarms, increase user
confidence
Characteristics of IDS
• Key to an IDS
– Minimize the occurrence of non-justified alerts
(false-positive)
– Maximize accurate alerts (true-positive)
• Some of the methods
– Data mining
– Statistical
– Signature based or rule based
Signature based method
• Signature based IDS is as strong as its rule-sets
• If X events of interest are detected across a Ysized time window – raise an alert
• Advantages
– Potential for low alarm rates
– Accuracy of detection
– Detailed textual log
• Disadvantages
– Need to update rules every time
– Inability to detect new and previously unidentified
attacks
Statistical-Based Intrusion
Detection (SBID)
• Determine the normal network activity all
network traffic pattern outside the normal
scope is not normal
• SBID system relies on statistical models
like Bayes’ theorem to detect anomalous
packets on the network
disadvantages
• SBID system must learn what is normal
traffic for a particular network
• Longer time to adapt and cannot be handy
in smaller run unlike signature based
intrusion detection system
• If Normal traffic is malicious SBID system
will be rendered useless
• Alerts produced have no meaning to
untrained eye
Snort IDS
• Snort
– popular open IDS
– uses signature and statistical based intrusion
detection
• Statistical based intrusion detection is
provided by SPADE preprocessor plugin
SPADE
• Statistical Packet Anomaly Detection
Engine
– Silicon defence
– Probability measurements for anomalous
packet detection
– Anomaly score determined by evaluating
• Source IP
• Destination IP
• Destination port …
contd..
• Spade
– Automatically adjust threshold settings to
reduce false positives
– Generate reports about distribution of
anomaly scores.
Spade alerts
[**] [104:1:1] spp_anomsensor: Anomaly threshold exceeded: 3.8919 [**] 08/22-22:37:00.419813
24.234.114.96:3246 -> VICTIM.HOST:80 TCP TTL:116 TOS:0x0 ID:25395 IpLen:20 DgmLen:48
DF ******S* Seq: 0xEBCF8EB7 Ack: 0x0 Win: 0x4000 TcpLen: 28 TCP Options (4) => MSS:
1460 NOP NOP SackOK
The alert is an attempt to connect to a local web server. There is not a web server at the
VICTIM.HOST address, so this is unusual activity. Yet, Spade did not flag this packet with a high
anomaly score. In this specific case, the low anomaly score is likely due to the Code Red
epidemic. The anomaly score of this packet is very low because the system had become
accustomed to seeing traffic to port 80. Spade clearly thought this packet was not exceedingly
anomalous activity (instead, Spade likened the port 80 request to the scenario where the
newspaper landed on the driveway, which was anomalous, but not particularly unusual).
[**] [104:1:1] spp_anomsensor: Anomaly threshold exceeded: 10.5464 [**] 08/22-22:22:46.577210
24.41.81.216:2065 -> VICTIM.HOST:27374 TCP TTL:108 TOS:0x0 ID:10314 IpLen:20
DgmLen:48 DF ******S* Seq: 0x63B97FE2 Ack: 0x0 Win: 0x4000 TcpLen: 28 TCP Options (4)
=> MSS: 1460 NOP NOP SackOK
The packet shows a highly anomalous trace. With a score of 10.5464, this packet is extremely
unique to the network. When looking at the destination port, it becomes clear why this packet
should not be transmitted to the network. Simply, there are no services on the network utilizing the
27374 port. In fact, upon further investigation, it is realized that this port is usually associated with
the Sub Seven Trojan [22]. Therefore, the packet warrants investigation, and Spade correctly
associated a high anomaly score to the trace.
Survey log
The survey log listed below displays the distribution of anomaly scores over time (60 minutes).
The file shows the hour relative to the execution of the Spade program, the total number of
packets of the specified hour, the average anomaly score (Median Anom), the 90th percentile,
and the the 99th percentile anomaly scores.
Logfile.txt
392 packets recorded
51 packets reported as alerts
Threshold learning results: top 200 anomaly scores over 23.58361 hours
Suggested threshold based on observation: 3.522590
Top scores: 3.52317, 3.52433, 3.52549, 3.52665, 3.52782, 3.52898, 3.53015, 3.53132, 3.53249,
3.53366, 3.53483, 3.53601, 3.53718, 3.53836, 3.53954, 3.54072, 3….10.29728,
10.33199,10.33308,10.33351,10.33360,10.33362
First runner up is 3.52201, so use threshold between 3.52201 and 3.52317 for 8.523
packets/hr
H(dip)=5.30397479
H(dport|dip)=9.69742991
P(dip=44044824)= 0.064466877730
P(dip=44044824,dport=1)= 0.000062047043
P(dip=44044824,dport=2)= 0.000077558804
P(dip=44044824,dport=3)= 0.000062047043
P(dip=44044824,dport=4)= 0.000062047043
P(dip=44044824,dport=5)= 0.000062047043
Initially, the log displays basic packet statistics and the threshold learning results.
This log shows how and why Spade is determining a certain threshold for a particular time.
Towards the bottom of this file probability statistics are listed where H = entropy,
dip = destination IP, dport = destination port, and P = probability.
Questions
?
References
• http://www.silicondefence.com/
• http://www.sans.org/resources/idfaq/statisti
cs_ids.php
Download