Anomaly Detection using Curious Agents
A Case Study in Network Intrusion Detection

Kamran Shafi, Research Associate, DSARC
Kathryn Merrick, Lecturer, ITEE

Presentation Outline
• Anomaly Detection
• Curious Agents
• Intrusion detection using curious agents
• Metrics, results and analysis
• Future work

Anomaly Detection
• The process of finding patterns that deviate from the known or expected behaviour of a monitored system.
• Assumptions:
  – Normal is prevalent
  – Anomalous is significantly different
• A word on novelty detection…

Anomaly Detection Challenges
• A burglar alarm detects anomalies…
  – Computational models are faced with a much more difficult task
• What is ‘normal’ anyway?
  – Listing all possibilities is infeasible
  – Concepts change over time
  – Labelled data may be unavailable or noisy
• Anomalies can be very similar to normal
• Anomalies are domain dependent

Anomaly Detection Techniques
• Classification
• Nearest Neighbour (NN)
• Clustering
• Statistical
• Information Theoretic Approaches
• Spectral Analysis
Based on Varun et al., 2009

Research Questions in Anomaly Detection
• Streaming data
• Concept drift
• Data labelling
• Contextual anomaly detection

Curious Agents
• Currently used in robotics and character animation
• Online, single-pass, unsupervised learners
• Programmed to seek out and focus on ‘curious’ stimuli
[Images: UNSW@ADFA; Sony CSL, Paris]

Curiosity
• In humans and animals:
  – A motivation to seek an ‘optimal’ level of stimulation
  – Stimuli that are similar-yet-different to what we already know
• In artificial systems:
  – A scalar value for environmental stimuli based on:
    • Similarity
    • Frequency of similar stimuli
    • Recency of similar stimuli

Curious Agent Models
• Curious reinforcement learning agents
  – Robotics and character animation
• Curious supervised learning agents
  – Intelligent sensed environments (Merrick and Maher, 2009)
• Curious reflex agents
  – Proposed for anomaly detection in networks

Case Study

Network Intrusion Detection
• The two fundamental approaches to intrusion detection (ID):
  – Misuse detection
  – Anomaly detection
• Anomaly detection for ID is classified in two ways:
  – How normal data is interpreted and modelled
    • Host based, network based…
  – How similarity is measured
    • Statistical profiling, pattern matching, classification, clustering…

Intrusion Detection Challenges
• Intrusions need to be detected in real time, before they can damage the system
• Concepts change over time
  – New (legitimate) users
  – New applications
  – Novel attacks
• Attacks are stealthy and disguised as normal

Advantages of Curious Agents for Network Intrusion Detection
• Curious agents combine three measures to analyse stimuli (network data):
  – Similarity: clustering layer
  – Recency: habituating layer
  – Frequency: interest layer
• Online, single-pass learners:
  – Potential for real-time operation
• Unsupervised learners:
  – Potential to adapt to changes in network usage
  – Don’t require labelled data

Curious Reflex Agents for Intrusion Detection
• Approaches tested:
  – Self-organising map
  – K-means clustering
  – Simplified ART network

Experimental Data: KDD Cup Dataset
• 1999 KDD Cup intrusion detection dataset:
  – 38 attack categories (14 only in the test data) + normal data
  – Approx. 40 features
  – Commonly used to test algorithms for intrusion detection
• Critiques of KDD Cup data:
  – Parent dataset contains simulation artefacts; their effect on the KDD Cup data is not known
  – Not suitable for evaluating supervised learning methods; the curious agents approach overcomes this problem
  – Labelling issues, outdated

Experimental Design
• Data pre-processing
  – Mapping categorical features
  – Normalisation (Formula ??)
• Number of runs
• Validation on training set
• Validation on test set
• Algorithm parameters
  – ?
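
The two pre-processing steps listed under Experimental Design can be sketched as below. This is only an illustration under stated assumptions: the slides do not give the normalisation formula, so min-max scaling to [0, 1] is assumed here, and the integer coding of categorical features, the function names and the sample records are all hypothetical.

```python
from typing import Dict, List, Sequence, Set, Union

Value = Union[str, int, float]


def map_categorical(rows: Sequence[Sequence[Value]],
                    categorical_columns: Set[int]) -> List[List[float]]:
    """Replace categorical values with integer codes (assumed scheme).

    Codes are assigned per column in order of first appearance.
    """
    codebooks: Dict[int, Dict[Value, int]] = {c: {} for c in categorical_columns}
    encoded: List[List[float]] = []
    for row in rows:
        new_row: List[float] = []
        for col, value in enumerate(row):
            if col in codebooks:
                codebook = codebooks[col]
                if value not in codebook:
                    codebook[value] = len(codebook)
                new_row.append(float(codebook[value]))
            else:
                new_row.append(float(value))
        encoded.append(new_row)
    return encoded


def min_max_normalise(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every column to [0, 1] (assumed formula); constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical records: duration, protocol, service, bytes sent.
records = [
    [0, "tcp", "http", 181],
    [2, "udp", "domain_u", 105],
    [0, "tcp", "smtp", 1684],
]
encoded = map_categorical(records, categorical_columns={1, 2})
normalised = min_max_normalise(encoded)
```

Note that this version computes column minima and maxima over the whole batch; a strictly online, single-pass agent would need the ranges in advance or a running estimate of them.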
Metrics
• True positive rate:
  – A weighted measure
  – Percent of first instances of an attack that trigger agent curiosity above some threshold C
  – The higher the better
• False positive rate:
  – Percent of all normal data that triggers agent curiosity above threshold C
  – The lower the better

Results and Analysis: Overview
[Bar chart: true positive and false positive rates (%) at C = 0.7]
                  SOM     K-Means   ART
TRUE POSITIVE     90.4    83.6      78.8
FALSE POSITIVE    40.1    15.1      11.9

Results in Detail
[Bar chart: weighted detection of attacks (%) per attack type for SOM, K-Means and ART. Attack types shown: back, buffer_overflow, ftp_write, guess_passwd, imap, ipsweep, land, loadmodule, multihop, neptune, nmap, perl, phf, pod, portsweep, rootkit, satan, smurf, spy, teardrop, warezclient, warezmaster]

Aggregative Detection Rates
Attack Category   SOM (%)   KMEANS (%)   ART (%)
Probe             99.68     98.38        88.47
DOS               82.49     72.77        70.98
U2R               97.73     91.67        91.67
R2L               88.09     80.21        73.96
Overall           91.99     85.76        81.27
FA Rate           40.1      15.1         11.9

Strengths and Limitations
• Strengths:
  – High detection rates for rare attack types
  – Potential for real-time intrusion detection
  – Allowance for alarm aggregation
  – Allowance for tuning the trade-off between detection and false alarms
• Limitations:
  – False positive rate is too high to be practical
  – Parameter settings; these can be made adaptive

Conclusions
• Curious agents show potential as an approach to ID:
  – Online, single-pass, unsupervised learning
  – High detection rate for attacks in KDD data set
• False positive rate needs to be decreased before practical application is possible
• Future Directions
  – Parameter adaptation
  – Semi-supervised feedback
  – Evaluation with other datasets
  – Real-time implementation
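
The two rates defined on the Metrics slide can be sketched as follows. This is a simplified illustration, not the evaluation code behind the reported results: the slides call the true positive rate a weighted measure without giving the weights, so the sketch is unweighted, and it assumes that a "first instance" means the first record of each attack type in the stream. The function name, the record layout and the example stream are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Each record pairs a ground-truth label with the agent's curiosity value.
Record = Tuple[str, float]


def detection_rates(stream: Iterable[Record],
                    threshold: float) -> Tuple[float, float]:
    """Return (true positive %, false positive %) for curiosity threshold C."""
    seen_attacks: Set[str] = set()
    first_instances = detected_first = 0
    normals = false_alarms = 0
    for label, curiosity in stream:
        if label == "normal":
            normals += 1
            if curiosity > threshold:
                false_alarms += 1
        elif label not in seen_attacks:
            # Only the first record of each attack type is scored (assumption).
            seen_attacks.add(label)
            first_instances += 1
            if curiosity > threshold:
                detected_first += 1
    tpr = 100.0 * detected_first / first_instances if first_instances else 0.0
    fpr = 100.0 * false_alarms / normals if normals else 0.0
    return tpr, fpr


# Hypothetical stream of (label, curiosity) pairs.
stream = [
    ("normal", 0.2),
    ("normal", 0.8),   # normal record above C: false alarm
    ("smurf", 0.9),    # first smurf record above C: detected
    ("smurf", 0.3),    # later smurf record: not scored
    ("normal", 0.1),
    ("nmap", 0.5),     # first nmap record below C: missed
    ("normal", 0.4),
]
tpr, fpr = detection_rates(stream, threshold=0.7)
```

Raising the threshold C lowers both rates, which is the tuning trade-off listed among the strengths.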