Learning Rules from System Call Arguments and Sequences for Anomaly Detection
Gaurav Tandon and Philip Chan
Department of Computer Sciences, Florida Institute of Technology

Overview
• Related work in system call sequence-based systems
• Problem statement – can system call arguments as attributes improve anomaly detection algorithms?
• Approach – LERAD (a conditional rule learning algorithm); variants of attributes
• Experimental evaluation
• Conclusions and future work

Related Work
• tide (time-delay embedding) – Forrest et al., 1996
• stide (sequence time-delay embedding) – Hofmeyr et al., 1999
• t-stide (stide with frequency threshold) – Warrender et al., 1999
• Variable-length sequence-based techniques (Wespi et al., 1999, 2000; Jiang et al., 2001)
• Common drawback: false alarms!

Problem Statement
• Current models use system call sequences. What else can we model?
• System call arguments, e.g., open("/etc/passwd") vs. open("/users/readme")

Approach
• Models based upon system calls
• 3 sets of attributes:
  - system call sequence
  - system call arguments
  - system call arguments + sequence
• Adopt a rule learning approach: Learning Rules for Anomaly Detection (LERAD)

Learning Rules for Anomaly Detection (LERAD) [Mahoney and Chan, 2003]
• Rules have the form: A = a, B = b, ... ⇒ X ∈ {x1, x2, ...}
  - A, B, and X are attributes; a, b, x1, x2 are values of the corresponding attributes
  - Example: SC = close(), Arg1 = 123 ⇒ Arg2 ∈ {abc, xyz}
• p = Pr(X ∉ {x1, x2, ...} | A = a, B = b, ...) ≈ r / n
  - p – probability of observing a value not in the consequent
  - r – cardinality of the set {x1, x2, ...} in the consequent
  - n – number of samples that satisfy the antecedent
• AnomalyScore = 1/p = n/r

Overview of LERAD
4 steps involved in rule generation:
1. From a small training sample, generate candidate rules and associate probabilities with them
2. Coverage test to minimize the rule set
3. Update rules beyond the small training sample
4. Validate rules on a separate validation set

Training data used in the running example (Steps 1–4):

                     A  B  C  D
  Random sample S1   1  2  3  4
  Random sample S2   1  2  3  5
  Random sample S3   6  7  8  4
  Training S4        1  0  9  5
  Training S5        1  2  3  4
  Validation S6      6  3  8  5

Step 1a: Generate Candidate Rules
• Two samples are picked at random (say S1 and S2)
• Matching attributes A, B and C are picked in random order (say B, C and A)
• These attributes are used to form rules with 0, 1, and 2 conditions in the antecedent:
  Rule 1: * ⇒ B ∈ {2}
  Rule 2: C = 3 ⇒ B ∈ {2}
  Rule 3: A = 1, C = 3 ⇒ B ∈ {2}

Step 1b: Generate Candidate Rules
• Values are added to the consequent based on a subset of the training set (say S1–S3)
• A probability estimate p of observing a violation is associated with every rule
• Rules are sorted in increasing order of p:
  Rule 2: C = 3 ⇒ B ∈ {2}           [p = 1/2]
  Rule 3: A = 1, C = 3 ⇒ B ∈ {2}    [p = 1/2]
  Rule 1: * ⇒ B ∈ {2, 7}            [p = 2/3]

Step 2: Coverage Test
• Obtain a minimal set of rules: every value must be covered by at least one rule
  - S1 and S2 are covered by Rule 2; S3 is covered by Rule 1
  - Rule 3 covers nothing beyond Rule 2 and is discarded
• Remaining rules:
  Rule 2: C = 3 ⇒ B ∈ {2}           [p = 1/2]
  Rule 1: * ⇒ B ∈ {2, 7}            [p = 2/3]
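To make the rule formation and scoring of Steps 1–2 concrete, here is a minimal sketch (not the authors' code). The Rule class, attribute names, and toy samples are hypothetical, but the p = r/n estimate and the 1/p = n/r anomaly score follow the formulas above.

```python
# Minimal sketch of a LERAD-style rule: antecedent conditions, a consequent
# attribute, the set of allowed values, and the p = r/n estimate.

class Rule:
    def __init__(self, antecedent, target):
        self.antecedent = antecedent      # dict: attribute -> required value ({} means "*")
        self.target = target              # attribute named in the consequent
        self.allowed = set()              # values observed in the consequent
        self.n = 0                        # samples satisfying the antecedent

    def matches(self, sample):
        return all(sample[a] == v for a, v in self.antecedent.items())

    def train(self, samples):
        """Collect consequent values from samples that satisfy the antecedent."""
        for s in samples:
            if self.matches(s):
                self.n += 1
                self.allowed.add(s[self.target])

    @property
    def p(self):
        """Estimated probability of seeing a novel consequent value: r / n."""
        return len(self.allowed) / self.n if self.n else 1.0

    def score(self, sample):
        """Anomaly score 1/p = n/r when the rule is violated, else 0."""
        if self.matches(sample) and sample[self.target] not in self.allowed:
            return 1.0 / self.p
        return 0.0

# Toy samples mirroring the slides (attributes A, B, C, D; rows S1-S3).
train = [
    {"A": 1, "B": 2, "C": 3, "D": 4},   # S1
    {"A": 1, "B": 2, "C": 3, "D": 5},   # S2
    {"A": 6, "B": 7, "C": 8, "D": 4},   # S3
]

rule1 = Rule({}, "B")                   # Rule 1: * => B in {...}
rule2 = Rule({"C": 3}, "B")             # Rule 2: C = 3 => B in {...}
rule1.train(train)
rule2.train(train)
print(rule1.p, rule2.p)                                # 2/3 and 1/2, as on the slides
print(rule2.score({"A": 1, "B": 9, "C": 3, "D": 4}))   # violation => n/r = 2.0
```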
Step 3: Update Rules Beyond the Training Samples
• Extend the rules to the entire training set minus the validation portion (samples S1–S5):
  Rule 2: C = 3 ⇒ B ∈ {2}           [p = 1/3]
  Rule 1: * ⇒ B ∈ {2, 7, 0}         [p = 3/5]

Step 4: Validate Rules
• Test the set of rules on the validation set (S6)
• Remove rules that produce an anomaly on it
  - Here, Rule 1 is violated by S6 (B = 3 ∉ {2, 7, 0}) and is removed; Rule 2 is not triggered (C = 8) and is kept
  Rule 2: C = 3 ⇒ B ∈ {2}           [p = 1/3]
  Rule 1: * ⇒ B ∈ {2, 7, 0}         [p = 3/5]

LERAD: Non-Stationary Model
• Only the last occurrence of an event is important
• TotalAnomalyScore = Σ_i t_i / p_i = Σ_i t_i · n_i / r_i
  - t_i – time interval since the last anomalous event for rule i
  - i – index of the violated rule

Variants of Attributes
3 variants:
(i) S-LERAD: system call sequence
(ii) A-LERAD: system call arguments
(iii) M-LERAD: system call arguments + sequence

S-LERAD
• System call sequence-based LERAD
• Samples comprising 6 contiguous system call tokens are input to LERAD:

  SC1        SC2        SC3        SC4        SC5      SC6
  mmap()     munmap()   mmap()     munmap()   open()   close()
  munmap()   mmap()     munmap()   open()     close()  open()
  mmap()     munmap()   open()     close()    open()   mmap()

• Example rule: SC1 = mmap(), SC2 = munmap() ⇒ SC6 ∈ {close(), mmap()}

A-LERAD
• Samples contain a system call along with its arguments (attributes SC, Arg1, ..., Arg5)
• The system call is always a condition in the antecedent of the rule
• Example rule: SC = munmap() ⇒ Arg1 ∈ {0x134, 0102, 0x211, 0x124}

M-LERAD
• Combination of system call sequences and arguments
• Example rule: SC1 = close(), Arg1 = 0x134 ⇒ SC3 ∈ {munmap()}
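Before the evaluation, a small sketch of how the three attribute variants might be assembled from a per-application audit trace. The window of 6 calls and the SC/Arg attribute names come from the slides; the trace format, the 5-argument limit, and the exact way M-LERAD merges the two views are assumptions for illustration.

```python
# Hypothetical preprocessing sketch (not the authors' code): build S-LERAD,
# A-LERAD, and M-LERAD tuples from a list of (system call, arguments) records.

from typing import List, Tuple

# Assumed audit record format: (system call name, list of argument strings).
trace: List[Tuple[str, List[str]]] = [
    ("mmap()",   ["0x134"]),
    ("munmap()", ["0x134"]),
    ("open()",   ["/etc/passwd", "O_RDONLY"]),
    ("close()",  ["3"]),
    ("open()",   ["/users/readme", "O_RDONLY"]),
    ("mmap()",   ["0x211"]),
    ("munmap()", ["0x211"]),
]

def s_lerad_tuples(trace, window=6):
    """S-LERAD: sliding windows of 6 contiguous system call tokens (SC1..SC6)."""
    calls = [sc for sc, _ in trace]
    return [{f"SC{i+1}": calls[j + i] for i in range(window)}
            for j in range(len(calls) - window + 1)]

def a_lerad_tuples(trace, max_args=5):
    """A-LERAD: one tuple per call, the call itself plus its arguments."""
    rows = []
    for sc, args in trace:
        row = {"SC": sc}
        for i in range(max_args):
            row[f"Arg{i+1}"] = args[i] if i < len(args) else ""
        rows.append(row)
    return rows

def m_lerad_tuples(trace, window=6, max_args=5):
    """M-LERAD (assumed): arguments of the window's first call merged with SC1..SC6."""
    seq = s_lerad_tuples(trace, window)
    arg = a_lerad_tuples(trace, max_args)
    return [{**seq[j], **{k: v for k, v in arg[j].items() if k != "SC"}}
            for j in range(len(seq))]

print(s_lerad_tuples(trace)[0])   # first 6-call window
print(a_lerad_tuples(trace)[2])   # open() with its arguments
print(m_lerad_tuples(trace)[0])   # sequence + argument attributes combined
```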
1999 DARPA IDS Evaluation [Lippmann et al., 2000]
• Week 3 – training data (~2.1 million system calls)
• Weeks 4 and 5 – test data (over 7 million system calls)
• Total – 51 attacks on the Solaris host

Experimental Procedures
• Preprocessing: the BSM audit log is split by application, mapping processes (Pi, Pj, ..., Pk) to Applications 1 through N
• One model per application
• Merge all alarms

Evaluation Criteria
• An attack is detected if an alarm is generated within 60 seconds of its occurrence
• Number of attacks detected @ 10 false alarms/day
• Time and storage requirements

Detections vs. False Alarms
[Figure: attacks detected (0–35) vs. false alarms per day (1–100) for tide, stide, t-stide, S-LERAD, A-LERAD, and M-LERAD]

Percentage Detections per Attack Type
[Figure: percentage of attacks detected for Probes (5), DoS (19), R2L (12), U2R (9), Data (4), and Data-U2R (2) attacks, comparing tide, stide, t-stide, S-LERAD, A-LERAD, and M-LERAD]

Comparison of CPU Times

  Application   Training time (s)         Testing time (s)
                [1 week of data]          [2 weeks of data]
                t-stide     M-LERAD       t-stide     M-LERAD
  ftpd          0.2         1.0           0.2         1.0
  telnetd       1.0         7.9           1.0         9.8
  ufsdump       6.8         33.3          0.4         1.8
  tcsh          6.3         32.8          5.9         37.6
  login         2.4         16.7          2.4         19.9
  sendmail      2.7         15.1          3.2         21.6
  quota         0.2         3.5           0.2         3.8
  sh            0.2         3.2           0.4         5.6

Storage Requirements
• More data extracted (system calls + arguments) means more space
• Needed only during training, which can be done offline
• Small rule set vs. large sequence database (stide, t-stide)
• Example, for the tcsh application: 1.5 KB for the set of rules (M-LERAD) vs. 5 KB for the sequence database (stide)

Summary of Contributions
• Introduced argument information to model systems
• Enhanced LERAD to form rules with system calls as pivotal attributes
• LERAD with argument information detects more attacks than existing system call sequence-based algorithms (tide, stide, t-stide)
• The sequence + argument based system generally detected the most attacks across different false alarm rates
• Argument information alone can be used effectively to detect attacks at lower false alarm rates
• Lower memory requirements during detection compared to sequence-based techniques

Future Work
• More $$$$$$$$$$

Future Work
• A richer representation: more attributes, e.g., time between subsequent system calls
• Anomaly score: t-stide vs. LERAD

Thank You