Intrusion Detection: New Directions

Teresa Lunt
Xerox Palo Alto Research Center
tlunt@parc.xerox.com

Detect, isolate, reconfigure, repair

Detection & Response

1. Sensors perform security monitoring
2. Intrusion detector alerts on possible attack
3. Essential systems increase their degree of protection & robustness
4. Fishbowl created to divert the attacker and observe the attack
5. Human-assisted incident response restores service and secure state

[Diagram labels: IDS, Emergency Mode Activator, Sensor, Decoy/Sensor, Cleanup, Fishbowl, Critical System]

Data Collection

• What level of data to collect
  – OS system calls
  – OS command line
  – network data (e.g., from router and firewall logs or MIBs)
  – within applications
  – keystrokes
  – all characters transmitted
• Tradeoffs in:
  – types of intrusions that can be detected
  – complexity and volume of data
  – ability to formulate rules that characterize intrusions
  – ease of playback
  – ease of damage assessment or evidence gathering
  – data reliability
  – degree of privacy invasion

Typical OS Audit Record Fields

• subject – identifies user, session, and location
• action – the action attempted
• object – what the subject acted upon; subfields depend on type of action
• errorcode
• resource-info – CPU, memory, I/O
• timestamp

State of the Art

• Host-based vs.
network-based
  – Do not detect attacks that disrupt or manipulate the infrastructure
• Knowledge-based
  – Look for patterns associated with known intrusions
  – Detect only what you know to look for
  – Most systems look for only a dozen or so intrusion types
  – Serious foes will use “surprise” attacks we haven’t seen before
• High number of false alarms
  – Much flagged activity is of little concern (e.g., password guessing)
  – Extremely large numbers of alarms, which must be investigated manually
  – Lack of discrimination between suspicious and normal behaviors

State of the Art cont’d

• Line monitors (eavesdrop on a communications line)
  – View is restricted to what passes over a given line
  – Too much data must be examined and logged
  – Considerably weakened if encryption is used
• Can monitor small numbers of machines/entities
  – Audit logs do not scale well
  – Monitoring individual users and machines
  – No ability for cooperating detectors, which could filter events of lesser or only local concern
• Lack of robustness
  – Cannot deal with missing, incomplete, untimely, or otherwise faulty data
• Unix-specific

Research Challenges

• Detect a wide variety of intrusion types
• Very high certainty
• Real-time detection
• Develop a network-wide view rather than local views
• Analysis must work reliably with incomplete data
• Detect unanticipated attack methods
• Scale to very large heterogeneous systems
• What data to collect for maximal effectiveness; network instrumentation
• Automated response
• Discover or narrow down the source of an attack
• Integrate with network management and fault diagnosis
• Infer intent; forming the big picture
• Cooperative problem solving

Methods under Investigation

• Methods to detect highly unusual events or combinations of events
  – Statistical methods
  – Neural networks
  – Machine learning
• Methods to detect activity outside prescribed bounds
  – Specification-based detection
• New knowledge-based
  – Graphical intrusion detection
  – State transition models (model-based detection)
• Traceback methods
  – Thumbprinting

[Diagram labels: Acceptable, Illegal, Structural, Discrepancy, Statistical analysis techniques, Model/Pattern, Profile, Match]

Cooperating Detectors

[Diagram: cooperating IDS nodes and sensors]

Also needed: Efficient and effective methods for peer-to-peer cooperative problem solving to be applied to the detection problem
  – To filter events of only local concern
  – To assess a larger “region”

Advanced Techniques

• Statistical anomaly detection (SRI, CMU)
  – establish a historical behavior profile for each desired entity (e.g., user, group, device, process)
  – compare current behavior with the profiles
  – detects departures from established norms
  – continuously update profiles to “learn” changes in subject behavior
  – addresses unanticipated intrusion types
• Early statistical studies:
  – SRI study (Javitz et al.):
    • Showed users could be distinguished from each other based on patterns of use
  – Sytek study (Lunt et al.):
    • Showed behavior characteristics can be found that discriminate between normal user behavior and simulated intrusions

Advanced Techniques cont’d

• Machine learning (LANL)
  – Builds a massive tree of statistical “rules” (typically hundreds of thousands of them)
  – Branches are labeled with conditional probabilities
  – Prunes the tree to a maximum depth of four to six
  – Low-occurrence branches are combined
  – Tree is “trained” from a few days of data
  – Tree cannot be updated to “learn” as usage patterns change
  – Activity is considered abnormal if it does not “match” a branch in the tree or if it matches a branch whose last node has a low conditional probability
• Meta-Learning (Columbia University)
  – Meta-learning integrates a number of separately learned classifiers
  – Multi-layered approach:
    • machine learning and decision procedures detect intrusions locally
    • meta-learning and decision procedures integrate the collective knowledge acquired by the local agents

Advanced Techniques cont’d

• Computational immunology
  – based on biological analogies (e.g., self
vs. non-self discrimination)
  – build up a database of observed short sequences of system calls for a program and detect when the observed program behavior exhibits short sequences not in that database (U. of NM)
  – allows the detection of tampered or malicious programs or other suspicious events
  – this potentially lightweight method is being implemented in small, autonomous agents in a CORBA environment (ORA)

Advanced Techniques cont’d

• Model-based detection
  – Detects suspicious state transitions (UC Santa Barbara)
    • specifies penetration scenarios as a sequence of actions
    • keeps track of interesting “state changes”
    • attempts to identify attacks in progress before damage is done
  – Adapt model-based diagnosis, which has been successful in diagnosing faults in microprocessors, to intrusion detection (MIT)
• Graphical detection (UC Davis)
  – detects intrusions whose activity spans many machines that could be difficult to detect locally
  – specifies intrusion scenarios as graphs of actions covering many machines
  – the graphs provide an intuitive visual display

Advanced Techniques cont’d

• Specification-based detection (UC Davis)
  – detects departures from security specifications of privileged programs
  – allows detection of unanticipated attacks
• Thumbprint technique (UC Davis)
  – allows limited traceback
  – thumbprint is a statistical digest of an interval of a communications channel
  – matching thumbprints can be used to reconstruct the path of an intruder

Advanced Techniques cont’d

• Signalling Infrastructure Detection (GTE)
  – detect anomalous events in a network and signalling infrastructure typical of telephone service providers
  – designed for integration into network operations centers
  – uses existing systems/tools for data collection
  – uses anomaly detection and specific signalling protocol “sanity checks”
• Detection in high-speed networks (MCNC)
  – Integrates anomaly detection techniques with network management for ATM networking (IP over ATM)
  – Logical analysis of
routing protocol operation to detect anomalous states

Advanced Techniques cont’d

• Automated response (Boeing)
  – Integrates firewall, intrusion detection, filtering router, and network management technologies
  – Local intrusion detectors determine threat presence
  – Firewalls communicate intrusion detection information to each other
  – Firewalls cooperate to locate the intruder
  – Network managers automatically reconfigure the network to thwart the attack
  – Firewalls and filtering routers dynamically alter filtering rules to block the intruder
  – Dynamic reconfiguration of logging, monitoring, and access control in response to detected suspicious activity
  – "Fusion" of intrusion-detection data reported by different detectors
  – The monitoring is also adapted as part of the response, to help pinpoint the problem and its source

Advanced Techniques cont’d

• Survivable Active Networks (Bellcore)
  – Will allow highly configurable network elements to cooperate with networked hosts to detect, isolate, and recover quickly and automatically from damage due to errors or malicious attacks
  – "Ablative software" will allow suspect activity to be "peeled off" the system while continuing to operate in a microenvironment
• Planning and procedural reasoning (SRI)
  – Suggests and implements incident recovery procedures
  – Uses AI-based automated planning technology for both analysis and recovery and repair
  – Generates explanations to help the system administrator understand what happened and what to do about it
  – Integrate intrusion response tools, to combine the functionality of many tools that specialize in particular areas of incident management, into a security anchor desk (USC-ISI)

Open Questions

• Detection performance in realistic settings with single methods and combinations of methods
• Detection performance with faulty and missing data
• False positive and false negative rates
• Time to detection
• Scalability
• Dependence on good intruder models
• Distinction from common failure modes
• What
data to collect/observe

Common Intrusion Detection Framework

• Standard Interfaces
  – an interconnection framework for data collection, analysis, and response components
  – extensible architecture
  – reuse of core technology
  – facilitate tech transfer
  – reduce cost

[Diagram: Reference Architecture with components E1, E2, E3, A1, A2, C, and D, and the label Standard API. Legend: E = Event Generator, A = Event Analyzer, D = Event Database, C = System-specific Controller]

Strategic Intrusion Assessment

• In a two-week period, AFIWC’s intrusion detectors at 100 AFBs alarmed on 2 million sessions
• After manual review, these were reduced to 12,000 suspicious events
• After further manual review, these were reduced to four actual incidents

[Diagram labels, reporting hierarchy: National Reporting Centers; Regional Reporting Centers (CERTs); DoD Reporting Centers; International/Allied Reporting Centers; Organizational Security Centers; Local Intrusion Detectors]

• Most alarms are false positives
• Most true positives are trivial incidents
• Of the significant incidents, most are isolated attacks to be dealt with locally

[Diagram labels: Correlation, Patterns, Classification, Infer intent, Assess damage, Predict future status, Assess certainty]

Strategic Intrusion Assessment

[Column headings: Suppress false alarms; Correlate & infer intent; Plan recognition]

• Peer-to-peer cooperation among detectors to decide what to report to higher levels.
Detectors must be able to:
  – discover each other
  – negotiate requirements
  – collaborate on diagnosis/response
• Improve individual detectors
• Distinguish what is trivial from significant
• Distinguish what is locally relevant
  – Hypothesize goals for IW adversaries
  – Develop plans for accomplishing each goal (automated planning technology)
  – Overlay with observed incident data to discover intent (plan recognition technology)
  – Estimate certainty

Security Detection and Response Center

Functions:
• Detection: Analyzes and filters events reported from lower layers
  – for items of interest to this layer, and
  – for reporting to higher layers
• Assessment: to understand coordinated events
  – of interest at this layer, and
  – for reporting to higher layers
• Tracing (e.g., IDIP, active nets)
• Automated response (e.g., IDIP for connection closing/filtering)
• Event notification

[Diagram labels: Assessment, Tracing, Detection, Response, Notification; reported events from lower layers; reporting to higher layers; to peers. Legend: Significant investment, Early speculative investigations, No research]

DARPA/AFRL Evaluations

• Evaluations intended to drive improvements
• Two rounds: one in 1998 (completed) and one in 1999
  – results reported at Dec 1998 DARPA PI meeting
  – Data sources for 1998 were TCP dump and Unix audit logs
  – 1999 evaluation will include NT and other data sources
• Live evaluation on a network at MIT/LL using simulated data similar to AFB data
  – Generated large amounts of realistic background traffic similar to observed/collected AFB traffic
  – Created the largest known collection of automated attacks with signatures (audit and sniffing)
  – Considered both known and new (never seen before) attacks
  – Capable of measuring both detection and false alarm rates
• Projects also performed self-evaluations using extensive training and testing data sets

Live Testbed Configuration for 1999 Evaluation

[Figure: testbed network diagram. Labels: “INSIDE” (172.16 - eyrie.af.mil); “OUTSIDE” (Many IP Addresses); PCs and workstations; OUTSIDE WS GATEWAY; OUTSIDE WEB GATEWAY; INSIDE GATEWAY; web servers; CISCO ROUTER; hosts marked P2, Ultra, 486, and Sparc running Solaris, SunOS, Linux, and NT; Solaris audit host (AUDIT DATA, DISK DUMPS); Solaris sniffer (SNIFFED DATA)]

[Figure: ATTACKS DETECTED (%), 0 to 100, plotted against FALSE ALARMS (%), 0.001 to 100 on a logarithmic axis. Curves: BEST COMBINATION (best combination of research prototypes) and KEYWORD BASELINE (keyword baseline similar to COTS and GOTS products)]

• Over two orders of magnitude reduction in false alarms with improved detection accuracy

Conclusions

• Currently available technology is not adequate for the problem
• Promising methods under investigation show significant improvement over current technology
• There is still a lot more to be done
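
Note on the evaluation metrics

The evaluation slides plot attacks detected (%) against false alarms (%) but do not define either quantity. The following is a minimal sketch of one plausible reading, assuming each test session carries a ground-truth label and an alarm flag, and assuming the false-alarm percentage is taken over normal sessions. The `Session` record, the function names, and the toy data are all hypothetical and are not taken from the DARPA/AFRL evaluation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Session:
    """One scored session (hypothetical record layout for this sketch)."""
    is_attack: bool  # ground-truth label from the test data
    alarmed: bool    # True if the detector raised an alarm on this session


def detection_rate(sessions: Iterable[Session]) -> float:
    """Percentage of attack sessions on which the detector alarmed."""
    attacks = [s for s in sessions if s.is_attack]
    if not attacks:
        raise ValueError("no attack sessions to score")
    return 100.0 * sum(s.alarmed for s in attacks) / len(attacks)


def false_alarm_rate(sessions: Iterable[Session]) -> float:
    """Percentage of normal sessions on which the detector alarmed.

    Assumption: the denominator is the number of normal sessions.
    """
    normal = [s for s in sessions if not s.is_attack]
    if not normal:
        raise ValueError("no normal sessions to score")
    return 100.0 * sum(s.alarmed for s in normal) / len(normal)


# Made-up toy data, for illustration only (not evaluation results).
toy = [
    Session(is_attack=True, alarmed=True),
    Session(is_attack=True, alarmed=False),
    Session(is_attack=False, alarmed=True),
    Session(is_attack=False, alarmed=False),
    Session(is_attack=False, alarmed=False),
    Session(is_attack=False, alarmed=False),
]
print(detection_rate(toy), false_alarm_rate(toy))
```

On the toy data this prints `50.0 25.0`: one of two attacks is detected, and one of four normal sessions raises an alarm. Scoring a detector at several alarm thresholds would give the series of points that a curve like the one in the figure connects.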